Xiaomi MiMo V2.5 Guide

Xiaomi MiMo is worth testing, not blindly switching to. The source-backed case is unusually strong for a newer model lane: official Xiaomi MiMo docs list OpenAI-compatible and Anthropic-compatible APIs, 1M-context V2.5 models, a coding-focused Token Plan, tool setup paths, and open weights on Hugging Face. The caution is just as important: pricing changed in late May 2026, legacy V2 model names are being routed to V2.5 before deprecation, and privacy/enterprise guarantees need direct review before sensitive production use.

Best next click: Smart Spend Guide

Quick Facts

Item	Current Xiaomi MiMo status
Provider	Xiaomi MiMo / Xiaomi
Main developer model family	MiMo-V2.5 series
Current flagship API model	`mimo-v2.5-pro`
Current multimodal API model	`mimo-v2.5`
Other current API models	`mimo-v2-flash`, `mimo-v2.5-asr`, `mimo-v2.5-tts`, `mimo-v2.5-tts-voiceclone`, `mimo-v2.5-tts-voicedesign`, legacy `mimo-v2-pro`, `mimo-v2-omni`, and `mimo-v2-tts` during migration
Context and output limits	V2.5 Pro and V2.5: 1M context and 128K max output in the model/rate-limit docs
API formats	OpenAI-compatible chat completions and Anthropic-compatible messages
Plan types	Pay-as-you-go API, Token Plan subscription for programming tools, referral/trial credits, and open-weight self-hosting
Open-weight status	MiMo-V2.5-Pro and MiMo-V2.5 are listed on Hugging Face with MIT licenses; Xiaomi says the V2.5 series weights are open-sourced
Pricing status	Source-backed from Xiaomi docs updated June 2, 2026; checkout remains final authority
Biggest unknowns	Enterprise data controls, zero-retention guarantees, regional availability by account, actual tool quota burn, latency, and independent eval results

Which Xiaomi Path Should I Choose?

If you want…	Start with	Why	Stop if…
A normal API for apps, scripts, evals, and custom routing	Pay-as-you-go API	It uses ordinary Xiaomi MiMo API keys and account balance, with OpenAI-compatible and Anthropic-compatible routes	You need flat monthly coding-tool quota or cannot accept hosted API data terms
A coding-tool subscription lane	Token Plan	It is built for programming tools such as OpenCode, OpenClaw, and Claude Code, with package API keys and dedicated base URLs	You need backend/API automation, non-coding use, refund flexibility, or PAYG balance interoperability
A fast hands-on model trial	MiMo Studio / demo paths	It lets you inspect model behavior before wiring API keys into tools	You need reproducible API metering, privacy review, or production terms
Local control or fine-tuning	Open weights	V2.5 and V2.5-Pro weights are MIT-licensed on Hugging Face	You do not have server-grade inference hardware and operations capacity

What Xiaomi Actually Offers

There are two different Xiaomi AI stories here. Do not mix them.

Developer model lane: Xiaomi MiMo is the model/API platform. It includes hosted API access, Token Plan subscriptions for coding tools, open weights, Studio access, model docs, and setup paths for agent/coding tools.

Device and ecosystem lane: Xiaomi also ships AI inside phones, apps, smart-home products, cars, and HyperOS-era experiences. Those may use Xiaomi AI systems, but they are not automatically the same thing as a public MiMo API plan. This guide only treats device/app AI as relevant when Xiaomi’s developer docs connect it to MiMo models, APIs, or open weights.

The current developer shortlist is:

Model	What it is	Use it for
`mimo-v2.5-pro`	Flagship sparse MoE language model, 1.02T total / 42B active parameters on the Hugging Face card	Hard coding, long-horizon agent work, long-context reasoning
`mimo-v2.5`	Native omnimodal model, 310B total / 15B active parameters on the Hugging Face card	Text, image, video, audio understanding, multimodal agent workflows
`mimo-v2-flash`	Lower-cost V2 flash model still listed in API pricing and model limits	Fast and cheaper text-generation lane
`mimo-v2.5-asr`	ASR model	Speech recognition
`mimo-v2.5-tts` family	TTS, voice clone, and voice design models	Speech synthesis experiments

Legacy names matter. Xiaomi’s docs say mimo-v2-pro and mimo-v2-omni auto-route to V2.5 pricing from June 1, 2026, and are scheduled for full deprecation on June 30, 2026. New integrations should use the V2.5 names directly.

Release Chronology

This is the compact source-backed sequence that matters for buyers and implementers:

Date	Event	Why it matters
2025 open-weight lane	MiMo-7B models appeared under the XiaomiMiMo Hugging Face organization	Xiaomi had smaller open-weight research models before the hosted V2 API lane
December 16, 2025	MiMo-V2-Flash release appears in Xiaomi’s model-release log	Establishes the V2 flash/code-agent foundation model before V2-Pro and V2.5
January 12 / February 4, 2026	MiMo-V2-Flash updates appear in Xiaomi’s model-release log	Shows Xiaomi was iterating the same V2-Flash model ID before the later V2.5 migration
March 18, 2026	MiMo-V2-Pro, MiMo-V2-Omni, and MiMo-V2-TTS releases appear in Xiaomi’s model-release log	These are now legacy names for new integrations because the docs route them toward V2.5
April 22-23, 2026	MiMo-V2.5 launch/public beta and V2.5-Pro release	V2.5 introduces the current 1M-context multimodal/agent lane and the V2.5-Pro coding/agent lane
May 27, 2026	Xiaomi price update takes effect	PAYG V2.5 prices drop and Token Plan billing/quota mechanics change
May 28, 2026	V2.5 series open-weight announcement update	Xiaomi says V2.5 weights are open-sourced under MIT and usable for commercial inference/fine-tuning without extra authorization
June 1, 2026	`mimo-v2-pro` and `mimo-v2-omni` begin auto-routing to V2.5 pricing	Old model names can hide new billing/model behavior
June 2, 2026	Current model/rate-limit and Token Plan docs show updated tables	This is the current page-level anchor for limits, plan quotas, and package rules
June 30, 2026	V2-Pro and V2-Omni deprecation deadline in Xiaomi docs	New work should use `mimo-v2.5-pro` or `mimo-v2.5` directly

Plans And Pricing

Use the status labels consistently:

Lane	Status	What Xiaomi documents	Caveat
Open weights	Free	V2.5 model weights are published under MIT on Hugging Face; Xiaomi says commercial inference deployment and secondary training need no extra authorization	Self-hosting a 310B or 1.02T model is not free in infrastructure terms
Referral credits	Free	Refer-and-earn gives both users API bonus credits, valid 40 days, API calls only	Not usable for Token Plan packages; daily reward issuance can be limited
TTS limited-time API use	Free	Xiaomi lists the V2.5 TTS series as free for a limited time	“Limited time” is not an evergreen free tier
Pay-as-you-go API	Paid	Overseas V2.5 pricing lists `mimo-v2.5-pro` at cache-hit/input/output rates of $0.0036 / $0.435 / $0.87 per 1M tokens, and `mimo-v2.5` at $0.0028 / $0.14 / $0.28 per 1M tokens	Pricing changed recently; account region, balance, and checkout/docs control final billing
Token Plan	Paid	Monthly plans list Lite, Standard, Pro, and Max at $6, $16, $50, and $100 with fixed credit quotas	Credits are not normal API dollars; quota burn depends on model, cache status, and usage path
Xiaomi versus other models	Compare	Xiaomi’s API and Token Plan are now price-competitive enough to test beside GLM, MiniMax, Kimi, Gemini, and local/open-weight lanes	Do not call it a winner without your own evals

The Token Plan is the most important caveat. Xiaomi describes it as a subscription for AI programming scenarios that works in tools such as OpenCode, OpenClaw, and Claude Code. It uses package-specific API keys and base URLs, and Xiaomi says Token Plan package quota is not interoperable with ordinary pay-as-you-go account balance.

The plan table in Xiaomi’s June 2 docs lists:

Token Plan	Monthly price	Monthly fixed credit limit	Annual price	Annual fixed credit limit
Lite	$6/month	4.1B credits	$63.36/year	49.2B credits
Standard	$16/month	11B credits	$168.96/year	132B credits
Pro	$50/month	38B credits	$528/year	456B credits
Max	$100/month	82B credits	$1,056/year	984B credits

Do not flatten those credits into a single per-token price. Xiaomi’s docs list separate package-credit consumption for cache-hit input, cache-miss input, output, and ASR duration. The safe buying question is: does your actual coding tool show acceptable credit burn on real tasks?

Developer Access Paths

Access path	Current read
API	Xiaomi documents OpenAI-compatible and Anthropic-compatible API endpoints at `api.xiaomimimo.com`, with API keys created in the console
Token Plan tools	Xiaomi documents Token Plan base URLs for China, Singapore, and Europe clusters, with OpenAI-compatible and Anthropic-compatible routes
Coding/agent tools	Xiaomi documents setup paths or overview support for OpenCode, Claude Code, OpenClaw, Hermes Agent, Kilo Code, Cherry Studio, Qwen Code, CodeBuddy, Cline, and compatible custom providers
Studio/app	Xiaomi links MiMo Studio and “Try it now” paths for interactive use
Open weights	XiaomiMiMo publishes V2.5-Pro, V2.5, V2-Flash, ASR, audio, embodied, and vision-language resources on Hugging Face/GitHub
Enterprise	Xiaomi references enterprise-level development scenarios and corporate authentication/payment in FAQ material, but this page did not verify a full public enterprise data-control sheet

For coding tools, the practical split is simple:

Use ordinary API keys for normal pay-as-you-go API calls and custom applications.
Use Token Plan keys for Xiaomi’s subscription quota inside supported programming tools.
Use open weights when you can afford the infrastructure and need local control.

Benchmarks And Capability Claims

Keep the source labels attached. Xiaomi has useful signals, but none of them are an AIHackers-owned “switch now” result.

Evidence bucket	What it says	How to use it
Xiaomi-reported launch claims	The MiMo-V2.5 launch page says V2.5 supports native visual/audio understanding, 1M context, and agentic capability. It also makes benchmark-specific relative claims against frontier closed models on video and multimodal agentic work.	Treat as vendor positioning and a reason to test multimodal/agent tasks. Do not convert it into “beats Claude/Gemini” headline language.
XiaomiMiMo model-card reported	The V2.5-Pro card reports 1.02T total / 42B active parameters, 1M context, base-model eval tables, long-context GraphWalks results, and deployment guidance. The V2.5 card reports 310B total / 15B active parameters, 1M context, text/image/video/audio modalities, and training on about 48T tokens.	Use for architecture, context, open-weight, and model-card benchmark references. Keep “reported by model card” visible.
arXiv lineage	MiMo-7B, MiMo-V2-Flash, and MiMo-VL papers document Xiaomi’s research lineage and earlier model families.	Use for context only. They are not proof of current hosted `mimo-v2.5` or `mimo-v2.5-pro` API behavior, pricing, latency, or coding quality.
Third-party / leaderboard signal	Hugging Face surfaces evaluation entries such as SWE Bench Resolved, WildClawBench, and Claw-Eval on the V2.5-Pro page.	Useful shortlist signal, especially for coding-agent work, but still not a substitute for your own repo evals.
Not AIHackers-owned	No AIHackers-owned Xiaomi patch-quality benchmark, latency benchmark, privacy test, long-running coding-agent field test, or Token Plan burn study has been run yet.	Keep Xiaomi in the test bench, not the default stack, until real workload checks pass.

That makes the verdict narrow: Xiaomi MiMo belongs in an eval stack. It does not yet replace Claude, GPT, Gemini, Kimi, GLM, MiniMax, or local models by default.

Privacy, Terms, And Data Use

The hosted platform is not the same risk profile as open weights.

What was verified:

Xiaomi publishes a MiMo Open Platform privacy policy and service agreement.
The API requires a Xiaomi account, and the docs currently describe personal account login for the Open Platform.
FAQ material says Token Plan purchases require personal or corporate real-name authentication.
Domestic and overseas accounts can receive different base URLs and keys, and Xiaomi says domestic/overseas usage is calculated separately.

What was not verified:

A clear public zero-retention promise for ordinary API calls.
A public enterprise data-isolation or no-training term that can be summarized safely without legal review.
Whether all regions receive the same privacy, payment, and support terms.

For private code, regulated data, client data, or enterprise secrets, use open weights or obtain written terms before routing production data through the hosted Xiaomi API.

Where Xiaomi Fits

Compared with	Xiaomi is interesting when…	Prefer the alternative when…
Qwen 3.6 Plus	You want official Xiaomi API access, Token Plan tools, or MiMo open weights instead of a hosted preview lane	You want to compare a Qwen path or a live limited-time hosted offer before paying
Kimi K2.7 Code	You want to test lower listed V2.5 API output rates, 1M context, and Xiaomi’s coding-tool Token Plan	You already rely on Kimi’s official coding API lane, membership quota, or Kimi API behavior
GLM-5.2	You want another low-cost supported-tool lane with open weights and multimodal/audio models	You want Z.AI’s established GLM Coding Plan path and current GLM-5.2 guide
MiniMax M3	You want a second 1M-context coding-agent Token Plan candidate and MiMo’s MIT-weight option	You want MiniMax’s current M3 buyer guide and already know its Token Plan fits your tools
Gemini 3 Flash	You want a Chinese open-weight/API model lane to test beside Google high-context workflows	You need Google-native AI Studio, Vertex, or Gemini app integration
Gemma 4	You have serious hardware and want to self-host a larger Xiaomi model	You want a cleaner local/on-device lane for laptops, phones, and smaller private deployments

The simplest Xiaomi role today: add MiMo to your test bench as a low-cost long-context and coding-agent candidate. Keep premium models for review and arbitration until MiMo passes your own evals.

Evaluation Plan

Run the same tasks you use for GLM, MiniMax, Kimi, Gemini, GPT, and Claude:

Test	What to ask Xiaomi MiMo	Pass signal
Coding	Fix a real failing test with repository context	Small patch, correct diagnosis, no unrelated churn
Long context	Analyze a large codebase or docs set	Specific references, accurate cross-file reasoning
Multimodal	Feed a screenshot, chart, or UI artifact	Correct visual read and useful implementation advice
Tool use	Use a coding tool route with function/tool calls	Stable tool calls, correct continuation behavior, no runaway loops
Latency	Run repeated short and long prompts	Predictable response time for your workflow
Quota	Track Token Plan credit burn on real tasks	Credit use matches docs closely enough to budget
Privacy	Review terms against your data class	Written policy fit before sensitive production use

Graduate Xiaomi only after it beats your current low-cost lane on your own workload. If it merely looks good on public benchmarks, keep it experimental.

/value/smart-spend/ - paid-stack strategy and low-cost lane triage
/models/minimax-m3/ - another 1M-context value coding model to test
/models/glm-5.2/ - current Z.AI supported-tool coding lane
/models/kimi-k3/ - Kimi’s newest flagship and 1M-context eval lane
/models/kimi-k2.7-code/ - cheaper Kimi coding API guide
/models/qwen-3.6-plus/ - free 1M-context preview lane
/models/gemini-3-flash/ - Google high-context value lane
/models/gemma-4/ - private local/on-device lane
/compare/models/budget-tier/ - budget model comparison

Sources

Last verified: June 11, 2026. Xiaomi MiMo pricing, plan credits, model routing, promotions, and supported tools can change quickly. Treat Xiaomi’s live docs, console, and checkout as final authority before buying or routing sensitive work.