Short verdict: Start with GLM-5.2 when the job needs 1M context, whole-repo context retention, or Z.AI-supported coding-tool routing. Start with Kimi K2.7 Code when you want Kimi’s current coding release, cheaper cache-hit input, multimodal tool calls, or a HighSpeed API lane. Keep Kimi K2.6 in the comparison because people still look for it by name and it remains relevant where non-thinking mode or existing Kimi integrations matter.
This is a shortlist guide, not a leaderboard. Vendor benchmark claims can help pick what to test, but the winning model is the one that fixes your actual bug with fewer retries and less human cleanup.
Quick Facts
| Model | Current role | Context | Input price (USD per 1M tokens) | Output price (USD per 1M tokens) |
|---|---|---|---|---|
| GLM-5.2 | Z.AI current flagship coding model | 1M | $1.40 standard / $0.26 cached | $4.40 |
| Kimi K2.7 Code | Kimi current coding release | 256K-class | $0.19 cache hit / $0.95 cache miss | $4.00 |
| Kimi K2.7 Code HighSpeed | Faster K2.7 API lane | 256K-class | $0.38 cache hit / $1.90 cache miss | $8.00 |
| Kimi K2.6 | Prior Kimi multimodal/API comparison lane | 256K-class | $0.16 cache hit / $0.95 cache miss | $4.00 |
Which Should You Test First?
| If your constraint is… | Start with | Why |
|---|---|---|
| Whole-repo context or very long tasks | GLM-5.2 | Z.AI documents 1M context and 128K max output |
| Lowest cache-hit input cost | Kimi K2.6 or Kimi K2.7 Code | Kimi cache-hit input ($0.16 and $0.19 / 1M) is lower than GLM-5.2’s cached input ($0.26 / 1M) |
| Current Kimi coding model | Kimi K2.7 Code | Kimi docs now position K2.7 Code as the strongest coding model |
| Fast coding responses | Kimi K2.7 Code HighSpeed | Same K2.7 model, higher listed output speed, higher token prices |
| Supported-tool subscription economics | GLM Coding Plan | Z.AI sells a GLM Coding Plan for supported coding tools |
| Non-thinking Kimi mode | Kimi K2.6 | K2.7 Code requires thinking; K2.6 can disable it |
| Multimodal coding inputs | Kimi K2.7 Code or K2.6 | Kimi docs emphasize text, image, and video input |
Price And Context
GLM-5.2’s price case is not “cheapest raw token.” Its case is 1M context plus supported coding-tool economics. Kimi’s price case is cheaper input, especially in cache-hit workflows, and a simple OpenAI-compatible API path. K2.7 Code HighSpeed is available when speed is worth paying double the listed input and output prices.
| Buyer question | GLM-5.2 answer | Kimi answer |
|---|---|---|
| “Which has bigger context?” | GLM-5.2 at 1M | Kimi K2.7/K2.6 at 256K-class |
| “Which has cheaper output?” | $4.40 / 1M | $4.00 / 1M base K2.7/K2.6; $8.00 HighSpeed |
| “Which has cheaper cache-hit input?” | $0.26 / 1M cached input | $0.19 K2.7 Code, $0.16 K2.6 |
| “Which has a subscription lane?” | GLM Coding Plan | Kimi membership/Kimi Code paths, but verify current quota wording |
| “Which should handle routine work?” | Test GLM if tool support fits | Test Kimi if API/tooling support fits |
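To make the price rows above concrete, here is a minimal sketch that estimates the cost of one request from the per-1M-token prices listed in this guide. The dictionary keys are hypothetical labels (not vendor API model IDs), the function name and example token counts are illustrative, and the sketch assumes input tokens split cleanly into a cache-hit portion and a cache-miss portion billed at the listed rates.

```python
# Prices in USD per 1M tokens, copied from the tables in this guide.
# Keys are hypothetical labels, not official API model IDs.
# Recheck the vendor pricing pages before relying on these numbers.
PRICES = {
    "glm-5.2": {"cache_hit": 0.26, "cache_miss": 1.40, "output": 4.40},
    "kimi-k2.7-code": {"cache_hit": 0.19, "cache_miss": 0.95, "output": 4.00},
    "kimi-k2.7-code-highspeed": {"cache_hit": 0.38, "cache_miss": 1.90, "output": 8.00},
    "kimi-k2.6": {"cache_hit": 0.16, "cache_miss": 0.95, "output": 4.00},
}


def request_cost(model, input_tokens, output_tokens, cache_hit_ratio=0.0):
    """Estimate the USD cost of one request.

    cache_hit_ratio is the assumed fraction of input tokens billed at the
    cache-hit rate (0.0 = nothing cached, 1.0 = everything cached).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    price = PRICES[model]
    hit_tokens = input_tokens * cache_hit_ratio
    miss_tokens = input_tokens - hit_tokens
    total = (
        hit_tokens * price["cache_hit"]
        + miss_tokens * price["cache_miss"]
        + output_tokens * price["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200K input tokens, 8K output tokens, 75% cache hits.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 200_000, 8_000, 0.75):.4f}")
```

Swap in token counts and cache-hit ratios from your own usage logs; the ranking between models can change as the cache-hit ratio and output share change.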
Tool Fit
| Workflow | Better first test | Reason |
|---|---|---|
| Claude Code-style supported-tool routing | GLM-5.2 via Z.AI | Z.AI documents Anthropic-compatible and supported-tool routes |
| OpenAI-compatible API experiments | Both | Both have OpenAI-compatible API paths |
| Kimi-native coding membership | Kimi K2.7 Code / Kimi Code | Kimi-specific workflows should test the current Kimi coding model |
| OpenClaw/BYOK agent routing | Both | Use provider-specific docs and treat subscriptions separately from direct API billing |
| Very large repo audit | GLM-5.2 | 1M context is the main spec advantage |
| Vision/video coding task | Kimi K2.7 Code or K2.6 | Kimi docs explicitly emphasize multimodal input |
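For the "very large repo audit" row, a rough size check can tell you whether a repo is even a candidate for a 256K-class window. The sketch below is an approximation under stated assumptions: it uses a 4-characters-per-token heuristic instead of a real tokenizer, the exact window sizes (1,000,000 and 256,000 tokens) are assumed from the "1M" and "256K-class" labels in this guide, and the model labels, file extensions, and reserved output budget are hypothetical defaults.

```python
import os

# Assumed window sizes in tokens. This guide lists "1M" and "256K-class";
# the exact limits here are assumptions, so check the vendor docs.
CONTEXT_WINDOWS = {"glm-5.2": 1_000_000, "kimi-k2.7-code": 256_000}

CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer


def estimate_tokens(text):
    """Rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def repo_tokens(root, extensions=(".py", ".md", ".ts", ".go")):
    """Sum estimated tokens for matching source files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    total += estimate_tokens(fh.read())
    return total


def models_that_fit(prompt_tokens, reserved_output=16_000):
    """Return labels whose assumed window holds the prompt plus reserved output."""
    return [
        label
        for label, window in CONTEXT_WINDOWS.items()
        if prompt_tokens + reserved_output <= window
    ]
```

A call such as `models_that_fit(repo_tokens("."))` gives a first cut. Fitting in the window says nothing about answer quality, so treat the result as a filter, not a ranking.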
Caveats
- Z.AI benchmark language is vendor-published. Use it to decide what to test, not to claim “best model.”
- Kimi K2.7 Code benchmark and speed language is also vendor-published. Measure your own cost per accepted patch.
- Kimi K2.7 Code requires thinking mode. If your workflow needs non-thinking behavior, test K2.6.
- Subscription quota and membership wording can move faster than API pricing docs. Treat the checkout page and current docs as the final word before purchase.
- Context size is not quality. A smaller-context model can still win if it follows your repo conventions better.
Eval Checklist
Run the same tasks through each candidate:
| Eval | What to measure | Winner signal |
|---|---|---|
| Bug fix | One real failing test | Smallest correct patch, fewest retries |
| Refactor | 2-4 files with local style constraints | Preserves behavior and tests |
| Repo audit | Architecture map from docs and code | Accurate module boundaries |
| Tool calls | Multi-step task with tool results | Keeps reasoning/tool context intact |
| Cost | Actual cache-hit/cache-miss and output usage | Lower cost per accepted result |
| Review | Patch review against a known risky change | Concrete findings without invented policy |
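The "lower cost per accepted result" signal in the checklist can be computed from a simple run log. This is a minimal sketch, assuming you record one entry per attempt with its measured cost and whether the result was accepted. The `Attempt` record, its field names, and the sample numbers are hypothetical and not part of any vendor SDK.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    model: str       # label you assign to the candidate model
    task: str        # identifier of the eval task
    cost_usd: float  # measured cost of this attempt
    accepted: bool   # True if the result passed tests and human review


def cost_per_accepted(attempts):
    """Return {model: total cost / accepted count}, or None if nothing was accepted."""
    totals = {}
    accepted = {}
    for attempt in attempts:
        totals[attempt.model] = totals.get(attempt.model, 0.0) + attempt.cost_usd
        accepted[attempt.model] = accepted.get(attempt.model, 0) + int(attempt.accepted)
    return {
        model: (totals[model] / accepted[model] if accepted[model] else None)
        for model in totals
    }


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    log = [
        Attempt("model-a", "bug-fix", 0.10, False),
        Attempt("model-a", "bug-fix", 0.20, True),
        Attempt("model-b", "bug-fix", 0.50, False),
    ]
    print(cost_per_accepted(log))
```

Failed attempts stay in the numerator on purpose: retries cost money, so a model with cheaper tokens can still lose on this metric if it needs more attempts per accepted result.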
Related links
- /models/glm-5.2/ - GLM-5.2 model guide
- /models/kimi-k2.7-code/ - Kimi K2.7 Code model guide
- /tools/zai/ - Z.AI Coding Plan setup and caveats
- /tools/kimi-code/ - Kimi Code membership and tool path
- /value/kimi-access/ - Kimi access and promo-check guidance
- /value/smart-spend/ - Paid-stack routing strategy
- /compare/models/mid-range/ - Production spend-band comparison
Sources
- Z.AI GLM-5.2 docs
- Z.AI pricing
- Z.AI GLM Coding Plan overview
- Kimi K2.7 Code quickstart
- Kimi K2.7 Code pricing
- Kimi K2.6 quickstart
- Kimi K2.6 pricing
Last verified: June 20, 2026. Recheck API pricing, context limits, thinking-mode requirements, and subscription quota terms before committing production workloads.