Short verdict: Start with GLM-5.2 when the job needs 1M context, whole-repo context retention, or Z.AI-supported coding-tool routing. Start with Kimi K2.7 Code when you want Kimi’s current coding release, cheaper cache-hit input, multimodal tool calls, or a HighSpeed API lane. Keep Kimi K2.6 in the comparison because readers still look for it by name and it remains relevant where non-thinking mode or existing Kimi integrations matter.

This is a shortlist guide, not a leaderboard. Vendor benchmark claims can help pick what to test, but the winning model is the one that fixes your actual bug with fewer retries and less human cleanup.

Quick Facts

| Model | Current role | Context | Input price | Output price |
| --- | --- | --- | --- | --- |
| GLM-5.2 | Z.AI current flagship coding model | 1M | $1.40 / 1M, $0.26 cached | $4.40 / 1M |
| Kimi K2.7 Code | Kimi current coding release | 256K-class | $0.19 cache hit / $0.95 cache miss | $4.00 / 1M |
| Kimi K2.7 Code HighSpeed | Faster K2.7 API lane | 256K-class | $0.38 cache hit / $1.90 cache miss | $8.00 / 1M |
| Kimi K2.6 | Prior Kimi multimodal/API comparison lane | 256K-class | $0.16 cache hit / $0.95 cache miss | $4.00 / 1M |

Which Should You Test First?

| If your constraint is… | Start with | Why |
| --- | --- | --- |
| Whole-repo context or very long tasks | GLM-5.2 | Z.AI documents 1M context and 128K max output |
| Lowest cache-hit input cost | Kimi K2.6 or Kimi K2.7 Code | Kimi cache-hit input ($0.16 to $0.19 / 1M) is lower than GLM-5.2’s $0.26 cached and $1.40 non-cached input |
| Current Kimi coding model | Kimi K2.7 Code | Kimi docs now position K2.7 Code as the strongest coding model |
| Fast coding responses | Kimi K2.7 Code HighSpeed | Same K2.7 model, higher listed output speed, higher token prices |
| Supported-tool subscription economics | GLM Coding Plan | Z.AI sells a GLM Coding Plan for supported coding tools |
| Non-thinking Kimi mode | Kimi K2.6 | K2.7 Code requires thinking; K2.6 can disable it |
| Multimodal coding inputs | Kimi K2.7 Code or K2.6 | Kimi docs emphasize text, image, and video input |

Price And Context

GLM-5.2’s price case is not “cheapest raw token.” Its case is 1M context plus supported coding-tool economics. Kimi’s price case is cheaper input in cache-hit workflows and a simple OpenAI-compatible API path, with K2.7 Code HighSpeed available when speed is worth the higher output price.

| Buyer question | GLM-5.2 answer | Kimi answer |
| --- | --- | --- |
| “Which has bigger context?” | GLM-5.2 at 1M | Kimi K2.7/K2.6 at 256K-class |
| “Which has cheaper output?” | $4.40 / 1M | $4.00 / 1M base K2.7/K2.6; $8.00 HighSpeed |
| “Which has cheaper cache-hit input?” | $0.26 / 1M cached input | $0.19 K2.7 Code, $0.16 K2.6 |
| “Which has a subscription lane?” | GLM Coding Plan | Kimi membership/Kimi Code paths, but verify current quota wording |
| “Which should handle routine work?” | Test GLM if tool support fits | Test Kimi if API/tooling support fits |
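To see how these list prices combine for one request, here is a minimal Python sketch. It assumes GLM-5.2’s "$0.26 cached" figure plays the same role as Kimi’s cache-hit price, and that all prices are USD per 1M tokens. The `Price` class, the `request_cost` function, and the `cache_hit_ratio` parameter are illustrative names, not part of either vendor’s API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    cache_hit: float   # USD per 1M cached input tokens
    cache_miss: float  # USD per 1M uncached input tokens
    output: float      # USD per 1M output tokens


# List prices copied from the Quick Facts table; recheck before relying on them.
PRICES = {
    "GLM-5.2": Price(cache_hit=0.26, cache_miss=1.40, output=4.40),
    "Kimi K2.7 Code": Price(cache_hit=0.19, cache_miss=0.95, output=4.00),
    "Kimi K2.7 Code HighSpeed": Price(cache_hit=0.38, cache_miss=1.90, output=8.00),
    "Kimi K2.6": Price(cache_hit=0.16, cache_miss=0.95, output=4.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int,
                 cache_hit_ratio: float) -> float:
    """Estimate the USD cost of one request.

    cache_hit_ratio is the share of input tokens billed at the cache-hit
    price (an assumption you should replace with measured usage).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    price = PRICES[model]
    hit_tokens = input_tokens * cache_hit_ratio
    miss_tokens = input_tokens - hit_tokens
    total = (hit_tokens * price.cache_hit
             + miss_tokens * price.cache_miss
             + output_tokens * price.output)
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 100K input tokens, 80% served from cache, 5K output tokens.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 100_000, 5_000, 0.8):.4f}")
```

The cache-hit ratio dominates the input side of the bill, so plug in the ratio your own workflow actually achieves instead of a guess.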

Tool Fit

| Workflow | Better first test | Reason |
| --- | --- | --- |
| Claude Code-style supported-tool routing | GLM-5.2 via Z.AI | Z.AI documents Anthropic-compatible and supported-tool routes |
| OpenAI-compatible API experiments | Both | Both have OpenAI-compatible API paths |
| Kimi-native coding membership | Kimi K2.7 Code / Kimi Code | Kimi-specific workflows should test the current Kimi coding model |
| OpenClaw/BYOK agent routing | Both | Use provider-specific docs and treat subscriptions separately from direct API billing |
| Very large repo audit | GLM-5.2 | 1M context is the main spec advantage |
| Vision/video coding task | Kimi K2.7 Code or K2.6 | Kimi docs explicitly emphasize multimodal input |
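Before committing to a large repo audit, a rough size check can tell you whether the prompt is likely to fit each context window. The sketch below is a heuristic under stated assumptions: it reads "1M" and "256K-class" as 1,000,000 and 256,000 tokens, and it uses about 4 characters per token instead of a real tokenizer. The function names and the 16,000-token output reserve are illustrative choices, not vendor values.

```python
import math

# Assumed token budgets: "1M" and "256K-class" read as round numbers.
CONTEXT_TOKENS = {
    "GLM-5.2": 1_000_000,
    "Kimi K2.7 Code": 256_000,
    "Kimi K2.6": 256_000,
}

# Rough heuristic, not a real tokenizer; code-heavy text can differ a lot.
CHARS_PER_TOKEN = 4


def estimate_tokens(char_count: int) -> int:
    """Estimate a token count from a character count."""
    return math.ceil(char_count / CHARS_PER_TOKEN)


def fits_in_context(model: str, char_count: int,
                    reserved_output_tokens: int = 16_000) -> bool:
    """Return True if the estimated prompt plus an output reserve fits."""
    needed = estimate_tokens(char_count) + reserved_output_tokens
    return needed <= CONTEXT_TOKENS[model]


if __name__ == "__main__":
    repo_chars = 1_200_000  # hypothetical size of the files you plan to send
    for name in CONTEXT_TOKENS:
        print(name, "fits" if fits_in_context(name, repo_chars) else "does not fit")
```

Treat the result as a first filter only. Count tokens with each provider’s own tooling before you depend on a borderline fit.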

Caveats

  • Z.AI benchmark language is vendor-published. Use it to decide what to test, not to claim “best model.”
  • Kimi K2.7 Code benchmark and speed language is also vendor-published. Measure your own cost per accepted patch.
  • Kimi K2.7 Code requires thinking mode. If your workflow needs non-thinking behavior, test K2.6.
  • Subscription quota and membership wording can move faster than API pricing docs. Treat checkout and current docs as final before purchase.
  • Context size is not quality. A smaller-context model can still win if it follows your repo conventions better.

Eval Checklist

Run the same tasks through each candidate:

| Eval | What to measure | Winner signal |
| --- | --- | --- |
| Bug fix | One real failing test | Smallest correct patch, fewest retries |
| Refactor | 2-4 files with local style constraints | Preserves behavior and tests |
| Repo audit | Architecture map from docs and code | Accurate module boundaries |
| Tool calls | Multi-step task with tool results | Keeps reasoning/tool context intact |
| Cost | Actual cache-hit/cache-miss and output usage | Lower cost per accepted result |
| Review | Patch review against a known risky change | Concrete findings without invented policy |
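The "cost per accepted result" signal can be tracked with a few lines of Python. This sketch assumes you log one record per attempt with its billed cost and whether a reviewer accepted the output. The `Run` record and the `cost_per_accepted` function are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Run:
    model: str       # which candidate produced this attempt
    cost_usd: float  # billed cost of the attempt, including retries you count
    accepted: bool   # True if the result was accepted after human review


def cost_per_accepted(runs: Iterable[Run]) -> Dict[str, Optional[float]]:
    """Total spend per model divided by its number of accepted results.

    Failed attempts still count toward spend. A model with no accepted
    results maps to None instead of a misleading number.
    """
    totals: Dict[str, tuple] = {}
    for run in runs:
        spend, accepted = totals.get(run.model, (0.0, 0))
        totals[run.model] = (spend + run.cost_usd, accepted + int(run.accepted))
    return {
        model: (spend / accepted if accepted else None)
        for model, (spend, accepted) in totals.items()
    }


if __name__ == "__main__":
    # Hypothetical log entries; replace with your own measurements.
    log = [
        Run("GLM-5.2", 0.08, True),
        Run("GLM-5.2", 0.07, False),
        Run("Kimi K2.7 Code", 0.05, True),
    ]
    print(cost_per_accepted(log))
```

Dividing by accepted results, not by attempts, is what makes retries and rejected patches show up in the number.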


Last verified: June 20, 2026. Recheck API pricing, context limits, thinking-mode requirements, and subscription quota terms before committing production workloads.