GLM-5.2 vs Kimi K2.6/K2.7

Short verdict: Start with GLM-5.2 when the job needs 1M context, whole-repo context retention, or Z.AI-supported coding-tool routing. Start with Kimi K2.7 Code when you want Kimi’s current coding release, cheaper cache-hit input, multimodal tool calls, or a HighSpeed API lane. Keep Kimi K2.6 in the comparison because readers still search for it by name, and because it remains relevant where non-thinking mode or existing Kimi integrations matter.

This is a shortlist guide, not a leaderboard. Vendor benchmark claims can help pick what to test, but the winning model is the one that fixes your actual bug with fewer retries and less human cleanup.

Quick Facts

Model	Current role	Context	Input price (per 1M tokens)	Output price (per 1M tokens)
GLM-5.2	Z.AI current flagship coding model	1M	$1.40, $0.26 cached	$4.40
Kimi K2.7 Code	Kimi current coding release	256K-class	$0.19 cache hit, $0.95 cache miss	$4.00
Kimi K2.7 Code HighSpeed	Faster K2.7 API lane	256K-class	$0.38 cache hit, $1.90 cache miss	$8.00
Kimi K2.6	Prior Kimi multimodal/API comparison lane	256K-class	$0.16 cache hit, $0.95 cache miss	$4.00

Which Should You Test First?

If your constraint is…	Start with	Why
Whole-repo context or very long tasks	GLM-5.2	Z.AI documents 1M context and 128K max output
Lowest cache-hit input cost	Kimi K2.6 or Kimi K2.7 Code	Kimi cache-hit input ($0.16 to $0.19 / 1M) undercuts both GLM-5.2’s cached input ($0.26 / 1M) and its non-cached input ($1.40 / 1M)
Current Kimi coding model	Kimi K2.7 Code	Kimi docs now position K2.7 Code as the strongest coding model
Fast coding responses	Kimi K2.7 Code HighSpeed	Same K2.7 model, higher listed output speed, higher token prices
Supported-tool subscription economics	GLM Coding Plan	Z.AI sells a GLM Coding Plan for supported coding tools
Non-thinking Kimi mode	Kimi K2.6	K2.7 Code requires thinking; K2.6 can disable it
Multimodal coding inputs	Kimi K2.7 Code or K2.6	Kimi docs emphasize text, image, and video input
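
The table above can be read as a small decision rule. The sketch below encodes it in Python as one possible reading, not a vendor recommendation. The `Needs` fields and the `first_candidates` function are hypothetical names invented for illustration; only the model names and the thinking-mode constraint come from the table.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Needs:
    """Hypothetical constraint flags; the field names are illustrative."""
    whole_repo_context: bool = False
    supported_tool_subscription: bool = False
    non_thinking_mode: bool = False
    fast_responses: bool = False
    multimodal_input: bool = False


def first_candidates(needs: Needs) -> list[str]:
    """Return the models to test first, in the order the constraints matched."""
    picks: list[str] = []

    def add(name: str) -> None:
        if name not in picks:
            picks.append(name)

    if needs.whole_repo_context or needs.supported_tool_subscription:
        add("GLM-5.2")  # 1M context, or the GLM Coding Plan route
    if needs.non_thinking_mode:
        add("Kimi K2.6")
    if needs.fast_responses:
        add("Kimi K2.7 Code HighSpeed")
    if needs.multimodal_input:
        add("Kimi K2.7 Code")
        add("Kimi K2.6")

    # K2.7 Code requires thinking, so drop it when non-thinking mode is required.
    if needs.non_thinking_mode:
        picks = [name for name in picks if not name.startswith("Kimi K2.7")]

    # With no stated constraint, fall back to the two current releases.
    return picks or ["GLM-5.2", "Kimi K2.7 Code"]
```

The rule only narrows the shortlist; every model it returns still has to go through the same eval tasks.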

Price And Context

GLM-5.2’s price case is not “cheapest raw token.” Its case is 1M context plus supported coding-tool economics. Kimi’s price case is cheaper input in cache-hit workflows and a simple OpenAI-compatible API path, with K2.7 Code HighSpeed available when speed is worth the higher output price.

Buyer question	GLM-5.2 answer	Kimi answer
“Which has bigger context?”	GLM-5.2 at 1M	Kimi K2.7/K2.6 at 256K-class
“Which has cheaper output?”	$4.40 / 1M	$4.00 / 1M base K2.7 Code/K2.6; $8.00 / 1M HighSpeed
“Which has cheaper cache-hit input?”	$0.26 / 1M cached input	$0.19 / 1M K2.7 Code; $0.16 / 1M K2.6
“Which has a subscription lane?”	GLM Coding Plan	Kimi membership/Kimi Code paths, but verify current quota wording
“Which should handle routine work?”	Test GLM if tool support fits	Test Kimi if API/tooling support fits
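
To see how cache-hit ratio changes the price picture, here is a minimal Python sketch of per-request cost. The prices are copied from the Quick Facts table. Two things are assumptions: that GLM-5.2's "cached" price plays the same role as Kimi's "cache hit" price, and that billing splits input tokens linearly between hit and miss rates. The `request_cost` helper is a hypothetical name, not part of any vendor SDK.

```python
# USD per 1M tokens, copied from the Quick Facts table. Recheck before use.
PRICES = {
    "GLM-5.2": {"hit": 0.26, "miss": 1.40, "out": 4.40},
    "Kimi K2.7 Code": {"hit": 0.19, "miss": 0.95, "out": 4.00},
    "Kimi K2.7 Code HighSpeed": {"hit": 0.38, "miss": 1.90, "out": 8.00},
    "Kimi K2.6": {"hit": 0.16, "miss": 0.95, "out": 4.00},
}


def request_cost(model, input_tokens, output_tokens, cache_hit_ratio):
    """Estimate the USD cost of one request.

    cache_hit_ratio is the fraction of input tokens billed at the cached
    rate (assumed to be a simple linear split).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    price = PRICES[model]
    hit_tokens = input_tokens * cache_hit_ratio
    miss_tokens = input_tokens - hit_tokens
    total = (
        hit_tokens * price["hit"]
        + miss_tokens * price["miss"]
        + output_tokens * price["out"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    for name in PRICES:
        cost = request_cost(name, 200_000, 4_000, cache_hit_ratio=0.8)
        print(f"{name}: ${cost:.4f}")
```

For a 200K-token prompt at an 80% hit ratio with 4K tokens of output, this arithmetic gives roughly $0.115 for GLM-5.2 and $0.084 for Kimi K2.7 Code. Replace the ratio with the hit rate your own logs show, since that number drives most of the difference.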

Tool Fit

Workflow	Better first test	Reason
Claude Code-style supported-tool routing	GLM-5.2 via Z.AI	Z.AI documents Anthropic-compatible and supported-tool routes
OpenAI-compatible API experiments	Both	Both have OpenAI-compatible API paths
Kimi-native coding membership	Kimi K2.7 Code / Kimi Code	Kimi-specific workflows should test the current Kimi coding model
OpenClaw/BYOK agent routing	Both	Use provider-specific docs and treat subscriptions separately from direct API billing
Very large repo audit	GLM-5.2	1M context is the main spec advantage
Vision/video coding task	Kimi K2.7 Code or K2.6	Kimi docs explicitly emphasize multimodal input
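
Before choosing a model for a large repo audit, it helps to check whether the repo plausibly fits in the context window at all. The Python sketch below is a rough estimate under loud assumptions: about 4 characters per token (a common rule of thumb, not a tokenizer measurement), "256K-class" treated as 256,000 tokens, and a hypothetical 32,000-token reserve for instructions and output. The function names and the extension list are illustrative.

```python
import os

CHARS_PER_TOKEN = 4  # rough rule of thumb; real tokenizers vary by language

# Context sizes from the Quick Facts table; "256K-class" is assumed to be 256,000.
CONTEXT_TOKENS = {
    "GLM-5.2": 1_000_000,
    "Kimi K2.7 Code": 256_000,
    "Kimi K2.6": 256_000,
}


def estimate_repo_tokens(root, extensions=(".py", ".ts", ".go", ".md")):
    """Walk a directory and estimate the token count of matching text files."""
    total_chars = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories such as .git by pruning them in place.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    total_chars += len(handle.read())
    return total_chars // CHARS_PER_TOKEN


def fits(model, repo_tokens, reserve_tokens=32_000):
    """Return True if the repo plus a reserve fits the model's context."""
    return repo_tokens + reserve_tokens <= CONTEXT_TOKENS[model]
```

A repo that fits is not automatically a good single-prompt audit, and one that does not fit can still be audited in slices, so treat the result as a routing hint only.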

Caveats

Z.AI benchmark language is vendor-published. Use it to decide what to test, not to claim “best model.”
Kimi K2.7 Code benchmark and speed language is also vendor-published. Measure your own cost per accepted patch.
Kimi K2.7 Code requires thinking mode. If your workflow needs non-thinking behavior, test K2.6.
Subscription quota and membership wording can move faster than API pricing docs. Treat checkout and current docs as final before purchase.
Context size is not quality. A smaller-context model can still win if it follows your repo conventions better.

Eval Checklist

Run the same tasks through each candidate:

Eval	What to measure	Winner signal
Bug fix	One real failing test	Smallest correct patch, fewest retries
Refactor	2-4 files with local style constraints	Preserves behavior and tests
Repo audit	Architecture map from docs and code	Accurate module boundaries
Tool calls	Multi-step task with tool results	Keeps reasoning/tool context intact
Cost	Actual cache-hit/cache-miss and output usage	Lower cost per accepted result
Review	Patch review against a known risky change	Concrete findings without invented policy
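
The "lower cost per accepted result" signal in the checklist can be computed from a simple log of trials. The Python sketch below assumes you record one row per attempt with its measured cost and whether a human accepted the result. The `Trial` record and the `cost_per_accepted` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One attempt at one eval task (hypothetical record shape)."""
    model: str
    task: str
    cost_usd: float  # measured spend for this attempt, retries included
    accepted: bool   # True if the result was accepted after review


def cost_per_accepted(trials):
    """Return total spend divided by accepted results, per model.

    Failed attempts still count toward spend, so retries raise the figure.
    A model with no accepted results gets float("inf").
    """
    spend = {}
    accepted = {}
    for trial in trials:
        spend[trial.model] = spend.get(trial.model, 0.0) + trial.cost_usd
        accepted[trial.model] = accepted.get(trial.model, 0) + int(trial.accepted)
    return {
        model: (spend[model] / accepted[model]) if accepted[model] else float("inf")
        for model in spend
    }
```

Dividing by accepted results rather than by attempts is the design choice that matters here: a cheaper model that needs two tries can end up costing more per usable patch.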

Related Guides

/models/glm-5.2/ - GLM-5.2 model guide
/models/kimi-k2.7-code/ - Kimi K2.7 Code model guide
/tools/zai/ - Z.AI Coding Plan setup and caveats
/tools/kimi-code/ - Kimi Code membership and tool path
/value/kimi-access/ - Kimi access and promo-check guidance
/value/smart-spend/ - Paid-stack routing strategy
/compare/models/mid-range/ - Production spend-band comparison

Sources

Last verified: June 20, 2026. Recheck API pricing, context limits, thinking-mode requirements, and subscription quota terms before committing production workloads.