The practical answer: test ZCode first for an integrated GLM-5.2-native desktop workflow and long-running goals; choose Pi for a minimal, programmable terminal harness; choose OpenCode for a polished general-purpose open-source agent with broad providers, IDE/GitHub integration, and granular permissions.
That is a workflow recommendation, not an empirical ranking. We have not run controlled same-task tests proving that one of these harnesses wins. Use the test card below before standardizing.
| Harness | Start here when you want | Main trade-off |
|---|---|---|
| ZCode | GLM-5.2-native desktop workflow, Goal Mode, built-in subagents, Git/change review | Tighter Z.AI product fit; less provider-neutral |
| Pi | Small terminal core, sessions, compaction, custom tools and TypeScript extensions | You assemble more of the workflow; Coding Plan authorization is unclear |
| OpenCode | Broad providers, terminal/desktop/IDE/GitHub use, agents and explicit permissions | Zen, BYO-provider, and Z.AI Coding Plan are separate billing paths |
The Model Is Only Half the System
A coding model does not select files, expose tools, decide when to retry, or recover a long session by itself. The harness does. Six levers materially affect useful output:
- Context selection: which files, instructions, diffs, and tool results enter the prompt.
- Tool interfaces: whether the model gets precise read/edit/test primitives or a vague adapter.
- Agent loop: how the harness plans, acts, observes results, and decides what to try next.
- Verification and stopping: whether it runs the requested tests, checks the diff, and stops on evidence.
- Permissions: what it may read, edit, execute, publish, or delete without approval.
- Recovery and compaction: how it preserves decisions when context fills or a tool call fails.
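The agent-loop and verification levers above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how Pi, ZCode, or OpenCode implement their loops: `run_agent_loop`, `propose_patch`, and `run_tests` are hypothetical names, and the model and test runner are replaced by stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopResult:
    passed: bool
    attempts: int
    log: List[str] = field(default_factory=list)


def run_agent_loop(
    propose_patch: Callable[[List[str]], str],
    run_tests: Callable[[str], bool],
    max_attempts: int = 3,
) -> LoopResult:
    """Act/observe loop that stops on test evidence or when the attempt budget runs out."""
    log: List[str] = []
    for attempt in range(1, max_attempts + 1):
        patch = propose_patch(log)   # act: hypothetical model call that sees prior observations
        passed = run_tests(patch)    # observe: run the requested tests
        log.append(f"attempt {attempt}: {patch} -> {'pass' if passed else 'fail'}")
        if passed:                   # stop on evidence instead of continuing to edit
            return LoopResult(True, attempt, log)
    return LoopResult(False, max_attempts, log)


# Stubs standing in for a model and a test runner (illustrative only).
def fake_model(log: List[str]) -> str:
    return "patch-b" if log else "patch-a"  # changes its hypothesis after a failure


def fake_tests(patch: str) -> bool:
    return patch == "patch-b"


result = run_agent_loop(fake_model, fake_tests)
print(result.passed, result.attempts)
```

The two design choices worth comparing across harnesses are visible here: whether earlier observations feed the next attempt, and whether the loop exits as soon as the acceptance check passes.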
Harness-Bench isolates this execution layer across shared tasks, budgets, and protocols. Its 5,194 trajectories show substantial variation in completion, process quality, efficiency, and failure behavior across model-harness pairings.
Claw-SWE-Bench gives a concrete GLM example: a minimal direct-diff adapter scored 19.1% Pass@1, while a full adapter reached 73.4% with the same GLM-5.1 backbone, a gap of 54.3 percentage points. That result shows adapter and interface design can dominate outcomes. It does not measure Pi versus ZCode versus OpenCode, and it does not establish GLM-5.2 performance in any of them.
Pi vs ZCode vs OpenCode
| Decision point | Pi | ZCode | OpenCode |
|---|---|---|---|
| Primary interface | Terminal TUI, print/JSON, RPC, SDK | Desktop ADE with terminal, Git, tasks, remote and bot controls | Terminal TUI plus desktop, IDE and GitHub integrations |
| Provider posture | Broad provider support and custom providers | Deep GLM-5.2 integration | Broad provider support; optional OpenCode Zen |
| Long work | Persistent branching sessions and compaction | Goal Mode iterates until goal verification passes | Primary agents, subagents, sessions and configurable workflows |
| Extensibility | TypeScript extensions, custom tools, skills, prompt templates, packages | Skills, MCP, plugins, commands and custom subagents | Agents, commands, tools, MCP and provider configuration |
| Permissions | Project trust plus extension-controlled tool interception | Confirmation modes from confirm-before-changes through fuller access | Per-tool, per-command and per-agent allow/ask/deny rules |
| GLM-5.2 access | Pi-native Z.AI coding endpoint; Coding Plan authorization is not established | Z.AI product with GLM models and Coding Plan connection | OpenCode Zen PAYG, direct Z.AI PAYG, or Z.AI Coding Plan |
| Best first test | Programmable terminal workflow | Integrated GLM-5.2 desktop workflow | General-purpose multi-provider workflow |
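The allow/ask/deny idea in the Permissions row can be illustrated with a small rule resolver. This is a sketch only: the rule format, the `decide` function, and the first-match-wins policy are assumptions made for illustration and do not reflect any harness's actual configuration schema.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple

# Hypothetical rules: (tool, command pattern, decision). Not a real harness schema.
RULES: List[Tuple[str, str, str]] = [
    ("bash", "git push*", "deny"),
    ("bash", "npm test*", "allow"),
    ("edit", "*", "ask"),
    ("read", "*", "allow"),
]


def decide(tool: str, command: str, rules: List[Tuple[str, str, str]] = RULES) -> str:
    """Return the decision of the first matching rule; fall back to 'ask'."""
    for rule_tool, pattern, decision in rules:
        if rule_tool == tool and fnmatchcase(command, pattern):
            return decision
    return "ask"  # assumption: unknown actions require approval


print(decide("bash", "git push origin main"))
print(decide("bash", "rm -rf build"))
```

Defaulting unmatched actions to `ask` is the conservative choice; a harness that defaults to `allow` trades safety for fewer interruptions.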
Three Different Ways To Pay
Do not treat “supports GLM-5.2” as one entitlement.
| Path | What it means | Use in |
|---|---|---|
| Direct Z.AI PAYG | API usage billed to a Z.AI API account | OpenCode’s Z.AI provider; other compatible clients |
| OpenCode Zen | Optional OpenCode gateway; add credits, then pay per request at per-model pricing | OpenCode only |
| GLM Coding Plan | Subscription quota restricted to Z.AI’s officially supported tools and products | ZCode and listed integrations such as OpenCode |
For the subscription rules, supported tools, and quotas, see the Z.AI Coding Plan guide. For model specs and PAYG pricing, see the GLM-5.2 model page.
Copyable Productive Tasks
These prompts constrain scope, define evidence, and make harness behavior easier to compare.
1. Bounded failing-test repair
2. Read-only repository audit
3. Multi-file refactor with review
Same-Model Harness Test Card
Run the same model, task, repository commit, instructions, budget, and timeout in each harness. Repeat enough times to expose flaky behavior.
Compare successful patches per unit of cost/quota and review time, not just whether the harness eventually produced a diff.
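One way to record runs and compute that comparison is sketched below. The `Run` fields and the `summarize` function are hypothetical, and the sample values are placeholders chosen to show the arithmetic, not measurements of any harness.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    harness: str
    tests_passed: bool
    cost_units: float       # tokens, quota, or currency; pick one unit and keep it fixed
    review_minutes: float
    interventions: int


def summarize(runs: List[Run]) -> dict:
    """Aggregate repeated runs of one harness into comparison figures."""
    successes = sum(1 for r in runs if r.tests_passed)
    total_cost = sum(r.cost_units for r in runs)
    total_review = sum(r.review_minutes for r in runs)
    return {
        "runs": len(runs),
        "success_rate": successes / len(runs) if runs else 0.0,
        "successes_per_cost_unit": successes / total_cost if total_cost else 0.0,
        "review_minutes_per_success": total_review / successes if successes else float("inf"),
        "interventions": sum(r.interventions for r in runs),
    }


# Placeholder data for a generic harness; replace with your own recorded runs.
sample = [
    Run("harness-a", True, 2.0, 10.0, 0),
    Run("harness-a", False, 3.0, 5.0, 1),
    Run("harness-a", True, 5.0, 15.0, 0),
]
summary = summarize(sample)
print(summary)
```

Failed runs still count toward cost and review time, which is what keeps a harness that eventually succeeds after many expensive attempts from looking cheaper than it is.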
Why 1M Context Does Not Guarantee Productivity
One million tokens is capacity, not a promise that the right evidence will be selected or retained. Dumping an entire repository into context can dilute relevant instructions, increase latency, and make failures harder to diagnose. Agent loops also multiply usage: Z.AI estimates one Coding Plan prompt may invoke a model 15–20 times.
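To see how quickly the loop multiplies usage, the 15–20 calls-per-prompt estimate quoted above can be turned into a rough range. The function name and the figure of 40 prompts are illustrative assumptions, not Z.AI quota numbers.

```python
from typing import Tuple


def estimate_model_calls(prompts: int, calls_per_prompt: Tuple[int, int] = (15, 20)) -> Tuple[int, int]:
    """Return the (low, high) number of model invocations for a number of prompts."""
    low, high = calls_per_prompt
    return prompts * low, prompts * high


# 40 prompts in a working session is an arbitrary example value.
print(estimate_model_calls(40))  # (600, 800)
```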
Costs rise when a harness:
- rereads large files instead of using targeted search;
- retries without changing its hypothesis;
- launches redundant subagents;
- carries noisy tool output forward;
- compacts away constraints or earlier test evidence;
- keeps working after acceptance criteria already pass.
Start with the smallest sufficient context. Record quota/tokens, retries, interventions, and test evidence in the test card.
Recommendation By Workflow
- Choose ZCode when GLM-5.2 is the primary model and you want an integrated desktop environment with explicit goals, ongoing verification, safety confirmations, and built-in collaboration features.
- Choose Pi when you want a small terminal harness that can become your own tool through extensions, custom tools, session branching, and customizable compaction.
- Choose OpenCode when you need a provider-neutral default, explicit permission policies, reusable agents, and a path across terminal, desktop, IDE, and GitHub workflows.
- Keep testing when the task is high risk. None of these feature lists proves lower defect rates in your repository.
Related links
- /tools/pi/
- /tools/zcode/
- /tools/opencode/
- /models/glm-5.2/
- /tools/zai/
- /value/smart-spend/
- /posts/how-to-read-ai-benchmarks/
Sources
- Pi coding-agent repository and README
- Pi GLM-5.2 provider configuration
- Pi extensions
- Pi compaction
- ZCode product page
- ZCode Goal Mode
- ZCode subagents
- ZCode safety confirmations
- OpenCode repository
- OpenCode agents and permissions
- OpenCode Zen
- Z.AI OpenCode integration
- Z.AI supported Coding Plan tools
- Z.AI Coding Plan usage policy
- Z.AI Coding Plan FAQ
- Harness-Bench
- Claw-SWE-Bench
Last verified: July 2, 2026. Harness features, model routing, prices, and subscription authorization can change independently.