Codex, Claude Code, and Cursor overlap, but they optimize different workflows. Compare their controls, execution model, repository fit, and live plan terms before comparing the models they can route.
Quick Decision
| Need | Start with | Reason |
|---|---|---|
| OpenAI-native cloud-agent tasks | Codex | Managed task execution and OpenAI account integration |
| Terminal-native repository work | Claude Code | Claude-native coding workflow with Sonnet and Opus lanes |
| IDE-first autocomplete and interactive editing | Cursor | Editor-integrated completion, chat, and agent workflows |
| Premium Claude review | Claude Code with Opus 4.8 | Opus 4.8 is the active premium Claude baseline |
| GPT-5.6 evaluation | Codex only if explicitly approved | GPT-5.6 is not a normal Codex or ChatGPT entitlement |
There is no defensible universal winner. A tool that reduces interaction time can matter more than a small benchmark difference, while a premium model can matter when it prevents an expensive mistake.
Current Model Status
| Lane | Current read |
|---|---|
| OpenAI | GPT-5.5 remains the active baseline; GPT-5.6 Sol/Terra/Luna are selected-organization API/Codex previews |
| Anthropic | Sonnet 5 is the daily production lane and Opus 4.8 is the premium baseline |
| Fable/Mythos | Fable restored but guarded and high-cost; Mythos remains trusted-access only |
| Older rows | Opus 4.5 and earlier GPT results are historical, not June 2026 rankings |
Tool model pickers and aliases change. Confirm the exact model available in the target account instead of treating this page as an entitlement list.
Relevant Current Model Evidence
| Model | Provider | Status | Context | Input price | Output price | Coding signal | Tool-use signal | Benchmark evidence | Speed | Verdict | Sources | Checked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | Active: generally available OpenAI baseline while GPT-5.6 remains in limited preview. | 1.05M API; 400K Codex | $5.00 / 1M | $30.00 / 1M | not verified | not verified | not verified | not verified | Primary coding seat while ChatGPT/Codex limits fit the workload. | OpenAI GPT-5.5 API model page, OpenAI GPT-5.5 ChatGPT limits, Artificial Analysis: GPT-5.5, LMArena leaderboard dataset | 2026-06-28 |
| GPT-5.6 Sol | OpenAI | Preview: selected API organizations and Codex workspaces only; no public enrollment and no ChatGPT access during preview. | not published | $5.00 / 1M | $30.00 / 1M | OpenAI reports a new state of the art on Terminal-Bench 2.1; expanded and independent results are pending. | API and Codex preview; max reasoning and ultra multi-agent modes are vendor-documented. | | OpenAI announced a selected-customer Cerebras preview for July; production latency is not verified. | Restricted-preview evaluation only; keep GPT-5.5 as the active OpenAI baseline. | OpenAI GPT-5.6 launch [archive], OpenAI GPT-5.6 preview access [archive], OpenAI GPT-5.6 Preview system card [archive] | 2026-06-28 |
| Claude Sonnet 5 | Anthropic | Active: generally available across Claude plans, Claude Code, the Claude API, GitHub Copilot, and supported AWS paths. | 1M | $2.00 / 1M | $10.00 / 1M | Anthropic reports substantial coding and agentic gains over Sonnet 4.6; independent normalized results are pending. | Available in Claude Code and the Claude API; adaptive thinking is on by default. | | No site-owned normalized latency result is verified. | First Claude cost/performance test before Opus API pricing; retain Opus for highest-accuracy arbitration. | Anthropic Claude Sonnet 5 launch [archive], Claude Sonnet 5 migration guide [archive], GitHub Copilot Claude Sonnet 5 launch [archive], Claude Sonnet 5 on AWS [archive] | 2026-07-01 |
| Claude Opus 4.8 | Anthropic | Active: current generally available Opus-tier premium baseline. | 1M | $5.00 / 1M | $25.00 / 1M | Practical premium Claude baseline; AIHackers recommends task-level comparison rather than blanket default routing. | Claude API and Claude-native workflow baseline; third-party routing must follow Anthropic terms. | | Artificial Analysis measured 60.4 output tokens/s; provider and workload latency vary. | Premium review, architecture, hard-debugging, and final-arbitration lane. | Claude models overview [archive], Claude API pricing [archive], Artificial Analysis: Claude Opus 4.8 [archive], Artificial Analysis Intelligence Index v4.1, LMArena leaderboard dataset, Berkeley Function Calling Leaderboard | 2026-06-28 |
| Claude Fable 5 | Anthropic | Active: restored globally on native Claude surfaces July 1; included usage is plan-specific and cloud re-enablement is rolling out. | 1M | $10.00 / 1M | $50.00 / 1M | Anthropic reports frontier launch results; independent reproducible ranking is pending. | Guarded-domain requests can refuse or fall back; verify account behavior before routing. | | not verified | High-cost guarded escalation only; use Opus 4.8 as the practical Claude premium baseline. | Claude models overview [archive], Claude API pricing [archive], Anthropic Fable 5 and Mythos 5 [archive], Anthropic Fable/Mythos access statement [archive], Anthropic Fable 5 redeployment [archive] | 2026-07-01 |
This table evaluates model evidence, not the surrounding coding tools. Comparing tool productivity still requires running each tool on the same repository task under the same acceptance rules.
Workflow Differences
Codex
Use Codex when the OpenAI-native agent workflow, managed execution environment, and task delegation fit the repository. Verify current workspace permissions, network access, model selection, rate limits, and data controls.
GPT-5.6 preview access is scoped separately to approved API organizations and Codex workspaces. Approval for one does not automatically include the other.
Claude Code
Use Claude Code when a terminal-native workflow and Claude model routing fit. Start routine work with Sonnet 5 and escalate difficult review, architecture, or debugging to Opus 4.8.
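The routine-versus-premium split can be written down as an explicit policy so that escalation is a recorded decision rather than a habit. The following is a minimal sketch, not a Claude Code feature: the lane labels, the `Task` type, and `pick_lane` are hypothetical names, and the labels are placeholders for whatever exact model IDs the target account exposes.

```python
from dataclasses import dataclass

# Hypothetical lane labels (assumption): replace with the exact model IDs
# shown in the target account's model picker.
ROUTINE_LANE = "sonnet-5"
PREMIUM_LANE = "opus-4.8"

# Task kinds the text names as escalation candidates.
ESCALATE_KINDS = {"review", "architecture", "debugging"}


@dataclass(frozen=True)
class Task:
    kind: str                # e.g. "feature", "review", "debugging"
    difficult: bool = False  # set by a human or by a prior failed attempt


def pick_lane(task: Task) -> str:
    """Return the premium lane only for difficult review, architecture, or debugging work."""
    if task.difficult and task.kind in ESCALATE_KINDS:
        return PREMIUM_LANE
    return ROUTINE_LANE


if __name__ == "__main__":
    print(pick_lane(Task("feature")))
    print(pick_lane(Task("debugging", difficult=True)))
```

Keeping the rule in one function makes it easy to log which lane handled each task when costs are reviewed later.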
Do not describe Claude Code as exposing private chain-of-thought. Evaluate the visible plan, tool calls, diffs, tests, and final explanation instead.
Cursor
Use Cursor when editor-integrated completion, interactive changes, and visual diff review matter most. Its available models, quotas, modes, and prices can change independently of provider API list prices, so check the current product and billing pages.
What to Compare
| Dimension | Evidence to collect |
|---|---|
| Repository control | Allowed paths, confirmation gates, worktree behavior, and diff review |
| Model identity | Exact model ID or a recorded “provider-managed/undisclosed” limitation |
| Completion quality | Tests, accepted patches, regressions, and repair work |
| Cost | Subscription, API usage, premium requests, retries, and review time |
| Latency | Time to first useful edit and time to accepted completion |
| Security | Credential scope, network access, retention, logs, and administrative controls |
Avoid hard-coded concurrency, latency, or message-limit claims unless the current provider page documents them. Account tiers and rollout cohorts can produce materially different behavior.
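The two latency measures in the table can be derived from timestamps captured during a run instead of quoted from a vendor page. This is a small sketch under assumptions: `RunTimeline` and `latency_seconds` are hypothetical names, and the timestamps are seconds on any monotonic clock chosen by the evaluator.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RunTimeline:
    started_at: float                      # prompt submitted
    first_useful_edit_at: Optional[float]  # first edit the reviewer kept, if any
    accepted_at: Optional[float]           # reviewer accepted the final diff, if ever


def latency_seconds(t: RunTimeline) -> Dict[str, Optional[float]]:
    """Compute both latency measures; None means the event never happened."""
    def delta(ts: Optional[float]) -> Optional[float]:
        return None if ts is None else ts - t.started_at

    return {
        "time_to_first_useful_edit": delta(t.first_useful_edit_at),
        "time_to_accepted_completion": delta(t.accepted_at),
    }
```

Recording `None` for a run that was never accepted keeps failed runs visible rather than silently dropping them from an average.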
Evaluation Protocol
Pin one repository commit and run:
- A repository architecture map.
- One real failing-test fix.
- A cross-file refactor with explicit boundaries.
- Review of another tool’s patch.
For every run, record the tool version, model, settings, prompt, time, tokens or credits, retries, tests, final diff, and reviewer disposition. A run without the exact model identity should not be merged into a model leaderboard.
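One way to enforce that rule is to give every run a fixed record shape and filter on model identity before aggregating. The sketch below is illustrative only; the field names, `RunRecord`, `leaderboard_eligible`, and `split_runs` are hypothetical and should be adapted to the team's own logging format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RunRecord:
    tool: str
    tool_version: str
    model_id: Optional[str]    # None when provider-managed or undisclosed
    settings: dict
    prompt: str
    wall_seconds: float
    tokens_or_credits: float
    retries: int
    tests_passed: bool
    final_diff: str
    reviewer_disposition: str  # e.g. "accepted", "rejected", "needs-repair"


def leaderboard_eligible(run: RunRecord) -> bool:
    """A run counts toward a model leaderboard only with an exact model identity."""
    return bool(run.model_id and run.model_id.strip())


def split_runs(runs: List[RunRecord]) -> Tuple[List[RunRecord], List[RunRecord]]:
    """Separate runs usable for model comparison from tool-only observations."""
    eligible = [r for r in runs if leaderboard_eligible(r)]
    tool_only = [r for r in runs if not leaderboard_eligible(r)]
    return eligible, tool_only
```

Runs in the second list still carry useful tool-level evidence (time, retries, reviewer disposition); they are only excluded from per-model comparisons.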
Pricing Rules
- Keep tool subscriptions separate from provider API pricing.
- Treat checkout, credits, premium requests, and rate limits as live account facts.
- Compare cost per accepted task, not only input-token list price.
- Do not infer a tool’s monthly cost from a model API example.
Current API reference points, in USD per 1M input/output tokens, include Sonnet 5 at $2/$10 through August 31 (then $3/$15), Opus 4.8 at $5/$25, and GPT-5.6 Sol at $5/$30 during restricted preview. Those numbers do not describe Cursor, Claude Code, or Codex subscription entitlements.
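The difference between list price and cost per accepted task is simple arithmetic, sketched below. The per-token prices come from the reference points above; the token volumes and attempt counts are made-up illustrative inputs, not measurements, and `api_cost_usd` and `cost_per_accepted_task` are hypothetical helper names. Subscription fees and reviewer time are deliberately left out and would need to be added separately.

```python
from typing import List, Optional


def api_cost_usd(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """API cost of one attempt, given prices in USD per 1M tokens."""
    return (input_tokens / 1_000_000 * input_price_per_m
            + output_tokens / 1_000_000 * output_price_per_m)


def cost_per_accepted_task(attempt_costs: List[float],
                           accepted_count: int) -> Optional[float]:
    """Total spend across all attempts (including retries) per accepted task."""
    if accepted_count == 0:
        return None  # nothing was accepted, so the ratio is undefined
    return sum(attempt_costs) / accepted_count


if __name__ == "__main__":
    # Illustrative volumes only (assumption): 400K input and 50K output tokens per attempt.
    sonnet_attempt = api_cost_usd(400_000, 50_000, 2.00, 10.00)
    opus_attempt = api_cost_usd(400_000, 50_000, 5.00, 25.00)
    # Hypothetical outcome: 10 attempts with 5 accepted vs. 6 attempts with 6 accepted.
    print(round(cost_per_accepted_task([sonnet_attempt] * 10, 5), 2))
    print(round(cost_per_accepted_task([opus_attempt] * 6, 6), 2))
```

With these invented inputs the cheaper model costs less per attempt, yet the gap per accepted task narrows once retries are counted, which is why the accepted-task figure is the one to compare.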
Security Rules
All three tools can act on valuable repositories and credentials. Use least privilege, separate development and production credentials, require confirmation for destructive or external actions, and verify completion from repository and external-state evidence.
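A confirmation gate of the kind described here can be expressed as a small policy check that sits in front of agent actions. This is a sketch under stated assumptions, not the behavior of any of the three tools: `ProposedAction` and `gate` are hypothetical names, and denying production credentials outright is one possible reading of "separate development and production credentials".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    description: str
    destructive: bool = False                  # e.g. deleting files, force-pushing
    external: bool = False                     # e.g. network calls, deploys, sending messages
    uses_production_credentials: bool = False


def gate(action: ProposedAction, confirmed_by_human: bool = False) -> str:
    """Return "deny", "needs_confirmation", or "allow" for a proposed agent action."""
    # Assumption: agents never receive production credentials in this policy.
    if action.uses_production_credentials:
        return "deny"
    # Destructive or external actions wait for an explicit human confirmation.
    if (action.destructive or action.external) and not confirmed_by_human:
        return "needs_confirmation"
    return "allow"
```

The gate decides only whether an action may proceed; confirming that it actually completed still depends on checking repository and external state afterward.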
Air-gapped or local-model support must be verified for the exact configuration. A terminal interface alone does not make a cloud-model workflow local or offline.
Verdict
- Choose Codex for OpenAI-native managed agent tasks after validating workspace controls and the current model.
- Choose Claude Code for terminal-native Claude workflows, with Sonnet 5 as the routine lane and Opus 4.8 as premium escalation.
- Choose Cursor for IDE-first interaction after validating its current model roster, quotas, and privacy controls.
- Use multiple tools only when the repository policy, commit boundaries, and review process keep their changes attributable.
Sources
- OpenAI: Codex and GPT-5.6 preview access
- Anthropic: Claude Code, model overview, and pricing
- Cursor: Documentation and pricing
- Artificial Analysis: Claude Opus 4.8
Related links
- /compare/codex-vs-claude-vs-kimi/
- /models/gpt-5-6/
- /models/claude-opus-4-8/
- /tools/codex/
- /tools/cursor/
- /risks/codex/cloud-dependency-risks/
Last verified: June 28, 2026. Tool plans, model rosters, quotas, security controls, and preview access change independently.