Codex, Claude Code, and Cursor overlap, but they optimize different workflows. Compare their controls, execution model, repository fit, and live plan terms before comparing the models they can route.

Quick Decision

Need | Start with | Reason
OpenAI-native cloud-agent tasks | Codex | Managed task execution and OpenAI account integration
Terminal-native repository work | Claude Code | Claude-native coding workflow with Sonnet and Opus lanes
IDE-first autocomplete and interactive editing | Cursor | Editor-integrated completion, chat, and agent workflows
Premium Claude review | Claude Code with Opus 4.8 | Opus 4.8 is the active premium Claude baseline
GPT-5.6 evaluation | Codex only if explicitly approved | GPT-5.6 is not a normal Codex or ChatGPT entitlement

There is no defensible universal winner. A tool that reduces interaction time can matter more than a small benchmark difference, while a premium model can matter when it prevents an expensive mistake.

Current Model Status

Lane | Current read
OpenAI | GPT-5.5 remains the active baseline; GPT-5.6 Sol/Terra/Luna are selected-organization API/Codex previews
Anthropic | Sonnet 5 is the daily production lane and Opus 4.8 is the premium baseline
Fable/Mythos | Fable restored but guarded and high-cost; Mythos remains trusted-access only
Older rows | Opus 4.5 and earlier GPT results are historical, not June 2026 rankings

Tool model pickers and aliases change. Confirm the exact model available in the target account instead of treating this page as an entitlement list.

Relevant Current Model Evidence

GPT-5.5 (OpenAI)
  • Status: active. Generally available OpenAI baseline while GPT-5.6 remains in limited preview.
  • Context: 1.05M API; 400K Codex
  • Input price: $5.00 per 1M tokens
  • Output price: $30.00 per 1M tokens
  • Coding signal: not verified
  • Tool-use signal: not verified
  • Benchmark evidence: not verified
  • Speed: not verified
  • Verdict: Primary coding seat while ChatGPT/Codex limits fit the workload.
  • Sources: OpenAI GPT-5.5 API model page, OpenAI GPT-5.5 ChatGPT limits, Artificial Analysis: GPT-5.5, LMArena leaderboard dataset
  • Checked: 2026-06-28

GPT-5.6 Sol (OpenAI)
  • Status: preview. Selected API organizations and Codex workspaces only; no public enrollment and no ChatGPT access during preview.
  • Context: not published
  • Input price: $5.00 per 1M tokens
  • Output price: $30.00 per 1M tokens
  • Coding signal: OpenAI reports a new state of the art on Terminal-Bench 2.1; expanded and independent results are pending.
  • Tool-use signal: API and Codex preview; max reasoning and ultra multi-agent modes are vendor-documented.
  • Benchmark evidence: Terminal-Bench 2.1: vendor reports new state of the art (vendor); AIHackers repo eval: not verified (site-owned)
  • Speed: OpenAI announced a selected-customer Cerebras preview for July; production latency is not verified.
  • Verdict: Restricted-preview evaluation only; keep GPT-5.5 as the active OpenAI baseline.
  • Sources: OpenAI GPT-5.6 launch [archive], OpenAI GPT-5.6 preview access [archive], OpenAI GPT-5.6 Preview system card [archive]
  • Checked: 2026-06-28

Claude Sonnet 5 (Anthropic)
  • Status: active. Generally available across Claude plans, Claude Code, the Claude API, GitHub Copilot, and supported AWS paths.
  • Context: 1M
  • Input price: $2.00 per 1M tokens
  • Output price: $10.00 per 1M tokens
  • Coding signal: Anthropic reports substantial coding and agentic gains over Sonnet 4.6; independent normalized results are pending.
  • Tool-use signal: Available in Claude Code and the Claude API; adaptive thinking is on by default.
  • Benchmark evidence: Cross-model benchmark evidence: vendor-reported; updated chart and system card preferred (vendor); Independent normalized evaluation: not verified (independent); AIHackers repo eval: not verified (site-owned)
  • Speed: No site-owned normalized latency result is verified.
  • Verdict: First Claude cost/performance test before Opus API pricing; retain Opus for highest-accuracy arbitration.
  • Sources: Anthropic Claude Sonnet 5 launch [archive], Claude Sonnet 5 migration guide [archive], GitHub Copilot Claude Sonnet 5 launch [archive], Claude Sonnet 5 on AWS [archive]
  • Checked: 2026-07-01

Claude Opus 4.8 (Anthropic)
  • Status: active. Current generally available Opus-tier premium baseline.
  • Context: 1M
  • Input price: $5.00 per 1M tokens
  • Output price: $25.00 per 1M tokens
  • Coding signal: Practical premium Claude baseline; AIHackers recommends task-level comparison rather than blanket default routing.
  • Tool-use signal: Claude API and Claude-native workflow baseline; third-party routing must follow Anthropic terms.
  • Benchmark evidence: Artificial Analysis Intelligence Index v4.0: 61 (independent); Artificial Analysis output speed: 60.4 tokens/s (independent)
  • Speed: Artificial Analysis measured 60.4 output tokens/s; provider and workload latency vary.
  • Verdict: Premium review, architecture, hard-debugging, and final-arbitration lane.
  • Sources: Claude models overview [archive], Claude API pricing [archive], Artificial Analysis: Claude Opus 4.8 [archive], Artificial Analysis Intelligence Index v4.1, LMArena leaderboard dataset, Berkeley Function Calling Leaderboard
  • Checked: 2026-06-28

Claude Fable 5 (Anthropic)
  • Status: active. Restored globally on native Claude surfaces July 1; included usage is plan-specific and cloud re-enablement is rolling out.
  • Context: 1M
  • Input price: $10.00 per 1M tokens
  • Output price: $50.00 per 1M tokens
  • Coding signal: Anthropic reports frontier launch results; independent reproducible ranking is pending.
  • Tool-use signal: Guarded-domain requests can refuse or fall back; verify account behavior before routing.
  • Benchmark evidence: Independent cross-model evaluation: not verified (independent); AIHackers repo eval: not verified (site-owned)
  • Speed: not verified
  • Verdict: High-cost guarded escalation only; use Opus 4.8 as the practical Claude premium baseline.
  • Sources: Claude models overview [archive], Claude API pricing [archive], Anthropic Fable 5 and Mythos 5 [archive], Anthropic Fable/Mythos access statement [archive], Anthropic Fable 5 redeployment [archive]
  • Checked: 2026-07-01

This table evaluates model evidence, not the surrounding coding tools. Tool productivity still requires the same repository task and acceptance rules.

Workflow Differences

Codex

Use Codex when the OpenAI-native agent workflow, managed execution environment, and task delegation fit the repository. Verify current workspace permissions, network access, model selection, rate limits, and data controls.

GPT-5.6 preview access is scoped separately to approved API organizations and Codex workspaces. Approval for one does not automatically include the other.

Claude Code

Use Claude Code when a terminal-native workflow and Claude model routing fit. Start routine work with Sonnet 5 and escalate difficult review, architecture, or debugging to Opus 4.8.
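
The routine-versus-escalation rule above can be written down as a small routing function. This is a minimal sketch, not a Claude Code feature: the `Task` type, the `choose_lane` function, and the task categories are hypothetical, and the lane strings are the article's labels, not verified API model IDs.

```python
from dataclasses import dataclass

# Lane labels mirror the article's wording; they are NOT verified API model IDs.
ROUTINE_LANE = "Sonnet 5"
PREMIUM_LANE = "Opus 4.8"

# Hypothetical task categories that justify escalation, taken from the text above.
ESCALATION_KINDS = {"review", "architecture", "debugging"}


@dataclass(frozen=True)
class Task:
    kind: str        # e.g. "edit", "review", "architecture", "debugging"
    difficult: bool  # a judgment supplied by the caller, not computed here


def choose_lane(task: Task) -> str:
    """Return the premium lane only for difficult review, architecture, or debugging."""
    if task.difficult and task.kind in ESCALATION_KINDS:
        return PREMIUM_LANE
    return ROUTINE_LANE
```

Keeping the rule explicit makes it easier to audit how often work is escalated and whether the premium lane actually changed the outcome.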

Do not describe Claude Code as exposing private chain-of-thought. Evaluate the visible plan, tool calls, diffs, tests, and final explanation instead.

Cursor

Use Cursor when editor-integrated completion, interactive changes, and visual diff review matter most. Its available models, quotas, modes, and prices can change independently of provider API list prices, so check the current product and billing pages.

What to Compare

Dimension | Evidence to collect
Repository control | Allowed paths, confirmation gates, worktree behavior, and diff review
Model identity | Exact model ID or a recorded “provider-managed/undisclosed” limitation
Completion quality | Tests, accepted patches, regressions, and repair work
Cost | Subscription, API usage, premium requests, retries, and review time
Latency | Time to first useful edit and time to accepted completion
Security | Credential scope, network access, retention, logs, and administrative controls

Avoid hard-coded concurrency, latency, or message-limit claims unless the current provider page documents them. Account tiers and rollout cohorts can produce materially different behavior.

Evaluation Protocol

Pin one repository commit and run:

  1. A repository architecture map.
  2. One real failing-test fix.
  3. A cross-file refactor with explicit boundaries.
  4. Review of another tool’s patch.

For every run, record the tool version, model, settings, prompt, time, tokens or credits, retries, tests, final diff, and reviewer disposition. A run without the exact model identity should not be merged into a model leaderboard.
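
One way to keep these fields consistent across tools is a single record type with an explicit eligibility check. The sketch below is illustrative only: the `RunRecord` field names and the `leaderboard_eligible` helper are assumptions, not a format any of the three tools emits.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder values that record a limitation rather than an exact model identity.
UNDISCLOSED_MARKERS = {"", "provider-managed", "undisclosed", "provider-managed/undisclosed"}


@dataclass
class RunRecord:
    """Hypothetical record of one evaluation run; field names are illustrative."""
    tool: str
    tool_version: str
    model_id: Optional[str]    # exact model ID, or None / a placeholder if undisclosed
    settings: dict
    prompt: str
    wall_seconds: float
    tokens_or_credits: float
    retries: int
    tests_passed: bool
    final_diff: str
    reviewer_disposition: str  # e.g. "accepted", "rejected", "needs-repair"


def leaderboard_eligible(run: RunRecord) -> bool:
    """A run joins a model leaderboard only when the exact model identity is known."""
    if run.model_id is None:
        return False
    return run.model_id.strip().lower() not in UNDISCLOSED_MARKERS
```

Runs that fail the check can still be kept for tool-level comparison; they are simply excluded from any per-model ranking.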

Pricing Rules

  • Keep tool subscriptions separate from provider API pricing.
  • Treat checkout, credits, premium requests, and rate limits as live account facts.
  • Compare cost per accepted task, not only input-token list price.
  • Do not infer a tool’s monthly cost from a model API example.
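
As a rough illustration of the third rule, cost per accepted task can be computed by summing the cost components and dividing by the number of accepted tasks. The function name, parameters, and figures below are hypothetical; real inputs would come from billing pages and review logs.

```python
def cost_per_accepted_task(
    subscription_share_usd: float,  # portion of the tool subscription charged to this batch
    api_usage_usd: float,           # metered API or premium-request spend, retries included
    review_hours: float,            # human time spent reviewing and repairing output
    reviewer_hourly_usd: float,     # assumed loaded hourly rate for the reviewer
    accepted_tasks: int,            # tasks whose final diff was accepted
) -> float:
    """Total spend for a batch divided by the number of accepted tasks."""
    if accepted_tasks <= 0:
        raise ValueError("cost per accepted task is undefined with no accepted tasks")
    total = subscription_share_usd + api_usage_usd + review_hours * reviewer_hourly_usd
    return total / accepted_tasks


# Made-up numbers: $20 subscription share, $12.50 API spend, 3 review hours at $60/h,
# and 5 accepted tasks.
example = cost_per_accepted_task(20.0, 12.5, 3.0, 60.0, 5)  # 42.5
```

In this made-up batch, review time dominates the total, which is why list-price comparisons alone can mislead.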

Current API reference points include Sonnet 5 at $2/$10 through August 31 (then $3/$15), Opus 4.8 at $5/$25, and GPT-5.6 Sol at $5/$30 during restricted preview. Those numbers do not describe Cursor, Claude Code, or Codex subscription entitlements.
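
To see how those list prices turn into a per-request figure, the arithmetic can be sketched as follows. The prices are copied from the paragraph above and may be out of date; the token counts and the `api_cost_usd` helper are assumptions for illustration.

```python
# USD per 1M tokens as (input, output), copied from the paragraph above.
API_PRICES = {
    "Sonnet 5": (2.0, 10.0),
    "Sonnet 5 (after August 31)": (3.0, 15.0),
    "Opus 4.8": (5.0, 25.0),
    "GPT-5.6 Sol": (5.0, 30.0),
}


def api_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """List-price API cost of one request; excludes any subscription or tool fee."""
    input_price, output_price = API_PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical request size: 200,000 input tokens and 20,000 output tokens.
    for model in API_PRICES:
        print(f"{model}: ${api_cost_usd(model, 200_000, 20_000):.2f}")
```

For that hypothetical request, the sketch gives $0.60 for Sonnet 5 at the current price and $1.50 for Opus 4.8. Neither figure says anything about what the same work costs inside a tool subscription.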

Security Rules

All three tools can act on valuable repositories and credentials. Use least privilege, separate development and production credentials, require confirmation for destructive or external actions, and verify completion from repository and external-state evidence.
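
A confirmation requirement for destructive or external actions can be approximated with a simple pre-execution check. The sketch below is a hypothetical illustration, not a feature of Codex, Claude Code, or Cursor; the prefix lists are assumptions, and a prefix match is a review aid, not a security boundary.

```python
# Hypothetical prefixes; a real policy would be stricter and repository-specific.
DESTRUCTIVE_PREFIXES = ("rm ", "git push --force", "git reset --hard", "drop table")
EXTERNAL_PREFIXES = ("curl ", "ssh ", "kubectl ")


def needs_confirmation(command: str) -> bool:
    """Return True when a proposed command should wait for explicit human approval."""
    # Collapse whitespace and lowercase so trivially reformatted commands still match.
    normalized = " ".join(command.strip().lower().split())
    return normalized.startswith(DESTRUCTIVE_PREFIXES + EXTERNAL_PREFIXES)
```

A gate like this only decides when to ask. Least-privilege credentials and verification from repository and external-state evidence are still needed, because a command that slips past the list must not be able to reach production.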

Air-gapped or local-model support must be verified for the exact configuration. A terminal interface alone does not make a cloud-model workflow local or offline.

Verdict

  • Choose Codex for OpenAI-native managed agent tasks after validating workspace controls and the current model.
  • Choose Claude Code for terminal-native Claude workflows, with Sonnet 5 as the routine lane and Opus 4.8 as premium escalation.
  • Choose Cursor for IDE-first interaction after validating its current model roster, quotas, and privacy controls.
  • Use multiple tools only when the repository policy, commit boundaries, and review process keep their changes attributable.

Last verified: June 28, 2026. Tool plans, model rosters, quotas, security controls, and preview access change independently.