Codex vs Claude Code vs Cursor

Codex, Claude Code, and Cursor overlap, but they optimize for different workflows. Compare their controls, execution model, repository fit, and live plan terms before comparing the models they can route to.

Quick Decision

Need	Start with	Reason
OpenAI-native cloud-agent tasks	Codex	Managed task execution and OpenAI account integration
Terminal-native repository work	Claude Code	Claude-native coding workflow with Sonnet and Opus lanes
IDE-first autocomplete and interactive editing	Cursor	Editor-integrated completion, chat, and agent workflows
Premium Claude review	Claude Code with Opus 4.8	Opus 4.8 is the active premium Claude baseline
GPT-5.6 evaluation	Codex only if explicitly approved	GPT-5.6 is not a normal Codex or ChatGPT entitlement

There is no defensible universal winner. A tool that reduces interaction time can matter more than a small benchmark difference, while a premium model can matter when it prevents an expensive mistake.

Current Model Status

Lane	Current read
OpenAI	GPT-5.5 remains the active baseline; GPT-5.6 Sol/Terra/Luna are selected-organization API/Codex previews
Anthropic	Sonnet 5 is the daily production lane and Opus 4.8 is the premium baseline
Fable/Mythos	Fable restored but guarded and high-cost; Mythos remains trusted-access only
Older rows	Opus 4.5 and earlier GPT results are historical, not June 2026 rankings

Tool model pickers and aliases change. Confirm the exact model available in the target account instead of treating this page as an entitlement list.

Model	Provider	Status	Context	Input price	Output price	Coding signal	Tool-use signal	Benchmark evidence	Speed	Verdict	Sources	Checked
GPT-5.5	OpenAI	Active: generally available OpenAI baseline while GPT-5.6 remains in limited preview.	1.05M API; 400K Codex	$5.00 / 1M	$30.00 / 1M	not verified	not verified	not verified	not verified	Primary coding seat while ChatGPT/Codex limits fit the workload.	OpenAI GPT-5.5 API model page, OpenAI GPT-5.5 ChatGPT limits, Artificial Analysis: GPT-5.5, LMArena leaderboard dataset	2026-06-28
GPT-5.6 Sol	OpenAI	Preview: selected API organizations and Codex workspaces only; no public enrollment and no ChatGPT access during preview.	not published	$5.00 / 1M	$30.00 / 1M	OpenAI reports a new state of the art on Terminal-Bench 2.1; expanded and independent results are pending.	API and Codex preview; max reasoning and ultra multi-agent modes are vendor-documented.	Terminal-Bench 2.1: vendor reports new state of the art (vendor); AIHackers repo eval: not verified (site-owned)	OpenAI announced a selected-customer Cerebras preview for July; production latency is not verified.	Restricted-preview evaluation only; keep GPT-5.5 as the active OpenAI baseline.	OpenAI GPT-5.6 launch [archive], OpenAI GPT-5.6 preview access [archive], OpenAI GPT-5.6 Preview system card [archive]	2026-06-28
Claude Sonnet 5	Anthropic	Active: generally available across Claude plans, Claude Code, the Claude API, GitHub Copilot, and supported AWS paths.	1M	$2.00 / 1M	$10.00 / 1M	Anthropic reports substantial coding and agentic gains over Sonnet 4.6; independent normalized results are pending.	Available in Claude Code and the Claude API; adaptive thinking is on by default.	Cross-model benchmark evidence: vendor-reported, updated chart and system card preferred (vendor); Independent normalized evaluation: not verified (independent); AIHackers repo eval: not verified (site-owned)	No site-owned normalized latency result is verified.	First Claude cost/performance test before Opus API pricing; retain Opus for highest-accuracy arbitration.	Anthropic Claude Sonnet 5 launch [archive], Claude Sonnet 5 migration guide [archive], GitHub Copilot Claude Sonnet 5 launch [archive], Claude Sonnet 5 on AWS [archive]	2026-07-01
Claude Opus 4.8	Anthropic	Active: current generally available Opus-tier premium baseline.	1M	$5.00 / 1M	$25.00 / 1M	Practical premium Claude baseline; AIHackers recommends task-level comparison rather than blanket default routing.	Claude API and Claude-native workflow baseline; third-party routing must follow Anthropic terms.	Artificial Analysis Intelligence Index v4.0: 61 (independent); Artificial Analysis output speed: 60.4 tokens/s (independent)	Artificial Analysis measured 60.4 output tokens/s; provider and workload latency vary.	Premium review, architecture, hard-debugging, and final-arbitration lane.	Claude models overview [archive], Claude API pricing [archive], Artificial Analysis: Claude Opus 4.8 [archive], Artificial Analysis Intelligence Index v4.1, LMArena leaderboard dataset, Berkeley Function Calling Leaderboard	2026-06-28
Claude Fable 5	Anthropic	Active: restored globally on native Claude surfaces July 1; included usage is plan-specific and cloud re-enablement is rolling out.	1M	$10.00 / 1M	$50.00 / 1M	Anthropic reports frontier launch results; independent reproducible ranking is pending.	Guarded-domain requests can refuse or fall back; verify account behavior before routing.	Independent cross-model evaluation: not verified (independent); AIHackers repo eval: not verified (site-owned)	not verified	High-cost guarded escalation only; use Opus 4.8 as the practical Claude premium baseline.	Claude models overview [archive], Claude API pricing [archive], Anthropic Fable 5 and Mythos 5 [archive], Anthropic Fable/Mythos access statement [archive], Anthropic Fable 5 redeployment [archive]	2026-07-01

This table evaluates model evidence, not the surrounding coding tools. Comparing tool productivity still requires running the same repository task under the same acceptance rules in each tool.

Workflow Differences

Codex

Use Codex when the OpenAI-native agent workflow, managed execution environment, and task delegation fit the repository. Verify current workspace permissions, network access, model selection, rate limits, and data controls.

GPT-5.6 preview access is scoped separately to approved API organizations and Codex workspaces. Approval for one does not automatically include the other.

Claude Code

Use Claude Code when a terminal-native workflow and Claude model routing fit. Start routine work with Sonnet 5 and escalate difficult review, architecture, or debugging to Opus 4.8.

Do not describe Claude Code as exposing private chain-of-thought. Evaluate the visible plan, tool calls, diffs, tests, and final explanation instead.

Cursor

Use Cursor when editor-integrated completion, interactive changes, and visual diff review matter most. Its available models, quotas, modes, and prices can change independently of provider API list prices, so check the current product and billing pages.

What to Compare

Dimension	Evidence to collect
Repository control	Allowed paths, confirmation gates, worktree behavior, and diff review
Model identity	Exact model ID or a recorded “provider-managed/undisclosed” limitation
Completion quality	Tests, accepted patches, regressions, and repair work
Cost	Subscription, API usage, premium requests, retries, and review time
Latency	Time to first useful edit and time to accepted completion
Security	Credential scope, network access, retention, logs, and administrative controls

Avoid hard-coded concurrency, latency, or message-limit claims unless the current provider page documents them. Account tiers and rollout cohorts can produce materially different behavior.

Evaluation Protocol

Pin one repository commit and run:

A repository architecture map.
One real failing-test fix.
A cross-file refactor with explicit boundaries.
Review of another tool’s patch.
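The four tasks above can be laid out as a run plan so that every tool sees the same work at the same commit. The following is a minimal sketch, assuming Python; the task labels, the `PlannedRun` type, and the commit hash are hypothetical names chosen for illustration, not part of any tool's interface.

```python
from dataclasses import dataclass
from itertools import product

# Tool names come from this comparison; task labels are illustrative shorthand.
TOOLS = ["Codex", "Claude Code", "Cursor"]
TASKS = [
    "architecture-map",
    "failing-test-fix",
    "cross-file-refactor",
    "patch-review",
]


@dataclass(frozen=True)
class PlannedRun:
    tool: str
    task: str
    commit: str  # the single pinned commit shared by every run


def build_plan(commit: str) -> list[PlannedRun]:
    """Pair every tool with every task at one pinned commit."""
    if not commit:
        raise ValueError("pin a commit before planning runs")
    return [PlannedRun(tool, task, commit) for tool, task in product(TOOLS, TASKS)]


plan = build_plan("abc1234")  # hypothetical commit hash
print(len(plan))  # 12 planned runs: 3 tools x 4 tasks
```

Generating the plan up front makes a missing tool/task pairing visible before any results are compared.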

For every run, record the tool version, model, settings, prompt, time, tokens or credits, retries, tests, final diff, and reviewer disposition. A run without the exact model identity should not be merged into a model leaderboard.
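One way to keep these fields together is a small record type with an eligibility check for leaderboard use. This is a sketch under stated assumptions: the field names, the `RunRecord` type, and the placeholder values such as `example-model-id` are hypothetical and do not correspond to any tool's export format.

```python
from dataclasses import dataclass
from typing import Optional

# Recorded when the tool does not disclose which model handled the run.
UNDISCLOSED = "provider-managed/undisclosed"


@dataclass
class RunRecord:
    tool: str
    tool_version: str
    model_id: Optional[str]    # exact model ID, None, or UNDISCLOSED
    settings: dict
    prompt: str
    wall_seconds: float
    tokens_or_credits: float
    retries: int
    tests_passed: bool
    final_diff: str
    reviewer_disposition: str  # e.g. "accepted", "rejected", "needs-repair"


def leaderboard_eligible(run: RunRecord) -> bool:
    """Only runs with an exact recorded model identity may feed a model leaderboard."""
    return bool(run.model_id) and run.model_id != UNDISCLOSED


# Hypothetical records; all values are placeholders for illustration.
runs = [
    RunRecord("Claude Code", "0.0.0", "example-model-id", {}, "fix the failing test",
              312.0, 48_000, 1, True, "<diff omitted>", "accepted"),
    RunRecord("Cursor", "0.0.0", UNDISCLOSED, {}, "fix the failing test",
              280.0, 3, 0, True, "<diff omitted>", "accepted"),
]
eligible = [r for r in runs if leaderboard_eligible(r)]
print([r.tool for r in eligible])
```

Runs that fail the check can still inform a tool-level comparison; they are only excluded from model-level rankings.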

Pricing Rules

Keep tool subscriptions separate from provider API pricing.
Treat checkout, credits, premium requests, and rate limits as live account facts.
Compare cost per accepted task, not only input-token list price.
Do not infer a tool’s monthly cost from a model API example.

Current API reference points, in USD per 1M input/output tokens, include Sonnet 5 at $2/$10 through August 31 (then $3/$15), Opus 4.8 at $5/$25, and GPT-5.6 Sol at $5/$30 during restricted preview. Those numbers do not describe Cursor, Claude Code, or Codex subscription entitlements.
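The difference between list price and cost per accepted task can be worked through with simple arithmetic. The sketch below assumes Python and uses the Sonnet 5 API reference prices quoted above; the token totals and the accepted-task count are invented purely to show the calculation, and the function names are hypothetical. It covers API usage only, not subscriptions, credits, or reviewer time.

```python
from typing import Optional


def api_cost_usd(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """API spend in USD, given prices quoted per 1M tokens."""
    return (input_tokens / 1_000_000 * input_price_per_m
            + output_tokens / 1_000_000 * output_price_per_m)


def cost_per_accepted_task(total_cost_usd: float, accepted_tasks: int) -> Optional[float]:
    """Spread the total spend, retries included, over accepted tasks only."""
    if accepted_tasks == 0:
        return None  # nothing was accepted, so the ratio is undefined
    return total_cost_usd / accepted_tasks


# Hypothetical workload: every attempt and retry counted together.
total = api_cost_usd(
    input_tokens=2_000_000,
    output_tokens=200_000,
    input_price_per_m=2.00,    # Sonnet 5 input reference price
    output_price_per_m=10.00,  # Sonnet 5 output reference price
)
per_task = cost_per_accepted_task(total, accepted_tasks=4)
print(f"total ${total:.2f}, per accepted task ${per_task:.2f}")
```

With these made-up inputs the total is $6.00 and the cost per accepted task is $1.50. The denominator counts accepted tasks rather than attempts, so retries and rejected patches raise the figure even when the list price is unchanged.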

Security Rules

All three tools can act on valuable repositories and credentials. Use least privilege, separate development and production credentials, and require confirmation for destructive or external actions. Verify that a task actually completed by checking repository state and external systems rather than relying on the tool's own report.
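The confirmation requirement can be illustrated with a small gate that flags commands before an agent runs them. This is a sketch only, assuming Python: the command lists and the `needs_confirmation` helper are hypothetical, deliberately incomplete, and do not describe how Codex, Claude Code, or Cursor implement their own approval settings.

```python
import shlex

# Illustrative, non-exhaustive lists; a real policy would be stricter and reviewed.
DESTRUCTIVE = {"rm", "dd", "mkfs"}
EXTERNAL = {"curl", "wget", "ssh", "scp"}
RISKY_GIT_SUBCOMMANDS = {"push", "reset", "clean"}


def needs_confirmation(command: str) -> bool:
    """Return True when a shell command should pause for human approval."""
    argv = shlex.split(command)
    if not argv:
        return False
    program = argv[0]
    if program in DESTRUCTIVE or program in EXTERNAL:
        return True
    if program == "git" and len(argv) > 1 and argv[1] in RISKY_GIT_SUBCOMMANDS:
        return True
    return False


for cmd in ["pytest -q", "git status", "git push origin main", "rm -rf build"]:
    print(cmd, "->", "confirm" if needs_confirmation(cmd) else "allow")
```

A name-based list like this is easy to bypass through shells, scripts, or aliases, so it complements least-privilege credentials and sandboxing rather than replacing them.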

Air-gapped or local-model support must be verified for the exact configuration. A terminal interface alone does not make a cloud-model workflow local or offline.

Verdict

Choose Codex for OpenAI-native managed agent tasks after validating workspace controls and the current model.
Choose Claude Code for terminal-native Claude workflows, with Sonnet 5 as the routine lane and Opus 4.8 as premium escalation.
Choose Cursor for IDE-first interaction after validating its current model roster, quotas, and privacy controls.
Use multiple tools only when the repository policy, commit boundaries, and review process keep their changes attributable.

Sources

OpenAI: Codex and GPT-5.6 preview access
Anthropic: Claude Code, model overview, and pricing
Cursor: Documentation and pricing
Artificial Analysis: Claude Opus 4.8

Last verified: June 28, 2026; table rows marked 2026-07-01 were checked on that date. Tool plans, model rosters, quotas, security controls, and preview access change independently.