
Claude Opus 4.7: Vision, Verification, and a New Effort Tier

Opus 4.7 adds 3x vision resolution, a new xhigh effort level, stricter instruction-following, and a 70% CursorBench score. Same $5/$25 pricing.


Claude Opus 4.7 is Anthropic's latest flagship model. It verifies its own outputs before reporting, processes images at more than three times the resolution of any previous Claude, and introduces a new xhigh effort level that sits between high and max. Instruction-following is stricter, vision capabilities took a generational leap, and coding benchmarks climbed again. Pricing stays at $5/$25 per million tokens.

The model ships alongside a Cyber Verification Program for security professionals, higher-resolution image support across the API, and task budgets in public beta for managing token spend on long-running agentic sessions.

Key Specs

| Spec | Details |
| --- | --- |
| API ID | claude-opus-4-7 |
| Release Date | April 16, 2026 |
| Context Window | 1M tokens (GA) |
| Max Output | 128,000 tokens |
| Pricing | $5 input / $25 output per 1M tokens |
| Status | Active, current recommended Opus |

What Changed: The Practical Improvements

Anthropic's internal teams use Claude Code daily, and each model release reflects what they learned from the previous one. Opus 4.7's changes are specific:

Self-verification before reporting. The model checks its own work before presenting results. It catches logical faults during planning and validates outputs against the original requirements. Intuit describes it as "catching its own logical faults during the planning phase and accelerating execution." Vercel's team observed it doing "proofs on systems code before starting work, which is new behavior."

3x vision resolution. Opus 4.7 accepts images up to 2,576 pixels on the long edge (roughly 3.75 megapixels), more than three times what prior Claude models supported. No API parameter changes needed. On XBOW's visual-acuity benchmark, it scored 98.5% compared to 54.5% for Opus 4.6. It reads chemical structures, complex technical diagrams, and dense charts that previous models struggled with.

Stricter instruction-following. The model interprets instructions more literally than Opus 4.6. This is a double-edged upgrade: prompts that relied on the model filling in implied context may need adjustment. The flip side is that explicit instructions produce more predictable results. Notion found it was the first model to pass their implicit-need tests.

New xhigh effort level. The effort scale now has four levels: low, high, xhigh, and max. Claude Code defaults to xhigh for all plans. The xhigh level gives deeper reasoning than high without the full cost of max. Hex's CTO noted that "low-effort Opus 4.7 is roughly equivalent to medium-effort Opus 4.6."

Longer autonomous sessions. Cognition (Devin) reports the model "works coherently for hours, pushes through hard problems." Factory saw 10-15% higher task success rates with fewer instances of the model stopping halfway through complex work.

Updated tokenizer. The same input may map to roughly 1.0-1.35x more tokens depending on content type. Combined with deeper thinking at higher effort levels, token usage increases. Controls include the effort parameter, task budgets, and conciseness prompting. Check the migration guide for details.

Benchmark Results

Opus 4.7 posted gains across coding, vision, legal, finance, and agentic evaluations:

| Benchmark | Opus 4.7 | Opus 4.6 | Notable |
| --- | --- | --- | --- |
| CursorBench | 70% | 58% | +12 points |
| Terminal-Bench 2.0 | 3 new tasks solved | baseline | Tasks no prior model could pass |
| XBOW Visual Acuity | 98.5% | 54.5% | +44 points, generational leap |
| BigLaw Bench | 90.9% | -- | Harvey, high effort |
| GDPval-AA | SoTA | SoTA | Maintained frontier position |
| General Finance | 0.813 | 0.767 | AlphaSense research-agent module |
| OfficeQA Pro | 21% fewer errors | baseline | Databricks evaluation |
| Notion Agent | +13% resolution | baseline | 93-task internal benchmark, fewer tokens |

CursorBench measures real-world coding assistance quality and is the benchmark most directly relevant to developers using Claude in their editor. The jump from 58% to 70% represents meaningfully better code suggestions, completions, and refactors.

The XBOW visual-acuity result deserves attention. Going from 54.5% to 98.5% is not an incremental improvement. This is the first Claude model where you can reliably pass in high-resolution screenshots, architectural diagrams, or scientific figures and expect accurate interpretation.

On Terminal-Bench 2.0, Opus 4.7 solved three tasks that no previous Claude model (or competing frontier model) could handle, including fixing a race condition that required multi-file reasoning across a complex codebase.

Rakuten reported 3x more production tasks resolved compared to Opus 4.6, with double-digit gains in Code Quality and Test Quality scores. CodeRabbit saw recall improve over 10%, noting the model is "a bit faster than GPT-5.4 xhigh."

[Figure: Claude Opus 4.7 benchmark comparison showing performance across coding, vision, finance, and agentic evaluations]

Vision: A Generational Leap

Previous Claude models were limited to lower-resolution image inputs. Opus 4.7 raises the ceiling to 2,576 pixels on the long edge, roughly 3.75 megapixels. This is a model-level change with no API parameters to toggle.

What this means in practice:

  • Code screenshots at full resolution, no more squinting artifacts
  • Technical diagrams with fine labels and small text rendered accurately
  • Chemical structures and scientific notation parsed correctly (Solve Intelligence confirmed this)
  • Charts and graphs with dense data points interpreted without hallucinating values

Higher-resolution images consume more tokens. If you are passing images where fine detail is not critical, downsample before sending to manage costs.
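One way to handle that downsampling is a small preprocessing step before the API call. A minimal sketch using Pillow; the 2,576-pixel cap comes from the article, while the function name and PNG re-encoding are illustrative choices:

```python
from io import BytesIO

from PIL import Image  # Pillow

# Long-edge cap documented for Opus 4.7 (2,576 px, roughly 3.75 MP).
MAX_LONG_EDGE = 2576

def downsample_for_claude(data: bytes, max_edge: int = MAX_LONG_EDGE) -> bytes:
    """Shrink an image so its longest side is at most max_edge pixels."""
    img = Image.open(BytesIO(data))
    if max(img.size) > max_edge:
        # thumbnail() preserves aspect ratio and only ever shrinks.
        img.thumbnail((max_edge, max_edge), Image.LANCZOS)
    out = BytesIO()
    img.save(out, format="PNG")
    return out.getvalue()
```

For images where fine detail does matter, skip the helper and send the original bytes; for everything else this keeps token spend predictable.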

Cybersecurity and the Cyber Verification Program

Opus 4.7 includes automated safeguards that detect and block requests indicating prohibited or high-risk cybersecurity uses. These safeguards are part of Anthropic's broader Project Glasswing initiative and serve as a testing ground for eventual broader release of their more capable Mythos-class models.

Legitimate security professionals can access cybersecurity capabilities through Anthropic's new Cyber Verification Program. This covers vulnerability research, penetration testing, and red-teaming. Apply at claude.com/form/cyber-use-case.

The cyber capabilities in Opus 4.7 are intentionally less advanced than what Anthropic's internal Mythos Preview can do. Training included differential reduction of certain cyber capabilities as a safety measure.

Safety Profile

Opus 4.7 maintains a similar safety profile to Opus 4.6 with targeted improvements in honesty and resistance to prompt injection attacks. Anthropic's assessment describes it as "largely well-aligned and trustworthy, though not fully ideal."

One noted weakness: the model can be overly detailed in harm-reduction advice on controlled substances. The full details are in the Claude Opus 4.7 System Card.

New API and Product Features

Alongside the model, Anthropic shipped:

Task budgets (public beta). Guide Claude's token spend across longer agentic runs. Set a budget and the model plans its token usage accordingly rather than spending freely.

Higher-resolution image support. The 2,576-pixel limit applies across the API for all Opus 4.7 requests. No opt-in needed.

/ultrareview in Claude Code. A dedicated slash command that runs a focused review session, flagging bugs and design issues. Pro and Max users get 3 free ultrareviews.

Auto mode for Max users. Claude makes decisions autonomously with fewer interruptions and managed risk. Previously limited to Enterprise plans.
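Of these, task budgets are the one that changes request construction. A hedged sketch of what a budgeted Messages API payload might look like; the model ID and 128K output cap come from the spec table above, but the `task_budget` field name is a hypothetical stand-in, since the article does not give the real beta parameter:

```python
def build_budgeted_request(prompt: str, budget_tokens: int) -> dict:
    """Sketch of a Messages API payload with a task budget.

    "task_budget" is a hypothetical field name for illustration only;
    check the API reference for the actual beta parameter.
    """
    return {
        "model": "claude-opus-4-7",    # API ID from the spec table
        "max_tokens": 128_000,         # documented max output
        "task_budget": budget_tokens,  # hypothetical: model plans spend against this
        "messages": [{"role": "user", "content": prompt}],
    }
```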

Pricing

No price increase. Unchanged from Opus 4.6:

| Tier | Cost |
| --- | --- |
| All contexts | $5 input / $25 output per 1M tokens |
| Prompt caching | Up to 90% savings |
| Batch processing | 50% savings |
| US-only inference | 1.1x standard pricing |

The tokenizer change means the same input may cost slightly more (1.0-1.35x) due to different token boundaries. For most workloads the increase is negligible. For token-sensitive applications, benchmark your specific content.
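A back-of-envelope way to bound that increase is to re-cost your monthly volume at the 1.35x ceiling. A minimal sketch, assuming the article's $5/$25 list pricing and (as a crude worst case) applying the factor to both input and output tokens:

```python
# Claude Opus 4.7 list pricing per the article: $5 / $25 per 1M tokens.
INPUT_PER_M = 5.00
OUTPUT_PER_M = 25.00

def monthly_cost(input_toks: float, output_toks: float,
                 tokenizer_factor: float = 1.0) -> float:
    """Dollar cost for a token volume, scaled by a tokenizer factor.

    Applying the 1.0-1.35x factor to both directions is a deliberately
    pessimistic simplification for budgeting purposes.
    """
    cost_in = input_toks * tokenizer_factor / 1e6 * INPUT_PER_M
    cost_out = output_toks * tokenizer_factor / 1e6 * OUTPUT_PER_M
    return cost_in + cost_out

base = monthly_cost(100e6, 10e6)         # $750.00 at 1.0x
worst = monthly_cost(100e6, 10e6, 1.35)  # ~$1012.50 at the 1.35x ceiling
```

If the gap between those two numbers matters to your budget, measure your real content instead of relying on the worst case.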

How to Use Opus 4.7 in Claude Code

Switch your default model:

claude config set model claude-opus-4-7

For per-session overrides:

claude --model claude-opus-4-7

The model is available across claude.ai, the Messages API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. The API model identifier is claude-opus-4-7.

To use the new xhigh effort level:

/effort xhigh
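For direct API calls, effort would be selected per request rather than per session. A hedged sketch of such a payload; the article only documents the slash command, so passing `effort` as a top-level field is an assumption, not a confirmed API shape:

```python
# Hypothetical per-request effort selection; the "effort" field name and
# placement are illustrative assumptions, not a confirmed API shape.
request = {
    "model": "claude-opus-4-7",
    "max_tokens": 4096,
    "effort": "xhigh",  # one of: low | high | xhigh | max
    "messages": [
        {"role": "user", "content": "Review this module for race conditions."}
    ],
}
```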

Opus 4.7 vs Opus 4.6: What Changed

| Feature | Opus 4.6 | Opus 4.7 |
| --- | --- | --- |
| Vision resolution | Standard (~850px long edge) | 2,576px long edge (3.75 MP, 3x more) |
| Effort levels | low, high, max | low, high, xhigh (new), max |
| Default effort | high | xhigh |
| CursorBench | 58% | 70% (+12 points) |
| XBOW Visual Acuity | 54.5% | 98.5% (+44 points) |
| Self-verification | Basic | Proactive output verification |
| Instruction-following | Standard | Stricter, more literal |
| Tokenizer | Previous generation | Updated (1.0-1.35x more tokens) |
| Cyber safeguards | Not present | Automated detection and blocking |
| Pricing | $5/$25 per 1M | $5/$25 per 1M (unchanged) |

The core story is vision and verification. The 3x resolution jump makes Opus 4.7 the first Claude model where image understanding is genuinely reliable for professional use. The self-verification behavior means fewer rounds of "wait, let me check that" from the user. Everything Opus 4.6 did well -- 1M context, 128K output, adaptive thinking, agent teams -- carries forward unchanged.

For model selection, the upgrade path is straightforward. If you use Opus for complex work, switch to 4.7. The only adjustment needed is reviewing prompts that depended on loose instruction interpretation, since the model now follows instructions more literally. If you are on Sonnet 4.6 for daily work and Opus for heavy lifting, that split still makes sense with the newer Opus.
