Ultracode in Claude Code: Effort Setting Explained

Ultracode sends xhigh effort and auto-triggers Dynamic Workflows. How it differs from xhigh, max, and ultrathink, plus the real token cost.

A single Claude context window has three predictable failure modes on hard work. It quits early on a fifty-item task after finishing thirty-five and calls it done. It grades its own output too generously when you ask it to verify itself. And it slowly loses the thread of your original goal as the conversation gets long enough to need compaction. Dynamic Workflows fix all three by spreading the work across separate Claudes, each with a clean context window and one focused job. The catch is that you normally have to ask for a workflow. Ultracode is the setting that makes Claude reach for one on its own.

That is the whole idea: ultracode is the workflow toggle, left on for the entire session. It sends xhigh reasoning effort to the model and additionally has Claude auto-orchestrate Dynamic Workflows for substantive tasks, so a request that would otherwise run in one overloaded window gets fanned out across verifying subagents without you typing the word "workflow." It shipped in Claude Code v2.1.154 on May 28, 2026, alongside Claude Opus 4.8. This guide covers what ultracode is, how it differs from xhigh, max, and ultrathink, how to enable it, where it runs, and the open-ended token cost you need to understand before turning it on for routine work.

What Ultracode Actually Is

Here is the definition straight from Anthropic's model configuration docs, worth quoting in full because the wording matters:

"Ultracode is a Claude Code setting rather than a model effort level: it sends xhigh to the model and additionally has Claude orchestrate dynamic workflows for substantive tasks. It applies to the current session only."

Read that twice. Ultracode is not a deeper rung on the reasoning ladder. It is a setting that does two jobs at once. First, it pins your per-message reasoning to xhigh. Second, it flips on automatic workflow orchestration, so Claude decides on its own when a task is big enough to fan out across subagents.

That second job is where the leverage lives, and it is exactly the fix for the three failure modes above. A workflow spreads the task across isolated agents so no single window has to hold all fifty items, which kills the early quitting. It hands verification to a separate agent that never wrote the answer it is judging, which removes the self-preferential bias. And it keeps each agent's window short enough that goal drift never sets in, because the agent finishes and returns before compaction can blur the objective. If you want the full mechanics of how a workflow plans, fans out, and verifies, read the Dynamic Workflows guide. This post owns the effort-setting and cost side.

The practical upshot: with ultracode on, a single request can turn into several workflows in a row. As Anthropic's workflows docs put it, "one to understand the code, one to make the change, and one to verify it. This applies to every task in the session, so each request uses more tokens and takes longer than at lower effort levels." That is the trade. You buy depth and built-in verification on every task, and you pay for it in tokens and wall-clock time whether the task needed it or not.

Ultracode vs xhigh vs max vs ultrathink

These four get confused constantly because they all sound like "more." They are four different mechanisms operating on different axes. Here is how they actually differ.

Setting	What it is	API effort sent	Triggers workflows?	Scope	Persists across sessions?
`xhigh`	A model effort level (deep reasoning, high token spend)	`xhigh`	No	Session-wide setting	Yes
`max`	A model effort level (deepest reasoning, no token cap)	`max`	No	Session-wide setting	No (session-only)
`ultrathink`	A one-turn prompt keyword for deeper reasoning	Unchanged	No	A single turn	N/A (not a setting)
`ultracode`	A Claude Code setting: `xhigh` plus auto-orchestration	`xhigh`	Yes (automatic)	Session-wide setting	No (session-only)

The ultracode vs ultrathink distinction trips up the most people. ultrathink is a prompt keyword. Drop the word ultrathink anywhere in a single prompt and Claude Code adds an in-context instruction to think harder on that one turn. It does not change your session effort level, it does not change the effort value sent to the API, and it does not trigger a workflow. It is a per-turn nudge that only affects how hard Claude thinks.

Ultracode is the opposite shape: a session-wide setting that pins effort to xhigh and, crucially, auto-orchestrates workflows for every substantive task until you turn it off. One is a single-turn reasoning request. The other is a session-long behavioral mode that changes what Claude does, not just how hard it thinks. They share five letters and nothing else.

The xhigh vs ultracode line is cleaner once you see it through the failure-mode lens. Both send xhigh, so raw reasoning depth is identical. But xhigh alone still runs in one context window, which means it is still exposed to early quitting, self-grading, and goal drift on big tasks. Ultracode adds the automatic workflow layer that structurally removes those risks. If you want deeper reasoning without Claude spinning up parallel subagents on its own, set xhigh directly and leave ultracode off.
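The comparison above can be written down as a small data model, which makes the axes easier to diff. This is only an illustrative sketch: the `Mechanism` class, its field names, and the `differences` helper are hypothetical names invented here, and the values simply restate the table.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mechanism:
    name: str
    kind: str                   # "effort level", "prompt keyword", or "setting"
    api_effort: Optional[str]   # None: the effort value sent to the API is unchanged
    triggers_workflows: bool
    scope: str                  # "session" or "turn"
    persists: Optional[bool]    # None: not applicable (not a setting)


# Values restate the comparison table in this article.
MECHANISMS = {
    "xhigh": Mechanism("xhigh", "effort level", "xhigh", False, "session", True),
    "max": Mechanism("max", "effort level", "max", False, "session", False),
    "ultrathink": Mechanism("ultrathink", "prompt keyword", None, False, "turn", None),
    "ultracode": Mechanism("ultracode", "setting", "xhigh", True, "session", False),
}


def differences(a: str, b: str) -> list[str]:
    """Return the names of the fields on which two mechanisms differ."""
    left, right = MECHANISMS[a], MECHANISMS[b]
    fields = ("kind", "api_effort", "triggers_workflows", "scope", "persists")
    return [f for f in fields if getattr(left, f) != getattr(right, f)]


print(differences("xhigh", "ultracode"))
print(differences("ultrathink", "ultracode"))
```

Under these assumptions, xhigh and ultracode agree on the effort sent to the API and on scope, and differ on kind, workflow triggering, and persistence, while ultrathink and ultracode differ on every field.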

The Claude Code Effort Ladder

Effort levels control adaptive reasoning, which lets the model decide whether and how much to think on each step. Here is the full ladder, with use cases and persistence behavior.

Level	Use it for	Persists across sessions
`low`	Short, scoped, latency-sensitive tasks that aren't intelligence-sensitive	Yes
`medium`	Cost-sensitive work that can trade off some intelligence	Yes
`high`	The balanced default for most coding. Default on Opus 4.8, 4.6, Sonnet 4.6	Yes
`xhigh`	Deeper reasoning at higher token spend. Default on Opus 4.7	Yes
`max`	Deepest reasoning, no token cap. Prone to overthinking, test before adopting	No (session-only)
`ultracode`	`xhigh` plus automatic Dynamic Workflow orchestration on substantive tasks	No (session-only)

Notice that the first five rungs only change reasoning depth within a single window. Ultracode is the only one that also changes the execution shape, moving the work out of one window and into many. That is why it sits at the top even though it sends xhigh rather than max: it is not the deepest thinker on the ladder, it is the only setting that defeats the structural failure modes instead of just thinking harder inside them.

Two model facts shape which levels you actually see. Opus 4.8 and Opus 4.7 support low, medium, high, xhigh, and max. Opus 4.6 and Sonnet 4.6 support low, medium, high, and max with no xhigh. Because ultracode sends xhigh, it only appears in the /effort menu on xhigh-capable models. On Opus 4.6 or Sonnet 4.6, the menu doesn't offer it.

A common error worth correcting: Opus 4.8 defaults to high, not xhigh. Opus 4.7 is the model that defaults to xhigh. So if you're on Opus 4.8 and you want anything above balanced effort, you're opting in deliberately. Ultracode is the most aggressive opt-in on that list. For more on matching the model to the work, see the Claude Opus 4.8 guide.
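To make the model facts concrete, here is a minimal lookup sketch. It assumes the support matrix and defaults exactly as stated in this section; the `EFFORT_LEVELS` and `DEFAULT_EFFORT` tables and the `menu_entries` function are hypothetical names for illustration, not anything Claude Code exposes.

```python
from __future__ import annotations

# Effort levels per model, as listed in this article.
EFFORT_LEVELS = {
    "Opus 4.8": ("low", "medium", "high", "xhigh", "max"),
    "Opus 4.7": ("low", "medium", "high", "xhigh", "max"),
    "Opus 4.6": ("low", "medium", "high", "max"),
    "Sonnet 4.6": ("low", "medium", "high", "max"),
}

# Default effort per model, as listed in this article.
DEFAULT_EFFORT = {
    "Opus 4.8": "high",
    "Opus 4.7": "xhigh",
    "Opus 4.6": "high",
    "Sonnet 4.6": "high",
}


def menu_entries(model: str, workflows_enabled: bool = True) -> tuple[str, ...]:
    """Sketch of what the /effort menu would offer for a model.

    ultracode is appended only when the model supports xhigh and
    workflows are enabled, mirroring the rule described in the text.
    """
    levels = EFFORT_LEVELS[model]
    if "xhigh" in levels and workflows_enabled:
        return levels + ("ultracode",)
    return levels


print(menu_entries("Opus 4.8"))
print(menu_entries("Sonnet 4.6"))
```

The second call returns the four base levels with no ultracode entry, which matches the behavior described for models without xhigh.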

How to Enable Ultracode

Three methods work. Three that look like they should work don't. Get this right and you'll save yourself a confusing debugging session.

The three that work:

The simplest is the /effort menu. Run /effort to open the slider, or set it directly:

/effort ultracode

You can also pass it through settings as a boolean, either via --settings on launch or through an Agent SDK control request:

{ "ultracode": true }

The three that don't work:

Ultracode is not part of the effortLevel settings field. It is not accepted by the --effort flag. And it is not honored through the CLAUDE_CODE_EFFORT_LEVEL environment variable. Those three channels handle low through xhigh, but ultracode (like max) is session-only and lives outside them. If you put "effortLevel": "ultracode" in your settings file, it silently won't take.
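Because a misplaced value fails silently, a small pre-flight check on your settings can catch it. The sketch below is a hypothetical linter, not part of Claude Code: `check_settings` is an invented name, and it only encodes the rules stated in this section (the `effortLevel` field handles low through xhigh, and `ultracode` is a separate boolean).

```python
from __future__ import annotations

# Per this article, effortLevel handles low through xhigh only.
EFFORT_LEVEL_VALUES = ("low", "medium", "high", "xhigh")


def check_settings(settings: dict) -> list[str]:
    """Return human-readable problems found in a settings dict (hypothetical check)."""
    problems = []

    level = settings.get("effortLevel")
    if level is not None and level not in EFFORT_LEVEL_VALUES:
        problems.append(
            f'"effortLevel": "{level}" will not take effect; '
            f"expected one of {', '.join(EFFORT_LEVEL_VALUES)}"
        )

    flag = settings.get("ultracode")
    if flag is not None and not isinstance(flag, bool):
        problems.append('"ultracode" must be a boolean, for example {"ultracode": true}')

    return problems


print(check_settings({"effortLevel": "ultracode"}))  # one problem reported
print(check_settings({"ultracode": True}))           # no problems
```

Running something like this before launch turns the silent failure into a visible message.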

That session-only nature is the other thing to remember. Ultracode lasts for the current session and resets the moment you start a new one. Anthropic's guidance is blunt: "Drop back with /effort high when you return to routine work." It won't follow you between sessions the way high or xhigh will, which is by design. You don't want yesterday's audit-grade orchestration mode quietly applying to today's one-line typo fix.

If you want workflow behavior for a single task without committing the whole session to ultracode, there's an escape hatch. Type the word workflow anywhere in a prompt and Claude Code runs that one task as a workflow without changing your effort level. Think of it as ultracode-lite for a single request. If the keyword triggers when you didn't mean it, press alt+w to ignore it for that prompt, or backspace right after the highlighted word. To switch the trigger off entirely, toggle "Workflow keyword trigger" off in /config.

Plan and Model Availability

Dynamic Workflows, which ultracode drives, ship as a research preview on all paid plans, plus the Anthropic API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry. The defaults differ by plan:

Plan	Workflows default	How to enable
Pro	Off	Turn on the Dynamic workflows row in `/config`
Max	On	Available out of the box
Team	On	Available out of the box
Enterprise	Off	An admin enables it first

Pro is not gated out. That's a meaningful correction to the assumption that workflow-style orchestration is a Max-and-up feature. Pro users enable it from /config and then ultracode becomes available on xhigh-capable models just like anywhere else.

Two requirements gate everything. You need Claude Code v2.1.154 or later, so run claude update if you're behind. And ultracode only shows up when workflows are enabled. If you disable workflows through the /config toggle, through "disableWorkflows": true in settings, or through CLAUDE_CODE_DISABLE_WORKFLOWS=1, ultracode is removed from the /effort menu entirely. No workflows means no ultracode, because the orchestration half of the setting has nowhere to run.
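The gating rules above combine into one boolean check. This is a sketch under the assumptions in this section only: `ultracode_in_menu` and its parameters are hypothetical names, and the way Claude Code actually evaluates these conditions is not documented here.

```python
from __future__ import annotations

MIN_VERSION = (2, 1, 154)  # minimum Claude Code version named in this article


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.1.154' into (2, 1, 154) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def ultracode_in_menu(
    version: str,
    model_supports_xhigh: bool,
    config_toggle_on: bool,
    settings: dict,
    env: dict,
) -> bool:
    """Sketch: would ultracode appear in /effort given the rules in the text?"""
    if parse_version(version) < MIN_VERSION:
        return False
    if not model_supports_xhigh:
        return False
    workflows_disabled = (
        not config_toggle_on
        or settings.get("disableWorkflows") is True
        or env.get("CLAUDE_CODE_DISABLE_WORKFLOWS") == "1"
    )
    return not workflows_disabled


print(ultracode_in_menu("2.1.154", True, True, {}, {}))
print(ultracode_in_menu("2.1.99", True, True, {}, {}))
```

Note the tuple comparison: as plain strings, "2.1.99" would sort after "2.1.154", which is why the sketch parses versions into integers first.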

The Token-Cost Reality Check

This is the section that matters most, so let's be precise and balanced about it.

The official cost language is direct. A workflow spawns many agents, so "a single run can use meaningfully more tokens than working through the same task in conversation. Runs count toward your plan's usage and rate limits." With ultracode on, that multiplier applies to every substantive task in the session, not just one run you chose to fire. The same isolation that defeats the three failure modes is what makes ultracode expensive: you are paying for many agents instead of one, every time.

Here is the part people miss: there is no spending cap by default. You can impose one with an explicit token budget (ask for "use 10k tokens" and that becomes the ceiling), but left unbudgeted, convergence is open-ended. A workflow stops when the answers stabilize, not when it hits a fixed token ceiling. That's a feature for hard problems where you want the run to keep refuting and re-checking until it's confident. It's a liability on small work, because there's no built-in ceiling to catch a run that decides a trivial task deserves ninety agents.
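The difference between "stop when stable" and "stop at a budget" is easier to see in a toy loop. The sketch below is a simplified model, not how workflows are implemented: `run_until_stable` is a hypothetical function, each round is assumed to cost a fixed number of tokens, and "stable" is assumed to mean two consecutive rounds returning the same answer.

```python
from __future__ import annotations

from typing import Iterable, Optional


def run_until_stable(
    answers: Iterable[str],
    cost_per_round: int,
    token_budget: Optional[int] = None,
) -> tuple[Optional[str], int, str]:
    """Toy convergence loop. Returns (last answer, tokens spent, stop reason)."""
    spent = 0
    previous: Optional[str] = None
    for answer in answers:
        # With a budget, refuse to start a round that would exceed it.
        if token_budget is not None and spent + cost_per_round > token_budget:
            return previous, spent, "budget"
        spent += cost_per_round
        # Without a budget, the only exit is two matching answers in a row.
        if previous is not None and answer == previous:
            return answer, spent, "stable"
        previous = answer
    return previous, spent, "exhausted"


rounds = ["draft A", "draft B", "draft B"]
print(run_until_stable(rounds, cost_per_round=4000))
print(run_until_stable(rounds, cost_per_round=4000, token_budget=10_000))
```

In this toy run the unbudgeted call spends 12,000 tokens before the answers match, while the 10,000-token budget stops it at 8,000 with an answer that was never confirmed. That is the trade an explicit budget makes.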

The community reaction reflects that double edge. These are practitioner reports, not Anthropic figures, so treat them as anecdotes rather than guaranteed numbers. On Hacker News, one user said they "spun up 62 Opus 4.8 sub-agents and hit the 5-hour cap in 18 minutes," and another described roughly 90 agents running to review a fairly small package. A more pointed critic called it "tokenmaxxing disguised as a product," while others reported reliability gripes about runs that "give up constantly." Separately, findskill.ai (citing r/ClaudeAI) reported a Max ($200/mo) user burning about 20% of their weekly token limit on day one, and a Pro user reportedly hitting their cap in roughly ten minutes. The sentiment splits cleanly: powerful for genuinely large work, easy to overspend on small work.

The defense is straightforward: calibrate before you commit. Run ultracode on a scoped task first to see how it decomposes and how many subagents it spawns, then decide whether the budget makes sense for your real workload. Anthropic raised Claude Code's rate limits alongside the Opus 4.8 launch specifically to absorb heavier workflow consumption, so if you've bumped into the old ceiling, the new headroom helps. Our usage optimization guide covers patterns that keep multi-agent runs inside their token envelope, and the higher usage limits guide covers the rate-limit headroom directly.

When to Use Ultracode (and When to Avoid It)

The test is the same one the Dynamic Workflows guide lands on: does this task really need more compute? Most traditional coding tasks do not need a panel of five reviewers. Reserve ultracode for work you'd otherwise hand to several engineers in parallel, where the task naturally decomposes into independent angles that benefit from cross-checking. If it's one person, one afternoon, ultracode is overkill and a single agent at high or xhigh is the right tool.

Use ultracode when:

You're running a codebase-wide audit (security, dead code, performance) where coverage matters more than speed, and the early-quitting failure mode would otherwise leave files unchecked.
You're planning a migration that touches hundreds of files and you want every change verified before it lands. The large codebase playbook covers this class of work in depth.
You're stress-testing a plan from multiple independent angles, where you specifically want agents that did not write the plan to try to refute it.
The work is high-stakes enough that the adversarial verification loop is worth paying for.

Avoid ultracode when:

The task fits a single agent in one pass. There is no failure mode to defeat, so the token cost dwarfs the benefit and the verification loop adds latency you don't need.
You already know the role decomposition. If you can name "Frontend, Backend, Quality" up front, a deliberately structured Agent Teams setup gives you tighter control than letting Claude auto-orchestrate. For a side-by-side of when to reach for each, see Ultracode and Dynamic Workflows vs Agent Teams.
You're doing routine editing. Ultracode applies its behavior to every substantive task, so ordinary changes get the full treatment whether they need it or not.

A few operational facts are worth naming. Workflow subagents always run in acceptEdits mode and inherit your session tool allowlist regardless of your permission mode, so file edits are auto-approved inside a workflow. Parallel agents can step on each other when they edit the same file, which is why file-mutating fan-outs run in isolated worktrees. There's no mid-run human input, so you can't redirect an agent partway through. But a run isn't fragile: you can pause and resume it, and an interrupted run picks up from its last completed stage by run ID rather than restarting, replaying the finished stages from cache instead of re-running them. For unattended runs, pair ultracode with Auto Mode. Worth noting: in Auto permission mode, the per-run workflow approval prompt is skipped entirely when ultracode is on, which keeps the parallelism flowing but removes one of your last manual checkpoints.
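The resume behavior described above is essentially stage-level caching keyed by run ID. Here is a minimal sketch of that idea, assuming an in-memory dictionary as the cache; `run_stages`, the stage names, and the cache layout are all hypothetical and say nothing about how Claude Code stores run state.

```python
from __future__ import annotations

from typing import Callable


def run_stages(
    run_id: str,
    stages: list[tuple[str, Callable[[], str]]],
    cache: dict,
) -> list[str]:
    """Run stages in order, replaying any stage already finished for this run ID."""
    results = []
    for name, work in stages:
        key = (run_id, name)
        if key not in cache:
            # Only stages without a cached result execute; if this raises,
            # earlier results stay in the cache for the next attempt.
            cache[key] = work()
        results.append(cache[key])
    return results


calls = []
state = {"interrupted_once": False}


def understand() -> str:
    calls.append("understand")
    return "notes"


def change() -> str:
    if not state["interrupted_once"]:
        state["interrupted_once"] = True
        raise RuntimeError("interrupted")
    calls.append("change")
    return "patch"


cache = {}
stages = [("understand", understand), ("change", change)]
try:
    run_stages("run-1", stages, cache)
except RuntimeError:
    pass  # first attempt is interrupted during the second stage

print(run_stages("run-1", stages, cache))
print(calls)
```

On the second attempt the "understand" stage is served from the cache rather than executed again, so it appears only once in `calls`.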

The runtime has guardrails. A workflow runs up to 16 concurrent agents (fewer on machines with limited CPU cores) and caps at 1,000 agents total per run as a runaway-loop backstop. The workflow script has no direct filesystem or shell access of its own. Only the spawned agents read, write, and run commands. Those bounds limit the blast radius, but they don't impose a token cap, which is exactly why the cost discipline lands on you. The orchestration patterns underneath all of this map onto the broader thread model in thread-based engineering.
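A little arithmetic shows what those two limits do and do not bound. The sketch below assumes agents run in fixed batches of up to 16, which is a simplification (a real scheduler may start a new agent as soon as a slot frees up), and the per-agent token figure in the usage lines is a made-up number for illustration, not a measurement.

```python
import math

MAX_CONCURRENT = 16        # concurrent agents per workflow, per this article
MAX_AGENTS_PER_RUN = 1000  # total agents per run, per this article


def batches_needed(total_agents: int, concurrency: int = MAX_CONCURRENT) -> int:
    """Number of sequential batches needed, assuming fixed-size batches."""
    if total_agents < 0 or concurrency < 1:
        raise ValueError("total_agents must be >= 0 and concurrency >= 1")
    if total_agents > MAX_AGENTS_PER_RUN:
        raise ValueError(f"a run is capped at {MAX_AGENTS_PER_RUN} agents")
    return math.ceil(total_agents / min(concurrency, MAX_CONCURRENT))


def estimated_tokens(total_agents: int, avg_tokens_per_agent: int) -> int:
    """Rough spend estimate; the per-agent average is a hypothetical input."""
    return total_agents * avg_tokens_per_agent


print(batches_needed(62))             # 4 batches
print(batches_needed(1000))           # 63 batches
print(estimated_tokens(62, 50_000))   # 3100000 under the assumed average
```

The agent cap bounds how many agents can run, but the estimate still scales linearly with whatever each agent spends, which is why the limits contain runaway loops without acting as a token ceiling.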

Frequently Asked Questions

What's the difference between ultracode and ultrathink?

They're completely different mechanisms. ultrathink is a one-turn prompt keyword: drop it in a prompt and Claude reasons harder on that single turn, without changing your session effort or the effort value sent to the API, and without triggering a workflow. Ultracode is a session-wide setting that pins effort to xhigh and auto-orchestrates Dynamic Workflows for every substantive task until you turn it off. One is a per-turn nudge. The other is a session-long mode that changes what Claude does.

Does ultracode persist across sessions?

No. Ultracode applies to the current session only and resets when you start a new one. Anthropic's recommendation is to drop back to /effort high when you return to routine work. The persistent levels are low, medium, high, and xhigh. Both max and ultracode are session-only.

Can my admin disable ultracode for our org?

Yes, indirectly. Ultracode depends on Dynamic Workflows, and an org admin can disable workflows for the whole organization through managed settings or the Claude Code admin settings page. When workflows are disabled, ultracode is removed from the /effort menu, so disabling workflows disables ultracode.

Is there a spending cap on ultracode?

No. There is no preset token cap. A workflow run converges when the answers stabilize, not at a fixed token limit, so cost is open-ended by design. Runs count toward your plan's usage and rate limits. You can set an explicit token budget in the prompt. Beyond that, the practical control is to calibrate on a scoped task first, watch how many subagents spin up, and decide the budget from there.

Does ultracode require Opus 4.8?

It requires a model that supports xhigh effort, which today means Opus 4.8 or Opus 4.7. On models without xhigh (Opus 4.6, Sonnet 4.6), the /effort menu doesn't offer ultracode at all. It shipped alongside Opus 4.8, and the high-stakes work that justifies workflow-level fan-out is exactly what Opus 4.8 was tuned for, so in practice most ultracode sessions run on Opus 4.8.

Can Pro users use ultracode?

Yes. Dynamic Workflows ship as a research preview on all paid plans including Pro. On Pro, you enable workflows from the Dynamic workflows row in /config first. Once workflows are on and you're on an xhigh-capable model, ultracode appears in /effort like it does on any other plan. Pro is not gated out.

The Bottom Line

Ultracode is the most aggressive setting on the /effort ladder, and the only one that changes what Claude does rather than just how hard it thinks. It is the workflow toggle left on for the whole session, which means it structurally defeats the three failure modes (early quitting, self-grading, goal drift) that a single context window hits on hard work. That makes it genuinely powerful for audits, migrations, and plan stress-tests you'd otherwise split across several engineers. It also makes it the easiest setting to overspend on, because the automatic workflow layer applies to every substantive task and convergence has no token cap. The Bun team's Zig-to-Rust port (about 750,000 lines of Rust, 11 days from first commit to merge, 99.8% of the test suite passing) is the headline example of what orchestration at this scale can do. You may see it cited elsewhere as a six-day, million-line effort: that framing comes from Bun's own account (six days of active work, roughly 960,000 lines of Zig translated), while the figures here follow Anthropic's count of the Rust output from first commit to merge. Most teams won't be porting a runtime. They'll be deciding, task by task, whether this particular job is worth several engineers' worth of parallel work.

Treat ultracode as a deliberate mode, not a default. Turn it on for the work that earns it, calibrate the cost on a scoped run, and drop back to /effort high when you're back to routine edits.

If you're building a Claude Code setup that pairs ultracode and Dynamic Workflows with permission rules, hooks, and an agent framework tuned for orchestration handoffs, the ClaudeFast Code Kit ships with those patterns preconfigured.
