Ultracode in Claude Code: What It Actually Does (and When Not To)
Ultracode sends xhigh effort plus auto-triggered Dynamic Workflows. How it differs from xhigh, max, and ultrathink, plus the real token cost.
Ultracode in Claude Code is the one /effort option that changes what Claude does, not just how hard it thinks. Every other effort level dials reasoning depth up or down. Ultracode does that and one more thing: it has Claude orchestrate Dynamic Workflows automatically for substantive tasks, fanning your request out across parallel subagents without you asking. It shipped in Claude Code v2.1.154 on May 28, 2026, alongside Claude Opus 4.8.
That extra behavior is the entire reason to care about it, and the entire reason to be careful with it. This guide covers what ultracode actually is, how it differs from xhigh, max, and ultrathink, how to enable it (and the three ways that don't work), where it's available, and the token-cost reality you need to understand before turning it on for routine work.
What Ultracode Actually Is
Here is the definition straight from Anthropic's model configuration docs, worth quoting in full because the wording matters:
"Ultracode is a Claude Code setting rather than a model effort level: it sends xhigh to the model and additionally has Claude orchestrate dynamic workflows for substantive tasks. It applies to the current session only."
Read that twice. Ultracode is not a deeper rung on the reasoning ladder. It is a setting that does two jobs at once. First, it pins your per-message reasoning to xhigh. Second, it flips on automatic workflow orchestration, so Claude decides on its own when a task is big enough to warrant fanning out across subagents.
That second job is where the real leverage lives. Dynamic Workflows let Claude write an orchestration script that runs subagents in parallel, has them cross-check each other's findings, and iterates until the answers converge. Normally you trigger that by asking. Ultracode makes it the default disposition for the whole session. For the full mechanics of how workflows plan, fan out, and verify, read the Dynamic Workflows guide. This post owns the effort-setting and cost side.
The practical upshot: with ultracode on, a single request can turn into several workflows in a row. As Anthropic's workflows docs put it, "one to understand the code, one to make the change, and one to verify it. This applies to every task in the session, so each request uses more tokens and takes longer than at lower effort levels." That is the trade. You buy depth and verification, and you pay for it in tokens and wall-clock time on every task.
Ultracode vs xhigh vs max vs ultrathink
These four get confused constantly because they all sound like "more." They are four different mechanisms operating on different axes. Here is how they actually differ.
| Setting | What it is | API effort sent | Triggers workflows? | Scope | Persists across sessions? |
|---|---|---|---|---|---|
| xhigh | A model effort level (deep reasoning, high token spend) | xhigh | No | Session-wide setting | Yes |
| max | A model effort level (deepest reasoning, no token cap) | max | No | Session-wide setting | No (session-only) |
| ultrathink | A one-turn prompt keyword for deeper reasoning | Unchanged | No | A single turn | N/A (not a setting) |
| ultracode | A Claude Code setting: xhigh plus auto-orchestration | xhigh | Yes (automatic) | Session-wide setting | No (session-only) |
The ultracode vs ultrathink distinction trips up the most people. ultrathink is a prompt keyword. Drop the word ultrathink anywhere in a single prompt and Claude Code adds an in-context instruction to think harder on that one turn. It does not change your session effort level, it does not change the effort value sent to the API, and it does not trigger a workflow. It is a per-turn nudge.
Ultracode is the opposite shape: a session-wide setting that pins effort to xhigh and auto-orchestrates workflows for every substantive task until you turn it off. One is a single-turn reasoning request. The other is a session-long behavioral mode. They share a five-letter prefix and nothing else.
The xhigh vs ultracode line is cleaner once you see it: ultracode sends xhigh to the model, so the raw reasoning depth is identical. The only behavioral difference is the automatic workflow layer on top. The only other difference is persistence: xhigh carries across sessions, while ultracode is session-only. If you want xhigh reasoning without Claude spinning up parallel subagents on its own, set xhigh directly and leave ultracode off.
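To make the four-way comparison concrete, here is a minimal sketch that encodes the table above as data and asks which columns two options differ on. The class and function names (`EffortOption`, `differences`, `same_reasoning_depth`) are hypothetical and exist only for illustration. Nothing here is part of Claude Code or its API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EffortOption:
    """One row of the comparison table (illustrative model, not a real API)."""
    name: str
    api_effort: Optional[str]   # None means the API effort value is left unchanged
    triggers_workflows: bool
    scope: str                  # "session" or "turn"
    persists: Optional[bool]    # None means "not a setting", so persistence is N/A


# Values copied from the comparison table in this article.
OPTIONS = {
    "xhigh": EffortOption("xhigh", "xhigh", False, "session", True),
    "max": EffortOption("max", "max", False, "session", False),
    "ultrathink": EffortOption("ultrathink", None, False, "turn", None),
    "ultracode": EffortOption("ultracode", "xhigh", True, "session", False),
}


def same_reasoning_depth(a: str, b: str) -> bool:
    """True when both options send the same explicit effort value to the model."""
    effort_a, effort_b = OPTIONS[a].api_effort, OPTIONS[b].api_effort
    return effort_a is not None and effort_a == effort_b


def differences(a: str, b: str) -> list[str]:
    """Names of the table columns on which two options differ."""
    fields = ("api_effort", "triggers_workflows", "scope", "persists")
    return [f for f in fields if getattr(OPTIONS[a], f) != getattr(OPTIONS[b], f)]


if __name__ == "__main__":
    print(same_reasoning_depth("xhigh", "ultracode"))
    print(differences("xhigh", "ultracode"))
    print(differences("ultrathink", "ultracode"))
```

Comparing xhigh with ultracode reports only the workflow and persistence columns, which matches the point above: same effort value, different behavior layered on top. Comparing ultrathink with ultracode reports every column.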
The Claude Code Effort Ladder
Effort levels control adaptive reasoning, which lets the model decide whether and how much to think on each step. Here is the full ladder, with use cases and persistence behavior.
| Level | Use it for | Persists across sessions |
|---|---|---|
| low | Short, scoped, latency-sensitive tasks that aren't intelligence-sensitive | Yes |
| medium | Cost-sensitive work that can trade off some intelligence | Yes |
| high | The balanced default for most coding. Default on Opus 4.8, Opus 4.6, and Sonnet 4.6 | Yes |
| xhigh | Deeper reasoning at higher token spend. Default on Opus 4.7 | Yes |
| max | Deepest reasoning, no token cap. Prone to overthinking, test before adopting | No (session-only) |
| ultracode | xhigh plus automatic Dynamic Workflow orchestration on substantive tasks | No (session-only) |
Two model facts shape which levels you actually see. Opus 4.8 and Opus 4.7 support low, medium, high, xhigh, and max. Opus 4.6 and Sonnet 4.6 support low, medium, high, and max with no xhigh. Because ultracode sends xhigh, it only appears in the /effort menu on xhigh-capable models. On Opus 4.6 or Sonnet 4.6, the menu doesn't offer it.
A common error worth correcting: Opus 4.8 defaults to high, not xhigh. Opus 4.7 is the model that defaults to xhigh. So if you're on Opus 4.8 and you want anything above balanced effort, you're opting in deliberately. Ultracode is the most aggressive opt-in on that list. For more on matching the model to the work, see the Claude Opus 4.8 guide.
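The model facts above reduce to two small lookup tables. This sketch uses them to answer whether a model can send xhigh (the prerequisite for ultracode) and whether a given level is an opt-in above that model's default. The model keys such as `"opus-4.8"` are shorthand labels invented for this example, not real model identifiers, and the helper names are hypothetical.

```python
# Effort ladder in ascending order, as described in this article.
LADDER = ["low", "medium", "high", "xhigh", "max"]

# Shorthand labels (assumed for this sketch), mapped to the levels each model supports.
SUPPORTED_LEVELS = {
    "opus-4.8": ["low", "medium", "high", "xhigh", "max"],
    "opus-4.7": ["low", "medium", "high", "xhigh", "max"],
    "opus-4.6": ["low", "medium", "high", "max"],
    "sonnet-4.6": ["low", "medium", "high", "max"],
}

# Default effort level per model, per the ladder table.
DEFAULT_LEVEL = {
    "opus-4.8": "high",
    "opus-4.7": "xhigh",
    "opus-4.6": "high",
    "sonnet-4.6": "high",
}


def supports_xhigh(model: str) -> bool:
    """Ultracode sends xhigh, so it needs a model that accepts xhigh."""
    return "xhigh" in SUPPORTED_LEVELS[model]


def is_opt_in(model: str, level: str) -> bool:
    """True when `level` sits above the model's default on the ladder."""
    if level not in SUPPORTED_LEVELS[model]:
        raise ValueError(f"{model} does not support effort level {level!r}")
    return LADDER.index(level) > LADDER.index(DEFAULT_LEVEL[model])


if __name__ == "__main__":
    for model in SUPPORTED_LEVELS:
        print(model, "xhigh-capable:", supports_xhigh(model))
```

On these tables, `is_opt_in("opus-4.8", "xhigh")` is true while `is_opt_in("opus-4.7", "xhigh")` is false, which is the correction this section makes.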
How to Enable Ultracode
Three methods work. Three that look like they should work don't. Get this right and you'll save yourself a confusing debugging session.
The three that work:
The simplest is the /effort menu. Run /effort to open the slider, or set it directly:
You can also pass it through settings as a boolean, either via --settings on launch or through an Agent SDK control request:
The three that don't work:
Ultracode is not part of the effortLevel settings field. It is not accepted by the --effort flag. And it is not honored through the CLAUDE_CODE_EFFORT_LEVEL environment variable. Those three channels handle low through xhigh, but ultracode (like max) is session-only and lives outside them. If you put "effortLevel": "ultracode" in your settings file, it silently won't take.
That session-only nature is the other thing to remember. Ultracode lasts for the current session and resets the moment you start a new one. Anthropic's guidance is blunt: "Drop back with /effort high when you return to routine work." It won't follow you between sessions the way high or xhigh will, which is by design. You don't want yesterday's audit-grade orchestration mode quietly applying to today's one-line typo fix.
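Because a bad `effortLevel` value fails silently, a small pre-flight check can catch it before you wonder why nothing changed. The sketch below is a hypothetical lint helper, not a Claude Code feature. It assumes the settings have already been loaded into a Python dict, and the only key it reads is the `effortLevel` field named above.

```python
from typing import Optional

# Levels this article describes as persistent and accepted by the settings field.
PERSISTENT_LEVELS = {"low", "medium", "high", "xhigh"}
# Levels this article describes as session-only and outside that field.
SESSION_ONLY_LEVELS = {"max", "ultracode"}


def check_effort_level(settings: dict) -> Optional[str]:
    """Return a warning if `effortLevel` holds a value that won't take, else None."""
    level = settings.get("effortLevel")
    if level is None or level in PERSISTENT_LEVELS:
        return None
    if level in SESSION_ONLY_LEVELS:
        return (
            f'"effortLevel": "{level}" is session-only and will be ignored here; '
            "set it from the /effort menu inside a session instead."
        )
    return f'"effortLevel": "{level}" is not a recognized effort level.'


if __name__ == "__main__":
    print(check_effort_level({"effortLevel": "ultracode"}))
    print(check_effort_level({"effortLevel": "xhigh"}))
```

Running a check like this against a settings file before launch gives you an explicit message in place of a silent no-op.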
If you want workflow behavior for a single task without committing the whole session to ultracode, there's an escape hatch. Type the word workflow anywhere in a prompt and Claude Code runs that one task as a workflow without changing your effort level. Think of it as ultracode-lite for a single request. If the keyword triggers when you didn't mean it, press alt+w to ignore it for that prompt, or backspace right after the highlighted word. To switch the trigger off entirely, toggle "Workflow keyword trigger" off in /config.
Plan and Model Availability
Dynamic Workflows, which ultracode drives, ship as a research preview on all paid plans, plus the Anthropic API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry. The defaults differ by plan:
| Plan | Workflows default | How to enable |
|---|---|---|
| Pro | Off | Turn on the Dynamic workflows row in /config |
| Max | On | Available out of the box |
| Team | On | Available out of the box |
| Enterprise | Off | An admin enables it first |
Pro is not gated out. That's a meaningful correction to the assumption that workflow-style orchestration is a Max-and-up feature. Pro users enable it from /config and then ultracode becomes available on xhigh-capable models just like anywhere else.
Two requirements gate everything. You need Claude Code v2.1.154 or later, so run claude update if you're behind. And ultracode only shows up when workflows are enabled. If you disable workflows through the /config toggle, through "disableWorkflows": true in settings, or through CLAUDE_CODE_DISABLE_WORKFLOWS=1, ultracode is removed from the /effort menu entirely. No workflows means no ultracode, because the orchestration half of the setting has nowhere to run.
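Put together, the gating rules form a short checklist: minimum version, an xhigh-capable model, and workflows not disabled by any channel. Here is a minimal sketch of that checklist. The function names and the `config_toggle` parameter are hypothetical, and the precedence between the /config toggle and the plan default is an assumption made for this example. The `disableWorkflows` key and the `CLAUDE_CODE_DISABLE_WORKFLOWS` variable come from the text above.

```python
from typing import Optional

MIN_VERSION = (2, 1, 154)

# Plan defaults from the availability table (True means workflows are on by default).
PLAN_WORKFLOWS_DEFAULT = {"pro": False, "max": True, "team": True, "enterprise": False}


def parse_version(version: str) -> tuple:
    """Turn '2.1.154' or 'v2.1.154' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def workflows_enabled(plan: str, settings: dict, env: dict,
                      config_toggle: Optional[bool] = None) -> bool:
    """Apply the disable channels first, then the toggle, then the plan default."""
    if settings.get("disableWorkflows") is True:
        return False
    if env.get("CLAUDE_CODE_DISABLE_WORKFLOWS") == "1":
        return False
    if config_toggle is not None:  # assumed: an explicit toggle beats the plan default
        return config_toggle
    return PLAN_WORKFLOWS_DEFAULT[plan]


def ultracode_in_menu(version: str, model_supports_xhigh: bool, plan: str,
                      settings: dict, env: dict,
                      config_toggle: Optional[bool] = None) -> bool:
    """True when every gate described above is satisfied."""
    return (
        parse_version(version) >= MIN_VERSION
        and model_supports_xhigh
        and workflows_enabled(plan, settings, env, config_toggle)
    )


if __name__ == "__main__":
    print(ultracode_in_menu("2.1.154", True, "pro", {}, {}))
    print(ultracode_in_menu("2.1.154", True, "pro", {}, {}, config_toggle=True))
```

The Pro case is the instructive one: the first call returns false because workflows default to off, and the second returns true once the toggle is on.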
The Token-Cost Reality Check
This is the section that matters most, so let's be precise and balanced about it.
The official cost language is direct. A workflow spawns many agents, so "a single run can use meaningfully more tokens than working through the same task in conversation. Runs count toward your plan's usage and rate limits." With ultracode on, that multiplier applies to every substantive task in the session, not just one run you chose to fire.
Here is the part people miss: there is no preset spending cap. Convergence is open-ended. A workflow stops when the answers stabilize, not when it hits a fixed token ceiling. That's a feature for hard problems where you want the run to keep refuting and re-checking until it's confident. It's a liability on small work, because there's no built-in ceiling to stop a run that decides a trivial task deserves ninety agents.
The community reaction reflects that double edge. These are practitioner reports, not Anthropic figures, so treat them as anecdotes rather than guaranteed numbers. On Hacker News, one user said they "spun up 62 Opus 4.8 sub-agents and hit the 5-hour cap in 18 minutes," and another described roughly 90 agents running to review a fairly small package. A more pointed critic called it "tokenmaxxing disguised as a product," while others reported reliability gripes about runs that "give up constantly." Separately, findskill.ai (citing r/ClaudeAI) reported a Max ($200/mo) user burning about 20% of their weekly token limit on day one, and a Pro user reportedly hitting their cap in roughly ten minutes. The sentiment splits cleanly: powerful for genuinely large work, easy to overspend on small work.
The defense is straightforward: calibrate before you commit. Run ultracode on a scoped task first to see how it decomposes and how many subagents it spawns, then decide whether the budget makes sense for your real workload. Anthropic raised Claude Code's rate limits alongside the Opus 4.8 launch specifically to absorb heavier workflow consumption, so if you've bumped the old ceiling, the new headroom helps. Our usage optimization guide covers patterns that keep multi-agent runs inside their token envelope, and the higher usage limits guide covers the rate-limit headroom directly.
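Calibration comes down to simple arithmetic once you have numbers from a scoped run. The sketch below projects session spend from an observed agent count and an average token figure per agent. Every number in it is a made-up placeholder, not an Anthropic figure, and the linear model is an assumption. The default of three workflows per task follows the "understand, change, verify" pattern quoted earlier. Real runs can use more or fewer.

```python
def project_session_tokens(agents_per_workflow: int,
                           tokens_per_agent: int,
                           workflows_per_task: int = 3,
                           tasks: int = 1) -> int:
    """Rough linear projection: agents x tokens x workflows x tasks."""
    return agents_per_workflow * tokens_per_agent * workflows_per_task * tasks


def fraction_of_budget(projected_tokens: int, budget_tokens: int) -> float:
    """Share of a token budget the projection would consume."""
    if budget_tokens <= 0:
        raise ValueError("budget_tokens must be positive")
    return projected_tokens / budget_tokens


if __name__ == "__main__":
    # Placeholder observations from a hypothetical scoped run.
    projected = project_session_tokens(
        agents_per_workflow=12,
        tokens_per_agent=40_000,
        tasks=5,
    )
    # Placeholder budget; substitute whatever limit you are planning against.
    share = fraction_of_budget(projected, budget_tokens=10_000_000)
    print(f"projected tokens: {projected:,}")
    print(f"share of budget: {share:.0%}")
```

With these placeholder inputs the projection is 7,200,000 tokens, or 72% of the assumed budget, from five tasks. The figure itself doesn't matter. The multiplication does: because ultracode applies to every substantive task, the `tasks` factor is always in play.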
When to Use Ultracode (and When to Avoid It)
The rule of thumb is simple. Reserve ultracode for work you'd otherwise hand to several engineers in parallel. If the task naturally decomposes into independent angles that benefit from cross-checking, ultracode earns its tokens. If it's one person, one afternoon, ultracode is overkill.
Use ultracode when:
- You're running a codebase-wide audit (security, dead code, performance) where coverage matters more than speed.
- You're planning a migration that touches hundreds of files and you want every change verified before it lands. The large codebase playbook covers this class of work in depth.
- You're stress-testing a plan from multiple independent angles before you commit to one.
- The work is high-stakes enough that the adversarial verification loop is worth paying for.
Avoid ultracode when:
- The task fits a single agent in one pass. The token cost dwarfs the benefit, and the verification loop adds latency you don't need.
- You already know the role decomposition. If you can name "Frontend, Backend, Quality" up front, a deliberately structured Agent Teams setup gives you tighter control than letting Claude auto-orchestrate.
- You're doing routine editing. Ultracode applies its behavior to every substantive task, so ordinary changes get the full treatment whether they need it or not.
A few failure modes are worth naming. Workflow subagents always run in acceptEdits mode and inherit your session tool allowlist regardless of your permission mode, so file edits are auto-approved inside a workflow. Parallel agents can step on each other when they edit the same file. There's no mid-run human input, so you can't course-correct partway through. Only agent permission prompts can pause a run. And if you quit Claude Code mid-run, the next session starts the workflow fresh rather than resuming it. For unattended runs, pair ultracode with Auto Mode. Worth noting: in Auto permission mode, the per-run workflow approval prompt is skipped entirely when ultracode is on, which keeps the parallelism flowing but removes one of your last manual checkpoints.
The runtime has guardrails. A workflow runs up to 16 concurrent agents (fewer on machines with limited CPU cores) and caps at 1,000 agents total per run as a runaway-loop backstop. The workflow script has no direct filesystem or shell access of its own. Only the spawned agents read, write, and run commands. Those bounds limit the blast radius, but they don't impose a token cap, which is exactly why the cost discipline lands on you.
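A quick way to see what those two limits do and don't bound: the sketch below computes the minimum number of sequential rounds needed to run a given number of agents under the concurrency limit, and rejects anything over the per-run cap. The lockstep "rounds" model is an assumption made for this example. The article's sources don't describe the actual scheduler, so treat the result as a lower bound, not a description of how the runtime behaves.

```python
import math

MAX_CONCURRENT_AGENTS = 16   # upper bound; fewer on machines with limited CPU cores
MAX_AGENTS_PER_RUN = 1_000   # runaway-loop backstop


def min_rounds(total_agents: int, concurrency: int = MAX_CONCURRENT_AGENTS) -> int:
    """Fewest sequential rounds needed to run `total_agents` at this concurrency."""
    if total_agents < 0:
        raise ValueError("total_agents must be non-negative")
    if not 1 <= concurrency <= MAX_CONCURRENT_AGENTS:
        raise ValueError("concurrency must be between 1 and 16")
    if total_agents > MAX_AGENTS_PER_RUN:
        raise ValueError("run would exceed the 1,000-agent cap")
    return math.ceil(total_agents / concurrency)


if __name__ == "__main__":
    for agents in (16, 62, 90, 1_000):
        print(agents, "agents ->", min_rounds(agents), "rounds at full concurrency")
```

A 62-agent run fits in 4 rounds and a 90-agent run in 6, both far below the 1,000-agent backstop. Neither limit says anything about how many tokens each agent spends, which is why the guardrails cap blast radius and not cost.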
Frequently Asked Questions
What's the difference between ultracode and ultrathink?
They're completely different mechanisms. ultrathink is a one-turn prompt keyword: drop it in a prompt and Claude reasons harder on that single turn, without changing your session effort or the effort value sent to the API, and without triggering a workflow. Ultracode is a session-wide setting that pins effort to xhigh and auto-orchestrates Dynamic Workflows for every substantive task until you turn it off. One is a per-turn nudge. The other is a session-long mode.
Does ultracode persist across sessions?
No. Ultracode applies to the current session only and resets when you start a new one. Anthropic's recommendation is to drop back to /effort high when you return to routine work. The persistent levels are low, medium, high, and xhigh. Both max and ultracode are session-only.
Can my admin disable ultracode for our org?
Yes, indirectly. Ultracode depends on Dynamic Workflows, and an org admin can disable workflows for the whole organization through managed settings or the Claude Code admin settings page. When workflows are disabled, ultracode is removed from the /effort menu, so disabling workflows disables ultracode.
Is there a spending cap on ultracode?
No. There is no preset token cap. A workflow run converges when the answers stabilize, not at a fixed token limit, so cost is open-ended by design. Runs count toward your plan's usage and rate limits. The practical control is to calibrate on a scoped task first, watch how many subagents spin up, and decide the budget from there.
Does ultracode require Opus 4.8?
It requires a model that supports xhigh effort, which today means Opus 4.8 or Opus 4.7. On models without xhigh (Opus 4.6, Sonnet 4.6), the /effort menu doesn't offer ultracode at all. It shipped alongside Opus 4.8, and the high-stakes work that justifies workflow-level fan-out is exactly what Opus 4.8 was tuned for, so in practice most ultracode sessions run on Opus 4.8.
Can Pro users use ultracode?
Yes. Dynamic Workflows ship as a research preview on all paid plans including Pro. On Pro, you enable workflows from the Dynamic workflows row in /config first. Once workflows are on and you're on an xhigh-capable model, ultracode appears in /effort like it does on any other plan. Pro is not gated out.
The Bottom Line
Ultracode is the most aggressive setting on the /effort ladder, and the only one that changes what Claude does rather than just how hard it thinks. That makes it genuinely powerful for the work it was built for: audits, migrations, and plan stress-tests you'd otherwise split across several engineers. It also makes it the easiest setting to overspend on, because the automatic workflow layer applies to every substantive task and convergence has no token cap. The Bun team's Zig-to-Rust port (about 750,000 lines of Rust, 11 days from first commit to merge, 99.8% of the test suite passing) is the headline example of what orchestration at this scale can do. Most teams won't be porting a runtime. They'll be deciding, task by task, whether this particular job is worth several engineers' worth of parallel work.
Treat ultracode as a deliberate mode, not a default. Turn it on for the work that earns it, calibrate the cost on a scoped run, and drop back to /effort high when you're back to routine edits.
If you're building a Claude Code setup that pairs ultracode and Dynamic Workflows with permission rules, hooks, and an agent framework tuned for orchestration handoffs, the ClaudeFast Code Kit ships with those patterns preconfigured.
