Claude Code Limits Doubled: 5x More With Smart Routing

Anthropic doubled Claude Code's 5-hour windows and removed peak throttling. Weekly caps unchanged. Here's how to absorb the new throughput.

Anthropic announced this morning that Claude Code's 5-hour rate limits are doubling for every Pro, Max, Team, and seat-based Enterprise account. Peak-hour throttling on Pro and Max is gone. Opus API rate limits went up. All effective today.

That's the headline. The amount of work you can push through Claude Code in a single 5-hour window just doubled.

The part most people will misread: the weekly caps did not change. Today's announcement only touches the 5-hour rate-limit window and the peak-hour throttle. Your weekly bucket is the same size it was yesterday. What's different is the size of the spigot draining out of that bucket during your actual work hours.

That distinction matters, because it changes what "doubled limits" actually means in practice. It isn't more total capacity. It's more bandwidth to absorb the capacity you already had. Most users were leaving weekly headroom on the table because the per-window throttle and peak-hour clip stopped them from reaching it during the hours they were actually at the keyboard. Now they can.

Stack that with smart model routing -- Opus on the bookends, Sonnet in the middle, the same pipeline we ship as the default in ClaudeFast v5.3 -- and your usable throughput inside a working day goes from 1x to closer to 5x without changing a single prompt.

This post covers what actually changed, what didn't, why bandwidth matters more than ceiling for most users, and the routing and efficiency patterns that turn the new headroom into shipped work.

What Anthropic Actually Shipped Today

The announcement post is short. Three concrete changes, all effective today:

Claude Code's 5-hour rate limits are doubled. Applies to Pro, Max, Team, and seat-based Enterprise. Whatever your old per-window cap was on Sonnet and Opus inside Claude Code, double it.
Peak-hour throttling on Claude Code is removed for Pro and Max. Anthropic had been quietly clipping limits during high-traffic windows. That clip is gone for the two tiers most likely to feel it.
Opus API rate limits are raised considerably. Higher per-minute and per-day ceilings for anyone hitting Opus through the API. The published table is in their post.

The why behind the change is short too. Anthropic is bringing on more compute. The headline partnership is a SpaceX/xAI deal that adds Colossus 1 -- 300 megawatts and 220,000+ NVIDIA GPUs -- to Claude infrastructure inside the next month. That sits on top of existing buildouts with AWS Trainium, Google TPUs, and NVIDIA GPU partners. More capacity, looser limits.

For context on how Pro and Max compare structurally, and what counts toward the 5-hour window in the first place, see our Claude Code subscription guide and the usage optimization guide. The new limits don't change the OAuth versus API-key boundary. They only change the size of the spigot.

What's Doubled, and What Isn't

Read the announcement carefully and the wording is precise. Anthropic doubled the five-hour rate limits. They removed the peak-hour limit reduction. They did not say anything about weekly caps.

Here's the practical decomposition:

Limit type	Before today	After today
Per 5-hour window	Baseline cap, throttled during peak	2x baseline, no peak throttle (Pro/Max)
Weekly cap	Standing cap on weekly Sonnet + Opus consumption	Same standing cap
Monthly cap	Implicit (4x weekly)	Same
API Opus rate limit	Lower per-minute and per-day	Considerably higher

The 5-hour window resets every five hours. The weekly cap is the absolute ceiling on what you can spend across a rolling seven-day period. Today's change widens the per-window pipe; it does not widen the weekly tank.

That sounds like a smaller win, and it is. But the working assumption that most users were maxing their weekly cap was already wrong. Most weren't. They were being clipped inside individual work sessions by the per-window throttle and the peak-hour cut, and walking away from a session with weekly headroom they couldn't actually consume during the hours they were at the keyboard.

What this announcement does, in effect, is unblock your weekly cap during the hours you actually want to use it.

The Real Bottleneck Just Moved

If you've been bumping into the 5-hour ceiling, doubling it feels like the entire story. It isn't.

The real bottleneck for most Claude Code users isn't the size of any one bucket. It's the rate at which you spend, multiplied by where you're spending. Run Opus on every step of every task and you'll burn through the doubled per-window cap as fast as you used to burn through the old one, then crash into the unchanged weekly cap days earlier in the week. The ceilings move; your spend pattern moves to match; you hit the same wall in a different shape.

Here's what you actually want to do with the extra room: spend it on the steps that change the outcome of the run, and stop spending it on the steps that don't.

That distinction is the entire game. It's also the thing v5.3 was rebuilt around.

The Opus-Sonnet-Opus Pipeline

In the standard /team-plan to /build flow inside the Code Kit, there are three stages. They look identical from the outside -- a central orchestrator dispatches specialist sub-agents -- but they have very different cost-of-being-wrong profiles.

Stage 1: Planning. Run on Opus. Master Orchestrator decomposes the request, picks specialists, writes the plan file. Deeper reasoning at this step changes the structure of every downstream task. This is the one place where pinching tokens is genuinely expensive.

Stage 2: Execution. Run on Sonnet. The specialist agents implement against the plan. This is where roughly 70-80% of the token spend in a given session lives. It's also where Sonnet tracks Opus most closely on real coding work. The runs we benchmarked came back ~99.5% identical in output quality.

Stage 3: Review and validation. Run on Opus. frontend-specialist and debugger-detective stay on Opus. These are the agents catching subtle correctness issues before the work ships. Same logic as Stage 1: the cost of a missed defect is much higher than the cost of the deeper model.

Bookend on Opus, middle on Sonnet. Sonnet pulls roughly five times less from your usage limits than Opus does for equivalent work. Move the heaviest stage off Opus and the math is straightforward -- you've cut the usage of that stage by ~5x without changing the deliverable.

If you want the model-by-model picture behind that 5x figure, our model selection guide walks through the trade-offs across Opus 4.7, Sonnet 4.6, and Haiku 4.5.

Efficiency Patterns That Compound With the New Throughput

The doubled 5-hour window is wasted on a session that burns tokens on bad context, redundant reads, or runaway loops. The patterns that turn raw throughput into shipped work are the same ones that mattered before today; they just matter more now that you actually have the bandwidth to feel them.

The four patterns that compound hardest with the new caps:

Context discipline. Long, sloppy sessions with stale CLAUDE.md and bloated tool output spend tokens on noise. The context management guide covers compaction strategy and /clear discipline. The context engineering walkthrough goes deeper on what to load and what to leave out. For longer-running sessions, the context buffer management post covers the buffer hygiene that keeps quality from collapsing past the 70% mark.
Plan before you build. Plan mode (Shift+Tab twice) is the cheapest leverage point in Claude Code. Cheap because planning runs on Opus and produces a structure that Sonnet can execute against. Expensive when you skip it. The planning modes guide covers when to use it.
Speed and efficiency patterns. Our efficiency patterns post covers the workflow-level moves that cut wasted iterations -- batching, fewer reads per turn, tighter tool calls. The speed optimization guide covers what to do when latency is the actual problem. Fast mode is the right default when you're doing high-volume Sonnet execution work.
Reach for deep thinking only where it pays. Deep thinking techniques are expensive on the bucket. They're worth it on architectural decisions and ambiguous debugging; they're waste on routine implementation. Same logic as the Opus-Sonnet-Opus split, applied per-prompt.

None of these are new. They were already best practice. What's new is that the per-window cap no longer hides the cost of skipping them.

How to Apply This Without Rebuilding Your Workflow

You don't need our framework to use any of this. The principle is portable.

Default your shell to Sonnet. /model sonnet at the start of every session. This is the single highest-leverage habit for stretching the new doubled cap.

Reach for Opus only at the planning and review boundaries. Architectural decisions, ambiguous requirements, post-implementation review of anything touching auth, payments, or migrations. Use opusplan if you want planning on Opus and execution on Sonnet automatically.

If you're using the Code Kit, you don't have to think about this. v5.3 routes 13 of 18 specialists to Sonnet by default while keeping master-orchestrator, frontend-specialist, debugger-detective, content-writer, and growth-engineer on Opus. The routing happens in agent frontmatter. Override per task when you want deeper reasoning -- the CLAUDE.md guidance was rewritten this release to encourage that judgment call instead of mandating Opus blanket-style.

Watch your absorption rate, not just your ceiling. With the per-window cap doubled and peak throttling gone, the new constraint for heavy users is the weekly cap, not the per-session one. Track how much of your weekly budget you're actually using. If you're running well under 80% per week, the new throughput is pure upside -- absorb it. If you're already brushing 100%, the routing patterns above are how you stay productive without paying for an upgrade.

What This Compounds To

Take a Max 20x account that used to cap around 900 messages per 5-hour window. The doubled per-window limit is now ~1,800. Run that on Opus end-to-end and you'll empty your weekly bucket faster than ever, because the weekly cap didn't move.

Run the same volume through the Opus-Sonnet-Opus pipeline and you're spending Opus tokens on roughly 30% of the calls and Sonnet tokens on the other 70%. Sonnet pulls ~5x less per equivalent unit. The blended draw on your weekly bucket drops to a fraction of the all-Opus baseline. Net effect: you can ship 4-5x more work inside the same weekly cap, with the new doubled per-window throughput letting you actually pull that work into the hours you're at the keyboard.

The compute on the other end of this -- Colossus 1, AWS Trainium, the Google TPU build -- is finite. The new throughput will only feel like throughput if you spend it on the work that benefits from compounded capacity. Routing and context discipline are what turn a bandwidth bump into actual shipped output.

Make the New Ceiling Count

Today's change is the largest single increase to Claude Code's per-window limits Anthropic has shipped. It also won't matter much for developers who route every task through Opus and run out of weekly room twice as fast as before. The pattern that does matter is the one underneath -- which model handles which step, what stays in context, and what gets compacted out.

Opus on the bookends. Sonnet in the middle. Tight context, plan first, fast mode for the volume work.

If you want this baked into your setup without configuring it yourself, the Code Kit ships with the routing wired into 18 specialist agents and the Agent Teams workflow on top. If you'd rather wire it manually, model selection, usage optimization, and efficiency patterns are the three posts to read next.

Either way, today is a good day to stop running Opus by default.

Claude Code Limits Doubled: How to Multiply Yours Further