Claude Code Quotas Vanish 20x Overnight, Official Response: “Use It Sparingly”

12,576 0 7

This means that when you resume a previous session, Claude Code does not reuse the already processed context, but instead processes the entire content from scratch each time, consuming 10 to 20 times the normal quota. You think you are continuing a conversation, but in reality, you are starting a brand new, full-price conversation each time.

Claude Code Quotas Vanish 20x Overnight, Official Response:

This figure comes from independent developer ArkNill’s proxy monitoring tests. By setting up a transparent proxy, he recorded every request between Claude Code and the Anthropic API, discovering at least two client-side cache bugs that prevented the API server from matching cached conversation prefixes, forcing a complete token rebuild every round.

Claude Code Quotas Vanish 20x Overnight, Official Response:

The chart above shows a comparison of cache read rates across three phases. During the period from v2.1.69 to v2.1.89 (i.e., the bug period), the cache read rate for the standalone version was only 4-17%. After v2.1.90 fixed one of the critical bugs, the cold-start cache read rate returned to 47-99.7%. By v2.1.91, the stable operational cache read rate recovered to 97-99%.

It’s worth noting a detail in the chart: the range for v2.1.90 is very wide (47% to 99.7%). This is because sessions still need to “warm up” the cache upon resumption, with low hit rates for the first few rounds, but quickly returning to normal levels. In the buggy version, this warm-up never happens—the cache read permanently stalls at the 14,500 tokens of the system prompt, and all conversation history is billed at full price every time.

28 Days, 20 Versions

This bug wasn’t the type introduced in one update and fixed in the next. According to npm registry release records, the bug-introducing v2.1.69 was released on March 4th, and the bug-fixing v2.1.90 was released on April 1st. There was a 28-day gap spanning 20 versions.

Claude Code Quotas Vanish 20x Overnight, Official Response:

The timeline reveals an intriguing detail. After the bug was introduced on March 4th, users did not immediately complain on a large scale. Complaints only surged around March 23rd, nearly three weeks later. The reason, as outlined in GitHub issue #41930, is that Anthropic ran a 2x quota promotion (doubling off-peak hours) from March 13th to 28th, which objectively masked the bug’s impact. After the promotion ended, the consumption caused by the cache bug returned to the normal billing baseline, and users’ quotas instantly “evaporated.”

Anthropic’s response was not swift. On March 26th, three days after user complaints erupted, engineer Thariq Shihipar announced on his personal X account that peak-hour (weekdays 5am-11am PT) rate limits had been tightened. On March 30th, Anthropic acknowledged on Reddit that “users were hitting their limits far faster than expected,” stating it was the team’s top priority. It wasn’t until April 1st that team member Lydia Hallie published the formal investigation conclusion.

Throughout this process, Anthropic did not publish any blog posts, send email notifications, or update its status page. All official communication was conducted solely through engineers’ personal social media posts and a few Reddit comments.

How Much Did You Pay, How Long Could You Use It?

GitHub issue #41930 gathered hundreds of user reports. The most extreme case was a Max 20x subscriber ($200/month) whose 5-hour rolling window was completely exhausted in 19 minutes. Max 5x users ($100/month) reported their 5-hour windows being used up within 90 minutes. According to The Letter Two, some users claimed a simple “hello” consumed 13% of their session quota. A Pro user ($20/month) said on Discord that their quota was “used up by Monday, reset on Saturday,” leaving only 12 usable days out of 30.

Claude Code Quotas Vanish 20x Overnight, Official Response:

Based on ArkNill’s benchmark tests, on the buggy version v2.1.89, the 100% quota for the Max 20x plan would be exhausted in about 70 minutes. He also calculated the quota cost of a single –resume operation on a 500K token context session to be approximately $0.15, as the system would fully replay the entire context.

“You’re Holding It Wrong”

Lydia Hallie’s investigation conclusion confirmed two points: first, peak-hour rate limits had indeed been tightened, and second, sessions with 1 million token contexts consumed more. She stated the team had fixed some bugs but emphasized that “none of these bugs resulted in overcharging.”

She then offered four quota-saving suggestions:

1. Use Sonnet 4.6 instead of Opus (Opus consumes roughly twice as fast);

2. Reduce reasoning strength or turn off extended thinking when deep reasoning isn’t needed;

3. Don’t resume long sessions idle for over an hour, start a new one instead;

4. Set the environment variable CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000 to limit context window size.

There was no mention of any form of quota reset or compensation.

AI podcast host Alex Volkov summarized this response as “You’re holding it wrong,” pointing out that Anthropic itself set the 1 million token context as the default, promoted Opus as the flagship model, and marketed extended thinking as a selling point, but is now advising paying users not to use these features.

The claim of “no overcharging” also seems at odds with Claude Code’s own update logs. Just one day before Lydia’s response, v2.1.90 fixed a cache regression bug present since v2.1.69: when using –resume to restore a session, requests that should have hit the cache triggered a full prompt cache miss, billed at full price. Lydia’s response did not mention this confirmed billing anomaly.

Claude Code Quotas Vanish 20x Overnight, Official Response:

As a comparison, OpenAI’s Codex previously experienced a similar issue of abnormal quota consumption. OpenAI’s approach was to reset user quotas, issue credit refunds, and announce in March the removal of Codex usage limits. Anthropic’s approach is to advise users to downgrade models, turn off features, limit context, and attribute responsibility to user behavior.

Anthropic sells subscriptions for the “strongest model + largest context + highest reasoning capability” and charges $20 to $200 per month. A cache bug spanning 28 days caused paying users’ quotas to evaporate at 10-20 times the normal rate, and the official response is to tell you to use it more sparingly.

この記事はインターネットから得たものです。 Claude Code Quotas Vanish 20x Overnight, Official Response: “Use It Sparingly”

Related: Matrixport Research: Why Is the Altcoin Bull 市場 Absent? Supply Pressure and トークン Unlocks Emerge as Key Variables

Sustained Supply Pressure: Altcoin Upside Momentum Remains Constrained Since October 2025, although Bitcoin has experienced periodic pullbacks, its market dominance has not declined significantly and has recently trended higher again. This market movement appears more like a Bitcoin-led rally, with altcoins following intermittently. Meanwhile, an increasing number of companies adopting treasury allocation strategies continue to accumulate Bitcoin, maintaining its structural demand. In contrast, altcoins face more pronounced supply pressure. Continuous selling by early investors, coupled with new circulating supply from token unlocks, frequently caps price rebounds with selling pressure. Since August 2024, approximately $99 billion worth of tokens have been unlocked and entered circulation, creating a persistent supply shock for the market. As retail enthusiasm cools, the impact of supply-side factors has gradually become a key variable dominating price trends.…