I was banned from Claude for scaffolding a Claude.md file?

>>hugoda+(OP)
I've been doing something a lot like this, using a claude-desktop instance attached to my personal mcp server to spawn claude-code worker nodes for things, and for a month or two now it's been working great using the main desktop chat as a project manager of sorts. I even started paying for MAX plan as I've been using it effectively to write software now (I am NOT a developer).

Lately it's gotten entirely flaky, where chat's will just stop working, simply ignoring new prompots, and otherwise go unresponsive. I wondered if maybe I'm pissing them off somehow like the author of this article did.

Now even worse is Claude seemingly has no real support channel. You get their AI bot, and that's about it. Eventually it will offer to put you through to a human, and then tell you that don't wait for them, they'll contact you via email. That email never comes after several attempts.

I'm assuming at this point any real support is all smoke and mirrors, meaning I'm paying for a service now that has become almost unusable, with absolutely NO means of support to fix it. I guess for all the cool tech, customer support is something they have not figured out.

I love Claude as it's an amazing tool, but when it starts to implode on itself that you actually require some out-of-box support, there is NONE to be had. Grok seems the only real alternative, and over my dead body would I use anything from "him".

>>bastar+Dj
Anthropic has been flying by the seat of their pants for a while now and it shows across the board. From the terminal flashing bug that’s been around for months to the lack of support to instabilities in Claude mobile and Code for the web (I get 10-20% message failure rates on the former and 5-10% on CC for web).

They’re growing too fast and it’s bursting the seams of the company. If there’s ever a correction in the AI industry, I think that will all quickly come back to bite them. It’s like Claude Code is vibe-operating the entire company.

>>throwu+Ok
The Pro plan quota seems to be getting worse. I can get maybe 20-30 minutes work done before I hit my 4 hour quota. I found myself using it more just for the planning phase to get a little bit more time out of it, but yesterday I managed to ask it ONE question in plan mode (from a fresh quota window), and while it was thinking it ran out of quota. I'm assuming it probably pulled in a ton of references from my project automatically and blew out the token count. I find I get good answers from it when it does work, but it's getting very annoying to use.

(on the flip side, Codex seems like it's being SO efficient with the tokens it can be hard to understand its answers sometimes, it rarely includes files without you doing it manually, and often takes quite a few attempts to get the right answer because it's so strict what it's doing each iteration. But I never run out of quota!)

>>laserD+yL
^ THIS

I've run out of quota on my Pro plan so many times in the past 2-3 weeks. This seems to be a recent occurrence. And I'm not even that active. Just one project, execute in Plan > Develop > Test mode, just one terminal. That's it. I keep getting a quota reset every few hours.

What's happening @Anthropic ?? Anybody here who can answer??

>>aanet+SU
sounds like the "thinking tokens" are a mechanism to extract more money from users?

>>genewi+qV
Anecdotally but it definitely feels like in the last couple weeks CC tends to be more aggressive at pulling in significantly larger chunks of an existing code base - even for some simple queries I'll see it easily ramp up to 50-60k token usage.

>>vunder+8Z
I'm curious if anyone has logged the number of thinking tokens over time. My implication was the "thinking/reasoning" modes are a way for LLM providers to put their thumb on the scale for how much the service costs.

they get to see (if not opted-out) your context, idea, source code, etc. and in return you give them $220 and they give you back "out of tokens"

>>genewi+U01
> My implication was the "thinking/reasoning" modes are a way for LLM providers to put their thumb on the scale for how much the service costs.

I've done RL training on small local models, and there's a strong correlation between length of response and accuracy. The more they churn tokens, the better the end result gets.

I actually think that the hyper-scalers would prefer to serve shorter answers. A token generated at 1k ctx length is cheaper to serve than one at 10k context, and way way cheaper than one at 100k context.

zlacker