Lately it's gotten entirely flaky: chats will just stop working, simply ignoring new prompts or otherwise going unresponsive. I wondered if maybe I'm pissing them off somehow, like the author of this article did.
Even worse, Claude seemingly has no real support channel. You get their AI bot, and that's about it. Eventually it will offer to put you through to a human, then tell you not to wait: they'll contact you via email. After several attempts, that email has never come.
I'm assuming at this point that any real support is all smoke and mirrors, meaning I'm paying for a service that has become almost unusable, with absolutely NO means of support to fix it. I guess for all the cool tech, customer support is something they have not figured out.
I love Claude; it's an amazing tool. But when it implodes on itself to the point that you actually require some out-of-the-box support, there is NONE to be had. Grok seems the only real alternative, and over my dead body would I use anything from "him".
They're growing too fast and the company is bursting at the seams. If there's ever a correction in the AI industry, I think that will all quickly come back to bite them. It's like Claude Code is vibe-operating the entire company.
I had this start happening around August/September and by December or so I chose to cancel my subscription.
I haven't noticed this at work so I'm not sure if they're prioritizing certain seats or how that works.
Max plan, and on average I use it ten times a day? Yeah, I'm canceling. Guess they don't need me.
What have you found it useful for? I'm curious about how people without software backgrounds work with it to build software.
And kudos for refusing to use anything from the guy who's OK with his platform proliferating generated CSAM.
Lately I've been using it to build out my MCP server, which I now call endpoint-mcp-server (coming soon to a GitHub near you). I've modularized it with plugins, added lots more features, and built a more versatile Qt6 GUI with advanced workspace panels and widgets.
At least I was until Claude started crapping the bed lately.
So yeah, Codex kinda sucks for me. Maybe I'll try Mistral.
Isn’t the future of support a series of automations and LLMs? I mean, have you considered that the AI bot is their tech support, and that it’s about to be everyone else’s approach too?
Claude iOS app, Claude on the web (including Claude Code on the web) and Claude Code are some of the buggiest tools I have ever had to use on a daily basis. I’m including monstrosities like Altium and Solidworks and Vivado in the mix - software that actually does real shit constrained by the laws of physics rather than slinging basic JSON and strings around over HTTP.
It's an utter embarrassment to the field of software engineering that they can't even hit a single nine of reliability in their consumer-facing products, and if it weren't for the advantage Opus has over other models, they'd be dead in the water.
This lets me apply my IT and business experience toward making bespoke code for my own uses so far, such as firewall config parsers specialized for wacky vendor CLIs, and filling gaps in automation where there's no good vendor solution for a given task. I started building my MCP server to let agents interact with the outside world - invoking automation for firewalls, switches, routers, servers, ideally even home automation - and I've been successful so far, still without having to know any code.
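For a sense of what that looks like under the hood, here's a stripped-down sketch of a single tool using the official mcp Python SDK's FastMCP; the playbook-runner tool is a made-up example, not my actual server:

    # Minimal MCP server sketch (official `mcp` Python SDK / FastMCP).
    # The tool below is illustrative only - a wrapper that lets an agent
    # kick off an Ansible playbook and read the result.
    import subprocess
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("endpoint-mcp-server")

    @mcp.tool()
    def run_playbook(playbook: str) -> str:
        """Run an Ansible playbook and return its output for the agent."""
        result = subprocess.run(["ansible-playbook", playbook],
                                capture_output=True, text=True)
        return result.stdout + result.stderr

    if __name__ == "__main__":
        mcp.run()  # serve the tool over stdio so the agent can call it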
I'm sure a real dev will find it to be a giant pile of crap in the end, but I've been doing things like applying security frameworks and code style guidelines (using ruff) to keep it from going too wonky, and actually working it up to a state where I can call it a 1.0. I plan to run a full audit cycle against it - security audits, performance testing, whatever else I can - to avoid it being entirely craptastic. If nothing else, it works for me, so others can take it or leave it once I put it out there.
Even NOT being a developer, I understand the need for best practices, and after years of watching a lot of really terrible developers adjacent to me make a living, I think I can offer a thing or two toward avoiding that.
I enjoy programming, but it isn't my main interest and I can't justify the time required to get competent, so I let Claude and ChatGPT pick up my slack.
https://github.com/anthropics/claude-code/issues
Codex has fewer, but they also had quite a few outages in December. And I don't think Codex is as popular as Claude Code, though that could change.
(On the flip side, Codex seems to be SO efficient with tokens that it can be hard to understand its answers sometimes; it rarely includes files without you doing it manually, and it often takes quite a few attempts to get the right answer because it's so strict about what it does each iteration. But I never run out of quota!)
The advice I got when scouring the internets was primarily to close everything except the file you're editing and maybe one reference file (before asking Claude anything). For added effect, add something like 'Only use the currently open file. Do not read or reference any other files' to the prompt.
I don't have any hard facts to back this up, but I'm sure going to try it myself tomorrow (when my weekly cap is lifted ...).
I've run out of quota on my Pro plan so many times in the past 2-3 weeks. This seems to be a recent occurrence. And I'm not even that active. Just one project, execute in Plan > Develop > Test mode, just one terminal. That's it. I keep getting a quota reset every few hours.
What's happening @Anthropic ?? Anybody here who can answer??
In simple terms, when you have a conversation with an AI and you type a new line and hit enter, the client sends the entire conversation to the LLM. It has always worked this way, and it's how "reasoning tokens" were first realized: you allow a client to "edit" the context, and the client deletes the hallucination, says "Wait..." at the end of the context, and hits enter.
The LLM is tricked into thinking it's confused/wrong/unsure, and "reasons" more about that particular thing.
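In code, the whole loop is roughly this (endpoint and payload shape are made up, mimicking the common chat-completions style, and it assumes an API that will continue a prefilled assistant turn, as some do):

    # Sketch only: the client re-sends the ENTIRE conversation every turn.
    import requests

    messages = [{"role": "user", "content": "How many r's are in strawberry?"}]

    def send(messages):
        r = requests.post("https://api.example.com/v1/chat/completions",
                          json={"model": "some-model", "messages": messages})
        return r.json()["choices"][0]["message"]["content"]

    reply = send(messages)

    # The "reasoning" trick: drop the suspect tail of the answer, append
    # "Wait...", and re-send; the model reads its own doubt and re-reasons.
    messages.append({"role": "assistant",
                     "content": reply.rsplit(".", 1)[0] + ". Wait..."})
    reply = send(messages)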
Quota's basically a count of tokens, so if a new CC session starts with that relatively full, that could explain what's going on. Also, what language is this project in? If it's something noisy that uses up many tokens fast, even if you're using agents to preserve the context window in the main CC, those tokens still count against your quota so you'd still be hitting it awkwardly fast.
The only way Anthropic has two or three nines is in read-only mode, but that'd be like measuring AWS by console uptime while ignoring the actual control plane.
They get to see (unless you've opted out) your context, ideas, source code, etc., and in return you give them $220 and they give you back "out of tokens".
This fixed subscription plan with vaguely specified quotas looks like a way to extract extra money from users who pay $200 and don't use that much value, while preventing other users from going over $200. I understand it might work at scale, but it just feels a bit unfair all around?
It's also a way to improve performance on the things their customers care about. I'm not paying Anthropic more than I do for car insurance every month because I want to pinch ~~pennies~~ tokens; I do it because I can finally offload a ton of tedious work onto Opus 4.5 without hand-holding it and reviewing every line.
The subscription is already such a great value over paying by the token, they've got plenty of space to find the right balance.
I've been using CC until I run out of credits and then switch to Cursor (my employer pays for both). I prefer Claude but I never hit any limits in Cursor.
The main one is that it's ~3 times slower. That's the real dealbreaker, not quality. I can guarantee that if tomorrow we woke up and gpt-5.2-codex became the same speed as 4.5-opus without a change in quality, a huge number of people - not HNers, but everyone price-sensitive - would switch to Codex because it's so much cheaper per usage.
The second one is that it's a little worse at using tools, though 5.2-codex is pretty good at it.
The third is that its knowledge cutoff is far enough behind both Opus 4.5's and Gemini 3's that it's noticeable and annoying when you're working with more recent libraries. This is irrelevant if you aren't.
For Gemini 3 Pro, it's the same first two reasons as Codex, though the tool-calling gap is much bigger.
Mistral is of course so far removed in quality that it's apples to oranges.
It's the most commented issue on their GitHub and it's basically ignored by Anthropic. Title mentions Max, but commenters report it for other plans too.
lol
I work for hours and it never says anything. No clue why you’re hitting this.
$230 pro max.
I'm not really sure LLMs have made it worse. They also haven't made it better, but it was already so awful that it just feels like a different flavor of awful.
I'm 99.9999% sure Gemini has a dynamic scaling system that routes you to smaller models when it's overloaded, and that seems to be when it will still occasionally do things like tell you it edited some files without actually presenting the changes, or go off on other strange tangents.
We're about to get Claude Code for work and I'm sad about it. There are more efficient ways to do the job.
This happens to me more often than not, both in Claude Desktop and on the web. It seems the longer the conversation goes, the more likely it is to happen. Frustrating.
The best you can get today with consumer hardware is something like devstral2-small (24B), qwen-coder-30b (underwhelming), or glm-4.7-flash (promising but buggy atm). And you'll still need a beefy workstation, ~$5-10k.
If you want open SotA, you have to get hardware worth $80-100k to run the big boys (dsv3.2, glm4.7, minimax2.1, devstral2-123b, etc.). That's OK for small office setups, but out of range for most local deployments (especially considering the workstations need lots of power if you go 8x GPUs, even with something like 8x 6000 Pro @ 300W).
OpenCode is incentivized to make a good product that uses your token budget efficiently since it allows you to seamlessly switch between different models.
Anthropic as a model provider on the other hand, is incentivized to exhaust your token budget to keep you hooked. You'll be forced to wait when your usage limits are reached, or pay up for a higher plan if you can't wait to get your fix.
CC, specifically Opus 4.5, is an incredible tool, but Anthropic is handling its distribution the way a drug dealer would.
I've done RL training on small local models, and there's a strong correlation between length of response and accuracy. The more they churn tokens, the better the end result gets.
I actually think the hyperscalers would prefer to serve shorter answers. A token generated at 1k context length is cheaper to serve than one at 10k context, and way, way cheaper than one at 100k context.
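Back-of-the-envelope: every new token has to read the whole KV cache, which grows linearly with context, so with made-up but plausible model dimensions:

    # Rough KV-cache bytes read per generated token (hypothetical dense model).
    layers, kv_heads, head_dim, dtype_bytes = 80, 8, 128, 2  # fp16, GQA
    for ctx in (1_000, 10_000, 100_000):
        kv = 2 * layers * kv_heads * head_dim * dtype_bytes * ctx  # K and V
        print(f"{ctx:>7} ctx: {kv / 1e9:.2f} GB read per new token")

So per token served, 100k of context is roughly 100x the memory traffic of 1k.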
As someone with 2x RTX Pro 6000 and a 512GB M3 Ultra, I have yet to find these machines usable for "agentic" tasks. Sure, they make great chatbots, but agentic work involves sending huge context to the system. That alone rules out the Mac Studio: it lacks tensor cores and is painfully slow at processing even relatively large CLAUDE.md files, let alone a big project.
The RTX setup is much faster but can only hold models ≤192GB, which severely limits its capabilities: you're stuck with low-quant GLM 4.7, GLM 4.7 Flash/Air, GPT-OSS 120B, etc.
Waiting for Anthropic to somehow blame this on users again. "We investigated, turns out the reason was users used it too much".
They need more training data, and with people moving on to OpenCode/Codex, they wanna extract as much data from their current users as possible.
If you look at tool calls like MCP and whatnot, you can see it gets ridiculous. Even though it's small, calling the pal MCP from the prompt, for example, is still burning tokens AFAIK. This is "nobody's" fault in this case, really, but you can see how the incentives line up, and we all need to think about how to make this entire space more usable.
At least it did not turn against them physically... "get comfortable while I warm up the neurotoxin emitters"
I'd need to see real numbers. I can trigger a thinking model to generate hundreds of tokens and return a three-word response (however many tokens that is), or switch to a non-thinking model of the same family that just gives the same result. I don't necessarily doubt your experience; I just haven't had that experience tuning SD, for example, which is also transformer-based.
I'm sure there's some math reason why longer context = more accuracy, but is that intrinsic to transformer-based LLMs? That is, per your thought that the hyperscalers want shorter responses: do you think they're expending more effort to get shorter responses of equivalent accuracy, or are they trying to find some other architecture (or whatever) to overcome the "limitations" of the current one?
(And once you've done that, also consider whether a given task can be handled by a dumber model - I've had good luck switching some of my sub-agents to Haiku.)
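If anyone hasn't tried that: a sub-agent's model is set in its definition file under .claude/agents/ - something like the following, where the file contents are just an example and the frontmatter fields are per the Claude Code sub-agent docs:

    ---
    name: log-reader
    description: Summarizes long log files so the main context stays small.
    model: haiku
    ---
    Read the file you're given and report only errors and anomalies, tersely.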
Controlling the coding tool absolutely is a major asset, and it will be an even greater one as the improvements in each model iteration make it matter less which specific model you're using.
The Reasonable Man might think that an AI company, OF ALL COMPANIES, would be able to use AI to triage bug tickets and reproduce them, but no! They expect humans to keep wasting their own time reproducing issues, pinging tickets, and correcting Claude when it makes mistakes.
Random example: https://github.com/anthropics/claude-code/issues/12358
First reply from Anthropic: "Found 3 possible duplicate issues: This issue will be automatically closed as a duplicate in 3 days."
The user replies: two of the tickets are irrelevant, and the third didn't help.
Second reply: "This issue has been inactive for 30 days. If the issue is still occurring, please comment to let us know. Otherwise, this issue will be automatically closed in 30 days for housekeeping purposes."
Every ticket I ever filed was auto-closed for inactivity. Complete waste of time. I won't bother filing bugs again.
It really leads me to wonder if it’s just my questions that are easy, or maybe the tone of the support requests that go unanswered is just completely different.
This made me chuckle.
I think they are just focusing on where the dough is.
That is, you and most Claude users aren't paying the actual cost. You're like an Uber customer a decade ago.
Growth isn't a problem unless you don't actually cover the cost of every user you sign up. Uber, but for poorly profitable business models.
Upcoming Anthropic press release: "By using Claude to direct users to existing bug reports, we have reduced tickets requiring direct action by xx% and even reduced the rate of incoming tickets."
That's limited to accessing the models through code/desktop/mobile.
And while I'm also using their subscriptions because of the cost savings vs direct access, having the subscription be considerably cheaper than the usage billing rings all sorts of alarm bells that it won't last.
I can't be alone. Literally the worst customer experience I've ever had with the most expensive personal dot-com subscription I've ever paid for.
Never again. When Google sets the customer service bar there are MAJOR issues.
The API request method might have no cap, but they do cap Claude Code even on Max licenses, so it's easier to throttle if needed to control costs. Seems straightforward to me, at any rate. Kinda like reserved-instance vs. spot pricing models?
Which was nothing new itself of course. Conflicts of interest didn't begin with computers, or probably even writing.
You can stop most of this with

    export DISABLE_NON_ESSENTIAL_MODEL_CALLS=1

And might as well disable telemetry, etc.:

    export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
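If you'd rather not rely on shell exports, Claude Code's settings file also takes an "env" map (if I'm reading the settings docs right), so putting the equivalent in ~/.claude/settings.json should work:

    {
      "env": {
        "DISABLE_NON_ESSENTIAL_MODEL_CALLS": "1",
        "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
      }
    }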
I also noticed that every time you start CC, it sends off >10k tokens preparing the different agents. So try not to close and re-open it too often.
It’s effectively a multi-tenant interface.
I also previously used an individual account, but on a corp e-mail.
You could generate a new multi-use credit card in your vibe-bank app (like Revolut), buy a burner (e)SIM for SMS (5 EUR in NL), then rewrite all requests at your MITM proxy to substitute the device ID with one not derived from your machine.
But the same device ID and same phone could be a perfectly legitimate use case: you registered on a corp e-mail, then changed workplaces while keeping the same machine.
Or you lost access to your e-mail (what a pity).
But to get good use out of it, someone would have to compose proper queries to ClickHouse (or whatever they use for logs) and build some logic to run as a service or webhook to detect duplicates, with a pipeline to act on it.
And a good percentage of flags wouldn’t have been ToC violations.
That's a bad vibe; can you imagine how much trial-and-error prompting it would take?..
They can't even vibe their way through the Claude Code bugs alone, on time!
My experience on Claude Max (still on it till end-of-month) has been frequent incomplete assignments and troubling decision making. I'll give you an example of each from yesterday.
1. Asked Claude to implement the features in a v2_features.md doc. It completed 8 of 10, but 3 incorrectly. I gave GPT-5.1-Codex-Max (high) the same tasks and it completed 10 of 10, but took perhaps 5-10x as long. Annoyingly, with LLM variability, I can't know for sure whether Claude would get it right if I tried again. The only thing I do know is that GPT-5.2 and 5.1 do a lot more "double-checking", both before executing and after.
2. I asked Claude to update a string displayed in the UI of my app to show something else instead. The string is driven by a JSON config. Claude searched the code, somehow assumed the string was being loaded from a db, didn't find the JSON, and opted to write code that overwrites whatever comes out of the "db" (incorrect) with what I asked for. This is... not desired behavior, and it's the source of a category of hidden bugs Claude has created in the past (other models do this too, but less often). Max took its time, found the source JSON file, and made the update in the correct place.
I can only "sit back and let an agent code" if I trust that it'll do the work right. I don't need it fast, I need it done right. It's already saving me hours where I can do other things in parallel. So, I don't get this argument.
That said, I have a Claude Max and OpenAI Pro subscription and use them both. I instead typically have Claude Opus work on UI and areas where I can visually confirm logic quickly (usually) and Codex in back-end code.
I often wonder how much the complexity of codebases affects how people discuss these models.
That has reliably worked for me with Gemini, Codex, and Opus. If you can get them to check off features as they complete them, it works even better (i.e., success criteria and an empty checkbox for them to mark off).
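For example, a features doc might look like this (contents made up):

    ## v2 features
    - [ ] Dark mode (success: toggle persists across restarts)
    - [ ] CSV export (success: output matches the schema in docs/export.md)
    - [ ] Rate limiting (success: 429 after 100 req/min per key)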
> Since its founding in 2009, Uber has incurred a cumulative net loss of approximately $10.9 billion.
Now, Uber has become profitable, and will probably become a bit more profitable over time.
But except for speculators and probably a handful of early shareholders, Uber will have lost money for everyone else for the first 20 years since its founding.
For comparison, Lyft, Didi, Grab, and Bolt are in the same boat; most of them are barely turning profitable after 10+ years. Turns out taxis are a hard business, even when you crank the scale up to 11. They might become profitable over the long term, and 15-20 years from now we'll all get even worse and more abusive service, probably more expensive than regular taxis would have been.
I mean, we got some better mobile apps from taxi services, so there's that.
Oh, also a massive erosion of labor rights around the world.
The whole thing I needed was to let AI reach out and touch things - be my hands, essentially. This is why I built my tmux/worker system, and why I built an xdg-portal integration to let it screenshot and, soon, interact with my desktop as a PoC.
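For the curious, the screenshot half is just a D-Bus call to the portal; here's a rough sketch with dbus-python, with interface and method names taken from the org.freedesktop.portal.Screenshot spec (assumes a running portal backend, not my actual code):

    import dbus
    from dbus.mainloop.glib import DBusGMainLoop
    from gi.repository import GLib

    DBusGMainLoop(set_as_default=True)
    bus = dbus.SessionBus()
    portal = bus.get_object("org.freedesktop.portal.Desktop",
                            "/org/freedesktop/portal/desktop")
    screenshot = dbus.Interface(portal, "org.freedesktop.portal.Screenshot")
    loop = GLib.MainLoop()

    def on_response(code, results):
        print(results.get("uri"))  # on success, a file:// URI to the image
        loop.quit()

    # Screenshot() returns a Request object path; the actual result arrives
    # via that Request's Response signal. (A robust client predicts the
    # handle token and subscribes first; this is the lazy version.)
    request = screenshot.Screenshot("", dbus.Dictionary({}, signature="sv"))
    bus.add_signal_receiver(on_response, signal_name="Response",
                            dbus_interface="org.freedesktop.portal.Request",
                            path=request)
    loop.run()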
I could just let it start logging into devices and modifying configs, but it's pretty dumb at times about stuff like modifying Fortigate configurations - what it thinks it should do vs. what the CLI actually lets it do - so I have to proof much of it. That's why I'm building it to run Ansible/Terraform jobs instead, using the frameworks vendors provide for direct configuration, to allow for atomic config changes as far as vendor implementations allow.
I don't see the current investments turning a profit. Maybe the datacenters will, but most of AI is going to be washed out when, somewhere, someone wants to take out their investment and the new Bernie Madoff can't find another sucker.
Yes, I like my crack at 20 bucks a month, but the tech is going to need to improve quickly to keep that up.