On the other hand, where I remain a skeptic is the constant banging-on that somehow this will translate into entirely new things - research, materials science, economies, inventions, etc. - because that requires learning “in real time” from information sources you’re literally generating in that moment, not from decades of context-free Stack Overflow responses. That has been bandied about for years, with no evidence to show for it beyond specifically cherry-picked examples, often from highly controlled environments.
I never doubted that, with competent engineers, these tools could be used to generate “new” code from past datasets. What I continue to doubt is the utility of these tools given their immense costs, both environmentally and socially.
Personally I hope this will materialize, at the very least because there are plenty of discoveries to be made by cross-correlating discoveries already made; the necessary information should be there, but reasoning capability (both that of the model and that added by orchestration) seems to be lacking. I'm not sure if pure chat is the best way to access it, either. We need better, more hands-on tools to explore the latent spaces of LLMs.
Everyone wants to automate the (proverbial) plumbing, until shit spews everywhere and there’s nobody to blame but yourself.
That said, yes, it could be highly beneficial for identifying patterns in existing research that allow for new discoveries - provided we don’t trust it blindly and actually validate it with science. Though I question its value to society in burning up fossil fuels, polluting the atmosphere, and draining freshwater supplies, compared to doing the same work with grad students and scientists and the societal feedback that comes with employing them.
But anybody can do plumbing. It’s not rocket science.
Of course, we won't be able to tell the real effects, now, because every longitudinal study of researchers will now be corrupted by the ongoing evisceration of academic research in the current environment. Vibe-coding won't be a net creativity gain to a researcher affected by vibe-immigration-policy, vibe-grant-availability, and vibe-firings, for all of which the unpredictability is a punitive design goal.
Whether fear of LLMs taking jobs has contributed to a larger culture of fear and tribalism that has emboldened anti-intellectual movements worldwide, and what the attributable net effect on research and development will be... it's incredibly hard to quantify.
Regulations come about because of repeated failures that end up harming the public. Regulations aren’t a dirty word, and aren’t obstacles to be “disrupted” in most cases.
> plumbing unions have a financial interest in limiting the number of plumbers
Golly gee, it’s almost as if - because we live in a society where everyone must work in order to survive - skilled professionals have a vested interest in ensuring only qualified candidates may join their ranks, to make it harder to depress wages below subsistence levels (the default behavior of unregulated capital).
> But anybody can do plumbing. It’s not rocket science.
Oh wow, I had no idea I was qualified to design sewage infrastructure for my township just because I plumbed my Amazon bidet into the cold water line! Sure glad there’s no regulations stopping me from becoming a licensed plumber since apparently that’s all it takes to succeed!
Sarcasm aside, your argument holds about as much substance as artificial sweetener: it sounds informed and wise, but anyone with substantial experience in reality and collaborating with other people knows that all you’re spewing is ignorance of the larger systems at work and their interplay.
Sometimes, but see also the concepts of “iron triangles” and “regulatory capture”.
Quite literally this is what I’m trying to get at with my resistance to LLM adoption in the current environment. We’re not using it to do hard work, we’re throwing it everywhere in an intentional decision to dumb down more people and funnel resources and control into fewer hands.
Current AI isn’t democratizing anything, it’s just a shinier marketing ploy to get people to abandon skilled professions and leave the bulk of the populace only suitable for McJobs. The benefits of its use are seen by vanishingly few, while its harms felt by distressingly many.
At present, it is a tool designed to improve existing neoliberal policies and wealth pumps by reducing the demand for skilled labor without properly compensating those affected by its use, nor allowing an exit from their walled gardens (because that is literally what all these XaaS AI firms are - walled gardens of pattern matchers masquerading as intelligence).
Plumbing requires skill, particularly for difficult jobs, and also requires advanced equipment to do such a job in a reasonable amount of time, such as special cameras to inspect a septic tank or drain line without having to actually cut into it.
Regulations aren’t a binary (exclusively good or exclusively bad), yet so many of the HN cohort have drunk the “exclusively bad and everyone can be trusted to make good decisions forever” Kool-Aid that seeks to dismantle regulations wholesale.
Sometimes regulations come about to protect the public. Often, they’re enacted to protect the profits of insurance companies, banks, and other influential industries. Don’t be naive about “the systems at work and their interplay”.
The question was why plumbers are expensive. I assert that it’s not because plumbing is especially difficult.
Does it even have to be able to do so? Just the ability to speed up exploration and validation based on what a human tells it to do is already enormously useful, depending on how much you can speed up those things, and how accurate it can be.
Too slow or too inaccurate and it'll be a strong slowdown factor. But once some threshold has been reached where it makes either of those things faster, I'd probably consider the whole thing "overall useful". But of course that isn't the full picture, and ignoring all the tradeoffs is kind of cheating; there are more things to consider too, as you mention.
I'm guessing we aren't quite over the threshold yet, because it is still very young all things considered, although the ecosystem is already pretty big. I feel like things generally tend to grow beyond their usefulness initially; we're at that stage right now, and people are shooting it in all kinds of directions to see what works and what doesn't.
I'd imagine AI is much cheaper on that front than grad students, whether you count marginal contribution, or total costs of building and utilization. Humans are damn expensive and environmentally intensive to rear and keep around.
The big question is: is it useful enough to justify the cost when the VC subsidies go away?
My phone recently offered me Gemini "now for free" and I thought "free for now, you mean. I better not get used to that. They should be required to call it a free trial."
Smartest thing you’ve said all day. Thanks for reminding me that trying to convince someone of something when they cannot be bothered to do research beyond first order impacts is a waste of my time.
Evaluating a technology in a vacuum does not work when trying to assess its impact, and in that wider context I don’t see the value-add of these models deployed at scale, especially when their marketing continues focusing on synthetic benchmarks and lofty future-hype instead of immediately practicable applications (like this one was).
In my opinion, there is a very valid argument that the vast majority of things that are patented are not "new" things, because everything builds on something else that came before it.
The things that are seen as "new" are not infrequently something where someone in field A sees something in field B, ponders it for a minute, and goes "hey, if we take that idea from field B, twist it clockwise a bit, and bolt it onto the other thing we already use, it would make our lives easier over in this nasty corner of field A." Congratulations! "New" idea, and the patent lawyers and finance wonks rejoice.
LLMs may not be able to truly "invent" "new" things, depending on where you place those particular goalposts.
However, even a year or two ago - well before Deep Research et al - they could be shockingly useful for drawing connections between disparate fields and applications. I was working through a "try to sort out the design space of a chemical process" type exercise, and decided to ask whichever GPT was available and free at the time about analogous applications and processes in various industries.
After a bit of prodding it made some suggestions that I could definitely have come up with on my own if I'd had the requisite domain knowledge, but that I would almost certainly never have managed otherwise. It also caused me to make a connection between a few things that I don't think I would have stumbled upon by myself.
I checked with my chemist friends, and they said the resulting ideas were worth testing. After much iteration, one of the suggested compounds/approaches ended up generating the least bad result from that set of experiments.
I've previously sketched out a framework for using these tools (combined with other similar machine learning/AI/simulation tools) to massively improve the energy consumption of industrial chemical processes. It seems to me that that type of application is one where the LLM's environmental cost could be very much offset by the advances it provides.
The social cost is a completely different question though, and I think a very valid one. I also don't think our economic system is structured in such a way that the social costs will ever be mitigated.
Where am I going with this? I'm not sure.
Is there a "ghost in the machine"? I wouldn't place a bet on yes, at least not today. But I think that there is a fair bit of something there. Utility, if nothing else. They seem like a force multiplier to me, and I think that with proper guidance, that force multiplier could be applied to basic research, material science, economics, and "inventions".
Right now, it does seem that it takes someone with a lot of knowledge about the specific area, process, or task to get really good results out of LLMs.
Will that always be true? I don't know. I think there's at least one piece of the puzzle we don't have sorted out yet, and that the utility of the existing models/architectures will ride the s-curve up a bit longer but ultimately flatten out.
I'm also wrong a LOT, so I wouldn't bet a shiny nickel on that.
I won't claim local LLMs are anywhere near as good as the various top models behind paid subscriptions/APIs, but I'm certain I'd be able to find a way (for me) of working with them well enough, if the entire paid/hosted ecosystem disappeared overnight. Even with models released today.
I think the VC subsidies probably "make stuff happen" faster, and without them we'd see slower progress, but I don't think 100% of the ecosystem would disappear even if 100% of VC funding disappeared. We're bound for another AI winter at some point, and some will surely survive even that :)
It's also getting cheaper all the time. Something like 1000x cheaper in the last two years at the same quality level, and there's not yet any sign of a plateau.
So it'd be quite surprising if the only long-term business model turned out to be subscriptions.
https://www.snellman.net/blog/archive/2025-06-02-llms-are-ch...
It has links to public sources on the pricing of both LLMs and search, and explains why the low inference prices can't be due to the inference being subsidized. (And while there are other possible explanations, it includes a calculator for what the compound impact of all of those possible explanations could be.)
The elite really don't see why the proletariat should be interested in, or enjoy the dignity of, actual skill and quality.
Hence the enshittification of everything, and now AI promises to commoditize everything into slop.
Sad, because it is the very depth of society that has birthed this.
AI code tools are allowing people to build things they couldn't before due to lack of skillset, time, or budget. I’ve seen all sorts of problems solved by semi-technical and even non-technical people. My brother, for example, built a tool with Microsoft Copilot that automated more of the processes in his manufacturing facility (they used to be paper-based).
But yeah, keep yelling at that cloud - the rest of us will keep shipping cool things that we couldn’t before, and faster.
I have harped on this endlessly as a non-programmer working a non-tech job, with 7 "vibe-coded" programs now being used daily by people at my company.
I am sorry, but the tech world is completely missing the forest for the trees here. LLMs are talked about purely as tools that were created to help devs. Some love them, some hate them, but pretty much all of them seem unaware that LLMs allow non-tech people to automate tasks with a computer without having to go through a third-party-created interface.
So yea, maybe Claude is useless troubleshooting your cloud platform. But it certainly isn't useless in helping me forgo a cloud platform by setting up a simple local database to use instead.
Brave's Search API is $3 CPM and includes web search, images, videos, news, and Goggles[0]. Anthropic's API is $10 CPM for web search (and text only?), excluding any input/output tokens from your model of choice[1]; that'd be an additional ~$15 CPM, assuming 1K tokens per request and Claude Sonnet 4 as a good model, so ~$25 CPM total.
So your default "Ratio (Search cost / LLM cost): 25.0x" seems to be more on the 0.12x side of things (search cost / LLM cost). Mind you, I just skimmed everything in 10 minutes and have no experience using either API.
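Back-of-the-envelope in Python, under the same assumptions (the pricing figures are unverified, and 1K output tokens per request is a guess):

    # Rough cost-per-thousand-requests (CPM) comparison, in USD.
    # All pricing figures are as quoted above; 1K output tokens per
    # request is an assumption, not a measured number.
    BRAVE_CPM = 3.00                 # Brave Search API, per 1k requests
    ANTHROPIC_SEARCH_CPM = 10.00     # Anthropic web search, per 1k searches
    SONNET_OUTPUT_PER_MTOK = 15.00   # Claude Sonnet 4 output, per 1M tokens
    TOKENS_PER_REQUEST = 1_000

    # 1k requests * 1K tokens each = 1M output tokens -> $15 per 1k requests
    token_cpm = SONNET_OUTPUT_PER_MTOK * TOKENS_PER_REQUEST * 1_000 / 1_000_000
    anthropic_cpm = ANTHROPIC_SEARCH_CPM + token_cpm  # 10 + 15 = 25
    ratio = BRAVE_CPM / anthropic_cpm                 # 3 / 25 = 0.12

    print(f"search/LLM cost ratio: {ratio:.2f}x")     # 0.12x, not 25x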
Really a lot of innovation, even at the very cutting edge, is about combining old things in new ways, and these are great productivity tools for this.
I've been "vibe coding" quite a bit recently, and it's been going great. I still end up reading all the code and fixing issues by hand occasionally, but it does remove a lot of the grunt work of looking up simple things and typing out obvious code.
It helps me spend more time designing and thinking about how things should work.
It's easily a 2-3x productivity boost versus the old fashioned way of doing things, possibly more when you take into account that I also end up implementing extra bells and whistles that I would otherwise have been too lazy to add, but that come almost for free with LLMs.
I don't think the stereotype of vibe coding - that is, coding without understanding what's going on - actually works, though. I've too often seen the tools get stuck on issues they don't seem able to fully understand to believe that.
I'm not worried at all that LLMs are going to take software engineering jobs soon. They're really just making engineers more powerful, maybe like going from low level languages to high level compiled ones. I don't think anyone was worried about the efficiency gains from that destroying jobs either.
There's still a lot of domain knowledge that goes into using LLMs for coding effectively. I have some stories on this too but that'll be for another day...
The problem is that it's sold as a complete solution: use the LLM and you'll get a fully working product. However, if you're not an experienced programmer, you won't know what's missing, whether it's using outdated and insecure options, or whether it's just badly written. This still needs a professional.
The technology is great and it has real potential to change how things are made, but it's being marketed as something it isn't (yet).
I think a lot of this could be solved by a platform that implements appropriate guardrails so that the application code literally cannot screw up the security. Not every conceivable type of software would fit in such a platform, but a lot of what people want to do to automate their day-to-day lives could.
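As a toy sketch of the idea (everything here is hypothetical, not any real platform's API): the platform hands the generated app a narrow, pre-audited handle instead of raw database or network access, so the app code physically can't touch auth, secrets, or anyone else's data.

    # Hypothetical guardrail boundary: the platform owns auth and isolation,
    # and generated app code only ever receives this restricted handle.
    import sqlite3

    class ScopedStore:
        """The only capability a generated app gets: key/value reads and
        writes confined to its own namespace. No raw SQL, no credentials."""

        def __init__(self, db_path: str, app_id: str):
            self._conn = sqlite3.connect(db_path)
            self._app_id = app_id
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS kv (app_id TEXT, key TEXT, "
                "value TEXT, PRIMARY KEY (app_id, key))"
            )

        def put(self, key: str, value: str) -> None:
            # Parameterized queries only; app code can't inject SQL or
            # address another app's rows.
            self._conn.execute(
                "INSERT OR REPLACE INTO kv VALUES (?, ?, ?)",
                (self._app_id, key, value),
            )
            self._conn.commit()

        def get(self, key: str) -> str | None:
            row = self._conn.execute(
                "SELECT value FROM kv WHERE app_id = ? AND key = ?",
                (self._app_id, key),
            ).fetchone()
            return row[0] if row else None

    # The vibe-coded app sees only `store`, never the connection:
    store = ScopedStore("platform.db", app_id="timesheet-app")
    store.put("greeting", "hello")
    print(store.get("greeting"))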
In applied research, perhaps. Fundamental research is nothing like that in any field, including ML.
All technology has the effect of concentrating wealth, and anyone who insists on using their two hands to fashion things when machines exist that can do it better will always be relegated to the "artisan" bin as time rolls on.
LLMs excel at writing software for one or a handful of users with a very narrow but very well defined use cases.
I don't need an LLM to write Excel.exe for keeping track of 20 employee's hours. A simple GUI on a SQLite database can easily do that.
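To make that concrete, here's roughly how small such a tool is - a hedged sketch in Python on SQLite, with every table and column name made up:

    # Minimal hours tracker on SQLite: the kind of narrow, well-defined
    # tool this thread is talking about. Schema names are illustrative.
    import sqlite3

    conn = sqlite3.connect("hours.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS hours (
            employee  TEXT NOT NULL,
            work_date TEXT NOT NULL,  -- ISO date, e.g. '2025-06-02'
            hours     REAL NOT NULL
        )
    """)

    def log_hours(employee: str, work_date: str, hours: float) -> None:
        conn.execute("INSERT INTO hours VALUES (?, ?, ?)",
                     (employee, work_date, hours))
        conn.commit()

    def total_hours(employee: str, start: str, end: str) -> float:
        row = conn.execute(
            "SELECT COALESCE(SUM(hours), 0) FROM hours "
            "WHERE employee = ? AND work_date BETWEEN ? AND ?",
            (employee, start, end),
        ).fetchone()
        return row[0]

    log_hours("alice", "2025-06-02", 7.5)
    print(total_hours("alice", "2025-06-01", "2025-06-07"))  # 7.5

Wrap a small GUI (or just a spreadsheet export) around those two functions and that's the whole product.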
It's worthwhile to note that https://github.com/deepseek-ai/open-infra-index/blob/main/20... shows cost vs. theoretical income. They don't show 80% gross margins and there's probably a reason they don't share their actual gross margin.
OpenAI is the easiest counterexample that proves inference is subsidized right now. They've taken $50B in investment; surpassed 400M WAUs (https://www.reuters.com/technology/artificial-intelligence/o...); lost $5B on $4B in revenue for 2024 (https://finance.yahoo.com/news/openai-thinks-revenue-more-tr...); and project they won't be cash-flow positive until 2029.
Prices would be significantly higher if OpenAI was priced for unit profitability right now.
As for the mega-conglomerates (Google, Meta, Microsoft), GenAI is a loss leader to build platform power. GenAI doesn't need to be unit profitable; it just needs to attract and retain people on their platform, i.e. you need a Google Cloud account to use the Gemini API.
* We achieved Level 2 autonomy first, which requires you to fully supervise and retain control of the vehicle and expect mistakes at any moment. So kind of neat but also can get you in big trouble if you don't supervise properly. Some people like it, some people don't see it as a net gain given the oversight required.
^ This is where Tesla "FSD beta" is at, and probably where LLM codegen tools are at today.
* After many years we have achieved a degree of Level 4 autonomy on well-trained routes albeit with occasional human intervention. This is where Waymo is at in certain cities. Level 4 means autonomy within specific but broad circumstances like a given area and weather conditions. While it is still somewhat early days it looks like we can generally trust these to operate safely and ask for help when they are not confident. Humans are not out of the loop.[1]
^ This is probably where we can expect codegen to grow after many more years of training and refinement in specific domains. I.e. a lot of what Cloudflare engineers did with their prompt-engineering tweaks was of this nature. Think of them as the employees driving the training vehicles around San Francisco for the past decade. And similarly, "L4 codegen" needs to prioritize code safety, which in part means ensuring humans can understand situations and step in to guide and debug when the tool gets stuck.
* We are still nowhere close to Level 5 "drive anywhere and under any conditions a human can." And IMHO it's not clear we ever will based purely on the technology and methods that got us to L4. There are other brain mechanisms at work that need to be modeled.
[1] https://www.cnbc.com/2023/11/06/cruise-confirms-robotaxis-re...
I believe the API prices are not subsidized, and there's an entire section devoted to that. To recap:
1) pure compute providers (rather than companies providing both the model and the compute) can't really gain anything from subsidizing. That market is already commoditized and supply-limited.
2) there is no value to gaining paid API market share -- the market share isn't sticky, and there's no benefit to just getting more usage since the terms of service for all the serious providers promise that the data won't be used for training.
3) we have data from a frontier lab on what the economics of their paid API inference are (but not the economics of other types of usage)
So the API prices set a ceiling on what the actual cost of inference can be. And that ceiling is very low relative to the prices of a comparable (but not identical) non-AI product category.
That's a very distinct case from free APIs and consumer products. The former is being given out for no cost in exchange for data, the latter for data and sticky market share. So unlike paid APIs, the incentives are there.
But given the cost structure of paid APIs, we can tell that it would be trivial for the consumer products to be profitably monetized with ads. They've got a ton of users, and the way users interact with their main product would be almost perfect for advertising.
The reason OpenAI is not making a profit isn't that inference is expensive. It's that they're choosing not to monetize like 95% of their users, despite the unit economics being very lucrative in principle. They're making a loss because for now they can, and for now the only goal of their consumer business is to maximize their growth and consumer mindshare.
If OpenAI needed to make a profit, they would not raise their prices on things being paid for. They'd just need to extract a very modest revenue from their unpaid users. (It's 500M unpaid users. To make $5B/year in revenue from them, you'd need an ARPU of just $10/year. That's an order of magnitude below what's realistic. Hell, that's lower than the famously hard to monetize Reddit's global ARPU.)
And in a world where policy is horrid and the effects are mainly negated, things would be even worse if the remaining researchers lost AI as a tool. For better or for worse, fire has been shared with humanity, and we might as well cook.
We're about to enter a world where everyone has their own custom software for their specific use cases. Each of these is relatively simple, yet they may replace something complex. Excel is complex because it needs to handle everyone's use cases, but for any one particular spreadsheet, you could pretty easily vibe-code a replacement that does that one spreadsheet's job better than Excel can.
I've also found that vibe-coding a presentation as a React app is better than using Power Point.
You might think it was worth it now because you got an iPhone, but they didn't get an iPhone.
> I have harped on this endlessly as a non-programmer working a non-tech job, with 7 "vibe-coded" programs now being used daily by people at my company.
Aren't AI coding agents just the next iteration of democratizing app development? This has happened before with Microsoft Access (even Visual Basic), or, going back further, FoxPro, dBase & Clipper, etc. With all of these tools, non-programmers were able to create apps to help them with their businesses.
1) Help me understand what you mean by “pure compute providers” here. Who are the pure compute providers and what are their financials including pricing?
2) I already responded to this - platform power is one compelling value gained from paid API market share.
3) If the frontier lab you’re talking about is DeepSeek, I’ve already responded to this as well, and you didn’t even concede the point that the 80% margin you cited is inaccurate given that it’s based on a “theoretical income”.
Just because a paradigm shift doesn't miraculously catapult us all into a post-scarcity economy overnight, that doesn't mean it's not an important milestone on a longer road.
2) (API) platform power having no value in this space has been demonstrated repeatedly. There are no network effects, because you can't use the user data to improve models. There is no lock-in, as the models are easy to substitute due to how incredibly generic the interface is. There is no loyalty, the users will jump ship instantly when better models are released. There is no purchasing power from having more scale, the primary supplier (Nvidia) isn't giving volume discounts and is actually giving preferential allocations to smaller hosting providers to fragment the market as much as possible.
Did you have some other form of platform power in mind?
3) I did not concede that point because I don't think it's relevant. They provide the exact data for their R1 inference economics:
- The cost per node: an 8*H800 node costs $16/hour = $0.0045/s to run (rental price, so that covers capex + opex).
- The throughput per node: Given their traffic mix, a single node will process 75k/s input tokens and generate 15k/s output tokens.
- Pricing: $0.35/1M input when weighted for cache hits/misses, $2.2/1M output.
- From which it follows that the per-node revenue is $0.35/1M × 75k/s = $0.026/s for input, and $2.2/1M × 15k/s = $0.033/s for output. That's ~$0.06/s in revenue, substantially higher than the cost of running the node (quick check below).
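A sanity check of that arithmetic, using only the figures quoted above:

    # Reproducing the per-node economics from DeepSeek's published numbers.
    NODE_COST_PER_S = 16 / 3600        # $16/hour per 8*H800 node -> ~$0.0044/s

    INPUT_TOK_PER_S = 75_000           # input tokens processed per node
    OUTPUT_TOK_PER_S = 15_000          # output tokens generated per node
    INPUT_PRICE_PER_MTOK = 0.35        # cache-hit/miss weighted
    OUTPUT_PRICE_PER_MTOK = 2.20

    revenue_per_s = (INPUT_TOK_PER_S / 1e6) * INPUT_PRICE_PER_MTOK \
                  + (OUTPUT_TOK_PER_S / 1e6) * OUTPUT_PRICE_PER_MTOK

    print(f"cost:    ${NODE_COST_PER_S:.4f}/s")   # ~$0.0044/s
    print(f"revenue: ${revenue_per_s:.4f}/s")     # ~$0.0593/s
    print(f"ratio:   {revenue_per_s / NODE_COST_PER_S:.1f}x")  # ~13x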
Like, that just is what the economics of paid R1 inference are (there being V3 in the mix doesn't matter, they're the same parameter count). Inference is really, really cheap both in absolute cost/token terms and relative to the prices people are willing to pay.
Their aggregate margins are different, and we don't know how different, because here too they choose to also provide free service with no ads. But that too is a choice. If they just stopped doing that and rented fewer GPUs, their margins would be very lucrative. (Not as high as the computation suggests, since the unpaid traffic allows them to batch more efficiently, but that's not going to make a 5x difference.)
But fair enough, it might be cleaner to use the straight cost per token data rather than add the indirection of margins. Either way, it seems clear that API pricing is not subsidized.
I don't work anywhere close to software, but I have used ChatGPT to program small tools and scripts for me that I never would have written myself.
The real boon of AI programming is when normal people use it to program things custom tailored for their use case.
A ridiculous amount of most researchers' time is spent cleaning up data.
    Public Class Form1
        Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            MessageBox.Show("Hello, World!")
        End Sub
    End Class

becomes "Make a message box pop up on the screen that says Hello World!"
Yeah, that sounds about right to me. I wasn't talking about wholesale replacement though, but about a tool/augmentation. I'm not very confident an LLM would be able to replace a software engineer, but I can definitely see many of a software engineer's workflows being sped up, like the exploration and validation process.
> You really should read the papers and reporting coming out about the sheer cost of these AI models and their operation.
Unless I've missed something big, they're still showing what I said.
Obviously, AI has its cost. And it's going to be big, because the whole world is using it, and trying to develop better models.
> those humans provide knock-on impacts that can decrease their environmental impact (especially if done in concert)
Can you name three? As far as I know, humans are energy-intensive and heavy net carbon emitters in general - and there's only so much they can do to decrease that; otherwise we wouldn't be facing a climate crisis.
> the current crop of AI is content burning NatGas turbines
That's a misleading statement, not an argument. AI is powered by electricity, not natural gas. Electricity is fungible, and how it's generated is not relevant to how it's used. Even if you can point at a data center that gets power directly and exclusively from a fossil-fuel generator, the problem has nothing to do with AI, and the solution is not "less AI" but "power the data center from renewables or nuclear instead".
> I don’t see the value-add of these models deployed at scale, especially when their marketing continues focusing on synthetic benchmarks and lofty future-hype instead of immediately practicable applications (like this one was)
That's the crux of the issue. You don't see the value-add. I respectfully suggest to stop looking at benchmarks, to stop reading marketing materials and taking it seriously (always a good idea, regardless of the topic), to stop listening to linkedin "thought leaders". Instead, just look at it. Try using it, see how others are using it.
The value-add is real, substantial, and blindingly obvious. To me, it's one of the best uses of electricity today, in terms of value-add per kilowatt hour.
And it is true that those people did not get an iPhone and died, but this is also you saying it for them. You don't know all the specifics of history or all their motivations. The industrial revolution had a bloody story, but its origins were also organic, and it also had aspects of improvement. The world population grew almost 10x.
I don't think we are in a position to judge those past events through the lens you are proposing.