zlacker

Two kinds of AI users are emerging

submitted by martin+(OP) on 2026-02-01 23:45:18 | 354 points 336 comments

NOTE: showing posts with links only
◧◩
26. ChrisM+Rd[view] [source] [discussion] 2026-02-02 01:46:53
>>defros+4b
Obligatory xkcd: https://xkcd.com/1667/
◧◩◪
45. Daedal+6g[view] [source] [discussion] 2026-02-02 02:05:27
>>Spivak+ld
having worked in large financial institutions, this would be a step improvement

the largest independent derivatives broker in Australia collapsed after it was discovered the board were using astrology and magicians to gamble with all the clients' money

https://www.abc.net.au/news/2016-09-16/stockbroker-used-psyc...

◧◩
87. bitwiz+iA[view] [source] [discussion] 2026-02-02 05:46:44
>>defros+4b
The thing is, when you use AI, you're not really doing things, you're having things done. AI isn't a tool, it's a service.

Now, back in the day, IBM designed and built an "executive data terminal". It wasn't really a computer terminal in the sense that you and I understand it. Rather, it was a video and two-way-audio feed to a room with a team of underlings, whom an executive could ask for business data and analyses, which could be called up on a computer display (also routed to the executive's office). This allowed the executive to ask questions so he (it was the 1960s, it was almost invariably a he) could make informed decisions, and the team of underlings to call up data or crunch numbers on the computer and show the results on the display.

So because executives are used to having things done for them, I can totally see AI being used by executives to replace the "team of underlings" in this setup—in principle. The fact is that were I in that CEO's chair, I'd be thinking twice before trusting anything an LLM tells me, and double-checking those results—perhaps with my team of underlings.

Discussed on Hacker News: >>42405462 IEEE article: https://spectrum.ieee.org/ibm-demo

◧◩◪
98. Hammer+bE[view] [source] [discussion] 2026-02-02 06:30:46
>>majorm+cd
This has had tremendous real-world consequences. The European austerity wave of the early 2010s was largely downstream of an Excel spreadsheet error that changed the result of a major study on the impact of debt/GDP.

https://www.newscientist.com/article/dn23448-how-to-stop-exc...

◧◩
105. Antiba+mG[view] [source] [discussion] 2026-02-02 06:56:56
>>nnevat+4G
https://www.theregister.com/2025/10/07/gen_ai_shadow_it_secr...

"With 45 percent of enterprise employees now using generative AI tools, 77 percent of these AI users have been copying and pasting data into their chatbot queries, the LayerX study says. A bit more than a fifth (22 percent) of these copy and paste operations include PII/PCI."

◧◩
153. kristo+NU[view] [source] [discussion] 2026-02-02 09:39:29
>>danpal+vd
I call it the day50 problem; I coined that about a year ago. I've been building tools to address it since then. Quit the day job 7 months ago and have been doing it full time since.

https://github.com/day50-dev/

I have been meaning to put up a blog ...

Essentially there's a delta between what the human does and the computer produces. In a classic compiler setting this is a known, stable quantity throughout the life-cycle of development.

However, in the world of AI coding this distance increases.

There are various barriers, with labels like "code debt", that the line can cross. There are three mitigations now: start the lines closer together (PRD is the current en vogue method), push out the frontier of how many shits someone gives (this is the TDD agent method), or try to bend the curve so it doesn't fly out so much (this is the coworker/colleague method).

Unfortunately I'm just a one-man show so the fact that I was ahead and have working models to explain this has no rewards because you know, good software is hard...

I've explained this in person at SF events so often (probably about 40-50 times) that someone reading this might actually have heard it from me...

If that's the case, hi, here it is again.

◧◩◪◨
205. xmcqdp+591[view] [source] [discussion] 2026-02-02 11:58:51
>>Punchy+RO
> Honestly the absolute revolution for me would be if someone managed to make LLM tell "sorry I don't know enough about the topic"

https://arxiv.org/abs/2509.04664

According to that OpenAI paper, models hallucinate in part because they are optimized on benchmarks that involve guessing. If you make a model that refuses to answer when unsure, you will not get SOTA performance on existing benchmarks and everyone will discount your work. If you create a new benchmark that penalizes guessing, everyone will think you are just creating benchmarks that advantage yourself.
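
A toy calculation of that incentive, as a minimal Python sketch. The scoring rule here (1 point for a correct answer, 0 for abstaining, minus `wrong_penalty` for a wrong answer) and the function names are assumptions made for illustration, not taken from the paper:

```python
# Hypothetical scoring rule: +1 for a correct answer, 0 for abstaining,
# -wrong_penalty for a wrong answer. Not the paper's exact setup.
ABSTAIN_SCORE = 0.0


def expected_score(p_correct, wrong_penalty):
    """Expected score of answering when the model is right with probability p_correct."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def best_action(p_correct, wrong_penalty):
    """Pick whichever of answering or abstaining has the higher expected score."""
    if expected_score(p_correct, wrong_penalty) > ABSTAIN_SCORE:
        return "answer"
    return "abstain"


if __name__ == "__main__":
    for penalty in (0.0, 1.0, 3.0):
        choices = [best_action(p, penalty) for p in (0.1, 0.5, 0.9)]
        print(f"penalty={penalty}: {choices}")
```

With no penalty, guessing is never worse than abstaining, even at 10% confidence. Once wrong answers cost points, abstaining wins below some confidence threshold, which is why a model tuned for the first kind of benchmark looks worse on the second and vice versa.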

◧◩◪
216. defros+Ad1[view] [source] [discussion] 2026-02-02 12:36:26
>>xmcqdp+ja1
I too have seen such things.

My experience is that they are an exception rather than the rule, and many more businesses have sheets that tend further toward Heath Robinson than would be admitted in public.

* https://en.wikipedia.org/wiki/W._Heath_Robinson

228. iqandj+Fm1[view] [source] 2026-02-02 13:32:00
>>martin+(OP)
It is like saying Apple is using Claude Code internally while selling you on Apple Intelligence https://x.com/tbpn/status/2016911797656367199
◧◩
243. Leynos+fx1[view] [source] [discussion] 2026-02-02 14:33:49
>>viccis+7C
https://en.wikipedia.org/wiki/Monte_Carlo_methods_in_finance
◧◩◪◨⬒
277. KellyC+j82[view] [source] [discussion] 2026-02-02 17:42:16
>>xmcqdp+591
...or they hallucinate because of floating point issues in parallel execution environments:

https://thinkingmachines.ai/blog/defeating-nondeterminism-in...
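
The underlying floating point effect fits in a few lines: addition is not associative, so summing the same numbers in a different order (as a parallel reduction may do) can give a different result. A minimal Python sketch with hand-picked values. It only illustrates order-dependent sums and does not reproduce anything from the linked post:

```python
import math

# Hand-picked values: 1e16 is large enough that adding 1.0 to it is lost to rounding.
values = [1e16, 1.0, -1e16, 1.0]

# Strict left-to-right order, as a single sequential loop would do.
left_to_right = ((values[0] + values[1]) + values[2]) + values[3]

# The same numbers grouped differently, as a parallel reduction might.
regrouped = (values[0] + values[2]) + (values[1] + values[3])

# math.fsum tracks the lost low-order bits and returns the correctly rounded sum.
exact = math.fsum(values)

if __name__ == "__main__":
    print(left_to_right, regrouped, exact)
```

Here the two groupings disagree (1.0 versus 2.0) even though the inputs are identical, so any computation whose reduction order varies between runs can produce slightly different numbers each time.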

◧◩◪◨⬒
282. 3D3049+Qj2[view] [source] [discussion] 2026-02-02 18:36:05
>>bweste+DU1
You're in luck: https://theautomatedoperator.substack.com/
◧◩◪◨⬒⬓
302. idopms+JW3[view] [source] [discussion] 2026-02-03 02:42:48
>>3D3049+Qj2
That's the place!

The most fun one is this, which creates listing images for my products: https://theautomatedoperator.substack.com/p/opus-45-codes-ge...

More recently, I'm using Claude Code to handle my inventory management by having it act as an analyst while coding itself tools to access my Amazon Seller accounts to retrieve the necessary info: https://theautomatedoperator.substack.com/p/trading-my-vibe-...

◧◩◪◨⬒⬓
320. Mentlo+xz4[view] [source] [discussion] 2026-02-03 08:28:10
>>lepton+MF3
OS X has a 10% market share, which is 2nd after Windows, but I agree that on that one I conflated terms. I couldn’t quickly find device manufacturer stats. If the wiki is to be trusted, Apple is 4th, with a share not far behind Dell [1].

If half doesn’t make you the leader, what does? Maybe you should elaborate on your definition of leader? For me it’s “has the highest market share”. And under that definition, half necessarily qualifies.

It’s funny that for PCs you went for manufacturers (Apple is 4th) but for mobile you went for OS (Apple is 2nd). On mobile devices, Apple is 1st, with double the market share of 2nd place (Samsung).

The need to paint Apple as purely a marketing company always fascinated me. Marketing is a big part of who they are though.

[1] https://en.wikipedia.org/wiki/Market_share_of_personal_compu...

◧◩◪◨⬒⬓⬔
327. Merad+0x6[view] [source] [discussion] 2026-02-03 19:09:42
>>theshr+tt4
I haven't seen it bypass my hook yet (knock on wood). I have my hook script [0] tell it that its commits are required to pass validation; maybe that helps push it in the right direction?

0: https://github.com/mbcrawfo/vibefun/blob/main/.claude/hooks/...
