The difference between the two is stark.
I'll know better in a week. Hopefully I can get better results with the $200 a month plan.
"Microsoft has over 100,000 software engineers working on software projects of all sizes."
So that would mean 100 000 000 000 (100 billion) lines of code per month. Frightening.
But I guess having my computer randomly stop working because a billion dollar corporation needs to save money by using a shitty text generation algorithm to write code instead of hiring competent programmers is just the new normal now.
CC has some magic secret sauce and I'm not sure what it is.
My company pays for both too, I keep coming back to Claude all-round
I use LLMs all day every day, but measuring someone or something by the number of lines of code produced is still incredibly stupid, in my opinion.
> My goal is to eliminate every line of C and C++ from Microsoft by 2030. Our strategy is to combine AI and Algorithms to rewrite Microsoft’s largest codebases. Our North Star is “1 engineer, 1 month, 1 million lines of code”.
Obviously, "every line of C and C++ from Microsoft" is not contained within a single research project, nor are "Microsoft's largest codebases".
Generating billions of lines of code that are unmaintainable and buggy should easily achieve that. ;-)
Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.
I miss those days. Starting in October with VS Code Copilot Chat it was $150, $200, $300, $400 per month with the same usage. I thought they were just charging more per request without warning. The last couple weeks it seemed that VS Code Copilot was just fucking up, making useless calls.
Perhaps, it wasn't a dark malicious pattern but rather incompetence that was driving up the price.
That’s 200 Windows’ worth of code every month.
The fact that there are distinguished engineers at MS who think that is a reasonable goal is frightening though.
As @mrbungie says on this thread: "They took the stupidest metric ever and made a moronic target out of it"
Specifically WHY they use Apple hardware is something I can only speculate on. Presumably it's easier to launch Windows on Mac than the other way around, and they would likely need to do that as .NET and its related technologies are cross platform as of 2016. But that's a complete guess on my part.
Am *NOT* a Microsoft employee, just an MVP for Developer Technologies.
No, one researcher at Microsoft made a personal LinkedIn post saying his team was using that as their 'North Star' for porting and transpiling existing C and C++ code, not writing new code. When the internet hallucinated that he meant Windows and that this meant new code, and started copy-pasting it as "Microsoft's goal", the post was edited and Microsoft said it isn't the company's goal.
The fact that it's a "PR disaster" for a researcher to have an ambitious project at one of the biggest tech companies on the planet, or to talk up their team on LinkedIn, is unbelievably ridiculous.
Copilot's main problem seems to be that people don't know how to use it. They need to delete all their plugins except the VS Code and CLI ones, and disable all models except the Anthropic ones.
The Claude Code reputation diff is greatly exaggerated beyond that.
(I think it is from "Triumph of the Nerds" (1996), but I can't find the time code)
They want to expand that value into engineering and so are looking for something they can measure. I haven't seen anyone answer what can be measured to make a useful improvement though. I have a good "feeling" that some people I work with are better than others, but most are not so bad that we should fire them - but I don't know how to put that into something objective.
The accounts have now all gone quiet, guess they got told to quit it.
One of the many reasons why it's such a bad practice (overly verbose solutions is another one, of course).
Talking about rewriting Windows at a rate of 1 million lines of code per engineer per month with LLMs is absolutely going to garner negative publicity, no matter how much you spin it with words like "ambitious" (do you work in PR? it sounds like it's your calling).
Why would you continue supposing such a thing when both the employee, and the employer, have said that your suppositions are wrong?
Most models of productivity look like factories with inputs, outputs, and processes. This is just not how engineering or craftsmanship happen.
Tldr: Copilot has 1% marketshare among web chatbots and 1.85% of paid M365 users bought a subscription to it.
As much as I think AI is overrated already, Copilot is pretty much the worst performing one out there from the big tech companies. Despite all the Copilot buttons in office, windows, on keyboards and even on the physical front of computers now.
We have to use it at work but it just feels like if they spent half the effort they spend on marketing on actually trying to make it do its job people might actually want to use it.
Half the time it's not even doing anything. "Please try again later" or the standard error message Microsoft uses for every possible error now: "Something went wrong". Another pet peeve of mine, those useless error messages.
He didn't dislike it, but got himself a MacBook nonetheless, at his own cost.
* all mean 'nearly all' as of course there will be exceptions.
If Copilot is uniquely stupid with 5.2 Codex then they should disable that instead of blaming the user (I know they aren't, you are). But that's not the case: it's noticeably worse with everything, compared to both Cursor and Claude Code.
So with this level of productivity Windows could completely degrade itself and collapse in one week instead of 15 years.
GPT-5.2 sometimes does this too. Opus-4.5 is the best at understanding what you actually want, though it is ofc not perfect.
“What do we actually need to be productive?”
That's how Anthropic pulled ahead of Microsoft, which prioritized
checks notes
Taking screenshots of every windows user’s desktop every few seconds. For productivity.
Totally agree. I see LOC as a liability metric. It amazes me that so many other people see it as an asset metric.
There is Microsoft Copilot, which replaced Bing Chat, Cortana and uses OpenAI’s GPT-4 and 5 models.
There is Github Copilot, the coding autocomplete tool.
There is Microsoft 365 Copilot, what they now call Office with built in GenAI stuff.
There is also a Copilot cli that lets you use whatever agent/model backend you want too?
Everything is Copilot. Laptops sell with Copilot buttons now.
It is not immediately clear what version of Copilot someone is talking about. 99% of my experience is with the Office one, and it 100% fails to do the thing it was advertised to do 2 years ago when work initially got the subscription. Point it at a SharePoint/OneDrive location with a handful of Excel spreadsheets and PDFs/Word docs and tell it to make a PowerPoint presentation based on that information.
It cannot do this. It will spit out nonsense. You have to hold it by the hand and tell it everything to do, step by step, to the point that making the PowerPoint presentation yourself is significantly faster because you don't have to type out a bunch of prompts and edit its garbage output.
And now it’s clear they aren’t even dogfooding their own LLM products so why should anyone pay for Copilot?
Well, that might explain why all their products are unusable lately.
Maybe Microsoft is just using it internally, to finish copying the rest of the features from Claude Code.
Much like the article states, I use Claude Code beyond just its coding capabilities....
Microsoft cannot and will not ever get better at naming things. It is said the universe will split open and an eldritch beast will consume the stars the day Microsoft stops using inconsistent and overlapping names for different and conflicting products.
Isn't that right, .NET/dotnet?
Improve the workflows that would benefit "AI" algorithms, image recognition, voice control, hand writing, code completion, and so on.
No need to put buttons to chat windows all over the place.
It's also just not as good at being self-directed and doing all of the rest of the agent-like behaviors we expect, i.e. breaking down into todolists, determining the appropriate scope of work to accomplish, proper tool calling, etc.
Attempt to build a product... Fail.
Buy someone else's product/steal someone else's product... Succeed.
No, there is Github Copilot, the AI agent tool that also has autocomplete, and a chat UI.
I understand your point about naming, but it's always helpful to know what the products do.
I have developed decent intuition on what kinds of problems Codex, Claude, Cursor(& sub-variants), Composer etc. will or will not be able to do well across different axes of speed, correctness, architectural taste, ...
If I had to reflect on why I still don't use Gemini, it's because they were late to the party and I would now have to be intentional about spending time learning yet another set of intuitions about those models.
Codex is the best at following instructions IME. Claude is pretty good too but is a little more "creative" than codex at trying to re-interpret my prompt to get at what I "probably" meant rather than what I actually said.
When it came out, Github Copilot was an autocomplete tool. That's it. That may be what the OP was originally using. That's what I used... 2 years ago. That they change the capabilities but don't change the name, yet change names on services that don't change capabilities further illustrates the OP's point, I would say.
Microsoft may or may not have a "problem" with naming, but if you're going to criticize a product, it's always a good starting place to know what you're criticizing.
I'm amazed that a company that's supposedly one of the big AI stocks seemingly won't spare a single QA position for a major development tool. It really validates Claude's CLI-first approach.
Any MBAs want to buy? For the right price I could even fix it ...
It's interesting to think back, what did Copilot do wrong? Why didn't it become Claude Code?
It seems for one thing its ambition might have been too small. Second, it was tightly coupled to VS Code / Github. Third, a lot of dumb big org Microsoft politics / stakeholders overly focused on enterprise over developers? But what else?
CC is, imo, the best. The rest are largely on par with each other. The benefit of VS Code and Antigravity is that they have the most generous limits. I ran through Cursor's $20 limits in 3 days, where the same-tier VS Code subscription can last me 2+ weeks.
The tools or the models? It's getting absurdly confusing.
"Claude Code" is an interface to Claude, Cursor is an IDE (I think?! VS Code fork?), GitHub Copilot is a CLI or VS Code plugin to use with ... Claude, or GPT models, or ...
If they are using "Claude Code" that means they are using Anthropic's models - which is interesting given their huge investment in OpenAI.
But this is getting silly. People think "CoPilot" is "Microsoft's AI" which it isn't. They have OpenAI on Azure. Does Microsoft even have a fine-tuned GPT model or are they just prompting an OpenAI model for their Windows-builtins?
When you say you use CoPilot with Claude Opus people get confused. But this is what I do everyday at work.
shrug
Meanwhile MS and GitHub are waiting for any breadcrumb that ChatGPT leaves behind.
Github Copilot is actually a pretty good tool.
[1] Not just AI. This is true for any major software product line, and why subordinate branding exists.
A.I. Tool Is Going Viral. Five Ways People Are Using It
https://www.nytimes.com/2026/01/23/technology/claude-code.ht...
Claude Is Taking the AI World by Storm, and Even Non-Nerds Are Blown Away
https://www.wsj.com/tech/ai/anthropic-claude-code-ai-7a46460...
GitHub Copilot is a service; you can buy a subscription here: https://github.com/features/copilot.
GitHub Copilot is available from website https://github.com/copilot together with services like Spark (not available from other places), Spaces, Agents etc.
GitHub Copilot is VSCode extension which you can download at https://marketplace.visualstudio.com/items?itemName=GitHub.c... and use from VSCode.
New version has native "Claude Code" integration for Anthropic models served via GitHub Copilot.
You can also use your own ie. local llama.cpp based provider (if your github copilot subscription has it enabled / allows it at enterprise level).
GitHub Copilot CLI is available for download here https://github.com/features/copilot/cli and it's a command-line interface.
Copilot for Pull Requests https://githubnext.com/projects/copilot-for-pull-requests
Copilot Next Edit Suggestion https://githubnext.com/projects/copilot-next-edit-suggestion...
Copilot Workspace https://githubnext.com/projects/copilot-workspace/
Copilot for Docs https://githubnext.com/projects/copilot-for-docs/
Copilot Completions CLI https://githubnext.com/projects/copilot-completions-cli/
Copilot Voice https://githubnext.com/projects/copilot-voice/
GitHub Copilot Radar https://githubnext.com/projects/copilot-radar/
Copilot View https://githubnext.com/projects/copilot-view/
Copilot Labs https://githubnext.com/projects/copilot-labs/
This list doesn't include project names without Copilot in them like "Spark" or "Testpilot" https://githubnext.com/projects/testpilot etc.
I do agree that conceptually there is a big difference between an editor, even with smart autocomplete, and an agentic coding tool, as typified by Claude Code and other CLI tools, where there is not necessarily any editor involved at all.
> GitHub Copilot is a service
and maybe, the api behind
> GitHub Copilot is VSCode extension
???
What an absolute mess.
[0] based on user Thumbs up/Thumbs down voting
There was also "Active" before that, but .NET was next level crazy...
There's also all the other Copilot branded stuff which has varying use. The web based chat is OK, but I'm not sure which model powers it. Whatever it is it can be very verbose and doesn't handle images very well. The Office stuff seems to be completely useless so far.
Is it the context menu key? Or did they do another Ctrl+Alt+Shift+Win+L thing?
One thing that I don't know about is whether they have an AI product that can work on combining unstructured data and databases to give better insights in any new conversation? e.g., say the LLM knows how to convert user queries to the domain model of tables and extract information. What companies are doing such things?
This would be something that can be deployed on-prem/ their own private cloud that is controlled by the company, because the data is quite sensitive.
Leaving Microsoft's ecosystem a few years ago was a great productivity boost, saved quite a bit of cash, and dramatically reduced my frustration.
It's pretty clear that Microsoft had "Everything must have Copilot" dictated from the top (or pretty close). They wanted to be all-in on AI but didn't start with any actual problems to solve. If you're an SWE or a PM or whatever and suddenly your employment/promotion/etc prospects depend on a conspicuously implemented Copilot thing, you do the best you can and implement a chat bot (and other shit) that no one asked for or wants.
I don't know Anthropic's process but it produced a tool that clearly solves a specific problem: essentially write code faster. I would guess that the solution grew organically given that the UI isn't remotely close to what you'd expect a product manager to want. We don't know how many internal false-starts there were or how many people were working on other solutions to this problem, but what emerged clearly solved that problem, and can generalize to other problems.
In other words, Microsoft seems to have focused on a technology buzzword. Anthropic let people solve their own problems and it led to an actual product. The kind that people want. The difference is like night and day.
Who knows what else might have happened in the last 12 months if C-suites were focused more on telling SWEs to be productive and less on forcing specific technology buzzwords because they were told it's the future.
I only gave it up because it felt like a liability and, ahem, it was awkward to review screenshots and delete inopportune ones.
Because Windows' UX is trash? Anyone with leverage over their employer can and should request a Mac. And in a hot market, developers/designers did have that leverage (maybe they still do) and so did get their Macs as requested.
Only office drones who don't have the leverage to ask for anything better or don't know something better exists are stuck with Windows. Everyone else will go Mac or Linux.
Which is why you see Windows becoming so shit, because none of the culprits actually use it day-to-day. Microsoft should've enforced a hard rule about dogfooding their own product back in the Windows 7 days when the OS was still usable. I'm not sure they could get away with it now without a massive revolt and/or productivity stopping dead in its tracks.
I am familiar with Copilot CLI (using models from different providers), OpenCode doing the same, and Claude with just the Anthropic models, but if I ask all 3 the same thing using the same Anthropic model, I SHOULD be getting roughly the same output, modulo LLM nondeterminism, right?
This would explain the "secret sauce", if it's true. But perhaps it's not and a lot is LLM nondeterminism mixing with human confirmation bias.
Until MS makes sure their models get the necessary context, I don't even care to click on them.
It's unbelievable how badly they failed at this. If you do the same with Claude or ChatGPT via the simple web interface, they are miles ahead.
I mean they fought the browser war for years, then just used Chrome.
I'm sure no other tech company is like this.
I think technologies like the Windows kernel and OS, the .NET framework, their numerous attempts to build a modern desktop UI framework with XAML, their dev tools, were fundamentally good at some point.
Yet they can't or won't hire people who would fix Windows rather than just maintain it, really push for modernization, and make .NET actually cool and something people want to use.
They'd rather hire folks who were taught at school that Microsoft is the devil and Linux is superior in all ways, who don't know the first thing about the MS tech stack, and who would rather write React on their MacBooks (see the start menu incident) than touch anything made by Microsoft.
It seems somehow the internal culture allows this. I'm sure if you forced devs to use Copilot, and provided them with the tools and organizational mandate to do so, it would become good enough eventually to not have to force people to use it.
My main complaint I keep hearing about Azure (which I do not use at work)
Nadella might have fixed a few things, but Microsoft still have massive room for improvement in many areas.
Claude Code is fun, full of personality, many features to hack around model shortcomings, and very quick, but it should not be let anywhere near serious coding work.
That's also why OpenClaw uses Claude for personality, but its author (@steipete) disallows any contribution to it using Claude Code and uses Codex exclusively for its development. Claude Code is a slop producer with illusions of productivity.
MS's calculus was obvious - why spend insane amounts of engineering effort to make a browser engine that nobody uses - which is too bad, because if I remember correctly they were not too far behind Chrome in either perf or compatibility for a while.
That was using the Claude Sonnet 4.5 model, I wonder if using the Opus 4.5 model would have managed to avoid that.
It's truly more capable but still not capable enough that I'm comfortable blindly trusting the output.
Claude still can't do half the things Crush can do.
Plus: you can use Kimi 2.5 with Crush soon
The execs buying Microsoft products are presumed to be as clueless as the execs naming Microsoft products.
There is no tech giant that is more vulnerable than Microsoft is at this moment.
Most document originations will begin out of or adjacent to LLM sessions in the near future, as everything will blur in terms of collaborating with AI agents. Microsoft has no footing (or worse, their position is terrible courtesy of Copilot) and is vulnerable to death by inflection point. Windows 11 is garbage and Google + Linux may finally be coming for their desktop (no different than what AMD has managed in unwinding the former Intel monopoly in PCs).
Someone should be charging at them with a new take on Office, right now. This is where you slice them in half. Take down Office and take down Windows. They're so stupid at present that they've opened the gates to Office being destroyed, which has been their moat for 30 years.
1. Classic Coding (Traditional Development)
In the classic model, developers are the primary authors of every line.
Production Volume: A senior developer typically writes between 10,000 and 20,000 lines of code (LOC) per year.
Workflow: Manual logic construction, syntax memorization, and human-led debugging using tools like VS Code or JetBrains IDEs.
Focus: Writing the implementation details. Success is measured by the quality and maintainability of the hand-written code.
2. AI-Supported Coding (The Modern Workflow)
AI tools like GitHub Copilot and Cursor act as a "pair programmer," shifting the human role to a reviewer and architect.
Production Volume: Developers using full AI integration have seen a 14x increase in code output (e.g., from ~24k lines to over 810k lines in a single year).
Work Distribution: Major tech leaders like AWS report that AI now generates up to 75% of their production code.
The New Bottleneck: Developers now spend roughly 70% of their time reviewing AI-generated code rather than writing it.
I think a realistic 5x to 10x is possible. 50,000 to 200,000 LOC per YEAR!!!! Would it be good code? We will see. Related: https://www.cnet.com/tech/tech-industry/windows-servers-iden...
Seriously, how?
It's on the top of most leaderboards on lmarena.ai
Microsoft lost its way with Windows Phone, Zune, the Xbox 360 RRoD, and Kinect. They haven't had relevance outside of Windows (Desktop) in the home for years, with the sole exception being Xbox.
They have pockets of excellence. Where great engineers are doing great work. But outside those little pockets, no one knows.
Microsoft is very explicit in detailing how the data stays on device and goes to great lengths to detail exactly how it works to keep data private, as well as having a lot of sensible exceptions (e.g., disabled for incognito web browsing sessions) and a high degree of control (users can disable it per app).
On top of all this it’s 100% optional and all of Microsoft’s AI features have global on/off switches.
Take some arbitrary scalar and turn it into a mediocre metric, for some moronic target.
They really won't, though; Microsoft just does this kind of thing, over and over and over. Before everything was named "365", it was all "One", before that it was "Live"... 20 years ago, everything was called ".NET" whether it had anything to do with the Internet or not. Back in the '90s they went crazy for a while calling everything "Active".
My default everyday model is still Gemini 3 in AI Studio, even for programming-related problems. But for agentic work Antigravity felt very early-stage beta-ware when I tried it.
I will say that at least Gemini 3 is usually able to converge on a correct solution after a few iterations. I tried Grok for a medium-complexity task and it quickly got stuck trying to change minor details without being able to get itself out.
Do you have any advice on how to use Antigravity more effectively? I'm open to trying it again.
It simply didn’t work. I complained about it and was eventually hauled into a room with some MS PMs who told me in no uncertain terms that indeed, Biztalk didn’t work and it was essentially garbage that no one, including us, should ever use. Just pretend you’re doing something and when the week is up, go home. Tell everyone you’ve integrated with Biztalk. It won’t matter.
AI really should be a freaking feature, not the identity of their products. What MS is doing now is like renaming Photoshop to Photoshop Neural Filter.
If a large company has bought into "Co-Pilot", they want it all right? Or not, but let's not make carving anything out easy.
Just a thought.
Completely impossible. The search is bad to begin with, but it explicitly ignores anything that isn't a-9.
Can anyone else share what their workflow with CC looks like? Even if I never end up switching I'd like to at least feel like I gave it a good shot and made a choice based on that, but right now I just feel like I'm doing something wrong.
Both Claude and ChatGPT were unbearable, not primarily because of a lack of technical ability but because of their conversational tone. Obviously, it's pointless to take things personally with LLMs, but they were so passive-aggressive and sometimes maliciously compliant that they started to get to me even though I was conscious of it and know very well how LLMs work. If they had been new hires, I would have fired both of them within 2 weeks. In contrast, Gemini Pro just "talks" normally, task-oriented and brief. It also doesn't reply with files that contain changes in completely unrelated places (including changing comments somewhere), which is the worst thing such a tool could possibly do.
Edit: Reading some other comments here, I have to add that the 1., 2., 3. numbering of comments can be annoying. It's helpful for answers but should be an option/parameterization.
It reminds me of this [0] Dilbert comic, but heh.
This absolutely sucks, especially since tool calling uses tokens really really fast sometimes. Feels like a not-so-gentle nudge to using their 'official' tooling (read: vscode); even though there was a recent announcement about how GHCP works with opencode: https://github.blog/changelog/2026-01-16-github-copilot-now-...
No mention of it being severely gimped by the context limit in that press release, of course (tbf, why would they lol).
However, if you go back to aider, 128K tokens is a lot, same with web chat... not a total killer, but I wouldn't spend my money on that particular service with there being better options!
In essence, this is pretty much how you'd run a group of juniors - you'd sit on Slack and Jira divvying up work and doing code reviews.
So Microsoft isn't bringing copilot to all these applications? It's just bringing a copilot label to them? So glad I don't use this garbage at home.
I got really good at reviewing code efficiently from my time at Google and others, which helps a lot. I'm sure my personal career experience influences a lot how I'm using it.
FWIW, I use Codex CLI, but I assume my flow would be the same with Claude Code.
For some reason, people have great cognitive difficulty with defensive trust. Charlie Brown, Sally.
PS: When I say party trick I don't deny it has its uses but it's currently used like the jesus-AI that can do anything.
It's funny because that's basically the approach I take in GH Copilot. I first work with it to create a plan broken up into small steps, save that to an md file, and then have it go one step at a time, reviewing the changes as it goes or just when it's done.
I understand that you're using emacs to keep an eye on the code as it goes, so maybe what I wasn't grokking was that people were using terminal-based code editors to see the changes it was making. I assumed most people were just letting it do its thing and then trying to review everything at the end, but that felt like an anti-pattern given how much we (the dev community) push for small PRs over gigantic 5k-line PRs.
(Also a signal for why devs should not bother with their shoddy Xcode AI work - Apple devs are not using it)
I've been experimenting with small local models and the types of prompts you use with these are very different than the ones you use with Claude Code. It seems less different between Claude, Codex, and Gemini but there are differences.
It's hard to articulate those differences but I think that I kind of get in a groove after using models for a while.
MS's bottom line doesn't depend on how happy users are with W11, especially not power users like ourselves. W11 is just a means of selling subscriptions (office, ai, etc). The question isn't 'are users happy' it's 'will OEMs and business continue to push it?'. The answer to that is almost certainly yes. OEMs aren't going to be selling most pcs with ubuntu included any time soon. Businesses are not going to support libreoffice when MS office is the established standard.
Maybe apple could make inroads here, but they don't seem willing to give up their profit margins on overpriced hardware, and I don't think I've ever seen them release anything 'office' related that was anywhere near feature parity with MSO, and especially not cross platform.
So once we have signoff then my counterpart in Sharepoint/M365 land gets his "Copilot" for Office, while my reporting and analytics group gets "Copilot" for Power BI, while my coding team gets "Copilot" for llm assisted development in GitHub.
In the meantime everybody just plugs everything into ChatGPT and everybody pretends it isn't happening. It's not unlawful if the lawyers can't see it!
I worked on a project with some Microsoft engineers to create a chatbot plugin for Salesforce, using Microsoft Power Virtual Agent, and the communication tool they used was Slack, not Teams. And I was obligated to use Teams because of the consulting company I worked for at the time.
And the version control they used at the time was, I think, SVN, not TFS.
Put together a nice and clean price list for your friends in the purchasing department.
I dare you.
Marketing needs as much supervision as a toddler in a crystal store.
> In the meantime everybody just plugs everything into ChatGPT
I believe you meant "everyone plugs everything into ChatGPT for Co-Pilot"! A statement with its own useful ambiguities.
It is comical, but I can now make a serious addition to Sun Tzu's maxims.
“All warfare is based on deception.”
“To subdue the enemy without fighting is the acme of skill.”
"Approval is best co-opted with a polysemous brand envelope."
This often happens because the people inside are incentivized to build their own empire.
If someone comes in and wants to get promoted/become an exec, there's a ceiling if they work under an existing umbrella, plus the politics of introducing a feature that requires dealing with an existing org.
So they build something new. And the next person does the same. And so you have 365, One, Live, .Net, etc
It fails to be pro-active. "Why didn't you run the tests you created?"
I want it to tell me if the implementation is working.
Feels lazy. And it hallucinates solutions frequently.
It pales in comparison to CC/Opus.
It won't make any changes until a detailed plan is generated and approved.
I think I could clean up their existing mess if they want help.
Jedd outlines my credentials well here https://news.ycombinator.com/item?id=17522649#17522861
Gemini CLI (not the model) is trash, I wish it weren't so, but I only have to try to use for a short time before I give up. It regularly retains stale file contents (even after re-prompting), constantly crashes, does not show diffs properly, etc, etc.
I recently tried OpenCode. It's got a bit better, but I still have all kind of API errors with the models. I also have no way to scroll back properly to earlier commands. Its edit acceptance and permissions interface is wonky.
And so on. It's amazing how Claude Code just nails the agentic CLI experience from the little things to the big.
Advice to agentic CLI developers: Just copy Claude Code's UX exactly; that's your starting point. Then add stuff that makes the user's life even easier and more productive. There's a ton of improvements I'd like to see in Claude Code:
- I frequently use multiple sessions. It's kinda hard to remember the context when I come back to a tab. Figure out a way to make it immediately obvious.
- Allow me to burn tokens to keep enough persistent context. Make the agent actually read my AGENTS.md before every response. Ensure the agent gets closer and closer to matching the way I'd like it to work as the session progresses (and even across sessions).
- Add a Replace tool, like the Read tool, that is reliable and prevents the agent from having to make changes manually one by one, or worse using sed (I've banned my agents from using sed because of the havoc they cause with it).
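Something like this is what I have in mind - a rough sketch with hypothetical names (this isn't an existing Claude Code tool): an exact-match replace that refuses to act unless the target string is unique, which is exactly the failure mode that makes sed so dangerous in agent hands.

    from pathlib import Path

    def replace_in_file(path: str, old: str, new: str) -> str:
        """Hypothetical 'Replace' tool: exact-string replacement that fails loudly
        instead of guessing, so the agent can't silently mangle a file."""
        text = Path(path).read_text()
        count = text.count(old)
        if count == 0:
            return f"ERROR: target string not found in {path}; re-Read the file first"
        if count > 1:
            return f"ERROR: target appears {count} times in {path}; include more surrounding context"
        Path(path).write_text(text.replace(old, new, 1))
        return f"OK: replaced 1 occurrence in {path}"

Paired with the existing Read tool, the agent could make an edit in one reliable call instead of a chain of fragile manual patches or sed one-liners.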
It will read the existing plugins, understand the code style/structure/how they integrate, then create a plugin called "sample" AND code that is usually what you wanted without telling it specifically, and write 10 tests for it.
In those cases it's magic. In large codebases, asking it to add something into existing code or modify a behavior I've found it to be...less useful at.
Also, a great use is Microsoft Forms; I was surprised by the AI features. At first I just used it to get some qualitative feedback, but ended up using Copilot to enter questions Claude helped me create, and it converted them into the appropriate forms for my surveys!
Objectives -> Claude -> Surveys (markdown) -> Copilot -> MS Forms -> Emailed.
Insights and analysis can use copilot too.
Main thing to remember is the models behind the scenes will change and evolve, Copilot is the branding. In fact, we can expect most companies will use multiple AI solutions/pipelines moving forward.
Assuming the leak was accurate, almost doubling usage in 4 months for an enterprise product seems like pretty fast growth?
Its growth trajectory seems to be on par with Teams so far, another enterprise product bundled with their M365 suite, though to be fair Teams was bundled for free: https://www.demandsage.com/microsoft-teams-statistics/
Using Gemini 2.5 or 3, flash.
- Adobe Neural Filter Acrobat
- Adobe Neural Filter App (previously photoshop)
- Adobe Neural Filter Illustrator
- Adobe 720 Neural Filter app
- etc.
By the way, why is app lowercase in "the Microsoft 365 Copilot app"? Is it not part of the trademark, or could even they not deal with how confusing that was?

Claude doesn't require paying payroll tax, health insurance, unemployment, or family leave.
I would love for this to be true. But another scenario that could play out is that this process accelerates software bloat that was already happening with human coded software. Notepad will be a 300GB executable in 2035.
https://www.cbsnews.com/news/google-voice-assistant-lawsuit-...
https://www.cbsnews.com/news/lopez-voice-assistant-payout-se...
With humans you can categorically say ‘this guy lies in his comments and copy pastes bullshit everywhere’ and treat them consistently from there out. An LLM is guessing at everything all the time. Sometimes it’s copying flawless next-level code from Hacker News readers, sometimes it’s sabotaging your build by making unit tests forever green. Eternal vigilance is the opposite of how I think of development.
I hate how I’ve had a web site with my name on it since 2008 and when you google my name verbatim it says “did you mean Tyler Childers”
Such shade from the algorithm, I get it, I get it, software is lamer than music.
I’ll agree with you the moment Microsoft does that. But they haven’t done it. And again, I’m not their champion, I’m actively migrating away from Microsoft products. I just don’t think this type of philosophy is helpful. It’s basically cynicism for cynicism’s sake.
I have 2TB with OneDrive too via a Family Office account and I've got no good reason to have the large gapps space.
A ChatGPT account and pay for two Claude accounts.
Netflix, Disney+, Prime.
How did this happen to me?
Perhaps I should sign up to one of those companies that will help me close accounts I keep seeing advertised on YouTube?
Consumers _do not care_ if it is the firmware or Windows.
Dell was one of the earlier brands, and biggest, to suffer these standby problems. Dell has blamed MS and MS has blamed Dell, and neither has been in any hurry to resolve the issues.
I still can't put my laptop in my backpack without shutting it down, and as a hybrid worker, having to tear down and spin up my application context every other day is not productive.
Maybe the most tragic part is that this drags down Linux and plagues it with these hardware rooted sleep issues too.
Also, use the Superpowers plugin for Claude. That really helps for larger tasks but it can over do it hah. It's amusing to watch the code reviewer, implementor, and tester fight and go back and forth over something that doesn't even really matter.
I'm not sure if it's named Microsoft 365 Copilot nowadays, or if that's an optional AI addon? I thought it was renamed once more, but they themselves claim simply "Microsoft 365" (in a few various tiers) sans-Copilot. https://www.microsoft.com/microsoft-365/buy/compare-all-micr...
not sure what you mean, I have vscode open and make code changes in between claude doing its thing. I have had it revert my changes once which was amusing. Not sure why it did that, I've also seen it make the same mistake twice after being told not to.
I'm also mostly on Gemini 3 Flash. Not because I've compared them all and I found it the best bar none, but because it fulfills my needs and then some, and Google has a surprisingly little noted family plan for it. Unlike OpenAI, unlike Anthropic. IIRC it's something like 5 shared Gemini Pro subs for the price of 1. Even being just a couple sharing it, it's a fantastic deal. My wife uses it during studies, I professionally with coding and I've never run into limits.
https://www.wsj.com/tech/ai/the-100-billion-megadeal-between...
Everyone I know who uses AI day-to-day is just using Copilot to mostly do things like add a transition animation to a PowerPoint slide or format a Word document to look nice. The only problem these LLM products seem to solve is giving normal people an easy way to interact with terrible software processes and GUIs. And a better solution to that problem would be for developers to actually observe how the average user interacts with both a computer and their program in particular.
It is a coding everything: autocomplete, ask, edit files, and an agent (Claude Code-like).
Then there's DirectX and its subs - though Direct3D had more room for expanded feature set compared to DXSound or DXInput so now they're up to D3D v12.
And this will cause what I'm talking about -- When nobody can afford memory because it's all going into the ocean-boiling datacenters, all of a sudden someone selling a program that fits into RAM will have a very attractive product
Part of me wonders if Microsoft knew it would appeal to governments.
https://arstechnica.com/tech-policy/2025/12/uk-to-encourage-...
Of course then someone is just going to pregenerate a random number lookup table and get a few gigs of 'value' from pure garbage...
Then they took their eyes off the ball. Whether it was protecting the Windows fort (why create an app that has all the functionality of an OS and give it away for free - mostly on Windows, some Mac versions, but no Linux support - when people are paying for Windows) or just diverting the IE devs to some other "hot" product, browser progress stagnated, even with XMLHttpRequest.
It's also an LLM chat UI, I don't know if it's because of my work but it lets me select models from all of the major players (GPT, Claude, Gemini)
As someone that was there, we saved the Xbox brand by bullying Microsoft out of normalizing spying on kids and their whole families.
But OpenAI is still innovating with new subcategories, and even in cases where it did not innovate (Claude Code came first and OpenAI responded with Codex), it outdoes its competitors. Codex is widely preferred by the most popular vibecode devs, notably Moltbook's dev, but also Jess Fraz.
In terms of pricing, OAI holds by far the most expensive product, so it's still positioned as a quality option. To give an example, most providers have a three-tier price for API calls:
Anthropic has $1/$3/$5 (per million output tokens), Gemini has $3/$12 (two tiers), OpenAI has $2/$14/$168.
So the competitors are mainly competing on price in the API category.
To give another datapoint, Google just released multimodal (image input) models like 1 or 2 months ago. This has been in ChatGPT for almost a year now
The only danger is every once in a while one of those little footnotes becomes large enough to be a problem and you lose the market of those who do matter as well. While there are many obvious examples of where that happened, there are also a lot of cases where it didn't.
(A) it doesn't align with some important person's vision, and that person is incentivized to have their finger on whatever change comes about
(B) it might step on a lot of adjacent stakeholders, and the employee's stakeholder may be risk averse and want to play nice
(C) higher-up stakeholders fundamentally don't understand the domain they're leading
(D) the creators don't want to fight an uphill battle for their idea to win.
We're now on the back end of that, where Microsoft must again make products with independent substance, but are instead drowning in their own infrastructural muck.
Windows 11 falling apart after AI adoption tells you their AI vibe coding is not going as planned.
If you saw their latest report claiming to focus on fixing trust in Windows, it is a little too late; even newbies have moved to Linux, and with AMD driver support, gaming is no longer an excuse.
System/360, OS/2, DB2, MQ Series, PC.
It is like IBM just refused to entertain the idea of having competitors; why should they name a database by any other name than DB?
Also, it is possibly the worst console name of all time.
I don't even know what Xbox is now, is it a service, is it a console, I'm not even joking really.
Also visual studio code Vs full fat visual studio. Thanks Microsoft you just made it more difficult to web search both products.
Full fat .Net Vs dotnet core Vs standard or is that .net.
2. Settlements are just that: settlements. You can be sued frivolously and still decide to settle because it’s cheaper/less risky.
https://www.bcs.org/articles-opinion-and-research/crowdstrik...
2. Settlements also avoid discovery because the impact is likely way worse than checks notes less than one day of profits per company, respectively.
"Oh you mean the original one?"
No the one that came after the 360.
"The third one?"
No that was the second one, the One was the third.
"OK what are they on now?"
The Series series.
"The Series series?"
Yeah the X and S. Don't confuse that with the Xbox One X or S, or the 360 S.
"Right but what's the difference?"
The X is better than the S because X is a bigger letter. But they run the same games, but they're different. They're the same though.
All providers are opt-out. The moat is the data, don't pretend like you don't know.
They probably should have called the WiiU the Super Wii or Wii 2 or something, but on the whole they've got a mostly coherent naming convention.
I think in the end it's branding. They want people to think "Copilot = AI" but the experience is anywhere from fairly effective to absolute trash. And the most visible applications are absolute trash. It really says something when Ethan Mollick is out there demonstrating that OpenAI is more effective at working with Excel than the built in AI.
There was an article posted here yesterday that said "MS has a lot to answer for with Copilot", and that was the point: MS destroyed their AI brand with this strategy.
We ran into this building a password automation tool (thepassword.app). The solution: the AI orchestrates browser navigation, but actual credential values are injected locally and never enter the model's reasoning loop. Prompt injection can't exfiltrate what's not in the context.
As these tools move into enterprise settings, I expect we'll see more architectural patterns emerge for keeping sensitive data out of agentic workflows entirely.
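A minimal sketch of that pattern in Python, with entirely hypothetical names (the actual thepassword.app internals aren't public): the model only ever plans with opaque placeholders, and the local runner swaps in real secrets at the moment a browser action executes, so nothing sensitive ever enters the context window.

    import os
    import re

    # Hypothetical placeholder syntax the planner is allowed to emit.
    SECRET_PATTERN = re.compile(r"\{\{SECRET:([A-Z0-9_]+)\}\}")

    def resolve_secrets(value: str) -> str:
        """Swap placeholders for real values from local storage (here: env vars).
        The resolved string is used locally and never fed back to the model."""
        return SECRET_PATTERN.sub(lambda m: os.environ.get(m.group(1), ""), value)

    def run_browser_action(action: dict) -> str:
        """Execute one model-planned step locally (toy executor for illustration)."""
        if action["type"] == "fill":
            real_value = resolve_secrets(action["value"])   # happens outside the LLM loop
            # browser.fill(action["selector"], real_value)  # your automation library here
            return f"filled {action['selector']}"           # report success, never the value
        raise ValueError(f"unknown action type: {action['type']}")

    # What the model proposed -- placeholders only, so it is safe to log
    # or feed back into the agent's context:
    plan = [
        {"type": "fill", "selector": "#username", "value": "{{SECRET:SITE_USER}}"},
        {"type": "fill", "selector": "#password", "value": "{{SECRET:SITE_PASSWORD}}"},
    ]

    for step in plan:
        # Only the sanitized observation returns to the agent, so a prompt
        # injection has nothing sensitive to exfiltrate.
        print(run_browser_action(step))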
nes:snes = 6502
n64 = mips
gamecube:wii:wiiu = powerpc
switch:switch2 = arm

The tools it's built with seem to suck, but it can cook with Serena MCP.
The flash models seem to get better results than the pro ones as far as I've seen, but there's not a big difference.
The only thing until now I've found using the NPU are the built in blur, auto frame and eye focus modes for the webcam.
Searching the store or a company portal for one of these rebranded apps returns dozens of hits because 'windows', 'copilot', '365' and 'app' are all common words in most application descriptions.
Notable inflection point right around the time unlimited data became an afterthought and every piece of software decided it “needs” to spy on—— I mean needs to offer Fulfilling Connected Experiences at all times.
The TLDR: The $20/40m cost is more reflective of what inference actually costs, including the amortised cost of the Capex, together with the Opex.
The Long Read:
I think the reason is that Anthropic is attempting to run inference at a profit and Google isn't.
Another reason could be that they don't own their cost centers (GPUs are from Nvidia, Cloud instances are from AWS, data centers from AWS, etc); they own only the model but rent everything else needed for inference so pay a margin for all those rented cost centers.
Google owns their entire vertical (GPUs are google-made, Cloud instances and datacenters are Google-owned, etc) and can apply vertical cost optimisations, so their final cost of inference is going to be much cheaper anyway even if they were not subsidising inference with their profits from unrelated business units.
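To make the margin-stacking point concrete, here is a purely illustrative back-of-envelope sketch; every number in it is an assumption made up for the example, not a real figure from Anthropic, Google, or anyone else.

    # Purely illustrative, assumed numbers: how renting every layer stacks
    # margins on top of the raw cost of inference.

    raw_cost = 1.00                 # assumed raw compute cost per 1M output tokens

    # A vertically integrated provider pays roughly the raw cost.
    vertical_cost = raw_cost

    # A provider renting each layer pays a margin at every step.
    rented_cost = raw_cost
    for layer_margin in (0.30, 0.25, 0.15):   # e.g. GPU vendor, cloud host, data center (assumed)
        rented_cost *= 1 + layer_margin

    print(f"vertically integrated cost: ${vertical_cost:.2f} per 1M tokens")
    print(f"renting-everything cost:    ${rented_cost:.2f} per 1M tokens")
    # With these made-up margins the rented stack costs roughly 1.9x the
    # vertical one before either company adds its own margin on top.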
It's pretty much trial and error.
I tried using ChatGPT via the webchat interface on Sunday and it was so terse and to the point that it was basically useless. I had to repeatedly prompt for all the hidden details, to the point that I basically gave up and used a different webchat LLM (I regularly switch between ChatGPT, Claude, Grok and Gemini).
When I used it a month ago, it would point out potential footguns, flaws, etc. I suppose it just reinforces the point that "experience" gained using LLMs is mostly pointless, your experience gets invalidated the minute a model changes, or a system prompt changes, etc.
For most purposes, they are all mostly the same i.e. produce output so similar you won't notice a difference.
TBH, that isn't sustainable. Skills atrophy. At some point they are going to take the code blindly.
Considering what they have said in the past about agentic code changes, they are already doing just that - blindly approving code from the agent. I say this because when I last read what one of their engineers on CC tweeted/posted/whatever, I thought to myself "No human can review that many lines of code per month"[1].
---------
[1] IIRC, it was something stupid like 30kLoc reviewed in a month by a single engineer.
What? It's not even the best experience. The best UX is done by Crush, and they nail the experience, but it's slightly worse because they made it work for all models.
I keep telling my friends that while experienced devs feel extremely productive, the newer ones will likely not develop the skills needed to work with the finer aspects of code.
This might work for a while, but you do a year or two of this, and then as little as a small Python script will feel like yak shaving.
I asked it to create a slide deck for me, within Slides, based on a block of notes I wrote. It wouldn't do it. The chat assistant at gemini.google.com wouldn't do it either. They told me how to do it step by step though...which I knew how to do already. Useless.
I also tried the `AI()` Sheets function to fill a range in based on some other data in the sheet. It doesn't accept other ranges, even if you use the &CELL_REF& notation.
2. I could sue you today for, well, pretty much anything. I don’t have a good case but I can file that lawsuit right now. Would you rather take my settlement offer of $50 or pay a lawyer to go to trial and potentially spend the next months/years of your life in court? You can’t make a blanket statement to say that every company that decides to settle has something to hide, or, similarly, that everyone who exercises their 4th amendment rights has something to hide. I will also point out that companies that make lots of money are huge lawsuit targets, e.g., patent trolls sue large corporations all the time.
Don’t forget we are here talking about a fully optional feature that isn’t even turned on by default. I’m not telling you to love Windows Recall, turn it off or switch to Linux if you don’t love it. My only point is that it’s gotten a lot of incorrect news and social media coverage that is factually untrue and designed to get clicks and reinforce feelings.
Having a tight feedback loop for agents is critical for getting good output.
I was trying to just get an Excel function dialed in with some IFs and formatting weirdness. The licensed Microsoft 365 Copilot built into Excel tried several times and failed miserably. One screenshot to ChatGPT (5.1?) and it was one-shot.
I'm not even sure it's the same models any more. It feels years behind. Maybe they limit it or cripple it somehow.
2. I’m very much already on Linux, most of my threat model is: “if it’s technically possible, it’s probable” and I adjust my technology choices accordingly
I'm just saying a max cap of $60 for Apple's settlement sets a precedent for future mass-surveillance wrist slaps, and maybe it would be worth the discovery process to uncover the actual global impact.
We got the Steam Controller and the new... Steam Controller.
We also got the Steam machine, as well as the new Steam machine.
Lol
And on what grounds do you make this assumption?
- https://archivesit.org.uk/interviews/simon-peyton-jones/
Otherwise your position is that "blue-sky research" doesn't exist (it does) or that big companies don't fund it (they do). In particular, the LinkedIn post in question said nothing about "Windows"; that is something the internet has hallucinated to maximise ragebait.
Wii is a GameCube with a funny controller. Or, Wii is a TV-only olde Switch.
I appreciate that it has its own name due to being a transitional experience.