zlacker

[return to "Watching AI drive Microsoft employees insane"]
1. diggan+L1[view] [source] 2025-05-21 11:18:44
>>laiysb+(OP)
Interesting that every comment has the "Help improve Copilot by leaving feedback using the 👍 or 👎 buttons" suffix, yet none of the comments received any feedback, either positive or negative.

> This seems like it's fixing the symptom rather than the underlying issue?

This also matches my experience when you haven't set up a proper system prompt to address this for everything the LLM does. The funniest PRs are the ones that "resolve" test failures by removing or commenting out the test cases, or by changing the assertions. Google's and Microsoft's models seem more likely to do this than OpenAI's and Anthropic's models; I wonder if there is some difference in their internal processes that is leaking through here?
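
To be concrete about what that looks like, here's a made-up xUnit sketch (not from the linked PRs; PriceCalculator and the test names are invented):

    using Xunit;

    // Hypothetical code under test: the discount logic was accidentally dropped.
    public static class PriceCalculator
    {
        public static decimal Apply(decimal price) => price; // bug: should be price * 0.9m
    }

    public class PriceCalculatorTests
    {
        // Original test, which correctly fails against the buggy implementation:
        // [Fact]
        // public void TenPercentDiscountIsApplied()
        //     => Assert.Equal(90m, PriceCalculator.Apply(100m));

        // The "resolution": rewrite the assertion to match whatever the broken
        // code currently returns. CI goes green, the bug ships.
        [Fact]
        public void TenPercentDiscountIsApplied()
            => Assert.Equal(100m, PriceCalculator.Apply(100m));
    }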

The same PR as the quote above continues with 3 more messages before the human seemingly gives up:

> please take a look

> Your new tests aren't being run because the new file wasn't added to the csproj

> Your added tests are failing.

I can't imagine how the people who have to deal with this are feeling. It's like having a junior developer, except they don't even read what you're telling them and have zero agency to understand what they're actually doing.
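
For anyone wondering how new tests can silently not run: many .NET test projects list their source files explicitly in the .csproj rather than relying on default globbing (apparently including this one, given the reviewer's comment above), so a file that isn't referenced there never gets compiled at all. Roughly like this (invented file names):

    <!-- Hypothetical excerpt from a test .csproj that lists its sources explicitly.
         Without the second entry, the new file is never compiled, so its tests
         never run even though they show up in the PR diff. -->
    <ItemGroup>
      <Compile Include="ExistingTests.cs" />
      <Compile Include="NewFeatureTests.cs" />
    </ItemGroup>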

Another PR: https://github.com/dotnet/runtime/pull/115732/files

How are people reviewing that? 90% of the page height is taken up by "Check failure"; you can hardly see the code/diff at all. And as a cherry on top, the unit test has a comment that says "Test expressions mentioned in the issue". This whole thing would be fucking hilarious if I didn't feel so bad for the humans who are on the other side of this.

◧◩
2. surgic+68[view] [source] 2025-05-21 12:15:23
>>diggan+L1
> I can't imagine how the people who have to deal with this are feeling. It's like having a junior developer, except they don't even read what you're telling them and have zero agency to understand what they're actually doing.

That comparison is awful. I work with quite a few junior developers and they can be competent. They certainly don't make the silly mistakes that LLMs do, they don't need nearly as much handholding, and they tend to learn pretty quickly, so I don't have to keep repeating myself.

LLMs are decent code assistants when used with care and can do a lot of heavy lifting; they certainly speed me up when I have a clear picture of what I want to do, and they are good for bouncing ideas off when I am planning something. That said, I really don't see how they could meaningfully replace an intern, much less an actual developer.

◧◩◪
3. safety+la[view] [source] 2025-05-21 12:32:47
>>surgic+68
These GH interactions remind me of one of those offshore software outsourcing firms on Upwork or Freelancer.com that bid $3/hr on every project that gets posted. There's a PM who takes your task and hands it to a "developer" who has potentially never written a line of code, but maybe they've built a WordPress site by pointing and clicking in Elementor or something. After dozens of hours billed, you will, in fact, get code where the new file wasn't added to the csproj or something like that, and when you point it out, they will bill another 20 hours and send you a new copy of the project where the test still fails. It's exactly like this.

Nice to see that Microsoft has automated that; failure will be cheaper now.

◧◩◪◨
4. Abstra+Ud[view] [source] 2025-05-21 13:02:19
>>safety+la
> These GH interactions remind me of one of those offshore software outsourcing firms on Upwork or Freelancer.com that bid $3/hr on every project that gets posted

Those have long been the folks I've seen at the biggest risk of being replaced by AI: tasks that didn't rely on human interaction or much training, just brute force that can be done from anywhere.

And for them, that $3/hr was really good money.
