zlacker

GitHub Copilot Coding Agent

submitted by net01+(OP) on 2025-05-19 16:17:56 | 564 points 342 comments
[view article] [source] [go to bottom]

◧◩
22. twodav+U8[view] [source] [discussion] 2025-05-19 17:06:06
>>taurat+O6
I feel like I saw a quote recently that said 20-30% of MS code is generated in some way. [0]

In any case, I think this is the best use case for AI in programming: as a force multiplier for the developer. It's in the best interest of both AI and humanity for AI to avoid diminishing the creativity, agency and critical thinking skills of its human operators. AI should be task-oriented, but high-level decision-making and planning should always be a human task.

So I think our use of AI for programming should remain heavily human-driven for the long term. Ultimately, its use should involve enriching humans’ capabilities over churning out features for profit, though there are obvious limits to that.

[0] https://www.cnbc.com/2025/04/29/satya-nadella-says-as-much-a...

◧◩
28. Beetle+bc[view] [source] [discussion] 2025-05-19 17:20:18
>>Scene_+V4
> I also ended up blowing through $15 of LLM tokens in a single evening.

Consider using Aider, and aggressively managing the context (via /add, /drop and /clear).

https://aider.chat/
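For reference, a context-trimming pass in an aider session looks roughly like this (a sketch of the chat commands, with hypothetical file names):

```
> /add src/parser.py        # bring only the file being changed into context
> /drop src/legacy.py       # drop a file that's no longer relevant
> /clear                    # wipe chat history between unrelated tasks
```

Keeping only the files you're actively editing in context is what keeps the token spend down.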

31. asadm+id[view] [source] 2025-05-19 17:25:49
>>net01+(OP)
In the early days of LLMs, I had developed an "agent" using a GitHub Actions + Issues workflow[1], similar to how this works. It was very limited but kinda worked, i.e. you assigned it a bug and it fired an action, did some architect/editing tasks, validated changes, and finally sent a PR.

Good to see an official way of doing this.

1. https://github.com/asadm/chota

◧◩◪
35. danena+Rf[view] [source] [discussion] 2025-05-19 17:36:14
>>Beetle+bc
My tool Plandex[1] allows you to switch between automatic and manual context management. It can be useful to begin a task with automatic context while scoping it out and making the high level plan, then switch to the more 'aider-style' manual context management once the relevant files are clearly established.

1 - https://github.com/plandex-ai/plandex

Also, a bit more on auto vs. manual context management in the docs: https://docs.plandex.ai/core-concepts/context-management

51. sync+mm[view] [source] 2025-05-19 18:07:43
>>net01+(OP)
Anthropic just announced the same thing for Claude Code, same day: https://docs.anthropic.com/en/docs/claude-code/github-action...
◧◩
52. timrog+nm[view] [source] [discussion] 2025-05-19 18:07:44
>>OutOfH+V6
I think you're probably thinking of Copilot Workspace (<https://github.blog/news-insights/product-news/github-copilo...>).

Copilot Workspace could take a task, implement it and create a PR - but it had a linear, highly structured flow, and wasn't deeply integrated into the GitHub tools that developers already use like issues and PRs.

With Copilot coding agent, we're taking all of the great work on Copilot Workspace, and all the learnings and feedback from that project, and integrating it more deeply into GitHub and really leveraging the capabilities of 2025's models, which allow the agent to be more fluid, asynchronous and autonomous.

(Source: I'm the product lead for Copilot coding agent.)

◧◩
55. rvz+5n[view] [source] [discussion] 2025-05-19 18:11:24
>>theusu+Ub
I think we expected disappointment with this one (I expected it, at least)[0].

But the upgraded Copilot was just a response to Cursor and Windsurf.

We'll see.

[0] >>43904611

◧◩◪◨
65. 9wzYQb+Yq[view] [source] [discussion] 2025-05-19 18:30:12
>>aaroni+xn
Check out this idea: https://fairwitness.bot (>>44030394 ).

The entire website was created by Claude Sonnet through Windsurf Cascade, but with the “Fair Witness” prompt embedded in the global rules.

If you regularly guide the LLM to "consult a user experience designer", "adopt the multiple perspectives of a marketing agency", etc., it will make rather decent suggestions.

I’ve been having pretty good success with this approach, granted mostly at the scale of starting the process with “build me a small educational website to convey this concept”.

◧◩◪
74. nodja+mv[view] [source] [discussion] 2025-05-19 18:50:20
>>brushf+bm
Here's a video of what it looks like with sonnet 3.7.

https://streamable.com/rqlr84

The claude and gemini models tend to be the slowest (yes, including flash). 4o is currently the fastest but still not great.

◧◩
75. qwerto+Vv[view] [source] [discussion] 2025-05-19 18:53:33
>>OutOfH+3o
I assume you can select whichever one you want (GPT-4o, o3-mini, Claude 3.5, 3.7, 3.7 thinking, Gemini 2.0 Flash, GPT-4.1, and the previews o1, Gemini 2.5 Pro and o4-mini), subject to the pricing multipliers they announced recently [0].

Edit: From the TFA: Using the agent consumes GitHub Actions minutes and Copilot premium requests, starting from entitlements included with your plan.

[0] https://docs.github.com/en/copilot/managing-copilot/monitori...

◧◩◪
77. nodja+ww[view] [source] [discussion] 2025-05-19 18:57:26
>>Beetle+hs
2m27s for a partial response editing a 178 line file (it failed with an error, which seems to happen a lot with claude, but that's another issue).

https://streamable.com/rqlr84

◧◩◪
79. Nitpic+wx[view] [source] [discussion] 2025-05-19 19:01:40
>>timrog+Oj
> In the repo where we're building the agent, the agent itself is actually the #5 contributor - so we really are using Copilot coding agent to build Copilot coding agent ;)

Really cool, thanks for sharing! Would you perhaps consider implementing something like these stats that aider keeps on "aider writing itself"? - https://aider.chat/HISTORY.html

◧◩◪◨
84. setham+uC[view] [source] [discussion] 2025-05-19 19:29:01
>>overfe+Bt
textbook survivorship bias https://en.wikipedia.org/wiki/Survivorship_bias

Every bullet hole in that plane is one of the 1k PRs contributed by Copilot. The missing dots, and the whole missing planes, are unaccounted for, i.e. the "AI ruined my morning" cases.

88. azhenl+pF[view] [source] 2025-05-19 19:44:18
>>net01+(OP)
Looks like their GitHub Copilot Workspace.

https://githubnext.com/projects/copilot-workspace

90. net01+rG[view] [source] 2025-05-19 19:49:46
>>net01+(OP)
on another note, https://github.com/github/dmca/pull/17700 GitHub's automated, auto-merged DMCA sync PRs get automated Copilot reviews for every single one.

AMAZING

◧◩
105. abraha+aQ[view] [source] [discussion] 2025-05-19 20:45:58
>>qwerto+Qu
Gemini has some GitHub integrations

https://developers.google.com/gemini-code-assist/docs/review...

◧◩
106. candid+fQ[view] [source] [discussion] 2025-05-19 20:46:21
>>qwerto+Qu
Google Cloud has a pre-GA product called "Secure Source Manager" that looks like a fork of Gitea: https://cloud.google.com/secure-source-manager/docs/overview

Definitely not Google Code, but better than Cloud Source Repositories.

◧◩
126. Philip+u01[view] [source] [discussion] 2025-05-19 21:53:22
>>shwouc+5q
This is something I've noticed as well with different AIs. They seem to disproportionately trust data read from the web. For example, I asked one to check whether some obvious phishing pages were scams, and multiple times I got just a summary of the content, as if it were authoritative. Several times I've gotten some random Chinese repo with 2 stars presented as if it were the industry-standard solution, since that's what its README said.

On an unrelated note, it also suggested I use the "Strobe" protocol for encryption and sent me to https://strobe.cool which is ironic considering that page is all about making one hallucinate.

◧◩
128. Yenrab+H11[view] [source] [discussion] 2025-05-19 22:02:00
>>sync+mm
And Google's version: https://jules.google
◧◩◪◨⬒
132. burnt-+X21[view] [source] [discussion] 2025-05-19 22:09:22
>>kenjac+VY
> I don't think it was unreasonable to be very skeptical at the time.

Well, that's rationalizing after the fact. I saw advances like meta sentiment analysis on medical papers back in the '00s. Deep learning was clearly just the beginning. [0]

> Who would've thought (except you)

You're othering me, which is rude, and you're speaking as though you speak for an entire group of people. Seems kind of arrogant.

0. (2014) https://www.ted.com/talks/jeremy_howard_the_wonderful_and_te...

◧◩◪
134. imiric+541[view] [source] [discussion] 2025-05-19 22:16:01
>>SkyPun+8I
The trick for greenfield projects is to use it to help you design detailed specs and a tentative implementation plan. Just bounce some ideas off of it, as with a somewhat smarter rubber duck, and hone the design until you arrive at something you're happy with. Then feed the detailed implementation plan step by step to another model or session.

This is a popular workflow I first read about here[1].

This has been the most useful use case for LLMs for me. Actually getting them to implement the spec correctly is the hard part, and you'll have to take the reins and course correct often.

[1]: https://harper.blog/2025/02/16/my-llm-codegen-workflow-atm/
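The step-by-step handoff can even be mechanized. A minimal sketch (the numbered-list plan format and the `run_step` callback are my assumptions for illustration, not part of the linked workflow):

```python
def parse_plan(plan_md: str) -> list[str]:
    """Split a markdown implementation plan into its numbered steps."""
    steps = []
    for line in plan_md.splitlines():
        line = line.strip()
        # treat "1. ..." style list items as individual steps
        if line[:1].isdigit() and ". " in line:
            steps.append(line.split(". ", 1)[1])
    return steps

def drive(plan_md: str, run_step) -> None:
    """Feed each plan step to a model session one prompt at a time."""
    for i, step in enumerate(parse_plan(plan_md), 1):
        run_step(f"Step {i}: {step}")

plan = """\
1. Define the data model
2. Implement the REST endpoints
3. Add integration tests
"""
# stand-in for whatever sends a prompt to your model or session
drive(plan, print)
```

The point is simply that the model only ever sees one step at a time, which keeps each session's scope small.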

◧◩◪◨⬒⬓⬔⧯▣▦▧
143. skydha+K91[view] [source] [discussion] 2025-05-19 23:03:01
>>sourdo+c81
A much better option is to use docstrings[0] and a tool like doxygen to extract an API reference. Domain explanations and architecture can be compiled later from design and feature docs.

A good example of the kind of result is something like the Laravel documentation[1] and its associated API reference[2]. I don't believe AI can help with this.

[0]: https://en.wikipedia.org/wiki/Docstring

[1]: https://laravel.com/docs/12.x

[2]: https://api.laravel.com/docs/12.x/
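To make that concrete, here's a minimal sketch of the docstring approach in Python (the function and its body are made up for illustration); tools like doxygen, pydoc or Sphinx can then extract the structured text into an API reference:

```python
import inspect

def transfer(amount: float, source: str, dest: str) -> bool:
    """Move funds between accounts.

    Args:
        amount: Value to transfer, in the account currency.
        source: Identifier of the debited account.
        dest: Identifier of the credited account.

    Returns:
        True if the transfer was accepted.
    """
    return amount > 0  # placeholder body for illustration

# the docstring lives alongside the code and is machine-extractable
print(inspect.getdoc(transfer).splitlines()[0])  # -> Move funds between accounts.
```

Because the reference is generated from the source itself, it can't silently drift out of sync the way a hand-written doc file can.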

◧◩
148. manmal+Rb1[view] [source] [discussion] 2025-05-19 23:17:27
>>bionho+f41
Might have been the case, but no longer:

https://docs.github.com/en/copilot/managing-copilot/managing...

149. lofasz+Sb1[view] [source] 2025-05-19 23:17:35
>>net01+(OP)
This is quite alarming: https://www.cursor.com/security

And this one too: https://docs.github.com/en/site-policy/privacy-policies/gith...

◧◩
150. lacool+qc1[view] [source] [discussion] 2025-05-19 23:21:48
>>taurat+O6
They have released numbers, but I can't say whether they're for this specific product or something else. They are apparently having AI generate "30%" of their code.

https://techcrunch.com/2025/04/29/microsoft-ceo-says-up-to-3...

191. caseys+1u1[view] [source] 2025-05-20 02:16:22
>>net01+(OP)
So, fun thing.. LinkedIn doesn't use Copilot.

I recently created a course for LinkedIn Learning on using generative AI to create SDKs[0]. When I was onsite with them to record it, I found my GitHub Copilot calls kept failing.. with a network error. Wha?

Turns out that LinkedIn doesn't allow people onsite to connect to Copilot, so I had to put my MiFi in the window and connect to that to do my work. It's wild.

Btw, I love working with LinkedIn and have 15+ courses with them in the last decade. This is the only issue I've ever had.. but it was the least expected one.

0: https://www.linkedin.com/learning/build-with-ai-building-bet...

195. Arubis+Gv1[view] [source] 2025-05-20 02:37:39
>>net01+(OP)
I'm building RSOLV (https://rsolv.dev) as an alternative approach to GitHub's Copilot agent.

Our key differentiator is cross-platform support - we work with Jira, Linear, GitHub, and GitLab - rather than limiting teams to GitHub's ecosystem.

GitHub's approach is technically impressive, but our experience suggests organizations derive more value from targeted automation that integrates with existing workflows than from tooling that requires teams to change their processes. This is particularly relevant for regulated industries, where security considerations supersede feature breadth. Not everyone can just jump off of Jira at a moment's notice.

Curious about others' experiences with integrating AI into your platforms and tools. Has ecosystem lock-in affected your team's productivity or tool choices?

◧◩◪◨
234. rcarmo+qR1[view] [source] [discussion] 2025-05-20 07:00:06
>>gen220+l51
There is a better way than just READMEs: https://taoofmac.com/space/blog/2025/05/13/2230
◧◩◪◨⬒
235. rcarmo+vR1[view] [source] [discussion] 2025-05-20 07:00:43
>>mattlo+5J1
Comments are not easy for the LLM to refer to, ironically: https://taoofmac.com/space/blog/2025/05/13/2230
◧◩◪◨
236. rcarmo+yR1[view] [source] [discussion] 2025-05-20 07:01:37
>>imiric+541
Here’s my workflow, it takes that a few steps further: https://taoofmac.com/space/blog/2025/05/13/2230
◧◩
237. rcarmo+HR1[view] [source] [discussion] 2025-05-20 07:02:43
>>jagged+nq1
Be more methodical, it isn’t magic: https://taoofmac.com/space/blog/2025/05/13/2230
◧◩
238. rcarmo+ZR1[view] [source] [discussion] 2025-05-20 07:05:32
>>quanta+IK
Try doing https://taoofmac.com/space/blog/2025/05/13/2230, you'll have some fun.
◧◩◪◨
258. Thierr+kd2[view] [source] [discussion] 2025-05-20 10:35:27
>>timrog+3g1
The announcement https://github.blog/news-insights/product-news/github-copilo... seems to position GitHub Actions as a core part of the Copilot coding agent’s architecture. From what I understand in the documentation and your comment, GitHub Actions is triggered later in the flow, mainly for security reasons. Just to clarify, is GitHub Actions also used in the development environment of the agent, or only after the code is generated and pushed?
272. bencyo+jm2[view] [source] 2025-05-20 11:53:16
>>net01+(OP)
Some example PRs if people want to look:

https://github.com/dotnet/runtime/pull/115733 https://github.com/dotnet/runtime/pull/115732 https://github.com/dotnet/runtime/pull/115762

◧◩◪
282. _heimd+Jo2[view] [source] [discussion] 2025-05-20 12:13:05
>>codebo+vS1
That sounds reasonable enough, but the pace or end result is by no means guaranteed.

We have invested plenty of money and time into nuclear fusion with little progress. The list of key achievements from CERN[1] is also meager compared to the investment put in, especially if you consider their ultimate goal to be applying research to more than just theory.

[1] https://home.cern/about/key-achievements

◧◩◪
290. meindn+Mu2[view] [source] [discussion] 2025-05-20 12:54:37
>>Philip+u01
>On an unrelated note, it also suggested I use the "Strobe" protocol for encryption and sent me to https://strobe.cool which is ironic considering that page is all about making one hallucinate.

That's not hallucination. That's just an optical illusion.

◧◩◪◨⬒
309. jessma+TK2[view] [source] [discussion] 2025-05-20 14:24:17
>>rcarmo+yR1
This seems like a good flow! I end up adding a "spec" and "todo" file for each feature[1]. This allows me to flesh out some of the architectural/technical decisions in advance and keep the LLM on the rails when the context gets very long.

[1] https://notes.jessmart.in/My+Writings/Pair+Programming+with+...
