zlacker

[return to "Scaling long-running autonomous coding"]
1. Chipsh+jZ[view] [source] 2026-01-20 10:19:20
>>srames+(OP)
The more I think about LLMs the stranger it feels trying to grasp what they are. To me, when I'm working with them, they don't feel intelligence but rather an attempt at mimicking it. You can never trust, that the AI actually did something smart or dump. The judge always has to be you.

Its ability to pattern-match its way through a codebase is impressive until it isn't, and you always have to pull it back to reality when it goes astray.

Its ability to plan ahead is so limited, and its way of "remembering" is so basic. Every day is a bit like 50 First Dates.

Nonetheless, seeing what can be achieved with this pseudo-intelligence tool makes me feel a little in awe. It's the contrast between not being intelligent and still achieving clearly useful outcomes when steered correctly, and the feeling that we've only just started to understand how to interact with this alien.

◧◩
2. NiloCK+Yk1[view] [source] 2026-01-20 13:22:29
>>Chipsh+jZ
If you find yourself 50-first-dating your LLMs, it may be worth it to invest some energy into building up some better context indexing of both the codebase itself and of your roadmap.
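For the codebase half of that, even something crude helps: a generated one-line-per-file index the agent reads at session start. A minimal sketch of the idea (the `INDEX.md` name and the first-non-blank-line heuristic are my own assumptions, not any particular tool):

```python
import os

def build_index(root, out_path="INDEX.md", exts=(".py", ".md")):
    """Walk the repo and emit one summary line per file: its path plus its
    first non-blank line, so an agent can skim the tree in a few hundred tokens."""
    lines = ["# Codebase index", ""]
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories like .git so they don't pollute the index.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in sorted(filenames):
            if not name.endswith(exts):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                first = next((ln.strip() for ln in f if ln.strip()), "")
            rel = os.path.relpath(path, root)
            lines.append(f"- `{rel}`: {first[:80]}")
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return out_path
```

Regenerating it on every session (rather than hand-maintaining it) is what keeps it from going stale.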
◧◩◪
3. Chipsh+wy1[view] [source] 2026-01-20 14:53:51
>>NiloCK+Yk1
Yeah, I admit I'm probably not doing that quite optimally. I'm still just letting the LLM generate ephemeral .md files that I delete after a certain task is done.

The other day I found [beads](https://github.com/steveyegge/beads) and thought maybe that could be a good improvement over my current state.

But I'm quite hesitant, because I've also seen these AGENTS.md files go stale, and then there's the question of how much information is too much, especially with limited context windows.

Probably all things that could again just be solved by leveraging AI more and I'm just an LLM noob. :D

◧◩◪◨
4. theshr+tY4[view] [source] 2026-01-21 13:38:24
>>Chipsh+wy1
Beads is basically what GitHub Issues is, but local and built in a way that LLMs can easily use. I had a self-made solution that was close, but moved to Beads because it worked out of the box without disrupting my workflow much.

I've used it quite a bit, but now that Gas Town is a thing, Beads is getting a bit bloated and they're adding new features left and right; dunno why.

Might have to steal the best bits of Beads (the averaged-out CLI experience, and JSONL for storing issues in the repo plus a local SQLite cache) and build my own with none of the extra bells and whistles.
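The core of that design is small: the JSONL file in the repo is the source of truth, and SQLite is just a disposable local cache rebuilt from it. A minimal sketch (the schema and field names here are my assumptions, not Beads' actual format):

```python
import json
import sqlite3

def load_issues(jsonl_path):
    """Read one JSON object per line; the JSONL file in the repo is the source of truth."""
    with open(jsonl_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def build_cache(issues, db_path=":memory:"):
    """Rebuild a throwaway SQLite cache from the issue list for fast queries."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS issues (id TEXT PRIMARY KEY, title TEXT, status TEXT)"
    )
    db.executemany(
        "INSERT OR REPLACE INTO issues VALUES (:id, :title, :status)", issues
    )
    db.commit()
    return db

def open_issues(db):
    """Query the cache; the same question against raw JSONL would mean a full rescan."""
    rows = db.execute("SELECT id, title FROM issues WHERE status = 'open' ORDER BY id")
    return rows.fetchall()
```

Since the JSONL lives in the repo, issue history rides along with git, and the cache can be deleted and rebuilt at any time without losing anything.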

[go to top]