I’m surprised by this wording. I haven’t encountered anyone talking about AI preference before.
Can a trained LLM develop a preference for a given tool within some context and reliably report on that?
Is “what AI reports enjoying” aligned with the AI’s optimal performance?
This actually drastically improves any kind of writing by AI, even if it’s just for my own consumption.
I set up spec-kit first, then updated its templates to tell it to use beads to track features and all that instead of writing markdown files. If nothing else, this is a quality-of-life improvement for me, because recent LLMs seem to have an intense penchant for writing one or more markdown files per large task. Ending up with loads of markdown poop feels like the new `.DS_Store`, but harder to `.gitignore` because they'll name the files whatever floats their boat.
It's helpful for getting Claude Code to work on tasks that will span multiple context windows.
So is the issue the format, or the lack of structure that a local database can bring?
Giving them somewhere to jot down notes is a surprisingly effective way of working around this limitation.
The simplest version of this is to let them read and write files. I often tell my coding agents "append things you figure out to notes.md as you are working" - then in future sessions I can tell them to read or search that file.
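In practice the file-based version can be as small as this (the path and the entries are just illustrative, not a prescribed format):

```
# The agent appends one-line findings as it works (invented examples):
echo "- integration tests need POSTGRES_URL pointing at the docker container" >> notes.md
echo "- the flaky login test comes from a stale session fixture" >> notes.md

# A future session recovers that context by reading or searching the file:
grep -i "login" notes.md
```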
Beads is a much more structured way of achieving the same thing. I expect it works well partly because LLM training data makes them familiar with the issue/bug tracker style of working already.
This tool works by storing JSONL in a .beads/ folder. I wonder if it could work using a separate initially-empty "beads" branch for this data instead? That way the beads data (with its noisy commit history) could travel with the repository without adding a ton of noise to the main branch history.
The downside of that is that you wouldn't be able to branch the .beads/ data or keep it synchronized with main on a per-commit basis. I haven't figured out if that would break the system.
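One way to try it would be an orphan branch mounted as a worktree. This is just a sketch of the idea, not something beads supports out of the box as far as I know, and it assumes a clean checkout where `.beads/` doesn't already exist on main:

```
# Start an empty, history-free "beads" branch (older git: checkout --orphan)
git switch --orphan beads
git commit --allow-empty -m "init beads data branch"
git switch main

# Mount that branch as the .beads/ directory and keep main from tracking it
git worktree add .beads beads
echo ".beads/" >> .gitignore

# Commits made inside .beads/ land on the beads branch, not main,
# and can be pushed separately (git push origin beads)
(cd .beads && git add -A && git commit -m "update issue data")
```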
I even occasionally ask agents to move some learnings back to my CLAUDE.md or AGENTS.md file.
I'm curious whether complicating this behaviour with a database integration would further abstract the work in progress. Are we heading down a slippery slope?
https://github.com/steveyegge/beads/blob/main/.beads/issues....
Here's that file opened in Datasette Lite which makes it easier to read and adds filters for things like issue type and status:
https://lite.datasette.io/?json=https://github.com/steveyegg...
Set the TASKDATA environment variable to `./.task/`, then tell the agents to use the `task` CLI.
The benefit is that most LLMs already understand Taskwarrior; they've never heard of Beads.
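Roughly like this (the task descriptions are made up; the point is that the agent drives a CLI it already knows):

```
# Keep Taskwarrior's data inside the repo so it travels with the project
export TASKDATA=./.task/

# Tasks and their dependency graph live in ./.task/ instead of stray markdown
task add "sketch the sync protocol" project:agent
task add "write the onboarding doc" project:agent depends:1
task list        # what the agent reads to pick its next item
task 1 done      # close a task when the work lands
```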
The information being available is not the problem; the agent not realizing that it doesn’t have all the info is, though. If you put it behind an MCP server, it becomes a matter of ensuring the agent will invoke the MCP at the right moment, which is a whole challenge in itself.
Are there any coding agents out there that enable you to plug middleware in there? I’ve been thinking about MITM’ing Claude Code for this, but wouldn’t mind exploring alternative options.
I've been having a ton of success just from letting them use their default grep-style search tools.
I have a folder called ~/dev/ with several hundred git projects checked out, and I'll tell Claude Code things like "search in ~/dev/ for relevant examples and documentation".
(I'd actually classify what I'm doing there as RAG already.)
I like the idea of keeping potentially noisy changes out of my main branch history, since I look at that all the time.
It’s pretty common in a lot of agents, but I don’t see a way to do that with Claude Code.
[1] https://github.com/steveyegge/beads/blob/main/docs/PROTECTED...
Also, when you say 'never heard of beads': it spits out onboarding text to tell the agent exactly what it needs to know.
It requires a deep dive, but this is an interesting direction for agent tooling.
I also find it faster to use. I tell the agent the problem and ask it to write a set of tasks using beads; it creates the tasks and the “depends on” tree structure. Then I tell it to work on one task at a time and require my review before continuing.
The added benefit is the agent doesn’t need to hold so much context in order to work on the tasks. I can start a new session and tell it to continue the tasks.
Most of this could work without beads, but it’s so easy to use that it’s the only spec tool I’ve found that has stuck.
Like you mentioned, agents are insanely good at grep. So much so that I’ve been trying to figure out how to create an llmgrep tool because it’s so good at it. Like, I want to learn how to be that good at grep, hah.
[1] Demo with Claude - https://pradeeproark.github.io/pensieve/demos/
[2] Article about it - https://pradeeproark.com/posts/agentic-scratch-memory-using-...
[3] https://github.com/cittamaya/cittamaya - Claude Code Skills Marketplace for Pensieve
But I also can’t rule out that he somehow believes it, which I suppose makes it a good troll.
> I appreciate that this is a very new project, but what’s missing is an architectural overview of the data model.
Response:
You're right to call me out on this. :)
Then I check the latest commit on architecture.md, which looks like a total rewrite in response to a beads.jsonl issue logged for this.
> JSONL for git: One entity per line means git diffs are readable and merges usually succeed automatically.
Hmm, OK. So the readme says:
> .beads/beads.jsonl - Issue data in JSONL format (source of truth, synced via git)
But the beads.jsonl for that commit to fix architecture.md still has the issue to fix architecture.md in it? So I wonder, does that line get removed now that it's fixed ... so I check master, but now beads.jsonl is gone?
But the readme still references beads.jsonl as the source of truth? There is no beads.jsonl in the dogfooded repo, and there are hundreds of commits in the past few days, so I'm not clear how I'm supposed to understand what's going on with the repo. beads.jsonl is the spoon, but there is no spoon.
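For what it's worth, the one-entity-per-line claim itself is easy to illustrate (these lines are invented, not the real beads schema):

```
$ cat .beads/beads.jsonl
{"id": "bd-101", "title": "Rewrite architecture.md", "status": "open"}
{"id": "bd-102", "title": "Document the data model", "status": "closed"}
```

Closing bd-101 would change exactly one line, so the diff stays readable and a branch that only touched bd-102 merges cleanly; what I can't follow is where that source-of-truth file went in the dogfooded repo.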
I'll check back later, or have my beads-superpowered agent check back for me. Agents report that they enjoy this.
https://github.com/steveyegge/beads/issues/376#issuecomment-...
https://github.com/steveyegge/beads/commit/c3e4172be7b97effa...
Reminds me of the guy who recently spammed PRs to the OCaml compiler, but this time the script is flipped and all the confusion is self-inflicted.
I wonder how long it will take us to see a vibe-coded, slop-covered OS or database or whatever (I guess the “braveness” of these slop creators will be (is?) directly proportional to the quality of the SOTA coding LLMs).
Do we have a term for this yet? I mean the person, not the product (slop)
Even more impressive lately is how good the latest models are without anything keeping them on track!
Software that has great AX will become significantly more useful in the same way that good UX has been critical.
Neither does OpenAI's Codex CLI - you can confirm that by looking at the source code https://github.com/openai/codex
Cursor and Windsurf both use semantic search via embeddings.
You can get semantic search in Claude Code using this unofficial plugin: https://github.com/zilliztech/claude-context - it's built by Zilliz and uses their managed vector database, Zilliz Cloud.
I finally started digging into OpenCode for real these past couple of weeks. It has a planning mode, which builds out a plan in the text chat as usual, but a right pane in the TUI also builds out a todo list, which has been really nice. I often give it the go-ahead to do the next item or two or three. I've wondered how this is implemented, and how OpenCode sets up and picks up on this structuring.
Beads formalizing that a bit more is tempting. I also deeply, deeply enjoy that Beads is checked in. With both Aider and OpenCode there's a nice history, but it's typically not checked in. OpenCode's history in particular isn't even kept in the project directory, and can be quite complex with multiple sessions and multiple agents all flying around. Beads, as a strategy to record the work and understand it better, is also very tempting.
Would love to see deeper OpenCode + Beads integration.
They do use RAG a lot for their desktop app; their Projects implementation makes heavy use of it.