zlacker

I think the point is it smells like a hack, just like "think extra hard and I'll tip you $200" was a few years ago. It increases benchmarks a few points now but what's the point in standardizing all this if it'll be obsolete next year?

replies(3): >>mbesto+Xe >>9dev+ij >>dragon+dt5

>>pton_x+(OP)
I think this tweet sums it up correctly, doesn't it?

   A +6 jump on a 0.6B model is actually more impressive than a +2 jump on a 100B model. It proves that 'intelligence' isn't just parameter count; it is context relevance. You are proving that a lightweight model with a cheat sheet beats a giant with amnesia. This is the death of the 'bigger is better' dogma

Which is essentially the bitter lesson that Richard Sutton talks about?

replies(1): >>Der_Ei+n81

>>pton_x+(OP)
Standards have to start somewhere if they are going to gain traction and stick around for longer than that.

Plus, as has been mentioned multiple times here, standard skills are a lot more about different harnesses being able to consistently load skills into the context window in a programmatic way. Not every AI workload is a local coding agent.

>>mbesto+Xe
Nice ChatGPT-generated response in that tweet. Anyone too lazy to deslop their tweet shouldn't be listened to.

>>pton_x+(OP)
The standardization covers how the information is made available to the harness. How that information is then presented to the model can be optimized and iterated on without affecting the presentation to the harness. So far, agent skills have already been provided to the model by:

(1) giving the model a bash tool with direct access to the filesystem that stores the skills,

(2) giving the model read_file and related tools,

(3) giving the model specialized tools for accessing skills,

(4) processing the filesystem structure and giving the model a structure that includes the full content of the skills up front.

And probably some other ways or hybrids.
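
To make the difference between those approaches concrete, here is a rough sketch, not any particular harness's implementation. It assumes each skill is a directory containing a SKILL.md file with simple `name:` / `description:` frontmatter between `---` lines; the `Skill` class and the `load_skills`, `render_full`, `render_index` and `read_skill` names are made up for illustration. The same on-disk skills feed either an up-front dump, as in (4), or a short index plus a lookup tool, roughly (2)/(3).

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Skill:
    name: str
    description: str
    body: str


def parse_skill(path: Path) -> Skill:
    """Parse one skill file; the frontmatter layout is an assumption."""
    text = path.read_text(encoding="utf-8")
    meta = {}
    body = text
    if text.startswith("---\n"):
        header, sep, rest = text[4:].partition("\n---\n")
        if sep:  # only treat it as frontmatter if the closing '---' exists
            body = rest
            for line in header.splitlines():
                key, colon, value = line.partition(":")
                if colon:
                    meta[key.strip()] = value.strip()
    return Skill(
        name=meta.get("name", path.parent.name),
        description=meta.get("description", ""),
        body=body.strip(),
    )


def load_skills(root: Path) -> list[Skill]:
    """Harness-side discovery: one skill per <root>/<dir>/SKILL.md."""
    return [parse_skill(p) for p in sorted(root.glob("*/SKILL.md"))]


def render_full(skills: list[Skill]) -> str:
    """Approach (4): put every skill's full content in the prompt up front."""
    return "\n\n".join(f"## Skill: {s.name}\n{s.body}" for s in skills)


def render_index(skills: list[Skill]) -> str:
    """Approaches (2)/(3): show only a short index; bodies are fetched on demand."""
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)


def read_skill(skills: list[Skill], name: str) -> str:
    """Hypothetical tool the model could call to fetch one skill's body."""
    for s in skills:
        if s.name == name:
            return s.body
    raise KeyError(name)
```

The skill files on disk are identical in both cases; only the rendering function the harness picks changes.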

> It increases benchmarks a few points now but what's the point in standardizing all this if it'll be obsolete next year?

Standardizing how skills are presented to LLM harnesses lets the harnesses incorporate findings on optimization (which may be specific to models, or at least to model features like context size, and to use cases), and lets existing skills get the benefit of that for free.
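
As a toy illustration of that point, and only a sketch under assumptions: the `choose_presentation` function and its character-based budget are invented here, and a real harness would likely count tokens and use richer metadata. The idea is that the harness decides per model how to present an unchanged set of skills.

```python
def choose_presentation(skills: dict[str, str], context_budget_chars: int) -> str:
    """Pick a presentation for the model given a rough context budget.

    `skills` maps a skill name to its full text. The budget is measured in
    characters purely to keep the example simple (an assumption).
    """
    full = "\n\n".join(f"## {name}\n{text}" for name, text in skills.items())
    if len(full) <= context_budget_chars:
        # Large-context model: inline everything up front.
        return full
    # Small-context model: list names only and let the model ask for one.
    listing = "\n".join(f"- {name}" for name in skills)
    return "Available skills (request one by name):\n" + listing
```

The skill author writes the skill once; swapping in a different model only changes which branch the harness takes.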

replies(1): >>0thgen+IB7

>>dragon+dt5
How much of a standard is it though, really? To me it just looks like "Call your docs SKILLS and organize it like this".

And if you're just making docs and letting your models go buck wild in your shell, doesn't an overspecified docs structure ruin the point of general purpose agents?

Like, a good dev should be able to walk into a codebase, look at the structure, and figure out how to proceed. If "hey your docs aren't where I was expecting" breaks the developer, you shouldn't have hired them.

Feels like a weird thing to take "this is how we organize our repos as this company" and turn that into "this is an 'open standard' that you should build your workflows around".