> If you want a clean comparison, I’d test three conditions under equal context budgets: (A) monolithic
> AGENTS.md, (B) README index that links to docs, (C) skills with progressive disclosure. Measure task
> success, latency, and doc‑fetch count across 10–20 repo tasks. My hunch: (B)≈(C) on quality, but (C)
> wins on token efficiency when the index is strong. Also, format alone isn’t magic: skills that reference
> real tools/assets via the backing MCP are qualitatively different from docs‑only skills, so I’d
> separate those in the comparison. Have you seen any benchmarks that control for discovery overhead?
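
For anyone who wants to try that comparison, here is a rough sketch of how the per-condition numbers could be aggregated. Everything in it is an assumption for illustration: the `Run` record, its field names, and the condition labels are hypothetical, and it says nothing about how the agent is actually run or how tokens get counted.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Run:
    """One agent attempt at one repo task (hypothetical record shape)."""
    condition: str    # free-form label, e.g. "A_monolithic", "B_readme_index", "C_skills"
    task_id: str
    success: bool
    latency_s: float  # wall-clock time for the attempt
    doc_fetches: int  # how many docs/skills the agent pulled in
    tokens_used: int  # total tokens consumed under the shared context budget


def summarize(runs):
    """Group runs by condition and average each metric within the group."""
    by_condition = defaultdict(list)
    for run in runs:
        by_condition[run.condition].append(run)

    summary = {}
    for condition, group in by_condition.items():
        summary[condition] = {
            "n": len(group),
            "success_rate": mean(1.0 if r.success else 0.0 for r in group),
            "mean_latency_s": mean(r.latency_s for r in group),
            "mean_doc_fetches": mean(r.doc_fetches for r in group),
            "mean_tokens": mean(r.tokens_used for r in group),
        }
    return summary
```

Because the condition is just a label, docs-only skills and MCP-backed skills can be logged under separate labels and compared as their own rows. With only 10–20 tasks the averages will be noisy, so it is probably worth running the same task set under every condition and looking at per-task differences too.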