However, I think producing detailed enough specification requires same or even larger amount of work than writing code. We write rough specification and clarify these during the process of coding. I think there are minimal effort required to produce these specification, AI will not help you speed up these effort.
The nice thing about code compared to other notation is that it's useful on its. You describe an algorithm and the machine can then solve the problem ad infinitum. It's one step instead of the two step of writing a spec and having an LLM translate it, then having to verify the output and alter it.
Assembly and high level languages are equivalent in terms of semantics. The latter helps in managing complexity, by reducing harmful possibilities (managing memory, off-by-one errors) and presenting common patterns (iterators/collections, struct and other data structures, ....) so that categories of problems are easily solved. There's no higher level of computing model unlocked. Just faster level of productivity unlocked by following proven patterns.
Spec driven workflow is a mirage, because even the best specs will leave a lot of unspecified details. Which are crucial as most of programming is making the computer not do the various things it can do.
This is a very stimulating way of putting it!
My particular hypothesis on this is something that feels a little bit like python and ruby, but has an absolutely insane overkill type system to help guide the AI. I also threw in a little lispiness on my draft: https://github.com/jaggederest/locque/
Our team has started dedicating much more time writing documentation for our SaaS app, no one seems to want to do it naturally, but there is very large potential for opening your system to machine automation. Not just for coding but customer facing tooling. I saw a preview of that possible future using NewRelic where they have an AI chat use their existing SQL-like query language to build tables and charts from natural language queries right in the web app. Theirs kinda sucks but there's so much potential there that it is very likely going to change how we build UIs and software interfaces.
Plus it also helps sales, support, and SEO having lots of documentation on how stuff works.
Also, they rely surprisingly closely on "good" code patterns, like comments and naming conventions.
So if anything, a managed language [1] with a decent type system and not a lot of features would be the best, especially if it has a lot of code in its training data. So I would rather vote on Java, or something close.
[1] reasoning about life times, even if aided by the compiler is a global property, and LLMs are not particularly good at that
This is sort of a fundamental problem with all AI. If you tell a robot assistant to "make a cup of tea", how's it supposed to know that that implies "don't break the priceless vase in the kitchen" and "don't step on the cat's tail", et cetera. You're never going to align it well enough with "human values" to be safe. Even just defining in human-understandable terms what those values are is a deep existential question of philosophy, let alone specifying it for a machine that's capable of acting in the world independently.
On the other hand: the usefulness of LLMs will always be gated by their interface to the human world. So even if their internal communication might be superseded at some point. Their contact surface can only evolve if their partners/subjects/masters can interface
I recently used Claude for a refactor. I had an exact list of call sites, with positions etc. The model had to add .foo to a bunch of builders that were either at that position or slightly before (the code position was for .result() or whatever.) I gave it the file and the instruction, and it mostly did it, but it also took the opportunity to "fix" similar builders near those I specified.
That is after iterating a few times on the prompt (first time it didn't want to do it because it was too much work, second time it tried to do it via regex, etc.)
I've had comical instances where asking an agent to "perform the refactor within somespec.md" results in it ... refactoring the spec as opposed to performing a refactor of the code mentioned in the spec. If I say "Implement the refactor within somespec.md" it's never misunderstood.
With LLMs _so_ strongly aligned on language and having deep semantic links, a hypothetical prompt compiler could ensure that your intent converts into the strongest weighted individual words to ensure maximal direction following and outcome.
Intent classification (task frame) -> Reference Binding (inputs v targets) -> high-leverage word selection .... -> Constraints(?) = <optimal prompt>