zlacker

At this point, I am starting to feel like we don’t need new languages, but new ways to create specifications.

I have a hypothesis that an LLM can act as a pseudocode-to-code translator, where the pseudocode can tolerate a mixture of code-like and natural-language specification. The benefit is that it formalizes the human as the specifier (which must be done anyway) and the LLM as the code writer. This might also make lower-resource “non-frontier” models more useful. Additionally, it tolerates syntax mistakes and, in the worst case, falls back to plain natural language where needed.

In other words, I think LLMs don’t need new languages, we do.

replies(19): >>catlif+Q6 >>bigfis+g8 >>kamaal+k8 >>AgintA+1d >>daniel+We >>keepam+Gf >>ronces+jk >>jasfi+sr >>averev+Ns >>trklau+6K >>Quadma+Hb1 >>Footpr+Fe1 >>larodi+gg1 >>hombre+Zq1 >>Growin+5B1 >>dpweb+Rk2 >>pessim+Uq2 >>pizzaf+O13 >>UltraS+yG3

>>deepsq+(OP)
And so it comes full circle XD.

>>deepsq+(OP)
So in this case an LLM would just be a less-reliable compiler? What's the point? If you have to formally specify your program, we already have tools for that, no boiling-the-oceans required.

>>deepsq+(OP)
>>new ways to create specifications.

That's programming languages again. The real issue with LLMs now is that it doesn't matter how quickly they can generate code. Someone still has to read, verify and test it.

Perhaps we need a terse programming language, one that can be read and verified quickly. You could call that a specification.

replies(2): >>deepsq+Ic >>boness+371

>>kamaal+k8
Yes, essentially a higher level programming language than what we currently have. A programming language that doesn't have strict syntax, and can be expressed with words or code. And like any other programming language, it includes specifications for the tests and expectations of the result.

The programming language can look more like code in parts where the specification needs to be very detailed. I think people can get intuition about where the LLM is unlikely to be successful. It can have low detail for boilerplate or code that is simple to describe.

You should be able to alter and recompile the specification, unlike the wandering prompt, which makes changes faster than normal version control practices can keep up with.

Perhaps there's a world where reading the specification rather than the compiled code is sufficient in order to keep cognitive load at reasonable levels.

At the very least, you can read compiled code until you can establish your own validation set and create statistical expectations about your domain. Principally, these models will always be statistical in nature. So we probably need to start operating more inside that kind of framework if we really want to be professional about it.

replies(2): >>kamaal+3n >>shakna+eD

>>deepsq+(OP)
This is the approach that Agint takes. We infer the structure of the code first, top down, as a graph, then add in types, then interpret the types as in/out function signatures, and then "inpaint" the functions for codegen.

>>deepsq+(OP)
I’ve been on a similar train of thought. Just last weekend I built a little experiment, using LLMs to highlight pseudocode syntax:

https://x.com/danielvaughn/status/2011280491287364067?s=46

>>deepsq+(OP)
I think this confuses two different things:

- LLMs can act as pseudocode to code translators (they are excellent at this)

- LLMs still create bugs and make errors, and a reasonable hypothesis is that they do so at a rate directly proportional to the "complexity" or "buggedness" of the underlying language.

In other words, give an AI a footgun and it will happily use it unawares. That doesn't mean however it can't rapidly turn your pseudocode into code.

None of this means that LLMs can magically correct your pseudocode at all times if your logic is vastly wrong for your goal, but I do believe they'll benefit immensely from new languages that reduce the kind of bugs they make.

This is the moment we can create these languages, because LLMs can optimize for things that humans can't. It seems possible to design new languages that reduce bugs in ways that work for LLMs but are less effective for people (due to syntax, ergonomics, verbosity, or anything else).

This is crucially important. Why? Because 99% of all code written in the next two decades will be written by AI. And we will also produce 100x more code than has ever been written before (because the cost of doing it has dropped essentially to zero). This means that, short of some revolutions in language technology, the number of bugs and vulnerabilities we can expect will also 100x.

That's why ideas like this are needed.

I believe in this too and am working on something that also targets LLMs specifically; I have been working on it since mid-to-late November last year. A business model will make such a language sustainable.

replies(1): >>fragme+wJ

>>deepsq+(OP)
What we need is a programming language that defines the diff to be applied upon the existing codebase to the same degree of unambiguity as the codebase itself.

That is, in the same way that event sourcing materializes a state from a series of change events, this language needs to materialize a codebase from a series of "modification instructions". Different models may materialize a different codebase using the same series of instructions (like compilers), or say different "environmental factors" (e.g. the database or cloud provider that's available). It's as if the codebase itself is no longer the important artifact, the sequence of prompts is. You would also use this sequence of prompts to generate a testing suite completely independent of the codebase.
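A rough sketch of the event-sourcing analogy, in Python. The `Modification` type, its `op` names, and `materialize` are all hypothetical, and a plain deterministic fold stands in for the model-driven step, so this only illustrates the shape of the idea: the log is the source of truth and the codebase is derived from it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Modification:
    """One hypothetical 'modification instruction' in the log."""
    op: str                      # "write" or "delete" (illustrative op names)
    path: str
    content: Optional[str] = None


def materialize(log: list[Modification]) -> dict[str, str]:
    """Replay the log in order to rebuild the codebase as {path: content}."""
    codebase: dict[str, str] = {}
    for mod in log:
        if mod.op == "write":
            codebase[mod.path] = mod.content or ""
        elif mod.op == "delete":
            codebase.pop(mod.path, None)
        else:
            raise ValueError(f"unknown op: {mod.op}")
    return codebase


log = [
    Modification("write", "app.py", "print('v1')\n"),
    Modification("write", "util.py", "def helper(): ...\n"),
    Modification("write", "app.py", "print('v2')\n"),
    Modification("delete", "util.py"),
]

# Replaying the full log, or any prefix of it, yields a codebase snapshot.
print(materialize(log))
print(materialize(log[:2]))
```

With an LLM in the loop, `materialize` would no longer be deterministic, which is exactly where the "different models may materialize a different codebase" caveat comes in.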

replies(3): >>cloogs+OD >>gritzk+xK >>yuppie+4H2

>>deepsq+Ic
Simply put, whatever you write should produce the same output regardless of how many times you execute it. The more verbose you make it, the more pointless it becomes.

The more terse, the better.

replies(1): >>tosapp+uv

>>deepsq+(OP)
I'm actually building this, will release it early next month. I've added a URL to watch to my profile (should be up later this week). It will be Open Source.

>>deepsq+(OP)
LLMs work great in a closed loop, where they can self-correct, but we don't have a reliable way to lint and test specs. We need a new language for that.

>>kamaal+3n
for the sake of being downvoted: MASM.

replies(1): >>tosapp+Az

>>tosapp+uv
An LLM could speak FPGA.

Good luck auditing that.

>>deepsq+Ic
We already have exceptionally high level languages, like Inform7 [0]. The concept doesn't work all that well. Terseness is a value. It's why we end up with so many symbol-heavy languages. Yes, there are tradeoffs, but that is the whole of computer science.

We didn't end up with Lean and Rust for a lack of understanding in how to create strong specifications. Pascal-like languages fell out of favour, despite having higher readability.

[0] https://learnxinyminutes.com/inform7/

>>ronces+jk
I think this could be very useful even for regular old programming. We could treat the diffs to the code as the main source of truth (instead of the textual snapshot each diff creates).

Jonathan Edwards (Subtext lang) has a lot of great research on this.

>>keepam+Gf
Say you have this new language, with only a tiny number of examples out there. How do the SOTA labs train on your language? With sufficient examples, a model can generate code which gets compiled and then run, and that gets fed into a feedback loop to improve upon, but how do you get there? How do you bootstrap that? Never mind the dollar cost, how does it offer something above having an LLM generate code in Python or JavaScript, then having it rewrite it in Go/Rust/C++ as needed/possible for performance or whatever reason?

It sounds like your plan is for it to write fewer bugs in NewLang, but, well, that seems a bit hard to achieve in the abstract. In the bugs I've fixed in code generated by early LLMs, it was just bad code: multiple variables for the same thing, especially. Recently they've gotten better at that, but it still happens.

For a concrete example, take any app dealing with points in time, which sometimes have a date attached and sometimes do not. And also, what are timezones? The complexity is there because it depends on what you're trying to do. An alarm clock is different from a calendar, which is different from a pomodoro timer. How are you going to reduce the buggedness of that without making one of those use cases more complicated than need be, given access to various primitives?
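To make the point concrete, here is a small Python sketch of why the three use cases want different primitives. The zones are modelled as fixed UTC offsets purely for brevity (an assumption; real code would use `zoneinfo` so daylight saving rules apply), and `next_alarm` is a hypothetical helper, not anything from a library.

```python
from datetime import date, datetime, time, timedelta, timezone

# An alarm is a wall-clock time with no date or zone attached.
alarm = time(7, 0)

# A calendar event is a specific instant, so it carries a date and an offset.
# Fixed offsets are an illustrative simplification.
BERLIN = timezone(timedelta(hours=1))
NEW_YORK = timezone(timedelta(hours=-5))
meeting = datetime(2026, 1, 15, 15, 0, tzinfo=BERLIN)

# A pomodoro timer only needs a duration.
pomodoro = timedelta(minutes=25)


def next_alarm(alarm_time: time, today: date, tz: timezone) -> datetime:
    """Pin a wall-clock alarm to a concrete instant for one date and zone."""
    return datetime.combine(today, alarm_time, tzinfo=tz)


day = date(2026, 1, 15)
berlin_alarm = next_alarm(alarm, day, BERLIN)
new_york_alarm = next_alarm(alarm, day, NEW_YORK)

# The same "07:00" alarm is two different instants, six hours apart.
print(new_york_alarm - berlin_alarm)
# The same meeting, seen from New York.
print(meeting.astimezone(NEW_YORK))
```

None of the three types is wrong; forcing all three apps onto one of them is what creates the bugs.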

replies(2): >>keepam+zO >>Modern+nt1

>>deepsq+(OP)
Ah, people are starting to see the light.

This is something that could be distilled from some industries like aviation, where specification of software (requirements, architecture documents, etc.) is even more important than the software itself.

The problem is that natural language is in itself ambiguous, and people don't really grasp the importance of clear specification (how many times have I had to repeat that any limit specified in a requirement needs units and tolerances).
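As a sketch of what "units and tolerances on every limit" could look like if the spec format enforced it, here is a hypothetical `Limit` type in Python. The class, its field names, and the cabin pressure numbers are made up for illustration and do not come from any real requirement or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    """A requirement limit that cannot be written without unit and tolerance."""
    nominal: float
    tolerance: float
    unit: str

    def accepts(self, measured: float, unit: str) -> bool:
        # Refuse to compare values whose unit differs from the spec's unit.
        if unit != self.unit:
            raise ValueError(f"expected {self.unit}, got {unit}")
        return abs(measured - self.nominal) <= self.tolerance


# Hypothetical requirement: "pressure shall be 75.0 kPa +/- 1.5 kPa".
cabin_pressure = Limit(nominal=75.0, tolerance=1.5, unit="kPa")

print(cabin_pressure.accepts(76.0, "kPa"))   # within tolerance
print(cabin_pressure.accepts(78.0, "kPa"))   # outside tolerance
```

A bare "the limit is 75" simply cannot be constructed here, which is the whole point.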

Another problem is that natural language doesn't have "defaults": if you don't specify something, it is open to interpretation. And people _will_ interpret something instead of saying "yep I don't know this".

replies(3): >>mike_h+KX >>nxobje+AY >>datsci+Gb1

>>ronces+jk
I am working on that https://github.com/gritzko/librdx Conflictless merge and overlay branches (ie freely attachable/detachable by a click). That was the pie-in-the-sky of the CRDT community for maybe 15 years. My current approach is RDX tree CRDT effectively mapping to the AST tree of the program. Like CRDT DOM for the AST, because line based diffs are too clumsy for that.

Back in the day, JetBrains tried revision-controlling AST trees or psi-nodes in their parlance. That project was cancelled, as it became a research challenge. That was 10 years ago or so. At this point, things may work out well, time will tell.

replies(2): >>mike_h+GU >>embedd+Zo2

>>fragme+wJ
Your hypothetical misses praxis: in my experience an LLM can pick up any new syntax with ease. From a few examples, it can generate more. With a compiler (even a partial one on limited syntax), it can correct itself. It soon becomes fluent simply from the context of your codebase. You don't need to "train" an LLM to recognize language syntax. It's effortless for it to pick it up.

Or, maybe my lanng just had LLM-easy syntax - which would be good - but I think this is more just par for the course for LLMs, bud.

replies(2): >>mike_h+wX >>fragme+2Q5

>>gritzk+xK
Was it cancelled? I thought MPS works that way.

replies(1): >>gritzk+931

>>keepam+zO
I'm also looking at this topic right now.

I think you're right within limits but the issue is semantics and obscure features. If the language differs from existing languages in only trivial ways, then LLMs can pick it up quickly. But then the value of such a language is trivial. If you deviate in bigger ways, it's harder to properly use just based on pre-existing code.

Here's a simple case study: Kotlin is semantically Java with a more concise syntax, but part of what makes it more concise is the Kotlin standard library adds a lot of utility methods to Java. Many utility methods are only needed rarely. LLMs can write competent Kotlin because they read the user guide and saw millions of examples in their training set, but if they were trying to learn exclusively from small examples in their context window, they wouldn't know about those obscure utilities and would never use them. Much of the benefit would be lost.

Given this, I see a few ways forward:

1. Just give up on designing new programming languages. Languages are user interfaces but the user is now an LLM with near infinite patience, so who cares if they aren't ideal. If the LLM has to brute force a utility method every single time instead of using a standard library... ok. Whatever. This would parallel what happened with CPU ISAs. There are very few of them today, they don't matter much and they're designed in ways that only machines can handle all the details, because everyone codes to higher level languages and compilers write all the assembly.

2. Define new languages as a delta on top of some well known initial language, ensuring that the language definition always fits inside a prompt as a skill. In this world we don't bother with new syntaxes anymore unless that syntax change encodes significant new semantics, because it's not worth wasting tokens showing the LLM what to do. Everything is just an extension to Python, in this world. The line between new languages and new libraries becomes increasingly blurred as runtimes get more powerful and flexible.

3. New languages have to come with their own fine tuned and hosted coding LLM. Maybe that's even a way to monetize new language creation.

4. The big model firms offer a service where you can pay to get your data into the training set. Then you use the giant prompt+delta mechanism to get an LLM to generate a textbook of sample code, pay to get it into the training set, wait six months for another foundation model run and then your language becomes usable.

Of these I think (2) is currently the most practical.

replies(1): >>keepam+YY

>>trklau+6K
You can use LLMs as specification compilers. They are quite good at finding ambiguities in specs and writing out lists of questions for the author to answer, or inferring sensible defaults in explicitly called out ways.

replies(1): >>UncleE+Jd2

>>trklau+6K
Time to bring out the flowcharts again!

>>mike_h+wX
This sounds academic, like a thought experiment. I have experience and can tell you this is not the case. I am using a significantly different language and the LLMs have 0 problem using it.

There are likely challenges here, but they're not the ones you're seeing so far.

replies(1): >>mike_h+mh1

>>mike_h+GU
I meant specifically revision control. JetBrains' school of thought is very much AST-centric, yes.

replies(1): >>mike_h+ph1

>>kamaal+k8
This specification argument seems to boil down to: what if we used Haskell to describe systems to LLMs?

Many of our traditional functional languages, the ML family in particular, let you write hyper-concise expressions (pure math if you’re into that sort of thing), craft DSLs of unlimited specifiable power (‘makeTpsReportWith “new cover page format”’), and also write in natural language (function names like `emptied cart should have zero items`).

I think if we did that and leveraged the type systems of those languages and the systematic improvements we see from ADTs and pattern matching in those languages, combined with a specification first approach like TDD, that we’d have a great starting point to have an LLM generate the rest of the system perfectly.
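A loose Python analogue of the specification-first idea, for anyone who doesn't read ML-family code. Python has neither backtick function names nor real ADTs, so this only shows the "names read as sentences, bodies are executable checks" half; the `Cart` type and both checks are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Hypothetical cart type; the implementation is what an LLM would fill in."""
    items: list[str] = field(default_factory=list)

    def add(self, item: str) -> None:
        self.items.append(item)

    def empty(self) -> None:
        self.items.clear()


# The specification: names read as sentences, bodies are executable checks.
def new_cart_should_have_zero_items() -> None:
    assert len(Cart().items) == 0


def emptied_cart_should_have_zero_items() -> None:
    cart = Cart()
    cart.add("book")
    cart.empty()
    assert len(cart.items) == 0


SPEC = [new_cart_should_have_zero_items, emptied_cart_should_have_zero_items]

for check in SPEC:
    check()
    print("ok:", check.__name__.replace("_", " "))
```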

… yes, that is just writing Haskell/OCaml/F# with extra steps.

… yes, that level of specification is also the point with those languages where your exploratory type-diddling suddenly goes ‘presto’ and you magically have a fully functioning system.

I guess I’m old-fashioned, but sometimes I wonder if compilers are good for what they’re good for.

>>trklau+6K
> The problem is that natural language is in itself ambiguous

This is literally what software developers are actually paid to do. They are not paid to write code. This is reinventing software development.

replies(1): >>pessim+iD2

>>deepsq+(OP)
Coding in LaTeX and then translating to the target via an LLM works remarkably well nowadays.

>>deepsq+(OP)
Go read ai2027 and then be ashamed of yourself /s

But seriously, LLMs can transmit ideas to each other through English that we do understand; we are screwed if it’s another language lol

>>deepsq+(OP)
We have some markup for architectures, like d2lang, sequencediagram.org's, bpmn.io XMLs (which are OMG XMLs), so the question is: can we master these, and not invent new stuff for a while?

p.s. a combination of the above fares very well during my agentic coding adventures.

>>keepam+YY
OK, that's valuable to know, but how is your language different? You were discussing syntax previously. How well does the LLM handle your language's different standard library and how big is it?

>>gritzk+931
I think MPS stores projects as serialized ASTs and can do VCS merging.

replies(1): >>gritzk+ap1

>>mike_h+ph1
Great. But MPS is not a revision control system.

replies(1): >>mike_h+iq1

>>gritzk+ap1
Ah I see. You mean they were trying to build a custom VCS that had special support for AST merging. MPS uses regular git with custom merge drivers to do AST-level merging instead of textual merging, but that's a bit different.

>>deepsq+(OP)
We're already at a point where I think a PR should start with an LLM prompt that fully specs out the change/feature.

And then we can look at multiple LLM-generated implementations to inform how the prompt might need to be updated further until it's a one-shot.

Now you have perfect intention behind code, and you can refine the intention if it's wrong.

replies(1): >>bweste+Sy1

>>fragme+wJ
> Say you have this new language, with only a tiny amount of examples out there. How do the SOTA labs train on your language?

Languages don't exist in isolation, they exist on a continuum. Your brand new language isn't brand new, it's built off the semantics and syntax of many languages that have come before it. Most language designers operate under what is known as a "weirdness budget", which is about keeping your language to within some delta of other languages modulo a small number of new concepts. This is to maintain comprehensibility, otherwise you get projects like Hoon / Nock where true is false and up is down that no one can figure out.

Under a small weirdness budget, an LLM should be able to understand your new language despite not being trained on it, if you just explain what's different about it. I've had great success with this so far, even on early LLMs. One thing you can do is give it the EBNF grammar and it can just generate strings from that. But that method is prone to hallucinations.
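For reference, this is what "generate strings from the grammar" means when done mechanically instead of by an LLM. The toy grammar below is made up for the example and is written as a Python dict rather than real EBNF; `generate` is a hypothetical helper.

```python
import random

# Hypothetical toy grammar, written as a dict instead of real EBNF:
#   expr ::= term | term "+" expr
#   term ::= "x" | "y" | "(" expr ")"
GRAMMAR = {
    "expr": [["term"], ["term", "+", "expr"]],
    "term": [["x"], ["y"], ["(", "expr", ")"]],
}


def generate(symbol: str, rng: random.Random, depth: int = 0) -> str:
    """Expand a symbol into a random string the grammar accepts."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    options = GRAMMAR[symbol]
    # Past a small depth, always take the first (non-recursive) option
    # so that generation is guaranteed to terminate.
    choice = options[0] if depth > 4 else rng.choice(options)
    return "".join(generate(part, rng, depth + 1) for part in choice)


rng = random.Random(0)
for _ in range(3):
    print(generate("expr", rng))
```

Every string produced this way is syntactically valid by construction, but it says nothing about whether the program means anything, which is the gap the LLM is being asked to fill.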

>>hombre+Zq1
"A man with a watch knows what time it is. A man with two watches is never sure."

>>deepsq+(OP)
> The benefit being that it formalizes the human as the specifier (which must be done anyway) and the llm as the code writer.

The code was always a secondary effect of making software. The pain is in fully specifying behavior.

>>mike_h+KX
Yeah, if you can somehow convince them you really, really want them to follow the specification and not just do whatever they want.

And it doesn't matter how many times you tell them the implementation and, more importantly, the tests need to 100% follow the spec: they'll still write tests to match the buggy code or just ignore bugs completely until you call them out on it and/or watch them like a hawk.

Maybe I'm just holding it wrong, who knows?

>>deepsq+(OP)
I disagree. I think we always need new languages. Every language over time becomes more and more unnecessarily complex.

It's just part of the software lifecycle. People think their job is to "write code" and that means everything becomes more and more features, more abstractions, more complex, more "five different ways to do one thing".

There are many, many examples: C++, Java (especially circa 2000-2010), and on and on. There's no hope for older languages. We need simpler languages.

replies(2): >>embedd+wo2 >>kwanbi+Ro2

>>dpweb+Rk2
> Every language over time becomes more and more unnecessarily complex.

Of course someone eventually will, so I might as well: Well, except for lisp-likes. I think the main reason programming languages grow and grow is that people want to use them in "new" (sometimes new-new, sometimes existing) ways, and how do you add new language features to a programming language? You change the core of the language in some way.

What if instead you made it really easy to change the core language from the language itself, when you need to, without impacting other parts of the codebase? Usually if you use a language from the lisp-"family" of languages, you'll be able to.

So instead of the programming language everyone uses growing regardless of whether you need it or not, it can stay simple and relatively small for everyone, while the people who need it can grow their own hairballs "locally" (or be solid engineers and avoid hairballs in the first place, though that requires tenure or similar).

>>dpweb+Rk2
Related to your comment. I was a "desktop" developer many years ago (about 20). Back then I mainly coded in Assembler, Visual Basic, and Delphi, and I also learned COBOL, C, and Java.

Just this week, I decided to start learning Kotlin because I want to build a mobile app.

Everything was going great until I reached lambda functions.

Honestly, I can't wrap my head around either their purpose or their syntax. I find them incredibly confusing. Right now, they feel like something that was invented purely to confuse developers.

I know this might just be one of those topics where you suddenly have an "aha" moment and everything clicks, but so far, that moment hasn't come.

Did anyone else coming from older, more imperative languages struggle this much with lambdas? Any tips or mental models that helped you finally "get" them?

replies(2): >>gridsp+JP2 >>colone+PQ4

>>gritzk+xK
Just a clarifying question to check whether I understand librdx correctly: it seems you've implemented a language designed explicitly to be easy to use as a CRDT for syncing purposes (with specific types/structures for communicating just the changes too?), rather than taking an existing language and then layering that stuff on top?

replies(1): >>gritzk+Ul3

>>deepsq+(OP)
I think that we haven't even started to properly think about a higher-level spec language. The highest level objects would have to be the users and the value that they get out of running the software. Even specific features would have to be subservient to that, and would shift as the users' requirements shift. Requirements written in a truly higher-level spec language would allow the software to change features without the spec itself changing.

This is where LLMs slip up. I need a higher-level spec language where I don't have to specify to an LLM that I want the jpeg crop to be lossless if possible. It's doubly obvious that I wouldn't want it to be lossy, especially because making it lossy likely makes the resulting files larger. This is not obvious to an LLM, but it's absolutely obvious if our objects are users and user value.

A truly higher-level spec language compiler would recognize when actual functionality disappeared when a feature was removed, and would weigh the value of that functionality within the value framework of the hypothetical user. It would be able to recognize the value of redundant functionality by putting a value on user accessibility - how many ways can the user reach that functionality? How does it advertise itself?

We still haven't even thought about it properly. It's that "software engineering" thing that we were in a continual argument about whether it existed or not.

>>datsci+Gb1
IMO, it's clarifying software development. I think ultimately it means that some people who are slightly on the softer side of development will become indistinguishable from other developers, and people on the more mechanical side of development will disappear.

If what you do can be done by the systematic manipulation of symbols, we have a better system for that now. If the spec they hand to you has to be so specific that you don't have to think while implementing it, we have a machine that can do everything except think that can handle that.

replies(1): >>datsci+7N2

>>ronces+jk
Unrelated, but I am literally listening to Rolandskvadet right now and reading your username was a trip

>>pessim+iD2
> If the spec they hand to you has to be so specific that you don't have to think while implementing it

Does this exist in 2026? I feel like, at least in my bubble, expectations on individual developers have never been higher. I feel like the cut has already been made.

>>kwanbi+Ro2
You know how to add logic on the outside of a function, by putting that function into a larger one and calling the function in the middle.

However, how do you inject logic INTO the middle of a function?

Say you have a function which can iterate over any list and given a condition do a filter. How do you inject the condition logic into that filter function?

In the C days you would use a function pointer for this. C++ introduced templating so you could do this regardless of type. Lambdas make the whole process more ergonomic: it's just declaring a one-shot function in place with some convenient syntax.

In Rust, instead of the full-blown

fn filter_condition(val: ValType) -> bool { /* logic */ }

I can declare a function in place with |val| { /* logic */ }. The lambda is just syntactic sugar to make your life easier.
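The same idea as a runnable Python sketch, in case the Rust fragments are hard to picture. `keep_if` and `is_even` are names made up for this example; `keep_if` plays the role of the generic filter, and the condition is the logic being injected into the middle of it.

```python
from typing import Callable


def keep_if(items: list[int], condition: Callable[[int], bool]) -> list[int]:
    """Generic filter: the caller injects the condition logic."""
    return [item for item in items if condition(item)]


# Full-blown named function, comparable to the function-pointer style.
def is_even(value: int) -> bool:
    return value % 2 == 0


numbers = [1, 2, 3, 4, 5, 6]

print(keep_if(numbers, is_even))                    # [2, 4, 6]
print(keep_if(numbers, lambda value: value > 3))    # [4, 5, 6]
```

Both calls do the same kind of thing; the lambda just saves you from naming and declaring a function you will only use once.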

>>deepsq+(OP)
Language is not the problem; what matters is clear intent, along with a direction of action and a defined, not implied, subject.

Consider:

"Eat grandma if you're hungry"

"Eat grandma, if you're hungry"

"Eat grandma. if you're hungry"

Same words and entirely different outcome.

Pseudo code to clarify:

[Action | Directive - Eat] [Subject - Grandma] [Conditional of Subject - if hungry]

>>embedd+Zo2
It depends on what you mean by a language. If JSON then yes. There are ways to implement CRDT by layering on top of JSON (see Automerge) but the result is really far from idiomatic readable JSON people expect to see.

RDX is more like CRDT JSON DOM in fact, not just JSON+. If that makes sense.

>>deepsq+(OP)
Opus 4.5 is very good at using TLA+ specifications to generate code.

>>kwanbi+Ro2
They're just functions that you define inside another "outer" function's local scope, and depending on the language they will/can also work as closures, meaning they capture the values of variables in the outer function as well.

I don't know the syntax of Kotlin, but lambda functions are generally useful to pass as parameters to other, generic/higher-order functions. For example, if you have a sort function `sort(listofitems, compareitems_fn)` you could write the `compareitems_fn` as a lambda directly in-line with the call to sort().
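Here is that sort example in Python rather than Kotlin, as a sketch. `compare_by_length` and `longer_than` are made-up names; `functools.cmp_to_key` is the standard library adapter that lets `sorted` take a two-argument comparison function.

```python
from functools import cmp_to_key

words = ["pear", "fig", "banana"]


# Named comparison function: negative, zero, or positive, like C's qsort.
def compare_by_length(a: str, b: str) -> int:
    return len(a) - len(b)


print(sorted(words, key=cmp_to_key(compare_by_length)))
# The same comparison written in-line as a lambda.
print(sorted(words, key=cmp_to_key(lambda a, b: len(a) - len(b))))


# Closure: the lambda captures `min_length` from the enclosing function.
def longer_than(min_length: int):
    return lambda word: len(word) > min_length


is_long = longer_than(3)
print([w for w in words if is_long(w)])
```

The first two calls print the same ordering, and `is_long` still remembers `min_length` after `longer_than` has returned, which is the "capturing" part.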

>>keepam+zO
Syntax != semantics. The LLM being able to adhere to syntax is one thing, the LLM picking up various language-isms is another. In Python, you don't want to see

    for i in range(len(arr)): 
        something(arr[i])

because the pythonic way is simply:

    for i in arr:
        something(i)

or even:

    [something(i) for i in arr]

The first version is absolutely syntactically correct, but terrible Python. How do you teach the LLM that?

Bugs don't come from syntax errors. If you've got a syntax error, it doesn't compile/fails to run entirely. So we're not talking about the LLM learning the syntax, I'm asking about the LLM learning the deeper semantics of lanng.