zlacker

Perhaps the lambda calculus proper is the oldest, but the choices made in encoding the binary lambda calculus are somewhat arbitrary: for instance, why use de Bruijn indices for symbols, instead of some other scheme, such as the index of the lambda in the string? And why encode de Bruijn indices in unary, and not in any other 1-prefixed self-delimiting format? And why use a self-delimiting format in the first place? Obviously, these choices are quite reasonable and make for a simple implementation, but they're totally distinct from the formalism that Church first described. (And for that matter, Church originally only used lambda notation in the context of an overarching logical formalism that later got struck down.)

Indeed, if we look at Church's paper, we'd find that the term is 80 symbols written fully, {λt[{{{{{t}(t)}(λh[λf[λn[{{{n}(h)}(f)}(n)]]])}(t)}(t)}(t)]}(λf[λx[{f}({f}(x))]]), or 44 symbols when abbreviated, {λt t(t, (λh λf λn n(h, f, n)), t, t, t)}(λf λx f(f(x))), which aren't too impressive given the alphabet of 11 or 12 symbols.

replies(2): >>tromp+66 >>chrisw+oc

>>Legion+(OP)
> but the choices made in encoding the binary lambda calculus are somewhat arbitrary

I made the simplest choices I could that do not waste bits.

> And why use a self-delimiting format in the first place?

Because a lambda term description has many different parts that you need to be able to separate from each other.

> And why encode de Bruijn indices in unary

I tried to answer that in more detail in this previous discussion [1].

[1] >>37584869

>>Legion+(OP)
> why use de Bruijn indices for symbols, instead of some other scheme, such as the index of the lambda in the string?

It's pretty natural to use de Bruijn indexes, since their meaning is local and their composition matches the language semantics (they have a denotational semantics, which gives expressions an intrinsic meaning). For example, with de Bruijn indices the expression `λ0` is always the identity function. If we tried a different scheme, like "the index of the lambda in the string", then expressions like `λ0` would have no intrinsic meaning; it depends on what "string" they occur in, which requires some concrete notion of "the string", which complicates any interpreter, etc.

Whenever I see an implementation of lambda calculus (or some derivative) which doesn't use de Bruijn indexes, I require some justification to convince me why not :) (There are such justifications, e.g. performance optimisations, use of a host language's naming facilities, etc.; but I've can't think of any reason a self-contained, minimalist approach should avoid them)

> why use a self-delimiting format in the first place?

There are various technical reasons to prefer self-delimiting languages when working in algorithmic information theory. For example, prefix-free (binary) encodings are paths through a (binary) tree, with programs as leaves. This lets us assign well-defined probability measures to the set of all programs (assign 1 to the root node, and divide each node's probability between its (two) children). It's also more natural when programs are executed before they're fully known; e.g. reading one bit at a time on-demand (say, generated by a coin toss)

replies(1): >>Legion+jz

>>chrisw+oc
Yeah, I know that de Bruijn indices and self-delimiting formats make a lot of sense, and are probably among the most natural options. My questions were mostly rhetorical; my main point is to challenge the idea that BLC and its transition function in particular should be considered as a fundamental measure of complexity, since it clearly incorporates a fair few design decisions external to the lambda calculus itself.