Note that a straightforward universal machine for the lambda calculus can be orders of magnitude smaller than for Turing machines [1].
[1] https://gist.github.com/tromp/86b3184f852f65bfb814e3ab0987d8...
For encoding size, it would be optimal if index i occurs roughly with frequency 2^-i. In many lambda terms of practical interest, one does see higher indices occurring much less frequently, so it's not terribly far from optimal. Some compression is certainly possible; within n binding lambdas, index n could be encoded as 1^n instead of 1^n 0, but again that severely complicates the interpreter itself.
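For concreteness, here is a minimal Python sketch of the encoding being discussed: 00 for a lambda, 01 for an application, and 1^i 0 for De Bruijn index i. The class names and the `drop_terminator` flag are made up for illustration, and the sketch assumes closed terms with 1-based indices. The flag switches on the shortcut of writing index n as 1^n when exactly n lambdas are in scope.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    index: int  # De Bruijn index, 1-based (assumption of this sketch)


@dataclass(frozen=True)
class Lam:
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def encode(term: Term, depth: int = 0, drop_terminator: bool = False) -> str:
    """Return the bit string for a closed term.

    depth counts the lambdas currently in scope. drop_terminator is a
    hypothetical flag that omits the trailing 0 when index == depth.
    """
    if isinstance(term, Lam):
        return "00" + encode(term.body, depth + 1, drop_terminator)
    if isinstance(term, App):
        return ("01"
                + encode(term.fn, depth, drop_terminator)
                + encode(term.arg, depth, drop_terminator))
    if not 1 <= term.index <= depth:
        raise ValueError("term is not closed")
    ones = "1" * term.index
    # At the maximum possible index no further 1 can follow, so the
    # terminating 0 carries no information and may be dropped.
    if drop_terminator and term.index == depth:
        return ones
    return ones + "0"


identity = Lam(Var(1))
k_combinator = Lam(Lam(Var(2)))
print(encode(identity), encode(identity, drop_terminator=True))
print(encode(k_combinator), encode(k_combinator, drop_terminator=True))
```

With the flag on, the identity shrinks from 0010 to 001, but a decoder then has to track the binding depth to know where a run of 1s ends, which is the extra interpreter complexity mentioned above.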
I noticed that some of the program lengths ended up in expressions of lower and upper bounds. Also, lambda terms represented with De Bruijn indices are essentially lists of numbers, and a binary encoding could give an exponentially shorter representation than a unary encoding, at the price of some overhead for handling the binary numbers, which I thought might be a constant.
But admittedly I did not read the page too carefully and would probably need to refresh my knowledge to properly understand the details. The programs there are also mostly short, so a binary encoding would probably make them longer. And if it really mattered, then this is of course such an obvious thing to do that it would certainly not have been overlooked.
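To put rough numbers on the unary versus binary trade-off, here is a small Python sketch comparing bits per index. The binary side is purely an assumption for illustration: a 1 tag bit (to keep variables distinct from the 00 and 01 prefixes) followed by the Elias gamma code of the index. Nobody in the thread proposed this particular scheme, and the function names are made up.

```python
def unary_bits(i: int) -> int:
    """Bits for index i written as 1^i 0."""
    return i + 1


def tagged_gamma_bits(i: int) -> int:
    """Bits for a hypothetical scheme: tag bit '1' + Elias gamma code of i.

    Elias gamma uses 2*floor(log2 i) + 1 bits for i >= 1.
    """
    return 1 + 2 * (i.bit_length() - 1) + 1


# Print the two code lengths side by side for small indices.
for i in range(1, 17):
    print(i, unary_bits(i), tagged_gamma_bits(i))

# Smallest index at which the binary scheme is strictly shorter.
first_win = next(i for i in range(1, 100)
                 if tagged_gamma_bits(i) < unary_bits(i))
print("binary first wins at index", first_win)
```

Under these assumptions the binary code is never shorter for the small indices that dominate short programs, and only starts to pay off for deeper ones, where it grows logarithmically instead of linearly.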