Note that a straightforward universal machine for the lambda calculus can be orders of magnitude smaller than for Turing machines [1].
[1] https://gist.github.com/tromp/86b3184f852f65bfb814e3ab0987d8...
For encoding size, it would be optimal if index i occurs roughly with frequency 2^-i. In many lambda terms of practical interest, one does see higher indices occurring much less frequently, so it's not terribly far from optimal. Some compression is certainly possible; within n binding lambdas, index n could be encoded as 1^n instead of 1^n 0, but again that severely complicates the interpreter itself.