zlacker

> simply because it has access to so much more code/algorithms/examples then my language alone

I assumed that is what was catered for with "even when controlling for the size of the training set".

I.e. assuming I am reading it right: That it is better to get the same data as 25% in 4 languages, than 100% in one language.