
Funnily enough, and not so coincidentally, this has been well known in practice by...drumroll please...America's greatest innovators, the Adult Entertainment Hobbyists.

It doesn't give order-of-magnitude benefits in enabling smaller models; I'd wager not even 50%. But you nailed it exactly. Fine-tune on dogs, fine-tune on cats, then...just...average the weights. And you have something better than the original, with minimal loss from fine-tuning.
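
To make the "just average the weights" step concrete, here's a rough sketch in plain Python. It assumes both fine-tunes started from the same base model, so every parameter has the same name and shape. The `average_weights` helper, the toy `dogs` and `cats` checkpoints, and the parameter names are all made up for illustration; real checkpoints would be tensors, not lists.

```python
def average_weights(state_dicts, coeffs=None):
    """Weighted average of same-named parameters across checkpoints.

    state_dicts: list of {param_name: flat list of floats} (hypothetical format).
    coeffs: optional per-checkpoint weights; defaults to a uniform average.
    """
    n = len(state_dicts)
    if n == 0:
        raise ValueError("need at least one checkpoint")
    if coeffs is None:
        coeffs = [1.0 / n] * n
    if len(coeffs) != n:
        raise ValueError("one coefficient per checkpoint")

    names = state_dicts[0].keys()
    for sd in state_dicts:
        if sd.keys() != names:
            raise ValueError("checkpoints must share parameter names")

    merged = {}
    for name in names:
        size = len(state_dicts[0][name])
        # Element-wise weighted sum over all checkpoints.
        merged[name] = [
            sum(c * sd[name][i] for c, sd in zip(coeffs, state_dicts))
            for i in range(size)
        ]
    return merged


# Toy stand-ins for two fine-tunes of the same base model.
dogs = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
cats = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = average_weights([dogs, cats])
print(merged)
```

Passing something like `coeffs=[0.7, 0.3]` instead of the uniform default is the usual knob for leaning the merge toward one fine-tune.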

LoRAs end up being more popular for that use case because they're easier to combine, mix and match, and scale. Model merging is still a key technique for a successful base model.
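
For why LoRAs are easy to mix and scale, a minimal sketch: each adapter is a low-rank update B·A added on top of a frozen base weight, so combining several is just a scaled sum of those updates. This assumes the standard LoRA formulation W' = W + scale·B·A. The `apply_loras` helper and the toy rank-1 adapters are hypothetical, not taken from any particular library.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply_loras(base, adapters):
    """Return base + sum(scale * B @ A) without modifying base.

    base: d_out x d_in matrix.
    adapters: list of (B, A, scale) with B of shape d_out x r and A of shape r x d_in.
    """
    out = [row[:] for row in base]
    for B, A, scale in adapters:
        delta = matmul(B, A)  # low-rank update, same shape as base
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                out[i][j] += scale * value
    return out


base = [[1.0, 0.0], [0.0, 1.0]]
dog_lora = ([[1.0], [0.0]], [[0.0, 2.0]], 0.5)   # rank-1 adapter at strength 0.5
cat_lora = ([[0.0], [1.0]], [[4.0, 0.0]], 0.25)  # rank-1 adapter at strength 0.25

combined = apply_loras(base, [dog_lora, cat_lora])
print(combined)
```

Because the base weights are never touched, adding, dropping, or re-weighting an adapter is just a change to the list passed in, which is most of the appeal over merging full checkpoints.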