Being a bit hand-wavy with it: It’s akin to torrenting music/movies. The torrented files are lossy compressed representations of the original waveform from the music producer. Limewire, or Pirate Bay, or whatever provide interface to retrieve them (download or stream). The model weights are a form of lossy compression, and inference is like a document retrieval.
One may say, “it’s like an employee working at company X, then going to work at company Y, they retain their knowledge and experience.” I would say it’s more like, employee going from X to Y, but retaining audio and video recordings of all interactions he had, notes, documents, and other proprietary info and bringing it to company Y.