I think this theorem applies well to modern LLMs: a large language model with pre-computed weights can be used to run very complex algorithms that approximate human knowledge, algorithms that would otherwise have been impossible, or would have taken many orders of magnitude more compute, to carry out.
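To make the point concrete, here is a minimal sketch (my own illustration, not part of the theorem): loading a checkpoint of pre-computed weights and running a single cheap forward pass, where the compute that produced those weights was many orders of magnitude larger than the inference itself. It assumes the Hugging Face `transformers` library is installed; the model name and prompt are just placeholders.

```python
# Minimal sketch: reuse pre-computed weights instead of recomputing knowledge.
# Assumes `pip install transformers torch`; "gpt2" and the prompt are illustrative.
from transformers import pipeline

# The checkpoint holds the "pre-computed weights": the enormous compute spent
# during pre-training is already baked in and paid for.
generator = pipeline("text-generation", model="gpt2")

# A single forward pass is cheap compared to the compute that produced the
# weights, yet it taps the knowledge approximated during training.
result = generator("The sky appears blue because", max_new_tokens=20)
print(result[0]["generated_text"])
```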