I feel like there is a way to get around this: train the LLM on as much material as possible (books, newspapers, crawled websites, etc.) so it becomes good at reasoning and next-token prediction, but restrict it to answering only from reference files the user uploads at the time of asking.
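That pattern is basically retrieval-augmented generation (RAG) with strict grounding: the weights supply general reasoning ability, while the prompt confines the model to the uploaded documents. A minimal sketch of the prompt-assembly side (the function name and file-dict format here are hypothetical, not from any particular library):

```python
# Sketch: restrict an LLM to user-supplied reference files by embedding
# them in the prompt and instructing the model to answer only from them.

def build_grounded_prompt(question: str, reference_files: dict[str, str]) -> str:
    """Assemble a prompt limited to the supplied documents.
    `reference_files` maps filename -> file contents (hypothetical format)."""
    context = "\n\n".join(
        f"--- {name} ---\n{text}" for name, text in reference_files.items()
    )
    return (
        "Answer ONLY using the reference material below. "
        "If the answer is not in the material, say you don't know.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "What year was the warranty issued?",
    {"warranty.txt": "This warranty was issued in 2021 and covers parts only."},
)
print(prompt)
```

In practice the instruction alone isn't a hard guarantee (models can still draw on trained knowledge), so real systems often add a verification step that checks the answer appears in the supplied context.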