So there is probably a big pile of Reddit comments, Twitter messages, and LibGen and arXiv PDFs, I imagine
So there is some shit, but also painstakingly encoded knowledge (i.e. writing), and yeah, it is miraculous that LLMs are right as often as they are