If it is legal to simply index a website, then why shouldn't it be legal to train a model on the very same data?
Of course, websites should have some option for declining data mining for ML/AI purposes, in the same way they can decline scraping/indexing in the robots.txt file.
But that ship has kind of sailed, unless the courts decide otherwise.