If even WWII-era documents are still under copyright, building a model respecting that would be impossible.
"We can't do this legally, so we should be allowed to ignore the law."
If you can't build a model while respecting licenses, don't build a model.
I don't want copyright to exist at any duration, and at the very least I think it should be much shorter than it is. However, as long as it exists, AI should not get any exception to it; such an exception inherently privileges massive entities over small entities or individuals.