The New York Times is suing OpenAI and Microsoft for copyright infringement

>>ssgodd+(OP)
I hope this results in Fair Use being expanded to cover AI training. This is way more important to humanity's future than any single media outlet. If the NYT goes under, a dozen similar outlets can replace them overnight. If we lose AI to stupid IP battles in its infancy, we end up handicapping probably the single most important development in human history just to protect some ancient newspaper. Then another country is going to do it anyway, and still the NYT is going to get eaten.

>>solard+Aj
Why can't AI at least cite its source? This feels like a broader problem, nothing specific to the NYTimes.

Long term, if no one is given credit for their research, either the creators will start to wall off their content or not create at all. Both options would be sad.

A humane attribution comment from the AI could go a long way - "I think I read something about this <topic X> in the NYTimes <link> on January 3rd, 2021."

It appears that without attribution, long term, nothing moves forward.

AI loses access to the latest findings from humanity. And so does the public.

>>aantix+1l
If you're going to consider training ai as fair use, you'll have all kinds of different people with different skill levels training ais that work in different ways on the corpus.

Not all of them will have the capability to cite a source, and plenty of them won't have it make sense to cite a source.

Eg. Suppose I train a regression that guesses how many words will be in a book.

Which book do I cite when I do an inference? All of them?

>>8note+sq
Any citation would be a good start.

For complex subjects, I'm sure the citation page would be large, and a count would be displayed demonstrating the depth of the subject[3].

This is how Google did it with search results in the early days[1]. Most probable to least probable, in terms of the relevancy of the page. With a count of all possible results [2].

The same attempt should be made for citations.

>>aantix+Vr
Ok, now please cite the source of this comment you just made. It's okay if the citation list is large, just list your citations from most probably to the least probable.

>>jquery+dy
"Now displaying 3 citations out of ~150,000,000.."

[1] http://web.archive.org/web/20120608192927/http://www.google....

[2] https://steemit.com/online/@jaroli/how-google-search-result-...

[3] https://www.smashingmagazine.com/2009/09/search-results-desi...

[4] Next page

:)

zlacker