zlacker

[return to "The New York Times is suing OpenAI and Microsoft for copyright infringement"]
1. solard+Aj[view] [source] 2023-12-27 15:53:06
>>ssgodd+(OP)
I hope this results in Fair Use being expanded to cover AI training. This is way more important to humanity's future than any single media outlet. If the NYT goes under, a dozen similar outlets can replace them overnight. If we lose AI to stupid IP battles in its infancy, we end up handicapping probably the single most important development in human history just to protect some ancient newspaper. Then another country is going to do it anyway, and still the NYT is going to get eaten.
◧◩
2. aantix+1l[view] [source] 2023-12-27 16:01:23
>>solard+Aj
Why can't AI at least cite its source? This feels like a broader problem, nothing specific to the NYTimes.

Long term, if no one is given credit for their research, either the creators will start to wall off their content or not create at all. Both options would be sad.

A humane attribution comment from the AI could go a long way - "I think I read something about this <topic X> in the NYTimes <link> on January 3rd, 2021."

It appears that without attribution, long term, nothing moves forward.

AI loses access to the latest findings from humanity. And so does the public.

◧◩◪
3. 8note+sq[view] [source] 2023-12-27 16:32:09
>>aantix+1l
If you're going to consider training ai as fair use, you'll have all kinds of different people with different skill levels training ais that work in different ways on the corpus.

Not all of them will have the capability to cite a source, and plenty of them won't have it make sense to cite a source.

Eg. Suppose I train a regression that guesses how many words will be in a book.

Which book do I cite when I do an inference? All of them?

◧◩◪◨
4. benrow+yJ[view] [source] 2023-12-27 18:16:21
>>8note+sq
Regression is a good analogy of the problem here. If you found a line of best fit for some datapoints, how would you get back the original datapoints, from the line?

Now imagine terabytes worth of datapoints, and thousands of dimensions rather than two.

[go to top]