zlacker

[parent] [thread] 14 comments
1. batch1+(OP)[view] [source] 2023-12-27 14:18:34
> The New York Times is suing OpenAI and Microsoft over claims the companies built their AI models by “copying and using millions” of the publication’s articles and now “directly compete” with the outlet’s content.

Millions? Damn, they can churn out some content. 13 million[0]!

[0] https://archive.nytimes.com/www.nytimes.com/ref/membercenter....

replies(2): >>pavlov+F1 >>laborc+h2
2. pavlov+F1[view] [source] 2023-12-27 14:27:43
>>batch1+(OP)
The NYT publishes about 200 pieces of journalism every day (according to their own website), and it was founded in 1851. That makes for a lot of articles.
replies(3): >>engine+P2 >>midasu+5d >>naltro+Ii
3. laborc+h2[view] [source] 2023-12-27 14:30:58
>>batch1+(OP)

  “Through Microsoft’s Bing Chat (recently rebranded as “Copilot”) and OpenAI’s ChatGPT, Defendants seek to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment,” the lawsuit states.
I can't be the only one that sees the irony of this news being "reported" and regurgitated over dozens of crappy blogs.

  ChatGPT [..] “can generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style.”
If the NYT thinks that GPT-4 is replicating their style then [as anybody who has tried to do creative writing work with GPT-4 can testify to] they need to fire all their writers.
replies(4): >>Aurorn+h3 >>single+l6 >>jprete+p6 >>indymi+D9
◧◩
4. engine+P2[view] [source] [discussion] 2023-12-27 14:34:48
>>pavlov+F1
(2023 - 1851) * 365 * 200 = 12,556,000

Yep, so a few million ripped off articles is plausible.
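
The back-of-the-envelope estimate above can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_articles` is made up here, and it assumes the 200-per-day figure quoted upthread held for the paper's whole history, which is probably generous for the early decades.

```python
def estimate_articles(start_year: int, end_year: int, per_day: int,
                      days_per_year: int = 365) -> int:
    """Rough article count, assuming a constant daily output (an assumption)."""
    years = end_year - start_year
    return years * days_per_year * per_day


# Founding year and daily output as quoted in the parent comments.
total = estimate_articles(1851, 2023, 200)
print(f"{total:,}")  # 12,556,000
```

Leap days are ignored, so treat the result as an order-of-magnitude figure rather than a count.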

replies(1): >>zozbot+q3
◧◩
5. Aurorn+h3[view] [source] [discussion] 2023-12-27 14:37:00
>>laborc+h2
> More on-topic: if the NYT thinks that GPT-4 is replicating their style then [as anybody who has tried to do creative writing work can testify to] they need to fire all their writers.

The complaint isn’t that ChatGPT is imitating New York Times style by default.

The complaint is that you can ask it to write “in the style of New York Times” and it will do so.

I don’t know if this argument has any legal merit, but it’s not as simple as you suggest. It’s the textual parallel to having AI image generators mimic the trademark style of artists. We know it can be done; the question is what it means legally.

replies(2): >>gerald+S4 >>laborc+f9
◧◩◪
6. zozbot+q3[view] [source] [discussion] 2023-12-27 14:37:47
>>engine+P2
Everything from 1851 to 1927 ought to be in the public domain, though. If the goal of training an AI is just "to mimic a style" there are absolutely humongous amounts of text that's totally free of any copyright restrictions.
replies(1): >>hnarn+h4
◧◩◪◨
7. hnarn+h4[view] [source] [discussion] 2023-12-27 14:42:43
>>zozbot+q3
Yes, there are large amounts of public domain text available, but does anyone believe this is a restriction that was imposed when feeding the models?
◧◩◪
8. gerald+S4[view] [source] [discussion] 2023-12-27 14:46:32
>>Aurorn+h3
All AI image generators can produce copyrighted works exactly. The level of modification is often barely more than you would get if you slapped a filter on a copyrighted image in Photoshop.
◧◩
9. single+l6[view] [source] [discussion] 2023-12-27 14:53:47
>>laborc+h2
Is your point that the NYT should sue bloggers? Or that given the existence of bloggers, they should not try to sue Microsoft? Or something else?
◧◩
10. jprete+p6[view] [source] [discussion] 2023-12-27 14:54:03
>>laborc+h2
All those blogs are _also_ violating copyright, so I don't see the irony? One doesn't spend a million dollars suing a defendant with pennies to their name.

I'd also expect the Times style complaint to have merit because it's probably much easier for ChatGPT to imitate the NYT style than an arbitrary style.

replies(1): >>visarg+lY
◧◩◪
11. laborc+f9[view] [source] [discussion] 2023-12-27 15:09:18
>>Aurorn+h3
The writing of the New York Times is so diffuse (they even have their own published style guide!) that it's impossible to make a claim to any "style", as there are undoubtedly millions upon millions of lines of text by authors who have been inspired by the NYT.
◧◩
12. indymi+D9[view] [source] [discussion] 2023-12-27 15:11:08
>>laborc+h2
I'm pretty sure the defense is the "NYT Style and Usage Guide"...
◧◩
13. midasu+5d[view] [source] [discussion] 2023-12-27 15:31:00
>>pavlov+F1
The first 75+ years are no longer in copyright, so it's certainly possible to train on thousands, maybe millions, of NYT articles without concern.
◧◩
14. naltro+Ii[view] [source] [discussion] 2023-12-27 16:03:52
>>pavlov+F1
Copyright in the US persists for 70 years after the author's death.

So the material that is out of copyright would be all content published by anybody who died in the year 1953 or earlier.

If an article published in 1950 still has a living author, the work is still copyrighted.

◧◩◪
15. visarg+lY[view] [source] [discussion] 2023-12-27 19:53:24
>>jprete+p6
> an arbitrary style

..off to try "Gwern style"
