zlacker

1. mnd999+(OP)[view] [source] 2023-05-31 18:13:05
They don't want people training LLMs with their data without paying for it. Blame the AI bros for stealing all their data.
replies(3): >>ethbr0+b1 >>therea+G1 >>adrian+NC
2. ethbr0+b1[view] [source] 2023-05-31 18:17:33
>>mnd999+(OP)
It's not "their" data, in the same way that last-mile network access isn't "their" (the telcos') pipes.

If value in a platform comes from third parties choosing to use the service, and those third parties are free to use alternatives, then platforms should be very careful about how greedy they get in exploiting their users.

Most of the platform value actually comes from future, continued use.

replies(2): >>toomuc+z4 >>mnd999+Dv
3. therea+G1[view] [source] 2023-05-31 18:18:53
>>mnd999+(OP)
But I wonder if this really achieves that goal. How do they solve the unsolved problem of allowing scraping for search / SEO (Google, Bing, etc.) while keeping those same companies from training their LLMs on it?

It's obvious, or at least an open secret, that Alphabet/Google and Microsoft will use their crawled copies of the web to train their AI.

◧◩
4. toomuc+z4[view] [source] [discussion] 2023-05-31 18:29:07
>>ethbr0+b1
> Most of the platform value actually comes from future, continued use.

OpenAI should start a clone, make it nice, and train their LLMs off of it. If discussion boards have immense future value from hosting humans interacting, clearly the cost of hosting them is worth it.

◧◩
5. mnd999+Dv[view] [source] [discussion] 2023-05-31 20:12:33
>>ethbr0+b1
Maybe. Legally it probably is their data, but your point is still correct: they only have it because people choose to give it to them.

But it doesn’t matter what is, it matters what they think and they’ve got AI cash fomo.

replies(2): >>renlo+FA >>tivert+ZI
◧◩◪
6. renlo+FA[view] [source] [discussion] 2023-05-31 20:33:24
>>mnd999+Dv
It is their data but your copyrighted work :P

Seems to me this is more about them trying to push ads on people; apps like Apollo don't serve ads (or at least, as a long-time Apollo user, I've never seen any). I think this has been in the works for a long time, since before all of the LLM buzz.

7. adrian+NC[view] [source] 2023-05-31 20:43:23
>>mnd999+(OP)
If that's the real issue, then offer two licenses: one that allows you to use the data to train an AI, and another that doesn't and says that if you do, they will permaban your API access and sue your pants off.

Third-party client apps can keep doing what they do, knowing that attempting to use the data to train an AI would destroy their business forever. Companies that want to train an AI can use the other license and pay big stacks of money.

◧◩◪
8. tivert+ZI[view] [source] [discussion] 2023-05-31 21:17:29
>>mnd999+Dv
> But it doesn’t matter what is, it matters what they think and they’ve got AI cash fomo.

Aren't they closing the barn door after the horse has gotten out? Literally all their data from 2005 to March 2023 is still available via torrent.
