zlacker

The shady world of Brave selling copyrighted data for AI training
1. throwa+ee 2023-07-15 13:41:30
>>rand0m+(OP)
I think this title is overstated. It seems like Brave is trying to do the right thing here vs other companies that don't even make the attempt. (Also, crawling as a service has been a thing for a while.)
◧◩
2. jsnell+yh 2023-07-15 14:04:25
>>throwa+ee
> It seems like Brave is trying to do the right thing here vs other companies that don't even make the attempt

I feel like I'm missing something. What the article claims they're doing is:

1. Misrepresenting what rights they have, and selling access to those rights.

2. Stealth-crawling the web, hiding from the webmasters just how much Brave is crawling their site, and making it impossible to block just their crawler.

How is either of these the right thing? I mean, for somebody besides Brave. What "attempt" are they making that other companies aren't?

◧◩◪
3. throwa+Di 2023-07-15 14:11:21
>>jsnell+yh
I think the first one seems to be a case where Brave just has incomplete information about licensing, so for the Wikipedia data and other CC-licensed content they need to provide a link.

The second doesn't seem like a problem to me as long as they respect robots.txt.

◧◩◪◨
4. skille+9l 2023-07-15 14:27:25
>>throwa+Di
I think you're missing the point. This is one example that uses a specific license; there are countless other licenses.

And you don't seem to have read the article either, because clearly it was explained that they don't respect robots.txt because they have no user-agent.
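
To make the user-agent point concrete, here is a minimal sketch of why a crawler without its own user-agent token can't be singled out in robots.txt. It uses Python's standard `urllib.robotparser`; the `ExampleBot` name, the robots.txt contents, and the `allowed` helper are all made up for illustration, and it says nothing about how any real crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str = "https://example.com/page") -> bool:
    """Return True if ROBOTS_TXT permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A crawler that announces a distinct token matches its own rule group
# and is blocked.
print(allowed("ExampleBot/1.0"))

# A crawler sending a generic, browser-like string only matches the "*"
# group, so the "ExampleBot" rule never applies to it.
print(allowed("Mozilla/5.0"))
```

The only rule that would reach the second crawler is one under `User-agent: *`, and that rule applies to every other crawler that honors it too.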
