Wait. The Brave browser sends data about your browsing back to the Brave Search engine? Not just your usage of other search engines, but it also crawls pages on your computer to help build their search index?
Ref: https://github.com/brave/web-discovery-project/blob/main/mod...
Articles 3 and 4 of the EU 'Copyright in the Digital Single Market' directive give data miners quite extensive rights.
Move operations to the EU, train a foundational model, then train a constitutional model based on that.
As much as I hate the upcoming AI regulation, the CDSM is solid.
https://academic.oup.com/grurint/article/71/8/685/6650009 https://eur-lex.europa.eu/eli/dir/2019/790/oj
Update: Fixed wrong link
"Brave doesn’t follow the sneaky practices of other big tech search engines. The Web Discovery Project is opt-in, and the data collected under the Web Discovery Project has specific protections to ensure anonymity." per https://support.brave.com/hc/en-us/articles/4409406835469-Wh...
The Supreme Court hasn’t ruled on a software case like this, as far as I know. But given the recent 7-2 decision against Andy Warhol’s estate for his copying of photographs of Prince, this doesn’t seem like a Court that’s ready to say copying terabytes of unlicensed material for a commercial purpose is OK.
I’m going to guess this ends with Congress setting up some kind of clearinghouse for copyrighted training material: You opt in to be included, you get fees from OpenAI when they use what you added. This isn’t unprecedented: Congress set up special rules and processes for things like music recordings repeatedly over the years.
https://scholarship.law.edu/cgi/viewcontent.cgi?referer=&htt...
Regarding the copyright of returned material, here is a good discussion:
https://copyrightblog.kluweriplaw.com/2023/05/09/generative-...
[0] https://blogs.opera.com/africa/2022/05/free-data-with-opera-...
[1] https://www.androidpolice.com/2020/01/21/opera-predatory-loa...
And AI training is extremely legible. This is not like a bunch of people downloading stuff off BitTorrent. All of the large foundation models we use were trained by a large corporation with a source of venture capital funding which could be easily shut off by a sufficiently motivated government. Weights-available and liberally licensed models exist, but most improvements on them are fine-tuning. Anonymous individuals can fine-tune an LLM or art generator with a small amount of data and compute, but they cannot make meaningful improvements on the state of the art.
So our sufficiently motivated copyright judge could at least effectively freeze AI art in time until Big Tech and the MAFIAA agree on how to properly split the proceeds from screwing over individual artists.
"Butlerian Jihad" is a term from a book, so you don't need to take "jihad" literally. However, I will point out that there is a significant fraction of the population that does want to see AI permanently banned from creative endeavors. The loss of ownership over their work from having it be in the training set is a factor, but their main argument is that they specifically want to keep their current jobs as they are. They do not want to be replaced with AI, nor do they want to replace their existing drawing work with SEO keyword stuffed text-to-image prompts.
https://brave.com/firewall-vpn/ https://account.brave.com/?intent=checkout&product=search https://brave.com/search/api/