I feel like I'm missing something. What the article claims they're doing is:
1. Misrepresenting what rights they have, and selling access to those rights.
2. Stealth-crawling the web, hiding from webmasters just how much Brave is crawling their sites, and making it impossible to block just their crawler.
How is either of these the right thing? I mean, for somebody besides Brave. What "attempt" are they making that other companies aren't?
The second doesn't seem like a problem to me as long as they respect robots.txt.
The Wikipedia example is glaring. They’re scraping content, stripping attribution and reselling it with a right to lock it down in a way that is not allowed by the original license.
Brave is laundering copyleft content while lying to their customers by selling a license they can’t give. If you’d like, you can sidestep the morality of copyright entirely and focus on the plagiarism and fraud.
But your original claim wasn't just "Brave are technically not doing anything illegal" or "they're no worse than the others". It was praising them for being better than the others, saying they're the only ones trying to do the right thing. And for these examples it's just not true: they're outright worse than the industry standard.
So, to repeat, what makes you think that "Brave is trying to do the right thing while other companies aren't even attempting"?
And you don't seem to have read the article either, because it clearly explains that they don't respect robots.txt: their crawler has no user-agent of its own for a site to block.
(The fact that they include the original URL does not change much, given that they explicitly market it as "Data for AI" and those systems never have attribution.)
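To make the robots.txt point concrete, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleBot` name, and the `allowed` helper are all made up for illustration; nothing here is taken from Brave or from the article. It only shows why per-crawler rules depend on the crawler announcing a name of its own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the site owner wants to block one named crawler
# and allow everyone else. "ExampleBot" is an invented name.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str = "https://example.com/page") -> bool:
    """Return True if robots.txt permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A crawler that identifies itself is matched by its own rule group.
print(allowed("ExampleBot"))

# A crawler sending a generic browser-style string falls through to the
# wildcard group, so the site has no rule that targets it specifically.
print(allowed("Mozilla/5.0 (X11; Linux x86_64)"))
```

The only way for the site in this sketch to keep out the second crawler is to tighten the `*` group, which shuts out every other unnamed client along with it.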