People keep saying this without defining what exactly they mean. This is a technical topic, and it requires technical explanations. What do you think "mostly copying" means when you say it?
Because there isn't a shred of original pixel data reproduced from training data through to output data by any of the diffusion models. In fact there isn't enough data in the model weights to reproduce any images at all, without adding a random noise field.
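To put a rough number on that, here is a back-of-envelope sketch. The checkpoint size and training-set size are round figures I'm assuming for illustration, not exact values for any particular model, and the function name is made up for this example.

```python
# Back-of-envelope: how many bytes of model weights exist per training image?
# Both constants are ASSUMED round numbers for illustration only.
CHECKPOINT_BYTES = 4 * 10**9   # assumption: roughly a 4 GB checkpoint
TRAINING_IMAGES = 2 * 10**9    # assumption: roughly 2 billion training images

# Size of one uncompressed 512x512 RGB image (1 byte per channel).
RAW_IMAGE_BYTES = 512 * 512 * 3


def bytes_per_image(checkpoint_bytes: int, training_images: int) -> float:
    """Average number of weight bytes available per training image."""
    return checkpoint_bytes / training_images


budget = bytes_per_image(CHECKPOINT_BYTES, TRAINING_IMAGES)
shortfall = RAW_IMAGE_BYTES / budget

print(f"{budget:.1f} bytes of weights per training image")
print(f"one raw 512x512 image is {shortfall:,.0f}x larger than that budget")
```

Under those assumptions the weights average out to a couple of bytes per training image, which is the scale I have in mind when I say there isn't room to store the images themselves.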
> The benefits of allowing this will be had by a very small group of corporations and individuals
You are also grossly mistaken here. The benefits of heavily restricting this will be had by a very small group of corporations and individuals. See, everyone currently comes around to "you should be able to copyright a style" as the solution to the "problem".
Okay - let's game this out. US copyright lasts for the life of the author plus 70 years. No work copyrighted today will enter the public domain until I am dead, my children are dead, and probably my grandchildren as well. But copyright can be traded and sold. And unlike individuals, who do die, corporations as legal entities do not. And corporations can own copyright.
What is the probability that any particular artistic "style" - however you might define that (whole other topic really) - is truly unique? I mean, people don't generally invent a style on their own - they build it up from studying other sources, and come up with a mix. Whatever originality is in there is more a function of mutation in their ability to imitate styles than anything else - art students, for example, will regularly do studies of famous artists and intentionally try to copy their style as best they can. A huge amount of content tagged "Van Gogh" in Stable Diffusion is actually Van Gogh look-alikes, or content literally labelled "X in the style of Van Gogh". It had nothing to do with the original man at all.
I mean, zero - by example - it's zero. There are no truly original art styles. Which means in a world with copyrightable art styles, all art styles eventually end up as a part of corporate owned styles. Or the opposite is also possible - maybe they all end up as public domain. But in both cases the answer is the same: if "style" becomes a copyrightable term, and AIs can reproduce it in some way which you can prove, then literal "prior art" of any particular style will invariably be an existing part of an AI dataset. Any new artist with a unique style will invariably be found to simply be 95% a blend of other known styles from an AI which has existed for centuries and been producing output constantly.
In the public domain world, we wind up approximately where we are now: every few decades old styles get new words keyed into them as people want to keep up with the times of some new rising artist who's captured a unique blend in the zeitgeist. In the corporate world though, the more likely one, Disney turns up with its lawyers and says "we're taking 70% or we're taking it all".
I disagree that there is no originality in art styles; human creativity amounts to more than just copying other people. There is no way a current gen AI model would be able to create truly original mathematics or physics; it is just able to produce facsimiles and convincing bullshit that looks like it. Before long the models will probably be able to do formal reasoning in a system like Lean 4, but that is a long way off from truly inventive mathematics or physics.
Art is more subtle, but what these models produce is mostly "kitsch". It is telling that their idea of "aesthetics" involves anime fan art and other commercial work. Anyways, I don't like the commercial aspects of copyright all that much, but what I like is humans over machines. I believe in humans freely reusing and building on the work of others, but not in machines doing the same. Our interests are simply not aligned at this point.