The lack of empathy is incredibly depressing...
a) the panic is entirely misguided and rests on two wrong assumptions. The first is that textual input, treating the model as a function (command in -> result out), is sufficient for serious work. No, this is a fundamentally deficient way to give artistic direction, further handicapped by primitive models and weak compute. Text alone is a toy; the field will only grow more complex and technically involved, just as 3D CGI did, because if you don't use every trick available, you're missing out. The second wrong assumption is that it's going to replace anyone, rather than make many people re-learn a new tool and produce what was previously unfeasible due to the amount of mechanistic work involved. This second assumption stems from a fundamental misunderstanding of the value artists provide, which is conceptualization, even in a seemingly routine job.
b) the panic is entirely blown out of proportion by social media. Most people have neither the time nor the desire to actually dive into this tech and find out what works and what doesn't. They just believe that a magical machine steals their work to replace them, because that's what everyone reposts on Twitter endlessly.
> A small amount of actual artists
It's extremely funny that you say this, because taking a look at the Trending on Artstation page tells a different story.
And ironically, the overwhelming majority of the knowledge these models use to produce pictures that superficially resemble their work (usually not at all) doesn't come from artworks at all. It's as simple as that. They are mostly trained on photos, which constitute the bulk of the models' knowledge about the real world and are the main source of coherency. Artist names and keywords like "trending on artstation" are just easily discoverable, very rough handles into pieces of the models' memory.
Can SD create artistic renderings without actual art being incorporated? Just from photos alone? I don't believe so, unless someone shows me evidence to the contrary.
Hence, SD necessitates having artwork in its training corpus in order to emulate style, no matter how little of it is represented in the training data.
Style transfer combined with the overall coherency of pre-trained models is where the real power lies. "Country house in the style of Picasso" is generally not how you use this at full power, because "Picasso" is a poor descriptor for particular memory coordinates. You type "Country house" (a generic descriptor it knows very well) and provide your own embedding, or any kind of finetuned addon, to precisely lean the result towards the desired style, whether constructed by you or by anyone else.
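To make the "embedding as a style handle" idea concrete, here's a toy sketch of the textual-inversion concept. Everything here is illustrative, not a real Stable Diffusion API: the point is just that a learned style is one extra vector appended to the prompt encoder's embedding table, addressed by a placeholder token instead of a vague artist name.

```python
import numpy as np

# Toy model vocabulary: token -> row index into an embedding table.
EMB_DIM = 8
vocab = {"country": 0, "house": 1, "in": 2, "style": 3}
embeddings = np.random.default_rng(0).normal(size=(len(vocab), EMB_DIM))

def add_learned_concept(token, vector, vocab, embeddings):
    """Register a learned embedding under a new placeholder token.

    In real textual inversion this vector is optimized against a few
    example images; here it's just a stand-in constant.
    """
    vocab = {**vocab, token: len(vocab)}
    embeddings = np.vstack([embeddings, vector])
    return vocab, embeddings

def encode_prompt(prompt, vocab, embeddings):
    """Look up each token's vector - the conditioning the model would see."""
    return np.stack([embeddings[vocab[t]] for t in prompt.lower().split()])

# The "style" learned from a handful of examples becomes one extra row:
style_vec = np.full(EMB_DIM, 0.5)  # placeholder for an optimized embedding
vocab, embeddings = add_learned_concept("<my-style>", style_vec, vocab, embeddings)

cond = encode_prompt("country house in <my-style> style", vocab, embeddings)
print(cond.shape)  # 5 token vectors, one of them the learned style
```

The generic part of the prompt ("country house") hits concepts the model already knows well, while `<my-style>` points at exactly the memory coordinates you trained, which is far more precise than hoping an artist's name lands anywhere useful.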
So, even if you believe this thing will drive artists out of their jobs, removing their works from the training set will change very little: the model will still be able to pick up a style from a few examples, finetuned on a consumer GPU. And that's only the current generation of such models and tools (which, admittedly, doesn't yet pass the quality/controllability threshold required for serious work).