Been thinking more about Stephen Marche’s piece on generative AI. He compares it, in certain respects, to Hip-Hop. The most salient quote is this one (source):

To make hip-hop, you don’t need to know how to play the drums, but you do need to be able to reference the entire history of beats and hooks. Every producer becomes an archive; the greater their knowledge and the more coherent their understanding, the better the resulting work. The creator of meaningful literary AI art will be, in effect, a literary curator.

I think he’s basically right, but there’s a foreseeable rhetorical danger in comparing generative AI to Hip-Hop, because to many people Hip-Hop is synonymous with sampling: taking a beat or a hook wholesale from another song and repurposing it into something new.

If you say gen AI is like sampling, that opens up potential criticisms around copyright that are – I think – at odds with how the technology actually functions.

My somewhat non-technical understanding of how gen AI works is this: generative AI tools do not take “samples” or clips from other pieces and collage them together. Like an arm from here, an eye from there, lips from another source. That’s just not how it works.

How it works is that many sources are analyzed for their dimensionality (dimensions here being something like attributes or characteristics). This piece is red. That piece depicts a human, this one a sky, this one a puppy peeing on a fire hydrant. From those aggregate measurements of dimensionality, a user’s prompt invokes an entirely new thing that in no way incorporates a “sample” of the original pieces the model was trained on.
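To make that intuition a little more concrete, here’s a very hand-wavy toy sketch in Python. The attribute names, the numbers, and the “generate by sampling from aggregate statistics” step are all invented for illustration; a real model learns thousands of dimensions on its own rather than three hand-labeled ones, and its generation process is far more sophisticated. The point of the sketch is only the shape of the idea: the model keeps measurements about the training works, not the works themselves, and the output is a new point rather than a pasted-in piece.

```python
# Toy sketch of "measure dimensions, then generate something new."
# All attribute names and values below are made up for illustration.
import numpy as np

rng = np.random.default_rng(seed=42)

# Pretend each training work has been "measured" along a few dimensions,
# e.g. (redness, contains_human, contains_sky), each scored 0.0 to 1.0.
training_works = np.array([
    [0.9, 1.0, 0.1],   # a red portrait
    [0.2, 0.0, 0.9],   # a blue sky scene
    [0.7, 1.0, 0.6],   # a person against a sunset
    [0.1, 0.0, 0.2],   # a dim still life
])

# Aggregate the measurements: what gets kept is statistics about the
# training set, not the training works themselves.
mean = training_works.mean(axis=0)
cov = np.cov(training_works, rowvar=False)

# "Generation" here is drawing a brand-new point from the learned
# distribution; no row of training_works is copied into the output.
new_work = rng.multivariate_normal(mean, cov)
print("new point in attribute space:", np.round(new_work, 2))

# The sample is not one of the training rows.
assert not any(np.allclose(new_work, row) for row in training_works)
```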

Okay, then there’s the question of whether or not creators agreed to have their works “measured for dimensionality,” but in the United States that seems to be a moot point, because on appeal, scraping publicly accessible data was deemed legal. So if you want to argue about whether scraping and then measuring what you scraped is “ethical,” the courts have already offered some guidance on the legal side, though there are certainly additional dimensions to the conversation worth interrogating.

Adobe has taken the approach of only training on source images it holds the rights to, and I think that’s a good idea for their particular use case (though Firefly’s image quality is pretty shitty).

Anyway, more to say on this, but other things press for the moment.