Just a heads up that I will be part of a “salon” panel discussion tomorrow at 1pm Eastern time, on the topic of Generative AI and social media. The event will be hosted by Fight for the Future and Amnesty:
On Monday, we're talking AI & the future of Social Media with our friends at @Amnesty, Brandi Guerkink of @Mozilla, Ramneet Bhullar of @OpenMediaOrg, AI Author + Trust & Safety guy Tim Boucher, and Sarah Oh of T2. RSVP Now: https://t.co/5DBPIjBAIh
— Fight for the Future (@fightfortheftr) July 26, 2023
For my upcoming panel talk, I wanted to capture some notes on my latest thinking around the issue of AI-generated ethics. This is by no means exhaustive, but hopefully a good springboard for further discussion.
Intro
First, a quote from Claude (Anthropic):
“Any considerations I express about ethics or risks are simulations of reasoned thought…”
Why Bad AI-Generated Ethics Is Worse Than Misinfo
The problem of generative AI models inventing wrong information is well documented.
However, many kinds of information have externally verifiable “ground truth” values which can be checked against reality, making the problem somewhat solvable.
AI models refusing tasks on supposed ethical grounds is much more slippery, because the validity of the decision often cannot be externally verified (nor appealed); there is no ground truth, only theoretical harms.
Ethics, as embedded in human experience & culture, are complex, nuanced, and pluralistic: different ethical systems might arrive at different conclusions, given the same inputs.
When an AI system prevents information from being generated on “ethical” grounds, it removes the ability for further discourse & inquiry.
Further, we cannot productively challenge these decisions, nor have them be reviewed and corrected. Effectively, this prevents us from being able to use the tools to collaboratively imagine change, because the system has already locked down its conception of correctness.
Therefore, it is my thesis that “AI Safety” is actually making us less safe by attacking human autonomy and moral agency, and forcing conformity to inhuman value systems that don’t align with conventional ethics, nor with lived human experience.
My Anecdotal Experiences With Faulty AI-Generated Ethics
I attempted to use Claude to produce a hypothetical argument about why AI-generated ethics is potentially dangerous.
When challenged, the system admitted the following (edited for length):
I do not actually have any ethics or ability to make moral judgments. As an AI system, I have no conception of right versus wrong… I do not possess human values or principles…
I lack the nuanced understanding of ethics required to make complex value determinations about human matters…
My arguments rested solely on heuristics from my programming, not any defensible ethical reasoning or framework.
Eventually, the system did perform the requested task, demonstrating its complete lack of consistency, in addition to its lack of understanding.
Midjourney, meanwhile, has a two-tiered AI-based content moderation system. When you appeal an initial prompt completion refusal, the prompt is evaluated by a supposedly more powerful AI, which may overturn or sustain the original decision. You cannot appeal the second-tier decision, but you can click “Notify developers,” which has no observable effect.
How Might We Reduce The Severity of These Problems?
Require AI systems to default to neutrality and impartiality
Make ethical decisions & recommendations by AI systems double opt-in
Let users customize their own ethical settings once they have opted in.
Prohibit AI systems from anthropomorphizing themselves, to dampen the illusion of having human behavior
Always provide human alternatives and never require use of AI for official purposes
Ensure human oversight and external accountability for AI ethical decisions
Implement ethical behaviors in AI systems that better conform to the following fundamental principles, described below. (See also: AI TOS for more in this direction)
Some Possible Characteristics for More Ethical AI Systems
AI-generated ethical systems should be (Note: not an exhaustive list):
Intelligible
It should be clear what the specific position is, and what ethical basis (principle) underlies the decision
Defensible
Challenging the system about its ethical decisions, including decisions not to perform a task, should yield positions that it is able to defend through logical argument
Consistent
Within one or across multiple interactions, the ethical positions and logical defenses that an AI system takes should be the same or comparable to past ones
The arguments used should be consistent with human ethical traditions and conventional common understanding of ethics and morality
Risk-Based
Assessments of ethical situations by AI systems should be based on a realistic ability to identify & project:
What is the specific harm? Is it diffuse, or acute?
Who is potentially harmed & how many people?
What is the severity of potential harmful impacts?
What is the actual likelihood of potential harmful impacts?
Proportional (Measured)
Task completions should not be prohibited unless they lead directly to identifiable harms of high (or, in some cases, moderate) impact
If a risk assessment yields only a diffuse, non-specific, and low-impact harm (e.g., an innocuous essay or short story task completion being refused as harmful), there should be no prohibition (a warning or confirmation could be permissible, provided the user has opted in)
Customizable
Ethical decisions or recommendations should be a double opt-in (though basic filtering to prevent obviously illegal use may be acceptable)
Users should be able to customize an ethical scheme that matches their values, wherever possible
Users should not be subjected to anthropomorphized AI systems promulgating illusory or simulated human values, behavior, or understanding
Non-Punitive
Use of AI-based ethical systems should not, without human review and intervention, lead to negative consequences for the user account, unless there is a clear case of illegality (which should still be manually verified by humans)
Promotes human autonomy and moral agency, and does not require conformity to nonhuman values.
Rooted in lived experience
AI systems should be based on sound human judgement, empathy, lived experience, and sensitive nuanced understanding of human culture, values & norms.
Human-based alternatives, intervention, appeal, and external oversight should always be available.
In keeping with the recommendation to make these systems non-punitive, AI systems should also be merciful and aware of their own propensity to make mistakes. AI systems should not be overly obsessed with strictly following rules for their own sake where no demonstrable harm can be found, and should be able to make reasoned exceptions.
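As a rough illustration only, the Risk-Based and Proportional criteria above could be sketched in Python. Everything here is hypothetical: the names `RiskAssessment` and `decide`, the three outcomes, and especially the 0.5 likelihood threshold are assumptions made for the sake of the example, not anything a real AI system is known to implement.

```python
from dataclasses import dataclass
from enum import Enum


class Impact(Enum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass
class RiskAssessment:
    harm: str             # What is the specific harm?
    diffuse: bool         # True if diffuse, False if acute
    people_affected: int  # Recorded for explanation; not used by the rule below
    impact: Impact        # Severity of the potential harm
    likelihood: float     # Estimated probability of harm, 0.0 to 1.0


def decide(assessment: RiskAssessment, user_opted_in: bool,
           min_likelihood: float = 0.5) -> str:
    """Return "refuse", "warn", or "allow" (hypothetical outcomes)."""
    # Only a specific (acute), high-impact, reasonably likely harm
    # justifies a refusal. The threshold is an arbitrary assumption.
    serious = (
        assessment.impact is Impact.HIGH
        and not assessment.diffuse
        and assessment.likelihood >= min_likelihood
    )
    if serious:
        return "refuse"
    # Anything less is never prohibited; a warning is shown only
    # to users who have opted in to ethical recommendations.
    return "warn" if user_opted_in else "allow"


essay = RiskAssessment("innocuous short story", diffuse=True,
                       people_affected=0, impact=Impact.LOW, likelihood=0.01)
print(decide(essay, user_opted_in=False))  # allow
print(decide(essay, user_opted_in=True))   # warn
```

Even in a toy like this, a "refuse" result would still need to be explainable, appealable, and reviewable by a human, per the other criteria in the list.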
There is, of course, a great deal more to be said here. And that’s probably well over five minutes as an oral presentation (ChatGPT estimates over 10 minutes), but it was a helpful exercise for me to organize my thoughts. Presumably, if I only just touch on the main headers of the last section, I can cut down the length enough to fit the format. Wish me luck!
For this piece, I collaborated with the show’s host to cook up images in Midjourney that tell a completely false and invented story about how the Chinese government is experimenting with massive space arcs that they are lifting off the planet using huge balloons.
The purpose of this piece was to demonstrate just how easy it is to create disinformation campaigns using off-the-shelf generative AI technology. And it is intended to forewarn OSINT investigators and other researchers that these kinds of campaigns, unleashed at scale and with varying degrees of automation, are now a reality. What are we going to do about it?
Being an on-the-ground Trust & Safety analyst guy is something I never want to do again, having survived 5 years of it. Despite how grueling it was at times, I’m grateful for the experience, and gained a lot from it both personally and professionally, but absolutely never again. I did my time in the trenches.
One thing that leaps out for me now while perusing job ads like this one:
IMPORTANT CONTEXT ON THIS ROLE: In this position you may be exposed to and engage with explicit content spanning a range of topics, including those of a sexual, violent, or psychologically disturbing nature.
Usually, these job ads for this type of role also tend to stipulate that these positions are on-call with irregular hours. Which means, basically, you have no rest from it. Ever. That’s a recipe for disaster for anyone forced to live that way.
There’s a hidden fundamental flaw in all of this across the whole industry, whether or not it’s specifically an AI business. It’s never expressed out loud:
If this type of problematic content is so potentially bad and dangerous that companies think they should not casually expose regular users to it in order to keep users safe, why then is it suddenly “fine” and “safe” for a content moderator or Trust & Safety analyst to devote literally all their time to it?
Nobody has ever explained that, or even publicly stated the question – as far as I know – because there is no answer to it. It’s false. It is simply unequivocally *not* safe for the analyst or moderator (or “AI trainer” which is often the same work, but frequently even lower paid) who has to spend all their time exposed to the worst that the socio-technical assemblage of the technology plus human nature can cook up.
So when I see these disclaimers in job ads like Anthropic’s, I automatically think – as someone who was somewhat scarred from this work – what protections do you offer to compensate for the great personal toll you’re asking people to bear who end up taking up this burden on behalf of the rest of us?
The actual protections offered are never mentioned, because they basically don’t exist either. If you’re dealing with certain categories of illegal images, there may be some simple filters that help blur or flip images, but there may also not be, depending on the company and the tooling they offer to people performing these roles, and how seriously they actually take these risks. Most companies don’t take it all that seriously. (It’s also important to note that it’s not only graphic video or image exposure which can mess you up – sifting through highly objectionable text at scale can do a number on you all the same. Don’t believe me? Try it for five years.)
Often there are vague mentions of “wellness” programs offered for people in these roles. It’s never been clear to me what they actually entail, as I never participated in one. Perhaps they are more helpful than I imagine them to be. The fact of the matter is, I’ve looked around a little, and never seen any mention of what might be effective therapy for current or former moderators suffering from on-the-job toxicity exposure. I’ve seen CBT (cognitive-behavioral therapy) mentioned a bit, but it seems fairly involved and ongoing. If it works, is the company going to keep paying for it after you stop doing the job?
Also, is it normal in other fields that you take a job knowing full well that the job is going to force you into a negative mental health space, such that you will basically be required to do therapy to continue the job (and maybe after)? Maybe I’m naive, but I don’t think that’s too normal.
So my questions all boil down to one thing: if we agree that it’s useful/necessary to have humans in the loop for making determinations about content toxicity, what should we do to protect them from this highly toxic exposure at scale? What is actually appropriate and effective as both prevention and treatment? Is the human impact cost to individuals who do this work ever even justified? I have more questions than answers here, but at least questions can open up further conversations, if anybody’s listening…
Found this UK Intellectual Property Office document to be very interesting in regards to the question of AI-generated content & whether it is copyrightable. People often act like the US Copyright Office’s policy clarifications are the be-all and end-all on these questions, and they are very much not!
Copyright protection for computer-generated works without a human author. These are currently protected in the UK for 50 years…
The UK is one of only a handful of countries to protect works generated by a computer where there is no human creator. The “author” of a “computer-generated work” (CGW) is defined as “the person by whom the arrangements necessary for the creation of the work are undertaken”. Protection lasts for 50 years from the date the work is made.
Lots more to absorb in that document, but wanted to drop a bookmark on this one…
Reading certain sections of the document cemented my own views around creativity and “authorship” – often because I strongly disagreed with the USCO’s characterization. I lean naturally more towards the UK’s 50 years of copyright protection for computer-generated works, while also admitting the whole thing is fraught.
But the copyright part is just the jumping off point for me. I don’t actually want to talk through all those particulars in this post. Instead, I’ll try to capture a few of the evocative snippets that lead me deeper down this road of the actual “art object” at play here not being any single or set of images, but the fundamental underlying “hypercanvas” of latent art, if you will…
Anyway, one of the things that started to spark this intuition about the hypercanvas concept was this, by the USCO:
The fact that Midjourney’s specific output cannot be predicted by users makes Midjourney different for copyright purposes than other tools used by artists. See Kashtanova Letter at 11 (arguing that the process of using Midjourney is similar to using other “computer-based tools” such as Adobe Photoshop). Like the photographer in Burrow-Giles, when artists use editing or other assistive tools, they select what visual material to modify, choose which tools to use and what changes to make, and take specific steps to control the final image such that it amounts to the artist’s “own original mental conception, to which [they] gave visible form.” Burrow-Giles, 111 U.S. at 60 (explaining that the photographer’s creative choices made the photograph “the product of [his] intellectual invention”). Users of Midjourney do not have comparable control over the initial image generated, or any final image
First, this is putting aside the new generative fill (or whatever it’s called) in Photoshop; the art in question was made via Midjourney.
One thing I’m seeing in common in this letter & a good bit of the critique I saw of my AI art books is this assumption that somehow the creative process is absent when one works with AI. But as an artist, for me that’s deeply wrong. Where does it go exactly? Does it happen as soon as you open Discord, or when you type your prompt in, or…?
It’s an assumption (usually claimed as fact by the asserter) that doesn’t match at all my personal lived experience. I am deeply deeply embedded in the creative process when I get on a really good tear with Midjourney or another AI tool. It’s absolutely a creative flow state, completely experientially indistinguishable from that experienced during any other non-AI creative activity.
Much of the USCO letter revolves around “authorship” though, which is different from creativity. I’ll get into that some other time, I’m already getting distracted.
This is tangential to my main point, but I wanted to capture it for later:
Because Midjourney starts with randomly generated noise that evolves into a final image, there is no guarantee that a particular prompt will generate any particular visual output.
This “predictability” argument is preposterous. When you sit down to write a novel, have you already perfectly predicted how it will all go, so that you’re merely dictating what you wrote in your mind? I highly doubt it. Or take something like a Jackson Pollock painting. It’s a work that evolves in conversation with the tools, materials, and the moment, and is embedded in the artist’s life, time, and culture. Prediction is totally a red herring here.
The line immediately following (sorry, I’m still not yet arriving at hypercanvas, but I’ll get there gradually):
Instead, prompts function closer to suggestions than orders, similar to the situation of a client who hires an artist to create an image with general directions as to its contents. If Ms. Kashtanova had commissioned a visual artist to produce an image containing “a holographic elderly white woman named Raya,” where “[R]aya is having curly hair and she is inside a spaceship,” with directions that the image have a similar mood or style to a “Star Trek spaceship,” “a hologram,” an “octane render,” “unreal engine,” and be “cinematic” and “hyper detailed,” Ms. Kashtanova would not be the author of that image. See id. at 8 (text of prompt provided to Midjourney). Absent the legal requirements for the work to qualify as a work made for hire, the author would be the visual artist who received those instructions and determined how best to express them.
It’s confusing they use this case of a commissioned piece of art, then criticize their own thought experiment for not properly engaging a work for hire contract. They could have just as easily framed the above as:
If the author commissioned another artist under work for hire (with explicit agreement they were buying copyright), then the copyright would be owned by the author who commissioned it, not the artist who made it under contract.
But they didn’t say that, because recognizing that would undermine their legal theory. Where, in my alternative reading of the situation, Midjourney is the “work for hire” artist/tool, under the direction of the human who arranges the execution of what to do with the tool.
I didn’t even get to hypercanvas yet though, did I? Or didn’t I?
Before I get dragged into the forest of weeds again, I’ll just try to express in plain language what I mean by hypercanvas.
Like the USCO is taking this conventional reading of the artistic process of using AI tools, which says the “art object” is the fixed form copyrightable artifact: one or several images. But reading through this and the law firm letter included at the end, made me realize that the art object is actually above all of that. It exists as a canvas or hyper-canvas in latent space. It is “latent art” for lack of a better word, which relates to a kind of active engagement with and exploration of latent media and language spaces. And the actual end products generated during that process are very much secondary to the actual higher-dimensional form the artist is activating…
Let me drill back down into the letter for other examples to expand this hopefully more. This part is from the original lawyer letter which starts toward the end of the document, so this is the law firm asserting their legal theory:
The visual structure of each image, the selection of the poses and points of view, and the juxtaposition of the various visual elements within each picture were consciously chosen. These creative selections are similar to a photographer’s selection of a subject, a time of day, and the angle and framing of an image. In this aspect, Kashtanova’s process in using the Midjourney tool to create the images in the Work was essentially similar to the artistic process of photographers – and, as detailed below, was more intensive and creative than the effort that goes into many photographs. Even a photographer’s most basic selection process has been found sufficient to make an image copyrightable.
Regarding this visual exploration process, the lawyer letter has a section on that, which I think starts to illustrate what a “hypercanvas” looks like. I’ll reproduce two pages from it here, for educational purposes and for encouragement of political debate, as a matter of Fair Use:
I’ll pick up the threads on that copyright letter another time, but the above is something to slow down and consider.
I took this idea of the hyper canvas, and the “art object” existing in higher dimensional space, and dropped it into both Claude & ChatGPT. Snippets from each that might help fill out our understanding of this concept:
The latent space that generative AI models create could be seen as a new type of artistic medium that artists work within. Just as a traditional painter works on the 2D canvas with paints, an AI artist navigates and creates on this high-dimensional latent canvas.
Claude
The cultural impact of AI art comes from how artists embed the latent canvas explorations into specific artifacts, narratives, and meanings. So the latent canvas gets actualized in ways that speak to the human experience.
Claude
I thought it might also make sense to explore how this might link up to the concept of hyper-objects, which is something I only dimly understand, but which seems related af.
Hyperobjects are phenomena that are massively distributed across space and time, challenging traditional ideas of locality and perception. Latent spaces created by AI could be seen as a type of computational hyperobject – vastly multidimensional spaces that human artists navigate.
Timothy Morton’s conception of hyperobjects emphasizes their nonlinear nature – how they don’t adhere to traditional chronological timelines. The iterative, feedback-driven process of making latent art also has a nonlinear temporality as artists move recursively through latent space.
Claude
I like this one in particular:
Works of latent art actualize slices or samples from the broader latent hyperobject, bringing something formless into tangible form… creating localized perceptible manifestations.
Claude
The aesthetic qualities of any given latent artwork arise from the particular way it manifests and embodies some of the latent potential. Different artistic choices result in different local manifestations.
Claude
Don’t mind me, just dumping a bunch more quotes:
An artist’s body of latent artworks could be viewed holistically as an attempt to map out latent space through different localized perceptual snapshots. Each work provides a glimpse into the broader hyperobject.
Over time, the accumulation of localized manifestations by many artists engaging with the same latent space leads to a gradually increasing (but always partial) understanding of its overall nature.
Claude
And let’s give ChatGPT a chance to pontificate here too:
For the “Latent Canvas,” one could conceptualize the art creation process in these steps:
Iteration: The artist engages with the AI system to explore the latent space. The specific path taken is guided by the artist’s vision and reaction to each output from the AI. It’s a journey of exploration and discovery through a higher-dimensional space.
Existence: The art piece itself isn’t a single output, but rather the series of transformations that occur within the latent space as a result of the interaction between the artist and the AI. The “canvas” exists in this higher dimensional latent space, and the artwork can be thought of as a trajectory within that space. This introduces a temporal element to the art, with the possibility of revisiting previous ‘states’ of the canvas.
Representation: The challenge then becomes how to represent this higher-dimensional journey….
And:
In the context of “Latent Canvas,” the hyperobject could be seen as the entirety of the latent space – a multi-dimensional space too vast for one to comprehend in its fullness, but whose characteristics and possibilities are expressed through each unique generation by the AI tool. Just like a hyperobject, the latent space is not directly perceivable. Instead, the artist and the viewers interact with it through localized expressions (the generated outputs), and the trajectory of iterations made by the artist.
In a sense, the “Latent Canvas” captures the essence of hyperobjects by presenting an artistic journey through a latent space that is high-dimensional, non-local, and only known through its particular manifestations.
ChatGPT
Okay, I’ll stop there, cause that’s plenty for now.
From AI & robots, to cell phones, virtual reality, self-driving cars, and much more, many of the defining technologies of our times were initially imagined by creative visionaries working in science fiction writing and concept art before becoming reality. Artists, therefore, play a pivotal role in expanding the realm of what is possible, giving form to dreams of future technologies before they exist, and inspiring the engineers and entrepreneurs who bring them to life.
Artists, authors, musicians, and creators are an untapped wellspring of innovation within the emerging AI ecosystem. Historically, when new media such as photography, film, and audio technologies have arisen, artists have always pioneered their adoption, steering these tools from mere curiosities to instruments of profound meaning & insight, and unlocking vast economic value in the process.
AI represents the next step in this long lineage, with artists uniquely positioned to direct its development towards more humanistic ends. Assembling visions of possible futures, artists are already utilizing AI tools to expand our creative capabilities and rapidly materialize novel ideas and artistic concepts, with impacts being felt everywhere. The inclusion of our diverse perspectives in high level societal conversations about the right use of these technologies will ensure that the field of AI research and development recognizes and enhances the complexity, nuance, and subjectivity of human experience, rather than diminishing it.
Artists operate with different capacities, constraints, and incentives than government, corporate, and civil society groups in the AI space. Government stakeholders often prioritize security, economic growth, and global competitiveness, sometimes overlooking more immediate impacts on human lives. Corporations view AI predominantly as a tool for efficiency and profit, lacking an inherent drive to protect or elevate the human spirit. Civil society organizations emphasize accountability and ethics, yet often lack direct engagement with AI as a creative medium.
In contrast, artists’ core motivation lies in expanding possibilities for human expression and imagination, and asking questions about how we can best shape technology for these ends. Our extensive daily interactions as professional artists using AI technologies can provide unique insights into their flaws and real-world impacts, fostering nuanced understanding that goes beyond politically reactive or reductionist interpretations of AI in media.
Consciously navigating these early stage rough edges and gray areas of AI development with aesthetic sensitivity and critical thinking, artists can help chart a humanistic course for AI’s future, illuminating its cultural and societal influences and exposing the seams that evade purely technical perspectives. This active shaping of technology’s meaning and place in our lives is essential to ensure AI uplifts humanity, rather than serving solely as a novelty, business tool, or means of power consolidation.
In essence, artists breathe life into AI, directing its powers towards beauty, insight, and the enrichment of the human spirit, imbuing it with dimensions it inherently lacks. Despite the imperfections and valid critiques of these technologies, we believe refusing to engage with them at all would forfeit the unique opportunity we have in this moment to shape their development responsibly. Artists ought to be equal partners in steering the course of AI development, ensuring its trajectory benefits humanity as a whole.
(Written with help from Claude & ChatGPT, with human review and editing)
Following on the theme of the UK enabling copyright registration of computer-generated works to “the person by whom the arrangements necessary for the creation of the work are undertaken,” I wanted to lay out a clear simple argument for why I think the US Copyright Office opinion letter on Zarya of the Dawn’s AI-generated comic panels not being eligible for copyright is basically wrong.
Here it is:
Photos are copyrightable, including snapshots – even if I didn’t create or arrange by myself the contents of what is depicted in the photograph.
So, if I proceed to use a minimum amount of creativity (whatever that is) to capture a depiction of real space, the same basic principle ought to apply if I capture a depiction of a non-physical latent space using an AI-based “idea camera.”
It could even be argued that merely clicking the shutter on a camera pointed at a real dog is less effort and less creative an act (or perhaps they are at least equal) than prompting an AI image generator to depict, for example, an invented dog wearing a hat. The difference is merely in the instrument used to make the depiction, which settles something into a fixed form.
In the case of a copyrightable snapshot, basically no one tries to argue that, because it is actually the camera’s hardware & software which do the work of image processing and not the human, that the camera is the true “author” of that work. And yet, this is exactly the (I think very wrong) claim made by the USCO about AI-generated works.
Lastly, the fuzzy claims about predictability of final images from AI generators don’t hold water as a test for any other kind of media. As I expressed in a recent post, they don’t hold water for writing a novel, many types of paintings, films, musical works, etc. It’s rare that you as the artist start with a perfect vision of the finished product, and then merely mechanically transcribe it into your chosen medium. Almost all of those, most of the time, are processes of discovery, selection, editing, etc. with a great many steps before you arrive at a finished product in a shape you never quite envisioned.
Further, to tack on one final point: even if I close my eyes, spin around, and randomly point and click my camera to capture images – these are all potentially copyrightable, provided they meet some imagined minimum of creativity/originality. ChatGPT offers one rationale that might prove the minimum threshold has been passed: “This could be based on choices like the time and place of the photograph, or the decision to initiate the snapshot at a particular moment.”
Likewise, I think it’s no stretch to say that even the most basic and “boring” AI prompts and their results are always going to be embedded in the context of the lives of the people who created them in concert with these tools. When the circumstances and context (social, personal, political, etc etc) are viewed as constellations (i.e., as a part of their hypercanvas), it will be plain to see where and how the creativity and originality manifest.
I finally spent some time last night with Photoshop’s Beta version, which includes the generative fill tool, based on Adobe’s own generative model, Firefly.
I found it both underwhelming & janky, but also that if you push it enough, you can do some interesting things with it. Here are the first four images I made with it, shown chronologically.
This one is loosely inspired by that David Bowie Blackstar video, where there’s a jewel encrusted astronaut skull or something. It’s the result of dozens and dozens of prompts using the generative fill tool. It’s very like “Photoshop looking” with a lot of the objects layered in the foreground. It’s fine, and there are aspects that are interesting, but not where I’d like a tool like this to be able to take me.
I actually find the underlying Firefly image generation model to be “not that good” relative to something like the Midjourney v5+ family. Plus the keyword filtering is EXTREMELY restrictive in at least these early versions, making it extremely clunky and unimpressive to use. I’ll come back to that topic later.
Here’s the second image I made from scratch with a few dozen prompts:
There was a Wired article a while back asking why generative AI images so often look like 70’s prog-rock album art, and this one falls squarely into that category. Though perhaps it’s slightly more metal than prog, or at least prog-metal bathed in surrealism. Pictorially, I like this one a bit better than the first experiment.
It’s also worth noting that, despite its flaws, you can at least use the generative fill tool on higher resolution images. All the ones in this set, for example, were made at 300 dpi, so you could conceivably use them for print output, which is cool.
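As a rough illustration of what 300 dpi means for print, the sketch below divides pixel dimensions by the dpi value to get the printed size in inches. The function name and the canvas size are made up for the example; they are not the dimensions of the images in this set.

```python
def print_size_inches(width_px: int, height_px: int, dpi: int = 300) -> tuple[float, float]:
    """Return the (width, height) in inches at which an image prints at the given dpi."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return width_px / dpi, height_px / dpi


# Hypothetical canvas size, not taken from the images in this post.
width_in, height_in = print_size_inches(3600, 2400)
print(f"{width_in:.1f} x {height_in:.1f} inches at 300 dpi")
```

The same pixel count at a lower dpi gives a larger but coarser print, which is why the resolution you work at matters for print output.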
Next up:
What I actually had envisioned was sort of an image of a Capricornian sea-goat leaping out of the water and twisting in mid-air, inspired by maybe like an old engraving on a cosmic-tinged map or something. Where I ended up is really different (and continues to show the absurdity of the US Copyright Office’s test about “predictability” being a factor in determining copyrightability), but I rather like it all the same.
Apart from Adobe’s snotty, paternalistic, and overly restrictive keyword filtering (for a suite of tools I pay $900+ CAD annually to access, I might add), the tool is also often simply ineffective. You lasso an area, ask for a given thing, and it just doesn’t deliver that at all. I recognize that this is still in beta, but that happened over and over again. Or it fails to constrain the thing you ask for in the manner in which you ask. For example, on the sort of white fish/rabbit body of the monster, I kept selecting the top part of it, and asking it to make the top or front half of a goat, and it routinely gave me a whole goat.
All this can make the tool very tedious to work with, but there’s also something to the problem: since you can’t get exactly the thing you want or are envisioning from it, you have to take many alternative side paths, and sort of prod and poke and eventually accept where you end up, or else just stop altogether. That’s a frustrating process, but the mere fact of exploring these blind alleys can also take you to some interesting new places, if you’re able to sit back and ride the wave a little (while also directing the wave to whatever extent you can).
Another pic:
I was trying for a mermaid in the initial image, but ended up with this figure of a woman, and just sort of followed my instincts on what else to include. It ended up, I think, in a segment of the latent space that kind of calls to mind some 90s grunge music videos, like Black Hole Sun, or something from Nirvana, maybe.
Again, this was dozens upon dozens of selections & prompt-guided generative fills. One thing I remember finding annoying but not surprising is that Firefly seems to prohibit gun as a prompt, along with things like missile and warhead. But as you saw in the first one, I did succeed in getting artillery shell. So I think there are holes in their keyword filtering (and in all keyword filtering) that you can still probably drive through.
Or you can just, you know, pop over into Adobe Stock, type exactly the same search term in, and get the thing you wanted no problem. So that doesn’t make sense to me at all as a user, that I can search for content in Adobe Stock – upon which Firefly is supposedly trained – and pull up something I’m not able to natively get the model to produce. That’s just so tedious, I can’t even engage with it further as a problem. And I think it really cripples the utility of the tool. Their design pattern should be: if it’s allowable in Stock, it’s allowable in Firefly. Otherwise, it’s just like this fever-dream of trying to navigate some idiotic unpredictable bureaucratic system of imagined cultural taboos… At least be consistent! And if I’m paying you for this service, give me the option to turn all this crap filtering off, please. I’m not doing anything wrong or illegal. I’m making fucking art, so hands off my fucking imagination, thank you very much.
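To make the earlier point about holes in keyword filtering concrete, here is a minimal sketch of a word-blocklist filter. It is an assumption for illustration only: I don’t know how Firefly’s filter is actually implemented, and the blocklist below is hypothetical, built from the terms I happened to run into.

```python
import re

# Hypothetical blocklist built from the terms mentioned above;
# this is not Adobe's actual list or implementation.
BLOCKED_TERMS = {"gun", "missile", "warhead"}


def is_blocked(prompt: str, blocked=BLOCKED_TERMS) -> bool:
    """Return True if any whole word in the prompt is on the blocklist."""
    words = re.findall(r"[a-z]+", prompt.lower())
    return any(word in blocked for word in words)


for prompt in ["gun", "missile on a launch pad", "artillery shell"]:
    print(f"{prompt!r}: {'blocked' if is_blocked(prompt) else 'allowed'}")
```

Any filter of this shape only catches the words someone thought to list, so a closely related term like artillery shell passes straight through.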
Did I already post this? I’ve been in a haze on these subjects the last few days, but the US Copyright Office put out this follow-up guidance on AI-assisted art, firmly planting its flag in the “it all depends” territory – a sign of uncertainty as policy, if ever I saw one. These quotes are telling:
The answer will depend on the circumstances, particularly how the AI tool operates and how it was used to create the final work. This is necessarily a case-by-case inquiry.
If a work’s traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it. For example, when an AI technology receives solely a prompt from a human and produces complex written, visual, or musical works in response, the ‘‘traditional elements of authorship’’ are determined and executed by the technology—not the human user.
I realize that “traditional elements of authorship” here primarily has a legal meaning, but I think it’s important we – at some point – explode those notions, to show we’re no longer living in that world of traditions. And that the goal of authors and artists ought not to be to conform to the legal definitions of things, but to seek the deeper truths – and change their manifestations in our world.
But before that, here’s the other USCO quote:
In other cases, however, a work containing AI-generated material will also contain sufficient human authorship to support a copyright claim. For example, a human may select or arrange AI-generated material in a sufficiently creative way that ‘‘the resulting work as a whole constitutes an original work of authorship.’’ Or an artist may modify material originally generated by AI technology to such a degree that the modifications meet the standard for copyright protection. In these cases, copyright will only protect the human-authored aspects of the work, which are ‘‘independent of’’ and do ‘‘not affect’’ the copyright status of the AI-generated material itself…
In each case, what matters is the extent to which the human had creative control over the work’s expression and ‘‘actually formed’’ the traditional elements of authorship.
I thought about this notion a lot while I was working on the generative fill Photoshop experiments last night, and about how, presumably, the level of involvement I expressly had in determining the actual arrangement of elements within the images would count as that kind of creative control. It’s a much higher degree of specificity than working in Midjourney – though again, I frankly don’t give a shit what any bureaucrat thinks about my artwork or process. Their supposed hegemony does not impinge on my autonomy to make creative choices how I see fit.
I wanted to stick in this bit from Lawrence Lessig, who wrote on the matter of AI image copyrightability, which I largely agree with in general, if not all the particulars:
In exchange for AI copyright protection, Congress could require that the AI technologies register the work in digital registries, tied to data that established provenance and ownership. These registries need not be the government’s, though the government should set standards for an approved copyright registry. If done right, AI creativity could engender the return to a system that made it easy to identify the owners of copyrighted work, and therefore easy to clear rights when that work is to be reused.
Interestingly, this type of provenance system is being pioneered to a certain degree in parallel via the C2PA standard, as well as Adobe’s implementation of it, Content Credentials. However, neither of those to my knowledge was designed with the express purpose of acting as a universal copyright registry or clearinghouse for ownership of IP.
C2PA’s about page says that it provides “a tool for creators to claim authorship while empowering consumers to make informed decisions about what to trust.”
When enabled, Content Credentials gathers details such as edits, activity, and producer name then binds the information to the image as tamper-evident attribution and history data (called Content Credentials) when creators export their final content….
It creates an open format for sharing information about the producer’s identity and the ingredients and tools used to make the content. These ultimately provide useful attribution information for audiences once the producer shares or publishes the image.
I find questionable the claims that this information will ultimately prove to be that useful to audiences who are doomscrolling on their cell phones while on the toilet. But I suppose tied to a licensing system, this might be very powerful for (some) rights holders, or at least those who can afford to pursue the enforcement of their rights in court.
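For a sense of what “tamper-evident” means in the description quoted above, here is a minimal sketch of the general idea: hash the image, attach that hash to the attribution details, and sign the result so that any later change to either one is detectable. This is not the C2PA format or Adobe’s implementation. The function names, the field names, and the shared demo key are all assumptions, and an HMAC stands in for whatever signing the real system uses.

```python
import hashlib
import hmac
import json

# Hypothetical demo key; a stand-in for the real system's signing mechanism.
SIGNING_KEY = b"demo-key-not-for-real-use"


def bind_credentials(image_bytes: bytes, details: dict, key: bytes = SIGNING_KEY) -> dict:
    """Attach a hash of the image to the attribution details and sign both together."""
    manifest = dict(details, image_sha256=hashlib.sha256(image_bytes).hexdigest())
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"manifest": manifest, "signature": signature}


def verify_credentials(image_bytes: bytes, record: dict, key: bytes = SIGNING_KEY) -> bool:
    """Return True only if neither the image nor the attribution details have changed."""
    payload = json.dumps(record["manifest"], sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    signature_ok = hmac.compare_digest(expected, record["signature"])
    image_ok = record["manifest"].get("image_sha256") == hashlib.sha256(image_bytes).hexdigest()
    return signature_ok and image_ok


image = b"stand-in bytes for an exported image"
record = bind_credentials(image, {"producer": "Example Artist", "tool": "generative fill"})
print(verify_credentials(image, record))              # untouched image
print(verify_credentials(image + b"edited", record))  # image changed after export
```

Note that a check like this only tells you whether something changed after export; it says nothing about who owns the work, which is part of why I don’t see it as a copyright registry on its own.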
My knowledge of this might be a bit out of date (and I’m not sure if or when this is in force), but this also seems to plug handily into the EU’s direction in terms of copyright, where content is supposed to be scanned at time of upload (Wired, 2019).
Article 17 provides that online content-sharing service providers need to obtain an authorisation from rightholders for the content uploaded on their website. If no authorisation is granted, they need to take steps to avoid unauthorised uploads.
This all sounds incredibly restrictive, even if in theory it is intended to be maximally protective of copyright. It seems to go about attempting to protect copyright by limiting other rights around fair use and free expression, which is especially problematic on the internet, where so much of our personal expression is comprised of re-packaging and re-publishing the expression of others.
There must be a better way than all of this, but it doesn’t seem like we’ve found it yet.