Questionable content, possibly linked

Category: Other Page 60 of 177

Occupy AI: How far is too far?

Via the latest ImportAI newsletter, this excellent piece, Whoever Controls Language Models Controls Politics by Hannes Bajohr.

There’s a lot worth unpacking in this piece, but I’m just going to jump straight to the end, where the author says:

If AI systems become the site of articulating social visions, a dominant factor in the make-up of the public sphere, or even a political infrastructure themselves, there is much to be said for actually subjecting them to public control as well. If this is taken to its logical conclusion, the last resort would be, horribile dictu, communization – in other words, expropriation.

It’s an idea that seems to be taking root, and is also reflected in this FT piece:

It felt deeply wrong that consequential decisions potentially affecting every life on Earth could be made by a small group of private companies without democratic oversight.

More and more, I agree that this is the biggest problem of AI’s rapid ascendancy: that a few powerful players will own it all and amass altogether too much power for themselves.

OpenAI’s charter seems worth another reference here also, particularly this bit about unduly concentrating power:

We commit to use any influence we obtain over AGI’s deployment to ensure it is used for the benefit of all, and to avoid enabling uses of AI or AGI that harm humanity or unduly concentrate power.

I guess I’m starting to think, especially since GPT-4 and the mass adoption we’re seeing of OpenAI’s technologies, that they have already crossed this ill-defined threshold of unduly concentrating power. Is there a … duly (?) way to concentrate power? Is it ever actually desirable that one company not just controls the technology itself, but now effectively gets to control all the other technologies that third parties build using its technologies?

Rather than pointlessly halt AI research for six months (so that Elon’s outfit can catch up – absolutely the only reason that fucker signed the moratorium), I can now strongly see the arguments for socializing it, and putting into the hands of the public literally *any* AI model that becomes powerful enough to be “really good.”

Citizens’ assembly to (non-violently) occupy OpenAI?

Citizens’s assembly to occupy Midjourney?

Are we there yet?

Too soon?

What I want to know is: what level of threat to human livelihoods and democratic governance systems would require us to act? And what would be appropriate action? My impression is that it will become more and more difficult to say “no” to systems like this, or to the organizations which created them, the further entrenched their services get in the marketplace. This lets them amass a great deal of power and possible points of control, not to mention money. What happens when governments become wholly dependent on privately-owned AIs? There aren’t easy or simple answers to these questions, but I tried to explore them in fictional form here.

Notes on Celestial Cephalopods

Celestial Cephalopods is book #90 in the AI Lore books series, by Canadian AI author & publisher, Lost Books.

I’m not sure anymore how I landed in that part of the latent space – because when you get into a flow state with Midjourney, it can be like falling into a dream, or some pocket universe where different rules work in unexpected ways – but somehow or other I landed on some images of ornately decorated, quasi-religious-seeming ‘Octopus Lords,’ for lack of a better term. It was a side quest from some other exploration, to be sure, but I bookmarked it and came back to it.

Once I connected the visual elements to a bit of lore – the idea that, IRL, some people claim cephalopods have potentially extraterrestrial features – the rest just flowed like water. Shades of Lovecraft in some of these, I suppose, but I didn’t originally set out for that. It’s a convergence.

Here’s a copy of the art preview I uploaded to Gumroad for the book (click here for a bigger view). I really think of these foremost as art books:

Speaking of convergences, I’ve noticed in my image-making in Midjourney (I think I’ve landed on “image-making” as my preferred term, over the more academic-sounding “synthography”) that some forms tend to converge on other, similar forms. In this case, certain images of octopus tentacles combined with humans just ended up looking like snakes. In others, if you see representations of an octopus from a certain angle, you end up with a form that pretty strongly resembles an elephant’s face, trunk, and tusks. I saw it with some celebrity sets I was working with too, where certain views of certain figures generated by MJ seemed to bear a resemblance to other well-known figures. I think this is just an artifact of there not being “that many things,” ultimately, in the universe, and of shapes being reused consistently because they get the job done.

Anyway, I had fun with this one, because I figured out a way to tell a somewhat coherent story, in a very lorecore way, with these in ChatGPT v4 – alternating between invented encyclopedia entries of pure exposition, and very short flash fiction segments of generally around 200 words set in that universe. The workflow is something like:

  • Input some basic details about your ‘pocket universe’ of your narrative
  • Ask for a fictional encyclopedia on same
  • Then ask for 20 or so story ideas for flash fiction in that world
  • Give it a target word count, any directions, and tell it which items you want to flesh out into a flash fiction piece
  • I tend to tell it to not try to close or explain the story, because it’s sort of stuck on a ‘clean wrap-up’ which I really don’t want in this kind of open-ended story-telling, but ymmv.
  • Then alternate in new bits of encyclopedia entries that move the overall narrative in a given direction with new details
  • Then more flash fiction that progresses onwards in that world but doesn’t necessarily linearly complete anything that came before. It’s a way of mixing heavy lore via encyclopedia entries, while giving a bit more space to digest it all through dramatic incidents and scenarios… things that are evocative, vibey, and less spelled-out (though ChatGPT tends to do a good job of incorporating lots of contextual world-building into its flash fictions too).
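
The steps above could be sketched as a small loop. This is a minimal sketch only: `ask_llm` is a hypothetical placeholder standing in for whichever chat model you use (ChatGPT, Claude, etc.), not a real API call.

```python
# A sketch of the alternating encyclopedia / flash-fiction workflow above.
# `ask_llm` is a hypothetical placeholder; in practice it would send the
# prompt to a chat model and return its reply.

def ask_llm(prompt: str) -> str:
    # Placeholder response so the sketch is self-contained.
    return f"[model response to: {prompt[:50]}...]"

def generate_book_material(universe_notes: str, picks=(1, 4, 7),
                           target_words: int = 200) -> list[str]:
    sections = []
    # 1. Seed the pocket universe and ask for a fictional encyclopedia.
    sections.append(ask_llm(
        f"Pocket universe details: {universe_notes}\n"
        "Write a fictional encyclopedia covering this world."))
    # 2. Ask for ~20 flash-fiction story ideas set in that world.
    sections.append(ask_llm(
        "Give me 20 flash fiction story ideas set in that world."))
    # 3. Flesh out chosen ideas, with a word count and an open-ended
    #    instruction so the model doesn't force a clean wrap-up.
    for pick in picks:
        sections.append(ask_llm(
            f"Write idea #{pick} as a ~{target_words}-word flash fiction. "
            "Don't try to close or explain the story."))
    return sections
```

In a real run you would keep alternating: feed new encyclopedia entries back in as context, then request more flash fiction that moves the world forward.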

After that, I had plenty of material to go back to MJ and flesh out specific aspects of the world I’d built, and once I found some winning formulas, just re-roll them a bunch of times to generate a bunch of image stock, with variations and “side quest” visual tangents.

I think this is one of the faster books I’ve produced at higher quality than some of the other fast ones I did with my own EncycGen app, which I wrote using ChatGPT. Probably a total of three hours altogether: text generation (of high coherence, and very readable, I think), images made in Midjourney, image set reduction in Lightroom, editing & arrangement in Vellum, and uploading finished ebooks and collateral assets to Gumroad.

I know Midjourney doesn’t have an API yet (I don’t care about Stable Diffusion or DALL-E anymore – they’re dead to me), and GPT-4 doesn’t have a public API yet either. But once I can get both of those hooked up to GitHub Copilot X (or brute-force it through ChatGPT w/ v4), into my own custom book maker suited to my workflow – one which could export directly into Vellum for quick finishing touches, all arranged properly – I don’t see any reason you couldn’t have a high-quality, lore-heavy book with interesting dramatic interludes & awesome images in about an hour, or possibly significantly less.

And if the quality matches the needs and desires of both author/publisher/producer and audience, it seems like a win to me all around?

Detoxifying AI as a dimension reduction

This is interesting:

LLMs model their output on the texts they have been trained on, which is more or less the writing of the entire Internet, including all the biases – the prejudices, racisms, and sexisms – that constitute much of it. Countering this means either censoring the output, as is done (to a degree) with ChatGPT, and thus rendering it potentially unusable. Or, as is also practiced, filtering the data set for its undesirable components – and thus feeding the model with a better world. This is an eminently political decision. Detoxifying AI necessarily involves formulating a social vision.

There’s a lot to tease out in this article, but this idea described above strikes me as a problem related to dimension reduction.

I’ve been having free ranging discussions with ChatGPT on some related problems around the design of my latent space navigator concept, and recently it offered this simple explanation of dimensionality reduction:

Dimensionality reduction is a technique used to reduce the number of variables or dimensions in a dataset while preserving the relationships and structures within the data.

So there’s something to having a large dataset with many dimensions, and having to reduce it to lower dimensionality for some specific intended use…
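
As a rough illustration of what that reduction looks like in practice, here’s a tiny PCA sketch in numpy (synthetic data, arbitrary seed and sizes), projecting 10-dimensional points down to 2 while keeping most of the variance:

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 points in 10 dimensions, but most of the variance lives in 2 of them.
data = rng.standard_normal((200, 10))
data[:, 0] *= 10.0
data[:, 1] *= 5.0

# PCA via SVD: center the data, decompose, keep the top-2 directions.
centered = data - data.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)
reduced = centered @ vt[:2].T  # shape (200, 2)

# Fraction of total variance preserved by the 2D projection.
explained = (s[:2] ** 2).sum() / (s ** 2).sum()
print(reduced.shape, round(explained, 2))
```

The point is exactly the trade-off above: you throw away dimensions, but you choose which ones so that the structure you care about survives.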

Notes on Shadows of Evil

Shadows of Evil is book #89 in the AI Lore books series.

It is an adaptation of the Sesame Street dystopian sci fi 70s film stills I made using Midjourney. It’s a subset of those images, as I didn’t want the book itself to be overly Sesame-ish. The backstory has been modified somewhat from the original Imgur post that holds the larger set. In the book, there is a depressed industrial city called Umbra, on the outskirts of which a strange explosion occurs in a chemical plant.

At the same time as the explosion, a children’s show called The Wonderful World of Giggles is being broadcast nearby, and somehow or other (ample handwavium), characters from the universe of that show end up coming through a rift between dimensions into the city of Umbra. Since the monsters, the Giggles, are no longer constrained by the alternate dimension which held them, they end up causing havoc in Umbra, including a great deal of violence. That turmoil spurs the rise of a fascist group called the Regime, which exploits it to seize power and enforce its own brutal rule – and which, it turns out, is in league with some of the higher-up Giggles.

The text started out in Anthropic’s Claude, but I ran into a number of instruction-following problems with it, and switched over to ChatGPT running v4 to finish the rest. I liked Claude ’cause it seemed kind of fresh at first, but the more I’ve used it, the worse I think the instruction-following is on this version. I believe there is a newer one coming out (or already out? v1.3, I think), and perhaps that is better, but I’m not sure how to activate it on my account…

Again, all the images are Midjourney. I have just straight up stopped using other image generation tools, because there’s no point in fiddling around with partial or lower quality when the results are as good and easy as they are in MJ. Midjourney has also opened up a lot of storytelling dimensions for me (#AIcinema), where most of the new books start now as image series before anything. Exploring those parts of the latent space gives me a strong narrative current, which I can then flesh out in AI tools for text.

Bryan Collins Interview With AI Author, Tim Boucher

Happy to share today an interview I did with Bryan Collins of the Become a Writer Podcast on the topic of using AI tools as a writer. Bryan and I had a good conversation that should be a useful introduction for creative people getting into this space.

If you are new here and haven’t already checked it out, you might also enjoy my conversation with Joanna Penn and also this one I did for This AI Life.

Also check out the about page for some other trails to follow, and of course my AI Lore books.

Latent Space Navigation Device

Following up on my post about trying to use AI to design a Midjourney controller, I asked ChatGPT for help doing a generalized blog post introducing the issues here. Here it is with light edits from me…


Introduction

In recent years, the advancements in artificial intelligence and machine learning have led to the development of sophisticated generative models capable of producing stunning and realistic images. One of the most notable types of these models is image diffusion models, which can generate a wide variety of images based on their underlying latent spaces. However, navigating these high-dimensional latent spaces and understanding their structure can be a challenging task.

Currently, our exploration of latent spaces is often haphazard and sporadic, with no maps or guides to help us understand their complex topography. What if we could have, for lack of a better comparison, a “Google Street View” for exploring the latent space of image diffusion models? This blog post introduces the idea of, and the problem space around, a hardware-software controller which would bring us closer to being able to intuitively navigate high-dimensional spaces.

Navigating High-Dimensional Latent Spaces

Latent spaces are high-dimensional mathematical spaces that encode the essential features and variations of generated images. The challenge lies in creating an intuitive method to explore these spaces and discover interesting or meaningful images. The proposed solution consists of a handheld controller, combined with a software interface, that can translate physical actions into navigation through the latent space.

The physical controller could include components such as joysticks, dials, or sliders, which allow the user to manipulate specific dimensions of the latent space. The software interface would display the generated image based on the user’s current position in the latent space and update in real-time as the user navigates. Additionally, the interface could provide various exploration modes, such as local and global exploration, to facilitate different types of exploration experiences.
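
A minimal sketch of that mapping, assuming a latent vector and a set of chosen navigation directions. All names here are illustrative, not from any real product:

```python
import numpy as np

class LatentNavigator:
    """Maps low-dimensional controller input onto a high-dimensional latent point.

    `directions` is a (k, d) matrix: each controller axis (joystick, dial,
    slider) moves the current position along one chosen latent direction,
    e.g. a principal component.
    """
    def __init__(self, start: np.ndarray, directions: np.ndarray,
                 speed: float = 0.1):
        self.position = start.astype(float)
        self.directions = directions
        self.speed = speed
        self.path = [self.position.copy()]  # travel history for display
        self.bookmarks = []                 # saved points of interest

    def move(self, axes: np.ndarray):
        # axes: one value per control, each in [-1, 1].
        self.position = self.position + self.speed * (axes @ self.directions)
        self.path.append(self.position.copy())

    def bookmark(self):
        self.bookmarks.append(self.position.copy())

# Two controller axes steering a 512-dimensional latent point.
rng = np.random.default_rng(1)
nav = LatentNavigator(rng.standard_normal(512),
                      rng.standard_normal((2, 512)))
nav.move(np.array([1.0, -0.5]))
nav.bookmark()
```

In a real system the `position` would be decoded into an image on every move, which is where the interesting computational challenges start.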

Dimensionality Reduction and User Experience

One of the core challenges in navigating high-dimensional latent spaces is the need to reduce their dimensionality to a more manageable form, without losing meaningful features. Techniques such as PCA or t-SNE can be used to retain important characteristics while providing an intuitive navigation experience.

As users navigate the latent space using the controller and software, they would be able to view their path of travel, save points of interest, and explore adjacent neighborhoods. The software could also allow users to switch between different dimensions on the fly, providing a more dynamic and flexible exploration experience.

Possible Exploration Modes

In addition to local exploration, which focuses on the immediate neighborhood around a specific point in the latent space, other modes could be integrated into the software. For example, a prompt-based mode would enable users to input text prompts and generate images based on those themes. Another possibility is a referent-based mode, where users can define a set of referents, points, or features within the latent space (such as a blue ball, an elephant, or a storm). This mode would allow users to explore themes around these referents with different configurations or treatments, effectively enabling them to discover new and unique combinations of visual elements.
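
A rough numerical sketch of the local-vs-global distinction, assuming the standard Gaussian prior used by typical diffusion models (the function names are illustrative):

```python
import numpy as np

def local_step(z: np.ndarray, step: float = 0.05, rng=None) -> np.ndarray:
    # Local exploration: a small perturbation around the current point,
    # so the generated image changes only slightly.
    rng = rng or np.random.default_rng()
    return z + step * rng.standard_normal(z.shape)

def global_jump(dim: int, rng=None) -> np.ndarray:
    # Global exploration: a fresh draw from the prior, landing anywhere.
    rng = rng or np.random.default_rng()
    return rng.standard_normal(dim)

rng = np.random.default_rng(0)
z = global_jump(512, rng)
z_near = local_step(z, rng=rng)
# The local step stays close to z; a second global jump typically does not.
print(np.linalg.norm(z_near - z), np.linalg.norm(global_jump(512, rng) - z))
```

Prompt-based and referent-based modes would sit on top of this: instead of random directions, you move toward regions associated with a text prompt or a chosen referent.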

Conclusion

The prospect of an innovative hardware-software solution for navigating high-dimensional latent spaces opens up a world of creative possibilities, allowing users to delve into the intricate structures of generative models.


I kinda clipped it at the end, cause it always does corny conclusions, etc. It’s a little janky overall, but a good enough anchor on the topic to at least drop into the water for now.

Notes on Repermanent

Repermanent is #88 in the AI lore books series. (I can’t believe we’re almost to 100!) In a very real way, though, this book should actually be first, in order of writing, since the bulk of it was published on my old blog way back in 2008. Fifteen years ago!

I was not careful about saving a copy of the complete original version (which was unfinished), but fortunately Archive.org had everything but three of the sixteen or so chapters as originally published. So ultimately I ended up sort of reconstructing (or reverse-engineering, or something) from the surviving chapters – ever so slightly edited first – in order to make new ones in ChatGPT, or to lengthen here and there.

It’s still sort of an unfinished or perhaps open-ended narrative with many overlapping strands; I didn’t try to force it into taking any particular shape but tried to follow the spirit of what was there already.

Where I really had fun with it was in the image-making process in Midjourney. There were already call-outs in the original text to a 1950s aesthetic, but with like virtual reality and androids and stuff. So I really went for that look in the accompanying art, and am really happy with the results.

If I remember correctly, a fair amount of the original source material for the book actually came from a dream or series of dreams; my brain has been churning on these subjects for a long time. I feel often like producing these books is like chasing these old dreams (and some new ones), trying to piece together whatever my subconscious was showing me then, or has been since. Feels many times like a Jungian process of active imagination, of amplifying these subconscious bits and bobs, giving them bodies so they can breathe the air and express whatever message it is they are carrying.

Midjourney Controllers

I’ve been playing around lately in Midjourney with doing visualizations of hand-held controllers one might use to control Midjourney.

That Imgur set has about 20 items, here are just a couple examples:

I like that the controls of these things are somewhat inscrutable, and seem in some cases outlandishly complex.

But then, I think you would probably need a somewhat complex controller to be able to meaningfully navigate high dimensional spaces, such as the latent spaces of image diffusion models.

Most controllers only work for movement in three spatial dimensions, and then include some other custom controls. But in machine learning data sets, you may have hundreds or thousands of dimensions: often, single pixels are treated as dimensions.
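
To make the dimension counts concrete, here’s a back-of-envelope sketch in pixel space (real diffusion latents are smaller than raw pixels, but still far beyond three dimensions):

```python
# Treating each pixel channel as one dimension, even a small image
# lives in a very high-dimensional space.
width, height, channels = 64, 64, 3
dims = width * height * channels
print(dims)  # 12288 dimensions for a tiny 64x64 RGB image
```

A three-axis controller covers three of those dimensions at a time; everything else has to come from clever reduction and remapping.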

How then could you design a controller that would work in a fluid and flexible way in multi-dimensional spaces? The images Midjourney produces in queries around this topic seem almost tantalizingly comprehensible, but just outside the ability to grasp.

I took this basic concept, of a hardware and software package that can enable users to traverse latent spaces as though they were VR, and produced a pulp sci-fi book from it, using Claude & Midjourney. It’s called Impossible Geometries. The Claude flash fictions are pretty fun, and easy to direct while you’re producing them (though in a lot of ways, Claude is lacking compared to GPT-4, and ChatGPT in general). And there is a superset of other imagined high-dimensional controllers in it (which I call the Prism LightScope), along with visualizations of living within the latent space, etc.

I’ve actually spent quite a while taking these concepts into ChatGPT w/ v4, and it’s been helping me meaningfully describe potentially real products that could be built in this space. There’s probably a lot of computational hurdles for visualizing and manipulating the contents of latent space in real time, but again it feels tantalizingly… possible.

Follow your side quests

One thing I’ve discovered in the land of generative AI tools is that… it’s inherently a voyage of discovery. You might set out for a certain far-off land, but find yourself in an unexpected, strange country that is obviously worth exploring more fully.

That process of discovery is inherently non-linear in that it seems to spawn endless side quests. It depends what your immediate goals are, but I’ve found it to be worthwhile to follow down some of these more interesting branching paths, because you end up finding bits and pieces you’ll use along the way later on in your main quest(s).

The last two books, especially, are side quests – but that’s part of what’s fun for me about the AI Lore books series: they are literally all side quests, and the main quest is just something you assemble as you go along. There’s room for everything because it’s world-building. And then the game becomes merely knitting it all into the world…

Two New Books About AI Collectives

I finished two new books lately about different AI factions, collectives, polities, etc.

Pictured above is #86 in the AI Lore Books series, the Hyperion Collective, AIs who choose to appear in alien form.

Also recently published is book #85, Tales of the Victoriana Intelligences, who prefer to instantiate among humans as Victorian humans with pet dinosaurs.

Both of these collections grew organically out of experimentation in Midjourney. After I had a rough idea of their world, I went into Anthropic’s Claude with some guiding text, and had it generate around 20 flash fiction ideas, along with an encyclopedia entry about each group to kick off the story collections. Then I picked the best concepts, and used Claude to write in-world stories that fit into this topic, but also connect to my larger story-multiverse of the AI lore books.

Claude is a fun partner for world-building, especially used in this “slice of life” method of storytelling. Claude’s flash fiction/short stories are actually pretty good. They’re not the best thing ever, but neither is my writing to begin with, so, together we maybe get somewhere better. And Claude’s short stories are better than ones I’ve seen out of ChatGPT, in my opinion.

