I’ve been playing around lately in Midjourney, generating visualizations of hand-held controllers one might use to control Midjourney itself.

That Imgur set has about 20 items; here are just a couple of examples:

I like that the controls of these things are somewhat inscrutable, and seem in some cases outlandishly complex.

But then, I think you would probably need a somewhat complex controller to meaningfully navigate high-dimensional spaces, such as the latent spaces of image diffusion models.

Most controllers only handle movement in three spatial dimensions, plus a handful of custom controls. But in machine learning, you may be working with hundreds or thousands of dimensions: in raw image data, for instance, each pixel is often treated as its own dimension.

How, then, could you design a controller that works in a fluid and flexible way in high-dimensional spaces? The images Midjourney produces for prompts around this topic seem almost tantalizingly comprehensible, but remain just out of reach.
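As a rough sketch of one possible answer (everything here is hypothetical: the latent size, the axis count, and the idea of binding axes to directions), a controller with a handful of physical axes could steer through a latent space by assigning each axis to a direction vector in that space, say principal components of sampled latents, or semantic directions found some other way:

```python
import numpy as np

LATENT_DIM = 16384   # e.g. a 4 x 64 x 64 latent, flattened (hypothetical size)
NUM_AXES = 8         # physical axes on the imagined controller

# Hypothetical: each controller axis is bound to one direction in latent space.
# These could be principal components of sampled latents, or semantic directions
# discovered some other way; random orthonormal vectors stand in for them here.
rng = np.random.default_rng(0)
directions, _ = np.linalg.qr(rng.standard_normal((LATENT_DIM, NUM_AXES)))

def step(latent, axis_values, speed=0.05):
    """Nudge the current latent along the controller's bound directions.

    axis_values: NUM_AXES readings in [-1, 1] from the controller.
    """
    delta = directions @ (np.asarray(axis_values) * speed)
    return latent + delta

# Usage: start from a random latent and "fly" by holding a couple of axes.
z = rng.standard_normal(LATENT_DIM)
z = step(z, [1.0, 0.0, -0.5, 0, 0, 0, 0, 0])
```

The appeal of something like this is that the person at the controls never has to think about thousands of coordinates: they just push on eight axes and feel the image respond.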

I took this basic concept, a hardware and software package that lets users traverse latent spaces as though they were VR, and produced a pulp sci-fi book from it using Claude & Midjourney. It’s called Impossible Geometries. The Claude flash fictions are pretty fun, and easy to direct while you’re producing them (though in a lot of ways, Claude is lacking compared to GPT-4, and to ChatGPT in general). The book also contains a superset of other imagined high-dimension controllers (which I call the Prism LightScope), along with visualizations of living within the latent space, etc.

I’ve actually spent quite a while taking these concepts into ChatGPT with GPT-4, and it’s been helping me meaningfully describe potentially real products that could be built in this space. There are probably a lot of computational hurdles to visualizing and manipulating the contents of latent space in real time, but again, it feels tantalizingly… possible.
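To make that hurdle concrete, here is a hypothetical interaction loop (read_controller, decode, and render are placeholder names, not real APIs): the latent update itself is cheap, but the latent-to-image decode in the middle, which for a diffusion model might mean a full denoising pass, would have to fit inside a frame budget of roughly 33 ms for 30 fps to feel fluid, and that is exactly where current models struggle.

```python
import time

FRAME_BUDGET_S = 1 / 30  # ~33 ms per frame for a fluid 30 fps experience

def interaction_loop(read_controller, step, decode, render, z):
    """Hypothetical real-time loop: controller -> latent update -> image.

    read_controller, decode, and render stand in for hardware polling, a
    latent-to-image decoder, and a display call; none of these are real APIs.
    """
    while True:
        start = time.perf_counter()
        axis_values = read_controller()   # poll the physical controller
        z = step(z, axis_values)          # move through latent space (cheap)
        image = decode(z)                 # the expensive part: a decoder or
                                          # denoising pass for every frame
        render(image)
        # Sleep off whatever is left of the frame budget; with today's models
        # the decode alone usually blows past it.
        elapsed = time.perf_counter() - start
        time.sleep(max(0.0, FRAME_BUDGET_S - elapsed))
```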