Questionable content, possibly linked

Series: Intertext Protocol

Layered hypertexts (Semiotics)

Following on from my recent look at LLMs (large language models) as being potentially something predicted by postmodernists, I wanted to add another layer onto that.

Let’s dive right in with this original older definition of “hypertext” within the context of semiotics, via Wikipedia:

“Hypertext, in semiotics, is a text which alludes to, derives from, or relates to an earlier work or hypotext. For example, James Joyce’s Ulysses could be regarded as one of the many hypertexts deriving from Homer’s Odyssey…”

It continues on with some more relevant info:

The word was defined by the French theorist Gérard Genette as follows: “Hypertextuality refers to any relationship uniting a text B (which I shall call the hypertext) to an earlier text A (I shall, of course, call it the hypotext), upon which it is grafted in a manner that is not that of commentary.” So, a hypertext derives from hypotext(s) through a process which Genette calls transformation, in which text B “evokes” text A without necessarily mentioning it directly.

Compare with the related term, intertextuality:

“Intertextuality is the shaping of a text’s meaning by another text, either through deliberate compositional strategies such as quotation, allusion, calque, plagiarism, translation, pastiche or parody, or by interconnections between similar or related works perceived by an audience or reader of the text.”

Speaking of plagiarism, I’ve made somewhat extensive use of a plagiarism/copyright scanning tool called Copyleaks. The tool is decent for what it is, and the basic report format it outputs for scanned text looks like this:

So while this tool is intended for busting people’s chops for potentially trying to pass off the work of others as their own, the window it gives us into intertextuality and the original sense of hypertext is quite an interesting one.

We can see here specifically:

  • Passages within a text that appear elsewhere in the company’s databases, and the original source of those passages
  • Passages which appear to have been slightly modified (probably to pass plagiarism checkers like this)
  • Some other bits and bobs, but those are the major ones

I find “plagiarism” as a concept to be somewhat of a bore. But looking at this as a way to analyze and split apart texts into their component layers and references suddenly makes this whole thing seem a lot more interesting. It allows for a type of forensic “x-ray” analysis of texts, and a peek into the hidden underlying hypotexts from which a text may be composed.
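
To make that “x-ray” idea a bit more concrete, here’s a minimal toy sketch of the general technique – emphatically my own illustration, not how Copyleaks actually works – which flags word n-grams that a text shares with a set of candidate hypotexts:

// Toy hypotext "x-ray": flag passages (word n-grams) that a text shares
// with candidate source texts. Real plagiarism checkers use far more
// robust fingerprinting; this is illustration only.

type Match = { ngram: string; source: string };

function sharedPassages(
  text: string,
  sources: Record<string, string>, // source name -> source body
  n = 6,                           // passage length in words
): Match[] {
  const ngrams = (s: string): string[] => {
    const words = s.toLowerCase().split(/\s+/).filter(Boolean);
    const grams: string[] = [];
    for (let i = 0; i + n <= words.length; i++) {
      grams.push(words.slice(i, i + n).join(" "));
    }
    return grams;
  };
  const textGrams = new Set(ngrams(text));
  const matches: Match[] = [];
  for (const [name, body] of Object.entries(sources)) {
    for (const gram of ngrams(body)) {
      if (textGrams.has(gram)) matches.push({ ngram: gram, source: name });
    }
  }
  return matches;
}

Slightly modified passages (the kind meant to slip past checkers) would defeat exact matching like this, which is part of why real tools lean on fuzzier fingerprints.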

The whole thing also calls to mind another, tangential type of forensic x-ray analysis for documents: the GitHub diff, which tracks revisions to a document.

This is not the most thrilling example of a GitHub diff ever, but it’s one I have on hand related to Quatria:

It’s easy enough to see here the difference in a simple file, though diffs can become quite complex as well. Both this and the original semiotic notion of hypertexts (as exposed through plagiarism checkers) seem like useful avenues to explore in terms of how we might visualize AI attribution in a text.

Towards an Intertext Protocol

I’ve been thinking a lot about intertextuality, which I made some notes about before the holiday: the idea that texts create meaning through their relationships with other texts. That is, re-centering on the “marginalia,” so to speak.

It struck me: why don’t we have a universal system for deep referencing within any kind of text – “text” in this case used to mean media of any kind, online or offline? We have bits and pieces of it here and there: URL/URI, ISBN, DOI, other things. But why can’t I deep-link into a specific part of a book, a specific time-stamp in a film or podcast, an audio field recording, a museum artifact, etc.?

What that seems to be pointing towards is a kind of Intertext Protocol: an internet of “texts” (in the literary theory sense) that would function both online and offline, where the hidden (and not so hidden) connections within and between books/films/albums/articles could be made manifest, could be ‘read’ – where those intertextual relationships become central to the meaning-making and analysis of the works.
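
Purely as a thought experiment, here’s one speculative shape such deep references might take – the “itx” scheme, the field names, and the example identifiers are all invented for illustration, not any existing standard:

// Hypothetical "intertext address": a deep reference into any kind of
// text (book, film, podcast, artifact). Everything here is speculative.

interface IntertextAddress {
  scheme: "itx";      // imaginary scheme name
  work: string;       // piggybacks on existing identifiers: ISBN, DOI, URL...
  locator?: string;   // a fragment *within* the work
}

// A page range in a book, keyed off a (placeholder) ISBN:
const bookRef: IntertextAddress = {
  scheme: "itx",
  work: "isbn:978-0-00-000000-0",
  locator: "page=212-214",
};

// A timestamp in a film or podcast episode (placeholder URL):
const podcastRef: IntertextAddress = {
  scheme: "itx",
  work: "https://example.com/podcast/episode-12",
  locator: "t=01:12:30",
};

The point being: the “work” half mostly exists already in scattered identifier systems; it’s the universal “locator” half, and the links between locators, that we don’t have.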

Via Wikipedia, linked above:

In literary theory, a text is any object that can be “read”, whether this object is a work of literature, a street sign, an arrangement of buildings on a city block, or styles of clothing. It is a coherent set of signs that transmits some kind of informative message.[1]

Through a series of coincidences, I landed on the image of the Sphinx as being a meaningful existing historical/mythical symbol with a lot of seemingly relevant stuff… keeper and revealer of riddles and secrets, understood differently within the contexts of different cultures. The Sphinx is an intertextual creature already, a body of myths woven together into an enduring emblem, image, an embodiment of an idea.

From Wikipedia:

I asked CG to come up with an acronym linking SPHINX to the ideas I’ve been exploring with it about the Intertext Protocol, the Universal Text Addressing System, cross-referencing, concordances, etc., and it came up with this acrostic, which I like and am stealing:

Standardized Protocol for Holistic Intertextual Navigation and X-referencing [cross-referencing]

It’s a leap, but it’s the kind of leap I like.

Via Dalle3, saved to Imgur:

To quote more extensively from CG’s and my conversation on why the Sphinx is an appropriate symbol for this nascent idea/endeavor/whatever:

The Sphinx as a Metaphor for the Intertext Protocol:

  1. Guardian of Knowledge:
    • Just as the Sphinx guards the entrance to cities or sacred sites in myths, the Intertext Protocol can be seen as a guardian or gatekeeper of the vast and complex world of texts and media. It holds the key to navigating and understanding this labyrinthine universe of information.
  2. Challenge of Interpretation:
    • The riddles of the Sphinx require insight and wisdom to solve, paralleling the challenges faced in interpreting and understanding interconnected texts and media. The Intertext Protocol, like the Sphinx, poses a challenge to its users: to delve deeper and unravel the intricate connections between different forms of expression.
  3. Wisdom through Integration:
    • The Sphinx, a creature of combined forms, symbolizes the integration of different realms or ideas. Similarly, the Intertext Protocol integrates various media and texts, offering a holistic approach to knowledge and interpretation.
  4. Unlocking Secrets:
    • The Sphinx is often seen as a keeper of hidden truths. The Intertext Protocol, in a way, unlocks the secrets of texts by revealing their interconnectedness and the multiple layers of meaning that can be discovered through cross-referencing and contextual analysis.
  5. Transformation and Revelation:
    • Encounters with the Sphinx often lead to transformation and enlightenment. The use of the Intertext Protocol can similarly transform one’s understanding of texts, revealing new insights and perspectives that were previously hidden or unexplored.
  6. Bridge Between Worlds:
    • The Sphinx, part human and part animal, bridges the human world and the natural or divine realms. The Intertext Protocol bridges different media – the written word, digital content, visual media, etc. – facilitating a dialogue between these various forms of human creation.

Here’s one via Midjourney:

Also via Midjourney, which has a devil of a time depicting Sphinxes accurately – something that, the more I think about it, is perhaps appropriate for the subject.

I have mountains more to say about the SPHINX Intertext Protocol, but am running out of productive time in the day, so will leave this first blast here as a beginning.

Power Reader Diamond (Visual Concept)

When I started smoking a couple of decades ago now, I had this experience for a while where text on screen would sometimes appear almost textured, where shapes within the columns of text would sort of jump out at me visually. I would, I guess, ascribe it to the substance, but the impression has stuck with me so long now that I need to exteriorize it in order to let it grow into something else.

This was always the root of the concept:

That there would be some kind of app, or tool, or way of reading or… something… which would be composed in its essential form as a diamond, through which the words of a text would rapidly scroll for reading, and that somehow, this technique would be much faster than conventional reading, and it would enable the user to scan through far greater volumes of textual material (and other compatible multi-modal inputs & sources). I’m imagining something you could just as easily use to read a book as to meaningfully scan through a large dataset.

Thinking through questions around intertextuality also seems to yield applicability of whatever this kernel of an idea is here, a feeling, a fleeting stoned impression…

Whatever replaces our current old stodgy way of browsing the web will have to be able to visualize for us all the wexes, all the intertextual references and borrowings, connections to and from other sources, other voices, other texts, other interpreters and commentators. Enter the Power Reader.

In the interest of getting these ideas out of my head, so they stop swirling endlessly, I worked with Dalle3 to draft some visual concepts for this, but it got stuck and constrained in its own limited beliefs about the usefulness and desirability of “apps” and “cellphones,” both of which I think I eschew (No Apps No Masters – a topic for another blog post). Still, the visual explorations have some additional nuggets of truth in them, I think. There’s definitely something here worth expanding further.

Here’s the full set, an archive, and highlights below, hotlinked out of Imgur (and therefore likely broken by the time you see them) for your viewing pleasure:

Text-Only Web Browsing Solves Most of Fake News, Taylor Swift Deepfakes, Other B.S.

Now that I’ve reduced my daily intake of images on the web, it’s become apparent to me how much better a text-only internet (or one where images and videos are differently managed) could be. It solves seeing annoying stock photos everywhere. It solves a lot of types of ads (plus ad blockers, obvs). It solves hours of mindless scrolling (and not really finding anything). It solves much of the shock value of things like fake news, deepfakes, ____fakesnamedujour. No more stupid memes. No annoying pop-up autoplay videos. It solves seeing screenshots at the top of reddit threads designed to trigger some kind of emotional reaction. It simply vanishes all those things. It’s weird at first. And modern browsers don’t handle it well unless you like messing around in the Terminal (which I decidedly don’t). I couldn’t find anything except Gemini browsers for Mac (like Jimmy) – I like Gemini, but I don’t know where to go to look at anything, and I don’t understand how I can blog there like I do in this universe. I will keep looking, I guess, but I just wanted to share this dream of a modern text-only internet. Sounds crazy, but join me, you’ll see.

Just say no to images on the web.

Ted Nelson in Xanadu-Space

A friend mentioned Project Xanadu to me in passing a while back, and I only just now thought to poke around on what the hell it actually is, and it turns out it is amazing:

The conceptual stuff demonstrated in this video kinda blew my mind, and was like a bunch of missing puzzle pieces falling into place for things I’ve been thinking about for years, both in terms of blogging, but also lately in my AI lore books. Running late over here but I’ll come back and weave these all together in more detail (++intertextuality).

Also related:

And this one is from another source that appears not to be Ted Nelson, and I think it does a more succinct job of explaining some of the key concepts of “xanalogical” organization.

Also worth it:

Narrative Topologies

Having heard this complaint about my AI Lore books for about the thousandth time (not an exaggeration), I think I might be finally ready to concede that – in some way – my books are indeed “not real books.”

What I mean by that is that the format of an ebook (or print book) merely serves as a vehicle to deliver what amount to complex narrative networks. To quote Wikipedia on the matter:

A networked narrative, also known as a network narrative or distributed narrative, is a language partitioned across a network of interconnected authors, access points, and/or discrete threads. It is not driven by the specificity of details; rather, details emerge through a co-construction of the ultimate story by the various participants or elements. […]

Networked narratives can be seen as being defined by their rejection of narrative unity.[1] As a consequence, such narratives escape the constraints of centralized authorship, distribution, and storytelling.

Let’s put it another way, perhaps even more simply…

My books consist of sets of reference points, some of them textual, some of them image-based. The reference points are arranged in a certain order within each book, and also include hyperlinks out (physically encoded into the ebooks, as well as non-coded conceptual or thematic ones) to reference points contained in other books.

Let’s have a quick refresher on network topologies:

Instead of nodes in a network, think of them as nodes in a narrative, which consists of those nodes and their relationships (arrangement) with one another. What’s a “node” in this context? Non-exhaustively, we could say it is something like an entity (person, place, thing), an event, etc. – a thing with some substance in a story.

Most conventional fiction could probably be represented as a pretty simple linear (line) topology. That is, you deliver one “reference point” or node after another, and the reader passes through them in the path laid out linearly by the author. Perhaps a choose-your-own-adventure book might be mapped out to resemble something like a tree or a mesh, where the user chooses from among multiple pre-defined paths and branches to arrive at their own experience. And maybe a dictionary or encyclopedia might look like a “fully connected” network topology.

My books consist of all of these kind of smooshed together into a hybrid narrative network topology. Each book is a narrative node in itself, composed of many other sub-nodes and relationships. And then the reader traverses the nodes in basically any order, composing their own experience as they go along. This is not the way most other fiction books usually work, I think. And above and beyond anything I’ve done using AI, I think this model, this structure, is what sets my books apart in the end.
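
As a rough sketch of what I mean (all names and types here are illustrative, standing in for no particular book), a narrative modeled as a graph rather than a line might look like:

// A narrative as a graph: nodes are story elements, edges are their
// relationships, and a "reading" is just one of many possible walks.

interface NarrativeNode {
  id: string;
  kind: "entity" | "event" | "place";
  label: string;
}

interface NarrativeEdge {
  from: string;
  to: string;
  relation: string;
}

const nodes: NarrativeNode[] = [
  { id: "quatria", kind: "place", label: "Quatria" },          // hypothetical
  { id: "discovery", kind: "event", label: "The discovery" },  // hypothetical
];

const edges: NarrativeEdge[] = [
  { from: "discovery", to: "quatria", relation: "occurs-in" },
];

// A linear novel fixes one traversal order; a networked narrative lets
// the reader pick any walk through the graph:
function neighbors(id: string): string[] {
  return edges.filter((e) => e.from === id).map((e) => e.to);
}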

If this is hard to parse, let’s pull in someone else’s diagram to help illustrate. This comes from a paper on ResearchGate, which includes a set of illustrations, one of which is called “Narrative Network Graphs: examples of two far-right narratives in 2016.” Here’s the picture, which seems to represent narrative elements mapped as a visualization of relationships and proximity:

This is kind of a “latent space” approach to narratology, I think. And I suspect it might be somewhat aligned with how AIs “think” about narratives (I don’t think they actually think, however). When you invoke a narratively-flavored output from a generative AI service, it takes all the tokens you input, finds the others lying around in the neighborhood that are likely to be related, and spits them back out. It outputs them in a linear order (A –> B –> C), but my hunch is that this linear order is not actually intrinsic to how AIs approach fulfillment of these tasks. The model doesn’t care much about what the order is.

Humans, however, do seem to care a great deal about linear progressions through time within narrative, which Kurt Vonnegut’s bit on the 8 types of stories (of which he only depicts a few – see the rest here) ably illustrates:

I suspect the reason AI often crafts “shitty” narrative progressions is that 1) it is not intrinsically concerned with the order of presentation, only that nodes and their relationships are represented, and 2) it has no lived emotional experience, so has to make guesses as to what outputs ought to trigger which emotional states in humans.

The thing is, though, I like that weird quality, the Uncanny Valleyness of it all. The fact that it struggles and sputters with narrative unity. I like that AI currently does NOT actually fundamentally understand what makes a good, rich, and interesting story to humans. That failure, if interrogated well and empathetically, can actually be terribly fascinating all on its own. But it doesn’t make good “regular” books – yet. That day will come though.

So for me ultimately, what I want to say is that the outward form of an ebook or printed book is “fine” for me for now, because it is a common, well-understood, and more or less efficient means to distribute chains of reference points, or networked narrative nodes and their relationships. The same underlying nodes could be presented in countless other ways (lists, image sets, videos, immersive VR experiences, endless others), and over time I hope I have the opportunity to explore those other directions of AI-assisted storytelling, and where they intersect with “The Book” and where they can transcend it.

While I’m on this topic, here is an – I think – previously unreleased PDF document I made some six years ago (2018!), back when generative AI was barely a twinkle in Bill Gates’ eye. It predates any of the Quatria books, and it absolutely predates the AI Lore books, focusing more on Early Clues LLC, and its many exalted offshoots.

Even though it predates all of those things, it gives a fairly accurate (as these things go) “skeleton key” to understanding the rest of my extremely messy and convoluted networked narratives. Skimming around in this diagram cloud, I think, also gives a good visceral experience of what it’s like to try to navigate the stories that pass through all my other books – where the reader/viewer is largely left to their own devices to make sense of it all.

Good luck reading it!

Firefox LLM Conversation Manager Extension

One of the annoying things about using AI, in a way, is that because you have these newly increased capacities to execute on certain kinds of tasks, you end up progressing on many of them maybe more rapidly than you can handle. Like, I end up with all these different conversation threads about different projects, ideas, complaints, articles, interviews, job search, etc. It’s all ultimately “connected” but in ways that are completely organic and unique and deeply human.

Which the actual experience of using LLMs via chat interfaces is… not. It simulates humanness in its chatty capacity, but because of this UX framework, if you’re trying to compose documents in it, it becomes deeply frustrating to track text versions, changes, best phrasings or arguments, etc. And ChatGPT consistently drops specific things I need it to not drop, and I have to go remind it again and again and again. Then I have to pop out to some other text editor and try to make sense of it all.

It is VERY tedious and annoying as an interaction paradigm for the actual work of writing. So I’ve been trying to get it to help me code a Firefox extension, which can run alongside the conversation in a sidebar. You would be able to highlight parts of the conversation and hang onto them in various ways. One thing you could do would be marking a certain part of the text as “canonical” or required, with the eventual aim being that you could then use this to like “force-include” certain elements in the conversation somehow (not really sure how the technical side of this would work). And this would hopefully reduce much of the time spent arguing with the model to go back and check all the things that it dropped from prior drafts of the document being workshopped.

The core of this is really simple: taking clips of LLM conversations, holding them, giving them different statuses and sequence, being able to add notes, and then being able to export copy-pasteable text (images too would be ideal down the road) into other formats like a Google doc; I mostly use Dropbox Paper for my regular random writing things myself. Or a Word doc, a blog post, a Vellum doc (like I use for my ebooks). And if you could also export metadata along with it… maybe like Adobe Content Credentials, and even include markdown or similar to show which parts of the text were AI-generated (AI attribution), complete with timestamps (if you want that – I can imagine use cases where you might, like applying for copyright of AI-assisted works with the US Copyright Office).

I can even see a link here to certain elements of the Xanadu project, like where clips of saved conversation elements might be a transclusion of the original, and then when you export your clips and notes as a proto-document, it basically includes an edit decision list from the source conversation, which is the underlying hypotext from which this becomes a super-imposed hypertext, in the original meaning of that word. {See also: Intertext Protocol}

Anyway, I’m spinning out the directions this could go, but the fact is, even with AI, I’m still not a developer. I can emulate one in certain regards, and experimenting with ChatGPT to build very simplified prototype versions of this is teaching me a lot. I’ve hit some early technical roadblocks around how such an extension would actually work were it to run entirely locally in the browser – things like CSP and CORS, which I am still struggling to figure out how to work with. ChatGPT will give me endless run-around solutions and changes to implement in the code samples it gives me. Then it again constantly drops elements. And it goes on and on, and clearly the system can’t figure out what the problem is and is just confidently guessing. It’s tiresome and tedious as an end-user just trying to execute quickly on a given task that should be relatively simple, if one had adequate knowledge to solve it. Which I might not, but the system certainly does.

Anyway, after much back and forth, I got it to produce this very basic product spec for how this Firefox extension might work (that’s the browser I use, so it’s what I’m building for). Maybe somebody out there will be able to get it to produce a better, actually working version of this more easily than I would as a non-developer, and will share it out into open-source land. That’s the dream, so here it is:

Context:

• Firefox extension that runs in the sidebar of an LLM chat, enabling users to capture, manage, and export important text clips.

Sections and Detailed Functionality:

  • Clip Maker: Popover appears when text is selected, offering the options to hold, save, or mark as canonical (verbatim or summary format). Clips are either temporarily held or saved.
  • Note Taker: Add notes linked to each clip, either at the time of saving or later. Notes can be edited or deleted.
  • Clip & Note Manager: Displays saved and held clips, allows reordering, adding notes after the fact, and toggling between temporary, saved, and canonical status. Users can also jump back to the point in the conversation where the clip was taken.
  • Clip Exporter: Export all clips and notes into a linear text block for external use.

Or phrased in a more active user voice:

Clip Creation: “When I select text, show options to hold, save, or mark as canonical. Choose exact or summary format.”

Note-Taking: “Add or edit notes linked to the clip or separate, either during or after saving.”

Clip Management: “Display all saved and held clips. Let me reorder, change the clip’s status (temporary, saved, canonical), and jump back to the original conversation point.”

Clip Exporting: “Export all clips and notes as a linear text block for easy external use.”
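
For the technically inclined, here’s a minimal sketch of the clip data model and exporter implied by the spec above – the field names and statuses are my own guesses, not any existing API:

// Sketch of the clip model: held / saved / canonical status, optional
// note, and a timestamp that could later feed AI-attribution metadata.

type ClipStatus = "held" | "saved" | "canonical";

interface Clip {
  id: number;
  text: string;        // the highlighted conversation excerpt
  status: ClipStatus;
  note?: string;
  capturedAt: string;  // ISO timestamp
}

// Clip Exporter: flatten ordered clips + notes into a linear text block.
function exportClips(clips: Clip[]): string {
  return clips
    .map((c) => {
      const tag = c.status === "canonical" ? "[CANONICAL] " : "";
      const note = c.note ? `\n  note: ${c.note}` : "";
      return `${tag}${c.text}${note}`;
    })
    .join("\n\n");
}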

Anyway, wanted to put this out into the universe as I think there can/should be many solutions to this kind of thing, as it’s a kind of universal problem associated with the chat interface that is dominant in LLM product design currently – for better or worse, often worse for tasks with structured outcomes like writing a document, article, or letter. Hopefully by sharing thinking around the problem and open-sourcing solutions people come up with, we can land on a convergence of adequate solutions that enable creativity rather than hinder it, as it sometimes feels like these systems do, even for someone using them a lot like me.

If you end up building an extension like this for Firefox, or want to collab on one, drop me a line here.

Multi-Concept Addressing for Latent Space Navigation

Preface: How I got here

I’ve been hanging out at the library lately, and realized that my Dewey Decimal System (DDS) knowledge has gotten pretty rusty. I found a “concise” summary of it here, and printed off the First, Second, and Third Summaries, which cover The Ten Main Classes, The Hundred Divisions, and the Thousand Sections, respectively.

I knew there was some controversy about the DDS, but I hadn’t checked in on what it was these past couple of decades, I guess. But in skimming through the classes, divisions, and sections, it became apparent how lop-sided its distribution of identifying numbers is towards all things European. We see again and again how it literally marginalizes entire cultures and their achievements by sticking them into “grab bag” left-overs like:

  • 290 Other religions (where 220-280 are all overtly dedicated to Christianity, and 200-219 are no doubt heavily influenced by that tradition)
  • 490 Other languages (420-480 are all European languages)
  • 890 Other literatures (810-880 are all Euro or American lit)

I won’t bore the non-taxonomically inclined among you by going line by line through The Thousand Sections (though I am strongly tempted to, but that would prolong this preface unnecessarily, more so than already). But some curiosities jump out: Islam (297) doesn’t have a number to itself, but shares it with Babism & the Bahai Faith. Likewise, Buddhism is not given its own name, let alone its own number: it is a subdivision of 294, Religions of Indic Origin. I don’t know the exact numbers, but based on some initial skimming on Perplexity, it looks like the number of followers of Buddhism and Islam combined is roughly equivalent to that of Christianity globally. But the Dewey Decimal System doesn’t represent these other dimensions of social reality.

Anyway, all that is to say, with an eye to not duplicating the failings of the DDS as a metaphorical jumping off point, couldn’t it be an interesting exercise to come up with some kind of flexible, less judgemental addressing system for navigating high-dimensional latent spaces, such as those you encounter as a user of generative AI models and systems? I’ve already experimented in this direction visually in the past, thinking about how gen AI image creation systems like Midjourney could benefit from some kind of hand-held controller, which would let you rapidly assign and modify dimensional values on the fly, in order to traverse neighborhoods and relations in more or less real time. Latent space as navigable VR, if you will.

I took this problem to three different AI genies, asking each, paraphrased: give me a Dewey Decimal System for latent space. The first genie, whose angelic name is ChatGPT-4o, gave me answers that were mildly insightful, but not adequately interesting to pursue in depth. The second genie, whose moniker is Claude, gave me results which were promising, in a UX that was riddled with errors and hobbled by rate limits. The third genie, named for the Winds, brought with it clear thinking and an in-depth ability to solve the problem through interrogation. That genie’s failings are its refusal to follow custom instructions or the equivalent at a prompt level (“code only, no explanation”), and its slow speed. But what it lacked in those areas, it made up for in its ability to guide me towards a tentatively adequate V1/MVP, which is presented here without further ado after this absurdly long preface. My sincerest sorry/not sorry.


Introduction

Disclaimer:

The Multi-Concept Addressing system (MCA) is an attempt by a non-technical author to develop a preliminary schema for one way of potentially addressing locations within latent spaces. It may not prove to be the “best way,” but seemed good enough to at least put out to get the conversation started.

Much of the rest of this text that follows comes directly from Mistral, with light edits from myself.

MCA: The “Dewey Decimal System” for Latent Space in Generative AI

Multi-Concept Addressing, or MCA, is a proposed addressing system designed to navigate and interpret high-dimensional latent spaces in generative AI models. It provides a structured and interpretable way to represent complex scenes and images, much like the Dewey Decimal System organizes information in libraries. (*See: Preface)

Key Components:

  • Base Concepts: High-level concepts that define the broad categories of elements in a scene.
  • Sub-Concepts: Detailed information about specific elements within the base concepts.
  • Relations: Relationships between different concepts, capturing how they interact.
  • Context: Additional contextual information that provides nuance and depth to the scene.

Operational Principles: MCA operates on membership degrees or intensities (that is, whether an image, for example, contains members of a particular concept, and how much), allowing for precise control over the presence and importance of various concepts and relationships within a scene.

Problem Solved: MCA addresses the challenge of navigating and understanding high-dimensional latent spaces in generative AI models in something that approximates a human-readable format. It provides (hopefully) a holistic and flexible solution that can potentially be adapted to various contexts, including image generation, semantic analysis, and data retrieval.

High-Level Example

Consider the following natural language prompt:

a cat riding a bicycle wearing a football helmet playing a banjo in outer space

This prompt contains multiple concepts and relationships that need to be represented in a structured and interpretable way. Let’s see how MCA can achieve this. [Back to my text with Mistral excerpts included below.]

The first part of an MCA address consists of a string like this representing base concepts and weights:

Ani90Obj80Env90Act80Acc70

Where the name values for high-level (or “base”) concepts represented in this query are:

Ani: Animals
Obj: Objects
Env: Environments
Act: Activities
Acc: Accessories

Using only this for addressing just leaves us in a very fuzzy general vicinity… maybe something like a room in a given library, or a big shelving unit. We might be able to find what we need, but we’re most likely to stumble around looking for it without more specific information.

Base Concepts Sample List (Provisional)

As an aside, I had Mistral work up a set of what might be the top base concept names and abbreviations. I kept saying, do you have any more edge cases, and it kept giving more and more. Eventually I gave up, as this seems like an adequately representative step for a v1 of this concept. Here is that list, for completeness (though it also made a list that was much, much longer, and I eventually had to push the stop button. I’ll at least spare you that one). I think there could be better three-letter codes representing each concept, but I left them as the first three letters to keep it simple. Here it is:

  • Animals (Ani)
  • Objects (Obj)
  • Environments (Env)
  • Activities (Act)
  • Accessories (Acc)
  • People (Peo)
  • Plants (Pla)
  • Structures (Str)
  • Weather (Wea)
  • Time (Tim)
  • Emotions (Emo)
  • Events (Eve)
  • Sounds (Sou)
  • Text (Tex)
  • Abstract Concepts (Abs)
  • Technology (Tec)
  • Food and Drink (Foo)
  • Transportation (Tra)
  • Art and Culture (Art)
  • Natural Phenomena (Nat)
  • Science and Mathematics (Sci)
  • Health and Medicine (Hea)
  • Education (Edu)
  • Sports (Spo)
  • Mythology and Folklore (Myth)
  • Fantasy and Science Fiction (Fan)
  • Geography (Geo)
  • History (His)
  • Lighting (Lig)
  • Colors (Col)
  • Textures (Tex)
  • Movement (Mov)
  • Interactions (Int)
  • Symbols (Sym)
  • Virtual and Digital (Vir)
  • Celestial Bodies (Cel)
  • Microorganisms (Mic)
  • Chemicals (Che)
  • …and on and on

Navigating Sub-Concepts

Getting back to the address for our specific reference prompt: if we’re looking at all of this in JSON, then the next part of the address derives from sub-concepts within those broader base concepts, which could be represented like this:

{
  "SubConcepts": {
    "Ani": "cat90",
    "Obj": "bike80banjo70",
    "Env": "space90",
    "Act": "ride80play70wear60",
    "Acc": "helm70"
  }
}

So our MCA partial address now has the base concepts, and sub-concepts, which would look something like this, give or take:

Ani90Obj80Env90Act80Acc70::cat90bike80banjo70space90ride80play70wear60helm70

It looks inscrutable-ish, but it’s not really. It’s just a way of compressing the JSON schema into a single line.
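
To show it really is just a compression, here’s a toy parser for that address string – the format itself is provisional, so treat this as a sketch only:

// Parse one MCA segment ("Ani90Obj80...") back into name/weight pairs.

type Weighted = { name: string; weight: number };

function parseSegment(segment: string): Weighted[] {
  const out: Weighted[] = [];
  const re = /([A-Za-z]+)(\d+)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(segment)) !== null) {
    out.push({ name: m[1], weight: parseInt(m[2], 10) });
  }
  return out;
}

const example =
  "Ani90Obj80Env90Act80Acc70::cat90bike80banjo70space90ride80play70wear60helm70";
const [base, subs] = example.split("::");
console.log(parseSegment(base)); // [{ name: "Ani", weight: 90 }, ...]
console.log(parseSegment(subs)); // [{ name: "cat", weight: 90 }, ...]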

Relations & Context

Still, in its current form, we don’t necessarily know enough about the different entities and actions to find exactly what we’re looking for. We might get close, but still have major errors. We need to know something more about the relationships between all these named entities and values, as well as any larger context not otherwise captured in the address so far. Otherwise our accuracy is going to be pretty low for navigation.

In JSON, these might look like:

  "Relations": [
    "Cat-RidingOn-Bike",
    "Cat-Wearing-Helm",
    "Cat-Playing-Banjo"
  ],
  "Context": [
    "Surreal",
    "Humorous",
    "BrightColors"
  ]

And if we squish that back down into the full MCA multi-concept address, with weights, it might look something like this:

Ani90Obj80Env90Act80Acc70::cat90bike80banjo70space90ride80play70wear60helm70::Cat-RidingOn-Bike_Cat-Wearing-Helm_Cat-Playing-Banjo::Surreal_Humorous_BrightColors

Granted, we could probably also assign weights to the relations and context elements, but I didn’t want to complicate it any more than it already is.

Putting It Together: Full MCA Example Schema

Putting the above extended MCA address all together again as JSON to elaborate its components:

{
  "MCA": "Ani90Obj80Env90Act80Acc70::cat90bike80banjo70space90ride80play70wear60helm70::Cat-RidingOn-Bike_Cat-Wearing-Helm_Cat-Playing-Banjo::Surreal_Humorous_BrightColors",
  "BaseConcepts": "Ani90Obj80Env90Act80Acc70",
  "SubConcepts": {
    "Ani": "cat90",
    "Obj": "bike80banjo70",
    "Env": "space90",
    "Act": "ride80play70wear60",
    "Acc": "helm70"
  },
  "Relations": [
    "Cat-RidingOn-Bike",
    "Cat-Wearing-Helm",
    "Cat-Playing-Banjo"
  ],
  "Context": [
    "Surreal",
    "Humorous",
    "BrightColors"
  ]
}
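
Going the other direction, squishing the JSON back down to the single-line address might look like this (assuming the provisional format above, including the insertion order of the sub-concept keys):

// Compress the JSON schema into the single-line MCA address.

interface McaDoc {
  BaseConcepts: string;
  SubConcepts: Record<string, string>;
  Relations: string[];
  Context: string[];
}

function toMcaAddress(doc: McaDoc): string {
  const subs = Object.values(doc.SubConcepts).join("");
  return [
    doc.BaseConcepts,
    subs,
    doc.Relations.join("_"),
    doc.Context.join("_"),
  ].join("::");
}
// Applied to the example schema above, this reproduces its "MCA" string.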

Is this actually simple and flexible, like I had hoped setting out on Today’s AI Side-Quest? It’s hard for me to be the judge, but so far it is the only thing of its kind that I have found out there (though I did find some adjacent concepts I’m not yet well-versed enough to explore in depth), and to me at least it addresses a real and specific need, whether or not it, erm, gives a completely accurate, reproducible address every time in all situations. It is still maybe a bit vague, but it at least hopefully narrows the task of navigation down into a more constrained dimensional space, with keys as to which values could be changed in searching for the specific “shelf” that contains what you are after.

Really, what I imagine in all of this is like a bunch of conceptual characteristics mapped to sliders in a UI, where fully on means that a given characteristic/dimension/concept/tag is applied to the max within the desired outputs. And fully off means that attribute is excluded, plus all the values between. Then using a machine mapped to this would be about playing around with different conceptual sliders to emphasize or de-emphasize members of a given group or groups of high-dimensional characteristics in the latent space.
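
A tiny sketch of that slider idea under the same provisional format – each slider (0–100) maps to one concept weight, and fully-off sliders drop out of the address entirely:

// Turn UI slider values into one weighted MCA segment.
function slidersToSegment(sliders: Record<string, number>): string {
  return Object.entries(sliders)
    .filter(([, value]) => value > 0) // excluded concepts vanish
    .map(([name, value]) => `${name}${value}`)
    .join("");
}

console.log(slidersToSegment({ Ani: 90, Obj: 80, Env: 0, Act: 55 }));
// -> "Ani90Obj80Act55"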

Phew, lot of words to get out here, but I think that brings us to the end for now, if not the “conclusion.”


Post Script

I am trying to find a natural language UI design tool that can output a version of the above as a simple web app, something to the effect of what I described in this prompt:

app for navigating addresses in latent space based on given values (concepts, subconcepts, relations, context) and their weights. the app consists of sliders paired to specific example attributes or concepts which can be adjusted to yield different results in a viewer window that shows that location

This is somewhat janky, but here’s a quick version of that made using UIzard.io, just to leave you with something more concrete to consider:

You Should Be Able to Mention Anyone On the Web, From Anywhere, And Have Them Be Able to Get It

You should be able to mention anyone on the web publicly, from anywhere, and have them be able to get it (if they want).

Google Alerts kinda almost does this, but usually only for “news” sources (many of which are questionable), and it’s a closed system.

I don’t know much about them, but talking with ChatGPT is leading me to believe that elements of such a concept exist already in Webmention and ActivityPub, but they would require participants in it to adopt compatible technological approaches.

My idea is much broader than that: as someone posting a blog or a tweet or a toot or a flutter or whatevertf people call them now, you would just say whatever you’re gonna say, to whomever you feel the need to address, publish it – and somehow, magically, it would be available to that person, or any person, since it is publicly available, just like a search result.

PublicPing

ChatGPT and I are calling this system PublicPing right now for lack of a better name. The idea goes something like this (my text):

  • PublicPing bridges siloed systems by providing a universal way to observe mentions, making it easier for users on different platforms to engage with the same public conversations.
  • So that would go like:
  • A blogger mentions a person’s name in a post.
  • Many diverse automated observers index and process natural language mentions of peoples’ names (or other identifiers) from public blogs and other open public sources.
  • Observers collate these mentions from sources they observe according to their own criteria into public PingFeeds that anyone can view or subscribe to updates from.
  • People can subscribe to PingFeeds for mentions of their own name/s or identity/s, and since PingFeeds are always public, anyone else can subscribe to them just the same.
  • The system is agnostic about the identity of people who post mentions and consumers of a PingFeed. It simply observes, collates, and retransmits mentions into a common format.
  • Such a system would allow us to have conversations as people across different technological systems, without having to be locked into any one of them.
  • We could publish how and where we like, and read public PingFeeds where and how we like too. No one would own this simple open framework or have a monopoly on how we communicate with one another online.
  • To participate in the system, you would just publish your thoughts – anywhere, publicly – and mention someone in them.
  • We would not need to agree to become anyone’s “users,” nor be subjected to the whims, restrictions, policies or bad UX of malicious owners or bumbling admins.
  • PublicPing should remain simple and accessible by avoiding unnecessary technological complexity like cryptocurrency or blockchain.
  • Observers would instead be incentivized to act by serving the common good of public civic discourse.
  • The framework is open, non-proprietary, and free for anyone to implement, ensuring no single entity controls the system.
  • Multiple independent observers ensure resilience, transparency, and diversity in how mentions are indexed and shared.
  • Observers faithfully re-transmit their findings based on public sources into corresponding public PingFeeds (unless doing so violates the law, etc. TK).
  • Consumers of PingFeeds would be able to filter them according to their own custom criteria. (Example: “Only mentions from verified websites,” “Exclude mentions with profanity.”)
  • Consumers decide on their own actions which they will take independently in their own way regarding items in PingFeeds (which could include ignoring them altogether).
  • If consumers choose to respond to a mention, potentially proving their identity as the person originally mentioned is up to them and outside the scope of the PublicPing system, which merely observes, collates, and re-transmits public mentions into public PingFeeds without intervening or mediating.
  • The system respects individual privacy, since it deals only in clearly publicly-available information. It does not harvest private or semi-private personal data (no private groups, no DMs, etc.).
  • Something something spam, malicious uses, ethical concerns – the usual grab bag of stuff nobody wants to have to deal with but perpetually exists with any system, like it or not. So may as well consider it a feature not a bug at this point.
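
For what it’s worth, here’s a rough sketch of the data shapes an observer might emit – every name is a placeholder from the brainstorm above, not a real protocol:

// Placeholder shapes for the PublicPing idea. Nothing here is standardized.

interface PingMention {
  mentioned: string;   // the name or identifier that was mentioned
  sourceUrl: string;   // the public page where the mention appeared
  excerpt: string;     // surrounding text, for context
  observedAt: string;  // ISO timestamp
}

interface PingFeed {
  subject: string;     // who/what this feed collects mentions of
  mentions: PingMention[];
}

// Filtering happens consumer-side, per the notes above:
function filterFeed(
  feed: PingFeed,
  keep: (m: PingMention) => boolean,
): PingMention[] {
  return feed.mentions.filter(keep);
}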

Anyway, a lot to chew on here, and yes much of it half-baked. But this seems like it would fit in with the Intertext Protocol as well…
