Interestingly, there is an old XML dialect called Artificial Intelligence Markup Language (AIML). Wikipedia says it originated between 1995 and 2002 or so. Its purpose seems to have been defining patterns for use in chatbots – basically question–answer pairs. The pattern they use as an example:
<category> <pattern>WHAT IS YOUR NAME</pattern> <template>My name is Michael N.S Evanious.</template> </category>
It’s not at all what I imagine the use case would be for something like AIMark (which is really still just an experiment), or for other attempts at a microformat for distinguishing contributions in hybrid AI/human authorship texts.
The use case there is:
- I am an author who uses AI-assisted editing tools
- I want to track the contributions made by me (human) and the AI
- I want to communicate these attributions somehow in a meaningful way to readers.
- My readers might get some benefit from that.
It’s based on the assumption that this is something readers might want – though the benefits are unexplored/unknown. The first-layer user want is simpler: sometimes you might want to include or exclude AI-assisted content in your search results or social feeds.
It’s difficult, and becomes costly at scale, to accurately detect AI-generated content, given the variety of methods and technologies available and the speed at which they are improving.
So what if you could reduce the load somewhat by having content creators voluntarily disclose not just the presence of AI content at a high level, but at an inline, granular level as well? Readers could see, line by line or word by word, what was AI-generated and what was a human contribution. (It could even be “signed” by the creators.)
The core concept reduces to these simple examples:
%indicates AI-generated text content%
/indicates human-written content/
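Since AIMark is still just an experiment, here is a minimal sketch of how those two markers might be parsed – the `%...%` / `/.../ ` syntax is taken from the examples above, but the span labels (`ai`, `human`, `unknown`) and the function name are my own assumptions, not any settled format:

```python
import re

# Hypothetical AIMark-style markers:
#   %...%  AI-generated text
#   /.../  human-written text
MARKER = re.compile(r"%([^%]*)%|/([^/]*)/")

def parse_spans(text):
    """Split marked-up text into (source, text) spans."""
    spans = []
    pos = 0
    for m in MARKER.finditer(text):
        if m.start() > pos:
            # text outside any marker gets no attribution
            spans.append(("unknown", text[pos:m.start()]))
        if m.group(1) is not None:
            spans.append(("ai", m.group(1)))
        else:
            spans.append(("human", m.group(2)))
        pos = m.end()
    if pos < len(text):
        spans.append(("unknown", text[pos:]))
    return spans

print(parse_spans("%The sky is blue.% /I checked this myself./"))
# → [('ai', 'The sky is blue.'), ('unknown', ' '), ('human', 'I checked this myself.')]
```

A real format would need escaping rules (a literal `%` or `/` inside text breaks this naive version), but the span structure is the point.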
It might be awkward to read long text with those symbols inline, so presumably you could have a presentational mode, where the text just reads straight, and an “X-ray” mode, where this analytical/forensic layer about the construction of the text is exposed.
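The two modes could share one pass over the markup: the presentational mode simply drops the markers, while the X-ray mode surfaces them. A sketch, again assuming the experimental `%...%` / `/.../ ` syntax, with the `data-source` attribute name being my own illustrative choice:

```python
import re

MARKER = re.compile(r"%([^%]*)%|/([^/]*)/")

def render(text, mode="plain"):
    """Render marked-up text: 'plain' strips markers, 'xray' exposes them."""
    def sub(m):
        body = m.group(1) if m.group(1) is not None else m.group(2)
        if mode == "plain":
            return body  # presentational: markers vanish, text reads straight
        tag = "ai" if m.group(1) is not None else "human"
        return f'<span data-source="{tag}">{body}</span>'  # X-ray layer
    return MARKER.sub(sub, text)

print(render("%Generated claim.% /Human note./"))
# → Generated claim. Human note.
print(render("%Generated claim.% /Human note./", mode="xray"))
# → <span data-source="ai">Generated claim.</span> <span data-source="human">Human note.</span>
```

In a browser, the X-ray output could then be styled or filtered per source without the reader ever seeing raw markers.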
You could also, for example, use the X-ray layer to identify claims and link to fact-checks – especially where certain AI models are known to fabricate or suppress information around particular topics or blind spots.
Anyway, I got interrupted writing this… Will do a continuation of it another time, but wanted to jot down all the above while still fresh in my head.