
Intent is often not knowable

As someone who has spent a lot of time in the trenches analyzing content for moderation purposes, I can confidently say that you can rarely truly determine intent. It’s often murky, especially when you enter the realm of satire, trolling, disinformation, etc.

That’s why I’m somewhat heartened to see the following included in the WITNESS & MIT Co-Creation Studio 2023 action plan around satire & synthetic media:

DON’T GET HUNG UP ON INTENT

Intent is going to be hard (and intent shifts as media moves). But explore crowdsourced and decentralized smaller community-based assessment to detect, understand and assess intent as well as consequences. 

One thing I’ve seen in the “disinformation industrial complex” is that, A) there’s still a lot of needless quibbling over how to define misinformation vs. disinformation, and B) the difference people land on is usually one of intent (where misinformation is wrong + accidental, and disinformation is wrong + intentional – which I think is a bit lacking).

From the perspective of someone who has had to carry out thousands upon thousands of enforcement actions, I would argue that intent is opaque and easily masked. You don’t have hundreds of hours to analyze each case; you have seconds or minutes. So the analysis necessarily shifts to consequences, as in the quote above – or, more specifically, to harms: likelihood, severity, who is impacted, what the specific harm is, and so on. In other words, the risk analysis matrix.
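
To make that concrete, here is a minimal sketch of how a harms-based risk matrix might be encoded for triage. The factor names, levels, and thresholds below are my own illustrative assumptions, not any platform’s actual policy.

```python
# Hypothetical sketch of a harms-based risk matrix for moderation triage.
# Factor names, levels, and thresholds are illustrative assumptions only.

from dataclasses import dataclass

LEVELS = {"low": 1, "medium": 2, "high": 3}

@dataclass
class HarmAssessment:
    likelihood: str   # how likely is the harm to materialize?
    severity: str     # how bad is it if it does?
    affected: str     # e.g. "individual", "protected group", "general public"
    harm_type: str    # e.g. "harassment", "medical misinfo", "incitement"

    def risk_score(self) -> int:
        # Classic likelihood x severity matrix, independent of intent.
        return LEVELS[self.likelihood] * LEVELS[self.severity]

    def triage(self) -> str:
        score = self.risk_score()
        if score >= 6:
            return "escalate / remove"
        if score >= 3:
            return "limit reach / label"
        return "monitor"

# Example: the same post is scored on consequences, not on guessing intent.
post = HarmAssessment("high", "medium", "protected group", "harassment")
print(post.triage())  # -> "escalate / remove"
```

The point of the sketch is simply that every input is observable (or at least estimable) in seconds, whereas intent is not.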

The quote above points towards community-based assessments, presumably as a way to expand the points of view leveraged to make determinations. Multi-assessor frameworks can definitely add value in difficult situations, though they can also be hard to make proper use of when there is a pressing time element (as so frequently occurs in content moderation). How does one apply this as an individual content moderator, for example?
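
For the sake of illustration, here is a rough sketch of how a small multi-assessor aggregation step might work under a time budget. The quorum size, labels, and tie-breaking rule are assumptions on my part, not something drawn from the WITNESS plan.

```python
# Hypothetical sketch: aggregate several independent assessments of one item,
# with a quorum so the process still fits a moderation queue.
# Quorum size, labels, and tie-breaking rule are illustrative assumptions.

from collections import Counter

def aggregate_assessments(assessments: list[str], quorum: int = 3) -> str:
    """Return a decision once enough independent assessments have arrived."""
    if len(assessments) < quorum:
        return "undecided"  # fall back to a single trained moderator, escalate, etc.
    label, count = Counter(assessments).most_common(1)[0]
    # Require a simple majority of the assessments received; otherwise punt.
    return label if count > len(assessments) / 2 else "undecided"

# Three community assessors look at the same item independently:
print(aggregate_assessments(["harmful", "harmful", "satire"]))  # -> "harmful"
print(aggregate_assessments(["harmful"]))                       # -> "undecided"
```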

I’ve not used it myself (as I haven’t been active on Twitter in quite some time), but Twitter’s Community Notes, aka Birdwatch, seems to be an example of community-based assessment. Does it work? I’m not sure – that probably depends on how we define what “working” means, and how it could be effectively measured.

In any event, there’s more to be said here, but I just wanted to establish a beachhead with some references to unpack further later on…


1 Comment

  1. Tim B.

    also, if we accept that even skilled human investigators may have great difficulty establishing strong evidence of intent, then that seems to suggest that trying to have AIs determine whether something is a malicious use is a fraught concept from the get-go:

    https://www.timboucher.ca/2023/02/ai-alignment-malicious-use/
