AI Safety is not *only math*

Continuing my series of mini rants about the lack of non-STEM specialization in AI safety, I found this article on LessWrong about pathways into the field.

Not surprisingly, it echoes the popular idea that all it takes to make a qualified AI Safety researcher is math and a PhD:

Perhaps unsurprisingly, the researchers we talked to universally studied at least one STEM field in college, most commonly computer science or mathematics….

It is sometimes joked that the qualification needed for doing AI safety work is dropping out of a PhD program, which three people here have done (not that we would exactly recommend doing this!). Aside from those three, almost everyone else is doing or has completed a PhD. These PhD programs were often but not universally, in machine learning, or else they were in related fields like computer science or cognitive science.

Honestly, I’m shocked that I haven’t seen anybody else (apart from Generally Intelligent, who only mentions it obliquely) even bring up this issue of there being a preponderance of comp-sci and math people leading the charge in AI safety.

It’s not that I think those viewpoints are unnecessary; it’s that I think they are inadequate on their own to give a well-rounded perspective on anything so profoundly impactful on the lives of actual humans.

The LessWrong article uses the word “ethics” only once, and its only other mention of related concepts is also very vague:

…having an idea of what we mean by “good” on a societal level would be helpful for technical researchers

Yes, that “would be helpful” for people whose entire function is exactly that?

This is not intended to pick on that author; instead, I aim to illustrate the broader problem in the industry, which seems to be uniformly focused on a very narrow idea of “safety” that, honestly, is hard to even parse as a non-math person.

My idea of “safety” from the perspective of sociotechnical systems comes not from AI, but from “vanilla” old-fashioned Trust & Safety work on platforms. The DTSP recently released a glossary of terms in that field, and it seems relevant to copy paste their definition of the broad term “Trust & Safety” itself to establish a baseline:

Trust & Safety

The field and practices employed by digital services to manage content- and conduct-related risks to users and others, mitigate online or other forms of technology-facilitated abuse, advocate for user rights, and protect brand safety. In practice, Trust & Safety work is typically composed of a variety of cross-disciplinary elements including defining policies, content moderation, rules enforcement and appeals, incident investigations, law enforcement responses, community management, and product support.

I don’t want to beat a dead horse, but it’s worth pointing out that nowhere does that definition mention “math” or “PhD.”

In fact, the elements it lists are all squishy human things. Broad STEM knowledge can be an asset in Trust & Safety work (for example, in querying and comprehending data sets, or working with machine learning tools that aid in moderating content), but it is in no way the primary thing.

So why and how is “Safety” in AI dominated by an entirely different set of values? Frankly, I don’t get it. To me, it probably comes down to a lack of experience working on platforms among the people involved in AI Safety, and a consequent lack of familiarity with the fact that, hey, Trust & Safety is already a pretty well-defined thing with deep roots that we could meaningfully draw from.

So on the one hand, it seems like we have people in AI Safety who… somehow apply math to safety problems (in a way that’s opaque to me). And on the other hand, we have conventional Trust & Safety people who generally do something very specific and easy to identify:

They read and answer emails (i.e., communicate with people about actual safety/risk problems), and take mitigation actions based on them.

Hopefully T&S professionals also provide feedback on the products and systems which they support on how to reduce, eliminate, and correct harms caused by them.

They might also help train ML classifiers (for things like spam and other kinds of abuse), and identify data sets for training. So, in fact, they often work daily with (quasi) AI systems, but as far as I’ve seen, they are almost never identified as “AI Safety” professionals.

Anyway, I don’t have any specific conclusion to draw here, apart from, I guess, the point that AI Safety would do well to immerse itself in the parallel, broader field of Trust & Safety, which is focused not on math but on sociology, the humanities, ethics, and communication. Ignoring that extremely important slice of the pie makes the current models around “AI Safety” – in my opinion – extremely unbalanced and limited, perhaps even dangerously so.
