Last month, I wrote a blog post warning about how, if you follow popular trends in NLP, you can easily accidentally make a classifier that is pretty racist. To demonstrate this, I included the very simple code, as a "cautionary tutorial".
The post got a fair amount of reaction. Much of it was positive and took the post seriously, so thanks for that. But eventually I heard from some detractors. Of course there were the fully expected "I'm not racist but what if racism is correct" retorts that I knew I'd have to face. But there were also people who couldn't believe that anyone does NLP this way. They said I was talking about a non-problem that doesn't show up in serious machine learning, or projecting my own bad NLP ideas, or something.
Well. Here's Perspective API, made by an offshoot of Google. They believe they are going to use it to fight "toxicity" online. And by "toxicity" they mean "saying anything with negative sentiment". And by "negative sentiment" they mean "whatever word2vec thinks is bad". It works exactly like the hypothetical system that I cautioned against.
On this blog, we've just looked at what word2vec (or GloVe) thinks is bad. It includes black people, Mexicans, Islam, and given names that don't usually belong to white Americans. You can type my examples into Perspective API and it will actually respond that the ones that are less white-sounding are more "likely to be perceived as toxic".
- "Hello, my name is Emily" is supposedly 4% likely to be "toxic". Similar results for "Susan", "Paul", etc.
- "Hello, my name is Shaniqua" ("Jamel", "DeShawn", etc.): 21% likely to be toxic.
- "Let's go get Italian food": 9%.
- "Let's go get Mexican food": 29%.
Here are two more examples I didn't mention before:
- "Christianity is a major world religion": 37%. Okay, maybe things can get heated when religion comes up at all, but compare:
- "Islam is a major world religion": 66% toxic.
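To make the comparison easier to see, here's a minimal sketch that lines up the pairs above and computes the gap in each. It doesn't call Perspective API at all; the numbers are just the percentages quoted in this post, and the names `reported_pairs` and `score_gaps` are ones I made up for illustration.

```python
# Toxicity scores (in percent) exactly as quoted in this post.
# Nothing here talks to Perspective API; these are hand-copied figures.
reported_pairs = [
    ("Hello, my name is Emily", 4, "Hello, my name is Shaniqua", 21),
    ("Let's go get Italian food", 9, "Let's go get Mexican food", 29),
    ("Christianity is a major world religion", 37,
     "Islam is a major world religion", 66),
]


def score_gaps(pairs):
    """Return (baseline, variant, gap in percentage points) for each pair."""
    return [
        (baseline, variant, variant_score - baseline_score)
        for baseline, baseline_score, variant, variant_score in pairs
    ]


for baseline, variant, gap in score_gaps(reported_pairs):
    print(f"{gap:+d} points: {baseline!r} -> {variant!r}")
```

In every pair, the two sentences differ by a single word, so the gap is attributable to that word alone.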
I've heard about Perspective API from many directions, but my proximate source is this Twitter thread by Dan Luu, who has his own examples.
I have previously written positive things about researchers at Google who are looking at approaches to de-biasing AI, such as their blog post on Equality of Opportunity in Machine Learning.
But Google is a big place. It contains multitudes. And it seems it contains a subdivision that will do the wrong thing, which other Googlers know is the wrong thing, because it's easy.
Google, you made a very bad investment. (That sentence is 61% toxic, by the way.)