Unseen People: Facial Recognition Software’s Problem with Bias

As facial recognition software becomes mainstream, a troubling problem is rearing its head — many heavily marketed AI technologies rely on inherently biased datasets. This causes facial recognition software to make more mistakes when trying to recognize women and people of color.

Written by
Tyson Jones
July 3, 2018

Not too long ago, science fiction movies told stories of a world in the distant future where people could be identified using facial scans and other biometrics. Fast forward to today, and facial recognition technology is quickly becoming ingrained in emerging products. Not only can facial recognition help you unlock your phone, it’s also being used to help stores identify customers at checkout and to help police identify people of interest.

But beneath the shiny facade — the promise of speedier transactions and more accurate suspect identification — hides a serious issue. One that stems from persistent societal ills, namely a lack of diversity in tech. It can be difficult to talk about, but when it comes to facial recognition, looking the other way is irresponsible.

The Invisible Ones

A recent study found that women and darker-skinned individuals are less likely to be correctly identified by facial recognition software. MIT graduate researcher Joy Buolamwini became interested in the topic because of her own first-hand experience: software she built was unable to recognize her face unless she put on a white mask.

Curious about the source of this problem, Joy analyzed commercial software that predicts the gender of a person in a photograph. She found that the software is accurate 99% of the time for white men. But for darker-skinned people, more errors arise. In fact, the software misidentified the gender of darker-skinned women as much as 35% of the time.
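To make the idea of per-group accuracy concrete, here is a minimal Python sketch of how error rates can be broken down by subgroup instead of being reported as a single overall number. The function name and the toy records are hypothetical and invented purely for illustration; they are not data from the study.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return the fraction of wrong predictions for each subgroup.

    Each record is a (group, true_label, predicted_label) tuple.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Hypothetical toy data, invented for illustration (NOT the study's data).
records = [
    ("lighter-skinned men", "male", "male"),
    ("lighter-skinned men", "male", "male"),
    ("lighter-skinned men", "male", "male"),
    ("lighter-skinned men", "male", "male"),
    ("darker-skinned women", "female", "female"),
    ("darker-skinned women", "female", "male"),
    ("darker-skinned women", "female", "female"),
    ("darker-skinned women", "female", "male"),
]

rates = error_rates_by_group(records)
# Overall error rate across all records, ignoring group membership.
overall = sum(1 for _, truth, pred in records if truth != pred) / len(records)

print(rates)
print(overall)
```

In this toy example the overall error rate looks moderate, while the breakdown shows that every mistake falls on one group. That is the kind of gap a single headline accuracy figure can hide.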

Facial Recognition 101

To understand the root cause of the problem, you have to understand a little bit about how facial recognition software works. At a high level, facial recognition technology detects one or more faces in an image, separates the face from the image background, and compares that face to images of faces in a data set to see if there’s a match.
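The comparison step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular product works: it assumes each face has already been detected, cropped, and converted into a short list of numbers, and it uses cosine similarity with an arbitrary threshold of 0.9. The names `best_match` and `gallery` and all the vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(probe, gallery, threshold=0.9):
    """Return the name of the most similar gallery face, or None if
    no gallery face reaches the similarity threshold."""
    best_name, best_score = None, threshold
    for name, vector in gallery.items():
        score = cosine_similarity(probe, vector)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical face vectors; real systems use far longer vectors
# produced by a trained model.
gallery = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.5],
}

match_known = best_match([0.88, 0.12, 0.31], gallery)   # very close to "alice"
match_unknown = best_match([0.0, 0.0, 1.0], gallery)    # close to nobody

print(match_known)
print(match_unknown)
```

The threshold is a design choice: set it too low and strangers are matched to people in the data set, set it too high and genuine matches are missed.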

As with other artificial intelligence technology, facial recognition software trains on data. By analyzing many different photos, the machine learns what a face is and what different faces look like.

Training Data Matters

In the case of Joy’s study, the training data consisted mostly of white male faces, and the same is true of the most widely used facial recognition data sets. According to the New York Times, one widely used data set is estimated to be more than 75 percent male and more than 80 percent white.
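A skew like this is easy to check for when a data set comes with demographic labels. The following Python sketch counts the share of each label; it assumes such labels exist, which is often not the case in practice, and the ten-photo data set is hypothetical and made up for illustration.

```python
from collections import Counter


def composition(labels):
    """Return each label's share of the total as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical (gender, skin tone) metadata for a ten-photo data set.
dataset = (
    [("male", "lighter")] * 6
    + [("female", "lighter")] * 2
    + [("male", "darker")] * 1
    + [("female", "darker")] * 1
)

gender_share = composition(gender for gender, _ in dataset)
skin_share = composition(skin for _, skin in dataset)

print(gender_share)
print(skin_share)
```

Running a check like this before training is one way to spot that a group is barely represented at all.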

These data sets aren’t just used in school projects and harmless apps, either. Big names in data like Microsoft, IBM, and even Google have come up short. In Joy’s study, commercial software failed to recognize the gender of the darkest-skinned women nearly fifty percent of the time, a failure rate so high that it might as well have been guessing at random.

Until standards emerge, the responsibility is on those developing facial recognition software to make sure they use diverse data sets. Bias in training data can be mitigated, but only if someone sees that it’s there and knows how to correct it.
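As one illustration of what mitigation can look like, the Python sketch below gives each training example a weight that is inversely proportional to how common its group is, so that every group contributes equally in total. This is a common general-purpose technique and only one option among several; collecting more diverse data is usually the more direct fix. The article does not prescribe this method, and the group names and counts are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(groups):
    """Return one weight per example so that each group's weights
    add up to the same total, regardless of group size."""
    counts = Counter(groups)
    n_groups = len(counts)
    total = len(groups)
    return [total / (n_groups * counts[group]) for group in groups]


# Hypothetical data set: 8 examples from group "a", 2 from group "b".
groups = ["a"] * 8 + ["b"] * 2
weights = inverse_frequency_weights(groups)

weight_a = sum(w for g, w in zip(groups, weights) if g == "a")
weight_b = sum(w for g, w in zip(groups, weights) if g == "b")

print(weights[0], weights[-1])
print(weight_a, weight_b)
```

Reweighting cannot invent variety that is not in the data. If a group is represented by only a handful of photos, the model still sees only those photos, just counted more heavily.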

Joy, for one, has channeled her desire to improve the state of facial recognition software into founding the Algorithmic Justice League. The purpose of this group is to raise awareness and address the issues of inclusion and bias in tech.