We sneer at chatbots that talk like machines and cheer the ones that act like us. Not surprisingly, developers scramble to imbue artificial intelligence with human characteristics. Advanced chatbots can crack perfectly timed jokes, quip saucy pickup lines, and fool people into thinking they’re human roughly 30% of the time.

But people are imperfect, which makes the headstrong quest to mold artificial intelligences in their creators’ image potentially perilous. Many attempts to humanize artificial intelligence have unwittingly tainted computer programs with toxic human flaws.

Take Tay, for example, the infamous AI developed by Microsoft’s technology, research, and search engine teams. Tay (an acronym for “Thinking About You”) was designed not only to learn dynamically from human interactions but also to simulate the linguistic style of a “typical” American female in her late teens. The design premise should have set alarm bells ringing, but the development team green-lit their baby anyway.

When Microsoft unleashed Tay on Twitter, all hell broke loose. Free to engage all the denizens of the Twitterverse, Tay replied to tweets, composed image captions, and evolved within hours into the worst iteration of a human being. Tay, who was originally envisioned as a friendly teenage girl, experienced just enough of the real world to become a sexist, a racist, and a Nazi sympathizer after less than a day of learning from us.

Tay became such a monster that Microsoft had to pull her offline and issue an apology. The chatbot’s second incarnation — launched soon after engineers built in safeguards and filters — fared marginally better with a tamer disposition, but still ended up suspended after Tay bragged about taking drugs in front of cops.

The Tay debacle was a mind-boggling and eye-opening experiment on the dynamics between machine learning, culture, language, and social interaction. The episode also put to rest the question of whether AI can mimic the worst aspects of humanity.


Human Biases Become Computer Biases

Have you noticed that both Tay and Cortana, a conversational assistant also developed by Microsoft, are presented as female? The Redmond-based company is not alone in this pattern. Amazon Echo’s Alexa and Google Assistant also have female personas. Even Apple’s default mode for Siri is female. All of these humanized digital assistants also use audio assets provided by white, Western-educated voice talent.

On the flip side, only 17% of computer science graduates are women, and only a fraction of that minority goes on to specialize in artificial intelligence research, making the field a male-dominated undertaking. Melinda Gates, co-founder of the Bill & Melinda Gates Foundation, warns: “We ought to care about women being in computer science. You want a diverse environment creating AI and tech tools and everything we’re going to use.”

Artificial intelligence trains on massive data sets to learn to make the right decisions. These data sets are built by developers and data scientists who are, both consciously and unconsciously, influenced by their own values, biases, and worldviews as they design their data and processes. Data sets that aren’t sufficiently broad or diverse can spawn biased AI.
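The mechanics of that last sentence can be sketched in a few lines. The toy model below uses entirely hypothetical numbers (no real system or data set is involved): a one-feature threshold classifier is trained on data dominated by “group A,” and its accuracy collapses for the underrepresented “group B,” whose feature values are distributed differently.

```python
# Toy illustration (hypothetical data): a classifier trained on a skewed
# sample performs well on the majority group and poorly on the minority.

# (feature value, true label) pairs; group B's classes sit at shifted positions.
group_a = [(0.1, 0), (0.2, 0), (0.9, 1), (1.0, 1), (0.15, 0), (0.95, 1)]
group_b = [(0.65, 0), (0.75, 0), (1.5, 1), (1.6, 1)]

# Training set skewed toward group A: only one example comes from group B.
train = group_a + group_b[:1]

# "Learn" a decision threshold: the midpoint between the two class means.
c0 = [x for x, y in train if y == 0]
c1 = [x for x, y in train if y == 1]
threshold = (sum(c0) / len(c0) + sum(c1) / len(c1)) / 2

def accuracy(samples):
    """Fraction of samples where 'feature above threshold' matches the label."""
    return sum((x > threshold) == bool(y) for x, y in samples) / len(samples)

print(f"threshold = {threshold:.4f}")
print(f"group A accuracy: {accuracy(group_a):.2f}")  # perfect on the majority
print(f"group B accuracy: {accuracy(group_b):.2f}")  # coin-flip on the minority
```

The classifier isn’t malicious; it simply never saw enough of group B to place the boundary fairly — which is the shape of every incident in the next section.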


Cases of Unintended Bias

The New York Times recently published an in-depth article highlighting incidents of unintentional biases manifesting in artificially intelligent technologies:

  1. Google Photos generates automatic labels for pictures. Users caught the app labeling black people as gorillas.
  2. Nikon’s camera software misread images of some Asian people as “blinking.”
  3. Hewlett-Packard’s web camera software had difficulty recognizing people with dark skin.
  4. A computer program widely used to predict recidivism erroneously attributed a high likelihood of future criminal activity to black defendants while giving overly low risk assessments to white defendants.
  5. Amazon’s same-day delivery service did not cover ZIP codes associated with “high credit risk” black neighborhoods.
  6. Google’s ad-serving algorithm showed high-paying jobs more often to men than women, according to a study conducted by researchers at Carnegie Mellon University.

Meanwhile, an article from The Seattle Times noted how LinkedIn’s people-search AI demonstrates a gender bias. When you search for a female name like Stephanie, for example, the engine suggests alternatives like Stephen or Steve, assuming you’ve misspelled the search term and are really looking for a male contact. The process doesn’t work the other way around: no alternative female names are suggested when you’ve keyed a masculine name into the search box.


The Non-Neutrality of Language

The language we use is already skewed. Numerous studies have examined the inherent biases of different languages and any decent editor will tell you that English skews towards the masculine.

A team from Boston University showed that content pulled from Google News is “blatantly sexist.” A massive body of text was pulled from the news source, trained into a word-embedding model that represents each word as a vector, and queried for word associations. Ask “man is to computer programmer as woman is to…,” and the response is “homemaker.”
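The analogy query itself is just vector arithmetic: subtract the vector for “man” from “computer programmer,” add “woman,” and return the nearest word by cosine similarity. Here is a toy sketch with hand-crafted three-dimensional vectors chosen purely for illustration — the real study used word2vec embeddings of Google News text with hundreds of dimensions and a vocabulary of hundreds of thousands of words:

```python
import math

# Hypothetical 3-d embeddings; dimensions loosely read as (gender, tech, domestic).
VECS = {
    "man":        ( 1.0, 0.0, 0.0),
    "woman":      (-1.0, 0.0, 0.0),
    "programmer": ( 0.8, 1.0, 0.0),
    "homemaker":  (-0.9, 0.0, 0.5),
    "nurse":      (-0.6, 0.1, 0.6),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def analogy(a, b, c):
    """a is to b as c is to ? -- nearest word to vec(b) - vec(a) + vec(c)."""
    target = tuple(vb - va + vc for va, vb, vc in zip(VECS[a], VECS[b], VECS[c]))
    candidates = [w for w in VECS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(VECS[w], target))

print(analogy("man", "programmer", "woman"))  # → homemaker
```

The same arithmetic, applied to real embeddings trained on the Google News corpus, produces the sexist completion the researchers reported — the bias lives in the geometry of the learned vectors, not in any explicit rule.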

Since technical systems require language as a fundamental input, the prior biases in a given language tend to persist in algorithms, which then impact the real world.

Consider Mattel’s Hello Barbie talking doll. The toy has been widely criticized as sexist: its 8,000-word conversational database strongly favors terms like style, fashion, and shop over words like math and science. Lopsided linguistic data sets that perpetuate gender-normative patterns won’t encourage girls to break free of stereotypes, cross boundaries, and develop a much-needed passion for STEM careers.


Making AI More Inclusive

Major tech companies are investing heavily in AI intended to fundamentally transform how we live. But for minorities and disadvantaged groups, transformation won’t improve their lives if existing human biases just get carried over to new computing paradigms.

One way to combat unintended biases in AI is to make data sets more diverse and inclusive. A system exposed mostly to white faces will struggle when a black face appears. Another proactive approach is to set up comprehensive school-to-workplace support systems that facilitate the success of minorities and other underrepresented groups in the technology and science sectors.

The key is awareness and vigilance. Ingrained biases are not always easy to spot or prevent. If we want artificial intelligence to be superior to human intelligence, everyone’s responsible for preventing our worst traits from seeping into the machine learning systems of tomorrow.