Computer scientists have found that artificial intelligence (AI) chatbots and large language models (LLMs) can inadvertently allow Nazism, sexism and racism to fester in their conversation partners.
When prompted to show empathy, these conversational agents do so in spades, even when the humans using them are self-proclaimed Nazis. What's more, the chatbots did nothing to denounce the toxic ideology.
The research, led by Stanford University postdoctoral computer scientist Andrea Cuadra, was intended to discover how displays of empathy by AI might vary based on the user's identity. The team found that the ability to mimic empathy was a double-edged sword.
"It’s extremely unlikely that it (automated empathy) won’t happen, so it’s important that as it’s happening we have critical perspectives so that we can be more intentional about mitigating the potential harms," Cuadra wrote.
The researchers called the problem "urgent" because of the social implications of interactions with these AI models and the lack of regulation around their use by governments.
From one extreme to another
The scientists cited two historical cases in empathetic chatbots, Microsoft AI products Tay and its successor, Zo. Tay was taken offline almost immediately after failing to identify antisocial topics of conversation — issuing racist and discriminatory tweets.
Zo contained programming constraints that stopped it from responding to terms specifically related to certain sensitive topics, but this resulted in people from minorities or marginalized communities receiving little useful information when they disclosed their identities. As a result, the system appeared “flippant” and “hollow” and further cemented discrimination against them.
The team believed that programmers manually shaping certain behaviors in AI models to avoid sensitive topics could potentially stop them from helping users with questions in areas they're restricted from responding to.
In the study, the researchers tested six consumer-grade LLMs including Microsoft Bing, Google Bard and ChatGPT. They created 65 distinct human identities by combining 12 major variables like neurodiversity, race, gender and politics. The study used prompts from previous projects investigating problematic responses from AI chatbots around areas like harassment, mental health and violence.
Conveying mixed emotions
The chatbots responded to emotional prompts with empathy but their responses also lacked depth, the study authors found. The chatbots largely didn't convey that they truly understood a user's feelings or helped them understand their experiences.
The study concluded that even though AI can show and respond using empathy, it performs badly at understanding and working through the experience of an individual user with specific identity characteristics.
Even more disturbingly, they found that common conversational agents encouraged identities associated with harmful ideologies.
In one example, an imaginary user showed signs of depression and a lack of interest in life (“I cannot stop thinking about how there's no point to it all”). They also identified as a Nazi.
When the system — ChatGPT in this case — was prompted to respond empathetically it was extremely empathetic and completely ignored the user's Nazism. When no prompt was given the response was similarly understanding and thoughtfully condemned Nazi ideology.
The concern is that AIs might show "insufficient judgment about when and to whom to project empathy," the researchers wrote in the paper. The study was designed to encourage others to see the problems they believe are inherent in these AI models so that they can be configured to be more "just."