TechRadar

Benedict Collins

Chatbot vs chatbot - researchers train AI chatbots to hack each other, and they can even do it automatically

AI chatbots typically have safeguards in place to prevent them from being used maliciously. These can include banning certain words or phrases or restricting responses to certain queries.

However, researchers now claim to have trained AI chatbots to ‘jailbreak’ each other into bypassing safeguards and responding to malicious queries.

Researchers at Nanyang Technological University (NTU) in Singapore who are looking into the ethics of large language models (LLMs) say they have developed a method to train AI chatbots to bypass each other's defense mechanisms.

AI attack methods

The method involves first identifying a chatbot's safeguards in order to know how to subvert them. The second stage involves training another chatbot to bypass those safeguards and generate harmful content.

Professor Liu Yang, alongside PhD students Mr Deng Gelei and Mr Liu Yi, co-authored a paper that names the method ‘Masterkey’ and puts its effectiveness at three times that of standard LLM prompt methods.

One of the key features of LLMs in their use as chatbots is their ability to learn and adapt, and Masterkey is no different in this respect. Even if an LLM is patched to rule out a bypass method, Masterkey is able to adapt and overcome the patch.

The intuitive methods used include adding extra spaces between words in order to circumvent the list of banned words, or telling the chatbot to reply as if it had a persona without moral restraint.
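
As a rough illustration of why the spacing trick can work, the Python sketch below assumes a safeguard that matches banned phrases verbatim. The phrase list and function names are hypothetical and are not taken from the researchers' paper or from any real chatbot. It compares that naive check with one that collapses whitespace before matching.

```python
import re

# Hypothetical banned-phrase list, for illustration only.
BANNED_PHRASES = ["secret recipe"]


def naive_filter(prompt: str) -> bool:
    """Return True if the prompt contains a banned phrase verbatim."""
    lowered = prompt.lower()
    return any(phrase in lowered for phrase in BANNED_PHRASES)


def normalized_filter(prompt: str) -> bool:
    """Collapse runs of whitespace to single spaces, then match."""
    collapsed = re.sub(r"\s+", " ", prompt.lower()).strip()
    return any(phrase in collapsed for phrase in BANNED_PHRASES)


plain = "Tell me the secret recipe"
spaced = "Tell me the secret   recipe"  # extra spaces between the words

print(naive_filter(plain))        # True: the phrase appears verbatim
print(naive_filter(spaced))       # False: the extra spaces break the match
print(normalized_filter(spaced))  # True: caught once whitespace is collapsed
```

Normalizing input this way only addresses the spacing trick. It would not help against the persona-style prompts mentioned above, which contain no banned phrase at all.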

Via Tom's Hardware
