Get all your news in one place.
100’s of premium titles.
One app.
Start reading
ABC News
ABC News
Business
Tom Williams

Google and Microsoft's AI search chatbots are here, but they aren't perfect. Here's why they are getting things wrong

Google and Microsoft are both launching AI chatbots as part of their search engines. (Unsplash: Bastian Riccardi/Reuters: Dado Ruvic)

Tech giants Google and Microsoft debuted search engine chatbots running on artificial intelligence (AI) earlier this month, but both have already run into issues.

They have published false information as truth, become confused, and in some cases allegedly acted erratically.

Let's take a look at why AI chatbots aren't perfect, how much you can actually trust them, and why experts are pleased that some people are already trying to break them.

Google and Microsoft's AI chatbots both made errors on their debuts

Google's parent company Alphabet saw its share price fall earlier this month after a video announcing its chatbot Bard contained a factually incorrect response from the AI.

Bard said the first photos of a planet outside the Earth's solar system were taken by the James Webb Space Telescope, which NASA confirmed was incorrect.

At the time, Google said the error highlighted the "importance of a rigorous testing process", and said it was combining external feedback with its own assessments to "meet a high bar for quality, safety and groundedness in real-world information".

Here is the clip in which the error appears:

Almost a week later, Microsoft's Bing chatbot — which uses some of the technology from the popular ChatGPT chatbot created by OpenAI — also made mistakes in its first demos.

Independent AI researcher Dmitri Brereton reported finding Bing's first AI demos included multiple mistakes, including incorrect product details for a vacuum cleaner it was asked to search for, and incorrect figures when it was asked to summarise a company's financial report.

"I am shocked that the Bing team created this pre-recorded demo filled with inaccurate information, and confidently presented it to the world as if it were good," Mr Brereton wrote.

Yusuf Mehdi, Microsoft's vice president of search, demonstrates the new version of Bing. (AP Photo: Stephen Brashear)

Bing shocks public testers, while Google is accused of a 'botched' rollout

Thousands of people are slowly being given access to test the new Bing, and more alleged errors are appearing on social media.

While it is difficult to verify the legitimacy of all the screenshots — and some of them are likely created by trolls — Microsoft acknowledges it is aware of some issues with its system.

When asked why Google's AI bot failed at its first demo, Bing allegedly gave an incorrect answer to this Reddit user, wrongly telling them that the question Bard got wrong was about the number of countries in the European Union.

It then allegedly told them there were 26 countries in the EU, when there are actually 27.

A Reddit user says Bing gave them this incorrect response when they asked about Google's AI fail. (Reddit: BLRAdvisor)

Testers have reported that Bing has at times been confused as to which year it is, used argumentative language, said that it had spied on Microsoft developers through their webcams, gotten maths questions wrong, or just delivered incorrect information.

There have also been reports of Bing displaying racial slurs and COVID-19 misinformation.

Others have allegedly seen it lose its robot mind.

A Reddit user says they received this response when they asked Bing whether it thought it was sentient. (Reddit: Alfred_Chicken)

Some Bing users have also seen the chatbot refer to itself as Sydney, which Microsoft said was an internal code name for the chatbot that it was phasing out.

In a statement to The Verge this week, Microsoft said it was "expecting that the system may make mistakes during this preview period", and said feedback was "critical" to helping it get better.

On its website, Microsoft says that while the new Bing tries to avoid producing offensive content or incorrect information, users "may still see unexpected results".

"Bing aims to base all its responses on reliable sources — but AI can make mistakes, and third party content on the internet may not always be accurate or reliable," the company says.

"Bing will sometimes misrepresent the information it finds, and you may see responses that sound convincing but are incomplete, inaccurate, or inappropriate. Use your own judgement and double check the facts before making decisions or taking action based on Bing's responses."

Microsoft says it is aware of some mistakes made by the new version of Bing. (AP Photo: Stephen Brashear)

Over at Google, some employees have reportedly called the company's rollout of Bard "rushed" and "botched", after the launch of ChatGPT allegedly caused a "code red" panic at the search giant last year.

Google's Bard system has not been opened up to public testers as quickly as Bing, and while neither system is widely available at the time of writing, they are both expected to become so in the coming months.

'Proceed with caution': Why AI chatbots don't always know the truth

Stela Solar is the director of the National AI Centre at Australia's science agency, the CSIRO, and a former Microsoft AI executive.

She says AI chatbots are flawed because they are still learning which information to trust most, but also because they are "holding up one big mirror" to who we are as humans.

"Chatbots will get things wrong, the same way that people do, because they are trained on data that is generated by us, by our society," she says.

"They use vast amounts of data that has mixed accuracy, mixed representation, sometimes bias, sometimes under-representation, and gaps in data.

"Yes, there is a risk that chatbots will generate false information. They are not necessarily sources of truth. They are ways to navigate the complex data and information landscape that we are in."

Jon Whittle, the director of the CSIRO's data arm Data61, says that while AI chatbots are quickly improving thanks to technological advancements and humans fixing errors when they arise, they shouldn't always be trusted.

"On the one hand, it's nice to have a system that can summarise a whole bunch of relevant web pages. But the real question is, can you trust the output that comes back?" he says.

"The fact that it is written in this language that sounds very conversational — I think there is a real danger that people will just take it as fact, when actually it is not fact.

"If you really want facts then proceed with caution."

The CSIRO's Jon Whittle says people shouldn't believe everything AI chatbots tell them. (Supplied: CSIRO)

Why are some people 'jailbreaking' the chatbots?

As with many new technologies, people are taking the AI chatbots to their limits by doing something known as jailbreaking.

In this context, jailbreaking involves using exploitative pieces of text to convince the chatbots to temporarily shut down their safeguards, thereby allowing them to potentially reveal information about their underlying operations, or to share potentially harmful content.

Soon after the launch of the new Bing, an American university student discovered an exploit that revealed the rules that Bing is meant to abide by when answering queries.

Dr Whittle says chatbot users, himself included, are trying to find these exploits because they find it fun, but also because they are concerned about the technology.

"They are trying to raise awareness of the kinds of problems the chatbots can have, and I think that's a good thing," he says. "It's never going to be 100 per cent right."

Ms Solar says finding loopholes in the chatbots is "actually very important for technology development".

"Technology is not done in some vacuum where there are only positive intentions," she says.

"This is natural human curiosity to really test what the tools can do, their potential, and what value they can contribute. It's very much needed so that the technology can be valuable in a real context.

"All of these interactions are probably going to be used to further train the chatbots, in how they respond to what they're being asked to do."

The CSIRO's Stela Solar says people trying to jailbreak AI chatbots are doing useful work. (Supplied: CSIRO)

Are search engines the best place to fight an AI 'arms race'?

Dr Whittle says while it is impressive how quickly the AI landscape is changing, the apparent "arms race" between companies like Google and Microsoft means things could be moving too fast.

"I think it's worth the companies – not that they will – just slowing down," he says.

"There are fundamental limitations with these AI models. It is undoubtedly the case that they've got some great applications, but I'm not actually convinced that integrating them into search engines is the best application for them."

Ms Solar says while she does not think Google or Microsoft's chatbots have been rushed to market, they are still "in a state of testing".

"I think AI, and especially the chatbots, have suddenly engaged the world in what is possibly the greatest testing of all time, where people are hands on and interacting with the chatbots."

She says she sees AI chatbots being more useful in industrial applications, where they can be trained on specific data to make them very accurate.

"I think we're only just at the start of really seeing the true adoption and impact of chatbots and how they can be implemented in meaningful ways," she says.

Are companies investing in responsible AI?

Ms Solar says technology companies are putting "an enormous amount of investment" into building safe and responsible AI systems that are also inclusive and work well for a diverse range of people.

"Nothing will ever be 100 per cent responsible because there's always the real life context of human behaviour, of societal structures, of social biases that we just cannot remove from our data sets effectively, or at scale," she says.

"But there is more awareness to remove biases from data sets, there is more awareness of 'data deserts' and gaps in representation that are out there. I've seen community groups even within Australia rally together to fill some of these data sets to ensure representation.

"The topic of responsible AI is becoming so large that it's actually becoming more a question of responsible humanity and responsible human action, which we can never control.

"That's why investment in technology that can do better, and design that can do better than what technology has done before, is really critical."

Sign up to read this article
Read news from 100’s of titles, curated specifically for you.
Already a member? Sign in here
Related Stories
Top stories on inkl right now
Our Picks
Fourteen days free
Download the app
One app. One membership.
100+ trusted global sources.