New Google AI Model "Outperforms" ChatGPT in Most Tests

According to Demis Hassabis, CEO of Google DeepMind, Gemini is as good as the best human experts in the 50 different subject areas they tested the model on. (Credit: AFP News)

A new Google artificial intelligence (AI) model, referred to as Gemini, reportedly outperforms ChatGPT in most tests and displays "advanced reasoning" across multiple formats.

Described by Google as "its most capable model yet", Gemini is the first to be announced since last month's global AI safety summit, at which tech firms agreed to collaborate with governments on testing advanced systems before and after their release.

Google, an American multinational technology company, said it was in discussions with the UK's newly formed AI Safety Institute over testing Gemini's most powerful version, which will be released next year.

According to Google, Gemini has advanced "reasoning capabilities", which enable it to "think more carefully" when answering hard questions.

The model was tested on its problem-solving and knowledge in 57 subject areas including maths and humanities.

It outperformed "state-of-the-art" AI models including ChatGPT's most powerful model, GPT-4, on 30 out of 32 benchmark tests including in reasoning and image understanding.

Gemini's Pro model outperformed GPT-3.5, the technology that underpins the free-access version of ChatGPT, in six out of eight tests.

Google chief executive Sundar Pichai said Gemini represented a "new era" for AI.

"This new era of models represents one of the biggest science and engineering efforts we've undertaken as a company. I'm genuinely excited for what's ahead, and for the opportunities Gemini will unlock for people everywhere."

Gemini is expected to be released in January after the date was reportedly pushed back as Google "found the AI didn't reliably handle some non-English queries".

The tech firm adopted a cautious approach to the launch of its AI chatbot, Bard, earlier this year, describing it as "an experiment".

Bard made a mistake in its publicity demo, providing the wrong answer to a question about space.

Gemini differs slightly from its predecessor – it is what is known as a foundational model, meaning it will be integrated into Google's existing tools, including search and Bard.

Although Gemini appears to eclipse the capabilities of OpenAI's ChatGPT, a new, more powerful version of the OpenAI software is due to be released next year, with chief executive Sam Altman saying the firm's new products would make its current ones look like "a quaint relative".

It remains to be seen whether the recent turmoil at OpenAI, which saw Mr Altman fired and rehired in the space of a few days, will have any impact on that launch.

The firm also faces fresh competition from Elon Musk's xAI, which is seeking to raise around $1bn to invest in research and development.

Chinese firm Baidu is also racing ahead with its own AI products.

But as the technology rapidly evolves, so do fears about its potential to cause harm.

Governments around the world are trying to develop rules or even legislation to contain the possible future risks of AI.

In November, the subject was discussed at a summit hosted by UK Prime Minister Rishi Sunak at Bletchley Park, Buckinghamshire.

EU lawmakers halted talks on the bloc's landmark Artificial Intelligence (AI) Act today, agreeing to resume discussions on Friday after they failed to reach a deal during almost 24 hours of negotiations.

Lawmakers had agreed provisional terms for regulating AI systems like ChatGPT early on Thursday, sources said, taking a step closer to clinching rules governing the technology.

A document circulated among negotiators showed the European Commission would maintain a list of AI models deemed to pose a "systemic risk", while providers of general-purpose AIs would have to publish detailed summaries of the content used to train them.
