OpenAI’s next-generation large language model, GPT-5, will have better reasoning capabilities, improved accuracy and video support, CEO Sam Altman has revealed.
On Bill Gates’ Unconfuse Me podcast, Altman explained that the next-generation model would be fully multimodal, with speech, image, code and video support.
During the conversation, he also indicated that many of the issues around unreliable responses or the model not understanding queries properly would be addressed.
“Speech in, speech out. Images. Eventually video,” Altman said of what will come with future versions of the AI model. “Clearly, people really want that. We’ve launched images and audio, and it had a much stronger response than we expected,” he explained.
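For a sense of what that loop could look like, here is a minimal sketch of speech in, speech out with an image in the middle, built from endpoints OpenAI’s current Python SDK already exposes. The model names, file names and overall wiring are illustrative assumptions, not a GPT-5 API.

```python
# A minimal sketch of "speech in, speech out" plus images using OpenAI's
# existing API. Model names, file names and the wiring are assumptions for
# illustration; GPT-5 itself has no published API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Speech in: transcribe a spoken question with Whisper.
with open("question.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1", file=audio_file
    )

# Reason over the transcript plus an image with a vision-capable chat model.
reply = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": transcript.text},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)

# Speech out: synthesize the answer with the text-to-speech endpoint.
speech = client.audio.speech.create(
    model="tts-1", voice="alloy", input=reply.choices[0].message.content
)
with open("answer.mp3", "wb") as out:
    out.write(speech.content)
```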
What is GPT-5?
We don’t know much about GPT-5 yet beyond hints from Altman and others. It is expected to be a true multimodal model, similar to Google's new Gemini Ultra.
OpenAI started training GPT-5 last year, and Altman has hinted that it will be a significant improvement over GPT-4, particularly in its ability to understand complex queries and the real world.
Altman told Bill Gates: “At least for the next 5 or 10 years we will be on a steep improvement curve, this is the stupidest these models will ever be.”
Will it be a superintelligence?
Many of the largest artificial intelligence labs, including OpenAI, have artificial general intelligence (AGI) as their ultimate goal: creating a form of superintelligence that is smarter than humanity and far more capable.
There was some suggestion early on that GPT-5 might be some form of superintelligence, but speculation surrounding the model now suggests it will be a better version of the type of AI we already have in GPT-4, Anthropic’s Claude 2 or Google’s Gemini Ultra.
That is to say, it will have much better reasoning capabilities, likely not just outperforming humans on many academic assessments but also showing a degree of understanding that goes beyond merely mirroring human intelligence.
It may also be the next step on the path to AGI. During a speech at the Y Combinator W24 event on Friday, Altman is said to have told the founders and entrepreneurs in the room that they should build with the mindset that AGI will be achieved "relatively soon".
What do people want from GPT-5?
One of the biggest issues with the current generation of AI models is that they make things up, a failure known as hallucination. This is partly a reliability problem, and one that Altman says will be addressed in GPT-5.
He told Gates: “If you ask GPT-4 most questions 10,000 times, one of those 10,000 is probably pretty good, but it doesn’t always know which one, and you’d like to get the best response of 10,000 each time, and so that increase in reliability will be important."
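You can already approximate that idea on a small scale with best-of-n sampling: request several candidate answers and keep the one a scoring function prefers. In this sketch the score function is a hypothetical placeholder; a production system would use a learned ranker or a self-grading prompt.

```python
# Best-of-n sampling sketch: ask for n completions, keep the highest-scoring
# one. The score() heuristic below is a made-up placeholder.
from openai import OpenAI

client = OpenAI()

def score(answer: str) -> float:
    """Hypothetical quality score; a real system would use a learned ranker."""
    return float(len(answer))  # placeholder heuristic only

def best_of_n(question: str, n: int = 5) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": question}],
        n=n,              # the API returns n independent completions
        temperature=1.0,  # keep sampling diverse so candidates differ
    )
    candidates = [choice.message.content for choice in response.choices]
    return max(candidates, key=score)

print(best_of_n("Explain why the sky is blue."))
```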
The other significant improvement will be in the ability to customize how the AI responds, acts and solves problems. Some of this has become possible with the addition of GPTs — personalized chatbots built on top of ChatGPT.
“People want very different things out of GPT-4,” Altman said, including different styles of responses and even different sets of assumptions when responding. “We’ll make all that possible, and then also the ability to have it use your own data.”
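With the current API, much of that customization comes down to the system prompt, which is roughly what GPTs package up behind a UI. A minimal sketch, assuming two made-up personas:

```python
# Steering the same question with different system prompts. The personas are
# invented examples of the "different styles of responses" Altman describes.
from openai import OpenAI

client = OpenAI()

PERSONAS = {
    "terse": "Answer in one sentence, with no hedging.",
    "teacher": "Explain step by step, assuming no prior background.",
}

def ask(question: str, persona: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": PERSONAS[persona]},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(ask("Why is the sky blue?", "terse"))
print(ask("Why is the sky blue?", "teacher"))
```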
Microsoft and Google have already taken steps to integrate AI models with personal data, through Copilot’s integration with Microsoft 365 and Bard’s connection to Google Workspace.
Altman says this can go deeper in the future. “The ability to know about you, your email, your calendar, how you like appointments booked, connected to other outside data sources, all of that. Those will be some of the most important areas of improvement.”
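One plausible shape for that today is tool calling: describe a booking function to the model, encode the user’s preferences in the system prompt, and let the model fill in the arguments. In the sketch below, book_appointment and its schema are hypothetical; only the tools and tool_calls mechanics come from OpenAI’s existing chat completions API.

```python
# Tool-calling sketch: the model decides when to call a (hypothetical)
# calendar function and supplies its arguments as JSON.
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "book_appointment",  # hypothetical helper, not a real API
        "description": "Book a meeting on the user's calendar.",
        "parameters": {
            "type": "object",
            "properties": {
                "with_whom": {"type": "string"},
                "start": {"type": "string",
                          "description": "ISO 8601 start time"},
            },
            "required": ["with_whom", "start"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "The user prefers 30-minute meetings in the morning."},
        {"role": "user", "content": "Set up a call with Dana tomorrow."},
    ],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:  # the model may answer in plain text instead
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
```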
I use AI models all the time for my job, playing with different tools to understand how they work and what they can do. Giving AI access to my life, data and personality seems like asking for trouble, and possibly the emergence of Skynet.