Since its launch in November last year, ChatGPT has become an extraordinary hit. Essentially a souped-up chatbot, the AI program can churn out answers to the biggest and smallest questions in life, and draw up college essays, fictional stories, haikus, and even job application letters. It does this by drawing on what it has gleaned from a staggering amount of text on the internet, with careful guidance from human experts. Ask ChatGPT a question, as millions have in recent weeks, and it will do its best to respond – unless it knows it cannot. The answers are confident and fluently written, even if they are sometimes spectacularly wrong.
The program is the latest to emerge from OpenAI, a research laboratory in California, and is based on an earlier AI from the outfit, called GPT-3. Known in the field as a large language model or LLM, the AI is fed hundreds of billions of words in the form of books, conversations and web articles, from which it builds a model, based on statistical probability, of the words and sentences that tend to follow whatever text came before. It is a bit like predictive text on a mobile phone, but scaled up massively, allowing it to produce entire responses instead of single words.
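To make the "scaled-up predictive text" idea concrete, here is a toy sketch in Python. It simply counts which word most often follows another in a tiny made-up corpus; real models such as GPT-3 use neural networks trained on hundreds of billions of words, so everything below, from the corpus to the function name, is illustrative only.

```python
# A toy sketch of the "predictive text" idea behind a language model.
# Real LLMs use neural networks over billions of words; this bigram counter
# only illustrates predicting the next word from the word that came before.
from collections import Counter, defaultdict

corpus = ("i am going to be late . i am going to be in the pub . "
          "i am going to be late").split()

# Count how often each word follows each previous word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent next word seen after the given word."""
    counts = follows[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("be"))  # prints 'late', the most common continuation in this tiny corpus
```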
The significant step forward with ChatGPT lies in the extra training it received. The initial language model was fine-tuned by feeding it a vast number of questions and answers written by human AI trainers, which were incorporated into its training data. Next, the program was asked to produce several different responses to a wide variety of questions, which human experts then ranked from best to worst. This human-guided fine-tuning means ChatGPT is often highly impressive at working out what a question is really asking, gathering the relevant information, and framing a response in a natural manner.
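The ranking step can be pictured with a small, hypothetical sketch: a simple "reward model" is nudged so that an answer the human trainers preferred scores higher than one they did not. The features, numbers and training loop below are invented for illustration and are not OpenAI's actual method.

```python
# A minimal sketch of training on human rankings, not OpenAI's code.
# Human trainers rank candidate answers; a "reward model" is then adjusted so
# that higher-ranked answers receive higher scores.
import numpy as np

# Two candidate answers to the same question, as made-up feature vectors
# (e.g. relevance, clarity). The human trainer ranked answer_a above answer_b.
answer_a = np.array([0.9, 0.7])   # preferred
answer_b = np.array([0.4, 0.2])   # less preferred

w = np.zeros(2)                   # reward model: score = w . features
learning_rate = 0.1

for _ in range(100):
    # Pairwise (Bradley-Terry style) objective: push score(a) above score(b).
    margin = w @ answer_a - w @ answer_b
    p_prefer_a = 1.0 / (1.0 + np.exp(-margin))
    gradient = (p_prefer_a - 1.0) * (answer_a - answer_b)  # gradient of -log p
    w -= learning_rate * gradient

print(w @ answer_a > w @ answer_b)  # True: the preferred answer now scores higher
```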
The result, according to Elon Musk, is “scary good”, as many early users – including college students who see it as a saviour for late assignments – will attest. It is also harder to corrupt than its predecessors: ChatGPT has been designed to refuse inappropriate questions and to avoid making things up by churning out responses on topics it has not been trained on. For example, it knows little about the world after 2021, because its training data has not been updated since then. It has other, more fundamental limitations, too. ChatGPT has no handle on the truth, so even when its answers are fluent and plausible, there is no guarantee they are correct.
Prof Michael Wooldridge, director of foundational AI research at the Alan Turing Institute in London, says: “If I write a text message to my wife that starts: ‘I’m going to be ...’ it might suggest the next words ‘in the pub’ or ‘late’, because it’s looked at all the messages I’ve sent to my wife and learned that these are the most likely ways I’ll complete that sentence. ChatGPT does exactly the same thing on a massively large scale.
“These are the first systems that I can genuinely get excited about. It would take 1,000 human lifetimes to read the amount of text the system was trained on and hidden away in all of that text is an awful lot of knowledge about the world.”
As OpenAI notes: “ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers” and “will sometimes respond to harmful instructions or exhibit biased behaviour.” It can also give long-winded replies, a problem its developers put down to trainers “preferring long answers that look more comprehensive”.
“One of the biggest problems with ChatGPT is that it comes back, very confidently, with falsities,” says Wooldridge. “It doesn’t know what’s true or false. It doesn’t know about the world. You should absolutely not trust it. You need to check what it says.
“We are nowhere near the Hollywood dream of AI. It cannot tie a pair of shoelaces or ride a bicycle. If you ask it for a recipe for an omelette, it’ll probably do a good job, but that doesn’t mean it knows what an omelette is.” It is very much a work in progress, but a transformative one nonetheless.