ChatGPT is a bad knowledge base, confirms new study


There’s been (probably a little too much) chatter on the internet about how OpenAI’s ChatGPT, and similar artificially intelligent (AI) chatbots, are going to change the way we approach work.

There’s also some doom associated with this: are AI chatbots going to make a mockery of academia? Do away with experts? Will they somehow foreshadow I, Robot or Skynet becoming real?

Now, experts at Purdue University, based in West Lafayette in the US, have finally, definitively answered this question in a thirteen-page paper (PDF), arriving at the hitherto unthought-of conclusion that, no, AI chatbots do not know everything.

AI chatbots and factual disinformation

The paper takes software engineering queries as the base for its findings, comparing the veracity of ChatGPT’s answers with those of actual, real users of popular programming question-and-answer portal (essentially a dignified Yahoo! Answers) Stack Overflow.

The gratingly omnipresent chatbot was fed 517 questions on the topic found on the site, and the results are incontrovertible.

52% of ChatGPT’s responses were incorrect, and, when we asked Stack Overflow to do the maths on this for us, they came back saying that 48% of the chatbot’s responses were correct.

Analysis - certainly not infallible

On this basis, we have to commit ourselves to throwing AI in the Caspian. We must respect the result. It started with Stanley Kubrick over 40 years ago and it ends here. A fabulous campaign by all involved.

We can joke, but the results are clear: AI as a knowledge source doesn’t quite work, and the implications are obvious, and dangerous.

Even as per this study, a bizarre number of people neither notice nor care about the potential for misinformation. In a sort of Pepsi/Coke blind taste test, 12 participants with different levels of programming knowledge failed to identify an AI-generated answer 39.34% of the time, while preferring what turned out to be a Stack Overflow response.

ChatGPT is often treated as infallible, even though it absolutely isn’t, because of the way answers are presented. The study found that even correct answers only addressed all aspects of the question 65% of the time, and users often accepted incorrect information as truth because of “comprehensive, well-articulated, and humanoid” sounding responses.

Via ZDNet
