ABC News

National

By Iris Zhao and Sally Brooks

International students and researchers concerned tools to detect AI-generated text may be inaccurate

International students are concerned their original writing is being flagged as AI-generated text. (Unsplash: Andrew Neel)

When Jia Li ran a draft university essay through a computer program used to detect content generated by artificial intelligence, it concluded that just over half was likely machine written.

The program flagged sentences the international student wrote in Chinese and then translated to English using a computer, but also others she wrote in English herself.

"It is my own work but [the program] says it's AI generated," she told the ABC.

Ms Li used the detector because her university has started employing similar tools to flag students who might be cheating by using text-generating AI programs.

"I know other students who were found to have committed misconduct," said Ms Li, who spoke to the ABC on the condition she was able to use a pseudonym.

She is one of a number of international students in Australia who have posted on Chinese social media about concerns that tools to detect AI-generated text are unreliable and might lead to them being falsely accused of cheating.

The rapid advent of generative AI programs such as ChatGPT, which are able to produce material such as university essays that some students have been able to pass off as their own work, has universities scrambling to respond.

Some have started using the tools that detect AI-generated text as one way to pick up assignments possibly written by machines.

But AI experts say the technology can be inaccurate, and some argue the detectors should not yet be used to monitor student assessments.

US researchers call for caution

Powerful text-generating AI tools like ChatGPT are posing challenges for universities. (Reuters: Dado Ruvic, Illustration)

One of the detectors at the centre of the debate is Turnitin's new AI-writing detection tool, launched in April.

As universities started using it for the first time, a study from Stanford University in California urged caution, after it found programs to detect text generated by AI could be biased against "non-native English writers".

The study did not include Turnitin's writing detector.

The researchers put 91 essays written in English by Chinese students, and 88 English essays by US students, through seven different publicly available detectors.

The tools flagged 61 per cent of the Chinese students' essays as AI generated, but showed "near-perfect accuracy" for the US students' essays, meaning their work was not flagged.

Report co-author James Zou, an assistant professor of biomedical data science at Stanford University, said he would not trust AI detectors right now because the research showed they could be easily fooled and made too many mistakes.

Professor Zou said many of the current AI detection algorithms had an over-reliance on a "perplexity" metric, a measure of complicated words being used in the text.

"If there are a lot of complicated words, then they'll have high perplexity," he said.

Non-native speakers' writing was often misclassified as AI generated, Professor Zou said, because they did not use as many "fancy" words.

Many non-native speakers also use translation and grammar tools, and Professor Zou said the algorithms used by those programs ultimately decreased the perplexity of writing, so detectors would more often flag those lines of text as being AI generated as well.
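As a rough illustration of the perplexity idea Professor Zou describes, the Python sketch below scores text against a toy word-frequency model. It is not the method used by Turnitin, ZeroGPT or any detector in the Stanford study; real detectors rely on large language models. The function name `unigram_perplexity`, the tiny reference text and the two sample sentences are all hypothetical. In this toy model, perplexity is higher when a text uses words the model has rarely or never seen.

```python
import math
from collections import Counter


def unigram_perplexity(text, reference_counts, vocab_size):
    """Perplexity of `text` under an add-one-smoothed unigram (word-frequency) model.

    This is a toy stand-in for the language models real detectors use.
    """
    words = text.lower().split()
    if not words:
        raise ValueError("text must contain at least one word")
    total = sum(reference_counts.values())
    log_prob = 0.0
    for word in words:
        # Add-one smoothing gives unseen words a small, non-zero probability.
        p = (reference_counts.get(word, 0) + 1) / (total + vocab_size)
        log_prob += math.log(p)
    # Perplexity is the exponential of the average negative log-probability.
    return math.exp(-log_prob / len(words))


# Hypothetical reference text standing in for what the model has "seen".
reference = "the cat sat on the mat and the dog sat on the rug".split()
counts = Counter(reference)
vocab_size = len(counts) + 1  # one extra slot for any unseen word

plain = "the cat sat on the mat"                # common, predictable words
fancy = "the feline reposed upon the tapestry"  # rarer, "fancier" words

plain_ppl = unigram_perplexity(plain, counts, vocab_size)
fancy_ppl = unigram_perplexity(fancy, counts, vocab_size)
print(f"plain: {plain_ppl:.2f}  fancy: {fancy_ppl:.2f}")
```

Under these assumptions the plainer sentence scores a lower perplexity than the one with rarer words, which is the pattern that, according to Professor Zou, makes detectors more likely to flag plain writing as machine generated.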

"Our results call for a broader conversation about the ethical implications of deploying ChatGPT content detectors and caution against their use in evaluative or educational settings," the study concluded.

The rapidly expanding suite of AI tools is causing confusion for international students about what they can and cannot use in assignments. (AP: Michael Dwyer)

Ms Li, a student at the University of New South Wales (UNSW) in Sydney, ran her essay through ZeroGPT, one of the tools included in the Stanford study.

A ZeroGPT spokesperson said its detector was accurate and not biased against "non-native English writers", and that the company was "always looking for ways to improve" its service.

UNSW is using Turnitin's new AI-writing detection tool.

A UNSW spokesperson said it assisted teachers "in identifying any unauthorised use of AI in student submissions".

"The initial detection is not definitive proof of cheating and does not automatically result in academic misconduct. It is a flag that triggers a further investigation," the spokesperson said.

After Ms Li put her essay in ZeroGPT, she spent hours rewriting all the sentences highlighted by the detector to lower the risk of it being flagged as machine generated.

And she still got a "very low score" for her assignment.

"The instructor told me my language was very hard to understand," Ms Li said.

"Anyway, there was little I could do. I needed to lower the AI rate."

Ms Li said she had been given permission by UNSW to occasionally use a translation program on the condition she cited which content was translated.

New tool has 'huge impact' on study

Some international students are buying AI tools to check their work before submitting it for assessment. (Unsplash: Cristin Hume)

Sophie, a Chinese international student at the University of Melbourne, told the ABC one of her recent assignments had been flagged by Turnitin as 30 per cent likely to have been machine written.

"I think the AI checking function [of Turnitin] is not well developed at the moment," said Sophie, who spoke on the condition that only her first name was used.

She said she did not use any grammar, translation or text-generating AI tools to produce the essay, and the university should wait until the tool was more accurate before using it to flag possible misconduct.

"Many of my friends have had to buy Turnitin to pre-check [their assignments]," she said.

"The [use of] AI detectors has had a huge impact on our study."

A University of Melbourne spokesperson said Turnitin's new tool was only a prompt for further investigation, and that all work submitted by a student "must be their own".

The university's website says it is being used "so that we can thoroughly test it and actively provide input to Turnitin on its design".

"This may mean that the tool incorrectly identifies some assessments as having been produced by AI when they have not," the website says.

"Should you be asked to discuss or explain components of your assessment task, understand that this, alone, is not an accusation of academic misconduct."

Turnitin's Asia-Pacific regional vice-president James Thorley said the company was working to keep the number of false positive results as low as possible.

"Our goal certainly in the first stage of releasing the tool was to be able to detect ChatGPT generated text at scale," Mr Thorley said.

"This is an incredibly new area. We're learning and we'll be adapting and changing based on what we see."

In a statement issued just last week, Turnitin's chief product officer Annie Chechitelli said the company had now made several changes to the AI writing detector tool.

Ms Chechitelli said additional "real world" testing in the seven weeks since the tool came on the market had shown false positive rates were higher when the tool flagged less than 20 per cent of a writing sample as AI writing.

"This is inconsistent behaviour, and we will continue to test to understand the root cause," she said.

"In order to reduce the likelihood of misinterpretation, we have updated the AI indicator button in the Similarity Report to contain an asterisk for percentages less than 20 per cent to call attention to the fact that the score is less reliable."

Students treated as 'criminals'

Many tools are now available to detect AI-generated text from powerful chatbots like ChatGPT. (ABC News)

Deakin University in Victoria has decided against using Turnitin's AI writing detector.

Associate Professor Trish McCluskey, the university's director of digital learning, said while the university used Turnitin's text matching tool, which helps pick up plagiarism, it was wary of claims the AI writing detection tool was highly accurate.

"Until such a time as the university can test the efficacy and data management process of Turnitin's new product, Deakin has chosen not to apply the tool in the marking of student assessments," she said.

"This is to protect student data and is in line with the approach adopted by a growing list of global education providers, and we expect many Australian universities will follow our lead."

Ms McCluskey said she understood navigating the use of AI was "a minefield" for international students.

"What we have to do is change the culture to try and support academics and support the university community to embrace this technology," she said.

Some experts believe AI detectors assume students are cheating. (Flickr CC: Jirka Matousek)

UNSW AI expert Toby Walsh is also concerned that AI detection tools are inaccurate.

Professor Walsh said AI tools, including those that checked translations and grammar, could be useful learning aids for students whose primary language was not English.

"The [AI technology] could both improve the quality of the text and help [students] communicate the ideas, but also they can come up with the ideas," he said.

"The problem is how we separate those two parts."

Stefan Popenici, author of Artificial Intelligence and Learning Futures, said one of the most problematic issues with AI detectors was that universities "treat students from the beginning as potential criminals".

Dr Popenici, who also works at Charles Darwin University, said universities should be cautious about using these tools to tackle challenges posed by artificial intelligence text generation.

"We try to find a silver bullet for a problem that is very complex because we like simple solutions," he said.

"We complain about our students taking a shortcut. And then we are using a shortcut.

"I don't think that's fair."
