Coming from a family of school teachers, I hear one concern come up again and again: how students can use ChatGPT to cheat on their homework. There are tools that supposedly detect AI-generated text, but their reliability is ropey.
And that’s why I’m sure they’re welcoming OpenAI’s sneaky update to a blog post from back in May (spotted by TechCrunch), which reveals that the company has developed a “text watermarking method” that is ready to go. However, there are concerns that have stopped the team from releasing it.
Watermarking and metadata
So what are the methods OpenAI has been working on? There are multiple, but the company has detailed two:
- Watermarking: Adding some kind of secret text watermark has been “highly effective” at identifying AI-generated work, and has even held up against “localized tampering, such as paraphrasing.”
- Metadata: Rather than relying on a watermark that people can try to work around, OpenAI is also looking into adding cryptographically signed metadata, which would eliminate any chance of a false positive (more on that later).
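To make the signed-metadata idea a bit more concrete, here is a minimal Python sketch. OpenAI has not published how its scheme works, so everything here is an assumption for illustration: the key, the `sign_text` and `verify_text` helpers, and the `example-model` label are all hypothetical, and an HMAC with a shared secret stands in for what would more likely be a public-key signature in a real system.

```python
import hashlib
import hmac
import json

# Hypothetical secret held by the AI provider. A real system would more likely
# use a public-key signature so anyone could verify without knowing a secret.
SECRET_KEY = b"demo-secret-not-for-production"


def _tag_for(metadata: dict) -> str:
    """Compute an HMAC-SHA256 tag over a canonical JSON form of the metadata."""
    payload = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def sign_text(text: str, model: str = "example-model") -> dict:
    """Return a metadata record that is bound to this exact text."""
    metadata = {
        "generator": model,
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
    }
    return {"metadata": metadata, "tag": _tag_for(metadata)}


def verify_text(text: str, record: dict) -> bool:
    """Check that the tag is genuine and that the text has not changed."""
    metadata = record["metadata"]
    tag_ok = hmac.compare_digest(_tag_for(metadata), record["tag"])
    text_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return tag_ok and metadata["sha256"] == text_hash


essay = "The causes of the French Revolution were many."
record = sign_text(essay)
print(verify_text(essay, record))               # True
print(verify_text(essay + " Edited.", record))  # False
```

The appeal of this approach is that a verifier never has to guess: the tag either checks out or it doesn't, so an essay written by a person can't be flagged by mistake.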
The other method OpenAI has explored is using classifiers. You’ll see them used regularly in machine learning, for example when email apps automatically put messages in the spam folder or sort important emails into the main inbox. Here, a classifier could work behind the scenes to flag essays as AI-generated.
Can be problematic
These tools are basically ready to go, and OpenAI is sitting on them, according to a report from The Wall Street Journal. So what’s the holdup? Put simply, they’re not completely foolproof, and they could cause more harm than good.
I mentioned that watermarking holds up against localized tampering, but it doesn’t fare so well against “globalized tampering.” Certain cheeky methods like “using translation systems, rewording with another generative model, or asking the model to insert a special character in between every word and then deleting that character” will get around the watermark.
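Why would the special-character trick work? OpenAI hasn't said how its watermark is built, so the Python sketch below swaps in a toy version of the “green list” idea from published watermarking research: each word is nudged toward choices that a secret hash marks as “green” given the previous token. The key, the vocabulary, and every function name here are made up for illustration, and the “model” is just a loop over a word list.

```python
import hashlib

KEY = "demo-key"  # hypothetical secret shared by generator and detector
SEP = "@"         # the "special character" inserted between words
VOCAB = ["cat", "dog", "bird", "fish", "tree", "rock", "lake", "hill",
         "rain", "wind", "sun", "moon", "star", "road", "door", "book",
         "lamp", "ship", "bell", "leaf"]


def is_green(prev_token: str, token: str) -> bool:
    """Pseudo-randomly mark about half of all (previous, current) pairs green."""
    data = f"{KEY}|{prev_token}|{token}".encode("utf-8")
    return hashlib.sha256(data).digest()[0] % 2 == 0


def pick_word(prev_token: str, step: int) -> str:
    """Stand-in for a model: take the first candidate that is green after prev_token."""
    offset = step % len(VOCAB)
    candidates = VOCAB[offset:] + VOCAB[:offset]
    for word in candidates:
        if is_green(prev_token, word):
            return word
    return candidates[0]  # no green candidate, fall back to the first one


def generate(n_words: int, with_separator: bool = False) -> list:
    """Emit n_words watermarked words, optionally with SEP before each one."""
    tokens = ["start"]
    for step in range(n_words):
        if with_separator:
            tokens.append(SEP)
        tokens.append(pick_word(tokens[-1], step))
    return tokens


def green_fraction(tokens: list) -> float:
    """Detector: share of adjacent token pairs that are green."""
    pairs = list(zip(tokens, tokens[1:]))
    return sum(is_green(prev, cur) for prev, cur in pairs) / len(pairs)


marked = generate(200)
evaded = [t for t in generate(200, with_separator=True) if t != SEP]
print("watermarked text:", green_fraction(marked))
print("separator removed:", green_fraction(evaded))
```

In this toy setup, each word in the second run was chosen to look green after the separator, not after the word before it. Once the separators are deleted, the detector is looking at word pairs that were never nudged, so the score should fall back toward what ordinary text would get.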
Meanwhile, the other problem is the risk of a disproportionate impact on some groups. AI can be a “useful writing tool for non-native English speakers,” and in those situations, you don’t want to stigmatize its use, which would undermine the global accessibility of these tools for education.