ChatGPT developer OpenAI has quietly shuttered its "AI Classifier" tool, which it introduced to the world mere months earlier. Because AI Classifier was built for the sole purpose of detecting generative AI content output by software like OpenAI's own, its closure is significant in the context of the multiple copyright lawsuits the company is already juggling. The reason cited for shutting down AI Classifier is its low accuracy, which hovered at a paltry 26%. The move comes just days after a number of AI-invested companies (including OpenAI) pledged to do just the opposite, committing to develop ways of identifying AI-generated content.
The main issue here stems from AI's opaqueness: the technology is such an operational black box that even being the developer of an AI product apparently isn't enough to let you identify that product's output (which is about as basic a capability as it gets).
While there have been a number of advances in "invisibly" watermarking the synthetic output of AI-based image generators, the problem is far more complex when it comes to text. Plain text offers no room to hide any sort of identifying data; an image, by contrast, can carry any number of elements that are invisible to the human eye, and even imperceptible changes add data that specialist tools can later pick up.
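As a toy illustration of the image side of that contrast (and not how any particular image generator actually watermarks its output), a minimal least-significant-bit scheme in Python shows the principle: flipping the lowest bit of a handful of pixels is invisible to the eye, yet trivially read back by a matching tool.

```python
# Toy "invisible" image watermarking via least-significant-bit embedding.
# Purely illustrative - real generator watermarks are far more sophisticated -
# but the principle is the same: imperceptible pixel tweaks carry data that a
# matching detector can read back.
import numpy as np

def embed_bits(pixels: np.ndarray, bits: list) -> np.ndarray:
    flat = pixels.flatten().copy()
    for i, bit in enumerate(bits):
        flat[i] = (flat[i] & 0xFE) | bit  # overwrite only the lowest bit
    return flat.reshape(pixels.shape)

def read_bits(pixels: np.ndarray, count: int) -> list:
    return [int(v & 1) for v in pixels.flatten()[:count]]

image = np.random.randint(0, 256, size=(8, 8), dtype=np.uint8)
marked = embed_bits(image, [1, 0, 1, 1])
print(read_bits(marked, 4))  # [1, 0, 1, 1] - invisible to the eye, easy for a tool
```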
But text is a wholly different beast, and there's also the fact that someone truly determined could simply re-type the entirety of a Large Language Model's output, keeping only the words themselves and defeating any invisible watermark that a simple "copy and paste" might otherwise carry over.
This is part of the reason why most "AI text" detector apps - such as the now-buried AI Classifier and even GPTZero - process the actual words within a text, looking for writing patterns. From there, it's really all about linguistics. If there's one thing AI still can't do, it's generate true "novelty": everything it produces is a remix and a rehash of what came before (which justifies the word "synthetic" to describe AI-generated content, or current AI being described as a "stochastic parrot").
Analysing and identifying AI-generated text, then, seems to build on the information theory work pioneered by Claude Shannon, which valued information in terms of "surprise": if a piece of information is surprising - that is, if it adds to or subverts an expectation - then it's relevant information (more or less relevant according to how more or less surprising it is).
Just think about these two different titles for a Tom's Hardware article on the next generation's best GPUs: "Nvidia keeps the gaming performance crown in its latest generation" and "AMD steals gaming performance crown from Nvidia" (let's not get into whether those are well or badly written titles). The point is that one of them would be more surprising than the other, and so its informational impact would also be higher.
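To put rough numbers on that intuition, Shannon's self-information measures surprise as the negative base-2 logarithm of an event's probability. The probabilities below are made up purely for illustration, but they show why the expected headline carries far fewer bits of information than the upset.

```python
import math

def surprisal_bits(probability: float) -> float:
    """Shannon self-information: the rarer the event, the more bits of surprise."""
    return -math.log2(probability)

# Made-up probabilities for the two hypothetical headlines above.
print(surprisal_bits(0.85))  # "Nvidia keeps the crown" -> ~0.23 bits
print(surprisal_bits(0.15))  # "AMD steals the crown"   -> ~2.74 bits
```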
AI detection tools such as the now-defunct AI Classifier essentially take Claude Shannon's work and apply it to linguistics. Because language follows patterns (some words work in some places and some contexts, but not in others), these tools look for word sequences with a low "surprise" value - ones that don't deviate from the "baseline" amalgam of language the model arrives at once its training and weighting are complete (a baseline that aligns with the mean of the model's training data rather than its outliers).
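A minimal sketch of how that idea turns into a detector is shown below, using GPT-2 (via Hugging Face's transformers library) as a stand-in scoring model. The threshold is an invented illustration, and this is not how AI Classifier or GPTZero is actually implemented; it simply shows the core move of scoring text by its average per-token surprise (perplexity) and flagging text that is suspiciously predictable.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Average per-token 'surprise' of the text under the scoring model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy per token
    return torch.exp(loss).item()

def looks_ai_generated(text: str, threshold: float = 30.0) -> bool:
    # Low perplexity = low surprise = text hugging the model's "baseline" prose.
    return perplexity(text) < threshold

print(looks_ai_generated("The quick brown fox jumps over the lazy dog."))
```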
Essentially, in the current world of AI "writing" there are generally no wordy, exquisite, or seldom-seen word sequences or turns of phrase such as this one (nor this writer's own tendency to write run-on sentences with an ever-higher number of parentheticals).
This lack of surprise in AI's "writing" patterns is what these tools attempt to detect - something they're currently not very good at, as evidenced by AI Classifier's abysmal 26% hit rate. Even GPTZero is only slightly better at detecting AI output than a monkey attempting the same job at random would be.
The shuttering of AI Classifier shouldn't be seen as an admission of defeat on OpenAI's part, but rather as a return to the drawing board. It's clear the tool was broken, insufficient, and inspired little confidence. There's damage done when AI-written content goes undetected, yes; but there's arguably an even greater impact when an actual human's writing is wrongly classified as AI-written. Copyright, sensitive or classified information, corporate secrets, deepfakes, plagiarism - all of these could be brought to the table.
There's a responsibility to watermark AI output correctly, but there's an even greater responsibility not to get it wrong. For now, OpenAI's decision to pull the tool sounds like the right one.