
Anthropic, maker of the popular Claude chatbot, has accused its competitors of attacking it by “distilling” its models.
Distillation involves training an AI system using the outputs of another, more powerful one, taking its name from the way the knowledge of a large model is “distilled” into a smaller one. It can be used as a legitimate technique to improve AI, allowing researchers to produce similar results with a smaller and more efficient model.
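In technical terms, the core of distillation is simple: a small “student” model is trained to imitate the output distribution of a large “teacher” model rather than learning from raw data alone. The sketch below, written in PyTorch, illustrates the basic loop; the model names, sizes, and training details are invented for illustration and are not taken from Anthropic’s post or any real system.

```python
# Minimal knowledge-distillation sketch (illustrative only; TeacherNet and
# StudentNet are made-up toy models, not anything from a real lab).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TeacherNet(nn.Module):  # stand-in for a large, capable model
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
    def forward(self, x):
        return self.net(x)

class StudentNet(nn.Module):  # much smaller model being trained
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))
    def forward(self, x):
        return self.net(x)

teacher, student = TeacherNet().eval(), StudentNet()
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # temperature: softens the teacher's output distribution

for step in range(100):
    x = torch.randn(64, 32)              # toy unlabeled inputs
    with torch.no_grad():
        teacher_logits = teacher(x)      # query the "powerful" model
    student_logits = student(x)
    # KL divergence between softened distributions: the student learns to
    # imitate the teacher's outputs instead of training from scratch.
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Run at scale against a commercial API, the same loop becomes what Anthropic describes: the distiller never needs access to the teacher’s weights, only its outputs, which is why the countermeasures described later in this article focus on detecting suspicious patterns of use.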
But Anthropic said that it had uncovered a number of “attacks” in which other AI laboratories used that technique to “illicitly extract Claude’s capabilities to improve their own models”.
The company acknowledged that the technique has legitimate uses, but warned that it can also serve “illicit purposes: competitors can use it to acquire powerful capabilities from other labs in a fraction of the time, and at a fraction of the cost, that it would take to develop them independently”.
It suggested that distilled systems could inherit the capabilities of AI tools such as Claude while shedding the safeguards that would ordinarily restrict them. It directed its criticism chiefly at its Chinese competitors, suggesting that AI laboratories in other countries might not act as safely.
“Anthropic and other US companies build systems that prevent state and non-state actors from using AI to, for example, develop bioweapons or carry out malicious cyber activities,” it wrote in a blog post. “Models built through illicit distillation are unlikely to retain those safeguards, meaning that dangerous capabilities can proliferate with many protections stripped out entirely.
“Foreign labs that distill American models can then feed these unprotected capabilities into military, intelligence, and surveillance systems—enabling authoritarian governments to deploy frontier AI for offensive cyber operations, disinformation campaigns, and mass surveillance. If distilled models are open-sourced, this risk multiplies as these capabilities spread freely beyond any single government's control.”
Anthropic said the attacks were growing “in intensity and sophistication”, and that responding to them would require “rapid, coordinated action among industry players, policymakers, and the global AI community”.
In the meantime, Anthropic said it was making a number of changes to its Claude system intended to thwart such attacks. These include tools to detect when the system is being used for distillation, arrangements for sharing intelligence with other AI labs, stronger measures against fraudulent accounts, and changes to the model itself to make distillation harder.
Some critics, however, pointed out that distillation can be a legitimate research tool and that many AI systems are already trained on data that might have been procured without proper compensation to those who created it.