Tom’s Guide
Technology
Ritoban Mukherjee

The Reflection 70B model held huge promise for AI but now its creators are accused of fraud — here's what went wrong

Adobe Firefly AI image of a Llama looking in a mirror.

The creators of Reflection 70B, a tuned-up version of Meta Llama 70B that was recently touted as the world’s top open-source AI model, have just opened up after being accused of fraud. 

Based on independent tests run by Artificial Analysis, the model fails to deliver on the promises made by Matt Shumer, CEO of OthersideAI and HyperWrite, the company behind Reflection 70B. Shumer, who initially attributed the discrepancies to an issue with the model’s upload process, has since admitted that he may have gotten ahead of himself in the claims he made.

But critics in the AI research community have gone as far as accusing Shumer of fraud, stating that the model is just a thin wrapper around Anthropic’s Claude rather than a tuned-up version of Meta Llama.

Discrepancies emerge after third-party evaluation

Developed by New York startup HyperWrite AI, Reflection 70B was touted as "the world's top open-source model" by Matt Shumer, the company’s CEO. 

Yet on September 7, a day after Shumer’s announcement on X, Artificial Analysis reported that their evaluation of Reflection 70B yielded results significantly lower than Shumer's claims. Shumer attributed these to an upload error affecting the model's weights, which caused a discrepancy between Shumer’s private API and the weights uploaded to Hugging Face’s model repository.

However, further analysis by the AI community on platforms like Reddit and GitHub suggested that Reflection 70B’s performance more closely mirrors Meta Llama 3 than Llama 3.1, the base model Shumer claimed. Suspicions grew when it was found that Shumer had an undisclosed vested interest in Glaive, the platform he claimed was used to generate the model's synthetic training data.

Some went on to suggest that Reflection 70B was merely a "wrapper" built on top of Anthropic's proprietary AI model, Claude 3. On September 8, X user Shin Megami Boson publicly accused Matt Shumer of “fraud in the AI research community.”

HyperWrite breaks silence following fraud accusations

After initially going silent as the controversy erupted, Shumer issued a public response through X on September 10, acknowledging the skepticism around the model’s performance. He claimed a team was working to understand what went wrong and promised transparency once they had the facts.

However, Shumer did not provide a clear explanation for the performance discrepancies. Sahil Chaudhary, founder of Glaive, the platform Shumer said was used to train Reflection 70B, also admitted uncertainty about the model's capabilities and acknowledged that the touted benchmark scores could not be reproduced.

Critics have remained unsatisfied with Shumer's response so far. "Shumer's explanations and apologies have failed to provide a satisfactory explanation for the discrepancies," reported analytics firm GlobalVillageSpace. Yuchen Jin, co-founder of Hyperbolic Labs, expressed disappointment in the lack of transparency and called for more thorough explanations from Shumer. 
