The serial entrepreneur Wayne Chang is taking the wraps off an AI reasoning engine called Reasoner, which he claims can produce much more accurate and explainable results than large language models (LLMs) such as OpenAI’s o1 series—and at a much lower cost.
The AI industry is pushing hard to build reasoning capabilities into the technology, partly to draw closer to the holy grail of human-level or superhuman artificial intelligence, and partly just to overcome the inaccuracies that plague today’s LLMs. Generative AI’s so-called hallucinations are a major factor in holding back enterprise deployments, as is the difficulty of explaining how an LLM arrives at its conclusions.
Reasoner aims to solve these problems through the use of neurosymbolic AI—a melding of neural networks (the technology that underpins generative AI) and more traditional symbolic AI, which is based on fixed rules, logic, and human-derived mappings of the relationships between things.
Chang has a long history in tech, starting with his creation of the i2hub file-sharing service in 2004. In 2011 he cofounded Crashlytics, a ubiquitous mobile crash-reporting tool that was bought by Twitter, where he became director of consumer product strategy. (Google bought Crashlytics in 2017.) He went on to cofound the AI-powered accounting firm Digits, and then last year he founded Patented.ai—an intellectual-property-focused AI tool that, it now turns out, has also served as the pilot implementation of the Reasoner engine.
High-stakes AI
Patented.ai offers the ability to conduct automated searches of patent documentation and source code, to spot potential patent infringement cases and identify innovations that could be patentable. Given the high financial stakes of patent cases and the extremely laborious nature of figuring out whether an infringement has taken place, there are clear opportunities for anyone who can automate the process—but also massive risks if the system gets it wrong.
In an exclusive interview with Fortune, Chang said Patented.ai’s early reliance on LLMs alone proved fruitless; the attorneys playing with the system immediately spotted the flaws in its results and rejected it. The company also tried other common techniques like retrieval-augmented generation (RAG), which draws on external data sources to enhance the output of LLMs (Google uses RAG for its AI search results), but that also didn’t provide the necessary level of reliability.
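For readers unfamiliar with the technique, RAG’s core loop is simple enough to sketch in a few lines of Python. The snippet below is a toy illustration, not any vendor’s actual pipeline: word overlap stands in for a real embedding model, and the model call is a stub.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by shared words with the query (a toy similarity measure)."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def call_llm(prompt: str) -> str:
    """Stub standing in for a hosted model call."""
    return f"[model answer grounded in {prompt.count('CONTEXT:')} retrieved passages]"

def rag_answer(query: str, documents: list[str]) -> str:
    passages = retrieve(query, documents)                   # 1. retrieve external data
    context = "\n".join(f"CONTEXT: {p}" for p in passages)  # 2. splice it into the prompt
    return call_llm(f"{context}\n\nQUESTION: {query}")      # 3. model answers from context

docs = [
    "A utility patent generally lasts 20 years from the filing date.",
    "Trademarks protect brand names and logos.",
    "A patent claim defines the legal scope of the invention.",
]
print(rag_answer("How long does a patent last?", docs))
```

The appeal is that the model’s answer is anchored in retrieved text rather than its training data alone; the weakness, as Patented.ai found, is that retrieval plus generation still offers no guarantee the final answer is faithful to the sources.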
This prompted the change in tack that resulted in the development of Reasoner. “We didn’t really start out to build a reasoning engine,” Chang says. “That wasn’t our mission at all.”
Reasoner does use LLMs to help interpret the language in texts—Chang says it’s agnostic as to which model it uses—but the core concept in Reasoner is that of adaptive dynamic knowledge graphs.
Knowledge graphs are widely used in tech. For over a decade, Facebook’s knowledge graph has provided the framework for establishing the relationships between people, while Google’s has given Search the ability to answer basic factual questions. These repositories of established knowledge are clearly useful for giving correct responses to queries—IBM’s Jeopardy-winning Watson AI was built on a knowledge graph—but they generally need to be manually updated to add new facts or edit relationships that have changed. The more complex the knowledge graph, the more work that entails.
Chang claims that Reasoner removes the need for manual updating, instead offering the ability to automatically build accurate knowledge graphs based on the unstructured text fed into the system, and for those knowledge graphs to then automatically reconfigure themselves as information gets added or changed. (It’s worth noting that Microsoft earlier this year revealed GraphRAG, an attempt to use LLM-generated knowledge graphs to improve RAG results.)
In other words, you can stick in a bunch of legal documents and Reasoner will then interpret them to build a knowledge graph containing the concepts in the documents and the relationships between them—with “full traceability” so that it’s easy for a human to check whether those facts are indeed an accurate representation of what’s in the documents. This is where the concept becomes useful far beyond the realm of patent litigation.
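The underlying data structure is easier to grasp with a concrete example. The sketch below is a generic illustration of the idea, not Reasoner’s actual internals: facts are stored as subject-relation-object triples (with made-up contents), each tagged with the document it came from, and a newer fact supersedes an older one rather than piling up alongside it.

```python
class KnowledgeGraph:
    def __init__(self):
        self.facts = {}  # (subject, relation) -> (object, source)

    def add(self, subject, relation, obj, source):
        # A newer fact for the same (subject, relation) replaces the old one,
        # so the graph reconfigures instead of accumulating stale entries.
        self.facts[(subject, relation)] = (obj, source)

    def query(self, subject, relation):
        # Returns the current answer plus the document it was extracted from,
        # which is what makes each fact traceable back to its source.
        return self.facts.get((subject, relation))

kg = KnowledgeGraph()  # the triples below are dummies, for illustration only
kg.add("Agreement A", "governing_law", "California", source="agreement_a.pdf, sec. 9")
kg.add("Agreement B", "governing_law", "Ireland", source="agreement_b.pdf, sec. 11")
# A revised document arrives; the dependent fact is updated, not duplicated.
kg.add("Agreement B", "governing_law", "Netherlands", source="agreement_b_v2.pdf, sec. 11")
print(kg.query("Agreement B", "governing_law"))
# -> ('Netherlands', 'agreement_b_v2.pdf, sec. 11')
```

The hard part, and what Chang says Reasoner automates, is extracting those triples accurately from unstructured text in the first place and keeping the graph consistent as documents change.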
In a demonstration to Fortune, Chang showed how Reasoner could ingest dozens of OpenAI’s various legal documents (from its user and developer agreements to its brand guidelines and cookie notices) and map their interdependencies. In the demo, this made it possible to provide both concise and detailed answers to a question about how a user might be able to exploit the differences between OpenAI’s U.S. and European terms of service to “avoid responsibility for harmful AI outputs.” Each step in the reasoning was explained—the logical steps were understandable even to a non-technical eye—and Reasoner then suggested follow-up questions about the problem’s impacts and how it could be mitigated.
Chang says Reasoner could also be used in a variety of other applications, from pharmaceuticals and advanced materials to security and intelligence. As such, he claims it can outperform the offerings from various other AI startups, such as Hebbia (a document search firm that raised a $130 million Series B in July) and Sakana (an Nvidia-backed scientific discovery outfit that raised $214 million in a September Series A round).
The cost of reason
But in terms of reasoning abilities, the big beast at the moment is OpenAI and its o1 series of models, which take a very different approach to the problem. Rather than straying from the pure-LLM paradigm, the o1 models use “chain of thought” reasoning combined with search, methodically working through a series of steps to arrive at a more considered answer than OpenAI’s GPT models could previously manage.
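In rough terms, chain-of-thought prompting just asks the model to show its intermediate steps before committing to an answer. The toy sketch below illustrates the pattern with a stubbed model call and illustrative prompt wording; in o1-class models this behavior is trained into the model rather than bolted on via a prompt.

```python
def call_llm(prompt: str) -> str:
    # Stub standing in for a hosted model; the canned reply mimics the kind of
    # step-by-step transcript that chain-of-thought prompting elicits.
    return ("Step 1: 17 apples minus 5 eaten leaves 12.\n"
            "Step 2: 12 apples split across 3 boxes is 4 per box.\n"
            "FINAL: 4")

def chain_of_thought(question: str) -> str:
    prompt = (question + "\nWork through the problem step by step, "
              "then give the final answer on a line starting with FINAL:")
    transcript = call_llm(prompt)
    # The intermediate steps are inspectable (the explainability appeal);
    # the last line carries the committed answer.
    return transcript.splitlines()[-1].removeprefix("FINAL:").strip()

print(chain_of_thought("17 apples, 5 eaten, the rest split evenly into 3 boxes. How many per box?"))
```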
The o1 models generally provide more accurate answers than their predecessors, but Chang claims Reasoner’s output is more accurate still. There aren’t many reasoning benchmarks out there—Reasoner may release its own early next year—but, based on DocBench and Google’s recently released Frames benchmark dataset, Chang said Reasoner achieved over 90% accuracy where o1 couldn’t break 80%. This result could not be independently verified at the time of publication.
He also said Reasoner’s approach allowed for far lower costs. OpenAI charges $15 per million tokens of input and $60 per million output tokens (a token is the base unit of AI text, equivalent to roughly three-quarters of an English word), whereas a million input tokens cost Reasoner 8 cents, and a million output tokens just 30 cents. “We haven’t finalized how we want to price this,” Chang said, adding that Reasoner’s “structural cost advantage” would allow it to charge users per result or per verified discovery.
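Taken at face value, the arithmetic is stark. For a workload of one million input tokens and one million output tokens, the published o1 rates and Chang’s claimed Reasoner rates work out as follows:

```python
# One million input tokens plus one million output tokens, priced at the
# published o1 rates versus Reasoner's claimed (unverified) rates.
o1_cost = 15.00 + 60.00       # $15/M input + $60/M output
reasoner_cost = 0.08 + 0.30   # $0.08/M input + $0.30/M output (Chang's figures)
print(f"o1: ${o1_cost:.2f}  Reasoner: ${reasoner_cost:.2f}  "
      f"ratio: {o1_cost / reasoner_cost:.0f}x")
# -> o1: $75.00  Reasoner: $0.38  ratio: 197x
```

That is a price gap of nearly 200 to 1, though, as with the accuracy figures, Reasoner’s numbers have not been independently verified.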
Chang’s claims are certainly big, but Reasoner’s team is small—there are around a dozen staffers, mostly in the U.S. So far, the company has raised only a $4.5 million pre-seed round, closed last year with investors including Baseline Ventures founder Steve Anderson, former Y Combinator managing director Ali Rowghani, and Operator Collective founder and CEO Mallun Yen. “I’ve been very fortunate to have a few successes in my history, so I’ve not been too worried about funding,” said Chang. But the entrepreneur expects to hire more staffers soon, as Reasoner scales up.
Chang said Reasoner—which took $1.8 million in bookings in Q3 of this year—will publicly release its benchmarks and demo in the first quarter of 2025, allowing people to upload their own datasets and test the company’s claims. The firm will also release a software development kit, to allow others to embed the Reasoner engine into their applications and AI agents. (Chang says the engine is lightweight enough that it can even run on the latest iPhones and Android devices, without the need for internet connectivity.)
“We want to make sure that we release it in a way where we immediately start building that trust and credibility,” Chang said.
Update: This article was updated on Dec. 4 to clarify that Google did not buy Crashlytics until 2017.