
AWS scientist: Your AI technique wants mathematical logic

Hallucination is fundamental to how transformer-based language models work. In fact, it’s their greatest asset: it is how language models find links between sometimes disparate concepts. But hallucination can become a curse when language models are applied in domains where the truth matters, from questions about health care policies to code that correctly uses third-party APIs. With agentic AI, the stakes are even higher, as autonomous bots can take irreversible actions, like sending money, on our behalf.

The good news is that we have methods for making AI systems follow rules, and the underlying engines of those tools are also scaling dramatically each year. This branch of AI is called automated reasoning (also known as symbolic AI); it symbolically searches for proofs in mathematical logic to determine which statements are true and which are false given axiomatically defined policies.
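To make that idea concrete, here is a minimal sketch of proof search using the open-source Z3 solver through its Python bindings (one automated reasoning tool among many, chosen here only for illustration). The two-rule policy and the variable names are invented for the example; the technique is to show that the axioms together with the negation of a claim cannot all hold at once, which proves the claim follows from the policy.

```python
# A minimal sketch of automated reasoning with the Z3 solver (pip install z3-solver).
# The two-rule "policy" below is invented for illustration:
#   1. Every manager is an employee.
#   2. Every employee may submit expenses.
# We ask whether it logically follows that every manager may submit expenses,
# by checking that the axioms plus the claim's negation cannot all hold at once.
from z3 import Bool, Implies, And, Not, Solver, unsat

is_manager = Bool("is_manager")
is_employee = Bool("is_employee")
may_submit_expenses = Bool("may_submit_expenses")

axioms = And(
    Implies(is_manager, is_employee),
    Implies(is_employee, may_submit_expenses),
)
claim = Implies(is_manager, may_submit_expenses)

solver = Solver()
solver.add(axioms, Not(claim))  # search for a situation where the axioms hold but the claim fails

if solver.check() == unsat:
    print("Proved: the claim follows from the policy.")
else:
    print("Not proved: the claim does not follow from these axioms.")
```

The result is a proof or a refutation, not a probability or a best guess.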

It is important to understand that we’re not talking about probability or best guesses. Instead, this is about rigorous proofs found in mathematical logic via algorithmic search. Symbolic AI builds on foundations originally laid out by predecessors such as Aristotle, Boole, and Frege, and developed in modern times by great minds like Claude Shannon and Alan Turing.

Automated reasoning is not just theory; it enjoys deep industry adoption

Industry adoption began in the 1990s with proofs about low-level circuits in response to the Pentium FDIV bug. Later, automated reasoning was used in safety-critical systems at Airbus and NASA. Today, it is increasingly deployed in instances of neurosymbolic AI. Leibniz AI, for example, is applying formal reasoning in AI for the legal domain, while Atalanta is applying the same ideas to problems in government contracting, and DeepMind’s AlphaProof system doesn’t generate false arguments in mathematics because it uses the Lean theorem prover.

The list goes on: Imandra’s CodeLogician doesn’t allow programs to be synthesized that would violate API usage rules, because it too uses automated reasoning tools. Amazon’s Automated Reasoning checks feature in Bedrock Guardrails separates true statements from untrue ones using automated reasoning together with axiomatic formalizations that customers can define. For organizations seeking to augment their work with AI while having confidence in its outputs, the logical deduction capabilities of automated reasoning tools can be used to ensure that interactions stay within defined constraints and rules.

A key feature of automated reasoning is that it admits “I don’t know” when it cannot prove an answer valid, rather than fabricating information. In many cases, the tools can also point to the conflicting logic that makes them unable to prove or disprove a statement with certainty, and show the reasoning behind their determinations.
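As an illustration, here is another small sketch in Z3’s Python bindings, with a one-rule policy invented for the example. A checker can test a statement in both directions and report “undetermined” when the axioms settle neither the statement nor its negation:

```python
# A sketch of the "I don't know" behavior, again with Z3 and an invented
# one-rule policy: remote workers get a stipend. We ask whether a particular
# employee gets a stipend. Because the policy says nothing about non-remote
# workers, neither "yes" nor "no" can be proved, and the honest answer is
# "undetermined" rather than a fabricated guess.
from z3 import Bool, Implies, Not, Solver, unsat

def entailed(axioms, statement):
    """Return True if the axioms logically entail the statement."""
    s = Solver()
    s.add(*axioms)
    s.add(Not(statement))
    return s.check() == unsat

remote_worker = Bool("remote_worker")
gets_stipend = Bool("gets_stipend")

axioms = [Implies(remote_worker, gets_stipend)]  # the only rule we were given
claim = gets_stipend

if entailed(axioms, claim):
    print("True: the claim follows from the policy.")
elif entailed(axioms, Not(claim)):
    print("False: the claim contradicts the policy.")
else:
    print("Undetermined: the policy alone does not settle this question.")
```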

Automated reasoning tools are also typically inexpensive to operate, especially in comparison to power-hungry transformer-based tools. The reason is that automated reasoning tools reason purely symbolically about what is true and untrue. They don’t “crunch numbers,” and there are no matrix multiplications on GPUs. To see why, think of problems like “solving for x” from your mathematics courses in school. When we rewrite x+y to y+x, or x(y+z) to xy + xz, we are reasoning about infinitely many values while making only a few simple steps. These steps are easily performed in milliseconds on a computer.
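A small sketch of that point, again using Z3 from Python: one can prove that x + y equals y + x for every pair of integers by asking the solver to search for a counterexample. An “unsat” answer means no counterexample can exist, a conclusion about infinitely many values reached in a few symbolic steps rather than by testing numbers.

```python
# A sketch of purely symbolic reasoning with Z3: prove that x + y equals y + x
# for ALL integers by asking the solver to find a counterexample. "unsat" means
# no counterexample can possibly exist, so the identity holds over an infinite
# domain without our testing a single number.
from z3 import Ints, Solver, unsat

x, y = Ints("x y")

solver = Solver()
solver.add(x + y != y + x)  # assert that a counterexample exists

result = solver.check()
if result == unsat:
    print("Proved for all integers: no counterexample is possible.")
else:
    print("Not proved:", result)
```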

It is true that the application of mathematical logic isn’t a universal solution to all problems in AI. For example, we would be dubious of an axiomatization of what makes a song or poem “good”. We would also question tools that claim to prove in mathematical logic that our home furnace will not break. But in applications where we can define axiomatically the set of true and untrue statements in a given domain (e.g., eligibility for the Family and Medical Leave Act or the correct usage of a software library), the approach offers a practical way to deploy AI safely in business-critical areas where accuracy is paramount.
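To show what “defining a domain axiomatically” can look like, here is a final sketch in Z3 with deliberately simplified rules invented for this example (not the actual statute): an employee is eligible for leave exactly when they have at least 12 months of tenure and worked at least 1,250 hours. We then check a claim an AI assistant might make against that policy.

```python
# A sketch of axiomatizing a policy in Z3, with simplified, invented rules:
# an employee is eligible for leave exactly when they have at least 12 months
# of tenure and worked at least 1,250 hours. We then check whether a specific
# claim is consistent with the policy or contradicts it.
from z3 import Int, Bool, And, Solver, unsat

months_of_tenure = Int("months_of_tenure")
hours_worked = Int("hours_worked")
eligible = Bool("eligible")

policy = eligible == And(months_of_tenure >= 12, hours_worked >= 1250)

# Hypothetical claim to verify: "an employee with 8 months of tenure is eligible."
claim = And(months_of_tenure == 8, eligible)

solver = Solver()
solver.add(policy, claim)

if solver.check() == unsat:
    print("The claim contradicts the policy.")  # this branch fires here
else:
    print("The claim is consistent with the policy.")
```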

Getting started

While automated reasoning tools historically required deep mathematical expertise to use, the growing power of generative AI is making them increasingly accessible to broader audiences: users can express rules in natural language and automatically verify AI outputs against those rules. In fact, many language models are trained on the outputs of automated reasoning tools (often in combination with reinforcement learning). The key is starting with clear use cases that can be precisely defined: think of things like coding, HR policies, and tax laws. The approach is also applicable in areas where verification really matters, like security, compliance, and cloud infrastructure.

Looking ahead

As we seek to integrate AI ever deeper into our lives, the ability to verify the correctness and truthfulness of its actions and outputs will only become more critical. Organizations that invest in automated reasoning capabilities now will be better positioned to safely scale AI and agent adoption while maintaining control and compliance. In your next AI strategy meeting, consider automated reasoning. It could be the key to deploying AI with confidence across your organization and for your customers.

The opinions expressed in Fortune.com commentary pieces are solely the views of their authors and do not necessarily reflect the opinions and beliefs of Fortune.
