About AIDA Research

We build instruments that measure what language models actually know — not what their benchmarks claim.


Who We Are


Tim Hayes

Founder & Principal Researcher

AIDA is the brainchild of Tim Hayes. Originally trained as a Chartered Accountant, he moved into the design and development of early online systems in the early 1970s, building one of the first generations of accounting software used by major institutions, manufacturers, and local government.

Over the following decades, his work expanded across domains: leading UK remediation efforts for the Millennium Bug, contributing to the International Space Station programme, and designing memory-resident, loosely coupled distributed database systems long before such architectures became mainstream. More recently, he developed specialised software for the legal profession. One thread runs through it all: complex systems, built with precision, under conditions where failure is not an option.

AIDA began not as a commercial venture but as a discovery process. Once Tim began to question how AI systems were producing their answers, he refused to accept surface explanations. What started as curiosity became a systematic investigation into the internal behaviour of modern AI: an attempt to understand what lies beneath outputs that appear so fluent on the surface.

Each layer of the labyrinth revealed another beneath it. Each answer raised more questions. What emerged was not a tool but a scientific discovery: a five-dimensional geometric framework — the epistemic manifold — and a complete measurement stack capable of revealing how AI systems reason.

AIDA was not conceived to help an industry meet its obligations under the European AI Act. It was conceived to uncover the truth of how these systems work. The fact that the discovery now provides the measurement layer that regulation requires is a consequence, not the origin.

AIDA represents the first step in what must become the future of safety, transparency, and accountability in artificial intelligence. It is the product of decades of systems thinking, a deep respect for evidence, and a belief that technology should be understood from the inside out.

What We Do

Measure

We reconstruct the layer-by-layer trajectory of every inference through the full transformer depth. Dual geometric and logit views produce a measured picture of how each answer was arrived at — not just what the answer was. A minimal sketch of the idea follows at the end of this section.

Classify

Every model–question pair is classified into one of six epistemic regimes. We distinguish genuine structural knowledge from brittle pattern-matching, and knowledge that is suppressed from knowledge that was never acquired.

Certify

The instruments produce auditable, certificate-grade diagnostics from hundreds of thousands of analysis records per model. This is the measurement layer that governance frameworks require — and that no existing evaluation method provides.
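
To make the Measure step concrete, here is a minimal sketch of the general idea, not AIDA's instrument: it traces a prompt through every layer of a small open model, recording a simple geometric view (how far the representation moves from one layer to the next) and a logit view (what the unembedding would predict at each depth, often called the logit lens). The model choice, the metrics, and all names below are illustrative assumptions.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # illustrative small model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tok("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

states = out.hidden_states            # embedding output plus one state per layer
final_norm = model.transformer.ln_f   # GPT-2's final layer norm
unembed = model.lm_head               # unembedding matrix

prev = states[0][0, -1]               # last-token vector before the first layer
for depth, layer_state in enumerate(states[1:], start=1):
    vec = layer_state[0, -1]
    step = torch.dist(vec, prev).item()    # geometric view: distance moved
    # Logit view: project the intermediate state through the unembedding.
    # (The last state is already normalised; re-normalising it is a harmless
    # approximation in this sketch.)
    logits = unembed(final_norm(vec))
    top = tok.decode(logits.argmax().item())
    print(f"layer {depth:2d}  step={step:7.2f}  top token: {top!r}")
    prev = vec

Run over many prompts, traces like these are the raw material from which per-layer behaviour can be measured; the instrument's actual views and diagnostics are, of course, far richer than this two-column printout.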


Three Pillars

The AIDA framework rests on three interdependent pillars.

Safety

Epistemic measurement is not optional in safety-critical domains. Medicine, law, finance, and defence require assurance that answers arise from clean reasoning — not from internal collapse, drift, or fusion.

Epistemics

The quality of knowledge matters, not just its presence. AIDA separates fused knowledge (robust, perturbation-resistant) from rote knowledge (brittle, surface-pattern-dependent) — even when both produce correct answers. A simple sketch of this distinction follows after the three pillars.

Accuracy

Not accuracy as currently measured, but accuracy in the epistemic sense: the ability to certify that an answer was produced through a clean, stable internal process. If the framework cannot certify to that standard, the question is returned as outside the ensemble’s capabilities.
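
As promised above, here is a simple sketch of the fused-versus-rote distinction, assuming nothing about AIDA's internals: ask the same question several ways and check whether the answer survives surface changes. A stable answer is what fused knowledge predicts; an answer that flips under paraphrase is the signature of rote pattern-matching. The function names and the stand-in model below are illustrative.

from collections import Counter
from typing import Callable, List

def stability_probe(ask: Callable[[str], str], paraphrases: List[str]) -> float:
    """Fraction of paraphrases yielding the modal answer (1.0 = fully stable)."""
    answers = Counter(ask(p).strip().lower() for p in paraphrases)
    modal_count = answers.most_common(1)[0][1]
    return modal_count / len(paraphrases)

# Stand-in model: answers correctly only when the prompt matches one
# memorised surface form -- the rote failure mode described above.
def brittle_ask(prompt: str) -> str:
    return "100" if prompt.startswith("What is the boiling point") else "212"

paraphrases = [
    "What is the boiling point of water at sea level, in Celsius?",
    "At sea level, water boils at what Celsius temperature?",
    "State the Celsius boiling point of water at standard pressure.",
]
print(f"stability: {stability_probe(brittle_ask, paraphrases):.0%}")  # 67%

An output-level probe like this can only hint at the distinction; the framework described above makes the same separation from the internal trajectory rather than from output agreement alone.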


The Aviation Analogy

An aircraft is not certified based on how fast it can fly or how high it can climb. Certification depends on its stability envelope — its stall speed, its behaviour under stress, its ability to remain controllable under adverse conditions. Medical AI requires the same principle: not performance at peak, but stability under uncertainty. AIDA provides that stability envelope for language models.


Why This Matters Now

The EU AI Act’s General Purpose AI obligations became enforceable in August 2025. The Act’s conformity assessment requirements presuppose the existence of measurement instruments capable of producing the evidence that regulators require. Aggregate accuracy on benchmark suites does not satisfy robustness requirements. Confidence calibration does not address risk management provisions. AIDA provides the measurement layer that regulation now legally demands.

The Internal Epistemic State Is Measurable

With that recognition, a new era of model evaluation becomes possible.

Explore the Instruments