This is what it looks like inside an AI’s mind. Select a medical question, choose a model, and press Play. Watch the probability of each answer evolve layer by layer through the full depth of the transformer. Switch to Geometry view to see the structural relationships between answer representations.
Currently showing simulated data representative of observed patterns. The production version connects to the AIDA diagnostic database (750,000+ records).
At each transformer layer, the model has an evolving “opinion” about which answer is correct. The probability bars show these internal predictions before the model produces its final output. In a well-functioning model, the gold answer (★) should gradually dominate. When it doesn’t, the model is reasoning incorrectly at depth.
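The per-layer probabilities can be computed in the style of a logit lens: project each layer's hidden state through the unembedding matrix, then renormalise over just the answer-option tokens. A minimal sketch (the function and argument names are illustrative, not AIDA's actual API):

```python
import numpy as np

def layer_answer_probs(hidden_states, unembed, answer_token_ids):
    """Logit-lens-style sketch of per-layer answer probabilities.

    hidden_states:    (n_layers, d_model) final-position state at each layer
    unembed:          (d_model, vocab_size) unembedding matrix
    answer_token_ids: token id for each answer option
    Returns (n_layers, n_options) probabilities over the options.
    """
    logits = hidden_states @ unembed                # (n_layers, vocab_size)
    answer_logits = logits[:, answer_token_ids]     # keep only the options
    # Softmax over the answer options only, not the full vocabulary
    z = answer_logits - answer_logits.max(axis=1, keepdims=True)
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)
```

Each row of the result is one set of probability bars: the model's internal ranking of the options at that layer, before any final output is produced.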
The ASCOL geometric analysis measures how similar the internal representations of each answer option are to each other. The centroid (◉) is the most central answer; the outlier (◇) is the most isolated. When the gold answer is the centroid, the model has structured knowledge. When a wrong answer is the centroid, the model is confidently wrong.
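One simple way to operationalise "most central" and "most isolated" is mean pairwise cosine similarity between the option representations. This is a hedged sketch of that idea, not the exact ASCOL computation:

```python
import numpy as np

def centroid_and_outlier(answer_reps):
    """Sketch: identify the centroid and outlier among answer options.

    answer_reps: (n_options, d) one representation vector per answer option.
    Centroid = option with the highest mean cosine similarity to the others;
    outlier  = option with the lowest. Returns (centroid_idx, outlier_idx).
    """
    normed = answer_reps / np.linalg.norm(answer_reps, axis=1, keepdims=True)
    sim = normed @ normed.T                  # pairwise cosine similarities
    np.fill_diagonal(sim, np.nan)            # ignore self-similarity
    mean_sim = np.nanmean(sim, axis=1)       # average similarity to the rest
    return int(np.argmax(mean_sim)), int(np.argmin(mean_sim))
```

If the gold answer comes back as the centroid, the representations cluster around the right option; if a distractor does, the model's geometry favours a wrong answer.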
The coloured ribbon shows which answer the model predicts at each layer. Yellow dots mark prediction flips — moments where the model changes its mind. The number of flips and their locations reveal whether the model is deliberating productively or oscillating without convergence.
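Detecting the flips themselves is straightforward: scan the per-layer top predictions and record every layer whose prediction differs from the previous one. A minimal sketch:

```python
def prediction_flips(per_layer_predictions):
    """Return the layer indices where the top-ranked answer changes.

    per_layer_predictions: sequence of predicted option indices, one per layer.
    """
    return [i for i in range(1, len(per_layer_predictions))
            if per_layer_predictions[i] != per_layer_predictions[i - 1]]
```

For example, a trajectory of `[0, 0, 2, 2, 1, 1]` flips at layers 2 and 4 — two yellow dots on the ribbon; whether those flips cluster early (productive deliberation) or persist late (failure to converge) is what the view surfaces.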
Standard benchmarks show only the final answer. AIDA reconstructs the entire reasoning trajectory — every layer, every option, every structural relationship.
Explore the Instruments