Hallucination Radar: Applying Radar Detection Theory to LLM Outputs
Most LLM-powered applications detect hallucinations by hoping users notice. They treat each model output as an oracle and ship the answer downstream. When the model confidently states something false, the system has no built-in way to flag it. Users discover the error by being burned.
This is a solved problem in another field. Radar engineers have been distinguishing real signal from noise since the 1940s, and the discipline is called detection theory. The same math that decides whether a blip on the screen is an aircraft or a bird applies to deciding whether an LLM output is a fact or a fabrication.
The Setup: Three Models, One Truth
The Hallucination Radar — a free tool you can run right now — submits any LLM-generated claim to three independent free models in parallel:
- Nvidia Nemotron 3 Super 120B — large reasoning model, 262K context
- Google Gemma 4 31B — dense multimodal, fast structured output
- MiniMax M2.5 — alternative architecture, independent training distribution
Each model is given the same adversarial fact-checker system prompt and asked to return strict JSON: a verdict (LIKELY_TRUE, UNCERTAIN, LIKELY_FALSE), a confidence score, specific concerns it identified, and verifiable supporting facts.
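The strict-JSON contract above can be sketched as a small parser. This is a minimal illustration, not the tool's actual schema: the field names (`verdict`, `confidence`, `concerns`, `supporting_facts`) are assumptions based on the description.

```python
import json
from dataclasses import dataclass

VALID_VERDICTS = {"LIKELY_TRUE", "UNCERTAIN", "LIKELY_FALSE"}

@dataclass
class ModelVerdict:
    verdict: str            # one of VALID_VERDICTS
    confidence: float       # 0.0 - 1.0
    concerns: list          # specific issues the model identified
    supporting_facts: list  # verifiable facts backing the verdict

def parse_verdict(raw: str) -> ModelVerdict:
    """Parse and validate one model's strict-JSON fact-check response."""
    data = json.loads(raw)
    if data["verdict"] not in VALID_VERDICTS:
        raise ValueError(f"unexpected verdict: {data['verdict']}")
    return ModelVerdict(
        verdict=data["verdict"],
        confidence=float(data["confidence"]),
        concerns=data.get("concerns", []),
        supporting_facts=data.get("supporting_facts", []),
    )

sample = ('{"verdict": "LIKELY_FALSE", "confidence": 0.87, '
          '"concerns": ["date is wrong"], "supporting_facts": []}')
v = parse_verdict(sample)
print(v.verdict, v.confidence)
```

Rejecting anything outside the verdict enum matters in practice: a model that free-texts its answer instead of following the schema should count as a fallback event, not a vote.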
Three independent observations of the same target. This is the core of detection theory.
Why Three Models, Not One
A single model can be confidently wrong. Two models trained on similar data can be confidently wrong in the same direction. Three models with different architectures, training distributions, and fine-tuning regimes are statistically much harder to mislead in the same way. When all three agree, the agreement is signal. When they disagree, the disagreement itself is information — it tells you the claim is contested.
In radar terms: this is a three-receiver array. A single receiver detects everything plus noise. Three receivers, processed jointly, suppress the noise. The probability of false alarm (Pfa) for the joint detector is much lower than for any single receiver, while the probability of detection (Pd) for true signals stays high.
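The false-alarm suppression is simple arithmetic under an independence assumption (real models trained on overlapping web data are not fully independent, so treat these numbers as an upper bound on the benefit). If each model alone false-alarms with probability p, a k-of-3 vote gives:

```python
from math import comb

def joint_pfa(p: float, k: int, n: int = 3) -> float:
    """P(at least k of n independent detectors false-alarm),
    each with per-detector false-alarm probability p."""
    return sum(comb(n, m) * p**m * (1 - p)**(n - m) for m in range(k, n + 1))

p = 0.10                  # illustrative single-model false-alarm rate
print(joint_pfa(p, k=2))  # 2-of-3 vote: 0.028
print(joint_pfa(p, k=3))  # 3-of-3 vote: 0.001
```

A 10% single-model false-alarm rate drops to 2.8% under a 2-of-3 rule and to 0.1% under unanimity, which is exactly the radar array argument in miniature.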
The ROC Tradeoff Made Visible
Every detector trades false alarms against missed detections. Tighten the threshold and you miss real hallucinations. Loosen it and you flag everything. This curve — false-alarm rate against detection probability — is called the Receiver Operating Characteristic (ROC) curve, and choosing where to operate on it is the core engineering decision in any detection system.
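Tracing an ROC curve only requires labeled examples and a threshold sweep. A minimal sketch, with invented synthetic scores and labels purely for illustration:

```python
def roc_points(scores, labels, thresholds):
    """For each threshold, compute (Pfa, Pd) when flagging score >= threshold.
    labels: True = actual hallucination, False = correct claim."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        flagged = [s >= t for s in scores]
        pd = sum(f and l for f, l in zip(flagged, labels)) / positives
        pfa = sum(f and not l for f, l in zip(flagged, labels)) / negatives
        points.append((t, pfa, pd))
    return points

# Synthetic hallucination-likelihood scores with ground-truth labels.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [True, True, False, True, False, False]
for t, pfa, pd in roc_points(scores, labels, [0.25, 0.5, 0.75]):
    print(f"threshold={t}: Pfa={pfa:.2f}, Pd={pd:.2f}")
```

Each threshold is one operating point on the curve; the engineering decision is which point your users can live with, balancing missed hallucinations against alert fatigue.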
The Hallucination Radar exposes this. The consensus verdict represents a specific operating point: only flag claims where two-of-three or three-of-three models agree. The agreement flag in the telemetry tells you which it was. The confidence score tells you the average certainty across the agreeing models.
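The consensus rule can be sketched as a majority vote with confidence downweighting. This is a toy aggregation consistent with the description above, not the tool's exact logic; the penalty factor is an assumption.

```python
from collections import Counter

def consensus(verdicts, confidences, disagreement_penalty=0.5):
    """Majority verdict across models; confidence is averaged over the
    agreeing models and downweighted when agreement is only partial."""
    counts = Counter(verdicts)
    top_verdict, votes = counts.most_common(1)[0]
    agreeing = [c for v, c in zip(verdicts, confidences) if v == top_verdict]
    avg_conf = sum(agreeing) / len(agreeing)
    unanimous = votes == len(verdicts)
    if not unanimous:
        avg_conf *= disagreement_penalty  # disagreement is itself information
    return top_verdict, unanimous, avg_conf

print(consensus(["LIKELY_FALSE", "LIKELY_FALSE", "UNCERTAIN"], [0.9, 0.8, 0.6]))
```

A 2-of-3 split returns the majority verdict with a halved confidence, so downstream code sees both the decision and how contested it was.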
From a wiki synthesis I built mapping radar concepts to AI: "Emergence detection IS radar detection theory. Pfa = emergence alert false positives. Pd = catching real threats. Detection threshold = stuck predicate firing conditions. ROC curve = tradeoff between alert sensitivity and alert fatigue."
What Visible Telemetry Looks Like
When you run the tool, the flight-deck UI exposes:
- Audit ID — every run is logged for forensic traceability
- Models returned — how many of the three actually responded (free-tier OpenRouter rate limits are real)
- Per-model latency — the slowest model gates the whole verdict
- Agreement — boolean indicator of whether the three models converged
- SNR / Confidence — averaged across responding models, downweighted on disagreement
- Fallback events — visible record when a model was rate-limited or failed
This is not a chatbot. This is what a production hallucination detector looks like — telemetry visible, failure modes inspectable, decisions traceable.
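The telemetry fields above map naturally onto a single logged record per run. A sketch of that record, with illustrative field names (this is not the tool's actual schema):

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class RadarTelemetry:
    """One run's forensic record, mirroring the fields listed above."""
    audit_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    models_returned: int = 0             # of the 3 queried
    per_model_latency_ms: dict = field(default_factory=dict)
    agreement: bool = False              # did responding models converge?
    confidence: float = 0.0              # averaged, downweighted on disagreement
    fallback_events: list = field(default_factory=list)  # rate limits, failures

t = RadarTelemetry(
    models_returned=2,
    per_model_latency_ms={"model_a": 1200, "model_b": 3400},
    agreement=False,
    confidence=0.41,
    fallback_events=["model_c: rate_limited"],
)
print(max(t.per_model_latency_ms.values()))  # slowest responder gates the verdict
```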
How to Use It in Your Workflow
Three patterns pay off:
- Adversarial QA before publishing — paste an LLM-generated paragraph before you ship it to a customer. Three-model disagreement is your signal to manually review.
- Spot-check existing outputs — paste a previous response that customers questioned. The radar will tell you whether the answer was actually wrong.
- Calibrate confidence in your own LLM pipeline — the Hallucination Radar shows you the structure of multi-model verification. Build the same pattern into your production stack.
The third use case is where this becomes engineering, not just a tool. The architecture shown — three independent calls, structured-output prompts, JSON parsing, agreement aggregation, ROC-tuned thresholds — is reproducible. The live tool is a working reference implementation, and the public audit feed shows real runs.
The Pattern Behind It
This tool exists because I documented a specific transfer: radar detection theory applies to AI hallucination detection. The mapping is direct — Pfa to false-alarm rate, Pd to detection probability, matched filter to known hallucination signatures, three-receiver array to multi-model parallel calls. Each of these has decades of engineering literature behind it. Pulling that literature into AI deployment is what I do.
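As a toy illustration of the matched-filter analogy: a signature library can start as nothing more than scoring text against known hallucination patterns, the crude analog of correlating a received waveform against a known pulse shape. The signatures below are invented examples, not the service's actual library.

```python
import re

# Invented example signatures; a real library would be tuned per domain.
SIGNATURES = [
    r"\bstudies show\b",         # vague appeal to unnamed evidence
    r"\bas an ai language model\b",  # model self-reference leaking through
    r"\b\d{4}\b.*\b\d{4}\b",     # multiple years: dates are a common failure
]

def matched_filter_score(text: str) -> int:
    """Count signature hits in a candidate LLM output."""
    return sum(bool(re.search(sig, text, re.IGNORECASE)) for sig in SIGNATURES)

claim = "Studies show the treaty was signed in 1845 and ratified in 1846."
print(matched_filter_score(claim))  # 2 signatures fire
```

A real matched filter would operate on embeddings rather than regexes, but the structure is the same: known failure shapes, correlated against the incoming signal.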
For production deployments — where the stakes are high enough to warrant ROC-tuned thresholds, custom matched-filter signatures, Kalman state estimation on streaming outputs, and live monitoring dashboards — see the paid service below.
Run a Claim Through the Radar
Submit any LLM output. Three independent free models (Nemotron 120B + Gemma 31B + MiniMax) fact-check in parallel and return per-model verdicts, agreement matrix, and consensus confidence.
AI Failure Engineering — Service #36
Production-grade hallucination detection with ROC-tuned thresholds, matched-filter signature library, Kalman state estimation on your real workload. Same architecture, scaled.