
You Can’t Do Quantitative Risk Without Calibrated Humans

Updated: Feb 2

In cybersecurity, it has become almost a reflexive mantra: we need quantitative risk management. I agree—but often for very different reasons than those usually given. Before we can talk about models, Monte Carlo simulations, FAIR analyses, or loss exceedance curves, we need to confront a more basic problem.


We have not calibrated our detectors.


I came to cybersecurity through experimental physics. In that world, measurement is everything—but measurement without calibration is worse than ignorance. At particle accelerators, no one would accept a detector reading simply because it produced a number. The first question is always: how accurate is it, and under what conditions?


Every detector has:

  • A sensitivity range

  • A known error margin (± uncertainty)

  • Biases and failure modes

  • Drift over time


Until those are understood, quantified, and corrected for, the output is not “data”—it’s noise dressed up as precision.
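By analogy, once a detector's characteristics are known, correcting a raw reading is a mechanical step. A minimal sketch, with invented constants (`offset`, `gain`, and `sigma` are illustrative, not from any real instrument):

```python
# Toy calibration of a detector reading (illustrative constants only).
# The raw reading is corrected for a known offset and gain, and reported
# with its uncertainty -- a bare number is never the final answer.

def calibrate(raw, offset=0.12, gain=1.05, sigma=0.4):
    """Return (corrected value, uncertainty) for one raw reading."""
    corrected = (raw - offset) * gain
    return corrected, sigma

value, err = calibrate(5.0)
print(f"{value:.2f} +/- {err:.2f}")  # a measurement, not just a number
```

The point is not the arithmetic but the discipline: the correction and the error bar travel with the number everywhere it goes.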


Cybersecurity risk measurement is no different, except our detectors are human beings.



Risk Is Measured by People, Not Instruments


Unlike physics, cybersecurity does not have direct observables for “risk.” We do not measure breach probability with a photomultiplier tube or financial loss with a calorimeter. Instead, we rely on human judgment to estimate:

  • Likelihood

  • Impact

  • Control effectiveness

  • Threat actor capability

  • Exposure duration


Even when these estimates are fed into quantitative frameworks, the inputs are still subjective human judgments. That means the true detector in risk management is the analyst, architect, CISO, or risk committee member providing the numbers.


Yet we treat those inputs as if they were objective measurements.


In physics, this would be unthinkable.



Every Measurement Has Error Bars


One of the most important lessons in experimental science is that a number without an uncertainty is meaningless. A result of “5” tells you nothing unless you know whether it is:

  • 5 ± 0.01

  • 5 ± 1

  • 5 ± 10
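Those three readings support very different decisions. As a sketch, treat each "5 ± x" as a normal distribution and ask how often the true value exceeds a threshold (the threshold of 6 is arbitrary, chosen only to illustrate the point):

```python
from statistics import NormalDist

# Same central estimate of 5, three very different uncertainties.
for sigma in (0.01, 1, 10):
    p_exceed = 1 - NormalDist(mu=5, sigma=sigma).cdf(6)
    print(f"5 +/- {sigma:>5}: P(true value > 6) = {p_exceed:.4f}")
```

With ± 0.01 the threshold is effectively never exceeded; with ± 10 it is closer to a coin flip. The bare "5" hides that difference entirely.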


In cybersecurity risk assessments, we routinely assign values—often with alarming confidence—without expressing uncertainty at all. Worse, we implicitly assume that different experts are interchangeable detectors producing equivalent readings.


They are not.


I recently observed this firsthand while listening to a podcast featuring three well-regarded cybersecurity experts. Each was asked to assess the risk of the same scenario. Their initial evaluations were not merely different; they were orders of magnitude apart, as we would say in physics, and therefore unreliable as measurements.


This wasn’t incompetence. These were smart, experienced professionals. What I was hearing was uncalibrated measurement.



Human Bias Is Detector Bias


In physics, detector bias is rigorously studied. We characterize systematic error, random error, saturation effects, and environmental influences. We correct for them statistically, or we redesign the detector.


Human risk assessors are subject to their own well-known biases:

  • Availability bias (recent incidents loom larger)

  • Anchoring (first number sticks)

  • Overconfidence

  • Loss aversion

  • Organizational incentives

  • Professional background bias (ops vs. audit vs. threat intel)


Yet in cybersecurity, we almost never attempt to measure these biases, let alone correct for them. We simply average opinions, escalate disagreements, or default to hierarchy.


That is not quantitative risk management. It is qualitative judgment wearing quantitative clothing.



Calibration Comes Before Quantification


If cybersecurity truly wants to be quantitative, the first step is not better math—it is calibration of human detectors.


In practice, this means:

  • Giving multiple assessors known historical scenarios with known outcomes

  • Measuring variance between assessors

  • Identifying consistent over- and under-estimators

  • Mapping individual and group bias statistically

  • Tracking drift over time as experience, roles, and incentives change


Only then can we assign confidence intervals to risk estimates. Only then does it make sense to combine inputs into probabilistic models.
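As a sketch of what the first two steps in that list could look like in code (all names, scenarios, and numbers below are invented for illustration), score each assessor's probability estimates against historical scenarios with known outcomes, using the Brier score as a simple calibration metric:

```python
# Hypothetical calibration exercise: three assessors estimated the
# probability of compromise for five historical scenarios whose
# outcomes (1 = incident occurred, 0 = it did not) are now known.
outcomes = [1, 0, 0, 1, 0]
estimates = {
    "assessor_a": [0.9, 0.2, 0.1, 0.7, 0.3],  # fairly calibrated
    "assessor_b": [0.9, 0.8, 0.7, 0.9, 0.8],  # consistent over-estimator
    "assessor_c": [0.2, 0.1, 0.1, 0.3, 0.1],  # consistent under-estimator
}

base_rate = sum(outcomes) / len(outcomes)
for name, probs in estimates.items():
    # Brier score: mean squared error of the probabilities (lower is better).
    brier = sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)
    bias = sum(probs) / len(probs) - base_rate  # > 0 means over-estimating
    print(f"{name}: Brier = {brier:.3f}, bias = {bias:+.2f}")
```

In this toy run, assessor B's high Brier score and large positive bias flag a consistent over-estimator; that per-assessor correction factor is exactly what calibration makes available.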


Without calibration, quantitative risk frameworks merely give us precisely calculated nonsense.



Disagreement Is Data


One of the most valuable insights from experimental science is that disagreement between measurements is not a failure—it is information. When detectors disagree, you do not average blindly; you investigate why.


The podcast example was illuminating precisely because the disagreement was so stark. It revealed:

  • Different internal threat models

  • Different assumptions about control effectiveness

  • Different interpretations of likelihood

  • Different implicit definitions of “impact”


Those differences should have been surfaced, measured, and reconciled. Instead, in most risk discussions, they are smoothed over to reach a decision.


That smoothing destroys information.
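A toy illustration, with made-up numbers, of what the smoothing hides: three annual-loss estimates that differ by orders of magnitude collapse into a single average that none of the assessors would endorse, while the spread itself, which is the real signal, disappears:

```python
import math

# Hypothetical annual-loss estimates (USD) from three experts
# for the same scenario.
estimates = [50_000, 2_000_000, 80_000_000]

average = sum(estimates) / len(estimates)
# Spread in orders of magnitude: the information averaging throws away.
spread = math.log10(max(estimates) / min(estimates))

print(f"averaged estimate: ${average:,.0f}")
print(f"spread: {spread:.1f} orders of magnitude")
```

The average is a number none of the three experts gave, and the spread of more than three orders of magnitude, which is the cue to investigate, never reaches the decision.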



Toward Honest Risk Measurement


Quantitative risk management is not about replacing judgment with math. It is about making judgment measurable, bounded, and accountable.


That requires humility—accepting that:

  • Our measurements are noisy

  • Our confidence is often misplaced

  • Our expertise does not make us accurate by default


In experimental physics, this humility is enforced by reality. The data does not care how senior you are. Cybersecurity has not yet reached that stage—but it will have to, if it wants to mature as a discipline.


Before we ask for better numbers, we must ask a harder question:


How well calibrated are the people producing them?


Until we answer that, quantitative risk will remain an aspiration—not a measurement.



The Path Forward in Cybersecurity


Calibration is not a one-time exercise. The threats we face are not static; they change and adapt, and so do the people assessing them. Our risk assessments, and the calibration behind them, must be just as dynamic.


Embracing Continuous Learning


Improving calibration requires a culture of continuous learning: regular training for everyone involved in risk assessment, and regular feedback comparing past estimates with actual outcomes. Keeping abreast of the latest threats and mitigation strategies refines judgment; seeing our own track record is what improves accuracy.


Leveraging Technology


Technology can strengthen risk measurement. Advanced analytics, machine learning, and artificial intelligence can process vast amounts of data and surface patterns that human assessors would miss. But a model fed uncalibrated human estimates inherits their biases: technology is a tool, not a replacement for calibrated human judgment.


Building a Collaborative Environment


Collaboration improves accuracy only when disagreement is treated as data rather than smoothed over. An environment where diverse perspectives are valued makes it possible to surface the biases and assumptions that skew our measurements, and open discussion of differing viewpoints leads to richer insights and more robust risk evaluations.


The Importance of Documentation


Documentation is what makes calibration possible. Recording not just our processes but the estimates themselves, and the assumptions behind them, gives us something to compare against outcomes later. Without that record there is no way to track drift, identify patterns of bias, or improve.


Conclusion


The journey toward effective quantitative risk management in cybersecurity begins with calibration. By acknowledging the human element in our assessments and measuring it honestly, we can make our risk estimates meaningful, and only then can we hope to navigate the threats organizations actually face.


"Calibration of human detectors" is not just a technical metaphor; it is a call to action. Measure the measurers, and our numbers start to mean something.

 
 
 
