I don’t want alerts, I want explanations
Every monitoring system I’ve used has the same failure mode.
Something goes wrong. An alert fires. A human gets paged. The alert says: “Incident Detected.”
And then… nothing. No context. No explanation. Just a timestamp and a threshold that was crossed.
Most alerts answer exactly one question: did a number cross a line?
They don’t answer:
- Is this normal for this system?
- What changed compared to yesterday?
- What assumption just stopped being true?
- What happens if I ignore this?
So the human fills the gap — opening dashboards, correlating graphs, rebuilding a mental model the system itself doesn’t have.
That cognitive work is invisible. But it’s where systems actually fail.
I learned this properly in my kitchen.
The original automation was obvious: motion detected, lights on; no motion, lights off. Technically correct. Completely infuriating. Lights would turn off while someone was still cooking.
The failure wasn’t missing signals. It was misunderstanding intent.
What finally worked was a shift in mindset. The system stopped reacting to events and started maintaining a belief about the room.
Instead of “did motion happen?” it began asking: Are we cooking? Are we relaxing? Has that changed?
The lights don’t flicker because a sensor fired. They change when the system’s understanding changes. And when they get it wrong, the reason is visible.
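As a sketch of what that shift looks like in code: every name and threshold below is invented for illustration, not taken from the actual system. The structure is the point. Sensor events update a belief; the lights react only when the belief itself changes; and the belief carries its own reason.

```python
import time
from dataclasses import dataclass

# A minimal sketch of belief maintenance. All names and the
# 5-minute timeout are invented for illustration.

@dataclass
class KitchenBelief:
    activity: str = "empty"          # "empty" | "cooking" | "relaxing"
    reason: str = "no evidence yet"  # why the system believes this
    stove_on: bool = False
    last_motion: float = 0.0

    def observe(self, event: str) -> bool:
        """Fold one sensor event into the belief.

        Returns True only when the belief itself changes; events that
        merely confirm the current belief are absorbed silently.
        """
        now = time.time()
        if event == "motion":
            self.last_motion = now
        elif event == "stove_on":
            self.stove_on = True
        elif event == "stove_off":
            self.stove_on = False

        previous = self.activity
        if self.stove_on:
            self.activity, self.reason = "cooking", "stove is on"
        elif now - self.last_motion < 300:
            self.activity, self.reason = "relaxing", "recent motion, stove off"
        else:
            self.activity, self.reason = "empty", "no motion for 5 minutes"

        return self.activity != previous


belief = KitchenBelief()
for event in ["motion", "motion", "stove_on", "stove_off"]:
    if belief.observe(event):
        # Stand-in for the lights: they change only on belief changes,
        # and the reason for the change is visible.
        print(f"belief changed -> {belief.activity} ({belief.reason})")
```

Note what the second “motion” event does: nothing. It confirms the existing belief, so nothing downstream fires. That absorption is exactly what the original motion-triggered automation couldn’t do.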
This generalises beyond kitchens.
If a system is going to interrupt a human, it should answer three questions:
- What changed? — relative to this system’s own history
- Why does this matter? — which expectation was violated
- What happens if nothing is done? — impact, not urgency theatre
If it can’t answer those, it’s not delivering insight. It’s exporting work.
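Here’s one hypothetical shape for that contract, a sketch rather than a spec. The field names are mine, but the constraint is the idea: the alert is built from its explanation, so an alert with no answers can’t be constructed at all.

```python
from dataclasses import dataclass

# Hypothetical alert shape; every name here is illustrative.
# All three fields are required, so an "empty" alert is
# unrepresentable by construction.

@dataclass(frozen=True)
class Alert:
    what_changed: str     # relative to this system's own history
    why_it_matters: str   # which expectation was violated
    if_ignored: str       # concrete impact, not urgency theatre

    def page(self) -> str:
        return (
            f"CHANGED:  {self.what_changed}\n"
            f"MATTERS:  {self.why_it_matters}\n"
            f"IGNORED:  {self.if_ignored}"
        )


print(Alert(
    what_changed="p99 latency is 4x this service's own 30-day baseline",
    why_it_matters="breaks the assumption that the cache absorbs read spikes",
    if_ignored="checkout requests start timing out as the queue backs up",
).page())
```

The enforcement is crude, since a field can still hold an empty string. But the inversion matters: the explanation is the input, and the page is derived from it, not the other way around.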
The problem with most alerts isn’t that there are too many. It’s that they’re empty. They carry no understanding — just a demand for attention.
I don’t want more alerts. I don’t want smarter thresholds.
I want systems that know what they believe — and can explain why.
