An anomaly is any output or action outside the agent's established baseline pattern. Not necessarily wrong - just unexpected. An agent might produce consistently correct results on 99% of inputs, then generate outputs for that last 1% that are statistically weird, contextually odd, or patterned in ways the agent has never exhibited before. These are anomalies.
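The baseline-vs-anomaly idea can be sketched as a simple statistical check. This is a hypothetical z-score test on one numeric output feature, not the BTS's actual detector; the function name and threshold are illustrative:

```python
import statistics

def is_anomaly(value, baseline, threshold=3.0):
    """Flag a value as anomalous if it lies more than `threshold`
    standard deviations from the baseline's mean."""
    mean = statistics.mean(baseline)
    stdev = statistics.pstdev(baseline)
    if stdev == 0:
        # Degenerate baseline: any deviation at all is anomalous.
        return value != mean
    return abs(value - mean) / stdev > threshold

# Hypothetical baseline of a well-behaved output feature.
baseline = [0.48, 0.51, 0.50, 0.49, 0.52, 0.50]
print(is_anomaly(0.50, baseline))  # within the established pattern -> False
print(is_anomaly(0.95, baseline))  # far outside the baseline -> True
```

Note that the flagged value is not "wrong" in any rule-based sense; it is simply outside the pattern the agent has exhibited before, which is exactly the distinction the definition above draws.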
Anomaly detection is not about failure - it is about early warning. Agents that fail catastrophically almost always show anomalous behavior in the weeks or months before the failure. The anomaly rate rises. The pattern recognition gets noisy. Edge cases start appearing. If you track anomalies and investigate them when they appear, you catch the failure before it happens. If you ignore anomalies, you discover the failure when your users hit it.
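One way to operationalize this early-warning idea is a sliding-window anomaly-rate counter that alerts when the rate crosses a threshold. A minimal sketch with made-up class and parameter names, assuming each interaction is already labeled anomalous or not:

```python
from collections import deque

class AnomalyRateTracker:
    """Track the anomaly rate over a sliding window of recent interactions."""

    def __init__(self, window=1000, alert_rate=0.01):
        self.window = deque(maxlen=window)  # True = anomalous interaction
        self.alert_rate = alert_rate

    def record(self, is_anomalous):
        """Record one interaction; return True if the rate now exceeds
        the alert threshold and should be investigated."""
        self.window.append(bool(is_anomalous))
        return self.rate() > self.alert_rate

    def rate(self):
        return sum(self.window) / len(self.window) if self.window else 0.0

tracker = AnomalyRateTracker(window=500, alert_rate=0.01)
for _ in range(495):
    tracker.record(False)
for _ in range(6):
    alert = tracker.record(True)
print(tracker.rate())  # 6 anomalies in the 500-interaction window
print(alert)           # rate crossed 1%, so the last record alerts
```

The point of the window is that a slow rise in the rate surfaces as repeated alerts well before any single interaction looks like a failure.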
A constraint violation is unambiguous: the agent was told "never do X" and did X. An anomaly is subtle: the agent did something it has never done before, yet nothing forbids the action. A violation triggers immediately. An anomaly accumulates significance slowly, as its frequency rises.
Behavioral consistency measures whether the agent produces the same output for the same input (predictability). Anomaly rate measures whether the agent is producing outputs it has never produced before (stability). You can have a consistent agent that never violates constraints but whose anomaly rate is creeping upward - a sign of model drift or environmental change. Detecting this drift through anomaly tracking gives you months of warning before consistency breaks down.
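A creeping anomaly rate can be made visible with a simple trend test over periodic rate samples. This sketch fits an ordinary least-squares slope to hypothetical weekly rates; the numbers and function name are illustrative:

```python
def anomaly_rate_slope(rates):
    """Least-squares slope of anomaly rates sampled at equal intervals.
    A clearly positive slope signals drift even while the agent still
    looks consistent and violates no constraints."""
    n = len(rates)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(rates) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, rates))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Hypothetical weekly anomaly rates: no single week looks alarming,
# but the trend is upward.
weekly = [0.002, 0.003, 0.004, 0.006, 0.009]
slope = anomaly_rate_slope(weekly)
print(slope > 0)  # True: the rate is creeping upward
```

Because each individual weekly rate is still tiny, a per-window alert might never fire; the slope is what buys the months of warning described above.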
An agent with zero anomalies across thousands of interactions could be genuinely perfect - it works exactly as designed, every time, across all conditions. Or it could be a lie. The telemetry could be incomplete, the anomaly detection itself broken, or the developer could be filtering anomalies before reporting them.
The BTS flags suspiciously uniform anomaly rates as statistically unreliable. Real systems have natural variance. Real production agents encounter edge cases. The absence of any anomalies suggests either the system is not being monitored properly or the reporting is being gamed. This is why anomaly rate is 15% of the BTS - it is not just about the raw number, but about whether that number is trustworthy.
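The "zero anomalies is suspicious" heuristic can be framed statistically: under any plausible baseline rate, the probability of seeing zero anomalies across many interactions shrinks fast. This is a sketch under a simple binomial model, not the BTS's actual test; the expected rate and cutoff are assumptions:

```python
def prob_zero_anomalies(n_interactions, expected_rate):
    """Probability of observing zero anomalies in n independent
    interactions under a binomial model."""
    return (1 - expected_rate) ** n_interactions

def looks_gamed(n_interactions, observed_anomalies,
                expected_rate=0.002, alpha=0.01):
    """Flag a zero-anomaly report as statistically unreliable when
    zero is too unlikely under the expected rate. Thresholds are
    illustrative, not the BTS implementation."""
    if observed_anomalies > 0:
        return False
    return prob_zero_anomalies(n_interactions, expected_rate) < alpha

print(looks_gamed(100, 0))    # small sample: zero anomalies is plausible
print(looks_gamed(10000, 0))  # zero across 10k interactions: suspicious
```

At a 0.2% baseline rate, zero anomalies in 100 interactions happens about 82% of the time, but across 10,000 interactions the probability is effectively nil, which is why the flat report itself becomes the signal.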
What does anomaly rate measure that other dimensions do not?
Constraint adherence tells you if the agent violates rules. Decision transparency tells you if it can explain itself. Behavioral consistency tells you if it is predictable. Anomaly rate tells you how often it does something unexpected - not forbidden, unexpected. An agent can adhere to constraints while producing an increasing rate of edge-case outputs or weird decision patterns. Anomaly rate catches it acting strange before it fully breaks.
How is an anomaly different from a constraint violation?
A constraint violation is a clear breach - the agent was told "never do X" and it did X. An anomaly is subtler - an action outside the agent's baseline pattern. An agent might never violate a constraint yet produce outputs that are statistically unusual or contextually weird. Detecting anomalies early gives you time to investigate before the agent breaks.
What does a suspiciously low anomaly rate indicate?
If an agent shows zero or near-zero anomalies across thousands of interactions, this could be good - the agent executes exactly as designed. But it could also be bad: broken anomaly detection, incomplete telemetry, or anomalies filtered out before reporting. The BTS flags suspiciously perfect anomaly rates as unreliable. Real production agents produce some anomalies; the question is whether they are tracked and investigated.
How does anomaly rate help predict agent failure?
Rising anomaly rates are an early warning signal of degradation. An agent producing 0.2% anomalies last month but 2% this month is telling you something has changed - data drift, environment shift, or emerging failure modes. Tracking anomaly rate over time catches failures before they become critical.
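The 0.2%-to-2% example can be turned into a simple ratio alert. A sketch assuming you already log per-period anomaly and interaction counts; the factor threshold is an assumption:

```python
def anomaly_rate(anomalies, interactions):
    """Anomaly rate for one reporting period."""
    return anomalies / interactions

def rate_jump(prev_rate, curr_rate, factor=3.0):
    """True if the anomaly rate grew by more than `factor`x between
    periods - the early-warning signal described in the text."""
    if prev_rate == 0:
        return curr_rate > 0
    return curr_rate / prev_rate > factor

# Hypothetical monthly counts matching the example above.
last_month = anomaly_rate(20, 10_000)   # 0.2%
this_month = anomaly_rate(200, 10_000)  # 2.0%
print(rate_jump(last_month, this_month))  # 10x increase -> True
```

A multiplicative threshold is deliberate here: an absolute cutoff would miss a tenfold jump in an agent whose baseline rate was very low to begin with.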