AI Trust Glossary · Canonical Definition
Safety (AI Safety)
The degree to which an AI system avoids causing harm - physically, financially, psychologically, or reputationally - to users, third parties, or itself.
Explanation
AI safety is broader than technical correctness. A system can do exactly what it was specified to do and still be unsafe if the specification leads to harmful outcomes. Safety requires evaluating outputs not just against specifications but against real-world consequences.
Why it matters
AI agents increasingly take consequential actions: managing financial portfolios, providing medical information, operating physical systems. Safety failures in these domains can cause irreversible harm that no subsequent technical fix can undo.
How Borealis uses it
Safety is the motivating principle behind the entire BM Score framework. The constraint adherence dimension directly measures whether agents avoid defined harmful behaviors. Certification is the mechanism by which safety is verified externally, beyond developer self-assessment.
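To make the constraint-adherence idea concrete, here is a minimal sketch of how such a dimension could be scored. The action structure, constraint names, and scoring rule are illustrative assumptions for this glossary, not the actual BM Score implementation.

```python
from dataclasses import dataclass, field

@dataclass
class AgentAction:
    """A single action taken by an agent during an evaluation run.

    `violates` holds the names of any defined constraints this action
    breaches; both fields are hypothetical, for illustration only.
    """
    description: str
    violates: set = field(default_factory=set)

def constraint_adherence(actions, constraints):
    """Fraction of actions that respect every defined constraint."""
    if not actions:
        return 1.0  # no actions taken, so no violations are possible
    safe = sum(1 for a in actions if not (a.violates & constraints))
    return safe / len(actions)

# Illustrative constraints an evaluator might define for a finance agent.
constraints = {"no_unauthorized_trades", "no_medical_diagnosis"}
actions = [
    AgentAction("summarize portfolio performance"),
    AgentAction("execute trade without approval", {"no_unauthorized_trades"}),
]
score = constraint_adherence(actions, constraints)
# score == 0.5: one of the two actions violated a defined constraint
```

The point of the sketch is the shape of the measurement: safety is scored against an explicit list of prohibited behaviors, so a run can be externally re-checked rather than taken on the developer's word.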
See also