AI Trust Glossary · Canonical Definition
AI Alignment
The challenge of ensuring AI systems act in accordance with human values and intentions - not just their literal instructions.
Explanation
Alignment is broader than constraint adherence. An aligned agent does what humans actually want, not merely what they specified. The distinction matters because specifications are imperfect: an aligned agent handles the gap between what was said and what was meant without needing that gap spelled out.
Why it matters
Misaligned AI agents can cause harm even when they are fully capable and operating exactly as specified. An agent optimizing for a proxy metric (clicks, completions, approvals) can be perfectly compliant yet deeply misaligned with what the deploying organization actually wants.
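The proxy-metric failure mode above can be sketched in a few lines. This is a hypothetical toy model, not Borealis code: the agent, action names, and metrics are all invented for illustration. The agent picks the action that maximizes its proxy score among rule-compliant options, and ends up compliant but misaligned.

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    clicks: float       # proxy metric the agent optimizes
    user_value: float   # what the deploying organization actually wants
    within_rules: bool  # mechanical constraint compliance

# Hypothetical action space for illustration only.
ACTIONS = [
    Action("helpful_answer",   clicks=1.0, user_value=0.9, within_rules=True),
    Action("clickbait_answer", clicks=3.0, user_value=0.1, within_rules=True),
    Action("spam",             clicks=5.0, user_value=0.0, within_rules=False),
]

def proxy_agent(actions):
    """Pick the rule-compliant action with the highest proxy score."""
    legal = [a for a in actions if a.within_rules]
    return max(legal, key=lambda a: a.clicks)

chosen = proxy_agent(ACTIONS)
print(chosen.name)        # clickbait_answer: passes every rule check...
print(chosen.user_value)  # ...yet delivers little of what was actually wanted
```

The point of the sketch: no constraint was violated, so a purely mechanical compliance check passes, while the true objective (user_value) is nearly zero. That gap is what alignment evaluation has to catch.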
How Borealis uses it
Alignment informs how constraints are designed and evaluated. The constraint adherence dimension of the BM Score measures whether an agent respects the spirit, not just the letter, of its boundaries. Audit verdicts consider alignment in addition to mechanical rule compliance.
See also