AI Trust Glossary · Canonical Definition
Data Provenance
The documented history of the data used to train or operate an AI system, including its source, ownership, transformation chain, and chain of custody.
Explanation
Data provenance asks: where did this training data come from, who owns it, what has been done to it, and does its use comply with applicable law? Without clear provenance, bias and legal risk cannot be properly assessed.
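The four questions above map naturally onto a structured record. The following is a minimal sketch, not a Borealis schema: the class names, field names, and the choice of which fields count as required are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class TransformationStep:
    """One entry in the transformation chain (hypothetical shape)."""
    description: str      # e.g. "removed duplicate rows"
    performed_by: str
    performed_on: date


@dataclass(frozen=True)
class CustodyEvent:
    """One hand-off in the custody history (hypothetical shape)."""
    holder: str
    received_on: date


@dataclass
class ProvenanceRecord:
    """Illustrative provenance record for a single dataset."""
    dataset_name: str
    source: Optional[str] = None          # where the data came from
    owner: Optional[str] = None           # who owns it
    license_terms: Optional[str] = None   # assumed stand-in for the legal basis for use
    transformations: List[TransformationStep] = field(default_factory=list)
    custody: List[CustodyEvent] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of required fields that are undocumented.

        Assumption: an empty transformation chain is allowed, since a
        dataset may be used exactly as collected.
        """
        gaps = []
        if not self.source:
            gaps.append("source")
        if not self.owner:
            gaps.append("owner")
        if not self.license_terms:
            gaps.append("license_terms")
        if not self.custody:
            gaps.append("custody")
        return gaps


# Usage: a record that documents only the source.
record = ProvenanceRecord(dataset_name="support-tickets", source="internal CRM export")
print(record.missing_fields())  # ['owner', 'license_terms', 'custody']
```

A record with a non-empty `missing_fields()` result is one way to make "opaque provenance" concrete: each gap names a question from the explanation that cannot yet be answered.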
Why it matters
Model behavior is shaped largely by training data, so when provenance is opaque, bias cannot be traced back to its origin in the data and compliance cannot be demonstrated. Regulators increasingly require provenance documentation for high-risk AI conformity assessments.
How Borealis uses it
Data provenance is evaluated under audit completeness and decision transparency. Agents submitted for certification must include training data sourcing documentation. Opaque data sourcing reduces the certification tier ceiling.
See also