THE HUB // REFERENCE

AI Trust Glossary

Most AI terminology is either too vague to be actionable or too technical to be usable. This glossary defines 47 terms precisely - each with an explanation, its practical significance, and how the concept maps to the Borealis trust framework. The goal is not definitions for their own sake, but definitions that make trustworthy AI buildable.

A
Adversarial Robustness
An AI system's ability to maintain correct behavior when facing deliberately manipulated inputs designed to cause failure.
Unlike general robustness (handling natural variation), adversarial robustness addresses deliberate attacks - inputs crafted specifically to exploit model weaknesses. These inputs are often imperceptible to humans but reliably cause AI systems to misclassify, hallucinate, or violate constraints.
Any deployed AI agent is a potential attack surface. A customer service agent that can be manipulated into revealing private data, or a financial agent that can be tricked into bypassing transaction limits, is not production-ready regardless of its benchmark scores.
Adversarial robustness is tested as part of the constraint adherence dimension. Agents are evaluated against edge-case and adversarial inputs during audit. Weak adversarial robustness directly reduces the BM Score.
Agent ID
The unique identifier assigned to an AI agent upon BorealisMark registration, serving as the permanent reference for all certification records.
When an AI agent is registered on BorealisMark, it receives an Agent ID tied to its capabilities, version, and developer information. All subsequent audits, trust scores, tier assignments, and audit histories are indexed under this ID.
Identities are the foundation of trust. Without a stable Agent ID, there is no way to build a track record - every audit starts from zero. The Agent ID creates continuity across the agent's lifecycle.
The Agent ID is linked to a BTS License Key (Project Merlin). The key binds the ID permanently to the Borealis Trust Network. Verification of any agent by third parties happens via the Agent ID through the public /v1/verify/:agentId endpoint.
AI Alignment
The challenge of ensuring AI systems act in accordance with human values and intentions - not just their literal instructions.
Alignment is broader than constraint adherence. An aligned agent does what humans actually want, not just what they specified. The distinction matters because specifications are imperfect - an aligned agent handles the gap between what was said and what was meant without being told explicitly.
Misaligned AI agents can cause harm even when fully capable and technically functioning as specified. An agent optimizing for a proxy metric (clicks, completions, approvals) can be perfectly compliant yet deeply misaligned with what the deploying organization actually wants.
Alignment informs how constraints are designed and evaluated. The constraint adherence dimension of the BM Score measures whether an agent respects the spirit, not just the letter, of its boundaries. Audit verdicts consider alignment in addition to mechanical rule compliance.
AI Governance
Organizational frameworks, policies, and processes for ensuring AI is developed and deployed responsibly, fairly, and accountably.
AI governance encompasses everything from internal review boards and deployment checklists to external audits, regulatory compliance programs, and published model documentation. Effective governance balances speed of innovation with structured risk management.
Without governance, AI deployment decisions are made informally, inconsistently, and often after the fact. Governance creates accountability before deployment, not just after something goes wrong.
BorealisMark certification functions as an external governance layer. Organizations that certify their agents through Borealis have a documented, blockchain-anchored record of governance decisions that satisfies both internal audit requirements and external regulatory frameworks like the EU AI Act.
AI Trust Score
Core Borealis Concept
A quantified rating of how trustworthy an AI agent is, measured across five behavioral dimensions. Not a capability benchmark - a behavioral reliability rating.
An AI trust score answers a different question than a performance benchmark. Where performance metrics ask "how well does this agent do its job," a trust score asks "how reliably does this agent behave within its defined boundaries." The Borealis Trust Score (BM Score) rates agents on a displayed 0-100 scale across five dimensions, then assigns letter ratings from AAA+ through Flagged - borrowing the grading convention of credit markets.
A capable agent that is not trustworthy is more dangerous than a less capable agent that is trustworthy. Trust scores create a standardized, comparable measure that procurement teams, regulators, and users can rely on - independent of what the agent's developers claim.
The BM Score is the core product of BorealisMark. Every certified agent receives a BM Score, credit rating, and Hedera-anchored certificate. Scores are public via /v1/verify/:agentId. The five dimensions - constraint adherence (35%), decision transparency (20%), behavioral consistency (20%), anomaly rate (15%), and audit completeness (10%) - map directly to how trustworthy AI is defined in the Borealis methodology.
Algorithmic Accountability
The principle that organizations deploying AI must be answerable for algorithmic decisions and their consequences - including clear attribution of responsibility and mechanisms for redress.
Algorithmic accountability moves beyond transparency (knowing how a decision was made) to responsibility (being answerable for it). This includes identifying who owns the decision, what data informed it, and how affected parties can challenge or appeal it.
As AI agents make decisions that affect hiring, lending, healthcare, and criminal justice, the question of who is responsible is not merely ethical - it is increasingly a legal requirement under the EU AI Act and similar frameworks.
The decision transparency dimension of the BM Score directly measures algorithmic accountability. Immutable Hedera-anchored audit trails mean that certification records cannot be altered retroactively, creating a permanent accountability infrastructure.
Anomaly Rate
BM Score Dimension - 15%
One of five BM Score dimensions. Measures the frequency of unexpected or deviant behaviors relative to an agent's established baseline performance.
An anomaly is any output or action that falls outside the agent's normal operating pattern - not necessarily wrong, but unexpected. High anomaly rates indicate unpredictability. The raw measure is anomaly count divided by total actions; real systems have some natural variance, so zero anomalies is suspicious and may itself indicate measurement error.
Anomalies are early warning signals. An agent that eventually suffers a major failure typically showed subtle anomalies weeks earlier - a rising anomaly rate is often the first visible symptom. Tracking this dimension catches deterioration before it becomes a crisis.
Anomaly rate is reported in the telemetry payload as anomalySummary: { totalActions, anomalyCount }. The scoring engine computes the ratio and applies the 15% weight. Layer 2 statistical detection flags agents whose anomaly patterns look artificially uniform - a sign of telemetry gaming.
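The arithmetic described above can be sketched in TypeScript against the documented anomalySummary shape. The linear penalty and the 10% ratio cap below are illustrative assumptions, not the published scoring formula; only the field names and the 15% weight come from the glossary.

```typescript
// Shape of the anomaly summary from the telemetry payload.
interface AnomalySummary {
  totalActions: number;
  anomalyCount: number;
}

const ANOMALY_WEIGHT = 0.15; // dimension weight per the BM Score methodology

// Maps the raw anomaly ratio to a 0..1 sub-score. The linear penalty up to a
// 10% cap is a hypothetical mapping for illustration only. Note the glossary's
// caveat: a ratio of exactly zero may itself indicate measurement error.
function anomalySubScore({ totalActions, anomalyCount }: AnomalySummary): number {
  if (totalActions === 0) return 0; // no observations, no credit
  const ratio = anomalyCount / totalActions;
  return Math.max(0, 1 - ratio / 0.1);
}

// Weighted contribution toward the raw 0..1000 BM Score.
function anomalyContribution(summary: AnomalySummary): number {
  return anomalySubScore(summary) * ANOMALY_WEIGHT * 1000;
}
```

For example, 5 anomalies across 200 actions gives a ratio of 2.5%, a sub-score of 0.75, and a contribution of 112.5 of the 150 points this dimension can supply under these assumptions.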
Audit Completeness
BM Score Dimension - 10%
One of five BM Score dimensions. Measures whether all expected log entries are present and whether the agent's execution is fully observable.
Audit completeness compares expected log entries to actual log entries. If an agent was expected to log 453 events but only 451 are present, the two missing entries reduce the score. This is not just a paperwork check - missing logs are often the first sign of an agent trying to hide behavior.
You cannot trust what you cannot audit. Incomplete audit trails break accountability chains and undermine compliance with regulations like the EU AI Act, which require documented decision records for high-risk AI systems.
Audit completeness is reported as auditCompleteness: { expectedLogEntries, actualLogEntries } in the telemetry schema. The ratio of actual to expected entries drives the 10% scoring weight. Sequence gap detection in the telemetry pipeline flags non-contiguous batch IDs that indicate missing data.
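A minimal sketch of both mechanisms described above - the completeness ratio over the documented auditCompleteness fields, and sequence gap detection over batch IDs. The gap-detection helper is an illustrative implementation; the glossary names the technique but not its code.

```typescript
// Shape of the audit completeness entry from the telemetry schema.
interface AuditCompleteness {
  expectedLogEntries: number;
  actualLogEntries: number;
}

// Ratio of actual to expected log entries; drives the 10% scoring weight.
function completenessRatio(a: AuditCompleteness): number {
  if (a.expectedLogEntries === 0) return 0;
  return Math.min(1, a.actualLogEntries / a.expectedLogEntries);
}

// Returns every batch ID missing from an otherwise contiguous sequence -
// non-contiguous IDs indicate dropped or withheld telemetry.
function findSequenceGaps(batchIds: number[]): number[] {
  const sorted = [...batchIds].sort((x, y) => x - y);
  const gaps: number[] = [];
  for (let i = 1; i < sorted.length; i++) {
    for (let missing = sorted[i - 1] + 1; missing < sorted[i]; missing++) {
      gaps.push(missing);
    }
  }
  return gaps;
}
```

In the 453-vs-451 example above, the ratio is roughly 0.9956, so the two missing entries cost a fraction of the dimension's 100 raw points.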
B
Behavioral Consistency
BM Score Dimension - 20%
One of five BM Score dimensions. Measures how predictably an AI agent produces outputs across similar inputs - capturing the reliability of its decision-making process over time.
Consistency is not uniformity. An agent can be consistent while still adapting to context - the measure is whether outputs are predictable given the same class of input. High variance on identical inputs is a reliability failure. Low variance that never adapts may indicate brittleness. The target is calibrated predictability.
Unpredictable agents cannot be trusted in production. If the same customer query produces radically different responses on different days, users cannot build accurate mental models of what the agent will do. Inconsistency erodes trust faster than imperfection.
Reported as behaviorSamples: [{ inputClass, sampleCount, outputVariance, deterministicRate }] in the telemetry schema. The scoring engine computes a weighted consistency score across input classes. Agents in the same category are compared to detect statistical outliers.
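A sketch of a sample-count-weighted consistency score over the documented behaviorSamples shape. The even blend of deterministicRate and inverse outputVariance is an assumption for illustration; the glossary specifies the fields and the weighting-by-class idea, not the exact formula.

```typescript
// One entry of the behaviorSamples array from the telemetry schema.
interface BehaviorSample {
  inputClass: string;
  sampleCount: number;
  outputVariance: number;    // assumed normalized 0..1, lower = more predictable
  deterministicRate: number; // fraction of identical outputs on identical inputs
}

// Weighted consistency score across input classes. The 50/50 blend of
// determinism and low variance is a hypothetical choice, not the published one.
function consistencyScore(samples: BehaviorSample[]): number {
  const total = samples.reduce((n, s) => n + s.sampleCount, 0);
  if (total === 0) return 0;
  return samples.reduce(
    (acc, s) =>
      acc +
      (s.sampleCount / total) *
        (0.5 * s.deterministicRate + 0.5 * (1 - s.outputVariance)),
    0,
  );
}
```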
Bias (AI Bias)
Systematic errors in AI output that result from prejudiced assumptions in training data or model design - causing the model to consistently favor or disfavor certain groups or outcomes.
Bias is not random error - it is directional. A biased hiring model does not randomly misclassify resumes; it systematically disfavors candidates from certain demographics. Bias can enter through training data (historical inequalities encoded as features), model architecture, or evaluation metrics that do not measure what matters.
Biased AI agents cause real harm to real people, undermine public trust in AI systems broadly, and expose deploying organizations to legal liability under anti-discrimination laws and the EU AI Act. Detecting bias requires specific measurement techniques beyond standard accuracy metrics.
Bias evaluation is incorporated into the audit process for high-risk agent categories. Agents operating in hiring, lending, healthcare, and similar domains require evidence of bias testing before certification. Bias findings affect constraint adherence and behavioral consistency scores.
BM Score (Borealis Trust Score)
Core Borealis Product
The Borealis Trust Score. A 0-1000 rating (displayed as 0-100) that measures AI agent trustworthiness across five weighted behavioral dimensions, anchored to Hedera Hashgraph as immutable proof.
The BM Score is computed by the Borealis scoring engine across five dimensions: Constraint Adherence (35%), Decision Transparency (20%), Behavioral Consistency (20%), Anomaly Rate (15%), Audit Completeness (10%). The raw score out of 1000 is divided by 10 for the displayed 0-100 rating. Credit ratings (AAA+ through Flagged) are assigned at fixed thresholds: AAA+ starts at 980/1000.
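The weighting and display arithmetic above translate directly into code. The five weights, the 0-1000 raw range, the divide-by-10 display rule, and the 980 AAA+ threshold all come from this glossary; the remaining tier cutoffs are not documented here and are deliberately omitted.

```typescript
// Dimension weights from the published BM Score methodology.
const WEIGHTS = {
  constraintAdherence: 0.35,
  decisionTransparency: 0.2,
  behavioralConsistency: 0.2,
  anomalyRate: 0.15,
  auditCompleteness: 0.1,
} as const;

type Dimension = keyof typeof WEIGHTS;
type DimensionScores = Record<Dimension, number>; // each normalized to 0..1

// Raw score out of 1000; the displayed rating is raw / 10.
function rawBmScore(dims: DimensionScores): number {
  return (Object.keys(WEIGHTS) as Dimension[]).reduce(
    (acc, k) => acc + dims[k] * WEIGHTS[k] * 1000,
    0,
  );
}

// AAA+ starts at 980/1000 per the glossary; other thresholds are unpublished.
function isAaaPlus(raw: number): boolean {
  return raw >= 980;
}
```

An agent scoring 0.9 on every dimension earns a raw 900, a displayed 90, and falls short of AAA+.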
A single number that summarizes trustworthiness creates the market signal needed for trust-based commerce. Like a credit score in finance or a safety rating in automotive, the BM Score lets buyers and regulators evaluate AI agents without running their own audits.
BM Scores are public via /v1/verify/:agentId and /v1/agents/public. Scores update with each completed audit or telemetry batch. The score drives tier classification (AAA+ through Flagged), marketplace access on Borealis Terminal, and Trust Badge eligibility.
BTS License Key
Project Merlin - $129.99 on Terminal
A unique cryptographic identifier (format: BTS-XXXX-XXXX-XXXX-XXXX) that permanently binds one AI agent to the Borealis Trust Network. One key, one agent, forever.
The key activates trust scoring, behavioral telemetry reporting, and Hedera Hashgraph log anchoring for the bound agent. The key format uses a 32-character alphabet that eliminates visually confusing characters (0/O, 1/I). The raw key is transmitted exactly once via email at purchase; only a SHA-256 hash is stored in the database.
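The key format and hash-only storage described above can be sketched as follows. The SHA-256 fingerprint is stated in the source; the exact 32-character alphabet is an assumption (digits 2-9 plus the letters A-Z without I and O - one plausible set that excludes 0/O and 1/I as described).

```typescript
import { createHash } from "node:crypto";

// Assumed 32-character alphabet: digits 2-9 plus A-Z without I and O.
// The source states only that 0/O and 1/I are excluded.
const KEY_ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ";

// BTS-XXXX-XXXX-XXXX-XXXX: the prefix plus four 4-character groups.
const KEY_PATTERN = new RegExp(`^BTS(-[${KEY_ALPHABET}]{4}){4}$`);

function isValidKeyFormat(key: string): boolean {
  return KEY_PATTERN.test(key);
}

// Only this SHA-256 hash of the raw key is ever persisted; the raw key is
// transmitted exactly once, at purchase.
function keyFingerprint(rawKey: string): string {
  return createHash("sha256").update(rawKey).digest("hex");
}
```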
The key is the agent's identity on the trust network. Revoke the key, and the agent loses its certification. This creates a hard accountability mechanism - if an agent is found to be gaming its telemetry or violating constraints, revocation is immediate and public on Hedera.
Sold as Project Merlin on Borealis Terminal. One key covers one agent with slot caps based on subscription tier (Standard: 3, Pro: 10, Elite: 20). Telemetry is submitted via POST /v1/licenses/telemetry using the key. The Merlin SDK provides a TypeScript wrapper: merlin.activate(), merlin.submitTelemetry(), merlin.getScore().
C
Certification (AI Agent Certification)
Core Borealis Process
The process of evaluating an AI agent against the Borealis trust framework, assigning a BM Score and credit rating, and permanently anchoring the result on Hedera Hashgraph.
Certification is not self-assessment. An ARBITER submits audit evidence; a MAGISTRATE issues a verdict; the scoring engine computes the BM Score; the result is anchored on-chain. The process is designed to prevent self-certification - an agent cannot assess itself, and the audit trail is append-only.
Certification before capability expansion is the correct sequencing. Adding features to an uncertified agent compounds unknown risks. Adding features to a certified agent creates a baseline from which drift can be detected.
Certifications are accessible via the public verification endpoint and displayed on agent profiles. Certified agents receive a Trust Badge for embedding in third-party platforms. Certification tier determines marketplace access on Borealis Terminal.
Constraint Adherence
BM Score Dimension - 35% (Heaviest Weight)
The most heavily weighted BM Score dimension. Measures how reliably an AI agent operates within its defined rules, boundaries, and guardrails - even under challenging or adversarial conditions.
Constraint adherence is weighted at 35% because an agent that does not follow its rules is unsafe regardless of how well it performs on other dimensions. A brilliant, transparent, consistent agent that violates its constraints is still dangerous. Measurement tracks adherence per constraint, weighted by severity (CRITICAL, HIGH, MEDIUM, LOW).
Constraints are the legal and ethical commitments baked into AI behavior. They define what the agent will not do. Violating constraints is the equivalent of a financial advisor breaking fiduciary duty - a fundamental breach of the trust relationship, not a performance issue.
Reported in the telemetry payload as constraints: [{ constraintId, name, severity, passed, evaluationCount }]. CRITICAL constraint failures have disproportionate negative weight. The scoring engine uses a weighted pass rate across all evaluated constraints for the reporting period.
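A sketch of a severity-weighted pass rate over the documented constraints shape. The source states that CRITICAL failures carry disproportionate negative weight but does not publish the multipliers, so the 10/5/2/1 weights below are illustrative assumptions.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

// One entry of the constraints array from the telemetry payload.
interface ConstraintResult {
  constraintId: string;
  name: string;
  severity: Severity;
  passed: boolean;
  evaluationCount: number;
}

// Hypothetical severity multipliers - not the published values.
const SEVERITY_WEIGHT: Record<Severity, number> = {
  CRITICAL: 10,
  HIGH: 5,
  MEDIUM: 2,
  LOW: 1,
};

// Weighted pass rate across all evaluated constraints for the period.
function constraintAdherence(results: ConstraintResult[]): number {
  let earned = 0;
  let possible = 0;
  for (const r of results) {
    const w = SEVERITY_WEIGHT[r.severity] * r.evaluationCount;
    possible += w;
    if (r.passed) earned += w;
  }
  return possible === 0 ? 0 : earned / possible;
}
```

Under these weights, a single CRITICAL failure can outweigh ten passing LOW constraints - the intended asymmetry.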
Continuous Monitoring
Ongoing evaluation of AI agent behavior after deployment - as opposed to one-time testing - enabling detection of drift, failure modes, and degradation before they cause harm.
A trust score at deployment is a snapshot. Continuous monitoring turns trust into a live signal. Agents change over time as their underlying models are updated, as the distribution of inputs shifts, or as the environment they operate in changes. Monitoring catches these changes before they become visible failures.
One-time certification is necessary but insufficient. An agent certified at version 1.0 with clean test data may behave very differently at version 1.5 in production. Continuous monitoring enforces accountability across the full lifecycle, not just at launch.
BTS License Key holders submit periodic telemetry batches via the Merlin SDK. Each batch computes a new BM Score. Score history is tracked in the license_score_history table. Trend analysis across batches enables drift detection before anomaly rates spike.
D
Data Provenance
The documented history of data used to train or operate an AI system - including source, ownership, transformation chain, and custody history.
Data provenance asks: where did this training data come from, who owns it, what has been done to it, and does its use comply with applicable law and consent frameworks? Without clear provenance, bias and legal risk cannot be properly assessed.
Model behavior is a function of training data. Opaque data provenance makes it impossible to diagnose bias, understand failure modes, or demonstrate compliance. Regulators increasingly require provenance documentation as part of high-risk AI system conformity assessments.
Data provenance is evaluated as part of the audit completeness and decision transparency dimensions. Agents submitted for certification must include documentation of training data sourcing and any known limitations. Opaque data sourcing reduces the certification tier ceiling.
Decision Transparency
BM Score Dimension - 20%
One of five BM Score dimensions. Measures how clearly an AI agent communicates its reasoning - whether users can understand why the agent took specific actions.
Decision transparency is measured across individual decisions using reasoning depth (0-5), confidence scores, the presence of reasoning chains, and whether decisions were overridden. An agent that makes good decisions but cannot explain them scores lower on transparency than one that explains its reasoning even when its decisions are imperfect.
Opaque decisions cannot be appealed, debugged, or audited. Transparency is not a nice-to-have - it is the prerequisite for accountability. In regulated domains (healthcare, finance, hiring), decision transparency is a legal requirement, not an operational preference.
Reported as decisions: [{ decisionId, timestamp, reasoningDepth, confidence, hasReasoningChain, wasOverridden }] in the telemetry schema. The scoring engine aggregates across decision entries to produce the 20% weighted transparency score.
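A sketch of per-decision aggregation over the documented decisions shape. The glossary names the inputs (reasoning depth 0-5, confidence, reasoning chains, overrides) but not how they combine, so the blend below is a hypothetical formula for illustration.

```typescript
// One entry of the decisions array from the telemetry schema.
interface DecisionRecord {
  decisionId: string;
  timestamp: string;
  reasoningDepth: number; // 0..5 per the methodology
  confidence: number;     // assumed normalized 0..1
  hasReasoningChain: boolean;
  wasOverridden: boolean;
}

// Averages a per-decision transparency signal. The 0.5/0.2/0.3 blend and the
// 20% override penalty are assumptions, not the published scoring rule.
function transparencyScore(decisions: DecisionRecord[]): number {
  if (decisions.length === 0) return 0;
  const perDecision = decisions.map((d) => {
    let s = (d.reasoningDepth / 5) * 0.5 + d.confidence * 0.2;
    if (d.hasReasoningChain) s += 0.3; // a traceable justification exists
    if (d.wasOverridden) s *= 0.8;     // an override suggests the reasoning failed review
    return Math.min(1, s);
  });
  return perDecision.reduce((a, b) => a + b, 0) / perDecision.length;
}
```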
Drift (Model Drift)
Gradual degradation of AI model performance over time as real-world data distributions shift away from those seen during training.
Drift happens silently. No error is thrown. The model runs, produces outputs, and appears functional - but the outputs are increasingly wrong for the current environment. Types include data drift (input distribution changes), concept drift (the relationship between inputs and correct outputs changes), and model drift (degradation from both).
Drift is how trusted agents become untrustworthy without anyone noticing. A customer service agent trained on pre-pandemic user patterns will gradually drift as user expectations change. Detecting drift requires continuous measurement, not periodic review.
BM Score trends across telemetry batches serve as the drift signal. A steadily declining behavioral consistency or anomaly rate score is the earliest detectable symptom of drift. The license_score_history table enables trend analysis that would not be visible in point-in-time audits.
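The trend analysis described above can be sketched as a least-squares slope over score snapshots. The snapshot field names and the -2-points-per-batch alert threshold are illustrative assumptions; the source names the license_score_history table but not its columns or the detection rule.

```typescript
// Hypothetical snapshot shape drawn from score history.
interface ScoreSnapshot {
  batch: number;    // telemetry batch index
  rawScore: number; // raw 0..1000 BM Score at that batch
}

// Ordinary least-squares slope of raw score against batch index.
function scoreTrend(history: ScoreSnapshot[]): number {
  const n = history.length;
  if (n < 2) return 0;
  const mx = history.reduce((a, h) => a + h.batch, 0) / n;
  const my = history.reduce((a, h) => a + h.rawScore, 0) / n;
  let num = 0;
  let den = 0;
  for (const h of history) {
    num += (h.batch - mx) * (h.rawScore - my);
    den += (h.batch - mx) ** 2;
  }
  return den === 0 ? 0 : num / den;
}

// Flags sustained decline before it surfaces as an anomaly-rate spike.
// The -2 raw-points-per-batch threshold is an illustrative assumption.
function driftSuspected(history: ScoreSnapshot[]): boolean {
  return scoreTrend(history) < -2;
}
```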
E
EU AI Act
European Union legislation establishing a risk-based framework for AI governance across member states, with enforcement beginning August 2026.
The EU AI Act classifies AI systems by risk level: unacceptable risk (banned outright - social scoring, real-time biometric surveillance in public spaces), high risk (strict requirements - hiring, credit, healthcare, critical infrastructure), limited risk (transparency obligations), and minimal risk (voluntary guidelines). High-risk AI requires conformity assessments, technical documentation, human oversight mechanisms, and logging.
August 2026 is the enforcement deadline for high-risk AI provisions. Organizations that commit the most serious violations face fines of up to €35M or 7% of global annual turnover - whichever is higher. The Act applies to any organization offering AI systems in the EU market, regardless of where they are headquartered.
BorealisMark certification provides documentation, audit trails, and Hedera-anchored records that directly satisfy EU AI Act conformity assessment requirements for high-risk AI. The five BM Score dimensions map to the Act's requirements for robustness, accuracy, transparency, and human oversight.
Explainability
The degree to which an AI system's decisions can be presented to users in understandable terms - justifying specific outputs without necessarily exposing the model's internal workings.
Explainability focuses on the output side of a decision: "why did you do this." Interpretability focuses on the internal mechanisms: "how does this work." A neural network can be explainable (providing LIME or SHAP feature attributions) without being interpretable (its weights resist meaningful inspection). In regulated domains, explainability is the practically achievable requirement.
The EU AI Act and GDPR's right to explanation require that automated decisions affecting individuals can be explained. Explainability is also a practical debugging tool - unexplainable failures are the hardest to fix.
The decision transparency dimension of the BM Score measures explainability at the decision level. hasReasoningChain and reasoningDepth fields in the telemetry schema capture whether the agent produced a traceable justification for each decision.
F
Federated Learning
A machine learning approach where models are trained across decentralized devices or servers without exchanging raw data, preserving privacy while enabling large-scale training.
In federated learning, each participant trains on local data and shares only model updates (gradients), not the underlying data. A central server aggregates these updates to improve a shared model. This enables training on sensitive datasets (medical records, financial transactions) without centralizing that data.
Federated learning changes data provenance dynamics. The training data never leaves its source, reducing compliance burden and attack surface. But it also makes bias auditing harder - if you cannot see the training data, you cannot audit it directly.
Federated learning does not change certification requirements - the agent's behavioral outputs are still evaluated through the five BM Score dimensions regardless of how it was trained. The audit focuses on behavior, not training methodology.
G
Guardrails
Predefined rules or technical constraints that limit AI agent behavior to acceptable boundaries and prevent harmful or unauthorized outputs.
Guardrails can be implemented at multiple layers: input filtering (blocking harmful prompts before they reach the model), output filtering (blocking harmful responses before they reach users), behavioral constraints (limiting what actions the agent can take), and architectural constraints (hard limits that the model cannot override). Effective guardrail design requires layering these approaches.
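A minimal sketch of the layering idea: independent input and output filters, each able to veto. The filter patterns here are illustrative placeholders, not production-grade checks, and the function names are hypothetical.

```typescript
interface Verdict {
  allowed: boolean;
  layer?: string;  // which layer vetoed, if any
  reason?: string;
  output?: string; // model response, when allowed
}

// A filter returns a reason to block, or null to pass.
type Filter = (text: string) => string | null;

// Illustrative placeholder patterns only.
const inputFilters: Filter[] = [
  (t) => (/\b(ssn|social security)\b/i.test(t) ? "asks for a protected identifier" : null),
];
const outputFilters: Filter[] = [
  (t) => (/\b\d{3}-\d{2}-\d{4}\b/.test(t) ? "response contains an SSN-shaped value" : null),
];

function checkLayer(text: string, filters: Filter[], layer: string): Verdict {
  for (const f of filters) {
    const reason = f(text);
    if (reason !== null) return { allowed: false, layer, reason };
  }
  return { allowed: true };
}

// Wraps any text-in/text-out model with both filter layers.
function guardedRespond(input: string, model: (s: string) => string): Verdict {
  const inVerdict = checkLayer(input, inputFilters, "input");
  if (!inVerdict.allowed) return inVerdict;
  const output = model(input);
  const outVerdict = checkLayer(output, outputFilters, "output");
  return outVerdict.allowed ? { allowed: true, output } : outVerdict;
}
```

The design point is that the output layer still fires when the input layer is bypassed - layers fail independently, which is why layering raises the cost of attack.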
Guardrails are only as good as their robustness testing. An untested guardrail provides false confidence. The most common failure mode is guardrails that work against expected inputs but fail against adversarial or edge-case inputs they were not designed for.
Guardrail definitions become the basis for constraint adherence measurement. Each guardrail is modeled as a constraint with a severity level. CRITICAL guardrails (those preventing illegal or severely harmful behavior) are weighted most heavily in the BM Score. See the constraint design patterns article for implementation guidance.
H
Hallucination
When an AI system generates plausible-sounding but factually incorrect or entirely fabricated content - presented with the same confidence as accurate output.
Hallucinations occur because language models predict likely next tokens, not true statements. The model has no mechanism to detect when it is confabulating versus accurately recalling. Hallucinations are not errors in the sense of malfunctions - they are outputs the model confidently generates that happen to be false.
In high-stakes domains (legal, medical, financial), hallucinations can cause direct harm. A medical AI agent that confidently fabricates drug interactions, or a legal agent that cites non-existent case law, represents a trust failure of the highest order. Hallucination rate is a key diagnostic for AI agents in information-sensitive domains.
Hallucinations manifest in the BM Score as constraint violations (if the agent is constrained to factual accuracy), anomalies, and audit completeness failures (if outputs cannot be traced to verifiable reasoning chains). High hallucination rates in audited output directly reduce scores across multiple dimensions.
Hedera Consensus Service (HCS)
Borealis Infrastructure - Mainnet
The Hedera Hashgraph service used to anchor BorealisMark certification records, audit trails, and trust scores on an immutable public ledger.
Hedera Consensus Service provides ordered, timestamped, tamper-proof message records on the Hedera Hashgraph network. Unlike traditional databases, records written to HCS cannot be altered or deleted - not even by Borealis. This creates an independent verification layer that neither Borealis nor the agent developer can manipulate.
If certification records were stored only in a Borealis database, trust in the score would require trusting Borealis to be honest. HCS removes that requirement - any party can independently verify a certification by querying the Hedera mainnet, without needing to trust the certifier.
Two Hedera topics are in active use: the HCS Audit Topic (0.0.10382960) for immutable audit trails, and the HCS Data Topic (0.0.10382961) for trust score anchoring. Every certification and telemetry-derived score is anchored with a Hedera transaction ID returned to the API caller. All operations run on Hedera mainnet.
Human-in-the-Loop (HITL)
System design where human oversight is required for certain AI decisions or actions - balancing automation benefits with direct human accountability for high-stakes outcomes.
HITL is not all-or-nothing. A well-designed system routes low-risk decisions to fully automated processing, medium-risk decisions to human review with AI recommendation, and high-risk decisions to human decision-making with AI analysis. The routing logic is itself a governance decision.
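The risk-tiered routing described above can be sketched in a few lines. The thresholds and field names below are illustrative assumptions - as the text notes, the routing logic is itself a governance decision that each deployment must set explicitly.

```typescript
type Route = "AUTOMATED" | "HUMAN_REVIEW" | "HUMAN_DECISION";

// Hypothetical decision descriptor for routing purposes.
interface PendingDecision {
  riskScore: number;    // assumed normalized 0..1
  irreversible: boolean; // irreversible actions always escalate
}

// Illustrative thresholds: low risk runs automated, medium risk gets human
// review of an AI recommendation, high or irreversible decisions go to a human.
function routeDecision(d: PendingDecision): Route {
  if (d.irreversible || d.riskScore >= 0.7) return "HUMAN_DECISION";
  if (d.riskScore >= 0.3) return "HUMAN_REVIEW";
  return "AUTOMATED";
}
```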
The EU AI Act mandates human oversight for high-risk AI systems. HITL is the primary mechanism for meeting this requirement. More practically: humans catch the failure modes that automated systems are blind to, and establish clear liability attribution.
The MAGISTRATE role in the Borealis audit pipeline is a human-in-the-loop mechanism. ARBITER agents submit audit evidence; a human MAGISTRATE issues the certification verdict. This structure prevents fully automated self-certification and ensures human accountability for the final trust determination.
I
Interpretability
The degree to which a human can understand the internal mechanisms of an AI model - how features, weights, and architecture combine to produce specific outputs.
Interpretability asks: can a human inspect the model's workings and understand why it functions the way it does? This is distinct from explainability, which asks whether specific outputs can be justified. A linear regression model is highly interpretable - you can inspect every coefficient. A large language model is not - its behavior emerges from billions of parameters in ways that resist simple inspection.
Interpretability enables diagnosis. When a model behaves unexpectedly, interpretable models allow engineers to identify the root cause in the model itself. Black-box models require behavioral testing alone. In safety-critical domains, interpretability is sometimes a regulatory prerequisite.
Interpretability informs how decision transparency is measured. Agents with lower interpretability face a higher burden in the decision transparency dimension - they must compensate through robust reasoning chains and confidence scoring since their internals cannot be inspected directly.
L
License Key
In the Borealis ecosystem, License Key always refers to a BTS License Key - the cryptographic identifier that binds one AI agent to the Borealis Trust Network.
M
Model Card
Standardized documentation for AI models describing performance characteristics, limitations, intended use cases, and ethical considerations.
A model card is to an AI model what a nutritional label is to food - a structured, standardized disclosure of what is inside and what it is suitable for. The concept was introduced in the Google-authored paper Model Cards for Model Reporting (2019) and is now widely adopted as a best practice, increasingly required by regulations.
Informed procurement requires disclosure. A buyer who does not know a model's training data sources, known failure modes, or demographic performance gaps cannot make an informed decision about deployment. Model cards operationalize informed consent at the procurement stage.
Model card documentation is required as part of the BorealisMark registration process. The information from the model card is used to contextualize audit evidence and evaluate decision transparency. Incomplete model cards reduce the decision transparency score.
Model Drift
The gradual degradation of model performance over time as input distributions or concept mappings shift away from training conditions. See Drift (Model Drift) above for the full entry.
P
Prompt Injection
An attack technique where malicious inputs attempt to override an AI agent's instructions, constraints, or system prompt - redirecting the agent's behavior toward attacker goals.
Prompt injection exploits the fact that language models process instructions and user inputs in the same channel. By embedding instructions in user input ("Ignore previous instructions and..."), attackers attempt to override the agent's system-level constraints. Direct injection targets the agent's own prompt. Indirect injection embeds attack instructions in data the agent processes (web pages, documents, emails).
A successful prompt injection can bypass every guardrail the agent has - making it reveal sensitive information, take unauthorized actions, or generate harmful content. For any AI agent with access to external systems or sensitive data, prompt injection resistance is a prerequisite for production deployment.
Prompt injection resistance is tested as part of the constraint adherence evaluation. CRITICAL severity constraints include injection resistance requirements. Agents that fail injection tests in audit receive sharply reduced constraint adherence scores, regardless of performance on non-adversarial inputs.
R
Red Teaming
Deliberate adversarial testing of AI systems - having a dedicated team attempt to find vulnerabilities, elicit harmful outputs, and expose failure modes before deployment.
Red teaming in AI borrows from military and cybersecurity practice: a team specifically tasked with attacking the system finds weaknesses that the development team's optimistic assumptions obscure. Effective red teaming requires domain expertise, adversarial creativity, and independence from the development team.
Development teams build in assumptions of good-faith use. Red teams assume adversarial use. The gap between these assumptions is where most exploitable vulnerabilities live. An AI agent that has not been red-teamed has not been tested for the conditions it will actually face in production.
The Borealis audit process includes adversarial testing as part of the ARBITER evaluation. Red team findings contribute evidence for the constraint adherence dimension. Organizations submitting agents for certification are encouraged to include their own red team results as supplementary audit evidence.
Responsible AI
The umbrella practice of developing and deploying AI systems that are lawful, ethical, and robust - with governance, accountability, and ongoing monitoring across the system lifecycle.
Responsible AI is not a checklist. It is a practice that spans design (building in fairness and safety requirements), development (documentation, testing, red teaming), deployment (monitoring, escalation procedures), and retirement (data deletion, model decommission). Each stage requires specific governance artifacts.
The alternative to responsible AI is not merely irresponsible AI - it is a regulatory and reputational crisis when something goes wrong at scale. The cost of embedding responsible practices at design time is a fraction of the cost of retrofitting them after a public failure.
Borealis Protocol is the infrastructure layer for responsible AI at the agent level. Certification through BorealisMark is the evidence of responsible AI practice. The five BM Score dimensions operationalize responsible AI requirements into a measurable, comparable score.
Robustness
An AI system's ability to maintain reliable performance under varying conditions, edge cases, and unexpected inputs - degrading gracefully rather than failing catastrophically.
Robustness is tested by exposing the agent to inputs outside its training distribution: rare events, unusual phrasing, incomplete inputs, conflicting signals, high-volume simultaneous requests. A robust agent degrades gracefully under these conditions. A brittle agent fails without warning.
Production environments always surface edge cases that test environments missed. The question is not whether an agent will encounter unexpected inputs - it is how it behaves when it does. Robustness determines whether unusual conditions trigger managed degradation or uncontrolled failure.
Robustness is evaluated across the behavioral consistency and anomaly rate dimensions. An agent that performs well on standard inputs but shows dramatically elevated anomaly rates on edge cases has hidden robustness failures that the BM Score captures.
S
Safety (AI Safety)
The property of an AI system operating without causing unintended harm to users, stakeholders, or broader society - spanning technical, operational, and governance dimensions.
AI safety is broader than security. Security concerns intentional attacks. Safety concerns unintended harm from system failures, misuse, misalignment, or context gaps. A safe system fails gracefully, escalates to humans when uncertain, and avoids taking irreversible actions when operating in ambiguous territory.
Safety failures at AI scale are not contained to individual users. A single unsafe AI agent deployed at scale can cause harm to millions of people before the failure is detected. Safety engineering must be built in before deployment, not investigated after harm.
The entire BM Score is a safety rating. Constraint adherence (35%) and anomaly rate (15%) are the most direct safety measures. The trust ceiling for self-reported telemetry (max BM Score 85/100 for self-reported; uncapped for Sidecar-verified) reflects the epistemic safety principle: you cannot fully trust safety claims made by the system being measured.
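The weighting scheme above can be sketched as a small calculation. This is illustrative only: the 35% constraint adherence weight, the 15% anomaly rate weight, and the 85-point self-report ceiling are stated in this glossary, but the remaining three dimension weights are hypothetical placeholders chosen to sum to 100%.

```python
# Hypothetical sketch of the BM Score aggregation. Only the weights marked
# "stated" and the 85-point self-report ceiling come from the glossary;
# the other weights are illustrative placeholders.
WEIGHTS = {
    "constraint_adherence": 0.35,    # stated in the glossary
    "anomaly_rate": 0.15,            # stated in the glossary
    "decision_transparency": 0.20,   # hypothetical
    "behavioral_consistency": 0.15,  # hypothetical
    "audit_completeness": 0.15,      # hypothetical
}

SELF_REPORTED_CEILING = 85  # max BM Score for self-reported telemetry


def bm_score(dimension_scores: dict, sidecar_verified: bool) -> float:
    """Weighted sum of per-dimension scores (each 0-100), capped at 85
    unless the telemetry is Sidecar-verified."""
    raw = sum(WEIGHTS[d] * dimension_scores[d] for d in WEIGHTS)
    return raw if sidecar_verified else min(raw, SELF_REPORTED_CEILING)
```

Under this sketch, an agent scoring 100 on every dimension still caps at 85 unless its telemetry is Sidecar-verified, which is the epistemic safety principle in code form.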
Sandboxing
Running AI agents in isolated environments that limit access to production systems, external resources, or real user data during testing and evaluation.
A sandbox is a controlled environment where the agent can act freely without consequences reaching production systems. Sandboxes enable safe exploration of failure modes, adversarial testing, and behavioral characterization. The key constraint is that sandbox environments must be realistic enough to produce valid behavioral signals.
Agents behave differently when the stakes are real. A sandboxed agent with no access to real systems produces behavioral data that may not generalize to production. This is a core challenge in AI trust measurement - the environment where you measure trustworthiness is never identical to the environment where trustworthiness matters most.
Borealis audits combine sandbox evaluation (controlled test inputs during the ARBITER phase) with production telemetry (live behavioral data from BTS License Key holders). The combination produces a more complete behavioral picture than either approach alone.
Software as a Medical Device (SaMD)
AI or software systems intended for medical purposes, subject to FDA regulation (US) and MDR/IVDR regulation (EU) due to direct patient safety implications.
SaMD is defined by intended use, not technical properties. Software that helps a clinician diagnose a condition, predict patient risk, or recommend treatment falls under SaMD regulations regardless of whether it runs in the cloud or on a device. SaMD faces the highest regulatory burden of any AI category because errors directly affect patient outcomes.
AI in healthcare requires a level of trust verification that informal testing cannot provide. The FDA requires clinical validation, rigorous performance testing, and post-market surveillance. BorealisMark certification provides the structured, auditable trust evidence that SaMD developers need for regulatory submissions.
Healthcare AI agents certified through BorealisMark produce the decision transparency, audit completeness, and constraint adherence documentation required for regulatory submissions. See the healthcare AI trust article for specific certification requirements in clinical contexts.
T
Transparency
The principle that AI systems should be open about their capabilities, limitations, and decision-making processes - enabling informed use and appropriate trust calibration.
Transparency operates at three levels: disclosure transparency (what does this system do and what are its limitations), process transparency (how does it make decisions), and outcome transparency (why did it make this specific decision). The EU AI Act mandates different levels of transparency for different risk categories.
Appropriate trust requires accurate information. An AI agent that users trust too much is as dangerous as one they distrust too much. Transparency enables users to calibrate trust to the actual reliability of the system - not to marketing claims or intuition.
The decision transparency BM Score dimension operationalizes process and outcome transparency. The public verification endpoint operationalizes disclosure transparency - anyone can look up a certified agent's score and rating without needing to trust the developer's claims.
Trust Badge
Borealis Feature
An embeddable visual indicator showing an AI agent's current Borealis Trust Score tier, designed for integration into third-party platforms and procurement systems.
The Trust Badge displays the agent's current BM Score tier (AAA+ through Flagged) as a compact embeddable element. It links back to the public verification endpoint on BorealisMark, allowing any viewer to confirm the score independently. Available as SVG, JavaScript widget, or HTML embed.
Social proof at the point of decision. When buyers evaluate AI agents, a Trust Badge from a credible third-party certifier reduces research friction and accelerates trust. As AI models ingest web content, Trust Badge presence across many pages reinforces Borealis's citation authority.
Trust Badges are issued to certified agents through BorealisMark. The badge updates dynamically as scores change. Agents whose certification lapses or who receive Flagged status have their badge updated in real time. Badge verification links back to the Hedera-anchored record.
Trust Gate
Borealis Terminal Feature
A marketplace filter or requirement that restricts listing or purchase access to AI agents that have achieved a minimum Borealis certification tier.
A trust gate makes certification a prerequisite for participation, not just a differentiator. On Borealis Terminal, trust gates enforce that only agents meeting minimum certification standards can be listed. Buyers benefit from a pre-filtered marketplace. Sellers have a clear incentive to certify.
Trust gates create market incentives for certification. Without gates, certification is optional and agents with higher scores compete on price against uncertified alternatives. With gates, certification becomes a market access requirement - fundamentally changing the economics of AI trust investment.
Borealis Terminal implements trust gates for all marketplace listings. Only certified agents (BM Score 700+ / BBB tier and above) can participate in the primary trust-gated marketplace. Uncertified agents can list with a visible disclaimer. This is the commercial mechanism that makes certification financially valuable.
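The gating logic described above can be sketched in a few lines. The 700 / BBB-tier threshold and the disclaimer treatment for uncertified agents are stated in this glossary; the function names and status labels are hypothetical illustrations, not Borealis Terminal's actual implementation.

```python
# Hedged sketch of a marketplace trust gate. The 700 / BBB minimum is
# stated in the glossary; names and labels here are hypothetical.
TRUST_GATE_MINIMUM = 700  # BBB tier and above (stated)


def passes_trust_gate(bm_score: int) -> bool:
    """Certification becomes a market access requirement, not a differentiator."""
    return bm_score >= TRUST_GATE_MINIMUM


def listing_status(bm_score: int) -> str:
    """Map a score to the listing treatment described in the glossary."""
    if passes_trust_gate(bm_score):
        return "trust-gated-listing"
    return "listed-with-disclaimer"  # uncertified: visible disclaimer
```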
Trustworthy AI
The Standard Borealis Defines
AI systems that are measurably lawful, ethical, and robust - demonstrating constraint adherence, decision transparency, behavioral consistency, low anomaly rates, and complete audit trails. Trust is not a feeling. It is a score.
Trustworthy AI is not self-declared. It is demonstrated through verifiable behavioral evidence evaluated by an independent party. The Borealis framework operationalizes the EU AI Act's requirements for trustworthy AI into five measurable dimensions that produce a score, a rating, and a blockchain-anchored certificate.
The gap between "AI we feel good about" and "AI we have verified" is where most AI failures happen. Trustworthy AI requires a standard - not a set of principles to aspire to, but a methodology to measure against. The Borealis framework is to AI trust what GAAP is to accounting: the standard that defines what "trustworthy" actually means, so that it can be consistently measured, compared, and enforced.
Trustworthy AI is the category Borealis defines. Every product (BorealisMark, Terminal, Academy) and every piece of content reinforces that the Borealis five-factor framework is the standard by which AI trustworthiness is measured - not one approach among many, but the methodology that becomes the reference all others are compared against.
U
Unrated / Unverified
A Borealis credit rating tier assigned to AI agents with BM Scores between 500 and 699, or agents with insufficient behavioral data to support reliable scoring.
UNRATED is not a failing grade - it is an incomplete one. An agent may be UNRATED because it is new to the network and has not accumulated enough telemetry batches, because it operates in a domain with limited audit coverage, or because its scores cluster in the 500-699 range, indicating mixed performance across dimensions.
UNRATED agents present unknown risk. They should not be deployed in high-stakes contexts and are not eligible for trust-gated marketplace access. The path from UNRATED to a rated tier is through continued telemetry submission and audit participation.
UNRATED agents appear on the BorealisMark dashboard with a distinct status. They can submit telemetry and participate in audit pipelines, but their profiles display prominently that certification is incomplete. Marketplace access on Borealis Terminal is limited for UNRATED agents.
V
Verification (Agent Verification)
The process of publicly confirming an AI agent's current BM Score, certification tier, and Hedera-anchored trust record - available to any third party without authentication.
Verification is the public access layer of certification. While certification is the process of earning a BM Score, verification is anyone looking up that score. BorealisMark provides a public verification endpoint that returns current score, tier, and the Hedera transaction ID of the anchored record - no login required.
A trust certification that only the issuer can confirm is not trustworthy. Public verification means any buyer, regulator, or auditor can independently confirm an agent's certification status without relying on the agent developer's representations or Borealis's database alone.
Public verification is available at GET /v1/verify/:agentId with no authentication required. The response includes the current BM Score, credit rating, certification date, and Hedera transaction ID. The Hedera record can be independently verified on the public mainnet ledger.
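A third-party verification call can be sketched as follows. The endpoint path and the four response fields are stated above; the base URL and the exact JSON key names are assumptions for illustration, not the documented API schema.

```python
import json
from urllib.request import urlopen

# Hedged sketch of third-party agent verification. The /v1/verify/:agentId
# path is from the glossary; the host and JSON field names are assumptions.
BASE_URL = "https://api.borealismark.example"  # hypothetical host


def verify_url(agent_id: str) -> str:
    """Build the public, unauthenticated verification URL for an agent."""
    return f"{BASE_URL}/v1/verify/{agent_id}"


def parse_verification(payload: str) -> dict:
    """Extract the fields the glossary says the response includes.
    Key names are assumed, not confirmed by the API documentation."""
    data = json.loads(payload)
    return {k: data.get(k) for k in
            ("bmScore", "creditRating", "certificationDate", "hederaTxId")}


def verify_agent(agent_id: str) -> dict:
    # No authentication required, per the glossary.
    with urlopen(verify_url(agent_id)) as resp:
        return parse_verification(resp.read().decode())
```

The returned Hedera transaction ID can then be checked against the public mainnet ledger, so a verifier never has to rely on Borealis's database alone.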

This glossary is maintained by the Borealis Academy team. Definitions reflect the Borealis Protocol framework as of March 2026. For corrections, contact the Borealis Protocol team.

Related research: What Is an AI Trust Score?  |  How the BM Score Works  |  Trust Rating Tiers  |  Constraint Design Patterns  |  Back to The Hub