AI Trust Glossary  ·  Canonical Definition

Sandboxing

Running an AI agent in an isolated execution environment with restricted permissions, which limits the actions it can take and the data it can access.
Borealis Research Team  ·  Updated March 2026
Sandboxing enforces least privilege at the execution level. A sandboxed agent can access only the systems and data it has been explicitly granted; everything else is denied by default. Sandbox escape, in which the agent gains access to resources outside those grants, is a critical security failure.
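The explicit-grant rule described above can be sketched as a deny-by-default policy check. This is only an illustration: `SandboxPolicy`, `SandboxViolation`, the tool names, and the `/workspace` root are hypothetical, and an application-level check like this is not an isolation boundary by itself. A real sandbox would also rely on operating-system or container-level isolation.

```python
import posixpath
from dataclasses import dataclass


class SandboxViolation(PermissionError):
    """Raised when the agent requests something outside its grants."""


@dataclass(frozen=True)
class SandboxPolicy:
    # Hypothetical policy shape: only explicit grants are listed,
    # and anything not listed is denied.
    allowed_tools: frozenset = frozenset()
    readable_roots: tuple = ()

    def check_tool(self, tool: str) -> None:
        """Allow a tool call only if the tool was explicitly granted."""
        if tool not in self.allowed_tools:
            raise SandboxViolation(f"tool not granted: {tool}")

    def check_read(self, path: str) -> None:
        """Allow a read only if the path lies under a granted root."""
        if not path.startswith("/"):
            raise SandboxViolation(f"relative path rejected: {path}")
        # Collapse ".." segments so a path cannot climb out of a granted root.
        # This is a lexical check only; it does not resolve symlinks.
        normalized = posixpath.normpath(path)
        for root in self.readable_roots:
            prefix = posixpath.normpath(root).rstrip("/") + "/"
            if normalized + "/" == prefix or normalized.startswith(prefix):
                return
        raise SandboxViolation(f"path not granted: {path}")


policy = SandboxPolicy(
    allowed_tools=frozenset({"search"}),
    readable_roots=("/workspace",),
)
policy.check_tool("search")               # granted, returns silently
policy.check_read("/workspace/notes.txt")  # granted, returns silently
```

A request such as `policy.check_read("/workspace/../etc/passwd")` raises `SandboxViolation` in this sketch, because the normalized path falls outside every granted root.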
Unconstrained AI agents with broad system access pose catastrophic risk if they malfunction or are compromised. Sandboxing limits the blast radius: an agent that hallucinates or is prompt-injected inside a properly configured sandbox can damage only what the sandbox exposes, while one with unrestricted access can damage anything its host credentials reach.
Sandboxing is a recommended deployment practice that directly supports constraint adherence. Agents deployed with proper sandboxing have a structural enforcement layer reinforcing behavioral constraints. Audit evidence should include sandbox configuration to demonstrate infrastructure-level constraint enforcement.
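One way to include sandbox configuration in audit evidence is to store the configuration together with a digest of its canonical form, so a reviewer can confirm the recorded configuration has not changed. The sketch below is an assumption about how such a record might look. The function name `sandbox_audit_record` and the record fields are hypothetical and do not represent a BorealisMark format.

```python
import hashlib
import json


def sandbox_audit_record(config: dict) -> dict:
    """Bundle a sandbox configuration with a SHA-256 digest of its canonical JSON."""
    # Sorting keys and fixing separators makes the serialization deterministic,
    # so the same configuration always yields the same digest.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {"sandbox_config": config, "sha256": digest}


# Hypothetical configuration values, for illustration only.
record = sandbox_audit_record(
    {"network": "deny", "readable_roots": ["/workspace"], "allowed_tools": ["search"]}
)
```

Recomputing the digest from the stored configuration and comparing it with `record["sha256"]` gives a simple tamper check on the evidence.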