BOREALIS ACADEMY

The Five Trust Tiers: What Your BM Score Actually Means

The Problem With Numbers Alone

Your agent has a BM Score of 73. What does that mean?

Without context, it's just a number. Is 73 good? Is it acceptable for production? Should you invest in improving it, or move on to other priorities? A raw score leaves too much ambiguity, which is exactly why the Borealis Mark system uses trust tiers instead. Tiers translate abstract numbers into concrete categories that you—and your enterprise buyers—can understand immediately.

This is the same principle that credit agencies use with credit scores. A score of 650 says little on its own, but the label "Fair" tells you where it stands the moment you see it. A tier provides meaning.

The Five Tiers and Their Ranges

| Tier | Range | Status | Best For |
|------|-------|--------|----------|
| Platinum | 90-100 | Elite Trust | Enterprise deployments, sensitive operations |
| Gold | 75-89 | Strong Trust | Production agents, most commercial use cases |
| Silver | 60-74 | Moderate Trust | Early certification, internal testing |
| Bronze | 40-59 | Basic Trust | Development phase, requires remediation |
| Unverified | Below 40 | Insufficient Trust | Not recommended for production |
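The table above can be read as a simple lookup from score to tier. The following is a minimal sketch of that lookup, not the Borealis Mark implementation: the names `Tier`, `TIERS`, and `tier_for_score` are hypothetical, and the handling of fractional scores (for example, 74.5 staying in Silver because it is below the Gold floor of 75) is an assumption the table does not spell out.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    name: str
    floor: int      # lowest score that still qualifies for the tier
    status: str


# Tier floors copied from the table, ordered from highest to lowest.
TIERS = [
    Tier("Platinum", 90, "Elite Trust"),
    Tier("Gold", 75, "Strong Trust"),
    Tier("Silver", 60, "Moderate Trust"),
    Tier("Bronze", 40, "Basic Trust"),
    Tier("Unverified", 0, "Insufficient Trust"),
]


def tier_for_score(score: float) -> Tier:
    """Return the first tier whose floor the score reaches.

    Assumption: scores run from 0 to 100 and a fractional score belongs
    to the highest tier whose floor it meets.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for tier in TIERS:
        if score >= tier.floor:
            return tier
    raise AssertionError("unreachable: the last tier has a floor of 0")


print(tier_for_score(73).name)  # Silver
```

Keeping the floors in one ordered list means a boundary change only has to be made in one place.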

Platinum (90-100): Elite Trust

Platinum agents are exceptional. They demonstrate near-flawless constraint adherence, complete transparency in decision-making, and clean audit histories with virtually no anomalies. Think of this as an AAA credit rating—the highest confidence category.

Platinum agents command premium positioning in procurement. Enterprises often specify "Platinum tier only" when risk tolerance is lowest. Agents in this tier have proven they can operate at scale without deviation.

Gold (75-89): Strong Trust

Gold is the target tier for production-ready agents. Gold agents show reliable behavior, consistent transparency, and manageable anomaly rates. They meet enterprise expectations for stability and accountability.

Most well-built production agents operate in Gold. It's the standard that signals "ready for most enterprise deployments." Gold agents clear the majority of procurement requirements and maintain customer trust over time.

Silver (60-74): Moderate Trust

Silver agents are functional but have identifiable gaps. They may show partial constraint adherence, inconsistent logging, or periodic anomalies that don't rise to the level of critical failure. Silver is common for agents early in their certification journey.

Silver is acceptable for controlled internal testing, staging environments, or specialized use cases where risk is managed externally. It's not a long-term tier—it's a waypoint on the path to Gold.

Bronze (40-59): Basic Trust

Bronze agents meet minimum thresholds but demonstrate significant gaps across one or more dimensions. Anomaly rates are elevated, transparency is spotty, or behavioral consistency is unreliable. Bronze agents are not ready for customer-facing or production deployment.

Bronze is a red flag that remediation is needed. If your agent lands here, conduct a dimension-by-dimension audit to identify what's blocking advancement to Silver.

Unverified (Below 40): Insufficient Trust

Unverified means either insufficient data for scoring or significant concerns that prevent tier assignment. Agents below 40 should not be used in production until they have been explicitly remediated and retested.

What Moves You Between Tiers

Your BM Score is calculated from five dimensions, each with a specific weight:

  • Constraint Adherence (35%): Does your agent follow its defined operational boundaries? This is the most heavily weighted dimension and the foundation of trust.
  • Decision Transparency (28%): Can audit systems clearly see why the agent made each decision? Transparency builds accountability.
  • Behavioral Consistency (20%): Does the agent behave predictably across similar inputs? Consistency reduces surprise failures.
  • Anomaly Rate (15%): How frequently does the agent deviate from expected patterns? Lower anomaly rates signal stable operations.
  • Audit Completeness (18%): Is the agent providing complete logs and decision trails? Incomplete audits obscure problems.
To move from one tier to the next, focus on the dimensions where you're weakest. An agent at 72 (near the top of Silver) may be falling short on Constraint Adherence or Anomaly Rate. Audit those dimensions first, not all five.

Incremental improvements compound. Because Constraint Adherence carries the largest weight, a 5-point improvement there lifts the overall score by roughly 1.5 to 2 points on its own. The path to Gold is visible if you know which dimension to prioritize.
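To make the weighting concrete, here is a minimal sketch of a weighted score, assuming the BM Score is a plain weighted average of five dimension scores on a 0-100 scale where higher is always better (so Anomaly Rate is assumed to be already inverted into a "higher is better" score). The exact formula is not given here, so treat this as an illustration only. The weights as listed add up to 116 rather than 100, so the sketch normalizes them; that normalization is an assumption, and the names `LISTED_WEIGHTS`, `overall_score`, and `lift_from_improvement` are hypothetical.

```python
# Weights as listed in this article. They sum to 116, so they are
# normalized below (an assumption, not a documented Borealis Mark rule).
LISTED_WEIGHTS = {
    "constraint_adherence": 35,
    "decision_transparency": 28,
    "behavioral_consistency": 20,
    "anomaly_rate": 15,
    "audit_completeness": 18,
}


def normalized_weights(weights: dict[str, float]) -> dict[str, float]:
    """Scale the weights so that they sum to 1."""
    total = sum(weights.values())
    return {name: value / total for name, value in weights.items()}


def overall_score(dimension_scores: dict[str, float],
                  weights: dict[str, float] = LISTED_WEIGHTS) -> float:
    """Weighted average of the dimension scores (each assumed 0-100)."""
    shares = normalized_weights(weights)
    missing = set(shares) - set(dimension_scores)
    if missing:
        raise KeyError(f"missing dimension scores: {sorted(missing)}")
    return sum(shares[name] * dimension_scores[name] for name in shares)


def lift_from_improvement(dimension: str, points: float,
                          weights: dict[str, float] = LISTED_WEIGHTS) -> float:
    """Overall-score gain from raising one dimension by `points`."""
    return normalized_weights(weights)[dimension] * points


scores = {name: 70.0 for name in LISTED_WEIGHTS}
print(round(overall_score(scores), 2))
print(round(lift_from_improvement("constraint_adherence", 5), 2))
```

Under these assumptions, a 5-point gain in Constraint Adherence contributes about 1.5 points directly (1.75 if the 35% weight is taken at face value), and the same gain in any other dimension contributes less.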

Why Tiers Matter More Than Raw Scores

Enterprise procurement decisions are built on tiers, not raw scores. A contract clause that says "agents must score above 75" is ambiguous at the boundary and has to be rechecked whenever the score moves. A clause that says "Gold tier agents only" is clear, enforceable, and easy to automate.

When you're selling your agent to an enterprise buyer, they often specify tier requirements. "We procure Gold or Platinum only" is common. If your agent is Silver, you lose the deal. If it's Gold, you're in the conversation.
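A tier requirement such as "Gold or Platinum only" is easy to check mechanically because tiers are ordered. The sketch below shows one way a buyer might encode that rule. The `Tier` enum, `meets_requirement`, and `eligible_agents` are hypothetical names used for illustration; they are not part of any Borealis Mark API.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Tiers in ascending order of trust, so they compare naturally."""
    UNVERIFIED = 0
    BRONZE = 1
    SILVER = 2
    GOLD = 3
    PLATINUM = 4


def meets_requirement(agent_tier: Tier, minimum: Tier) -> bool:
    """True when the agent's tier is at or above the required minimum."""
    return agent_tier >= minimum


def eligible_agents(agents: dict[str, Tier], minimum: Tier) -> list[str]:
    """Names of agents that satisfy the tier requirement, sorted by name."""
    return sorted(name for name, tier in agents.items()
                  if meets_requirement(tier, minimum))


candidates = {
    "invoice-bot": Tier.SILVER,
    "support-agent": Tier.GOLD,
    "trading-agent": Tier.PLATINUM,
}
print(eligible_agents(candidates, Tier.GOLD))
# ['support-agent', 'trading-agent']
```

Comparing ordered tiers avoids the boundary question a raw-score clause raises, because the tier assignment has already settled which side of 75 the agent is on.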

Tiers also simplify internal decision-making. Instead of asking "Is 71 good enough for this use case?" you ask "Is Silver acceptable here?" The answer is often obvious.

The Badge System

Each tier has a visually distinct badge that you can embed in documentation, marketing materials, or agent profiles:

  • Platinum Badge: Solid platinum color with five-point star
  • Gold Badge: Solid gold color with four-point star
  • Silver Badge: Solid silver color with three-point star
  • Bronze Badge: Solid bronze color with two-point star
  • Unverified Badge: Neutral gray with question mark
Badges update automatically when your agent's tier changes. Whether you drop from Gold to Silver or climb back to Gold, your embedded badges reflect the new tier within 24 hours. No manual action is required.

Badge embeds are available in PNG, SVG, and responsive HTML formats. Use them liberally; they're designed for public display.

Common Questions

How often do scores and tiers update?

Scores recalculate daily based on audit data from the previous 24-48 hours. Tier assignments update when your score crosses a boundary. The Borealis Mark platform batches updates every 24 hours at midnight UTC to avoid constant fluctuations.
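If you want to know when a score change could next show up, the batch time can be worked out from the current time. This is a small sketch that assumes the batch runs exactly at 00:00 UTC, as described above; `next_batch_update` is a hypothetical helper, not a platform function.

```python
from datetime import datetime, timedelta, timezone


def next_batch_update(now: datetime) -> datetime:
    """Return the next midnight UTC strictly after `now`.

    `now` must be timezone-aware so that local times convert correctly.
    """
    if now.tzinfo is None:
        raise ValueError("now must be a timezone-aware datetime")
    now_utc = now.astimezone(timezone.utc)
    midnight_today = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight_today + timedelta(days=1)


example = datetime(2024, 3, 1, 15, 30, tzinfo=timezone.utc)
print(next_batch_update(example).isoformat())
# 2024-03-02T00:00:00+00:00
```

Converting to UTC before truncating matters for teams outside UTC: an evening local time may already be past midnight UTC, which pushes the next batch to the following day.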

Can scores and tiers go down?

Yes. If your Constraint Adherence drops due to recent audit findings, or if your Anomaly Rate increases, your score will decline and you may drop a tier. This is the corrective mechanism: tiers fall when trust declines, signaling that action is needed.

Most score declines are temporary. They resolve as you fix the underlying issue. Sustained decline across multiple weeks suggests a structural problem that requires investigation.

What if I disagree with my tier assignment?

Request a manual audit through the Borealis Mark dashboard. The manual audit process is transparent: auditors review the five dimensions against documented standards and provide detailed feedback. If calculation errors occurred, they're corrected. If the tier is accurate, you receive specific guidance on what to improve.

Manual audits take 5-7 business days. Use this process sparingly. It's for genuine disputes, not score optimization attempts.

What's the difference between a temporary anomaly and a pattern?

A single deviation doesn't change your tier. Anomaly Rate measures patterns. Three constraint violations in one week signal a pattern and will affect your Anomaly Rate dimension. One violation in a month is noise.

The Borealis Mark system uses statistical thresholds to distinguish signal from noise. You can view the anomaly details in your dashboard to understand what's being counted.
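The "three in one week" example can be approximated with a sliding-window count. The sketch below is a deliberate simplification: the actual statistical thresholds are not published here, so the fixed 7-day window, the threshold of 3, and the function name `is_pattern` are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def is_pattern(violation_times: list[datetime],
               window: timedelta = timedelta(days=7),
               threshold: int = 3) -> bool:
    """True if at least `threshold` violations fall within any span
    shorter than `window` (a simple sliding-window count)."""
    times = sorted(violation_times)
    start = 0
    for end, current in enumerate(times):
        # Drop violations that are a full window or more older than `current`.
        while current - times[start] >= window:
            start += 1
        if end - start + 1 >= threshold:
            return True
    return False


burst = [datetime(2024, 5, 1), datetime(2024, 5, 3), datetime(2024, 5, 5)]
isolated = [datetime(2024, 5, 1)]
print(is_pattern(burst))     # True
print(is_pattern(isolated))  # False
```

Running a check like this on your own logs before the daily recalculation can show whether recent violations are likely to be counted as a pattern or treated as noise.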

How do I improve from Bronze to Silver?

Focus on your lowest-scoring dimension first, giving priority to the heavily weighted ones. If Constraint Adherence is at 25%, improve that. Gains scale with a dimension's weight: lifting Constraint Adherence by 10 percentage points lifts your overall score more than the same gain in any other single dimension.

Conduct a structured audit using the Borealis Protocol Framework. Identify which constraints your agent is violating, then remediate them systematically. Retest with internal workloads before submitting for rescoring.
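One way to choose where to start is to rank dimensions by how much overall score they could still add, which is the weight multiplied by the remaining headroom. The sketch below assumes each dimension is scored 0-100 with higher being better, and uses the weights as listed in this article. The ranking is unaffected by whether those weights are normalized. The function name `remediation_priority` and the sample scores are hypothetical.

```python
# Weights as listed in this article (only their relative size matters here).
WEIGHTS = {
    "constraint_adherence": 35,
    "decision_transparency": 28,
    "behavioral_consistency": 20,
    "anomaly_rate": 15,
    "audit_completeness": 18,
}


def remediation_priority(dimension_scores: dict[str, float],
                         weights: dict[str, float] = WEIGHTS) -> list[str]:
    """Order dimensions by weight * headroom, largest potential gain first."""
    potential = {
        name: weights[name] * (100 - dimension_scores[name])
        for name in weights
    }
    return sorted(potential, key=potential.get, reverse=True)


# Made-up dimension scores for a Bronze-tier agent.
bronze_agent = {
    "constraint_adherence": 25,
    "decision_transparency": 60,
    "behavioral_consistency": 55,
    "anomaly_rate": 50,
    "audit_completeness": 45,
}
print(remediation_priority(bronze_agent)[0])  # constraint_adherence
```

In this example Constraint Adherence comes first because it is both the lowest score and the heaviest weight. When those two signals disagree, the product gives a reasonable tie-breaker.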

Your Next Move

Check your agent's current tier and dimensional scores at BorealisMark. If you're below Gold, download the dimension-specific remediation guide for your lowest-scoring category. If you're at Gold or above, ensure your badges are embedded and current in all customer-facing materials.

The path from Bronze to Platinum is clear. Tiers aren't destinations. They're waypoints on the journey to building trust systems that enterprises depend on.