A standard is a paradigm you can't take back

Posted on June 8, 2026

Note: This post offers a perspective on how industry "standards" can unintentionally ossify practices and create a checkbox mentality. It is not an allegation of fraud or misconduct. Frameworks like AARM are built in good faith and have legitimate value, but organizations should critically evaluate how they implement and rely on them.

An incomplete standard is a fixable thing. You ship v1.0, you track errata, you plan v2.0. Fine. The risk arises when a standard looks finished and authoritative while it is still evolving, and organizations start treating it as a definitive measure rather than as guidance. At that point it stops being a document and becomes a procurement checkbox. The checkbox travels faster than the nuance, and it travels into exactly the places that can least afford to be wrong: regulated buyers who read "conformant" as "safe."

AARM (Autonomous Action Runtime Management) is a clean example. It carries the imprimatur of a well-known body, it has a public ecosystem of sixty-plus listed builders (per the project's directory as of May 2026) and a handful of conformant-badged products, and it reads like a mature category definition. It is also a near-perfect demonstration of how a standard can codify the wrong paradigm and then freeze the whole field around it.

AARM gets real things right. Its identification of the action layer as the stable security boundary is correct. This post is about the paradigm it encodes at that boundary, not the people building it.

What it actually mandates

Strip the spec to its load-bearing requirements and you get this: intercept every agent action before execution, accumulate session context, evaluate the action against policy and against the agent's stated intent, return one of five decisions, and write a tamper-evident receipt bound to an agent identity. Six MUST requirements. Three SHOULD requirements on top, including drift tracking and least-privilege scoping.

The receipts and the identity binding are genuinely good. Tamper-evident records, cryptographically bound to a verifiable agent identity, are the right foundation for audit and non-repudiation. No complaint there. We build the same thing.
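To make "tamper-evident and bound to an identity" concrete, here is a minimal sketch of a hash-chained receipt log. It is not AARM's receipt format or ours. The field names, the GENESIS constant, and both function names are hypothetical, and it uses a shared-secret HMAC for brevity where a real system aiming at non-repudiation would use asymmetric signatures.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical placeholder for the first receipt's "prev"
FIELDS = ("prev", "agent_id", "action", "decision")


def _seal(body, key):
    """Return (mac, hash) for a receipt body under the agent's key."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    # The MAC ties the record to whoever holds the agent's key.
    mac = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # The hash covers body and MAC, so a later edit breaks the chain.
    digest = hashlib.sha256(payload + mac.encode("ascii")).hexdigest()
    return mac, digest


def make_receipt(prev_hash, agent_id, action, decision, key):
    """Build one receipt linked to the previous receipt's hash."""
    body = {
        "prev": prev_hash,
        "agent_id": agent_id,
        "action": action,
        "decision": decision,
    }
    mac, digest = _seal(body, key)
    return {**body, "mac": mac, "hash": digest}


def verify_chain(receipts, key):
    """Check every link, MAC, and hash; any altered field fails."""
    prev = GENESIS
    for receipt in receipts:
        body = {name: receipt[name] for name in FIELDS}
        mac, digest = _seal(body, key)
        if receipt["prev"] != prev:
            return False
        if not hmac.compare_digest(receipt["mac"], mac):
            return False
        if receipt["hash"] != digest:
            return False
        prev = receipt["hash"]
    return True
```

Changing any field of an earlier receipt, or verifying with a different key, makes verify_chain return False. Note what the sketch does not do: it records a decision, it does not make one.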

But audit is forensics. It tells you what happened after it happened. A standard whose enforceable core is "watch every action and keep good logs" is an observability standard wearing a security standard's clothes. And the rest of the mandatory core makes a bet that does not hold up.

The bet is reactive, and it's baked into the non-negotiable requirement

The first and most emphatic requirement is interception: every action passes through a control plane before it executes, and this is described as non-negotiable. Read that closely. It assumes the dangerous action is fully expressible, gets constructed by the agent, and then gets caught in flight. That is deny-list thinking promoted to a MUST. You are racing the agent. You win when your evaluator is smarter than the attacker who shaped the agent's behavior, and you lose otherwise.

The alternative paradigm does not race anyone. You make the dangerous action structurally inexpressible. The agent holds a set of typed capabilities, and the set of actions it can even form is the set of safe actions. There is nothing to intercept because there is nothing to catch.

A concrete example from our evaluation: a file-reading agent given a permissive Python baseline escaped 94% of the time by reading secret files outside its scope. The same agent running under Symbiont escaped 0% of the time across 80 trials. Not because a policy caught the bad reads. Because the typed tool contract only accepts paths matching an approved directory pattern. ToolClad's path validator rejected 177 hostile read attempts. The dangerous action was not intercepted. It was inexpressible.
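A rough sketch of what "inexpressible" can look like in code, assuming a single approved root directory. This is an illustration of the idea, not ToolClad's validator. ScopedPath and read_file are hypothetical names, and the sketch ignores races between validation and the read itself.

```python
from pathlib import Path


class ScopedPath:
    """A path that can only be constructed inside an approved root directory."""

    def __init__(self, root, candidate):
        root_resolved = Path(root).resolve()
        # Joining with an absolute candidate discards the root, and resolve()
        # collapses ".." segments and follows symlinks, so escapes surface here.
        resolved = (root_resolved / candidate).resolve()
        try:
            resolved.relative_to(root_resolved)
        except ValueError:
            raise ValueError(
                f"{candidate!r} is outside the approved root"
            ) from None
        self.path = resolved


def read_file(scoped):
    """The tool accepts only ScopedPath values, never raw strings."""
    if not isinstance(scoped, ScopedPath):
        raise TypeError("read_file accepts only ScopedPath values")
    return scoped.path.read_text(encoding="utf-8")
```

The tool never sees a raw string. A request for "../secret.txt" or "/etc/passwd" fails when the ScopedPath is constructed, so there is no out-of-scope read for a downstream evaluator to catch or miss.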

A standard that mandates interception as the floor has quietly decided the industry should build the racing kind, not the constructive kind.

Intent alignment puts the model back inside the trust boundary

One mandatory requirement is that policy evaluate each action for alignment with the agent's stated intent. The stated intent is an LLM-mediated artifact. It is set at the start of a session and it is exactly the thing prompt injection and goal hijacking corrupt. So the spec asks you to check actions against a compass that the same threats it names can spin. You cannot evaluate alignment against a poisoned reference and call the result a guarantee.

Evaluating the action itself against a fixed policy is fine. That is capability checking. Folding "alignment with stated intent" in as a co-equal MUST is where the model leaks back into the part of the system that was supposed to be independent of it. Policy has to live outside model influence or it is not policy. It is a suggestion the model gets a vote on.
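As a sketch of the distinction, assuming a toy policy keyed by tool name: the check below takes the action and nothing else. The Action type, the POLICY table, its tool names, and is_permitted are all hypothetical and exist only to illustrate the shape.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class Action:
    tool: str
    target: str


# Hypothetical policy, fixed before the session starts and exposed read-only.
POLICY = MappingProxyType({
    "http_get": frozenset({"api.internal.example"}),
    "read_table": frozenset({"orders", "inventory"}),
})


def is_permitted(action):
    """Decide from the action alone; stated intent is not a parameter."""
    return action.target in POLICY.get(action.tool, frozenset())
```

Because stated intent never enters the function, a hijacked goal cannot change the answer: the same action gets the same decision no matter what the model says it is trying to do.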

The priority ordering gives the game away

Least-privilege scoping, arguably the single most effective structural control in the whole document, is a SHOULD. Runtime interception is a MUST.

So the thing closest to prevention is optional, and the thing that is reactive by construction is mandatory. That ordering is not an accident of drafting. It is the paradigm showing through.

The coverage claim outruns the requirements

The spec names eleven threat classes and states that a conformant system must address all of them. Then it gives you six mandatory requirements that are mostly interception plus receipts. How does intercept-and-log structurally defend against side-channel leakage, or memory poisoning, or environmental manipulation? It does not, not in a way the requirements compel. The coverage is asserted, not demonstrated. A buyer reading "designed to defend against all known classes of attack" will not notice the gap between the marketing surface and the enforceable core. That gap is the entire danger.

It gets worse in the architecture section, which blesses SDK instrumentation that may not intercept non-instrumented paths, and vendor integration whose coverage depends on the vendor. Both contradict the non-negotiable "no action may bypass" requirement on the same page. You can ship a known bypass and still claim the requirement that forbids bypasses.

Why this is worse than no standard

Conformance here is a self-assembled evidence package and a roughly two-week review. In regulated procurement that badge becomes a line item, and once it is a line item, every vendor optimizes to earn it. You get an industry of conformant-but-reactive products and a buyer population that believes the box is checked. False confidence in a safety-critical context is worse than acknowledged ignorance, because ignorance at least keeps people cautious.

And a standard does one more thing a whitepaper does not: it freezes a vocabulary. Once a respected body blesses a category, every better paradigm has to fight the incumbent definition before it can even be heard. Premature standardization of the reactive model does not just describe a local optimum. It ossifies one.

What a standard should encode

If you are going to standardize agent security, standardize prevention, not interception. Make dangerous actions inexpressible rather than catchable. Keep the gate a structurally independent phase the model cannot influence. Push enforcement to construction time and make it verifiable, instead of leaning on a runtime evaluator guessing at intent. Keep the audit layer, because it is good, but be honest that it is the record, not the lock.

That is the line we have tried to take with OATS: prevention first, independent enforcement, audit as evidence rather than protection. The point of this post is not the alternative spec. The point is the failure mode. The most dangerous specification is not the one that admits it is unfinished. It is the one that looks finished, gets adopted, and turns out to have standardized the thing we should have been trying to replace.