Shane Deconinck · Trusted AI Agents · Decentralized Trust

Untangling Autonomy and Risk for AI Agents

5 min read

Six dimensions of agent governance visualized in 3D

The potential of agentic AI is hard to ignore. But when it comes to making decisions about it, the questions start piling up. And the longer you think about it, the more there are.

  • How well does it actually perform?
  • What’s the worst that can happen when it doesn’t?
  • Does the business value outweigh the risk?
  • Where does the organization draw the line?
  • Does our infrastructure have anything in place to contain it?
  • How much autonomy should it get?

That’s a lot to hold in your head for one agent. Now govern that across an organization, across disciplines, across risk appetites.

Most of what’s being published on this comes from a specific angle: strategy papers championing the transformational potential (with a soft disclaimer that it needs to be governed), security reports warning about the risks (without a clear path to still getting things done), compliance frameworks listing requirements (without a clear way to enforce them). Each useful on its own, but hard to connect, easily outdated, and not enough when you need to make a call.

What Anthropic’s Numbers Show

Anthropic published Measuring AI Agent Autonomy in Practice earlier this month, analyzing millions of real agent interactions. One chart stuck with me. They scored agent tasks on two axes: autonomy (is the agent following explicit instructions or operating independently?) and risk (what happens if something goes wrong?).

Risk vs autonomy scatter plot showing the upper-right quadrant sparsely populated but not empty
Risk vs autonomy by task cluster. Source: Anthropic, Measuring AI agent autonomy in practice (Feb 2026).

Today, most agent actions are low-risk and reversible, with software engineering accounting for nearly 50% of all agentic tool calls. But the data also shows emerging usage in healthcare, finance, and cybersecurity. The upper-right quadrant, high autonomy combined with high risk, is “sparsely populated but not empty”: patient medical records, cryptocurrency trading, production deployments are showing up. Anthropic expects this frontier to expand as agents move into domains where the stakes are higher than fixing a bug.

Yet most governance conversations collapse all of this into a single question: “how risky is this agent?” That bundles together what the agent does, what happens when it fails, how much freedom it has, and whether you’ve built the infrastructure to contain it. Too many questions crammed into one.

I kept asking these questions, kept thinking them through, borrowing from other domains, and distilled what I found into a model. Then I built the PAC Agent Profiler to visualize how the pieces interact.

The Six Dimensions

Each one answers a question the others can’t.

Business value is where it starts. The excitement, the dream. This agent could save us millions, collapse a workflow from days to minutes, unlock something we couldn’t do before. Business value is why you’d accept any risk at all. Without it, there’s nothing to discuss.

Reliability is the reality check. Better models, better prompts, better evals, better guardrails. Most teams focus here, and it matters. But it’s only meaningful relative to what happens when the agent fails.

Blast radius is the worst-case impact of that failure. The profiler scores this on five levels: contained (errors caught before impact), recoverable (small group affected), exposed (public-facing, hard to recall), regulated (compliance or legal consequences), and irreversible (money, contracts, safety). This is fixed by the use case, not by engineering. You can’t engineer your way to a smaller blast radius: you can only choose which use cases to pursue.

Infrastructure is the guardrails you’ve actually built. Audit trails, identity verification, authorization frameworks, sandboxing, monitoring. This is where the model gets opinionated.

Governance thresholds represent where the organization draws its lines. Regulatory requirements, internal policies, risk appetite. An agent might be technically capable of full autonomy, but if the compliance team requires human approval for anything touching customer data, that’s the ceiling.

Autonomy is the output. Not an input you set, but a level the agent earns based on everything else.

The PAC Agent Profiler showing agent use cases plotted across six dimensions in a 3D visualization
The PAC Agent Profiler: six dimensions, visualized.

Infrastructure as a Gate

This is the dimension I care most about, and it’s where the model diverges from typical risk frameworks.

Most frameworks treat everything as a spectrum. Infrastructure doesn’t work that way. You either have audit trails or you don’t. You either verify agent identity or you don’t. You either enforce authorization scopes or you don’t.

Infrastructure is a gate, not a slider. No amount of reliability compensates for guardrails you haven’t built.

In the profiler, infrastructure is binary per autonomy level. Each level requires a specific set of capabilities. If you haven’t built them, that level is locked, regardless of how reliable the agent is. A brilliant agent without audit trails can’t be trusted with delegated authority, because when something goes wrong you have no way to understand what happened.

This makes the model actionable. Instead of “improve your governance posture,” it says specifically: “you need identity verification and authorization scopes before this agent can move from human-approval to oversight mode.”

The requirements are cumulative. Level 2 (Approve) needs basic logging and human confirmation flows. Level 3 (Oversight) adds structured audit trails and monitoring. Level 4 (Delegated) requires identity verification, scoped authorization, and sandboxing. Level 5 (Autonomous) demands all of the above plus anomaly detection and automated containment.
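The cumulative gate can be sketched in a few lines of code. This is an illustrative sketch of the idea, not the profiler’s actual implementation: the capability names and their grouping per level are my assumptions based on the description above.

```python
# Sketch of "infrastructure as a gate": each autonomy level requires
# every capability of the levels below it, plus its own.
# Capability names and groupings are illustrative assumptions.

LEVEL_REQUIREMENTS = {
    1: set(),                                           # Suggestion
    2: {"basic_logging", "human_confirmation"},         # Approve
    3: {"audit_trails", "monitoring"},                  # Oversight
    4: {"identity_verification", "scoped_authorization",
        "sandboxing"},                                  # Delegated
    5: {"anomaly_detection", "automated_containment"},  # Autonomous
}

def max_unlocked_level(built: set) -> int:
    """Highest autonomy level whose cumulative requirements are all built."""
    required = set()
    unlocked = 1
    for level in range(2, 6):
        required |= LEVEL_REQUIREMENTS[level]  # requirements accumulate
        if required <= built:                  # gate: all or nothing
            unlocked = level
        else:
            break                              # higher levels stay locked
    return unlocked
```

Note the binary check: a team with logging, confirmation flows, and audit trails but no monitoring is still capped at level 2, no matter how reliable the agent is.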

Anthropic’s research backs this up. They found that 80% of tool calls in the wild have at least one safeguard in place, and 73% involve human oversight of some form. The infrastructure exists or it doesn’t. People build it before granting autonomy, not after.

Autonomy Is Earned

The five autonomy levels map to decreasing human involvement:

  1. Suggestion: the agent recommends, the human decides and acts
  2. Approve: the agent prepares, the human reviews and confirms
  3. Oversight: the agent acts, the human monitors and can intervene
  4. Delegated: the agent acts within defined boundaries, the human reviews periodically
  5. Autonomous: the agent acts independently, the human is notified of exceptions

Autonomy is the dependent variable. You don’t start by deciding “this agent should be autonomous” and then figure out the requirements. You assess the other five dimensions, and the appropriate autonomy level falls out.

This also means autonomy can change over time. As you build infrastructure, improve reliability, or as the organization adjusts its thresholds, the same agent can earn higher autonomy. It’s a progression, not a one-time decision.

Try It

The PAC Agent Profiler lets you map these dimensions for a specific use case: see where the gaps are, understand what’s blocking higher autonomy, and get a concrete path forward.

The profiler is at v0.1. The foundations are there, but the details will evolve. It’s open source. If you use it and something doesn’t fit, I’d love to hear about it.


I built a framework for adopting agentic AI with trust at the center, with interactive explainers on the protocols behind it. I also run a live training programme on this at trustedagentic.ai.
