Trusted Agentic AI Framework
AI that reasons, plans, and acts on your behalf is no longer futuristic. Agents inherit your organization's foundations: governance, security, identity, and delegation, work too often deferred when the stakes were lower. That changes when machines that can't be perfectly trusted start creating intent on your behalf. However bright the upside, navigating this without the right people at the table is playing with fire.
A lot of sharp thinking is happening in almost every vertical. But the challenge is a wicked one: it spans disciplines, it's overwhelming, and no single product or vendor can solve it alone. That's where this framework comes in: a practical path from hype and fear to progressive experimentation and sustainable deployment, built on three pillars: Potential, Accountability, and Control.
It sets the frame, the angles to look through so you can move forward without getting paralyzed. A reference point from which to venture boldly, and with which to challenge assumptions. This is a living framework: it grows as the field evolves and as I deepen the work.
Three Pillars
The most important thing to remember is to look at the challenge through three lenses. Different stakeholders will gravitate to different ones, but none works without the other two.
Potential
What value can agents unlock?
Agents can handle work that wasn't feasible before: too complex, too manual, too expensive to coordinate. Finding those opportunities means getting the people who understand the domain, the tech (and its flaws), and the data into the same room. The technical barrier is lower than ever. The hard part is knowing where to aim.
Who you need at the table: business strategy, technology leadership, product, data, domain expertise
Accountability
Who's responsible when the agent gets it wrong?
Agents make decisions on behalf of your organization. When something goes wrong, you need to explain what happened and why. That means governance structures, explainability, audit trails, and EU AI Act compliance need to be designed in from the start, not bolted on after launch.
Who you need at the table: legal, compliance, risk, audit, executive oversight
Control
How do we keep agents bounded?
Don't tell agents what not to do, make it technically impossible. Identity, authorization, delegation, permission boundaries: the infrastructure that ensures agents can only do what they're allowed to, and that every action is provable. Zero trust, applied to agents.
Who you need at the table: security, identity & access, platform engineering, architecture
They're Interdependent
Drop any one of the three and the others fall apart.
Potential without Accountability is reckless adoption. You build fast and hit a wall when the first incident happens and nobody can explain what went wrong.
Accountability without Control is governance on paper. Policies that say "agents must operate within scope" mean nothing if the infrastructure can't enforce it.
Control without Potential is infrastructure without a mandate. Agentic AI isn't a new piece of software you deploy once, it's a considerable rethink that needs ongoing funding. If the business doesn't see value, that funding stops.
Questions for Your Team
18 questions from Curated Questions for the Boardroom, grouped by pillar. Each one is a conversation starter, not a checklist item; the note under each explains why it matters.
Potential
Are your agents actually making decisions, or just automating steps humans already defined?
The value of agents is that they decide what to do given a goal. If your agents are running predefined workflows, you're getting automation, not agency. The upside comes from letting them reason, plan, and act.
What decisions are you not yet delegating to agents, and what's that costing you?
Every organization has processes where human bottlenecks slow things down. Some of those are genuinely high-stakes and need a human. Others are just habit. Knowing the difference is where the opportunity lives.
Will better models make your current setup more valuable, or obsolete?
Every workaround your team builds for a model's limitations becomes dead weight when that limitation disappears. The investments that compound are context (giving agents the right information) and permissions (governing what they can do with it). Ask your team how much of the current agent codebase they'd expect to throw away in a year.
Does the right context reach your agents at the right time?
Agent quality depends on having the right information for the task at hand. Not everything, not nothing: the relevant context, when it's needed. If your agents are underperforming, the model might not be the bottleneck. The context pipeline might be.
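To make the idea concrete, here is a minimal sketch of a task-scoped context pipeline: score candidate documents against the task and pass only the most relevant few to the agent. The scoring function, document names, and cutoff are illustrative assumptions; a real pipeline would use embeddings or a retriever.

```python
# Sketch: give the agent the relevant context, not everything.
def score(task: str, doc: str) -> int:
    """Toy relevance score: count of shared words (illustrative only)."""
    return len(set(task.lower().split()) & set(doc.lower().split()))

def select_context(task: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k most relevant documents instead of the whole corpus."""
    ranked = sorted(docs, key=lambda d: score(task, d), reverse=True)
    return [d for d in ranked[:k] if score(task, d) > 0]

docs = [
    "Refund policy: customers may return items within 30 days.",
    "Holiday schedule for the Oslo office.",
    "Refund requests over $500 need manager approval.",
]
context = select_context("handle this refund request", docs)
```

The point is the shape, not the scorer: the selection step sits between your data and the agent, and it is where quality is won or lost.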
Are you building on established and emerging standards, or on an island?
Protocols like MCP for tool integration and A2A for agent communication are maturing fast. Building proprietary alternatives might feel faster now, but risks leaving you incompatible with the ecosystem forming around you.
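For orientation, an MCP tool is described by a name, a human-readable description, and a JSON Schema for its input. The descriptor below follows that shape; the tool itself ("lookup_invoice") is a made-up example, not part of any real server.

```python
# Shape of an MCP tool descriptor (name, description, inputSchema).
# The tool "lookup_invoice" is a hypothetical example.
invoice_tool = {
    "name": "lookup_invoice",
    "description": "Fetch an invoice by its ID.",
    "inputSchema": {
        "type": "object",
        "properties": {"invoice_id": {"type": "string"}},
        "required": ["invoice_id"],
    },
}
```

Tools described this way are discoverable by any MCP-compatible client, which is exactly the ecosystem compatibility a proprietary alternative forfeits.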
How much value are you leaving on the table by over-constraining?
Agents that need human approval for every action aren't agents: they're suggestion engines. Containment by design lets agents run autonomously within safe boundaries. That's where the real productivity gain is.
Accountability
Do you know every agent running in your organization?
When employees build agents on low-code platforms, the company is still the deployer. An HR screening agent built without a compliance assessment makes you non-compliant before you even know the system exists. Shadow agents are the new shadow IT.
Can your infrastructure prevent an agent from running without being registered?
Knowing what's running today is one thing. Making it structurally impossible to deploy an unregistered agent is another. If anyone can spin one up without it showing up in a registry, visibility is a snapshot, not a guarantee.
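Structurally impossible can be as simple as one choke point: deployment goes through a single entry point that fails closed for anything not in the registry. A minimal sketch, with illustrative names:

```python
# Sketch of a structural registry gate: no registry entry, no deployment.
REGISTRY: dict[str, dict] = {}

def register(agent_id: str, owner: str, risk_tier: str) -> None:
    """Registration records an owner and a risk tier before anything runs."""
    REGISTRY[agent_id] = {"owner": owner, "risk_tier": risk_tier}

def deploy(agent_id: str) -> str:
    # Fail closed: an unregistered agent cannot be deployed at all.
    if agent_id not in REGISTRY:
        raise PermissionError(f"agent {agent_id!r} is not registered")
    return f"deployed {agent_id}"

register("invoice-triage", owner="finance", risk_tier="minimal")
```

The guarantee comes from there being no other code path to production, not from the check itself.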
When an agent makes a consequential decision, can you trace who authorized it and what happened?
Audit logs that show "alice@company.com" aren't enough when Alice delegated to an agent three months ago. You need to know who or what made the call, and under what authority.
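One way to capture that is an audit record that stores the whole authority chain, not just the original human. A sketch, with illustrative field names:

```python
# Sketch of an audit record carrying the delegation chain, human first.
import datetime

def audit_record(actor: str, on_behalf_of: list[str], action: str) -> dict:
    """actor is who/what executed; on_behalf_of is who authorized, in order."""
    return {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "actor": actor,
        "delegation_chain": on_behalf_of,
        "action": action,
    }

rec = audit_record(
    actor="agent:expense-approver",
    on_behalf_of=["alice@company.com", "agent:finance-orchestrator"],
    action="approve_expense:4512",
)
```

With this shape, "who authorized it" is a lookup, not a forensic investigation.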
If an agent causes harm, is the liability chain clear?
The human who delegated, the team that deployed, the vendor who built the model: all may share responsibility. If no one owns the answer, everyone will point at each other when it matters.
Could you explain to a regulator what your agent did and why?
The EU AI Act requires traceability, risk management, and human oversight for high-risk use cases. If an agent autonomously wandered into one, can you reconstruct the chain of decisions that got it there?
Control
Are agents restricted to what they can do, or only blocked from what they can't?
You can't list everything an agent shouldn't do: the list is infinite. The inverse works: start from zero authority, grant explicit permissions per task. A blocklist is always incomplete. An allowlist is always bounded.
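The allowlist idea fits in a few lines: an agent starts with an empty permission set, and anything not explicitly granted is denied. The permission model below is a deliberately minimal sketch.

```python
# Sketch of deny-by-default authorization: zero authority at creation,
# explicit (action, resource) grants, no blocklist anywhere.
def make_agent_perms() -> set[tuple[str, str]]:
    return set()  # zero authority by default

def grant(perms: set, action: str, resource: str) -> None:
    perms.add((action, resource))

def allowed(perms: set, action: str, resource: str) -> bool:
    # Anything not granted is denied; the bounded set is the whole policy.
    return (action, resource) in perms

perms = make_agent_perms()
grant(perms, "read", "crm:contacts")
```

Note what is absent: there is no list of forbidden actions to maintain, because the default already forbids everything.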
When agents delegate to other agents, can authority only decrease?
If your procurement agent can approve purchases up to $5,000, any sub-agent it calls should inherit that ceiling or lower. Does the architecture enforce this, or just assume it?
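Enforcing that in the architecture can be one clamp applied at every delegation. A sketch using the $5,000 example; the numbers are illustrative:

```python
# Sketch of attenuation-only delegation: a sub-agent's ceiling can
# never exceed its parent's, no matter what it requests.
def delegate(parent_ceiling: float, requested_ceiling: float) -> float:
    """Authority can only decrease down the chain."""
    return min(parent_ceiling, requested_ceiling)

procurement = 5000.0
sub_agent = delegate(procurement, requested_ceiling=10000.0)  # clamped
sub_sub = delegate(sub_agent, requested_ceiling=1000.0)       # narrows
```

If delegation always routes through this clamp, "authority only decreases" is a property of the system, not a convention teams have to remember.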
What happens when an agent wanders into a use case you didn't anticipate?
A general-purpose office assistant told to "handle my inbox" might draft an email (minimal risk), then screen a job application (high-risk). The more open-ended the prompt, the less you can predict which risk tier the agent will land in. If you can't anticipate the use case, how do you bound it?
Are your agents contained by architecture, or only by policy?
Policy says what agents shouldn't do. Architecture limits what they can do, regardless of what they try. If a prompt injection or bad dependency triggers something unexpected, what actually stops it from spreading?
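Containment by architecture can mean that every outbound call goes through a mediator that enforces hard limits, so even an injected instruction has no code path to act on. A minimal sketch; the host allowlist and names are assumptions:

```python
# Sketch of containment by architecture: egress exists only through
# this mediator, so a prompt injection cannot "decide" to bypass it.
ALLOWED_HOSTS = {"api.internal.example"}  # illustrative allowlist

def call_tool(host: str, payload: dict) -> str:
    if host not in ALLOWED_HOSTS:
        # Not a policy statement: this is the only way out.
        raise PermissionError(f"egress to {host!r} is not possible")
    return f"sent {len(payload)} fields to {host}"
```

The distinction from policy: the agent never sees a rule it could be talked into ignoring, because the forbidden path simply does not exist.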
What happens when human oversight breaks down in practice?
After the 20th approval prompt, people start clicking "yes" without reading. Decades of automation research confirm that humans can't reliably monitor automated systems and then rapidly take control when needed. If your safety model depends on vigilance, what happens when vigilance fades?
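One practical answer is tiered oversight: route only actions above a risk threshold to a human, so approvals stay rare enough to get real attention. The tiers and threshold below are illustrative assumptions, with unknown actions escalating by default:

```python
# Sketch of tiered oversight: only high-risk actions reach a human.
RISK = {"draft_email": 1, "send_external": 2, "screen_candidate": 3}

def needs_human(action: str, threshold: int = 3) -> bool:
    # Unknown actions default to the threshold, so they escalate too.
    return RISK.get(action, threshold) >= threshold
```

The design choice is fail-closed novelty: an action the tiering has never seen is treated as high-risk until someone classifies it.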
How do you balance agent quality with data privacy?
Agents get better with more context, but more context means more data exposure. Do your agents see only what they need for the task at hand, or do they have broad access because it's easier to set up? And where does that data go: to an open-source model running on your infrastructure, or to a frontier model behind an API? The privacy calculus is different for each.
Does your agent setup work when agents need to cross trust boundaries?
Most current approaches work within a single trust domain. When agents act across organizations, call external APIs, or coordinate with third-party agents, identity and authority need to travel with the request: verifiable at every step, not assumed.
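"Travel with the request" can mean each hop carries a token binding the delegation chain and scope, verifiable without trusting the caller. The sketch below uses a shared HMAC key for brevity; a real cross-organization system would use asymmetric signatures (macaroon- or biscuit-style attenuation), and all names are illustrative.

```python
# Sketch of verifiable authority crossing a trust boundary: the token,
# not the caller, proves who delegated what. Demo key, not production.
import hashlib
import hmac
import json

SECRET = b"shared-demo-key"  # assumption: demo only

def mint(chain: list[str], scope: str) -> dict:
    body = json.dumps({"chain": chain, "scope": scope}, sort_keys=True)
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}

def verify(token: dict) -> dict:
    expected = hmac.new(SECRET, token["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        raise PermissionError("tampered or unsigned request")
    return json.loads(token["body"])

token = mint(["alice@company.com", "agent:travel"], scope="book_flight")
claims = verify(token)
```

Every hop can run `verify` independently, which is what "verifiable at every step, not assumed" looks like in code.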
Asking the right questions is hard. Getting the right answers takes experts. See how I work with teams.
It's Iterative
A waterfall approach doesn't work here. Companies aren't all at the same starting point, and the interesting challenges surface mid-engagement, not on a pre-set agenda.
A leadership team might start with Potential ("what's possible?"), realize they need Accountability answers before they can greenlight anything, and then discover that Control is what makes the whole thing feasible.
An engineering team might start with Control ("how do we auth this agent?"), then loop back to Potential to validate the use case is even worth securing.
Each loop is short and delivers something concrete. The decisions you make now about how you structure context and information will compound as agents mature, but every step along the way should deliver value on its own.
I help teams navigate this through workshops, consulting, or a focused session for your leadership team.
Get in touch