Designing Agent Governance: A New Surface for AI Risk

How Endor Labs designed Agent Governance, a new product inside the existing platform. A look at three choices that shaped where the product lives, how risk reads, and where policies meet reality.

Written by

Karen Ng

Published on

May 18, 2026

Updated on

May 20, 2026

Topics

AI/ML

A New Unit of Risk

Endor Labs is built around the developer codebase. Namespaces scope a customer's tenancy, projects map to repositories, and findings are the security risks inside them. The entire product assumes you are looking at code.

AI expanded the risk beyond the code itself. Coding agents, MCP servers, skills, and models now run on developer workstations, with access to source, secrets, and credentials. They make decisions, call tools, and modify systems before any of that work hits a pull request. The developer's laptop has become a place where risk lives, outside the codebase.

That shift created new jobs for three groups of people. Customer conversations and design research surfaced three users, each with a different job, and those jobs shaped how Agent Governance came together.

Engineering and Security Leaders: Adopting AI at Scale

Enterprise engineering and security leaders are deciding how to adopt AI coding agents at scale. Their job is to enable the productivity AI agents bring while keeping the security posture intact. They need visibility into what AI is in use across the organization, controls that scale across thousands of developers, and evidence that the program is working.

Security Teams: Making Governance Real

Security teams sit at the operational edge of the program. They write the policies, tune the thresholds, and investigate the events that fire. Depending on the organization, this might be a single analyst, a small AppSec team, or a larger function with separate strategy and operations roles. Their job is to make governance real day to day without becoming a bottleneck for the developers they are protecting.

Agent Developers: Moving Fast with Guardrails

Agent developers are the people using AI coding agents to build software. They are not direct users of Agent Governance and may never open it, but the design of the product shapes their experience every day. Their job is to move fast: triage findings, write code with AI assistance, ship features. The guardrails the product provides have to operate without slowing that work down.

Three design questions came out of those three jobs:

Where security teams would find this product alongside everything else they already use,
How they would read risk across very different things on the same platform,
Where the rules they write would meet the actions agents actually take.

Carrying Two Product Shapes in One UI

Where would security teams find this new product alongside everything else they already use? That was the first question.

Agent Governance is a new product inside an existing platform. Endor Labs' other products live in the same UI, organized around namespaces, repositories, and findings. The question was how a new product with a different shape should sit alongside that work. There were three reasonable ways to do it, and the choice mattered for how a security team would experience the platform.

Build it as a separate app, outside the rest of Endor Labs. Clean to ship, but now a security team has two places to log in and two products to keep track of.
Tuck it under an existing area like Inventory or Findings. Tidier, but those areas assume you are looking at code, with namespaces and repositories scoping the data. Agent Governance does not work that way, and the user would meet an exception every time they clicked.
Place it as a peer alongside the existing products, with its own grammar where the product needs it.

We chose the third. Agent Governance got its own top-level entry in the sidebar, sitting alongside Findings, Inventory, and Package Firewall. Inside Agent Governance, the namespace selector at the top of the page is hidden, because the data is org-wide and namespace scoping has no meaning on a workstation. Had the selector stayed visible, a security team member who tried to filter by namespace would be offered a filter that does not apply, and would lose confidence in what they were seeing. Everywhere else, the selector is restored. Breadcrumbs follow the same rule and show the right scope per page.
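The scoping rule can be sketched as a small lookup. This is an illustration only, not Endor Labs' implementation: the product keys, the `page_scope` function, and the breadcrumb labels are all hypothetical names chosen for the example.

```python
# Hypothetical product keys; only Agent Governance is org-wide in this sketch.
ORG_WIDE_PRODUCTS = frozenset({"agent-governance"})


def page_scope(product: str, namespace: str) -> dict:
    """Decide what the page chrome shows for a given product.

    Org-wide products hide the namespace selector and root their
    breadcrumb at the organization; every other product keeps the
    selector and roots its breadcrumb at the active namespace.
    """
    org_wide = product in ORG_WIDE_PRODUCTS
    root = "Organization" if org_wide else namespace
    return {
        "show_namespace_selector": not org_wide,
        "breadcrumb": [root, product],
    }


governance = page_scope("agent-governance", "team-a")
findings = page_scope("findings", "team-a")
```

Keeping the decision in one place means the selector and the breadcrumb cannot disagree about which scope a page is in.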

Inside Agent Governance, each page reuses the same table, filter, and drawer patterns as the rest of Endor Labs. Columns and detail content change to match what the user is looking at, but the way to interact with a page stays the same. The structure is shared; the content adapts.

For someone moving across the product, the work feels like one platform, even though two grammars run underneath. Familiar shapes carry users into unfamiliar territory faster than novel ones.

One Scoring Language for Three Different Risks

How would security teams read risk across very different things on the same platform? That was the second question.

Endor Labs already had risk scoring for open-source packages. The Endor Labs OSS Score has an overall score, a verdict, and a breakdown that explains why. As we added scoring to MCP servers and to skills, each had its own engineering team thinking about what mattered for that kind of risk: supply chain and authentication for MCP servers, data and permission for skills. Three good systems, none speaking the same visual language. A security team member moving across the platform was learning three reading patterns at once without realizing it.

The temptation was to force one set of dimensions on everything for clean consistency. The problem is that the dimensions matter. What matters for an MCP server genuinely differs from what matters for an open-source package, and collapsing them into a single set would have hidden risk that the user came here to see.

What we did instead was unify the way scores read, while keeping the dimensions specific to each kind of risk. MCP servers, skills, and OSS packages now share the same visual shape: an overall score, a Safe / Caution / Risky verdict, a four-dimension breakdown explaining the score, and a side drawer that opens from the inventory the same way every time. The dimensions stay tailored. The reading experience stays the same. A security team member reading an MCP server score, a skill score, and an OSS package score finds the same structure each time. They learn the language once and read fluently across types. For an engineering or security leader looking at the program at the organization level, the same number means the same thing whether it appears on an MCP server, a skill, or a package, making risk easier to summarize for executives and auditors.
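One way to picture the shared shape is a single record type whose dimension names vary by kind. The sketch below is an assumption-heavy illustration, not the product's scoring model: the 0 to 10 scale, the plain-mean overall score, the verdict cutoffs, and every dimension name other than supply chain, authentication, data, and permission are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    SAFE = "Safe"
    CAUTION = "Caution"
    RISKY = "Risky"


@dataclass(frozen=True)
class ScoreCard:
    kind: str                     # "mcp_server", "skill", or "oss_package"
    name: str
    dimensions: dict[str, float]  # four entries; the names depend on kind

    def __post_init__(self) -> None:
        if len(self.dimensions) != 4:
            raise ValueError("a score card carries exactly four dimensions")

    @property
    def overall(self) -> float:
        # Assumption: the overall score is the plain mean of the dimensions.
        return sum(self.dimensions.values()) / len(self.dimensions)

    @property
    def verdict(self) -> Verdict:
        # Assumption: hypothetical cutoffs on a 0-10 scale, higher is safer.
        if self.overall >= 7.0:
            return Verdict.SAFE
        if self.overall >= 4.0:
            return Verdict.CAUTION
        return Verdict.RISKY


def summary(card: ScoreCard) -> str:
    """Render any kind of card in the same one-line shape."""
    return f"{card.name}: {card.overall:.1f} ({card.verdict.value})"


mcp = ScoreCard("mcp_server", "example-mcp", {
    "supply_chain": 8.0, "authentication": 6.0,
    "maintenance": 7.0, "exposure": 9.0,   # last two names are invented
})
skill = ScoreCard("skill", "example-skill", {
    "data": 3.0, "permission": 2.0,
    "provenance": 5.0, "behavior": 4.0,    # last two names are invented
})
```

The dimension keys differ between `mcp` and `skill`, but `summary` and the verdict logic never need to know which kind they are reading.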

The harder half was tying scoring to policy. A score is only useful if a team can act on it. The policy authoring flow now references scores directly: a team can set a threshold and have Agent Governance alert when an MCP server or skill falls below it. The same number that explains risk in the inventory becomes a control point when the team writes a rule.
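A threshold rule of this kind might look like the following sketch. It is not the Agent Governance policy format: `ScoredItem`, `ThresholdPolicy`, the field names, and the sample scores are hypothetical, and the scale again assumes that a higher score is safer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredItem:
    kind: str       # "mcp_server", "skill", or "oss_package"
    name: str
    overall: float  # assumed scale where higher is safer


@dataclass(frozen=True)
class ThresholdPolicy:
    name: str
    applies_to: frozenset[str]  # kinds of item the rule covers
    min_score: float

    def violations(self, items: list[ScoredItem]) -> list[ScoredItem]:
        """Return the covered items whose score falls below the threshold."""
        return [
            item for item in items
            if item.kind in self.applies_to and item.overall < self.min_score
        ]


inventory = [
    ScoredItem("mcp_server", "example-mcp", 7.5),
    ScoredItem("skill", "example-skill", 3.5),
    ScoredItem("skill", "borderline-skill", 5.0),
    ScoredItem("oss_package", "example-pkg", 2.0),
]
policy = ThresholdPolicy(
    name="alert-on-low-agent-scores",
    applies_to=frozenset({"mcp_server", "skill"}),
    min_score=5.0,
)
alerts = policy.violations(inventory)
```

Here only `example-skill` triggers an alert: `borderline-skill` sits exactly at the threshold rather than below it, and `example-pkg` is outside the kinds the rule covers.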

Where Policies Meet Reality

Where would the rules a team writes meet the actions agents actually take? That was the third question.

When a security team writes policies, they need somewhere to see those policies actually doing their job. The Policy Violations page is that place. It sits where the rules a team writes meet the actions agents take, and it had to work for the way security teams actually use it.

An audit log answers "what happened?" and stops there. A findings list, the pattern from the rest of Endor Labs, answers "what's wrong?" but assumes a vulnerability concept to anchor on, which Agent Governance does not have. Security teams using this page need to scan, prioritize, drill in, and act. Sometimes daily. Sometimes after an incident. Sometimes during a compliance review.

Each entry on the Policy Violations page is a complete story: the detected behavior, the policy that matched it, and the action that resulted. Detail views answer four questions in the same order, every time: what fired, who did it, what the agent was trying to do, and what the system did about it.
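As a rough sketch of that structure, a violation can be modeled as a record whose detail view always emits the four answers in the order listed above. The class, field names, and sample values are hypothetical and only illustrate the fixed ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyViolation:
    policy: str        # what fired
    actor: str         # who did it
    attempted: str     # what the agent was trying to do
    action_taken: str  # what the system did about it

    def detail_view(self) -> list[tuple[str, str]]:
        """Return the four answers in the same order for every entry."""
        return [
            ("What fired", self.policy),
            ("Who did it", self.actor),
            ("What the agent was trying to do", self.attempted),
            ("What the system did about it", self.action_taken),
        ]


# Sample values are invented for illustration.
entry = PolicyViolation(
    policy="alert-on-low-agent-scores",
    actor="developer workstation, coding agent",
    attempted="call a tool on a low-scoring MCP server",
    action_taken="alert raised",
)
labels = [label for label, _ in entry.detail_view()]
```

Because the order lives in `detail_view` rather than in each page, every entry reads the same way regardless of which policy produced it.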

The order matters. A security analyst reviewing an event starts with what triggered. Knowing the rule that fired sets the frame for everything else, and showing the action the system took last, after the actor and the agent's intent, puts the response in context with the rule it was responding to. If detail views asked these questions in a different order, analysts would have to rebuild the frame every time they opened a new entry, and consistency across hundreds of entries would never compound into fluency. By the third entry, an analyst is reading faster, drilling in more decisively, and acting with less hesitation. That fluency compounds across a security team that may scan dozens of violations a day, and it produces the evidence trail engineering and security leaders need to show the program is doing what it was designed to do.

The page is where governance stops being abstract and becomes a record of decisions a team made. Every policy a team writes earns its place, or doesn't, on this page over time.

Designing for What Comes Next

AI risk evolves constantly. A security platform that can absorb new kinds of risk inside its existing UI is one less new product a security team has to learn each time. For a team using both sides of the platform today, the work stays in one place. For the team using whatever comes next, the same approach scales.

At Endor Labs, we design for security and engineering teams who want to move fast and secure smarter. Book a demo to see how Agent Governance can help your team get visibility into the AI tools and agents running across your developer workstations.