AI coding harnesses like Claude Code and Cursor have become central to many software development teams. Alongside writing code, they read your repository, run shell commands, edit files, call MCP servers, and ship work end-to-end. The same properties that make them productive also make them risky. An agent that can run npm install can also exfiltrate .env. An agent that can edit Terraform can also run terraform destroy. And the natural-language interface means an attacker who can plant a sentence anywhere the agent reads (a README, an issue comment, a dependency description) can try to talk it into doing something it shouldn't.
The model layer alone cannot be the security control. Models are non-deterministic and persuadable; system prompts and refusal training help, but enterprise security teams need something stronger: a deterministic, auditable, out-of-band layer that observes every consequential action the agent takes and decides whether to allow it. That layer exists. It's called hooks, and both Claude Code and Cursor expose it.
This post walks through what hooks are, the lifecycle events they fire on, the data they receive, the response they're allowed to send back, and how a tool like Endor Labs uses them to provide centralized logging and policy enforcement, including blocking concrete bad behaviors.
Why hooks
Hooks aren't a new idea. Git has them. Pre-commit has them. Kubernetes admission webhooks are the same pattern. The shared design principle is that the platform exposes well-defined extension points in its execution lifecycle, and at each point, it hands control to a script that the operator owns. The script gets structured data describing what's about to happen, and it returns a decision: allow, deny, or modify.
Applied to AI coding agents, this translates into a clean separation of concerns. The model decides what it wants to do. The hook decides whether it gets to. Because the hook runs as a normal process on the developer's machine, you can write it in any language, ship it through normal package management, sign it, version it, and, critically, your security team can own it without having to understand prompt engineering.
There are three things hooks are uniquely good at:
- Deterministic policy. A regex on a shell command line is not subject to jailbreaks. If the policy says block rm -rf /, the script blocks rm -rf /, regardless of what the agent's reasoning trace looks like (see the sketch after this list).
- Centralized audit. Every PreToolUse, every shell exec, every file read can stream to a SIEM. You get a forensic record of what the agent did, on whose machine, against which repo, with which prompts. Compliance teams have asked for this since the moment Copilot launched.
- Defense in depth. Hooks complement, rather than replace, the harness's own permission prompts and the model's safety training. When the model misbehaves and the user clicks "yes" by mistake, the hook is your last line of defense.
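To make the first point concrete, here's a minimal sketch of a deterministic deny-list hook in Python. It assumes the payload shape of Claude Code's Bash tool event (covered later in this post), and the deny patterns are just examples:
#!/usr/bin/env python3
# Sketch: a deterministic deny-list hook. The regex either matches the
# literal command string or it doesn't; the model's reasoning is irrelevant.
import json
import re
import sys

DENY_PATTERNS = [
    re.compile(r"\brm\s+-rf\s+/"),           # destructive delete at root
    re.compile(r"\bterraform\s+destroy\b"),  # infrastructure teardown
    re.compile(r"\.env\b"),                  # touching secrets files
]

event = json.load(sys.stdin)  # harness payload; format covered later in the post
command = event.get("tool_input", {}).get("command", "")

if any(p.search(command) for p in DENY_PATTERNS):
    # Claude Code's simplest blocking convention: exit 2, reason on stderr.
    print("Blocked by policy: matched a denied command pattern", file=sys.stderr)
    sys.exit(2)

sys.exit(0)  # allow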
What a hook actually is
In both Claude Code and Cursor, a hook is a shell command that the harness spawns at a specific lifecycle event. Configuration lives in a JSON file checked into (or layered over) the project: .claude/settings.json for Claude Code, .cursor/hooks.json for Cursor. The harness writes a JSON document describing the event to the hook process's stdin and reads its stdout (and exit code, in Claude's case) for a decision. That's the whole interface.
A trimmed-down Claude Code config looks like this:
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{ "type": "command",
"command": "/path/to/policy.sh" }
]
}
]
}
}
Cursor's equivalent:
{
"version": 1,
"hooks": {
"beforeShellExecution": [
{ "command": "/path/to/policy.sh" }
]
}
}
Same idea, but with a (slightly) different vocabulary. Below, we'll see that the vocabulary differences run a little deeper than this.
Hook events
Both harnesses model the agent loop as a sequence of intercept points: the session opens, the user submits a prompt, the model decides to call a tool, the tool runs, and the loop completes. Each of those moments has a hook.
Events differ in name and granularity between harnesses, and the configuration syntax differs too, even for functionally similar things like selecting which handler runs based on precisely which tool is called during a tool-use event. The practical takeaway, however, is that the shape of the integration is shared. You can intercept the agent before it submits a prompt to the model, before it fires off any tool call (with sub-types for shell, MCP, and file I/O), and at the end of a turn or session. Whatever else the platforms add over time, and they will, those are the points where security policy belongs.

Endor Labs supported hook events
The Claude config wires endorctl ai-audit claudecode to six events; the Cursor config wires endorctl ai-audit cursor to thirteen. Cursor needs more entries because it ships event-specific hooks for shell, MCP, and file I/O instead of using a single PreToolUse event with matchers, the way Claude does.
What the agent passes to your script
When a hook fires, the harness pipes a JSON object describing the event to your script's stdin. The exact fields differ a little by event and a little by platform, but in every case the JSON describes what the agent is about to do in enough detail for a security controller to make a decision.
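For example, a Claude Code PreToolUse payload for a shell command looks roughly like this (trimmed and illustrative; consult each platform's hook reference for the exact fields):
{
  "session_id": "abc123",
  "transcript_path": "~/.claude/projects/my-repo/transcript.jsonl",
  "cwd": "/home/dev/my-repo",
  "hook_event_name": "PreToolUse",
  "tool_name": "Bash",
  "tool_input": {
    "command": "rm -rf node_modules",
    "description": "Remove installed dependencies"
  }
}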
What the script can return
The response contract is where governance actually happens. There are three meaningful answers a hook can give: allow, deny with a reason the model can see, and modify.
Claude Code uses a hybrid of exit codes and stdout JSON. Exit code 0 means success, and the harness parses stdout for a JSON decision. Exit code 2 is the blocking error: stdout is ignored, stderr is fed back to the model as an error message, and depending on the event, the action is canceled (the tool call doesn't run, the prompt is erased, the session can't stop). Any other non-zero code is a non-blocking error.
For richer control, exit-0 stdout JSON lets a PreToolUse hook return:
{
"hookSpecificOutput": {
"hookEventName": "PreToolUse",
"permissionDecision": "deny",
"permissionDecisionReason": "Destructive command blocked by hook"
}
}
permissionDecision is one of "allow", "deny", "ask", or "defer", and when multiple hooks chain, deny wins over defer, defer over ask, and ask over allow: secure by default. A hook can also return updatedInput to rewrite the tool call before it runs (mask a path, strip a flag, normalize a command), or additionalContext to inject a string of guidance into the model's context window.
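As a sketch of the modify path, assuming updatedInput sits alongside the permission fields (check your version's hook reference for the exact placement), a hook could allow a command only after stripping an unsafe flag:
{
  "hookSpecificOutput": {
    "hookEventName": "PreToolUse",
    "permissionDecision": "allow",
    "permissionDecisionReason": "Allowed after removing --insecure",
    "updatedInput": {
      "command": "curl https://example.com/install.sh"
    }
  }
}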
Cursor uses pure stdout JSON. The two permissioned events return a HookPermissionResponse:
{
"permission": "allow" | "deny" | "ask",
"userMessage": "shown to the human",
"agentMessage": "shown to the model — useful for self-correction"
}beforeSubmitPrompt returns a simple { "continue": boolean }. afterFileEdit and stop are notification-only; their return value is ignored.
The user is informed of the reason for blocking via the userMessage field, and the agent can be informed of the reason and given instructions for future actions, creating a "learning loop" that prevents the same problem from recurring.
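Here's what that looks like as a minimal Cursor beforeShellExecution hook, again a Python sketch. The response shape follows the HookPermissionResponse contract shown above; treat the input field name (command) as an assumption on my part:
#!/usr/bin/env python3
# Sketch of a Cursor beforeShellExecution hook with a "learning loop":
# the agentMessage tells the model how to proceed next time.
import json
import sys

event = json.load(sys.stdin)
command = event.get("command", "")  # assumed field name; see Cursor's docs

if "terraform destroy" in command:
    json.dump({
        "permission": "deny",
        "userMessage": "Blocked: terraform destroy requires change approval.",
        "agentMessage": "Never run terraform destroy directly. Run terraform "
                        "plan and ask the user to review and apply it.",
    }, sys.stdout)
else:
    json.dump({"permission": "allow"}, sys.stdout)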
From hooks to governance: where Endor Labs fits
A hook is a single shell command. To turn that into a real governance system, somebody has to implement the policy, log every event, and centralize the rules so they don't drift across a hundred developers' machines. That's the role Endor Labs plays.
The integration pattern is simple. Configure a single command, endorctl ai-audit, as the handler for every hook event.
For example (from Claude settings.json):
"SessionStart": [
{
"hooks": [
{
"type": "command",
"command": "npx -y endorctl --namespace $ENDOR_NAMESPACE --api-key $ENDOR_API_CREDENTIALS_KEY --api-secret $ENDOR_API_CREDENTIALS_SECRET ai-audit claudecode"
}
]
}
],
The endorctl command performs the following actions:
- Read the harness's JSON payload from stdin.
- Normalize it across Claude's and Cursor's vocabularies into a common event model (sketched after this list).
- Stream the event (every prompt, tool call, shell command, and file read) to the Endor Labs backend, producing a centrally searchable audit trail.
- Evaluate the event against the organization's policy: shell-command allow/deny lists, sensitive-file patterns, MCP server allowlists, environment-variable rules.
- Reply to the harness in the platform-specific format (permissionDecision: "deny" for Claude Code, permission: "deny" for Cursor), including a human-readable reason.
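endorctl's internals aren't public, but the normalization step in the second bullet might look something like this Python sketch, with a hypothetical common event model:
from dataclasses import dataclass
from typing import Optional

@dataclass
class ShellEvent:
    """Hypothetical common event model for shell executions."""
    harness: str   # "claudecode" or "cursor"
    command: str

def normalize(harness: str, payload: dict) -> Optional[ShellEvent]:
    # Map each harness's vocabulary onto the common model.
    if harness == "claudecode" and payload.get("tool_name") == "Bash":
        return ShellEvent(harness, payload.get("tool_input", {}).get("command", ""))
    if harness == "cursor" and payload.get("hook_event_name") == "beforeShellExecution":
        return ShellEvent(harness, payload.get("command", ""))
    return None  # not a shell event; handled elsewhere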
Because the policy lives server-side in Endor, your security team can update policies once and have every developer protected. Because the audit trail lives server-side, you have an answer the next time someone asks, "what is this AI doing on our laptops, and how can we control it?"
Endor Agent Governance policies
At the time of writing, there are 29 default AI governance policies on the Endor Labs Auri platform, covering most risky actions an agent might take. While most of them block the agent from continuing, a few prompt the agent to ask the user for permission. Administrators can also create new policies, complete with custom messages for the user and the agent (where available), and edit the existing policies.
Closing
Hooks are not glamorous. They're a JSON file and a stdin/stdout contract. But they're the missing primitive that lets enterprise security teams adopt AI coding agents on the same terms they adopt anything else, with audit, with policy, with deterministic enforcement that doesn't depend on the goodwill of a probabilistic model.
If you're piloting Claude Code or Cursor, the lowest-effort thing you can do this quarter is wire up a hook. Even a logging-only handler will tell you, very quickly, more about how your developers are actually using these tools than any survey will. From there, adding policy is a config change rather than a re-architecture. And tools like Endor Labs help you cross the line from "I have a hook" to "I have a governed system" with a single line of settings.json.
What's next?
When you're ready to take the next step in securing your software supply chain, here are 3 ways Endor Labs can help: