Modern AI agent frameworks represent a new frontier in software architecture and in security challenges. As these systems grow more sophisticated, combining large language models with tool execution and external integrations, they both inherit classic vulnerability patterns and introduce entirely new attack surfaces.
To understand how agentic security analysis performs against real-world AI infrastructure, we applied Endor Labs' AI SAST engine to OpenClaw, an open-source AI agent framework. We then took the critical next step: systematically validating findings through exploit development and proof-of-concept testing against a live deployment. This resulted in seven findings we have since reproduced, documented, and responsibly disclosed.
The analysis process
The engagement followed a straightforward methodology:
- AI SAST Analysis: We ran Endor Labs' AI SAST engine against the OpenClaw codebase. The engine performed semantic analysis of the code, identifying potential security issues and tracing data flows from sources to sinks. When it identified a complete data flow path, the engine reported a finding.
- Data Flow Validation: For each finding, we examined the data flow path the engine reported. This meant understanding how user-controlled data moves through the application, what transformations occur along the way, and where it ultimately gets used.
- Exploit Development: We developed proof-of-concept exploits for findings where the data flow analysis indicated exploitability. This involved setting up a live OpenClaw deployment and crafting actual attacks against the identified vulnerabilities.
- Impact Assessment: For each successful exploit, we documented the real-world security impact, demonstrating what an attacker could achieve.
The result: seven vulnerabilities confirmed as exploitable, now responsibly disclosed to the OpenClaw maintainers.
Why data flow analysis matters
The key to moving from "potential issue" to "confirmed vulnerability" is understanding the complete data flow path. The Endor Labs AI SAST engine traced:
Source identification: Where attacker-controlled data enters the system through HTTP request parameters, configuration values, conversation histories, or external API responses. Not all inputs are equally dangerous; the impact depends on the trust level of the input channel.
Transformation tracking: How data changes as it moves through the codebase. String concatenation, path resolution, JSON parsing, and format operations can each affect whether a payload survives intact and remains exploitable.
Sink analysis: Where data reaches security-sensitive operations. The engine identified dangerous operations such as command execution, file system operations, network requests, and database queries, and determined whether attacker-controlled data could reach them.
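To make the source-transformation-sink pattern concrete, here is a minimal hypothetical example (not OpenClaw code) of the kind of flow a SAST engine traces, a classic path traversal where the transformation fails to neutralize the payload:

```python
import os

WORKSPACE = "/srv/agent/workspace"  # assumed sandbox directory for illustration

def read_workspace_file(filename: str) -> str:
    # Source: `filename` is attacker-controlled (e.g. an HTTP parameter).
    # Transformation: path resolution -- os.path.join does NOT strip "../".
    path = os.path.join(WORKSPACE, filename)
    # Sink: file system read with the attacker-influenced path.
    with open(path) as f:
        return f.read()

# A payload like "../../../etc/passwd" survives the join and escapes the workspace:
# os.path.join("/srv/agent/workspace", "../../../etc/passwd")
#   -> "/srv/agent/workspace/../../../etc/passwd" -> resolves to "/etc/passwd"
```

The fix at the transformation step would be to resolve the path (`os.path.realpath`) and reject any result outside `WORKSPACE` before it reaches the sink.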
In multiple cases, the data flow analysis revealed vulnerabilities spanning several architectural layers:
- HTTP route handlers receiving user input
- Business logic processing and transforming that input
- Tool dispatch layers routing to different capabilities
- Lower-level implementation functions performing security-sensitive operations
This multi-layer tracking was essential. Several vulnerabilities existed precisely because validation was missing at different stages, hinting at a permissive threat model.
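The layered pattern above can be sketched as follows. This is a hypothetical illustration (the function names and tool registry are invented, not OpenClaw's architecture) of how each layer can assume another layer validates the input, so none of them does:

```python
def http_handler(params: dict) -> str:
    # Layer 1: route handler -- forwards raw user input without validation.
    return process_request(params["tool"], params["arg"])

def process_request(tool: str, arg: str) -> str:
    # Layer 2: business logic -- trusts the handler to have validated.
    return dispatch_tool(tool, arg)

def dispatch_tool(tool: str, arg: str) -> str:
    # Layer 3: tool dispatch -- routes to capability implementations,
    # trusting the layers above.
    registry = {"echo": build_echo_command}
    return registry[tool](arg)

def build_echo_command(arg: str) -> str:
    # Layer 4: security-sensitive sink -- interpolates the argument into a
    # shell command string. An arg like '"; cat /etc/passwd #' breaks out
    # of the quoting once the string is handed to a shell.
    return f'echo "{arg}"'
```

A single-layer analysis sees four individually plausible functions; only tracing the flow end to end reveals that attacker input reaches the shell command unvalidated.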
From findings to exploits
Let's walk through what exploit development looked like in practice:
Configuration and Setup: We deployed OpenClaw in a Docker container with all necessary integrations configured. This included setting up browser automation, enabling various tools and channels, and configuring the gateway with authentication tokens.
Exploit Development: For each finding where the AI SAST engine identified a complete data flow path from attacker-controlled input to a dangerous operation, we crafted proof-of-concept exploits. This meant:
- Identifying the exact API endpoints or interfaces where malicious input could be injected
- Crafting payloads that would survive any transformations along the data flow path
- Developing HTTP requests, tool invocations, or conversation sequences that triggered the vulnerability
- Capturing evidence of successful exploitation such as unauthorized file access, command execution, or network requests to attacker-controlled domains
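A PoC skeleton for the steps above might look like the following. Everything here is an invented placeholder (the endpoint, parameter name, port, and token are assumptions for illustration, not OpenClaw's actual API); the point is crafting a payload so it survives URL decoding along the data flow path:

```python
from urllib.parse import urlencode

GATEWAY = "http://localhost:18789"   # assumed local test deployment
TOKEN = "test-token"                 # assumed gateway auth token

def build_poc_url(filename: str) -> str:
    # URL-encode the traversal sequence so the HTTP layer delivers it intact;
    # the server-side decode then reconstructs the "../" segments.
    query = urlencode({"file": filename})
    return f"{GATEWAY}/api/files/read?{query}"

poc_url = build_poc_url("../../../etc/passwd")
# The request itself would then be sent with an HTTP client, e.g.:
#   requests.get(poc_url, headers={"Authorization": f"Bearer {TOKEN}"})
```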
Validation Testing: Each exploit was tested against the live deployment, with all requests manually validated. We documented:
- The exact attack vectors that worked
- The server responses confirming exploitation
- The real-world impact (files read, commands executed, data exfiltrated)
- Any defensive mechanisms that were bypassed
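Manual validation of this kind can be captured in a simple check. As a hypothetical sketch (the helper is ours, not part of any tool): require concrete evidence of exploitation in the response body rather than trusting a 200 status alone, since an error page served with a 200 is a false positive:

```python
def confirms_file_read(status: int, body: str) -> bool:
    # Require real evidence -- here, the "root:" prefix that every
    # /etc/passwd contains -- not just a successful status code.
    return status == 200 and "root:" in body
```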
Impact Documentation: For each confirmed vulnerability, we assessed the practical security impact. What could an attacker with valid credentials achieve? What about an unauthenticated attacker? What data could be accessed or systems compromised?
Responsible disclosure
We've reported seven vulnerabilities to the OpenClaw maintainers through responsible disclosure channels. Once patches are available and the issues are made public, we will publish further details on each finding identified by Endor Labs AI SAST.