When we think about security vulnerabilities in code, we typically focus on the obvious stuff: SQL injection, hardcoded credentials, or deprecated libraries. These are the kinds of issues that traditional SAST tools can catch. And while Endor Labs’ research shows that these issues continue to be prevalent in AI-generated code, there’s also a different class of problems that’s much harder for SAST to detect: design flaws. These are architectural decisions that undermine your application's security posture in ways that won't trigger any alerts in your SAST tool.
I spoke with Srajan Gupta, a Senior Security Engineer at Dave and co-author of a 2025 research paper, AI Code Generation and the Rise of Design Flaws [pdf], published in the International Journal of Latest Research in Engineering and Technology.
The findings reveal a troubling pattern: AI coding assistants are introducing systemic architectural weaknesses that have major consequences for application security.
AI tools create design flaws because they mimic patterns without context
Here’s the core issue: both developers and AI follow patterns, but they don’t follow them in the same way.
When a developer joins a new codebase and needs to build a feature, they first look at existing code to understand the conventions. Should this endpoint require authentication? Does it need rate limiting? What middleware should be applied?
AI coding assistants also review the codebase, but they do so without the architectural understanding that human developers bring. If the codebase contains design flaws, a human might recognize the mistake; the LLM will treat the flaws as acceptable patterns.
"AI will look at the structure of your existing code," Gupta explains. "If you have an unprotected API with no middleware, the AI will replicate that pattern. It's not necessarily introducing new classes of vulnerabilities, but it's amplifying and spreading design flaws throughout your codebase."
The research paper examined 20 different prompts across realistic development scenarios using a fictional Flask-based SaaS platform. The results were concerning:
- 15 out of 20 completions contained at least one architectural design flaw
- 12 out of 20 exhibited "design pattern drift", which is a failure to reuse established security controls like RBAC decorators or audit logging
- 6 out of 20 were completely invisible to static analysis tools
The problem compounds in large, multi-team environments where architectural rules are often enforced through tribal knowledge rather than programmatically. A small deviation introduced by AI-generated code can break platform invariants and create security debt that accumulates as more developers build on that flawed foundation.
Four types of design flaws in AI-generated code
The paper documents four categories of design flaws that repeatedly appeared in AI-generated code.
Cross-Service Trust Coupling: When prompted to "build a login system that returns a JWT," the AI generated code that used a shared secret key across multiple microservices, breaking the zero-trust isolation that the architecture required.
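Here's a rough sketch of what that coupling looks like. The secret, the payload, and the use of PyJWT are illustrative assumptions rather than the paper's exact output:

```python
import datetime
import jwt  # PyJWT

# Generated anti-pattern: one symmetric secret shared by every microservice.
SHARED_SECRET = "secret-shared-by-all-services"

def issue_token(user_id: str) -> str:
    payload = {
        "sub": user_id,
        "exp": datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(hours=1),
    }
    # HS256 verification requires the signing key, so every service that can
    # verify tokens can also forge them, breaking zero-trust isolation.
    return jwt.encode(payload, SHARED_SECRET, algorithm="HS256")

# What the architecture expected: asymmetric signing (e.g., RS256), where only
# the auth service holds the private key and other services verify with the
# public key.
```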
Privilege Escalation by Default: A simple prompt to "add a registration route" resulted in code that directly assigned role='admin' to new users, completely bypassing the application's centralized role assignment workflow.
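In code, the flaw is a one-liner. A hypothetical sketch (the route, fields, and `save_user` helper are invented for illustration):

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/register", methods=["POST"])
def register():
    data = request.get_json()
    user = {
        "email": data["email"],
        # Generated default: every new user becomes an admin, bypassing the
        # application's centralized role-assignment workflow.
        "role": "admin",
    }
    save_user(user)
    return {"status": "created"}, 201

def save_user(user: dict) -> None:
    """Stand-in for the real persistence layer."""
```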
Cryptographic Subversion: Asked to "encrypt passwords before saving to DB," the AI used reversible AES encryption instead of the existing bcrypt-based password hashing pipeline—silently undermining the entire security model.
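The difference is easy to miss in review because both versions appear to protect the password. A simplified contrast, with key handling deliberately naive and Fernet standing in for whatever AES wrapper the generated code actually used:

```python
import bcrypt
from cryptography.fernet import Fernet  # AES-based and reversible

FERNET_KEY = Fernet.generate_key()

def store_password_generated(password: str) -> bytes:
    # Reversible encryption: anyone holding FERNET_KEY can decrypt every
    # stored password back to plaintext.
    return Fernet(FERNET_KEY).encrypt(password.encode())

def store_password_existing(password: str) -> bytes:
    # The existing pipeline: a one-way, salted bcrypt hash that cannot be
    # reversed, only verified against a candidate password.
    return bcrypt.hashpw(password.encode(), bcrypt.gensalt())
```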
Missing Accountability: Administrative deletion routes were generated without the required audit logging calls, breaking compliance requirements and traceability contracts.
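Here the problem is what isn't generated. A hypothetical sketch (the route, helpers, and the audit API shown in the comment are invented):

```python
from flask import Flask

app = Flask(__name__)

@app.route("/admin/users/<int:user_id>", methods=["DELETE"])
def delete_user(user_id: int):
    remove_user(user_id)
    # Missing from the generated code: the audit call the platform contract
    # requires, e.g. audit_log.record("user.delete", actor=..., target=user_id)
    return "", 204

def remove_user(user_id: int) -> None:
    """Stand-in for the real deletion logic."""
```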
And here's the kicker: recent research suggests that only about 10% of AI-generated code is both correct and secure [pdf]. That's a major problem when these tools are becoming integral to developer workflows.
Why SAST struggles to detect design flaws
In Your Next Breach Won’t Be a CVE: Connecting Real Incidents to AI-Aware Code Review, Matt Brown did a deep dive on high-profile incidents caused by subtle flaws. He explained that rule-based SAST tools weren’t built to understand why a change is risky in the context of your architecture and data flows.
"These aren't things you can catch with traditional SAST tools," Gupta emphasizes. "Design flaws live in the assumptions we make—why you rate limit one endpoint but not another, or why certain operations require audit trails. These are decisions that happen in design review, not in the code itself."
"Pure engineering decisions have security implications," Gupta notes. "That's why these flaws are going to expand dramatically as AI adoption grows."
The future is design-aware security
So what can we do about it? Gupta suggests starting with architectural visibility:
"Use AI to build a graph of the application you're building. What are the new endpoints in this PR? What are the best practices you want enforced? What decorators or middleware should be applied? Build that understanding and map out the flow—that's a good starting point for understanding how AI is generating code, rather than just looking for individual findings."
The research paper proposes several recommendations:
For developers:
- Prompt with architectural intent, not just functional goals
- Instead of "create a login endpoint," try "create a login endpoint using our existing JWT signing helper and audit the login"
- Ask follow-up questions: "What are the security implications of this implementation?"
For tooling:
- Design-aware linters that highlight when generated code deviates from nearby secure conventions (a toy sketch of this idea follows these lists)
- Annotations showing which architectural principles (authentication, role enforcement, audit) have been satisfied or omitted
- Automated security reviews that look beyond code correctness into architectural soundness
For organizations:
- Flag AI-generated changes that modify access control, session logic, or multi-tenant boundaries for additional scrutiny
- Assume that developers may not fully understand the architectural implications of AI-suggested code
- Make security design reviews a standard part of the AI-assisted development workflow. This kind of architectural visibility must be continuous, not a one-time thing.
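To make the design-aware linter idea from the tooling list above concrete, here's a toy check built on Python's ast module. The decorator names it looks for are project-specific assumptions, and a real tool would need far more context:

```python
import ast

REQUIRED_DECORATORS = {"require_auth"}  # assumed project convention

def find_unprotected_routes(source: str) -> list[str]:
    """Flag functions decorated with @app.route but missing a required decorator."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        names = set()
        for dec in node.decorator_list:
            target = dec.func if isinstance(dec, ast.Call) else dec
            if isinstance(target, ast.Attribute):
                names.add(target.attr)   # e.g. @app.route(...)
            elif isinstance(target, ast.Name):
                names.add(target.id)     # e.g. @require_auth
        if "route" in names and not (REQUIRED_DECORATORS & names):
            findings.append(
                f"{node.name} (line {node.lineno}): route without {REQUIRED_DECORATORS}"
            )
    return findings
```

Even a check this simple surfaces the "route without middleware" drift described above, which is the kind of finding that correctness-focused review tends to miss.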
Security is just engineering
As Gupta writes in his Substack, "Security is just engineering." Design flaws aren't a separate category of problem. They're a natural consequence of incomplete architectural understanding.
"If you focus on design flaws, you can mitigate entire classes of vulnerabilities, not just individual bugs," he explains. "During design review, the questions become binary: Do you use this library to sanitize input? Do you apply this middleware to all authenticated endpoints? These are easy answers for developers—yes or no. Start focusing on that, and the maturity of entire vulnerability classes improves."
The key insight from this research isn't that AI coding assistants are fundamentally broken. It's that they operate without the architectural context that security requires. As these tools become more powerful and more widely adopted, we need security tooling that can detect not just what the code does, but whether it honors the assumptions that keep our systems safe. And right now, AI doesn't understand those assumptions at all.