How we built an AI security scanner that's actually useful

If you've ever run Semgrep or Bandit on a real codebase, you know the pattern: hundreds of findings, 90% of them false positives, and the actual bugs buried in the noise. We wanted to build a scanner where the signal-to-noise ratio was flipped — few findings, most of them real.

The problem with rule-based scanners

Traditional SAST tools match patterns. They flag eval() calls, SQL string concatenation, hardcoded credentials. These rules catch the obvious stuff, which is good — you genuinely don't want eval(user_input) in production. But rule-based matching has two failure modes that ruin the experience at scale.

First, false positives. A rule that flags every exec() call catches genuine RCEs but also every harmless test fixture, every database migration, every CLI argument parser. The signal drowns in the noise and developers eventually disable the tool.

Second, false negatives. Rules miss bugs that require understanding intent. A function named sanitize_input() that does nothing isn't flagged — the rule sees a sanitizer, not the fact that it doesn't actually sanitize. A SQL query built by concatenating trusted and untrusted strings through three layers of helper functions isn't caught — the rule doesn't trace data flow across function boundaries.
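To make both failure modes concrete, here's a minimal sketch. The regex rule, the two sample snippets, and every name in it (`RULE`, `flags`, `test_fixture`, `vulnerable_handler`) are hypothetical illustrations, not rules taken from Semgrep or Bandit.

```python
import re

# Hypothetical pattern rule: flag any call to exec() or eval().
RULE = re.compile(r"\b(exec|eval)\s*\(")

def flags(source: str) -> bool:
    """Return True if the naive rule would report this snippet."""
    return bool(RULE.search(source))

# False positive: a harmless test fixture that executes a constant string.
test_fixture = '''
def test_exec_builtin():
    namespace = {}
    exec("x = 1 + 1", namespace)
    assert namespace["x"] == 2
'''

# False negative: a "sanitizer" that does nothing, feeding a query built
# through helper functions. No single line matches the pattern.
vulnerable_handler = '''
def sanitize_input(value):
    return value  # claims to sanitize, returns the input unchanged

def where_clause(name):
    return "name = '" + name + "'"

def build_query(name):
    return "SELECT * FROM users WHERE " + where_clause(sanitize_input(name))
'''

print(flags(test_fixture))        # True: flagged, but harmless
print(flags(vulnerable_handler))  # False: injectable, but not flagged
```

The rule fires on the fixture because it only sees the call, and stays silent on the handler because nothing in it looks dangerous line by line.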

What AI adds

LLMs are good at the stuff rule-based scanners are bad at. They read context. They spot inconsistencies between what code claims to do and what it actually does. They can follow data flow across function boundaries and explain what they find in natural language.

The obvious win is detection quality. The less obvious win is explanation quality. A rule-based scanner says "unsafe deserialization in line 47." An AI scanner says:

The pickle.loads() call deserializes pickled data from an HTTP request body without any integrity check. An attacker who can reach this endpoint can execute arbitrary code by crafting a malicious pickle payload. Replace pickle with JSON, or sign the payload with HMAC and verify before deserializing.

That second one is a ticket you can hand to a junior engineer. The first is homework.
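For illustration, the remediation in that finding could look roughly like the sketch below. It applies both suggestions at once (JSON instead of pickle, plus an HMAC check) using only the Python standard library. The function names, the `PAYLOAD_HMAC_KEY` environment variable, and the demo fallback key are assumptions made for this example.

```python
import hashlib
import hmac
import json
import os

# Assumption: the key comes from the environment; the fallback is for the demo only.
SECRET_KEY = os.environ.get("PAYLOAD_HMAC_KEY", "demo-only-key").encode()

def sign(body: bytes, key: bytes = SECRET_KEY) -> str:
    """Return a hex HMAC-SHA256 tag for the request body."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()

def load_verified(body: bytes, signature: str, key: bytes = SECRET_KEY):
    """Verify the tag first, then parse the body as JSON instead of pickle."""
    expected = sign(body, key)
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, signature):
        raise ValueError("signature mismatch: refusing to deserialize")
    return json.loads(body)

body = json.dumps({"user_id": 42}).encode()
tag = sign(body)
print(load_verified(body, tag))  # {'user_id': 42}
```

A tampered body or a wrong tag raises `ValueError` before any parsing happens, which is the property the finding asks for.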

Our architecture

RepoInsight's security scanner is a three-stage pipeline.

Stage 1: targeted retrieval

We don't send the whole repo to Claude. We use semantic retrieval to surface the files most likely to contain security-sensitive code: auth handlers, input parsers, database queries, crypto operations, anything that touches request bodies or file paths. Think of how a radiologist reads a scan: you look at the right slices, not the whole body.
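As a rough illustration of the idea, the sketch below ranks files against a security-flavored query. It swaps in bag-of-words cosine similarity as a stand-in for real embedding-based retrieval, so it is not how RepoInsight does it. The query text, the function names, and the toy repo are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical query describing security-sensitive code.
QUERY = "auth token password sql query execute crypto hash request body file path open"

def tokens(text: str) -> Counter:
    """Lowercased word counts; identifiers split on underscores and digits."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_files(files: dict[str, str], k: int = 2) -> list[str]:
    """Return the k file paths whose contents best match the security query."""
    q = tokens(QUERY)
    ranked = sorted(files, key=lambda path: cosine(q, tokens(files[path])), reverse=True)
    return ranked[:k]

repo = {
    "src/auth/login.py": "def check_password(request): token = hash(request.body)",
    "src/db/users.py": "cursor.execute('select * from users where id = ' + user_id)  # sql query",
    "docs/changelog.md": "Fixed a typo in the readme and bumped the version number.",
}
print(top_files(repo))  # ['src/auth/login.py', 'src/db/users.py']
```

The changelog never reaches the model, which is the point: only the files that look security-relevant are passed on to the analysis stage.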

Stage 2: structured analysis

Claude analyzes the retrieved files against a checklist: hardcoded secrets, SQL injection, XSS, insecure deserialization, path traversal, CSRF, weak cryptography, logging of sensitive data, outdated dependencies with known CVEs. We use a strict JSON schema for the output, so every finding includes file path, severity, CWE identifier when applicable, a specific description, and a concrete remediation with a code example. A finding looks like this:

```json
{
  "file_path": "src/auth/oauth.py",
  "line_hint": "exchange_code_for_token",
  "vulnerability": "Hardcoded OAuth secret",
  "severity": "critical",
  "cwe": "CWE-798",
  "description": "The OAuth client secret is embedded in source code and will be exposed in any repository clone.",
  "remediation": "Move the secret to an environment variable. Rotate the existing secret since it's already exposed in git history."
}
```
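One way to enforce a shape like this on the consuming side is a small validator. The sketch below is an assumption about what such a check could look like, not RepoInsight's actual schema. In particular, only "critical" appears in the example above, so the other severity levels are guesses, and the function name is hypothetical.

```python
import re

REQUIRED = ("file_path", "vulnerability", "severity", "description", "remediation")
# Assumption: only "critical" is confirmed by the example; the rest are guesses.
SEVERITIES = {"critical", "high", "medium", "low", "info"}
CWE_PATTERN = re.compile(r"^CWE-\d+$")

def validate_finding(finding: dict) -> list[str]:
    """Return a list of problems; an empty list means the finding is well-formed."""
    problems = []
    for field in REQUIRED:
        value = finding.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    severity = finding.get("severity")
    if isinstance(severity, str) and severity not in SEVERITIES:
        problems.append(f"unknown severity: {severity!r}")
    cwe = finding.get("cwe")  # optional, since CWE is included "when applicable"
    if cwe is not None and not CWE_PATTERN.match(str(cwe)):
        problems.append(f"malformed CWE identifier: {cwe!r}")
    return problems

example = {
    "file_path": "src/auth/oauth.py",
    "vulnerability": "Hardcoded OAuth secret",
    "severity": "critical",
    "cwe": "CWE-798",
    "description": "The OAuth client secret is embedded in source code.",
    "remediation": "Move the secret to an environment variable.",
}
print(validate_finding(example))  # []
print(validate_finding({**example, "severity": "scary", "cwe": "798"}))
```

Rejecting malformed findings before they reach a dashboard or ticket keeps a model formatting slip from turning into a confusing report.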

Stage 3: scoring + summary

Once the findings are in, Claude produces a security posture score (0-100) and a one-paragraph summary of the codebase's overall security shape. This is what teams paste into security review tickets.

What it doesn't do

What's next

Three directions we're exploring:

Dependency scanning: surfacing CVEs in direct and transitive dependencies, with remediation paths
Secret-specific scanning: integrating with tools like Gitleaks to catch secrets in git history, not just HEAD
PR-time security review: blocking merges that introduce new critical findings
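To give a feel for the third direction, here's a minimal sketch of a merge gate that compares findings on the base branch with findings on the PR head. This is speculative: the function names and the choice of file path, CWE, and vulnerability title as a finding's identity are assumptions, not a description of a shipped feature.

```python
def new_critical_findings(base: list[dict], head: list[dict]) -> list[dict]:
    """Critical findings present on the PR head but not on the base branch."""
    def key(finding: dict) -> tuple:
        # Assumption: file path + CWE + vulnerability title identifies a finding.
        return (finding.get("file_path"), finding.get("cwe"), finding.get("vulnerability"))

    known = {key(f) for f in base}
    return [f for f in head if f.get("severity") == "critical" and key(f) not in known]

def should_block_merge(base: list[dict], head: list[dict]) -> bool:
    """Block only when the PR introduces at least one new critical finding."""
    return bool(new_critical_findings(base, head))

base = [{"file_path": "a.py", "cwe": "CWE-89", "vulnerability": "SQL injection", "severity": "critical"}]
head = base + [{"file_path": "b.py", "cwe": "CWE-798", "vulnerability": "Hardcoded secret", "severity": "critical"}]

print(should_block_merge(base, head))  # True: b.py adds a new critical finding
print(should_block_merge(base, base))  # False: nothing new
```

Diffing against the base branch means a PR is not blocked for critical findings that were already there before it.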

Try the security scanner on your own repo — it's free for public repositories. Get started at repoinsight.ai.