Most security incidents start with a line of code that looked fine in the PR, received two thumbs-up reactions, and went to production carrying a hole nobody noticed.

Code auditing is the practice that exists to find those holes on purpose, before someone outside the team finds them first.

Most engineering teams feel the need for an audit only after shipping something they wish they had not. The problem is that teams already run careful PR reviews, yet issues still slip through. Reviewing a single change in isolation cannot expose vulnerabilities that emerge across multiple files written months apart. The AI code review guide takes a closer look at how teams catch these issues earlier in the workflow.

The good news is that running a code audit is a learnable, repeatable process. The rest of this guide explains what a code audit involves, how the process works, what auditors examine, and when to schedule one.

What is code auditing?

Code auditing is a structured review of an application’s source code to identify security vulnerabilities, logic flaws, unsafe dependencies, and gaps in validation or access control. It examines how data flows through the system and whether safeguards remain effective, producing a prioritized list of issues and fixes.

Three qualities set a code audit apart from the day-to-day reviews developers perform.

  1. Scope: An audit takes a service, a module, or an entire application as its working surface, and the auditor is free to wander wherever the code takes them, opening files that no PR ever touched.
  2. Depth: An auditor spends hours, sometimes days, on the parts of the system that handle authentication, file uploads, data access, and external API calls. That kind of attention is impossible within the time budget of a normal PR review.
  3. Intent: An audit asks whether attackers can turn the code against its users, which is a sharper question than whether the change works as designed.

That last shift is the one most teams underestimate. Once a team starts reading their own code with an attacker's mindset, the volume of findings tends to surprise everyone, including the senior engineers who wrote the modules in question.

How code auditing differs from a regular code review

Both practices involve reading source code, but each surfaces different types of issues.

A code review is part of the PR workflow. Reviewers examine the diff, leave comments, and the author iterates until the change merges. The unit of work is a single change, the cadence is continuous, and the time per file is limited. Reviews catch style issues, bugs in new code, and problems visible within the diff.

A code audit operates at a broader level. Auditors begin with a system-level view of the application and examine it deliberately, often spending extended time on a single file. They trace data across modules and verify whether assumptions hold deeper in the codebase.

Most effective teams run both. Insights from audits shape what reviewers focus on over time, turning audits into a periodic recalibration of review practices.

For a deeper comparison of how the two fit into a typical engineering workflow, the code review best practices guide explains where each one fits.

How a code audit actually works

Audits look chaotic from the outside. From the inside, they follow a clear sequence. The phases below outline the workflow that most auditors and AI code review tools follow during a serious audit, regardless of stack or domain.

Step 1: Scope the audit

Scoping decides what gets read and what gets ignored.

A team cannot audit a monorepo with 600,000 lines of code and forty services in a week. The team defines the scope by identifying which services handle user data or money, which components have changed recently, and which areas the team has not audited before. A scoping document should also capture the stack and dependencies. Auditors without this context lose time relearning what the team already knows. Sharing architecture diagrams, threat models, and recent incidents keeps the audit focused on meaningful findings.
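In practice the scoping document can be lightweight. A minimal sketch of one, with entirely hypothetical service names and context fields:

```yaml
# audit-scope.yaml -- hypothetical scoping document
in_scope:
  - service: payments-api     # handles money
  - service: auth-service     # issues sessions and tokens
  - service: upload-worker    # large recent refactor, never audited
out_of_scope:
  - service: marketing-site   # static content, no user data
context:
  stack: [python, postgres, redis]
  docs: [architecture-diagram, threat-model]
  recent_incidents: []        # link incident reports here if any exist
```

The exact format matters less than the act of writing it down before the first file is opened.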

Step 2: Run automated discovery

Static analysis tools, secret scanners, and software composition analysis tools run first. Their job is to surface the easy findings quickly so that human attention can focus where it matters.

SAST engines flag injection patterns and unsafe API calls. Secret scanners find hardcoded credentials. SCA tools list every dependency and cross-reference it against known CVEs. Together, they run in a few hours and produce an initial set of findings.

These tools also produce noise. A typical SAST run on a medium-sized codebase will return hundreds of warnings, many of which are duplicates, false positives, or legitimate but low-priority findings. Treat the output as a starting point, not a verdict.
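Much of that noise can be collapsed before a human reads it, by fingerprinting findings and sorting the survivors by severity. A minimal sketch in Python; the finding fields here are hypothetical, and real scanner output formats such as SARIF differ:

```python
# Deduplicate raw scanner findings by a (rule, file, line) fingerprint,
# then return the survivors with the highest severity first.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def dedupe(findings):
    best = {}
    for f in findings:
        key = (f["rule"], f["file"], f["line"])
        # Keep the highest-severity copy of each fingerprint.
        if key not in best or SEVERITY_RANK[f["severity"]] < SEVERITY_RANK[best[key]["severity"]]:
            best[key] = f
    return sorted(best.values(), key=lambda f: SEVERITY_RANK[f["severity"]])

raw = [
    {"rule": "sql-injection", "file": "db.py", "line": 42, "severity": "high"},
    {"rule": "sql-injection", "file": "db.py", "line": 42, "severity": "high"},  # duplicate
    {"rule": "weak-hash", "file": "auth.py", "line": 7, "severity": "medium"},
]
print(dedupe(raw))  # two findings remain, sql-injection first
```

Even a crude pass like this can cut the triage queue substantially before manual review begins.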

Step 3: Trace sources and sinks

Most real findings come from tracing sources to sinks.

Auditors identify where untrusted data enters the application and trace it across functions and files until it reaches a point where the system uses it in a way that could cause harm. Common sources include URL parameters, headers, request bodies, file uploads, and webhooks. Common sinks include SQL queries, shell commands, file operations, HTML rendering, and deserialization.

How this looks in practice (source → sink):

$file = $_GET['file'];                // source
$path = "/var/www/uploads/" . $file;  // transformation
echo file_get_contents($path);        // sink

In this example the hardcoded prefix looks safe, yet an input like "../../etc/passwd" escapes it, producing a path traversal vulnerability in production.

Auditors then verify whether every path includes proper validation, filtering, or escaping. This work is slow and cannot rely on tools alone. Data often moves across multiple files, and only a human or a well-instructed AI tool can confirm that validation holds across every step.
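The usual fix is to resolve the final path and confirm it stays inside the allowed directory. A hedged sketch of that check, in Python for illustration; the function name and directory layout are assumptions, not from any specific codebase:

```python
import os

def safe_read(root: str, filename: str) -> bytes:
    """Read a file under root, rejecting path traversal."""
    root = os.path.realpath(root)
    # Resolve ".." segments and symlinks BEFORE the containment check.
    path = os.path.realpath(os.path.join(root, filename))
    # Reject anything whose resolved path escapes the allowed directory.
    if os.path.commonpath([root, path]) != root:
        raise ValueError(f"path traversal attempt: {filename!r}")
    with open(path, "rb") as f:
        return f.read()
```

The order matters: validating the raw string and then resolving it leaves the same hole open, which is exactly the kind of subtlety a source-to-sink trace is meant to catch.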

Step 4: Review dependencies and configuration

Modern applications carry hundreds of third-party packages. Auditors examine three aspects: the versions in use, any known CVEs affecting them, and whether the application uses unsafe methods from those libraries.

The third question matters more than most teams realize. A library can be CVE-free and still ship a function that runs raw shell commands the moment a developer calls it. Configuration files get the same treatment. Environment variables, infrastructure-as-code templates, container manifests, and CI pipeline definitions all hold secrets and access controls that can undo every protection in the application code itself.
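Checking for unsafe method usage does not require a CVE database; a lightweight pass over the syntax tree can flag risky call patterns for manual review. A sketch using Python's standard ast module, with a deliberately small, illustrative rule list:

```python
import ast

# Call names an auditor might flag for manual review (illustrative subset).
RISKY_CALLS = {"eval", "exec", "yaml.load", "pickle.loads", "os.system"}

def call_name(node: ast.Call) -> str:
    """Render a call's target as 'name' or 'module.attr', else ''."""
    f = node.func
    if isinstance(f, ast.Name):
        return f.id
    if isinstance(f, ast.Attribute) and isinstance(f.value, ast.Name):
        return f"{f.value.id}.{f.attr}"
    return ""

def find_risky_calls(source: str):
    # Walk every node, keep calls whose name is on the watch list.
    return sorted(
        (n.lineno, call_name(n))
        for n in ast.walk(ast.parse(source))
        if isinstance(n, ast.Call) and call_name(n) in RISKY_CALLS
    )

sample = "import os\nos.system(cmd)\ndata = eval(payload)\n"
print(find_risky_calls(sample))  # [(2, 'os.system'), (3, 'eval')]
```

A hit from a pass like this is not a finding by itself; it is a pointer telling the auditor which call sites deserve a closer look.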

Step 5: Document findings and prioritize

The auditor documents each finding with a description, affected files, a reproduction path, a severity rating, and a recommended fix. Severity is the most useful field for the receiving team, because it dictates whether a finding interrupts the current sprint as a hotfix or joins the backlog alongside other tech debt. Most teams standardize this using frameworks like CVSS or simple High/Medium/Low classifications to ensure consistent prioritization across findings.
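The severity field maps directly to a routing decision. A minimal sketch of that triage rule; the field names and the threshold are a team's choice, not a standard:

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    title: str
    files: list
    severity: str  # "high" | "medium" | "low"
    fix: str

def route(finding: Finding) -> str:
    # High-severity findings interrupt the sprint; the rest join the backlog.
    return "hotfix" if finding.severity == "high" else "backlog"

f = Finding("SQL injection in search", ["api/search.py"], "high", "use parameterized queries")
print(route(f))  # hotfix
```

Teams using CVSS would replace the string field with a numeric score and a cutoff, but the routing decision stays the same.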

Step 6: Verify the fixes

A code audit is not complete when the report is delivered. The auditor or the engineering team verifies each fix in a follow-up pass, either through targeted re-review or by re-running the relevant tools against the patched code. Fixes that look right at first glance can introduce new issues, especially when developers patch a sink without understanding the source that feeds it.

What auditors look for

Findings cluster around a small number of recurring categories. The table below covers the ones that appear in nearly every audit.

Vulnerability class | What it looks like in code | Why it matters
Injection flaws | String concatenation in SQL queries, shell commands, or LDAP filters | Lets an attacker run arbitrary queries or commands on the host infrastructure
Broken authentication | Weak session handling, missing MFA paths, predictable tokens | Account takeover and lateral movement across user data
Insecure direct object references | API endpoints that trust user-supplied IDs without ownership checks | Users can read or modify records that belong to other accounts
Path traversal | File operations that accept unvalidated path segments from input | Attackers read or write files anywhere the application has access
Hardcoded secrets | API keys, database passwords, or signing keys committed to the repo | Anyone with repo access, including past contributors, holds production credentials
Vulnerable dependencies | Outdated packages with public CVEs, or libraries that expose unsafe methods | Known exploits become available the moment the CVE publishes
Unsafe deserialization | Loading untrusted data into pickle, YAML, or Java serialization | Remote code execution with no further input from the attacker
Improper error handling | Stack traces returned to clients, sensitive data in logs | Information leakage that helps an attacker map the system

The security code review checklist walks through each of the categories above with concrete examples and the questions a reviewer should ask before approving a PR that touches them.
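Insecure direct object references deserve a concrete sketch, since the fix is usually a single ownership check. The data model below is hypothetical:

```python
# Hypothetical record store: document id -> owner and contents.
DOCUMENTS = {
    "doc-1": {"owner": "alice", "body": "Q3 plan"},
    "doc-2": {"owner": "bob", "body": "salary data"},
}

def get_document(doc_id: str, current_user: str) -> str:
    doc = DOCUMENTS.get(doc_id)
    if doc is None:
        raise KeyError(doc_id)
    # The IDOR fix: verify ownership, never trust the client-supplied id alone.
    if doc["owner"] != current_user:
        raise PermissionError(f"{current_user} does not own {doc_id}")
    return doc["body"]

print(get_document("doc-1", "alice"))  # Q3 plan
```

An endpoint missing that one conditional passes every functional test and still leaks other users' records, which is why audits check authorization on every object lookup rather than sampling a few.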

When to run a code audit

Audits cost time, so most teams run them at specific points in the product lifecycle rather than on a fixed monthly cadence. The triggers below cover the cases where an audit pays for itself.

  • Before a major release that changes how user data is handled, stored, or transmitted
  • After a security incident, to find related issues that the original investigation missed
  • When entering a regulated market that requires SOC 2, HIPAA, PCI DSS, or ISO 27001 evidence
  • After a large engineering reorganization, when ownership of critical modules has shifted
  • Before onboarding a major enterprise customer who will run their own security review
  • When integrating a recently acquired codebase into the main product
  • At least once a year for any application that handles payments, identity, or sensitive personal data

Some teams also run lighter, narrower audits whenever a developer flags something that feels wrong. These targeted audits cost a fraction of a full pass and catch issues that would otherwise sit until the next scheduled review.

Where most code audits go wrong

The mechanics of an audit are well understood. When audits fail to deliver value, the cause is usually organizational rather than technical.

Scope creep is the first failure mode. An audit that starts focused expands into the entire backend, runs out of time, and produces a shallow report. Locking the scope at the start prevents this.

The report-and-forget problem comes next. The team fixes the top findings and leaves the rest buried in a Notion page no one revisits. Audits deliver results only when teams track findings with clear ownership and deadlines.

Over-reliance on tooling is the third. SAST and SCA scanners are fast but cannot reason about business logic or chained vulnerabilities. Audits that depend only on tools and skip manual verification overlook the issues that matter most.

How AI is reshaping code auditing

Over the past two years, the gap between continuous code review and scheduled audits has narrowed.

AI code review tools now read PRs with the kind of cross-file context that used to require a human auditor and a free afternoon. They follow function calls across the codebase, recognize injection patterns in unfamiliar frameworks, and flag dependency upgrades that introduce risky new methods. The findings that used to wait for the next scheduled audit start showing up at PR time, when the developer who wrote the code still has the change loaded in their head.

Scheduled audits are still necessary alongside AI code review. A senior security engineer conducting a focused, scoped audit still finds issues that automated systems miss, especially in authorization logic and business rules. What changes is the baseline. Teams using context-aware AI review walk into their next audit with far fewer easy findings, which means the audit budget gets spent on the harder questions.

Refacto reviews every PR with full codebase context, catching the same classes of issues a code audit would identify before they reach production. See Refacto on your next PR →

Refacto traces data flows across files, understands the authentication and authorization patterns in your code, and flags the exact source-to-sink paths a manual auditor would investigate. The best AI code review tools comparison highlights how Refacto and similar tools handle context and depth differently.

Where to take this next

Code auditing integrates with the engineering workflow as a deeper form of code review. The same skills apply: checking input handling, authorization, and dependencies. What changes is how much code each pass covers and how deeply each issue gets examined.

Done well, an audit comes down to three things: a tight scope, the depth to trace data across files, and a verification pass that confirms each fix holds.

For teams new to audits, start small. Begin with the service that handles the most sensitive data, clearly define the audit scope around it, and carry out the review. The findings will reveal more about your security posture than any framework. A code review checklist is a good starting point before involving a dedicated auditor.

Teams that build security into their daily review process give every audit a higher ceiling. The audit stops being a panic event and becomes a periodic check on a system that already takes security seriously.

Catch security issues at PR time, not in your next audit. Refacto reviews your code with the full repository context. Start your free trial!