Security Code Review Checklist: What to Check in Every PR

Pull Requests (PRs) get reviewed for logic, readability, and style. Security? That part usually gets a quick scroll and a mental "looks fine." The problem is that attackers don't care about your variable names or your import order. They care about the SQL query on line 214 that concatenates user input, the session token that never expires, and the API key sitting in a config file committed three sprints ago.

This guide breaks down the specific security checks that belong in every PR review, organized by attack surface, with code examples and patterns that engineering teams can start using immediately. For a broader review framework beyond security, the complete code review checklist for engineering teams covers everything from logic to performance.

What a security code review covers

A security code review is a targeted review of source code changes that focuses on identifying vulnerabilities, unsafe data handling, and weak access controls before the changes are deployed to production.

The OWASP Top 10 acts as a reference framework, covering common vulnerability categories such as broken access control, injection flaws, cryptographic failures, security misconfigurations, and others that frequently occur in web applications.

The key difference from a standard code review is the reviewer's mindset. A standard review asks: Does this work? A security review asks: how could this be abused?

Static analysis tools detect known vulnerability patterns at scale, but they do not identify business logic flaws, incorrect authorization flows, or context-specific risks that require human understanding. AI-powered code review tools like those covered in the best AI code review tools guide add another layer by analysing the full PR context for patterns that static rules miss. The strongest security review process combines all three: static analysis, AI-assisted review, and human judgment.

1/ Input validation and injection prevention

Every piece of data that enters an application from the outside is a potential attack vector: user input, query parameters, HTTP headers, file uploads, webhook payloads. Injection attacks remain among the most commonly exploited vulnerability classes because applications trust input that should never be trusted.

SQL injection

SQL injection happens when user-supplied values get concatenated directly into database queries. The fix is straightforward: parameterised queries or ORM-level query builders that separate data from SQL statements.

Here's what that looks like in practice:

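# UNSAFE - user input concatenated into query
query = f"SELECT * FROM users WHERE email = '{email}'"
cursor.execute(query)

# SAFE - parameterised query; the driver binds the value as data, never as SQL
# (%s is the placeholder style of drivers such as psycopg2; sqlite3 uses ?)
cursor.execute("SELECT * FROM users WHERE email = %s", (email,))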

Reviewers should flag any raw string concatenation that builds a SQL query, regardless of language. The same principle applies to NoSQL databases: MongoDB's $where operator, for example, accepts arbitrary JavaScript and should be treated with the same caution as raw SQL.
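
To see the difference concretely, the sketch below runs both patterns against an in-memory SQLite database. It is a minimal illustration under stated assumptions: the users table, the sample rows, and the helper names make_db, find_user_unsafe, and find_user_safe are hypothetical, and sqlite3 uses ? placeholders where the snippet above uses %s.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two sample users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("alice@example.com",), ("bob@example.com",)],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, email: str) -> list:
    # UNSAFE: the input becomes part of the SQL text itself.
    query = f"SELECT email FROM users WHERE email = '{email}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, email: str) -> list:
    # SAFE: the input is bound as a value and is never parsed as SQL.
    return conn.execute(
        "SELECT email FROM users WHERE email = ?", (email,)
    ).fetchall()


payload = "nobody' OR '1'='1"
conn = make_db()
unsafe_rows = find_user_unsafe(conn, payload)  # matches every row
safe_rows = find_user_safe(conn, payload)      # matches nothing
```

With this payload the concatenated query matches every row in the table, while the parameterised query treats the whole payload as a literal email address and matches nothing.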

XSS, command injection, and path traversal

Cross-site scripting (XSS) exploits missing output encoding. When an application renders user-supplied data in HTML without escaping it, attackers inject scripts that execute in other users' browsers. Every user-controlled value displayed on a page should pass through a context-appropriate encoding function.
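
For values rendered into HTML text, output encoding can be a single function call. The sketch below uses html.escape from Python's standard library; render_comment is a hypothetical helper, and the approach assumes the value lands in HTML body text or a quoted attribute. Values placed inside JavaScript, CSS, or URLs need encoders specific to those contexts, and a template engine with auto-escaping enabled is usually the safer default.

```python
import html


def render_comment(author: str, comment: str) -> str:
    """Build an HTML fragment with every user-controlled value escaped."""
    safe_author = html.escape(author, quote=True)
    safe_comment = html.escape(comment, quote=True)
    return f"<p><strong>{safe_author}</strong>: {safe_comment}</p>"


rendered = render_comment("mallory", "<script>alert(1)</script>")
```

The script tag arrives in the browser as inert text (&lt;script&gt;...) instead of executable markup.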

Command injection and path traversal follow the same underlying pattern: untrusted input reaches a shell, an interpreter, or the filesystem without validation. Functions like os.system(), exec(), or child_process.exec() should never receive unsanitised user input. If a user controls part of a file path, an attacker can access files outside the intended directory using sequences like ../../etc/passwd.
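
Both problems have small, reviewable fixes. The sketch below is one possible shape, not a definitive implementation: resolve_inside and echo_argument are hypothetical helper names, the first rejects any path that resolves outside an allowed root, and the second passes user input to a child process as a single argument with no shell involved.

```python
import subprocess
import sys
from pathlib import Path


def resolve_inside(root: Path, user_supplied_name: str) -> Path:
    """Return the resolved path, or raise ValueError if it leaves root."""
    root = root.resolve()
    candidate = (root / user_supplied_name).resolve()
    try:
        # relative_to raises ValueError when candidate is not under root.
        candidate.relative_to(root)
    except ValueError:
        raise ValueError("path escapes the allowed directory") from None
    return candidate


def echo_argument(user_value: str) -> str:
    """Pass user input as one argv element; no shell ever parses it."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")
```

Because the command is a list and no shell is used, metacharacters such as ; or | in the input are passed through as plain text. In review, the equivalent red flags are shell=True and command strings assembled with concatenation.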

Reviewer shortcut: search the PR diff for os.system, exec(, child_process, and f-string patterns touching database queries. These are high-signal indicators of injection risk.
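
That search can be scripted. The sketch below is a rough heuristic, assuming a unified diff as input; scan_diff and the pattern labels are hypothetical names, and the regular expressions will produce both false positives and false negatives, so the output is a prompt for closer reading and not a verdict.

```python
import re

# Heuristic patterns for the indicators listed above.
RISKY_PATTERNS = {
    "os.system call": re.compile(r"\bos\.system\s*\("),
    "exec call": re.compile(r"\bexec\s*\("),
    "child_process usage": re.compile(r"\bchild_process\b"),
    "f-string SQL": re.compile(
        r"""f["'].*\b(?:SELECT|INSERT|UPDATE|DELETE)\b.*\{""", re.IGNORECASE
    ),
}


def scan_diff(diff_text: str) -> list:
    """Return (diff_line_number, label, code) for each risky added line."""
    findings = []
    for line_no, line in enumerate(diff_text.splitlines(), start=1):
        # Only inspect added lines; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, label, line[1:].strip()))
    return findings
```

Removed lines and unchanged context are ignored on purpose, so the findings point only at code the PR introduces.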

2/ Authentication and session management

Authentication is the front door of any application, and session management keeps that door locked after a user walks through it. A weakness in either one gives attackers direct access to user accounts, sensitive data, and elevated privileges. Every PR that touches login flows, session handling, or password storage deserves scrutiny.

Here's what reviewers should verify:

Strong password hashing: Passwords stored with bcrypt, scrypt, or Argon2 using per-user salts. MD5 and SHA-1 are broken for password storage and should trigger an immediate rejection.
Session token entropy: Session IDs generated with cryptographically secure random number generators, producing at least 128 bits of entropy.
Session invalidation: Tokens invalidated on logout, password change, and after a configurable idle timeout.
Re-authentication for sensitive operations: Password changes, payment submissions, and account deletion should require the user to re-enter credentials, even during an active session.
Secure cookie attributes: Session cookies set with HttpOnly (blocks script access, limiting theft through XSS), Secure (sent only over HTTPS), and SameSite (restricts cross-site sending, mitigating CSRF).
Multi-factor authentication: MFA implemented for high-risk accounts and administrative access.
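
The first two items can be sketched with the Python standard library alone. This is an illustration under stated assumptions: hash_password, verify_password, and new_session_token are hypothetical helpers, the scrypt cost parameters are example values that should be tuned for the deployment, and a maintained password-hashing library for bcrypt or Argon2 is an equally valid choice.

```python
import hashlib
import hmac
import secrets

# Illustrative scrypt cost settings (assumption); tune for your hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1}


def hash_password(password: str):
    """Return (salt, digest) using a fresh random salt for each user."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, dklen=32, **SCRYPT_PARAMS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare in constant time."""
    candidate = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, dklen=32, **SCRYPT_PARAMS
    )
    return hmac.compare_digest(candidate, expected)


def new_session_token() -> str:
    """Generate a session ID from 32 random bytes (256 bits) via the OS CSPRNG."""
    return secrets.token_urlsafe(32)
```

In review, the things to look for are the same ones the sketch shows: a salt generated per password, a slow hash designed for passwords, a constant-time comparison, and session IDs that come from the secrets module (or its equivalent) and never from a general-purpose random generator.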

Stale sessions left active are a common finding in penetration tests, and they're easy to miss during code review because the vulnerability is in what the code doesn't do.
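
One way to make the missing behaviour visible is to check that the session layer exposes all three invalidation paths. The sketch below is a hypothetical in-memory SessionStore, assuming a 15-minute idle timeout by default; a real application would back this with a shared store, but the methods a reviewer should expect to find are the same.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class SessionStore:
    """In-memory sessions with logout, idle timeout, and per-user revocation."""

    def __init__(self, idle_timeout: float = 900.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._idle_timeout = idle_timeout
        self._clock = clock
        # token -> (user_id, time the session was last used)
        self._sessions: Dict[str, Tuple[str, float]] = {}

    def create(self, user_id: str) -> str:
        token = secrets.token_urlsafe(32)
        self._sessions[token] = (user_id, self._clock())
        return token

    def get_user(self, token: str) -> Optional[str]:
        """Return the user for a live session, expiring idle ones."""
        entry = self._sessions.get(token)
        if entry is None:
            return None
        user_id, last_seen = entry
        if self._clock() - last_seen > self._idle_timeout:
            del self._sessions[token]  # idle too long: invalidate
            return None
        self._sessions[token] = (user_id, self._clock())
        return user_id

    def revoke(self, token: str) -> None:
        """Call on logout."""
        self._sessions.pop(token, None)

    def revoke_all_for_user(self, user_id: str) -> None:
        """Call on password change so every existing session dies."""
        stale = [t for t, (uid, _) in self._sessions.items() if uid == user_id]
        for token in stale:
            del self._sessions[token]
```

If a PR adds a logout endpoint or a password-change flow and nothing in the diff calls the equivalent of revoke or revoke_all_for_user, that absence is the finding.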

3/ Authorization and access control

Broken access control has held the top position in the OWASP Top 10 since the 2021 edition. The pattern behind this ranking is consistent: applications verify that a user is logged in, but they fail to verify whether that user should be allowed to perform the specific action they're requesting.

Consider a common scenario. A user sends a request to /api/invoices/1042 and gets their own invoice. Then they change the URL to /api/invoices/1043 and receive a different customer's invoice. The application authenticated the user, but it never checked whether that user owned invoice 1043. This is an insecure direct object reference (IDOR), and it is a recurring finding in penetration tests of B2B SaaS applications.

Reviewers should verify three things on every endpoint that returns or modifies data:

The server enforces ownership or role-based access checks on every request. Client-side guards protect nothing.
Routes default to deny. Access opens only through explicit grants to the roles and users that need it.
Privilege escalation paths are blocked. A standard user should never reach admin functionality through a missing permission check or an exposed internal API.
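
Applied to the invoice scenario, the server-side check might look like the sketch below. Everything here is hypothetical and framework-free for illustration: the User and Invoice types, the in-memory INVOICES table, the billing_admin role, and the get_invoice function stand in for whatever the application actually uses.

```python
from dataclasses import dataclass


class NotFound(Exception):
    pass


class Forbidden(Exception):
    pass


@dataclass(frozen=True)
class User:
    id: int
    role: str


@dataclass(frozen=True)
class Invoice:
    id: int
    owner_id: int
    total_cents: int


# Hypothetical data store standing in for the database.
INVOICES = {
    1042: Invoice(id=1042, owner_id=7, total_cents=12500),
    1043: Invoice(id=1043, owner_id=8, total_cents=9900),
}


def get_invoice(current_user: User, invoice_id: int) -> Invoice:
    """Return the invoice only if the caller owns it or holds an explicit role."""
    invoice = INVOICES.get(invoice_id)
    if invoice is None:
        raise NotFound(invoice_id)
    # Default deny: access opens only through an explicit grant.
    is_owner = invoice.owner_id == current_user.id
    is_billing_admin = current_user.role == "billing_admin"
    if not (is_owner or is_billing_admin):
        raise Forbidden(invoice_id)
    return invoice
```

The check runs on every call and sits next to the data access, so a new endpoint cannot reach the invoice without passing through it. Some teams return a not-found response for forbidden records as well, so that the API does not confirm which IDs exist.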

4/ Secrets and sensitive data handling

Hardcoded credentials are one of the easiest vulnerabilities for an attacker to exploit and one of the easiest for a reviewer to catch. API keys, database passwords, JWT signing secrets, and third-party tokens should not be included in application code, configuration files committed to version control, or CI/CD pipeline definitions stored in the repository.

What reviewers find	Why it matters	What to do instead
API key hardcoded in source	Anyone with repo access, including former employees, can extract the key	Store in a secrets manager (Vault, AWS Secrets Manager) or environment variables
Passwords logged in plaintext	Credentials become visible in log aggregation tools and to anyone with log access	Mask or omit sensitive fields before writing to logs
Encryption key committed to repo	A leaked key means full data compromise across every environment that shares it	Use a KMS with automatic rotation policies
Data transmitted over HTTP	Man-in-the-middle attacks can intercept credentials and user data in transit	Enforce TLS on all connections and set HSTS headers

Beyond the table above, reviewers should confirm that sensitive data at rest uses strong encryption (AES-256 or equivalent) and that the encryption keys rotate on a defined schedule. PII, payment data, and health records are subject to additional regulatory requirements under GDPR, PCI DSS, and HIPAA, which directly impact how code manages storage, access logging, and deletion.
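
For the environment-variable route, a small loader keeps secrets out of the source and fails fast when one is missing. This is a minimal sketch: load_secret, MissingSecretError, and the variable name DB_PASSWORD are hypothetical, and a dedicated secrets manager client would replace the environment lookup in many deployments.

```python
import os
from typing import Mapping


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent or empty."""


def load_secret(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Read a secret from the environment, refusing to start without it."""
    value = environ.get(name, "").strip()
    if not value:
        # The message names the variable but never includes a value.
        raise MissingSecretError(f"required secret {name!r} is not set")
    return value


# Typical use at startup (DB_PASSWORD is a hypothetical variable name):
# db_password = load_secret("DB_PASSWORD")
```

Failing at startup is a deliberate choice: a silent fallback to a default or empty credential tends to hide misconfiguration until it reaches production.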

5/ Dependency and supply chain risks

Modern applications incorporate hundreds of third-party packages, each of which carries its own vulnerability surface. Supply chain attacks have evolved from a theoretical risk to a documented reality, with high-profile incidents affecting npm, PyPI, and Maven registries in recent years. A single compromised dependency can introduce backdoors, credential theft, or data exfiltration into an otherwise secure codebase.

During a security code review, these four specific checks cover the majority of dependency risk:

Known CVEs: Run dependency audit tools (npm audit, pip-audit, Snyk) against the lock file. Flag any dependency with a critical or high-severity CVE that lacks a patch.
Unmaintained packages: A library with no release in over 18 months may no longer be actively maintained, which means newly discovered vulnerabilities in it are unlikely to be patched.
Lock file integrity: Verify that package-lock.json, yarn.lock, or equivalent files are committed and consistent. If a lock file is missing or altered, unexpected package versions can be introduced into the build.
Dependency confusion: Confirm that internal packages use scoped names or private registries to prevent attackers from publishing malicious packages with the same name on public registries.
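
Projects without a lock file can get part of the same protection by pinning exact versions. The sketch below is a heuristic for pip-style requirements files, assuming one requirement per line; unpinned_requirements is a hypothetical helper, it does not understand every requirement syntax (URLs and editable installs, for example), and it complements an audit tool without replacing one.

```python
import re

# A requirement counts as pinned when it uses "name==version".
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned_requirements(requirements_text: str) -> list:
    """Return requirement lines that do not pin an exact version."""
    problems = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        # Skip blanks and pip options such as "-r other.txt" or "--hash=...".
        if not line or line.startswith("-"):
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems
```

Run against the file in the PR, an empty result means every listed package resolves to one known version; anything returned is a line where the build could silently pick up a different release.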

For a deeper understanding of how automated tools identify these risks, read the upcoming guide on how AI code review detects dependency vulnerabilities, which walks through the detection workflow in detail.

6/ Error handling and security logging

Verbose error messages are a gift to attackers. A stack trace that reveals internal file paths, database table names, or framework versions gives an attacker a detailed map of the application's internals. Production error responses should return generic messages to end users while logging full diagnostic details to internal monitoring systems.

Here's the difference in practice:

# UNSAFE - leaks internal details to the user
try:
    charge_customer(order)  # hypothetical payment call
except Exception as e:
    return jsonify({"error": str(e), "trace": traceback.format_exc()}), 500

# SAFE - generic response, detailed internal log
try:
    charge_customer(order)  # hypothetical payment call
except Exception:
    logger.exception("Payment processing failed")  # logs the full traceback internally
    return jsonify({"error": "Something went wrong. Please try again."}), 500

Logging itself introduces security concerns when handled carelessly. Passwords, API tokens, session IDs, and PII should never appear in log output. Log injection attacks, where an attacker inserts newline characters or control sequences into logged fields, can corrupt audit trails and mislead incident responders.
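
A common defence against log injection is to neutralise control characters before a user-controlled value reaches the logger. The sketch below is one minimal approach: sanitize_for_log is a hypothetical helper, it covers ASCII control characters only, and the 200-character cap is an arbitrary example value. Structured logging formats such as JSON lines achieve a similar effect by escaping values during serialisation.

```python
import re

# ASCII control characters, including carriage return and newline.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def sanitize_for_log(value: str, max_length: int = 200) -> str:
    """Escape control characters and cap length before logging a value."""

    def _escape(match):
        return "\\x{:02x}".format(ord(match.group()))

    cleaned = _CONTROL_CHARS.sub(_escape, value)
    if len(cleaned) > max_length:
        cleaned = cleaned[:max_length] + "...[truncated]"
    return cleaned


forged = "bob\n2024-01-01 INFO login succeeded for admin"
safe_value = sanitize_for_log(forged)
```

The forged second line stays on the same log line as visible escaped text, so an incident responder sees the attempt instead of a fake audit entry.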

What to log: authentication events (successful logins, failed attempts, logouts, password resets) with timestamps, user identifiers, and source IPs; authorization denials with the user, the resource, and the reason for the block; and input validation failures that could indicate probing or scanning.

What to never log: passwords, tokens, API keys, session IDs, full credit card numbers, Social Security numbers, or raw request bodies that may contain sensitive user data.
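
A redaction step in front of the logger makes the never-log list enforceable in code. The sketch below is illustrative only: redact is a hypothetical helper, the key names in SENSITIVE_KEYS are assumptions that should be replaced with the field names the application really uses, and matching by key name will miss secrets embedded in free-text values.

```python
# Field names treated as sensitive (assumption; adapt to your schema).
SENSITIVE_KEYS = {
    "password", "token", "api_key", "session_id", "card_number", "ssn",
}


def redact(payload):
    """Return a copy of payload with sensitive fields masked, recursively."""
    if isinstance(payload, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS
            else redact(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [redact(item) for item in payload]
    return payload


event = {
    "user": "alice",
    "password": "hunter2",
    "payment": {"card_number": "4111111111111111", "amount": 20},
}
safe_event = redact(event)
```

The function returns a new structure and leaves the original untouched, so the application can keep using the real values while only the masked copy reaches the logs.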

How AI code review catches what reviewers miss

Human reviewers bring judgment to business logic, architecture, and context-specific risks. But they lose focus over time, especially with large or multiple PRs. Security checks are repetitive, but they also depend on context. Verifying queries, data flows, access control, and secrets requires understanding how changes interact across files.

Refacto reviews the full PR context to identify vulnerability patterns, exposed secrets, insecure data flows, and missing validation. It runs automatically and delivers inline comments with actionable fixes.

The most effective process combines both:

AI handles repetitive checks and surfaces contextual risks across the PR
Humans focus on deeper issues like business logic and architecture

With a consistent checklist and AI-assisted review, teams can catch vulnerabilities early without slowing down development.