GitHub Copilot code review now runs on more than one in five PR reviews on GitHub, with usage growing roughly 10× since its launch and surpassing 60 million total reviews. More than 12,000 organisations have it running automatically on every PR.

At that scale, something worth noticing has happened in the background. “Copilot code review” is no longer a single feature; it has evolved into a family of eight different surfaces spread across github.com, VS Code, the terminal, and GitHub Actions. 

Each surface catches slightly different things, fits a different moment in the developer workflow, and has its own setup path. The question engineering leaders are actually asking in 2026 is not whether to use Copilot code review. It is where it fits in a review process that still relies on people and still follows solid code review best practices.

This guide breaks down all eight surfaces, what they handle well, what they miss, and how they fit into a complete review workflow.

What counts as GitHub Copilot code review

GitHub Copilot code review is a family of AI-powered review features built into the Copilot suite, available across github.com, VS Code, Visual Studio, JetBrains IDEs, the Copilot CLI, and GitHub Actions. All surfaces share the same tuned model mix and post inline comments with one-click fixes, though none can approve a PR or satisfy branch protection rules.

A review can be triggered in several ways. A developer can assign Copilot as a reviewer from the PR sidebar. A repository can enable automatic review, so that every new PR gets a first pass the moment it opens. Someone can right-click a selection in VS Code and choose Review and Comment. A slash command in chat can run a prompt file. A terminal session can invoke the CLI. Each entry point feeds into the same review machinery, and each one consumes one premium request against the user's monthly Copilot allowance.

The output is consistent across surfaces. Copilot posts inline comments on logical multi-line ranges instead of single lines, provides one-click fixes, and groups recurring issues so that developers can resolve entire classes of problems in one action. One important detail that often surprises teams: Copilot code review always leaves a "Comment" review on a PR. It never leaves "Approve" or "Request changes." Branch protection rules that require approving reviews cannot be satisfied by Copilot. Treat it as a smart first pass, not as a merge gate.

The eight surfaces of Copilot code review

Most teams start with one surface, usually the github.com native review, and never discover the rest. The eight surfaces together form a layered toolkit that reaches into every corner of the developer workflow.

| # | Surface | Where it runs | What it is best at | Setup effort |
|---|---------|---------------|--------------------|--------------|
| 1 | Native review on github.com | Browser, on any PR | PR-level review at scale | Low |
| 2 | Review Selection in VS Code | Editor, on any selection | Pre-commit checks on uncommitted code | Low |
| 3 | Custom review instructions | Every surface, via config file | Enforcing team standards across all reviews | Low |
| 4 | Prompt files | VS Code chat, as slash commands | Repeatable review workflows | Medium |
| 5 | Custom agents | VS Code chat | Dedicated reviewer personas with tool limits | Medium |
| 6 | MCP-powered PR review | VS Code with GitHub MCP server | Deep PR analysis with live context | Medium |
| 7 | Coding agent handoff | GitHub Actions environment | Auto-implementing fixes for review findings | Low |
| 8 | Copilot CLI | Terminal | Pre-commit and server-side reviews | Low |

1/ Native review on github.com

This is the most visible surface and the one that shows up in GitHub's own public data. A developer assigns Copilot as a reviewer from the PR sidebar, or a repository admin turns on automatic review under Settings, and every new or updated PR receives a first pass the moment it opens. Copilot posts inline comments on the diff with suggested fixes that a reviewer can apply with one click. No changes are required in the editor, terminal, or local workflow. For teams looking for the simplest way to add an AI reviewer to every PR, this is the starting point.

2/ Review Selection inside VS Code

The github.com native review runs after a PR exists. Review Selection runs before. A developer selects a function, a class, or any block of code in the editor, right-clicks, chooses Copilot > Review and Comment, and Copilot posts its findings as inline comments directly in the file. A button in the Source Control panel reviews all uncommitted changes in one action. Configuration is defined in VS Code’s settings.json under github.copilot.chat.reviewSelection.instructions. It accepts inline rules or a path to a file, such as '.github/review-instructions.md', keeping review guidelines version-controlled. This is the surface for pre-PR polish, pair programming moments, and catching issues before they enter version control.
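A minimal sketch of what that configuration can look like. The value shape shown here, an array of objects with 'text' or 'file' keys, follows VS Code's general convention for Copilot instruction settings; check the documentation for your VS Code version before copying it:

```jsonc
// .vscode/settings.json (or user settings) — illustrative only
{
  "github.copilot.chat.reviewSelection.instructions": [
    // An inline rule applied to every Review and Comment run
    { "text": "Flag any missing error handling around I/O and network calls." },
    // A version-controlled rules file shared by the whole team
    { "file": ".github/review-instructions.md" }
  ]
}
```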

3/ Custom instructions for review standards

Custom instructions are how a team teaches Copilot its house rules. GitHub reads two files. The first is '.github/copilot-instructions.md', which applies to every Copilot interaction in the repository. The second is '.github/copilot-code-review-instructions.md', which applies specifically when Copilot is reviewing code. Both accept natural language rules such as security requirements, naming conventions, testing standards, error handling guidelines, and other practices your team already follows. Once the file exists, every Copilot review on that repository, whether it runs on github.com, in VS Code, or through the CLI, reads it and adjusts the feedback accordingly. This is the single highest-leverage configuration a team can make.
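The instructions file is plain markdown with no required schema. The rules below are illustrative examples of house rules a team might write, not GitHub-mandated syntax:

```markdown
<!-- .github/copilot-code-review-instructions.md — illustrative example -->
# Code review instructions

- Build all SQL with parameterised queries; flag any string concatenation
  into a query.
- Every exported function that performs I/O must handle and surface errors
  explicitly.
- Follow our naming conventions: snake_case in Python, camelCase in TypeScript.
- New behaviour requires a test; flag PRs that change logic without touching
  any test file.
```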

4/ Prompt files for repeatable review workflows

Prompt files live in '.github/prompts/*.prompt.md' and become named slash commands inside VS Code chat. A file called 'security-review.prompt.md' becomes the '/security-review' command. Each file defines a review workflow with a structured output format, so a security review consistently returns a table with severity, location, issue, risk, and fix. Common workflows also include performance, test coverage, accessibility, and infrastructure-as-code reviews. Over time, a team builds up a library of review plays, and every developer runs them the same way. The output is structured, consistent, and actionable.
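A sketch of what such a prompt file can look like. The 'description' frontmatter key is an assumption based on VS Code's prompt file convention; the rules and table columns are invented for illustration:

```markdown
---
description: "Security-focused review with a structured findings table"
---

Review the changes in the current context for security issues only.
Ignore style and formatting.

Report each finding as a row in this table:

| Severity | Location | Issue | Risk | Suggested fix |
```

Saved as '.github/prompts/security-review.prompt.md', it becomes available in chat as the '/security-review' command.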

5/ Custom agents for specialist reviewer personas

Custom agents take the prompt file idea one step further. An agent file in '.github/agents/*.agent.md' defines a dedicated reviewer persona with a specific system prompt, an allowed tool set, and a set of rules. A security-reviewer agent can be restricted to read-only tools so it cannot modify files during a review. An architecture-reviewer agent can focus only on module boundaries and cross-service contracts. Teams that want specialised review behaviours without polluting general Copilot interactions use agents to keep roles separate. A security agent never worries about variable naming, a style agent never flags architectural choices, and the feedback from each one stays focused.
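As a rough sketch, a read-only security reviewer agent could look like the following. The frontmatter keys and tool names here are assumptions based on the description above, not a confirmed schema; consult the current Copilot customisation documentation for the exact format:

```markdown
---
description: "Read-only security reviewer"
tools: ["search", "usages"]   # hypothetical read-only tool set, no edit capability
---

You are a security reviewer. Look only for injection, authentication,
secrets handling, and unsafe deserialisation issues. Do not comment on
naming, formatting, or architectural choices. Never modify files; report
findings as review comments only.
```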

6/ MCP-powered PR review from the editor

The Model Context Protocol (MCP) lets Copilot connect to external tools, and the GitHub MCP server is the one that matters for code review. It exposes tools for fetching PR metadata, retrieving the full diff, listing existing review comments, pulling in linked issues, and posting new review comments back to the PR. With the GitHub MCP server connected, a developer can run a full PR review from VS Code chat: fetch the diff and the linked issue context, analyse the changes against the team's standards, and post the findings back as inline comments on github.com. Without MCP, Copilot in VS Code only sees the files a developer has open. With MCP, it sees the whole PR.
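Connecting the GitHub MCP server from VS Code can be as small as a workspace config file. This is an illustrative sketch: the remote endpoint URL shown is the one GitHub documents for its hosted MCP server, but authentication (OAuth or a PAT) and the exact file location depend on your setup:

```jsonc
// .vscode/mcp.json — illustrative registration of GitHub's remote MCP server
{
  "servers": {
    "github": {
      "type": "http",
      "url": "https://api.githubcopilot.com/mcp/"
    }
  }
}
```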

7/ Coding agent handoff for review follow-up

The Copilot coding agent is not a reviewer. It is the thing that fixes what the reviewer finds. A developer creates a GitHub Issue with the review findings and assigns it to Copilot. The coding agent picks it up in a GitHub Actions environment, applies the fixes, and opens a stacked PR. The pattern is useful when a review flags a batch of mechanical issues and a team wants them resolved without manually implementing each fix. The agent cannot approve or merge anything, and every PR it creates is scanned by CodeQL and secret scanning before a human touches it.

8/ Copilot CLI for terminal-side review

The copilot command is a standalone terminal agent that replaced the older gh copilot extension. A developer runs it before committing to get a review of staged changes, or on a server with terminal-only access to review code that never reaches a local editor. The '--allow-tool' flag controls what a review session can access. For example, "--allow-tool='shell(git)'" grants read access to Git history, while "--allow-tool='shell(gh)'" allows it to fetch PR context. Common uses include pre-commit checks, quick sanity reviews on remote servers, and CI hooks that flag issues before a PR is opened.
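A pre-commit sketch of that workflow. This assumes the standalone Copilot CLI is installed and authenticated; the prompt flag shown is an assumption based on the CLI's non-interactive mode, so verify against `copilot --help` on your version before wiring it into a hook:

```shell
# Illustrative pre-commit check, not a confirmed invocation.
# Grants the session read access to Git so it can inspect the staged diff.
copilot -p "Review the staged diff for bugs and missing error handling" \
  --allow-tool 'shell(git)'
```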

What the whole family catches well

All surfaces share the same underlying model, so the strengths remain consistent whether a developer triggers a review from github.com, VS Code's Review Selection, or the CLI. The strongest use case across the board is a fast, high-signal first pass on routine changes.

  • Surfaces common logic bugs like assignment-versus-comparison errors, wrong null checks, and off-by-one mistakes
  • Flags missing error handling around I/O calls, network requests, and promise chains
  • Catches dependency mistakes in framework patterns, including missing values in a React useCallback or useEffect dependency array
  • Suggests readability improvements that line up with common language and framework conventions
  • Highlights dead code, unreachable branches, and obvious redundancy
  • Proposes small local refactors that tighten a function without changing its behaviour

The more telling number is what Copilot chooses not to say. GitHub reports that Copilot surfaces actionable feedback in about 71% of PR reviews on GitHub, while remaining silent in the other 29%. Across reviews where it gives feedback, the agent averages about 5.1 comments per review. A reviewer who comments on every PR trains developers to ignore it. One that stays quiet when it has nothing useful to add is one that developers actually read.

What the whole family misses

GitHub's own responsible-use documentation is refreshingly clear about where Copilot code review stops. It lists missed problems, false positives, inaccurately suggested code, and training-data bias as known issues. Independent testing sharpens the picture. The gaps apply across every surface because they are properties of the underlying review approach, not bugs in a specific entry point.

Security vulnerabilities are a documented weak spot

GitHub states directly that Copilot code review may not identify all security issues, may suggest code that contains vulnerabilities, and should not be relied on in place of dedicated security tooling. Independent testing produces sharper numbers. An academic study evaluating Copilot’s code review feature found that it often failed to detect critical vulnerabilities such as SQL injection and XSS, instead focusing on low-severity issues like style and formatting.

GitHub itself treats security fixes as a separate feature. Copilot Autofix for code scanning is a separate system from Copilot code review and operates on CodeQL alerts rather than the review prompt. A team relying on Copilot code review to catch injection attacks, insecure deserialisation, or broken authentication logic is relying on the wrong tool. A complete security code review checklist should run alongside it.

Cross-repository and architectural context

Every Copilot surface treats the current repository as its context boundary. The native GitHub review has access to the PR’s repository, along with the repo-level custom instructions file. VS Code Review Selection sees open files. MCP extends the reach through GitHub API calls, but only within repositories that Copilot can access. If a PR modifies a contract that is consumed by three other services living in three other repositories, Copilot will review the file on screen and miss the downstream blast radius. The comment covers the diff alone and misses the services that depend on it.

Monorepos hit a different flavour of the same problem. The full workspace context is often too broad for focused feedback, and package-level scoping is still developing. Workarounds exist, including more specific custom instruction files and Copilot Spaces, but each one adds setup and maintenance overhead that grows with the codebase.

Architectural intent, and "is this the right change at all?"

Copilot reviews the code in front of it. It does not review the decision behind the code. If a PR introduces a new caching layer because queries are slow, Copilot will review the cache implementation line by line. It will not ask whether the underlying problem is an N+1 query that should be fixed at the data access layer instead. This is the category where senior human reviewers remain genuinely irreplaceable. Copilot exists to free their attention for design questions, never to take those questions over.

Large or complex PRs

GitHub documents this limitation directly: review quality drops on large or complex changes. That aligns with how humans review code too. The practical takeaway concerns expectations. Teams should not expect the tool to rescue them from shipping 2,000-line PRs. Breaking changes into smaller, focused PRs is already standard advice, and it helps Copilot for the same reason it helps humans. A strong PR review process limits the scope of changes so that both AI and human reviewers can assess them properly.

Hallucinated issues and incorrectly suggested fixes

GitHub’s documentation highlights two failure modes that teams using this feature will eventually encounter. The first is the false positive, where Copilot flags an issue that is not actually there, often because the model misunderstood the surrounding code. The second is the bad suggested fix, where Copilot offers code that looks plausible but introduces a bug or a subtle vulnerability. Both are reminders that every Copilot comment is a suggestion, not a verdict.

Premium request limits and the high-volume tax

Every use of Copilot code review consumes one premium request against the user's monthly allowance, and premium requests are capped per user. Copilot Business users get 300 premium requests per month. Copilot Enterprise includes 1,000 premium requests per user per month, which are consumed by advanced features such as code review, chat, and agent workflows, with additional usage available on demand. Beyond the cap, the system switches to a base model with unlimited usage, but review quality is lower than with the premium model.

For an individual developer doing light review work, 300 requests a month is not a concern. For a 50-developer team with automatic review on every PR across a busy monorepo, the math adds up quickly. Engineering leaders running high-volume review workflows should model their expected usage before enabling automatic review org-wide, and should understand which plan tier matches their actual throughput. Falling back to the base model still works, though quality drops noticeably, and that drop adds up across hundreds of reviews each week.
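To make that budgeting concrete, here is a back-of-the-envelope sketch. Every throughput number below is an assumption to replace with your own figures; the only rule taken from above is that each review pass, including re-reviews on updated PRs, consumes one premium request:

```python
# Premium request budgeting sketch. All throughput numbers are assumptions.
DEVS = 50
PRS_PER_DEV_PER_WEEK = 4   # assumed team throughput
PASSES_PER_PR = 2          # assumes updated PRs trigger one re-review on average
WEEKS_PER_MONTH = 4

# One premium request per review pass, as described above.
reviews_per_month = DEVS * PRS_PER_DEV_PER_WEEK * PASSES_PER_PR * WEEKS_PER_MONTH

# Copilot Business allowance: 300 premium requests per user per month.
pooled_allowance = DEVS * 300

print(reviews_per_month)                     # 1600 review passes per month
print(reviews_per_month / pooled_allowance)  # ~11% of the pool on review alone
```

Under these assumptions automatic review alone takes roughly a tenth of the pooled allowance, before anyone spends requests on chat or agent workflows, which is why modelling real throughput first matters.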

A practical review stack for 2026

The cleanest way to think about Copilot code review is as one layer in a defence-in-depth review strategy. Different layers catch different categories of issues, and no single layer catches them all. The table below maps the major review concerns against the three layers most serious teams are now running in 2026.

| Review concern | Copilot code review (any surface) | Dedicated AI code review layer | Human reviewer |
|---|---|---|---|
| Style and common logic bugs | Strong | Strong | Wasteful use of time |
| Framework-specific patterns | Strong | Strong | Strong |
| Cross-file and cross-repo impact | Weak | Strong | Strong |
| Security vulnerabilities | Weak | Varies by tool | Depends on expertise |
| Architectural intent and design | Weak | Partial | Strong |
| Large PR comprehension | Degrades with size | Varies by tool | Degrades with size |
| Can block merge via branch protection | No | Varies by tool | Yes |

Three-layer setups are becoming the norm on teams that ship production software at scale. Copilot handles the fast first pass on style and obvious bugs across any of the eight surfaces that fit the team's workflow. A dedicated AI code review layer handles cross-file and cross-repo context, full-codebase reasoning, and the deeper security coverage Copilot was not built for. Human reviewers handle architectural intent, and the judgment calls no AI reviewer is ready to own. Each layer earns its place by doing something the others cannot do as well.

See what a full-codebase AI review layer adds alongside Copilot. Refacto reviews PRs with full repository context, flags the cross-file and cross-repo issues a single-repo reviewer cannot see, and runs independently of the premium request budget that throttles high-volume Copilot usage. Try it on your next PR for free!

Which Copilot surface fits which team

No team runs all eight surfaces on day one, and no team needs to. The surfaces stack in a natural order as a team matures its review workflow.

A team starting with AI code review should enable github.com native review. It requires no editor changes, applies to every PR through automatic review, and gives the rest of the team immediate visibility into the feedback. Once it is running, the next step is writing a '.github/copilot-code-review-instructions.md' file that captures the team's house rules. This single file shapes every Copilot review, on every surface, without any other configuration.

Developers who want faster feedback before opening a PR can use Review Selection in VS Code, or run the Copilot CLI for checks directly from the terminal before committing. These surfaces run on the developer's own machine and never hit the shared PR workflow. Teams with recurring review needs, such as security or performance checks, can create a small set of prompt files or custom agents to turn those reviews into one-command workflows. For deeper analysis of complex PRs, teams should set up the GitHub MCP server so that Copilot in VS Code can access PR context, linked issues, and existing comments in one place. Refacto's list of the best AI code review tools covers the broader landscape for teams evaluating what to run alongside Copilot.

How to think about Copilot code review in 2026

Copilot code review is a family of genuinely useful tools that handles the mechanical layer of code review better than most teams were handling it themselves a year ago. The eight surfaces give it reach into every corner of the developer workflow, from the moment a developer selects a function in VS Code to the moment a PR opens on github.com to the moment a terminal hook runs on a CI server. The agentic architecture is a real upgrade over the older version, and the design choice behind the 71/29 split, staying silent when there is nothing useful to say, is what keeps the feedback worth reading.

Teams run into trouble when they treat it as their entire review strategy. The gaps are not hidden. GitHub outlines them in its responsible use documentation, and independent testing further clarifies them. Security coverage is weak across every surface. Cross-repository context does not exist in any of them. Architectural intent sits beyond what any diff-level reviewer can reliably assess. Large PRs stretch the model past its useful range. And no Copilot surface can block a merge through branch protection, which means every one of them adds to a review process rather than replacing it.

The direction of travel is clear. Over the next twelve months, the tools that stand out will be those that go beyond the diff, understand the full codebase and service boundaries, and support human reviewers rather than replace them. A practical next step for any engineering lead evaluating Copilot code review is to take a real PR from a live repository, run it through the Copilot surface that fits the team’s workflow, and compare the output against a complete code review checklist. That exercise tends to settle the question of where Copilot fits, and where another layer needs to sit beside it, faster than any blog post can.

Add full-codebase code review to your GitHub workflow. Refacto works alongside Copilot in the same PR, with full codebase context to identify issues that diff-based reviews often miss. Start your free trial now!