Pull Request Best Practices: A Complete Guide for 2026

June 10, 2026 · 13 min read · by Amar Tripathi

A good pull request is small (under 400 changed lines), makes exactly one logical change, carries an imperative title under 72 characters, and includes a description covering what changed, why, and how it was tested. A good review of that pull request starts within a few business hours, arrives as a single batched round of comments with each one marked blocking or non-blocking, and ends in approve-with-comments unless something is genuinely broken. Anything mechanical (formatting, lint, first-pass defect detection) should be automated so humans spend their review minutes on design.

That paragraph is the whole guide in miniature. The rest of this post unpacks each piece with the numbers behind it, because almost every PR problem teams complain about (slow reviews, rubber stamping, week-long merge queues) traces back to ignoring one of those rules. It is written for both sides of the review: the first part is for authors, the second is for reviewers, and the last covers the automation and merge settings that make the human parts cheaper.

How big should a pull request be?

Keep pull requests under 400 changed lines, and aim much lower. The most cited data here comes from SmartBear's study of code review at Cisco Systems, which analyzed roughly 2,500 reviews and found that a reviewer's ability to find defects drops sharply once a review exceeds 200 to 400 lines of code. Past that point, reviewers stop reading and start skimming. The same study found defect discovery falls off when reviewers move faster than about 500 lines per hour, which is exactly what happens when you hand someone a 1,500 line diff at 4pm.

Google's internal data points the same direction from the other end. In the published case study of Google's code review practice, the median change is around 24 lines. Not 240. Small changes there typically get initial feedback in under an hour, which is why engineers keep making them small: the feedback loop rewards it.

The relationship between size and outcome looks roughly like this:

PR size (changed lines)	Typical review behavior	What you can expect
Under 50	Read line by line in one sitting	Fast pickup, real feedback, same-day merge
50 to 200	Read carefully with short breaks	Solid defect detection, 1 to 2 review rounds
200 to 400	Upper limit of effective review (SmartBear/Cisco)	Defect detection starts to decay
400 to 1,000	Skimmed, not read	"LGTM" with one nit on a variable name
Over 1,000	Approved on faith	Bugs ship, review was theater

If you are not sure where your diff lands, run it through a PR size calculator before you open it. Generated files, lockfiles, and snapshots inflate the raw number, so what matters is reviewable lines, not total lines.
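As a rough illustration of "reviewable lines, not total lines", here is a minimal Python sketch that sums the added and deleted lines reported by `git diff --numstat` while skipping generated files. The ignore patterns and the function name are assumptions for illustration, not part of any particular tool; adapt the list to your own repository.

```python
from fnmatch import fnmatch

# Hypothetical ignore list: lockfiles, snapshots, and build output.
IGNORED_PATTERNS = ["*.lock", "package-lock.json", "*.snap", "*.min.js", "dist/*"]


def reviewable_lines(numstat: str, ignored=IGNORED_PATTERNS) -> int:
    """Sum added + deleted lines from `git diff --numstat` output,
    skipping binary files and paths that match an ignore pattern."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # not a numstat row
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # numstat reports "-" for binary files
        name = path.rsplit("/", 1)[-1]
        if any(fnmatch(path, p) or fnmatch(name, p) for p in ignored):
            continue  # generated or vendored content
        total += int(added) + int(deleted)
    return total


if __name__ == "__main__":
    sample = (
        "10\t2\tsrc/app.py\n"
        "500\t480\tpackage-lock.json\n"
        "-\t-\tlogo.png\n"
        "30\t0\ttests/__snapshots__/ui.snap\n"
    )
    print(reviewable_lines(sample))
```

In practice you would feed it the output of something like `git diff --numstat main...HEAD` and compare the result against the 400-line ceiling.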

One logical change per PR

Size is a proxy for the real rule: one PR, one idea. A 300 line PR that renames a module is easy to review. A 120 line PR that fixes a bug, refactors an unrelated helper, and bumps two dependencies is not, because the reviewer has to mentally untangle three changes that could each break something different. If you find yourself writing "also" in the description, you probably have two PRs.

What about big features?

Use stacked PRs. Break the feature into a chain of dependent branches: PR 1 adds the schema migration, PR 2 adds the service layer targeting PR 1's branch, PR 3 adds the UI targeting PR 2's. Each PR stays reviewable on its own, reviewers see a coherent slice instead of a 2,000 line wall, and you can start collecting feedback on the foundation while you are still writing the top of the stack. Tools like Graphite, git-spice, and plain GitHub branch targeting all support this workflow. The discipline costs you 20 minutes of branch management and saves your reviewer half a day.
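The targeting rule for a stack is simple enough to write down. The sketch below pairs each branch with the branch its PR should target; the branch names and the `main` trunk are hypothetical, and tools like Graphite or git-spice do considerably more than this (restacking, syncing after merges).

```python
def stack_targets(branches, trunk="main"):
    """Pair each branch in a stack with the base its PR should target.

    The first branch targets the trunk; every later branch targets
    the branch directly below it in the stack.
    """
    bases = [trunk] + list(branches[:-1])
    return list(zip(branches, bases))


if __name__ == "__main__":
    # Hypothetical branch names for the three-PR example above.
    for head, base in stack_targets(["feat/schema", "feat/service", "feat/ui"]):
        print(f"{head} -> {base}")
```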

How do you write a good pull request title?

Write titles in the imperative mood, keep them under 72 characters, and use the Conventional Commits format if your team squash-merges (more on why below).

The pattern: type(scope): imperative summary

feat(auth): add session revocation on password change
fix(billing): handle Stripe webhook retries idempotently
refactor(api): extract rate limiter into shared package

Three rules make a title work:

Imperative mood. "Add retry logic", not "Added retry logic" or "Adding retry logic". The title should complete the sentence "If merged, this PR will...". This matches git's own convention for commit subjects.
Under 72 characters. GitHub truncates longer titles in PR lists, notification emails, and the commit log after squash merge. If you cannot summarize the change in 72 characters, the PR is probably doing too much. See the size section.
Conventional type prefix. feat, fix, refactor, chore, docs, test. This makes the PR list scannable, enables automated changelogs via tools like semantic-release, and tells the reviewer what posture to read with. A refactor that changes behavior is a red flag the prefix makes visible.

Do not rely on memory to enforce this. The amannn/action-semantic-pull-request GitHub Action validates PR titles against the Conventional Commits spec whenever a PR is opened, retitled, or updated (given the usual workflow triggers), and commitlint does the same for individual commits. Add the check to your branch protection rules as a required status check and the format becomes self-enforcing, with no style debates in review. You can sanity-check a title against these rules in a few seconds with a PR lint checker before you open the PR.
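To make the three title rules concrete, here is a minimal Python sketch of the kind of check such a linter performs. It is not the implementation of any of the tools named above. The regular expression covers only the simple `type(scope): summary` shape, the allowed types are the six listed earlier, and the imperative-mood test is a crude heuristic (it flags first words ending in "ed" or "ing", so a verb like "embed" would be a false positive).

```python
import re

ALLOWED_TYPES = ("feat", "fix", "refactor", "chore", "docs", "test")
# Treats exactly 72 characters as acceptable; tighten if you read
# "under 72" strictly.
MAX_TITLE_LENGTH = 72
TITLE_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?: (?P<summary>\S.*)$"
)
# Heuristic only: past tense and gerunds usually end this way.
NON_IMPERATIVE_ENDINGS = ("ed", "ing")


def check_title(title: str) -> list:
    """Return a list of problems with a PR title; empty means it passes."""
    problems = []
    if len(title) > MAX_TITLE_LENGTH:
        problems.append(f"longer than {MAX_TITLE_LENGTH} characters")
    match = TITLE_PATTERN.match(title)
    if match is None:
        problems.append("does not match 'type(scope): summary'")
        return problems
    if match["type"] not in ALLOWED_TYPES:
        problems.append(f"unknown type '{match['type']}'")
    first_word = match["summary"].split()[0].lower()
    if first_word.endswith(NON_IMPERATIVE_ENDINGS):
        problems.append(f"'{first_word}' does not look imperative")
    return problems


if __name__ == "__main__":
    print(check_title("feat(auth): add session revocation on password change"))
    print(check_title("fix: adding retry logic"))
```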

What makes a good pull request description?

A description that answers four questions, in this order:

What changed. Two or three sentences. Not a restatement of the diff, a summary of it. "Replaces the polling-based job status check with a webhook callback" tells the reviewer what shape to expect.
Why. The context that is not in the code. Link the issue or incident. "Polling was hitting the rate limit at 40+ concurrent jobs (see #482)" turns a confusing change into an obvious one.
How it was tested. Be specific. "Added unit tests for the retry path, manually tested webhook delivery against a local tunnel, verified the migration on a staging snapshot." If the answer is "CI passes", say so honestly and expect the reviewer to push back on risky paths.
Breaking changes and rollout notes. API changes, migrations that need ordering, feature flags, anything the person deploying or the person reading git blame in two years needs to know. Write "None" explicitly rather than omitting the section, so the reviewer knows you considered it.

For any UI change, add screenshots or a short clip. A before/after pair answers questions no paragraph can.

The biggest failure mode is not bad descriptions, it is empty ones. Two fixes work in practice. First, make the structure ambient: commit a .github/PULL_REQUEST_TEMPLATE.md so every PR opens with the four sections pre-filled as headings. A PR template generator will produce a solid starting template you can adapt to your stack. Second, lower the cost of a first draft: paste your diff into a PR description generator and edit the result. An edited draft beats a blank textbox every time, and the "why" still has to come from you, because it is not in the diff.
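If you want to go one step further and detect empty descriptions automatically, a check along these lines would do it. This is a sketch under assumptions: the four heading names are hypothetical and should match whatever your template actually uses, and only single-line HTML comment placeholders are ignored.

```python
REQUIRED_SECTIONS = (
    "What changed",
    "Why",
    "How it was tested",
    "Breaking changes and rollout notes",
)


def missing_sections(description: str, required=REQUIRED_SECTIONS) -> list:
    """Return the required sections that are absent or have no body text."""
    bodies = {}
    current = None
    for line in description.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # A markdown heading starts a new section.
            current = stripped.lstrip("#").strip().lower()
            bodies[current] = []
        elif current is not None and stripped and not stripped.startswith("<!--"):
            # Count any non-blank, non-placeholder line as body text.
            bodies[current].append(stripped)
    return [name for name in required if not bodies.get(name.lower())]


if __name__ == "__main__":
    draft = (
        "## What changed\nReplaces polling with a webhook callback.\n\n"
        "## Why\nPolling was hitting the rate limit (see #482).\n\n"
        "## How it was tested\n\n"
        "## Breaking changes and rollout notes\nNone\n"
    )
    print(missing_sections(draft))
```

Note that an explicit "None" under the last heading passes, which matches the advice above to write it out rather than omit the section.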

What should you do before requesting review?

Three things, in order: self-review, draft status, green CI.

Self-review first. Open your own PR in the GitHub diff view and read it as if someone else wrote it. You will catch leftover debug statements, commented-out code, accidental file inclusions, and unclear naming before anyone else has to. Engineers who self-review consistently report fewer review rounds, and the reason is mechanical: the diff view shows you the change the way the reviewer will see it, which is not the way it looked in your editor. Annotate your own PR where context helps ("this rename touches 14 files, the logic change is only in session.ts"). Five minutes of self-review routinely saves a full round trip, which on most teams is a full day.

Use draft PRs for work in progress. A draft PR signals "look if you want, but I am not asking yet". It gets you CI runs and early eyes without burning a reviewer's attention on code that will change tomorrow. Marking a PR "Ready for review" should mean exactly that: you believe it is mergeable as-is. Teams that blur this line train reviewers to ignore review requests, and then everyone complains about pickup time.

CI green before requesting review. Asking a human to review a PR with failing checks is asking them to review code you already know is wrong. The reviewer either wastes time finding what the test suite already found, or waits and context-switches twice. Let the machines finish first. Humans are the most expensive check in the pipeline; they should run last.

What do good reviewers do differently?

Review speed is a team SLA, not a personal virtue. The four practices below separate teams where review is a multiplier from teams where it is a queue.

Pick up reviews within a working-day SLA. Set an explicit target: first response within 4 business hours, or by end of next business day at the latest. Google's data shows small changes getting initial review in under an hour, and that speed is a big reason their median change stays at 24 lines. The causality runs both ways: slow reviews push authors toward giant batched PRs ("if review costs a day, I'll make it count"), which makes reviews slower. Fast pickup is what makes small PRs rational. Treat an open review request like a paged alert with a relaxed deadline, not like email.
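Measuring a "4 business hours" target means counting only working time, which is easy to get wrong by hand. The sketch below assumes a 09:00 to 17:00, Monday to Friday working day and naive datetimes in a single time zone; both are assumptions, and it ignores public holidays entirely.

```python
from datetime import datetime, time, timedelta

WORKDAY_START = time(9, 0)   # assumed start of the working day
WORKDAY_END = time(17, 0)    # assumed end of the working day
SLA_HOURS = 4                # first-response target from the text


def business_hours_between(start: datetime, end: datetime) -> float:
    """Hours between two datetimes, counting only Mon-Fri working hours."""
    if end <= start:
        return 0.0
    total = timedelta()
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # 0-4 are Monday to Friday
            window_start = max(start, datetime.combine(day, WORKDAY_START))
            window_end = min(end, datetime.combine(day, WORKDAY_END))
            if window_end > window_start:
                total += window_end - window_start
        day += timedelta(days=1)
    return total.total_seconds() / 3600


def met_pickup_sla(requested_at, first_response_at, sla_hours=SLA_HOURS) -> bool:
    """True if the first response arrived within the SLA."""
    return business_hours_between(requested_at, first_response_at) <= sla_hours


if __name__ == "__main__":
    requested = datetime(2026, 6, 12, 16, 0)   # Friday afternoon
    responded = datetime(2026, 6, 15, 10, 30)  # Monday morning
    print(business_hours_between(requested, responded))
    print(met_pickup_sla(requested, responded))
```

A request opened late on Friday and picked up on Monday morning counts only the working hours on either side of the weekend, which is the behavior you want from a business-hours SLA.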

Prefix every comment with its severity. The reviewer knows which comments must be addressed before merge. The author should not have to guess. Use explicit prefixes:

blocking: must be fixed before merge, and the reviewer will say why
nit: style or preference, author may fix or ignore
question: reviewer needs information, not a change
suggestion: worth considering, not required

This is the core idea behind Conventional Comments, and it kills the most common review pathology: authors treating every nit as mandatory and burning a day polishing things nobody required.

Batch comments into one round. Read the whole PR, then submit one review with all your comments, using GitHub's "Start a review" instead of single comments. Drip-feeding comments one at a time generates a notification storm and forces the author to context-switch repeatedly. It also produces worse reviews, because comment 3 often gets answered by code you would have read at comment 9. Aim for one round of substantive feedback, a second round to verify fixes, and merge. If you are on round four, the problem is the PR's scope or the spec, not the code, so take it to a call.

Default to approve-with-comments. If your remaining comments are nits and questions, approve now and trust the author to address them before merging. Holding approval hostage to a variable rename costs the team a full review cycle (often a day) to enforce a preference. Reserve "Request changes" for actual defects: bugs, security problems, design decisions that will be expensive to reverse. The author owns the code; the reviewer owns catching what the author cannot see.
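The prefix convention and the approve-by-default rule combine into a small decision procedure. The sketch below is one possible reading of them, not a feature of GitHub or of Conventional Comments; in particular, sending unlabeled comments back for a label is an assumption of this example.

```python
PREFIXES = ("blocking", "nit", "question", "suggestion")


def severity(comment: str) -> str:
    """Return the comment's prefix, or 'unlabeled' if it has none."""
    head, sep, _ = comment.partition(":")
    label = head.strip().lower()
    return label if sep and label in PREFIXES else "unlabeled"


def review_verdict(comments) -> str:
    """Decide the review outcome from a batch of prefixed comments."""
    labels = [severity(c) for c in comments]
    if "blocking" in labels:
        return "request changes"
    if "unlabeled" in labels:
        # Assumption: the author should never have to guess severity.
        return "label comments first"
    return "approve"


if __name__ == "__main__":
    batch = [
        "nit: rename tmp to pending_jobs",
        "question: why three retries rather than five?",
    ]
    print(review_verdict(batch))
```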

Which parts of review should you automate?

Every comment a human writes about formatting is a process failure. The automation stack, from bottom to top:

Formatters and linters run before the PR exists. Prettier, Black, gofmt, rustfmt as pre-commit hooks or editor-on-save. ESLint, Ruff, Clippy in CI as required checks. After this layer, whitespace and import order are physically incapable of appearing in review comments.

AI review as the first pass. This is the layer that changed most since 2024. An AI reviewer reads the full diff the moment the PR opens and posts inline findings in the few minutes before any human arrives, so the human review starts from "the mechanical issues are already flagged" instead of from zero. Diffwise is built for exactly this slot: it runs 40+ specialist agents in parallel on every PR (security, performance, conventions, plus language-specific agents for Python, Go, Rust, and React that activate based on the files changed), instead of one generic model doing everything shallowly. The part that matters most for review hygiene is incremental re-review: when you push fixes, findings get classified as Fixed, Still Open, or New rather than re-litigated from scratch, which mirrors how a good human reviewer handles round two.
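The Fixed / Still Open / New classification is, at its core, a set comparison between two review rounds. The sketch below illustrates the idea only; it is not how Diffwise implements it. The `fingerprint` helper is hypothetical: it identifies a finding by rule, file, and whitespace-normalized snippet rather than by line number, so that a finding survives unrelated edits that shift it up or down the file.

```python
import hashlib


def fingerprint(rule: str, path: str, snippet: str) -> str:
    """Hypothetical stable identity for a finding, independent of line number."""
    normalized = " ".join(snippet.split())
    digest = hashlib.sha256(f"{rule}|{path}|{normalized}".encode())
    return digest.hexdigest()[:12]


def classify_findings(previous: set, current: set) -> dict:
    """Compare finding fingerprints from two review rounds."""
    return {
        "Fixed": sorted(previous - current),       # gone after the push
        "Still Open": sorted(previous & current),  # present in both rounds
        "New": sorted(current - previous),         # introduced by the push
    }


if __name__ == "__main__":
    round_one = {"n-plus-one", "hardcoded-secret"}
    round_two = {"hardcoded-secret", "unhandled-error"}
    print(classify_findings(round_one, round_two))
```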

Two rules keep AI review useful rather than noisy. First, configure it in code: a .diffwise.yml (or your tool's equivalent) that sets severity thresholds, ignores generated paths, and caps findings per review, so the bot's behavior is versioned and reviewable like everything else. Second, keep the division of labor honest. The AI pass catches the N+1 query, the unhandled error path, the hardcoded secret. The human pass asks whether this is the right abstraction and whether the feature should exist in this form. Teams that let AI handle the first category report human review rounds dropping because reviewers no longer spend their attention budget on mechanical findings.

Privacy is a real selection criterion here. If a tool reviews your diffs, ask where the code goes. Diffwise fetches the diff, holds it in memory for the duration of the review, and discards it, with zero code storage. Whatever tool you pick, get that answer in writing before installing it org-wide.

How should you merge?

Squash merge by default. Squash collapses the PR's commits ("wip", "fix tests", "fix tests actually") into one commit on the main branch, titled with the PR title (GitHub's default for multi-commit PRs; a single-commit PR uses that commit's own title unless you change the repository's squash settings). This is why PR title discipline pays off twice: your main branch history becomes a readable changelog of one commit per logical change, and git bisect operates on meaningful units. Use merge commits only where individual commits within a PR are independently meaningful and well-crafted, which in practice means stacked workflows and very disciplined teams. Avoid rebase-merge unless everyone understands that it rewrites each commit with a new SHA, so the commits on main no longer match the ones on the PR branch.

Branch protection makes all of the above real. A reasonable baseline for main:

Require a pull request before merging, with at least 1 approval
Dismiss stale approvals when new commits are pushed
Require status checks to pass: build, tests, lint, PR title check
Require branches to be up to date before merging
No force pushes, no direct pushes, including by admins
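If you manage these settings through the API rather than the settings page, the baseline above maps roughly onto the request body for GitHub's REST "update branch protection" endpoint. The sketch below builds that body as a Python dict. The field names reflect my understanding of that endpoint and should be verified against the current GitHub documentation before use, and the check names are hypothetical placeholders for whatever your CI actually reports.

```python
import json


def branch_protection_payload(required_checks) -> dict:
    """Build a request body mirroring the baseline rules listed above."""
    return {
        "required_status_checks": {
            "strict": True,  # branch must be up to date before merging
            "contexts": list(required_checks),
        },
        "enforce_admins": True,  # rules apply to admins too
        "required_pull_request_reviews": {
            "dismiss_stale_reviews": True,
            "required_approving_review_count": 1,
        },
        "restrictions": None,  # no per-user or per-team push allowlist
        "allow_force_pushes": False,
    }


if __name__ == "__main__":
    # Hypothetical check names; use the names your CI publishes.
    body = branch_protection_payload(["build", "tests", "lint", "pr-title"])
    print(json.dumps(body, indent=2))
```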

Make review findings a required check, not a suggestion. Comments are easy to ignore at 6pm on a Friday. Checks are not. GitHub Check Runs let review tooling report pass/fail status that branch protection can enforce. Diffwise posts a Check Run per review and can block merge when critical findings (a hardcoded credential, an auth bypass) are still open, while letting warnings through, so the gate has teeth without becoming a nag. Wherever you draw that line, draw it in configuration rather than relying on whoever is merging to read carefully.
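Drawing the line in configuration can be as small as one constant. The sketch below shows the gating logic in the abstract: the finding dictionaries and severity names are hypothetical, and only the two return values ("failure" and "success") correspond to real Check Run conclusions that branch protection can act on.

```python
# The policy line, kept in one place: which severities block a merge.
BLOCKING_SEVERITIES = {"critical"}


def check_conclusion(findings) -> str:
    """Return a Check Run conclusion for a list of review findings.

    Each finding is a dict with a 'severity' and a 'status'
    ('Fixed', 'Still Open', or 'New').
    """
    open_blockers = [
        f for f in findings
        if f["severity"] in BLOCKING_SEVERITIES and f["status"] != "Fixed"
    ]
    return "failure" if open_blockers else "success"


if __name__ == "__main__":
    findings = [
        {"severity": "critical", "status": "Still Open"},  # hardcoded credential
        {"severity": "warning", "status": "New"},
    ]
    print(check_conclusion(findings))
```

Warnings never fail the check in this sketch, so the gate stays quiet until something critical is actually open.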

FAQ

What is the ideal pull request size?

Under 400 changed lines as a hard ceiling, based on the SmartBear/Cisco study showing defect detection drops past 200 to 400 lines. Under 200 is the practical sweet spot, and Google's median change is around 24 lines. Count reviewable lines, not generated files or lockfiles.

How fast should pull requests be reviewed?

First response within 4 business hours, full review within one business day. Reviewers should read at under 500 lines per hour, so a well-sized PR takes 15 to 45 minutes of actual review time. If pickup regularly takes days, authors will respond with bigger PRs, which makes everything slower.

Should I use squash merge or merge commits?

Squash merge for most teams. It gives you one clean commit per PR on the main branch, named after the PR title, which keeps history readable and bisectable. Use merge commits only when individual commits in the PR are deliberately structured and worth preserving.

What is a draft PR for?

A draft PR shares work in progress without requesting review. It runs CI, makes the branch visible, and invites optional early feedback, but does not ping reviewers or count against review SLAs. Convert to "Ready for review" only when CI is green and you would merge it as-is.

Can AI code review replace human reviewers?

No. AI review is a first pass that catches mechanical and pattern-level defects (injection risks, race conditions, missing error handling) within minutes of the PR opening, before a human looks. Humans still own design judgment, product correctness, and architectural tradeoffs. Tools like Diffwise are designed to run before human review, not instead of it, so the human round starts cleaner and finishes faster.