
TL;DR

Visual Regression on PRs is a testing workflow that chains Playwright and GitHub to automate visual QA on pull requests. Playwright captures screenshots of key pages on every PR and diffs them against the baseline; regressions block the merge. Once configured, it saves ~10 hours/week of manual QA clicking, prevents 3-5 visual bugs per sprint, and runs through Claude Code, Cursor, Windsurf, or any MCP-compatible AI agent.

Testing · Intermediate

Visual Regression on PRs

Playwright captures screenshots of key pages on every PR and diffs them against the baseline. Regressions block merge.

  • 20 min setup, catches regressions on every PR
  • 2 MCPs required
  • Saves ~10 hours/week of manual QA clicking, plus prevention of 3-5 visual bugs per sprint


Partial support — 1 of 2 MCPs hostable

Hosted execution needs every MCP on the whitelist. Use the local CLI for this recipe until the missing MCPs are added.

Not yet hostable: Playwright

$ mcpizy recipe install playwright-github-visual-regression

Why this combo?

Playwright can headlessly screenshot every important page in your app; GitHub can hold those screenshots as artifacts and block merges when diffs exceed a threshold. Together they act as an always-on visual QA reviewer that catches unintended CSS or layout regressions before users do.

Without this workflow

Someone manually clicks through the app after each deploy and notices a broken layout two days later from a user complaint.

With MCPizy

Every PR automatically screenshots every key page, diffs against the baseline, and blocks merge if pixels changed beyond tolerance.

Business value

Concrete ROI — not marketing fluff.

Time saved

~10 hours/week of manual QA clicking, plus prevention of 3-5 visual bugs per sprint

  • Catches 80% of CSS regressions before production — saves $2-5k per embarrassing visual bug that hits customers
  • Frees 1-2 QA engineers from manual click-testing to focus on complex flows and edge cases
  • Protects conversion-critical pages (pricing, checkout, signup) from silent layout breaks that tank revenue
  • Design system integrity enforced automatically — no more drift between Figma and production

Workflow steps

  1. PR opened: run Playwright test suite
  2. Capture screenshots for all key routes
  3. Diff against stored baseline images
  4. Upload diff artifacts to GitHub Actions
  5. Post pass/fail status check to PR
  6. Update baseline on merge to main
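
The diff in step 3 can be sketched roughly as follows. This is a minimal illustration, not the recipe's actual implementation: it assumes both screenshots have already been decoded into equal-length sequences of RGB tuples (decoding PNGs would need an image library, not shown), and the names `diff_ratio`, `is_regression`, and `channel_slack` are hypothetical. The 0.1% tolerance is the value used in the agent prompt.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0-255

TOLERANCE = 0.001  # 0.1% of pixels, the value used in the agent prompt


def diff_ratio(baseline: Sequence[Pixel], current: Sequence[Pixel],
               channel_slack: int = 0) -> float:
    """Return the fraction of pixels that differ between two images.

    channel_slack (hypothetical knob) ignores per-channel differences up to
    that size, which can absorb minor anti-aliasing noise.
    """
    if len(baseline) != len(current):
        # Assumption: a size change (e.g. the page got taller) counts as a
        # full mismatch rather than being cropped or padded.
        return 1.0
    if not baseline:
        return 0.0
    changed = sum(
        1
        for old, new in zip(baseline, current)
        if any(abs(a - b) > channel_slack for a, b in zip(old, new))
    )
    return changed / len(baseline)


def is_regression(ratio: float, tolerance: float = TOLERANCE) -> bool:
    """A route fails only when the changed fraction exceeds the tolerance."""
    return ratio > tolerance
```

With a strict `>` comparison, a page where exactly 0.1% of pixels changed still passes; whether the boundary should pass or fail is a design choice to settle per project.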

Use cases

  • Catch unintended CSS changes that break page layout before they reach users
  • Automated baseline updates when intentional design changes are merged
  • Per-component screenshot testing for design system consistency
  • Block deployment if key marketing pages render incorrectly

MCPs required

  • Playwright: Playwright MCP Server
  • GitHub: GitHub MCP Server

Agent prompt (copy into Claude Code)

This prompt is the workflow. Paste into Claude Code, Cursor, or Windsurf.

You are a visual regression agent that runs on every PR opened or updated.

Given the preview deployment URL and a list of routes (from routes.json):
1. For each route, call playwright.browser_navigate(url) then playwright.browser_take_screenshot(fullPage=true)
2. Compare each screenshot against baselines/<route>.png via pixel diff (tolerance: 0.1%)
3. If diff > threshold, upload diff image to GitHub as a workflow artifact
4. Call github.add_pr_comment with a table: route | diff% | baseline | current | diff-image
5. If any route fails, call github.create_status(state="failure", context="visual-regression") to block merge
6. On merge to main: call playwright.browser_take_screenshot for all routes and overwrite baselines

Always run headless. Use device emulation for mobile routes.
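
As an illustration of what steps 4 and 5 of the prompt ask the agent to produce, the sketch below builds the comment table and picks the status state from per-route results. It is a hedged example only: `RouteResult`, `comment_table`, and `overall_state` are hypothetical names, the artifact paths are placeholders, and the agent itself would pass the resulting text and state to the GitHub MCP tools named in the prompt.

```python
from dataclasses import dataclass
from typing import List

TOLERANCE = 0.001  # 0.1%, as stated in the prompt


@dataclass
class RouteResult:
    route: str
    diff_ratio: float   # fraction of changed pixels, 0.0 to 1.0
    baseline_path: str
    current_path: str
    diff_path: str      # only meaningful when the route failed


def overall_state(results: List[RouteResult],
                  tolerance: float = TOLERANCE) -> str:
    """Return "failure" if any route exceeds the tolerance, else "success"."""
    failed = any(r.diff_ratio > tolerance for r in results)
    return "failure" if failed else "success"


def comment_table(results: List[RouteResult],
                  tolerance: float = TOLERANCE) -> str:
    """Render the route | diff% | baseline | current | diff-image table."""
    lines = [
        "| route | diff% | baseline | current | diff-image |",
        "| --- | --- | --- | --- | --- |",
    ]
    for r in results:
        # Diff images are uploaded only for failing routes (step 3).
        diff_cell = r.diff_path if r.diff_ratio > tolerance else "n/a"
        lines.append(
            f"| {r.route} | {r.diff_ratio * 100:.2f}% | "
            f"{r.baseline_path} | {r.current_path} | {diff_cell} |"
        )
    return "\n".join(lines)
```

Keeping the pass/fail decision in one function means the comment and the status check cannot disagree about which routes failed.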

Trigger & credentials

How this workflow fires and what env vars you need.

Trigger: Webhook
POST /webhook/github (events: pull_request synchronize, pull_request merged to main)
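
A receiver for this webhook has to decide whether an incoming delivery means "run the visual check" or "refresh the baselines". The sketch below shows one way to make that decision, assuming the standard GitHub `pull_request` payload fields (`action`, `pull_request.merged`, `pull_request.base.ref`). GitHub reports a merge as a `closed` action with `merged` set to true rather than as a separate "merged" action. The function name and return labels are hypothetical, and `opened` is included because the agent prompt says it runs on PRs opened or updated.

```python
from typing import Any, Dict


def classify_event(event: str, payload: Dict[str, Any]) -> str:
    """Map a GitHub webhook delivery to "check", "update_baseline", or "ignore".

    `event` is the value of the X-GitHub-Event header.
    """
    if event != "pull_request":
        return "ignore"

    action = payload.get("action")
    pr = payload.get("pull_request") or {}

    # New commits pushed to the PR (or the PR being opened) trigger a run.
    if action in ("opened", "synchronize"):
        return "check"

    # A merge arrives as action "closed" with merged == True.
    merged_to_main = (
        action == "closed"
        and pr.get("merged") is True
        and (pr.get("base") or {}).get("ref") == "main"
    )
    if merged_to_main:
        return "update_baseline"

    return "ignore"
```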
Playwright · 1 var

  • PLAYWRIGHT_BASE_URL: Base URL for previews (falls back to preview URL from PR comment), e.g. https://pr-123.preview.example.com

GitHub · 1 var

  • GITHUB_TOKEN: PAT with repo + statuses scopes, e.g. ghp_...
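
The fallback behaviour described for these variables can be sketched as below. This is an assumption-laden illustration rather than the recipe's code: `resolve_base_url`, `require_token`, and `route_url` are hypothetical helpers, and how the preview URL is extracted from the PR comment is left out.

```python
from typing import Mapping, Optional


def resolve_base_url(env: Mapping[str, str],
                     preview_url_from_pr: Optional[str] = None) -> str:
    """Prefer PLAYWRIGHT_BASE_URL, else fall back to the PR's preview URL."""
    base = env.get("PLAYWRIGHT_BASE_URL") or preview_url_from_pr
    if not base:
        raise ValueError("no PLAYWRIGHT_BASE_URL set and no preview URL found")
    return base.rstrip("/")


def require_token(env: Mapping[str, str]) -> str:
    """Fail early when the GitHub token is missing."""
    token = env.get("GITHUB_TOKEN", "")
    if not token:
        raise ValueError("GITHUB_TOKEN is not set")
    return token


def route_url(base_url: str, route: str) -> str:
    """Join the base URL and a route without doubling the slash."""
    return f"{base_url}/{route.lstrip('/')}"
```

In practice `env` would be `os.environ`; passing it in as a mapping keeps the helpers easy to test.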

One-command deploy

Install everything — MCPs, prompt, env template — in a single call.

$ mcpizy recipe install playwright-github-visual-regression

✓ Installs all 2 MCP servers
✓ Writes prompt to ~/.mcpizy/prompts/playwright-github-visual-regression.md
✓ Generates .env.example in current directory
✓ Ready to paste into Claude Code

Requires mcpizy CLI v1.1+ — install via npm i -g mcpizy.

Quick install (MCPs only)

$ mcpizy install playwright && mcpizy install github


Frequently asked questions

What is this workflow?

Visual Regression on PRs is a testing automation that uses Playwright + GitHub together via the Model Context Protocol. Playwright captures screenshots of key pages on every PR and diffs them against the baseline. Regressions block merge.

How long does setup take?

Setup takes around 20 minutes. You install the required MCP servers with `mcpizy install playwright && mcpizy install github`, connect your accounts, and the workflow is ready to run.

How much time does this workflow save?

Once running, this workflow saves ~10 hours/week of manual QA clicking and prevents 3-5 visual bugs per sprint. In concrete terms, it catches 80% of CSS regressions before production (saving $2-5k per embarrassing visual bug that hits customers) and frees 1-2 QA engineers from manual click-testing to focus on complex flows and edge cases.

Which MCP servers do I need for this?

You need 2 MCP servers: Playwright (`mcpizy install playwright`) and GitHub (`mcpizy install github`). Both are installable in one command via the MCPizy CLI and configured in your `.claude.json` or `.cursor/mcp.json`.

Does this work with Claude Code, Cursor, and Windsurf?

Yes. The workflow runs with any MCP-compatible AI agent — Claude Code, Claude Desktop, Cursor, Windsurf, VS Code with Copilot, and custom agents built on the MCP SDK. The MCP servers are identical across clients; only the config file path (`.claude.json` vs `.cursor/mcp.json`) changes.

Start building this workflow

Install the required MCPs from the marketplace and automate this in about 20 minutes.

$ mcpizy install playwright && mcpizy install github


Free to install. Connect your accounts and this workflow runs itself.