The playwright-explore-website Copilot skill
From the awesome-copilot example to a stricter Playwright MCP workflow
Published 22 April 2026 · 6 min read
This is the standalone write-up behind the workflow I referenced in my JavaScript London talk companion post.
The short version is that playwright-explore-website is a small GitHub
Copilot skill that tells Copilot to use the Playwright MCP server to open a
site, explore a few important user flows, document what it finds, and suggest
candidate test cases.
In the latest version, it also treats Playwright as the source of truth for rendered UI checks. If the task involves a visible change or regression, the agent should inspect the current UI first, make the change, verify the updated state afterwards, and clean up temporary screenshots unless the user asked to keep them.
I did not invent the starting point from scratch. The initial version was
based on the original
playwright-explore-website example in awesome-copilot.
What I changed was the setup guidance and the execution rules, so it behaves
better in real browser work, especially for regression checks and visual QA,
instead of acting like a generic prompt stub.
This post stays focused on the skill itself: where it came from, how I set it up locally, and the guardrails I added. The talk companion post covers the broader Playwright testing workflow around it.
What the skill is for
This skill sits in the gap between “open the browser and poke around” and “write the final Playwright test file”.
It is useful when you need to:
- understand an unfamiliar product area quickly
- smoke test a staging or preview deployment
- reproduce a bug report that is missing exact steps
- verify rendered UI before and after a visible change
- identify candidate locators before writing a Playwright test
- turn exploratory browsing into a short list of candidate scenarios
That is why I like it. It makes exploration explicit.
The original awesome-copilot example
The original version in awesome-copilot is deliberately small.
At a high level, it tells Copilot to:
- navigate to the provided URL with Playwright MCP
- identify and interact with 3 to 5 core user flows
- document the interactions, locators, and expected outcomes
- close the browser context afterwards
- summarise the findings and propose test cases
That is already a good baseline because it forces exploration before test generation. The model has to look at the real site first instead of guessing what the UI probably looks like.
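For reference, a condensed SKILL.md body built from those steps might look like this. This is my paraphrase of the behaviour above, not the upstream awesome-copilot text:

```markdown
# Explore a website with Playwright MCP

1. Navigate to the provided URL using the Playwright MCP tools.
2. Identify 3 to 5 core user flows and interact with each one.
3. Document the interactions, the locators used, and the expected outcomes.
4. Close the browser context when exploration is finished.
5. Summarise the findings and propose candidate test cases.
```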
My local setup
I keep this as a personal skill rather than a repo-specific one.
That means the file lives at:
```
~/.copilot/skills/playwright-explore-website/SKILL.md
```

I prefer that because the same workflow is useful across multiple projects. Once the skill is there, Copilot can discover it in any repo where browser exploration makes sense.
The frontmatter stays intentionally simple:
```yaml
---
name: playwright-explore-website
description: 'Website exploration for testing using Playwright MCP'
---
```

The important part is not the title. It is that the description contains the right trigger words, so Copilot can load it when the task is about website exploration, Playwright, or browser-based testing.
Playwright MCP setup
One of the biggest gaps in the original example is that it assumes the Playwright MCP tools are already available.
That is fine once your machine is configured. It is less helpful the first time you try to use the skill and the tools are missing.
I added an explicit setup section so the skill can bootstrap the missing piece instead of failing vaguely.
The CLI path is the shortest route:
```sh
code --add-mcp '{"name":"playwright","command":"pnpm","args":["dlx","@playwright/mcp@latest"]}'
```

If you prefer to wire it up in settings.json, the equivalent config is:
```json
"mcp": {
  "servers": {
    "playwright": {
      "command": "pnpm",
      "args": ["dlx", "@playwright/mcp@latest"]
    }
  }
}
```

After that, reload VS Code or the Copilot extension and accept the prompt to start the MCP server.
This matters because a good skill should not only describe the happy path. It should also help the agent recover when a prerequisite is missing.
The enhancements I added
I kept the core purpose the same, but I tightened how the skill runs.
1. Self-bootstrapping setup instructions
The added ## MCP Server Setup section gives the agent a concrete fallback when
the Playwright tools are unavailable.
That turns a dead end into a fixable setup problem.
2. A serial exploration rule
I added a rule for multi-page and multi-breakpoint work:
> If you need to compare multiple pages or breakpoints, inspect them serially or in separate tabs. Do not queue parallel navigations and screenshots against the same Playwright page context.

This is a practical guardrail. Browser exploration becomes noisy very quickly if you mix multiple navigations and screenshots in one live context.
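The same rule can be sketched as a tiny TypeScript helper. This is an illustration of the serial pattern, not code from the skill; the `inspect` callback stands in for whatever Playwright navigation and screenshot calls the agent makes:

```typescript
// Inspect each target one at a time against a single page context.
// Finishing one navigation before starting the next keeps the
// live browser state unambiguous.
async function inspectSerially<T>(
  targets: T[],
  inspect: (target: T) => Promise<void>,
): Promise<void> {
  for (const target of targets) {
    await inspect(target); // complete this page or breakpoint first
  }
  // The anti-pattern the rule forbids would be:
  // await Promise.all(targets.map(inspect)); // parallel work, one page context
}
```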
3. Rendered UI review
The newer version also makes the rendered browser state the thing to trust.
> Use Playwright to review the rendered UI directly. For implementation or regression checks, inspect the current state first, then verify the updated state after changes instead of relying on code inspection alone.

This is the rule that changed the skill most in practice. CSS, JSX, or Astro templates are not the final UI. The browser is. If the job is about a visible change, the skill now pushes Copilot to validate the actual rendered result before it signs off.
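The inspect-change-verify loop is small enough to sketch. This is a hypothetical, browser-agnostic illustration, not part of the skill: in real use `captureState` would read the rendered UI through Playwright, while `applyChange` is the code edit being checked:

```typescript
// A snapshot of the rendered state we care about, keyed by element or property.
type Snapshot = Record<string, string>;

// Inspect the current state first, make the change, then verify the
// updated state — instead of trusting code inspection alone.
async function verifyChange(
  captureState: () => Promise<Snapshot>,
  applyChange: () => Promise<void>,
  keysToCheck: string[],
): Promise<{ key: string; before: string; after: string }[]> {
  const before = await captureState(); // current rendered state
  await applyChange();                 // the visible change under review
  const after = await captureState();  // updated rendered state
  return keysToCheck.map((key) => ({
    key,
    before: before[key],
    after: after[key],
  }));
}
```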
4. Visual audit prompts
I also added an explicit visual review prompt:
> For visual audits, explicitly note supporting-label readability, hero-to-first-section spacing, footer divider spacing, and last-section-to-footer separation when relevant.

That came from real UI review work. I wanted the skill to be useful not just for functional exploration, but also for browser-based design checks.
5. Screenshot cleanup
I also added a cleanup rule:
> Delete any temporary screenshots you created during the session unless the user explicitly asked to keep them.

This sounds small, but it matters. Exploration and regression review can leave behind a pile of disposable screenshots very quickly. If the skill creates artifacts to reason about the UI, it should also leave the workspace tidy when those artifacts are no longer needed.
6. Stronger output requirements
The local version is stricter about what the exploration should produce.
It does not stop at “I clicked around and it looked fine”. It asks for:
- the user interactions that were performed
- the relevant UI elements and likely locators
- the expected outcomes for each flow
- a concise summary of findings
- proposed test cases based on the exploration
That output maps much more cleanly to a future Playwright test file.
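To make that mapping concrete, here is a small hypothetical helper, not part of the skill, that turns exploration findings of that shape into a Playwright test skeleton. The field names are my own invention:

```typescript
// One explored flow, mirroring the skill's required output:
// interactions performed, likely locators, and the expected outcome.
interface ExploredFlow {
  name: string;
  steps: { action: string; locator: string }[];
  expected: string;
}

// Render the findings as a @playwright/test file skeleton.
// The generated locators and comments are starting points to review,
// not finished tests.
function toTestSkeleton(flows: ExploredFlow[]): string {
  const body = flows
    .map((flow) => {
      const steps = flow.steps
        .map(
          (s) =>
            `  // ${s.action}\n  await page.locator('${s.locator}').click();`,
        )
        .join("\n");
      return `test('${flow.name}', async ({ page }) => {\n${steps}\n  // expected: ${flow.expected}\n});`;
    })
    .join("\n\n");
  return `import { test, expect } from '@playwright/test';\n\n${body}`;
}
```

The point is not the generator itself, but that the skill's output is structured enough to feed one.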
Where this fits
This post is intentionally about the skill itself.
If you want the wider workflow around it, including how I use it to explore a risky journey, compare it with codegen, and turn the findings into a real test, that is in Playwright E2E testing AI skills: JavaScript London talk.
That boundary is deliberate. The skill helps you explore a real browser session and capture candidate flows, locators, and outcomes. The Playwright test is still the final artifact.