25 June 2026

QA & Testing, Playwright, Test Automation

Selenium to Playwright Migration: What Enterprise Teams Get Wrong in the First 30 Days

Playwright downloads grew ~235% year-on-year, and enterprise QA teams are mid-migration off Selenium. The technical port is the easy 20%. The 80% that derails programmes is CI ownership, the flake baseline, and the reporting line into product.

Anystack Engineering

Playwright is no longer a challenger. npm download trends tracked through 2025 show roughly 235% year-on-year growth, and enterprise QA teams that spent a decade on Selenium are now mid-migration. The Currents migration guide is a fair reference for the technical port — selectors, fixtures, async model, parallelism primitives — but it is also where most enterprise programmes quietly stall.

The reason is uncomfortable: the code port is roughly 20% of the work. The other 80% is changing who owns CI, what the flake baseline is allowed to be, and how test results reach the product organisation. Teams that treat the migration as a framework swap get a faster suite that nobody trusts. Teams that treat it as a delivery system change get a faster suite that gates releases.

This is the gap to close in the first 30 days.

Finding 1: The technical port is the trap, not the prize

Playwright's auto-waiting, network interception, and trace viewer remove entire categories of Selenium flake. That is real. But the published case studies — Microsoft, Disney+, VS Code's own end-to-end suite — also share a property that rarely transfers: a single team that owns both the test framework and the CI pipeline that runs it.

In a typical 200-engineer enterprise, the Selenium suite is owned by a QA group, the Jenkins or GitHub Actions pipeline by a platform group, and the flaky-test triage by nobody in particular. Porting the tests does not change that. On day 31 you have Playwright specs running on the same overloaded runners, with the same retry-three-times culture, producing the same JUnit XML that nobody reads.

What to do this week. Before a single spec is rewritten, name one engineer accountable for end-to-end test reliability across framework, runner, and reporting. Not a manager. An engineer with commit rights in all three repositories. If you cannot name that person, the migration is not ready to start.

Finding 2: The flake baseline must be reset, not inherited

The 2024 Google Testing Blog post-mortems and the DORA 2025 research both point at the same pattern: teams measure flake as a percentage of test runs and accept anything under 2-3%. At enterprise scale that number hides catastrophe. If each test in a 4,000-test suite fails spuriously 2% of the time, a full run produces roughly 80 spurious failures. Retries then become mandatory, and once retries are mandatory, real failures are invisible.
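The arithmetic behind that claim is worth making explicit. The sketch below is illustrative only: it assumes every test flakes independently at the same rate, which real suites rarely satisfy exactly, and the function names are hypothetical.

```typescript
// Illustrative arithmetic only. Assumes each test flakes independently
// at the same per-test rate; real suites are messier than this.

/** Expected number of spurious failures in one full run of the suite. */
function expectedSpuriousFailures(testCount: number, flakeRate: number): number {
  return testCount * flakeRate;
}

/** Probability that a full run finishes with no spurious failure at all. */
function cleanRunProbability(testCount: number, flakeRate: number): number {
  return Math.pow(1 - flakeRate, testCount);
}

const suiteSize = 4000;        // suite size used in the article
const perTestFlakeRate = 0.02; // the "acceptable" 2% flake rate

// About 80 spurious failures in every full run.
console.log(expectedSpuriousFailures(suiteSize, perTestFlakeRate));

// Vanishingly small: a fully green run without retries essentially never happens.
console.log(cleanRunProbability(suiteSize, perTestFlakeRate));
```

Under these assumptions an un-retried green run is not a realistic outcome, which is why a suite at this flake rate cannot gate anything without retries.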

Playwright gives you better instruments — --trace on-first-retry, deterministic network mocking, the HTML reporter — but only if you set a new baseline at the moment of migration. Carry the old baseline across and you have spent six months to arrive at the same place faster.
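As a rough sketch of what wiring in those instruments can look like, a minimal `playwright.config.ts` might resemble the following. The retry count and the output path are assumptions chosen for illustration, not recommendations.

```typescript
// playwright.config.ts: a minimal sketch, not a production configuration.
import { defineConfig } from "@playwright/test";

export default defineConfig({
  // A single retry is enough to trigger trace capture.
  // The value is an assumption; a zero-flake subset may run with 0.
  retries: 1,
  use: {
    // Record a trace only when a failed test is retried for the first time.
    trace: "on-first-retry",
  },
  reporter: [
    // HTML report for engineers; do not open a browser on CI.
    ["html", { open: "never" }],
    // JUnit XML for existing CI tooling. The path is hypothetical.
    ["junit", { outputFile: "results/junit.xml" }],
  ],
});
```

Keeping retries low is deliberate here: the retry exists to collect evidence about a flake, not to hide it.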

What to do this week. Pick the 200 highest-value end-to-end tests (the ones that map to revenue-critical user journeys). Port those first. Set the flake budget for that subset at zero failures over 50 consecutive runs before any are allowed into the merge gate. Everything else is a second-class citizen until it earns the gate.
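The 50-run budget can be checked mechanically. The sketch below assumes you already record, per test, whether each run passed on its first attempt. The `RunHistory` shape and both function names are hypothetical, and nothing here is a Playwright API.

```typescript
// Hypothetical record of one test's history, oldest run first.
// `true` means the test passed on its first attempt, with no retry.
type RunHistory = boolean[];

/** Length of the unbroken streak of clean passes at the end of the history. */
function trailingPassStreak(history: RunHistory): number {
  let streak = 0;
  for (let i = history.length - 1; i >= 0 && history[i]; i--) {
    streak++;
  }
  return streak;
}

/** A test earns the merge gate only after `requiredStreak` consecutive clean runs. */
function earnsMergeGate(history: RunHistory, requiredStreak = 50): boolean {
  return trailingPassStreak(history) >= requiredStreak;
}
```

Counting only the trailing streak means a single failure resets the clock, which is the point of a zero-failure budget: a test re-earns the gate, it does not average its way back in.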

Finding 3: The reporting chain into product is the actual deliverable

The quiet truth of QA modernisation is that engineering leaders rarely buy a faster test suite. They buy the ability to tell a product VP "we can ship Tuesday" with evidence. Selenium suites usually cannot do this — the output is a wall of JUnit XML, a Slack channel of red Xs, and a QA lead who translates. The translation layer is the bottleneck.

Playwright's trace viewer and HTML report are good developer tools. They are not product-leadership artefacts. The migration is an opportunity to install the missing layer: a release-readiness view that maps test results to user journeys and surfaces a single signal — ship, hold, or investigate — to the people making the release call.

What to do this week. Sketch the one-page release-readiness view on a whiteboard with your head of product. Not the dashboard. The decision. What does product need to see on a Monday morning to commit to a Tuesday release? Build the test taxonomy backwards from that view. If your Playwright suite cannot answer that question on day 90, the migration has failed regardless of how fast it runs.
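To make the ship, hold, or investigate signal concrete, here is one possible reduction from journey-level results to a single decision. The types, the journey names, and the decision rules are all assumptions for illustration. The real rules should come out of the whiteboard session with product, not from this sketch.

```typescript
type JourneyStatus = "passed" | "failed" | "flaky";
type ReleaseSignal = "ship" | "hold" | "investigate";

// Hypothetical shape: one entry per user journey, aggregated from its tests.
interface JourneyResult {
  journey: string;          // e.g. "checkout" (illustrative name)
  revenueCritical: boolean; // agreed with product, not inferred from code
  status: JourneyStatus;
}

/** Reduce journey results to the single signal product leadership reads. */
function releaseSignal(results: JourneyResult[]): ReleaseSignal {
  // No evidence is not a green light.
  if (results.length === 0) return "investigate";
  // Any failed revenue-critical journey blocks the release outright.
  if (results.some((r) => r.revenueCritical && r.status === "failed")) return "hold";
  // Anything else that is not a clean pass needs a human look.
  if (results.some((r) => r.status !== "passed")) return "investigate";
  return "ship";
}

const mondayResults: JourneyResult[] = [
  { journey: "checkout", revenueCritical: true, status: "passed" },
  { journey: "profile-settings", revenueCritical: false, status: "flaky" },
];

console.log(releaseSignal(mondayResults)); // "investigate"
```

The design choice that matters is the input, not the function: results arrive keyed by journey rather than by spec file, which is what lets the output be read without translation.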


What 60 days looks like when it works

The enterprise teams that finish a Selenium-to-Playwright migration cleanly tend to share three properties at the end of week eight, not week one:

  • A named owner accountable for the framework, the CI runners, and the reporting layer as a single system
  • A high-value subset of journeys ported, gated, and trusted — with the long tail explicitly de-scoped or retired
  • A release-readiness view that product leadership reads without translation

None of those are Playwright features. They are operating-model changes that the migration creates the opening to make. Miss the opening and you have a faster Selenium.

How Anystack approaches this

A three-person AI-augmented pod treats a Selenium-to-Playwright migration as a delivery-system change with a test-framework port inside it, not the other way around. The first two weeks go to the journey taxonomy, the flake baseline, and the reporting view, before any specs are rewritten. The port itself is accelerated with AI-assisted selector translation and trace-driven assertion generation, which is where the framework swap genuinely compresses. We typically deliver a gated high-value subset by week six, and full cutover with ownership transferred to the internal team between weeks ten and twelve.

If you are mid-migration and the new suite is running but nobody trusts it yet, that is the failure mode this approach is designed for. More on how we structure that work on the QA modernisation page.
