The complete guide to headless browser screenshots

Playwright, Puppeteer, Selenium — and when to skip them all.

March 27, 2026 · 10 min read

A headless browser is a web browser without a visible window. It renders pages exactly like a normal browser — executing JavaScript, applying CSS, loading fonts — but outputs to an image or PDF instead of a screen. It's the foundation technology behind screenshot APIs, PDF generators, web scrapers, and end-to-end testing frameworks.

This guide covers the practical side: which headless browser tool to use, how to configure it for reliable screenshots, common pitfalls, and when you should skip self-hosting entirely.

Playwright vs Puppeteer vs Selenium

Playwright (Microsoft) is the current best choice for new projects. It supports Chromium, Firefox, and WebKit from a single API, has excellent auto-wait functionality, and offers Python, Node.js, Java, and .NET bindings. Browser management is built in — playwright install downloads the right browser version. See the Playwright screenshot tutorial for a Node.js walkthrough.

Puppeteer (Google) was the gold standard from 2017 to 2022. It is built around Chrome and Chromium, with Firefox support added later; that focus is fine for screenshots, since Chromium rendering is usually what you want. The API is slightly lower-level than Playwright's. It is still maintained and still works, but Playwright has surpassed it in ergonomics and cross-browser support. The Puppeteer screenshot guide covers the full API. For a direct comparison, see Puppeteer vs Playwright for screenshots.

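Selenium is the oldest option (2004) and the most painful to set up. WebDriver version management has historically been a constant headache, although Selenium Manager (bundled since Selenium 4.6) now downloads a matching driver automatically. Only use Selenium if you have an existing test suite built on it and don't want to migrate.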

Reliable screenshots: the hard parts

Taking a basic screenshot is trivial. Taking reliable screenshots of arbitrary URLs at scale is where the complexity lives.
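
For reference, the trivial case looks roughly like this with Playwright's Python sync API. This is a minimal sketch: the capture function name and the 1280x720 viewport are illustrative choices, and it assumes you have run pip install playwright followed by playwright install chromium.

```python
def capture(url: str, path: str) -> None:
    """Save a viewport-sized PNG of `url` to `path` using headless Chromium."""
    # Imported inside the function so this module still loads on machines
    # where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()  # headless is the default
        try:
            # Illustrative viewport size; pick whatever your use case needs.
            page = browser.new_page(viewport={"width": 1280, "height": 720})
            page.goto(url)
            page.screenshot(path=path)
        finally:
            # Always close the browser, even if navigation fails.
            browser.close()
```

Calling capture("https://example.com", "example.png") writes one PNG. Everything in the rest of this section is about what this version leaves out.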

Page load timing. When is a page "done loading"? The load event fires once the HTML and its subresources have loaded, but an SPA may still be fetching data and rendering at that point. networkidle, which in Playwright means no network connections for at least 500 ms, is better but not perfect: pages with persistent WebSocket connections or recurring analytics pings may never go idle, so the wait needs a timeout. The pragmatic approach is networkidle with a timeout, plus a fixed delay (1–2 seconds) as a buffer.
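
One way to sketch that approach with Playwright's Python API is below. The load_and_settle name and the default timings are assumptions for illustration; the function takes any Playwright Page object. If the network never goes quiet, it gives up waiting and carries on instead of failing the capture.

```python
try:
    from playwright.sync_api import TimeoutError as PlaywrightTimeout
except ImportError:
    # Keeps the sketch importable where Playwright is not installed.
    PlaywrightTimeout = TimeoutError


def load_and_settle(page, url: str, idle_timeout_ms: int = 10_000,
                    buffer_ms: int = 1_500) -> bool:
    """Navigate, wait (bounded) for network idle, then pause briefly.

    Returns True if the page reached network idle, False if that wait timed out.
    """
    page.goto(url, wait_until="load")
    try:
        page.wait_for_load_state("networkidle", timeout=idle_timeout_ms)
        went_idle = True
    except PlaywrightTimeout:
        # Long-lived connections or recurring pings: stop waiting and move on.
        went_idle = False
    # Fixed buffer for late rendering work after the network settles.
    page.wait_for_timeout(buffer_ms)
    return went_idle
```

The return value is worth logging: a URL that regularly times out here is a candidate for a shorter idle timeout or a site-specific wait.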

Lazy loading. Lazy-loaded images below the fold won't load until they are scrolled into view. For a full-page capture, scroll through the page first to trigger them, then scroll back to the top and capture. Playwright doesn't do this automatically.
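
A rough sketch of the scroll-then-capture routine follows. The capture_full_page name, the step size, the pause lengths, and the step cap are all assumptions to tune per site; the cap is there so infinite-scroll pages still terminate.

```python
SCROLL_JS = """
async ({ stepPx, maxSteps }) => {
  const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  let y = 0;
  // scrollHeight is re-read every step because lazy content can grow the page;
  // maxSteps caps the loop so infinite-scroll pages still terminate.
  for (let i = 0; i < maxSteps && y < document.documentElement.scrollHeight; i++) {
    window.scrollTo(0, y);
    y += stepPx;
    await pause(100);
  }
  window.scrollTo(0, 0);
}
"""


def capture_full_page(page, path: str, step_px: int = 600,
                      max_steps: int = 50) -> None:
    """Scroll through the page to trigger lazy loading, then capture it all."""
    page.evaluate(SCROLL_JS, {"stepPx": step_px, "maxSteps": max_steps})
    page.wait_for_timeout(500)  # give the last images a moment to finish
    page.screenshot(path=path, full_page=True)
```

Keeping the step smaller than the viewport height means every part of the page passes through the viewport at least once.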

Cookie consent banners. They'll be in your screenshot unless you dismiss them. Options: inject CSS to hide them (display: none on common selectors), or click the "Accept" button programmatically. Neither is 100% reliable across all sites.

Pop-ups and modals. Newsletter pop-ups, age verification gates, GDPR notices — they'll all appear in your screenshots. Same mitigation as cookie banners: CSS injection or programmatic dismissal.
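
The CSS-injection route for both banners and pop-ups might look like the sketch below. The selector list is illustrative only: it mixes two selectors used by common consent tools with guessed attribute patterns, and you would need to maintain your own list for the sites you capture.

```python
# Illustrative selectors, not an exhaustive or authoritative list.
OVERLAY_SELECTORS = (
    "#onetrust-consent-sdk",        # OneTrust consent banner
    "#CybotCookiebotDialog",        # Cookiebot dialog
    "[id*='cookie-banner']",        # guessed naming pattern
    "[class*='cookie-consent']",    # guessed naming pattern
    "[class*='newsletter-popup']",  # guessed naming pattern
)


def build_hide_css(selectors) -> str:
    """Build one CSS rule that hides every element matching the selectors."""
    selectors = list(selectors)
    if not selectors:
        return ""
    return ", ".join(selectors) + " { display: none !important; }"


def hide_overlays(page, selectors=OVERLAY_SELECTORS) -> None:
    """Inject the hiding rule into the page; call after navigation."""
    css = build_hide_css(selectors)
    if css:
        page.add_style_tag(content=css)
```

Hiding is less invasive than clicking "Accept" because it changes no site state, but it leaves behind any scroll lock or backdrop the overlay applied through other elements.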

Web fonts. Fonts load asynchronously. If you capture before fonts are loaded, you'll get fallback system fonts in your screenshot. Wait for document.fonts.ready or add a delay after page load.
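
Waiting on document.fonts.ready from Playwright can be as small as the sketch below (wait_for_fonts is a hypothetical helper name). The promise resolves to a FontFaceSet, which cannot be passed back to Python, so the expression returns a plain boolean instead.

```python
def wait_for_fonts(page) -> bool:
    """Block until the page's font loading has settled."""
    # page.evaluate waits for the returned Promise before it returns.
    return bool(page.evaluate("() => document.fonts.ready.then(() => true)"))
```

Call it after navigation and before page.screenshot(). Fonts requested by content that renders later, such as lazily mounted components, can still slip past this wait.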

Memory and process management

Chromium is memory-hungry. Each browser instance uses roughly 200–500 MB of RAM. At scale, you need to manage browser instances carefully: reuse them across requests (with page cleanup between captures), set memory limits, and handle crashes gracefully. A leaked browser instance holds on to its memory until the process is killed, and enough leaks will exhaust the server's RAM.
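
Here is one possible shape for browser reuse, sketched against Playwright's Browser interface. ScreenshotWorker, fresh_page, and the recycle-after-100-captures default are assumptions for illustration. Each capture gets a throwaway browser context, and the browser is relaunched after a fixed number of captures or when it has disconnected.

```python
from contextlib import contextmanager


@contextmanager
def fresh_page(browser, width: int = 1280, height: int = 720):
    """Yield a page in a throwaway context, then close the context."""
    context = browser.new_context(viewport={"width": width, "height": height})
    try:
        yield context.new_page()
    finally:
        context.close()  # drops this capture's page, cookies, and storage


class ScreenshotWorker:
    """Reuses one browser; relaunches it after N captures or a crash."""

    def __init__(self, launch, max_captures: int = 100):
        self._launch = launch  # zero-argument callable returning a Browser
        self._max = max_captures
        self._browser = None
        self._count = 0

    def _ensure_browser(self):
        stale = (self._browser is None
                 or not self._browser.is_connected()
                 or self._count >= self._max)
        if stale:
            self.close()
            self._browser = self._launch()
            self._count = 0
        return self._browser

    def capture(self, url: str, path: str) -> None:
        browser = self._ensure_browser()
        self._count += 1
        with fresh_page(browser) as page:
            page.goto(url)
            page.screenshot(path=path)

    def close(self) -> None:
        if self._browser is not None:
            try:
                self._browser.close()
            except Exception:
                pass  # the browser may already be dead
            self._browser = None
```

With Playwright, launch could be lambda: p.chromium.launch() inside a with sync_playwright() as p: block. Recycling on a capture count is a blunt tool, but it puts an upper bound on how long a slow leak can grow.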

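Orphaned browser processes (often loosely called zombies) are another common issue. If your application crashes mid-screenshot, the Chromium process may be left running with nothing left to close it. Monitor for browser processes that have outlived their parent and kill them, for example from a periodic cleanup job.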
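
A cleanup job for leftover processes could be sketched as follows. It rests on several assumptions: a Linux host with procps ps (for the etimes column), leftover processes being reparented to PID 1 (typical in containers, not guaranteed elsewhere), and browser process names of chrome, chromium, or headless_shell. The function names and the five-minute age threshold are illustrative.

```python
import os
import signal
import subprocess

# Assumed process names; check what `ps` reports on your own hosts.
BROWSER_NAMES = {"chrome", "chromium", "headless_shell"}


def find_orphans(ps_output: str, min_age_s: int = 300) -> list[int]:
    """Return PIDs of browser processes with parent PID 1 older than min_age_s.

    Expects rows of "pid ppid elapsed_seconds command_name".
    """
    orphans = []
    for line in ps_output.strip().splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4 or not all(p.isdigit() for p in parts[:3]):
            continue  # skip headers and malformed rows
        pid, ppid, age_s = (int(p) for p in parts[:3])
        name = parts[3].strip()
        if ppid == 1 and age_s >= min_age_s and name in BROWSER_NAMES:
            orphans.append(pid)
    return orphans


def reap_orphans(min_age_s: int = 300) -> list[int]:
    """Kill orphaned browser processes; returns the PIDs that were signalled."""
    out = subprocess.run(
        ["ps", "-eo", "pid=,ppid=,etimes=,comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    killed = []
    for pid in find_orphans(out, min_age_s):
        try:
            os.kill(pid, signal.SIGKILL)
            killed.append(pid)
        except (ProcessLookupError, PermissionError):
            pass  # already gone, or not ours to kill
    return killed
```

The age threshold matters: a browser that your own service launched directly as PID 1's child would match too, so set it well above your longest expected capture and test on a non-production host first.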

When to use an API instead

Managing headless browsers in production is operational overhead. If any of these apply to you, consider a screenshot API instead:

You're running in a serverless environment (Lambda, Vercel, Cloudflare Workers) where installing Chromium is difficult or impossible (here's why serverless and headless browsers don't mix)
You need fewer than 10,000 screenshots per month and your time is worth more than $50/month
You don't want to deal with browser updates, security patches, and crash recovery
You need consistent rendering across environments (local dev, CI, production)
You're building an AI agent that needs to "see" web pages

At nightglass, a screenshot costs $0.005. For 10,000 screenshots, that's $50. If setting up and maintaining headless browser infrastructure would take you more than a few hours, the API is the cheaper option when you account for your time.
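
The arithmetic is simple enough to put in a few lines if you want to plug in your own numbers. The $0.005 price comes from the paragraph above; the function names are illustrative, and the hourly rate and hours in the usage note are hypothetical inputs. The sketch ignores server costs and compares a single month.

```python
def api_cost(screenshots: int, price_per_screenshot: float = 0.005) -> float:
    """Monthly API bill in dollars for a given screenshot volume."""
    return screenshots * price_per_screenshot


def self_host_time_cost(hours: float, hourly_rate: float) -> float:
    """Dollar value of the time spent running your own browsers."""
    return hours * hourly_rate


def api_is_cheaper(screenshots: int, hours: float, hourly_rate: float) -> bool:
    """True when the API bill is below the value of the time spent."""
    return api_cost(screenshots) < self_host_time_cost(hours, hourly_rate)
```

For example, api_is_cheaper(10_000, hours=3, hourly_rate=75) compares the $50 API bill against a hypothetical three hours of maintenance at $75 an hour.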

Not sure which API to use? The 2026 screenshot API comparison breaks down the main players. For the full nightglass reference, see the quickstart guide.