[Bug]: headless=False Parameter Ignored by AsyncWebCrawler on Windows (v0.6.1) #1034

@Fabrice-x-m

Description

crawl4ai version

0.6.1

Expected Behavior

When AsyncWebCrawler is initialized with headless=False (either via BrowserConfig or directly as a constructor parameter), a visible browser window (e.g., Chromium) is expected to launch and appear on the user's display during the crawler's initialization phase or shortly after, before navigation (arun) begins. The user should be able to see the browser UI operate.

Current Behavior

Despite explicitly setting headless=False, no browser window becomes visible on the screen when initializing or running AsyncWebCrawler. The process executes entirely in the background, behaving as if it were running in headless mode.

While the crawl operation itself may succeed technically (e.g., fetching HTML from example.com), the browser UI is never displayed.

Notably:

  1. Running Playwright directly on the same system with launch(headless=False) successfully launches a visible browser window.
  2. Setting the PWDEBUG=1 environment variable while using AsyncWebCrawler successfully forces a visible Playwright Inspector window to appear.

This strongly indicates the issue lies within Crawl4AI's handling of the standard headless=False configuration, not with the underlying Playwright installation or the OS environment's capability to run visible browsers.
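
For anyone repeating the second check, here is a minimal sketch of running a test script with `PWDEBUG=1` set only for the child process, so the variable does not leak into the rest of the shell session. The helper names are illustrative and not part of Crawl4AI or Playwright.

```python
import os
import subprocess
import sys


def env_with_pwdebug(base=None):
    """Return a copy of the environment with PWDEBUG=1 added.

    The mapping passed in (or os.environ) is left unmodified.
    """
    env = dict(os.environ if base is None else base)
    env["PWDEBUG"] = "1"
    return env


def run_with_pwdebug(script_path):
    """Run a Python script in a child process with PWDEBUG=1 set."""
    # check=False: a non-zero exit code is returned rather than raised.
    return subprocess.run(
        [sys.executable, script_path],
        env=env_with_pwdebug(),
        check=False,
    )
```

Calling `run_with_pwdebug("snippet1.py")` would run a saved copy of one of the snippets below; the file name is a placeholder.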

Is this reproducible?

Yes

Inputs Causing the Bug

- **URL(s):** `https://example.com` (Reproducible with simple URLs; likely URL-independent).
- **Settings used:** The setting being ignored is `headless=False`. It was tested in two ways:
    1. Via `BrowserConfig`: `BrowserConfig(browser_type="chromium", headless=False, verbose=True)` passed to `AsyncWebCrawler(config=...)`.
    2. Via direct parameter: `AsyncWebCrawler(browser_type="chromium", headless=False, verbose=True)`.
- **Input data:** Not applicable.

Steps to Reproduce

1.  Set up the environment: Windows 11, Python 3.13.3, Crawl4AI 0.6.1, Playwright 1.51.0 (with browsers installed via `playwright install`).
2.  Run either of the minimal Python code snippets provided below.
3.  During the `asyncio.sleep(5)` pause included in the snippets (immediately after `AsyncWebCrawler` initialization), carefully observe the screen.
4.  **Observe:** Note that no browser window appears, contrary to the expected behavior for `headless=False`. The script continues execution silently in the background.
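
To make step 1 easier to compare across machines, the environment details can be collected with a short script. This is a sketch using only the standard library; it assumes the packages are installed under the distribution names `crawl4ai` and `playwright`.

```python
import platform
from importlib import metadata


def package_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("crawl4ai", "playwright")):
    """Collect OS, Python, and package versions into a dict."""
    report = {
        "os": f"{platform.system()} {platform.release()}",
        "python": platform.python_version(),
    }
    for name in packages:
        report[name] = package_version(name) or "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```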

Code snippets

# Snippet 1: Using BrowserConfig (Minimal Test)
import asyncio
import re

from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig, CacheMode


async def test_crawl4ai_visible_minimal_config():
    print("--- Test minimal Crawl4AI with headless=False via BrowserConfig ---")
    browser_cfg = BrowserConfig(
        browser_type="chromium",
        headless=False,  # Explicitly set to False
        verbose=True,
    )
    run_cfg = CrawlerRunConfig(cache_mode=CacheMode.BYPASS)
    print("[!] Initializing AsyncWebCrawler with BrowserConfig(headless=False)...")
    try:
        async with AsyncWebCrawler(config=browser_cfg) as crawler:
            # Pause after initialization, matching step 3 of the steps to reproduce
            print("[!] >>> WATCH SCREEN CAREFULLY FOR 5 SECONDS <<<")
            print("[!] >>> A browser window SHOULD appear now <<<")
            await asyncio.sleep(5)  # Pause for visual observation
            print("[!] Crawler initialized. Attempting crawl...")
            result = await crawler.arun("https://example.com", config=run_cfg)
            print(f"[!] Crawl finished. Success: {result.success}")
            if result.success and result.html:
                # Pull the page title out of the raw HTML as a sanity check
                title_match = re.search(r"<title>(.*?)</title>", result.html, re.IGNORECASE | re.DOTALL)
                print(f"[+] Title from HTML: {title_match.group(1).strip() if title_match else 'Not Found'}")
            elif not result.success:
                print(f"[-] Crawl failed: {result.error_message}")
            print("\n[?] CRITICAL QUESTION: Did you see a browser window open during the pause?")
    except Exception as e:
        print(f"[!!!] Error: {e}")
    finally:
        print("[!] Exiting async with block.")


if __name__ == "__main__":
    asyncio.run(test_crawl4ai_visible_minimal_config())
    print("[!] Test finished.")

# Snippet 2: Using Direct Parameter (Minimal Test)
import asyncio
import re

from crawl4ai import AsyncWebCrawler, CrawlerRunConfig, CacheMode


async def test_crawl4ai_visible_minimal_direct():
    print("--- Test minimal Crawl4AI with headless=False via direct parameter ---")
    run_cfg = CrawlerRunConfig(cache_mode=CacheMode.BYPASS)
    print("[!] Initializing AsyncWebCrawler(headless=False)...")
    try:
        async with AsyncWebCrawler(browser_type="chromium", headless=False, verbose=True) as crawler:
            # Pause after initialization, matching step 3 of the steps to reproduce
            print("[!] >>> WATCH SCREEN CAREFULLY FOR 5 SECONDS <<<")
            print("[!] >>> A browser window SHOULD appear now <<<")
            await asyncio.sleep(5)  # Pause for visual observation
            print("[!] Crawler initialized. Attempting crawl...")
            result = await crawler.arun("https://example.com", config=run_cfg)
            print(f"[!] Crawl finished. Success: {result.success}")
            if result.success and result.html:
                # Pull the page title out of the raw HTML as a sanity check
                title_match = re.search(r"<title>(.*?)</title>", result.html, re.IGNORECASE | re.DOTALL)
                print(f"[+] Title from HTML: {title_match.group(1).strip() if title_match else 'Not Found'}")
            elif not result.success:
                print(f"[-] Crawl failed: {result.error_message}")
            print("\n[?] CRITICAL QUESTION: Did you see a browser window open during the pause?")
    except Exception as e:
        print(f"[!!!] Error: {e}")
    finally:
        print("[!] Exiting async with block.")


if __name__ == "__main__":
    asyncio.run(test_crawl4ai_visible_minimal_direct())
    print("[!] Test finished.")


# Snippet 3: Direct Playwright (Works for Comparison)
import asyncio
from playwright.async_api import async_playwright

async def test_browser_direct_visible():
    print("--- Direct Playwright Test with headless=False (THIS WORKS) ---")
    async with async_playwright() as p:
        print("[!] Launching browser directly...")
        # This launch correctly shows a window:
        browser = await p.chromium.launch(headless=False)
        print("[+] Browser window should be visible now!")
        page = await browser.new_page()
        await page.goto('https://example.com')
        print(f'[+] Title: {await page.title()}')
        await browser.close()
        print("[+] Browser closed.")

if __name__ == "__main__":
    asyncio.run(test_browser_direct_visible())
    print("[!] Direct test finished.")

OS

Windows 11

Python version

3.13.3

Browser

Chromium and Firefox

Browser version

Browser binary installed via playwright install associated with Playwright version 1.51.0

Error logs & Screenshots (if applicable)

Crawl4AI generates no error logs about failing to launch a visible browser. According to the logs, the script often runs to completion successfully (fetching HTML, etc.), just without displaying the expected browser UI.

The core evidence is the visual observation during execution:

  • Running Snippet 1 or Snippet 2 (using Crawl4AI with headless=False): No browser window appears.
  • Running Snippet 3 (using Playwright directly with headless=False): A browser window correctly appears.
  • Running Crawl4AI with the PWDEBUG=1 environment variable set: The Playwright Inspector window correctly appears.
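
The visual observation could be backed up by looking at the command line of the browser process that Crawl4AI spawns (for example, via the "Command line" column in Task Manager's Details tab). The sketch below classifies such a command line, under the assumption that a headless Chromium launch carries a `--headless` switch; the sample arguments in the comments are made up for illustration and are not taken from the attached logs.

```python
def looks_headless(cmdline):
    """Return True if any argument is a --headless switch.

    Matches both the bare form (--headless) and the valued form
    (--headless=new, --headless=old). Other arguments are ignored.
    """
    return any(
        arg == "--headless" or arg.startswith("--headless=")
        for arg in cmdline
    )


# Hypothetical command lines, split into arguments:
# looks_headless(["chrome.exe", "--headless", "--disable-gpu"])  is True
# looks_headless(["chrome.exe", "--disable-gpu"])                is False
```

If the process started by Snippet 1 or 2 shows the switch while the one started by Snippet 3 does not, that would point to the `headless=False` value being lost before the browser launch.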

test_log_20250425_200730.log
test_log_20250425_200836.log

Labels

🐞 Bug (Something isn't working); 🩺 Needs Triage (Needs attention of maintainers)