Release Introducing Search - v1.10.0 · mendableai/firecrawl

We’re excited to announce the launch of our new Search API endpoint that combines web search with Firecrawl’s powerful scraping capabilities.

Search Features:

Search the web and get full content from results in one API call
Choose specific output formats (markdown, HTML, links, screenshots)
Customize search parameters (language, country, time range, number of results)
Full SDK support for Python and Node.js
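
As a rough illustration of the features listed above, the sketch below builds the JSON body for a single search request that also asks for scraped content. The endpoint path and the field names (`query`, `limit`, `lang`, `country`, `tbs`, `scrapeOptions.formats`) are assumptions that these notes do not state; check the API reference before relying on them.

```python
import json
from typing import Any, Dict, List, Optional

# Assumed endpoint; it is not stated in these release notes.
SEARCH_URL = "https://api.firecrawl.dev/v1/search"


def build_search_payload(
    query: str,
    limit: int = 5,
    lang: Optional[str] = None,
    country: Optional[str] = None,
    tbs: Optional[str] = None,
    formats: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a request body for a search call (field names are assumed)."""
    if not query:
        raise ValueError("query must be a non-empty string")
    payload: Dict[str, Any] = {"query": query, "limit": limit}
    # Optional search parameters are only included when set.
    if lang is not None:
        payload["lang"] = lang
    if country is not None:
        payload["country"] = country
    if tbs is not None:
        payload["tbs"] = tbs  # assumed name for the time-range filter
    # Requesting output formats asks for page content alongside the results.
    if formats:
        payload["scrapeOptions"] = {"formats": list(formats)}
    return payload


payload = build_search_payload(
    "firecrawl release notes",
    limit=3,
    lang="en",
    country="us",
    formats=["markdown", "links"],
)
print(json.dumps(payload, indent=2))
```

Sending the body would then be an authenticated POST to `SEARCH_URL` with an HTTP client of your choice; the Python and Node.js SDKs presumably wrap this same request.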

More Features

Auto mode proxy for scraping (scrapeURL, js-sdk) #1551, #1602
Timeout handling and content type improvements for scrapeURL/pdf #1570, #1604, #1592
Redis improvements: separate non-eviction Redis support #1600
Search improvements: ignoreBlockedURLs, ignore concurrency limit #1580, #1617
New /cclog endpoint for concurrency logging #1589
Metadata extraction now includes itemprop attributes #1624
Self-hosted: deployable Playwright image #1625
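
For the auto mode proxy item at the top of this list, a scrape request might opt in roughly as sketched below. The `proxy` field name and the `"auto"` value are inferred from the PR titles, and `build_scrape_payload` is a hypothetical helper, not part of any SDK.

```python
from typing import Any, Dict, List, Optional


def build_scrape_payload(
    url: str,
    proxy: str = "auto",
    formats: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a scrape request body that selects a proxy mode (assumed fields)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an absolute http(s) URL")
    return {
        "url": url,
        "proxy": proxy,  # assumed: "auto" lets the service choose the proxy type
        "formats": list(formats) if formats else ["markdown"],
    }


scrape_payload = build_scrape_payload("https://example.com")
print(scrape_payload)
```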

Fixes & Improvements

Better subdomain handling for LLMs.txt + bypass option #1557
Improved URL validation and special character handling #1547
Zombie worker cleanup + TTL handling for extract status #1575, #1599
Fix concurrency queue logic and rate limiter override #1595, #1593
Better logging for search pagination and robust fetch #1572, #1588
Minor fixes: og:locale:alternate, adblock toggle, Playwright-only logic, malformed metadata arrays #1597, #1616, #1574

Testing & Docs

Add MAX_RAM and MAX_CPU environment variable docs #1581
Testing infrastructure improvements #1623

What's Changed

Fix LLMs.txt cache bug with subdomains and add bypass option by @devin-ai-integration in #1557
FIR-1951: Fix URL validation for special characters in query parameters by @devin-ai-integration in #1547
feat(scrapeURL): proxy auto mode (FIR-1853) by @mogery in #1551
feat(scrapeURL/pdf/mu): add timeout and created_at (FIR-2008) by @mogery in #1570
fix(auto_charge): fix ACUC clear (FIR-1805) by @mogery in #1571
fix(api/search): log page options correctly (FIR-2015) by @mogery in #1572
Update docker-compose.yaml comment by @emircanerkul in #1566
hotfix: kill zombie workers, respect timeouts better (FIR-2034) by @mogery in #1575
Fix: Concatenate metadata arrays into strings with exceptions by @devin-ai-integration in #1574
Fix sdk/undefined response handle error by @rafaelsideguide in #1578
feat(python-sdk/CrawlWatcher): remove max payload size from WebSocket (FIR-2038) by @mogery in #1577
FIR-2006: Fix maxUrls and timeLimit parameters in Deep Research API by @devin-ai-integration in #1569
docs: add MAX_RAM and MAX_CPU environment variables documentation by @devin-ai-integration in #1581
feat(search): ignoreBlockedURLs (FIR-1954) by @mogery in #1580
fix(queue-worker): finish crawl if all addable URLs were already locked (FIR-1936) by @mogery in #1582
feat(api/extract): show extract as origin for scrapes originating from it (FIR-2061) by @mogery in #1584
feat(api/v1/extract): ignoreInvalidURLs (FIR-1948) by @mogery in #1585
fix(robustFetch): selective logging (FIR-2072) by @mogery in #1588
feat(scrapeURL, logJob): log pdf page count to db (FIR-2068) by @mogery in #1587
feat(concurrency-log): add cclog endpoint (FIR-2067) by @mogery in #1589
feat: parse PDFs on fc side and reject if too long for timeout (FIR-2083) by @mogery in #1592
feat(queue-worker/afterJobDone): improved ccq insert logic (FIR-2082) by @mogery in #1595
fix(v1): avoid overwriting rateLimiterMode with FIRE-1 rate limiter (FIR-2090) by @mogery in #1593
fix(html-transformer): bad outName for og:locale:alternate (FIR-2101) by @mogery in #1597
fix(extract-status): be able to get extract status even after TTL lapses by @mogery in #1599
feat(scrapeURL): add unnormalizedSourceURL for url matching DX (FIR-2137) by @mogery in #1601
feat(apps/api): add support for a separate, non-eviction Redis by @mogery in #1600
feat(js-sdk): auto mode proxy (FIR-2145) by @mogery in #1602
feat(scrapeURL): handle contentType JSON better in markdown conversion (FIR-2159) by @mogery in #1604
feat(scrapeURL/pdf): bill n credits per page (FIR-1934) by @mogery in #1553
[rust-sdk] webhook param for crawl by @palsp in #1609
feat(search): ignore concurrency limit for search (FIR-2187) by @mogery in #1617
fix(scrapeURL): only allow disabling the adblock on playwright (FIR-2200) by @mogery in #1616
feat(api/scrape): credits_billed column + handle billing for /scrape calls on worker side with stricter timeout enforcement (FIR-2162) by @mogery in #1607
Bypass billing on search preview by @nickscamara in #1622
feat: enhance metadata extraction by including 'itemprop' attribute in HTML by @ftonato in #1624
feat(selfhost): deploy a playwright image by @mogery in #1625
Testing improvements (FIR-2209) by @mogery in #1623
Index (FIR-2177) by @mogery in #1605

New Contributors

@emircanerkul made their first contribution in #1566
@palsp made their first contribution in #1609

Full Changelog: v1.9.0...v1.10.0
