Skip to content

Feature parity: Support for running Crawlee in a web server environment #1148

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
SeeYangZhi opened this issue Apr 10, 2025 · 4 comments · May be fixed by #1174
Open

Feature parity: Support for running Crawlee in a web server environment #1148

SeeYangZhi opened this issue Apr 10, 2025 · 4 comments · May be fixed by #1174
Assignees
Labels
t-tooling Issues with this label are in the ownership of the tooling team.

Comments

@SeeYangZhi
Copy link

I’m currently exploring the Python version of Crawlee and was wondering if there is support (or planned support) for running Crawlee in a web server environment — similar to what’s described in the JavaScript guide here: https://crawlee.dev/js/docs/guides/running-in-web-server

This functionality would be useful for exposing Crawlee crawlers via an API endpoint (e.g., using FastAPI or Flask), enabling dynamic start/stop of crawls and job monitoring through HTTP.

A few questions:

  • Is such functionality currently supported in crawlee-python?
  • If not, are there any plans to support this use case?
  • Any recommended workarounds or best practices in the meantime?
@github-actions github-actions bot added the t-tooling Issues with this label are in the ownership of the tooling team. label Apr 10, 2025
@janbuchar
Copy link
Collaborator

Hi @SeeYangZhi and thank you for your interest in Crawlee! Thios is already possible, but not exactly straightforward - the approach depends on your exact use case.

Could you describe what you have in mind? Perhaps we can compile this info into a guide later...

@SeeYangZhi
Copy link
Author

Hi @janbuchar, thanks for the quick response!

My use case involves exposing a Crawlee crawler as a web API, where users can send a POST request with a URL, and receive the extracted data (e.g. page title, presence of specific elements, etc.) as a JSON response.

Essentially, I’m trying to replicate the functionality described in the JS guide for running Crawlee in a web server, but using the Python version of Crawlee with FastAPI.

@janbuchar
Copy link
Collaborator

Right, so one API call leads to the extraction of a single page. We'll see if we can add a guide for that 🙂

@SeeYangZhi
Copy link
Author

Right, so one API call leads to the extraction of a single page. We'll see if we can add a guide for that 🙂

Looking forward for the guide!

@Pijukatel Pijukatel self-assigned this Apr 25, 2025
@Pijukatel Pijukatel linked a pull request Apr 25, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
t-tooling Issues with this label are in the ownership of the tooling team.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants