The package implements a wrapper layer to extract job data from the environment, prepare the job properly, and execute it using Scrapy.
`estela-crawl`
: Process job args and settings to run the job with Scrapy.

`estela-describe-project`
: Print JSON-encoded project information and image metadata.
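For context, console entrypoints like these are typically declared in `setup.py`. The sketch below is an assumption about how that declaration might look; the module paths and package name are hypothetical, not this package's actual layout:

```python
from setuptools import setup, find_packages

# Hypothetical declaration; module paths here are assumptions,
# not the package's real structure.
setup(
    name="estela-entrypoint",
    packages=find_packages(),
    entry_points={
        "console_scripts": [
            "estela-crawl=estela_scrapy.__main__:main",
            "estela-describe-project=estela_scrapy.describe_project:main",
        ],
    },
)
```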
Install the package and its requirements:

$ python setup.py install
$ pip install -r requirements.txt
Job specifications are passed through environment variables:
`JOB_INFO`
: Dictionary with these fields (a parsing sketch follows the list):
- [Required] key: Job key (job ID, spider ID, and project ID).
- [Required] spider: String spider name.
- [Required] auth_token: User authentication token.
- [Required] api_host: API host URL.
- [Optional] args: Dictionary with job arguments.
- [Required] collection: String with the name of the collection where items will be stored.
- [Optional] unique: String, "True" if the data will be stored in a unique collection, "False" otherwise. Required only for cronjobs.
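As a minimal sketch of how a job process might read this specification, assuming `JOB_INFO` arrives as a JSON-encoded string (the field names come from the list above; the parsing code itself is an illustration, not the package's implementation):

```python
import json
import os

# Assumption: JOB_INFO is a JSON-encoded dictionary in the environment.
job_info = json.loads(os.environ["JOB_INFO"])

key = job_info["key"]                 # Required: job, spider, and project IDs.
spider = job_info["spider"]           # Required: spider name.
auth_token = job_info["auth_token"]   # Required: user authentication token.
api_host = job_info["api_host"]       # Required: API host URL.
args = job_info.get("args", {})       # Optional: job arguments.
collection = job_info["collection"]   # Required: items collection name.
unique = job_info.get("unique", "False")  # Optional: "True"/"False", cronjobs only.
```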
`QUEUE_PLATFORM`
: The queue platform used by estela; review the list of currently supported platforms.

`QUEUE_PLATFORM_{PARAMETERS}`
: Please refer to the estela-queue-adapter documentation to declare the needed variables.
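To illustrate the `{PARAMETERS}` naming pattern only (the actual parameter names are defined by estela-queue-adapter, not here), a sketch that gathers every `QUEUE_PLATFORM_`-prefixed variable from the environment:

```python
import os

# QUEUE_PLATFORM selects the queue backend, per the documentation above.
platform = os.environ["QUEUE_PLATFORM"]

# Collect every QUEUE_PLATFORM_* variable into a plain dict,
# e.g. QUEUE_PLATFORM_LISTENERS -> {"LISTENERS": ...}.
# The example name is hypothetical; see estela-queue-adapter for real ones.
prefix = "QUEUE_PLATFORM_"
queue_params = {
    name[len(prefix):]: value
    for name, value in os.environ.items()
    if name.startswith(prefix)
}
```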
Run the tests:

$ pytest

Format the code with black:

$ black .