Spidergram is a toolkit for crawling and analyzing complex web properties. create-spidergram
is a quick and easy way to set up a new Spidergram project of your own.
- Ensure you're running Node.js 18 (`node -v`).
- Install ArangoDB via direct download or Homebrew. Alternatively, if you have Docker installed, you can use Spidergram's included `docker-compose.yml` file to spin up an Arango container for testing and development.
- Create a new project directory, `cd` into it, and run `npx create-spidergram`. You'll be prompted for the project's name and your choice of project template.
- Run `npm install`.
- Kick the tires with `npm run crawl <url>`, or dive right in to customizing the project.
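
The setup steps above can be collected into a small shell script. This is a minimal sketch, not part of Spidergram itself: the function names `node_major`, `node_is_at_least`, and `setup_project`, and the default directory name `my-crawl`, are hypothetical. It also assumes that Node.js releases newer than 18 are acceptable, which the steps above do not state.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from Spidergram.

# Print the major version from a string such as "v18.12.1".
node_major() {
  v="${1#v}"        # drop the leading "v"
  echo "${v%%.*}"   # keep everything before the first dot
}

# Succeed if version string $1 has a major version of at least $2.
node_is_at_least() {
  [ "$(node_major "$1")" -ge "$2" ] 2>/dev/null
}

# Walk through the steps listed above. Defined only; never called automatically.
setup_project() {
  dir="${1:-my-crawl}"   # hypothetical project directory name
  # Assumption: anything from Node.js 18 upward is fine.
  if ! node_is_at_least "$(node -v)" 18; then
    echo "Node.js 18 is required" >&2
    return 1
  fi
  mkdir -p "$dir" && cd "$dir" || return 1
  npx create-spidergram   # prompts for the project name and template
  npm install
}
```

After sourcing the file, `setup_project my-crawl` runs the directory, scaffold, and install steps in order; `npm run crawl <url>` can then be run from inside the new project.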
- Boilerplate is an NPM project that fires up a Spidergram crawler, grabs the contents of one or more sites, and prints out a summary report of their URL structures.
- Boilerplate (Typescript) is a TypeScript version of Boilerplate, with no other functional differences.
- Crawl with Report (Typescript) demonstrates basic data extraction and custom report generation in plaintext and Excel formats.
- JSON config uses a static config file to control most Spidergram settings in conjunction with the globally installed CLI. If you're interested in kicking the tires, just install this one, then run `npm install -g spidergram`, `brew install docker-compose`, and `docker-compose up`. You're ready to Spidergram.
- YAML config: What if Spidergram, but YAML?
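
The JSON config quick start above can be sketched as a pair of shell functions. The names `missing_tools` and `json_config_quickstart` are hypothetical, and two choices here are assumptions rather than anything the template prescribes: Homebrew is only invoked when `docker-compose` is not already on the `PATH`, and the containers are started with `-d` so the terminal stays usable.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from Spidergram.

# Print each named command that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Follow the quick-start commands for the JSON config template.
# Defined only; never called automatically. brew is macOS-specific.
json_config_quickstart() {
  npm install -g spidergram
  # Assumption: skip the Homebrew install if docker-compose already exists.
  if [ -n "$(missing_tools docker-compose)" ]; then
    brew install docker-compose
  fi
  # Assumption: -d (detached) instead of a foreground "docker-compose up".
  docker-compose up -d
}
```

Running `missing_tools npm docker-compose` first is a quick way to see which prerequisites still need installing before calling `json_config_quickstart` from the project directory.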