This repo is a demo for web scraping on a schedule with GitHub Actions.
We're scraping police Calls for Service from Oakland, CA. It's the same information you can find here, but in CSV form. If you'd like more information, here's all we know about the Data Source.
The city of Oakland reported a ransomware attack that it has confirmed affected this data. February 9 is the last day the city posted new data. For February 10, the data is identical to the day before. No data at all was posted for February 11 and 12. And since February 13, the data again repeats, identically, what the city had available on February 9. We've been in touch with the city of Oakland and will note here when there is a fix.
GitHub Actions gives us, in this case, cron capability and free storage (up to a point) for any scraper we tell it how to run.
It will also monitor the jobs for us, and commit and push back to the repo any new data it finds at our endpoint, all without us having to do anything beyond setting it up in .github/workflows/update.yml.
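For reference, a workflow of that kind typically combines a `schedule` trigger, a fetch step, and a commit-if-changed step. The sketch below shows the general shape; the data URL, schedule, and commit details are placeholders, not the actual contents of this repo's update.yml.

```yaml
name: Scrape latest data

on:
  workflow_dispatch:       # allow manual runs
  schedule:
    - cron: '0 * * * *'    # illustrative: once an hour, in UTC

jobs:
  scrape:
    runs-on: ubuntu-latest
    permissions:
      contents: write      # needed so the job can push commits
    steps:
      - name: Check out this repo
        uses: actions/checkout@v4
      - name: Fetch latest data
        # Placeholder URL; the real endpoint is the one this repo scrapes.
        run: curl -sS "https://example.com/calls-for-service" -o data.csv
      - name: Commit and push if it changed
        run: |
          git config user.name "Automated"
          git config user.email "actions@users.noreply.github.com"
          git add -A
          # Exit cleanly when there is nothing new to commit.
          git commit -m "Latest data: $(date -u)" || exit 0
          git push
```

The `|| exit 0` on the commit step is what makes the job a no-op on runs where the endpoint returned identical data, so the repo's history only records actual changes.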