Tools
Kara Moraw edited this page Jul 10, 2024 · 4 revisions
A huge tool for building a database of information from Git repositories. Link to resulting schema.
It ships Docker containers that run a database and an API. Repositories can be added using their git ID (docs).
Open question: can it be used for repositories to which I do not have, e.g., push access?
Perceval (component): Python API for retrieving data from repositories.
Arthur: schedules and executes Perceval for larger numbers of software repositories. Uses a Redis queue.
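As a rough illustration of what retrieving data through Perceval's Python API can look like, here is a minimal sketch using its Git backend. The helper names `summarize_commit` and `iter_commit_summaries` are hypothetical, and the `data` keys `commit` and `Author` are what the Git backend is assumed to emit; check the Perceval documentation for the installed version before relying on them.

```python
from __future__ import annotations

from typing import Iterator


def summarize_commit(item: dict) -> tuple[str, str]:
    """Reduce one Perceval Git item to (commit hash, author string)."""
    data = item["data"]  # Perceval wraps the backend payload under "data"
    return data["commit"], data["Author"]


def iter_commit_summaries(uri: str, gitpath: str) -> Iterator[tuple[str, str]]:
    """Yield (hash, author) for every commit Perceval fetches from `uri`."""
    # Imported lazily so summarize_commit works without Perceval installed.
    from perceval.backends.core.git import Git

    # gitpath is the local directory where Perceval keeps its clone.
    repo = Git(uri=uri, gitpath=gitpath)
    for item in repo.fetch():
        yield summarize_commit(item)
```

A call such as `iter_commit_summaries("https://github.com/<owner>/<repo>.git", "/tmp/repo.git")` would clone the repository locally and walk its history, which is the kind of job Arthur schedules in bulk.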
GH Archive: records the public GitHub timeline, archives it, and makes it easily accessible.
Data is available as raw, hourly, JSON-encoded event files from data.gharchive.org.
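To make the hourly files more concrete, the sketch below builds the URL for one hour and decodes a downloaded file. It assumes the files are gzip-compressed, contain one JSON event per line, and are named `YYYY-MM-DD-H.json.gz` with an unpadded hour; the function names are hypothetical, so verify the naming scheme against the GH Archive site.

```python
import gzip
import json
import urllib.request
from datetime import datetime


def hourly_archive_url(when: datetime) -> str:
    # Assumed naming scheme: date, then the hour without zero padding.
    return f"https://data.gharchive.org/{when:%Y-%m-%d}-{when.hour}.json.gz"


def parse_events(raw: bytes) -> list:
    """Decode a gzip-compressed file holding one JSON event per line."""
    text = gzip.decompress(raw).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def fetch_events(when: datetime) -> list:
    """Download and decode the events for a single hour (needs network access)."""
    with urllib.request.urlopen(hourly_archive_url(when)) as response:
        return parse_events(response.read())
```

This loads a whole hour into memory, which is fine for a quick look; for bulk analysis it is probably better to stream the file line by line.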
The data is also on Google BigQuery, which requires a Google developer account but allows SQL-like queries.
Note that data processing is limited to 1 TB per month, though.
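Because of the monthly processing limit, it may be worth estimating a query's size before running it. The sketch below uses a BigQuery dry run from the `google-cloud-bigquery` client for that. It assumes the package is installed and credentials are configured; the helper names and the 1 TB constant (taken from the note above) are illustrative, not authoritative.

```python
MONTHLY_QUOTA_BYTES = 10**12  # 1 TB as noted above; check the current BigQuery terms


def quota_fraction(bytes_processed: int, quota: int = MONTHLY_QUOTA_BYTES) -> float:
    """Return the share of the monthly quota a query would consume."""
    return bytes_processed / quota


def estimate_query_bytes(sql: str) -> int:
    """Ask BigQuery how many bytes `sql` would scan, without running it."""
    # Imported lazily: needs google-cloud-bigquery and valid credentials.
    from google.cloud import bigquery

    client = bigquery.Client()
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)  # a dry run processes no data
    return job.total_bytes_processed
```

For example, `quota_fraction(estimate_query_bytes(sql))` for a query over one of the public GH Archive tables would show how much of the month's allowance that query uses up.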