Kara Moraw edited this page Jul 10, 2024 · 4 revisions

Augur

A large tool that builds a database of information from Git repositories. Link to resulting schema.

It provides Docker containers that run a database and an API. Repositories can be added using their git ID (docs).

Open question: can it be used for repositories to which I do not have, e.g., push access?

GrimoireLab

Link

Component Perceval: Python API for retrieving data from repositories.

Component Arthur: schedules and executes Perceval for larger numbers of software repositories. Uses a Redis queue.
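To give an idea of what retrieving data through Perceval's Python API can look like, here is a minimal sketch using its Git backend. The helper names `fetch_commits` and `commits_per_author` are hypothetical and only serve as illustration. The sketch assumes that Perceval is installed and that each fetched item stores the commit under `data` with an `Author` field.

```python
from collections import Counter
from typing import Iterable, Iterator


def fetch_commits(uri: str, gitpath: str) -> Iterator[dict]:
    """Yield one Perceval item per commit of the repository at `uri`."""
    # Imported lazily so that the helper below also works without Perceval.
    from perceval.backends.core.git import Git

    # `gitpath` is the local directory where Perceval keeps its clone.
    repo = Git(uri=uri, gitpath=gitpath)
    yield from repo.fetch()


def commits_per_author(items: Iterable[dict]) -> Counter:
    """Count commits per author across Git backend items."""
    # Assumption: the commit lives under "data" and has an "Author" field.
    return Counter(item["data"]["Author"] for item in items)
```

A call could look like `commits_per_author(fetch_commits("https://github.com/chaoss/grimoirelab-perceval.git", "/tmp/perceval.git"))`. Both the repository URL and the clone path are example values.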

GH Archive

Link

Records the public GitHub timeline, archives it and makes it easily accessible. Data is available as raw, hourly JSON-encoded event files from data.gharchive.org. It is also available on Google BigQuery, which requires Google Developer access but allows SQL-like queries. There is a limit of 1 TB of data processing per month, though.
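As a rough illustration of working with the raw hourly files, the sketch below builds the URL of one hourly dump and counts the event types in it. The function names are hypothetical. It assumes that files are named `YYYY-MM-DD-H.json.gz` with an unpadded UTC hour, and that each file is gzip-compressed with one JSON event per line, each carrying a `type` field.

```python
import gzip
import json
from collections import Counter
from datetime import datetime


def hourly_archive_url(hour: datetime) -> str:
    """Build the URL of the hourly dump for `hour` (interpreted as UTC)."""
    # Assumption: the hour in the file name is not zero-padded.
    return f"https://data.gharchive.org/{hour:%Y-%m-%d}-{hour.hour}.json.gz"


def count_event_types(gz_bytes: bytes) -> Counter:
    """Count events per type in one gzip-compressed hourly dump."""
    counts = Counter()
    for line in gzip.decompress(gz_bytes).splitlines():
        if line.strip():  # skip blank lines
            counts[json.loads(line)["type"]] += 1
    return counts
```

The bytes of a dump can be obtained with, e.g., `urllib.request.urlopen(hourly_archive_url(datetime(2015, 1, 1, 15))).read()` and passed to `count_event_types`. This reads the whole file into memory, which should be acceptable for a single hour but not for larger time ranges.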
