|
1 | | -# github-clone-archiver |
| 1 | +# Metrics Workflow 📊 |
| 2 | + |
2 | 3 | This is a reusable GitHub Actions workflow designed to automate the collection and archival of repository clone statistics. It solves the "14-day limit" problem of GitHub's native traffic insights by persisting data into a central repository. |
| 4 | + |
| 5 | +## 🏗 Architecture |
| 6 | + |
| 7 | +The system uses a **three-repo architecture** to maintain security and organization: |
| 8 | + |
| 9 | +1. **Workflows Repo (`metrics-workflows`)**: Central hub containing the reusable logic (`metrics.yml`). |
| 10 | +2. **Observed Repo(s)**: The repositories being tracked. Each triggers the workflow on a schedule. |
| 11 | +3. **Observer Repo (`metrics-database`)**: The central storage where `.csv` files are maintained and updated. |
| 12 | + |
| 13 | +--- |
| 14 | + |
| 15 | +### 🔍 How it Works |
| 16 | + |
| 17 | +GitHub only keeps traffic data (clones and visitors) for **14 days**. This workflow acts as a "Data Logger": |
| 18 | + |
| 19 | +1. It wakes up every day and asks the GitHub API for the clone history of the **Observed Repo**. |
| 20 | +2. It compares this data with the existing logs in your **Observer Repo**. |
| 21 | +3. It appends only the entries that are not yet logged and refreshes the "14-day total" summary. |
| 22 | +4. It deduplicates the file and sorts it so the most recent stats are always at the top. |
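The four steps above can be sketched in Python. This is a minimal illustration, not the workflow's actual implementation (which, per the Maintenance notes, relies on `awk`). The function name `merge_clone_rows` and the three-column layout (`label,count,uniques`) are assumptions made for the example; the payload shape follows the JSON returned by GitHub's `/traffic/clones` endpoint.

```python
import csv
import io


def merge_clone_rows(existing_csv: str, api_payload: dict, run_date: str) -> str:
    """Merge a /traffic/clones payload into archived CSV text (hypothetical layout)."""
    rows = {}

    # Step 2: load what is already archived, keyed by the label in column 1.
    for record in csv.reader(io.StringIO(existing_csv)):
        if record:
            rows[record[0]] = record[1:]

    # Step 3a: add the daily entries. Assigning by key means a later run
    # overwrites the same day, so only the newest data point is kept.
    for day in api_payload.get("clones", []):
        date = day["timestamp"][:10]  # "2024-05-02T00:00:00Z" -> "2024-05-02"
        rows[date] = [str(day["count"]), str(day["uniques"])]

    # Step 3b: refresh the 14-day summary for the day of this run.
    total_label = f"{run_date}~ 14-day total"
    rows[total_label] = [str(api_payload["count"]), str(api_payload["uniques"])]

    # Step 4: keys are already unique; write them newest first.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for label in sorted(rows, reverse=True):
        writer.writerow([label, *rows[label]])
    return out.getvalue()
```

Calling the function a second time with the same payload and date returns the same text, which mirrors the "runs multiple times in one day" case.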
| 23 | + |
| 24 | +> [!IMPORTANT] |
| 25 | +> **Access Requirement:** You must be the owner or a collaborator with appropriate permissions on the **Observed Repos**. This workflow requires an authorized Personal Access Token (PAT) to read traffic data that is otherwise hidden from the public. |
| 26 | +
|
| 27 | + |
| 28 | +--- |
| 29 | + |
| 30 | +## 🔐 Security & Permissions |
| 31 | + |
| 32 | +To allow a workflow running in an **Observed Repo** to write data to the **Observer Repo**, specific permissions must be configured via a Fine-Grained Personal Access Token (PAT). |
| 33 | + |
| 34 | +### 1. The Personal Access Token (PAT) |
| 35 | + |
| 36 | +Create a token named **"Metrics Workflow"** in your [Developer Settings](https://github.com/settings/tokens?type=beta) under Personal Access Tokens/Fine-grained tokens: |
| 37 | + |
| 38 | +* **Repository access**: |
| 39 | + * Select **Only select repositories**. |
| 40 | + * Include all **Observed Repositories** AND the **Observer Repository**. |
| 41 | + |
| 42 | + |
| 43 | +* **Permissions**: |
| 44 | + * `Administration`: Read-only (required to read the traffic API, e.g. `/traffic/clones`). |
| 45 | + * `Metadata`: Read-only. |
| 46 | + * `Contents`: **Read & Write** (Required to push `.csv` updates). |
| 47 | + |
| 48 | + |
| 49 | + |
| 50 | +### 2. Repository Secrets |
| 51 | + |
| 52 | +In **each** Observed Repo (Settings > Secrets and variables > Actions), add the following secret: |
| 53 | + |
| 54 | +* **Name**: `METRICS_PAT` |
| 55 | +* **Value**: Paste the token generated above. |
| 56 | + |
| 57 | +> [!NOTE] |
| 58 | +> The Observer repo does not need its own secret or a separate PAT. The **"Metrics Workflow" PAT** you created already has "Write" access to the Observer repo, so the workflow can push the data once it finishes gathering it. |
| 59 | +
|
| 60 | +--- |
| 61 | + |
| 62 | +## 🚀 Usage |
| 63 | + |
| 64 | +To implement this in an **Observed Repo**, create a file at `.github/workflows/metrics.yml`: |
| 65 | + |
| 66 | +```yaml |
| 67 | +name: Collect Metrics |
| 68 | + |
| 69 | +on: |
| 70 | +  schedule: |
| 71 | +    - cron: '0 0 * * *' # Runs daily at 00:00 UTC |
| 72 | +  workflow_dispatch: # Allows manual triggering |
| 73 | + |
| 74 | +jobs: |
| 75 | +  update-metrics: |
| 76 | +    # Grants the job's GITHUB_TOKEN read-only access to the Observed repo's contents |
| 77 | +    permissions: |
| 78 | +      contents: read |
| 79 | +    uses: myspace/workflows-repo/.github/workflows/metrics.yml@main |
| 80 | +    with: |
| 81 | +      metrics-repo: myspace/observer-repo |
| 82 | +    secrets: |
| 83 | +      METRICS_PAT: ${{ secrets.METRICS_PAT }} |
| 84 | + |
| 85 | +``` |
| 86 | + |
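| 87 | +This is your caller workflow for the Observed Repo. Replace the placeholder paths `myspace/workflows-repo` and `myspace/observer-repo` with your own Workflows Repo and Observer Repo. |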
| 88 | + |
| 89 | +--- |
| 90 | + |
| 91 | +## 📈 Data Structure |
| 92 | + |
| 93 | +The workflow generates/updates a CSV file named after the repository (e.g., `myspace_observed-repo.csv`). |
| 94 | + |
| 95 | +### Sorting Logic |
| 96 | + |
| 97 | +The CSV is automatically deduplicated and sorted in **reverse chronological order** (newest first). |
| 98 | + |
| 99 | +* **Daily Stats**: Recorded as `YYYY-MM-DD`. |
| 100 | +* **14-day Totals**: Recorded as `YYYY-MM-DD~ 14-day total`. |
| 101 | + |
| 102 | +The use of the tilde (`~`) ensures that in a descending sort, the **Total** summary for a specific day appears immediately **above** the individual daily stats for that same day. |
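The effect of the tilde is easiest to see by sorting whole CSV lines by character code. The sketch below assumes a `label,count,uniques` column layout and a plain code-point sort (comparable to `sort -r` in the C locale); both are assumptions for illustration, and the helper name `newest_first` is hypothetical.

```python
def newest_first(lines: list[str]) -> list[str]:
    """Sort CSV lines in descending code-point order."""
    return sorted(lines, reverse=True)


with_tilde = newest_first([
    "2024-05-01,4,3",
    "2024-05-02,6,3",
    "2024-05-01~ 14-day total,9,5",
    "2024-05-02~ 14-day total,10,6",
])
# '~' (0x7E) is greater than ',' (0x2C), so each total lands directly
# above the daily row that shares its date.

with_space = newest_first([
    "2024-05-02,6,3",
    "2024-05-02 14-day total,10,6",
])
# ' ' (0x20) is less than ',' (0x2C), so without the tilde the total
# would fall below the daily row instead.
```

A locale-aware sort may order punctuation differently, so the byte-wise comparison is the part of this sketch that matters.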
| 103 | + |
| 104 | +--- |
| 105 | + |
| 106 | +## 🛠 Maintenance |
| 107 | + |
| 108 | +* **Adding Repos**: To track a new repository, simply add the `METRICS_PAT` secret to the new repo and create the caller workflow. |
| 109 | +* **Data Integrity**: The workflow uses `awk` to ensure that if it runs multiple times in one day, only the most recent (most complete) data point is saved, preventing duplicates. |
| 110 | + |
| 111 | +--- |
| 112 | + |
| 113 | +## 🛡️🔐 Single Token vs. High Security |
| 114 | + |
| 115 | +This workflow requires cross-repository permissions. You can choose between a **Standard** setup (the current setup, easier to maintain) or a **High Security** setup (follows the Principle of Least Privilege). |
| 116 | + |
| 117 | +### Option 1: Standard Setup (Single Token) |
| 118 | + |
| 119 | +Recommended for solo developers or small setups. |
| 120 | + |
| 121 | +* **Token Name**: `Metrics Workflow` (the token created under Security & Permissions) |
| 122 | +* **Scope**: All Observed Repos **AND** the Observer Repo. |
| 123 | +* **Permissions**: |
| 124 | +    * `Metadata`: Read-only |
| 125 | +    * `Administration`: Read-only |
| 126 | +    * `Contents`: **Read & Write** |
| 127 | + |
| 128 | + |
| 129 | +* **Workflow Secret**: Store as `METRICS_PAT` in all Observed repos. |
| 130 | + |
| 131 | +### Option 2: High Security Setup (Dual Token) |
| 132 | + |
| 133 | +Recommended for teams or sensitive source code. This ensures the "Writer" token cannot be used to modify source code in your Observed repositories. |
| 134 | + |
| 135 | +#### A. The "Traffic Reader" Token |
| 136 | + |
| 137 | +* **Scope**: All **Observed Repos** only. |
| 138 | +* **Permissions**: `Metadata` (Read), `Administration` (Read). |
| 139 | +* **Usage**: Used by the workflow to fetch clone data from the GitHub API. |
| 140 | +* **Secret Name**: `READER_PAT` |
| 141 | + |
| 142 | +#### B. The "Database Writer" Token |
| 143 | + |
| 144 | +* **Scope**: The **Observer Repo** only. |
| 145 | +* **Permissions**: `Contents` (Read & Write). |
| 146 | +* **Usage**: Used by the workflow to `git push` the CSV file. |
| 147 | +* **Secret Name**: `WRITER_PAT` |
| 148 | + |
| 149 | +--- |
| 150 | + |
| 151 | +## 🛡️🚀 Usage (High Security Example) |
| 152 | + |
| 153 | +If you choose the **High Security** route, update your caller workflow in the Observed Repo as follows: |
| 154 | + |
| 155 | +```yaml |
| 156 | +jobs: |
| 157 | +  update-metrics: |
| 158 | +    uses: myspace/workflows-repo/.github/workflows/metrics.yml@main |
| 159 | +    with: |
| 160 | +      metrics-repo: myspace/observer-repo |
| 161 | +    secrets: |
| 162 | +      # We pass the Writer token to the reusable workflow |
| 163 | +      # so it can push to the central database repo |
| 164 | +      METRICS_PAT: ${{ secrets.WRITER_PAT }} |
| 165 | + |
| 166 | +``` |
| 167 | + |
| 168 | +> [!TIP] |
| 169 | +> **Why do we pass the Writer token?** Passing `WRITER_PAT` as the `METRICS_PAT` secret gives the workflow the specific authority needed to write to the **Observer Repo** without permission to write to your source code. Note that the default `GITHUB_TOKEN` cannot be granted the `Administration` permission that the traffic API requires, so reading clone data still depends on the `READER_PAT`. |
| 170 | +
|
| 171 | +--- |
| 172 | + |
| 173 | +## 🛡️🗂️ Permission Table Reference |
| 174 | + |
| 175 | +| Permission | Requirement | Reason | |
| 176 | +| --- | --- | --- | |
| 177 | +| `Metadata` | Read | Basic repository access | |
| 178 | +| `Administration` | Read | Required to access `/traffic/clones` API | |
| 179 | +| `Contents` | Read/Write | Required to push `.csv` changes to Observer Repo | |
| 180 | + |
| 181 | +--- |
| 182 | + |
| 183 | +### How to verify your permissions |
| 184 | + |
| 185 | +If the workflow fails with a `403 Forbidden` error during the **push** phase, check that your PAT (the one passed to `METRICS_PAT`) has `Contents: Write` access specifically for the **Observer Repo**. |