Conversation

Copilot AI commented Jan 7, 2026

The sequential workflow hits 10 bank endpoints every 5 minutes, resulting in high GitHub Actions costs and brittle execution, where a single failure affects all banks.

Changes

Unified Fetcher (scripts/fetch-bank.ts)

  • Consolidates the 10 individual .js postprocessors into a single TypeScript script with a bank config map (sketched below)
  • Accepts the bank name as a CLI argument: deno run --allow-all scripts/fetch-bank.ts akbank
  • Logs structured errors to logs/failures.jsonl with timestamps and stack traces
  • An inline readJSON helper replaces the external dependency
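
A minimal sketch of the fetcher's shape; the endpoint URL, parser logic, and CSV columns below are hypothetical placeholders, while the real entries come from the existing .js postprocessors:

// Minimal sketch of scripts/fetch-bank.ts; bank definitions are placeholders.
type BankConfig = {
  url: string;
  parse: (body: string) => string[][]; // raw response body -> rows of CSV fields
};

const BANKS: Record<string, BankConfig> = {
  akbank: {
    url: "https://example.com/akbank-rates", // hypothetical endpoint
    parse: (body) =>
      JSON.parse(body).rates.map((r: { code: string; buy: number; sell: number }) => [
        r.code,
        String(r.buy),
        String(r.sell),
      ]),
  },
  // ...one entry per bank, ported from the existing .js postprocessors
};

const bank = Deno.args[0];
const config = BANKS[bank];
if (!config) {
  console.error(`Unknown bank: ${bank}`);
  Deno.exit(1);
}

try {
  const res = await fetch(config.url);
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const rows = config.parse(await res.text());
  await Deno.writeTextFile(`${bank}.csv`, rows.map((r) => r.join(",")).join("\n") + "\n");
} catch (err) {
  const e = err instanceof Error ? err : new Error(String(err));
  // One JSON object per line, matching the { timestamp, bank, error, stack } shape.
  const entry = JSON.stringify({
    timestamp: new Date().toISOString(),
    bank,
    error: e.message,
    stack: e.stack ?? "",
  });
  await Deno.mkdir("logs", { recursive: true });
  await Deno.writeTextFile("logs/failures.jsonl", entry + "\n", { append: true });
  Deno.exit(1);
}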

Matrix Workflow (.github/workflows/flat.yml)

  • Schedule changed from */5 * * * * to */10 * * * * (50% cost reduction)
  • Matrix strategy with max-parallel: 5 and fail-fast: false (sketched below)
  • Concurrency control: group: ${{ github.workflow }}-${{ github.ref }}
  • Retry logic for git push conflicts (5 attempts with exponential backoff)
  • Summary job aggregates matrix results and displays success/failure counts
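
A sketch of the workflow's shape under these settings; the action versions, the cancel-in-progress choice, and the commit/retry details are assumptions rather than the exact file contents:

# Sketch only: versions and step details are illustrative.
on:
  schedule:
    - cron: "*/10 * * * *"
  workflow_dispatch:
    inputs:
      banks:
        description: "Comma-separated bank names (empty = all banks)"
        required: false
        default: ""

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true   # assumption: redundant runs are cancelled, not queued

jobs:
  fetch:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      max-parallel: 5
      matrix:
        bank: [akbank, ing, garanti, enpara, teb, hsbc, kuveytturk, isbank, ziraat, yapikredi]
    steps:
      - uses: actions/checkout@v4
      - uses: denoland/setup-deno@v1
      - name: Fetch
        continue-on-error: true
        run: deno run --allow-all scripts/fetch-bank.ts ${{ matrix.bank }}
      - name: Commit and push with retry
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add ${{ matrix.bank }}.csv
          git commit -m "Update ${{ matrix.bank }} data" || exit 0
          for i in 1 2 3 4 5; do          # 5 attempts
            git push && break
            git pull --rebase
            sleep $((2 ** i))             # exponential backoff
          done

  summary:
    needs: fetch
    if: always()                          # run even when fetch jobs fail
    runs-on: ubuntu-latest
    steps:
      - run: echo "Aggregate per-bank results into the job summary here"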

Automated Issue Management

  • Creates issues for failures, with duplicate detection via a label query (sketched below)
  • Issue body includes error details, the stack trace, and a link to the workflow run
  • Auto-closes issues when a bank recovers
  • Labels: automated, data-fetch
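
A sketch of the dedup-then-create flow as it might run inside actions/github-script; the issue title format and the BANK/FAILED environment variables are assumptions:

// `github` is the authenticated Octokit client actions/github-script provides,
// and `context` describes the current run. BANK and FAILED are assumed to be
// passed in through the step's env block.
const bank = process.env.BANK;
const failed = process.env.FAILED === "true";
const title = `Data fetch failed: ${bank}`;   // hypothetical title format

// Dedup check: query open issues carrying the automation labels.
const { data: open } = await github.rest.issues.listForRepo({
  owner: context.repo.owner,
  repo: context.repo.repo,
  state: "open",
  labels: "automated,data-fetch",
});
const existing = open.find((issue) => issue.title.includes(bank));

if (failed && !existing) {
  await github.rest.issues.create({
    owner: context.repo.owner,
    repo: context.repo.repo,
    title,
    body: `Run: ${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`,
    labels: ["automated", "data-fetch"],
  });
} else if (!failed && existing) {
  // The bank recovered: close its open issue.
  await github.rest.issues.update({
    owner: context.repo.owner,
    repo: context.repo.repo,
    issue_number: existing.number,
    state: "closed",
  });
}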

Code Quality

  • Fixed async forEach race conditions by replacing them with for...of loops (illustrated below)
  • Corrected the garanti and ziraat URLs to point at the actual endpoints
  • Exact bank-name matching in the workflow dispatch filter
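
For context on the forEach fix: Array.prototype.forEach fires async callbacks without awaiting them, so rows can be processed out of order and rejections go unhandled. An illustrative before/after, with rows and processRow standing in for the real data and handler:

// Before: callbacks run concurrently and unawaited.
rows.forEach(async (row) => {
  await processRow(row);
});

// After: each row is awaited before the next begins.
for (const row of rows) {
  await processRow(row);
}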

Example

Manual trigger for specific banks:

# Workflow dispatch input
banks: "akbank,ing,hsbc"

# Filter logic
if: ${{ github.event.inputs.banks == '' || 
       contains(fromJSON(format('["{0}"]', replace(github.event.inputs.banks, ',', '","'))), matrix.bank) }}
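
The same dispatch can also be triggered from the command line; a hypothetical invocation, assuming the workflow file is named flat.yml:

gh workflow run flat.yml -f banks="akbank,ing,hsbc"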

Error log structure:

{"timestamp": "2026-01-07T12:00:00.000Z", "bank": "akbank", "error": "HTTP 500", "stack": "..."}

The CSV output format and data processing logic are unchanged, so the result stays compatible with the existing archive.py.

Warning

Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

  • dl.deno.land
    • Triggering command: curl -s REDACTED (dns block)
    • Triggering command: curl --fail --location --progress-bar --output /home/REDACTED/.deno/bin/deno.zip REDACTED (dns block)
    • Triggering command: deno run --unsafely-ignore-certificate-errors --allow-all scripts/fetch-bank.ts hsbc (dns block)
  • esm.ubuntu.com
    • Triggering command: /usr/lib/apt/methods/https tcp --destination-port 53 -j ACCEPT (dns block)
  • www.hsbcyatirim.com.tr
    • Triggering command: deno run --unsafely-ignore-certificate-errors --allow-all scripts/fetch-bank.ts hsbc (dns block)

If you need me to access, download, or install something from one of these locations, the firewall allow list for this repository will need to be updated.

Original prompt

Problem Statement

The current flat-file workflow runs sequentially, hitting multiple bank endpoints every 5 minutes. This results in high GitHub Actions minute costs and limited debuggability. We need to refactor to a more cost-effective and maintainable solution.

Requirements

1. Matrix-Based Parallel Approach

  • Replace sequential fetching with GitHub Actions matrix strategy
  • Run fetches in parallel with fail-fast: false to continue even if one bank fails
  • Use max-parallel: 5 to limit concurrent jobs
  • Add concurrency control to cancel redundant runs

2. Unified Fetcher Script

  • Create a single TypeScript/Deno fetcher script that handles all banks
  • Configure banks via a central config object
  • Script should accept bank name as argument
  • Maintain current CSV output format (no Parquet conversion)
  • Handle errors gracefully with structured logging

3. Improved Error Handling & Debugging

  • Use continue-on-error: true for individual fetch steps
  • Log failures to structured error files
  • Upload failure logs as artifacts (retention: 7 days)
  • Create a summary job that aggregates results from all matrix jobs
  • Show success/failure counts in job summary

4. Automatic Issue Creation

  • Post GitHub issues when banks fail to fetch
  • Critical: Check for existing open issues to prevent duplicates
  • Only create issue if no open issue exists for that bank
  • Include timestamp, bank name, and error details
  • Label with 'automated' and 'data-fetch'
  • Close issues automatically when bank recovers

5. Schedule Optimization

  • Change cron schedule from */5 * * * * to */10 * * * * (every 10 minutes)
  • Keep workflow_dispatch for manual triggers
  • Add optional input to fetch specific banks only

6. Current Behavior to Preserve

  • Keep CSV file format (do NOT convert to Parquet)
  • Maintain existing file structure: {bank}.csv in root
  • Keep current data processing logic from existing .js files
  • Preserve Git commit behavior for data updates

Current Setup

The repository currently uses:

  • .github/workflows/flat.yml with sequential githubocto/flat@v3 actions
  • Individual JavaScript postprocessing files: akbank.js, ing.js, garanti.js, etc.
  • CSV output files: akbank.csv, ing.csv, etc.
  • Deno v1.x runtime

Banks to fetch:

  • akbank
  • ing
  • garanti
  • enpara
  • teb
  • hsbc
  • kuveytturk
  • isbank
  • ziraat
  • yapikredi

Implementation Details

Unified Fetcher Structure

Create scripts/fetch-bank.ts or similar that:

  1. Maintains a config object mapping bank names to their endpoints and parsers
  2. Accepts bank name as CLI argument
  3. Fetches data from the appropriate endpoint
  4. Processes using existing logic from individual .js files
  5. Outputs to CSV in the same format as current implementation
  6. Logs errors to logs/failures.jsonl with structured data

Workflow Structure

  1. Matrix job (fetch): Runs for each bank in parallel
  2. Summary job (summary): Aggregates results, creates issues if needed

Error Tracking

  • Log structure: { timestamp, bank, error, stack }
  • Artifact naming: failure-log-{bank}
  • Issue deduplication: Query existing issues before creating new ones

Success Criteria

  • ✅ Workflow runs every 10 minutes (50% cost reduction)
  • ✅ Banks fetched in parallel (faster execution)
  • ✅ Individual bank failures don't stop other banks
  • ✅ Failed fetches logged with full error details
  • ✅ Job summary shows which banks succeeded/failed
  • ✅ Issues created automatically for failures (no duplicates)
  • ✅ CSV files continue to work as before
  • ✅ Existing data format preserved

Files to Create/Modify

  • .github/workflows/flat.yml - Refactor to matrix strategy
  • scripts/fetch-bank.ts - New unified fetcher (or similar path)
  • Any bank config files needed
  • Error logging structure

Notes

  • Do NOT convert to Parquet - keep CSV format
  • Reuse existing parsing logic from .js files where possible
  • Ensure backward compatibility with existing data consumers
  • Consider the existing archive.py script that processes these CSVs

This pull request was created from Copilot chat.



Copilot AI changed the title from "[WIP] Refactor flat-file workflow to matrix-based parallel approach" to "Refactor to matrix-based parallel workflow with automated issue management" on Jan 7, 2026
Copilot AI requested a review from c6p January 7, 2026 08:03
Copilot finished work on behalf of c6p January 7, 2026 08:03