Structured per-item event reporting for `borg create` (JSON / observability layer) #9528

Hi Borg maintainers 👋,

I’ve been exploring the `borg create` flow in depth (from CLI dispatch → `do_create` → `_rec_walk` / `_process_any` → `archive.py` processing), and I noticed that while Borg provides rich logging for humans, it currently lacks **structured, machine-readable observability** during backup operations.

---

## 💡 Motivation

During large backup runs:

* Errors and retries are logged as text, which can be hard to analyze programmatically
* Deduplication decisions (reuse vs new chunks) are not externally visible
* There is no structured way to trace per-file lifecycle (start → processed → skipped → error)

This makes:

* debugging large backups harder
* integration with external tools (dashboards, monitoring, UIs) difficult
* automated analysis of backup behavior impractical without parsing free-form log text

---

## 🚀 Proposal

Introduce an **optional structured event reporting system** for `borg create`:

### CLI Flag (opt-in)

```bash
borg create --log-json ...
```

---

## 🧩 Core Idea

Emit structured events during backup execution, for example (JSON Lines format):

```json
{"event": "file_started", "path": "/home/user/file.txt"}
{"event": "chunk_reused", "chunk_id": "...", "size": 4096}
{"event": "file_completed", "path": "/home/user/file.txt", "status": "ok"}
{"event": "file_error", "path": "...", "error": "..."}
```
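To show how an external tool might consume such a stream, here is a minimal Python sketch. It assumes the event names and fields from the example above (`file_started`, `chunk_reused`, `file_completed`, `file_error`), which are part of this proposal and not existing Borg output. The function name `summarize_events` and the sample records are hypothetical.

```python
import json
from collections import Counter


def summarize_events(lines):
    """Count events by type and collect the paths that reported errors.

    `lines` is any iterable of JSON Lines records (one JSON object per line).
    The event schema is the one proposed in this issue, not an existing Borg API.
    """
    counts = Counter()
    failed_paths = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        counts[record["event"]] += 1
        if record["event"] == "file_error":
            failed_paths.append(record.get("path"))
    return counts, failed_paths


# Hypothetical sample stream; paths, chunk id and error text are made up.
sample = [
    '{"event": "file_started", "path": "/home/user/file.txt"}',
    '{"event": "chunk_reused", "chunk_id": "abc", "size": 4096}',
    '{"event": "file_completed", "path": "/home/user/file.txt", "status": "ok"}',
    '{"event": "file_error", "path": "/home/user/locked.db", "error": "Permission denied"}',
]

counts, failed_paths = summarize_events(sample)
print(dict(counts))
print(failed_paths)
```

Because each line is a self-contained JSON object, a consumer like this can process the stream incrementally (for example from a pipe) without waiting for the backup to finish.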

---

## 🏗️ Possible Design Direction

* Introduce a lightweight **event emitter** inside the create pipeline
* Hook into key points:

  * before/after file processing (`_process_any`, `process_file`)
  * deduplication decisions (`cache.reuse_chunk`, `add_chunk`)
  * retry / error paths
* Keep default behavior unchanged (text logs remain primary)
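A rough sketch of what such an emitter could look like, in Python. Everything here is hypothetical: `EventEmitter`, `process_with_events`, and the `process` callable are illustrative names, not existing Borg classes or functions, and the real hook points would depend on how `_process_any` and `process_file` are structured.

```python
import io
import json
import sys


class EventEmitter:
    """Hypothetical opt-in emitter writing one JSON object per line."""

    def __init__(self, stream=None, enabled=False):
        self.stream = stream if stream is not None else sys.stderr
        self.enabled = enabled

    def emit(self, event, **fields):
        if not self.enabled:
            return  # default behavior unchanged: nothing is written
        record = {"event": event, **fields}
        self.stream.write(json.dumps(record) + "\n")


def process_with_events(path, process, emitter):
    """Wrap a per-file processing callable with lifecycle events.

    `process` stands in for whatever the create pipeline calls per item.
    Errors are reported as an event and then re-raised, so existing
    error handling and retry logic would still see them.
    """
    emitter.emit("file_started", path=path)
    try:
        process(path)
    except OSError as exc:
        emitter.emit("file_error", path=path, error=str(exc))
        raise
    emitter.emit("file_completed", path=path, status="ok")


def failing(path):
    # Stand-in for a file that cannot be read.
    raise PermissionError("Permission denied")


# Demo: capture events in memory instead of writing to stderr.
buffer = io.StringIO()
emitter = EventEmitter(stream=buffer, enabled=True)

process_with_events("/home/user/file.txt", lambda p: None, emitter)
try:
    process_with_events("/home/user/locked.db", failing, emitter)
except OSError:
    pass  # the caller's normal error handling would run here

events = [json.loads(line) for line in buffer.getvalue().splitlines()]
print([e["event"] for e in events])
```

Keeping the emitter a no-op unless enabled means call sites can emit unconditionally, which keeps the hooks small and leaves the default text logging untouched.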

---

## 🎯 Benefits

* Improves **observability and debugging**
* Enables future tooling:

  * dashboards / visualizations
  * progress trackers
  * integration with external systems
* Keeps Borg backward-compatible (fully opt-in)

---

## ❓ Questions / Feedback

* Would such an observability layer align with Borg’s design goals?
* Are there existing discussions or constraints I should be aware of before prototyping?
* Which direction would you prefer:

  * integrating with the existing logging system, or
  * introducing a separate structured event pipeline?

---

If this direction makes sense, I’d be happy to:

* prototype a minimal version (e.g., file-level events)
* iterate based on feedback

Thanks for your time and for maintaining such a powerful tool 🙌

