Structured per-item event reporting for borg create (JSON / observability layer) #9528

@YK-03

Hi Borg maintainers 👋,

I’ve been exploring the borg create flow in depth (from CLI dispatch → do_create → _rec_walk / _process_any → archive.py processing), and I noticed that while Borg provides rich logging for humans, it currently lacks structured, machine-readable observability during backup operations.


💡 Motivation

During large backup runs:

  • Errors and retries are logged as text, which can be hard to analyze programmatically
  • Deduplication decisions (reuse vs new chunks) are not externally visible
  • There is no structured way to trace per-file lifecycle (start → processed → skipped → error)

This makes:

  • debugging large backups harder
  • integration with external tools (dashboards, monitoring, UI) difficult
  • automated analysis of backup behavior nearly impossible

🚀 Proposal

Introduce an optional structured event reporting system for borg create:

CLI Flag (opt-in)

borg create --log-json ...

🧩 Core Idea

Emit structured events during backup execution, for example (JSON Lines format):

{"event": "file_started", "path": "/home/user/file.txt"}
{"event": "chunk_reused", "chunk_id": "...", "size": 4096}
{"event": "file_completed", "path": "/home/user/file.txt", "status": "ok"}
{"event": "file_error", "path": "...", "error": "..."}
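To illustrate what a downstream tool could do with such a stream, here is a minimal consumer sketch. It assumes the event names and fields from the example above; `summarize_events` and the sample data are hypothetical and not part of Borg.

```python
import json
from collections import Counter


def summarize_events(lines):
    """Aggregate event counts, reused bytes and failed paths from JSON Lines input."""
    counts = Counter()
    reused_bytes = 0
    failed = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        event = json.loads(line)
        kind = event.get("event")
        counts[kind] += 1
        if kind == "chunk_reused":
            reused_bytes += event.get("size", 0)
        elif kind == "file_error":
            failed.append(event.get("path"))
    return {"counts": dict(counts), "reused_bytes": reused_bytes, "failed": failed}


# Hypothetical sample stream using the proposed event shapes.
sample = [
    '{"event": "file_started", "path": "/home/user/file.txt"}',
    '{"event": "chunk_reused", "chunk_id": "abc", "size": 4096}',
    '{"event": "file_completed", "path": "/home/user/file.txt", "status": "ok"}',
    '{"event": "file_error", "path": "/home/user/locked.db", "error": "permission denied"}',
]
summary = summarize_events(sample)
print(summary)
```

A dashboard or monitoring script could run the same loop over the process output line by line instead of a list.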

🏗️ Possible Design Direction

  • Introduce a lightweight event emitter inside the create pipeline

  • Hook into key points:

    • before/after file processing (_process_any, process_file)
    • deduplication decisions (cache.reuse_chunk, add_chunk)
    • retry / error paths
  • Keep default behavior unchanged (text logs remain primary)
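A rough sketch of what such an emitter and a per-file hook might look like. All names here (`EventEmitter`, `process_item`, `backup_fn`) are hypothetical stand-ins, not existing Borg code; the real hook points would live inside the create pipeline.

```python
import io
import json
import sys


class EventEmitter:
    """Hypothetical emitter: writes one JSON object per line, only when enabled."""

    def __init__(self, stream=None, enabled=False):
        self.stream = stream if stream is not None else sys.stderr
        self.enabled = enabled

    def emit(self, event, **fields):
        if not self.enabled:
            return  # opt-in: disabled emitters do nothing
        record = {"event": event, **fields}
        self.stream.write(json.dumps(record) + "\n")


def process_item(path, emitter, backup_fn):
    """Stand-in for a per-file hook point; backup_fn represents the real work."""
    emitter.emit("file_started", path=path)
    try:
        backup_fn(path)
    except OSError as exc:
        emitter.emit("file_error", path=path, error=str(exc))
        return "error"
    emitter.emit("file_completed", path=path, status="ok")
    return "ok"


def failing_backup(path):
    # Simulates an unreadable file.
    raise PermissionError(13, "Permission denied")


buf = io.StringIO()
emitter = EventEmitter(stream=buf, enabled=True)
process_item("/home/user/file.txt", emitter, lambda p: None)
process_item("/home/user/locked.db", emitter, failing_backup)
events = [json.loads(line) for line in buf.getvalue().splitlines()]
print(events)
```

With `enabled=False` (the default), `emit` returns immediately, so the existing text logging path would be unaffected.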


🎯 Benefits

  • Improves observability and debugging

  • Enables future tooling:

    • dashboards / visualizations
    • progress trackers
    • integration with external systems
  • Keeps Borg backward-compatible (fully opt-in)


❓ Questions / Feedback

  • Would such an observability layer align with Borg’s design goals?

  • Are there existing discussions or constraints I should be aware of before prototyping?

  • Preferred direction:

    • integrate with existing logging system
    • or introduce a separate structured event pipeline?
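For the first option, one possible shape is a custom formatter on top of Python's standard `logging` module. This is only a sketch: the logger name, the `fields` key passed via `extra`, and `JsonEventFormatter` are assumptions for illustration and do not reflect how Borg's logging is currently wired.

```python
import io
import json
import logging


class JsonEventFormatter(logging.Formatter):
    """Hypothetical formatter: renders a log record as a single JSON object."""

    def format(self, record):
        payload = {"event": record.getMessage()}
        # Structured fields are passed via logging's `extra` mechanism.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonEventFormatter())

logger = logging.getLogger("example.create.events")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep events out of the human-readable log
logger.addHandler(handler)

logger.info(
    "file_completed",
    extra={"fields": {"path": "/home/user/file.txt", "status": "ok"}},
)
output = stream.getvalue()
print(output, end="")
```

The trade-off is that events share the logging machinery (levels, handlers, configuration) instead of needing a second pipeline, at the cost of coupling the event schema to log records.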

If this direction makes sense, I’d be happy to:

  • prototype a minimal version (e.g., file-level events)
  • iterate based on feedback

Thanks for your time and for maintaining such a powerful tool 🙌
