This repository was archived by the owner on Jan 31, 2020. It is now read-only.

Indirect elements / large element optimisation #59

@dhardy


Pippin currently must load all of a partition's element data into memory to run. This is fine for very small elements but behaves poorly with very large elements or many medium-sized ones. In these cases indirection may be beneficial. Some options:

  • Simple blob files: each large element could be stored in an individual file, and loaded on demand. This should make the rest of the DB smaller and faster, but has a couple of disadvantages: many files may be needed on disk, and some form of garbage collection would be needed to find old/orphan blobs for deletion. In its simplest form, each change to an element would require a new blob.

  • Blob changesets: an addition to the above, allowing an element to be reconstructed from a base blob and one or more changesets (directly from logs or from separate blobs). (Note that simple direct editing of blob files would lose history and risk corruption.)

  • Page/tar files: simply an "optimisation" of the above, allowing many medium-sized pieces of data to be stored together in a single file. These would make garbage collection and backups harder, but could be useful when many small files should be avoided. (Whether to use them should probably be a user option.)
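
To make the first option more concrete, here is a minimal sketch of simple blob files with on-demand loading and a naive garbage collector. Nothing here is existing Pippin API: `BlobRef`, `BlobStore`, the file naming scheme and the idea that the caller supplies the set of live references are all assumptions made for illustration.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical handle stored in the partition in place of the element data.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct BlobRef {
    pub elt_id: u64,
    pub version: u32,
}

/// Hypothetical store keeping one file per element version in `dir`.
pub struct BlobStore {
    dir: PathBuf,
}

impl BlobStore {
    /// Open the store, creating the directory if it does not exist.
    pub fn open(dir: &Path) -> io::Result<BlobStore> {
        fs::create_dir_all(dir)?;
        Ok(BlobStore { dir: dir.to_path_buf() })
    }

    /// Assumed naming scheme: element id and version in hex.
    fn path_of(&self, r: &BlobRef) -> PathBuf {
        self.dir.join(format!("{:016x}-{:08x}.blob", r.elt_id, r.version))
    }

    /// Write a blob for one version of an element. The caller is expected
    /// to pass a fresh version for every change, so that older blobs are
    /// never edited in place.
    pub fn put(&self, elt_id: u64, version: u32, data: &[u8]) -> io::Result<BlobRef> {
        let r = BlobRef { elt_id, version };
        fs::write(self.path_of(&r), data)?;
        Ok(r)
    }

    /// Load a blob on demand; only the requested element is read.
    pub fn get(&self, r: &BlobRef) -> io::Result<Vec<u8>> {
        fs::read(self.path_of(r))
    }

    /// Delete every `.blob` file that is not referenced by `live`.
    /// Returns the number of files removed.
    pub fn collect_garbage(&self, live: &HashSet<BlobRef>) -> io::Result<usize> {
        let keep: HashSet<PathBuf> = live.iter().map(|r| self.path_of(r)).collect();
        let mut removed = 0;
        for entry in fs::read_dir(&self.dir)? {
            let path = entry?.path();
            let is_blob = path.extension().map_or(false, |e| e == "blob");
            if is_blob && !keep.contains(&path) {
                fs::remove_file(&path)?;
                removed += 1;
            }
        }
        Ok(removed)
    }
}
```

In this sketch the partition would hold only small `BlobRef` values, and the garbage collector would need the complete set of references still reachable from any retained snapshot or log entry. Computing that set is the hard part and is left open here, as are changesets and page files.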
