Should pyosmium-up-to-date respect an .osm.pbf's bounds #256

@daniel-j-h

Hi there! Suppose I have used osmium extract to generate a small (< 10 MB) .osm.pbf file of an area from a snapshot, and I have used the --set-bounds option so that the bounds get written into the file header.

I want to keep this small file up to date, e.g. on a daily basis, by running pyosmium-up-to-date, but when I do so it looks like:

  1. It takes multiple minutes (not that big of a deal)
  2. After it finishes I end up with a vastly bigger file (~85 MB)
  3. The updated file no longer seems to include a bounding box in its header

Here is the osmium fileinfo output on the .osm.pbf pyosmium-up-to-date generates:

File:
  Name: latest.osm.pbf
  Format: PBF
  Compression: none
  Size: 88664952
Header:
  Bounding boxes:
  With history: no
  Options:
    generator=pyosmium-up-to-date/3.6.0
    osmosis_replication_base_url=https://planet.osm.org/replication/hour/
    osmosis_replication_sequence_number=103469
    osmosis_replication_timestamp=2024-07-02T12:00:00Z
    pbf_dense_nodes=true
    timestamp=2024-07-02T12:00:00Z
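
A quick way to check for the missing bounds in a script is to look at what follows the "Bounding boxes:" label in the fileinfo text. The sketch below is only an illustration: the helper name `header_bounding_boxes` is made up here, and it assumes that boxes, when present, are printed on lines indented deeper than the label (in the output above there are none).

```python
def header_bounding_boxes(fileinfo_text: str) -> list[str]:
    """Return the entries listed under 'Bounding boxes:' in fileinfo text.

    Assumption: each box sits on its own line, indented deeper than the
    'Bounding boxes:' label. An empty list means no bounds in the header.
    """
    boxes = []
    label_indent = None
    for line in fileinfo_text.splitlines():
        stripped = line.strip()
        indent = len(line) - len(line.lstrip())
        if label_indent is None:
            # Still searching for the label line.
            if stripped == "Bounding boxes:":
                label_indent = indent
            continue
        if not stripped:
            continue
        if indent <= label_indent:
            # Reached the next header field, so the box list is over.
            break
        boxes.append(stripped)
    return boxes


# Shortened copy of the output shown above.
SAMPLE = """\
File:
  Name: latest.osm.pbf
Header:
  Bounding boxes:
  With history: no
"""

print(header_bounding_boxes(SAMPLE))  # prints []
```

The text could come from running `osmium fileinfo latest.osm.pbf` and capturing its standard output.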

I wanted to flag this behavior because it was unexpected to me and I'm not sure if this is by design.


My workaround for now is the following:

  1. Download a snapshot .osm.pbf once e.g. from the Geofabrik download service (> 370 MB)
  2. Use osmium extract to cut a small .osm.pbf out of it (< 10 MB)
  3. Every day
    a. run pyosmium-up-to-date (~ 100 MB)
    b. re-run osmium extract as in step 2 to re-cut for the specific bounds (< 10 MB)
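
The daily part of this workaround (step 3) could be scripted roughly as below. This is a minimal sketch, not a tested setup: the file names and the bounding box are hypothetical placeholders, and it assumes both osmium and pyosmium-up-to-date are on the PATH. It relies on the `--bbox`, `--set-bounds`, `--overwrite` and `-o` options of osmium extract and the `-o` option of pyosmium-up-to-date.

```python
import subprocess
from pathlib import Path

# Hypothetical placeholders: adjust to the actual area and file names.
BBOX = (13.0, 52.3, 13.8, 52.7)     # left, bottom, right, top
EXTRACT = "area.osm.pbf"            # small extract that is kept up to date
UPDATED = "area-updated.osm.pbf"    # larger intermediate file


def update_cmd(source, target):
    """Command that applies replication diffs to source, writing target."""
    return ["pyosmium-up-to-date", "-o", target, source]


def extract_cmd(source, target, bbox):
    """Command that re-cuts source to bbox and stores the bounds in the header."""
    left, bottom, right, top = bbox
    return [
        "osmium", "extract",
        "--bbox", f"{left},{bottom},{right},{top}",
        "--set-bounds", "--overwrite",
        "-o", target, source,
    ]


def daily_refresh(extract=EXTRACT, updated=UPDATED, bbox=BBOX):
    # Remove a stale intermediate file so the update starts clean.
    Path(updated).unlink(missing_ok=True)

    # Step 3a: the result contains changes from outside the area as well.
    result = subprocess.run(update_cmd(extract, updated))
    # To my understanding, exit code 1 means updates were applied but more
    # are still available on the server; anything above 1 is an error.
    if result.returncode > 1:
        raise RuntimeError(f"update failed with exit code {result.returncode}")

    # Step 3b: cut back down to the area and restore the bounds.
    subprocess.run(extract_cmd(updated, extract, bbox), check=True)
    Path(updated).unlink(missing_ok=True)


if __name__ == "__main__":
    daily_refresh()
```

Writing the update to a separate file first keeps the small extract untouched if the update step fails.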

Thank you! Also happy for any pointers on how other folks keep their small extracts up to date!
