Building predictions for Zushi, Japan missing #117

marklit opened this issue Oct 7, 2024 · 0 comments

It looks like there might have been a tile-based decision not to run inference around Zushi, Japan, as there is a clear break ~5 km before the coastline.

[Screenshot: qgis-bin_JNjxTh95hw]
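
To check whether the break lines up with a tile boundary, it may help to work out which tile Zushi falls in. Below is a rough sketch that computes a Bing Maps-style quadkey with awk. The `quadkey` helper is hypothetical, and the coordinates used for Zushi (roughly 35.2955 N, 139.5803 E), the zoom level of 9, and the idea that the dataset is partitioned by quadkeys at that level are all assumptions, not something confirmed in this issue.

```shell
# quadkey LAT LON ZOOM: print the quadkey of the Web Mercator tile containing a point.
# Hypothetical helper; valid for latitudes within about 85 degrees of the equator.
quadkey() {
    awk -v lat="$1" -v lon="$2" -v zoom="$3" 'BEGIN {
        pi = atan2(0, -1)
        n  = 2 ^ zoom                       # tiles per axis at this zoom level
        s  = sin(lat * pi / 180)
        # Web Mercator: normalise x and y to the range [0, 1)
        x  = (lon + 180) / 360
        y  = 0.5 - log((1 + s) / (1 - s)) / (4 * pi)
        tx = int(x * n)
        ty = int(y * n)
        if (tx > n - 1) tx = n - 1
        if (ty > n - 1) ty = n - 1
        key = ""
        for (i = zoom - 1; i >= 0; i--) {
            # one base-4 digit per level: the x bit plus twice the y bit
            key = key ((int(tx / 2 ^ i) % 2) + 2 * (int(ty / 2 ^ i) % 2))
        }
        print key
    }'
}

# Assumed approximate coordinates for Zushi and an assumed zoom level of 9
quadkey 35.2955 139.5803 9
```

If dataset-links.csv turns out to carry a quadkey for each file, searching it for the printed key would show whether anything was published for that tile at all.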

OpenStreetMap doesn't have any building footprints for this area.

[Screenshot: qgis-bin_PIrsldFwpL]

There is a PLATEAU project by the Japanese government to map out buildings in 3D but they haven't covered Zushi yet. Other Japanese public datasets have points for each of the buildings but no polygon footprints.

Any inference of the buildings in this area would produce a pretty unique dataset.

These were the steps I took to extract the buildings:

$ wget https://minedbuildings.blob.core.windows.net/global-buildings/dataset-links.csv

$ cat dataset-links.csv \
    | ~/duckdb \
            -json \
            -c "SELECT Url
                FROM READ_CSV('/dev/stdin')
                WHERE Location = 'Japan'" \
    | jq -r '.[].Url' \
    > japan.txt

$ cat japan.txt | xargs -P4 -I% wget -c %

$ ~/duckdb buildings.duckdb
CREATE OR REPLACE TABLE buildings (
    height     DOUBLE,
    confidence DOUBLE,
    geom       GEOMETRY);
$ for FILENAME in *.csv.gz; do
      gunzip -c "$FILENAME" | gzip -1 > working.jsonl.gz # Recompress to work around the trailing garbage complaint

      echo "INSERT INTO buildings
            SELECT properties.height,
                   properties.confidence,
                   ST_GEOMFROMGeoJSON(geometry) geom
            FROM READ_NDJSON('working.jsonl.gz')" \
          | ~/duckdb buildings.duckdb
  done


$ ~/duckdb buildings.duckdb
COPY(
    SELECT height,
           confidence,
           ST_AsWKB(geom) AS geom
    FROM   buildings
    WHERE  ST_Y(ST_CENTROID(geom)) IS NOT NULL
    AND    ST_X(ST_CENTROID(geom)) IS NOT NULL
    ORDER BY HILBERT_ENCODE([ST_Y(ST_CENTROID(geom)),
                             ST_X(ST_CENTROID(geom))]::double[2])
) TO 'japan.pq' (
    FORMAT 'PARQUET',
    CODEC  'ZSTD',
    ROW_GROUP_SIZE 15000);

Given both the trailing garbage complaint when decompressing the GZIP files and some geometries producing NULL centroid coordinates, there might be other issues at play.
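
To narrow down which of the downloaded files trigger the trailing garbage complaint, something like the sketch below could be run over them. The `check_gz` helper is hypothetical. It relies on `gzip -t` returning a non-zero exit status for anything other than a clean archive; GNU gzip documents status 1 for errors and 2 for warnings, and trailing garbage is typically reported as a warning, but other gzip implementations may behave differently.

```shell
# check_gz FILE...: run the gzip integrity test on each file and report the result.
# Hypothetical helper; assumes gzip -t exits non-zero on warnings and errors.
check_gz() {
    for f in "$@"; do
        status=0
        gzip -t "$f" 2>/dev/null || status=$?
        if [ "$status" -eq 0 ]; then
            echo "clean   $f"
        else
            echo "suspect $f (gzip -t exit status $status)"
        fi
    done
}
```

Running `check_gz *.csv.gz` in the download directory would then list the files worth a closer look. Files flagged with status 1 rather than 2 may be truncated or corrupt rather than just padded.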
