Lazy access of jp2 file in private S3 bucket works, but subsequent computation outside of the session fails with HTTP response 403 #816

@konstntokas

Code Sample, a copy-pastable example if possible

The code sample below raises a warning with an HTTP response code 403. Note that the access key and secret for the bucket can be obtained from CDSE (Copernicus Data Space Ecosystem).

import rasterio
import rioxarray


uri = (
    "s3://eodata/Sentinel-2/MSI/L2A/2020/07/05/S2B_MSIL2A_20200705T101559_N0214_R065"
    "_T32TMT_20200705T135630.SAFE/GRANULE/L2A_T32TMT_A017394_20200705T101917/"
    "IMG_DATA/R10m/T32TMT_20200705T101559_B02_10m.jp2"
)
session = rasterio.session.AWSSession(
    aws_unsigned=False,
    endpoint_url="eodata.dataspace.copernicus.eu",
    aws_access_key_id="xxx",
    aws_secret_access_key="xxx",
)
with rasterio.env.Env(session=session, AWS_VIRTUAL_HOSTING=False):
    ds = rioxarray.open_rasterio(uri, chunks=dict(x=1024, y=1024))

mean = ds.mean()
print(mean)
print(mean.compute())

The output is given below. Computing the dask graph in the last line, print(mean.compute()), requires access to the actual data, which raises the warning. In larger examples it raises the error 'Aborting load due to failure while reading'.

<xarray.DataArray ()> Size: 8B
dask.array<mean_agg-aggregate, shape=(), dtype=float64, chunksize=(), chunktype=numpy.ndarray>
Coordinates:
    spatial_ref  int64 8B 0
Warning 1: HTTP response code on https://eodata.s3.us-east-2.amazonaws.com/Sentinel-2/MSI/L2A/2020/07/05/S2B_MSIL2A_20200705T101559_N0214_R065_T32TMT_20200705T135630.SAFE/GRANULE/L2A_T32TMT_A017394_20200705T101917/IMG_DATA/R10m/T32TMT_20200705T101559_B02_10m.jp2.msk: 403
Warning 1: HTTP response code on https://eodata.s3.us-east-2.amazonaws.com/Sentinel-2/MSI/L2A/2020/07/05/S2B_MSIL2A_20200705T101559_N0214_R065_T32TMT_20200705T135630.SAFE/GRANULE/L2A_T32TMT_A017394_20200705T101917/IMG_DATA/R10m/T32TMT_20200705T101559_B02_10m.jp2.MSK: 403
<xarray.DataArray ()> Size: 8B
array(802.02040494)
Coordinates:
    spatial_ref  int64 8B 0

Process finished with exit code 0

When the last line is executed within the rasterio Env, it works just fine.

import rasterio
import rioxarray


uri = (
    "s3://eodata/Sentinel-2/MSI/L2A/2020/07/05/S2B_MSIL2A_20200705T101559_N0214_R065"
    "_T32TMT_20200705T135630.SAFE/GRANULE/L2A_T32TMT_A017394_20200705T101917/"
    "IMG_DATA/R10m/T32TMT_20200705T101559_B02_10m.jp2"
)
session = rasterio.session.AWSSession(
    aws_unsigned=False,
    endpoint_url="eodata.dataspace.copernicus.eu",
    aws_access_key_id="xxx",
    aws_secret_access_key="xxx",
)
with rasterio.env.Env(session=session, AWS_VIRTUAL_HOSTING=False):
    ds = rioxarray.open_rasterio(uri, chunks=dict(x=1024, y=1024))

mean = ds.mean()
print(mean)

with rasterio.env.Env(session=session, AWS_VIRTUAL_HOSTING=False):
    print(mean.compute())
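
The pattern in the working example can be wrapped in a small helper so that each computation re-enters the environment. This is only a sketch: `compute_in_env` and `env_factory` are hypothetical names and not part of rioxarray or rasterio. It assumes that the lazy object exposes `.compute()` and that a fresh context manager can be created for each call.

```python
from contextlib import nullcontext


def compute_in_env(lazy, env_factory=nullcontext):
    """Evaluate a lazy (dask-backed) object while a context manager is active.

    ``lazy`` is any object with a ``.compute()`` method. ``env_factory`` is a
    zero-argument callable that returns a new context manager on every call.
    """
    with env_factory():
        return lazy.compute()


# Intended usage with the objects from the example above (needs rasterio and
# valid credentials, so it is left as a comment):
#
#   result = compute_in_env(
#       mean,
#       lambda: rasterio.env.Env(session=session, AWS_VIRTUAL_HOSTING=False),
#   )
```

Passing a factory instead of an already created Env keeps the helper independent of rasterio and avoids reusing one context manager object across calls.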

Problem description

Opening the dataset inside the rasterio Env is lazy, so no pixel data is read at that point. Computing the dask graph with print(mean.compute()) outside the Env then needs to access the actual data, which raises the warning. In larger examples it raises 'Aborting load due to failure while reading'.

Expected Output

The access credentials should somehow be saved when opening the data. Otherwise, a new environment needs to be created and applied for each computation on the actual data.
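
Until something like that exists, one possible workaround is to set the same options process-wide through environment variables, since GDAL also reads its configuration options from the environment. The sketch below has not been tested against this bucket and is not confirmed as the recommended approach; `gdal_s3_environ` is a hypothetical helper, and the option names are the GDAL /vsis3/ settings that correspond to the AWSSession arguments used above.

```python
import os


def gdal_s3_environ(access_key, secret_key, endpoint):
    """Build GDAL /vsis3/ configuration options as environment variables."""
    return {
        "AWS_ACCESS_KEY_ID": access_key,
        "AWS_SECRET_ACCESS_KEY": secret_key,
        "AWS_S3_ENDPOINT": endpoint,
        # Path-style requests, matching AWS_VIRTUAL_HOSTING=False in the Env.
        "AWS_VIRTUAL_HOSTING": "FALSE",
    }


# Placeholder credentials; replace "xxx" with the key and secret.
settings = gdal_s3_environ("xxx", "xxx", "eodata.dataspace.copernicus.eu")
os.environ.update(settings)
```

With a distributed dask scheduler the variables would presumably have to be set in every worker process as well, and any explicit rasterio Env or session opened later may still take precedence over them.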

Environment Information

python -c "import rioxarray; rioxarray.show_versions()"
rioxarray (0.17.0) deps:
rasterio: 1.4.1
xarray: 2024.6.0
GDAL: 3.9.3
GEOS: 3.13.0
PROJ: 9.5.0
PROJ DATA: /home/konstantin/micromamba/envs/xcube-stac/share/proj
GDAL DATA: /home/konstantin/micromamba/envs/xcube-stac/share/gdal

Other python deps:
scipy: 1.14.1
pyproj: 3.7.0

System:
python: 3.12.7 | packaged by conda-forge | (main, Oct 4 2024, 16:05:46) [GCC 13.3.0]
executable: /home/konstantin/micromamba/envs/xcube-stac/bin/python
machine: Linux-6.8.0-47-generic-x86_64-with-glibc2.35

Installation method

  • micromamba

Conda environment information (if you installed with conda):


Environment (micromamba list):
$ micromamba list | grep -E "rasterio|xarray|gdal"
  gdal                              3.9.3           py312h1299960_0          conda-forge
  libgdal                           3.9.3           ha770c72_0               conda-forge
  libgdal-core                      3.9.3           hd5b9bfb_0               conda-forge
  libgdal-fits                      3.9.3           h2db6552_0               conda-forge
  libgdal-grib                      3.9.3           hc3b29a1_0               conda-forge
  libgdal-hdf4                      3.9.3           hd5ecb85_0               conda-forge
  libgdal-hdf5                      3.9.3           h6283f77_0               conda-forge
  libgdal-jp2openjpeg               3.9.3           h1b2c38e_0               conda-forge
  libgdal-kea                       3.9.3           h1df15e4_0               conda-forge
  libgdal-netcdf                    3.9.3           hf2d2f32_0               conda-forge
  libgdal-pdf                       3.9.3           h600f43f_0               conda-forge
  libgdal-pg                        3.9.3           h5e77dd0_0               conda-forge
  libgdal-postgisraster             3.9.3           h5e77dd0_0               conda-forge
  libgdal-tiledb                    3.9.3           h4a3bace_0               conda-forge
  libgdal-xls                       3.9.3           h03c987c_0               conda-forge
  rasterio                          1.4.1           py312h8456570_0          conda-forge
  rioxarray                         0.17.0          pyhd8ed1ab_0             conda-forge
  xarray                            2024.10.0       pyhd8ed1ab_0             conda-forge



Details about micromamba and system (micromamba info):
$ micromamba info

       libmamba version : 1.5.8
     micromamba version : 1.5.8
           curl version : libcurl/8.6.0 OpenSSL/3.2.1 zlib/1.2.13 zstd/1.5.5 libssh2/1.11.0 nghttp2/1.58.0
     libarchive version : libarchive 3.7.2 zlib/1.2.13 bz2lib/1.0.8 libzstd/1.5.5
       envs directories : /home/konstantin/micromamba/envs
          package cache : /home/konstantin/micromamba/pkgs
                          /home/konstantin/.mamba/pkgs
            environment : xcube-stac (active)
           env location : /home/konstantin/micromamba/envs/xcube-stac
      user config files : /home/konstantin/.mambarc
 populated config files : 
       virtual packages : __unix=0=0
                          __linux=6.8.0=0
                          __glibc=2.35=0
                          __archspec=1=x86_64-v3
               channels : 
       base environment : /home/konstantin/micromamba
               platform : linux-64


Labels: question (further information is requested)