| 1 | +--- |
| 2 | +draft: false |
| 3 | +date: 2025-03-24 |
| 4 | +categories: |
| 5 | + - Release |
| 6 | +authors: |
| 7 | + - kylebarron |
| 8 | +links: |
| 9 | + - CHANGELOG.md |
| 10 | +--- |
| 11 | + |
| 12 | +# Releasing obstore 0.6! |
| 13 | + |
| 14 | +Obstore is the simplest, highest-throughput Python interface to Amazon S3, Google Cloud Storage, and Azure Storage, powered by Rust. |
| 15 | + |
| 16 | +This post gives an overview of what's new in obstore version 0.6. |
| 17 | + |
| 18 | +<!-- more --> |
| 19 | + |
| 20 | +Refer to the [changelog](../../CHANGELOG.md) for all updates. |
| 21 | + |
| 22 | +## Easier access to Microsoft Planetary Computer |
| 23 | + |
| 24 | +The [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com/) hosts a multi-petabyte catalog of global environmental data. |
| 25 | + |
| 26 | +The data is publicly accessible, but reading it requires the user to fetch [short-lived access tokens](https://planetarycomputer.microsoft.com/docs/concepts/sas/). Fetching and refreshing these tokens every hour can be confusing and annoying. |
| 27 | + |
| 28 | +Following up on the [addition in v0.5 of credential providers](obstore-0.5.md#credential-providers), this release adds [`PlanetaryComputerCredentialProvider`][obstore.auth.planetary_computer.PlanetaryComputerCredentialProvider], which **handles all token access and refresh automatically**. |
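To make "automatic refresh" concrete, here is a minimal sketch of the general pattern: cache a token and fetch a new one shortly before it expires. This is an illustration only, not obstore's implementation. The class `CachedTokenProvider`, the `fetch` callable, and the five-minute refresh margin are all hypothetical names and values chosen for this example.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional, Tuple


class CachedTokenProvider:
    """Hypothetical illustration of token caching; not part of obstore."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, datetime]],
        refresh_margin: timedelta = timedelta(minutes=5),
        now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
    ) -> None:
        # `fetch` returns a (token, expiry) pair; assumed for this sketch.
        self._fetch = fetch
        self._margin = refresh_margin
        self._now = now
        self._token: Optional[str] = None
        self._expires_at: Optional[datetime] = None

    def __call__(self) -> str:
        # Fetch on first use, or when the cached token is about to expire.
        if (
            self._token is None
            or self._expires_at is None
            or self._now() >= self._expires_at - self._margin
        ):
            self._token, self._expires_at = self._fetch()
        return self._token


# Usage with a fake fetcher and a controllable clock.
clock = [datetime(2025, 3, 24, tzinfo=timezone.utc)]
fetch_count = [0]


def fake_fetch() -> Tuple[str, datetime]:
    fetch_count[0] += 1
    return f"token-{fetch_count[0]}", clock[0] + timedelta(hours=1)


provider = CachedTokenProvider(fake_fetch, now=lambda: clock[0])
first = provider()   # fetches a token
second = provider()  # served from the cache
clock[0] += timedelta(minutes=56)
third = provider()   # inside the refresh margin, so a new token is fetched
```

With a credential provider, this bookkeeping stays out of user code: the store asks the provider for credentials whenever it needs them.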
| 29 | + |
| 30 | +As a quick example, we'll read data from the [NAIP dataset](https://planetarycomputer.microsoft.com/dataset/naip): |
| 31 | + |
| 32 | +```py |
| 33 | +from obstore.store import AzureStore |
| 34 | +from obstore.auth.planetary_computer import PlanetaryComputerCredentialProvider |
| 35 | + |
| 36 | +url = "https://naipeuwest.blob.core.windows.net/naip/v002/mt/2023/mt_060cm_2023/" |
| 37 | + |
| 38 | +# Construct an AzureStore with this credential provider. |
| 39 | +# |
| 40 | +# The account, container, and container prefix are passed down to AzureStore |
| 41 | +# automatically. |
| 42 | +store = AzureStore(credential_provider=PlanetaryComputerCredentialProvider(url)) |
| 43 | +``` |
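
To see which parts of the URL correspond to the account, container, and prefix, the following sketch splits a blob URL by hand. It is only an illustration of the URL layout and is not how obstore parses it internally. The helper `split_blob_url` is a hypothetical name, and it assumes the `https://<account>.blob.core.windows.net/<container>/<prefix>` form used above.

```python
from urllib.parse import urlparse


def split_blob_url(url: str) -> tuple[str, str, str]:
    """Hypothetical helper: split a blob URL into (account, container, prefix)."""
    parsed = urlparse(url)
    # The account name is the first label of the host name.
    account = parsed.netloc.split(".")[0]
    # The first path segment is the container; the rest is the prefix.
    container, _, prefix = parsed.path.strip("/").partition("/")
    return account, container, prefix


url = "https://naipeuwest.blob.core.windows.net/naip/v002/mt/2023/mt_060cm_2023/"
account, container, prefix = split_blob_url(url)
print(account, container, prefix)
```

For the NAIP URL above this yields the account `naipeuwest`, the container `naip`, and the prefix `v002/mt/2023/mt_060cm_2023`.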
| 44 | + |
| 45 | +Then, for example, list some items in the container (the prefix `v002/mt/2023/mt_060cm_2023` was automatically set as the prefix on the `AzureStore`): |
| 46 | + |
| 47 | +```py |
| 48 | +items = next(store.list()) |
| 49 | +print(items[:2]) |
| 50 | +``` |
| 51 | + |
| 52 | +```py |
| 53 | +[{'path': '44104/m_4410401_ne_13_060_20230811_20240103.200.jpg', |
| 54 | + 'last_modified': datetime.datetime(2025, 1, 13, 18, 18, 1, tzinfo=datetime.timezone.utc), |
| 55 | + 'size': 14459, |
| 56 | + 'e_tag': '0x8DD33FE9DB7A24D', |
| 57 | + 'version': None}, |
| 58 | + {'path': '44104/m_4410401_ne_13_060_20230811_20240103.tif', |
| 59 | + 'last_modified': datetime.datetime(2025, 1, 13, 16, 39, 6, tzinfo=datetime.timezone.utc), |
| 60 | + 'size': 400422790, |
| 61 | + 'e_tag': '0x8DD33F0CC1D1752', |
| 62 | + 'version': None}] |
| 63 | +``` |
| 64 | + |
| 65 | +And we can fetch an image thumbnail: |
| 66 | + |
| 67 | +```py |
| 68 | +path = "44106/m_4410602_nw_13_060_20230712_20240103.200.jpg" |
| 69 | +image_content = store.get(path).bytes() |
| 70 | + |
| 71 | +# Write out the image content to a file in the current directory |
| 72 | +with open("thumbnail.jpg", "wb") as f: |
| 73 | + f.write(image_content) |
| 74 | +``` |
| 75 | + |
| 76 | +And voilà: |
| 77 | + |
| 78 | + |
| 79 | + |
| 80 | +### Using with the Planetary Computer STAC API |
| 81 | + |
| 82 | +[STAC](https://stacspec.org/en) is a metadata specification for geospatial data. The Planetary Computer [provides a STAC API](https://planetarycomputer.microsoft.com/docs/quickstarts/reading-stac/) to help search and find data of interest. |
| 83 | + |
| 84 | +The [`PlanetaryComputerCredentialProvider`][obstore.auth.planetary_computer.PlanetaryComputerCredentialProvider] includes a [`from_asset`][obstore.auth.planetary_computer.PlanetaryComputerCredentialProvider.from_asset] constructor that reads the storage configuration directly from a STAC Asset. |
| 85 | + |
| 86 | +```py |
| 87 | +import pystac_client |
| 88 | + |
| 89 | +from obstore.auth.planetary_computer import PlanetaryComputerCredentialProvider |
| 90 | +from obstore.store import AzureStore |
| 91 | + |
| 92 | +stac_url = "https://planetarycomputer.microsoft.com/api/stac/v1/" |
| 93 | +# Open the STAC Catalog |
| 94 | +catalog = pystac_client.Client.open(stac_url) |
| 95 | + |
| 96 | +# Access a specific Collection and Asset |
| 97 | +collection = catalog.get_collection("daymet-daily-hi") |
| 98 | +asset = collection.assets["zarr-abfs"] |
| 99 | + |
| 100 | +# Then we can pass this directly to `from_asset` |
| 101 | +credential_provider = PlanetaryComputerCredentialProvider.from_asset(asset) |
| 102 | + |
| 103 | +# Construct the store, then print the objects at the root of this directory |
| 104 | +store = AzureStore(credential_provider=credential_provider) |
| 105 | +print(store.list_with_delimiter()["objects"]) |
| 106 | +``` |
| 107 | + |
| 108 | +```py |
| 109 | +[{'path': '.zattrs', |
| 110 | + 'last_modified': datetime.datetime(2021, 6, 9, 15, 48, 6, tzinfo=datetime.timezone.utc), |
| 111 | + 'size': 402, |
| 112 | + 'e_tag': '0x8D92B5DF9186DED', |
| 113 | + 'version': None}, |
| 114 | + {'path': '.zgroup', |
| 115 | + 'last_modified': datetime.datetime(2021, 6, 9, 15, 45, 56, tzinfo=datetime.timezone.utc), |
| 116 | + 'size': 24, |
| 117 | + 'e_tag': '0x8D92B5DABDE919A', |
| 118 | + 'version': None}, |
| 119 | + {'path': '.zmetadata', |
| 120 | + 'last_modified': datetime.datetime(2021, 6, 9, 15, 46, 56, tzinfo=datetime.timezone.utc), |
| 121 | + 'size': 13479, |
| 122 | + 'e_tag': '0x8D92B5DCFE1B7CC', |
| 123 | + 'version': None}] |
| 124 | +``` |
| 125 | + |
| 126 | +## All updates |
| 127 | + |
| 128 | +Refer to the [changelog](../../CHANGELOG.md) for all updates. |