This repo (and the project in general) has been super useful as a centralized spot to grab met data, thank you!
Wanted to raise an issue that was noticed when following a query pattern that looks like:
- determine nearby Stations inventory
- filter to sites with data during the period of interest
- fetch Hourly data from these stations
The 'hourly_end' field in Stations results doesn't always match the actual hourly data availability obtained via the Hourly class.
This discrepancy means we end up skipping the inventory step and requesting hourly data from more sites than necessary to ensure we get what is actually available.
Given the workaround, this is not a functional issue but does mean passing on more query load, which you might not want!
A minimal example that shows a discrepancy at the time of posting:
from datetime import datetime, timedelta

from meteostat import Stations, Hourly

stations = Stations()
# force latest data so an old cache isn't the problem
stations.max_age = 0
stations = stations.nearby(30.2416, -90.9827)
station = stations.fetch(1)
print(
    f"(Stations) Latest hourly data for {station['name'].iloc[0]} ({station.index[0]}): {station['hourly_end'].iloc[0]}"
)

utc_now = datetime.utcnow()
hourly = Hourly(
    loc=station.index.tolist(),
    start=utc_now - timedelta(days=3),
    end=utc_now,
    model=False,
)
hourly = hourly.fetch()
print(
    f"(Hourly) Latest hourly data for {station['name'].iloc[0]} ({station.index[0]}): {hourly.index.max()}"
)
>>> (Stations) Latest hourly data for Louisiana Regional Airport (7O9W0): 2024-03-03 00:00:00
>>> (Hourly) Latest hourly data for Louisiana Regional Airport (7O9W0): 2024-03-07 16:00:00
Manually removing the cache doesn't seem to resolve the difference either.
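For what it's worth, a possible middle ground between trusting 'hourly_end' and skipping the inventory step entirely might be to pad the reported end date before filtering. The sketch below is only an illustration of that idea, not anything from the library: `candidate_stations`, the `slack` parameter, the 7-day default, the "OLD01" station and the start date shown for 7O9W0 are all made up for the example. Only the 7O9W0 end date (2024-03-03) comes from the output above, where the reported end lagged the served data by 4 days 16 hours.

```python
from datetime import datetime, timedelta


def candidate_stations(inventory, start, end, slack=timedelta(days=7)):
    """Return station IDs whose reported hourly range may overlap [start, end].

    `inventory` maps station ID -> (hourly_start, hourly_end) as reported by
    the station inventory. `slack` (an assumed tolerance) extends each reported
    end date, since the reported end can lag behind the data actually served.
    """
    selected = []
    for station_id, (hourly_start, hourly_end) in inventory.items():
        if hourly_start is None or hourly_end is None:
            continue  # no hourly inventory reported at all
        # Keep the station if its padded range overlaps the requested period.
        if hourly_start <= end and hourly_end + slack >= start:
            selected.append(station_id)
    return selected


# Hypothetical inventory; only the 7O9W0 end date is taken from the report.
inventory = {
    "7O9W0": (datetime(2010, 1, 1), datetime(2024, 3, 3)),
    "OLD01": (datetime(1990, 1, 1), datetime(2005, 12, 31)),
}
now = datetime(2024, 3, 7, 17)
start = now - timedelta(days=3)

# Strict filtering drops the station that does have recent data.
print(candidate_stations(inventory, start, now, slack=timedelta(0)))  # []
# Padding the reported end keeps it while still excluding the long-dead site.
print(candidate_stations(inventory, start, now))  # ['7O9W0']
```

This would still send some unnecessary requests for stations that really did stop reporting within the slack window, but far fewer than querying every nearby site.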