- In GeoConnex, set the request_type='POST' for all CQL queries. While non-spatial CQL queries are working, spatial CQL queries are still not working due to an issue with the GeoConnex service. For now, it's recommended to use the byfilter method for most of the queries, including spatial queries. For simple spatial queries, you can use the bybox method then filter the results based on the actual geometry.
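The "query by box, then filter by the actual geometry" workaround can be sketched without any web request. This is a minimal, pure-Python illustration and not PyNHD code: the `features` dictionary stands in for whatever a bounding-box query returns, and `bounds` and `point_in_polygon` are hypothetical helpers (the latter uses ray casting). In practice a geometry library would normally do the second step.

```python
def bounds(polygon):
    """Return (xmin, ymin, xmax, ymax) of a list of (x, y) vertices."""
    xs, ys = zip(*polygon)
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(point, polygon):
    """Ray-casting test: count edge crossings to the right of the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical query geometry and point features (id -> coordinates).
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
features = {"a": (1.0, 1.0), "b": (3.0, 3.0), "c": (5.0, 1.0)}

# Step 1: coarse selection by bounding box (what a box query would return).
xmin, ymin, xmax, ymax = bounds(triangle)
in_box = {
    k: p for k, p in features.items()
    if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax
}

# Step 2: keep only the features that fall inside the actual geometry.
in_geom = {k: p for k, p in in_box.items() if point_in_polygon(p, triangle)}
```

Here feature "b" passes the box test but is dropped by the geometry test, which is the case the second step exists for.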
- Replace the links to NLDI and PyGeoAPI web services with their new URLs.
- Add two new query methods to the GeoConnex class: bybox and byfilter. Note that CQL queries are still not working due to an issue with the GeoConnex service. For now, it's recommended to use the byfilter method for most of the queries, including spatial queries. For simple spatial queries, you can use the bybox method then filter the results based on the actual geometry.
- Drop support for Python 3.8 since its end-of-life date is October 2024.
- Remove all exceptions from the main module and raise them from the exceptions module. This is to declutter the public API and make it easier to maintain.
- The function pynhd.streamcat can now be called without any arguments to get a dataframe of all available metrics and their descriptions.
- Add the exceptions module to the high-level API to declutter the main module. In the future, all exceptions will be raised from this module and not from the main module. For now, the exceptions are raised from both modules for backward compatibility.
- Switch to using the src layout instead of the flat layout for the package structure. This is to make the package more maintainable and to avoid any potential conflicts with other packages.
- Add artifact attestations to the release workflow.
- Add support for the LakeCat dataset in the streamcat function. A new argument called lakes_only is added to the function. If set to True, only metrics for lakes and their associated catchments will be returned. The default is False to retain backward compatibility.
- Modify the HP3D class based on the latest changes to the 3D Hydrography Program service. The Hydrolocation layer now has three sub-layers: hydrolocation_waterbody for Sink, Spring, Waterbody Outlet; hydrolocation_flowline for Headwater, Terminus, Divergence, Confluence, Catchment Outlet; and hydrolocation_reach for Reach Code, External Connection.
- EPA's HMS no longer supports the StreamCat dataset, since they have a dedicated service for it. Thus, the epa_nhd_catchments function no longer accepts "streamcat" as an input for the feature argument. For all StreamCat queries, use the streamcat function instead. Now, the epa_nhd_catchments function is mainly useful for getting Curve Number data.
- In NLDI.get_basins, the indices used to be station IDs but in the previous release they were reset by mistake. This version retains the correct indices.
- In nhdplus_l48 function, when the layer is NHDFlowline_Network or NHDFlowline_NonNetwork, merge all MultiLineString geometries to LineString.
- Fix an issue in network_xsection and flowline_xsection related to the changes in shapely 2 API. Now, these functions should return the correct cross-sections.
- Add access to USGS 3D Hydrography Program (3DHP) service. The new class is called HP3D. It can be queried by IDs, geometry, or SQL where clause.
- Add support for the new PyGeoAPI endpoint called xsatpathpts. This new endpoint is useful for getting the elevation profile along a shapely.LineString. You can use the pygeoapi function with service="elevation_profile" (or the PyGeoAPI class) to access this new endpoint. Previously, the elevation_profile endpoint was used for getting the elevation profile along a path from two endpoints, and the input GeoDataFrame must have been a MultiPoint with two coordinates. Now, the input must contain LineString geometries.
- Switch to using the new smoothing algorithm from pygeoutils for resampling the flowlines and getting their cross-sections. This new algorithm is more robust, accurate, and faster. It has a new argument called smoothing for controlling the number of knots of the spline. Higher values result in smoother curves. The default value is None, which uses all the points from the input flowline.
- Update GeoConnex based on the latest changes in the web service.
- Fix HyRiver libraries requirements by specifying a range instead of exact version so conda-forge can resolve the dependencies.
From release 0.15 onward, all minor versions of HyRiver packages will be pinned. This ensures that previous minor versions of HyRiver packages cannot be installed with later minor releases. For example, if you have py3dep==0.14.x installed, you cannot install pydaymet==0.15.x. This is to ensure that the API is consistent across all minor versions.
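The pinning rule can be illustrated with a small check. This sketch only restates the rule in code; it is not how pip or conda resolves dependencies, and `same_minor` is a hypothetical helper that assumes plain "major.minor.patch" version strings.

```python
def same_minor(version_a, version_b):
    """Return True if two 'major.minor.patch' strings share major and minor."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]


# Hypothetical environment mixing two minor series of HyRiver packages.
installed = {"py3dep": "0.14.2", "pydaymet": "0.15.0"}

# Collect the distinct "major.minor" series that are present.
minors = {".".join(v.split(".")[:2]) for v in installed.values()}

# Under the pinning rule, a consistent environment has exactly one series.
compatible = len(minors) == 1
```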
- Add a new function, called nhdplus_h12pp, for retrieving HUC12 pour points across CONUS.
- Add use_arrow=True to pynhd.nhdplus_l48 when reading the NHDPlus dataset. This speeds up the process since pyarrow is installed.
- In nhdplus_l48, make layer optional so the sql parameter of pyogrio.read_dataframe can also be used. This is necessary since pyogrio.read_dataframe does not support passing both layer and sql parameters.
- Update the mainstems dataset link to version 2.0 in mainstem_huc12_nx.
- Expose NHDTools class to the public API.
- For now, retain compatibility with shapely<2 while supporting shapely>=2.
- Remove unnecessary conversion of id_col and toid_col to Int64 in nhdflw2nx and vector_accumulation. This ensures that the input data types are preserved.
- Fix an issue in nhdplus_l48 where py7zr fails to extract the file if the input data_dir is not absolute.
- Rewrite the GeoConnex class to provide access to new capabilities of the web service. Support for spatial queries has been added via CQL queries. For more information, check out the updated GeoConnex example notebook.
- Add a new property to StreamCat, called metrics_df, that gets a dataframe of metric names and their descriptions.
- Create a new private StreamCatValidator class to avoid polluting the public StreamCat class with private attributes and methods. Moreover, add a new alternative metric names attribute to StreamCat called alt_names for handling those metric names that do not follow the METRIC+YYYY convention. This attribute is a dictionary that maps the alternative names to the actual metric names, so users can use the METRIC_NAME column of metrics_df and add a year suffix from the valid_years attribute of StreamCat to get the actual metric name.
- In navigate_by* functions of NLDI add stop_comid, which is another criterion for stopping the navigation in addition to distance.
- Improve UserWarning messages of NLDI and WaterData.
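The idea of two stopping criteria for navigation can be sketched on a toy network. This is an illustration only, not the NLDI implementation: `navigate_downstream` and the `network` mapping are hypothetical, and the exact way the service measures distance is an assumption here.

```python
def navigate_downstream(network, start, distance=None, stop_comid=None):
    """Walk downstream from `start` until either criterion is met.

    `network` maps comid -> (tocomid, length_km); tocomid is None at the outlet.
    """
    visited = []
    travelled = 0.0
    comid = start
    while comid is not None:
        visited.append(comid)
        # Criterion 1: the requested stopping ComID has been reached.
        if comid == stop_comid:
            break
        tocomid, length_km = network[comid]
        travelled += length_km
        # Criterion 2: the accumulated length reached the requested distance.
        if distance is not None and travelled >= distance:
            break
        comid = tocomid
    return visited


# Hypothetical four-reach river: 101 -> 102 -> 103 -> 104 -> outlet.
network = {101: (102, 2.0), 102: (103, 3.0), 103: (104, 1.0), 104: (None, 4.0)}

by_distance = navigate_downstream(network, 101, distance=4.0)
by_comid = navigate_downstream(network, 101, stop_comid=103)
by_both = navigate_downstream(network, 101, distance=100.0, stop_comid=103)
```

Whichever criterion is satisfied first ends the walk, so a generous distance combined with a stop ComID behaves like the stop ComID alone.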
- Remove the pynhd.geoconnex function, since so much functionality has been added to the GeoConnex service that this function no longer makes sense. All queries should be done via the pynhd.GeoConnex class.
- Rewrite NLDI to improve code readability and significantly improve performance. Now, its methods do not return tuples if there are failed requests; instead, the failed requests are shown as a UserWarning.
- Bump the minimum required version of shapely to 2.0, and use its new API.
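The pattern behind the NLDI change, reporting failed requests through a warning rather than a second return value, can be sketched with the standard warnings module. The names `fetch_many` and `fake_fetch` are hypothetical and do not come from PyNHD.

```python
import warnings


def fetch_many(ids, fetch_one):
    """Return successful results only; report failures via a UserWarning."""
    results, failed = {}, []
    for item in ids:
        try:
            results[item] = fetch_one(item)
        except LookupError:
            failed.append(item)
    if failed:
        warnings.warn(
            f"{len(failed)} of {len(ids)} requests failed: {failed}",
            UserWarning,
            stacklevel=2,
        )
    return results


def fake_fetch(item):
    """Stand-in for a web request; fails for the ID 'bad'."""
    if item == "bad":
        raise LookupError(item)
    return item.upper()


# Callers that care about failures can capture the warning explicitly.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = fetch_many(["a", "bad", "b"], fake_fetch)
```

The return type stays the same whether or not something failed, which is the main convenience of this design.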
- Sync all minor versions of HyRiver packages to 0.14.0.
- Update the link to version 2.0 of the ENHD dataset in enhd_attrs.
- Improve columns data types in enhd_attrs and nhdplus_vaa by using int32 instead of Int64, where applicable.
- Sync all patch versions of HyRiver packages to x.x.12.
- The prepare_nhdplus function now supports NHDPlus HR in addition to NHDPlus MR. It automatically detects the NHDPlus version based on the ID column name: nhdplusid for HR and comid for MR.
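Detection by ID column name can be sketched as follows. This is an illustrative helper, not the code inside prepare_nhdplus; the function name and the case-insensitive matching are assumptions.

```python
def detect_nhdplus_version(columns):
    """Guess the NHDPlus flavor from the ID column present in `columns`."""
    # Assumption: column names are matched case-insensitively.
    lowered = {str(c).lower() for c in columns}
    if "nhdplusid" in lowered:
        return "HR"
    if "comid" in lowered:
        return "MR"
    raise ValueError("Expected an 'nhdplusid' (HR) or 'comid' (MR) column.")


hr = detect_nhdplus_version(["NHDPlusID", "geometry"])
mr = detect_nhdplus_version(["comid", "tocomid", "geometry"])
```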
- Fully migrate setup.cfg and setup.py to pyproject.toml.
- Convert relative imports to absolute with absolufy-imports.
- Improve performance of prepare_nhdplus by using pandas.merge instead of applying a function to each row of the dataframe.
- Add support for the new EPA's StreamCat Restful API with around 600 NHDPlus catchment level metrics. One class, called StreamCat, is added for getting the service properties such as valid metrics. You can use the streamcat function to get the metrics as a pandas.DataFrame.
- Refactor the show_versions function to improve performance and print the output in a nicer table-like format.
- Skip 0.13.9 version so the minor version of all HyRiver packages becomes the same.
- Modify the codebase based on the latest changes in geopandas related to empty dataframes.
- Add a new function, called nhdplus_attrs_s3, for accessing the recently released NHDPlus derived attributes on a USGS's S3 bucket. The attributes are provided in parquet files, so getting them is faster than nhdplus_attrs. Also, you can request multiple attributes at once, whereas in nhdplus_attrs you had to request each attribute one at a time. This function will replace nhdplus_attrs in a future release, as soon as all data that are available on the ScienceBase version are also accessible from the S3 bucket.
- Add two new functions called mainstem_huc12_nx and enhd_flowlines_nx. These functions generate a networkx directed graph object of NHD HUC12 water boundaries and flowlines, respectively. They also return a dictionary mapping of COMID and HUC12 to the corresponding networkx node. Additionally, a topologically sorted list of COMIDs/HUC12s is returned. The generated data are useful for doing US-scale network analysis and flow accumulation on the NHD network. The NHD graph has about 2.7 million edges and the mainstem HUC12 graph has about 80K edges.
- Add a new function for getting the entire NHDPlus dataset for CONUS (Lower 48), called nhdplus_l48. The entire NHDPlus dataset is downloaded from here. This 7.3 GB file will take a while to download, depending on your internet connection. The first time you run this function, the file will be downloaded and stored in the ./cache directory. Subsequent calls will use the cached file. Moreover, there are two additional dependencies for using this function: pyogrio and py7zr. These dependencies can be installed using pip install pyogrio py7zr or conda install -c conda-forge pyogrio py7zr.
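To show why a topologically sorted list of nodes is useful for flow accumulation, here is a minimal sketch on a five-node toy network. It uses the standard library's graphlib rather than networkx, and the `downstream` and `local_flow` data are made up for illustration; it is not the algorithm used by PyNHD.

```python
from graphlib import TopologicalSorter

# Hypothetical toy network: each key flows into its value (None marks the outlet).
downstream = {1: 3, 2: 3, 3: 5, 4: 5, 5: None}
# Hypothetical local contribution of each node.
local_flow = {1: 1.0, 2: 2.0, 3: 0.5, 4: 1.5, 5: 0.25}

# TopologicalSorter expects node -> predecessors, so invert the mapping.
upstream = {node: set() for node in downstream}
for node, to_node in downstream.items():
    if to_node is not None:
        upstream[to_node].add(node)

# Upstream nodes always come before the nodes they drain into.
order = list(TopologicalSorter(upstream).static_order())

# One pass in topological order is enough: when a node is visited,
# everything upstream of it has already been added to its total.
accumulated = dict(local_flow)
for node in order:
    to_node = downstream[node]
    if to_node is not None:
        accumulated[to_node] += accumulated[node]
```

Because each node is handled once, the cost grows linearly with the number of edges, which matters for graphs with millions of edges.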
- Refactor vector_accumulation for significant performance improvements.
- Modify the codebase based on Refurb suggestions.
- Add a new function called epa_nhd_catchments to access one of the EPA's HMS endpoints called WSCatchment. You can use this function to access 414 catchment-scale characteristics for all the NHDPlus catchments including 16-day average curve number. More information on the curve number dataset can be found at its project page here.
- Fix a bug in NHDTools where, due to the recent changes in pandas exception handling, NHDTools fails to convert columns with NaN values to integer type. Now, pandas throws IntCastingNaNError instead of TypeError when using the astype method on a column.
- Use pyupgrade package to update the type hinting annotations to Python 3.10 style.
- Add the missing PyPi classifiers for the supported Python versions.
- Append "Error" to all exception classes for conforming to PEP-8 naming conventions.
- Bump the minimum versions of pygeoogc and pygeoutils to 0.13.5 and that of async-retriever to 0.3.5.
- Fix an issue in nhdplus_vaa and enhd_attrs functions where, if the cache folder did not exist, it was not created, resulting in an error.
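The usual way to avoid this class of error is to create the parent folder before writing. A minimal sketch with pathlib follows; `write_cached` is a hypothetical helper and not the PyNHD implementation.

```python
import tempfile
from pathlib import Path


def write_cached(path, data):
    """Write bytes to `path`, creating missing parent folders first."""
    path = Path(path)
    # exist_ok avoids an error when the folder is already there.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path


# Demonstrate in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp, "cache", "nested", "file.bin")
    written = write_cached(target, b"abc")
    existed = written.exists()
    content = written.read_bytes()
```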
- Use the new async_retriever.stream_write function to download files in nhdplus_vaa and enhd_attrs functions. This is more memory efficient.
- Convert the type of list of not found items in NLDI.comid_byloc and NLDI.feature_byloc to list of tuples of coordinates from list of strings. This matches the type of returned not found coordinates to that of the inputs.
- Fix an issue with NLDI that was caused by the recent changes in the NLDI web service's error handling. The NLDI web service now returns more descriptive error messages in a json format instead of returning the usual status errors.
- Slice the ENHD dataframe in NHDTools.clean_flowlines before updating the flowline dataframe to reduce the required memory for the update operation.
- Set the minimum supported version of Python to 3.8 since many of the dependencies such as xarray, pandas, rioxarray have dropped support for Python 3.7.
- Use micromamba for running tests and use nox for linting in CI.
- Add support for all the GeoConnex web service endpoints. There are two ways to use it. For a single query, you can use the geoconnex function and for multiple queries, it's more efficient to use the GeoConnex class.
- Add support for passing any of the supported NLDI feature sources to the get_basins method of the NLDI class. The default is nwissite to retain backward compatibility.
- Set the type of "ReachCode" column to str instead of int in pygeoapi and nhdplus_vaa functions.
- Add two new functions called flowline_resample and network_resample for resampling a flowline or network of flowlines based on a given spacing. This is useful for smoothing jagged flowlines similar to those in the NHDPlus database.
- Add support for the new NLDI endpoint called "hydrolocation". The NLDI class now has two methods for getting features by coordinates: feature_byloc and comid_byloc. The feature_byloc method returns the flowline that is associated with the closest NHDPlus feature to the given coordinates. The comid_byloc method returns a point on the closest downstream flowline to the given coordinates.
- Add a new function called pygeoapi for calling the API in batch mode. This function accepts the input coordinates as a geopandas.GeoDataFrame. It is more performant than calling its counterpart PyGeoAPI multiple times. It's recommended to switch to using this new batch function instead of the PyGeoAPI class. Users just need to prepare an input data frame that has all the required service parameters as columns.
- Add a new step to prepare_nhdplus to convert MultiLineString to LineString.
- Add support for the simplified flag of NLDI's get_basins function. The default value is True to retain the old behavior.
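Resampling a flowline at a given spacing can be illustrated with plain linear interpolation along a polyline. This is a deliberately simpler technique than spline-based smoothing and is not the PyNHD implementation; `resample_polyline` is a hypothetical helper working on (x, y) tuples in projected units.

```python
import math


def resample_polyline(coords, spacing):
    """Return points every `spacing` units along a polyline, plus the last vertex."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    points = [coords[0]]
    next_at = spacing  # distance along the line of the next sample
    walked = 0.0       # distance covered by the segments processed so far
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        # Emit every sample that falls on this segment.
        while seg > 0 and next_at <= walked + seg:
            t = (next_at - walked) / seg
            points.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
            next_at += spacing
        walked += seg
    # Keep the end of the line so the resampled line has the same extent.
    if points[-1] != coords[-1]:
        points.append(coords[-1])
    return points


# An L-shaped line of total length 7, sampled every 2 units.
line = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
resampled = resample_polyline(line, 2.0)
```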
- Remove caching-related arguments from all functions since now they can be set globally via three environmental variables:

  - HYRIVER_CACHE_NAME: Path to the caching SQLite database.
  - HYRIVER_CACHE_EXPIRE: Expiration time for cached requests in seconds.
  - HYRIVER_CACHE_DISABLE: Disable reading/writing from/to the cache file.

You can do this like so:
import os
os.environ["HYRIVER_CACHE_NAME"] = "path/to/file.sqlite"
os.environ["HYRIVER_CACHE_EXPIRE"] = "3600"
os.environ["HYRIVER_CACHE_DISABLE"] = "true"
- Add a new class called NHD for accessing the latest National Hydrography Dataset. More info regarding this data can be found here.
- Add two new functions for getting cross-sections along a single flowline via flowline_xsection or throughout a network of flowlines via network_xsection. You can specify spacing and width parameters to control their location. For more information and examples please consult the documentation.
- Add a new property to AGRBase called service_info to include some useful info about the service including feature_types which can be handy for converting numeric values of types to their string equivalent.
- Use the new PyGeoAPI API.
- Refactor prepare_nhdplus for improving the performance and robustness of determining tocomid within a network of NHD flowlines.
- Add empty geometries that NLDI.getbasins returns to the list of not found IDs. This is because the NLDI service does not include non-network flowlines and instead returns an empty geometry for these flowlines. (#48)
- Use the three new ar.retrieve_* functions instead of the old ar.retrieve function to improve type hinting and to make the API more consistent.
- Revert to the original PyGeoAPI base URL.
- Rewrite ScienceBase to make it applicable for working with other ScienceBase items. A new function called stage_nhdplus_attrs has been added for staging the Additional NHDPlus attributes items.
- Refactor AGRBase to remove unnecessary functions and make them more general.
- Update PyGeoAPI class to conform to the new pygeoapi API. This web service is undergoing some changes at the time of this release, so the API is not stable and might not work as expected. As soon as the web service is stable, a new version will be released.
- In WaterData.byid show a warning if there are any missing feature IDs that are requested but are not available in the dataset.
- For all by* methods of WaterData throw a ZeroMatched exception if no features are found.
- Add expire_after and disable_caching arguments to all functions that use async_retriever. Set the default request caching expiration time to never expire. You can use disable_caching if you don't want to use the cached responses. Please refer to documentation of the functions for more details.
- Refactor prepare_nhdplus to reduce code complexity by grouping all the NHDPlus tools as a private class.
- Modify AGRBase to reflect the latest API changes in pygeoogc.ArcGISRESTfull class.
- Refactor prepare_nhdplus by creating a private class that includes all the previously used private functions. This will make the code more readable and easier to maintain.
- Add all the missing types so mypy --strict passes.
- Add a new argument to NLDI.get_basins called split_catchment that, if set to True, will split the basin geometry at the watershed outlet.
- Catch service errors in PyGeoAPI and show useful error messages.
- Use importlib-metadata for getting the version instead of pkg_resources to decrease import time as discussed in this issue.
- More robust handling of inputs and outputs of NLDI's methods.
- Use an alternative download link for NHDPlus VAA file on Hydroshare.
- Restructure the codebase to reduce the complexity of pynhd.py file by dividing it into three files: pynhd, which has all classes that provide access to the supported web services; core, which includes base classes; and nhdplus_derived, which has functions for getting databases that provide additional attributes for the NHDPlus database.
- Add support for PyGeoAPI. It offers four functionalities: flow_trace, split_catchment, elevation_profile, and cross_section.
- Add a function for getting all NHD FCodes as a data frame, called nhd_fcode.
- Improve prepare_nhdplus function by removing all coastlines and better detection of the terminal point in a network.
- Migrate to using AsyncRetriever for handling communications with web services.
- Catch the ConnectionError separately in NLDI and raise a ServiceError instead, so the user knows that data cannot be returned because the server is out of service, not because of ZeroMatched.
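The distinction between "the service is down" and "the query matched nothing" can be sketched as below. The two exception classes are local stand-ins that reuse the names from the text; they are not imported from PyNHD, and `get_features` with its `fetch` callable is hypothetical.

```python
class ServiceError(Exception):
    """Stand-in: the web service could not be reached or failed."""


class ZeroMatched(ValueError):
    """Stand-in: the service responded but returned no features."""


def get_features(fetch, query):
    """Call `fetch(query)` and translate failures into distinct errors."""
    try:
        features = fetch(query)
    except ConnectionError as ex:
        # Connection problems are reported as a service issue, not an empty result.
        raise ServiceError("The service is unavailable.") from ex
    if not features:
        raise ZeroMatched(f"No features matched {query!r}.")
    return features


def offline(_query):
    """Simulate an unreachable server."""
    raise ConnectionError("server unreachable")


def empty(_query):
    """Simulate a healthy server with no matching features."""
    return []


outcomes = []
for fetch in (offline, empty):
    try:
        get_features(fetch, "comid=1")
    except (ServiceError, ZeroMatched) as ex:
        outcomes.append(type(ex).__name__)
```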
- Add nhdplus_vaa to access NHDPlus Value Added Attributes for all its flowlines.
- To see a list of available layers in NHDPlus HR, you can instantiate its class without passing any argument like so NHDPlusHR().
- Drop support for Python 3.6 since many of the dependencies such as xarray and pandas have done so.
- Use persistent caching for all requests which can help speed up network responses significantly.
- Improve documentation and testing.
- Add an announcement regarding the new name for the software stack, HyRiver.
- Improve pip installation and release workflow.
- The first release after renaming hydrodata to PyGeoHydro.
- Make mypy checks more strict and fix all the errors and prevent possible bugs.
- Speed up CI testing by using mamba and caching.
- Bump version to the same version as PyGeoHydro.
- Add a new function for getting basins geometries for a list of USGS station IDs. The function is a method of NLDI class called get_basins. So, now NLDI.getfeature_byid function does not have a basin flag. This change makes getting geometries easier and faster.
- Remove characteristics_dataframe method from NLDI and make a standalone function called nhdplus_attrs for accessing NHDPlus attributes directly from ScienceBase.
- Add support for using hydro or edits web services for getting NHDPlus High-Resolution using NHDPlusHR function. The new arguments are service, which accepts hydro or edits, and autos_switch flag for automatically switching to the other service if the one passed by service fails.
- Add a new argument to topoogical_sort called edge_attr that allows adding attribute(s) to the returned Networkx Graph. By default, it is None.
- Add a new base class, AGRBase, for connecting to ArcGISRESTful-based services such as National Map and EPA's WaterGEOS.
- Add support for setting the buffer distance for the input geometries to AGRBase.bygeom.
- Add comid_byloc to NLDI class for getting ComIDs of the closest flowlines from a list of lon/lat coordinates.
- Add bydistance to WaterData for getting features within a given radius of a point.
- Rewrite the NLDI function to use API v3 of the NLDI service.
- The crs argument of WaterData now is the target CRS of the output dataframe. The service CRS is now EPSG:4269 for all the layers.
- Remove the url_only argument of NLDI since it's not applicable anymore.
- Added support for NHDPlus High Resolution for getting features by geometry, IDs, or SQL where clause.
- The following functions are added to NLDI:

  - getcharacteristic_byid: Getting characteristics of NHDPlus catchments.
  - navigate_byloc: Getting the nearest ComID to a coordinate and performing navigation.
  - characteristics_dataframe: Getting all the available catchment-scale characteristics as a data frame.
  - get_validchars: Getting a list of available characteristic IDs for a specified characteristic type.

- The following functions are added to WaterData:

  - byfilter: Getting data based on any valid CQL filter.
  - bygeom: Getting data within a geometry (polygon and multipolygon).

- Add support for Python 3.9 and tests for Windows.
- Refactored WaterData to fix the CRS inconsistencies (#1).
- Replaced simplejson with orjson to speed-up JSON operations.
- Add show_versions function for showing versions of the installed deps.
- Improve documentation
- Improved documentation
- Refactored WaterData to improve readability.
- First release on PyPI.