feat: Implement ST_LENGTH geography function #1791

Merged
merged 20 commits into from
Jun 9, 2025
95a871a
feat: Implement ST_LENGTH geography function
google-labs-jules[bot] Jun 4, 2025
3644db1
feat: Add NotImplemented length property to GeoSeries
google-labs-jules[bot] Jun 4, 2025
b3fcd91
Update bigframes/bigquery/_operations/__init__.py
tswast Jun 4, 2025
323c33e
Merge branch 'main' into feat-st-length
tswast Jun 4, 2025
7faa776
fix lint
tswast Jun 4, 2025
719a835
add missing compilation method
tswast Jun 4, 2025
c956744
use pandas for the expected values in tests
tswast Jun 4, 2025
d2a2138
fix: Apply patch for ST_LENGTH and related test updates
google-labs-jules[bot] Jun 4, 2025
4e1cdc7
feat: Add use_spheroid parameter to ST_LENGTH and update docs
google-labs-jules[bot] Jun 5, 2025
7eb23b9
feat: Implement use_spheroid for ST_LENGTH via Ibis UDF
google-labs-jules[bot] Jun 5, 2025
b6fa804
refactor: Use Ibis UDF for ST_LENGTH BigQuery builtin
google-labs-jules[bot] Jun 5, 2025
9edc23c
refactor: Consolidate st_length tests in test_geo.py
google-labs-jules[bot] Jun 5, 2025
388a7a2
fix: Correct export of GeoStLengthOp in operations init
google-labs-jules[bot] Jun 5, 2025
c3f45c5
fix system test and some linting
tswast Jun 5, 2025
4b333f2
Merge remote-tracking branch 'origin/main' into feat-st-length
tswast Jun 5, 2025
87405a7
fix lint
tswast Jun 5, 2025
73dc58b
fix doctest
tswast Jun 5, 2025
2bb07f0
fix docstring
tswast Jun 5, 2025
b3ea901
Merge branch 'main' into feat-st-length
tswast Jun 6, 2025
9a58e07
Update bigframes/core/compile/scalar_op_compiler.py
tswast Jun 6, 2025
2 changes: 2 additions & 0 deletions bigframes/bigquery/__init__.py
@@ -32,6 +32,7 @@
    st_difference,
    st_distance,
    st_intersection,
    st_length,
)
from bigframes.bigquery._operations.json import (
    json_extract,
@@ -58,6 +59,7 @@
    "st_difference",
    "st_distance",
    "st_intersection",
    "st_length",
    # json ops
    "json_extract",
    "json_extract_array",
64 changes: 64 additions & 0 deletions bigframes/bigquery/_operations/geo.py
@@ -380,3 +380,67 @@ def st_intersection(
            each aligned geometry with other.
    """
    return series._apply_binary_op(other, ops.geo_st_intersection_op)


def st_length(
    series: Union[bigframes.series.Series, bigframes.geopandas.GeoSeries],
    *,
    use_spheroid: bool = False,
) -> bigframes.series.Series:
    """Returns the total length in meters of the lines in the input GEOGRAPHY.

    If a series element is a point or a polygon, returns zero for that row.
    If a series element is a collection, returns the length of the lines
    in the collection; if the collection doesn't contain lines, returns
    zero.

    The optional use_spheroid parameter determines how this function
    measures distance. If use_spheroid is FALSE, the function measures
    distance on the surface of a perfect sphere.

    The use_spheroid parameter currently only supports the value FALSE. The
    default value of use_spheroid is FALSE. See:
    https://cloud.google.com/bigquery/docs/reference/standard-sql/geography_functions#st_length

    **Examples:**

        >>> import bigframes.geopandas
        >>> import bigframes.pandas as bpd
        >>> import bigframes.bigquery as bbq
        >>> from shapely.geometry import Polygon, LineString, Point, GeometryCollection
        >>> bpd.options.display.progress_bar = None

        >>> series = bigframes.geopandas.GeoSeries(
        ...         [
        ...             LineString([(0, 0), (1, 0)]),  # Length will be approx 1 degree in meters
        ...             Polygon([(0.0, 0.0), (0.1, 0.1), (0.0, 0.1)]),  # Length is 0
        ...             Point(0, 1),  # Length is 0
        ...             GeometryCollection([LineString([(0,0),(0,1)]), Point(1,1)])  # Length of LineString only
        ...         ]
        ... )

    Default behavior (use_spheroid=False):

        >>> result = bbq.st_length(series)
        >>> result
        0    111195.101177
        1              0.0
        2              0.0
        3    111195.101177
        dtype: Float64

    Args:
        series (bigframes.series.Series | bigframes.geopandas.GeoSeries):
            A series containing geography objects.
        use_spheroid (bool, optional):
            Determines how this function measures distance.
            If FALSE (default), measures distance on a perfect sphere.
            Currently, only FALSE is supported.

    Returns:
        bigframes.series.Series:
            Series of floats representing the lengths in meters.
    """
    series = series._apply_unary_op(ops.GeoStLengthOp(use_spheroid=use_spheroid))

Contributor

IIUC from the docstring, we don't support a True value of use_spheroid. Can you please raise a NotImplementedError here?

Contributor

btw, why can't we support a True value of use_spheroid here?

Collaborator Author

This is a server-side limitation: https://cloud.google.com/bigquery/docs/reference/standard-sql/geography_functions#st_length

I'd rather avoid any client-side checks, as maybe someday the server side will implement this feature.

I'm not sure why they have the parameter at all if it's not implemented, to be honest.

    series.name = None
    return series
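
For context on the review thread above, the client-side guard the reviewer asked for might look roughly like the sketch below. It is not part of this PR, which forwards use_spheroid to BigQuery unchanged. The names `st_length_checked` and `compute` are hypothetical and exist only for this illustration; `compute` stands in for whatever would actually evaluate the length.

```python
from typing import Callable, List, Sequence


def st_length_checked(
    values: Sequence[object],
    *,
    use_spheroid: bool = False,
    # Hypothetical stand-in for the server-side evaluation.
    compute: Callable[[object], float] = lambda _geog: 0.0,
) -> List[float]:
    """Hypothetical wrapper that rejects use_spheroid=True before any query runs."""
    if use_spheroid:
        # Fail fast on the client instead of waiting for a server-side error.
        raise NotImplementedError("use_spheroid=True is not supported yet.")
    return [compute(value) for value in values]
```

The trade-off raised in the reply is that a guard like this would have to be removed from the client if the server later starts accepting TRUE.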
13 changes: 12 additions & 1 deletion bigframes/core/compile/scalar_op_compiler.py
@@ -30,7 +30,6 @@
import bigframes.core.compile.default_ordering
import bigframes.core.compile.ibis_types
import bigframes.core.expression as ex
import bigframes.dtypes
import bigframes.operations as ops

_ZERO = typing.cast(ibis_types.NumericValue, ibis_types.literal(0))
@@ -1079,6 +1078,12 @@ def geo_x_op_impl(x: ibis_types.Value):
    return typing.cast(ibis_types.GeoSpatialValue, x).x()


@scalar_op_compiler.register_unary_op(ops.GeoStLengthOp, pass_op=True)
def geo_length_op_impl(x: ibis_types.Value, op: ops.GeoStLengthOp):
    # Call the st_length UDF defined in this file (or imported)
    return st_length(x, op.use_spheroid)


@scalar_op_compiler.register_unary_op(ops.geo_y_op)
def geo_y_op_impl(x: ibis_types.Value):
    return typing.cast(ibis_types.GeoSpatialValue, x).y()
@@ -2057,6 +2062,12 @@ def st_distance(a: ibis_dtypes.geography, b: ibis_dtypes.geography, use_spheroid
    """Convert string to geography."""


@ibis_udf.scalar.builtin
def st_length(geog: ibis_dtypes.geography, use_spheroid: bool) -> ibis_dtypes.float:  # type: ignore
    """ST_LENGTH BQ builtin. This body is never executed."""
    pass


@ibis_udf.scalar.builtin
def unix_micros(a: ibis_dtypes.timestamp) -> int:  # type: ignore
    """Convert a timestamp to microseconds"""
6 changes: 6 additions & 0 deletions bigframes/geopandas/geoseries.py
@@ -30,6 +30,12 @@ def __init__(self, data=None, index=None, **kwargs):
            data=data, index=index, dtype=geopandas.array.GeometryDtype(), **kwargs
        )

    @property
    def length(self):
        raise NotImplementedError(
            "GeoSeries.length is not yet implemented. Please use bigframes.bigquery.st_length(geoseries) instead."
        )

    @property
    def x(self) -> bigframes.series.Series:
        series = self._apply_unary_op(ops.geo_x_op)
2 changes: 2 additions & 0 deletions bigframes/operations/__init__.py
@@ -101,6 +101,7 @@
    geo_x_op,
    geo_y_op,
    GeoStDistanceOp,
    GeoStLengthOp,
)
from bigframes.operations.json_ops import (
    JSONExtract,
@@ -385,6 +386,7 @@
    "geo_st_geogfromtext_op",
    "geo_st_geogpoint_op",
    "geo_st_intersection_op",
    "GeoStLengthOp",
    "geo_x_op",
    "geo_y_op",
    "GeoStDistanceOp",
9 changes: 9 additions & 0 deletions bigframes/operations/geo_ops.py
@@ -80,3 +80,12 @@ class GeoStDistanceOp(base_ops.BinaryOp):

    def output_type(self, *input_types: dtypes.ExpressionType) -> dtypes.ExpressionType:
        return dtypes.FLOAT_DTYPE


@dataclasses.dataclass(frozen=True)
class GeoStLengthOp(base_ops.UnaryOp):
    name = "geo_st_length"
    use_spheroid: bool = False

    def output_type(self, *input_types: dtypes.ExpressionType) -> dtypes.ExpressionType:
        return dtypes.FLOAT_DTYPE
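
The pattern used in this PR, a frozen dataclass op that carries use_spheroid plus a compiler function registered with pass_op=True so it can read that field, can be illustrated in isolation. The sketch below is not bigframes code: `LengthOp`, `Compiler`, and `length_impl` are hypothetical names, and the registry emits a plain SQL-like string rather than an Ibis expression.

```python
import dataclasses
from typing import Callable, Dict, Type


@dataclasses.dataclass(frozen=True)
class LengthOp:
    """Hypothetical stand-in for GeoStLengthOp: an op that carries a parameter."""

    use_spheroid: bool = False


class Compiler:
    """Hypothetical, minimal registry keyed by op class."""

    def __init__(self) -> None:
        self._impls: Dict[Type, Callable] = {}

    def register(self, op_type: Type) -> Callable:
        def decorator(fn: Callable) -> Callable:
            self._impls[op_type] = fn
            return fn

        return decorator

    def compile(self, op: object, arg: str) -> str:
        # Look up the implementation by op class and pass the op instance
        # along, so the implementation can read the op's fields.
        return self._impls[type(op)](arg, op)


compiler = Compiler()


@compiler.register(LengthOp)
def length_impl(arg: str, op: LengthOp) -> str:
    flag = "TRUE" if op.use_spheroid else "FALSE"
    return f"ST_LENGTH({arg}, {flag})"


print(compiler.compile(LengthOp(), "geog"))
```

Because the dataclass is frozen, two ops with the same use_spheroid value compare equal and hash identically, which lets an op instance be used as a dictionary key or cache key.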
64 changes: 64 additions & 0 deletions tests/system/small/bigquery/test_geo.py
@@ -19,10 +19,14 @@
from shapely.geometry import (  # type: ignore
    GeometryCollection,
    LineString,
    MultiLineString,
    MultiPoint,
    MultiPolygon,
    Point,
    Polygon,
)

from bigframes.bigquery import st_length
import bigframes.bigquery as bbq
import bigframes.geopandas

@@ -59,6 +63,66 @@ def test_geo_st_area():
    )


# Expected length for 1 degree of longitude at the equator is approx 111195.079734 meters
DEG_LNG_EQUATOR_METERS = 111195.07973400292


def test_st_length_various_geometries(session):
    input_geometries = [
        Point(0, 0),
        LineString([(0, 0), (1, 0)]),
        Polygon([(0, 0), (1, 0), (0, 1), (0, 0)]),
        MultiPoint([Point(0, 0), Point(1, 1)]),
        MultiLineString([LineString([(0, 0), (1, 0)]), LineString([(0, 0), (0, 1)])]),
        MultiPolygon(
            [
                Polygon([(0, 0), (1, 0), (0, 1), (0, 0)]),
                Polygon([(2, 2), (3, 2), (2, 3), (2, 2)]),
            ]
        ),
        GeometryCollection([Point(0, 0), LineString([(0, 0), (1, 0)])]),
        GeometryCollection([]),
        None,  # Represents NULL geography input
        GeometryCollection([Point(1, 1), Point(2, 2)]),
    ]
    geoseries = bigframes.geopandas.GeoSeries(input_geometries, session=session)

    expected_lengths = pd.Series(
        [
            0.0,  # Point
            DEG_LNG_EQUATOR_METERS,  # LineString
            0.0,  # Polygon
            0.0,  # MultiPoint
            2 * DEG_LNG_EQUATOR_METERS,  # MultiLineString
            0.0,  # MultiPolygon
            DEG_LNG_EQUATOR_METERS,  # GeometryCollection (Point + LineString)
            0.0,  # Empty GeometryCollection
            pd.NA,  # None input for ST_LENGTH(NULL) is NULL
            0.0,  # GeometryCollection (Point + Point)
        ],
        index=pd.Index(range(10), dtype="Int64"),
        dtype="Float64",
    )

    # Test default use_spheroid
    result_default = st_length(geoseries).to_pandas()
    pd.testing.assert_series_equal(
        result_default,
        expected_lengths,
        rtol=1e-3,
        atol=1e-3,  # For comparisons involving 0.0
    )  # type: ignore

    # Test explicit use_spheroid=False
    result_explicit_false = st_length(geoseries, use_spheroid=False).to_pandas()
    pd.testing.assert_series_equal(
        result_explicit_false,
        expected_lengths,
        rtol=1e-3,
        atol=1e-3,  # For comparisons involving 0.0
    )  # type: ignore


def test_geo_st_difference_with_geometry_objects():
    data1 = [
        Polygon([(0, 0), (10, 0), (10, 10), (0, 0)]),
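
DEG_LNG_EQUATOR_METERS in the test above is the arc length of one degree on a sphere, R * pi / 180. The PR does not state which radius is used. As a rough check, the sketch below assumes a radius of about 6371008.7714 m for the test constant and 6371010 m for the 111195.101177 shown in the st_length docstring; both radii are assumptions inferred from the numbers, not values taken from the PR.

```python
import math

# Assumed sphere radii in meters; neither value is stated in the PR.
RADIUS_TEST_CONSTANT = 6371008.7714
RADIUS_DOCSTRING_OUTPUT = 6371010.0


def one_degree_arc_meters(radius_m: float) -> float:
    """Arc length subtended by one degree on a sphere of the given radius."""
    return radius_m * math.radians(1.0)


for radius in (RADIUS_TEST_CONSTANT, RADIUS_DOCSTRING_OUTPUT):
    print(radius, round(one_degree_arc_meters(radius), 6))
```

Under these assumptions the two values differ by roughly 2 cm over about 111 km, which is far inside the rtol=1e-3 used by the assertions.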
11 changes: 11 additions & 0 deletions tests/system/small/geopandas/test_geoseries.py
@@ -96,6 +96,17 @@ def test_geo_area_not_supported():
        bf_series.area


def test_geoseries_length_property_not_implemented(session):
    gs = bigframes.geopandas.GeoSeries([Point(0, 0)], session=session)
    with pytest.raises(
        NotImplementedError,
        match=re.escape(
            "GeoSeries.length is not yet implemented. Please use bigframes.bigquery.st_length(geoseries) instead."
        ),
    ):
        _ = gs.length


def test_geo_distance_not_supported():
    s1 = bigframes.pandas.Series(
        [
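
The re.escape call in test_geoseries_length_property_not_implemented matters because pytest.raises treats match as a regular expression and the message contains parentheses. A small standalone check of that behavior, using only the standard library and the message text from the diff:

```python
import re

MESSAGE = (
    "GeoSeries.length is not yet implemented. "
    "Please use bigframes.bigquery.st_length(geoseries) instead."
)

# Unescaped, "(geoseries)" is read as a capture group, so the literal
# parentheses in the message are never matched and the search fails.
print(re.search(MESSAGE, MESSAGE) is None)

# Escaped, every character of the message is matched literally.
print(re.search(re.escape(MESSAGE), MESSAGE) is not None)
```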