Commit a50551d

Separate Python packages (#50)
* start separating python packages
* maybe CI
* maybe docs
* move import to geoarrow.c
* with passing test
* readme
* fix a few references
* maybe fix doctests
* fix package name
* fix install instructions
* maybe fix doctest
* move doctests to the end
* maybe fix wheel name
* maybe fix
* actions
* maybe fix doc
* also ignore opt/ for coverage
* maybe actually get doctests to run
* maybe more portable doctest
1 parent 73a5d09 commit a50551d

35 files changed, +157 -194 lines changed

.env

-6
This file was deleted.

.github/workflows/python.yaml

+19-13
Original file line numberDiff line numberDiff line change
@@ -32,30 +32,22 @@ jobs:

  - name: Install geoarrow
  run: |
- pushd python
+ pushd python/geoarrow-c
  pip install .[test]
  popd
  pip list

  - name: Run tests
  run: |
- pytest python/tests -v -s
-
- - name: Run doctests
- if: success() && matrix.python-version == '3.10'
- run: |
- pytest --pyargs geoarrow --doctest-modules
- # No Cython docs yet
- # pip install pytest-cython
- # pytest --pyargs geoarrow --doctest-cython
+ pytest python/geoarrow-c/tests -v -s

  - name: Coverage
  if: success() && matrix.python-version == '3.10'
  run: |
  sudo apt-get install -y lcov
  pip uninstall --yes geoarrow
  pip install pytest-cov Cython
- pushd python
+ pushd python/geoarrow-c

  # Build with Cython + gcc coverage options
  pip install -e .[test]
@@ -69,9 +61,10 @@ jobs:
  lcov \
  --capture --directory build \
  --exclude "/usr/*" \
+ --exclude "/opt/*" \
  --exclude "/Library/*" \
  --exclude "*/_lib.cpp" \
- --exclude "*/src/geoarrow/geoarrow/*" \
+ --exclude "*/src/geoarrow/c/geoarrow/*" \
  --output-file=coverage.info

  lcov --list coverage.info
@@ -80,4 +73,17 @@ jobs:
  if: success() && matrix.python-version == '3.10'
  uses: codecov/codecov-action@v2
  with:
- files: 'python/coverage.info,python/coverage.xml'
+ files: 'python/geoarrow-coverage.info,python/geoarrow-c/coverage.xml'
+
+ - name: Run doctests
+ if: success() && matrix.python-version == '3.10'
+ run: |
+ # Because of namespace packaging we have to add this here and
+ # rebuild to avoid confusig pytest
+ touch python/geoarrow-c/src/geoarrow/__init__.py
+ pip install python/geoarrow-c
+
+ pytest --pyargs geoarrow.c --doctest-modules
+ # No Cython docs yet
+ # pip install pytest-cython
+ # pytest --pyargs geoarrow --doctest-cython
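
Reviewer note on the new doctest step: the workflow comment says an `__init__.py` has to be added "because of namespace packaging". As background, here is a minimal sketch of what an implicit (PEP 420) namespace package is. It is an illustration, not code from this commit. The name `geoarrow_demo`, the helper functions, and the two temporary "distributions" are all hypothetical. It shows that when the top-level directory has no `__init__.py`, subpackages from separate directories on `sys.path` are merged under one name. Presumably the workflow turns `geoarrow` into a regular package temporarily so that pytest can resolve module names unambiguously.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical namespace name, chosen so the sketch cannot clash with a real
# installation of geoarrow.
NAMESPACE = "geoarrow_demo"


def make_distribution(root: Path, subpackage: str) -> None:
    """Create <root>/geoarrow_demo/<subpackage>/__init__.py.

    No __init__.py is written at the namespace level, which is what makes
    geoarrow_demo an implicit namespace package.
    """
    package_dir = root / NAMESPACE / subpackage
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text(f"NAME = {subpackage!r}\n")


def demo_namespace_package() -> dict:
    """Import two subpackages that live in separate directories on sys.path."""
    with tempfile.TemporaryDirectory() as tmp:
        roots = [Path(tmp) / "dist_a", Path(tmp) / "dist_b"]
        make_distribution(roots[0], "c")
        make_distribution(roots[1], "pyarrow")

        sys.path[:0] = [str(root) for root in roots]
        importlib.invalidate_caches()
        try:
            namespace = importlib.import_module(NAMESPACE)
            names = sorted(
                importlib.import_module(f"{NAMESPACE}.{sub}").NAME
                for sub in ("c", "pyarrow")
            )
            return {
                "names": names,
                # One search location per directory that contributes to the namespace.
                "n_locations": len(list(namespace.__path__)),
                # A namespace package has no file of its own.
                "has_file": getattr(namespace, "__file__", None) is not None,
            }
        finally:
            # Undo the sys.path and sys.modules changes made above.
            for root in roots:
                sys.path.remove(str(root))
            for name in [m for m in sys.modules if m.split(".")[0] == NAMESPACE]:
                del sys.modules[name]


if __name__ == "__main__":
    print(demo_namespace_package())
```

Both subpackages import under the shared top-level name even though they live in different directories, which mirrors the `geoarrow.c` layout this commit introduces.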

README.md

+1-1
@@ -18,7 +18,7 @@ release of the geoarrow specification.
  ## Get started in Python

  ```python
- import geoarrow.pyarrow as ga
+ import geoarrow.c.pyarrow as ga

  ga.point()
  # PointType(geoarrow.point)

ci/scripts/build-docs.sh

+2-2
@@ -51,10 +51,10 @@ main() {
  # pip install . doesn't quite work with the setuptools available on the
  # ubuntu docker image...python -m build works I think because it sets up
  # a virtualenv
- pushd python
+ pushd python/geoarrow-c
  rm -rf dist
  python3 -m build --wheel
- pip3 install dist/geoarrow-*.whl
+ pip3 install dist/geoarrow*.whl
  popd

  pushd docs

docker-compose.yml

+1-8
@@ -4,14 +4,7 @@ version: '3.5'
  services:

  docs:
- image: ${REPO}:ubuntu-${GEOARROW_ARCH}
- build:
- context: .
- cache_from:
- - ${REPO}:ubuntu-${GEOARROW_ARCH}
- dockerfile: ci/docker/ubuntu.dockerfile
- args:
- GEOARROW_ARCH: ${GEOARROW_ARCH}
+ image: ghcr.io/apache/arrow-nanoarrow:ubuntu
  volumes:
  - .:/geoarrow-c
  command: "/bin/bash /geoarrow-c/ci/scripts/build-docs.sh"

docs/source/conf.py

+1-1
@@ -14,7 +14,7 @@
  import sys
  import datetime

- import geoarrow
+ import geoarrow.c

  sys.path.insert(0, os.path.abspath(".."))

docs/source/python/geoarrow.rst

+1-1
@@ -2,7 +2,7 @@
  Core API
  ========

- .. automodule:: geoarrow
+ .. automodule:: geoarrow.c
  :members:

  Constants

docs/source/python/pyarrow.rst

+3-3
@@ -2,7 +2,7 @@
  Integration with pyarrow
  ========================

- .. automodule:: geoarrow.pyarrow
+ .. automodule:: geoarrow.c.pyarrow

  Array constructors
  ------------------
@@ -100,8 +100,8 @@ Integration with pyarrow
  .. autoclass:: MultiPolygonType
  :members:

- .. autoclass:: geoarrow.pyarrow._dataset.GeoDataset
+ .. autoclass:: geoarrow.c.pyarrow._dataset.GeoDataset
  :members:

- .. autoclass:: geoarrow.pyarrow._dataset.ParquetRowGroupGeoDataset
+ .. autoclass:: geoarrow.c.pyarrow._dataset.ParquetRowGroupGeoDataset
  :members:
File renamed without changes.

python/.gitignore python/geoarrow-c/.gitignore

+2-2
@@ -16,8 +16,8 @@
  # specific language governing permissions and limitations
  # under the License.

- src/geoarrow/geoarrow
- src/geoarrow/_lib.cpp
+ src/geoarrow/c/geoarrow
+ src/geoarrow/c/_lib.cpp

  # Byte-compiled / optimized / DLL files
  __pycache__/

python/MANIFEST.in python/geoarrow-c/MANIFEST.in

+3-2
@@ -16,5 +16,6 @@
  # under the License.

  exclude bootstrap.py
- include src/geoarrow/geoarrow/*.h
- include src/geoarrow/geoarrow/*.hpp
+ include src/geoarrow/c/**/**/*.h
+ include src/geoarrow/c/**/*.h
+ include src/geoarrow/c/**/*.hpp

python/README.ipynb python/geoarrow-c/README.ipynb

+13-29
@@ -14,7 +14,7 @@
  "Python bindings for nanoarrow are not yet available on PyPI. You can install via URL (requires a C++ compiler):\n",
  "\n",
  "```bash\n",
- "python -m pip install \"https://github.com/geoarrow/geoarrow-cpp/archive/refs/heads/main.zip#egg=geoarrow&subdirectory=python\"\n",
+ "python -m pip install \"https://github.com/geoarrow/geoarrow-c/archive/refs/heads/main.zip#egg=geoarrow-c&subdirectory=python/geoarrow-c\"\n",
  "```\n",
  "\n",
  "If you can import the namespace, you're good to go! The only reasonable interface to geoarrow currently depends on `pyarrow`, which you can import with:"
@@ -26,7 +26,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "import geoarrow.pyarrow as ga"
+ "import geoarrow.c.pyarrow as ga"
  ]
  },
  {
@@ -48,7 +48,7 @@
  "data": {
  "text/plain": [
  "PointArray:PointType(geoarrow.point)[1]\n",
- "<POINT (0 1)>\n"
+ "<POINT (0 1)>"
  ]
  },
  "execution_count": 2,
@@ -88,7 +88,7 @@
  "PointArray:PointType(geoarrow.point)[3]\n",
  "<POINT (1 3)>\n",
  "<POINT (2 4)>\n",
- "<POINT (3 5)>\n"
+ "<POINT (3 5)>"
  ]
  },
  "execution_count": 3,
@@ -117,7 +117,7 @@
  "PointArray:PointType(interleaved geoarrow.point)[3]\n",
  "<POINT (1 2)>\n",
  "<POINT (3 4)>\n",
- "<POINT (5 6)>\n"
+ "<POINT (5 6)>"
  ]
  },
  "execution_count": 4,
@@ -137,7 +137,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "Importing `geoarrow.pyarrow` will register the geoarrow extension types with pyarrow such that you can read/write Arrow streams, Arrow files, and Parquet that contains Geoarrow extension types. A number of these files are available from the [geoarrow-data](https://github.com/geoarrow/geoarrow-data) repository."
+ "Importing `geoarrow.c.pyarrow` will register the geoarrow extension types with pyarrow such that you can read/write Arrow streams, Arrow files, and Parquet that contains Geoarrow extension types. A number of these files are available from the [geoarrow-data](https://github.com/geoarrow/geoarrow-data) repository."
  ]
  },
  {
@@ -189,22 +189,6 @@
  "execution_count": 6,
  "metadata": {},
  "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/geopandas/_compat.py:124: UserWarning: The Shapely GEOS version (3.11.1-CAPI-1.17.1) is incompatible with the GEOS version PyGEOS was compiled with (3.10.1-CAPI-1.16.0). Conversions between both will be slow.\n",
- " warnings.warn(\n",
- "/var/folders/gt/l87wjg8s7312zs9s7c1fgs900000gn/T/ipykernel_81348/2107898165.py:1: DeprecationWarning: Shapely 2.0 is installed, but because PyGEOS is also installed, GeoPandas still uses PyGEOS by default. However, starting with version 0.14, the default will switch to Shapely. To force to use Shapely 2.0 now, you can either uninstall PyGEOS or set the environment variable USE_PYGEOS=0. You can do this before starting the Python process, or in your code before importing geopandas:\n",
- "\n",
- "import os\n",
- "os.environ['USE_PYGEOS'] = '0'\n",
- "import geopandas\n",
- "\n",
- "In the next release, GeoPandas will switch to using Shapely by default, even if PyGEOS is installed. If you only have PyGEOS installed to get speed-ups, this switch should be smooth. However, if you are using PyGEOS directly (calling PyGEOS functions on geometries from GeoPandas), this will then stop working and you are encouraged to migrate from PyGEOS to Shapely 2.0 (https://shapely.readthedocs.io/en/latest/migration_pygeos.html).\n",
- " import geopandas\n"
- ]
- },
  {
  "data": {
  "text/plain": [
@@ -216,9 +200,9 @@
  "<MULTILINESTRING ((673606.0199999996 5162961.9823, 673606.01999999...>\n",
  "...245 values...\n",
  "<MULTILINESTRING ((681672.6200000001 5078601.5823, 681866.01999999...>\n",
- "<MULTILINESTRING ((414867.9170000004 5093040.8807, 414793.81699999...>\n",
- "<MULTILINESTRING ((414867.9170000004 5093040.8807, 414829.71700000...>\n",
- "<MULTILINESTRING ((414867.9170000004 5093040.8807, 414937.21700000...>\n",
+ "<MULTILINESTRING ((414867.91700000037 5093040.8807, 414793.8169999...>\n",
+ "<MULTILINESTRING ((414867.91700000037 5093040.8807, 414829.7170000...>\n",
+ "<MULTILINESTRING ((414867.91700000037 5093040.8807, 414937.2170000...>\n",
  "<MULTILINESTRING ((648686.0197000001 5099181.984099999, 648866.019...>"
  ]
  },
@@ -381,9 +365,9 @@
  "<MULTILINESTRING ((673606.0199999996 5162961.9823, 673606.01999999...>\n",
  "...245 values...\n",
  "<MULTILINESTRING ((681672.6200000001 5078601.5823, 681866.01999999...>\n",
- "<MULTILINESTRING ((414867.9170000004 5093040.8807, 414793.81699999...>\n",
- "<MULTILINESTRING ((414867.9170000004 5093040.8807, 414829.71700000...>\n",
- "<MULTILINESTRING ((414867.9170000004 5093040.8807, 414937.21700000...>\n",
+ "<MULTILINESTRING ((414867.91700000037 5093040.8807, 414793.8169999...>\n",
+ "<MULTILINESTRING ((414867.91700000037 5093040.8807, 414829.7170000...>\n",
+ "<MULTILINESTRING ((414867.91700000037 5093040.8807, 414937.2170000...>\n",
  "<MULTILINESTRING ((648686.0197000001 5099181.984099999, 648866.019...>"
  ]
  },
@@ -444,7 +428,7 @@
  "name": "python",
  "nbconvert_exporter": "python",
  "pygments_lexer": "ipython3",
- "version": "3.9.6"
+ "version": "3.11.2"
  },
  "orig_nbformat": 4
  },

python/README.md python/geoarrow-c/README.md

+12-27
@@ -1,20 +1,20 @@
  # geoarrow for Python

- The geoarrow Python package provides bindings to the geoarrow-c implementation of the [GeoArrow specification](https://github.com/geoarrow/geoarrow). The geoarrow Python bindings provide input/output to/from Arrow-friendly formats (e.g., Parquet, Arrow Stream, Arrow File) and general-purpose coordinate shuffling tools among GeoArrow, WKT, and WKB encodings.
+ The geoarrow Python package provides bindings to the geoarrow-c implementation of the [GeoArrow specification](https://github.com/geoarrow/geoarrow). The geoarrow Python bindings provide input/output to/from Arrow-friendly formats (e.g., Parquet, Arrow Stream, Arrow File) and general-purpose coordinate shuffling tools among GeoArrow, WKT, and WKB encodings.

  ## Installation

  Python bindings for nanoarrow are not yet available on PyPI. You can install via URL (requires a C++ compiler):

  ```bash
- python -m pip install "https://github.com/geoarrow/geoarrow-cpp/archive/refs/heads/main.zip#egg=geoarrow&subdirectory=python"
+ python -m pip install "https://github.com/geoarrow/geoarrow-c/archive/refs/heads/main.zip#egg=geoarrow-c&subdirectory=python/geoarrow-c"
  ```

  If you can import the namespace, you're good to go! The only reasonable interface to geoarrow currently depends on `pyarrow`, which you can import with:


  ```python
- import geoarrow.pyarrow as ga
+ import geoarrow.c.pyarrow as ga
  ```

  ## Examples
@@ -34,7 +34,6 @@ ga.as_geoarrow(["POINT (0 1)"])



-
  This will work with:

  - An existing array created by geoarrow
@@ -51,7 +50,7 @@ Alternatively, you can construct GeoArrow arrays directly from a series of buffe
  import numpy as np

  ga.point().from_geobuffers(
- None,
+ None,
  np.array([1.0, 2.0, 3.0]),
  np.array([3.0, 4.0, 5.0])
  )
@@ -68,7 +67,6 @@ ga.point().from_geobuffers(



-
  ```python
  ga.point().with_coord_type(ga.CoordType.INTERLEAVED).from_geobuffers(
  None,
@@ -86,8 +84,7 @@ ga.point().with_coord_type(ga.CoordType.INTERLEAVED).from_geobuffers(



-
- Importing `geoarrow.pyarrow` will register the geoarrow extension types with pyarrow such that you can read/write Arrow streams, Arrow files, and Parquet that contains Geoarrow extension types. A number of these files are available from the [geoarrow-data](https://github.com/geoarrow/geoarrow-data) repository.
+ Importing `geoarrow.c.pyarrow` will register the geoarrow extension types with pyarrow such that you can read/write Arrow streams, Arrow files, and Parquet that contains Geoarrow extension types. A number of these files are available from the [geoarrow-data](https://github.com/geoarrow/geoarrow-data) repository.


  ```python
@@ -135,18 +132,6 @@ array = ga.as_geoarrow(df.geometry)
  array
  ```

- /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/geopandas/_compat.py:124: UserWarning: The Shapely GEOS version (3.11.1-CAPI-1.17.1) is incompatible with the GEOS version PyGEOS was compiled with (3.10.1-CAPI-1.16.0). Conversions between both will be slow.
- warnings.warn(
- /var/folders/gt/l87wjg8s7312zs9s7c1fgs900000gn/T/ipykernel_81348/2107898165.py:1: DeprecationWarning: Shapely 2.0 is installed, but because PyGEOS is also installed, GeoPandas still uses PyGEOS by default. However, starting with version 0.14, the default will switch to Shapely. To force to use Shapely 2.0 now, you can either uninstall PyGEOS or set the environment variable USE_PYGEOS=0. You can do this before starting the Python process, or in your code before importing geopandas:
-
- import os
- os.environ['USE_PYGEOS'] = '0'
- import geopandas
-
- In the next release, GeoPandas will switch to using Shapely by default, even if PyGEOS is installed. If you only have PyGEOS installed to get speed-ups, this switch should be smooth. However, if you are using PyGEOS directly (calling PyGEOS functions on geometries from GeoPandas), this will then stop working and you are encouraged to migrate from PyGEOS to Shapely 2.0 (https://shapely.readthedocs.io/en/latest/migration_pygeos.html).
- import geopandas
-
-



@@ -158,9 +143,9 @@ array
  <MULTILINESTRING ((673606.0199999996 5162961.9823, 673606.01999999...>
  ...245 values...
  <MULTILINESTRING ((681672.6200000001 5078601.5823, 681866.01999999...>
- <MULTILINESTRING ((414867.9170000004 5093040.8807, 414793.81699999...>
- <MULTILINESTRING ((414867.9170000004 5093040.8807, 414829.71700000...>
- <MULTILINESTRING ((414867.9170000004 5093040.8807, 414937.21700000...>
+ <MULTILINESTRING ((414867.91700000037 5093040.8807, 414793.8169999...>
+ <MULTILINESTRING ((414867.91700000037 5093040.8807, 414829.7170000...>
+ <MULTILINESTRING ((414867.91700000037 5093040.8807, 414937.2170000...>
  <MULTILINESTRING ((648686.0197000001 5099181.984099999, 648866.019...>


@@ -180,7 +165,7 @@ geopandas.GeoSeries.from_wkb(ga.as_wkb(array))
  2 MULTILINESTRING ((631355.519 5122892.285, 6313...
  3 MULTILINESTRING ((665166.020 5138641.982, 6651...
  4 MULTILINESTRING ((673606.020 5162961.982, 6736...
- ...
+ ...
  250 MULTILINESTRING ((681672.620 5078601.582, 6818...
  251 MULTILINESTRING ((414867.917 5093040.881, 4147...
  252 MULTILINESTRING ((414867.917 5093040.881, 4148...
@@ -280,9 +265,9 @@ geoarrow_array2
  <MULTILINESTRING ((673606.0199999996 5162961.9823, 673606.01999999...>
  ...245 values...
  <MULTILINESTRING ((681672.6200000001 5078601.5823, 681866.01999999...>
- <MULTILINESTRING ((414867.9170000004 5093040.8807, 414793.81699999...>
- <MULTILINESTRING ((414867.9170000004 5093040.8807, 414829.71700000...>
- <MULTILINESTRING ((414867.9170000004 5093040.8807, 414937.21700000...>
+ <MULTILINESTRING ((414867.91700000037 5093040.8807, 414793.8169999...>
+ <MULTILINESTRING ((414867.91700000037 5093040.8807, 414829.7170000...>
+ <MULTILINESTRING ((414867.91700000037 5093040.8807, 414937.2170000...>
  <MULTILINESTRING ((648686.0197000001 5099181.984099999, 648866.019...>
