fix: Pin numpy<2 #85

Closed
wants to merge 2 commits into from

Conversation

@mjsqu mjsqu commented Jul 1, 2024

Patches: #84

@mjsqu mjsqu changed the title Pin numpy<2 fix: Pin numpy<2 Jul 1, 2024
@edgarrmondragon
Member

You might wanna update poetry.lock too

@visch
Member

visch commented Jul 2, 2024

@edgarrmondragon and @mjsqu

I tested the main repo locally (without this PR) and I had no issues at all

See https://github.com/MeltanoLabs/tap-universal-file/actions/runs/9761082046/job/26941341616

Can you give some steps to replicate what you're seeing, @mjsqu? Or @edgarrmondragon, can you see it?

My steps were

  1. Remove old venv
  2. `poetry install`
  3. It worked (no numpy stuff needed)

I checked because I looked for an issue on pyarrow for this and I didn't see anything.

@edgarrmondragon
Member

@visch how did you test the tap?

@edgarrmondragon
Member

Also, according to numpy/numpy#26191, pyarrow 16.0+ is required to support numpy 2.0

@edgarrmondragon
Member

#87 should improve compatibility
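
For illustration, the rule cited above (numpy 2.0 needs pyarrow 16.0+) can be written as a small check. This is only a sketch: `pyarrow_supports_numpy` is a hypothetical helper, not part of the tap or of either library, and it assumes plain `X.Y.Z` version strings and that any pyarrow release is acceptable alongside numpy 1.x.

```python
def major(version: str) -> int:
    """Return the leading integer of a dotted version string such as '16.1.0'."""
    return int(version.split(".")[0])


def pyarrow_supports_numpy(numpy_version: str, pyarrow_version: str) -> bool:
    """Apply the rule from the thread: numpy 2.x requires pyarrow 16.0 or newer.

    Assumption for this sketch: numpy 1.x is treated as compatible with
    every pyarrow version, which is a simplification.
    """
    if major(numpy_version) < 2:
        return True
    return major(pyarrow_version) >= 16


# Hypothetical version pairs, chosen only to exercise both branches.
print(pyarrow_supports_numpy("2.0.0", "16.1.0"))
print(pyarrow_supports_numpy("2.0.0", "15.0.2"))
```

A resolver that picks numpy 2.0.0 together with a pyarrow older than 16.0 would be flagged as incompatible by this check.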

@visch
Member

visch commented Jul 2, 2024

@edgarrmondragon I was running `poetry run pytest -k test_parquet_convert_execution`. It looks like this test isn't testing properly, because if I instead run

meltano lock --update --all 
meltano install
meltano config tap-universal-file set file_type parquet
meltano config tap-universal-file set parquet_type_coercion_strategy envelope
meltano config tap-universal-file set file_regex "^.*racing\\.parquet$"
meltano invoke tap-universal-file

I get the error

tap-universal-file-py3.10visch@DESKTOP-9BDPA9T:~/git/tap-universal-file$ meltano invoke tap-universal-file
2024-07-02T16:33:13.485002Z [info     ] Environment 'test' is active  
Need help fixing this problem? Visit http://melta.no/ for troubleshooting steps, or to
join our friendly Slack community.

Catalog discovery failed: command ['/home/visch/git/tap-universal-file/.meltano/extractors/tap-universal-file/venv/bin/tap-universal-file', '--config', '/home/visch/git/tap-universal-file/.meltano/run/tap-universal-file/tap.8127f8c9-7a4d-45cc-991f-21b171619874.config.json', '--discover'] returned 1 with stderr:
 
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.0 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "/home/visch/git/tap-universal-file/.meltano/extractors/tap-universal-file/venv/bin/tap-universal-file", line 5, in <module>
    from tap_universal_file.tap import TapUniversalFile
  File "/home/visch/git/tap-universal-file/tap_universal_file/tap.py", line 19, in <module>
    from tap_universal_file import streams
  File "/home/visch/git/tap-universal-file/tap_universal_file/streams.py", line 15, in <module>
    import pyarrow as pa
  File "/home/visch/git/tap-universal-file/.meltano/extractors/tap-universal-file/venv/lib/python3.10/site-packages/pyarrow/__init__.py", line 65, in <module>
    import pyarrow.lib as _lib
AttributeError: _ARRAY_API not found
Traceback (most recent call last):
  File "/home/visch/git/tap-universal-file/.meltano/extractors/tap-universal-file/venv/bin/tap-universal-file", line 5, in <module>
    from tap_universal_file.tap import TapUniversalFile
  File "/home/visch/git/tap-universal-file/tap_universal_file/tap.py", line 19, in <module>
    from tap_universal_file import streams
  File "/home/visch/git/tap-universal-file/tap_universal_file/streams.py", line 15, in <module>
    import pyarrow as pa
  File "/home/visch/git/tap-universal-file/.meltano/extractors/tap-universal-file/venv/lib/python3.10/site-packages/pyarrow/__init__.py", line 65, in <module>
    import pyarrow.lib as _lib
  File "pyarrow/lib.pyx", line 36, in init pyarrow.lib
ImportError: numpy.core.multiarray failed to import

So the test must be wrong. I put in #88 to look at this
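
One way to catch an import-time failure like the one above, however the environment was built, is to import the package in a fresh interpreter and inspect the result. A minimal sketch using only the standard library; `import_in_fresh_interpreter` and `looks_like_numpy_abi_mismatch` are hypothetical helper names, and the marker strings are copied from the error output above:

```python
import subprocess
import sys

# Substrings taken from the stderr shown in this thread.
NUMPY_ABI_MARKERS = (
    "compiled using NumPy 1.x cannot be run in",
    "_ARRAY_API not found",
    "numpy.core.multiarray failed to import",
)


def import_in_fresh_interpreter(module: str) -> subprocess.CompletedProcess:
    """Run `import <module>` in a new Python process and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=False,  # a failing import is reported through returncode
    )


def looks_like_numpy_abi_mismatch(stderr: str) -> bool:
    """Return True if stderr contains any of the NumPy 1.x/2.x mismatch markers."""
    return any(marker in stderr for marker in NUMPY_ABI_MARKERS)
```

Calling `import_in_fresh_interpreter("tap_universal_file.streams")` with the interpreter from the Meltano-built venv should return a non-zero `returncode` in the failing case, and `looks_like_numpy_abi_mismatch(result.stderr)` would then distinguish this problem from an unrelated import error.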

@edgarrmondragon
Member

@visch this worked for me on the HEAD of main with this config:

    config:
      file_type: parquet
      file_regex: ^.*racing\.parquet$
      file_path: tests/data
      parquet_type_coercion_strategy: envelope
      protocol: file

@visch
Member

visch commented Jul 2, 2024

@edgarrmondragon so the version updates fixed this for us. I think we're good to close then?

@mjsqu are you good to go now?

@edgarrmondragon
Member

I think so. Let's wait for @mjsqu to confirm.

@mjsqu
Author

mjsqu commented Jul 3, 2024

Good to close. I can't reproduce this on my personal setup (py3.11, Debian Linux). We've hit this numpy issue a lot in my work environments, at various levels, and we have pinned numpy.

@edgarrmondragon
Member

Things should begin to stabilize as the ecosystem catches up 🤞.
