-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
Migration numPy 2.0 and remove deprecated scipy sparsetools functions #3615
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@julianpollmann can you please merge Also, since we are upgrading numpy, maybe update cython as well? As per the report in #3611. Inside #3611 I also see a recent link to this (claude-authored) commit: flext-sh@8c418b7 . Is there anything in there that might help with this PR? |
pyproject.toml
Outdated
# If we build our extensions with Cython 3.0.0, then they will be an | ||
# order of magnitude slower, so avoid it for now. | ||
# | ||
"Cython>=0.29.32,<3.0.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 3.1.2 any better regarding speed (see the comment above this line)?
If so let's change this line to 3.1.2 plus update that comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@piskvorky I removed the upper version bound and tested with 3.1.2. However builds/tests take ages. It seems that Cython >=3 has stricter type checking, making it slower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I doubt it's type checking that takes ages. Maybe something (a dependency?) gets built from source, whereas it used to be pulled pre-built before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@piskvorky I tried several things, unfortunately no luck. Especially Testing of test_doc2vec.py::TestDoc2VecModel seems to be much slower on Cython 3.1.2.
With Upgrading to 3.1.2 Python 3.13 seems to work however?!
I'll look into that tomorrow! |
if it's of any help. support for python 3.13 was only added in cython 3.1.x If you need, I think there is an old syntax for setup.py/setup.cfg to specify different versions.
|
Hey @morotti @piskvorky @gojomo, I don't have enough experience of Cython to judge, but profiling e.g. test_word2vec.py showed that train_batch_cbow and train_batch_sg are very slow (16-20sec). I suspect some kind of type conversion thing, but don't know. Also, when compiling by hand I get lot of messages like:
Hope somebody can figure out?! |
That's an unusually lucid and helpful warning message / hint! Thanks for spotting it. Can you please try adding that |
@piskvorky It seems like Cython 3 will check for exceptions and therefore use the GIL. After adding With this tests for Python 3.13 are passing 😀 P.S.: I'll be N/A till Aug 18th, so cannot make any changes to this PR. |
Hey @piskvorky, could you rerun failed jobs from the CI workflows? There are some http errors, which should disappear. |
Sure. Re-running now under https://github.com/piskvorky/gensim/actions/runs/17163546119?pr=3615 |
@piskvorky thanks. Looks like builds should pass. There is one issue with
Rerunning the failed (macOS) job should fix this. Then Tests should pass. |
@julianpollmann re-run #3 still says "failed". Is there a way for you to be able to (re)run these tests yourself? I added you as a "collaborator" now, please let me know if that helps! |
Thanks will try this evening! |
@piskvorky Builds and Tests are passing. I guess this can be merged. |
Any update on this PR? It would be really great to gain support for both numpy2 and Python 3.13 in gensim. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work!
@piskvorky @gojomo If there are no objections, let's merge and release |
I already OKed above; please merge & let's release! Thanks to everyone, I know this has been in the oven for a very long time. |
Hey @gojomo @hechth @piskvorky @mpenkov,
I've created a new PR for the Migration to numPy 2.0 and the removal of deprecated scipy functions. What has been done so far:
What needs to be done:
Remove NMSLibIndexer since it will only work on numPy 1.0?(See this comment). NMSLibIndex will raise an Exception, if on Python >=3.10 and numpy >=2.xHope we get this done now 💪