BLD: switch to Cython>=3.0,<3.1 #4575
Conversation
Builds seem to go through on all platforms; however, tests running on Jenkins appear to be much slower. They normally take about 50 min, and it looks like they're about to time out at 2h (with no signs of hanging). I'll run them again for good measure, but it's likely a sign of performance regressions.
Probably another set of default-changes, right?
That is my expectation. It might be a bit harder for me to track those down, as it's always been a pain to replicate nose tests locally, but I'll give it a try.
currently chasing GIL acquisition in:
import numpy as np
import yt
from yt.testing import requires_file
from yt.utilities.lib.bounding_volume_hierarchy import BVH, test_ray_trace
from yt.visualization.volume_rendering.api import Camera, Scene
def get_rays(camera):
normal_vector = camera.unit_vectors[2].d
W = np.array([8.0, 8.0])
N = np.array([800, 800])
dx = W / N
x_points = np.linspace((-N[0] / 2 + 0.5) * dx[0], (N[0] / 2 - 0.5) * dx[0], N[0])
y_points = np.linspace((-N[1] / 2 + 0.5) * dx[1], (N[1] / 2 - 0.5) * dx[1], N[1])
X, Y = np.meshgrid(x_points, y_points)
result = np.dot(camera.unit_vectors[0:2].T, [X.ravel(), Y.ravel()])
vec_origins = camera.scene.arr(result.T, "unitary") + camera.position
return np.array(vec_origins), np.array(normal_vector)
fn = "MOOSE_sample_data/out.e-s010"
ds = yt.load(fn)
vertices = ds.index.meshes[0].connectivity_coords
indices = ds.index.meshes[0].connectivity_indices - 1
ad = ds.all_data()
field_data = ad["connect1", "diffused"]
bvh = BVH(vertices, indices, field_data)
sc = Scene()
cam = Camera(sc)
cam.set_position(np.array([8.0, 8.0, 8.0]))
cam.focus = np.array([0.0, 0.0, 0.0])
origins, direction = get_rays(cam)
image = np.empty(800 * 800, np.float64)
test_ray_trace(image, origins, direction, bvh) # this is the bottleneck
image = image.reshape((800, 800))
Even though all methods of the BVH class are already marked as
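For context, the likely mechanism (the exact yt call site is still being narrowed down at this point) is Cython 3.0's change to the default exception handling of cdef functions: they now propagate exceptions by default, and the generated error check can briefly re-acquire the GIL even inside nogil code. A minimal sketch of the two behaviours, with hypothetical function names rather than actual yt code:

cdef double add_one(double x) nogil:
    # Cython 3.0 default: implicitly "except? -1", so every call is followed
    # by an exception check that may need to acquire the GIL.
    return x + 1.0

cdef double add_one_fast(double x) noexcept nogil:
    # "noexcept" restores the Cython 0.29.x behaviour: no exception check,
    # no GIL traffic in hot nogil loops.
    return x + 1.0

The legacy_implicit_noexcept directive that comes up later in this thread applies the second form to a whole file at once.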
@@ -1,3 +1,4 @@
+# cython: legacy_implicit_noexcept=True
This generates a bunch of warnings so it's not to be considered stable, but it's useful to quickly slap noexcept
on all functions in the module and reduce the difference between the old cythonized C code and the new one. I should also point out that it's not enough to solve the performance issue seen in the example I'm currently working on.
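For readers unfamiliar with it, legacy_implicit_noexcept is a compiler directive set in a # cython: comment at the very top of a .pyx file, and it is roughly a shorthand for writing noexcept on every cdef function in that file. A hedged sketch with a hypothetical module, not yt code:

# cython: legacy_implicit_noexcept=True

cdef double scale(double x) nogil:
    # With the directive above, this behaves as in Cython 0.29.x: no implicit
    # exception propagation, hence no extra error checks or GIL acquisition.
    # Roughly equivalent to writing: cdef double scale(double x) noexcept nogil
    return 2.0 * x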
Cython 3.0 final is being deployed to PyPI. Will need to update this PR.
In my latest commit, I attempted forcing legacy behaviour (set
used a sampling profiler and came up with
Which suggests the combo of OpenMP and the GIL is causing it.
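A plausible minimal illustration of that combination (hypothetical code, not the actual yt kernels): under the Cython 3.0 defaults, a cdef helper called from an OpenMP prange loop is implicitly allowed to raise, so each call inside the parallel region is followed by an exception check that can, in the worst case, grab the GIL and serialize the threads.

from cython.parallel import prange

cdef double weight(double r) nogil:
    # Implicitly "except? -1" under Cython 3.0: callers check for a raised
    # exception after every call, even inside the prange loop below.
    return 1.0 / (1.0 + r * r)

def total_weight(double[:] radii):
    cdef Py_ssize_t i
    cdef double total = 0.0
    for i in prange(radii.shape[0], nogil=True):
        total += weight(radii[i])
    return total

Declaring weight as noexcept nogil (or compiling the module with the legacy directive) removes that check again.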
So I think OpenMP might have been a red herring. I looked at the code and it was using
There might also be a corner case of
How do these testing times look?
Still longer than the baseline, but it's too soon to conclude (still running).
I will take a look at a few more places. Any ideas on narrowing it down would be helpful, but I can take a first pass.
inspecting previous runs' runtime and sorting by duration might provide a couple ideas:
For what it's worth, I confirm that your commits resolved the one case I was goose-chasing, so we're on the right track!
Looks like the next two to go after are
GAMER also needs a look, which makes sense since we cythonized some of its stuff.
I've pushed a change that does the SPH kernel functions, which are in a .pxd file, as well as some functions that get called from within Python functions in the GAMER fields.
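The .pxd part would look roughly like this (hypothetical names and a simplified, unnormalized kernel, not yt's actual SPH code): small inline helpers shared through a .pxd header get called in tight loops, so an explicit noexcept nogil keeps the implicit Cython 3.0 exception checks out of the hot path.

# kernels.pxd -- hypothetical sketch, normalization constant omitted
cdef inline double cubic_kernel(double q) noexcept nogil:
    # Cubic-spline shape truncated at q = 2; "noexcept nogil" keeps call
    # sites free of exception checks and GIL acquisition.
    if q < 1.0:
        return 1.0 - 1.5 * q * q + 0.75 * q * q * q
    elif q < 2.0:
        return 0.25 * (2.0 - q) ** 3
    return 0.0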
So it looks like with my latest change, it still timed out at 4 hours. Am I reading this right? https://tests.yt-project.org/job/yt_py38_git/7124/testReport So the pixelized projection values tests are still quite slow. I'll take another look today.
I believe that there are several top-level functions in the file
yes.
I don't see anything about that in the migration guide, so if true, that might be an actual regression?
Sorry, I meant they are
The process seems to be hanging (or maybe cookbook recipes are so slow that one of them is taking forever, but it's not clear which one, since the order they are run in isn't deterministic). I'm going to trigger a new job with a deterministic test order so that if it hangs again, at least I'll know where.
I'm splitting out the known-useful bits into smaller PRs so we can keep making progress on this one while keeping the diff reasonable.
Let's not do that. I am unwilling to use or recommend Cython 3 until the performance has been addressed, so the upgrade and the fix should be a single PR.
I've rebased the branch to account for conflicts and removed my clearly-useless commits.
Thanks to @Xarthisius, I think we have now restored performance completely, so this is ready to go. Also, this now closes #4355.
@Xarthisius, can you merge this please?
We've been continuously testing Cython 3.0 pre-releases on Linux and for a subset of the test suite. This PR is an experiment to check for portability issues by running all tests with the current release candidate.
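For reference, the build pin described by the PR title would typically look like the snippet below in pyproject.toml; the surrounding entries are placeholders, not necessarily yt's exact build configuration:

[build-system]
# Stay on the 3.0.x series: <3.0 would fall back to the old cythonization
# behaviour, and 3.1 may change defaults again.
requires = [
    "setuptools>=61.2",
    "Cython>=3.0,<3.1",
    "numpy",
]
build-backend = "setuptools.build_meta"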