Conversation

@Ambar-13

Added GPUUtils module that rewrites integer powers (u^2, u^3, etc.) to multiplication chains (u*u, u*u*u) before GPU kernel generation.
Tests cover symbolic transformations, device detection, and the original failing cases (u^2, Dx(u^3)) on CUDA. Non-CUDA systems skip GPU tests.

Note: Developed on Mac M1. CPU tests pass. Please run CUDA tests.
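The skip behavior mentioned above is typically just a CUDA.functional() gate. A minimal sketch (the file name gpu_nonlinear_tests.jl comes from the commit list below; the wiring itself is an assumption, not the PR's actual code):

```julia
using CUDA

if CUDA.functional()
    # GPU path: run the CUDA test file added in this PR
    include("gpu_nonlinear_tests.jl")
else
    @info "CUDA not functional; skipping GPU tests"
end
```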

Add GPUUtils module that rewrites integer powers (u^2, u^3, etc.)
to multiplication chains (u*u, u*u*u) before GPU kernel generation. This
preserves Symbolics derivative rules and avoids AD issues.

Only applies when init_params are on GPU (via AbstractGPUDevice).

Tests cover symbolic transformations, device detection, and the
original failing cases (u^2, Dx(u^3)) on CUDA. Non-CUDA systems
skip GPU tests.
Use SymbolicUtils.term for multiplication construction
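For context, the rewrite plus the device gate described above might look roughly like the sketch below. It assumes a recent SymbolicUtils.jl (for iscall/term) and MLDataDevices.jl (for AbstractGPUDevice); transform_power_ops mirrors the name in the PR's docstring, while on_gpu is a hypothetical helper name. This is an illustration, not the PR's actual code.

```julia
using SymbolicUtils
using SymbolicUtils: iscall, operation, arguments, term
import MLDataDevices

# Rewrite u^n (positive integer n) into an explicit multiplication chain.
function transform_power_ops(ex)
    iscall(ex) || return ex                         # symbols/constants pass through
    args = map(transform_power_ops, arguments(ex))  # rewrite children first
    op = operation(ex)
    if op === (^) && args[2] isa Integer && args[2] > 1
        # u^n -> u * u * ... * u (n factors). Building the product with `term`
        # bypasses the smart constructors, which would otherwise fold u*u
        # right back into u^2.
        return term(*, ntuple(_ -> args[1], Int(args[2]))...)
    end
    return term(op, args...)
end

# Apply the rewrite only when the parameters live on a GPU.
on_gpu(init_params) =
    MLDataDevices.get_device_type(init_params) <: MLDataDevices.AbstractGPUDevice
```

Building the product with SymbolicUtils.term (per the commit above) is the load-bearing detail: Symbolics' smart constructors would otherwise simplify a plain u*u straight back into u^2, undoing the rewrite.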
Member


this isn't in the runtests so it won't be run.

Author


Dr. Chris, thank you for pointing this out; I might be missing something here.
I added them as @testitem blocks and was relying on ReTestItems to pick them up via the :cuda tags.
If you'd prefer them explicitly included in runtests.jl or moved alongside the existing CUDA PDE tests, I'm happy to adjust.
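For reference, a tagged test item looks roughly like this (a sketch assuming TestItems.jl syntax; the name and body are placeholders, not this PR's actual tests):

```julia
using TestItems

@testitem "u^2 rewrite on CUDA" tags=[:cuda] begin
    using CUDA  # each @testitem runs in its own module, so imports go inside
    # ... GPU assertions for u^2 / Dx(u^3) would go here ...
end
```

That said, the reviewer's point stands: test items are only discovered when something (typically runtests.jl) calls ReTestItems.runtests on the package or test directory; tags alone don't cause them to run.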

src/gpu_utils.jl (outdated), comment on lines 11 to 17:
Transform integer power operations into explicit multiplication chains
compatible with symbolic differentiation.

This function rewrites expressions of the form `u^n` (where `n` is a positive
integer) into equivalent multiplication expressions `u * u * ... * u` (n times).
This transformation enables automatic differentiation through the Symbolics.jl
chain rule without requiring special-cased derivative rules for power operations.
Member


This doesn't seem like it would generate more efficient code?

Author


You're right, yes.
I did that to avoid the NaNs we were seeing on GPU backward passes for expressions like u(x)^2 and Dx(u(x)^3) (issue #914).
I'll update the comment to make the intent clearer, but would you prefer to handle this a different way?

Added CUDA import to test items
Fix missing imports in gpu_nonlinear_tests.jl
Documentation: clarify purpose of transform_power_ops