TrixiCUDA v0.1.0-beta.5
Pre-release
What's Changed
- Small changes regarding variable name on tests by @huiyuxie in #88
- Add scripts for benchmarking and profiling workflows by @huiyuxie in #92
- Refactor solver and tests to support `Float32` computations by @huiyuxie in #94
- Add one more example to benchmark by @huiyuxie in #95
- Update README.md by @huiyuxie in #96
- Combine similar kernels using cooperative groups by @huiyuxie in #97
- Relax inbounds checking within GPU kernels by @huiyuxie in #99
- Fuse `reset_du!` function into volume integral kernels by @huiyuxie in #100
- Relax inbounds checking in minor GPU kernels by @huiyuxie in #101
- Optimize volume integral kernels by @huiyuxie in #102
- Update README.md by @huiyuxie in #103
- Load package with device property querying by @huiyuxie in #104
- Optimize volume integral kernel for flux differencing by @huiyuxie in #105
- Bump crate-ci/typos from 1.28.1 to 1.29.0 by @dependabot in #106
- Switch to less parallelism to avoid redundant computation by @huiyuxie in #107
- Remove comments for `reset_du!` function in tests by @huiyuxie in #108
- Update some critical comments by @huiyuxie in #111
- Adapt `wrap_array` for GPU arrays by @huiyuxie in #112
- Optimize volume integral kernels for flux differencing by @huiyuxie in #114
- Optimization patch for volume integral kernels by @huiyuxie in #115
- Optimize volume integral kernels for larger arrays (less common use) by @huiyuxie in #116
Full Changelog: v0.1.0-beta.4...v0.1.0-beta.5