Mirror intel/llvm commits #2795
Unified Runtime -> intel/llvm Repo Move Notice
Information: The source code of Unified Runtime has been moved to intel/llvm under the unified-runtime top-level directory. The code will be mirrored to oneapi-src/unified-runtime and the specification will continue to be hosted at oneapi-src.github.io/unified-runtime. The contribution guide will be updated with new instructions for contributing to Unified Runtime.
PR Migration: All open PRs including this one will be marked with the Should you wish to continue with your PR you will need to migrate it to intel/llvm. If your PR should remain open and not be closed automatically, you can remove the This is an automated comment.
Enable origin tracking for host, shared, and device USM, which provides more debug information about detected uninitialized memory.
When reporting an error, exiting the program with `exit(1)` could cause hangs in the SYCL runtime because it skips some steps that precede the SYCL shutdown process in the atexit stage. `abort()` is more appropriate here because it stops the whole process right away, so no SYCL shutdown runs in the unstable, early-exited program.
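The difference between the two calls can be observed with a small POSIX-only sketch. It is not the sanitizer's code: `shutdownHook` is a hypothetical stand-in for runtime shutdown work registered with `atexit`, and `runChild` is an illustrative helper that forks a child so the parent can see how the child terminated.

```cpp
#include <cassert>
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

namespace {
int gPipeWriteFd = -1;

// Hypothetical stand-in for shutdown work registered via atexit.
void shutdownHook() {
  const char byte = 'x';
  ssize_t written = write(gPipeWriteFd, &byte, 1);
  (void)written;
}
} // namespace

struct ChildResult {
  bool hookRan;       // true if the atexit handler executed in the child
  bool killedByAbort; // true if the child was terminated by SIGABRT
  int exitCode;       // exit status, or -1 if the child did not exit normally
};

// Forks a child that registers shutdownHook and then terminates with
// either std::abort() or std::exit(1).
ChildResult runChild(bool useAbort) {
  int fds[2];
  int rc = pipe(fds);
  assert(rc == 0);
  (void)rc;

  pid_t pid = fork();
  assert(pid >= 0);
  if (pid == 0) {
    close(fds[0]);
    gPipeWriteFd = fds[1];
    std::atexit(shutdownHook);
    if (useAbort)
      std::abort(); // terminates immediately, atexit handlers do not run
    std::exit(1);   // runs atexit handlers before terminating
  }

  close(fds[1]);
  char byte = 0;
  bool hookRan = read(fds[0], &byte, 1) == 1; // EOF (0) if hook never ran
  close(fds[0]);

  int status = 0;
  waitpid(pid, &status, 0);
  ChildResult result;
  result.hookRan = hookRan;
  result.killedByAbort = WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT;
  result.exitCode = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
  return result;
}
```

Calling `runChild(false)` exercises the `exit(1)` path, where the handler runs; `runChild(true)` exercises the `abort()` path, where it does not.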
Level Zero is the only backend that supports 1D fetch, yet it was marked as unsupported. This PR fixes that and adds corresponding tests. As with other fetch cases, O0 builds fail on Windows for L0 using 1D fetch (see intel/llvm#18919). --------- Signed-off-by: JackAKirk <[email protected]>
Bug 1: leak. When the private shadow failed to allocate, the already allocated private base was not freed. Bug 2: leak. The old private base was never freed. Improvement: try to reuse the private base, just as we try to reuse the private shadow. --------- Co-authored-by: Copilot <[email protected]> Co-authored-by: Kenneth Benzie (Benie) <[email protected]>
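The two leaks and the reuse policy can be illustrated with a minimal sketch. Every name here (`CountingAllocator`, `PrivateBuffers`, `ensureBuffers`) is hypothetical and only models the pattern described above; it is not the sanitizer's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator that counts live allocations and can be told
// to fail on a chosen call, so leaks become observable.
struct CountingAllocator {
  int live = 0;       // allocations not yet released
  int calls = 0;      // total allocation attempts
  int failAfter = -1; // fail when calls == failAfter (disabled if < 0)

  void *allocate(std::size_t size) {
    if (calls++ == failAfter)
      return nullptr;
    ++live;
    return std::malloc(size);
  }
  void release(void *ptr) {
    if (ptr) {
      --live;
      std::free(ptr);
    }
  }
};

struct PrivateBuffers {
  void *base = nullptr;
  std::size_t baseSize = 0;
  void *shadow = nullptr;
  std::size_t shadowSize = 0;
};

// Ensures both buffers are at least the requested sizes, reusing an
// existing buffer when it is already large enough.
bool ensureBuffers(CountingAllocator &alloc, PrivateBuffers &bufs,
                   std::size_t baseSize, std::size_t shadowSize) {
  void *newBase = nullptr;
  if (bufs.baseSize < baseSize) {
    newBase = alloc.allocate(baseSize);
    if (!newBase)
      return false;
  }
  if (bufs.shadowSize < shadowSize) {
    void *newShadow = alloc.allocate(shadowSize);
    if (!newShadow) {
      alloc.release(newBase); // Bug 1: free the base allocated just above
      return false;
    }
    alloc.release(bufs.shadow);
    bufs.shadow = newShadow;
    bufs.shadowSize = shadowSize;
  }
  if (newBase) {
    alloc.release(bufs.base); // Bug 2: free the old base before replacing it
    bufs.base = newBase;
    bufs.baseSize = baseSize;
  }
  return true;
}
```

In this sketch a failed call leaves the existing buffers untouched, so the caller can keep using them or release them later.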
This commit changes the HIP adapter to select the correct binary for the device when a bundle contains binaries built for multiple AMDGPU architectures. Like other adapters, the HIP adapter previously selected the first 'amdgcn' binary it came across. This works fine for the common case where the program was compiled for one architecture, but may fail otherwise. To support this, the SYCL runtime passes some extra information into urDeviceSelectBinary via the pre-existing 'pNext' field of ur_device_binary_t. It does this only for the HIP backend. The HIP adapter then parses this binary information as a clang offload bundle, which conveniently contains specific triple and architecture information for each binary. For this we reuse the code that the offload adapter was using, making it common and fixing a bug in the version-matching logic.
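The change in selection policy can be sketched as follows. This is an illustration only: `BundleEntry`, `selectFirstAmdgcn`, and `selectForDevice` are hypothetical names, the entries are assumed to have already been parsed out of the bundle, and none of this reflects the real `urDeviceSelectBinary` signature or the adapter's parsing code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, already-parsed view of one binary in an offload bundle.
struct BundleEntry {
  std::string triple; // e.g. "amdgcn-amd-amdhsa"
  std::string arch;   // e.g. "gfx906"
};

static bool isAmdgcn(const BundleEntry &entry) {
  return entry.triple.rfind("amdgcn", 0) == 0; // triple starts with "amdgcn"
}

// Previous policy: index of the first amdgcn binary, or -1 if none.
int selectFirstAmdgcn(const std::vector<BundleEntry> &entries) {
  for (std::size_t i = 0; i < entries.size(); ++i)
    if (isAmdgcn(entries[i]))
      return static_cast<int>(i);
  return -1;
}

// New policy: index of the amdgcn binary built for the device's
// architecture, or -1 if the bundle has no matching binary.
int selectForDevice(const std::vector<BundleEntry> &entries,
                    const std::string &deviceArch) {
  for (std::size_t i = 0; i < entries.size(); ++i)
    if (isAmdgcn(entries[i]) && entries[i].arch == deviceArch)
      return static_cast<int>(i);
  return -1;
}
```

With a bundle holding binaries for two architectures, the first-match policy always returns the same index regardless of the device, while the architecture-aware policy returns the entry matching the device or reports that none fits.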
urDeviceGetInfo is now able to retrieve the max memory bandwidth. --------- Signed-off-by: Zhang, Winston <[email protected]>
Instead of using a global constructor to initialize the L0 adapter, do it in the first call to `urAdapterGet`. Likewise, instead of de-initing it as a global destructor, do it in the last call to `urAdapterRelease`. As well as not doing L0 initialization where the user is not using L0, it also allows `urAdapterRelease` to be called in a global destructor (e.g. what the SYCL runtime does) without worrying about global destructor order.
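The first-get and last-release pattern described here can be sketched with a reference count. `LazyAdapter` and its members are hypothetical and stand in for the adapter's real state; the sketch does not call any Level Zero or Unified Runtime API.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical adapter state that initializes on the first get() and
// tears down on the last release(), rather than in a global
// constructor and destructor.
class LazyAdapter {
public:
  void get() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (refCount_++ == 0) {
      initialized_ = true; // stand-in for the backend initialization
      ++initCount_;
    }
  }

  void release() {
    std::lock_guard<std::mutex> lock(mutex_);
    assert(refCount_ > 0 && "release() without a matching get()");
    if (--refCount_ == 0) {
      initialized_ = false; // stand-in for the backend teardown
      ++teardownCount_;
    }
  }

  bool isInitialized() const { return initialized_; }
  int initCount() const { return initCount_; }
  int teardownCount() const { return teardownCount_; }

private:
  std::mutex mutex_;
  unsigned refCount_ = 0;
  bool initialized_ = false;
  int initCount_ = 0;
  int teardownCount_ = 0;
};
```

Because teardown is tied to the last `release()` and not to object destruction, a caller can release the adapter from its own global destructor without depending on destructor ordering, and a process that never calls `get()` never pays for initialization.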
aa6a556 to cb4d997
Automated changes by create-pull-request GitHub action