
Releases: ROCm/aomp

AOMP Release 0.7-0

02 Aug 20:53
Pre-release

THIS IS AN OLD RELEASE. DO NOT DOWNLOAD. PLEASE DOWNLOAD THE LATEST RELEASE.

This release is a major update from 0.6-5. The source code base for this release is the clang/llvm 9.0 development trunk as of July 15, 2019. These are the other changes included in this release.

  • The package now installs in /usr/lib/aomp_0.7-X with a symbolic link from /usr/lib/aomp.
  • Uses a build of rocm-device-libs made from unmodified ROCm 2.6 source files.
  • New, untested infrastructure to eventually support Fortran with flang.
  • Moved to the new llvm-project repository. This is the new monorepo that eliminates need for clang, llvm, lld, and openmp repositories.
  • No longer builds for the nvptx backend; removed the CUDA examples.
  • Moved utils to the aomp-extras repository.
  • Moved custom libraries from rocm-device-libs to aomp-device-libs.
  • hcc is now built with ROCm 2.6. hcc is not in the package because it is only used to build the HIP runtime.
  • roct and rocr are now built from ROCm 2.6 sources.
  • comgr is now built from the ROCm 2.6 sources.
  • fixes for a number of new test cases
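
The versioned install layout from the first item above can be inspected from a shell. This is a minimal sketch and not part of the package: the helper name `aomp_real_dir` is hypothetical, and the demonstration uses a scratch directory in place of /usr/lib so that it runs without AOMP installed.

```shell
#!/bin/sh
# Hypothetical helper: print the versioned directory an aomp symlink points to.
aomp_real_dir() {
    # readlink -f follows the symlink and prints the absolute target path
    readlink -f "${1:-/usr/lib/aomp}"
}

# Scratch directory that mimics the layout described in the release note.
demo_root=$(mktemp -d)
mkdir "$demo_root/aomp_0.7-0"
ln -s "$demo_root/aomp_0.7-0" "$demo_root/aomp"

resolved=$(aomp_real_dir "$demo_root/aomp")
echo "aomp -> $(basename "$resolved")"

rm -rf "$demo_root"
```

On an installed system, calling `aomp_real_dir` with no argument would report which aomp_0.7-X directory /usr/lib/aomp currently points to.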

AOMP Release 0.6-5

29 Jun 14:01
Pre-release

Like 0.6-4, this release 0.6-5 of aomp is based off the stable version of clang/llvm 8.0.
These are the changes found in 0.6-5 compared to the previous 0.6-4 release.

  • Added support for archives of bundles on command line.
  • Created hostcall payload on system memory instead of GPU memory. This avoids cache effects of HBM memory that gets flushed only at kernel boundaries.
  • Cleaned up examples.
  • Readability changes to various README files in docs.
  • Added SLES-15-SP1 source install dependencies and important notes for Linux support.
  • Emit struct of per kernel attributes.
  • Detect and warn when a target exit data clause fails, rather than aborting.
  • Fixed linking issue when archive files contain no BC files.

AOMP Release 0.6-4

17 Jun 13:31
Pre-release

Like 0.6-3, this release 0.6-4 of aomp is based off the stable version of clang/llvm 8.0.

These are the changes found in 0.6-4 compared to the previous 0.6-3 release.

  • support for building on SLES15 SP1
  • rpm package for SLES15 SP1
  • do not create a host thread for GPU hostcall services if no services are used by any kernel in the application. This fixes a performance regression we saw with openmpapps in 0.6-3 because none of those apps currently use printf on the device. This still needs more study.
  • Reorganized the github README and linked pages to make it less confusing and to ready support for more platforms.
  • removed hip wrapper scripts such as hipcc. Users must compile hip with clang++ as demonstrated in the examples to get openmp support with hip.
  • properly set amdgpu-flat-work-group-size for generic mode: add wave_size
  • add -lelf to link step of libomptarget.rtl.hsa.so
  • more gracefully exit when gpu arch of kernel does not match device arch
  • refine LIBOMPTARGET_KERNEL_TRACE: 1 => minimal, 2 => more verbose

AOMP Release 0.6-3

28 May 18:20
Pre-release

Like 0.6-2, this release is based off the stable version of clang/llvm 8.0.

These are the changes since 0.6-2.

  • New support for synchronous services called hostcall.
  • The source to support hostcall can be found in a new repository called aomp-extras in the hostcall directory
  • There are minor changes to atmi to support hostcall. These are in branch atmi-0.5-063.
  • Removed printf end-of-kernel service and added to hostcall. printf is now much more reliable from the gpu.
  • Enhancements to toolchain to support static device libraries
  • Fix to correctly pick up math functions from libm-.bc. Previously, math functions were treated as builtins.
  • Suppress calls to __kmpc_push_target_count for host code, resolves undefined reference.
  • Allow -frtti flag to be honored if user requests it on command line.
  • Add AOMP/include path before /usr/local/include to pick up correct header for omp.h.
  • Generate Metadata for both SPMD and Generic offload targets.
  • Honor OMP_TEAM_LIMIT for work groups, just like OMP_NUM_TEAMS.
  • Added *_wg_size symbol to reflect compile time known thread limit for a kernel.
  • Added support to openmp runtimes to support 1024 threads per team/work group.
  • Reenabled SILoadStoreOptimizer pass after pulling upstream fix for scalar carry corruption.
  • Fixed amdgcn noinline and alwaysinline incompatibility issue for the Parallel Data Sharing Wrapper

AOMP Release 0.6-2

01 May 01:34
Pre-release

This release uses the release_80 stable release of the clang, llvm, lld, and openmp repositories. The artifacts for this release include the patches to the release_80 repositories that support OpenMP on amdgcn for release 0.6-2.

Here are the fixes for 0.6-2:

  • Fixed issue with constant size teams and threads.
  • Moved to the stable clang/llvm 8.0 code base
  • Fixed code in deviceRTLs/amdgcn to set Max_Warp_Number to 16; it was 64.
  • Enable Float16 for 0.6-2, disabled by default in release_80 merge
  • Disabled the metadata optimization, and provided the environment variable AMDGPU_ENABLE_META_OPT_BUG to enable it.
  • Add archive handling for bc linking.
  • For performance, rewrite select_outline_wrapper calls to be direct calls.
    Example: change the generated code from:
    @_HASHW_DeclareSharedMemory_cpp__omp_outlined___wrapper =
    local_unnamed_addr addrspace(4) constant i64 -4874776124079246075
    call void @select_outline_wrapper(i16 0, i32 %6, i64 -4874776124079246075)
    to:
    call void @DeclareSharedMemory_cpp__omp_outlined___wrapper(i16 0, i32 %6)
  • In release_80 the Loop_tripcount API is now used, so num_groups/teams must be limited
    to no more than Max_Teams. This fixes assertok_error and snap4.
    The num_teams clause is also handled inside the loop_tripcount logic.
  • BALLOT_SYNC macro replaced with ACTIVEMASK in release_80

AOMP Release 0.6-1

15 Apr 16:56
Pre-release

Changes from 0.6-0 to 0.6-1:

  • Disabled SILoadStoreOptimizer pass to work around 64 bit address calculation issue

  • Added 6 new device APIs as extensions to the OpenMP device APIs

    • omp_ext_get_warp_id
    • omp_ext_get_lane_id
    • omp_ext_get_master_thread_id
    • omp_ext_get_smid
    • omp_ext_is_spmd_mode
    • omp_ext_get_active_threads_mask
  • rtl get_launch_vals added, algorithm rewrite for threads, teams computation

    • Throttle code for teams and threads off by default, enabled with THREAD_TEAM_THROTTLE
  • Added support for the LLC- and OPT-specific environment variables AOMP_LLC_ARGS and AOMP_OPT_ARGS

    • Allows adding compiler options to opt and llc via env-var, useful for triage, dumps, and debug.
  • Added clang-unbundle-archive tool.

  • Added support for device library archives in clang when using -l flag.

  • Updated llvm-link to work with archives of .bc components

  • Added new method AddStaticDeviceLibs to CommonArgs.cpp that searches for static device
    libraries using the -l and -L command line options, in a way similar to the search used for
    host libraries, including which directories are searched. The differences from the host search are:

    • Searches look for names that specify the architecture and/or GPU
    • Searches look in the libdevice subdirectory of each host directory path
    • Searches look for filenames with .a suffix before searching for .bc suffix
  • Cleanup of aomp build scripts including split of llvm component into llvm, clang, and lld.

  • Fix where llvm-config is found during build

  • Added installed binaries from llvm to help with clang lit testing

  • New build script for comgr. This is not part of the compiler build yet. Developers and those building from source can run build_comgr.sh

  • Do not build hip runtime for ppc and arm builds.

  • Added two new smoke tests and improved automation of smoke tests

  • Corrected mymcpu and mygpu for vega20
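
The static device library search described for AddStaticDeviceLibs can be illustrated with a small shell sketch. It is not the clang implementation. The helper name `find_device_lib`, the file name patterns `lib<name>-<arch>-<gpu>` and `lib<name>-<arch>`, and the choice to try both suffixes within one directory before moving to the next are all assumptions made for illustration. Only the three listed differences (arch/GPU in the name, the libdevice subdirectory, .a before .bc) come from the note above.

```shell
#!/bin/sh
# Sketch of the described search order; file name patterns are assumed, not AOMP's.
# usage: find_device_lib NAME ARCH GPU DIR...
find_device_lib() {
    name=$1; arch=$2; gpu=$3
    shift 3
    for dir in "$@"; do                  # directories in -L order
        for suffix in a bc; do           # .a is tried before .bc
            for stem in "lib$name-$arch-$gpu" "lib$name-$arch"; do
                candidate="$dir/libdevice/$stem.$suffix"
                if [ -f "$candidate" ]; then
                    printf '%s\n' "$candidate"
                    return 0
                fi
            done
        done
    done
    return 1                             # nothing matched
}

# Scratch tree standing in for one -L directory; "foo" is a made-up library.
libroot=$(mktemp -d)
mkdir "$libroot/libdevice"
touch "$libroot/libdevice/libfoo-amdgcn-gfx906.bc" \
      "$libroot/libdevice/libfoo-amdgcn.a"

found=$(find_device_lib foo amdgcn gfx906 "$libroot")
echo "selected: ${found##*/}"

rm -rf "$libroot"
```

In this sketch the archive libfoo-amdgcn.a is selected even though a GPU-specific .bc file exists, because the .a suffix is tried first. Whether the real search ranks suffix above name specificity is not stated in the release note.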

AOMP Release 0.6-0

25 Feb 23:08
Pre-release

This is the initial release of AOMP.

AOMP is the new name for HCC2. The last HCC2 release was HCC2 0.5-4.
Changes from HCC2 0.5-4

  • AOMP is built from sources for ROCm 2.1.
  • AOMP can build for Nvidia cards, so installing the CUDA 10 SDK is required.
  • AOMP needs to build hcc for proper build of hip.

Two of the openmpapps are known to fail. We are working to fix this in 0.6-1.

If you build AOMP from source, it installs into $HOME/rocm/aomp by default. This package installs into /opt/rocm/aomp. Many of the samples look first in $HOME/rocm/aomp. To override this, set the AOMP environment variable:

export AOMP=/opt/rocm/aomp
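
A script that needs the install location can follow the same convention. This is a sketch under the assumption that the script honors a preset AOMP and otherwise falls back to the source-build default named above. The `resolve_aomp` helper is hypothetical and is not taken from the AOMP samples.

```shell
#!/bin/sh
# Hypothetical helper: use $AOMP if it is set and non-empty,
# otherwise fall back to the source-build default $HOME/rocm/aomp.
resolve_aomp() {
    printf '%s\n' "${AOMP:-$HOME/rocm/aomp}"
}

unset AOMP
default_dir=$(resolve_aomp)       # falls back to $HOME/rocm/aomp

export AOMP=/opt/rocm/aomp        # override for the packaged install
package_dir=$(resolve_aomp)       # now /opt/rocm/aomp

echo "default: $default_dir"
echo "package: $package_dir"
```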