Releases: bsc-pm/ompss-2-releases
OmpSs-2 2025.06
Version 2025.06, Fri Jun 6, 2025
The OmpSs-2 2025.06 release adds compatibility with ALPI v1.2 across Task-Aware libraries and runtimes, expands device support via Nanos6 and LLVM/Clang, introduces code coverage in nOS-V, and features several updates to Task-Aware libraries such as TASYCL, TACUDA, and TAMPI. It also introduces several bug fixes and improvements to the OmpSs-2 clang parser, several APIs from nOS-V, and instrumentation across libraries.
Nanos6
- Add compatibility with ALPI version 1.2.
- Add support for the `grid` clause on CUDA tasks.
- Weaken DLB test requirements (partial drop of tests).
nOS-V
- Extended hwinfo API with new functionalities
- Updated compatibility to support ALPI v1.2
- Integrated code coverage
- Fixed a TLS-related bug when nesting `attaches` via `fork()`
- Fixed synchronization issues involving `nosv_cond` mutexes
- Fixed memory consistency issues on certain architectures related to barriers in `complete_callbacks`
- Reduced the number of instrumentation events triggered by `schedpoints` and `yields`
- Improved detection logic and internal representation of hardware information
- Fixed an instrumentation bug where physical CPU IDs were incorrectly emitted instead of logical IDs, leading to emulator failures
NODES
- No relevant changes compared to the previous release
LLVM/OpenMP (libompv)
- New, more refined implementation of the `passive` wait policy in `libompv` (`OMP_WAIT_POLICY=passive`)
- Add a `libompvtarget`, equivalent to `libomptarget` but through `libompv`
- Several bugfixes in the `libompv` implementation of free-agents
LLVM/Clang
- Preliminary support for combining OmpSs-2 with OpenMP offload
- Added support for the `grid` clause for devices in OmpSs-2
- Several bugfixes in the OmpSs-2 clang parser
Ovni
- Add support for OpenMP label and task ID views.
- Add support for nOS-V non-blocking scheduler server events (VSN and VSn).
- Add OpenMP simple breakdown view.
- Add bench6 package to run full mini-apps for tests.
Task-Aware Libraries
- Add compatibility with ALPI v1.2 to multiple Task-Aware libraries.
- (TASYCL) Accept Adaptive CPP targets in the configure script.
- (TASYCL) Expand the create queues API to accept combinations of SYCL device selectors and async exception handlers.
- (TASYCL) Expand the API to allow executing functions on all TASYCL queues.
- (TAMPI) Rework the polling mechanism through ALPI's new suspend feature
- (TAMPI) Generate a PKGCONFIG file on installation
- (TAMPI) Allow specifying the maximum number of CPUs of the system while configuring TAMPI
- (TAMPI) Improve the logging of tests
- (TAMPI) Fixed passing of lambdas in some boost functions to fix compatibility with v1.87.9
- (TAMPI) Removed `cpubind` from tests to avoid unexpected behavior depending on SLURM configuration
- (TACUDA) Allow multiple streams per CPU through the use of the `tacudaCreateStreams` parameter
- (TACUDA) Preallocate `cudaEvent_t` objects to reduce internal CUDA library contention
OmpSs-2 2024.11
Version 2024.11, Fri Nov 15, 2024
The OmpSs-2 2024.11 release adds support for Coroutines through the NODES runtime and the nOS-V tasking library and introduces several new features in nOS-V which include support for a task suspension API, support for RISC-V, a Topology API, and a Memory Pressure API, among others. This release also introduces support for the breakdown model through ovni and nOS-V.
Nanos6
- Add compatibility with ALPI version 1.1 by implementing various functions from the tasking interface
nOS-V
- Introduce support for breakdown model implementation, supported through the use of `ovniemu -b`
- Refactor shutdown mechanism, using a coordinated approach to prevent contention during runtime shutdown
- Introduce a Memory Pressure API, to query the current occupancy of the nOS-V shared memory segment
- Allow re-initialization of nOS-V, permitting the call to `nosv_init()` after `nosv_shutdown()`
- Enable `turbo` setting by default, and add correctness checking to detect changes to FPU flags from outside of nOS-V
- Add support for coroutines and similar constructs through the `nosv_suspend()` API.
- Add support for RISC-V
- Introduce a Topology API, which allows the configuration of system topology through the `nosv.toml` file
- Allow submitting tasks as `NOSV_SUBMIT_IMMEDIATE` from a task's run callback
- Introduce `nosv_cond_t` and related calls, as a replacement for pthread condition variables
- Other miscellaneous fixes and improvements
NODES
- Introduce support for Coroutines
- Fix immediate successor logic from within busy threads
- Fix wrong header include order in the build system affecting NODES' installation
- Other minor bug fixes and code improvements
LLVM/OpenMP (libompv)
- Support other LLVM/Intel compiler generated code in `libompv` (tracing) by setting `OMP_ENABLE_COMPAT=1`
- Other bug fixes and improvements
LLVM/Clang
- Miscellaneous bug fixes and improvements
Ovni
- Add breakdown model for nOS-V
- New mark API `ovni_mark_*()` to emit user-defined events
- New API to manage stream metadata `ovni_attr_*()`
- Update trace format to version 3 (to support independent streams)
Task-Aware Libraries
- Introduce TAMPI-OPT, an update for the Task-Aware MPI (TAMPI) library which implements several optimizations
OmpSs-2 2024.05
Version 2024.05, Thu May 16, 2024
The OmpSs-2 2024.05 release includes the Directory/Cache (D/C) for Host and CUDA devices in Nanos6, several new features for the nOS-V tasking library, and performance improvements and bug fixes. The `libompv` runtime in LLVM/OpenMP includes the implementation of OpenMP free-agents and instrumentation through ovni. This release removes the support for the Mercurium compiler.
Nanos6
- Add directory/cache (D/C) for Host and CUDA devices
- Add device memory allocation API for D/C-managed memory
- Improvements to the ovni instrumentation
nOS-V
- New batch submission API, which can accumulate tasks to submit them in batch once a certain threshold is reached
- Add `nosv_mutex_t` and `nosv_barrier_t` as nOS-V aware alternatives to their pthread counterparts
- Add instrumentation points for the `nosv_attach` and `nosv_detach` calls
- Add instrumentation for parallel tasks
- Activate the `turbo.enabled` configuration option by default, enabling flush-to-zero in x86-64 and aarch64
- Perform safety checks when the `turbo.enabled` configuration option is set to verify FPU flags are not modified by external libraries
- Split instrumentation events for the scheduler to allow them to be more granularly controlled
- Allow nOS-V programs to call fork() without leaving the forked process in an incoherent state
- Other bugfixes and improvements
NODES
- Improve the error-handling of nOS-V return codes
- Improve descriptiveness of ovni instrumentation
- Various improvements related to API integrations (nOS-V, ALPI, ovni)
LLVM/OpenMP (libompv)
- Implement the OpenMP free-agents feature by setting `OMP_ENABLE_FREE_AGENTS=1` and `OMP_WAIT_POLICY=passive`
- Instrument through ovni by setting `OMP_OVNI=1` and enabling ovni instrumentation in nOS-V
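Because the free-agents feature is switched on through environment variables, application code can stay standard OpenMP. The following is a minimal sketch of a task-based kernel that could be run under `libompv`; the function name `sum_squares` is hypothetical, and how the runtime distributes these tasks among free agents is not shown or guaranteed here.

```c
/* Hypothetical example: a standard OpenMP task-based reduction.
 * Nothing in this code is specific to libompv or to free agents;
 * the feature is assumed to be enabled only via the environment. */
long sum_squares(int n)
{
    long total = 0;

    #pragma omp parallel
    #pragma omp single
    {
        for (int i = 1; i <= n; ++i) {
            /* One task per element; each task adds i*i to the shared total. */
            #pragma omp task shared(total) firstprivate(i)
            {
                long sq = (long)i * i;
                #pragma omp atomic
                total += sq;
            }
        }
        /* Wait for all tasks created above before leaving the region. */
        #pragma omp taskwait
    }

    return total;
}
```

Assuming a build with the `-fopenmp=libompv` flag listed in the 2023.11 notes, a program calling `sum_squares` would be launched with `OMP_ENABLE_FREE_AGENTS=1` and `OMP_WAIT_POLICY=passive` set in the environment. When compiled without OpenMP support, the pragmas are ignored and the function computes the same result serially.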
LLVM/Clang
- Add `OPENMP_RUNTIME` environment variable to choose the runtime library to link against
- Other bugfixes and improvements
Ovni
- New `ovni_thread_require` function to enable emulation models
- Streams are marked as finished when calling `ovni_thread_free`
- Support per-thread metadata
- Add manual page for `ovnidump`
- Add support for `nosv_attach` and `nosv_detach` events
- Add support for `nosv_mutex_lock`, `nosv_mutex_trylock`, and `nosv_mutex_unlock` events
- Add support for `nosv_barrier` events
- Add OpenMP model to instrument the `libompv` implementation
- Add new body model to support parallel tasks in nOS-V (`taskfor` directive)
- Fix Paraver cfgs for Mac OS
- Other bugfixes and improvements
OmpSs-2 2023.11
Version 2023.11, Wed Nov 22, 2023
The OmpSs-2 2023.11 release includes performance improvements and bug fixes for the runtime systems, several new features for the nOS-V tasking library, and performance improvements on the `taskiter` construct implementation. It also implements the ALPI (version 1.0) in the runtime systems, which provides support for task-aware libraries. The LLVM/OpenMP includes a new OpenMP runtime called OpenMP-V (`libompv`) that works on top of the nOS-V tasking library. A new instrumentation library called Sonar is provided to instrument MPI function calls through ovni.
General
- The OmpSs-2 runtime systems expose the ALPI generic low-level tasking interface
Nanos6
- Implement the ALPI interface (version 1.0)
- Allow embedding jemalloc allocator
- Embed hwloc and jemalloc by default
- Add `devices.cuda.prefetch` config option to control CUDA prefetching of data dependencies (enabled by default)
- Install the `nanos6.toml` config file in `$prefix/share`
- Remove obsolete instrument.h public interface
- Remove obsolete stats and graph instrumentations
- Remove software dependency with libunwind and elfutils
- Fix execution when enabling extrae instrumentation
- Remove memory leaks
- Various bugfixes and corrections
nOS-V
- Implement the ALPI interface (version 1.0)
- Add `misc.stack_size` config option to change the stack size of nOS-V threads
- Add `ovni.level` config option for fine-grained instrumentation control
- Change `nosv_attach` API to not require an explicit task type and support multiple attaches
- Implement parallel tasks which can be executed on multiple CPUs at once
- Allow calling `nosv_init` and `nosv_shutdown` multiple times
- Change error handling to return custom nOS-V error codes
- Allow early wake of deadline tasks with `nosv_submit` passing the `NOSV_SUBMIT_DEADLINE_WAKE` flag
- Add compatibility layer for calls to `sched_get/setaffinity` and `pthread_get/setaffinity`
- Add instrumentation points for the `nosv_create` and `nosv_destroy` APIs
- Various bugfixes and corrections
NODES
- Improve performance of the `taskiter` construct
- Fix several bugs of the `taskiter` implementation
- Ensure nOS-V library is at the first level of dependencies
- Use the updated attach/detach from nOS-V 2.0
- Drop support for nOS-V versions older than 2.0
LLVM/OpenMP
- Provide OpenMP runtime named OpenMP-V (`libompv`) working over the nOS-V tasking library (`-fopenmp=libompv`)
- Make OpenMP-V runtime compatible with task-aware libraries
- Drop support for task-aware libraries in vanilla OpenMP runtime `libomp`
LLVM/Clang
- Fix task data dependencies' calculation for long double types
Ovni
- Add `OVNI_TRACEDIR` envar to change the trace directory (default is `ovni`)
- Add the `ovniver` program to report the libovni version and commit
- Add `ovni_version_get()` function
- Add nOS-V API subsystem events for `nosv_create()` and `nosv_destroy()`
- Add TAMPI model with `T` code, subsystem events and cfgs
- Add MPI model with `M` code, function events and cfgs
code, function events and cfgs - Don't hardcore destination directory names like lib, to use the ones in the destination host (like lib64)
Sonar
- Introduce the Sonar library that uses ovni for instrumenting MPI functions
Task-Aware Libraries
- Leverage the ALPI interface instead of the Nanos6-specific interface
- Drop support for OmpSs-2 versions older than 2023.11
- See other features and fixes in each task-aware library's CHANGELOG
OmpSs-2 2023.05.1
OmpSs-2 2023.05.1, Mon Jul 24, 2023
The OmpSs-2 2023.05.1 release includes several bug fixes and improvements with respect to the OmpSs-2 2023.05 release. These bug fixes are listed at the end of these release notes.
The OmpSs-2 2023.05 releases include new software projects and several performance and usability improvements for the OmpSs-2 programming model. In the context of OmpSs-2, this release introduces the new NODES runtime system supporting OmpSs-2, a novel and efficient tasking library named nOS-V, new Task-Aware libraries for interoperability with GPU offloading models, and new features in the ovni instrumentation library.
General
- Improve support for ovni instrumentation in the Nanos6 runtime and support for the idle CPUs view
- Add performance and usability improvements in Nanos6
- Allow embedding hwloc library into Nanos6 to avoid conflicts with other third-party software that use different hwloc versions
- Add support for `atomic` and `critical` OmpSs-2 directives in the LLVM/Clang compiler
- Drop support for `task for` clause
- Mercurium, the legacy OmpSs-2 compiler, is no longer supported and will not receive new OmpSs-2 features. Use the LLVM/Clang compiler instead
NODES Runtime and nOS-V Tasking Library
- Introduce the new low-level nOS-V threading and tasking library, enabling co-execution of applications
- Introduce the new NODES runtime system, built on top of nOS-V, that supports the OmpSs-2 model. This runtime implements the `taskiter` construct and leverages directed cyclic task graphs (DCTG) to optimize the execution of iterative applications
- Extend `-fompss-2` option from LLVM/Clang to choose between Nanos6 and NODES runtimes by accepting the option values `libnanos6` (default) and `libnodes`, respectively
Task-Aware Libraries
- Introduce the new Task-Aware CUDA (TACUDA), Task-Aware HIP (TAHIP) and Task-Aware SYCL (TASYCL) libraries. These task-aware libraries seamlessly integrate the CUDA, HIP and SYCL APIs for GPU offloading with the OmpSs-2 and OpenMP tasking models
- Add performance improvements and bug fixes in the Task-Aware MPI (TAMPI) and Task-Aware GASPI (TAGASPI) communication libraries
- Extend Task-Aware MPI (TAMPI) to support ovni instrumentation and allow tracing of multi-node hybrid MPI+OmpSs-2 applications
ovni Instrumentation
- Add new graph-based design in ovni to support complex models like the new breakdown timeline
Changes with respect to the 2023.05 release
The OmpSs-2 2023.05.1 release includes the following bug fixes and improvements with respect to the 2023.05 version:
Nanos6 Runtime
- Fix CUDA kernel launch configuration and improve performance of OmpSs-2@CUDA support
- Allow failures at CUDA prefetching without aborting the execution
- Fix linking with jemalloc when --as-needed linking flag is used
- Improve testing infrastructure and programs
- Update documentation regarding OmpSs-2@CUDA support
- Improve general documentation
LLVM/OpenMP Runtime
- Fix OpenMP potential use-after-free in polling tasks' mechanism
LLVM/Clang Compiler
- Fix unconditional break inside a for-loop which is encapsulated in a task
- Fix device tasks call order when capturing more information in other clauses
- Add support for the `shmem` clause in device tasks
OmpSs-2 2023.05
OmpSs-2 2023.05, Wed May 24, 2023
The OmpSs-2 2023.05 release includes new software projects and several performance and usability improvements for the OmpSs-2 programming model. In the context of OmpSs-2, this release introduces the new NODES runtime system supporting OmpSs-2, a novel and efficient tasking library named nOS-V, new Task-Aware libraries for interoperability with GPU offloading models, and new features in the ovni instrumentation library.
General
- Improve support for ovni instrumentation in the Nanos6 runtime and support for the idle CPUs view
- Add performance and usability improvements in Nanos6
- Allow embedding hwloc library into Nanos6 to avoid conflicts with other third-party software that use different hwloc versions
- Add support for `atomic` and `critical` OmpSs-2 directives in the LLVM/Clang compiler
- Drop support for `task for` clause
- Mercurium, the legacy OmpSs-2 compiler, is no longer supported and will not receive new OmpSs-2 features. Use the LLVM/Clang compiler instead
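As an illustration of the `atomic` and `critical` directives mentioned above, here is a minimal sketch, assuming their syntax mirrors the OpenMP equivalents under the `oss` pragma prefix. The function name `count_above` and its parameters are hypothetical and exist only for this example.

```c
#include <limits.h>

/* Hypothetical example: count values above a threshold and track the
 * maximum, with one OmpSs-2 task per element.
 * Assumption: `#pragma oss atomic` and `#pragma oss critical` behave
 * like their OpenMP counterparts. */
int count_above(const int *values, int n, int threshold, int *max_seen)
{
    int hits = 0;
    *max_seen = INT_MIN;

    for (int i = 0; i < n; ++i) {
        #pragma oss task shared(hits) firstprivate(i)
        {
            if (values[i] > threshold) {
                /* Single scalar update: atomic is sufficient. */
                #pragma oss atomic
                hits += 1;
            }

            /* Compare-and-update spans several statements: use critical. */
            #pragma oss critical
            {
                if (values[i] > *max_seen)
                    *max_seen = values[i];
            }
        }
    }

    /* Wait for every task before reading the results. */
    #pragma oss taskwait
    return hits;
}
```

A compiler without OmpSs-2 support ignores the `oss` pragmas, so the same function runs serially and yields the same counts, which makes the sketch easy to check before building it with LLVM/Clang and `-fompss-2`.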
NODES Runtime and nOS-V Tasking Library
- Introduce the new low-level nOS-V threading and tasking library, enabling co-execution of applications
- Introduce the new NODES runtime system, built on top of nOS-V, that supports the OmpSs-2 model. This runtime implements the `taskiter` construct and leverages directed cyclic task graphs (DCTG) to optimize the execution of iterative applications
- Extend `-fompss-2` option from LLVM/Clang to choose between Nanos6 and NODES runtimes by accepting the option values `libnanos6` (default) and `libnodes`, respectively
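To make the `taskiter` construct more concrete, the sketch below annotates an iterative loop whose body creates the same tasks with the same dependencies on every iteration, which is the kind of structure the construct targets. The function name `run_iterations` is hypothetical, and the pragma placement is an assumption based on the construct's description rather than a verified excerpt from the NODES documentation.

```c
/* Hypothetical example: an iterative computation where every iteration
 * creates the same two dependent tasks.
 * Assumption: `#pragma oss taskiter` is placed on the loop, letting the
 * runtime reuse the task graph across iterations. */
int run_iterations(int iters)
{
    int a = 0;
    int b = 0;

    #pragma oss taskiter
    for (int i = 0; i < iters; ++i) {
        /* First task of the iteration: updates a. */
        #pragma oss task inout(a)
        a += 1;

        /* Second task: reads the new a and accumulates it into b. */
        #pragma oss task in(a) inout(b)
        b += a;
    }

    /* Wait for all iterations to finish before reading b. */
    #pragma oss taskwait
    return b;
}
```

Calling `run_iterations(4)` returns 10 (1 + 2 + 3 + 4). Selecting the NODES runtime would presumably be done by passing the `libnodes` value to the `-fompss-2` option; with a compiler that does not recognise the `oss` pragmas, the loop simply runs serially with the same result.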
Task-Aware Libraries
- Introduce the new Task-Aware CUDA (TACUDA), Task-Aware HIP (TAHIP) and Task-Aware SYCL (TASYCL) libraries. These task-aware libraries seamlessly integrate the CUDA, HIP and SYCL APIs for GPU offloading with the OmpSs-2 and OpenMP tasking models
- Add performance improvements and bug fixes in the Task-Aware MPI (TAMPI) and Task-Aware GASPI (TAGASPI) communication libraries
- Extend Task-Aware MPI (TAMPI) to support ovni instrumentation and allow tracing of multi-node hybrid MPI+OmpSs-2 applications
ovni Instrumentation
- Add new graph-based design in ovni to support complex models like the new breakdown timeline