Releases: HPC-Dwarfs/TheBandwidthBenchmark

Major release v3.0

24 Nov 14:27
1db65c0

New features:

  • Support for measuring sustained memory bandwidth on NVIDIA GPUs
  • Support for random array initialization in addition to constant values
  • Option to enable AVX512 intrinsics to enforce non-temporal stores
  • Introduce command-line arguments to override most default settings
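
As an illustration of what enforcing non-temporal stores with AVX512 intrinsics can look like, here is a minimal sketch of a copy kernel. It is not taken from the benchmark sources: the names alloc_aligned64 and copy_nt are hypothetical, and the code falls back to a plain loop when the compiler does not target AVX512F.

```c
#include <stddef.h>
#include <stdlib.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper: allocate n doubles on a 64-byte boundary,
 * which _mm512_stream_pd requires for its destination address. */
double *alloc_aligned64(size_t n)
{
    size_t bytes = n * sizeof(double);
    bytes = (bytes + 63) / 64 * 64; /* round up to a multiple of the alignment */
    return aligned_alloc(64, bytes);
}

/* Hypothetical copy kernel a[i] = b[i]. The destination a must be
 * 64-byte aligned. With AVX512F the bulk of the data is written with
 * non-temporal (streaming) stores that bypass the cache hierarchy. */
void copy_nt(double *restrict a, const double *restrict b, size_t n)
{
    size_t i = 0;
#if defined(__AVX512F__)
    for (; i + 8 <= n; i += 8) {
        __m512d v = _mm512_loadu_pd(&b[i]); /* regular (unaligned) load */
        _mm512_stream_pd(&a[i], v);         /* non-temporal store */
    }
    _mm_sfence(); /* order the streaming stores before later accesses */
#endif
    /* Remainder elements, or the whole array without AVX512F. */
    for (; i < n; i++) {
        a[i] = b[i];
    }
}
```

The explicit intrinsic removes the dependence on the compiler deciding by itself whether to emit streaming stores, which is presumably the motivation for offering this as an option.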

Other changes:

  • A major refactoring of most of the code
  • Stricter clang-tidy rules
  • Clean up formatting
  • Improve README

Major release v2.0

05 Aug 03:36

New modes scan a range of sizes in order to measure a bandwidth profile for the complete memory hierarchy. Sequential mode uses a single thread. Throughput mode uses multiple threads to test the bandwidth scaling of the memory hierarchy levels, but without any work-sharing overhead. We added shell scripts to generate plots for these new modes using Gnuplot.
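
The difference between the two modes can be sketched as follows. This is an assumption about the general technique, not the benchmark's actual code: the function names and the simple sum kernel are hypothetical. In the throughput variant every thread runs the complete kernel on its own private array inside a parallel region, so no work-sharing construct is involved.

```c
#include <stddef.h>
#include <stdlib.h>

/* Sequential mode (sketch): one thread runs the kernel over the array. */
double sum_sequential(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

/* Throughput mode (sketch): each thread allocates a private array of n
 * elements and runs the full kernel on it. There is no `omp for`, so no
 * loop iterations are distributed between threads. The number of threads
 * that actually ran is returned through threads_out. Without OpenMP the
 * pragma is ignored and the body runs once on a single thread. */
double sum_throughput(size_t n, int *threads_out)
{
    double total = 0.0;
    int nthreads = 0;
#pragma omp parallel reduction(+ : total, nthreads)
    {
        double *a = malloc(n * sizeof *a);
        if (a != NULL) {
            for (size_t i = 0; i < n; i++) {
                a[i] = 1.0;
            }
            total += sum_sequential(a, n);
            nthreads += 1;
            free(a);
        }
    }
    *threads_out = nthreads;
    return total;
}
```

Calling either function in a loop over increasing n, and timing each call, would give the kind of size scan described above.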

Other changes:

  • The Intel oneAPI compiler is now the default
  • Removed Intel compiler flag for NT stores
  • Refactor code: Introduce HARNESS macros to eliminate redundant code
  • Extend README

Minor release v1.4

04 Feb 05:33
6f958e3

These are mostly cosmetic changes:

  • Put kernels in separate module
  • Put profiling and LIKWID instrumentation in separate module
  • Add clang-format specification and reformat
  • Add banner
  • Replace huge copyright header with something smaller
  • Make NHR@FAU copyright holder
  • Add new build targets for format and .clangd and remove tags target
  • Clean up Makefile and sources

New features:

  • The Makefile will automatically generate a clang LSP configuration
  • The CLANG toolchain is now the default. To use another toolchain, change it in config.mk
  • VERBOSE_AFFINITY now outputs the complete affinity mask and the processor a thread is currently scheduled on
  • make distclean now cleans all toolchains; this is enabled by an additional directory level, ./build, for the build products
  • All objects are now rebuilt correctly if any build configuration (include_<TOOLCHAIN>.mk or config.mk) has changed

Minor release v1.3

28 Sep 12:59
91896f1

Transfer benchmarking scripts to Wiki.
Move benchmarking documentation to Wiki.
Update Makefile.

Minor release v1.2

10 Dec 05:55

Changelog for 1.2:

  • Use schedule(static) clause for all worksharing constructs
  • Pull LIKWID instrumentation outside benchmark functions
  • Add script to extract scaling runs
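
For readers unfamiliar with the clause, the following minimal sketch shows schedule(static) on an OpenMP work-sharing loop. The triad kernel and its signature are illustrative assumptions rather than the benchmark's own code. With static scheduling the iterations are split into fixed contiguous chunks, so each thread touches the same part of the arrays on every repetition.

```c
#include <stddef.h>

/* Hypothetical triad kernel a[i] = b[i] + scalar * c[i].
 * schedule(static) assigns each thread a fixed, contiguous block of
 * iterations. Without OpenMP the pragma is ignored and the loop runs
 * serially with the same result. */
void triad(double *restrict a, const double *restrict b,
           const double *restrict c, double scalar, size_t n)
{
#pragma omp parallel for schedule(static)
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + scalar * c[i];
    }
}
```

Stating the schedule explicitly avoids depending on the implementation-defined default, which may differ between OpenMP runtimes.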

Minor release v1.1

12 Oct 09:06
072bda5

Changelog for 1.1:

  • Increase default problem size to almost 4 GB to compensate for OpenMP overhead
  • Always turn on streaming stores for the Intel toolchain
  • Explicitly set static scheduling for OpenMP for loops
  • Add Go version in util
  • Add single file versions (C and Fortran) for teaching
  • Improve LIKWID instrumentation

Initial release

28 Mar 13:38
bb19bf5

v1.0

Update README.md