New features:
- Support for measuring sustained memory bandwidth on NVIDIA GPUs
- Support for random array initialization as an alternative to constant values
- Option to enable AVX-512 intrinsics to enforce non-temporal stores
- Command-line arguments to override most default settings
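
For readers unfamiliar with non-temporal stores, the following is a minimal sketch of the idea behind the AVX-512 option, not the code shipped in this release. The function names `alloc_aligned64` and `fill_nontemporal` are hypothetical and chosen only for illustration. The sketch assumes a 64-byte aligned array and falls back to ordinary stores when the compiler does not target AVX-512 (for example, when built without `-mavx512f`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef __AVX512F__
#include <immintrin.h>
#endif

/* Hypothetical helper: allocate n doubles on a 64-byte boundary.
 * aligned_alloc expects the size to be a multiple of the alignment,
 * so the byte count is rounded up. */
double *alloc_aligned64(size_t n)
{
    size_t bytes = n * sizeof(double);
    bytes = (bytes + 63) / 64 * 64;
    return aligned_alloc(64, bytes);
}

/* Hypothetical helper: set a[0..n) to value.
 * With AVX-512 enabled, full 8-element chunks are written with
 * streaming (non-temporal) stores that bypass the cache hierarchy.
 * Remaining elements, or all elements without AVX-512, use plain stores. */
void fill_nontemporal(double *a, size_t n, double value)
{
    size_t i = 0;
    /* Streaming stores require 64-byte alignment of the target address. */
    assert(((uintptr_t)a % 64) == 0);
#ifdef __AVX512F__
    const __m512d v = _mm512_set1_pd(value);
    for (; i + 8 <= n; i += 8) {
        _mm512_stream_pd(&a[i], v);
    }
    /* Make the streaming stores globally visible before continuing. */
    _mm_sfence();
#endif
    for (; i < n; i++) {
        a[i] = value;
    }
}
```

Without explicit intrinsics, whether a compiler emits non-temporal stores for such a loop depends on its heuristics and flags, which is presumably why an option to enforce them is useful for a bandwidth benchmark.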
Other changes:
- A major refactoring of most of the code
- Stricter clang-tidy rules
- Clean up formatting
- Improve README