Skip to content

[next-cmake] Add new build system #1629

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 81 commits into
base: develop
Choose a base branch
from

Conversation

bstefanuk
Copy link

@bstefanuk bstefanuk commented May 28, 2025

Add the next-cmake build system to rocBLAS.

This build system consumes Tensile as an out of tree subdirectory, which is configurable via ROCBLAS_TENSILE_SUBDIR_PATH. The intention of this build system is to improve the usability, speed, maintainability, and coherence of rocBLAS builds, primarily through structural changes and the adoption of modern CMake practices. Additional details about configuration parameters and design can be found in next-cmake/README.rst.

For rocBLAS developers: The next-cmake build system is in beta; it will not replace the existing build system until it has undergone sufficient stress-testing. However, the end goal of this work is to replace the original build system.

Testing

  rocBLAS pre-checkin tests

$ OMP_NUM_THREADS=16 ./build/clients/staging/rocblas-test '--gtest_filter=*quick*:*pre_checkin*-*known_bug*'

...
[----------] Global test environment tear-down
[==========] 619030 tests from 205 test suites ran. (2467645 ms total)
[ PASSED ] 619030 tests.
rocBLAS version: 5.1.0.9fe725c5-dirty
rocBLAS-commit-hash: 9fe725c51745b0f3da065f78fe1a962101769aef
Tensile-commit-hash: N/A, as rocBLAS was built without Tensile

  Diff of install_manifest.txt Build command (original):

 $ ./install.sh -c -a gfx90a -i
Build command (next-cmake):

$ cmake -B build- S . -D CMAKE_CXX_COMPILER=/opt/rocm/bin/amdclang++ -D CMAKE_C_COMPILER=/opt/rocm/bin/amdclang -D CMAKE_BUILD_TYPE=Release -D CMAKE_PREFIX_PATH="/opt/rocm" -D GPU_TARGETS="gfx90a:xnack-;gfx90a:xnack+" -D ROCBLAS_ENABLE_FORTRAN=ON -D ROCBLAS_ENABLE_HIPBLASLT=ON

< /opt/rocm/include/rocblas/internal/rocblas-exported-proto.hpp

bstefanuk and others added 30 commits May 1, 2025 18:47
${linkage}
-fsanitize=address
-shared-libasan
-fuse-ld=lld
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These flags should not be in CMakeLists.txt. These are toolchain flags and should be added to the toolchain since this changes the ABI. It also brittle to add sanitizers like this.

Copy link
Contributor

@ellosel ellosel Jun 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree with your position, but this PR is an incremental step in a longer body of work intended to get rocm blas libraries to compose in a super build context. If there is a correctness concern with the flags let us know. Otherwise, we will file an issue to address this at a later time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed these should be top-level in a consistent way. Disagreed that they are all toolchain variables. Linker choice is toolchain. sanitizer selection is project (and sometimes target) configuration, and whether shared-libasan is used or not is more nuanced.

This is just a first pass, so I'm not going to comment here in detail, but in general, we need to converge all of our sanitizer selections to a common set of configuration variables at the top level. I review a lot of code and this is incredibly inconsistent project wide and we are largely going to have to rip up and replace all of it with large scale changes in order to get full sanitizer matrix coverage.

For getting next-cmake landed, I can overlook all of this and focus on what works. But once in the monorepo and able to do LSC work, we need to normalize it all across all components (and expand to the full sanitizer and link mode matrix).

Copy link
Author

@bstefanuk bstefanuk Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I will defer this decision to others. All I'll say is that this code implements a very similar pattern to the existing build system.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For getting next-cmake landed, I can overlook all of this and focus on what works

In this case, I will move forward with the current implementation.

add_library(rocblas SHARED)
else()
add_library(rocblas STATIC)
endif()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bad pattern. Best practice is to just do add_library(rocblas) and BUILD_SHARED_LIBS will decide whether its shared or static.

Copy link
Contributor

@ellosel ellosel Jun 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are name spacing variables such as this to ensure that we have component specific switches. BUILD_SHARED_LIBS would propagate across all projects in a composed build system which we may or may not want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bstefanuk we should honor BUILD_SHARED_LIBS

Copy link
Author

@bstefanuk bstefanuk Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update this to honour ROCBLAS_BUILD_SHARED_LIBS and BUILD_SHARED_LIBS as a master switch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BUILD_SHARED_LIBS would propagate across all projects in a composed build system which we may or may not want.

No it doesnt. You just set it to what you want for each component:

# Build as shared
set(BUILD_SHARED_LIBS ON)
add_subdirectory(foo)
# Build as static
set(BUILD_SHARED_LIBS OFF)
add_subdirectory(bar)

Although I am not sure when we would ever need to mix and match though.

find_package(GTest REQUIRED)
find_package(cblas REQUIRED)
find_package(BLAS REQUIRED)
find_package(LAPACK REQUIRED)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should go into the clients subdirectory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed we will push a change to address this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will move this into the clients directory because they're only used there.

EXPORT rocblas-targets
ARCHIVE COMPONENT devel
LIBRARY COMPONENT runtime
RUNTIME COMPONENT runtime)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use rocm_install_targets and remove the COMPONENT flags.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide an explanation of why so we have context. The best I could surmise based on the limited documentation is that those are the default components so we are just restating the default. Not sure why we should prefer the other function...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Along the lines of what @stellaraccident mentioned above, I think we should prefer to be explicit about these things instead of relying on non-standard CMake functions that could obfuscate our intention.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be encapsulated in its own function. Its not correct for the way we currently provide the usage requirements. And without rocm-cmake, it would be a nightmare to update to get it to work correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because static builds have their own usage requirements so they need to go in a seperate component so it doesnt conflict with the dev builds.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to support dedicated static or asan packaging for new build work. The corresponding requirements for those have been de-scoped and will be addressed through a less invasive means.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I light of these comments, I will leave the current implementation.

else()
rocm_install(
DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/../library/include/"
DESTINATION "${CMAKE_INSTALL_INCLUDEDIR}/rocblas"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the directories was fixed so it matches the installed path, because the includes should be added to the rocm_install_targets.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the goal but we aren't changing the structure of the source tree in this PR so there will inevitably be inconsistencies. I can't speak to rocm_install_targets. It is difficult to know behavior and proper usage of these functions as it isn't documented:

https://rocm.docs.amd.com/projects/ROCmCMakeBuildTools/en/latest/reference/ROCMInstallTargets.html

at least I don't see anything in the above link that would indicate that I should add includes to rocm_install_targets

Copy link
Author

@bstefanuk bstefanuk Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, the directories do not match the install path. I'm not familiar with the internal workings of rocm_install_targets and how it may/may not update paths internally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at least I don't see anything in the above link that would indicate that I should add includes to rocm_install_targets

That probably needs to be documented, but you have the rocm-cmake team here to help support you with any question you have.

The INCLUDE flag will install the header files and set the include path on the targets to the source path for the build and the install path for installation(it also sets up the component to devel for the headers automatically).

Comment on lines +311 to +316
INST(int const*);
INST(int* const*);
INST(int*);
INST(int const* const*);
INST(signed char const*);
INST(signed char const* const*);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not expected to instantiate additional functions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this code we hit undefined symbol errors. Can you explain why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this code we hit undefined symbol errors. Can you explain why?

This indicates you have broken the build by changing internal API consumption patterns. Yes you need to figure that out before proceeding.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TorreZuk without knowing more I would say that we've changed the build in such a way that we exposed an issue (that was resolved by adding the above lines the new build is passing all tests...). To suggest that this indicates it is broken is to suggest the code is perfect. What would be helpful is if you could provide something more concrete regarding:

internal API consumption patterns

so that we can track down the discrepancy quickly. If you are not familiar enough with this aspect of code that is ok - we will figure it out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some dependency edge or build option got changed. Whether this was an important invariant or not can't be said without understanding what that was.

I think this and the library path change are the two functional changes I wouldn't have expected just from adapting the build.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd imagine this is from int8/int32 gemm_ex, ex: this call to gemm_ex_typecasting(); why it worked before I'm not sure.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @daineAMD for your help! I spent some time trying to run down the source of this error but couldn't pin it down. I left it in this PR so as to solicit support from the rocBLAS team on understanding the root cause of this difference. @ellosel and I will continue to investigate.

Comment on lines 49 to 54
set(ROCBLAS_ENABLE_DEVICE
ON
CACHE BOOL "Build rocBLAS device libraries.")
set(ROCBLAS_ENABLE_HOST
ON
CACHE BOOL "Build rocBLAS host library.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you asked the spack community to get involved in the redesign proposal? changing the cmake variable names to something with less inherent meaning doesn't help me nor them. I could ask someone from that group to review this. e.g. Was ROCBLAS_ENABLE_DEVICE meant to replace BUILD_WITH_TENSILE? rocblas is a device library with or without Tensile we don't need it in the variable name. I don't like to see a change variable names for no improvement in meaning. I think without an inplace update of names and other changes this massive change risks losing useful information and introducing bugs. @amcamd maybe you can chime in on this topic specfically.

@ellosel
Copy link
Contributor

ellosel commented Jun 8, 2025

Will get some more reviews on the design. Please add some design doc links to understand the structure a bit more, e.g. example specific CMakeLists, host and device library terminology.

We've been using host to denote the C++ host api and device to denote gemm device libraries.

@ellosel
Copy link
Contributor

ellosel commented Jun 8, 2025

@TorreZuk a few responses to your comment:

Have you asked the spack community to get involved in the redesign proposal? changing the cmake variable names to something with less inherent meaning doesn't help me nor them. I could ask someone from that group to review this.

Please do we need to figure out a way through this transition period with as little pain as possible.

Was ROCBLAS_ENABLE_DEVICE meant to replace BUILD_WITH_TENSILE? rocblas is a device library with or without Tensile we don't need it in the variable name. I don't like to see a change variable names for no improvement in meaning.

Yes and the goal was to remain as consistent with other components with next-cmake directories. That said, I agree that we need to reconsider naming.

I think without an inplace update of names and other changes this massive change risks losing useful information and introducing bugs.

Can you enumerate the useful information that we are losing so we can address it?

Copy link
Contributor

@stellaraccident stellaraccident left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick first pass review. Will review in more detail later.

Comment on lines 731 to 733
else if(TestPath(base_path + "/../Tensile/library"))
// For next-cmake build-tree support
base_path += "/../Tensile/library";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I try not to get into strong positions on brace vs no brace debates (having lived through the LLVM debates on these topics). However, I do generally agree with co-pilot and the spirit of the LLVM guidance in a codebase like this: no braces for unambiguous, single statement if/else extensions, but if one of the branches gets braces, they all do.

Comment on lines 731 to 733
else if(TestPath(base_path + "/../Tensile/library"))
// For next-cmake build-tree support
base_path += "/../Tensile/library";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a change we don't actually want. I know it comes from the way you are doing the layering (which itself comes from how we've atomized all of these projects into individual repos), but the directory layout is a product level invariant: it should remain as lib/rocblas, regardless of how the code/projects are organized.

endif()
endif()

if(ROCBLAS_ENABLE_TESTS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fine to have finer-grained, project level control for the purpose of disabling tests for individual components, but agreed: BUILD_TESTING is the correct master switch. Unless if there is a use case for fine grained control, I would only rely on BUILD_TESTING to start with and add complexity if needed as part of a larger super-build.

endif()
target_compile_definitions(
rocblas-clients-common
PUBLIC ROCBLAS_REFERENCE_LIB=${BLAS_LIBRARY} ROCM_USE_FLOAT16
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re ROCM_USE_FLOAT16, refer to the side conversation about whether/how this should be preprocessor level conditioning only. In this case, because it is only for clients-common, which is a private dependency, there may be a case to do it at the CMake level, but in general, I think that a consistent, header-only approach is probably better for the project. Even if technically correct to do in this one place, it will get copy-pasta'd from here by people who don't understand when a CMake determination vs a preprocessor determination is appropriate.

Copy link
Author

@bstefanuk bstefanuk Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly this would entail updates to one or more headers so as to set this definition based on hardware details, potentially extracted from hipDeviceGetAttribute or similar. In terms of actionability, would you prefer this update be done pre-emptively or retroactively?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding background information: _Float16 is a datatype but it is not supported by all compilers, llvm supports _Float16, gcc does not support _Float16. Below code is from
https://github.com/ROCm/rocBLAS/blob/develop/library/include/internal/rocblas-types.h#L90
When _Float16 is not supported, uint16_t is used as a storage type.

#ifdef ROCM_USE_FLOAT16
typedef _Float16 rocblas_half;
#else
/*! \brief Structure definition for rocblas_half */
typedef struct rocblas_half
{
    uint16_t data;
} rocblas_half;
#endif

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the context. I have created an issue in rocm-libraries to track this work: ROCm/rocm-libraries#189

For now, I will leave this as is.

-fsanitize=address
-shared-libasan
)
target_link_options(${rocblas_target}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PRIVATE

${linkage}
-fsanitize=address
-shared-libasan
-fuse-ld=lld
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed these should be top-level in a consistent way. Disagreed that they are all toolchain variables. Linker choice is toolchain. sanitizer selection is project (and sometimes target) configuration, and whether shared-libasan is used or not is more nuanced.

This is just a first pass, so I'm not going to comment here in detail, but in general, we need to converge all of our sanitizer selections to a common set of configuration variables at the top level. I review a lot of code and this is incredibly inconsistent project wide and we are largely going to have to rip up and replace all of it with large scale changes in order to get full sanitizer matrix coverage.

For getting next-cmake landed, I can overlook all of this and focus on what works. But once in the monorepo and able to do LSC work, we need to normalize it all across all components (and expand to the full sanitizer and link mode matrix).

endif()

if(ROCBLAS_ENABLE_ASAN)
rocblas_target_configure_sanitizers(rocblas PRIVATE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't layer it this way: We need to be able to support the full matrix of sanitizers and ways to use them. Ideally, there is a single libraries-level set of CMake definitions we use to select the sanitizer modes and then you want to call a helper to configure them on a target. I would remove the linkage from here (not the right term and I question whether it adds value to separate it from use like this) and then push choice of which sanitizers to enable completely within the helper vs having half of the decision be made here (if(ROCBLAS_ENABLE_ASAN)) and then have it assumed in the helper.

But also, this stuff is inconsistent project wide and needs to be normalized in an LSC per my comments above. I'm willing to bias towards what works for first step landing next-cmake if there is a task set aside to scrub this later.

Copy link
Author

@bstefanuk bstefanuk Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will create a ticket in rocm-libraries to track action on this work once we're further along the monorepo path.

endif()

if(ROCBLAS_BUILD_SHARED_LIBS)
target_link_libraries(rocblas PRIVATE roctx64)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for the future: it is a bad pattern to enable/disable features based on linkage mode. In this case, whether the library is built with marker support is the feature, not whether it is built shared or not. However, I am also aware that rocblas has a number of bad patterns around this one already and you are trying not to diverge. Note that linking this needs to be disabled on current consumer windows builds. Other projects use a feature flag for it which can then be controlled for the corresponding builds separately, but rocblas does already do some (bad) platform switching on this, so if you have to cargo cult that here, you at least need to not do this on WIN32.

Be aware too that just depending on this barename library like this may be how rocblas does it, but it is not right. It should be probing via find_library at the top level and then linking the ${ROCTX64} variable if found/not.

It should also be noted that this whole thing is deprecated and will be replaced at some point by linking against the corresponding rocprofv3 library when marker support is enabled. Set things up to make that switch easy.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely agree, enabling/disabling linked libraries based on shared/static gives makes me squirm a bit. Setting everything here up required a number of gross assumptions and executive decisions just to get it to this point without re-implementing the type of build system that we're trying to get away from. This is an artifact of that.

I will implement this in the manner you've described with find_package at the top-level and ${ROCTX64}.

TARGETS
rocblas-example-c-dgeam
rocblas-example-gemv-graph-capture
# For whatever reason rocblas-example-header-check isn't installed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rocblas-example-header-check checks that CPP version guard exist, will fail at compile-time otherwise so doesn't have a use to install.
Also may need cpp11 specifically? @rkamd can correct me if I'm wrong.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I will update this comment to reflect these details. Thank you.

# Copyright Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT

cmake_minimum_required(VERSION 3.25.2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious - is this minimum version being standardized across libs? For my own learning sake, which cmake features from 3.25 are needed? Thanks

Copy link
Author

@bstefanuk bstefanuk Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular, it supports block scoping for variables and policies and improves modern C++ and CUDA support for modern standards. While I haven't implemented this in rocBLAS directly, I have used this feature in other next-cmake builds, e.g., hipBLASLt, and thereby am looking to standardize this requirement across the ROCm libraries alongside the monorepo migration. If this minimum version is problematic for any reason and we need to support lower versions, I'm not opposed to changing this.

-----------------

This section describes how to configure and build the |project_name| project. We assume the user has a
ROCm installation, Python 3.8 or newer and CMake 3.25.0 or newer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmake_minimum_required is 3.25.2 so thinking this should be the same to be precise.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I will update this to match.

@@ -0,0 +1,463 @@
# Copyright Advanced Micro Devices, Inc., or its affiliates.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the copyright notice, the ones you included differ from our existing notices. Is this intended?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is intended as the extended copyright notices don't provide any additional protection beyond what is shown here, and these notices reduce line-count bloat.

set(BLIS_PATH_4_1_0 "/opt/AMD/aocl/aocl-linux-aocc-4.1.0/aocc")
set(BLIS_PATH_4_0 "/opt/AMD/aocl/aocl-linux-aocc-4.0")

# # NOTE: Keeping this code for reference of previous implementation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this discussion's likely already been had, but are you set on only including these reference libs? Imo it would be better to search for the old libs until we figure out the aocl dependency/EULA issue.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is ongoing discussion about how to properly handle the BLIS libs. Admittedly, this is not an area I am an expert in, could you provide more context regarding the AOCL dependency/EULA issue? @ellosel or @marbre do you have any additional input here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add to daineAMD's comment, I thought the reference library switching to OpenBLAS because the AOCL click through EULA prevents use of AOCL.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amcamd Could you or someone else on your team expand or provide context or documentation for the use of these different libraries in rocBLAS so I can further understand the team's expectations and implement it accordingly?

Comment on lines +311 to +316
INST(int const*);
INST(int* const*);
INST(int*);
INST(int const* const*);
INST(signed char const*);
INST(signed char const* const*);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd imagine this is from int8/int32 gemm_ex, ex: this call to gemm_ex_typecasting(); why it worked before I'm not sure.


* `ROCBLAS_BUILD_SHARED_LIBS`: Build the |project_name| shared or static library (default: `ON`)
* `ROCBLAS_ENABLE_BLIS`: Enable BLIS support (default: `ON`)
* `ROCBLAS_ENABLE_OPENMP`: Enable OpenMP support (default: `ON`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the default was going to be OpenBLAS . Has this switched to the default being BLIS?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can update the default to be OpenBLAS instead of BLIS.

* `ROCBLAS_ENABLE_BLIS`: Enable BLIS support (default: `ON`)
* `ROCBLAS_ENABLE_OPENMP`: Enable OpenMP support (default: `ON`)
* `ROCBLAS_ENABLE_TENSILE`: Build |project_name| host library with Tensile backend (default: `ON`)
* `ROCBLAS_ENABLE_HIPBLASLT`: Build |project_name| host library with hipBLASLt backend (default: `OFF`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect a default of ON for ROCBLAS_ENABLE_HIPBLASLT

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default is on in the build system, I will update the README to match.

endif()
target_compile_definitions(
rocblas-clients-common
PUBLIC ROCBLAS_REFERENCE_LIB=${BLAS_LIBRARY} ROCM_USE_FLOAT16
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding background information: _Float16 is a datatype but it is not supported by all compilers, llvm supports _Float16, gcc does not support _Float16. Below code is from
https://github.com/ROCm/rocBLAS/blob/develop/library/include/internal/rocblas-types.h#L90
When _Float16 is not supported, uint16_t is used as a storage type.

#ifdef ROCM_USE_FLOAT16
typedef _Float16 rocblas_half;
#else
/*! \brief Structure definition for rocblas_half */
typedef struct rocblas_half
{
    uint16_t data;
} rocblas_half;
#endif

set(BLIS_PATH_4_1_0 "/opt/AMD/aocl/aocl-linux-aocc-4.1.0/aocc")
set(BLIS_PATH_4_0 "/opt/AMD/aocl/aocl-linux-aocc-4.0")

# # NOTE: Keeping this code for reference of previous implementation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add to daineAMD's comment, I thought the reference library switching to OpenBLAS because the AOCL click through EULA prevents use of AOCL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
noCI Disable Jenkins for this PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants