Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDEV-36217: Fix building with Clang and GCC on RISC-V #3871

Open
wants to merge 1 commit into
base: 10.6
Choose a base branch
from

Conversation

brad0
Copy link
Contributor

@brad0 brad0 commented Mar 5, 2025

  • The Jira issue number for this PR is: MDEV-______

Description

Clang does not have the builtin __builtin_riscv_pause().

Release Notes

Fix building with Clang and GCC (other than 13) on RISC-V

How can this PR be tested?

Build test with Clang and GCC (other than 13).

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@grooverdan
Copy link
Member

Thanks @brad0 , @dr-m is looking at it in MDEV-36217. Which instruction is this byte sequence mapping to?

@brad0
Copy link
Contributor Author

brad0 commented Mar 5, 2025

Thanks @brad0 , @dr-m is looking at it in MDEV-36217. Which instruction is this byte sequence mapping to?

As was mentioned in the original commit..

Use raw instruction encoding, no idea how to cleanly access the pause instruction from the Zihintpause extension.

Just FYI this is on OpenBSD with Clang 16.

@grooverdan grooverdan changed the title Fix building with Clang on RISC-V MDEV-36217. Fix building with Clang on RISC-V Mar 5, 2025
@grooverdan grooverdan requested a review from dr-m March 5, 2025 23:17
include/my_cpu.h Outdated
Comment on lines 100 to 102
#ifdef __clang__
__asm__ volatile(".long 0x0100000f" ::: "memory");
#else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I tested my take at this in (see my comment in MDEV-36217) I did confirm that this is the encoding of the instruction on both 32-bit and 64-bit RISC-V. I kind of like the simplicity and the portability of this, but not the readability (or the lack thereof). As far as I understand, this version would work practically with any version of GCC (and underlying assembler) or clang. If we go down this route, I’d suggest a verbose comment:

#elif defined __GNUC__ && defined __riscv
  /* The GCC-only __builtin_riscv_pause() or the pause instruction is
  encoded like a fence instruction with special parameters. On RISC-V
  implementations that do not support arch=+zihintpause this
  instruction could be interpreted as a more expensive memory fence;
  it should not be an illegal instruction. */
  __asm__ volatile(".long 0x0100000f" ::: "memory");
#elif defined __GNUC__

I don’t have a RISC-V system to test on, so my "less hacky" version would have to be tested by you (clang) and @ottok (GCC on Ubuntu). I expect that some additional changes would be necessary. The __attribute__((target("arch=+zihintpause"))) that my fix would add to this inline function would have to be added to each function that invokes this, similar to the TRANSACTIONAL_INLINE and TRANSACTIONAL_TARGET that are defined for the Intel TSX-NI (rtm) ISA extension.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind testing anything no matter which direction we go in.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should have read all the discussion before looking at the code. Given that you are on clang 16, my "less hacky" approach is not going to work, because the inline assembler would only know about fence starting with clang 18. Let’s go with your version, but please apply the comment change that I suggested. I hope that @ottok can schedule this on the Ubuntu CI as soon as possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also to note as an LLVM/Clang developer I opened a ticket upstream to see about adding the builtin to Clang. Not that it helps now but it's good to push for feature parity when missing bits.

Will do.

@cvicentiu cvicentiu added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label Mar 6, 2025
ottok added a commit to ottok/mariadb that referenced this pull request Mar 6, 2025
Clang does not have the builtin __builtin_riscv_pause().

Forwarded: MariaDB#3871
@ottok
Copy link
Contributor

ottok commented Mar 6, 2025

I am now testing this fix in https://salsa.debian.org/mariadb-team/mariadb-server/-/merge_requests/108 and will post results in a couple of hours.

@ottok
Copy link
Contributor

ottok commented Mar 7, 2025

Build at https://launchpadlibrarian.net/780528506/buildlog_ubuntu-jammy-riscv64.mariadb-10.6_1%3A10.6.21-0ubuntu0.22.04.1~bpo22.04.1~1741285144.cb4a90e5ea0+bugfix.10.6.fix.riscv64.build_BUILDING.txt.gz failed on

In file included from /<<PKGBUILDDIR>>/mysys/lf_hash.cc:30:
/<<PKGBUILDDIR>>/include/my_cpu.h: In function ‘void MY_RELAX_CPU()’:
/<<PKGBUILDDIR>>/include/my_cpu.h:103:3: error: ‘__builtin_riscv_pause’ was not declared in this scope; did you mean ‘__builtin_riscv_fsflags’?
  103 |   __builtin_riscv_pause();
      |   ^~~~~~~~~~~~~~~~~~~~~
      |   __builtin_riscv_fsflags
/<<PKGBUILDDIR>>/storage/mroonga/vendor/groonga/lib/ii.c:6482:11: warning: cast increases required alignment of target type [-Wcast-align]
 6482 |     tpe = (grn_id *)GRN_BULK_CURR(post);
      |           ^
/<<PKGBUILDDIR>>/storage/mroonga/vendor/groonga/lib/ii.c:6483:15: warning: cast increases required alignment of target type [-Wcast-align]
 6483 |     for (tp = (grn_id *)GRN_BULK_HEAD(post); tp < tpe; tp++) {
      |               ^
In file included from /<<PKGBUILDDIR>>/storage/mroonga/vendor/groonga/include/groonga.h:22,
                 from /<<PKGBUILDDIR>>/storage/mroonga/vendor/groonga/lib/grn.h:759,
                 from /<<PKGBUILDDIR>>/storage/mroonga/vendor/groonga/lib/ii.c:18:
/<<PKGBUILDDIR>>/storage/mroonga/vendor/groonga/include/groonga/groonga.h:1475:34: warning: cast increases required alignment of target type [-Wcast-align]
 1475 | #define GRN_RECORD_VALUE(obj) (*((grn_id *)GRN_BULK_HEAD(obj)))
      |                                  ^
/<<PKGBUILDDIR>>/storage/mroonga/vendor/groonga/lib/ii.c:6548:50: note: in expansion of macro ‘GRN_RECORD_VALUE’
 6548 |           grn_uvector_add_element(ctx, &uvector, GRN_RECORD_VALUE(old_),
      |                                                  ^~~~~~~~~~~~~~~~
make[4]: *** [mysys/CMakeFiles/mysys.dir/build.make:1395: mysys/CMakeFiles/mysys.dir/lf_hash.cc.o] Error 1
make[4]: Leaving directory '/<<PKGBUILDDIR>>/builddir'
make[3]: *** [CMakeFiles/Makefile2:9611: mysys/CMakeFiles/mysys.dir/all] Error 2

@brad0
Copy link
Contributor Author

brad0 commented Mar 7, 2025

@brad0 brad0 changed the title MDEV-36217. Fix building with Clang on RISC-V MDEV-36217. Fix building with Clang and older GCC on RISC-V Mar 7, 2025
include/my_cpu.h Outdated
Comment on lines 105 to 109
# if !defined(__clang__) && __GNUC__ >= 14
__builtin_riscv_pause();
# else
__asm__ volatile(".long 0x0100000f" ::: "memory");
# endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my test on https://godbolt.org this will not work on GCC 14. We’d need to specify __attribute__((target("arch=+zihintpause"))) on the function and any function that invokes it inline. By the way, I believe that clang always defines __GNUC__ as 4, and you’re evaluating __GNUC__ without checking that it’s defined.

I think that we should simply use the numeric encoding of the instruction for both GCC and clang, no matter which version. You should also remove the word "older" from the title of this pull request, because the existing code only happens to work with exactly GCC 13 (but likely not with its assembler invocation), not GCC 14 or anything newer.

I believe that the .long 0x0100000f should work with practically any version of clang or GCC or assembler that targets RISC-V. The built-in function does not exist in clang, but starting with clang-18 the inline assembler would recognize the pause instruction. That would not help you, because you’re using clang version 16.

Copy link
Contributor Author

@brad0 brad0 Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used older as I was under the impression that fairly modern versions were Ok. Now I understand your post above.

Not that it matters with the updated diff, but it checks if __GNUC__ is defined right above that.

Clang does not have the builtin __builtin_riscv_pause().
@brad0 brad0 changed the title MDEV-36217. Fix building with Clang and older GCC on RISC-V MDEV-36217. Fix building with Clang and GCC on RISC-V Mar 7, 2025
Copy link
Contributor

@dr-m dr-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I would wait for results from @ottok before merging this.

ottok added a commit to ottok/mariadb that referenced this pull request Mar 7, 2025
Clang does not have the builtin __builtin_riscv_pause().

Forwarded: MariaDB#3871
@ottok
Copy link
Contributor

ottok commented Mar 7, 2025

I tried a different variation at and the build fully passed https://launchpad.net/~mysql-ubuntu/+archive/ubuntu/mariadb/+build/30402143 3 minutes ago:

diff --git a/include/my_cpu.h b/include/my_cpu.h
index bd5fca5..f68558a 100644
--- a/include/my_cpu.h
+++ b/include/my_cpu.h
@@ -97,7 +97,16 @@ static inline void MY_RELAX_CPU(void)
   /* Changed from __ppc_get_timebase for musl and clang compatibility */
   __builtin_ppc_get_timebase();
 #elif defined __GNUC__ && defined __riscv
-  __builtin_riscv_pause();
+  /* The GCC-only __builtin_riscv_pause() or the pause instruction is
+  encoded like a fence instruction with special parameters. On RISC-V
+  implementations that do not support arch=+zihintpause this
+  instruction could be interpreted as a more expensive memory fence;
+  it should not be an illegal instruction. */
+#ifdef __builtin_riscv_pause
+    __builtin_riscv_pause();
+#else
+    __asm__ volatile(".long 0x0100000f" ::: "memory");
+#endif
 #elif defined __GNUC__
   /* Mainly, prevent the compiler from optimizing away delay loops */
   __asm__ __volatile__ ("":::"memory");

I will re-build now with the latest version of this PR#3871.

Copy link
Contributor

@ottok ottok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dr-m
Copy link
Contributor

dr-m commented Mar 8, 2025

@ottok Thank you! Is there also a log of running some regression tests, so that we would see the runtime behaviour?

@ottok
Copy link
Contributor

ottok commented Mar 9, 2025

Right, I forced the MTR to run now post-build and it all passed in https://launchpadlibrarian.net/780866991/buildlog_ubuntu-jammy-riscv64.mariadb-10.6_1%3A10.6.21-0ubuntu0.22.04.1~bpo22.04.1~1741455135.68ead9769b1+bugfix.10.6.fix.riscv64.build_BUILDING.txt.gz

...
main.sum_distinct                        w7 [ pass ]   3348
main.trigger_no_defaults-11698           w6 [ pass ]    152
main.trigger                             w4 [ pass ]  54819
main.subselect_sj_mat                    w3 [ pass ]   7453
main.subselect_no_semijoin               w2 [ pass ]  31187
main.subselect_sj_jcl6                   w5 [ pass ]  36534
main.subselect_sj                        w8 [ pass ]  38490
--------------------------------------------------------------------------
The servers were restarted 223 times
Spent 7069.129 of 1592 seconds executing testcases

Completed: All 1058 tests were successful.

162 tests were skipped, 63 by the test itself.

@brad0 brad0 changed the title MDEV-36217. Fix building with Clang and GCC on RISC-V MDEV-36217: Fix building with Clang and GCC on RISC-V Mar 9, 2025
ottok added a commit to ottok/mariadb that referenced this pull request Mar 11, 2025
Clang does not have the builtin __builtin_riscv_pause().

Forwarded: MariaDB#3871
ottok added a commit to ottok/mariadb that referenced this pull request Mar 13, 2025
Clang does not have the builtin __builtin_riscv_pause().

Forwarded: MariaDB#3871
@ottok
Copy link
Contributor

ottok commented Mar 13, 2025

This is pending being included for the Ubuntu 24.10 upload at https://salsa.debian.org/mariadb-team/mariadb-server/-/merge_requests/109

@brad0
Copy link
Contributor Author

brad0 commented Mar 18, 2025

@vuvova Ping.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.
Development

Successfully merging this pull request may close these issues.

6 participants