[RFC] ML-KEM: Import AArch64 backend (from mlkem-native) #2498


Draft: wants to merge 2 commits into main from mlkem_aarch64_backend


@hanno-becker (Contributor) commented Jun 22, 2025

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.


The purpose of this PR is to demonstrate and gather feedback on one option for integrating an AArch64 arithmetic backend for mlkem-native into AWS-LC.

Alternative option: #2500


Context: The ML-KEM implementation in AWS-LC is imported from
mlkem-native. mlkem-native comes in a "C-only" version, but also
offers AArch64 and x86_64 backends for (a) arithmetic,
and (b) FIPS-202. Currently, only the "C-only" version is
imported into AWS-LC.

Summary: This commit extends the mlkem-native->AWS-LC import to include the AArch64 arithmetic backend.

Details:

  • crypto/fipsmodule/ml_kem/importer.sh now imports
    the arithmetic backend API header native/api.h as well
    as the native backend native/aarch64/*.

  • The backend is imported as-is, with one exception:
    importer.sh converts the preprocessor directives used by
    mlkem-native into the ones used by s2n-bignum. This is to
    piggy-back on the adjustments made to the delocator to work
    with s2n-bignum assembly; otherwise, similar adjustments would
    likely be needed for mlkem-native assembly files.

  • All imported functions are formally verified for functional
    correctness using HOL-Light. The proofs run as part of
    mlkem-native's CI. The HOL-Light specs are manually translated
    into CBMC specs in the header accompanying the ASM, and
    all higher-level CBMC proofs are conducted against those specs.
    Again, those are part of the mlkem-native CI.

  • A backend header crypto/fipsmodule/ml_kem/mlkem_native_backend.h
    is added, activating the AArch64 arithmetic backend on Linux and
    macOS AArch64 systems, unless the NO_ASM directive is set
    (same as for s2n-bignum).

    Once the x86_64 arithmetic backend is ready for integration,
    it will be added to mlkem_native_backend.h as well.

  • The backend header is registered in the configuration file
    crypto/fipsmodule/ml_kem/mlkem_native_config.h.

  • importer.sh is re-run.


Performance

Platform      | Algorithm   | Operation | New (ops/sec) | Old (ops/sec) | Improvement
--------------|-------------|-----------|---------------|---------------|-------------
Graviton 4    | ML-KEM-512  | keygen    |    29,145.9   |    19,666.0   |   +48.2%
              |             | encaps    |    91,043.3   |    64,240.2   |   +41.7%
              |             | decaps    |    75,242.9   |    52,486.8   |   +43.4%
              | ML-KEM-768  | keygen    |    18,190.1   |    12,240.0   |   +48.6%
              |             | encaps    |    57,600.4   |    39,490.6   |   +45.8%
              |             | decaps    |    48,334.1   |    33,298.1   |   +45.2%
              | ML-KEM-1024 | keygen    |    12,094.7   |     8,112.9   |   +49.1%
              |             | encaps    |    37,795.3   |    27,194.0   |   +39.0%
              |             | decaps    |    32,194.3   |    23,314.0   |   +38.1%

Graviton 3    | ML-KEM-512  | keygen    |    24,518.9   |    16,239.5   |   +51.0%
              |             | encaps    |    76,458.9   |    53,104.8   |   +44.0%
              |             | decaps    |    63,591.6   |    43,364.4   |   +46.6%
              | ML-KEM-768  | keygen    |    15,241.8   |    10,127.7   |   +50.5%
              |             | encaps    |    48,145.3   |    34,219.0   |   +40.7%
              |             | decaps    |    40,504.1   |    28,606.5   |   +41.6%
              | ML-KEM-1024 | keygen    |    10,179.0   |     6,695.3   |   +52.0%
              |             | encaps    |    31,879.8   |    23,317.0   |   +36.7%
              |             | decaps    |    27,168.6   |    19,866.5   |   +36.8%

Graviton 2    | ML-KEM-512  | keygen    |    16,570.3   |    10,901.8   |   +52.0%
              |             | encaps    |    51,617.2   |    35,299.9   |   +46.2%
              |             | decaps    |    42,396.2   |    28,533.3   |   +48.6%
              | ML-KEM-768  | keygen    |    10,348.5   |     6,848.2   |   +51.1%
              |             | encaps    |    32,580.5   |    22,890.7   |   +42.3%
              |             | decaps    |    27,176.9   |    18,999.9   |   +43.0%
              | ML-KEM-1024 | keygen    |     6,992.2   |     4,618.4   |   +51.4%
              |             | encaps    |    21,909.4   |    15,703.5   |   +39.5%
              |             | decaps    |    18,565.0   |    13,320.0   |   +39.4%
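
For reference, the Improvement column is consistent with the relative change in throughput, (New / Old - 1) * 100. Below is a small sketch for recomputing it from a table row. `improvement` is a hypothetical helper, not part of the benchmark tooling, and the thousands separators must be removed from the inputs first.

```sh
# improvement NEW OLD: print the relative throughput change in percent.
# Hypothetical helper for checking the table; inputs are ops/sec without commas.
improvement() {
  awk -v new="$1" -v old="$2" 'BEGIN { printf "%+.1f%%\n", (new / old - 1) * 100 }'
}

improvement 29145.9 19666.0   # Graviton 4, ML-KEM-512 keygen: prints +48.2%
```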

Example for header patching during import:

Before (mlkem-native):

#include "../../../common.h"
#if defined(MLK_ARITH_BACKEND_AARCH64) &&  !defined(MLK_CONFIG_MULTILEVEL_NO_SHARED)

.text
.balign 4
.global MLK_ASM_NAMESPACE(ntt_asm)
MLK_ASM_FN_SYMBOL(ntt_asm)

After (imported):

#include "_internal_s2n_bignum.h"

.text
.balign 4
        S2N_BN_SYM_VISIBILITY_DIRECTIVE(mlkem_ntt_asm)
        S2N_BN_SYM_PRIVACY_DIRECTIVE(mlkem_ntt_asm)
S2N_BN_SYMBOL(mlkem_ntt_asm):
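
The rewrite above can be sketched as a sed filter. This is an illustration only and not the code in importer.sh: the function name `patch_asm_header` is hypothetical, the `mlkem_` symbol prefix is inferred from the example, and removal of the `#endif` matching the dropped `#if` guard is omitted.

```sh
# Hypothetical sketch: rewrite mlkem-native assembly directives (stdin)
# into s2n-bignum style directives (stdout).
patch_asm_header() {
  sed -E \
    -e 's|^#include "(\.\./)+common\.h"|#include "_internal_s2n_bignum.h"|' \
    -e '/^#if defined\(MLK_ARITH_BACKEND_AARCH64\)/d' \
    -e '/^\.global MLK_ASM_NAMESPACE\(/d' \
    -e 's|^MLK_ASM_FN_SYMBOL\(([A-Za-z0-9_]+)\)|        S2N_BN_SYM_VISIBILITY_DIRECTIVE(mlkem_\1)\
        S2N_BN_SYM_PRIVACY_DIRECTIVE(mlkem_\1)\
S2N_BN_SYMBOL(mlkem_\1):|'
}

# Usage (paths are placeholders): patch_asm_header < in.S > out.S
```

Lines that match none of the patterns, such as `.text` and `.balign 4`, pass through unchanged.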

@hanno-becker hanno-becker changed the title [RFC] ML-KEM: Import AArch64 backend [RFC] ML-KEM: Import AArch64 backend from mlkem-native Jun 22, 2025
@hanno-becker hanno-becker force-pushed the mlkem_aarch64_backend branch 3 times, most recently from 9f583c4 to 2749386 Compare June 22, 2025 09:20
@codecov-commenter commented Jun 22, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.85%. Comparing base (587cf97) to head (f53824c).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2498      +/-   ##
==========================================
- Coverage   78.86%   78.85%   -0.01%     
==========================================
  Files         640      640              
  Lines      109560   109560              
  Branches    15522    15521       -1     
==========================================
- Hits        86402    86395       -7     
- Misses      22461    22465       +4     
- Partials      697      700       +3     


@hanno-becker hanno-becker force-pushed the mlkem_aarch64_backend branch 2 times, most recently from a81eb55 to 417e91c Compare June 23, 2025 10:16
@hanno-becker hanno-becker marked this pull request as ready for review June 23, 2025 11:51
@hanno-becker hanno-becker requested a review from a team as a code owner June 23, 2025 11:51
@hanno-becker hanno-becker force-pushed the mlkem_aarch64_backend branch from 417e91c to 65eda0d Compare June 23, 2025 12:13
@hanno-becker hanno-becker changed the title [RFC] ML-KEM: Import AArch64 backend from mlkem-native [RFC] ML-KEM: Import AArch64 backend (from mlkem-native) Jun 23, 2025
@hanno-becker hanno-becker marked this pull request as draft June 23, 2025 15:41
@hanno-becker hanno-becker force-pushed the mlkem_aarch64_backend branch from 65eda0d to 9c42220 Compare June 24, 2025 05:31
@andrewhop (Contributor) left a comment

Where does mlkem-native do the assembly dispatching? This points to a gap we might have in our CI: for Arm, we don't build the implementation once and test it on all CPUs, as we do for x86 with the Intel SDE.

if(BORINGSSL_PREFIX)
# NOTE: mlkem-native has its own symbol prefixing, but AWS-LC's post-hoc prefixing
# is compatible with that.
set_source_files_properties(${MLKEM_NATIVE_AARCH64_ASM_SOURCES} PROPERTIES COMPILE_FLAGS "--include=\"${AWSLC_BINARY_DIR}/symbol_prefix_include/openssl/boringssl_prefix_symbols_asm.h\"")
Contributor


There appears to be an issue with the way this approach is prefixing the symbols:
https://github.com/aws/aws-lc/actions/runs/15842126232/job/44656634333?pr=2498#step:17:4425

          Undefined symbols for architecture arm64:
            "_aws_lc_0_30_0_mlkem_intt_asm", referenced from:
                _mlk_intt_native in libaws_lc_sys-f5b29bebed91b9af.rlib[114](f8e4fd781484bd36-bcm.o)
            "_aws_lc_0_30_0_mlkem_ntt_asm", referenced from:

@hanno-becker (Contributor Author) commented Jun 25, 2025

The two mechanisms do work together; the issue here is that the mlkem-native assembly isn't subject to the prefix header in this specific test (only).
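
One way to check for this locally is to list the assembly symbols that did not receive the prefix. The following is a sketch under assumptions, not part of this PR: `find_unprefixed_mlkem_symbols` is a hypothetical helper, it expects `nm`-style lines (`<address> <type> <symbol>`) on stdin, and it assumes the affected symbols end in `_asm` as in the error above.

```sh
# Hypothetical helper: print mlkem assembly symbols lacking the given prefix.
# Reads nm-style lines ("<address> <type> <symbol>") from stdin.
find_unprefixed_mlkem_symbols() {
  awk -v prefix="$1" '
    NF == 3 && $3 ~ /mlkem_.*_asm$/ {
      sym = $3
      sub(/^_/, "", sym)                        # drop the Mach-O leading underscore
      if (index(sym, prefix "_") != 1) print $3 # not prefixed: report it
    }'
}

# Usage (object path is a placeholder):
#   nm some_object.o | find_unprefixed_mlkem_symbols aws_lc_0_30_0
```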

The reason is this: cc_builder has a patch which adds --include <PREFIX_HEADER> to the build of s2n-bignum assembly files. This patch mimics this line in the CMakeLists.txt. A similar addition to CMakeLists.txt is made for mlkem-native in this PR, which is why (a) the prefix builds work and (b) the aws-lc-rs tests succeed with CMAKE_BUILDER=1.

One could solve this issue following precedent, by adding another --include clause to cc_builder.

However, taking a step back: Why do we do this --include patching in the first place? Why don't we just add #include <openssl/boringssl_prefix_symbols_asm.h> to the assembly files? @nebeid @torben-hansen @dtapuska @andrewhop If someone could shed some light on this, I'd be grateful.

For the time being, I pushed a test-commit attempting to add #include <openssl/boringssl_prefix_symbols_asm.h> to the mlkem-native assembly files upon import. While this is a patch, I think this is no worse -- rather better, because more transparent -- than the --include patches in the CMakeLists / cc_builder.
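
A rough sketch of what such an import-time patch could look like follows. It is an assumption for illustration, not the contents of the test commit: the helper name `add_prefix_include` and the choice to place the new line directly before the first existing `#include` are both hypothetical.

```sh
# Hypothetical sketch: insert the prefix header before the first #include
# of an assembly file read from stdin. Re-running it does not add a second copy.
add_prefix_include() {
  awk '
    !seen && /^#include / {
      if ($0 !~ /boringssl_prefix_symbols_asm\.h/)
        print "#include <openssl/boringssl_prefix_symbols_asm.h>"
      seen = 1
    }
    { print }'
}

# Usage (paths are placeholders): add_prefix_include < in.S > out.S
```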

echo "Fixup include paths"
sed "${SED_I[@]}" 's/#include "src\/\([^"]*\)"/#include "\1"/' $SRC/mlkem_native_bcm.c

echo "Fixup AArch64 assembly backend to use s2n-bignum macros"
Contributor


Can we update aws-lc's code to not need to modify the mlkem-native code? I would like to keep this as simple as git clone and copying some directories.

Contributor Author


The delocator seems to have issues with nested relative #includes, so having it process the ASM as it is in mlkem-native looks challenging at best. I'm not convinced that patching the delocator and build is better than patching the assembly headers to be compatible with the existing tooling.

What we're currently doing is replacing

#include "../../../common.h"
#if defined(MLK_ARITH_BACKEND_AARCH64) &&  !defined(MLK_CONFIG_MULTILEVEL_NO_SHARED)

.text
.balign 4
.global MLK_ASM_NAMESPACE(ntt_asm)
MLK_ASM_FN_SYMBOL(ntt_asm)

by

#include "_internal_s2n_bignum.h"

.text
.balign 4
        S2N_BN_SYM_VISIBILITY_DIRECTIVE(mlkem_ntt_asm)
        S2N_BN_SYM_PRIVACY_DIRECTIVE(mlkem_ntt_asm)
S2N_BN_SYMBOL(mlkem_ntt_asm):

This doesn't look too bad to me. Also, there is still room for simplification, since in mlkem-native we could use a different macro structure (e.g. wrapping .global MLK_ASM_NAMESPACE(...) as MLK_ASM_GLOBAL(...)), which would further simplify the search-and-replace transformation to the s2n-bignum directives.

@hanno-becker hanno-becker force-pushed the mlkem_aarch64_backend branch from 9c42220 to bf167e4 Compare June 25, 2025 10:56
Context: The ML-KEM implementation in AWS-LC is imported from
mlkem-native. mlkem-native comes in a "C-only" version, but also
offers AArch64 and x86_64 backends for (a) arithmetic,
and (b) FIPS-202. Currently, only the "C-only" version is
imported into AWS-LC.

Summary: This commit extends the mlkem-native->AWS-LC import to include
the AArch64 arithmetic backend.

Details:

- `crypto/fipsmodule/ml_kem/importer.sh` now imports
  the arithmetic backend API header `native/api.h` as well
  as the native backend `native/aarch64/*`.

- The backend is imported as-is, with one exception:
  `importer.sh` converts the preprocessor directives used by
  mlkem-native into the ones used by s2n-bignum. This is to
  piggy-back on the adjustments made to the delocator to work
  with s2n-bignum assembly; otherwise, similar adjustments would
  likely be needed for mlkem-native assembly files.

- All imported functions are formally verified for functional
  correctness using HOL-Light. The proofs run as part of
  mlkem-native's CI. The HOL-Light specs are manually translated
  into CBMC specs in the header accompanying the ASM, and
  all higher-level CBMC proofs are conducted against those specs.
  Again, those are part of the mlkem-native CI.

- A backend header crypto/fipsmodule/ml_kem/mlkem_native_backend.h
  is added, activating the AArch64 arithmetic backend on Linux and
  macOS AArch64 systems, unless the NO_ASM directive is set
  (same as for s2n-bignum).

  Once the x86_64 arithmetic backend is ready for integration,
  it will be added to `mlkem_native_backend.h` as well.

- The backend header is registered in the configuration file
  `crypto/fipsmodule/ml_kem/mlkem_native_config.h`.

- The importer.sh is re-run.

Signed-off-by: Hanno Becker <[email protected]>
@hanno-becker hanno-becker force-pushed the mlkem_aarch64_backend branch from bf167e4 to 10605b7 Compare June 25, 2025 11:11