-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support digest plugins #3822
Support digest plugins #3822
Conversation
11d4abe
to
d01a777
Compare
b067685
to
d002586
Compare
I replaced the the usage of FFI to read a pointer with Fiddle, which is a default gem. I hope that's sufficient. If it isn't then I can change these specs, but most of the verification that the plugin work involves ensuring the digest metadata and context are working correctly. The metadata struct is needed to for the plugin to tie into the digest machinery. The context object I'm reading as a way to to indirectly verify that the I've addressed a couple of lint issues as well. One was due to copying code from digest (and that I attributed) not matching the code style asked for in the Ruby Spec Suite. I think the remaining lint failures are unrelated to this PR. |
d002586
to
ac5296c
Compare
@@ -0,0 +1 @@ | |||
inherited |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a difference from MRI, but I don't think it would be realistically be a problem given it's a standard hook in Ruby. The inherited
hook is used to mix in Truffle::Digest::Plugin
to new Digest::Base
subclasses. This allows for seamless loading of plugins, but does introduce two user visible differences from MRI:
- The presence of
inherited
inDigest::Base.singleton_class.public_instance_methods(false)
- The mixed in
Truffle::Digest::Plugin
module for digest gem plugins
I think both of these deviations are quite small and unlikely to cause any real world trouble. If they are problematic, we'll need to find another way to load digest plugins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, agree
ac5296c
to
a2e7c58
Compare
At this point, I think all of the test failures are problems with the build unrelated to my PR. If that's not correct, please let me know and I'll work on it. |
PullRequest: truffleruby/4507
Overview
The digest gem is a default gem that has a standard C API for contributing new digest algorithms. The gem includes several built-in digest algorithms and they all make use of this plugin mechanism. However, TruffleRuby does not run the real digest source when a caller uses
require digest
. Rather, TruffleRuby reimplements the digest API in Ruby and Java, allowing it to make use of the JVM'sMessageDigest
instances.Generally, this reimplementation is seamless. The
Digest
API has not changed in many years and TruffleRuby ships with support for all of the included digest algorithms. While running the digest native extension would work on TruffleRuby, it would be slower than using the optimized JVM code.This, however, has presented a problem when it comes to digest plugins: since we don't run the native code, we can't load native plugins. We've had an issue open for 6.5 years asking for this support. This PR adds support for loading digest plugins using FFI to work with TruffleRuby's reimplementation.
When an unknown digest is loaded, we now include the
Truffle::Digest::Plugin
module, which will override functionality fromDigest::Base
. The current built-in digest algorithms (e.g., MD5 and SHA256) will continue to work as they have been in TruffleRuby. But, new algorithms that adhere to the digest gem's API will now load and function.BLAKE3
As a motivating example, this PR allows Shopify's BLAKE3 gem, which is written in Rust, to load and execute on TruffleRuby. The blake3-rb Ruby test suite passes 100%. The gem ships with another test suite for Cargo that does not pass due to ongoing work with rb-sys. These tests load the compiled extension into a Rust test runner and do not impact the typical execution mode of running blake3-rb from a gem.
I looked for other digest plugins to work against, but was unable to find any that work with the latest digest gem even on MRI. If anyone has a gem in particular that they would like to me to check, I'm happy to do so, provided it works with MRI.
Testing
This PR also adds some C extension tests for the plugin support. Despite not strictly being part of the MRI extension API, I placed them with other extension specs because digest is a default gem and it's expected that API works out of the box. I struggled a bit with finding the right balance in testing the API boundary without recreating all of the
Digest
Ruby specs we already have. I settled on two groups of tests:Digest
Ruby APIFor the callback tests I wrote a simple digest plugin that writes to the context in straightforward ways. The plugin doesn't actually perform any hashing of values, but rather writes text strings that can be easily verified in the specs. I'm open to suggestions if this form of testing is too indirect.
Benchmarks
The blake3-rb gem ships with some benchmarks. I ran the string benchmarks on my workstation. I didn't fully tune for benchmarks so please take the numbers with a grain of salt. I'm running Ubuntu 24.04.2 (kernel: 6.8.0-55-generic) on a Ryzen 9800X3D processor with the
performance
governor.Running through FFI doesn't appear to introduce too large of an overhead. I've verified that the blake3-rb extension is using SIMD instructions by way of the BLAKE3 Rust library. Absent built-in support for BLAKE3 in the JVM, I'm skeptical that a pure Java implementation would do much better.
The string benchmark, however, processes strings of various lengths in each benchmark. I had seen fairly large variance in performance depending on the length of a string. That's worth exploring in more depth.
For comparison, here's what I get with MRI 3.4.2: