Add BLAKE3? #77
Comments
@knu With BLAKE2, at least, you had expressed a preference to use OpenSSL for hashing. I don't know when BLAKE3 will become available in OpenSSL, but it's been an open issue for over 4.5 years. I suspect it will be a while yet before it's available for use in Ruby. The blake3-rb gem has proven to be popular with 220k installations. I know that may seem small compared to something like Rails, but I think it's a fairly large number for a hash function. This indicates there is broad interest in choosing BLAKE3 over functions available in either the digest or openssl gems. I think it'd be great for the Ruby community if it were even easier to use BLAKE3.
Thanks for the information. I worked hard to make Digest easy to extend, ensuring that adding new algorithms as external modules would be easy. From what I can see from the implementation the blake3-rb gem you mentioned is a perfect example of that, and the third-party digest module having been used as a popular add-on is good news to me. That said, what does the author of the gem think or say about the integration? And if we were to import theirs, would it be a soft requirement to keep using Rust? Requiring a Rust compiler to build and install the digest library might be a show-stopper and we would probably need to consider making it optional and/or distributing binary gems, which I'm not really willing to do.
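To illustrate what "adding new algorithms as external modules" can look like from the Ruby side, here is a minimal sketch that subclasses `Digest::Class` and supplies `reset`, `update`, and `finish`. It is not how blake3-rb is built (that gem is a native extension); the class name `Adler32Digest` is hypothetical, and Adler-32 from `zlib` is used only because it ships with Ruby and keeps the example short.

```ruby
require "digest"
require "zlib"

# Hypothetical example class: exposes Zlib.adler32 through the Digest interface.
class Adler32Digest < Digest::Class
  def initialize
    super()
    reset
  end

  # Return the digest to its initial state (Adler-32 starts at 1).
  def reset
    @sum = 1
    self
  end

  # Feed more data into the running checksum.
  def update(data)
    @sum = Zlib.adler32(data, @sum)
    self
  end
  alias << update

  private

  # Produce the raw digest bytes; Digest::Instance calls this internally.
  def finish
    [@sum].pack("N")
  end
end

# One-shot and incremental use both go through the standard Digest API.
one_shot = Adler32Digest.hexdigest("Wikipedia")
incremental = (Adler32Digest.new << "Wiki" << "pedia").hexdigest
puts one_shot
puts incremental
```

Because the subclass only fills in the three primitives, the class-level `digest`/`hexdigest` helpers and the instance-level `hexdigest`/`base64digest` methods come from `Digest::Class` and `Digest::Instance` without extra code.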
@knu Thanks for the quick reply. Having a plug-in system is nice, but I think Ruby should have robust defaults out of the box. Up until recently I wasn't even aware you could have digest plugins. I bet most people look at what's available in either digest or openssl and give up if they don't find the hash function they're looking for. That, in turn, could look like Ruby is falling behind. I don't know if we literally need to upstream that gem. I'd suggest using the official BLAKE3 C implementation. From there, we can largely translate the gem to C, although owing to the plugin mechanism, the gem is quite straightforward. @ianks, do you have any objections to upstreaming your work on blake3-rb? I know the Rust implementation affords some really nice properties, but the official Ruby gems stick to C for ease of distribution. That's a concession I'm willing to make in order to get BLAKE3 available out of the box, but I'd like to confirm that's okay with you. A pleasant side effect of being integrated upstream is that JRuby and TruffleRuby could optimize it for the JVM. @headius @enebo Would either of you be up for implementing BLAKE3 using the JRuby extension API?
Absolutely fine by me. Happy to see official blake3 support come to fruition. Thanks for taking this on @nirvdrum.
I don't see how we'd be able to optimize it for the JVM if it is imported as a C library. JRuby doesn't support the C extension API so any integration would have to be over FFI, which then requires copying data back and forth. TruffleRuby supports the C extension API, but cannot optimize across that boundary, so the performance would be no faster than in CRuby (and I assume slower because of the overhead of crossing that boundary).
There are Java implementations of Blake3 that could be used from JRuby just by writing Ruby code, but I have not evaluated any of them for performance. If they are small enough, it would be preferable to import their Java code directly into this library, rather than adding another external dependency to a standard library gem. We have not heard from any users interested in this feature, so it would not be a high priority. It wouldn't be hard for someone else to integrate it, though.
@nirvdrum I have interest if this is something important enough to be needed, but unfortunately it is very low on the priority list as far as free time I can spend on it. I agree with @headius that this needs to be a Java implementation, so it requires some extra work in figuring out what we can leverage.
👍 to add BLAKE3 from me. As we have seen in TruffleRuby already some time ago, having a digest implementation as a C API (to be precise, not using Digest's extension mechanism but just using a method defined in C via So for good digest performance, they need to be implemented directly as part of TruffleRuby's Java code for TruffleRuby, as a C extension for CRuby, and as a JRuby extension for JRuby.
The main issues I see with that mechanism are: JRuby cannot support that API, because it relies on C extensions. TruffleRuby does not currently support it; maybe it could, but it would be quite slow as mentioned above, and just the FFI overhead to call the native functions would be too high (as again it needs at least a byte[]->char* copy). Unfortunately, it seems very difficult to create an extension mechanism for Digest which is efficient for the 3 main Ruby implementations.
@knu any more opinions on this? It would be really nice if Because yes it can be extended with external gems, but often gems prefer to minimize dependencies so they end up using what's in digest, and often that means MD5, which then causes issues with things like FIPS (a certification that removes MD5 from OpenSSL). So I'd be happy to submit a PR if that's something you'd accept.
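The dependency trade-off described above can be sketched as a fallback: use a BLAKE3 plugin when one is installed, otherwise stay within what digest ships, picking SHA-256 rather than MD5. This is only an illustration; the require path `digest/blake3` and the constant `Digest::Blake3` are assumptions about how a plugin gem exposes itself, and `preferred_digest` and `fingerprint` are hypothetical helper names.

```ruby
require "digest"

# Pick a digest class without adding a hard dependency on a plugin gem.
def preferred_digest
  require "digest/blake3" # assumed require path of an optional BLAKE3 plugin
  Digest::Blake3          # assumed constant defined by that plugin
rescue LoadError, NameError
  # Plugin not installed: fall back to a digest bundled with Ruby.
  # SHA-256 is chosen over MD5 so the code also works where MD5 is disabled.
  Digest::SHA256
end

# Hypothetical helper: hex fingerprint of some data with whichever digest was chosen.
def fingerprint(data)
  preferred_digest.hexdigest(data)
end

puts preferred_digest
puts fingerprint("hello world")
```

The cost of this pattern is that the fingerprint depends on what is installed, which is part of why having the algorithm available out of the box is attractive.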
@byroot As we talked about earlier, let's go ahead with this. CRC32 sounds nice to me, too.
Thank you! I'll come up with some PRs after RubyKaigi.
Ref: ruby#77 `CRC32` is relatively commonly needed for network protocols and some archive formats like `zip`. This is a clean implementation derived from the Wikipedia article.
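For readers unfamiliar with the algorithm, the following is a rough table-driven sketch of the common CRC-32 variant (reflected polynomial `0xEDB88320`, as used by zip). It is not the code from the referenced change; the names `CRC32_TABLE` and `crc32` are hypothetical, and the result is cross-checked against `Zlib.crc32` from Ruby's standard library.

```ruby
require "zlib"

# Precompute the CRC of every possible byte value (reflected polynomial).
CRC32_TABLE = Array.new(256) do |n|
  8.times { n = n.odd? ? (0xEDB88320 ^ (n >> 1)) : (n >> 1) }
  n
end.freeze

# Compute CRC-32 of +data+, optionally continuing from a previous result.
def crc32(data, crc = 0)
  crc ^= 0xFFFFFFFF # undo the final inversion of the previous call (or start at all ones)
  data.each_byte do |byte|
    crc = CRC32_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
  end
  crc ^ 0xFFFFFFFF
end

sample = "123456789"
puts format("%08x", crc32(sample))
puts crc32(sample) == Zlib.crc32(sample)
```

Passing the previous result as the second argument allows incremental use, for example `crc32("6789", crc32("12345"))`.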
The BLAKE3 hash function is growing in popularity due to improvements in both security and performance over other hashing functions. There's an open source plugin for this gem that adds BLAKE3 and works well, but it would be great if BLAKE3 were integrated directly in digest. That way, the gem provides a great experience out of the box. Additionally, it would be nicer for JRuby and TruffleRuby since both of those implementations provide an API-compatible version of digest in their respective distributions.
There's an official implementation of BLAKE3 in C that we could wrap in Ruby.
I can work on a PR to add BLAKE3 support directly in this gem, but I wanted to check if there are any concerns to address or blockers that would prevent the integration.