Skip to content

Add a pure-Rust backend #67

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
brson opened this issue Mar 27, 2017 · 21 comments
Closed

Add a pure-Rust backend #67

brson opened this issue Mar 27, 2017 · 21 comments

Comments

@brson
Copy link

brson commented Mar 27, 2017

flate2 gets its actual compression from either libz or miniz, both C libraries. Ultimately, a pure-Rust stack is better for Rust, mostly because of build simplicity.

Write a miniz replacement in Rust, publish it to crates.io and add another compile-time feature to select it.

Then make the performance better than miniz and make it the default.

@brson
Copy link
Author

brson commented Mar 28, 2017

There are existing 'deflate' and 'inflate' crates that might be suitable.

@oyvindln
Copy link
Contributor

oyvindln commented Apr 3, 2017

Developer of the deflate crate here, I would be happy to coordinate with you on this if you decide to utilize it.

@dtolnay
Copy link
Member

dtolnay commented Apr 4, 2017

Notes from today's libs team meeting:

  • Adding a pure-Rust backend should not require a change to the API.
  • Opting into C backends can cause mysterious explosions when multiple crates opt in to multiple revs.
  • Not a great risk here because miniz is not going to change.
  • A pure-Rust backend is not a requirement for 1.0 of flate2.

Currently we do not plan to pursue a pure-Rust backend for flate2. Instead we plan to promote migration to pure Rust crates as they become stable.

@matklad
Copy link
Member

matklad commented Apr 7, 2017

Currently we do not plan to pursue a pure-Rust backend for flate2. Instead we plan to promote migration to pure Rust crates as they become stable.

Would a drop-in pure rust replacement of miniz be useful anyway? We have summer internships at https://compscicenter.ru, and porting a C library to Rust seems like an interesting project (given that you can throw corrode, ALF and all other cool tools at it) which I probably can mentor.

@alexcrichton
Copy link
Member

I'd imagine so yeah! ABI compatible, yet safe, libraries seem like a fantastic use case for Rust.

@Frommi
Copy link
Contributor

Frommi commented Jul 4, 2017

I'm a summer intern mentored by @matklad to rewrite miniz to Rust. Here is the repo.

@Frommi
Copy link
Contributor

Frommi commented Jul 24, 2017

I made a fork that uses miniz_oxide. Right now miniz.c file is fully ported to Rust and a little bit from tdef.

@Frommi
Copy link
Contributor

Frommi commented Aug 14, 2017

Deflation is done (except for png functions), but it certainly can be a bit rustier. As a whole miniz_tester.cpp is working 1.09 (SE 0.01) times slower with miniz_oxide according to /usr/bin/time user time.

@oyvindln
Copy link
Contributor

oyvindln commented Aug 14, 2017

The performance of miniz_oxide for deflating looks really good!
I tested it with the benchmarks I use to compare my own encoder with flate2 and the difference between the oxide and the vanilla version of flate2 is pretty small (except for creating a new writer instance):

flate2-oxide
        Running target/release/deps/bench-320f073631836af1

running 9 tests
test test_file_zlib_def         ... bench:  10,611,928 ns/iter (+/- 215,276)
test test_file_zlib_fast        ... bench:   2,806,482 ns/iter (+/- 54,799)
test test_file_zlib_flate2_best ... bench:  13,205,205 ns/iter (+/- 411,780)
test test_file_zlib_flate2_def  ... bench:  10,080,656 ns/iter (+/- 231,622)
test test_file_zlib_flate2_fast ... bench:   2,620,335 ns/iter (+/- 73,291)
test test_file_zlib_high        ... bench:  13,603,453 ns/iter (+/- 150,140)
test test_file_zlib_rle         ... bench:   1,189,553 ns/iter (+/- 19,496)
test writer_create              ... bench:       6,852 ns/iter (+/- 251)
test writer_create_flate2       ... bench:      14,419 ns/iter (+/- 385)

flate2-miniz
     Running target/release/deps/bench-320f073631836af1

running 9 tests
test test_file_zlib_def         ... bench:  10,627,654 ns/iter (+/- 138,669)
test test_file_zlib_fast        ... bench:   2,799,167 ns/iter (+/- 42,938)
test test_file_zlib_flate2_best ... bench:  12,888,815 ns/iter (+/- 609,888)
test test_file_zlib_flate2_def  ... bench:   9,629,206 ns/iter (+/- 104,458)
test test_file_zlib_flate2_fast ... bench:   2,033,945 ns/iter (+/- 45,261)
test test_file_zlib_high        ... bench:  13,614,552 ns/iter (+/- 176,195)
test test_file_zlib_rle         ... bench:   1,196,375 ns/iter (+/- 19,859)
test writer_create              ... bench:       6,823 ns/iter (+/- 211)
test writer_create_flate2       ... bench:       4,143 ns/iter (+/- 86)

(The test prefixed with flate2 is the ones using flate2)

Let me know if you want me to help out on making it more rusty. I suppose porting the miniz backend for flate2 may be an easier route to get rid of this dependency on a C-library than trying to approach the speed with the current pure-rust decoders.

@Frommi
Copy link
Contributor

Frommi commented Aug 14, 2017

writer_create_flate2 is slower probably because I 0-set all the arrays at init. Valgrind is actually reporting conditional jumps based on uninitialized values in miniz because of that.

@oyvindln
Copy link
Contributor

Yeah it's probably 0-initialization taking a little bit of time.

I believe the reading of uninitialized values is because it may read past the end of the input buffer when comparing matches as it doesn't check if it's hit the end. Zlib does something like that deliberately to avoid doing too many bounds checks when comparing matches.

@alexcrichton
Copy link
Member

Nice work @Frommi! Those are some really encouraging numbers :)

@Frommi
Copy link
Contributor

Frommi commented Aug 28, 2017

With help from @oyvindln inflation is now rewritten as well. miniz_oxide should already be usable, but there is still some work to be done like comprehensive fuzzing, testing for ARMs and some optimizations.

@Frommi
Copy link
Contributor

Frommi commented Aug 30, 2017

Good news! Compression in miniz_oxide is now faster than in miniz on big enough files. Decompression still needs some work, though.

Bench results:

running 13 tests
test compress_default                   ... bench:   7,509,230 ns/iter (+/- 160,622)
test compress_fast                      ... bench:   1,470,900 ns/iter (+/- 32,374)
test compress_high                      ... bench:  17,968,952 ns/iter (+/- 384,224)
test compress_mem_to_heap_default_miniz ... bench:   7,715,709 ns/iter (+/- 219,649)
test compress_mem_to_heap_default_oxide ... bench:   7,495,585 ns/iter (+/- 152,332)
test compress_mem_to_heap_fast_miniz    ... bench:   1,985,796 ns/iter (+/- 29,899)
test compress_mem_to_heap_fast_oxide    ... bench:   1,940,112 ns/iter (+/- 32,735)
test compress_mem_to_heap_high_miniz    ... bench:  18,495,747 ns/iter (+/- 199,349)
test compress_mem_to_heap_high_oxide    ... bench:  17,887,569 ns/iter (+/- 251,364)
test decompress                         ... bench:     746,421 ns/iter (+/- 5,564)
test decompress_mem_to_heap_miniz       ... bench:     608,065 ns/iter (+/- 12,973)
test zlib_compress_fast                 ... bench:   1,547,289 ns/iter (+/- 14,296)
test zlib_decompress                    ... bench:     820,994 ns/iter (+/- 6,032)

@Boscop
Copy link

Boscop commented Oct 1, 2017

Any updates on this? :)

@oyvindln
Copy link
Contributor

oyvindln commented Oct 1, 2017

miniz_oxide is pretty much ready to be added as an experimental back-end now. (Though it needs to be tested and probably fixed to work on big-endian systems still.)
Maybe we could start off by putting it behind a separate feature in flate2, like using system zlib is now, so people can opt-in to it if they want to avoid needing C code.

@oyvindln
Copy link
Contributor

oyvindln commented Oct 3, 2017

big-endian issues have been fixed.

@alexcrichton
How do we best go about adding this as a back-end? I don't know which of system zlib and miniz_oxide ought to take priority if both features are added (are there any zlib-only features?).

@alexcrichton
Copy link
Member

Nice progress! Want to make a PR with the changes here? For now I think we'll still want it off by default, but we could add an opt-in feature and get it on CI

@Frommi
Copy link
Contributor

Frommi commented Oct 7, 2017

Okay, we will publish it on crates.io first and then make a PR here. By the way, here are current benchmarks vs miniz:

name                         miniz:: ns/iter  oxide:: ns/iter  diff ns/iter   diff %  speedup 
 compress_bin_lvl_1           1,581,071        1,645,425              64,354    4.07%   x 0.96 
 compress_bin_lvl_6           10,251,827       9,725,356            -526,471   -5.14%   x 1.05 
 compress_bin_lvl_9           23,637,460       22,797,519           -839,941   -3.55%   x 1.04 
 compress_code_lvl_1          1,200,146        1,172,170             -27,976   -2.33%   x 1.02 
 compress_code_lvl_6          5,686,613        5,475,717            -210,896   -3.71%   x 1.04 
 compress_code_lvl_9          7,790,218        7,550,755            -239,463   -3.07%   x 1.03 
 compress_compressed_lvl_1    244,517          290,464                45,947   18.79%   x 0.84 
 compress_compressed_lvl_6    1,557,577        1,504,730             -52,847   -3.39%   x 1.04 
 compress_compressed_lvl_9    3,406,719        3,320,516             -86,203   -2.53%   x 1.03 
 decompress_bin_lvl_1         885,646          822,365               -63,281   -7.15%   x 1.08 
 decompress_bin_lvl_6         754,030          658,653               -95,377  -12.65%   x 1.14 
 decompress_bin_lvl_9         749,811          648,946              -100,865  -13.45%   x 1.16 
 decompress_code_lvl_1        676,070          614,149               -61,921   -9.16%   x 1.10 
 decompress_code_lvl_6        484,888          396,394               -88,494  -18.25%   x 1.22 
 decompress_code_lvl_9        481,809          393,934               -87,875  -18.24%   x 1.22 
 decompress_compressed_lvl_1  174,119          126,022               -48,097  -27.62%   x 1.38 
 decompress_compressed_lvl_6  235,449          203,710               -31,739  -13.48%   x 1.16 
 decompress_compressed_lvl_9  235,754          204,383               -31,371  -13.31%   x 1.15

@oyvindln
Copy link
Contributor

oyvindln commented Oct 23, 2017

To anyone following this issue but not pull requests, this was added in #128.

@alexcrichton
Copy link
Member

Indeed! I'll close now.

Thanks so much @Frommi and @oyvindln!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants