Brotli compression is slow
Brotli is an extremely efficient compression format (an alternative to gzip, zstd, etc.), with google/brotli being more or less its reference implementation. What has always bugged me about Brotli, though, is computation time. As a consequence of yielding such good compression results, Brotli-compressing a file is also an extremely slow process (orders of magnitude slower than gzip). The fact that the “official” brotli command-line tool (written in C) is single-threaded doesn’t particularly help with this either.
Multi-threaded implementation
I recently came across this article on Dropbox’s tech blog (super interesting, btw.!), where they describe how they switched to Brotli for file downloads and uploads. In that context, they came up with their own Rust implementation of Brotli. What’s especially interesting about it is that their library supports multi-threaded compression. In my understanding, this is enabled by “simply” compressing individual chunks in parallel and then concatenating the binary output.
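To illustrate the general idea, here is a minimal toy sketch of parallel chunked compression in Go. To be clear, this is my own example, not Dropbox’s code; it uses the pure-Go github.com/andybalholm/brotli package as a stand-in, and, as far as I understand, naively concatenated independent Brotli streams are not something a stock decoder will read back as a single stream. Making the combined output a valid stream is presumably the clever part of their library.

```go
// Toy sketch of the chunk-and-concatenate idea (NOT Dropbox's implementation):
// split stdin into fixed-size chunks, compress them in parallel, and write the
// compressed chunks to stdout in their original order.
package main

import (
	"bytes"
	"io"
	"log"
	"os"
	"runtime"
	"sync"

	"github.com/andybalholm/brotli" // pure-Go Brotli, used here as a stand-in
)

func compressChunk(chunk []byte) []byte {
	var buf bytes.Buffer
	w := brotli.NewWriterLevel(&buf, 11) // quality 0-11; 11 = best compression
	if _, err := w.Write(chunk); err != nil {
		log.Fatal(err)
	}
	if err := w.Close(); err != nil {
		log.Fatal(err)
	}
	return buf.Bytes()
}

func main() {
	input, err := io.ReadAll(os.Stdin)
	if err != nil {
		log.Fatal(err)
	}

	const chunkSize = 4 << 20 // 4 MiB per chunk (arbitrary choice)
	numChunks := (len(input) + chunkSize - 1) / chunkSize
	outputs := make([][]byte, numChunks)

	// Cap concurrency at the number of CPU cores.
	sem := make(chan struct{}, runtime.NumCPU())
	var wg sync.WaitGroup
	for i := 0; i < numChunks; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			sem <- struct{}{}
			defer func() { <-sem }()
			end := (i + 1) * chunkSize
			if end > len(input) {
				end = len(input)
			}
			outputs[i] = compressChunk(input[i*chunkSize : end])
		}(i)
	}
	wg.Wait()

	// Caveat: this simple concatenation is NOT a valid single Brotli stream;
	// it only illustrates where the parallelism comes from.
	for _, out := range outputs {
		if _, err := os.Stdout.Write(out); err != nil {
			log.Fatal(err)
		}
	}
}
```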
Their tool is primarily a Rust library that also comes with C bindings, and, enabled through those, with Python and Go bindings on top. In addition, they release CLI executables, though apparently only for Windows (?). Perhaps that one is a drop-in replacement for Google’s brotli command, but I didn’t get to try it, as I didn’t have easy access to a Windows machine at the time. Building the Rust project produces an executable as well, but (without having read through the code) it appears to be single-threaded only.
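For reference, building the Rust project should be the usual Cargo routine; I’m assuming here that the repository is github.com/dropbox/rust-brotli and that the binary ends up in the default location:

```sh
git clone https://github.com/dropbox/rust-brotli
cd rust-brotli
cargo build --release   # CLI binary should land under target/release/
```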
Getting the Go version running
Instead, I dove a bit deeper into the code and realized that their Go example is a more or less ready-to-use, standalone program that performs multi-threaded (de-)compression (without a proper command-line interface, though). I had to apply a couple of changes to get it working. Specifically, I had to:
- Fix the library dependency version
- Update the hard-coded, relative dependency path to my local home dir
- Make it use as many threads as available CPU cores by default
- Link the brotli_ffi library statically into the Go executable (a rough sketch of this and the previous change follows right after this list)
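For the curious, the last two changes boil down to patterns like the following. This is just an illustrative sketch: the package layout, the path, and the name of the static archive are assumptions on my part, so please refer to the actual patch files below.

```go
// Illustrative sketch only, not the actual patch.
package brotli

/*
// Instead of linking dynamically via -lbrotli_ffi, point the linker directly
// at the static archive (path and archive name are assumed here):
#cgo LDFLAGS: ${SRCDIR}/lib/libbrotli_ffi.a -lm -ldl -lpthread
*/
import "C"

import "runtime"

// defaultWorkers returns the number of compression workers to use when the
// caller doesn't ask for a specific count: one per available CPU core.
func defaultWorkers() int {
	return runtime.NumCPU()
}
```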
You can find patch files for my changes below.
my-patch.go
```diff
diff --git a/c/go/brotli/brotli.go b/c/go/brotli/brotli.go
```
After applying these changes, building the executable was straightforward:
```sh
make
```
Eventually, I was able to Brotli-compress a file while utilizing 100 % of my CPU 🙌:
```sh
cat random.txt | ./brotli-rust -w > random.txt.br
```
Benchmark
I ran a few quick tests to compare performance on a 128 MB high-entropy text file, and these are the results (on a 12-core CPU):
```sh
time brotli random.txt  # 197.83 s
```
In this case, the multi-threaded version is 26x faster than the original one. For comparison, I ran gzip as well (single- and multi-threaded, the latter using pigz). That comparison is unfair, of course, as gzip’s compression ratio is much worse on average.
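If you want to reproduce the comparison, invocations along these lines should do (assuming pigz is installed; timings omitted here, as they obviously depend on your machine):

```sh
time ./brotli-rust -w < random.txt > random.txt.br   # multi-threaded Brotli
time gzip -k random.txt                              # single-threaded gzip
time pigz -k random.txt                              # multi-threaded gzip
```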