HySnappy

HySnappy is a lightweight, high-performance Snappy compression and decompression library compiled to WebAssembly. It provides:

  • Very fast Snappy compression and decompression suitable for web and Node.js environments.
  • A minimal footprint with no external dependencies.
  • Seamless integration with tools like Hyparquet.

The Snappy compression format, originally released by Google, is designed for high speed with reasonable compression ratios. HySnappy builds on these strengths by providing a WebAssembly build that can be included directly in your JavaScript bundle.

Usage

Decompress Snappy Data

The snappyUncompress function requires two arguments:

  • compressed: a Uint8Array with compressed data.
  • outputLength: the uncompressed size of the data.

The length is needed to know how much wasm memory to allocate. For formats like parquet, this length will generally be known in advance.

To decompress a Uint8Array with known output length:

const { snappyUncompress } = await import('hysnappy')

const compressed = new Uint8Array([
  0x0a, 0x24, 0x68, 0x79, 0x70, 0x65, 0x72, 0x70, 0x61, 0x72, 0x61, 0x6d
])
const outputLength = 10
const output = snappyUncompress(compressed, outputLength) // hyperparam
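When the output length is not supplied by a container format, a raw Snappy block still records it: the stream begins with the uncompressed length encoded as a little-endian varint. The helper below is a minimal sketch of reading that preamble. The name readSnappyLength is hypothetical and is not part of the hysnappy API.

```javascript
// Hypothetical helper (not exported by hysnappy): read the uncompressed
// length stored as a little-endian varint at the start of a raw Snappy block.
function readSnappyLength(compressed) {
  let length = 0
  let shift = 0
  // The Snappy preamble is a 32-bit varint, so it spans at most 5 bytes.
  for (let i = 0; i < compressed.length && i < 5; i++) {
    const byte = compressed[i]
    // Multiply instead of using << to avoid signed 32-bit overflow.
    length += (byte & 0x7f) * 2 ** shift
    // A clear high bit marks the last byte of the varint.
    if ((byte & 0x80) === 0) return { length, headerBytes: i + 1 }
    shift += 7
  }
  throw new Error('invalid or truncated Snappy length preamble')
}

// The sample data from above: 0x0a is the preamble, so the length is 10.
const sample = new Uint8Array([
  0x0a, 0x24, 0x68, 0x79, 0x70, 0x65, 0x72, 0x70, 0x61, 0x72, 0x61, 0x6d
])
const { length: sampleLength } = readSnappyLength(sample)
```

The resulting sampleLength could then be passed as the outputLength argument.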

Compress Snappy Data

Use the snappyCompress function to compress a Uint8Array:

const { snappyCompress } = await import('hysnappy')

const input = new Uint8Array([
  0x68, 0x79, 0x70, 0x61, 0x72, 0x61, 0x6d
])
const compressed = snappyCompress(input)

Hyparquet Integration

HySnappy was built specifically to accelerate the hyparquet parquet parsing library.

HySnappy exports a loader function, snappyUncompressor(), which loads the WASM module once and returns a pre-loaded version of the snappyUncompress function.

To use hysnappy with hyparquet:

import { parquetQuery } from 'hyparquet'
import { snappyUncompressor } from 'hysnappy'

await parquetQuery({
  file,
  compressors: {
    SNAPPY: snappyUncompressor(),
  },
})

Alternatively, check out hyparquet-compressors which includes hysnappy decompression.

Development

The build uses clang without emscripten, in order to produce the smallest possible binary.

Run make to build from source. The build process consists of:

  1. Compile from C to wasm using clang.
  2. Encode wasm as base64 to uncompress.wasm.base64 and compress.wasm.base64.
  3. Insert base64 strings into uncompress.js and compress.js for distribution.

WASM Loading

By keeping each wasm file under 4 KB, we can embed it directly in the JavaScript files and load the WASM blob synchronously, which is faster than fetching a separate .wasm file. [web.dev]
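The general technique can be sketched as follows. This is an illustration under stated assumptions, not the code in uncompress.js: a tiny hand-assembled module exporting add(a, b) stands in for the real binary, and the names wasmBytes, wasmBase64, and instantiateSync are hypothetical.

```javascript
// Stand-in wasm binary (assumption: NOT the hysnappy module). It exports
// a single function add(a: i32, b: i32) -> i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic and version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section
  0x03, 0x02, 0x01, 0x00, // function section
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b // code
])

// Build time (Node.js): encode the binary as base64 text so it can be
// pasted into a JavaScript source file as a string.
const wasmBase64 = Buffer.from(wasmBytes).toString('base64')

// Run time: decode the embedded string and compile synchronously.
// Some browsers only permit synchronous compilation of small buffers
// on the main thread, which is why the binary size matters.
function instantiateSync(base64) {
  const binary = atob(base64)
  const bytes = Uint8Array.from(binary, c => c.charCodeAt(0))
  const wasmModule = new WebAssembly.Module(bytes)
  return new WebAssembly.Instance(wasmModule, {})
}

const { add } = instantiateSync(wasmBase64).exports
const sum = add(2, 3)
```

Because no network request or promise is involved, the exported function is usable immediately after the script is evaluated.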

References