Skip to content

Latest commit

 

History

History
40 lines (29 loc) · 1.63 KB

readme.org

File metadata and controls

40 lines (29 loc) · 1.63 KB

Suffix array searching

This repo contains two crates, one to speed up binary search in sorted integer arrays, and one to speed up suffix array searching. For now, this is a research project. Code is not directly intended to be used as-is as a dependency in other projects, and documentation is sub-par for that purpose.

static-search-tree: 40x faster binary search

This is explained in detail in my blog post.

It provides a data structure that can answer ‘binary search’ queries much faster, at the cost of 6% space overhead: given a list of n sorted integers and a list of queries, it returns for each query the first number that is at least the queried value.

It is on my to-do list, but with low priority, to make this into a nice standalone crate.

To reproduce experiments and plots, first cd static-search-tree, and create the results and plots directories (the scripts expect them to exist).

  1. Run the experiments: cargo run -r --bin bench -- --release.

    See cargo run -r --bin bench -- --help for more options.

    Note: To use SIMD, make sure to compile for your native CPU, e.g. by adding the following to .cargo/config.toml:

    [build]
    rustflags = ["-C", "target-cpu=native"]
        
  2. To generate plots: python3 ./plot.py. This will error at some point that results-human-release.json isn’t found. To avoid that, comment out the last plot_blog() call.

suffix-array-searching: Searching suffix arrays using static search trees

This is mostly placeholder code and some rough experiments. I may or may not get back to this.