Release bloomfilter-blocked-0.1.0.0 #770

Open: jorisdral wants to merge 5 commits into `main` from `jdral/bloomfilter-blocked`

jorisdral (Collaborator) commented Jun 30, 2025

No description provided.

@jorisdral jorisdral self-assigned this Jun 30, 2025
@jorisdral jorisdral force-pushed the jdral/bloomfilter-blocked branch 2 times, most recently from 3e2b44d to ecb3fae Compare June 30, 2025 15:45
@jorisdral jorisdral force-pushed the jdral/bloomfilter-blocked branch from ecb3fae to fcacf6d Compare July 1, 2025 13:36
This includes
* A new README
* Move the "examples" and "differences" haddock sections to the top-level module.
* Make examples executable using `cabal-docspec`
* Add `BloomPolicy` and `BloomSize` types for the blocked bloom filter instead
  of reusing the types of the same name from the classic bloom filter. This adds
  a bit of boilerplate, but it makes the documentation clearer because the
  hyperlinks were pointing from the blocked modules to the classic modules
  before.
* Add `read` and `readHashes` to the blocked bloom filter, which the classic
  bloom filter already had implemented.
@jorisdral jorisdral force-pushed the jdral/bloomfilter-blocked branch from fcacf6d to 69db9f8 Compare July 1, 2025 14:01
@jorisdral jorisdral marked this pull request as ready for review July 1, 2025 14:02

dcoutts (Collaborator) left a comment:

🎉

-- The main differences are
--
-- * This package supports bloom filters of arbitrary sizes
-- (not limited to powers of two). Sizes over 2^32 are also supported.

Up to a maximum of 2^41 bits (256 Gbytes).

Or up to a maximum of 2^48 for classic.
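
For reference, the two limits quoted in these comments convert to bytes as sketched below. This is only arithmetic on the numbers above, not code from the package, and all names are made up for illustration. "Gbytes" is read here as binary gigabytes (2^30 bytes).

```haskell
-- Convert the bit limits quoted in the review comments to bytes.
-- All names are illustrative; none come from bloomfilter-blocked.

bitsToBytes :: Integer -> Integer
bitsToBytes bits = bits `div` 8

-- One binary gigabyte (GiB) in bytes.
gib :: Integer
gib = 2 ^ (30 :: Int)

-- Blocked filter limit: 2^41 bits = 2^38 bytes = 256 GiB.
blockedMaxBytes :: Integer
blockedMaxBytes = bitsToBytes (2 ^ (41 :: Int))

-- Classic filter limit: 2^48 bits = 2^45 bytes = 32768 GiB (32 TiB).
classicMaxBytes :: Integer
classicMaxBytes = bitsToBytes (2 ^ (48 :: Int))
```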

Comment on lines 150 to 153
-- * The 'Bloom' and 'MBloom' types are parametrised over a 'Hashable' type
-- class, instead of having a @a -> ['Hash']@ typed field.
-- This separation allows clean de\/serialization of Bloom filters in this
-- package, as the hashing scheme is static.

It's also faster than calling an unknown function for every hash.
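
A minimal sketch of the two designs being contrasted, under stated assumptions: `FieldBloom`, `ClassBloom`, `ToyHashable`, `toyHash` and `Name` are hypothetical names, and the salted FNV-1a hash is a toy stand-in rather than the hashing the package uses. The point is only that the class-based filter value contains no function, so it is plain data that can be written out and read back, and the hash call is resolved through the instance instead of an unknown closure.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word64)

type Hash = Word64

-- Toy stand-in for a real hash: salted FNV-1a over characters.
toyHash :: Word64 -> String -> Hash
toyHash salt = foldl' step (14695981039346656037 `xor` salt)
  where
    step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- Design 1: the filter value carries its own hash function as a field.
data FieldBloom a = FieldBloom
  { fbHashes  :: a -> [Hash]
  , fbNumBits :: Int
  }

-- Design 2: the hashing scheme is fixed by a class instance (hypothetical class).
class ToyHashable a where
  hashSalted :: Word64 -> a -> Hash

newtype Name = Name String

instance ToyHashable Name where
  hashSalted salt (Name s) = toyHash salt s

-- No function inside: the filter description is plain data.
newtype ClassBloom a = ClassBloom { cbNumBits :: Int }

-- Bit positions probed for a key in design 1.
fieldPositions :: FieldBloom a -> a -> [Int]
fieldPositions b x =
  [ fromIntegral (h `mod` fromIntegral (fbNumBits b)) | h <- fbHashes b x ]

-- Bit positions probed for a key in design 2, using k salted hashes.
classPositions :: ToyHashable a => Int -> ClassBloom a -> a -> [Int]
classPositions k b x =
  [ fromIntegral (hashSalted s x `mod` fromIntegral (cbNumBits b))
  | s <- take k [0 ..]
  ]
```

With the same underlying hash, both designs probe the same positions; they differ in where the hash function lives.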

Comment on lines 155 to 156
-- * [@XXH3@ hash](https://xxhash.com/) is used instead of Jenkins'
-- @lookup3@.

'coz it's faster 😁

Comment on lines +71 to +75
data BloomPolicy = BloomPolicy {
       policyBits   :: !Double,
       policyHashes :: !Int
     }
  deriving stock Show

> * Add `BloomPolicy` and `BloomSize` types for the blocked bloom filter instead
>   of reusing the types of the same name from the classic bloom filter. This adds
>   a bit of boilerplate, but it makes the documentation clearer because the
>   hyperlinks were pointing from the blocked modules to the classic modules
>   before.

Hmmm.

If the issue is just the docs, perhaps we should pull these types out into a common module, which could be marked not-home so they get documented as if they live in each module.
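
To make the two fields of the quoted `BloomPolicy` concrete, here is a rough sketch that assumes `policyBits` means bits per entry and `policyHashes` the number of hash functions (an assumption, since the diff hunk does not say). It uses the textbook false positive rate estimate for a classic bloom filter, (1 - e^(-k/c))^k; a blocked filter would come out somewhat higher. `Policy` and `classicFpr` are illustrative names, not part of the package.

```haskell
-- Illustrative stand-in for the record in the diff; not the package's type.
data Policy = Policy
  { bitsPerEntry :: !Double  -- assumed meaning of policyBits
  , numHashes    :: !Int     -- assumed meaning of policyHashes
  } deriving Show

-- Textbook FPR estimate for a classic bloom filter with c bits per entry
-- and k hash functions: (1 - exp (-k / c)) ^ k.
classicFpr :: Policy -> Double
classicFpr (Policy c k) = (1 - exp (negate (fromIntegral k) / c)) ^ k
```

For example, `classicFpr (Policy 10 7)` evaluates to a little under 1%.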


Copyright 2008, 2009, 2010, 2011 Bryan O'Sullivan <[email protected]>.
`bloomfilter-blocked` is a Haskell library providing multiple fast and efficient

Suggest "two" instead of "multiple".

What does "efficient" refer to as distinct from "fast"? Perhaps "space-efficient"?

the examples directory.
* **Blocked** bloom filters, found in the `Data.BloomFilter.Blocked` module: an
implementation that optimises the memory layout of a classic bloom filter for
speed (cheaper CPU cache reads), at the cost of a slightly higher FPR for the

Suggest "fewer" instead of "cheaper".
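
A rough sketch of why a blocked layout leads to fewer cache reads, under stated assumptions: blocks are 512 bits (one 64-byte cache line), one hash picks the block, and 9-bit slices of a second hash pick bits inside it. This is a generic illustration of the blocked idea, not the scheme implemented in `Data.BloomFilter.Blocked`, and every name below is hypothetical.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word64)

-- Assumed block size: 512 bits, i.e. one 64-byte cache line.
blockBits :: Int
blockBits = 512

-- The first hash selects which block a key lives in.
blockIndex :: Int -> Word64 -> Int
blockIndex numBlocks h = fromIntegral (h `mod` fromIntegral numBlocks)

-- The second hash is cut into k slices of 9 bits, each an offset in 0..511.
-- A 64-bit hash yields at most 7 such slices, so this sketch needs k <= 7.
bitOffsets :: Int -> Word64 -> [Int]
bitOffsets k h =
  [ fromIntegral ((h `shiftR` (9 * i)) .&. 511) | i <- [0 .. k - 1] ]

-- Absolute bit positions probed for a key: all fall inside one block,
-- so a lookup touches a single cache line instead of up to k of them.
positions :: Int -> Int -> Word64 -> Word64 -> [Int]
positions numBlocks k h1 h2 =
  [ blockIndex numBlocks h1 * blockBits + o | o <- bitOffsets k h2 ]
```

Confining all probes for a key to one block is also what raises the false positive rate slightly for the same number of bits, since keys that share a block compete for fewer bit positions.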

This library is written by Bryan O'Sullivan, <[email protected]>.
The library is a full rewrite of the [`bloomfilter`][bloomfilter:hackage]
package, originally authored by Bryan O'Sullivan <[email protected]>. The main
differences are:

TODO: benchmarks to show it's faster!


2 participants