Bloom Filter
A space-efficient probabilistic data structure for set-membership testing. Lookups can return false positives but never false negatives.
Usage
from redis_kit import BloomFilter
bf = BloomFilter(conn.sync_client, "emails", expected_items=100_000, false_positive_rate=0.01)
bf.add("alice@example.com")
bf.exists("alice@example.com") # True
bf.exists("unknown@example.com") # False in most cases; True here would be a false positive
Batch Operations
bf.add_many(["a@x.com", "b@x.com", "c@x.com"])
results = bf.exists_many(["a@x.com", "d@x.com"]) # [True, False]
Reset
How It Works
- Uses double hashing (two SHA-256-based hashes combined to derive k bit offsets); FIPS-compatible
- Pipeline-based SETBIT/GETBIT operations for improved performance
- exists_many uses a single pipelined batch check instead of N independent calls
- Automatically calculates the optimal bit array size and hash function count from expected_items and false_positive_rate
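
The sizing and double-hashing steps listed above can be sketched roughly as follows. This is a minimal illustration, not redis_kit's actual implementation: the helper names optimal_params and bit_offsets, the choice to split one SHA-256 digest into two halves, and the rounding rules are all assumptions made for the example.

```python
import hashlib
import math


def optimal_params(expected_items: int, false_positive_rate: float) -> tuple[int, int]:
    """Return (m, k): bit array size and hash count from the standard formulas.

    Hypothetical helper; the library's own rounding may differ.
    """
    n, p = expected_items, false_positive_rate
    # m = -n * ln(p) / (ln 2)^2
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    # k = (m / n) * ln 2, with at least one hash function
    k = max(1, round(m / n * math.log(2)))
    return m, k


def bit_offsets(item: str, m: int, k: int) -> list[int]:
    """Derive k bit offsets by double hashing: (h1 + i * h2) mod m.

    Assumption: h1 and h2 are the two halves of a single SHA-256 digest.
    """
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:16], "big")
    # Force h2 to be odd so it is never zero and the offsets do not collapse.
    h2 = int.from_bytes(digest[16:], "big") | 1
    return [(h1 + i * h2) % m for i in range(k)]


m, k = optimal_params(100_000, 0.01)
offsets = bit_offsets("alice@example.com", m, k)
```

With expected_items=100_000 and false_positive_rate=0.01, these formulas give roughly 958,000 bits (about 117 KiB) and 7 hash functions. In a Redis-backed filter, each offset would correspond to one SETBIT on add or one GETBIT on lookup, queued on a pipeline so the whole item costs a single round trip.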