Logo

Matviy Ivanov

< Projects View on GitHub

hll-rs

hll-rs is a high-performance, thread-safe implementation of the HyperLogLog algorithm for cardinality estimation.

Why HyperLogLog?

If you need to count the number of unique elements (cardinality) in a data stream of billions of items, a standard [std::collections::HashSet] would consume gigabytes of memory.

HyperLogLog provides a probabilistic estimate with a fixed, tiny memory footprint. For example, with a precision of P=14, you can estimate cardinalities into the billions with an error rate of ~0.8% using only 16 KiB of memory.

Key Features

Quick Start

use hll_rs::Hll;

// Create a new HyperLogLog with precision 14 (16,384 registers)
let hll = Hll::<14>::new();

// Add items from multiple threads if needed
hll.add("rust");
hll.add("programming");
hll.add("rust"); // Duplicate

println!("Estimated unique items: {}", hll.estimate_count());

Mathematical Bounds

Precision (P) Registers (m) Memory Usage Standard Error
10 1,024 1 KiB ~3.25%
14 16,384 16 KiB ~0.81%
16 65,536 64 KiB ~0.40%
18 262,144 256 KiB ~0.20%