Jump to content
Kev

Zig library for HyperLogLog cardinality estimation

Recommended Posts

LogLog-Beta and More: A New Algorithm for Cardinality Estimation Based on LogLog Counting - by Jason Qin, Denys Kim, Yumei Tung

 

TL;DR: Better than HyperLogLog in approximating the number unique elements in a set

 

LogLog-Beta

LogLog-Beta is a new algorithm for estimating cardinalities based on LogLog counting. The new algorithm uses only one formula and needs no additional bias corrections for the entire range of cardinalities, therefore, it is more efficient and simpler to implement. The simulations show that the accuracy provided by the new algorithm is as good as or better than the accuracy provided by either of HyperLogLog or HyperLogLog++.

 

Example

const std = @import("std");
const HyperLogLog = @import("zig-hyperloglog").DefaultHyperLogLog;
const RndGen = std.rand.DefaultPrng;

var rnd = RndGen.init(0);

pub fn main() !void {
    const count = 1e7;
    const alloc = std.heap.page_allocator;

    var hll = try HyperLogLog().init(alloc);
    defer hll.deinit();

    var i: u64 = 0;
    while (i < count) : (i += 1) {
        const x = rnd.random().int(u64);
        try hll.add_hashed(x);
    }

    const est = hll.cardinality();
    std.debug.print("Estimated cardinality: {d}\n", .{est});
}

 

Source: github.com

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.



×
×
  • Create New...