1.0.0

@jpivarski jpivarski released this 01 Sep 18:11
· 67 commits to master since this release

Histogrammar is a suite of data aggregation primitives designed for use in parallel processing. In the simplest case, you can use it to compute histograms in distributed processing frameworks like Apache Spark, but the generality of the primitives allows much more.
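To illustrate what "aggregation primitive designed for parallel processing" means, here is a minimal, self-contained sketch in plain Scala. It is not Histogrammar's API: the names `Hist`, `fill`, `+`, `empty`, and `fillAll` are hypothetical and chosen only for this example. The idea it shows is that each partition fills its own aggregator and the partial results are then merged with an associative operation, so the order of merging does not matter.

```scala
// Hypothetical sketch, NOT Histogrammar's API: a fixed-range histogram that
// can be filled independently on each data partition and merged afterwards.
final case class Hist(
    low: Double,
    high: Double,
    counts: Vector[Long],
    underflow: Long = 0L,
    overflow: Long = 0L,
    nanflow: Long = 0L
) {
  require(low < high && counts.nonEmpty, "need low < high and at least one bin")

  /** Return a new histogram with one more entry; the receiver is unchanged. */
  def fill(x: Double): Hist =
    if (x.isNaN) copy(nanflow = nanflow + 1)          // NaN gets its own counter
    else if (x < low) copy(underflow = underflow + 1) // includes -Infinity
    else if (x >= high) copy(overflow = overflow + 1) // includes +Infinity
    else {
      // Guard against floating-point rounding pushing the index to counts.length.
      val i = math.min((((x - low) / (high - low)) * counts.length).toInt, counts.length - 1)
      copy(counts = counts.updated(i, counts(i) + 1))
    }

  /** Merge two partial results bin by bin; this operation is associative. */
  def +(that: Hist): Hist = {
    require(
      low == that.low && high == that.high && counts.length == that.counts.length,
      "histograms must have identical binning"
    )
    Hist(
      low,
      high,
      counts.zip(that.counts).map { case (a, b) => a + b },
      underflow + that.underflow,
      overflow + that.overflow,
      nanflow + that.nanflow
    )
  }
}

object Hist {
  /** An empty histogram, which acts as the identity element for `+`. */
  def empty(bins: Int, low: Double, high: Double): Hist =
    Hist(low, high, Vector.fill(bins)(0L))

  /** Fill every value of one partition into `start`. */
  def fillAll(xs: Seq[Double], start: Hist): Hist =
    xs.foldLeft(start)((h, x) => h.fill(x))
}
```

Under these assumptions, filling two halves of a dataset separately with `Hist.fillAll` and merging them with `+` gives the same result as filling the whole dataset in one pass, which is the property a framework like Spark relies on when it combines per-partition results.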

This Scala implementation of Histogrammar adheres to version 1.0 of the specification and has been tested to guarantee compatibility with the Python implementation. The test suite includes empty datasets, NaN/infinity handling, associativity tests, and numerical agreement at the level of one part in a trillion (double precision). Several common histogram types can be plotted in Bokeh with a single method call.

This is the first version to be distributed through Maven Central for easy inclusion in Maven, sbt, and Spark jobs. See http://histogrammar.org/ for more.