Industry

Business & Industry Applications

Language

Python
Java
C

Features

High Performance Computing
Big Data
Numerical Computing
Data Mining

Blosc is a high-performance compression library written in C that uses a blocking technique to make compression easier, faster, and more flexible. It splits a dataset into blocks and transparently packs them into compressed containers. Blosc extends standard compression by letting the user condition the data in each block with filters before compression; the filters can be chosen to suit the properties of the dataset. In addition, Blosc provides a variety of codecs to cover different needs (better compression ratio, higher speed, or a balance between the two). Finally, Blosc runs its operations in parallel to take advantage of the many cores in modern CPUs.
Blosc is designed for any application where (binary) data needs to be compressed as fast as possible, so that handling compressed data transparently has minimal impact on performance.
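
The block, filter, codec, and parallelism ideas described above can be illustrated with a minimal sketch in Python. This is not Blosc's implementation or API: it uses only the standard library, with zlib standing in for the codec, a simple byte-shuffle standing in for a filter, and a thread pool standing in for Blosc's internal threading. The function names (`shuffle`, `unshuffle`, `compress_blocks`, `decompress_blocks`) and the 4096-byte block size are assumptions made for illustration.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor


def shuffle(block: bytes, typesize: int) -> bytes:
    """Filter: group the i-th byte of every element together."""
    n = len(block) // typesize
    body = block[: n * typesize]
    shuffled = b"".join(body[i::typesize] for i in range(typesize))
    return shuffled + block[n * typesize:]  # trailing bytes are left as-is


def unshuffle(block: bytes, typesize: int) -> bytes:
    """Inverse of shuffle()."""
    n = len(block) // typesize
    out = bytearray(n * typesize)
    for i in range(typesize):
        out[i::typesize] = block[i * n:(i + 1) * n]
    return bytes(out) + block[n * typesize:]


def compress_blocks(data: bytes, typesize: int, blocksize: int = 4096) -> list:
    """Split data into blocks, filter each one, then compress them in parallel."""
    chunks = [data[o:o + blocksize] for o in range(0, len(data), blocksize)]

    def pack(chunk: bytes) -> bytes:
        return zlib.compress(shuffle(chunk, typesize))  # zlib is a stand-in codec

    with ThreadPoolExecutor() as pool:
        return list(pool.map(pack, chunks))  # map() preserves block order


def decompress_blocks(blocks: list, typesize: int) -> bytes:
    """Decompress and un-filter every block, then stitch them back together."""
    return b"".join(unshuffle(zlib.decompress(b), typesize) for b in blocks)


if __name__ == "__main__":
    # 10,000 little-endian 32-bit integers as sample binary data.
    raw = struct.pack("<10000i", *range(10000))
    packed = compress_blocks(raw, typesize=4)
    assert decompress_blocks(packed, typesize=4) == raw
    print(len(raw), "bytes ->", sum(len(b) for b in packed), "bytes in", len(packed), "blocks")
```

Because each block is filtered and compressed independently, blocks can be processed concurrently and a single block can be decompressed without touching the rest, which is the property the blocking technique relies on.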

Blosc is currently used in many projects, among them PyTables (there is also a stand-alone hdf5-blosc plugin for general HDF5 applications), bcolz, and zarr (via its numcod