Global Data Compression Competition 2021

  • 202,000 EUR total prize fund
  • 12 main competition categories
  • Special category for students
  • 20,000 EUR grand prize for block-test rapid category
  • 5,000 EUR “gap” prize in each of the 12 main categories, awarded when the first-place competitor wins by a sufficiently large margin over the second-place competitor
  • Final submission deadline: November 15, 2021

21 June – 15 November

Ongoing

2021 leaderboards

The leaderboard tables below contain results for contest submissions and selected publicly available compressors. The names of submitted compressors are marked with an asterisk. In each test, the data we use for scoring is private. We provide sample (training) data that is of the same nature as, and similar to, yet different from, the private test set.

The statistics below are for reference purposes only.

Caution on comparability: where possible, we set compressor options to use a single thread; however, some programs might (and some did) use multiple threads. We did not try to fine-tune presets to fit into the speed limits as tightly as possible, so the compressors are not aligned by speed. These results therefore SHOULD NOT be used to draw conclusions like “compressor X is better than compressor Y”.

[Leaderboard tables omitted. Selectors: Test 1; categories: Rapid, Balanced, High compression ratio; dataset: Private.]

Table Notes

  • See the “Ranking” section for rules governing how we order the results.
  • HCR stands for “High Compression Ratio”.
  • “Private” means an entire private dataset (1 GB), “demo” means sample data available to participants.
  • Compressors that came close to fitting the speed limit of a given category, but exceeded it, are placed at the bottom of the corresponding table.
  • All compressors present in the tables were tested under Windows 10 x64 on the hardware described in the Test Hardware section.
  • The size of sample (training) datasets for Test 1, Test 2, Test 3, Test 4 is, respectively: 300,000,000 bytes, 500,004,864 bytes, 500,000,000 bytes, 500,039,680 bytes.
  • The size of private datasets for Test 1, Test 2, Test 3, Test 4 is, respectively: 1,000,000,000 bytes, 1,000,911,424 bytes, 1,000,000,000 bytes, 1,000,013,824 bytes.
  • Reference compressors for Test 4 were tested so that, for every 8K subblock, the entire containing 64K block was decoded to extract that subblock; in effect, the entire dataset was decoded 8 times (see the detailed notes for Zstd and brotli).
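The 8x decoding overhead described in the Test 4 note can be illustrated with a small sketch. This is not the contest harness; zlib stands in for Zstd/brotli, the dataset is a 1 MiB toy buffer, and only the block/subblock sizes (64K and 8K) are taken from the note:

```python
import zlib

BLOCK = 64 * 1024   # independent compression block size (Test 4)
SUB = 8 * 1024      # random-access subblock granularity

def compress_blocks(data):
    """Compress the dataset independently in 64 KiB blocks."""
    return [zlib.compress(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)]

def read_subblock(blocks, index):
    """Extract subblock `index` by decoding its entire containing block."""
    per_block = BLOCK // SUB                            # 8 subblocks per block
    block = zlib.decompress(blocks[index // per_block]) # full 64 KiB decode
    off = (index % per_block) * SUB
    return block[off:off + SUB]

# Toy dataset: 1 MiB = 16 blocks = 128 subblocks.
data = bytes(range(256)) * 4096
blocks = compress_blocks(data)
n_sub = len(data) // SUB
out = b"".join(read_subblock(blocks, i) for i in range(n_sub))
assert out == data

# Each subblock read decodes a full 64 KiB block, so serving every
# subblock once decodes each block 8 times: 8x the dataset size.
decoded_bytes = n_sub * BLOCK
assert decoded_bytes == 8 * len(data)
```

This is why, for the reference compressors, per-subblock random access over 64K blocks is equivalent to decoding the whole dataset 8 times.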

Notes on Compressors

  • mcm 0.84 froze while decoding Test 3 data for both the -t11 and -x11 presets.
  • Zstd was modified for Test 4 to comply with our API: it used the functions ZSTD_createCCtx, ZSTD_compressCCtx, ZSTD_createDCtx, and ZSTD_decompressDCtx from the zstd API; it was compiled with x86_64-w64-mingw32-gcc; and ZSTD_compressCCtx received the number from the preset column as its compression-level argument (see more details here).
  • brotli was modified for Test 4 to comply with our API: it used the functions BrotliEncoderCompress and BrotliDecoderDecompress from the brotli API; it was compiled with x86_64-w64-mingw32-gcc; and BrotliEncoderCompress received the number from the preset column as its quality argument (see more details here).