Back to <a href="https://www.gaia-gis.it/fossil/librasterlite2/wiki?name=rasterlite2-doc">RasterLite2 doc index</a><hr><br>
<h1>RasterLite2 reference Benchmarks (2019 update)</h1>
<h2>Intended scopes</h2>
In recent years new and innovative <a href="https://en.wikipedia.org/wiki/Lossless_compression">lossless compression algorithms</a> have been developed.<br>
This benchmark is intended to verify, through practical testing, how these new compression methods actually perform under the most common conditions.<br>
More specifically, the relative performance of new and older lossless compression methods will be compared.
<h2>The contenders</h2>
The following <b><i>general purpose</i></b> lossless compression methods will be systematically compared:
<ul>
<li><b>DEFLATE</b>: (aka <b>Zip</b>)<br>
<a href="https://en.wikipedia.org/wiki/DEFLATE">This</a> is the most classic and almost universally adopted lossless compression method.<br>
It was initially introduced nearly 30 years ago (in <b>1991</b>), so it can be considered the venerable dean of them all.</li>
<li><b>LZMA</b>: (aka <b>7-Zip</b>)<br>
<a href="https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Markov_chain_algorithm">This</a> is a well known and widely adopted lossless compression method.<br>
It's younger than DEFLATE, having been introduced about 20 years ago (in <b>1998</b>). LZMA is an extreme interpretation of lossless compression.<br> It's usually able to achieve really impressive compression ratios (far better than DEFLATE can do), but at the cost of severely sacrificing compression speed; LZMA can easily be painfully slow.</li>
<li><b>LZ4</b><br>
<a href="https://en.wikipedia.org/wiki/LZ4_(compression_algorithm)">This</a> is a more modern algorithm having been introduced less than 10 years ago (in <b>2011</b>), so it's diffusion and adoption is still rather limited.<br>
LZ4 too is an extremist interpretation of lossless compression, but it goes exactely in the opposite direction of LZMA.<br>
It's strongly optimized so to be extremely fast, but at the cost of sacrifycing the compression ratios.</li>
<li><b>ZSTD</b> (aka <b>Zstandard</b>)<br>
<a href="https://en.wikipedia.org/wiki/Zstandard">This</a> is a very recently introduced algorithm (<b>2015</b>), and it's adoption is still rather limited.<br>
Curiosly enough, both LZ4 and ZSTD are developed and maintained by the same author (Yann Collet).<br>
ZSTD is a well balenced algorithm pretending to be a most modern replacement for DEFLATE, being able to be faster and/or to achieve better compression ratios.<br>
Just few technical details about the most relevant innovations introduced by ZSTD:
<ul>
<li>The old DEFLATE was designed to require a very limited amount of memory, and this somewhat impaired its efficiency.<br>
Modern hardware can easily provide plenty of memory, so ZSTD borrows a few ideas from LZMA about less constrained and more efficient memory usage.<br>
More specifically, DEFLATE is based on a sliding data window of only <b>32KB</b>; both LZMA and ZSTD adopt far more generous sliding windows, measured in megabytes rather than kilobytes.</li>
<li>Both DEFLATE and ZSTD adopt the classic <a href="https://en.wikipedia.org/wiki/Huffman_coding">Huffman coding</a> for reducing the information entropy.<br>
But ZSTD can also support a further advanced mechanism based on <a href="https://en.wikipedia.org/wiki/Asymmetric_numeral_systems#tANS">Finite State Entropy</a>, a very recent technique that is much faster (a minimal code sketch comparing the one-shot DEFLATE and ZSTD APIs follows below).</li>
</ul></li>
</ul>
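<br>
To make the API-level comparison more concrete, here is a minimal sketch (not taken from the benchmark suite itself) showing one-shot in-memory compression with <b>zlib</b> (DEFLATE) and <b>libzstd</b> (ZSTD); the input buffer and the compression levels are merely illustrative placeholders.
<verbatim>
/* Minimal sketch: one-shot in-memory compression with zlib (DEFLATE)
 * and libzstd (ZSTD).  Assumes the zlib and libzstd development
 * headers are installed; build e.g. with:  gcc sketch.c -lz -lzstd  */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zlib.h>
#include <zstd.h>

int main(void)
{
    const char *src = "some raw uncompressed payload ...";
    uLong src_len = (uLong) strlen(src) + 1;

    /* DEFLATE: compressBound() gives the worst-case output size */
    uLongf z_len = compressBound(src_len);
    Bytef *z_buf = malloc(z_len);
    if (compress2(z_buf, &z_len, (const Bytef *) src, src_len,
                  Z_BEST_COMPRESSION) != Z_OK)
        return EXIT_FAILURE;
    printf("DEFLATE: %lu -> %lu bytes\n", src_len, z_len);

    /* ZSTD: same pattern, the last argument is the compression level */
    size_t zs_cap = ZSTD_compressBound(src_len);
    void *zs_buf = malloc(zs_cap);
    size_t zs_len = ZSTD_compress(zs_buf, zs_cap, src, src_len, 19);
    if (ZSTD_isError(zs_len))
        return EXIT_FAILURE;
    printf("ZSTD:    %lu -> %zu bytes\n", src_len, zs_len);

    free(z_buf);
    free(zs_buf);
    return EXIT_SUCCESS;
}
</verbatim>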
<br>
Whenever possible and appropriate, the following lossless compression methods specifically intended for <b><i>images / rasters</i></b> will be tested as well:
<ul>
<li><b>PNG</b><br>
<a href="https://en.wikipedia.org/wiki/Portable_Network_Graphics">This</a> is a very popular format supporting RGB and Grayscale images (with or without Alpha transparencies).<br>
PNG fully depends on DEFLATE for data compression.</li>
<li><b>CharLS</b><br>
This is an image format (RGB and Grayscale) having limited diffusion but rather popular for storing medical imagery.<br>
CharLS is based on <a href="https://en.wikipedia.org/wiki/Lossless_JPEG">Lossless JPEG</a>, a genuinely lossless image compression scheme
not to be confused with plain JPEG (which is the most classic example of <a href="https://en.wikipedia.org/wiki/Lossy_compression">lossy compression</a>).</li>
<li><b>Jpeg2000</b><br>
<a href="https://en.wikipedia.org/wiki/JPEG_2000">This</a> is intended to be a more advanced replacement for JPEG, but is's not yet so widely supported as its ancestor.<br>
Jpeg2000 is an inherently <b>lossy compression</b>, but under special settings it can effectively support a genuine <b>lossless compression</b> mode.</li>
<li><b>WebP</b><br>
<a href="https://en.wikipedia.org/wiki/WebP">This</a> too is an innovative image format pretending to be a better replacemente for JPEG.<br>
WebP images are expected to support the same visual quality of JPEG but requiring a significantly reduced storage space.<br>
Exactely as Jpeg2000 WebP too is an inherently <b>lossy compression</b>, but under special settings it can effectively support a genuine <b>lossless compression</b> mode.</li>
</ul>
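<br>
As a concrete illustration of such a lossless mode, the following minimal sketch encodes an RGB buffer losslessly using libwebp's simple encoding API; the image buffer and its dimensions are placeholders, not benchmark data.
<verbatim>
/* Minimal sketch: lossless encoding of an RGB buffer with libwebp's
 * "simple" API.  The pixel buffer here is just a dummy placeholder;
 * link with -lwebp.                                                  */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <webp/encode.h>

int main(void)
{
    int width = 256, height = 256;
    int stride = width * 3;                     /* bytes per RGB row */
    uint8_t *rgb = calloc(1, (size_t) stride * height);  /* dummy image */

    uint8_t *webp_data = NULL;
    /* WebPEncodeLosslessRGB() returns the size of the encoded blob,
     * or 0 on failure; the blob itself is allocated by libwebp.     */
    size_t webp_size = WebPEncodeLosslessRGB(rgb, width, height,
                                             stride, &webp_data);
    if (webp_size == 0)
        return EXIT_FAILURE;
    printf("lossless WebP: %zu bytes\n", webp_size);

    WebPFree(webp_data);   /* older libwebp releases used free() here */
    free(rgb);
    return EXIT_SUCCESS;
}
</verbatim>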
<br>
<hr>
<h1>Testing generic datasets</h1>
We'll start by testing several generic datasets, so as to stress all compression methods under the most common conditions.<br>
The same dataset will be compressed and then decompressed using each method, so as to gather information about:
<ul>
<li>the <b>size</b> of the resulting compressed file.<br>
The ratio between the uncompressed and compressed sizes will correspond to the <b>compression ratio</b>.</li>
<li>the <b>time</b> required to <b>compress</b> the original dataset.</li>
<li>the <b>time</b> required to <b>decompress</b> the compressed file so as to recover the initial uncompressed dataset (a minimal measurement sketch follows this list).</li>
</ul>
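<br>
The following minimal sketch shows how these three figures can be gathered for a single method; ZSTD and an in-memory placeholder dataset are used here only as a representative example, and the same pattern can be applied to any of the contenders.
<verbatim>
/* Minimal sketch of the measurements gathered for each method:
 * compressed size (hence the compression ratio), compression time
 * and decompression time.  ZSTD is used only as an example;
 * link with -lzstd.                                                 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <zstd.h>

static double seconds_since(const struct timespec *t0)
{
    struct timespec t1;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0->tv_sec) + (t1.tv_nsec - t0->tv_nsec) / 1e9;
}

int main(void)
{
    /* placeholder dataset: a real benchmark would read it from disk */
    size_t src_len = 64 * 1024 * 1024;
    void *src = calloc(1, src_len);
    size_t dst_cap = ZSTD_compressBound(src_len);
    void *dst = malloc(dst_cap);
    void *back = malloc(src_len);
    struct timespec t0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    size_t dst_len = ZSTD_compress(dst, dst_cap, src, src_len, 3);
    double t_comp = seconds_since(&t0);
    if (ZSTD_isError(dst_len))
        return EXIT_FAILURE;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    size_t out_len = ZSTD_decompress(back, src_len, dst, dst_len);
    double t_decomp = seconds_since(&t0);
    if (ZSTD_isError(out_len) || out_len != src_len)
        return EXIT_FAILURE;

    printf("ratio: %.2f  compress: %.3fs  decompress: %.3fs\n",
           (double) src_len / (double) dst_len, t_comp, t_decomp);
    free(src); free(dst); free(back);
    return EXIT_SUCCESS;
}
</verbatim>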
<br>
<b>Note</b>: compressing is a much harder operation than decompressing, and will always require more time.<br>
The speed differences between the various compression algorithms are strong and well marked when compressing, but the differences in decompression speed (although less impressive) are also worth careful evaluation.
<ul>
<li>for any compression algorithm, being slow (or even very slow) when compressing can be considered a trivial and forgivable issue.<br>
Compression usually happens only once in the lifetime of a compressed dataset, and there are many ways to minimize the adverse effects of intrinsic slowness.<br>
You could e.g. compress your files in batch mode, perhaps during off-peak hours, and in such a scenario reaching stronger compression ratios could easily justify a longer processing time.<br>
Or alternatively you could enable (if possible) a multithreaded compression approach (parallel processing), so as to significantly reduce the required time (see the sketch after this list).</li>
<li>being slow when decompressing is a much more serious issue, because decompression will happen more frequently; very frequently in some specific scenarios.<br>
So a certain degree of slowness in decompression could easily become a serious bottleneck, severely limiting the overall performance of your system.</li>
</ul>
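<br>
As a rough illustration of the parallel-compression option mentioned in the first point above, the following sketch uses ZSTD's advanced API to spread the compression work across several worker threads; it assumes a libzstd build with multithreading enabled, and the dataset is again only a placeholder.
<verbatim>
/* Minimal sketch of multithreaded compression with ZSTD's advanced
 * API: ZSTD_c_nbWorkers spreads the compression work across several
 * threads (requires a libzstd build with multithreading enabled).
 * Decompression remains single-threaded.  Link with -lzstd.         */
#include <stdio.h>
#include <stdlib.h>
#include <zstd.h>

int main(void)
{
    size_t src_len = 64 * 1024 * 1024;          /* placeholder dataset */
    void *src = calloc(1, src_len);
    size_t dst_cap = ZSTD_compressBound(src_len);
    void *dst = malloc(dst_cap);

    ZSTD_CCtx *cctx = ZSTD_createCCtx();
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 19);
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, 4);   /* 4 threads */

    size_t dst_len = ZSTD_compress2(cctx, dst, dst_cap, src, src_len);
    if (ZSTD_isError(dst_len))
        return EXIT_FAILURE;
    printf("compressed %zu -> %zu bytes using 4 worker threads\n",
           src_len, dst_len);

    ZSTD_freeCCtx(cctx);
    free(src);
    free(dst);
    return EXIT_SUCCESS;
}
</verbatim>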

<br>
<hr><br>
Back to <a href="https://www.gaia-gis.it/fossil/librasterlite2/wiki?name=rasterlite2-doc">RasterLite2 doc index</a>