pwshub.com

'Extremely fast' compression algorithm LZ4 gets even faster

The new version of the high-speed compression algorithm LZ4 gets a big speed boost – nearly an order of magnitude.

LZ4 is one of the faster compression algorithms in Linux, but the newly released LZ4 version 1.10 significantly raises the bar set by its own forerunners. On some hardware, LZ4 1.10 compresses data more than five times, and up to nearly ten times, faster than previous releases by using multiple CPU cores in parallel.

As the release notes explain:

There are multiple compression algorithms in Linux and other FOSS OSes, such as the recently infamous xz. There's no single "best": they are all optimized for different uses, some for big files, some for certain types of data, some for the smallest possible compressed file size, some for the smallest memory usage, and so on. The LZ4 algorithm is one of the speed-optimized ones. Its self-description on GitHub is "Extremely Fast Compression algorithm."

It's been around for a while. As far as the FOSS desk can tell, The Register first mentioned it in 2012, and it was incorporated into the Linux kernel in version 3.11 the following year. Since kernel 3.19, it has been an option for compressing the SquashFS images found on many Linux boot media.

The curious can read a short but dense explanation of how LZ4 works from author Yann Collet, who works for Facebook and is also the creator of Zstd and xxHash. The US sitcom Silicon Valley fictionalized his work via a character called Richard Hendricks.

For a speed-focused compression scheme that's over a decade old, such a big performance jump is unexpected. The new release achieves it by spreading compression over multiple CPU cores, as previously done in the lz4mt C++ implementation. (The author of that variant, Takayuki Matsuoka, contributed multiple changes to the new LZ4 release.)
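The general idea of spreading compression over cores can be sketched in a few lines of Python. This is a minimal illustration, not how LZ4 1.10 or lz4mt actually does it: it uses the standard library's zlib as a stand-in for LZ4, and the chunk size and function names are hypothetical, picked only for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical block size, chosen for illustration only.
CHUNK_SIZE = 4 * 1024 * 1024


def compress_parallel(data: bytes, workers: int = 8) -> list[bytes]:
    """Split data into fixed-size chunks and compress each on a worker thread.

    zlib stands in for LZ4 here; CPython's zlib releases the interpreter
    lock while compressing, so the threads can run on separate cores.
    """
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so blocks stay in sequence.
        return list(pool.map(zlib.compress, chunks))


def decompress_blocks(blocks: list[bytes]) -> bytes:
    """Decompress each independent block and join the results."""
    return b"".join(zlib.decompress(block) for block in blocks)
```

The catch with this simple approach is that each chunk is compressed in isolation, so repeated data cannot be matched across chunk boundaries and the output is usually a little larger than a single-threaded pass would produce.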


LZ4 could already do over half a gigabyte per second on each core, but now, if you have lots of cores to throw at it, it can do substantially more. The table in the announcement shows an AMD 7850HS – an octo-core chip – getting seven to eight times faster, and an Intel i7-9700K, also with eight cores, getting nearly six times as quick.

For us, this release illustrates several important points. First, writing efficient code to exploit the parallelism of multiple processor cores is very hard. The parallelized lz4mt implementation appeared ten years ago, and it's remarkable that it's taken a whole decade for this change to make it into what is a speed-focused algorithm. That, in turn, is why more parts of modern OSes and apps can't and don't make effective use of multiple CPU cores… and that's why the number of cores in desktop CPUs is increasing much more slowly than in server CPUs. More cores can't make a single-threaded process run any more quickly, and in general, most common apps tend to use only a small number of threads. There's still no way to automatically parallelize algorithms – only very smart humans can do that.

As we noted earlier this year when discussing code bloat, the late great Gene Amdahl formalized Amdahl's Law, which notes that the performance gains from making code more parallel are capped by whatever portion must still run serially. In the commonly cited case of code that is 95 percent parallelizable, the speedup tops out at about 20 times, however many processors are added. We also highly recommend "The Future of Microprocessors" talk by Arm co-creator Sophie Wilson, in which she notes that the silicon-chip industry is unique in successfully selling high-volume products where the purchasers can't use most of them. In fact, if it were possible to turn on all of a modern processor die at once, it would burn itself out in seconds.
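The ceiling that Amdahl's Law describes follows from a one-line formula: with a parallelizable fraction p and n cores, the speedup is 1 / ((1 - p) + p / n). The sketch below works that through. The function names are ours, and the 95 percent figure is the usual textbook example rather than a measurement of LZ4.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's Law: speedup = 1 / ((1 - p) + p / n)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def implied_parallel_fraction(speedup: float, cores: int) -> float:
    """Invert Amdahl's Law for p: p = (1 - 1/S) / (1 - 1/n)."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


# With 95 percent of the work parallelizable, adding cores gives
# diminishing returns and the speedup never reaches 20x.
for cores in (8, 64, 1024):
    print(cores, round(amdahl_speedup(0.95, cores), 1))
```

Run the other way, and taking the "nearly six times" on eight cores reported for the i7-9700K as exactly 6, `implied_parallel_fraction(6.0, 8)` comes out at roughly 0.95. That is a back-of-the-envelope reading that ignores memory bandwidth, I/O, and clock-speed effects, so treat it as a rough indication only.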

In the meantime, though, LZ4 1.10 means you can use a bit more occasionally. Alongside LZ4, another thing that made it into the Linux kernel in version 3.11, humorously nicknamed Linux for Workgroups, was zswap, which can compress data before it's swapped out to disk. As we described a couple of years ago, turning on zswap can really help the performance of any Linux box that uses swap heavily. When version 1.10 of LZ4 makes it into the kernel, that will get faster still, but in the meantime, you can easily turn zswap on and enjoy the result today. ®

Source: theregister.com
