
List:       ubuntu-devel
Subject:    Re: zstd compression for packages
From:       Daniel Axtens <daniel.axtens () canonical ! com>
Date:       2018-03-13 1:07:43
Message-ID: CAKjEEsFsRWxJznWLVOHtsp+fP8Ukj+pBBw92bV=6u4BkmsDSfw () mail ! gmail ! com


On Tue, Mar 13, 2018 at 1:43 AM, Balint Reczey <balint.reczey@canonical.com>
wrote:

> Hi Daniel,
> 
> On Mon, Mar 12, 2018 at 2:11 PM, Daniel Axtens
> <daniel.axtens@canonical.com> wrote:
> > Hi,
> > 
> > I looked into compression algorithms a bit in a previous role, and to be
> > honest I'm quite surprised to see zstd proposed for package storage. zstd,
> > according to its own github repo, is "targeting real-time compression
> > scenarios". It's not really designed to be run at its maximum compression
> > level, it's designed to really quickly compress data coming off the wire -
> > things like compressing log files being streamed to a central server, or I
> > guess writing random data to btrfs where speed is absolutely an issue.
> > 
> > Is speed of decompression a big user concern relative to file size? I admit
> > that I am biased - as an Australian and with the crummy internet that my
> > location entails, I'd save much more time if the file was 6% smaller and
> > took 10% longer to decompress than the other way around.
> 
> Yes, decompression speed is a big issue in some cases. Please consider
> the case of provisioning cloud/container instances, where after
> booting the image plenty of packages need to be installed and saving
> seconds matters a lot.
> 
> Zstd format also allows parallel decompression which can make package
> installation even quicker in wall-clock time.
> 
> Internet connection speed increases by ~50% per year on average (according
> to this [3] study, which matches my experience), which is more than 6%
> every two months.
> 
> 
The future is pretty unevenly distributed, and lots of the planet is stuck
on really bad internet still.

AFAICT, [3] is anecdotal rather than a 'study' - it's based on data from one
person living in California. This is not really representative. If we look
at the connection speed visualisation from the Akamai State of the Internet
report [4], it shows that lots and lots of countries - most of the world! -
have significantly slower internet than that person.

(FWIW, anecdotally, I've never had a residential connection get faster
(except when I moved), which is mostly because the speed of ADSL is pretty
much fixed. Anecdotal reports from users in developing countries and in
rural areas of developed countries are not encouraging either: [5].)

Having said that, I'm not unsympathetic to the use case you outline. I am
just saddened to see the trade-offs fall against the interests of people with
worse access to the internet. If I can find you ways of saving at least as
much time without making the files bigger, would you be open to that?
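
A rough way to put numbers on this trade-off is to ask at what link speed
the extra download time equals the install time saved. This is only a
sketch: the 6% size increase and 10% install speed-up are the figures quoted
further down the thread (the speed-up is read here as a 10% cut in install
time, which is an assumption), while the 100 MB download and 60 s install
are made-up values for illustration, and the function name is hypothetical.

```python
def break_even_mbit_per_s(base_bytes, size_increase, install_seconds, install_speedup):
    """Link speed (Mbit/s) at which the extra download time equals the install time saved.

    base_bytes:      download size with the current compression (hypothetical input)
    size_increase:   fractional growth in size, e.g. 0.06 for 6%
    install_seconds: install time with the current compression (hypothetical input)
    install_speedup: fraction of install time saved, e.g. 0.10 for 10% (assumed reading)
    """
    extra_bits = base_bytes * size_increase * 8
    seconds_saved = install_seconds * install_speedup
    if seconds_saved <= 0:
        raise ValueError("install time saved must be positive")
    # On a slower link than this, the larger download costs more time than
    # the faster install gives back.
    return extra_bits / seconds_saved / 1e6


if __name__ == "__main__":
    # Made-up example: 100 MB of packages that take 60 s to install.
    speed = break_even_mbit_per_s(100e6, 0.06, 60.0, 0.10)
    print(f"break-even link speed: {speed:.1f} Mbit/s")
```

With those illustrative inputs the break-even point is about 8 Mbit/s:
anyone on a slower link waits longer overall, anyone on a faster link comes
out ahead.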

Regards,
Daniel

[4] https://www.akamai.com/uk/en/about/our-thinking/state-of-the-internet-report/state-of-the-internet-connectivity-visualization.jsp
[5] https://danluu.com/web-bloat/




> > 
> > Did you consider Google's Brotli?
> 
> We did consider it but it was less promising.
> 
> Cheers,
> Balint
> 
> [3] http://xahlee.info/comp/bandwidth.html
> 
> > 
> > Regards,
> > Daniel
> > 
> > On Mon, Mar 12, 2018 at 9:58 PM, Julian Andres Klode
> > <julian.klode@canonical.com> wrote:
> > > 
> > > On Mon, Mar 12, 2018 at 11:06:11AM +0100, Julian Andres Klode wrote:
> > > > Hey folks,
> > > > 
> > > > We had a coding day in Foundations last week and Balint and Julian added
> > > > support for zstd compression to dpkg [1] and apt [2].
> > > > 
> > > > [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892664
> > > > [2] https://salsa.debian.org/apt-team/apt/merge_requests/8
> > > > 
> > > > Zstd is a compression algorithm developed by Facebook that offers far
> > > > higher decompression speeds than xz or even gzip (at roughly constant
> > > > speed and memory usage across all levels), while offering 19 compression
> > > > levels ranging from roughly comparable to gzip in size (but much faster)
> > > > to 19, which is roughly comparable to xz -6:
> > > > 
> > > > In our configuration, we run zstd at level 19. For bionic main amd64,
> > > > this causes a size increase of about 6%, from roughly 5.6 to 5.9 GB.
> > > > Installs speed up by about 10%, or, if eatmydata is involved, by up to
> > > > 40% - user time generally by about 50%.
> > > > 
> > > > Our implementations for apt and dpkg support multiple frames as used by
> > > > pzstd, so packages can be compressed and decompressed in parallel
> > > > eventually.
> > > 
> > > More links:
> > > 
> > > PPA:
> > > https://launchpad.net/~canonical-foundations/+archive/ubuntu/zstd-archive
> > > APT merge request: https://salsa.debian.org/apt-team/apt/merge_requests/8
> > > dpkg patches:      https://bugs.debian.org/892664
> > > 
> > > I'd also like to talk a bit more about libzstd itself: The package is
> > > currently in universe, but btrfs recently gained support for zstd,
> > > so we already have a copy in the kernel and we need to MIR it anyway
> > > for btrfs-progs.
> > > 
> > > --
> > > debian developer - deb.li/jak | jak-linux.org - free software dev
> > > ubuntu core developer                              i speak de, en
> > > 
> > > --
> 
> 
> --
> Balint Reczey
> Ubuntu & Debian Developer
> 





