List:       gentoo-dev
Subject:    Fwd: Re: [gentoo-dev] [pre-GLEP] Gentoo binary package container format [gentoo@jonesmz.com]
From:       Roy Bamford <neddyseagoon () gentoo ! org>
Date:       2018-11-18 21:10:03
Message-ID: 8wbjQMoEQy/EntGTUihxxc () IqujqQJNQp+Tbney0Ttn8

See attached.

Replying off list because I am not on the whitelist ...

-- 
Regards,

Roy Bamford
(Neddyseagoon) a member of
elections
gentoo-ops
forum-mods

[Attachment #5 (message/rfc822)]



On Sun, Nov 18, 2018 at 5:04 AM Roy Bamford <neddyseagoon@gentoo.org> wrote:

> On 2018.11.18 09:38, Michał Górny wrote:
> > On Sun, 2018-11-18 at 10:16 +0100, Fabian Groffen wrote:
> > > On 17-11-2018 12:21:40 +0100, Michał Górny wrote:
> > > > Problems with the current binary package format
>
> [snip]
>
> > > > 2. **The format relies on obscure compressor feature of ignoring
> > > >    trailing garbage**.  While this behavior is traditionally
> > > >    implemented by many compressors, the original reasons for it
> > > >    have become long irrelevant and it is not surprising that new
> > > >    compressors do not support it.  In particular, Portage already
> > > >    hit this problem twice: once when users replaced bzip2 with
> > > >    parallel-capable pbzip2 implementation [#PBZIP2]_, and the
> > > >    second time when support for zstd compressor was added [#ZSTD]_.
> > >
> > > I think this is actually the result of a rather opportunistic
> > > implementation.  The fault is that we chose to use an extension that
> > > suggests the file is a regular compressed tarball.
> > > When one detects that a file is xpak padded, it is trivial to feed
> > > the decompressor just the relevant part of the datastream.  The
> > > format itself isn't bad, and doesn't rely on obscure behaviour.
> >
> > Except if you don't have the proper tools installed.  In which case
> > the 'opportunistic' behavior made it possible to extract the contents
> > without special tools... except when it actually happens not to work
> > anymore.  Roy's reply indicates that there is actually interest in
> > this design feature.
> >
> [snip]
>
> Team,
>
> I used to post something like https://wiki.gentoo.org/wiki/Fix_My_Gentoo
> with a link to Patrick's binhost on the forums every three or four months.
> It made it worth writing that wiki page anyway.
>
> We still get users removing elements of their toolchain or glibc from time
> to time.  The requirement that I didn't express very well is that it
> shall be possible to install binary packages without the use of any
> Gentoo-specific tooling.
>
> The current tarball of tarballs proposal would satisfy that requirement.
>
> It's unlikely that a custom binary format would.  Of course, this being
> Gentoo, someone would write a run-anywhere script that did the
> unpicking.  We already have deb2targz and rpm2targz.  We have the
> opportunity to design out binpkg2targz before it exists.
>
> --
> Regards,
>
> Roy Bamford
> (Neddyseagoon) a member of
> elections
> gentoo-ops
> forum-mods
>


Replying off list because I am not on the whitelist.

Please also consider my use case:

I have a cluster file system, CephFS, which all of my Gentoo machines mount
for access to various shared file resources.

I want all of them to mount a CephFS path at the directory where Portage
is configured to look for binary packages.

This works great if all of the machines have identical Portage
configurations, but breaks down as soon as one machine uses a different USE
flag.

The reason for this is that the package file names do not encode anything
other than the package name and version number. So if a binpkg already
exists in my binpkg repository, and another machine builds the same package
with different USE flags, the binpkg gets overwritten, potentially while a
third machine is reading the binpkg file.

The filename also does not represent compile-time dependencies, or any
number of other possible points of differentiation.

This issue could be solved, at least partially, in at least three ways:

1) Append a UUID to each filename, generated when the binary package file
is created.
2) Encode the hostname of the machine that generated the file.
3) Encode the USE flags in the filename.

Perhaps a fuller solution is to respect an environment variable,
"BINARY_PKG_FILENAME_FORMAT", that accepts a series of variable
substitutions to append after the package name and version number?

This variable would be used only when generating the binary package.
Portage would still use any binary package that it found that matched its
needs, regardless of suffix.
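
To make the idea concrete, here is a rough sketch in Python of how such a
format string might be expanded and how lookup could ignore the suffix.
Nothing here is existing Portage behaviour: the function names, the
substitution names (UUID, HOSTNAME, USE_HASH), the "~" separator and the
hashing of USE flags are all assumptions made for illustration only.

```python
import hashlib
import socket
import uuid
from string import Template

# Hypothetical separator between "<name>-<version>" and the suffix.
# Assumed here because "~" does not appear in package names or versions.
SEP = "~"


def binpkg_filename(pf, use_flags, fmt="", ext=".tbz2",
                    hostname=None, pkg_uuid=None):
    """Build '<pf>[~<expanded fmt>]<ext>' from a format string.

    fmt plays the role of the proposed BINARY_PKG_FILENAME_FORMAT and may
    reference ${UUID}, ${HOSTNAME} and ${USE_HASH} (illustrative names).
    """
    # Sort the flags so the same set always yields the same hash.
    use_blob = " ".join(sorted(use_flags)).encode()
    values = {
        "UUID": pkg_uuid or uuid.uuid4().hex,
        "HOSTNAME": hostname or socket.gethostname(),
        "USE_HASH": hashlib.sha256(use_blob).hexdigest()[:12],
    }
    suffix = Template(fmt).substitute(values)
    if not suffix:
        return f"{pf}{ext}"  # empty format: today's plain filename
    return f"{pf}{SEP}{suffix}{ext}"


def candidates(pf, filenames, ext=".tbz2"):
    """Return every file for this package-version, whatever its suffix."""
    return [
        name for name in filenames
        if name == pf + ext
        or (name.startswith(pf + SEP) and name.endswith(ext))
    ]
```

With fmt set to "${HOSTNAME}-${USE_HASH}", two machines building foo-1.0
with different USE flags would write two distinct files instead of
overwriting each other, and candidates("foo-1.0", ...) would return both
so that the reader can then pick one by inspecting the package metadata.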

Thanks for your time.


