[prev in list] [next in list] [prev in thread] [next in thread] 

List:       gentoo-dev
Subject:    Re: [gentoo-dev] [v1.0.2] GLEP 74: Full-tree verification using Manifest files
From:       "Robin H. Johnson" <robbat2 () gentoo ! org>
Date:       2017-10-30 19:56:27
Message-ID: robbat2-20171030T184928-290641978Z () orbis-terrarum ! net
[Download RAW message or body]


On Mon, Oct 30, 2017 at 05:51:36PM +0100, Michał Górny wrote:
...
> Directory tree coverage
> -----------------------
This section should maybe cover OPTIONAL in more detail (see more
below).

> The Manifest files can also specify ``IGNORE`` entries to skip Manifest
> verification of subdirectories and/or files. The package manager can
> support injecting ignore paths to account for additional files created,
> modified or removed by user's processes that would not be ignored
> by existing rules. Files and directories starting with a dot are always
> implicitly ignored. All files that are not ignored must be covered
> by at least one of the Manifests.
(English) There are multiple points in this paragraph, and I missed on first
reading. The package manager part is esp. lost.
|| All files that are not ignored must be covered by at least one of the
|| Manifests. Files may be ignored by several ways:
|| - Files and directories starting with a dot are always implicitly
||   ignored.
|| - The Manifest files can specify ``IGNORE`` entries to skip
||   verification of ubdirectories and/or files.
|| - The package manager can support injecting ignore paths to account for
||   additional files created modified or removed by user's processes that
||   would not be ignored by existing rules.

> File verification
> -----------------
> When verifying a file against the Manifest, the following rules are
> used:
...
> 3. If the file is covered by an entry of the ``OPTIONAL`` type:
>    a. if the file is present, then the verification fails,
>    b. otherwise, the verification succeeds.
Move the OPTIONAL type further up in the verification list maybe? See
the interpretation question below.


> Modern Manifest tags
> --------------------
...
> ``IGNORE <path>``
>   Ignores a subdirectory or file from Manifest checks. If the specified
>   path is present, it and its contents are omitted from the Manifest
>   verification (always pass).
Clarification Needed:
Should subdirectories have a trailing slash in the Manifest or not?
This affects matching of the type.
Case 1.1:
Manifest has 'IGNORE foo'; 'foo' is a file; => ignored.
Case 1.2:
Manifest has 'IGNORE foo'; 'foo' is a directory; => ignored.
Case 2.1:
Manifest has 'IGNORE foo/'; 'foo' is a file; => FAIL
Case 2.2:
Manifest has 'IGNORE foo/'; 'foo' is a directory; => ignored.


> ``OPTIONAL <path>``
>   Specifies a file that does not exist in the distribution but if it
>   did, it would be marked as ``MISC``. In the strict mode, the file
>   must not exist for the verification to pass. The package manager
>   may ignore a stray file matching this entry if operating in non-strict
>   mode.
This has gotten less clear.
Is the following correct interpretation?
if(package manager is strict) then
  Treat the OPTIONAL entry as NOT present in the Manifest.
  This will cause files to be in the present set but not the covered set.
else
  Treat the OPTIONAL entry as 'IGNORE <path>'
endif

> ``DIST <filename> <size> <checksums>…``
>   Specifies a distfile entry used to verify files fetched as part
>   of ``SRC_URI``. The filename must match the filename used to store
>   the fetched file as specified in the PMS [#PMS-FETCH]_. The package
>   manager must reject the fetched file if it fails verification.
>   ``DIST`` entries apply to all packages below the Manifest file
>   specifying them.
Nit: You have used a unicode ellipsis '…' in some places and plain ASCII
ellipsis '...' in others. Stick to ASCII?


> An example Manifest file (informational)
> ----------------------------------------
Can you add an example for OPTIONAL?

> Tree layout restrictions
> ------------------------
> The Gentoo repository does not use symbolic links. Some Gentoo
> repositories do, however. To provide a simple solution for dealing with
> symlinks without having to take care to implement special handling for
> them, the common behavior of implicitly resolving them is used.
> Therefore, symbolic links to files are stored as if they were regular
> files, and symbolic links to directories are followed as if they were
> regular directories.
Clarification: should cross-device symlinks be rejected? (perhaps
implicit, but wanted to check)
If so, need to add to 'Algorithm for full-tree verification' section.

> Dotfiles are implicitly ignored as that is a common notion used
> in software written for POSIX systems. All other filenames require
> explicit ``IGNORE`` lines.
This paragraph should re-iterate that the package manager may specify
additional files to be ignored per the user.

> The algorithm is restricted to work on a single filesystem. This is
> mostly relevant when scanning for top-level Manifest — we do not want
> to cross filesystem boundaries then. However, to ensure consistent
> bidirectional behavior we need to also ban them when operating downwards
> the tree.
> 
> The directories and files on different filesystems needs to be ignored
> explicitly as implicitly skipping them would cause confusion.
> In particular, tools might then claim that a file does not exist when
> it clearly does because it was skipped due to filesystem boundaries.
The downward path needs to check the device on files.
Otherwise:
cat/pn/Manifest
cat/pn/files/ <-- different filesystem here

> Non-obligatory Manifest verification
> ------------------------------------
...
> The traditional ``MISC`` type is amended with a complementary
> ``OPTIONAL`` tag to account for files that are not provided
> in the specific repository. It aims to ensure that the same path would
> be non-fatal when provided by the repository but fatal when created
> by the user tooling.
Clarify the last sentence to be for strict mode only?

> Splitting distfile checksums from file checksums
> ------------------------------------------------
> 
> Another problem with the current Manifest format is that the checksums
> for fetched files are combined with checksums for local files
> in a single file inside the package directory. It has been specifically
> pointed out that:
> 
> - since distfiles are sometimes reused across different packages,
>   the repeating checksums are redundant,
> 
> - mirror admins were interested in the possibility of verifying all
>   the distfiles with a single tool.
> 
> This specification does not provide a clean solution to this problem.
> It technically permits moving ``DIST`` entries to higher-level Manifests
> but the usefulness of such a solution is doubtful.
Clarification of validity:
If cat/pn1 and cat/pn2 share 1000 DIST files; would it be valid to
have: the following:
cat/pn1/Manifest:MANIFEST ../Manifest.some-shared-name 1234 ...
cat/pn1/Manifest:DIST unique-pn1-dist.tgz 1234 ...
cat/pn2/Manifest:MANIFEST ../Manifest.some-shared-name 1234 ...
cat/pn2/Manifest:DIST unique-pn2-dist.tgz 1234 ...

> Performance considerations
> --------------------------
...
> To improve speed on I/O and/or CPU-restrained systems even further,
> the algorithms can be easily extended to perform incremental
> verification. Given that rsync does not preserve mtimes by default,
> the tool can take advantage of mtime and Manifest comparisons to recheck
> only the parts of the repository that have changed.
Implementation note, not for GLEP addition:
If the package manager caches by filename,inode,mtime locally, it can
then avoid repeat-checking of the hashes (it only needs a stat),
provided that it is happy there was no local attacker who might perform
an in-place modification of a file (mtime&inode remain the same).

> 
> Furthermore, the package manager implementations can restrict checking
> only to the parts of the repository that are actually being used.
> 
> 
> Backwards Compatibility
> =======================
> 
> This GLEP provides optional means of preserving backwards compatibility.
> To preserve the backwards compatibility, the following needs to be
> ensured:
"package directory" is common to all of the items here, if you move that
to the list preamble, it's a lot cleaner to read.
|| To preserve the backwards compatibility, the following needs to be
|| ensured about package directories:
And cleanup the list:
s/in(side)? (that|the) package directory//
s/of a package directory//

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Asst. Treasurer
E-Mail   : robbat2@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

["signature.asc" (application/pgp-signature)]

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic