
List:       subversion-issues
Subject:    [Issue 4554] New - Wrong file length with PLAIN representations in FSFS
From:       sf () tigris ! org
Date:       2015-01-27 1:18:11
Message-ID: iz4554 () subversion ! tigris ! org

http://subversion.tigris.org/issues/show_bug.cgi?id=4554
                 Issue #|4554
                 Summary|Wrong file length with PLAIN representations in FSFS
               Component|subversion
                 Version|1.8.x
                Platform|All
                     URL|
              OS/Version|All
                  Status|NEW
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|DEFECT
                Priority|P1
            Subcomponent|libsvn_fs_fs
             Assigned to|sf
             Reported by|sf

------- Additional comments from sf@tigris.org Mon Jan 26 17:18:11 -0800 2015 -------
FSFS allows "PLAIN" (i.e. non-deltified, non-compressed) representations
to store the content length as 0 when it matches the on-disk size.  Up to
and including 1.8.x, there is no restriction on which kind of data (file
contents, properties or directories) may use that omission.

In practice, however, it can be difficult to decide whether a 0 value
represents an omission or an actually empty file: a self-deltified empty
file has a length of 0 but a 4-byte on-disk size.  When representations
are read, their header tells us whether it is indeed a PLAIN or a DELTA
representation, and that is enough to resolve any ambiguity.
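
A minimal sketch of that disambiguation, in Python rather than the C of
libsvn_fs_fs.  The Rep record and expanded_length() are hypothetical
names for illustration only; the sketch assumes the rep header has
already been read and is not the FSFS implementation.

```python
from dataclasses import dataclass


@dataclass
class Rep:
    kind: str    # "PLAIN" or "DELTA", taken from the rep header
    size: int    # on-disk size in bytes
    length: int  # stored expanded length; 0 may mean "omitted"


def expanded_length(rep: Rep) -> int:
    """Resolve a stored length of 0 using the representation type."""
    if rep.length != 0:
        return rep.length  # explicit value, nothing to resolve
    if rep.kind == "PLAIN":
        return rep.size    # omitted: PLAIN data is stored verbatim
    return 0               # DELTA with length 0: the contents are empty


# Both reps occupy 4 bytes on disk and store a length of 0.
empty_delta = Rep(kind="DELTA", size=4, length=0)
plain_end = Rep(kind="PLAIN", size=4, length=0)
```

Without the header, the two example reps are indistinguishable by their
size and length fields alone.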

The problem occurs when we omit the length value for file contents and
call svn_fs_file_length() on that file.  FSFS will report the length as 0,
and that causes e.g. 'svnadmin dump' to write broken dump files where the
skipped / empty contents do not match the checksum.
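
To illustrate why such a dump is broken, here is a small sketch.  The
function name is hypothetical, and MD5 is used only as an example of a
recorded checksum; this is not the svnadmin code.

```python
import hashlib


def dump_payload_matches(contents: bytes, reported_length: int,
                         recorded_md5: str) -> bool:
    """Check a payload truncated to the reported length against the checksum."""
    payload = contents[:reported_length]  # a length of 0 yields no data
    return hashlib.md5(payload).hexdigest() == recorded_md5


# Checksum recorded for the real contents of the file.
recorded = hashlib.md5(b"END\n").hexdigest()
```

With a reported length of 0 the payload is empty, so its checksum can
only match if the file really is empty.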

Up to 1.7.x, we used this omission rule only for hash data, i.e. props
and directories, never for file contents.  Thus, there is no problem
with these Subversion releases.  Starting with 1.9.0, the "structure"
document explicitly restricts the omission to property reps.  Furthermore,
1.9+ will not omit length values at all.

Third-party implementations like SVNKit (?) may have produced instances
of omitted length values for file contents, though.  We need to handle
those correctly and extend the API implementation accordingly.

Moreover, 1.8.x generalized the rep sharing mechanism.  If a file's contents
happened to match a property representation, e.g. "END\n", it would now
use the property representation.  The latter is PLAIN by default and
stores a 0 length value in the rep cache.  Hence, the file contents rep
will also report a 0 length.
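
A toy model of that effect, assuming a cache keyed by the SHA-1 of the
contents.  The dictionary and store_rep() are hypothetical stand-ins
for the real rep cache, which is not modelled here.

```python
import hashlib

# SHA-1 hex digest -> (size, length) as recorded when the rep was first written
rep_cache = {}


def store_rep(data: bytes, omit_length: bool):
    """Return the (size, length) pair the new contents end up with."""
    key = hashlib.sha1(data).hexdigest()
    if key in rep_cache:
        # Rep sharing: reuse the existing rep, including its length field.
        return rep_cache[key]
    entry = (len(data), 0 if omit_length else len(data))
    rep_cache[key] = entry
    return entry


# A PLAIN property rep is written first, with its length omitted.
props_entry = store_rep(b"END\n", omit_length=True)
# File contents with the same bytes are then mapped onto that rep.
file_entry = store_rep(b"END\n", omit_length=False)
```

Here file_entry carries the 0 length inherited from the property rep,
even though the file contents are 4 bytes long.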

Reproduction sketch (requires 1.8.x):

* Create repo. Keep rep sharing on and prop deltification off.
* Add empty file, set prop on file and commit.
* Remove prop on file and commit.
* Set file contents to "END\n" and commit.
* Run 'svn ls -v' on the parent folder => file length is shown as 0.
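
The steps above could be scripted roughly as follows.  This is an
untested sketch: it assumes 1.8.x 'svn' and 'svnadmin' binaries on the
PATH and a repository whose fsfs.conf keeps rep sharing enabled and
property deltification disabled.  The directory layout, the file name
and the property name "demo:prop" are made up for illustration.

```python
import subprocess
from pathlib import Path


def reproduction_steps(base):
    """Return the steps as ("run", argv) or ("write", path, data) tuples."""
    base = Path(base)
    repo, wc = base / "repo", base / "wc"
    target = wc / "file"
    url = repo.resolve().as_uri()
    return [
        ("run", ["svnadmin", "create", str(repo)]),
        ("run", ["svn", "checkout", url, str(wc)]),
        ("write", target, b""),  # empty file
        ("run", ["svn", "add", str(target)]),
        ("run", ["svn", "propset", "demo:prop", "value", str(target)]),
        ("run", ["svn", "commit", "-m", "add empty file with prop", str(wc)]),
        ("run", ["svn", "propdel", "demo:prop", str(target)]),
        ("run", ["svn", "commit", "-m", "remove prop", str(wc)]),
        ("write", target, b"END\n"),
        ("run", ["svn", "commit", "-m", "set contents to END", str(wc)]),
        ("run", ["svn", "ls", "-v", url]),  # affected versions list length 0
    ]


def execute(steps):
    """Carry out the steps in order, stopping at the first failing command."""
    for step in steps:
        if step[0] == "run":
            subprocess.check_call(step[1])
        else:
            _, path, data = step
            Path(path).write_bytes(data)
```

Calling execute(reproduction_steps(some_empty_directory)) would run the
whole sequence; the last command lists the repository root, i.e. the
parent folder of the file.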

The following things need to be fixed:

* Don't omit the length value - even for properties.
  That prevents new instances caused by incoming data.
  (already fixed in 1.9; fix for 1.8 still needed.)
* Update the "structure" document with info on when the omission is safe.

* Compare the size and length values of the rep returned by the rep cache
  with the data of the new rep.  Only replace new with old if those match.
  This prevents new instances caused by the rep cache.
* Fix svn_fs_fs__file_length to return 0 lengths only for files that
  are known empty.  This will be the actual bug fix.
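
The rep cache check from the list above might look like this in
outline.  RepInfo and pick_rep() are hypothetical names, and the
"location" field is only a label that tells the two reps apart in the
example; the actual fix may be structured differently.

```python
from typing import NamedTuple, Optional


class RepInfo(NamedTuple):
    location: str  # illustrative label for where the rep is stored
    size: int      # on-disk size in bytes
    length: int    # stored expanded length


def pick_rep(new: RepInfo, cached: Optional[RepInfo]) -> RepInfo:
    """Share the cached rep only if its size and length match the new rep."""
    if (cached is not None
            and cached.size == new.size
            and cached.length == new.length):
        return cached
    return new  # mismatch or cache miss: keep the freshly written rep


new_file_rep = RepInfo(location="new", size=4, length=4)
cached_prop_rep = RepInfo(location="cached", size=4, length=0)
```

In this example the cached property rep has an omitted (0) length, so
pick_rep() keeps the new rep instead of sharing it.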

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=463&dsMessageId=3094784

To unsubscribe from this discussion, e-mail: [issues-unsubscribe@subversion.tigris.org].