[prev in list] [next in list] [prev in thread] [next in thread] 

List:       subversion-issues
Subject:    [Issue 4013]  merge performance regression in 1.7 when target has
From:       pburba () tigris ! org
Date:       2011-09-21 21:29:44
Message-ID: 20110921212944.E30725400A1 () sc157-tigr ! sjc ! collab ! net
[Download RAW message or body]

http://subversion.tigris.org/issues/show_bug.cgi?id=4013



User pburba changed the following:

                What    |Old value                 |New value
================================================================================
                Priority|P2                        |P1
--------------------------------------------------------------------------------
        Target milestone|---                       |1.7.0
--------------------------------------------------------------------------------




------- Additional comments from pburba@tigris.org Wed Sep 21 14:29:43 -0700 2011 -------
This issue was attributable to the fixes made for issue #3669 on the
^/subversion/issue-3668-3669 branch, which were reintegrated to trunk in
r1035894 and tweaked in r1148436, r1148426, r1148071, r1145653, r1142038,
r1141981, r1141600, r1135982, and r1035894.

Specifically the problems are in the svn_fs_validate_mergeinfo implementations
svn_fs_fs__validate_mergeinfo and svn_fs_base__validate_mergeinfo (I'll refer to
the fsfs implementation here, but the same applies to the bdb implementation):

1) For every revision in the incoming MERGEINFO to
svn_fs_fs__validate_mergeinfo, svn_fs_fs__check_path is called for each source
path associated with that revision.  For the example in desc1 that is 247,653
revisions to be checked and many of them include multiple associated source
paths, so a total of 436,831 calls to svn_fs_fs__check_path are made.

2) This also means a similar number of calls to svn_mergeinfo_merge as the
validated mergeinfo is built up (assuming all the path-revs are valid, the total
might be lower).  Unfortunately since we are iterating over the revisions from a
hash there is no particular order to the revisions.  This means that
svn_mergeinfo_merge's (and in particular svn_rangelist_merge) performance is not
the optimal case where we always merge *younger* revisions (since
svn_rangelist_merge can simply adjust the end of each rangelist in place or
append a new range without re-sorting).

3) Because svn_mergeinfo_merge only makes a shallow copy of its CHANGES
argument, as we build up mergeinfo fragments to merge into the validated
mergeinfo with svn_mergeinfo_parse, we use the scratch pool rather than an
iterpool.  This becomes quite memory intensive when dealing with large amounts
of mergeinfo.

The long and short of this is that performance in real world cases gets abysmal
in a hurry, worse we run out of memory very quickly:  Replicating the merge from
desc1 on a local repository the working set memory climbs to 2GB in less than 90
seconds (real time) and runs out of memory.

There are some changes to mitigate #2 (sort the hash of revs-to-paths and
iterate over it from oldest to youngest) as well as #3 (rev svn_mergeinfo_merge
to use the two-pool paradigm), but solving #1 is not so straightforward.  So
given the lateness of 1.7 and the fact that we've lived with issue #3669 since
1.5, I reverted the issue #3669 fixes (and the related follow-ups) in
http://svn.apache.org/viewvc?view=revision&revision=1173425 and
http://svn.apache.org/viewvc?view=revision&revision=1173639 I'll reopen issue #3669.

The good news is that these reversions fix (this) issue #4013:

[[[
trunk@1173574>timethis svn merge
https://svn.apache.org/repos/asf/subversion/trunk/subversion subversion
-c-1127134 --accept postpone

TimeThis :  Command Line :  svn merge
https://svn.apache.org/repos/asf/subversion/trunk/subversion subversion
-c-1127134 --accept postpone
TimeThis :    Start Time :  Wed Sep 21 14:41:49 2011

--- Reverse-merging r1127134 into 'subversion':
U    subversion\tests\cmdline\tree_conflict_tests.py
U    subversion\libsvn_wc\adm_ops.c
C    subversion\libsvn_client\add.c
--- Recording mergeinfo for reverse merge of r1127134 into 'subversion':
 G   subversion
--- Eliding mergeinfo from 'subversion':
 U   subversion
Summary of conflicts:
  Text conflicts: 1

TimeThis :  Command Line :  svn merge
https://svn.apache.org/repos/asf/subversion/trunk/subversion subversion
-c-1127134 --accept postpone
TimeThis :    Start Time :  Wed Sep 21 14:41:49 2011
TimeThis :      End Time :  Wed Sep 21 14:41:53 2011
TimeThis :  Elapsed Time :  00:00:04.174
]]] - 30,236KB peak working set memory.

Marking target milestone as 1.7.0 and setting the priority to P1 (as this would
present an obvious denial-of-service attack if released).  r1173425 and r1173639
have been nominated for backport to 1.7.x as part of the forthcoming 1.7.0
release.  Once they are backported we can close this issue as fixed.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=463&dsMessageId=2843062

To unsubscribe from this discussion, e-mail: [issues-unsubscribe@subversion.tigris.org].
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic