List:       subversion-issues
Subject:    [Issue 3719] New - Extremely slow checkout on Windows
From:       yogurt2 () tigris ! org
Date:       2010-09-23 20:02:21
Message-ID: iz3719 () subversion ! tigris ! org

http://subversion.tigris.org/issues/show_bug.cgi?id=3719
                 Issue #|3719
                 Summary|Extremely slow checkout on Windows
               Component|subversion
                 Version|1.6.x
                Platform|PC
                     URL|
              OS/Version|Windows Vista
                  Status|NEW
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|ENHANCEMENT
                Priority|P3
            Subcomponent|unknown
             Assigned to|issues@subversion
             Reported by|yogurt2

------- Additional comments from yogurt2@tigris.org Thu Sep 23 13:02:20 -0700 2010 -------
Our repo has 31k files, a total of 120 MB. A Linux checkout takes 2 minutes; a
Windows checkout takes 37 minutes (on an only slightly slower machine).

While checking out on Windows (Windows 7 x64, svn 1.6.12, NTFS), svn uses 50%
CPU (that is, one of the two cores). When a large file is being downloaded, the
load drops. This clearly indicates a per-file problem (rather than
throughput, a bandwidth limit, some kind of conversion [we even ran a test to
check whether the LF -> CR/LF conversion takes too much time], etc.).

Now I've fired up ProcMon from SysInternals. Here are some bottlenecks I've found:

1. Anytime an "entries" is read, I see the following sequence: Open, Read 80
bytes, Close, Open, Read 80 bytes, Close, Open, Read whole file, Close. What is
the reason behind this?

2. I've also found that the same "entries" file is being read several times (in
the above way) consecutively, without any writes to that file, without any other
operations between the two queries. So O, R80, C, O, R80, C, O, Rall, C, O, R80,
C, O, R80, C, O, Rall, C, etc.

3. In some directories I see a loop. Svn tries to create a file "tempfile.tmp"
and gets a NAME COLLISION result. It then tries "tempfile.2.tmp" with the same
result, and so on, sometimes going as far as "tempfile.340.tmp". It seems a
DeleteFile call is missing for the temporaries. But why not use the
GetTempFileName function anyway?

4. When a large file is being checked out, I see the following sequence:
Write 4k from offs 0,
Write 4k from offs 4k,
Write 4k from offs 8k,
Read 16k from offs 16k, <- Why?
Write 4k from offs 12k,
Write 4k from offs 16k,
etc.
It also suggests that either the TCP packet size is set to 4096 bytes, or the
file buffer size in svn is set to this silly small value.

5. It seems that the "entries" files in the whole directory tree are checked for
each repository file. Say file "root/dirA/dirB/fileC" is processed, and both dirA
and dirB are already created. svn checks "root/entries", "root/dirA/entries",
"root/dirA/dirB/entries", deals with the file, then checks (reads!)
"root/dirA/dirB/entries" again, then "root/dirA/entries" and "root/entries".

All in all: Most of the lost time is spent on the "entries" files.
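
To make the point about repeated "entries" reads concrete, here is a minimal
sketch in Python (not svn's actual C code) of caching a file's contents and
re-reading it only when its size or modification time changes. The class name
EntriesCache and the disk_reads counter are made up for illustration, and
whether such a cache would be safe alongside svn's working-copy locking is an
assumption, not something verified here.

```python
import os


class EntriesCache:
    """Hypothetical cache: file contents keyed by path, revalidated by stat."""

    def __init__(self):
        self._cache = {}      # path -> ((mtime_ns, size), contents)
        self.disk_reads = 0   # counts real open/read/close cycles

    def read(self, path):
        st = os.stat(path)
        stamp = (st.st_mtime_ns, st.st_size)
        hit = self._cache.get(path)
        if hit is not None and hit[0] == stamp:
            # Unchanged since the last read: skip open/read/close entirely.
            return hit[1]
        with open(path, "rb") as f:
            data = f.read()   # one read of the whole file, no 80-byte probes
        self.disk_reads += 1
        self._cache[path] = (stamp, data)
        return data
```

With something along these lines, consecutive lookups of an unchanged file
cost one stat call each instead of three open/read/close cycles. A stat-based
check can miss a same-size rewrite within the timestamp resolution, so a real
implementation would need a stronger invalidation rule.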

I'm willing to run any tests or test builds sent to me.
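
As an illustration of what point 3 is asking for, here is a minimal sketch in
Python (again, not svn's code) that lets the runtime pick a unique temporary
name instead of probing "tempfile.N.tmp" one by one, and that always removes
the temporary on failure. The function name write_via_unique_temp is
hypothetical; tempfile.mkstemp is used here as a rough Python counterpart to
the idea behind GetTempFileName.

```python
import os
import tempfile


def write_via_unique_temp(directory, final_name, data):
    """Write data to directory/final_name through a uniquely named temp file."""
    # mkstemp chooses an unused random name and creates the file atomically,
    # so there is no sequential name-collision loop.
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp", prefix="tempfile.",
                                    dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        final_path = os.path.join(directory, final_name)
        # Move the finished file into place; the temp name disappears here.
        os.replace(tmp_path, final_path)
        return final_path
    except BaseException:
        # Never leave a stale temporary behind for later calls to trip over.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Because the temporary is either renamed or deleted on every path, repeated
calls in the same directory should not accumulate leftover ".tmp" files.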

PS: The OS field of the issue form should now include Windows 7 as well...

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=463&dsMessageId=2663810
