List: mercurial
Subject: SV: Slow push of large file over HTTP
From: Michael_Tjørnemark <mtj () pfa ! dk>
Date: 2012-04-26 11:51:17
Message-ID: E2CD1466E9927D4EA97501ACD43BE51B204CD9F7DA () PFANPEX01 ! pfa ! dk
> Michael Tjørnemark <mtj@pfa.dk> writes:
>
> Hi there :)
>
> > I have a repository with a single changeset which adds a single 60 MB
> > file (zip-file). Pushing this repo over HTTP is much, much slower than
> > other commands on the repository, including a similar pull - is this
> > to be expected? I have recreated the problem on other machines and
> > files as well, so it seems to be a general problem with pushing a
> > large(ish) file.
> >
> > Times (all on my local machine):
> > Commit file - 4 secs
> > Push to empty repo using filesystem - 7 secs
> > Clone from repo over HTTP - 23 secs
> > Pull to empty repo over HTTP - 23 secs
> > Push to empty repo over HTTP - 4 mins <-- SLOW
> >
> > Command to serve empty repo:
> > hg serve --config web.allow_push=* --config web.push_ssl=false
> >
> > Command to push to empty repo:
> > hg push http://localhost:8000/ --debug --time
> >
> > In the debug output (see below), bundling and sending takes around 10
> > secs. Then there is an almost 4 min pause between "sending:
> > 60342/120684 kb (50.00%)" and "remote: adding changesets". (Also it
> > seems wrong that sending only goes to 50%, but that is another
> > problem).
>
> You've stumbled upon a weird corner case in Python's HTTP library. If the server
> asks for authentication, then we'll only see this *after* pushing the entire
> changegroup to the server! We'll then have to start over. The progress code
> anticipates this and claims that you need to send 120 MB for the 60 MB push so that
> the progress bar will go smoothly from 0% to 100%. Here the push is really finished
> when it reaches 50%.
Yeah, I'm not too worried about that, but thanks for the explanation.
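Just to check my understanding of the 50% figure, here is the arithmetic as a small Python sketch. It only illustrates the doubling described above and is not Mercurial's actual progress code; the function name and the `expect_retry` parameter are made up for the example.

```python
def reported_progress(sent_kb, payload_kb, expect_retry=True):
    """Percentage a progress bar would show if the total is doubled
    to leave room for resending everything after an auth challenge."""
    total_kb = payload_kb * 2 if expect_retry else payload_kb
    return 100.0 * sent_kb / total_kb


# Figures from the debug output: 60342 kb sent out of a claimed 120684 kb.
print("%.2f%%" % reported_progress(60342, 60342))  # 50.00%
```

So when no authentication is requested, the whole payload has been sent by the time the bar shows 50%.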
> I'm unsure why it then stops for 4 minutes -- I would not expect that.
>
> > I understand that the largefiles extension might help, but this
> > requires that everybody that uses the repo enables the extension, so I
> > would rather avoid that. And also everything else is fast (commit,
> > clone, pull), so it seems as if something is wrong with push over
> > HTTP.
>
> Versioning zip files is... unusual :) Every revision of the zip file will take up a
> lot of new space since it can't be delta compressed much against the previous
> version. So after 10 edits to the file, you could end up with a repo with maybe 400
> MB of history for that single file.
> The largefiles extension sounds like just what you need -- Unity is actually
> using it for versioning a lot of zip files.
Yes, this is not really what I am trying to do; it was just a simple way to recreate
the core problem with pushing large files over HTTP. The real use case that started my
investigation was initializing a new repository with around 6000 files, 150 MB total
and of varying size - the largest 33 MB, and about 15 of them > 1 MB - which took
more than 15 minutes to push (on a slow machine).
I have now recreated the problem with 5 xml files of 16 MB each (a more reasonable
example than a single zip): pulling over HTTP takes 8 secs while pushing takes
around 1 minute - so, as in the original example, a pull-to-push ratio of almost
1:10. A test of a repository with 4500 small files (7.5 MB total) takes around
9 secs for both pull and push, so it seems to be a problem with large files.
We can live with the current performance since it only has to be done once and we are
used to slow ClearCase performance, so I don't need a definitive answer. I just think
that something is wrong when push is that much slower than pull - I would think the
times should be comparable, so 10 times slower just seems weird. It should be easy to
recreate the problem anywhere (I have only tried it on Windows though) by serving an
empty repository, pushing a single changeset to it with one or more large files
added, and comparing that time with a pull from the same repository.
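
In case it helps anyone recreate this, the steps above could be scripted roughly as follows. This is an untested sketch rather than the exact procedure I used: the file name `big.bin`, the directory names, the commit user and the helper functions are all made up, random bytes stand in for the zip file, and stopping the server through the pid file is an assumption that may behave differently on Windows.

```python
import os
import signal
import subprocess
import tempfile
import time


def hg(*args):
    """Build an hg command line as a list (nothing is executed here)."""
    return ["hg"] + list(args)


def serve_command(repo, port, pid_file):
    """Serve repo over plain HTTP with push allowed, as in the original report."""
    return hg("serve", "-R", repo, "-p", str(port), "-d", "--pid-file", pid_file,
              "--config", "web.allow_push=*", "--config", "web.push_ssl=false")


def timed(cmd):
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.time()
    subprocess.check_call(cmd)
    return time.time() - start


def main(size_mb=60, port=8000):
    work = tempfile.mkdtemp()
    src, served, dst = [os.path.join(work, name) for name in ("src", "served", "dst")]
    for repo in (src, served, dst):
        subprocess.check_call(hg("init", repo))

    # Random bytes stand in for a zip file: large and hard to compress.
    with open(os.path.join(src, "big.bin"), "wb") as f:
        f.write(os.urandom(size_mb * 1024 * 1024))
    subprocess.check_call(
        hg("-R", src, "commit", "-A", "-m", "add large file", "-u", "test"))

    # Start the (initially empty) target repository as a background server.
    pid_file = os.path.join(work, "hg.pid")
    subprocess.check_call(serve_command(served, port, pid_file))
    url = "http://localhost:%d/" % port
    try:
        push_secs = timed(hg("-R", src, "push", url))   # push into the empty repo
        pull_secs = timed(hg("-R", dst, "pull", url))   # pull the same changeset back
    finally:
        with open(pid_file) as f:
            os.kill(int(f.read().strip()), signal.SIGTERM)
    print("push: %.1f s, pull: %.1f s" % (push_secs, pull_secs))
```

Calling `main()` with `hg` on the PATH should print the two times side by side; `size_mb` can be raised or lowered to see how the gap changes with file size.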