List:       git
Subject:    Re: [PATCH] Enable index-pack threading in msysgit.
From:       Karsten Blees <karsten.blees () gmail ! com>
Date:       2014-03-21 20:01:45
Message-ID: 532C9AA9.1010102 () gmail ! com

On 20.03.2014 22:56, Stefan Zager wrote:
> On Thu, Mar 20, 2014 at 2:35 PM, Karsten Blees <karsten.blees@gmail.com> wrote:
> > On 20.03.2014 17:08, Stefan Zager wrote:
> > 
> > > Going forward, there is still a lot of performance that gets left on
> > > the table when you rule out threaded file access.  There are not so
> > > many calls to read, mmap, and pread in the code; it should be possible
> > > to rationalize them and make them thread-safe -- at least, thread-safe
> > > for posix-compliant systems and msysgit, which covers the great
> > > majority of git users, I would hope.
> > > 
> > 
> > IMO a "mostly" XSI compliant pread (or even the git_pread() emulation)
> > is still better than forbidding the use of read() entirely. Switching
> > from read to pread everywhere requires that all callers have to keep
> > track of the file position, which means a _lot_ of code changes
> > (read/xread/strbuf_read is used in ~70 places throughout git). And how
> > do you plan to deal with platforms that don't have a thread-safe pread
> > (HP, Cygwin)?
> > Considering all that, Duy's solution of opening separate file
> > descriptors per thread seems to be the best pattern for future
> > multi-threaded work.
> 
> Does that mean you would endorse the (N threads) * (M pack files)
> approach to threading checkout and status?  That seems kind of
> crazy-town to me.  Not to mention that pack windows are not shared, so
> this approach to multi-threading can have the side-effect of blowing
> out memory consumption.  We have already had to dial back settings for
> pack.threads and core.deltaBaseCacheLimit, because threaded index-pack
> was causing OOM errors on 32-bit platforms.
> 

Opening more file descriptors doesn't significantly increase the memory
footprint, so it shouldn't matter whether the threads read data via
shared or private descriptors.
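
For illustration, the private-descriptor pattern could look roughly like
the sketch below. It is not code from git or from Duy's patch: the names
read_slice, reader_task and private_fd_demo are hypothetical, and the
temporary file merely stands in for a pack file. Each thread opens the
file itself, so the file position is per thread and plain lseek() plus
read() is enough; no pread() is needed.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NUM_THREADS 4
#define CHUNK 16

struct reader_task {
	const char *path; /* file that every thread opens for itself */
	off_t offset;     /* start of this thread's slice */
	char buf[CHUNK];  /* bytes this thread read */
	ssize_t nread;    /* result of read(), -1 on error */
};

/* Thread body: open a private descriptor, seek, read, close. */
static void *read_slice(void *arg)
{
	struct reader_task *t = arg;
	int fd;

	assert(t != NULL);
	t->nread = -1;
	fd = open(t->path, O_RDONLY);
	if (fd < 0)
		return NULL;
	/* The file position belongs to this descriptor only, so
	 * lseek() + read() cannot race with the other threads. */
	if (lseek(fd, t->offset, SEEK_SET) != (off_t)-1)
		t->nread = read(fd, t->buf, CHUNK);
	close(fd);
	return NULL;
}

/* Hypothetical demo: returns 0 if every thread saw exactly its slice. */
int private_fd_demo(void)
{
	char path[] = "/tmp/private_fd_demo_XXXXXX";
	char data[NUM_THREADS * CHUNK];
	struct reader_task tasks[NUM_THREADS];
	pthread_t threads[NUM_THREADS];
	int i, j, started = 0, ret = 0;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	/* Slice i is filled with the letter 'A' + i. */
	for (i = 0; i < NUM_THREADS; i++)
		memset(data + i * CHUNK, 'A' + i, CHUNK);
	if (write(fd, data, sizeof(data)) != (ssize_t)sizeof(data))
		ret = -1;
	close(fd);

	for (i = 0; ret == 0 && i < NUM_THREADS; i++) {
		tasks[i].path = path;
		tasks[i].offset = (off_t)i * CHUNK;
		if (pthread_create(&threads[i], NULL, read_slice, &tasks[i]))
			ret = -1;
		else
			started++;
	}
	for (i = 0; i < started; i++) {
		pthread_join(threads[i], NULL);
		if (tasks[i].nread != CHUNK)
			ret = -1;
		for (j = 0; ret == 0 && j < CHUNK; j++)
			if (tasks[i].buf[j] != 'A' + i)
				ret = -1;
	}
	unlink(path);
	return ret;
}
```

Calling private_fd_demo() from a small main() and checking for a zero
return is enough to try it out. The extra cost per thread is one
descriptor and one open()/close() pair; the read buffers would be needed
with a shared descriptor as well.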

git-status with core.preloadindex is already multithreaded (at least the
first part), and AFAIK doesn't read pack files at all.

I'm still not convinced that multi-threaded git-checkout is a good idea.
According to my tests this is actually slower than sequential checkout.
You'd have to be very careful to only multi-thread the parts that don't
do any I/O, such as unpacking / undeltifying.
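
A rough sketch of that split, under loud assumptions: this is not git
code, all names (load_sequentially, parallel_sum, split_io_demo) are
made up for the example, and a byte sum stands in for the CPU-bound
unpacking step. One thread does all the reading; worker threads only
touch data that is already in memory.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define NUM_WORKERS 4
#define DATA_LEN 4096

struct sum_task {
	const unsigned char *data; /* start of this worker's slice */
	size_t len;                /* number of bytes in the slice */
	uint64_t sum;              /* result written by the worker */
};

/* CPU-only worker: no file access happens in here. */
static void *sum_slice(void *arg)
{
	struct sum_task *t = arg;
	size_t i;

	assert(t != NULL);
	t->sum = 0;
	for (i = 0; i < t->len; i++)
		t->sum += t->data[i];
	return NULL;
}

/* I/O phase: one thread, one descriptor, plain sequential read(). */
static int load_sequentially(int fd, unsigned char *buf, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = read(fd, buf + done, len - done);
		if (n <= 0)
			return -1;
		done += (size_t)n;
	}
	return 0;
}

/* CPU phase: split the in-memory buffer across the workers. */
static int parallel_sum(const unsigned char *buf, size_t len, uint64_t *out)
{
	struct sum_task tasks[NUM_WORKERS];
	pthread_t threads[NUM_WORKERS];
	size_t per = len / NUM_WORKERS;
	int i, started = 0, ret = 0;

	for (i = 0; i < NUM_WORKERS; i++) {
		tasks[i].data = buf + (size_t)i * per;
		/* The last worker also takes the remainder. */
		tasks[i].len = (i == NUM_WORKERS - 1) ? len - (size_t)i * per : per;
		if (pthread_create(&threads[i], NULL, sum_slice, &tasks[i])) {
			ret = -1;
			break;
		}
		started++;
	}
	*out = 0;
	for (i = 0; i < started; i++) {
		pthread_join(threads[i], NULL);
		*out += tasks[i].sum;
	}
	return ret;
}

/* Hypothetical demo: returns 0 if the parallel result is correct. */
int split_io_demo(void)
{
	char path[] = "/tmp/split_io_demo_XXXXXX";
	static unsigned char written[DATA_LEN], loaded[DATA_LEN];
	uint64_t expected = 0, actual = 0;
	int i, ret = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	for (i = 0; i < DATA_LEN; i++) {
		written[i] = (unsigned char)(i % 251);
		expected += written[i];
	}
	if (write(fd, written, DATA_LEN) == DATA_LEN &&
	    lseek(fd, 0, SEEK_SET) == 0 &&
	    load_sequentially(fd, loaded, DATA_LEN) == 0 &&
	    parallel_sum(loaded, DATA_LEN, &actual) == 0 &&
	    actual == expected)
		ret = 0;
	close(fd);
	unlink(path);
	return ret;
}
```

Whether this pays off depends on how much of the total time the CPU
phase really takes; the sketch only shows where the thread boundary
would sit, it says nothing about the speedup.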

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

