List:       mercurial-devel
Subject:    Re: ideas: chg repo preloading, and new changelog index
From:       Jun Wu <quark () fb ! com>
Date:       2016-12-30 20:58:26
Message-ID: 1483130474-sup-5028 () x1c

Excerpts from Gregory Szorc's message of 2016-12-30 12:18:32 -0800:
> On Tue, Dec 27, 2016 at 8:35 AM, Jun Wu <quark@fb.com> wrote:
> 
> > chg repo preloading
> >
> >   I have been thinking about speeding up repo loading for a long time.
> >   Previous ideas are persistent radix tree, hidden bitmap, mmap
> >   changelog.i.
> >
> >   Recently I realized that chg (after the uisetup refactoring) could be an
> >   option, assuming users use read commands more frequently than writes.
> >
> >   The idea is simple, the master server (the process before fork) maintains
> >   a map {repo_path: {index_hash: index, marker_hash: markers, ...}}, where
> >   *_hash is a quick hash of sensitive properties like sensitive file sizes,
> >   etc. to decide whether the value can be used. The forked worker gets the
> >   map for free and uses it to quickly construct the repo object if the hash
> >   matches.
> >
> 
> I like the idea of having the persistent server cache repo objects so we
> can avoid the overhead of loading a repo on every command. But I'd like to
> see numbers to have an idea how much this really saves us so the work can
> be justified.

With the planned new architecture (where uisetup runs per request), the repo
object *cannot* be cached, because extensions that have side effects on the
localrepository class would not work. If we do run uisetup (or reposetups),
we face a more serious consistency issue, the kind that bugs the current chg.

For numbers, it already takes us 0.x seconds to build the radix tree for one
of our big internal repos. And if you use hg-committed, you can run "hg id"
vs. "hg id --hidden" to roughly measure the obsmarker overhead - it could be
0.x to 1.x seconds.

> >   The master server needs a background thread doing the preloading. So it's
> >   no longer stateless. Hopefully it's fine because all the preloading
> >   stuffs are low-level, self-contained and not affected by extensions.
> >
> >   However, if an extension does change the behavior of something being
> >   cached here, we will have compatibility issues. It's solvable if chg has
> >   APIs for 3rd-party extensions to just drop some kind of cache.
> >
> >   3rd-party repo requirements can also be troublesome for things that
> >   require a repo object to calculate, namely obsolete._compute*set. While
> >   changelog.index, obsstore._readmarkers could be calculated without repo.
> >
> >   Therefore I think it's still a good idea to cache those low-level stuffs
> >   without a repo object.
> >
> >   If this direction looks promising, I will try to start with caching the C
> >   index object first. Then we can think about how to deal with the
> >   obsstore.
> >
> > new changelog index
> >
> >   (note: this is less related to chg, but fits nicely with the plan above)
> >
> >   I personally like to see an efficient changelog "index" object whose code
> >   is immutable to extensions (i.e. extensions could not change the logic
> >   inside it), reusable outside the Python eco-system (likely implemented in
> >   C without Python.h or Rust), taking a minimal set of inputs (changelog.i,
> >   phaseroots, obsstore, but allows customized parsers), and deals with the
> >   following independently (could be implemented incrementally):
> >
> >     - converting between rev number, node (and partialmatch)
> >     - calculate common ancestors
> >     - revset bitmap representation: native ancestors / descendants
> >       construction, support and/or/minus operations
> >     - understand phases
> >     - understand obsolete concepts
> >
> >   If that looks promising, I'll try to work on it after the above chg
> >   change.
> >
> 
> What are the drivers behind wanting this? I think it is important to have a
> really good reason to write code in a lower-level language, as the cost to
> maintaining that code tends to be higher.

1. Performance. The current "ancestor set" implementation is basically
   Python code and could be 10-100x faster if written in C.
2. Purity. If extensions cannot have side effects on the core
   data structures (index, low-level revmap representations), chg could
   preload them correctly without worrying about compatibility.
   (i.e. it makes chg repo preloading *possible*)
3. Memory. Because fork is COW, worker processes can share the cached data
   structures, so memory usage is expected to go down.
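
To make point 2 concrete, here is a minimal sketch of the preload map from
my earlier mail. It is not chg code: quick_hash, PreloadCache and the
choice of (size, mtime) as the "quick hash" are all assumptions for
illustration. The master process would call preload() from its background
thread; a forked worker inherits the map and calls get().

```python
import os


def quick_hash(paths):
    """Cheap fingerprint of the files a cached value depends on.

    Assumption: (size, mtime_ns) per file is good enough to detect
    changes. A missing file contributes None.
    """
    parts = []
    for path in paths:
        try:
            st = os.stat(path)
            parts.append((st.st_size, st.st_mtime_ns))
        except FileNotFoundError:
            parts.append(None)
    return tuple(parts)


class PreloadCache:
    """{repo_path: {name: (fingerprint, value)}} held by the master."""

    def __init__(self):
        self._entries = {}

    def preload(self, repo_path, name, paths, loader):
        # Fingerprint first: if a file changes while loader() runs, the
        # stored fingerprint is stale and get() will reject the value.
        fingerprint = quick_hash(paths)
        value = loader()
        self._entries.setdefault(repo_path, {})[name] = (fingerprint, value)

    def get(self, repo_path, name, paths):
        """Return the cached value if the files look unchanged, else None."""
        entry = self._entries.get(repo_path, {}).get(name)
        if entry is None:
            return None
        fingerprint, value = entry
        if fingerprint != quick_hash(paths):
            return None  # caller falls back to the normal slow path
        return value
```

The value is only safe to share if nothing an extension does can change
how it is built, which is why the cached objects need to be "pure".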

> Historically the main driver is performance. But PyPy can close that gap.
> Have you assessed the performance of PyPy as an alternative to writing
> non-Python code?

PyPy is not magic that solves everything. Some code paths are already in C
(like the radix tree building) and are still a perf concern for a giant
repo. The obsmarker logic may run faster under PyPy, but it's still
some O(??? N-ish), whereas the preloading approach makes it feel like O(1).

> (Don't get me wrong, I like the idea of a reusable library for doing
> everything changelog. It just seems like a lot of work.)

It is. But with a modular design, it could be completed incrementally: the
radix tree first, then the bitmap, then the phases / ancestors map. It
could also be implemented as separate small components / algorithms, instead
of a god index object.
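
As an example of one such small component, here is a toy sketch of the
bitmap idea in Python. A Python int stands in for the bitmap, and the
function names and the "parents" layout (a list of parent-rev tuples, -1
meaning no parent) are assumptions for illustration. A real version would
be native code reading changelog.i.

```python
def ancestors_bitmap(parents, rev):
    """Bitmap with bit r set for rev and every ancestor of rev."""
    seen = 0
    stack = [rev]
    while stack:
        r = stack.pop()
        if r < 0 or (seen >> r) & 1:
            continue  # null parent, or already visited
        seen |= 1 << r
        stack.extend(parents[r])
    return seen


def to_revs(bitmap):
    """Expand a bitmap into a sorted list of rev numbers."""
    revs = []
    r = 0
    while bitmap:
        if bitmap & 1:
            revs.append(r)
        bitmap >>= 1
        r += 1
    return revs


# Tiny DAG: 0 <- 1, then 1 <- 2 and 1 <- 3, then 4 merges 2 and 3.
parents = [(-1, -1), (0, -1), (1, -1), (1, -1), (2, 3)]
a2 = ancestors_bitmap(parents, 2)
a3 = ancestors_bitmap(parents, 3)
common = to_revs(a2 & a3)    # "and": revs that are ancestors of both
only2 = to_revs(a2 & ~a3)    # "minus": ancestors of 2 but not of 3
either = to_revs(a2 | a3)    # "or": ancestors of either
```

With the set operations reduced to plain bit operations, each piece can be
written and tested on its own before anything is wired into a larger index.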