[prev in list] [next in list] [prev in thread] [next in thread] 

List:       git
Subject:    Re: [PATCH 00/39] per-repository object store, part 1
From:       Brandon Williams <bmwill () google ! com>
Date:       2017-08-30 23:07:55
Message-ID: 20170830230755.GF50018 () google ! com
[Download RAW message or body]

On 08/29, Jonathan Nieder wrote:
> Hi,
> 
> Most of the credit for this series should go to Stefan Beller.  I just
> decided to pull the trigger on sending out what we have so far.
> 
> This series is about API.  It makes no functional change yet.
> 
> Today, when a git command wants to operate on some objects from another
> repository (e.g., a submodule), it has two choices:
> 
>  A. Use run_command to operate on that repository in a separate process.
> 
>  B. Use add_to_alternates_memory to pretend the repository is an
>     alternate.  This has a number of downsides.  Aside from aesthetics,
>     one particularly painful consequence is that as alternates
>     accumulate, the number of packs git has to check for objects
>     increases, which can cause significant slowdowns.
> 
> Brandon Williams's recent work to introduce "struct repository" points
> to a better way.  Encapsulating object access in struct repository
> would mean:
> 
>   i. The API for accessing objects in another repository becomes more
>      simple and familiar (instead of using the CLI or abusing alternates).
> 
>   ii. Operations on one repository do not interfere with another,
>       neither in semantics (e.g. replace objects do not work correctly
>       with the approach (B) above) nor performance (already described
>       above).
> 
>  iii. Resources associated with access to a repository could be freed
>       when done with that repo.
> 
>   iv. Thread-safe multiple readers to a single repository also become
>       straightforward, by using multiple repository objects for the same
>       repo.
> 
> This series is a small step in that direction.
> 
> At the end of this series, sha1_loose_object_info takes a repository
> argument and can be independently called for multiple repositories.
> Not incredibly useful on its own, but a future series will do the same
> for sha1_object_info, which will be enough to migrate a caller in
> submodule.c (which uses the object store for commit existence checks).
> 
> This series has a few phases:
> 
>  1. Patch 1 is a cleanup that made some of the later patches easier.
> 
>  2. Patches 2-6 create a struct object_store field inside struct
>     repository and move some globals to it.
> 
>  3. Patches 7-24 are mechanical changes that update some functions to
>     accept a repository argument. The only goal is to make the later
>     patches that teach these functions to actual handle a repository
>     other than the_repository easier to review.  The patches enforce
>     at compile time that no caller passes a repository other than
>     the_repository --- see patch 7 in particular for details on how
>     that works.
> 
>  4. Patches 25-39 update the implementations of those functions to
>     handle a repository other than the_repository.  This means the
>     safety check introduced in phase 3 goes away completely --- all
>     functions that gained a repository argument are safe to use with
>     a repository argument other than the_repository.
> 
> Patches 2-6 and 25-39 should be the most interesting to review.  I'd
> particularly appreciate if people can look over 25-39 carefully.  We
> were careful not to leave any calls to functions that assume they are
> operating on the_repository, but a triple-check is always welcome.
> 
> Thanks as well to brian m. carlson, who showed us how such a long and
> potentially tedius series can be made bearable for reviewers.
> 
> Thoughts of all kinds welcome, as always.

Just finished looking through the series.  Thanks for keeping each
commit very short and to the point, it made reviewing it much easier.  I
couldn't see anything wrong these transformations and I am very happy to
see this work getting done.

One thing that needs to be noted is that currently the object_store is
only really being used by the_repository so this series didn't need to
create any object_store_init() or object_store_clear() type functions.
So these types of functions will need to be added once submodules are
using their own object store, in their own struct repository.

> 
> Thanks,
> Jonathan Nieder (24):
>   pack: make packed_git_mru global a value instead of a pointer
>   object-store: move packed_git and packed_git_mru to object store
>     struct
>   pack: move prepare_packed_git_run_once to object store struct
>   pack: move approximate object count to object store struct
>   pack: add repository argument to install_packed_git
>   pack: add repository argument to prepare_packed_git_one
>   pack: add repository argument to rearrange_packed_git
>   pack: add repository argument to prepare_packed_git_mru
>   pack: add repository argument to prepare_packed_git
>   pack: add repository argument to reprepare_packed_git
>   pack: add repository argument to sha1_file_name
>   pack: add repository argument to map_sha1_file
>   pack: allow install_packed_git to handle arbitrary repositories
>   pack: allow rearrange_packed_git to handle arbitrary repositories
>   pack: allow prepare_packed_git_mru to handle arbitrary repositories
>   pack: allow prepare_packed_git_one to handle arbitrary repositories
>   pack: allow prepare_packed_git to handle arbitrary repositories
>   pack: allow reprepare_packed_git to handle arbitrary repositories
>   pack: allow sha1_file_name to handle arbitrary repositories
>   pack: allow stat_sha1_file to handle arbitrary repositories
>   pack: allow open_sha1_file to handle arbitrary repositories
>   pack: allow map_sha1_file_1 to handle arbitrary repositories
>   pack: allow map_sha1_file to handle arbitrary repositories
>   pack: allow sha1_loose_object_info to handle arbitrary repositories
> 
> Stefan Beller (15):
>   repository: introduce object store field
>   object-store: move alt_odb_list and alt_odb_tail to object store
>     struct
>   sha1_file: add repository argument to alt_odb_usable
>   sha1_file: add repository argument to link_alt_odb_entry
>   sha1_file: add repository argument to read_info_alternates
>   sha1_file: add repository argument to link_alt_odb_entries
>   sha1_file: add repository argument to stat_sha1_file
>   sha1_file: add repository argument to open_sha1_file
>   sha1_file: add repository argument to map_sha1_file_1
>   sha1_file: add repository argument to sha1_loose_object_info
>   object-store: add repository argument to prepare_alt_odb
>   object-store: add repository argument to foreach_alt_odb
>   sha1_file: allow alt_odb_usable to handle arbitrary repositories
>   object-store: allow prepare_alt_odb to handle arbitrary repositories
>   object-store: allow foreach_alt_odb to handle arbitrary repositories
> 
>  builtin/count-objects.c             |  10 ++-
>  builtin/fsck.c                      |  15 ++--
>  builtin/gc.c                        |   8 +-
>  builtin/index-pack.c                |   1 +
>  builtin/pack-objects.c              |  23 +++--
>  builtin/pack-redundant.c            |   8 +-
>  builtin/receive-pack.c              |   4 +-
>  builtin/submodule--helper.c         |   4 +-
>  bulk-checkin.c                      |   3 +-
>  cache.h                             |  50 ++---------
>  contrib/coccinelle/packed_git.cocci |  15 ++++
>  fast-import.c                       |  10 ++-
>  fetch-pack.c                        |   3 +-
>  http-backend.c                      |   8 +-
>  http-push.c                         |   1 +
>  http-walker.c                       |   4 +-
>  http.c                              |   9 +-
>  mru.h                               |   1 +
>  object-store.h                      |  71 ++++++++++++++++
>  pack-bitmap.c                       |   6 +-
>  pack-check.c                        |   1 +
>  pack-revindex.c                     |   1 +
>  packfile.c                          |  94 ++++++++++----------
>  packfile.h                          |   6 +-
>  reachable.c                         |   1 +
>  repository.c                        |   4 +-
>  repository.h                        |   7 ++
>  server-info.c                       |   8 +-
>  sha1_file.c                         | 165 ++++++++++++++++++++----------------
>  sha1_name.c                         |  11 ++-
>  streaming.c                         |   5 +-
>  transport.c                         |   4 +-
>  32 files changed, 344 insertions(+), 217 deletions(-)
>  create mode 100644 contrib/coccinelle/packed_git.cocci
>  create mode 100644 object-store.h
> 
> -- 
> 2.14.1.581.gf28d330327
> 

-- 
Brandon Williams
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic