[prev in list] [next in list] [prev in thread] [next in thread] 

List:       git
Subject:    Re: [PATCH] send-pack: use OBJECT_INFO_QUICK to check negative objects
From:       Junio C Hamano <gitster () pobox ! com>
Date:       2019-11-30 17:08:32
Message-ID: xmqqr21p2sxb.fsf () gitster-ct ! c ! googlers ! com
[Download RAW message or body]

Jeff King <peff@peff.net> writes:

> I think the patch below would help you.
>
> -- >8 --
> Subject: [PATCH] send-pack: use OBJECT_INFO_QUICK to check negative objects
>
> When pushing, we feed pack-objects a list of both positive and negative
> objects. The positive objects are what we want to send, and the negative
> objects are what the other side told us they have, which we can use to
> limit the size of the push.
>
> Before passing along a negative object, send_pack() will make sure we
> actually have it (since we only know about it because the remote
> mentioned it, not because it's one of our refs). So it's expected that
> some of these objects will be missing on the local side. But looking for
> a missing object is more expensive than one that we have: it triggers
> reprepare_packed_git() to handle a racy repack, plus it has to explore
> every alternate's loose object tree (which can be slow if you have a lot
> of them, or have a high-latency filesystem).
>
> This isn't usually a big problem, since repositories you're pushing to
> don't generally have a large number of refs that are unrelated to what
> the client has. But there's no reason such a setup is wrong, and it
> currently performs poorly.
>
> We can fix this by using OBJECT_INFO_QUICK, which tells the lookup
> code that we expect objects to be missing. Notably, it will not re-scan
> the packs, and it will use the loose cache from 61c7711cfe (sha1-file:
> use loose object cache for quick existence check, 2018-11-12).
>
> The downside is that in the rare case that we race with a local repack,
> we might fail to feed some objects to pack-objects, making the resulting
> push larger. But we'd never produce an invalid or incorrect push, just a
> less optimal one. That seems like a reasonable tradeoff, and we already
> do similar things on the fetch side (e.g., when marking COMPLETE
> commits).
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> Interestingly, upload-pack does not use OBJECT_INFO_QUICK when it's
> getting oids from the other side. But I think it could possibly benefit
> in the same way. Nobody seems to have noticed. Perhaps it simply comes
> up less, as servers would tend to have more objects than their clients?

Makes me wonder how many times we wre bitten by the fact that
INFO_SKIP_FETCH_OBJECT does not imply INFO_QUICK.  Perhaps most of
the users of INFO_SKIP_FETCH_OBJECT wants to use INFO_FOR_PREFETCH?

cf. 31f5256c ("sha1-file: split OBJECT_INFO_FOR_PREFETCH", 2019-05-28)

>  send-pack.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/send-pack.c b/send-pack.c
> index 34c77cbb1a..16d6584439 100644
> --- a/send-pack.c
> +++ b/send-pack.c
> @@ -41,7 +41,9 @@ int option_parse_push_signed(const struct option *opt,
>  static void feed_object(const struct object_id *oid, FILE *fh, int negative)
>  {
>  	if (negative &&
> -	    !has_object_file_with_flags(oid, OBJECT_INFO_SKIP_FETCH_OBJECT))
> +	    !has_object_file_with_flags(oid,
> +					OBJECT_INFO_SKIP_FETCH_OBJECT |
> +					OBJECT_INFO_QUICK))
>  		return;
>  
>  	if (negative)
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic