'Re: [PATCH v3 2/6] stash: remove the second index in stash_working_tree()'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       git
Subject:    Re: [PATCH v3 2/6] stash: remove the second index in stash_working_tree()
From:       Junio C Hamano <gitster () pobox ! com>
Date:       2020-07-31 18:26:41
Message-ID: xmqqy2mz35i6.fsf () gitster ! c ! googlers ! com
[Download RAW message or body]

Alban Gruin <alban.gruin@gmail.com> writes:

> This removes the second index used in stash_working_tree() to simplify
> the code.

Continuing what I said for [0/6], say "the second index file" to
clarify the distinction between the in-core index and on-disk index
files to avoid confusion.

> 	init_revisions(&rev, NULL);
> 	copy_pathspec(&rev.prune_data, ps);
> 
> -	set_alternate_index_output(stash_index_path.buf);
> 	if (reset_tree(&info->i_tree, 0, 0)) {
> 		ret = -1;
> 		goto done;
> 	}
> -	set_alternate_index_output(NULL);

Hmph.  So at this point i_tree is what is in the index, we reset the
working tree to it with reset_tree(), and instead of writing to
$TMPindex, we let reset_tree() clobber the main on-disk index but we
do not care because the result is supposed to be the same as what
was originally in the index anyway?

> @@ -1091,8 +1088,6 @@ static int stash_working_tree(struct stash_info *info, const \
> struct pathspec *ps  argv_array_pushl(&cp_upd_index.args, "update-index",
> 			 "--ignore-skip-worktree-entries",
> 			 "-z", "--add", "--remove", "--stdin", NULL);
> -	argv_array_pushf(&cp_upd_index.env_array, "GIT_INDEX_FILE=%s",
> -			 stash_index_path.buf);
> 
> 	if (pipe_command(&cp_upd_index, diff_output.buf, diff_output.len,
> 			 NULL, 0, NULL, 0)) {

And then the new code now lets "update-index" work directly on the
main index (which does make an observable difference to the outside
world, but we are not letting any hook to look at this intermediate
state, so it might be OK---I cannot tell at this point in the code).

> @@ -1100,19 +1095,16 @@ static int stash_working_tree(struct stash_info *info, \
> const struct pathspec *ps  goto done;
> 	}
> 
> -	if (write_index_as_tree(&info->w_tree, &istate, stash_index_path.buf, 0,
> -				NULL)) {
> +	discard_cache();
> +	if (write_cache_as_tree(&info->w_tree, 0, NULL) ||
> +	    reset_tree(&info->i_tree, 0, 1))

We used to read from $TMPindex, which has been updated with the
contents of files modified in the working tree, and write its
content out as a tree object, grabbing its object name into w_tree.

The new code instead writes out the same tree from the main index,
which has been clobbered with the working tree changes that the user
hasn't added to the index.  Because of that, we need to discard
these changes and re-read the in-core index out of i_tree with
reset_tree().

So, this change makes one new call to reset_tree() that scans and
updates working tree files that are modified?  How expensive is
that?  

It's not like this change trades cost to write out a temporary index
file with the cost of having to call reset_tree() one more time---as
far as I can see, the new code writes to on-disk index file the same
time as the original code.

How is the use of second on-disk index hurting us?  I must be
missing something obvious, but at this point, it is not clear what
we are gaining---I only see downsides.


[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic