[prev in list] [next in list] [prev in thread] [next in thread] 

List:       git
Subject:    Re: [PATCH] dir: special case check for the possibility that pathspec is NULL
From:       Denton Liu <liu.denton () gmail ! com>
Date:       2019-09-30 22:31:05
Message-ID: 20190930223105.GA86525 () generichostname
[Download RAW message or body]

Hi Elijah,

On Mon, Sep 30, 2019 at 12:11:06PM -0700, Elijah Newren wrote:
> Commits 404ebceda01c ("dir: also check directories for matching
> pathspecs", 2019-09-17) and 89a1f4aaf765 ("dir: if our pathspec might
> match files under a dir, recurse into it", 2019-09-17) added calls to
> match_pathspec() and do_match_pathspec() passing along their pathspec
> parameter.  Both match_pathspec() and do_match_pathspec() assume the
> pathspec argument they are given is non-NULL.  It turns out that
> unpack-tree.c's verify_clean_subdirectory() calls read_directory() with
> pathspec == NULL, and it is possible on case insensitive filesystems for
> that NULL to make it to these new calls to match_pathspec() and
> do_match_pathspec().  Add appropriate checks on the NULLness of pathspec
> to avoid a segfault.
> 
> In case the negation throws anyone off (one of the calls was to
> do_match_pathspec() while the other was to !match_pathspec(), yet no
> negation of the NULLness of pathspec is used), there are two ways to
> understand the differences:
>   * The code already handled the pathspec == NULL cases before this
>     series, and this series only tried to change behavior when there was
>     a pathspec, thus we only want to go into the if-block if pathspec is
>     non-NULL.
>   * One of the calls is for whether to recurse into a subdirectory, the
>     other is for after we've recursed into it for whether we want to
>     remove the subdirectory itself (i.e. the subdirectory didn't match
>     but something under it could have).  That difference in situation
>     leads to the slight differences in logic used (well, that and the
>     slightly unusual fact that we don't want empty pathspecs to remove
>     untracked directories by default).
> 
> Helped-by: Denton Liu <liu.denton@gmail.com>
> Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
> This patch applies on top of en/clean-nested-with-ignored, which is now
> in next.
> 
> Denton found and analyzed one issue and provided the patch for the
> match_pathspec() call, SZEDER figured out why the issue only reproduced
> for some folks and not others and provided the testcase, and I looked
> through the remainder of the series and noted the do_match_pathspec()
> call that should have the same check.

Thanks for catching what I missed.

> 
> So, I'm not sure who should be author and who should be helped-by; I
> feel like their contributions are possibly bigger than mine.  While I
> tried to reproduce and debug, they ended up doing the work, and I just
> looked through the rest of the series for similar issues and wrote up
> a commit message.  *shrug*

Eh, it doesn't really matter to me. GitHub appears to have de facto
standardised the Co-authored-by: trailer to allow credit to be split
amonst multiple authors so _maybe_ we could use that, but I'm pretty
impartial.

> 
>  dir.c                 |  8 +++++---
>  t/t0050-filesystem.sh | 23 +++++++++++++++++++++++
>  2 files changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/dir.c b/dir.c
> index 7ff79170fc..bd39b86be4 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1962,8 +1962,9 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
>  			((state == path_untracked) &&
>  			 (get_dtype(cdir.de, istate, path.buf, path.len) == DT_DIR) &&
>  			 ((dir->flags & DIR_SHOW_IGNORED_TOO) ||
> -			  do_match_pathspec(istate, pathspec, path.buf, path.len,
> -					    baselen, NULL, DO_MATCH_LEADING_PATHSPEC) == MATCHED_RECURSIVELY_LEADING_PATHSPEC))) {
> +			  (pathspec &&
> +			   do_match_pathspec(istate, pathspec, path.buf, path.len,
> +					     baselen, NULL, DO_MATCH_LEADING_PATHSPEC) == MATCHED_RECURSIVELY_LEADING_PATHSPEC)))) {
>  			struct untracked_cache_dir *ud;
>  			ud = lookup_untracked(dir->untracked, untracked,
>  					      path.buf + baselen,
> @@ -1975,7 +1976,8 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
>  			if (subdir_state > dir_state)
>  				dir_state = subdir_state;
>  
> -			if (!match_pathspec(istate, pathspec, path.buf, path.len,
> +			if (pathspec &&
> +			    !match_pathspec(istate, pathspec, path.buf, path.len,
>  					    0 /* prefix */, NULL,
>  					    0 /* do NOT special case dirs */))
>  				state = path_none;
> diff --git a/t/t0050-filesystem.sh b/t/t0050-filesystem.sh
> index 192c94eccd..edb30f9eb2 100755
> --- a/t/t0050-filesystem.sh
> +++ b/t/t0050-filesystem.sh
> @@ -131,4 +131,27 @@ $test_unicode 'merge (silent unicode normalization)' '
>  	git merge topic
>  '
>  
> +test_expect_success CASE_INSENSITIVE_FS 'checkout with no pathspec and a case insensitive fs' '
> +	git init repo &&
> +	(
> +		cd repo &&
> +
> +		>Gitweb &&
> +		git add Gitweb &&
> +		git commit -m "add Gitweb" &&
> +
> +		git checkout --orphan todo &&
> +		git reset --hard &&
> +		# the subdir is crucial, without it there is no segfault

We should either remove this comment or change the justification. A
future reader may be confused at what particular segfault this refers
to.

> +		mkdir -p gitweb/subdir &&
> +		>gitweb/subdir/file &&
> +		# it is not strictly necessary to add and commit the
> +		# gitweb directory, its presence is sufficient

Same here, its presence is sufficient to... what?

Thanks,

Denton

> +		git add gitweb &&
> +		git commit -m "add gitweb/subdir/file" &&
> +
> +		git checkout master
> +	)
> +'
> +
>  test_done
> -- 
> 2.22.1.14.g885c22d24b
> 
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic