diff mbox series

clean: demonstrate a bug with pathspecs

Message ID pull.526.git.1579119946211.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series clean: demonstrate a bug with pathspecs | expand

Commit Message

Johannes Schindelin via GitGitGadget Jan. 15, 2020, 8:25 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

b9660c1 (dir: fix checks on common prefix directory, 2019-12-19)
modified the way pathspecs are handled when handling a directory
during "git clean -f <path>". While this improved the behavior
for known test breakages, it also regressed in how the clean
command handles cleaning a specified file.

Add a test case that demonstrates this behavior. This test passes
before b9660c1 then fails after.

Helped-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
    clean: demonstrate a bug with pathspecs
    
    While integrating v2.25.0 into the microsoft/git fork, one of our VFS
    for Git functional tests started failing. Looking into it, the only
    possible place could have been where one of our integration points with
    the virtualfilesystem hook was moved by c5c4edd (dir: break part of
    read_directory_recursive() out for reuse, 2019-12-10) and then used in
    the following two commits.
    
    By reverting these two commits, we stopped the failure, but it took a
    while before figuring out that it was a regression in Git and not a
    failure in our integration to the new logic. Thanks to Kevin Willford
    for producing a test case.
    
    b9660c1 (dir: fix checks on common prefix directory, 2019-12-19) is the
    culprit, so this patch is based on that. If rebased to c5c4edd, then the
    test passes.
    
    As for actually fixing this regression, I don't know how. This code is
    pretty dense and I don't have a firm grasp of what is happening in both
    b9660c1 and the following 777b420 (dir: synchronize tread_leading_path()
    and read_directory_recursive()). Elijah is CC'd in case he still has
    context on this area.
    
    Thanks, -Stolee

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-526%2Fderrickstolee%2Fclean-bug-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-526/derrickstolee/clean-bug-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/526

 t/t7300-clean.sh | 9 +++++++++
 1 file changed, 9 insertions(+)


base-commit: b9670c1f5e6b98837c489a03ac0d343d30e08505

Comments

Kyle Meyer Jan. 15, 2020, 11:30 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> b9660c1 (dir: fix checks on common prefix directory, 2019-12-19)
> modified the way pathspecs are handled when handling a directory
> during "git clean -f <path>".

I can't find b9660c1.  I think this and other references below should
point to b9670c1f5e (dir: fix checks on common prefix directory,
2019-12-19), which matches the base-commit value for this patch.
Jonathan Nieder Jan. 16, 2020, 12:03 a.m. UTC | #2
Hi,

Derrick Stolee wrote:

> b9660c1 (dir: fix checks on common prefix directory, 2019-12-19)
> modified the way pathspecs are handled when handling a directory
> during "git clean -f <path>". While this improved the behavior
> for known test breakages, it also regressed in how the clean
> command handles cleaning a specified file.
>
> Add a test case that demonstrates this behavior. This test passes
> before b9660c1 then fails after.

Can this commit message say a little more about the nature of the
bug?  For example, what kind of workflow does this come up in for
end users?

[...]
>     While integrating v2.25.0 into the microsoft/git fork, one of our VFS
>     for Git functional tests started failing.

This is also useful information to put in the commit message: e.g.
"Noticed via VFS for Git's functional test <test name>".  It provides
useful context when looking at such a patch later.

[...]
>                                      Elijah is CC'd in case he still has
>     context on this area.

Thanks and hope that helps,
Jonathan
Elijah Newren Jan. 16, 2020, 12:38 a.m. UTC | #3
On Wed, Jan 15, 2020 at 12:25 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> b9660c1 (dir: fix checks on common prefix directory, 2019-12-19)
> modified the way pathspecs are handled when handling a directory
> during "git clean -f <path>". While this improved the behavior
> for known test breakages, it also regressed in how the clean
> command handles cleaning a specified file.
>
> Add a test case that demonstrates this behavior. This test passes
> before b9660c1 then fails after.
>
> Helped-by: Kevin Willford <Kevin.Willford@microsoft.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>     clean: demonstrate a bug with pathspecs
>
>     While integrating v2.25.0 into the microsoft/git fork, one of our VFS
>     for Git functional tests started failing. Looking into it, the only
>     possible place could have been where one of our integration points with
>     the virtualfilesystem hook was moved by c5c4edd (dir: break part of
>     read_directory_recursive() out for reuse, 2019-12-10) and then used in
>     the following two commits.
>
>     By reverting these two commits, we stopped the failure, but it took a
>     while before figuring out that it was a regression in Git and not a
>     failure in our integration to the new logic. Thanks to Kevin Willford
>     for producing a test case.
>
>     b9660c1 (dir: fix checks on common prefix directory, 2019-12-19) is the
>     culprit, so this patch is based on that. If rebased to c5c4edd, then the
>     test passes.
>
>     As for actually fixing this regression, I don't know how. This code is
>     pretty dense and I don't have a firm grasp of what is happening in both
>     b9660c1 and the following 777b420 (dir: synchronize tread_leading_path()
>     and read_directory_recursive()). Elijah is CC'd in case he still has
>     context on this area.
>
>     Thanks, -Stolee
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-526%2Fderrickstolee%2Fclean-bug-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-526/derrickstolee/clean-bug-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/526
>
>  t/t7300-clean.sh | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
> index 6e6d24c1c3..782e125c89 100755
> --- a/t/t7300-clean.sh
> +++ b/t/t7300-clean.sh
> @@ -737,4 +737,13 @@ test_expect_success MINGW 'handle clean & core.longpaths = false nicely' '
>         test_i18ngrep "too long" .git/err
>  '
>
> +test_expect_failure 'clean untracked paths by pathspec' '
> +       git init untracked &&
> +       mkdir untracked/dir &&
> +       echo >untracked/dir/file.txt &&
> +       git -C untracked clean -f dir/file.txt &&
> +       ls untracked/dir >actual &&
> +       test_must_be_empty actual
> +'
> +
>  test_done
>
> base-commit: b9670c1f5e6b98837c489a03ac0d343d30e08505
> --
> gitgitgadget

Is there an inverted phrase corresponding to "the gift that keeps on
giving", something like "the punishment that keeps on punishing"?  If
so, it would be a very appropriate description of dir.c.

Yeah, I still have context.  I even think I've got an idea about what
the fix might be, though with dir.c my ideas about fixes usually just
serve as starting points for debugging before I find the real fix.
I'll try to dig in.
Derrick Stolee Jan. 16, 2020, 1:23 a.m. UTC | #4
On 1/15/2020 7:38 PM, Elijah Newren wrote:
> Is there an inverted phrase corresponding to "the gift that keeps on
> giving", something like "the punishment that keeps on punishing"?  If
> so, it would be a very appropriate description of dir.c.

At least we will continue adding tests until we converge towards
correctness, and the behavior issues are even more contrived and
special case (like this one).

> Yeah, I still have context.  I even think I've got an idea about what
> the fix might be, though with dir.c my ideas about fixes usually just
> serve as starting points for debugging before I find the real fix.
> I'll try to dig in.

Thanks! I'll try to review it carefully when it arrives. Good luck.

-Stolee
Derrick Stolee Jan. 16, 2020, 1:33 a.m. UTC | #5
On 1/15/2020 6:30 PM, Kyle Meyer wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> b9660c1 (dir: fix checks on common prefix directory, 2019-12-19)
>> modified the way pathspecs are handled when handling a directory
>> during "git clean -f <path>".
> 
> I can't find b9660c1.  I think this and other references below should
> point to b9670c1f5e (dir: fix checks on common prefix directory,
> 2019-12-19), which matches the base-commit value for this patch.

Sorry for the digit swap. Thanks for pointing that out!
Derrick Stolee Jan. 16, 2020, 1:43 a.m. UTC | #6
On 1/15/2020 7:03 PM, Jonathan Nieder wrote:
> Hi,
> 
> Derrick Stolee wrote:
> 
>> b9660c1 (dir: fix checks on common prefix directory, 2019-12-19)
>> modified the way pathspecs are handled when handling a directory
>> during "git clean -f <path>". While this improved the behavior
>> for known test breakages, it also regressed in how the clean
>> command handles cleaning a specified file.
>>
>> Add a test case that demonstrates this behavior. This test passes
>> before b9660c1 then fails after.
> 
> Can this commit message say a little more about the nature of the
> bug?  For example, what kind of workflow does this come up in for
> end users?

I honestly don't know why anyone would call `git clean -f <path>` on a
file instead of using `rm <path>`. But, the behavior _did_ change, which
is why I'm bringing it up.

If the community instead said "this is not important functionality. We
should just expect the given pathspec to only match directories" then I
would accept that and just delete the file in another way. That seems
unlikely.

> [...]
>>     While integrating v2.25.0 into the microsoft/git fork, one of our VFS
>>     for Git functional tests started failing.
> 
> This is also useful information to put in the commit message: e.g.
> "Noticed via VFS for Git's functional test <test name>".  It provides
> useful context when looking at such a patch later.

I'm not sure the test [1] will shed much light on the issue. It sort of
accidentally reveals this bug because it happens to use "git clean -f <path>".

The test itself is holding a handle on <path> on a commit where <path>
is untracked, then tries to checkout a commit where <path> is tracked. On
Windows, this should fail. With the virtualization layer in VFS for Git,
Git doesn't actually try to write to <path> but instead VFS for Git tries
to update the virtualization at <path>, colliding with what Git is trying
to do. Hence, we need to make sure the Git command actually fails in this
attempt.

Perhaps that context isn't actually helpful. And you could understand why
I stared at this test for a long while before realizing that it was actually
a failure in "git clean -f" and then Kevin did the real work to find that
VFS for Git wasn't causing the issue.

-Stolee

[1] https://github.com/microsoft/VFSForGit/blob/1aec263033cc3c05d0389e1792b7958d9a2e70c6/GVFS/GVFS.FunctionalTests.Windows/Windows/Tests/WindowsUpdatePlaceholderTests.cs#L38-L72
Elijah Newren Jan. 16, 2020, 6:01 p.m. UTC | #7
On Wed, Jan 15, 2020 at 5:23 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 1/15/2020 7:38 PM, Elijah Newren wrote:
> > Is there an inverted phrase corresponding to "the gift that keeps on
> > giving", something like "the punishment that keeps on punishing"?  If
> > so, it would be a very appropriate description of dir.c.
>
> At least we will continue adding tests until we converge towards
> correctness, and the behavior issues are even more contrived and
> special case (like this one).

This doesn't seem any more contrived or special case than most my
previous fixes for dir.c...

> > Yeah, I still have context.  I even think I've got an idea about what
> > the fix might be, though with dir.c my ideas about fixes usually just
> > serve as starting points for debugging before I find the real fix.
> > I'll try to dig in.
>
> Thanks! I'll try to review it carefully when it arrives. Good luck.

Man, I'm such a bozo.  It turns out, for once, that my idea for the
fix was correct but after digging a bit I realized that it was
essentially a bug I fixed not that long ago once already -- and that I
myself re-introduced it (for a slightly different case) in some
commits where I used some strongly worded disgust that "this bad code
structure is going to cause someone to mess up in <this way>" and then
I made that exact kind of mistake I was complaining about in the
commit message...as part of that EXACT commit, to boot.

At least it'll make for a fun new commit message explaining it all...


Anyway, I'm going to pull your commit into my series so I can put my
fix on top, and lump it in with Peff's two patches over at
https://lore.kernel.org/git/20200115202146.GA4091171@coredump.intra.peff.net/
since all these patches are basically "more fill_directory() fixes".
Let me know if you have any concerns with that.

Elijah
Junio C Hamano Jan. 16, 2020, 8:20 p.m. UTC | #8
Elijah Newren <newren@gmail.com> writes:

> Anyway, I'm going to pull your commit into my series so I can put my
> fix on top, and lump it in with Peff's two patches over at
> https://lore.kernel.org/git/20200115202146.GA4091171@coredump.intra.peff.net/
> since all these patches are basically "more fill_directory() fixes".

Thanks.  Then I'll refrain from applying those two patches we saw
earlier (including the one you have the URL in your message).
diff mbox series

Patch

diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 6e6d24c1c3..782e125c89 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -737,4 +737,13 @@  test_expect_success MINGW 'handle clean & core.longpaths = false nicely' '
 	test_i18ngrep "too long" .git/err
 '
 
+test_expect_failure 'clean untracked paths by pathspec' '
+	git init untracked &&
+	mkdir untracked/dir &&
+	echo >untracked/dir/file.txt &&
+	git -C untracked clean -f dir/file.txt &&
+	ls untracked/dir >actual &&
+	test_must_be_empty actual
+'
+
 test_done