t3706: Add test for wildcard vs literal pathspec

Message ID	20250412174051.780148-1-jayatheerthkulkarni2005@gmail.com (mailing list archive)
State	New
Headers	show Received: from mail-pg1-f173.google.com (mail-pg1-f173.google.com [209.85.215.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 675A0347A2 for <git@vger.kernel.org>; Sat, 12 Apr 2025 17:41:01 +0000 (UTC) From: K Jayatheerth <jayatheerthkulkarni2005@gmail.com> To: jayatheerthkulkarni2005@gmail.com Cc: git@vger.kernel.org, peff@peff.net, piotrsiupa@gmail.com Subject: [PATCH] t3706: Add test for wildcard vs literal pathspec Date: Sat, 12 Apr 2025 23:10:51 +0530 Message-ID: <20250412174051.780148-1-jayatheerthkulkarni2005@gmail.com> In-Reply-To: <20250412094607.236382-2-jayatheerthkulkarni2005@gmail.com> References: <20250412094607.236382-2-jayatheerthkulkarni2005@gmail.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	t3706: Add test for wildcard vs literal pathspec \| expand t3706: Add test for wildcard vs literal pathspec

JAYATHEERTH K April 12, 2025, 5:40 p.m. UTC

When 'git add <pattern>' is run, and a file exists whose literal
name matches <pattern> (e.g., a file named "f*" when running
'git add f*'), Git should treat the pattern as a wildcard
matching multiple files, rather than just adding the literal file.

Add a test case that:
 1. Creates files 'f*', 'f**', and 'foo'.
 2. Verifies that 'git add "f\*\*"'
    correctly adds only the literal file 'f**'.
 3. Verifies that 'git add 'f*'' (quoted to prevent shell expansion)
    correctly adds 'f*', 'f**', and 'foo' by treating 'f*' as a
    wildcard.

Covering these the test adds 5 cases.

Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
---
 t/meson.build                   |  1 +
 t/t3706-add-wildcard-literal.sh | 44 +++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)
 create mode 100755 t/t3706-add-wildcard-literal.sh

Lucas Seiki Oshiro April 14, 2025, 4:51 p.m. UTC | #1

Hi!

> +test_expect_success 'setup: create files and initial commit' '
> +    mkdir testdir &&
> +    >testdir/f\* &&
> +    >testdir/f\*\* &&
> +    >testdir/foo &&
> +    git add testdir &&
> +    git commit -m "Initial setup with literal wildcard files"
> +'
> +
> +test_expect_success 'clean slate before testing wildcard behavior' '
> +    git rm -rf testdir &&
> +    git commit -m "Clean state"
> +'
>
> +test_expect_success 'recreate files to test add behavior' '
> +    mkdir testdir &&
> +    >testdir/f\* &&
> +    >testdir/f\*\* &&
> +    >testdir/foo
> +'


Two questions:

1. Does this need to be inside a test_expect_success? It seems to me
   that those two tests cases are actually setup code for the next
   two.

2. If so, does it need to have all that setup? I could reproduce the
   bug by only running:

   ```
   git reset
   touch foo 'f*' 'f**'
   git add 'f*'
   git ls-files 
   ```

   btw, this works with your code, congrats!

Other idea: `?` is another wildcard for matching only one character.
Have you tested if the same bug happens with it?

PS: while I was writing this review I pushed this to my GitHub just
to make the CI run the entire test suite since pathspecs are a
sensible part of Git.

Take look at this, it seems that your tests aren't passing on Windows:
https://github.com/lucasoshiro/git/actions/runs/14450183624/job/40521015897.
Perhaps you'll need to change something there. It seems to be
related to how Windows handle paths (specially the \ character, which
means the same as / in Unix). Personally, I'm not a Windows guy and
can't help you further with this. A quick reference on how paths on
Windows work is this (and yeah, they are far more complex than in
Unix):
https://www.fileside.app/blog/2023-03-17_windows-file-paths/

JAYATHEERTH K April 14, 2025, 5:08 p.m. UTC | #2

On Mon, Apr 14, 2025 at 10:21 PM Lucas Seiki Oshiro
<lucasseikioshiro@gmail.com> wrote:
>
> Hi!
>
Hi Lucas,

> > +test_expect_success 'setup: create files and initial commit' '
> > +    mkdir testdir &&
> > +    >testdir/f\* &&
> > +    >testdir/f\*\* &&
> > +    >testdir/foo &&
> > +    git add testdir &&
> > +    git commit -m "Initial setup with literal wildcard files"
> > +'
> > +
> > +test_expect_success 'clean slate before testing wildcard behavior' '
> > +    git rm -rf testdir &&
> > +    git commit -m "Clean state"
> > +'
> >
> > +test_expect_success 'recreate files to test add behavior' '
> > +    mkdir testdir &&
> > +    >testdir/f\* &&
> > +    >testdir/f\*\* &&
> > +    >testdir/foo
> > +'
>
>
> Two questions:
>
> 1. Does this need to be inside a test_expect_success? It seems to me
>    that those two tests cases are actually setup code for the next
>    two.
>
> 2. If so, does it need to have all that setup? I could reproduce the
>    bug by only running:
>
>    ```
>    git reset
>    touch foo 'f*' 'f**'
>    git add 'f*'
>    git ls-files
>    ```
>

Umm, I think the setup should just be a single block. I will send a
patch on this
I'm parallely working on a second patch.

>    btw, this works with your code, congrats!
>

Thank you, credit goes to Peff(Jeff King) I almost lost track.

> Other idea: `?` is another wildcard for matching only one character.
> Have you tested if the same bug happens with it?
>

Yup I think that's also a great suggestion, I think consolidating setup
and adding at least a few different wildcards will be good.

> PS: while I was writing this review I pushed this to my GitHub just
> to make the CI run the entire test suite since pathspecs are a
> sensible part of Git.
>
> Take look at this, it seems that your tests aren't passing on Windows:
> https://github.com/lucasoshiro/git/actions/runs/14450183624/job/40521015897.

Oh damn!! That's a silly mistake. I almost forgot windows exist!!
Thanks for letting me know

> Perhaps you'll need to change something there. It seems to be
> related to how Windows handle paths (specially the \ character, which
> means the same as / in Unix). Personally, I'm not a Windows guy and

Same not a windows guy, but I will have to read some things out.
Will figure it out.

> can't help you further with this. A quick reference on how paths on
> Windows work is this (and yeah, they are far more complex than in
> Unix):
> https://www.fileside.app/blog/2023-03-17_windows-file-paths/
>
>

Thanks again Lucas, these help.

-Jayatheerth

Junio C Hamano April 14, 2025, 9:42 p.m. UTC | #3

K Jayatheerth <jayatheerthkulkarni2005@gmail.com> writes:

> When 'git add <pattern>' is run, and a file exists whose literal
> name matches <pattern> (e.g., a file named "f*" when running
> 'git add f*'), Git should treat the pattern as a wildcard
> matching multiple files, rather than just adding the literal file.

This pretends to be a separate and independent patch, but it in
reality depends on the other change to dir.c, doesn't it?

Either make them into two-patch series, or better yet make them into
a single patch, perhaps?

> Add a test case that:
>  1. Creates files 'f*', 'f**', and 'foo'.
>  2. Verifies that 'git add "f\*\*"'
>     correctly adds only the literal file 'f**'.
>  3. Verifies that 'git add 'f*'' (quoted to prevent shell expansion)
>     correctly adds 'f*', 'f**', and 'foo' by treating 'f*' as a
>     wildcard.

That is something easily can be read from the script itself.  Don't
repeat what the code does, UNLESS the explanation helps readers to
understand what the code does NOT tell them (like, "why does the
code does what it does?").

> Covering these the test adds 5 cases.
>
> Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
> ---
>  t/meson.build                   |  1 +
>  t/t3706-add-wildcard-literal.sh | 44 +++++++++++++++++++++++++++++++++
>  2 files changed, 45 insertions(+)
>  create mode 100755 t/t3706-add-wildcard-literal.sh

I am not sure if this should be done as a new test script and covers
only "git add".  Is "git add" the only code paths that triggers
do_match_pathspec()?  If there are many, it may make sense to have a
new script (like this patch does) and cover all of these commands
that are corrected by the previous patch.  If not, and if it truly
affects only "git add", then it would make more sense to add to an
existing test script that covers "git add".

> +test_expect_success 'setup: create files and initial commit' '
> +    mkdir testdir &&
> +    >testdir/f\* &&
> +    >testdir/f\*\* &&
> +    >testdir/foo &&
> +    git add testdir &&
> +    git commit -m "Initial setup with literal wildcard files"
> +'
> +
> +test_expect_success 'clean slate before testing wildcard behavior' '
> +    git rm -rf testdir &&
> +    git commit -m "Clean state"
> +'

What's the point of the above two test pieces?  The initial commit
is not used in any way after it gets created, the wildcard pathspec,
which is the topic of this test, is not involved in its creation in
any way, and everything gets removed before the real test after this
step starts.  Not complaining, but showing puzzlement, as I truly do
not see the point of these two steps.

> +test_expect_success 'recreate files to test add behavior' '
> +    mkdir testdir &&
> +    >testdir/f\* &&
> +    >testdir/f\*\* &&
> +    >testdir/foo
> +'
> +
> +test_expect_success 'quoted literal: git add "f\\*\\*" adds only f**' '
> +    git reset &&

What is the point of this "reset"?  Since the last commit, the tree
of HEAD matches the index, and nobody touched the index after that.

> +    git add "testdir/f\\*\\*" &&

OK.  We are giving the pathspec with asterisks quoted in it, to make
sure the asterisks are not interpreted as glob.  So there is only
one path that gets added in the end, and that is what is tested.

That's a good test (except that I do not understand what the "reset"
at the beginning is trying to do).

> +    git ls-files >actual &&
> +    echo "testdir/f**" >expected &&
> +    test_cmp expected actual
> +'
> +
> +test_expect_success 'wildcard: git add f* adds f*, f** and foo' '
> +    git reset &&
> +    git add '\''testdir/f*'\'' &&

Why not just

	git add "testdir/f*" &&

to match the previous one?

> +    git ls-files | sort >actual &&

"git" command on the left hand side of a pipe means even if it
fails, its exit status is hidden.  Also, the order ls-files emits
paths is well defined (sorted in the byte-value order regardless of
locale), so piping its output through sort loses information
unnecessarily (you wouldn't notice if ls-files started emitting its
paths in an incorrect order).

Since you use these three pathnames in this order quote often in
this script, why not do this once upfront:

	cat >list-of-files <<-\EOF &&
	testdir/f*
	testdir/f**
        testdir/foo
	EOF

and then use the list of files in later steps like so:

	while read path
	do
		>"$path"
	done <list-of-files &&

	git ls-files >actual &&
	test_cmp list-of-files actual

> +    printf "%s\n" "testdir/f*" "testdir/f**" "testdir/foo" | sort >expected &&
> +    test_cmp expected actual
> +'
> +
> +test_done
> \ No newline at end of file

Please make sure your files are properly formatted.  Perhaps we
should add whitespace error class 'incomplete-lines' that catches
this, and enable it globally, or something?

brian m. carlson April 15, 2025, 10:32 p.m. UTC | #4

On 2025-04-12 at 17:40:51, K Jayatheerth wrote:
> +test_expect_success 'recreate files to test add behavior' '
> +    mkdir testdir &&
> +    >testdir/f\* &&
> +    >testdir/f\*\* &&

I just want to point out that creating files with asterisks may not be
possible on Windows due to limitations in the file system.  I'm not a
Windows expert, so unfortunately I can't provide more details than that,
but you may end up needing to add a prerequisite here to skip this on
our Windows platforms if necessary.  Hopefully CI and a suitable search
can help you figure it out.

> +test_done
> \ No newline at end of file

We do want to keep newlines at the end of a file.  POSIX mandates one on
text files and some systems are less tolerant of missing newlines than
others.  Usually Linux and the BSDs handle this just fine, but some
proprietary Unix systems, which unfortunately we don't have CI for, tend
to be the ones that are less happy about this.

I haven't given this a full review, since others have done that instead,
but just pointed out one or two things that got my attention.

JAYATHEERTH K April 16, 2025, 1:56 a.m. UTC | #5

On Wed, Apr 16, 2025 at 4:02 AM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On 2025-04-12 at 17:40:51, K Jayatheerth wrote:
> > +test_expect_success 'recreate files to test add behavior' '
> > +    mkdir testdir &&
> > +    >testdir/f\* &&
> > +    >testdir/f\*\* &&
>
> I just want to point out that creating files with asterisks may not be
> possible on Windows due to limitations in the file system.  I'm not a
> Windows expert, so unfortunately I can't provide more details than that,
> but you may end up needing to add a prerequisite here to skip this on
> our Windows platforms if necessary.  Hopefully CI and a suitable search
> can help you figure it out.
>

Ok I will look into it, thank you for letting me know

> > +test_done
> > \ No newline at end of file
>
> We do want to keep newlines at the end of a file.  POSIX mandates one on
> text files and some systems are less tolerant of missing newlines than
> others.  Usually Linux and the BSDs handle this just fine, but some
> proprietary Unix systems, which unfortunately we don't have CI for, tend
> to be the ones that are less happy about this.
>

Ok I will make sure of that

> I haven't given this a full review, since others have done that instead,
> but just pointed out one or two things that got my attention.
> --
> brian m. carlson (they/them)
> Toronto, Ontario, CA


While I was looking into the reviews I was creating various test cases
with these files
'*'  '**'  '?'  '\*'  '[abc]'   commit_files  'f*'  'f**'  'file[1-3]'
 'foo*bar'  'f?z'  'hello?world'

Everything went correct
But when I checked \* and which is used for getting * as specific
but there is also a literal \* in the above files

So it still adds both, I'm unsure if that is the intended behaviour.

but when I say git add "\*" it adds both the files * and \*
But rest of the other wildcards and literals work as intended which is why
I incorporated the \* literal

I also think I will still divide the test file because
git add isn't the only one that looks into wildcards and pathspec

I think something like git commit "*" -m "Test" also would be a great test

or even git rm command.

About the windows question, I think I will see if there is any common
ground I could find
But until then I think prereq is a great option.

For reference my test file looks something like this,

--- /dev/null
+++ b/t/t6137-pathspec-wildcard-literal.sh
@@ -0,0 +1,139 @@
+#!/bin/sh
+
+test_description='test wildcards and literals with various git commands'
+
+. ./test-lib.sh
+
+reset_git_repo () {
+ rm -rf .git &&
+ git init
+}
+
+test_expect_success 'setup' '
+ mkdir testdir &&
+ cd testdir &&
+ touch "*" "?" "[abc]" "f*" "f?z" "a" &&
+ touch "**" "foo*bar" "hello?world" "f**" "hello_world" &&
+ git init
+'
+
+test_expect_success 'check * wildcard in git add' '
+ git init &&
+ git add "*" &&
+ cat >expected_files <<EOF &&
+*
+**
+?
+[abc]
+a
+f*
+f**
+f?z
+foo*bar
+hello?world
+hello_world
+EOF
+ git ls-files >actual_files &&
+ test_cmp expected_files actual_files
+'
+
+test_expect_success 'check \* literal in git add' '
+ reset_git_repo &&
+ git add "\*" &&
+ cat >expected_files <<EOF &&
+*
+EOF
+ git ls-files >actual_files &&
+ test_cmp expected_files actual_files
+'
+
+test_expect_success 'check f* wildcard in git add' '
+ reset_git_repo &&
+ git add "f*" &&
+ cat >expected_files <<EOF &&
+f*
+f**
+f?z
+foo*bar
+EOF
+ git ls-files >actual_files &&
+ test_cmp expected_files actual_files
+'
+
+test_expect_success 'check f\* literal in git add' '
+ reset_git_repo &&
+ git add "f\*" &&
+ cat >expected_files <<EOF &&
+f*
+EOF
+ git ls-files >actual_files &&
+ test_cmp expected_files actual_files
+'
+
+test_expect_success 'check f** wildcard in git add' '
+ reset_git_repo &&
+ git add "f**" &&
+ cat >expected_files <<EOF &&
+f*
+f**
+f?z
+foo*bar
+EOF
+ git ls-files >actual_files &&
+ test_cmp expected_files actual_files
+'
+
+test_expect_success 'check f\*\* literal in git add' '
+ reset_git_repo &&
+ git add "f\*\*" &&
+ cat >expected_files <<EOF &&
+f**
+EOF
+ git ls-files >actual_files &&
+ test_cmp expected_files actual_files
+'
+
+test_expect_success 'check ? wildcard in git add' '
+ reset_git_repo &&
+ git add "?" &&
+ cat >expected_files <<EOF &&
+*
+?
+a
+EOF
+ git ls-files >actual_files &&
+ test_cmp expected_files actual_files
+'
+
+test_expect_success 'check \? literal in git add' '
+ reset_git_repo &&
+ git add "\?" &&
+ cat >expected_files <<EOF &&
+?
+EOF
+ git ls-files >actual_files &&
+ test_cmp expected_files actual_files
+'
+
+test_expect_success 'check hello?world wildcard in git add' '
+ reset_git_repo &&
+ git add "hello?world" &&
+ cat >expected_files <<EOF &&
+hello?world
+hello_world
+EOF
+ git ls-files >actual_files &&
+ test_cmp expected_files actual_files
+'
+
+test_expect_success 'check hello\?world literal in git add' '
+ reset_git_repo &&
+ git add "hello\?world" &&
+ cat >expected_files <<EOF &&
+hello?world
+EOF
+ git ls-files >actual_files &&
+ test_cmp expected_files actual_files
+'
+
+test_done

t3706: Add test for wildcard vs literal pathspec

Commit Message

Comments

Patch