diff mbox series

[5/7] sparse-checkout: reject non-cone-mode patterns starting with a '#'

Message ID 265cbe36b2df5a9a076877fe3ddc3880a64a9217.1644712798.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series RFC: sparse checkout: make --cone mode the default, and check add/set argument validity | expand

Commit Message

Elijah Newren Feb. 13, 2022, 12:39 a.m. UTC
From: Elijah Newren <newren@gmail.com>

In sparse-checkout add/set, in cone mode any specified directories will
be transformed into appropriate patterns.  In non-cone mode, the
non-option arguments are treated as patterns.

Since .git/info/sparse-checkout will ignore any patterns starting with a
'#' (they are just gitignore patterns), if the user passes an argument
starting with a '#' to sparse-checkout add/set in non-cone mode, it
would just be treated as a comment and ignored.  Error out in such
cases, informing the user that they need to backslash escape it.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/sparse-checkout.c          | 5 +++++
 t/t1091-sparse-checkout-builtin.sh | 6 ++++++
 2 files changed, 11 insertions(+)

Comments

Junio C Hamano Feb. 14, 2022, 5:59 p.m. UTC | #1
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +	if (!core_sparse_checkout_cone)
> +		for (i = 0; i < argc; i++)
> +			if (argv[i][0] == '#')
> +				die(_("paths beginning with a '#' must be preceeded by a backslash"));
> +

Whenever I see an error message like this, where it is clear that
the command knows the only viable solution is to the issue, and yet
still refuses to do-what-the-user-clearly-meant-to-do (is there a
valid reason to copy and paste "# comment" line, which clearly is
not about choosing which paths to use/ignore, from an existing file
and feed it to the command?), I question if it should be solved the
opposite way.

That is, to pretend as if "\" + argv[i] was given and then give the
user either a warning saying what we did, or an unsquelcheable advice
message (no need for advice.* config---the user can avoid triggering
it by learning what the advice message would say, which is to use \#
when they mean to give a pattern that begins with a pound).


>  	for (i = 0; i < argc; i++) {
>  		struct cache_entry *ce;
>  		struct index_state *index = the_repository->index;
> diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
> index 1d95fa47258..32b77415679 100755
> --- a/t/t1091-sparse-checkout-builtin.sh
> +++ b/t/t1091-sparse-checkout-builtin.sh
> @@ -857,4 +857,10 @@ test_expect_success 'by default, non-cone mode will warn on individual files' '
>  	grep "passing directories or less specific patterns is recommended" warning
>  '
>  
> +test_expect_success 'paths starting with hash must be escaped in non-cone mode' '
> +	test_must_fail git -C repo sparse-checkout set --no-cone "#funny-path" 2>error &&
> +
> +	grep "paths beginning.*#.*must be preceeded by a backslash" error
> +'
> +
>  test_done
Elijah Newren Feb. 15, 2022, 4:31 a.m. UTC | #2
On Mon, Feb 14, 2022 at 9:59 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > +     if (!core_sparse_checkout_cone)
> > +             for (i = 0; i < argc; i++)
> > +                     if (argv[i][0] == '#')
> > +                             die(_("paths beginning with a '#' must be preceeded by a backslash"));
> > +
>
> Whenever I see an error message like this, where it is clear that
> the command knows the only viable solution is to the issue, and yet
> still refuses to do-what-the-user-clearly-meant-to-do (is there a
> valid reason to copy and paste "# comment" line, which clearly is
> not about choosing which paths to use/ignore, from an existing file
> and feed it to the command?), I question if it should be solved the
> opposite way.
>
> That is, to pretend as if "\" + argv[i] was given and then give the
> user either a warning saying what we did, or an unsquelcheable advice
> message (no need for advice.* config---the user can avoid triggering
> it by learning what the advice message would say, which is to use \#
> when they mean to give a pattern that begins with a pound).

If this were the only special character case, I'd totally agree, but I
do worry a bit that escaping this particular case might lead users to
expect us to escape and fix other special characters from '*?[]!\'.
If users have files with those characters and specify an argument with
one of those, are we to automatically escape them as well?  If we
don't escape the other characters but do escape '#', aren't we being
inconsistent?  And if we do escape those other characters too, aren't
we breaking users who are trying to specify patterns (which is what
non-cone mode is all about)?  Personally, I'd rather drop this patch
than introduce such an inconsistency.
Junio C Hamano Feb. 16, 2022, 1:07 a.m. UTC | #3
Elijah Newren <newren@gmail.com> writes:

> If this were the only special character case, I'd totally agree, but I
> do worry a bit that escaping this particular case might lead users to
> expect us to escape and fix other special characters from '*?[]!\'.

Sorry, but I do not quite get why it is a problem.  I understand
that the idea behind this "rejection" is that "#" is special only
when it appears in files (as comment introducer) and must be
prefixed with "\", right?

Do any of the wildcard characters mean different things depending on
where they appear?  Isn't '*' a wildcard to match 0-or-more-bytes
whether it appears in files or on the command line, and need to
prefixed with "\" to make it non-special regardless of where it is
found?

> If users have files with those characters and specify an argument with
> one of those, are we to automatically escape them as well?  If we
> don't escape the other characters but do escape '#', aren't we being
> inconsistent?

I do not quite get where you are seeing an inconsistency.  Do you
mean that it is inconsistent that "# comment" is only allowed in
files but not on the command line?  If so, a way to make it
consistent may be to allow "# comment" even from the command line
;-)
Elijah Newren Feb. 16, 2022, 2:23 a.m. UTC | #4
On Tue, Feb 15, 2022 at 5:07 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > If this were the only special character case, I'd totally agree, but I
> > do worry a bit that escaping this particular case might lead users to
> > expect us to escape and fix other special characters from '*?[]!\'.
>
> Sorry, but I do not quite get why it is a problem.  I understand
> that the idea behind this "rejection" is that "#" is special only
> when it appears in files (as comment introducer) and must be
> prefixed with "\", right?
>
> Do any of the wildcard characters mean different things depending on
> where they appear?  Isn't '*' a wildcard to match 0-or-more-bytes
> whether it appears in files or on the command line, and need to
> prefixed with "\" to make it non-special regardless of where it is
> found?
>
> > If users have files with those characters and specify an argument with
> > one of those, are we to automatically escape them as well?  If we
> > don't escape the other characters but do escape '#', aren't we being
> > inconsistent?
>
> I do not quite get where you are seeing an inconsistency.  Do you
> mean that it is inconsistent that "# comment" is only allowed in
> files but not on the command line?

I don't understand what distinction you are trying to make between the
file or the command line; for non-cone mode, all positional arguments
to sparse-checkout {add,set} are taken as-is and inserted into the
$GIT_DIR/info/sparse-checkout file directly.

I don't like just assuming that users are specifying paths rather than
patterns, when non-cone mode is all about specifying patterns rather
than paths; it just feels broken to me.  However, since comments can
never match anything and part of the point of the sparse-checkout
command is so that users don't have to edit or look at the
$GIT_DIR/info/sparse-checkout file, it seemed worth flagging.

With all the special characters in non-cone mode ('*?[]!#\') and the
years of training we've given to users to edit
$GITDIR/info/sparse-checkout directly, there's really not much we can
check for; this was the only thing I could think of that seemed
reasonable to flag in non-cone mode.  However, it's really not all
that important to me, so I'll just drop this patch.

>  If so, a way to make it
> consistent may be to allow "# comment" even from the command line
> ;-)

Yes, dropping this patch would keep things consistent, and continue
allowing "# comment" from the command line (even if users are unlikely
to ever look at said comment).  I'll do that.
Junio C Hamano Feb. 16, 2022, 3:05 a.m. UTC | #5
Elijah Newren <newren@gmail.com> writes:

>> I do not quite get where you are seeing an inconsistency.  Do you
>> mean that it is inconsistent that "# comment" is only allowed in
>> files but not on the command line?
>
> I don't understand what distinction you are trying to make between the
> file or the command line; for non-cone mode, all positional arguments
> to sparse-checkout {add,set} are taken as-is and inserted into the
> $GIT_DIR/info/sparse-checkout file directly.

If so, then '# comment" from the command line would be a valid way
to spell a comment, no?  It sounds like the right thing to do here
is just passing it through to $GIT_DIR/info/sparse-checkout and let
it become comment, instead of warning, \-quoting, or rejecting.

> I don't like just assuming that users are specifying paths rather than
> patterns, when non-cone mode is all about specifying patterns rather
> than paths; it just feels broken to me.

Oh, I don't like such an assumption, either.  If the user gives a
pathspec, we do assume that is a collection of patterns.  If we are
taking patterns from the command line, treating them as patterns is
the right thing to do.  I do not see how that interacts with what a
path or pattern that begins with a pound should be handled, though.
diff mbox series

Patch

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 8f8d2bd6ba5..0f9e737ed97 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -704,6 +704,11 @@  static void sanitize_paths(int argc, const char **argv,
 	if (skip_checks)
 		return;
 
+	if (!core_sparse_checkout_cone)
+		for (i = 0; i < argc; i++)
+			if (argv[i][0] == '#')
+				die(_("paths beginning with a '#' must be preceeded by a backslash"));
+
 	for (i = 0; i < argc; i++) {
 		struct cache_entry *ce;
 		struct index_state *index = the_repository->index;
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 1d95fa47258..32b77415679 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -857,4 +857,10 @@  test_expect_success 'by default, non-cone mode will warn on individual files' '
 	grep "passing directories or less specific patterns is recommended" warning
 '
 
+test_expect_success 'paths starting with hash must be escaped in non-cone mode' '
+	test_must_fail git -C repo sparse-checkout set --no-cone "#funny-path" 2>error &&
+
+	grep "paths beginning.*#.*must be preceeded by a backslash" error
+'
+
 test_done