diff mbox series

[5/5] commit: support the --pathspec-from-file option

Message ID f4847046896848d3f16bc5f3cb7a26271cefd97c.1572895605.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Add --pathspec-from-file option for reset, commit | expand

Commit Message

Linus Arver via GitGitGadget Nov. 4, 2019, 7:26 p.m. UTC
From: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>

This option solves the problem of commandline length limit for UI's
built on top of git. Plumbing commands are not always a good fit, for
two major reasons:
1) Some UI's serve as assistants that help user run git commands. In
   this case, replacing familiar commands with plumbing commands will
   confuse most users.
2) Some UI's have started and grown with porcelain commands. Replacing
   existing logic with plumbing commands could be cumbersome and prone
   to various new problems.

The new option is designed to behave very close to pathspecs passed in
commandline args, so that switching from one to another is simple.

The new option allows to read either a specified file or `stdin`.
Reading from file is a good way to avoid competing for stdin, and
also gives some extra flexibility.

Decisions taken for simplicity:
1) The new option is declared incompatible with other options that
   could use `stdin`.
2) It is not allowed to pass some refspecs in args and others in file.
3) New options do not have shorthands to avoid shorthand conflicts.

Also add new '--pathspec-file-null' switch that mirrors '-z' used in
various places. Some porcelain commands, such as `git commit`, already
use '-z', therefore it needed a new unambiguous name.

Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
---
 Documentation/git-commit.txt    |  14 ++++-
 builtin/commit.c                |  25 ++++++--
 t/t7526-commit-pathspec-file.sh | 107 ++++++++++++++++++++++++++++++++
 3 files changed, 141 insertions(+), 5 deletions(-)
 create mode 100755 t/t7526-commit-pathspec-file.sh

Comments

Phillip Wood Nov. 5, 2019, 4:27 p.m. UTC | #1
Hi Alexandr

On 04/11/2019 19:26, Alexandr Miloslavskiy via GitGitGadget wrote:
> From: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
> 
> This option solves the problem of commandline length limit for UI's
> built on top of git. Plumbing commands are not always a good fit, for
> two major reasons:
> 1) Some UI's serve as assistants that help user run git commands. In
>     this case, replacing familiar commands with plumbing commands will
>     confuse most users.
> 2) Some UI's have started and grown with porcelain commands. Replacing
>     existing logic with plumbing commands could be cumbersome and prone
>     to various new problems.
> 
> The new option is designed to behave very close to pathspecs passed in
> commandline args, so that switching from one to another is simple.
> 
> The new option allows to read either a specified file or `stdin`.
> Reading from file is a good way to avoid competing for stdin, and
> also gives some extra flexibility.
> 
> Decisions taken for simplicity:
> 1) The new option is declared incompatible with other options that
>     could use `stdin`.
> 2) It is not allowed to pass some refspecs in args and others in file.
> 3) New options do not have shorthands to avoid shorthand conflicts.
> 
> Also add new '--pathspec-file-null' switch that mirrors '-z' used in
> various places. Some porcelain commands, such as `git commit`, already
> use '-z', therefore it needed a new unambiguous name.

It might be worth tailoring the message to the command rather than 
having exactly the same message for commit and reset

> 
> Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
> ---
>   Documentation/git-commit.txt    |  14 ++++-
>   builtin/commit.c                |  25 ++++++--
>   t/t7526-commit-pathspec-file.sh | 107 ++++++++++++++++++++++++++++++++
>   3 files changed, 141 insertions(+), 5 deletions(-)
>   create mode 100755 t/t7526-commit-pathspec-file.sh
> 
> diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt
> index 4341d0e3ab..ec4752298d 100644
> --- a/Documentation/git-commit.txt
> +++ b/Documentation/git-commit.txt
> @@ -13,7 +13,8 @@ SYNOPSIS
>   	   [-F <file> | -m <msg>] [--reset-author] [--allow-empty]
>   	   [--allow-empty-message] [--no-verify] [-e] [--author=<author>]
>   	   [--date=<date>] [--cleanup=<mode>] [--[no-]status]
> -	   [-i | -o] [-S[<keyid>]] [--] [<pathspec>...]
> +	   [-i | -o] [--pathspec-from-file=<file> [--pathspec-file-null]]
> +	   [-S[<keyid>]] [--] [<pathspec>...]
>   
>   DESCRIPTION
>   -----------
> @@ -277,6 +278,17 @@ FROM UPSTREAM REBASE" section in linkgit:git-rebase[1].)
>   	the last commit without committing changes that have
>   	already been staged. If used together with `--allow-empty`
>   	paths are also not required, and an empty commit will be created.
> +	
> +--pathspec-from-file=<file>::
> +	Read `<pathspec>` from `<file>` instead. If `<file>` is exactly `-`
> +	then read from standard input. Pathspecs are separated by LF or
> +	CR/LF. Pathspecs can be quoted as explained for the configuration
> +	variable `core.quotePath` (see linkgit:git-config[1]). See also
> +	`--pathspec-file-null` and global `--literal-pathspecs`.
> +
> +--pathspec-file-null::
> +	Only meaningful with `--pathspec-from-file`. Pathspecs are
> +	separated with NUL character and are not expected to be quoted.

I think my comments from patch 3 about <pathspecs> and the option names 
apply here as well

>   -u[<mode>]::
>   --untracked-files[=<mode>]::
> diff --git a/builtin/commit.c b/builtin/commit.c
> index e588bc6ad3..532f305926 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -107,9 +107,9 @@ static int all, also, interactive, patch_interactive, only, amend, signoff;
>   static int edit_flag = -1; /* unspecified */
>   static int quiet, verbose, no_verify, allow_empty, dry_run, renew_authorship;
>   static int config_commit_verbose = -1; /* unspecified */
> -static int no_post_rewrite, allow_empty_message;
> +static int no_post_rewrite, allow_empty_message, pathspec_file_null;
>   static char *untracked_files_arg, *force_date, *ignore_submodule_arg, *ignored_arg;
> -static char *sign_commit;
> +static char *sign_commit, *pathspec_from_file;
>   
>   /*
>    * The default commit message cleanup mode will remove the lines
> @@ -343,6 +343,23 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
>   		       PATHSPEC_PREFER_FULL,
>   		       prefix, argv);
>   
> +	if (pathspec_from_file) {
> +		if (interactive)
> +			die(_("--pathspec-from-file is incompatible with --interactive/--patch"));
> +
> +		if (pathspec.nr)
> +			die(_("--pathspec-from-file is incompatible with path arguments"));
> +
> +		parse_pathspec_file(&pathspec, 0,
> +				    PATHSPEC_PREFER_FULL,
> +				    prefix, pathspec_from_file, pathspec_file_null);
> +	}
> +	else if (pathspec_file_null)
> +		die(_("--pathspec-file-null requires --pathspec-from-file"));
> +
> +	if (!pathspec.nr && (also || (only && !amend && !allow_empty)))
> +		die(_("No paths with --include/--only does not make sense."));

I wonder if there is a way of calling parse_pathspec_file() from 
parse_and_validate_options() instead. Otherwise we end up validating 
options here instead which is a bit messy.

>   	if (read_cache_preload(&pathspec) < 0)
>   		die(_("index file corrupt"));
>   
> @@ -1198,8 +1215,6 @@ static int parse_and_validate_options(int argc, const char *argv[],
>   
>   	if (also + only + all + interactive > 1)
>   		die(_("Only one of --include/--only/--all/--interactive/--patch can be used."));
> -	if (argc == 0 && (also || (only && !amend && !allow_empty)))
> -		die(_("No paths with --include/--only does not make sense."));
>   	cleanup_mode = get_cleanup_mode(cleanup_arg, use_editor);
>   
>   	handle_untracked_files_arg(s);
> @@ -1535,6 +1550,8 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
>   		OPT_BOOL(0, "amend", &amend, N_("amend previous commit")),
>   		OPT_BOOL(0, "no-post-rewrite", &no_post_rewrite, N_("bypass post-rewrite hook")),
>   		{ OPTION_STRING, 'u', "untracked-files", &untracked_files_arg, N_("mode"), N_("show untracked files, optional modes: all, normal, no. (Default: all)"), PARSE_OPT_OPTARG, NULL, (intptr_t)"all" },
> +		OPT_FILENAME(0, "pathspec-from-file", &pathspec_from_file, N_("read pathspecs from file")),
> +		OPT_BOOL(0, "pathspec-file-null", &pathspec_file_null, N_("with --pathspec-from-file, pathspecs are separated with NUL character")),
>   		/* end commit contents options */
>   
>   		OPT_HIDDEN_BOOL(0, "allow-empty", &allow_empty,
> diff --git a/t/t7526-commit-pathspec-file.sh b/t/t7526-commit-pathspec-file.sh
> new file mode 100755
> index 0000000000..c5d68e01e6
> --- /dev/null
> +++ b/t/t7526-commit-pathspec-file.sh

The test comments from patch 3 apply here as well I think.

Overall this series is nicely structured and is looking pretty good

Best Wishes

Phillip

> @@ -0,0 +1,107 @@
> +#!/bin/sh
> +
> +test_description='commit --pathspec-from-file'
> +
> +. ./test-lib.sh
> +
> +test_tick
> +
> +cat > expect.a <<EOF
> +A	fileA.t
> +EOF
> +
> +cat > expect.ab <<EOF
> +A	fileA.t
> +A	fileB.t
> +EOF
> +
> +cat > expect.bc <<EOF
> +A	fileB.t
> +A	fileC.t
> +EOF
> +
> +test_expect_success setup '
> +	test_commit file0 &&
> +	checkpoint=$(git rev-parse --verify HEAD) &&
> +	
> +	echo A >fileA.t &&
> +	echo B >fileB.t &&
> +	echo C >fileC.t &&
> +	echo D >fileD.t &&
> +	git add fileA.t fileB.t fileC.t fileD.t
> +'
> +
> +restore_checkpoint () {
> +	git reset --soft "$checkpoint"
> +}
> +
> +verify_state () {
> +	git diff-tree --no-commit-id --name-status -r HEAD >actual &&
> +	test_cmp "$1" actual
> +}
> +
> +test_expect_success '--pathspec-from-file from stdin' '
> +	restore_checkpoint &&
> +
> +	echo fileA.t | git commit --pathspec-from-file=- -m "Commit" &&
> +
> +	verify_state expect.a
> +'
> +
> +test_expect_success '--pathspec-from-file from file' '
> +	restore_checkpoint &&
> +
> +	echo fileA.t >list &&
> +	git commit --pathspec-from-file=list -m "Commit" &&
> +
> +	verify_state expect.a
> +'
> +
> +test_expect_success 'NUL delimiters' '
> +	restore_checkpoint &&
> +
> +	printf fileA.tQfileB.t | q_to_nul | git commit --pathspec-from-file=- --pathspec-file-null -m "Commit" &&
> +
> +	verify_state expect.ab
> +'
> +
> +test_expect_success 'LF delimiters' '
> +	restore_checkpoint &&
> +
> +	printf "fileA.t\nfileB.t" | git commit --pathspec-from-file=- -m "Commit" &&
> +
> +	verify_state expect.ab
> +'
> +
> +test_expect_success 'CRLF delimiters' '
> +	restore_checkpoint &&
> +
> +	printf "fileA.t\r\nfileB.t" | git commit --pathspec-from-file=- -m "Commit" &&
> +
> +	verify_state expect.ab
> +'
> +
> +test_expect_success 'quotes' '
> +	restore_checkpoint &&
> +
> +	printf "\"file\\101.t\"" | git commit --pathspec-from-file=- -m "Commit" &&
> +
> +	verify_state expect.a
> +'
> +
> +test_expect_success 'quotes not compatible with --pathspec-file-null' '
> +	restore_checkpoint &&
> +
> +	printf "\"file\\101.t\"" >list &&
> +	test_must_fail git commit --pathspec-from-file=list --pathspec-file-null -m "Commit"
> +'
> +
> +test_expect_success 'only touches what was listed' '
> +	restore_checkpoint &&
> +
> +	printf "fileB.t\nfileC.t" | git commit --pathspec-from-file=- -m "Commit" &&
> +
> +	verify_state expect.bc
> +'
> +
> +test_done
>
Alexandr Miloslavskiy Nov. 5, 2019, 7:42 p.m. UTC | #2
On 05.11.2019 17:27, Phillip Wood wrote:

>> Also add new '--pathspec-file-null' switch that mirrors '-z' used in
>> various places. Some porcelain commands, such as `git commit`, already
>> use '-z', therefore it needed a new unambiguous name.
> 
> It might be worth tailoring the message to the command rather than 
> having exactly the same message for commit and reset

I also was somewhat unhappy about duplication. But I didn't figure how 
to do that correctly. Currently the messages for 'git reset' and 'git 
commit' are almost identical.

Maybe in 2nd commit I should say something like "Extend 
`--pathspec-file-null` support to `git commit` (see previous patch for 
`git reset`)" ?

> I think my comments from patch 3 about <pathspecs> and the option names 
> apply here as well

Yes, sure, I will try to apply your suggestions to all patches. 
Hopefully without forgetting things :)

>> +    if (!pathspec.nr && (also || (only && !amend && !allow_empty)))
>> +        die(_("No paths with --include/--only does not make sense."));
> 
> I wonder if there is a way of calling parse_pathspec_file() from 
> parse_and_validate_options() instead. Otherwise we end up validating 
> options here instead which is a bit messy.

Yes, I was also somewhat unhappy about that. I will give it more thought.

> Overall this series is nicely structured and is looking pretty good

Thanks, and also thanks for reviewing my patches!
Junio C Hamano Nov. 6, 2019, 4:51 a.m. UTC | #3
"Alexandr Miloslavskiy via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
>
> This option solves the problem of commandline length limit for UI's
> built on top of git. Plumbing commands are not always a good fit, for
> two major reasons:
> 1) Some UI's serve as assistants that help user run git commands. In
>    this case, replacing familiar commands with plumbing commands will
>    confuse most users.
> 2) Some UI's have started and grown with porcelain commands. Replacing
>    existing logic with plumbing commands could be cumbersome and prone
>    to various new problems.

I think all the comments I made on "reset" with these two options,
both to the proposed log message and to the patch text, applies to
this this step, too, so I won't repeat them.

Thanks for working on this topic.
Alexandr Miloslavskiy Nov. 6, 2019, 3:56 p.m. UTC | #4
I think I have implemented most suggestions in PatchV2. Thanks!

> It might be worth tailoring the message to the command rather than 
> having exactly the same message for commit and reset

I decided to move the general comment to base commit where options are 
introduced and not repeat it where options are supported.

> I wonder if there is a way of calling parse_pathspec_file() from 
> parse_and_validate_options() instead. Otherwise we end up validating 
> options here instead which is a bit messy.

The code looks a bit too entangled to support that without making it 
worse. The biggest concern I have is that parse_and_validate_options() 
will populate `pathspec` and some other function will need to remember 
to clean it up. I like it better when `pathspec` is handled in one place.

I agree that things are not perfect, but this seems to be a consequence 
of other existing problems. For example, I would have expected a 
structure instead of a handful of global variables. That would have 
solved many problems. However, I didn't have the bravery to dive into 
this refactoring.
Phillip Wood Dec. 10, 2019, 10:42 a.m. UTC | #5
Hi Alexandr

Sorry it has taken me so long to reply

On 06/11/2019 15:56, Alexandr Miloslavskiy wrote:
> I think I have implemented most suggestions in PatchV2. Thanks!
> 
>> It might be worth tailoring the message to the command rather than 
>> having exactly the same message for commit and reset
> 
> I decided to move the general comment to base commit where options are 
> introduced and not repeat it where options are supported.
> 
>> I wonder if there is a way of calling parse_pathspec_file() from 
>> parse_and_validate_options() instead. Otherwise we end up validating 
>> options here instead which is a bit messy.
> 
> The code looks a bit too entangled to support that without making it 
> worse. The biggest concern I have is that parse_and_validate_options() 
> will populate `pathspec` and some other function will need to remember 
> to clean it up. I like it better when `pathspec` is handled in one place.

I don't think it's so bad if the pathspec is cleaned up just after it is used, 
the diff below applies on top of your patch - see what you think. The diff 
also adds dies if --all is given with --pathspec-from-file which (together 
with a test) would be worth adding to you series I think.

> 
> I agree that things are not perfect, but this seems to be a consequence 
> of other existing problems. For example, I would have expected a 
> structure instead of a handful of global variables. That would have 
> solved many problems. However, I didn't have the bravery to dive into 
> this refactoring.

Yes it is a pain that the builtin functions tend to use a lot of global 
variables rather than a structure.

Best Wishes

Phillip

--- >8 ---
diff --git a/builtin/commit.c b/builtin/commit.c
index ed40729355..bb9515c44b 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -330,37 +330,18 @@ static void refresh_cache_or_die(int refresh_flags)
 }
 
 static const char *prepare_index(int argc, const char **argv, const char *prefix,
+				 struct pathspec *pathspec,
 				 const struct commit *current_head, int is_status)
 {
 	struct string_list partial = STRING_LIST_INIT_DUP;
-	struct pathspec pathspec;
+
 	int refresh_flags = REFRESH_QUIET;
 	const char *ret;
 
 	if (is_status)
 		refresh_flags |= REFRESH_UNMERGED;
-	parse_pathspec(&pathspec, 0,
-		       PATHSPEC_PREFER_FULL,
-		       prefix, argv);
 
-	if (pathspec_from_file) {
-		if (interactive)
-			die(_("--pathspec-from-file is incompatible with --interactive/--patch"));
-
-		if (pathspec.nr)
-			die(_("--pathspec-from-file is incompatible with pathspec arguments"));
-
-		parse_pathspec_file(&pathspec, 0,
-				    PATHSPEC_PREFER_FULL,
-				    prefix, pathspec_from_file, pathspec_file_nul);
-	} else if (pathspec_file_nul) {
-		die(_("--pathspec-file-nul requires --pathspec-from-file"));
-	}
-
-	if (!pathspec.nr && (also || (only && !amend && !allow_empty)))
-		die(_("No paths with --include/--only does not make sense."));
-
-	if (read_cache_preload(&pathspec) < 0)
+	if (read_cache_preload(pathspec) < 0)
 		die(_("index file corrupt"));
 
 	if (interactive) {
@@ -411,9 +392,9 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
 	 * (A) if all goes well, commit the real index;
 	 * (B) on failure, rollback the real index.
 	 */
-	if (all || (also && pathspec.nr)) {
+	if (all || (also && pathspec->nr)) {
 		hold_locked_index(&index_lock, LOCK_DIE_ON_ERROR);
-		add_files_to_cache(also ? prefix : NULL, &pathspec, 0);
+		add_files_to_cache(also ? prefix : NULL, pathspec, 0);
 		refresh_cache_or_die(refresh_flags);
 		update_main_cache_tree(WRITE_TREE_SILENT);
 		if (write_locked_index(&the_index, &index_lock, 0))
@@ -432,7 +413,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
 	 * and create commit from the_index.
 	 * We still need to refresh the index here.
 	 */
-	if (!only && !pathspec.nr) {
+	if (!only && !pathspec->nr) {
 		hold_locked_index(&index_lock, LOCK_DIE_ON_ERROR);
 		refresh_cache_or_die(refresh_flags);
 		if (active_cache_changed
@@ -474,7 +455,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
 			die(_("cannot do a partial commit during a cherry-pick."));
 	}
 
-	if (list_paths(&partial, !current_head ? NULL : "HEAD", &pathspec))
+	if (list_paths(&partial, !current_head ? NULL : "HEAD", pathspec))
 		exit(1);
 
 	discard_cache();
@@ -505,7 +486,7 @@ static const char *prepare_index(int argc, const char **argv, const char *prefix
 	read_cache_from(ret);
 out:
 	string_list_clear(&partial, 0);
-	clear_pathspec(&pathspec);
+	clear_pathspec(pathspec);
 	return ret;
 }
 
@@ -1148,6 +1129,7 @@ static int parse_and_validate_options(int argc, const char *argv[],
 				      const struct option *options,
 				      const char * const usage[],
 				      const char *prefix,
+				      struct pathspec *pathspec,
 				      struct commit *current_head,
 				      struct wt_status *s)
 {
@@ -1223,19 +1205,42 @@ static int parse_and_validate_options(int argc, const char *argv[],
 		die(_("paths '%s ...' with -a does not make sense"),
 		    argv[0]);
 
+	if (pathspec_from_file) {
+		if (interactive)
+			die(_("--pathspec-from-file is incompatible with --interactive/--patch"));
+
+		if (argc)
+			die(_("--pathspec-from-file is incompatible with pathspec arguments"));
+
+		if (all)
+			die(_("--pathspec-from-file is incompatible with --all"));
+
+		parse_pathspec_file(pathspec, 0,
+				    PATHSPEC_PREFER_FULL,
+				    prefix, pathspec_from_file, pathspec_file_nul);
+	} else if (pathspec_file_nul) {
+		die(_("--pathspec-file-nul requires --pathspec-from-file"));
+	} else {
+		parse_pathspec(pathspec, 0, PATHSPEC_PREFER_FULL, prefix, argv);
+	}
+
+	if (!pathspec->nr && (also || (only && !amend && !allow_empty)))
+		die(_("No paths with --include/--only does not make sense."));
+
 	if (status_format != STATUS_FORMAT_NONE)
 		dry_run = 1;
 
 	return argc;
 }
 
 static int dry_run_commit(int argc, const char **argv, const char *prefix,
+			  struct pathspec *pathspec,
 			  const struct commit *current_head, struct wt_status *s)
 {
 	int committable;
 	const char *index_file;
 
-	index_file = prepare_index(argc, argv, prefix, current_head, 1);
+	index_file = prepare_index(argc, argv, prefix, pathspec, current_head, 1);
 	committable = run_status(stdout, index_file, prefix, 0, s);
 	rollback_index_files();
 
@@ -1571,6 +1576,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 	struct commit *current_head = NULL;
 	struct commit_extra_header *extra = NULL;
 	struct strbuf err = STRBUF_INIT;
+	struct pathspec pathspec;
 
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage_with_options(builtin_commit_usage, builtin_commit_options);
@@ -1590,13 +1596,15 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 	verbose = -1; /* unspecified */
 	argc = parse_and_validate_options(argc, argv, builtin_commit_options,
 					  builtin_commit_usage,
-					  prefix, current_head, &s);
+					  prefix, &pathspec, current_head, &s);
 	if (verbose == -1)
 		verbose = (config_commit_verbose < 0) ? 0 : config_commit_verbose;
 
 	if (dry_run)
-		return dry_run_commit(argc, argv, prefix, current_head, &s);
-	index_file = prepare_index(argc, argv, prefix, current_head, 0);
+		return dry_run_commit(argc, argv, prefix, &pathspec,
+				     current_head, &s);
+	index_file = prepare_index(argc, argv, prefix, &pathspec, current_head,
+				   0);
 
 	/* Set up everything for writing the commit object.  This includes
 	   running hooks, writing the trees, and interacting with the user.  */
Alexandr Miloslavskiy Dec. 11, 2019, 11:43 a.m. UTC | #6
On 10.12.2019 11:42, Phillip Wood wrote:
> I don't think it's so bad if the pathspec is cleaned up just after it is used,
> the diff below applies on top of your patch - see what you think. The diff
> also adds dies if --all is given with --pathspec-from-file which (together
> with a test) would be worth adding to you series I think.

Unfortunately, your reply came too late, topic was cooking in pu/next 
for a while and merged into master yesterday: c58ae96f.

I understand that your patch consists of two parts:

1) Adding test for --all
------------------------
I must admit that I overlooked that there was a similar test for 
args-based pathspec. I will add this part in my next topic for 
--pathspec-from-file. I expect it to appear in the next day or two. I 
will try to remember to CC you to it.

2) Moving parsing/validation into `parse_and_validate_options()`
------------------------
Again, I agree that having parsing/validation outside is suboptimal.
However, with current code, I find it to be a choice between two evils, 
and my choice was "outside but clear" to "inside but obscure".

What I find obscure in your suggestion/patch is that innocently looking 
`prepare_index()` suddenly clears pathspec as well. It's even harder to 
see when called through `dry_run_commit()`.

Now, let me illustrate. There's a similar case in my TODO list, this 
time involving a real bug in git. In `init_db()`, the bug occurs in 
`set_git_dir(real_path(git_dir))`, also due to unexpected clearing.

Now that I pointed my finger at it, please try to find what's wrong. I 
imagine that it won't be easy at all. And it's way harder when there's 
no reason to dig deep into a very specific line of code.

I really try to avoid such type of pitfalls. That's why my choice 
between two evils is what I did.
Phillip Wood Dec. 11, 2019, 2:27 p.m. UTC | #7
Hi Alexandr

On 11/12/2019 11:43, Alexandr Miloslavskiy wrote:
> On 10.12.2019 11:42, Phillip Wood wrote:
>> I don't think it's so bad if the pathspec is cleaned up just after it 
>> is used,
>> the diff below applies on top of your patch - see what you think. The 
>> diff
>> also adds dies if --all is given with --pathspec-from-file which 
>> (together
>> with a test) would be worth adding to you series I think.
> 
> Unfortunately, your reply came too late, topic was cooking in pu/next 
> for a while and merged into master yesterday: c58ae96f.

Ah I thought it might be a bit late to fix up the patch, there could 
always be a follow patch though.

> 
> I understand that your patch consists of two parts:
> 
> 1) Adding test for --all
> ------------------------
> I must admit that I overlooked that there was a similar test for 
> args-based pathspec. I will add this part in my next topic for 
> --pathspec-from-file. I expect it to appear in the next day or two. I 
> will try to remember to CC you to it.

Thanks, one other thing I forgot to mention yesterday is to ask what the 
expected behavior is if the user passes an empty file to 
--pathspec-from-file. With no pathspecs on the command line commit, 
checkout, reset and restore-files all default to operating on all paths 
but passing --pathspec-from-file implies the user wants to specify 
specific paths so I think it would perhaps be better to error out if no 
paths are given. There is a precedent for this in checkout-index which 
does nothing if no paths are given (though I can't remember if it errors 
out or not).

> 
> 2) Moving parsing/validation into `parse_and_validate_options()`
> ------------------------
> Again, I agree that having parsing/validation outside is suboptimal.
> However, with current code, I find it to be a choice between two evils, 
> and my choice was "outside but clear" to "inside but obscure".
> 
> What I find obscure in your suggestion/patch is that innocently looking 
> `prepare_index()` suddenly clears pathspec as well. It's even harder to 
> see when called through `dry_run_commit()`.

It would be easy enough to clear pathspec in cmd_commit() I just didn't 
bother to do it.

> 
> Now, let me illustrate. There's a similar case in my TODO list, this 
> time involving a real bug in git. In `init_db()`, the bug occurs in 
> `set_git_dir(real_path(git_dir))`, also due to unexpected clearing.
> 
> Now that I pointed my finger at it, please try to find what's wrong. I 
> imagine that it won't be easy at all. And it's way harder when there's 
> no reason to dig deep into a very specific line of code.
> 
> I really try to avoid such type of pitfalls. That's why my choice 
> between two evils is what I did.

If I had to hazard a guess I'd say that either set_git_dir() calls 
real_path() which messes up the path passed to it or it does not copy 
the path passed to it and it is messed up by some other code calling 
real_path() after set_git_dir() has returned. If my guess is correct (I 
wouldn't be surprised if it's wrong) I think that's a bit different to 
the pathspec case as it's about the lifetime of the return value of a 
function rather than a function freeing an argument passed to it.

Best Wishes

Phillip
Alexandr Miloslavskiy Dec. 11, 2019, 3:06 p.m. UTC | #8
On 11.12.2019 15:27, Phillip Wood wrote:

> Thanks, one other thing I forgot to mention yesterday is to ask what the 
> expected behavior is if the user passes an empty file to 
> --pathspec-from-file. With no pathspecs on the command line commit, 
> checkout, reset and restore-files all default to operating on all paths 
> but passing --pathspec-from-file implies the user wants to specify 
> specific paths so I think it would perhaps be better to error out if no 
> paths are given. There is a precedent for this in checkout-index which 
> does nothing if no paths are given (though I can't remember if it errors 
> out or not).

Back when I composed patch for `git commit`, I spent some time thinking 
about this question. I agree that empty `--pathspec-from-file` is weird. 
Eventually I decided that `--pathspec-from-file` shall be as close as 
possible to passing pathspec args. So empty file should behave the same 
way as passing zero pathspec args. It's more or less the question "if 
user does something unexpected, what is expected?". I decided that 
treating as zero args doesn't sound dangerous to me, and it's a very 
weird case anyway, so let's just treat as zero args.

>> What I find obscure in your suggestion/patch is that innocently 
>> looking `prepare_index()` suddenly clears pathspec as well. It's even 
>> harder to see when called through `dry_run_commit()`.
> 
> It would be easy enough to clear pathspec in cmd_commit() I just didn't 
> bother to do it.

Yes, this thought also came to me after I replied. I agree that this is 
viable and seems to solve most problems.

Now I'm in the situation where before me, code already wasn't perfect, 
and I can continue to extend it in the same direction, or try to 
refactor to make it better. I actually choose the second path: [1][2]. 
These were two "don't" lessons for me, as both topics now gather dust in 
mail graveyard. No good deed remains unpunished, I guess?

> I think that's a bit different to 
> the pathspec case as it's about the lifetime of the return value of a 
> function rather than a function freeing an argument passed to it.

Both cases are about lifetime that ends where it was not expected. You 
can probably agree that it never makes code easier to understand, and at 
best it will be properly accounted for and does not explode. Again, both 
choices were somewhat evil.

[1] 
https://public-inbox.org/git/pull.479.git.1574969538.gitgitgadget@gmail.com/
[2] 
https://public-inbox.org/git/pull.477.git.1574848137.gitgitgadget@gmail.com/T/#u
Junio C Hamano Dec. 11, 2019, 4:14 p.m. UTC | #9
Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com> writes:

> and I can continue to extend it in the same direction, or try to
> refactor to make it better. I actually choose the second path:
> [1][2]. These were two "don't" lessons for me, as both topics now
> gather dust in mail graveyard.

If nobody commented on your patches, that is not a punishment.  It
could have been that reviewers were busy addressing other issues
back then, in which case a gentle ping by resending a polished
version (it could be that the changes were not presented well to
attract reviewers' attention---polishing the proposed log messages
without changing the patch text might be all it takes to make them
realize that the topic is worth looking at) would help.

> [1]
> https://public-inbox.org/git/pull.479.git.1574969538.gitgitgadget@gmail.com/
> [2]
> https://public-inbox.org/git/pull.477.git.1574848137.gitgitgadget@gmail.com/T/#u

I actually was meaning to read and comment on [1], but lack of time
pushed it down in my queue.  Sorry about that.
Alexandr Miloslavskiy Dec. 11, 2019, 4:20 p.m. UTC | #10
On 11.12.2019 17:14, Junio C Hamano wrote:
> If nobody commented on your patches, that is not a punishment.  It
> could have been that reviewers were busy addressing other issues
> back then, in which case a gentle ping by resending a polished
> version (it could be that the changes were not presented well to
> attract reviewers' attention---polishing the proposed log messages
> without changing the patch text might be all it takes to make them
> realize that the topic is worth looking at) would help.

Thanks for letting me know! This is actually a relief.

I will then merge both topics with my next '--pathspec-from-file' batch, 
because they are all parts of one work. I just thought that since they 
can be submitted separately I should do that, maybe that was a mistake.
Alexandr Miloslavskiy Dec. 12, 2019, 2:56 p.m. UTC | #11
The new topic is now submitted:
https://lore.kernel.org/git/pull.490.git.1576161385.gitgitgadget@gmail.com/
					
Phillip, I tried to CC you, but GitGitGadget did something weird.
I'm currently trying to fix CC issues there:
https://github.com/gitgitgadget/gitgitgadget/pull/163
diff mbox series

Patch

diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt
index 4341d0e3ab..ec4752298d 100644
--- a/Documentation/git-commit.txt
+++ b/Documentation/git-commit.txt
@@ -13,7 +13,8 @@  SYNOPSIS
 	   [-F <file> | -m <msg>] [--reset-author] [--allow-empty]
 	   [--allow-empty-message] [--no-verify] [-e] [--author=<author>]
 	   [--date=<date>] [--cleanup=<mode>] [--[no-]status]
-	   [-i | -o] [-S[<keyid>]] [--] [<pathspec>...]
+	   [-i | -o] [--pathspec-from-file=<file> [--pathspec-file-null]]
+	   [-S[<keyid>]] [--] [<pathspec>...]
 
 DESCRIPTION
 -----------
@@ -277,6 +278,17 @@  FROM UPSTREAM REBASE" section in linkgit:git-rebase[1].)
 	the last commit without committing changes that have
 	already been staged. If used together with `--allow-empty`
 	paths are also not required, and an empty commit will be created.
+	
+--pathspec-from-file=<file>::
+	Read `<pathspec>` from `<file>` instead. If `<file>` is exactly `-`
+	then read from standard input. Pathspecs are separated by LF or
+	CR/LF. Pathspecs can be quoted as explained for the configuration
+	variable `core.quotePath` (see linkgit:git-config[1]). See also
+	`--pathspec-file-null` and global `--literal-pathspecs`.
+
+--pathspec-file-null::
+	Only meaningful with `--pathspec-from-file`. Pathspecs are
+	separated with NUL character and are not expected to be quoted.
 
 -u[<mode>]::
 --untracked-files[=<mode>]::
diff --git a/builtin/commit.c b/builtin/commit.c
index e588bc6ad3..532f305926 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -107,9 +107,9 @@  static int all, also, interactive, patch_interactive, only, amend, signoff;
 static int edit_flag = -1; /* unspecified */
 static int quiet, verbose, no_verify, allow_empty, dry_run, renew_authorship;
 static int config_commit_verbose = -1; /* unspecified */
-static int no_post_rewrite, allow_empty_message;
+static int no_post_rewrite, allow_empty_message, pathspec_file_null;
 static char *untracked_files_arg, *force_date, *ignore_submodule_arg, *ignored_arg;
-static char *sign_commit;
+static char *sign_commit, *pathspec_from_file;
 
 /*
  * The default commit message cleanup mode will remove the lines
@@ -343,6 +343,23 @@  static const char *prepare_index(int argc, const char **argv, const char *prefix
 		       PATHSPEC_PREFER_FULL,
 		       prefix, argv);
 
+	if (pathspec_from_file) {
+		if (interactive)
+			die(_("--pathspec-from-file is incompatible with --interactive/--patch"));
+
+		if (pathspec.nr)
+			die(_("--pathspec-from-file is incompatible with path arguments"));
+
+		parse_pathspec_file(&pathspec, 0,
+				    PATHSPEC_PREFER_FULL,
+				    prefix, pathspec_from_file, pathspec_file_null);
+	}
+	else if (pathspec_file_null)
+		die(_("--pathspec-file-null requires --pathspec-from-file"));
+
+	if (!pathspec.nr && (also || (only && !amend && !allow_empty)))
+		die(_("No paths with --include/--only does not make sense."));
+
 	if (read_cache_preload(&pathspec) < 0)
 		die(_("index file corrupt"));
 
@@ -1198,8 +1215,6 @@  static int parse_and_validate_options(int argc, const char *argv[],
 
 	if (also + only + all + interactive > 1)
 		die(_("Only one of --include/--only/--all/--interactive/--patch can be used."));
-	if (argc == 0 && (also || (only && !amend && !allow_empty)))
-		die(_("No paths with --include/--only does not make sense."));
 	cleanup_mode = get_cleanup_mode(cleanup_arg, use_editor);
 
 	handle_untracked_files_arg(s);
@@ -1535,6 +1550,8 @@  int cmd_commit(int argc, const char **argv, const char *prefix)
 		OPT_BOOL(0, "amend", &amend, N_("amend previous commit")),
 		OPT_BOOL(0, "no-post-rewrite", &no_post_rewrite, N_("bypass post-rewrite hook")),
 		{ OPTION_STRING, 'u', "untracked-files", &untracked_files_arg, N_("mode"), N_("show untracked files, optional modes: all, normal, no. (Default: all)"), PARSE_OPT_OPTARG, NULL, (intptr_t)"all" },
+		OPT_FILENAME(0, "pathspec-from-file", &pathspec_from_file, N_("read pathspecs from file")),
+		OPT_BOOL(0, "pathspec-file-null", &pathspec_file_null, N_("with --pathspec-from-file, pathspecs are separated with NUL character")),
 		/* end commit contents options */
 
 		OPT_HIDDEN_BOOL(0, "allow-empty", &allow_empty,
diff --git a/t/t7526-commit-pathspec-file.sh b/t/t7526-commit-pathspec-file.sh
new file mode 100755
index 0000000000..c5d68e01e6
--- /dev/null
+++ b/t/t7526-commit-pathspec-file.sh
@@ -0,0 +1,107 @@ 
+#!/bin/sh
+
+test_description='commit --pathspec-from-file'
+
+. ./test-lib.sh
+
+test_tick
+
+cat > expect.a <<EOF
+A	fileA.t
+EOF
+
+cat > expect.ab <<EOF
+A	fileA.t
+A	fileB.t
+EOF
+
+cat > expect.bc <<EOF
+A	fileB.t
+A	fileC.t
+EOF
+
+test_expect_success setup '
+	test_commit file0 &&
+	checkpoint=$(git rev-parse --verify HEAD) &&
+	
+	echo A >fileA.t &&
+	echo B >fileB.t &&
+	echo C >fileC.t &&
+	echo D >fileD.t &&
+	git add fileA.t fileB.t fileC.t fileD.t
+'
+
+restore_checkpoint () {
+	git reset --soft "$checkpoint"
+}
+
+verify_state () {
+	git diff-tree --no-commit-id --name-status -r HEAD >actual &&
+	test_cmp "$1" actual
+}
+
+test_expect_success '--pathspec-from-file from stdin' '
+	restore_checkpoint &&
+
+	echo fileA.t | git commit --pathspec-from-file=- -m "Commit" &&
+
+	verify_state expect.a
+'
+
+test_expect_success '--pathspec-from-file from file' '
+	restore_checkpoint &&
+
+	echo fileA.t >list &&
+	git commit --pathspec-from-file=list -m "Commit" &&
+
+	verify_state expect.a
+'
+
+test_expect_success 'NUL delimiters' '
+	restore_checkpoint &&
+
+	printf fileA.tQfileB.t | q_to_nul | git commit --pathspec-from-file=- --pathspec-file-null -m "Commit" &&
+
+	verify_state expect.ab
+'
+
+test_expect_success 'LF delimiters' '
+	restore_checkpoint &&
+
+	printf "fileA.t\nfileB.t" | git commit --pathspec-from-file=- -m "Commit" &&
+
+	verify_state expect.ab
+'
+
+test_expect_success 'CRLF delimiters' '
+	restore_checkpoint &&
+
+	printf "fileA.t\r\nfileB.t" | git commit --pathspec-from-file=- -m "Commit" &&
+
+	verify_state expect.ab
+'
+
+test_expect_success 'quotes' '
+	restore_checkpoint &&
+
+	printf "\"file\\101.t\"" | git commit --pathspec-from-file=- -m "Commit" &&
+
+	verify_state expect.a
+'
+
+test_expect_success 'quotes not compatible with --pathspec-file-null' '
+	restore_checkpoint &&
+
+	printf "\"file\\101.t\"" >list &&
+	test_must_fail git commit --pathspec-from-file=list --pathspec-file-null -m "Commit"
+'
+
+test_expect_success 'only touches what was listed' '
+	restore_checkpoint &&
+
+	printf "fileB.t\nfileC.t" | git commit --pathspec-from-file=- -m "Commit" &&
+
+	verify_state expect.bc
+'
+
+test_done