diff mbox series

[v3,03/11] read-tree, merge-recursive: overwrite ignored files by default

Message ID 2501a0c552ad5147f61a96b9ebe45c5199e1dbfd.1632760428.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 491a7575f188cbf6cfb5be75981a40beb4c22b44
Headers show
Series Fix various issues around removal of untracked files/directories | expand

Commit Message

Elijah Newren Sept. 27, 2021, 4:33 p.m. UTC
From: Elijah Newren <newren@gmail.com>

This fixes a long-standing patchwork of ignored files handling in
read-tree and merge-recursive, called out and suggested by Junio long
ago.  Quoting from commit dcf0c16ef1 ("core.excludesfile clean-up"
2007-11-16):

    git-read-tree takes --exclude-per-directory=<gitignore>,
    not because the flexibility was needed.  Again, this was
    because the option predates the standardization of the ignore
    files.

    ...

    On the other hand, I think it makes perfect sense to fix
    git-read-tree, git-merge-recursive and git-clean to follow the
    same rule as other commands.  I do not think of a valid use case
    to give an exclude-per-directory that is nonstandard to
    read-tree command, outside a "negative" test in the t1004 test
    script.

    This patch is the first step to untangle this mess.

    The next step would be to teach read-tree, merge-recursive and
    clean (in C) to use setup_standard_excludes().

History shows each of these were partially or fully fixed:

  * clean was taught the new trick in 1617adc7a0 ("Teach git clean to
    use setup_standard_excludes()", 2007-11-14).

  * read-tree was primarily used by checkout & merge scripts.  checkout
    and merge later became builtins and were both fixed to use the new
    setup_standard_excludes() handling in fc001b526c ("checkout,merge:
    loosen overwriting untracked file check based on info/exclude",
    2011-11-27).  So the primary users were fixed, though read-tree
    itself was not.

  * merge-recursive has now been replaced as the default merge backend
    by merge-ort.  merge-ort fixed this by using
    setup_standard_excludes() starting early in its implementation; see
    commit 6681ce5cf6 ("merge-ort: add implementation of checkout()",
    2020-12-13), largely due to its design depending on checkout() and
    thus being influenced by the checkout code.  However,
    merge-recursive itself was not fixed here, in part because its
    design meant it had difficulty differentiating between untracked
    files, ignored files, leftover tracked files that haven't been
    removed yet due to order of processing files, and files written by
    itself due to collisions).

Make the conversion more complete by now handling read-tree and
handling at least the unpack_trees() portion of merge-recursive.  While
merge-recursive is on its way out, fixing the unpack_trees() portion is
easy and facilitates some of the later changes in this series.  Note
that fixing read-tree makes the --exclude-per-directory option to
read-tree useless, so we remove it from the documentation (though we
continue to accept it if passed).

The read-tree changes happen to fix a bug in t1013.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/git-read-tree.txt | 18 +-----------------
 builtin/read-tree.c             | 25 ++++++++++---------------
 merge-recursive.c               | 11 ++++++++++-
 t/t1013-read-tree-submodule.sh  |  1 -
 4 files changed, 21 insertions(+), 34 deletions(-)

Comments

Jack O'Connor Dec. 13, 2021, 5:12 p.m. UTC | #1
> read-tree, merge-recursive: overwrite ignored files by default

When this patch shipped in v1.34, a test broke in a project of mine
(https://github.com/buildinspace/peru/blob/e9ba6e0024ea08105a8d027f958899cca39aeb9a/tests/test_cache.py#L111-L117)
that was relying on git read-tree *not* to respect .gitignore files.
(Obligatory https://xkcd.com/1172.) That peru tool is using git
plumbing commands to manage trees of files, but it tries to keep this
implementation detail internal, and behaving differently in the
presence of a .gitignore file belonging to the user would leak this
internal implementation detail. I've been trying to figure out a way
to reproduce the Git 1.33 behavior in Git 1.34, but so far I haven't
found any flags or configs to do that. (For example, putting !* in
.git/info/exclude doesn't seem to help, I think because a .gitignore
file in the working tree takes precedence.) Can anyone suggest another
workaround?

This is my first mail to this list, so please let me know if I mess up
the etiquette.

- Jack
Elijah Newren Dec. 13, 2021, 8:10 p.m. UTC | #2
On Mon, Dec 13, 2021 at 9:12 AM Jack O'Connor <oconnor663@gmail.com> wrote:
>
> > read-tree, merge-recursive: overwrite ignored files by default
>
> When this patch shipped in v1.34, a test broke in a project of mine
> (https://github.com/buildinspace/peru/blob/e9ba6e0024ea08105a8d027f958899cca39aeb9a/tests/test_cache.py#L111-L117)
> that was relying on git read-tree *not* to respect .gitignore files.
> (Obligatory https://xkcd.com/1172.) That peru tool is using git
> plumbing commands to manage trees of files, but it tries to keep this
> implementation detail internal, and behaving differently in the
> presence of a .gitignore file belonging to the user would leak this
> internal implementation detail. I've been trying to figure out a way
> to reproduce the Git 1.33 behavior in Git 1.34, but so far I haven't
> found any flags or configs to do that. (For example, putting !* in
> .git/info/exclude doesn't seem to help, I think because a .gitignore
> file in the working tree takes precedence.) Can anyone suggest another
> workaround?
>
> This is my first mail to this list, so please let me know if I mess up
> the etiquette.

Your email is fine.  :-)  Interesting usage case; thanks for sending it along.

Digging a bit into your repository, it appears this all started
because you noticed that checkout would overwrite ignored files, and
so you switched to reset --keep (in your 637d5c042262 (make cache
export refuse to pave .gitgnored files, 2014-07-22)) and then to
read-tree (in your 057d1af600f9 (Rewrite `export_tree` to allow
deleted files., 2014-08-05)) to avoid having ignored files be
overwritten. You could have stuck with `git checkout` all along, and
just passed it the --no-overwrite-ignore flag.  You probably just
missed the existence of that flag, because Duy forgot to document it
for 8 years (see git.git's commit 9d223d43e5 ("doc: document
--overwrite-ignore", 2019-03-29))  Going back to checkout might
provide you a workaround.  (Also, another random thing I noticed while
looking at your repo: `--diff-filter=d` is a much better way of
checking for not-deleted-changes than using `--diff-filter=ACMRTUXB`.
Note the lowercase 'd' rather than uppercase.)

Your report suggests more places should accept the
--no-overwrite-ignore flag, which I alluded to as a possibility in the
sixth patch in the series ("Remove ignored files by default when they
are in the way"[1]) and the comments in the cover letter about
precious ignored files (under "SIDENOTE about treating ignored files
as precious"[2]).  And perhaps we could have a core.overwriteIgnore
config option for setting a different global default (also as alluded
to in my cover letter).  Doing things would provide additional
workarounds, and finally provide the "precious ignored" concept that
has been discussed occasionally.  I think it's not too hard to do that
on top of my previous patch series.  I'll try to take a look after
some other in-flight series finally land.


[1] https://lore.kernel.org/git/b7fe354effff8da3de53bd9cc40a03b5fd455f67.1632760428.git.gitgitgadget@gmail.com/
[2] https://lore.kernel.org/git/pull.1036.v3.git.1632760428.gitgitgadget@gmail.com/
diff mbox series

Patch

diff --git a/Documentation/git-read-tree.txt b/Documentation/git-read-tree.txt
index 5fa8bab64c2..0222a27c5af 100644
--- a/Documentation/git-read-tree.txt
+++ b/Documentation/git-read-tree.txt
@@ -10,8 +10,7 @@  SYNOPSIS
 --------
 [verse]
 'git read-tree' [[-m [--trivial] [--aggressive] | --reset | --prefix=<prefix>]
-		[-u [--exclude-per-directory=<gitignore>] | -i]]
-		[--index-output=<file>] [--no-sparse-checkout]
+		[-u | -i]] [--index-output=<file>] [--no-sparse-checkout]
 		(--empty | <tree-ish1> [<tree-ish2> [<tree-ish3>]])
 
 
@@ -88,21 +87,6 @@  OPTIONS
 	The command will refuse to overwrite entries that already
 	existed in the original index file.
 
---exclude-per-directory=<gitignore>::
-	When running the command with `-u` and `-m` options, the
-	merge result may need to overwrite paths that are not
-	tracked in the current branch.  The command usually
-	refuses to proceed with the merge to avoid losing such a
-	path.  However this safety valve sometimes gets in the
-	way.  For example, it often happens that the other
-	branch added a file that used to be a generated file in
-	your branch, and the safety valve triggers when you try
-	to switch to that branch after you ran `make` but before
-	running `make clean` to remove the generated file.  This
-	option tells the command to read per-directory exclude
-	file (usually '.gitignore') and allows such an untracked
-	but explicitly ignored file to be overwritten.
-
 --index-output=<file>::
 	Instead of writing the results out to `$GIT_INDEX_FILE`,
 	write the resulting index in the named file.  While the
diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index 96102c222bf..73cb957a69b 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -38,7 +38,7 @@  static int list_tree(struct object_id *oid)
 }
 
 static const char * const read_tree_usage[] = {
-	N_("git read-tree [(-m [--trivial] [--aggressive] | --reset | --prefix=<prefix>) [-u [--exclude-per-directory=<gitignore>] | -i]] [--no-sparse-checkout] [--index-output=<file>] (--empty | <tree-ish1> [<tree-ish2> [<tree-ish3>]])"),
+	N_("git read-tree [(-m [--trivial] [--aggressive] | --reset | --prefix=<prefix>) [-u | -i]] [--no-sparse-checkout] [--index-output=<file>] (--empty | <tree-ish1> [<tree-ish2> [<tree-ish3>]])"),
 	NULL
 };
 
@@ -53,24 +53,16 @@  static int index_output_cb(const struct option *opt, const char *arg,
 static int exclude_per_directory_cb(const struct option *opt, const char *arg,
 				    int unset)
 {
-	struct dir_struct *dir;
 	struct unpack_trees_options *opts;
 
 	BUG_ON_OPT_NEG(unset);
 
 	opts = (struct unpack_trees_options *)opt->value;
 
-	if (opts->dir)
-		die("more than one --exclude-per-directory given.");
-
-	dir = xcalloc(1, sizeof(*opts->dir));
-	dir->flags |= DIR_SHOW_IGNORED;
-	dir->exclude_per_dir = arg;
-	opts->dir = dir;
-	/* We do not need to nor want to do read-directory
-	 * here; we are merely interested in reusing the
-	 * per directory ignore stack mechanism.
-	 */
+	if (!opts->update)
+		die("--exclude-per-directory is meaningless unless -u");
+	if (strcmp(arg, ".gitignore"))
+		die("--exclude-per-directory argument must be .gitignore");
 	return 0;
 }
 
@@ -209,8 +201,11 @@  int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 	if ((opts.update || opts.index_only) && !opts.merge)
 		die("%s is meaningless without -m, --reset, or --prefix",
 		    opts.update ? "-u" : "-i");
-	if ((opts.dir && !opts.update))
-		die("--exclude-per-directory is meaningless unless -u");
+	if (opts.update && !opts.reset) {
+		CALLOC_ARRAY(opts.dir, 1);
+		opts.dir->flags |= DIR_SHOW_IGNORED;
+		setup_standard_excludes(opts.dir);
+	}
 	if (opts.merge && !opts.index_only)
 		setup_work_tree();
 
diff --git a/merge-recursive.c b/merge-recursive.c
index e594d4c3fa1..233d9f686ad 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -408,8 +408,13 @@  static int unpack_trees_start(struct merge_options *opt,
 	memset(&opt->priv->unpack_opts, 0, sizeof(opt->priv->unpack_opts));
 	if (opt->priv->call_depth)
 		opt->priv->unpack_opts.index_only = 1;
-	else
+	else {
 		opt->priv->unpack_opts.update = 1;
+		/* FIXME: should only do this if !overwrite_ignore */
+		CALLOC_ARRAY(opt->priv->unpack_opts.dir, 1);
+		opt->priv->unpack_opts.dir->flags |= DIR_SHOW_IGNORED;
+		setup_standard_excludes(opt->priv->unpack_opts.dir);
+	}
 	opt->priv->unpack_opts.merge = 1;
 	opt->priv->unpack_opts.head_idx = 2;
 	opt->priv->unpack_opts.fn = threeway_merge;
@@ -423,6 +428,10 @@  static int unpack_trees_start(struct merge_options *opt,
 	init_tree_desc_from_tree(t+2, merge);
 
 	rc = unpack_trees(3, t, &opt->priv->unpack_opts);
+	if (opt->priv->unpack_opts.dir) {
+		dir_clear(opt->priv->unpack_opts.dir);
+		FREE_AND_NULL(opt->priv->unpack_opts.dir);
+	}
 	cache_tree_free(&opt->repo->index->cache_tree);
 
 	/*
diff --git a/t/t1013-read-tree-submodule.sh b/t/t1013-read-tree-submodule.sh
index b6df7444c05..bfc90d4cf27 100755
--- a/t/t1013-read-tree-submodule.sh
+++ b/t/t1013-read-tree-submodule.sh
@@ -6,7 +6,6 @@  test_description='read-tree can handle submodules'
 . "$TEST_DIRECTORY"/lib-submodule-update.sh
 
 KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS=1
-KNOWN_FAILURE_SUBMODULE_OVERWRITE_IGNORED_UNTRACKED=1
 
 test_submodule_switch_recursing_with_args "read-tree -u -m"