[04/10] fast-export: avoid dying when filtering by paths and old tags exist
diff mbox series

Message ID 20181111062312.16342-5-newren@gmail.com
State New
Headers show
Series
  • fast export and import fixes and features
Related show

Commit Message

Elijah Newren Nov. 11, 2018, 6:23 a.m. UTC
If --tag-of-filtered-object=rewrite is specified along with a set of
paths to limit what is exported, then any tags pointing to old commits
that do not contain any of those specified paths cause problems.  Since
the old tagged commit is not exported, fast-export attempts to rewrite
such tags to an ancestor commit which was exported.  If no such commit
exists, then fast-export currently die()s.  Five years after the tag
rewriting logic was added to fast-export (see commit 2d8ad4691921,
"fast-export: Add a --tag-of-filtered-object  option for newly dangling
tags", 2009-06-25), fast-import gained the ability to delete refs (see
commit 4ee1b225b99f, "fast-import: add support to delete refs",
2014-04-20), so now we do have a valid option to rewrite the tag to.
Delete these tags instead of dying.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/fast-export.c  |  9 ++++++---
 t/t9350-fast-export.sh | 20 ++++++++++++++++++++
 2 files changed, 26 insertions(+), 3 deletions(-)

Comments

Jeff King Nov. 11, 2018, 6:44 a.m. UTC | #1
On Sat, Nov 10, 2018 at 10:23:06PM -0800, Elijah Newren wrote:

> If --tag-of-filtered-object=rewrite is specified along with a set of
> paths to limit what is exported, then any tags pointing to old commits
> that do not contain any of those specified paths cause problems.  Since
> the old tagged commit is not exported, fast-export attempts to rewrite
> such tags to an ancestor commit which was exported.  If no such commit
> exists, then fast-export currently die()s.  Five years after the tag
> rewriting logic was added to fast-export (see commit 2d8ad4691921,
> "fast-export: Add a --tag-of-filtered-object  option for newly dangling
> tags", 2009-06-25), fast-import gained the ability to delete refs (see
> commit 4ee1b225b99f, "fast-import: add support to delete refs",
> 2014-04-20), so now we do have a valid option to rewrite the tag to.
> Delete these tags instead of dying.

Hmm. That's the right thing to do if we're considering the export to be
an independent unit. But what if I'm just rewriting a portion of history
like:

  git fast-export HEAD~5..HEAD | some_filter | git fast-import

? If I have a tag pointing to HEAD~10, will this delete that? Ideally I
think it would be left alone.

> +test_expect_success 'rewrite tag predating pathspecs to nothing' '
> +	test_create_repo rewrite_tag_predating_pathspecs &&
> +	(
> +		cd rewrite_tag_predating_pathspecs &&
> +
> +		touch ignored &&

We usually prefer ">ignored" to create an empty file rather than
"touch".

> +		git add ignored &&
> +		test_commit initial &&

What do we need this "ignored" for? test_commit should create a file
"initial.t".

> +		echo foo >bar &&
> +		git add bar &&
> +		test_commit add-bar &&

Likewise, "test_commit bar" should work by itself (though note the
filename is "bar.t" in your fast-export command).

> +		git fast-export --tag-of-filtered-object=rewrite --all -- bar >output &&
> +		grep -A 1 refs/tags/v0.0.0.0.0.0.1 output | grep -E ^from.0{40}

I don't think "grep -A" is portable (and we don't seem to otherwise use
it). You can probably do something similar with sed.

Use $ZERO_OID instead of hard-coding 40, which future-proofs for the
hash transition (though I suppose the hash is not likely to get
_shorter_ ;) ).

-Peff
Elijah Newren Nov. 11, 2018, 7:38 a.m. UTC | #2
On Sat, Nov 10, 2018 at 10:44 PM Jeff King <peff@peff.net> wrote:
>
> On Sat, Nov 10, 2018 at 10:23:06PM -0800, Elijah Newren wrote:
>
> > If --tag-of-filtered-object=rewrite is specified along with a set of
> > paths to limit what is exported, then any tags pointing to old commits
> > that do not contain any of those specified paths cause problems.  Since
> > the old tagged commit is not exported, fast-export attempts to rewrite
> > such tags to an ancestor commit which was exported.  If no such commit
> > exists, then fast-export currently die()s.  Five years after the tag
> > rewriting logic was added to fast-export (see commit 2d8ad4691921,
> > "fast-export: Add a --tag-of-filtered-object  option for newly dangling
> > tags", 2009-06-25), fast-import gained the ability to delete refs (see
> > commit 4ee1b225b99f, "fast-import: add support to delete refs",
> > 2014-04-20), so now we do have a valid option to rewrite the tag to.
> > Delete these tags instead of dying.
>
> Hmm. That's the right thing to do if we're considering the export to be
> an independent unit. But what if I'm just rewriting a portion of history
> like:
>
>   git fast-export HEAD~5..HEAD | some_filter | git fast-import
>
> ? If I have a tag pointing to HEAD~10, will this delete that? Ideally I
> think it would be left alone.

A couple things:
  * This code path only triggers in a very specific case: If a tag is
requested for export but points to a commit which is filtered out by
something else (e.g. path limiters and the commit in question didn't
modify any of the relevant paths), AND the user explicitly specified
--tag-of-filtered-object=rewrite (so that the tag in question can be
rewritten to the nearest non-filtered ancestor).
  * You didn't specify to export any tags, only HEAD, so this
situation isn't relevant (the tag wouldn't be exported or deleted).
  * You didn't specify --tag-of-filtered-object=rewrite, so this
situation isn't relevant (even if you had specified a tag to filter,
you'd get an abort instead)

But let's say you do modify the example some:
   git fast-export --tag-of-filtered-object=rewrite
--signed-tags=strip --tags master -- relatively_recent_subdirectory/ |
some_filter | git fast-import

The user asked that all tags and master be exported but only for the
history that touched relatively_recent_subdirectory/, and if any tags
point at commits that are pruned by only asking for commits touching
relatively_recent_subdirectory/, then rewrite what those tags point to
so that they instead point to the nearest non-filtered ancestor.  What
about a commit like v0.1.0 that likely pre-dated the introduction of
relatively_recent_subdirectory/?  It has no nearest ancestor to
rewrite to.  The previous answer was to abort, which is really bad,
especially since the user was clearly asking us to do whatever smart
rewriting we can (--signed-tags=strip and
--tag-of-filtered-object=rewrite).

Perhaps there's a different answer that's workable as well, but this
one, in these circumstances, seemed the most reasonable to me.

> > +test_expect_success 'rewrite tag predating pathspecs to nothing' '
> > +     test_create_repo rewrite_tag_predating_pathspecs &&
> > +     (
> > +             cd rewrite_tag_predating_pathspecs &&
> > +
> > +             touch ignored &&
>
> We usually prefer ">ignored" to create an empty file rather than
> "touch".

Will fix.

>
> > +             git add ignored &&
> > +             test_commit initial &&
>
> What do we need this "ignored" for? test_commit should create a file
> "initial.t".

I think I original had plain "git commit", then switched to
test_commit, then didn't recheck.  Thanks, will fix.

> > +             echo foo >bar &&
> > +             git add bar &&
> > +             test_commit add-bar &&
>
> Likewise, "test_commit bar" should work by itself (though note the
> filename is "bar.t" in your fast-export command).
>
> > +             git fast-export --tag-of-filtered-object=rewrite --all -- bar >output &&
> > +             grep -A 1 refs/tags/v0.0.0.0.0.0.1 output | grep -E ^from.0{40}
>
> I don't think "grep -A" is portable (and we don't seem to otherwise use
> it). You can probably do something similar with sed.
>
> Use $ZERO_OID instead of hard-coding 40, which future-proofs for the
> hash transition (though I suppose the hash is not likely to get
> _shorter_ ;) ).

Will fix these up as well...after waiting for more feedback on
possible alternate suggestions.
Jeff King Nov. 12, 2018, 12:32 p.m. UTC | #3
On Sat, Nov 10, 2018 at 11:38:45PM -0800, Elijah Newren wrote:

> > Hmm. That's the right thing to do if we're considering the export to be
> > an independent unit. But what if I'm just rewriting a portion of history
> > like:
> >
> >   git fast-export HEAD~5..HEAD | some_filter | git fast-import
> >
> > ? If I have a tag pointing to HEAD~10, will this delete that? Ideally I
> > think it would be left alone.
> 
> A couple things:
>   * This code path only triggers in a very specific case: If a tag is
> requested for export but points to a commit which is filtered out by
> something else (e.g. path limiters and the commit in question didn't
> modify any of the relevant paths), AND the user explicitly specified
> --tag-of-filtered-object=rewrite (so that the tag in question can be
> rewritten to the nearest non-filtered ancestor).

Right, I think this is the bit I was missing: somebody has to have
explicitly asked to export the tag. At which point the only sensible
thing to do is drop it.

-Peff
brian m. carlson Nov. 12, 2018, 10:50 p.m. UTC | #4
On Sun, Nov 11, 2018 at 01:44:43AM -0500, Jeff King wrote:
> > +		git fast-export --tag-of-filtered-object=rewrite --all -- bar >output &&
> > +		grep -A 1 refs/tags/v0.0.0.0.0.0.1 output | grep -E ^from.0{40}
> 
> I don't think "grep -A" is portable (and we don't seem to otherwise use
> it). You can probably do something similar with sed.
> 
> Use $ZERO_OID instead of hard-coding 40, which future-proofs for the
> hash transition (though I suppose the hash is not likely to get
> _shorter_ ;) ).

It would indeed be nice if we used $ZERO_OID.  Also, we prefer to write
"egrep", since some less capable systems don't have a grep with -E.
Jeff King Nov. 13, 2018, 2:38 p.m. UTC | #5
On Mon, Nov 12, 2018 at 10:50:43PM +0000, brian m. carlson wrote:

> On Sun, Nov 11, 2018 at 01:44:43AM -0500, Jeff King wrote:
> > > +		git fast-export --tag-of-filtered-object=rewrite --all -- bar >output &&
> > > +		grep -A 1 refs/tags/v0.0.0.0.0.0.1 output | grep -E ^from.0{40}
> > 
> > I don't think "grep -A" is portable (and we don't seem to otherwise use
> > it). You can probably do something similar with sed.
> > 
> > Use $ZERO_OID instead of hard-coding 40, which future-proofs for the
> > hash transition (though I suppose the hash is not likely to get
> > _shorter_ ;) ).
> 
> It would indeed be nice if we used $ZERO_OID.  Also, we prefer to write
> "egrep", since some less capable systems don't have a grep with -E.

I thought that, too, but it is only "grep -F" that has been a problem
for us in the past, and we have many "grep -E" calls already. c.f.
https://public-inbox.org/git/20180910154453.GA15270@sigill.intra.peff.net/

-Peff

Patch
diff mbox series

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 1a299c2a21..89de9d6400 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -774,9 +774,12 @@  static void handle_tag(const char *name, struct tag *tag)
 					break;
 				if (!(p->object.flags & TREESAME))
 					break;
-				if (!p->parents)
-					die("can't find replacement commit for tag %s",
-					     oid_to_hex(&tag->object.oid));
+				if (!p->parents) {
+					printf("reset %s\nfrom %s\n\n",
+					       name, sha1_to_hex(null_sha1));
+					free(buf);
+					return;
+				}
 				p = p->parents->item;
 			}
 			tagged_mark = get_object_mark(&p->object);
diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
index 6a392e87bc..5bf21b4908 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -325,6 +325,26 @@  test_expect_success 'rewriting tag of filtered out object' '
 )
 '
 
+test_expect_success 'rewrite tag predating pathspecs to nothing' '
+	test_create_repo rewrite_tag_predating_pathspecs &&
+	(
+		cd rewrite_tag_predating_pathspecs &&
+
+		touch ignored &&
+		git add ignored &&
+		test_commit initial &&
+
+		git tag -a -m "Some old tag" v0.0.0.0.0.0.1 &&
+
+		echo foo >bar &&
+		git add bar &&
+		test_commit add-bar &&
+
+		git fast-export --tag-of-filtered-object=rewrite --all -- bar >output &&
+		grep -A 1 refs/tags/v0.0.0.0.0.0.1 output | grep -E ^from.0{40}
+	)
+'
+
 cat > limit-by-paths/expected << EOF
 blob
 mark :1