mbox series

[0/3] tag: keep the message file in case ref transaction fails

Message ID cover.1684067644.git.code@khaugsbakk.name (mailing list archive)
Headers show
Series tag: keep the message file in case ref transaction fails | expand

Message

Kristoffer Haugsbakk May 14, 2023, 1:17 p.m. UTC
The ref transaction can fail after the message has been written using the
editor. The ref transaction is attempted after the message file (`TAG_EDITMSG`)
has been unlinked, so there is no backup tag message file to retry the
command.[1]

This is unfortunate if someone has written more than e.g. “v1.99.4” in the
editor. (I don’t know if people write long tag messages in practice.)

Hold on to the tag message file until after the ref transaction in order
to preserve the backup.

† 1: On commit 91428f078b (The eighteenth batch, 2023-05-10)

§ Reproduction script

```
cd /tmp
dir=$(mktemp -d)
cd $dir
git init
git commit --allow-empty -mInit
git tag release/v1
# Fails
git tag -a release
```

Error message:

```
fatal: cannot lock ref 'refs/tags/release': 'refs/tags/release/v1' exists; cannot create 'refs/tags/release'
```

Better error message and behavior:

```
The tag message has been left in .git/TAG_EDITMSG
fatal: cannot lock ref 'refs/tags/release': 'refs/tags/release/v1' exists; cannot create 'refs/tags/release'
```

§ Alternatives considered

My first thought was to find a way to “dry run” the ref update before opening
the editor (the edge case of the ref update command succeeding the first time
but not the second *real* time seemed incredibly unlikely to happen by
happenstance, so I saw no reason to consider that). However that seemed like it
would involve more code and conditionals, and I don’t know if the dry-run mode
is even supported.

A benefit of this alternative approach would be to error out immediately instead
of opening the editor. But trying to create a tag which collides with an
existing “namespace” seems very unlikely to happen in practice.[2] Losing a file
is much worse than being inconvenienced to retry the command, so I decided to
just focus on the former problem.

Most importantly though this approach was within my ability to implement.

† 2: Just observe my “Reproduction script”: one tries to create `release` after
    someone else made `release/v1`. But what is just “release”? What follows
    (next version) that? But why am I arguing against my change…

§ CI

https://github.com/LemmingAvalanche/git/actions/runs/4972149825

§ Carbon copy

The suggested contacts seemed too long:

```
$ ./contrib/contacts/git-contacts v2.40.1..
Junio C Hamano <>
René Scharfe <>
Jonathan Tan <>
Jonathan Nieder <>
Jeff King <>
Stefan Beller <>
Denton Liu <>
Robert Dailey <>
```

So I added only Jeff King based on commit 3927bbe9a4 (tag: delete TAG_EDITMSG
only on successful tag, 2008-12-06).

§ Patches

1. Test the happy case
2. Test the unhappy case (fails without the next patch)
3. The change

—

Cheers.

Kristoffer Haugsbakk (3):
  t/t7004-tag: add regression test for existing behavior
  t/t7004-tag: add failing tag message file test
  tag: keep the message file in case ref transaction fails

 builtin/tag.c  | 24 +++++++++++++++---------
 t/t7004-tag.sh | 19 +++++++++++++++++++
 2 files changed, 34 insertions(+), 9 deletions(-)

Comments

Jeff King May 17, 2023, 9:32 a.m. UTC | #1
On Sun, May 14, 2023 at 03:17:57PM +0200, Kristoffer Haugsbakk wrote:

> The ref transaction can fail after the message has been written using the
> editor. The ref transaction is attempted after the message file (`TAG_EDITMSG`)
> has been unlinked, so there is no backup tag message file to retry the
> command.[1]
> 
> This is unfortunate if someone has written more than e.g. “v1.99.4” in the
> editor. (I don’t know if people write long tag messages in practice.)
> [...]
>
> So I added only Jeff King based on commit 3927bbe9a4 (tag: delete TAG_EDITMSG
> only on successful tag, 2008-12-06).

It has definitely been a while since that one. :) I agree what you are
doing here is in the same spirit as that commit, though there is one
small difference. 3927bbe9a4 was about saving the message file if we
failed to create the object, since otherwise there is no record of it.

Your patch is about saving it from a failed ref update. But in that case
we _do_ have the data saved in the object. You can find it with
git-fsck:

  [after running something like your test script]
  $ git fsck
  Checking object directories: 100% (256/256), done.
  dangling tag b9bde516952a863690027a611575a79d4c786b8d

  $ git cat-file tag b9bde516952a863690027a611575a79d4c786b8d
  object 8b1e708e32ae3af17825b3c713a3ab317824b932
  type commit
  tag foo/bar
  tagger Jeff King <peff@peff.net> 1684314980 -0400
  
  my message

That said, I'm willing to believe that most users wouldn't figure this
out on their own, and saving TAG_EDITMSG might be more friendly. But one
other alternative might be to mention the hash of that tag object, and
possibly give advice on recovering it.

It's too bad we do not have "git tag -c" to match "git commit -c", which
would allow us to say something like:

      A tag object was created, but we failed to update the ref.
      After correcting any errors, you can recover the original tag
      message with:

        git tag -c $oid [other options]

(where we'd replace $oid with the hash of the created tag object). The
best alternatives I could come up with were:

      # this is kind of obscure advice to give somebody, plus
      # it makes a weird tag if you originally tried to tag "foo/bar"
      # but then later switch to some other name. The "tag" field
      # in the object won't match the ref.
      git update-ref $ref $oid

      # this saves just the message but is horribly complicated
      git cat-file tag $oid | sed '1,/^$/d' |
      git tag -F -

I dunno. There is a certain elegance to telling the user about what
progress we _did_ make, but if there isn't an easy way to turn that into
a retry of their command, it may not be all that useful.

So I'm not really opposed to your patch, but mostly just brainstorming
alternatives.

I gave v3 a brief look, and it seems OK to me. It might be nice to
factor out the duplicated advice message, but since we cannot share any
of the other lines (one spot calls exit() and the other uses die()), I
am OK with it as-is.

-Peff
Junio C Hamano May 17, 2023, 4 p.m. UTC | #2
Jeff King <peff@peff.net> writes:

> That said, I'm willing to believe that most users wouldn't figure this
> out on their own, and saving TAG_EDITMSG might be more friendly. But one
> other alternative might be to mention the hash of that tag object, and
> possibly give advice on recovering it.

Hmph, then the advice message would suggest "update-ref"?

Ah, no.  Because the message may be reused to create a tag with
different tagname, which is very likely because one reason for the
refusal to update the ref could be that the name was already taken,
and that would create a mismatch between tagname and refname.

OK, so ...

> It's too bad we do not have "git tag -c" to match "git commit -c", which
> would allow us to say something like:
>
>       A tag object was created, but we failed to update the ref.
>       After correcting any errors, you can recover the original tag
>       message with:
>
>         git tag -c $oid [other options]
>
> (where we'd replace $oid with the hash of the created tag object). The
> best alternatives I could come up with were:
>
>       # this is kind of obscure advice to give somebody, plus
>       # it makes a weird tag if you originally tried to tag "foo/bar"
>       # but then later switch to some other name. The "tag" field
>       # in the object won't match the ref.
>       git update-ref $ref $oid

... I agree that this is not a very good advice to give, and ...

>       # this saves just the message but is horribly complicated
>       git cat-file tag $oid | sed '1,/^$/d' |
>       git tag -F -

... this is a reasonable thing to have in a more user-friendly
feature, like your -c above.

> I dunno. There is a certain elegance to telling the user about what
> progress we _did_ make, but if there isn't an easy way to turn that into
> a retry of their command, it may not be all that useful.

Yeah, I am OK with "leaving TAG_EDITMSG behind" and a future "tag
-c/-C $another" to coexist.
Jeff King May 17, 2023, 5:37 p.m. UTC | #3
On Wed, May 17, 2023 at 09:00:10AM -0700, Junio C Hamano wrote:

> > I dunno. There is a certain elegance to telling the user about what
> > progress we _did_ make, but if there isn't an easy way to turn that into
> > a retry of their command, it may not be all that useful.
> 
> Yeah, I am OK with "leaving TAG_EDITMSG behind" and a future "tag
> -c/-C $another" to coexist.

Me too.

One thing I wondered is whether the obvious command to retry:

  git tag -F .git/TAG_EDITMSG foo

would work, or if we would overwrite the file before it is read. But it
does work, which is good. I wonder if we:

  a. want to protect that with a test (since I could imagine a
     refactoring where we try to copy the "-F" contents from file to
     file, rather than reading it into a memory buffer ahead of time)

  b. want to tell users that is a good way to recover (though maybe that
     is a rabbit hole of details as one subtlety is that it will be
     overwritten by an unrelated tag command).

But I am also happy to leave it to the user's imagination to pull the
contents from the file with "cp" or their editor. :)

-Peff
Eric Sunshine May 17, 2023, 5:52 p.m. UTC | #4
On Wed, May 17, 2023 at 1:40 PM Jeff King <peff@peff.net> wrote:
> One thing I wondered is whether the obvious command to retry:
>
>   git tag -F .git/TAG_EDITMSG foo
>
> would work, or if we would overwrite the file before it is read. But it
> does work, which is good. I wonder if we:
>
>   b. want to tell users that is a good way to recover (though maybe that
>      is a rabbit hole of details as one subtlety is that it will be
>      overwritten by an unrelated tag command).

If that route is taken, then making the advice worktree-friendly would
probably be a good idea:

    git tag -F $(git rev-parse --git-path TAG_EDITMSG) foo
Kristoffer Haugsbakk May 17, 2023, 6:02 p.m. UTC | #5
On Wed, May 17, 2023, at 19:52, Eric Sunshine wrote:
> If that route is taken, then making the advice worktree-friendly would
> probably be a good idea:
>
>     git tag -F $(git rev-parse --git-path TAG_EDITMSG) foo

The current advice works with worktrees since it just uses `path`.
Kristoffer Haugsbakk May 17, 2023, 7:58 p.m. UTC | #6
On Wed, May 17, 2023, at 11:32, Jeff King wrote:
> It has definitely been a while since that one. :) I agree what you are
> doing here is in the same spirit as that commit, though there is one
> small difference. 3927bbe9a4 was about saving the message file if we
> failed to create the object, since otherwise there is no record of it.
>
> Your patch is about saving it from a failed ref update. But in that case
> we _do_ have the data saved in the object. You can find it with
> git-fsck:

I didn’t even think of that. :) Too much tunnel-focus on that text file…

>
>   [after running something like your test script]
>   $ git fsck
>   Checking object directories: 100% (256/256), done.
>   dangling tag b9bde516952a863690027a611575a79d4c786b8d
>
>   $ git cat-file tag b9bde516952a863690027a611575a79d4c786b8d
>   object 8b1e708e32ae3af17825b3c713a3ab317824b932
>   type commit
>   tag foo/bar
>   tagger Jeff King <peff@peff.net> 1684314980 -0400
>
>   my message
>
> That said, I'm willing to believe that most users wouldn't figure this
> out on their own, and saving TAG_EDITMSG might be more friendly.

The current error says:

```
fatal: cannot lock ref 'refs/tags/release': 'refs/tags/release/v1' exists; cannot create 'refs/tags/release'
```

As a user, I have no idea if anything (the tag object) was even
created. The error just says that the tag couldn’t be created. (As a
user the ref pointing to the annotated tag object and the object itself
sound like basically the same thing.)

I might be able to guess that it *had* created something tangible like
an object in the database since it got all the way to update the ref to
point to it (just one more step…). But that feels like a stretch.

> But one other alternative might be to mention the hash of that tag
> object, and possibly give advice on recovering it.

That could work. Like:

```
hint: git tag <new name> f73e152cb1560aff1446b208affba4cdcf5bea36
```

So I copy that into my prompt and update the name. All is good.

But I might decide to be just copy the hash, press “Up” in the terminal,
copy the hash at the end, and change the name:

```
git tag -a release-it f73e152cb1560aff1446b208affba4cdcf5bea36
```

Next I’m in my text editor again. Not sure what that is about but okay.

*closes it*

```
hint: You have created a nested tag. The object referred to by your new tag is
hint: already a tag. If you meant to tag the object that it points to, use:
hint:
hint: 	git tag -f release-it f73e152cb1560aff1446b208affba4cdcf5bea36^{}
hint: Disable this message with "git config advice.nestedTag false"
```

> It's too bad we do not have "git tag -c" to match "git commit -c", which
> would allow us to say something like:

Sure. I appreciate that kind of symmetry. (I guess that’s not the most
precise word in this context.)

> I dunno. There is a certain elegance to telling the user about what
> progress we _did_ make, but if there isn't an easy way to turn that into
> a retry of their command, it may not be all that useful.

As a user, the only expensive part (potentially; refer back to footnote
№ 2 of the CV) is writing the tag message. So fudging around a bit with
copying the backup text file into the new message (I would probably just
copy it into the editor) is not a big deal. And retrying the command is
just “Up” and a little line editing.

> So I'm not really opposed to your patch, but mostly just brainstorming
> alternatives.

It’s very much appreciated.

> I gave v3 a brief look, and it seems OK to me. It might be nice to
> factor out the duplicated advice message, but since we cannot share any
> of the other lines (one spot calls exit() and the other uses die()), I
> am OK with it as-is.
>
> -Peff

I’ll make a function for the message in the next round. It’s nice to not
have to duplicate such strings/message ids, I think.

Cheers