Message ID | 20190613221541.10007-1-emilyshaffer@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] rev-list: clarify --abbrev and --abbrev-commit usage | expand |
Emily Shaffer <emilyshaffer@google.com> writes: > It looks like `git log --abbrev=5` also doesn't work the way one might > expect, which makes sense to me, as they use the same internals for > option parsing (parse_revisions()). I suspect that was primarily because --abbrev-commit was only to abbreviate the commit object names on the log header, and --abbrev was to abbreviate the object names seen in the --raw output. These days, even --raw output seems to abbreviate by default, so --abbrev alone lost its usefulness quite a bit.
On Thu, Jun 13, 2019 at 03:15:41PM -0700, Emily Shaffer wrote: > I thought this was odd when I was working on the other rev-list changes > - --abbrev doesn't do anything on its own. It looks like it does work by > itself in other commands, but apparently not in rev-list. > > Listed this patch as RFC because maybe instead it's better to fix > something so --abbrev can be used alone, or teach --abbrev-commit=<n>. > It looks like `git log --abbrev=5` also doesn't work the way one might > expect, which makes sense to me, as they use the same internals for > option parsing (parse_revisions()). > > The manpages for log and rev-list both correctly indicate that > --abbrev=<n> is an optional addition to --abbrev-commit. `git log -h` is > generated by parse-options tooling and doesn't cover --abbrev-commit at > all, but `git rev-list` doesn't use an option parser on its own and the > usage is hardcoded. Yeah, "--abbrev" is a bit tricky here. It is really "when you abbrev, do it to this level". In "log", that means that "git log --abbrev=5 --raw" _does_ do something useful (it abbreviates the blob hashes). And then you may add "--abbrev-commit" on top to ask to abbreviate the commit ids. And rev-list follows that same pattern, except that rev-list _never_ shows diff output. You'd traditionally do (and this is how log was implemented once upon a time): git rev-list HEAD | git diff-tree --stdin --abbrev=5 --raw But even there, we are not seeing the commit id output by rev-list. It goes to diff-tree, which then formats it separately. So if you wanted abbreviated commits there, you'd add "--abbrev-commit" to the diff-tree invocation, not rev-list! So no, I cannot see a way in which "rev-list --abbrev" is useful without "--abbrev-commit". Which means that perhaps the former should imply the latter. And as you noticed in your other patch, there is no way to abbreviate "--objects" output at all. I am not sure I have ever seen a good use for that. Though to be honest, I am not sure that "--abbrev" is really all that useful in the first place. Machine-readable output should never abbreviate, and human-readable ones generally already do. But at any rate, before making any behavior changes it may make sense to think about how they'd interact with "rev-list --objects" abbreviation, if it were to be added. As for the patch itself: > diff --git a/builtin/rev-list.c b/builtin/rev-list.c > index 9f31837d30..6ae0087b01 100644 > --- a/builtin/rev-list.c > +++ b/builtin/rev-list.c > @@ -49,8 +49,8 @@ static const char rev_list_usage[] = > " --objects | --objects-edge\n" > " --unpacked\n" > " --header | --pretty\n" > -" --abbrev=<n> | --no-abbrev\n" > -" --abbrev-commit\n" > +" --abbrev-commit [--abbrev=<n>]\n" > +" --no-abbrev\n" So --no-abbrev clears both --abbrev and --abbrev-commit. That sort of makes sense, though I did not expect it. But it also means that: --abbrev-commit [--abbrev=<n> | --no-abbrev] is not right. Possibly: --abbrev-commit [--abbrev=<n>] | --no-abbrev would show the interaction more clearly, but I don't have a strong opinion. -Peff
On Fri, Jun 14, 2019 at 12:18:41PM -0400, Jeff King wrote: > On Thu, Jun 13, 2019 at 03:15:41PM -0700, Emily Shaffer wrote: > > > I thought this was odd when I was working on the other rev-list changes > > - --abbrev doesn't do anything on its own. It looks like it does work by > > itself in other commands, but apparently not in rev-list. > > > > Listed this patch as RFC because maybe instead it's better to fix > > something so --abbrev can be used alone, or teach --abbrev-commit=<n>. > > It looks like `git log --abbrev=5` also doesn't work the way one might > > expect, which makes sense to me, as they use the same internals for > > option parsing (parse_revisions()). > > > > The manpages for log and rev-list both correctly indicate that > > --abbrev=<n> is an optional addition to --abbrev-commit. `git log -h` is > > generated by parse-options tooling and doesn't cover --abbrev-commit at > > all, but `git rev-list` doesn't use an option parser on its own and the > > usage is hardcoded. > > Yeah, "--abbrev" is a bit tricky here. It is really "when you abbrev, do > it to this level". In "log", that means that "git log --abbrev=5 --raw" > _does_ do something useful (it abbreviates the blob hashes). And then > you may add "--abbrev-commit" on top to ask to abbreviate the commit > ids. > > And rev-list follows that same pattern, except that rev-list _never_ > shows diff output. You'd traditionally do (and this is how log was > implemented once upon a time): > > git rev-list HEAD | git diff-tree --stdin --abbrev=5 --raw > > But even there, we are not seeing the commit id output by rev-list. It > goes to diff-tree, which then formats it separately. So if you wanted > abbreviated commits there, you'd add "--abbrev-commit" to the diff-tree > invocation, not rev-list! > > So no, I cannot see a way in which "rev-list --abbrev" is useful without > "--abbrev-commit". Which means that perhaps the former should imply the > latter. Maybe it should; maybe this patch is a good excuse to do so, and enforce mutual exclusion of --abbrev-commit/--abbrev and --no-abbrev. Maybe it's also a good time to add --abbrev-commit=<length>? > > And as you noticed in your other patch, there is no way to abbreviate > "--objects" output at all. I am not sure I have ever seen a good use for > that. Though to be honest, I am not sure that "--abbrev" is really all > that useful in the first place. Machine-readable output should never > abbreviate, and human-readable ones generally already do. > > But at any rate, before making any behavior changes it may make sense to > think about how they'd interact with "rev-list --objects" abbreviation, > if it were to be added. > > As for the patch itself: > > > diff --git a/builtin/rev-list.c b/builtin/rev-list.c > > index 9f31837d30..6ae0087b01 100644 > > --- a/builtin/rev-list.c > > +++ b/builtin/rev-list.c > > @@ -49,8 +49,8 @@ static const char rev_list_usage[] = > > " --objects | --objects-edge\n" > > " --unpacked\n" > > " --header | --pretty\n" > > -" --abbrev=<n> | --no-abbrev\n" > > -" --abbrev-commit\n" > > +" --abbrev-commit [--abbrev=<n>]\n" > > +" --no-abbrev\n" > > So --no-abbrev clears both --abbrev and --abbrev-commit. That sort of > makes sense, though I did not expect it. But it also means that: > > --abbrev-commit [--abbrev=<n> | --no-abbrev] > > is not right. Possibly: > > --abbrev-commit [--abbrev=<n>] | --no-abbrev > > would show the interaction more clearly, but I don't have a strong > opinion. I did consider demonstrating it this way, but when both --abbrev-commit and --no-abbrev are used together, we don't complain or reject the invocation - which I would expect if the usage states the two options are mutually exclusive. I've been trying to think of good reasons not to enforce their mutual exclusion, and the one I keep coming back to is that --no-abbrev might be desired to override a git config'd abbreviation length - although I didn't check to see whether we have one, maybe we would want one later. And even in that case, I suppose that --abbrev-commit would not be explicitly added to the call (because we'd infer from the config), or that if it did need to be explicitly added (like if we need the user to say they want abbreviation, but we want to use their configured preferred length) then we could still reject the addition of --no-abbrev. So maybe it makes even more sense to take this patch as an opportunity to make these options mutually exclusive... although that checking I think would wind up in revision.c, and therefore widen the impact of the change significantly. From a practical standpoint, I guess you could consider --abbrev-commit and --no-abbrev mutually exclusive, but since it's not enforced, I have a little preference not to document it as though it is... - Emily
On Fri, Jun 14, 2019 at 01:59:50PM -0700, Emily Shaffer wrote: > > So no, I cannot see a way in which "rev-list --abbrev" is useful without > > "--abbrev-commit". Which means that perhaps the former should imply the > > latter. > > Maybe it should; maybe this patch is a good excuse to do so, and enforce > mutual exclusion of --abbrev-commit/--abbrev and --no-abbrev. Maybe it's > also a good time to add --abbrev-commit=<length>? Hmm, yeah, I think that would reduce confusion quite a bit. If it were "--abbrev-commit=<length>", then "--abbrev" would not be useful for anything in rev-list. It would still work as a historical item, but we would not need or want to advertise it in the usage at all. Good suggestion. > > is not right. Possibly: > > > > --abbrev-commit [--abbrev=<n>] | --no-abbrev > > > > would show the interaction more clearly, but I don't have a strong > > opinion. > > I did consider demonstrating it this way, but when both --abbrev-commit > and --no-abbrev are used together, we don't complain or reject the > invocation - which I would expect if the usage states the two options > are mutually exclusive. Ah, I see. I don't consider "|" to indicate an exclusion to the point that the options are rejected. Only that you wouldn't want to use both, because one counteracts the other. So every "--no-foo" is mutually exclusive with "--foo" in the sense that one override the other. But the outcome is "last one wins", and not "whoops, we cannot figure out what you meant". And that's what the original: --abbrev=<n> | --no-abbrev before your patch was trying to say (and I suspect there are many other cases of "|" with this kind of last-one-wins behavior). > I've been trying to think of good reasons not to enforce their mutual > exclusion, and the one I keep coming back to is that --no-abbrev might > be desired to override a git config'd abbreviation length - although I > didn't check to see whether we have one, maybe we would want one later. > And even in that case, I suppose that --abbrev-commit would not be > explicitly added to the call (because we'd infer from the config), or > that if it did need to be explicitly added (like if we need the user to > say they want abbreviation, but we want to use their configured > preferred length) then we could still reject the addition of > --no-abbrev. > > So maybe it makes even more sense to take this patch as an opportunity > to make these options mutually exclusive... although that checking I > think would wind up in revision.c, and therefore widen the impact of > the change significantly. You can configure core.abbrev, though I'm not sure if it ever requests abbreviation itself, or if it simply sets the length when we do happen to abbreviate based on command-line options. But forgetting config for a moment, last-one-wins is useful even among command line options. E.g., imagine an alias like this: [alias] mylog = git rev-list --abbrev-commit --pretty=oneline It's nice if you can run "git mylog --no-abbrev" and have it do what you expect, instead of complaining "you cannot use --abbrev-commit and --no-abbrev together". That's a toy example, but you can imagine more elaborate scripts that set some default options, and allow arbitrary per-invocation options to be appended. -Peff
On Fri, Jun 14, 2019 at 05:27:14PM -0400, Jeff King wrote: > On Fri, Jun 14, 2019 at 01:59:50PM -0700, Emily Shaffer wrote: > > > > So no, I cannot see a way in which "rev-list --abbrev" is useful without > > > "--abbrev-commit". Which means that perhaps the former should imply the > > > latter. > > > > Maybe it should; maybe this patch is a good excuse to do so, and enforce > > mutual exclusion of --abbrev-commit/--abbrev and --no-abbrev. Maybe it's > > also a good time to add --abbrev-commit=<length>? > > Hmm, yeah, I think that would reduce confusion quite a bit. If it were > "--abbrev-commit=<length>", then "--abbrev" would not be useful for > anything in rev-list. It would still work as a historical item, but we > would not need or want to advertise it in the usage at all. Good > suggestion. Given your comments below, I think rather than enforcing mutual exclusion it makes more sense to enforce last-one-wins. But the thinking is essentially the same. > > > > is not right. Possibly: > > > > > > --abbrev-commit [--abbrev=<n>] | --no-abbrev > > > > > > would show the interaction more clearly, but I don't have a strong > > > opinion. > > > > I did consider demonstrating it this way, but when both --abbrev-commit > > and --no-abbrev are used together, we don't complain or reject the > > invocation - which I would expect if the usage states the two options > > are mutually exclusive. > > Ah, I see. I don't consider "|" to indicate an exclusion to the point > that the options are rejected. Only that you wouldn't want to use both, > because one counteracts the other. So every "--no-foo" is mutually > exclusive with "--foo" in the sense that one override the other. But the > outcome is "last one wins", and not "whoops, we cannot figure out what > you meant". And that's what the original: > > --abbrev=<n> | --no-abbrev > > before your patch was trying to say (and I suspect there are many other > cases of "|" with this kind of last-one-wins behavior). For what it's worth, in this case it's not last-one-wins - --no-abbrev always wins: emilyshaffer@podkayne:~/git [master]$ g rev-list --abbrev-commit --no-abbrev --max-count=5 --pretty=oneline HEAD b697d92f56511e804b8ba20ccbe7bdc85dc66810 Git 2.22 6ee1eaca3e996e69691f515742129645f453e0e8 Merge tag 'l10n-2.22.0-rnd3' of git://github.com/git-l10n/git-po 0cdb8d2db2f39d1a29636975168c457d2dc0d466 Merge branch 'fr_review' of git://github.com/jnavila/git d0149200792f579151166a4a5bfae7e66c5d998b Merge branch 'master' of git://github.com/alshopov/git-po 82eb147dbbbd0221980883e87ca7efd16a939a6f l10n: fr.po: Review French translation emilyshaffer@podkayne:~/git [master]$ g rev-list --no-abbrev --abbrev-commit --max-count=5 --pretty=oneline HEAD b697d92f56511e804b8ba20ccbe7bdc85dc66810 Git 2.22 6ee1eaca3e996e69691f515742129645f453e0e8 Merge tag 'l10n-2.22.0-rnd3' of git://github.com/git-l10n/git-po 0cdb8d2db2f39d1a29636975168c457d2dc0d466 Merge branch 'fr_review' of git://github.com/jnavila/git d0149200792f579151166a4a5bfae7e66c5d998b Merge branch 'master' of git://github.com/alshopov/git-po 82eb147dbbbd0221980883e87ca7efd16a939a6f l10n: fr.po: Review French translation emilyshaffer@podkayne:~/git [master]$ g rev-list --abbrev-commit --max-count=5 --pretty=oneline HEAD b697d92f56 Git 2.22 6ee1eaca3e Merge tag 'l10n-2.22.0-rnd3' of git://github.com/git-l10n/git-po 0cdb8d2db2 Merge branch 'fr_review' of git://github.com/jnavila/git d014920079 Merge branch 'master' of git://github.com/alshopov/git-po 82eb147dbb l10n: fr.po: Review French translation > > > I've been trying to think of good reasons not to enforce their mutual > > exclusion, and the one I keep coming back to is that --no-abbrev might > > be desired to override a git config'd abbreviation length - although I > > didn't check to see whether we have one, maybe we would want one later. > > And even in that case, I suppose that --abbrev-commit would not be > > explicitly added to the call (because we'd infer from the config), or > > that if it did need to be explicitly added (like if we need the user to > > say they want abbreviation, but we want to use their configured > > preferred length) then we could still reject the addition of > > --no-abbrev. > > > > So maybe it makes even more sense to take this patch as an opportunity > > to make these options mutually exclusive... although that checking I > > think would wind up in revision.c, and therefore widen the impact of > > the change significantly. > > You can configure core.abbrev, though I'm not sure if it ever requests > abbreviation itself, or if it simply sets the length when we do happen > to abbreviate based on command-line options. > > But forgetting config for a moment, last-one-wins is useful even among > command line options. E.g., imagine an alias like this: > > [alias] > mylog = git rev-list --abbrev-commit --pretty=oneline > > It's nice if you can run "git mylog --no-abbrev" and have it do what you > expect, instead of complaining "you cannot use --abbrev-commit and > --no-abbrev together". > > That's a toy example, but you can imagine more elaborate scripts that > set some default options, and allow arbitrary per-invocation options to > be appended. This makes a lot more sense than the scenarios I was imagining. Thanks. I think a good solution here is to go and add --abbrev-commit=<n> without breaking support for --abbrev=<n>; I'm a little more worried about changing --no-abbrev to last-one-wins but I'll take a crack at it and see what the test suite says. While I'm at it, I'll check for last-one-wins with multiple instances of --abbrev[-commit]=<n>. Having done so, I'll also change the documentation here in rev-list to: --abbrev-commit[=<n>] [--abbrev=<n>] | --no-abbrev Sounds like a fun bit of low hanging fruit to me. Hoping to have a short turnaround. :) - Emily
On Fri, Jun 14, 2019 at 03:56:54PM -0700, Emily Shaffer wrote: > > Ah, I see. I don't consider "|" to indicate an exclusion to the point > > that the options are rejected. Only that you wouldn't want to use both, > > because one counteracts the other. So every "--no-foo" is mutually > > exclusive with "--foo" in the sense that one override the other. But the > > outcome is "last one wins", and not "whoops, we cannot figure out what > > you meant". And that's what the original: > > > > --abbrev=<n> | --no-abbrev > > > > before your patch was trying to say (and I suspect there are many other > > cases of "|" with this kind of last-one-wins behavior). > > For what it's worth, in this case it's not last-one-wins - --no-abbrev > always wins: Ah, thanks for pointing that; I hadn't noticed. That _is_ unlike most of the rest of Git. I'm tempted to say it's a bug and should be fixed, but I worry slightly that it could have an unexpected effect. > I think a good solution here is to go and add --abbrev-commit=<n> > without breaking support for --abbrev=<n>; I'm a little more worried > about changing --no-abbrev to last-one-wins but I'll take a crack at it > and see what the test suite says. While I'm at it, I'll check for > last-one-wins with multiple instances of --abbrev[-commit]=<n>. I think --abbrev-commit=<n> sounds safe enough, though I worry it may get a bit complicated because we'd presumably want to fall back to the <n> from --abbrev=<n>. I'll see how your patch turns out. :) I like the idea of changing --no-abbrev to last-one-wins, as above, but the test suite may not give us that much confidence. These kinds of cases are often not well-covered, and we're really worried about the wider world of random scripts people have grown over the last 10 years. Of course if the test suite does break horribly that might give us extra caution, but I'm not sure "the test suite does not break" gives us much confidence. > Having done so, I'll also change the documentation here in rev-list to: > --abbrev-commit[=<n>] [--abbrev=<n>] | --no-abbrev Yeah, that makes sense. -Peff
On Wed, Jun 19, 2019 at 05:21:59PM -0400, Jeff King wrote: > On Fri, Jun 14, 2019 at 03:56:54PM -0700, Emily Shaffer wrote: > > > > Ah, I see. I don't consider "|" to indicate an exclusion to the point > > > that the options are rejected. Only that you wouldn't want to use both, > > > because one counteracts the other. So every "--no-foo" is mutually > > > exclusive with "--foo" in the sense that one override the other. But the > > > outcome is "last one wins", and not "whoops, we cannot figure out what > > > you meant". And that's what the original: > > > > > > --abbrev=<n> | --no-abbrev > > > > > > before your patch was trying to say (and I suspect there are many other > > > cases of "|" with this kind of last-one-wins behavior). > > > > For what it's worth, in this case it's not last-one-wins - --no-abbrev > > always wins: > > Ah, thanks for pointing that; I hadn't noticed. That _is_ unlike most of > the rest of Git. I'm tempted to say it's a bug and should be fixed, but > I worry slightly that it could have an unexpected effect. > > > I think a good solution here is to go and add --abbrev-commit=<n> > > without breaking support for --abbrev=<n>; I'm a little more worried > > about changing --no-abbrev to last-one-wins but I'll take a crack at it > > and see what the test suite says. While I'm at it, I'll check for > > last-one-wins with multiple instances of --abbrev[-commit]=<n>. > > I think --abbrev-commit=<n> sounds safe enough, though I worry it may > get a bit complicated because we'd presumably want to fall back to the > <n> from --abbrev=<n>. I'll see how your patch turns out. :) > > I like the idea of changing --no-abbrev to last-one-wins, as above, but > the test suite may not give us that much confidence. These kinds of > cases are often not well-covered, and we're really worried about the > wider world of random scripts people have grown over the last 10 years. > Of course if the test suite does break horribly that might give us extra > caution, but I'm not sure "the test suite does not break" gives us much > confidence. I think at that point, based on what I've seen on the list, there's a tendency to include a config option to enable the new behavior. And I think that might pass the tipping point of "wasted effort" for me. The change isn't really enabling anything that was impossible before (last-one-wins is handy, but it's legacy public behavior now for --abbrev). So I'm kind of starting to lean towards doing nothing at all. It's a lot of work, for a not-very-functional change, which needs to be specially configured to even happen... So maybe I'll save my time and work on actual bugs instead. :) > > > Having done so, I'll also change the documentation here in rev-list to: > > --abbrev-commit[=<n>] [--abbrev=<n>] | --no-abbrev > > Yeah, that makes sense. > > -Peff Thanks for the input, all. I think I'll drop this. - Emily
diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 9f31837d30..6ae0087b01 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -49,8 +49,8 @@ static const char rev_list_usage[] = " --objects | --objects-edge\n" " --unpacked\n" " --header | --pretty\n" -" --abbrev=<n> | --no-abbrev\n" -" --abbrev-commit\n" +" --abbrev-commit [--abbrev=<n>]\n" +" --no-abbrev\n" " --left-right\n" " --count\n" " special purpose:\n"
Indicate that --abbrev only works with --abbrev-commit also specified. It seems that simply running `git rev-list --abbrev=5` doesn't abbreviate commit OIDs. But the combination of `git rev-list --abbrev-commit --abbrev=5` works as expected. Clarify in the documentation by indicating that --abbrev is an optional addition to the --abbrev-commit option. --no-abbrev remains on a separate line as it can still be used to disable OID abbreviation even if --abbrev-commit has been specified. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Change-Id: If9b1198938e1a3515ae6740241f7b791fb7a88bd --- I thought this was odd when I was working on the other rev-list changes - --abbrev doesn't do anything on its own. It looks like it does work by itself in other commands, but apparently not in rev-list. Listed this patch as RFC because maybe instead it's better to fix something so --abbrev can be used alone, or teach --abbrev-commit=<n>. It looks like `git log --abbrev=5` also doesn't work the way one might expect, which makes sense to me, as they use the same internals for option parsing (parse_revisions()). The manpages for log and rev-list both correctly indicate that --abbrev=<n> is an optional addition to --abbrev-commit. `git log -h` is generated by parse-options tooling and doesn't cover --abbrev-commit at all, but `git rev-list` doesn't use an option parser on its own and the usage is hardcoded. - Emily builtin/rev-list.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)