Message ID | faaa9a3d6ba66d77cc2a8eab438d1bfc8f762fa1.1559857032.git.matvore@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] ref-filter: sort detached HEAD lines firstly | expand |
Hi Matthew, On Thu, 6 Jun 2019, Matthew DeVore wrote: > Before this patch, "git branch" would put "(HEAD detached...)" and "(no > branch, rebasing...)" lines before all the other branches *in most > cases* and only because of the fact that "(" is a low codepoint. This > would not hold in the Chinese locale, which uses a full-width "(" symbol > (codepoint FF08). This meant that the detached HEAD line would appear > after all local refs and even after the remote refs if there were any. > > Deliberately sort the detached HEAD refs before other refs when sorting > by refname rather than rely on codepoint subtleties. This description is pretty convincing! > diff --git a/ref-filter.c b/ref-filter.c > index 8500671bc6..cbfae790f9 100644 > --- a/ref-filter.c > +++ b/ref-filter.c > @@ -2157,25 +2157,29 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru > cmp_type cmp_type = used_atom[s->atom].type; > int (*cmp_fn)(const char *, const char *); > struct strbuf err = STRBUF_INIT; > > if (get_ref_atom_value(a, s->atom, &va, &err)) > die("%s", err.buf); > if (get_ref_atom_value(b, s->atom, &vb, &err)) > die("%s", err.buf); > strbuf_release(&err); > cmp_fn = s->ignore_case ? strcasecmp : strcmp; > - if (s->version) > + if (s->version) { > cmp = versioncmp(va->s, vb->s); > - else if (cmp_type == FIELD_STR) > + } else if (cmp_type == FIELD_STR) { I find that it makes sense in general to suppress one's urges regarding introducing `{ ... }` around one-liners when the patch does not actually require it. For example, I found this patch harder than necessary to read because of it. > + if ((a->kind & FILTER_REFS_DETACHED_HEAD) != > + (b->kind & FILTER_REFS_DETACHED_HEAD)) { So in case that both are detached... > + return (a->kind & FILTER_REFS_DETACHED_HEAD) ? -1 : 1; > + } > cmp = cmp_fn(va->s, vb->s); ... we compare their commit hashes, is that right? Might be worth a code comment. > - else { > + } else { FWIW it would have been a much more obvious patch if it had done if (s->version) [...] + else if (cmp_type == FIELD_STR && + (a->kind & FILTER_REFS_DETACHED_HEAD || + b->kind & FILTER_REFS_DETACHED_HEAD)) + return (a->kind & FILTER_REFS_DETACHED_HEAD) ? -1 : 1; else if (cmp_type == FIELD_STR) [...] Maybe still worth doing. FWIW I was *so* tempted to write ((a->kind ^ b->kind) & FILTER_REFS_DETACHED_HEAD) to make this code DRYer, but then, readers not intimately familiar with Boolean arithmetic might not even know about the `^` operator, making the code harder to read than necessary, too. > diff --git a/t/lib-gettext.sh b/t/lib-gettext.sh > index 2139b427ca..de08d109dc 100644 > --- a/t/lib-gettext.sh > +++ b/t/lib-gettext.sh > @@ -25,23 +25,29 @@ then > p > q > }') > # is_IS.ISO8859-1 on Solaris and FreeBSD, is_IS.iso88591 on Debian > is_IS_iso_locale=$(locale -a 2>/dev/null | > sed -n '/^is_IS\.[iI][sS][oO]8859-*1$/{ > p > q > }') > > - # Export them as an environment variable so the t0202/test.pl Perl > - # test can use it too > - export is_IS_locale is_IS_iso_locale > + zh_CN_locale=$(locale -a 2>/dev/null | > + sed -n '/^zh_CN\.[uU][tT][fF]-*8$/{ > + p > + q > + }') > + > + # Export them as environment variables so other tests can use them > + # too > + export is_IS_locale is_IS_iso_locale zh_CN_locale > > if test -n "$is_IS_locale" && > test $GIT_INTERNAL_GETTEXT_SH_SCHEME != "fallthrough" > then > # Some of the tests need the reference Icelandic locale > test_set_prereq GETTEXT_LOCALE > > # Exporting for t0202/test.pl > GETTEXT_LOCALE=1 > export GETTEXT_LOCALE > @@ -53,11 +59,15 @@ then > if test -n "$is_IS_iso_locale" && > test $GIT_INTERNAL_GETTEXT_SH_SCHEME != "fallthrough" > then > # Some of the tests need the reference Icelandic locale > test_set_prereq GETTEXT_ISO_LOCALE > > say "# lib-gettext: Found '$is_IS_iso_locale' as an is_IS ISO-8859-1 locale" > else > say "# lib-gettext: No is_IS ISO-8859-1 locale available" > fi > + > + if test -z "$zh_CN_locale"; then > + say "# lib-gettext: No zh_CN UTF-8 locale available" > + fi I wonder why this hunk, unlike the previous one, does not imitate the is_IS handling closely. > diff --git a/t/t3207-branch-intl.sh b/t/t3207-branch-intl.sh > new file mode 100755 > index 0000000000..9f6fcc7481 > --- /dev/null > +++ b/t/t3207-branch-intl.sh > @@ -0,0 +1,38 @@ > +#!/bin/sh > + > +test_description='git branch internationalization tests' > + > +. ./lib-gettext.sh > + > +test_expect_success 'init repo' ' > + git init r1 && Why? > + touch r1/foo && > + git -C r1 add foo && > + git -C r1 commit -m foo > +' Why not simply `test_commit foo`? > +test_expect_success 'detached head sorts before other branches' ' > + # Ref sorting logic should put detached heads before the other > + # branches, but this is not automatic when a branch name sorts > + # lexically before "(" or the full-width "(" (Unicode codepoint FF08). > + # The latter case is nearly guaranteed for the Chinese locale. > + > + git -C r1 checkout HEAD^{} -- && > + git -C r1 branch !should_be_after_detached HEAD && I am not sure that `!` is a wise choice, as it might not be a legal file name character everywhere. A `.` or `-` might make more sense. > + LC_ALL=$zh_CN_locale LC_MESSAGES=$zh_CN_locale \ > + git -C r1 branch >actual && > + git -C r1 checkout - && Why call `checkout` after `branch`? That's unnecessary, we do not verify anything after that call. > + awk " > + # We need full-width or half-width parens on the first line. > + NR == 1 && (/[(].*[)]/ || /\xef\xbc\x88.*\xef\xbc\x89/) { > + found_head = 1; > + } > + /!should_be_after_detached/ { > + found_control_branch = 1; > + } > + END { exit !found_head || !found_control_branch } > + " actual This might look beautiful for a fan of `awk`. For the vast majority of us, this is not a good idea. Remember, you do *not* write those tests for your own pleasure, you do *not* write those tests in order to help you catch problems while you develop your patches, you do *not* develop these tests in order to just catch future breakages. You *do* write those tests for *other* developers who you try to help in preventing introducing regressions. As such, you *want* the tests to be - easy to understand for as wide a range of developers as you can make, - quick, - covering regressions, and *only* regressions, - helping diagnose *and* fix regressions. In the ideal case you won't even hear when developers found your test helpful, and you will never, ever learn about regressions that have been prevented. You most frequently will hear about your tests when they did not do their job well. In this instance, I would have expected something like test_expect_lines = 3 actual && head -n 1 <actual >first && test_i18ngrep "detached HEAD" first && tail -n 1 <actual >last && grep should_be_after last instead of the "awk-ward" code above. Ciao, Johannes > +' > + > +test_done > -- > 2.21.0 > >
Hi, sorry for top-posting, but I just noticed the "firstlyxy" typo in the subject ;-) Ciao, Dscho On Sun, 9 Jun 2019, Johannes Schindelin wrote: > Hi Matthew, > > On Thu, 6 Jun 2019, Matthew DeVore wrote: > > > Before this patch, "git branch" would put "(HEAD detached...)" and "(no > > branch, rebasing...)" lines before all the other branches *in most > > cases* and only because of the fact that "(" is a low codepoint. This > > would not hold in the Chinese locale, which uses a full-width "(" symbol > > (codepoint FF08). This meant that the detached HEAD line would appear > > after all local refs and even after the remote refs if there were any. > > > > Deliberately sort the detached HEAD refs before other refs when sorting > > by refname rather than rely on codepoint subtleties. > > This description is pretty convincing! > > > diff --git a/ref-filter.c b/ref-filter.c > > index 8500671bc6..cbfae790f9 100644 > > --- a/ref-filter.c > > +++ b/ref-filter.c > > @@ -2157,25 +2157,29 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru > > cmp_type cmp_type = used_atom[s->atom].type; > > int (*cmp_fn)(const char *, const char *); > > struct strbuf err = STRBUF_INIT; > > > > if (get_ref_atom_value(a, s->atom, &va, &err)) > > die("%s", err.buf); > > if (get_ref_atom_value(b, s->atom, &vb, &err)) > > die("%s", err.buf); > > strbuf_release(&err); > > cmp_fn = s->ignore_case ? strcasecmp : strcmp; > > - if (s->version) > > + if (s->version) { > > cmp = versioncmp(va->s, vb->s); > > - else if (cmp_type == FIELD_STR) > > + } else if (cmp_type == FIELD_STR) { > > I find that it makes sense in general to suppress one's urges regarding > introducing `{ ... }` around one-liners when the patch does not actually > require it. > > For example, I found this patch harder than necessary to read because of > it. > > > + if ((a->kind & FILTER_REFS_DETACHED_HEAD) != > > + (b->kind & FILTER_REFS_DETACHED_HEAD)) { > > So in case that both are detached... > > > + return (a->kind & FILTER_REFS_DETACHED_HEAD) ? -1 : 1; > > + } > > cmp = cmp_fn(va->s, vb->s); > > ... we compare their commit hashes, is that right? Might be worth a code > comment. > > > - else { > > + } else { > > FWIW it would have been a much more obvious patch if it had done > > if (s->version) > [...] > + else if (cmp_type == FIELD_STR && > + (a->kind & FILTER_REFS_DETACHED_HEAD || > + b->kind & FILTER_REFS_DETACHED_HEAD)) > + return (a->kind & FILTER_REFS_DETACHED_HEAD) ? -1 : 1; > else if (cmp_type == FIELD_STR) > [...] > > Maybe still worth doing. > > FWIW I was *so* tempted to write > > ((a->kind ^ b->kind) & FILTER_REFS_DETACHED_HEAD) > > to make this code DRYer, but then, readers not intimately familiar with > Boolean arithmetic might not even know about the `^` operator, making the > code harder to read than necessary, too. > > > diff --git a/t/lib-gettext.sh b/t/lib-gettext.sh > > index 2139b427ca..de08d109dc 100644 > > --- a/t/lib-gettext.sh > > +++ b/t/lib-gettext.sh > > @@ -25,23 +25,29 @@ then > > p > > q > > }') > > # is_IS.ISO8859-1 on Solaris and FreeBSD, is_IS.iso88591 on Debian > > is_IS_iso_locale=$(locale -a 2>/dev/null | > > sed -n '/^is_IS\.[iI][sS][oO]8859-*1$/{ > > p > > q > > }') > > > > - # Export them as an environment variable so the t0202/test.pl Perl > > - # test can use it too > > - export is_IS_locale is_IS_iso_locale > > + zh_CN_locale=$(locale -a 2>/dev/null | > > + sed -n '/^zh_CN\.[uU][tT][fF]-*8$/{ > > + p > > + q > > + }') > > + > > + # Export them as environment variables so other tests can use them > > + # too > > + export is_IS_locale is_IS_iso_locale zh_CN_locale > > > > if test -n "$is_IS_locale" && > > test $GIT_INTERNAL_GETTEXT_SH_SCHEME != "fallthrough" > > then > > # Some of the tests need the reference Icelandic locale > > test_set_prereq GETTEXT_LOCALE > > > > # Exporting for t0202/test.pl > > GETTEXT_LOCALE=1 > > export GETTEXT_LOCALE > > @@ -53,11 +59,15 @@ then > > if test -n "$is_IS_iso_locale" && > > test $GIT_INTERNAL_GETTEXT_SH_SCHEME != "fallthrough" > > then > > # Some of the tests need the reference Icelandic locale > > test_set_prereq GETTEXT_ISO_LOCALE > > > > say "# lib-gettext: Found '$is_IS_iso_locale' as an is_IS ISO-8859-1 locale" > > else > > say "# lib-gettext: No is_IS ISO-8859-1 locale available" > > fi > > + > > + if test -z "$zh_CN_locale"; then > > + say "# lib-gettext: No zh_CN UTF-8 locale available" > > + fi > > I wonder why this hunk, unlike the previous one, does not imitate the > is_IS handling closely. > > > diff --git a/t/t3207-branch-intl.sh b/t/t3207-branch-intl.sh > > new file mode 100755 > > index 0000000000..9f6fcc7481 > > --- /dev/null > > +++ b/t/t3207-branch-intl.sh > > @@ -0,0 +1,38 @@ > > +#!/bin/sh > > + > > +test_description='git branch internationalization tests' > > + > > +. ./lib-gettext.sh > > + > > +test_expect_success 'init repo' ' > > + git init r1 && > > Why? > > > + touch r1/foo && > > + git -C r1 add foo && > > + git -C r1 commit -m foo > > +' > > Why not simply `test_commit foo`? > > > +test_expect_success 'detached head sorts before other branches' ' > > + # Ref sorting logic should put detached heads before the other > > + # branches, but this is not automatic when a branch name sorts > > + # lexically before "(" or the full-width "(" (Unicode codepoint FF08). > > + # The latter case is nearly guaranteed for the Chinese locale. > > + > > + git -C r1 checkout HEAD^{} -- && > > + git -C r1 branch !should_be_after_detached HEAD && > > I am not sure that `!` is a wise choice, as it might not be a legal file > name character everywhere. A `.` or `-` might make more sense. > > > + LC_ALL=$zh_CN_locale LC_MESSAGES=$zh_CN_locale \ > > + git -C r1 branch >actual && > > + git -C r1 checkout - && > > Why call `checkout` after `branch`? That's unnecessary, we do not verify > anything after that call. > > > + awk " > > + # We need full-width or half-width parens on the first line. > > + NR == 1 && (/[(].*[)]/ || /\xef\xbc\x88.*\xef\xbc\x89/) { > > + found_head = 1; > > + } > > + /!should_be_after_detached/ { > > + found_control_branch = 1; > > + } > > + END { exit !found_head || !found_control_branch } > > + " actual > > This might look beautiful for a fan of `awk`. For the vast majority of us, > this is not a good idea. > > Remember, you do *not* write those tests for your own pleasure, you do > *not* write those tests in order to help you catch problems while you > develop your patches, you do *not* develop these tests in order to just > catch future breakages. > > You *do* write those tests for *other* developers who you try to help in > preventing introducing regressions. > > As such, you *want* the tests to be > > - easy to understand for as wide a range of developers as you can make, > > - quick, > > - covering regressions, and *only* regressions, > > - helping diagnose *and* fix regressions. > > In the ideal case you won't even hear when developers found your test > helpful, and you will never, ever learn about regressions that have been > prevented. > > You most frequently will hear about your tests when they did not do their > job well. > > In this instance, I would have expected something like > > test_expect_lines = 3 actual && > > head -n 1 <actual >first && > test_i18ngrep "detached HEAD" first && > > tail -n 1 <actual >last && > grep should_be_after last > > instead of the "awk-ward" code above. > > Ciao, > Johannes > > > +' > > + > > +test_done > > -- > > 2.21.0 > > > > >
On Sun, Jun 09, 2019 at 10:17:19AM +0200, Johannes Schindelin wrote: > > if (get_ref_atom_value(a, s->atom, &va, &err)) > > die("%s", err.buf); > > if (get_ref_atom_value(b, s->atom, &vb, &err)) > > die("%s", err.buf); > > strbuf_release(&err); > > cmp_fn = s->ignore_case ? strcasecmp : strcmp; > > - if (s->version) > > + if (s->version) { > > cmp = versioncmp(va->s, vb->s); > > - else if (cmp_type == FIELD_STR) > > + } else if (cmp_type == FIELD_STR) { > > I find that it makes sense in general to suppress one's urges regarding > introducing `{ ... }` around one-liners when the patch does not actually > require it. > > For example, I found this patch harder than necessary to read because of > it. I understand the desire to make the patch itself clean, and I sometimes try to do that to a fault, but the style as I understand it is to put { } around all if branches if only one branch requires it. Since I'm already modifying the "else if (cmp_type == FIELD_STR)" line, I decided to put the } at the start of the line and modify the if (s->version) line as well. So only one line was modified "in excess." I think the temporary cost of the verbose patch is justified to keep the style consistent in narrow code fragments. > > > + if ((a->kind & FILTER_REFS_DETACHED_HEAD) != > > + (b->kind & FILTER_REFS_DETACHED_HEAD)) { > > So in case that both are detached... > > > + return (a->kind & FILTER_REFS_DETACHED_HEAD) ? -1 : 1; > > + } > > cmp = cmp_fn(va->s, vb->s); > > ... we compare their commit hashes, is that right? Might be worth a code > comment. > It should not be possible to have two detached head lines, so adding a comment about that may be distracting. > > - else { > > + } else { > > FWIW it would have been a much more obvious patch if it had done > > if (s->version) > [...] > + else if (cmp_type == FIELD_STR && > + (a->kind & FILTER_REFS_DETACHED_HEAD || > + b->kind & FILTER_REFS_DETACHED_HEAD)) > + return (a->kind & FILTER_REFS_DETACHED_HEAD) ? -1 : 1; But that means that if a is detached, it is always a < b, even if both are detached? That's probably right in practice, since there should only be one detached head, but it's jarring to read a brittle cmp function. > else if (cmp_type == FIELD_STR) > [...] > > Maybe still worth doing. > > FWIW I was *so* tempted to write > > ((a->kind ^ b->kind) & FILTER_REFS_DETACHED_HEAD) > > to make this code DRYer, but then, readers not intimately familiar with > Boolean arithmetic might not even know about the `^` operator, making the > code harder to read than necessary, too. I think I found a readable way to DRY: } else if (cmp_type == FIELD_STR) { const int a_detached = a->kind & FILTER_REFS_DETACHED_HEAD; /* * When sorting by name, we should put "detached" head lines, * which are all the lines in parenthesis, before all others. * This usually is automatic, since "(" is before "refs/" and * "remotes/", but this does not hold for zh_CN, which uses * full-width parenthesis, so make the ordering explicit. */ if (a_detached != (b->kind & FILTER_REFS_DETACHED_HEAD)) cmp = a_detached ? -1 : 1; else cmp = cmp_fn(va->s, vb->s); } ... > > > > if test -n "$is_IS_locale" && > > test $GIT_INTERNAL_GETTEXT_SH_SCHEME != "fallthrough" > > then > > # Some of the tests need the reference Icelandic locale > > test_set_prereq GETTEXT_LOCALE > > > > # Exporting for t0202/test.pl > > GETTEXT_LOCALE=1 > > export GETTEXT_LOCALE > > @@ -53,11 +59,15 @@ then > > if test -n "$is_IS_iso_locale" && > > test $GIT_INTERNAL_GETTEXT_SH_SCHEME != "fallthrough" > > then > > # Some of the tests need the reference Icelandic locale > > test_set_prereq GETTEXT_ISO_LOCALE > > > > say "# lib-gettext: Found '$is_IS_iso_locale' as an is_IS ISO-8859-1 locale" > > else > > say "# lib-gettext: No is_IS ISO-8859-1 locale available" > > fi > > + > > + if test -z "$zh_CN_locale"; then > > + say "# lib-gettext: No zh_CN UTF-8 locale available" > > + fi > > I wonder why this hunk, unlike the previous one, does not imitate the > is_IS handling closely. It was because I didn't have gettext set up properly and it was causing GIT_INTERNAL_GETTEXT_SH_SCHEME to be "fallthrough", despite the actual Git output being translated. My set-up was quite broken so I don't think our intl utility and test code needs any fixing. I've handled this and the next version of the patch will have that fixed. (incidentally, this was why I originally marked my patch RFC. Also incidentally, I'm using a Mac for development since the most powerful machine I have access to is a Mac, so I've been jumping through some hoops to make that work.) > > > diff --git a/t/t3207-branch-intl.sh b/t/t3207-branch-intl.sh > > new file mode 100755 > > index 0000000000..9f6fcc7481 > > --- /dev/null > > +++ b/t/t3207-branch-intl.sh > > @@ -0,0 +1,38 @@ > > +#!/bin/sh > > + > > +test_description='git branch internationalization tests' > > + > > +. ./lib-gettext.sh > > + > > +test_expect_success 'init repo' ' > > + git init r1 && > > Why? You mean why make a test repo? > > > + touch r1/foo && > > + git -C r1 add foo && > > + git -C r1 commit -m foo > > +' > > Why not simply `test_commit foo`? Good idea, I'll use that. > > > +test_expect_success 'detached head sorts before other branches' ' > > + # Ref sorting logic should put detached heads before the other > > + # branches, but this is not automatic when a branch name sorts > > + # lexically before "(" or the full-width "(" (Unicode codepoint FF08). > > + # The latter case is nearly guaranteed for the Chinese locale. > > + > > + git -C r1 checkout HEAD^{} -- && > > + git -C r1 branch !should_be_after_detached HEAD && > > I am not sure that `!` is a wise choice, as it might not be a legal file > name character everywhere. A `.` or `-` might make more sense. The ! is actually meaningless, as I have come to observed, since the actual ref name used for comparison is not "!should_be_after_detached" but "refs/heads/!should_be_after_detached". So I'll remove it. > > > + LC_ALL=$zh_CN_locale LC_MESSAGES=$zh_CN_locale \ > > + git -C r1 branch >actual && > > + git -C r1 checkout - && > > Why call `checkout` after `branch`? That's unnecessary, we do not verify > anything after that call. It's to get the repo into a neutral state in case an additional testcase is added in the future. > > > + awk " > > + # We need full-width or half-width parens on the first line. > > + NR == 1 && (/[(].*[)]/ || /\xef\xbc\x88.*\xef\xbc\x89/) { > > + found_head = 1; > > + } > > + /!should_be_after_detached/ { > > + found_control_branch = 1; > > + } > > + END { exit !found_head || !found_control_branch } > > + " actual > > This might look beautiful for a fan of `awk`. For the vast majority of us, > this is not a good idea. > > Remember, you do *not* write those tests for your own pleasure, you do > *not* write those tests in order to help you catch problems while you > develop your patches, you do *not* develop these tests in order to just > catch future breakages. > > You *do* write those tests for *other* developers who you try to help in > preventing introducing regressions. > > As such, you *want* the tests to be > > - easy to understand for as wide a range of developers as you can make, > > - quick, > > - covering regressions, and *only* regressions, > > - helping diagnose *and* fix regressions. > > In the ideal case you won't even hear when developers found your test > helpful, and you will never, ever learn about regressions that have been > prevented. > > You most frequently will hear about your tests when they did not do their > job well. > > In this instance, I would have expected something like > > test_expect_lines = 3 actual && > > head -n 1 <actual >first && > test_i18ngrep "detached HEAD" first && Tangential to your point, but the test_i18ngrep actually won't work since all it does is *not* grep when the text is translated. But I want the text to use the genuine zh_CN translation. > > tail -n 1 <actual >last && > grep should_be_after last > > instead of the "awk-ward" code above. All fair points, and I may have been carried away when I wrote the awk code. I liked using awk since it matched my mental model at the time, which was procedural, as opposed to your proposed pure sh implementation, which is more declarative. Here is the new version of the test: test_expect_success GETTEXT_ZH_LOCALE 'detached head sorts before branches' ' # Ref sorting logic should put detached heads before the other # branches, but this is not automatic when a branch name sorts # lexically before "(" or the full-width "(" (Unicode codepoint FF08). # The latter case is nearly guaranteed for the Chinese locale. git -C r1 checkout HEAD^{} -- && LC_ALL=$zh_CN_locale LC_MESSAGES=$zh_CN_locale \ git -C r1 branch >actual && git -C r1 checkout - && head -n 1 actual >first && # The first line should be enclosed by full-width parenthesis. grep '$'\xef\xbc\x88.*\xef\xbc\x89'' first && grep master actual ' > > Ciao, > Johannes > Thank you for the thoughtful feedback.
Hi, Matthew DeVore wrote: > On Sun, Jun 09, 2019 at 10:17:19AM +0200, Johannes Schindelin wrote: >> I find that it makes sense in general to suppress one's urges regarding >> introducing `{ ... }` around one-liners when the patch does not actually >> require it. >> >> For example, I found this patch harder than necessary to read because of >> it. > > I understand the desire to make the patch itself clean, and I sometimes try to > do that to a fault, but the style as I understand it is to put { } around all > if branches if only one branch requires it. Since I'm already modifying the > "else if (cmp_type == FIELD_STR)" line, I decided to put the } at the start of > the line and modify the if (s->version) line as well. So only one line was > modified "in excess." I think the temporary cost of the verbose patch is > justified to keep the style consistent in narrow code fragments. Git seems to be inconsistent about this. Documentation/CodingGuidelines says - When there are multiple arms to a conditional and some of them require braces, enclose even a single line block in braces for consistency. E.g.: so you have some cover from there (and it matches what I'm used to, too). :) [...] >>> + LC_ALL=$zh_CN_locale LC_MESSAGES=$zh_CN_locale \ >>> + git -C r1 branch >actual && >>> + git -C r1 checkout - && >> >> Why call `checkout` after `branch`? That's unnecessary, we do not verify >> anything after that call. > > It's to get the repo into a neutral state in case an additional testcase is > added in the future. For this kind of thing, we tend to use test_when_finished so that the test ends up in a clean state even if it fails. [...] > test_expect_success GETTEXT_ZH_LOCALE 'detached head sorts before branches' ' > # Ref sorting logic should put detached heads before the other > # branches, but this is not automatic when a branch name sorts > # lexically before "(" or the full-width "(" (Unicode codepoint FF08). > # The latter case is nearly guaranteed for the Chinese locale. > > git -C r1 checkout HEAD^{} -- && > LC_ALL=$zh_CN_locale LC_MESSAGES=$zh_CN_locale \ > git -C r1 branch >actual && > git -C r1 checkout - && > > head -n 1 actual >first && > # The first line should be enclosed by full-width parenthesis. > grep '$'\xef\xbc\x88.*\xef\xbc\x89'' first && nit: older shells do not know how to do $'\x01' interpolation. Probably best to use the raw UTF-8 directly here (it will be more readable anyway). Thanks, Jonathan
On Mon, Jun 10, 2019 at 05:41:06PM -0700, Jonathan Nieder wrote: > Git seems to be inconsistent about this. Documentation/CodingGuidelines > says > > - When there are multiple arms to a conditional and some of them > require braces, enclose even a single line block in braces for > consistency. E.g.: > > so you have some cover from there (and it matches what I'm used to, > too). :) Thanks for finding that. > > [...] > >>> + LC_ALL=$zh_CN_locale LC_MESSAGES=$zh_CN_locale \ > >>> + git -C r1 branch >actual && > >>> + git -C r1 checkout - && > >> > >> Why call `checkout` after `branch`? That's unnecessary, we do not verify > >> anything after that call. > > > > It's to get the repo into a neutral state in case an additional testcase is > > added in the future. > > For this kind of thing, we tend to use test_when_finished so that the > test ends up in a clean state even if it fails. Done. That will be used in the next roll-up. > > head -n 1 actual >first && > > # The first line should be enclosed by full-width parenthesis. > > grep '$'\xef\xbc\x88.*\xef\xbc\x89'' first && > > nit: older shells do not know how to do $'\x01' interpolation. > Probably best to use the raw UTF-8 directly here (it will be more > readable anyway). Good point. I suppose we don't have to worry about dev's editors screwing up encoding since modern editors make this kind of thing easy to configure (and I suspect that all sane editors use UTF-8 as the default or at least won't mangle it...)
Jonathan Nieder <jrnieder@gmail.com> writes: > Git seems to be inconsistent about this. Documentation/CodingGuidelines > says > > - When there are multiple arms to a conditional and some of them > require braces, enclose even a single line block in braces for > consistency. E.g.: > > so you have some cover from there (and it matches what I'm used to, > too). :) Yup, it took us for quite some time before we settled on that rule and wrote it down, so there are some lines that predate it *and* have survived.
diff --git a/ref-filter.c b/ref-filter.c index 8500671bc6..cbfae790f9 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -2157,25 +2157,29 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru cmp_type cmp_type = used_atom[s->atom].type; int (*cmp_fn)(const char *, const char *); struct strbuf err = STRBUF_INIT; if (get_ref_atom_value(a, s->atom, &va, &err)) die("%s", err.buf); if (get_ref_atom_value(b, s->atom, &vb, &err)) die("%s", err.buf); strbuf_release(&err); cmp_fn = s->ignore_case ? strcasecmp : strcmp; - if (s->version) + if (s->version) { cmp = versioncmp(va->s, vb->s); - else if (cmp_type == FIELD_STR) + } else if (cmp_type == FIELD_STR) { + if ((a->kind & FILTER_REFS_DETACHED_HEAD) != + (b->kind & FILTER_REFS_DETACHED_HEAD)) { + return (a->kind & FILTER_REFS_DETACHED_HEAD) ? -1 : 1; + } cmp = cmp_fn(va->s, vb->s); - else { + } else { if (va->value < vb->value) cmp = -1; else if (va->value == vb->value) cmp = cmp_fn(a->refname, b->refname); else cmp = 1; } return (s->reverse) ? -cmp : cmp; } diff --git a/t/lib-gettext.sh b/t/lib-gettext.sh index 2139b427ca..de08d109dc 100644 --- a/t/lib-gettext.sh +++ b/t/lib-gettext.sh @@ -25,23 +25,29 @@ then p q }') # is_IS.ISO8859-1 on Solaris and FreeBSD, is_IS.iso88591 on Debian is_IS_iso_locale=$(locale -a 2>/dev/null | sed -n '/^is_IS\.[iI][sS][oO]8859-*1$/{ p q }') - # Export them as an environment variable so the t0202/test.pl Perl - # test can use it too - export is_IS_locale is_IS_iso_locale + zh_CN_locale=$(locale -a 2>/dev/null | + sed -n '/^zh_CN\.[uU][tT][fF]-*8$/{ + p + q + }') + + # Export them as environment variables so other tests can use them + # too + export is_IS_locale is_IS_iso_locale zh_CN_locale if test -n "$is_IS_locale" && test $GIT_INTERNAL_GETTEXT_SH_SCHEME != "fallthrough" then # Some of the tests need the reference Icelandic locale test_set_prereq GETTEXT_LOCALE # Exporting for t0202/test.pl GETTEXT_LOCALE=1 export GETTEXT_LOCALE @@ -53,11 +59,15 @@ then if test -n "$is_IS_iso_locale" && test $GIT_INTERNAL_GETTEXT_SH_SCHEME != "fallthrough" then # Some of the tests need the reference Icelandic locale test_set_prereq GETTEXT_ISO_LOCALE say "# lib-gettext: Found '$is_IS_iso_locale' as an is_IS ISO-8859-1 locale" else say "# lib-gettext: No is_IS ISO-8859-1 locale available" fi + + if test -z "$zh_CN_locale"; then + say "# lib-gettext: No zh_CN UTF-8 locale available" + fi fi diff --git a/t/t3207-branch-intl.sh b/t/t3207-branch-intl.sh new file mode 100755 index 0000000000..9f6fcc7481 --- /dev/null +++ b/t/t3207-branch-intl.sh @@ -0,0 +1,38 @@ +#!/bin/sh + +test_description='git branch internationalization tests' + +. ./lib-gettext.sh + +test_expect_success 'init repo' ' + git init r1 && + touch r1/foo && + git -C r1 add foo && + git -C r1 commit -m foo +' + +test_expect_success 'detached head sorts before other branches' ' + # Ref sorting logic should put detached heads before the other + # branches, but this is not automatic when a branch name sorts + # lexically before "(" or the full-width "(" (Unicode codepoint FF08). + # The latter case is nearly guaranteed for the Chinese locale. + + git -C r1 checkout HEAD^{} -- && + git -C r1 branch !should_be_after_detached HEAD && + LC_ALL=$zh_CN_locale LC_MESSAGES=$zh_CN_locale \ + git -C r1 branch >actual && + git -C r1 checkout - && + + awk " + # We need full-width or half-width parens on the first line. + NR == 1 && (/[(].*[)]/ || /\xef\xbc\x88.*\xef\xbc\x89/) { + found_head = 1; + } + /!should_be_after_detached/ { + found_control_branch = 1; + } + END { exit !found_head || !found_control_branch } + " actual +' + +test_done
Before this patch, "git branch" would put "(HEAD detached...)" and "(no branch, rebasing...)" lines before all the other branches *in most cases* and only because of the fact that "(" is a low codepoint. This would not hold in the Chinese locale, which uses a full-width "(" symbol (codepoint FF08). This meant that the detached HEAD line would appear after all local refs and even after the remote refs if there were any. Deliberately sort the detached HEAD refs before other refs when sorting by refname rather than rely on codepoint subtleties. Signed-off-by: Matthew DeVore <matvore@google.com> --- ref-filter.c | 10 +++++++--- t/lib-gettext.sh | 16 +++++++++++++--- t/t3207-branch-intl.sh | 38 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 58 insertions(+), 6 deletions(-) create mode 100755 t/t3207-branch-intl.sh