Message ID | 39aa46bce700cc9a4ca49f38922e3a7ebf14a52c.1598004663.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Fix trailers atom bug and improved tests | expand |
On Fri, Aug 21, 2020 at 6:11 AM Hariom Verma via GitGitGadget <gitgitgadget@gmail.com> wrote: > The 'contents' atom does not show any error if used with 'trailers' > atom and semicolon is missing before trailers arguments. Do you mean s/semicolon/colon/ ? > e.g %(contents:trailersonly) works, while it shouldn't. > > It is definitely not an expected behavior. > > Let's fix this bug. > > Signed-off-by: Hariom Verma <hariom18599@gmail.com> > --- > diff --git a/ref-filter.c b/ref-filter.c > @@ -332,6 +332,22 @@ static int trailers_atom_parser(const struct ref_format *format, struct used_ato > +static int check_format_field(const char *arg, const char *field, const char **option) > +{ > + const char *opt; > + if (skip_prefix(arg, field, &opt)) { > + if (*opt == '\0') { > + *option = NULL; > + return 1; > + } > + else if (*opt == ':') { > + *option = opt + 1; > + return 1; > + } > + } > + return 0; > +} Not necessarily worth a re-roll, but rather than introducing all the above new code... > @@ -345,9 +361,8 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato > - else if (skip_prefix(arg, "trailers", &arg)) { > - skip_prefix(arg, ":", &arg); > - if (trailers_atom_parser(format, atom, *arg ? arg : NULL, err)) > + else if (check_format_field(arg, "trailers", &arg)) { > + if (trailers_atom_parser(format, atom, arg, err)) > return -1; ...an alternative would have been something like: else if (!strcmp(arg, "trailers")) { if (trailers_atom_parser(format, atom, NULL, err)) return -1; } else if (skip_prefix(arg, "trailers:", &arg)) { if (trailers_atom_parser(format, atom, arg, err)) return -1; } which is quite simple to reason about (though has the cost of a tiny bit of duplication). > diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh > @@ -823,6 +823,15 @@ test_expect_success '%(trailers) rejects unknown trailers arguments' ' > +test_expect_success 'if arguments, %(contents:trailers) shows error if semicolon is missing' ' s/semicolon/colon/ > + # error message cannot be checked under i18n What is this comment about? I realize that you copied it from other nearby tests, but I find that it muddies rather than clarifies. > + cat >expect <<-EOF && > + fatal: unrecognized %(contents) argument: trailersonly > + EOF > + test_must_fail git for-each-ref --format="%(contents:trailersonly)" 2>actual && > + test_i18ncmp expect actual > +'
Eric Sunshine <sunshine@sunshineco.com> writes: > ...an alternative would have been something like: > > else if (!strcmp(arg, "trailers")) { > if (trailers_atom_parser(format, atom, NULL, err)) > return -1; > } else if (skip_prefix(arg, "trailers:", &arg)) { > if (trailers_atom_parser(format, atom, arg, err)) > return -1; > } > > which is quite simple to reason about (though has the cost of a tiny > bit of duplication). Yeah, that looks quite simple and straight-forward. >> diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh >> @@ -823,6 +823,15 @@ test_expect_success '%(trailers) rejects unknown trailers arguments' ' >> +test_expect_success 'if arguments, %(contents:trailers) shows error if semicolon is missing' ' > > s/semicolon/colon/ Definitely. > >> + # error message cannot be checked under i18n > > What is this comment about? I realize that you copied it from other > nearby tests, but I find that it muddies rather than clarifies. Yup. If a patch changes test_cmp with test_i18ncmp, the above message belongs to its commit log message, but it is overkill to have it as an in-line comment in every place where test_i18ncmp gets used. Thanks for a review. >> + cat >expect <<-EOF && >> + fatal: unrecognized %(contents) argument: trailersonly >> + EOF >> + test_must_fail git for-each-ref --format="%(contents:trailersonly)" 2>actual && >> + test_i18ncmp expect actual >> +'
Hi, On Sat, Aug 22, 2020 at 12:47 AM Junio C Hamano <gitster@pobox.com> wrote: > > Eric Sunshine <sunshine@sunshineco.com> writes: > > > ...an alternative would have been something like: > > > > else if (!strcmp(arg, "trailers")) { > > if (trailers_atom_parser(format, atom, NULL, err)) > > return -1; > > } else if (skip_prefix(arg, "trailers:", &arg)) { > > if (trailers_atom_parser(format, atom, arg, err)) > > return -1; > > } > > > > which is quite simple to reason about (though has the cost of a tiny > > bit of duplication). > > Yeah, that looks quite simple and straight-forward. No doubt, it looks good for "contents:trailers". What if In future we would like to expand functionalities of other 'contents' options? Recently, I sent a patch series "Improvements to ref-filter"[1]. A patch in this patch series introduced "sanitize" modifier to "subject" atom. i.e "%(subject:sanitize)". What if in the future we also want "%(contents:subject:sanitize)" to work? We can duplicate code again. Something like: ``` } else if (!strcmp(arg, "trailers")) { if (trailers_atom_parser(format, atom, NULL, err)) return -1; } else if (skip_prefix(arg, "trailers:", &arg)) { if (trailers_atom_parser(format, atom, arg, err)) return -1; } else if (!strcmp(arg, "subject")) { if (subject_atom_parser(format, atom, NULL, err)) return -1; } else if (skip_prefix(arg, "subject:", &arg)) { if (subject_atom_parser(format, atom, arg, err)) return -1; } ``` OR We can just simply use helper. Something like: ``` else if (check_format_field(arg, "subject", &arg)) { if (subject_atom_parser(format, atom, arg, err)) return -1; } else if (check_format_field(arg, "trailers", &arg)) { if (trailers_atom_parser(format, atom, arg, err)) return -1; ``` We can use this helper any number of times, whenever there is a need. Sorry, I missed saying this earlier. But I don't prefer duplicating the code here. Thanks, Hariom [1]: https://public-inbox.org/git/pull.684.v4.git.1598046110.gitgitgadget@gmail.com/#t
On Sun, Aug 23, 2020 at 8:56 PM Hariom verma <hariom18599@gmail.com> wrote: > On Sat, Aug 22, 2020 at 12:47 AM Junio C Hamano <gitster@pobox.com> wrote: > > Eric Sunshine <sunshine@sunshineco.com> writes: > > > ...an alternative would have been something like: > > > > > > else if (!strcmp(arg, "trailers")) { > > > if (trailers_atom_parser(format, atom, NULL, err)) > > > return -1; > > > } else if (skip_prefix(arg, "trailers:", &arg)) { > > > if (trailers_atom_parser(format, atom, arg, err)) > > > return -1; > > > } > > > > > > which is quite simple to reason about (though has the cost of a tiny > > > bit of duplication). > > > > Yeah, that looks quite simple and straight-forward. > > Recently, I sent a patch series "Improvements to ref-filter"[1]. A > patch in this patch series introduced "sanitize" modifier to "subject" > atom. i.e "%(subject:sanitize)". > > What if in the future we also want "%(contents:subject:sanitize)" to work? > We can use this helper any number of times, whenever there is a need. > > Sorry, I missed saying this earlier. But I don't prefer duplicating > the code here. Pushing back on a reviewer suggestion is fine. Explaining the reason for your position -- as you do here -- helps reviewers understand why you feel the way you do. My review suggestion about making it easier to reason about the code while avoiding a brand new function, at the cost of a minor amount of duplication, was made in the context of this one-off case in which the function increased cognitive load and was used just once (not knowing that you envisioned future callers). If you expect the new function to be re-used by upcoming changes, then that may be a good reason to keep it. Stating so in the commit message will help reviewers see beyond the immediate patch or patch series. Aside from a couple minor style violations[1,2], I don't particularly oppose the helper function, though I have a quibble with the name check_format_field(), which I don't find helpful, and which (at least for me) increases the cognitive load. The increased cognitive load, I think, comes not only from the function name not spelling out what the function actually does, but also because the function is dual-purpose: it's both checking that the argument matches a particular token ("trailers", in this case) and extracting the sub-argument. Perhaps naming it match_and_extract_subarg() or something similar would help, though that's a mouthful. But the observation about the function being dual-purpose (thus potentially confusing) brings up other questions. For instance, is it too special-purpose? If you foresee more callers in the future with multiple-token arguments such as `%(content:subject:sanitize)`, should the function provide more assistance by splitting out each of the sub-arguments rather than stopping at the first? Taking that even further, a generalized helper for "splitting" arguments like that might be useful at the top-level of contents_atom_parser() too, rather than only for specific arguments, such as "trailers". Of course, this may all be way too ambitious for this little bug fix series or even for whatever upcoming changes you're planning, thus not worth pursuing. As for the helper's implementation, I might have written it like this: static int check_format_field(...) { const char *opt if (!strcmp(arg, field)) *option = NULL; else if (skip_prefix(arg, field, opt) && *opt == ':') *option = opt + 1; else return 0; return 1; } which is more compact and closer to what I suggested earlier for avoiding the helper function in the first place. But, of course, programming is quite subjective, and you may find your implementation easier to reason about. Plus, your version has the benefit of being slightly more optimal since it avoids an extra string scan, although that probably is mostly immaterial considering that contents_atom_parser() itself contains a long chain of potentially sub-optimal strcmp() and skip_prefix() calls. Footnotes [1]: use `if (!*opt)` rather than `if (*opt == '\0')` [2]: cuddle the closing brace and `else` on the same line like this: `} else if (...) {`
Hi, On Mon, Aug 24, 2020 at 9:19 AM Eric Sunshine <sunshine@sunshineco.com> wrote: > > On Sun, Aug 23, 2020 at 8:56 PM Hariom verma <hariom18599@gmail.com> wrote: > > On Sat, Aug 22, 2020 at 12:47 AM Junio C Hamano <gitster@pobox.com> wrote: > > > Eric Sunshine <sunshine@sunshineco.com> writes: > > > > ...an alternative would have been something like: > > > > > > > > else if (!strcmp(arg, "trailers")) { > > > > if (trailers_atom_parser(format, atom, NULL, err)) > > > > return -1; > > > > } else if (skip_prefix(arg, "trailers:", &arg)) { > > > > if (trailers_atom_parser(format, atom, arg, err)) > > > > return -1; > > > > } > > > > > > > > which is quite simple to reason about (though has the cost of a tiny > > > > bit of duplication). > > > > > > Yeah, that looks quite simple and straight-forward. > > > > Recently, I sent a patch series "Improvements to ref-filter"[1]. A > > patch in this patch series introduced "sanitize" modifier to "subject" > > atom. i.e "%(subject:sanitize)". > > > > What if in the future we also want "%(contents:subject:sanitize)" to work? > > We can use this helper any number of times, whenever there is a need. > > > > Sorry, I missed saying this earlier. But I don't prefer duplicating > > the code here. > > Pushing back on a reviewer suggestion is fine. Explaining the reason > for your position -- as you do here -- helps reviewers understand why > you feel the way you do. My review suggestion about making it easier > to reason about the code while avoiding a brand new function, at the > cost of a minor amount of duplication, was made in the context of this > one-off case in which the function increased cognitive load and was > used just once (not knowing that you envisioned future callers). If > you expect the new function to be re-used by upcoming changes, then > that may be a good reason to keep it. Stating so in the commit message > will help reviewers see beyond the immediate patch or patch series. Yeah. I should have mentioned this in the commit message. > Aside from a couple minor style violations[1,2], I don't particularly > oppose the helper function, though I have a quibble with the name > check_format_field(), which I don't find helpful, and which (at least > for me) increases the cognitive load. The increased cognitive load, I > think, comes not only from the function name not spelling out what the > function actually does, but also because the function is dual-purpose: > it's both checking that the argument matches a particular token > ("trailers", in this case) and extracting the sub-argument. Perhaps > naming it match_and_extract_subarg() or something similar would help, > though that's a mouthful. I will fix those violations. Also, "match_and_extract_subarg()" looks good to me. > But the observation about the function being dual-purpose (thus > potentially confusing) brings up other questions. For instance, is it > too special-purpose? If you foresee more callers in the future with > multiple-token arguments such as `%(content:subject:sanitize)`, should > the function provide more assistance by splitting out each of the > sub-arguments rather than stopping at the first? Taking that even > further, a generalized helper for "splitting" arguments like that > might be useful at the top-level of contents_atom_parser() too, rather > than only for specific arguments, such as "trailers". Of course, this > may all be way too ambitious for this little bug fix series or even > for whatever upcoming changes you're planning, thus not worth > pursuing. Splitting sub-arguments is done at "<atomname>_atom_parser()". If you mean pre-splitting every argument... something like: ['contents', 'subject', 'sanitize'] for `%(content:subject:sanitize)` in `contents_atom_parser()` ? I'm not able to see how it can be useful. Sorry, If I got your concerned wrong. > As for the helper's implementation, I might have written it like this: > > static int check_format_field(...) > { > const char *opt > if (!strcmp(arg, field)) > *option = NULL; > else if (skip_prefix(arg, field, opt) && *opt == ':') > *option = opt + 1; > else > return 0; > return 1; > } > > which is more compact and closer to what I suggested earlier for > avoiding the helper function in the first place. But, of course, > programming is quite subjective, and you may find your implementation > easier to reason about. Plus, your version has the benefit of being > slightly more optimal since it avoids an extra string scan, although > that probably is mostly immaterial considering that > contents_atom_parser() itself contains a long chain of potentially > sub-optimal strcmp() and skip_prefix() calls. "programming is quite subjective" Yeah, I couldn't agree more. The change you suggested looks good too. But I'm little inclined to my keeping my changes. I'm curious, what others have to say on this. Thanks, Hariom
Hi, On Tue, Aug 25, 2020 at 1:32 AM Hariom verma <hariom18599@gmail.com> wrote: > On Mon, Aug 24, 2020 at 9:19 AM Eric Sunshine <sunshine@sunshineco.com> wrote: > > > > On Sun, Aug 23, 2020 at 8:56 PM Hariom verma <hariom18599@gmail.com> wrote: > > > Recently, I sent a patch series "Improvements to ref-filter"[1]. A > > > patch in this patch series introduced "sanitize" modifier to "subject" > > > atom. i.e "%(subject:sanitize)". > > > > > > What if in the future we also want "%(contents:subject:sanitize)" to work? > > > We can use this helper any number of times, whenever there is a need. > > > > > > Sorry, I missed saying this earlier. But I don't prefer duplicating > > > the code here. > > > > Pushing back on a reviewer suggestion is fine. Explaining the reason > > for your position -- as you do here -- helps reviewers understand why > > you feel the way you do. My review suggestion about making it easier > > to reason about the code while avoiding a brand new function, at the > > cost of a minor amount of duplication, was made in the context of this > > one-off case in which the function increased cognitive load and was > > used just once (not knowing that you envisioned future callers). If > > you expect the new function to be re-used by upcoming changes, then > > that may be a good reason to keep it. Stating so in the commit message > > will help reviewers see beyond the immediate patch or patch series. > > Yeah. I should have mentioned this in the commit message. I agree. > > Aside from a couple minor style violations[1,2], I don't particularly > > oppose the helper function, though I have a quibble with the name > > check_format_field(), which I don't find helpful, and which (at least > > for me) increases the cognitive load. The increased cognitive load, I > > think, comes not only from the function name not spelling out what the > > function actually does, but also because the function is dual-purpose: > > it's both checking that the argument matches a particular token > > ("trailers", in this case) and extracting the sub-argument. Perhaps > > naming it match_and_extract_subarg() or something similar would help, > > though that's a mouthful. > > I will fix those violations. > Also, "match_and_extract_subarg()" looks good to me. I am not sure about the "subarg" part of the name. In the for-each-ref doc, names inside %(...) are called "field names", and parts after ":" are called "options". So it might be better to have "field_option" instead of "subarg" in the name. I think we could also get rid of the "match_and_" part of the suggestion, in the same way as skip_prefix() is not called match_and_skip_prefix(). Readers can just expect that if there is no match the function will return 0. So maybe "extract_field_option()". > > But the observation about the function being dual-purpose (thus > > potentially confusing) brings up other questions. For instance, is it > > too special-purpose? If you foresee more callers in the future with > > multiple-token arguments such as `%(content:subject:sanitize)`, should > > the function provide more assistance by splitting out each of the > > sub-arguments rather than stopping at the first? Taking that even > > further, a generalized helper for "splitting" arguments like that > > might be useful at the top-level of contents_atom_parser() too, rather > > than only for specific arguments, such as "trailers". Of course, this > > may all be way too ambitious for this little bug fix series or even > > for whatever upcoming changes you're planning, thus not worth > > pursuing. > > Splitting sub-arguments is done at "<atomname>_atom_parser()". > If you mean pre-splitting every argument... > something like: ['contents', 'subject', 'sanitize'] for > `%(content:subject:sanitize)` in `contents_atom_parser()` ? I'm not > able to see how it can be useful. Yeah, it seems to me that such a splitting would require a complete rewrite of the current code, so I am not sure it's an interesting way forward for now. And anyway adding extract_field_option() goes in the right direction of abstracting the parsing and making the code simpler, more efficient and likely more correct. > Sorry, If I got your concerned wrong. > > > As for the helper's implementation, I might have written it like this: > > > > static int check_format_field(...) > > { > > const char *opt > > if (!strcmp(arg, field)) > > *option = NULL; > > else if (skip_prefix(arg, field, opt) && *opt == ':') > > *option = opt + 1; > > else > > return 0; > > return 1; > > } > > > > which is more compact and closer to what I suggested earlier for > > avoiding the helper function in the first place. But, of course, > > programming is quite subjective, and you may find your implementation > > easier to reason about. Plus, your version has the benefit of being > > slightly more optimal since it avoids an extra string scan, although > > that probably is mostly immaterial considering that > > contents_atom_parser() itself contains a long chain of potentially > > sub-optimal strcmp() and skip_prefix() calls. > > "programming is quite subjective" > Yeah, I couldn't agree more. > > The change you suggested looks good too. But I'm little inclined to my > keeping my changes. I'm curious, what others have to say on this. I also prefer a slightly more optimal one even if it's a bit less compact. Thanks, Christian.
On Wed, Aug 26, 2020 at 8:18 AM Christian Couder <christian.couder@gmail.com> wrote: > I think we could also get rid of the "match_and_" part of the > suggestion, in the same way as skip_prefix() is not called > match_and_skip_prefix(). Readers can just expect that if there is no > match the function will return 0. > > So maybe "extract_field_option()". If we want to hint more that it works in the way as skip_prefix(), we could call it "skip_field()".
Hi, On Wed, Aug 26, 2020 at 11:48 AM Christian Couder <christian.couder@gmail.com> wrote: > > Hi, > > On Tue, Aug 25, 2020 at 1:32 AM Hariom verma <hariom18599@gmail.com> wrote: > > > On Mon, Aug 24, 2020 at 9:19 AM Eric Sunshine <sunshine@sunshineco.com> wrote: > > > > Aside from a couple minor style violations[1,2], I don't particularly > > > oppose the helper function, though I have a quibble with the name > > > check_format_field(), which I don't find helpful, and which (at least > > > for me) increases the cognitive load. The increased cognitive load, I > > > think, comes not only from the function name not spelling out what the > > > function actually does, but also because the function is dual-purpose: > > > it's both checking that the argument matches a particular token > > > ("trailers", in this case) and extracting the sub-argument. Perhaps > > > naming it match_and_extract_subarg() or something similar would help, > > > though that's a mouthful. > > > > I will fix those violations. > > Also, "match_and_extract_subarg()" looks good to me. > > I am not sure about the "subarg" part of the name. In the for-each-ref > doc, names inside %(...) are called "field names", and parts after ":" > are called "options". So it might be better to have "field_option" > instead of "subarg" in the name. > > I think we could also get rid of the "match_and_" part of the > suggestion, in the same way as skip_prefix() is not called > match_and_skip_prefix(). Readers can just expect that if there is no > match the function will return 0. > > So maybe "extract_field_option()". Makes sense to me. > > > But the observation about the function being dual-purpose (thus > > > potentially confusing) brings up other questions. For instance, is it > > > too special-purpose? If you foresee more callers in the future with > > > multiple-token arguments such as `%(content:subject:sanitize)`, should > > > the function provide more assistance by splitting out each of the > > > sub-arguments rather than stopping at the first? Taking that even > > > further, a generalized helper for "splitting" arguments like that > > > might be useful at the top-level of contents_atom_parser() too, rather > > > than only for specific arguments, such as "trailers". Of course, this > > > may all be way too ambitious for this little bug fix series or even > > > for whatever upcoming changes you're planning, thus not worth > > > pursuing. > > > > Splitting sub-arguments is done at "<atomname>_atom_parser()". > > If you mean pre-splitting every argument... > > something like: ['contents', 'subject', 'sanitize'] for > > `%(content:subject:sanitize)` in `contents_atom_parser()` ? I'm not > > able to see how it can be useful. > > Yeah, it seems to me that such a splitting would require a complete > rewrite of the current code, so I am not sure it's an interesting way > forward for now. And anyway adding extract_field_option() goes in the > right direction of abstracting the parsing and making the code > simpler, more efficient and likely more correct. > > > Sorry, If I got your concerned wrong. > > > > > As for the helper's implementation, I might have written it like this: > > > > > > static int check_format_field(...) > > > { > > > const char *opt > > > if (!strcmp(arg, field)) > > > *option = NULL; > > > else if (skip_prefix(arg, field, opt) && *opt == ':') > > > *option = opt + 1; > > > else > > > return 0; > > > return 1; > > > } > > > > > > which is more compact and closer to what I suggested earlier for > > > avoiding the helper function in the first place. But, of course, > > > programming is quite subjective, and you may find your implementation > > > easier to reason about. Plus, your version has the benefit of being > > > slightly more optimal since it avoids an extra string scan, although > > > that probably is mostly immaterial considering that > > > contents_atom_parser() itself contains a long chain of potentially > > > sub-optimal strcmp() and skip_prefix() calls. > > > > "programming is quite subjective" > > Yeah, I couldn't agree more. > > > > The change you suggested looks good too. But I'm little inclined to my > > keeping my changes. I'm curious, what others have to say on this. > > I also prefer a slightly more optimal one even if it's a bit less compact. +1 Thanks, Hariom
diff --git a/ref-filter.c b/ref-filter.c index ba85869755..fa131c4854 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -332,6 +332,22 @@ static int trailers_atom_parser(const struct ref_format *format, struct used_ato return 0; } +static int check_format_field(const char *arg, const char *field, const char **option) +{ + const char *opt; + if (skip_prefix(arg, field, &opt)) { + if (*opt == '\0') { + *option = NULL; + return 1; + } + else if (*opt == ':') { + *option = opt + 1; + return 1; + } + } + return 0; +} + static int contents_atom_parser(const struct ref_format *format, struct used_atom *atom, const char *arg, struct strbuf *err) { @@ -345,9 +361,8 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato atom->u.contents.option = C_SIG; else if (!strcmp(arg, "subject")) atom->u.contents.option = C_SUB; - else if (skip_prefix(arg, "trailers", &arg)) { - skip_prefix(arg, ":", &arg); - if (trailers_atom_parser(format, atom, *arg ? arg : NULL, err)) + else if (check_format_field(arg, "trailers", &arg)) { + if (trailers_atom_parser(format, atom, arg, err)) return -1; } else if (skip_prefix(arg, "lines=", &arg)) { atom->u.contents.option = C_LINES; diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh index 0570380344..6d535653d9 100755 --- a/t/t6300-for-each-ref.sh +++ b/t/t6300-for-each-ref.sh @@ -823,6 +823,15 @@ test_expect_success '%(trailers) rejects unknown trailers arguments' ' test_i18ncmp expect actual ' +test_expect_success 'if arguments, %(contents:trailers) shows error if semicolon is missing' ' + # error message cannot be checked under i18n + cat >expect <<-EOF && + fatal: unrecognized %(contents) argument: trailersonly + EOF + test_must_fail git for-each-ref --format="%(contents:trailersonly)" 2>actual && + test_i18ncmp expect actual +' + test_expect_success 'basic atom: head contents:trailers' ' git for-each-ref --format="%(contents:trailers)" refs/heads/master >actual && sanitize_pgp <actual >actual.clean &&