Message ID | Y5n4mb/S/RORb+N7@coredump.intra.peff.net (mailing list archive) |
---|---|
State | Accepted |
Commit | 1955ef10edf3c888dcb237728c81dd6e81df4960 |
Headers | show |
Series | minor ref-filter error-reporting bug-fixes | expand |
On Wed, Dec 14, 2022 at 11:23:53AM -0500, Jeff King wrote: > It seems like the cleanest fix would be for atom->name to be _just_ the > name, since there's already a separate "args" field. But since that > field is also used for other things, we can't change it easily (e.g., > it's how we find things in the used_atoms array, and clearly %(refname) > and %(refname:short) are not the same thing). > > Instead, we'll teach our error_bad_arg() function to stop at the first > ":". This is a little hacky, as we're effectively re-parsing the name, > but the format is simple enough to do this as a one-liner, and this > localizes the change to the error-reporting code. > > We'll give the same treatment to err_no_arg(). None of its callers use > this atom->name trick, but it's worth future-proofing it while we're > here. For what it's worth, I think that this balance of a somewhat-hacky implementation against a more significant and trickier refactoring is well thought-out and the right decision, IMHO. Thanks, Taylor
On Wed, Dec 14, 2022 at 03:05:05PM -0500, Taylor Blau wrote: > On Wed, Dec 14, 2022 at 11:23:53AM -0500, Jeff King wrote: > > It seems like the cleanest fix would be for atom->name to be _just_ the > > name, since there's already a separate "args" field. But since that > > field is also used for other things, we can't change it easily (e.g., > > it's how we find things in the used_atoms array, and clearly %(refname) > > and %(refname:short) are not the same thing). > > > > Instead, we'll teach our error_bad_arg() function to stop at the first > > ":". This is a little hacky, as we're effectively re-parsing the name, > > but the format is simple enough to do this as a one-liner, and this > > localizes the change to the error-reporting code. > > > > We'll give the same treatment to err_no_arg(). None of its callers use > > this atom->name trick, but it's worth future-proofing it while we're > > here. > > For what it's worth, I think that this balance of a somewhat-hacky > implementation against a more significant and trickier refactoring is > well thought-out and the right decision, IMHO. By the way, I did try the other change, to make atom->name just contain the name with no args. There are a bunch of pitfalls in parse_ref_filter_atom(), including: - don't use atom_len; it's off-by-one when looking at "atom" and not "sp" when there's a "*" dereference - the "args" pointer in the struct actually points into the name string. I don't think anybody relies on that, but I'm not 100% sure because... - once you deal with that, then it segfaults mysteriously, because the numeric index between used_atom and the computed values gets out of sync. That's where I gave up. Which isn't to say it isn't do-able, or it wouldn't even make the ref-filter code cleaner overall if somebody did that refactoring. But it seemed like too much for solving this one little problem. Here's the patch where I stopped, for posterity: diff --git a/ref-filter.c b/ref-filter.c index caf10ab23e..3a2a7d0271 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -716,7 +716,7 @@ static int parse_ref_filter_atom(struct ref_format *format, used_atom_cnt++; REALLOC_ARRAY(used_atom, used_atom_cnt); used_atom[at].atom_type = i; - used_atom[at].name = xmemdupz(atom, ep - atom); + used_atom[at].name = xmemdupz(atom, (arg ? arg : ep) - atom); used_atom[at].type = valid_atom[i].cmp_type; used_atom[at].source = valid_atom[i].source; if (used_atom[at].source == SOURCE_OBJ) { @@ -726,8 +726,10 @@ static int parse_ref_filter_atom(struct ref_format *format, oi.info.contentp = &oi.content; } if (arg) { - arg = used_atom[at].name + (arg - atom) + 1; - if (!*arg) { + arg++; /* skip ':' */ + if (arg < ep) { + arg = xmemdupz(arg, ep - arg); + } else { /* * Treat empty sub-arguments list as NULL (i.e., * "%(atom:)" is equivalent to "%(atom)"). Its relative shortness does not represent the great confusion I had in producing it. -Peff
diff --git a/ref-filter.c b/ref-filter.c index 271d619da9..f40bc4d9c9 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -230,13 +230,17 @@ static int strbuf_addf_ret(struct strbuf *sb, int ret, const char *fmt, ...) static int err_no_arg(struct strbuf *sb, const char *name) { - strbuf_addf(sb, _("%%(%s) does not take arguments"), name); + size_t namelen = strchrnul(name, ':') - name; + strbuf_addf(sb, _("%%(%.*s) does not take arguments"), + (int)namelen, name); return -1; } static int err_bad_arg(struct strbuf *sb, const char *name, const char *arg) { - strbuf_addf(sb, _("unrecognized %%(%s) argument: %s"), name, arg); + size_t namelen = strchrnul(name, ':') - name; + strbuf_addf(sb, _("unrecognized %%(%.*s) argument: %s"), + (int)namelen, name, arg); return -1; } @@ -274,7 +278,7 @@ static int refname_atom_parser_internal(struct refname_atom *atom, const char *a if (strtol_i(arg, 10, &atom->rstrip)) return strbuf_addf_ret(err, -1, _("Integer value expected refname:rstrip=%s"), arg); } else - return strbuf_addf_ret(err, -1, _("unrecognized %%(%s) argument: %s"), name, arg); + return err_bad_arg(err, name, arg); return 0; } @@ -471,7 +475,7 @@ static int oid_atom_parser(struct ref_format *format, struct used_atom *atom, if (atom->u.oid.length < MINIMUM_ABBREV) atom->u.oid.length = MINIMUM_ABBREV; } else - return strbuf_addf_ret(err, -1, _("unrecognized %%(%s) argument: %s"), atom->name, arg); + return err_bad_arg(err, atom->name, arg); return 0; } diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh index 010ba5a2cb..2ae1fc721b 100755 --- a/t/t6300-for-each-ref.sh +++ b/t/t6300-for-each-ref.sh @@ -1254,6 +1254,12 @@ test_expect_success 'subject atom rejects unknown arguments' ' test_cmp expect err ' +test_expect_success 'refname atom rejects unknown arguments' ' + test_must_fail git for-each-ref --format="%(refname:foo)" 2>err && + echo "fatal: unrecognized %(refname) argument: foo" >expect && + test_cmp expect err +' + test_expect_success 'trailer parsing not fooled by --- line' ' git commit --allow-empty -F - <<-\EOF && this is the subject
If you pass a bogus argument to %(refname), you may end up with a message like this: $ git for-each-ref --format='%(refname:foo)' fatal: unrecognized %(refname:foo) argument: foo which is confusing. It should just say: fatal: unrecognized %(refname) argument: foo which is clearer, and is consistent with most other atom parsers. Those other parsers do not have the same problem because they pass the atom name from a string literal in the parser function. But because the parser for %(refname) also handles %(upstream) and %(push), it instead uses atom->name, which includes the arguments. The oid atom parser which handles %(tree), %(parent), etc suffers from the same problem. It seems like the cleanest fix would be for atom->name to be _just_ the name, since there's already a separate "args" field. But since that field is also used for other things, we can't change it easily (e.g., it's how we find things in the used_atoms array, and clearly %(refname) and %(refname:short) are not the same thing). Instead, we'll teach our error_bad_arg() function to stop at the first ":". This is a little hacky, as we're effectively re-parsing the name, but the format is simple enough to do this as a one-liner, and this localizes the change to the error-reporting code. We'll give the same treatment to err_no_arg(). None of its callers use this atom->name trick, but it's worth future-proofing it while we're here. Signed-off-by: Jeff King <peff@peff.net> --- ref-filter.c | 12 ++++++++---- t/t6300-for-each-ref.sh | 6 ++++++ 2 files changed, 14 insertions(+), 4 deletions(-)