Message ID | 0463066ebe0889b72b6a1f6c344f2de127458391.1694240177.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Commit | b5e75f87b5881811a7eee8ea6aa10cd43f1fe430 |
Headers | show |
Series | Trailer readability cleanups | expand |
"Linus Arver via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Linus Arver <linusa@google.com> > > Previously these fields in the trailer_info struct were of type "const > char *" and pointed to positions in the input string directly (to the > start and end positions of the trailer block). > > Use offsets to make the intended usage less ambiguous. We only need to > reference the input string in format_trailer_info(), so update that > function to take a pointer to the input. Hmm, I am not sure if this is an improvement. If the underlying buffer can be reallocated (to grow), the approach to use the offsets certainly is easier to deal with, as they will stay valid even after such a reallocation. But you lose the obvious sentinel value NULL that can mean something special, and have to make the readers aware of the local convention you happened to have picked with a comment like ... > Signed-off-by: Linus Arver <linusa@google.com> > --- > trailer.c | 17 ++++++++--------- > trailer.h | 7 +++---- > 2 files changed, 11 insertions(+), 13 deletions(-) > ... > /* > - * Pointers to the start and end of the trailer block found. If there > - * is no trailer block found, these 2 pointers point to the end of the > - * input string. > + * Offsets to the trailer block start and end positions in the input > + * string. If no trailer block is found, these are set to 0. > */ ... this, simply because there is no obvious sentinel value for an unsigned integral type; even if you count MAX_ULONG and its friends, they are not as obvious as NULL for pointer types. So, I dunno.
Junio C Hamano <gitster@pobox.com> writes: > "Linus Arver via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> From: Linus Arver <linusa@google.com> >> >> Previously these fields in the trailer_info struct were of type "const >> char *" and pointed to positions in the input string directly (to the >> start and end positions of the trailer block). >> >> Use offsets to make the intended usage less ambiguous. We only need to >> reference the input string in format_trailer_info(), so update that >> function to take a pointer to the input. > > Hmm, I am not sure if this is an improvement. If the underlying > buffer can be reallocated (to grow), the approach to use the offsets > certainly is easier to deal with, as they will stay valid even after > such a reallocation. But you lose the obvious sentinel value NULL > that can mean something special True. > and have to make the readers aware > of the local convention you happened to have picked with a comment > like ... > >> Signed-off-by: Linus Arver <linusa@google.com> >> --- >> trailer.c | 17 ++++++++--------- >> trailer.h | 7 +++---- >> 2 files changed, 11 insertions(+), 13 deletions(-) >> ... >> /* >> - * Pointers to the start and end of the trailer block found. If there >> - * is no trailer block found, these 2 pointers point to the end of the >> - * input string. >> + * Offsets to the trailer block start and end positions in the input >> + * string. If no trailer block is found, these are set to 0. >> */ > > ... this, simply because there is no obvious sentinel value for an > unsigned integral type; even if you count MAX_ULONG and its friends, > they are not as obvious as NULL for pointer types. I agree that there is no trustworthy sentinel value for an unsigned integral type. On the other hand, we never used NULL as a sentinel value before even when they were const char pointers --- the current comment for these fields which say ... If there is no trailer block found, these 2 pointers point to the end of the input string. ... sounds somewhat arbitrary to me (and I don't think we care about this property in trailer.c, and AFAICS it's also not something that consumers should be aware of). Consumers of the trailer_info struct could also just see if "info->trailer_nr > 0" to check whether any trailers were found, although if we're merging Patch 1 [1] of this series the consumers will not have easy access any more to any of the trailer_info fields, and they should instead be using a public-facing function that does the "were trailers found" check. > So, I dunno. If the "we don't use NULL sentinel values currently anyway" argument is convincing enough, I'm happy to add this to the commit message on a reroll. But I'm also OK with dropping this patch. Thoughts? [1] https://lore.kernel.org/git/pull.1563.git.1691211879.gitgitgadget@gmail.com/T/#m8f1b5f1eb346331658c8c7b3e057a4ee31223664
Linus Arver <linusa@google.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> "Linus Arver via GitGitGadget" <gitgitgadget@gmail.com> writes: >> >>> From: Linus Arver <linusa@google.com> >>> >>> Previously these fields in the trailer_info struct were of type "const >>> char *" and pointed to positions in the input string directly (to the >>> start and end positions of the trailer block). >>> >>> Use offsets to make the intended usage less ambiguous. We only need to >>> reference the input string in format_trailer_info(), so update that >>> function to take a pointer to the input. >> >> Hmm, I am not sure if this is an improvement. If the underlying >> buffer can be reallocated (to grow), the approach to use the offsets >> certainly is easier to deal with, as they will stay valid even after >> such a reallocation. But you lose the obvious sentinel value NULL >> that can mean something special > > True. > >> and have to make the readers aware >> of the local convention you happened to have picked with a comment >> like ... >> >>> Signed-off-by: Linus Arver <linusa@google.com> >>> --- >>> trailer.c | 17 ++++++++--------- >>> trailer.h | 7 +++---- >>> 2 files changed, 11 insertions(+), 13 deletions(-) >>> ... >>> /* >>> - * Pointers to the start and end of the trailer block found. If there >>> - * is no trailer block found, these 2 pointers point to the end of the >>> - * input string. >>> + * Offsets to the trailer block start and end positions in the input >>> + * string. If no trailer block is found, these are set to 0. >>> */ I've just realized that the new comment "If no trailer block is found, these are set to 0" is perhaps not always true in the current version of this patch. This is because this patch did a mechanical type conversion of the old fields from "const char *" to "size_t", and we still run trailer_end = find_trailer_end(str, trailer_search_boundary); trailer_start = find_trailer_start(str, trailer_end); inside trailer_info_get() (not visible in the patch context lines). I will update this patch to make the comment "If no trailer block is found, these are set to 0" true.
diff --git a/trailer.c b/trailer.c index 6ad2fbca942..00326720e81 100644 --- a/trailer.c +++ b/trailer.c @@ -1055,7 +1055,6 @@ void process_trailers(const char *file, LIST_HEAD(head); struct strbuf sb = STRBUF_INIT; struct trailer_info info; - size_t trailer_end; FILE *outfile = stdout; ensure_configured(); @@ -1066,11 +1065,10 @@ void process_trailers(const char *file, outfile = create_in_place_tempfile(file); parse_trailers(&info, sb.buf, &head, opts); - trailer_end = info.trailer_end - sb.buf; /* Print the lines before the trailers */ if (!opts->only_trailers) - fwrite(sb.buf, 1, info.trailer_start - sb.buf, outfile); + fwrite(sb.buf, 1, info.trailer_start, outfile); if (!opts->only_trailers && !info.blank_line_before_trailer) fprintf(outfile, "\n"); @@ -1092,7 +1090,7 @@ void process_trailers(const char *file, /* Print the lines after the trailers as is */ if (!opts->only_trailers) - fwrite(sb.buf + trailer_end, 1, sb.len - trailer_end, outfile); + fwrite(sb.buf + info.trailer_end, 1, sb.len - info.trailer_end, outfile); if (opts->in_place) if (rename_tempfile(&trailers_tempfile, file)) @@ -1104,7 +1102,7 @@ void process_trailers(const char *file, void trailer_info_get(struct trailer_info *info, const char *str, const struct process_trailer_options *opts) { - int patch_start, trailer_end, trailer_start; + size_t patch_start, trailer_end = 0, trailer_start = 0; struct strbuf **trailer_lines, **ptr; char **trailer_strings = NULL; size_t nr = 0, alloc = 0; @@ -1139,8 +1137,8 @@ void trailer_info_get(struct trailer_info *info, const char *str, info->blank_line_before_trailer = ends_with_blank_line(str, trailer_start); - info->trailer_start = str + trailer_start; - info->trailer_end = str + trailer_end; + info->trailer_start = trailer_start; + info->trailer_end = trailer_end; info->trailers = trailer_strings; info->trailer_nr = nr; } @@ -1155,6 +1153,7 @@ void trailer_info_release(struct trailer_info *info) static void format_trailer_info(struct strbuf *out, const struct trailer_info *info, + const char *msg, const struct process_trailer_options *opts) { size_t origlen = out->len; @@ -1164,7 +1163,7 @@ static void format_trailer_info(struct strbuf *out, if (!opts->only_trailers && !opts->unfold && !opts->filter && !opts->separator && !opts->key_only && !opts->value_only && !opts->key_value_separator) { - strbuf_add(out, info->trailer_start, + strbuf_add(out, msg + info->trailer_start, info->trailer_end - info->trailer_start); return; } @@ -1219,7 +1218,7 @@ void format_trailers_from_commit(struct strbuf *out, const char *msg, struct trailer_info info; trailer_info_get(&info, msg, opts); - format_trailer_info(out, &info, opts); + format_trailer_info(out, &info, msg, opts); trailer_info_release(&info); } diff --git a/trailer.h b/trailer.h index a689d768c79..13fbf0dcd12 100644 --- a/trailer.h +++ b/trailer.h @@ -37,11 +37,10 @@ struct trailer_info { int blank_line_before_trailer; /* - * Pointers to the start and end of the trailer block found. If there - * is no trailer block found, these 2 pointers point to the end of the - * input string. + * Offsets to the trailer block start and end positions in the input + * string. If no trailer block is found, these are set to 0. */ - const char *trailer_start, *trailer_end; + size_t trailer_start, trailer_end; /* * Array of trailers found.