Message ID | f5f507c4c6c4514af7dca35e307ca68e72435afb.1694240177.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Commit | ee8c5ee08cee9aee6732ab315e94bc88631c7431 |
Headers | show |
Series | Trailer readability cleanups | expand |
"Linus Arver via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Linus Arver <linusa@google.com> > > Currently, find_patch_start only finds the start of the patch part of > the input (by looking at the "---" divider) for cases where the > "--no-divider" flag has not been provided. If the user provides this > flag, we do not rely on find_patch_start at all and just call strlen() > directly on the input. > > Instead, make find_patch_start aware of "--no-divider" and make it > handle that case as well. This means we no longer need to call strlen at > all and can just rely on the existing code in find_patch_start. By > forcing callers to consider this important option, we avoid the kind of > mistake described in be3d654343 (commit: pass --no-divider to > interpret-trailers, 2023-06-17). OK. The code pays attention to "---" so making it stop doing so when the "--no-*" option is given will make the function responsible for finding the beginning of the patch. I wonder if we should rename this function while we are at it, though. When "--no-divider" is given, the expected use case is *not* to have a patch at all, and it is dubious that a function whose name is find_patch_start() can possibly do anything useful. The real purpose of this function is to find the end of the log message, isn't it? And the caller uses the end of the log message it found and gives it to find_trailer_start() and find_trailer_end() as the upper boundary of the search for the trailer block.
Junio C Hamano <gitster@pobox.com> writes: > "Linus Arver via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> From: Linus Arver <linusa@google.com> >> >> Currently, find_patch_start only finds the start of the patch part of >> the input (by looking at the "---" divider) for cases where the >> "--no-divider" flag has not been provided. If the user provides this >> flag, we do not rely on find_patch_start at all and just call strlen() >> directly on the input. >> >> Instead, make find_patch_start aware of "--no-divider" and make it >> handle that case as well. This means we no longer need to call strlen at >> all and can just rely on the existing code in find_patch_start. By >> forcing callers to consider this important option, we avoid the kind of >> mistake described in be3d654343 (commit: pass --no-divider to >> interpret-trailers, 2023-06-17). > > OK. The code pays attention to "---" so making it stop doing so > when the "--no-*" option is given will make the function responsible > for finding the beginning of the patch. > > I wonder if we should rename this function while we are at it, > though. When "--no-divider" is given, the expected use case is > *not* to have a patch at all, and it is dubious that a function > whose name is find_patch_start() can possibly do anything useful. IOW, saying as the caller of this function, "find the patch start in this input I'm giving you, but also FYI the input has no patch in it" sounds wrong. Agreed. > The real purpose of this function is to find the end of the log > message, isn't it? Indeed. > And the caller uses the end of the log message > it found and gives it to find_trailer_start() and find_trailer_end() > as the upper boundary of the search for the trailer block. Right! So a better name might be something like "find_trailer_search_boundary" with a comment like Find the point at which we should stop searching for trailers (to parse them). This is either the end of the input string (obviously), or the point when we see "---" indicating the start of the "patch part". Will update.
Linus Arver <linusa@google.com> writes: >> The real purpose of this function is to find the end of the log >> message, isn't it? > > Indeed. > ... > Right! So a better name might be something like > "find_trailer_search_boundary" with a comment like Or "find_end_of_log_message()", which we agreed to be the real purpose of this function ;-)
Junio C Hamano <gitster@pobox.com> writes: > Linus Arver <linusa@google.com> writes: > >>> The real purpose of this function is to find the end of the log >>> message, isn't it? >> >> Indeed. >> ... >> Right! So a better name might be something like >> "find_trailer_search_boundary" with a comment like > > Or "find_end_of_log_message()", which we agreed to be the real > purpose of this function ;-) I did this locally, but in doing so realized that we have in trailer.c /* Return the position of the end of the trailers. */ static size_t find_trailer_end(const char *buf, size_t len) { return len - ignore_non_trailer(buf, len); } and the ignore_non_trailer() comes from commit.c, which says /* * Inspect the given string and determine the true "end" of the log message, in * order to find where to put a new Signed-off-by trailer. Ignored are * trailing comment lines and blank lines. To support "git commit -s * --amend" on an existing commit, we also ignore "Conflicts:". To * support "git commit -v", we truncate at cut lines. * * Returns the number of bytes from the tail to ignore, to be fed as * the second parameter to append_signoff(). */ size_t ignore_non_trailer(const char *buf, size_t len) { ...and I am not so sure the new "find_end_of_log_message" name for find_patch_start() is desirable because of the overlap in meaning with the comment for ignore_non_trailer(). To recap, we have in trailer_info_get() in trailer.c which (without this patch) has if (opts->no_divider) patch_start = strlen(str); else patch_start = find_patch_start(str); trailer_end = find_trailer_end(str, patch_start); trailer_start = find_trailer_start(str, trailer_end); to find the trailer_end and trailer_start positions in the input. These positions are the boundaries for parsing for actual trailers (if any). More precisely, the "patch_start" variable helps us _skip_ the "patch part" of the input (if any, denoted by "---"). The call to find_trailer_end() helps us (again) _skip_ any parts that are not part of the actual log message (per the comment in ignore_non_trailer()). So as far as trailer_info_get() is concerned, we are just trying to skip over areas where we shouldn't search/parse for trailers. The above analysis leads me to some new ideas: (1) For "find_end_of_log_message()", I think this name should really belong to ignore_non_trailer() in commit.c instead of find_patch_start() in trailer.c. The existing comment for ignore_non_trailer() already says that its job is to determine the true end of the log message (and does a lot of the necessary work to do this job). (2) find_patch_start() should be named something like "shrink_trailer_search_space" (although this meaning belongs equally well to find_trailer_end()), because the point is to reduce the search space for parsing trailers. (3) "find_trailer_end" is not a great function name because on first glance it implies that it found the end of trailers, but this is only true for the "happy path" of actually finding trailers. I need to consider all of the above ideas and reroll this patch. Because the ideas here closely relate to the "trailer_end" and "trailer_start" variables, I will probably reorder the series so that this patch and the last patch are closer together.
diff --git a/trailer.c b/trailer.c index b6de5d9cb2d..f646e484a23 100644 --- a/trailer.c +++ b/trailer.c @@ -812,14 +812,14 @@ static ssize_t last_line(const char *buf, size_t len) * Return the position of the start of the patch or the length of str if there * is no patch in the message. */ -static size_t find_patch_start(const char *str) +static size_t find_patch_start(const char *str, int no_divider) { const char *s; for (s = str; *s; s = next_line(s)) { const char *v; - if (skip_prefix(s, "---", &v) && isspace(*v)) + if (!no_divider && skip_prefix(s, "---", &v) && isspace(*v)) return s - str; } @@ -1109,11 +1109,7 @@ void trailer_info_get(struct trailer_info *info, const char *str, ensure_configured(); - if (opts->no_divider) - patch_start = strlen(str); - else - patch_start = find_patch_start(str); - + patch_start = find_patch_start(str, opts->no_divider); trailer_end = find_trailer_end(str, patch_start); trailer_start = find_trailer_start(str, trailer_end);