Message ID | patch-v4-04.27-69f0aabe38f-20220331T005325Z-avarab@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | revision.[ch]: add and use release_revisions() | expand |
Hi Ævar On 31/03/2022 02:11, Ævar Arnfjörð Bjarmason wrote: > Fix two memory leaks in "struct rev_info" by freeing that memory in > cmd_format_patch(). These two are unusual special-cases in being in > the "struct rev_info", but not being "owned" by the code in > revision.c. I.e. they're members of the struct so that this code in > "builtin/log.c" can pass information code in log-tree.c. I'm not sure that I necessarily agree that these are owned by builtin/log.c. For rev.extra_headers it is set in builtin/log.c but never used there which makes me think we are transferring ownership to struct rev_info. For ref_message_ids it is less clear cut but having it owned by struct rev_info and freeing it in release_revisions() would make things clearer I think. Having some members owned by struct rev_info but others allocated and freed by other code is confusing and is likely to lead to memory errors. I don't think struct rev_info is borrowing a reference to these items as they are being allocated for it's exclusive use. Best Wishes Phillip > See 20ff06805c6 (format-patch: resurrect extra headers from config, > 2006-06-02) and d1566f7883f (git-format-patch: Make the second and > subsequent mails replies to the first, 2006-07-14) for the initial > introduction of "extra_headers" and "ref_message_ids". > > We can count on repo_init_revisions() memset()-ing this data to 0 > however, so we can count on it being either NULL or something we > allocated. In the case of "extra_headers" let's add a local "char *" > variable to hold it, to avoid the eventual cast from "const char *" > when we free() it. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > builtin/log.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/builtin/log.c b/builtin/log.c > index 634dc782cce..6f9928fabfe 100644 > --- a/builtin/log.c > +++ b/builtin/log.c > @@ -1747,6 +1747,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) > struct commit *commit; > struct commit **list = NULL; > struct rev_info rev; > + char *to_free = NULL; > struct setup_revision_opt s_r_opt; > int nr = 0, total, i; > int use_stdout = 0; > @@ -1947,7 +1948,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) > strbuf_addch(&buf, '\n'); > } > > - rev.extra_headers = strbuf_detach(&buf, NULL); > + rev.extra_headers = to_free = strbuf_detach(&buf, NULL); > > if (from) { > if (split_ident_line(&rev.from_ident, from, strlen(from))) > @@ -2284,6 +2285,10 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) > strbuf_release(&rdiff1); > strbuf_release(&rdiff2); > strbuf_release(&rdiff_title); > + free(to_free); > + if (rev.ref_message_ids) > + string_list_clear(rev.ref_message_ids, 0); > + free(rev.ref_message_ids); > UNLEAK(rev); > return 0; > }
On Fri, Apr 01 2022, Phillip Wood wrote: > Hi Ævar > > On 31/03/2022 02:11, Ævar Arnfjörð Bjarmason wrote: >> Fix two memory leaks in "struct rev_info" by freeing that memory in >> cmd_format_patch(). These two are unusual special-cases in being in >> the "struct rev_info", but not being "owned" by the code in >> revision.c. I.e. they're members of the struct so that this code in >> "builtin/log.c" can pass information code in log-tree.c. > > I'm not sure that I necessarily agree that these are owned by > builtin/log.c. For rev.extra_headers it is set in builtin/log.c but > never used there which makes me think we are transferring ownership to > struct rev_info. For ref_message_ids it is less clear cut but having > it owned by struct rev_info and freeing it in release_revisions() > would make things clearer I think. Having some members owned by struct > rev_info but others allocated and freed by other code is confusing and > is likely to lead to memory errors. I don't think struct rev_info is > borrowing a reference to these items as they are being allocated for > it's exclusive use. This really isn't for the use of revision.[ch] at all, but just something that's used as ad-hoc message passing between builtin/log.c and log-tree.c. It just so happens that it's done by ferrying it via the struct rev_info. The below patch fails tests, it's just a quick (and obviously flawed) search/replacement that I hacked up to get the removal of these from revision.h to compile, which shows that it's just something between log.c and log-tree.c, pretty much. Anyway, while I'm 100% in agreement with you that this *should* be fixed I'd really like to do the bare minimum to address leaks in this initial iteration. I.e. you're right that this relatively fragile, and I'm also concern about catching memory errors. But being able to run (most of) the format-patch tests tests as a result of this more isolated leak will leave us much better position to validate any subsequent larger fixes, such as (something like) the below. diff --git a/builtin/log.c b/builtin/log.c index c211d66d1d0..dc1acd730d1 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -1192,7 +1192,8 @@ static void make_cover_letter(struct rev_info *rev, int use_separate_file, struct commit *origin, int nr, struct commit **list, const char *branch_name, - int quiet) + int quiet, const char *extra_headers, + struct string_list *ref_message_ids) { const char *committer; struct shortlog log; @@ -1212,7 +1213,9 @@ static void make_cover_letter(struct rev_info *rev, int use_separate_file, open_next_file(NULL, rev->numbered_files ? NULL : "cover-letter", rev, quiet)) die(_("failed to create cover-letter file")); - log_write_email_headers(rev, head, &pp.after_subject, &need_8bit_cte, 0); + log_write_email_headers(rev, head, pp.after_subject, ref_message_ids, + &need_8bit_cte, 0); + pp.after_subject = extra_headers; for (i = 0; !need_8bit_cte && i < nr; i++) { const char *buf = get_commit_buffer(list[i], NULL); @@ -1777,6 +1780,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) struct strbuf rdiff2 = STRBUF_INIT; struct strbuf rdiff_title = STRBUF_INIT; int creation_factor = -1; + char *extra_headers; + struct string_list *ref_message_ids; const struct option builtin_format_patch_options[] = { OPT_CALLBACK_F('n', "numbered", &numbered, NULL, @@ -1946,7 +1951,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) strbuf_addch(&buf, '\n'); } - rev.extra_headers = strbuf_detach(&buf, NULL); + extra_headers = strbuf_detach(&buf, NULL); if (from) { if (split_ident_line(&rev.from_ident, from, strlen(from))) @@ -2174,10 +2179,10 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) } if (in_reply_to || thread || cover_letter) - rev.ref_message_ids = xcalloc(1, sizeof(struct string_list)); + ref_message_ids = xcalloc(1, sizeof(struct string_list)); if (in_reply_to) { const char *msgid = clean_message_id(in_reply_to); - string_list_append(rev.ref_message_ids, msgid); + string_list_append(ref_message_ids, msgid); } rev.numbered_files = just_numbers; rev.patch_suffix = fmt_patch_suffix; @@ -2185,7 +2190,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) if (thread) gen_message_id(&rev, "cover"); make_cover_letter(&rev, !!output_directory, - origin, nr, list, branch_name, quiet); + origin, nr, list, branch_name, quiet, + extra_headers, ref_message_ids); print_bases(&bases, rev.diffopt.file); print_signature(rev.diffopt.file); total++; @@ -2229,11 +2235,11 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) * --in-reply-to, if specified. */ if (thread == THREAD_SHALLOW - && rev.ref_message_ids->nr > 0 + && ref_message_ids->nr > 0 && (!cover_letter || rev.nr > 1)) free(rev.message_id); else - string_list_append(rev.ref_message_ids, + string_list_append(ref_message_ids, rev.message_id); } gen_message_id(&rev, oid_to_hex(&commit->object.oid)); diff --git a/log-tree.c b/log-tree.c index 38e5cccc1a1..66170b75254 100644 --- a/log-tree.c +++ b/log-tree.c @@ -419,11 +419,11 @@ void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt) } void log_write_email_headers(struct rev_info *opt, struct commit *commit, - const char **extra_headers_p, + const char *extra_headers, + struct string_list *ref_message_ids, int *need_8bit_cte_p, int maybe_multipart) { - const char *extra_headers = opt->extra_headers; const char *name = oid_to_hex(opt->zero_commit ? null_oid() : &commit->object.oid); @@ -435,13 +435,13 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit, fprintf(opt->diffopt.file, "Message-Id: <%s>\n", opt->message_id); graph_show_oneline(opt->graph); } - if (opt->ref_message_ids && opt->ref_message_ids->nr > 0) { + if (ref_message_ids && ref_message_ids->nr > 0) { int i, n; - n = opt->ref_message_ids->nr; - fprintf(opt->diffopt.file, "In-Reply-To: <%s>\n", opt->ref_message_ids->items[n-1].string); + n = ref_message_ids->nr; + fprintf(opt->diffopt.file, "In-Reply-To: <%s>\n", ref_message_ids->items[n-1].string); for (i = 0; i < n; i++) fprintf(opt->diffopt.file, "%s<%s>\n", (i > 0 ? "\t" : "References: "), - opt->ref_message_ids->items[i].string); + ref_message_ids->items[i].string); graph_show_oneline(opt->graph); } if (opt->mime_boundary && maybe_multipart) { @@ -488,7 +488,6 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit, opt->diffopt.stat_sep = buffer.buf; strbuf_release(&filename); } - *extra_headers_p = extra_headers; } static void show_sig_lines(struct rev_info *opt, int status, const char *bol) @@ -621,13 +620,12 @@ static void next_commentary_block(struct rev_info *opt, struct strbuf *sb) opt->shown_dashes = 1; } -void show_log(struct rev_info *opt) +void show_log_extra_headers(struct rev_info *opt, const char *extra_headers) { struct strbuf msgbuf = STRBUF_INIT; struct log_info *log = opt->loginfo; struct commit *commit = log->commit, *parent = log->parent; int abbrev_commit = opt->abbrev_commit ? opt->abbrev : the_hash_algo->hexsz; - const char *extra_headers = opt->extra_headers; struct pretty_print_context ctx = {0}; opt->loginfo = NULL; @@ -687,7 +685,7 @@ void show_log(struct rev_info *opt) */ if (cmit_fmt_is_mail(opt->commit_format)) { - log_write_email_headers(opt, commit, &extra_headers, + log_write_email_headers(opt, commit, extra_headers, NULL, &ctx.need_8bit_cte, 1); ctx.rev = opt; ctx.print_email_subject = 1; @@ -848,6 +846,11 @@ void show_log(struct rev_info *opt) } } +void show_log(struct rev_info *opt) +{ + show_log_extra_headers(opt, NULL); +} + int log_tree_diff_flush(struct rev_info *opt) { opt->shown_dashes = 0; diff --git a/log-tree.h b/log-tree.h index e7e4641cf83..32ae026f6fb 100644 --- a/log-tree.h +++ b/log-tree.h @@ -16,6 +16,7 @@ struct decoration_filter { int parse_decorate_color_config(const char *var, const char *slot_name, const char *value); int log_tree_diff_flush(struct rev_info *); int log_tree_commit(struct rev_info *, struct commit *); +void show_log_extra_headers(struct rev_info *opt, const char *extra_headers); void show_log(struct rev_info *opt); void format_decorations_extended(struct strbuf *sb, const struct commit *commit, int use_color, @@ -26,7 +27,8 @@ void format_decorations_extended(struct strbuf *sb, const struct commit *commit, format_decorations_extended((strbuf), (commit), (color), " (", ", ", ")") void show_decorations(struct rev_info *opt, struct commit *commit); void log_write_email_headers(struct rev_info *opt, struct commit *commit, - const char **extra_headers_p, + const char *extra_headers, + struct string_list *ref_message_ids, int *need_8bit_cte_p, int maybe_multipart); void load_ref_decorations(struct decoration_filter *filter, int flags); diff --git a/revision.h b/revision.h index 5bc59c7bfe1..e2e08673bba 100644 --- a/revision.h +++ b/revision.h @@ -243,9 +243,7 @@ struct rev_info { const char *reroll_count; char *message_id; struct ident_split from_ident; - struct string_list *ref_message_ids; int add_signoff; - const char *extra_headers; const char *log_reencode; const char *subject_prefix; int patch_name_max;
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > The below patch fails tests, it's just a quick (and obviously flawed) > search/replacement that I hacked up to get the removal of these from > revision.h to compile, which shows that it's just something between > log.c and log-tree.c, pretty much. Given that the way most end-users see log-tree.c in action is via log.c (is there a useful entry point into log-tree that bypasses log.c that implements end-user actions?), I would actually think it is that extra_headers is quite an integral part of rev_info structure, which is how the revision traversal API passes information relevant to the current traversal around various layors of the implementation. So, I am not sure what the above experiment shows us. > Anyway, while I'm 100% in agreement with you that this *should* be fixed > I'd really like to do the bare minimum to address leaks in this initial > iteration. > > I.e. you're right that this relatively fragile, and I'm also concern > about catching memory errors. As long as we know there are yet more work to do (are we leaving NEEDSWORK comments as we go, by the way?), I think that's fine. We cannot go all the way in one go for a large code base like revisions API, and we can leave a small corner "not quite optimial", and we can even leave it in "sometimes still leaks" status, as long as we never make anything to free too much by mistake.
diff --git a/builtin/log.c b/builtin/log.c index 634dc782cce..6f9928fabfe 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -1747,6 +1747,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) struct commit *commit; struct commit **list = NULL; struct rev_info rev; + char *to_free = NULL; struct setup_revision_opt s_r_opt; int nr = 0, total, i; int use_stdout = 0; @@ -1947,7 +1948,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) strbuf_addch(&buf, '\n'); } - rev.extra_headers = strbuf_detach(&buf, NULL); + rev.extra_headers = to_free = strbuf_detach(&buf, NULL); if (from) { if (split_ident_line(&rev.from_ident, from, strlen(from))) @@ -2284,6 +2285,10 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) strbuf_release(&rdiff1); strbuf_release(&rdiff2); strbuf_release(&rdiff_title); + free(to_free); + if (rev.ref_message_ids) + string_list_clear(rev.ref_message_ids, 0); + free(rev.ref_message_ids); UNLEAK(rev); return 0; }
Fix two memory leaks in "struct rev_info" by freeing that memory in cmd_format_patch(). These two are unusual special-cases in being in the "struct rev_info", but not being "owned" by the code in revision.c. I.e. they're members of the struct so that this code in "builtin/log.c" can pass information code in log-tree.c. See 20ff06805c6 (format-patch: resurrect extra headers from config, 2006-06-02) and d1566f7883f (git-format-patch: Make the second and subsequent mails replies to the first, 2006-07-14) for the initial introduction of "extra_headers" and "ref_message_ids". We can count on repo_init_revisions() memset()-ing this data to 0 however, so we can count on it being either NULL or something we allocated. In the case of "extra_headers" let's add a local "char *" variable to hold it, to avoid the eventual cast from "const char *" when we free() it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/log.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)