diff mbox series

[v4,04/27] format-patch: don't leak "extra_headers" or "ref_message_ids"

Message ID patch-v4-04.27-69f0aabe38f-20220331T005325Z-avarab@gmail.com (mailing list archive)
State Superseded
Headers show
Series revision.[ch]: add and use release_revisions() | expand

Commit Message

Ævar Arnfjörð Bjarmason March 31, 2022, 1:11 a.m. UTC
Fix two memory leaks in "struct rev_info" by freeing that memory in
cmd_format_patch(). These two are unusual special-cases in being in
the "struct rev_info", but not being "owned" by the code in
revision.c. I.e. they're members of the struct so that this code in
"builtin/log.c" can pass information code in log-tree.c.

See 20ff06805c6 (format-patch: resurrect extra headers from config,
2006-06-02) and d1566f7883f (git-format-patch: Make the second and
subsequent mails replies to the first, 2006-07-14) for the initial
introduction of "extra_headers" and "ref_message_ids".

We can count on repo_init_revisions() memset()-ing this data to 0
however, so we can count on it being either NULL or something we
allocated. In the case of "extra_headers" let's add a local "char *"
variable to hold it, to avoid the eventual cast from "const char *"
when we free() it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/log.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Phillip Wood April 1, 2022, 3:13 p.m. UTC | #1
Hi Ævar

On 31/03/2022 02:11, Ævar Arnfjörð Bjarmason wrote:
> Fix two memory leaks in "struct rev_info" by freeing that memory in
> cmd_format_patch(). These two are unusual special-cases in being in
> the "struct rev_info", but not being "owned" by the code in
> revision.c. I.e. they're members of the struct so that this code in
> "builtin/log.c" can pass information code in log-tree.c.

I'm not sure that I necessarily agree that these are owned by 
builtin/log.c. For rev.extra_headers it is set in builtin/log.c but 
never used there which makes me think we are transferring ownership to 
struct rev_info. For ref_message_ids it is less clear cut but having it 
owned by struct rev_info and freeing it in release_revisions() would 
make things clearer I think. Having some members owned by struct 
rev_info but others allocated and freed by other code is confusing and 
is likely to lead to memory errors. I don't think struct rev_info is 
borrowing a reference to these items as they are being allocated for 
it's exclusive use.

Best Wishes

Phillip

> See 20ff06805c6 (format-patch: resurrect extra headers from config,
> 2006-06-02) and d1566f7883f (git-format-patch: Make the second and
> subsequent mails replies to the first, 2006-07-14) for the initial
> introduction of "extra_headers" and "ref_message_ids".
> 
> We can count on repo_init_revisions() memset()-ing this data to 0
> however, so we can count on it being either NULL or something we
> allocated. In the case of "extra_headers" let's add a local "char *"
> variable to hold it, to avoid the eventual cast from "const char *"
> when we free() it.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   builtin/log.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/log.c b/builtin/log.c
> index 634dc782cce..6f9928fabfe 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -1747,6 +1747,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>   	struct commit *commit;
>   	struct commit **list = NULL;
>   	struct rev_info rev;
> +	char *to_free = NULL;
>   	struct setup_revision_opt s_r_opt;
>   	int nr = 0, total, i;
>   	int use_stdout = 0;
> @@ -1947,7 +1948,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>   		strbuf_addch(&buf, '\n');
>   	}
>   
> -	rev.extra_headers = strbuf_detach(&buf, NULL);
> +	rev.extra_headers = to_free = strbuf_detach(&buf, NULL);
>   
>   	if (from) {
>   		if (split_ident_line(&rev.from_ident, from, strlen(from)))
> @@ -2284,6 +2285,10 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>   	strbuf_release(&rdiff1);
>   	strbuf_release(&rdiff2);
>   	strbuf_release(&rdiff_title);
> +	free(to_free);
> +	if (rev.ref_message_ids)
> +		string_list_clear(rev.ref_message_ids, 0);
> +	free(rev.ref_message_ids);
>   	UNLEAK(rev);
>   	return 0;
>   }
Ævar Arnfjörð Bjarmason April 1, 2022, 5:16 p.m. UTC | #2
On Fri, Apr 01 2022, Phillip Wood wrote:

> Hi Ævar
>
> On 31/03/2022 02:11, Ævar Arnfjörð Bjarmason wrote:
>> Fix two memory leaks in "struct rev_info" by freeing that memory in
>> cmd_format_patch(). These two are unusual special-cases in being in
>> the "struct rev_info", but not being "owned" by the code in
>> revision.c. I.e. they're members of the struct so that this code in
>> "builtin/log.c" can pass information code in log-tree.c.
>
> I'm not sure that I necessarily agree that these are owned by
> builtin/log.c. For rev.extra_headers it is set in builtin/log.c but 
> never used there which makes me think we are transferring ownership to
> struct rev_info. For ref_message_ids it is less clear cut but having
> it owned by struct rev_info and freeing it in release_revisions()
> would make things clearer I think. Having some members owned by struct 
> rev_info but others allocated and freed by other code is confusing and
> is likely to lead to memory errors. I don't think struct rev_info is 
> borrowing a reference to these items as they are being allocated for
> it's exclusive use.

This really isn't for the use of revision.[ch] at all, but just
something that's used as ad-hoc message passing between builtin/log.c
and log-tree.c. It just so happens that it's done by ferrying it via the
struct rev_info.

The below patch fails tests, it's just a quick (and obviously flawed)
search/replacement that I hacked up to get the removal of these from
revision.h to compile, which shows that it's just something between
log.c and log-tree.c, pretty much.

Anyway, while I'm 100% in agreement with you that this *should* be fixed
I'd really like to do the bare minimum to address leaks in this initial
iteration.

I.e. you're right that this relatively fragile, and I'm also concern
about catching memory errors.

But being able to run (most of) the format-patch tests tests as a result
of this more isolated leak will leave us much better position to
validate any subsequent larger fixes, such as (something like) the
below.

diff --git a/builtin/log.c b/builtin/log.c
index c211d66d1d0..dc1acd730d1 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1192,7 +1192,8 @@ static void make_cover_letter(struct rev_info *rev, int use_separate_file,
 			      struct commit *origin,
 			      int nr, struct commit **list,
 			      const char *branch_name,
-			      int quiet)
+			      int quiet, const char *extra_headers,
+			      struct string_list *ref_message_ids)
 {
 	const char *committer;
 	struct shortlog log;
@@ -1212,7 +1213,9 @@ static void make_cover_letter(struct rev_info *rev, int use_separate_file,
 	    open_next_file(NULL, rev->numbered_files ? NULL : "cover-letter", rev, quiet))
 		die(_("failed to create cover-letter file"));
 
-	log_write_email_headers(rev, head, &pp.after_subject, &need_8bit_cte, 0);
+	log_write_email_headers(rev, head, pp.after_subject, ref_message_ids,
+				&need_8bit_cte, 0);
+	pp.after_subject = extra_headers;
 
 	for (i = 0; !need_8bit_cte && i < nr; i++) {
 		const char *buf = get_commit_buffer(list[i], NULL);
@@ -1777,6 +1780,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	struct strbuf rdiff2 = STRBUF_INIT;
 	struct strbuf rdiff_title = STRBUF_INIT;
 	int creation_factor = -1;
+	char *extra_headers;
+	struct string_list *ref_message_ids;
 
 	const struct option builtin_format_patch_options[] = {
 		OPT_CALLBACK_F('n', "numbered", &numbered, NULL,
@@ -1946,7 +1951,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		strbuf_addch(&buf, '\n');
 	}
 
-	rev.extra_headers = strbuf_detach(&buf, NULL);
+	extra_headers = strbuf_detach(&buf, NULL);
 
 	if (from) {
 		if (split_ident_line(&rev.from_ident, from, strlen(from)))
@@ -2174,10 +2179,10 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	}
 
 	if (in_reply_to || thread || cover_letter)
-		rev.ref_message_ids = xcalloc(1, sizeof(struct string_list));
+		ref_message_ids = xcalloc(1, sizeof(struct string_list));
 	if (in_reply_to) {
 		const char *msgid = clean_message_id(in_reply_to);
-		string_list_append(rev.ref_message_ids, msgid);
+		string_list_append(ref_message_ids, msgid);
 	}
 	rev.numbered_files = just_numbers;
 	rev.patch_suffix = fmt_patch_suffix;
@@ -2185,7 +2190,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		if (thread)
 			gen_message_id(&rev, "cover");
 		make_cover_letter(&rev, !!output_directory,
-				  origin, nr, list, branch_name, quiet);
+				  origin, nr, list, branch_name, quiet,
+				  extra_headers, ref_message_ids);
 		print_bases(&bases, rev.diffopt.file);
 		print_signature(rev.diffopt.file);
 		total++;
@@ -2229,11 +2235,11 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 				 * --in-reply-to, if specified.
 				 */
 				if (thread == THREAD_SHALLOW
-				    && rev.ref_message_ids->nr > 0
+				    && ref_message_ids->nr > 0
 				    && (!cover_letter || rev.nr > 1))
 					free(rev.message_id);
 				else
-					string_list_append(rev.ref_message_ids,
+					string_list_append(ref_message_ids,
 							   rev.message_id);
 			}
 			gen_message_id(&rev, oid_to_hex(&commit->object.oid));
diff --git a/log-tree.c b/log-tree.c
index 38e5cccc1a1..66170b75254 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -419,11 +419,11 @@ void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt)
 }
 
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
-			     const char **extra_headers_p,
+			     const char *extra_headers,
+			     struct string_list *ref_message_ids,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart)
 {
-	const char *extra_headers = opt->extra_headers;
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      null_oid() : &commit->object.oid);
 
@@ -435,13 +435,13 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		fprintf(opt->diffopt.file, "Message-Id: <%s>\n", opt->message_id);
 		graph_show_oneline(opt->graph);
 	}
-	if (opt->ref_message_ids && opt->ref_message_ids->nr > 0) {
+	if (ref_message_ids && ref_message_ids->nr > 0) {
 		int i, n;
-		n = opt->ref_message_ids->nr;
-		fprintf(opt->diffopt.file, "In-Reply-To: <%s>\n", opt->ref_message_ids->items[n-1].string);
+		n = ref_message_ids->nr;
+		fprintf(opt->diffopt.file, "In-Reply-To: <%s>\n", ref_message_ids->items[n-1].string);
 		for (i = 0; i < n; i++)
 			fprintf(opt->diffopt.file, "%s<%s>\n", (i > 0 ? "\t" : "References: "),
-			       opt->ref_message_ids->items[i].string);
+			       ref_message_ids->items[i].string);
 		graph_show_oneline(opt->graph);
 	}
 	if (opt->mime_boundary && maybe_multipart) {
@@ -488,7 +488,6 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		opt->diffopt.stat_sep = buffer.buf;
 		strbuf_release(&filename);
 	}
-	*extra_headers_p = extra_headers;
 }
 
 static void show_sig_lines(struct rev_info *opt, int status, const char *bol)
@@ -621,13 +620,12 @@ static void next_commentary_block(struct rev_info *opt, struct strbuf *sb)
 	opt->shown_dashes = 1;
 }
 
-void show_log(struct rev_info *opt)
+void show_log_extra_headers(struct rev_info *opt, const char *extra_headers)
 {
 	struct strbuf msgbuf = STRBUF_INIT;
 	struct log_info *log = opt->loginfo;
 	struct commit *commit = log->commit, *parent = log->parent;
 	int abbrev_commit = opt->abbrev_commit ? opt->abbrev : the_hash_algo->hexsz;
-	const char *extra_headers = opt->extra_headers;
 	struct pretty_print_context ctx = {0};
 
 	opt->loginfo = NULL;
@@ -687,7 +685,7 @@ void show_log(struct rev_info *opt)
 	 */
 
 	if (cmit_fmt_is_mail(opt->commit_format)) {
-		log_write_email_headers(opt, commit, &extra_headers,
+		log_write_email_headers(opt, commit, extra_headers, NULL,
 					&ctx.need_8bit_cte, 1);
 		ctx.rev = opt;
 		ctx.print_email_subject = 1;
@@ -848,6 +846,11 @@ void show_log(struct rev_info *opt)
 	}
 }
 
+void show_log(struct rev_info *opt)
+{
+	show_log_extra_headers(opt, NULL);
+}
+
 int log_tree_diff_flush(struct rev_info *opt)
 {
 	opt->shown_dashes = 0;
diff --git a/log-tree.h b/log-tree.h
index e7e4641cf83..32ae026f6fb 100644
--- a/log-tree.h
+++ b/log-tree.h
@@ -16,6 +16,7 @@ struct decoration_filter {
 int parse_decorate_color_config(const char *var, const char *slot_name, const char *value);
 int log_tree_diff_flush(struct rev_info *);
 int log_tree_commit(struct rev_info *, struct commit *);
+void show_log_extra_headers(struct rev_info *opt, const char *extra_headers);
 void show_log(struct rev_info *opt);
 void format_decorations_extended(struct strbuf *sb, const struct commit *commit,
 			     int use_color,
@@ -26,7 +27,8 @@ void format_decorations_extended(struct strbuf *sb, const struct commit *commit,
 			     format_decorations_extended((strbuf), (commit), (color), " (", ", ", ")")
 void show_decorations(struct rev_info *opt, struct commit *commit);
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
-			     const char **extra_headers_p,
+			     const char *extra_headers,
+			     struct string_list *ref_message_ids,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart);
 void load_ref_decorations(struct decoration_filter *filter, int flags);
diff --git a/revision.h b/revision.h
index 5bc59c7bfe1..e2e08673bba 100644
--- a/revision.h
+++ b/revision.h
@@ -243,9 +243,7 @@ struct rev_info {
 	const char	*reroll_count;
 	char		*message_id;
 	struct ident_split from_ident;
-	struct string_list *ref_message_ids;
 	int		add_signoff;
-	const char	*extra_headers;
 	const char	*log_reencode;
 	const char	*subject_prefix;
 	int		patch_name_max;
Junio C Hamano April 1, 2022, 7:32 p.m. UTC | #3
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> The below patch fails tests, it's just a quick (and obviously flawed)
> search/replacement that I hacked up to get the removal of these from
> revision.h to compile, which shows that it's just something between
> log.c and log-tree.c, pretty much.

Given that the way most end-users see log-tree.c in action is via
log.c (is there a useful entry point into log-tree that bypasses
log.c that implements end-user actions?), I would actually think it
is that extra_headers is quite an integral part of rev_info
structure, which is how the revision traversal API passes
information relevant to the current traversal around various layors
of the implementation.  So, I am not sure what the above experiment
shows us.

> Anyway, while I'm 100% in agreement with you that this *should* be fixed
> I'd really like to do the bare minimum to address leaks in this initial
> iteration.
>
> I.e. you're right that this relatively fragile, and I'm also concern
> about catching memory errors.

As long as we know there are yet more work to do (are we leaving
NEEDSWORK comments as we go, by the way?), I think that's fine.  We
cannot go all the way in one go for a large code base like revisions
API, and we can leave a small corner "not quite optimial", and we can
even leave it in "sometimes still leaks" status, as long as we never
make anything to free too much by mistake.
diff mbox series

Patch

diff --git a/builtin/log.c b/builtin/log.c
index 634dc782cce..6f9928fabfe 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1747,6 +1747,7 @@  int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	struct commit *commit;
 	struct commit **list = NULL;
 	struct rev_info rev;
+	char *to_free = NULL;
 	struct setup_revision_opt s_r_opt;
 	int nr = 0, total, i;
 	int use_stdout = 0;
@@ -1947,7 +1948,7 @@  int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		strbuf_addch(&buf, '\n');
 	}
 
-	rev.extra_headers = strbuf_detach(&buf, NULL);
+	rev.extra_headers = to_free = strbuf_detach(&buf, NULL);
 
 	if (from) {
 		if (split_ident_line(&rev.from_ident, from, strlen(from)))
@@ -2284,6 +2285,10 @@  int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	strbuf_release(&rdiff1);
 	strbuf_release(&rdiff2);
 	strbuf_release(&rdiff_title);
+	free(to_free);
+	if (rev.ref_message_ids)
+		string_list_clear(rev.ref_message_ids, 0);
+	free(rev.ref_message_ids);
 	UNLEAK(rev);
 	return 0;
 }