diff mbox series

[v2,1/3] revision: add a per-email field to rev-info

Message ID 9a7102b708e4afe78447e48e4baf5b6d66ca50d1.1710873210.git.code@khaugsbakk.name (mailing list archive)
State New
Series format-patch: teach `--header-cmd`

Commit Message

Kristoffer Haugsbakk March 19, 2024, 6:35 p.m. UTC
Add `pe_headers` to `rev_info` to store per-email headers.

The next commit will add an option to `format-patch` which will allow
the user to store headers per-email; a complement to options like
`--add-header`.

To make this possible we need a new field to store these headers. We
also need to take ownership of `extra_headers_p` in
`log_write_email_headers`; facilitate this by removing constness from
the relevant pointers.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
---

Notes (series):
    v2:
    • Replaces “log-tree: take ownership of pointer”
      • Link: https://lore.kernel.org/git/3b12a8cf393b6d8f0877fd7d87173c565d7d5a90.1709841147.git.code@khaugsbakk.name/
    • More preliminary work
      • Link: https://lore.kernel.org/git/20240313065454.GB125150@coredump.intra.peff.net/

 log-tree.c | 21 +++++++++++----------
 log-tree.h |  2 +-
 pretty.h   |  2 +-
 revision.h |  4 +++-
 4 files changed, 16 insertions(+), 13 deletions(-)

Comments

Jeff King March 19, 2024, 9:29 p.m. UTC | #1
On Tue, Mar 19, 2024 at 07:35:36PM +0100, Kristoffer Haugsbakk wrote:

> Add `pe_headers` to `rev_info` to store per-email headers.

It is only just now that I realized that "pe" stands for per-email
(though to be fair I was not really focused on the intent of the series
when reading v1). Can we just call it per_email_headers or something?

> The next commit will add an option to `format-patch` which will allow
> the user to store headers per-email; a complement to options like
> `--add-header`.
> 
> To make this possible we need a new field to store these headers. We
> also need to take ownership of `extra_headers_p` in
> `log_write_email_headers`; facilitate this by removing constness from
> the relevant pointers.

There are three pointers at play here:

  - ctx.after_subject has its const removed, since it will now always be
    allocated by log_write_email_headers(), and then freed by the
    caller. Makes sense. Though it looks like we only free in show_log(),
    and the free in make_cover_letter() is not added until patch 2?

  - rev_info.extra_headers has its const removed here, but I don't think
    that is helping anything. We only use it to write into the "headers"
    strbuf in log_write_email_headers(), which always returns
    headers.buf (or NULL).

  - rev.pe_headers is introduced as non-const because it is allocated
    and freed for each email. That makes some sense, though if we
    followed the pattern of rev.extra_headers, then the pointer is
    conceptually "const" within the rev_info struct, and it is the
    caller who keeps track of the allocation (using a to_free variable).
    Possibly we should do the same here?
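
The to_free pattern mentioned in the last bullet can be sketched roughly as
follows. This is a minimal illustration in plain C, not git's code:
`struct rev_sketch`, `make_header()` and `write_emails()` are hypothetical
names, and the header text is made up. The point is only that the struct
field stays `const` while a separate caller-side variable owns the
allocation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the relevant part of struct rev_info. */
struct rev_sketch {
	const char *per_email_headers; /* borrowed; never freed via this field */
};

/* Hypothetical helper: allocate a header string for email number nr. */
static char *make_header(int nr)
{
	char *buf = malloc(64);

	if (!buf)
		return NULL;
	snprintf(buf, 64, "X-Patch-Number: %d\n", nr);
	return buf;
}

/* Caller-side loop; returns the number of emails "written" to out. */
static int write_emails(struct rev_sketch *rev, int total, FILE *out)
{
	int written = 0;

	assert(rev && out);
	for (int nr = 1; nr <= total; nr++) {
		char *to_free = make_header(nr); /* the caller owns this */

		if (!to_free)
			break;
		rev->per_email_headers = to_free; /* struct only borrows it */
		fputs(rev->per_email_headers, out);
		rev->per_email_headers = NULL;
		free(to_free);
		written++;
	}
	return written;
}
```

With this shape, nothing that only sees the struct can free or modify the
string by accident, and the allocation and the free sit next to each other
in the caller.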

I do still think this could be split in a more obvious way, leaving the
pe_headers bits until they are actually needed. Let me see if I can
sketch it up.

-Peff
Kristoffer Haugsbakk March 19, 2024, 9:41 p.m. UTC | #2
On Tue, Mar 19, 2024, at 22:29, Jeff King wrote:
> On Tue, Mar 19, 2024 at 07:35:36PM +0100, Kristoffer Haugsbakk wrote:
>
>> Add `pe_headers` to `rev_info` to store per-email headers.
>
> It is only just now that I realized that "pe" stands for per-email
> (though to be fair I was not really focused on the intent of the series
> when reading v1). Can we just call it per_email_headers or something?

For sure.

>> The next commit will add an option to `format-patch` which will allow
>> the user to store headers per-email; a complement to options like
>> `--add-header`.
>>
>> To make this possible we need a new field to store these headers. We
>> also need to take ownership of `extra_headers_p` in
>> `log_write_email_headers`; facilitate this by removing constness from
>> the relevant pointers.
>
> There are three pointers at play here:
>
>   - ctx.after_subject has its const removed, since it will now always be
>     allocated by log_write_email_headers(), and then freed by the
>     caller. Makes sense. Though it looks like we only free in show_log(),
>     and the free in make_cover_letter() is not added until patch 2?
>
>   - rev_info.extra_headers has its const removed here, but I don't think
>     that is helping anything. We only use it to write into the "headers"
>     strbuf in log_write_email_headers(), which always returns
>     headers.buf (or NULL).
>
>   - rev.pe_headers is introduced as non-const because it is allocated
>     and freed for each email. That makes some sense, though if we
>     followed the pattern of rev.extra_headers, then the pointer is
>     conceptually "const" within the rev_info struct, and it is the
>     caller who keeps track of the allocation (using a to_free variable).
>     Possibly we should do the same here?
>
> I do still think this could be split in a more obvious way, leaving the
> pe_headers bits until they are actually needed. Let me see if I can
> sketch it up.

Nice :)
Jeff King March 20, 2024, 12:25 a.m. UTC | #3
On Tue, Mar 19, 2024 at 05:29:40PM -0400, Jeff King wrote:

> There are three pointers at play here:
> 
>   - ctx.after_subject has its const removed, since it will now always be
>     allocated by log_write_email_headers(), and then freed by the
>     caller. Makes sense. Though it looks like we only free in show_log(),
>     and the free in make_cover_letter() is not added until patch 2?
> 
>   - rev_info.extra_headers has its const removed here, but I don't think
>     that is helping anything. We only use it to write into the "headers"
>     strbuf in log_write_email_headers(), which always returns
>     headers.buf (or NULL).
> 
>   - rev.pe_headers is introduced as non-const because it is allocated
>     and freed for each email. That makes some sense, though if we
>     followed the pattern of rev.extra_headers, then the pointer is
>     conceptually "const" within the rev_info struct, and it is the
>     caller who keeps track of the allocation (using a to_free variable).
>     Possibly we should do the same here?
> 
> I do still think this could be split in a more obvious way, leaving the
> pe_headers bits until they are actually needed. Let me see if I can
> sketch it up.

OK, this rabbit hole went much deeper than I expected. ;)

I see why you wanted to drop the const from rev_info.extra_headers here.
We need the local extra_headers variable in show_log() to be non-const
(since it receives the output of log_write_email_headers). But we also
assign rev_info.extra_headers to that variable, and if it is const, the
compiler will complain.

But as it turns out, that assignment is not really necessary at all! It
is only used when you have extra headers along with a non-email format.
In most cases we simply ignore the headers for those formats, and in the
one case where we do respect them, I think it is doing the wrong thing.
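
The const conflict described above can be sketched in isolation. This is a
simplified illustration with hypothetical names (`struct opts_sketch`,
`build_headers()`, `show_sketch()`), not the actual show_log() code. It
assumes the local variable only ever needs to hold the allocated result, so
it can start out as NULL instead of being initialized from the const field.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for rev_info; the field keeps its const. */
struct opts_sketch {
	const char *extra_headers;
};

/* Stand-in for a function returning an allocated header string or NULL. */
static char *build_headers(const struct opts_sketch *opt)
{
	size_t len;
	char *out;

	if (!opt->extra_headers)
		return NULL;
	len = strlen(opt->extra_headers);
	out = malloc(len + 1);
	if (out)
		memcpy(out, opt->extra_headers, len + 1);
	return out;
}

/* Returns the header length seen on the email path, 0 otherwise. */
static size_t show_sketch(const struct opts_sketch *opt, int is_email)
{
	/*
	 * Writing "char *extra_headers = opt->extra_headers;" here would
	 * discard the const qualifier, which compilers diagnose. Starting
	 * from NULL sidesteps that without touching the struct.
	 */
	char *extra_headers = NULL;
	size_t len = 0;

	assert(opt);
	if (is_email)
		extra_headers = build_headers(opt);
	if (extra_headers)
		len = strlen(extra_headers);
	free(extra_headers); /* free(NULL) is a no-op */
	return len;
}
```

The non-email path never looks at the headers in this sketch, which mirrors
the observation that the initial assignment was not needed.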

So here are some patches which clean things up. They would make a
suitable base for your changes, I think, but IMHO they also stand on
their own as cleanups.

Having now stared at this code for a bit, I do think there's another,
much simpler option for your series: keep the same ugly static-strbuf
allocation pattern in log_write_email_headers(), but extend it further.
I'll show that in a moment, too.

  [1/6]: shortlog: stop setting pp.print_email_subject
  [2/6]: pretty: split oneline and email subject printing
  [3/6]: pretty: drop print_email_subject flag
  [4/6]: log: do not set up extra_headers for non-email formats
  [5/6]: format-patch: return an allocated string from log_write_email_headers()
  [6/6]: format-patch: simplify after-subject MIME header handling

 builtin/log.c      |  4 ++--
 builtin/rev-list.c |  1 +
 builtin/shortlog.c |  1 -
 log-tree.c         | 22 +++++++++-------------
 log-tree.h         |  2 +-
 pretty.c           | 43 ++++++++++++++++++++-----------------------
 pretty.h           | 11 +++++------
 7 files changed, 38 insertions(+), 46 deletions(-)

-Peff
Jeff King March 20, 2024, 12:43 a.m. UTC | #4
On Tue, Mar 19, 2024 at 08:25:55PM -0400, Jeff King wrote:

> Having now stared at this code for a bit, I do think there's another,
> much simpler option for your series: keep the same ugly static-strbuf
> allocation pattern in log_write_email_headers(), but extend it further.
> I'll show that in a moment, too.

So something like this:

diff --git a/log-tree.c b/log-tree.c
index e5438b029d..ae0f4fc502 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -474,12 +474,21 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart)
 {
-	const char *extra_headers = opt->extra_headers;
+	static struct strbuf headers = STRBUF_INIT;
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      null_oid() : &commit->object.oid);
 
 	*need_8bit_cte_p = 0; /* unknown */
 
+	strbuf_reset(&headers);
+	if (opt->extra_headers)
+		strbuf_addstr(&headers, opt->extra_headers);
+	/*
+	 * here's where you'd do your pe_headers; I wonder if you could even
+	 * just run the header command directly here and not need to shove the
+	 * string into rev_info?
+	 */
+
 	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);
 	graph_show_oneline(opt->graph);
 	if (opt->message_id) {
@@ -496,16 +505,13 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		graph_show_oneline(opt->graph);
 	}
 	if (opt->mime_boundary && maybe_multipart) {
-		static struct strbuf subject_buffer = STRBUF_INIT;
 		static struct strbuf buffer = STRBUF_INIT;
 		struct strbuf filename =  STRBUF_INIT;
 		*need_8bit_cte_p = -1; /* NEVER */
 
-		strbuf_reset(&subject_buffer);
 		strbuf_reset(&buffer);
 
-		strbuf_addf(&subject_buffer,
-			 "%s"
+		strbuf_addf(&headers,
 			 "MIME-Version: 1.0\n"
 			 "Content-Type: multipart/mixed;"
 			 " boundary=\"%s%s\"\n"
@@ -516,10 +522,8 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			 "Content-Type: text/plain; "
 			 "charset=UTF-8; format=fixed\n"
 			 "Content-Transfer-Encoding: 8bit\n\n",
-			 extra_headers ? extra_headers : "",
 			 mime_boundary_leader, opt->mime_boundary,
 			 mime_boundary_leader, opt->mime_boundary);
-		extra_headers = subject_buffer.buf;
 
 		if (opt->numbered_files)
 			strbuf_addf(&filename, "%d", opt->nr);
@@ -539,7 +543,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		opt->diffopt.stat_sep = buffer.buf;
 		strbuf_release(&filename);
 	}
-	*extra_headers_p = extra_headers;
+	*extra_headers_p = headers.len ? headers.buf : NULL;
 }
 
 static void show_sig_lines(struct rev_info *opt, int status, const char *bol)

And then the callers can continue not caring about how or when to free
the returned pointer. I think in the long run the cleanups I showed are
a nicer place to end up, but I'd just worry that your feature work will
be held hostage by my desire to clean. ;)

If you did it this way (probably as a separate preparatory patch minus
the pe_headers comment), then either I could do my cleanups on top, or
they could even graduate independently (though obviously there will be a
little bit of tricky merging at the end).
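
For readers unfamiliar with the static-buffer pattern the diff above relies
on, here is a rough stand-alone sketch. It uses a fixed-size static char
array where git uses a static strbuf, and `headers_sketch()` is a
hypothetical name, so treat it as an illustration of the ownership rule
only: the function keeps the storage, and the returned pointer is valid
until the next call.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns a pointer into static storage holding extra followed by mime,
 * or NULL when both are empty. The caller must not free the result, and
 * the next call overwrites it.
 */
static const char *headers_sketch(const char *extra, const char *mime)
{
	static char headers[1024]; /* stand-in for a static strbuf */
	int n;

	n = snprintf(headers, sizeof(headers), "%s%s",
		     extra ? extra : "", mime ? mime : "");
	assert(n >= 0 && (size_t)n < sizeof(headers));
	return n ? headers : NULL;
}
```

Callers stay simple because there is nothing to free, at the cost of the
function being non-reentrant and each result being invalidated by the next
call.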

-Peff
Kristoffer Haugsbakk March 22, 2024, 10:31 p.m. UTC | #5
On Wed, Mar 20, 2024, at 01:43, Jeff King wrote:
> On Tue, Mar 19, 2024 at 08:25:55PM -0400, Jeff King wrote:
>
>> Having now stared at this code for a bit, I do think there's another,
>> much simpler option for your series: keep the same ugly static-strbuf
>> allocation pattern in log_write_email_headers(), but extend it further.
>> I'll show that in a moment, too.
>
> So something like this:
>
> diff --git a/log-tree.c b/log-tree.c
> index e5438b029d..ae0f4fc502 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -474,12 +474,21 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
>  			     int *need_8bit_cte_p,
>  			     int maybe_multipart)
>  {
> -	const char *extra_headers = opt->extra_headers;
> +	static struct strbuf headers = STRBUF_INIT;
>  	const char *name = oid_to_hex(opt->zero_commit ?
>  				      null_oid() : &commit->object.oid);
>
>  	*need_8bit_cte_p = 0; /* unknown */
>
> +	strbuf_reset(&headers);
> +	if (opt->extra_headers)
> +		strbuf_addstr(&headers, opt->extra_headers);
> +	/*
> +	 * here's where you'd do your pe_headers; I wonder if you could even
> +	 * just run the header command directly here and not need to shove the
> +	 * string into rev_info?
> +	 */
> +

Hmm. I’ll look into that. This seems like a nicer place to do it
compared to `log.c`.

>  	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);
>  	graph_show_oneline(opt->graph);
>  	if (opt->message_id) {
> @@ -496,16 +505,13 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
>  		graph_show_oneline(opt->graph);
>  	}
>  	if (opt->mime_boundary && maybe_multipart) {
> -		static struct strbuf subject_buffer = STRBUF_INIT;
>  		static struct strbuf buffer = STRBUF_INIT;
>  		struct strbuf filename =  STRBUF_INIT;
>  		*need_8bit_cte_p = -1; /* NEVER */
>
> -		strbuf_reset(&subject_buffer);
>  		strbuf_reset(&buffer);
>
> -		strbuf_addf(&subject_buffer,
> -			 "%s"
> +		strbuf_addf(&headers,
>  			 "MIME-Version: 1.0\n"
>  			 "Content-Type: multipart/mixed;"
>  			 " boundary=\"%s%s\"\n"
> @@ -516,10 +522,8 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
>  			 "Content-Type: text/plain; "
>  			 "charset=UTF-8; format=fixed\n"
>  			 "Content-Transfer-Encoding: 8bit\n\n",
> -			 extra_headers ? extra_headers : "",
>  			 mime_boundary_leader, opt->mime_boundary,
>  			 mime_boundary_leader, opt->mime_boundary);
> -		extra_headers = subject_buffer.buf;
>
>  		if (opt->numbered_files)
>  			strbuf_addf(&filename, "%d", opt->nr);
> @@ -539,7 +543,7 @@ void log_write_email_headers(struct rev_info *opt, struct commit *commit,
>  		opt->diffopt.stat_sep = buffer.buf;
>  		strbuf_release(&filename);
>  	}
> -	*extra_headers_p = extra_headers;
> +	*extra_headers_p = headers.len ? headers.buf : NULL;
>  }
>
>  static void show_sig_lines(struct rev_info *opt, int status, const char *bol)
>
> And then the callers can continue not caring about how or when to free
> the returned pointer. I think in the long run the cleanups I showed are
> a nicer place to end up, but I'd just worry that your feature work will
> be held hostage by my desire to clean. ;)

Hah! Definitely don’t worry about that, this has been very helpful.

> If you did it this way (probably as a separate preparatory patch minus
> the pe_headers comment), then either I could do my cleanups on top, or
> they could even graduate independently (though obviously there will be a
> little bit of tricky merging at the end).
>
> -Peff

I think your series should take precedence. I’ll put my series on the
backburner for a while. There’s no rush with that one. These changes of
yours will make extending the header logic easier overall.

Then when yours is merged I’ll have an even easier time.

Thanks again

Kristoffer

Patch

diff --git a/log-tree.c b/log-tree.c
index e5438b029d9..f6cdde6e8f3 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -470,16 +470,21 @@  void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt)
 }
 
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
-			     const char **extra_headers_p,
+			     char **extra_headers_p,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart)
 {
-	const char *extra_headers = opt->extra_headers;
+	struct strbuf headers = STRBUF_INIT;
 	const char *name = oid_to_hex(opt->zero_commit ?
 				      null_oid() : &commit->object.oid);
 
 	*need_8bit_cte_p = 0; /* unknown */
 
+	if (opt->extra_headers)
+		strbuf_addstr(&headers, opt->extra_headers);
+	if (opt->pe_headers)
+		strbuf_addstr(&headers, opt->pe_headers);
+
 	fprintf(opt->diffopt.file, "From %s Mon Sep 17 00:00:00 2001\n", name);
 	graph_show_oneline(opt->graph);
 	if (opt->message_id) {
@@ -496,16 +501,13 @@  void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		graph_show_oneline(opt->graph);
 	}
 	if (opt->mime_boundary && maybe_multipart) {
-		static struct strbuf subject_buffer = STRBUF_INIT;
 		static struct strbuf buffer = STRBUF_INIT;
 		struct strbuf filename =  STRBUF_INIT;
 		*need_8bit_cte_p = -1; /* NEVER */
 
-		strbuf_reset(&subject_buffer);
 		strbuf_reset(&buffer);
 
-		strbuf_addf(&subject_buffer,
-			 "%s"
+		strbuf_addf(&headers,
 			 "MIME-Version: 1.0\n"
 			 "Content-Type: multipart/mixed;"
 			 " boundary=\"%s%s\"\n"
@@ -516,10 +518,8 @@  void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 			 "Content-Type: text/plain; "
 			 "charset=UTF-8; format=fixed\n"
 			 "Content-Transfer-Encoding: 8bit\n\n",
-			 extra_headers ? extra_headers : "",
 			 mime_boundary_leader, opt->mime_boundary,
 			 mime_boundary_leader, opt->mime_boundary);
-		extra_headers = subject_buffer.buf;
 
 		if (opt->numbered_files)
 			strbuf_addf(&filename, "%d", opt->nr);
@@ -539,7 +539,7 @@  void log_write_email_headers(struct rev_info *opt, struct commit *commit,
 		opt->diffopt.stat_sep = buffer.buf;
 		strbuf_release(&filename);
 	}
-	*extra_headers_p = extra_headers;
+	*extra_headers_p = headers.len ? strbuf_detach(&headers, NULL) : NULL;
 }
 
 static void show_sig_lines(struct rev_info *opt, int status, const char *bol)
@@ -678,7 +678,7 @@  void show_log(struct rev_info *opt)
 	struct log_info *log = opt->loginfo;
 	struct commit *commit = log->commit, *parent = log->parent;
 	int abbrev_commit = opt->abbrev_commit ? opt->abbrev : the_hash_algo->hexsz;
-	const char *extra_headers = opt->extra_headers;
+	char *extra_headers = opt->extra_headers;
 	struct pretty_print_context ctx = {0};
 
 	opt->loginfo = NULL;
@@ -857,6 +857,7 @@  void show_log(struct rev_info *opt)
 
 	strbuf_release(&msgbuf);
 	free(ctx.notes_message);
+	free(ctx.after_subject);
 
 	if (cmit_fmt_is_mail(ctx.fmt) && opt->idiff_oid1) {
 		struct diff_queue_struct dq;
diff --git a/log-tree.h b/log-tree.h
index 41c776fea52..94978e2c838 100644
--- a/log-tree.h
+++ b/log-tree.h
@@ -29,7 +29,7 @@  void format_decorations(struct strbuf *sb, const struct commit *commit,
 			int use_color, const struct decoration_options *opts);
 void show_decorations(struct rev_info *opt, struct commit *commit);
 void log_write_email_headers(struct rev_info *opt, struct commit *commit,
-			     const char **extra_headers_p,
+			     char **extra_headers_p,
 			     int *need_8bit_cte_p,
 			     int maybe_multipart);
 void load_ref_decorations(struct decoration_filter *filter, int flags);
diff --git a/pretty.h b/pretty.h
index 421209e9ec2..bdce3191875 100644
--- a/pretty.h
+++ b/pretty.h
@@ -35,7 +35,7 @@  struct pretty_print_context {
 	 */
 	enum cmit_fmt fmt;
 	int abbrev;
-	const char *after_subject;
+	char *after_subject;
 	int preserve_subject;
 	struct date_mode date_mode;
 	unsigned date_mode_explicit:1;
diff --git a/revision.h b/revision.h
index 94c43138bc3..95e92397a7a 100644
--- a/revision.h
+++ b/revision.h
@@ -290,7 +290,9 @@  struct rev_info {
 	struct ident_split from_ident;
 	struct string_list *ref_message_ids;
 	int		add_signoff;
-	const char	*extra_headers;
+	char		*extra_headers;
+	/* per-email headers */
+	char		*pe_headers;
 	const char	*log_reencode;
 	const char	*subject_prefix;
 	int		patch_name_max;