diff mbox series

send-email: export patch counters in validate environment

Message ID 20230411114723.89029-1-robin@jarry.cc (mailing list archive)
State Superseded
Headers show
Series send-email: export patch counters in validate environment | expand

Commit Message

Robin Jarry April 11, 2023, 11:47 a.m. UTC
When sending patch series (with a cover-letter or not)
sendemail-validate is called with every email/patch file independently
from the others. When one of the patches depends on a previous one, it
may not be possible to use this hook in a meaningful way. A hook that
wants to check some property of the whole series needs to know which
patch is the final one.

Expose the current and total number of patches to the hook via the
GIT_SENDEMAIL_PATCH_COUNTER and GIT_SENDEMAIL_PATCH_TOTAL environment
variables so that both incremental and global validation is possible.

Sharing any other state between successive invocations of the validate
hook must be done via external means. For example, by storing it in
a GIT_DIR/SENDEMAIL_VALIDATE file.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Robin Jarry <robin@jarry.cc>
---

Notes:
    Follow up on:
    https://lore.kernel.org/git/9b8d6cc4-741a-5081-d5de-df0972efec37@gmail.com/
    
    As suggested by Phillip, this is a less intrusive change which allows
    validating whole series. Let me know what you think.

 Documentation/githooks.txt | 10 ++++++++++
 git-send-email.perl        |  7 +++++++
 2 files changed, 17 insertions(+)

Comments

Phillip Wood April 11, 2023, 1:23 p.m. UTC | #1
Hi Robin

On 11/04/2023 12:47, Robin Jarry wrote:
> When sending patch series (with a cover-letter or not)
> sendemail-validate is called with every email/patch file independently
> from the others. When one of the patches depends on a previous one, it
> may not be possible to use this hook in a meaningful way. A hook that
> wants to check some property of the whole series needs to know which
> patch is the final one.
> 
> Expose the current and total number of patches to the hook via the
> GIT_SENDEMAIL_PATCH_COUNTER and GIT_SENDEMAIL_PATCH_TOTAL environment
> variables so that both incremental and global validation is possible.
> 
> Sharing any other state between successive invocations of the validate
> hook must be done via external means. For example, by storing it in
> a GIT_DIR/SENDEMAIL_VALIDATE file.
> 
> Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
> Signed-off-by: Robin Jarry <robin@jarry.cc>
> ---
> 
> Notes:
>      Follow up on:
>      https://lore.kernel.org/git/9b8d6cc4-741a-5081-d5de-df0972efec37@gmail.com/
>      
>      As suggested by Phillip, this is a less intrusive change which allows
>      validating whole series. Let me know what you think.

This is certainly less intrusive, if it does what you need and is 
efficient enough for your needs then I'd be inclined to go with this 
approach.

>   Documentation/githooks.txt | 10 ++++++++++
>   git-send-email.perl        |  7 +++++++
>   2 files changed, 17 insertions(+)
> 
> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
> index 62908602e7be..55f00e0f6f8c 100644
> --- a/Documentation/githooks.txt
> +++ b/Documentation/githooks.txt
> @@ -600,6 +600,16 @@ the name of the file that holds the e-mail to be sent.  Exiting with a
>   non-zero status causes `git send-email` to abort before sending any
>   e-mails.
>   
> +The following environment variables are set when executing the hook.
> +
> +`GIT_SENDEMAIL_PATCH_COUNTER`::
> +	A 1-based counter incremented by one for every file.
> +
> +`GIT_SENDEMAIL_PATCH_TOTAL`::
> +	The total number of files.
> +
> +These variables can be used to validate patch series.
> +
>   fsmonitor-watchman
>   ~~~~~~~~~~~~~~~~~~
>   
> diff --git a/git-send-email.perl b/git-send-email.perl
> index 07f2a0cbeaad..e962d5e15983 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -795,10 +795,17 @@ sub is_format_patch_arg {
>   @files = handle_backup_files(@files);
>   
>   if ($validate) {
> +	my $num = 1;
> +	my $num_patches = @files;
>   	foreach my $f (@files) {
>   		unless (-p $f) {
> +			$ENV{GIT_SENDEMAIL_PATCH_COUNTER} = "$num";
> +			$ENV{GIT_SENDEMAIL_PATCH_TOTAL} = "$num_patches";

We only need to set this once outside the loop

>   			validate_patch($f, $target_xfer_encoding);
> +			delete $ENV{GIT_SENDEMAIL_PATCH_COUNTER};
> +			delete $ENV{GIT_SENDEMAIL_PATCH_TOTAL};

Do we really need to clear these? Certainly not in each iteration of the 
loop I would think.

Best Wishes

Phillip
>   		}
> +		$num += 1;
>   	}
>   }
>
Junio C Hamano April 11, 2023, 4:28 p.m. UTC | #2
Phillip Wood <phillip.wood123@gmail.com> writes:

> Hi Robin
>
> On 11/04/2023 12:47, Robin Jarry wrote:
>> When sending patch series (with a cover-letter or not)
>> sendemail-validate is called with every email/patch file independently
>> from the others. When one of the patches depends on a previous one, it
>> may not be possible to use this hook in a meaningful way. A hook that
>> wants to check some property of the whole series needs to know which
>> patch is the final one.
>> Expose the current and total number of patches to the hook via the
>> GIT_SENDEMAIL_PATCH_COUNTER and GIT_SENDEMAIL_PATCH_TOTAL environment
>> variables so that both incremental and global validation is possible.

The above mentions "cover letter" and naturally the readers would
wonder how it is treated.  When we have 5-patch series with a
separate cover letter, do we get TOTAL=6, COUNTER=1 for the cover,
COUNTER=2 for [PATCH 1/5], and so on, or do we see TOTAL=5,
COUNTER=0 for the cover, counter=1 for [PATCH 1/5], and so on?

The latter is certainly richer (with the former, the validator that
wants to act differently on the cover has to somehow figure out if
the invocation with COUNTER=1 is seeing the cover or the first
patch).  The usual and recommended workflow being "git format-patch
-o outdir --cover-letter <range>" followed by "edit outdir/*" to
proofread and edit the cover and the patches, followed by "git
send-email outdir/*.patch", git-send-email has to guess before
invoking the hook.

But it may be better than forcing the hook to guess, I dunno?

Whichever way we choose, we should

 * explain the choice in the proposed log message.  If we choose the
   "TOTAL is the number of patches and COUNTER=0 is used for the
   optional cover letter" interpretation, we should also explain
   that we cannot reliably do so and sometimes can guess wrong.  If
   we choose the "TOTAL is the number of input files and COUNTER
   just counts, regardless of the payload" interpretation, we should
   also explain that even though we hinted that a series with cover
   letter can be validated, it is a slight lie, because the hook has
   to guess if the series has cover and it can guess wrong.

 * document what TOTAL and COUNTER means.

>> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
>> index 62908602e7be..55f00e0f6f8c 100644
>> --- a/Documentation/githooks.txt
>> +++ b/Documentation/githooks.txt
>> @@ -600,6 +600,16 @@ the name of the file that holds the e-mail to be sent.  Exiting with a
>>   non-zero status causes `git send-email` to abort before sending any
>>   e-mails.
>>   +The following environment variables are set when executing the
>> hook.
>> +
>> +`GIT_SENDEMAIL_PATCH_COUNTER`::
>> +	A 1-based counter incremented by one for every file.
>> +
>> +`GIT_SENDEMAIL_PATCH_TOTAL`::
>> +	The total number of files.
>> +
>> +These variables can be used to validate patch series.

This may be sufficient documentation to imply we are not treating
cover letter any differently, by not saying "patch" or "cover
letter" but just saying "file".  It may be more helpful to be a bit
more explicit, though (e.g. "files" -> "input files", perhaps).

>>   diff --git a/git-send-email.perl b/git-send-email.perl
>> index 07f2a0cbeaad..e962d5e15983 100755
>> --- a/git-send-email.perl
>> +++ b/git-send-email.perl
>> @@ -795,10 +795,17 @@ sub is_format_patch_arg {
>>   @files = handle_backup_files(@files);
>>     if ($validate) {
>> +	my $num = 1;
>> +	my $num_patches = @files;
>>   	foreach my $f (@files) {
>>   		unless (-p $f) {
>> +			$ENV{GIT_SENDEMAIL_PATCH_COUNTER} = "$num";
>> +			$ENV{GIT_SENDEMAIL_PATCH_TOTAL} = "$num_patches";
>
> We only need to set this once outside the loop

Indeed.

>>   			validate_patch($f, $target_xfer_encoding);
>> +			delete $ENV{GIT_SENDEMAIL_PATCH_COUNTER};
>> +			delete $ENV{GIT_SENDEMAIL_PATCH_TOTAL};
>
> Do we really need to clear these? Certainly not in each iteration of
> the loop I would think.

If we set TOTAL outside, we should clear it outside.  We have to set
COUNTER inside, and we could clear it outside, but it probably is
easier to see the correspondence of set/clear if it is done inside.

>>   		}
>> +		$num += 1;

When you have 3 files to send, and if the last one satisfies "-p",
the hook will be told "You are called for 1/3" and then "2/3", and
will never hear about "3/3", so in practice it will spool the first
two and finish without getting a chance to flush what has been
spooled.  When you have 3 files to send, and if the first one
satisfies "-p', the hook will be told "You are called for 2/3", but
it is understandable if anybody is tempted to write a hook this way:

	if COUNTER==1:
		initialize the spool area
		record TOTAL there
	else:
		read TOTAL recorded in the spool area
		make sure TOTAL matches

	process [PATCH COUNTER/TOTAL] individually
	if COUNTER==TOTAL:
		process the series as a whole

and for such an invocation of "git send-email", the hook will try to
process the second file without having its state fully initialzied
because it never saw the first.

Would these be problems?  I dunno.

Thanks for working on the patch, and thanks for a careful review.
Robin Jarry April 11, 2023, 4:47 p.m. UTC | #3
Phillip Wood, Apr 11, 2023 at 15:23:
> This is certainly less intrusive, if it does what you need and is 
> efficient enough for your needs then I'd be inclined to go with this 
> approach.

Yes, that is perfectly suitable to validate series. The missing pieces
of information (e.g. the place where all patches are spooled) can be
either hard coded or stored in git config entries.

> >   	foreach my $f (@files) {
> >   		unless (-p $f) {
> > +			$ENV{GIT_SENDEMAIL_PATCH_COUNTER} = "$num";
> > +			$ENV{GIT_SENDEMAIL_PATCH_TOTAL} = "$num_patches";
>
> We only need to set this once outside the loop
>
> >   			validate_patch($f, $target_xfer_encoding);
> > +			delete $ENV{GIT_SENDEMAIL_PATCH_COUNTER};
> > +			delete $ENV{GIT_SENDEMAIL_PATCH_TOTAL};
>
> Do we really need to clear these? Certainly not in each iteration of the 
> loop I would think.

I wanted to keep everything collocated. The time spent setting/unsetting
these variables is completely negligible. I don't mind making this more
streamlined in a v2.

Thanks for the review :)
Robin Jarry April 11, 2023, 5:13 p.m. UTC | #4
Hi Junio,

Junio C Hamano, Apr 11, 2023 at 18:28:
> The above mentions "cover letter" and naturally the readers would
> wonder how it is treated.  When we have 5-patch series with a
> separate cover letter, do we get TOTAL=6, COUNTER=1 for the cover,
> COUNTER=2 for [PATCH 1/5], and so on, or do we see TOTAL=5,
> COUNTER=0 for the cover, counter=1 for [PATCH 1/5], and so on?
>
> The latter is certainly richer (with the former, the validator that
> wants to act differently on the cover has to somehow figure out if
> the invocation with COUNTER=1 is seeing the cover or the first
> patch).  The usual and recommended workflow being "git format-patch
> -o outdir --cover-letter <range>" followed by "edit outdir/*" to
> proofread and edit the cover and the patches, followed by "git
> send-email outdir/*.patch", git-send-email has to guess before
> invoking the hook.
>
> But it may be better than forcing the hook to guess, I dunno?
>
> Whichever way we choose, we should
>
>  * explain the choice in the proposed log message.  If we choose the
>    "TOTAL is the number of patches and COUNTER=0 is used for the
>    optional cover letter" interpretation, we should also explain
>    that we cannot reliably do so and sometimes can guess wrong.  If
>    we choose the "TOTAL is the number of input files and COUNTER
>    just counts, regardless of the payload" interpretation, we should
>    also explain that even though we hinted that a series with cover
>    letter can be validated, it is a slight lie, because the hook has
>    to guess if the series has cover and it can guess wrong.
>
>  * document what TOTAL and COUNTER means.

It is easy enough to differentiate a cover letter from an actual patch
with a simple shell test:

    if grep -q "^diff --git " "$1"; then
        # patch file
    else
        # cover letter
    fi

It is probably best to let git-send-email out of the picture. Since
nothing prevents from sending multiple patch series at once, it may not
be possible to determine the proper ordering of all these files. A dumb
1-based counter will be perfectly suitable.

I will add more details about these two variables, what they mean and
how they should be used.

> This may be sufficient documentation to imply we are not treating
> cover letter any differently, by not saying "patch" or "cover
> letter" but just saying "file".  It may be more helpful to be a bit
> more explicit, though (e.g. "files" -> "input files", perhaps).

It makes sense to use the "files" terminology instead of "patches".
I will update for v2.

> > Do we really need to clear these? Certainly not in each iteration of
> > the loop I would think.
>
> If we set TOTAL outside, we should clear it outside.  We have to set
> COUNTER inside, and we could clear it outside, but it probably is
> easier to see the correspondence of set/clear if it is done inside.

Given the small cost of setting these variables in a perl script, it was
my intention to have a clear correspondence between the set/clear
operations.

> When you have 3 files to send, and if the last one satisfies "-p",
> the hook will be told "You are called for 1/3" and then "2/3", and
> will never hear about "3/3", so in practice it will spool the first
> two and finish without getting a chance to flush what has been
> spooled.  When you have 3 files to send, and if the first one
> satisfies "-p', the hook will be told "You are called for 2/3", but
> it is understandable if anybody is tempted to write a hook this way:
>
> 	if COUNTER==1:
> 		initialize the spool area
> 		record TOTAL there
> 	else:
> 		read TOTAL recorded in the spool area
> 		make sure TOTAL matches
>
> 	process [PATCH COUNTER/TOTAL] individually
> 	if COUNTER==TOTAL:
> 		process the series as a whole
>
> and for such an invocation of "git send-email", the hook will try to
> process the second file without having its state fully initialzied
> because it never saw the first.
>
> Would these be problems?  I dunno.

I had thought of this. From perl docs:

    -p  File is a named pipe (FIFO), or Filehandle is a pipe.
    https://perldoc.perl.org/functions/-p

While there is very little chance that users will run git send-email on
FIFOs, it is a possibility. Reference commit is:

    300913bd448de ("git-send-email: Accept fifos as well as files")
    https://github.com/git/git/commit/300913bd448de

I can run the loop twice to determine the count of non-FIFOs and adjust
GIT_SENDEMAIL_FILE_TOTAL accordingly.

Thanks for the review.

PS: What would you think if I also added a sendemail-validate.sample
script in the templates folder? Should I add it in the same commit?
Junio C Hamano April 11, 2023, 7:14 p.m. UTC | #5
"Robin Jarry" <robin@jarry.cc> writes:

> It is probably best to let git-send-email out of the picture. Since
> nothing prevents from sending multiple patch series at once, it may not
> be possible to determine the proper ordering of all these files. A dumb
> 1-based counter will be perfectly suitable.

As long as the design decision is clearly documented, I am more than
fine to make it the user's problem ;-) It is a better design between
the two, as the user knows better what their payload is.

> I can run the loop twice to determine the count of non-FIFOs and adjust
> GIT_SENDEMAIL_FILE_TOTAL accordingly.

It sounds like a reasonable way out.

Thanks.
diff mbox series

Patch

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 62908602e7be..55f00e0f6f8c 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -600,6 +600,16 @@  the name of the file that holds the e-mail to be sent.  Exiting with a
 non-zero status causes `git send-email` to abort before sending any
 e-mails.
 
+The following environment variables are set when executing the hook.
+
+`GIT_SENDEMAIL_PATCH_COUNTER`::
+	A 1-based counter incremented by one for every file.
+
+`GIT_SENDEMAIL_PATCH_TOTAL`::
+	The total number of files.
+
+These variables can be used to validate patch series.
+
 fsmonitor-watchman
 ~~~~~~~~~~~~~~~~~~
 
diff --git a/git-send-email.perl b/git-send-email.perl
index 07f2a0cbeaad..e962d5e15983 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -795,10 +795,17 @@  sub is_format_patch_arg {
 @files = handle_backup_files(@files);
 
 if ($validate) {
+	my $num = 1;
+	my $num_patches = @files;
 	foreach my $f (@files) {
 		unless (-p $f) {
+			$ENV{GIT_SENDEMAIL_PATCH_COUNTER} = "$num";
+			$ENV{GIT_SENDEMAIL_PATCH_TOTAL} = "$num_patches";
 			validate_patch($f, $target_xfer_encoding);
+			delete $ENV{GIT_SENDEMAIL_PATCH_COUNTER};
+			delete $ENV{GIT_SENDEMAIL_PATCH_TOTAL};
 		}
+		$num += 1;
 	}
 }