diff mbox series

send-email: move validation code below process_address_list

Message ID ee56c4df-e030-45f9-86a9-94fb3540db60@amd.com (mailing list archive)
State Superseded
Headers show
Series send-email: move validation code below process_address_list | expand

Commit Message

Michael Strawbridge Oct. 24, 2023, 8:19 p.m. UTC
From 09ea51d63cebdf9ff0c073ef86e21b4b09c268e5 Mon Sep 17 00:00:00 2001
From: Michael Strawbridge <michael.strawbridge@amd.com>
Date: Wed, 11 Oct 2023 16:13:13 -0400
Subject: [PATCH] send-email: move validation code below process_address_list

Move validation logic below processing of email address lists so that
email validation gets the proper email addresses.

This fixes email address validation errors when the optional
perl module Email::Valid is installed and multiple addresses are passed
in on a single to/cc argument like --to=foo@example.com,bar@example.com.

Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com>
---
 git-send-email.perl | 48 ++++++++++++++++++++++-----------------------
 1 file changed, 24 insertions(+), 24 deletions(-)

Comments

Junio C Hamano Oct. 24, 2023, 9:55 p.m. UTC | #1
Michael Strawbridge <michael.strawbridge@amd.com> writes:

> Subject: [PATCH] send-email: move validation code below process_address_list
>
> Move validation logic below processing of email address lists so that
> email validation gets the proper email addresses.

Hmph, without this patch, the tip of 'seen' passes t9001 on my box,
but with it, it claims that it failed 58, 87, and 89.

Here is how #58 fails (the last part of "cd t && sh t9001-*.sh -i -v").

expecting success of 9001.58 'In-Reply-To without --chain-reply-to':
        clean_fake_sendmail &&
        echo "<unique-message-id@example.com>" >expect &&
        git send-email \
                --from="Example <nobody@example.com>" \
                --to=nobody@example.com \
                --no-chain-reply-to \
                --in-reply-to="$(cat expect)" \
                --smtp-server="$(pwd)/fake.sendmail" \
                $patches $patches $patches \
                2>errors &&
        # The first message is a reply to --in-reply-to
        sed -n -e "s/^In-Reply-To: *\(.*\)/\1/p" msgtxt1 >actual &&
        test_cmp expect actual &&
        # Second and subsequent messages are replies to the first one
        sed -n -e "s/^Message-ID: *\(.*\)/\1/p" msgtxt1 >expect &&
        sed -n -e "s/^In-Reply-To: *\(.*\)/\1/p" msgtxt2 >actual &&
        test_cmp expect actual &&
        sed -n -e "s/^In-Reply-To: *\(.*\)/\1/p" msgtxt3 >actual &&
        test_cmp expect actual

0001-Second.patch
0001-Second.patch
0001-Second.patch
(mbox) Adding cc: A <author@example.com> from line 'From: A <author@example.com>'
(mbox) Adding cc: One <one@example.com> from line 'Cc: One <one@example.com>, two@example.com'
(mbox) Adding cc: two@example.com from line 'Cc: One <one@example.com>, two@example.com'
(body) Adding cc: C O Mitter <committer@example.com> from line 'Signed-off-by: C O Mitter <committer@example.com>'
OK. Log says:
Sendmail: /usr/local/google/home/jch/w/git.git/t/trash directory.t9001-send-email/fake.sendmail -i nobody@example.com author@example.com one@example.com two@example.com committer@example.com
From: Example <nobody@example.com>
To: nobody@example.com
Cc: A <author@example.com>,
        One <one@example.com>,
        two@example.com,
        C O Mitter <committer@example.com>
Subject: [PATCH 1/1] Second.
Date: Tue, 24 Oct 2023 21:52:27 +0000
Message-ID: <20231024215229.1787922-1-nobody@example.com>
X-Mailer: git-send-email 2.42.0-705-g1a1f985ecc
In-Reply-To: <unique-message-id@example.com>
References: <unique-message-id@example.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Result: OK
(mbox) Adding cc: A <author@example.com> from line 'From: A <author@example.com>'
(mbox) Adding cc: One <one@example.com> from line 'Cc: One <one@example.com>, two@example.com'
(mbox) Adding cc: two@example.com from line 'Cc: One <one@example.com>, two@example.com'
(body) Adding cc: C O Mitter <committer@example.com> from line 'Signed-off-by: C O Mitter <committer@example.com>'
OK. Log says:
Sendmail: /usr/local/google/home/jch/w/git.git/t/trash directory.t9001-send-email/fake.sendmail -i nobody@example.com author@example.com one@example.com two@example.com committer@example.com
From: Example <nobody@example.com>
To: nobody@example.com
Cc: A <author@example.com>,
        One <one@example.com>,
        two@example.com,
        C O Mitter <committer@example.com>
Subject: [PATCH 1/1] Second.
Date: Tue, 24 Oct 2023 21:52:28 +0000
Message-ID: <20231024215229.1787922-2-nobody@example.com>
X-Mailer: git-send-email 2.42.0-705-g1a1f985ecc
In-Reply-To: <unique-message-id@example.com>
References: <unique-message-id@example.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Result: OK
(mbox) Adding cc: A <author@example.com> from line 'From: A <author@example.com>'
(mbox) Adding cc: One <one@example.com> from line 'Cc: One <one@example.com>, two@example.com'
(mbox) Adding cc: two@example.com from line 'Cc: One <one@example.com>, two@example.com'
(body) Adding cc: C O Mitter <committer@example.com> from line 'Signed-off-by: C O Mitter <committer@example.com>'
OK. Log says:
Sendmail: /usr/local/google/home/jch/w/git.git/t/trash directory.t9001-send-email/fake.sendmail -i nobody@example.com author@example.com one@example.com two@example.com committer@example.com
From: Example <nobody@example.com>
To: nobody@example.com
Cc: A <author@example.com>,
        One <one@example.com>,
        two@example.com,
        C O Mitter <committer@example.com>
Subject: [PATCH 1/1] Second.
Date: Tue, 24 Oct 2023 21:52:29 +0000
Message-ID: <20231024215229.1787922-3-nobody@example.com>
X-Mailer: git-send-email 2.42.0-705-g1a1f985ecc
In-Reply-To: <unique-message-id@example.com>
References: <unique-message-id@example.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Result: OK
--- expect      2023-10-24 21:52:29.115044899 +0000
+++ actual      2023-10-24 21:52:29.119045306 +0000
@@ -1 +1 @@
-<20231024215229.1787922-1-nobody@example.com>
+<unique-message-id@example.com>
not ok 58 - In-Reply-To without --chain-reply-to
#
#               clean_fake_sendmail &&
#               echo "<unique-message-id@example.com>" >expect &&
#               git send-email \
#                       --from="Example <nobody@example.com>" \
#                       --to=nobody@example.com \
#                       --no-chain-reply-to \
#                       --in-reply-to="$(cat expect)" \
#                       --smtp-server="$(pwd)/fake.sendmail" \
#                       $patches $patches $patches \
#                       2>errors &&
#               # The first message is a reply to --in-reply-to
#               sed -n -e "s/^In-Reply-To: *\(.*\)/\1/p" msgtxt1 >actual &&
#               test_cmp expect actual &&
#               # Second and subsequent messages are replies to the first one
#               sed -n -e "s/^Message-ID: *\(.*\)/\1/p" msgtxt1 >expect &&
#               sed -n -e "s/^In-Reply-To: *\(.*\)/\1/p" msgtxt2 >actual &&
#               test_cmp expect actual &&
#               sed -n -e "s/^In-Reply-To: *\(.*\)/\1/p" msgtxt3 >actual &&
#               test_cmp expect actual
#
1..58
Junio C Hamano Oct. 24, 2023, 10:03 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> Michael Strawbridge <michael.strawbridge@amd.com> writes:
>
>> Subject: [PATCH] send-email: move validation code below process_address_list
>>
>> Move validation logic below processing of email address lists so that
>> email validation gets the proper email addresses.
>
> Hmph, without this patch, the tip of 'seen' passes t9001 on my box,
> but with it, it claims that it failed 58, 87, and 89.

FWIW, when this patch is used with 'master' (not 'seen'), t9001
claims the same three tests failed.  The way #58 fails seems to be
identical to the way 'seen' with this patch failed, shown in the
message I am responding to.
Jeff King Oct. 25, 2023, 6:50 a.m. UTC | #3
On Tue, Oct 24, 2023 at 04:19:43PM -0400, Michael Strawbridge wrote:

> Move validation logic below processing of email address lists so that
> email validation gets the proper email addresses.
> 
> This fixes email address validation errors when the optional
> perl module Email::Valid is installed and multiple addresses are passed
> in on a single to/cc argument like --to=foo@example.com,bar@example.com.

Is there a test we can include here?

> @@ -2023,6 +1999,30 @@ sub process_file {
>  	return 1;
>  }
>  
> +if ($validate) {

So the new spot is right before we call process_file() on each of the
input files. It is a little hard to follow because of the number of
functions defined inline in the middle of the script, but I think that
is a reasonable spot. It is after we have called process_address_list()
on to/cc/bcc, which I think fixes the regression. But it is also after
we sanitize $reply_to, etc, which seems like a good idea.

But I think putting it down that far is the source of the test failures.

The culprit seems not to be the call to validate_patch() in the loop you
moved, but rather pre_process_file(), which was added in your earlier
a8022c5f7b (send-email: expose header information to git-send-email's
sendemail-validate hook, 2023-04-19).

It looks like the issue is the global $message_num variable which is
incremented by pre_process_file(). On line 1755 (on the current tip of
master), we set it to 0. And your patch moves the validation across
there (from line ~799 to ~2023).

And that's why the extra increments didn't matter when you added the
calls to pre_process_file() in your earlier patch; they all happened
before we reset $message_num to 0. But now they happen after.

To be fair, this is absolutely horrific perl code. There's over a
thousand lines of function definitions, and then hidden in the middle
are some global variable assignments!

So we have a few options, I think:

  1. Reset $message_num to 0 after validating (maybe we also need
     to reset $in_reply_to, etc, set at the same time? I didn't check).
     This feels like a hack.

  2. Move the validation down, but not so far down. Like maybe right
     after we set up the @files list with the $compose.final name. This
     is the smallest diff, but feels like we haven't really made the
     world a better place.

  3. Move the $message_num, etc, initialization to right before we call
     the process_file() loop, which is what expects to use them. Like
     this (squashed into your patch):

diff --git a/git-send-email.perl b/git-send-email.perl
index a898dbc76e..d44db14223 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1730,10 +1730,6 @@ sub send_message {
 	return 1;
 }
 
-$in_reply_to = $initial_in_reply_to;
-$references = $initial_in_reply_to || '';
-$message_num = 0;
-
 sub pre_process_file {
 	my ($t, $quiet) = @_;
 
@@ -2023,6 +2019,10 @@ sub process_file {
 	delete $ENV{GIT_SENDEMAIL_FILE_TOTAL};
 }
 
+$in_reply_to = $initial_in_reply_to;
+$references = $initial_in_reply_to || '';
+$message_num = 0;
+
 foreach my $t (@files) {
 	while (!process_file($t)) {
 		# user edited the file

That seems to make the test failures go away. It is still weird that the
validation code is calling pre_process_file(), which increments
$message_num, without us having set it up in any meaningful way. I'm not
sure if there are bugs lurking there or not. I'm not impressed by the
general quality of this code, and I'm kind of afraid to keep looking
deeper.

-Peff
Uwe Kleine-König Oct. 25, 2023, 7:43 a.m. UTC | #4
Hello,

On Tue, Oct 24, 2023 at 04:19:43PM -0400, Michael Strawbridge wrote:
> >From 09ea51d63cebdf9ff0c073ef86e21b4b09c268e5 Mon Sep 17 00:00:00 2001
> From: Michael Strawbridge <michael.strawbridge@amd.com>
> Date: Wed, 11 Oct 2023 16:13:13 -0400
> Subject: [PATCH] send-email: move validation code below process_address_list
> 
> Move validation logic below processing of email address lists so that
> email validation gets the proper email addresses.
> 
> This fixes email address validation errors when the optional
> perl module Email::Valid is installed and multiple addresses are passed
> in on a single to/cc argument like --to=foo@example.com,bar@example.com.
> 
> Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
> Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com>

If you do Fixes: trailers as the kernel does, this could get:

Fixes: a8022c5f7b67 ("send-email: expose header information to git-send-email's sendemail-validate hook")

I tested this patch on top of main (2e8e77cbac8a) and it fixes the
regression I reported in a separate thread (where Jeff pointed out this
patch as fixing it).

Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

Thanks
Uwe
Michael Strawbridge Oct. 25, 2023, 6:47 p.m. UTC | #5
On 10/25/23 02:50, Jeff King wrote:
> On Tue, Oct 24, 2023 at 04:19:43PM -0400, Michael Strawbridge wrote:
> 
>> Move validation logic below processing of email address lists so that
>> email validation gets the proper email addresses.
>>
>> This fixes email address validation errors when the optional
>> perl module Email::Valid is installed and multiple addresses are passed
>> in on a single to/cc argument like --to=foo@example.com,bar@example.com.
> 
> Is there a test we can include here?
> 
>> @@ -2023,6 +1999,30 @@ sub process_file {
>>  	return 1;
>>  }
>>  
>> +if ($validate) {
> 
> So the new spot is right before we call process_file() on each of the
> input files. It is a little hard to follow because of the number of
> functions defined inline in the middle of the script, but I think that
> is a reasonable spot. It is after we have called process_address_list()
> on to/cc/bcc, which I think fixes the regression. But it is also after
> we sanitize $reply_to, etc, which seems like a good idea.
> 
> But I think putting it down that far is the source of the test failures.
> 
> The culprit seems not to be the call to validate_patch() in the loop you
> moved, but rather pre_process_file(), which was added in your earlier
> a8022c5f7b (send-email: expose header information to git-send-email's
> sendemail-validate hook, 2023-04-19).
> 
> It looks like the issue is the global $message_num variable which is
> incremented by pre_process_file(). On line 1755 (on the current tip of
> master), we set it to 0. And your patch moves the validation across
> there (from line ~799 to ~2023).
> 
> And that's why the extra increments didn't matter when you added the
> calls to pre_process_file() in your earlier patch; they all happened
> before we reset $message_num to 0. But now they happen after.
> 
> To be fair, this is absolutely horrific perl code. There's over a
> thousand lines of function definitions, and then hidden in the middle
> are some global variable assignments!

I agree. Following where things are initialized seems to be especially troublesome.
> 
> So we have a few options, I think:
> 
>   1. Reset $message_num to 0 after validating (maybe we also need
>      to reset $in_reply_to, etc, set at the same time? I didn't check).
>      This feels like a hack.
> 
>   2. Move the validation down, but not so far down. Like maybe right
>      after we set up the @files list with the $compose.final name. This
>      is the smallest diff, but feels like we haven't really made the
>      world a better place.
> 
>   3. Move the $message_num, etc, initialization to right before we call
>      the process_file() loop, which is what expects to use them. Like
>      this (squashed into your patch):
> 
> diff --git a/git-send-email.perl b/git-send-email.perl
> index a898dbc76e..d44db14223 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -1730,10 +1730,6 @@ sub send_message {
>  	return 1;
>  }
>  
> -$in_reply_to = $initial_in_reply_to;
> -$references = $initial_in_reply_to || '';
> -$message_num = 0;
> -
>  sub pre_process_file {
>  	my ($t, $quiet) = @_;
>  
> @@ -2023,6 +2019,10 @@ sub process_file {
>  	delete $ENV{GIT_SENDEMAIL_FILE_TOTAL};
>  }
>  
> +$in_reply_to = $initial_in_reply_to;
> +$references = $initial_in_reply_to || '';
> +$message_num = 0;
> +
>  foreach my $t (@files) {
>  	while (!process_file($t)) {
>  		# user edited the file
> 
The above patch was a great place to start.  Thank you!  In order to address
the fact that validation and actually sending the emails should have the same
initial conditions I created a new function to set the variables and call it
instead.
> That seems to make the test failures go away. It is still weird that the
> validation code is calling pre_process_file(), which increments
> $message_num, without us having set it up in any meaningful way. I'm not
> sure if there are bugs lurking there or not. I'm not impressed by the
> general quality of this code, and I'm kind of afraid to keep looking
> deeper.
> 
> -Peff
Michael Strawbridge Oct. 25, 2023, 6:48 p.m. UTC | #6
On 10/24/23 18:03, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> Michael Strawbridge <michael.strawbridge@amd.com> writes:
>>
>>> Subject: [PATCH] send-email: move validation code below process_address_list
>>>
>>> Move validation logic below processing of email address lists so that
>>> email validation gets the proper email addresses.
>>
>> Hmph, without this patch, the tip of 'seen' passes t9001 on my box,
>> but with it, it claims that it failed 58, 87, and 89.
> 
> FWIW, when this patch is used with 'master' (not 'seen'), t9001
> claims the same three tests failed.  The way #58 fails seems to be
> identical to the way 'seen' with this patch failed, shown in the
> message I am responding to.

I'm sorry to have wasted your time with patch 1. I had done the other manual
tests but ended up forgetting the automated ones.
Junio C Hamano Oct. 27, 2023, 1:04 p.m. UTC | #7
Uwe Kleine-König <u.kleine-koenig@pengutronix.de> writes:

>> This fixes email address validation errors when the optional
>> perl module Email::Valid is installed and multiple addresses are passed
>> in on a single to/cc argument like --to=foo@example.com,bar@example.com.
>> 
>> Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
>> Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com>
>
> If you do Fixes: trailers as the kernel does, this could get:
>
> Fixes: a8022c5f7b67 ("send-email: expose header information to git-send-email's sendemail-validate hook")

While referring to a concrete commit object name is great, we tend
not to use that particular trailer in this project; rather, we
prefer to see description on how and in what way the culprit change
was undesirable.

> I tested this patch on top of main (2e8e77cbac8a) and it fixes the
> regression I reported in a separate thread (where Jeff pointed out this
> patch as fixing it).

Great.  Thanks.
diff mbox series

Patch

diff --git a/git-send-email.perl b/git-send-email.perl
index 288ea1ae80..a898dbc76e 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -799,30 +799,6 @@  sub is_format_patch_arg {
 
 $time = time - scalar $#files;
 
-if ($validate) {
-	# FIFOs can only be read once, exclude them from validation.
-	my @real_files = ();
-	foreach my $f (@files) {
-		unless (-p $f) {
-			push(@real_files, $f);
-		}
-	}
-
-	# Run the loop once again to avoid gaps in the counter due to FIFO
-	# arguments provided by the user.
-	my $num = 1;
-	my $num_files = scalar @real_files;
-	$ENV{GIT_SENDEMAIL_FILE_TOTAL} = "$num_files";
-	foreach my $r (@real_files) {
-		$ENV{GIT_SENDEMAIL_FILE_COUNTER} = "$num";
-		pre_process_file($r, 1);
-		validate_patch($r, $target_xfer_encoding);
-		$num += 1;
-	}
-	delete $ENV{GIT_SENDEMAIL_FILE_COUNTER};
-	delete $ENV{GIT_SENDEMAIL_FILE_TOTAL};
-}
-
 @files = handle_backup_files(@files);
 
 if (@files) {
@@ -2023,6 +1999,30 @@  sub process_file {
 	return 1;
 }
 
+if ($validate) {
+	# FIFOs can only be read once, exclude them from validation.
+	my @real_files = ();
+	foreach my $f (@files) {
+		unless (-p $f) {
+			push(@real_files, $f);
+		}
+	}
+
+	# Run the loop once again to avoid gaps in the counter due to FIFO
+	# arguments provided by the user.
+	my $num = 1;
+	my $num_files = scalar @real_files;
+	$ENV{GIT_SENDEMAIL_FILE_TOTAL} = "$num_files";
+	foreach my $r (@real_files) {
+		$ENV{GIT_SENDEMAIL_FILE_COUNTER} = "$num";
+		pre_process_file($r, 1);
+		validate_patch($r, $target_xfer_encoding);
+		$num += 1;
+	}
+	delete $ENV{GIT_SENDEMAIL_FILE_COUNTER};
+	delete $ENV{GIT_SENDEMAIL_FILE_TOTAL};
+}
+
 foreach my $t (@files) {
 	while (!process_file($t)) {
 		# user edited the file