Message ID | 7e2c92ff-b42c-4b3f-a509-9d0785448262@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | send-email: move process_address_list earlier to avoid, uninitialized address error | expand |
Hi Bagas, If possible, please try the patch I just sent out and let me know if it works for your situation. Thanks, Michael On 10/11/23 16:22, Michael Strawbridge wrote: > Move processing of email address lists before the sendemail-validate > hook code. This fixes email address validation errors when the optional > perl module Email::Valid is installed and multiple addresses are passed > in on a single to/cc argument like --to=foo@example.com,bar@example.com. > --- > git-send-email.perl | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/git-send-email.perl b/git-send-email.perl > index 288ea1ae80..cfd80c9d8b 100755 > --- a/git-send-email.perl > +++ b/git-send-email.perl > @@ -799,6 +799,10 @@ sub is_format_patch_arg { > > $time = time - scalar $#files; > > +@initial_to = process_address_list(@initial_to); > +@initial_cc = process_address_list(@initial_cc); > +@initial_bcc = process_address_list(@initial_bcc); > + > if ($validate) { > # FIFOs can only be read once, exclude them from validation. > my @real_files = (); > @@ -1099,10 +1103,6 @@ sub expand_one_alias { > return $aliases{$alias} ? expand_aliases(@{$aliases{$alias}}) : $alias; > } > > -@initial_to = process_address_list(@initial_to); > -@initial_cc = process_address_list(@initial_cc); > -@initial_bcc = process_address_list(@initial_bcc); > - > if ($thread && !defined $initial_in_reply_to && $prompting) { > $initial_in_reply_to = ask( > __("Message-ID to be used as In-Reply-To for the first email (if any)? "),
Michael Strawbridge <michael.strawbridge@amd.com> writes: > @@ -799,6 +799,10 @@ sub is_format_patch_arg { > > $time = time - scalar $#files; > > +@initial_to = process_address_list(@initial_to); > +@initial_cc = process_address_list(@initial_cc); > +@initial_bcc = process_address_list(@initial_bcc); > + This does not look OK. If we trace how @initial_to gets its value, - it first gets its value from @getopt_to and @config_to - if that is empty, and there is no $to_cmd, the end-user is interactively asked. - then process_address_list() is applied. But this patch just swapped the second one and the third one, so process_address_list() does not process what the end-user gave interactively, no? > if ($validate) { > # FIFOs can only be read once, exclude them from validation. > my @real_files = (); It almost feels like what need to move is not the setting of these address lists, but the code that calls int validation callchain that needs access to these address lists---the block that begins with the above "if ($validate) {" needs to move below ... > @@ -1099,10 +1103,6 @@ sub expand_one_alias { > return $aliases{$alias} ? expand_aliases(@{$aliases{$alias}}) : $alias; > } > > -@initial_to = process_address_list(@initial_to); > -@initial_cc = process_address_list(@initial_cc); > -@initial_bcc = process_address_list(@initial_bcc); > - ... this point, or something, perhaps? > if ($thread && !defined $initial_in_reply_to && $prompting) { > $initial_in_reply_to = ask( > __("Message-ID to be used as In-Reply-To for the first email (if any)? "),
On Wed, Oct 11, 2023 at 02:27:49PM -0700, Junio C Hamano wrote: > Michael Strawbridge <michael.strawbridge@amd.com> writes: > > > @@ -799,6 +799,10 @@ sub is_format_patch_arg { > > > > $time = time - scalar $#files; > > > > +@initial_to = process_address_list(@initial_to); > > +@initial_cc = process_address_list(@initial_cc); > > +@initial_bcc = process_address_list(@initial_bcc); > > + > > This does not look OK. If we trace how @initial_to gets its value, > > - it first gets its value from @getopt_to and @config_to > > - if that is empty, and there is no $to_cmd, the end-user is > interactively asked. > > - then process_address_list() is applied. > > But this patch just swapped the second one and the third one, so > process_address_list() does not process what the end-user gave > interactively, no? Yep, that is the issue I found when I dug into this earlier. > > if ($validate) { > > # FIFOs can only be read once, exclude them from validation. > > my @real_files = (); > > It almost feels like what need to move is not the setting of these > address lists, but the code that calls int validation callchain that > needs access to these address lists---the block that begins with the > above "if ($validate) {" needs to move below ... Yes, though then we have the problem that we've asked the user some interactive questions before validating if the input files are bogus or not. Which would be annoying if they aren't valid, because when we barf they've wasted time typing. Which of course implies that we're not (and cannot) validate what they're typing at this step, but I think that's OK because we feed it through extract_valid_address_or_die(). IOW, I think there are actually two distinct validation steps hidden here: 1. We want to validate that the patch files we were fed are OK. 2. We want to validate that the addresses, etc, fed by the user are OK. And after Michael's original patch, we are accidentally hitting some of that validation code for (2) while doing (1). This is actually a weird split if you think about it. We are feeding to the validate hook in (1), so surely it would want to see the full set of inputs from the user, too? Which argues for pushing the "if ($validate)" down as you suggest. And then either: a. We accept that the user experience is a little worse if validation fails after the user typed. b. We split (1) into "early" validation that just checks if the files are OK, but doesn't call the hook. And then later on we do the full validation. I don't have a strong opinion myself (I don't even use send-email myself, and if I did, I'd probably mostly be feeding it with "--to" etc on the command line, rather than interactively). -Peff
Jeff King <peff@peff.net> writes: > Which of course implies that we're not (and cannot) validate what > they're typing at this step, but I think that's OK because we feed it > through extract_valid_address_or_die(). OK, let's queue it then. > IOW, I think there are actually two distinct validation steps > hidden here: > > 1. We want to validate that the patch files we were fed are OK. > > 2. We want to validate that the addresses, etc, fed by the user are > OK. > > And after Michael's original patch, we are accidentally hitting some of > that validation code for (2) while doing (1). > This is actually a weird split if you think about it. We are feeding to > the validate hook in (1), so surely it would want to see the full set of > inputs from the user, too? Which argues for pushing the "if ($validate)" > down as you suggest. And then either: > > a. We accept that the user experience is a little worse if validation > fails after the user typed. > > b. We split (1) into "early" validation that just checks if the files > are OK, but doesn't call the hook. And then later on we do the full > validation. > > I don't have a strong opinion myself (I don't even use send-email > myself, and if I did, I'd probably mostly be feeding it with "--to" etc > on the command line, rather than interactively). I am not affected, either, and do not have a strong opinion either way. As long as the end-user input is validated separately, it would be OK, but if the end-user supplied validation hook cares about what addresses the messages are going to be sent to, not knowing the set of recipients mean the validation hook is not getting the whole picture, which does smell bad. On the other hand, I am not sure what is wrong with "after the user typed", actually. As you said, anybody sane would be using --to (or an equivalent configuration variable in the repository) to send their patches to the project address instead of typing, and to them it is not a problem. After getting the recipient address from the end user, the validation may fail due to a wrong address, in which case it is a good thing. If the validation failed due to wrong contents of the patch (perhaps it included a change to the file with trade secret that appeared in the context lines), as long as the reason why the validation hook rejected the patches is clear enough (e.g., "it's the patches, not the recipients"), such "a rejection after typing" would be only once per a patch series, so it does not sound too bad, either. But perhaps I am not seeing the reason why "fail after the user typed" is so disliked and being unnecessarily unsympathetic. I dunno.
On Wed, Oct 11, 2023 at 03:37:39PM -0700, Junio C Hamano wrote: > On the other hand, I am not sure what is wrong with "after the user > typed", actually. As you said, anybody sane would be using --to (or > an equivalent configuration variable in the repository) to send > their patches to the project address instead of typing, and to them > it is not a problem. After getting the recipient address from the > end user, the validation may fail due to a wrong address, in which > case it is a good thing. If the validation failed due to wrong > contents of the patch (perhaps it included a change to the file with > trade secret that appeared in the context lines), as long as the > reason why the validation hook rejected the patches is clear enough > (e.g., "it's the patches, not the recipients"), such "a rejection > after typing" would be only once per a patch series, so it does not > sound too bad, either. > > But perhaps I am not seeing the reason why "fail after the user typed" > is so disliked and being unnecessarily unsympathetic. I dunno. I did not look carefully at the flow of send-email, so this may or may not be an issue. But what I think would be _really_ annoying is if you asked to write a cover letter, went through the trouble of writing it, and then send-email bailed due to some validation failure that could have been checked earlier. There is probably a way to recover your work (presumably we leave it in a temporary file somewhere), but it may not be entirely trivial, especially for users who are not comfortable with advanced usage of their editor. ;) I seem to remember we had one or two such problems in the early days with "git commit", where you would go to the trouble to type a commit message only to bail on some condition which _could_ have been checked earlier. You can recover the message from .git/COMMIT_EDITMSG, but you need to remember to do so before re-invoking "git commit", otherwise it gets obliterated. Now for send-email, if your flow is to generate the patches with "format-patch", then edit the cover letter separately, and then finally ship it all out with "send-email", that might not be an issue. But some workflows use the --compose option instead. -Peff
On 10/11/23 18:47, Jeff King wrote: > On Wed, Oct 11, 2023 at 03:37:39PM -0700, Junio C Hamano wrote: > >> On the other hand, I am not sure what is wrong with "after the user >> typed", actually. As you said, anybody sane would be using --to (or >> an equivalent configuration variable in the repository) to send >> their patches to the project address instead of typing, and to them >> it is not a problem. After getting the recipient address from the >> end user, the validation may fail due to a wrong address, in which >> case it is a good thing. If the validation failed due to wrong >> contents of the patch (perhaps it included a change to the file with >> trade secret that appeared in the context lines), as long as the >> reason why the validation hook rejected the patches is clear enough >> (e.g., "it's the patches, not the recipients"), such "a rejection >> after typing" would be only once per a patch series, so it does not >> sound too bad, either. >> >> But perhaps I am not seeing the reason why "fail after the user typed" >> is so disliked and being unnecessarily unsympathetic. I dunno. > I did not look carefully at the flow of send-email, so this may or may > not be an issue. But what I think would be _really_ annoying is if you > asked to write a cover letter, went through the trouble of writing it, > and then send-email bailed due to some validation failure that could > have been checked earlier. > > There is probably a way to recover your work (presumably we leave it in > a temporary file somewhere), but it may not be entirely trivial, > especially for users who are not comfortable with advanced usage of > their editor. ;) As I was looking at covering the case of interactive input (--compose) to the fix I noticed that this seems to be at least partly handled by the $compose_filename code. There is a nice output message telling you exactly where the intermediate version of the email you are composing is located if there are errors. I took a quick look inside and can verify that any lost work should be minimal as long as someone knows how to edit files with their editor of choice. > > I seem to remember we had one or two such problems in the early days > with "git commit", where you would go to the trouble to type a commit > message only to bail on some condition which _could_ have been checked > earlier. You can recover the message from .git/COMMIT_EDITMSG, but you > need to remember to do so before re-invoking "git commit", otherwise it > gets obliterated. > > Now for send-email, if your flow is to generate the patches with > "format-patch", then edit the cover letter separately, and then finally > ship it all out with "send-email", that might not be an issue. But some > workflows use the --compose option instead. > > -Peff I have been looking into handling the interactive input cases while solving this issue, but have yet to make a breakthrough. Simply moving the validation code below the original process_address_list code results in a a scenario where I get the email address being seen as something like "ARRAY (0x55ddb951d768)" rather than the email address I wrote in the compose buffer.
On Wed, Oct 11, 2023 at 04:22:18PM -0400, Michael Strawbridge wrote: > Move processing of email address lists before the sendemail-validate > hook code. This fixes email address validation errors when the optional > perl module Email::Valid is installed and multiple addresses are passed > in on a single to/cc argument like --to=foo@example.com,bar@example.com. > --- > git-send-email.perl | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/git-send-email.perl b/git-send-email.perl > index 288ea1ae80..cfd80c9d8b 100755 > --- a/git-send-email.perl > +++ b/git-send-email.perl > @@ -799,6 +799,10 @@ sub is_format_patch_arg { > > $time = time - scalar $#files; > > +@initial_to = process_address_list(@initial_to); > +@initial_cc = process_address_list(@initial_cc); > +@initial_bcc = process_address_list(@initial_bcc); > + > if ($validate) { > # FIFOs can only be read once, exclude them from validation. > my @real_files = (); > @@ -1099,10 +1103,6 @@ sub expand_one_alias { > return $aliases{$alias} ? expand_aliases(@{$aliases{$alias}}) : $alias; > } > > -@initial_to = process_address_list(@initial_to); > -@initial_cc = process_address_list(@initial_cc); > -@initial_bcc = process_address_list(@initial_bcc); > - > if ($thread && !defined $initial_in_reply_to && $prompting) { > $initial_in_reply_to = ask( > __("Message-ID to be used as In-Reply-To for the first email (if any)? "), Thanks for the fixup! The patch itself is whitespace-damaged, though, so I have to manually apply it. Regardless, Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
On Fri, Oct 13, 2023 at 04:25:57PM -0400, Michael Strawbridge wrote: > > I did not look carefully at the flow of send-email, so this may or may > > not be an issue. But what I think would be _really_ annoying is if you > > asked to write a cover letter, went through the trouble of writing it, > > and then send-email bailed due to some validation failure that could > > have been checked earlier. > > > > There is probably a way to recover your work (presumably we leave it in > > a temporary file somewhere), but it may not be entirely trivial, > > especially for users who are not comfortable with advanced usage of > > their editor. ;) > > As I was looking at covering the case of interactive input (--compose) > to the fix I noticed that this seems to be at least partly handled by > the $compose_filename code. There is a nice output message telling > you exactly where the intermediate version of the email you are > composing is located if there are errors. I took a quick look inside > and can verify that any lost work should be minimal as long as someone > knows how to edit files with their editor of choice. OK, that makes me feel better about just moving the validation code. A more complicated solution could be do to do _some_ basic checks up front, and then more complete validation later. But even if we wanted to do that, moving the bulk of the validation (as discussed in this thread) would probably be the first step anyway (and if nobody complains, maybe we can avoid doing the extra work). I do wonder if we might find other interesting corner cases where the validation code (or somebody's hook) isn't happy with seeing the more "full" picture (i.e., with the extra addresses from interactive and command-line input). But arguably any such case would be indicative of a bug, and smoking it out would be a good thing. > I have been looking into handling the interactive input cases while > solving this issue, but have yet to make a breakthrough. Simply > moving the validation code below the original process_address_list > code results in a a scenario where I get the email address being seen > as something like "ARRAY (0x55ddb951d768)" rather than the email > address I wrote in the compose buffer. Sounds like something is making a perl ref that shouldn't (or something that should be dereferencing it not doing so). If you post your patch and a reproduction command, I might be able to help debug. But just blindly moving the validation code down, like: diff --git a/git-send-email.perl b/git-send-email.perl index 288ea1ae80..76589c7827 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -799,30 +799,6 @@ sub is_format_patch_arg { $time = time - scalar $#files; -if ($validate) { - # FIFOs can only be read once, exclude them from validation. - my @real_files = (); - foreach my $f (@files) { - unless (-p $f) { - push(@real_files, $f); - } - } - - # Run the loop once again to avoid gaps in the counter due to FIFO - # arguments provided by the user. - my $num = 1; - my $num_files = scalar @real_files; - $ENV{GIT_SENDEMAIL_FILE_TOTAL} = "$num_files"; - foreach my $r (@real_files) { - $ENV{GIT_SENDEMAIL_FILE_COUNTER} = "$num"; - pre_process_file($r, 1); - validate_patch($r, $target_xfer_encoding); - $num += 1; - } - delete $ENV{GIT_SENDEMAIL_FILE_COUNTER}; - delete $ENV{GIT_SENDEMAIL_FILE_TOTAL}; -} - @files = handle_backup_files(@files); if (@files) { @@ -1121,6 +1097,30 @@ sub expand_one_alias { $reply_to = sanitize_address($reply_to); } +if ($validate) { + # FIFOs can only be read once, exclude them from validation. + my @real_files = (); + foreach my $f (@files) { + unless (-p $f) { + push(@real_files, $f); + } + } + + # Run the loop once again to avoid gaps in the counter due to FIFO + # arguments provided by the user. + my $num = 1; + my $num_files = scalar @real_files; + $ENV{GIT_SENDEMAIL_FILE_TOTAL} = "$num_files"; + foreach my $r (@real_files) { + $ENV{GIT_SENDEMAIL_FILE_COUNTER} = "$num"; + pre_process_file($r, 1); + validate_patch($r, $target_xfer_encoding); + $num += 1; + } + delete $ENV{GIT_SENDEMAIL_FILE_COUNTER}; + delete $ENV{GIT_SENDEMAIL_FILE_TOTAL}; +} + if (!defined $sendmail_cmd && !defined $smtp_server) { my @sendmail_paths = qw( /usr/sbin/sendmail /usr/lib/sendmail ); push @sendmail_paths, map {"$_/sendmail"} split /:/, $ENV{PATH}; seems to fix the problem from this thread and passes the existing tests. Manually inspecting the result (and what's fed to the validation hook) I don't see anything odd (like "ARRAY (...)"). -Peff
On Fri, Oct 20, 2023 at 02:45:25AM -0400, Jeff King wrote: > > I have been looking into handling the interactive input cases while > > solving this issue, but have yet to make a breakthrough. Simply > > moving the validation code below the original process_address_list > > code results in a a scenario where I get the email address being seen > > as something like "ARRAY (0x55ddb951d768)" rather than the email > > address I wrote in the compose buffer. > > Sounds like something is making a perl ref that shouldn't (or something > that should be dereferencing it not doing so). If you post your patch > and a reproduction command, I might be able to help debug. Ah, your "address I wrote in the compose buffer" was the clue I needed. I think this is actually an existing bug. If I use --compose and write: To: foo@example.com in the editor, we read that back in and handle it in parse_header_line() like: my $addr_pat = join "|", qw(To Cc Bcc); foreach (split(/\n/, $lines)) { if (/^($addr_pat):\s*(.+)$/i) { $parsed_line->{$1} = [ parse_address_line($2) ]; } elsif (/^([^:]*):\s*(.+)\s*$/i) { $parsed_line->{$1} = $2; } } and there's your perl array ref (from the square brackets, which are necessary because we're sticking it in a hash value). But even before your patch, this seems to end up as garbage. The code which reads $parsed_line does not dereference the array. The patch to fix it is only a few lines (well, more than that with some light editorializing in the comments): diff --git a/git-send-email.perl b/git-send-email.perl index 76589c7827..46a30088c9 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -918,7 +918,28 @@ sub get_patch_subject { # Preserve unknown headers foreach my $key (keys %parsed_email) { next if $key eq 'body'; - print $c2 "$key: $parsed_email{$key}"; + + # it seems like it would be easier to just look for + # $parsed_email{'To'} and so on. But we actually match + # these case-insenstively and preserve the user's spelling, so + # we might see $parsed_email{'to'}. Of course, the same bug + # exists for Subject, etc, above. Anyway, a "/i" regex here + # handles all cases. + # + # It kind of feels like all of this code would be much simpler + # if we just handled all of the headers while reading back the + # input, rather than stuffing them all into $parsed_email and + # then picking them out of it. + # + # It also really feels like these to/cc/bcc lines should be + # added to the regular ones? It is silly to make a cover letter + # that goes to some addresses, and then not send the patches to + # them, too. + if ($key =~ /^(To|Cc|Bcc)$/i) { + print $c2 "$key: ", join(', ', @{$parsed_email{$key}}); + } else { + print $c2 "$key: $parsed_email{$key}"; + } } if ($parsed_email{'body'}) { I don't really think your patch makes things worse here. But it is probably worth fixing it while we are here. -Peff
diff --git a/git-send-email.perl b/git-send-email.perl index 288ea1ae80..cfd80c9d8b 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -799,6 +799,10 @@ sub is_format_patch_arg { $time = time - scalar $#files; +@initial_to = process_address_list(@initial_to); +@initial_cc = process_address_list(@initial_cc); +@initial_bcc = process_address_list(@initial_bcc); + if ($validate) { # FIFOs can only be read once, exclude them from validation. my @real_files = (); @@ -1099,10 +1103,6 @@ sub expand_one_alias { return $aliases{$alias} ? expand_aliases(@{$aliases{$alias}}) : $alias; } -@initial_to = process_address_list(@initial_to); -@initial_cc = process_address_list(@initial_cc); -@initial_bcc = process_address_list(@initial_bcc); - if ($thread && !defined $initial_in_reply_to && $prompting) { $initial_in_reply_to = ask( __("Message-ID to be used as In-Reply-To for the first email (if any)? "),