Message ID | 20240628085018.65076-2-csokas.bence@prolan.hu (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v3] git-send-email: Use sanitized address when reading mbox body | expand |
"Csókás, Bence" <csokas.bence@prolan.hu> writes: > Commas and other punctuation marks in 'Cc: ', 'Signed-off-by: ' > etc. lines mess with git-send-email. Calling these "trailers" would make it more obvious that you are talking about parts of the log message ("S-o-b" would not be mistaken as an e-mail header, but you cannot complain if a reader mistook the mention of "Cc" as a reference to an e-mail header). More importantly, "mess with" is true when potentially address-looking strings from the trailer lines are added to @cc list. So perhaps Addresses that are mentioned on the trailers in the commit log messages (e.g., "Reviewed-by") are added to the "Cc:" list (unless suppressed) by "git send-email". These hand-written addresses may be malformed (e.g., having unquoted "." and other punctutation marks in the display-name part). > This is handled by calling > `sanitize_address()` before adding addresses to @cc. This function > was already being called, but was only used for comparing it to > $author for suppression purposes. So this is not telling the truth. This is *NOT* handled by calling the helper function, as the result is *NOT* added to @cc. I would suggest striking this paragraph (and the rest of the log message) out and rewrite, perhaps like so: The code does use sanitize_address on these address-looking strings to make them into valid addresses, but the result was used only to see if the address should be suppressed. The original string taken from the message was added to the @cc list. Because the addresses on trailer lines are hand-written and more likely to contain malformed addresses, when adding to the @cc list, use the result from sanitize_address, not the original. Note that we do not modify the behaviour for addresses taken from the e-mail headers, as they are more likely to be machine generated and well-formed. or something. > diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh > index 58699f8e4e..8bbbf20855 100755 > --- a/t/t9001-send-email.sh > +++ b/t/t9001-send-email.sh > @@ -1299,6 +1299,57 @@ test_expect_success $PREREQ 'utf8 sender is not duplicated' ' > test_line_count = 1 msgfrom > ' > > +test_expect_success $PREREQ 'setup expect for cc list' " > +cat >expected-cc <<\EOF > +!recipient@example.com! > +!author@example.com! > +!one@example.com! > +!odd_?=mail@example.com! > +!doug@example.com! > +!thor.au@example.com! > +EOF > +" > + > +test_expect_success $PREREQ 'cc list is sanitized' ' > + clean_fake_sendmail && > + test_commit weird_cc_body && > + test_when_finished "git reset --hard HEAD^" && > + git commit --amend -F - <<-EOF && > + Test Cc: sanitization. > + > + Cc: Person, One <one@example.com> > + Reviewed-by: Füñný Nâmé <odd_?=mail@example.com> > + Reported-by: bugger on Jira > + Reported-by: Douglas Reporter <doug@example.com> [from Jira profile] > + BugID: 12345 > + Signed-off-by: A. U. Thor <thor.au@example.com> > + EOF > + git send-email -1 --to=recipient@example.com \ > + --smtp-server="$(pwd)/fake.sendmail" >actual-show-all-headers && > + test_cmp expected-cc commandline1 && > + test_grep "^(body) Adding cc: \"Person, One\" <one@example.com>" actual-show-all-headers && > + test_grep "^(body) Adding cc: =?UTF-8?q?F=C3=BC=C3=B1n=C3=BD=20N=C3=A2m=C3=A9?="\ > +" <odd_?=mail@example.com>" actual-show-all-headers && > + test_grep "^(body) Adding cc: Douglas Reporter <doug@example.com>" actual-show-all-headers && > + test_grep "^(body) Adding cc: \"A. U. Thor\" <thor.au@example.com>" actual-show-all-headers > +' The test does not check that "bugger on Jira" and "12345" are not used as recipients, but it should. Otherwise your effort to add the extra trailers to the test goes wasted. It is a good choice to use Reported-by and BugID as two extra ones, by the way. The mechanisms that causes them omitted from the resulting message are different. > + > +test_expect_success $PREREQ 'quotes are sanitized in cc list' " > + clean_fake_sendmail && > + test_commit quote_in_cc_body && > + test_when_finished \"git reset --hard HEAD^\" && > + git commit --amend -F - <<-EOF && > + Quotation marks sanitization in Cc:. > + > + Cc: P'erson, One <one@example.com> > + Reported-by: \"Douglas 'Bug' Reporter\" <doug@example.com> > + EOF Quoting the test body in a pair of double quotes is error prone. Do not do it (and you do have a bug here that was caused by it). Rather, write ${SQ} where you would write a single quote in your test body, and enclose the test body in a pair of single quotes, like everybody else does. > + git send-email -1 --to=recipient@example.com \ > + --smtp-server=\"$(pwd)/fake.sendmail\" >actual-show-all-headers && Here is the bug. This $(pwd) command substition is done *before* the test runs---it is expanded while the shell computes the command line to run test_expect_success (as part of the test body string that is quoted in a pair of double-quotes). So if your cwd has a character that needs to be quoted inside double-quotes to be a valid value for the "--smtp-server" option, you end up breaking the shell quoting rules. In other words: what happens if your $(pwd) has a double-quote in it? > + test_grep \"^(body) Adding cc: \\\"P'erson, One\\\" <one@example.com>\" actual-show-all-headers && > + test_grep \"^(body) Adding cc: \\\"Douglas 'Bug' Reporter\\\" <doug@example.com>\" actual-show-all-headers > +" > + So, I do not see a strong reason why we want to have this as a separate test. We can write and test addresses with single quotes in it as part of the previous test. A squashable change on top of your version is attached at the end. * Added a name with apostrophe in it on the Cc: line, and check that the address appears on the recipient list and also in the log. * Added checks for "Reported-by: bugger on Jira" that explicitly gets ignored in the log, and "BugId:" that is not even considered for Cc: (in other words, not appearing in the log at all). Thanks. t/t9001-send-email.sh | 21 +++++---------------- 1 file changed, 5 insertions(+), 16 deletions(-) diff --git c/t/t9001-send-email.sh w/t/t9001-send-email.sh index 658f3a72ae..64c56677cd 100755 --- c/t/t9001-send-email.sh +++ w/t/t9001-send-email.sh @@ -1304,6 +1304,7 @@ cat >expected-cc <<\EOF !recipient@example.com! !author@example.com! !one@example.com! +!os@example.com! !odd_?=mail@example.com! !doug@example.com! !thor.au@example.com! @@ -1318,6 +1319,7 @@ test_expect_success $PREREQ 'cc list is sanitized' ' Test Cc: sanitization. Cc: Person, One <one@example.com> + Cc: Ronnie O${SQ}Sullivan <os@example.com> Reviewed-by: Füñný Nâmé <odd_?=mail@example.com> Reported-by: bugger on Jira Reported-by: Douglas Reporter <doug@example.com> [from Jira profile] @@ -1328,28 +1330,15 @@ test_expect_success $PREREQ 'cc list is sanitized' ' --smtp-server="$(pwd)/fake.sendmail" >actual-show-all-headers && test_cmp expected-cc commandline1 && test_grep "^(body) Adding cc: \"Person, One\" <one@example.com>" actual-show-all-headers && + test_grep "^(body) Adding cc: Ronnie O${SQ}Sullivan <os@example.com>" actual-show-all-headers && test_grep "^(body) Adding cc: =?UTF-8?q?F=C3=BC=C3=B1n=C3=BD=20N=C3=A2m=C3=A9?="\ " <odd_?=mail@example.com>" actual-show-all-headers && + test_grep "^(body) Ignoring Reported-by .* bugger on Jira" actual-show-all-headers && test_grep "^(body) Adding cc: Douglas Reporter <doug@example.com>" actual-show-all-headers && + test_grep ! "12345" actual-show-all-headers && test_grep "^(body) Adding cc: \"A. U. Thor\" <thor.au@example.com>" actual-show-all-headers ' -test_expect_success $PREREQ 'quotes are sanitized in cc list' " - clean_fake_sendmail && - test_commit quote_in_cc_body && - test_when_finished \"git reset --hard HEAD^\" && - git commit --amend -F - <<-EOF && - Quotation marks sanitization in Cc:. - - Cc: P'erson, One <one@example.com> - Reported-by: \"Douglas 'Bug' Reporter\" <doug@example.com> - EOF - git send-email -1 --to=recipient@example.com \ - --smtp-server=\"$(pwd)/fake.sendmail\" >actual-show-all-headers && - test_grep \"^(body) Adding cc: \\\"P'erson, One\\\" <one@example.com>\" actual-show-all-headers && - test_grep \"^(body) Adding cc: \\\"Douglas 'Bug' Reporter\\\" <doug@example.com>\" actual-show-all-headers -" - test_expect_success $PREREQ 'sendemail.composeencoding works' ' clean_fake_sendmail && git config sendemail.composeencoding iso-8859-1 &&
diff --git a/git-send-email.perl b/git-send-email.perl index f0be4b4560..72044e5ef3 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -1847,9 +1847,9 @@ sub pre_process_file { $what, $_) unless $quiet; next; } - push @cc, $c; + push @cc, $sc; printf(__("(body) Adding cc: %s from line '%s'\n"), - $c, $_) unless $quiet; + $sc, $_) unless $quiet; } } close $fh; diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index 58699f8e4e..8bbbf20855 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -1299,6 +1299,57 @@ test_expect_success $PREREQ 'utf8 sender is not duplicated' ' test_line_count = 1 msgfrom ' +test_expect_success $PREREQ 'setup expect for cc list' " +cat >expected-cc <<\EOF +!recipient@example.com! +!author@example.com! +!one@example.com! +!odd_?=mail@example.com! +!doug@example.com! +!thor.au@example.com! +EOF +" + +test_expect_success $PREREQ 'cc list is sanitized' ' + clean_fake_sendmail && + test_commit weird_cc_body && + test_when_finished "git reset --hard HEAD^" && + git commit --amend -F - <<-EOF && + Test Cc: sanitization. + + Cc: Person, One <one@example.com> + Reviewed-by: Füñný Nâmé <odd_?=mail@example.com> + Reported-by: bugger on Jira + Reported-by: Douglas Reporter <doug@example.com> [from Jira profile] + BugID: 12345 + Signed-off-by: A. U. Thor <thor.au@example.com> + EOF + git send-email -1 --to=recipient@example.com \ + --smtp-server="$(pwd)/fake.sendmail" >actual-show-all-headers && + test_cmp expected-cc commandline1 && + test_grep "^(body) Adding cc: \"Person, One\" <one@example.com>" actual-show-all-headers && + test_grep "^(body) Adding cc: =?UTF-8?q?F=C3=BC=C3=B1n=C3=BD=20N=C3=A2m=C3=A9?="\ +" <odd_?=mail@example.com>" actual-show-all-headers && + test_grep "^(body) Adding cc: Douglas Reporter <doug@example.com>" actual-show-all-headers && + test_grep "^(body) Adding cc: \"A. U. Thor\" <thor.au@example.com>" actual-show-all-headers +' + +test_expect_success $PREREQ 'quotes are sanitized in cc list' " + clean_fake_sendmail && + test_commit quote_in_cc_body && + test_when_finished \"git reset --hard HEAD^\" && + git commit --amend -F - <<-EOF && + Quotation marks sanitization in Cc:. + + Cc: P'erson, One <one@example.com> + Reported-by: \"Douglas 'Bug' Reporter\" <doug@example.com> + EOF + git send-email -1 --to=recipient@example.com \ + --smtp-server=\"$(pwd)/fake.sendmail\" >actual-show-all-headers && + test_grep \"^(body) Adding cc: \\\"P'erson, One\\\" <one@example.com>\" actual-show-all-headers && + test_grep \"^(body) Adding cc: \\\"Douglas 'Bug' Reporter\\\" <doug@example.com>\" actual-show-all-headers +" + test_expect_success $PREREQ 'sendemail.composeencoding works' ' clean_fake_sendmail && git config sendemail.composeencoding iso-8859-1 &&
Commas and other punctuation marks in 'Cc: ', 'Signed-off-by: ' etc. lines mess with git-send-email. This is handled by calling `sanitize_address()` before adding addresses to @cc. This function was already being called, but was only used for comparing it to $author for suppression purposes. Note that sanitization is only done for the message body, as `git format-patch` already RFC 2047-encodes mbox headers, so those are generally trusted to be sane. Also note that `sanitize_address()` does not process the mailbox addresses, so it is up to `sendmail` to handle special characters there (e.g. there are mailboxes in regular use with '+'-es in them). Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu> --- [ sorry, I forgot the --notes: ] Notes: Changes in v2: * added testcase to t9001 * added rationale behind trusting mbox headers and the address-parts Changes in v3: * more testcases * clarified wording in message git-send-email.perl | 4 ++-- t/t9001-send-email.sh | 51 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 53 insertions(+), 2 deletions(-)