Message ID | 45d2c4ab58c4b0c6f0c7790890bbf75eb373f999.1620148732.git.congdanhqx@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Series | Teach am/mailinfo to process quoted CR |
Đoàn Trần Công Danh <congdanhqx@gmail.com> writes:

> When an SMTP server receives an 8-bit email message, possibly with only
> LF as line ending, some of those servers decide to change said LF to
> CRLF.

s/an SMTP server receives/SMTP servers receive/
s/those servers/them/

> Some mailing list softwares, when receives an 8-bit email message,
> decide to encoding such message in base64 or quoted-printable.

s/encoding/encode/

So the issue is not about CRLF terminating the lines of base64 or QP
(we should treat CRLF and LF terminated lines when unwrapping base64
or QP the same way).  It is about seeing CRLF in the payload after
unwrapping base64 or QP.  It was unclear which one was at issue from
the subject alone.

> If an email is transfered through above mail servers, then distributed
> by such mailing list softwares, the recipients will receive an email
> contains a patch mungled with CRLF encoded inside another encoding.
> Thus, such CR couldn't be dropped by mailsplit.  Hence, the mailed patch
> couldn't be applied cleanly.  Such accidents have been observed in the
> wild [1].
>
> Let's give our users some warnings if such CR is found.

Hmph.  It is unclear which one of the following we want our endgame
to be:

 (1) strip silently and apply
 (2) warn but strip and apply
 (3) warn but do not strip, letting the application fail

but let's keep reading.  I suspect (1) and (2) might be error prone,
as the mailpath that may have caused this kind of breakage may not
be under end-user's control.

> +static void summarize_quoted_cr(struct mailinfo *mi, int have_quoted_cr)
> +{
> +	if (have_quoted_cr)
> +		warning("quoted CR detected");
> +}

At this step, it is unclear if it is easier to read to make it the
responsibility of the caller to check for have_quoted_cr, but it
will become clear as we add more condition for the warning in later
steps to let callers unconditionally call this helper and decide
when we want to be silent inside this function.

Have you considered adding a new have_quoted_cr member to "struct
mailinfo"?  After all, the mailinfo struct is not only about end
user preference but contains all information we gleaned out of the
incoming message.

>  static void handle_body(struct mailinfo *mi, struct strbuf *line)
>  {
>  	struct strbuf prev = STRBUF_INIT;
> +	int have_quoted_cr = 0;
>  
>  	/* Skip up to the first boundary */
>  	if (*(mi->content_top)) {
> @@ -1051,6 +1063,8 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line)
>  				handle_filter(mi, &prev);
>  				strbuf_reset(&prev);
>  			}
> +			summarize_quoted_cr(mi, have_quoted_cr);
> +			have_quoted_cr = 0;
>  			if (!handle_boundary(mi, line))
>  				goto handle_body_out;
>  		}
> @@ -1081,7 +1095,7 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line)
>  						strbuf_addbuf(&prev, sb);
>  						break;
>  					}
> -				handle_filter_flowed(mi, sb, &prev);
> +				handle_filter_flowed(mi, sb, &prev, &have_quoted_cr);
>  			}
>  			/*
>  			 * The partial chunk is saved in "prev" and will be
> @@ -1091,7 +1105,7 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line)
>  			break;
>  		}
>  		default:
> -			handle_filter_flowed(mi, line, &prev);
> +			handle_filter_flowed(mi, line, &prev, &have_quoted_cr);
>  		}
>  
>  		if (mi->input_error)
> @@ -1100,6 +1114,7 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line)
>  
>  	if (prev.len)
>  		handle_filter(mi, &prev);
> +	summarize_quoted_cr(mi, have_quoted_cr);
>  
>  	flush_inbody_header_accum(mi);
>  
> diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh
> index 147e616533..d8fdda6bea 100755
> --- a/t/t5100-mailinfo.sh
> +++ b/t/t5100-mailinfo.sh
> @@ -228,4 +228,19 @@ test_expect_success 'mailinfo handles unusual header whitespace' '
>  	test_cmp expect actual
>  '
>  
> +check_quoted_cr_mail() {

SP on both sides of (), i.e.

	check_quoted_cr_mail () {

> +	git mailinfo -u "$@" quoted-cr-msg quoted-cr-patch \
> +		<"$DATA/quoted-cr.mbox" >quoted-cr-info 2>quoted-cr-err &&
> +	test_cmp "expect-cr-msg" quoted-cr-msg &&
> +	test_cmp "expect-cr-patch" quoted-cr-patch &&
> +	test_cmp "$DATA/quoted-cr-info" quoted-cr-info
> +}
> +
> +test_expect_success 'mailinfo warn CR in base64 encoded email' '
> +	sed "s/%%/$(printf \\015)/" "$DATA/quoted-cr-msg" >expect-cr-msg &&
> +	sed "s/%%/$(printf \\015)/" "$DATA/quoted-cr-patch" >expect-cr-patch &&
> +	check_quoted_cr_mail &&
> +	grep "quoted CR detected" quoted-cr-err
> +'
> +
>  test_done
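The three possible end games named in the review, (1) strip silently, (2) warn and strip, (3) warn without stripping, can be compared side by side in a small stand-alone sketch. Everything here is an assumption made for illustration: `enum quoted_cr_action`, its three values, and `apply_quoted_cr_action()` are hypothetical names that appear neither in this patch nor in mailinfo.c, and the sketch works on a plain `char` buffer rather than on a `struct strbuf`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy names mirroring options (1), (2) and (3). */
enum quoted_cr_action {
	QCR_STRIP_SILENTLY,	/* (1) strip silently and apply */
	QCR_WARN_AND_STRIP,	/* (2) warn but strip and apply */
	QCR_WARN_ONLY		/* (3) warn but do not strip */
};

/*
 * Apply one policy to a decoded line held in buf[0..len).
 * Returns the (possibly shortened) length of the line and sets
 * *warn to 1 when the chosen policy asks for a warning.
 * *warn is left untouched when the line does not end in CR LF.
 */
static size_t apply_quoted_cr_action(char *buf, size_t len,
				     enum quoted_cr_action action,
				     int *warn)
{
	assert(buf != NULL && warn != NULL);

	if (len < 2 || buf[len - 2] != '\r' || buf[len - 1] != '\n')
		return len;	/* no CR LF at the end, nothing to decide */

	if (action != QCR_STRIP_SILENTLY)
		*warn = 1;

	if (action != QCR_WARN_ONLY) {
		/* Turn the trailing "\r\n" into "\n" in place. */
		buf[len - 2] = '\n';
		buf[len - 1] = '\0';
		len--;
	}
	return len;
}
```

With the line "x\r\n", options (1) and (2) would both leave "x\n" behind and differ only in whether the warning flag is set, while option (3) leaves the buffer as it is, so a later "git apply" on such a patch would be expected to fail as the review describes.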
diff --git a/mailinfo.c b/mailinfo.c
index 5681d9130d..713567f84b 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -988,12 +988,17 @@ static int handle_boundary(struct mailinfo *mi, struct strbuf *line)
 }
 
 static void handle_filter_flowed(struct mailinfo *mi, struct strbuf *line,
-				 struct strbuf *prev)
+				 struct strbuf *prev, int *have_quoted_cr)
 {
 	size_t len = line->len;
 	const char *rest;
 
 	if (!mi->format_flowed) {
+		if (len >= 2 &&
+		    line->buf[len - 2] == '\r' &&
+		    line->buf[len - 1] == '\n') {
+			*have_quoted_cr = 1;
+		}
 		handle_filter(mi, line);
 		return;
 	}
@@ -1033,9 +1038,16 @@ static void handle_filter_flowed(struct mailinfo *mi, struct strbuf *line,
 	handle_filter(mi, line);
 }
 
+static void summarize_quoted_cr(struct mailinfo *mi, int have_quoted_cr)
+{
+	if (have_quoted_cr)
+		warning("quoted CR detected");
+}
+
 static void handle_body(struct mailinfo *mi, struct strbuf *line)
 {
 	struct strbuf prev = STRBUF_INIT;
+	int have_quoted_cr = 0;
 
 	/* Skip up to the first boundary */
 	if (*(mi->content_top)) {
@@ -1051,6 +1063,8 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line)
 				handle_filter(mi, &prev);
 				strbuf_reset(&prev);
 			}
+			summarize_quoted_cr(mi, have_quoted_cr);
+			have_quoted_cr = 0;
 			if (!handle_boundary(mi, line))
 				goto handle_body_out;
 		}
@@ -1081,7 +1095,7 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line)
 						strbuf_addbuf(&prev, sb);
 						break;
 					}
-				handle_filter_flowed(mi, sb, &prev);
+				handle_filter_flowed(mi, sb, &prev, &have_quoted_cr);
 			}
 			/*
 			 * The partial chunk is saved in "prev" and will be
@@ -1091,7 +1105,7 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line)
 			break;
 		}
 		default:
-			handle_filter_flowed(mi, line, &prev);
+			handle_filter_flowed(mi, line, &prev, &have_quoted_cr);
 		}
 
 		if (mi->input_error)
@@ -1100,6 +1114,7 @@ static void handle_body(struct mailinfo *mi, struct strbuf *line)
 
 	if (prev.len)
 		handle_filter(mi, &prev);
+	summarize_quoted_cr(mi, have_quoted_cr);
 
 	flush_inbody_header_accum(mi);
 
diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh
index 147e616533..d8fdda6bea 100755
--- a/t/t5100-mailinfo.sh
+++ b/t/t5100-mailinfo.sh
@@ -228,4 +228,19 @@ test_expect_success 'mailinfo handles unusual header whitespace' '
 	test_cmp expect actual
 '
 
+check_quoted_cr_mail() {
+	git mailinfo -u "$@" quoted-cr-msg quoted-cr-patch \
+		<"$DATA/quoted-cr.mbox" >quoted-cr-info 2>quoted-cr-err &&
+	test_cmp "expect-cr-msg" quoted-cr-msg &&
+	test_cmp "expect-cr-patch" quoted-cr-patch &&
+	test_cmp "$DATA/quoted-cr-info" quoted-cr-info
+}
+
+test_expect_success 'mailinfo warn CR in base64 encoded email' '
+	sed "s/%%/$(printf \\015)/" "$DATA/quoted-cr-msg" >expect-cr-msg &&
+	sed "s/%%/$(printf \\015)/" "$DATA/quoted-cr-patch" >expect-cr-patch &&
+	check_quoted_cr_mail &&
+	grep "quoted CR detected" quoted-cr-err
+'
+
 test_done
diff --git a/t/t5100/quoted-cr-info b/t/t5100/quoted-cr-info
new file mode 100644
index 0000000000..dab2228b70
--- /dev/null
+++ b/t/t5100/quoted-cr-info
@@ -0,0 +1,5 @@
+Author: A U Thor
+Email: mail@example.com
+Subject: sample
+Date: Mon, 3 Aug 2020 22:40:55 +0700
+
diff --git a/t/t5100/quoted-cr-msg b/t/t5100/quoted-cr-msg
new file mode 100644
index 0000000000..a148bc7e26
--- /dev/null
+++ b/t/t5100/quoted-cr-msg
@@ -0,0 +1,2 @@
+On different distro, pytest is suffixed with different patterns.%%
+%%
diff --git a/t/t5100/quoted-cr-patch b/t/t5100/quoted-cr-patch
new file mode 100644
index 0000000000..580e2bddb8
--- /dev/null
+++ b/t/t5100/quoted-cr-patch
@@ -0,0 +1,22 @@
+---%%
+ configure | 2 +-%%
+ 1 file changed, 1 insertion(+), 1 deletion(-)%%
+%%
+diff --git a/configure b/configure%%
+index db3538b3..f7c1c095 100755%%
+--- a/configure%%
++++ b/configure%%
+@@ -814,7 +814,7 @@ if [ $have_python3 -eq 1 ]; then%%
+     printf "Checking for python3 pytest (>= 3.0)... "%%
+     conf=$(mktemp)%%
+     printf "[pytest]\nminversion=3.0\n" > $conf%%
+-    if pytest-3 -c $conf --version >/dev/null 2>&1; then%%
++    if "$python" -m pytest -c $conf --version >/dev/null 2>&1; then%%
+         printf "Yes.\n"%%
+         have_python3_pytest=1%%
+     else%%
+-- %%
+2.28.0%%
+_______________________________________________
+example mailing list -- list@example.org
+To unsubscribe send an email to list-leave@example.org
diff --git a/t/t5100/quoted-cr.mbox b/t/t5100/quoted-cr.mbox
new file mode 100644
index 0000000000..6ea9806a6b
--- /dev/null
+++ b/t/t5100/quoted-cr.mbox
@@ -0,0 +1,22 @@
+From: A U Thor <mail@example.com>
+To: list@example.org
+Subject: [PATCH v2] sample
+Date: Mon, 3 Aug 2020 22:40:55 +0700
+Message-Id: <msg-id@example.com>
+Content-Type: text/plain; charset="utf-8"
+Content-Transfer-Encoding: base64
+
+T24gZGlmZmVyZW50IGRpc3RybywgcHl0ZXN0IGlzIHN1ZmZpeGVkIHdpdGggZGlmZmVyZW50IHBh
+dHRlcm5zLg0KDQotLS0NCiBjb25maWd1cmUgfCAyICstDQogMSBmaWxlIGNoYW5nZWQsIDEgaW5z
+ZXJ0aW9uKCspLCAxIGRlbGV0aW9uKC0pDQoNCmRpZmYgLS1naXQgYS9jb25maWd1cmUgYi9jb25m
+aWd1cmUNCmluZGV4IGRiMzUzOGIzLi5mN2MxYzA5NSAxMDA3NTUNCi0tLSBhL2NvbmZpZ3VyZQ0K
+KysrIGIvY29uZmlndXJlDQpAQCAtODE0LDcgKzgxNCw3IEBAIGlmIFsgJGhhdmVfcHl0aG9uMyAt
+ZXEgMSBdOyB0aGVuDQogICAgIHByaW50ZiAiQ2hlY2tpbmcgZm9yIHB5dGhvbjMgcHl0ZXN0ICg+
+PSAzLjApLi4uICINCiAgICAgY29uZj0kKG1rdGVtcCkNCiAgICAgcHJpbnRmICJbcHl0ZXN0XVxu
+bWludmVyc2lvbj0zLjBcbiIgPiAkY29uZg0KLSAgICBpZiBweXRlc3QtMyAtYyAkY29uZiAtLXZl
+cnNpb24gPi9kZXYvbnVsbCAyPiYxOyB0aGVuDQorICAgIGlmICIkcHl0aG9uIiAtbSBweXRlc3Qg
+LWMgJGNvbmYgLS12ZXJzaW9uID4vZGV2L251bGwgMj4mMTsgdGhlbg0KICAgICAgICAgcHJpbnRm
+ICJZZXMuXG4iDQogICAgICAgICBoYXZlX3B5dGhvbjNfcHl0ZXN0PTENCiAgICAgZWxzZQ0KLS0g
+DQoyLjI4LjANCl9fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f
+CmV4YW1wbGUgbWFpbGluZyBsaXN0IC0tIGxpc3RAZXhhbXBsZS5vcmcKVG8gdW5zdWJzY3JpYmUg
+c2VuZCBhbiBlbWFpbCB0byBsaXN0LWxlYXZlQGV4YW1wbGUub3JnCg==
When SMTP servers receive an 8-bit email message, possibly with only
LF as line ending, some of them decide to change said LF to CRLF.

Some mailing list software, on receiving an 8-bit email message,
decides to encode such a message in base64 or quoted-printable.

If an email is transferred through the above mail servers, then
distributed by such mailing list software, the recipients will receive
an email containing a patch mangled with CRLF encoded inside another
encoding.  Thus, such CR cannot be dropped by mailsplit.  Hence, the
mailed patch cannot be applied cleanly.  Such accidents have been
observed in the wild [1].

Let's give our users some warnings if such CR is found.

[1]: https://nmbug.notmuchmail.org/nmweb/show/m2lf9ejegj.fsf%40guru.guru-group.fi

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 mailinfo.c              | 21 ++++++++++++++++++---
 t/t5100-mailinfo.sh     | 15 +++++++++++++++
 t/t5100/quoted-cr-info  |  5 +++++
 t/t5100/quoted-cr-msg   |  2 ++
 t/t5100/quoted-cr-patch | 22 ++++++++++++++++++++++
 t/t5100/quoted-cr.mbox  | 22 ++++++++++++++++++++++
 6 files changed, 84 insertions(+), 3 deletions(-)
 create mode 100644 t/t5100/quoted-cr-info
 create mode 100644 t/t5100/quoted-cr-msg
 create mode 100644 t/t5100/quoted-cr-patch
 create mode 100644 t/t5100/quoted-cr.mbox
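The detection this patch adds to handle_filter_flowed() boils down to one test on the tail of an already decoded line. A minimal stand-alone sketch of that test follows; the helper name `ends_with_crlf()` is hypothetical (the patch inlines the comparison on `line->buf` and `line->len` instead), and it takes a plain buffer and length so that it compiles without git's strbuf API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if the decoded line buf[0..len) ends with CR LF, else 0.
 * The length check comes first so that lines shorter than two bytes
 * are never indexed out of bounds.
 */
static int ends_with_crlf(const char *buf, size_t len)
{
	assert(buf != NULL);
	return len >= 2 &&
	       buf[len - 2] == '\r' &&
	       buf[len - 1] == '\n';
}
```

For the test data in this patch, a decoded line such as "---\r\n" would make the helper return 1, whereas the LF-only list footer lines at the end of the base64 payload would make it return 0.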