Message ID | 20220319110422.8261-1-zhouzhouyi@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [v2] net:ipv4: send an ack when seg.ack > snd.nxt | expand |
On Sat, Mar 19, 2022 at 4:04 AM <zhouzhouyi@gmail.com> wrote: > > From: Zhouyi Zhou <zhouzhouyi@gmail.com> > > In RFC 793, page 72: "If the ACK acks something not yet sent > (SEG.ACK > SND.NXT) then send an ACK, drop the segment, > and return." > > Fix Linux's behavior according to RFC 793. > > Reported-by: Wei Xu <xuweihf@ustc.edu.cn> > Signed-off-by: Wei Xu <xuweihf@ustc.edu.cn> > Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com> > --- > Thank Florian Westphal for pointing out > the potential duplicated ack bug in patch version 1. I am travelling this week, but I think your patch is not necessary and might actually be bad. Please provide more details of why nobody complained of this until today. Also I doubt you actually fully tested this patch, sending a V2 30 minutes after V1. If yes, please provide a packetdrill test. Thank you. > -- > net/ipv4/tcp_input.c | 21 +++++++++++++-------- > 1 file changed, 13 insertions(+), 8 deletions(-) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index bfe4112e000c..4bbf85d7ea8c 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -3771,11 +3771,13 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag) > goto old_ack; > } > > - /* If the ack includes data we haven't sent yet, discard > - * this segment (RFC793 Section 3.9). > + /* If the ack includes data we haven't sent yet, then send > + * an ack, drop this segment, and return (RFC793 Section 3.9 page 72). > */ > - if (after(ack, tp->snd_nxt)) > - return -1; > + if (after(ack, tp->snd_nxt)) { > + tcp_send_ack(sk); > + return -2; > + } > > if (after(ack, prior_snd_una)) { > flag |= FLAG_SND_UNA_ADVANCED; > @@ -6385,6 +6387,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) > struct request_sock *req; > int queued = 0; > bool acceptable; > + int ret; > > switch (sk->sk_state) { > case TCP_CLOSE: > @@ -6451,14 +6454,16 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) > return 0; > > /* step 5: check the ACK field */ > - acceptable = tcp_ack(sk, skb, FLAG_SLOWPATH | > - FLAG_UPDATE_TS_RECENT | > - FLAG_NO_CHALLENGE_ACK) > 0; > + ret = tcp_ack(sk, skb, FLAG_SLOWPATH | > + FLAG_UPDATE_TS_RECENT | > + FLAG_NO_CHALLENGE_ACK); > + acceptable = ret > 0; > > if (!acceptable) { > if (sk->sk_state == TCP_SYN_RECV) > return 1; /* send one RST */ > - tcp_send_challenge_ack(sk); > + if (ret > -2) > + tcp_send_challenge_ack(sk); > goto discard; > } > switch (sk->sk_state) { > -- > 2.25.1 >
Thanks for reviewing my patch On Sat, Mar 19, 2022 at 7:14 PM Eric Dumazet <edumazet@google.com> wrote: > > On Sat, Mar 19, 2022 at 4:04 AM <zhouzhouyi@gmail.com> wrote: > > > > From: Zhouyi Zhou <zhouzhouyi@gmail.com> > > > > In RFC 793, page 72: "If the ACK acks something not yet sent > > (SEG.ACK > SND.NXT) then send an ACK, drop the segment, > > and return." > > > > Fix Linux's behavior according to RFC 793. > > > > Reported-by: Wei Xu <xuweihf@ustc.edu.cn> > > Signed-off-by: Wei Xu <xuweihf@ustc.edu.cn> > > Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com> > > --- > > Thank Florian Westphal for pointing out > > the potential duplicated ack bug in patch version 1. > > I am travelling this week, but I think your patch is not necessary and > might actually be bad. > > Please provide more details of why nobody complained of this until today. > > Also I doubt you actually fully tested this patch, sending a V2 30 > minutes after V1. > > If yes, please provide a packetdrill test. I am a beginner to TCP, although I have submitted once a patch to netdev in 2013 (aaa0c23cb90141309f5076ba5e3bfbd39544b985), this is first time I learned packetdrill test. I think I should do the packetdrill test in the coming days, and provide more details of how this (RFC793 related) can happen. Apologize sincerely in advance if I have made noise. Thank you for your time Sincerely Zhouyi > > Thank you. > > > -- > > net/ipv4/tcp_input.c | 21 +++++++++++++-------- > > 1 file changed, 13 insertions(+), 8 deletions(-) > > > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > > index bfe4112e000c..4bbf85d7ea8c 100644 > > --- a/net/ipv4/tcp_input.c > > +++ b/net/ipv4/tcp_input.c > > @@ -3771,11 +3771,13 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag) > > goto old_ack; > > } > > > > - /* If the ack includes data we haven't sent yet, discard > > - * this segment (RFC793 Section 3.9). > > + /* If the ack includes data we haven't sent yet, then send > > + * an ack, drop this segment, and return (RFC793 Section 3.9 page 72). > > */ > > - if (after(ack, tp->snd_nxt)) > > - return -1; > > + if (after(ack, tp->snd_nxt)) { > > + tcp_send_ack(sk); > > + return -2; > > + } > > > > if (after(ack, prior_snd_una)) { > > flag |= FLAG_SND_UNA_ADVANCED; > > @@ -6385,6 +6387,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) > > struct request_sock *req; > > int queued = 0; > > bool acceptable; > > + int ret; > > > > switch (sk->sk_state) { > > case TCP_CLOSE: > > @@ -6451,14 +6454,16 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) > > return 0; > > > > /* step 5: check the ACK field */ > > - acceptable = tcp_ack(sk, skb, FLAG_SLOWPATH | > > - FLAG_UPDATE_TS_RECENT | > > - FLAG_NO_CHALLENGE_ACK) > 0; > > + ret = tcp_ack(sk, skb, FLAG_SLOWPATH | > > + FLAG_UPDATE_TS_RECENT | > > + FLAG_NO_CHALLENGE_ACK); > > + acceptable = ret > 0; > > > > if (!acceptable) { > > if (sk->sk_state == TCP_SYN_RECV) > > return 1; /* send one RST */ > > - tcp_send_challenge_ack(sk); > > + if (ret > -2) > > + tcp_send_challenge_ack(sk); > > goto discard; > > } > > switch (sk->sk_state) { > > -- > > 2.25.1 > >
On Sat, Mar 19, 2022 at 7:34 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > Thanks for reviewing my patch > > On Sat, Mar 19, 2022 at 7:14 PM Eric Dumazet <edumazet@google.com> wrote: > > > > On Sat, Mar 19, 2022 at 4:04 AM <zhouzhouyi@gmail.com> wrote: > > > > > > From: Zhouyi Zhou <zhouzhouyi@gmail.com> > > > > > > In RFC 793, page 72: "If the ACK acks something not yet sent > > > (SEG.ACK > SND.NXT) then send an ACK, drop the segment, > > > and return." > > > > > > Fix Linux's behavior according to RFC 793. > > > > > > Reported-by: Wei Xu <xuweihf@ustc.edu.cn> > > > Signed-off-by: Wei Xu <xuweihf@ustc.edu.cn> > > > Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com> > > > --- > > > Thank Florian Westphal for pointing out > > > the potential duplicated ack bug in patch version 1. > > > > I am travelling this week, but I think your patch is not necessary and > > might actually be bad. > > > > Please provide more details of why nobody complained of this until today. > > > > Also I doubt you actually fully tested this patch, sending a V2 30 > > minutes after V1. > > > > If yes, please provide a packetdrill test. > I am a beginner to TCP, although I have submitted once a patch to > netdev in 2013 (aaa0c23cb90141309f5076ba5e3bfbd39544b985), this is > first time I learned packetdrill test. > I think I should do the packetdrill test in the coming days, and > provide more details of how this (RFC793 related) can happen. In addition to a packetdrill test and a more detailed analysis of how this can happen, and the implications, I think there are at least a few other issues that need to be considered: (1) AFAICT, adding an unconditional ACK if (after(ack, tp->snd_nxt)) seems to open the potential for attackers to cause DoS attacks with something like the following: (a) attacker injects one data packet in the A->B direction and one data packet in the B->A direction (b) endpoint A sends an ACK for the forged data sent to it, which will have an ACK beyond B's snd_nxt (c) endpoint B sends an ACK for the forged data sent to it, which will have an ACK beyond A's snd_nxt (d) endpoint B receives the ACK sent by A, causing B to send another ACK beyond A's snd_nxt (e) endpoint A receives the ACK sent by B, causing A to send another ACK beyond B's snd_nxt (f) repeat (d) and (e) ad infinitum So AFAICT an attacker could send two data packets with 1 byte of data and cause the two endpoints to use up an unbounded amount of CPU and bandwidth sending ACKs in an "infinite loop". To avoid this "infinite loop" of packets, if we really need to add an ACK in this case then the code should use the tcp_oow_rate_limited() helper to ensure that such ACKs are rate-limited. For more context on tcp_oow_rate_limited(), see: f06535c599354 Merge branch 'tcp_ack_loops' 4fb17a6091674 tcp: mitigate ACK loops for connections as tcp_timewait_sock f2b2c582e8242 tcp: mitigate ACK loops for connections as tcp_sock a9b2c06dbef48 tcp: mitigate ACK loops for connections as tcp_request_sock 032ee4236954e tcp: helpers to mitigate ACK loops by rate-limiting out-of-window dupacks Note that f06535c599354 in particular mentions the case discussed in this patch: (2) RFC 793 (section 3.9, page 72) says: "If the ACK acknowledges something not yet sent (SEG.ACK > SND.NXT) then send an ACK". (2) Please consider the potential that adding a new ACK in this scenario may introduce new, unanticipated side channels. For more on side channels, see: https://lwn.net/Articles/696868/ The TCP "challenge ACK" side channel Principled Unearthing of TCP Side Channel Vulnerabilities https://dl.acm.org/doi/10.1145/3319535.3354250 best regards, neal
Thank Neil and Eric for your valuable advice! I will do the test and analysis. Please forgive my hasty reply because it will take me some time to fully understand the email. Also please give me about a month to accomplish the test and analysis. On Sat, Mar 19, 2022 at 9:57 PM Neal Cardwell <ncardwell@google.com> wrote: > > On Sat, Mar 19, 2022 at 7:34 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > > > Thanks for reviewing my patch > > > > On Sat, Mar 19, 2022 at 7:14 PM Eric Dumazet <edumazet@google.com> wrote: > > > > > > On Sat, Mar 19, 2022 at 4:04 AM <zhouzhouyi@gmail.com> wrote: > > > > > > > > From: Zhouyi Zhou <zhouzhouyi@gmail.com> > > > > > > > > In RFC 793, page 72: "If the ACK acks something not yet sent > > > > (SEG.ACK > SND.NXT) then send an ACK, drop the segment, > > > > and return." > > > > > > > > Fix Linux's behavior according to RFC 793. > > > > > > > > Reported-by: Wei Xu <xuweihf@ustc.edu.cn> > > > > Signed-off-by: Wei Xu <xuweihf@ustc.edu.cn> > > > > Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com> > > > > --- > > > > Thank Florian Westphal for pointing out > > > > the potential duplicated ack bug in patch version 1. > > > > > > I am travelling this week, but I think your patch is not necessary and > > > might actually be bad. > > > > > > Please provide more details of why nobody complained of this until today. > > > > > > Also I doubt you actually fully tested this patch, sending a V2 30 > > > minutes after V1. > > > > > > If yes, please provide a packetdrill test. > > I am a beginner to TCP, although I have submitted once a patch to > > netdev in 2013 (aaa0c23cb90141309f5076ba5e3bfbd39544b985), this is > > first time I learned packetdrill test. > > I think I should do the packetdrill test in the coming days, and > > provide more details of how this (RFC793 related) can happen. > > In addition to a packetdrill test and a more detailed analysis of how > this can happen, and the implications, I think there are at least a > few other issues that need to be considered: > > (1) AFAICT, adding an unconditional ACK if (after(ack, tp->snd_nxt)) > seems to open the potential for attackers to cause DoS attacks with > something like the following: > > (a) attacker injects one data packet in the A->B direction and one > data packet in the B->A direction > > (b) endpoint A sends an ACK for the forged data sent to it, which > will have an ACK beyond B's snd_nxt > > (c) endpoint B sends an ACK for the forged data sent to it, which > will have an ACK beyond A's snd_nxt > > (d) endpoint B receives the ACK sent by A, causing B to send another > ACK beyond A's snd_nxt > > (e) endpoint A receives the ACK sent by B, causing A to send another > ACK beyond B's snd_nxt > > (f) repeat (d) and (e) ad infinitum I will make a full understanding of the above scenery in the coming days. > > So AFAICT an attacker could send two data packets with 1 byte of data > and cause the two endpoints to use up an unbounded amount of CPU and > bandwidth sending ACKs in an "infinite loop". > > To avoid this "infinite loop" of packets, if we really need to add an > ACK in this case then the code should use the tcp_oow_rate_limited() > helper to ensure that such ACKs are rate-limited. For more context on > tcp_oow_rate_limited(), see: > > f06535c599354 Merge branch 'tcp_ack_loops' > 4fb17a6091674 tcp: mitigate ACK loops for connections as tcp_timewait_sock > f2b2c582e8242 tcp: mitigate ACK loops for connections as tcp_sock > a9b2c06dbef48 tcp: mitigate ACK loops for connections as tcp_request_sock > 032ee4236954e tcp: helpers to mitigate ACK loops by rate-limiting > out-of-window dupacks > > Note that f06535c599354 in particular mentions the case discussed in this patch: > > (2) RFC 793 (section 3.9, page 72) says: "If the ACK acknowledges > something not yet sent (SEG.ACK > SND.NXT) then send an ACK". > > (2) Please consider the potential that adding a new ACK in this > scenario may introduce new, unanticipated side channels. For more on > side channels, see: > > https://lwn.net/Articles/696868/ > The TCP "challenge ACK" side channel I will read the article in the days following. > > Principled Unearthing of TCP Side Channel Vulnerabilities > https://dl.acm.org/doi/10.1145/3319535.3354250 I will read the paper too. > > best regards, > neal Best Regards Zhouyi
Hi Thank you for reviewing my patch and guiding me to do the research ;-) 1. Abstract I performed packetdrill test and security analysis of the patch. I think we shouldn't apply my patch to the Linux kernel now because of security challenges. 2. Introduction RFC 793 In RFC 793, page 72: "If the ACK acks something not yet sent (SEG.ACK > SND.NXT) then send an ACK, drop the segment, and return." So, I tried a patch to implement the above. 3. Packetdrill test result Before apply my patch: http://154.220.3.115/May062022/run_all.orig.log After apply my patch: http://154.220.3.115/May062022/run_all.new.log Comparing these two logs, I found the difference is caused by a) the surplus ack from my patch: invalid_ack.pkt (ipv4-mapped-v6 ipv4 ipv6) b) tcp busy time checking fail: tcp-info-sndbuf-limited.pkt (ipv4-mapped-v6 ipv4 ipv6) c) packet time error: close-remote-fin-then-close.pkt (ipv6) I conclude that the patch does not introduce a serious problem to the TCP stack from packetdrill's point of view. 4. Possible infinite loop introduced by packet injection (a) attacker injects one data packet in the A->B direction and one data packet in the B->A direction (b) endpoint A sends an ACK for the forged data sent to it, which will have an ACK beyond B's snd_nxt (c) endpoint B sends an ACK for the forged data sent to it, which will have an ACK beyond A's snd_nxt (d) endpoint B receives the ACK sent by A, causing B to send another ACK beyond A's snd_nxt (e) endpoint A receives the ACK sent by B, causing A to send another ACK beyond B's snd_nxt (f) repeat (d) and (e) ad infinitum If we apply the patch, nothing can stop this infinite loop, we can only limit the rate of new ack. To note that, off path attack can achieve above attack by means of TCP off path exploits [1] [2] 5. Possible new, unanticipated side channels To mitigate the infinite loop in section 4, we should limit the rate of the ack. Limiting the rate of global ack has the potential of introducing new, unanticipated side channels. 6. Conclusion I think we shouldn't apply my patch to the Linux kernel now because of security challenges. [1] "Off-Path TCP Exploits: Global Rate Limit Considered Dangerous" https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/cao [2] "Principled Unearthing of TCP Side Channel Vulnerabilities" https://dl.acm.org/doi/10.1145/3319535.3354250 Thanks Zhouyi On Sat, Mar 19, 2022 at 11:31 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > Thank Neil and Eric for your valuable advice! > > I will do the test and analysis. Please forgive my hasty reply because > it will take me some time to fully understand the email. Also please > give me about a month to accomplish the test and analysis. > On Sat, Mar 19, 2022 at 9:57 PM Neal Cardwell <ncardwell@google.com> wrote: > > > > On Sat, Mar 19, 2022 at 7:34 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > > > > > Thanks for reviewing my patch > > > > > > On Sat, Mar 19, 2022 at 7:14 PM Eric Dumazet <edumazet@google.com> wrote: > > > > > > > > On Sat, Mar 19, 2022 at 4:04 AM <zhouzhouyi@gmail.com> wrote: > > > > > > > > > > From: Zhouyi Zhou <zhouzhouyi@gmail.com> > > > > > > > > > > In RFC 793, page 72: "If the ACK acks something not yet sent > > > > > (SEG.ACK > SND.NXT) then send an ACK, drop the segment, > > > > > and return." > > > > > > > > > > Fix Linux's behavior according to RFC 793. > > > > > > > > > > Reported-by: Wei Xu <xuweihf@ustc.edu.cn> > > > > > Signed-off-by: Wei Xu <xuweihf@ustc.edu.cn> > > > > > Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com> > > > > > --- > > > > > Thank Florian Westphal for pointing out > > > > > the potential duplicated ack bug in patch version 1. > > > > > > > > I am travelling this week, but I think your patch is not necessary and > > > > might actually be bad. > > > > > > > > Please provide more details of why nobody complained of this until today. > > > > > > > > Also I doubt you actually fully tested this patch, sending a V2 30 > > > > minutes after V1. > > > > > > > > If yes, please provide a packetdrill test. > > > I am a beginner to TCP, although I have submitted once a patch to > > > netdev in 2013 (aaa0c23cb90141309f5076ba5e3bfbd39544b985), this is > > > first time I learned packetdrill test. > > > I think I should do the packetdrill test in the coming days, and > > > provide more details of how this (RFC793 related) can happen. > > > > In addition to a packetdrill test and a more detailed analysis of how > > this can happen, and the implications, I think there are at least a > > few other issues that need to be considered: > > > > (1) AFAICT, adding an unconditional ACK if (after(ack, tp->snd_nxt)) > > seems to open the potential for attackers to cause DoS attacks with > > something like the following: > > > > (a) attacker injects one data packet in the A->B direction and one > > data packet in the B->A direction > > > > (b) endpoint A sends an ACK for the forged data sent to it, which > > will have an ACK beyond B's snd_nxt > > > > (c) endpoint B sends an ACK for the forged data sent to it, which > > will have an ACK beyond A's snd_nxt > > > > (d) endpoint B receives the ACK sent by A, causing B to send another > > ACK beyond A's snd_nxt > > > > (e) endpoint A receives the ACK sent by B, causing A to send another > > ACK beyond B's snd_nxt > > > > (f) repeat (d) and (e) ad infinitum > I will make a full understanding of the above scenery in the coming days. > > > > So AFAICT an attacker could send two data packets with 1 byte of data > > and cause the two endpoints to use up an unbounded amount of CPU and > > bandwidth sending ACKs in an "infinite loop". > > > > To avoid this "infinite loop" of packets, if we really need to add an > > ACK in this case then the code should use the tcp_oow_rate_limited() > > helper to ensure that such ACKs are rate-limited. For more context on > > tcp_oow_rate_limited(), see: > > > > f06535c599354 Merge branch 'tcp_ack_loops' > > 4fb17a6091674 tcp: mitigate ACK loops for connections as tcp_timewait_sock > > f2b2c582e8242 tcp: mitigate ACK loops for connections as tcp_sock > > a9b2c06dbef48 tcp: mitigate ACK loops for connections as tcp_request_sock > > 032ee4236954e tcp: helpers to mitigate ACK loops by rate-limiting > > out-of-window dupacks > > > > Note that f06535c599354 in particular mentions the case discussed in this patch: > > > > (2) RFC 793 (section 3.9, page 72) says: "If the ACK acknowledges > > something not yet sent (SEG.ACK > SND.NXT) then send an ACK". > > > > (2) Please consider the potential that adding a new ACK in this > > scenario may introduce new, unanticipated side channels. For more on > > side channels, see: > > > > https://lwn.net/Articles/696868/ > > The TCP "challenge ACK" side channel > I will read the article in the days following. > > > > Principled Unearthing of TCP Side Channel Vulnerabilities > > https://dl.acm.org/doi/10.1145/3319535.3354250 > I will read the paper too. > > > > best regards, > > neal > Best Regards > Zhouyi
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index bfe4112e000c..4bbf85d7ea8c 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3771,11 +3771,13 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag) goto old_ack; } - /* If the ack includes data we haven't sent yet, discard - * this segment (RFC793 Section 3.9). + /* If the ack includes data we haven't sent yet, then send + * an ack, drop this segment, and return (RFC793 Section 3.9 page 72). */ - if (after(ack, tp->snd_nxt)) - return -1; + if (after(ack, tp->snd_nxt)) { + tcp_send_ack(sk); + return -2; + } if (after(ack, prior_snd_una)) { flag |= FLAG_SND_UNA_ADVANCED; @@ -6385,6 +6387,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) struct request_sock *req; int queued = 0; bool acceptable; + int ret; switch (sk->sk_state) { case TCP_CLOSE: @@ -6451,14 +6454,16 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) return 0; /* step 5: check the ACK field */ - acceptable = tcp_ack(sk, skb, FLAG_SLOWPATH | - FLAG_UPDATE_TS_RECENT | - FLAG_NO_CHALLENGE_ACK) > 0; + ret = tcp_ack(sk, skb, FLAG_SLOWPATH | + FLAG_UPDATE_TS_RECENT | + FLAG_NO_CHALLENGE_ACK); + acceptable = ret > 0; if (!acceptable) { if (sk->sk_state == TCP_SYN_RECV) return 1; /* send one RST */ - tcp_send_challenge_ack(sk); + if (ret > -2) + tcp_send_challenge_ack(sk); goto discard; } switch (sk->sk_state) {