Message ID | 20231205161841.2702925-1-edumazet@google.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 3d501dd326fb1c73f1b8206d4c6e1d7b15c07e27 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] tcp: do not accept ACK of bytes we never sent | expand |
On Tue, Dec 5, 2023 at 8:18 AM Eric Dumazet <edumazet@google.com> wrote: > > This patch is based on a detailed report and ideas from Yepeng Pan > and Christian Rossow. > > ACK seq validation is currently following RFC 5961 5.2 guidelines: > > The ACK value is considered acceptable only if > it is in the range of ((SND.UNA - MAX.SND.WND) <= SEG.ACK <= > SND.NXT). All incoming segments whose ACK value doesn't satisfy the > above condition MUST be discarded and an ACK sent back. It needs to > be noted that RFC 793 on page 72 (fifth check) says: "If the ACK is a > duplicate (SEG.ACK < SND.UNA), it can be ignored. If the ACK > acknowledges something not yet sent (SEG.ACK > SND.NXT) then send an > ACK, drop the segment, and return". The "ignored" above implies that > the processing of the incoming data segment continues, which means > the ACK value is treated as acceptable. This mitigation makes the > ACK check more stringent since any ACK < SND.UNA wouldn't be > accepted, instead only ACKs that are in the range ((SND.UNA - > MAX.SND.WND) <= SEG.ACK <= SND.NXT) get through. Thank you Eric for the fix. It appears the newer RFC https://www.rfc-editor.org/rfc/rfc9293.html also has this issue that needs a revision? > > This can be refined for new (and possibly spoofed) flows, > by not accepting ACK for bytes that were never sent. > > This greatly improves TCP security at a little cost. > > I added a Fixes: tag to make sure this patch will reach stable trees, > even if the 'blamed' patch was adhering to the RFC. > > tp->bytes_acked was added in linux-4.2 > > Following packetdrill test (courtesy of Yepeng Pan) shows > the issue at hand: > > 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 > +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 > +0 bind(3, ..., ...) = 0 > +0 listen(3, 1024) = 0 > > // ---------------- Handshake ------------------- // > > // when window scale is set to 14 the window size can be extended to > // 65535 * (2^14) = 1073725440. Linux would accept an ACK packet > // with ack number in (Server_ISN+1-1073725440. Server_ISN+1) > // ,though this ack number acknowledges some data never > // sent by the server. > > +0 < S 0:0(0) win 65535 <mss 1400,nop,wscale 14> > +0 > S. 0:0(0) ack 1 <...> > +0 < . 1:1(0) ack 1 win 65535 > +0 accept(3, ..., ...) = 4 > > // For the established connection, we send an ACK packet, > // the ack packet uses ack number 1 - 1073725300 + 2^32, > // where 2^32 is used to wrap around. > // Note: we used 1073725300 instead of 1073725440 to avoid possible > // edge cases. > // 1 - 1073725300 + 2^32 = 3221241997 > > // Oops, old kernels happily accept this packet. > +0 < . 1:1001(1000) ack 3221241997 win 65535 > > // After the kernel fix the following will be replaced by a challenge ACK, > // and prior malicious frame would be dropped. > +0 > . 1:1(0) ack 1001 > > Fixes: 354e4aa391ed ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Reported-by: Yepeng Pan <yepeng.pan@cispa.de> > Reported-by: Christian Rossow <rossow@cispa.de> > --- > net/ipv4/tcp_input.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index bcb55d98004c5213f0095613124d5193b15b2793..62cccc2e89ec68b3badae03168f1bfcd2698e0b7 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -3871,8 +3871,12 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag) > * then we can probably ignore it. > */ > if (before(ack, prior_snd_una)) { > + u32 max_window; > + > + /* do not accept ACK for bytes we never sent. */ > + max_window = min_t(u64, tp->max_window, tp->bytes_acked); > /* RFC 5961 5.2 [Blind Data Injection Attack].[Mitigation] */ > - if (before(ack, prior_snd_una - tp->max_window)) { > + if (before(ack, prior_snd_una - max_window)) { > if (!(flag & FLAG_NO_CHALLENGE_ACK)) > tcp_send_challenge_ack(sk); > return -SKB_DROP_REASON_TCP_TOO_OLD_ACK; > -- > 2.43.0.rc2.451.g8631bc7472-goog >
On Tue, Dec 5, 2023 at 5:47 PM Yuchung Cheng <ycheng@google.com> wrote: > Thank you Eric for the fix. It appears the newer RFC > https://www.rfc-editor.org/rfc/rfc9293.html also has this issue that > needs a revision? I do not think RFC 9293 made any refinement about RFC 5961 guidelines. Perhaps Yepeng Pan and Christian Rossow have plans to bring this issue to IETF.
On Tue, Dec 5, 2023 at 11:18 AM Eric Dumazet <edumazet@google.com> wrote: > > This patch is based on a detailed report and ideas from Yepeng Pan > and Christian Rossow. > > ACK seq validation is currently following RFC 5961 5.2 guidelines: > > The ACK value is considered acceptable only if > it is in the range of ((SND.UNA - MAX.SND.WND) <= SEG.ACK <= > SND.NXT). All incoming segments whose ACK value doesn't satisfy the > above condition MUST be discarded and an ACK sent back. It needs to > be noted that RFC 793 on page 72 (fifth check) says: "If the ACK is a > duplicate (SEG.ACK < SND.UNA), it can be ignored. If the ACK > acknowledges something not yet sent (SEG.ACK > SND.NXT) then send an > ACK, drop the segment, and return". The "ignored" above implies that > the processing of the incoming data segment continues, which means > the ACK value is treated as acceptable. This mitigation makes the > ACK check more stringent since any ACK < SND.UNA wouldn't be > accepted, instead only ACKs that are in the range ((SND.UNA - > MAX.SND.WND) <= SEG.ACK <= SND.NXT) get through. > > This can be refined for new (and possibly spoofed) flows, > by not accepting ACK for bytes that were never sent. > > This greatly improves TCP security at a little cost. > > I added a Fixes: tag to make sure this patch will reach stable trees, > even if the 'blamed' patch was adhering to the RFC. > > tp->bytes_acked was added in linux-4.2 > > Following packetdrill test (courtesy of Yepeng Pan) shows > the issue at hand: > > 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 > +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 > +0 bind(3, ..., ...) = 0 > +0 listen(3, 1024) = 0 > > // ---------------- Handshake ------------------- // > > // when window scale is set to 14 the window size can be extended to > // 65535 * (2^14) = 1073725440. Linux would accept an ACK packet > // with ack number in (Server_ISN+1-1073725440. Server_ISN+1) > // ,though this ack number acknowledges some data never > // sent by the server. > > +0 < S 0:0(0) win 65535 <mss 1400,nop,wscale 14> > +0 > S. 0:0(0) ack 1 <...> > +0 < . 1:1(0) ack 1 win 65535 > +0 accept(3, ..., ...) = 4 > > // For the established connection, we send an ACK packet, > // the ack packet uses ack number 1 - 1073725300 + 2^32, > // where 2^32 is used to wrap around. > // Note: we used 1073725300 instead of 1073725440 to avoid possible > // edge cases. > // 1 - 1073725300 + 2^32 = 3221241997 > > // Oops, old kernels happily accept this packet. > +0 < . 1:1001(1000) ack 3221241997 win 65535 > > // After the kernel fix the following will be replaced by a challenge ACK, > // and prior malicious frame would be dropped. > +0 > . 1:1(0) ack 1001 > > Fixes: 354e4aa391ed ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Reported-by: Yepeng Pan <yepeng.pan@cispa.de> > Reported-by: Christian Rossow <rossow@cispa.de> > --- > net/ipv4/tcp_input.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index bcb55d98004c5213f0095613124d5193b15b2793..62cccc2e89ec68b3badae03168f1bfcd2698e0b7 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -3871,8 +3871,12 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag) > * then we can probably ignore it. > */ > if (before(ack, prior_snd_una)) { > + u32 max_window; > + > + /* do not accept ACK for bytes we never sent. */ > + max_window = min_t(u64, tp->max_window, tp->bytes_acked); > /* RFC 5961 5.2 [Blind Data Injection Attack].[Mitigation] */ > - if (before(ack, prior_snd_una - tp->max_window)) { > + if (before(ack, prior_snd_una - max_window)) { > if (!(flag & FLAG_NO_CHALLENGE_ACK)) > tcp_send_challenge_ack(sk); > return -SKB_DROP_REASON_TCP_TOO_OLD_ACK; > -- Thanks, Eric! Acked-by: Neal Cardwell <ncardwell@google.com> neal
> Perhaps Yepeng Pan and Christian Rossow have plans to bring this issue to IETF.
We already brought this up to the IETF TCP WG chairs a few weeks ago.
It's still unclear if the TCP WG wants to work on an Internet Draft that
mitigates "ghost ACKs". We'll follow up with them.
Cheers,
Christian
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Tue, 5 Dec 2023 16:18:41 +0000 you wrote: > This patch is based on a detailed report and ideas from Yepeng Pan > and Christian Rossow. > > ACK seq validation is currently following RFC 5961 5.2 guidelines: > > The ACK value is considered acceptable only if > it is in the range of ((SND.UNA - MAX.SND.WND) <= SEG.ACK <= > SND.NXT). All incoming segments whose ACK value doesn't satisfy the > above condition MUST be discarded and an ACK sent back. It needs to > be noted that RFC 793 on page 72 (fifth check) says: "If the ACK is a > duplicate (SEG.ACK < SND.UNA), it can be ignored. If the ACK > acknowledges something not yet sent (SEG.ACK > SND.NXT) then send an > ACK, drop the segment, and return". The "ignored" above implies that > the processing of the incoming data segment continues, which means > the ACK value is treated as acceptable. This mitigation makes the > ACK check more stringent since any ACK < SND.UNA wouldn't be > accepted, instead only ACKs that are in the range ((SND.UNA - > MAX.SND.WND) <= SEG.ACK <= SND.NXT) get through. > > [...] Here is the summary with links: - [net] tcp: do not accept ACK of bytes we never sent https://git.kernel.org/netdev/net/c/3d501dd326fb You are awesome, thank you!
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index bcb55d98004c5213f0095613124d5193b15b2793..62cccc2e89ec68b3badae03168f1bfcd2698e0b7 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3871,8 +3871,12 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag) * then we can probably ignore it. */ if (before(ack, prior_snd_una)) { + u32 max_window; + + /* do not accept ACK for bytes we never sent. */ + max_window = min_t(u64, tp->max_window, tp->bytes_acked); /* RFC 5961 5.2 [Blind Data Injection Attack].[Mitigation] */ - if (before(ack, prior_snd_una - tp->max_window)) { + if (before(ack, prior_snd_una - max_window)) { if (!(flag & FLAG_NO_CHALLENGE_ACK)) tcp_send_challenge_ack(sk); return -SKB_DROP_REASON_TCP_TOO_OLD_ACK;
This patch is based on a detailed report and ideas from Yepeng Pan and Christian Rossow. ACK seq validation is currently following RFC 5961 5.2 guidelines: The ACK value is considered acceptable only if it is in the range of ((SND.UNA - MAX.SND.WND) <= SEG.ACK <= SND.NXT). All incoming segments whose ACK value doesn't satisfy the above condition MUST be discarded and an ACK sent back. It needs to be noted that RFC 793 on page 72 (fifth check) says: "If the ACK is a duplicate (SEG.ACK < SND.UNA), it can be ignored. If the ACK acknowledges something not yet sent (SEG.ACK > SND.NXT) then send an ACK, drop the segment, and return". The "ignored" above implies that the processing of the incoming data segment continues, which means the ACK value is treated as acceptable. This mitigation makes the ACK check more stringent since any ACK < SND.UNA wouldn't be accepted, instead only ACKs that are in the range ((SND.UNA - MAX.SND.WND) <= SEG.ACK <= SND.NXT) get through. This can be refined for new (and possibly spoofed) flows, by not accepting ACK for bytes that were never sent. This greatly improves TCP security at a little cost. I added a Fixes: tag to make sure this patch will reach stable trees, even if the 'blamed' patch was adhering to the RFC. tp->bytes_acked was added in linux-4.2 Following packetdrill test (courtesy of Yepeng Pan) shows the issue at hand: 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1024) = 0 // ---------------- Handshake ------------------- // // when window scale is set to 14 the window size can be extended to // 65535 * (2^14) = 1073725440. Linux would accept an ACK packet // with ack number in (Server_ISN+1-1073725440. Server_ISN+1) // ,though this ack number acknowledges some data never // sent by the server. +0 < S 0:0(0) win 65535 <mss 1400,nop,wscale 14> +0 > S. 0:0(0) ack 1 <...> +0 < . 1:1(0) ack 1 win 65535 +0 accept(3, ..., ...) = 4 // For the established connection, we send an ACK packet, // the ack packet uses ack number 1 - 1073725300 + 2^32, // where 2^32 is used to wrap around. // Note: we used 1073725300 instead of 1073725440 to avoid possible // edge cases. // 1 - 1073725300 + 2^32 = 3221241997 // Oops, old kernels happily accept this packet. +0 < . 1:1001(1000) ack 3221241997 win 65535 // After the kernel fix the following will be replaced by a challenge ACK, // and prior malicious frame would be dropped. +0 > . 1:1(0) ack 1001 Fixes: 354e4aa391ed ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Yepeng Pan <yepeng.pan@cispa.de> Reported-by: Christian Rossow <rossow@cispa.de> --- net/ipv4/tcp_input.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)