Message ID | 20250325110325.51958-1-jiayuan.chen@linux.dev (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next,v2] tcp: Support skb PAWS drop reason when TIME-WAIT | expand |
On Tue, Mar 25, 2025 at 12:03 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote: > > PAWS is a long-standing issue, especially when there are upstream network > devices, making it more prone to occur. > > Currently, packet loss statistics for PAWS can only be viewed through MIB, > which is a global metric and cannot be precisely obtained through tracing > to get the specific 4-tuple of the dropped packet. In the past, we had to > use kprobe ret to retrieve relevant skb information from > tcp_timewait_state_process(). > > We add a drop_reason pointer, similar to what previous commit does: > commit e34100c2ecbb ("tcp: add a drop_reason pointer to tcp_check_req()") > > This commit addresses the PAWSESTABREJECTED case and also sets the > corresponding drop reason. > > We use 'pwru' to test. > > Before this commit: > '''' > ./pwru 'port 9999' > 2025/03/24 13:46:03 Listening for events.. > TUPLE FUNC > 172.31.75.115:12345->172.31.75.114:9999(tcp) sk_skb_reason_drop(SKB_DROP_REASON_NOT_SPECIFIED) > ''' > > After this commit: > ''' > ./pwru 'port 9999' > 2025/03/24 16:06:59 Listening for events.. > TUPLE FUNC > 172.31.75.115:12345->172.31.75.114:9999(tcp) sk_skb_reason_drop(SKB_DROP_REASON_TCP_RFC7323_PAWS) > ''' > > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> > --- > My apologize. > I struggled for a long time to get packetdrill to fix the client port, but > ultimately failed to do so, which is why I couldn't provide a packetdrill > script. > Instead, I wrote my own program to trigger PAWS, which can be found at > https://github.com/mrpre/nettrigger/tree/main > --- > include/net/tcp.h | 3 ++- > net/ipv4/tcp_ipv4.c | 2 +- > net/ipv4/tcp_minisocks.c | 7 +++++-- > net/ipv6/tcp_ipv6.c | 2 +- > 4 files changed, 9 insertions(+), 5 deletions(-) > > diff --git a/include/net/tcp.h b/include/net/tcp.h > index f8efe56bbccb..e1574e804530 100644 > --- a/include/net/tcp.h > +++ b/include/net/tcp.h > @@ -427,7 +427,8 @@ enum tcp_tw_status { > enum tcp_tw_status tcp_timewait_state_process(struct inet_timewait_sock *tw, > struct sk_buff *skb, > const struct tcphdr *th, > - u32 *tw_isn); > + u32 *tw_isn, > + enum skb_drop_reason *drop_reason); > struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, > struct request_sock *req, bool fastopen, > bool *lost_race, enum skb_drop_reason *drop_reason); > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > index 1cd0938d47e0..a9dde473a23f 100644 > --- a/net/ipv4/tcp_ipv4.c > +++ b/net/ipv4/tcp_ipv4.c > @@ -2417,7 +2417,7 @@ int tcp_v4_rcv(struct sk_buff *skb) > goto csum_error; > } > > - tw_status = tcp_timewait_state_process(inet_twsk(sk), skb, th, &isn); > + tw_status = tcp_timewait_state_process(inet_twsk(sk), skb, th, &isn, &drop_reason); > switch (tw_status) { > case TCP_TW_SYN: { > struct sock *sk2 = inet_lookup_listener(net, > diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c > index fb9349be36b8..d16dfd41397e 100644 > --- a/net/ipv4/tcp_minisocks.c > +++ b/net/ipv4/tcp_minisocks.c > @@ -97,7 +97,8 @@ static void twsk_rcv_nxt_update(struct tcp_timewait_sock *tcptw, u32 seq, > */ > enum tcp_tw_status > tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, > - const struct tcphdr *th, u32 *tw_isn) > + const struct tcphdr *th, u32 *tw_isn, > + enum skb_drop_reason *drop_reason) > { > struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); > u32 rcv_nxt = READ_ONCE(tcptw->tw_rcv_nxt); > @@ -245,8 +246,10 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, > return TCP_TW_SYN; > } > > - if (paws_reject) > + if (paws_reject) { > + *drop_reason = SKB_DROP_REASON_TCP_RFC7323_PAWS; > __NET_INC_STATS(twsk_net(tw), LINUX_MIB_PAWSESTABREJECTED); I think we should add a new SNMP value and drop reason for TW sockets. SNMP_MIB_ITEM("PAWSTimewait", LINUX_MIB_PAWSTIMEWAIT), and SKB_DROP_REASON_TCP_RFC7323_TW_PAWS ?
March 25, 2025 at 19:29, "Eric Dumazet" <edumazet@google.com> wrote: > > --- > > > > include/net/tcp.h | 3 ++- > > > > net/ipv4/tcp_ipv4.c | 2 +- > > > > net/ipv4/tcp_minisocks.c | 7 +++++-- > > > > net/ipv6/tcp_ipv6.c | 2 +- > > > > 4 files changed, 9 insertions(+), 5 deletions(-) > > > > diff --git a/include/net/tcp.h b/include/net/tcp.h > > > > index f8efe56bbccb..e1574e804530 100644 > > > > --- a/include/net/tcp.h > > > > +++ b/include/net/tcp.h > > > > @@ -427,7 +427,8 @@ enum tcp_tw_status { > > > > enum tcp_tw_status tcp_timewait_state_process(struct inet_timewait_sock *tw, > > > > struct sk_buff *skb, > > > > const struct tcphdr *th, > > > > - u32 *tw_isn); > > > > + u32 *tw_isn, > > > > + enum skb_drop_reason *drop_reason); > > > > struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, > > > > struct request_sock *req, bool fastopen, > > > > bool *lost_race, enum skb_drop_reason *drop_reason); > > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > > > > index 1cd0938d47e0..a9dde473a23f 100644 > > > > --- a/net/ipv4/tcp_ipv4.c > > > > +++ b/net/ipv4/tcp_ipv4.c > > > > @@ -2417,7 +2417,7 @@ int tcp_v4_rcv(struct sk_buff *skb) > > > > goto csum_error; > > > > } > > > > - tw_status = tcp_timewait_state_process(inet_twsk(sk), skb, th, &isn); > > > > + tw_status = tcp_timewait_state_process(inet_twsk(sk), skb, th, &isn, &drop_reason); > > > > switch (tw_status) { > > > > case TCP_TW_SYN: { > > > > struct sock *sk2 = inet_lookup_listener(net, > > > > diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c > > > > index fb9349be36b8..d16dfd41397e 100644 > > > > --- a/net/ipv4/tcp_minisocks.c > > > > +++ b/net/ipv4/tcp_minisocks.c > > > > @@ -97,7 +97,8 @@ static void twsk_rcv_nxt_update(struct tcp_timewait_sock *tcptw, u32 seq, > > > > */ > > > > enum tcp_tw_status > > > > tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, > > > > - const struct tcphdr *th, u32 *tw_isn) > > > > + const struct tcphdr *th, u32 *tw_isn, > > > > + enum skb_drop_reason *drop_reason) > > > > { > > > > struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); > > > > u32 rcv_nxt = READ_ONCE(tcptw->tw_rcv_nxt); > > > > @@ -245,8 +246,10 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, > > > > return TCP_TW_SYN; > > > > } > > > > - if (paws_reject) > > > > + if (paws_reject) { > > > > + *drop_reason = SKB_DROP_REASON_TCP_RFC7323_PAWS; > > > > __NET_INC_STATS(twsk_net(tw), LINUX_MIB_PAWSESTABREJECTED); > > > > I think we should add a new SNMP value and drop reason for TW sockets. > > SNMP_MIB_ITEM("PAWSTimewait", LINUX_MIB_PAWSTIMEWAIT), > > and SKB_DROP_REASON_TCP_RFC7323_TW_PAWS ? > That makes sense, we've done similar things before, such as adding PAWS_OLD_ACK previously. Thanks for the suggestion!
On Tue, 25 Mar 2025 12:20:18 +0000 Jiayuan Chen wrote:
> pw-bot: cr
Interesting, so you know about pw-bot commands but neither about
the 24h reposting cool down time, nor about the fact that net-next
is closed..
Please read:
https://www.kernel.org/doc/html/next/process/maintainer-netdev.html
March 25, 2025 at 23:43, "Jakub Kicinski" <kuba@kernel.org> wrote: > > On Tue, 25 Mar 2025 12:20:18 +0000 Jiayuan Chen wrote: > > > > > pw-bot: cr > > > > Interesting, so you know about pw-bot commands but neither about > the 24h reposting cool down time, nor about the fact that net-next > is closed.. > > Please read: > > https://www.kernel.org/doc/html/next/process/maintainer-netdev.html > My apologies, I learned the pw-bot commands from other maintainers' email responses and then discovered that I could set the patchwork status. I'll learn the documentation more thoroughly next. Please ingore my patch; I'll resubmit it after the merge window closes.
diff --git a/include/net/tcp.h b/include/net/tcp.h index f8efe56bbccb..e1574e804530 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -427,7 +427,8 @@ enum tcp_tw_status { enum tcp_tw_status tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, const struct tcphdr *th, - u32 *tw_isn); + u32 *tw_isn, + enum skb_drop_reason *drop_reason); struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, struct request_sock *req, bool fastopen, bool *lost_race, enum skb_drop_reason *drop_reason); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 1cd0938d47e0..a9dde473a23f 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2417,7 +2417,7 @@ int tcp_v4_rcv(struct sk_buff *skb) goto csum_error; } - tw_status = tcp_timewait_state_process(inet_twsk(sk), skb, th, &isn); + tw_status = tcp_timewait_state_process(inet_twsk(sk), skb, th, &isn, &drop_reason); switch (tw_status) { case TCP_TW_SYN: { struct sock *sk2 = inet_lookup_listener(net, diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index fb9349be36b8..d16dfd41397e 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -97,7 +97,8 @@ static void twsk_rcv_nxt_update(struct tcp_timewait_sock *tcptw, u32 seq, */ enum tcp_tw_status tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, - const struct tcphdr *th, u32 *tw_isn) + const struct tcphdr *th, u32 *tw_isn, + enum skb_drop_reason *drop_reason) { struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw); u32 rcv_nxt = READ_ONCE(tcptw->tw_rcv_nxt); @@ -245,8 +246,10 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb, return TCP_TW_SYN; } - if (paws_reject) + if (paws_reject) { + *drop_reason = SKB_DROP_REASON_TCP_RFC7323_PAWS; __NET_INC_STATS(twsk_net(tw), LINUX_MIB_PAWSESTABREJECTED); + } if (!th->rst) { /* In this case we must reset the TIMEWAIT timer. diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index c134cf1a603a..8862e315edc5 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1970,7 +1970,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb) goto csum_error; } - tw_status = tcp_timewait_state_process(inet_twsk(sk), skb, th, &isn); + tw_status = tcp_timewait_state_process(inet_twsk(sk), skb, th, &isn, &drop_reason); switch (tw_status) { case TCP_TW_SYN: {
PAWS is a long-standing issue, especially when there are upstream network devices, making it more prone to occur. Currently, packet loss statistics for PAWS can only be viewed through MIB, which is a global metric and cannot be precisely obtained through tracing to get the specific 4-tuple of the dropped packet. In the past, we had to use kprobe ret to retrieve relevant skb information from tcp_timewait_state_process(). We add a drop_reason pointer, similar to what previous commit does: commit e34100c2ecbb ("tcp: add a drop_reason pointer to tcp_check_req()") This commit addresses the PAWSESTABREJECTED case and also sets the corresponding drop reason. We use 'pwru' to test. Before this commit: '''' ./pwru 'port 9999' 2025/03/24 13:46:03 Listening for events.. TUPLE FUNC 172.31.75.115:12345->172.31.75.114:9999(tcp) sk_skb_reason_drop(SKB_DROP_REASON_NOT_SPECIFIED) ''' After this commit: ''' ./pwru 'port 9999' 2025/03/24 16:06:59 Listening for events.. TUPLE FUNC 172.31.75.115:12345->172.31.75.114:9999(tcp) sk_skb_reason_drop(SKB_DROP_REASON_TCP_RFC7323_PAWS) ''' Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> --- My apologize. I struggled for a long time to get packetdrill to fix the client port, but ultimately failed to do so, which is why I couldn't provide a packetdrill script. Instead, I wrote my own program to trigger PAWS, which can be found at https://github.com/mrpre/nettrigger/tree/main --- include/net/tcp.h | 3 ++- net/ipv4/tcp_ipv4.c | 2 +- net/ipv4/tcp_minisocks.c | 7 +++++-- net/ipv6/tcp_ipv6.c | 2 +- 4 files changed, 9 insertions(+), 5 deletions(-)