Message ID | 20240529033104.33882-1-kerneljasonxing@gmail.com (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [v2,net-next] tcp: introduce a new MIB for CLOSE-WAIT sockets | expand |
On Wed, May 29, 2024 at 5:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > From: Jason Xing <kernelxing@tencent.com> > > CLOSE-WAIT is a relatively special state which "represents waiting for > a connection termination request from the local user" (RFC 793). Some > issues may happen because of unexpected/too many CLOSE-WAIT sockets, > like user application mistakenly handling close() syscall. It's a very > common issue in the real world. > > We want to trace this total number of CLOSE-WAIT sockets fastly and > frequently instead of resorting to displaying them altogether by using: > > ss -s state close-wait > > or something like this. They need to loop and collect required socket > information in kernel and then get back to the userside for print, which > does harm to the performance especially in heavy load for frequent > sampling. > > That's the reason why I chose to introduce this new MIB counter like > CurrEstab does. With this counter implemented, we can record/observe the > normal changes of this counter all the time. It can help us: > 1) We are able to be alerted in advance if the counter changes drastically. > 2) If some users report some issues happening, we will particularly > pay more attention to it. > > Besides, in the group of TCP_MIB_* defined by RFC 1213, TCP_MIB_CURRESTAB > should include both ESTABLISHED and CLOSE-WAIT sockets in theory: > We (Neal and myself) prefer to fix TCP_MIB_CURRESTAB to include CLOSE_WAIT sockets. We do not think it will annoy anyone, please change tcp_set_state() accordingly. Rationale is that adoption of a new MIB in documentations and various products will take years. Also make a similar change for mptcp. Thank you.
Hello Eric, On Wed, May 29, 2024 at 11:42 PM Eric Dumazet <edumazet@google.com> wrote: > > On Wed, May 29, 2024 at 5:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > From: Jason Xing <kernelxing@tencent.com> > > > > CLOSE-WAIT is a relatively special state which "represents waiting for > > a connection termination request from the local user" (RFC 793). Some > > issues may happen because of unexpected/too many CLOSE-WAIT sockets, > > like user application mistakenly handling close() syscall. It's a very > > common issue in the real world. > > > > We want to trace this total number of CLOSE-WAIT sockets fastly and > > frequently instead of resorting to displaying them altogether by using: > > > > ss -s state close-wait > > > > or something like this. They need to loop and collect required socket > > information in kernel and then get back to the userside for print, which > > does harm to the performance especially in heavy load for frequent > > sampling. > > > > That's the reason why I chose to introduce this new MIB counter like > > CurrEstab does. With this counter implemented, we can record/observe the > > normal changes of this counter all the time. It can help us: > > 1) We are able to be alerted in advance if the counter changes drastically. > > 2) If some users report some issues happening, we will particularly > > pay more attention to it. > > > > Besides, in the group of TCP_MIB_* defined by RFC 1213, TCP_MIB_CURRESTAB > > should include both ESTABLISHED and CLOSE-WAIT sockets in theory: > > > > We (Neal and myself) prefer to fix TCP_MIB_CURRESTAB to include > CLOSE_WAIT sockets. > We do not think it will annoy anyone, please change tcp_set_state() accordingly. Thanks for your reply. Honestly, I was worried about what you said. Now, I'm relieved. It seems that I should add a Fixes: tag... > > Rationale is that adoption of a new MIB in documentations and various > products will take years. > > Also make a similar change for mptcp. I will check that part tomorrow morning, too. Thank you :) Jason
diff --git a/include/net/ip.h b/include/net/ip.h index 6d735e00d3f3..0fe2994796b0 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -298,6 +298,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb, #define IP_UPD_PO_STATS(net, field, val) SNMP_UPD_PO_STATS64((net)->mib.ip_statistics, field, val) #define __IP_UPD_PO_STATS(net, field, val) __SNMP_UPD_PO_STATS64((net)->mib.ip_statistics, field, val) #define NET_INC_STATS(net, field) SNMP_INC_STATS((net)->mib.net_statistics, field) +#define NET_DEC_STATS(net, field) SNMP_DEC_STATS((net)->mib.net_statistics, field) #define __NET_INC_STATS(net, field) __SNMP_INC_STATS((net)->mib.net_statistics, field) #define NET_ADD_STATS(net, field, adnd) SNMP_ADD_STATS((net)->mib.net_statistics, field, adnd) #define __NET_ADD_STATS(net, field, adnd) __SNMP_ADD_STATS((net)->mib.net_statistics, field, adnd) diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h index adf5fd78dd50..c0feefb4d88b 100644 --- a/include/uapi/linux/snmp.h +++ b/include/uapi/linux/snmp.h @@ -302,6 +302,7 @@ enum LINUX_MIB_TCPAOKEYNOTFOUND, /* TCPAOKeyNotFound */ LINUX_MIB_TCPAOGOOD, /* TCPAOGood */ LINUX_MIB_TCPAODROPPEDICMPS, /* TCPAODroppedIcmps */ + LINUX_MIB_TCPCLOSEWAIT, /* TCPCloseWait */ __LINUX_MIB_MAX }; diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c index 6c4664c681ca..964897dc6eb8 100644 --- a/net/ipv4/proc.c +++ b/net/ipv4/proc.c @@ -305,6 +305,7 @@ static const struct snmp_mib snmp4_net_list[] = { SNMP_MIB_ITEM("TCPAOKeyNotFound", LINUX_MIB_TCPAOKEYNOTFOUND), SNMP_MIB_ITEM("TCPAOGood", LINUX_MIB_TCPAOGOOD), SNMP_MIB_ITEM("TCPAODroppedIcmps", LINUX_MIB_TCPAODROPPEDICMPS), + SNMP_MIB_ITEM("TCPCloseWait", LINUX_MIB_TCPCLOSEWAIT), SNMP_MIB_SENTINEL }; diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 681b54e1f3a6..3908ea7fd14a 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2661,6 +2661,11 @@ void tcp_set_state(struct sock *sk, int state) TCP_DEC_STATS(sock_net(sk), TCP_MIB_CURRESTAB); } + if (state == TCP_CLOSE_WAIT) + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPCLOSEWAIT); + if (oldstate == TCP_CLOSE_WAIT) + NET_DEC_STATS(sock_net(sk), LINUX_MIB_TCPCLOSEWAIT); + /* Change state AFTER socket is unhashed to avoid closed * socket sitting in hash tables. */