Message ID | 20240417085143.69578-1-kerneljasonxing@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Implement reset reason mechanism to detect | expand |
On Wed, Apr 17, 2024 at 4:51 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > From: Jason Xing <kernelxing@tencent.com> > > In production, there are so many cases about why the RST skb is sent but > we don't have a very convenient/fast method to detect the exact underlying > reasons. > > RST is implemented in two kinds: passive kind (like tcp_v4_send_reset()) > and active kind (like tcp_send_active_reset()). The former can be traced > carefully 1) in TCP, with the help of drop reasons, which is based on > Eric's idea[1], 2) in MPTCP, with the help of reset options defined in > RFC 8684. The latter is relatively independent, which should be > implemented on our own. > > In this series, I focus on the fundamental implement mostly about how > the rstreason mechnism works and give the detailed passive part as an > example, not including the active reset part. In future, we can go > further and refine those NOT_SPECIFIED reasons. > > Here are some examples when tracing: > <idle>-0 [002] ..s1. 1830.262425: tcp_send_reset: skbaddr=x > skaddr=x src=x dest=x state=x reason=NOT_SPECIFIED > <idle>-0 [002] ..s1. 1830.262425: tcp_send_reset: skbaddr=x > skaddr=x src=x dest=x state=x reason=NO_SOCKET > > [1] > Link: https://lore.kernel.org/all/CANn89iJw8x-LqgsWOeJQQvgVg6DnL5aBRLi10QN2WBdr+X4k=w@mail.gmail.com/ > > v6 > 1. add back casts, or else they are treated as error. > > v5 > Link: https://lore.kernel.org/all/20240411115630.38420-1-kerneljasonxing@gmail.com/ > 1. address format issue (like reverse xmas tree) (Eric, Paolo) > 2. remove unnecessary casts. (Eric) > 3. introduce a helper used in mptcp active reset. See patch 6. (Paolo) > > v4 > Link: https://lore.kernel.org/all/20240409100934.37725-1-kerneljasonxing@gmail.com/ > 1. passing 'enum sk_rst_reason' for readability when tracing (Antoine) > > v3 > Link: https://lore.kernel.org/all/20240404072047.11490-1-kerneljasonxing@gmail.com/ > 1. rebase (mptcp part) and address what Mat suggested. > > v2 > Link: https://lore.kernel.org/all/20240403185033.47ebc6a9@kernel.org/ > 1. rebase against the latest net-next tree > > > > Jason Xing (7): > net: introduce rstreason to detect why the RST is sent > rstreason: prepare for passive reset > rstreason: prepare for active reset > tcp: support rstreason for passive reset > mptcp: support rstreason for passive reset > mptcp: introducing a helper into active reset logic > rstreason: make it work in trace world > > include/net/request_sock.h | 4 +- > include/net/rstreason.h | 93 ++++++++++++++++++++++++++++++++++++++ > include/net/tcp.h | 3 +- > include/trace/events/tcp.h | 37 +++++++++++++-- > net/dccp/ipv4.c | 10 ++-- > net/dccp/ipv6.c | 10 ++-- > net/dccp/minisocks.c | 3 +- > net/ipv4/tcp.c | 15 ++++-- > net/ipv4/tcp_ipv4.c | 14 +++--- > net/ipv4/tcp_minisocks.c | 3 +- > net/ipv4/tcp_output.c | 5 +- > net/ipv4/tcp_timer.c | 9 ++-- > net/ipv6/tcp_ipv6.c | 17 ++++--- > net/mptcp/protocol.c | 2 +- > net/mptcp/protocol.h | 11 +++++ > net/mptcp/subflow.c | 27 ++++++++--- > 16 files changed, 216 insertions(+), 47 deletions(-) > create mode 100644 include/net/rstreason.h > > -- > 2.37.3 > Hello maintainers, I'm not sure why the patch series has been changed to 'Changes Requested', until now I don't think I need to change something. Should I repost this series (keeping the v6 tag) and then wait for more comments? Thanks, Jason
On Thu, 18 Apr 2024 11:30:02 +0800 Jason Xing wrote: > I'm not sure why the patch series has been changed to 'Changes > Requested', until now I don't think I need to change something. > > Should I repost this series (keeping the v6 tag) and then wait for > more comments? If Eric doesn't like it - it's not getting merged.
On Thu, Apr 18, 2024 at 11:46 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Thu, 18 Apr 2024 11:30:02 +0800 Jason Xing wrote: > > I'm not sure why the patch series has been changed to 'Changes > > Requested', until now I don't think I need to change something. > > > > Should I repost this series (keeping the v6 tag) and then wait for > > more comments? > > If Eric doesn't like it - it's not getting merged. I'm not a English native speaker. If I understand correctly, it seems that Eric doesn't object to the patch series. Here is the quotation [1]: "If you feel the need to put them in a special group, this is fine by me." This rst reason mechanism can cover all the possible reasons for both TCP and MPTCP. We don't need to reinvent some definitions of reset reasons which are totally the same as drop reasons. Also, we don't need to reinvent something to cover MPTCP. If people are willing to contribute more rst reasons, they can find a good place. Reset is one big complicated 'issue' in production. I spent a lot of time handling all kinds of reset reasons daily. I'm apparently not the only one. I just want to make admins' lives easier, including me. This special/separate reason group is important because we can extend it in the future, which will not get confused. I hope it can have a chance to get merged :) Thank you. [1]: https://lore.kernel.org/all/CANn89i+aLO_aGYC8dr8dkFyi+6wpzCGrogysvgR8FrfRvaa-Vg@mail.gmail.com/ Thanks, Jason
On Thu, Apr 18, 2024 at 6:24 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > On Thu, Apr 18, 2024 at 11:46 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > > On Thu, 18 Apr 2024 11:30:02 +0800 Jason Xing wrote: > > > I'm not sure why the patch series has been changed to 'Changes > > > Requested', until now I don't think I need to change something. > > > > > > Should I repost this series (keeping the v6 tag) and then wait for > > > more comments? > > > > If Eric doesn't like it - it's not getting merged. > > I'm not a English native speaker. If I understand correctly, it seems > that Eric doesn't object to the patch series. Here is the quotation > [1]: > "If you feel the need to put them in a special group, this is fine by me." > > This rst reason mechanism can cover all the possible reasons for both > TCP and MPTCP. We don't need to reinvent some definitions of reset > reasons which are totally the same as drop reasons. Also, we don't > need to reinvent something to cover MPTCP. If people are willing to > contribute more rst reasons, they can find a good place. > > Reset is one big complicated 'issue' in production. I spent a lot of > time handling all kinds of reset reasons daily. I'm apparently not the > only one. I just want to make admins' lives easier, including me. This > special/separate reason group is important because we can extend it in > the future, which will not get confused. > > I hope it can have a chance to get merged :) Thank you. > > [1]: https://lore.kernel.org/all/CANn89i+aLO_aGYC8dr8dkFyi+6wpzCGrogysvgR8FrfRvaa-Vg@mail.gmail.com/ > > Thanks, > Jason My objection was these casts between enums. Especially if hiding with (u32) I see no reason for adding these casts in TCP stack. When I said "If you feel the need to put them in a special group, this is fine by me.", this was really about partitioning the existing enum into groups, if you prefer having a group of 'RES reasons'
On Fri, Apr 19, 2024 at 2:51 AM Eric Dumazet <edumazet@google.com> wrote: > > On Thu, Apr 18, 2024 at 6:24 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > On Thu, Apr 18, 2024 at 11:46 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > > > > On Thu, 18 Apr 2024 11:30:02 +0800 Jason Xing wrote: > > > > I'm not sure why the patch series has been changed to 'Changes > > > > Requested', until now I don't think I need to change something. > > > > > > > > Should I repost this series (keeping the v6 tag) and then wait for > > > > more comments? > > > > > > If Eric doesn't like it - it's not getting merged. > > > > I'm not a English native speaker. If I understand correctly, it seems > > that Eric doesn't object to the patch series. Here is the quotation > > [1]: > > "If you feel the need to put them in a special group, this is fine by me." > > > > This rst reason mechanism can cover all the possible reasons for both > > TCP and MPTCP. We don't need to reinvent some definitions of reset > > reasons which are totally the same as drop reasons. Also, we don't > > need to reinvent something to cover MPTCP. If people are willing to > > contribute more rst reasons, they can find a good place. > > > > Reset is one big complicated 'issue' in production. I spent a lot of > > time handling all kinds of reset reasons daily. I'm apparently not the > > only one. I just want to make admins' lives easier, including me. This > > special/separate reason group is important because we can extend it in > > the future, which will not get confused. > > > > I hope it can have a chance to get merged :) Thank you. > > > > [1]: https://lore.kernel.org/all/CANn89i+aLO_aGYC8dr8dkFyi+6wpzCGrogysvgR8FrfRvaa-Vg@mail.gmail.com/ > > > > Thanks, > > Jason > > My objection was these casts between enums. Especially if hiding with (u32) So I should explicitly cast it like this: tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason); ? > > I see no reason for adding these casts in TCP stack. Sorry, I don't know why the casts really make you so annoyed. But I still think it's not a bad way to handle all the cases for RST. Supposing not to add a enum sk_rst_reason{}, passing drop reasons only works well in TCP for passive rests. For active reset cases (in the tcp_send_active_reset()), it's meaningless/confusing to insist on reusing the drop reason because I have to add some reset reasons (that are only used in RST cases) in the enum skb_drop_reason{}, which is really weird, in my view. The same problem exists in how to handle MPTCP. So I prefer putting them in a separate group like now. What do you think about it, right now? Thanks, Jason
On Fri, Apr 19, 2024 at 6:29 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > On Fri, Apr 19, 2024 at 2:51 AM Eric Dumazet <edumazet@google.com> wrote: > > > > On Thu, Apr 18, 2024 at 6:24 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > > > On Thu, Apr 18, 2024 at 11:46 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > > > > > > On Thu, 18 Apr 2024 11:30:02 +0800 Jason Xing wrote: > > > > > I'm not sure why the patch series has been changed to 'Changes > > > > > Requested', until now I don't think I need to change something. > > > > > > > > > > Should I repost this series (keeping the v6 tag) and then wait for > > > > > more comments? > > > > > > > > If Eric doesn't like it - it's not getting merged. > > > > > > I'm not a English native speaker. If I understand correctly, it seems > > > that Eric doesn't object to the patch series. Here is the quotation > > > [1]: > > > "If you feel the need to put them in a special group, this is fine by me." > > > > > > This rst reason mechanism can cover all the possible reasons for both > > > TCP and MPTCP. We don't need to reinvent some definitions of reset > > > reasons which are totally the same as drop reasons. Also, we don't > > > need to reinvent something to cover MPTCP. If people are willing to > > > contribute more rst reasons, they can find a good place. > > > > > > Reset is one big complicated 'issue' in production. I spent a lot of > > > time handling all kinds of reset reasons daily. I'm apparently not the > > > only one. I just want to make admins' lives easier, including me. This > > > special/separate reason group is important because we can extend it in > > > the future, which will not get confused. > > > > > > I hope it can have a chance to get merged :) Thank you. > > > > > > [1]: https://lore.kernel.org/all/CANn89i+aLO_aGYC8dr8dkFyi+6wpzCGrogysvgR8FrfRvaa-Vg@mail.gmail.com/ > > > > > > Thanks, > > > Jason > > > > My objection was these casts between enums. Especially if hiding with (u32) > > So I should explicitly cast it like this: > tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason); > ? > > > > > I see no reason for adding these casts in TCP stack. > > Sorry, I don't know why the casts really make you so annoyed. But I > still think it's not a bad way to handle all the cases for RST. > > Supposing not to add a enum sk_rst_reason{}, passing drop reasons only > works well in TCP for passive rests. For active reset cases (in the > tcp_send_active_reset()), it's meaningless/confusing to insist on > reusing the drop reason because I have to add some reset reasons (that > are only used in RST cases) in the enum skb_drop_reason{}, which is > really weird, in my view. The same problem exists in how to handle > MPTCP. So I prefer putting them in a separate group like now. What do > you think about it, right now? The description in the previous email may be too long. The key point is that we're not supporting only for passive resets, right? Thanks, Jason
> When I said "If you feel the need to put them in a special group, this > is fine by me.", > this was really about partitioning the existing enum into groups, if > you prefer having a group of 'RES reasons' Are you suggesting copying what we need from enum skb_drop_reason{} to enum sk_rst_reason{}? Why not reusing them directly. I have no idea what the side effect of cast conversion itself is? If __not__ doing so (copying reasons one by one), for passive rests, we can totally rely on the drop reason, which means if we implement more reasons for skb drop happening in reset cases, we don't need to handle reset cases over and over again (like adding rst reasons just after newly added drop reasons if without cast conversions). It's easier to maintain the reset reason part if we can apply the current patch series. Thank you.
On Fri, Apr 19, 2024 at 7:26 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > When I said "If you feel the need to put them in a special group, this > > is fine by me.", > > this was really about partitioning the existing enum into groups, if > > you prefer having a group of 'RES reasons' > > Are you suggesting copying what we need from enum skb_drop_reason{} to > enum sk_rst_reason{}? Why not reusing them directly. I have no idea > what the side effect of cast conversion itself is? Sorry that I'm writing this email. I'm worried my statement is not that clear, so I write one simple snippet which can help me explain well :) Allow me give NO_SOCKET as an example: diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index e63a3bf99617..2c9f7364de45 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -767,6 +767,7 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info, if (!fl4.saddr) fl4.saddr = htonl(INADDR_DUMMY); + trace_icmp_send(skb_in, type, code); icmp_push_reply(sk, &icmp_param, &fl4, &ipc, &rt); ende: ip_rt_put(rt); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 1e650ec71d2f..d5963831280f 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2160,6 +2160,7 @@ int tcp_v4_rcv(struct sk_buff *skb) { struct net *net = dev_net(skb->dev); enum skb_drop_reason drop_reason; + enum sk_rst_reason rst_reason; int sdif = inet_sdif(skb); int dif = inet_iif(skb); const struct iphdr *iph; @@ -2355,7 +2356,8 @@ int tcp_v4_rcv(struct sk_buff *skb) bad_packet: __TCP_INC_STATS(net, TCP_MIB_INERRS); } else { - tcp_v4_send_reset(NULL, skb); + rst_reason = RST_REASON_NO_SOCKET; + tcp_v4_send_reset(NULL, skb, rst_reason); } discard_it: As you can see, we need to add a new 'rst_reason' variable which actually is the same as drop reason. They are the same except for the enum type... Such rst_reasons/drop_reasons are all over the place. Eric, if you have a strong preference, I can do it as you said. Well, how about explicitly casting them like this based on the current series. It looks better and clearer and more helpful to people who is reading codes to understand: diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 461b4d2b7cfe..eb125163d819 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) return 0; reset: - tcp_v4_send_reset(rsk, skb, (u32)reason); + tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason); discard: kfree_skb_reason(skb, reason); /* Be careful here. If this function gets more complicated and Thanks for your patience again. Jason
On Fri, Apr 19, 2024 at 4:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > On Fri, Apr 19, 2024 at 7:26 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > > When I said "If you feel the need to put them in a special group, this > > > is fine by me.", > > > this was really about partitioning the existing enum into groups, if > > > you prefer having a group of 'RES reasons' > > > > Are you suggesting copying what we need from enum skb_drop_reason{} to > > enum sk_rst_reason{}? Why not reusing them directly. I have no idea > > what the side effect of cast conversion itself is? > > Sorry that I'm writing this email. I'm worried my statement is not > that clear, so I write one simple snippet which can help me explain > well :) > > Allow me give NO_SOCKET as an example: > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c > index e63a3bf99617..2c9f7364de45 100644 > --- a/net/ipv4/icmp.c > +++ b/net/ipv4/icmp.c > @@ -767,6 +767,7 @@ void __icmp_send(struct sk_buff *skb_in, int type, > int code, __be32 info, > if (!fl4.saddr) > fl4.saddr = htonl(INADDR_DUMMY); > > + trace_icmp_send(skb_in, type, code); > icmp_push_reply(sk, &icmp_param, &fl4, &ipc, &rt); > ende: > ip_rt_put(rt); > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > index 1e650ec71d2f..d5963831280f 100644 > --- a/net/ipv4/tcp_ipv4.c > +++ b/net/ipv4/tcp_ipv4.c > @@ -2160,6 +2160,7 @@ int tcp_v4_rcv(struct sk_buff *skb) > { > struct net *net = dev_net(skb->dev); > enum skb_drop_reason drop_reason; > + enum sk_rst_reason rst_reason; > int sdif = inet_sdif(skb); > int dif = inet_iif(skb); > const struct iphdr *iph; > @@ -2355,7 +2356,8 @@ int tcp_v4_rcv(struct sk_buff *skb) > bad_packet: > __TCP_INC_STATS(net, TCP_MIB_INERRS); > } else { > - tcp_v4_send_reset(NULL, skb); > + rst_reason = RST_REASON_NO_SOCKET; > + tcp_v4_send_reset(NULL, skb, rst_reason); > } > > discard_it: > > As you can see, we need to add a new 'rst_reason' variable which > actually is the same as drop reason. They are the same except for the > enum type... Such rst_reasons/drop_reasons are all over the place. > > Eric, if you have a strong preference, I can do it as you said. > > Well, how about explicitly casting them like this based on the current > series. It looks better and clearer and more helpful to people who is > reading codes to understand: > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > index 461b4d2b7cfe..eb125163d819 100644 > --- a/net/ipv4/tcp_ipv4.c > +++ b/net/ipv4/tcp_ipv4.c > @@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) > return 0; > > reset: > - tcp_v4_send_reset(rsk, skb, (u32)reason); > + tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason); > discard: > kfree_skb_reason(skb, reason); > /* Be careful here. If this function gets more complicated and It makes no sense to declare an enum sk_rst_reason and then convert it to drop_reason or vice versa. Next thing you know, compiler guys will add a new -Woption that will forbid such conversions. Please add to tcp_v4_send_reset() an skb_drop_reason, not a new enum. skb_drop_reason are simply values that are later converted to strings... So : Do not declare a new enum.
On Fri, Apr 19, 2024 at 3:02 PM Eric Dumazet <edumazet@google.com> wrote: > > On Fri, Apr 19, 2024 at 4:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > On Fri, Apr 19, 2024 at 7:26 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > > > > When I said "If you feel the need to put them in a special group, this > > > > is fine by me.", > > > > this was really about partitioning the existing enum into groups, if > > > > you prefer having a group of 'RES reasons' > > > > > > Are you suggesting copying what we need from enum skb_drop_reason{} to > > > enum sk_rst_reason{}? Why not reusing them directly. I have no idea > > > what the side effect of cast conversion itself is? > > > > Sorry that I'm writing this email. I'm worried my statement is not > > that clear, so I write one simple snippet which can help me explain > > well :) > > > > Allow me give NO_SOCKET as an example: > > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c > > index e63a3bf99617..2c9f7364de45 100644 > > --- a/net/ipv4/icmp.c > > +++ b/net/ipv4/icmp.c > > @@ -767,6 +767,7 @@ void __icmp_send(struct sk_buff *skb_in, int type, > > int code, __be32 info, > > if (!fl4.saddr) > > fl4.saddr = htonl(INADDR_DUMMY); > > > > + trace_icmp_send(skb_in, type, code); > > icmp_push_reply(sk, &icmp_param, &fl4, &ipc, &rt); > > ende: > > ip_rt_put(rt); > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > > index 1e650ec71d2f..d5963831280f 100644 > > --- a/net/ipv4/tcp_ipv4.c > > +++ b/net/ipv4/tcp_ipv4.c > > @@ -2160,6 +2160,7 @@ int tcp_v4_rcv(struct sk_buff *skb) > > { > > struct net *net = dev_net(skb->dev); > > enum skb_drop_reason drop_reason; > > + enum sk_rst_reason rst_reason; > > int sdif = inet_sdif(skb); > > int dif = inet_iif(skb); > > const struct iphdr *iph; > > @@ -2355,7 +2356,8 @@ int tcp_v4_rcv(struct sk_buff *skb) > > bad_packet: > > __TCP_INC_STATS(net, TCP_MIB_INERRS); > > } else { > > - tcp_v4_send_reset(NULL, skb); > > + rst_reason = RST_REASON_NO_SOCKET; > > + tcp_v4_send_reset(NULL, skb, rst_reason); > > } > > > > discard_it: > > > > As you can see, we need to add a new 'rst_reason' variable which > > actually is the same as drop reason. They are the same except for the > > enum type... Such rst_reasons/drop_reasons are all over the place. > > > > Eric, if you have a strong preference, I can do it as you said. > > > > Well, how about explicitly casting them like this based on the current > > series. It looks better and clearer and more helpful to people who is > > reading codes to understand: > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > > index 461b4d2b7cfe..eb125163d819 100644 > > --- a/net/ipv4/tcp_ipv4.c > > +++ b/net/ipv4/tcp_ipv4.c > > @@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) > > return 0; > > > > reset: > > - tcp_v4_send_reset(rsk, skb, (u32)reason); > > + tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason); > > discard: > > kfree_skb_reason(skb, reason); > > /* Be careful here. If this function gets more complicated and > > It makes no sense to declare an enum sk_rst_reason and then convert it > to drop_reason > or vice versa. > > Next thing you know, compiler guys will add a new -Woption that will > forbid such conversions. > > Please add to tcp_v4_send_reset() an skb_drop_reason, not a new enum. Ah... It looks like I didn't make myself clear again. Sorry... Actually I wrote this part many times. My conclusion is that It's not feasible to do this. REASONS: If we __only__ need to deal with this passive reset in TCP, it's fine. We pass a skb_drop_reason which should also be converted to u32 type in tcp_v4_send_reset() as you said, it can work. People who use the trace will see the reason with the 'SKB_DROP_REASON' prefix stripped. But we have to deal with other cases. A few questions are listed here: 1) What about tcp_send_active_reset() in TCP/MPTCP? Passing weird drop reasons? There is no drop reason at all. I think people will get confused. So I believe we should invent new definitions to cope with it. 2) What about the .send_reset callback in the subflow_syn_recv_sock() in MPTCP? The reasons in MPTCP are only definitions (such as MPTCP_RST_EUNSPEC). I don't think we can convert them into the enum skb_drop_reason type. So where should we group those various reasons? Introducing a new enum is for extension and compatibility for all kinds of reset reasons. What do you think? Thanks, Jason > > skb_drop_reason are simply values that are later converted to strings... > > So : Do not declare a new enum.
On Fri, Apr 19, 2024 at 9:29 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > On Fri, Apr 19, 2024 at 3:02 PM Eric Dumazet <edumazet@google.com> wrote: > > > > On Fri, Apr 19, 2024 at 4:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > > > On Fri, Apr 19, 2024 at 7:26 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > > > > > > When I said "If you feel the need to put them in a special group, this > > > > > is fine by me.", > > > > > this was really about partitioning the existing enum into groups, if > > > > > you prefer having a group of 'RES reasons' > > > > > > > > Are you suggesting copying what we need from enum skb_drop_reason{} to > > > > enum sk_rst_reason{}? Why not reusing them directly. I have no idea > > > > what the side effect of cast conversion itself is? > > > > > > Sorry that I'm writing this email. I'm worried my statement is not > > > that clear, so I write one simple snippet which can help me explain > > > well :) > > > > > > Allow me give NO_SOCKET as an example: > > > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c > > > index e63a3bf99617..2c9f7364de45 100644 > > > --- a/net/ipv4/icmp.c > > > +++ b/net/ipv4/icmp.c > > > @@ -767,6 +767,7 @@ void __icmp_send(struct sk_buff *skb_in, int type, > > > int code, __be32 info, > > > if (!fl4.saddr) > > > fl4.saddr = htonl(INADDR_DUMMY); > > > > > > + trace_icmp_send(skb_in, type, code); > > > icmp_push_reply(sk, &icmp_param, &fl4, &ipc, &rt); > > > ende: > > > ip_rt_put(rt); > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > > > index 1e650ec71d2f..d5963831280f 100644 > > > --- a/net/ipv4/tcp_ipv4.c > > > +++ b/net/ipv4/tcp_ipv4.c > > > @@ -2160,6 +2160,7 @@ int tcp_v4_rcv(struct sk_buff *skb) > > > { > > > struct net *net = dev_net(skb->dev); > > > enum skb_drop_reason drop_reason; > > > + enum sk_rst_reason rst_reason; > > > int sdif = inet_sdif(skb); > > > int dif = inet_iif(skb); > > > const struct iphdr *iph; > > > @@ -2355,7 +2356,8 @@ int tcp_v4_rcv(struct sk_buff *skb) > > > bad_packet: > > > __TCP_INC_STATS(net, TCP_MIB_INERRS); > > > } else { > > > - tcp_v4_send_reset(NULL, skb); > > > + rst_reason = RST_REASON_NO_SOCKET; > > > + tcp_v4_send_reset(NULL, skb, rst_reason); > > > } > > > > > > discard_it: > > > > > > As you can see, we need to add a new 'rst_reason' variable which > > > actually is the same as drop reason. They are the same except for the > > > enum type... Such rst_reasons/drop_reasons are all over the place. > > > > > > Eric, if you have a strong preference, I can do it as you said. > > > > > > Well, how about explicitly casting them like this based on the current > > > series. It looks better and clearer and more helpful to people who is > > > reading codes to understand: > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > > > index 461b4d2b7cfe..eb125163d819 100644 > > > --- a/net/ipv4/tcp_ipv4.c > > > +++ b/net/ipv4/tcp_ipv4.c > > > @@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) > > > return 0; > > > > > > reset: > > > - tcp_v4_send_reset(rsk, skb, (u32)reason); > > > + tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason); > > > discard: > > > kfree_skb_reason(skb, reason); > > > /* Be careful here. If this function gets more complicated and > > > > It makes no sense to declare an enum sk_rst_reason and then convert it > > to drop_reason > > or vice versa. > > > > Next thing you know, compiler guys will add a new -Woption that will > > forbid such conversions. > > > > Please add to tcp_v4_send_reset() an skb_drop_reason, not a new enum. > > Ah... It looks like I didn't make myself clear again. Sorry... > Actually I wrote this part many times. My conclusion is that It's not > feasible to do this. > > REASONS: > If we __only__ need to deal with this passive reset in TCP, it's fine. > We pass a skb_drop_reason which should also be converted to u32 type > in tcp_v4_send_reset() as you said, it can work. People who use the > trace will see the reason with the 'SKB_DROP_REASON' prefix stripped. > > But we have to deal with other cases. A few questions are listed here: > 1) What about tcp_send_active_reset() in TCP/MPTCP? Passing weird drop > reasons? There is no drop reason at all. I think people will get > confused. So I believe we should invent new definitions to cope with > it. > 2) What about the .send_reset callback in the subflow_syn_recv_sock() > in MPTCP? The reasons in MPTCP are only definitions (such as > MPTCP_RST_EUNSPEC). I don't think we can convert them into the enum > skb_drop_reason type. > > So where should we group those various reasons? > > Introducing a new enum is for extension and compatibility for all > kinds of reset reasons. > > What do you think? I will stop repeating myself. enums are not what you think. type safety is there for a reason. Can you show me another place in networking stacks where we cast enums to others ?
On Fri, Apr 19, 2024 at 3:44 PM Eric Dumazet <edumazet@google.com> wrote: > > On Fri, Apr 19, 2024 at 9:29 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > On Fri, Apr 19, 2024 at 3:02 PM Eric Dumazet <edumazet@google.com> wrote: > > > > > > On Fri, Apr 19, 2024 at 4:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > > > > > On Fri, Apr 19, 2024 at 7:26 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > > > > > > > > When I said "If you feel the need to put them in a special group, this > > > > > > is fine by me.", > > > > > > this was really about partitioning the existing enum into groups, if > > > > > > you prefer having a group of 'RES reasons' > > > > > > > > > > Are you suggesting copying what we need from enum skb_drop_reason{} to > > > > > enum sk_rst_reason{}? Why not reusing them directly. I have no idea > > > > > what the side effect of cast conversion itself is? > > > > > > > > Sorry that I'm writing this email. I'm worried my statement is not > > > > that clear, so I write one simple snippet which can help me explain > > > > well :) > > > > > > > > Allow me give NO_SOCKET as an example: > > > > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c > > > > index e63a3bf99617..2c9f7364de45 100644 > > > > --- a/net/ipv4/icmp.c > > > > +++ b/net/ipv4/icmp.c > > > > @@ -767,6 +767,7 @@ void __icmp_send(struct sk_buff *skb_in, int type, > > > > int code, __be32 info, > > > > if (!fl4.saddr) > > > > fl4.saddr = htonl(INADDR_DUMMY); > > > > > > > > + trace_icmp_send(skb_in, type, code); > > > > icmp_push_reply(sk, &icmp_param, &fl4, &ipc, &rt); > > > > ende: > > > > ip_rt_put(rt); > > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > > > > index 1e650ec71d2f..d5963831280f 100644 > > > > --- a/net/ipv4/tcp_ipv4.c > > > > +++ b/net/ipv4/tcp_ipv4.c > > > > @@ -2160,6 +2160,7 @@ int tcp_v4_rcv(struct sk_buff *skb) > > > > { > > > > struct net *net = dev_net(skb->dev); > > > > enum skb_drop_reason drop_reason; > > > > + enum sk_rst_reason rst_reason; > > > > int sdif = inet_sdif(skb); > > > > int dif = inet_iif(skb); > > > > const struct iphdr *iph; > > > > @@ -2355,7 +2356,8 @@ int tcp_v4_rcv(struct sk_buff *skb) > > > > bad_packet: > > > > __TCP_INC_STATS(net, TCP_MIB_INERRS); > > > > } else { > > > > - tcp_v4_send_reset(NULL, skb); > > > > + rst_reason = RST_REASON_NO_SOCKET; > > > > + tcp_v4_send_reset(NULL, skb, rst_reason); > > > > } > > > > > > > > discard_it: > > > > > > > > As you can see, we need to add a new 'rst_reason' variable which > > > > actually is the same as drop reason. They are the same except for the > > > > enum type... Such rst_reasons/drop_reasons are all over the place. > > > > > > > > Eric, if you have a strong preference, I can do it as you said. > > > > > > > > Well, how about explicitly casting them like this based on the current > > > > series. It looks better and clearer and more helpful to people who is > > > > reading codes to understand: > > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > > > > index 461b4d2b7cfe..eb125163d819 100644 > > > > --- a/net/ipv4/tcp_ipv4.c > > > > +++ b/net/ipv4/tcp_ipv4.c > > > > @@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) > > > > return 0; > > > > > > > > reset: > > > > - tcp_v4_send_reset(rsk, skb, (u32)reason); > > > > + tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason); > > > > discard: > > > > kfree_skb_reason(skb, reason); > > > > /* Be careful here. If this function gets more complicated and > > > > > > It makes no sense to declare an enum sk_rst_reason and then convert it > > > to drop_reason > > > or vice versa. > > > > > > Next thing you know, compiler guys will add a new -Woption that will > > > forbid such conversions. > > > > > > Please add to tcp_v4_send_reset() an skb_drop_reason, not a new enum. > > > > Ah... It looks like I didn't make myself clear again. Sorry... > > Actually I wrote this part many times. My conclusion is that It's not > > feasible to do this. > > > > REASONS: > > If we __only__ need to deal with this passive reset in TCP, it's fine. > > We pass a skb_drop_reason which should also be converted to u32 type > > in tcp_v4_send_reset() as you said, it can work. People who use the > > trace will see the reason with the 'SKB_DROP_REASON' prefix stripped. > > > > But we have to deal with other cases. A few questions are listed here: > > 1) What about tcp_send_active_reset() in TCP/MPTCP? Passing weird drop > > reasons? There is no drop reason at all. I think people will get > > confused. So I believe we should invent new definitions to cope with > > it. > > 2) What about the .send_reset callback in the subflow_syn_recv_sock() > > in MPTCP? The reasons in MPTCP are only definitions (such as > > MPTCP_RST_EUNSPEC). I don't think we can convert them into the enum > > skb_drop_reason type. > > > > So where should we group those various reasons? > > > > Introducing a new enum is for extension and compatibility for all > > kinds of reset reasons. > > > > What do you think? > > I will stop repeating myself. > > enums are not what you think. > > type safety is there for a reason. > > Can you show me another place in networking stacks where we cast enums > to others ? No, I've checked this a month ago. BTW, I don't know the dangers of casting enum types. I know you will answer it, but I still insist on asking, hoping someone seeing this will help me. Using skb_drop_reason can only deal with the passive reset in TCP. It is just one of all kinds of reset cases :( Forgive me, I cannot come up with a good way to cover all the cases TT I've tried..Sorry... If other experts see this thread, please help me. I would appreciate it. I have strong interests and feel strong responsibility to implement something like this patch series. It can be very useful!! Thanks again.
On Fri, Apr 19, 2024 at 4:00 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > On Fri, Apr 19, 2024 at 3:44 PM Eric Dumazet <edumazet@google.com> wrote: > > > > On Fri, Apr 19, 2024 at 9:29 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > > > On Fri, Apr 19, 2024 at 3:02 PM Eric Dumazet <edumazet@google.com> wrote: > > > > > > > > On Fri, Apr 19, 2024 at 4:31 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > > > > > > > On Fri, Apr 19, 2024 at 7:26 AM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > > > > > > > > > > When I said "If you feel the need to put them in a special group, this > > > > > > > is fine by me.", > > > > > > > this was really about partitioning the existing enum into groups, if > > > > > > > you prefer having a group of 'RES reasons' > > > > > > > > > > > > Are you suggesting copying what we need from enum skb_drop_reason{} to > > > > > > enum sk_rst_reason{}? Why not reusing them directly. I have no idea > > > > > > what the side effect of cast conversion itself is? > > > > > > > > > > Sorry that I'm writing this email. I'm worried my statement is not > > > > > that clear, so I write one simple snippet which can help me explain > > > > > well :) > > > > > > > > > > Allow me give NO_SOCKET as an example: > > > > > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c > > > > > index e63a3bf99617..2c9f7364de45 100644 > > > > > --- a/net/ipv4/icmp.c > > > > > +++ b/net/ipv4/icmp.c > > > > > @@ -767,6 +767,7 @@ void __icmp_send(struct sk_buff *skb_in, int type, > > > > > int code, __be32 info, > > > > > if (!fl4.saddr) > > > > > fl4.saddr = htonl(INADDR_DUMMY); > > > > > > > > > > + trace_icmp_send(skb_in, type, code); > > > > > icmp_push_reply(sk, &icmp_param, &fl4, &ipc, &rt); > > > > > ende: > > > > > ip_rt_put(rt); > > > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > > > > > index 1e650ec71d2f..d5963831280f 100644 > > > > > --- a/net/ipv4/tcp_ipv4.c > > > > > +++ b/net/ipv4/tcp_ipv4.c > > > > > @@ -2160,6 +2160,7 @@ int tcp_v4_rcv(struct sk_buff *skb) > > > > > { > > > > > struct net *net = dev_net(skb->dev); > > > > > enum skb_drop_reason drop_reason; > > > > > + enum sk_rst_reason rst_reason; > > > > > int sdif = inet_sdif(skb); > > > > > int dif = inet_iif(skb); > > > > > const struct iphdr *iph; > > > > > @@ -2355,7 +2356,8 @@ int tcp_v4_rcv(struct sk_buff *skb) > > > > > bad_packet: > > > > > __TCP_INC_STATS(net, TCP_MIB_INERRS); > > > > > } else { > > > > > - tcp_v4_send_reset(NULL, skb); > > > > > + rst_reason = RST_REASON_NO_SOCKET; > > > > > + tcp_v4_send_reset(NULL, skb, rst_reason); > > > > > } > > > > > > > > > > discard_it: > > > > > > > > > > As you can see, we need to add a new 'rst_reason' variable which > > > > > actually is the same as drop reason. They are the same except for the > > > > > enum type... Such rst_reasons/drop_reasons are all over the place. > > > > > > > > > > Eric, if you have a strong preference, I can do it as you said. > > > > > > > > > > Well, how about explicitly casting them like this based on the current > > > > > series. It looks better and clearer and more helpful to people who is > > > > > reading codes to understand: > > > > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > > > > > index 461b4d2b7cfe..eb125163d819 100644 > > > > > --- a/net/ipv4/tcp_ipv4.c > > > > > +++ b/net/ipv4/tcp_ipv4.c > > > > > @@ -1936,7 +1936,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) > > > > > return 0; > > > > > > > > > > reset: > > > > > - tcp_v4_send_reset(rsk, skb, (u32)reason); > > > > > + tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason); > > > > > discard: > > > > > kfree_skb_reason(skb, reason); > > > > > /* Be careful here. If this function gets more complicated and > > > > > > > > It makes no sense to declare an enum sk_rst_reason and then convert it > > > > to drop_reason > > > > or vice versa. > > > > > > > > Next thing you know, compiler guys will add a new -Woption that will > > > > forbid such conversions. > > > > > > > > Please add to tcp_v4_send_reset() an skb_drop_reason, not a new enum. > > > > > > Ah... It looks like I didn't make myself clear again. Sorry... > > > Actually I wrote this part many times. My conclusion is that It's not > > > feasible to do this. > > > > > > REASONS: > > > If we __only__ need to deal with this passive reset in TCP, it's fine. > > > We pass a skb_drop_reason which should also be converted to u32 type > > > in tcp_v4_send_reset() as you said, it can work. People who use the > > > trace will see the reason with the 'SKB_DROP_REASON' prefix stripped. > > > > > > But we have to deal with other cases. A few questions are listed here: > > > 1) What about tcp_send_active_reset() in TCP/MPTCP? Passing weird drop > > > reasons? There is no drop reason at all. I think people will get > > > confused. So I believe we should invent new definitions to cope with > > > it. > > > 2) What about the .send_reset callback in the subflow_syn_recv_sock() > > > in MPTCP? The reasons in MPTCP are only definitions (such as > > > MPTCP_RST_EUNSPEC). I don't think we can convert them into the enum > > > skb_drop_reason type. > > > > > > So where should we group those various reasons? > > > > > > Introducing a new enum is for extension and compatibility for all > > > kinds of reset reasons. > > > > > > What do you think? > > > > I will stop repeating myself. > > > > enums are not what you think. > > > > type safety is there for a reason. > > > > Can you show me another place in networking stacks where we cast enums > > to others ? > > No, I've checked this a month ago. > > BTW, I don't know the dangers of casting enum types. I know you will s/will/won't/ > answer it, but I still insist on asking, hoping someone seeing this > will help me. > > Using skb_drop_reason can only deal with the passive reset in TCP. It > is just one of all kinds of reset cases :( > > Forgive me, I cannot come up with a good way to cover all the cases TT > > I've tried..Sorry... > > If other experts see this thread, please help me. I would appreciate > it. I have strong interests and feel strong responsibility to > implement something like this patch series. It can be very useful!! > > Thanks again.
On Fri, 19 Apr 2024 16:00:20 +0800 Jason Xing <kerneljasonxing@gmail.com> wrote: > If other experts see this thread, please help me. I would appreciate > it. I have strong interests and feel strong responsibility to > implement something like this patch series. It can be very useful!! I'm not a networking expert, but as I'm Cc'd and this is about tracing, I'll jump in to see if I can help. Honestly, reading the thread, it appears that you and Eric are talking past each other. I believe Eric is concerned about losing the value of the enum. Enums are types, and if you typecast them to another type, they lose the previous type, and all the safety that goes with it. Now, I do not really understand the problem trying to be solved here. I understand how TCP works but I never looked into the implementation of MPTCP. You added this: +static inline enum sk_rst_reason convert_mptcp_reason(u32 reason) +{ + return reason += RST_REASON_START; +} And used it for places like this: @@ -309,8 +309,13 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk, return dst; dst_release(dst); - if (!req->syncookie) - tcp_request_sock_ops.send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); + if (!req->syncookie) { + struct mptcp_ext *mpext = mptcp_get_ext(skb); + enum sk_rst_reason reason; + + reason = convert_mptcp_reason(mpext->reset_reason); + tcp_request_sock_ops.send_reset(sk, skb, reason); + } return NULL; } As I don't know this code or how MPTCP works, I do not understand the above. It use to pass to send_reset() SK_RST_REASON_NOT_SPECIFIED. But now it takes a "reset_reason" calls the "convert_mptcp_reason()" to get back a enum value. If you are mapping the reset_reason to enum sk_rst_reason, why not do it via a real conversion instead of this fragile arithmetic between the two values? static inline enum sk_rst_reason convert_mptcp_reason(u32 reason) { switch(reason) { case 0: return SK_RST_REASON_MPTCP_RST_EUNSPEC; case 1: return SK_RST_REASON_MPTCP_RST_EMPTCP; case 2: return SK_RST_REASON_MPTCP_RST_ERESOURCE; [..] default: return SK_RST_REASON_MAX; // or some other error value ] } I'm not sure if this is any better, but it's not doing any casting and it's easier to understand. It's a simple mapping between the reason and the enum and there's no inherit dependency between the values. Could possibly create enums for the reason numbers and replace the hard coded values with them. That way that helper function is at least doing a real conversion of one type to another. But like I said from the beginning. I don't understand the details here and have not spent the time to dig deeper. I just read the thread and I agree with Eric that the arithmetic conversion of reason to an enum looks fragile at best and buggy at worst. -- Steve
Hello Steven, On Sat, Apr 20, 2024 at 10:36 AM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Fri, 19 Apr 2024 16:00:20 +0800 > Jason Xing <kerneljasonxing@gmail.com> wrote: > > > If other experts see this thread, please help me. I would appreciate > > it. I have strong interests and feel strong responsibility to > > implement something like this patch series. It can be very useful!! > > I'm not a networking expert, but as I'm Cc'd and this is about tracing, > I'll jump in to see if I can help. Honestly, reading the thread, it > appears that you and Eric are talking past each other. > > I believe Eric is concerned about losing the value of the enum. Enums > are types, and if you typecast them to another type, they lose the > previous type, and all the safety that goes with it. Ah, I see. Possible lost value in another enum could cause a problem. > > Now, I do not really understand the problem trying to be solved here. I > understand how TCP works but I never looked into the implementation of > MPTCP. > > You added this: > > +static inline enum sk_rst_reason convert_mptcp_reason(u32 reason) > +{ > + return reason += RST_REASON_START; > +} > > And used it for places like this: > > @@ -309,8 +309,13 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk, > return dst; > > dst_release(dst); > - if (!req->syncookie) > - tcp_request_sock_ops.send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); > + if (!req->syncookie) { > + struct mptcp_ext *mpext = mptcp_get_ext(skb); > + enum sk_rst_reason reason; > + > + reason = convert_mptcp_reason(mpext->reset_reason); > + tcp_request_sock_ops.send_reset(sk, skb, reason); > + } > return NULL; > } > > As I don't know this code or how MPTCP works, I do not understand the > above. It use to pass to send_reset() SK_RST_REASON_NOT_SPECIFIED. But > now it takes a "reset_reason" calls the "convert_mptcp_reason()" to get > back a enum value. > > If you are mapping the reset_reason to enum sk_rst_reason, why not do > it via a real conversion instead of this fragile arithmetic between the two > values? > > static inline enum sk_rst_reason convert_mptcp_reason(u32 reason) > { > switch(reason) { > case 0: return SK_RST_REASON_MPTCP_RST_EUNSPEC; > case 1: return SK_RST_REASON_MPTCP_RST_EMPTCP; > case 2: return SK_RST_REASON_MPTCP_RST_ERESOURCE; > [..] > default: return SK_RST_REASON_MAX; // or some other error value > ] > } This code snippet looks much better than mine. > > I'm not sure if this is any better, but it's not doing any casting and > it's easier to understand. It's a simple mapping between the reason and > the enum and there's no inherit dependency between the values. Could > possibly create enums for the reason numbers and replace the hard coded > values with them. Right. I also need to handle many drop reasons cases like what you do. Due to too many of them, I will try the key-value map instead of switch-case and then see if it works. > > That way that helper function is at least doing a real conversion of > one type to another. > > But like I said from the beginning. I don't understand the details here > and have not spent the time to dig deeper. I just read the thread and I > agree with Eric that the arithmetic conversion of reason to an enum > looks fragile at best and buggy at worst. Thanks so much for your help, which I didn't even imagine. Sure, after one night of investigation, I figured it out. I will drop enum casts without any doubt as Eric and you suggested. But I believe a new enum is needed, grouping various reasons together which help ftrace print the valid string to userspace. Thanks, Jason > > -- Steve
From: Jason Xing <kernelxing@tencent.com> In production, there are so many cases about why the RST skb is sent but we don't have a very convenient/fast method to detect the exact underlying reasons. RST is implemented in two kinds: passive kind (like tcp_v4_send_reset()) and active kind (like tcp_send_active_reset()). The former can be traced carefully 1) in TCP, with the help of drop reasons, which is based on Eric's idea[1], 2) in MPTCP, with the help of reset options defined in RFC 8684. The latter is relatively independent, which should be implemented on our own. In this series, I focus on the fundamental implement mostly about how the rstreason mechnism works and give the detailed passive part as an example, not including the active reset part. In future, we can go further and refine those NOT_SPECIFIED reasons. Here are some examples when tracing: <idle>-0 [002] ..s1. 1830.262425: tcp_send_reset: skbaddr=x skaddr=x src=x dest=x state=x reason=NOT_SPECIFIED <idle>-0 [002] ..s1. 1830.262425: tcp_send_reset: skbaddr=x skaddr=x src=x dest=x state=x reason=NO_SOCKET [1] Link: https://lore.kernel.org/all/CANn89iJw8x-LqgsWOeJQQvgVg6DnL5aBRLi10QN2WBdr+X4k=w@mail.gmail.com/ v6 1. add back casts, or else they are treated as error. v5 Link: https://lore.kernel.org/all/20240411115630.38420-1-kerneljasonxing@gmail.com/ 1. address format issue (like reverse xmas tree) (Eric, Paolo) 2. remove unnecessary casts. (Eric) 3. introduce a helper used in mptcp active reset. See patch 6. (Paolo) v4 Link: https://lore.kernel.org/all/20240409100934.37725-1-kerneljasonxing@gmail.com/ 1. passing 'enum sk_rst_reason' for readability when tracing (Antoine) v3 Link: https://lore.kernel.org/all/20240404072047.11490-1-kerneljasonxing@gmail.com/ 1. rebase (mptcp part) and address what Mat suggested. v2 Link: https://lore.kernel.org/all/20240403185033.47ebc6a9@kernel.org/ 1. rebase against the latest net-next tree Jason Xing (7): net: introduce rstreason to detect why the RST is sent rstreason: prepare for passive reset rstreason: prepare for active reset tcp: support rstreason for passive reset mptcp: support rstreason for passive reset mptcp: introducing a helper into active reset logic rstreason: make it work in trace world include/net/request_sock.h | 4 +- include/net/rstreason.h | 93 ++++++++++++++++++++++++++++++++++++++ include/net/tcp.h | 3 +- include/trace/events/tcp.h | 37 +++++++++++++-- net/dccp/ipv4.c | 10 ++-- net/dccp/ipv6.c | 10 ++-- net/dccp/minisocks.c | 3 +- net/ipv4/tcp.c | 15 ++++-- net/ipv4/tcp_ipv4.c | 14 +++--- net/ipv4/tcp_minisocks.c | 3 +- net/ipv4/tcp_output.c | 5 +- net/ipv4/tcp_timer.c | 9 ++-- net/ipv6/tcp_ipv6.c | 17 ++++--- net/mptcp/protocol.c | 2 +- net/mptcp/protocol.h | 11 +++++ net/mptcp/subflow.c | 27 ++++++++--- 16 files changed, 216 insertions(+), 47 deletions(-) create mode 100644 include/net/rstreason.h