Message ID | 20240409100934.37725-3-kerneljasonxing@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Implement reset reason mechanism to detect | expand |
Quoting Jason Xing (2024-04-09 12:09:30) > void (*send_reset)(const struct sock *sk, > - struct sk_buff *skb); > + struct sk_buff *skb, > + int reason); I get that 'int' is used instead of 'enum sk_rst_reason' to allow passing drop reasons too without casting, but that makes understanding what should be 'reason' harder. Eg. when looking at the code or when using BTF (to then install debugging probes with BPF) this is not obvious. A similar approach could be done as the one used for drop reasons: enum skb_drop_reason is used for parameters (eg. kfree_skb_reason) but other valid values (subsystem drop reasons) can be used too if casted (to u32). We could use 'enum sk_rst_reason' and cast the other values. WDYT? Thanks, Antoine
Hi Antoine, On Wed, Apr 10, 2024 at 8:14 PM Antoine Tenart <atenart@kernel.org> wrote: > > Quoting Jason Xing (2024-04-09 12:09:30) > > void (*send_reset)(const struct sock *sk, > > - struct sk_buff *skb); > > + struct sk_buff *skb, > > + int reason); > > I get that 'int' is used instead of 'enum sk_rst_reason' to allow > passing drop reasons too without casting, but that makes understanding Yes! > what should be 'reason' harder. Eg. when looking at the code or when > using BTF (to then install debugging probes with BPF) this is not > obvious. Only one number if we want to extract the reason with BPF, right? I haven't tried it. > > A similar approach could be done as the one used for drop reasons: enum > skb_drop_reason is used for parameters (eg. kfree_skb_reason) but other > valid values (subsystem drop reasons) can be used too if casted (to > u32). We could use 'enum sk_rst_reason' and cast the other values. WDYT? I have been haunted by this 'issue' for a long time... Are you suggesting doing so as below for readability: 1) replace the reason parameter in all the related functions (like .send_reset(), tcp_v4_send_reset(), etc) by using 'enum sk_rst_reason' type? 2) in patch [4/6], when it needs to pass the specific reason in those functions, we can cast it to 'enum sk_rst_reason'? One modification I just made based on this patchset if I understand correctly: diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 4889fccbf754..e0419b8496b5 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -725,7 +725,7 @@ static bool tcp_v4_ao_sign_reset(const struct sock *sk, struct sk_buff *skb, */ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb, - int reason) + enum sk_rst_reason reason) { const struct tcphdr *th = tcp_hdr(skb); struct { @@ -1935,7 +1935,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) return 0; reset: - tcp_v4_send_reset(rsk, skb, reason); + tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason); discard: kfree_skb_reason(skb, reason); /* Be careful here. If this function gets more complicated and It, indeed, looks better (clean and clear) :) So I also ought to adjust the trace_tcp_send_reset part... Thanks, Jason
Quoting Jason Xing (2024-04-10 14:54:51) > Hi Antoine, > > On Wed, Apr 10, 2024 at 8:14 PM Antoine Tenart <atenart@kernel.org> wrote: > > > > Quoting Jason Xing (2024-04-09 12:09:30) > > > void (*send_reset)(const struct sock *sk, > > > - struct sk_buff *skb); > > > + struct sk_buff *skb, > > > + int reason); > > > what should be 'reason' harder. Eg. when looking at the code or when > > using BTF (to then install debugging probes with BPF) this is not > > obvious. > > Only one number if we want to extract the reason with BPF, right? I > haven't tried it. Yes, we can get 'reason'. Knowing the type helps. > > A similar approach could be done as the one used for drop reasons: enum > > skb_drop_reason is used for parameters (eg. kfree_skb_reason) but other > > valid values (subsystem drop reasons) can be used too if casted (to > > u32). We could use 'enum sk_rst_reason' and cast the other values. WDYT? > > I have been haunted by this 'issue' for a long time... > > Are you suggesting doing so as below for readability: > 1) replace the reason parameter in all the related functions (like > .send_reset(), tcp_v4_send_reset(), etc) by using 'enum sk_rst_reason' > type? > 2) in patch [4/6], when it needs to pass the specific reason in those > functions, we can cast it to 'enum sk_rst_reason'? > > One modification I just made based on this patchset if I understand correctly: > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > index 4889fccbf754..e0419b8496b5 100644 > --- a/net/ipv4/tcp_ipv4.c > +++ b/net/ipv4/tcp_ipv4.c > @@ -725,7 +725,7 @@ static bool tcp_v4_ao_sign_reset(const struct sock > *sk, struct sk_buff *skb, > */ > > static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb, > - int reason) > + enum sk_rst_reason reason) > { > const struct tcphdr *th = tcp_hdr(skb); > struct { > @@ -1935,7 +1935,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) > return 0; > > reset: > - tcp_v4_send_reset(rsk, skb, reason); > + tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason); > discard: > kfree_skb_reason(skb, reason); > /* Be careful here. If this function gets more complicated and > That's right. I think (u32) can also be used for the cast to make the compiler happy in 2), but the above makes sense. Thanks! Antoine
On Wed, Apr 10, 2024 at 9:21 PM Antoine Tenart <atenart@kernel.org> wrote: > > Quoting Jason Xing (2024-04-10 14:54:51) > > Hi Antoine, > > > > On Wed, Apr 10, 2024 at 8:14 PM Antoine Tenart <atenart@kernel.org> wrote: > > > > > > Quoting Jason Xing (2024-04-09 12:09:30) > > > > void (*send_reset)(const struct sock *sk, > > > > - struct sk_buff *skb); > > > > + struct sk_buff *skb, > > > > + int reason); > > > > > what should be 'reason' harder. Eg. when looking at the code or when > > > using BTF (to then install debugging probes with BPF) this is not > > > obvious. > > > > Only one number if we want to extract the reason with BPF, right? I > > haven't tried it. > > Yes, we can get 'reason'. Knowing the type helps. > > > > A similar approach could be done as the one used for drop reasons: enum > > > skb_drop_reason is used for parameters (eg. kfree_skb_reason) but other > > > valid values (subsystem drop reasons) can be used too if casted (to > > > u32). We could use 'enum sk_rst_reason' and cast the other values. WDYT? > > > > I have been haunted by this 'issue' for a long time... > > > > Are you suggesting doing so as below for readability: > > 1) replace the reason parameter in all the related functions (like > > .send_reset(), tcp_v4_send_reset(), etc) by using 'enum sk_rst_reason' > > type? > > 2) in patch [4/6], when it needs to pass the specific reason in those > > functions, we can cast it to 'enum sk_rst_reason'? > > > > One modification I just made based on this patchset if I understand correctly: > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > > index 4889fccbf754..e0419b8496b5 100644 > > --- a/net/ipv4/tcp_ipv4.c > > +++ b/net/ipv4/tcp_ipv4.c > > @@ -725,7 +725,7 @@ static bool tcp_v4_ao_sign_reset(const struct sock > > *sk, struct sk_buff *skb, > > */ > > > > static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb, > > - int reason) > > + enum sk_rst_reason reason) > > { > > const struct tcphdr *th = tcp_hdr(skb); > > struct { > > @@ -1935,7 +1935,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) > > return 0; > > > > reset: > > - tcp_v4_send_reset(rsk, skb, reason); > > + tcp_v4_send_reset(rsk, skb, (enum sk_rst_reason)reason); > > discard: > > kfree_skb_reason(skb, reason); > > /* Be careful here. If this function gets more complicated and > > > > That's right. I think (u32) can also be used for the cast to make the > compiler happy in 2), but the above makes sense. Got it :) Will update soon. Thanks, Jason
diff --git a/include/net/request_sock.h b/include/net/request_sock.h index 004e651e6067..93f9fee7e52f 100644 --- a/include/net/request_sock.h +++ b/include/net/request_sock.h @@ -34,7 +34,8 @@ struct request_sock_ops { void (*send_ack)(const struct sock *sk, struct sk_buff *skb, struct request_sock *req); void (*send_reset)(const struct sock *sk, - struct sk_buff *skb); + struct sk_buff *skb, + int reason); void (*destructor)(struct request_sock *req); void (*syn_ack_timeout)(const struct request_sock *req); }; diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c index 9fc9cea4c251..11b8d14be3e2 100644 --- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -24,6 +24,7 @@ #include <net/xfrm.h> #include <net/secure_seq.h> #include <net/netns/generic.h> +#include <net/rstreason.h> #include "ackvec.h" #include "ccid.h" @@ -521,7 +522,8 @@ static int dccp_v4_send_response(const struct sock *sk, struct request_sock *req return err; } -static void dccp_v4_ctl_send_reset(const struct sock *sk, struct sk_buff *rxskb) +static void dccp_v4_ctl_send_reset(const struct sock *sk, struct sk_buff *rxskb, + int reason) { int err; const struct iphdr *rxiph; @@ -706,7 +708,7 @@ int dccp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) return 0; reset: - dccp_v4_ctl_send_reset(sk, skb); + dccp_v4_ctl_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); kfree_skb(skb); return 0; } @@ -869,7 +871,7 @@ static int dccp_v4_rcv(struct sk_buff *skb) if (nsk == sk) { reqsk_put(req); } else if (dccp_child_process(sk, nsk, skb)) { - dccp_v4_ctl_send_reset(sk, skb); + dccp_v4_ctl_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); goto discard_and_relse; } else { sock_put(sk); @@ -909,7 +911,7 @@ static int dccp_v4_rcv(struct sk_buff *skb) if (dh->dccph_type != DCCP_PKT_RESET) { DCCP_SKB_CB(skb)->dccpd_reset_code = DCCP_RESET_CODE_NO_CONNECTION; - dccp_v4_ctl_send_reset(sk, skb); + dccp_v4_ctl_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); } discard_it: diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index c8ca703dc331..232092dc3887 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -29,6 +29,7 @@ #include <net/secure_seq.h> #include <net/netns/generic.h> #include <net/sock.h> +#include <net/rstreason.h> #include "dccp.h" #include "ipv6.h" @@ -256,7 +257,8 @@ static void dccp_v6_reqsk_destructor(struct request_sock *req) kfree_skb(inet_rsk(req)->pktopts); } -static void dccp_v6_ctl_send_reset(const struct sock *sk, struct sk_buff *rxskb) +static void dccp_v6_ctl_send_reset(const struct sock *sk, struct sk_buff *rxskb, + int reason) { const struct ipv6hdr *rxip6h; struct sk_buff *skb; @@ -656,7 +658,7 @@ static int dccp_v6_do_rcv(struct sock *sk, struct sk_buff *skb) return 0; reset: - dccp_v6_ctl_send_reset(sk, skb); + dccp_v6_ctl_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); discard: if (opt_skb != NULL) __kfree_skb(opt_skb); @@ -762,7 +764,7 @@ static int dccp_v6_rcv(struct sk_buff *skb) if (nsk == sk) { reqsk_put(req); } else if (dccp_child_process(sk, nsk, skb)) { - dccp_v6_ctl_send_reset(sk, skb); + dccp_v6_ctl_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); goto discard_and_relse; } else { sock_put(sk); @@ -801,7 +803,7 @@ static int dccp_v6_rcv(struct sk_buff *skb) if (dh->dccph_type != DCCP_PKT_RESET) { DCCP_SKB_CB(skb)->dccpd_reset_code = DCCP_RESET_CODE_NO_CONNECTION; - dccp_v6_ctl_send_reset(sk, skb); + dccp_v6_ctl_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); } discard_it: diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c index 64d805b27add..251a57cf5822 100644 --- a/net/dccp/minisocks.c +++ b/net/dccp/minisocks.c @@ -15,6 +15,7 @@ #include <net/sock.h> #include <net/xfrm.h> #include <net/inet_timewait_sock.h> +#include <net/rstreason.h> #include "ackvec.h" #include "ccid.h" @@ -202,7 +203,7 @@ struct sock *dccp_check_req(struct sock *sk, struct sk_buff *skb, DCCP_SKB_CB(skb)->dccpd_reset_code = DCCP_RESET_CODE_TOO_BUSY; drop: if (dccp_hdr(skb)->dccph_type != DCCP_PKT_RESET) - req->rsk_ops->send_reset(sk, skb); + req->rsk_ops->send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); inet_csk_reqsk_queue_drop(sk, req); out: diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 52963c3bb8ca..21fa69445f7a 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -70,6 +70,7 @@ #include <net/xfrm.h> #include <net/secure_seq.h> #include <net/busy_poll.h> +#include <net/rstreason.h> #include <linux/inet.h> #include <linux/ipv6.h> @@ -723,7 +724,8 @@ static bool tcp_v4_ao_sign_reset(const struct sock *sk, struct sk_buff *skb, * Exception: precedence violation. We do not implement it in any case. */ -static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) +static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb, + int reason) { const struct tcphdr *th = tcp_hdr(skb); struct { @@ -1933,7 +1935,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb) return 0; reset: - tcp_v4_send_reset(rsk, skb); + tcp_v4_send_reset(rsk, skb, SK_RST_REASON_NOT_SPECIFIED); discard: kfree_skb_reason(skb, reason); /* Be careful here. If this function gets more complicated and @@ -2276,7 +2278,7 @@ int tcp_v4_rcv(struct sk_buff *skb) } else { drop_reason = tcp_child_process(sk, nsk, skb); if (drop_reason) { - tcp_v4_send_reset(nsk, skb); + tcp_v4_send_reset(nsk, skb, SK_RST_REASON_NOT_SPECIFIED); goto discard_and_relse; } sock_put(sk); @@ -2354,7 +2356,7 @@ int tcp_v4_rcv(struct sk_buff *skb) bad_packet: __TCP_INC_STATS(net, TCP_MIB_INERRS); } else { - tcp_v4_send_reset(NULL, skb); + tcp_v4_send_reset(NULL, skb, SK_RST_REASON_NOT_SPECIFIED); } discard_it: @@ -2405,7 +2407,7 @@ int tcp_v4_rcv(struct sk_buff *skb) tcp_v4_timewait_ack(sk, skb); break; case TCP_TW_RST: - tcp_v4_send_reset(sk, skb); + tcp_v4_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); inet_twsk_deschedule_put(inet_twsk(sk)); goto discard_it; case TCP_TW_SUCCESS:; diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 5b21a07ddf9a..2d7d81428c07 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -22,6 +22,7 @@ #include <net/tcp.h> #include <net/xfrm.h> #include <net/busy_poll.h> +#include <net/rstreason.h> static bool tcp_in_window(u32 seq, u32 end_seq, u32 s_win, u32 e_win) { @@ -879,7 +880,7 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, * avoid becoming vulnerable to outside attack aiming at * resetting legit local connections. */ - req->rsk_ops->send_reset(sk, skb); + req->rsk_ops->send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); } else if (fastopen) { /* received a valid RST pkt */ reqsk_fastopen_remove(sk, req, true); tcp_reset(sk, skb); diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index cffebaec66f1..7e591521b193 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -60,6 +60,7 @@ #include <net/secure_seq.h> #include <net/hotdata.h> #include <net/busy_poll.h> +#include <net/rstreason.h> #include <linux/proc_fs.h> #include <linux/seq_file.h> @@ -69,7 +70,8 @@ #include <trace/events/tcp.h> -static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb); +static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb, + int reason); static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, struct request_sock *req); @@ -1006,7 +1008,8 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 kfree_skb(buff); } -static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb) +static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb, + int reason) { const struct tcphdr *th = tcp_hdr(skb); struct ipv6hdr *ipv6h = ipv6_hdr(skb); @@ -1675,7 +1678,7 @@ int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb) return 0; reset: - tcp_v6_send_reset(sk, skb); + tcp_v6_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); discard: if (opt_skb) __kfree_skb(opt_skb); @@ -1861,7 +1864,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb) } else { drop_reason = tcp_child_process(sk, nsk, skb); if (drop_reason) { - tcp_v6_send_reset(nsk, skb); + tcp_v6_send_reset(nsk, skb, SK_RST_REASON_NOT_SPECIFIED); goto discard_and_relse; } sock_put(sk); @@ -1937,7 +1940,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb) bad_packet: __TCP_INC_STATS(net, TCP_MIB_INERRS); } else { - tcp_v6_send_reset(NULL, skb); + tcp_v6_send_reset(NULL, skb, SK_RST_REASON_NOT_SPECIFIED); } discard_it: @@ -1992,7 +1995,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb) tcp_v6_timewait_ack(sk, skb); break; case TCP_TW_RST: - tcp_v6_send_reset(sk, skb); + tcp_v6_send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); inet_twsk_deschedule_put(inet_twsk(sk)); goto discard_it; case TCP_TW_SUCCESS: diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 162b218d9858..744c87b6d5a4 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -20,6 +20,8 @@ #include <net/transp_v6.h> #endif #include <net/mptcp.h> +#include <net/rstreason.h> + #include "protocol.h" #include "mib.h" @@ -307,7 +309,7 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk, dst_release(dst); if (!req->syncookie) - tcp_request_sock_ops.send_reset(sk, skb); + tcp_request_sock_ops.send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); return NULL; } @@ -374,7 +376,7 @@ static struct dst_entry *subflow_v6_route_req(const struct sock *sk, dst_release(dst); if (!req->syncookie) - tcp6_request_sock_ops.send_reset(sk, skb); + tcp6_request_sock_ops.send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); return NULL; } #endif @@ -909,7 +911,7 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk, tcp_rsk(req)->drop_req = true; inet_csk_prepare_for_destroy_sock(child); tcp_done(child); - req->rsk_ops->send_reset(sk, skb); + req->rsk_ops->send_reset(sk, skb, SK_RST_REASON_NOT_SPECIFIED); /* The last child reference will be released by the caller */ return child;