Message ID | 20241207173803.90744-11-kerneljasonxing@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net-timestamp: bpf extension to equip applications transparently | expand |
On 12/7/24 9:38 AM, Jason Xing wrote: > From: Jason Xing <kernelxing@tencent.com> > > For now, there are three phases where we are not able to fetch > the right seqno from the skops->skb_data, because: > 1) in __dev_queue_xmit(), the skb->data doesn't point to the start > offset in tcp header. > 2) in tcp_ack_tstamp(), the skb doesn't have the tcp header. > > In the long run, we may add other trace points for bpf extension. > And the shinfo->tskey is always the same value for both bpf and > non-bpf cases. With that said, let's directly use shinfo->tskey > for TCP protocol. > > Signed-off-by: Jason Xing <kernelxing@tencent.com> > --- > net/core/skbuff.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index 7c59ef501c74..2e13643f791c 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -5544,7 +5544,7 @@ static void __skb_tstamp_tx_bpf(struct sock *sk, struct sk_buff *skb, > int tstype) > { > struct timespec64 tstamp; > - u32 args[2] = {0, 0}; > + u32 args[3] = {0, 0, 0}; > int op; > > if (!sk) > @@ -5569,7 +5569,10 @@ static void __skb_tstamp_tx_bpf(struct sock *sk, struct sk_buff *skb, > return; > } > > - bpf_skops_tx_timestamping(sk, skb, op, 2, args); > + if (sk_is_tcp(sk)) > + args[2] = skb_shinfo(skb)->tskey; Instead of only passing one info "skb_shinfo(skb)->tskey" of a skb, pass the whole skb ptr to the bpf prog. Take a look at bpf_skops_init_skb. Lets start with end_offset = 0 for now so that the bpf prog won't use it to read the skb->data. It can be revisited later. bpf_skops_init_skb(&sock_ops, skb, 0); The bpf prog can use bpf_cast_to_kern_ctx() and bpf_core_cast() to get to the skb_shinfo(skb). Take a look at the md_skb example in type_cast.c. Then it needs to add a bpf_sock->op check to the existing bpf_sock_ops_{load,store}_hdr_opt() helpers to ensure these helpers can only be used by the BPF_SOCK_OPS_PARSE_HDR_OPT_CB, BPF_SOCK_OPS_HDR_OPT_LEN_CB, and BPF_SOCK_OPS_WRITE_HDR_OPT_CB callback. btw, how is the ack_skb used for the SCM_TSTAMP_ACK by the user space now?
On Fri, Dec 13, 2024 at 8:28 AM Martin KaFai Lau <martin.lau@linux.dev> wrote: > > On 12/7/24 9:38 AM, Jason Xing wrote: > > From: Jason Xing <kernelxing@tencent.com> > > > > For now, there are three phases where we are not able to fetch > > the right seqno from the skops->skb_data, because: > > 1) in __dev_queue_xmit(), the skb->data doesn't point to the start > > offset in tcp header. > > 2) in tcp_ack_tstamp(), the skb doesn't have the tcp header. > > > > In the long run, we may add other trace points for bpf extension. > > And the shinfo->tskey is always the same value for both bpf and > > non-bpf cases. With that said, let's directly use shinfo->tskey > > for TCP protocol. > > > > Signed-off-by: Jason Xing <kernelxing@tencent.com> > > --- > > net/core/skbuff.c | 7 +++++-- > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > > index 7c59ef501c74..2e13643f791c 100644 > > --- a/net/core/skbuff.c > > +++ b/net/core/skbuff.c > > @@ -5544,7 +5544,7 @@ static void __skb_tstamp_tx_bpf(struct sock *sk, struct sk_buff *skb, > > int tstype) > > { > > struct timespec64 tstamp; > > - u32 args[2] = {0, 0}; > > + u32 args[3] = {0, 0, 0}; > > int op; > > > > if (!sk) > > @@ -5569,7 +5569,10 @@ static void __skb_tstamp_tx_bpf(struct sock *sk, struct sk_buff *skb, > > return; > > } > > > > - bpf_skops_tx_timestamping(sk, skb, op, 2, args); > > + if (sk_is_tcp(sk)) > > + args[2] = skb_shinfo(skb)->tskey; > > Instead of only passing one info "skb_shinfo(skb)->tskey" of a skb, pass the > whole skb ptr to the bpf prog. Take a look at bpf_skops_init_skb. Lets start > with end_offset = 0 for now so that the bpf prog won't use it to read the > skb->data. It can be revisited later. > > bpf_skops_init_skb(&sock_ops, skb, 0); > > The bpf prog can use bpf_cast_to_kern_ctx() and bpf_core_cast() to get to the > skb_shinfo(skb). Take a look at the md_skb example in type_cast.c. Sorry, I didn't give it much thought on getting to the shinfo. That's why I quickly gave up using bpf_skops_init_skb() after I noticed the seq of skb is always zero :( I will test it tomorrow. Thanks. > > Then it needs to add a bpf_sock->op check to the existing > bpf_sock_ops_{load,store}_hdr_opt() helpers to ensure these helpers can only be > used by the BPF_SOCK_OPS_PARSE_HDR_OPT_CB, BPF_SOCK_OPS_HDR_OPT_LEN_CB, and > BPF_SOCK_OPS_WRITE_HDR_OPT_CB callback. Forgive me. I cannot see how the bpf_sock_ops_load_hdr_opt helper has something to do with the current thread? Could you enlighten me? > > btw, how is the ack_skb used for the SCM_TSTAMP_ACK by the user space now? To be honest, I hardly use the ack_skb[1] under this circumstance... I think if someone offers a suggestion to use it, then we can support it? [1] commit e7ed11ee945438b737e2ae2370e35591e16ec371 Author: Yousuk Seung <ysseung@google.com> Date: Wed Jan 20 12:41:55 2021 -0800 tcp: add TTL to SCM_TIMESTAMPING_OPT_STATS This patch adds TCP_NLA_TTL to SCM_TIMESTAMPING_OPT_STATS that exports the time-to-live or hop limit of the latest incoming packet with SCM_TSTAMP_ACK. The value exported may not be from the packet that acks the sequence when incoming packets are aggregated. Exporting the time-to-live or hop limit value of incoming packets helps to estimate the hop count of the path of the flow that may change over time.
On 12/13/24 7:44 AM, Jason Xing wrote: >>> @@ -5569,7 +5569,10 @@ static void __skb_tstamp_tx_bpf(struct sock *sk, struct sk_buff *skb, >>> return; >>> } >>> >>> - bpf_skops_tx_timestamping(sk, skb, op, 2, args); >>> + if (sk_is_tcp(sk)) >>> + args[2] = skb_shinfo(skb)->tskey; >> Instead of only passing one info "skb_shinfo(skb)->tskey" of a skb, pass the >> whole skb ptr to the bpf prog. Take a look at bpf_skops_init_skb. Lets start >> with end_offset = 0 for now so that the bpf prog won't use it to read the >> skb->data. It can be revisited later. >> >> bpf_skops_init_skb(&sock_ops, skb, 0); >> >> The bpf prog can use bpf_cast_to_kern_ctx() and bpf_core_cast() to get to the >> skb_shinfo(skb). Take a look at the md_skb example in type_cast.c. > Sorry, I didn't give it much thought on getting to the shinfo. That's > why I quickly gave up using bpf_skops_init_skb() after I noticed the > seq of skb is always zero
On Sat, Dec 14, 2024 at 7:55 AM Martin KaFai Lau <martin.lau@linux.dev> wrote: > > On 12/13/24 7:44 AM, Jason Xing wrote: > >>> @@ -5569,7 +5569,10 @@ static void __skb_tstamp_tx_bpf(struct sock *sk, struct sk_buff *skb, > >>> return; > >>> } > >>> > >>> - bpf_skops_tx_timestamping(sk, skb, op, 2, args); > >>> + if (sk_is_tcp(sk)) > >>> + args[2] = skb_shinfo(skb)->tskey; > >> Instead of only passing one info "skb_shinfo(skb)->tskey" of a skb, pass the > >> whole skb ptr to the bpf prog. Take a look at bpf_skops_init_skb. Lets start > >> with end_offset = 0 for now so that the bpf prog won't use it to read the > >> skb->data. It can be revisited later. > >> > >> bpf_skops_init_skb(&sock_ops, skb, 0); > >> > >> The bpf prog can use bpf_cast_to_kern_ctx() and bpf_core_cast() to get to the > >> skb_shinfo(skb). Take a look at the md_skb example in type_cast.c. > > Sorry, I didn't give it much thought on getting to the shinfo. That's > > why I quickly gave up using bpf_skops_init_skb() after I noticed the > > seq of skb is always zero
diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 7c59ef501c74..2e13643f791c 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -5544,7 +5544,7 @@ static void __skb_tstamp_tx_bpf(struct sock *sk, struct sk_buff *skb, int tstype) { struct timespec64 tstamp; - u32 args[2] = {0, 0}; + u32 args[3] = {0, 0, 0}; int op; if (!sk) @@ -5569,7 +5569,10 @@ static void __skb_tstamp_tx_bpf(struct sock *sk, struct sk_buff *skb, return; } - bpf_skops_tx_timestamping(sk, skb, op, 2, args); + if (sk_is_tcp(sk)) + args[2] = skb_shinfo(skb)->tskey; + + bpf_skops_tx_timestamping(sk, skb, op, 3, args); } static void skb_tstamp_tx_output(struct sk_buff *orig_skb,