[net-next,7/9] net-timestamp: open gate for bpf_setsockopt

Message ID 20241008095109.99918-8-kerneljasonxing@gmail.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series: net-timestamp: bpf extension to equip applications transparently

Checks

Context                         Check    Description
netdev/series_format            success  Posting correctly formatted
netdev/tree_selection           success  Clearly marked for net-next, async
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 6 this patch: 6
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           success  CCed 17 of 17 maintainers
netdev/build_clang              success  Errors and warnings before: 6 this patch: 6
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 31 this patch: 31
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 6 this patch: 6
netdev/source_inline            success  Was 0 now: 0

Commit Message

Jason Xing Oct. 8, 2024, 9:51 a.m. UTC
From: Jason Xing <kernelxing@tencent.com>

Now we allow users to set tsflags through bpf_setsockopt. What I
want to do is pass the SOF_TIMESTAMPING_RX_SOFTWARE flag, so that
we can generate rx timestamps the moment the skb traverses the
driver.

Here is an example:

case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB:
	sock_opt = SOF_TIMESTAMPING_RX_SOFTWARE;
	bpf_setsockopt(skops, SOL_SOCKET, SO_TIMESTAMPING,
		       &sock_opt, sizeof(sock_opt));
	break;

In this way, we can use a bpf program that helps us generate and
report rx timestamps.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 net/core/filter.c | 3 +++
 1 file changed, 3 insertions(+)
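
For readers more familiar with the user-space API, the bpf_setsockopt()
call in the example above corresponds to an ordinary
setsockopt(SO_TIMESTAMPING) call. A minimal user-space sketch follows. The
helper names enable_rx_sw_timestamping() and get_timestamping_flags() are
made up for illustration and are not part of this patch or series.

```c
#include <linux/net_tstamp.h>   /* SOF_TIMESTAMPING_* flag bits */
#include <sys/socket.h>         /* setsockopt(), getsockopt(), SO_TIMESTAMPING */
#include <unistd.h>             /* close() */

/*
 * Hypothetical helper: request software rx timestamps on fd.
 * Returns 0 on success and -1 on error (errno is set by setsockopt()).
 */
static int enable_rx_sw_timestamping(int fd)
{
	int flags = SOF_TIMESTAMPING_RX_SOFTWARE;

	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
			  &flags, sizeof(flags));
}

/*
 * Hypothetical helper: read the timestamping flags currently stored on
 * the socket. Returns the flags, or -1 if getsockopt() fails.
 */
static int get_timestamping_flags(int fd)
{
	int flags = 0;
	socklen_t len = sizeof(flags);

	if (getsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, &len) < 0)
		return -1;
	return flags;
}
```

On a Linux system with SO_TIMESTAMPING support, calling
enable_rx_sw_timestamping() on a freshly created TCP socket and then
get_timestamping_flags() should report SOF_TIMESTAMPING_RX_SOFTWARE among
the returned flags.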

Comments

Martin KaFai Lau Oct. 9, 2024, 7:19 a.m. UTC | #1
On 10/8/24 2:51 AM, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
> 
> Now we allow users to set tsflags through bpf_setsockopt. What I
> want to do is passing SOF_TIMESTAMPING_RX_SOFTWARE flag, so that
> we can generate rx timestamps the moment the skb traverses through
> driver.
> 
> Here is an example:
> 
> case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
> case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB:
> 	sock_opt = SOF_TIMESTAMPING_RX_SOFTWARE;
> 	bpf_setsockopt(skops, SOL_SOCKET, SO_TIMESTAMPING,
> 		       &sock_opt, sizeof(sock_opt));
> 	break;
> 
> In this way, we can use bpf program that help us generate and report
> rx timestamp.
> 
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
>   net/core/filter.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index bd0d08bf76bb..9ce99d320571 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5225,6 +5225,9 @@ static int sol_socket_sockopt(struct sock *sk, int optname,
>   		break;
>   	case SO_BINDTODEVICE:
>   		break;
> +	case SO_TIMESTAMPING_NEW:
> +	case SO_TIMESTAMPING_OLD:

I believe this change was proposed before. It will change the user expectation
on the sk_error_queue. It needs some bits/fields/knobs for bpf. I think this
point is similar to others' earlier comments in this thread.

I have only had a chance to look at it briefly. I think it is useful. This
bpf/timestamp feature request has come up before.

A high-level comment: the current timestamping should work for non-TCP sockets,
right? The bpf/timestamp solution should be able to as well.

sockops is TCP-centric. From looking at patch 9, which needs to initialize 4
args, this interface feels old, and I am not sure we want to extend it to other
sock types. This needs some thought.

> +		break;
>   	default:
>   		return -EINVAL;
>   	}

Jason Xing Oct. 9, 2024, 8:09 a.m. UTC | #2
On Wed, Oct 9, 2024 at 3:19 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 10/8/24 2:51 AM, Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > Now we allow users to set tsflags through bpf_setsockopt. What I
> > want to do is passing SOF_TIMESTAMPING_RX_SOFTWARE flag, so that
> > we can generate rx timestamps the moment the skb traverses through
> > driver.
> >
> > Here is an example:
> >
> > case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
> > case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB:
> >       sock_opt = SOF_TIMESTAMPING_RX_SOFTWARE;
> >       bpf_setsockopt(skops, SOL_SOCKET, SO_TIMESTAMPING,
> >                      &sock_opt, sizeof(sock_opt));
> >       break;
> >
> > In this way, we can use bpf program that help us generate and report
> > rx timestamp.
> >
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > ---
> >   net/core/filter.c | 3 +++
> >   1 file changed, 3 insertions(+)
> >
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index bd0d08bf76bb..9ce99d320571 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -5225,6 +5225,9 @@ static int sol_socket_sockopt(struct sock *sk, int optname,
> >               break;
> >       case SO_BINDTODEVICE:
> >               break;
> > +     case SO_TIMESTAMPING_NEW:
> > +     case SO_TIMESTAMPING_OLD:
>
> I believe this change was proposed before. It will change the user expectation
> on the sk_error_queue. It needs some bits/fields/knobs for bpf. I think this
> point is similar to other's earlier comments in this thread.

Thanks for your reply.

After seeing what you mentioned, I searched through the mailing list
and found an earlier proposal [1], which was designed to fetch hardware
timestamps.

[1]:https://lore.kernel.org/bpf/51fd5249-140a-4f1b-b20e-703f159e88a3@linux.dev/T/

>
> I only have a chance to briefly look at it. I think it is useful. This
> bpf/timestamp feature request has come up before.

At the very beginning, I had no intention of using bpf_setsockopt() to
retrieve the rx timestamp because it overrides sk_tsflags, but I could
not find a good approach like the one I used for the tx path, which only
sets a field in the skb. I'm not sure whether this override behaviour is
acceptable, so I posted it to hear what the bpf experts suggest.

>
> A high level comment. The current timestamp should work for non tcp sock? The
> bpf/timestamp solution should be able to also.

For now, it only supports TCP. I would like to quickly implement
a framework that is also suitable for other protocols. TCP is just a
starting point.

>
> sockops is tcp centric. From looking at patch 9 that needs to initialize 4 args,
> this interface feels old and not sure we want to extend to other sock types.
> This needs some thoughts.

I am interested in extending it to other sock types, but I should
ask for Willem's opinion first.

+Willem de Bruijn Do you want this bpf extension feature to extend to
other protos?

Thanks,
Jason
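
The override concern raised in this reply can be pictured with a small
user-space model. This is only a sketch: struct model_sock,
model_set_timestamping() and model_app_then_bpf() are hypothetical
stand-ins, and the model assumes (as the reply states) that setting
SO_TIMESTAMPING replaces sk_tsflags instead of merging into it.

```c
#include <linux/net_tstamp.h>   /* SOF_TIMESTAMPING_* flag bits */

/* Hypothetical stand-in for the single struct sock field of interest. */
struct model_sock {
	unsigned int sk_tsflags;
};

/* Model of the store: the new value replaces whatever was there before. */
static void model_set_timestamping(struct model_sock *sk, unsigned int val)
{
	sk->sk_tsflags = val;
}

/*
 * The application first asks for software tx timestamps, then a bpf
 * program sets only the rx flag. Returns the flags left on the socket.
 */
static unsigned int model_app_then_bpf(void)
{
	struct model_sock sk = { 0 };

	/* Application: setsockopt(SO_TIMESTAMPING) with tx reporting flags. */
	model_set_timestamping(&sk, SOF_TIMESTAMPING_TX_SOFTWARE |
				    SOF_TIMESTAMPING_SOFTWARE);

	/* bpf program: bpf_setsockopt() with only the rx flag. */
	model_set_timestamping(&sk, SOF_TIMESTAMPING_RX_SOFTWARE);

	return sk.sk_tsflags;
}
```

In this model only SOF_TIMESTAMPING_RX_SOFTWARE remains after the second
call, so the flags the application asked for are lost. That is the
behaviour the reply asks the bpf maintainers to judge.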

Willem de Bruijn Oct. 9, 2024, 1:23 p.m. UTC | #3
Jason Xing wrote:
> On Wed, Oct 9, 2024 at 3:19 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >
> > On 10/8/24 2:51 AM, Jason Xing wrote:
> > > From: Jason Xing <kernelxing@tencent.com>
> > >
> > > Now we allow users to set tsflags through bpf_setsockopt. What I
> > > want to do is passing SOF_TIMESTAMPING_RX_SOFTWARE flag, so that
> > > we can generate rx timestamps the moment the skb traverses through
> > > driver.
> > >
> > > Here is an example:
> > >
> > > case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
> > > case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB:
> > >       sock_opt = SOF_TIMESTAMPING_RX_SOFTWARE;
> > >       bpf_setsockopt(skops, SOL_SOCKET, SO_TIMESTAMPING,
> > >                      &sock_opt, sizeof(sock_opt));
> > >       break;
> > >
> > > In this way, we can use bpf program that help us generate and report
> > > rx timestamp.
> > >
> > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > ---
> > >   net/core/filter.c | 3 +++
> > >   1 file changed, 3 insertions(+)
> > >
> > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > index bd0d08bf76bb..9ce99d320571 100644
> > > --- a/net/core/filter.c
> > > +++ b/net/core/filter.c
> > > @@ -5225,6 +5225,9 @@ static int sol_socket_sockopt(struct sock *sk, int optname,
> > >               break;
> > >       case SO_BINDTODEVICE:
> > >               break;
> > > +     case SO_TIMESTAMPING_NEW:
> > > +     case SO_TIMESTAMPING_OLD:
> >
> > I believe this change was proposed before. It will change the user expectation
> > on the sk_error_queue. It needs some bits/fields/knobs for bpf. I think this
> > point is similar to other's earlier comments in this thread.
> 
> Thanks for your reply.
> 
> After seeing what you mentioned, I searched through the mailing list
> and found one [1] which was designed to fetch hardware timestamps.
> 
> [1]:https://lore.kernel.org/bpf/51fd5249-140a-4f1b-b20e-703f159e88a3@linux.dev/T/
> 
> >
> > I only have a chance to briefly look at it. I think it is useful. This
> > bpf/timestamp feature request has come up before.
> 
> At the very beginning, I had no intention to use bpf_setsockopt() to
> retrieve the rx timestamp because it will override sk_tsflags, but I
> cannot implement a good way like what I did to tx path: only setting
> skb's field. I'm not sure if this override behaviour is acceptable, so
> I post it to know what the bpf experts' suggestions are.
> 
> >
> > A high level comment. The current timestamp should work for non tcp sock? The
> > bpf/timestamp solution should be able to also.
> 
> For now, it only supports TCP proto. I would like to quickly implement
> a framework which is also suitable for other protos. TCP is just a
> start point.
> 
> >
> > sockops is tcp centric. From looking at patch 9 that needs to initialize 4 args,
> > this interface feels old and not sure we want to extend to other sock types.
> > This needs some thoughts.
> 
> For me, I have interests to extend to other sock types. But I'm
> supposed to ask Willem's opinion first.
> 
> +Willem de Bruijn Do you want this bpf extension feature to extend to
> other protos?

There would likely be users for other protocols too, just like
SO_TIMESTAMPING. Though TCP is probably the most widely used case by
far.

Jason Xing Oct. 9, 2024, 1:48 p.m. UTC | #4
On Wed, Oct 9, 2024 at 9:23 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Jason Xing wrote:
> > On Wed, Oct 9, 2024 at 3:19 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> > >
> > > On 10/8/24 2:51 AM, Jason Xing wrote:
> > > > From: Jason Xing <kernelxing@tencent.com>
> > > >
> > > > Now we allow users to set tsflags through bpf_setsockopt. What I
> > > > want to do is passing SOF_TIMESTAMPING_RX_SOFTWARE flag, so that
> > > > we can generate rx timestamps the moment the skb traverses through
> > > > driver.
> > > >
> > > > Here is an example:
> > > >
> > > > case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
> > > > case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB:
> > > >       sock_opt = SOF_TIMESTAMPING_RX_SOFTWARE;
> > > >       bpf_setsockopt(skops, SOL_SOCKET, SO_TIMESTAMPING,
> > > >                      &sock_opt, sizeof(sock_opt));
> > > >       break;
> > > >
> > > > In this way, we can use bpf program that help us generate and report
> > > > rx timestamp.
> > > >
> > > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > > ---
> > > >   net/core/filter.c | 3 +++
> > > >   1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > > index bd0d08bf76bb..9ce99d320571 100644
> > > > --- a/net/core/filter.c
> > > > +++ b/net/core/filter.c
> > > > @@ -5225,6 +5225,9 @@ static int sol_socket_sockopt(struct sock *sk, int optname,
> > > >               break;
> > > >       case SO_BINDTODEVICE:
> > > >               break;
> > > > +     case SO_TIMESTAMPING_NEW:
> > > > +     case SO_TIMESTAMPING_OLD:
> > >
> > > I believe this change was proposed before. It will change the user expectation
> > > on the sk_error_queue. It needs some bits/fields/knobs for bpf. I think this
> > > point is similar to other's earlier comments in this thread.
> >
> > Thanks for your reply.
> >
> > After seeing what you mentioned, I searched through the mailing list
> > and found one [1] which was designed to fetch hardware timestamps.
> >
> > [1]:https://lore.kernel.org/bpf/51fd5249-140a-4f1b-b20e-703f159e88a3@linux.dev/T/
> >
> > >
> > > I only have a chance to briefly look at it. I think it is useful. This
> > > bpf/timestamp feature request has come up before.
> >
> > At the very beginning, I had no intention to use bpf_setsockopt() to
> > retrieve the rx timestamp because it will override sk_tsflags, but I
> > cannot implement a good way like what I did to tx path: only setting
> > skb's field. I'm not sure if this override behaviour is acceptable, so
> > I post it to know what the bpf experts' suggestions are.
> >
> > >
> > > A high level comment. The current timestamp should work for non tcp sock? The
> > > bpf/timestamp solution should be able to also.
> >
> > For now, it only supports TCP proto. I would like to quickly implement
> > a framework which is also suitable for other protos. TCP is just a
> > start point.
> >
> > >
> > > sockops is tcp centric. From looking at patch 9 that needs to initialize 4 args,
> > > this interface feels old and not sure we want to extend to other sock types.
> > > This needs some thoughts.
> >
> > For me, I have interests to extend to other sock types. But I'm
> > supposed to ask Willem's opinion first.
> >
> > +Willem de Bruijn Do you want this bpf extension feature to extend to
> > other protos?
>
> There would likely be users for other protocols too, just like
> SO_TIMESTAMPING. Though TCP is probably the most widely used case by
> far.

Agreed!

Patch

diff --git a/net/core/filter.c b/net/core/filter.c
index bd0d08bf76bb..9ce99d320571 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5225,6 +5225,9 @@  static int sol_socket_sockopt(struct sock *sk, int optname,
 		break;
 	case SO_BINDTODEVICE:
 		break;
+	case SO_TIMESTAMPING_NEW:
+	case SO_TIMESTAMPING_OLD:
+		break;
 	default:
 		return -EINVAL;
 	}