Message ID | 20210701200535.1033513-1-kafai@fb.com (mailing list archive) |
---|---|
Series | bpf: Allow bpf tcp iter to do bpf_(get\|set)sockopt |
From: Martin KaFai Lau
> Sent: 01 July 2021 21:06
>
> This set is to allow bpf tcp iter to call bpf_(get|set)sockopt.

How does that work at all?

IIRC only setsockopt() was converted so that it is callable
with a kernel buffer.
The corresponding change wasn't done to getsockopt().

David

On Fri, Jul 02, 2021 at 10:50:43AM +0000, David Laight wrote:
> From: Martin KaFai Lau
> > Sent: 01 July 2021 21:06
> >
> > This set is to allow bpf tcp iter to call bpf_(get|set)sockopt.
>
> How does that work at all?
>
> IIRC only setsockopt() was converted so that it is callable
> with a kernel buffer.
> The corresponding change wasn't done to getsockopt().

It calls _bpf_getsockopt which does not depend on sys_getsockopt.

On Thu, Jul 1, 2021 at 1:05 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> This set is to allow bpf tcp iter to call bpf_(get|set)sockopt.
>
> With bpf-tcp-cc, new algo rollout happens more often. Instead of
> restarting the applications to pick up the new tcp-cc, this set
> allows the bpf tcp iter to call bpf_(get|set)sockopt(TCP_CONGESTION).
> It is not limited to TCP_CONGESTION, the bpf tcp iter can call
> bpf_(get|set)sockopt() with other options. The bpf tcp iter can read
> into all the fields of a tcp_sock, so there is a lot of flexibility
> to select the desired sk to do setsockopt(), e.g. it can test for
> TCP_LISTEN only and leave the established connections untouched,
> or check the addr/port, or check the current tcp-cc name, ...etc.
>
> Patch 1-4 are some cleanup and prep work in the tcp and bpf seq_file.
>
> Patch 5 is to have the tcp seq_file iterate on the
> port+addr lhash2 instead of the port only listening_hash.
...
>  include/linux/bpf.h           |   8 +
>  include/net/inet_hashtables.h |   6 +
>  include/net/tcp.h             |   1 -
>  kernel/bpf/bpf_iter.c         |  22 +
>  kernel/trace/bpf_trace.c      |   7 +-
>  net/core/filter.c             |  34 ++
>  net/ipv4/tcp_ipv4.c           | 410 ++++++++++++++----

Eric,

Could you please review this set where it touches inet bits?
I've looked a few times and it all looks fine to me, but I'm no expert
in those parts.

On Wed, Jul 14, 2021 at 6:29 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Jul 1, 2021 at 1:05 PM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > This set is to allow bpf tcp iter to call bpf_(get|set)sockopt.
> >
> > With bpf-tcp-cc, new algo rollout happens more often. Instead of
> > restarting the applications to pick up the new tcp-cc, this set
> > allows the bpf tcp iter to call bpf_(get|set)sockopt(TCP_CONGESTION).
> > It is not limited to TCP_CONGESTION, the bpf tcp iter can call
> > bpf_(get|set)sockopt() with other options. The bpf tcp iter can read
> > into all the fields of a tcp_sock, so there is a lot of flexibility
> > to select the desired sk to do setsockopt(), e.g. it can test for
> > TCP_LISTEN only and leave the established connections untouched,
> > or check the addr/port, or check the current tcp-cc name, ...etc.
> >
> > Patch 1-4 are some cleanup and prep work in the tcp and bpf seq_file.
> >
> > Patch 5 is to have the tcp seq_file iterate on the
> > port+addr lhash2 instead of the port only listening_hash.
> ...
> >  include/linux/bpf.h           |   8 +
> >  include/net/inet_hashtables.h |   6 +
> >  include/net/tcp.h             |   1 -
> >  kernel/bpf/bpf_iter.c         |  22 +
> >  kernel/trace/bpf_trace.c      |   7 +-
> >  net/core/filter.c             |  34 ++
> >  net/ipv4/tcp_ipv4.c           | 410 ++++++++++++++----
>
> Eric,
>
> Could you please review this set where it touches inet bits?
> I've looked a few times and it all looks fine to me, but I'm no expert
> in those parts.

Eric,

ping!
If you're on vacation or something I'm inclined to land the patches
and let Martin address your review feedback in follow up patches.

Thanks

Hi there.

I was indeed on vacation, but I am back, and done with my netdev
presentation :)

I will take a look, thanks !

On Tue, Jul 20, 2021 at 8:05 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Jul 14, 2021 at 6:29 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Jul 1, 2021 at 1:05 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > >
> > > This set is to allow bpf tcp iter to call bpf_(get|set)sockopt.
> > >
> > > With bpf-tcp-cc, new algo rollout happens more often. Instead of
> > > restarting the applications to pick up the new tcp-cc, this set
> > > allows the bpf tcp iter to call bpf_(get|set)sockopt(TCP_CONGESTION).
> > > It is not limited to TCP_CONGESTION, the bpf tcp iter can call
> > > bpf_(get|set)sockopt() with other options. The bpf tcp iter can read
> > > into all the fields of a tcp_sock, so there is a lot of flexibility
> > > to select the desired sk to do setsockopt(), e.g. it can test for
> > > TCP_LISTEN only and leave the established connections untouched,
> > > or check the addr/port, or check the current tcp-cc name, ...etc.
> > >
> > > Patch 1-4 are some cleanup and prep work in the tcp and bpf seq_file.
> > >
> > > Patch 5 is to have the tcp seq_file iterate on the
> > > port+addr lhash2 instead of the port only listening_hash.
> > ...
> > >  include/linux/bpf.h           |   8 +
> > >  include/net/inet_hashtables.h |   6 +
> > >  include/net/tcp.h             |   1 -
> > >  kernel/bpf/bpf_iter.c         |  22 +
> > >  kernel/trace/bpf_trace.c      |   7 +-
> > >  net/core/filter.c             |  34 ++
> > >  net/ipv4/tcp_ipv4.c           | 410 ++++++++++++++----
> >
> > Eric,
> >
> > Could you please review this set where it touches inet bits?
> > I've looked a few times and it all looks fine to me, but I'm no expert
> > in those parts.
>
> Eric,
>
> ping!
> If you're on vacation or something I'm inclined to land the patches
> and let Martin address your review feedback in follow up patches.
>
> Thanks

On 7/1/21 10:05 PM, Martin KaFai Lau wrote:
> This set is to allow bpf tcp iter to call bpf_(get|set)sockopt.
>
> With bpf-tcp-cc, new algo rollout happens more often. Instead of
> restarting the applications to pick up the new tcp-cc, this set
> allows the bpf tcp iter to call bpf_(get|set)sockopt(TCP_CONGESTION).
> It is not limited to TCP_CONGESTION, the bpf tcp iter can call
> bpf_(get|set)sockopt() with other options. The bpf tcp iter can read
> into all the fields of a tcp_sock, so there is a lot of flexibility
> to select the desired sk to do setsockopt(), e.g. it can test for
> TCP_LISTEN only and leave the established connections untouched,
> or check the addr/port, or check the current tcp-cc name, ...etc.
>
> Patch 1-4 are some cleanup and prep work in the tcp and bpf seq_file.
>
> Patch 5 is to have the tcp seq_file iterate on the
> port+addr lhash2 instead of the port only listening_hash.
>
> Patch 6 is to have the bpf tcp iter doing batching which
> then allows lock_sock. lock_sock is needed for setsockopt.
>
> Patch 7 allows the bpf tcp iter to call bpf_(get|set)sockopt.
>
> v2:
> - Use __GFP_NOWARN in patch 6
> - Add bpf_getsockopt() in patch 7 to give a symmetrical user experience.
>   selftest in patch 8 is changed to also cover bpf_getsockopt().
> - Remove CAP_NET_ADMIN check in patch 7. Tracing bpf prog has already
>   required CAP_SYS_ADMIN or CAP_PERFMON.
> - Move some def macros to bpf_tracing_net.h in patch 8
>
> Martin KaFai Lau (8):
>   tcp: seq_file: Avoid skipping sk during tcp_seek_last_pos
>   tcp: seq_file: Refactor net and family matching
>   bpf: tcp: seq_file: Remove bpf_seq_afinfo from tcp_iter_state
>   tcp: seq_file: Add listening_get_first()
>   tcp: seq_file: Replace listening_hash with lhash2
>   bpf: tcp: bpf iter batching and lock_sock
>   bpf: tcp: Support bpf_(get|set)sockopt in bpf tcp iter
>   bpf: selftest: Test batching and bpf_(get|set)sockopt in bpf tcp iter

For the whole series :

Reviewed-by: Eric Dumazet <edumazet@google.com>

Sorry for the delay.

BTW, it seems weird for new BPF features to use /proc/net "legacy"
infrastructure and update it.

From: Martin KaFai Lau <kafai@fb.com>
Date: Thu, 1 Jul 2021 13:05:35 -0700
> This set is to allow bpf tcp iter to call bpf_(get|set)sockopt.
>
> With bpf-tcp-cc, new algo rollout happens more often. Instead of
> restarting the applications to pick up the new tcp-cc, this set
> allows the bpf tcp iter to call bpf_(get|set)sockopt(TCP_CONGESTION).
> It is not limited to TCP_CONGESTION, the bpf tcp iter can call
> bpf_(get|set)sockopt() with other options. The bpf tcp iter can read
> into all the fields of a tcp_sock, so there is a lot of flexibility
> to select the desired sk to do setsockopt(), e.g. it can test for
> TCP_LISTEN only and leave the established connections untouched,
> or check the addr/port, or check the current tcp-cc name, ...etc.
>
> Patch 1-4 are some cleanup and prep work in the tcp and bpf seq_file.
>
> Patch 5 is to have the tcp seq_file iterate on the
> port+addr lhash2 instead of the port only listening_hash.
>
> Patch 6 is to have the bpf tcp iter doing batching which
> then allows lock_sock. lock_sock is needed for setsockopt.
>
> Patch 7 allows the bpf tcp iter to call bpf_(get|set)sockopt.

I have a comment on the first patch, but the series looks good to me.

Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>

On Thu, Jul 22, 2021 at 03:25:39PM +0200, Eric Dumazet wrote:
>
>
> On 7/1/21 10:05 PM, Martin KaFai Lau wrote:
> > This set is to allow bpf tcp iter to call bpf_(get|set)sockopt.
> >
> > With bpf-tcp-cc, new algo rollout happens more often. Instead of
> > restarting the applications to pick up the new tcp-cc, this set
> > allows the bpf tcp iter to call bpf_(get|set)sockopt(TCP_CONGESTION).
> > It is not limited to TCP_CONGESTION, the bpf tcp iter can call
> > bpf_(get|set)sockopt() with other options. The bpf tcp iter can read
> > into all the fields of a tcp_sock, so there is a lot of flexibility
> > to select the desired sk to do setsockopt(), e.g. it can test for
> > TCP_LISTEN only and leave the established connections untouched,
> > or check the addr/port, or check the current tcp-cc name, ...etc.
> >
> > Patch 1-4 are some cleanup and prep work in the tcp and bpf seq_file.
> >
> > Patch 5 is to have the tcp seq_file iterate on the
> > port+addr lhash2 instead of the port only listening_hash.
> >
> > Patch 6 is to have the bpf tcp iter doing batching which
> > then allows lock_sock. lock_sock is needed for setsockopt.
> >
> > Patch 7 allows the bpf tcp iter to call bpf_(get|set)sockopt.
> >
> > v2:
> > - Use __GFP_NOWARN in patch 6
> > - Add bpf_getsockopt() in patch 7 to give a symmetrical user experience.
> >   selftest in patch 8 is changed to also cover bpf_getsockopt().
> > - Remove CAP_NET_ADMIN check in patch 7. Tracing bpf prog has already
> >   required CAP_SYS_ADMIN or CAP_PERFMON.
> > - Move some def macros to bpf_tracing_net.h in patch 8
> >
> > Martin KaFai Lau (8):
> >   tcp: seq_file: Avoid skipping sk during tcp_seek_last_pos
> >   tcp: seq_file: Refactor net and family matching
> >   bpf: tcp: seq_file: Remove bpf_seq_afinfo from tcp_iter_state
> >   tcp: seq_file: Add listening_get_first()
> >   tcp: seq_file: Replace listening_hash with lhash2
> >   bpf: tcp: bpf iter batching and lock_sock
> >   bpf: tcp: Support bpf_(get|set)sockopt in bpf tcp iter
> >   bpf: selftest: Test batching and bpf_(get|set)sockopt in bpf tcp iter
>
> For the whole series :
>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
>
> Sorry for the delay.
>
> BTW, it seems weird for new BPF features to use /proc/net "legacy"
> infrastructure and update it.

bpf iter uses seq_file, so the initial bpf_iter_tcp reuses most of
the pieces from /proc/net/tcp. This set refactored a few things
such that the bpf_iter_tcp only shares the legacy tcp_seek_last_pos(),
so the dependency on /proc/net/tcp should be less going forward.
A similar modification could also be done to bpf_iter_udp in the future.

Thanks for the review!

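
For readers comparing against the bpf_(get|set)sockopt(TCP_CONGESTION) calls discussed in this thread, the per-socket user-space equivalent can be sketched roughly as below. This is only an illustration of the ordinary setsockopt()/getsockopt() syscalls that an application would have to issue itself on each of its own sockets; it is not code from the patch set. The helper names set_cc() and get_cc() and the CC_NAME_MAX constant are made up for this sketch, and it assumes a Linux system where the requested algorithm (e.g. "reno") is available and permitted.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_CONGESTION */
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical local constant; assumed to match the kernel's 16-byte
 * limit on congestion control names. */
#define CC_NAME_MAX 16

/* Switch one TCP socket to the named congestion control algorithm.
 * Returns 0 on success, -1 on failure with errno set by setsockopt(). */
int set_cc(int fd, const char *name)
{
    assert(name != NULL);
    return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
                      name, (socklen_t)strlen(name));
}

/* Read the current congestion control name of one TCP socket into buf.
 * buf is always NUL-terminated on success. Returns 0 or -1. */
int get_cc(int fd, char buf[CC_NAME_MAX])
{
    socklen_t len = CC_NAME_MAX;

    memset(buf, 0, CC_NAME_MAX);
    if (getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, buf, &len) < 0)
        return -1;
    buf[CC_NAME_MAX - 1] = '\0';
    return 0;
}
```

An application would call set_cc(fd, "reno") on a socket it owns and could confirm the change with get_cc(). The point made in the cover letter is that the bpf tcp iter can apply the same option across the selected sockets from outside the application, without a restart.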