Message ID | 20230320192410.1624645-1-kuifeng@meta.com (mailing list archive) |
---|---|
Series | Transit between BPF TCP congestion controls. |
This thread was sent by accident. Please ignore it and check the other thread sent with the same subject. Sorry for wasting your time.

On 3/20/23 12:24, Kui-Feng Lee wrote:
> Major changes:
>
>  - Create bpf_links in the kernel for BPF struct_ops to register and
>    unregister it.
>
>  - Enable switching between implementations of bpf-tcp-cc under a
>    name instantly by replacing the backing struct_ops map of a
>    bpf_link.
>
> Previously, a BPF struct_ops stayed registered even after the user
> program that created it was terminated, although it was never pinned.
> For instance, the TCP congestion control subsystem indirectly
> maintains a reference count on the struct_ops of any registered BPF
> implemented algorithm. Thus, the algorithm won't be deactivated until
> someone deliberately unregisters it. For compatibility with other BPF
> programs, bpf_links have been created to work in coordination with
> struct_ops maps. This ensures that the registration and unregistration
> of these respective maps is carried out at the start and end of the
> bpf_link.
>
> We also faced complications when attempting to replace an existing TCP
> congestion control algorithm with a new implementation on the fly. A
> struct_ops map was used to register a TCP congestion control algorithm
> with a unique name. We had to either register the alternative
> implementation with a new name and move over, or unregister the current
> one before being able to reregister with the same name. To fix
> this problem, we add an option to migrate the registration of the
> algorithm from struct_ops maps to bpf_links. By modifying the backing
> map of a bpf_link, it becomes possible to replace an existing
> TCP congestion control algorithm with ease.
>
> ---
>
> The major differences from v8:
>
>  - Check bpf_struct_ops::{validate,update} in
>    bpf_struct_ops_map_alloc()
>
> The major differences from v7:
>
>  - Use synchronize_rcu_mult(call_rcu, call_rcu_tasks) to replace
>    synchronize_rcu() and synchronize_rcu_tasks().
>
>  - Call synchronize_rcu() in tcp_update_congestion_control().
>
>  - Handle -EBUSY in bpf_map__attach_struct_ops() to allow a struct_ops
>    to be used to create links more than once. Include a test case.
>
>  - Add old_map_fd to bpf_attr and handle BPF_F_REPLACE in
>    bpf_struct_ops_map_link_update().
>
>  - Remove changes in bpf_dummy_struct_ops.c and add a check of .update
>    function pointer of bpf_struct_ops.
>
> The major differences from v6:
>
>  - Reword commit logs of the patch 1, 2, and 8.
>
>  - Call synchronize_rcu_tasks() as well in bpf_struct_ops_map_free().
>
>  - Refactor bpf_struct_ops_map_free() so that
>    bpf_struct_ops_map_alloc() can free a struct_ops without waiting
>    for a RCU grace period.
>
> The major differences from v5:
>
>  - Add a new step to bpf_object__load() to prepare vdata.
>
>  - Accept BPF_F_REPLACE.
>
>  - Check section IDs in find_struct_ops_map_by_offset()
>
>  - Add a test case to check mixing w/ and w/o link struct_ops.
>
>  - Add a test case of using struct_ops w/o link to update a link.
>
>  - Improve bpf_link__detach_struct_ops() to handle the w/ link case.
>
> The major differences from v4:
>
>  - Rebase.
>
>  - Reorder patches and merge part 4 to part 2 of the v4.
>
> The major differences from v3:
>
>  - Remove bpf_struct_ops_map_free_rcu(), and use synchronize_rcu().
>
>  - Improve the commit log of the part 1.
>
>  - Before transitioning to the READY state, we conduct a value check
>    to ensure that struct_ops can be successfully utilized and links
>    created later.
>
> The major differences from v2:
>
>  - Simplify states
>
>    - Remove TOBEUNREG.
>
>    - Rename UNREG to READY.
>
>  - Stop using the refcnt of the kvalue of a struct_ops. Explicitly
>    increase and decrease the refcount of struct_ops.
>
>  - Prepare kernel vdata during the load phase of libbpf.
>
> The major differences from v1:
>
>  - Added bpf_struct_ops_link to replace the previous union-based
>    approach.
>
>  - Added UNREG and TOBEUNREG to the state of bpf_struct_ops_map.
>
>  - bpf_struct_ops_transit_state() maintains state transitions.
>
>  - Fixed synchronization issue.
>
>  - Prepare kernel vdata of struct_ops during the loading phase of
>    bpf_object.
>
>  - Merged previous patch 3 to patch 1.
>
> v8: https://lore.kernel.org/all/20230318053144.1180301-1-kuifeng@meta.com/
> v7: https://lore.kernel.org/all/20230316023641.2092778-1-kuifeng@meta.com/
> v6: https://lore.kernel.org/all/20230310043812.3087672-1-kuifeng@meta.com/
> v5: https://lore.kernel.org/all/20230308005050.255859-1-kuifeng@meta.com/
> v4: https://lore.kernel.org/all/20230307232913.576893-1-andrii@kernel.org/
> v3: https://lore.kernel.org/all/20230303012122.852654-1-kuifeng@meta.com/
> v2: https://lore.kernel.org/bpf/20230223011238.12313-1-kuifeng@meta.com/
> v1: https://lore.kernel.org/bpf/20230214221718.503964-1-kuifeng@meta.com/
>
> Kui-Feng Lee (8):
>   bpf: Retire the struct_ops map kvalue->refcnt.
>   net: Update an existing TCP congestion control algorithm.
>   bpf: Create links for BPF struct_ops maps.
>   libbpf: Create a bpf_link in bpf_map__attach_struct_ops().
>   bpf: Update the struct_ops of a bpf_link.
>   libbpf: Update a bpf_link with another struct_ops.
>   libbpf: Use .struct_ops.link section to indicate a struct_ops with a
>     link.
>   selftests/bpf: Test switching TCP Congestion Control algorithms.
>
>  include/linux/bpf.h                           |  11 +
>  include/net/tcp.h                             |   3 +
>  include/uapi/linux/bpf.h                      |  33 ++-
>  kernel/bpf/bpf_struct_ops.c                   | 250 +++++++++++++++---
>  kernel/bpf/syscall.c                          |  63 ++++-
>  net/ipv4/bpf_tcp_ca.c                         |  14 +-
>  net/ipv4/tcp_cong.c                           |  65 ++++-
>  tools/include/uapi/linux/bpf.h                |  33 ++-
>  tools/lib/bpf/libbpf.c                        | 190 ++++++++++---
>  tools/lib/bpf/libbpf.h                        |   1 +
>  tools/lib/bpf/libbpf.map                      |   1 +
>  .../selftests/bpf/prog_tests/bpf_tcp_ca.c     | 116 ++++++++
>  .../selftests/bpf/progs/tcp_ca_update.c       |  80 ++++++
>  13 files changed, 759 insertions(+), 101 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/progs/tcp_ca_update.c
>
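
For readers skimming the cover letter, the idea of replacing the backing struct_ops map of a bpf_link under an unchanged algorithm name can be illustrated with a small userspace toy model. This is only a sketch and not the kernel or libbpf code from the series: every toy_* name, the struct layouts, and the choice of error codes are assumptions made for illustration. The expected_old argument loosely mimics the old_map_fd / BPF_F_REPLACE check mentioned in the v7 changelog.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a struct_ops map holding one CC implementation. */
struct toy_cc_ops {
	const char *name;                            /* name the algorithm registers under */
	unsigned int (*ssthresh)(unsigned int cwnd); /* one sample callback */
};

/* Hypothetical stand-in for a bpf_link: the registration lives with the link. */
struct toy_link {
	const struct toy_cc_ops *ops;                /* backing "map"; NULL if unattached */
};

/* Attach ops to an empty link. -EBUSY for an occupied link is a toy choice. */
int toy_link_attach(struct toy_link *link, const struct toy_cc_ops *ops)
{
	if (!link || !ops || !ops->name || !ops->ssthresh)
		return -EINVAL;
	if (link->ops)
		return -EBUSY;
	link->ops = ops;
	return 0;
}

/*
 * Swap the backing ops of an attached link.
 * - If expected_old is non-NULL, the current ops must be exactly that object.
 * - The new ops must carry the same name, so users of the name see no gap.
 */
int toy_link_update(struct toy_link *link, const struct toy_cc_ops *new_ops,
		    const struct toy_cc_ops *expected_old)
{
	if (!link || !link->ops || !new_ops || !new_ops->name || !new_ops->ssthresh)
		return -EINVAL;
	if (expected_old && link->ops != expected_old)
		return -EPERM;
	if (strcmp(link->ops->name, new_ops->name) != 0)
		return -EINVAL;
	link->ops = new_ops;
	return 0;
}

/* Call through the link, as a consumer of the registered name would. */
unsigned int toy_link_ssthresh(const struct toy_link *link, unsigned int cwnd)
{
	assert(link && link->ops);
	return link->ops->ssthresh(cwnd);
}

/* Two sample implementations that could share one name. */
unsigned int toy_halve(unsigned int cwnd)
{
	unsigned int v = cwnd / 2;
	return v > 2 ? v : 2;
}

unsigned int toy_gentle(unsigned int cwnd)
{
	unsigned int v = cwnd * 7 / 10;
	return v > 2 ? v : 2;
}
```

A caller would attach one toy_cc_ops to a toy_link, then call toy_link_update() with a second toy_cc_ops of the same name. Later calls through the link use the new implementation, and an update with a different name or a stale expected_old is rejected.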