Message ID | 20221020221327.3557258-1-yhs@fb.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | bpf: Implement cgroup local storage available to non-cgroup-attached bpf progs | expand |
On Thu, Oct 20, 2022 at 03:13:27PM -0700, Yonghong Song wrote: > Add some descriptions and examples for BPF_MAP_TYPE_CGRP_STROAGE. > Also illustate the major difference between BPF_MAP_TYPE_CGRP_STROAGE > and BPF_MAP_TYPE_CGROUP_STORAGE and recommend to use > BPF_MAP_TYPE_CGRP_STROAGE instead of BPF_MAP_TYPE_CGROUP_STORAGE > in the end. > > Signed-off-by: Yonghong Song <yhs@fb.com> > --- > Documentation/bpf/map_cgrp_storage.rst | 104 +++++++++++++++++++++++++ > 1 file changed, 104 insertions(+) > create mode 100644 Documentation/bpf/map_cgrp_storage.rst > > diff --git a/Documentation/bpf/map_cgrp_storage.rst b/Documentation/bpf/map_cgrp_storage.rst > new file mode 100644 > index 000000000000..15691aab7fc7 > --- /dev/null > +++ b/Documentation/bpf/map_cgrp_storage.rst > @@ -0,0 +1,104 @@ > +.. SPDX-License-Identifier: GPL-2.0-only > +.. Copyright (C) 2022 Meta Platforms, Inc. and affiliates. > + > +=========================== > +BPF_MAP_TYPE_CGRP_STORAGE > +=========================== > + > +The ``BPF_MAP_TYPE_CGRP_STORAGE`` map type represents a local fix-sized > +storage for cgroups. It is only available with ``CONFIG_CGROUP_BPF``. > +The programs are made available by the same Kconfig. The > +data for a particular cgroup can be retrieved by looking up the map > +with that cgroup. > + > +This document describes the usage and semantics of the > +``BPF_MAP_TYPE_CGRP_STORAGE`` map type. > + > +Usage > +===== > + > +The map key must be ``sizeof(int)`` representing a cgroup fd. > +To access the storage in a program, use ``bpf_cgrp_storage_get``:: > + > + void *bpf_cgrp_storage_get(struct bpf_map *map, struct cgroup *cgroup, void *value, u64 flags) > + > +``flags`` could be 0 or ``BPF_LOCAL_STORAGE_GET_F_CREATE`` which indicates that > +a new local storage will be created if one does not exist. > + > +The local storage can be removed with ``bpf_cgrp_storage_delete``:: > + > + long bpf_cgrp_storage_delete(struct bpf_map *map, struct cgruop *cgroup) s/cgruop/cgroup > + > +The map is available to all program types. > + > +Examples > +======== > + > +An bpf-program example with BPF_MAP_TYPE_CGRP_STORAGE:: s/An bpf-program/A bpf program > + > + #include <vmlinux.h> > + #include <bpf/bpf_helpers.h> > + #include <bpf/bpf_tracing.h> > + > + struct { > + __uint(type, BPF_MAP_TYPE_CGRP_STORAGE); > + __uint(map_flags, BPF_F_NO_PREALLOC); > + __type(key, int); > + __type(value, long); > + } cgrp_storage SEC(".maps"); > + > + SEC("tp_btf/sys_enter") > + int BPF_PROG(on_enter, struct pt_regs *regs, long id) > + { > + struct task_struct *task = bpf_get_current_task_btf(); > + long *ptr; > + > + ptr = bpf_cgrp_storage_get(&cgrp_storage, task->cgroups->dfl_cgrp, 0, > + BPF_LOCAL_STORAGE_GET_F_CREATE); > + if (ptr) > + __sync_fetch_and_add(ptr, 1); > + > + return 0; > + } > + > +Userspace accessing map declared above:: > + > + #include <linux/bpf.h> > + #include <linux/libbpf.h> > + > + __u32 map_lookup(struct bpf_map *map, int cgrp_fd) > + { > + __u32 *value; > + value = bpf_map_lookup_elem(bpf_map__fd(map), &cgrp_fd); > + if (value) > + return *value; > + return 0; > + } > + > +Difference Between BPF_MAP_TYPE_CGRP_STORAGE and BPF_MAP_TYPE_CGROUP_STORAGE > +============================================================================ > + > +The main difference between ``BPF_MAP_TYPE_CGRP_STORAGE`` and ``BPF_MAP_TYPE_CGROUP_STORAGE`` > +is that ``BPF_MAP_TYPE_CGRP_STORAGE`` can be used by all program types while > +``BPF_MAP_TYPE_CGROUP_STORAGE`` is available only to cgroup program types. Suggestion: I'd consider going into just a bit more detail about what's meant by "cgroup program types" here. Maybe just 1-2 sentences describing the types of programs where the deprecated cgroup storage map type is available, and why. A program that's using the BPF_MAP_TYPE_CGRP_STORAGE map is conceptually also a "cgroup program type" in that it's querying, analyzing, etc cgroups, so being explicit may help get ahead of any confusion. Also, given that BPF_MAP_TYPE_CGROUP_STORAGE is deprecated, and is extremely close in name to BPF_MAP_TYPE_CGRP_STORAGE, perhaps we should start this section out by explaining that BPF_MAP_TYPE_CGROUP_STORAGE is a deprecated map type that's now aliased to BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED, and then reference it as BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED for the rest of the page. I think it's important to highlight that the map type is deprecated, and give some historical context here so that users understand why they should use BPF_MAP_TYPE_CGRP_STORAGE, and why we have two such confusingly- similarly named maps. What do you think? > +There are some other differences as well. > + > +(1). ``BPF_MAP_TYPE_CGRP_STORAGE`` supports local storage for more than one > + cgroups while ``BPF_MAP_TYPE_CGROUP_STORAGE`` only support one, the one attached s/cgroups/cgroup > + by the bpf program. > + > +(2). ``BPF_MAP_TYPE_CGROUP_STORAGE`` allocates local storage at attach time so > + ``bpf_get_local_storage()`` always returns non-null local storage. Suggestion: s/non-null/non-NULL. In general, I'd suggest changing null to NULL throughout the page, but I don't feel strongly about it. > + ``BPF_MAP_TYPE_CGRP_STORAGE`` allocates local storage at runtime so > + it is possible that ``bpf_cgrp_storage_get()`` may return null local storage. > + To avoid such null local storage issue, user space can do > + ``bpf_map_update_elem()`` to pre-allocate local storage. Should we specify that user space can preallocate by doing bpf_map_update_elem() _before_ the program is attached? Also, another small nit, but I think pre-allocate and de-allocate should just be preallocate and deallocate respectively. > +(3). ``BPF_MAP_TYPE_CGRP_STORAGE`` supports de-allocating local storage by bpf program s/by bpf program/by a bpf program > + while ``BPF_MAP_TYPE_CGROUP_STORAGE`` only de-allocates storage during > + prog detach time. > + > +So overall, ``BPF_MAP_TYPE_CGRP_STORAGE`` supports all ``BPF_MAP_TYPE_CGROUP_STORAGE`` > +functionality and beyond. It is recommended to use ``BPF_MAP_TYPE_CGRP_STORAGE`` > +instead of ``BPF_MAP_TYPE_CGROUP_STORAGE``. As mentioned above, I think we need to go beyond stating that using BPF_MAP_TYPE_CGRP_STORAGE is recommended, and also explicitly and loudly call out that BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED is deprecated and will not be supported indefinitely. Overall though, this looks great. Thank you for writing it up. Thanks, David
On 10/21/22 12:12 AM, David Vernet wrote: > On Thu, Oct 20, 2022 at 03:13:27PM -0700, Yonghong Song wrote: >> Add some descriptions and examples for BPF_MAP_TYPE_CGRP_STROAGE. >> Also illustate the major difference between BPF_MAP_TYPE_CGRP_STROAGE >> and BPF_MAP_TYPE_CGROUP_STORAGE and recommend to use >> BPF_MAP_TYPE_CGRP_STROAGE instead of BPF_MAP_TYPE_CGROUP_STORAGE >> in the end. >> >> Signed-off-by: Yonghong Song <yhs@fb.com> >> --- >> Documentation/bpf/map_cgrp_storage.rst | 104 +++++++++++++++++++++++++ >> 1 file changed, 104 insertions(+) >> create mode 100644 Documentation/bpf/map_cgrp_storage.rst >> >> diff --git a/Documentation/bpf/map_cgrp_storage.rst b/Documentation/bpf/map_cgrp_storage.rst >> new file mode 100644 >> index 000000000000..15691aab7fc7 >> --- /dev/null >> +++ b/Documentation/bpf/map_cgrp_storage.rst >> @@ -0,0 +1,104 @@ >> +.. SPDX-License-Identifier: GPL-2.0-only >> +.. Copyright (C) 2022 Meta Platforms, Inc. and affiliates. >> + >> +=========================== >> +BPF_MAP_TYPE_CGRP_STORAGE >> +=========================== >> + >> +The ``BPF_MAP_TYPE_CGRP_STORAGE`` map type represents a local fix-sized >> +storage for cgroups. It is only available with ``CONFIG_CGROUP_BPF``. >> +The programs are made available by the same Kconfig. The >> +data for a particular cgroup can be retrieved by looking up the map >> +with that cgroup. >> + >> +This document describes the usage and semantics of the >> +``BPF_MAP_TYPE_CGRP_STORAGE`` map type. >> + >> +Usage >> +===== >> + >> +The map key must be ``sizeof(int)`` representing a cgroup fd. >> +To access the storage in a program, use ``bpf_cgrp_storage_get``:: >> + >> + void *bpf_cgrp_storage_get(struct bpf_map *map, struct cgroup *cgroup, void *value, u64 flags) >> + >> +``flags`` could be 0 or ``BPF_LOCAL_STORAGE_GET_F_CREATE`` which indicates that >> +a new local storage will be created if one does not exist. >> + >> +The local storage can be removed with ``bpf_cgrp_storage_delete``:: >> + >> + long bpf_cgrp_storage_delete(struct bpf_map *map, struct cgruop *cgroup) > > s/cgruop/cgroup ack. > >> + >> +The map is available to all program types. >> + >> +Examples >> +======== >> + >> +An bpf-program example with BPF_MAP_TYPE_CGRP_STORAGE:: > > s/An bpf-program/A bpf program ack. > >> + >> + #include <vmlinux.h> >> + #include <bpf/bpf_helpers.h> >> + #include <bpf/bpf_tracing.h> >> + >> + struct { >> + __uint(type, BPF_MAP_TYPE_CGRP_STORAGE); >> + __uint(map_flags, BPF_F_NO_PREALLOC); >> + __type(key, int); >> + __type(value, long); >> + } cgrp_storage SEC(".maps"); >> + >> + SEC("tp_btf/sys_enter") >> + int BPF_PROG(on_enter, struct pt_regs *regs, long id) >> + { >> + struct task_struct *task = bpf_get_current_task_btf(); >> + long *ptr; >> + >> + ptr = bpf_cgrp_storage_get(&cgrp_storage, task->cgroups->dfl_cgrp, 0, >> + BPF_LOCAL_STORAGE_GET_F_CREATE); >> + if (ptr) >> + __sync_fetch_and_add(ptr, 1); >> + >> + return 0; >> + } >> + >> +Userspace accessing map declared above:: >> + >> + #include <linux/bpf.h> >> + #include <linux/libbpf.h> >> + >> + __u32 map_lookup(struct bpf_map *map, int cgrp_fd) >> + { >> + __u32 *value; >> + value = bpf_map_lookup_elem(bpf_map__fd(map), &cgrp_fd); >> + if (value) >> + return *value; >> + return 0; >> + } >> + >> +Difference Between BPF_MAP_TYPE_CGRP_STORAGE and BPF_MAP_TYPE_CGROUP_STORAGE >> +============================================================================ >> + >> +The main difference between ``BPF_MAP_TYPE_CGRP_STORAGE`` and ``BPF_MAP_TYPE_CGROUP_STORAGE`` >> +is that ``BPF_MAP_TYPE_CGRP_STORAGE`` can be used by all program types while >> +``BPF_MAP_TYPE_CGROUP_STORAGE`` is available only to cgroup program types. > > Suggestion: I'd consider going into just a bit more detail about what's > meant by "cgroup program types" here. Maybe just 1-2 sentences > describing the types of programs where the deprecated cgroup storage map > type is available, and why. A program that's using the > BPF_MAP_TYPE_CGRP_STORAGE map is conceptually also a "cgroup program > type" in that it's querying, analyzing, etc cgroups, so being explicit > may help get ahead of any confusion. Right, 'cgroup program types' is not precise either. It should 'only available to programs attached to cgroups'. I will add a few examples like BPF_CGROUP_INET_INGRESS and BPF_CGROUP_SOCK_OPS, etc. > > Also, given that BPF_MAP_TYPE_CGROUP_STORAGE is deprecated, and is > extremely close in name to BPF_MAP_TYPE_CGRP_STORAGE, perhaps we should > start this section out by explaining that BPF_MAP_TYPE_CGROUP_STORAGE is > a deprecated map type that's now aliased to > BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED, and then reference it as > BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED for the rest of the page. I think > it's important to highlight that the map type is deprecated, and give > some historical context here so that users understand why they should > use BPF_MAP_TYPE_CGRP_STORAGE, and why we have two such confusingly- > similarly named maps. What do you think? Make sense. Putting deprecation in the beginning of this section can make readers more aware of the difference from the beginning. > >> +There are some other differences as well. >> + >> +(1). ``BPF_MAP_TYPE_CGRP_STORAGE`` supports local storage for more than one >> + cgroups while ``BPF_MAP_TYPE_CGROUP_STORAGE`` only support one, the one attached > > s/cgroups/cgroup ack. > >> + by the bpf program. >> + >> +(2). ``BPF_MAP_TYPE_CGROUP_STORAGE`` allocates local storage at attach time so >> + ``bpf_get_local_storage()`` always returns non-null local storage. > > Suggestion: s/non-null/non-NULL. In general, I'd suggest changing null > to NULL throughout the page, but I don't feel strongly about it. ack. > >> + ``BPF_MAP_TYPE_CGRP_STORAGE`` allocates local storage at runtime so >> + it is possible that ``bpf_cgrp_storage_get()`` may return null local storage. >> + To avoid such null local storage issue, user space can do >> + ``bpf_map_update_elem()`` to pre-allocate local storage. > > Should we specify that user space can preallocate by doing > bpf_map_update_elem() _before_ the program is attached? Also, another > small nit, but I think pre-allocate and de-allocate should just be > preallocate and deallocate respectively. Yes, bpf_map_update_elem() should be done before attachment. I will add this. > >> +(3). ``BPF_MAP_TYPE_CGRP_STORAGE`` supports de-allocating local storage by bpf program > > s/by bpf program/by a bpf program ack. > >> + while ``BPF_MAP_TYPE_CGROUP_STORAGE`` only de-allocates storage during >> + prog detach time. >> + >> +So overall, ``BPF_MAP_TYPE_CGRP_STORAGE`` supports all ``BPF_MAP_TYPE_CGROUP_STORAGE`` >> +functionality and beyond. It is recommended to use ``BPF_MAP_TYPE_CGRP_STORAGE`` >> +instead of ``BPF_MAP_TYPE_CGROUP_STORAGE``. > > As mentioned above, I think we need to go beyond stating that using > BPF_MAP_TYPE_CGRP_STORAGE is recommended, and also explicitly and loudly > call out that BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED is deprecated and > will not be supported indefinitely. agree. > > Overall though, this looks great. Thank you for writing it up. > > Thanks, > David
diff --git a/Documentation/bpf/map_cgrp_storage.rst b/Documentation/bpf/map_cgrp_storage.rst new file mode 100644 index 000000000000..15691aab7fc7 --- /dev/null +++ b/Documentation/bpf/map_cgrp_storage.rst @@ -0,0 +1,104 @@ +.. SPDX-License-Identifier: GPL-2.0-only +.. Copyright (C) 2022 Meta Platforms, Inc. and affiliates. + +=========================== +BPF_MAP_TYPE_CGRP_STORAGE +=========================== + +The ``BPF_MAP_TYPE_CGRP_STORAGE`` map type represents a local fix-sized +storage for cgroups. It is only available with ``CONFIG_CGROUP_BPF``. +The programs are made available by the same Kconfig. The +data for a particular cgroup can be retrieved by looking up the map +with that cgroup. + +This document describes the usage and semantics of the +``BPF_MAP_TYPE_CGRP_STORAGE`` map type. + +Usage +===== + +The map key must be ``sizeof(int)`` representing a cgroup fd. +To access the storage in a program, use ``bpf_cgrp_storage_get``:: + + void *bpf_cgrp_storage_get(struct bpf_map *map, struct cgroup *cgroup, void *value, u64 flags) + +``flags`` could be 0 or ``BPF_LOCAL_STORAGE_GET_F_CREATE`` which indicates that +a new local storage will be created if one does not exist. + +The local storage can be removed with ``bpf_cgrp_storage_delete``:: + + long bpf_cgrp_storage_delete(struct bpf_map *map, struct cgruop *cgroup) + +The map is available to all program types. + +Examples +======== + +An bpf-program example with BPF_MAP_TYPE_CGRP_STORAGE:: + + #include <vmlinux.h> + #include <bpf/bpf_helpers.h> + #include <bpf/bpf_tracing.h> + + struct { + __uint(type, BPF_MAP_TYPE_CGRP_STORAGE); + __uint(map_flags, BPF_F_NO_PREALLOC); + __type(key, int); + __type(value, long); + } cgrp_storage SEC(".maps"); + + SEC("tp_btf/sys_enter") + int BPF_PROG(on_enter, struct pt_regs *regs, long id) + { + struct task_struct *task = bpf_get_current_task_btf(); + long *ptr; + + ptr = bpf_cgrp_storage_get(&cgrp_storage, task->cgroups->dfl_cgrp, 0, + BPF_LOCAL_STORAGE_GET_F_CREATE); + if (ptr) + __sync_fetch_and_add(ptr, 1); + + return 0; + } + +Userspace accessing map declared above:: + + #include <linux/bpf.h> + #include <linux/libbpf.h> + + __u32 map_lookup(struct bpf_map *map, int cgrp_fd) + { + __u32 *value; + value = bpf_map_lookup_elem(bpf_map__fd(map), &cgrp_fd); + if (value) + return *value; + return 0; + } + +Difference Between BPF_MAP_TYPE_CGRP_STORAGE and BPF_MAP_TYPE_CGROUP_STORAGE +============================================================================ + +The main difference between ``BPF_MAP_TYPE_CGRP_STORAGE`` and ``BPF_MAP_TYPE_CGROUP_STORAGE`` +is that ``BPF_MAP_TYPE_CGRP_STORAGE`` can be used by all program types while +``BPF_MAP_TYPE_CGROUP_STORAGE`` is available only to cgroup program types. + +There are some other differences as well. + +(1). ``BPF_MAP_TYPE_CGRP_STORAGE`` supports local storage for more than one + cgroups while ``BPF_MAP_TYPE_CGROUP_STORAGE`` only support one, the one attached + by the bpf program. + +(2). ``BPF_MAP_TYPE_CGROUP_STORAGE`` allocates local storage at attach time so + ``bpf_get_local_storage()`` always returns non-null local storage. + ``BPF_MAP_TYPE_CGRP_STORAGE`` allocates local storage at runtime so + it is possible that ``bpf_cgrp_storage_get()`` may return null local storage. + To avoid such null local storage issue, user space can do + ``bpf_map_update_elem()`` to pre-allocate local storage. + +(3). ``BPF_MAP_TYPE_CGRP_STORAGE`` supports de-allocating local storage by bpf program + while ``BPF_MAP_TYPE_CGROUP_STORAGE`` only de-allocates storage during + prog detach time. + +So overall, ``BPF_MAP_TYPE_CGRP_STORAGE`` supports all ``BPF_MAP_TYPE_CGROUP_STORAGE`` +functionality and beyond. It is recommended to use ``BPF_MAP_TYPE_CGRP_STORAGE`` +instead of ``BPF_MAP_TYPE_CGROUP_STORAGE``.
Add some descriptions and examples for BPF_MAP_TYPE_CGRP_STROAGE. Also illustate the major difference between BPF_MAP_TYPE_CGRP_STROAGE and BPF_MAP_TYPE_CGROUP_STORAGE and recommend to use BPF_MAP_TYPE_CGRP_STROAGE instead of BPF_MAP_TYPE_CGROUP_STORAGE in the end. Signed-off-by: Yonghong Song <yhs@fb.com> --- Documentation/bpf/map_cgrp_storage.rst | 104 +++++++++++++++++++++++++ 1 file changed, 104 insertions(+) create mode 100644 Documentation/bpf/map_cgrp_storage.rst