diff mbox series

[v2,bpf-next,1/2] libbpf: add capability for resizing datasec maps

Message ID 20230510223342.12886-2-inwardvessel@gmail.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series libbpf: capability for resizing datasec maps | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers warning 10 maintainers not CCed: daniel@iogearbox.net yhs@fb.com kpsingh@kernel.org martin.lau@linux.dev john.fastabend@gmail.com song@kernel.org sdf@google.com jolsa@kernel.org haoluo@google.com ast@kernel.org
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch warning CHECK: Alignment should match open parenthesis WARNING: line length of 81 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-36 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_verifier on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_parallel on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-20 fail Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for build for aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-8 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-9 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-7 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16

Commit Message

JP Kobryn May 10, 2023, 10:33 p.m. UTC
This patch updates bpf_map__set_value_size() so that if the given map is
memory mapped, it will attempt to resize the mapped region. Initial
contents of the mapped region are preserved. BTF is not required, but
after the mapping is resized an attempt is made to adjust the associated
BTF information if the following criteria is met:
 - BTF info is present
 - the map is a datasec
 - the final variable in the datasec is an array

... the resulting BTF info will be updated so that the final array
variable is associated with a new BTF array type sized to cover the
requested size.

Note that the initial resizing of the memory mapped region can succeed
while the subsequent BTF adjustment can fail. In this case, BTF info is
dropped from the map by clearing the key and value type.

Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
---
 tools/lib/bpf/libbpf.c | 158 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 157 insertions(+), 1 deletion(-)

Comments

Andrii Nakryiko May 16, 2023, 9:10 p.m. UTC | #1
On Wed, May 10, 2023 at 3:33 PM JP Kobryn <inwardvessel@gmail.com> wrote:
>
> This patch updates bpf_map__set_value_size() so that if the given map is
> memory mapped, it will attempt to resize the mapped region. Initial
> contents of the mapped region are preserved. BTF is not required, but
> after the mapping is resized an attempt is made to adjust the associated
> BTF information if the following criteria is met:
>  - BTF info is present
>  - the map is a datasec
>  - the final variable in the datasec is an array
>
> ... the resulting BTF info will be updated so that the final array
> variable is associated with a new BTF array type sized to cover the
> requested size.
>
> Note that the initial resizing of the memory mapped region can succeed
> while the subsequent BTF adjustment can fail. In this case, BTF info is
> dropped from the map by clearing the key and value type.
>
> Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
> ---

It's coming along pretty nicely, still few bugs and unnecessary
complications, but easy to fix.

Can you please also add a doc-comment in libbpf.h for
bpf_map__set_value_size() describing what this API is doing and
explicitly point out that once mmap-able BPF map is resized, all
previous pointers returned from bpf_map__initial_value() and BPF
skeleton pointers for corresponding data section will be invalidated
and have to be re-initialized. This is an important and subtle point,
best to call it out very explicitly.


>  tools/lib/bpf/libbpf.c | 158 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 157 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 1cbacf9e71f3..50cfe2bd4ba0 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -1510,6 +1510,39 @@ static size_t bpf_map_mmap_sz(const struct bpf_map *map)
>         return map_sz;
>  }
>
> +static int bpf_map_mmap_resize(struct bpf_map *map, size_t old_sz, size_t new_sz)
> +{
> +       void *mmaped;
> +
> +       if (!map->mmaped)
> +               return -EINVAL;
> +
> +       if (old_sz == new_sz)
> +               return 0;
> +
> +       mmaped = mmap(NULL, new_sz, PROT_READ | PROT_WRITE,
> +                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> +       if (mmaped == MAP_FAILED)
> +               return libbpf_err(-errno);

no need to use libbpf_err() here, it's internal function

> +
> +       /* copy pre-existing contents to new region,
> +        * using the minimum of old/new size
> +        */
> +       memcpy(mmaped, map->mmaped, min(old_sz, new_sz));
> +
> +       if (munmap(map->mmaped, old_sz)) {
> +               pr_warn("map '%s': failed to unmap\n", bpf_map__name(map));
> +               if (munmap(mmaped, new_sz))
> +                       pr_warn("map '%s': failed to unmap temp region\n",
> +                                       bpf_map__name(map));
> +               return libbpf_err(-errno);
> +       }

this seems a bit too paranoid. Let's just unconditionally
`munmap(map->mmaped, old_sz);` and that's it. Don't add warning and
definitely don't try to unmap newly mapped region. As is this code
could lead to double-free, effectively.

> +
> +       map->mmaped = mmaped;
> +
> +       return 0;
> +}
> +
>  static char *internal_map_name(struct bpf_object *obj, const char *real_name)
>  {
>         char map_name[BPF_OBJ_NAME_LEN], *p;
> @@ -9412,12 +9445,135 @@ __u32 bpf_map__value_size(const struct bpf_map *map)
>         return map->def.value_size;
>  }
>
> +static int map_btf_datasec_resize(struct bpf_map *map, __u32 size)
> +{
> +       int err;
> +       int i, vlen;
> +       struct btf *btf;
> +       const struct btf_type *array_type, *array_element_type;
> +       struct btf_type *datasec_type, *var_type;
> +       struct btf_var_secinfo *var;
> +       const struct btf_array *array;
> +       __u32 offset, nr_elements, new_array_id;
> +
> +       /* check btf existence */
> +       btf = bpf_object__btf(map->obj);
> +       if (!btf)
> +               return -ENOENT;
> +
> +       /* verify map is datasec */
> +       datasec_type = btf_type_by_id(btf, bpf_map__btf_value_type_id(map));
> +       if (!btf_is_datasec(datasec_type)) {
> +               pr_warn("map '%s': attempted to resize but map is not a datasec\n",

"map value type is not a datasec". map is not a datasec, it's value type is

> +                               bpf_map__name(map));
> +               return -EINVAL;
> +       }
> +
> +       /* verify datasec has at least one var */
> +       vlen = btf_vlen(datasec_type);
> +       if (vlen == 0) {
> +               pr_warn("map '%s': attempted to resize but map vlen == 0\n",

maybe "map value datasec is empty"? "vlen" is not necessarily
something that users will easily recognize and understand

> +                               bpf_map__name(map));
> +               return -EINVAL;
> +       }
> +
> +       /* walk to the last var in the datasec,
> +        * increasing the offset as we pass each var
> +        */
> +       var = btf_var_secinfos(datasec_type);
> +       offset = 0;
> +       for (i = 0; i < vlen - 1; i++) {
> +               offset += var->size;
> +               var++;
> +       }

it's both incorrect and overcomplicated. Just:

var = btf_var_secinfos(datasec_type)[vlen - 1];
offset = var->offset;

> +
> +       /* verify last var in the datasec is an array */
> +       var_type = btf_type_by_id(btf, var->type);
> +       array_type = skip_mods_and_typedefs(btf, var_type->type, NULL);
> +       if (!btf_is_array(array_type)) {
> +               pr_warn("map '%s': cannot be resized last var must be array\n",

"cannot be resized, last var must be an array"?

> +                               bpf_map__name(map));
> +               return -EINVAL;
> +       }
> +
> +       /* verify request size aligns with array */
> +       array = btf_array(array_type);
> +       array_element_type = btf_type_by_id(btf, array->type);

not enough, need to skip_mods_and_typedefs() first, etc. But probably
simpler to just use btf__resolve_size(array->type)? And don't forget
to check that we get > 0 result, otherwise we run a risk of division
by zero below

> +       if ((size - offset) % array_element_type->size != 0) {
> +               pr_warn("map '%s': attempted to resize but requested size does not align\n",
> +                               bpf_map__name(map));
> +               return -EINVAL;
> +       }
> +
> +       /* create a new array based on the existing array,
> +        * but with new length
> +        */
> +       nr_elements = (size - offset) / array_element_type->size;
> +       new_array_id = btf__add_array(btf, array->index_type, array->type,
> +                       nr_elements);
> +       if (new_array_id < 0) {
> +               pr_warn("map '%s': failed to create new array\n",

this is a very unlikely error to happen, unless there is some bug (the
only legitimate reason is -ENOMEM, which we generally don't log
everywhere). So let's drop unnecessary pr_warn() here.

> +                               bpf_map__name(map));
> +               err = new_array_id;
> +               return err;
> +       }
> +
> +       /* adding a new btf type invalidates existing pointers to btf objects,
> +        * so refresh pointers before proceeding
> +        */
> +       datasec_type = btf_type_by_id(btf, map->btf_value_type_id);
> +       var = btf_var_secinfos(datasec_type);
> +       for (i = 0; i < vlen - 1; i++)
> +               var++;

as I showed above, btf_var_secinfos(datasec_type)[vlen - 1], no need
for linear search

> +       var_type = btf_type_by_id(btf, var->type);
> +
> +       /* finally update btf info */
> +       datasec_type->size = size;
> +       var->size = size - offset;
> +       var_type->type = new_array_id;
> +
> +       return 0;
> +}
> +
>  int bpf_map__set_value_size(struct bpf_map *map, __u32 size)
>  {
> +       int err;
> +       __u32 old_size;
> +
>         if (map->fd >= 0)
>                 return libbpf_err(-EBUSY);
> -       map->def.value_size = size;
> +
> +       old_size = map->def.value_size;
> +
> +       if (map->mmaped) {
> +               size_t mmap_old_sz, mmap_new_sz;
> +
> +               mmap_old_sz = bpf_map_mmap_sz(map);
> +               map->def.value_size = size;
> +               mmap_new_sz = bpf_map_mmap_sz(map);

it's ugly that we need to modify map->def.value_size just to calculate
mmap region size. Let's add a helper that would take value_size and
max_entries explicitly and return mmap size? This will make this
function simpler as well as map->def.value_size will be updated once
at the very end, regardless of map->mmaped or not.

> +
> +               err = bpf_map_mmap_resize(map, mmap_old_sz, mmap_new_sz);
> +               if (err) {
> +                       pr_warn("map '%s': failed to resize memory mapped region\n",
> +                                       bpf_map__name(map));
> +                       goto err_out;
> +               }
> +               err = map_btf_datasec_resize(map, size);
> +               if (err && err != -ENOENT) {
> +                       pr_warn("map '%s': failed to adjust btf for resized map. dropping btf info\n",

nit, either capitalize "Dropping", or (preferably) turn it into a
single sentence: "failed to adjust BTF for resized map, clearing BTF
key/value type info\n". Dropping is ambiguous and sounds more
dangerous than it really is


> +                                       bpf_map__name(map));
> +                       map->btf_value_type_id = 0;
> +                       map->btf_key_type_id = 0;
> +               }
> +       } else {
> +               map->def.value_size = size;
> +       }
> +
>         return 0;
> +
> +err_out:
> +       map->def.value_size = old_size;
> +       return libbpf_err(err);
>  }
>
>  __u32 bpf_map__btf_key_type_id(const struct bpf_map *map)
> --
> 2.40.0
>
diff mbox series

Patch

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 1cbacf9e71f3..50cfe2bd4ba0 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1510,6 +1510,39 @@  static size_t bpf_map_mmap_sz(const struct bpf_map *map)
 	return map_sz;
 }
 
+static int bpf_map_mmap_resize(struct bpf_map *map, size_t old_sz, size_t new_sz)
+{
+	void *mmaped;
+
+	if (!map->mmaped)
+		return -EINVAL;
+
+	if (old_sz == new_sz)
+		return 0;
+
+	mmaped = mmap(NULL, new_sz, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+	if (mmaped == MAP_FAILED)
+		return libbpf_err(-errno);
+
+	/* copy pre-existing contents to new region,
+	 * using the minimum of old/new size
+	 */
+	memcpy(mmaped, map->mmaped, min(old_sz, new_sz));
+
+	if (munmap(map->mmaped, old_sz)) {
+		pr_warn("map '%s': failed to unmap\n", bpf_map__name(map));
+		if (munmap(mmaped, new_sz))
+			pr_warn("map '%s': failed to unmap temp region\n",
+					bpf_map__name(map));
+		return libbpf_err(-errno);
+	}
+
+	map->mmaped = mmaped;
+
+	return 0;
+}
+
 static char *internal_map_name(struct bpf_object *obj, const char *real_name)
 {
 	char map_name[BPF_OBJ_NAME_LEN], *p;
@@ -9412,12 +9445,135 @@  __u32 bpf_map__value_size(const struct bpf_map *map)
 	return map->def.value_size;
 }
 
+static int map_btf_datasec_resize(struct bpf_map *map, __u32 size)
+{
+	int err;
+	int i, vlen;
+	struct btf *btf;
+	const struct btf_type *array_type, *array_element_type;
+	struct btf_type *datasec_type, *var_type;
+	struct btf_var_secinfo *var;
+	const struct btf_array *array;
+	__u32 offset, nr_elements, new_array_id;
+
+	/* check btf existence */
+	btf = bpf_object__btf(map->obj);
+	if (!btf)
+		return -ENOENT;
+
+	/* verify map is datasec */
+	datasec_type = btf_type_by_id(btf, bpf_map__btf_value_type_id(map));
+	if (!btf_is_datasec(datasec_type)) {
+		pr_warn("map '%s': attempted to resize but map is not a datasec\n",
+				bpf_map__name(map));
+		return -EINVAL;
+	}
+
+	/* verify datasec has at least one var */
+	vlen = btf_vlen(datasec_type);
+	if (vlen == 0) {
+		pr_warn("map '%s': attempted to resize but map vlen == 0\n",
+				bpf_map__name(map));
+		return -EINVAL;
+	}
+
+	/* walk to the last var in the datasec,
+	 * increasing the offset as we pass each var
+	 */
+	var = btf_var_secinfos(datasec_type);
+	offset = 0;
+	for (i = 0; i < vlen - 1; i++) {
+		offset += var->size;
+		var++;
+	}
+
+	/* verify last var in the datasec is an array */
+	var_type = btf_type_by_id(btf, var->type);
+	array_type = skip_mods_and_typedefs(btf, var_type->type, NULL);
+	if (!btf_is_array(array_type)) {
+		pr_warn("map '%s': cannot be resized last var must be array\n",
+				bpf_map__name(map));
+		return -EINVAL;
+	}
+
+	/* verify request size aligns with array */
+	array = btf_array(array_type);
+	array_element_type = btf_type_by_id(btf, array->type);
+	if ((size - offset) % array_element_type->size != 0) {
+		pr_warn("map '%s': attempted to resize but requested size does not align\n",
+				bpf_map__name(map));
+		return -EINVAL;
+	}
+
+	/* create a new array based on the existing array,
+	 * but with new length
+	 */
+	nr_elements = (size - offset) / array_element_type->size;
+	new_array_id = btf__add_array(btf, array->index_type, array->type,
+			nr_elements);
+	if (new_array_id < 0) {
+		pr_warn("map '%s': failed to create new array\n",
+				bpf_map__name(map));
+		err = new_array_id;
+		return err;
+	}
+
+	/* adding a new btf type invalidates existing pointers to btf objects,
+	 * so refresh pointers before proceeding
+	 */
+	datasec_type = btf_type_by_id(btf, map->btf_value_type_id);
+	var = btf_var_secinfos(datasec_type);
+	for (i = 0; i < vlen - 1; i++)
+		var++;
+	var_type = btf_type_by_id(btf, var->type);
+
+	/* finally update btf info */
+	datasec_type->size = size;
+	var->size = size - offset;
+	var_type->type = new_array_id;
+
+	return 0;
+}
+
 int bpf_map__set_value_size(struct bpf_map *map, __u32 size)
 {
+	int err;
+	__u32 old_size;
+
 	if (map->fd >= 0)
 		return libbpf_err(-EBUSY);
-	map->def.value_size = size;
+
+	old_size = map->def.value_size;
+
+	if (map->mmaped) {
+		size_t mmap_old_sz, mmap_new_sz;
+
+		mmap_old_sz = bpf_map_mmap_sz(map);
+		map->def.value_size = size;
+		mmap_new_sz = bpf_map_mmap_sz(map);
+
+		err = bpf_map_mmap_resize(map, mmap_old_sz, mmap_new_sz);
+		if (err) {
+			pr_warn("map '%s': failed to resize memory mapped region\n",
+					bpf_map__name(map));
+			goto err_out;
+		}
+		err = map_btf_datasec_resize(map, size);
+		if (err && err != -ENOENT) {
+			pr_warn("map '%s': failed to adjust btf for resized map. dropping btf info\n",
+					bpf_map__name(map));
+			map->btf_value_type_id = 0;
+			map->btf_key_type_id = 0;
+		}
+	} else {
+		map->def.value_size = size;
+	}
+
 	return 0;
+
+err_out:
+	map->def.value_size = old_size;
+	return libbpf_err(err);
 }
 
 __u32 bpf_map__btf_key_type_id(const struct bpf_map *map)