
[bpf-next,v3,03/18] bpf: hashtab memory usage

Message ID 20230227152032.12359-4-laoar.shao@gmail.com (mailing list archive)
State Superseded
Delegated to: BPF
Series bpf: bpf memory usage

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
netdev/series_format fail Series longer than 15 patches (and no cover letter)
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 12 this patch: 12
netdev/cc_maintainers warning 2 maintainers not CCed: song@kernel.org martin.lau@linux.dev
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 12 this patch: 12
netdev/checkpatch warning WARNING: line length of 82 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-7 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-8 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32 on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_progs_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_progs_parallel on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-36 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-37 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for test_verifier on x86_64 with llvm-17

Commit Message

Yafang Shao Feb. 27, 2023, 3:20 p.m. UTC
htab_map_mem_usage() is introduced to calculate hashmap memory usage. In
this helper, some small memory allocations are ignored, as their size is
negligible compared with the total. The inner_map_meta in hash_of_map is
also ignored.
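
In outline, the accounting implemented by this patch (see the code at
the end of this page) sums:

    usage = sizeof(struct bpf_htab)
          + n_buckets * sizeof(struct bucket)
          + per-CPU map_locked counters
          + element storage: elem_size * num_entries if preallocated,
            otherwise (elem_size + llist_node) * current element count
          + value_size * num_possible_cpus() * num_entries for the
            percpu variants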

The results for hashtab are as follows:

- before this change
1: hash  name count_map  flags 0x1  <<<< no prealloc, fully set
        key 16B  value 24B  max_entries 1048576  memlock 41943040B
2: hash  name count_map  flags 0x1  <<<< no prealloc, none set
        key 16B  value 24B  max_entries 1048576  memlock 41943040B
3: hash  name count_map  flags 0x0  <<<< prealloc
        key 16B  value 24B  max_entries 1048576  memlock 41943040B

The memlock is always a fixed size, no matter whether the map is
preallocated and no matter how many elements are actually allocated.

- after this change
1: hash  name count_map  flags 0x1    <<<< no prealloc, fully set
        key 16B  value 24B  max_entries 1048576  memlock 117441536B
2: hash  name count_map  flags 0x1    <<<< no prealloc, none set
        key 16B  value 24B  max_entries 1048576  memlock 16778240B
3: hash  name count_map  flags 0x0    <<<< prealloc
        key 16B  value 24B  max_entries 1048576  memlock 109056000B

The memlock now reflects the memory actually allocated by the hashtab.
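
For map 1 above (no prealloc, fully set), the new figure decomposes as
follows, assuming sizeof(struct bucket) == 16 and
sizeof(struct htab_elem) == 48 (illustrative x86_64 values, not taken
from the patch):

    buckets:   1048576 * 16                   =  16777216 B
    elements:  1048576 * (48 + 16 + 24 + 8)   = 100663296 B
               (htab_elem + key + value + llist_node)
    overhead:  struct bpf_htab + map_locked   ~      1024 B
    total                                     ~ 117441536 B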

The results for percpu hash map are as follows:
- before this change
4: percpu_hash  name count_map  flags 0x0       <<<< prealloc
        key 16B  value 24B  max_entries 1048576  memlock 822083584B
5: percpu_hash  name count_map  flags 0x1       <<<< no prealloc
        key 16B  value 24B  max_entries 1048576  memlock 822083584B

- after this change
4: percpu_hash  name count_map  flags 0x0
        key 16B  value 24B  max_entries 1048576  memlock 897582080B
5: percpu_hash  name count_map  flags 0x1
        key 16B  value 24B  max_entries 1048576  memlock 922748736B
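
On a machine with 32 possible CPUs (an assumption about the test setup,
not stated above), the per-CPU value storage dominates map 4:
24 B * 32 CPUs * 1048576 entries = 805306368 B, i.e. about 768 MiB of
the ~856 MiB total.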

At worst, the difference can be 10x. For example:
- before this change
6: hash  name count_map  flags 0x0
        key 4B  value 4B  max_entries 1048576  memlock 8388608B

- after this change
6: hash  name count_map  flags 0x0
        key 4B  value 4B  max_entries 1048576  memlock 83889408B
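
The old estimate charged only key_size + value_size per entry:
(4 + 4) B * 1048576 = 8388608 B. The new accounting also charges the
bucket array and per-element metadata, so (again assuming an
illustrative sizeof(struct htab_elem) == 48):

    1048576 * 16 (buckets) + 1048576 * (48 + 8 + 8) (elements)
        = 83886080 B

which matches the reported ~10x increase.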

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 kernel/bpf/hashtab.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

Comments

Hou Tao March 1, 2023, 2:59 a.m. UTC | #1
From: Hou Tao <houtao1@huawei.com>

> htab_map_mem_usage() is introduced to calculate hashmap memory usage. In
> this helper, some small memory allocations are ignored, as their size is
> negligible compared with the total. The inner_map_meta in hash_of_map is
> also ignored.

[...]

Acked-by: Hou Tao <houtao1@huawei.com>

Patch

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 5dfcb5a..6913b92 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -2175,6 +2175,44 @@  static int bpf_for_each_hash_elem(struct bpf_map *map, bpf_callback_t callback_f
 	return num_elems;
 }
 
+static u64 htab_map_mem_usage(const struct bpf_map *map)
+{
+	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
+	u32 value_size = round_up(htab->map.value_size, 8);
+	bool prealloc = htab_is_prealloc(htab);
+	bool percpu = htab_is_percpu(htab);
+	bool lru = htab_is_lru(htab);
+	u64 num_entries;
+	u64 usage = sizeof(struct bpf_htab);
+
+	usage += sizeof(struct bucket) * htab->n_buckets;
+	usage += sizeof(int) * num_possible_cpus() * HASHTAB_MAP_LOCK_COUNT;
+	if (prealloc) {
+		num_entries = map->max_entries;
+		if (htab_has_extra_elems(htab))
+			num_entries += num_possible_cpus();
+
+		usage += htab->elem_size * num_entries;
+
+		if (percpu)
+			usage += value_size * num_possible_cpus() * num_entries;
+		else if (!lru)
+			usage += sizeof(struct htab_elem *) * num_possible_cpus();
+	} else {
+#define LLIST_NODE_SZ sizeof(struct llist_node)
+
+		num_entries = htab->use_percpu_counter ?
+					  percpu_counter_sum(&htab->pcount) :
+					  atomic_read(&htab->count);
+		usage += (htab->elem_size + LLIST_NODE_SZ) * num_entries;
+		if (percpu) {
+			usage += (LLIST_NODE_SZ + sizeof(void *)) * num_entries;
+			usage += value_size * num_possible_cpus() * num_entries;
+		}
+	}
+	return usage;
+}
+
 BTF_ID_LIST_SINGLE(htab_map_btf_ids, struct, bpf_htab)
 const struct bpf_map_ops htab_map_ops = {
 	.map_meta_equal = bpf_map_meta_equal,
@@ -2191,6 +2229,7 @@  static int bpf_for_each_hash_elem(struct bpf_map *map, bpf_callback_t callback_f
 	.map_seq_show_elem = htab_map_seq_show_elem,
 	.map_set_for_each_callback_args = map_set_for_each_callback_args,
 	.map_for_each_callback = bpf_for_each_hash_elem,
+	.map_mem_usage = htab_map_mem_usage,
 	BATCH_OPS(htab),
 	.map_btf_id = &htab_map_btf_ids[0],
 	.iter_seq_info = &iter_seq_info,
@@ -2212,6 +2251,7 @@  static int bpf_for_each_hash_elem(struct bpf_map *map, bpf_callback_t callback_f
 	.map_seq_show_elem = htab_map_seq_show_elem,
 	.map_set_for_each_callback_args = map_set_for_each_callback_args,
 	.map_for_each_callback = bpf_for_each_hash_elem,
+	.map_mem_usage = htab_map_mem_usage,
 	BATCH_OPS(htab_lru),
 	.map_btf_id = &htab_map_btf_ids[0],
 	.iter_seq_info = &iter_seq_info,
@@ -2363,6 +2403,7 @@  static void htab_percpu_map_seq_show_elem(struct bpf_map *map, void *key,
 	.map_seq_show_elem = htab_percpu_map_seq_show_elem,
 	.map_set_for_each_callback_args = map_set_for_each_callback_args,
 	.map_for_each_callback = bpf_for_each_hash_elem,
+	.map_mem_usage = htab_map_mem_usage,
 	BATCH_OPS(htab_percpu),
 	.map_btf_id = &htab_map_btf_ids[0],
 	.iter_seq_info = &iter_seq_info,
@@ -2382,6 +2423,7 @@  static void htab_percpu_map_seq_show_elem(struct bpf_map *map, void *key,
 	.map_seq_show_elem = htab_percpu_map_seq_show_elem,
 	.map_set_for_each_callback_args = map_set_for_each_callback_args,
 	.map_for_each_callback = bpf_for_each_hash_elem,
+	.map_mem_usage = htab_map_mem_usage,
 	BATCH_OPS(htab_lru_percpu),
 	.map_btf_id = &htab_map_btf_ids[0],
 	.iter_seq_info = &iter_seq_info,
@@ -2519,6 +2561,7 @@  static void htab_of_map_free(struct bpf_map *map)
 	.map_fd_sys_lookup_elem = bpf_map_fd_sys_lookup_elem,
 	.map_gen_lookup = htab_of_map_gen_lookup,
 	.map_check_btf = map_check_no_btf,
+	.map_mem_usage = htab_map_mem_usage,
 	BATCH_OPS(htab),
 	.map_btf_id = &htab_map_btf_ids[0],
 };
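
For context, the ->map_mem_usage callback wired up above is consumed by
the generic map code added earlier in this series. A simplified sketch
(the exact function name in the merged series may differ; treat this as
illustrative, not as the patch's literal code):

/* Generic side (patch 01 of the series, simplified): the "memlock"
 * line in a map's fdinfo is now produced by asking the map type
 * itself, instead of using a fixed key/value-size estimate.
 */
static unsigned long bpf_map_memory_usage(const struct bpf_map *map)
{
	return map->ops->map_mem_usage(map);
}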