[bpf-next] sock_map: include sk_psock memory overhead too

Message ID 20230326221612.169289-1-xiyou.wangcong@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Series [bpf-next] sock_map: include sk_psock memory overhead too

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 20 this patch: 20
netdev/cc_maintainers fail 1 blamed authors not CCed: ast@kernel.org; 5 maintainers not CCed: pabeni@redhat.com kuba@kernel.org edumazet@google.com ast@kernel.org davem@davemloft.net
netdev/build_clang success Errors and warnings before: 18 this patch: 18
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 20 this patch: 20
netdev/checkpatch warning WARNING: line length of 94 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-7 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on s390x with gcc

Commit Message

Cong Wang March 26, 2023, 10:16 p.m. UTC
From: Cong Wang <cong.wang@bytedance.com>

When a socket is added to a sockmap, sk_psock is allocated too as its
sk_user_data, therefore it should be considered as an overhead of sockmap
memory usage.

Before this patch:

1: sockmap  flags 0x0
	key 4B  value 4B  max_entries 2  memlock 656B
	pids echo-sockmap(549)

After this patch:

9: sockmap  flags 0x0
	key 4B  value 4B  max_entries 2  memlock 1824B
	pids echo-sockmap(568)

Fixes: 73d2c61919e9 ("bpf, net: sock_map memory usage")
Cc: Yafang Shao <laoar.shao@gmail.com>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
 net/core/sock_map.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
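
As a rough check of the before/after figures in the commit message, the
memlock delta can be divided by the number of populated slots. This is only
a sketch: the helper name is hypothetical, and it assumes that both entries
of the two-slot map held a socket and that nothing else changed between the
two readings (the commit message states neither). Under those assumptions,
(1824 - 656) / 2 gives 584 bytes per socket, which would correspond to
sizeof(struct sk_psock) on that particular build.

```c
#include <stdint.h>

/*
 * Per-entry overhead implied by two memlock readings (hypothetical helper).
 * Assumption: every one of `populated` slots held a socket and the whole
 * difference between the readings is sk_psock overhead.
 */
static uint64_t implied_overhead_per_entry(uint64_t before, uint64_t after,
					   uint64_t populated)
{
	if (populated == 0 || after < before)
		return 0;	/* nothing meaningful to report */

	return (after - before) / populated;
}
```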

Comments

Yafang Shao March 27, 2023, 3:48 a.m. UTC | #1
On Mon, Mar 27, 2023 at 6:16 AM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index 7c189c2e2fbf..22197e565ece 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -799,9 +799,17 @@ static void sock_map_fini_seq_private(void *priv_data)
>
>  static u64 sock_map_mem_usage(const struct bpf_map *map)
>  {
> +       struct bpf_stab *stab = container_of(map, struct bpf_stab, map);
>         u64 usage = sizeof(struct bpf_stab);
> +       int i;
>
>         usage += (u64)map->max_entries * sizeof(struct sock *);
> +
> +       for (i = 0; i < stab->map.max_entries; i++) {

Although it adds a for-loop, the operation below is quite light. So it
looks good to me.

> +               if (stab->sks[i])

Nit, stab->sks[i] can be modified in the delete path in parallel, so
there should be a READ_ONCE() here.

> +                       usage += sizeof(struct sk_psock);
> +       }
> +
>         return usage;
>  }
>
> @@ -1412,7 +1420,7 @@ static u64 sock_hash_mem_usage(const struct bpf_map *map)
>         u64 usage = sizeof(*htab);
>
>         usage += htab->buckets_num * sizeof(struct bpf_shtab_bucket);
> -       usage += atomic_read(&htab->count) * (u64)htab->elem_size;
> +       usage += atomic_read(&htab->count) * ((u64)htab->elem_size + sizeof(struct sk_psock));
>         return usage;
>  }
>
> --
> 2.34.1
>
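
The READ_ONCE() nit above can be illustrated with a small userspace model.
This is a sketch under stated assumptions, not the kernel code: the
READ_ONCE() macro below is a simplified stand-in for the kernel's (it relies
on the GCC/Clang __typeof__ extension), and struct stab_model and both
functions are hypothetical miniatures that keep only the fields the loop
touches. The size of sk_psock is passed in as a parameter because the real
structure is not available outside the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-in for the kernel's READ_ONCE():
 * performs exactly one volatile load of x. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

struct sock;	/* opaque here; only pointers to it are stored */

/* Hypothetical miniature of struct bpf_stab. */
struct stab_model {
	uint32_t max_entries;
	struct sock **sks;
};

/* Count populated slots, loading each slot exactly once so that a
 * concurrent delete cannot cause the slot to be read twice. */
static uint64_t stab_model_populated(const struct stab_model *stab)
{
	uint64_t n = 0;

	for (uint32_t i = 0; i < stab->max_entries; i++) {
		if (READ_ONCE(stab->sks[i]))
			n++;
	}
	return n;
}

/* Same shape as the patched accounting: base struct, pointer array,
 * plus one psock per populated slot. */
static uint64_t stab_model_mem_usage(const struct stab_model *stab,
				     size_t psock_size)
{
	uint64_t usage = sizeof(*stab);

	usage += (uint64_t)stab->max_entries * sizeof(struct sock *);
	usage += stab_model_populated(stab) * psock_size;
	return usage;
}
```

The result is still only a snapshot: a slot can be cleared right after it
has been counted, so the reported usage is approximate either way.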
John Fastabend April 1, 2023, 1:09 a.m. UTC | #2
Yafang Shao wrote:
> On Mon, Mar 27, 2023 at 6:16 AM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >
> > diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> > index 7c189c2e2fbf..22197e565ece 100644
> > --- a/net/core/sock_map.c
> > +++ b/net/core/sock_map.c
> > @@ -799,9 +799,17 @@ static void sock_map_fini_seq_private(void *priv_data)
> >
> >  static u64 sock_map_mem_usage(const struct bpf_map *map)
> >  {
> > +       struct bpf_stab *stab = container_of(map, struct bpf_stab, map);
> >         u64 usage = sizeof(struct bpf_stab);
> > +       int i;
> >
> >         usage += (u64)map->max_entries * sizeof(struct sock *);
> > +
> > +       for (i = 0; i < stab->map.max_entries; i++) {
> 
> Although it adds a for-loop, the operation below is quite light. So it
> looks good to me.

We could track a count from update to avoid the loop?

Yafang Shao April 2, 2023, 10:41 a.m. UTC | #3
On Sat, Apr 1, 2023 at 9:09 AM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Yafang Shao wrote:
> > On Mon, Mar 27, 2023 at 6:16 AM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > >
> > > diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> > > index 7c189c2e2fbf..22197e565ece 100644
> > > --- a/net/core/sock_map.c
> > > +++ b/net/core/sock_map.c
> > > @@ -799,9 +799,17 @@ static void sock_map_fini_seq_private(void *priv_data)
> > >
> > >  static u64 sock_map_mem_usage(const struct bpf_map *map)
> > >  {
> > > +       struct bpf_stab *stab = container_of(map, struct bpf_stab, map);
> > >         u64 usage = sizeof(struct bpf_stab);
> > > +       int i;
> > >
> > >         usage += (u64)map->max_entries * sizeof(struct sock *);
> > > +
> > > +       for (i = 0; i < stab->map.max_entries; i++) {
> >
> > Although it adds a for-loop, the operation below is quite light. So it
> > looks good to me.
>
> We could track a count from update to avoid the loop?
>

I prefer adding a count into struct bpf_stab. We can also get the
number of socks easily with this new count, and it should be
acceptable to modify this count in the update/delete paths.
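
The counter idea discussed above could look roughly like the following
userspace model. It is a sketch only: struct stab_counted, its count field
and all three functions are hypothetical names, C11 atomics stand in for
the kernel's atomic_t, and the slot accesses themselves are single-threaded
here, whereas the real update/delete paths would have to keep the counter
consistent under their own synchronization.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct sock;	/* opaque here; only pointers to it are stored */

/* Hypothetical model of a stab that tracks its populated slots. */
struct stab_counted {
	uint32_t max_entries;
	atomic_uint count;	/* number of non-NULL slots */
	struct sock **sks;
};

/* Update path: store a non-NULL sk; count only an empty -> full transition. */
static void stab_counted_update(struct stab_counted *stab, uint32_t idx,
				struct sock *sk)
{
	struct sock *old = stab->sks[idx];

	stab->sks[idx] = sk;
	if (!old)
		atomic_fetch_add(&stab->count, 1);
}

/* Delete path: clear the slot; count only a full -> empty transition. */
static void stab_counted_delete(struct stab_counted *stab, uint32_t idx)
{
	if (stab->sks[idx]) {
		stab->sks[idx] = NULL;
		atomic_fetch_sub(&stab->count, 1);
	}
}

/* Memory usage without walking the array. */
static uint64_t stab_counted_mem_usage(struct stab_counted *stab,
				       size_t psock_size)
{
	uint64_t usage = sizeof(*stab);

	usage += (uint64_t)stab->max_entries * sizeof(struct sock *);
	usage += (uint64_t)atomic_load(&stab->count) * psock_size;
	return usage;
}
```

The trade-off is one extra atomic operation on each update or delete in
exchange for a constant-time usage query, and the same counter also answers
how many sockets the map currently holds.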

Patch

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 7c189c2e2fbf..22197e565ece 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -799,9 +799,17 @@  static void sock_map_fini_seq_private(void *priv_data)
 
 static u64 sock_map_mem_usage(const struct bpf_map *map)
 {
+	struct bpf_stab *stab = container_of(map, struct bpf_stab, map);
 	u64 usage = sizeof(struct bpf_stab);
+	int i;
 
 	usage += (u64)map->max_entries * sizeof(struct sock *);
+
+	for (i = 0; i < stab->map.max_entries; i++) {
+		if (stab->sks[i])
+			usage += sizeof(struct sk_psock);
+	}
+
 	return usage;
 }
 
@@ -1412,7 +1420,7 @@  static u64 sock_hash_mem_usage(const struct bpf_map *map)
 	u64 usage = sizeof(*htab);
 
 	usage += htab->buckets_num * sizeof(struct bpf_shtab_bucket);
-	usage += atomic_read(&htab->count) * (u64)htab->elem_size;
+	usage += atomic_read(&htab->count) * ((u64)htab->elem_size + sizeof(struct sk_psock));
 	return usage;
 }