diff mbox series

[bpf,v4,2/2] bpf, sockmap: sk_prot needs inuse_idx set for proc stats

Message ID 20210712195546.423990-3-john.fastabend@gmail.com (mailing list archive)
State Accepted
Commit 228a4a7ba8e99bb9ef980b62f71e3be33f4aae69
Delegated to: BPF
Headers show
Series bpf, sockmap: fix potential memory leak | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for bpf
netdev/subject_prefix success Link
netdev/cc_maintainers warning 11 maintainers not CCed: dsahern@kernel.org yhs@fb.com kpsingh@kernel.org yoshfuji@linux-ipv6.org andrii@kernel.org kafai@fb.com edumazet@google.com lmb@cloudflare.com songliubraving@fb.com davem@davemloft.net kuba@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/header_inline success Link

Commit Message

John Fastabend July 12, 2021, 7:55 p.m. UTC
Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
We currently do not set this correctly from sockmap side. The result is
reading sock stats '/proc/net/sockstat' gives incorrect values. The
socket counter is incremented correctly, but because we don't set the
counter correctly when we replace sk_prot we may omit the decrement.

To get the correct inuse_idx value move the core_initcall that initializes
the tcp/udp proto handlers to late_initcall. This way it is initialized
after TCP/UDP has the chance to assign the inuse_idx value from the
register protocol handler.

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 net/ipv4/tcp_bpf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jakub Sitnicki July 13, 2021, 7:47 a.m. UTC | #1
On Mon, Jul 12, 2021 at 09:55 PM CEST, John Fastabend wrote:
> Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
> We currently do not set this correctly from sockmap side. The result is
> reading sock stats '/proc/net/sockstat' gives incorrect values. The
> socket counter is incremented correctly, but because we don't set the
> counter correctly when we replace sk_prot we may omit the decrement.
>
> To get the correct inuse_idx value move the core_initcall that initializes
> the tcp/udp proto handlers to late_initcall. This way it is initialized
> after TCP/UDP has the chance to assign the inuse_idx value from the
> register protocol handler.
>
> Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  net/ipv4/tcp_bpf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> index f26916a62f25..d3e9386b493e 100644
> --- a/net/ipv4/tcp_bpf.c
> +++ b/net/ipv4/tcp_bpf.c
> @@ -503,7 +503,7 @@ static int __init tcp_bpf_v4_build_proto(void)
>  	tcp_bpf_rebuild_protos(tcp_bpf_prots[TCP_BPF_IPV4], &tcp_prot);
>  	return 0;
>  }
> -core_initcall(tcp_bpf_v4_build_proto);
> +late_initcall(tcp_bpf_v4_build_proto);
>
>  static int tcp_bpf_assert_proto_ops(struct proto *ops)
>  {

Respective change for udp_bpf is missing. I've posted it separately [1]
to save us an iteration. Hope you don't mind.

[1] https://lore.kernel.org/bpf/20210713074401.475209-1-jakub@cloudflare.com/
Cong Wang July 14, 2021, 12:56 a.m. UTC | #2
On Mon, Jul 12, 2021 at 12:56 PM John Fastabend
<john.fastabend@gmail.com> wrote:
>
> Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
> We currently do not set this correctly from sockmap side. The result is
> reading sock stats '/proc/net/sockstat' gives incorrect values. The
> socket counter is incremented correctly, but because we don't set the
> counter correctly when we replace sk_prot we may omit the decrement.
>
> To get the correct inuse_idx value move the core_initcall that initializes
> the tcp/udp proto handlers to late_initcall. This way it is initialized
> after TCP/UDP has the chance to assign the inuse_idx value from the
> register protocol handler.
>
> Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>

For IPv6, I think the module is always loaded before we can
trigger tcp_bpf_check_v6_needs_rebuild(). So,

Reviewed-by: Cong Wang <cong.wang@bytedance.com>

Thanks.
diff mbox series

Patch

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index f26916a62f25..d3e9386b493e 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -503,7 +503,7 @@  static int __init tcp_bpf_v4_build_proto(void)
 	tcp_bpf_rebuild_protos(tcp_bpf_prots[TCP_BPF_IPV4], &tcp_prot);
 	return 0;
 }
-core_initcall(tcp_bpf_v4_build_proto);
+late_initcall(tcp_bpf_v4_build_proto);
 
 static int tcp_bpf_assert_proto_ops(struct proto *ops)
 {