diff mbox series

[v2,bpf,2/2] bpf, sockmap: sk_prot needs inuse_idx for proc stats

Message ID 20210702001123.728035-3-john.fastabend@gmail.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series potential sockmap memleak and proc stats fix | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for bpf
netdev/subject_prefix success Link
netdev/cc_maintainers warning 9 maintainers not CCed: yhs@fb.com kpsingh@kernel.org andrii@kernel.org kafai@fb.com lmb@cloudflare.com songliubraving@fb.com davem@davemloft.net jakub@cloudflare.com kuba@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 24 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/header_inline success Link

Commit Message

John Fastabend July 2, 2021, 12:11 a.m. UTC
Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
We currently do not set this correctly from sockmap side. The result is
reading sock stats '/proc/net/sockstat' gives incorrect values. The
socket counter is incremented correctly, but because we don't set the
counter correctly when we replace sk_prot we may omit the decrement.

Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 net/core/sock_map.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Cong Wang July 2, 2021, 7:50 p.m. UTC | #1
On Thu, Jul 1, 2021 at 5:12 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
> We currently do not set this correctly from sockmap side. The result is
> reading sock stats '/proc/net/sockstat' gives incorrect values. The
> socket counter is incremented correctly, but because we don't set the
> counter correctly when we replace sk_prot we may omit the decrement.
>
> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  net/core/sock_map.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index 60decd6420ca..016ea5460f8f 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -222,6 +222,9 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
>         struct bpf_prog *msg_parser = NULL;
>         struct sk_psock *psock;
>         int ret;
> +#ifdef CONFIG_PROC_FS
> +       int idx;
> +#endif
>
>         /* Only sockets we can redirect into/from in BPF need to hold
>          * refs to parser/verdict progs and have their sk_data_ready
> @@ -293,9 +296,15 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
>         if (msg_parser)
>                 psock_set_prog(&psock->progs.msg_parser, msg_parser);
>
> +#ifdef CONFIG_PROC_FS
> +       idx = sk->sk_prot->inuse_idx;
> +#endif
>         ret = sock_map_init_proto(sk, psock);
>         if (ret < 0)
>                 goto out_drop;
> +#ifdef CONFIG_PROC_FS
> +       sk->sk_prot->inuse_idx = idx;
> +#endif

I think it is better to put these into sock_map_init_proto()
so that sock_map_link() does not need to worry about the sk_prot
details.

Thanks.
John Fastabend July 5, 2021, 4:28 p.m. UTC | #2
Cong Wang wrote:
> On Thu, Jul 1, 2021 at 5:12 PM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
> > We currently do not set this correctly from sockmap side. The result is
> > reading sock stats '/proc/net/sockstat' gives incorrect values. The
> > socket counter is incremented correctly, but because we don't set the
> > counter correctly when we replace sk_prot we may omit the decrement.
> >
> > Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> > Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> > ---
> >  net/core/sock_map.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> > index 60decd6420ca..016ea5460f8f 100644
> > --- a/net/core/sock_map.c
> > +++ b/net/core/sock_map.c
> > @@ -222,6 +222,9 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
> >         struct bpf_prog *msg_parser = NULL;
> >         struct sk_psock *psock;
> >         int ret;
> > +#ifdef CONFIG_PROC_FS
> > +       int idx;
> > +#endif
> >
> >         /* Only sockets we can redirect into/from in BPF need to hold
> >          * refs to parser/verdict progs and have their sk_data_ready
> > @@ -293,9 +296,15 @@ static int sock_map_link(struct bpf_map *map, struct sock *sk)
> >         if (msg_parser)
> >                 psock_set_prog(&psock->progs.msg_parser, msg_parser);
> >
> > +#ifdef CONFIG_PROC_FS
> > +       idx = sk->sk_prot->inuse_idx;
> > +#endif
> >         ret = sock_map_init_proto(sk, psock);
> >         if (ret < 0)
> >                 goto out_drop;
> > +#ifdef CONFIG_PROC_FS
> > +       sk->sk_prot->inuse_idx = idx;
> > +#endif
> 
> I think it is better to put these into sock_map_init_proto()
> so that sock_map_link() does not need to worry about the sk_prot
> details.
> 
> Thanks.

Sure, that is fine.
diff mbox series

Patch

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 60decd6420ca..016ea5460f8f 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -222,6 +222,9 @@  static int sock_map_link(struct bpf_map *map, struct sock *sk)
 	struct bpf_prog *msg_parser = NULL;
 	struct sk_psock *psock;
 	int ret;
+#ifdef CONFIG_PROC_FS
+	int idx;
+#endif
 
 	/* Only sockets we can redirect into/from in BPF need to hold
 	 * refs to parser/verdict progs and have their sk_data_ready
@@ -293,9 +296,15 @@  static int sock_map_link(struct bpf_map *map, struct sock *sk)
 	if (msg_parser)
 		psock_set_prog(&psock->progs.msg_parser, msg_parser);
 
+#ifdef CONFIG_PROC_FS
+	idx = sk->sk_prot->inuse_idx;
+#endif
 	ret = sock_map_init_proto(sk, psock);
 	if (ret < 0)
 		goto out_drop;
+#ifdef CONFIG_PROC_FS
+	sk->sk_prot->inuse_idx = idx;
+#endif
 
 	write_lock_bh(&sk->sk_callback_lock);
 	if (stream_parser && stream_verdict && !psock->saved_data_ready) {