diff mbox series

[bpf,v3,8/8] skmsg: increase sk->sk_drops when dropping packets

Message ID 20210527011155.10097-9-xiyou.wangcong@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series sock_map: some bug fixes and improvements | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for bpf
netdev/subject_prefix success Link
netdev/cc_maintainers warning 8 maintainers not CCed: yhs@fb.com kpsingh@kernel.org andrii@kernel.org kafai@fb.com ast@kernel.org songliubraving@fb.com davem@davemloft.net kuba@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 76 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/header_inline success Link

Commit Message

Cong Wang May 27, 2021, 1:11 a.m. UTC
From: Cong Wang <cong.wang@bytedance.com>

It is hard to observe packet drops without increasing relevant
drop counters, here we should increase sk->sk_drops which is
a protocol-independent counter. Fortunately psock is always
associated with a struct sock, we can just use psock->sk.

Suggested-by: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
 net/core/skmsg.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

Comments

John Fastabend May 28, 2021, 5:27 a.m. UTC | #1
Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
> 
> It is hard to observe packet drops without increasing relevant
> drop counters, here we should increase sk->sk_drops which is
> a protocol-independent counter. Fortunately psock is always
> associated with a struct sock, we can just use psock->sk.
> 
> Suggested-by: John Fastabend <john.fastabend@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jakub Sitnicki <jakub@cloudflare.com>
> Cc: Lorenz Bauer <lmb@cloudflare.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> ---
>  net/core/skmsg.c | 22 ++++++++++++++--------
>  1 file changed, 14 insertions(+), 8 deletions(-)
> 

[...]

> @@ -942,7 +948,7 @@ static int sk_psock_verdict_apply(struct sk_psock *psock, struct sk_buff *skb,
>  	case __SK_DROP:
>  	default:
>  out_free:
> -		kfree_skb(skb);
> +		sock_drop(psock->sk, skb);

I must have missed this on first review.

Why should we mark a packet we intentionally drop as sk_drops? I think
we should leave it as just kfree_skb() this way sk_drops is just
the error cases and if users want this counter they can always add
it to the bpf prog itself.

>  	}
>  
>  	return err;
Cong Wang May 29, 2021, 7:29 p.m. UTC | #2
On Thu, May 27, 2021 at 10:27 PM John Fastabend
<john.fastabend@gmail.com> wrote:
>
> Cong Wang wrote:
> > From: Cong Wang <cong.wang@bytedance.com>
> >
> > It is hard to observe packet drops without increasing relevant
> > drop counters, here we should increase sk->sk_drops which is
> > a protocol-independent counter. Fortunately psock is always
> > associated with a struct sock, we can just use psock->sk.
> >
> > Suggested-by: John Fastabend <john.fastabend@gmail.com>
> > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > Cc: Jakub Sitnicki <jakub@cloudflare.com>
> > Cc: Lorenz Bauer <lmb@cloudflare.com>
> > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > ---
> >  net/core/skmsg.c | 22 ++++++++++++++--------
> >  1 file changed, 14 insertions(+), 8 deletions(-)
> >
>
> [...]
>
> > @@ -942,7 +948,7 @@ static int sk_psock_verdict_apply(struct sk_psock *psock, struct sk_buff *skb,
> >       case __SK_DROP:
> >       default:
> >  out_free:
> > -             kfree_skb(skb);
> > +             sock_drop(psock->sk, skb);
>
> I must have missed this on first review.
>
> Why should we mark a packet we intentionally drop as sk_drops? I think
> we should leave it as just kfree_skb() this way sk_drops is just
> the error cases and if users want this counter they can always add
> it to the bpf prog itself.

This is actually a mixed case of error and non-error drops,
because bpf_sk_redirect_map() could return SK_DROP
in error cases. And of course users could want to drop packets
in whatever cases.

But if you look at packet filter cases, for example UDP one,
it increases drop counters too when user-defined rules drop
them:

2182         if (sk_filter_trim_cap(sk, skb, sizeof(struct udphdr)))
2183                 goto drop;
2184
...
2192 drop:
2193         __UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
2194         atomic_inc(&sk->sk_drops);
2195         kfree_skb(skb);
2196         return -1;


Thanks.
John Fastabend June 2, 2021, 3:39 a.m. UTC | #3
Cong Wang wrote:
> On Thu, May 27, 2021 at 10:27 PM John Fastabend
> <john.fastabend@gmail.com> wrote:
> >
> > Cong Wang wrote:
> > > From: Cong Wang <cong.wang@bytedance.com>
> > >
> > > It is hard to observe packet drops without increasing relevant
> > > drop counters, here we should increase sk->sk_drops which is
> > > a protocol-independent counter. Fortunately psock is always
> > > associated with a struct sock, we can just use psock->sk.
> > >
> > > Suggested-by: John Fastabend <john.fastabend@gmail.com>
> > > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > > Cc: Jakub Sitnicki <jakub@cloudflare.com>
> > > Cc: Lorenz Bauer <lmb@cloudflare.com>
> > > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > > ---
> > >  net/core/skmsg.c | 22 ++++++++++++++--------
> > >  1 file changed, 14 insertions(+), 8 deletions(-)
> > >
> >
> > [...]
> >
> > > @@ -942,7 +948,7 @@ static int sk_psock_verdict_apply(struct sk_psock *psock, struct sk_buff *skb,
> > >       case __SK_DROP:
> > >       default:
> > >  out_free:
> > > -             kfree_skb(skb);
> > > +             sock_drop(psock->sk, skb);
> >
> > I must have missed this on first review.
> >
> > Why should we mark a packet we intentionally drop as sk_drops? I think
> > we should leave it as just kfree_skb() this way sk_drops is just
> > the error cases and if users want this counter they can always add
> > it to the bpf prog itself.
> 
> This is actually a mixed case of error and non-error drops,
> because bpf_sk_redirect_map() could return SK_DROP
> in error cases. And of course users could want to drop packets
> in whatever cases.
> 
> But if you look at packet filter cases, for example UDP one,
> it increases drop counters too when user-defined rules drop
> them:
> 
> 2182         if (sk_filter_trim_cap(sk, skb, sizeof(struct udphdr)))
> 2183                 goto drop;
> 2184
> ...
> 2192 drop:
> 2193         __UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
> 2194         atomic_inc(&sk->sk_drops);
> 2195         kfree_skb(skb);
> 2196         return -1;
> 
> 
> Thanks.

OK same for TCP side for sk_filter_trim_cap() case. Works for me.
diff mbox series

Patch

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 3aa9065811ad..9b6160a191f8 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -578,6 +578,12 @@  static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb,
 	return sk_psock_skb_ingress(psock, skb);
 }
 
+static void sock_drop(struct sock *sk, struct sk_buff *skb)
+{
+	sk_drops_add(sk, skb);
+	kfree_skb(skb);
+}
+
 static void sk_psock_backlog(struct work_struct *work)
 {
 	struct sk_psock *psock = container_of(work, struct sk_psock, work);
@@ -617,7 +623,7 @@  static void sk_psock_backlog(struct work_struct *work)
 				/* Hard errors break pipe and stop xmit. */
 				sk_psock_report_error(psock, ret ? -ret : EPIPE);
 				sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED);
-				kfree_skb(skb);
+				sock_drop(psock->sk, skb);
 				goto end;
 			}
 			off += ret;
@@ -708,7 +714,7 @@  static void __sk_psock_zap_ingress(struct sk_psock *psock)
 
 	while ((skb = skb_dequeue(&psock->ingress_skb)) != NULL) {
 		skb_bpf_redirect_clear(skb);
-		kfree_skb(skb);
+		sock_drop(psock->sk, skb);
 	}
 	__sk_psock_purge_ingress_msg(psock);
 }
@@ -834,7 +840,7 @@  static int sk_psock_skb_redirect(struct sk_psock *from, struct sk_buff *skb)
 	 * return code, but then didn't set a redirect interface.
 	 */
 	if (unlikely(!sk_other)) {
-		kfree_skb(skb);
+		sock_drop(from->sk, skb);
 		return -EIO;
 	}
 	psock_other = sk_psock(sk_other);
@@ -844,14 +850,14 @@  static int sk_psock_skb_redirect(struct sk_psock *from, struct sk_buff *skb)
 	 */
 	if (!psock_other || sock_flag(sk_other, SOCK_DEAD)) {
 		skb_bpf_redirect_clear(skb);
-		kfree_skb(skb);
+		sock_drop(from->sk, skb);
 		return -EIO;
 	}
 	spin_lock_bh(&psock_other->ingress_lock);
 	if (!sk_psock_test_state(psock_other, SK_PSOCK_TX_ENABLED)) {
 		spin_unlock_bh(&psock_other->ingress_lock);
 		skb_bpf_redirect_clear(skb);
-		kfree_skb(skb);
+		sock_drop(from->sk, skb);
 		return -EIO;
 	}
 
@@ -942,7 +948,7 @@  static int sk_psock_verdict_apply(struct sk_psock *psock, struct sk_buff *skb,
 	case __SK_DROP:
 	default:
 out_free:
-		kfree_skb(skb);
+		sock_drop(psock->sk, skb);
 	}
 
 	return err;
@@ -977,7 +983,7 @@  static void sk_psock_strp_read(struct strparser *strp, struct sk_buff *skb)
 	sk = strp->sk;
 	psock = sk_psock(sk);
 	if (unlikely(!psock)) {
-		kfree_skb(skb);
+		sock_drop(sk, skb);
 		goto out;
 	}
 	prog = READ_ONCE(psock->progs.stream_verdict);
@@ -1098,7 +1104,7 @@  static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb,
 	psock = sk_psock(sk);
 	if (unlikely(!psock)) {
 		len = 0;
-		kfree_skb(skb);
+		sock_drop(sk, skb);
 		goto out;
 	}
 	prog = READ_ONCE(psock->progs.stream_verdict);