diff mbox series

[bpf-next,v2] xsk: fix xsk_build_skb() error: 'skb' dereferencing possible ERR_PTR()

Message ID 20230815150325.2010460-1-tirthendu.sarkar@intel.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series [bpf-next,v2] xsk: fix xsk_build_skb() error: 'skb' dereferencing possible ERR_PTR() | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1332 this patch: 1332
netdev/cc_maintainers warning 3 maintainers not CCed: hawk@kernel.org edumazet@google.com john.fastabend@gmail.com
netdev/build_clang success Errors and warnings before: 1353 this patch: 1353
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1355 this patch: 1355
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 54 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-6 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-29 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-7 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on s390x with gcc

Commit Message

Tirthendu Sarkar Aug. 15, 2023, 3:03 p.m. UTC
xsk_build_skb_zerocopy() may return an error other than -EAGAIN and this
is received as skb and used later in xsk_set_destructor_arg() and
xsk_drop_skb() which must operate on a valid skb.

Set -EOVERFLOW as error when MAX_SKB_FRAGS are exceeded and packet needs
to be dropped and use this to distinguish against all other error cases
where allocation needs to be retried.

Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202307210434.OjgqFcbB-lkp@intel.com/
Fixes: cf24f5a5feea ("xsk: add support for AF_XDP multi-buffer on Tx path")

Changelog:
	v1 -> v2:
	- Removed err as a parameter to xsk_build_skb_zerocopy()
	[Stanislav Fomichev]
	- use explicit error to distinguish packet drop vs retry
---
 net/xdp/xsk.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

Comments

Stanislav Fomichev Aug. 15, 2023, 6:20 p.m. UTC | #1
On 08/15, Tirthendu Sarkar wrote:
> xsk_build_skb_zerocopy() may return an error other than -EAGAIN and this
> is received as skb and used later in xsk_set_destructor_arg() and
> xsk_drop_skb() which must operate on a valid skb.
> 
> Set -EOVERFLOW as error when MAX_SKB_FRAGS are exceeded and packet needs
> to be dropped and use this to distinguish against all other error cases
> where allocation needs to be retried.
> 
> Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> Closes: https://lore.kernel.org/r/202307210434.OjgqFcbB-lkp@intel.com/
> Fixes: cf24f5a5feea ("xsk: add support for AF_XDP multi-buffer on Tx path")
> 
> Changelog:
> 	v1 -> v2:
> 	- Removed err as a parameter to xsk_build_skb_zerocopy()
> 	[Stanislav Fomichev]
> 	- use explicit error to distinguish packet drop vs retry
> ---
>  net/xdp/xsk.c | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index fcfc8472f73d..55f8b9b0e06d 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -602,7 +602,7 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
>  
>  	for (copied = 0, i = skb_shinfo(skb)->nr_frags; copied < len; i++) {
>  		if (unlikely(i >= MAX_SKB_FRAGS))
> -			return ERR_PTR(-EFAULT);
> +			return ERR_PTR(-EOVERFLOW);
>  
>  		page = pool->umem->pgs[addr >> PAGE_SHIFT];
>  		get_page(page);
> @@ -655,15 +655,17 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
>  			skb_put(skb, len);
>  
>  			err = skb_store_bits(skb, 0, buffer, len);
> -			if (unlikely(err))
> +			if (unlikely(err)) {
> +				kfree_skb(skb);
>  				goto free_err;
> +			}
>  		} else {
>  			int nr_frags = skb_shinfo(skb)->nr_frags;
>  			struct page *page;
>  			u8 *vaddr;
>  
>  			if (unlikely(nr_frags == (MAX_SKB_FRAGS - 1) && xp_mb_desc(desc))) {
> -				err = -EFAULT;
> +				err = -EOVERFLOW;
>  				goto free_err;
>  			}
>  
> @@ -690,12 +692,14 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
>  	return skb;
>  
>  free_err:
> -	if (err == -EAGAIN) {
> -		xsk_cq_cancel_locked(xs, 1);
> -	} else {
> -		xsk_set_destructor_arg(skb);
> -		xsk_drop_skb(skb);
> +	if (err == -EOVERFLOW) {

Don't think this will work? We have some other error paths in xsk_build_skb
that are not -EOVERFLOW that still need kfree_skb, right?

I feel like we are trying to share some state between xsk_build_skb and
xsk_build_skb_zerocopy which we really shouldn't share. So how about
we try to have a separate cleanup path in xsk_build_skb_zerocopy?

Will something like the following (untested / uncompiled) work instead?

IOW, ideally, xsk_build_skb should look like:

	if (dev->priv_flags & IFF_TX_SKB_NO_LINEAR) {
		return xsk_build_skb_zerocopy(xs, desc);
	} else {
		return xsk_build_skb_copy(xs, desc);
		/* ^^ current path that should really be a separate func */
	}

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 79a96019b7ef..747dd012afdb 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -578,6 +578,19 @@ static void xsk_drop_skb(struct sk_buff *skb)
 	xsk_consume_skb(skb);
 }
 
+static struct sk_buff *xsk_cleanup_skb(int err, struct sk_buff *skb, struct xdp_sock *xs)
+{
+	if (err == -EAGAIN) {
+		xsk_cq_cancel_locked(xs, 1);
+	} else {
+		xsk_set_destructor_arg(skb);
+		xsk_drop_skb(skb);
+		xskq_cons_release(xs->tx);
+	}
+
+	return ERR_PTR(err);
+}
+
 static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
 					      struct xdp_desc *desc)
 {
@@ -593,8 +606,10 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
 		hr = max(NET_SKB_PAD, L1_CACHE_ALIGN(xs->dev->needed_headroom));
 
 		skb = sock_alloc_send_skb(&xs->sk, hr, 1, &err);
-		if (unlikely(!skb))
-			return ERR_PTR(err);
+		if (unlikely(!skb)) {
+			err = ERR_PTR(err);
+			goto free_err;
+		}
 
 		skb_reserve(skb, hr);
 	}
@@ -608,8 +623,10 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
 	addr = buffer - pool->addrs;
 
 	for (copied = 0, i = skb_shinfo(skb)->nr_frags; copied < len; i++) {
-		if (unlikely(i >= MAX_SKB_FRAGS))
-			return ERR_PTR(-EFAULT);
+		if (unlikely(i >= MAX_SKB_FRAGS)) {
+			err = ERR_PTR(-EFAULT);
+			goto free_err;
+		}
 
 		page = pool->umem->pgs[addr >> PAGE_SHIFT];
 		get_page(page);
@@ -629,6 +646,9 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
 	refcount_add(ts, &xs->sk.sk_wmem_alloc);
 
 	return skb;
+
+free_err:
+	return xsk_cleanup_skb(err, skb, xs);
 }
 
 static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
@@ -641,11 +661,7 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
 	int err;
 
 	if (dev->priv_flags & IFF_TX_SKB_NO_LINEAR) {
-		skb = xsk_build_skb_zerocopy(xs, desc);
-		if (IS_ERR(skb)) {
-			err = PTR_ERR(skb);
-			goto free_err;
-		}
+		return xsk_build_skb_zerocopy(xs, desc);
 	} else {
 		u32 hr, tr, len;
 		void *buffer;
@@ -729,15 +745,7 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
 	return skb;
 
 free_err:
-	if (err == -EAGAIN) {
-		xsk_cq_cancel_locked(xs, 1);
-	} else {
-		xsk_set_destructor_arg(skb);
-		xsk_drop_skb(skb);
-		xskq_cons_release(xs->tx);
-	}
-
-	return ERR_PTR(err);
+	return xsk_cleanup_skb(err, skb, xs);
 }
 
 static int __xsk_generic_xmit(struct sock *sk)
Tirthendu Sarkar Aug. 16, 2023, 6:04 a.m. UTC | #2
> -----Original Message-----
> From: Stanislav Fomichev <sdf@google.com>
> Sent: Tuesday, August 15, 2023 11:51 PM
> On 08/15, Tirthendu Sarkar wrote:
> > xsk_build_skb_zerocopy() may return an error other than -EAGAIN and
> this
> > is received as skb and used later in xsk_set_destructor_arg() and
> > xsk_drop_skb() which must operate on a valid skb.
> >
> > Set -EOVERFLOW as error when MAX_SKB_FRAGS are exceeded and
> packet needs
> > to be dropped and use this to distinguish against all other error cases
> > where allocation needs to be retried.
> >
> > Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
> > Reported-by: kernel test robot <lkp@intel.com>
> > Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> > Closes: https://lore.kernel.org/r/202307210434.OjgqFcbB-lkp@intel.com/
> > Fixes: cf24f5a5feea ("xsk: add support for AF_XDP multi-buffer on Tx
> path")
> >
> > Changelog:
> > 	v1 -> v2:
> > 	- Removed err as a parameter to xsk_build_skb_zerocopy()
> > 	[Stanislav Fomichev]
> > 	- use explicit error to distinguish packet drop vs retry
> > ---
> >  net/xdp/xsk.c | 22 +++++++++++++---------
> >  1 file changed, 13 insertions(+), 9 deletions(-)
> >
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index fcfc8472f73d..55f8b9b0e06d 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -602,7 +602,7 @@ static struct sk_buff
> *xsk_build_skb_zerocopy(struct xdp_sock *xs,
> >
> >  	for (copied = 0, i = skb_shinfo(skb)->nr_frags; copied < len; i++) {
> >  		if (unlikely(i >= MAX_SKB_FRAGS))
> > -			return ERR_PTR(-EFAULT);
> > +			return ERR_PTR(-EOVERFLOW);
> >
> >  		page = pool->umem->pgs[addr >> PAGE_SHIFT];
> >  		get_page(page);
> > @@ -655,15 +655,17 @@ static struct sk_buff *xsk_build_skb(struct
> xdp_sock *xs,
> >  			skb_put(skb, len);
> >
> >  			err = skb_store_bits(skb, 0, buffer, len);
> > -			if (unlikely(err))
> > +			if (unlikely(err)) {
> > +				kfree_skb(skb);
> >  				goto free_err;
> > +			}
> >  		} else {
> >  			int nr_frags = skb_shinfo(skb)->nr_frags;
> >  			struct page *page;
> >  			u8 *vaddr;
> >
> >  			if (unlikely(nr_frags == (MAX_SKB_FRAGS - 1) &&
> xp_mb_desc(desc))) {
> > -				err = -EFAULT;
> > +				err = -EOVERFLOW;
> >  				goto free_err;
> >  			}
> >
> > @@ -690,12 +692,14 @@ static struct sk_buff *xsk_build_skb(struct
> xdp_sock *xs,
> >  	return skb;
> >
> >  free_err:
> > -	if (err == -EAGAIN) {
> > -		xsk_cq_cancel_locked(xs, 1);
> > -	} else {
> > -		xsk_set_destructor_arg(skb);
> > -		xsk_drop_skb(skb);
> > +	if (err == -EOVERFLOW) {
> 
> Don't think this will work? We have some other error paths in xsk_build_skb
> that are not -EOVERFLOW that still need kfree_skb, right?
> 

There are 4 possible error paths in xsk_build_skb():
1. sock_alloc_send_skb:  skb is NULL; retry
2. skb_store_bits : free skb and retry
3. MAX_SKB_FRAGS exceeded: Free skb, cleanup and drop packet
4. alloc_page fails for frag: retry page allocation for frag w/o freeing skb

Of these 1] and 3] can also happen in xsk_build_skb_zerocopy() and the 
error returned is either -EOVERFLOW or something else and the same
error handling needs to be done.

> I feel like we are trying to share some state between xsk_build_skb and
> xsk_build_skb_zerocopy which we really shouldn't share. So how about
> we try to have a separate cleanup path in xsk_build_skb_zerocopy?
> 

The only things that are common between *copy and *zerocopy  paths are 
setting  the skb headers (destructor/args, mark, priority) and error handling.

While we can potentially split out the paths into independent functions, the
same skb header settings and error handling will still need to be duplicated in
both functions.

> Will something like the following (untested / uncompiled) work instead?
> 
> IOW, ideally, xsk_build_skb should look like:
> 
> 	if (dev->priv_flags & IFF_TX_SKB_NO_LINEAR) {
> 		return xsk_build_skb_zerocopy(xs, desc);
> 	} else {
> 		return xsk_build_skb_copy(xs, desc);
> 		/* ^^ current path that should really be a separate func */
> 	}
>
Fijalkowski, Maciej Aug. 22, 2023, 5:48 p.m. UTC | #3
On Wed, Aug 16, 2023 at 08:04:52AM +0200, Sarkar, Tirthendu wrote:
> > -----Original Message-----
> > From: Stanislav Fomichev <sdf@google.com>
> > Sent: Tuesday, August 15, 2023 11:51 PM
> > On 08/15, Tirthendu Sarkar wrote:
> > > xsk_build_skb_zerocopy() may return an error other than -EAGAIN and
> > this
> > > is received as skb and used later in xsk_set_destructor_arg() and
> > > xsk_drop_skb() which must operate on a valid skb.
> > >
> > > Set -EOVERFLOW as error when MAX_SKB_FRAGS are exceeded and
> > packet needs
> > > to be dropped and use this to distinguish against all other error cases
> > > where allocation needs to be retried.

Please be explicit - say that you're changing this error code
in xsk_build_skb_zerocopy() otherwise it's not clear.

You're not saying anything about kfree_skb() that is added when
skb_store_bits() failed. This code is non-trivial so all of the changes
need to be described.

Also did we test this patch? I believe it would require us to run xdpsock
against virtio net device?

> > >
> > > Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
> > > Reported-by: kernel test robot <lkp@intel.com>
> > > Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> > > Closes: https://lore.kernel.org/r/202307210434.OjgqFcbB-lkp@intel.com/
> > > Fixes: cf24f5a5feea ("xsk: add support for AF_XDP multi-buffer on Tx
> > path")
> > >
> > > Changelog:
> > > 	v1 -> v2:
> > > 	- Removed err as a parameter to xsk_build_skb_zerocopy()
> > > 	[Stanislav Fomichev]
> > > 	- use explicit error to distinguish packet drop vs retry
> > > ---
> > >  net/xdp/xsk.c | 22 +++++++++++++---------
> > >  1 file changed, 13 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > > index fcfc8472f73d..55f8b9b0e06d 100644
> > > --- a/net/xdp/xsk.c
> > > +++ b/net/xdp/xsk.c
> > > @@ -602,7 +602,7 @@ static struct sk_buff
> > *xsk_build_skb_zerocopy(struct xdp_sock *xs,
> > >
> > >  	for (copied = 0, i = skb_shinfo(skb)->nr_frags; copied < len; i++) {
> > >  		if (unlikely(i >= MAX_SKB_FRAGS))
> > > -			return ERR_PTR(-EFAULT);
> > > +			return ERR_PTR(-EOVERFLOW);
> > >
> > >  		page = pool->umem->pgs[addr >> PAGE_SHIFT];
> > >  		get_page(page);
> > > @@ -655,15 +655,17 @@ static struct sk_buff *xsk_build_skb(struct
> > xdp_sock *xs,
> > >  			skb_put(skb, len);
> > >
> > >  			err = skb_store_bits(skb, 0, buffer, len);
> > > -			if (unlikely(err))
> > > +			if (unlikely(err)) {
> > > +				kfree_skb(skb);
> > >  				goto free_err;
> > > +			}
> > >  		} else {
> > >  			int nr_frags = skb_shinfo(skb)->nr_frags;
> > >  			struct page *page;
> > >  			u8 *vaddr;
> > >
> > >  			if (unlikely(nr_frags == (MAX_SKB_FRAGS - 1) &&
> > xp_mb_desc(desc))) {
> > > -				err = -EFAULT;
> > > +				err = -EOVERFLOW;
> > >  				goto free_err;
> > >  			}
> > >
> > > @@ -690,12 +692,14 @@ static struct sk_buff *xsk_build_skb(struct
> > xdp_sock *xs,
> > >  	return skb;
> > >
> > >  free_err:
> > > -	if (err == -EAGAIN) {
> > > -		xsk_cq_cancel_locked(xs, 1);
> > > -	} else {
> > > -		xsk_set_destructor_arg(skb);
> > > -		xsk_drop_skb(skb);
> > > +	if (err == -EOVERFLOW) {
> > 
> > Don't think this will work? We have some other error paths in xsk_build_skb
> > that are not -EOVERFLOW that still need kfree_skb, right?
> > 
> 
> There are 4 possible error paths in xsk_build_skb():
> 1. sock_alloc_send_skb:  skb is NULL; retry
> 2. skb_store_bits : free skb and retry
> 3. MAX_SKB_FRAGS exceeded: Free skb, cleanup and drop packet
> 4. alloc_page fails for frag: retry page allocation for frag w/o freeing skb
> 
> Of these 1] and 3] can also happen in xsk_build_skb_zerocopy() and the 
> error returned is either -EOVERFLOW or something else and the same
> error handling needs to be done.

That would be helpful to have it included in commit message.

> 
> > I feel like we are trying to share some state between xsk_build_skb and
> > xsk_build_skb_zerocopy which we really shouldn't share. So how about
> > we try to have a separate cleanup path in xsk_build_skb_zerocopy?
> > 
> 
> The only things that are common between *copy and *zerocopy  paths are 
> setting  the skb headers (destructor/args, mark, priority) and error handling.
> 
> While we can potentially split out the paths into independent functions, the
> same skb header settings and error handling will still need to be duplicated in
> both functions.
> 
> > Will something like the following (untested / uncompiled) work instead?
> > 
> > IOW, ideally, xsk_build_skb should look like:
> > 
> > 	if (dev->priv_flags & IFF_TX_SKB_NO_LINEAR) {
> > 		return xsk_build_skb_zerocopy(xs, desc);
> > 	} else {
> > 		return xsk_build_skb_copy(xs, desc);
> > 		/* ^^ current path that should really be a separate func */
> > 	}

IMHO this is way out of the scope for a fix. We can think later on about
cleaning up these paths.

> >
diff mbox series

Patch

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index fcfc8472f73d..55f8b9b0e06d 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -602,7 +602,7 @@  static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
 
 	for (copied = 0, i = skb_shinfo(skb)->nr_frags; copied < len; i++) {
 		if (unlikely(i >= MAX_SKB_FRAGS))
-			return ERR_PTR(-EFAULT);
+			return ERR_PTR(-EOVERFLOW);
 
 		page = pool->umem->pgs[addr >> PAGE_SHIFT];
 		get_page(page);
@@ -655,15 +655,17 @@  static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
 			skb_put(skb, len);
 
 			err = skb_store_bits(skb, 0, buffer, len);
-			if (unlikely(err))
+			if (unlikely(err)) {
+				kfree_skb(skb);
 				goto free_err;
+			}
 		} else {
 			int nr_frags = skb_shinfo(skb)->nr_frags;
 			struct page *page;
 			u8 *vaddr;
 
 			if (unlikely(nr_frags == (MAX_SKB_FRAGS - 1) && xp_mb_desc(desc))) {
-				err = -EFAULT;
+				err = -EOVERFLOW;
 				goto free_err;
 			}
 
@@ -690,12 +692,14 @@  static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
 	return skb;
 
 free_err:
-	if (err == -EAGAIN) {
-		xsk_cq_cancel_locked(xs, 1);
-	} else {
-		xsk_set_destructor_arg(skb);
-		xsk_drop_skb(skb);
+	if (err == -EOVERFLOW) {
+		/* Drop the packet */
+		xsk_set_destructor_arg(xs->skb);
+		xsk_drop_skb(xs->skb);
 		xskq_cons_release(xs->tx);
+	} else {
+		/* Let application retry */
+		xsk_cq_cancel_locked(xs, 1);
 	}
 
 	return ERR_PTR(err);
@@ -738,7 +742,7 @@  static int __xsk_generic_xmit(struct sock *sk)
 		skb = xsk_build_skb(xs, &desc);
 		if (IS_ERR(skb)) {
 			err = PTR_ERR(skb);
-			if (err == -EAGAIN)
+			if (err != -EOVERFLOW)
 				goto out;
 			err = 0;
 			continue;