diff mbox series

[bpf] xsk: fix generic transmit when completion queue reservation fails

Message ID 20220614070746.8871-1-ciara.loftus@intel.com (mailing list archive)
State Accepted
Commit a6e944f25cdbe6b82275402b8bc9a55ad7aac10b
Delegated to: BPF
Headers show
Series [bpf] xsk: fix generic transmit when completion queue reservation fails | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers fail 2 blamed authors not CCed: daniel@iogearbox.net i.maximets@samsung.com; 14 maintainers not CCed: i.maximets@samsung.com edumazet@google.com daniel@iogearbox.net songliubraving@fb.com ast@kernel.org hawk@kernel.org pabeni@redhat.com yhs@fb.com kuba@kernel.org john.fastabend@gmail.com davem@davemloft.net kafai@fb.com andrii@kernel.org kpsingh@kernel.org
netdev/build_clang success Errors and warnings before: 6 this patch: 6
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 2 this patch: 2
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 32 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-VM_Test-1 fail Logs for Kernel LATEST on ubuntu-latest with gcc
bpf/vmtest-bpf-VM_Test-2 fail Logs for Kernel LATEST on ubuntu-latest with llvm-15
bpf/vmtest-bpf-PR fail PR summary
bpf/vmtest-bpf-VM_Test-3 fail Logs for Kernel LATEST on z15 with gcc

Commit Message

Ciara Loftus June 14, 2022, 7:07 a.m. UTC
Two points of potential failure in the generic transmit function are:
1. completion queue (cq) reservation failure.
2. skb allocation failure

Originally the cq reservation was performed first, followed by the skb
allocation. Commit 675716400da6 ("xdp: fix possible cq entry leak")
reversed the order because at the time there was no mechanism available to
undo the cq reservation which could have led to possible cq entry leaks in
the event of skb allocation failure. However if the skb allocation is
performed first and the cq reservation then fails, the xsk skb destructor
is called which blindly adds the skb address to the already full cq leading
to undefined behavior.

This commit restores the original order (cq reservation followed by skb
allocation) and uses the xskq_prod_cancel helper to undo the cq reserve in
event of skb allocation failure.

Fixes: 675716400da6 ("xdp: fix possible cq entry leak")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
 net/xdp/xsk.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

Comments

Magnus Karlsson June 14, 2022, 1:16 p.m. UTC | #1
On Tue, Jun 14, 2022 at 9:09 AM Ciara Loftus <ciara.loftus@intel.com> wrote:
>
> Two points of potential failure in the generic transmit function are:
> 1. completion queue (cq) reservation failure.
> 2. skb allocation failure
>
> Originally the cq reservation was performed first, followed by the skb
> allocation. Commit 675716400da6 ("xdp: fix possible cq entry leak")
> reversed the order because at the time there was no mechanism available to
> undo the cq reservation which could have led to possible cq entry leaks in
> the event of skb allocation failure. However if the skb allocation is
> performed first and the cq reservation then fails, the xsk skb destructor
> is called which blindly adds the skb address to the already full cq leading
> to undefined behavior.
>
> This commit restores the original order (cq reservation followed by skb
> allocation) and uses the xskq_prod_cancel helper to undo the cq reserve in
> event of skb allocation failure.

Thanks for fixing this Ciara.

Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>

> Fixes: 675716400da6 ("xdp: fix possible cq entry leak")
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> ---
>  net/xdp/xsk.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 19ac872a6624..09002387987e 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -538,12 +538,6 @@ static int xsk_generic_xmit(struct sock *sk)
>                         goto out;
>                 }
>
> -               skb = xsk_build_skb(xs, &desc);
> -               if (IS_ERR(skb)) {
> -                       err = PTR_ERR(skb);
> -                       goto out;
> -               }
> -
>                 /* This is the backpressure mechanism for the Tx path.
>                  * Reserve space in the completion queue and only proceed
>                  * if there is space in it. This avoids having to implement
> @@ -552,11 +546,19 @@ static int xsk_generic_xmit(struct sock *sk)
>                 spin_lock_irqsave(&xs->pool->cq_lock, flags);
>                 if (xskq_prod_reserve(xs->pool->cq)) {
>                         spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
> -                       kfree_skb(skb);
>                         goto out;
>                 }
>                 spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
>
> +               skb = xsk_build_skb(xs, &desc);
> +               if (IS_ERR(skb)) {
> +                       err = PTR_ERR(skb);
> +                       spin_lock_irqsave(&xs->pool->cq_lock, flags);
> +                       xskq_prod_cancel(xs->pool->cq);
> +                       spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
> +                       goto out;
> +               }
> +
>                 err = __dev_direct_xmit(skb, xs->queue_id);
>                 if  (err == NETDEV_TX_BUSY) {
>                         /* Tell user-space to retry the send */
> --
> 2.25.1
>
patchwork-bot+netdevbpf@kernel.org June 14, 2022, 3 p.m. UTC | #2
Hello:

This patch was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Tue, 14 Jun 2022 07:07:46 +0000 you wrote:
> Two points of potential failure in the generic transmit function are:
> 1. completion queue (cq) reservation failure.
> 2. skb allocation failure
> 
> Originally the cq reservation was performed first, followed by the skb
> allocation. Commit 675716400da6 ("xdp: fix possible cq entry leak")
> reversed the order because at the time there was no mechanism available to
> undo the cq reservation which could have led to possible cq entry leaks in
> the event of skb allocation failure. However if the skb allocation is
> performed first and the cq reservation then fails, the xsk skb destructor
> is called which blindly adds the skb address to the already full cq leading
> to undefined behavior.
> 
> [...]

Here is the summary with links:
  - [bpf] xsk: fix generic transmit when completion queue reservation fails
    https://git.kernel.org/bpf/bpf/c/a6e944f25cdb

You are awesome, thank you!
diff mbox series

Patch

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 19ac872a6624..09002387987e 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -538,12 +538,6 @@  static int xsk_generic_xmit(struct sock *sk)
 			goto out;
 		}
 
-		skb = xsk_build_skb(xs, &desc);
-		if (IS_ERR(skb)) {
-			err = PTR_ERR(skb);
-			goto out;
-		}
-
 		/* This is the backpressure mechanism for the Tx path.
 		 * Reserve space in the completion queue and only proceed
 		 * if there is space in it. This avoids having to implement
@@ -552,11 +546,19 @@  static int xsk_generic_xmit(struct sock *sk)
 		spin_lock_irqsave(&xs->pool->cq_lock, flags);
 		if (xskq_prod_reserve(xs->pool->cq)) {
 			spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
-			kfree_skb(skb);
 			goto out;
 		}
 		spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
 
+		skb = xsk_build_skb(xs, &desc);
+		if (IS_ERR(skb)) {
+			err = PTR_ERR(skb);
+			spin_lock_irqsave(&xs->pool->cq_lock, flags);
+			xskq_prod_cancel(xs->pool->cq);
+			spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
+			goto out;
+		}
+
 		err = __dev_direct_xmit(skb, xs->queue_id);
 		if  (err == NETDEV_TX_BUSY) {
 			/* Tell user-space to retry the send */