
[bpf] xsk: Fix handling of invalid descriptors in XSK Tx batching API

Message ID: 20220607142200.576735-1-maciej.fijalkowski@intel.com
State: Accepted
Commit: d678cbd2f867a564a3c5b276c454e873f43f02f8
Delegated to: BPF
Series: [bpf] xsk: Fix handling of invalid descriptors in XSK Tx batching API

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for bpf
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers           fail     1 blamed authors not CCed: john.fastabend@gmail.com; 12 maintainers not CCed: edumazet@google.com songliubraving@fb.com hawk@kernel.org pabeni@redhat.com jonathan.lemon@gmail.com yhs@fb.com kuba@kernel.org john.fastabend@gmail.com davem@davemloft.net kafai@fb.com andrii@kernel.org kpsingh@kernel.org
netdev/build_clang              success  Errors and warnings before: 7 this patch: 7
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 2 this patch: 2
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 31 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
bpf/vmtest-bpf-PR               success  PR summary
bpf/vmtest-bpf-VM_Test-1        success  Logs for Kernel LATEST on ubuntu-latest with gcc
bpf/vmtest-bpf-VM_Test-2        success  Logs for Kernel LATEST on ubuntu-latest with llvm-15
bpf/vmtest-bpf-VM_Test-3        success  Logs for Kernel LATEST on z15 with gcc

Commit Message

Maciej Fijalkowski June 7, 2022, 2:22 p.m. UTC
Running xdpxceiver on an AF_XDP ZC enabled driver revealed a problem with
the XSK Tx batching API. One of its tests checks how AF_XDP handles
invalid Tx descriptors: each valid descriptor is followed by an invalid
one on the Tx side, whereas the Rx side expects to receive only the set
of valid descriptors.

In the current xsk_tx_peek_release_desc_batch() function, the number of
available descriptors is hidden inside xskq_cons_peek_desc_batch(). This
is a problem when invalid descriptors are present, because
xskq_cons_peek_desc_batch() returns only the count of valid descriptors.
The caller therefore does not know how many entries, valid or not, were
actually walked, and cannot properly update the XSK ring state when
calling xskq_cons_release_n().

To address this issue, pull out the contents of
xskq_cons_peek_desc_batch() so that callers (currently only
xsk_tx_peek_release_desc_batch()) can always update the ring state
properly: the total count of entries is now available to them and is
passed as the argument to xskq_cons_release_n(). With that,
xskq_cons_peek_desc_batch() can be dropped altogether.

Fixes: 9349eb3a9d2a ("xsk: Introduce batched Tx descriptor interfaces")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 net/xdp/xsk.c       | 5 +++--
 net/xdp/xsk_queue.h | 8 --------
 2 files changed, 3 insertions(+), 10 deletions(-)
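
For illustration only, the accounting problem described above can be
reproduced with a small userspace toy model. This is a sketch, not kernel
code: every toy_* name, the ring size, and the "len == 0 means invalid"
rule are assumptions made up for the example. The helpers only mimic the
roles of xskq_cons_nb_entries(), xskq_cons_read_desc_batch() and
xskq_cons_release_n(); the real functions do considerably more.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u /* hypothetical size; a power of two so masking works */
#define TOY_RING_MASK (TOY_RING_SIZE - 1u)

struct toy_desc {
	uint32_t len; /* assumption for this toy: len == 0 marks an invalid descriptor */
};

struct toy_queue {
	struct toy_desc ring[TOY_RING_SIZE];
	uint32_t cached_prod;
	uint32_t cached_cons;
};

static bool toy_desc_is_valid(const struct toy_desc *d)
{
	return d->len != 0;
}

/* Role of xskq_cons_nb_entries(): entries available to the consumer, capped at max. */
static uint32_t toy_nb_entries(const struct toy_queue *q, uint32_t max)
{
	uint32_t entries = q->cached_prod - q->cached_cons;

	return entries < max ? entries : max;
}

/* Role of xskq_cons_read_desc_batch(): copy valid descriptors, skip invalid
 * ones, and return the number of valid descriptors copied. The consumer
 * index of the queue is not moved here; that is left to the release step.
 */
static uint32_t toy_read_desc_batch(const struct toy_queue *q,
				    struct toy_desc *out, uint32_t max)
{
	uint32_t cons = q->cached_cons, nb = 0;

	while (cons != q->cached_prod && nb < max) {
		const struct toy_desc *d = &q->ring[cons & TOY_RING_MASK];

		cons++;
		if (!toy_desc_is_valid(d))
			continue; /* skipped, but the ring slot was still walked */
		out[nb++] = *d;
	}
	return nb;
}

/* Role of xskq_cons_release_n(): advance the consumer index by cnt entries. */
static void toy_release_n(struct toy_queue *q, uint32_t cnt)
{
	q->cached_cons += cnt;
}

/* Each valid descriptor is followed by an invalid one, as in the test case. */
static void toy_fill_alternating(struct toy_queue *q, uint32_t count)
{
	q->cached_cons = 0;
	for (uint32_t i = 0; i < count; i++)
		q->ring[i & TOY_RING_MASK].len = (i % 2 == 0) ? 64 : 0;
	q->cached_prod = count;
}

/* Process one batch and return how many entries are still pending afterwards.
 * use_total == false: release only the valid count (behaviour before the fix).
 * use_total == true:  release the total entry count (behaviour after the fix).
 */
static uint32_t toy_leftover_after_batch(bool use_total)
{
	struct toy_queue q;
	struct toy_desc out[TOY_RING_SIZE];
	uint32_t entries, nb_pkts;

	toy_fill_alternating(&q, TOY_RING_SIZE);
	entries = toy_nb_entries(&q, TOY_RING_SIZE);
	nb_pkts = toy_read_desc_batch(&q, out, entries);
	toy_release_n(&q, use_total ? entries : nb_pkts);

	return q.cached_prod - q.cached_cons;
}
```

With eight entries in the toy ring, half of them invalid, releasing only
the valid count leaves four entries pending even though all eight were
walked, so the next batch would start in the middle of already-processed
entries. Releasing the total entry count leaves nothing pending.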

Comments

Magnus Karlsson June 8, 2022, 9:55 a.m. UTC | #1
On Tue, Jun 7, 2022 at 7:16 PM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> Running xdpxceiver on an AF_XDP ZC enabled driver revealed a problem with
> the XSK Tx batching API. One of its tests checks how AF_XDP handles
> invalid Tx descriptors: each valid descriptor is followed by an invalid
> one on the Tx side, whereas the Rx side expects to receive only the set
> of valid descriptors.
>
> In the current xsk_tx_peek_release_desc_batch() function, the number of
> available descriptors is hidden inside xskq_cons_peek_desc_batch(). This
> is a problem when invalid descriptors are present, because
> xskq_cons_peek_desc_batch() returns only the count of valid descriptors.
> The caller therefore does not know how many entries, valid or not, were
> actually walked, and cannot properly update the XSK ring state when
> calling xskq_cons_release_n().
>
> To address this issue, pull out the contents of
> xskq_cons_peek_desc_batch() so that callers (currently only
> xsk_tx_peek_release_desc_batch()) can always update the ring state
> properly: the total count of entries is now available to them and is
> passed as the argument to xskq_cons_release_n(). With that,
> xskq_cons_peek_desc_batch() can be dropped altogether.

Thank you for catching this, Maciej!

Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>

> Fixes: 9349eb3a9d2a ("xsk: Introduce batched Tx descriptor interfaces")
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
>  net/xdp/xsk.c       | 5 +++--
>  net/xdp/xsk_queue.h | 8 --------
>  2 files changed, 3 insertions(+), 10 deletions(-)
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index e0a4526ab66b..19ac872a6624 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -373,7 +373,8 @@ u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max_entries)
>                 goto out;
>         }
>
> -       nb_pkts = xskq_cons_peek_desc_batch(xs->tx, pool, max_entries);
> +       max_entries = xskq_cons_nb_entries(xs->tx, max_entries);
> +       nb_pkts = xskq_cons_read_desc_batch(xs->tx, pool, max_entries);
>         if (!nb_pkts) {
>                 xs->tx->queue_empty_descs++;
>                 goto out;
> @@ -389,7 +390,7 @@ u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max_entries)
>         if (!nb_pkts)
>                 goto out;
>
> -       xskq_cons_release_n(xs->tx, nb_pkts);
> +       xskq_cons_release_n(xs->tx, max_entries);
>         __xskq_cons_release(xs->tx);
>         xs->sk.sk_write_space(&xs->sk);
>
> diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
> index a794410989cc..fb20bf7207cf 100644
> --- a/net/xdp/xsk_queue.h
> +++ b/net/xdp/xsk_queue.h
> @@ -282,14 +282,6 @@ static inline bool xskq_cons_peek_desc(struct xsk_queue *q,
>         return xskq_cons_read_desc(q, desc, pool);
>  }
>
> -static inline u32 xskq_cons_peek_desc_batch(struct xsk_queue *q, struct xsk_buff_pool *pool,
> -                                           u32 max)
> -{
> -       u32 entries = xskq_cons_nb_entries(q, max);
> -
> -       return xskq_cons_read_desc_batch(q, pool, entries);
> -}
> -
>  /* To improve performance in the xskq_cons_release functions, only update local state here.
>   * Reflect this to global state when we get new entries from the ring in
>   * xskq_cons_get_entries() and whenever Rx or Tx processing are completed in the NAPI loop.
> --
> 2.27.0
>
patchwork-bot+netdevbpf@kernel.org June 8, 2022, 2:30 p.m. UTC | #2
Hello:

This patch was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Tue,  7 Jun 2022 16:22:00 +0200 you wrote:
> Running xdpxceiver on an AF_XDP ZC enabled driver revealed a problem with
> the XSK Tx batching API. One of its tests checks how AF_XDP handles
> invalid Tx descriptors: each valid descriptor is followed by an invalid
> one on the Tx side, whereas the Rx side expects to receive only the set
> of valid descriptors.
> 
> In the current xsk_tx_peek_release_desc_batch() function, the number of
> available descriptors is hidden inside xskq_cons_peek_desc_batch(). This
> is a problem when invalid descriptors are present, because
> xskq_cons_peek_desc_batch() returns only the count of valid descriptors.
> The caller therefore does not know how many entries, valid or not, were
> actually walked, and cannot properly update the XSK ring state when
> calling xskq_cons_release_n().
> 
> [...]

Here is the summary with links:
  - [bpf] xsk: Fix handling of invalid descriptors in XSK Tx batching API
    https://git.kernel.org/bpf/bpf/c/d678cbd2f867

You are awesome, thank you!

Patch

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index e0a4526ab66b..19ac872a6624 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -373,7 +373,8 @@  u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max_entries)
 		goto out;
 	}
 
-	nb_pkts = xskq_cons_peek_desc_batch(xs->tx, pool, max_entries);
+	max_entries = xskq_cons_nb_entries(xs->tx, max_entries);
+	nb_pkts = xskq_cons_read_desc_batch(xs->tx, pool, max_entries);
 	if (!nb_pkts) {
 		xs->tx->queue_empty_descs++;
 		goto out;
@@ -389,7 +390,7 @@  u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max_entries)
 	if (!nb_pkts)
 		goto out;
 
-	xskq_cons_release_n(xs->tx, nb_pkts);
+	xskq_cons_release_n(xs->tx, max_entries);
 	__xskq_cons_release(xs->tx);
 	xs->sk.sk_write_space(&xs->sk);
 
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index a794410989cc..fb20bf7207cf 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -282,14 +282,6 @@  static inline bool xskq_cons_peek_desc(struct xsk_queue *q,
 	return xskq_cons_read_desc(q, desc, pool);
 }
 
-static inline u32 xskq_cons_peek_desc_batch(struct xsk_queue *q, struct xsk_buff_pool *pool,
-					    u32 max)
-{
-	u32 entries = xskq_cons_nb_entries(q, max);
-
-	return xskq_cons_read_desc_batch(q, pool, entries);
-}
-
 /* To improve performance in the xskq_cons_release functions, only update local state here.
  * Reflect this to global state when we get new entries from the ring in
  * xskq_cons_get_entries() and whenever Rx or Tx processing are completed in the NAPI loop.