diff mbox series

[bpf] xsk: fix l2fwd for copy mode + busy poll combo

Message ID 20220406155804.434493-1-maciej.fijalkowski@intel.com (mailing list archive)
State Accepted
Commit 8de8b71b787f38983d414d2dba169a3bfefa668a
Delegated to: BPF
Headers show
Series [bpf] xsk: fix l2fwd for copy mode + busy poll combo | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers warning 11 maintainers not CCed: songliubraving@fb.com davem@davemloft.net andrii@kernel.org kuba@kernel.org pabeni@redhat.com kafai@fb.com jonathan.lemon@gmail.com hawk@kernel.org yhs@fb.com john.fastabend@gmail.com kpsingh@kernel.org
netdev/build_clang success Errors and warnings before: 9 this patch: 9
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 2 this patch: 2
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-VM_Test-1 success Logs for Kernel LATEST on ubuntu-latest + selftests
bpf/vmtest-bpf-PR success PR summary
bpf/vmtest-bpf-VM_Test-2 success Logs for Kernel LATEST on z15 + selftests

Commit Message

Maciej Fijalkowski April 6, 2022, 3:58 p.m. UTC
While checking AF_XDP copy mode combined with busy poll, strange
results were observed. rxdrop and txonly scenarios worked fine, but
l2fwd broke immediately.

After a deeper look, it turned out that for l2fwd, Tx side was exiting
early due to xsk_no_wakeup() returning true and in the end
xsk_generic_xmit() was never called. Note that AF_XDP Tx in copy mode is
syscall steered, so the current behavior is broken.

Txonly scenario only worked due to the fact that
sk_mark_napi_id_once_xdp() was never called - since Rx side is not in
the picture for this case and mentioned function is called in
xsk_rcv_check(), sk::sk_napi_id was never set, which in turn meant that
xsk_no_wakeup() was returning false (see the sk->sk_napi_id >=
MIN_NAPI_ID check in there).

To fix this, prefer busy poll in xsk_sendmsg() only when zero copy is
enabled on a given AF_XDP socket. By doing so, busy poll in copy mode
would not exit early on Tx side and eventually xsk_generic_xmit() will
be called.

Fixes: a0731952d9cd ("xsk: Add busy-poll support for {recv,send}msg()")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 net/xdp/xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

patchwork-bot+netdevbpf@kernel.org April 7, 2022, 9:10 p.m. UTC | #1
Hello:

This patch was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Wed,  6 Apr 2022 17:58:04 +0200 you wrote:
> While checking AF_XDP copy mode combined with busy poll, strange
> results were observed. rxdrop and txonly scenarios worked fine, but
> l2fwd broke immediately.
> 
> After a deeper look, it turned out that for l2fwd, Tx side was exiting
> early due to xsk_no_wakeup() returning true and in the end
> xsk_generic_xmit() was never called. Note that AF_XDP Tx in copy mode is
> syscall steered, so the current behavior is broken.
> 
> [...]

Here is the summary with links:
  - [bpf] xsk: fix l2fwd for copy mode + busy poll combo
    https://git.kernel.org/bpf/bpf/c/8de8b71b787f

You are awesome, thank you!
diff mbox series

Patch

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 2c34caee0fd1..7d3a00cb24ec 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -639,7 +639,7 @@  static int __xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len
 	if (sk_can_busy_loop(sk))
 		sk_busy_loop(sk, 1); /* only support non-blocking sockets */
 
-	if (xsk_no_wakeup(sk))
+	if (xs->zc && xsk_no_wakeup(sk))
 		return 0;
 
 	pool = xs->pool;