[net] net/sched: act_mirred: use the backlog for mirred ingress

Message ID: 33dc43f587ec1388ba456b4915c75f02a8aae226.1663945716.git.dcaratti@redhat.com (mailing list archive)
State: Changes Requested
Delegated to: Netdev Maintainers
Series: [net] net/sched: act_mirred: use the backlog for mirred ingress

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers fail 2 blamed authors not CCed: shmulik.ladkani@gmail.com davem@davemloft.net; 8 maintainers not CCed: shmulik.ladkani@gmail.com kuba@kernel.org shuah@kernel.org idosch@nvidia.com davem@davemloft.net linux-kselftest@vger.kernel.org vladimir.oltean@nxp.com edumazet@google.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success net selftest script(s) already in Makefile
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 49 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Davide Caratti Sept. 23, 2022, 3:11 p.m. UTC
William reports kernel soft-lockups on some OVS topologies when the TC
mirred "egress-to-ingress" action is hit by local TCP traffic. Indeed,
using the mirred action in the egress-to-ingress direction can easily
produce a dmesg splat like:

 ============================================
 WARNING: possible recursive locking detected
 6.0.0-rc4+ #511 Not tainted
 --------------------------------------------
 nc/1037 is trying to acquire lock:
 ffff950687843cb0 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1023/0x1160

 but task is already holding lock:
 ffff950687846cb0 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1023/0x1160

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(slock-AF_INET/1);
   lock(slock-AF_INET/1);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 12 locks held by nc/1037:
  #0: ffff950687843d40 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendmsg+0x19/0x40
  #1: ffffffff9be07320 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x5/0x610
  #2: ffffffff9be072e0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0xaa/0xa10
  #3: ffffffff9be072e0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x72/0x11b0
  #4: ffffffff9be07320 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0x181/0x400
  #5: ffffffff9be07320 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x54/0x160
  #6: ffff950687846cb0 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1023/0x1160
  #7: ffffffff9be07320 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x5/0x610
  #8: ffffffff9be072e0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0xaa/0xa10
  #9: ffffffff9be072e0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x72/0x11b0
  #10: ffffffff9be07320 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0x181/0x400
  #11: ffffffff9be07320 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x54/0x160

 stack backtrace:
 CPU: 1 PID: 1037 Comm: nc Not tainted 6.0.0-rc4+ #511
 Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7353+9de0a3cc 04/01/2014
 Call Trace:
  <TASK>
  dump_stack_lvl+0x44/0x5b
  __lock_acquire.cold.76+0x121/0x2a7
  lock_acquire+0xd5/0x310
  _raw_spin_lock_nested+0x39/0x70
  tcp_v4_rcv+0x1023/0x1160
  ip_protocol_deliver_rcu+0x4d/0x280
  ip_local_deliver_finish+0xac/0x160
  ip_local_deliver+0x71/0x220
  ip_rcv+0x5a/0x200
  __netif_receive_skb_one_core+0x89/0xa0
  netif_receive_skb+0x1c1/0x400
  tcf_mirred_act+0x2a5/0x610 [act_mirred]
  tcf_action_exec+0xb3/0x210
  fl_classify+0x1f7/0x240 [cls_flower]
  tcf_classify+0x7b/0x320
  __dev_queue_xmit+0x3a4/0x11b0
  ip_finish_output2+0x3b8/0xa10
  ip_output+0x7f/0x260
  __ip_queue_xmit+0x1ce/0x610
  __tcp_transmit_skb+0xabc/0xc80
  tcp_rcv_state_process+0x669/0x1290
  tcp_v4_do_rcv+0xd7/0x370
  tcp_v4_rcv+0x10bc/0x1160
  ip_protocol_deliver_rcu+0x4d/0x280
  ip_local_deliver_finish+0xac/0x160
  ip_local_deliver+0x71/0x220
  ip_rcv+0x5a/0x200
  __netif_receive_skb_one_core+0x89/0xa0
  netif_receive_skb+0x1c1/0x400
  tcf_mirred_act+0x2a5/0x610 [act_mirred]
  tcf_action_exec+0xb3/0x210
  fl_classify+0x1f7/0x240 [cls_flower]
  tcf_classify+0x7b/0x320
  __dev_queue_xmit+0x3a4/0x11b0
  ip_finish_output2+0x3b8/0xa10
  ip_output+0x7f/0x260
  __ip_queue_xmit+0x1ce/0x610
  __tcp_transmit_skb+0xabc/0xc80
  tcp_write_xmit+0x229/0x12c0
  __tcp_push_pending_frames+0x32/0xf0
  tcp_sendmsg_locked+0x297/0xe10
  tcp_sendmsg+0x27/0x40
  sock_sendmsg+0x58/0x70
  __sys_sendto+0xfd/0x170
  __x64_sys_sendto+0x24/0x30
  do_syscall_64+0x3a/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 RIP: 0033:0x7f11a06fd281
 Code: 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 e5 43 2c 00 41 89 ca 8b 00 85 c0 75 1c 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 67 c3 66 0f 1f 44 00 00 41 56 41 89 ce 41 55
 RSP: 002b:00007ffd17958358 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
 RAX: ffffffffffffffda RBX: 0000555c6e671610 RCX: 00007f11a06fd281
 RDX: 0000000000002000 RSI: 0000555c6e73a9f0 RDI: 0000000000000003
 RBP: 0000555c6e6433b0 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000002000
 R13: 0000555c6e671410 R14: 0000555c6e671410 R15: 0000555c6e6433f8
  </TASK>

which is very similar to the ones observed by William in his setup.
By using netif_rx() for mirred ingress packets, packets are queued in the
backlog, as is done in the receive path of "loopback" and "veth", and the
deadlock is no longer visible. Also add a selftest that can be used to
reproduce the problem and verify the fix.

Fixes: 53592b364001 ("net/sched: act_mirred: Implement ingress actions")
Reported-by: William Zhao <wizhao@redhat.com>
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 net/sched/act_mirred.c                        |  2 +-
 .../selftests/net/forwarding/tc_actions.sh    | 29 ++++++++++++++++++-
 2 files changed, 29 insertions(+), 2 deletions(-)

Comments

Cong Wang Sept. 25, 2022, 6:08 p.m. UTC | #1
On Fri, Sep 23, 2022 at 05:11:12PM +0200, Davide Caratti wrote:
> William reports kernel soft-lockups on some OVS topologies when the TC
> mirred "egress-to-ingress" action is hit by local TCP traffic. Indeed,
> using the mirred action in the egress-to-ingress direction can easily
> produce a dmesg splat like:
> 
> [...]
> 
> which is very similar to the ones observed by William in his setup.
> By using netif_rx() for mirred ingress packets, packets are queued in the
> backlog, as is done in the receive path of "loopback" and "veth", and the
> deadlock is no longer visible. Also add a selftest that can be used to
> reproduce the problem and verify the fix.

Which also means we can no longer know the RX path status, right? I
mean, if we have filters on ingress, we can't know whether they drop
this packet or not after this patch? To me, this at least breaks users'
expectations.

BTW, have you thought about solving the above lockdep warning in the
TCP layer?

Thanks.
Davide Caratti Oct. 4, 2022, 5:40 p.m. UTC | #2
hello Cong, thanks for looking at this!

On Sun, Sep 25, 2022 at 11:08:48AM -0700, Cong Wang wrote:
> On Fri, Sep 23, 2022 at 05:11:12PM +0200, Davide Caratti wrote:
> > William reports kernel soft-lockups on some OVS topologies when the TC
> > mirred "egress-to-ingress" action is hit by local TCP traffic. Indeed,
> > using the mirred action in the egress-to-ingress direction can easily
> > produce a dmesg splat like:
> > 
> >  ============================================
> >  WARNING: possible recursive locking detected

[...]

> >  6.0.0-rc4+ #511 Not tainted
> >  --------------------------------------------
> >  nc/1037 is trying to acquire lock:
> >  ffff950687843cb0 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1023/0x1160
> > 
> >  but task is already holding lock:
> >  ffff950687846cb0 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1023/0x1160

FTR, this is:

2091         sk_incoming_cpu_update(sk);
2092
2093         bh_lock_sock_nested(sk); <--- the lock reported in the splat
2094         tcp_segs_in(tcp_sk(sk), skb);
2095         ret = 0;
2096         if (!sock_owned_by_user(sk)) {

> BTW, have you thought about solving the above lockdep warning in the
> TCP layer?

yes, but that doesn't look like a trivial fix at all - and I doubt it's
worth doing just to make mirred and TCP "friends". Please note: on
current kernels this doesn't just result in a lockdep warning: using
iperf3 on unpatched kernels it's possible to hit a real deadlock, like:

WARNING: possible circular locking dependency detected
 6.0.0-rc4+ #511 Not tainted
 ------------------------------------------------------
 iperf3/1021 is trying to acquire lock:
 ffff976005c5a630 (slock-AF_INET6/1){+...}-{2:2}, at: tcp_v4_rcv+0x1023/0x1160

 but task is already holding lock:
 ffff97607b06e0b0 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1023/0x1160

 which lock already depends on the new lock.


 the existing dependency chain (in reverse order) is:

 -> #1 (slock-AF_INET/1){+.-.}-{2:2}:
        lock_acquire+0xd5/0x310
        _raw_spin_lock_nested+0x39/0x70
        tcp_v4_rcv+0x1023/0x1160
        ip_protocol_deliver_rcu+0x4d/0x280
        ip_local_deliver_finish+0xac/0x160
        ip_local_deliver+0x71/0x220
        ip_rcv+0x5a/0x200
        __netif_receive_skb_one_core+0x89/0xa0
        netif_receive_skb+0x1c1/0x400
        tcf_mirred_act+0x2a5/0x610 [act_mirred]
        tcf_action_exec+0xb3/0x210
        fl_classify+0x1f7/0x240 [cls_flower]
        tcf_classify+0x7b/0x320
        __dev_queue_xmit+0x3a4/0x11b0
        ip_finish_output2+0x3b8/0xa10
        ip_output+0x7f/0x260
        __ip_queue_xmit+0x1ce/0x610
        __tcp_transmit_skb+0xabc/0xc80
        tcp_rcv_established+0x284/0x810
        tcp_v4_do_rcv+0x1f3/0x370
        tcp_v4_rcv+0x10bc/0x1160
        ip_protocol_deliver_rcu+0x4d/0x280
        ip_local_deliver_finish+0xac/0x160
        ip_local_deliver+0x71/0x220
        ip_rcv+0x5a/0x200
        __netif_receive_skb_one_core+0x89/0xa0
        netif_receive_skb+0x1c1/0x400
        tcf_mirred_act+0x2a5/0x610 [act_mirred]
        tcf_action_exec+0xb3/0x210
        fl_classify+0x1f7/0x240 [cls_flower]
        tcf_classify+0x7b/0x320
        __dev_queue_xmit+0x3a4/0x11b0
        ip_finish_output2+0x3b8/0xa10
        ip_output+0x7f/0x260
        __ip_queue_xmit+0x1ce/0x610
        __tcp_transmit_skb+0xabc/0xc80
        tcp_write_xmit+0x229/0x12c0
        __tcp_push_pending_frames+0x32/0xf0
        tcp_sendmsg_locked+0x297/0xe10
        tcp_sendmsg+0x27/0x40
        sock_sendmsg+0x58/0x70
        sock_write_iter+0x9a/0x100
        vfs_write+0x481/0x4f0
        ksys_write+0xc2/0xe0
        do_syscall_64+0x3a/0x90
        entry_SYSCALL_64_after_hwframe+0x63/0xcd

 -> #0 (slock-AF_INET6/1){+...}-{2:2}:
        check_prevs_add+0x185/0xf50
        __lock_acquire+0x11eb/0x1620
        lock_acquire+0xd5/0x310
        _raw_spin_lock_nested+0x39/0x70
        tcp_v4_rcv+0x1023/0x1160
        ip_protocol_deliver_rcu+0x4d/0x280
        ip_local_deliver_finish+0xac/0x160
        ip_local_deliver+0x71/0x220
        ip_rcv+0x5a/0x200
        __netif_receive_skb_one_core+0x89/0xa0
        netif_receive_skb+0x1c1/0x400
        tcf_mirred_act+0x2a5/0x610 [act_mirred]
        tcf_action_exec+0xb3/0x210
        fl_classify+0x1f7/0x240 [cls_flower]
        tcf_classify+0x7b/0x320
        __dev_queue_xmit+0x3a4/0x11b0
        ip_finish_output2+0x3b8/0xa10
        ip_output+0x7f/0x260
        __ip_queue_xmit+0x1ce/0x610
        __tcp_transmit_skb+0xabc/0xc80
        tcp_rcv_established+0x284/0x810
        tcp_v4_do_rcv+0x1f3/0x370
        tcp_v4_rcv+0x10bc/0x1160
        ip_protocol_deliver_rcu+0x4d/0x280
        ip_local_deliver_finish+0xac/0x160
        ip_local_deliver+0x71/0x220
        ip_rcv+0x5a/0x200
        __netif_receive_skb_one_core+0x89/0xa0
        netif_receive_skb+0x1c1/0x400
        tcf_mirred_act+0x2a5/0x610 [act_mirred]
        tcf_action_exec+0xb3/0x210
        fl_classify+0x1f7/0x240 [cls_flower]
        tcf_classify+0x7b/0x320
        __dev_queue_xmit+0x3a4/0x11b0
        ip_finish_output2+0x3b8/0xa10
        ip_output+0x7f/0x260
        __ip_queue_xmit+0x1ce/0x610
        __tcp_transmit_skb+0xabc/0xc80
        tcp_write_xmit+0x229/0x12c0
        __tcp_push_pending_frames+0x32/0xf0
        tcp_sendmsg_locked+0x297/0xe10
        tcp_sendmsg+0x27/0x40
        sock_sendmsg+0x42/0x70
        sock_write_iter+0x9a/0x100
        vfs_write+0x481/0x4f0
        ksys_write+0xc2/0xe0
        do_syscall_64+0x3a/0x90
        entry_SYSCALL_64_after_hwframe+0x63/0xcd

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(slock-AF_INET/1);
                                lock(slock-AF_INET6/1);
                                lock(slock-AF_INET/1);
   lock(slock-AF_INET6/1);

  *** DEADLOCK ***

 12 locks held by iperf3/1021:
  #0: ffff976005c5a6c0 (sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_sendmsg+0x19/0x40
  #1: ffffffffbca07320 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x5/0x610
  #2: ffffffffbca072e0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0xaa/0xa10
  #3: ffffffffbca072e0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x72/0x11b0
  #4: ffffffffbca07320 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0x181/0x400
  #5: ffffffffbca07320 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x54/0x160
  #6: ffff97607b06e0b0 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1023/0x1160
  #7: ffffffffbca07320 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x5/0x610
  #8: ffffffffbca072e0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0xaa/0xa10
  #9: ffffffffbca072e0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x72/0x11b0
  #10: ffffffffbca07320 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0x181/0x400
  #11: ffffffffbca07320 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x54/0x160

 [...]

 kernel:watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [swapper/1:0]

Moreover, even if we improve TCP locking to avoid lockups for this
simple topology, I suspect that TCP will experience some packet losses:
when mirred detects 4 nested calls of tcf_mirred_act(), the kernel
protects against excessive stack growth and drops the skb (which can
also be a full TSO packet). The protocol can probably recover, but
performance will certainly be sub-optimal.
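
For reference, the guard in question in net/sched/act_mirred.c looks
roughly like this (a simplified excerpt; exact details can differ across
kernel versions):

  #define MIRRED_RECURSION_LIMIT	4
  static DEFINE_PER_CPU(unsigned int, mirred_rec_level);

  static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
  			    struct tcf_result *res)
  {
  	unsigned int rec_level = __this_cpu_inc_return(mirred_rec_level);

  	if (unlikely(rec_level > MIRRED_RECURSION_LIMIT)) {
  		/* too many nested tcf_mirred_act() calls on this CPU:
  		 * drop the skb rather than keep growing the stack
  		 */
  		net_warn_ratelimited("Packet exceeded mirred recursion limit on dev %s\n",
  				     netdev_name(skb->dev));
  		__this_cpu_dec(mirred_rec_level);
  		return TC_ACT_SHOT;
  	}
  	/* ... mirror/redirect logic ... */
  }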

> Which also means we can no longer know the RX path status, right? I
> mean, if we have filters on ingress, we can't know whether they drop
> this packet or not after this patch? To me, this at least breaks users'
> expectations.

Fair point! Then maybe we don't need to change the whole TC mirred
ingress path: since the problem only affects egress->ingress, we can
preserve the call to netif_receive_skb() for ingress->ingress, and just
use the backlog in the egress->ingress direction - which has been broken
since the very beginning and got similar fixes in the past [1].
Something like:

-- >8 --
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -205,12 +205,14 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
        return err;
 }

-static int tcf_mirred_forward(bool want_ingress, struct sk_buff *skb)
+static int tcf_mirred_forward(bool want_ingress, bool at_ingress, struct sk_buff *skb)
 {
        int err;

        if (!want_ingress)
                err = tcf_dev_queue_xmit(skb, dev_queue_xmit);
+       else if (!at_ingress)
+               err = netif_rx(skb);
        else
                err = netif_receive_skb(skb);

@@ -306,7 +308,7 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
                /* let's the caller reinsert the packet, if possible */
                if (use_reinsert) {
                        res->ingress = want_ingress;
-                       err = tcf_mirred_forward(res->ingress, skb);
+                       err = tcf_mirred_forward(res->ingress, at_ingress, skb);
                        if (err)
                                tcf_action_inc_overlimit_qstats(&m->common);
                        __this_cpu_dec(mirred_rec_level);
@@ -314,7 +316,7 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
                }
        }

-       err = tcf_mirred_forward(want_ingress, skb2);
+       err = tcf_mirred_forward(want_ingress, at_ingress, skb2);
        if (err) {
 out:
                tcf_action_inc_overlimit_qstats(&m->common);
-- >8 --

WDYT? Any feedback appreciated, thanks!

[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=f799ada6bf2397c351220088b9b0980125c77280
Cong Wang Oct. 16, 2022, 5:28 p.m. UTC | #3
On Tue, Oct 04, 2022 at 07:40:27PM +0200, Davide Caratti wrote:
> hello Cong, thanks for looking at this!
> 
> On Sun, Sep 25, 2022 at 11:08:48AM -0700, Cong Wang wrote:
> > On Fri, Sep 23, 2022 at 05:11:12PM +0200, Davide Caratti wrote:
> > > William reports kernel soft-lockups on some OVS topologies when the TC
> > > mirred "egress-to-ingress" action is hit by local TCP traffic. Indeed,
> > > using the mirred action in the egress-to-ingress direction can easily
> > > produce a dmesg splat like:
> > > 
> > >  ============================================
> > >  WARNING: possible recursive locking detected
> 
> [...]
> 
> > >  6.0.0-rc4+ #511 Not tainted
> > >  --------------------------------------------
> > >  nc/1037 is trying to acquire lock:
> > >  ffff950687843cb0 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1023/0x1160
> > > 
> > >  but task is already holding lock:
> > >  ffff950687846cb0 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1023/0x1160
> 
> FTR, this is:

Yeah, Peilin actually looked deeper into this issue. Let's copy him.

> 
> 2091         sk_incoming_cpu_update(sk);
> 2092
> 2093         bh_lock_sock_nested(sk); <--- the lock reported in the splat
> 2094         tcp_segs_in(tcp_sk(sk), skb);
> 2095         ret = 0;
> 2096         if (!sock_owned_by_user(sk)) {
> 
> > BTW, have you thought about solving the above lockdep warning in the
> > TCP layer?
> 
> yes, but that doesn't look like a trivial fix at all - and I doubt it's
> worth doing just to make mirred and TCP "friends". Please note: on
> current kernels this doesn't just result in a lockdep warning: using
> iperf3 on unpatched kernels it's possible to hit a real deadlock, like:

I'd say your test case is rare, because I don't think it is trivial for
a TCP socket to send packets to itself.

 
> > Which also means we can no longer know the RX path status, right? I
> > mean, if we have filters on ingress, we can't know whether they drop
> > this packet or not after this patch? To me, this at least breaks users'
> > expectations.
> 
> Fair point! Then maybe we don't need to change the whole TC mirred
> ingress path: since the problem only affects egress->ingress, we can
> preserve the call to netif_receive_skb() for ingress->ingress, and just
> use the backlog in the egress->ingress direction - which has been broken
> since the very beginning and got similar fixes in the past [1].
> Something like:

Regardless of ingress->ingress or egress->ingress, this patch breaks
users' expectations. And actually, egress->ingress is more common than
ingress->ingress, in my experience.

Thanks.
Peilin Ye Nov. 18, 2022, 11:07 p.m. UTC | #4
Hi all,

On Fri, Sep 23, 2022 at 05:11:12PM +0200, Davide Caratti wrote:
> +mirred_egress_to_ingress_tcp_test()
> +{
> +	local tmpfile=$(mktemp) tmpfile1=$(mktemp)
> +
> +	RET=0
> +	dd conv=sparse status=none if=/dev/zero bs=1M count=2 of=$tmpfile
> +	tc filter add dev $h1 protocol ip pref 100 handle 100 egress flower \
> +		ip_proto tcp src_ip 192.0.2.1 dst_ip 192.0.2.2 \
> +			action ct commit nat src addr 192.0.2.2 pipe \
> +			action ct clear pipe \
> +			action ct commit nat dst addr 192.0.2.1 pipe \
> +			action ct clear pipe \
> +			action skbedit ptype host pipe \
> +			action mirred ingress redirect dev $h1

FWIW, I couldn't reproduce the lockup using this test case (with
forwarding.config.sample), but I got the same lockup in tcp_v4_rcv()
using a different (but probably less realistic) TC filter:

	tc filter add dev $h1 protocol ip pref 100 handle 100 egress flower \
		ip_proto tcp src_ip 192.0.2.1 dst_ip 192.0.2.2 \
			action pedit ex munge ip src set 192.0.2.2 pipe \
			action pedit ex munge ip dst set 192.0.2.1 pipe \
			action skbedit ptype host pipe \
			action mirred ingress redirect dev $h1

Thanks,
Peilin Ye
Patch

diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index a1d70cf86843..ff965ed2dd9f 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -213,7 +213,7 @@  static int tcf_mirred_forward(bool want_ingress, struct sk_buff *skb)
 	if (!want_ingress)
 		err = tcf_dev_queue_xmit(skb, dev_queue_xmit);
 	else
-		err = netif_receive_skb(skb);
+		err = netif_rx(skb);
 
 	return err;
 }
diff --git a/tools/testing/selftests/net/forwarding/tc_actions.sh b/tools/testing/selftests/net/forwarding/tc_actions.sh
index 1e0a62f638fe..6e1b0cc68f7d 100755
--- a/tools/testing/selftests/net/forwarding/tc_actions.sh
+++ b/tools/testing/selftests/net/forwarding/tc_actions.sh
@@ -3,7 +3,8 @@ 
 
 ALL_TESTS="gact_drop_and_ok_test mirred_egress_redirect_test \
 	mirred_egress_mirror_test matchall_mirred_egress_mirror_test \
-	gact_trap_test mirred_egress_to_ingress_test"
+	gact_trap_test mirred_egress_to_ingress_test \
+	mirred_egress_to_ingress_tcp_test"
 NUM_NETIFS=4
 source tc_common.sh
 source lib.sh
@@ -198,6 +199,32 @@  mirred_egress_to_ingress_test()
 	log_test "mirred_egress_to_ingress ($tcflags)"
 }
 
+mirred_egress_to_ingress_tcp_test()
+{
+	local tmpfile=$(mktemp) tmpfile1=$(mktemp)
+
+	RET=0
+	dd conv=sparse status=none if=/dev/zero bs=1M count=2 of=$tmpfile
+	tc filter add dev $h1 protocol ip pref 100 handle 100 egress flower \
+		ip_proto tcp src_ip 192.0.2.1 dst_ip 192.0.2.2 \
+			action ct commit nat src addr 192.0.2.2 pipe \
+			action ct clear pipe \
+			action ct commit nat dst addr 192.0.2.1 pipe \
+			action ct clear pipe \
+			action skbedit ptype host pipe \
+			action mirred ingress redirect dev $h1
+
+	ip vrf exec v$h1 nc --recv-only -w10 -l -p 12345 -o $tmpfile1 &
+	local rpid=$!
+	ip vrf exec v$h1 nc -w1 --send-only 192.0.2.2 12345 <$tmpfile
+	wait -n $rpid
+	cmp -s $tmpfile $tmpfile1
+	check_err $? "server output check failed"
+	tc filter del dev $h1 egress protocol ip pref 100 handle 100 flower
+	rm -f $tmpfile $tmpfile1
+	log_test "mirred_egress_to_ingress_tcp"
+}
+
 setup_prepare()
 {
 	h1=${NETIFS[p1]}