diff mbox series

[net,v2] nfp: fix rcu_read_lock/unlock while rcu_derefrencing

Message ID 20230615073139.8656-1-louis.peens@corigine.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net,v2] nfp: fix rcu_read_lock/unlock while rcu_derefrencing | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers fail 2 blamed authors not CCed: daniel@iogearbox.net toke@redhat.com; 8 maintainers not CCed: hawk@kernel.org daniel@iogearbox.net john.fastabend@gmail.com yanguo.li@corigine.com bpf@vger.kernel.org toke@redhat.com ast@kernel.org edumazet@google.com
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch warning WARNING: line length of 82 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Louis Peens June 15, 2023, 7:31 a.m. UTC
From: Tianyu Yuan <tianyu.yuan@corigine.com>

When CONFIG_PROVE_LOCKING and CONFIG_PROVE_RCU are enabled, using OVS with
vf reprs on bridge will lead to following log in dmesg:

 .../nfp/flower/main.c:269 suspicious rcu_dereference_check() usage!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 no locks held by swapper/15/0.

 ......
 Call Trace:
  <IRQ>
  dump_stack_lvl+0x8c/0xa0
  lockdep_rcu_suspicious+0x118/0x1a0
  nfp_flower_dev_get+0xc1/0x240 [nfp]
  nfp_nfd3_rx+0x419/0xb90 [nfp]
  ? validate_chain+0x640/0x1880
  nfp_nfd3_poll+0x3e/0x180 [nfp]
  __napi_poll+0x28/0x1d0
  net_rx_action+0x2bd/0x3c0
  ? _raw_spin_unlock_irqrestore+0x42/0x70
  __do_softirq+0xc3/0x3c6
  irq_exit_rcu+0xeb/0x130
  common_interrupt+0xb9/0xd0
  </IRQ>
  <TASK>
  ......
  </TASK>

In previous patch rcu_read_lock()/unlock() are removed because rcu-lock may
affect xdp_prog. However this removal will make RCU lockdep report above
warning because of missing of rcu_read_lock()/unlock() pair around
rcu_deference().

This patch resolves this problem by replacing rcu_deference() with
rcu_dereference_check() to annotate that access is safe if
rcu_read_lock/rcu_read_lock_bh is held.

Fixes: d5789621b658 ("nfp: Remove rcu_read_lock() around XDP program invocation")
CC: stable@vger.kernel.org
Signed-off-by: Tianyu Yuan <tianyu.yuan@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
---
v1 -> v2:
   Remove unnessary nfp_app_dev_get_locked() helper function
   Replace rcu_dereference() with rcu_dereference_check() in .get_dev()
   Improve commit message
---
 drivers/net/ethernet/netronome/nfp/abm/main.c    | 4 ++--
 drivers/net/ethernet/netronome/nfp/flower/main.c | 5 +++--
 2 files changed, 5 insertions(+), 4 deletions(-)

Comments

Jakub Kicinski June 15, 2023, 7:47 p.m. UTC | #1
On Thu, 15 Jun 2023 09:31:39 +0200 Louis Peens wrote:
> From: Tianyu Yuan <tianyu.yuan@corigine.com>
> 
> When CONFIG_PROVE_LOCKING and CONFIG_PROVE_RCU are enabled, using OVS with
> vf reprs on bridge will lead to following log in dmesg:
> 
>  .../nfp/flower/main.c:269 suspicious rcu_dereference_check() usage!
> 
>  other info that might help us debug this:
> 
>  rcu_scheduler_active = 2, debug_locks = 1
>  no locks held by swapper/15/0.
> 
>  ......
>  Call Trace:
>   <IRQ>
>   dump_stack_lvl+0x8c/0xa0
>   lockdep_rcu_suspicious+0x118/0x1a0
>   nfp_flower_dev_get+0xc1/0x240 [nfp]
>   nfp_nfd3_rx+0x419/0xb90 [nfp]
>   ? validate_chain+0x640/0x1880
>   nfp_nfd3_poll+0x3e/0x180 [nfp]
>   __napi_poll+0x28/0x1d0
>   net_rx_action+0x2bd/0x3c0
>   ? _raw_spin_unlock_irqrestore+0x42/0x70
>   __do_softirq+0xc3/0x3c6
>   irq_exit_rcu+0xeb/0x130
>   common_interrupt+0xb9/0xd0
>   </IRQ>
>   <TASK>
>   ......
>   </TASK>
> 
> In previous patch rcu_read_lock()/unlock() are removed because rcu-lock may
> affect xdp_prog. However this removal will make RCU lockdep report above
> warning because of missing of rcu_read_lock()/unlock() pair around
> rcu_deference().
> 
> This patch resolves this problem by replacing rcu_deference() with
> rcu_dereference_check() to annotate that access is safe if
> rcu_read_lock/rcu_read_lock_bh is held.
> 
> Fixes: d5789621b658 ("nfp: Remove rcu_read_lock() around XDP program invocation")

I'd vote to simply revert that commit. Toke likely assumed that the RCU
protection is only for XDP but turns out we have more datapath stuff
that depends on it. No strong preference but my vote would be to not
play with RCU flavors at the driver level.

> CC: stable@vger.kernel.org
> Signed-off-by: Tianyu Yuan <tianyu.yuan@corigine.com>
> Acked-by: Simon Horman <simon.horman@corigine.com>
> Signed-off-by: Louis Peens <louis.peens@corigine.com>

> -	reprs = rcu_dereference(app->reprs[rtype]);
> +	reprs = rcu_dereference_check(app->reprs[rtype], rcu_read_lock_bh_held());

If you prefer to keep the patch I think this is just
rcu_dereference_bh() ?
diff mbox series

Patch

diff --git a/drivers/net/ethernet/netronome/nfp/abm/main.c b/drivers/net/ethernet/netronome/nfp/abm/main.c
index 5d3df28c648f..10b73297a313 100644
--- a/drivers/net/ethernet/netronome/nfp/abm/main.c
+++ b/drivers/net/ethernet/netronome/nfp/abm/main.c
@@ -63,14 +63,14 @@  nfp_abm_repr_get(struct nfp_app *app, u32 port_id, bool *redir_egress)
 	rtype = FIELD_GET(NFP_ABM_PORTID_TYPE, port_id);
 	port = FIELD_GET(NFP_ABM_PORTID_ID, port_id);
 
-	reprs = rcu_dereference(app->reprs[rtype]);
+	reprs = rcu_dereference_check(app->reprs[rtype], rcu_read_lock_bh_held());
 	if (!reprs)
 		return NULL;
 
 	if (port >= reprs->num_reprs)
 		return NULL;
 
-	return rcu_dereference(reprs->reprs[port]);
+	return rcu_dereference_check(reprs->reprs[port], rcu_read_lock_bh_held());
 }
 
 static int
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.c b/drivers/net/ethernet/netronome/nfp/flower/main.c
index 83eaa5ae3cd4..d33cc22c788d 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.c
@@ -257,14 +257,15 @@  nfp_flower_dev_get(struct nfp_app *app, u32 port_id, bool *redir_egress)
 	if (repr_type > NFP_REPR_TYPE_MAX)
 		return NULL;
 
-	reprs = rcu_dereference(app->reprs[repr_type]);
+	reprs = rcu_dereference_check(app->reprs[repr_type],
+				      rcu_read_lock_bh_held());
 	if (!reprs)
 		return NULL;
 
 	if (port >= reprs->num_reprs)
 		return NULL;
 
-	return rcu_dereference(reprs->reprs[port]);
+	return rcu_dereference_check(reprs->reprs[port], rcu_read_lock_bh_held());
 }
 
 static int