diff mbox series

[net] sfc: fix crash when reading stats while NIC is resetting

Message ID 20230623143448.47159-1-edward.cree@amd.com (mailing list archive)
State Accepted
Commit d1b355438b8325a486f087e506d412c4e852f37b
Delegated to: Netdev Maintainers
Headers show
Series [net] sfc: fix crash when reading stats while NIC is resetting | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 27 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

edward.cree@amd.com June 23, 2023, 2:34 p.m. UTC
From: Edward Cree <ecree.xilinx@gmail.com>

efx_net_stats() (.ndo_get_stats64) can be called during an ethtool
 selftest, during which time nic_data->mc_stats is NULL as the NIC has
 been fini'd.  In this case do not attempt to fetch the latest stats
 from the hardware, else we will crash on a NULL dereference:
    BUG: kernel NULL pointer dereference, address: 0000000000000038
    RIP efx_nic_update_stats
    abridged calltrace:
    efx_ef10_update_stats_pf
    efx_net_stats
    dev_get_stats
    dev_seq_printf_stats
Skipping the read is safe, we will simply give out stale stats.
To ensure that the free in efx_ef10_fini_nic() does not race against
 efx_ef10_update_stats_pf(), which could cause a TOCTTOU bug, take the
 efx->stats_lock in fini_nic (it is already held across update_stats).

Fixes: d3142c193dca ("sfc: refactor EF10 stats handling")
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
---
 drivers/net/ethernet/sfc/ef10.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Comments

patchwork-bot+netdevbpf@kernel.org June 26, 2023, 8:30 a.m. UTC | #1
Hello:

This patch was applied to netdev/net.git (main)
by David S. Miller <davem@davemloft.net>:

On Fri, 23 Jun 2023 15:34:48 +0100 you wrote:
> From: Edward Cree <ecree.xilinx@gmail.com>
> 
> efx_net_stats() (.ndo_get_stats64) can be called during an ethtool
>  selftest, during which time nic_data->mc_stats is NULL as the NIC has
>  been fini'd.  In this case do not attempt to fetch the latest stats
>  from the hardware, else we will crash on a NULL dereference:
>     BUG: kernel NULL pointer dereference, address: 0000000000000038
>     RIP efx_nic_update_stats
>     abridged calltrace:
>     efx_ef10_update_stats_pf
>     efx_net_stats
>     dev_get_stats
>     dev_seq_printf_stats
> Skipping the read is safe, we will simply give out stale stats.
> To ensure that the free in efx_ef10_fini_nic() does not race against
>  efx_ef10_update_stats_pf(), which could cause a TOCTTOU bug, take the
>  efx->stats_lock in fini_nic (it is already held across update_stats).
> 
> [...]

Here is the summary with links:
  - [net] sfc: fix crash when reading stats while NIC is resetting
    https://git.kernel.org/netdev/net/c/d1b355438b83

You are awesome, thank you!
diff mbox series

Patch

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index b63e47af6365..8c019f382a7f 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1297,8 +1297,10 @@  static void efx_ef10_fini_nic(struct efx_nic *efx)
 {
 	struct efx_ef10_nic_data *nic_data = efx->nic_data;
 
+	spin_lock_bh(&efx->stats_lock);
 	kfree(nic_data->mc_stats);
 	nic_data->mc_stats = NULL;
+	spin_unlock_bh(&efx->stats_lock);
 }
 
 static int efx_ef10_init_nic(struct efx_nic *efx)
@@ -1852,9 +1854,14 @@  static size_t efx_ef10_update_stats_pf(struct efx_nic *efx, u64 *full_stats,
 
 	efx_ef10_get_stat_mask(efx, mask);
 
-	efx_nic_copy_stats(efx, nic_data->mc_stats);
-	efx_nic_update_stats(efx_ef10_stat_desc, EF10_STAT_COUNT,
-			     mask, stats, nic_data->mc_stats, false);
+	/* If NIC was fini'd (probably resetting), then we can't read
+	 * updated stats right now.
+	 */
+	if (nic_data->mc_stats) {
+		efx_nic_copy_stats(efx, nic_data->mc_stats);
+		efx_nic_update_stats(efx_ef10_stat_desc, EF10_STAT_COUNT,
+				     mask, stats, nic_data->mc_stats, false);
+	}
 
 	/* Update derived statistics */
 	efx_nic_fix_nodesc_drop_stat(efx,