[net-next,01/11] net: ethtool: let drivers remove lost RSS contexts

Message ID 20240711220713.283778-2-kuba@kernel.org (mailing list archive)
State Accepted
Commit d69ba6bbaf1f606ac354e925571a54d025e32aae
Delegated to: Netdev Maintainers
Series eth: bnxt: use the new RSS API

Checks

Context                         Check    Description
netdev/series_format            success  Posting correctly formatted
netdev/tree_selection           success  Clearly marked for net-next
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 824 this patch: 824
netdev/build_tools              success  Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers           success  CCed 5 of 5 maintainers
netdev/build_clang              success  Errors and warnings before: 886 this patch: 886
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 2631 this patch: 2631
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 25 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 7 this patch: 7
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2024-07-12--06-00 (tests: 695)

Commit Message

Jakub Kicinski July 11, 2024, 10:07 p.m. UTC
RSS contexts may get lost from a device in various extreme circumstances.
Specifically, if the firmware leaks resources and resets, or crashes and
either recovers in a partially working state or comes back running a
different FW version, creating the context again may fail.

Drivers should do their absolute best to prevent this from happening.
When it does happen, however, telling the user that a context exists when
it can't possibly be used any more is counterproductive. Add a helper for
drivers to discard such contexts. For now the helper prints an error; in
the future a netlink notification will also be sent.

More robust approaches were proposed, like keeping the contexts
but marking them as "dead" (to be possibly resurrected by the next reset).
That may be better, but it's unclear at this stage whether the
benefits are worth the effort.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
--
I'm keeping this patch as is, since this is the simplest solution,
and nobody has strong opinions.

v2:
 - move to common.c to avoid build problems when ethtool-nl isn't enabled
 - add a note about the counter proposal in the commit message
---
 include/linux/ethtool.h |  2 ++
 net/ethtool/common.c    | 14 ++++++++++++++
 2 files changed, 16 insertions(+)

Patch

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index e213b5508da6..89da0254ccd4 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -210,6 +210,8 @@  static inline size_t ethtool_rxfh_context_size(u32 indir_size, u32 key_size,
 	return struct_size_t(struct ethtool_rxfh_context, data, flex_len);
 }
 
+void ethtool_rxfh_context_lost(struct net_device *dev, u32 context_id);
+
 /* declare a link mode bitmap */
 #define __ETHTOOL_DECLARE_LINK_MODE_MASK(name)		\
 	DECLARE_BITMAP(name, __ETHTOOL_LINK_MODE_MASK_NBITS)
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index 7bda9600efcf..67d06cd002a5 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -741,3 +741,17 @@  ethtool_forced_speed_maps_init(struct ethtool_forced_speed_map *maps, u32 size)
 	}
 }
 EXPORT_SYMBOL_GPL(ethtool_forced_speed_maps_init);
+
+void ethtool_rxfh_context_lost(struct net_device *dev, u32 context_id)
+{
+	struct ethtool_rxfh_context *ctx;
+
+	WARN_ONCE(!rtnl_is_locked() &&
+		  !lockdep_is_held_type(&dev->ethtool->rss_lock, -1),
+		  "RSS context lock assertion failed\n");
+
+	netdev_err(dev, "device error, RSS context %d lost\n", context_id);
+	ctx = xa_erase(&dev->ethtool->rss_ctx, context_id);
+	kfree(ctx);
+}
+EXPORT_SYMBOL(ethtool_rxfh_context_lost);
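
To illustrate how a driver might use the helper added above, here is a minimal userspace sketch of a post-reset restore path. It is not kernel code: every `model_*` name is a hypothetical stand-in, the fixed-size array replaces the XArray in `dev->ethtool->rss_ctx`, and the "firmware" call simply fails for IDs set in a bitmask. No locking is modelled, whereas the real helper warns unless the caller holds either rtnl or the RSS lock.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_RSS_CTX 8	/* arbitrary bound for this model */

/* Hypothetical stand-in for struct ethtool_rxfh_context. */
struct model_rss_ctx {
	uint32_t indir_size;
};

/* Hypothetical stand-in for the per-device context store. */
struct model_dev {
	const char *name;
	struct model_rss_ctx *rss_ctx[MAX_RSS_CTX];
};

/* Register a context, as the core would after a successful create. */
static int model_add_ctx(struct model_dev *dev, uint32_t context_id)
{
	struct model_rss_ctx *ctx;

	if (context_id >= MAX_RSS_CTX || dev->rss_ctx[context_id])
		return -1;
	ctx = calloc(1, sizeof(*ctx));
	if (!ctx)
		return -1;
	dev->rss_ctx[context_id] = ctx;
	return 0;
}

/* Models ethtool_rxfh_context_lost(): report, unlink, free. */
static void model_rxfh_context_lost(struct model_dev *dev, uint32_t context_id)
{
	struct model_rss_ctx *ctx;

	assert(dev != NULL);
	if (context_id >= MAX_RSS_CTX)
		return;

	fprintf(stderr, "%s: device error, RSS context %u lost\n",
		dev->name, (unsigned int)context_id);
	ctx = dev->rss_ctx[context_id];
	dev->rss_ctx[context_id] = NULL;
	free(ctx);	/* free(NULL) is a no-op, much like kfree(NULL) */
}

/* Hypothetical firmware call: re-creation fails for IDs set in fail_mask. */
static int model_fw_recreate_ctx(uint32_t context_id, uint32_t fail_mask)
{
	return (fail_mask & (1u << context_id)) ? -1 : 0;
}

/*
 * Hypothetical driver reset path: try to restore every known context and
 * discard the ones the device can no longer provide.
 * Returns the number of contexts that were lost.
 */
static unsigned int model_restore_rss_contexts(struct model_dev *dev,
					       uint32_t fail_mask)
{
	unsigned int lost = 0;
	uint32_t id;

	for (id = 0; id < MAX_RSS_CTX; id++) {
		if (!dev->rss_ctx[id])
			continue;
		if (model_fw_recreate_ctx(id, fail_mask)) {
			model_rxfh_context_lost(dev, id);
			lost++;
		}
	}
	return lost;
}
```

In this model, contexts that survive the reset stay in the store, and a failed one is removed so that later lookups no longer report it. That is the behaviour the commit message argues for over keeping a stale entry.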