diff mbox series

[net-next,6/7] ice: dump ethtool stats and skb by Tx hang devlink health reporter

Message ID 20241211223231.397203-7-anthony.l.nguyen@intel.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series ice: add support for devlink health events | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 1 this patch: 1
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 5 this patch: 5
netdev/checkpatch warning WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 55 this patch: 55
netdev/source_inline success Was 0 now: 0

Commit Message

Tony Nguyen Dec. 11, 2024, 10:32 p.m. UTC
From: Przemek Kitszel <przemyslaw.kitszel@intel.com>

Print the ethtool stats and skb diagnostic information as part of Tx hang
devlink health dump.

Move the declarations of ethtool functions that devlink health uses out
to a new file: ice_ethtool_common.h

To utilize our existing ethtool code in this context, convert it to
non-static.

Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
---
 .../net/ethernet/intel/ice/devlink/health.c   | 36 +++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_ethtool.c  | 10 +++---
 drivers/net/ethernet/intel/ice/ice_ethtool.h  |  2 ++
 .../ethernet/intel/ice/ice_ethtool_common.h   | 19 ++++++++++
 4 files changed, 62 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ice/ice_ethtool_common.h

Comments

Jakub Kicinski Dec. 13, 2024, 3 a.m. UTC | #1
On Wed, 11 Dec 2024 14:32:14 -0800 Tony Nguyen wrote:
> From: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> 
> Print the ethtool stats and skb diagnostic information as part of Tx hang
> devlink health dump.
> 
> Move the declarations of ethtool functions that devlink health uses out
> to a new file: ice_ethtool_common.h
> 
> To utilize our existing ethtool code in this context, convert it to
> non-static.

This is going too far, user space is fully capable of capturing this
data. It gets a netlink notification when health reporter flips to 
a bad state. I think Jiri worked on a daemon what could capture more
data from user space ? I may be misremembering...
Przemek Kitszel Dec. 16, 2024, 4:53 a.m. UTC | #2
On 12/13/24 04:00, Jakub Kicinski wrote:
> On Wed, 11 Dec 2024 14:32:14 -0800 Tony Nguyen wrote:
>> From: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>>
>> Print the ethtool stats and skb diagnostic information as part of Tx hang
>> devlink health dump.
>>
>> Move the declarations of ethtool functions that devlink health uses out
>> to a new file: ice_ethtool_common.h
>>
>> To utilize our existing ethtool code in this context, convert it to
>> non-static.
> 
> This is going too far, user space is fully capable of capturing this
> data. It gets a netlink notification when health reporter flips to
> a bad state. 

It really pays to split your patches into trivial vs controversial ones.

Will it be fine to merge this series without patch 6 (and 3) then?
Patches 2, 4 and 5 are dependency for another health reporters that
Konrad did:
https://lore.kernel.org/intel-wired-lan/20241211110357.196167-1-konrad.knitter@intel.com

>I think Jiri worked on a daemon what could capture more
> data from user space ? I may be misremembering...

We would love to read more on that, then with more knowledge revisit
what to do about our needs covered by this patch.
Przemek Kitszel Dec. 16, 2024, 12:58 p.m. UTC | #3
On 12/16/24 05:53, Przemek Kitszel wrote:
> On 12/13/24 04:00, Jakub Kicinski wrote:
>> On Wed, 11 Dec 2024 14:32:14 -0800 Tony Nguyen wrote:
>>> From: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>>>
>>> Print the ethtool stats and skb diagnostic information as part of Tx 
>>> hang
>>> devlink health dump.
>>>
>>> Move the declarations of ethtool functions that devlink health uses out
>>> to a new file: ice_ethtool_common.h
>>>
>>> To utilize our existing ethtool code in this context, convert it to
>>> non-static.
>>
>> This is going too far, user space is fully capable of capturing this
>> data. It gets a netlink notification when health reporter flips to
>> a bad state. 
> 
> It really pays to split your patches into trivial vs controversial ones.

not so trivial for git anyway...

> 
> Will it be fine to merge this series without patch 6 (and 3) then?

we will have to resend, so I will remove just ethtool stats part for now
sorry for the noise

> Patches 2, 4 and 5 are dependency for another health reporters that
> Konrad did:
> https://lore.kernel.org/intel-wired-lan/20241211110357.196167-1- 
> konrad.knitter@intel.com
> 
>> I think Jiri worked on a daemon what could capture more
>> data from user space ? I may be misremembering...
> 
> We would love to read more on that, then with more knowledge revisit
> what to do about our needs covered by this patch.

still interested ofc
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/ice/devlink/health.c b/drivers/net/ethernet/intel/ice/devlink/health.c
index b8c5a1c372dc..b0abb6d4e3e4 100644
--- a/drivers/net/ethernet/intel/ice/devlink/health.c
+++ b/drivers/net/ethernet/intel/ice/devlink/health.c
@@ -3,6 +3,7 @@ 
 
 #include "health.h"
 #include "ice.h"
+#include "ice_ethtool_common.h"
 
 #define ICE_DEVLINK_FMSG_PUT_FIELD(fmsg, obj, name) \
 	devlink_fmsg_put(fmsg, #name, (obj)->name)
@@ -26,6 +27,36 @@  static void ice_devlink_health_report(struct devlink_health_reporter *reporter,
 	devlink_health_report(reporter, msg, priv_ctx);
 }
 
+static void ice_dump_ethtool_stats_to_fmsg(struct devlink_fmsg *fmsg,
+					   struct net_device *netdev)
+{
+	const u32 string_set = ETH_SS_STATS;
+	u64 *stats;
+	u8 *names;
+	int scnt;
+
+	scnt = ice_get_sset_count(netdev, string_set);
+	devlink_fmsg_put(fmsg, "stats-cnt", (u32)scnt);
+	if (scnt <= 0)
+		return;
+
+	names = kcalloc(scnt, ETH_GSTRING_LEN, GFP_KERNEL);
+	stats = kcalloc(scnt, sizeof(*stats), GFP_KERNEL);
+	if (!names || !stats)
+		goto out;
+
+	ice_get_strings(netdev, string_set, names);
+	ice_get_ethtool_stats(netdev, NULL, stats);
+
+	devlink_fmsg_obj_nest_start(fmsg);
+	for (int i = 0; i < scnt; ++i)
+		devlink_fmsg_put(fmsg, &names[i * ETH_GSTRING_LEN], stats[i]);
+	devlink_fmsg_obj_nest_end(fmsg);
+out:
+	kfree(names);
+	kfree(stats);
+}
+
 /**
  * ice_fmsg_put_ptr - put hex value of pointer into fmsg
  *
@@ -57,10 +88,12 @@  static int ice_tx_hang_reporter_dump(struct devlink_health_reporter *reporter,
 				     struct netlink_ext_ack *extack)
 {
 	struct ice_tx_hang_event *event = priv_ctx;
+	struct sk_buff *skb;
 
 	if (!event)
 		return 0;
 
+	skb = event->tx_ring->tx_buf->skb;
 	devlink_fmsg_obj_nest_start(fmsg);
 	ICE_DEVLINK_FMSG_PUT_FIELD(fmsg, event, head);
 	ICE_DEVLINK_FMSG_PUT_FIELD(fmsg, event, intr);
@@ -71,8 +104,11 @@  static int ice_tx_hang_reporter_dump(struct devlink_health_reporter *reporter,
 	devlink_fmsg_put(fmsg, "irq-mapping", event->tx_ring->q_vector->name);
 	ice_fmsg_put_ptr(fmsg, "desc-ptr", event->tx_ring->desc);
 	ice_fmsg_put_ptr(fmsg, "dma-ptr", (void *)(long)event->tx_ring->dma);
+	ice_fmsg_put_ptr(fmsg, "skb-ptr", skb);
 	devlink_fmsg_binary_pair_put(fmsg, "desc", event->tx_ring->desc,
 				     event->tx_ring->count * sizeof(struct ice_tx_desc));
+	devlink_fmsg_dump_skb(fmsg, skb);
+	ice_dump_ethtool_stats_to_fmsg(fmsg, event->tx_ring->vsi->netdev);
 	devlink_fmsg_obj_nest_end(fmsg);
 
 	return 0;
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index 3072634bf049..b552439fc1f9 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -1507,7 +1507,7 @@  __ice_get_strings(struct net_device *netdev, u32 stringset, u8 *data,
 	}
 }
 
-static void ice_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
+void ice_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
 {
 	struct ice_netdev_priv *np = netdev_priv(netdev);
 
@@ -1887,7 +1887,7 @@  static int ice_set_priv_flags(struct net_device *netdev, u32 flags)
 	return ret;
 }
 
-static int ice_get_sset_count(struct net_device *netdev, int sset)
+int ice_get_sset_count(struct net_device *netdev, int sset)
 {
 	switch (sset) {
 	case ETH_SS_STATS:
@@ -1990,9 +1990,9 @@  __ice_get_ethtool_stats(struct net_device *netdev,
 	}
 }
 
-static void
-ice_get_ethtool_stats(struct net_device *netdev,
-		      struct ethtool_stats __always_unused *stats, u64 *data)
+void ice_get_ethtool_stats(struct net_device *netdev,
+			   struct ethtool_stats __always_unused *stats,
+			   u64 *data)
 {
 	struct ice_netdev_priv *np = netdev_priv(netdev);
 
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.h b/drivers/net/ethernet/intel/ice/ice_ethtool.h
index 8f2ad1c172c0..a1a34440557d 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.h
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.h
@@ -4,6 +4,8 @@ 
 #ifndef _ICE_ETHTOOL_H_
 #define _ICE_ETHTOOL_H_
 
+#include "ice_ethtool_common.h"
+
 struct ice_phy_type_to_ethtool {
 	u64 aq_link_speed;
 	u8 link_mode;
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool_common.h b/drivers/net/ethernet/intel/ice/ice_ethtool_common.h
new file mode 100644
index 000000000000..0c772056f006
--- /dev/null
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool_common.h
@@ -0,0 +1,19 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2024, Intel Corporation. */
+
+#ifndef _ICE_ETHTOOL_COMMON_H_
+#define _ICE_ETHTOOL_COMMON_H_
+
+/**
+ * DOC: ice_ethtool_common.h
+ *
+ * This header is for ethtool related code that is reused in other places.
+ */
+
+void ice_get_strings(struct net_device *netdev, u32 stringset, u8 *data);
+int ice_get_sset_count(struct net_device *netdev, int sset);
+void ice_get_ethtool_stats(struct net_device *netdev,
+			   struct ethtool_stats __always_unused *stats,
+			   u64 *data);
+
+#endif /* _ICE_ETHTOOL_COMMON_H_ */