[net-next,0/4] netconsole: Add support for CPU population

Message ID 20241113-netcon_cpu-v1-0-d187bf7c0321@debian.org (mailing list archive)

Breno Leitao Nov. 13, 2024, 3:10 p.m. UTC
The current implementation of netconsole sends all log messages in
parallel, which can lead to interleaved output on the receiving side.
This makes it challenging to demultiplex the messages and attribute
them to their originating CPUs.

As a result, users and developers often struggle to effectively analyze
and debug the parallel log output received through netconsole.

Example of output captured from production hosts:

	------------[ cut here ]------------
	------------[ cut here ]------------
	refcount_t: saturated; leaking memory.
	WARNING: CPU: 2 PID: 1613668 at lib/refcount.c:22 refcount_warn_saturate+0x5e/0xe0
	refcount_t: addition on 0; use-after-free.
	WARNING: CPU: 26 PID: 4139916 at lib/refcount.c:25 refcount_warn_saturate+0x7d/0xe0
	Modules linked in: bpf_preload(E) vhost_net(E) tun(E) vhost(E)

This series of patches introduces a new feature to the netconsole
subsystem that allows the automatic population of the CPU number in the
userdata field for each log message. This enhancement provides several
benefits:

* Improved demultiplexing of parallel log output: When multiple CPUs are
  sending messages concurrently, the added CPU number in the userdata
  makes it easier to differentiate and attribute the messages to their
  originating CPUs.

* Better visibility into message sources: The CPU number information
  gives users and developers more insight into which specific CPU a
  particular log message came from, which can be valuable for debugging
  and analysis.
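
As a rough illustration of the demultiplexing benefit, a receiver could
group messages by the CPU number carried in userdata. The sketch below
is Python and rests on assumptions that this cover letter does not
state: each received message is a text blob whose userdata follows the
log text as " key=value" lines, and the CPU entry is spelled
"cpu=<number>". The helper name demux_by_cpu and the sample messages
are hypothetical.

```python
import re
from collections import defaultdict

# Assumed userdata layout: a line holding optional leading whitespace
# followed by "cpu=<number>". This layout is an assumption, not taken
# from the patches themselves.
CPU_RE = re.compile(r"^\s*cpu=(\d+)\s*$", re.MULTILINE)


def demux_by_cpu(messages):
    """Group received netconsole messages by the CPU number in userdata.

    Messages without a cpu= entry are collected under the key None.
    """
    by_cpu = defaultdict(list)
    for msg in messages:
        match = CPU_RE.search(msg)
        cpu = int(match.group(1)) if match else None
        # Keep only the log text, dropping the cpu= userdata line.
        text = CPU_RE.sub("", msg).strip()
        by_cpu[cpu].append(text)
    return dict(by_cpu)


# Hypothetical interleaved input, loosely modeled on the example above.
received = [
    "refcount_t: saturated; leaking memory.\n cpu=2\n",
    "refcount_t: addition on 0; use-after-free.\n cpu=26\n",
    "------------[ cut here ]------------\n cpu=2\n",
]
groups = demux_by_cpu(received)
for cpu, lines in groups.items():
    print(cpu, lines)
```

Grouping on the receiver keeps the sender unchanged apart from the extra
userdata entry, so messages that lack the entry remain readable and are
simply left unattributed.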

The changes in this series are as follows:

Patch "netconsole: Ensure dynamic_netconsole_mutex is held during userdata update"

Add a lockdep assert to make sure dynamic_netconsole_mutex is held when
calling update_userdata().

Patch "netconsole: Add option to auto-populate CPU number in userdata"

Adds a new option to enable automatic CPU number population in the
netconsole userdata. Provides a new "populate_cpu_nr" sysfs attribute
to control this feature.

Patch "netconsole: selftest: Validate CPU number auto-population in userdata"

Expands the existing netconsole selftest to verify the CPU number
auto-population functionality. Ensures the received netconsole messages
contain the expected "cpu=" entry in the userdata.
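
The kind of check described here can be sketched as follows. This is
not the selftest itself, which is a shell script (netcons_basic.sh). It
is a hypothetical Python helper, and it assumes the "cpu=" entry sits
on its own line as "cpu=<number>" with optional surrounding whitespace.

```python
import re


def has_cpu_entry(message: str) -> bool:
    """Return True if the message has a line of the form cpu=<number>.

    The line layout is an assumption made for this sketch.
    """
    return re.search(r"^\s*cpu=\d+\s*$", message, re.MULTILINE) is not None
```

A test harness could call has_cpu_entry() on each captured message and
fail on the first one that returns False.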

Patch "netconsole: docs: Add documentation for CPU number auto-population"

Updates the netconsole documentation to explain the new CPU number
auto-population feature. Provides instructions on how to enable and use
the feature.

I believe these changes will be a valuable addition to the netconsole
subsystem, enhancing its usefulness for kernel developers and users.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
Breno Leitao (4):
      netconsole: Ensure dynamic_netconsole_mutex is held during userdata update
      netconsole: Add option to auto-populate CPU number in userdata
      netconsole: docs: Add documentation for CPU number auto-population
      netconsole: selftest: Validate CPU number auto-population in userdata

 Documentation/networking/netconsole.rst            | 44 +++++++++++++++
 drivers/net/netconsole.c                           | 63 ++++++++++++++++++++++
 .../testing/selftests/drivers/net/netcons_basic.sh | 18 +++++++
 3 files changed, 125 insertions(+)
---
base-commit: a58f00ed24b849d449f7134fd5d86f07090fe2f5
change-id: 20241108-netcon_cpu-ce3917e88f4b

Best regards,