mbox series

[RFC,fwctl,0/5] pds_fwctl: fwctl for AMD/Pensando core devices

Message ID 20250211234854.52277-1-shannon.nelson@amd.com
Headers show
Series pds_fwctl: fwctl for AMD/Pensando core devices | expand

Message

Nelson, Shannon Feb. 11, 2025, 11:48 p.m. UTC
Following along from Jason's work [1] we have our initial RFC patchset
for pds_fwctl - an auxiliary_bus driver for supporting fwctl through the
pds_core driver and its PDS core device.

The PDS core is PCI device that is separate and distinct from the
ionic Eth device and from the other PCI devices that can be supported
by the AMD/Pensando DSC.  It is used by pds_vdpa and pds_vfio_pci to
coordinate/communicate with the FW for setting up their services.

Until now the DSC's basic configurations for defining what devices to
support and for getting low-level device debug information have been
through internal commands not available from the host side, requiring
access into the DSC's own OS.  Adding the fwctl service allows us to start
bringing these capabilities up into the host, but they don't replace the
existing function-specific tools.  For example, these are things that make
the Eth PCI device appear on the PCI bus, while the tuning of the specific
Eth features still go through the standard ethtool/devlink/ip/etc tools.

The first two patches add a new pds_core.fwctl auxiliary_device to the
pds_core driver.  This is only supported by the pds_core PF and not on
any VFs.

The remaining patches add the pds_fwctl driver framework and then fill
in the details for the fwctl services.

[1] https://lore.kernel.org/netdev/0-v4-0cf4ec3b8143+4995-fwctl_jgg@nvidia.com/
    [PATCH v4 00/10] Introduce fwctl subystem

Brett Creeley (1):
  pds_fwctl: add rpc and query support

Shannon Nelson (4):
  pds_core: specify auxiliary_device to be created
  pds_core: add new fwctl auxilary_device
  pds_fwctl: initial driver framework
  pds_fwctl: add Documentation entries

 Documentation/userspace-api/fwctl/fwctl.rst   |   1 +
 Documentation/userspace-api/fwctl/index.rst   |   1 +
 .../userspace-api/fwctl/pds_fwctl.rst         |  41 ++
 MAINTAINERS                                   |   7 +
 drivers/fwctl/Kconfig                         |  10 +
 drivers/fwctl/Makefile                        |   1 +
 drivers/fwctl/pds/Makefile                    |   4 +
 drivers/fwctl/pds/main.c                      | 558 ++++++++++++++++++
 drivers/net/ethernet/amd/pds_core/auxbus.c    |  44 +-
 drivers/net/ethernet/amd/pds_core/core.c      |   7 +
 drivers/net/ethernet/amd/pds_core/core.h      |   8 +-
 drivers/net/ethernet/amd/pds_core/devlink.c   |   6 +-
 drivers/net/ethernet/amd/pds_core/main.c      |  21 +-
 include/linux/pds/pds_adminq.h                | 264 +++++++++
 include/linux/pds/pds_common.h                |   2 +
 include/uapi/fwctl/fwctl.h                    |   1 +
 include/uapi/fwctl/pds.h                      |  43 ++
 17 files changed, 988 insertions(+), 31 deletions(-)
 create mode 100644 Documentation/userspace-api/fwctl/pds_fwctl.rst
 create mode 100644 drivers/fwctl/pds/Makefile
 create mode 100644 drivers/fwctl/pds/main.c
 create mode 100644 include/uapi/fwctl/pds.h

Comments

Andrew Lunn Feb. 12, 2025, 1:40 p.m. UTC | #1
> existing function-specific tools.  For example, these are things that make
> the Eth PCI device appear on the PCI bus

That sounds like a common operation which many vendors will need? So
why use fwctl for this? The whole point of fwctl is things which are
highly vendor specific and not networking.

Isn't this even generic for any sort of SR-IOV? Wouldn't you need the
same sort of operation for a GPU, or anything with a pool of resources
which can be mapped to VFs?

If you really want to use this as you key selling point, you need to
clearly explain why is this highly vendor specific and cannot be
generalised.

	Andrew
Jason Gunthorpe Feb. 12, 2025, 2:43 p.m. UTC | #2
On Wed, Feb 12, 2025 at 02:40:45PM +0100, Andrew Lunn wrote:

> Isn't this even generic for any sort of SR-IOV? Wouldn't you need the
> same sort of operation for a GPU, or anything with a pool of resources
> which can be mapped to VFs?

We've been calling this device profiling in the vfio discussions,
generally yes the general idea of profiling is common, but the actual
detail of the profile is very device specific.

In vfio land we think fwctl is a good choice here. We already have
things like libvirt and Kubernetes that have generic userspace plugin
mechanims and an existing mature ecosystem for device profiling built
up around that. All sophisticated devices have their own plugins
because they have unique capabilities. It seems to be working well in
that world.

From a kernel perspective fwctl is alot better than some of what has
been tried so far, ie various vfio drivers having questionable
device-specific sysfs, and then a libvirt/k8s plugin anyhow.

Even the better stuff like mlx5's devlink is only partially capable
and the existing mlx plugins still has to do stuff beyond that.

The kernel isn't the only point, or necessarily the most appropriate
point, to insert a consolidation layer in the stack. We don't want to
move chunks of existing k8s operator code into the kernel, for
instance.

Again, this isn't an exclusive thing, that fwctl can profile a PCI
function doesn't in any way exclude other kernel options, like
devlink, from doing that too.

Jason
Andrew Lunn Feb. 12, 2025, 4:19 p.m. UTC | #3
On Wed, Feb 12, 2025 at 10:43:28AM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 12, 2025 at 02:40:45PM +0100, Andrew Lunn wrote:
> 
> > Isn't this even generic for any sort of SR-IOV? Wouldn't you need the
> > same sort of operation for a GPU, or anything with a pool of resources
> > which can be mapped to VFs?
> 
> We've been calling this device profiling in the vfio discussions,
> generally yes the general idea of profiling is common, but the actual
> detail of the profile is very device specific.

This is your poster child for fwctl. You are trying to convince us it
is a way to configure things which are very vendor specific. Yet, as
you point out, the idea of profiling is common. So why start here? It
seems an odd choice. So i would of expected the messaging to be
clearer. You the vendors agree there is no commonality, so explain
that. Take three different vendors cards and list all the parameters
which are needed for profiling with these cards. Really show that
there is no commonality. And maybe take it a step further. Get these
vendors to work together to produce three patchset implementing device
profiling, so we can see there cannot be code sharing. Then you might
have a convincing poster child for fwctl.

Given how contentious fwctl is, i would say vendors need to work
together to show there is nothing in common, at least to start
with.

	Andrew