[v3,vfio,0/7] pds vfio driver

Message ID 20230219083908.40013-1-brett.creeley@amd.com (mailing list archive)

Message

Brett Creeley Feb. 19, 2023, 8:39 a.m. UTC
This is a draft patchset for a new vendor specific VFIO driver
(pds_vfio) for use with the AMD/Pensando Distributed Services Card
(DSC). This driver is device type agnostic and live migration is
supported as long as the underlying SR-IOV VF supports live migration
on the DSC. This driver is a client of the newly introduced pds_core
driver, the latest version of which is available at:

https://lore.kernel.org/netdev/20230217225558.19837-1-shannon.nelson@amd.com/

This driver will use the pds_core device and auxiliary_bus as the VFIO
control path to the DSC. The pds_core device creates auxiliary_bus
devices for each live migratable VF. The devices are named by their
feature plus the VF PCI BDF so the auxiliary_bus driver implemented by
pds_vfio can find its related VF PCI driver instance. Once this
auxiliary bus connection is configured, the pds_vfio driver can send
admin queue commands to the device and receive events from pds_core.

A VFIO instance looks something like the ASCII diagram below. It can
be used with the VFIO subsystem to provide VFIO and live migration
support to devices.

                               .------.  .--------------------------.
                               | QEMU |--|  VM     .-------------.  |
                               '......'  |         | PCI driver  |  |
                                  |      |         .-------------.  |
                                  |      |         |  SR-IOV VF  |  |
                                  |      |         '-------------'  |
                                  |      '---------------||---------'
                               .--------------.          ||
                               |/dev/<vfio_fd>|          ||
                               '--------------'          ||
Host Userspace                         |                 ||
===================================================      ||
Host Kernel                            |                 ||
                                       |                 ||
           pds_core.LM.2305 <--+   .--------.            ||
                   |           |   |vfio-pci|            ||
                   |           |   '--------'            ||
                   |           |       |                 ||
         .------------.       .-------------.            ||
         |  pds_core  |       |   pds_vfio  |            ||
         '------------'       '-------------'            ||
               ||                   ||                   ||
             09:00.0              09:00.1                ||
== PCI ==================================================||=====
               ||                   ||                   ||
          .----------.         .----------.              ||
    ,-----|    PF    |---------|    VF    |-------------------,
    |     '----------'         '----------'  |       VF       |
    |                     DSC                |  data/control  |
    |                                        |      path      |
    -----------------------------------------------------------
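
The auxiliary device name in the diagram, pds_core.LM.2305, appears to
follow the auxiliary_bus "<module>.<name>.<id>" layout with the VF's PCI
bus/device/function packed into the id: VF 09:00.1 gives 0x0901, which is
2305 in decimal. The userspace sketch below only reproduces that
arithmetic. The helper names pci_devid() and aux_name() are made up for
illustration and are not taken from the series; the packing mirrors what
the kernel's PCI_DEVID() macro does, and that the driver builds the id
this way is an inference from the diagram.

```c
#include <stdio.h>

/*
 * Pack bus/slot/function into a 16-bit ID: bus in the high byte,
 * devfn (5-bit slot, 3-bit function) in the low byte.
 * Hypothetical helper; mirrors the kernel's PCI_DEVID() packing.
 */
static unsigned int pci_devid(unsigned int bus, unsigned int slot,
			      unsigned int func)
{
	unsigned int devfn = ((slot & 0x1f) << 3) | (func & 0x07);

	return ((bus & 0xff) << 8) | devfn;
}

/*
 * Format "<parent>.<feature>.<id>" into buf, with the id in decimal.
 * Hypothetical helper; the layout is inferred from the diagram above.
 * Returns the snprintf() result.
 */
static int aux_name(char *buf, size_t len, const char *parent,
		    const char *feature, unsigned int bus,
		    unsigned int slot, unsigned int func)
{
	return snprintf(buf, len, "%s.%s.%u", parent, feature,
			pci_devid(bus, slot, func));
}
```

Calling aux_name(buf, sizeof(buf), "pds_core", "LM", 0x09, 0x00, 0x1)
fills buf with "pds_core.LM.2305", matching the name shown next to the
VF at 09:00.1.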


The pds_vfio driver is targeted to reside in drivers/vfio/pci/pds.
It uses, and adds new files to, the common include/linux/pds
include directory.

Changes:

v3:
- Update copyright year to 2023 and use "Advanced Micro Devices, Inc."
  for the company name
- Clarify the fact that AMD/Pensando's VFIO solution is device type
  agnostic, which aligns with other current VFIO solutions
- Add line in drivers/vfio/pci/Makefile to build pds_vfio
- Move documentation to amd sub-directory
- Remove some dead code due to the pds_core implementation of
  listening to BIND/UNBIND events
- Move a dev_dbg() to a previous patch in the series
- Add implementation for vfio_migration_ops.migration_get_data_size to
  return the maximum possible device state size

RFC to v2:
https://lore.kernel.org/all/20221214232136.64220-1-brett.creeley@amd.com/
- Implement state transitions for VFIO_MIGRATION_P2P flag
- Improve auxiliary driver probe by returning EPROBE_DEFER
  when the PCI driver is not set up correctly
- Add pointer to docs in
  Documentation/networking/device_drivers/ethernet/index.rst

RFC:
https://lore.kernel.org/all/20221207010705.35128-1-brett.creeley@amd.com/

Brett Creeley (7):
  vfio/pds: Initial support for pds_vfio VFIO driver
  vfio/pds: Add support to register as PDS client
  vfio/pds: Add VFIO live migration support
  vfio: Commonize combine_ranges for use in other VFIO drivers
  vfio/pds: Add support for dirty page tracking
  vfio/pds: Add support for firmware recovery
  vfio/pds: Add Kconfig and documentation

 .../device_drivers/ethernet/amd/pds_vfio.rst  |  88 +++
 .../device_drivers/ethernet/index.rst         |   1 +
 MAINTAINERS                                   |   7 +
 drivers/vfio/pci/Kconfig                      |   2 +
 drivers/vfio/pci/Makefile                     |   2 +
 drivers/vfio/pci/mlx5/cmd.c                   |  48 +-
 drivers/vfio/pci/pds/Kconfig                  |  19 +
 drivers/vfio/pci/pds/Makefile                 |  12 +
 drivers/vfio/pci/pds/aux_drv.c                | 210 +++++++
 drivers/vfio/pci/pds/aux_drv.h                |  28 +
 drivers/vfio/pci/pds/cmds.c                   | 485 ++++++++++++++++
 drivers/vfio/pci/pds/cmds.h                   |  44 ++
 drivers/vfio/pci/pds/dirty.c                  | 541 ++++++++++++++++++
 drivers/vfio/pci/pds/dirty.h                  |  48 ++
 drivers/vfio/pci/pds/lm.c                     | 491 ++++++++++++++++
 drivers/vfio/pci/pds/lm.h                     |  53 ++
 drivers/vfio/pci/pds/pci_drv.c                | 126 ++++
 drivers/vfio/pci/pds/pci_drv.h                |  14 +
 drivers/vfio/pci/pds/vfio_dev.c               | 239 ++++++++
 drivers/vfio/pci/pds/vfio_dev.h               |  42 ++
 drivers/vfio/vfio_main.c                      |  48 ++
 include/linux/pds/pds_lm.h                    | 391 +++++++++++++
 include/linux/vfio.h                          |   3 +
 23 files changed, 2895 insertions(+), 47 deletions(-)
 create mode 100644 Documentation/networking/device_drivers/ethernet/amd/pds_vfio.rst
 create mode 100644 drivers/vfio/pci/pds/Kconfig
 create mode 100644 drivers/vfio/pci/pds/Makefile
 create mode 100644 drivers/vfio/pci/pds/aux_drv.c
 create mode 100644 drivers/vfio/pci/pds/aux_drv.h
 create mode 100644 drivers/vfio/pci/pds/cmds.c
 create mode 100644 drivers/vfio/pci/pds/cmds.h
 create mode 100644 drivers/vfio/pci/pds/dirty.c
 create mode 100644 drivers/vfio/pci/pds/dirty.h
 create mode 100644 drivers/vfio/pci/pds/lm.c
 create mode 100644 drivers/vfio/pci/pds/lm.h
 create mode 100644 drivers/vfio/pci/pds/pci_drv.c
 create mode 100644 drivers/vfio/pci/pds/pci_drv.h
 create mode 100644 drivers/vfio/pci/pds/vfio_dev.c
 create mode 100644 drivers/vfio/pci/pds/vfio_dev.h
 create mode 100644 include/linux/pds/pds_lm.h

Comments

Christoph Hellwig Feb. 20, 2023, 6:29 a.m. UTC | #1
On Sun, Feb 19, 2023 at 12:39:01AM -0800, Brett Creeley wrote:
> This is a draft patchset for a new vendor specific VFIO driver
> (pds_vfio) for use with the AMD/Pensando Distributed Services Card
> (DSC). This driver is device type agnostic and live migration is
> supported as long as the underlying SR-IOV VF supports live migration
> on the DSC. This driver is a client of the newly introduced pds_core
> driver, which the latest version can be referenced at:

Just as a broken clock:  non-standard nvme live migration is not
acceptable.  Please work with the NVMe technical working group to
get this feature standardized.  Note that despite various interested
parties on linux lists I've seen exactly zero activity from the
(not so) smart nic vendors active there.

Brett Creeley Feb. 21, 2023, 12:45 a.m. UTC | #2
On 2/19/2023 10:29 PM, Christoph Hellwig wrote:
> On Sun, Feb 19, 2023 at 12:39:01AM -0800, Brett Creeley wrote:
>> This is a draft patchset for a new vendor specific VFIO driver
>> (pds_vfio) for use with the AMD/Pensando Distributed Services Card
>> (DSC). This driver is device type agnostic and live migration is
>> supported as long as the underlying SR-IOV VF supports live migration
>> on the DSC. This driver is a client of the newly introduced pds_core
>> driver, which the latest version can be referenced at:
> 
> Just as a broken clock:  non-standard nvme live migration is not
> acceptable.  Please work with the NVMe technical working group to
> get this feature standardized.  Note that despite various interested
> parties on linux lists I've seen exactly zero activity from the
> (not so) smart nic vendors active there.


You're right, we intend to work with the respective groups, and we 
removed any mention of NVMe from the series. However, this solution 
applies to our other PCI devices.

Jason Gunthorpe Feb. 21, 2023, 1:11 a.m. UTC | #3
On Mon, Feb 20, 2023 at 04:45:51PM -0800, Brett Creeley wrote:
> > On Sun, Feb 19, 2023 at 12:39:01AM -0800, Brett Creeley wrote:
> > > This is a draft patchset for a new vendor specific VFIO driver
> > > (pds_vfio) for use with the AMD/Pensando Distributed Services Card
> > > (DSC). This driver is device type agnostic and live migration is
> > > supported as long as the underlying SR-IOV VF supports live migration
> > > on the DSC. This driver is a client of the newly introduced pds_core
> > > driver, which the latest version can be referenced at:
> > 
> > Just as a broken clock:  non-standard nvme live migration is not
> > acceptable.  Please work with the NVMe technical working group to
> > get this feature standardized.  Note that despite various interested
> > parties on linux lists I've seen exactly zero activity from the
> > (not so) smart nic vendors active there.
> 
> 
> You're right, we intend to work with the respective groups, and we removed
> any mention of NVMe from the series. However, this solution applies to our
> other PCI devices.

The first posting had a PCI ID that was literally only for NVMe and
now suddenly this very same driver supports "other devices" with nary
a mention of what those devices are? It strains credibility.

List the exact IDs of these other devices in your PCI ID table and
don't try to get away with a PCI_ANY_ID that just happens to match the
NVMe device ID too.
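
To illustrate the concern about wildcards, here is a simplified userspace
sketch of ID table matching. It assumes only vendor and device are
compared (the kernel also checks subsystem IDs and class), and every
name and number in it is a placeholder: ANY_ID stands in for
PCI_ANY_ID, and EX_VENDOR, EX_DEV_NVME_VF and EX_DEV_OTHER_VF are
invented values, not IDs from the series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ANY_ID 0xffffffffu	/* placeholder for the PCI_ANY_ID wildcard */

/* Placeholder IDs, invented for this example only. */
#define EX_VENDOR	0x1234u
#define EX_DEV_NVME_VF	0x0001u
#define EX_DEV_OTHER_VF	0x0002u

struct id_entry {
	uint32_t vendor;
	uint32_t device;
};

/* An entry matches when each field is either the wildcard or equal. */
static bool id_matches(const struct id_entry *e, uint32_t vendor,
		       uint32_t device)
{
	return (e->vendor == ANY_ID || e->vendor == vendor) &&
	       (e->device == ANY_ID || e->device == device);
}

/* A table matches when any of its n entries matches. */
static bool table_matches(const struct id_entry *tbl, size_t n,
			  uint32_t vendor, uint32_t device)
{
	for (size_t i = 0; i < n; i++)
		if (id_matches(&tbl[i], vendor, device))
			return true;
	return false;
}

/* A wildcard device ID claims every device from the vendor. */
static const struct id_entry wildcard_table[] = {
	{ EX_VENDOR, ANY_ID },
};

/* An exact entry claims only the device it lists. */
static const struct id_entry exact_table[] = {
	{ EX_VENDOR, EX_DEV_OTHER_VF },
};
```

With these tables, wildcard_table matches the placeholder NVMe VF as
well as the other VF, while exact_table matches only EX_DEV_OTHER_VF.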

Keeping in mind that PCI IDs of the VF are not supposed to differ from
the PF so this looks like a spec violation to me too :\

You have to remove the aux bus stuff also if you want this taken
seriously. Either aux for all or aux for none, I don't want drivers
making up their own stuff here. Especially since this implementation
is wrongly locked and racy.

Jason

Brett Creeley Feb. 23, 2023, 7:01 a.m. UTC | #4
On 2/20/2023 5:11 PM, Jason Gunthorpe wrote:
> On Mon, Feb 20, 2023 at 04:45:51PM -0800, Brett Creeley wrote:
>>> On Sun, Feb 19, 2023 at 12:39:01AM -0800, Brett Creeley wrote:
>>>> This is a draft patchset for a new vendor specific VFIO driver
>>>> (pds_vfio) for use with the AMD/Pensando Distributed Services Card
>>>> (DSC). This driver is device type agnostic and live migration is
>>>> supported as long as the underlying SR-IOV VF supports live migration
>>>> on the DSC. This driver is a client of the newly introduced pds_core
>>>> driver, which the latest version can be referenced at:
>>>
>>> Just as a broken clock:  non-standard nvme live migration is not
>>> acceptable.  Please work with the NVMe technical working group to
>>> get this feature standardized.  Note that despite various interested
>>> parties on linux lists I've seen exactly zero activity from the
>>> (not so) smart nic vendors active there.
>>
>>
>> You're right, we intend to work with the respective groups, and we removed
>> any mention of NVMe from the series. However, this solution applies to our
>> other PCI devices.
> 
> The first posting had a PCI ID that was literally only for NVMe and
> now suddenly this very same driver supports "other devices" with nary
> a mention of what those devices are? It strains credibility.
> 
> List the exact IDs of these other devices in your PCI ID table and
> don't try to get away with a PCI_ANY_ID that just happens to match the
> NVMe device ID too.

Okay, we'll look at revising/updating our VF device ID scheme for a 
specific VF and add that entry in the PCI ID table.

> 
> Keeping in mind that PCI IDs of the VF are not supposed to differ from
> the PF so this looks like a spec violation to me too :\
> 
> You have to remove the aux bus stuff also if you want this taken
> seriously. Either aux for all or aux for none, I don't want drivers

Can you please expand on the "aux for all or aux for none" comment? It's 
not clear what you mean here.

> making up their own stuff here. Especially since this implementation
> is wrongly locked and racy.

Can you please provide more details on what's wrongly locked and racy?

Thanks for the review.

Brett

> 
> Jason

Jason Gunthorpe Feb. 23, 2023, 1:26 p.m. UTC | #5
On Wed, Feb 22, 2023 at 11:01:33PM -0800, Brett Creeley wrote:
> > You have to remove the aux bus stuff also if you want this taken
> > seriously. Either aux for all or aux for none, I don't want drivers
> 
> Can you please expand on the "aux for all or aux for none" comment? It's not
> clear what you mean here.

You shouldn't be using aux at all for this, but if you do figure out
how to make it work right then all the drivers should be moved to use
it.

Jason