[RFC,V1,0/7] Add support for a new IMS interrupt mechanism

Message ID 1568338328-22458-1-git-send-email-megha.dey@linux.intel.com (mailing list archive)

Dey, Megha Sept. 13, 2019, 1:32 a.m. UTC
Currently, MSI (Message Signaled Interrupts) and MSI-X are the de facto
standard device interrupt mechanisms. MSI-X supports up to 2048
interrupts per device while MSI supports up to 32, which is more than enough
for current devices. However, the introduction of SIOV (Scalable I/O
Virtualization) shifts the creation of assignable virtual devices from
hardware to a more software-assisted approach. This flexible composition
of directly assignable devices, a.k.a. assignable device interfaces (ADIs),
frees hardware from the costly PCI standard. Under SIOV, device resources
can now be mapped directly to a guest or to other user-space drivers for
near-native DMA performance. To complete the functionality of ADIs, a
matching interrupt resource must also be introduced, and it must be scalable.

Interrupt Message Storage (IMS) is conceived as a scalable, albeit
device-specific, interrupt mechanism to meet this demand. With IMS, there is
theoretically no upper bound on the number of interrupts a device
can support. The size and location of IMS are device-specific: some devices
may implement IMS as memory-mapped on-device storage, while others may
implement it in system memory. IMS stores each interrupt message as
a DWORD-sized data payload and a 64-bit address (the same as MSI-X). Because
device IMS is non-architectural, it is accessed through the host driver,
unlike the architectural MSI-X table, which is accessed through PCI drivers.
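
As an illustration of the entry format described above, here is a minimal
userspace sketch in C. Only the DWORD data payload and the 64-bit address
come from the description; the struct name, the control word, and the mask
bit are assumptions borrowed from the MSI-X table layout, not something this
series or the SIOV specification is claimed to define.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical IMS entry. The cover letter only states that each entry
 * holds a DWORD data payload and a 64-bit address, as MSI-X does. The
 * control word and its mask bit are assumptions modeled on MSI-X.
 */
struct ims_entry {
	uint32_t addr_lo;	/* low 32 bits of the message address */
	uint32_t addr_hi;	/* high 32 bits of the message address */
	uint32_t data;		/* DWORD message data payload */
	uint32_t ctrl;		/* assumed per-entry control word */
};

#define IMS_CTRL_MASKED 0x1u	/* assumed mask bit, as in MSI-X */

static_assert(sizeof(struct ims_entry) == 16,
	      "entry is expected to be four DWORDs");

/* Program one entry: mask it, write address and data, then unmask it. */
static void ims_entry_write(volatile struct ims_entry *e,
			    uint64_t addr, uint32_t data)
{
	e->ctrl |= IMS_CTRL_MASKED;
	e->addr_lo = (uint32_t)addr;
	e->addr_hi = (uint32_t)(addr >> 32);
	e->data = data;
	e->ctrl &= ~IMS_CTRL_MASKED;
}

/* Reassemble the 64-bit message address from its two halves. */
static uint64_t ims_entry_addr(const volatile struct ims_entry *e)
{
	return ((uint64_t)e->addr_hi << 32) | e->addr_lo;
}
```

Whether such an array lives in device memory or in system memory is exactly
the device-specific part, which is why the write path has to go through the
host driver rather than through generic PCI code.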

In this patchset, we introduce generic IMS APIs that fit the Linux IRQ
subsystem and support an IMS IRQ chip and IRQ domains, for use by drivers
whose devices are capable of generating IMS interrupts.

The IMS has been introduced as part of Intel's Scalable I/O virtualization
specification:
https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification

This patchset is based on Linux 5.3-rc8.

Currently there is no device on the market that supports SIOV (and hence no
device that supports IMS).

This series is a basic patchset to get the ball rolling and receive some
initial comments. As per my discussion with Marc Zyngier and Thomas Gleixner
at Linux Plumbers, I need to do the following:
1. Since a device can support MSI-X and IMS simultaneously, ensure a proper
   locking mechanism for the 'msi_list' in the device structure.
2. Introduce dynamic allocation of IMS vectors, perhaps by using a group ID.
3. IMS support of a device needs to be discoverable. A bit in the
   vendor-specific capability in PCI config space is to be added, rather
   than getting this information from each device driver.

Jason Gunthorpe of Mellanox Technologies is looking to do something similar
on ARM platforms and was wondering why IMS is x86-specific. Perhaps we can
use this thread to discuss this further.

Megha Dey (7):
  genirq/msi: Differentiate between various MSI based interrupts
  drivers/base: Introduce callbacks for IMS interrupt domain
  x86/ims: Add support for a new IMS irq domain
  irq_remapping: New interfaces to support IMS irqdomain
  x86/ims: Introduce x86_ims_ops
  ims-msi: Add APIs to allocate/free IMS interrupts
  ims: Add the set_desc callback

 arch/mips/pci/msi-xlp.c              |   2 +-
 arch/s390/pci/pci_irq.c              |   2 +-
 arch/x86/include/asm/irq_remapping.h |  13 ++
 arch/x86/include/asm/msi.h           |   4 +
 arch/x86/include/asm/pci.h           |   4 +
 arch/x86/include/asm/x86_init.h      |  10 +
 arch/x86/kernel/apic/Makefile        |   1 +
 arch/x86/kernel/apic/ims.c           | 118 ++++++++++++
 arch/x86/kernel/apic/msi.c           |   6 +-
 arch/x86/kernel/x86_init.c           |  23 +++
 arch/x86/pci/xen.c                   |   2 +-
 drivers/base/Kconfig                 |   7 +
 drivers/base/Makefile                |   1 +
 drivers/base/ims-msi.c               | 353 +++++++++++++++++++++++++++++++++++
 drivers/iommu/intel_irq_remapping.c  |  30 +++
 drivers/iommu/irq_remapping.c        |   9 +
 drivers/iommu/irq_remapping.h        |   3 +
 drivers/pci/msi.c                    |  19 +-
 drivers/vfio/mdev/mdev_core.c        |   6 +
 drivers/vfio/mdev/mdev_private.h     |   1 -
 include/linux/intel-iommu.h          |   1 +
 include/linux/mdev.h                 |   2 +
 include/linux/msi.h                  |  55 +++++-
 kernel/irq/msi.c                     |   2 +-
 24 files changed, 655 insertions(+), 19 deletions(-)
 create mode 100644 arch/x86/kernel/apic/ims.c
 create mode 100644 drivers/base/ims-msi.c

Comments

Jason Gunthorpe Sept. 13, 2019, 7:50 p.m. UTC | #1
On Thu, Sep 12, 2019 at 06:32:01PM -0700, Megha Dey wrote:

> This series is a basic patchset to get the ball rolling and receive some
> initial comments. As per my discussion with Marc Zyngier and Thomas Gleixner
> at the Linux Plumbers, I need to do the following:
> 1. Since a device can support MSI-X and IMS simultaneously, ensure proper
>    locking mechanism for the 'msi_list' in the device structure.
> 2. Introduce dynamic allocation of IMS vectors perhaps by using a group ID
> 3. IMS support of a device needs to be discoverable. A bit in the vendor
>    specific capability in the PCI config is to be added rather than getting
>    this information from each device driver.

Why #3? The point of this scheme is to delegate programming the
addr/data pairs to the driver so it can be done in some
device-specific way. There is no PCI standard behind this, and no
change in PCI semantics. 

I think it would be a tall ask to get a config space bit from PCI-SIG
for something that has little to do with PCI.

After seeing that we already have a platform-device-based version of
this same idea (drivers/base/platform-msi.c), I think the task here is
really just to extend and expand that approach to work generically for
platform and PCI devices, and along the way to tidy the arch interface so
that the x86 and ARM code supporting it uses the same generic interfaces.

i.e. it is re-organizing code and ideas already in Linux, not defining
some new standard.

I also think referring to this existing idea by some new IMS name only
confuses people about what the goal is... Which is perhaps why #3 was
suggested??

Stated more clearly, I think all uses would be satisfied if
platform_msi_domain_alloc_irqs() could be called for a struct
pci_dev, could be called multiple times for the same struct
pci_dev, and co-existed with MSI and MSI-X on the same pci_dev.

Jason
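
To make the proposal above concrete, here is a small userspace model in C of
the "driver supplies the write callback" pattern. Every name in it
(dev_alloc_irqs, write_msg_fn, struct demo_dev) is hypothetical, and the
address and data values are placeholders; it is loosely modeled on the idea
behind platform_msi_domain_alloc_irqs() but is not kernel code and does not
reproduce its signature.

```c
#include <stdint.h>

/* One interrupt message: the address/data pair the device must emit. */
struct msg {
	uint64_t addr;
	uint32_t data;
};

/* Driver callback: store the message for vector idx in a device-specific way. */
typedef void (*write_msg_fn)(void *dev_priv, unsigned int idx,
			     const struct msg *m);

#define MAX_VECS 64

/* Hypothetical per-device bookkeeping kept by the generic layer. */
struct dev_irqs {
	void *dev_priv;		/* opaque driver data handed to the callback */
	unsigned int nvec;	/* vectors allocated so far */
};

/*
 * Allocate nvec more vectors for a device. May be called repeatedly for
 * the same device. Returns the first new index, or -1 if out of space.
 */
static int dev_alloc_irqs(struct dev_irqs *d, unsigned int nvec,
			  write_msg_fn write)
{
	unsigned int first = d->nvec;

	if (nvec == 0 || nvec > MAX_VECS - d->nvec)
		return -1;

	for (unsigned int i = 0; i < nvec; i++) {
		unsigned int idx = first + i;
		/* Placeholder message; a real irqdomain would compose this. */
		struct msg m = { .addr = 0xfee00000ULL, .data = 0x20u + idx };

		write(d->dev_priv, idx, &m);
	}
	d->nvec += nvec;
	return (int)first;
}

/* Example driver: keeps its messages in a plain array. */
struct demo_dev {
	struct msg slots[MAX_VECS];
};

static void demo_write_msg(void *priv, unsigned int idx, const struct msg *m)
{
	struct demo_dev *dd = priv;

	dd->slots[idx] = *m;
}
```

The generic layer never needs to know where the message ends up, so the same
allocation call works whether the device is a platform device or a PCI
device, and nothing in it conflicts with MSI or MSI-X on the same device.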
Ashok Raj Sept. 13, 2019, 8:27 p.m. UTC | #2
On Fri, Sep 13, 2019 at 07:50:50PM +0000, Jason Gunthorpe wrote:
> On Thu, Sep 12, 2019 at 06:32:01PM -0700, Megha Dey wrote:
> 
> > This series is a basic patchset to get the ball rolling and receive some
> > initial comments. As per my discussion with Marc Zyngier and Thomas Gleixner
> > at the Linux Plumbers, I need to do the following:
> > 1. Since a device can support MSI-X and IMS simultaneously, ensure proper
> >    locking mechanism for the 'msi_list' in the device structure.
> > 2. Introduce dynamic allocation of IMS vectors perhaps by using a group ID
> > 3. IMS support of a device needs to be discoverable. A bit in the vendor
> >    specific capability in the PCI config is to be added rather than getting
> >    this information from each device driver.
> 
> Why #3? The point of this scheme is to delegate programming the
> addr/data pairs to the driver so it can be done in some
> device-specific way. There is no PCI standard behind this, and no
> change in PCI semantics. 
> 
> I think it would be a tall ask to get a config space bit from PCI-SIG
> for something that has little to do with PCI.

This isn't a standard config capability. It's a Designated Vendor-Specific
Extended Capability (DVSEC). The device is responsible for managing the
addr-data pair. This provides a hint to the OS framework that this device
has device-specific methods.

Agreed, it's not required, but some OSVs like a generic way to discover
these capabilities; hence it's there so that device vendors can have
a common guideline.

Check here for some of those details:

https://software.intel.com/en-us/blogs/2018/06/25/introducing-intel-scalable-io-virtualization 

Jason Gunthorpe Sept. 19, 2019, 6:25 p.m. UTC | #3
On Fri, Sep 13, 2019 at 01:27:10PM -0700, Raj, Ashok wrote:
> On Fri, Sep 13, 2019 at 07:50:50PM +0000, Jason Gunthorpe wrote:
> > On Thu, Sep 12, 2019 at 06:32:01PM -0700, Megha Dey wrote:
> > 
> > > This series is a basic patchset to get the ball rolling and receive some
> > > initial comments. As per my discussion with Marc Zyngier and Thomas Gleixner
> > > at the Linux Plumbers, I need to do the following:
> > > 1. Since a device can support MSI-X and IMS simultaneously, ensure proper
> > >    locking mechanism for the 'msi_list' in the device structure.
> > > 2. Introduce dynamic allocation of IMS vectors perhaps by using a group ID
> > > 3. IMS support of a device needs to be discoverable. A bit in the vendor
> > >    specific capability in the PCI config is to be added rather than getting
> > >    this information from each device driver.
> > 
> > Why #3? The point of this scheme is to delegate programming the
> > addr/data pairs to the driver so it can be done in some
> > device-specific way. There is no PCI standard behind this, and no
> > change in PCI semantics. 
> > 
> > I think it would be a tall ask to get a config space bit from PCI-SIG
> > for something that has little to do with PCI.
> 
> This isn't a standard config capability. It's a Designated Vendor-Specific
> Extended Capability (DVSEC). The device is responsible for managing the
> addr-data pair. This provides a hint to the OS framework that this device
> has device-specific methods.
> 
> Agreed, it's not required, but some OSVs like a generic way to discover
> these capabilities; hence it's there so that device vendors can have
> a common guideline.

I think it would be reasonable for a specific device driver to test
the DVSEC, if it wishes to.

Since it is not required it does not make sense for the core kernel to
enforce this on all devices, at least for such a nebulous reason.

Jason
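
For reference, a driver-side DVSEC test of the kind discussed above could
look roughly like the following userspace sketch in C. It walks the PCIe
extended capability list in a 4 KiB copy of config space and matches on the
DVSEC vendor ID and DVSEC ID. The function names and the config-space buffer
are hypothetical stand-ins for real config reads, and the thread does not
state which vendor ID or DVSEC ID an IMS-capable device would advertise, so
both are left as parameters.

```c
#include <stdint.h>

#define PCI_CFG_SPACE_EXP_SIZE	4096
#define PCI_EXT_CAP_START	0x100	/* first extended capability */
#define PCI_EXT_CAP_ID_DVSEC	0x23	/* Designated Vendor-Specific */

/* Read a little-endian DWORD from a config-space snapshot. */
static uint32_t cfg_read32(const uint8_t *cfg, unsigned int off)
{
	return (uint32_t)cfg[off] |
	       ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) |
	       ((uint32_t)cfg[off + 3] << 24);
}

/* Write a little-endian DWORD; only used to build example snapshots. */
static void cfg_write32(uint8_t *cfg, unsigned int off, uint32_t val)
{
	cfg[off] = (uint8_t)val;
	cfg[off + 1] = (uint8_t)(val >> 8);
	cfg[off + 2] = (uint8_t)(val >> 16);
	cfg[off + 3] = (uint8_t)(val >> 24);
}

/*
 * Return the offset of the DVSEC matching vendor and dvsec_id, or 0 if
 * there is none. The iteration count is bounded to survive looped lists.
 */
static unsigned int find_dvsec(const uint8_t *cfg, uint16_t vendor,
			       uint16_t dvsec_id)
{
	unsigned int pos = PCI_EXT_CAP_START;
	int ttl = (PCI_CFG_SPACE_EXP_SIZE - PCI_EXT_CAP_START) / 8;

	while (pos >= PCI_EXT_CAP_START &&
	       pos <= PCI_CFG_SPACE_EXP_SIZE - 4 && ttl-- > 0) {
		uint32_t hdr = cfg_read32(cfg, pos);

		if (hdr == 0 || hdr == 0xffffffffu)
			break;

		/* Capability ID is in bits 15:0 of the header. */
		if ((hdr & 0xffff) == PCI_EXT_CAP_ID_DVSEC &&
		    pos <= PCI_CFG_SPACE_EXP_SIZE - 12) {
			uint32_t h1 = cfg_read32(cfg, pos + 4); /* vendor ID */
			uint32_t h2 = cfg_read32(cfg, pos + 8); /* DVSEC ID */

			if ((h1 & 0xffff) == vendor &&
			    (h2 & 0xffff) == dvsec_id)
				return pos;
		}

		/* Next capability offset is in bits 31:20, DWORD aligned. */
		pos = (hdr >> 20) & 0xffc;
	}
	return 0;
}
```

Keeping this check in the individual driver, rather than in the core, matches
the point made above: a driver that cares can look for the hint, and devices
that never advertise it are not forced through a generic discovery path.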