diff mbox series

[V2,24/46] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI

Message ID 20200826112333.047315047@linutronix.de (mailing list archive)
State New, archived
Headers show
Series x86, PCI, XEN, genirq ...: Prepare for device MSI | expand

Commit Message

Thomas Gleixner Aug. 26, 2020, 11:16 a.m. UTC
From: Thomas Gleixner <tglx@linutronix.de>

Devices on the VMD bus use their own MSI irq domain, but it is not
distinguishable from regular PCI/MSI irq domains. This is required
to exclude VMD devices from getting the irq domain pointer set by
interrupt remapping.

Override the default bus token.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/controller/vmd.c |    6 ++++++
 1 file changed, 6 insertions(+)

Comments

Marc Zyngier Aug. 26, 2020, 8:47 p.m. UTC | #1
On Wed, 26 Aug 2020 12:16:52 +0100,
Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Devices on the VMD bus use their own MSI irq domain, but it is not
> distinguishable from regular PCI/MSI irq domains. This is required
> to exclude VMD devices from getting the irq domain pointer set by
> interrupt remapping.
> 
> Override the default bus token.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  drivers/pci/controller/vmd.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
>  		return -ENODEV;
>  	}
>  
> +	/*
> +	 * Override the irq domain bus token so the domain can be distinguished
> +	 * from a regular PCI/MSI domain.
> +	 */
> +	irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> +

One day, we'll be able to set the token at domain creation time. In
the meantime,

Acked-by: Marc Zyngier <maz@kernel.org>

	M.
Jason Gunthorpe Aug. 31, 2020, 2:39 p.m. UTC | #2
On Wed, Aug 26, 2020 at 01:16:52PM +0200, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Devices on the VMD bus use their own MSI irq domain, but it is not
> distinguishable from regular PCI/MSI irq domains. This is required
> to exclude VMD devices from getting the irq domain pointer set by
> interrupt remapping.
> 
> Override the default bus token.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>  drivers/pci/controller/vmd.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> +++ b/drivers/pci/controller/vmd.c
> @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
>  		return -ENODEV;
>  	}
>  
> +	/*
> +	 * Override the irq domain bus token so the domain can be distinguished
> +	 * from a regular PCI/MSI domain.
> +	 */
> +	irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> +

Having the non-transparent-bridge hold a MSI table and
multiplex/de-multiplex IRQs looks like another good use case for
something close to pci_subdevice_msi_create_irq_domain()?

If each device could have its own internal MSI-X table programmed
properly things would work alot better. Disable capture/remap of the
MSI range in the NTB.

Then each device could have a proper non-multiplexed interrupt
delivered to the CPU.. Affinity would work properly, no need to share
IRQs (eg vmd_irq() goes away)/etc.

Something for the VMD maintainers to think about at least.

As I hear more about NTB these days a full MSI scheme for NTB seems
interesting to have in the PCI-E core code..

Jason
Jon Derrick Sept. 30, 2020, 12:45 p.m. UTC | #3
Hi Jason

On Mon, 2020-08-31 at 11:39 -0300, Jason Gunthorpe wrote:
> On Wed, Aug 26, 2020 at 01:16:52PM +0200, Thomas Gleixner wrote:
> > From: Thomas Gleixner <tglx@linutronix.de>
> > 
> > Devices on the VMD bus use their own MSI irq domain, but it is not
> > distinguishable from regular PCI/MSI irq domains. This is required
> > to exclude VMD devices from getting the irq domain pointer set by
> > interrupt remapping.
> > 
> > Override the default bus token.
> > 
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> >  drivers/pci/controller/vmd.c |    6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
> >  		return -ENODEV;
> >  	}
> >  
> > +	/*
> > +	 * Override the irq domain bus token so the domain can be distinguished
> > +	 * from a regular PCI/MSI domain.
> > +	 */
> > +	irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> > +
> 
> Having the non-transparent-bridge hold a MSI table and
> multiplex/de-multiplex IRQs looks like another good use case for
> something close to pci_subdevice_msi_create_irq_domain()?
> 
> If each device could have its own internal MSI-X table programmed
> properly things would work alot better. Disable capture/remap of the
> MSI range in the NTB.
We can disable the capture and remap in newer devices so we don't even
need the irq domain. Legacy VMD will automatically remap based on the
APIC dest bits in the MSI address.

I would however like to determine if the MSI data bits could be used
for individual devices to have the IRQ domain subsystem demultiplex the
virq from that and eliminate the IRQ list iteration.

A concern I have about that scheme is virtualization as I think those
bits are used to route to the virtual vector.

> 
> Then each device could have a proper non-multiplexed interrupt
> delivered to the CPU.. Affinity would work properly, no need to share
> IRQs (eg vmd_irq() goes away)/etc.
> 
> Something for the VMD maintainers to think about at least.
> 
> As I hear more about NTB these days a full MSI scheme for NTB seems
> interesting to have in the PCI-E core code..
> 
> Jason
Jason Gunthorpe Sept. 30, 2020, 12:57 p.m. UTC | #4
On Wed, Sep 30, 2020 at 12:45:30PM +0000, Derrick, Jonathan wrote:
> Hi Jason
> 
> On Mon, 2020-08-31 at 11:39 -0300, Jason Gunthorpe wrote:
> > On Wed, Aug 26, 2020 at 01:16:52PM +0200, Thomas Gleixner wrote:
> > > From: Thomas Gleixner <tglx@linutronix.de>
> > > 
> > > Devices on the VMD bus use their own MSI irq domain, but it is not
> > > distinguishable from regular PCI/MSI irq domains. This is required
> > > to exclude VMD devices from getting the irq domain pointer set by
> > > interrupt remapping.
> > > 
> > > Override the default bus token.
> > > 
> > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > > Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> > >  drivers/pci/controller/vmd.c |    6 ++++++
> > >  1 file changed, 6 insertions(+)
> > > 
> > > +++ b/drivers/pci/controller/vmd.c
> > > @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
> > >  		return -ENODEV;
> > >  	}
> > >  
> > > +	/*
> > > +	 * Override the irq domain bus token so the domain can be distinguished
> > > +	 * from a regular PCI/MSI domain.
> > > +	 */
> > > +	irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> > > +
> > 
> > Having the non-transparent-bridge hold a MSI table and
> > multiplex/de-multiplex IRQs looks like another good use case for
> > something close to pci_subdevice_msi_create_irq_domain()?
> > 
> > If each device could have its own internal MSI-X table programmed
> > properly things would work alot better. Disable capture/remap of the
> > MSI range in the NTB.

> We can disable the capture and remap in newer devices so we don't even
> need the irq domain.

You'd still need an irq domain, it just comes from
pci_subdevice_msi_create_irq_domain() instead of internal to this
driver.

> I would however like to determine if the MSI data bits could be used
> for individual devices to have the IRQ domain subsystem demultiplex the
> virq from that and eliminate the IRQ list iteration.

Yes, exactly. This new pci_subdevice_msi_create_irq_domain() creates
*proper* fully functional interrupts, no need for any IRQ handler in
this driver.

> A concern I have about that scheme is virtualization as I think those
> bits are used to route to the virtual vector.

It should be fine with this patch series, consult with Megha how
virtualization is working with IDXD/etc

Jason
Jon Derrick Sept. 30, 2020, 1:08 p.m. UTC | #5
+Megha

On Wed, 2020-09-30 at 09:57 -0300, Jason Gunthorpe wrote:
> On Wed, Sep 30, 2020 at 12:45:30PM +0000, Derrick, Jonathan wrote:
> > Hi Jason
> > 
> > On Mon, 2020-08-31 at 11:39 -0300, Jason Gunthorpe wrote:
> > > On Wed, Aug 26, 2020 at 01:16:52PM +0200, Thomas Gleixner wrote:
> > > > From: Thomas Gleixner <tglx@linutronix.de>
> > > > 
> > > > Devices on the VMD bus use their own MSI irq domain, but it is not
> > > > distinguishable from regular PCI/MSI irq domains. This is required
> > > > to exclude VMD devices from getting the irq domain pointer set by
> > > > interrupt remapping.
> > > > 
> > > > Override the default bus token.
> > > > 
> > > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > > > Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> > > >  drivers/pci/controller/vmd.c |    6 ++++++
> > > >  1 file changed, 6 insertions(+)
> > > > 
> > > > +++ b/drivers/pci/controller/vmd.c
> > > > @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
> > > >  		return -ENODEV;
> > > >  	}
> > > >  
> > > > +	/*
> > > > +	 * Override the irq domain bus token so the domain can be distinguished
> > > > +	 * from a regular PCI/MSI domain.
> > > > +	 */
> > > > +	irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> > > > +
> > > 
> > > Having the non-transparent-bridge hold a MSI table and
> > > multiplex/de-multiplex IRQs looks like another good use case for
> > > something close to pci_subdevice_msi_create_irq_domain()?
> > > 
> > > If each device could have its own internal MSI-X table programmed
> > > properly things would work alot better. Disable capture/remap of the
> > > MSI range in the NTB.
> > We can disable the capture and remap in newer devices so we don't even
> > need the irq domain.
> 
> You'd still need an irq domain, it just comes from
> pci_subdevice_msi_create_irq_domain() instead of internal to this
> driver.
I have this set which disables remapping and bypasses the creation of
the IRQ domain:
https://patchwork.ozlabs.org/project/linux-pci/list/?series=192936

It allows the end-devices like NVMe to directly manager their own
interrupts and eliminates the VMD interrupt completely. The only quirk
was that kernel has to program IRTE with the VMD device ID as it still
remaps Requester-ID from subdevices.

> 
> > I would however like to determine if the MSI data bits could be used
> > for individual devices to have the IRQ domain subsystem demultiplex the
> > virq from that and eliminate the IRQ list iteration.
> 
> Yes, exactly. This new pci_subdevice_msi_create_irq_domain() creates
> *proper* fully functional interrupts, no need for any IRQ handler in
> this driver.
> 
> > A concern I have about that scheme is virtualization as I think those
> > bits are used to route to the virtual vector.
> 
> It should be fine with this patch series, consult with Megha how
> virtualization is working with IDXD/etc
> 
> Jason
Jason Gunthorpe Sept. 30, 2020, 6:47 p.m. UTC | #6
On Wed, Sep 30, 2020 at 01:08:27PM +0000, Derrick, Jonathan wrote:
> +Megha
> 
> On Wed, 2020-09-30 at 09:57 -0300, Jason Gunthorpe wrote:
> > On Wed, Sep 30, 2020 at 12:45:30PM +0000, Derrick, Jonathan wrote:
> > > Hi Jason
> > > 
> > > On Mon, 2020-08-31 at 11:39 -0300, Jason Gunthorpe wrote:
> > > > On Wed, Aug 26, 2020 at 01:16:52PM +0200, Thomas Gleixner wrote:
> > > > > From: Thomas Gleixner <tglx@linutronix.de>
> > > > > 
> > > > > Devices on the VMD bus use their own MSI irq domain, but it is not
> > > > > distinguishable from regular PCI/MSI irq domains. This is required
> > > > > to exclude VMD devices from getting the irq domain pointer set by
> > > > > interrupt remapping.
> > > > > 
> > > > > Override the default bus token.
> > > > > 
> > > > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > > > > Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> > > > >  drivers/pci/controller/vmd.c |    6 ++++++
> > > > >  1 file changed, 6 insertions(+)
> > > > > 
> > > > > +++ b/drivers/pci/controller/vmd.c
> > > > > @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
> > > > >  		return -ENODEV;
> > > > >  	}
> > > > >  
> > > > > +	/*
> > > > > +	 * Override the irq domain bus token so the domain can be distinguished
> > > > > +	 * from a regular PCI/MSI domain.
> > > > > +	 */
> > > > > +	irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> > > > > +
> > > > 
> > > > Having the non-transparent-bridge hold a MSI table and
> > > > multiplex/de-multiplex IRQs looks like another good use case for
> > > > something close to pci_subdevice_msi_create_irq_domain()?
> > > > 
> > > > If each device could have its own internal MSI-X table programmed
> > > > properly things would work alot better. Disable capture/remap of the
> > > > MSI range in the NTB.
> > > We can disable the capture and remap in newer devices so we don't even
> > > need the irq domain.
> > 
> > You'd still need an irq domain, it just comes from
> > pci_subdevice_msi_create_irq_domain() instead of internal to this
> > driver.
> I have this set which disables remapping and bypasses the creation of
> the IRQ domain:
> https://patchwork.ozlabs.org/project/linux-pci/list/?series=192936

After Thomas's series VMD needs to supply a hierarchical IRQ domain to
get the correct PCI originator. This instead of the x86 patch in that
series. I think that domain should be created by
pci_subdevice_msi_create_irq_domain(), at least I'd start there.

Jason
diff mbox series

Patch

--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -579,6 +579,12 @@  static int vmd_enable_domain(struct vmd_
 		return -ENODEV;
 	}
 
+	/*
+	 * Override the irq domain bus token so the domain can be distinguished
+	 * from a regular PCI/MSI domain.
+	 */
+	irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
+
 	pci_add_resource(&resources, &vmd->resources[0]);
 	pci_add_resource_offset(&resources, &vmd->resources[1], offset[0]);
 	pci_add_resource_offset(&resources, &vmd->resources[2], offset[1]);