[2/5] PCI: Add a reset quirk for VMD

Message ID 20201120225144.15138-3-jonathan.derrick@intel.com (mailing list archive)
State Changes Requested, archived
Series Legacy direct-assign mode

Commit Message

Jon Derrick Nov. 20, 2020, 10:51 p.m. UTC
VMD domains should be reset between special attachments, such as VFIO
users. VMD does not offer a reset; however, the subdevice domain itself
can be reset starting at the Root Bus. Add a Secondary Bus Reset on each
of the individual root port devices immediately downstream of the VMD
root bus.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
---
 drivers/pci/quirks.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

Comments

Bjorn Helgaas Nov. 24, 2020, 9:40 p.m. UTC | #1
[+cc Alex]

On Fri, Nov 20, 2020 at 03:51:41PM -0700, Jon Derrick wrote:
> VMD domains should be reset between special attachments, such as VFIO
> users. VMD does not offer a reset; however, the subdevice domain itself
> can be reset starting at the Root Bus. Add a Secondary Bus Reset on each
> of the individual root port devices immediately downstream of the VMD
> root bus.
> 
> Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
> ---
>  drivers/pci/quirks.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 48 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index f70692a..ee58b51 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3744,6 +3744,49 @@ static int reset_ivb_igd(struct pci_dev *dev, int probe)
>  	return 0;
>  }
>  
> +/* Issues SBR to VMD domain to clear PCI configuration */
> +static int reset_vmd_sbr(struct pci_dev *dev, int probe)
> +{
> +	char __iomem *cfgbar, *base;
> +	int rp;
> +	u16 ctl;
> +
> +	if (probe)
> +		return 0;
> +
> +	if (dev->dev.driver)
> +		return 0;

I guess "dev" here is the VMD endpoint?  And if the vmd.c driver is
bound to it, you return success without doing anything?

If there's no driver for the VMD device, who is trying to reset it?

I guess I don't quite understand how VMD works.  I would have thought
that if vmd.c isn't bound to the VMD device, the devices behind the
VMD would be inaccessible and there'd be no point in a reset.

> +	cfgbar = pci_iomap(dev, 0, 0);
> +	if (!cfgbar)
> +		return -ENOMEM;
> +
> +	/*
> +	 * Subdevice config space is mapped linearly using 4k config space
> +	 * increments. Use increments of 0x8000 to locate root port devices.
> +	 */
> +	for (rp = 0; rp < 4; rp++) {
> +		base = cfgbar + rp * 0x8000;

I really don't like this part -- iomapping BAR 0 (apparently
VMD_CFGBAR), and making up the ECAM-ish addresses and basically
open-coding ECAM accesses below.  I guess this assumes Root Ports are
only on functions .0, .2, .4, .6?

Is it all open-coded here because this reset path is only of interest
when vmd.c is NOT bound to the VMD device, so you can't use
vmd->cfgbar, etc?

What about the case when vmd.c IS bound?  We don't do anything here,
so does that mean we instead use the usual case of asserting SBR on
the Root Ports behind the VMD?

> +		if (readl(base + PCI_COMMAND) == 0xFFFFFFFF)
> +			continue;
> +
> +		/* pci_reset_secondary_bus() */
> +		ctl = readw(base + PCI_BRIDGE_CONTROL);
> +		ctl |= PCI_BRIDGE_CTL_BUS_RESET;
> +		writew(ctl, base + PCI_BRIDGE_CONTROL);
> +		readw(base + PCI_BRIDGE_CONTROL);
> +		msleep(2);
> +
> +		ctl &= ~PCI_BRIDGE_CTL_BUS_RESET;
> +		writew(ctl, base + PCI_BRIDGE_CONTROL);
> +		readw(base + PCI_BRIDGE_CONTROL);
> +	}
> +
> +	ssleep(1);
> +	pci_iounmap(dev, cfgbar);
> +	return 0;
> +}
> +
>  /* Device-specific reset method for Chelsio T4-based adapters */
>  static int reset_chelsio_generic_dev(struct pci_dev *dev, int probe)
>  {
> @@ -3919,6 +3962,11 @@ static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
>  		reset_ivb_igd },
>  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA,
>  		reset_ivb_igd },
> +	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_201D, reset_vmd_sbr },
> +	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_28C0, reset_vmd_sbr },
> +	{ PCI_VENDOR_ID_INTEL, 0x467f, reset_vmd_sbr },
> +	{ PCI_VENDOR_ID_INTEL, 0x4c3d, reset_vmd_sbr },
> +	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_9A0B, reset_vmd_sbr },
>  	{ PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr },
>  	{ PCI_VENDOR_ID_INTEL, 0x0953, delay_250ms_after_flr },
>  	{ PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,
> -- 
> 1.8.3.1
>
Jon Derrick Nov. 25, 2020, 5:22 p.m. UTC | #2
Hi Bjorn,

On Tue, 2020-11-24 at 15:40 -0600, Bjorn Helgaas wrote:
> [+cc Alex]
> 
> On Fri, Nov 20, 2020 at 03:51:41PM -0700, Jon Derrick wrote:
> > VMD domains should be reset between special attachments, such as VFIO
> > users. VMD does not offer a reset; however, the subdevice domain itself
> > can be reset starting at the Root Bus. Add a Secondary Bus Reset on each
> > of the individual root port devices immediately downstream of the VMD
> > root bus.
> > 
> > Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
> > ---
> >  drivers/pci/quirks.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 48 insertions(+)
> > 
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index f70692a..ee58b51 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -3744,6 +3744,49 @@ static int reset_ivb_igd(struct pci_dev *dev, int probe)
> >  	return 0;
> >  }
> >  
> > +/* Issues SBR to VMD domain to clear PCI configuration */
> > +static int reset_vmd_sbr(struct pci_dev *dev, int probe)
> > +{
> > +	char __iomem *cfgbar, *base;
> > +	int rp;
> > +	u16 ctl;
> > +
> > +	if (probe)
> > +		return 0;
> > +
> > +	if (dev->dev.driver)
> > +		return 0;
> 
> I guess "dev" here is the VMD endpoint?  And if the vmd.c driver is
> bound to it, you return success without doing anything?
> 
> If there's no driver for the VMD device, who is trying to reset it?
> 
> I guess I don't quite understand how VMD works.  I would have thought
> that if vmd.c isn't bound to the VMD device, the devices behind the
> VMD would be inaccessible and there'd be no point in a reset.

This is basically the idea behind this reset - allow the user to reset
VMD if there is no driver bound to it, but prevent the reset from
de-enumerating the domain if there is a driver.

If this is an unusual/unexpected use case, we can drop it.


> 
> > +	cfgbar = pci_iomap(dev, 0, 0);
> > +	if (!cfgbar)
> > +		return -ENOMEM;
> > +
> > +	/*
> > +	 * Subdevice config space is mapped linearly using 4k config space
> > +	 * increments. Use increments of 0x8000 to locate root port devices.
> > +	 */
> > +	for (rp = 0; rp < 4; rp++) {
> > +		base = cfgbar + rp * 0x8000;
> 
> I really don't like this part -- iomapping BAR 0 (apparently
> VMD_CFGBAR), and making up the ECAM-ish addresses and basically
> open-coding ECAM accesses below.  I guess this assumes Root Ports are
> only on functions .0, .2, .4, .6?

The Root Ports are Devices xx:00.0, xx:01.0, xx:02.0, and xx:03.0
(corresponding to PCIE_EXT_SLOT_SHIFT = 15)
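To make the addressing concrete, here is a minimal userspace sketch of the offset arithmetic, assuming the usual ECAM layout (4 KiB per function, device number at bit 15, bus number at bit 20). The helper names ecam_offset() and ecam_decode() are hypothetical and are not part of the patch or of vmd.c; only the shift of 15 comes from the discussion above.

```c
#include <stdint.h>

/* Assumed ECAM-style field positions within the config window. */
#define ECAM_BUS_SHIFT 20
#define ECAM_DEV_SHIFT 15 /* same value as the PCIE_EXT_SLOT_SHIFT cited above */
#define ECAM_FN_SHIFT  12 /* 4 KiB of config space per function */

struct bdf {
	unsigned int bus;
	unsigned int dev;
	unsigned int fn;
};

/* Hypothetical helper: byte offset of a function's config space. */
static uint32_t ecam_offset(unsigned int bus, unsigned int dev, unsigned int fn)
{
	return ((uint32_t)bus << ECAM_BUS_SHIFT) |
	       ((uint32_t)dev << ECAM_DEV_SHIFT) |
	       ((uint32_t)fn << ECAM_FN_SHIFT);
}

/* Hypothetical helper: recover bus/device/function from a byte offset. */
static struct bdf ecam_decode(uint32_t off)
{
	struct bdf b = {
		.bus = (off >> ECAM_BUS_SHIFT) & 0xff,
		.dev = (off >> ECAM_DEV_SHIFT) & 0x1f,
		.fn  = (off >> ECAM_FN_SHIFT) & 0x7,
	};
	return b;
}
```

Under these assumptions, rp * 0x8000 decodes to device rp, function 0 on the first bus, whereas stepping through functions .0, .2, .4, .6 would use a stride of 0x2000.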


> 
> Is it all open-coded here because this reset path is only of interest
> when vmd.c is NOT bound to the VMD device, so you can't use
> vmd->cfgbar, etc?

That's correct, but as mentioned above it might be an unusual code path
so is not as important as the reset within the driver in patch 1/5.

> 
> What about the case when vmd.c IS bound?  We don't do anything here,
> so does that mean we instead use the usual case of asserting SBR on
> the Root Ports behind the VMD?

It uses the standard Linux reset code paths for Root Port devices.
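
For readers less familiar with the sequence, a Secondary Bus Reset comes down to setting and then clearing one bit in the bridge's Bridge Control register. The sketch below models that on a plain byte array standing in for a root port's config space, so it can be compiled in userspace. The cfg_read16(), cfg_write16(), sbr_assert() and sbr_deassert() helpers are hypothetical, and the delays used by the quirk (2 ms with the bit set, 1 s after release) are only noted in comments.

```c
#include <stdint.h>

#define PCI_BRIDGE_CONTROL       0x3e /* Bridge Control offset, type 1 header */
#define PCI_BRIDGE_CTL_BUS_RESET 0x40 /* Secondary Bus Reset bit */

/* Hypothetical accessors over a simulated, little-endian config space. */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned int off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static void cfg_write16(uint8_t *cfg, unsigned int off, uint16_t val)
{
	cfg[off] = val & 0xff;
	cfg[off + 1] = val >> 8;
}

/* Set the reset bit; real code would then wait (the quirk sleeps 2 ms). */
static void sbr_assert(uint8_t *cfg)
{
	uint16_t ctl = cfg_read16(cfg, PCI_BRIDGE_CONTROL);

	cfg_write16(cfg, PCI_BRIDGE_CONTROL, ctl | PCI_BRIDGE_CTL_BUS_RESET);
}

/* Clear the reset bit; real code would then wait for devices to recover. */
static void sbr_deassert(uint8_t *cfg)
{
	uint16_t ctl = cfg_read16(cfg, PCI_BRIDGE_CONTROL);

	cfg_write16(cfg, PCI_BRIDGE_CONTROL, ctl & ~PCI_BRIDGE_CTL_BUS_RESET);
}
```

The other Bridge Control bits are preserved across both steps, which is why the sequence is a read-modify-write rather than a plain write.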

> 
> > +		if (readl(base + PCI_COMMAND) == 0xFFFFFFFF)
> > +			continue;
> > +
> > +		/* pci_reset_secondary_bus() */
> > +		ctl = readw(base + PCI_BRIDGE_CONTROL);
> > +		ctl |= PCI_BRIDGE_CTL_BUS_RESET;
> > +		writew(ctl, base + PCI_BRIDGE_CONTROL);
> > +		readw(base + PCI_BRIDGE_CONTROL);
> > +		msleep(2);
> > +
> > +		ctl &= ~PCI_BRIDGE_CTL_BUS_RESET;
> > +		writew(ctl, base + PCI_BRIDGE_CONTROL);
> > +		readw(base + PCI_BRIDGE_CONTROL);
> > +	}
> > +
> > +	ssleep(1);
> > +	pci_iounmap(dev, cfgbar);
> > +	return 0;
> > +}
> > +
> >  /* Device-specific reset method for Chelsio T4-based adapters */
> >  static int reset_chelsio_generic_dev(struct pci_dev *dev, int probe)
> >  {
> > @@ -3919,6 +3962,11 @@ static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
> >  		reset_ivb_igd },
> >  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA,
> >  		reset_ivb_igd },
> > +	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_201D, reset_vmd_sbr },
> > +	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_28C0, reset_vmd_sbr },
> > +	{ PCI_VENDOR_ID_INTEL, 0x467f, reset_vmd_sbr },
> > +	{ PCI_VENDOR_ID_INTEL, 0x4c3d, reset_vmd_sbr },
> > +	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_9A0B, reset_vmd_sbr },
> >  	{ PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr },
> >  	{ PCI_VENDOR_ID_INTEL, 0x0953, delay_250ms_after_flr },
> >  	{ PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,
> > -- 
> > 1.8.3.1
> >
Alex Williamson Nov. 25, 2020, 5:34 p.m. UTC | #3
On Wed, 25 Nov 2020 17:22:05 +0000
"Derrick, Jonathan" <jonathan.derrick@intel.com> wrote:

> Hi Bjorn,
> 
> On Tue, 2020-11-24 at 15:40 -0600, Bjorn Helgaas wrote:
> > [+cc Alex]
> > 
> > On Fri, Nov 20, 2020 at 03:51:41PM -0700, Jon Derrick wrote:  
> > > > VMD domains should be reset between special attachments, such as VFIO
> > > > users. VMD does not offer a reset; however, the subdevice domain itself
> > > can be reset starting at the Root Bus. Add a Secondary Bus Reset on each
> > > of the individual root port devices immediately downstream of the VMD
> > > root bus.
> > > 
> > > Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
> > > ---
> > >  drivers/pci/quirks.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 48 insertions(+)
> > > 
> > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > index f70692a..ee58b51 100644
> > > --- a/drivers/pci/quirks.c
> > > +++ b/drivers/pci/quirks.c
> > > @@ -3744,6 +3744,49 @@ static int reset_ivb_igd(struct pci_dev *dev, int probe)
> > >  	return 0;
> > >  }
> > >  
> > > +/* Issues SBR to VMD domain to clear PCI configuration */
> > > +static int reset_vmd_sbr(struct pci_dev *dev, int probe)
> > > +{
> > > +	char __iomem *cfgbar, *base;
> > > +	int rp;
> > > +	u16 ctl;
> > > +
> > > +	if (probe)
> > > +		return 0;
> > > +
> > > +	if (dev->dev.driver)
> > > +		return 0;  
> > 
> > I guess "dev" here is the VMD endpoint?  And if the vmd.c driver is
> > bound to it, you return success without doing anything?
> > 
> > If there's no driver for the VMD device, who is trying to reset it?
> > 
> > I guess I don't quite understand how VMD works.  I would have thought
> > that if vmd.c isn't bound to the VMD device, the devices behind the
> > VMD would be inaccessible and there'd be no point in a reset.  
> 
> This is basically the idea behind this reset - allow the user to reset
> VMD if there is no driver bound to it, but prevent the reset from
> de-enumerating the domain if there is a driver.
> 
> If this is an unusual/unexpected use case, we can drop it.

I don't understand how this improves the vfio use case as claimed in
the commit log.  Are you expecting the device to be unbound from all
drivers and reset via pci-sysfs between uses?  With this behavior, vfio
would not be able to perform the reset itself, whether between resets
of a VM or between separate users, without external manual unbinding
and reset.

   
> > > +	cfgbar = pci_iomap(dev, 0, 0);
> > > +	if (!cfgbar)
> > > +		return -ENOMEM;
> > > +
> > > +	/*
> > > +	 * Subdevice config space is mapped linearly using 4k config space
> > > +	 * increments. Use increments of 0x8000 to locate root port devices.
> > > +	 */
> > > +	for (rp = 0; rp < 4; rp++) {
> > > +		base = cfgbar + rp * 0x8000;  
> > 
> > I really don't like this part -- iomapping BAR 0 (apparently
> > VMD_CFGBAR), and making up the ECAM-ish addresses and basically
> > open-coding ECAM accesses below.  I guess this assumes Root Ports are
> > only on functions .0, .2, .4, .6?  
> 
> The Root Ports are Devices xx:00.0, xx:01.0, xx:02.0, and xx:03.0
> (corresponding to PCIE_EXT_SLOT_SHIFT = 15)
> 
> 
> > 
> > Is it all open-coded here because this reset path is only of interest
> > when vmd.c is NOT bound to the VMD device, so you can't use
> > vmd->cfgbar, etc?  
> 
> That's correct, but as mentioned above it might be an unusual code path
> so is not as important as the reset within the driver in patch 1/5.
> 
> > 
> > What about the case when vmd.c IS bound?  We don't do anything here,
> > so does that mean we instead use the usual case of asserting SBR on
> > the Root Ports behind the VMD?  
> 
> It uses the standard Linux reset code paths for Root Port devices
> 
> >   
> > > +		if (readl(base + PCI_COMMAND) == 0xFFFFFFFF)
> > > +			continue;
> > > +
> > > +		/* pci_reset_secondary_bus() */
> > > +		ctl = readw(base + PCI_BRIDGE_CONTROL);
> > > +		ctl |= PCI_BRIDGE_CTL_BUS_RESET;
> > > +		writew(ctl, base + PCI_BRIDGE_CONTROL);
> > > +		readw(base + PCI_BRIDGE_CONTROL);
> > > +		msleep(2);
> > > +
> > > +		ctl &= ~PCI_BRIDGE_CTL_BUS_RESET;
> > > +		writew(ctl, base + PCI_BRIDGE_CONTROL);
> > > +		readw(base + PCI_BRIDGE_CONTROL);

We're performing an SBR of the internal root ports here, is the config
space of the affected endpoints handled via save+restore of the code
that calls this?  I'm a little rusty on VMD again.  Thanks,

Alex


> > > +	}
> > > +
> > > +	ssleep(1);
> > > +	pci_iounmap(dev, cfgbar);
> > > +	return 0;
> > > +}
> > > +
> > >  /* Device-specific reset method for Chelsio T4-based adapters */
> > >  static int reset_chelsio_generic_dev(struct pci_dev *dev, int probe)
> > >  {
> > > @@ -3919,6 +3962,11 @@ static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
> > >  		reset_ivb_igd },
> > >  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA,
> > >  		reset_ivb_igd },
> > > +	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_201D, reset_vmd_sbr },
> > > +	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_28C0, reset_vmd_sbr },
> > > +	{ PCI_VENDOR_ID_INTEL, 0x467f, reset_vmd_sbr },
> > > +	{ PCI_VENDOR_ID_INTEL, 0x4c3d, reset_vmd_sbr },
> > > +	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_9A0B, reset_vmd_sbr },
> > >  	{ PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr },
> > >  	{ PCI_VENDOR_ID_INTEL, 0x0953, delay_250ms_after_flr },
> > >  	{ PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,
> > > -- 
> > > 1.8.3.1
> > >
Patch

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index f70692a..ee58b51 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3744,6 +3744,49 @@  static int reset_ivb_igd(struct pci_dev *dev, int probe)
 	return 0;
 }
 
+/* Issues SBR to VMD domain to clear PCI configuration */
+static int reset_vmd_sbr(struct pci_dev *dev, int probe)
+{
+	char __iomem *cfgbar, *base;
+	int rp;
+	u16 ctl;
+
+	if (probe)
+		return 0;
+
+	if (dev->dev.driver)
+		return 0;
+
+	cfgbar = pci_iomap(dev, 0, 0);
+	if (!cfgbar)
+		return -ENOMEM;
+
+	/*
+	 * Subdevice config space is mapped linearly using 4k config space
+	 * increments. Use increments of 0x8000 to locate root port devices.
+	 */
+	for (rp = 0; rp < 4; rp++) {
+		base = cfgbar + rp * 0x8000;
+		if (readl(base + PCI_COMMAND) == 0xFFFFFFFF)
+			continue;
+
+		/* pci_reset_secondary_bus() */
+		ctl = readw(base + PCI_BRIDGE_CONTROL);
+		ctl |= PCI_BRIDGE_CTL_BUS_RESET;
+		writew(ctl, base + PCI_BRIDGE_CONTROL);
+		readw(base + PCI_BRIDGE_CONTROL);
+		msleep(2);
+
+		ctl &= ~PCI_BRIDGE_CTL_BUS_RESET;
+		writew(ctl, base + PCI_BRIDGE_CONTROL);
+		readw(base + PCI_BRIDGE_CONTROL);
+	}
+
+	ssleep(1);
+	pci_iounmap(dev, cfgbar);
+	return 0;
+}
+
 /* Device-specific reset method for Chelsio T4-based adapters */
 static int reset_chelsio_generic_dev(struct pci_dev *dev, int probe)
 {
@@ -3919,6 +3962,11 @@  static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
 		reset_ivb_igd },
 	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA,
 		reset_ivb_igd },
+	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_201D, reset_vmd_sbr },
+	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_28C0, reset_vmd_sbr },
+	{ PCI_VENDOR_ID_INTEL, 0x467f, reset_vmd_sbr },
+	{ PCI_VENDOR_ID_INTEL, 0x4c3d, reset_vmd_sbr },
+	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_9A0B, reset_vmd_sbr },
 	{ PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr },
 	{ PCI_VENDOR_ID_INTEL, 0x0953, delay_250ms_after_flr },
 	{ PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,