Message ID | 20230807204520.1088801-1-nirmal.patel@linux.intel.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | [v3] PCI: vmd: Disable bridge window for domain reset | expand |
On Mon, Aug 07, 2023 at 04:45:20PM -0400, Nirmal Patel wrote: > During domain reset process vmd_domain_reset() clears PCI > configuration space of VMD root ports. But certain platform > has observed following errors and failed to boot. > ... > DMAR: VT-d detected Invalidation Queue Error: Reason f > DMAR: VT-d detected Invalidation Time-out Error: SID ffff > DMAR: VT-d detected Invalidation Completion Error: SID ffff > DMAR: QI HEAD: UNKNOWN qw0 = 0x0, qw1 = 0x0 > DMAR: QI PRIOR: UNKNOWN qw0 = 0x0, qw1 = 0x0 > DMAR: Invalidation Time-out Error (ITE) cleared > > The root cause is that memset_io() clears prefetchable memory base/limit > registers and prefetchable base/limit 32 bits registers sequentially. > This seems to be enabling prefetchable memory if the device disabled > prefetchable memory originally. > Here is an example (before memset_io()): Paragraph separation or wrapping error. > PCI configuration space for 10000:00:00.0: > 86 80 30 20 06 00 10 00 04 00 04 06 00 00 01 00 > 00 00 00 00 00 00 00 00 00 01 01 00 00 00 00 20 > 00 00 00 00 01 00 01 00 ff ff ff ff 75 05 00 00 > ... > > So, prefetchable memory is ffffffff00000000-575000fffff, which is > disabled. When memset_io() clears prefetchable base 32 bits register, > the prefetchable memory becomes 0000000000000000-575000fffff, which is > enabled. > > This is believed to be the reason for the failure and in addition the > sequence of operation in vmd_domain_reset() is not following the PCIe > specs. > > Here is the quote from section 7.5.1.3.9 of PCI Express Base 6.0 spec: > > The Prefetchable Memory Limit register must be programmed to a smaller > value than the Prefetchable Memory Base register if there is no > prefetchable memory on the secondary side of the bridg s/bridg/bridge/ The [mem 0x0-0x575000fffff] state is transient, right? After the memset_io() completes, the window is [mem 0x0-0x000fffff], which still not what you want, since it's enabled, while you want to *disable* the window. I don't know what the connection between this and the DMAR invalidation queue errors is. Maybe those can happen with either the transient [mem 0x0-0x575000fffff] state or the [mem 0x0-0x000fffff] end state? IIUC there are two problems with the memset_io(): 1) The memset_io() writes 0 to all the base and limit registers. For address decoding purposes, the low bits of the base are implicitly clear while the low bits of the limit are implicitly set, so setting the base to zero always makes the windows *enabled*, e.g., [io 0x0000-0x0fff]. 2) The I/O and prefetchable base/limit can't be configured with a single config write, so we have to write them in a specific order to avoid transient enabled windows that could cause conflicts. > Disable the bridge window by executing a sequence of operations > borrowed from pci_disable_bridge_window(), that comply with the > PCI specifications. > > Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com> > --- > v3->v3: Add more information to commit description. > v1->v2: Follow same chain of operation as pci_disable_bridge_window > and update commit log. > --- > drivers/pci/controller/vmd.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c > index 769eedeb8802..ca9081be917d 100644 > --- a/drivers/pci/controller/vmd.c > +++ b/drivers/pci/controller/vmd.c > @@ -526,8 +526,16 @@ static void vmd_domain_reset(struct vmd_dev *vmd) > PCI_CLASS_BRIDGE_PCI)) > continue; > > - memset_io(base + PCI_IO_BASE, 0, > - PCI_ROM_ADDRESS1 - PCI_IO_BASE); > + writel(0, base + PCI_IO_BASE); This is a 32-bit write, so it writes zero to PCI_IO_BASE, PCI_IO_LIMIT, and PCI_SEC_STATUS. Writing zero to PCI_SEC_STATUS is probably harmless since everything there is RO or RW1C, but is unnecessary and seems a little sloppy. Writing zero to PCI_IO_BASE and PCI_IO_LIMIT enables it as [io 0x0000-0x0fff]. I think the code in pci_setup_bridge_io() is more like what you want. > + /* MMIO Base/Limit */ > + writel(0x0000fff0, base + PCI_MEMORY_BASE); > + > + /* Prefetchable MMIO Base/Limit */ > + writel(0, base + PCI_PREF_LIMIT_UPPER32); > + writel(0x0000fff0, base + PCI_PREF_MEMORY_BASE); > + writel(0xffffffff, base + PCI_PREF_BASE_UPPER32); > + writel(0, base + PCI_IO_BASE_UPPER16); kkkk > + writeb(0, base + PCI_CAPABILITY_LIST); > } > } > } > -- > 2.31.1 >
On 8/7/2023 3:23 PM, Bjorn Helgaas wrote: > On Mon, Aug 07, 2023 at 04:45:20PM -0400, Nirmal Patel wrote: >> During domain reset process vmd_domain_reset() clears PCI >> configuration space of VMD root ports. But certain platform >> has observed following errors and failed to boot. >> ... >> DMAR: VT-d detected Invalidation Queue Error: Reason f >> DMAR: VT-d detected Invalidation Time-out Error: SID ffff >> DMAR: VT-d detected Invalidation Completion Error: SID ffff >> DMAR: QI HEAD: UNKNOWN qw0 = 0x0, qw1 = 0x0 >> DMAR: QI PRIOR: UNKNOWN qw0 = 0x0, qw1 = 0x0 >> DMAR: Invalidation Time-out Error (ITE) cleared >> >> The root cause is that memset_io() clears prefetchable memory base/limit >> registers and prefetchable base/limit 32 bits registers sequentially. >> This seems to be enabling prefetchable memory if the device disabled >> prefetchable memory originally. >> Here is an example (before memset_io()): > Paragraph separation or wrapping error. I will fix it. > >> PCI configuration space for 10000:00:00.0: >> 86 80 30 20 06 00 10 00 04 00 04 06 00 00 01 00 >> 00 00 00 00 00 00 00 00 00 01 01 00 00 00 00 20 >> 00 00 00 00 01 00 01 00 ff ff ff ff 75 05 00 00 >> ... >> >> So, prefetchable memory is ffffffff00000000-575000fffff, which is >> disabled. When memset_io() clears prefetchable base 32 bits register, >> the prefetchable memory becomes 0000000000000000-575000fffff, which is >> enabled. >> >> This is believed to be the reason for the failure and in addition the >> sequence of operation in vmd_domain_reset() is not following the PCIe >> specs. >> >> Here is the quote from section 7.5.1.3.9 of PCI Express Base 6.0 spec: >> >> The Prefetchable Memory Limit register must be programmed to a smaller >> value than the Prefetchable Memory Base register if there is no >> prefetchable memory on the secondary side of the bridg > s/bridg/bridge/ > > The [mem 0x0-0x575000fffff] state is transient, right? After the > memset_io() completes, the window is [mem 0x0-0x000fffff], which still > not what you want, since it's enabled, while you want to *disable* the > window. > > I don't know what the connection between this and the DMAR > invalidation queue errors is. Maybe those can happen with either the > transient [mem 0x0-0x575000fffff] state or the [mem 0x0-0x000fffff] > end state? > > IIUC there are two problems with the memset_io(): > > 1) The memset_io() writes 0 to all the base and limit registers. > For address decoding purposes, the low bits of the base are > implicitly clear while the low bits of the limit are implicitly > set, so setting the base to zero always makes the windows > *enabled*, e.g., [io 0x0000-0x0fff]. > > 2) The I/O and prefetchable base/limit can't be configured with a > single config write, so we have to write them in a specific order > to avoid transient enabled windows that could cause conflicts. Yes. Your explanation is very accurate. > >> Disable the bridge window by executing a sequence of operations >> borrowed from pci_disable_bridge_window(), that comply with the >> PCI specifications. >> >> Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com> >> --- >> v3->v3: Add more information to commit description. >> v1->v2: Follow same chain of operation as pci_disable_bridge_window >> and update commit log. >> --- >> drivers/pci/controller/vmd.c | 12 ++++++++++-- >> 1 file changed, 10 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c >> index 769eedeb8802..ca9081be917d 100644 >> --- a/drivers/pci/controller/vmd.c >> +++ b/drivers/pci/controller/vmd.c >> @@ -526,8 +526,16 @@ static void vmd_domain_reset(struct vmd_dev *vmd) >> PCI_CLASS_BRIDGE_PCI)) >> continue; >> >> - memset_io(base + PCI_IO_BASE, 0, >> - PCI_ROM_ADDRESS1 - PCI_IO_BASE); >> + writel(0, base + PCI_IO_BASE); > This is a 32-bit write, so it writes zero to PCI_IO_BASE, > PCI_IO_LIMIT, and PCI_SEC_STATUS. Writing zero to PCI_SEC_STATUS is > probably harmless since everything there is RO or RW1C, but is > unnecessary and seems a little sloppy. > > Writing zero to PCI_IO_BASE and PCI_IO_LIMIT enables it as > [io 0x0000-0x0fff]. I think the code in pci_setup_bridge_io() is more > like what you want. I will make changes according to pci_setup_bridge_io() ASAP. Thanks. > >> + /* MMIO Base/Limit */ >> + writel(0x0000fff0, base + PCI_MEMORY_BASE); >> + >> + /* Prefetchable MMIO Base/Limit */ >> + writel(0, base + PCI_PREF_LIMIT_UPPER32); >> + writel(0x0000fff0, base + PCI_PREF_MEMORY_BASE); >> + writel(0xffffffff, base + PCI_PREF_BASE_UPPER32); >> + writel(0, base + PCI_IO_BASE_UPPER16); > kkkk >> + writeb(0, base + PCI_CAPABILITY_LIST); >> } >> } >> } >> -- >> 2.31.1 >>
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c index 769eedeb8802..ca9081be917d 100644 --- a/drivers/pci/controller/vmd.c +++ b/drivers/pci/controller/vmd.c @@ -526,8 +526,16 @@ static void vmd_domain_reset(struct vmd_dev *vmd) PCI_CLASS_BRIDGE_PCI)) continue; - memset_io(base + PCI_IO_BASE, 0, - PCI_ROM_ADDRESS1 - PCI_IO_BASE); + writel(0, base + PCI_IO_BASE); + /* MMIO Base/Limit */ + writel(0x0000fff0, base + PCI_MEMORY_BASE); + + /* Prefetchable MMIO Base/Limit */ + writel(0, base + PCI_PREF_LIMIT_UPPER32); + writel(0x0000fff0, base + PCI_PREF_MEMORY_BASE); + writel(0xffffffff, base + PCI_PREF_BASE_UPPER32); + writel(0, base + PCI_IO_BASE_UPPER16); + writeb(0, base + PCI_CAPABILITY_LIST); } } }
During domain reset process vmd_domain_reset() clears PCI configuration space of VMD root ports. But certain platform has observed following errors and failed to boot. ... DMAR: VT-d detected Invalidation Queue Error: Reason f DMAR: VT-d detected Invalidation Time-out Error: SID ffff DMAR: VT-d detected Invalidation Completion Error: SID ffff DMAR: QI HEAD: UNKNOWN qw0 = 0x0, qw1 = 0x0 DMAR: QI PRIOR: UNKNOWN qw0 = 0x0, qw1 = 0x0 DMAR: Invalidation Time-out Error (ITE) cleared The root cause is that memset_io() clears prefetchable memory base/limit registers and prefetchable base/limit 32 bits registers sequentially. This seems to be enabling prefetchable memory if the device disabled prefetchable memory originally. Here is an example (before memset_io()): PCI configuration space for 10000:00:00.0: 86 80 30 20 06 00 10 00 04 00 04 06 00 00 01 00 00 00 00 00 00 00 00 00 00 01 01 00 00 00 00 20 00 00 00 00 01 00 01 00 ff ff ff ff 75 05 00 00 ... So, prefetchable memory is ffffffff00000000-575000fffff, which is disabled. When memset_io() clears prefetchable base 32 bits register, the prefetchable memory becomes 0000000000000000-575000fffff, which is enabled. This is believed to be the reason for the failure and in addition the sequence of operation in vmd_domain_reset() is not following the PCIe specs. Here is the quote from section 7.5.1.3.9 of PCI Express Base 6.0 spec: The Prefetchable Memory Limit register must be programmed to a smaller value than the Prefetchable Memory Base register if there is no prefetchable memory on the secondary side of the bridg Disable the bridge window by executing a sequence of operations borrowed from pci_disable_bridge_window(), that comply with the PCI specifications. Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com> --- v3->v3: Add more information to commit description. v1->v2: Follow same chain of operation as pci_disable_bridge_window and update commit log. --- drivers/pci/controller/vmd.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)