Message ID | 20240805210956.364624-1-stewart.hildebrand@amd.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | x86/msi: fix locking for SRIOV devices | expand |
On 05.08.2024 23:09, Stewart Hildebrand wrote: > In commit 4f78438b45e2 ("vpci: use per-domain PCI lock to protect vpci > structure") a lock moved from allocate_and_map_msi_pirq() to the caller > and changed from pcidevs_lock() to read_lock(&d->pci_lock). However, one > call path wasn't updated to reflect the change, leading to a failed > assertion observed on debug builds of Xen when an SRIOV device is > present with one or more VFs enabled: > > (XEN) Assertion 'd || pcidevs_locked()' failed at drivers/passthrough/pci.c:535 > (XEN) ----[ Xen-4.20-unstable x86_64 debug=y Not tainted ]---- There must be more to this than just "SR-IOV with VFs enabled"? The locking change has been there for quite a while, yet the issue was noticed only know. > @@ -829,7 +829,8 @@ static int msix_capability_init(struct pci_dev *dev, > vf = dev->sbdf.bdf; > } > > - table_paddr = read_pci_mem_bar(seg, pbus, pslot, pfunc, bir, vf); > + table_paddr = read_pci_mem_bar(dev->domain, seg, pbus, pslot, pfunc, > + bir, vf); While dev->domain is the owner of dev (seg:bus:slot.func), it may not be the owner of seg:pbus:pslot.pfunc. Or if it (reliably) is, I'd expect the guarantees towards that to be spelled out in the description. (Dependency on ordering of toolstack operations likely wouldn't count as a guarantee to me. Yet if I'm not mistaken there's a problem in all cases if the VF was already moved to DomIO, before the DomU is started.) Further to this, until realizing that the code path leading here is accessible only for Dom0, I suspected a security angle to this, and hence didn't respond publicly on Matrix. I noticed, however, that apparently there's another instance of the same issue (via the other call site of allocate_and_map_msi_pirq(), i.e. from vpci_msix_arch_enable_entry() through vpci_msi_enable()). Imo that wants dealing with at the same time (potentially requiring a different overall approach). The code path leading here being accessible only for Dom0 likely is also a mistake (read: overly strict). In principle it ought to be possible to move a PF to a driver domain, for the VFs to then be assigned to individual DomU-s. In such a case I expect it shouldn't be Dom0 to invoke PHYSDEVOP_prepare_msix. Jan
On 8/6/24 02:25, Jan Beulich wrote: > On 05.08.2024 23:09, Stewart Hildebrand wrote: >> In commit 4f78438b45e2 ("vpci: use per-domain PCI lock to protect vpci >> structure") a lock moved from allocate_and_map_msi_pirq() to the caller >> and changed from pcidevs_lock() to read_lock(&d->pci_lock). However, one >> call path wasn't updated to reflect the change, leading to a failed >> assertion observed on debug builds of Xen when an SRIOV device is >> present with one or more VFs enabled: >> >> (XEN) Assertion 'd || pcidevs_locked()' failed at drivers/passthrough/pci.c:535 >> (XEN) ----[ Xen-4.20-unstable x86_64 debug=y Not tainted ]---- > > There must be more to this than just "SR-IOV with VFs enabled"? The > locking change has been there for quite a while, yet the issue was > noticed only know. Yes, the conditions as far as I can tell are: * PV dom0 * Debug build (debug=y) only * There is a SR-IOV device in the system with one or more VFs enabled/configured * Dom0 has loaded the driver for the VF and enabled MSI-X The assert triggers when dom0 loads the VF driver and enables MSI-X before PHYSDEVOP_prepare_msix is invoked. >> @@ -829,7 +829,8 @@ static int msix_capability_init(struct pci_dev *dev, >> vf = dev->sbdf.bdf; >> } >> >> - table_paddr = read_pci_mem_bar(seg, pbus, pslot, pfunc, bir, vf); >> + table_paddr = read_pci_mem_bar(dev->domain, seg, pbus, pslot, pfunc, >> + bir, vf); > > While dev->domain is the owner of dev (seg:bus:slot.func), it may not be > the owner of seg:pbus:pslot.pfunc. Or if it (reliably) is, I'd expect the > guarantees towards that to be spelled out in the description. (Dependency > on ordering of toolstack operations likely wouldn't count as a guarantee > to me. Yet if I'm not mistaken there's a problem in all cases if the VF > was already moved to DomIO, before the DomU is started.) On the possibility that they're not the same (now or in the future), disregard this patch. > Further to this, until realizing that the code path leading here is > accessible only for Dom0, I suspected a security angle to this, and hence > didn't respond publicly on Matrix. I noticed, however, that apparently > there's another instance of the same issue (via the other call site of > allocate_and_map_msi_pirq(), i.e. from vpci_msix_arch_enable_entry() > through vpci_msi_enable()). Imo that wants dealing with at the same time > (potentially requiring a different overall approach). There's a simpler way to solve this: the VF doesn't strictly need to obtain the pdev of its associated PF, so the call to pci_get_pdev() can be removed from read_pci_mem_bar(). Patch to follow... > The code path leading here being accessible only for Dom0 likely is also > a mistake (read: overly strict). In principle it ought to be possible to > move a PF to a driver domain, for the VFs to then be assigned to > individual DomU-s. In such a case I expect it shouldn't be Dom0 to invoke > PHYSDEVOP_prepare_msix. > > Jan
diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c index 0c97fbb3fc03..95a48615a1a4 100644 --- a/xen/arch/x86/msi.c +++ b/xen/arch/x86/msi.c @@ -662,7 +662,8 @@ static int msi_capability_init(struct pci_dev *dev, return 0; } -static u64 read_pci_mem_bar(u16 seg, u8 bus, u8 slot, u8 func, u8 bir, int vf) +static u64 read_pci_mem_bar(struct domain *d, u16 seg, u8 bus, u8 slot, u8 func, + u8 bir, int vf) { u8 limit; u32 addr, base = PCI_BASE_ADDRESS_0; @@ -670,8 +671,7 @@ static u64 read_pci_mem_bar(u16 seg, u8 bus, u8 slot, u8 func, u8 bir, int vf) if ( vf >= 0 ) { - struct pci_dev *pdev = pci_get_pdev(NULL, - PCI_SBDF(seg, bus, slot, func)); + struct pci_dev *pdev = pci_get_pdev(d, PCI_SBDF(seg, bus, slot, func)); unsigned int pos; uint16_t ctrl, num_vf, offset, stride; @@ -829,7 +829,8 @@ static int msix_capability_init(struct pci_dev *dev, vf = dev->sbdf.bdf; } - table_paddr = read_pci_mem_bar(seg, pbus, pslot, pfunc, bir, vf); + table_paddr = read_pci_mem_bar(dev->domain, seg, pbus, pslot, pfunc, + bir, vf); WARN_ON(msi && msi->table_base != table_paddr); if ( !table_paddr ) { @@ -852,7 +853,8 @@ static int msix_capability_init(struct pci_dev *dev, pba_offset = pci_conf_read32(dev->sbdf, msix_pba_offset_reg(pos)); bir = (u8)(pba_offset & PCI_MSIX_BIRMASK); - pba_paddr = read_pci_mem_bar(seg, pbus, pslot, pfunc, bir, vf); + pba_paddr = read_pci_mem_bar(dev->domain, seg, pbus, pslot, pfunc, bir, + vf); WARN_ON(!pba_paddr); pba_paddr += pba_offset & ~PCI_MSIX_BIRMASK;
In commit 4f78438b45e2 ("vpci: use per-domain PCI lock to protect vpci structure") a lock moved from allocate_and_map_msi_pirq() to the caller and changed from pcidevs_lock() to read_lock(&d->pci_lock). However, one call path wasn't updated to reflect the change, leading to a failed assertion observed on debug builds of Xen when an SRIOV device is present with one or more VFs enabled: (XEN) Assertion 'd || pcidevs_locked()' failed at drivers/passthrough/pci.c:535 (XEN) ----[ Xen-4.20-unstable x86_64 debug=y Not tainted ]---- ... (XEN) Xen call trace: (XEN) [<ffff82d040284da8>] R pci_get_pdev+0x4c/0xab (XEN) [<ffff82d040344f5c>] F arch/x86/msi.c#read_pci_mem_bar+0x58/0x272 (XEN) [<ffff82d04034530e>] F arch/x86/msi.c#msix_capability_init+0x198/0x755 (XEN) [<ffff82d040345dad>] F arch/x86/msi.c#__pci_enable_msix+0x82/0xe8 (XEN) [<ffff82d0403463e5>] F pci_enable_msi+0x3f/0x78 (XEN) [<ffff82d04034be2b>] F map_domain_pirq+0x2a4/0x6dc (XEN) [<ffff82d04034d4d5>] F allocate_and_map_msi_pirq+0x103/0x262 (XEN) [<ffff82d04035da5d>] F physdev_map_pirq+0x210/0x259 (XEN) [<ffff82d04035e798>] F do_physdev_op+0x9c3/0x1454 (XEN) [<ffff82d040329475>] F pv_hypercall+0x5ac/0x6af (XEN) [<ffff82d0402012d3>] F lstar_enter+0x143/0x150 Fix it by passing the struct domain pointer to pci_get_pdev() in this call path. Fixes: 4f78438b45e2 ("vpci: use per-domain PCI lock to protect vpci structure") Reported-by: Teddy Astie <teddy.astie@vates.tech> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> --- Candidate for backport to 4.19 --- xen/arch/x86/msi.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)