Message ID | 162158429107.145117.5843504911924013125.stgit@jupiter (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] spapr: Fix EEH capability issue on KVM guest for PCI passthru | expand |
On Fri, May 21, 2021 at 01:35:51PM +0530, Mahesh Salgaonkar wrote: > With upstream kernel, especially after commit 98ba956f6a389 > ("powerpc/pseries/eeh: Rework device EEH PE determination") we see that KVM > guest isn't able to enable EEH option for PCI pass-through devices anymore. > > [root@atest-guest ~]# dmesg | grep EEH > [ 0.032337] EEH: pSeries platform initialized > [ 0.298207] EEH: No capable adapters found: recovery disabled. > [root@atest-guest ~]# > > So far the linux kernel was assuming pe_config_addr equal to device's > config_addr and using it to enable EEH on the PE through ibm,set-eeh-option > RTAS call. Which wasn't the correct way as per PAPR. The linux kernel > commit 98ba956f6a389 fixed this flow. With that fixed, linux now uses PE > config address returned by ibm,get-config-addr-info2 RTAS call to enable > EEH option per-PE basis instead of per-device basis. However this has > uncovered a bug in qemu where ibm,set-eeh-option is treating PE config > address as per-device config address. > > Hence in qemu guest with recent kernel the ibm,set-eeh-option RTAS call > fails with -3 return value indicating that there is no PCI device exist for > the specified PE config address. The rtas_ibm_set_eeh_option call uses > pci_find_device() to get the PC device that matches specific bus and devfn > extracted from PE config address passed as argument. Thus it tries to map > the PE config address to a single specific PCI device 'bus->devices[devfn]' > which always results into checking device on slot 0 'bus->devices[0]'. > This succeeds when there is a pass-through device (vfio-pci) present on > slot 0. But in cases where there is no pass-through device present in slot > 0, but present in non-zero slots, ibm,set-eeh-option call fails to enable > the EEH capability. > > hw/ppc/spapr_pci_vfio.c: spapr_phb_vfio_eeh_set_option() > case RTAS_EEH_ENABLE: { > PCIHostState *phb; > PCIDevice *pdev; > > /* > * The EEH functionality is enabled on basis of PCI device, > * instead of PE. We need check the validity of the PCI > * device address. > */ > phb = PCI_HOST_BRIDGE(sphb); > pdev = pci_find_device(phb->bus, > (addr >> 16) & 0xFF, (addr >> 8) & 0xFF); > if (!pdev || !object_dynamic_cast(OBJECT(pdev), "vfio-pci")) { > return RTAS_OUT_PARAM_ERROR; > } > > hw/pci/pci.c:pci_find_device() > > PCIDevice *pci_find_device(PCIBus *bus, int bus_num, uint8_t devfn) > { > bus = pci_find_bus_nr(bus, bus_num); > > if (!bus) > return NULL; > > return bus->devices[devfn]; > } > > This patch fixes ibm,set-eeh-option to check for presence of any PCI device > (vfio-pci) under specified bus and enable the EEH if found. The current > code already makes sure that all the devices on that bus are from same > iommu group (within same PE) and fail very early if it does not. > > After this fix guest is able to find EEH capable devices and enable EEH > recovery on it. > > [root@atest-guest ~]# dmesg | grep EEH > [ 0.048139] EEH: pSeries platform initialized > [ 0.405115] EEH: Capable adapter found: recovery enabled. > [root@atest-guest ~]# > > Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> > Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Gross, but I don't see a better way of handling this, so, applied to ppc-for-6.1, thanks. > --- > Change in v3: > - Add a comment about reason for not checking for validity of supplied > config_addr as pointed by Oliver in spapr_phb_vfio_eeh_set_option() > function. > Change in v2: > - Fix ibm,set-eeh-option instead of returning per-device PE config address. > - Changed patch subject line. > --- > hw/ppc/spapr_pci_vfio.c | 40 +++++++++++++++++++++++++++++++++------- > 1 file changed, 33 insertions(+), 7 deletions(-) > > diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c > index e0547b1740..6587c8cb5b 100644 > --- a/hw/ppc/spapr_pci_vfio.c > +++ b/hw/ppc/spapr_pci_vfio.c > @@ -47,6 +47,16 @@ void spapr_phb_vfio_reset(DeviceState *qdev) > spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev)); > } > > +static void spapr_eeh_pci_find_device(PCIBus *bus, PCIDevice *pdev, > + void *opaque) > +{ > + bool *found = opaque; > + > + if (object_dynamic_cast(OBJECT(pdev), "vfio-pci")) { > + *found = true; > + } > +} > + > int spapr_phb_vfio_eeh_set_option(SpaprPhbState *sphb, > unsigned int addr, int option) > { > @@ -59,17 +69,33 @@ int spapr_phb_vfio_eeh_set_option(SpaprPhbState *sphb, > break; > case RTAS_EEH_ENABLE: { > PCIHostState *phb; > - PCIDevice *pdev; > + bool found = false; > > /* > - * The EEH functionality is enabled on basis of PCI device, > - * instead of PE. We need check the validity of the PCI > - * device address. > + * The EEH functionality is enabled per sphb level instead of > + * per PCI device. We have already identified this specific sphb > + * based on buid passed as argument to ibm,set-eeh-option rtas > + * call. Now we just need to check the validity of the PCI > + * pass-through devices (vfio-pci) under this sphb bus. > + * We have already validated that all the devices under this sphb > + * are from same iommu group (within same PE) before comming here. > + * > + * Prior to linux commit 98ba956f6a389 ("powerpc/pseries/eeh: > + * Rework device EEH PE determination") kernel would call > + * eeh-set-option for each device in the PE using the device's > + * config_address as the argument rather than the PE address. > + * Hence if we check validity of supplied config_addr whether > + * it matches to this PHB will cause issues with older kernel > + * versions v5.9 and older. If we return an error from > + * eeh-set-option when the argument isn't a valid PE address > + * then older kernels (v5.9 and older) will interpret that as > + * EEH not being supported. > */ > phb = PCI_HOST_BRIDGE(sphb); > - pdev = pci_find_device(phb->bus, > - (addr >> 16) & 0xFF, (addr >> 8) & 0xFF); > - if (!pdev || !object_dynamic_cast(OBJECT(pdev), "vfio-pci")) { > + pci_for_each_device(phb->bus, (addr >> 16) & 0xFF, > + spapr_eeh_pci_find_device, &found); > + > + if (!found) { > return RTAS_OUT_PARAM_ERROR; > } > > >
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index e0547b1740..6587c8cb5b 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -47,6 +47,16 @@ void spapr_phb_vfio_reset(DeviceState *qdev) spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev)); } +static void spapr_eeh_pci_find_device(PCIBus *bus, PCIDevice *pdev, + void *opaque) +{ + bool *found = opaque; + + if (object_dynamic_cast(OBJECT(pdev), "vfio-pci")) { + *found = true; + } +} + int spapr_phb_vfio_eeh_set_option(SpaprPhbState *sphb, unsigned int addr, int option) { @@ -59,17 +69,33 @@ int spapr_phb_vfio_eeh_set_option(SpaprPhbState *sphb, break; case RTAS_EEH_ENABLE: { PCIHostState *phb; - PCIDevice *pdev; + bool found = false; /* - * The EEH functionality is enabled on basis of PCI device, - * instead of PE. We need check the validity of the PCI - * device address. + * The EEH functionality is enabled per sphb level instead of + * per PCI device. We have already identified this specific sphb + * based on buid passed as argument to ibm,set-eeh-option rtas + * call. Now we just need to check the validity of the PCI + * pass-through devices (vfio-pci) under this sphb bus. + * We have already validated that all the devices under this sphb + * are from same iommu group (within same PE) before comming here. + * + * Prior to linux commit 98ba956f6a389 ("powerpc/pseries/eeh: + * Rework device EEH PE determination") kernel would call + * eeh-set-option for each device in the PE using the device's + * config_address as the argument rather than the PE address. + * Hence if we check validity of supplied config_addr whether + * it matches to this PHB will cause issues with older kernel + * versions v5.9 and older. If we return an error from + * eeh-set-option when the argument isn't a valid PE address + * then older kernels (v5.9 and older) will interpret that as + * EEH not being supported. */ phb = PCI_HOST_BRIDGE(sphb); - pdev = pci_find_device(phb->bus, - (addr >> 16) & 0xFF, (addr >> 8) & 0xFF); - if (!pdev || !object_dynamic_cast(OBJECT(pdev), "vfio-pci")) { + pci_for_each_device(phb->bus, (addr >> 16) & 0xFF, + spapr_eeh_pci_find_device, &found); + + if (!found) { return RTAS_OUT_PARAM_ERROR; }