Message ID | 20230615051645.4798-1-anisinha@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] hw/pci: prevent hotplug of devices on pcie-root-ports on the wrong slot |
On Thu, 15 Jun 2023 10:46:45 +0530
Ani Sinha <anisinha@redhat.com> wrote:

> PCIE root ports and other upstream ports only allow one device on slot 0.
> When hotplugging a device on a pcie root port, make sure that the device
> address passed always represents slot 0. Any other slot value would be
> illegal on a root port.
>
> CC: jusual@redhat.com
> CC: imammedo@redhat.com
> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> ---
>  hw/pci/pci.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> changelog:
> v2: feedback from mst included.
>
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index bf38905b7d..66999352cc 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -64,6 +64,7 @@ bool pci_available = true;
>  static char *pcibus_get_dev_path(DeviceState *dev);
>  static char *pcibus_get_fw_dev_path(DeviceState *dev);
>  static void pcibus_reset(BusState *qbus);
> +static bool pcie_has_upstream_port(PCIDevice *dev);
>
>  static Property pci_props[] = {
>      DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
> @@ -1182,6 +1183,11 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev,
>      } else if (dev->hotplugged &&
>                 !pci_is_vf(pci_dev) &&
>                 pci_get_function_0(pci_dev)) {
> +        /*
> +         * populating function 0 triggers a bus scan from the guest that
> +         * exposes other non-zero functions. Hence we need to ensure that
> +         * function 0 is available.
> +         */
>          error_setg(errp, "PCI: slot %d function 0 already occupied by %s,"
>                     " new func %s cannot be exposed to guest.",
>                     PCI_SLOT(pci_get_function_0(pci_dev)->devfn),
> @@ -1189,6 +1195,16 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev,
>                     name);
>
>          return NULL;
> +    } else if (dev->hotplugged &&
> +               !pci_is_vf(pci_dev) &&
> +               pcie_has_upstream_port(pci_dev) && PCI_SLOT(devfn)) {
> +        /*
> +         * If the device is being plugged into an upstream PCIE port,

you probably mixing up downstream port with upstream one,
the only thing that could be plugged into upstream port
is PCIE switch.

Also I'm not sure that we should do this at all.
Looking at BZ it seems that QEMU crashes inside backend
and tear down/cleanup sequence is broken somewhere.
And that is the root cause, so I'd rather fix that 1st
and only after that consider adding workarounds if any
were necessary.

> +         * like a pcie root port, we only support one device at slot 0
> +         */
> +        error_setg(errp, "PCI: slot %d is not valid for %s",
> +                   PCI_SLOT(devfn), name);
> +        return NULL;
>      }
>
>      pci_dev->devfn = devfn;

> On 15-Jun-2023, at 4:56 PM, Igor Mammedov <imammedo@redhat.com> wrote:
>
> On Thu, 15 Jun 2023 10:46:45 +0530
> Ani Sinha <anisinha@redhat.com> wrote:
>
>> PCIE root ports and other upstream ports only allow one device on slot 0.
>> When hotplugging a device on a pcie root port, make sure that the device
>> address passed always represents slot 0. Any other slot value would be
>> illegal on a root port.
>>
>> CC: jusual@redhat.com
>> CC: imammedo@redhat.com
>> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
>> ---
>>  hw/pci/pci.c | 16 ++++++++++++++++
>>  1 file changed, 16 insertions(+)
>>
>> changelog:
>> v2: feedback from mst included.
>>
>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>> index bf38905b7d..66999352cc 100644
>> --- a/hw/pci/pci.c
>> +++ b/hw/pci/pci.c
>> @@ -64,6 +64,7 @@ bool pci_available = true;
>>  static char *pcibus_get_dev_path(DeviceState *dev);
>>  static char *pcibus_get_fw_dev_path(DeviceState *dev);
>>  static void pcibus_reset(BusState *qbus);
>> +static bool pcie_has_upstream_port(PCIDevice *dev);
>>
>>  static Property pci_props[] = {
>>      DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
>> @@ -1182,6 +1183,11 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev,
>>      } else if (dev->hotplugged &&
>>                 !pci_is_vf(pci_dev) &&
>>                 pci_get_function_0(pci_dev)) {
>> +        /*
>> +         * populating function 0 triggers a bus scan from the guest that
>> +         * exposes other non-zero functions. Hence we need to ensure that
>> +         * function 0 is available.
>> +         */
>>          error_setg(errp, "PCI: slot %d function 0 already occupied by %s,"
>>                     " new func %s cannot be exposed to guest.",
>>                     PCI_SLOT(pci_get_function_0(pci_dev)->devfn),
>> @@ -1189,6 +1195,16 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev,
>>                     name);
>>
>>          return NULL;
>> +    } else if (dev->hotplugged &&
>> +               !pci_is_vf(pci_dev) &&
>> +               pcie_has_upstream_port(pci_dev) && PCI_SLOT(devfn)) {
>> +        /*
>> +         * If the device is being plugged into an upstream PCIE port,
>
> you probably mixing up downstream port with upstream one,
> the only thing that could be plugged into upstream port
> is PCIE switch.
>
> Also I'm not sure that we should do this at all.
> Looking at BZ it seems that QEMU crashes inside backend
> and tear down/cleanup sequence is broken somewhere.
> And that is the root cause, so I'd rather fix that 1st
> and only after that consider adding workarounds if any
> were necessary.

I have added more details in the ticket. I still believe that my approach is in the right direction.

>
>
>> +         * like a pcie root port, we only support one device at slot 0
>> +         */
>> +        error_setg(errp, "PCI: slot %d is not valid for %s",
>> +                   PCI_SLOT(devfn), name);
>> +        return NULL;
>>      }
>>
>>      pci_dev->devfn = devfn;

On Fri, 16 Jun 2023 13:06:06 +0530
Ani Sinha <anisinha@redhat.com> wrote:

> > On 15-Jun-2023, at 4:56 PM, Igor Mammedov <imammedo@redhat.com> wrote:
> >
> > On Thu, 15 Jun 2023 10:46:45 +0530
> > Ani Sinha <anisinha@redhat.com> wrote:
> >
> >> PCIE root ports and other upstream ports only allow one device on slot 0.
> >> When hotplugging a device on a pcie root port, make sure that the device
> >> address passed always represents slot 0. Any other slot value would be
> >> illegal on a root port.
> >>
> >> CC: jusual@redhat.com
> >> CC: imammedo@redhat.com
> >> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
> >> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> >> ---
> >>  hw/pci/pci.c | 16 ++++++++++++++++
> >>  1 file changed, 16 insertions(+)
> >>
> >> changelog:
> >> v2: feedback from mst included.
> >>
> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >> index bf38905b7d..66999352cc 100644
> >> --- a/hw/pci/pci.c
> >> +++ b/hw/pci/pci.c
> >> @@ -64,6 +64,7 @@ bool pci_available = true;
> >>  static char *pcibus_get_dev_path(DeviceState *dev);
> >>  static char *pcibus_get_fw_dev_path(DeviceState *dev);
> >>  static void pcibus_reset(BusState *qbus);
> >> +static bool pcie_has_upstream_port(PCIDevice *dev);
> >>
> >>  static Property pci_props[] = {
> >>      DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
> >> @@ -1182,6 +1183,11 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev,
> >>      } else if (dev->hotplugged &&
> >>                 !pci_is_vf(pci_dev) &&
> >>                 pci_get_function_0(pci_dev)) {
> >> +        /*
> >> +         * populating function 0 triggers a bus scan from the guest that
> >> +         * exposes other non-zero functions. Hence we need to ensure that
> >> +         * function 0 is available.
> >> +         */
> >>          error_setg(errp, "PCI: slot %d function 0 already occupied by %s,"
> >>                     " new func %s cannot be exposed to guest.",
> >>                     PCI_SLOT(pci_get_function_0(pci_dev)->devfn),
> >> @@ -1189,6 +1195,16 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev,
> >>                     name);
> >>
> >>          return NULL;
> >> +    } else if (dev->hotplugged &&
> >> +               !pci_is_vf(pci_dev) &&
> >> +               pcie_has_upstream_port(pci_dev) && PCI_SLOT(devfn)) {
> >> +        /*
> >> +         * If the device is being plugged into an upstream PCIE port,
> >
> > you probably mixing up downstream port with upstream one,
> > the only thing that could be plugged into upstream port
> > is PCIE switch.
> >
> > Also I'm not sure that we should do this at all.
> > Looking at BZ it seems that QEMU crashes inside backend
> > and tear down/cleanup sequence is broken somewhere.
> > And that is the root cause, so I'd rather fix that 1st
> > and only after that consider adding workarounds if any
> > were necessary.
>
> I have added more details in the ticket. I still believe that my approach is in the right direction.

eject is _optional_ and guest is free to ignore eject request.
That shall not cause improper QEMU behavior.

Preventing bug trigger is ok if we can't fix root cause,
but then one should explain in commit message what root cause is
and why it can't be fixed.

does it work for similar coldplugged setup without unplug
(if yes then why)?

>
> >
> >
> >> +         * like a pcie root port, we only support one device at slot 0
> >> +         */
> >> +        error_setg(errp, "PCI: slot %d is not valid for %s",
> >> +                   PCI_SLOT(devfn), name);
> >> +        return NULL;
> >>      }
> >>
> >>      pci_dev->devfn = devfn;
>

> On 15-Jun-2023, at 4:56 PM, Igor Mammedov <imammedo@redhat.com> wrote:
>
> On Thu, 15 Jun 2023 10:46:45 +0530
> Ani Sinha <anisinha@redhat.com> wrote:
>
>> PCIE root ports and other upstream ports only allow one device on slot 0.
>> When hotplugging a device on a pcie root port, make sure that the device
>> address passed always represents slot 0. Any other slot value would be
>> illegal on a root port.
>>
>> CC: jusual@redhat.com
>> CC: imammedo@redhat.com
>> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
>> ---
>>  hw/pci/pci.c | 16 ++++++++++++++++
>>  1 file changed, 16 insertions(+)
>>
>> changelog:
>> v2: feedback from mst included.
>>
>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>> index bf38905b7d..66999352cc 100644
>> --- a/hw/pci/pci.c
>> +++ b/hw/pci/pci.c
>> @@ -64,6 +64,7 @@ bool pci_available = true;
>>  static char *pcibus_get_dev_path(DeviceState *dev);
>>  static char *pcibus_get_fw_dev_path(DeviceState *dev);
>>  static void pcibus_reset(BusState *qbus);
>> +static bool pcie_has_upstream_port(PCIDevice *dev);
>>
>>  static Property pci_props[] = {
>>      DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
>> @@ -1182,6 +1183,11 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev,
>>      } else if (dev->hotplugged &&
>>                 !pci_is_vf(pci_dev) &&
>>                 pci_get_function_0(pci_dev)) {
>> +        /*
>> +         * populating function 0 triggers a bus scan from the guest that
>> +         * exposes other non-zero functions. Hence we need to ensure that
>> +         * function 0 is available.
>> +         */
>>          error_setg(errp, "PCI: slot %d function 0 already occupied by %s,"
>>                     " new func %s cannot be exposed to guest.",
>>                     PCI_SLOT(pci_get_function_0(pci_dev)->devfn),
>> @@ -1189,6 +1195,16 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev,
>>                     name);
>>
>>          return NULL;
>> +    } else if (dev->hotplugged &&
>> +               !pci_is_vf(pci_dev) &&
>> +               pcie_has_upstream_port(pci_dev) && PCI_SLOT(devfn)) {
>> +        /*
>> +         * If the device is being plugged into an upstream PCIE port,
>
> you probably mixing up downstream port with upstream one,
> the only thing that could be plugged into upstream port
> is PCIE switch.
>
> Also I'm not sure that we should do this at all.
> Looking at BZ it seems that QEMU crashes inside backend
> and tear down/cleanup sequence is broken somewhere.
> And that is the root cause, so I'd rather fix that 1st
> and only after that consider adding workarounds if any
> were necessary.

I have sent an upstream patch "vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present" for the backend vdpa cleanup issue.

>
>
>> +         * like a pcie root port, we only support one device at slot 0
>> +         */
>> +        error_setg(errp, "PCI: slot %d is not valid for %s",
>> +                   PCI_SLOT(devfn), name);
>> +        return NULL;
>>      }
>>
>>      pci_dev->devfn = devfn;
>

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index bf38905b7d..66999352cc 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -64,6 +64,7 @@ bool pci_available = true;
 static char *pcibus_get_dev_path(DeviceState *dev);
 static char *pcibus_get_fw_dev_path(DeviceState *dev);
 static void pcibus_reset(BusState *qbus);
+static bool pcie_has_upstream_port(PCIDevice *dev);
 
 static Property pci_props[] = {
     DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
@@ -1182,6 +1183,11 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev,
     } else if (dev->hotplugged &&
                !pci_is_vf(pci_dev) &&
                pci_get_function_0(pci_dev)) {
+        /*
+         * populating function 0 triggers a bus scan from the guest that
+         * exposes other non-zero functions. Hence we need to ensure that
+         * function 0 is available.
+         */
         error_setg(errp, "PCI: slot %d function 0 already occupied by %s,"
                    " new func %s cannot be exposed to guest.",
                    PCI_SLOT(pci_get_function_0(pci_dev)->devfn),
@@ -1189,6 +1195,16 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev,
                    name);
 
         return NULL;
+    } else if (dev->hotplugged &&
+               !pci_is_vf(pci_dev) &&
+               pcie_has_upstream_port(pci_dev) && PCI_SLOT(devfn)) {
+        /*
+         * If the device is being plugged into an upstream PCIE port,
+         * like a pcie root port, we only support one device at slot 0
+         */
+        error_setg(errp, "PCI: slot %d is not valid for %s",
+                   PCI_SLOT(devfn), name);
+        return NULL;
     }
 
     pci_dev->devfn = devfn;

PCIE root ports and other upstream ports only allow one device on slot 0.
When hotplugging a device on a pcie root port, make sure that the device
address passed always represents slot 0. Any other slot value would be
illegal on a root port.

CC: jusual@redhat.com
CC: imammedo@redhat.com
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
Signed-off-by: Ani Sinha <anisinha@redhat.com>
---
 hw/pci/pci.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

changelog:
v2: feedback from mst included.
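
For readers following the slot check, here is a minimal standalone sketch of the rule the commit message describes. It is not QEMU code: every `sketch_`/`SKETCH_` name is hypothetical, the parent port is modelled as a plain enum rather than looked up through the bus, and the `hotplugged` and VF conditions from the patch are left out. The only parts taken as given are the standard devfn layout (5-bit slot in the upper bits, 3-bit function in the lower bits) and the patch's intent of rejecting a non-zero slot. Following the review comment about port naming, the sketch treats root ports and downstream ports as the single-slot parents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard devfn layout: bits 7..3 hold the slot, bits 2..0 the function. */
#define SKETCH_PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
#define SKETCH_PCI_FUNC(devfn) ((devfn) & 0x07)

/* Hypothetical helper: pack a slot and function number into a devfn. */
static uint8_t sketch_make_devfn(unsigned slot, unsigned func)
{
    assert(slot < 32 && func < 8);
    return (uint8_t)((slot << 3) | func);
}

/* Hypothetical stand-in for "what kind of port is the new device behind". */
enum sketch_parent {
    SKETCH_PARENT_CONVENTIONAL_BUS, /* e.g. a plain PCI bus with many slots */
    SKETCH_PARENT_ROOT_PORT,        /* PCIe root port */
    SKETCH_PARENT_DOWNSTREAM_PORT,  /* PCIe switch downstream port */
};

/* Assumption of this sketch: these two port types expose only slot 0. */
static bool sketch_parent_is_single_slot(enum sketch_parent parent)
{
    return parent == SKETCH_PARENT_ROOT_PORT ||
           parent == SKETCH_PARENT_DOWNSTREAM_PORT;
}

/* Returns true when the requested address should be rejected. */
static bool sketch_reject_hotplug(enum sketch_parent parent, uint8_t devfn)
{
    return sketch_parent_is_single_slot(parent) &&
           SKETCH_PCI_SLOT(devfn) != 0;
}
```

Under these assumptions, a request for slot 1 behind a root port is rejected, while slot 0 with any function number, or any slot on a conventional bus, passes this particular check.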