Message ID | a09d2427397621eaecee4c46b33507a99cc5f161.1673334040.git.brchuckz@aol.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru | expand |
On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote: > Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus, > as noted in docs/igd-assign.txt in the Qemu source code. > > Currently, when the xl toolstack is used to configure a Xen HVM guest with > Intel IGD passthrough to the guest with the Qemu upstream device model, > a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy > a different slot. This problem often prevents the guest from booting. > > The only available workaround is not good: Configure Xen HVM guests to use > the old and no longer maintained Qemu traditional device model available > from xenbits.xen.org which does reserve slot 2 for the Intel IGD. > > To implement this feature in the Qemu upstream device model for Xen HVM > guests, introduce the following new functions, types, and macros: > > * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE > * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS > * typedef XenPTQdevRealize function pointer > * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2 > * xen_igd_reserve_slot and xen_igd_clear_slot functions > > The new xen_igd_reserve_slot function uses the existing slot_reserved_mask > member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using > the xl toolstack with the gfx_passthru option enabled, which sets the > igd-passthru=on option to Qemu for the Xen HVM machine type. > > The new xen_igd_reserve_slot function also needs to be implemented in > hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case > when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough, > in which case it does nothing. > > The new xen_igd_clear_slot function overrides qdev->realize of the parent > PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus > since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was > created in hw/i386/pc_piix.c for the case when igd-passthru=on. > > Move the call to xen_host_pci_device_get, and the associated error > handling, from xen_pt_realize to the new xen_igd_clear_slot function to > initialize the device class and vendor values which enables the checks for > the Intel IGD to succeed. The verification that the host device is an > Intel IGD to be passed through is done by checking the domain, bus, slot, > and function values as well as by checking that gfx_passthru is enabled, > the device class is VGA, and the device vendor in Intel. > > Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> > --- > Notes that might be helpful to reviewers of patched code in hw/xen: > > The new functions and types are based on recommendations from Qemu docs: > https://qemu.readthedocs.io/en/latest/devel/qom.html > > Notes that might be helpful to reviewers of patched code in hw/i386: > > The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does > not affect builds that do not have CONFIG_XEN defined. > > xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an > existing function that is only true when Qemu is built with > xen-pci-passthrough enabled and the administrator has configured the Xen > HVM guest with Qemu's igd-passthru=on option. > > v2: Remove From: <email address> tag at top of commit message > > v3: Changed the test for the Intel IGD in xen_igd_clear_slot: > > if (is_igd_vga_passthrough(&s->real_device) && > (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) { > > is changed to > > if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) > && (s->hostaddr.function == 0)) { > > I hoped that I could use the test in v2, since it matches the > other tests for the Intel IGD in Qemu and Xen, but those tests > do not work because the necessary data structures are not set with > their values yet. So instead use the test that the administrator > has enabled gfx_passthru and the device address on the host is > 02.0. This test does detect the Intel IGD correctly. > > v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's > email address to match the address used by the same author in commits > be9c61da and c0e86b76 > > Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc > > v5: The patch of xen_pt.c was re-worked to allow a more consistent test > for the Intel IGD that uses the same criteria as in other places. > This involved moving the call to xen_host_pci_device_get from > xen_pt_realize to xen_igd_clear_slot and updating the checks for the > Intel IGD in xen_igd_clear_slot: > > if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) > && (s->hostaddr.function == 0)) { > > is changed to > > if (is_igd_vga_passthrough(&s->real_device) && > s->real_device.domain == 0 && s->real_device.bus == 0 && > s->real_device.dev == 2 && s->real_device.func == 0 && > s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { > > Added an explanation for the move of xen_host_pci_device_get from > xen_pt_realize to xen_igd_clear_slot to the commit message. > > Rebase. > > v6: Fix logging by removing these lines from the move from xen_pt_realize > to xen_igd_clear_slot that was done in v5: > > XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d" > " to devfn 0x%x\n", > s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, > s->dev.devfn); > > This log needs to be in xen_pt_realize because s->dev.devfn is not > set yet in xen_igd_clear_slot. > > v7: The v7 that was posted to the mailing list was incorrect. v8 is what > v7 was intended to be. > > v8: Inhibit out of context log message and needless processing by > adding 2 lines at the top of the new xen_igd_clear_slot function: > > if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) > return; > > Rebase. This removed an unnecessary header file from xen_pt.h > > hw/i386/pc_piix.c | 3 +++ > hw/xen/xen_pt.c | 49 ++++++++++++++++++++++++++++++++++++-------- > hw/xen/xen_pt.h | 16 +++++++++++++++ > hw/xen/xen_pt_stub.c | 4 ++++ > 4 files changed, 63 insertions(+), 9 deletions(-) > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c > index b48047f50c..bc5efa4f59 100644 > --- a/hw/i386/pc_piix.c > +++ b/hw/i386/pc_piix.c > @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine) > } > > pc_xen_hvm_init_pci(machine); > + if (xen_igd_gfx_pt_enabled()) { > + xen_igd_reserve_slot(pcms->bus); > + } > pci_create_simple(pcms->bus, -1, "xen-platform"); > } > #endif I would even maybe go further and move the whole logic into xen_igd_reserve_slot. And I would even just name it xen_hvm_init_reserved_slots without worrying about the what or why at the pc level. At this point it will be up to Xen maintainers. > diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c > index 0ec7e52183..eff38155ef 100644 > --- a/hw/xen/xen_pt.c > +++ b/hw/xen/xen_pt.c > @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp) > s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, > s->dev.devfn); > > - xen_host_pci_device_get(&s->real_device, > - s->hostaddr.domain, s->hostaddr.bus, > - s->hostaddr.slot, s->hostaddr.function, > - errp); > - if (*errp) { > - error_append_hint(errp, "Failed to \"open\" the real pci device"); > - return; > - } > - > s->is_virtfn = s->real_device.is_virtfn; > if (s->is_virtfn) { > XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n", > @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj) > PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS; > } > > +void xen_igd_reserve_slot(PCIBus *pci_bus) > +{ > + XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n"); > + pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK; > +} > + > +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp) > +{ > + ERRP_GUARD(); > + PCIDevice *pci_dev = (PCIDevice *)qdev; > + XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev); > + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s); > + PCIBus *pci_bus = pci_get_bus(pci_dev); > + > + if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) > + return; > + > + xen_host_pci_device_get(&s->real_device, > + s->hostaddr.domain, s->hostaddr.bus, > + s->hostaddr.slot, s->hostaddr.function, > + errp); > + if (*errp) { > + error_append_hint(errp, "Failed to \"open\" the real pci device"); > + return; > + } > + > + if (is_igd_vga_passthrough(&s->real_device) && > + s->real_device.domain == 0 && s->real_device.bus == 0 && > + s->real_device.dev == 2 && s->real_device.func == 0 && > + s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { how about macros for these? #define XEN_PCI_IGD_DOMAIN 0 #define XEN_PCI_IGD_BUS 0 #define XEN_PCI_IGD_DEV 2 #define XEN_PCI_IGD_FN 0 > + pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK; If you are going to do this, you should set it back either after pci_qdev_realize or in unrealize, for symmetry. > + XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n"); > + } > + xpdc->pci_qdev_realize(qdev, errp); > +} > + > static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data) > { > DeviceClass *dc = DEVICE_CLASS(klass); > PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); > > + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass); > + xpdc->pci_qdev_realize = dc->realize; > + dc->realize = xen_igd_clear_slot; > k->realize = xen_pt_realize; > k->exit = xen_pt_unregister_device; > k->config_read = xen_pt_pci_read_config; > @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = { > .instance_size = sizeof(XenPCIPassthroughState), > .instance_finalize = xen_pci_passthrough_finalize, > .class_init = xen_pci_passthrough_class_init, > + .class_size = sizeof(XenPTDeviceClass), > .instance_init = xen_pci_passthrough_instance_init, > .interfaces = (InterfaceInfo[]) { > { INTERFACE_CONVENTIONAL_PCI_DEVICE }, > diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h > index cf10fc7bbf..8c25932b4b 100644 > --- a/hw/xen/xen_pt.h > +++ b/hw/xen/xen_pt.h > @@ -2,6 +2,7 @@ > #define XEN_PT_H > > #include "hw/xen/xen_common.h" > +#include "hw/pci/pci_bus.h" > #include "xen-host-pci-device.h" > #include "qom/object.h" > > @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg; > #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough" > OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE) > > +#define XEN_PT_DEVICE_CLASS(klass) \ > + OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE) > +#define XEN_PT_DEVICE_GET_CLASS(obj) \ > + OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE) > + > +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp); > + > +typedef struct XenPTDeviceClass { > + PCIDeviceClass parent_class; > + XenPTQdevRealize pci_qdev_realize; > +} XenPTDeviceClass; > + > uint32_t igd_read_opregion(XenPCIPassthroughState *s); > +void xen_igd_reserve_slot(PCIBus *pci_bus); > void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val); > void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s, > XenHostPCIDevice *dev); > @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read) > > #define XEN_PCI_INTEL_OPREGION 0xfc > > +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */ > + I think you want to calculate this based on dev fn: #define XEN_PCI_IGD_SLOT_MASK \ (0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN))) > typedef enum { > XEN_PT_GRP_TYPE_HARDWIRED = 0, /* 0 Hardwired reg group */ > XEN_PT_GRP_TYPE_EMU, /* emul reg group */ > diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c > index 2d8cac8d54..5c108446a8 100644 > --- a/hw/xen/xen_pt_stub.c > +++ b/hw/xen/xen_pt_stub.c > @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp) > error_setg(errp, "Xen PCI passthrough support not built in"); > } > } > + > +void xen_igd_reserve_slot(PCIBus *pci_bus) > +{ > +} > -- > 2.39.0
On 1/10/2023 3:16 AM, Michael S. Tsirkin wrote: > On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote: > > Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus, > > as noted in docs/igd-assign.txt in the Qemu source code. > > > > Currently, when the xl toolstack is used to configure a Xen HVM guest with > > Intel IGD passthrough to the guest with the Qemu upstream device model, > > a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy > > a different slot. This problem often prevents the guest from booting. > > > > The only available workaround is not good: Configure Xen HVM guests to use > > the old and no longer maintained Qemu traditional device model available > > from xenbits.xen.org which does reserve slot 2 for the Intel IGD. > > > > To implement this feature in the Qemu upstream device model for Xen HVM > > guests, introduce the following new functions, types, and macros: > > > > * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE > > * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS > > * typedef XenPTQdevRealize function pointer > > * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2 > > * xen_igd_reserve_slot and xen_igd_clear_slot functions > > > > The new xen_igd_reserve_slot function uses the existing slot_reserved_mask > > member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using > > the xl toolstack with the gfx_passthru option enabled, which sets the > > igd-passthru=on option to Qemu for the Xen HVM machine type. > > > > The new xen_igd_reserve_slot function also needs to be implemented in > > hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case > > when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough, > > in which case it does nothing. > > > > The new xen_igd_clear_slot function overrides qdev->realize of the parent > > PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus > > since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was > > created in hw/i386/pc_piix.c for the case when igd-passthru=on. > > > > Move the call to xen_host_pci_device_get, and the associated error > > handling, from xen_pt_realize to the new xen_igd_clear_slot function to > > initialize the device class and vendor values which enables the checks for > > the Intel IGD to succeed. The verification that the host device is an > > Intel IGD to be passed through is done by checking the domain, bus, slot, > > and function values as well as by checking that gfx_passthru is enabled, > > the device class is VGA, and the device vendor in Intel. > > > > Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> > > --- > > Notes that might be helpful to reviewers of patched code in hw/xen: > > > > The new functions and types are based on recommendations from Qemu docs: > > https://qemu.readthedocs.io/en/latest/devel/qom.html > > > > Notes that might be helpful to reviewers of patched code in hw/i386: > > > > The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does > > not affect builds that do not have CONFIG_XEN defined. > > > > xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an > > existing function that is only true when Qemu is built with > > xen-pci-passthrough enabled and the administrator has configured the Xen > > HVM guest with Qemu's igd-passthru=on option. > > > > v2: Remove From: <email address> tag at top of commit message > > > > v3: Changed the test for the Intel IGD in xen_igd_clear_slot: > > > > if (is_igd_vga_passthrough(&s->real_device) && > > (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) { > > > > is changed to > > > > if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) > > && (s->hostaddr.function == 0)) { > > > > I hoped that I could use the test in v2, since it matches the > > other tests for the Intel IGD in Qemu and Xen, but those tests > > do not work because the necessary data structures are not set with > > their values yet. So instead use the test that the administrator > > has enabled gfx_passthru and the device address on the host is > > 02.0. This test does detect the Intel IGD correctly. > > > > v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's > > email address to match the address used by the same author in commits > > be9c61da and c0e86b76 > > > > Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc > > > > v5: The patch of xen_pt.c was re-worked to allow a more consistent test > > for the Intel IGD that uses the same criteria as in other places. > > This involved moving the call to xen_host_pci_device_get from > > xen_pt_realize to xen_igd_clear_slot and updating the checks for the > > Intel IGD in xen_igd_clear_slot: > > > > if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) > > && (s->hostaddr.function == 0)) { > > > > is changed to > > > > if (is_igd_vga_passthrough(&s->real_device) && > > s->real_device.domain == 0 && s->real_device.bus == 0 && > > s->real_device.dev == 2 && s->real_device.func == 0 && > > s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { > > > > Added an explanation for the move of xen_host_pci_device_get from > > xen_pt_realize to xen_igd_clear_slot to the commit message. > > > > Rebase. > > > > v6: Fix logging by removing these lines from the move from xen_pt_realize > > to xen_igd_clear_slot that was done in v5: > > > > XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d" > > " to devfn 0x%x\n", > > s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, > > s->dev.devfn); > > > > This log needs to be in xen_pt_realize because s->dev.devfn is not > > set yet in xen_igd_clear_slot. > > > > v7: The v7 that was posted to the mailing list was incorrect. v8 is what > > v7 was intended to be. > > > > v8: Inhibit out of context log message and needless processing by > > adding 2 lines at the top of the new xen_igd_clear_slot function: > > > > if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) > > return; > > > > Rebase. This removed an unnecessary header file from xen_pt.h > > > > hw/i386/pc_piix.c | 3 +++ > > hw/xen/xen_pt.c | 49 ++++++++++++++++++++++++++++++++++++-------- > > hw/xen/xen_pt.h | 16 +++++++++++++++ > > hw/xen/xen_pt_stub.c | 4 ++++ > > 4 files changed, 63 insertions(+), 9 deletions(-) > > > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c > > index b48047f50c..bc5efa4f59 100644 > > --- a/hw/i386/pc_piix.c > > +++ b/hw/i386/pc_piix.c > > @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine) > > } > > > > pc_xen_hvm_init_pci(machine); > > + if (xen_igd_gfx_pt_enabled()) { > > + xen_igd_reserve_slot(pcms->bus); > > + } > > pci_create_simple(pcms->bus, -1, "xen-platform"); > > } > > #endif > > I would even maybe go further and move the whole logic into > xen_igd_reserve_slot. And I would even just name it > xen_hvm_init_reserved_slots without worrying about the what > or why at the pc level. At this point it will be up to Xen maintainers. I will try to do that for v9. That would reduce, rather than increase, the technical debt. I actually wanted to move all the xen specific stuff in pc_piix.c to xen-hvm.c. That would reduce the technical debt that has accumulated over the years. I just couldn't see how to do it easily. It looked I would need to do violence to pc_init1. But, I suppose, pc_init1 is the kind of function that has accumulated lots of technical debt and needs some violence done to it! I will give it a try, but it might take me a while. I think if it was easy, someone would have done it by now. Thanks, Chuck > > > diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c > > index 0ec7e52183..eff38155ef 100644 > > --- a/hw/xen/xen_pt.c > > +++ b/hw/xen/xen_pt.c > > @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp) > > s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, > > s->dev.devfn); > > > > - xen_host_pci_device_get(&s->real_device, > > - s->hostaddr.domain, s->hostaddr.bus, > > - s->hostaddr.slot, s->hostaddr.function, > > - errp); > > - if (*errp) { > > - error_append_hint(errp, "Failed to \"open\" the real pci device"); > > - return; > > - } > > - > > s->is_virtfn = s->real_device.is_virtfn; > > if (s->is_virtfn) { > > XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n", > > @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj) > > PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS; > > } > > > > +void xen_igd_reserve_slot(PCIBus *pci_bus) > > +{ > > + XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n"); > > + pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK; > > +} > > + > > +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp) > > +{ > > + ERRP_GUARD(); > > + PCIDevice *pci_dev = (PCIDevice *)qdev; > > + XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev); > > + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s); > > + PCIBus *pci_bus = pci_get_bus(pci_dev); > > + > > + if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) > > + return; > > + > > + xen_host_pci_device_get(&s->real_device, > > + s->hostaddr.domain, s->hostaddr.bus, > > + s->hostaddr.slot, s->hostaddr.function, > > + errp); > > + if (*errp) { > > + error_append_hint(errp, "Failed to \"open\" the real pci device"); > > + return; > > + } > > + > > + if (is_igd_vga_passthrough(&s->real_device) && > > + s->real_device.domain == 0 && s->real_device.bus == 0 && > > + s->real_device.dev == 2 && s->real_device.func == 0 && > > + s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { > > how about macros for these? > > #define XEN_PCI_IGD_DOMAIN 0 > #define XEN_PCI_IGD_BUS 0 > #define XEN_PCI_IGD_DEV 2 > #define XEN_PCI_IGD_FN 0 > > > + pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK; > > If you are going to do this, you should set it back > either after pci_qdev_realize or in unrealize, > for symmetry. > > > + XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n"); > > + } > > > > + xpdc->pci_qdev_realize(qdev, errp); > > +} > > + > > > > > static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data) > > { > > DeviceClass *dc = DEVICE_CLASS(klass); > > PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); > > > > + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass); > > + xpdc->pci_qdev_realize = dc->realize; > > + dc->realize = xen_igd_clear_slot; > > k->realize = xen_pt_realize; > > k->exit = xen_pt_unregister_device; > > k->config_read = xen_pt_pci_read_config; > > @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = { > > .instance_size = sizeof(XenPCIPassthroughState), > > .instance_finalize = xen_pci_passthrough_finalize, > > .class_init = xen_pci_passthrough_class_init, > > + .class_size = sizeof(XenPTDeviceClass), > > .instance_init = xen_pci_passthrough_instance_init, > > .interfaces = (InterfaceInfo[]) { > > { INTERFACE_CONVENTIONAL_PCI_DEVICE }, > > diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h > > index cf10fc7bbf..8c25932b4b 100644 > > --- a/hw/xen/xen_pt.h > > +++ b/hw/xen/xen_pt.h > > @@ -2,6 +2,7 @@ > > #define XEN_PT_H > > > > #include "hw/xen/xen_common.h" > > +#include "hw/pci/pci_bus.h" > > #include "xen-host-pci-device.h" > > #include "qom/object.h" > > > > @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg; > > #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough" > > OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE) > > > > +#define XEN_PT_DEVICE_CLASS(klass) \ > > + OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE) > > +#define XEN_PT_DEVICE_GET_CLASS(obj) \ > > + OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE) > > + > > +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp); > > + > > +typedef struct XenPTDeviceClass { > > + PCIDeviceClass parent_class; > > + XenPTQdevRealize pci_qdev_realize; > > +} XenPTDeviceClass; > > + > > uint32_t igd_read_opregion(XenPCIPassthroughState *s); > > +void xen_igd_reserve_slot(PCIBus *pci_bus); > > void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val); > > void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s, > > XenHostPCIDevice *dev); > > @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read) > > > > #define XEN_PCI_INTEL_OPREGION 0xfc > > > > +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */ > > + > > I think you want to calculate this based on dev fn: > > #define XEN_PCI_IGD_SLOT_MASK \ > (0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN))) > > > > typedef enum { > > XEN_PT_GRP_TYPE_HARDWIRED = 0, /* 0 Hardwired reg group */ > > XEN_PT_GRP_TYPE_EMU, /* emul reg group */ > > diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c > > index 2d8cac8d54..5c108446a8 100644 > > --- a/hw/xen/xen_pt_stub.c > > +++ b/hw/xen/xen_pt_stub.c > > @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp) > > error_setg(errp, "Xen PCI passthrough support not built in"); > > } > > } > > + > > +void xen_igd_reserve_slot(PCIBus *pci_bus) > > +{ > > +} > > -- > > 2.39.0 >
On 1/10/23 3:16 AM, Michael S. Tsirkin wrote: > On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote: >> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus, >> as noted in docs/igd-assign.txt in the Qemu source code. >> >> Currently, when the xl toolstack is used to configure a Xen HVM guest with >> Intel IGD passthrough to the guest with the Qemu upstream device model, >> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy >> a different slot. This problem often prevents the guest from booting. >> >> The only available workaround is not good: Configure Xen HVM guests to use >> the old and no longer maintained Qemu traditional device model available >> from xenbits.xen.org which does reserve slot 2 for the Intel IGD. >> >> To implement this feature in the Qemu upstream device model for Xen HVM >> guests, introduce the following new functions, types, and macros: >> >> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE >> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS >> * typedef XenPTQdevRealize function pointer >> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2 >> * xen_igd_reserve_slot and xen_igd_clear_slot functions >> >> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask >> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using >> the xl toolstack with the gfx_passthru option enabled, which sets the >> igd-passthru=on option to Qemu for the Xen HVM machine type. >> >> The new xen_igd_reserve_slot function also needs to be implemented in >> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case >> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough, >> in which case it does nothing. >> >> The new xen_igd_clear_slot function overrides qdev->realize of the parent >> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus >> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was >> created in hw/i386/pc_piix.c for the case when igd-passthru=on. >> >> Move the call to xen_host_pci_device_get, and the associated error >> handling, from xen_pt_realize to the new xen_igd_clear_slot function to >> initialize the device class and vendor values which enables the checks for >> the Intel IGD to succeed. The verification that the host device is an >> Intel IGD to be passed through is done by checking the domain, bus, slot, >> and function values as well as by checking that gfx_passthru is enabled, >> the device class is VGA, and the device vendor in Intel. >> >> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> >> --- >> Notes that might be helpful to reviewers of patched code in hw/xen: >> >> The new functions and types are based on recommendations from Qemu docs: >> https://qemu.readthedocs.io/en/latest/devel/qom.html >> >> Notes that might be helpful to reviewers of patched code in hw/i386: >> >> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does >> not affect builds that do not have CONFIG_XEN defined. >> >> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an >> existing function that is only true when Qemu is built with >> xen-pci-passthrough enabled and the administrator has configured the Xen >> HVM guest with Qemu's igd-passthru=on option. >> >> v2: Remove From: <email address> tag at top of commit message >> >> v3: Changed the test for the Intel IGD in xen_igd_clear_slot: >> >> if (is_igd_vga_passthrough(&s->real_device) && >> (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) { >> >> is changed to >> >> if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) >> && (s->hostaddr.function == 0)) { >> >> I hoped that I could use the test in v2, since it matches the >> other tests for the Intel IGD in Qemu and Xen, but those tests >> do not work because the necessary data structures are not set with >> their values yet. So instead use the test that the administrator >> has enabled gfx_passthru and the device address on the host is >> 02.0. This test does detect the Intel IGD correctly. >> >> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's >> email address to match the address used by the same author in commits >> be9c61da and c0e86b76 >> >> Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc >> >> v5: The patch of xen_pt.c was re-worked to allow a more consistent test >> for the Intel IGD that uses the same criteria as in other places. >> This involved moving the call to xen_host_pci_device_get from >> xen_pt_realize to xen_igd_clear_slot and updating the checks for the >> Intel IGD in xen_igd_clear_slot: >> >> if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) >> && (s->hostaddr.function == 0)) { >> >> is changed to >> >> if (is_igd_vga_passthrough(&s->real_device) && >> s->real_device.domain == 0 && s->real_device.bus == 0 && >> s->real_device.dev == 2 && s->real_device.func == 0 && >> s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { >> >> Added an explanation for the move of xen_host_pci_device_get from >> xen_pt_realize to xen_igd_clear_slot to the commit message. >> >> Rebase. >> >> v6: Fix logging by removing these lines from the move from xen_pt_realize >> to xen_igd_clear_slot that was done in v5: >> >> XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d" >> " to devfn 0x%x\n", >> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, >> s->dev.devfn); >> >> This log needs to be in xen_pt_realize because s->dev.devfn is not >> set yet in xen_igd_clear_slot. >> >> v7: The v7 that was posted to the mailing list was incorrect. v8 is what >> v7 was intended to be. >> >> v8: Inhibit out of context log message and needless processing by >> adding 2 lines at the top of the new xen_igd_clear_slot function: >> >> if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) >> return; >> >> Rebase. This removed an unnecessary header file from xen_pt.h >> >> hw/i386/pc_piix.c | 3 +++ >> hw/xen/xen_pt.c | 49 ++++++++++++++++++++++++++++++++++++-------- >> hw/xen/xen_pt.h | 16 +++++++++++++++ >> hw/xen/xen_pt_stub.c | 4 ++++ >> 4 files changed, 63 insertions(+), 9 deletions(-) >> >> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c >> index b48047f50c..bc5efa4f59 100644 >> --- a/hw/i386/pc_piix.c >> +++ b/hw/i386/pc_piix.c >> @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine) >> } >> >> pc_xen_hvm_init_pci(machine); >> + if (xen_igd_gfx_pt_enabled()) { >> + xen_igd_reserve_slot(pcms->bus); >> + } >> pci_create_simple(pcms->bus, -1, "xen-platform"); >> } >> #endif > > I would even maybe go further and move the whole logic into > xen_igd_reserve_slot. And I would even just name it > xen_hvm_init_reserved_slots without worrying about the what > or why at the pc level. At this point it will be up to Xen maintainers. I see to do that would be to resolve the two pc_xen_hvm* functions in pc_piix.c that are guarded by CONFIG_XEN and move them to an appropriate place such as xen-hvm.c. That is along the lines of the work that Bernhard and Philippe are doing, so I am Cc'ing them. My first inclination is just to defer to them: I think eventually the little patch I propose here to pc_piix.c is eventually going to be moved out of pc_piix.c by Bernhard in a future patch. What they have been doing is very conservative, and I expect if and when Bernhard gets here to resolve those functions, they will do it in a way that keeps the dependency of the xenfv machine type on the pc machine type and the pc_init1 function. What I would propose would be to break the dependency of xenfv on the pc_init1 function. That is, I would propose having a xenfv_init function in xen-hvm.c, and the first version would be the current version of pc_init1, so xenfv would still depend on many i440fx type things, but with the change xen developers would be free to tweak xenfv_init without affecting the users of the pc machine type. Would that be a good idea? If I get posiive feedback for this idea, I will put it on the table, probably initially as an RFC patch. Also, thanks, Michael, for your other suggestions for this patch about using macros for the devfn constants. Chuck > >> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c >> index 0ec7e52183..eff38155ef 100644 >> --- a/hw/xen/xen_pt.c >> +++ b/hw/xen/xen_pt.c >> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp) >> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, >> s->dev.devfn); >> >> - xen_host_pci_device_get(&s->real_device, >> - s->hostaddr.domain, s->hostaddr.bus, >> - s->hostaddr.slot, s->hostaddr.function, >> - errp); >> - if (*errp) { >> - error_append_hint(errp, "Failed to \"open\" the real pci device"); >> - return; >> - } >> - >> s->is_virtfn = s->real_device.is_virtfn; >> if (s->is_virtfn) { >> XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n", >> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj) >> PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS; >> } >> >> +void xen_igd_reserve_slot(PCIBus *pci_bus) >> +{ >> + XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n"); >> + pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK; >> +} >> + >> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp) >> +{ >> + ERRP_GUARD(); >> + PCIDevice *pci_dev = (PCIDevice *)qdev; >> + XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev); >> + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s); >> + PCIBus *pci_bus = pci_get_bus(pci_dev); >> + >> + if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) >> + return; >> + >> + xen_host_pci_device_get(&s->real_device, >> + s->hostaddr.domain, s->hostaddr.bus, >> + s->hostaddr.slot, s->hostaddr.function, >> + errp); >> + if (*errp) { >> + error_append_hint(errp, "Failed to \"open\" the real pci device"); >> + return; >> + } >> + >> + if (is_igd_vga_passthrough(&s->real_device) && >> + s->real_device.domain == 0 && s->real_device.bus == 0 && >> + s->real_device.dev == 2 && s->real_device.func == 0 && >> + s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { > > how about macros for these? > > #define XEN_PCI_IGD_DOMAIN 0 > #define XEN_PCI_IGD_BUS 0 > #define XEN_PCI_IGD_DEV 2 > #define XEN_PCI_IGD_FN 0 > >> + pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK; > > If you are going to do this, you should set it back > either after pci_qdev_realize or in unrealize, > for symmetry. > >> + XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n"); >> + } > > >> + xpdc->pci_qdev_realize(qdev, errp); >> +} >> + > > > >> static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data) >> { >> DeviceClass *dc = DEVICE_CLASS(klass); >> PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); >> >> + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass); >> + xpdc->pci_qdev_realize = dc->realize; >> + dc->realize = xen_igd_clear_slot; >> k->realize = xen_pt_realize; >> k->exit = xen_pt_unregister_device; >> k->config_read = xen_pt_pci_read_config; >> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = { >> .instance_size = sizeof(XenPCIPassthroughState), >> .instance_finalize = xen_pci_passthrough_finalize, >> .class_init = xen_pci_passthrough_class_init, >> + .class_size = sizeof(XenPTDeviceClass), >> .instance_init = xen_pci_passthrough_instance_init, >> .interfaces = (InterfaceInfo[]) { >> { INTERFACE_CONVENTIONAL_PCI_DEVICE }, >> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h >> index cf10fc7bbf..8c25932b4b 100644 >> --- a/hw/xen/xen_pt.h >> +++ b/hw/xen/xen_pt.h >> @@ -2,6 +2,7 @@ >> #define XEN_PT_H >> >> #include "hw/xen/xen_common.h" >> +#include "hw/pci/pci_bus.h" >> #include "xen-host-pci-device.h" >> #include "qom/object.h" >> >> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg; >> #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough" >> OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE) >> >> +#define XEN_PT_DEVICE_CLASS(klass) \ >> + OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE) >> +#define XEN_PT_DEVICE_GET_CLASS(obj) \ >> + OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE) >> + >> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp); >> + >> +typedef struct XenPTDeviceClass { >> + PCIDeviceClass parent_class; >> + XenPTQdevRealize pci_qdev_realize; >> +} XenPTDeviceClass; >> + >> uint32_t igd_read_opregion(XenPCIPassthroughState *s); >> +void xen_igd_reserve_slot(PCIBus *pci_bus); >> void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val); >> void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s, >> XenHostPCIDevice *dev); >> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read) >> >> #define XEN_PCI_INTEL_OPREGION 0xfc >> >> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */ >> + > > I think you want to calculate this based on dev fn: > > #define XEN_PCI_IGD_SLOT_MASK \ > (0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN))) > > >> typedef enum { >> XEN_PT_GRP_TYPE_HARDWIRED = 0, /* 0 Hardwired reg group */ >> XEN_PT_GRP_TYPE_EMU, /* emul reg group */ >> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c >> index 2d8cac8d54..5c108446a8 100644 >> --- a/hw/xen/xen_pt_stub.c >> +++ b/hw/xen/xen_pt_stub.c >> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp) >> error_setg(errp, "Xen PCI passthrough support not built in"); >> } >> } >> + >> +void xen_igd_reserve_slot(PCIBus *pci_bus) >> +{ >> +} >> -- >> 2.39.0 >
Am 11. Januar 2023 15:40:24 UTC schrieb Chuck Zmudzinski <brchuckz@aol.com>: >On 1/10/23 3:16 AM, Michael S. Tsirkin wrote: >> On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote: >>> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus, >>> as noted in docs/igd-assign.txt in the Qemu source code. >>> >>> Currently, when the xl toolstack is used to configure a Xen HVM guest with >>> Intel IGD passthrough to the guest with the Qemu upstream device model, >>> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy >>> a different slot. This problem often prevents the guest from booting. >>> >>> The only available workaround is not good: Configure Xen HVM guests to use >>> the old and no longer maintained Qemu traditional device model available >>> from xenbits.xen.org which does reserve slot 2 for the Intel IGD. >>> >>> To implement this feature in the Qemu upstream device model for Xen HVM >>> guests, introduce the following new functions, types, and macros: >>> >>> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE >>> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS >>> * typedef XenPTQdevRealize function pointer >>> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2 >>> * xen_igd_reserve_slot and xen_igd_clear_slot functions >>> >>> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask >>> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using >>> the xl toolstack with the gfx_passthru option enabled, which sets the >>> igd-passthru=on option to Qemu for the Xen HVM machine type. >>> >>> The new xen_igd_reserve_slot function also needs to be implemented in >>> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case >>> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough, >>> in which case it does nothing. >>> >>> The new xen_igd_clear_slot function overrides qdev->realize of the parent >>> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus >>> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was >>> created in hw/i386/pc_piix.c for the case when igd-passthru=on. >>> >>> Move the call to xen_host_pci_device_get, and the associated error >>> handling, from xen_pt_realize to the new xen_igd_clear_slot function to >>> initialize the device class and vendor values which enables the checks for >>> the Intel IGD to succeed. The verification that the host device is an >>> Intel IGD to be passed through is done by checking the domain, bus, slot, >>> and function values as well as by checking that gfx_passthru is enabled, >>> the device class is VGA, and the device vendor in Intel. >>> >>> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> >>> --- >>> Notes that might be helpful to reviewers of patched code in hw/xen: >>> >>> The new functions and types are based on recommendations from Qemu docs: >>> https://qemu.readthedocs.io/en/latest/devel/qom.html >>> >>> Notes that might be helpful to reviewers of patched code in hw/i386: >>> >>> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does >>> not affect builds that do not have CONFIG_XEN defined. >>> >>> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an >>> existing function that is only true when Qemu is built with >>> xen-pci-passthrough enabled and the administrator has configured the Xen >>> HVM guest with Qemu's igd-passthru=on option. >>> >>> v2: Remove From: <email address> tag at top of commit message >>> >>> v3: Changed the test for the Intel IGD in xen_igd_clear_slot: >>> >>> if (is_igd_vga_passthrough(&s->real_device) && >>> (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) { >>> >>> is changed to >>> >>> if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) >>> && (s->hostaddr.function == 0)) { >>> >>> I hoped that I could use the test in v2, since it matches the >>> other tests for the Intel IGD in Qemu and Xen, but those tests >>> do not work because the necessary data structures are not set with >>> their values yet. So instead use the test that the administrator >>> has enabled gfx_passthru and the device address on the host is >>> 02.0. This test does detect the Intel IGD correctly. >>> >>> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's >>> email address to match the address used by the same author in commits >>> be9c61da and c0e86b76 >>> >>> Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc >>> >>> v5: The patch of xen_pt.c was re-worked to allow a more consistent test >>> for the Intel IGD that uses the same criteria as in other places. >>> This involved moving the call to xen_host_pci_device_get from >>> xen_pt_realize to xen_igd_clear_slot and updating the checks for the >>> Intel IGD in xen_igd_clear_slot: >>> >>> if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) >>> && (s->hostaddr.function == 0)) { >>> >>> is changed to >>> >>> if (is_igd_vga_passthrough(&s->real_device) && >>> s->real_device.domain == 0 && s->real_device.bus == 0 && >>> s->real_device.dev == 2 && s->real_device.func == 0 && >>> s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { >>> >>> Added an explanation for the move of xen_host_pci_device_get from >>> xen_pt_realize to xen_igd_clear_slot to the commit message. >>> >>> Rebase. >>> >>> v6: Fix logging by removing these lines from the move from xen_pt_realize >>> to xen_igd_clear_slot that was done in v5: >>> >>> XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d" >>> " to devfn 0x%x\n", >>> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, >>> s->dev.devfn); >>> >>> This log needs to be in xen_pt_realize because s->dev.devfn is not >>> set yet in xen_igd_clear_slot. >>> >>> v7: The v7 that was posted to the mailing list was incorrect. v8 is what >>> v7 was intended to be. >>> >>> v8: Inhibit out of context log message and needless processing by >>> adding 2 lines at the top of the new xen_igd_clear_slot function: >>> >>> if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) >>> return; >>> >>> Rebase. This removed an unnecessary header file from xen_pt.h >>> >>> hw/i386/pc_piix.c | 3 +++ >>> hw/xen/xen_pt.c | 49 ++++++++++++++++++++++++++++++++++++-------- >>> hw/xen/xen_pt.h | 16 +++++++++++++++ >>> hw/xen/xen_pt_stub.c | 4 ++++ >>> 4 files changed, 63 insertions(+), 9 deletions(-) >>> >>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c >>> index b48047f50c..bc5efa4f59 100644 >>> --- a/hw/i386/pc_piix.c >>> +++ b/hw/i386/pc_piix.c >>> @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine) >>> } >>> >>> pc_xen_hvm_init_pci(machine); >>> + if (xen_igd_gfx_pt_enabled()) { >>> + xen_igd_reserve_slot(pcms->bus); >>> + } >>> pci_create_simple(pcms->bus, -1, "xen-platform"); >>> } >>> #endif >> >> I would even maybe go further and move the whole logic into >> xen_igd_reserve_slot. And I would even just name it >> xen_hvm_init_reserved_slots without worrying about the what >> or why at the pc level. At this point it will be up to Xen maintainers. > >I see to do that would be to resolve the two pc_xen_hvm* >functions in pc_piix.c that are guarded by CONFIG_XEN and >move them to an appropriate place such as xen-hvm.c. > >That is along the lines of the work that Bernhard and Philippe >are doing, so I am Cc'ing them. My first inclination is just >to defer to them: I think eventually the little patch I propose >here to pc_piix.c is eventually going to be moved out of pc_piix.c >by Bernhard in a future patch. > >What they have been doing is very conservative, and I expect >if and when Bernhard gets here to resolve those functions, they >will do it in a way that keeps the dependency of the xenfv machine >type on the pc machine type and the pc_init1 function. > >What I would propose would be to break the dependency of xenfv >on the pc_init1 function. That is, I would propose having a >xenfv_init function in xen-hvm.c, and the first version would >be the current version of pc_init1, so xenfv would still depend >on many i440fx type things, but with the change xen developers >would be free to tweak xenfv_init without affecting the users >of the pc machine type. > >Would that be a good idea? If I get posiive feedback for this >idea, I will put it on the table, probably initially as an RFC >patch. In various patches I've been decoupling 1/ PIIX3 from Xen and 2/ QEMU's Xen integration code from the PC machine. My idea is to confine all wiring for a PIIX based PC machine using Xen in pc_piix.c. The pc_xen_hvm* functions seem to do exactly that, so I'd leave them there, at least for now. What I would like to avoid is for the Xen integration code to make assumptions that an x86 or PC machine is always based on i440fx or PIIX3. I like Michael's idea of going one step further, both in terms of the approach and the reasoning. Best regards, Bernhard >Also, thanks, Michael, for your other suggestions for this patch >about using macros for the devfn constants. > >Chuck > >> >>> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c >>> index 0ec7e52183..eff38155ef 100644 >>> --- a/hw/xen/xen_pt.c >>> +++ b/hw/xen/xen_pt.c >>> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp) >>> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, >>> s->dev.devfn); >>> >>> - xen_host_pci_device_get(&s->real_device, >>> - s->hostaddr.domain, s->hostaddr.bus, >>> - s->hostaddr.slot, s->hostaddr.function, >>> - errp); >>> - if (*errp) { >>> - error_append_hint(errp, "Failed to \"open\" the real pci device"); >>> - return; >>> - } >>> - >>> s->is_virtfn = s->real_device.is_virtfn; >>> if (s->is_virtfn) { >>> XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n", >>> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj) >>> PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS; >>> } >>> >>> +void xen_igd_reserve_slot(PCIBus *pci_bus) >>> +{ >>> + XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n"); >>> + pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK; >>> +} >>> + >>> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp) >>> +{ >>> + ERRP_GUARD(); >>> + PCIDevice *pci_dev = (PCIDevice *)qdev; >>> + XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev); >>> + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s); >>> + PCIBus *pci_bus = pci_get_bus(pci_dev); >>> + >>> + if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) >>> + return; >>> + >>> + xen_host_pci_device_get(&s->real_device, >>> + s->hostaddr.domain, s->hostaddr.bus, >>> + s->hostaddr.slot, s->hostaddr.function, >>> + errp); >>> + if (*errp) { >>> + error_append_hint(errp, "Failed to \"open\" the real pci device"); >>> + return; >>> + } >>> + >>> + if (is_igd_vga_passthrough(&s->real_device) && >>> + s->real_device.domain == 0 && s->real_device.bus == 0 && >>> + s->real_device.dev == 2 && s->real_device.func == 0 && >>> + s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { >> >> how about macros for these? >> >> #define XEN_PCI_IGD_DOMAIN 0 >> #define XEN_PCI_IGD_BUS 0 >> #define XEN_PCI_IGD_DEV 2 >> #define XEN_PCI_IGD_FN 0 >> >>> + pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK; >> >> If you are going to do this, you should set it back >> either after pci_qdev_realize or in unrealize, >> for symmetry. >> >>> + XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n"); >>> + } >> >> >>> + xpdc->pci_qdev_realize(qdev, errp); >>> +} >>> + >> >> >> >>> static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data) >>> { >>> DeviceClass *dc = DEVICE_CLASS(klass); >>> PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); >>> >>> + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass); >>> + xpdc->pci_qdev_realize = dc->realize; >>> + dc->realize = xen_igd_clear_slot; >>> k->realize = xen_pt_realize; >>> k->exit = xen_pt_unregister_device; >>> k->config_read = xen_pt_pci_read_config; >>> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = { >>> .instance_size = sizeof(XenPCIPassthroughState), >>> .instance_finalize = xen_pci_passthrough_finalize, >>> .class_init = xen_pci_passthrough_class_init, >>> + .class_size = sizeof(XenPTDeviceClass), >>> .instance_init = xen_pci_passthrough_instance_init, >>> .interfaces = (InterfaceInfo[]) { >>> { INTERFACE_CONVENTIONAL_PCI_DEVICE }, >>> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h >>> index cf10fc7bbf..8c25932b4b 100644 >>> --- a/hw/xen/xen_pt.h >>> +++ b/hw/xen/xen_pt.h >>> @@ -2,6 +2,7 @@ >>> #define XEN_PT_H >>> >>> #include "hw/xen/xen_common.h" >>> +#include "hw/pci/pci_bus.h" >>> #include "xen-host-pci-device.h" >>> #include "qom/object.h" >>> >>> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg; >>> #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough" >>> OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE) >>> >>> +#define XEN_PT_DEVICE_CLASS(klass) \ >>> + OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE) >>> +#define XEN_PT_DEVICE_GET_CLASS(obj) \ >>> + OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE) >>> + >>> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp); >>> + >>> +typedef struct XenPTDeviceClass { >>> + PCIDeviceClass parent_class; >>> + XenPTQdevRealize pci_qdev_realize; >>> +} XenPTDeviceClass; >>> + >>> uint32_t igd_read_opregion(XenPCIPassthroughState *s); >>> +void xen_igd_reserve_slot(PCIBus *pci_bus); >>> void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val); >>> void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s, >>> XenHostPCIDevice *dev); >>> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read) >>> >>> #define XEN_PCI_INTEL_OPREGION 0xfc >>> >>> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */ >>> + >> >> I think you want to calculate this based on dev fn: >> >> #define XEN_PCI_IGD_SLOT_MASK \ >> (0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN))) >> >> >>> typedef enum { >>> XEN_PT_GRP_TYPE_HARDWIRED = 0, /* 0 Hardwired reg group */ >>> XEN_PT_GRP_TYPE_EMU, /* emul reg group */ >>> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c >>> index 2d8cac8d54..5c108446a8 100644 >>> --- a/hw/xen/xen_pt_stub.c >>> +++ b/hw/xen/xen_pt_stub.c >>> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp) >>> error_setg(errp, "Xen PCI passthrough support not built in"); >>> } >>> } >>> + >>> +void xen_igd_reserve_slot(PCIBus *pci_bus) >>> +{ >>> +} >>> -- >>> 2.39.0 >> >
On 1/12/23 2:18 PM, Bernhard Beschow wrote: > > > Am 11. Januar 2023 15:40:24 UTC schrieb Chuck Zmudzinski <brchuckz@aol.com>: >>On 1/10/23 3:16 AM, Michael S. Tsirkin wrote: >>> On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote: >>>> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus, >>>> as noted in docs/igd-assign.txt in the Qemu source code. >>>> >>>> Currently, when the xl toolstack is used to configure a Xen HVM guest with >>>> Intel IGD passthrough to the guest with the Qemu upstream device model, >>>> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy >>>> a different slot. This problem often prevents the guest from booting. >>>> >>>> The only available workaround is not good: Configure Xen HVM guests to use >>>> the old and no longer maintained Qemu traditional device model available >>>> from xenbits.xen.org which does reserve slot 2 for the Intel IGD. >>>> >>>> To implement this feature in the Qemu upstream device model for Xen HVM >>>> guests, introduce the following new functions, types, and macros: >>>> >>>> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE >>>> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS >>>> * typedef XenPTQdevRealize function pointer >>>> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2 >>>> * xen_igd_reserve_slot and xen_igd_clear_slot functions >>>> >>>> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask >>>> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using >>>> the xl toolstack with the gfx_passthru option enabled, which sets the >>>> igd-passthru=on option to Qemu for the Xen HVM machine type. >>>> >>>> The new xen_igd_reserve_slot function also needs to be implemented in >>>> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case >>>> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough, >>>> in which case it does nothing. >>>> >>>> The new xen_igd_clear_slot function overrides qdev->realize of the parent >>>> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus >>>> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was >>>> created in hw/i386/pc_piix.c for the case when igd-passthru=on. >>>> >>>> Move the call to xen_host_pci_device_get, and the associated error >>>> handling, from xen_pt_realize to the new xen_igd_clear_slot function to >>>> initialize the device class and vendor values which enables the checks for >>>> the Intel IGD to succeed. The verification that the host device is an >>>> Intel IGD to be passed through is done by checking the domain, bus, slot, >>>> and function values as well as by checking that gfx_passthru is enabled, >>>> the device class is VGA, and the device vendor in Intel. >>>> >>>> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> >>>> --- >>>> Notes that might be helpful to reviewers of patched code in hw/xen: >>>> >>>> The new functions and types are based on recommendations from Qemu docs: >>>> https://qemu.readthedocs.io/en/latest/devel/qom.html >>>> >>>> Notes that might be helpful to reviewers of patched code in hw/i386: >>>> >>>> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does >>>> not affect builds that do not have CONFIG_XEN defined. >>>> >>>> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an >>>> existing function that is only true when Qemu is built with >>>> xen-pci-passthrough enabled and the administrator has configured the Xen >>>> HVM guest with Qemu's igd-passthru=on option. >>>> >>>> v2: Remove From: <email address> tag at top of commit message >>>> >>>> v3: Changed the test for the Intel IGD in xen_igd_clear_slot: >>>> >>>> if (is_igd_vga_passthrough(&s->real_device) && >>>> (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) { >>>> >>>> is changed to >>>> >>>> if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) >>>> && (s->hostaddr.function == 0)) { >>>> >>>> I hoped that I could use the test in v2, since it matches the >>>> other tests for the Intel IGD in Qemu and Xen, but those tests >>>> do not work because the necessary data structures are not set with >>>> their values yet. So instead use the test that the administrator >>>> has enabled gfx_passthru and the device address on the host is >>>> 02.0. This test does detect the Intel IGD correctly. >>>> >>>> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's >>>> email address to match the address used by the same author in commits >>>> be9c61da and c0e86b76 >>>> >>>> Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc >>>> >>>> v5: The patch of xen_pt.c was re-worked to allow a more consistent test >>>> for the Intel IGD that uses the same criteria as in other places. >>>> This involved moving the call to xen_host_pci_device_get from >>>> xen_pt_realize to xen_igd_clear_slot and updating the checks for the >>>> Intel IGD in xen_igd_clear_slot: >>>> >>>> if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) >>>> && (s->hostaddr.function == 0)) { >>>> >>>> is changed to >>>> >>>> if (is_igd_vga_passthrough(&s->real_device) && >>>> s->real_device.domain == 0 && s->real_device.bus == 0 && >>>> s->real_device.dev == 2 && s->real_device.func == 0 && >>>> s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { >>>> >>>> Added an explanation for the move of xen_host_pci_device_get from >>>> xen_pt_realize to xen_igd_clear_slot to the commit message. >>>> >>>> Rebase. >>>> >>>> v6: Fix logging by removing these lines from the move from xen_pt_realize >>>> to xen_igd_clear_slot that was done in v5: >>>> >>>> XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d" >>>> " to devfn 0x%x\n", >>>> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, >>>> s->dev.devfn); >>>> >>>> This log needs to be in xen_pt_realize because s->dev.devfn is not >>>> set yet in xen_igd_clear_slot. >>>> >>>> v7: The v7 that was posted to the mailing list was incorrect. v8 is what >>>> v7 was intended to be. >>>> >>>> v8: Inhibit out of context log message and needless processing by >>>> adding 2 lines at the top of the new xen_igd_clear_slot function: >>>> >>>> if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) >>>> return; >>>> >>>> Rebase. This removed an unnecessary header file from xen_pt.h >>>> >>>> hw/i386/pc_piix.c | 3 +++ >>>> hw/xen/xen_pt.c | 49 ++++++++++++++++++++++++++++++++++++-------- >>>> hw/xen/xen_pt.h | 16 +++++++++++++++ >>>> hw/xen/xen_pt_stub.c | 4 ++++ >>>> 4 files changed, 63 insertions(+), 9 deletions(-) >>>> >>>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c >>>> index b48047f50c..bc5efa4f59 100644 >>>> --- a/hw/i386/pc_piix.c >>>> +++ b/hw/i386/pc_piix.c >>>> @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine) >>>> } >>>> >>>> pc_xen_hvm_init_pci(machine); >>>> + if (xen_igd_gfx_pt_enabled()) { >>>> + xen_igd_reserve_slot(pcms->bus); >>>> + } >>>> pci_create_simple(pcms->bus, -1, "xen-platform"); >>>> } >>>> #endif >>> >>> I would even maybe go further and move the whole logic into >>> xen_igd_reserve_slot. And I would even just name it >>> xen_hvm_init_reserved_slots without worrying about the what >>> or why at the pc level. At this point it will be up to Xen maintainers. >> >>I see to do that would be to resolve the two pc_xen_hvm* >>functions in pc_piix.c that are guarded by CONFIG_XEN and >>move them to an appropriate place such as xen-hvm.c. >> >>That is along the lines of the work that Bernhard and Philippe >>are doing, so I am Cc'ing them. My first inclination is just >>to defer to them: I think eventually the little patch I propose >>here to pc_piix.c is eventually going to be moved out of pc_piix.c >>by Bernhard in a future patch. >> >>What they have been doing is very conservative, and I expect >>if and when Bernhard gets here to resolve those functions, they >>will do it in a way that keeps the dependency of the xenfv machine >>type on the pc machine type and the pc_init1 function. >> >>What I would propose would be to break the dependency of xenfv >>on the pc_init1 function. That is, I would propose having a >>xenfv_init function in xen-hvm.c, and the first version would >>be the current version of pc_init1, so xenfv would still depend >>on many i440fx type things, but with the change xen developers >>would be free to tweak xenfv_init without affecting the users >>of the pc machine type. >> >>Would that be a good idea? If I get posiive feedback for this >>idea, I will put it on the table, probably initially as an RFC >>patch. > > In various patches I've been decoupling 1/ PIIX3 from Xen and 2/ QEMU's Xen integration code from the PC machine. My idea is to confine all wiring for a PIIX based PC machine using Xen in pc_piix.c. The pc_xen_hvm* functions seem to do exactly that, so I'd leave them there, at least for now. > > What I would like to avoid is for the Xen integration code to make assumptions that an x86 or PC machine is always based on i440fx or PIIX3. I think what you are saying is that if I try to move the logic of my patch to xen-hvm.c, as Michael suggests, I should not move or copy any piix3 code to the Xen integration code, but access it via an appropriate qom interface to the code in pc_piix.c and only move Xen specific things to the Xen integration code such as the content of my patch. I can try to do that for a v9 of my patch. It might take me a little while (I am not a professional coder), so I will just leave v8 of my patch as is for now until I have a patch ready to move it out of pc_piix.c the qom way. Thanks, Chuck > > I like Michael's idea of going one step further, both in terms of the approach and the reasoning. > > Best regards, > Bernhard > >>Also, thanks, Michael, for your other suggestions for this patch >>about using macros for the devfn constants. >> >>Chuck >> >>> >>>> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c >>>> index 0ec7e52183..eff38155ef 100644 >>>> --- a/hw/xen/xen_pt.c >>>> +++ b/hw/xen/xen_pt.c >>>> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp) >>>> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, >>>> s->dev.devfn); >>>> >>>> - xen_host_pci_device_get(&s->real_device, >>>> - s->hostaddr.domain, s->hostaddr.bus, >>>> - s->hostaddr.slot, s->hostaddr.function, >>>> - errp); >>>> - if (*errp) { >>>> - error_append_hint(errp, "Failed to \"open\" the real pci device"); >>>> - return; >>>> - } >>>> - >>>> s->is_virtfn = s->real_device.is_virtfn; >>>> if (s->is_virtfn) { >>>> XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n", >>>> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj) >>>> PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS; >>>> } >>>> >>>> +void xen_igd_reserve_slot(PCIBus *pci_bus) >>>> +{ >>>> + XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n"); >>>> + pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK; >>>> +} >>>> + >>>> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp) >>>> +{ >>>> + ERRP_GUARD(); >>>> + PCIDevice *pci_dev = (PCIDevice *)qdev; >>>> + XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev); >>>> + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s); >>>> + PCIBus *pci_bus = pci_get_bus(pci_dev); >>>> + >>>> + if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) >>>> + return; >>>> + >>>> + xen_host_pci_device_get(&s->real_device, >>>> + s->hostaddr.domain, s->hostaddr.bus, >>>> + s->hostaddr.slot, s->hostaddr.function, >>>> + errp); >>>> + if (*errp) { >>>> + error_append_hint(errp, "Failed to \"open\" the real pci device"); >>>> + return; >>>> + } >>>> + >>>> + if (is_igd_vga_passthrough(&s->real_device) && >>>> + s->real_device.domain == 0 && s->real_device.bus == 0 && >>>> + s->real_device.dev == 2 && s->real_device.func == 0 && >>>> + s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { >>> >>> how about macros for these? >>> >>> #define XEN_PCI_IGD_DOMAIN 0 >>> #define XEN_PCI_IGD_BUS 0 >>> #define XEN_PCI_IGD_DEV 2 >>> #define XEN_PCI_IGD_FN 0 >>> >>>> + pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK; >>> >>> If you are going to do this, you should set it back >>> either after pci_qdev_realize or in unrealize, >>> for symmetry. >>> >>>> + XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n"); >>>> + } >>> >>> >>>> + xpdc->pci_qdev_realize(qdev, errp); >>>> +} >>>> + >>> >>> >>> >>>> static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data) >>>> { >>>> DeviceClass *dc = DEVICE_CLASS(klass); >>>> PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); >>>> >>>> + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass); >>>> + xpdc->pci_qdev_realize = dc->realize; >>>> + dc->realize = xen_igd_clear_slot; >>>> k->realize = xen_pt_realize; >>>> k->exit = xen_pt_unregister_device; >>>> k->config_read = xen_pt_pci_read_config; >>>> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = { >>>> .instance_size = sizeof(XenPCIPassthroughState), >>>> .instance_finalize = xen_pci_passthrough_finalize, >>>> .class_init = xen_pci_passthrough_class_init, >>>> + .class_size = sizeof(XenPTDeviceClass), >>>> .instance_init = xen_pci_passthrough_instance_init, >>>> .interfaces = (InterfaceInfo[]) { >>>> { INTERFACE_CONVENTIONAL_PCI_DEVICE }, >>>> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h >>>> index cf10fc7bbf..8c25932b4b 100644 >>>> --- a/hw/xen/xen_pt.h >>>> +++ b/hw/xen/xen_pt.h >>>> @@ -2,6 +2,7 @@ >>>> #define XEN_PT_H >>>> >>>> #include "hw/xen/xen_common.h" >>>> +#include "hw/pci/pci_bus.h" >>>> #include "xen-host-pci-device.h" >>>> #include "qom/object.h" >>>> >>>> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg; >>>> #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough" >>>> OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE) >>>> >>>> +#define XEN_PT_DEVICE_CLASS(klass) \ >>>> + OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE) >>>> +#define XEN_PT_DEVICE_GET_CLASS(obj) \ >>>> + OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE) >>>> + >>>> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp); >>>> + >>>> +typedef struct XenPTDeviceClass { >>>> + PCIDeviceClass parent_class; >>>> + XenPTQdevRealize pci_qdev_realize; >>>> +} XenPTDeviceClass; >>>> + >>>> uint32_t igd_read_opregion(XenPCIPassthroughState *s); >>>> +void xen_igd_reserve_slot(PCIBus *pci_bus); >>>> void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val); >>>> void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s, >>>> XenHostPCIDevice *dev); >>>> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read) >>>> >>>> #define XEN_PCI_INTEL_OPREGION 0xfc >>>> >>>> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */ >>>> + >>> >>> I think you want to calculate this based on dev fn: >>> >>> #define XEN_PCI_IGD_SLOT_MASK \ >>> (0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN))) >>> >>> >>>> typedef enum { >>>> XEN_PT_GRP_TYPE_HARDWIRED = 0, /* 0 Hardwired reg group */ >>>> XEN_PT_GRP_TYPE_EMU, /* emul reg group */ >>>> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c >>>> index 2d8cac8d54..5c108446a8 100644 >>>> --- a/hw/xen/xen_pt_stub.c >>>> +++ b/hw/xen/xen_pt_stub.c >>>> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp) >>>> error_setg(errp, "Xen PCI passthrough support not built in"); >>>> } >>>> } >>>> + >>>> +void xen_igd_reserve_slot(PCIBus *pci_bus) >>>> +{ >>>> +} >>>> -- >>>> 2.39.0 >>> >>
Am 12. Januar 2023 20:11:54 UTC schrieb Chuck Zmudzinski <brchuckz@aol.com>: >On 1/12/23 2:18 PM, Bernhard Beschow wrote: >> >> >> Am 11. Januar 2023 15:40:24 UTC schrieb Chuck Zmudzinski <brchuckz@aol.com>: >>>On 1/10/23 3:16 AM, Michael S. Tsirkin wrote: >>>> On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote: >>>>> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus, >>>>> as noted in docs/igd-assign.txt in the Qemu source code. >>>>> >>>>> Currently, when the xl toolstack is used to configure a Xen HVM guest with >>>>> Intel IGD passthrough to the guest with the Qemu upstream device model, >>>>> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy >>>>> a different slot. This problem often prevents the guest from booting. >>>>> >>>>> The only available workaround is not good: Configure Xen HVM guests to use >>>>> the old and no longer maintained Qemu traditional device model available >>>>> from xenbits.xen.org which does reserve slot 2 for the Intel IGD. >>>>> >>>>> To implement this feature in the Qemu upstream device model for Xen HVM >>>>> guests, introduce the following new functions, types, and macros: >>>>> >>>>> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE >>>>> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS >>>>> * typedef XenPTQdevRealize function pointer >>>>> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2 >>>>> * xen_igd_reserve_slot and xen_igd_clear_slot functions >>>>> >>>>> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask >>>>> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using >>>>> the xl toolstack with the gfx_passthru option enabled, which sets the >>>>> igd-passthru=on option to Qemu for the Xen HVM machine type. >>>>> >>>>> The new xen_igd_reserve_slot function also needs to be implemented in >>>>> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case >>>>> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough, >>>>> in which case it does nothing. >>>>> >>>>> The new xen_igd_clear_slot function overrides qdev->realize of the parent >>>>> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus >>>>> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was >>>>> created in hw/i386/pc_piix.c for the case when igd-passthru=on. >>>>> >>>>> Move the call to xen_host_pci_device_get, and the associated error >>>>> handling, from xen_pt_realize to the new xen_igd_clear_slot function to >>>>> initialize the device class and vendor values which enables the checks for >>>>> the Intel IGD to succeed. The verification that the host device is an >>>>> Intel IGD to be passed through is done by checking the domain, bus, slot, >>>>> and function values as well as by checking that gfx_passthru is enabled, >>>>> the device class is VGA, and the device vendor in Intel. >>>>> >>>>> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> >>>>> --- >>>>> Notes that might be helpful to reviewers of patched code in hw/xen: >>>>> >>>>> The new functions and types are based on recommendations from Qemu docs: >>>>> https://qemu.readthedocs.io/en/latest/devel/qom.html >>>>> >>>>> Notes that might be helpful to reviewers of patched code in hw/i386: >>>>> >>>>> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does >>>>> not affect builds that do not have CONFIG_XEN defined. >>>>> >>>>> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an >>>>> existing function that is only true when Qemu is built with >>>>> xen-pci-passthrough enabled and the administrator has configured the Xen >>>>> HVM guest with Qemu's igd-passthru=on option. >>>>> >>>>> v2: Remove From: <email address> tag at top of commit message >>>>> >>>>> v3: Changed the test for the Intel IGD in xen_igd_clear_slot: >>>>> >>>>> if (is_igd_vga_passthrough(&s->real_device) && >>>>> (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) { >>>>> >>>>> is changed to >>>>> >>>>> if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) >>>>> && (s->hostaddr.function == 0)) { >>>>> >>>>> I hoped that I could use the test in v2, since it matches the >>>>> other tests for the Intel IGD in Qemu and Xen, but those tests >>>>> do not work because the necessary data structures are not set with >>>>> their values yet. So instead use the test that the administrator >>>>> has enabled gfx_passthru and the device address on the host is >>>>> 02.0. This test does detect the Intel IGD correctly. >>>>> >>>>> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's >>>>> email address to match the address used by the same author in commits >>>>> be9c61da and c0e86b76 >>>>> >>>>> Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc >>>>> >>>>> v5: The patch of xen_pt.c was re-worked to allow a more consistent test >>>>> for the Intel IGD that uses the same criteria as in other places. >>>>> This involved moving the call to xen_host_pci_device_get from >>>>> xen_pt_realize to xen_igd_clear_slot and updating the checks for the >>>>> Intel IGD in xen_igd_clear_slot: >>>>> >>>>> if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) >>>>> && (s->hostaddr.function == 0)) { >>>>> >>>>> is changed to >>>>> >>>>> if (is_igd_vga_passthrough(&s->real_device) && >>>>> s->real_device.domain == 0 && s->real_device.bus == 0 && >>>>> s->real_device.dev == 2 && s->real_device.func == 0 && >>>>> s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { >>>>> >>>>> Added an explanation for the move of xen_host_pci_device_get from >>>>> xen_pt_realize to xen_igd_clear_slot to the commit message. >>>>> >>>>> Rebase. >>>>> >>>>> v6: Fix logging by removing these lines from the move from xen_pt_realize >>>>> to xen_igd_clear_slot that was done in v5: >>>>> >>>>> XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d" >>>>> " to devfn 0x%x\n", >>>>> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, >>>>> s->dev.devfn); >>>>> >>>>> This log needs to be in xen_pt_realize because s->dev.devfn is not >>>>> set yet in xen_igd_clear_slot. >>>>> >>>>> v7: The v7 that was posted to the mailing list was incorrect. v8 is what >>>>> v7 was intended to be. >>>>> >>>>> v8: Inhibit out of context log message and needless processing by >>>>> adding 2 lines at the top of the new xen_igd_clear_slot function: >>>>> >>>>> if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) >>>>> return; >>>>> >>>>> Rebase. This removed an unnecessary header file from xen_pt.h >>>>> >>>>> hw/i386/pc_piix.c | 3 +++ >>>>> hw/xen/xen_pt.c | 49 ++++++++++++++++++++++++++++++++++++-------- >>>>> hw/xen/xen_pt.h | 16 +++++++++++++++ >>>>> hw/xen/xen_pt_stub.c | 4 ++++ >>>>> 4 files changed, 63 insertions(+), 9 deletions(-) >>>>> >>>>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c >>>>> index b48047f50c..bc5efa4f59 100644 >>>>> --- a/hw/i386/pc_piix.c >>>>> +++ b/hw/i386/pc_piix.c >>>>> @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine) >>>>> } >>>>> >>>>> pc_xen_hvm_init_pci(machine); >>>>> + if (xen_igd_gfx_pt_enabled()) { >>>>> + xen_igd_reserve_slot(pcms->bus); >>>>> + } >>>>> pci_create_simple(pcms->bus, -1, "xen-platform"); >>>>> } >>>>> #endif >>>> >>>> I would even maybe go further and move the whole logic into >>>> xen_igd_reserve_slot. And I would even just name it >>>> xen_hvm_init_reserved_slots without worrying about the what >>>> or why at the pc level. At this point it will be up to Xen maintainers. >>> >>>I see to do that would be to resolve the two pc_xen_hvm* >>>functions in pc_piix.c that are guarded by CONFIG_XEN and >>>move them to an appropriate place such as xen-hvm.c. >>> >>>That is along the lines of the work that Bernhard and Philippe >>>are doing, so I am Cc'ing them. My first inclination is just >>>to defer to them: I think eventually the little patch I propose >>>here to pc_piix.c is eventually going to be moved out of pc_piix.c >>>by Bernhard in a future patch. >>> >>>What they have been doing is very conservative, and I expect >>>if and when Bernhard gets here to resolve those functions, they >>>will do it in a way that keeps the dependency of the xenfv machine >>>type on the pc machine type and the pc_init1 function. >>> >>>What I would propose would be to break the dependency of xenfv >>>on the pc_init1 function. That is, I would propose having a >>>xenfv_init function in xen-hvm.c, and the first version would >>>be the current version of pc_init1, so xenfv would still depend >>>on many i440fx type things, but with the change xen developers >>>would be free to tweak xenfv_init without affecting the users >>>of the pc machine type. >>> >>>Would that be a good idea? If I get posiive feedback for this >>>idea, I will put it on the table, probably initially as an RFC >>>patch. >> >> In various patches I've been decoupling 1/ PIIX3 from Xen and 2/ QEMU's Xen integration code from the PC machine. My idea is to confine all wiring for a PIIX based PC machine using Xen in pc_piix.c. The pc_xen_hvm* functions seem to do exactly that, so I'd leave them there, at least for now. >> >> What I would like to avoid is for the Xen integration code to make assumptions that an x86 or PC machine is always based on i440fx or PIIX3. > >I think what you are saying is that if I try to move the logic of my patch to xen-hvm.c, as Michael suggests, I should not move or copy any piix3 code to the Xen integration code, but access it via an appropriate qom interface to the code in pc_piix.c and only move Xen specific things to the Xen integration code such as the content of my patch. I can try to do that for a v9 of my patch. It might take me a little while (I am not a professional coder), so I will just leave v8 of my patch as is for now until I have a patch ready to move it out of pc_piix.c the qom way. I think the change Michael suggests is very minimalistic: Move the if condition around xen_igd_reserve_slot() into the function itself and always call it there unconditionally -- basically turning three lines into one. Since xen_igd_reserve_slot() seems very problem specific, Michael further suggests to rename it to something more general. All in all no big changes required. Best regards, Bernhard > >Thanks, > >Chuck > >> >> I like Michael's idea of going one step further, both in terms of the approach and the reasoning. >> >> Best regards, >> Bernhard >> >>>Also, thanks, Michael, for your other suggestions for this patch >>>about using macros for the devfn constants. >>> >>>Chuck >>> >>>> >>>>> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c >>>>> index 0ec7e52183..eff38155ef 100644 >>>>> --- a/hw/xen/xen_pt.c >>>>> +++ b/hw/xen/xen_pt.c >>>>> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp) >>>>> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, >>>>> s->dev.devfn); >>>>> >>>>> - xen_host_pci_device_get(&s->real_device, >>>>> - s->hostaddr.domain, s->hostaddr.bus, >>>>> - s->hostaddr.slot, s->hostaddr.function, >>>>> - errp); >>>>> - if (*errp) { >>>>> - error_append_hint(errp, "Failed to \"open\" the real pci device"); >>>>> - return; >>>>> - } >>>>> - >>>>> s->is_virtfn = s->real_device.is_virtfn; >>>>> if (s->is_virtfn) { >>>>> XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n", >>>>> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj) >>>>> PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS; >>>>> } >>>>> >>>>> +void xen_igd_reserve_slot(PCIBus *pci_bus) >>>>> +{ >>>>> + XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n"); >>>>> + pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK; >>>>> +} >>>>> + >>>>> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp) >>>>> +{ >>>>> + ERRP_GUARD(); >>>>> + PCIDevice *pci_dev = (PCIDevice *)qdev; >>>>> + XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev); >>>>> + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s); >>>>> + PCIBus *pci_bus = pci_get_bus(pci_dev); >>>>> + >>>>> + if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) >>>>> + return; >>>>> + >>>>> + xen_host_pci_device_get(&s->real_device, >>>>> + s->hostaddr.domain, s->hostaddr.bus, >>>>> + s->hostaddr.slot, s->hostaddr.function, >>>>> + errp); >>>>> + if (*errp) { >>>>> + error_append_hint(errp, "Failed to \"open\" the real pci device"); >>>>> + return; >>>>> + } >>>>> + >>>>> + if (is_igd_vga_passthrough(&s->real_device) && >>>>> + s->real_device.domain == 0 && s->real_device.bus == 0 && >>>>> + s->real_device.dev == 2 && s->real_device.func == 0 && >>>>> + s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { >>>> >>>> how about macros for these? >>>> >>>> #define XEN_PCI_IGD_DOMAIN 0 >>>> #define XEN_PCI_IGD_BUS 0 >>>> #define XEN_PCI_IGD_DEV 2 >>>> #define XEN_PCI_IGD_FN 0 >>>> >>>>> + pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK; >>>> >>>> If you are going to do this, you should set it back >>>> either after pci_qdev_realize or in unrealize, >>>> for symmetry. >>>> >>>>> + XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n"); >>>>> + } >>>> >>>> >>>>> + xpdc->pci_qdev_realize(qdev, errp); >>>>> +} >>>>> + >>>> >>>> >>>> >>>>> static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data) >>>>> { >>>>> DeviceClass *dc = DEVICE_CLASS(klass); >>>>> PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); >>>>> >>>>> + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass); >>>>> + xpdc->pci_qdev_realize = dc->realize; >>>>> + dc->realize = xen_igd_clear_slot; >>>>> k->realize = xen_pt_realize; >>>>> k->exit = xen_pt_unregister_device; >>>>> k->config_read = xen_pt_pci_read_config; >>>>> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = { >>>>> .instance_size = sizeof(XenPCIPassthroughState), >>>>> .instance_finalize = xen_pci_passthrough_finalize, >>>>> .class_init = xen_pci_passthrough_class_init, >>>>> + .class_size = sizeof(XenPTDeviceClass), >>>>> .instance_init = xen_pci_passthrough_instance_init, >>>>> .interfaces = (InterfaceInfo[]) { >>>>> { INTERFACE_CONVENTIONAL_PCI_DEVICE }, >>>>> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h >>>>> index cf10fc7bbf..8c25932b4b 100644 >>>>> --- a/hw/xen/xen_pt.h >>>>> +++ b/hw/xen/xen_pt.h >>>>> @@ -2,6 +2,7 @@ >>>>> #define XEN_PT_H >>>>> >>>>> #include "hw/xen/xen_common.h" >>>>> +#include "hw/pci/pci_bus.h" >>>>> #include "xen-host-pci-device.h" >>>>> #include "qom/object.h" >>>>> >>>>> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg; >>>>> #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough" >>>>> OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE) >>>>> >>>>> +#define XEN_PT_DEVICE_CLASS(klass) \ >>>>> + OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE) >>>>> +#define XEN_PT_DEVICE_GET_CLASS(obj) \ >>>>> + OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE) >>>>> + >>>>> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp); >>>>> + >>>>> +typedef struct XenPTDeviceClass { >>>>> + PCIDeviceClass parent_class; >>>>> + XenPTQdevRealize pci_qdev_realize; >>>>> +} XenPTDeviceClass; >>>>> + >>>>> uint32_t igd_read_opregion(XenPCIPassthroughState *s); >>>>> +void xen_igd_reserve_slot(PCIBus *pci_bus); >>>>> void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val); >>>>> void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s, >>>>> XenHostPCIDevice *dev); >>>>> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read) >>>>> >>>>> #define XEN_PCI_INTEL_OPREGION 0xfc >>>>> >>>>> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */ >>>>> + >>>> >>>> I think you want to calculate this based on dev fn: >>>> >>>> #define XEN_PCI_IGD_SLOT_MASK \ >>>> (0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN))) >>>> >>>> >>>>> typedef enum { >>>>> XEN_PT_GRP_TYPE_HARDWIRED = 0, /* 0 Hardwired reg group */ >>>>> XEN_PT_GRP_TYPE_EMU, /* emul reg group */ >>>>> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c >>>>> index 2d8cac8d54..5c108446a8 100644 >>>>> --- a/hw/xen/xen_pt_stub.c >>>>> +++ b/hw/xen/xen_pt_stub.c >>>>> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp) >>>>> error_setg(errp, "Xen PCI passthrough support not built in"); >>>>> } >>>>> } >>>>> + >>>>> +void xen_igd_reserve_slot(PCIBus *pci_bus) >>>>> +{ >>>>> +} >>>>> -- >>>>> 2.39.0 >>>> >>> >
On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > I think the change Michael suggests is very minimalistic: Move the if > condition around xen_igd_reserve_slot() into the function itself and > always call it there unconditionally -- basically turning three lines > into one. Since xen_igd_reserve_slot() seems very problem specific, > Michael further suggests to rename it to something more general. All > in all no big changes required. yes, exactly.
On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: >> I think the change Michael suggests is very minimalistic: Move the if >> condition around xen_igd_reserve_slot() into the function itself and >> always call it there unconditionally -- basically turning three lines >> into one. Since xen_igd_reserve_slot() seems very problem specific, >> Michael further suggests to rename it to something more general. All >> in all no big changes required. > > yes, exactly. > OK, got it. I can do that along with the other suggestions. Thanks.
On Thu, 12 Jan 2023 23:14:26 -0500 Chuck Zmudzinski <brchuckz@aol.com> wrote: > On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > >> I think the change Michael suggests is very minimalistic: Move the if > >> condition around xen_igd_reserve_slot() into the function itself and > >> always call it there unconditionally -- basically turning three lines > >> into one. Since xen_igd_reserve_slot() seems very problem specific, > >> Michael further suggests to rename it to something more general. All > >> in all no big changes required. > > > > yes, exactly. > > > > OK, got it. I can do that along with the other suggestions. have you considered instead of reservation, putting a slot check in device model and if it's intel igd being passed through, fail at realize time if it can't take required slot (with a error directing user to fix command line)? That could be less complicated than dealing with slot reservations at the cost of being less convenient. > > Thanks. >
On 1/13/23 4:33 AM, Igor Mammedov wrote: > On Thu, 12 Jan 2023 23:14:26 -0500 > Chuck Zmudzinski <brchuckz@aol.com> wrote: > >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: >> >> I think the change Michael suggests is very minimalistic: Move the if >> >> condition around xen_igd_reserve_slot() into the function itself and >> >> always call it there unconditionally -- basically turning three lines >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, >> >> Michael further suggests to rename it to something more general. All >> >> in all no big changes required. >> > >> > yes, exactly. >> > >> >> OK, got it. I can do that along with the other suggestions. > > have you considered instead of reservation, putting a slot check in device model > and if it's intel igd being passed through, fail at realize time if it can't take > required slot (with a error directing user to fix command line)? Yes, but the core pci code currently already fails at realize time with a useful error message if the user tries to use slot 2 for the igd, because of the xen platform device which has slot 2. The user can fix this without patching qemu, but having the user fix it on the command line is not the best way to solve the problem, primarily because the user would need to hotplug the xen platform device via a command line option instead of having the xen platform device added by pc_xen_hvm_init functions almost immediately after creating the pci bus, and that delay in adding the xen platform device degrades startup performance of the guest. > That could be less complicated than dealing with slot reservations at the cost of > being less convenient. And also a cost of reduced startup performance. However, the performance hit can be prevented by assigning slot 3 instead of slot 2 for the xen platform device if igd passthrough is configured on the command line instead of doing slot reservation, but there would still be less convenience and, for libxl users, an inability to easily configure the command line so that the igd can still have slot 2 without a hacky and error-prone patch to libxl to deal with this problem. I did post a patch on xen-devel to fix this using libxl, but so far it has not yet been reviewed and I mentioned in that patch that the approach of patching qemu so qemu reserves slot 2 for the igd is less prone to coding errors and is easier to maintain than the patch that would be required to implement the fix in libxl.
On 1/10/23 3:16 AM, Michael S. Tsirkin wrote: > On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote: >> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus, >> as noted in docs/igd-assign.txt in the Qemu source code. >> >> Currently, when the xl toolstack is used to configure a Xen HVM guest with >> Intel IGD passthrough to the guest with the Qemu upstream device model, >> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy >> a different slot. This problem often prevents the guest from booting. >> >> The only available workaround is not good: Configure Xen HVM guests to use >> the old and no longer maintained Qemu traditional device model available >> from xenbits.xen.org which does reserve slot 2 for the Intel IGD. >> >> To implement this feature in the Qemu upstream device model for Xen HVM >> guests, introduce the following new functions, types, and macros: >> >> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE >> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS >> * typedef XenPTQdevRealize function pointer >> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2 >> * xen_igd_reserve_slot and xen_igd_clear_slot functions >> >> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask >> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using >> the xl toolstack with the gfx_passthru option enabled, which sets the >> igd-passthru=on option to Qemu for the Xen HVM machine type. >> >> The new xen_igd_reserve_slot function also needs to be implemented in >> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case >> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough, >> in which case it does nothing. >> >> The new xen_igd_clear_slot function overrides qdev->realize of the parent >> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus >> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was >> created in hw/i386/pc_piix.c for the case when igd-passthru=on. >> >> Move the call to xen_host_pci_device_get, and the associated error >> handling, from xen_pt_realize to the new xen_igd_clear_slot function to >> initialize the device class and vendor values which enables the checks for >> the Intel IGD to succeed. The verification that the host device is an >> Intel IGD to be passed through is done by checking the domain, bus, slot, >> and function values as well as by checking that gfx_passthru is enabled, >> the device class is VGA, and the device vendor in Intel. >> >> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> >> --- >> Notes that might be helpful to reviewers of patched code in hw/xen: >> >> The new functions and types are based on recommendations from Qemu docs: >> https://qemu.readthedocs.io/en/latest/devel/qom.html >> >> Notes that might be helpful to reviewers of patched code in hw/i386: >> >> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does >> not affect builds that do not have CONFIG_XEN defined. >> >> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an >> existing function that is only true when Qemu is built with >> xen-pci-passthrough enabled and the administrator has configured the Xen >> HVM guest with Qemu's igd-passthru=on option. >> >> v2: Remove From: <email address> tag at top of commit message >> >> v3: Changed the test for the Intel IGD in xen_igd_clear_slot: >> >> if (is_igd_vga_passthrough(&s->real_device) && >> (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) { >> >> is changed to >> >> if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) >> && (s->hostaddr.function == 0)) { >> >> I hoped that I could use the test in v2, since it matches the >> other tests for the Intel IGD in Qemu and Xen, but those tests >> do not work because the necessary data structures are not set with >> their values yet. So instead use the test that the administrator >> has enabled gfx_passthru and the device address on the host is >> 02.0. This test does detect the Intel IGD correctly. >> >> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's >> email address to match the address used by the same author in commits >> be9c61da and c0e86b76 >> >> Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc >> >> v5: The patch of xen_pt.c was re-worked to allow a more consistent test >> for the Intel IGD that uses the same criteria as in other places. >> This involved moving the call to xen_host_pci_device_get from >> xen_pt_realize to xen_igd_clear_slot and updating the checks for the >> Intel IGD in xen_igd_clear_slot: >> >> if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) >> && (s->hostaddr.function == 0)) { >> >> is changed to >> >> if (is_igd_vga_passthrough(&s->real_device) && >> s->real_device.domain == 0 && s->real_device.bus == 0 && >> s->real_device.dev == 2 && s->real_device.func == 0 && >> s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { >> >> Added an explanation for the move of xen_host_pci_device_get from >> xen_pt_realize to xen_igd_clear_slot to the commit message. >> >> Rebase. >> >> v6: Fix logging by removing these lines from the move from xen_pt_realize >> to xen_igd_clear_slot that was done in v5: >> >> XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d" >> " to devfn 0x%x\n", >> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, >> s->dev.devfn); >> >> This log needs to be in xen_pt_realize because s->dev.devfn is not >> set yet in xen_igd_clear_slot. >> >> v7: The v7 that was posted to the mailing list was incorrect. v8 is what >> v7 was intended to be. >> >> v8: Inhibit out of context log message and needless processing by >> adding 2 lines at the top of the new xen_igd_clear_slot function: >> >> if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) >> return; >> >> Rebase. This removed an unnecessary header file from xen_pt.h >> >> hw/i386/pc_piix.c | 3 +++ >> hw/xen/xen_pt.c | 49 ++++++++++++++++++++++++++++++++++++-------- >> hw/xen/xen_pt.h | 16 +++++++++++++++ >> hw/xen/xen_pt_stub.c | 4 ++++ >> 4 files changed, 63 insertions(+), 9 deletions(-) >> >> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c >> index b48047f50c..bc5efa4f59 100644 >> --- a/hw/i386/pc_piix.c >> +++ b/hw/i386/pc_piix.c >> @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine) >> } >> >> pc_xen_hvm_init_pci(machine); >> + if (xen_igd_gfx_pt_enabled()) { >> + xen_igd_reserve_slot(pcms->bus); >> + } >> pci_create_simple(pcms->bus, -1, "xen-platform"); >> } >> #endif > > I would even maybe go further and move the whole logic into > xen_igd_reserve_slot. And I would even just name it > xen_hvm_init_reserved_slots without worrying about the what > or why at the pc level. At this point it will be up to Xen maintainers. > >> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c >> index 0ec7e52183..eff38155ef 100644 >> --- a/hw/xen/xen_pt.c >> +++ b/hw/xen/xen_pt.c >> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp) >> s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, >> s->dev.devfn); >> >> - xen_host_pci_device_get(&s->real_device, >> - s->hostaddr.domain, s->hostaddr.bus, >> - s->hostaddr.slot, s->hostaddr.function, >> - errp); >> - if (*errp) { >> - error_append_hint(errp, "Failed to \"open\" the real pci device"); >> - return; >> - } >> - >> s->is_virtfn = s->real_device.is_virtfn; >> if (s->is_virtfn) { >> XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n", >> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj) >> PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS; >> } >> >> +void xen_igd_reserve_slot(PCIBus *pci_bus) >> +{ >> + XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n"); >> + pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK; >> +} >> + >> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp) >> +{ >> + ERRP_GUARD(); >> + PCIDevice *pci_dev = (PCIDevice *)qdev; >> + XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev); >> + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s); >> + PCIBus *pci_bus = pci_get_bus(pci_dev); >> + >> + if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) >> + return; >> + >> + xen_host_pci_device_get(&s->real_device, >> + s->hostaddr.domain, s->hostaddr.bus, >> + s->hostaddr.slot, s->hostaddr.function, >> + errp); >> + if (*errp) { >> + error_append_hint(errp, "Failed to \"open\" the real pci device"); >> + return; >> + } >> + >> + if (is_igd_vga_passthrough(&s->real_device) && >> + s->real_device.domain == 0 && s->real_device.bus == 0 && >> + s->real_device.dev == 2 && s->real_device.func == 0 && >> + s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { > > how about macros for these? > > #define XEN_PCI_IGD_DOMAIN 0 > #define XEN_PCI_IGD_BUS 0 > #define XEN_PCI_IGD_DEV 2 > #define XEN_PCI_IGD_FN 0 > >> + pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK; > > If you are going to do this, you should set it back > either after pci_qdev_realize or in unrealize, > for symmetry. I presume you are talking about the log here. The clearing of the bit must be done before pci_qdev_realize because the slot is assigned in pci_qdev_realize. If the bit is not cleared *before* calling pci_qdev_realize, the igd will not be assigned slot 2. Doing that would defeat the whole purpose of the patch. I suppose I could move the log message after pci_qdev_realize for symmetry, that would not change how the patch works. > >> + XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n"); >> + } > > >> + xpdc->pci_qdev_realize(qdev, errp); >> +} >> + > > > >> static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data) >> { >> DeviceClass *dc = DEVICE_CLASS(klass); >> PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); >> >> + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass); >> + xpdc->pci_qdev_realize = dc->realize; >> + dc->realize = xen_igd_clear_slot; >> k->realize = xen_pt_realize; >> k->exit = xen_pt_unregister_device; >> k->config_read = xen_pt_pci_read_config; >> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = { >> .instance_size = sizeof(XenPCIPassthroughState), >> .instance_finalize = xen_pci_passthrough_finalize, >> .class_init = xen_pci_passthrough_class_init, >> + .class_size = sizeof(XenPTDeviceClass), >> .instance_init = xen_pci_passthrough_instance_init, >> .interfaces = (InterfaceInfo[]) { >> { INTERFACE_CONVENTIONAL_PCI_DEVICE }, >> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h >> index cf10fc7bbf..8c25932b4b 100644 >> --- a/hw/xen/xen_pt.h >> +++ b/hw/xen/xen_pt.h >> @@ -2,6 +2,7 @@ >> #define XEN_PT_H >> >> #include "hw/xen/xen_common.h" >> +#include "hw/pci/pci_bus.h" >> #include "xen-host-pci-device.h" >> #include "qom/object.h" >> >> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg; >> #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough" >> OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE) >> >> +#define XEN_PT_DEVICE_CLASS(klass) \ >> + OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE) >> +#define XEN_PT_DEVICE_GET_CLASS(obj) \ >> + OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE) >> + >> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp); >> + >> +typedef struct XenPTDeviceClass { >> + PCIDeviceClass parent_class; >> + XenPTQdevRealize pci_qdev_realize; >> +} XenPTDeviceClass; >> + >> uint32_t igd_read_opregion(XenPCIPassthroughState *s); >> +void xen_igd_reserve_slot(PCIBus *pci_bus); >> void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val); >> void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s, >> XenHostPCIDevice *dev); >> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read) >> >> #define XEN_PCI_INTEL_OPREGION 0xfc >> >> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */ >> + > > I think you want to calculate this based on dev fn: > > #define XEN_PCI_IGD_SLOT_MASK \ > (0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN))) > > >> typedef enum { >> XEN_PT_GRP_TYPE_HARDWIRED = 0, /* 0 Hardwired reg group */ >> XEN_PT_GRP_TYPE_EMU, /* emul reg group */ >> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c >> index 2d8cac8d54..5c108446a8 100644 >> --- a/hw/xen/xen_pt_stub.c >> +++ b/hw/xen/xen_pt_stub.c >> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp) >> error_setg(errp, "Xen PCI passthrough support not built in"); >> } >> } >> + >> +void xen_igd_reserve_slot(PCIBus *pci_bus) >> +{ >> +} >> -- >> 2.39.0 >
On Fri, 13 Jan 2023 16:31:26 -0500 Chuck Zmudzinski <brchuckz@aol.com> wrote: > On 1/13/23 4:33 AM, Igor Mammedov wrote: > > On Thu, 12 Jan 2023 23:14:26 -0500 > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > >> >> I think the change Michael suggests is very minimalistic: Move the if > >> >> condition around xen_igd_reserve_slot() into the function itself and > >> >> always call it there unconditionally -- basically turning three lines > >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > >> >> Michael further suggests to rename it to something more general. All > >> >> in all no big changes required. > >> > > >> > yes, exactly. > >> > > >> > >> OK, got it. I can do that along with the other suggestions. > > > > have you considered instead of reservation, putting a slot check in device model > > and if it's intel igd being passed through, fail at realize time if it can't take > > required slot (with a error directing user to fix command line)? > > Yes, but the core pci code currently already fails at realize time > with a useful error message if the user tries to use slot 2 for the > igd, because of the xen platform device which has slot 2. The user > can fix this without patching qemu, but having the user fix it on > the command line is not the best way to solve the problem, primarily > because the user would need to hotplug the xen platform device via a > command line option instead of having the xen platform device added by > pc_xen_hvm_init functions almost immediately after creating the pci > bus, and that delay in adding the xen platform device degrades > startup performance of the guest. > > > That could be less complicated than dealing with slot reservations at the cost of > > being less convenient. > > And also a cost of reduced startup performance Could you clarify how it affects performance (and how much). (as I see, setup done at board_init time is roughly the same as with '-device foo' CLI options, modulo time needed to parse options which should be negligible. and both ways are done before guest runs) > However, the performance hit can be prevented by assigning slot > 3 instead of slot 2 for the xen platform device if igd passthrough > is configured on the command line instead of doing slot reservation, > but there would still be less convenience and, for libxl users, an > inability to easily configure the command line so that the igd can > still have slot 2 without a hacky and error-prone patch to libxl to > deal with this problem. libvirt manages to get it right on management side without quirks on QEMU side. > I did post a patch on xen-devel to fix this using libxl, but so far > it has not yet been reviewed and I mentioned in that patch that the > approach of patching qemu so qemu reserves slot 2 for the igd is less > prone to coding errors and is easier to maintain than the patch that > would be required to implement the fix in libxl. the patch is not trivial, and adds maintenance on QEMU. Though I don't object to it as long as it's constrained to xen only code and doesn't spill into generic PCI. All I wanted is just point out there are other approach to problem (i.e. do force user to user to provide correct configuration instead of adding quirks whenever it's possible).
On 1/16/23 10:33, Igor Mammedov wrote: > On Fri, 13 Jan 2023 16:31:26 -0500 > Chuck Zmudzinski <brchuckz@aol.com> wrote: > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: >> > On Thu, 12 Jan 2023 23:14:26 -0500 >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: >> > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: >> >> >> I think the change Michael suggests is very minimalistic: Move the if >> >> >> condition around xen_igd_reserve_slot() into the function itself and >> >> >> always call it there unconditionally -- basically turning three lines >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, >> >> >> Michael further suggests to rename it to something more general. All >> >> >> in all no big changes required. >> >> > >> >> > yes, exactly. >> >> > >> >> >> >> OK, got it. I can do that along with the other suggestions. >> > >> > have you considered instead of reservation, putting a slot check in device model >> > and if it's intel igd being passed through, fail at realize time if it can't take >> > required slot (with a error directing user to fix command line)? >> >> Yes, but the core pci code currently already fails at realize time >> with a useful error message if the user tries to use slot 2 for the >> igd, because of the xen platform device which has slot 2. The user >> can fix this without patching qemu, but having the user fix it on >> the command line is not the best way to solve the problem, primarily >> because the user would need to hotplug the xen platform device via a >> command line option instead of having the xen platform device added by >> pc_xen_hvm_init functions almost immediately after creating the pci >> bus, and that delay in adding the xen platform device degrades >> startup performance of the guest. >> >> > That could be less complicated than dealing with slot reservations at the cost of >> > being less convenient. >> >> And also a cost of reduced startup performance > > Could you clarify how it affects performance (and how much). > (as I see, setup done at board_init time is roughly the same > as with '-device foo' CLI options, modulo time needed to parse > options which should be negligible. and both ways are done before > guest runs) I preface my answer by saying there is a v9, but you don't need to look at that. I will answer all your questions here. I am going by what I observe on the main HDMI display with the different approaches. With the approach of not patching Qemu to fix this, which requires adding the Xen platform device a little later, the length of time it takes to fully load the guest is increased. I also noticed with Linux guests that use the grub bootoader, the grub vga driver cannot display the grub boot menu at the native resolution of the display, which in the tested case is 1920x1080, when the Xen platform device is added via a command line option instead of by the pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch to Qemu, the grub menu is displayed at the full, 1920x1080 native resolution of the display. Once the guest fully loads, there is no noticeable difference in performance. It is mainly a degradation in startup performance, not performance once the guest OS is fully loaded. > >> However, the performance hit can be prevented by assigning slot >> 3 instead of slot 2 for the xen platform device if igd passthrough >> is configured on the command line instead of doing slot reservation, >> but there would still be less convenience and, for libxl users, an >> inability to easily configure the command line so that the igd can >> still have slot 2 without a hacky and error-prone patch to libxl to >> deal with this problem. > libvirt manages to get it right on management side without quirks on > QEMU side. I think the reason libvirt/kvm gets it right is simply because the code implementing the libvirt/kvm approach got more attention and testing than the code that implements the libxl/Xen approach. This patch really represents what should have been done when support for the igd-passthru=on option for xenfv machines was added seven years ago, but the code was apparently added without much testing and is stale now and needs this fix which is entirely implemented in either files maintained by Xen maintainers or, in the case of the small patch to pc_piix.c, entirely within a section guarded by #ifdef CONFIG_XEN. Not much maintenance burden for hw/i386 maintainers. > >> I did post a patch on xen-devel to fix this using libxl, but so far >> it has not yet been reviewed and I mentioned in that patch that the >> approach of patching qemu so qemu reserves slot 2 for the igd is less >> prone to coding errors and is easier to maintain than the patch that >> would be required to implement the fix in libxl. > > the patch is not trivial, and adds maintenance on QEMU. For all practical purposes, the only additional maintenance would be handled by Xen maintainers, and the Xen maintainer of the Xen files being patched gave a Reviewed-by to an earlier iteration of this patch. So I think the decision about the maintenance cost of this patch should be made by the Xen maintainers. In fact, if I were a Xen maintainer, I think this patch to Qemu would be much easier for the Xen maintainers to maintain than the proposed patch to libxl to fix this. So ultimately, I think it makes sense for the Xen maintainers to decide on the maintenance cost. So far they have not weighed in since the Reviewed-by that Anthony gave to an earlier iteration of this patch. So far, they have not responded to my patch to libxl, and I don't blame them because that would be more difficult for them to maintain than this patch to some of the Xen-specific code within Qemu. For reference, the patch for libxl that fixes this is here: https://lore.kernel.org/qemu-devel/20230110073201.mdUvSjy1vKtxPriqMQuWAxIjQzf1eAqIlZgal1u3GBI@z/ > Though I don't object to it as long as it's constrained to xen only > code It already is constrained to Xen only code - the small patch to pc_piix.c is entirely guarded by #ifdef CONFIG_XEN. and doesn't spill into generic PCI. In comments on an earlier iteration of this patch, Michael indicated he would not object a patch to core pci if it added some useful functionality. Michael, do I misunderstand you? I have already proposed a patch that does that, which, if accepted, would address the objection that unconditionally reserving the slot during initialization is not desirable. He pointed out that a patch to core pci could fix that, and I have proposed such a patch, independent of this patch, here: https://lore.kernel.org/qemu-devel/ad5f5cf8bc4bd4a720724ed41e47565a7f27adf5.1673829387.git.brchuckz@aol.com/ > All I wanted is just point out there are other approach to problem > (i.e. do force user to user to provide correct configuration instead > of adding quirks whenever it's possible). > I disagree that the default configuration should configure the hardware in a way that does not conform to the requirements of the device and thereby force users to add non-default settings to configure the machine correctly. That is simply not being friendly to Xen users of Qemu, and that, unfortunately is what Qemu code currently does and has done for the past seven years as regards the configuration by Qemu of igd passthru on Xen. IMO, it is unreasonable to not fix this, and since the fix can be implemented in entirely Xen-specific code, I hope and expect that eventually the Xen maintainers will fix this. I hope they are just waiting until I implement the fixes that you and Michael have requested which are mostly reasonable and admittedly, not completed yet. Perhaps this approach is what you call a "quirk" because of the limitations of how slot_reserved_mask works. That can be fixed by patching core pci. That, IMO, is the best and most maintainable way to fix this. So my plan is to wait and see how my proposed patch to core pci is received. If it gets accepted, I will do a v10 of this patch which will use the improved management capability added by the patch to core pci that addresses the concerns that this patch will interfere with the libvirt/kvm approach of manually assigning the slots by causing the slot_reserved_mask to only take effect when the device being added is configured for auto assignment of the slot address. When libvirt adds a pci device to a xenfv machine configured for igd-patthru, my proposed v10, with the patch to core pci as a prerequisite, will not introduce any change to how Qemu configures the machine in response to a libvirt configuration that manually assigns the slot addresses. I do accept that v8/v9 of the patch is stalled, and I am working to address all the concerns being raised here. Thanks for your comments! Kind regards, Chuck
On Mon, 16 Jan 2023 13:00:53 -0500 Chuck Zmudzinski <brchuckz@netscape.net> wrote: > On 1/16/23 10:33, Igor Mammedov wrote: > > On Fri, 13 Jan 2023 16:31:26 -0500 > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: > >> > On Thu, 12 Jan 2023 23:14:26 -0500 > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: > >> > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > >> >> >> I think the change Michael suggests is very minimalistic: Move the if > >> >> >> condition around xen_igd_reserve_slot() into the function itself and > >> >> >> always call it there unconditionally -- basically turning three lines > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > >> >> >> Michael further suggests to rename it to something more general. All > >> >> >> in all no big changes required. > >> >> > > >> >> > yes, exactly. > >> >> > > >> >> > >> >> OK, got it. I can do that along with the other suggestions. > >> > > >> > have you considered instead of reservation, putting a slot check in device model > >> > and if it's intel igd being passed through, fail at realize time if it can't take > >> > required slot (with a error directing user to fix command line)? > >> > >> Yes, but the core pci code currently already fails at realize time > >> with a useful error message if the user tries to use slot 2 for the > >> igd, because of the xen platform device which has slot 2. The user > >> can fix this without patching qemu, but having the user fix it on > >> the command line is not the best way to solve the problem, primarily > >> because the user would need to hotplug the xen platform device via a > >> command line option instead of having the xen platform device added by > >> pc_xen_hvm_init functions almost immediately after creating the pci > >> bus, and that delay in adding the xen platform device degrades > >> startup performance of the guest. > >> > >> > That could be less complicated than dealing with slot reservations at the cost of > >> > being less convenient. > >> > >> And also a cost of reduced startup performance > > > > Could you clarify how it affects performance (and how much). > > (as I see, setup done at board_init time is roughly the same > > as with '-device foo' CLI options, modulo time needed to parse > > options which should be negligible. and both ways are done before > > guest runs) > > I preface my answer by saying there is a v9, but you don't > need to look at that. I will answer all your questions here. > > I am going by what I observe on the main HDMI display with the > different approaches. With the approach of not patching Qemu > to fix this, which requires adding the Xen platform device a > little later, the length of time it takes to fully load the > guest is increased. I also noticed with Linux guests that use > the grub bootoader, the grub vga driver cannot display the > grub boot menu at the native resolution of the display, which > in the tested case is 1920x1080, when the Xen platform device > is added via a command line option instead of by the > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch > to Qemu, the grub menu is displayed at the full, 1920x1080 > native resolution of the display. Once the guest fully loads, > there is no noticeable difference in performance. It is mainly > a degradation in startup performance, not performance once > the guest OS is fully loaded. above hints on presence of bug[s] in igd-passthru implementation, and this patch effectively hides problem instead of trying to figure out what's wrong and fixing it.
On Mon, 16 Jan 2023 13:00:53 -0500 Chuck Zmudzinski <brchuckz@netscape.net> wrote: > On 1/16/23 10:33, Igor Mammedov wrote: > > On Fri, 13 Jan 2023 16:31:26 -0500 > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: > >> > On Thu, 12 Jan 2023 23:14:26 -0500 > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: > >> > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > >> >> >> I think the change Michael suggests is very minimalistic: Move the if > >> >> >> condition around xen_igd_reserve_slot() into the function itself and > >> >> >> always call it there unconditionally -- basically turning three lines > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > >> >> >> Michael further suggests to rename it to something more general. All > >> >> >> in all no big changes required. > >> >> > > >> >> > yes, exactly. > >> >> > > >> >> > >> >> OK, got it. I can do that along with the other suggestions. > >> > > >> > have you considered instead of reservation, putting a slot check in device model > >> > and if it's intel igd being passed through, fail at realize time if it can't take > >> > required slot (with a error directing user to fix command line)? > >> > >> Yes, but the core pci code currently already fails at realize time > >> with a useful error message if the user tries to use slot 2 for the > >> igd, because of the xen platform device which has slot 2. The user > >> can fix this without patching qemu, but having the user fix it on > >> the command line is not the best way to solve the problem, primarily > >> because the user would need to hotplug the xen platform device via a > >> command line option instead of having the xen platform device added by > >> pc_xen_hvm_init functions almost immediately after creating the pci > >> bus, and that delay in adding the xen platform device degrades > >> startup performance of the guest. > >> > >> > That could be less complicated than dealing with slot reservations at the cost of > >> > being less convenient. > >> > >> And also a cost of reduced startup performance > > > > Could you clarify how it affects performance (and how much). > > (as I see, setup done at board_init time is roughly the same > > as with '-device foo' CLI options, modulo time needed to parse > > options which should be negligible. and both ways are done before > > guest runs) > > I preface my answer by saying there is a v9, but you don't > need to look at that. I will answer all your questions here. > > I am going by what I observe on the main HDMI display with the > different approaches. With the approach of not patching Qemu > to fix this, which requires adding the Xen platform device a > little later, the length of time it takes to fully load the > guest is increased. I also noticed with Linux guests that use > the grub bootoader, the grub vga driver cannot display the > grub boot menu at the native resolution of the display, which > in the tested case is 1920x1080, when the Xen platform device > is added via a command line option instead of by the > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch > to Qemu, the grub menu is displayed at the full, 1920x1080 > native resolution of the display. Once the guest fully loads, > there is no noticeable difference in performance. It is mainly > a degradation in startup performance, not performance once > the guest OS is fully loaded. Looking at igd-assign.txt, it recommends to add IGD using '-device' CLI option, and actually drop at least graphics defaults explicitly. So it is expected to work fine even when IGD is constructed with '-device'. Could you provide full CLI current xen starts QEMU with and then a CLI you used (with explicit -device for IGD) that leads to reduced performance? CCing vfio folks who might have an idea what could be wrong based on vfio experience.
On 1/17/2023 5:35 AM, Igor Mammedov wrote: > On Mon, 16 Jan 2023 13:00:53 -0500 > Chuck Zmudzinski <brchuckz@netscape.net> wrote: > > > On 1/16/23 10:33, Igor Mammedov wrote: > > > On Fri, 13 Jan 2023 16:31:26 -0500 > > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: > > >> > On Thu, 12 Jan 2023 23:14:26 -0500 > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > >> > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and > > >> >> >> always call it there unconditionally -- basically turning three lines > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > > >> >> >> Michael further suggests to rename it to something more general. All > > >> >> >> in all no big changes required. > > >> >> > > > >> >> > yes, exactly. > > >> >> > > > >> >> > > >> >> OK, got it. I can do that along with the other suggestions. > > >> > > > >> > have you considered instead of reservation, putting a slot check in device model > > >> > and if it's intel igd being passed through, fail at realize time if it can't take > > >> > required slot (with a error directing user to fix command line)? > > >> > > >> Yes, but the core pci code currently already fails at realize time > > >> with a useful error message if the user tries to use slot 2 for the > > >> igd, because of the xen platform device which has slot 2. The user > > >> can fix this without patching qemu, but having the user fix it on > > >> the command line is not the best way to solve the problem, primarily > > >> because the user would need to hotplug the xen platform device via a > > >> command line option instead of having the xen platform device added by > > >> pc_xen_hvm_init functions almost immediately after creating the pci > > >> bus, and that delay in adding the xen platform device degrades > > >> startup performance of the guest. > > >> > > >> > That could be less complicated than dealing with slot reservations at the cost of > > >> > being less convenient. > > >> > > >> And also a cost of reduced startup performance > > > > > > Could you clarify how it affects performance (and how much). > > > (as I see, setup done at board_init time is roughly the same > > > as with '-device foo' CLI options, modulo time needed to parse > > > options which should be negligible. and both ways are done before > > > guest runs) > > > > I preface my answer by saying there is a v9, but you don't > > need to look at that. I will answer all your questions here. > > > > I am going by what I observe on the main HDMI display with the > > different approaches. With the approach of not patching Qemu > > to fix this, which requires adding the Xen platform device a > > little later, the length of time it takes to fully load the > > guest is increased. I also noticed with Linux guests that use > > the grub bootoader, the grub vga driver cannot display the > > grub boot menu at the native resolution of the display, which > > in the tested case is 1920x1080, when the Xen platform device > > is added via a command line option instead of by the > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch > > to Qemu, the grub menu is displayed at the full, 1920x1080 > > native resolution of the display. Once the guest fully loads, > > there is no noticeable difference in performance. It is mainly > > a degradation in startup performance, not performance once > > the guest OS is fully loaded. > above hints on presence of bug[s] in igd-passthru implementation, > and this patch effectively hides problem instead of trying to figure > out what's wrong and fixing it. > Why did you delete the rest of my answers to your other observations and not respond to them?
On 1/17/2023 5:35 AM, Igor Mammedov wrote: > On Mon, 16 Jan 2023 13:00:53 -0500 > Chuck Zmudzinski <brchuckz@netscape.net> wrote: > > > On 1/16/23 10:33, Igor Mammedov wrote: > > > On Fri, 13 Jan 2023 16:31:26 -0500 > > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: > > >> > On Thu, 12 Jan 2023 23:14:26 -0500 > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > >> > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and > > >> >> >> always call it there unconditionally -- basically turning three lines > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > > >> >> >> Michael further suggests to rename it to something more general. All > > >> >> >> in all no big changes required. > > >> >> > > > >> >> > yes, exactly. > > >> >> > > > >> >> > > >> >> OK, got it. I can do that along with the other suggestions. > > >> > > > >> > have you considered instead of reservation, putting a slot check in device model > > >> > and if it's intel igd being passed through, fail at realize time if it can't take > > >> > required slot (with a error directing user to fix command line)? > > >> > > >> Yes, but the core pci code currently already fails at realize time > > >> with a useful error message if the user tries to use slot 2 for the > > >> igd, because of the xen platform device which has slot 2. The user > > >> can fix this without patching qemu, but having the user fix it on > > >> the command line is not the best way to solve the problem, primarily > > >> because the user would need to hotplug the xen platform device via a > > >> command line option instead of having the xen platform device added by > > >> pc_xen_hvm_init functions almost immediately after creating the pci > > >> bus, and that delay in adding the xen platform device degrades > > >> startup performance of the guest. > > >> > > >> > That could be less complicated than dealing with slot reservations at the cost of > > >> > being less convenient. > > >> > > >> And also a cost of reduced startup performance > > > > > > Could you clarify how it affects performance (and how much). > > > (as I see, setup done at board_init time is roughly the same > > > as with '-device foo' CLI options, modulo time needed to parse > > > options which should be negligible. and both ways are done before > > > guest runs) > > > > I preface my answer by saying there is a v9, but you don't > > need to look at that. I will answer all your questions here. > > > > I am going by what I observe on the main HDMI display with the > > different approaches. With the approach of not patching Qemu > > to fix this, which requires adding the Xen platform device a > > little later, the length of time it takes to fully load the > > guest is increased. I also noticed with Linux guests that use > > the grub bootoader, the grub vga driver cannot display the > > grub boot menu at the native resolution of the display, which > > in the tested case is 1920x1080, when the Xen platform device > > is added via a command line option instead of by the > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch > > to Qemu, the grub menu is displayed at the full, 1920x1080 > > native resolution of the display. Once the guest fully loads, > > there is no noticeable difference in performance. It is mainly > > a degradation in startup performance, not performance once > > the guest OS is fully loaded. > above hints on presence of bug[s] in igd-passthru implementation, > and this patch effectively hides problem instead of trying to figure > out what's wrong and fixing it. > I agree that the with the current Qemu there is a bug in the igd-passthru implementation. My proposed patch fixes it. So why not support the proposed patch with a Reviewed-by tag?
On 1/17/2023 6:04 AM, Igor Mammedov wrote: > On Mon, 16 Jan 2023 13:00:53 -0500 > Chuck Zmudzinski <brchuckz@netscape.net> wrote: > > > On 1/16/23 10:33, Igor Mammedov wrote: > > > On Fri, 13 Jan 2023 16:31:26 -0500 > > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: > > >> > On Thu, 12 Jan 2023 23:14:26 -0500 > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > >> > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and > > >> >> >> always call it there unconditionally -- basically turning three lines > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > > >> >> >> Michael further suggests to rename it to something more general. All > > >> >> >> in all no big changes required. > > >> >> > > > >> >> > yes, exactly. > > >> >> > > > >> >> > > >> >> OK, got it. I can do that along with the other suggestions. > > >> > > > >> > have you considered instead of reservation, putting a slot check in device model > > >> > and if it's intel igd being passed through, fail at realize time if it can't take > > >> > required slot (with a error directing user to fix command line)? > > >> > > >> Yes, but the core pci code currently already fails at realize time > > >> with a useful error message if the user tries to use slot 2 for the > > >> igd, because of the xen platform device which has slot 2. The user > > >> can fix this without patching qemu, but having the user fix it on > > >> the command line is not the best way to solve the problem, primarily > > >> because the user would need to hotplug the xen platform device via a > > >> command line option instead of having the xen platform device added by > > >> pc_xen_hvm_init functions almost immediately after creating the pci > > >> bus, and that delay in adding the xen platform device degrades > > >> startup performance of the guest. > > >> > > >> > That could be less complicated than dealing with slot reservations at the cost of > > >> > being less convenient. > > >> > > >> And also a cost of reduced startup performance > > > > > > Could you clarify how it affects performance (and how much). > > > (as I see, setup done at board_init time is roughly the same > > > as with '-device foo' CLI options, modulo time needed to parse > > > options which should be negligible. and both ways are done before > > > guest runs) > > > > I preface my answer by saying there is a v9, but you don't > > need to look at that. I will answer all your questions here. > > > > I am going by what I observe on the main HDMI display with the > > different approaches. With the approach of not patching Qemu > > to fix this, which requires adding the Xen platform device a > > little later, the length of time it takes to fully load the > > guest is increased. I also noticed with Linux guests that use > > the grub bootoader, the grub vga driver cannot display the > > grub boot menu at the native resolution of the display, which > > in the tested case is 1920x1080, when the Xen platform device > > is added via a command line option instead of by the > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch > > to Qemu, the grub menu is displayed at the full, 1920x1080 > > native resolution of the display. Once the guest fully loads, > > there is no noticeable difference in performance. It is mainly > > a degradation in startup performance, not performance once > > the guest OS is fully loaded. > > Looking at igd-assign.txt, it recommends to add IGD using '-device' CLI > option, and actually drop at least graphics defaults explicitly. > So it is expected to work fine even when IGD is constructed with > '-device'. > > Could you provide full CLI current xen starts QEMU with and then > a CLI you used (with explicit -device for IGD) that leads > to reduced performance? > > CCing vfio folks who might have an idea what could be wrong based > on vfio experience. Actually, the igd is not added with an explicit -device option using Xen: 1573 ? Ssl 0:42 /usr/bin/qemu-system-i386 -xen-domid 1 -no-shutdown -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-1,server,nowait -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-1,server,nowait -mon chardev=libxenstat-cmd,mode=control -nodefaults -no-user-config -name windows -vnc none -display none -serial pty -boot order=c -smp 4,maxcpus=4 -net none -machine xenfv,max-ram-below-4g=3758096384,igd-passthru=on -m 6144 -drive file=/dev/loop0,if=ide,index=0,media=disk,format=raw,cache=writeback -drive file=/dev/disk/by-uuid/A44AA4984AA468AE,if=ide,index=1,media=disk,format=raw,cache=writeback I think it is added by xl (libxl management tool) when the guest is created using the qmp-libxl socket that appears on the command line, but I am not 100 percent sure. So, with libxl, the command line alone does not tell the whole story. The xl.cfg file has a line like this to define the pci devices passed through, and in qemu they are type XEN_PT devices, not VFIO devices: pci = [ '00:1b.0','00:14.0','00:02.0@02' ] This means three host pci devices are passed through, the ones on the host at slot 1b.0, 14.0, and 02.0. Of course the device at 02 is the igd. The @02 means libxl is requesting slot 2 in the guest for the igd, the other 2 devices are just auto assigned a slot by Qemu. Qemu cannot assign the igd to slot 2 for xenfv machines without a patch that prevents the Xen platform device from grabbing slot 2. That is what this patch accomplishes. The workaround involves using the Qemu pc machine instead of the Qemu xenfv machine, in which case the code in Qemu that adds the Xen platform device at slot 2 is avoided, and in that case the Xen platform device is added via a command line option instead at slot 3 instead of slot 2. The differences between vfio and the Xen pci passthrough device might make a difference in Xen vs. kvm for igd-pasthru. Also, kvm does not use the Xen platform device, and it seems the Xen guests behave better at startup when the Xen platform device is added very early, during the initialization of the emulated devices of the machine, which is based on the i440fx piix3 machine, instead of being added using a command line option. Perhaps the performance at startup could be improved by adding the igd via a command line option using vfio instead of the canonical way that libxl does pci passthrough on Xen, but I have no idea if vfio works on Xen the way it works on kvm. I am willing to investigate and experiment with it, though. So if any vfio people can shed some light on this, that would help. Thanks, Chuck
On Tue, 17 Jan 2023 19:15:57 -0500 Chuck Zmudzinski <brchuckz@aol.com> wrote: > On 1/17/2023 6:04 AM, Igor Mammedov wrote: > > On Mon, 16 Jan 2023 13:00:53 -0500 > > Chuck Zmudzinski <brchuckz@netscape.net> wrote: > > > > > On 1/16/23 10:33, Igor Mammedov wrote: > > > > On Fri, 13 Jan 2023 16:31:26 -0500 > > > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: > > > >> > On Thu, 12 Jan 2023 23:14:26 -0500 > > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > >> > > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if > > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and > > > >> >> >> always call it there unconditionally -- basically turning three lines > > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > > > >> >> >> Michael further suggests to rename it to something more general. All > > > >> >> >> in all no big changes required. > > > >> >> > > > > >> >> > yes, exactly. > > > >> >> > > > > >> >> > > > >> >> OK, got it. I can do that along with the other suggestions. > > > >> > > > > >> > have you considered instead of reservation, putting a slot check in device model > > > >> > and if it's intel igd being passed through, fail at realize time if it can't take > > > >> > required slot (with a error directing user to fix command line)? > > > >> > > > >> Yes, but the core pci code currently already fails at realize time > > > >> with a useful error message if the user tries to use slot 2 for the > > > >> igd, because of the xen platform device which has slot 2. The user > > > >> can fix this without patching qemu, but having the user fix it on > > > >> the command line is not the best way to solve the problem, primarily > > > >> because the user would need to hotplug the xen platform device via a > > > >> command line option instead of having the xen platform device added by > > > >> pc_xen_hvm_init functions almost immediately after creating the pci > > > >> bus, and that delay in adding the xen platform device degrades > > > >> startup performance of the guest. > > > >> > > > >> > That could be less complicated than dealing with slot reservations at the cost of > > > >> > being less convenient. > > > >> > > > >> And also a cost of reduced startup performance > > > > > > > > Could you clarify how it affects performance (and how much). > > > > (as I see, setup done at board_init time is roughly the same > > > > as with '-device foo' CLI options, modulo time needed to parse > > > > options which should be negligible. and both ways are done before > > > > guest runs) > > > > > > I preface my answer by saying there is a v9, but you don't > > > need to look at that. I will answer all your questions here. > > > > > > I am going by what I observe on the main HDMI display with the > > > different approaches. With the approach of not patching Qemu > > > to fix this, which requires adding the Xen platform device a > > > little later, the length of time it takes to fully load the > > > guest is increased. I also noticed with Linux guests that use > > > the grub bootoader, the grub vga driver cannot display the > > > grub boot menu at the native resolution of the display, which > > > in the tested case is 1920x1080, when the Xen platform device > > > is added via a command line option instead of by the > > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch > > > to Qemu, the grub menu is displayed at the full, 1920x1080 > > > native resolution of the display. Once the guest fully loads, > > > there is no noticeable difference in performance. It is mainly > > > a degradation in startup performance, not performance once > > > the guest OS is fully loaded. > > > > Looking at igd-assign.txt, it recommends to add IGD using '-device' CLI > > option, and actually drop at least graphics defaults explicitly. > > So it is expected to work fine even when IGD is constructed with > > '-device'. > > > > Could you provide full CLI current xen starts QEMU with and then > > a CLI you used (with explicit -device for IGD) that leads > > to reduced performance? > > > > CCing vfio folks who might have an idea what could be wrong based > > on vfio experience. > > Actually, the igd is not added with an explicit -device option using Xen: > > 1573 ? Ssl 0:42 /usr/bin/qemu-system-i386 -xen-domid 1 -no-shutdown -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-1,server,nowait -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-1,server,nowait -mon chardev=libxenstat-cmd,mode=control -nodefaults -no-user-config -name windows -vnc none -display none -serial pty -boot order=c -smp 4,maxcpus=4 -net none -machine xenfv,max-ram-below-4g=3758096384,igd-passthru=on -m 6144 -drive file=/dev/loop0,if=ide,index=0,media=disk,format=raw,cache=writeback -drive file=/dev/disk/by-uuid/A44AA4984AA468AE,if=ide,index=1,media=disk,format=raw,cache=writeback > > I think it is added by xl (libxl management tool) when the guest is created > using the qmp-libxl socket that appears on the command line, but I am not 100 > percent sure. So, with libxl, the command line alone does not tell the whole > story. The xl.cfg file has a line like this to define the pci devices passed through, > and in qemu they are type XEN_PT devices, not VFIO devices: > > pci = [ '00:1b.0','00:14.0','00:02.0@02' ] > > This means three host pci devices are passed through, the ones on the > host at slot 1b.0, 14.0, and 02.0. Of course the device at 02 is the igd. > The @02 means libxl is requesting slot 2 in the guest for the igd, the > other 2 devices are just auto assigned a slot by Qemu. Qemu cannot > assign the igd to slot 2 for xenfv machines without a patch that prevents > the Xen platform device from grabbing slot 2. That is what this patch > accomplishes. The workaround involves using the Qemu pc machine > instead of the Qemu xenfv machine, in which case the code in Qemu > that adds the Xen platform device at slot 2 is avoided, and in that case > the Xen platform device is added via a command line option instead > at slot 3 instead of slot 2. > > The differences between vfio and the Xen pci passthrough device > might make a difference in Xen vs. kvm for igd-pasthru. > > Also, kvm does not use the Xen platform device, and it seems the > Xen guests behave better at startup when the Xen platform device > is added very early, during the initialization of the emulated devices > of the machine, which is based on the i440fx piix3 machine, instead > of being added using a command line option. Perhaps the performance > at startup could be improved by adding the igd via a command line > option using vfio instead of the canonical way that libxl does pci > passthrough on Xen, but I have no idea if vfio works on Xen the way it > works on kvm. I am willing to investigate and experiment with it, though. > > So if any vfio people can shed some light on this, that would help. ISTR some rumors of Xen thinking about vfio, but AFAIK there is no combination of vfio with Xen, nor is there any sharing of device quirks to assign IGD. IGD assignment has various VM platform dependencies, of which the bus address is just one. Vfio support for IGD assignment takes a much different approach, the user or management tool is responsible for configuring the VM correctly for IGD assignment, including assigning the device to the correct bus address and using the right VM chipset, with the correct slot free for the LPC controller. If all the user configuration of the VM aligns correctly, we'll activate "legacy mode" by inserting the opregion and instantiate the LPC bridge with the correct fields to make the BIOS and drivers happy. Otherwise, maybe the user is trying to use the mythical UPT mode, but given Intel's lack of follow-through, it probably just won't work. Or maybe it's a vGPU and we don't need the legacy features anyway. Working with an expectation that QEMU needs to do the heavy lifting to make it all work automatically, with no support from the management tool for putting devices in the right place, I'm afraid there might not be much to leverage from the vfio vesion. Thanks, Alex
On 1/17/2023 11:27 PM, Alex Williamson wrote: > On Tue, 17 Jan 2023 19:15:57 -0500 > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > On 1/17/2023 6:04 AM, Igor Mammedov wrote: > > > On Mon, 16 Jan 2023 13:00:53 -0500 > > > Chuck Zmudzinski <brchuckz@netscape.net> wrote: > > > > > > > On 1/16/23 10:33, Igor Mammedov wrote: > > > > > On Fri, 13 Jan 2023 16:31:26 -0500 > > > > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > > > > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: > > > > >> > On Thu, 12 Jan 2023 23:14:26 -0500 > > > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > >> > > > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > > > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > > > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if > > > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and > > > > >> >> >> always call it there unconditionally -- basically turning three lines > > > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > > > > >> >> >> Michael further suggests to rename it to something more general. All > > > > >> >> >> in all no big changes required. > > > > >> >> > > > > > >> >> > yes, exactly. > > > > >> >> > > > > > >> >> > > > > >> >> OK, got it. I can do that along with the other suggestions. > > > > >> > > > > > >> > have you considered instead of reservation, putting a slot check in device model > > > > >> > and if it's intel igd being passed through, fail at realize time if it can't take > > > > >> > required slot (with a error directing user to fix command line)? > > > > >> > > > > >> Yes, but the core pci code currently already fails at realize time > > > > >> with a useful error message if the user tries to use slot 2 for the > > > > >> igd, because of the xen platform device which has slot 2. The user > > > > >> can fix this without patching qemu, but having the user fix it on > > > > >> the command line is not the best way to solve the problem, primarily > > > > >> because the user would need to hotplug the xen platform device via a > > > > >> command line option instead of having the xen platform device added by > > > > >> pc_xen_hvm_init functions almost immediately after creating the pci > > > > >> bus, and that delay in adding the xen platform device degrades > > > > >> startup performance of the guest. > > > > >> > > > > >> > That could be less complicated than dealing with slot reservations at the cost of > > > > >> > being less convenient. > > > > >> > > > > >> And also a cost of reduced startup performance > > > > > > > > > > Could you clarify how it affects performance (and how much). > > > > > (as I see, setup done at board_init time is roughly the same > > > > > as with '-device foo' CLI options, modulo time needed to parse > > > > > options which should be negligible. and both ways are done before > > > > > guest runs) > > > > > > > > I preface my answer by saying there is a v9, but you don't > > > > need to look at that. I will answer all your questions here. > > > > > > > > I am going by what I observe on the main HDMI display with the > > > > different approaches. With the approach of not patching Qemu > > > > to fix this, which requires adding the Xen platform device a > > > > little later, the length of time it takes to fully load the > > > > guest is increased. I also noticed with Linux guests that use > > > > the grub bootoader, the grub vga driver cannot display the > > > > grub boot menu at the native resolution of the display, which > > > > in the tested case is 1920x1080, when the Xen platform device > > > > is added via a command line option instead of by the > > > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch > > > > to Qemu, the grub menu is displayed at the full, 1920x1080 > > > > native resolution of the display. Once the guest fully loads, > > > > there is no noticeable difference in performance. It is mainly > > > > a degradation in startup performance, not performance once > > > > the guest OS is fully loaded. > > > > > > Looking at igd-assign.txt, it recommends to add IGD using '-device' CLI > > > option, and actually drop at least graphics defaults explicitly. > > > So it is expected to work fine even when IGD is constructed with > > > '-device'. > > > > > > Could you provide full CLI current xen starts QEMU with and then > > > a CLI you used (with explicit -device for IGD) that leads > > > to reduced performance? > > > > > > CCing vfio folks who might have an idea what could be wrong based > > > on vfio experience. > > > > Actually, the igd is not added with an explicit -device option using Xen: > > > > 1573 ? Ssl 0:42 /usr/bin/qemu-system-i386 -xen-domid 1 -no-shutdown -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-1,server,nowait -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-1,server,nowait -mon chardev=libxenstat-cmd,mode=control -nodefaults -no-user-config -name windows -vnc none -display none -serial pty -boot order=c -smp 4,maxcpus=4 -net none -machine xenfv,max-ram-below-4g=3758096384,igd-passthru=on -m 6144 -drive file=/dev/loop0,if=ide,index=0,media=disk,format=raw,cache=writeback -drive file=/dev/disk/by-uuid/A44AA4984AA468AE,if=ide,index=1,media=disk,format=raw,cache=writeback > > > > I think it is added by xl (libxl management tool) when the guest is created > > using the qmp-libxl socket that appears on the command line, but I am not 100 > > percent sure. So, with libxl, the command line alone does not tell the whole > > story. The xl.cfg file has a line like this to define the pci devices passed through, > > and in qemu they are type XEN_PT devices, not VFIO devices: > > > > pci = [ '00:1b.0','00:14.0','00:02.0@02' ] > > > > This means three host pci devices are passed through, the ones on the > > host at slot 1b.0, 14.0, and 02.0. Of course the device at 02 is the igd. > > The @02 means libxl is requesting slot 2 in the guest for the igd, the > > other 2 devices are just auto assigned a slot by Qemu. Qemu cannot > > assign the igd to slot 2 for xenfv machines without a patch that prevents > > the Xen platform device from grabbing slot 2. That is what this patch > > accomplishes. The workaround involves using the Qemu pc machine > > instead of the Qemu xenfv machine, in which case the code in Qemu > > that adds the Xen platform device at slot 2 is avoided, and in that case > > the Xen platform device is added via a command line option instead > > at slot 3 instead of slot 2. > > > > The differences between vfio and the Xen pci passthrough device > > might make a difference in Xen vs. kvm for igd-pasthru. > > > > Also, kvm does not use the Xen platform device, and it seems the > > Xen guests behave better at startup when the Xen platform device > > is added very early, during the initialization of the emulated devices > > of the machine, which is based on the i440fx piix3 machine, instead > > of being added using a command line option. Perhaps the performance > > at startup could be improved by adding the igd via a command line > > option using vfio instead of the canonical way that libxl does pci > > passthrough on Xen, but I have no idea if vfio works on Xen the way it > > works on kvm. I am willing to investigate and experiment with it, though. > > > > So if any vfio people can shed some light on this, that would help. > > ISTR some rumors of Xen thinking about vfio, but AFAIK there is no > combination of vfio with Xen, nor is there any sharing of device quirks > to assign IGD. IGD assignment has various VM platform dependencies, of > which the bus address is just one. Vfio support for IGD assignment > takes a much different approach, the user or management tool is > responsible for configuring the VM correctly for IGD assignment, > including assigning the device to the correct bus address and using the > right VM chipset, with the correct slot free for the LPC controller. If > all the user configuration of the VM aligns correctly, we'll activate > "legacy mode" by inserting the opregion and instantiate the LPC bridge > with the correct fields to make the BIOS and drivers happy. Otherwise, > maybe the user is trying to use the mythical UPT mode, but given > Intel's lack of follow-through, it probably just won't work. Or maybe > it's a vGPU and we don't need the legacy features anyway. > > Working with an expectation that QEMU needs to do the heavy lifting to > make it all work automatically, with no support from the management > tool for putting devices in the right place, I'm afraid there might not > be much to leverage from the vfio vesion. Thanks, > > Alex Thanks for answering my question. I thought vfio's implementation was distinct and probably incompatible from Xen's implementation. Chuck
On Tue, 17 Jan 2023 09:43:52 -0500 Chuck Zmudzinski <brchuckz@aol.com> wrote: > On 1/17/2023 5:35 AM, Igor Mammedov wrote: > > On Mon, 16 Jan 2023 13:00:53 -0500 > > Chuck Zmudzinski <brchuckz@netscape.net> wrote: > > > > > On 1/16/23 10:33, Igor Mammedov wrote: > > > > On Fri, 13 Jan 2023 16:31:26 -0500 > > > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: > > > >> > On Thu, 12 Jan 2023 23:14:26 -0500 > > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > >> > > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if > > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and > > > >> >> >> always call it there unconditionally -- basically turning three lines > > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > > > >> >> >> Michael further suggests to rename it to something more general. All > > > >> >> >> in all no big changes required. > > > >> >> > > > > >> >> > yes, exactly. > > > >> >> > > > > >> >> > > > >> >> OK, got it. I can do that along with the other suggestions. > > > >> > > > > >> > have you considered instead of reservation, putting a slot check in device model > > > >> > and if it's intel igd being passed through, fail at realize time if it can't take > > > >> > required slot (with a error directing user to fix command line)? > > > >> > > > >> Yes, but the core pci code currently already fails at realize time > > > >> with a useful error message if the user tries to use slot 2 for the > > > >> igd, because of the xen platform device which has slot 2. The user > > > >> can fix this without patching qemu, but having the user fix it on > > > >> the command line is not the best way to solve the problem, primarily > > > >> because the user would need to hotplug the xen platform device via a > > > >> command line option instead of having the xen platform device added by > > > >> pc_xen_hvm_init functions almost immediately after creating the pci > > > >> bus, and that delay in adding the xen platform device degrades > > > >> startup performance of the guest. > > > >> > > > >> > That could be less complicated than dealing with slot reservations at the cost of > > > >> > being less convenient. > > > >> > > > >> And also a cost of reduced startup performance > > > > > > > > Could you clarify how it affects performance (and how much). > > > > (as I see, setup done at board_init time is roughly the same > > > > as with '-device foo' CLI options, modulo time needed to parse > > > > options which should be negligible. and both ways are done before > > > > guest runs) > > > > > > I preface my answer by saying there is a v9, but you don't > > > need to look at that. I will answer all your questions here. > > > > > > I am going by what I observe on the main HDMI display with the > > > different approaches. With the approach of not patching Qemu > > > to fix this, which requires adding the Xen platform device a > > > little later, the length of time it takes to fully load the > > > guest is increased. I also noticed with Linux guests that use > > > the grub bootoader, the grub vga driver cannot display the > > > grub boot menu at the native resolution of the display, which > > > in the tested case is 1920x1080, when the Xen platform device > > > is added via a command line option instead of by the > > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch > > > to Qemu, the grub menu is displayed at the full, 1920x1080 > > > native resolution of the display. Once the guest fully loads, > > > there is no noticeable difference in performance. It is mainly > > > a degradation in startup performance, not performance once > > > the guest OS is fully loaded. > > above hints on presence of bug[s] in igd-passthru implementation, > > and this patch effectively hides problem instead of trying to figure > > out what's wrong and fixing it. > > > > Why did you delete the rest of my answers to your other observations and > not respond to them? they are preserved on mail-list in you previous email. It's usually fine to trim non relevant parts and keep only part/context that is being replied to. I didn't want to argue point that it's usually job of management arrange correct IGD passthrough for QEMU like Alex pointed out. Hence those part got trimmed. >
On Tue, 17 Jan 2023 09:50:23 -0500 Chuck Zmudzinski <brchuckz@aol.com> wrote: > On 1/17/2023 5:35 AM, Igor Mammedov wrote: > > On Mon, 16 Jan 2023 13:00:53 -0500 > > Chuck Zmudzinski <brchuckz@netscape.net> wrote: > > > > > On 1/16/23 10:33, Igor Mammedov wrote: > > > > On Fri, 13 Jan 2023 16:31:26 -0500 > > > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: > > > >> > On Thu, 12 Jan 2023 23:14:26 -0500 > > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > >> > > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if > > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and > > > >> >> >> always call it there unconditionally -- basically turning three lines > > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > > > >> >> >> Michael further suggests to rename it to something more general. All > > > >> >> >> in all no big changes required. > > > >> >> > > > > >> >> > yes, exactly. > > > >> >> > > > > >> >> > > > >> >> OK, got it. I can do that along with the other suggestions. > > > >> > > > > >> > have you considered instead of reservation, putting a slot check in device model > > > >> > and if it's intel igd being passed through, fail at realize time if it can't take > > > >> > required slot (with a error directing user to fix command line)? > > > >> > > > >> Yes, but the core pci code currently already fails at realize time > > > >> with a useful error message if the user tries to use slot 2 for the > > > >> igd, because of the xen platform device which has slot 2. The user > > > >> can fix this without patching qemu, but having the user fix it on > > > >> the command line is not the best way to solve the problem, primarily > > > >> because the user would need to hotplug the xen platform device via a > > > >> command line option instead of having the xen platform device added by > > > >> pc_xen_hvm_init functions almost immediately after creating the pci > > > >> bus, and that delay in adding the xen platform device degrades > > > >> startup performance of the guest. > > > >> > > > >> > That could be less complicated than dealing with slot reservations at the cost of > > > >> > being less convenient. > > > >> > > > >> And also a cost of reduced startup performance > > > > > > > > Could you clarify how it affects performance (and how much). > > > > (as I see, setup done at board_init time is roughly the same > > > > as with '-device foo' CLI options, modulo time needed to parse > > > > options which should be negligible. and both ways are done before > > > > guest runs) > > > > > > I preface my answer by saying there is a v9, but you don't > > > need to look at that. I will answer all your questions here. > > > > > > I am going by what I observe on the main HDMI display with the > > > different approaches. With the approach of not patching Qemu > > > to fix this, which requires adding the Xen platform device a > > > little later, the length of time it takes to fully load the > > > guest is increased. I also noticed with Linux guests that use > > > the grub bootoader, the grub vga driver cannot display the > > > grub boot menu at the native resolution of the display, which > > > in the tested case is 1920x1080, when the Xen platform device > > > is added via a command line option instead of by the > > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch > > > to Qemu, the grub menu is displayed at the full, 1920x1080 > > > native resolution of the display. Once the guest fully loads, > > > there is no noticeable difference in performance. It is mainly > > > a degradation in startup performance, not performance once > > > the guest OS is fully loaded. > > above hints on presence of bug[s] in igd-passthru implementation, > > and this patch effectively hides problem instead of trying to figure > > out what's wrong and fixing it. > > > > I agree that the with the current Qemu there is a bug in the igd-passthru > implementation. My proposed patch fixes it. So why not support the > proposed patch with a Reviewed-by tag? I see the patch as workaround (it might be a reasonable one) and it's upto xen maintainers to give Reviewed-by/accept it. Though I'd add to commit message something about performance issues you are experiencing when trying to passthrough IGD manually as one of justifications for workaround (it might push Xen folks to investigate the issue and it won't be lost/forgotten on mail-list).
On Tue, 17 Jan 2023, Chuck Zmudzinski wrote: > On 1/17/2023 6:04 AM, Igor Mammedov wrote: > > On Mon, 16 Jan 2023 13:00:53 -0500 > > Chuck Zmudzinski <brchuckz@netscape.net> wrote: > > > > > On 1/16/23 10:33, Igor Mammedov wrote: > > > > On Fri, 13 Jan 2023 16:31:26 -0500 > > > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: > > > >> > On Thu, 12 Jan 2023 23:14:26 -0500 > > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > >> > > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if > > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and > > > >> >> >> always call it there unconditionally -- basically turning three lines > > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > > > >> >> >> Michael further suggests to rename it to something more general. All > > > >> >> >> in all no big changes required. > > > >> >> > > > > >> >> > yes, exactly. > > > >> >> > > > > >> >> > > > >> >> OK, got it. I can do that along with the other suggestions. > > > >> > > > > >> > have you considered instead of reservation, putting a slot check in device model > > > >> > and if it's intel igd being passed through, fail at realize time if it can't take > > > >> > required slot (with a error directing user to fix command line)? > > > >> > > > >> Yes, but the core pci code currently already fails at realize time > > > >> with a useful error message if the user tries to use slot 2 for the > > > >> igd, because of the xen platform device which has slot 2. The user > > > >> can fix this without patching qemu, but having the user fix it on > > > >> the command line is not the best way to solve the problem, primarily > > > >> because the user would need to hotplug the xen platform device via a > > > >> command line option instead of having the xen platform device added by > > > >> pc_xen_hvm_init functions almost immediately after creating the pci > > > >> bus, and that delay in adding the xen platform device degrades > > > >> startup performance of the guest. > > > >> > > > >> > That could be less complicated than dealing with slot reservations at the cost of > > > >> > being less convenient. > > > >> > > > >> And also a cost of reduced startup performance > > > > > > > > Could you clarify how it affects performance (and how much). > > > > (as I see, setup done at board_init time is roughly the same > > > > as with '-device foo' CLI options, modulo time needed to parse > > > > options which should be negligible. and both ways are done before > > > > guest runs) > > > > > > I preface my answer by saying there is a v9, but you don't > > > need to look at that. I will answer all your questions here. > > > > > > I am going by what I observe on the main HDMI display with the > > > different approaches. With the approach of not patching Qemu > > > to fix this, which requires adding the Xen platform device a > > > little later, the length of time it takes to fully load the > > > guest is increased. I also noticed with Linux guests that use > > > the grub bootoader, the grub vga driver cannot display the > > > grub boot menu at the native resolution of the display, which > > > in the tested case is 1920x1080, when the Xen platform device > > > is added via a command line option instead of by the > > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch > > > to Qemu, the grub menu is displayed at the full, 1920x1080 > > > native resolution of the display. Once the guest fully loads, > > > there is no noticeable difference in performance. It is mainly > > > a degradation in startup performance, not performance once > > > the guest OS is fully loaded. > > > > Looking at igd-assign.txt, it recommends to add IGD using '-device' CLI > > option, and actually drop at least graphics defaults explicitly. > > So it is expected to work fine even when IGD is constructed with > > '-device'. > > > > Could you provide full CLI current xen starts QEMU with and then > > a CLI you used (with explicit -device for IGD) that leads > > to reduced performance? > > > > CCing vfio folks who might have an idea what could be wrong based > > on vfio experience. > > Actually, the igd is not added with an explicit -device option using Xen: > > 1573 ? Ssl 0:42 /usr/bin/qemu-system-i386 -xen-domid 1 -no-shutdown -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-1,server,nowait -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-1,server,nowait -mon chardev=libxenstat-cmd,mode=control -nodefaults -no-user-config -name windows -vnc none -display none -serial pty -boot order=c -smp 4,maxcpus=4 -net none -machine xenfv,max-ram-below-4g=3758096384,igd-passthru=on -m 6144 -drive file=/dev/loop0,if=ide,index=0,media=disk,format=raw,cache=writeback -drive file=/dev/disk/by-uuid/A44AA4984AA468AE,if=ide,index=1,media=disk,format=raw,cache=writeback > > I think it is added by xl (libxl management tool) when the guest is created > using the qmp-libxl socket that appears on the command line, but I am not 100 > percent sure. So, with libxl, the command line alone does not tell the whole > story. The xl.cfg file has a line like this to define the pci devices passed through, > and in qemu they are type XEN_PT devices, not VFIO devices: > > pci = [ '00:1b.0','00:14.0','00:02.0@02' ] > > This means three host pci devices are passed through, the ones on the > host at slot 1b.0, 14.0, and 02.0. Of course the device at 02 is the igd. > The @02 means libxl is requesting slot 2 in the guest for the igd, the > other 2 devices are just auto assigned a slot by Qemu. Qemu cannot > assign the igd to slot 2 for xenfv machines without a patch that prevents > the Xen platform device from grabbing slot 2. That is what this patch > accomplishes. In principle I think this change is OK. Apologies that this patch is at v9 and none of the Xen/QEMU maintainers took a look at it yet. I'll try to look at it today.
On 1/20/2023 11:57 AM, Stefano Stabellini wrote: > On Tue, 17 Jan 2023, Chuck Zmudzinski wrote: > > On 1/17/2023 6:04 AM, Igor Mammedov wrote: > > > On Mon, 16 Jan 2023 13:00:53 -0500 > > > Chuck Zmudzinski <brchuckz@netscape.net> wrote: > > > > > > > On 1/16/23 10:33, Igor Mammedov wrote: > > > > > On Fri, 13 Jan 2023 16:31:26 -0500 > > > > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > > > > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: > > > > >> > On Thu, 12 Jan 2023 23:14:26 -0500 > > > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > >> > > > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > > > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > > > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if > > > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and > > > > >> >> >> always call it there unconditionally -- basically turning three lines > > > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > > > > >> >> >> Michael further suggests to rename it to something more general. All > > > > >> >> >> in all no big changes required. > > > > >> >> > > > > > >> >> > yes, exactly. > > > > >> >> > > > > > >> >> > > > > >> >> OK, got it. I can do that along with the other suggestions. > > > > >> > > > > > >> > have you considered instead of reservation, putting a slot check in device model > > > > >> > and if it's intel igd being passed through, fail at realize time if it can't take > > > > >> > required slot (with a error directing user to fix command line)? > > > > >> > > > > >> Yes, but the core pci code currently already fails at realize time > > > > >> with a useful error message if the user tries to use slot 2 for the > > > > >> igd, because of the xen platform device which has slot 2. The user > > > > >> can fix this without patching qemu, but having the user fix it on > > > > >> the command line is not the best way to solve the problem, primarily > > > > >> because the user would need to hotplug the xen platform device via a > > > > >> command line option instead of having the xen platform device added by > > > > >> pc_xen_hvm_init functions almost immediately after creating the pci > > > > >> bus, and that delay in adding the xen platform device degrades > > > > >> startup performance of the guest. > > > > >> > > > > >> > That could be less complicated than dealing with slot reservations at the cost of > > > > >> > being less convenient. > > > > >> > > > > >> And also a cost of reduced startup performance > > > > > > > > > > Could you clarify how it affects performance (and how much). > > > > > (as I see, setup done at board_init time is roughly the same > > > > > as with '-device foo' CLI options, modulo time needed to parse > > > > > options which should be negligible. and both ways are done before > > > > > guest runs) > > > > > > > > I preface my answer by saying there is a v9, but you don't > > > > need to look at that. I will answer all your questions here. > > > > > > > > I am going by what I observe on the main HDMI display with the > > > > different approaches. With the approach of not patching Qemu > > > > to fix this, which requires adding the Xen platform device a > > > > little later, the length of time it takes to fully load the > > > > guest is increased. I also noticed with Linux guests that use > > > > the grub bootoader, the grub vga driver cannot display the > > > > grub boot menu at the native resolution of the display, which > > > > in the tested case is 1920x1080, when the Xen platform device > > > > is added via a command line option instead of by the > > > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch > > > > to Qemu, the grub menu is displayed at the full, 1920x1080 > > > > native resolution of the display. Once the guest fully loads, > > > > there is no noticeable difference in performance. It is mainly > > > > a degradation in startup performance, not performance once > > > > the guest OS is fully loaded. > > > > > > Looking at igd-assign.txt, it recommends to add IGD using '-device' CLI > > > option, and actually drop at least graphics defaults explicitly. > > > So it is expected to work fine even when IGD is constructed with > > > '-device'. > > > > > > Could you provide full CLI current xen starts QEMU with and then > > > a CLI you used (with explicit -device for IGD) that leads > > > to reduced performance? > > > > > > CCing vfio folks who might have an idea what could be wrong based > > > on vfio experience. > > > > Actually, the igd is not added with an explicit -device option using Xen: > > > > 1573 ? Ssl 0:42 /usr/bin/qemu-system-i386 -xen-domid 1 -no-shutdown -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-1,server,nowait -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-1,server,nowait -mon chardev=libxenstat-cmd,mode=control -nodefaults -no-user-config -name windows -vnc none -display none -serial pty -boot order=c -smp 4,maxcpus=4 -net none -machine xenfv,max-ram-below-4g=3758096384,igd-passthru=on -m 6144 -drive file=/dev/loop0,if=ide,index=0,media=disk,format=raw,cache=writeback -drive file=/dev/disk/by-uuid/A44AA4984AA468AE,if=ide,index=1,media=disk,format=raw,cache=writeback > > > > I think it is added by xl (libxl management tool) when the guest is created > > using the qmp-libxl socket that appears on the command line, but I am not 100 > > percent sure. So, with libxl, the command line alone does not tell the whole > > story. The xl.cfg file has a line like this to define the pci devices passed through, > > and in qemu they are type XEN_PT devices, not VFIO devices: > > > > pci = [ '00:1b.0','00:14.0','00:02.0@02' ] > > > > This means three host pci devices are passed through, the ones on the > > host at slot 1b.0, 14.0, and 02.0. Of course the device at 02 is the igd. > > The @02 means libxl is requesting slot 2 in the guest for the igd, the > > other 2 devices are just auto assigned a slot by Qemu. Qemu cannot > > assign the igd to slot 2 for xenfv machines without a patch that prevents > > the Xen platform device from grabbing slot 2. That is what this patch > > accomplishes. > > In principle I think this change is OK. Apologies that this patch is at > v9 and none of the Xen/QEMU maintainers took a look at it yet. I'll try > to look at it today. Thanks!
On 1/20/2023 11:01 AM, Igor Mammedov wrote: > On Tue, 17 Jan 2023 09:50:23 -0500 > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > On 1/17/2023 5:35 AM, Igor Mammedov wrote: > > > On Mon, 16 Jan 2023 13:00:53 -0500 > > > Chuck Zmudzinski <brchuckz@netscape.net> wrote: > > > > > > > On 1/16/23 10:33, Igor Mammedov wrote: > > > > > On Fri, 13 Jan 2023 16:31:26 -0500 > > > > > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > > > > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote: > > > > >> > On Thu, 12 Jan 2023 23:14:26 -0500 > > > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote: > > > > >> > > > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote: > > > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: > > > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if > > > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and > > > > >> >> >> always call it there unconditionally -- basically turning three lines > > > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific, > > > > >> >> >> Michael further suggests to rename it to something more general. All > > > > >> >> >> in all no big changes required. > > > > >> >> > > > > > >> >> > yes, exactly. > > > > >> >> > > > > > >> >> > > > > >> >> OK, got it. I can do that along with the other suggestions. > > > > >> > > > > > >> > have you considered instead of reservation, putting a slot check in device model > > > > >> > and if it's intel igd being passed through, fail at realize time if it can't take > > > > >> > required slot (with a error directing user to fix command line)? > > > > >> > > > > >> Yes, but the core pci code currently already fails at realize time > > > > >> with a useful error message if the user tries to use slot 2 for the > > > > >> igd, because of the xen platform device which has slot 2. The user > > > > >> can fix this without patching qemu, but having the user fix it on > > > > >> the command line is not the best way to solve the problem, primarily > > > > >> because the user would need to hotplug the xen platform device via a > > > > >> command line option instead of having the xen platform device added by > > > > >> pc_xen_hvm_init functions almost immediately after creating the pci > > > > >> bus, and that delay in adding the xen platform device degrades > > > > >> startup performance of the guest. > > > > >> > > > > >> > That could be less complicated than dealing with slot reservations at the cost of > > > > >> > being less convenient. > > > > >> > > > > >> And also a cost of reduced startup performance > > > > > > > > > > Could you clarify how it affects performance (and how much). > > > > > (as I see, setup done at board_init time is roughly the same > > > > > as with '-device foo' CLI options, modulo time needed to parse > > > > > options which should be negligible. and both ways are done before > > > > > guest runs) > > > > > > > > I preface my answer by saying there is a v9, but you don't > > > > need to look at that. I will answer all your questions here. > > > > > > > > I am going by what I observe on the main HDMI display with the > > > > different approaches. With the approach of not patching Qemu > > > > to fix this, which requires adding the Xen platform device a > > > > little later, the length of time it takes to fully load the > > > > guest is increased. I also noticed with Linux guests that use > > > > the grub bootoader, the grub vga driver cannot display the > > > > grub boot menu at the native resolution of the display, which > > > > in the tested case is 1920x1080, when the Xen platform device > > > > is added via a command line option instead of by the > > > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch > > > > to Qemu, the grub menu is displayed at the full, 1920x1080 > > > > native resolution of the display. Once the guest fully loads, > > > > there is no noticeable difference in performance. It is mainly > > > > a degradation in startup performance, not performance once > > > > the guest OS is fully loaded. > > > above hints on presence of bug[s] in igd-passthru implementation, > > > and this patch effectively hides problem instead of trying to figure > > > out what's wrong and fixing it. > > > > > > > I agree that the with the current Qemu there is a bug in the igd-passthru > > implementation. My proposed patch fixes it. So why not support the > > proposed patch with a Reviewed-by tag? > I see the patch as workaround (it might be a reasonable one) and > it's upto xen maintainers to give Reviewed-by/accept it. > > Though I'd add to commit message something about performance issues > you are experiencing when trying to passthrough IGD manually No, the device that needs to be passed through manually with the workaround is the Xen platform device, and the workaround (as I understand it) is the patch to libxl, not this patch to Qemu. The igd is passed through the same way whether or not Qemu or libxl is patched, with these patches I have proposed, and IIUC that is by using QMP and the Qemu device listener. > as one of justifications for workaround (it might push Xen folks > to investigate the issue and it won't be lost/forgotten on mail-list). > I could add that on the next iteration. As far as Xen folks investigating further, that would be welcome. I think the best way would be for the pc_xen_hvm_init functions to add the igd at slot 2 and the xen platform device at slot 3 when igd-passthru=on is set on the command line. But I don't know if it is possible for the pc_xen_hvm_init functions to attach a type XEN_PT, which is the device type of the igd when using xen, without changing the interface between libxl and Qemu. My patches fix this without changing that interface, but the Qemu patch does it in a way that achieves better startup performance that the patch to libxl because of the fact that the xen platform device can be added sooner during guest creation by patching Qemu than by patching libxl. Currently, IIUC, the XEN_PT devices are added through QMP and the Qemu device listener. If there was a way for Qemu (the pc_xen_hvm_init_pci function) to actively attach the igd instead of having libxl add it via QMP and the Qemu device listener, that would also work and might improve performance more because then both the igd and the xen platform device would be added early during the creation of the guest. Thanks.
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index b48047f50c..bc5efa4f59 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine) } pc_xen_hvm_init_pci(machine); + if (xen_igd_gfx_pt_enabled()) { + xen_igd_reserve_slot(pcms->bus); + } pci_create_simple(pcms->bus, -1, "xen-platform"); } #endif diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c index 0ec7e52183..eff38155ef 100644 --- a/hw/xen/xen_pt.c +++ b/hw/xen/xen_pt.c @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp) s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, s->dev.devfn); - xen_host_pci_device_get(&s->real_device, - s->hostaddr.domain, s->hostaddr.bus, - s->hostaddr.slot, s->hostaddr.function, - errp); - if (*errp) { - error_append_hint(errp, "Failed to \"open\" the real pci device"); - return; - } - s->is_virtfn = s->real_device.is_virtfn; if (s->is_virtfn) { XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n", @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj) PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS; } +void xen_igd_reserve_slot(PCIBus *pci_bus) +{ + XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n"); + pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK; +} + +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp) +{ + ERRP_GUARD(); + PCIDevice *pci_dev = (PCIDevice *)qdev; + XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev); + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s); + PCIBus *pci_bus = pci_get_bus(pci_dev); + + if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) + return; + + xen_host_pci_device_get(&s->real_device, + s->hostaddr.domain, s->hostaddr.bus, + s->hostaddr.slot, s->hostaddr.function, + errp); + if (*errp) { + error_append_hint(errp, "Failed to \"open\" the real pci device"); + return; + } + + if (is_igd_vga_passthrough(&s->real_device) && + s->real_device.domain == 0 && s->real_device.bus == 0 && + s->real_device.dev == 2 && s->real_device.func == 0 && + s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { + pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK; + XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n"); + } + xpdc->pci_qdev_realize(qdev, errp); +} + static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); + XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass); + xpdc->pci_qdev_realize = dc->realize; + dc->realize = xen_igd_clear_slot; k->realize = xen_pt_realize; k->exit = xen_pt_unregister_device; k->config_read = xen_pt_pci_read_config; @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = { .instance_size = sizeof(XenPCIPassthroughState), .instance_finalize = xen_pci_passthrough_finalize, .class_init = xen_pci_passthrough_class_init, + .class_size = sizeof(XenPTDeviceClass), .instance_init = xen_pci_passthrough_instance_init, .interfaces = (InterfaceInfo[]) { { INTERFACE_CONVENTIONAL_PCI_DEVICE }, diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h index cf10fc7bbf..8c25932b4b 100644 --- a/hw/xen/xen_pt.h +++ b/hw/xen/xen_pt.h @@ -2,6 +2,7 @@ #define XEN_PT_H #include "hw/xen/xen_common.h" +#include "hw/pci/pci_bus.h" #include "xen-host-pci-device.h" #include "qom/object.h" @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg; #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough" OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE) +#define XEN_PT_DEVICE_CLASS(klass) \ + OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE) +#define XEN_PT_DEVICE_GET_CLASS(obj) \ + OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE) + +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp); + +typedef struct XenPTDeviceClass { + PCIDeviceClass parent_class; + XenPTQdevRealize pci_qdev_realize; +} XenPTDeviceClass; + uint32_t igd_read_opregion(XenPCIPassthroughState *s); +void xen_igd_reserve_slot(PCIBus *pci_bus); void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val); void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s, XenHostPCIDevice *dev); @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read) #define XEN_PCI_INTEL_OPREGION 0xfc +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */ + typedef enum { XEN_PT_GRP_TYPE_HARDWIRED = 0, /* 0 Hardwired reg group */ XEN_PT_GRP_TYPE_EMU, /* emul reg group */ diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c index 2d8cac8d54..5c108446a8 100644 --- a/hw/xen/xen_pt_stub.c +++ b/hw/xen/xen_pt_stub.c @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp) error_setg(errp, "Xen PCI passthrough support not built in"); } } + +void xen_igd_reserve_slot(PCIBus *pci_bus) +{ +}
Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus, as noted in docs/igd-assign.txt in the Qemu source code. Currently, when the xl toolstack is used to configure a Xen HVM guest with Intel IGD passthrough to the guest with the Qemu upstream device model, a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy a different slot. This problem often prevents the guest from booting. The only available workaround is not good: Configure Xen HVM guests to use the old and no longer maintained Qemu traditional device model available from xenbits.xen.org which does reserve slot 2 for the Intel IGD. To implement this feature in the Qemu upstream device model for Xen HVM guests, introduce the following new functions, types, and macros: * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS * typedef XenPTQdevRealize function pointer * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2 * xen_igd_reserve_slot and xen_igd_clear_slot functions The new xen_igd_reserve_slot function uses the existing slot_reserved_mask member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using the xl toolstack with the gfx_passthru option enabled, which sets the igd-passthru=on option to Qemu for the Xen HVM machine type. The new xen_igd_reserve_slot function also needs to be implemented in hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough, in which case it does nothing. The new xen_igd_clear_slot function overrides qdev->realize of the parent PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was created in hw/i386/pc_piix.c for the case when igd-passthru=on. Move the call to xen_host_pci_device_get, and the associated error handling, from xen_pt_realize to the new xen_igd_clear_slot function to initialize the device class and vendor values which enables the checks for the Intel IGD to succeed. The verification that the host device is an Intel IGD to be passed through is done by checking the domain, bus, slot, and function values as well as by checking that gfx_passthru is enabled, the device class is VGA, and the device vendor in Intel. Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com> --- Notes that might be helpful to reviewers of patched code in hw/xen: The new functions and types are based on recommendations from Qemu docs: https://qemu.readthedocs.io/en/latest/devel/qom.html Notes that might be helpful to reviewers of patched code in hw/i386: The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does not affect builds that do not have CONFIG_XEN defined. xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an existing function that is only true when Qemu is built with xen-pci-passthrough enabled and the administrator has configured the Xen HVM guest with Qemu's igd-passthru=on option. v2: Remove From: <email address> tag at top of commit message v3: Changed the test for the Intel IGD in xen_igd_clear_slot: if (is_igd_vga_passthrough(&s->real_device) && (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) { is changed to if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) && (s->hostaddr.function == 0)) { I hoped that I could use the test in v2, since it matches the other tests for the Intel IGD in Qemu and Xen, but those tests do not work because the necessary data structures are not set with their values yet. So instead use the test that the administrator has enabled gfx_passthru and the device address on the host is 02.0. This test does detect the Intel IGD correctly. v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's email address to match the address used by the same author in commits be9c61da and c0e86b76 Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc v5: The patch of xen_pt.c was re-worked to allow a more consistent test for the Intel IGD that uses the same criteria as in other places. This involved moving the call to xen_host_pci_device_get from xen_pt_realize to xen_igd_clear_slot and updating the checks for the Intel IGD in xen_igd_clear_slot: if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2) && (s->hostaddr.function == 0)) { is changed to if (is_igd_vga_passthrough(&s->real_device) && s->real_device.domain == 0 && s->real_device.bus == 0 && s->real_device.dev == 2 && s->real_device.func == 0 && s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) { Added an explanation for the move of xen_host_pci_device_get from xen_pt_realize to xen_igd_clear_slot to the commit message. Rebase. v6: Fix logging by removing these lines from the move from xen_pt_realize to xen_igd_clear_slot that was done in v5: XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d" " to devfn 0x%x\n", s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function, s->dev.devfn); This log needs to be in xen_pt_realize because s->dev.devfn is not set yet in xen_igd_clear_slot. v7: The v7 that was posted to the mailing list was incorrect. v8 is what v7 was intended to be. v8: Inhibit out of context log message and needless processing by adding 2 lines at the top of the new xen_igd_clear_slot function: if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK)) return; Rebase. This removed an unnecessary header file from xen_pt.h hw/i386/pc_piix.c | 3 +++ hw/xen/xen_pt.c | 49 ++++++++++++++++++++++++++++++++++++-------- hw/xen/xen_pt.h | 16 +++++++++++++++ hw/xen/xen_pt_stub.c | 4 ++++ 4 files changed, 63 insertions(+), 9 deletions(-)