Message ID | 4703b7f1-8d3c-5128-213c-e39f487e4cde@netscape.net (mailing list archive)
---|---
State | New, archived
Series | BUG: libxenlight fails to grant permission to access Intel IGD Opregion
On 11.03.2022 06:01, Chuck Zmudzinski wrote:
> Further research showed that these two pages at 0xcc490 are for the
> Intel IGD opregion, and because this memory is not permitted to be
> accessed by the domain, the passthrough of an Intel IGD to a Linux
> HVM domain fails, causing a crash of the Linux i915.ko kernel module
> in the HVM domain. My testing, which was on a desktop with a Haswell
> Intel CPU/IGD, confirmed that these two extra pages need to be
> permitted in order for passthrough of the Intel IGD to a Linux
> domain to work properly.
>
> I find that adding two pages is enough to fix the problem, but I
> have read in other places that the Opregion is actually three pages,
> and maybe newer revisions of the Intel IGD do need three pages
> instead of two. I am testing on a Haswell Intel chip, which is over
> 8 years old now. So the patch I propose adds two pages, but I am not
> sure if it should be three pages for newer Intel chips.
>
> The failure to map this memory with gfx_passthru enabled is
> therefore a bug, a regression that was introduced with the two
> aforementioned patches way back in 2014 when Xen 4.5 was under
> development.

Thanks for this analysis. It looks quite plausible (but the question
of 2 vs 3 pages of course needs resolving).

> Once I developed a patch, I did more testing with the traditional
> Qemu device model and Debian's package of Xen-4.16 for Debian
> sid/unstable after I discovered where this bug first appeared in
> Xen 4.5-unstable back in 2014. In my testing, Windows HVM domains
> are not affected by this bug and they function properly, most likely
> because proprietary Intel graphics drivers for Windows are more
> forgiving than the Linux open source drivers for Intel graphics
> regarding the details of how Xen and Qemu configure the domain.
>
> This bug still exists in current supported versions of Xen because
> in Xen 4.16, passthrough of my Haswell Intel IGD to a Linux domain
> still fails with a crash of the i915 Linux kernel module in the
> Linux unprivileged domain when the traditional Qemu device model is
> used in dom0. The patch at the end of this message fixes it.
>
> I have not yet succeeded in reproducing this bug with the upstream
> device model because there is another bug in Qemu upstream that
> breaks passthrough of the Intel IGD to a Linux HVM domU, so for now,
> to reproduce it, please use the traditional device model.
>
> Also, as a starting point to reproduce the bug, first get Intel IGD
> passthrough to a Windows HVM domain using the Qemu traditional
> device model working on Xen 4.16. Then replace the Windows HVM
> domain with a Linux HVM domain, keeping everything else the same
> including the Qemu traditional device model. I tested using a Debian
> 11.2 (bullseye) HVM domain and Debian sid/unstable with Xen 4.16 and
> a build of the Qemu traditional device model from source as provided
> on xenbits.xen.org
>
> I am using a desktop computer and the xl toolstack and Xen as
> packaged by Debian, except that I added the traditional device model
> that Debian does not provide.
>
> If you need more info, please let me know. I am not subscribed to
> xen-devel so please cc me with your replies.
>
> Regards,
>
> Chuck
>
> Here is the patch that fixes the bug on Debian sid/Xen 4.16:

As to an actual patch for us to take - please see
docs/process/sending-patches.pandoc for the formal requirements.
(Note this was recently introduced, so you won't find it in the
4.16 sources. But your patch wants to be against latest staging
anyway.)

Jan
On 3/11/2022 3:09 AM, Jan Beulich wrote:
> On 11.03.2022 06:01, Chuck Zmudzinski wrote:
>> [full bug report quoted above - snip]
>
> Thanks for this analysis. It looks quite plausible (but the question
> of 2 vs 3 pages of course needs resolving).
>
> As to an actual patch for us to take - please see
> docs/process/sending-patches.pandoc for the formal requirements.
> (Note this was recently introduced, so you won't find it in the
> 4.16 sources. But your patch wants to be against latest staging
> anyway.)
>
> Jan

After resolving the question of two vs. three pages, I will follow
the process for submitting a patch against the latest staging.

Qubes OS has a patch that uses three pages, and the igd.c pci file in
Qemu's hw/vfio directory also specifies three pages, but if two is
enough, that seems safer. I haven't checked yet to see if there is an
official specification from Intel. I will start by looking in the
Linux kernel i915 driver code, which might give a clue.

Chuck
On 3/11/22 8:47 AM, Chuck Zmudzinski wrote:
> On 3/11/2022 3:09 AM, Jan Beulich wrote:
>> Thanks for this analysis. It looks quite plausible (but the question
>> of 2 vs 3 pages of course needs resolving).
>
> After resolving the question of two vs. three pages, I will follow
> the process for submitting a patch against the latest staging.
>
> [snip]

It looks like both in Xen and Qemu, it is agreed we need 3 pages.

In Xen, we have:

From: Keir Fraser <keir@xen.org>
Date: Thu, 10 Jan 2013 17:26:24 +0000 (+0000)
Subject: hvmloader: Allocate 3 pages for Intel GPU OpRegion passthrough.
X-Git-Tag: 4.3.0-rc1~546
X-Git-Url: https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff_plain;h=408a9e56343b006c9e58a334f0b97dd2deedf9ac

    hvmloader: Allocate 3 pages for Intel GPU OpRegion passthrough.

    The 8kB region may not be page aligned, hence requiring 3 pages to
    be mapped through.

    Signed-off-by: Keir Fraser <keir@xen.org>

Also, Qemu upstream defines it as 3 pages:

https://xenbits.xen.org/gitweb/?p=qemu-xen.git;a=blob;f=hw/xen/xen_pt_graphics.c;h=a3bc7e39214b3d71c32c8386758fdef6ced5c0df;hb=a68d6d311c2d1fd9d2fa9a0768ea2353e8a79b42#l242

In Keir's commit, the OpRegion size was increased from 2 to 3 pages,
so I think it is correct to use 3 pages in a proposed patch.

In tools/firmware/hvmloader, there are header files that have macros
setting the address and size of the Intel IGD opregion.

So my only question is:

Is it proper to include header files from tools/firmware/hvmloader in
tools/libs/light/libxl_pci.c ?

Chuck
On 3/11/22 3:09 AM, Jan Beulich wrote:
> On 11.03.2022 06:01, Chuck Zmudzinski wrote:
>> Here is the patch that fixes the bug on Debian sid/Xen 4.16:
>
> As to an actual patch for us to take - please see
> docs/process/sending-patches.pandoc for the formal requirements.
> (Note this was recently introduced, so you won't find it in the
> 4.16 sources. But your patch wants to be against latest staging
> anyway.)
>
> Jan

OK, I took a look at the process, and I also studied this issue more
closely. My conclusion is that I would not recommend fixing this old
bug now, until we have a better idea of how good our current tests
for the Intel IGD are.

AFAICT, if our tests for the Intel IGD result in a false positive,
then hvmloader will map three pages in the guest for the IGD
opregion, but the mapped memory would certainly not be the expected
IGD opregion if the device is not actually an IGD or GPU with an
opregion. In such a case, we would be mapping three pages of
unexpected memory into the guest.

So before proposing a patch that would fix this bug but have the
unintended consequence of allowing access to unexpected memory in the
case of a false positive detection of an Intel IGD, I will first
spend some time deciding whether a more accurate and reliable test is
needed to determine if a PCI device with class VGA and vendor Intel
actually has an IGD opregion. Once I am confident that the risk of a
false positive when testing for the Intel IGD is acceptably low, I
would consider submitting a patch that fixes this bug.

Our tests check that the PCI device has class VGA and that the vendor
is Intel, and we also check that the gfx_passthru option is enabled.
Those tests are applied both in hvmloader and in libxenlight to
decide about mapping the IGD opregion to the guest and informing Qemu
about the mapped address.

I don't think these tests for the Intel IGD account for recent
developments such as newer discrete Intel GPUs that might not have an
IGD opregion, nor do they account for older Intel IGDs/GPUs that also
might not have one.

I think some time is needed to look at the i915 Linux kernel driver
code to more precisely identify the devices that need access to the
IGD opregion. Other devices either are not compatible with the
feature of VGA passthrough or do not need access to the IGD opregion.
With this information, a patch can be developed that will more
accurately determine when the guest needs access to the IGD opregion.
With such a patch committed in Xen, I would be more comfortable
submitting a fix for this bug.

Regards,

Chuck
On 11.03.2022 21:35, Chuck Zmudzinski wrote:
> So my only question is:
>
> Is it proper to include header files from tools/firmware/hvmloader in
> tools/libs/light/libxl_pci.c ?

No, it certainly is not.

Jan
On 3/14/22 3:26 AM, Jan Beulich wrote:
> On 11.03.2022 21:35, Chuck Zmudzinski wrote:
>> So my only question is:
>>
>> Is it proper to include header files from tools/firmware/hvmloader in
>> tools/libs/light/libxl_pci.c ?
>
> No, it certainly is not.
>
> Jan

That's a relief, because if it was proper, I wouldn't know how to do
it properly. Instead, I wrote a patch that defines the macros I need
in tools/libs/light/libxl_pci.c with the same values they have in
tools/firmware/hvmloader. When you get a chance, can you take a look
at it? I cc'd the patch to you because of its reference to hvmloader.

Chuck
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -1033,7 +1033,12 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
         ret = -EPERM;
         if ( !iomem_access_permitted(current->domain, mfn, mfn_end) ||
              !iomem_access_permitted(d, mfn, mfn_end) )
+        {
+            printk(XENLOG_G_WARNING
+                   "memory_map:access not permitted: dom%d gfn=%lx mfn=%lx nr=%lx\n",
+                   d->domain_id, gfn, mfn, nr_mfns);
             break;
+        }

         ret = xsm_iomem_mapping(XSM_HOOK, d, mfn, mfn_end, add);
         if ( ret )

snip ---------------------------------------------- snip

[The full text of the original bug report follows here; it is quoted
in full in the first reply above - snip.]

Regards,

Chuck

Here is the patch that fixes the bug on Debian sid/Xen 4.16:

--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -24,6 +24,7 @@
 #define PCI_OPTIONS "msitranslate=%d,power_mgmt=%d"
 #define PCI_BDF_XSPATH "%04x-%02x-%02x-%01x"
 #define PCI_PT_QDEV_ID "pci-pt-%02x_%02x.%01x"
+#define PCI_INTEL_OPREGION 0xfc /* value defined in tools/firmware/hvmloader/pci_regs.h */

 static unsigned int pci_encode_bdf(libxl_device_pci *pci)