[XEN,v2,0/3] Configure qemu upstream correctly by default for igd-passthru

Message ID: cover.1673300848.git.brchuckz@aol.com

Message

Chuck Zmudzinski Jan. 10, 2023, 7:32 a.m. UTC
Sorry for the length of this cover letter but it is helpful to put all
the pros and cons of the two different approaches to solving the problem
of configuring the Intel IGD with qemu upstream and libxl in one place,
which I attempt to do here. Of course the other approach involves a
patch to qemu [1] instead of using this patch series for libxl.

The quick answer:

I think the other patch to qemu is the better option, but I would be OK
if you use this patch series instead.

Details with my reasons for preferring the other patch to qemu over this
patch series to libxl: 

I call attention to the commit message of the first patch which points
out that using the "pc" machine and adding the xen platform device on
the qemu upstream command line is not functionally equivalent to using
the "xenfv" machine which automatically adds the xen platform device
earlier in the guest creation process. As a result, there is a noticeable
reduction in the performance of the guest during startup with the "pc"
machine type even if the xen platform device is added via the qemu
command line options, although eventually both Linux and Windows guests
perform equally well once the guest operating system is fully loaded.

Specifically, startup time is longer and neither the grub vga drivers
nor the windows vga drivers in early startup perform as well when the
xen platform device is added via the qemu command line instead of being
added immediately after the other emulated i440fx pci devices when the
"xenfv" machine type is used.

For example, when using the "pc" machine, which adds the xen platform
device using a command line option, the Linux guest could not display
the grub boot menu at the native resolution of the monitor, but with the
"xenfv" machine, the grub menu is displayed at the full 1920x1080
native resolution of the monitor for testing. So improved startup
performance is an advantage for the patch for qemu.

I also call attention to the last point of the commit message of the
second patch and the comments for reviewers section of the second patch.
This approach, as opposed to fixing this in qemu upstream, makes
maintaining the code in libxl__build_device_model_args_new more
difficult and therefore increases the chances of problems caused by
coding errors and typos for users of libxl. So that is another advantage
of the patch for qemu.

OTOH, fixing this in qemu causes newer qemu versions to behave
differently than previous versions of qemu, which the qemu community
does not like. They seem OK with the other patch since it only affects
the qemu "xenfv" machine type, but they do not want the patch to affect
toolstacks like libvirt that do not use qemu upstream's
autoconfiguration options as much as libxl does. Of course, libvirt
can manage qemu "xenfv" machines, so existing "xenfv" guests configured
manually by libvirt could be adversely affected by the patch to qemu,
but only if those same guests are also configured for igd-passthrough,
which is likely a very small number of libvirt users of qemu.

A year or two ago I tried to configure guests for pci passthrough on xen
using libvirt's tool to convert a libxl xl.cfg file to libvirt xml. It
could not convert an xl.cfg file with a configuration item
pci = [ "PCI_SPEC_STRING", "PCI_SPEC_STRING", ...] for pci passthrough.
So it is unlikely there are any users out there using libvirt to
configure xen hvm guests for igd passthrough on xen, and those are the
only users that could be adversely affected by the simpler patch to qemu
to fix this.

The only advantage of this patch series over the qemu patch is that
this patch series does not need any patches to qemu to make Intel IGD
configuration easier with libxl so the risk of affecting other qemu
users is entirely eliminated if we use this patch instead of patching
qemu. The cost of patching libxl instead of qemu is reduced startup
performance compared to what could be achieved by patching qemu instead
and an increased risk that the tedious process of manually managing the
slot addresses of all the emulated devices will make it more difficult
to keep the libxl code free of bugs.

I will leave it to the maintainer of the code in both qemu and libxl
(Anthony) to decide which, if any, of the patches to apply. I am OK with
either this patch series to libxl or the proposed patch to qemu to fix
this problem, but I do recommend the other patch to qemu over this patch
series because of the improved performance during startup with that
patch and the relatively low risk that any libvirt users will be
adversely affected by that patch.

Brief statement of the problem this patch series solves:

Currently only the qemu traditional device model reserves slot 2 for the
Intel Integrated Graphics Device (IGD) with default settings. Assigning
the Intel IGD to slot 2 is necessary for the device to operate properly
when passed through as the primary graphics adapter. The qemu
traditional device model takes care of this by reserving slot 2 for the
Intel IGD, but the upstream qemu device model currently does not reserve
slot 2 for the Intel IGD.

This patch series modifies libxl so the upstream qemu device model will
also, with default settings, assign slot 2 for the Intel IGD.

There are three reasons why it is difficult to configure the guest
so the Intel IGD is assigned to slot 2 in the guest using libxl and the
upstream device model, so the patch series is organized as three
separate patches, each of which resolves one of the three problems.

What each of the three libxl patches does:

1. With the default "xenfv" machine type, qemu upstream is hard-coded
   to assign the xen platform device to slot 2. The first patch fixes
   that by using the "pc" machine instead when gfx_passthru type is igd
   and, if xen_platform_pci is set in the guest config, libxl now assigns
   the xen platform device to slot 3, making it possible to assign the
   IGD to slot 2. The patch only affects guests with the gfx_passthru
   option enabled. The default behavior (xen_platform_pci is enabled
   but gfx_passthru option is disabled) of using the "xenfv" machine
   type is preserved. Another way to describe what the patch does is
   to say that it adds a second exception to the default choice of the
   "xenfv" machine type, with the first exception being that the "pc"
   machine type is also used instead of "xenfv" if the xen platform pci
   device is disabled in the guest xl.cfg file.

2. Currently, with libxl and qemu upstream, most emulated pci devices
   are by default automatically assigned a pci slot, and the emulated
   ones are assigned before the passed through ones, which means that
   even if libxl is patched so the xen platform device will not be
   assigned to slot 2, any other emulated device will be assigned slot 2
   unless libxl explicitly assigns the slot address of each emulated pci
   device in such a way that the IGD will be assigned slot 2. The second
   patch fixes this by hard coding the slot assignment for the emulated
   devices instead of deferring to qemu upstream's auto-assignment which
   does not do what is necessary to configure the Intel IGD correctly.
   With the second patch applied, it is possible to configure the Intel
   IGD correctly by using the @VSLOT parameter in xl.cfg to specify the
   slot address of each passed through pci device in the guest. The
   second patch is also designed to not change the default behavior of
   letting qemu autoconfigure the pci slot addresses when igd
   gfx_passthru is disabled in xl.cfg.

3. For convenience, the third patch automatically assigns slot 2 to the
   Intel IGD when the gfx_passthru type is igd so with the third patch
   applied it is not necessary to set the @VSLOT parameter to configure
   the Intel IGD correctly.
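
The machine-type and slot decisions described in the three patches can be
summarized in a short C sketch. This is only an illustration under
simplifying assumptions, not the code in libxl_dm.c: struct guest_cfg and
the functions choose_machine(), xen_platform_slot() and igd_slot() are
hypothetical names invented for this example, and the return value -1 is an
arbitrary marker meaning "device not present".

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the xl.cfg settings that matter here. */
struct guest_cfg {
    bool xen_platform_pci;  /* xen_platform_pci (enabled by default) */
    bool gfx_passthru_igd;  /* gfx_passthru enabled with type igd */
};

/*
 * Patch 1: keep "xenfv" by default; use "pc" if the xen platform device is
 * disabled (the existing exception) or if igd gfx_passthru is enabled
 * (the new exception).
 */
static const char *choose_machine(const struct guest_cfg *cfg)
{
    if (!cfg->xen_platform_pci || cfg->gfx_passthru_igd)
        return "pc";
    return "xenfv";
}

/*
 * Slot of the xen platform device: -1 if it is not added, 2 where the
 * "xenfv" machine hard-codes it, 3 where patch 1 moves it out of the
 * way of the IGD.
 */
static int xen_platform_slot(const struct guest_cfg *cfg)
{
    if (!cfg->xen_platform_pci)
        return -1;
    return cfg->gfx_passthru_igd ? 3 : 2;
}

/* Patch 3: the IGD gets slot 2 by default; -1 means no IGD passthrough. */
static int igd_slot(const struct guest_cfg *cfg)
{
    return cfg->gfx_passthru_igd ? 2 : -1;
}
```

For a guest with xen_platform_pci enabled and igd gfx_passthru enabled, the
sketch yields the "pc" machine, the xen platform device in slot 3, and the
IGD in slot 2; with gfx_passthru disabled it keeps the "xenfv" defaults.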

Testing:

I tested a system with Intel IGD passthrough and two other pci devices
passed through, with and without the xen platform device. I also did
tests on guests without any pci passthrough configured. In all cases
tested, libxl behaved as expected. For example, the device model
arguments are only changed if gfx_passthru is set for the IGD, libxl
respected administrator settings such as @VSLOT and xen_platform_pci
with the patch series applied, and not adding the xen platform device to
the guest caused reduced performance because in that case the guest
could not take advantage of the improvements offered by the Xen PV
drivers in the guest. I tested the following emulated devices on my
setup: xen-platform, e1000, and VGA. I also verified the device that is
added by the "hdtype = 'ahci'" xl.cfg option is configured correctly
with the patch applied. I did not test all 12 devices that could be
affected by patch 2 of the series. These include the intel-hda high
definition audio device, a virtio-serial device, etc. One can look
at the second patch for the full list of qemu emulated devices whose
behavior is affected by the second patch of the series when the guest
is configured for igd gfx_passthru. These devices are also subject
to mistakes in the patch not discovered by the compiler, as mentioned
in the comments for reviewers section of the second patch. 

[1] https://lore.kernel.org/qemu-devel/a09d2427397621eaecee4c46b33507a99cc5f161.1673334040.git.brchuckz@aol.com/

v2: correct the link to the qemu patch - the link in v1 was to an
    incorrect version of the patch

Chuck Zmudzinski (3):
  libxl/dm: Use "pc" machine type for Intel IGD passthrough
  libxl/dm: Manage pci slot assignment for Intel IGD passthrough
  libxl/dm: Assign slot 2 by default for Intel IGD passthrough

 tools/libs/light/libxl_dm.c | 227 +++++++++++++++++++++++++++++-------
 1 file changed, 183 insertions(+), 44 deletions(-)

Comments

Anthony PERARD Jan. 25, 2023, 11:37 a.m. UTC | #1
On Tue, Jan 10, 2023 at 02:32:01AM -0500, Chuck Zmudzinski wrote:
> I call attention to the commit message of the first patch which points
> out that using the "pc" machine and adding the xen platform device on
> the qemu upstream command line is not functionally equivalent to using
> the "xenfv" machine which automatically adds the xen platform device
> earlier in the guest creation process. As a result, there is a noticeable
> reduction in the performance of the guest during startup with the "pc"
> machine type even if the xen platform device is added via the qemu
> command line options, although eventually both Linux and Windows guests
> perform equally well once the guest operating system is fully loaded.

There shouldn't be a difference between "xenfv" machine or using the
"pc" machine while adding the "xen-platform" device, at least with
regards to access to disk or network.

The first patch of the series is using the "pc" machine without any
"xen-platform" device, so we can't compare startup performance based on
that.

> Specifically, startup time is longer and neither the grub vga drivers
> nor the windows vga drivers in early startup perform as well when the
> xen platform device is added via the qemu command line instead of being
> added immediately after the other emulated i440fx pci devices when the
> "xenfv" machine type is used.

The "xen-platform" device is mostly a hint to a guest that they can use
pv-disk and pv-network devices. I don't think it would change anything
with regards to graphics.

> For example, when using the "pc" machine, which adds the xen platform
> device using a command line option, the Linux guest could not display
> the grub boot menu at the native resolution of the monitor, but with the
> "xenfv" machine, the grub menu is displayed at the full 1920x1080
> native resolution of the monitor for testing. So improved startup
> performance is an advantage for the patch for qemu.

I've just found out that when doing IGD passthrough, the "xenfv" and
"pc" machines are much more different than I thought ... :-(
pc_xen_hvm_init_pci() in QEMU changes the pci-host device, which in
turn copies some information from the real host bridge.
I guess this new host bridge helps when the firmware sets up the
graphics for grub.

> I also call attention to the last point of the commit message of the
> second patch and the comments for reviewers section of the second patch.
> This approach, as opposed to fixing this in qemu upstream, makes
> maintaining the code in libxl__build_device_model_args_new more
> difficult and therefore increases the chances of problems caused by
> coding errors and typos for users of libxl. So that is another advantage
> of the patch for qemu.

We would just need to use a different approach in libxl when generating
the command line. We could probably avoid duplication. I was hoping to
have a patch series for libxl that would change the machine used, to start
using "pc" instead of "xenfv" for all configurations, but based on the
point above (IGD-specific change to "xenfv"), I guess we can't
really do anything from libxl to fix IGD passthrough.

> OTOH, fixing this in qemu causes newer qemu versions to behave
> differently than previous versions of qemu, which the qemu community
> does not like, although they seem OK with the other patch since it only
> affects qemu "xenfv" machine types, but they do not want the patch to
> affect toolstacks like libvirt that do not use qemu upstream's
> autoconfiguration options as much as libxl does, and, of course, libvirt
> can manage qemu "xenfv" machines so existing "xenfv" guests configured
> manually by libvirt could be adversely affected by the patch to qemu,
> but only if those same guests are also configured for igd-passthrough,
> which is likely a very small number of possibly affected libvirt users
> of qemu.
> 
> A year or two ago I tried to configure guests for pci passthrough on xen
> using libvirt's tool to convert a libxl xl.cfg file to libvirt xml. It
> could not convert an xl.cfg file with a configuration item
> pci = [ "PCI_SPEC_STRING", "PCI_SPEC_STRING", ...] for pci passthrough.
> So it is unlikely there are any users out there using libvirt to
> configure xen hvm guests for igd passthrough on xen, and those are the
> only users that could be adversely affected by the simpler patch to qemu
> to fix this.

FYI, libvirt should be using libxl to create guests; I don't think there
is another way for libvirt to create xen guests.



So overall, unfortunately the "pc" machine in QEMU isn't suitable for
IGD passthrough, as the "xenfv" machine already has some workarounds to
make IGD work and just needs some more.

I've seen that the patch for QEMU is now reviewed, so I will look at
having it merged soonish.

Thanks,
Chuck Zmudzinski Jan. 25, 2023, 8:20 p.m. UTC | #2
On 1/25/2023 6:37 AM, Anthony PERARD wrote:
> FYI, libvirt should be using libxl to create guests; I don't think there
> is another way for libvirt to create xen guests.

I have had success using libvirt as a frontend to libxl for most of my xen guests,
except for HVM guests that have pci devices passed through because the
tool to convert an xl.cfg file to libvirt xml was not able to convert the
pci = ... line in xl.cfg. Perhaps newer versions of libvirt can do it (I haven't
tried it since at least a couple of years ago with an older version of libvirt).

>
>
>
> So overall, unfortunately the "pc" machine in QEMU isn't suitable for
> IGD passthrough, as the "xenfv" machine already has some workarounds to
> make IGD work and just needs some more.
>
> I've seen that the patch for QEMU is now reviewed, so I will look at
> having it merged soonish.

Hi Anthony,

Thanks for looking at this and for also looking at the Qemu patch
to fix this. As I said earlier, I think to fix this problem for the IGD,
the qemu patch is probably better than this patch to libxl.

Regarding the rest of your comments, I think the Xen developers
need to decide what the roadmap for the future development of
Xen HVM machines on x86 is before deciding on any further
changes. I have not noticed much development in this feature
in the past few years, except for Bernhard Beschow who has been
doing some work to make the piix3 stuff more maintainable in
Qemu upstream. When that is done, it might be an opportunity to do
some work improving the "xenfv" machine in Qemu upstream.
The "pc" machine type is of course a very old machine type
to still be using as the device model for modern systems.

I noticed about four or five years ago there was a patch set
proposed to use "q35" instead of "pc" for Xen HVM guests and
Qemu upstream, but there did not seem to be any agreement
about the best way to implement that change, with some saying
more of it should be implemented outside of Qemu and by libxl
or maybe hvmloader instead. If anyone can describe whether there is a
roadmap for the future of Xen HVM on x86, that would be helpful.
Thanks,

Chuck
Chuck Zmudzinski Jan. 25, 2023, 11:19 p.m. UTC | #3
On 1/25/2023 6:37 AM, Anthony PERARD wrote:
> I've just found out that when doing IGD passthrough, the "xenfv" and
> "pc" machines are much more different than I thought ... :-(
> pc_xen_hvm_init_pci() in QEMU changes the pci-host device, which in
> turn copies some information from the real host bridge.
> I guess this new host bridge helps when the firmware sets up the
> graphics for grub.

I am surprised it works at all with the "pc" machine, that is, without the
TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE that is used in the "xenfv"
machine. This only seems to affect the legacy grub vga driver and the legacy
Windows vga driver during early boot. Still, I much prefer keeping the "xenfv"
machine for Intel IGD over this workaround of patching libxl to use the "pc"
machine.

> We would just need to use a different approach in libxl when generating
> the command line. We could probably avoid duplication. I was hoping to
> have a patch series for libxl that would change the machine used, to start
> using "pc" instead of "xenfv" for all configurations, but based on the
> point above (IGD-specific change to "xenfv"), I guess we can't
> really do anything from libxl to fix IGD passthrough.

We could switch to the "pc" machine, but we would need to patch
qemu also so the "pc" machine uses the special device the "xenfv"
machine uses (TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE).
So it is simpler to just use the other patch to qemu and not patch
libxl at all to fix this.

Chuck Zmudzinski Jan. 30, 2023, 12:38 a.m. UTC | #4
On 1/25/23 6:19 PM, Chuck Zmudzinski wrote:
> On 1/25/2023 6:37 AM, Anthony PERARD wrote:
>> On Tue, Jan 10, 2023 at 02:32:01AM -0500, Chuck Zmudzinski wrote:
>> I've just found out that when doing IGD passthrough, the "xenfv" and
>> "pc" machines are much more different than I thought ... :-(
>> pc_xen_hvm_init_pci() in QEMU changes the pci-host device, which in
>> turn copies some information from the real host bridge.
>> I guess this new host bridge helps when the firmware sets up the
>> graphics for grub.

Yes, it is needed - see below for the very simple patch to Qemu
upstream that fixes it for the "pc" machine!

> 
> I am surprised it works at all with the "pc" machine, that is, without the
> TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE that is used in the "xenfv"
> machine. This only seems to affect the legacy grub vga driver and the legacy
> Windows vga driver during early boot. Still, I much prefer keeping the "xenfv"
> machine for Intel IGD over this workaround of patching libxl to use the "pc"
> machine.
> 
>>
>> > I also call attention to the last point of the commit message of the
>> > second patch and the comments for reviewers section of the second patch.
>> > This approach, as opposed to fixing this in qemu upstream, makes
>> > maintaining the code in libxl__build_device_model_args_new more
>> > difficult and therefore increases the chances of problems caused by
>> > coding errors and typos for users of libxl. So that is another advantage
>> > of the patch for qemu.
>>
>> We would just need to use a different approach in libxl when generating
>> the command line. We could probably avoid duplication.

I was thinking we could also either write a test to verify the correctness
of the second patch to ensure it generates the correct Qemu command line
or take the time to verify the second patch's accuracy before committing it.
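
A check along these lines could be sketched as follows. It is only a rough
illustration, not part of the patch series: device_arg_slot() and
count_devices_in_slot() are hypothetical helpers, the strings passed to them
stand for the values that follow each -device option on the generated
command line, and the parsing assumes that the slot in an addr= property is
written as a hexadecimal slot[.function] value.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the PCI slot named by an "addr=" property in a -device value,
 * or -1 if the value has no such property. Only an "addr=" that starts
 * the string or follows a comma is considered, so a property whose name
 * merely ends in "addr" is skipped.
 */
static long device_arg_slot(const char *arg)
{
    const char *p = arg;

    while ((p = strstr(p, "addr=")) != NULL) {
        if (p == arg || p[-1] == ',') {
            char *end;
            unsigned long slot = strtoul(p + 5, &end, 16);

            /* No digits after "addr=": treat it as "no slot given". */
            return end != p + 5 ? (long)slot : -1;
        }
        p += 5;
    }
    return -1;
}

/* Count how many -device values explicitly request the given slot. */
static int count_devices_in_slot(const char *const *args, size_t n, long slot)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        if (device_arg_slot(args[i]) == slot)
            count++;
    }
    return count;
}
```

A test built on this would collect the -device values libxl generates for a
guest configured for igd gfx_passthru and require that
count_devices_in_slot(args, n, 2) is 0, that is, that no emulated device was
given slot 2.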

>> I was hoping to
>> have a patch series for libxl that would change the machine used, to start
>> using "pc" instead of "xenfv" for all configurations, but based on the
>> point above (IGD-specific change to "xenfv"), I guess we can't
>> really do anything from libxl to fix IGD passthrough.
> 
> We could switch to the "pc" machine, but we would need to patch
> qemu also so the "pc" machine uses the special device the "xenfv"
> machine uses (TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE).
> ...

I just tested a very simple patch to Qemu upstream to fix the
difference between the upstream Qemu "pc" machine and the upstream
Qemu "xenfv" machine:

--- a/hw/i386/pc_piix.c	2023-01-28 13:22:15.714595514 -0500
+++ b/hw/i386/pc_piix.c	2023-01-29 18:08:34.668491593 -0500
@@ -434,6 +434,8 @@
             compat(machine); \
         } \
         pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, \
+                 xen_igd_gfx_pt_enabled() ? \
+                 TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE : \
                  TYPE_I440FX_PCI_DEVICE); \
     } \
     DEFINE_PC_MACHINE(suffix, name, pc_init_##suffix, optionfn)
----- snip -------

With this simple two-line patch to upstream Qemu, we can use the "pc"
machine without any regressions such as the startup performance
degradation I observed without this small patch to fix the "pc" machine
with igd passthru!

The "pc" machine maintainers for upstream Qemu would need to accept
this small patch to Qemu upstream. They might prefer this to the
other Qemu patch that reserves slot 2 for the Qemu upstream "xenfv"
machine when the guest is configured for igd passthru.

>>
>> ...
>>
>> So overall, unfortunately the "pc" machine in QEMU isn't suitable for
>> IGD passthrough, as the "xenfv" machine already has some workarounds to
>> make IGD work and just needs some more.

Well, the little patch to upstream shown above fixes the "pc" machine
with IGD so maybe this approach of patching libxl to use the "pc" machine
will be a viable fix for the IGD.

>>
>> I've seen that the patch for QEMU is now reviewed, so I'll look at having
>> it merged soonish.
>>
>> Thanks,
>>
> 

I just added the bit of information above to help you decide which
approach to use to improve the support for the igd passthru feature
with Xen and Qemu upstream. I think the test of the small patch to
Qemu to fix the "pc" machine with igd passthru makes this patch to
libxl a viable alternative to the other patch to Qemu upstream that
reserves slot 2 when using the "xenfv" machine and igd passthru.

Thanks,

Chuck
Chuck Zmudzinski Jan. 31, 2023, 7:35 p.m. UTC | #5
On 1/29/23 7:38 PM, Chuck Zmudzinski wrote:
> On 1/25/23 6:19 PM, Chuck Zmudzinski wrote:
>> On 1/25/2023 6:37 AM, Anthony PERARD wrote:
>>> On Tue, Jan 10, 2023 at 02:32:01AM -0500, Chuck Zmudzinski wrote:
>>> > I call attention to the commit message of the first patch which points
>>> > out that using the "pc" machine and adding the xen platform device on
>>> > the qemu upstream command line is not functionally equivalent to using
>>> > the "xenfv" machine which automatically adds the xen platform device
>>> > earlier in the guest creation process. As a result, there is a noticeable
>>> > reduction in the performance of the guest during startup with the "pc"
>>> > machine type even if the xen platform device is added via the qemu
>>> > command line options, although eventually both Linux and Windows guests
>>> > perform equally well once the guest operating system is fully loaded.
>>>
>>> There shouldn't be a difference between "xenfv" machine or using the
>>> "pc" machine while adding the "xen-platform" device, at least with
>>> regards to access to disk or network.
>>>
>>> The first patch of the series is using the "pc" machine without any
>>> "xen-platform" device, so we can't compare startup performance based on
>>> that.
>>>
>>> > Specifically, startup time is longer and neither the grub vga drivers
>>> > nor the windows vga drivers in early startup perform as well when the
>>> > xen platform device is added via the qemu command line instead of being
>>> > added immediately after the other emulated i440fx pci devices when the
>>> > "xenfv" machine type is used.
>>>
>>> The "xen-platform" device is mostly a hint to a guest that they can use
>>> pv-disk and pv-network devices. I don't think it would change anything
>>> with regards to graphics.
>>>
>>> > For example, when using the "pc" machine, which adds the xen platform
>>> > device using a command line option, the Linux guest could not display
>>> > the grub boot menu at the native resolution of the monitor, but with the
>>> > "xenfv" machine, the grub menu is displayed at the full 1920x1080
>>> > native resolution of the monitor for testing. So improved startup
>>> > performance is an advantage for the patch for qemu.
>>>
>>> I've just found out that when doing IGD passthrough, the "xenfv" and
>>> "pc" machines are much more different than I thought ... :-(
>>> pc_xen_hvm_init_pci() in QEMU changes the pci-host device, which in
>>> turn copies some information from the real host bridge.
>>> I guess this new host bridge helps when the firmware sets up the
>>> graphics for grub.
> 
> Yes, it is needed - see below for the very simple patch to Qemu
> upstream that fixes it for the "pc" machine!
> 
>> 
>> I am surprised it works at all with the "pc" machine, that is, without the
>> TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE that is used in the "xenfv"
>> machine. This only seems to affect the legacy grub vga driver and the legacy
>> Windows vga driver during early boot. Still, I much prefer keeping the "xenfv"
>> machine for Intel IGD over this workaround of patching libxl to use the "pc"
>> machine.
>> 
>>>
>>> > I also call attention to the last point of the commit message of the
>>> > second patch and the comments for reviewers section of the second patch.
>>> > This approach, as opposed to fixing this in qemu upstream, makes
>>> > maintaining the code in libxl__build_device_model_args_new more
>>> > difficult and therefore increases the chances of problems caused by
>>> > coding errors and typos for users of libxl. So that is another advantage
>>> > of the patch for qemu.
>>>
>>> We would just need to use a different approach in libxl when generating
>>> the command line. We could probably avoid duplications.
> 
> I was thinking we could also either write a test to verify the correctness
> of the second patch to ensure it generates the correct Qemu command line
> or take the time to verify the second patch's accuracy before committing it.
> 
>>> I was hoping to
>>> have patch series for libxl that would change the machine used to start
>>> using "pc" instead of "xenfv" for all configurations, but based on the
>>> point above (IGD specific change to "xenfv"), then I guess we can't
>>> really do anything from libxl to fix IGD passthrough.
>> 
>> We could switch to the "pc" machine, but we would need to patch
>> qemu also so the "pc" machine uses the special device the "xenfv"
>> machine uses (TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE).
>> ...
> 
> I just tested a very simple patch to Qemu upstream to fix the
> difference between the upstream Qemu "pc" machine and the upstream
> Qemu "xenfv" machine:
> 
> --- a/hw/i386/pc_piix.c	2023-01-28 13:22:15.714595514 -0500
> +++ b/hw/i386/pc_piix.c	2023-01-29 18:08:34.668491593 -0500
> @@ -434,6 +434,8 @@
>              compat(machine); \
>          } \
>          pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, \
> +                 xen_igd_gfx_pt_enabled() ? \
> +                 TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE : \
>                   TYPE_I440FX_PCI_DEVICE); \
>      } \
>      DEFINE_PC_MACHINE(suffix, name, pc_init_##suffix, optionfn)
> ----- snip -------
> 
> With this simple two-line patch to upstream Qemu, we can use the "pc"
> machine without any regressions such as the startup performance
> degradation I observed without this small patch to fix the "pc" machine
> with igd passthru!

Hi Anthony,

Actually, to implement the fix for the "pc" machine and IGD in Qemu
upstream and not break builds for configurations such as --disable-xen
the patch to Qemu needs to add four lines instead of two (still trivial!):


--- a/hw/i386/pc_piix.c	2023-01-29 18:05:15.714595514 -0500
+++ b/hw/i386/pc_piix.c	2023-01-29 18:08:34.668491593 -0500
@@ -434,6 +434,8 @@
             compat(machine); \
         } \
         pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, \
+                 pc_xen_igd_gfx_pt_enabled() ? \
+                 TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE : \
                  TYPE_I440FX_PCI_DEVICE); \
     } \
     DEFINE_PC_MACHINE(suffix, name, pc_init_##suffix, optionfn)
--- a/include/sysemu/xen.h	2023-01-20 08:17:55.000000000 -0500
+++ b/include/sysemu/xen.h	2023-01-30 00:18:29.276886734 -0500
@@ -23,6 +23,7 @@
 extern bool xen_allowed;
 
 #define xen_enabled()           (xen_allowed)
+#define pc_xen_igd_gfx_pt_enabled()    xen_igd_gfx_pt_enabled()
 
 #ifndef CONFIG_USER_ONLY
 void xen_hvm_modified_memory(ram_addr_t start, ram_addr_t length);
@@ -33,6 +34,7 @@
 #else /* !CONFIG_XEN_IS_POSSIBLE */
 
 #define xen_enabled() 0
+#define pc_xen_igd_gfx_pt_enabled() 0
 #ifndef CONFIG_USER_ONLY
 static inline void xen_hvm_modified_memory(ram_addr_t start, ram_addr_t length)
 {
------- snip -------
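
To show why the extra macro keeps --disable-xen builds working, here is a
minimal standalone sketch of the same compile-time stub pattern. It is not
QEMU code: the two type-name strings, the sketch_igd_pt flag and the helper
pc_pci_device_type() are stand-ins I made up for illustration. Only the
macro names and the ternary come from the patch above.

```c
#include <stdbool.h>

/* Stand-ins for QEMU's type-name macros; the string values are illustrative. */
#define TYPE_I440FX_PCI_DEVICE                 "i440FX"
#define TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE "igd-passthrough-i440FX"

#ifdef CONFIG_XEN_IS_POSSIBLE
/* Xen-capable build: forward to the runtime check (stubbed here as a flag). */
static bool sketch_igd_pt;  /* hypothetical stand-in for QEMU's real state */
static bool xen_igd_gfx_pt_enabled(void) { return sketch_igd_pt; }
#define pc_xen_igd_gfx_pt_enabled() xen_igd_gfx_pt_enabled()
#else
/* Build without Xen: the macro is a constant 0, so no Xen symbol is
 * referenced and the ternary below always picks the plain i440FX device. */
#define pc_xen_igd_gfx_pt_enabled() 0
#endif

/* Hypothetical helper mirroring the ternary passed to pc_init1(). */
static const char *pc_pci_device_type(void)
{
    return pc_xen_igd_gfx_pt_enabled()
        ? TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE
        : TYPE_I440FX_PCI_DEVICE;
}
```

Compiled without -DCONFIG_XEN_IS_POSSIBLE, the helper can only return the
plain device type, which is the behaviour a --disable-xen build needs.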

Would you support this patch to Qemu if I formally submitted it to
Qemu as a replacement for the current more complicated patch to Qemu
that I proposed to reserve slot 2 for the IGD?

Thanks,

Chuck

> 
> The "pc" machine maintainers for upstream Qemu would need to accept
> this small patch to Qemu upstream. They might prefer this to the
> other Qemu patch that reserves slot 2 for the Qemu upstream "xenfv"
> machine when the guest is configured for igd passthru.
> 
>>>
>>> ...
>>>
>>> So overall, unfortunately the "pc" machine in QEMU isn't suitable for
>>> IGD passthrough, as the "xenfv" machine already has some workarounds to
>>> make IGD work and just needs some more.
> 
> Well, the little patch to upstream shown above fixes the "pc" machine
> with IGD so maybe this approach of patching libxl to use the "pc" machine
> will be a viable fix for the IGD.
> 
>>>
>>> I've seen that the patch for QEMU is now reviewed, so I'll look at having
>>> it merged soonish.
>>>
>>> Thanks,
>>>
>> 
> 
> I just added the bit of information above to help you decide which
> approach to use to improve the support for the igd passthru feature
> with Xen and Qemu upstream. I think the test of the small patch to
> Qemu to fix the "pc" machine with igd passthru makes this patch to
> libxl a viable alternative to the other patch to Qemu upstream that
> reserves slot 2 when using the "xenfv" machine and igd passthru.
> 
> Thanks,
> 
> Chuck