[v2] piix: fix regression during unplug in Xen HVM domUs

Message ID 20210317070046.17860-1-olaf@aepfle.de (mailing list archive)
State New

Commit Message

Olaf Hering March 17, 2021, 7 a.m. UTC
Commit ee358e919e385fdc79d59d0d47b4a81e349cd5c9 causes a regression in
Xen HVM domUs which run xenlinux based kernels.

If the domU has a USB device assigned, for example with
"usbdevice=['tablet']" in domU.cfg, the late unplug of devices will
kill the emulated USB host. As a result the khubd thread hangs, and
with it the entire boot process.

For some reason this does not affect pvops based kernels. This is
most likely because pvops based kernels trigger the unplug very early
during boot.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 hw/ide/piix.c        | 5 +++++
 include/hw/ide/pci.h | 1 +
 2 files changed, 6 insertions(+)

Comments

John Snow March 22, 2021, 10:09 p.m. UTC | #1
On 3/17/21 3:00 AM, Olaf Hering wrote:
> Commit ee358e919e385fdc79d59d0d47b4a81e349cd5c9 causes a regression in
> Xen HVM domUs which run xenlinux based kernels.
> 
> If the domU has a USB device assigned, for example with
> "usbdevice=['tablet']" in domU.cfg, the late unplug of devices will
> kill the emulated USB host. As a result the khubd thread hangs, and
> with it the entire boot process.
> 
> For some reason this does not affect pvops based kernels. This is
> most likely because pvops based kernels trigger the unplug very early
> during boot.
> 

I'm not entirely sure how the commit message relates to the patch,
actually. (Sorry, I am not very familiar with Xen.)

> Signed-off-by: Olaf Hering <olaf@aepfle.de>
> ---
>   hw/ide/piix.c        | 5 +++++
>   include/hw/ide/pci.h | 1 +
>   2 files changed, 6 insertions(+)
> 
> diff --git a/hw/ide/piix.c b/hw/ide/piix.c
> index b9860e35a5..7f1998bf04 100644
> --- a/hw/ide/piix.c
> +++ b/hw/ide/piix.c
> @@ -109,6 +109,9 @@ static void piix_ide_reset(DeviceState *dev)
>       uint8_t *pci_conf = pd->config;
>       int i;
>   
> +    if (d->xen_unplug_done == true) {
> +        return;
> +    }

My understanding is that XEN has some extra disks that it unplugs when 
it later figures out it doesn't need them. How exactly this works is 
something I've not looked into too closely.

So if these IDE devices have been "unplugged" already, we avoid 
resetting them here. What about this reset causes the bug you describe 
in the commit message?

Does this reset now happen earlier/later as compared to what it did 
prior to ee358e91 ?

>       for (i = 0; i < 2; i++) {
>           ide_bus_reset(&d->bus[i]);
>       }
> @@ -151,6 +154,7 @@ static void pci_piix_ide_realize(PCIDevice *dev, Error **errp)
>       PCIIDEState *d = PCI_IDE(dev);
>       uint8_t *pci_conf = dev->config;
>   
> +    d->xen_unplug_done = false;
>       pci_conf[PCI_CLASS_PROG] = 0x80; // legacy ATA mode
>   
>       bmdma_setup_bar(d);
> @@ -170,6 +174,7 @@ int pci_piix3_xen_ide_unplug(DeviceState *dev, bool aux)
>       BlockBackend *blk;
>   
>       pci_ide = PCI_IDE(dev);
> +    pci_ide->xen_unplug_done = true;
>   
>       for (i = aux ? 1 : 0; i < 4; i++) {
>           idebus = &pci_ide->bus[i / 2];
> diff --git a/include/hw/ide/pci.h b/include/hw/ide/pci.h
> index d8384e1c42..9e71cfec3b 100644
> --- a/include/hw/ide/pci.h
> +++ b/include/hw/ide/pci.h
> @@ -50,6 +50,7 @@ struct PCIIDEState {
>       IDEBus bus[2];
>       BMDMAState bmdma[2];
>       uint32_t secondary; /* used only for cmd646 */
> +    bool xen_unplug_done;

I am hesitant to put a new XEN-specific boolean here, but don't know 
enough about the problem to outright say "no".

This looks like a band-aid that's out of place, but I don't understand 
the problem well enough yet to suggest a better place.

>       MemoryRegion bmdma_bar;
>       MemoryRegion cmd_bar[2];
>       MemoryRegion data_bar[2];
> 

(If anyone else with more experience with XEN wants to take over the 
review of this patch, let me know. I only really care about the IDE bits.)

Olaf Hering March 25, 2021, 11:12 a.m. UTC | #2
On Mon, 22 Mar 2021 18:09:17 -0400,
John Snow <jsnow@redhat.com> wrote:

> My understanding is that XEN has some extra disks that it unplugs when 
> it later figures out it doesn't need them. How exactly this works is 
> something I've not looked into too closely.

It has no extra disks, why would it?

I assume each virtualization variant has some sort of unplug if it has to support guests that lack PV/virtio/enlightened/whatever drivers.

In case of HVM, the configured block or network devices can be either accessed via emulated PCI or via the PV drivers. Since the BIOS, the bootloader and potentially the operating system kernel typically lack PV drivers, they will find the devices only via the PCI bus. In case they happen to have PV drivers in addition to PCI drivers, both drivers will find and offer the same resource via different paths. In case of a block device, ata_piix.ko will show it via "/dev/sda" and xen-blkfront.ko will show it via "/dev/xvda". This is obviously bad, at least in the read-write case.

The pvops kernel triggers the unplug of the emulated PCI hardware early, prior to any other PCI initialization. As a result the PCI drivers will not find their hardware anymore. In case of ata_piix, only the non-CDROM storage will be removed in QEMU, because there is no PV-CDROM driver.

The PV support in old xenlinux based kernels is only available as modules. As a result the unplug will happen after PCI was initialized, but it must happen before any PCI device drivers are loaded.


> So if these IDE devices have been "unplugged" already, we avoid 
> resetting them here. What about this reset causes the bug you describe 
> in the commit message?
> 
> Does this reset now happen earlier/later as compared to what it did 
> prior to ee358e91 ?

Prior to this commit, piix_ide_reset was only called when the entire emulated machine was reset. Like: never.
With this commit, piix_ide_reset will be called from pci_piix3_xen_ide_unplug. For some reason this confuses the emulated USB hardware. Why it confuses it, I have no idea.

I wonder what the purpose of the qdev_reset_all() call really is. It is 10 years old. It might be stale.
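
To make the call path concrete, here is a minimal standalone model of what the patch does. It is only a sketch: ToyIDEState and the toy_* functions are hypothetical stand-ins, not QEMU code, and the only thing carried over from the patch is the xen_unplug_done flag and the order in which it is set and checked.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for PCIIDEState; only the field added by the patch is modeled. */
typedef struct ToyIDEState {
    bool xen_unplug_done;   /* set once the guest has requested the unplug */
    int bus_resets;         /* counts how often the buses were actually reset */
} ToyIDEState;

/* Mirrors the realize hunk: the flag starts out false. */
static void toy_realize(ToyIDEState *d)
{
    d->xen_unplug_done = false;
    d->bus_resets = 0;
}

/* Mirrors the reset hunk: skip the bus reset once the unplug has happened. */
static void toy_reset(ToyIDEState *d)
{
    if (d->xen_unplug_done) {
        return;
    }
    d->bus_resets++;
}

/* Mirrors the unplug hunk: set the flag first, then run the reset that the
 * unplug path triggers (the qdev_reset_all() call discussed above). */
static void toy_xen_unplug(ToyIDEState *d)
{
    d->xen_unplug_done = true;
    toy_reset(d);
}

/* Sequence: realize, one ordinary machine reset, then the Xen unplug.
 * Returns how many times the buses were really reset. */
static int toy_resets_after_unplug(void)
{
    ToyIDEState d;

    toy_realize(&d);
    toy_reset(&d);          /* ordinary machine reset: buses are reset */
    assert(d.bus_resets == 1);
    toy_xen_unplug(&d);     /* reset triggered by the unplug: skipped */
    return d.bus_resets;
}
```

In this model the reset triggered from the unplug path becomes a no-op, which is all the flag achieves. It does not explain why the reset upsets the emulated USB hardware.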


Olaf

Paolo Bonzini March 25, 2021, 4:49 p.m. UTC | #3
On 25/03/21 12:12, Olaf Hering wrote:
> On Mon, 22 Mar 2021 18:09:17 -0400,
> John Snow <jsnow@redhat.com> wrote:
> 
>> My understanding is that XEN has some extra disks that it unplugs when
>> it later figures out it doesn't need them. How exactly this works is
>> something I've not looked into too closely.
> 
> It has no extra disks, why would it?
> 
> I assume each virtualization variant has some sort of unplug if it has to support guests that lack PV/virtio/enlightened/whatever drivers.

No, it's Xen only and really should be legacy.  Ideally one would just 
have devices supported at all levels from firmware to kernel.

>> So if these IDE devices have been "unplugged" already, we avoid
>> resetting them here. What about this reset causes the bug you describe
>> in the commit message?
>>
>> Does this reset now happen earlier/later as compared to what it did
>> prior to ee358e91 ?
> 
> Prior to this commit, piix_ide_reset was only called when the entire
> emulated machine was reset. Like: never. With this commit,
> piix_ide_reset will be called from pci_piix3_xen_ide_unplug. For some
> reason this confuses the emulated USB hardware. Why it confuses
> it, I have no idea.

> I wonder what the purpose of the qdev_reset_all() call really is. It
> is 10 years old. It might be stale.

piix_ide_reset is only calling ide_bus_reset, and from there ide_reset 
and bmdma_reset.  All of these functions do just two things: reset 
internal registers and ensure pending I/O is completed or canceled.  The 
latter is indeed unnecessary; drain/flush/detach is already done before 
the call to qdev_reset_all.
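
As a rough illustration of the two jobs a reset does, and of why the I/O part is redundant after a drain, consider the standalone sketch below. ToyBus and the toy_* helpers are hypothetical and do not correspond to the real ide_bus_reset, ide_reset or bmdma_reset implementations.

```c
#include <assert.h>
#include <string.h>

/* Toy bus: a few "registers" plus a count of in-flight requests. */
typedef struct ToyBus {
    unsigned char regs[8];  /* stand-in for internal registers */
    int pending_io;         /* stand-in for pending I/O */
} ToyBus;

/* Job 1 of a reset: complete or cancel whatever is still in flight.
 * Returns how many requests had to be cancelled. */
static int toy_cancel_pending(ToyBus *bus)
{
    int cancelled = bus->pending_io;

    bus->pending_io = 0;
    return cancelled;
}

/* Job 2 of a reset: put the internal registers back to their initial state. */
static void toy_clear_regs(ToyBus *bus)
{
    memset(bus->regs, 0, sizeof(bus->regs));
}

/* The reset does both jobs and reports how much I/O it had to cancel. */
static int toy_bus_reset(ToyBus *bus)
{
    int cancelled = toy_cancel_pending(bus);

    toy_clear_regs(bus);
    return cancelled;
}

/* Stand-in for the drain/flush/detach done before the reset is requested. */
static void toy_drain(ToyBus *bus)
{
    bus->pending_io = 0;
}

/* Drain first, then reset: the reset finds nothing left to cancel and
 * only the register part has any effect. */
static int toy_cancelled_after_drain(void)
{
    ToyBus bus = { .regs = { 0x50, 0x01 }, .pending_io = 3 };
    int cancelled;

    toy_drain(&bus);
    cancelled = toy_bus_reset(&bus);
    assert(bus.regs[0] == 0 && bus.regs[1] == 0);
    return cancelled;
}
```

If the real code behaves like this model, the only observable effect of the reset in the unplug path is the register reset.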

But the fact that it breaks USB is weird.  That's the part that needs to 
be debugged, because changing IDE to unbreak USB needs an explanation 
even if it's the right thing to do.

If you don't want to debug it, removing the qdev_reset_all call might do 
the job; you'll have to see what the Xen maintainers think of it.  But 
if you don't debug the USB issue now, it will come back later almost surely.

Paolo

Patch

diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index b9860e35a5..7f1998bf04 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -109,6 +109,9 @@  static void piix_ide_reset(DeviceState *dev)
     uint8_t *pci_conf = pd->config;
     int i;
 
+    if (d->xen_unplug_done == true) {
+        return;
+    }
     for (i = 0; i < 2; i++) {
         ide_bus_reset(&d->bus[i]);
     }
@@ -151,6 +154,7 @@  static void pci_piix_ide_realize(PCIDevice *dev, Error **errp)
     PCIIDEState *d = PCI_IDE(dev);
     uint8_t *pci_conf = dev->config;
 
+    d->xen_unplug_done = false;
     pci_conf[PCI_CLASS_PROG] = 0x80; // legacy ATA mode
 
     bmdma_setup_bar(d);
@@ -170,6 +174,7 @@  int pci_piix3_xen_ide_unplug(DeviceState *dev, bool aux)
     BlockBackend *blk;
 
     pci_ide = PCI_IDE(dev);
+    pci_ide->xen_unplug_done = true;
 
     for (i = aux ? 1 : 0; i < 4; i++) {
         idebus = &pci_ide->bus[i / 2];
diff --git a/include/hw/ide/pci.h b/include/hw/ide/pci.h
index d8384e1c42..9e71cfec3b 100644
--- a/include/hw/ide/pci.h
+++ b/include/hw/ide/pci.h
@@ -50,6 +50,7 @@  struct PCIIDEState {
     IDEBus bus[2];
     BMDMAState bmdma[2];
     uint32_t secondary; /* used only for cmd646 */
+    bool xen_unplug_done;
     MemoryRegion bmdma_bar;
     MemoryRegion cmd_bar[2];
     MemoryRegion data_bar[2];