[v2] piix: fix regression during unplug in Xen HVM domUs

Message ID	20210317070046.17860-1-olaf@aepfle.de (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=alo8=IP=lists.xenproject.org=xen-devel-bounces@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 733F864F01 Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" <xen-devel-bounces@lists.xenproject.org> From: Olaf Hering <olaf@aepfle.de> To: xen-devel@lists.xenproject.org, qemu-block@nongnu.org, qemu-devel@nongnu.org Cc: Olaf Hering <olaf@aepfle.de>, John Snow <jsnow@redhat.com> Subject: [PATCH v2] piix: fix regression during unplug in Xen HVM domUs Date: Wed, 17 Mar 2021 08:00:46 +0100 Message-Id: <20210317070046.17860-1-olaf@aepfle.de> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	[v2] piix: fix regression during unplug in Xen HVM domUs \| expand [v2] piix: fix regression during unplug in Xen HVM domUs

Olaf Hering March 17, 2021, 7 a.m. UTC

Commit ee358e919e385fdc79d59d0d47b4a81e349cd5c9 causes a regression in
Xen HVM domUs which run xenlinux based kernels.

If the domU has an USB device assigned, for example with
"usbdevice=['tablet']" in domU.cfg, the late unplug of devices will
kill the emulated USB host. As a result the khubd thread hangs, and as
a result the entire boot process.

For some reason this does not affect pvops based kernels. This is
most likely caused by the fact that unplugging happens very early
during boot.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 hw/ide/piix.c        | 5 +++++
 include/hw/ide/pci.h | 1 +
 2 files changed, 6 insertions(+)

John Snow March 22, 2021, 10:09 p.m. UTC | #1

On 3/17/21 3:00 AM, Olaf Hering wrote:
> Commit ee358e919e385fdc79d59d0d47b4a81e349cd5c9 causes a regression in
> Xen HVM domUs which run xenlinux based kernels.
> 
> If the domU has an USB device assigned, for example with
> "usbdevice=['tablet']" in domU.cfg, the late unplug of devices will
> kill the emulated USB host. As a result the khubd thread hangs, and as
> a result the entire boot process.
> 
> For some reason this does not affect pvops based kernels. This is
> most likely caused by the fact that unplugging happens very early
> during boot.
> 

I'm not entirely sure of how the commit message relates to the patch, 
actually. (Sorry, I am not well familiar with XEN.)

> Signed-off-by: Olaf Hering <olaf@aepfle.de>
> ---
>   hw/ide/piix.c        | 5 +++++
>   include/hw/ide/pci.h | 1 +
>   2 files changed, 6 insertions(+)
> 
> diff --git a/hw/ide/piix.c b/hw/ide/piix.c
> index b9860e35a5..7f1998bf04 100644
> --- a/hw/ide/piix.c
> +++ b/hw/ide/piix.c
> @@ -109,6 +109,9 @@ static void piix_ide_reset(DeviceState *dev)
>       uint8_t *pci_conf = pd->config;
>       int i;
>   
> +    if (d->xen_unplug_done == true) {
> +        return;
> +    }

My understanding is that XEN has some extra disks that it unplugs when 
it later figures out it doesn't need them. How exactly this works is 
something I've not looked into too closely.

So if these IDE devices have been "unplugged" already, we avoid 
resetting them here. What about this reset causes the bug you describe 
in the commit message?

Does this reset now happen earlier/later as compared to what it did 
prior to ee358e91 ?

>       for (i = 0; i < 2; i++) {
>           ide_bus_reset(&d->bus[i]);
>       }
> @@ -151,6 +154,7 @@ static void pci_piix_ide_realize(PCIDevice *dev, Error **errp)
>       PCIIDEState *d = PCI_IDE(dev);
>       uint8_t *pci_conf = dev->config;
>   
> +    d->xen_unplug_done = false;
>       pci_conf[PCI_CLASS_PROG] = 0x80; // legacy ATA mode
>   
>       bmdma_setup_bar(d);
> @@ -170,6 +174,7 @@ int pci_piix3_xen_ide_unplug(DeviceState *dev, bool aux)
>       BlockBackend *blk;
>   
>       pci_ide = PCI_IDE(dev);
> +    pci_ide->xen_unplug_done = true;
>   
>       for (i = aux ? 1 : 0; i < 4; i++) {
>           idebus = &pci_ide->bus[i / 2];
> diff --git a/include/hw/ide/pci.h b/include/hw/ide/pci.h
> index d8384e1c42..9e71cfec3b 100644
> --- a/include/hw/ide/pci.h
> +++ b/include/hw/ide/pci.h
> @@ -50,6 +50,7 @@ struct PCIIDEState {
>       IDEBus bus[2];
>       BMDMAState bmdma[2];
>       uint32_t secondary; /* used only for cmd646 */
> +    bool xen_unplug_done;

I am hesitant to put a new XEN-specific boolean here, but don't know 
enough about the problem to outright say "no".

This looks like a band-aid that's out of place, but I don't understand 
the problem well enough yet to suggest a better place.

>       MemoryRegion bmdma_bar;
>       MemoryRegion cmd_bar[2];
>       MemoryRegion data_bar[2];
> 

(If anyone else with more experience with XEN wants to take over the 
review of this patch, let me know. I only really care about the IDE bits.)

Olaf Hering March 25, 2021, 11:12 a.m. UTC | #2

Am Mon, 22 Mar 2021 18:09:17 -0400
schrieb John Snow <jsnow@redhat.com>:

> My understanding is that XEN has some extra disks that it unplugs when 
> it later figures out it doesn't need them. How exactly this works is 
> something I've not looked into too closely.

It has no extra disks, why would it?

I assume each virtualization variant has some sort of unplug if it has to support guests that lack PV/virtio/enlightened/whatever drivers.

In case of HVM, the configured block or network devices can be either accessed via emulated PCI or via the PV drivers. Since the BIOS, the bootloader and potentially the operating system kernel typically lack PV drivers, they will find the devices only via the PCI bus. In case they happen to have PV drivers in addition to PCI drivers, both drivers will find and offer the same resource via different paths. In case of a block device, ata_piix.ko will show it via "/dev/sda" and xen-blkfront.ko will show it via "/dev/xvda". This is obviously bad, at least in the read-write case.

The pvops kernel triggers the unplug of the emulated PCI hardware early, prior any other PCI initialization. As a result the PCI drivers will not find their hardware anymore. In case of ata_piix, only the non-CDROM storage will be removed in qmeu, because there is no PV-CDROM driver.

The PV support in old xenlinux based kernels is only available as modules. As a result the unplug will happen after PCI was initialized, but it must happen before any PCI device drivers are loaded.


> So if these IDE devices have been "unplugged" already, we avoid 
> resetting them here. What about this reset causes the bug you describe 
> in the commit message?
> 
> Does this reset now happen earlier/later as compared to what it did 
> prior to ee358e91 ?

Prior this commit, piix_ide_reset was only called when the entire emulated machine was reset. Like: never.
With this commit, piix_ide_reset will be called from pci_piix3_xen_ide_unplug. For some reason it confuses the emulated USB hardware. Why it does confused it, no idea.

I wonder what the purpose of the qdev_reset_all() call really is. It is 10 years old. It might be stale.


Olaf

Paolo Bonzini March 25, 2021, 4:49 p.m. UTC | #3

On 25/03/21 12:12, Olaf Hering wrote:
> Am Mon, 22 Mar 2021 18:09:17 -0400
> schrieb John Snow <jsnow@redhat.com>:
> 
>> My understanding is that XEN has some extra disks that it unplugs when
>> it later figures out it doesn't need them. How exactly this works is
>> something I've not looked into too closely.
> 
> It has no extra disks, why would it?
> 
> I assume each virtualization variant has some sort of unplug if it has to support guests that lack PV/virtio/enlightened/whatever drivers.

No, it's Xen only and really should be legacy.  Ideally one would just 
have devices supported at all levels from firmware to kernel.

>> So if these IDE devices have been "unplugged" already, we avoid
>> resetting them here. What about this reset causes the bug you describe
>> in the commit message?
>>
>> Does this reset now happen earlier/later as compared to what it did
>> prior to ee358e91 ?
> 
> Prior this commit, piix_ide_reset was only called when the entire
> emulated machine was reset. Like: never. With this commit,
> piix_ide_reset will be called from pci_piix3_xen_ide_unplug. For some
> reason it confuses the emulated USB hardware. Why it does confused
> it, no idea.

> I wonder what the purpose of the qdev_reset_all() call really is. It
> is 10 years old. It might be stale.

piix_ide_reset is only calling ide_bus_reset, and from there ide_reset 
and bmdma_reset.  All of these functions do just two things: reset 
internal registers and ensure pending I/O is completed or canceled.  The 
latter is indeed unnecessary; drain/flush/detach is already done before 
the call to qdev_reset_all.

But the fact that it breaks USB is weird.  That's the part that needs to 
be debugged, because changing IDE to unbreak USB needs an explanation 
even if it's the right thing to do.

If you don't want to debug it, removing the qdev_reset_all call might do 
the job; you'll have to see what the Xen maintainers think of it.  But 
if you don't debug the USB issue now, it will come back later almost surely.

Paolo

Olaf Hering May 9, 2023, 10:58 p.m. UTC | #4

Resuming this old thread about an unfixed bug, which was introduced in qemu-4.2:

qemu ends up in piix_ide_reset from pci_unplug_disks.
This was not the case prior 4.2, the removed call to
qemu_register_reset(piix3_reset, d) in
ee358e919e385fdc79d59d0d47b4a81e349cd5c9 did apparently nothing.

In my debugging (with v8.0.0) it turned out the three pci_set_word
causes the domU to hang. In fact, it is just the last one:

   pci_set_byte(pci_conf + 0x20, 0x01);  /* BMIBA: 20-23h */

It changes the value from 0xc121 to 0x1.

The question is: what does this do in practice?

Starting with recent qemu (like 7.2), the domU sometimes proceeds with
these messages:

    [    1.631161] uhci_hcd 0000:00:01.2: host system error, PCI problems?
    [    1.634965] uhci_hcd 0000:00:01.2: host controller process error, something bad happened!
    [    1.634965] uhci_hcd 0000:00:01.2: host controller halted, very bad!
    [    1.634965] uhci_hcd 0000:00:01.2: HC died; cleaning up
    Loading basic drivers...[    2.398048] Disabling IRQ #23

Is anyone familiar enough with PIIX3 and knows how these devices are
interacting?

00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.2 USB controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] (rev 01)
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:02.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01)
00:03.0 VGA compatible controller: Cirrus Logic GD 5446

Thanks,
Olaf

On Thu, Mar 25, Paolo Bonzini wrote:

> On 25/03/21 12:12, Olaf Hering wrote:
> > Am Mon, 22 Mar 2021 18:09:17 -0400
> > schrieb John Snow <jsnow@redhat.com>:
> > 
> > > My understanding is that XEN has some extra disks that it unplugs when
> > > it later figures out it doesn't need them. How exactly this works is
> > > something I've not looked into too closely.
> > 
> > It has no extra disks, why would it?
> > 
> > I assume each virtualization variant has some sort of unplug if it has to support guests that lack PV/virtio/enlightened/whatever drivers.
> 
> No, it's Xen only and really should be legacy.  Ideally one would just have
> devices supported at all levels from firmware to kernel.
> 
> > > So if these IDE devices have been "unplugged" already, we avoid
> > > resetting them here. What about this reset causes the bug you describe
> > > in the commit message?
> > > 
> > > Does this reset now happen earlier/later as compared to what it did
> > > prior to ee358e91 ?
> > 
> > Prior this commit, piix_ide_reset was only called when the entire
> > emulated machine was reset. Like: never. With this commit,
> > piix_ide_reset will be called from pci_piix3_xen_ide_unplug. For some
> > reason it confuses the emulated USB hardware. Why it does confused
> > it, no idea.
> 
> > I wonder what the purpose of the qdev_reset_all() call really is. It
> > is 10 years old. It might be stale.
> 
> piix_ide_reset is only calling ide_bus_reset, and from there ide_reset and
> bmdma_reset.  All of these functions do just two things: reset internal
> registers and ensure pending I/O is completed or canceled.  The latter is
> indeed unnecessary; drain/flush/detach is already done before the call to
> qdev_reset_all.
> 
> But the fact that it breaks USB is weird.  That's the part that needs to be
> debugged, because changing IDE to unbreak USB needs an explanation even if
> it's the right thing to do.
> 
> If you don't want to debug it, removing the qdev_reset_all call might do the
> job; you'll have to see what the Xen maintainers think of it.  But if you
> don't debug the USB issue now, it will come back later almost surely.
> 
> Paolo

Olaf Hering May 10, 2023, 7:47 a.m. UTC | #5

Wed, 10 May 2023 00:58:27 +0200 Olaf Hering <olaf@aepfle.de>:

> In my debugging (with v8.0.0) it turned out the three pci_set_word
> causes the domU to hang. In fact, it is just the last one:
> 
>    pci_set_byte(pci_conf + 0x20, 0x01);  /* BMIBA: 20-23h */
> 
> It changes the value from 0xc121 to 0x1.

If I disable just "pci_set_word(pci_conf + PCI_COMMAND, 0x0000);" it works as well.
It changes the value from 0x5 to 0.

In general I feel it is wrong to fiddle with PCI from the host side.
This is most likely not the intention of the Xen unplug protocol.
I'm sure the guest does not expect such changes under the hood.
It happens to work by luck with pvops kernels because their PCI discovery
is done after the unplug.

So, what do we do here to get this off the table?


Olaf

Stefano Stabellini May 12, 2023, 9:12 p.m. UTC | #6

On Wed, 10 May 2023, Olaf Hering wrote:
> Wed, 10 May 2023 00:58:27 +0200 Olaf Hering <olaf@aepfle.de>:
> 
> > In my debugging (with v8.0.0) it turned out the three pci_set_word
> > causes the domU to hang. In fact, it is just the last one:
> > 
> >    pci_set_byte(pci_conf + 0x20, 0x01);  /* BMIBA: 20-23h */
> > 
> > It changes the value from 0xc121 to 0x1.
> 
> If I disable just "pci_set_word(pci_conf + PCI_COMMAND, 0x0000);" it works as well.
> It changes the value from 0x5 to 0.
> 
> In general I feel it is wrong to fiddle with PCI from the host side.
> This is most likely not the intention of the Xen unplug protocol.
> I'm sure the guest does not expect such changes under the hood.
> It happens to work by luck with pvops kernels because their PCI discovery
> is done after the unplug.
> 
> So, what do we do here to get this off the table?

I don't have a concrete suggestion because I don't understand the root
cause of the issue. Looking back at Paolo's reply from 2021

https://marc.info/?l=xen-devel&m=161669099305992&w=2

I think he was right. We can either fix the root cause of the issue or
avoid calling qdev_reset_all on unplug. I am OK with either one.

John Snow May 16, 2023, 5:38 p.m. UTC | #7

On Fri, May 12, 2023 at 5:14 PM Stefano Stabellini
<sstabellini@kernel.org> wrote:
>
> On Wed, 10 May 2023, Olaf Hering wrote:
> > Wed, 10 May 2023 00:58:27 +0200 Olaf Hering <olaf@aepfle.de>:
> >
> > > In my debugging (with v8.0.0) it turned out the three pci_set_word
> > > causes the domU to hang. In fact, it is just the last one:
> > >
> > >    pci_set_byte(pci_conf + 0x20, 0x01);  /* BMIBA: 20-23h */
> > >
> > > It changes the value from 0xc121 to 0x1.
> >
> > If I disable just "pci_set_word(pci_conf + PCI_COMMAND, 0x0000);" it works as well.
> > It changes the value from 0x5 to 0.
> >
> > In general I feel it is wrong to fiddle with PCI from the host side.
> > This is most likely not the intention of the Xen unplug protocol.
> > I'm sure the guest does not expect such changes under the hood.
> > It happens to work by luck with pvops kernels because their PCI discovery
> > is done after the unplug.
> >
> > So, what do we do here to get this off the table?
>
> I don't have a concrete suggestion because I don't understand the root
> cause of the issue. Looking back at Paolo's reply from 2021
>
> https://marc.info/?l=xen-devel&m=161669099305992&w=2
>
> I think he was right. We can either fix the root cause of the issue or
> avoid calling qdev_reset_all on unplug. I am OK with either one.

I haven't touched IDE or block code in quite a long while now -- I
don't think I can help land this fix, but I won't get in anyone's way,
either. Maybe just re-submit the patches with an improved commit
message / cover letter that helps collect the info from the previous
thread, the core issue, etc.

--js

Olaf Hering May 16, 2023, 8 p.m. UTC | #8

Am Tue, 16 May 2023 13:38:42 -0400
schrieb John Snow <jsnow@redhat.com>:

> I haven't touched IDE or block code in quite a long while now -- I
> don't think I can help land this fix, but I won't get in anyone's way,
> either. Maybe just re-submit the patches with an improved commit
> message / cover letter that helps collect the info from the previous
> thread, the core issue, etc.

I poked at it some more in the past days. Paolo was right in 2019, this
issue needs to be debugged more to really understand why fiddling
with one PCI devices breaks another, apparently unrelated PCI device.

Once I know more, I will suggest a new change. The old one is
stale, and needs to be rebased anyway.


Olaf

Olaf Hering June 26, 2023, 9:19 p.m. UTC | #9

I need advice on how to debug this.

One thing that stands out is uhci_irq().
It reads a u16 from the USBSTS register. 

On the qemu side, this read is served from bmdma_read. Since the read
size is 2, the result is ~0, and uhci_irq() turns the controller off.
In other words, memory_region_ops_read from addr=0xc102 is served from "piix-bmdma"

If the pci_set_word calls in piix_ide_reset are skipped, the read is
served from uhci_port_write. This is the expected behavior.
In other words, memory_region_ops_read from addr=0xc102 is served from "uhci".

So far I was unable to decipher how the pci_set_word calls can
possibly affect the outcome and the owner of memory_region_ops_read.


Thanks,
Olaf

Wed, 10 May 2023 00:58:27 +0200 Olaf Hering <olaf@aepfle.de>:

> Resuming this old thread about an unfixed bug, which was introduced in qemu-4.2:
> 
> qemu ends up in piix_ide_reset from pci_unplug_disks.
> This was not the case prior 4.2, the removed call to
> qemu_register_reset(piix3_reset, d) in
> ee358e919e385fdc79d59d0d47b4a81e349cd5c9 did apparently nothing.
> 
> In my debugging (with v8.0.0) it turned out the three pci_set_word
> causes the domU to hang. In fact, it is just the last one:
> 
>    pci_set_byte(pci_conf + 0x20, 0x01);  /* BMIBA: 20-23h */
> 
> It changes the value from 0xc121 to 0x1.
> 
> The question is: what does this do in practice?
> 
> Starting with recent qemu (like 7.2), the domU sometimes proceeds with
> these messages:
> 
>     [    1.631161] uhci_hcd 0000:00:01.2: host system error, PCI problems?
>     [    1.634965] uhci_hcd 0000:00:01.2: host controller process error, something bad happened!
>     [    1.634965] uhci_hcd 0000:00:01.2: host controller halted, very bad!
>     [    1.634965] uhci_hcd 0000:00:01.2: HC died; cleaning up
>     Loading basic drivers...[    2.398048] Disabling IRQ #23

Paolo Bonzini June 27, 2023, 7:11 a.m. UTC | #10

On 6/26/23 23:19, Olaf Hering wrote:
> I need advice on how to debug this.
> 
> One thing that stands out is uhci_irq().
> It reads a u16 from the USBSTS register.
> 
> On the qemu side, this read is served from bmdma_read. Since the read
> size is 2, the result is ~0, and uhci_irq() turns the controller off.
> In other words, memory_region_ops_read from addr=0xc102 is served from "piix-bmdma"
> 
> If the pci_set_word calls in piix_ide_reset are skipped, the read is
> served from uhci_port_write. This is the expected behavior.
> In other words, memory_region_ops_read from addr=0xc102 is served from "uhci".

I think what's happening is that

     pci_set_byte(pci_conf + 0x20, 0x01);  /* BMIBA: 20-23h */

is setting the BAR to 0xC100, therefore overlapping the UHCI device's 
region.  In principle this line shouldn't be necessary at all though; 
it's enough to clear the COMMAND register.

Can you check the value of the COMMAND register (pci_conf + 0x04, 16 
bits, little endian)?  Something might be causing the register to be set 
back to a nonzero value, therefore re-enabling the I/O at the address 
that overlaps the UHCI device.

Paolo

Bernhard Beschow June 27, 2023, 10:12 a.m. UTC | #11

Am 27. Juni 2023 07:11:33 UTC schrieb Paolo Bonzini <pbonzini@redhat.com>:
>On 6/26/23 23:19, Olaf Hering wrote:
>> I need advice on how to debug this.
>> 
>> One thing that stands out is uhci_irq().
>> It reads a u16 from the USBSTS register.
>> 
>> On the qemu side, this read is served from bmdma_read. Since the read
>> size is 2, the result is ~0, and uhci_irq() turns the controller off.
>> In other words, memory_region_ops_read from addr=0xc102 is served from "piix-bmdma"
>> 
>> If the pci_set_word calls in piix_ide_reset are skipped, the read is
>> served from uhci_port_write. This is the expected behavior.
>> In other words, memory_region_ops_read from addr=0xc102 is served from "uhci".
>
>I think what's happening is that
>
>    pci_set_byte(pci_conf + 0x20, 0x01);  /* BMIBA: 20-23h */
>
>is setting the BAR to 0xC100, therefore overlapping the UHCI device's region.  In principle this line shouldn't be necessary at all though; it's enough to clear the COMMAND register.

Interesting. The BAR is a 32 bit register whose default value is 0x00000001. I think what's supposed to happen here is a pci_set_long() rather than a pci_set_byte().

Bits 4..15 represent the BAR address, and pci_set_byte() only clears bits 4..7, leaving bits 8..15 unchanged. Perhaps this causes the BAR to be moved into the UHCI region? Does changing the call to pci_set_long() fix the problem?

Best regards,
Bernhard

>
>Can you check the value of the COMMAND register (pci_conf + 0x04, 16 bits, little endian)?  Something might be causing the register to be set back to a nonzero value, therefore re-enabling the I/O at the address that overlaps the UHCI device.
>
>Paolo
>
>

Olaf Hering June 27, 2023, 11:32 a.m. UTC | #12

Mon, 26 Jun 2023 23:19:01 +0200 Olaf Hering <olaf@aepfle.de>:

> So far I was unable to decipher how the pci_set_word calls can
> possibly affect the outcome and the owner of memory_region_ops_read.

It is enough to return from piix_ide_reset right after
pci_set_word(pci_conf + PCI_COMMAND, 0) to trigger the issue.


One thing which was not mentioned yet: the order in which kernel drivers
are loaded matters. Usually it is xen-platform-pci/uhci-hcd/ata_piix.

When uhci loads, it scans the USB bus, finds the tablet, loads usbhid.
While this happens, ata_piix loads. It finds the PCI device in state
disabled. The PCI code enables the device. On the qemu side this ends
up in pci_default_write_config for PCI device "piix3-ide" with addr=4,
val=1, len=2. This calls pci_update_mappings, which for region #4
changes the addr from 0xc120 to 0xc100. This causes the issue. Now
usbhid tries to use the USB bus, but uhci_irq fails.

If ata_piix is not loaded, uhci works.
If ata_piix is loaded before uhci-hcd, the USB bus can not be scanned,
udev is killed after a timeout and boot proceeds.
If usbhid is loaded before ata_piix, USB bus discovery usually finishes
before ata_piix enables its PCI device, boot proceeds.


Olaf

Olaf Hering June 27, 2023, 11:40 a.m. UTC | #13

Tue, 27 Jun 2023 10:12:50 +0000 Bernhard Beschow <shentey@gmail.com>:

> Bits 4..15 represent the BAR address, and pci_set_byte() only clears bits 4..7, leaving bits 8..15 unchanged. Perhaps this causes the BAR to be moved into the UHCI region? Does changing the call to pci_set_long() fix the problem?

Thanks, it seems this fixes the issue. Let me verify this with a clean build.


Olaf

Olaf Hering June 27, 2023, 12:07 p.m. UTC | #14

Tue, 27 Jun 2023 10:12:50 +0000 Bernhard Beschow <shentey@gmail.com>:

> The BAR is a 32 bit register whose default value is 0x00000001. I think what's supposed to happen here is a pci_set_long() rather than a pci_set_byte().

Indeed, the u32 at that address changes from c121 to c101 with the current code.

Olaf

Bernhard Beschow June 28, 2023, 9:27 a.m. UTC | #15

Am 27. Juni 2023 12:07:40 UTC schrieb Olaf Hering <olaf@aepfle.de>:
>Tue, 27 Jun 2023 10:12:50 +0000 Bernhard Beschow <shentey@gmail.com>:
>
>> The BAR is a 32 bit register whose default value is 0x00000001. I think what's supposed to happen here is a pci_set_long() rather than a pci_set_byte().
>
>Indeed, the u32 at that address changes from c121 to c101 with the current code.

Neat! Would you mind sending a patch fixing the BMIBA register to be reset as 32 bit?

Best regards,
Bernhard
>
>Olaf

Olaf Hering June 30, 2023, 7:29 a.m. UTC | #16

Wed, 28 Jun 2023 09:27:16 +0000 Bernhard Beschow <shentey@gmail.com>:

> Would you mind sending a patch fixing the BMIBA register to be reset as 32 bit?

Will do so next week.

Are the specs for this chipset available, does this address really need
to be accessed in quantities of u32, or is perhaps u16 required? I guess
for this specific bug pci_set_word may work as well.

Either way, commit e6a71ae327a388723182a504bb253777ec36869b was wrong.
Does the comment added in this commit mean, the quantity is really u32?


Olaf

Bernhard Beschow June 30, 2023, 8:05 a.m. UTC | #17

Am 30. Juni 2023 07:29:21 UTC schrieb Olaf Hering <olaf@aepfle.de>:
>Wed, 28 Jun 2023 09:27:16 +0000 Bernhard Beschow <shentey@gmail.com>:
>
>> Would you mind sending a patch fixing the BMIBA register to be reset as 32 bit?
>
>Will do so next week.

Great! Perhaps it could then be picked up by maintainers together with my other IDE changes.

>
>Are the specs for this chipset available,

Yes. Have a look for piix3/piix4 here: https://www.intel.com/design/archives/chipsets/440/index.htm

> does this address really need
>to be accessed in quantities of u32, or is perhaps u16 required? I guess
>for this specific bug pci_set_word may work as well.
>
>Either way, commit e6a71ae327a388723182a504bb253777ec36869b was wrong.
>Does the comment added in this commit mean, the quantity is really u32?

As per the datasheet (and per the comment in the source code) the BMIBA register is 32 bit. Search the datasheet for the register name and you'll also find its default value which it holds after reset.

Best regards,
Bernhard

>
>
>Olaf

Paolo Bonzini June 30, 2023, 8:48 a.m. UTC | #18

Il mer 28 giu 2023, 13:28 Bernhard Beschow <shentey@gmail.com> ha scritto:

>
>
> Am 27. Juni 2023 12:07:40 UTC schrieb Olaf Hering <olaf@aepfle.de>:
> >Tue, 27 Jun 2023 10:12:50 +0000 Bernhard Beschow <shentey@gmail.com>:
> >
> >> The BAR is a 32 bit register whose default value is 0x00000001. I think
> what's supposed to happen here is a pci_set_long() rather than a
> pci_set_byte().
> >
> >Indeed, the u32 at that address changes from c121 to c101 with the
> current code.
>
> Neat! Would you mind sending a patch fixing the BMIBA register to be reset
> as 32 bit?
>

I think we should also check why writing the command register is not
disabling the BAR as well.

Paolo


> Best regards,
> Bernhard
> >
> >Olaf
>
>

Olaf Hering June 30, 2023, 11:32 a.m. UTC | #19

Fri, 30 Jun 2023 08:05:29 +0000 Bernhard Beschow <shentey@gmail.com>:

> Yes. Have a look for piix3/piix4 here: https://www.intel.com/design/archives/chipsets/440/index.htm

This is hidden behind a login or whatever.

I should be able to come up with a commit message without hardware specs being available.


Olaf

Bernhard Beschow June 30, 2023, 10:15 p.m. UTC | #20

Am 30. Juni 2023 11:32:42 UTC schrieb Olaf Hering <olaf@aepfle.de>:
>Fri, 30 Jun 2023 08:05:29 +0000 Bernhard Beschow <shentey@gmail.com>:
>
>> Yes. Have a look for piix3/piix4 here: https://www.intel.com/design/archives/chipsets/440/index.htm
>
>This is hidden behind a login or whatever.

None of the links ask annoying questions in my case.

Best regards,
Bernhard

>
>I should be able to come up with a commit message without hardware specs being available.
>
>
>Olaf

Bernhard Beschow July 1, 2023, 9:53 a.m. UTC | #21

Am 30. Juni 2023 08:48:02 UTC schrieb Paolo Bonzini <pbonzini@redhat.com>:
>Il mer 28 giu 2023, 13:28 Bernhard Beschow <shentey@gmail.com> ha scritto:
>
>>
>>
>> Am 27. Juni 2023 12:07:40 UTC schrieb Olaf Hering <olaf@aepfle.de>:
>> >Tue, 27 Jun 2023 10:12:50 +0000 Bernhard Beschow <shentey@gmail.com>:
>> >
>> >> The BAR is a 32 bit register whose default value is 0x00000001. I think
>> what's supposed to happen here is a pci_set_long() rather than a
>> pci_set_byte().
>> >
>> >Indeed, the u32 at that address changes from c121 to c101 with the
>> current code.
>>
>> Neat! Would you mind sending a patch fixing the BMIBA register to be reset
>> as 32 bit?
>>
>
>I think we should also check why writing the command register is not
>disabling the BAR as well.

So IIUC the BMIBA register is managed internally by QEMU's PCI code and we shouldn't have to mess with the register at all. We should actually remove the explicit reset of BMIBA, correct?

I've tried debugging the PCI code when working on the VIA IDE controller to understand it better. But despite QEMU being compiled with --enable-debug it seemd to be compiled with -O2 still, making debugging quite hard. I'm not sure if any compile flags leak into my build environment though.

Best regards,
Bernhard 
>
>Paolo
>
>
>> Best regards,
>> Bernhard
>> >
>> >Olaf
>>
>>

Mark Cave-Ayland July 1, 2023, 11:58 a.m. UTC | #22

On 01/07/2023 10:53, Bernhard Beschow wrote:

> Am 30. Juni 2023 08:48:02 UTC schrieb Paolo Bonzini <pbonzini@redhat.com>:
>> Il mer 28 giu 2023, 13:28 Bernhard Beschow <shentey@gmail.com> ha scritto:
>>
>>>
>>>
>>> Am 27. Juni 2023 12:07:40 UTC schrieb Olaf Hering <olaf@aepfle.de>:
>>>> Tue, 27 Jun 2023 10:12:50 +0000 Bernhard Beschow <shentey@gmail.com>:
>>>>
>>>>> The BAR is a 32 bit register whose default value is 0x00000001. I think
>>> what's supposed to happen here is a pci_set_long() rather than a
>>> pci_set_byte().
>>>>
>>>> Indeed, the u32 at that address changes from c121 to c101 with the
>>> current code.
>>>
>>> Neat! Would you mind sending a patch fixing the BMIBA register to be reset
>>> as 32 bit?
>>>
>>
>> I think we should also check why writing the command register is not
>> disabling the BAR as well.
> 
> So IIUC the BMIBA register is managed internally by QEMU's PCI code and we shouldn't have to mess with the register at all. We should actually remove the explicit reset of BMIBA, correct?
> 
> I've tried debugging the PCI code when working on the VIA IDE controller to understand it better. But despite QEMU being compiled with --enable-debug it seemd to be compiled with -O2 still, making debugging quite hard. I'm not sure if any compile flags leak into my build environment though.

Certainly --enable-debug normally does the right thing when building QEMU. If you 
want to double-check the compiler flags in use to see if anything from CFLAGS/LDFLAGS 
is getting picked up, use "make V=1" after configure which outputs the full command 
being used during the build rather than just the summary.


ATB,

Mark.

Bernhard Beschow July 2, 2023, 10:25 p.m. UTC | #23

Am 1. Juli 2023 11:58:57 UTC schrieb Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>:
>On 01/07/2023 10:53, Bernhard Beschow wrote:
>
>> Am 30. Juni 2023 08:48:02 UTC schrieb Paolo Bonzini <pbonzini@redhat.com>:
>>> Il mer 28 giu 2023, 13:28 Bernhard Beschow <shentey@gmail.com> ha scritto:
>>> 
>>>> 
>>>> 
>>>> Am 27. Juni 2023 12:07:40 UTC schrieb Olaf Hering <olaf@aepfle.de>:
>>>>> Tue, 27 Jun 2023 10:12:50 +0000 Bernhard Beschow <shentey@gmail.com>:
>>>>> 
>>>>>> The BAR is a 32 bit register whose default value is 0x00000001. I think
>>>> what's supposed to happen here is a pci_set_long() rather than a
>>>> pci_set_byte().
>>>>> 
>>>>> Indeed, the u32 at that address changes from c121 to c101 with the
>>>> current code.
>>>> 
>>>> Neat! Would you mind sending a patch fixing the BMIBA register to be reset
>>>> as 32 bit?
>>>> 
>>> 
>>> I think we should also check why writing the command register is not
>>> disabling the BAR as well.
>> 
>> So IIUC the BMIBA register is managed internally by QEMU's PCI code and we shouldn't have to mess with the register at all. We should actually remove the explicit reset of BMIBA, correct?
>> 
>> I've tried debugging the PCI code when working on the VIA IDE controller to understand it better. But despite QEMU being compiled with --enable-debug it seemd to be compiled with -O2 still, making debugging quite hard. I'm not sure if any compile flags leak into my build environment though.
>
>Certainly --enable-debug normally does the right thing when building QEMU. If you want to double-check the compiler flags in use to see if anything from CFLAGS/LDFLAGS is getting picked up, use "make V=1" after configure which outputs the full command being used during the build rather than just the summary.

--enable-debug does the right thing indeed. The error was on my side. Solved!

Thanks,
Bernhard
>
>
>ATB,
>
>Mark.
>

[v2] piix: fix regression during unplug in Xen HVM domUs

Commit Message

Comments

Patch