diff mbox

kvmtool: don't use PCI config space IRQ line field

Message ID 1423064390-15269-1-git-send-email-andre.przywara@arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Andre Przywara Feb. 4, 2015, 3:39 p.m. UTC
In PCI config space there is an interrupt line field (offset 0x3f),
which is used to initially communicate the IRQ line number from
firmware to the OS. _Hardware_ should never use this information,
as the OS is free to write any information in there.
But kvmtool uses this number when it triggers IRQs in the guest,
which fails starting with Linux 3.19-rc1, where the PCI layer starts
writing the virtual IRQ number in there.

Fix that by storing the IRQ number in a separate field in
struct virtio_pci, which is independent from the PCI config space
and cannot be influenced by the guest.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
Hi,

this fixes the hangs we have seen with kvmtool and PCI in 3.19-rc1+.
I scanned kvmtool's code for further usage of that config space
field, but couldn't find anything relevant except pci-shmem.c, which
is completely broken atm, so I didn't bother to fix this.

Cheers,
Andre.

 tools/kvm/include/kvm/virtio-pci.h |    8 ++++++++
 tools/kvm/virtio/pci.c             |    9 ++++++---
 2 files changed, 14 insertions(+), 3 deletions(-)

Comments

Will Deacon Feb. 6, 2015, 6:55 p.m. UTC | #1
On Wed, Feb 04, 2015 at 03:39:50PM +0000, Andre Przywara wrote:
> In PCI config space there is an interrupt line field (offset 0x3f),
> which is used to initially communicate the IRQ line number from
> firmware to the OS. _Hardware_ should never use this information,
> as the OS is free to write any information in there.

Is this true even with probe-only? I appreciate that this isn't a BAR,
but it still feels odd for Linux to write this in that case.

Will
Peter Maydell Feb. 6, 2015, 7:02 p.m. UTC | #2
On 6 February 2015 at 18:55, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Feb 04, 2015 at 03:39:50PM +0000, Andre Przywara wrote:
>> In PCI config space there is an interrupt line field (offset 0x3f),
>> which is used to initially communicate the IRQ line number from
>> firmware to the OS. _Hardware_ should never use this information,
>> as the OS is free to write any information in there.
>
> Is this true even with probe-only? I appreciate that this isn't a BAR,
> but it still feels odd for Linux to write this in that case.

The hardware (model) shouldn't be doing anything with the value
in this register anyway, so I think this change to kvmtool is
correct regardless of Linux's behaviour.

-- PMM
Will Deacon Feb. 6, 2015, 7:07 p.m. UTC | #3
On Fri, Feb 06, 2015 at 07:02:25PM +0000, Peter Maydell wrote:
> On 6 February 2015 at 18:55, Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, Feb 04, 2015 at 03:39:50PM +0000, Andre Przywara wrote:
> >> In PCI config space there is an interrupt line field (offset 0x3f),
> >> which is used to initially communicate the IRQ line number from
> >> firmware to the OS. _Hardware_ should never use this information,
> >> as the OS is free to write any information in there.
> >
> > Is this true even with probe-only? I appreciate that this isn't a BAR,
> > but it still feels odd for Linux to write this in that case.
> 
> The hardware (model) shouldn't be doing anything with the value
> in this register anyway, so I think this change to kvmtool is
> correct regardless of Linux's behaviour.

Well, kvmtool is also pretending to be firmware in this case, which is why
it passes things like probe-only and PSCI nodes.

Will
Arnd Bergmann Feb. 7, 2015, 9:24 p.m. UTC | #4
> Will Deacon <will.deacon@arm.com> hat am 6. Februar 2015 um 20:07 geschrieben:
> On Fri, Feb 06, 2015 at 07:02:25PM +0000, Peter Maydell wrote:
> > On 6 February 2015 at 18:55, Will Deacon <will.deacon@arm.com> wrote:
> > > On Wed, Feb 04, 2015 at 03:39:50PM +0000, Andre Przywara wrote:
> > >> In PCI config space there is an interrupt line field (offset 0x3f),
> > >> which is used to initially communicate the IRQ line number from
> > >> firmware to the OS. _Hardware_ should never use this information,
> > >> as the OS is free to write any information in there.
> > >
> > > Is this true even with probe-only? I appreciate that this isn't a BAR,
> > > but it still feels odd for Linux to write this in that case.
> >
> > The hardware (model) shouldn't be doing anything with the value
> > in this register anyway, so I think this change to kvmtool is
> > correct regardless of Linux's behaviour.
>
> Well, kvmtool is also pretending to be firmware in this case, which is why
> it passes things like probe-only and PSCI nodes.

And this means that we should expect kvmtool to initialize these fields to
a meaningful value, but not care if the OS writes something else to them.

An interesting question is what value kvmtool should write in there. IIRC,
SBSA says that the each interrupt line on the PCI should be an SPI of the
primary GIC, so I suppose we could write the SPI number, although that would
be different of the traditional Linux interrupt number user for that SPI,
which has an offset added to it.

Of course for any hardware that is not SBSA compliant in this regard and
connects the interrupt line to a secondary irqchip (e.g. gpio), there is
no good 8-bit number we can write in here. Also, Linux does not care,
because it gets the number from DT rather than the PCI config space.

     Arnd
diff mbox

Patch

diff --git a/tools/kvm/include/kvm/virtio-pci.h b/tools/kvm/include/kvm/virtio-pci.h
index c795ce7..b70cadd 100644
--- a/tools/kvm/include/kvm/virtio-pci.h
+++ b/tools/kvm/include/kvm/virtio-pci.h
@@ -30,6 +30,14 @@  struct virtio_pci {
 	u8			isr;
 	u32			features;
 
+	/*
+	 * We cannot rely on the INTERRUPT_LINE byte in the config space once
+	 * we have run guest code, as the OS is allowed to use that field
+	 * as a scratch pad to communicate between driver and PCI layer.
+	 * So store our legacy interrupt line number in here for internal use.
+	 */
+	u8			legacy_irq_line;
+
 	/* MSI-X */
 	u16			config_vector;
 	u32			config_gsi;
diff --git a/tools/kvm/virtio/pci.c b/tools/kvm/virtio/pci.c
index 7556239..e17e5a9 100644
--- a/tools/kvm/virtio/pci.c
+++ b/tools/kvm/virtio/pci.c
@@ -141,7 +141,7 @@  static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 p
 		break;
 	case VIRTIO_PCI_ISR:
 		ioport__write8(data, vpci->isr);
-		kvm__irq_line(kvm, vpci->pci_hdr.irq_line, VIRTIO_IRQ_LOW);
+		kvm__irq_line(kvm, vpci->legacy_irq_line, VIRTIO_IRQ_LOW);
 		vpci->isr = VIRTIO_IRQ_LOW;
 		break;
 	default:
@@ -299,7 +299,7 @@  int virtio_pci__signal_vq(struct kvm *kvm, struct virtio_device *vdev, u32 vq)
 			kvm__irq_trigger(kvm, vpci->gsis[vq]);
 	} else {
 		vpci->isr = VIRTIO_IRQ_HIGH;
-		kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
+		kvm__irq_trigger(kvm, vpci->legacy_irq_line);
 	}
 	return 0;
 }
@@ -323,7 +323,7 @@  int virtio_pci__signal_config(struct kvm *kvm, struct virtio_device *vdev)
 			kvm__irq_trigger(kvm, vpci->config_gsi);
 	} else {
 		vpci->isr = VIRTIO_PCI_ISR_CONFIG;
-		kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
+		kvm__irq_trigger(kvm, vpci->legacy_irq_line);
 	}
 
 	return 0;
@@ -422,6 +422,9 @@  int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 	if (r < 0)
 		goto free_msix_mmio;
 
+	/* save the IRQ that device__register() has allocated */
+	vpci->legacy_irq_line = vpci->pci_hdr.irq_line;
+
 	return 0;
 
 free_msix_mmio: