[2/2] kvm tools: inject MSI directly without going through a GSI

Message ID 1343917764-28715-2-git-send-email-levinsasha928@gmail.com (mailing list archive)
State New, archived

Commit Message

Sasha Levin Aug. 2, 2012, 2:29 p.m. UTC
Use the new KVM_SIGNAL_MSI ioctl to inject interrupts directly.

We still create GSIs and keep them for two reasons:

 - They're required by virtio-* devices.
 - There's not much overhead: we create them only when starting the
guest, and they cost nothing while the guest is running.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 tools/kvm/virtio/pci.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

Comments

Pekka Enberg Aug. 4, 2012, 9:14 a.m. UTC | #1
Hi Sasha,

On Thu, Aug 2, 2012 at 5:29 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> Use the new KVM_SIGNAL_MSI ioctl to inject interrupts directly.
>
> We still create GSIs and keep them for two reasons:
>
>  - They're required by virtio-* devices.
>  - There's not much overhead since we just create them when starting the
> guest, they don't use anything when the guest is running.
>
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>

This patch makes 'make check' hang for me. Full boot log below:

  # lkvm run -k ../../arch/x86/boot/bzImage -m 448 -c 4 --name guest-27916

Decompressing Linux... Parsing ELF... done.
Booting the kernel.
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.5.0+ (penberg@tux) (gcc version 4.6.3
20120306 (Red Hat 4.6.3-2) (GCC) ) #22 SMP Sat Aug 4 12:10:59 EEST
2012
[    0.000000] Command line: noapic noacpi pci=conf1 reboot=k panic=1
i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 console=ttyS0
earlyprintk=serial i8042.noaux=1 init=init root=/dev/vda rw
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000ffffe] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001bffffff] usable
[    0.000000] bootconsole [earlyser0] enabled
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] DMI not present or invalid.
[    0.000000] No AGP bridge found
[    0.000000] e820: last_pfn = 0x1c000 max_arch_pfn = 0x400000000
[    0.000000] x86 PAT enabled: cpu 0, old 0x70106, new 0x7010600070106
[    0.000000] CPU MTRRs all blank - virtualized system.
[    0.000000] found SMP MP-table at [mem 0x000f0370-0x000f037f]
mapped at [ffff8800000f0370]
[    0.000000] init_memory_mapping: [mem 0x00000000-0x1bffffff]
[    0.000000] ACPI Error: A valid RSDP was not found (20120320/tbxfroot-219)
[    0.000000] No NUMA configuration found
[    0.000000] Faking a node at [mem 0x0000000000000000-0x000000001bffffff]
[    0.000000] Initmem setup node 0 [mem 0x00000000-0x1bffffff]
[    0.000000]   NODE_DATA [mem 0x1bffc000-0x1bffffff]
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x00010000-0x00ffffff]
[    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x00010000-0x0009efff]
[    0.000000]   node   0: [mem 0x00100000-0x1bffffff]
[    0.000000] Intel MultiProcessor Specification v1.4
[    0.000000] MPTABLE: OEM ID: KVMCPU00
[    0.000000] MPTABLE: Product ID: 0.1
[    0.000000] MPTABLE: APIC at: 0xFEE00000
[    0.000000] Processor #0 (Bootup-CPU)
[    0.000000] Processor #1
[    0.000000] Processor #2
[    0.000000] Processor #3
[    0.000000] IOAPIC[0]: apic_id 5, version 17, address 0xfec00000, GSI 0-23
[    0.000000] Processors: 4
[    0.000000] SMP: Allowing 4 CPUs, 0 hotplug CPUs
[    0.000000] PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
[    0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
[    0.000000] PM: Registered nosave memory: 00000000000f0000 - 00000000000ff000
[    0.000000] PM: Registered nosave memory: 00000000000ff000 - 0000000000100000
[    0.000000] e820: [mem 0x1c000000-0xffffffff] available for PCI devices
[    0.000000] setup_percpu: NR_CPUS:64 nr_cpumask_bits:64
nr_cpu_ids:4 nr_node_ids:1
[    0.000000] PERCPU: Embedded 26 pages/cpu @ffff88001bc00000 s77120
r8192 d21184 u524288
[    0.000000] Built 1 zonelists in Node order, mobility grouping on.
Total pages: 112777
[    0.000000] Policy zone: DMA32
[    0.000000] Kernel command line: noapic noacpi pci=conf1 reboot=k
panic=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 console=ttyS0
earlyprintk=serial i8042.noaux=1 init=init root=/dev/vda rw
[    0.000000] PID hash table entries: 2048 (order: 2, 16384 bytes)
[    0.000000] __ex_table already sorted, skipping sort
[    0.000000] xsave/xrstor: enabled xstate_bv 0x7, cntxt size 0x340
[    0.000000] Checking aperture...
[    0.000000] No AGP bridge found
[    0.000000] Memory: 435020k/458752k available (7228k kernel code,
452k absent, 23280k reserved, 5730k data, 588k init)
[    0.000000] SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0,
CPUs=4, Nodes=1
[    0.000000] Hierarchical RCU implementation.
[    0.000000] NR_IRQS:4352 nr_irqs:712 16
[    0.000000] Console: colour *CGA 80x25
[    0.000000] console [ttyS0] enabled, bootconsole disabled
[    0.000000] console [ttyS0] enabled, bootconsole disabled
[    0.000000] Fast TSC calibration using PIT
[    0.000000] Detected 2691.176 MHz processor.
[    0.003002] Calibrating delay loop (skipped), value calculated
using timer frequency.. 5382.35 BogoMIPS (lpj=2691176)
[    0.004267] pid_max: default: 32768 minimum: 301
[    0.004712] Security Framework initialized
[    0.005003] SELinux:  Initializing.
[    0.005381] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
[    0.006208] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
[    0.007306] Mount-cache hash table entries: 256
[    0.008012] Initializing cgroup subsys cpuacct
[    0.008433] Initializing cgroup subsys freezer
[    0.008892] CPU: Physical Processor ID: 0
[    0.009001] CPU: Processor Core ID: 0
[    0.009359] mce: CPU supports 32 MCE banks
[    0.010403] CPU0: Intel 06/2a stepping 07
[    0.112030] Performance Events: unsupported p6 CPU model 42 no PMU
driver, software events only.
[    0.113087] Booting Node   0, Processors  #1 #2 #3 Ok.
[    0.153001] Brought up 4 CPUs
[    0.153316] Total of 4 processors activated (21529.40 BogoMIPS).
[    0.155187] kworker/u:0 (20) used greatest stack depth: 5560 bytes left
[    0.155187] RTC time:  9:14:11, date: 08/04/12
[    0.155187] NET: Registered protocol family 16
[    0.160055] PCI: Using configuration type 1 for base access
[    0.196037] bio: create slab <bio-0> at 0
[    0.198008] ACPI: Interpreter disabled.
[    0.199002] vgaarb: loaded
[    0.199983] SCSI subsystem initialized
[    0.201025] usbcore: registered new interface driver usbfs
[    0.202011] usbcore: registered new interface driver hub
[    0.203014] usbcore: registered new device driver usb
[    0.205021] Advanced Linux Sound Architecture Driver Version 1.0.25.
[    0.205980] PCI: Probing PCI hardware
[    0.207032] PCI host bridge to bus 0000:00
[    0.207984] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
[    0.208990] pci_bus 0000:00: root bus resource [mem 0x00000000-0xfffffffff]
[    0.213040] cfg80211: Calling CRDA to update world regulatory domain
[    0.218077] NetLabel: Initializing
[    0.218418] NetLabel:  domain hash size = 128
[    0.218418] NetLabel:  protocols = UNLABELED CIPSOv4
[    0.218477] NetLabel:  unlabeled traffic allowed by default
[    0.224031] pnp: PnP ACPI: disabled
[    0.233022] NET: Registered protocol family 2
[    0.234012] IP route cache hash table entries: 4096 (order: 3, 32768 bytes)
[    0.234807] TCP established hash table entries: 16384 (order: 6,
262144 bytes)
[    0.235032] TCP bind hash table entries: 16384 (order: 6, 262144 bytes)
[    0.236030] TCP: Hash tables configured (established 16384 bind 16384)
[    0.236976] TCP: reno registered
[    0.237290] UDP hash table entries: 256 (order: 1, 8192 bytes)
[    0.237982] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
[    0.238656] NET: Registered protocol family 1
[    0.239049] RPC: Registered named UNIX socket transport module.
[    0.239975] RPC: Registered udp transport module.
[    0.240425] RPC: Registered tcp transport module.
[    0.240974] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.241756] platform rtc_cmos: registered platform RTC device (no
PNP device found)
[    0.245007] microcode: CPU0 sig=0x206a7, pf=0x1, revision=0x1
[    0.245007] microcode: CPU1 sig=0x206a7, pf=0x1, revision=0x1
[    0.245975] microcode: CPU2 sig=0x206a7, pf=0x1, revision=0x1
[    0.246974] microcode: CPU3 sig=0x206a7, pf=0x1, revision=0x1
[    0.248310] microcode: Microcode Update Driver: v2.00
<tigran@aivazian.fsnet.co.uk>, Peter Oruba
[    0.249974] audit: initializing netlink socket (disabled)
[    0.250500] type=2000 audit(4499745251.249:1): initialized
[    0.266041] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[    0.274081] VFS: Disk quotas dquot_6.5.2
[    0.275003] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.278011] NFS: Registering the id_resolver key type
[    0.278987] Key type id_resolver registered
[    0.280046] 9p: Installing v9fs 9p2000 file system support
[    0.281980] msgmni has been set to 849
[    0.284002] Block layer SCSI generic (bsg) driver version 0.4
loaded (major 253)
[    0.284969] io scheduler noop registered
[    0.285968] io scheduler deadline registered
[    0.286978] io scheduler cfq registered (default)
[    0.287969] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[    0.288531] virtio-pci 0000:00:01.0: enabling device (0000 -> 0003)
[    0.289056] virtio-pci 0000:00:02.0: enabling device (0000 -> 0003)
[    0.291079] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    0.538076] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a U6_16550A
[    0.785054] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a U6_16550A
[    1.031999] serial8250: ttyS2 at I/O 0x3e8 (irq = 4) is a U6_16550A
[    1.033890] Non-volatile memory driver v1.3
[    1.034855] Linux agpgart interface v0.103
[    1.035895] [drm] Initialized drm 1.1.0 20060810
[    1.036351] [drm:i915_init] *ERROR* drm/i915 can't work without
intel_agp module!
[    1.040917] brd: module loaded
[    1.043922] loop: module loaded
[    1.243831] Switching to clocksource tsc
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sasha Levin Aug. 4, 2012, 9:30 a.m. UTC | #2
On 08/04/2012 11:14 AM, Pekka Enberg wrote:
> Hi Sasha,
> 
> On Thu, Aug 2, 2012 at 5:29 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
>> Use the new KVM_SIGNAL_MSI ioctl to inject interrupts directly.
>>
>> We still create GSIs and keep them for two reasons:
>>
>>  - They're required by virtio-* devices.
>>  - There's not much overhead since we just create them when starting the
>> guest, they don't use anything when the guest is running.
>>
>> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> 
> This patch makes 'make check' hang for me. Full boot log below:

Is your host running a 3.5 kernel? The MSI injection ioctl is new in 3.5.

Pekka Enberg Aug. 4, 2012, 11:02 a.m. UTC | #3
On 08/04/2012 11:14 AM, Pekka Enberg wrote:
>> This patch makes 'make check' hang for me. Full boot log below:

On Sat, Aug 4, 2012 at 12:30 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> Is your host kernel running 3.5? The new MSI injection ioctl is a new 3.5 feature.

No, it's not running 3.5. We need to support older *host* kernels, though.

                        Pekka
Sasha Levin Aug. 5, 2012, 7:02 a.m. UTC | #4
On 08/04/2012 01:02 PM, Pekka Enberg wrote:
> On 08/04/2012 11:14 AM, Pekka Enberg wrote:
>>> This patch makes 'make check' hang for me. Full boot log below:
> 
> On Sat, Aug 4, 2012 at 12:30 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
>> Is your host kernel running 3.5? The new MSI injection ioctl is a new 3.5 feature.
> 
> No, it's not running 3.5. We need to support older *host* kernels, though.

Do we? Don't we need to support just the kernel that the tool was built with?

Pekka Enberg Aug. 5, 2012, 9:08 a.m. UTC | #5
On 08/04/2012 01:02 PM, Pekka Enberg wrote:
>> No, it's not running 3.5. We need to support older *host* kernels,
>> though.

On Sun, Aug 5, 2012 at 10:02 AM, Sasha Levin <levinsasha928@gmail.com> wrote:
> Do we? Don't we need to support just the kernel that the tool was
> built with?

We only do that for *guest kernels* if we have to but we've always been
compatible with older host kernels.

Isn't there a capability flag that KVM sets if KVM_SIGNAL_MSI is
supported? Just store that in 'struct kvm' and switch between
virtio_pci__signal_msi() and kvm__irq_trigger() depending on whether the
flag is set.
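
The suggestion above can be sketched roughly as follows. This is a minimal
sketch, not kvmtool code: host_has_signal_msi(), struct irq_paths,
signal_irq() and the counting backends are hypothetical names introduced for
illustration. Only KVM_CHECK_EXTENSION and KVM_CAP_SIGNAL_MSI come from
<linux/kvm.h>. The function-pointer backends stand in for
virtio_pci__signal_msi() and kvm__irq_trigger().

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Probe the host once at startup; sys_fd is assumed to be the /dev/kvm
 * descriptor. KVM_CHECK_EXTENSION returns 0 when the capability is absent
 * and a positive value when it is present. An error (-1) is treated as
 * "absent" so that older hosts fall back to the GSI path.
 */
static bool host_has_signal_msi(int sys_fd)
{
	return ioctl(sys_fd, KVM_CHECK_EXTENSION, KVM_CAP_SIGNAL_MSI) > 0;
}

/* Hypothetical injection backend signature. */
typedef void (*inject_fn)(void *ctx);

struct irq_paths {
	bool has_signal_msi;	/* cached probe result (would live in struct kvm) */
	inject_fn msi;		/* stand-in for virtio_pci__signal_msi() */
	inject_fn gsi;		/* stand-in for kvm__irq_trigger() */
};

/* Pick the injection path based on the cached capability flag. */
static void signal_irq(const struct irq_paths *p, void *ctx)
{
	assert(p && p->msi && p->gsi);

	if (p->has_signal_msi)
		p->msi(ctx);
	else
		p->gsi(ctx);
}

/* Example backends that only count calls, so the dispatch runs without a guest. */
struct call_counts {
	int msi;
	int gsi;
};

static void count_msi(void *ctx)
{
	((struct call_counts *)ctx)->msi++;
}

static void count_gsi(void *ctx)
{
	((struct call_counts *)ctx)->gsi++;
}
```

Probing once and caching the result keeps the per-interrupt path to a single
branch, and a host without the capability keeps using the GSIs that are
created at guest start anyway.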

                        Pekka
Sasha Levin Aug. 5, 2012, 9:14 a.m. UTC | #6
On 08/05/2012 11:08 AM, Pekka Enberg wrote:
> On 08/04/2012 01:02 PM, Pekka Enberg wrote:
>>> No, it's not running 3.5. We need to support older *host* kernels,
>>> though.
> 
> On Sun, Aug 5, 2012 at 10:02 AM, Sasha Levin <levinsasha928@gmail.com> wrote:
>> Do we? Don't we need to support just the kernel that the tool was
>> built with?
> 
> We only do that for *guest kernels* if we have to but we've always been
> compatible with older host kernels.
> 
> Isn't there a capability flag that KVM sets if KVM_SIGNAL_MSI is
> supported? Just store that in 'struct kvm' and switch between
> virtio_pci__signal_msi() and kvm__irq_trigger() depending on whether the
> flag is set.

There is, but we've broken backwards compatibility for guests several times before as well - which is why I assumed that's fine.

Pekka Enberg Aug. 5, 2012, 9:31 a.m. UTC | #7
On 08/05/2012 11:08 AM, Pekka Enberg wrote:
>> Isn't there a capability flag that KVM sets if KVM_SIGNAL_MSI is
>> supported? Just store that in 'struct kvm' and switch between
>> virtio_pci__signal_msi() and kvm__irq_trigger() depending on whether the
>> flag is set.

On Sun, Aug 5, 2012 at 12:14 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> There is, but we've broken backwards compatibility for guests several
> times before as well - which is why I assumed that's fine.

Let me be clear about this: I don't like breaking backwards
compatibility at all. Yes, we have done it in the past for *guest
kernels* for various technical reasons but we've never ever done it on
purpose for host kernels.

We have been more relaxed on backwards compatibility requirements than
QEMU but I think we're reaching a point where it's more painful to break
backwards compatibility than it is to write code to accommodate older
kernels. Especially for something like this which is obviously a new KVM
feature and not required for running Linux.

                        Pekka
Patch

diff --git a/tools/kvm/virtio/pci.c b/tools/kvm/virtio/pci.c
index f17cd8a..9888b1a 100644
--- a/tools/kvm/virtio/pci.c
+++ b/tools/kvm/virtio/pci.c
@@ -10,6 +10,7 @@ 
 #include <linux/virtio_pci.h>
 #include <linux/byteorder.h>
 #include <string.h>
+#include <sys/ioctl.h>
 
 static void virtio_pci__ioevent_callback(struct kvm *kvm, void *param)
 {
@@ -236,6 +237,17 @@  static void virtio_pci__mmio_callback(u64 addr, u8 *data, u32 len, u8 is_write,
 		memcpy(data, table + addr - offset, len);
 }
 
+static void virtio_pci__signal_msi(struct kvm *kvm, struct virtio_pci *vpci, int vec)
+{
+	struct kvm_msi msi = {
+		.address_lo = vpci->msix_table[vec].msg.address_lo,
+		.address_hi = vpci->msix_table[vec].msg.address_hi,
+		.data = vpci->msix_table[vec].msg.data,
+	};
+
+	ioctl(kvm->vm_fd, KVM_SIGNAL_MSI, &msi);
+}
+
 int virtio_pci__signal_vq(struct kvm *kvm, struct virtio_device *vdev, u32 vq)
 {
 	struct virtio_pci *vpci = vdev->virtio;
@@ -249,7 +261,7 @@  int virtio_pci__signal_vq(struct kvm *kvm, struct virtio_device *vdev, u32 vq)
 			return 0;
 		}
 
-		kvm__irq_trigger(kvm, vpci->gsis[vq]);
+		virtio_pci__signal_msi(kvm, vpci, vpci->vq_vector[vq]);
 	} else {
 		vpci->isr = VIRTIO_IRQ_HIGH;
 		kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
@@ -270,7 +282,7 @@  int virtio_pci__signal_config(struct kvm *kvm, struct virtio_device *vdev)
 			return 0;
 		}
 
-		kvm__irq_trigger(kvm, vpci->config_gsi);
+		virtio_pci__signal_msi(kvm, vpci, vpci->config_vector);
 	} else {
 		vpci->isr = VIRTIO_PCI_ISR_CONFIG;
 		kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
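
The virtio_pci__signal_msi() helper in the patch discards the ioctl() return
value, so a host that lacks KVM_SIGNAL_MSI would drop the interrupt silently,
which is consistent with the hang reported in the thread. A minimal sketch of
a variant that surfaces the failure is shown here. signal_msi_checked() and
its parameters are hypothetical. The return-value convention (positive on
delivery, 0 if the guest blocked the MSI, -1 on error) is the one documented
for KVM_SIGNAL_MSI in the KVM API.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical checked variant of virtio_pci__signal_msi().
 * Returns a positive value if the MSI was delivered, 0 if the guest
 * blocked it, and a negative errno value if the ioctl itself failed
 * (for example on a host kernel without KVM_SIGNAL_MSI).
 */
static int signal_msi_checked(int vm_fd, uint32_t addr_lo, uint32_t addr_hi,
			      uint32_t data)
{
	/* Fields not named here (flags, padding) are zero-initialized. */
	struct kvm_msi msi = {
		.address_lo = addr_lo,
		.address_hi = addr_hi,
		.data = data,
	};
	int ret = ioctl(vm_fd, KVM_SIGNAL_MSI, &msi);

	return ret < 0 ? -errno : ret;
}
```

A caller could log the negative return once and fall back to
kvm__irq_trigger() with the matching GSI, instead of leaving the guest
waiting for an interrupt that never arrives.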