[v3] ARM: KVM: Enable in-kernel timers with user space gic

Message ID 1498577737-130264-1-git-send-email-agraf@suse.de (mailing list archive)
State New, archived

Commit Message

Alexander Graf June 27, 2017, 3:35 p.m. UTC
When running with KVM enabled, you can choose between emulating the
gic in kernel or user space. If the kernel supports in-kernel virtualization
of the interrupt controller, it will default to that. If not, it will
default to user space emulation.

Unfortunately when running in user mode gic emulation, we miss out on
interrupt events which are only available from kernel space, such as the timer.
This patch leverages the new kernel/user space pending line synchronization for
timer events. It does not handle PMU events yet.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Andrew Jones <drjones@redhat.com>

---

v1 -> v2:

  - whitespace fixes
  - use !! to determine whether bit is set
  - call in-kernel device IRQs out by their name everywhere

v2 -> v3:

  - fix last occurrence of calling out timer IRQs explicitly
---
 accel/kvm/kvm-all.c    |  5 +++++
 accel/stubs/kvm-stub.c |  5 +++++
 hw/intc/arm_gic.c      |  7 +++++++
 include/sysemu/kvm.h   | 11 +++++++++++
 target/arm/cpu.h       |  3 +++
 target/arm/kvm.c       | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 82 insertions(+)

Comments

Andrew Jones June 28, 2017, 11:51 a.m. UTC | #1
On Tue, Jun 27, 2017 at 05:35:37PM +0200, Alexander Graf wrote:
> When running with KVM enabled, you can choose between emulating the
> gic in kernel or user space. If the kernel supports in-kernel virtualization
> of the interrupt controller, it will default to that. If not, it will
> default to user space emulation.
> 
> Unfortunately when running in user mode gic emulation, we miss out on
> interrupt events which are only available from kernel space, such as the timer.
> This patch leverages the new kernel/user space pending line synchronization for
> timer events. It does not handle PMU events yet.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> Reviewed-by: Andrew Jones <drjones@redhat.com>
> 
> ---
> 
> v1 -> v2:
> 
>   - whitespace fixes
>   - use !! to determine whether bit is set
>   - call in-kernel device IRQs out by their name everywhere
> 
> v2 -> v3:
> 
>   - fix last occurrence of calling out timer IRQs explicitly
> ---
>  accel/kvm/kvm-all.c    |  5 +++++
>  accel/stubs/kvm-stub.c |  5 +++++
>  hw/intc/arm_gic.c      |  7 +++++++
>  include/sysemu/kvm.h   | 11 +++++++++++
>  target/arm/cpu.h       |  3 +++
>  target/arm/kvm.c       | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 82 insertions(+)
>

Tried testing this on a gicv3 machine, a ThunderX2. The guest kernel
complains with

 GICv3: GIC: unable to set SRE (disabled at EL2), panic ahead

but no panic occurs. Instead it hangs in cpu_do_idle(), waiting forever
for an interrupt.

AAVMF also complains about SRE support, actually it asserts it.

 ASSERT [ArmGicDxe] /builddir/build/BUILD/ovmf-c325e41585e3/ArmVirtPkg/Library/ArmVirtGicArchLib/ArmVirtGicArchLib.c(113): IccSre & (1 << 0)


I still haven't seen any problems with gicv2 though.

Thanks,
drew
Alexander Graf June 28, 2017, 12:36 p.m. UTC | #2
On 28.06.17 13:51, Andrew Jones wrote:
> On Tue, Jun 27, 2017 at 05:35:37PM +0200, Alexander Graf wrote:
>> When running with KVM enabled, you can choose between emulating the
>> gic in kernel or user space. If the kernel supports in-kernel virtualization
>> of the interrupt controller, it will default to that. If not, it will
>> default to user space emulation.
>>
>> Unfortunately when running in user mode gic emulation, we miss out on
>> interrupt events which are only available from kernel space, such as the timer.
>> This patch leverages the new kernel/user space pending line synchronization for
>> timer events. It does not handle PMU events yet.
>>
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> Reviewed-by: Andrew Jones <drjones@redhat.com>
>>
>> ---
>>
>> v1 -> v2:
>>
>>    - whitespace fixes
>>    - use !! to determine whether bit is set
>>    - call in-kernel device IRQs out by their name everywhere
>>
>> v2 -> v3:
>>
>>    - fix last occurrence of calling out timer IRQs explicitly
>> ---
>>   accel/kvm/kvm-all.c    |  5 +++++
>>   accel/stubs/kvm-stub.c |  5 +++++
>>   hw/intc/arm_gic.c      |  7 +++++++
>>   include/sysemu/kvm.h   | 11 +++++++++++
>>   target/arm/cpu.h       |  3 +++
>>   target/arm/kvm.c       | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>   6 files changed, 82 insertions(+)
>>
> 
> Tried testing this on a gicv3 machine, a ThunderX2. The guest kernel

Did you patch QEMU to automatically choose the gic version? The upstream 
default is to have gicv2 as the guest gic type. And gicv2 should work 
just fine.

I have seen issues with gicv3 emulation in user space, yes. I guess we 
don't have a channel to properly trap the MSRs into user space yet.


Alex
Andrew Jones June 28, 2017, 1:43 p.m. UTC | #3
On Wed, Jun 28, 2017 at 02:36:17PM +0200, Alexander Graf wrote:
> 
> 
> On 28.06.17 13:51, Andrew Jones wrote:
> > On Tue, Jun 27, 2017 at 05:35:37PM +0200, Alexander Graf wrote:
> > > When running with KVM enabled, you can choose between emulating the
> > > gic in kernel or user space. If the kernel supports in-kernel virtualization
> > > of the interrupt controller, it will default to that. If not, it will
> > > default to user space emulation.
> > > 
> > > Unfortunately when running in user mode gic emulation, we miss out on
> > > interrupt events which are only available from kernel space, such as the timer.
> > > This patch leverages the new kernel/user space pending line synchronization for
> > > timer events. It does not handle PMU events yet.
> > > 
> > > Signed-off-by: Alexander Graf <agraf@suse.de>
> > > Reviewed-by: Andrew Jones <drjones@redhat.com>
> > > 
> > > ---
> > > 
> > > v1 -> v2:
> > > 
> > >    - whitespace fixes
> > >    - use !! to determine whether bit is set
> > >    - call in-kernel device IRQs out by their name everywhere
> > > 
> > > v2 -> v3:
> > > 
> > >    - fix last occurrence of calling out timer IRQs explicitly
> > > ---
> > >   accel/kvm/kvm-all.c    |  5 +++++
> > >   accel/stubs/kvm-stub.c |  5 +++++
> > >   hw/intc/arm_gic.c      |  7 +++++++
> > >   include/sysemu/kvm.h   | 11 +++++++++++
> > >   target/arm/cpu.h       |  3 +++
> > >   target/arm/kvm.c       | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++
> > >   6 files changed, 82 insertions(+)
> > > 
> > 
> > Tried testing this on a gicv3 machine, a ThunderX2. The guest kernel
> 
> Did you patch QEMU to automatically choose the gic version?

Nope, I was just trying to use a pre-existing guest config on that host,
which had gic-version=3 on its command line.

> The upstream
> default is to have gicv2 as the guest gic type. And gicv2 should work just
> fine.

Yup, works for me now (with its limitations - had to reduce the number of
cpus the pre-existing guest config had configured to 8.)

Thanks,
drew
Peter Maydell June 29, 2017, 3:22 p.m. UTC | #4
On 27 June 2017 at 16:35, Alexander Graf <agraf@suse.de> wrote:
> When running with KVM enabled, you can choose between emulating the
> gic in kernel or user space. If the kernel supports in-kernel virtualization
> of the interrupt controller, it will default to that. If not, it will
> default to user space emulation.
>
> Unfortunately when running in user mode gic emulation, we miss out on
> interrupt events which are only available from kernel space, such as the timer.
> This patch leverages the new kernel/user space pending line synchronization for
> timer events. It does not handle PMU events yet.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> Reviewed-by: Andrew Jones <drjones@redhat.com>



Applied to target-arm.next, thanks.

-- PMM

Patch

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 75feffa..ade32ea 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2285,6 +2285,11 @@  int kvm_has_intx_set_mask(void)
     return kvm_state->intx_set_mask;
 }
 
+bool kvm_arm_supports_user_irq(void)
+{
+    return kvm_check_extension(kvm_state, KVM_CAP_ARM_USER_IRQ);
+}
+
 #ifdef KVM_CAP_SET_GUEST_DEBUG
 struct kvm_sw_breakpoint *kvm_find_sw_breakpoint(CPUState *cpu,
                                                  target_ulong pc)
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index ef0c734..3965c52 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -155,4 +155,9 @@  void kvm_init_cpu_signals(CPUState *cpu)
 {
     abort();
 }
+
+bool kvm_arm_supports_user_irq(void)
+{
+    return false;
+}
 #endif
diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index b305d90..5a0e2a3 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -25,6 +25,7 @@ 
 #include "qom/cpu.h"
 #include "qemu/log.h"
 #include "trace.h"
+#include "sysemu/kvm.h"
 
 /* #define DEBUG_GIC */
 
@@ -1412,6 +1413,12 @@  static void arm_gic_realize(DeviceState *dev, Error **errp)
         return;
     }
 
+    if (kvm_enabled() && !kvm_arm_supports_user_irq()) {
+        error_setg(errp, "KVM with user space irqchip only works when the "
+                         "host kernel supports KVM_CAP_ARM_USER_IRQ");
+        return;
+    }
+
     /* This creates distributor and main CPU interface (s->cpuiomem[0]) */
     gic_init_irqs_and_mmio(s, gic_set_irq, gic_ops);
 
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 1e91613..9f11fc0 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -227,6 +227,17 @@  int kvm_init_vcpu(CPUState *cpu);
 int kvm_cpu_exec(CPUState *cpu);
 int kvm_destroy_vcpu(CPUState *cpu);
 
+/**
+ * kvm_arm_supports_user_irq
+ *
+ * Not all KVM implementations support notifications for kernel generated
+ * interrupt events to user space. This function indicates whether the current
+ * KVM implementation does support them.
+ *
+ * Returns: true if KVM supports using kernel generated IRQs from user space
+ */
+bool kvm_arm_supports_user_irq(void);
+
 #ifdef NEED_CPU_H
 #include "cpu.h"
 
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 16a1e59..102c58a 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -706,6 +706,9 @@  struct ARMCPU {
     void *el_change_hook_opaque;
 
     int32_t node_id; /* NUMA node this CPU belongs to */
+
+    /* Used to synchronize KVM and QEMU in-kernel device levels */
+    uint8_t device_irq_level;
 };
 
 static inline ARMCPU *arm_env_get_cpu(CPUARMState *env)
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 4555468..7c17f0d 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -174,6 +174,12 @@  int kvm_arch_init(MachineState *ms, KVMState *s)
      */
     kvm_async_interrupts_allowed = true;
 
+    /*
+     * PSCI wakes up secondary cores, so we always need to
+     * have vCPUs waiting in kernel space
+     */
+    kvm_halt_in_kernel_allowed = true;
+
     cap_has_mp_state = kvm_check_extension(s, KVM_CAP_MP_STATE);
 
     type_register_static(&host_arm_cpu_type_info);
@@ -528,6 +534,51 @@  void kvm_arch_pre_run(CPUState *cs, struct kvm_run *run)
 
 MemTxAttrs kvm_arch_post_run(CPUState *cs, struct kvm_run *run)
 {
+    ARMCPU *cpu;
+    uint32_t switched_level;
+
+    if (kvm_irqchip_in_kernel()) {
+        /*
+         * We only need to sync timer states with user-space interrupt
+         * controllers, so return early and save cycles if we don't.
+         */
+        return MEMTXATTRS_UNSPECIFIED;
+    }
+
+    cpu = ARM_CPU(cs);
+
+    /* Synchronize our shadowed in-kernel device irq lines with the kvm ones */
+    if (run->s.regs.device_irq_level != cpu->device_irq_level) {
+        switched_level = cpu->device_irq_level ^ run->s.regs.device_irq_level;
+
+        qemu_mutex_lock_iothread();
+
+        if (switched_level & KVM_ARM_DEV_EL1_VTIMER) {
+            qemu_set_irq(cpu->gt_timer_outputs[GTIMER_VIRT],
+                         !!(run->s.regs.device_irq_level &
+                            KVM_ARM_DEV_EL1_VTIMER));
+            switched_level &= ~KVM_ARM_DEV_EL1_VTIMER;
+        }
+
+        if (switched_level & KVM_ARM_DEV_EL1_PTIMER) {
+            qemu_set_irq(cpu->gt_timer_outputs[GTIMER_PHYS],
+                         !!(run->s.regs.device_irq_level &
+                            KVM_ARM_DEV_EL1_PTIMER));
+            switched_level &= ~KVM_ARM_DEV_EL1_PTIMER;
+        }
+
+        /* XXX PMU IRQ is missing */
+
+        if (switched_level) {
+            qemu_log_mask(LOG_UNIMP, "%s: unhandled in-kernel device IRQ %x\n",
+                          __func__, switched_level);
+        }
+
+        /* We also mark unknown levels as processed to not waste cycles */
+        cpu->device_irq_level = run->s.regs.device_irq_level;
+        qemu_mutex_unlock_iothread();
+    }
+
     return MEMTXATTRS_UNSPECIFIED;
 }
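
For readers who want to see the level synchronization from kvm_arch_post_run() in isolation, here is a minimal standalone sketch of the same XOR-based idea. It is not QEMU code: the `demo_` names, the struct, and the bit assignments are hypothetical stand-ins chosen for illustration, and the boolean fields take the place of the qemu_set_irq() calls in the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit assignments, for illustration only. */
#define DEMO_DEV_VTIMER (1u << 0)
#define DEMO_DEV_PTIMER (1u << 1)

/* Hypothetical stand-in for the per-vCPU state kept in user space. */
struct demo_vcpu {
    uint8_t shadow_level; /* last device level bitmap user space has seen */
    bool vtimer_line;     /* stand-in for the virtual timer output line */
    bool ptimer_line;     /* stand-in for the physical timer output line */
};

/*
 * Compare the level bitmap reported by the kernel with the shadow copy,
 * forward every known line that changed, and return the mask of changed
 * bits that no handler claimed (0 if everything was handled).
 */
static uint32_t demo_sync_levels(struct demo_vcpu *vcpu, uint8_t kernel_level)
{
    /* XOR yields exactly the bits that differ since the last sync. */
    uint32_t switched = vcpu->shadow_level ^ kernel_level;

    if (switched & DEMO_DEV_VTIMER) {
        /* !! collapses the masked value to 0 or 1 */
        vcpu->vtimer_line = !!(kernel_level & DEMO_DEV_VTIMER);
        switched &= ~DEMO_DEV_VTIMER;
    }

    if (switched & DEMO_DEV_PTIMER) {
        vcpu->ptimer_line = !!(kernel_level & DEMO_DEV_PTIMER);
        switched &= ~DEMO_DEV_PTIMER;
    }

    if (switched) {
        fprintf(stderr, "unhandled device IRQ bits %x\n", (unsigned)switched);
    }

    /* Unknown bits are recorded too, so they are reported only once. */
    vcpu->shadow_level = kernel_level;
    return switched;
}
```

Calling demo_sync_levels() twice with the same bitmap does no work the second time, because the XOR against the updated shadow copy is zero. That is the same reason the patch stores run->s.regs.device_irq_level into cpu->device_irq_level even when some bits were not handled.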