[2/3] KVM: arm/arm64: Add ARM arch timer interrupts ABI

Message ID 20160927190806.22988-3-christoffer.dall@linaro.org (mailing list archive)
State New, archived

Commit Message

Christoffer Dall Sept. 27, 2016, 7:08 p.m. UTC
From: Alexander Graf <agraf@suse.de>

We have 2 modes for dealing with interrupts in the ARM world. We can
either handle them all using hardware acceleration through the vgic or
we can emulate a gic in user space and only drive CPU IRQ pins from
there.

Unfortunately, when driving IRQs from user space, we never tell user
space about timer events that may result in interrupt line state
changes, so we lose out on timer events if we run with user space gic
emulation.

Define an ABI to publish the timer output level to userspace.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 Documentation/virtual/kvm/api.txt | 29 +++++++++++++++++++++++++++++
 arch/arm/include/uapi/asm/kvm.h   |  2 ++
 arch/arm64/include/uapi/asm/kvm.h |  2 ++
 include/uapi/linux/kvm.h          |  6 ++++++
 4 files changed, 39 insertions(+)

Comments

Peter Maydell Nov. 1, 2016, 11:26 a.m. UTC | #1
On 27 September 2016 at 20:08, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> From: Alexander Graf <agraf@suse.de>
>
> We have 2 modes for dealing with interrupts in the ARM world. We can
> either handle them all using hardware acceleration through the vgic or
> we can emulate a gic in user space and only drive CPU IRQ pins from
> there.
>
> Unfortunately, when driving IRQs from user space, we never tell user
> space about timer events that may result in interrupt line state
> changes, so we lose out on timer events if we run with user space gic
> emulation.
>
> Define an ABI to publish the timer output level to userspace.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  Documentation/virtual/kvm/api.txt | 29 +++++++++++++++++++++++++++++
>  arch/arm/include/uapi/asm/kvm.h   |  2 ++
>  arch/arm64/include/uapi/asm/kvm.h |  2 ++
>  include/uapi/linux/kvm.h          |  6 ++++++
>  4 files changed, 39 insertions(+)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 739db9a..2adf600 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -3928,3 +3928,32 @@ In order to use SynIC, it has to be activated by setting this
>  capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
>  will disable the use of APIC hardware virtualization even if supported
>  by the CPU, as it's incompatible with SynIC auto-EOI behavior.
> +
> +8.3 KVM_CAP_ARM_TIMER
> +
> +Architectures: arm, arm64
> +This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
> +that if userspace creates a VM without an in-kernel interrupt controller, it
> +will be notified of changes to the output level of ARM architected timers
> +presented to the VM.  For such VMs, on every return to userspace, the kernel
> +updates the vcpu's run->s.regs.timer_irq_level field to represent the actual
> +output level of the timers.
> +
> +Whenever kvm detects a change in the timer output level, kvm guarantees at
> +least one return to userspace before running the VM.  This exit could either
> +be a KVM_EXIT_INTR or any other exit event, like KVM_EXIT_MMIO. This way,
> +userspace can always sample the timer output level and re-compute the state of
> +the userspace interrupt controller.  Userspace should always check the state
> +of run->s.regs.timer_irq_level on every kvm exit.  The value in
> +run->s.regs.timer_irq_level should be considered a level triggered interrupt
> +signal.
> +
> +The field run->s.regs.timer_irq_level is available independent of
> +run->kvm_valid_regs or run->kvm_dirty_regs bits.
> +
> +Currently the following bits are defined for the timer_irq_level bitmap:
> +
> +    KVM_ARM_TIMER_VTIMER  -  virtual timer
> +
> +Future versions of kvm may implement additional timer events. These will get
> +indicated by additional KVM_CAP extensions.

This API looks good to me generally. My only question is whether we
want to name the struct fields so they're not specifically talking
about timer interrupts. For instance we probably want to expose the
vPMU interrupt line to userspace too. We could do that by adding another
struct field pmu_irq_level, but we could equally just assign it a bit
in the existing irq_level field.

Possible current and future outbound interrupt lines (some of these
would only show up in some unlikely or lots-of-implementation-needed
cases, I'm just trying to produce an exhaustive list):
 * virtual timer
 * physical timer
 * hyp timer (nested virtualization case)
 * secure timer (unlikely but maybe if EL3 is ever supported inside a VM)
 * gic maintenance interrupt (nested virt again)
 * PMU interrupt

The kernel doesn't know which interrupt number these would be wired
up to, so they're all just arbitrary outputs, and you could put them
in one field or split them up into multiple fields, it doesn't make
much difference.

thanks
-- PMM
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoffer Dall Nov. 1, 2016, 2:50 p.m. UTC | #2
On Tue, Nov 01, 2016 at 11:26:54AM +0000, Peter Maydell wrote:
> On 27 September 2016 at 20:08, Christoffer Dall
> <christoffer.dall@linaro.org> wrote:
> > From: Alexander Graf <agraf@suse.de>
> >
> > We have 2 modes for dealing with interrupts in the ARM world. We can
> > either handle them all using hardware acceleration through the vgic or
> > we can emulate a gic in user space and only drive CPU IRQ pins from
> > there.
> >
> > Unfortunately, when driving IRQs from user space, we never tell user
> > space about timer events that may result in interrupt line state
> > changes, so we lose out on timer events if we run with user space gic
> > emulation.
> >
> > Define an ABI to publish the timer output level to userspace.
> >
> > Signed-off-by: Alexander Graf <agraf@suse.de>
> > Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > ---
> >  Documentation/virtual/kvm/api.txt | 29 +++++++++++++++++++++++++++++
> >  arch/arm/include/uapi/asm/kvm.h   |  2 ++
> >  arch/arm64/include/uapi/asm/kvm.h |  2 ++
> >  include/uapi/linux/kvm.h          |  6 ++++++
> >  4 files changed, 39 insertions(+)
> >
> > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> > index 739db9a..2adf600 100644
> > --- a/Documentation/virtual/kvm/api.txt
> > +++ b/Documentation/virtual/kvm/api.txt
> > @@ -3928,3 +3928,32 @@ In order to use SynIC, it has to be activated by setting this
> >  capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
> >  will disable the use of APIC hardware virtualization even if supported
> >  by the CPU, as it's incompatible with SynIC auto-EOI behavior.
> > +
> > +8.3 KVM_CAP_ARM_TIMER
> > +
> > +Architectures: arm, arm64
> > +This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
> > +that if userspace creates a VM without an in-kernel interrupt controller, it
> > +will be notified of changes to the output level of ARM architected timers
> > +presented to the VM.  For such VMs, on every return to userspace, the kernel
> > +updates the vcpu's run->s.regs.timer_irq_level field to represent the actual
> > +output level of the timers.
> > +
> > +Whenever kvm detects a change in the timer output level, kvm guarantees at
> > +least one return to userspace before running the VM.  This exit could either
> > +be a KVM_EXIT_INTR or any other exit event, like KVM_EXIT_MMIO. This way,
> > +userspace can always sample the timer output level and re-compute the state of
> > +the userspace interrupt controller.  Userspace should always check the state
> > +of run->s.regs.timer_irq_level on every kvm exit.  The value in
> > +run->s.regs.timer_irq_level should be considered a level triggered interrupt
> > +signal.
> > +
> > +The field run->s.regs.timer_irq_level is available independent of
> > +run->kvm_valid_regs or run->kvm_dirty_regs bits.
> > +
> > +Currently the following bits are defined for the timer_irq_level bitmap:
> > +
> > +    KVM_ARM_TIMER_VTIMER  -  virtual timer
> > +
> > +Future versions of kvm may implement additional timer events. These will get
> > +indicated by additional KVM_CAP extensions.
> 
> This API looks good to me generally. My only question is whether we
> want to name the struct fields so they're not specifically talking
> about timer interrupts. For instance we probably want to expose the
> vPMU interrupt line to userspace too. We could do that by adding another
> struct field pmu_irq_level, but we could equally just assign it a bit
> in the existing irq_level field.
> 
> Possible current and future outbound interrupt lines (some of these
> would only show up in some unlikely or lots-of-implementation-needed
> cases, I'm just trying to produce an exhaustive list):
>  * virtual timer
>  * physical timer
>  * hyp timer (nested virtualization case)
>  * secure timer (unlikely but maybe if EL3 is ever supported inside a VM)
>  * gic maintenance interrupt (nested virt again)
>  * PMU interrupt

Thanks for the list, that's good to have around for the future.

There's also the potential of the EL2 virtual timer for nested VHE
support, right?

> 
> The kernel doesn't know which interrupt number these would be wired
> up to, so they're all just arbitrary outputs, and you could put them
> in one field or split them up into multiple fields, it doesn't make
> much difference.
> 

So if we keep this we're kind of suggesting that we'll have a field per
device type later on.  Since this is a u8 and we are talking about up to
5 timers already, there's not much waste currently, and we have plenty
of padding.  I suppose we can always add an 'other_devices' thing later,
so I prefer just sticking with this one for now.

Thanks,
-Christoffer
Peter Maydell Nov. 1, 2016, 2:54 p.m. UTC | #3
On 1 November 2016 at 14:50, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> On Tue, Nov 01, 2016 at 11:26:54AM +0000, Peter Maydell wrote:
>> Possible current and future outbound interrupt lines (some of these
>> would only show up in some unlikely or lots-of-implementation-needed
>> cases, I'm just trying to produce an exhaustive list):
>>  * virtual timer
>>  * physical timer
>>  * hyp timer (nested virtualization case)
>>  * secure timer (unlikely but maybe if EL3 is ever supported inside a VM)
>>  * gic maintenance interrupt (nested virt again)
>>  * PMU interrupt
>
> Thanks for the list, that's good to have around for the future.
>
> There's also the potential of the EL2 virtual timer for nested VHE
> support, right?

That's the one I meant by "hyp timer".

>> The kernel doesn't know which interrupt number these would be wired
>> up to, so they're all just arbitrary outputs, and you could put them
>> in one field or split them up into multiple fields, it doesn't make
>> much difference.
>>
>
> So if we keep this we're kind of suggesting that we'll have a field per
> device type later on.  Since this is a u8 and we are talking about up to
> 5 timers already,

4.

> there's not much waste currently, and we have plenty
> of padding.  I suppose we can always add an 'other_devices' thing later,
> so I prefer just sticking with this one for now.

Yeah, it's not a big deal either way.

thanks
-- PMM
Christoffer Dall Nov. 1, 2016, 3:32 p.m. UTC | #4
On Tue, Nov 01, 2016 at 02:54:11PM +0000, Peter Maydell wrote:
> On 1 November 2016 at 14:50, Christoffer Dall
> <christoffer.dall@linaro.org> wrote:
> > On Tue, Nov 01, 2016 at 11:26:54AM +0000, Peter Maydell wrote:
> >> Possible current and future outbound interrupt lines (some of these
> >> would only show up in some unlikely or lots-of-implementation-needed
> >> cases, I'm just trying to produce an exhaustive list):
> >>  * virtual timer
> >>  * physical timer
> >>  * hyp timer (nested virtualization case)
> >>  * secure timer (unlikely but maybe if EL3 is ever supported inside a VM)
> >>  * gic maintenance interrupt (nested virt again)
> >>  * PMU interrupt
> >
> > Thanks for the list, that's good to have around for the future.
> >
> > There's also the potential of the EL2 virtual timer for nested VHE
> > support, right?
> 
> That's the one I meant by "hyp timer".
> 

There's the hyp timer, and then there's the ARMv8.1 virtual hyp timer.

> >> The kernel doesn't know which interrupt number these would be wired
> >> up to, so they're all just arbitrary outputs, and you could put them
> >> in one field or split them up into multiple fields, it doesn't make
> >> much difference.
> >>
> >
> > So if we keep this we're kind of suggesting that we'll have a field per
> > device type later on.  Since this is a u8 and we are talking about up to
> > 5 timers already,
> 
> 4.
> 

virtual
physical
hyp
virtual hyp
secure

Am I missing something?

-Christoffer
Marc Zyngier Nov. 1, 2016, 4:56 p.m. UTC | #5
On Tue, Nov 01 2016 at 02:54:11 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 1 November 2016 at 14:50, Christoffer Dall
> <christoffer.dall@linaro.org> wrote:
>> On Tue, Nov 01, 2016 at 11:26:54AM +0000, Peter Maydell wrote:
>>> Possible current and future outbound interrupt lines (some of these
>>> would only show up in some unlikely or lots-of-implementation-needed
>>> cases, I'm just trying to produce an exhaustive list):
>>>  * virtual timer
>>>  * physical timer
>>>  * hyp timer (nested virtualization case)
>>>  * secure timer (unlikely but maybe if EL3 is ever supported inside a VM)
>>>  * gic maintenance interrupt (nested virt again)
>>>  * PMU interrupt
>>
>> Thanks for the list, that's good to have around for the future.
>>
>> There's also the potential of the EL2 virtual timer for nested VHE
>> support, right?
>
> That's the one I meant by "hyp timer".

VHE also adds an extra virtual timer, for symmetry with what EL1
provides (and on which CNTVOFF doesn't have any effect) - see section
B8.1.1 of the ARMv8.1 addendum. So we effectively have:

- Secure physical EL3
- Non-secure physical EL1
- Non-secure virtual EL1
- Non-secure physical EL2
- Non-secure virtual EL2

Thanks,

	M.
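
For reference, the five timers listed above would still fit in the u8
bitmap proposed by the patch. The sketch below is only an illustration:
KVM_ARM_TIMER_VTIMER is the one bit the patch defines, while every EX_*
name and bit position is hypothetical and not part of any proposed ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Defined by the patch: bit for run->s.regs.timer_irq_level */
#define KVM_ARM_TIMER_VTIMER	(1 << 0)	/* Non-secure virtual EL1 */

/* Hypothetical bits, invented here purely to count the lines */
#define EX_TIMER_PTIMER		(1 << 1)	/* Non-secure physical EL1 */
#define EX_TIMER_HYP_PTIMER	(1 << 2)	/* Non-secure physical EL2 */
#define EX_TIMER_HYP_VTIMER	(1 << 3)	/* Non-secure virtual EL2 */
#define EX_TIMER_SEC_PTIMER	(1 << 4)	/* Secure physical EL3 */

#define EX_TIMER_ALL	(KVM_ARM_TIMER_VTIMER | EX_TIMER_PTIMER | \
			 EX_TIMER_HYP_PTIMER | EX_TIMER_HYP_VTIMER | \
			 EX_TIMER_SEC_PTIMER)

/* The whole set has to fit in the u8 field added by the patch. */
static_assert(EX_TIMER_ALL <= UINT8_MAX, "bitmap must fit in a u8");

/* Count how many output lines are set in a level bitmap. */
static unsigned int count_lines(uint8_t mask)
{
	unsigned int n = 0;

	while (mask) {
		n += mask & 1u;
		mask >>= 1;
	}
	return n;
}
```

With this layout three bits of the u8 remain unused, so other outputs
such as a PMU line would need either those spare bits or a separate
field.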
Patch

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 739db9a..2adf600 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3928,3 +3928,32 @@  In order to use SynIC, it has to be activated by setting this
 capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
 will disable the use of APIC hardware virtualization even if supported
 by the CPU, as it's incompatible with SynIC auto-EOI behavior.
+
+8.3 KVM_CAP_ARM_TIMER
+
+Architectures: arm, arm64
+This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
+that if userspace creates a VM without an in-kernel interrupt controller, it
+will be notified of changes to the output level of ARM architected timers
+presented to the VM.  For such VMs, on every return to userspace, the kernel
+updates the vcpu's run->s.regs.timer_irq_level field to represent the actual
+output level of the timers.
+
+Whenever kvm detects a change in the timer output level, kvm guarantees at
+least one return to userspace before running the VM.  This exit could either
+be a KVM_EXIT_INTR or any other exit event, like KVM_EXIT_MMIO. This way,
+userspace can always sample the timer output level and re-compute the state of
+the userspace interrupt controller.  Userspace should always check the state
+of run->s.regs.timer_irq_level on every kvm exit.  The value in
+run->s.regs.timer_irq_level should be considered a level triggered interrupt
+signal.
+
+The field run->s.regs.timer_irq_level is available independent of
+run->kvm_valid_regs or run->kvm_dirty_regs bits.
+
+Currently the following bits are defined for the timer_irq_level bitmap:
+
+    KVM_ARM_TIMER_VTIMER  -  virtual timer
+
+Future versions of kvm may implement additional timer events. These will get
+indicated by additional KVM_CAP extensions.
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index b38c10c..23c2e77 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -112,6 +112,8 @@  struct kvm_debug_exit_arch {
 };
 
 struct kvm_sync_regs {
+	/* Used with KVM_CAP_ARM_TIMER */
+	u8 timer_irq_level;
 };
 
 struct kvm_arch_memory_slot {
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 3051f86..411d62a 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -143,6 +143,8 @@  struct kvm_debug_exit_arch {
 #define KVM_GUESTDBG_USE_HW		(1 << 17)
 
 struct kvm_sync_regs {
+	/* Used with KVM_CAP_ARM_TIMER */
+	u8 timer_irq_level;
 };
 
 struct kvm_arch_memory_slot {
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 300ef25..c293fc9 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -870,6 +870,7 @@  struct kvm_ppc_smmu_info {
 #define KVM_CAP_S390_USER_INSTR0 130
 #define KVM_CAP_MSI_DEVID 131
 #define KVM_CAP_PPC_HTM 132
+#define KVM_CAP_ARM_TIMER 133
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1327,4 +1328,9 @@  struct kvm_assigned_msix_entry {
 #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
 #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
 
+/* Available with KVM_CAP_ARM_TIMER */
+
+/* Bits for run->s.regs.timer_irq_level */
+#define KVM_ARM_TIMER_VTIMER		(1 << 0)
+
 #endif /* __LINUX_KVM_H */
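
As a rough sketch of how userspace might consume this ABI, the helper
below treats the value sampled from run->s.regs.timer_irq_level after
each KVM_RUN return as a level-triggered input to a userspace interrupt
controller model. Only KVM_ARM_TIMER_VTIMER comes from the patch;
struct user_gic and sync_vtimer_line() are hypothetical names, and the
bitmap is passed in as a plain uint8_t so the example does not depend on
kernel headers carrying this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Defined by the patch: bit for run->s.regs.timer_irq_level */
#define KVM_ARM_TIMER_VTIMER	(1 << 0)

/* Hypothetical stand-in for a userspace interrupt controller model. */
struct user_gic {
	bool vtimer_line;		/* current level of the vtimer input */
	unsigned int recomputes;	/* number of observed level changes */
};

/*
 * Call after every return from KVM_RUN, whatever the exit reason, with
 * the value read from run->s.regs.timer_irq_level.  The signal is
 * level triggered, so the line follows the bit in both directions.
 * Returns true if the line changed and the controller state needs to
 * be recomputed.
 */
static bool sync_vtimer_line(struct user_gic *gic, uint8_t timer_irq_level)
{
	bool level = (timer_irq_level & KVM_ARM_TIMER_VTIMER) != 0;

	assert(gic != NULL);

	if (gic->vtimer_line == level)
		return false;

	gic->vtimer_line = level;
	gic->recomputes++;
	return true;
}
```

Because the field is a level rather than an event, a deassertion is
propagated the same way as an assertion, and sampling it again on an
unrelated exit such as KVM_EXIT_MMIO is harmless.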