Message ID | 55E87E1F.7030702@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Sep 03, 2015 at 06:06:39PM +0100, Marc Zyngier wrote: > On 30/08/15 14:54, Christoffer Dall wrote: > > The arch timer currently uses edge-triggered semantics in the sense that > > the line is never sampled by the vgic and lowering the line from the > > timer to the vgic doesn't have any affect on the pending state of > > virtual interrupts in the vgic. This means that we do not support a > > guest with the otherwise valid behavior of (1) disable interrupts (2) > > enable the timer (3) disable the timer (4) enable interrupts. Such a > > guest would validly not expect to see any interrupts on real hardware, > > but will see interrupts on KVM. > > > > This patches fixes this shortcoming through the following series of > > changes. > > > > First, we change the flow of the timer/vgic sync/flush operations. Now > > the timer is always flushed/synced before the vgic, because the vgic > > samples the state of the timer output. This has the implication that we > > move the timer operations in to non-preempible sections, but that is > > fine after the previous commit getting rid of hrtimer schedules on every > > entry/exit. > > > > Second, we change the internal behavior of the timer, letting the timer > > keep track of its previous output state, and only lower/raise the line > > to the vgic when the state changes. Note that in theory this could have > > been accomplished more simply by signalling the vgic every time the > > state *potentially* changed, but we don't want to be hitting the vgic > > more often than necessary. > > > > Third, we get rid of the use of the map->active field in the vgic and > > instead simply set the interrupt as active on the physical distributor > > whenever we signal a mapped interrupt to the guest, and we reset the > > active state when we sync back the HW state from the vgic. > > > > Fourth, and finally, we now initialize the timer PPIs (and all the other > > unused PPIs for now), to be level-triggered, and modify the sync code to > > sample the line state on HW sync and re-inject a new interrupt if it is > > still pending at that time. > > > > Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> > > --- > > arch/arm/kvm/arm.c | 11 +++++-- > > include/kvm/arm_arch_timer.h | 2 +- > > include/kvm/arm_vgic.h | 3 -- > > virt/kvm/arm/arch_timer.c | 68 +++++++++++++++++++++++++++++++------------- > > virt/kvm/arm/vgic.c | 67 +++++++++++++++---------------------------- > > 5 files changed, 81 insertions(+), 70 deletions(-) > > > > diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c > > index bdf8871..102a4aa 100644 > > --- a/arch/arm/kvm/arm.c > > +++ b/arch/arm/kvm/arm.c > > @@ -561,9 +561,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) > > > > if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) { > > local_irq_enable(); > > + kvm_timer_sync_hwstate(vcpu); > > kvm_vgic_sync_hwstate(vcpu); > > preempt_enable(); > > - kvm_timer_sync_hwstate(vcpu); > > continue; > > } > > > > @@ -608,12 +608,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) > > kvm_guest_exit(); > > trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu)); > > > > + /* > > + * We must sync the timer state before the vgic state so that > > + * the vgic can properly sample the updated state of the > > + * interrupt line. > > + */ > > + kvm_timer_sync_hwstate(vcpu); > > + > > kvm_vgic_sync_hwstate(vcpu); > > > > preempt_enable(); > > > > - kvm_timer_sync_hwstate(vcpu); > > - > > ret = handle_exit(vcpu, run, ret); > > } > > > > diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h > > index ef14cc1..1800227 100644 > > --- a/include/kvm/arm_arch_timer.h > > +++ b/include/kvm/arm_arch_timer.h > > @@ -51,7 +51,7 @@ struct arch_timer_cpu { > > bool armed; > > > > /* Timer IRQ */ > > - const struct kvm_irq_level *irq; > > + struct kvm_irq_level irq; > > > > /* VGIC mapping */ > > struct irq_phys_map *map; > > diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h > > index d901f1a..99011a0 100644 > > --- a/include/kvm/arm_vgic.h > > +++ b/include/kvm/arm_vgic.h > > @@ -163,7 +163,6 @@ struct irq_phys_map { > > u32 virt_irq; > > u32 phys_irq; > > u32 irq; > > - bool active; > > }; > > > > struct irq_phys_map_entry { > > @@ -358,8 +357,6 @@ int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu); > > struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, > > int virt_irq, int irq); > > int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map); > > -bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map); > > -void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active); > > > > #define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel)) > > #define vgic_initialized(k) (!!((k)->arch.vgic.nr_cpus)) > > diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c > > index 018f3d6..747302f 100644 > > --- a/virt/kvm/arm/arch_timer.c > > +++ b/virt/kvm/arm/arch_timer.c > > @@ -59,18 +59,6 @@ static void timer_disarm(struct arch_timer_cpu *timer) > > } > > } > > > > -static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu) > > -{ > > - int ret; > > - struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; > > - > > - kvm_vgic_set_phys_irq_active(timer->map, true); > > - ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id, > > - timer->map, > > - timer->irq->level); > > - WARN_ON(ret); > > -} > > - > > static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id) > > { > > struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)dev_id; > > @@ -116,8 +104,7 @@ static bool kvm_timer_irq_enabled(struct kvm_vcpu *vcpu) > > struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; > > > > return !(timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) && > > - (timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) && > > - !kvm_vgic_get_phys_irq_active(timer->map); > > + (timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE); > > } > > > > bool kvm_timer_should_fire(struct kvm_vcpu *vcpu) > > @@ -134,6 +121,45 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu) > > return cval <= now; > > } > > > > +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu) > > +{ > > + int ret; > > + struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; > > + > > + BUG_ON(!vgic_initialized(vcpu->kvm)); > > + > > + ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id, > > + timer->map, > > + timer->irq.level); > > + WARN_ON(ret); > > +} > > + > > +/* > > + * Check if there was a change in the timer state (should we raise or lower > > + * the line level to the GIC). > > + */ > > +static void kvm_timer_update_state(struct kvm_vcpu *vcpu) > > +{ > > + struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; > > + > > + /* > > + * If userspace modified the timer registers via SET_ONE_REG before > > + * the vgic was initialized, we mustn't set the timer->irq.level value > > + * because the guest would never see the interrupt. Instead wait > > + * until we call this funciton from kvm_timer_flush_hwstate. > > + */ > > + if (!vgic_initialized(vcpu->kvm)) > > + return; > > + > > + if (kvm_timer_should_fire(vcpu) && !timer->irq.level) { > > + timer->irq.level = 1; > > + kvm_timer_update_irq(vcpu); > > + } else if (!kvm_timer_should_fire(vcpu) && timer->irq.level) { > > + timer->irq.level = 0; > > + kvm_timer_update_irq(vcpu); > > + } > > +} > > + > > It took me ages to parse this, so I rewrote it to match my understanding: > > diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c > index 8a0fdfc..a722f0f 100644 > --- a/virt/kvm/arm/arch_timer.c > +++ b/virt/kvm/arm/arch_timer.c > @@ -121,13 +121,14 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu) > return cval <= now; > } > > -static void kvm_timer_update_irq(struct kvm_vcpu *vcpu) > +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_state) > { > int ret; > struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; > > BUG_ON(!vgic_initialized(vcpu->kvm)); > > + timer->irq.level = new_state; > ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id, > timer->map, > timer->irq.level); > @@ -151,13 +152,8 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu) > if (!vgic_initialized(vcpu->kvm)) > return; > > - if (kvm_timer_should_fire(vcpu) && !timer->irq.level) { > - timer->irq.level = 1; > - kvm_timer_update_irq(vcpu); > - } else if (!kvm_timer_should_fire(vcpu) && timer->irq.level) { > - timer->irq.level = 0; > - kvm_timer_update_irq(vcpu); > - } > + if (kvm_timer_should_fire(vcpu) != timer->irq.level) > + kvm_timer_update_irq(vcpu, !timer->irq.level); > } > > /* > > Did I get it right? almost, you'd have to assign timer->irq.level after you check for it though, right? > > > /* > > * Schedule the background timer before calling kvm_vcpu_block, so that this > > * thread is removed from its waitqueue and made runnable when there's a timer > > @@ -191,8 +217,7 @@ void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu) > > * If the timer expired while we were not scheduled, now is the time > > * to inject it. > > */ > > - if (kvm_timer_should_fire(vcpu)) > > - kvm_timer_inject_irq(vcpu); > > + kvm_timer_update_state(vcpu); > > } > > > > /** > > @@ -208,8 +233,11 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu) > > > > BUG_ON(timer_is_armed(timer)); > > > > - if (kvm_timer_should_fire(vcpu)) > > - kvm_timer_inject_irq(vcpu); > > + /* > > + * The guest could have modified the timer registers or the timer > > + * could have expired, update the timer state. > > + */ > > + kvm_timer_update_state(vcpu); > > } > > > > int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu, > > @@ -224,7 +252,7 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu, > > * kvm_vcpu_set_target(). To handle this, we determine > > * vcpu timer irq number when the vcpu is reset. > > */ > > - timer->irq = irq; > > + timer->irq.irq = irq->irq; > > > > /* > > * Tell the VGIC that the virtual interrupt is tied to a > > @@ -269,6 +297,8 @@ int kvm_arm_timer_set_reg(struct kvm_vcpu *vcpu, u64 regid, u64 value) > > default: > > return -1; > > } > > + > > + kvm_timer_update_state(vcpu); > > return 0; > > } > > > > diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c > > index 9ed8d53..f4ea950 100644 > > --- a/virt/kvm/arm/vgic.c > > +++ b/virt/kvm/arm/vgic.c > > @@ -1422,34 +1422,43 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu) > > /* > > * Save the physical active state, and reset it to inactive. > > * > > - * Return 1 if HW interrupt went from active to inactive, and 0 otherwise. > > + * Return true if there's a pending level triggered interrupt line to queue. > > */ > > -static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr) > > +static bool vgic_sync_hwirq(struct kvm_vcpu *vcpu, int lr, struct vgic_lr vlr) > > { > > struct irq_phys_map *map; > > + bool phys_active; > > int ret; > > > > if (!(vlr.state & LR_HW)) > > return 0; > > > > map = vgic_irq_map_search(vcpu, vlr.irq); > > - BUG_ON(!map || !map->active); > > + BUG_ON(!map); > > > > ret = irq_get_irqchip_state(map->irq, > > IRQCHIP_STATE_ACTIVE, > > - &map->active); > > + &phys_active); > > > > WARN_ON(ret); > > > > - if (map->active) { > > + if (phys_active) { > > + /* > > + * Interrupt still marked as active on the physical > > + * distributor, so guest did not EOI it yet. Reset to > > + * non-active so that other VMs can see interrupts from this > > + * device. > > + */ > > ret = irq_set_irqchip_state(map->irq, > > IRQCHIP_STATE_ACTIVE, > > false); > > WARN_ON(ret); > > - return 0; > > + return false; > > } > > > > - return 1; > > + /* Mapped edge-triggered interrupts not yet supported. */ > > + WARN_ON(vgic_irq_is_edge(vcpu, vlr.irq)); > > Hmmm. What are we missing? > I don't know really, my brain ran out of memory, but it's not like we claimed to support this earlier and clearly we didn't work this well enough through. > > + return process_level_irq(vcpu, lr, vlr); > > } > > > > /* Sync back the VGIC state after a guest run */ > > @@ -1474,18 +1483,8 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu) > > continue; > > > > vlr = vgic_get_lr(vcpu, lr); > > - if (vgic_sync_hwirq(vcpu, vlr)) { > > - /* > > - * So this is a HW interrupt that the guest > > - * EOI-ed. Clean the LR state and allow the > > - * interrupt to be sampled again. > > - */ > > - vlr.state = 0; > > - vlr.hwirq = 0; > > - vgic_set_lr(vcpu, lr, vlr); > > - vgic_irq_clear_queued(vcpu, vlr.irq); > > - set_bit(lr, elrsr_ptr); > > - } > > + if (vgic_sync_hwirq(vcpu, lr, vlr)) > > + level_pending = true; > > > > if (!test_bit(lr, elrsr_ptr)) > > continue; > > @@ -1861,30 +1860,6 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu) > > } > > > > /** > > - * kvm_vgic_get_phys_irq_active - Return the active state of a mapped IRQ > > - * > > - * Return the logical active state of a mapped interrupt. This doesn't > > - * necessarily reflects the current HW state. > > - */ > > -bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map) > > -{ > > - BUG_ON(!map); > > - return map->active; > > -} > > - > > -/** > > - * kvm_vgic_set_phys_irq_active - Set the active state of a mapped IRQ > > - * > > - * Set the logical active state of a mapped interrupt. This doesn't > > - * immediately affects the HW state. > > - */ > > -void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active) > > -{ > > - BUG_ON(!map); > > - map->active = active; > > -} > > - > > -/** > > * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping > > * @vcpu: The VCPU pointer > > * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq > > @@ -2112,10 +2087,14 @@ int vgic_init(struct kvm *kvm) > > if (i < VGIC_NR_SGIS) > > vgic_bitmap_set_irq_val(&dist->irq_enabled, > > vcpu->vcpu_id, i, 1); > > - if (i < VGIC_NR_PRIVATE_IRQS) > > + if (i < VGIC_NR_SGIS) > > vgic_bitmap_set_irq_val(&dist->irq_cfg, > > vcpu->vcpu_id, i, > > VGIC_CFG_EDGE); > > + else if (i < VGIC_NR_PRIVATE_IRQS) /* PPIs */ > > + vgic_bitmap_set_irq_val(&dist->irq_cfg, > > + vcpu->vcpu_id, i, > > + VGIC_CFG_LEVEL); > > } > > > > vgic_enable(vcpu); > > > > My only real objection to this patch is that it puts my brain upside down. > Hopefully that won't last. > Yeah, I tried helping in the commit message, but I couldn't do much beyond that. Splitting up the patch further didn't really work out for me. Thanks for the review, -Christoffer -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/09/15 18:23, Christoffer Dall wrote: > On Thu, Sep 03, 2015 at 06:06:39PM +0100, Marc Zyngier wrote: >> On 30/08/15 14:54, Christoffer Dall wrote: >>> The arch timer currently uses edge-triggered semantics in the sense that >>> the line is never sampled by the vgic and lowering the line from the >>> timer to the vgic doesn't have any affect on the pending state of >>> virtual interrupts in the vgic. This means that we do not support a >>> guest with the otherwise valid behavior of (1) disable interrupts (2) >>> enable the timer (3) disable the timer (4) enable interrupts. Such a >>> guest would validly not expect to see any interrupts on real hardware, >>> but will see interrupts on KVM. >>> >>> This patches fixes this shortcoming through the following series of >>> changes. >>> >>> First, we change the flow of the timer/vgic sync/flush operations. Now >>> the timer is always flushed/synced before the vgic, because the vgic >>> samples the state of the timer output. This has the implication that we >>> move the timer operations in to non-preempible sections, but that is >>> fine after the previous commit getting rid of hrtimer schedules on every >>> entry/exit. >>> >>> Second, we change the internal behavior of the timer, letting the timer >>> keep track of its previous output state, and only lower/raise the line >>> to the vgic when the state changes. Note that in theory this could have >>> been accomplished more simply by signalling the vgic every time the >>> state *potentially* changed, but we don't want to be hitting the vgic >>> more often than necessary. >>> >>> Third, we get rid of the use of the map->active field in the vgic and >>> instead simply set the interrupt as active on the physical distributor >>> whenever we signal a mapped interrupt to the guest, and we reset the >>> active state when we sync back the HW state from the vgic. >>> >>> Fourth, and finally, we now initialize the timer PPIs (and all the other >>> unused PPIs for now), to be level-triggered, and modify the sync code to >>> sample the line state on HW sync and re-inject a new interrupt if it is >>> still pending at that time. >>> >>> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> >>> --- >>> arch/arm/kvm/arm.c | 11 +++++-- >>> include/kvm/arm_arch_timer.h | 2 +- >>> include/kvm/arm_vgic.h | 3 -- >>> virt/kvm/arm/arch_timer.c | 68 +++++++++++++++++++++++++++++++------------- >>> virt/kvm/arm/vgic.c | 67 +++++++++++++++---------------------------- >>> 5 files changed, 81 insertions(+), 70 deletions(-) >>> >>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c >>> index bdf8871..102a4aa 100644 >>> --- a/arch/arm/kvm/arm.c >>> +++ b/arch/arm/kvm/arm.c >>> @@ -561,9 +561,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) >>> >>> if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) { >>> local_irq_enable(); >>> + kvm_timer_sync_hwstate(vcpu); >>> kvm_vgic_sync_hwstate(vcpu); >>> preempt_enable(); >>> - kvm_timer_sync_hwstate(vcpu); >>> continue; >>> } >>> >>> @@ -608,12 +608,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) >>> kvm_guest_exit(); >>> trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu)); >>> >>> + /* >>> + * We must sync the timer state before the vgic state so that >>> + * the vgic can properly sample the updated state of the >>> + * interrupt line. >>> + */ >>> + kvm_timer_sync_hwstate(vcpu); >>> + >>> kvm_vgic_sync_hwstate(vcpu); >>> >>> preempt_enable(); >>> >>> - kvm_timer_sync_hwstate(vcpu); >>> - >>> ret = handle_exit(vcpu, run, ret); >>> } >>> >>> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h >>> index ef14cc1..1800227 100644 >>> --- a/include/kvm/arm_arch_timer.h >>> +++ b/include/kvm/arm_arch_timer.h >>> @@ -51,7 +51,7 @@ struct arch_timer_cpu { >>> bool armed; >>> >>> /* Timer IRQ */ >>> - const struct kvm_irq_level *irq; >>> + struct kvm_irq_level irq; >>> >>> /* VGIC mapping */ >>> struct irq_phys_map *map; >>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h >>> index d901f1a..99011a0 100644 >>> --- a/include/kvm/arm_vgic.h >>> +++ b/include/kvm/arm_vgic.h >>> @@ -163,7 +163,6 @@ struct irq_phys_map { >>> u32 virt_irq; >>> u32 phys_irq; >>> u32 irq; >>> - bool active; >>> }; >>> >>> struct irq_phys_map_entry { >>> @@ -358,8 +357,6 @@ int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu); >>> struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, >>> int virt_irq, int irq); >>> int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map); >>> -bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map); >>> -void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active); >>> >>> #define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel)) >>> #define vgic_initialized(k) (!!((k)->arch.vgic.nr_cpus)) >>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c >>> index 018f3d6..747302f 100644 >>> --- a/virt/kvm/arm/arch_timer.c >>> +++ b/virt/kvm/arm/arch_timer.c >>> @@ -59,18 +59,6 @@ static void timer_disarm(struct arch_timer_cpu *timer) >>> } >>> } >>> >>> -static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu) >>> -{ >>> - int ret; >>> - struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; >>> - >>> - kvm_vgic_set_phys_irq_active(timer->map, true); >>> - ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id, >>> - timer->map, >>> - timer->irq->level); >>> - WARN_ON(ret); >>> -} >>> - >>> static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id) >>> { >>> struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)dev_id; >>> @@ -116,8 +104,7 @@ static bool kvm_timer_irq_enabled(struct kvm_vcpu *vcpu) >>> struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; >>> >>> return !(timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) && >>> - (timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) && >>> - !kvm_vgic_get_phys_irq_active(timer->map); >>> + (timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE); >>> } >>> >>> bool kvm_timer_should_fire(struct kvm_vcpu *vcpu) >>> @@ -134,6 +121,45 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu) >>> return cval <= now; >>> } >>> >>> +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu) >>> +{ >>> + int ret; >>> + struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; >>> + >>> + BUG_ON(!vgic_initialized(vcpu->kvm)); >>> + >>> + ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id, >>> + timer->map, >>> + timer->irq.level); >>> + WARN_ON(ret); >>> +} >>> + >>> +/* >>> + * Check if there was a change in the timer state (should we raise or lower >>> + * the line level to the GIC). >>> + */ >>> +static void kvm_timer_update_state(struct kvm_vcpu *vcpu) >>> +{ >>> + struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; >>> + >>> + /* >>> + * If userspace modified the timer registers via SET_ONE_REG before >>> + * the vgic was initialized, we mustn't set the timer->irq.level value >>> + * because the guest would never see the interrupt. Instead wait >>> + * until we call this funciton from kvm_timer_flush_hwstate. >>> + */ >>> + if (!vgic_initialized(vcpu->kvm)) >>> + return; >>> + >>> + if (kvm_timer_should_fire(vcpu) && !timer->irq.level) { >>> + timer->irq.level = 1; >>> + kvm_timer_update_irq(vcpu); >>> + } else if (!kvm_timer_should_fire(vcpu) && timer->irq.level) { >>> + timer->irq.level = 0; >>> + kvm_timer_update_irq(vcpu); >>> + } >>> +} >>> + >> >> It took me ages to parse this, so I rewrote it to match my understanding: >> >> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c >> index 8a0fdfc..a722f0f 100644 >> --- a/virt/kvm/arm/arch_timer.c >> +++ b/virt/kvm/arm/arch_timer.c >> @@ -121,13 +121,14 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu) >> return cval <= now; >> } >> >> -static void kvm_timer_update_irq(struct kvm_vcpu *vcpu) >> +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_state) >> { >> int ret; >> struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; >> >> BUG_ON(!vgic_initialized(vcpu->kvm)); >> >> + timer->irq.level = new_state; >> ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id, >> timer->map, >> timer->irq.level); >> @@ -151,13 +152,8 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu) >> if (!vgic_initialized(vcpu->kvm)) >> return; >> >> - if (kvm_timer_should_fire(vcpu) && !timer->irq.level) { >> - timer->irq.level = 1; >> - kvm_timer_update_irq(vcpu); >> - } else if (!kvm_timer_should_fire(vcpu) && timer->irq.level) { >> - timer->irq.level = 0; >> - kvm_timer_update_irq(vcpu); >> - } >> + if (kvm_timer_should_fire(vcpu) != timer->irq.level) >> + kvm_timer_update_irq(vcpu, !timer->irq.level); >> } >> >> /* >> >> Did I get it right? > > almost, you'd have to assign timer->irq.level after you check for it > though, right? That's why I've added this line in kvm_timer_update_irq()! :-) >> >>> /* >>> * Schedule the background timer before calling kvm_vcpu_block, so that this >>> * thread is removed from its waitqueue and made runnable when there's a timer >>> @@ -191,8 +217,7 @@ void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu) >>> * If the timer expired while we were not scheduled, now is the time >>> * to inject it. >>> */ >>> - if (kvm_timer_should_fire(vcpu)) >>> - kvm_timer_inject_irq(vcpu); >>> + kvm_timer_update_state(vcpu); >>> } >>> >>> /** >>> @@ -208,8 +233,11 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu) >>> >>> BUG_ON(timer_is_armed(timer)); >>> >>> - if (kvm_timer_should_fire(vcpu)) >>> - kvm_timer_inject_irq(vcpu); >>> + /* >>> + * The guest could have modified the timer registers or the timer >>> + * could have expired, update the timer state. >>> + */ >>> + kvm_timer_update_state(vcpu); >>> } >>> >>> int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu, >>> @@ -224,7 +252,7 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu, >>> * kvm_vcpu_set_target(). To handle this, we determine >>> * vcpu timer irq number when the vcpu is reset. >>> */ >>> - timer->irq = irq; >>> + timer->irq.irq = irq->irq; >>> >>> /* >>> * Tell the VGIC that the virtual interrupt is tied to a >>> @@ -269,6 +297,8 @@ int kvm_arm_timer_set_reg(struct kvm_vcpu *vcpu, u64 regid, u64 value) >>> default: >>> return -1; >>> } >>> + >>> + kvm_timer_update_state(vcpu); >>> return 0; >>> } >>> >>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c >>> index 9ed8d53..f4ea950 100644 >>> --- a/virt/kvm/arm/vgic.c >>> +++ b/virt/kvm/arm/vgic.c >>> @@ -1422,34 +1422,43 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu) >>> /* >>> * Save the physical active state, and reset it to inactive. >>> * >>> - * Return 1 if HW interrupt went from active to inactive, and 0 otherwise. >>> + * Return true if there's a pending level triggered interrupt line to queue. >>> */ >>> -static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr) >>> +static bool vgic_sync_hwirq(struct kvm_vcpu *vcpu, int lr, struct vgic_lr vlr) >>> { >>> struct irq_phys_map *map; >>> + bool phys_active; >>> int ret; >>> >>> if (!(vlr.state & LR_HW)) >>> return 0; >>> >>> map = vgic_irq_map_search(vcpu, vlr.irq); >>> - BUG_ON(!map || !map->active); >>> + BUG_ON(!map); >>> >>> ret = irq_get_irqchip_state(map->irq, >>> IRQCHIP_STATE_ACTIVE, >>> - &map->active); >>> + &phys_active); >>> >>> WARN_ON(ret); >>> >>> - if (map->active) { >>> + if (phys_active) { >>> + /* >>> + * Interrupt still marked as active on the physical >>> + * distributor, so guest did not EOI it yet. Reset to >>> + * non-active so that other VMs can see interrupts from this >>> + * device. >>> + */ >>> ret = irq_set_irqchip_state(map->irq, >>> IRQCHIP_STATE_ACTIVE, >>> false); >>> WARN_ON(ret); >>> - return 0; >>> + return false; >>> } >>> >>> - return 1; >>> + /* Mapped edge-triggered interrupts not yet supported. */ >>> + WARN_ON(vgic_irq_is_edge(vcpu, vlr.irq)); >> >> Hmmm. What are we missing? >> > > I don't know really, my brain ran out of memory, but it's not like we > claimed to support this earlier and clearly we didn't work this well > enough through. We can definitely revisit this later, but I have the feeling that the flow is quite similar... > >>> + return process_level_irq(vcpu, lr, vlr); >>> } >>> >>> /* Sync back the VGIC state after a guest run */ >>> @@ -1474,18 +1483,8 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu) >>> continue; >>> >>> vlr = vgic_get_lr(vcpu, lr); >>> - if (vgic_sync_hwirq(vcpu, vlr)) { >>> - /* >>> - * So this is a HW interrupt that the guest >>> - * EOI-ed. Clean the LR state and allow the >>> - * interrupt to be sampled again. >>> - */ >>> - vlr.state = 0; >>> - vlr.hwirq = 0; >>> - vgic_set_lr(vcpu, lr, vlr); >>> - vgic_irq_clear_queued(vcpu, vlr.irq); >>> - set_bit(lr, elrsr_ptr); >>> - } >>> + if (vgic_sync_hwirq(vcpu, lr, vlr)) >>> + level_pending = true; >>> >>> if (!test_bit(lr, elrsr_ptr)) >>> continue; >>> @@ -1861,30 +1860,6 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu) >>> } >>> >>> /** >>> - * kvm_vgic_get_phys_irq_active - Return the active state of a mapped IRQ >>> - * >>> - * Return the logical active state of a mapped interrupt. This doesn't >>> - * necessarily reflects the current HW state. >>> - */ >>> -bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map) >>> -{ >>> - BUG_ON(!map); >>> - return map->active; >>> -} >>> - >>> -/** >>> - * kvm_vgic_set_phys_irq_active - Set the active state of a mapped IRQ >>> - * >>> - * Set the logical active state of a mapped interrupt. This doesn't >>> - * immediately affects the HW state. >>> - */ >>> -void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active) >>> -{ >>> - BUG_ON(!map); >>> - map->active = active; >>> -} >>> - >>> -/** >>> * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping >>> * @vcpu: The VCPU pointer >>> * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq >>> @@ -2112,10 +2087,14 @@ int vgic_init(struct kvm *kvm) >>> if (i < VGIC_NR_SGIS) >>> vgic_bitmap_set_irq_val(&dist->irq_enabled, >>> vcpu->vcpu_id, i, 1); >>> - if (i < VGIC_NR_PRIVATE_IRQS) >>> + if (i < VGIC_NR_SGIS) >>> vgic_bitmap_set_irq_val(&dist->irq_cfg, >>> vcpu->vcpu_id, i, >>> VGIC_CFG_EDGE); >>> + else if (i < VGIC_NR_PRIVATE_IRQS) /* PPIs */ >>> + vgic_bitmap_set_irq_val(&dist->irq_cfg, >>> + vcpu->vcpu_id, i, >>> + VGIC_CFG_LEVEL); >>> } >>> >>> vgic_enable(vcpu); >>> >> >> My only real objection to this patch is that it puts my brain upside down. >> Hopefully that won't last. >> > Yeah, I tried helping in the commit message, but I couldn't do much > beyond that. Splitting up the patch further didn't really work out for > me. It is indeed quite intricated, and hard to really take apart. Guess we'll have to live with it. Thanks, M.
On Thu, Sep 03, 2015 at 06:29:14PM +0100, Marc Zyngier wrote: > On 03/09/15 18:23, Christoffer Dall wrote: > > On Thu, Sep 03, 2015 at 06:06:39PM +0100, Marc Zyngier wrote: > >> On 30/08/15 14:54, Christoffer Dall wrote: > >>> The arch timer currently uses edge-triggered semantics in the sense that > >>> the line is never sampled by the vgic and lowering the line from the > >>> timer to the vgic doesn't have any affect on the pending state of > >>> virtual interrupts in the vgic. This means that we do not support a > >>> guest with the otherwise valid behavior of (1) disable interrupts (2) > >>> enable the timer (3) disable the timer (4) enable interrupts. Such a > >>> guest would validly not expect to see any interrupts on real hardware, > >>> but will see interrupts on KVM. > >>> > >>> This patches fixes this shortcoming through the following series of > >>> changes. > >>> > >>> First, we change the flow of the timer/vgic sync/flush operations. Now > >>> the timer is always flushed/synced before the vgic, because the vgic > >>> samples the state of the timer output. This has the implication that we > >>> move the timer operations in to non-preempible sections, but that is > >>> fine after the previous commit getting rid of hrtimer schedules on every > >>> entry/exit. > >>> > >>> Second, we change the internal behavior of the timer, letting the timer > >>> keep track of its previous output state, and only lower/raise the line > >>> to the vgic when the state changes. Note that in theory this could have > >>> been accomplished more simply by signalling the vgic every time the > >>> state *potentially* changed, but we don't want to be hitting the vgic > >>> more often than necessary. > >>> > >>> Third, we get rid of the use of the map->active field in the vgic and > >>> instead simply set the interrupt as active on the physical distributor > >>> whenever we signal a mapped interrupt to the guest, and we reset the > >>> active state when we sync back the HW state from the vgic. > >>> > >>> Fourth, and finally, we now initialize the timer PPIs (and all the other > >>> unused PPIs for now), to be level-triggered, and modify the sync code to > >>> sample the line state on HW sync and re-inject a new interrupt if it is > >>> still pending at that time. > >>> > >>> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> > >>> --- > >>> arch/arm/kvm/arm.c | 11 +++++-- > >>> include/kvm/arm_arch_timer.h | 2 +- > >>> include/kvm/arm_vgic.h | 3 -- > >>> virt/kvm/arm/arch_timer.c | 68 +++++++++++++++++++++++++++++++------------- > >>> virt/kvm/arm/vgic.c | 67 +++++++++++++++---------------------------- > >>> 5 files changed, 81 insertions(+), 70 deletions(-) > >>> > >>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c > >>> index bdf8871..102a4aa 100644 > >>> --- a/arch/arm/kvm/arm.c > >>> +++ b/arch/arm/kvm/arm.c > >>> @@ -561,9 +561,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) > >>> > >>> if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) { > >>> local_irq_enable(); > >>> + kvm_timer_sync_hwstate(vcpu); > >>> kvm_vgic_sync_hwstate(vcpu); > >>> preempt_enable(); > >>> - kvm_timer_sync_hwstate(vcpu); > >>> continue; > >>> } > >>> > >>> @@ -608,12 +608,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) > >>> kvm_guest_exit(); > >>> trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu)); > >>> > >>> + /* > >>> + * We must sync the timer state before the vgic state so that > >>> + * the vgic can properly sample the updated state of the > >>> + * interrupt line. > >>> + */ > >>> + kvm_timer_sync_hwstate(vcpu); > >>> + > >>> kvm_vgic_sync_hwstate(vcpu); > >>> > >>> preempt_enable(); > >>> > >>> - kvm_timer_sync_hwstate(vcpu); > >>> - > >>> ret = handle_exit(vcpu, run, ret); > >>> } > >>> > >>> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h > >>> index ef14cc1..1800227 100644 > >>> --- a/include/kvm/arm_arch_timer.h > >>> +++ b/include/kvm/arm_arch_timer.h > >>> @@ -51,7 +51,7 @@ struct arch_timer_cpu { > >>> bool armed; > >>> > >>> /* Timer IRQ */ > >>> - const struct kvm_irq_level *irq; > >>> + struct kvm_irq_level irq; > >>> > >>> /* VGIC mapping */ > >>> struct irq_phys_map *map; > >>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h > >>> index d901f1a..99011a0 100644 > >>> --- a/include/kvm/arm_vgic.h > >>> +++ b/include/kvm/arm_vgic.h > >>> @@ -163,7 +163,6 @@ struct irq_phys_map { > >>> u32 virt_irq; > >>> u32 phys_irq; > >>> u32 irq; > >>> - bool active; > >>> }; > >>> > >>> struct irq_phys_map_entry { > >>> @@ -358,8 +357,6 @@ int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu); > >>> struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, > >>> int virt_irq, int irq); > >>> int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map); > >>> -bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map); > >>> -void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active); > >>> > >>> #define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel)) > >>> #define vgic_initialized(k) (!!((k)->arch.vgic.nr_cpus)) > >>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c > >>> index 018f3d6..747302f 100644 > >>> --- a/virt/kvm/arm/arch_timer.c > >>> +++ b/virt/kvm/arm/arch_timer.c > >>> @@ -59,18 +59,6 @@ static void timer_disarm(struct arch_timer_cpu *timer) > >>> } > >>> } > >>> > >>> -static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu) > >>> -{ > >>> - int ret; > >>> - struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; > >>> - > >>> - kvm_vgic_set_phys_irq_active(timer->map, true); > >>> - ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id, > >>> - timer->map, > >>> - timer->irq->level); > >>> - WARN_ON(ret); > >>> -} > >>> - > >>> static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id) > >>> { > >>> struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)dev_id; > >>> @@ -116,8 +104,7 @@ static bool kvm_timer_irq_enabled(struct kvm_vcpu *vcpu) > >>> struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; > >>> > >>> return !(timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) && > >>> - (timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) && > >>> - !kvm_vgic_get_phys_irq_active(timer->map); > >>> + (timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE); > >>> } > >>> > >>> bool kvm_timer_should_fire(struct kvm_vcpu *vcpu) > >>> @@ -134,6 +121,45 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu) > >>> return cval <= now; > >>> } > >>> > >>> +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu) > >>> +{ > >>> + int ret; > >>> + struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; > >>> + > >>> + BUG_ON(!vgic_initialized(vcpu->kvm)); > >>> + > >>> + ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id, > >>> + timer->map, > >>> + timer->irq.level); > >>> + WARN_ON(ret); > >>> +} > >>> + > >>> +/* > >>> + * Check if there was a change in the timer state (should we raise or lower > >>> + * the line level to the GIC). > >>> + */ > >>> +static void kvm_timer_update_state(struct kvm_vcpu *vcpu) > >>> +{ > >>> + struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; > >>> + > >>> + /* > >>> + * If userspace modified the timer registers via SET_ONE_REG before > >>> + * the vgic was initialized, we mustn't set the timer->irq.level value > >>> + * because the guest would never see the interrupt. Instead wait > >>> + * until we call this funciton from kvm_timer_flush_hwstate. > >>> + */ > >>> + if (!vgic_initialized(vcpu->kvm)) > >>> + return; > >>> + > >>> + if (kvm_timer_should_fire(vcpu) && !timer->irq.level) { > >>> + timer->irq.level = 1; > >>> + kvm_timer_update_irq(vcpu); > >>> + } else if (!kvm_timer_should_fire(vcpu) && timer->irq.level) { > >>> + timer->irq.level = 0; > >>> + kvm_timer_update_irq(vcpu); > >>> + } > >>> +} > >>> + > >> > >> It took me ages to parse this, so I rewrote it to match my understanding: > >> > >> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c > >> index 8a0fdfc..a722f0f 100644 > >> --- a/virt/kvm/arm/arch_timer.c > >> +++ b/virt/kvm/arm/arch_timer.c > >> @@ -121,13 +121,14 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu) > >> return cval <= now; > >> } > >> > >> -static void kvm_timer_update_irq(struct kvm_vcpu *vcpu) > >> +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_state) > >> { > >> int ret; > >> struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; > >> > >> BUG_ON(!vgic_initialized(vcpu->kvm)); > >> > >> + timer->irq.level = new_state; > >> ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id, > >> timer->map, > >> timer->irq.level); > >> @@ -151,13 +152,8 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu) > >> if (!vgic_initialized(vcpu->kvm)) > >> return; > >> > >> - if (kvm_timer_should_fire(vcpu) && !timer->irq.level) { > >> - timer->irq.level = 1; > >> - kvm_timer_update_irq(vcpu); > >> - } else if (!kvm_timer_should_fire(vcpu) && timer->irq.level) { > >> - timer->irq.level = 0; > >> - kvm_timer_update_irq(vcpu); > >> - } > >> + if (kvm_timer_should_fire(vcpu) != timer->irq.level) > >> + kvm_timer_update_irq(vcpu, !timer->irq.level); > >> } > >> > >> /* > >> > >> Did I get it right? > > > > almost, you'd have to assign timer->irq.level after you check for it > > though, right? > > That's why I've added this line in kvm_timer_update_irq()! :-) > duh, /me learns to read diffs all over again. Yeah, your version is probably easier to read. thanks. > >> > >>> /* > >>> * Schedule the background timer before calling kvm_vcpu_block, so that this > >>> * thread is removed from its waitqueue and made runnable when there's a timer > >>> @@ -191,8 +217,7 @@ void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu) > >>> * If the timer expired while we were not scheduled, now is the time > >>> * to inject it. > >>> */ > >>> - if (kvm_timer_should_fire(vcpu)) > >>> - kvm_timer_inject_irq(vcpu); > >>> + kvm_timer_update_state(vcpu); > >>> } > >>> > >>> /** > >>> @@ -208,8 +233,11 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu) > >>> > >>> BUG_ON(timer_is_armed(timer)); > >>> > >>> - if (kvm_timer_should_fire(vcpu)) > >>> - kvm_timer_inject_irq(vcpu); > >>> + /* > >>> + * The guest could have modified the timer registers or the timer > >>> + * could have expired, update the timer state. > >>> + */ > >>> + kvm_timer_update_state(vcpu); > >>> } > >>> > >>> int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu, > >>> @@ -224,7 +252,7 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu, > >>> * kvm_vcpu_set_target(). To handle this, we determine > >>> * vcpu timer irq number when the vcpu is reset. > >>> */ > >>> - timer->irq = irq; > >>> + timer->irq.irq = irq->irq; > >>> > >>> /* > >>> * Tell the VGIC that the virtual interrupt is tied to a > >>> @@ -269,6 +297,8 @@ int kvm_arm_timer_set_reg(struct kvm_vcpu *vcpu, u64 regid, u64 value) > >>> default: > >>> return -1; > >>> } > >>> + > >>> + kvm_timer_update_state(vcpu); > >>> return 0; > >>> } > >>> > >>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c > >>> index 9ed8d53..f4ea950 100644 > >>> --- a/virt/kvm/arm/vgic.c > >>> +++ b/virt/kvm/arm/vgic.c > >>> @@ -1422,34 +1422,43 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu) > >>> /* > >>> * Save the physical active state, and reset it to inactive. > >>> * > >>> - * Return 1 if HW interrupt went from active to inactive, and 0 otherwise. > >>> + * Return true if there's a pending level triggered interrupt line to queue. > >>> */ > >>> -static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr) > >>> +static bool vgic_sync_hwirq(struct kvm_vcpu *vcpu, int lr, struct vgic_lr vlr) > >>> { > >>> struct irq_phys_map *map; > >>> + bool phys_active; > >>> int ret; > >>> > >>> if (!(vlr.state & LR_HW)) > >>> return 0; > >>> > >>> map = vgic_irq_map_search(vcpu, vlr.irq); > >>> - BUG_ON(!map || !map->active); > >>> + BUG_ON(!map); > >>> > >>> ret = irq_get_irqchip_state(map->irq, > >>> IRQCHIP_STATE_ACTIVE, > >>> - &map->active); > >>> + &phys_active); > >>> > >>> WARN_ON(ret); > >>> > >>> - if (map->active) { > >>> + if (phys_active) { > >>> + /* > >>> + * Interrupt still marked as active on the physical > >>> + * distributor, so guest did not EOI it yet. Reset to > >>> + * non-active so that other VMs can see interrupts from this > >>> + * device. > >>> + */ > >>> ret = irq_set_irqchip_state(map->irq, > >>> IRQCHIP_STATE_ACTIVE, > >>> false); > >>> WARN_ON(ret); > >>> - return 0; > >>> + return false; > >>> } > >>> > >>> - return 1; > >>> + /* Mapped edge-triggered interrupts not yet supported. */ > >>> + WARN_ON(vgic_irq_is_edge(vcpu, vlr.irq)); > >> > >> Hmmm. What are we missing? > >> > > > > I don't know really, my brain ran out of memory, but it's not like we > > claimed to support this earlier and clearly we didn't work this well > > enough through. > > We can definitely revisit this later, but I have the feeling that the > flow is quite similar... > > > > >>> + return process_level_irq(vcpu, lr, vlr); > >>> } > >>> > >>> /* Sync back the VGIC state after a guest run */ > >>> @@ -1474,18 +1483,8 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu) > >>> continue; > >>> > >>> vlr = vgic_get_lr(vcpu, lr); > >>> - if (vgic_sync_hwirq(vcpu, vlr)) { > >>> - /* > >>> - * So this is a HW interrupt that the guest > >>> - * EOI-ed. Clean the LR state and allow the > >>> - * interrupt to be sampled again. > >>> - */ > >>> - vlr.state = 0; > >>> - vlr.hwirq = 0; > >>> - vgic_set_lr(vcpu, lr, vlr); > >>> - vgic_irq_clear_queued(vcpu, vlr.irq); > >>> - set_bit(lr, elrsr_ptr); > >>> - } > >>> + if (vgic_sync_hwirq(vcpu, lr, vlr)) > >>> + level_pending = true; > >>> > >>> if (!test_bit(lr, elrsr_ptr)) > >>> continue; > >>> @@ -1861,30 +1860,6 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu) > >>> } > >>> > >>> /** > >>> - * kvm_vgic_get_phys_irq_active - Return the active state of a mapped IRQ > >>> - * > >>> - * Return the logical active state of a mapped interrupt. This doesn't > >>> - * necessarily reflects the current HW state. > >>> - */ > >>> -bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map) > >>> -{ > >>> - BUG_ON(!map); > >>> - return map->active; > >>> -} > >>> - > >>> -/** > >>> - * kvm_vgic_set_phys_irq_active - Set the active state of a mapped IRQ > >>> - * > >>> - * Set the logical active state of a mapped interrupt. This doesn't > >>> - * immediately affects the HW state. > >>> - */ > >>> -void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active) > >>> -{ > >>> - BUG_ON(!map); > >>> - map->active = active; > >>> -} > >>> - > >>> -/** > >>> * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping > >>> * @vcpu: The VCPU pointer > >>> * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq > >>> @@ -2112,10 +2087,14 @@ int vgic_init(struct kvm *kvm) > >>> if (i < VGIC_NR_SGIS) > >>> vgic_bitmap_set_irq_val(&dist->irq_enabled, > >>> vcpu->vcpu_id, i, 1); > >>> - if (i < VGIC_NR_PRIVATE_IRQS) > >>> + if (i < VGIC_NR_SGIS) > >>> vgic_bitmap_set_irq_val(&dist->irq_cfg, > >>> vcpu->vcpu_id, i, > >>> VGIC_CFG_EDGE); > >>> + else if (i < VGIC_NR_PRIVATE_IRQS) /* PPIs */ > >>> + vgic_bitmap_set_irq_val(&dist->irq_cfg, > >>> + vcpu->vcpu_id, i, > >>> + VGIC_CFG_LEVEL); > >>> } > >>> > >>> vgic_enable(vcpu); > >>> > >> > >> My only real objection to this patch is that it puts my brain upside down. > >> Hopefully that won't last. > >> > > Yeah, I tried helping in the commit message, but I couldn't do much > > beyond that. Splitting up the patch further didn't really work out for > > me. > > It is indeed quite intricated, and hard to really take apart. Guess > we'll have to live with it. > > Thanks, > > M. > -- > Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c index 8a0fdfc..a722f0f 100644 --- a/virt/kvm/arm/arch_timer.c +++ b/virt/kvm/arm/arch_timer.c @@ -121,13 +121,14 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu) return cval <= now; } -static void kvm_timer_update_irq(struct kvm_vcpu *vcpu) +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_state) { int ret; struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; BUG_ON(!vgic_initialized(vcpu->kvm)); + timer->irq.level = new_state; ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id, timer->map, timer->irq.level); @@ -151,13 +152,8 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu) if (!vgic_initialized(vcpu->kvm)) return; - if (kvm_timer_should_fire(vcpu) && !timer->irq.level) { - timer->irq.level = 1; - kvm_timer_update_irq(vcpu); - } else if (!kvm_timer_should_fire(vcpu) && timer->irq.level) { - timer->irq.level = 0; - kvm_timer_update_irq(vcpu); - } + if (kvm_timer_should_fire(vcpu) != timer->irq.level) + kvm_timer_update_irq(vcpu, !timer->irq.level); } /*