Message ID | 20240615185423.49474-1-florian.lugou@provenrun.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | target/arm/helper: Fix timer interrupt masking when HCR_EL2.E2H == 0 | expand |
On Sat, 15 Jun 2024 at 19:56, Florian Lugou <florian.lugou@provenrun.com> wrote: > > CNTHCTL_EL2 based masking of timer interrupts was introduced in > f6fc36deef6abcee406211f3e2f11ff894b87fa4. This masking was however > effective no matter whether EL2 was enabled in the current security > state or not, contrary to arm specification. > > Signed-off-by: Florian Lugou <florian.lugou@provenrun.com> > --- > target/arm/helper.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/target/arm/helper.c b/target/arm/helper.c > index ce31957235..60e2344c68 100644 > --- a/target/arm/helper.c > +++ b/target/arm/helper.c > @@ -2684,7 +2684,8 @@ static void gt_update_irq(ARMCPU *cpu, int timeridx) > * If bit CNTHCTL_EL2.CNT[VP]MASK is set, it overrides IMASK. > * It is RES0 in Secure and NonSecure state. > */ > - if ((ss == ARMSS_Root || ss == ARMSS_Realm) && > + if ((arm_hcr_el2_eff(env) & HCR_E2H) && > + (ss == ARMSS_Root || ss == ARMSS_Realm) && When the architecture says "is EL2 enabled in the current security state" it doesn't mean "is HCR_EL2.E2H set?", it means "is this either NonSecure/Realm or else is SCR_EL2.EEL2 set?". Compare the pseudocode EL2Enabled() and QEMU's arm_is_el2_enabled() and arm_is_el2_enabled_secstate() functions. This doesn't mean much in Root state, and for Realm state EL2 is always enabled (assuming it is implemented). For this timer check, we're doing I think the same thing as the pseudocode AArch64.CheckTimerConditions(), which does: if (IsFeatureImplemented(FEAT_RME) && ss IN {SS_Root, SS_Realm} && CNTHCTL_EL2.CNTPMASK == '1') then imask = '1'; so I'm inclined to say that our current implementation in QEMU is correct. > ((timeridx == GTIMER_VIRT && (cnthctl & R_CNTHCTL_CNTVMASK_MASK)) || > (timeridx == GTIMER_PHYS && (cnthctl & R_CNTHCTL_CNTPMASK_MASK)))) { > irqstate = 0; > -- thanks -- PMM
On Thu, Jun 20, 2024 at 11:43:17AM +0100, Peter Maydell wrote: > On Sat, 15 Jun 2024 at 19:56, Florian Lugou <florian.lugou@provenrun.com> wrote: > > > > CNTHCTL_EL2 based masking of timer interrupts was introduced in > > f6fc36deef6abcee406211f3e2f11ff894b87fa4. This masking was however > > effective no matter whether EL2 was enabled in the current security > > state or not, contrary to arm specification. > > > > Signed-off-by: Florian Lugou <florian.lugou@provenrun.com> > > --- > > target/arm/helper.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/target/arm/helper.c b/target/arm/helper.c > > index ce31957235..60e2344c68 100644 > > --- a/target/arm/helper.c > > +++ b/target/arm/helper.c > > @@ -2684,7 +2684,8 @@ static void gt_update_irq(ARMCPU *cpu, int timeridx) > > * If bit CNTHCTL_EL2.CNT[VP]MASK is set, it overrides IMASK. > > * It is RES0 in Secure and NonSecure state. > > */ > > - if ((ss == ARMSS_Root || ss == ARMSS_Realm) && > > + if ((arm_hcr_el2_eff(env) & HCR_E2H) && > > + (ss == ARMSS_Root || ss == ARMSS_Realm) && > > When the architecture says "is EL2 enabled in the current security state" > it doesn't mean "is HCR_EL2.E2H set?", it means "is this either NonSecure/Realm > or else is SCR_EL2.EEL2 set?". Compare the pseudocode EL2Enabled() > and QEMU's arm_is_el2_enabled() and arm_is_el2_enabled_secstate() functions. > This doesn't mean much in Root state, and for Realm state EL2 is always > enabled (assuming it is implemented). > > For this timer check, we're doing I think the same thing as the > pseudocode AArch64.CheckTimerConditions(), which does: > > if (IsFeatureImplemented(FEAT_RME) && ss IN {SS_Root, SS_Realm} && > CNTHCTL_EL2.CNTPMASK == '1') then > imask = '1'; > > so I'm inclined to say that our current implementation in QEMU is correct. Indeed. I got confused with the specification, my apologies. I am facing an issue with QEMU freezing waiting for a timer interrupt when running with -icount shift=0,sleep=off. Bissection has shown that the issue appeared with f6fc36deef6abcee406211f3e2f11ff894b87fa4. Further testing suggests that the issue may come from gt_recalc_timer. Calling gt_update_irq before timer_mod (as it was done before f6fc36deef6a) rather than at the end of the function solves the issue. Is it possible that timer_mod relies on cpu->gt_timer_outputs, which has not been modified at this point to reflect the timer triggering? > > > ((timeridx == GTIMER_VIRT && (cnthctl & R_CNTHCTL_CNTVMASK_MASK)) || > > (timeridx == GTIMER_PHYS && (cnthctl & R_CNTHCTL_CNTPMASK_MASK)))) { > > irqstate = 0; > > -- > > thanks > -- PMM Best,
On Thu, 20 Jun 2024 at 14:56, Florian Lugou <florian.lugou@provenrun.com> wrote: > > On Thu, Jun 20, 2024 at 11:43:17AM +0100, Peter Maydell wrote: > > For this timer check, we're doing I think the same thing as the > > pseudocode AArch64.CheckTimerConditions(), which does: > > > > if (IsFeatureImplemented(FEAT_RME) && ss IN {SS_Root, SS_Realm} && > > CNTHCTL_EL2.CNTPMASK == '1') then > > imask = '1'; > > > > so I'm inclined to say that our current implementation in QEMU is correct. > > Indeed. I got confused with the specification, my apologies. > > I am facing an issue with QEMU freezing waiting for a timer interrupt when > running with -icount shift=0,sleep=off. Bissection has shown that the issue > appeared with f6fc36deef6abcee406211f3e2f11ff894b87fa4. > > Further testing suggests that the issue may come from gt_recalc_timer. Calling > gt_update_irq before timer_mod (as it was done before f6fc36deef6a) rather than > at the end of the function solves the issue. Is it possible that timer_mod > relies on cpu->gt_timer_outputs, which has not been modified at this point to > reflect the timer triggering? I don't *think* it ought to care -- timer_mod() tells QEMU's timer infrastructure when to schedule the next timer callback for, and the gt_timer_outputs qemu_irqs tell the interrupt controller that the interrupt lines have changed state. Do you have a reproduce case? -- PMM
On Thu, Jun 20, 2024 at 08:01:01PM +0100, Peter Maydell wrote: > On Thu, 20 Jun 2024 at 14:56, Florian Lugou <florian.lugou@provenrun.com> wrote: > > > > On Thu, Jun 20, 2024 at 11:43:17AM +0100, Peter Maydell wrote: > > > For this timer check, we're doing I think the same thing as the > > > pseudocode AArch64.CheckTimerConditions(), which does: > > > > > > if (IsFeatureImplemented(FEAT_RME) && ss IN {SS_Root, SS_Realm} && > > > CNTHCTL_EL2.CNTPMASK == '1') then > > > imask = '1'; > > > > > > so I'm inclined to say that our current implementation in QEMU is correct. > > > > Indeed. I got confused with the specification, my apologies. > > > > I am facing an issue with QEMU freezing waiting for a timer interrupt when > > running with -icount shift=0,sleep=off. Bissection has shown that the issue > > appeared with f6fc36deef6abcee406211f3e2f11ff894b87fa4. > > > > Further testing suggests that the issue may come from gt_recalc_timer. Calling > > gt_update_irq before timer_mod (as it was done before f6fc36deef6a) rather than > > at the end of the function solves the issue. Is it possible that timer_mod > > relies on cpu->gt_timer_outputs, which has not been modified at this point to > > reflect the timer triggering? > > I don't *think* it ought to care -- timer_mod() tells QEMU's timer > infrastructure when to schedule the next timer callback for, > and the gt_timer_outputs qemu_irqs tell the interrupt controller > that the interrupt lines have changed state. > > Do you have a reproduce case? I do: $ cat test.S .section .text .global __start __start: /* Setup exception table */ ldr x0, =exc_vector_table msr vbar_el3, x0 /* Enable and mask secure physical timer */ mrs x0, CNTPS_CTL_EL1 orr x0, x0, 3 msr CNTPS_CTL_EL1, x0 mov x0, 0x8000000 /* GIC base address */ /* Enable group 0 */ ldr w1, [x0, 0] /* GICD_CTLR */ orr w1, w1, 0x1 str w1, [x0, 0] /* GICD_CTLR */ /* Enable timer interrupt */ add x0, x0, 0xb0000 /* SGI_base */ mov w1, (1 << 29) str w1, [x0, 0x100] /* GICR_ISENABLER0 */ /* Enable all priorities */ mov x0, 0xff msr ICC_PMR_EL1, x0 mov x0, 1 msr ICC_IGRPEN0_EL1, x0 /* Set timer compare value ~0.8s in the future */ mrs x0, CNTPCT_EL0 mov x1, 0x3000000 add x0, x0, x1 msr CNTPS_CVAL_EL1, x0 /* Unmask the timer */ mrs x0, CNTPS_CTL_EL1 bic x0, x0, 2 msr CNTPS_CTL_EL1, x0 /* Enable interrupts */ mrs x0, SCR_EL3 orr x0, x0, 4 msr SCR_EL3, x0 msr daifclr, 0x1 dsb sy /* Loop on WFI */ 0: wfi b 0b .macro EXIT .p2align 7 /* Issue a SYS_EXIT semihosting call */ mov x0, 0x18 .word 0xD45E0000 /* unreachable */ b . .endm .macro HOLE .p2align 7 b . .endm .p2align 11 exc_vector_table: HOLE /* Synchronous, from EL3, with SP_EL0 */ HOLE /* IRQ, from EL3, with SP_EL0 */ HOLE /* FIQ, from EL3, with SP_EL0 */ HOLE /* SError, from EL3, with SP_EL0 */ HOLE /* Synchronous, from EL3, with SP_ELx */ HOLE /* IRQ, from EL3, with SP_ELx */ EXIT /* FIQ, from EL3, with SP_ELx */ HOLE /* SError, from EL3, with SP_ELx */ HOLE /* Synchronous, from lower, with lvl n-1 aarch64 */ HOLE /* IRQ, from lower, with lvl n-1 aarch64 */ HOLE /* FIQ, from lower, with lvl n-1 aarch64 */ HOLE /* SError, from lower, with lvl n-1 aarch64 */ HOLE /* Synchronous, from lower, with lvl n-1 aarch32 */ HOLE /* IRQ, from lower, with lvl n-1 aarch32 */ HOLE /* FIQ, from lower, with lvl n-1 aarch32 */ HOLE /* SError, from lower, with lvl n-1 aarch32 */ $ aarch64-none-elf-gcc -ffreestanding -nostdlib -T qemu/tests/tcg/aarch64/system/kernel.ld -o test test.S $ qemu-system-aarch64 \ -machine virt,secure=on,gic-version=3 \ -cpu cortex-a57 \ -kernel test \ -display none \ -semihosting $ # Exits after ~1s $ qemu-system-aarch64 \ -machine virt,secure=on,gic-version=3 \ -cpu cortex-a57 \ -kernel test \ -display none \ -semihosting \ -icount shift=0,sleep=off ... (hangs until QEMU is killed) Best,
On Fri, 21 Jun 2024 at 15:07, Florian Lugou <florian.lugou@provenrun.com> wrote: > > On Thu, Jun 20, 2024 at 08:01:01PM +0100, Peter Maydell wrote: > > On Thu, 20 Jun 2024 at 14:56, Florian Lugou <florian.lugou@provenrun.com> wrote: > > > > > > On Thu, Jun 20, 2024 at 11:43:17AM +0100, Peter Maydell wrote: > > > > For this timer check, we're doing I think the same thing as the > > > > pseudocode AArch64.CheckTimerConditions(), which does: > > > > > > > > if (IsFeatureImplemented(FEAT_RME) && ss IN {SS_Root, SS_Realm} && > > > > CNTHCTL_EL2.CNTPMASK == '1') then > > > > imask = '1'; > > > > > > > > so I'm inclined to say that our current implementation in QEMU is correct. > > > > > > Indeed. I got confused with the specification, my apologies. > > > > > > I am facing an issue with QEMU freezing waiting for a timer interrupt when > > > running with -icount shift=0,sleep=off. Bissection has shown that the issue > > > appeared with f6fc36deef6abcee406211f3e2f11ff894b87fa4. > > > > > > Further testing suggests that the issue may come from gt_recalc_timer. Calling > > > gt_update_irq before timer_mod (as it was done before f6fc36deef6a) rather than > > > at the end of the function solves the issue. Is it possible that timer_mod > > > relies on cpu->gt_timer_outputs, which has not been modified at this point to > > > reflect the timer triggering? > > > > I don't *think* it ought to care -- timer_mod() tells QEMU's timer > > infrastructure when to schedule the next timer callback for, > > and the gt_timer_outputs qemu_irqs tell the interrupt controller > > that the interrupt lines have changed state. > > > > Do you have a reproduce case? > > I do: (Sorry I didn't get back to you on this earlier.) > $ aarch64-none-elf-gcc -ffreestanding -nostdlib -T qemu/tests/tcg/aarch64/system/kernel.ld -o test test.S > > $ qemu-system-aarch64 \ > -machine virt,secure=on,gic-version=3 \ > -cpu cortex-a57 \ > -kernel test \ > -display none \ > -semihosting > > $ # Exits after ~1s > > $ qemu-system-aarch64 \ > -machine virt,secure=on,gic-version=3 \ > -cpu cortex-a57 \ > -kernel test \ > -display none \ > -semihosting \ > -icount shift=0,sleep=off > > ... (hangs until QEMU is killed) For me, with QEMU commit 9eb51530c12ae645b, this test case exits (doesn't hang) with both these command lines. Do you still see this bug? I guess it's possible we fixed it in the last month or so, though I can't see anything obviously relevant in the git logs. thanks -- PMM
> > $ aarch64-none-elf-gcc -ffreestanding -nostdlib -T qemu/tests/tcg/aarch64/system/kernel.ld -o test test.S > > > > $ qemu-system-aarch64 \ > > -machine virt,secure=on,gic-version=3 \ > > -cpu cortex-a57 \ > > -kernel test \ > > -display none \ > > -semihosting > > > > $ # Exits after ~1s > > > > $ qemu-system-aarch64 \ > > -machine virt,secure=on,gic-version=3 \ > > -cpu cortex-a57 \ > > -kernel test \ > > -display none \ > > -semihosting \ > > -icount shift=0,sleep=off > > > > ... (hangs until QEMU is killed) > > For me, with QEMU commit 9eb51530c12ae645b, this test case > exits (doesn't hang) with both these command lines. Do you > still see this bug? I guess it's possible we fixed it in > the last month or so, though I can't see anything obviously > relevant in the git logs. Thank you for taking the time to test it. On my machine (Ubuntu 22.04), with QEMU configuration options "--target-list=aarch64-softmmu --enable-debug", running the provided test case with "-icount shift=0,sleep=off" still makes QEMU hang forever on commit 9eb51530c12ae645b. The issue was initially reported by a colleague of mine so I was hoping it would be somehow reliably reproducible. But apparently it is not. I will try to find some time to investigate a bit more. Thank you,
diff --git a/target/arm/helper.c b/target/arm/helper.c index ce31957235..60e2344c68 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -2684,7 +2684,8 @@ static void gt_update_irq(ARMCPU *cpu, int timeridx) * If bit CNTHCTL_EL2.CNT[VP]MASK is set, it overrides IMASK. * It is RES0 in Secure and NonSecure state. */ - if ((ss == ARMSS_Root || ss == ARMSS_Realm) && + if ((arm_hcr_el2_eff(env) & HCR_E2H) && + (ss == ARMSS_Root || ss == ARMSS_Realm) && ((timeridx == GTIMER_VIRT && (cnthctl & R_CNTHCTL_CNTVMASK_MASK)) || (timeridx == GTIMER_PHYS && (cnthctl & R_CNTHCTL_CNTPMASK_MASK)))) { irqstate = 0;
CNTHCTL_EL2 based masking of timer interrupts was introduced in f6fc36deef6abcee406211f3e2f11ff894b87fa4. This masking was however effective no matter whether EL2 was enabled in the current security state or not, contrary to arm specification. Signed-off-by: Florian Lugou <florian.lugou@provenrun.com> --- target/arm/helper.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)