Message ID | 1489402563-4978-15-git-send-email-Wei.Chen@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, 13 Mar 2017, Wei Chen wrote: > Currently, we masked the Abort/SError bit in Xen exception entries. > So Xen could not capture any Abort/SError while it's running. > Now, Xen has the ability to handle the Abort/SError, we should unmask > the Abort/SError bit by default to let Xen capture Abort/SError while > it's running. > > But in order to avoid receiving nested asynchronous abort, we don't > unmask Abort/SError bit in hyp_error and trap_data_abort. > > Signed-off-by: Wei Chen <Wei.Chen@arm.com> > --- > We haven't done this before, so I don't know how can this change > will affect the Xen. If the IRQ and Abort take place at the same > time, how can we handle them? If the abort is for Xen, the hypervisor will crash and that's the end of it. If the abort is for the guest, Xen will inject it into the VM, then it will return from handling the abort, going back to handling the IRQ as before. Isn't that right? > If an abort is taking place while we're handling the IRQ, the program > jump to abort exception, and then enable the IRQ. In this case, what > will happen? So I think I need more discussions from community. Do you know of an example scenario where Xen could have a problem with this? > --- > xen/arch/arm/arm32/entry.S | 15 ++++++++++++++- > xen/arch/arm/arm64/entry.S | 13 ++++++++----- > 2 files changed, 22 insertions(+), 6 deletions(-) > > diff --git a/xen/arch/arm/arm32/entry.S b/xen/arch/arm/arm32/entry.S > index 79929ca..4d46239 100644 > --- a/xen/arch/arm/arm32/entry.S > +++ b/xen/arch/arm/arm32/entry.S > @@ -125,6 +125,7 @@ abort_guest_exit_end: > trap_##trap: \ > SAVE_ALL; \ > cpsie i; /* local_irq_enable */ \ > + cpsie a; /* asynchronous abort enable */ \ > adr lr, return_from_trap; \ > mov r0, sp; \ > mov r11, sp; \ > @@ -135,6 +136,18 @@ trap_##trap: \ > ALIGN; \ > trap_##trap: \ > SAVE_ALL; \ > + cpsie a; /* asynchronous abort enable */ \ > + adr lr, return_from_trap; \ > + mov r0, sp; \ > + mov r11, sp; \ > + bic sp, #7; /* Align the stack pointer (noop on guest trap) */ \ > + b do_trap_##trap > + > +#define DEFINE_TRAP_ENTRY_NOABORT(trap) \ > + ALIGN; \ > +trap_##trap: \ > + SAVE_ALL; \ > + cpsie i; /* local_irq_enable */ \ > adr lr, return_from_trap; \ > mov r0, sp; \ > mov r11, sp; \ > @@ -155,10 +168,10 @@ GLOBAL(hyp_traps_vector) > DEFINE_TRAP_ENTRY(undefined_instruction) > DEFINE_TRAP_ENTRY(supervisor_call) > DEFINE_TRAP_ENTRY(prefetch_abort) > -DEFINE_TRAP_ENTRY(data_abort) > DEFINE_TRAP_ENTRY(hypervisor) > DEFINE_TRAP_ENTRY_NOIRQ(irq) > DEFINE_TRAP_ENTRY_NOIRQ(fiq) > +DEFINE_TRAP_ENTRY_NOABORT(data_abort) > > return_from_trap: > mov sp, r11 > diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S > index 8d5a890..0401a41 100644 > --- a/xen/arch/arm/arm64/entry.S > +++ b/xen/arch/arm/arm64/entry.S > @@ -187,13 +187,14 @@ hyp_error: > /* Traps taken in Current EL with SP_ELx */ > hyp_sync: > entry hyp=1 > - msr daifclr, #2 > + msr daifclr, #6 > mov x0, sp > bl do_trap_hypervisor > exit hyp=1 > > hyp_irq: > entry hyp=1 > + msr daifclr, #4 > mov x0, sp > bl do_trap_irq > exit hyp=1 > @@ -208,7 +209,7 @@ guest_sync: > ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", > "nop; nop", > SKIP_CHECK_PENDING_VSERROR) > - msr daifclr, #2 > + msr daifclr, #6 > mov x0, sp > bl do_trap_hypervisor > 1: > @@ -224,6 +225,7 @@ guest_irq: > ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", > "nop; nop", > SKIP_CHECK_PENDING_VSERROR) > + msr daifclr, #4 > mov x0, sp > bl do_trap_irq > 1: > @@ -235,7 +237,7 @@ guest_fiq_invalid: > > guest_error: > entry hyp=0, compat=0 > - msr daifclr, #2 > + msr daifclr, #6 > mov x0, sp > bl do_trap_guest_serror > exit hyp=0, compat=0 > @@ -250,7 +252,7 @@ guest_sync_compat: > ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", > "nop; nop", > SKIP_CHECK_PENDING_VSERROR) > - msr daifclr, #2 > + msr daifclr, #6 > mov x0, sp > bl do_trap_hypervisor > 1: > @@ -266,6 +268,7 @@ guest_irq_compat: > ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", > "nop; nop", > SKIP_CHECK_PENDING_VSERROR) > + msr daifclr, #4 > mov x0, sp > bl do_trap_irq > 1: > @@ -277,7 +280,7 @@ guest_fiq_invalid_compat: > > guest_error_compat: > entry hyp=0, compat=1 > - msr daifclr, #2 > + msr daifclr, #6 > mov x0, sp > bl do_trap_guest_serror > exit hyp=0, compat=1 > -- > 2.7.4 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > https://lists.xen.org/xen-devel >
Hi Stefano, On 2017/3/21 5:38, Stefano Stabellini wrote: > On Mon, 13 Mar 2017, Wei Chen wrote: >> Currently, we masked the Abort/SError bit in Xen exception entries. >> So Xen could not capture any Abort/SError while it's running. >> Now, Xen has the ability to handle the Abort/SError, we should unmask >> the Abort/SError bit by default to let Xen capture Abort/SError while >> it's running. >> >> But in order to avoid receiving nested asynchronous abort, we don't >> unmask Abort/SError bit in hyp_error and trap_data_abort. >> >> Signed-off-by: Wei Chen <Wei.Chen@arm.com> >> --- >> We haven't done this before, so I don't know how can this change >> will affect the Xen. If the IRQ and Abort take place at the same >> time, how can we handle them? > > If the abort is for Xen, the hypervisor will crash and that's the end of Before the system crash, we have enable the IRQ, so that would not be the end if an IRQ happens at the same time. > it. If the abort is for the guest, Xen will inject it into the VM, then Before we have inject the abort to VM, we have enable the IRQ. > it will return from handling the abort, going back to handling the IRQ > as before. Isn't that right? If the abort has higher priority then IRQ, it's right. > > >> If an abort is taking place while we're handling the IRQ, the program >> jump to abort exception, and then enable the IRQ. In this case, what >> will happen? So I think I need more discussions from community. > > Do you know of an example scenario where Xen could have a problem with > this? > For example, 1. Trigger a SError in hypervisor. 2. Jump to hyp_error to handle SError. 3. Enable IRQ in hyp_error before PANIC 4. A timer IRQ happens. 5. Jump to hyp_irq and unmask abort again. 6. Jump hyp_error again? > >> --- >> xen/arch/arm/arm32/entry.S | 15 ++++++++++++++- >> xen/arch/arm/arm64/entry.S | 13 ++++++++----- >> 2 files changed, 22 insertions(+), 6 deletions(-) >> >> diff --git a/xen/arch/arm/arm32/entry.S b/xen/arch/arm/arm32/entry.S >> index 79929ca..4d46239 100644 >> --- a/xen/arch/arm/arm32/entry.S >> +++ b/xen/arch/arm/arm32/entry.S >> @@ -125,6 +125,7 @@ abort_guest_exit_end: >> trap_##trap: \ >> SAVE_ALL; \ >> cpsie i; /* local_irq_enable */ \ >> + cpsie a; /* asynchronous abort enable */ \ >> adr lr, return_from_trap; \ >> mov r0, sp; \ >> mov r11, sp; \ >> @@ -135,6 +136,18 @@ trap_##trap: \ >> ALIGN; \ >> trap_##trap: \ >> SAVE_ALL; \ >> + cpsie a; /* asynchronous abort enable */ \ >> + adr lr, return_from_trap; \ >> + mov r0, sp; \ >> + mov r11, sp; \ >> + bic sp, #7; /* Align the stack pointer (noop on guest trap) */ \ >> + b do_trap_##trap >> + >> +#define DEFINE_TRAP_ENTRY_NOABORT(trap) \ >> + ALIGN; \ >> +trap_##trap: \ >> + SAVE_ALL; \ >> + cpsie i; /* local_irq_enable */ \ >> adr lr, return_from_trap; \ >> mov r0, sp; \ >> mov r11, sp; \ >> @@ -155,10 +168,10 @@ GLOBAL(hyp_traps_vector) >> DEFINE_TRAP_ENTRY(undefined_instruction) >> DEFINE_TRAP_ENTRY(supervisor_call) >> DEFINE_TRAP_ENTRY(prefetch_abort) >> -DEFINE_TRAP_ENTRY(data_abort) >> DEFINE_TRAP_ENTRY(hypervisor) >> DEFINE_TRAP_ENTRY_NOIRQ(irq) >> DEFINE_TRAP_ENTRY_NOIRQ(fiq) >> +DEFINE_TRAP_ENTRY_NOABORT(data_abort) >> >> return_from_trap: >> mov sp, r11 >> diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S >> index 8d5a890..0401a41 100644 >> --- a/xen/arch/arm/arm64/entry.S >> +++ b/xen/arch/arm/arm64/entry.S >> @@ -187,13 +187,14 @@ hyp_error: >> /* Traps taken in Current EL with SP_ELx */ >> hyp_sync: >> entry hyp=1 >> - msr daifclr, #2 >> + msr daifclr, #6 >> mov x0, sp >> bl do_trap_hypervisor >> exit hyp=1 >> >> hyp_irq: >> entry hyp=1 >> + msr daifclr, #4 >> mov x0, sp >> bl do_trap_irq >> exit hyp=1 >> @@ -208,7 +209,7 @@ guest_sync: >> ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", >> "nop; nop", >> SKIP_CHECK_PENDING_VSERROR) >> - msr daifclr, #2 >> + msr daifclr, #6 >> mov x0, sp >> bl do_trap_hypervisor >> 1: >> @@ -224,6 +225,7 @@ guest_irq: >> ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", >> "nop; nop", >> SKIP_CHECK_PENDING_VSERROR) >> + msr daifclr, #4 >> mov x0, sp >> bl do_trap_irq >> 1: >> @@ -235,7 +237,7 @@ guest_fiq_invalid: >> >> guest_error: >> entry hyp=0, compat=0 >> - msr daifclr, #2 >> + msr daifclr, #6 >> mov x0, sp >> bl do_trap_guest_serror >> exit hyp=0, compat=0 >> @@ -250,7 +252,7 @@ guest_sync_compat: >> ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", >> "nop; nop", >> SKIP_CHECK_PENDING_VSERROR) >> - msr daifclr, #2 >> + msr daifclr, #6 >> mov x0, sp >> bl do_trap_hypervisor >> 1: >> @@ -266,6 +268,7 @@ guest_irq_compat: >> ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", >> "nop; nop", >> SKIP_CHECK_PENDING_VSERROR) >> + msr daifclr, #4 >> mov x0, sp >> bl do_trap_irq >> 1: >> @@ -277,7 +280,7 @@ guest_fiq_invalid_compat: >> >> guest_error_compat: >> entry hyp=0, compat=1 >> - msr daifclr, #2 >> + msr daifclr, #6 >> mov x0, sp >> bl do_trap_guest_serror >> exit hyp=0, compat=1 >> -- >> 2.7.4 >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> https://lists.xen.org/xen-devel >> >
Hi Wei, On 22/03/17 08:49, Wei Chen wrote: > Hi Stefano, > > On 2017/3/21 5:38, Stefano Stabellini wrote: >> On Mon, 13 Mar 2017, Wei Chen wrote: >>> Currently, we masked the Abort/SError bit in Xen exception entries. >>> So Xen could not capture any Abort/SError while it's running. >>> Now, Xen has the ability to handle the Abort/SError, we should unmask >>> the Abort/SError bit by default to let Xen capture Abort/SError while >>> it's running. >>> >>> But in order to avoid receiving nested asynchronous abort, we don't >>> unmask Abort/SError bit in hyp_error and trap_data_abort. >>> >>> Signed-off-by: Wei Chen <Wei.Chen@arm.com> >>> --- >>> We haven't done this before, so I don't know how can this change >>> will affect the Xen. If the IRQ and Abort take place at the same >>> time, how can we handle them? >> >> If the abort is for Xen, the hypervisor will crash and that's the end of > > Before the system crash, we have enable the IRQ, so that would not be > the end if an IRQ happens at the same time. > >> it. If the abort is for the guest, Xen will inject it into the VM, then > > Before we have inject the abort to VM, we have enable the IRQ. > >> it will return from handling the abort, going back to handling the IRQ >> as before. Isn't that right? > > If the abort has higher priority then IRQ, it's right. > >> >> >>> If an abort is taking place while we're handling the IRQ, the program >>> jump to abort exception, and then enable the IRQ. In this case, what >>> will happen? So I think I need more discussions from community. >> >> Do you know of an example scenario where Xen could have a problem with >> this? >> > > For example, > 1. Trigger a SError in hypervisor. > 2. Jump to hyp_error to handle SError. > 3. Enable IRQ in hyp_error before PANIC > 4. A timer IRQ happens. > 5. Jump to hyp_irq and unmask abort again. > 6. Jump hyp_error again? Technically you could end up in an infinite loop if hyp_error code generates a SError. It will stay pending until the end and then trigger again when SError is unmasked... That's unfortunate but I don't think this is a big issue as if this is happening your platform is already doomed. Cheers,
On Wed, 22 Mar 2017, Julien Grall wrote: > Hi Wei, > > On 22/03/17 08:49, Wei Chen wrote: > > Hi Stefano, > > > > On 2017/3/21 5:38, Stefano Stabellini wrote: > > > On Mon, 13 Mar 2017, Wei Chen wrote: > > > > Currently, we masked the Abort/SError bit in Xen exception entries. > > > > So Xen could not capture any Abort/SError while it's running. > > > > Now, Xen has the ability to handle the Abort/SError, we should unmask > > > > the Abort/SError bit by default to let Xen capture Abort/SError while > > > > it's running. > > > > > > > > But in order to avoid receiving nested asynchronous abort, we don't > > > > unmask Abort/SError bit in hyp_error and trap_data_abort. > > > > > > > > Signed-off-by: Wei Chen <Wei.Chen@arm.com> > > > > --- > > > > We haven't done this before, so I don't know how can this change > > > > will affect the Xen. If the IRQ and Abort take place at the same > > > > time, how can we handle them? > > > > > > If the abort is for Xen, the hypervisor will crash and that's the end of > > > > Before the system crash, we have enable the IRQ, so that would not be > > the end if an IRQ happens at the same time. > > > > > it. If the abort is for the guest, Xen will inject it into the VM, then > > > > Before we have inject the abort to VM, we have enable the IRQ. > > > > > it will return from handling the abort, going back to handling the IRQ > > > as before. Isn't that right? > > > > If the abort has higher priority then IRQ, it's right. > > > > > > > > > > > > If an abort is taking place while we're handling the IRQ, the program > > > > jump to abort exception, and then enable the IRQ. In this case, what > > > > will happen? So I think I need more discussions from community. > > > > > > Do you know of an example scenario where Xen could have a problem with > > > this? > > > > > > > For example, > > 1. Trigger a SError in hypervisor. > > 2. Jump to hyp_error to handle SError. > > 3. Enable IRQ in hyp_error before PANIC > > 4. A timer IRQ happens. > > 5. Jump to hyp_irq and unmask abort again. > > 6. Jump hyp_error again? > > Technically you could end up in an infinite loop if hyp_error code generates a > SError. It will stay pending until the end and then trigger again when SError > is unmasked... > > That's unfortunate but I don't think this is a big issue as if this is > happening your platform is already doomed. I agree, but the scenario suggested by Wei is not like that: hyp_error does not generate an serror, it only unmask irqs. I think that it would be safer not to unmask IRQs in hyp_error (remove msr daifclr, #2 at the beginning of hyp_error). IRQs can be enabled at the end of do_trap_hyp_serror (if the hypervisor does not panic).
Hi Stefano, On 2017/3/23 6:22, Stefano Stabellini wrote: > On Wed, 22 Mar 2017, Julien Grall wrote: >> Hi Wei, >> >> On 22/03/17 08:49, Wei Chen wrote: >>> Hi Stefano, >>> >>> On 2017/3/21 5:38, Stefano Stabellini wrote: >>>> On Mon, 13 Mar 2017, Wei Chen wrote: >>>>> Currently, we masked the Abort/SError bit in Xen exception entries. >>>>> So Xen could not capture any Abort/SError while it's running. >>>>> Now, Xen has the ability to handle the Abort/SError, we should unmask >>>>> the Abort/SError bit by default to let Xen capture Abort/SError while >>>>> it's running. >>>>> >>>>> But in order to avoid receiving nested asynchronous abort, we don't >>>>> unmask Abort/SError bit in hyp_error and trap_data_abort. >>>>> >>>>> Signed-off-by: Wei Chen <Wei.Chen@arm.com> >>>>> --- >>>>> We haven't done this before, so I don't know how can this change >>>>> will affect the Xen. If the IRQ and Abort take place at the same >>>>> time, how can we handle them? >>>> >>>> If the abort is for Xen, the hypervisor will crash and that's the end of >>> >>> Before the system crash, we have enable the IRQ, so that would not be >>> the end if an IRQ happens at the same time. >>> >>>> it. If the abort is for the guest, Xen will inject it into the VM, then >>> >>> Before we have inject the abort to VM, we have enable the IRQ. >>> >>>> it will return from handling the abort, going back to handling the IRQ >>>> as before. Isn't that right? >>> >>> If the abort has higher priority then IRQ, it's right. >>> >>>> >>>> >>>>> If an abort is taking place while we're handling the IRQ, the program >>>>> jump to abort exception, and then enable the IRQ. In this case, what >>>>> will happen? So I think I need more discussions from community. >>>> >>>> Do you know of an example scenario where Xen could have a problem with >>>> this? >>>> >>> >>> For example, >>> 1. Trigger a SError in hypervisor. >>> 2. Jump to hyp_error to handle SError. >>> 3. Enable IRQ in hyp_error before PANIC >>> 4. A timer IRQ happens. >>> 5. Jump to hyp_irq and unmask abort again. >>> 6. Jump hyp_error again? >> >> Technically you could end up in an infinite loop if hyp_error code generates a >> SError. It will stay pending until the end and then trigger again when SError >> is unmasked... >> >> That's unfortunate but I don't think this is a big issue as if this is >> happening your platform is already doomed. > > I agree, but the scenario suggested by Wei is not like that: hyp_error > does not generate an serror, it only unmask irqs. > > I think that it would be safer not to unmask IRQs in hyp_error (remove > msr daifclr, #2 at the beginning of hyp_error). IRQs can be enabled at > the end of do_trap_hyp_serror (if the hypervisor does not panic). > Yes, it would be safer if we defined an end for exception loop. And hyp_error is a good place to end up the exception loop. 1. If hyp_error will PANIC the system, enable IRQ in hyp_error to handle interrupts is non-significant. 2. If hyp_error will forward the SErrors, enable IRQ before forwarding SError to guest will make Xen enter exception loop. The SError would not have chance to be forwarded to guest. So, I think enable IRQ at the end of do_trap_hyp_serror would be better.
Hi Wei, On 23/03/17 03:13, Wei Chen wrote: > On 2017/3/23 6:22, Stefano Stabellini wrote: >> On Wed, 22 Mar 2017, Julien Grall wrote: >>> Hi Wei, >>> >>> On 22/03/17 08:49, Wei Chen wrote: >>>> Hi Stefano, >>>> >>>> On 2017/3/21 5:38, Stefano Stabellini wrote: >>>>> On Mon, 13 Mar 2017, Wei Chen wrote: >>>>>> Currently, we masked the Abort/SError bit in Xen exception entries. >>>>>> So Xen could not capture any Abort/SError while it's running. >>>>>> Now, Xen has the ability to handle the Abort/SError, we should unmask >>>>>> the Abort/SError bit by default to let Xen capture Abort/SError while >>>>>> it's running. >>>>>> >>>>>> But in order to avoid receiving nested asynchronous abort, we don't >>>>>> unmask Abort/SError bit in hyp_error and trap_data_abort. >>>>>> >>>>>> Signed-off-by: Wei Chen <Wei.Chen@arm.com> >>>>>> --- >>>>>> We haven't done this before, so I don't know how can this change >>>>>> will affect the Xen. If the IRQ and Abort take place at the same >>>>>> time, how can we handle them? >>>>> >>>>> If the abort is for Xen, the hypervisor will crash and that's the end of >>>> >>>> Before the system crash, we have enable the IRQ, so that would not be >>>> the end if an IRQ happens at the same time. >>>> >>>>> it. If the abort is for the guest, Xen will inject it into the VM, then >>>> >>>> Before we have inject the abort to VM, we have enable the IRQ. >>>> >>>>> it will return from handling the abort, going back to handling the IRQ >>>>> as before. Isn't that right? >>>> >>>> If the abort has higher priority then IRQ, it's right. >>>> >>>>> >>>>> >>>>>> If an abort is taking place while we're handling the IRQ, the program >>>>>> jump to abort exception, and then enable the IRQ. In this case, what >>>>>> will happen? So I think I need more discussions from community. >>>>> >>>>> Do you know of an example scenario where Xen could have a problem with >>>>> this? >>>>> >>>> >>>> For example, >>>> 1. Trigger a SError in hypervisor. >>>> 2. Jump to hyp_error to handle SError. >>>> 3. Enable IRQ in hyp_error before PANIC >>>> 4. A timer IRQ happens. >>>> 5. Jump to hyp_irq and unmask abort again. >>>> 6. Jump hyp_error again? >>> >>> Technically you could end up in an infinite loop if hyp_error code generates a >>> SError. It will stay pending until the end and then trigger again when SError >>> is unmasked... >>> >>> That's unfortunate but I don't think this is a big issue as if this is >>> happening your platform is already doomed. >> >> I agree, but the scenario suggested by Wei is not like that: hyp_error >> does not generate an serror, it only unmask irqs. >> >> I think that it would be safer not to unmask IRQs in hyp_error (remove >> msr daifclr, #2 at the beginning of hyp_error). IRQs can be enabled at >> the end of do_trap_hyp_serror (if the hypervisor does not panic). >> > > Yes, it would be safer if we defined an end for exception loop. And > hyp_error is a good place to end up the exception loop. > 1. If hyp_error will PANIC the system, enable IRQ in hyp_error to handle > interrupts is non-significant. > 2. If hyp_error will forward the SErrors, enable IRQ before forwarding > SError to guest will make Xen enter exception loop. The SError would > not have chance to be forwarded to guest. > > So, I think enable IRQ at the end of do_trap_hyp_serror would be better. That's not going to help. If the IRQ path is triggering an SError again and again, there is no way to go out even if you re-enable IRQ later. The HCR_VA will be set for the guest, but you will never go return to the guest. So you will still ending up in an infinite loop. Cheers,
On Thu, 23 Mar 2017, Julien Grall wrote: > Hi Wei, > > On 23/03/17 03:13, Wei Chen wrote: > > On 2017/3/23 6:22, Stefano Stabellini wrote: > > > On Wed, 22 Mar 2017, Julien Grall wrote: > > > > Hi Wei, > > > > > > > > On 22/03/17 08:49, Wei Chen wrote: > > > > > Hi Stefano, > > > > > > > > > > On 2017/3/21 5:38, Stefano Stabellini wrote: > > > > > > On Mon, 13 Mar 2017, Wei Chen wrote: > > > > > > > Currently, we masked the Abort/SError bit in Xen exception > > > > > > > entries. > > > > > > > So Xen could not capture any Abort/SError while it's running. > > > > > > > Now, Xen has the ability to handle the Abort/SError, we should > > > > > > > unmask > > > > > > > the Abort/SError bit by default to let Xen capture Abort/SError > > > > > > > while > > > > > > > it's running. > > > > > > > > > > > > > > But in order to avoid receiving nested asynchronous abort, we > > > > > > > don't > > > > > > > unmask Abort/SError bit in hyp_error and trap_data_abort. > > > > > > > > > > > > > > Signed-off-by: Wei Chen <Wei.Chen@arm.com> > > > > > > > --- > > > > > > > We haven't done this before, so I don't know how can this change > > > > > > > will affect the Xen. If the IRQ and Abort take place at the same > > > > > > > time, how can we handle them? > > > > > > > > > > > > If the abort is for Xen, the hypervisor will crash and that's the > > > > > > end of > > > > > > > > > > Before the system crash, we have enable the IRQ, so that would not be > > > > > the end if an IRQ happens at the same time. > > > > > > > > > > > it. If the abort is for the guest, Xen will inject it into the VM, > > > > > > then > > > > > > > > > > Before we have inject the abort to VM, we have enable the IRQ. > > > > > > > > > > > it will return from handling the abort, going back to handling the > > > > > > IRQ > > > > > > as before. Isn't that right? > > > > > > > > > > If the abort has higher priority then IRQ, it's right. > > > > > > > > > > > > > > > > > > > > > > > > If an abort is taking place while we're handling the IRQ, the > > > > > > > program > > > > > > > jump to abort exception, and then enable the IRQ. In this case, > > > > > > > what > > > > > > > will happen? So I think I need more discussions from community. > > > > > > > > > > > > Do you know of an example scenario where Xen could have a problem > > > > > > with > > > > > > this? > > > > > > > > > > > > > > > > For example, > > > > > 1. Trigger a SError in hypervisor. > > > > > 2. Jump to hyp_error to handle SError. > > > > > 3. Enable IRQ in hyp_error before PANIC > > > > > 4. A timer IRQ happens. > > > > > 5. Jump to hyp_irq and unmask abort again. > > > > > 6. Jump hyp_error again? > > > > > > > > Technically you could end up in an infinite loop if hyp_error code > > > > generates a > > > > SError. It will stay pending until the end and then trigger again when > > > > SError > > > > is unmasked... > > > > > > > > That's unfortunate but I don't think this is a big issue as if this is > > > > happening your platform is already doomed. > > > > > > I agree, but the scenario suggested by Wei is not like that: hyp_error > > > does not generate an serror, it only unmask irqs. > > > > > > I think that it would be safer not to unmask IRQs in hyp_error (remove > > > msr daifclr, #2 at the beginning of hyp_error). IRQs can be enabled at > > > the end of do_trap_hyp_serror (if the hypervisor does not panic). > > > > > > > Yes, it would be safer if we defined an end for exception loop. And > > hyp_error is a good place to end up the exception loop. > > 1. If hyp_error will PANIC the system, enable IRQ in hyp_error to handle > > interrupts is non-significant. > > 2. If hyp_error will forward the SErrors, enable IRQ before forwarding > > SError to guest will make Xen enter exception loop. The SError would > > not have chance to be forwarded to guest. > > > > So, I think enable IRQ at the end of do_trap_hyp_serror would be better. > > That's not going to help. If the IRQ path is triggering an SError again and > again, there is no way to go out even if you re-enable IRQ later. I understand where the misunderstanding comes from, I think it is due to a different interpretation of how the hardware behaves. Only one is correct, and I think it's yours. In the following case, assuming only one SError is ever generated: 1. A guest causes an SError 2. Xen jumps to hyp_error to handle it 3. Xen enables IRQ in hyp_error 4. Xen receives a timer event 5. Xen jumps to hyp_irq and unmask abort 6. ??? In 6., Xen doesn't jump to hyp_error again because of the first SError, right? The SError has already been received, Xen wouldn't trap again on the same abort because something hasn't been cleared, right? In that case, there is no need to delay unmasking IRQs, as you have been saying: 6. Xen handles the IRQ, injects a virtual irq to guest 7. Xen returns from IRQ, goes back to abort handler 8. Xen injects abort to guest in that case this patch is good as is.
Hi Stefano, On 2017/3/24 8:10, Stefano Stabellini wrote: > On Thu, 23 Mar 2017, Julien Grall wrote: >> Hi Wei, >> >> On 23/03/17 03:13, Wei Chen wrote: >>> On 2017/3/23 6:22, Stefano Stabellini wrote: >>>> On Wed, 22 Mar 2017, Julien Grall wrote: >>>>> Hi Wei, >>>>> >>>>> On 22/03/17 08:49, Wei Chen wrote: >>>>>> Hi Stefano, >>>>>> >>>>>> On 2017/3/21 5:38, Stefano Stabellini wrote: >>>>>>> On Mon, 13 Mar 2017, Wei Chen wrote: >>>>>>>> Currently, we masked the Abort/SError bit in Xen exception >>>>>>>> entries. >>>>>>>> So Xen could not capture any Abort/SError while it's running. >>>>>>>> Now, Xen has the ability to handle the Abort/SError, we should >>>>>>>> unmask >>>>>>>> the Abort/SError bit by default to let Xen capture Abort/SError >>>>>>>> while >>>>>>>> it's running. >>>>>>>> >>>>>>>> But in order to avoid receiving nested asynchronous abort, we >>>>>>>> don't >>>>>>>> unmask Abort/SError bit in hyp_error and trap_data_abort. >>>>>>>> >>>>>>>> Signed-off-by: Wei Chen <Wei.Chen@arm.com> >>>>>>>> --- >>>>>>>> We haven't done this before, so I don't know how can this change >>>>>>>> will affect the Xen. If the IRQ and Abort take place at the same >>>>>>>> time, how can we handle them? >>>>>>> >>>>>>> If the abort is for Xen, the hypervisor will crash and that's the >>>>>>> end of >>>>>> >>>>>> Before the system crash, we have enable the IRQ, so that would not be >>>>>> the end if an IRQ happens at the same time. >>>>>> >>>>>>> it. If the abort is for the guest, Xen will inject it into the VM, >>>>>>> then >>>>>> >>>>>> Before we have inject the abort to VM, we have enable the IRQ. >>>>>> >>>>>>> it will return from handling the abort, going back to handling the >>>>>>> IRQ >>>>>>> as before. Isn't that right? >>>>>> >>>>>> If the abort has higher priority then IRQ, it's right. >>>>>> >>>>>>> >>>>>>> >>>>>>>> If an abort is taking place while we're handling the IRQ, the >>>>>>>> program >>>>>>>> jump to abort exception, and then enable the IRQ. In this case, >>>>>>>> what >>>>>>>> will happen? So I think I need more discussions from community. >>>>>>> >>>>>>> Do you know of an example scenario where Xen could have a problem >>>>>>> with >>>>>>> this? >>>>>>> >>>>>> >>>>>> For example, >>>>>> 1. Trigger a SError in hypervisor. >>>>>> 2. Jump to hyp_error to handle SError. >>>>>> 3. Enable IRQ in hyp_error before PANIC >>>>>> 4. A timer IRQ happens. >>>>>> 5. Jump to hyp_irq and unmask abort again. >>>>>> 6. Jump hyp_error again? >>>>> >>>>> Technically you could end up in an infinite loop if hyp_error code >>>>> generates a >>>>> SError. It will stay pending until the end and then trigger again when >>>>> SError >>>>> is unmasked... >>>>> >>>>> That's unfortunate but I don't think this is a big issue as if this is >>>>> happening your platform is already doomed. >>>> >>>> I agree, but the scenario suggested by Wei is not like that: hyp_error >>>> does not generate an serror, it only unmask irqs. >>>> >>>> I think that it would be safer not to unmask IRQs in hyp_error (remove >>>> msr daifclr, #2 at the beginning of hyp_error). IRQs can be enabled at >>>> the end of do_trap_hyp_serror (if the hypervisor does not panic). >>>> >>> >>> Yes, it would be safer if we defined an end for exception loop. And >>> hyp_error is a good place to end up the exception loop. >>> 1. If hyp_error will PANIC the system, enable IRQ in hyp_error to handle >>> interrupts is non-significant. >>> 2. If hyp_error will forward the SErrors, enable IRQ before forwarding >>> SError to guest will make Xen enter exception loop. The SError would >>> not have chance to be forwarded to guest. >>> >>> So, I think enable IRQ at the end of do_trap_hyp_serror would be better. >> >> That's not going to help. If the IRQ path is triggering an SError again and >> again, there is no way to go out even if you re-enable IRQ later. > > I understand where the misunderstanding comes from, I think it is due to > a different interpretation of how the hardware behaves. Only one is > correct, and I think it's yours. > > In the following case, assuming only one SError is ever generated: > > 1. A guest causes an SError > 2. Xen jumps to hyp_error to handle it > 3. Xen enables IRQ in hyp_error > 4. Xen receives a timer event > 5. Xen jumps to hyp_irq and unmask abort > 6. ??? > > In 6., Xen doesn't jump to hyp_error again because of the first SError, > right? The SError has already been received, Xen wouldn't trap again on > the same abort because something hasn't been cleared, right? In that Yes, I think you are right. I have done a test, the hardware works as you said. Xen would not jump to hyp_error again by the first SError. > case, there is no need to delay unmasking IRQs, as you have been saying: > > 6. Xen handles the IRQ, injects a virtual irq to guest > 7. Xen returns from IRQ, goes back to abort handler > 8. Xen injects abort to guest > > in that case this patch is good as is. > Thanks, that reassures me :)
On Fri, 24 Mar 2017, Wei Chen wrote: > Hi Stefano, > > On 2017/3/24 8:10, Stefano Stabellini wrote: > > On Thu, 23 Mar 2017, Julien Grall wrote: > >> Hi Wei, > >> > >> On 23/03/17 03:13, Wei Chen wrote: > >>> On 2017/3/23 6:22, Stefano Stabellini wrote: > >>>> On Wed, 22 Mar 2017, Julien Grall wrote: > >>>>> Hi Wei, > >>>>> > >>>>> On 22/03/17 08:49, Wei Chen wrote: > >>>>>> Hi Stefano, > >>>>>> > >>>>>> On 2017/3/21 5:38, Stefano Stabellini wrote: > >>>>>>> On Mon, 13 Mar 2017, Wei Chen wrote: > >>>>>>>> Currently, we masked the Abort/SError bit in Xen exception > >>>>>>>> entries. > >>>>>>>> So Xen could not capture any Abort/SError while it's running. > >>>>>>>> Now, Xen has the ability to handle the Abort/SError, we should > >>>>>>>> unmask > >>>>>>>> the Abort/SError bit by default to let Xen capture Abort/SError > >>>>>>>> while > >>>>>>>> it's running. > >>>>>>>> > >>>>>>>> But in order to avoid receiving nested asynchronous abort, we > >>>>>>>> don't > >>>>>>>> unmask Abort/SError bit in hyp_error and trap_data_abort. > >>>>>>>> > >>>>>>>> Signed-off-by: Wei Chen <Wei.Chen@arm.com> > >>>>>>>> --- > >>>>>>>> We haven't done this before, so I don't know how can this change > >>>>>>>> will affect the Xen. If the IRQ and Abort take place at the same > >>>>>>>> time, how can we handle them? > >>>>>>> > >>>>>>> If the abort is for Xen, the hypervisor will crash and that's the > >>>>>>> end of > >>>>>> > >>>>>> Before the system crash, we have enable the IRQ, so that would not be > >>>>>> the end if an IRQ happens at the same time. > >>>>>> > >>>>>>> it. If the abort is for the guest, Xen will inject it into the VM, > >>>>>>> then > >>>>>> > >>>>>> Before we have inject the abort to VM, we have enable the IRQ. > >>>>>> > >>>>>>> it will return from handling the abort, going back to handling the > >>>>>>> IRQ > >>>>>>> as before. Isn't that right? > >>>>>> > >>>>>> If the abort has higher priority then IRQ, it's right. > >>>>>> > >>>>>>> > >>>>>>> > >>>>>>>> If an abort is taking place while we're handling the IRQ, the > >>>>>>>> program > >>>>>>>> jump to abort exception, and then enable the IRQ. In this case, > >>>>>>>> what > >>>>>>>> will happen? So I think I need more discussions from community. > >>>>>>> > >>>>>>> Do you know of an example scenario where Xen could have a problem > >>>>>>> with > >>>>>>> this? > >>>>>>> > >>>>>> > >>>>>> For example, > >>>>>> 1. Trigger a SError in hypervisor. > >>>>>> 2. Jump to hyp_error to handle SError. > >>>>>> 3. Enable IRQ in hyp_error before PANIC > >>>>>> 4. A timer IRQ happens. > >>>>>> 5. Jump to hyp_irq and unmask abort again. > >>>>>> 6. Jump hyp_error again? > >>>>> > >>>>> Technically you could end up in an infinite loop if hyp_error code > >>>>> generates a > >>>>> SError. It will stay pending until the end and then trigger again when > >>>>> SError > >>>>> is unmasked... > >>>>> > >>>>> That's unfortunate but I don't think this is a big issue as if this is > >>>>> happening your platform is already doomed. > >>>> > >>>> I agree, but the scenario suggested by Wei is not like that: hyp_error > >>>> does not generate an serror, it only unmask irqs. > >>>> > >>>> I think that it would be safer not to unmask IRQs in hyp_error (remove > >>>> msr daifclr, #2 at the beginning of hyp_error). IRQs can be enabled at > >>>> the end of do_trap_hyp_serror (if the hypervisor does not panic). > >>>> > >>> > >>> Yes, it would be safer if we defined an end for exception loop. And > >>> hyp_error is a good place to end up the exception loop. > >>> 1. If hyp_error will PANIC the system, enable IRQ in hyp_error to handle > >>> interrupts is non-significant. > >>> 2. If hyp_error will forward the SErrors, enable IRQ before forwarding > >>> SError to guest will make Xen enter exception loop. The SError would > >>> not have chance to be forwarded to guest. > >>> > >>> So, I think enable IRQ at the end of do_trap_hyp_serror would be better. > >> > >> That's not going to help. If the IRQ path is triggering an SError again and > >> again, there is no way to go out even if you re-enable IRQ later. > > > > I understand where the misunderstanding comes from, I think it is due to > > a different interpretation of how the hardware behaves. Only one is > > correct, and I think it's yours. > > > > In the following case, assuming only one SError is ever generated: > > > > 1. A guest causes an SError > > 2. Xen jumps to hyp_error to handle it > > 3. Xen enables IRQ in hyp_error > > 4. Xen receives a timer event > > 5. Xen jumps to hyp_irq and unmask abort > > 6. ??? > > > > In 6., Xen doesn't jump to hyp_error again because of the first SError, > > right? The SError has already been received, Xen wouldn't trap again on > > the same abort because something hasn't been cleared, right? In that > > Yes, I think you are right. I have done a test, the hardware works as > you said. Xen would not jump to hyp_error again by the first SError. > > > case, there is no need to delay unmasking IRQs, as you have been saying: > > > > 6. Xen handles the IRQ, injects a virtual irq to guest > > 7. Xen returns from IRQ, goes back to abort handler > > 8. Xen injects abort to guest > > > > in that case this patch is good as is. > > > > Thanks, that reassures me :) Great, you can add my Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
diff --git a/xen/arch/arm/arm32/entry.S b/xen/arch/arm/arm32/entry.S index 79929ca..4d46239 100644 --- a/xen/arch/arm/arm32/entry.S +++ b/xen/arch/arm/arm32/entry.S @@ -125,6 +125,7 @@ abort_guest_exit_end: trap_##trap: \ SAVE_ALL; \ cpsie i; /* local_irq_enable */ \ + cpsie a; /* asynchronous abort enable */ \ adr lr, return_from_trap; \ mov r0, sp; \ mov r11, sp; \ @@ -135,6 +136,18 @@ trap_##trap: \ ALIGN; \ trap_##trap: \ SAVE_ALL; \ + cpsie a; /* asynchronous abort enable */ \ + adr lr, return_from_trap; \ + mov r0, sp; \ + mov r11, sp; \ + bic sp, #7; /* Align the stack pointer (noop on guest trap) */ \ + b do_trap_##trap + +#define DEFINE_TRAP_ENTRY_NOABORT(trap) \ + ALIGN; \ +trap_##trap: \ + SAVE_ALL; \ + cpsie i; /* local_irq_enable */ \ adr lr, return_from_trap; \ mov r0, sp; \ mov r11, sp; \ @@ -155,10 +168,10 @@ GLOBAL(hyp_traps_vector) DEFINE_TRAP_ENTRY(undefined_instruction) DEFINE_TRAP_ENTRY(supervisor_call) DEFINE_TRAP_ENTRY(prefetch_abort) -DEFINE_TRAP_ENTRY(data_abort) DEFINE_TRAP_ENTRY(hypervisor) DEFINE_TRAP_ENTRY_NOIRQ(irq) DEFINE_TRAP_ENTRY_NOIRQ(fiq) +DEFINE_TRAP_ENTRY_NOABORT(data_abort) return_from_trap: mov sp, r11 diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S index 8d5a890..0401a41 100644 --- a/xen/arch/arm/arm64/entry.S +++ b/xen/arch/arm/arm64/entry.S @@ -187,13 +187,14 @@ hyp_error: /* Traps taken in Current EL with SP_ELx */ hyp_sync: entry hyp=1 - msr daifclr, #2 + msr daifclr, #6 mov x0, sp bl do_trap_hypervisor exit hyp=1 hyp_irq: entry hyp=1 + msr daifclr, #4 mov x0, sp bl do_trap_irq exit hyp=1 @@ -208,7 +209,7 @@ guest_sync: ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", "nop; nop", SKIP_CHECK_PENDING_VSERROR) - msr daifclr, #2 + msr daifclr, #6 mov x0, sp bl do_trap_hypervisor 1: @@ -224,6 +225,7 @@ guest_irq: ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", "nop; nop", SKIP_CHECK_PENDING_VSERROR) + msr daifclr, #4 mov x0, sp bl do_trap_irq 1: @@ -235,7 +237,7 @@ guest_fiq_invalid: guest_error: entry hyp=0, compat=0 - msr daifclr, #2 + msr daifclr, #6 mov x0, sp bl do_trap_guest_serror exit hyp=0, compat=0 @@ -250,7 +252,7 @@ guest_sync_compat: ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", "nop; nop", SKIP_CHECK_PENDING_VSERROR) - msr daifclr, #2 + msr daifclr, #6 mov x0, sp bl do_trap_hypervisor 1: @@ -266,6 +268,7 @@ guest_irq_compat: ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", "nop; nop", SKIP_CHECK_PENDING_VSERROR) + msr daifclr, #4 mov x0, sp bl do_trap_irq 1: @@ -277,7 +280,7 @@ guest_fiq_invalid_compat: guest_error_compat: entry hyp=0, compat=1 - msr daifclr, #2 + msr daifclr, #6 mov x0, sp bl do_trap_guest_serror exit hyp=0, compat=1
Currently, we masked the Abort/SError bit in Xen exception entries. So Xen could not capture any Abort/SError while it's running. Now, Xen has the ability to handle the Abort/SError, we should unmask the Abort/SError bit by default to let Xen capture Abort/SError while it's running. But in order to avoid receiving nested asynchronous abort, we don't unmask Abort/SError bit in hyp_error and trap_data_abort. Signed-off-by: Wei Chen <Wei.Chen@arm.com> --- We haven't done this before, so I don't know how can this change will affect the Xen. If the IRQ and Abort take place at the same time, how can we handle them? If an abort is taking place while we're handling the IRQ, the program jump to abort exception, and then enable the IRQ. In this case, what will happen? So I think I need more discussions from community. --- xen/arch/arm/arm32/entry.S | 15 ++++++++++++++- xen/arch/arm/arm64/entry.S | 13 ++++++++----- 2 files changed, 22 insertions(+), 6 deletions(-)