diff mbox series

[RFC,1/3] arm64: kvm: Handle Asymmetric AArch32 systems

Message ID 20201008181641.32767-2-qais.yousef@arm.com (mailing list archive)
State New, archived
Headers show
Series Add support for Asymmetric AArch32 systems | expand

Commit Message

Qais Yousef Oct. 8, 2020, 6:16 p.m. UTC
On a system without uniform support for AArch32 at EL0, it is possible
for the guest to force run AArch32 at EL0 and potentially cause an
illegal exception if running on the wrong core.

Add an extra check to catch if the guest ever does that and prevent it
from running again, treating it as ARM_EXCEPTION_IL.

We try to catch this misbehavior as early as possible and not rely on
PSTATE.IL to occur.

Tested on Juno by instrumenting the host to:

	* Fake asym aarch32.
	* Comment out hiding of ID registers from the guest.

Any attempt to run 32bit app in the guest will produce such error on
qemu:

	# ./test
	error: kvm run failed Invalid argument
	R00=ffff0fff R01=ffffffff R02=00000000 R03=00087968
	R04=000874b8 R05=ffd70b24 R06=ffd70b2c R07=00000055
	R08=00000000 R09=00000000 R10=00000000 R11=00000000
	R12=0000001c R13=ffd6f974 R14=0001ff64 R15=ffff0fe0
	PSR=a0000010 N-C- A usr32

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 arch/arm64/kvm/arm.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Marc Zyngier Oct. 9, 2020, 8:12 a.m. UTC | #1
On 2020-10-08 19:16, Qais Yousef wrote:
> On a system without uniform support for AArch32 at EL0, it is possible
> for the guest to force run AArch32 at EL0 and potentially cause an
> illegal exception if running on the wrong core.
> 
> Add an extra check to catch if the guest ever does that and prevent it
> from running again, treating it as ARM_EXCEPTION_IL.
> 
> We try to catch this misbehavior as early as possible and not rely on
> PSTATE.IL to occur.

nit: PSTATE.IL doesn't "occur". It is an "Illegal Exception Return" that
can happen.

> 
> Tested on Juno by instrumenting the host to:
> 
> 	* Fake asym aarch32.
> 	* Comment out hiding of ID registers from the guest.
> 
> Any attempt to run 32bit app in the guest will produce such error on
> qemu:
> 
> 	# ./test
> 	error: kvm run failed Invalid argument
> 	R00=ffff0fff R01=ffffffff R02=00000000 R03=00087968
> 	R04=000874b8 R05=ffd70b24 R06=ffd70b2c R07=00000055
> 	R08=00000000 R09=00000000 R10=00000000 R11=00000000
> 	R12=0000001c R13=ffd6f974 R14=0001ff64 R15=ffff0fe0
> 	PSR=a0000010 N-C- A usr32
> 
> Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> ---
>  arch/arm64/kvm/arm.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index b588c3b5c2f0..22ff3373d855 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>  	struct kvm_run *run = vcpu->run;
>  	int ret;
> 
> +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
> +		kvm_err("Illegal AArch32 mode at EL0, can't run.");

No, we don't scream on the console in an uncontrolled way based on
illegal user input (yes, the VM *is* userspace).

Furthermore, you seem to deal with the same problem *twice*. See below.

> +		return -ENOEXEC;
> +	}
> +
>  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
>  		return -ENOEXEC;
> 
> @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> 
>  		preempt_enable();
> 
> +		/*
> +		 * For asym aarch32 systems we present a 64bit only system to
> +		 * the guest. But in case it managed somehow to escape that and
> +		 * enter 32bit mode, catch that and prevent it from running
> +		 * again.

The guest didn't *escape* anything. It merely used the CPU as designed.
The fact that the hypervisor cannot prevent the guest from using AArch32
is an architectural defect.

> +		 */
> +		if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
> +			kvm_err("Detected illegal AArch32 mode at EL0, exiting.");

Same remark as above. Userspace has access to PSTATE and can work out
the issue by itself.

> +			ret = ARM_EXCEPTION_IL;

This will cause the thread to return to userspace after having done a
vcpu_put(). So why don't you just mark the vcpu as uninitialized before
returning to userspace? It already is in an illegal state, and the only
reasonable thing userspace can do is to reset it.

This way, we keep the horror in a single place, and force userspace to
take action if it really wants to recover the guest.

         M.
Qais Yousef Oct. 9, 2020, 9:58 a.m. UTC | #2
On 10/09/20 09:12, Marc Zyngier wrote:
> On 2020-10-08 19:16, Qais Yousef wrote:
> > On a system without uniform support for AArch32 at EL0, it is possible
> > for the guest to force run AArch32 at EL0 and potentially cause an
> > illegal exception if running on the wrong core.
> > 
> > Add an extra check to catch if the guest ever does that and prevent it
> > from running again, treating it as ARM_EXCEPTION_IL.
> > 
> > We try to catch this misbehavior as early as possible and not rely on
> > PSTATE.IL to occur.
> 
> nit: PSTATE.IL doesn't "occur". It is an "Illegal Exception Return" that
> can happen.

+1

> 
> > 
> > Tested on Juno by instrumenting the host to:
> > 
> > 	* Fake asym aarch32.
> > 	* Comment out hiding of ID registers from the guest.
> > 
> > Any attempt to run 32bit app in the guest will produce such error on
> > qemu:
> > 
> > 	# ./test
> > 	error: kvm run failed Invalid argument
> > 	R00=ffff0fff R01=ffffffff R02=00000000 R03=00087968
> > 	R04=000874b8 R05=ffd70b24 R06=ffd70b2c R07=00000055
> > 	R08=00000000 R09=00000000 R10=00000000 R11=00000000
> > 	R12=0000001c R13=ffd6f974 R14=0001ff64 R15=ffff0fe0
> > 	PSR=a0000010 N-C- A usr32
> > 
> > Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> > ---
> >  arch/arm64/kvm/arm.c | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index b588c3b5c2f0..22ff3373d855 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> >  	struct kvm_run *run = vcpu->run;
> >  	int ret;
> > 
> > +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
> > +		kvm_err("Illegal AArch32 mode at EL0, can't run.");
> 
> No, we don't scream on the console in an uncontrolled way based on
> illegal user input (yes, the VM *is* userspace).

It seemed kind to print a good reason of what just happened.

> 
> Furthermore, you seem to deal with the same problem *twice*. See below.

It's done below because we could loop back into the guest again, so we force an
exit then. Here to make sure if the VMM ignores the error value we returned
earlier it can't force its way back in again.

> 
> > +		return -ENOEXEC;
> > +	}
> > +
> >  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
> >  		return -ENOEXEC;
> > 
> > @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > 
> >  		preempt_enable();
> > 
> > +		/*
> > +		 * For asym aarch32 systems we present a 64bit only system to
> > +		 * the guest. But in case it managed somehow to escape that and
> > +		 * enter 32bit mode, catch that and prevent it from running
> > +		 * again.
> 
> The guest didn't *escape* anything. It merely used the CPU as designed.
> The fact that the hypervisor cannot prevent the guest from using AArch32
> is an architectural defect.

Happy to change the wording if you tell me what you prefer :-)

> 
> > +		 */
> > +		if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
> > +			kvm_err("Detected illegal AArch32 mode at EL0, exiting.");
> 
> Same remark as above. Userspace has access to PSTATE and can work out
> the issue by itself.

Okay.

> 
> > +			ret = ARM_EXCEPTION_IL;
> 
> This will cause the thread to return to userspace after having done a
> vcpu_put(). So why don't you just mark the vcpu as uninitialized before
> returning to userspace? It already is in an illegal state, and the only
> reasonable thing userspace can do is to reset it.

Because I probably didn't navigate my way correctly around the code. Mind
expanding how to mark the vcpu as uninitialized? I have tried 2 ways
in that effect but they were really horrible, so will abstain from sharing :-)

> 
> This way, we keep the horror in a single place, and force userspace to
> take action if it really wants to recover the guest.

Happy to try it out.

Thanks

--
Qais Yousef
Marc Zyngier Oct. 9, 2020, 12:34 p.m. UTC | #3
On 2020-10-09 10:58, Qais Yousef wrote:

[...]

>> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> > index b588c3b5c2f0..22ff3373d855 100644
>> > --- a/arch/arm64/kvm/arm.c
>> > +++ b/arch/arm64/kvm/arm.c
>> > @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>> >  	struct kvm_run *run = vcpu->run;
>> >  	int ret;
>> >
>> > +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
>> > +		kvm_err("Illegal AArch32 mode at EL0, can't run.");
>> 
>> No, we don't scream on the console in an uncontrolled way based on
>> illegal user input (yes, the VM *is* userspace).
> 
> It seemed kind to print a good reason of what just happened.

I'm afraid it only serves as an instrument to spam the console. 
Userspace
gave you an illegal state, you respond with an error. The error is, on
its own, descriptive enough. In general, we only print on the console
when KVM is faced with an internal error of some sort. That's not the
case here.

> 
>> 
>> Furthermore, you seem to deal with the same problem *twice*. See 
>> below.
> 
> It's done below because we could loop back into the guest again, so we 
> force an
> exit then. Here to make sure if the VMM ignores the error value we 
> returned
> earlier it can't force its way back in again.

Which we already handle if you do what I hinted at below.

>> 
>> > +		return -ENOEXEC;
>> > +	}
>> > +
>> >  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
>> >  		return -ENOEXEC;
>> >
>> > @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>> >
>> >  		preempt_enable();
>> >
>> > +		/*
>> > +		 * For asym aarch32 systems we present a 64bit only system to
>> > +		 * the guest. But in case it managed somehow to escape that and
>> > +		 * enter 32bit mode, catch that and prevent it from running
>> > +		 * again.
>> 
>> The guest didn't *escape* anything. It merely used the CPU as 
>> designed.
>> The fact that the hypervisor cannot prevent the guest from using 
>> AArch32
>> is an architectural defect.
> 
> Happy to change the wording if you tell me what you prefer :-)

"The ARMv8 architecture doesn't give the hypervisor a mechanism to 
prevent
  a guest from dropping to AArch32 EL0 if implemented by the CPU. If we 
spot
  the guest in such state and that we decided it wasn't supposed to do so
  (like with the asymmetric AArch32 case), return to userspace with a 
fatal
  error."

> 
>> 
>> > +		 */
>> > +		if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
>> > +			kvm_err("Detected illegal AArch32 mode at EL0, exiting.");
>> 
>> Same remark as above. Userspace has access to PSTATE and can work out
>> the issue by itself.
> 
> Okay.
> 
>> 
>> > +			ret = ARM_EXCEPTION_IL;
>> 
>> This will cause the thread to return to userspace after having done a
>> vcpu_put(). So why don't you just mark the vcpu as uninitialized 
>> before
>> returning to userspace? It already is in an illegal state, and the 
>> only
>> reasonable thing userspace can do is to reset it.
> 
> Because I probably didn't navigate my way correctly around the code. 
> Mind
> expanding how to mark the vcpu as uninitialized? I have tried 2 ways
> in that effect but they were really horrible, so will abstain from 
> sharing :-)

You can try setting vcpu->arch.target to -1, which is already caught by
kvm_vcpu_initialized() right at the top of this function. This will
prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.

         M.
Qais Yousef Oct. 9, 2020, 12:48 p.m. UTC | #4
On 10/09/20 13:34, Marc Zyngier wrote:
> On 2020-10-09 10:58, Qais Yousef wrote:
> 
> [...]
> 
> > > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > > index b588c3b5c2f0..22ff3373d855 100644
> > > > --- a/arch/arm64/kvm/arm.c
> > > > +++ b/arch/arm64/kvm/arm.c
> > > > @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > > >  	struct kvm_run *run = vcpu->run;
> > > >  	int ret;
> > > >
> > > > +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
> > > > +		kvm_err("Illegal AArch32 mode at EL0, can't run.");
> > > 
> > > No, we don't scream on the console in an uncontrolled way based on
> > > illegal user input (yes, the VM *is* userspace).
> > 
> > It seemed kind to print a good reason of what just happened.
> 
> I'm afraid it only serves as an instrument to spam the console. Userspace
> gave you an illegal state, you respond with an error. The error is, on
> its own, descriptive enough. In general, we only print on the console
> when KVM is faced with an internal error of some sort. That's not the
> case here.

Fair enough.

> 
> > 
> > > 
> > > Furthermore, you seem to deal with the same problem *twice*. See
> > > below.
> > 
> > It's done below because we could loop back into the guest again, so we
> > force an
> > exit then. Here to make sure if the VMM ignores the error value we
> > returned
> > earlier it can't force its way back in again.
> 
> Which we already handle if you do what I hinted at below.
> 
> > > 
> > > > +		return -ENOEXEC;
> > > > +	}
> > > > +
> > > >  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
> > > >  		return -ENOEXEC;
> > > >
> > > > @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > > >
> > > >  		preempt_enable();
> > > >
> > > > +		/*
> > > > +		 * For asym aarch32 systems we present a 64bit only system to
> > > > +		 * the guest. But in case it managed somehow to escape that and
> > > > +		 * enter 32bit mode, catch that and prevent it from running
> > > > +		 * again.
> > > 
> > > The guest didn't *escape* anything. It merely used the CPU as
> > > designed.
> > > The fact that the hypervisor cannot prevent the guest from using
> > > AArch32
> > > is an architectural defect.
> > 
> > Happy to change the wording if you tell me what you prefer :-)
> 
> "The ARMv8 architecture doesn't give the hypervisor a mechanism to prevent
>  a guest from dropping to AArch32 EL0 if implemented by the CPU. If we spot
>  the guest in such state and that we decided it wasn't supposed to do so
>  (like with the asymmetric AArch32 case), return to userspace with a fatal
>  error."

Thanks.

> > Because I probably didn't navigate my way correctly around the code.
> > Mind
> > expanding how to mark the vcpu as uninitialized? I have tried 2 ways
> > in that effect but they were really horrible, so will abstain from
> > sharing :-)
> 
> You can try setting vcpu->arch.target to -1, which is already caught by
> kvm_vcpu_initialized() right at the top of this function. This will
> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.

That's actually one of the things I tried before ending with this patch.
I thought it's really aggressive and unfriendly.

It does indeed cause qemu to abort out completely.

> +                       /* Reset target so it won't run again */
> +                       vcpu->arch.target = -1;
>
>       # ./test
>       error: kvm run failed Invalid argument
>        PC=ffff80001093ad64 X00=ffff80001686d014 X01=ffff80001093ad40
>       X02=3fa71c3de1990200 X03=0000000000000030 X04=0000000000000000
>       X05=0000000000000000 X06=ffff0000767d3c06 X07=74246e6873716875
>       X08=7f7f7f7f7f7f7f7f X09=0400000000000001 X10=0101010101010101
>       X11=0000000000000001 X12=ffff800016887000 X13=ffff0000767d3c07
>       X14=ffff0000767d3c08 X15=ffff80001618a988 X16=000000000000000e
>       X17=0000000000000000 X18=ffffffffffffffff X19=ffff00007ad99800
>       X20=ffff8000163d20b8 X21=0000000000000000 X22=ffff00007ad99810
>       X23=ffff8000163d2270 X24=ffff8000163d22d0 X25=0000000000000000
>       X26=ffff800012bb5b80 X27=ffff800012cc1068 X28=0000000000000007
>       X29=ffff80001668bab0 X30=ffff8000109367c8  SP=ffff80001668bab0
>       PSTATE=80000005 N--- EL1h
>       qemu-system-aarch64: Failed to get KVM_REG_ARM_TIMER_CNT
>       Aborted

Thanks

--
Qais Yousef
James Morse Oct. 12, 2020, 3:32 p.m. UTC | #5
Hi Marc, Qais,

On 09/10/2020 13:48, Qais Yousef wrote:
> On 10/09/20 13:34, Marc Zyngier wrote:
>> On 2020-10-09 10:58, Qais Yousef wrote:

>>>>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>>>>> index b588c3b5c2f0..22ff3373d855 100644
>>>>> --- a/arch/arm64/kvm/arm.c
>>>>> +++ b/arch/arm64/kvm/arm.c
>>>>> @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>>>>>  	struct kvm_run *run = vcpu->run;
>>>>>  	int ret;
>>>>>
>>>>> +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
>>>>> +		kvm_err("Illegal AArch32 mode at EL0, can't run.");
>>>>
>>>> No, we don't scream on the console in an uncontrolled way based on
>>>> illegal user input (yes, the VM *is* userspace).
>>>
>>> It seemed kind to print a good reason of what just happened.

>>>> Furthermore, you seem to deal with the same problem *twice*. See
>>>> below.
>>>
>>> It's done below because we could loop back into the guest again, so we
>>> force an
>>> exit then. Here to make sure if the VMM ignores the error value we
>>> returned
>>> earlier it can't force its way back in again.

>> Which we already handle if you do what I hinted at below.

Do we trust the VMM not to try and get out of this?

We sanity-check the SPSR values the VMM writes via set_one_reg() to prevent aarch32 on
systems that don't support it. It seems strange that if you can get the bad value out of
hardware: you can keep it.
Returning to aarch32 from EL2 on a CPU that doesn't support it is terrifying.

To avoid always testing on entry from user-space, we could add a
'vmm-fixed-bad-value-from-hardware' request type, and refactor check_vcpu_requests() to
allow it to force a guest exit. Any KVM_EXIT_FAIL_ENTRY can set this to ensure the VMM
doesn't have its fingers in its ears.

This means the VMM can fix this by set_one_reg()ing an exception to aarch64 if it really
wants to, but it can't restart the guest with the bad SPSR value.


>>>>> +		return -ENOEXEC;
>>>>> +	}
>>>>> +
>>>>>  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
>>>>>  		return -ENOEXEC;
>>>>>
>>>>> @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>>>>>
>>>>>  		preempt_enable();
>>>>>
>>>>> +		/*
>>>>> +		 * For asym aarch32 systems we present a 64bit only system to
>>>>> +		 * the guest. But in case it managed somehow to escape that and
>>>>> +		 * enter 32bit mode, catch that and prevent it from running
>>>>> +		 * again.
>>>>
>>>> The guest didn't *escape* anything. It merely used the CPU as
>>>> designed.
>>>> The fact that the hypervisor cannot prevent the guest from using
>>>> AArch32
>>>> is an architectural defect.

>>> Because I probably didn't navigate my way correctly around the code.
>>> Mind
>>> expanding how to mark the vcpu as uninitialized? I have tried 2 ways
>>> in that effect but they were really horrible, so will abstain from
>>> sharing :-)
>>
>> You can try setting vcpu->arch.target to -1, which is already caught by
>> kvm_vcpu_initialized() right at the top of this function. This will

>> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.

This doesn't reset SPSR, so this lets the VMM restart the guest with a bad value.

I think we should make it impossible to return to aarch32 from EL2 on these systems.


Thanks,

James


> That's actually one of the things I tried before ending with this patch.
> I thought it's really aggressive and unfriendly.
> 
> It does indeed cause qemu to abort out completely.
> 
>> +                       /* Reset target so it won't run again */
>> +                       vcpu->arch.target = -1;
>>
Marc Zyngier Oct. 13, 2020, 10:32 a.m. UTC | #6
Hi James,

On 2020-10-12 16:32, James Morse wrote:
> Hi Marc, Qais,
> 
> On 09/10/2020 13:48, Qais Yousef wrote:
>> On 10/09/20 13:34, Marc Zyngier wrote:
>>> On 2020-10-09 10:58, Qais Yousef wrote:
> 
>>>>>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>>>>>> index b588c3b5c2f0..22ff3373d855 100644
>>>>>> --- a/arch/arm64/kvm/arm.c
>>>>>> +++ b/arch/arm64/kvm/arm.c
>>>>>> @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu 
>>>>>> *vcpu)
>>>>>>  	struct kvm_run *run = vcpu->run;
>>>>>>  	int ret;
>>>>>> 
>>>>>> +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
>>>>>> +		kvm_err("Illegal AArch32 mode at EL0, can't run.");
>>>>> 
>>>>> No, we don't scream on the console in an uncontrolled way based on
>>>>> illegal user input (yes, the VM *is* userspace).
>>>> 
>>>> It seemed kind to print a good reason of what just happened.
> 
>>>>> Furthermore, you seem to deal with the same problem *twice*. See
>>>>> below.
>>>> 
>>>> It's done below because we could loop back into the guest again, so 
>>>> we
>>>> force an
>>>> exit then. Here to make sure if the VMM ignores the error value we
>>>> returned
>>>> earlier it can't force its way back in again.
> 
>>> Which we already handle if you do what I hinted at below.
> 
> Do we trust the VMM not to try and get out of this?

I usually don't put the words trust and VMM in the same sentence...

> We sanity-check the SPSR values the VMM writes via set_one_reg() to
> prevent aarch32 on systems that don't support it. It seems strange
> that if you can get the bad value out of hardware: you can keep it.

The guest went into an illegal state, and we return that illegal state
to userspace. Garbage in, garbage out. I'm not too bothered about that,
but we cal always sanity check it.

> Returning to aarch32 from EL2 on a CPU that doesn't support it is 
> terrifying.

And I'm not proposing that we even try.

> 
> To avoid always testing on entry from user-space, we could add a
> 'vmm-fixed-bad-value-from-hardware' request type, and refactor
> check_vcpu_requests() to
> allow it to force a guest exit. Any KVM_EXIT_FAIL_ENTRY can set this
> to ensure the VMM
> doesn't have its fingers in its ears.
> 
> This means the VMM can fix this by set_one_reg()ing an exception to
> aarch64 if it really
> wants to, but it can't restart the guest with the bad SPSR value.
> 
> 
>>>>>> +		return -ENOEXEC;
>>>>>> +	}
>>>>>> +
>>>>>>  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
>>>>>>  		return -ENOEXEC;
>>>>>> 
>>>>>> @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu 
>>>>>> *vcpu)
>>>>>> 
>>>>>>  		preempt_enable();
>>>>>> 
>>>>>> +		/*
>>>>>> +		 * For asym aarch32 systems we present a 64bit only system to
>>>>>> +		 * the guest. But in case it managed somehow to escape that and
>>>>>> +		 * enter 32bit mode, catch that and prevent it from running
>>>>>> +		 * again.
>>>>> 
>>>>> The guest didn't *escape* anything. It merely used the CPU as
>>>>> designed.
>>>>> The fact that the hypervisor cannot prevent the guest from using
>>>>> AArch32
>>>>> is an architectural defect.
> 
>>>> Because I probably didn't navigate my way correctly around the code.
>>>> Mind
>>>> expanding how to mark the vcpu as uninitialized? I have tried 2 ways
>>>> in that effect but they were really horrible, so will abstain from
>>>> sharing :-)
>>> 
>>> You can try setting vcpu->arch.target to -1, which is already caught 
>>> by
>>> kvm_vcpu_initialized() right at the top of this function. This will
> 
>>> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.
> 
> This doesn't reset SPSR, so this lets the VMM restart the guest with a
> bad value.

That's not my reading of the code. Without a valid target, you cannot 
enter
the guest (kvm_vcpu_initialized() prevents you to do so). To set the 
target,
you need to issue a KVM_ARM_VCPU_INIT ioctl, which eventually calls
kvm_reset_vcpu(), which does set PSTATE to something legal.

Or do you mean the guest's SPSR_EL1? Why would that matter?

> I think we should make it impossible to return to aarch32 from EL2 on
> these systems.

And I'm saying that the above fulfills that requirement. Am I missing
something obvious?

Thanks,

         M.
James Morse Oct. 13, 2020, 11:51 a.m. UTC | #7
Hi Marc,

On 13/10/2020 11:32, Marc Zyngier wrote:
> On 2020-10-12 16:32, James Morse wrote:
>> On 09/10/2020 13:48, Qais Yousef wrote:
>>> On 10/09/20 13:34, Marc Zyngier wrote:
>>>> You can try setting vcpu->arch.target to -1, which is already caught by
>>>> kvm_vcpu_initialized() right at the top of this function. This will
>>
>>>> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.
>>
>> This doesn't reset SPSR, so this lets the VMM restart the guest with a
>> bad value.
> 
> That's not my reading of the code. Without a valid target, you cannot enter
> the guest (kvm_vcpu_initialized() prevents you to do so). To set the target,
> you need to issue a KVM_ARM_VCPU_INIT ioctl, which eventually calls

> kvm_reset_vcpu(), which does set PSTATE to something legal.

aha! So it does, this is what I was missing.


> Or do you mean the guest's SPSR_EL1? Why would that matter?
> 
>> I think we should make it impossible to return to aarch32 from EL2 on
>> these systems.
> 
> And I'm saying that the above fulfills that requirement. Am I missing
> something obvious?

Nope, I was.

I agree the check on entry from user-space isn't needed.


Thanks,

James
Qais Yousef Oct. 13, 2020, 11:59 a.m. UTC | #8
On 10/13/20 12:51, James Morse wrote:
> Hi Marc,
> 
> On 13/10/2020 11:32, Marc Zyngier wrote:
> > On 2020-10-12 16:32, James Morse wrote:
> >> On 09/10/2020 13:48, Qais Yousef wrote:
> >>> On 10/09/20 13:34, Marc Zyngier wrote:
> >>>> You can try setting vcpu->arch.target to -1, which is already caught by
> >>>> kvm_vcpu_initialized() right at the top of this function. This will
> >>
> >>>> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.
> >>
> >> This doesn't reset SPSR, so this lets the VMM restart the guest with a
> >> bad value.
> > 
> > That's not my reading of the code. Without a valid target, you cannot enter
> > the guest (kvm_vcpu_initialized() prevents you to do so). To set the target,
> > you need to issue a KVM_ARM_VCPU_INIT ioctl, which eventually calls
> 
> > kvm_reset_vcpu(), which does set PSTATE to something legal.
> 
> aha! So it does, this is what I was missing.
> 
> 
> > Or do you mean the guest's SPSR_EL1? Why would that matter?
> > 
> >> I think we should make it impossible to return to aarch32 from EL2 on
> >> these systems.
> > 
> > And I'm saying that the above fulfills that requirement. Am I missing
> > something obvious?
> 
> Nope, I was.
> 
> I agree the check on entry from user-space isn't needed.

Thanks both.

So using the vcpu->arch.target = -1 is the best way forward. In my experiments
when I did that I considered calling kvm_reset_vcpu() too, does it make sense
to force the reset here too? Or too heavy handed?

Thanks

--
Qais Yousef
Marc Zyngier Oct. 13, 2020, 12:09 p.m. UTC | #9
On 2020-10-13 12:59, Qais Yousef wrote:
> On 10/13/20 12:51, James Morse wrote:
>> Hi Marc,
>> 
>> On 13/10/2020 11:32, Marc Zyngier wrote:
>> > On 2020-10-12 16:32, James Morse wrote:
>> >> On 09/10/2020 13:48, Qais Yousef wrote:
>> >>> On 10/09/20 13:34, Marc Zyngier wrote:
>> >>>> You can try setting vcpu->arch.target to -1, which is already caught by
>> >>>> kvm_vcpu_initialized() right at the top of this function. This will
>> >>
>> >>>> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.
>> >>
>> >> This doesn't reset SPSR, so this lets the VMM restart the guest with a
>> >> bad value.
>> >
>> > That's not my reading of the code. Without a valid target, you cannot enter
>> > the guest (kvm_vcpu_initialized() prevents you to do so). To set the target,
>> > you need to issue a KVM_ARM_VCPU_INIT ioctl, which eventually calls
>> 
>> > kvm_reset_vcpu(), which does set PSTATE to something legal.
>> 
>> aha! So it does, this is what I was missing.
>> 
>> 
>> > Or do you mean the guest's SPSR_EL1? Why would that matter?
>> >
>> >> I think we should make it impossible to return to aarch32 from EL2 on
>> >> these systems.
>> >
>> > And I'm saying that the above fulfills that requirement. Am I missing
>> > something obvious?
>> 
>> Nope, I was.
>> 
>> I agree the check on entry from user-space isn't needed.
> 
> Thanks both.
> 
> So using the vcpu->arch.target = -1 is the best way forward. In my 
> experiments
> when I did that I considered calling kvm_reset_vcpu() too, does it make 
> sense
> to force the reset here too? Or too heavy handed?

No, userspace should know about it and take action if it wants too.
Trying to fix it behind the scenes is setting expectations, which
I'd really like to avoid.

         M.
Qais Yousef Oct. 13, 2020, 12:16 p.m. UTC | #10
On 10/13/20 13:09, Marc Zyngier wrote:
> On 2020-10-13 12:59, Qais Yousef wrote:
> > Thanks both.
> > 
> > So using the vcpu->arch.target = -1 is the best way forward. In my
> > experiments
> > when I did that I considered calling kvm_reset_vcpu() too, does it make
> > sense
> > to force the reset here too? Or too heavy handed?
> 
> No, userspace should know about it and take action if it wants too.
> Trying to fix it behind the scenes is setting expectations, which
> I'd really like to avoid.

Cool I thought so but I wanted to hear it directly :-)

Thanks!

--
Qais Yousef
diff mbox series

Patch

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index b588c3b5c2f0..22ff3373d855 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -644,6 +644,11 @@  int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 	struct kvm_run *run = vcpu->run;
 	int ret;
 
+	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
+		kvm_err("Illegal AArch32 mode at EL0, can't run.");
+		return -ENOEXEC;
+	}
+
 	if (unlikely(!kvm_vcpu_initialized(vcpu)))
 		return -ENOEXEC;
 
@@ -804,6 +809,17 @@  int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 
 		preempt_enable();
 
+		/*
+		 * For asym aarch32 systems we present a 64bit only system to
+		 * the guest. But in case it managed somehow to escape that and
+		 * enter 32bit mode, catch that and prevent it from running
+		 * again.
+		 */
+		if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
+			kvm_err("Detected illegal AArch32 mode at EL0, exiting.");
+			ret = ARM_EXCEPTION_IL;
+		}
+
 		ret = handle_exit(vcpu, ret);
 	}