KVM: arm64: Fix 32bit PC wrap-around

Message ID	20200430101513.318541-1-maz@kernel.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=DGI+=6O=vger.kernel.org=kvm-owner@kernel.org> From: Marc Zyngier <maz@kernel.org> To: linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu Cc: James Morse <james.morse@arm.com>, Julien Thierry <julien.thierry.kdev@gmail.com>, Suzuki K Poulose <suzuki.poulose@arm.com>, Will Deacon <will@kernel.org> Subject: [PATCH] KVM: arm64: Fix 32bit PC wrap-around Date: Thu, 30 Apr 2020 11:15:13 +0100 Message-Id: <20200430101513.318541-1-maz@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: kvm-owner@vger.kernel.org Precedence: bulk
Series	KVM: arm64: Fix 32bit PC wrap-around \| expand KVM: arm64: Fix 32bit PC wrap-around

Marc Zyngier April 30, 2020, 10:15 a.m. UTC

In the unlikely event that a 32bit vcpu traps into the hypervisor
on an instruction that is located right at the end of the 32bit
range, the emulation of that instruction is going to increment
PC past the 32bit range. This isn't great, as userspace can then
observe this value and get a bit confused.

Conversly, userspace can do things like (in the context of a 64bit
guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0,
set PC to a 64bit value, change PSTATE to AArch32-USR, and observe
that PC hasn't been truncated. More confusion.

Fix both by:
- truncating PC increments for 32bit guests
- sanitize PC every time a core reg is changed by userspace, and
  that PSTATE indicates a 32bit mode.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kvm/guest.c     | 4 ++++
 virt/kvm/arm/hyp/aarch32.c | 8 ++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

Will Deacon April 30, 2020, 10:25 a.m. UTC | #1

On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote:
> In the unlikely event that a 32bit vcpu traps into the hypervisor
> on an instruction that is located right at the end of the 32bit
> range, the emulation of that instruction is going to increment
> PC past the 32bit range. This isn't great, as userspace can then
> observe this value and get a bit confused.
> 
> Conversly, userspace can do things like (in the context of a 64bit
> guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0,
> set PC to a 64bit value, change PSTATE to AArch32-USR, and observe
> that PC hasn't been truncated. More confusion.
> 
> Fix both by:
> - truncating PC increments for 32bit guests
> - sanitize PC every time a core reg is changed by userspace, and
>   that PSTATE indicates a 32bit mode.

It's not clear to me whether this needs a cc stable. What do you think? I
suppose that it really depends on how confused e.g. QEMU gets.

> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  arch/arm64/kvm/guest.c     | 4 ++++
>  virt/kvm/arm/hyp/aarch32.c | 8 ++++++--
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 23ebe51410f0..2a159af82429 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	}
>  
>  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
> +
> +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
> +		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));

It seems slightly odd to me that we don't enforce this for *all* the
registers when running as a 32-bit guest. Couldn't userspace be equally
confused by a 64-bit lr or sp?

Will

Marc Zyngier April 30, 2020, 10:59 a.m. UTC | #2

On 2020-04-30 11:25, Will Deacon wrote:
> On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote:
>> In the unlikely event that a 32bit vcpu traps into the hypervisor
>> on an instruction that is located right at the end of the 32bit
>> range, the emulation of that instruction is going to increment
>> PC past the 32bit range. This isn't great, as userspace can then
>> observe this value and get a bit confused.
>> 
>> Conversly, userspace can do things like (in the context of a 64bit
>> guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0,
>> set PC to a 64bit value, change PSTATE to AArch32-USR, and observe
>> that PC hasn't been truncated. More confusion.
>> 
>> Fix both by:
>> - truncating PC increments for 32bit guests
>> - sanitize PC every time a core reg is changed by userspace, and
>>   that PSTATE indicates a 32bit mode.
> 
> It's not clear to me whether this needs a cc stable. What do you think? 
> I
> suppose that it really depends on how confused e.g. QEMU gets.

It isn't so much QEMU itself that I'm worried about (the emulation 
shouldn't
really care about the PC), but the likes of GDB. So yes, a cc stable 
seems to
be in order.

> 
>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>> ---
>>  arch/arm64/kvm/guest.c     | 4 ++++
>>  virt/kvm/arm/hyp/aarch32.c | 8 ++++++--
>>  2 files changed, 10 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index 23ebe51410f0..2a159af82429 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c
>> @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, 
>> const struct kvm_one_reg *reg)
>>  	}
>> 
>>  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
>> +
>> +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
>> +		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
> 
> It seems slightly odd to me that we don't enforce this for *all* the
> registers when running as a 32-bit guest. Couldn't userspace be equally
> confused by a 64-bit lr or sp?

Fair point. How about this on top, which wipes the upper 32 bits for
each and every register in the current mode:

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 2a159af82429..f958c3c7bf65 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -201,9 +201,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu, 
const struct kvm_one_reg *reg)

  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));

-	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
-		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
+	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) {
+		int i;

+		for (i = 0; i < 16; i++)
+			*vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i);
+	}
  out:
  	return err;
  }

I'm tempted to make the whole SET_REG hunk a separate patch though.

Thanks,

         M.

Will Deacon April 30, 2020, 12:31 p.m. UTC | #3

On Thu, Apr 30, 2020 at 11:59:05AM +0100, Marc Zyngier wrote:
> On 2020-04-30 11:25, Will Deacon wrote:
> > On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote:
> > > In the unlikely event that a 32bit vcpu traps into the hypervisor
> > > on an instruction that is located right at the end of the 32bit
> > > range, the emulation of that instruction is going to increment
> > > PC past the 32bit range. This isn't great, as userspace can then
> > > observe this value and get a bit confused.
> > > 
> > > Conversly, userspace can do things like (in the context of a 64bit
> > > guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0,
> > > set PC to a 64bit value, change PSTATE to AArch32-USR, and observe
> > > that PC hasn't been truncated. More confusion.
> > > 
> > > Fix both by:
> > > - truncating PC increments for 32bit guests
> > > - sanitize PC every time a core reg is changed by userspace, and
> > >   that PSTATE indicates a 32bit mode.
> > 
> > It's not clear to me whether this needs a cc stable. What do you think?
> > I
> > suppose that it really depends on how confused e.g. QEMU gets.
> 
> It isn't so much QEMU itself that I'm worried about (the emulation shouldn't
> really care about the PC), but the likes of GDB. So yes, a cc stable seems
> to
> be in order.

Okey doke.

> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > ---
> > >  arch/arm64/kvm/guest.c     | 4 ++++
> > >  virt/kvm/arm/hyp/aarch32.c | 8 ++++++--
> > >  2 files changed, 10 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > > index 23ebe51410f0..2a159af82429 100644
> > > --- a/arch/arm64/kvm/guest.c
> > > +++ b/arch/arm64/kvm/guest.c
> > > @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu,
> > > const struct kvm_one_reg *reg)
> > >  	}
> > > 
> > >  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
> > > +
> > > +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
> > > +		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
> > 
> > It seems slightly odd to me that we don't enforce this for *all* the
> > registers when running as a 32-bit guest. Couldn't userspace be equally
> > confused by a 64-bit lr or sp?
> 
> Fair point. How about this on top, which wipes the upper 32 bits for
> each and every register in the current mode:
> 
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 2a159af82429..f958c3c7bf65 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -201,9 +201,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const
> struct kvm_one_reg *reg)
> 
>  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
> 
> -	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
> -		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
> +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) {
> +		int i;
> 
> +		for (i = 0; i < 16; i++)
> +			*vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i);

I think you're missing all the funny banked registers that live all the way
up to x30 iirc.

Will

Marc Zyngier April 30, 2020, 12:45 p.m. UTC | #4

On 2020-04-30 13:31, Will Deacon wrote:
> On Thu, Apr 30, 2020 at 11:59:05AM +0100, Marc Zyngier wrote:
>> On 2020-04-30 11:25, Will Deacon wrote:
>> > On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote:
>> > > In the unlikely event that a 32bit vcpu traps into the hypervisor
>> > > on an instruction that is located right at the end of the 32bit
>> > > range, the emulation of that instruction is going to increment
>> > > PC past the 32bit range. This isn't great, as userspace can then
>> > > observe this value and get a bit confused.
>> > >
>> > > Conversly, userspace can do things like (in the context of a 64bit
>> > > guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0,
>> > > set PC to a 64bit value, change PSTATE to AArch32-USR, and observe
>> > > that PC hasn't been truncated. More confusion.
>> > >
>> > > Fix both by:
>> > > - truncating PC increments for 32bit guests
>> > > - sanitize PC every time a core reg is changed by userspace, and
>> > >   that PSTATE indicates a 32bit mode.
>> >
>> > It's not clear to me whether this needs a cc stable. What do you think?
>> > I
>> > suppose that it really depends on how confused e.g. QEMU gets.
>> 
>> It isn't so much QEMU itself that I'm worried about (the emulation 
>> shouldn't
>> really care about the PC), but the likes of GDB. So yes, a cc stable 
>> seems
>> to
>> be in order.
> 
> Okey doke.
> 
>> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
>> > > ---
>> > >  arch/arm64/kvm/guest.c     | 4 ++++
>> > >  virt/kvm/arm/hyp/aarch32.c | 8 ++++++--
>> > >  2 files changed, 10 insertions(+), 2 deletions(-)
>> > >
>> > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> > > index 23ebe51410f0..2a159af82429 100644
>> > > --- a/arch/arm64/kvm/guest.c
>> > > +++ b/arch/arm64/kvm/guest.c
>> > > @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu,
>> > > const struct kvm_one_reg *reg)
>> > >  	}
>> > >
>> > >  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
>> > > +
>> > > +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
>> > > +		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
>> >
>> > It seems slightly odd to me that we don't enforce this for *all* the
>> > registers when running as a 32-bit guest. Couldn't userspace be equally
>> > confused by a 64-bit lr or sp?
>> 
>> Fair point. How about this on top, which wipes the upper 32 bits for
>> each and every register in the current mode:
>> 
>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index 2a159af82429..f958c3c7bf65 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c
>> @@ -201,9 +201,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu, 
>> const
>> struct kvm_one_reg *reg)
>> 
>>  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
>> 
>> -	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
>> -		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
>> +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) {
>> +		int i;
>> 
>> +		for (i = 0; i < 16; i++)
>> +			*vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i);
> 
> I think you're missing all the funny banked registers that live all the 
> way
> up to x30 iirc.

No, they are all indirected via vcpu_reg32(), which has the magic 
tables.
And the whole point is that we only want to affect the current mode (no 
point
in repainting the FIQ registers if the PSR says USR).

Or am I missing something obvious?

         M.

Will Deacon April 30, 2020, 1:46 p.m. UTC | #5

On Thu, Apr 30, 2020 at 01:45:51PM +0100, Marc Zyngier wrote:
> On 2020-04-30 13:31, Will Deacon wrote:
> > On Thu, Apr 30, 2020 at 11:59:05AM +0100, Marc Zyngier wrote:
> > > On 2020-04-30 11:25, Will Deacon wrote:
> > > > On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote:
> > > > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > > > > index 23ebe51410f0..2a159af82429 100644
> > > > > --- a/arch/arm64/kvm/guest.c
> > > > > +++ b/arch/arm64/kvm/guest.c
> > > > > @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu,
> > > > > const struct kvm_one_reg *reg)
> > > > >  	}
> > > > >
> > > > >  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
> > > > > +
> > > > > +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
> > > > > +		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
> > > >
> > > > It seems slightly odd to me that we don't enforce this for *all* the
> > > > registers when running as a 32-bit guest. Couldn't userspace be equally
> > > > confused by a 64-bit lr or sp?
> > > 
> > > Fair point. How about this on top, which wipes the upper 32 bits for
> > > each and every register in the current mode:
> > > 
> > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > > index 2a159af82429..f958c3c7bf65 100644
> > > --- a/arch/arm64/kvm/guest.c
> > > +++ b/arch/arm64/kvm/guest.c
> > > @@ -201,9 +201,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu,
> > > const
> > > struct kvm_one_reg *reg)
> > > 
> > >  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
> > > 
> > > -	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
> > > -		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
> > > +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) {
> > > +		int i;
> > > 
> > > +		for (i = 0; i < 16; i++)
> > > +			*vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i);
> > 
> > I think you're missing all the funny banked registers that live all the
> > way
> > up to x30 iirc.
> 
> No, they are all indirected via vcpu_reg32(), which has the magic tables.
> And the whole point is that we only want to affect the current mode (no
> point
> in repainting the FIQ registers if the PSR says USR).
> 
> Or am I missing something obvious?

Nope, just my inability to parse vcpu_reg32 the first time around! So, for
the updated patch:

Acked-by: Will Deacon <will@kernel.org?

Thanks,

Will

KVM: arm64: Fix 32bit PC wrap-around

Commit Message

Comments

Patch