KVM: arm64: Fix 32bit PC wrap-around

Message ID 20200430101513.318541-1-maz@kernel.org (mailing list archive)
State New, archived

Commit Message

Marc Zyngier April 30, 2020, 10:15 a.m. UTC
In the unlikely event that a 32bit vcpu traps into the hypervisor
on an instruction that is located right at the end of the 32bit
range, the emulation of that instruction is going to increment
PC past the 32bit range. This isn't great, as userspace can then
observe this value and get a bit confused.

Conversely, userspace can do things like (in the context of a 64bit
guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0,
set PC to a 64bit value, change PSTATE to AArch32-USR, and observe
that PC hasn't been truncated. More confusion.

Fix both by:
- truncating PC increments for 32bit guests
- sanitizing PC every time a core reg is changed by userspace while
  PSTATE indicates a 32bit mode.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kvm/guest.c     | 4 ++++
 virt/kvm/arm/hyp/aarch32.c | 8 ++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

Comments

Will Deacon April 30, 2020, 10:25 a.m. UTC | #1
On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote:
> In the unlikely event that a 32bit vcpu traps into the hypervisor
> on an instruction that is located right at the end of the 32bit
> range, the emulation of that instruction is going to increment
> PC past the 32bit range. This isn't great, as userspace can then
> observe this value and get a bit confused.
> 
> Conversely, userspace can do things like (in the context of a 64bit
> guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0,
> set PC to a 64bit value, change PSTATE to AArch32-USR, and observe
> that PC hasn't been truncated. More confusion.
> 
> Fix both by:
> - truncating PC increments for 32bit guests
> - sanitize PC every time a core reg is changed by userspace, and
>   that PSTATE indicates a 32bit mode.

It's not clear to me whether this needs a cc stable. What do you think? I
suppose that it really depends on how confused e.g. QEMU gets.

> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  arch/arm64/kvm/guest.c     | 4 ++++
>  virt/kvm/arm/hyp/aarch32.c | 8 ++++++--
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 23ebe51410f0..2a159af82429 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	}
>  
>  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
> +
> +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
> +		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));

It seems slightly odd to me that we don't enforce this for *all* the
registers when running as a 32-bit guest. Couldn't userspace be equally
confused by a 64-bit lr or sp?

Will
Marc Zyngier April 30, 2020, 10:59 a.m. UTC | #2
On 2020-04-30 11:25, Will Deacon wrote:
> On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote:
>> In the unlikely event that a 32bit vcpu traps into the hypervisor
>> on an instruction that is located right at the end of the 32bit
>> range, the emulation of that instruction is going to increment
>> PC past the 32bit range. This isn't great, as userspace can then
>> observe this value and get a bit confused.
>> 
>> Conversely, userspace can do things like (in the context of a 64bit
>> guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0,
>> set PC to a 64bit value, change PSTATE to AArch32-USR, and observe
>> that PC hasn't been truncated. More confusion.
>> 
>> Fix both by:
>> - truncating PC increments for 32bit guests
>> - sanitize PC every time a core reg is changed by userspace, and
>>   that PSTATE indicates a 32bit mode.
> 
> It's not clear to me whether this needs a cc stable. What do you think? I
> suppose that it really depends on how confused e.g. QEMU gets.

It isn't so much QEMU itself that I'm worried about (the emulation shouldn't
really care about the PC), but the likes of GDB. So yes, a cc stable seems to
be in order.

> 
>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>> ---
>>  arch/arm64/kvm/guest.c     | 4 ++++
>>  virt/kvm/arm/hyp/aarch32.c | 8 ++++++--
>>  2 files changed, 10 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index 23ebe51410f0..2a159af82429 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c
>> @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>>  	}
>> 
>>  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
>> +
>> +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
>> +		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
> 
> It seems slightly odd to me that we don't enforce this for *all* the
> registers when running as a 32-bit guest. Couldn't userspace be equally
> confused by a 64-bit lr or sp?

Fair point. How about this on top, which wipes the upper 32 bits for
each and every register in the current mode:

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 2a159af82429..f958c3c7bf65 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -201,9 +201,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)

  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));

-	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
-		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
+	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) {
+		int i;

+		for (i = 0; i < 16; i++)
+			*vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i);
+	}
  out:
  	return err;
  }

I'm tempted to make the whole SET_REG hunk a separate patch though.

Thanks,

         M.
Will Deacon April 30, 2020, 12:31 p.m. UTC | #3
On Thu, Apr 30, 2020 at 11:59:05AM +0100, Marc Zyngier wrote:
> On 2020-04-30 11:25, Will Deacon wrote:
> > On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote:
> > > In the unlikely event that a 32bit vcpu traps into the hypervisor
> > > on an instruction that is located right at the end of the 32bit
> > > range, the emulation of that instruction is going to increment
> > > PC past the 32bit range. This isn't great, as userspace can then
> > > observe this value and get a bit confused.
> > > 
> > > Conversely, userspace can do things like (in the context of a 64bit
> > > guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0,
> > > set PC to a 64bit value, change PSTATE to AArch32-USR, and observe
> > > that PC hasn't been truncated. More confusion.
> > > 
> > > Fix both by:
> > > - truncating PC increments for 32bit guests
> > > - sanitize PC every time a core reg is changed by userspace, and
> > >   that PSTATE indicates a 32bit mode.
> > 
> > It's not clear to me whether this needs a cc stable. What do you think? I
> > suppose that it really depends on how confused e.g. QEMU gets.
> 
> It isn't so much QEMU itself that I'm worried about (the emulation shouldn't
> really care about the PC), but the likes of GDB. So yes, a cc stable seems to
> be in order.

Okey doke.

> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > ---
> > >  arch/arm64/kvm/guest.c     | 4 ++++
> > >  virt/kvm/arm/hyp/aarch32.c | 8 ++++++--
> > >  2 files changed, 10 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > > index 23ebe51410f0..2a159af82429 100644
> > > --- a/arch/arm64/kvm/guest.c
> > > +++ b/arch/arm64/kvm/guest.c
> > > @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > >  	}
> > > 
> > >  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
> > > +
> > > +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
> > > +		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
> > 
> > It seems slightly odd to me that we don't enforce this for *all* the
> > registers when running as a 32-bit guest. Couldn't userspace be equally
> > confused by a 64-bit lr or sp?
> 
> Fair point. How about this on top, which wipes the upper 32 bits for
> each and every register in the current mode:
> 
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 2a159af82429..f958c3c7bf65 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -201,9 +201,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> 
>  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
> 
> -	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
> -		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
> +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) {
> +		int i;
> 
> +		for (i = 0; i < 16; i++)
> +			*vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i);

I think you're missing all the funny banked registers that live all the way
up to x30 iirc.

Will
Marc Zyngier April 30, 2020, 12:45 p.m. UTC | #4
On 2020-04-30 13:31, Will Deacon wrote:
> On Thu, Apr 30, 2020 at 11:59:05AM +0100, Marc Zyngier wrote:
>> On 2020-04-30 11:25, Will Deacon wrote:
>> > On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote:
>> > > In the unlikely event that a 32bit vcpu traps into the hypervisor
>> > > on an instruction that is located right at the end of the 32bit
>> > > range, the emulation of that instruction is going to increment
>> > > PC past the 32bit range. This isn't great, as userspace can then
>> > > observe this value and get a bit confused.
>> > >
>> > > Conversely, userspace can do things like (in the context of a 64bit
>> > > guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0,
>> > > set PC to a 64bit value, change PSTATE to AArch32-USR, and observe
>> > > that PC hasn't been truncated. More confusion.
>> > >
>> > > Fix both by:
>> > > - truncating PC increments for 32bit guests
>> > > - sanitize PC every time a core reg is changed by userspace, and
>> > >   that PSTATE indicates a 32bit mode.
>> >
>> > It's not clear to me whether this needs a cc stable. What do you think? I
>> > suppose that it really depends on how confused e.g. QEMU gets.
>> 
>> It isn't so much QEMU itself that I'm worried about (the emulation shouldn't
>> really care about the PC), but the likes of GDB. So yes, a cc stable seems to
>> be in order.
> 
> Okey doke.
> 
>> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
>> > > ---
>> > >  arch/arm64/kvm/guest.c     | 4 ++++
>> > >  virt/kvm/arm/hyp/aarch32.c | 8 ++++++--
>> > >  2 files changed, 10 insertions(+), 2 deletions(-)
>> > >
>> > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> > > index 23ebe51410f0..2a159af82429 100644
>> > > --- a/arch/arm64/kvm/guest.c
>> > > +++ b/arch/arm64/kvm/guest.c
>> > > @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>> > >  	}
>> > >
>> > >  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
>> > > +
>> > > +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
>> > > +		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
>> >
>> > It seems slightly odd to me that we don't enforce this for *all* the
>> > registers when running as a 32-bit guest. Couldn't userspace be equally
>> > confused by a 64-bit lr or sp?
>> 
>> Fair point. How about this on top, which wipes the upper 32 bits for
>> each and every register in the current mode:
>> 
>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index 2a159af82429..f958c3c7bf65 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c
>> @@ -201,9 +201,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>> 
>>  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
>> 
>> -	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
>> -		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
>> +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) {
>> +		int i;
>> 
>> +		for (i = 0; i < 16; i++)
>> +			*vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i);
> 
> I think you're missing all the funny banked registers that live all the way
> up to x30 iirc.

No, they are all indirected via vcpu_reg32(), which has the magic tables.
And the whole point is that we only want to affect the current mode (no point
in repainting the FIQ registers if the PSR says USR).

Or am I missing something obvious?

         M.
Will Deacon April 30, 2020, 1:46 p.m. UTC | #5
On Thu, Apr 30, 2020 at 01:45:51PM +0100, Marc Zyngier wrote:
> On 2020-04-30 13:31, Will Deacon wrote:
> > On Thu, Apr 30, 2020 at 11:59:05AM +0100, Marc Zyngier wrote:
> > > On 2020-04-30 11:25, Will Deacon wrote:
> > > > On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote:
> > > > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > > > > index 23ebe51410f0..2a159af82429 100644
> > > > > --- a/arch/arm64/kvm/guest.c
> > > > > +++ b/arch/arm64/kvm/guest.c
> > > > > @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > > > >  	}
> > > > >
> > > > >  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
> > > > > +
> > > > > +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
> > > > > +		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
> > > >
> > > > It seems slightly odd to me that we don't enforce this for *all* the
> > > > registers when running as a 32-bit guest. Couldn't userspace be equally
> > > > confused by a 64-bit lr or sp?
> > > 
> > > Fair point. How about this on top, which wipes the upper 32 bits for
> > > each and every register in the current mode:
> > > 
> > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > > index 2a159af82429..f958c3c7bf65 100644
> > > --- a/arch/arm64/kvm/guest.c
> > > +++ b/arch/arm64/kvm/guest.c
> > > @@ -201,9 +201,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > > 
> > >  	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
> > > 
> > > -	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
> > > -		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
> > > +	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) {
> > > +		int i;
> > > 
> > > +		for (i = 0; i < 16; i++)
> > > +			*vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i);
> > 
> > I think you're missing all the funny banked registers that live all the way
> > up to x30 iirc.
> 
> No, they are all indirected via vcpu_reg32(), which has the magic tables.
> And the whole point is that we only want to affect the current mode (no point
> in repainting the FIQ registers if the PSR says USR).
> 
> Or am I missing something obvious?

Nope, just my inability to parse vcpu_reg32 the first time around! So, for
the updated patch:

Acked-by: Will Deacon <will@kernel.org>

Thanks,

Will

Patch

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 23ebe51410f0..2a159af82429 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -200,6 +200,10 @@  static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	}
 
 	memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id));
+
+	if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK)
+		*vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu));
+
 out:
 	return err;
 }
diff --git a/virt/kvm/arm/hyp/aarch32.c b/virt/kvm/arm/hyp/aarch32.c
index d31f267961e7..25c0e47d57cb 100644
--- a/virt/kvm/arm/hyp/aarch32.c
+++ b/virt/kvm/arm/hyp/aarch32.c
@@ -125,12 +125,16 @@  static void __hyp_text kvm_adjust_itstate(struct kvm_vcpu *vcpu)
  */
 void __hyp_text kvm_skip_instr32(struct kvm_vcpu *vcpu, bool is_wide_instr)
 {
+	u32 pc = *vcpu_pc(vcpu);
 	bool is_thumb;
 
 	is_thumb = !!(*vcpu_cpsr(vcpu) & PSR_AA32_T_BIT);
 	if (is_thumb && !is_wide_instr)
-		*vcpu_pc(vcpu) += 2;
+		pc += 2;
 	else
-		*vcpu_pc(vcpu) += 4;
+		pc += 4;
+
+	*vcpu_pc(vcpu) = pc;
+
 	kvm_adjust_itstate(vcpu);
 }