Message ID | 20200430101513.318541-1-maz@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: arm64: Fix 32bit PC wrap-around | expand |
On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote: > In the unlikely event that a 32bit vcpu traps into the hypervisor > on an instruction that is located right at the end of the 32bit > range, the emulation of that instruction is going to increment > PC past the 32bit range. This isn't great, as userspace can then > observe this value and get a bit confused. > > Conversly, userspace can do things like (in the context of a 64bit > guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0, > set PC to a 64bit value, change PSTATE to AArch32-USR, and observe > that PC hasn't been truncated. More confusion. > > Fix both by: > - truncating PC increments for 32bit guests > - sanitize PC every time a core reg is changed by userspace, and > that PSTATE indicates a 32bit mode. It's not clear to me whether this needs a cc stable. What do you think? I suppose that it really depends on how confused e.g. QEMU gets. > Signed-off-by: Marc Zyngier <maz@kernel.org> > --- > arch/arm64/kvm/guest.c | 4 ++++ > virt/kvm/arm/hyp/aarch32.c | 8 ++++++-- > 2 files changed, 10 insertions(+), 2 deletions(-) > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c > index 23ebe51410f0..2a159af82429 100644 > --- a/arch/arm64/kvm/guest.c > +++ b/arch/arm64/kvm/guest.c > @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) > } > > memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id)); > + > + if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) > + *vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu)); It seems slightly odd to me that we don't enforce this for *all* the registers when running as a 32-bit guest. Couldn't userspace be equally confused by a 64-bit lr or sp? Will
On 2020-04-30 11:25, Will Deacon wrote: > On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote: >> In the unlikely event that a 32bit vcpu traps into the hypervisor >> on an instruction that is located right at the end of the 32bit >> range, the emulation of that instruction is going to increment >> PC past the 32bit range. This isn't great, as userspace can then >> observe this value and get a bit confused. >> >> Conversly, userspace can do things like (in the context of a 64bit >> guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0, >> set PC to a 64bit value, change PSTATE to AArch32-USR, and observe >> that PC hasn't been truncated. More confusion. >> >> Fix both by: >> - truncating PC increments for 32bit guests >> - sanitize PC every time a core reg is changed by userspace, and >> that PSTATE indicates a 32bit mode. > > It's not clear to me whether this needs a cc stable. What do you think? > I > suppose that it really depends on how confused e.g. QEMU gets. It isn't so much QEMU itself that I'm worried about (the emulation shouldn't really care about the PC), but the likes of GDB. So yes, a cc stable seems to be in order. > >> Signed-off-by: Marc Zyngier <maz@kernel.org> >> --- >> arch/arm64/kvm/guest.c | 4 ++++ >> virt/kvm/arm/hyp/aarch32.c | 8 ++++++-- >> 2 files changed, 10 insertions(+), 2 deletions(-) >> >> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c >> index 23ebe51410f0..2a159af82429 100644 >> --- a/arch/arm64/kvm/guest.c >> +++ b/arch/arm64/kvm/guest.c >> @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, >> const struct kvm_one_reg *reg) >> } >> >> memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id)); >> + >> + if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) >> + *vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu)); > > It seems slightly odd to me that we don't enforce this for *all* the > registers when running as a 32-bit guest. Couldn't userspace be equally > confused by a 64-bit lr or sp? Fair point. How about this on top, which wipes the upper 32 bits for each and every register in the current mode: diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 2a159af82429..f958c3c7bf65 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -201,9 +201,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id)); - if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) - *vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu)); + if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) { + int i; + for (i = 0; i < 16; i++) + *vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i); + } out: return err; } I'm tempted to make the whole SET_REG hunk a separate patch though. Thanks, M.
On Thu, Apr 30, 2020 at 11:59:05AM +0100, Marc Zyngier wrote: > On 2020-04-30 11:25, Will Deacon wrote: > > On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote: > > > In the unlikely event that a 32bit vcpu traps into the hypervisor > > > on an instruction that is located right at the end of the 32bit > > > range, the emulation of that instruction is going to increment > > > PC past the 32bit range. This isn't great, as userspace can then > > > observe this value and get a bit confused. > > > > > > Conversly, userspace can do things like (in the context of a 64bit > > > guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0, > > > set PC to a 64bit value, change PSTATE to AArch32-USR, and observe > > > that PC hasn't been truncated. More confusion. > > > > > > Fix both by: > > > - truncating PC increments for 32bit guests > > > - sanitize PC every time a core reg is changed by userspace, and > > > that PSTATE indicates a 32bit mode. > > > > It's not clear to me whether this needs a cc stable. What do you think? > > I > > suppose that it really depends on how confused e.g. QEMU gets. > > It isn't so much QEMU itself that I'm worried about (the emulation shouldn't > really care about the PC), but the likes of GDB. So yes, a cc stable seems > to > be in order. Okey doke. > > > Signed-off-by: Marc Zyngier <maz@kernel.org> > > > --- > > > arch/arm64/kvm/guest.c | 4 ++++ > > > virt/kvm/arm/hyp/aarch32.c | 8 ++++++-- > > > 2 files changed, 10 insertions(+), 2 deletions(-) > > > > > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c > > > index 23ebe51410f0..2a159af82429 100644 > > > --- a/arch/arm64/kvm/guest.c > > > +++ b/arch/arm64/kvm/guest.c > > > @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, > > > const struct kvm_one_reg *reg) > > > } > > > > > > memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id)); > > > + > > > + if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) > > > + *vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu)); > > > > It seems slightly odd to me that we don't enforce this for *all* the > > registers when running as a 32-bit guest. Couldn't userspace be equally > > confused by a 64-bit lr or sp? > > Fair point. How about this on top, which wipes the upper 32 bits for > each and every register in the current mode: > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c > index 2a159af82429..f958c3c7bf65 100644 > --- a/arch/arm64/kvm/guest.c > +++ b/arch/arm64/kvm/guest.c > @@ -201,9 +201,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const > struct kvm_one_reg *reg) > > memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id)); > > - if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) > - *vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu)); > + if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) { > + int i; > > + for (i = 0; i < 16; i++) > + *vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i); I think you're missing all the funny banked registers that live all the way up to x30 iirc. Will
On 2020-04-30 13:31, Will Deacon wrote: > On Thu, Apr 30, 2020 at 11:59:05AM +0100, Marc Zyngier wrote: >> On 2020-04-30 11:25, Will Deacon wrote: >> > On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote: >> > > In the unlikely event that a 32bit vcpu traps into the hypervisor >> > > on an instruction that is located right at the end of the 32bit >> > > range, the emulation of that instruction is going to increment >> > > PC past the 32bit range. This isn't great, as userspace can then >> > > observe this value and get a bit confused. >> > > >> > > Conversly, userspace can do things like (in the context of a 64bit >> > > guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0, >> > > set PC to a 64bit value, change PSTATE to AArch32-USR, and observe >> > > that PC hasn't been truncated. More confusion. >> > > >> > > Fix both by: >> > > - truncating PC increments for 32bit guests >> > > - sanitize PC every time a core reg is changed by userspace, and >> > > that PSTATE indicates a 32bit mode. >> > >> > It's not clear to me whether this needs a cc stable. What do you think? >> > I >> > suppose that it really depends on how confused e.g. QEMU gets. >> >> It isn't so much QEMU itself that I'm worried about (the emulation >> shouldn't >> really care about the PC), but the likes of GDB. So yes, a cc stable >> seems >> to >> be in order. > > Okey doke. > >> > > Signed-off-by: Marc Zyngier <maz@kernel.org> >> > > --- >> > > arch/arm64/kvm/guest.c | 4 ++++ >> > > virt/kvm/arm/hyp/aarch32.c | 8 ++++++-- >> > > 2 files changed, 10 insertions(+), 2 deletions(-) >> > > >> > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c >> > > index 23ebe51410f0..2a159af82429 100644 >> > > --- a/arch/arm64/kvm/guest.c >> > > +++ b/arch/arm64/kvm/guest.c >> > > @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, >> > > const struct kvm_one_reg *reg) >> > > } >> > > >> > > memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id)); >> > > + >> > > + if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) >> > > + *vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu)); >> > >> > It seems slightly odd to me that we don't enforce this for *all* the >> > registers when running as a 32-bit guest. Couldn't userspace be equally >> > confused by a 64-bit lr or sp? >> >> Fair point. How about this on top, which wipes the upper 32 bits for >> each and every register in the current mode: >> >> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c >> index 2a159af82429..f958c3c7bf65 100644 >> --- a/arch/arm64/kvm/guest.c >> +++ b/arch/arm64/kvm/guest.c >> @@ -201,9 +201,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu, >> const >> struct kvm_one_reg *reg) >> >> memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id)); >> >> - if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) >> - *vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu)); >> + if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) { >> + int i; >> >> + for (i = 0; i < 16; i++) >> + *vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i); > > I think you're missing all the funny banked registers that live all the > way > up to x30 iirc. No, they are all indirected via vcpu_reg32(), which has the magic tables. And the whole point is that we only want to affect the current mode (no point in repainting the FIQ registers if the PSR says USR). Or am I missing something obvious? M.
On Thu, Apr 30, 2020 at 01:45:51PM +0100, Marc Zyngier wrote: > On 2020-04-30 13:31, Will Deacon wrote: > > On Thu, Apr 30, 2020 at 11:59:05AM +0100, Marc Zyngier wrote: > > > On 2020-04-30 11:25, Will Deacon wrote: > > > > On Thu, Apr 30, 2020 at 11:15:13AM +0100, Marc Zyngier wrote: > > > > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c > > > > > index 23ebe51410f0..2a159af82429 100644 > > > > > --- a/arch/arm64/kvm/guest.c > > > > > +++ b/arch/arm64/kvm/guest.c > > > > > @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, > > > > > const struct kvm_one_reg *reg) > > > > > } > > > > > > > > > > memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id)); > > > > > + > > > > > + if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) > > > > > + *vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu)); > > > > > > > > It seems slightly odd to me that we don't enforce this for *all* the > > > > registers when running as a 32-bit guest. Couldn't userspace be equally > > > > confused by a 64-bit lr or sp? > > > > > > Fair point. How about this on top, which wipes the upper 32 bits for > > > each and every register in the current mode: > > > > > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c > > > index 2a159af82429..f958c3c7bf65 100644 > > > --- a/arch/arm64/kvm/guest.c > > > +++ b/arch/arm64/kvm/guest.c > > > @@ -201,9 +201,12 @@ static int set_core_reg(struct kvm_vcpu *vcpu, > > > const > > > struct kvm_one_reg *reg) > > > > > > memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id)); > > > > > > - if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) > > > - *vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu)); > > > + if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) { > > > + int i; > > > > > > + for (i = 0; i < 16; i++) > > > + *vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i); > > > > I think you're missing all the funny banked registers that live all the > > way > > up to x30 iirc. > > No, they are all indirected via vcpu_reg32(), which has the magic tables. > And the whole point is that we only want to affect the current mode (no > point > in repainting the FIQ registers if the PSR says USR). > > Or am I missing something obvious? Nope, just my inability to parse vcpu_reg32 the first time around! So, for the updated patch: Acked-by: Will Deacon <will@kernel.org? Thanks, Will
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 23ebe51410f0..2a159af82429 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -200,6 +200,10 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) } memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id)); + + if (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) + *vcpu_pc(vcpu) = lower_32_bits(*vcpu_pc(vcpu)); + out: return err; } diff --git a/virt/kvm/arm/hyp/aarch32.c b/virt/kvm/arm/hyp/aarch32.c index d31f267961e7..25c0e47d57cb 100644 --- a/virt/kvm/arm/hyp/aarch32.c +++ b/virt/kvm/arm/hyp/aarch32.c @@ -125,12 +125,16 @@ static void __hyp_text kvm_adjust_itstate(struct kvm_vcpu *vcpu) */ void __hyp_text kvm_skip_instr32(struct kvm_vcpu *vcpu, bool is_wide_instr) { + u32 pc = *vcpu_pc(vcpu); bool is_thumb; is_thumb = !!(*vcpu_cpsr(vcpu) & PSR_AA32_T_BIT); if (is_thumb && !is_wide_instr) - *vcpu_pc(vcpu) += 2; + pc += 2; else - *vcpu_pc(vcpu) += 4; + pc += 4; + + *vcpu_pc(vcpu) = pc; + kvm_adjust_itstate(vcpu); }
In the unlikely event that a 32bit vcpu traps into the hypervisor on an instruction that is located right at the end of the 32bit range, the emulation of that instruction is going to increment PC past the 32bit range. This isn't great, as userspace can then observe this value and get a bit confused. Conversly, userspace can do things like (in the context of a 64bit guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0, set PC to a 64bit value, change PSTATE to AArch32-USR, and observe that PC hasn't been truncated. More confusion. Fix both by: - truncating PC increments for 32bit guests - sanitize PC every time a core reg is changed by userspace, and that PSTATE indicates a 32bit mode. Signed-off-by: Marc Zyngier <maz@kernel.org> --- arch/arm64/kvm/guest.c | 4 ++++ virt/kvm/arm/hyp/aarch32.c | 8 ++++++-- 2 files changed, 10 insertions(+), 2 deletions(-)