KVM: arm64: Avoid corrupting vCPU context register in guest exit

Message ID: 20210226181211.14542-1-will@kernel.org
State: New, archived
Series: KVM: arm64: Avoid corrupting vCPU context register in guest exit

Commit Message

Will Deacon Feb. 26, 2021, 6:12 p.m. UTC
Commit 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest
context") tracks the currently running vCPU, clearing the pointer to
NULL on exit from a guest.

Unfortunately, the use of 'set_loaded_vcpu' clobbers x1 to point at the
kvm_hyp_ctxt instead of the vCPU context, causing the subsequent RAS
code to go off into the weeds when it saves the DISR assuming that the
CPU context is embedded in a struct vCPU.

Leave x1 alone and use x3 as a temporary register instead when clearing
the vCPU on the guest exit path.

Cc: Marc Zyngier <maz@kernel.org>
Cc: Andrew Scull <ascull@google.com>
Cc: <stable@vger.kernel.org>
Fixes: 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest context")
Suggested-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
---

This was pretty awful to debug!

 arch/arm64/kvm/hyp/entry.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Marc Zyngier Feb. 26, 2021, 6:35 p.m. UTC | #1
On 2021-02-26 18:12, Will Deacon wrote:
> Commit 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest
> context") tracks the currently running vCPU, clearing the pointer to
> NULL on exit from a guest.
> 
> Unfortunately, the use of 'set_loaded_vcpu' clobbers x1 to point at the
> kvm_hyp_ctxt instead of the vCPU context, causing the subsequent RAS
> code to go off into the weeds when it saves the DISR assuming that the
> CPU context is embedded in a struct vCPU.
> 
> Leave x1 alone and use x3 as a temporary register instead when clearing
> the vCPU on the guest exit path.
> 
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Andrew Scull <ascull@google.com>
> Cc: <stable@vger.kernel.org>
> Fixes: 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest 
> context")
> Suggested-by: Quentin Perret <qperret@google.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
> 
> This was pretty awful to debug!
> 
>  arch/arm64/kvm/hyp/entry.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
> index b0afad7a99c6..0c66a1d408fd 100644
> --- a/arch/arm64/kvm/hyp/entry.S
> +++ b/arch/arm64/kvm/hyp/entry.S
> @@ -146,7 +146,7 @@ SYM_INNER_LABEL(__guest_exit, SYM_L_GLOBAL)
>  	// Now restore the hyp regs
>  	restore_callee_saved_regs x2
> 
> -	set_loaded_vcpu xzr, x1, x2
> +	set_loaded_vcpu xzr, x2, x3
> 
>  alternative_if ARM64_HAS_RAS_EXTN
>  	// If we have the RAS extensions we can consume a pending error

Grmbl... How comes we have never seen that for the past 5 months,
including on CPUs that implement RAS?

Thanks,

         M.

Will Deacon Feb. 26, 2021, 7:05 p.m. UTC | #2
On Fri, Feb 26, 2021 at 06:35:42PM +0000, Marc Zyngier wrote:
> On 2021-02-26 18:12, Will Deacon wrote:
> > Commit 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest
> > context") tracks the currently running vCPU, clearing the pointer to
> > NULL on exit from a guest.
> > 
> > Unfortunately, the use of 'set_loaded_vcpu' clobbers x1 to point at the
> > kvm_hyp_ctxt instead of the vCPU context, causing the subsequent RAS
> > code to go off into the weeds when it saves the DISR assuming that the
> > CPU context is embedded in a struct vCPU.
> > 
> > Leave x1 alone and use x3 as a temporary register instead when clearing
> > the vCPU on the guest exit path.
> > 
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Andrew Scull <ascull@google.com>
> > Cc: <stable@vger.kernel.org>
> > Fixes: 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest
> > context")
> > Suggested-by: Quentin Perret <qperret@google.com>
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> > 
> > This was pretty awful to debug!
> > 
> >  arch/arm64/kvm/hyp/entry.S | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
> > index b0afad7a99c6..0c66a1d408fd 100644
> > --- a/arch/arm64/kvm/hyp/entry.S
> > +++ b/arch/arm64/kvm/hyp/entry.S
> > @@ -146,7 +146,7 @@ SYM_INNER_LABEL(__guest_exit, SYM_L_GLOBAL)
> >  	// Now restore the hyp regs
> >  	restore_callee_saved_regs x2
> > 
> > -	set_loaded_vcpu xzr, x1, x2
> > +	set_loaded_vcpu xzr, x2, x3
> > 
> >  alternative_if ARM64_HAS_RAS_EXTN
> >  	// If we have the RAS extensions we can consume a pending error
> 
> Grmbl... How comes we have never seen that for the past 5 months,
> including on CPUs that implement RAS?

I think it's probably a combination of (a) not having a massive testing
community, (b) not having tools that would scream about this (e.g. I don't
think you could detect this with KASAN), and (c) the nature of the
corruption being mostly benign in practice.

We found it in pKVM development because it landed on the vtcr we were
restoring when coming out of suspend. The page-table code then went wonky
on the next stage-2 fault: it got the wrong start level and kept returning
-EAGAIN because it thought a table was a leaf. So even then, the failure
mode is horribly subtle.

Will
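
As an illustration of why a wrong start level can make a walker mistake a
table for a leaf, here is a small Python sketch. It is not the kernel's
page-table code: `is_table`, `classify` and `MAX_LEVELS` are hypothetical
names, and it assumes a 4-level layout in which a valid descriptor with low
bits 0b11 is a table pointer at levels 0-2 but a page (leaf) entry at the
last level.

```python
MAX_LEVELS = 4       # assumption: levels 0..3
TYPE_MASK = 0b11
TYPE_TABLE = 0b11    # at levels 0-2 this encoding means "next-level table"


def is_table(desc, level):
    """Return True if a valid descriptor points to a next-level table."""
    if level == MAX_LEVELS - 1:
        return False  # the last level can only hold leaf entries
    return (desc & TYPE_MASK) == TYPE_TABLE


def classify(desc, level):
    """Classify a valid descriptor as seen by a walker at 'level'."""
    return "table" if is_table(desc, level) else "leaf"


if __name__ == "__main__":
    desc = 0x40000003          # made-up valid descriptor, low bits 0b11
    print(classify(desc, 2))   # walker knows it is at level 2
    print(classify(desc, 3))   # walker wrongly believes it is at level 3
```

The same descriptor is reported as a table at level 2 and as a leaf at
level 3, so a walker that starts one level too deep would misread the
entries it meets. How that turns into repeated -EAGAIN returns is specific
to the fault handler and is not modelled here.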

Marc Zyngier March 2, 2021, 6:57 p.m. UTC | #3
On Fri, 26 Feb 2021 18:12:11 +0000, Will Deacon wrote:
> Commit 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest
> context") tracks the currently running vCPU, clearing the pointer to
> NULL on exit from a guest.
> 
> Unfortunately, the use of 'set_loaded_vcpu' clobbers x1 to point at the
> kvm_hyp_ctxt instead of the vCPU context, causing the subsequent RAS
> code to go off into the weeds when it saves the DISR assuming that the
> CPU context is embedded in a struct vCPU.
> 
> [...]

Applied to kvmarm-master/fixes, thanks!

[1/1] KVM: arm64: Avoid corrupting vCPU context register in guest exit
      commit: a8a0f5dbcdf57d89bb8d555c6423763d99a156c1

Cheers,

	M.

Patch

diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index b0afad7a99c6..0c66a1d408fd 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -146,7 +146,7 @@  SYM_INNER_LABEL(__guest_exit, SYM_L_GLOBAL)
 	// Now restore the hyp regs
 	restore_callee_saved_regs x2
 
-	set_loaded_vcpu xzr, x1, x2
+	set_loaded_vcpu xzr, x2, x3
 
 alternative_if ARM64_HAS_RAS_EXTN
 	// If we have the RAS extensions we can consume a pending error
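
The effect of the one-line change can be modelled outside the kernel. The
Python sketch below is a toy register model, not the macro itself: it
assumes that 'set_loaded_vcpu vcpu, ctxt, tmp' overwrites both its second
and third operands (the second with the address of kvm_hyp_ctxt, as the
commit message describes) and that the RAS path afterwards reads x1 as the
vCPU context base. The helper `ras_save_disr_base` and the string
placeholders are hypothetical.

```python
VCPU_CTXT = "vcpu_ctxt"      # what x1 must still point at for the RAS code
HYP_CTXT = "kvm_hyp_ctxt"    # per-CPU hyp context


def set_loaded_vcpu(regs, vcpu, ctxt, tmp):
    """Toy model of the macro: 'ctxt' and 'tmp' are both clobbered."""
    regs = dict(regs)              # leave the caller's register file intact
    regs[tmp] = "percpu_scratch"   # assumed scratch use of the third operand
    regs[ctxt] = HYP_CTXT          # second operand now holds kvm_hyp_ctxt
    loaded_vcpu = None if vcpu == "xzr" else regs[vcpu]
    return regs, loaded_vcpu


def ras_save_disr_base(regs):
    """The RAS path treats x1 as the vCPU context base."""
    return regs["x1"]


if __name__ == "__main__":
    on_exit = {"x1": VCPU_CTXT, "x2": HYP_CTXT, "x3": "scratch"}
    buggy, _ = set_loaded_vcpu(on_exit, "xzr", "x1", "x2")
    fixed, _ = set_loaded_vcpu(on_exit, "xzr", "x2", "x3")
    print(ras_save_disr_base(buggy))   # kvm_hyp_ctxt
    print(ras_save_disr_base(fixed))   # vcpu_ctxt
```

With the old operands, x1 ends up pointing at kvm_hyp_ctxt, so anything
saved relative to x1 lands in the wrong structure. With x2 and x3 as the
clobbered registers, x1 is untouched and x2 is free because the hyp
callee-saved registers have already been restored from it.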