diff mbox series

[06/16] KVM: arm64: Force a full unmap on vpcu reinit

Message ID 20210715163159.1480168-7-maz@kernel.org (mailing list archive)
State New, archived
Headers show
Series KVM: arm64: MMIO guard PV services | expand

Commit Message

Marc Zyngier July 15, 2021, 4:31 p.m. UTC
As we now keep information in the S2PT, we must be careful not
to keep it across a VM reboot, which could otherwise lead to
interesting problems.

Make sure that the S2 is completely discarded on reset of
a vcpu, and remove the flag that enforces the MMIO check.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kvm/arm.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Will Deacon July 27, 2021, 6:11 p.m. UTC | #1
On Thu, Jul 15, 2021 at 05:31:49PM +0100, Marc Zyngier wrote:
> As we now keep information in the S2PT, we must be careful not
> to keep it across a VM reboot, which could otherwise lead to
> interesting problems.
> 
> Make sure that the S2 is completely discarded on reset of
> a vcpu, and remove the flag that enforces the MMIO check.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  arch/arm64/kvm/arm.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 97ab1512c44f..b0d2225190d2 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1096,12 +1096,18 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
>  	 * ensuring that the data side is always coherent. We still
>  	 * need to invalidate the I-cache though, as FWB does *not*
>  	 * imply CTR_EL0.DIC.
> +	 *
> +	 * If the MMIO guard was enabled, we pay the price of a full
> +	 * unmap to get back to a sane state (and clear the flag).
>  	 */
>  	if (vcpu->arch.has_run_once) {
> -		if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB))
> +		if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB) ||
> +		    test_bit(KVM_ARCH_FLAG_MMIO_GUARD, &vcpu->kvm->arch.flags))
>  			stage2_unmap_vm(vcpu->kvm);
>  		else
>  			icache_inval_all_pou();
> +
> +		clear_bit(KVM_ARCH_FLAG_MMIO_GUARD, &vcpu->kvm->arch.flags);

What prevents this racing with another vCPU trying to set the bit?

Will
Marc Zyngier July 28, 2021, 10:38 a.m. UTC | #2
On Tue, 27 Jul 2021 19:11:33 +0100,
Will Deacon <will@kernel.org> wrote:
> 
> On Thu, Jul 15, 2021 at 05:31:49PM +0100, Marc Zyngier wrote:
> > As we now keep information in the S2PT, we must be careful not
> > to keep it across a VM reboot, which could otherwise lead to
> > interesting problems.
> > 
> > Make sure that the S2 is completely discarded on reset of
> > a vcpu, and remove the flag that enforces the MMIO check.
> > 
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> >  arch/arm64/kvm/arm.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 97ab1512c44f..b0d2225190d2 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1096,12 +1096,18 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
> >  	 * ensuring that the data side is always coherent. We still
> >  	 * need to invalidate the I-cache though, as FWB does *not*
> >  	 * imply CTR_EL0.DIC.
> > +	 *
> > +	 * If the MMIO guard was enabled, we pay the price of a full
> > +	 * unmap to get back to a sane state (and clear the flag).
> >  	 */
> >  	if (vcpu->arch.has_run_once) {
> > -		if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB))
> > +		if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB) ||
> > +		    test_bit(KVM_ARCH_FLAG_MMIO_GUARD, &vcpu->kvm->arch.flags))
> >  			stage2_unmap_vm(vcpu->kvm);
> >  		else
> >  			icache_inval_all_pou();
> > +
> > +		clear_bit(KVM_ARCH_FLAG_MMIO_GUARD, &vcpu->kvm->arch.flags);
> 
> What prevents this racing with another vCPU trying to set the bit?

Not much. We could take the kvm lock on both ends to serialize it, but
that's pretty ugly. And should we care? What is the semantic of
resetting a vcpu while another is still running?

If we want to support this sort of behaviour, then our tracking is
totally bogus, because it is VM-wide. And you don't even have to play
with that bit from another vcpu: all the information is lost at the
point where we unmap the S2 PTs.

Maybe an alternative is to move this to the halt/reboot PSCI handlers,
making it clearer what we expect?

	M.
Will Deacon July 30, 2021, 12:50 p.m. UTC | #3
On Wed, Jul 28, 2021 at 11:38:35AM +0100, Marc Zyngier wrote:
> On Tue, 27 Jul 2021 19:11:33 +0100,
> Will Deacon <will@kernel.org> wrote:
> > 
> > On Thu, Jul 15, 2021 at 05:31:49PM +0100, Marc Zyngier wrote:
> > > As we now keep information in the S2PT, we must be careful not
> > > to keep it across a VM reboot, which could otherwise lead to
> > > interesting problems.
> > > 
> > > Make sure that the S2 is completely discarded on reset of
> > > a vcpu, and remove the flag that enforces the MMIO check.
> > > 
> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > ---
> > >  arch/arm64/kvm/arm.c | 8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > index 97ab1512c44f..b0d2225190d2 100644
> > > --- a/arch/arm64/kvm/arm.c
> > > +++ b/arch/arm64/kvm/arm.c
> > > @@ -1096,12 +1096,18 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
> > >  	 * ensuring that the data side is always coherent. We still
> > >  	 * need to invalidate the I-cache though, as FWB does *not*
> > >  	 * imply CTR_EL0.DIC.
> > > +	 *
> > > +	 * If the MMIO guard was enabled, we pay the price of a full
> > > +	 * unmap to get back to a sane state (and clear the flag).
> > >  	 */
> > >  	if (vcpu->arch.has_run_once) {
> > > -		if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB))
> > > +		if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB) ||
> > > +		    test_bit(KVM_ARCH_FLAG_MMIO_GUARD, &vcpu->kvm->arch.flags))
> > >  			stage2_unmap_vm(vcpu->kvm);
> > >  		else
> > >  			icache_inval_all_pou();
> > > +
> > > +		clear_bit(KVM_ARCH_FLAG_MMIO_GUARD, &vcpu->kvm->arch.flags);
> > 
> > What prevents this racing with another vCPU trying to set the bit?
> 
> Not much. We could take the kvm lock on both ends to serialize it, but
> that's pretty ugly. And should we care? What is the semantic of
> resetting a vcpu while another is still running?

It's definitely weird, but given that this is an attack vector I don't think
we can rule out attackers trying whacky stuff like this (although maybe
we end up forbidding vcpu reset in pKVM -- dunno).

> If we want to support this sort of behaviour, then our tracking is
> totally bogus, because it is VM-wide. And you don't even have to play
> with that bit from another vcpu: all the information is lost at the
> point where we unmap the S2 PTs.
> 
> Maybe an alternative is to move this to the halt/reboot PSCI handlers,
> making it clearer what we expect?

I think that's probably worth looking at. The race is quite hard to reason
about otherwise, so if clearing the bit can be done on the teardown path
in single-threaded context then I think that's better. It looks like
kvm_prepare_system_event() has all the synchronisation we need there.

Will
diff mbox series

Patch

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 97ab1512c44f..b0d2225190d2 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1096,12 +1096,18 @@  static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
 	 * ensuring that the data side is always coherent. We still
 	 * need to invalidate the I-cache though, as FWB does *not*
 	 * imply CTR_EL0.DIC.
+	 *
+	 * If the MMIO guard was enabled, we pay the price of a full
+	 * unmap to get back to a sane state (and clear the flag).
 	 */
 	if (vcpu->arch.has_run_once) {
-		if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB))
+		if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB) ||
+		    test_bit(KVM_ARCH_FLAG_MMIO_GUARD, &vcpu->kvm->arch.flags))
 			stage2_unmap_vm(vcpu->kvm);
 		else
 			icache_inval_all_pou();
+
+		clear_bit(KVM_ARCH_FLAG_MMIO_GUARD, &vcpu->kvm->arch.flags);
 	}
 
 	vcpu_reset_hcr(vcpu);