diff mbox series

[6/7] KVM: arm64: Handle stage-2 faults on stage-1 page-table walks earlier

Message ID 20200724143506.17772-7-will@kernel.org (mailing list archive)
State New, archived
Headers show
Series KVM: arm64: Fixes to early stage-2 fault handling | expand

Commit Message

Will Deacon July 24, 2020, 2:35 p.m. UTC
Stage-2 faults on stage-1 page-table walks can occur on both the I-side
and the D-side. It is IMPLEMENTATATION DEFINED whether they are reported
as reads or writes and, in the case that they are generated by an AT
instruction, they are reported with the CM bit set.

All of this deeply confuses the logic in kvm_handle_guest_abort();
userspace may or may not see the fault, depending on whether it occurs
on the data or the instruction side, and an AT instruction may be skipped
if the translation tables are held in a read-only memslot.

Move the handling of stage-2 faults on stage-1 page-table walks earlier
so that they consistently result in either a data or an instruction abort
being re-injected back to the guest.

Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/kvm/mmu.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Comments

Marc Zyngier July 26, 2020, 1:38 p.m. UTC | #1
On Fri, 24 Jul 2020 15:35:05 +0100,
Will Deacon <will@kernel.org> wrote:
> 
> Stage-2 faults on stage-1 page-table walks can occur on both the I-side
> and the D-side. It is IMPLEMENTATATION DEFINED whether they are reported
> as reads or writes and, in the case that they are generated by an AT
> instruction, they are reported with the CM bit set.
> 
> All of this deeply confuses the logic in kvm_handle_guest_abort();
> userspace may or may not see the fault, depending on whether it occurs
> on the data or the instruction side, and an AT instruction may be skipped
> if the translation tables are held in a read-only memslot.

Yuk, that's indeed ugly. Well spotted. I guess the saving grace is
that a S2 trap caused by an ATS1 instruction will be reported as
S1PTW+CM, while the fault caused by a CMO is reported as *either*
S1PTW *or* CM, but never both.

> 
> Move the handling of stage-2 faults on stage-1 page-table walks earlier
> so that they consistently result in either a data or an instruction abort
> being re-injected back to the guest.

The instruction abort seems to be happening as the side effect of
executing outside of a memslot, not really because of a S1PTW.  I
wonder whether these S1PTW faults should be classified as external
aborts instead (because putting your page tables outside of a memslot
seems a bit bonkers).

> 
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Quentin Perret <qperret@google.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/kvm/mmu.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index adb933ecd177..9e72e7f4a2c2 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -2124,6 +2124,11 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  			goto out;
>  		}
>  
> +		if (kvm_vcpu_dabt_iss1tw(vcpu)) {
> +			ret = -ENXIO;
> +			goto out;
> +		}
> +
>  		/*
>  		 * Check for a cache maintenance operation. Since we
>  		 * ended-up here, we know it is outside of any memory
> @@ -2157,11 +2162,6 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  			goto out_unlock;
>  		}
>  
> -		if (kvm_vcpu_dabt_iss1tw(vcpu)) {
> -			ret = -ENXIO;
> -			goto out;
> -		}
> -
>  		ret = io_mem_abort(vcpu, run, fault_ipa);
>  		goto out_unlock;
>  	}
> -- 
> 2.28.0.rc0.142.g3c755180ce-goog
> 
> 

Thanks,

	M.
Will Deacon July 27, 2020, 10:29 a.m. UTC | #2
On Sun, Jul 26, 2020 at 02:38:38PM +0100, Marc Zyngier wrote:
> On Fri, 24 Jul 2020 15:35:05 +0100,
> Will Deacon <will@kernel.org> wrote:
> > 
> > Stage-2 faults on stage-1 page-table walks can occur on both the I-side
> > and the D-side. It is IMPLEMENTATATION DEFINED whether they are reported
> > as reads or writes and, in the case that they are generated by an AT
> > instruction, they are reported with the CM bit set.
> > 
> > All of this deeply confuses the logic in kvm_handle_guest_abort();
> > userspace may or may not see the fault, depending on whether it occurs
> > on the data or the instruction side, and an AT instruction may be skipped
> > if the translation tables are held in a read-only memslot.
> 
> Yuk, that's indeed ugly. Well spotted. I guess the saving grace is
> that a S2 trap caused by an ATS1 instruction will be reported as
> S1PTW+CM, while the fault caused by a CMO is reported as *either*
> S1PTW *or* CM, but never both.

Hmm, is that right? If the translation faults at S2 for a CM instruction,
wouldn't it have both bits set?

> > Move the handling of stage-2 faults on stage-1 page-table walks earlier
> > so that they consistently result in either a data or an instruction abort
> > being re-injected back to the guest.
> 
> The instruction abort seems to be happening as the side effect of
> executing outside of a memslot, not really because of a S1PTW.

Not sure about that. If the instruction fetch generates an S2 abort during
translation, then we could be executing from within a memslot; it's the
location of the page-tables that matters.

However, I think that means things still aren't quite right with my patches
because we can end up calling io_mem_abort() from an instruction abort,
which won't have enough syndrome information to do anything useful. Hmm.

Stepping back, here's what I _think_ we want, although see the '(?)'
bits where I'm particularly unsure:


S2 instruction abort:
  * Not in memslot:		inject external iabt to guest
  * In R/O memslot:
    - S2 fault on S1 walk:	either EXIT_NISV or inject external iabt
				to guest (?)

S2 data abort:
  * Not in memslot:
    - S2 fault on S1 walk:	inject external dabt to guest
    - Cache maintenance:	skip instr
    - Syndrome valid		EXIT_MMIO
    - Syndrome invalid		EXIT_NISV
  * In R/O memslot:
    - S2 fault on S1 walk:	either EXIT_NISV or inject external dabt
				to guest (?)
    - Access is write (including cache maintenance (?)):
      - Syndrome valid		EXIT_MMIO
      - Syndrome invalid	EXIT_NISV


Everything else gets handled by handle_access_fault()/user_mem_abort().

What do you think?

> I wonder whether these S1PTW faults should be classified as external
> aborts instead (because putting your page tables outside of a memslot
> seems a bit bonkers).

I think that's what this patch does, since we end up in kvm_inject_dabt().

Will
diff mbox series

Patch

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index adb933ecd177..9e72e7f4a2c2 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -2124,6 +2124,11 @@  int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
 			goto out;
 		}
 
+		if (kvm_vcpu_dabt_iss1tw(vcpu)) {
+			ret = -ENXIO;
+			goto out;
+		}
+
 		/*
 		 * Check for a cache maintenance operation. Since we
 		 * ended-up here, we know it is outside of any memory
@@ -2157,11 +2162,6 @@  int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
 			goto out_unlock;
 		}
 
-		if (kvm_vcpu_dabt_iss1tw(vcpu)) {
-			ret = -ENXIO;
-			goto out;
-		}
-
 		ret = io_mem_abort(vcpu, run, fault_ipa);
 		goto out_unlock;
 	}