Message ID | 20230215011614.725983-7-amoorthy@google.com (mailing list archive)
---|---
State      | New, archived
Series     | Add memory fault exits to avoid slow GUP
On Wed, Feb 15, 2023, Anish Moorthy wrote:
> With the relevant kvm cap enabled, EPT violations will exit to userspace
> w/ reason KVM_EXIT_MEMORY_FAULT instead of resolving the fault via slow
> get_user_pages.

Similar to the other patch's feedback, please rewrite the changelog to phrase
the changes as commands, not as descriptions of the resulting behavior.

> Signed-off-by: Anish Moorthy <amoorthy@google.com>
> Suggested-by: James Houghton <jthoughton@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 23 ++++++++++++++++++++---
>  arch/x86/kvm/x86.c     |  1 +
>  2 files changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index aeb240b339f54..28af8d60adee6 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4201,6 +4201,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
>  {
>  	struct kvm_memory_slot *slot = fault->slot;
>  	bool async;
> +	bool mem_fault_nowait;
>
>  	/*
>  	 * Retry the page fault if the gfn hit a memslot that is being deleted
> @@ -4230,9 +4231,25 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
>  	}
>
>  	async = false;
> -	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, &async,
> -					  fault->write, &fault->map_writable,
> -					  &fault->hva);
> +	mem_fault_nowait = memory_faults_enabled(vcpu->kvm);
> +
> +	fault->pfn = __gfn_to_pfn_memslot(
> +		slot, fault->gfn,
> +		mem_fault_nowait,

Hrm, as prep work for this series, I think we should first clean up the pile o'
bools.  This came up before when the "interruptible" flag was added[*].  We
punted then, but I think it's time to bite the bullet, especially since
"nowait" and "async" need to be mutually exclusive.

[*] https://lore.kernel.org/all/YrR9i3yHzh5ftOxB@google.com

> +		false,
> +		mem_fault_nowait ? NULL : &async,
> +		fault->write, &fault->map_writable,
> +		&fault->hva);

Same comments as the other patch: follow kernel style, not google3's madness :-)

> +	if (mem_fault_nowait) {
> +		if (fault->pfn == KVM_PFN_ERR_FAULT) {
> +			vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
> +			vcpu->run->memory_fault.gpa = fault->gfn << PAGE_SHIFT;
> +			vcpu->run->memory_fault.size = PAGE_SIZE;

This belongs in a separate patch, and the exit stuff should be filled in by
kvm_handle_error_pfn().  Then this if-statement goes away entirely because the
"if (!async)" will always evaluate true in the nowait case.

> +		}
> +		return RET_PF_CONTINUE;
> +	}
> +
>  	if (!async)
>  		return RET_PF_CONTINUE; /* *pfn has correct page already */
Hi, Sean,

On Wed, Feb 15, 2023 at 09:23:55AM -0800, Sean Christopherson wrote:
> > @@ -4230,9 +4231,25 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> >  	}
> >
> >  	async = false;
> > -	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, &async,
> > -					  fault->write, &fault->map_writable,
> > -					  &fault->hva);
> > +	mem_fault_nowait = memory_faults_enabled(vcpu->kvm);
> > +
> > +	fault->pfn = __gfn_to_pfn_memslot(
> > +		slot, fault->gfn,
> > +		mem_fault_nowait,
>
> Hrm, as prep work for this series, I think we should first clean up the pile o'
> bools.  This came up before when the "interruptible" flag was added[*].  We punted
> then, but I think it's time to bite the bullet, especially since "nowait" and
> "async" need to be mutually exclusive.
>
> [*] https://lore.kernel.org/all/YrR9i3yHzh5ftOxB@google.com

IIUC at that time we didn't reach a consensus on how to rework it, and the
suggestion was using a bool and separating the cleanup (while the flag cleanup
got NACKed..), which makes sense to me.

Personally I like your other patch a lot:

https://lore.kernel.org/all/YrTNGVpT8Cw2yrnr@google.com/

The question is, after we merge the bools into a flag, do we still need a
struct* anyway?

If we can reach a consensus, I'll be happy to continue working on the cleanup
(by picking up your patch first).  Or if Anish would like to take it over in
any form, that'll be also perfect.

Thanks,
On Wed, Feb 15, 2023 at 9:24 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Feb 15, 2023, Anish Moorthy wrote:
> > +	if (mem_fault_nowait) {
> > +		if (fault->pfn == KVM_PFN_ERR_FAULT) {
> > +			vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
> > +			vcpu->run->memory_fault.gpa = fault->gfn << PAGE_SHIFT;
> > +			vcpu->run->memory_fault.size = PAGE_SIZE;
>
> This belongs in a separate patch, and the exit stuff should be filled in by
> kvm_handle_error_pfn().  Then this if-statement goes away entirely because the
> "if (!async)" will always evaluate true in the nowait case.

Hi Sean, what exactly did you want "in a separate patch"? Unless you mean
separating the changes to arch/x86/kvm/mmu/mmu.c and the one-liner to enable
the cap in arch/x86/kvm/x86.c, I don't really understand how this patch could
logically be split.
On Wed, Feb 22, 2023, Anish Moorthy wrote:
> On Wed, Feb 15, 2023 at 9:24 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Feb 15, 2023, Anish Moorthy wrote:
> > > +	if (mem_fault_nowait) {
> > > +		if (fault->pfn == KVM_PFN_ERR_FAULT) {
> > > +			vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
> > > +			vcpu->run->memory_fault.gpa = fault->gfn << PAGE_SHIFT;
> > > +			vcpu->run->memory_fault.size = PAGE_SIZE;
> >
> > This belongs in a separate patch, and the exit stuff should be filled in by
> > kvm_handle_error_pfn().  Then this if-statement goes away entirely because the
> > "if (!async)" will always evaluate true in the nowait case.
>
> Hi Sean, what exactly did you want "in a separate patch"?

Separate "exit if fast gup() fails", a.k.a. nowait, from "exit with
KVM_EXIT_MEMORY_FAULT instead of -EFAULT on errors".  I.e. don't tie
KVM_EXIT_MEMORY_FAULT to the fast gup() capability (or memslot flag?).  That
way filling vcpu->run->memory_fault lands in a separate, largely generic
patch, and this patch only introduces the fast gup() logic.

That probably means adding yet another capability, but capabilities are cheap.
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index aeb240b339f54..28af8d60adee6 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4201,6 +4201,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 {
 	struct kvm_memory_slot *slot = fault->slot;
 	bool async;
+	bool mem_fault_nowait;
 
 	/*
 	 * Retry the page fault if the gfn hit a memslot that is being deleted
@@ -4230,9 +4231,25 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	}
 
 	async = false;
-	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, &async,
-					  fault->write, &fault->map_writable,
-					  &fault->hva);
+	mem_fault_nowait = memory_faults_enabled(vcpu->kvm);
+
+	fault->pfn = __gfn_to_pfn_memslot(
+		slot, fault->gfn,
+		mem_fault_nowait,
+		false,
+		mem_fault_nowait ? NULL : &async,
+		fault->write, &fault->map_writable,
+		&fault->hva);
+
+	if (mem_fault_nowait) {
+		if (fault->pfn == KVM_PFN_ERR_FAULT) {
+			vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
+			vcpu->run->memory_fault.gpa = fault->gfn << PAGE_SHIFT;
+			vcpu->run->memory_fault.size = PAGE_SIZE;
+		}
+		return RET_PF_CONTINUE;
+	}
+
 	if (!async)
 		return RET_PF_CONTINUE; /* *pfn has correct page already */
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 508074e47bc0e..fe39ab2af5db4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4427,6 +4427,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_VAPIC:
 	case KVM_CAP_ENABLE_CAP:
 	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
+	case KVM_CAP_MEM_FAULT_NOWAIT:
 		r = 1;
 		break;
 	case KVM_CAP_EXIT_HYPERCALL:
With the relevant kvm cap enabled, EPT violations will exit to userspace
w/ reason KVM_EXIT_MEMORY_FAULT instead of resolving the fault via slow
get_user_pages.

Signed-off-by: Anish Moorthy <amoorthy@google.com>
Suggested-by: James Houghton <jthoughton@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 23 ++++++++++++++++++++---
 arch/x86/kvm/x86.c     |  1 +
 2 files changed, 21 insertions(+), 3 deletions(-)