mm: Export follow_pte() for KVM so that KVM can stop using follow_pfn()

Message ID 20210204171619.3640084-1-seanjc@google.com (mailing list archive)
State New, archived

Commit Message

Sean Christopherson Feb. 4, 2021, 5:16 p.m. UTC
Export follow_pte() to fix build breakage when KVM is built as a module.
An in-flight KVM fix switches from follow_pfn() to follow_pte() in order
to grab the page protections along with the PFN.

Fixes: bd2fae8da794 ("KVM: do not assume PTE is writable after follow_pfn")
Cc: David Stevens <stevensd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---

Paolo, maybe you can squash this with the appropriate acks?

 mm/memory.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Paolo Bonzini Feb. 4, 2021, 5:19 p.m. UTC | #1
On 04/02/21 18:16, Sean Christopherson wrote:
> Export follow_pte() to fix build breakage when KVM is built as a module.
> An in-flight KVM fix switches from follow_pfn() to follow_pte() in order
> to grab the page protections along with the PFN.
> 
> Fixes: bd2fae8da794 ("KVM: do not assume PTE is writable after follow_pfn")
> Cc: David Stevens <stevensd@google.com>
> Cc: Jann Horn <jannh@google.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: kvm@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> 
> Paolo, maybe you can squash this with the appropriate acks?

Indeed, you beat me by a minute.  This change is why I hadn't sent out 
the patch yet.

Andrew or Jason, ok to squash this?

Paolo

>   mm/memory.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index feff48e1465a..15cbd10afd59 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4775,6 +4775,7 @@ int follow_pte(struct mm_struct *mm, unsigned long address,
>   out:
>   	return -EINVAL;
>   }
> +EXPORT_SYMBOL_GPL(follow_pte);
>   
>   /**
>    * follow_pfn - look up PFN at a user virtual address
>

Jason Gunthorpe Feb. 4, 2021, 8:33 p.m. UTC | #2
On Thu, Feb 04, 2021 at 06:19:13PM +0100, Paolo Bonzini wrote:
> On 04/02/21 18:16, Sean Christopherson wrote:
> > Export follow_pte() to fix build breakage when KVM is built as a module.
> > An in-flight KVM fix switches from follow_pfn() to follow_pte() in order
> > to grab the page protections along with the PFN.
> > 
> > Fixes: bd2fae8da794 ("KVM: do not assume PTE is writable after follow_pfn")
> > Cc: David Stevens <stevensd@google.com>
> > Cc: Jann Horn <jannh@google.com>
> > Cc: Jason Gunthorpe <jgg@ziepe.ca>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: kvm@vger.kernel.org
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > 
> > Paolo, maybe you can squash this with the appropriate acks?
> 
> Indeed, you beat me by a minute.  This change is why I hadn't sent out the
> patch yet.
> 
> Andrew or Jason, ok to squash this?

I think usual process would be to put this in the patch/series/pr that
needs it.

Given how badly follow_pfn has been misused, I would greatly prefer to
see you add a kdoc along with exporting it - making it clear about the
rules.

And it looks like we should remove the range argument for modular use.

And document the locking requirements, it does a lockless read of the
page table:

	pgd = pgd_offset(mm, address);
	if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
		goto out;

	p4d = p4d_offset(pgd, address);

It doesn't do the trickery that fast GUP does, so it must require the
mmap sem in read mode at least.

Not sure I understand how fsdax is able to call it only under the
i_mmap_lock_read lock? What prevents a page table level from being
freed concurrently?

And it is missing READ_ONCE's for the lockless page table walk.. :(

Jason
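
[Editorial illustration, not part of the thread] The READ_ONCE() remark above can be sketched in plain userspace C. This is not kernel code: the two-level table, the names toy_pgd, toy_pte_table and toy_walk, and the TOY_READ_ONCE() macro are all hypothetical stand-ins, and the macro only approximates the kernel's READ_ONCE() with a volatile load (it relies on the GCC/Clang __typeof__ extension). The point is that each level is loaded exactly once into a local, and both the "is it present" check and the later dereference use that local snapshot, rather than testing *pgd and then reading the entry from memory again.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace approximation of READ_ONCE(): force exactly one load of x. */
#define TOY_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

#define TOY_ENTRIES 16

/* Hypothetical leaf table: a zero entry means "not present". */
struct toy_pte_table {
	uint64_t pte[TOY_ENTRIES];
};

/* Hypothetical top-level directory: NULL means "no leaf table". */
struct toy_pgd {
	struct toy_pte_table *dir[TOY_ENTRIES];
};

/*
 * Walk the toy table for an 8-bit address (high nibble selects the
 * directory slot, low nibble selects the leaf entry).
 * Returns 0 and stores the entry in *out, or -1 if nothing is mapped.
 */
static int toy_walk(struct toy_pgd *pgd, unsigned int addr, uint64_t *out)
{
	unsigned int hi = (addr >> 4) & (TOY_ENTRIES - 1);
	unsigned int lo = addr & (TOY_ENTRIES - 1);
	struct toy_pte_table *tbl;
	uint64_t pte;

	/* Snapshot the directory entry once; check and use the snapshot. */
	tbl = TOY_READ_ONCE(pgd->dir[hi]);
	if (!tbl)
		return -1;

	/* Same pattern one level down. */
	pte = TOY_READ_ONCE(tbl->pte[lo]);
	if (!pte)
		return -1;

	*out = pte;
	return 0;
}
```

Note that a single-load snapshot only avoids acting on two different values of the same entry. It does nothing to stop the table it points to from being freed concurrently, which is the separate locking question raised above.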
Patch

diff --git a/mm/memory.c b/mm/memory.c
index feff48e1465a..15cbd10afd59 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4775,6 +4775,7 @@  int follow_pte(struct mm_struct *mm, unsigned long address,
 out:
 	return -EINVAL;
 }
+EXPORT_SYMBOL_GPL(follow_pte);
 
 /**
  * follow_pfn - look up PFN at a user virtual address