[RFC,09/10] vfio/type1: Use follow_pfn for VM_FPNMAP VMAs

Series

[RFC,01/10] mm: Add pmd support for _PAGE_SPECIAL
[RFC,02/10] mm: Handle pmd entries in follow_pfn()
[RFC,03/10] mm: Add pud support for _PAGE_SPECIAL
[RFC,04/10] mm: Handle pud entries in follow_pfn()
[RFC,05/10] device-dax: Do not enforce MADV_DONTFORK on mmap()
[RFC,06/10] device-dax: Introduce pfn_flags helper
[RFC,07/10] device-dax: Add support for PFN_SPECIAL flags
[RFC,08/10] dax/pmem: Add device-dax support for PFN_MODE_NONE
[RFC,09/10] vfio/type1: Use follow_pfn for VM_FPNMAP VMAs
[RFC,10/10] nvdimm/e820: add multiple namespaces support

Message ID

20200110190313.17144-10-joao.m.martins@oracle.com (mailing list archive)

State

New, archived

Headers

From: Joao Martins <joao.m.martins@oracle.com>
To: linux-nvdimm@lists.01.org
Cc: Dan Williams <dan.j.williams@intel.com>,
        Vishal Verma <vishal.l.verma@intel.com>,
        Dave Jiang <dave.jiang@intel.com>,
        Ira Weiny <ira.weiny@intel.com>,
        Alex Williamson <alex.williamson@redhat.com>,
        Cornelia Huck <cohuck@redhat.com>, kvm@vger.kernel.org,
        Andrew Morton <akpm@linux-foundation.org>, linux-mm@kvack.org,
        linux-kernel@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
        Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
        "H . Peter Anvin" <hpa@zytor.com>, x86@kernel.org,
        Liran Alon <liran.alon@oracle.com>,
        Nikita Leshenko <nikita.leshchenko@oracle.com>,
        Barret Rhoden <brho@google.com>,
        Boris Ostrovsky <boris.ostrovsky@oracle.com>,
        Matthew Wilcox <willy@infradead.org>,
        Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: [PATCH RFC 09/10] vfio/type1: Use follow_pfn for VM_FPNMAP VMAs
Date: Fri, 10 Jan 2020 19:03:12 +0000
Message-Id: <20200110190313.17144-10-joao.m.martins@oracle.com>
In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com>
References: <20200110190313.17144-1-joao.m.martins@oracle.com>
Sender: kvm-owner@vger.kernel.org
Precedence: bulk

Commit Message

Joao Martins Jan. 10, 2020, 7:03 p.m. UTC

From: Nikita Leshenko <nikita.leshchenko@oracle.com>

Unconditionally interpreting vm_pgoff as a PFN is incorrect.

VMAs created by /dev/mem do this, but in general VM_PFNMAP just means
that the VMA doesn't have an associated struct page and is being managed
directly by something other than the core mmu.

Use follow_pfn like KVM does to find the PFN.

Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
---
 drivers/vfio/vfio_iommu_type1.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
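
To make the commit message's point concrete, here is a minimal user-space sketch, not kernel code. `struct toy_vma`, `pfn_from_pgoff()` and `pfn_from_mapping()` are hypothetical names invented for illustration, and 4 KiB pages are assumed. It contrasts the arithmetic the old code performed with the PFN a mapping actually points at when the driver does not store the base PFN in vm_pgoff.

```c
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical, heavily simplified stand-in for struct vm_area_struct. */
struct toy_vma {
	uint64_t vm_start; /* first virtual address covered by the mapping */
	uint64_t vm_pgoff; /* offset in pages; its meaning is up to the driver */
};

/* The old computation: treat vm_pgoff as the PFN of the first page. */
static uint64_t pfn_from_pgoff(const struct toy_vma *vma, uint64_t vaddr)
{
	return ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
}

/*
 * What the mapping really points at, given the physical base address the
 * driver mapped. In the kernel this information lives in the page tables,
 * which is what a page-table walk such as follow_pfn() consults.
 */
static uint64_t pfn_from_mapping(const struct toy_vma *vma, uint64_t vaddr,
				 uint64_t phys_base)
{
	return (phys_base >> PAGE_SHIFT) +
	       ((vaddr - vma->vm_start) >> PAGE_SHIFT);
}
```

With a /dev/mem-style VMA, where vm_pgoff equals the base PFN, both functions agree. With a VMA whose vm_pgoff is merely an offset into a device (for example 0), the first function returns a small, unrelated number while the second returns the PFN that is actually mapped.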

Comments

Jason Gunthorpe Feb. 7, 2020, 9:08 p.m. UTC | #1

On Fri, Jan 10, 2020 at 07:03:12PM +0000, Joao Martins wrote:
> From: Nikita Leshenko <nikita.leshchenko@oracle.com>
> 
> Unconditionally interpreting vm_pgoff as a PFN is incorrect.
> 
> VMAs created by /dev/mem do this, but in general VM_PFNMAP just means
> that the VMA doesn't have an associated struct page and is being managed
> directly by something other than the core mmu.
> 
> Use follow_pfn like KVM does to find the PFN.
> 
> Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 2ada8e6cdb88..1e43581f95ea 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -362,9 +362,9 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
>  	vma = find_vma_intersection(mm, vaddr, vaddr + 1);
>  
>  	if (vma && vma->vm_flags & VM_PFNMAP) {
> -		*pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
> -		if (is_invalid_reserved_pfn(*pfn))
> -			ret = 0;
> +		ret = follow_pfn(vma, vaddr, pfn);
> +		if (!ret && !is_invalid_reserved_pfn(*pfn))
> +			ret = -EOPNOTSUPP;
>  	}

FWIW this existing code is a huge hack and a security problem.

I'm not sure how you could be successfully using this path on actual
memory without hitting bad bugs?

Fundamentally VFIO can't retain a reference to a page from within a VMA
without some kind of refcount/locking/etc to allow the thing that put
the page there to know it is still being used (ie programmed in an
IOMMU) by VFIO.

Otherwise it creates use-after-free style security problems on the
page.

This code needs to be deleted, not extended :(

Jason

Joao Martins Feb. 11, 2020, 4:23 p.m. UTC | #2

On 2/7/20 9:08 PM, Jason Gunthorpe wrote:
> On Fri, Jan 10, 2020 at 07:03:12PM +0000, Joao Martins wrote:
>> From: Nikita Leshenko <nikita.leshchenko@oracle.com>
>>
>> Unconditionally interpreting vm_pgoff as a PFN is incorrect.
>>
>> VMAs created by /dev/mem do this, but in general VM_PFNMAP just means
>> that the VMA doesn't have an associated struct page and is being managed
>> directly by something other than the core mmu.
>>
>> Use follow_pfn like KVM does to find the PFN.
>>
>> Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index 2ada8e6cdb88..1e43581f95ea 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -362,9 +362,9 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
>>  	vma = find_vma_intersection(mm, vaddr, vaddr + 1);
>>  
>>  	if (vma && vma->vm_flags & VM_PFNMAP) {
>> -		*pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
>> -		if (is_invalid_reserved_pfn(*pfn))
>> -			ret = 0;
>> +		ret = follow_pfn(vma, vaddr, pfn);
>> +		if (!ret && !is_invalid_reserved_pfn(*pfn))
>> +			ret = -EOPNOTSUPP;
>>  	}
> 
> FWIW this existing code is a huge hack and a security problem.
> 
> I'm not sure how you could be successfully using this path on actual
> memory without hitting bad bugs?
> 
I think this codepath is largely hit at the moment for MMIO (GPU
passthrough, or mdev). In the context of this patch, guest memory would be
treated similarly, meaning the device-dax backing memory wouldn't have a 'struct
page' (as introduced in this series).

> Fundamentally VFIO can't retain a reference to a page from within a VMA
> without some kind of refcount/locking/etc to allow the thing that put
> the page there to know it is still being used (ie programmed in an
> IOMMU) by VFIO.
> 
> Otherwise it creates use-after-free style security problems on the
> page.
> 
I take it you're referring to the past problems with long term page pinning +
fsdax? Or you had something else in mind, perhaps related to your LSFMM topic?

Here the memory can't be used by the kernel (and there's no struct page) except
from device-dax managing/tearing/driving the pfn region (which is static and the
underlying PFNs won't change throughout device lifetime), and vfio
pinning/unpinning the pfns (which are refcounted against multiple map/unmaps).

> This code needs to be deleted, not extended :(

To some extent it isn't really an extension: the patch was just removing the
assumption that @vm_pgoff is the 'start pfn' on PFNMAP vmas. This is also
similarly done by get_vaddr_frames().

	Joao

Jason Gunthorpe Feb. 11, 2020, 4:50 p.m. UTC | #3

On Tue, Feb 11, 2020 at 04:23:49PM +0000, Joao Martins wrote:
> On 2/7/20 9:08 PM, Jason Gunthorpe wrote:
> > On Fri, Jan 10, 2020 at 07:03:12PM +0000, Joao Martins wrote:
> >> From: Nikita Leshenko <nikita.leshchenko@oracle.com>
> >>
> >> Unconditionally interpreting vm_pgoff as a PFN is incorrect.
> >>
> >> VMAs created by /dev/mem do this, but in general VM_PFNMAP just means
> >> that the VMA doesn't have an associated struct page and is being managed
> >> directly by something other than the core mmu.
> >>
> >> Use follow_pfn like KVM does to find the PFN.
> >>
> >> Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
> >> ---
> >>  drivers/vfio/vfio_iommu_type1.c | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >> index 2ada8e6cdb88..1e43581f95ea 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -362,9 +362,9 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
> >>  	vma = find_vma_intersection(mm, vaddr, vaddr + 1);
> >>  
> >>  	if (vma && vma->vm_flags & VM_PFNMAP) {
> >> -		*pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
> >> -		if (is_invalid_reserved_pfn(*pfn))
> >> -			ret = 0;
> >> +		ret = follow_pfn(vma, vaddr, pfn);
> >> +		if (!ret && !is_invalid_reserved_pfn(*pfn))
> >> +			ret = -EOPNOTSUPP;
> >>  	}
> > 
> > FWIW this existing code is a huge hack and a security problem.
> > 
> > I'm not sure how you could be successfully using this path on actual
> > memory without hitting bad bugs?
> > 
> I think this codepath is largely hit at the moment for MMIO (GPU
> passthrough, or mdev). In the context of this patch, guest memory would be
> treated similarly, meaning the device-dax backing memory wouldn't have a 'struct
> page' (as introduced in this series).

I think it is being used specifically to allow two VFIO devices to be
inserted into a VM and have the IOMMU set up to allow MMIO access.

> > Fundamentally VFIO can't retain a reference to a page from within a VMA
> > without some kind of refcount/locking/etc to allow the thing that put
> > the page there to know it is still being used (ie programmed in an
> > IOMMU) by VFIO.
> > 
> > Otherwise it creates use-after-free style security problems on the
> > page.
>
> I take it you're referring to the past problems with long term page pinning +
> fsdax? Or you had something else in mind, perhaps related to your LSFMM topic?

No. I'm referring to retaining access to memory backing a VMA without
holding any kind of locking on it. This is an access after free scenario.

It *should* be like a long term page pin so that the VMA owner knows
something is happening.
 
> Here the memory can't be used by the kernel (and there's no struct page) except
> from device-dax managing/tearing/driving the pfn region (which is static and the
> underlying PFNs won't change throughout device lifetime), and vfio
> pinning/unpinning the pfns (which are refcounted against multiple map/unmaps).

For instance if you tear down the device-dax then VFIO will happily
continue to reference the memory. This is a bug.

There are other cases that escalate to security bugs.

> > This code needs to be deleted, not extended :(
> 
> To some extent it isn't really an extension: the patch was just removing the
> assumption that @vm_pgoff is the 'start pfn' on PFNMAP vmas. This is also
> similarly done by get_vaddr_frames().

You are extending it in the sense that you plan to use it for more
cases than VMAs created by some other VFIO. That should not be
done as it will only complicate fixing this code.

KVM is allowed to use follow_pfn because it uses MMU notifiers and
does not allow the result of follow_pfn to outlive the VMA (AFAIK at
least). So it should be safe.

Jason
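
The lifetime argument in this reply can be sketched with a toy user-space model. Every name below (`toy_mapping`, `toy_consumer`, `toy_lookup()`, `toy_teardown()`, `toy_is_stale()`) is hypothetical, and the single callback slot is only a stand-in for the idea behind MMU notifiers, not their actual API. A consumer that subscribes to invalidation drops its cached PFN when the owner tears the mapping down, while a consumer that only caches the lookup result keeps pointing at memory the owner considers gone.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical consumer that caches a translation (think: an IOMMU entry). */
struct toy_consumer {
	uint64_t cached_pfn;
	bool valid; /* true while the consumer still uses cached_pfn */
};

typedef void (*toy_invalidate_fn)(struct toy_consumer *c);

/* Hypothetical owner of the mapping, with room for one subscriber. */
struct toy_mapping {
	uint64_t pfn;
	bool live;                     /* false once the owner tore it down */
	struct toy_consumer *listener; /* NULL if nobody subscribed */
	toy_invalidate_fn invalidate;
};

/* Invalidation callback: stop using the cached translation. */
static void toy_drop_translation(struct toy_consumer *c)
{
	c->valid = false;
}

/* Look up the PFN and optionally subscribe to teardown notifications. */
static void toy_lookup(struct toy_mapping *m, struct toy_consumer *c,
		       bool subscribe)
{
	c->cached_pfn = m->pfn;
	c->valid = true;
	if (subscribe) {
		m->listener = c;
		m->invalidate = toy_drop_translation;
	}
}

/* Owner tears the mapping down and notifies the subscriber, if any. */
static void toy_teardown(struct toy_mapping *m)
{
	m->live = false;
	if (m->listener)
		m->invalidate(m->listener);
}

/* A translation that is still in use after teardown is the bug. */
static bool toy_is_stale(const struct toy_mapping *m,
			 const struct toy_consumer *c)
{
	return c->valid && !m->live;
}
```

After `toy_teardown()`, `toy_is_stale()` reports false for the subscribed consumer and true for the one that merely cached the PFN, which is the access-after-free pattern described above.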

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 2ada8e6cdb88..1e43581f95ea 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -362,9 +362,9 @@  static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
 	vma = find_vma_intersection(mm, vaddr, vaddr + 1);
 
 	if (vma && vma->vm_flags & VM_PFNMAP) {
-		*pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
-		if (is_invalid_reserved_pfn(*pfn))
-			ret = 0;
+		ret = follow_pfn(vma, vaddr, pfn);
+		if (!ret && !is_invalid_reserved_pfn(*pfn))
+			ret = -EOPNOTSUPP;
 	}
 
 	up_read(&mm->mmap_sem);
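
The return-value logic of the patched branch can be checked in isolation with a small user-space model. This is a sketch under stated assumptions: `struct toy_lookup` and `pfnmap_result()` are hypothetical, and the two fields simply stand in for the results of follow_pfn() and is_invalid_reserved_pfn(), neither of which is called here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inputs standing in for the two kernel helpers' results. */
struct toy_lookup {
	int follow_ret;           /* models the return value of follow_pfn() */
	uint64_t pfn;             /* PFN the walk would report on success */
	bool invalid_or_reserved; /* models is_invalid_reserved_pfn(pfn) */
};

/* Mirrors the three added lines of the hunk for a VM_PFNMAP VMA. */
static int pfnmap_result(const struct toy_lookup *l, uint64_t *pfn)
{
	int ret = l->follow_ret;

	if (!ret)
		*pfn = l->pfn; /* the walk only fills in the PFN on success */
	if (!ret && !l->invalid_or_reserved)
		ret = -EOPNOTSUPP; /* ordinary, page-backed memory is refused */
	return ret;
}
```

In this model a successful walk to an invalid or reserved PFN returns 0, a successful walk to an ordinary PFN returns -EOPNOTSUPP, and a failed walk passes its error code through unchanged.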
