
[1/3] virt: acrn: stop using follow_pfn

Message ID: 20240324234542.2038726-2-hch@lst.de (mailing list archive)
State: New
Series: [1/3] virt: acrn: stop using follow_pfn

Commit Message

Christoph Hellwig March 24, 2024, 11:45 p.m. UTC
Switch from follow_pfn to follow_pte so that we can get rid of
follow_pfn.  Note that this doesn't fix any of the pre-existing
raciness and lack of permission checking in the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/virt/acrn/mm.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
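
For context on the diff below: follow_pte() returns with the target PTE mapped and its page-table spinlock held, so the caller has to read the PFN and release that lock itself before dropping the mmap lock. A minimal sketch of the resulting pattern, assuming the mm-based follow_pte(mm, addr, &ptep, &ptl) signature this patch relies on; the helper name and its shape are illustrative, not part of the patch:

#include <linux/mm.h>
#include <linux/pgtable.h>

/* Illustrative helper only; the patch open-codes this in acrn_vm_ram_map(). */
static int lookup_pfn_locked(struct vm_area_struct *vma, unsigned long addr,
			     unsigned long *pfn)
{
	spinlock_t *ptl;
	pte_t *ptep;
	int ret;

	/* The caller is expected to hold mmap_read_lock(vma->vm_mm). */
	ret = follow_pte(vma->vm_mm, addr, &ptep, &ptl);
	if (ret)
		return ret;

	/* Read the PFN while the PTE cannot change under us ... */
	*pfn = pte_pfn(ptep_get(ptep));
	/* ... then drop the page-table lock taken by follow_pte(). */
	pte_unmap_unlock(ptep, ptl);
	return 0;
}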

Comments

David Hildenbrand March 25, 2024, 10:33 a.m. UTC | #1
On 25.03.24 00:45, Christoph Hellwig wrote:
> Switch from follow_pfn to follow_pte so that we can get rid of
> follow_pfn.  Note that this doesn't fix any of the pre-existing
> raciness and lack of permission checking in the code.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/virt/acrn/mm.c | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/virt/acrn/mm.c b/drivers/virt/acrn/mm.c
> index fa5d9ca6be5706..69c3f619f88196 100644
> --- a/drivers/virt/acrn/mm.c
> +++ b/drivers/virt/acrn/mm.c
> @@ -171,18 +171,24 @@ int acrn_vm_ram_map(struct acrn_vm *vm, struct acrn_vm_memmap *memmap)
>   	mmap_read_lock(current->mm);
>   	vma = vma_lookup(current->mm, memmap->vma_base);
>   	if (vma && ((vma->vm_flags & VM_PFNMAP) != 0)) {
> +		spinlock_t *ptl;
> +		pte_t *ptep;
> +
>   		if ((memmap->vma_base + memmap->len) > vma->vm_end) {
>   			mmap_read_unlock(current->mm);
>   			return -EINVAL;
>   		}
>   
> -		ret = follow_pfn(vma, memmap->vma_base, &pfn);
> -		mmap_read_unlock(current->mm);
> +		ret = follow_pte(vma->vm_mm, memmap->vma_base, &ptep, &ptl);
>   		if (ret < 0) {
> +			mmap_read_unlock(current->mm);
>   			dev_dbg(acrn_dev.this_device,
>   				"Failed to lookup PFN at VMA:%pK.\n", (void *)memmap->vma_base);
>   			return ret;
>   		}
> +		pfn = pte_pfn(ptep_get(ptep));
> +		pte_unmap_unlock(ptep, ptl);
> +		mmap_read_unlock(current->mm);
>   
>   		return acrn_mm_region_add(vm, memmap->user_vm_pa,
>   			 PFN_PHYS(pfn), memmap->len,

... I have similar patches lying around here (see below). I added some
actual access permission checks.

(I also realized that if we get an anon folio in a COW mapping via follow_pte()
here, I suspect one might be able to do some nasty things. Just imagine if we
munmap(), free the anon folio, and then it gets used in another context ... At
least KVM/vfio handle that using references+MMU notifiers.)

Reviewed-by: David Hildenbrand <david@redhat.com>


commit 812e577dea97327bcc68d34504e7387dff2ffd8f
Author: David Hildenbrand <david@redhat.com>
Date:   Fri Mar 8 13:53:04 2024 +0100

     virt/acrn/mm: use follow_pte() instead of follow_pfn()
     
     follow_pfn() should not be used. Instead, use follow_pte() and do some
     best-guess PTE permission checks.
     
     Should we simply always require pte_write()? Maybe. Performing no
     checks clearly looks wrong, and pin_user_pages_fast() is unconditionally
     called with FOLL_WRITE.
     
     Signed-off-by: David Hildenbrand <david@redhat.com>

diff --git a/drivers/virt/acrn/mm.c b/drivers/virt/acrn/mm.c
index fa5d9ca6be57..563c545adb2c 100644
--- a/drivers/virt/acrn/mm.c
+++ b/drivers/virt/acrn/mm.c
@@ -171,12 +171,22 @@ int acrn_vm_ram_map(struct acrn_vm *vm, struct acrn_vm_memmap *memmap)
         mmap_read_lock(current->mm);
         vma = vma_lookup(current->mm, memmap->vma_base);
         if (vma && ((vma->vm_flags & VM_PFNMAP) != 0)) {
+               spinlock_t *ptl;
+               pte_t *ptep;
+
                 if ((memmap->vma_base + memmap->len) > vma->vm_end) {
                         mmap_read_unlock(current->mm);
                         return -EINVAL;
                 }
  
-               ret = follow_pfn(vma, memmap->vma_base, &pfn);
+               ret = follow_pte(vma, memmap->vma_base, &ptep, &ptl);
+               if (!ret) {
+                       pfn = pte_pfn(ptep_get(ptep));
+                       if (!pte_write(ptep_get(ptep)) &&
+                           (memmap->attr & ACRN_MEM_ACCESS_WRITE))
+                               ret = -EFAULT;
+                       pte_unmap_unlock(ptep, ptl);
+               }
                 mmap_read_unlock(current->mm);
                 if (ret < 0) {
Christoph Hellwig March 26, 2024, 6:04 a.m. UTC | #2
On Mon, Mar 25, 2024 at 11:33:31AM +0100, David Hildenbrand wrote:
> ... I have similar patches lying around here (see below). I added some
> actual access permission checks.
>
> (I also realized that if we get an anon folio in a COW mapping via follow_pte()
> here, I suspect one might be able to do some nasty things. Just imagine if we
> munmap(), free the anon folio, and then it gets used in another context ... At
> least KVM/vfio handle that using references+MMU notifiers.)

How about you just send out your series, which seems to go further, and
I retract mine?
David Hildenbrand March 26, 2024, 5:06 p.m. UTC | #3
On 26.03.24 07:04, Christoph Hellwig wrote:
> On Mon, Mar 25, 2024 at 11:33:31AM +0100, David Hildenbrand wrote:
>> ... I have similar patches lying around here (see below). I added some
>> actual access permission checks.
>>
>> (I also realized that if we get an anon folio in a COW mapping via follow_pte()
>> here, I suspect one might be able to do some nasty things. Just imagine if we
>> munmap(), free the anon folio, and then it gets used in another context ... At
>> least KVM/vfio handle that using references+MMU notifiers.)
> 
> How about you just send out your series, which seems to go further, and
> I retract mine?

Let's go with yours first and I'll rebase.

Regarding the above issue, I still have not made up my mind: likely we
should reject any PFN in acrn that is backed by a valid "struct page"
that does not have PG_reserved set. That's what VFIO effectively does IIRC.
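
Concretely, the check described above could look like the sketch below: accept a PFN only if it has no struct page at all (a raw PFNMAP/MMIO range) or if its page is marked PG_reserved and thus can never be freed and reused behind acrn's back. The helper name is hypothetical and this is not part of either posted patch:

#include <linux/mm.h>
#include <linux/page-flags.h>

/*
 * Hypothetical sketch of the idea above, not part of either posted patch:
 * reject PFNs backed by a normal, non-PG_reserved struct page, since such
 * pages can be freed on munmap() and reused elsewhere while acrn still
 * maps them into the guest.
 */
static bool acrn_pfn_safe_to_map(unsigned long pfn)
{
	if (!pfn_valid(pfn))
		return true;	/* no struct page: pure PFNMAP/MMIO range */
	return PageReserved(pfn_to_page(pfn));
}

acrn_vm_ram_map() would then bail out (e.g. with -EINVAL) when this returns false, mirroring the behaviour David attributes to VFIO.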

Patch

diff --git a/drivers/virt/acrn/mm.c b/drivers/virt/acrn/mm.c
index fa5d9ca6be5706..69c3f619f88196 100644
--- a/drivers/virt/acrn/mm.c
+++ b/drivers/virt/acrn/mm.c
@@ -171,18 +171,24 @@  int acrn_vm_ram_map(struct acrn_vm *vm, struct acrn_vm_memmap *memmap)
 	mmap_read_lock(current->mm);
 	vma = vma_lookup(current->mm, memmap->vma_base);
 	if (vma && ((vma->vm_flags & VM_PFNMAP) != 0)) {
+		spinlock_t *ptl;
+		pte_t *ptep;
+
 		if ((memmap->vma_base + memmap->len) > vma->vm_end) {
 			mmap_read_unlock(current->mm);
 			return -EINVAL;
 		}
 
-		ret = follow_pfn(vma, memmap->vma_base, &pfn);
-		mmap_read_unlock(current->mm);
+		ret = follow_pte(vma->vm_mm, memmap->vma_base, &ptep, &ptl);
 		if (ret < 0) {
+			mmap_read_unlock(current->mm);
 			dev_dbg(acrn_dev.this_device,
 				"Failed to lookup PFN at VMA:%pK.\n", (void *)memmap->vma_base);
 			return ret;
 		}
+		pfn = pte_pfn(ptep_get(ptep));
+		pte_unmap_unlock(ptep, ptl);
+		mmap_read_unlock(current->mm);
 
 		return acrn_mm_region_add(vm, memmap->user_vm_pa,
 			 PFN_PHYS(pfn), memmap->len,