
[v5,17/21] mm/mmap: Relocate arch_unmap() to vms_complete_munmap_vmas()

Message ID 20240717200709.1552558-18-Liam.Howlett@oracle.com (mailing list archive)
State New
Series Avoid MAP_FIXED gap exposure

Commit Message

Liam R. Howlett July 17, 2024, 8:07 p.m. UTC
From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>

The arch_unmap call was previously moved above the rbtree modifications
in commit 5a28fc94c914 ("x86/mpx, mm/core: Fix recursive munmap()
corruption").  The move was motivated by an issue with calling
arch_unmap() after the rbtree was modified.

Since the above commit, mpx was dropped from the kernel in 45fc24e89b7c
("x86/mpx: remove MPX from arch/x86"), so the motivation for calling
arch_unmap() prior to modifying the vma tree no longer exists
(regardless of rbtree or maple tree implementations).

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: LEROY Christophe <christophe.leroy2@cs-soprasteria.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Dave Hansen <dave.hansen@intel.com>
---
 mm/mmap.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)
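
For context: with mpx gone, the only remaining non-stub arch_unmap() at this
point is powerpc's, which merely forgets the cached VDSO pointer when the
unmapped range covers it. A rough sketch of that implementation (illustrative,
not verbatim from the tree):

	/* Sketch: powerpc drops its cached VDSO pointer when the VDSO is unmapped. */
	static inline void arch_unmap(struct mm_struct *mm,
				      unsigned long start, unsigned long end)
	{
		unsigned long vdso_base = (unsigned long)mm->context.vdso;

		if (start <= vdso_base && vdso_base < end)
			mm->context.vdso = NULL;
	}

Nothing here touches the VMA tree, which is why the call can now safely run
after the tree has been modified.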

Comments

Lorenzo Stoakes July 22, 2024, 2:25 p.m. UTC | #1
On Wed, Jul 17, 2024 at 04:07:05PM GMT, Liam R. Howlett wrote:
> From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
>
> The arch_unmap call was previously moved above the rbtree modifications
> in commit 5a28fc94c914 ("x86/mpx, mm/core: Fix recursive munmap()
> corruption").  The move was motivated by an issue with calling
> arch_unmap() after the rbtree was modified.
>
> Since the above commit, mpx was dropped from the kernel in 45fc24e89b7c
> ("x86/mpx: remove MPX from arch/x86"), so the motivation for calling
> arch_unmap() prior to modifying the vma tree no longer exists
> (regardless of rbtree or maple tree implementations).
>
> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> Cc: LEROY Christophe <christophe.leroy2@cs-soprasteria.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: Dave Hansen <dave.hansen@intel.com>
> ---
>  mm/mmap.c | 12 ++----------
>  1 file changed, 2 insertions(+), 10 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9f870e715a47..117e8240f697 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2680,6 +2680,7 @@ static void vms_complete_munmap_vmas(struct vma_munmap_struct *vms,
>  	mm = vms->mm;
>  	mm->map_count -= vms->vma_count;
>  	mm->locked_vm -= vms->locked_vm;
> +	arch_unmap(mm, vms->start, vms->end); /* write lock needed */

Worth having a mmap_assert_write_locked() here? Would make this
self-documenting also.
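
For illustration, a sketch of what that suggestion would look like in the hunk
above, using the existing mmap_assert_write_locked() helper (not part of the
posted patch):

	mm = vms->mm;
	mm->map_count -= vms->vma_count;
	mm->locked_vm -= vms->locked_vm;
	mmap_assert_write_locked(mm);		/* assert instead of the comment */
	arch_unmap(mm, vms->start, vms->end);
	if (vms->unlock)
		mmap_write_downgrade(mm);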

>  	if (vms->unlock)
>  		mmap_write_downgrade(mm);
>
> @@ -2907,7 +2908,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>   *
>   * This function takes a @mas that is either pointing to the previous VMA or set
>   * to MA_START and sets it up to remove the mapping(s).  The @len will be
> - * aligned and any arch_unmap work will be preformed.
> + * aligned prior to munmap.
>   *
>   * Return: 0 on success and drops the lock if so directed, error and leaves the
>   * lock held otherwise.
> @@ -2927,16 +2928,12 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
>  		return -EINVAL;
>
>  	/*
> -	 * Check if memory is sealed before arch_unmap.
>  	 * Prevent unmapping a sealed VMA.
>  	 * can_modify_mm assumes we have acquired the lock on MM.
>  	 */
>  	if (unlikely(!can_modify_mm(mm, start, end)))
>  		return -EPERM;
>
> -	 /* arch_unmap() might do unmaps itself.  */
> -	arch_unmap(mm, start, end);
> -
>  	/* Find the first overlapping VMA */
>  	vma = vma_find(vmi, end);
>  	if (!vma) {
> @@ -2997,9 +2994,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  	if (unlikely(!can_modify_mm(mm, addr, end)))
>  		return -EPERM;
>
> -	 /* arch_unmap() might do unmaps itself.  */
> -	arch_unmap(mm, addr, end);
> -

It seems to me that the intent of this particular invocation was to ensure
we have done what we can to unmap before trying to unmap ourselves.

However this seems stupid to me anyway - 'hey maybe the arch will do this
for us' - yeah probably not.

So this should definitely go regardless, given we will invoke it later now
anyway.

>  	/* Find the first overlapping VMA */
>  	vma = vma_find(&vmi, end);
>  	init_vma_munmap(&vms, &vmi, vma, addr, end, uf, /* unlock = */ false);
> @@ -3377,14 +3371,12 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  	struct mm_struct *mm = vma->vm_mm;
>
>  	/*
> -	 * Check if memory is sealed before arch_unmap.
>  	 * Prevent unmapping a sealed VMA.
>  	 * can_modify_mm assumes we have acquired the lock on MM.
>  	 */
>  	if (unlikely(!can_modify_mm(mm, start, end)))
>  		return -EPERM;
>
> -	arch_unmap(mm, start, end);
>  	return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
>  }
>
> --
> 2.43.0
>

I hope we can find a way to eliminate these kinds of hooks altogether, as
they reduce our control over how VMA operations are performed.

LGTM,

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Liam R. Howlett July 23, 2024, 2:11 p.m. UTC | #2
* Lorenzo Stoakes <lorenzo.stoakes@oracle.com> [240722 10:25]:
> On Wed, Jul 17, 2024 at 04:07:05PM GMT, Liam R. Howlett wrote:
> > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
> >
> > The arch_unmap call was previously moved above the rbtree modifications
> > in commit 5a28fc94c914 ("x86/mpx, mm/core: Fix recursive munmap()
> > corruption").  The move was motivated by an issue with calling
> > arch_unmap() after the rbtree was modified.
> >
> > Since the above commit, mpx was dropped from the kernel in 45fc24e89b7c
> > ("x86/mpx: remove MPX from arch/x86"), so the motivation for calling
> > arch_unmap() prior to modifying the vma tree no longer exists
> > (regardless of rbtree or maple tree implementations).
> >
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > Cc: LEROY Christophe <christophe.leroy2@cs-soprasteria.com>
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Cc: Dave Hansen <dave.hansen@intel.com>
> > ---
> >  mm/mmap.c | 12 ++----------
> >  1 file changed, 2 insertions(+), 10 deletions(-)
> >
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 9f870e715a47..117e8240f697 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -2680,6 +2680,7 @@ static void vms_complete_munmap_vmas(struct vma_munmap_struct *vms,
> >  	mm = vms->mm;
> >  	mm->map_count -= vms->vma_count;
> >  	mm->locked_vm -= vms->locked_vm;
> > +	arch_unmap(mm, vms->start, vms->end); /* write lock needed */
> 
> Worth having a mmap_assert_write_locked() here? Would make this
> self-documenting also.

No, this is just to point out that it cannot be moved any lower in this
function.
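
The constraint is the downgrade that immediately follows; a rough sketch of
the resulting ordering in vms_complete_munmap_vmas() (per the hunk above):

	arch_unmap(mm, vms->start, vms->end);	/* needs the mmap write lock */
	if (vms->unlock)
		mmap_write_downgrade(mm);	/* write lock becomes a read lock */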

> 
> >  	if (vms->unlock)
> >  		mmap_write_downgrade(mm);
> >
> > @@ -2907,7 +2908,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >   *
> >   * This function takes a @mas that is either pointing to the previous VMA or set
> >   * to MA_START and sets it up to remove the mapping(s).  The @len will be
> > - * aligned and any arch_unmap work will be preformed.
> > + * aligned prior to munmap.
> >   *
> >   * Return: 0 on success and drops the lock if so directed, error and leaves the
> >   * lock held otherwise.
> > @@ -2927,16 +2928,12 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
> >  		return -EINVAL;
> >
> >  	/*
> > -	 * Check if memory is sealed before arch_unmap.
> >  	 * Prevent unmapping a sealed VMA.
> >  	 * can_modify_mm assumes we have acquired the lock on MM.
> >  	 */
> >  	if (unlikely(!can_modify_mm(mm, start, end)))
> >  		return -EPERM;
> >
> > -	 /* arch_unmap() might do unmaps itself.  */
> > -	arch_unmap(mm, start, end);
> > -
> >  	/* Find the first overlapping VMA */
> >  	vma = vma_find(vmi, end);
> >  	if (!vma) {
> > @@ -2997,9 +2994,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> >  	if (unlikely(!can_modify_mm(mm, addr, end)))
> >  		return -EPERM;
> >
> > -	 /* arch_unmap() might do unmaps itself.  */
> > -	arch_unmap(mm, addr, end);
> > -
> 
> It seems to me that the intent of this particular invocation was to ensure
> we have done what we can to unmap before trying to unmap ourselves.
> 
> However this seems stupid to me anyway - 'hey maybe the arch will do this
> for us' - yeah probably not.
> 
> So this should definitely go regardless, given we will invoke it later now
> anyway.

This was covered in the commit message: it was there because we needed to
remove the VMAs earlier for a dead feature (mpx).

> 
> >  	/* Find the first overlapping VMA */
> >  	vma = vma_find(&vmi, end);
> >  	init_vma_munmap(&vms, &vmi, vma, addr, end, uf, /* unlock = */ false);
> > @@ -3377,14 +3371,12 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >  	struct mm_struct *mm = vma->vm_mm;
> >
> >  	/*
> > -	 * Check if memory is sealed before arch_unmap.
> >  	 * Prevent unmapping a sealed VMA.
> >  	 * can_modify_mm assumes we have acquired the lock on MM.
> >  	 */
> >  	if (unlikely(!can_modify_mm(mm, start, end)))
> >  		return -EPERM;
> >
> > -	arch_unmap(mm, start, end);
> >  	return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> >  }
> >
> > --
> > 2.43.0
> >
> 
> I hope we can find a way to eliminate these kinds of hooks altogether, as
> they reduce our control over how VMA operations are performed.

Agreed.  I see a path forward on doing just that.

> 
> LGTM,
> 
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

Patch

diff --git a/mm/mmap.c b/mm/mmap.c
index 9f870e715a47..117e8240f697 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2680,6 +2680,7 @@  static void vms_complete_munmap_vmas(struct vma_munmap_struct *vms,
 	mm = vms->mm;
 	mm->map_count -= vms->vma_count;
 	mm->locked_vm -= vms->locked_vm;
+	arch_unmap(mm, vms->start, vms->end); /* write lock needed */
 	if (vms->unlock)
 		mmap_write_downgrade(mm);
 
@@ -2907,7 +2908,7 @@  do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
  *
  * This function takes a @mas that is either pointing to the previous VMA or set
  * to MA_START and sets it up to remove the mapping(s).  The @len will be
- * aligned and any arch_unmap work will be preformed.
+ * aligned prior to munmap.
  *
  * Return: 0 on success and drops the lock if so directed, error and leaves the
  * lock held otherwise.
@@ -2927,16 +2928,12 @@  int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
 		return -EINVAL;
 
 	/*
-	 * Check if memory is sealed before arch_unmap.
 	 * Prevent unmapping a sealed VMA.
 	 * can_modify_mm assumes we have acquired the lock on MM.
 	 */
 	if (unlikely(!can_modify_mm(mm, start, end)))
 		return -EPERM;
 
-	 /* arch_unmap() might do unmaps itself.  */
-	arch_unmap(mm, start, end);
-
 	/* Find the first overlapping VMA */
 	vma = vma_find(vmi, end);
 	if (!vma) {
@@ -2997,9 +2994,6 @@  unsigned long mmap_region(struct file *file, unsigned long addr,
 	if (unlikely(!can_modify_mm(mm, addr, end)))
 		return -EPERM;
 
-	 /* arch_unmap() might do unmaps itself.  */
-	arch_unmap(mm, addr, end);
-
 	/* Find the first overlapping VMA */
 	vma = vma_find(&vmi, end);
 	init_vma_munmap(&vms, &vmi, vma, addr, end, uf, /* unlock = */ false);
@@ -3377,14 +3371,12 @@  int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	struct mm_struct *mm = vma->vm_mm;
 
 	/*
-	 * Check if memory is sealed before arch_unmap.
 	 * Prevent unmapping a sealed VMA.
 	 * can_modify_mm assumes we have acquired the lock on MM.
 	 */
 	if (unlikely(!can_modify_mm(mm, start, end)))
 		return -EPERM;
 
-	arch_unmap(mm, start, end);
 	return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
 }