mm: Always downgrade mmap_lock if requested

Message ID 20230629191414.1215929-1-willy@infradead.org (mailing list archive)
State New

Commit Message

Matthew Wilcox June 29, 2023, 7:14 p.m. UTC
Now that stack growth must always hold the mmap_lock for write, we can
always downgrade the mmap_lock to read and safely unmap pages from the
page table, even if we're next to a stack.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/mmap.c | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

Comments

Linus Torvalds June 29, 2023, 8 p.m. UTC | #1
On Thu, 29 Jun 2023 at 12:14, Matthew Wilcox (Oracle)
<willy@infradead.org> wrote:
>
> Now that stack growth must always hold the mmap_lock for write, we can
> always downgrade the mmap_lock to read and safely unmap pages from the
> page table, even if we're next to a stack.

Can we please also fix the really odd return value semantics?

Right now that function returns either an error, meaning that it
didn't downgrade, or 0/1 on success to show "did I downgrade as you
asked me to?".

That is *really* confusing, but it was needed in that bad old world order.

But now that the downgrade is not a "try to downgrade if you can", but
something reliable, can we please just make the success case be 0, and
make the callers all know that on success, it was downgraded?

And yes, I realize that that means do_vmi_munmap() also has to be
changed. The documentation for that function is horrid, btw, in that
it says

 * Returns: -EINVAL on failure, 1 on success and unlock, 0 otherwise.

which is just not true. It can return other errors than -EINVAL
(through exactly that do_vmi_align_munmap() function), and the "1 on
success and unlock" is not true, it's a "success and downgrade if you
asked me to".

So I think all of those callers should also be changed to "if you
asked for a downgrade, and do_vmi_munmap() returned success, then you
got a downgrade".

Then some of the callers of *that* can be simplified too.

Please?

                 Linus
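
A minimal userspace sketch of the return convention requested above. It is
not kernel code: the model_* names, the lock-state enum and the "fail" knob
are all hypothetical and exist only to illustrate "0 means success, and on
success the lock was downgraded if you asked for it".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state of mmap_lock; not a kernel type. */
enum model_lock { MODEL_UNLOCKED, MODEL_READ, MODEL_WRITE };

struct model_mm {
	enum model_lock lock;
};

/*
 * Models the requested convention: negative errno on failure (lock still
 * held for write), 0 on success (lock downgraded to read if asked).
 * "fail" is an illustrative knob that forces the error path.
 */
static int model_munmap(struct model_mm *mm, bool fail, bool downgrade)
{
	assert(mm->lock == MODEL_WRITE);	/* caller must hold it for write */
	if (fail)
		return -ENOMEM;
	if (downgrade)
		mm->lock = MODEL_READ;
	return 0;
}

/*
 * A caller that asked for a downgrade: success alone tells it which
 * unlock to use, so there is no separate 0-versus-1 case to check.
 */
static int model_caller(struct model_mm *mm, bool fail)
{
	int ret;

	mm->lock = MODEL_WRITE;			/* take the lock for write */
	ret = model_munmap(mm, fail, true);
	if (ret == 0)
		assert(mm->lock == MODEL_READ);	/* would read-unlock here */
	else
		assert(mm->lock == MODEL_WRITE);	/* would write-unlock here */
	mm->lock = MODEL_UNLOCKED;
	return ret;
}
```

With this shape, a caller only has to distinguish "error" from "success";
under the old 0/1 scheme it also had to check whether the downgrade it asked
for had actually happened.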
Patch

diff --git a/mm/mmap.c b/mm/mmap.c
index 9b5188b65800..82efaca58ca2 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2550,19 +2550,8 @@  do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
 
 	mm->locked_vm -= locked_vm;
 	mm->map_count -= count;
-	/*
-	 * Do not downgrade mmap_lock if we are next to VM_GROWSDOWN or
-	 * VM_GROWSUP VMA. Such VMAs can change their size under
-	 * down_read(mmap_lock) and collide with the VMA we are about to unmap.
-	 */
-	if (downgrade) {
-		if (next && (next->vm_flags & VM_GROWSDOWN))
-			downgrade = false;
-		else if (prev && (prev->vm_flags & VM_GROWSUP))
-			downgrade = false;
-		else
-			mmap_write_downgrade(mm);
-	}
+	if (downgrade)
+		mmap_write_downgrade(mm);
 
 	/*
 	 * We can free page tables without write-locking mmap_lock because VMAs