mbox series

[v2,0/7] Make core VMA operations internal and testable

Message ID cover.1720121068.git.lorenzo.stoakes@oracle.com (mailing list archive)
Headers show
Series Make core VMA operations internal and testable | expand

Message

Lorenzo Stoakes July 4, 2024, 7:27 p.m. UTC
There are a number of "core" VMA manipulation functions implemented in
mm/mmap.c, notably those concerning VMA merging, splitting, modifying,
expanding and shrinking, which logically don't belong there.

More importantly this functionality represents an internal implementation
detail of memory management and should not be exposed outside of mm/
itself.

This patch series isolates core VMA manipulation functionality into its own
file, mm/vma.c, and provides an API to the rest of the mm code in mm/vma.h.

Importantly, it also carefully implements mm/vma_internal.h, which
specifies which headers need to be imported by vma.c, leading to the very
useful property that vma.c depends only on mm/vma.h and mm/vma_internal.h.

This means we can then re-implement vma_internal.h in userland, adding
shims for kernel mechanisms as required, allowing us to unit test internal
VMA functionality.

This testing is useful as opposed to an e.g. kunit implementation as this
way we can avoid all external kernel side-effects while testing, run tests
VERY quickly, and iterate on and debug problems quickly.

Excitingly this opens the door to, in the future, recreating precise
problems observed in production in userland and very quickly debugging
problems that might otherwise be very difficult to reproduce.

This patch series takes advantage of existing shim logic and full userland
maple tree support contained in tools/testing/radix-tree/ and
tools/include/linux/, separating out shared components of the radix tree
implementation to provide this testing.

Kernel functionality is stubbed and shimmed as needed in tools/testing/vma/
which contains a fully functional userland vma_internal.h file and which
imports mm/vma.c and mm/vma.h to be directly tested from userland.

A simple, skeleton testing implementation is provided in
tools/testing/vma/vma.c as a proof-of-concept, asserting that simple VMA
merge, modify (testing split), expand and shrink functionality work
correctly.

v2:
* NOMMU fixup in mm/vma.h.
* Fixup minor incorrect header edits and remove accidentally included empty
  test file, and incorrect license header.
* Remove generated/autoconf.h file from tools/testing/vma/ and create
  directory if doesn't already exist.
* Have vma binary return an error code if any tests fail.

v1:
* Fix test_simple_modify() to specify correct prev.
* Improve vma test Makefile so it picks up dependency changes correctly.
* Rename relocate_vma() to relocate_vma_down().
* Remove shift_arg_pages() and invoked relocate_vma_down() directly from
  setup_arg_pages().
* MAINTAINERS fixups.
https://lore.kernel.org/all/cover.1720006125.git.lorenzo.stoakes@oracle.com/

RFC v2:
* Reword commit messages.
* Replace vma_expand() / vma_shrink() wrappers with relocate_vma().
* Make move_page_tables() internal too.
* Have internal.h import vma.h.
* Use header guards to more cleanly implement userland testing code.
* Rename main.c to vma.c.
* Update mm/vma_internal.h to have fewer superfluous comments.
* Rework testing logic so we count test failures, and output test results.
* Correct some SPDX license prefixes.
* Make VM_xxx_ON() debug asserts forward to xxx_ON() macros.
* Update VMA tests to correctly free memory, and re-enable ASAN leak
  detection.
https://lore.kernel.org/all/cover.1719584707.git.lstoakes@gmail.com/

RFC v1:
https://lore.kernel.org/all/cover.1719481836.git.lstoakes@gmail.com/


Lorenzo Stoakes (7):
  userfaultfd: move core VMA manipulation logic to mm/userfaultfd.c
  mm: move vma_modify() and helpers to internal header
  mm: move vma_shrink(), vma_expand() to internal header
  mm: move internal core VMA manipulation functions to own file
  MAINTAINERS: Add entry for new VMA files
  tools: separate out shared radix-tree components
  tools: add skeleton code for userland testing of VMA logic

 MAINTAINERS                                   |   14 +
 fs/exec.c                                     |   81 +-
 fs/userfaultfd.c                              |  160 +-
 include/linux/mm.h                            |  112 +-
 include/linux/userfaultfd_k.h                 |   19 +
 mm/Makefile                                   |    2 +-
 mm/internal.h                                 |  167 +-
 mm/mmap.c                                     | 2069 ++---------------
 mm/mmu_notifier.c                             |    2 +
 mm/userfaultfd.c                              |  168 ++
 mm/vma.c                                      | 1766 ++++++++++++++
 mm/vma.h                                      |  364 +++
 mm/vma_internal.h                             |   52 +
 tools/testing/radix-tree/Makefile             |   68 +-
 tools/testing/radix-tree/maple.c              |   14 +-
 tools/testing/radix-tree/xarray.c             |    9 +-
 tools/testing/shared/autoconf.h               |    2 +
 tools/testing/{radix-tree => shared}/bitmap.c |    0
 tools/testing/{radix-tree => shared}/linux.c  |    0
 .../{radix-tree => shared}/linux/bug.h        |    0
 .../{radix-tree => shared}/linux/cpu.h        |    0
 .../{radix-tree => shared}/linux/idr.h        |    0
 .../{radix-tree => shared}/linux/init.h       |    0
 .../{radix-tree => shared}/linux/kconfig.h    |    0
 .../{radix-tree => shared}/linux/kernel.h     |    0
 .../{radix-tree => shared}/linux/kmemleak.h   |    0
 .../{radix-tree => shared}/linux/local_lock.h |    0
 .../{radix-tree => shared}/linux/lockdep.h    |    0
 .../{radix-tree => shared}/linux/maple_tree.h |    0
 .../{radix-tree => shared}/linux/percpu.h     |    0
 .../{radix-tree => shared}/linux/preempt.h    |    0
 .../{radix-tree => shared}/linux/radix-tree.h |    0
 .../{radix-tree => shared}/linux/rcupdate.h   |    0
 .../{radix-tree => shared}/linux/xarray.h     |    0
 tools/testing/shared/maple-shared.h           |    9 +
 tools/testing/shared/maple-shim.c             |    7 +
 tools/testing/shared/shared.h                 |   34 +
 tools/testing/shared/shared.mk                |   71 +
 .../testing/shared/trace/events/maple_tree.h  |    5 +
 tools/testing/shared/xarray-shared.c          |    5 +
 tools/testing/shared/xarray-shared.h          |    4 +
 tools/testing/vma/.gitignore                  |    7 +
 tools/testing/vma/Makefile                    |   16 +
 tools/testing/vma/linux/atomic.h              |   12 +
 tools/testing/vma/linux/mmzone.h              |   38 +
 tools/testing/vma/vma.c                       |  207 ++
 tools/testing/vma/vma_internal.h              |  882 +++++++
 47 files changed, 3915 insertions(+), 2451 deletions(-)
 create mode 100644 mm/vma.c
 create mode 100644 mm/vma.h
 create mode 100644 mm/vma_internal.h
 create mode 100644 tools/testing/shared/autoconf.h
 rename tools/testing/{radix-tree => shared}/bitmap.c (100%)
 rename tools/testing/{radix-tree => shared}/linux.c (100%)
 rename tools/testing/{radix-tree => shared}/linux/bug.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/cpu.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/idr.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/init.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/kconfig.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/kernel.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/kmemleak.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/local_lock.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/lockdep.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/maple_tree.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/percpu.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/preempt.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/radix-tree.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/rcupdate.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/xarray.h (100%)
 create mode 100644 tools/testing/shared/maple-shared.h
 create mode 100644 tools/testing/shared/maple-shim.c
 create mode 100644 tools/testing/shared/shared.h
 create mode 100644 tools/testing/shared/shared.mk
 create mode 100644 tools/testing/shared/trace/events/maple_tree.h
 create mode 100644 tools/testing/shared/xarray-shared.c
 create mode 100644 tools/testing/shared/xarray-shared.h
 create mode 100644 tools/testing/vma/.gitignore
 create mode 100644 tools/testing/vma/Makefile
 create mode 100644 tools/testing/vma/linux/atomic.h
 create mode 100644 tools/testing/vma/linux/mmzone.h
 create mode 100644 tools/testing/vma/vma.c
 create mode 100644 tools/testing/vma/vma_internal.h

--
2.45.2

Comments

Lorenzo Stoakes July 10, 2024, 7:32 p.m. UTC | #1
On Thu, Jul 04, 2024 at 08:27:55PM GMT, Lorenzo Stoakes wrote:
> There are a number of "core" VMA manipulation functions implemented in
> mm/mmap.c, notably those concerning VMA merging, splitting, modifying,
> expanding and shrinking, which logically don't belong there.
[snip]

Hi Andrew,

Wondering if we're good to look at this going to mm-unstable? As this has
had time to settle, and received R-b tags from Liam and Vlasta.

It'd be good to get it in, as it's kind of inviting merge conflicts
otherwise and be good to get some certainty as to ordering for instance
vs. Liam's upcoming MAP_FIXED series.

Also I have some further work I'd like to build on this :>)

Thanks, Lorenzo
Andrew Morton July 11, 2024, 2:54 a.m. UTC | #2
On Wed, 10 Jul 2024 20:32:05 +0100 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:

> On Thu, Jul 04, 2024 at 08:27:55PM GMT, Lorenzo Stoakes wrote:
> > There are a number of "core" VMA manipulation functions implemented in
> > mm/mmap.c, notably those concerning VMA merging, splitting, modifying,
> > expanding and shrinking, which logically don't belong there.
> [snip]
> 
> Hi Andrew,
> 
> Wondering if we're good to look at this going to mm-unstable? As this has
> had time to settle, and received R-b tags from Liam and Vlasta.
> 
> It'd be good to get it in, as it's kind of inviting merge conflicts
> otherwise and be good to get some certainty as to ordering for instance
> vs. Liam's upcoming MAP_FIXED series.
> 
> Also I have some further work I'd like to build on this :>)

It's really big and it's quite new and it's really late.  I think it best to await the
next -rc cycle, see how much grief it all causes.
Liam R. Howlett July 11, 2024, 6 p.m. UTC | #3
* Andrew Morton <akpm@linux-foundation.org> [240710 22:54]:
> On Wed, 10 Jul 2024 20:32:05 +0100 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> 
> > On Thu, Jul 04, 2024 at 08:27:55PM GMT, Lorenzo Stoakes wrote:
> > > There are a number of "core" VMA manipulation functions implemented in
> > > mm/mmap.c, notably those concerning VMA merging, splitting, modifying,
> > > expanding and shrinking, which logically don't belong there.
> > [snip]
> > 
> > Hi Andrew,
> > 
> > Wondering if we're good to look at this going to mm-unstable? As this has
> > had time to settle, and received R-b tags from Liam and Vlasta.
> > 
> > It'd be good to get it in, as it's kind of inviting merge conflicts
> > otherwise and be good to get some certainty as to ordering for instance
> > vs. Liam's upcoming MAP_FIXED series.
> > 
> > Also I have some further work I'd like to build on this :>)
> 
> It's really big and it's quite new and it's really late.  I think it best to await the
> next -rc cycle, see how much grief it all causes.

Yes, this patch set is huge.

It is, however, extremely necessary to get to the point where we can
test things better than full system tests (and then wait for a distro to
rebuild all their rpms and find a missed issue).  I know a lot of people
would rather see everything in a kunit test, but the reality is that, at
this level in the kernel, we cannot test as well as we can with the
userspace approach.

With the scope of the change, it will be a lot of work to develop in
parallel and rebase on top of the moving of this code.  I'm wondering if
you can provide some more information on your plan?  Will this be the
first series in your mm-unstable branch after the release? iow, should I
be developing on top of the code moving around for my future work?  I am
happy enough to rebase my in-flight MAP_FIXED patches on top of this set
if that helps things along.

Thanks,
Liam
Lorenzo Stoakes July 19, 2024, 10:52 a.m. UTC | #4
On Thu, Jul 11, 2024 at 02:00:15PM GMT, Liam R. Howlett wrote:
> * Andrew Morton <akpm@linux-foundation.org> [240710 22:54]:
> > On Wed, 10 Jul 2024 20:32:05 +0100 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> >
> > > On Thu, Jul 04, 2024 at 08:27:55PM GMT, Lorenzo Stoakes wrote:
> > > > There are a number of "core" VMA manipulation functions implemented in
> > > > mm/mmap.c, notably those concerning VMA merging, splitting, modifying,
> > > > expanding and shrinking, which logically don't belong there.
> > > [snip]
> > >
> > > Hi Andrew,
> > >
> > > Wondering if we're good to look at this going to mm-unstable? As this has
> > > had time to settle, and received R-b tags from Liam and Vlasta.
> > >
> > > It'd be good to get it in, as it's kind of inviting merge conflicts
> > > otherwise and be good to get some certainty as to ordering for instance
> > > vs. Liam's upcoming MAP_FIXED series.
> > >
> > > Also I have some further work I'd like to build on this :>)
> >
> > It's really big and it's quite new and it's really late.  I think it best to await the
> > next -rc cycle, see how much grief it all causes.
>
> Yes, this patch set is huge.
>
> It is, however, extremely necessary to get to the point where we can
> test things better than full system tests (and then wait for a distro to
> rebuild all their rpms and find a missed issue).  I know a lot of people
> would rather see everything in a kunit test, but the reality is that, at
> this level in the kernel, we cannot test as well as we can with the
> userspace approach.
>
> With the scope of the change, it will be a lot of work to develop in
> parallel and rebase on top of the moving of this code.  I'm wondering if
> you can provide some more information on your plan?  Will this be the
> first series in your mm-unstable branch after the release? iow, should I
> be developing on top of the code moving around for my future work?  I am
> happy enough to rebase my in-flight MAP_FIXED patches on top of this set
> if that helps things along.
>
> Thanks,
> Liam

Thanks Liam!

I think best way forward unless you feel we should take a different
approach Andrew is for me to simply wait until the end of the merge window
and at the start of the week after rebase on 6.11-rc1 and do a resend?
Andrew Morton July 19, 2024, 8:49 p.m. UTC | #5
On Fri, 19 Jul 2024 11:52:00 +0100 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:

> > > It's really big and it's quite new and it's really late.  I think it best to await the
> > > next -rc cycle, see how much grief it all causes.
> >
> > Yes, this patch set is huge.
> >
> > It is, however, extremely necessary to get to the point where we can
> > test things better than full system tests (and then wait for a distro to
> > rebuild all their rpms and find a missed issue).  I know a lot of people
> > would rather see everything in a kunit test, but the reality is that, at
> > this level in the kernel, we cannot test as well as we can with the
> > userspace approach.
> >
> > With the scope of the change, it will be a lot of work to develop in
> > parallel and rebase on top of the moving of this code.  I'm wondering if
> > you can provide some more information on your plan?  Will this be the
> > first series in your mm-unstable branch after the release? iow, should I
> > be developing on top of the code moving around for my future work?  I am
> > happy enough to rebase my in-flight MAP_FIXED patches on top of this set
> > if that helps things along.
> >
> > Thanks,
> > Liam
> 
> Thanks Liam!
> 
> I think best way forward unless you feel we should take a different
> approach Andrew is for me to simply wait until the end of the merge window
> and at the start of the week after rebase on 6.11-rc1 and do a resend?

How about you send out a version early next week and I'll aim to send
it in to Linus late in this merge window?