diff mbox series

mm: add comments to do_mmap(), mmap_region() and vm_mmap()

Message ID 20241212113152.28849-1-lorenzo.stoakes@oracle.com (mailing list archive)
State New
Headers show
Series mm: add comments to do_mmap(), mmap_region() and vm_mmap() | expand

Commit Message

Lorenzo Stoakes Dec. 12, 2024, 11:31 a.m. UTC
It isn't always entirely clear to users the difference between do_mmap(),
mmap_region() and vm_mmap(), so add comments to clarify what's going on in
each.

This is compounded by the fact that we actually allow callers external to
mm to invoke both do_mmap() and mmap_region() (!), the latter of which is
really strictly speaking an internal memory mapping implementation detail.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 mm/mmap.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 mm/util.c | 17 ++++++++++++
 2 files changed, 95 insertions(+), 1 deletion(-)

Comments

Andrew Morton Dec. 13, 2024, 2:36 a.m. UTC | #1
On Thu, 12 Dec 2024 11:31:52 +0000 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:

> It isn't always entirely clear to users the difference between do_mmap(),
> mmap_region() and vm_mmap(), so add comments to clarify what's going on in
> each.
> 
> This is compounded by the fact that we actually allow callers external to
> mm to invoke both do_mmap() and mmap_region() (!), the latter of which is
> really strictly speaking an internal memory mapping implementation detail.

Thanks, the world just improved.

These functions have pretty dumb names.  Patches to give them more
useful names would be small...
Lorenzo Stoakes Dec. 13, 2024, 9:22 a.m. UTC | #2
On Thu, Dec 12, 2024 at 06:36:04PM -0800, Andrew Morton wrote:
> On Thu, 12 Dec 2024 11:31:52 +0000 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
>
> > It isn't always entirely clear to users the difference between do_mmap(),
> > mmap_region() and vm_mmap(), so add comments to clarify what's going on in
> > each.
> >
> > This is compounded by the fact that we actually allow callers external to
> > mm to invoke both do_mmap() and mmap_region() (!), the latter of which is
> > really strictly speaking an internal memory mapping implementation detail.
>
> Thanks, the world just improved.

Thanks :)

This was one I promised somebody a while ago (sorry to whoever it was - I did go
look but couldn't find my mail...) - wrote it on my TODO - and so since I'm
winding down for the holidays and like to keep my promises thought it was about
time I tackled it :>)

>
> These functions have pretty dumb names.  Patches to give them more
> useful names would be small...
>

You're right, they are pretty incredibly dumb. Let me have a think on what would
make more sense here...
diff mbox series

Patch

diff --git a/mm/mmap.c b/mm/mmap.c
index df9154b15ef9..61260369f034 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -277,8 +277,62 @@  static inline bool file_mmap_ok(struct file *file, struct inode *inode,
 	return true;
 }
 
-/*
+/**
+ * do_mmap() - Perform a userland memory mapping into the current process
+ * address space of length @len with protection bits @prot, mmap flags @flags
+ * (from which VMA flags will be inferred), and any additional VMA flags to
+ * apply @vm_flags. If this is a file-backed mapping then the file is specified
+ * in @file and page offset into the file via @pgoff.
+ *
+ * This function does not perform security checks on the file and assumes, if
+ * @uf is non-NULL, the caller has provided a list head to track unmap events
+ * for userfaultfd @uf.
+ *
+ * It also simply indicates whether memory population is required by setting
+ * @populate, which must be non-NULL, expecting the caller to actually perform
+ * this task itself if appropriate.
+ *
+ * This function will invoke architecture-specific (and if provided and
+ * relevant, file system-specific) logic to determine the most appropriate
+ * unmapped area in which to place the mapping if not MAP_FIXED.
+ *
+ * Callers which require userland mmap() behaviour should invoke vm_mmap(),
+ * which is also exported for module use.
+ *
+ * Those which require this behaviour less security checks, userfaultfd and
+ * populate behaviour, and who handle the mmap write lock themselves, should
+ * call this function.
+ *
+ * Note that the returned address may reside within a merged VMA if an
+ * appropriate merge were to take place, so it doesn't necessarily specify the
+ * start of a VMA, rather only the start of a valid mapped range of length
+ * @len bytes, rounded down to the nearest page size.
+ *
  * The caller must write-lock current->mm->mmap_lock.
+ *
+ * @file: An optional struct file pointer describing the file which is to be
+ * mapped, if a file-backed mapping.
+ * @addr: If non-zero, hints at (or if @flags has MAP_FIXED set, specifies) the
+ * address at which to perform this mapping. See mmap (2) for details. Must be
+ * page-aligned.
+ * @len: The length of the mapping. Will be page-aligned and must be at least 1
+ * page in size.
+ * @prot: Protection bits describing access required to the mapping. See mmap
+ * (2) for details.
+ * @flags: Flags specifying how the mapping should be performed, see mmap (2)
+ * for details.
+ * @vm_flags: VMA flags which should be set by default, or 0 otherwise.
+ * @pgoff: Page offset into the @file if file-backed, should be 0 otherwise.
+ * @populate: A pointer to a value which will be set to 0 if no population of
+ * the range is required, or the number of bytes to populate if it is. Must be
+ * non-NULL. See mmap (2) for details as to under what circumstances population
+ * of the range occurs.
+ * @uf: An optional pointer to a list head to track userfaultfd unmap events
+ * should unmapping events arise. If provided, it is up to the caller to manage
+ * this.
+ *
+ * Returns: Either an error, or the address at which the requested mapping has
+ * been performed.
  */
 unsigned long do_mmap(struct file *file, unsigned long addr,
 			unsigned long len, unsigned long prot,
@@ -1016,6 +1070,29 @@  int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
 	return do_vmi_munmap(&vmi, mm, start, len, uf, false);
 }
 
+/**
+ * mmap_region() - Actually perform the userland mapping of a VMA into
+ * current->mm with known, aligned and overflow-checked @addr and @len, and
+ * correctly determined VMA flags @vm_flags and page offset @pgoff.
+ *
+ * This is an internal memory management function, and should not be used
+ * directly.
+ *
+ * The caller must write-lock current->mm->mmap_lock.
+ *
+ * @file: If a file-backed mapping, a pointer to the struct file describing the
+ * file to be mapped, otherwise NULL.
+ * @addr: The page-aligned address at which to perform the mapping.
+ * @len: The page-aligned, non-zero, length of the mapping.
+ * @vm_flags: The VMA flags which should be applied to the mapping.
+ * @pgoff: If @file is specified, the page offset into the file, if not then
+ * the virtual page offset in memory of the anonymous mapping.
+ * @uf: Optionally, a pointer to a list head used for tracking userfaultfd unmap
+ * events.
+ *
+ * Returns: Either an error, or the address at which the requested mapping has
+ * been performed.
+ */
 unsigned long mmap_region(struct file *file, unsigned long addr,
 			  unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
 			  struct list_head *uf)
diff --git a/mm/util.c b/mm/util.c
index c1c3b06ab4f9..b7dc6fabaae5 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -587,6 +587,23 @@  unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
 	return ret;
 }
 
+/*
+ * Perform a userland memory mapping into the current process address space. See
+ * the comment for do_mmap() for more details on this operation in general.
+ *
+ * This differs from do_mmap() in that:
+ *
+ * a. An offset parameter is provided rather than pgoff, which is both checked
+ *    for overflow and page alignment.
+ * b. mmap locking is performed on the caller's behalf.
+ * c. Userfaultfd unmap events and memory population are handled.
+ *
+ * This means that this function performs essentially the same work as if
+ * userland were invoking mmap (2).
+ *
+ * Returns either an error, or the address at which the requested mapping has
+ * been performed.
+ */
 unsigned long vm_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot,
 	unsigned long flag, unsigned long offset)