[RFC,3/7] mm: unexport vma_expand() / vma_shrink()

Message ID 8c548bb3d0286bfaef2cd5e67d7bf698967a52a1.1719481836.git.lstoakes@gmail.com (mailing list archive)
State New
Series Make core VMA operations internal and testable

Commit Message

Lorenzo Stoakes June 27, 2024, 10:39 a.m. UTC
The vma_expand() and vma_shrink() functions are core VMA manipulation
functions which ultimately invoke VMA split/merge. In order to make these
testable, it is convenient to place all such core functions in a header
internal to mm/.

In addition, it is safer to abstract direct access to such functionality so
we can better control how other parts of the kernel use them, which
provides us the freedom to change how this functionality behaves as needed
without having to worry about how this functionality is used elsewhere.

In order to service both these requirements, we provide abstractions for
the sole external user of these functions, shift_arg_pages() in fs/exec.c.

We provide vma_expand_bottom() and vma_shrink_top() functions which better
match the semantics of what shift_arg_pages() is trying to accomplish by
explicitly wrapping the safe expansion of the bottom of a VMA and the
shrinking of the top of a VMA.

As a result, we place the vma_shrink() and vma_expand() functions into
mm/internal.h to unexport them from use by any other part of the kernel.
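
In outline (abbreviated from the diff below, with error handling and the
page table work elided), shift_arg_pages() then becomes:

	/* Grow the VMA downwards by 'shift' bytes: [new_start, old_end). */
	ret = vma_expand_bottom(&vmi, vma, shift, &next);
	if (ret)
		return ret;

	/* ... move the page tables down and free the now-unused range ... */

	/* Trim the stale top so the VMA is exactly [new_start, new_end). */
	return vma_shrink_top(&vmi, vma, shift);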

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
---
 fs/exec.c          | 26 +++++--------------
 include/linux/mm.h |  9 +++----
 mm/internal.h      |  6 +++++
 mm/mmap.c          | 65 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 82 insertions(+), 24 deletions(-)

Comments

Liam R. Howlett June 27, 2024, 5:45 p.m. UTC | #1
* Lorenzo Stoakes <lstoakes@gmail.com> [240627 06:39]:
> The vma_expand() and vma_shrink() functions are core VMA manipulation
> functions which ultimately invoke VMA split/merge. In order to make these
> testable, it is convenient to place all such core functions in a header
> internal to mm/.
> 

The sole user doesn't cause a split or merge; it relocates a vma by
'sliding' its window, expanding and then shrinking, with the page table
move happening in the middle of the slide.

The slide relocates the vma start/end while keeping the vma pointer
constant.

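Roughly, the slide looks like this (a simplified sketch of what
shift_arg_pages() does today; error handling, TLB flushing and the
freeing of the old page table range are omitted):

	/* 1. Grow downwards so the vma covers [old_start - shift, old_end). */
	vma_expand(&vmi, vma, old_start - shift, old_end, vma->vm_pgoff, NULL);

	/* 2. Move the page tables down into the newly covered range. */
	move_page_tables(vma, old_start, vma, old_start - shift,
			 old_end - old_start, false);

	/* 3. Trim the top so the vma is exactly the shifted window. */
	vma_shrink(&vmi, vma, old_start - shift, old_end - shift,
		   vma->vm_pgoff);
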
> In addition, it is safer to abstract direct access to such functionality so
> we can better control how other parts of the kernel use them, which
> provides us the freedom to change how this functionality behaves as needed
> without having to worry about how this functionality is used elsewhere.
> 
> In order to service both these requirements, we provide abstractions for
> the sole external user of these functions, shift_arg_pages() in fs/exec.c.
> 
> We provide vma_expand_bottom() and vma_shrink_top() functions which better
> match the semantics of what shift_arg_pages() is trying to accomplish by
> explicitly wrapping the safe expansion of the bottom of a VMA and the
> shrinking of the top of a VMA.
> 
> As a result, we place the vma_shrink() and vma_expand() functions into
> mm/internal.h to unexport them from use by any other part of the kernel.

There is no point in having a wrapper for vma_shrink() since this is the
only place it's ever used.  So we're wrapping a function that's only
called once.

I'd rather a vma_relocate() do everything in this function than wrap
them.  The only other thing it does is the page table moving and freeing
- which we have to do in the vma code.  We'd expose something we want no
one to use - but we already have two of those here...

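Something like the following, purely to illustrate the shape (the name
and signature are hypothetical, nothing is settled):

	/*
	 * Hypothetical: slide 'vma' down by 'shift' bytes in place --
	 * expand the bottom, move the page tables, free the old range,
	 * then shrink the top -- keeping the vma pointer constant.
	 */
	int vma_relocate(struct vma_iterator *vmi, struct vm_area_struct *vma,
			 unsigned long shift);
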
> 
> Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> ---
>  fs/exec.c          | 26 +++++--------------
>  include/linux/mm.h |  9 +++----
>  mm/internal.h      |  6 +++++
>  mm/mmap.c          | 65 ++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 82 insertions(+), 24 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 40073142288f..1cb3bf323e0f 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -700,25 +700,14 @@ static int shift_arg_pages(struct vm_area_struct *vma, unsigned long shift)
>  	unsigned long length = old_end - old_start;
>  	unsigned long new_start = old_start - shift;
>  	unsigned long new_end = old_end - shift;
> -	VMA_ITERATOR(vmi, mm, new_start);
> +	VMA_ITERATOR(vmi, mm, 0);
>  	struct vm_area_struct *next;
>  	struct mmu_gather tlb;
> +	int ret;
>  
> -	BUG_ON(new_start > new_end);
> -
> -	/*
> -	 * ensure there are no vmas between where we want to go
> -	 * and where we are
> -	 */
> -	if (vma != vma_next(&vmi))
> -		return -EFAULT;
> -
> -	vma_iter_prev_range(&vmi);
> -	/*
> -	 * cover the whole range: [new_start, old_end)
> -	 */
> -	if (vma_expand(&vmi, vma, new_start, old_end, vma->vm_pgoff, NULL))
> -		return -ENOMEM;
> +	ret = vma_expand_bottom(&vmi, vma, shift, &next);
> +	if (ret)
> +		return ret;
>  
>  	/*
>  	 * move the page tables downwards, on failure we rely on
> @@ -730,7 +719,7 @@ static int shift_arg_pages(struct vm_area_struct *vma, unsigned long shift)
>  
>  	lru_add_drain();
>  	tlb_gather_mmu(&tlb, mm);
> -	next = vma_next(&vmi);
> +
>  	if (new_end > old_start) {
>  		/*
>  		 * when the old and new regions overlap clear from new_end.
> @@ -749,9 +738,8 @@ static int shift_arg_pages(struct vm_area_struct *vma, unsigned long shift)
>  	}
>  	tlb_finish_mmu(&tlb);
>  
> -	vma_prev(&vmi);
>  	/* Shrink the vma to just the new range */
> -	return vma_shrink(&vmi, vma, new_start, new_end, vma->vm_pgoff);
> +	return vma_shrink_top(&vmi, vma, shift);
>  }
>  
>  /*
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 4d2b5538925b..e3220439cf75 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3273,11 +3273,10 @@ void anon_vma_interval_tree_verify(struct anon_vma_chain *node);
>  
>  /* mmap.c */
>  extern int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin);
> -extern int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
> -		      unsigned long start, unsigned long end, pgoff_t pgoff,
> -		      struct vm_area_struct *next);
> -extern int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
> -		       unsigned long start, unsigned long end, pgoff_t pgoff);
> +extern int vma_expand_bottom(struct vma_iterator *vmi, struct vm_area_struct *vma,
> +			     unsigned long shift, struct vm_area_struct **next);
> +extern int vma_shrink_top(struct vma_iterator *vmi, struct vm_area_struct *vma,
> +			  unsigned long shift);
>  extern struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *);
>  extern int insert_vm_struct(struct mm_struct *, struct vm_area_struct *);
>  extern void unlink_file_vma(struct vm_area_struct *);
> diff --git a/mm/internal.h b/mm/internal.h
> index c8177200c943..f7779727bb78 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1305,6 +1305,12 @@ static inline struct vm_area_struct
>  			  vma_policy(vma), new_ctx, anon_vma_name(vma));
>  }
>  
> +int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
> +	       unsigned long start, unsigned long end, pgoff_t pgoff,
> +		      struct vm_area_struct *next);
> +int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
> +	       unsigned long start, unsigned long end, pgoff_t pgoff);
> +
>  enum {
>  	/* mark page accessed */
>  	FOLL_TOUCH = 1 << 16,
> diff --git a/mm/mmap.c b/mm/mmap.c
> index e42d89f98071..574e69a04ebe 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -3940,6 +3940,71 @@ void mm_drop_all_locks(struct mm_struct *mm)
>  	mutex_unlock(&mm_all_locks_mutex);
>  }
>  
> +/*
> + * vma_expand_bottom() - Expands the bottom of a VMA downwards. An error will
> + *                       arise if there is another VMA in the expanded range, or
> + *                       if the expansion fails. This function leaves the VMA
> + *                       iterator, vmi, positioned at the newly expanded VMA.
> + * @vmi: The VMA iterator.
> + * @vma: The VMA to modify.
> + * @shift: The number of bytes by which to expand the bottom of the VMA.
> + * @next: Output parameter, pointing at the VMA immediately succeeding the newly
> + *        expanded VMA.
> + *
> + * Returns: 0 on success, an error code otherwise.
> + */
> +int vma_expand_bottom(struct vma_iterator *vmi, struct vm_area_struct *vma,
> +		      unsigned long shift, struct vm_area_struct **next)
> +{
> +	unsigned long old_start = vma->vm_start;
> +	unsigned long old_end = vma->vm_end;
> +	unsigned long new_start = old_start - shift;
> +	unsigned long new_end = old_end - shift;
> +
> +	BUG_ON(new_start > new_end);
> +
> +	vma_iter_set(vmi, new_start);
> +
> +	/*
> +	 * ensure there are no vmas between where we want to go
> +	 * and where we are
> +	 */
> +	if (vma != vma_next(vmi))
> +		return -EFAULT;
> +
> +	vma_iter_prev_range(vmi);
> +
> +	/*
> +	 * cover the whole range: [new_start, old_end)
> +	 */
> +	if (vma_expand(vmi, vma, new_start, old_end, vma->vm_pgoff, NULL))
> +		return -ENOMEM;
> +
> +	*next = vma_next(vmi);
> +	vma_prev(vmi);
> +
> +	return 0;
> +}
> +
> +/*
> + * vma_shrink_top() - Reduce an existing VMA's memory area by shift bytes from
> + *                    the top of the VMA.
> + * @vmi: The VMA iterator, must be positioned at the VMA.
> + * @vma: The VMA to modify.
> + * @shift: The number of bytes by which to shrink the VMA.
> + *
> + * Returns: 0 on success, an error code otherwise.
> + */
> +int vma_shrink_top(struct vma_iterator *vmi, struct vm_area_struct *vma,
> +		   unsigned long shift)
> +{
> +	if (shift >= vma->vm_end - vma->vm_start)
> +		return -EINVAL;
> +
> +	return vma_shrink(vmi, vma, vma->vm_start, vma->vm_end - shift,
> +			  vma->vm_pgoff);
> +}
> +
>  /*
>   * initialise the percpu counter for VM
>   */
> -- 
> 2.45.1
>
Lorenzo Stoakes June 27, 2024, 7:38 p.m. UTC | #2
On Thu, Jun 27, 2024 at 01:45:34PM -0400, Liam R. Howlett wrote:
> * Lorenzo Stoakes <lstoakes@gmail.com> [240627 06:39]:
> > The vma_expand() and vma_shrink() functions are core VMA manipulation
> > functions which ultimately invoke VMA split/merge. In order to make these
> > testable, it is convenient to place all such core functions in a header
> > internal to mm/.
> >
>
> The sole user doesn't cause a split or merge; it relocates a vma by
> 'sliding' its window, expanding and then shrinking, with the page table
> move happening in the middle of the slide.
>
> The slide relocates the vma start/end while keeping the vma pointer
> constant.

Yeah sorry, I actually don't know why I said this (I did say 'ultimately'
again as well!); as you say, and as I was in fact aware, this doesn't
invoke split/merge. I'll put this down to being tired when I wrote it :)

Will fix.

>
> > In addition, it is safer to abstract direct access to such functionality so
> > we can better control how other parts of the kernel use them, which
> > provides us the freedom to change how this functionality behaves as needed
> > without having to worry about how this functionality is used elsewhere.
> >
> > In order to service both these requirements, we provide abstractions for
> > the sole external user of these functions, shift_arg_pages() in fs/exec.c.
> >
> > We provide vma_expand_bottom() and vma_shrink_top() functions which better
> > match the semantics of what shift_arg_pages() is trying to accomplish by
> > explicitly wrapping the safe expansion of the bottom of a VMA and the
> > shrinking of the top of a VMA.
> >
> > As a result, we place the vma_shrink() and vma_expand() functions into
> > mm/internal.h to unexport them from use by any other part of the kernel.
>
> There is no point in having a wrapper for vma_shrink() since this is the
> only place it's ever used.  So we're wrapping a function that's only
> called once.

Yeah, that was a sketchy part of this change. I feel the vma_expand() case
is a lot more defensible; the vma_shrink() one, well, I expected I might
get some feedback on that anyway :)

This was obviously an attempt to abstract these away from fs/ in some
vaguely sensible fashion while retaining functionality.

>
> I'd rather a vma_relocate() do everything in this function than wrap
> them.  The only other thing it does is the page table moving and freeing
> - which we have to do in the vma code.  We'd expose something we want no
> one to use - but we already have two of those here...

Right, I think I was trying to avoid _the whole thing_ as it's so specific
and not so nice to make available, but at the same time, it is perhaps the
only reasonable way forward that avoids the vma_shrink() micro-wrapper.

So yeah, will rework with a vma_relocate() or similar. As you say, we can't
really get away from exposing something nasty here.

>
> >
> > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> > ---
> >  fs/exec.c          | 26 +++++--------------
> >  include/linux/mm.h |  9 +++----
> >  mm/internal.h      |  6 +++++
> >  mm/mmap.c          | 65 ++++++++++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 82 insertions(+), 24 deletions(-)
> >
> > diff --git a/fs/exec.c b/fs/exec.c
> > index 40073142288f..1cb3bf323e0f 100644
> > --- a/fs/exec.c
> > +++ b/fs/exec.c
> > @@ -700,25 +700,14 @@ static int shift_arg_pages(struct vm_area_struct *vma, unsigned long shift)
> >  	unsigned long length = old_end - old_start;
> >  	unsigned long new_start = old_start - shift;
> >  	unsigned long new_end = old_end - shift;
> > -	VMA_ITERATOR(vmi, mm, new_start);
> > +	VMA_ITERATOR(vmi, mm, 0);
> >  	struct vm_area_struct *next;
> >  	struct mmu_gather tlb;
> > +	int ret;
> >
> > -	BUG_ON(new_start > new_end);
> > -
> > -	/*
> > -	 * ensure there are no vmas between where we want to go
> > -	 * and where we are
> > -	 */
> > -	if (vma != vma_next(&vmi))
> > -		return -EFAULT;
> > -
> > -	vma_iter_prev_range(&vmi);
> > -	/*
> > -	 * cover the whole range: [new_start, old_end)
> > -	 */
> > -	if (vma_expand(&vmi, vma, new_start, old_end, vma->vm_pgoff, NULL))
> > -		return -ENOMEM;
> > +	ret = vma_expand_bottom(&vmi, vma, shift, &next);
> > +	if (ret)
> > +		return ret;
> >
> >  	/*
> >  	 * move the page tables downwards, on failure we rely on
> > @@ -730,7 +719,7 @@ static int shift_arg_pages(struct vm_area_struct *vma, unsigned long shift)
> >
> >  	lru_add_drain();
> >  	tlb_gather_mmu(&tlb, mm);
> > -	next = vma_next(&vmi);
> > +
> >  	if (new_end > old_start) {
> >  		/*
> >  		 * when the old and new regions overlap clear from new_end.
> > @@ -749,9 +738,8 @@ static int shift_arg_pages(struct vm_area_struct *vma, unsigned long shift)
> >  	}
> >  	tlb_finish_mmu(&tlb);
> >
> > -	vma_prev(&vmi);
> >  	/* Shrink the vma to just the new range */
> > -	return vma_shrink(&vmi, vma, new_start, new_end, vma->vm_pgoff);
> > +	return vma_shrink_top(&vmi, vma, shift);
> >  }
> >
> >  /*
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 4d2b5538925b..e3220439cf75 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -3273,11 +3273,10 @@ void anon_vma_interval_tree_verify(struct anon_vma_chain *node);
> >
> >  /* mmap.c */
> >  extern int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin);
> > -extern int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > -		      unsigned long start, unsigned long end, pgoff_t pgoff,
> > -		      struct vm_area_struct *next);
> > -extern int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > -		       unsigned long start, unsigned long end, pgoff_t pgoff);
> > +extern int vma_expand_bottom(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > +			     unsigned long shift, struct vm_area_struct **next);
> > +extern int vma_shrink_top(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > +			  unsigned long shift);
> >  extern struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *);
> >  extern int insert_vm_struct(struct mm_struct *, struct vm_area_struct *);
> >  extern void unlink_file_vma(struct vm_area_struct *);
> > diff --git a/mm/internal.h b/mm/internal.h
> > index c8177200c943..f7779727bb78 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> > @@ -1305,6 +1305,12 @@ static inline struct vm_area_struct
> >  			  vma_policy(vma), new_ctx, anon_vma_name(vma));
> >  }
> >
> > +int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > +	       unsigned long start, unsigned long end, pgoff_t pgoff,
> > +		      struct vm_area_struct *next);
> > +int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > +	       unsigned long start, unsigned long end, pgoff_t pgoff);
> > +
> >  enum {
> >  	/* mark page accessed */
> >  	FOLL_TOUCH = 1 << 16,
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index e42d89f98071..574e69a04ebe 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -3940,6 +3940,71 @@ void mm_drop_all_locks(struct mm_struct *mm)
> >  	mutex_unlock(&mm_all_locks_mutex);
> >  }
> >
> > +/*
> > + * vma_expand_bottom() - Expands the bottom of a VMA downwards. An error will
> > + *                       arise if there is another VMA in the expanded range, or
> > + *                       if the expansion fails. This function leaves the VMA
> > + *                       iterator, vmi, positioned at the newly expanded VMA.
> > + * @vmi: The VMA iterator.
> > + * @vma: The VMA to modify.
> > + * @shift: The number of bytes by which to expand the bottom of the VMA.
> > + * @next: Output parameter, pointing at the VMA immediately succeeding the newly
> > + *        expanded VMA.
> > + *
> > + * Returns: 0 on success, an error code otherwise.
> > + */
> > +int vma_expand_bottom(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > +		      unsigned long shift, struct vm_area_struct **next)
> > +{
> > +	unsigned long old_start = vma->vm_start;
> > +	unsigned long old_end = vma->vm_end;
> > +	unsigned long new_start = old_start - shift;
> > +	unsigned long new_end = old_end - shift;
> > +
> > +	BUG_ON(new_start > new_end);
> > +
> > +	vma_iter_set(vmi, new_start);
> > +
> > +	/*
> > +	 * ensure there are no vmas between where we want to go
> > +	 * and where we are
> > +	 */
> > +	if (vma != vma_next(vmi))
> > +		return -EFAULT;
> > +
> > +	vma_iter_prev_range(vmi);
> > +
> > +	/*
> > +	 * cover the whole range: [new_start, old_end)
> > +	 */
> > +	if (vma_expand(vmi, vma, new_start, old_end, vma->vm_pgoff, NULL))
> > +		return -ENOMEM;
> > +
> > +	*next = vma_next(vmi);
> > +	vma_prev(vmi);
> > +
> > +	return 0;
> > +}
> > +
> > +/*
> > + * vma_shrink_top() - Reduce an existing VMA's memory area by shift bytes from
> > + *                    the top of the VMA.
> > + * @vmi: The VMA iterator, must be positioned at the VMA.
> > + * @vma: The VMA to modify.
> > + * @shift: The number of bytes by which to shrink the VMA.
> > + *
> > + * Returns: 0 on success, an error code otherwise.
> > + */
> > +int vma_shrink_top(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > +		   unsigned long shift)
> > +{
> > +	if (shift >= vma->vm_end - vma->vm_start)
> > +		return -EINVAL;
> > +
> > +	return vma_shrink(vmi, vma, vma->vm_start, vma->vm_end - shift,
> > +			  vma->vm_pgoff);
> > +}
> > +
> >  /*
> >   * initialise the percpu counter for VM
> >   */
> > --
> > 2.45.1
> >

Patch

diff --git a/fs/exec.c b/fs/exec.c
index 40073142288f..1cb3bf323e0f 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -700,25 +700,14 @@  static int shift_arg_pages(struct vm_area_struct *vma, unsigned long shift)
 	unsigned long length = old_end - old_start;
 	unsigned long new_start = old_start - shift;
 	unsigned long new_end = old_end - shift;
-	VMA_ITERATOR(vmi, mm, new_start);
+	VMA_ITERATOR(vmi, mm, 0);
 	struct vm_area_struct *next;
 	struct mmu_gather tlb;
+	int ret;
 
-	BUG_ON(new_start > new_end);
-
-	/*
-	 * ensure there are no vmas between where we want to go
-	 * and where we are
-	 */
-	if (vma != vma_next(&vmi))
-		return -EFAULT;
-
-	vma_iter_prev_range(&vmi);
-	/*
-	 * cover the whole range: [new_start, old_end)
-	 */
-	if (vma_expand(&vmi, vma, new_start, old_end, vma->vm_pgoff, NULL))
-		return -ENOMEM;
+	ret = vma_expand_bottom(&vmi, vma, shift, &next);
+	if (ret)
+		return ret;
 
 	/*
 	 * move the page tables downwards, on failure we rely on
@@ -730,7 +719,7 @@  static int shift_arg_pages(struct vm_area_struct *vma, unsigned long shift)
 
 	lru_add_drain();
 	tlb_gather_mmu(&tlb, mm);
-	next = vma_next(&vmi);
+
 	if (new_end > old_start) {
 		/*
 		 * when the old and new regions overlap clear from new_end.
@@ -749,9 +738,8 @@  static int shift_arg_pages(struct vm_area_struct *vma, unsigned long shift)
 	}
 	tlb_finish_mmu(&tlb);
 
-	vma_prev(&vmi);
 	/* Shrink the vma to just the new range */
-	return vma_shrink(&vmi, vma, new_start, new_end, vma->vm_pgoff);
+	return vma_shrink_top(&vmi, vma, shift);
 }
 
 /*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4d2b5538925b..e3220439cf75 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3273,11 +3273,10 @@  void anon_vma_interval_tree_verify(struct anon_vma_chain *node);
 
 /* mmap.c */
 extern int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin);
-extern int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
-		      unsigned long start, unsigned long end, pgoff_t pgoff,
-		      struct vm_area_struct *next);
-extern int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
-		       unsigned long start, unsigned long end, pgoff_t pgoff);
+extern int vma_expand_bottom(struct vma_iterator *vmi, struct vm_area_struct *vma,
+			     unsigned long shift, struct vm_area_struct **next);
+extern int vma_shrink_top(struct vma_iterator *vmi, struct vm_area_struct *vma,
+			  unsigned long shift);
 extern struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *);
 extern int insert_vm_struct(struct mm_struct *, struct vm_area_struct *);
 extern void unlink_file_vma(struct vm_area_struct *);
diff --git a/mm/internal.h b/mm/internal.h
index c8177200c943..f7779727bb78 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1305,6 +1305,12 @@  static inline struct vm_area_struct
 			  vma_policy(vma), new_ctx, anon_vma_name(vma));
 }
 
+int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
+	       unsigned long start, unsigned long end, pgoff_t pgoff,
+		      struct vm_area_struct *next);
+int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
+	       unsigned long start, unsigned long end, pgoff_t pgoff);
+
 enum {
 	/* mark page accessed */
 	FOLL_TOUCH = 1 << 16,
diff --git a/mm/mmap.c b/mm/mmap.c
index e42d89f98071..574e69a04ebe 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3940,6 +3940,71 @@  void mm_drop_all_locks(struct mm_struct *mm)
 	mutex_unlock(&mm_all_locks_mutex);
 }
 
+/*
+ * vma_expand_bottom() - Expands the bottom of a VMA downwards. An error will
+ *                       arise if there is another VMA in the expanded range, or
+ *                       if the expansion fails. This function leaves the VMA
+ *                       iterator, vmi, positioned at the newly expanded VMA.
+ * @vmi: The VMA iterator.
+ * @vma: The VMA to modify.
+ * @shift: The number of bytes by which to expand the bottom of the VMA.
+ * @next: Output parameter, pointing at the VMA immediately succeeding the newly
+ *        expanded VMA.
+ *
+ * Returns: 0 on success, an error code otherwise.
+ */
+int vma_expand_bottom(struct vma_iterator *vmi, struct vm_area_struct *vma,
+		      unsigned long shift, struct vm_area_struct **next)
+{
+	unsigned long old_start = vma->vm_start;
+	unsigned long old_end = vma->vm_end;
+	unsigned long new_start = old_start - shift;
+	unsigned long new_end = old_end - shift;
+
+	BUG_ON(new_start > new_end);
+
+	vma_iter_set(vmi, new_start);
+
+	/*
+	 * ensure there are no vmas between where we want to go
+	 * and where we are
+	 */
+	if (vma != vma_next(vmi))
+		return -EFAULT;
+
+	vma_iter_prev_range(vmi);
+
+	/*
+	 * cover the whole range: [new_start, old_end)
+	 */
+	if (vma_expand(vmi, vma, new_start, old_end, vma->vm_pgoff, NULL))
+		return -ENOMEM;
+
+	*next = vma_next(vmi);
+	vma_prev(vmi);
+
+	return 0;
+}
+
+/*
+ * vma_shrink_top() - Reduce an existing VMA's memory area by shift bytes from
+ *                    the top of the VMA.
+ * @vmi: The VMA iterator, must be positioned at the VMA.
+ * @vma: The VMA to modify.
+ * @shift: The number of bytes by which to shrink the VMA.
+ *
+ * Returns: 0 on success, an error code otherwise.
+ */
+int vma_shrink_top(struct vma_iterator *vmi, struct vm_area_struct *vma,
+		   unsigned long shift)
+{
+	if (shift >= vma->vm_end - vma->vm_start)
+		return -EINVAL;
+
+	return vma_shrink(vmi, vma, vma->vm_start, vma->vm_end - shift,
+			  vma->vm_pgoff);
+}
+
 /*
  * initialise the percpu counter for VM
  */