diff mbox series

[5/5] mm: completely abstract unnecessary adj_start calculation

Message ID ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes@oracle.com (mailing list archive)
State New
Headers show
Series mm: further simplify VMA merge operation | expand

Commit Message

Lorenzo Stoakes Jan. 27, 2025, 3:50 p.m. UTC
The adj_start calculation has been a constant source of confusion in the
VMA merge code.

There are two cases to consider, one where we adjust the start of the
vmg->middle VMA (i.e. the __VMG_FLAG_ADJUST_MIDDLE_START merge flag is
set), in which case adj_start is calculated as:

(1) adj_start = vmg->end - vmg->middle->vm_start

And the case where we adjust the start of the vmg->next VMA (i.e. the
__VMG_FLAG_ADJUST_NEXT_START merge flag is set), in which case adj_start is
calculated as:

(2) adj_start = -(vmg->middle->vm_end - vmg->end)

We apply (1) as follows:

vmg->middle->vm_start =
	vmg->middle->vm_start + vmg->end - vmg->middle->vm_start

Which simplifies to:

vmg->middle->vm_start = vmg->end

Similarly, we apply (2) as:

vmg->next->vm_start =
	vmg->next->vm_start + -(vmg->middle->vm_end - vmg->end)

Noting that for these VMAs to be mergeable, vmg->middle->vm_end ==
vmg->next->vm_start, this simplifies to:

vmg->next->vm_start =
	vmg->next->vm_start + -(vmg->next->vm_start - vmg->end)

Which simplifies to:

vmg->next->vm_start = vmg->end

Therefore in each case, we simply need to adjust the start of the VMA to
vmg->end (!) and can do away with this adj_start calculation. The only
caveat is that we must ensure we update the vm_pgoff field correctly.

We therefore abstract this entire calculation to a new function
vmg_adjust_set_range() which performs this calculation and sets the
adjusted VMA's new range using the general vma_set_range() function.

We must also update vma_adjust_trans_huge(), which expects the
now-abstracted adj_start value via its adjust_next parameter. It turns out
this is wholly unnecessary.

In vma_adjust_trans_huge() the relevant code is:

	if (adjust_next > 0) {
		struct vm_area_struct *next = find_vma(vma->vm_mm, vma->vm_end);
		unsigned long nstart = next->vm_start;
		nstart += adjust_next;
		split_huge_pmd_if_needed(next, nstart);
	}

The only case where this is relevant is when __VMG_FLAG_ADJUST_MIDDLE_START
is specified (in which case adjust_next would have been positive), i.e. the
one in which the vma specified is vmg->prev, and thus the sought 'next' VMA
would be vmg->middle.

We can therefore eliminate the find_vma() invocation altogether and simply
provide the vmg->middle VMA in this instance, or NULL otherwise.

Again we have an adjust_next offset calculation:

next->vm_start + (vmg->end - vmg->middle->vm_start)

Where next == vmg->middle this simplifies to vmg->end as previously
demonstrated.

Therefore nstart is equal to vmg->end, which is already passed to
vma_adjust_trans_huge() via the 'end' parameter and so this code (rather
delightfully) simplifies to:

	if (next)
		split_huge_pmd_if_needed(next, end);

With these changes in place, it becomes silly for commit_merge() to return
vmg->target, as it is always the same and threaded through vmg, so we
finally change commit_merge() to return an error value once again.

This patch has no change in functional behaviour.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 include/linux/huge_mm.h          |   2 +-
 mm/huge_memory.c                 |  19 ++----
 mm/vma.c                         | 102 +++++++++++++++----------------
 tools/testing/vma/vma_internal.h |   4 +-
 4 files changed, 58 insertions(+), 69 deletions(-)

Comments

kernel test robot Jan. 27, 2025, 6:50 p.m. UTC | #1
Hi Lorenzo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-simplify-vma-merge-structure-and-expand-comments/20250127-235322
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes%40oracle.com
patch subject: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
config: um-randconfig-001-20250128 (https://download.01.org/0day-ci/archive/20250128/202501280232.qYHi6Rkt-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250128/202501280232.qYHi6Rkt-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501280232.qYHi6Rkt-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from include/uapi/linux/posix_types.h:5,
                    from include/uapi/linux/types.h:14,
                    from include/linux/types.h:6,
                    from include/linux/kasan-checks.h:5,
                    from include/asm-generic/rwonce.h:26,
                    from ./arch/x86/include/generated/asm/rwonce.h:1,
                    from include/linux/compiler.h:339,
                    from include/linux/array_size.h:5,
                    from include/linux/kernel.h:16,
                    from include/linux/backing-dev.h:12,
                    from mm/vma_internal.h:12,
                    from mm/vma.c:7:
   mm/vma.c: In function '__split_vma':
>> include/linux/stddef.h:8:14: warning: passing argument 4 of 'vma_adjust_trans_huge' makes integer from pointer without a cast [-Wint-conversion]
       8 | #define NULL ((void *)0)
         |              ^~~~~~~~~~~
         |              |
         |              void *
   mm/vma.c:518:57: note: in expansion of macro 'NULL'
     518 |         vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
         |                                                         ^~~~
   In file included from include/linux/mm.h:1133,
                    from arch/um/include/asm/tlbflush.h:9,
                    from arch/um/include/asm/cacheflush.h:4,
                    from include/linux/cacheflush.h:5,
                    from include/linux/highmem.h:8,
                    from include/linux/bvec.h:10,
                    from include/linux/blk_types.h:10,
                    from include/linux/writeback.h:13,
                    from include/linux/backing-dev.h:16:
   include/linux/huge_mm.h:574:47: note: expected 'long int' but argument is of type 'void *'
     574 |                                          long adjust_next)
         |                                          ~~~~~^~~~~~~~~~~
   mm/vma.c: In function 'commit_merge':
>> mm/vma.c:704:56: warning: passing argument 4 of 'vma_adjust_trans_huge' makes integer from pointer without a cast [-Wint-conversion]
     704 |                               adj_middle ? vmg->middle : NULL);
   include/linux/huge_mm.h:574:47: note: expected 'long int' but argument is of type 'struct vm_area_struct *'
     574 |                                          long adjust_next)
         |                                          ~~~~~^~~~~~~~~~~
   mm/vma.c: In function 'vma_shrink':
>> include/linux/stddef.h:8:14: warning: passing argument 4 of 'vma_adjust_trans_huge' makes integer from pointer without a cast [-Wint-conversion]
       8 | #define NULL ((void *)0)
         |              ^~~~~~~~~~~
         |              |
         |              void *
   mm/vma.c:1141:48: note: in expansion of macro 'NULL'
    1141 |         vma_adjust_trans_huge(vma, start, end, NULL);
         |                                                ^~~~
   include/linux/huge_mm.h:574:47: note: expected 'long int' but argument is of type 'void *'
     574 |                                          long adjust_next)
         |                                          ~~~~~^~~~~~~~~~~


vim +/vma_adjust_trans_huge +8 include/linux/stddef.h

^1da177e4c3f41 Linus Torvalds   2005-04-16  6  
^1da177e4c3f41 Linus Torvalds   2005-04-16  7  #undef NULL
^1da177e4c3f41 Linus Torvalds   2005-04-16 @8  #define NULL ((void *)0)
6e218287432472 Richard Knutsson 2006-09-30  9
Lorenzo Stoakes Jan. 27, 2025, 6:56 p.m. UTC | #2
On Tue, Jan 28, 2025 at 02:50:55AM +0800, kernel test robot wrote:
> Hi Lorenzo,
>
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on akpm-mm/mm-everything]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-simplify-vma-merge-structure-and-expand-comments/20250127-235322
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link:    https://lore.kernel.org/r/ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes%40oracle.com
> patch subject: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
> config: um-randconfig-001-20250128 (https://download.01.org/0day-ci/archive/20250128/202501280232.qYHi6Rkt-lkp@intel.com/config)
> compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250128/202501280232.qYHi6Rkt-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202501280232.qYHi6Rkt-lkp@intel.com/
>
> All warnings (new ones prefixed by >>):
>
>    In file included from include/uapi/linux/posix_types.h:5,
>                     from include/uapi/linux/types.h:14,
>                     from include/linux/types.h:6,
>                     from include/linux/kasan-checks.h:5,
>                     from include/asm-generic/rwonce.h:26,
>                     from ./arch/x86/include/generated/asm/rwonce.h:1,
>                     from include/linux/compiler.h:339,
>                     from include/linux/array_size.h:5,
>                     from include/linux/kernel.h:16,
>                     from include/linux/backing-dev.h:12,
>                     from mm/vma_internal.h:12,
>                     from mm/vma.c:7:
>    mm/vma.c: In function '__split_vma':
> >> include/linux/stddef.h:8:14: warning: passing argument 4 of 'vma_adjust_trans_huge' makes integer from pointer without a cast [-Wint-conversion]
>        8 | #define NULL ((void *)0)
>          |              ^~~~~~~~~~~
>          |              |
>          |              void *
>    mm/vma.c:518:57: note: in expansion of macro 'NULL'
>      518 |         vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
>          |                                                         ^~~~
>    In file included from include/linux/mm.h:1133,
>                     from arch/um/include/asm/tlbflush.h:9,
>                     from arch/um/include/asm/cacheflush.h:4,
>                     from include/linux/cacheflush.h:5,
>                     from include/linux/highmem.h:8,
>                     from include/linux/bvec.h:10,
>                     from include/linux/blk_types.h:10,
>                     from include/linux/writeback.h:13,
>                     from include/linux/backing-dev.h:16:
>    include/linux/huge_mm.h:574:47: note: expected 'long int' but argument is of type 'void *'
>      574 |                                          long adjust_next)
>          |                                          ~~~~~^~~~~~~~~~~

Doh! My bad.

For some reason I didn't change the header, and my local compiles must have
made this a warning only and had no problem with it... That'll teach me for
disabling CONFIG_WERROR :)

I'll fix that.

Thanks!

>    mm/vma.c: In function 'commit_merge':
> >> mm/vma.c:704:56: warning: passing argument 4 of 'vma_adjust_trans_huge' makes integer from pointer without a cast [-Wint-conversion]
>      704 |                               adj_middle ? vmg->middle : NULL);
>    include/linux/huge_mm.h:574:47: note: expected 'long int' but argument is of type 'struct vm_area_struct *'
>      574 |                                          long adjust_next)
>          |                                          ~~~~~^~~~~~~~~~~
>    mm/vma.c: In function 'vma_shrink':
> >> include/linux/stddef.h:8:14: warning: passing argument 4 of 'vma_adjust_trans_huge' makes integer from pointer without a cast [-Wint-conversion]
>        8 | #define NULL ((void *)0)
>          |              ^~~~~~~~~~~
>          |              |
>          |              void *
>    mm/vma.c:1141:48: note: in expansion of macro 'NULL'
>     1141 |         vma_adjust_trans_huge(vma, start, end, NULL);
>          |                                                ^~~~
>    include/linux/huge_mm.h:574:47: note: expected 'long int' but argument is of type 'void *'
>      574 |                                          long adjust_next)
>          |                                          ~~~~~^~~~~~~~~~~
>
>
> vim +/vma_adjust_trans_huge +8 include/linux/stddef.h
>
> ^1da177e4c3f41 Linus Torvalds   2005-04-16  6
> ^1da177e4c3f41 Linus Torvalds   2005-04-16  7  #undef NULL
> ^1da177e4c3f41 Linus Torvalds   2005-04-16 @8  #define NULL ((void *)0)
> 6e218287432472 Richard Knutsson 2006-09-30  9
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
Lorenzo Stoakes Jan. 27, 2025, 7:58 p.m. UTC | #3
On Tue, Jan 28, 2025 at 02:50:55AM +0800, kernel test robot wrote:
> Hi Lorenzo,
>
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on akpm-mm/mm-everything]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-simplify-vma-merge-structure-and-expand-comments/20250127-235322
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link:    https://lore.kernel.org/r/ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes%40oracle.com
> patch subject: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
> config: um-randconfig-001-20250128 (https://download.01.org/0day-ci/archive/20250128/202501280232.qYHi6Rkt-lkp@intel.com/config)
> compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250128/202501280232.qYHi6Rkt-lkp@intel.com/reproduce)
>
[snip]

Hm, this is actually only applicable in the case that
CONFIG_TRANSPARENT_HUGEPAGE is specified, I missed the stub.

Andrew - if this doesn't go for a respin before the merge window ends, I attach
a fix-patch which should resolve this.

Thanks!

----8<----
From 47cf78f1975133fea2862830b43ab9a6937fef78 Mon Sep 17 00:00:00 2001
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date: Mon, 27 Jan 2025 19:56:13 +0000
Subject: [PATCH] fixup

---
 include/linux/huge_mm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 43f76b54b522..e1bea54820ff 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -571,7 +571,7 @@ static inline int madvise_collapse(struct vm_area_struct *vma,
 static inline void vma_adjust_trans_huge(struct vm_area_struct *vma,
 					 unsigned long start,
 					 unsigned long end,
-					 long adjust_next)
+					 struct vm_area_struct *next)
 {
 }
 static inline int is_swap_pmd(pmd_t pmd)
--
2.48.0
kernel test robot Jan. 27, 2025, 8:13 p.m. UTC | #4
Hi Lorenzo,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-simplify-vma-merge-structure-and-expand-comments/20250127-235322
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes%40oracle.com
patch subject: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
config: hexagon-randconfig-001-20250128 (https://download.01.org/0day-ci/archive/20250128/202501280337.7bKYRAYQ-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 19306351a2c45e266fa11b41eb1362b20b6ca56d)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250128/202501280337.7bKYRAYQ-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501280337.7bKYRAYQ-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from mm/vma.c:7:
   In file included from mm/vma_internal.h:29:
   include/linux/mm_inline.h:47:41: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      47 |         __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
         |                                    ~~~~~~~~~~~ ^ ~~~
   include/linux/mm_inline.h:49:22: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      49 |                                 NR_ZONE_LRU_BASE + lru, nr_pages);
         |                                 ~~~~~~~~~~~~~~~~ ^ ~~~
>> mm/vma.c:518:50: error: incompatible pointer to integer conversion passing 'void *' to parameter of type 'long' [-Wint-conversion]
     518 |         vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
         |                                                         ^~~~
   include/linux/stddef.h:8:14: note: expanded from macro 'NULL'
       8 | #define NULL ((void *)0)
         |              ^~~~~~~~~~~
   include/linux/huge_mm.h:574:12: note: passing argument to parameter 'adjust_next' here
     574 |                                          long adjust_next)
         |                                               ^
>> mm/vma.c:704:10: error: incompatible pointer to integer conversion passing 'struct vm_area_struct *' to parameter of type 'long' [-Wint-conversion]
     704 |                               adj_middle ? vmg->middle : NULL);
         |                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/huge_mm.h:574:12: note: passing argument to parameter 'adjust_next' here
     574 |                                          long adjust_next)
         |                                               ^
   mm/vma.c:1141:41: error: incompatible pointer to integer conversion passing 'void *' to parameter of type 'long' [-Wint-conversion]
    1141 |         vma_adjust_trans_huge(vma, start, end, NULL);
         |                                                ^~~~
   include/linux/stddef.h:8:14: note: expanded from macro 'NULL'
       8 | #define NULL ((void *)0)
         |              ^~~~~~~~~~~
   include/linux/huge_mm.h:574:12: note: passing argument to parameter 'adjust_next' here
     574 |                                          long adjust_next)
         |                                               ^
   2 warnings and 3 errors generated.


vim +518 mm/vma.c

   459	
   460	/*
   461	 * __split_vma() bypasses sysctl_max_map_count checking.  We use this where it
   462	 * has already been checked or doesn't make sense to fail.
   463	 * VMA Iterator will point to the original VMA.
   464	 */
   465	static __must_check int
   466	__split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
   467		    unsigned long addr, int new_below)
   468	{
   469		struct vma_prepare vp;
   470		struct vm_area_struct *new;
   471		int err;
   472	
   473		WARN_ON(vma->vm_start >= addr);
   474		WARN_ON(vma->vm_end <= addr);
   475	
   476		if (vma->vm_ops && vma->vm_ops->may_split) {
   477			err = vma->vm_ops->may_split(vma, addr);
   478			if (err)
   479				return err;
   480		}
   481	
   482		new = vm_area_dup(vma);
   483		if (!new)
   484			return -ENOMEM;
   485	
   486		if (new_below) {
   487			new->vm_end = addr;
   488		} else {
   489			new->vm_start = addr;
   490			new->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
   491		}
   492	
   493		err = -ENOMEM;
   494		vma_iter_config(vmi, new->vm_start, new->vm_end);
   495		if (vma_iter_prealloc(vmi, new))
   496			goto out_free_vma;
   497	
   498		err = vma_dup_policy(vma, new);
   499		if (err)
   500			goto out_free_vmi;
   501	
   502		err = anon_vma_clone(new, vma);
   503		if (err)
   504			goto out_free_mpol;
   505	
   506		if (new->vm_file)
   507			get_file(new->vm_file);
   508	
   509		if (new->vm_ops && new->vm_ops->open)
   510			new->vm_ops->open(new);
   511	
   512		vma_start_write(vma);
   513		vma_start_write(new);
   514	
   515		init_vma_prep(&vp, vma);
   516		vp.insert = new;
   517		vma_prepare(&vp);
 > 518		vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
   519	
   520		if (new_below) {
   521			vma->vm_start = addr;
   522			vma->vm_pgoff += (addr - new->vm_start) >> PAGE_SHIFT;
   523		} else {
   524			vma->vm_end = addr;
   525		}
   526	
   527		/* vma_complete stores the new vma */
   528		vma_complete(&vp, vmi, vma->vm_mm);
   529		validate_mm(vma->vm_mm);
   530	
   531		/* Success. */
   532		if (new_below)
   533			vma_next(vmi);
   534		else
   535			vma_prev(vmi);
   536	
   537		return 0;
   538	
   539	out_free_mpol:
   540		mpol_put(vma_policy(new));
   541	out_free_vmi:
   542		vma_iter_free(vmi);
   543	out_free_vma:
   544		vm_area_free(new);
   545		return err;
   546	}
   547	
   548	/*
   549	 * Split a vma into two pieces at address 'addr', a new vma is allocated
   550	 * either for the first part or the tail.
   551	 */
   552	static int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
   553			     unsigned long addr, int new_below)
   554	{
   555		if (vma->vm_mm->map_count >= sysctl_max_map_count)
   556			return -ENOMEM;
   557	
   558		return __split_vma(vmi, vma, addr, new_below);
   559	}
   560	
   561	/*
   562	 * dup_anon_vma() - Helper function to duplicate anon_vma
   563	 * @dst: The destination VMA
   564	 * @src: The source VMA
   565	 * @dup: Pointer to the destination VMA when successful.
   566	 *
   567	 * Returns: 0 on success.
   568	 */
   569	static int dup_anon_vma(struct vm_area_struct *dst,
   570				struct vm_area_struct *src, struct vm_area_struct **dup)
   571	{
   572		/*
   573		 * Easily overlooked: when mprotect shifts the boundary, make sure the
   574		 * expanding vma has anon_vma set if the shrinking vma had, to cover any
   575		 * anon pages imported.
   576		 */
   577		if (src->anon_vma && !dst->anon_vma) {
   578			int ret;
   579	
   580			vma_assert_write_locked(dst);
   581			dst->anon_vma = src->anon_vma;
   582			ret = anon_vma_clone(dst, src);
   583			if (ret)
   584				return ret;
   585	
   586			*dup = dst;
   587		}
   588	
   589		return 0;
   590	}
   591	
   592	#ifdef CONFIG_DEBUG_VM_MAPLE_TREE
   593	void validate_mm(struct mm_struct *mm)
   594	{
   595		int bug = 0;
   596		int i = 0;
   597		struct vm_area_struct *vma;
   598		VMA_ITERATOR(vmi, mm, 0);
   599	
   600		mt_validate(&mm->mm_mt);
   601		for_each_vma(vmi, vma) {
   602	#ifdef CONFIG_DEBUG_VM_RB
   603			struct anon_vma *anon_vma = vma->anon_vma;
   604			struct anon_vma_chain *avc;
   605	#endif
   606			unsigned long vmi_start, vmi_end;
   607			bool warn = 0;
   608	
   609			vmi_start = vma_iter_addr(&vmi);
   610			vmi_end = vma_iter_end(&vmi);
   611			if (VM_WARN_ON_ONCE_MM(vma->vm_end != vmi_end, mm))
   612				warn = 1;
   613	
   614			if (VM_WARN_ON_ONCE_MM(vma->vm_start != vmi_start, mm))
   615				warn = 1;
   616	
   617			if (warn) {
   618				pr_emerg("issue in %s\n", current->comm);
   619				dump_stack();
   620				dump_vma(vma);
   621				pr_emerg("tree range: %px start %lx end %lx\n", vma,
   622					 vmi_start, vmi_end - 1);
   623				vma_iter_dump_tree(&vmi);
   624			}
   625	
   626	#ifdef CONFIG_DEBUG_VM_RB
   627			if (anon_vma) {
   628				anon_vma_lock_read(anon_vma);
   629				list_for_each_entry(avc, &vma->anon_vma_chain, same_vma)
   630					anon_vma_interval_tree_verify(avc);
   631				anon_vma_unlock_read(anon_vma);
   632			}
   633	#endif
   634			/* Check for a infinite loop */
   635			if (++i > mm->map_count + 10) {
   636				i = -1;
   637				break;
   638			}
   639		}
   640		if (i != mm->map_count) {
   641			pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);
   642			bug = 1;
   643		}
   644		VM_BUG_ON_MM(bug, mm);
   645	}
   646	#endif /* CONFIG_DEBUG_VM_MAPLE_TREE */
   647	
   648	/*
   649	 * Based on the vmg flag indicating whether we need to adjust the vm_start field
   650	 * for the middle or next VMA, we calculate what the range of the newly adjusted
   651	 * VMA ought to be, and set the VMA's range accordingly.
   652	 */
   653	static void vmg_adjust_set_range(struct vma_merge_struct *vmg)
   654	{
   655		unsigned long flags = vmg->merge_flags;
   656		struct vm_area_struct *adjust;
   657		pgoff_t pgoff;
   658	
   659		if (flags & __VMG_FLAG_ADJUST_MIDDLE_START) {
   660			adjust = vmg->middle;
   661			pgoff = adjust->vm_pgoff + PHYS_PFN(vmg->end - adjust->vm_start);
   662		} else if (flags & __VMG_FLAG_ADJUST_NEXT_START) {
   663			adjust = vmg->next;
   664			pgoff = adjust->vm_pgoff - PHYS_PFN(adjust->vm_start - vmg->end);
   665		} else {
   666			return;
   667		}
   668	
   669		vma_set_range(adjust, vmg->end, adjust->vm_end, pgoff);
   670	}
   671	
   672	/*
   673	 * Actually perform the VMA merge operation.
   674	 *
   675	 * Returns 0 on success, or an error value on failure.
   676	 */
   677	static int commit_merge(struct vma_merge_struct *vmg)
   678	{
   679		struct vm_area_struct *vma;
   680		struct vma_prepare vp;
   681		bool adj_middle = vmg->merge_flags & __VMG_FLAG_ADJUST_MIDDLE_START;
   682	
   683		if (vmg->merge_flags & __VMG_FLAG_ADJUST_NEXT_START) {
   684			/* In this case we manipulate middle and return next. */
   685			vma = vmg->middle;
   686			vma_iter_config(vmg->vmi, vmg->end, vmg->next->vm_end);
   687		} else {
   688			vma = vmg->target;
   689			 /* Note: vma iterator must be pointing to 'start'. */
   690			vma_iter_config(vmg->vmi, vmg->start, vmg->end);
   691		}
   692	
   693		init_multi_vma_prep(&vp, vma, vmg);
   694	
   695		if (vma_iter_prealloc(vmg->vmi, vma))
   696			return -ENOMEM;
   697	
   698		vma_prepare(&vp);
   699		/*
   700		 * THP pages may need to do additional splits if we increase
   701		 * middle->vm_start.
   702		 */
   703		vma_adjust_trans_huge(vma, vmg->start, vmg->end,
 > 704				      adj_middle ? vmg->middle : NULL);
   705		vma_set_range(vma, vmg->start, vmg->end, vmg->pgoff);
   706		vmg_adjust_set_range(vmg);
   707		vma_iter_store(vmg->vmi, vmg->target);
   708	
   709		vma_complete(&vp, vmg->vmi, vma->vm_mm);
   710	
   711		return 0;
   712	}
   713
kernel test robot Jan. 27, 2025, 8:35 p.m. UTC | #5
Hi Lorenzo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-simplify-vma-merge-structure-and-expand-comments/20250127-235322
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes%40oracle.com
patch subject: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
config: hexagon-randconfig-002-20250128 (https://download.01.org/0day-ci/archive/20250128/202501280408.9NtSz2Lt-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250128/202501280408.9NtSz2Lt-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501280408.9NtSz2Lt-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> mm/vma.c:518:50: warning: incompatible pointer to integer conversion passing 'void *' to parameter of type 'long' [-Wint-conversion]
           vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
                                                           ^~~~
   include/linux/stddef.h:8:14: note: expanded from macro 'NULL'
   #define NULL ((void *)0)
                ^~~~~~~~~~~
   include/linux/huge_mm.h:574:12: note: passing argument to parameter 'adjust_next' here
                                            long adjust_next)
                                                 ^
>> mm/vma.c:704:10: warning: incompatible pointer to integer conversion passing 'struct vm_area_struct *' to parameter of type 'long' [-Wint-conversion]
                                 adj_middle ? vmg->middle : NULL);
                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/huge_mm.h:574:12: note: passing argument to parameter 'adjust_next' here
                                            long adjust_next)
                                                 ^
   mm/vma.c:1141:41: warning: incompatible pointer to integer conversion passing 'void *' to parameter of type 'long' [-Wint-conversion]
           vma_adjust_trans_huge(vma, start, end, NULL);
                                                  ^~~~
   include/linux/stddef.h:8:14: note: expanded from macro 'NULL'
   #define NULL ((void *)0)
                ^~~~~~~~~~~
   include/linux/huge_mm.h:574:12: note: passing argument to parameter 'adjust_next' here
                                            long adjust_next)
                                                 ^
   3 warnings generated.


   603			struct anon_vma *anon_vma = vma->anon_vma;
   604			struct anon_vma_chain *avc;
   605	#endif
   606			unsigned long vmi_start, vmi_end;
   607			bool warn = 0;
   608	
   609			vmi_start = vma_iter_addr(&vmi);
   610			vmi_end = vma_iter_end(&vmi);
   611			if (VM_WARN_ON_ONCE_MM(vma->vm_end != vmi_end, mm))
   612				warn = 1;
   613	
   614			if (VM_WARN_ON_ONCE_MM(vma->vm_start != vmi_start, mm))
   615				warn = 1;
   616	
   617			if (warn) {
   618				pr_emerg("issue in %s\n", current->comm);
   619				dump_stack();
   620				dump_vma(vma);
   621				pr_emerg("tree range: %px start %lx end %lx\n", vma,
   622					 vmi_start, vmi_end - 1);
   623				vma_iter_dump_tree(&vmi);
   624			}
   625	
   626	#ifdef CONFIG_DEBUG_VM_RB
   627			if (anon_vma) {
   628				anon_vma_lock_read(anon_vma);
   629				list_for_each_entry(avc, &vma->anon_vma_chain, same_vma)
   630					anon_vma_interval_tree_verify(avc);
   631				anon_vma_unlock_read(anon_vma);
   632			}
   633	#endif
   634			/* Check for a infinite loop */
   635			if (++i > mm->map_count + 10) {
   636				i = -1;
   637				break;
   638			}
   639		}
   640		if (i != mm->map_count) {
   641			pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);
   642			bug = 1;
   643		}
   644		VM_BUG_ON_MM(bug, mm);
   645	}
   646	#endif /* CONFIG_DEBUG_VM_MAPLE_TREE */
   647	
   648	/*
   649	 * Based on the vmg flag indicating whether we need to adjust the vm_start field
   650	 * for the middle or next VMA, we calculate what the range of the newly adjusted
   651	 * VMA ought to be, and set the VMA's range accordingly.
   652	 */
   653	static void vmg_adjust_set_range(struct vma_merge_struct *vmg)
   654	{
   655		unsigned long flags = vmg->merge_flags;
   656		struct vm_area_struct *adjust;
   657		pgoff_t pgoff;
   658	
   659		if (flags & __VMG_FLAG_ADJUST_MIDDLE_START) {
   660			adjust = vmg->middle;
   661			pgoff = adjust->vm_pgoff + PHYS_PFN(vmg->end - adjust->vm_start);
   662		} else if (flags & __VMG_FLAG_ADJUST_NEXT_START) {
   663			adjust = vmg->next;
   664			pgoff = adjust->vm_pgoff - PHYS_PFN(adjust->vm_start - vmg->end);
   665		} else {
   666			return;
   667		}
   668	
   669		vma_set_range(adjust, vmg->end, adjust->vm_end, pgoff);
   670	}
   671	
   672	/*
   673	 * Actually perform the VMA merge operation.
   674	 *
   675	 * On success, returns the merged VMA. Otherwise returns NULL.
   676	 */
   677	static int commit_merge(struct vma_merge_struct *vmg)
   678	{
   679		struct vm_area_struct *vma;
   680		struct vma_prepare vp;
   681		bool adj_middle = vmg->merge_flags & __VMG_FLAG_ADJUST_MIDDLE_START;
   682	
   683		if (vmg->merge_flags & __VMG_FLAG_ADJUST_NEXT_START) {
   684			/* In this case we manipulate middle and return next. */
   685			vma = vmg->middle;
   686			vma_iter_config(vmg->vmi, vmg->end, vmg->next->vm_end);
   687		} else {
   688			vma = vmg->target;
   689			 /* Note: vma iterator must be pointing to 'start'. */
   690			vma_iter_config(vmg->vmi, vmg->start, vmg->end);
   691		}
   692	
   693		init_multi_vma_prep(&vp, vma, vmg);
   694	
   695		if (vma_iter_prealloc(vmg->vmi, vma))
   696			return -ENOMEM;
   697	
   698		vma_prepare(&vp);
   699		/*
   700		 * THP pages may need to do additional splits if we increase
   701		 * middle->vm_start.
   702		 */
   703		vma_adjust_trans_huge(vma, vmg->start, vmg->end,
 > 704				      adj_middle ? vmg->middle : NULL);
   705		vma_set_range(vma, vmg->start, vmg->end, vmg->pgoff);
   706		vmg_adjust_set_range(vmg);
   707		vma_iter_store(vmg->vmi, vmg->target);
   708	
   709		vma_complete(&vp, vmg->vmi, vma->vm_mm);
   710	
   711		return 0;
   712	}
   713
kernel test robot Jan. 28, 2025, 7:54 a.m. UTC | #6
Hi Lorenzo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-simplify-vma-merge-structure-and-expand-comments/20250127-235322
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/ef00aec42a892fe6ac9557b3a11f18f30a2e51b3.1737929364.git.lorenzo.stoakes%40oracle.com
patch subject: [PATCH 5/5] mm: completely abstract unnecessary adj_start calculation
config: x86_64-randconfig-122-20250128 (https://download.01.org/0day-ci/archive/20250128/202501281552.zDLVA5oq-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250128/202501281552.zDLVA5oq-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501281552.zDLVA5oq-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
>> mm/vma.c:518:57: sparse: sparse: incorrect type in argument 4 (different base types) @@     expected long adjust_next @@     got void * @@
   mm/vma.c:518:57: sparse:     expected long adjust_next
   mm/vma.c:518:57: sparse:     got void *
>> mm/vma.c:704:42: sparse: sparse: incorrect type in argument 4 (different base types) @@     expected long adjust_next @@     got struct vm_area_struct * @@
   mm/vma.c:704:42: sparse:     expected long adjust_next
   mm/vma.c:704:42: sparse:     got struct vm_area_struct *
   mm/vma.c:1141:48: sparse: sparse: incorrect type in argument 4 (different base types) @@     expected long adjust_next @@     got void * @@
   mm/vma.c:1141:48: sparse:     expected long adjust_next
   mm/vma.c:1141:48: sparse:     got void *
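
These mismatches are consistent with the !CONFIG_TRANSPARENT_HUGEPAGE inline stub of vma_adjust_trans_huge() in huge_mm.h not being updated alongside the prototype the patch changes (a W=1 randconfig build will typically have THP disabled; note the patch below does update the userland testing stub in tools/testing/vma/vma_internal.h, but no hunk touches the kernel-side stub). A sketch of the suspected before/after, assuming the mainline !THP stub (not quoted in this report) still takes `long adjust_next`:

```c
/* huge_mm.h, !CONFIG_TRANSPARENT_HUGEPAGE side (assumed current state): */
static inline void vma_adjust_trans_huge(struct vm_area_struct *vma,
					 unsigned long start,
					 unsigned long end,
					 long adjust_next)	/* still 'long' */
{
}

/*
 * Suspected fix: mirror the patch's signature change in the stub, so that
 * callers passing NULL or a struct vm_area_struct * type-check under both
 * configs:
 */
static inline void vma_adjust_trans_huge(struct vm_area_struct *vma,
					 unsigned long start,
					 unsigned long end,
					 struct vm_area_struct *next)
{
}
```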

vim +518 mm/vma.c

   459	
   460	/*
   461	 * __split_vma() bypasses sysctl_max_map_count checking.  We use this where it
   462	 * has already been checked or doesn't make sense to fail.
   463	 * VMA Iterator will point to the original VMA.
   464	 */
   465	static __must_check int
   466	__split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
   467		    unsigned long addr, int new_below)
   468	{
   469		struct vma_prepare vp;
   470		struct vm_area_struct *new;
   471		int err;
   472	
   473		WARN_ON(vma->vm_start >= addr);
   474		WARN_ON(vma->vm_end <= addr);
   475	
   476		if (vma->vm_ops && vma->vm_ops->may_split) {
   477			err = vma->vm_ops->may_split(vma, addr);
   478			if (err)
   479				return err;
   480		}
   481	
   482		new = vm_area_dup(vma);
   483		if (!new)
   484			return -ENOMEM;
   485	
   486		if (new_below) {
   487			new->vm_end = addr;
   488		} else {
   489			new->vm_start = addr;
   490			new->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
   491		}
   492	
   493		err = -ENOMEM;
   494		vma_iter_config(vmi, new->vm_start, new->vm_end);
   495		if (vma_iter_prealloc(vmi, new))
   496			goto out_free_vma;
   497	
   498		err = vma_dup_policy(vma, new);
   499		if (err)
   500			goto out_free_vmi;
   501	
   502		err = anon_vma_clone(new, vma);
   503		if (err)
   504			goto out_free_mpol;
   505	
   506		if (new->vm_file)
   507			get_file(new->vm_file);
   508	
   509		if (new->vm_ops && new->vm_ops->open)
   510			new->vm_ops->open(new);
   511	
   512		vma_start_write(vma);
   513		vma_start_write(new);
   514	
   515		init_vma_prep(&vp, vma);
   516		vp.insert = new;
   517		vma_prepare(&vp);
 > 518		vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
   519	
   520		if (new_below) {
   521			vma->vm_start = addr;
   522			vma->vm_pgoff += (addr - new->vm_start) >> PAGE_SHIFT;
   523		} else {
   524			vma->vm_end = addr;
   525		}
   526	
   527		/* vma_complete stores the new vma */
   528		vma_complete(&vp, vmi, vma->vm_mm);
   529		validate_mm(vma->vm_mm);
   530	
   531		/* Success. */
   532		if (new_below)
   533			vma_next(vmi);
   534		else
   535			vma_prev(vmi);
   536	
   537		return 0;
   538	
   539	out_free_mpol:
   540		mpol_put(vma_policy(new));
   541	out_free_vmi:
   542		vma_iter_free(vmi);
   543	out_free_vma:
   544		vm_area_free(new);
   545		return err;
   546	}
   547	
   548	/*
   549	 * Split a vma into two pieces at address 'addr', a new vma is allocated
   550	 * either for the first part or the tail.
   551	 */
   552	static int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
   553			     unsigned long addr, int new_below)
   554	{
   555		if (vma->vm_mm->map_count >= sysctl_max_map_count)
   556			return -ENOMEM;
   557	
   558		return __split_vma(vmi, vma, addr, new_below);
   559	}
   560	
   561	/*
   562	 * dup_anon_vma() - Helper function to duplicate anon_vma
   563	 * @dst: The destination VMA
   564	 * @src: The source VMA
   565	 * @dup: Pointer to the destination VMA when successful.
   566	 *
   567	 * Returns: 0 on success.
   568	 */
   569	static int dup_anon_vma(struct vm_area_struct *dst,
   570				struct vm_area_struct *src, struct vm_area_struct **dup)
   571	{
   572		/*
   573		 * Easily overlooked: when mprotect shifts the boundary, make sure the
   574		 * expanding vma has anon_vma set if the shrinking vma had, to cover any
   575		 * anon pages imported.
   576		 */
   577		if (src->anon_vma && !dst->anon_vma) {
   578			int ret;
   579	
   580			vma_assert_write_locked(dst);
   581			dst->anon_vma = src->anon_vma;
   582			ret = anon_vma_clone(dst, src);
   583			if (ret)
   584				return ret;
   585	
   586			*dup = dst;
   587		}
   588	
   589		return 0;
   590	}
   591	
   592	#ifdef CONFIG_DEBUG_VM_MAPLE_TREE
   593	void validate_mm(struct mm_struct *mm)
   594	{
   595		int bug = 0;
   596		int i = 0;
   597		struct vm_area_struct *vma;
   598		VMA_ITERATOR(vmi, mm, 0);
   599	
   600		mt_validate(&mm->mm_mt);
   601		for_each_vma(vmi, vma) {
   602	#ifdef CONFIG_DEBUG_VM_RB
   603			struct anon_vma *anon_vma = vma->anon_vma;
   604			struct anon_vma_chain *avc;
   605	#endif
   606			unsigned long vmi_start, vmi_end;
   607			bool warn = 0;
   608	
   609			vmi_start = vma_iter_addr(&vmi);
   610			vmi_end = vma_iter_end(&vmi);
   611			if (VM_WARN_ON_ONCE_MM(vma->vm_end != vmi_end, mm))
   612				warn = 1;
   613	
   614			if (VM_WARN_ON_ONCE_MM(vma->vm_start != vmi_start, mm))
   615				warn = 1;
   616	
   617			if (warn) {
   618				pr_emerg("issue in %s\n", current->comm);
   619				dump_stack();
   620				dump_vma(vma);
   621				pr_emerg("tree range: %px start %lx end %lx\n", vma,
   622					 vmi_start, vmi_end - 1);
   623				vma_iter_dump_tree(&vmi);
   624			}
   625	
   626	#ifdef CONFIG_DEBUG_VM_RB
   627			if (anon_vma) {
   628				anon_vma_lock_read(anon_vma);
   629				list_for_each_entry(avc, &vma->anon_vma_chain, same_vma)
   630					anon_vma_interval_tree_verify(avc);
   631				anon_vma_unlock_read(anon_vma);
   632			}
   633	#endif
   634			/* Check for a infinite loop */
   635			if (++i > mm->map_count + 10) {
   636				i = -1;
   637				break;
   638			}
   639		}
   640		if (i != mm->map_count) {
   641			pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i);
   642			bug = 1;
   643		}
   644		VM_BUG_ON_MM(bug, mm);
   645	}
   646	#endif /* CONFIG_DEBUG_VM_MAPLE_TREE */
   647	
   648	/*
   649	 * Based on the vmg flag indicating whether we need to adjust the vm_start field
   650	 * for the middle or next VMA, we calculate what the range of the newly adjusted
   651	 * VMA ought to be, and set the VMA's range accordingly.
   652	 */
   653	static void vmg_adjust_set_range(struct vma_merge_struct *vmg)
   654	{
   655		unsigned long flags = vmg->merge_flags;
   656		struct vm_area_struct *adjust;
   657		pgoff_t pgoff;
   658	
   659		if (flags & __VMG_FLAG_ADJUST_MIDDLE_START) {
   660			adjust = vmg->middle;
   661			pgoff = adjust->vm_pgoff + PHYS_PFN(vmg->end - adjust->vm_start);
   662		} else if (flags & __VMG_FLAG_ADJUST_NEXT_START) {
   663			adjust = vmg->next;
   664			pgoff = adjust->vm_pgoff - PHYS_PFN(adjust->vm_start - vmg->end);
   665		} else {
   666			return;
   667		}
   668	
   669		vma_set_range(adjust, vmg->end, adjust->vm_end, pgoff);
   670	}
   671	
   672	/*
   673	 * Actually perform the VMA merge operation.
   674	 *
   675	 * On success, returns the merged VMA. Otherwise returns NULL.
   676	 */
   677	static int commit_merge(struct vma_merge_struct *vmg)
   678	{
   679		struct vm_area_struct *vma;
   680		struct vma_prepare vp;
   681		bool adj_middle = vmg->merge_flags & __VMG_FLAG_ADJUST_MIDDLE_START;
   682	
   683		if (vmg->merge_flags & __VMG_FLAG_ADJUST_NEXT_START) {
   684			/* In this case we manipulate middle and return next. */
   685			vma = vmg->middle;
   686			vma_iter_config(vmg->vmi, vmg->end, vmg->next->vm_end);
   687		} else {
   688			vma = vmg->target;
   689			 /* Note: vma iterator must be pointing to 'start'. */
   690			vma_iter_config(vmg->vmi, vmg->start, vmg->end);
   691		}
   692	
   693		init_multi_vma_prep(&vp, vma, vmg);
   694	
   695		if (vma_iter_prealloc(vmg->vmi, vma))
   696			return -ENOMEM;
   697	
   698		vma_prepare(&vp);
   699		/*
   700		 * THP pages may need to do additional splits if we increase
   701		 * middle->vm_start.
   702		 */
   703		vma_adjust_trans_huge(vma, vmg->start, vmg->end,
 > 704				      adj_middle ? vmg->middle : NULL);
   705		vma_set_range(vma, vmg->start, vmg->end, vmg->pgoff);
   706		vmg_adjust_set_range(vmg);
   707		vma_iter_store(vmg->vmi, vmg->target);
   708	
   709		vma_complete(&vp, vmg->vmi, vma->vm_mm);
   710	
   711		return 0;
   712	}
   713

Patch

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 93e509b6c00e..43f76b54b522 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -404,7 +404,7 @@  int madvise_collapse(struct vm_area_struct *vma,
 		     struct vm_area_struct **prev,
 		     unsigned long start, unsigned long end);
 void vma_adjust_trans_huge(struct vm_area_struct *vma, unsigned long start,
-			   unsigned long end, long adjust_next);
+			   unsigned long end, struct vm_area_struct *next);
 spinlock_t *__pmd_trans_huge_lock(pmd_t *pmd, struct vm_area_struct *vma);
 spinlock_t *__pud_trans_huge_lock(pud_t *pud, struct vm_area_struct *vma);
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3d3ebdc002d5..c44599f9ad09 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3017,9 +3017,9 @@  static inline void split_huge_pmd_if_needed(struct vm_area_struct *vma, unsigned
 }
 
 void vma_adjust_trans_huge(struct vm_area_struct *vma,
-			     unsigned long start,
-			     unsigned long end,
-			     long adjust_next)
+			   unsigned long start,
+			   unsigned long end,
+			   struct vm_area_struct *next)
 {
 	/* Check if we need to split start first. */
 	split_huge_pmd_if_needed(vma, start);
@@ -3027,16 +3027,9 @@  void vma_adjust_trans_huge(struct vm_area_struct *vma,
 	/* Check if we need to split end next. */
 	split_huge_pmd_if_needed(vma, end);
 
-	/*
-	 * If we're also updating the next vma vm_start,
-	 * check if we need to split it.
-	 */
-	if (adjust_next > 0) {
-		struct vm_area_struct *next = find_vma(vma->vm_mm, vma->vm_end);
-		unsigned long nstart = next->vm_start;
-		nstart += adjust_next;
-		split_huge_pmd_if_needed(next, nstart);
-	}
+	/* If we're incrementing next->vm_start, we might need to split it. */
+	if (next)
+		split_huge_pmd_if_needed(next, end);
 }
 
 static void unmap_folio(struct folio *folio)
diff --git a/mm/vma.c b/mm/vma.c
index cfcadca7b116..48a7acc83a82 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -515,7 +515,7 @@  __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	init_vma_prep(&vp, vma);
 	vp.insert = new;
 	vma_prepare(&vp);
-	vma_adjust_trans_huge(vma, vma->vm_start, addr, 0);
+	vma_adjust_trans_huge(vma, vma->vm_start, addr, NULL);
 
 	if (new_below) {
 		vma->vm_start = addr;
@@ -646,47 +646,46 @@  void validate_mm(struct mm_struct *mm)
 #endif /* CONFIG_DEBUG_VM_MAPLE_TREE */
 
 /*
- * Actually perform the VMA merge operation.
- *
- * On success, returns the merged VMA. Otherwise returns NULL.
+ * Based on the vmg flag indicating whether we need to adjust the vm_start field
+ * for the middle or next VMA, we calculate what the range of the newly adjusted
+ * VMA ought to be, and set the VMA's range accordingly.
  */
-static struct vm_area_struct *commit_merge(struct vma_merge_struct *vmg)
+static void vmg_adjust_set_range(struct vma_merge_struct *vmg)
 {
-	struct vm_area_struct *vma;
-	struct vma_prepare vp;
-	struct vm_area_struct *adjust;
-	long adj_start;
 	unsigned long flags = vmg->merge_flags;
+	struct vm_area_struct *adjust;
+	pgoff_t pgoff;
 
-	/*
-	 * If modifying an existing VMA and we don't remove vmg->middle, then we
-	 * shrink the adjacent VMA.
-	 */
 	if (flags & __VMG_FLAG_ADJUST_MIDDLE_START) {
-		vma = vmg->target;
 		adjust = vmg->middle;
-		/* The POSITIVE value by which we offset vmg->middle->vm_start. */
-		adj_start = vmg->end - vmg->middle->vm_start;
-
-		 /* Note: vma iterator must be pointing to 'start'. */
-		vma_iter_config(vmg->vmi, vmg->start, vmg->end);
+		pgoff = adjust->vm_pgoff + PHYS_PFN(vmg->end - adjust->vm_start);
 	} else if (flags & __VMG_FLAG_ADJUST_NEXT_START) {
-		/*
-		 * In this case alone, the VMA we manipulate is vmg->middle, but
-		 * we ultimately return vmg->next.
-		 */
-		vma = vmg->middle;
 		adjust = vmg->next;
-		/* The NEGATIVE value by which we offset vmg->next->vm_start. */
-		adj_start = -(vmg->middle->vm_end - vmg->end);
+		pgoff = adjust->vm_pgoff - PHYS_PFN(adjust->vm_start - vmg->end);
+	} else {
+		return;
+	}
+
+	vma_set_range(adjust, vmg->end, adjust->vm_end, pgoff);
+}
+
+/*
+ * Actually perform the VMA merge operation.
+ *
+ * On success, returns the merged VMA. Otherwise returns NULL.
+ */
+static int commit_merge(struct vma_merge_struct *vmg)
+{
+	struct vm_area_struct *vma;
+	struct vma_prepare vp;
+	bool adj_middle = vmg->merge_flags & __VMG_FLAG_ADJUST_MIDDLE_START;
 
-		vma_iter_config(vmg->vmi, vmg->next->vm_start + adj_start,
-				vmg->next->vm_end);
+	if (vmg->merge_flags & __VMG_FLAG_ADJUST_NEXT_START) {
+		/* In this case we manipulate middle and return next. */
+		vma = vmg->middle;
+		vma_iter_config(vmg->vmi, vmg->end, vmg->next->vm_end);
 	} else {
 		vma = vmg->target;
-		adjust = NULL;
-		adj_start = 0;
-
 		 /* Note: vma iterator must be pointing to 'start'. */
 		vma_iter_config(vmg->vmi, vmg->start, vmg->end);
 	}
@@ -694,22 +693,22 @@  static struct vm_area_struct *commit_merge(struct vma_merge_struct *vmg)
 	init_multi_vma_prep(&vp, vma, vmg);
 
 	if (vma_iter_prealloc(vmg->vmi, vma))
-		return NULL;
+		return -ENOMEM;
 
 	vma_prepare(&vp);
-	vma_adjust_trans_huge(vma, vmg->start, vmg->end, adj_start);
+	/*
+	 * THP pages may need to do additional splits if we increase
+	 * middle->vm_start.
+	 */
+	vma_adjust_trans_huge(vma, vmg->start, vmg->end,
+			      adj_middle ? vmg->middle : NULL);
 	vma_set_range(vma, vmg->start, vmg->end, vmg->pgoff);
-
-	if (adj_start) {
-		adjust->vm_start += adj_start;
-		adjust->vm_pgoff += PHYS_PFN(adj_start);
-	}
-
+	vmg_adjust_set_range(vmg);
 	vma_iter_store(vmg->vmi, vmg->target);
 
 	vma_complete(&vp, vmg->vmi, vma->vm_mm);
 
-	return vmg->target;
+	return 0;
 }
 
 /* We can only remove VMAs when merging if they do not have a close hook. */
@@ -752,7 +751,7 @@  static __must_check struct vm_area_struct *vma_merge_existing_range(
 {
 	struct vm_area_struct *middle = vmg->middle;
 	struct vm_area_struct *prev = vmg->prev;
-	struct vm_area_struct *next, *res;
+	struct vm_area_struct *next;
 	struct vm_area_struct *anon_dup = NULL;
 	unsigned long start = vmg->start;
 	unsigned long end = vmg->end;
@@ -904,12 +903,7 @@  static __must_check struct vm_area_struct *vma_merge_existing_range(
 			vmg->end = next->vm_end;
 			vmg->pgoff = next->vm_pgoff - pglen;
 		} else {
-			/*
-			 * We shrink middle and expand next.
-			 *
-			 * IMPORTANT: This is the ONLY case where the final
-			 * merged VMA is NOT vmg->target, but rather vmg->next.
-			 */
+			/* We shrink middle and expand next. */
 			vmg->merge_flags |= __VMG_FLAG_ADJUST_NEXT_START;
 			vmg->start = middle->vm_start;
 			vmg->end = start;
@@ -927,8 +921,10 @@  static __must_check struct vm_area_struct *vma_merge_existing_range(
 	if (merge_will_delete_next)
 		vmg->merge_flags |= __VMG_FLAG_REMOVE_NEXT;
 
-	res = commit_merge(vmg);
-	if (!res) {
+	err = commit_merge(vmg);
+	if (err) {
+		VM_WARN_ON(err != -ENOMEM);
+
 		if (anon_dup)
 			unlink_anon_vmas(anon_dup);
 
@@ -936,9 +932,9 @@  static __must_check struct vm_area_struct *vma_merge_existing_range(
 		return NULL;
 	}
 
-	khugepaged_enter_vma(res, vmg->flags);
+	khugepaged_enter_vma(vmg->target, vmg->flags);
 	vmg->state = VMA_MERGE_SUCCESS;
-	return res;
+	return vmg->target;
 
 abort:
 	vma_iter_set(vmg->vmi, start);
@@ -1102,7 +1098,7 @@  int vma_expand(struct vma_merge_struct *vmg)
 	if (remove_next)
 		vmg->merge_flags |= __VMG_FLAG_REMOVE_NEXT;
 
-	if (!commit_merge(vmg))
+	if (commit_merge(vmg))
 		goto nomem;
 
 	return 0;
@@ -1142,7 +1138,7 @@  int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
 
 	init_vma_prep(&vp, vma);
 	vma_prepare(&vp);
-	vma_adjust_trans_huge(vma, start, end, 0);
+	vma_adjust_trans_huge(vma, start, end, NULL);
 
 	vma_iter_clear(vmi);
 	vma_set_range(vma, start, end, pgoff);
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index 1eae23039854..bb273927af0f 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -796,12 +796,12 @@  static inline void vma_start_write(struct vm_area_struct *vma)
 static inline void vma_adjust_trans_huge(struct vm_area_struct *vma,
 					 unsigned long start,
 					 unsigned long end,
-					 long adjust_next)
+					 struct vm_area_struct *next)
 {
 	(void)vma;
 	(void)start;
 	(void)end;
-	(void)adjust_next;
+	(void)next;
 }
 
 static inline void vma_iter_free(struct vma_iterator *vmi)