diff mbox series

[3/3] mm: thp: relocate flush_cache_range() in migrate_misplaced_transhuge_page()

Message ID 20181013002430.698-4-aarcange@redhat.com (mailing list archive)
State New, archived
Headers show
Series migrate_misplaced_transhuge_page race conditions | expand

Commit Message

Andrea Arcangeli Oct. 13, 2018, 12:24 a.m. UTC
There should be no cache left by the time we overwrite the old
transhuge pmd with the new one. It's already too late to flush through
the virtual address because we already copied the page data to the new
physical address.

So flush the cache before the data copy.

Also delete the "end" variable to shutoff a "unused variable" warning
on x86 where flush_cache_range() is a noop.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 mm/migrate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

kernel test robot Oct. 14, 2018, 9:58 a.m. UTC | #1
Hi Andrea,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linux-sof-driver/master]
[also build test ERROR on v4.19-rc7 next-20181012]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Andrea-Arcangeli/mm-thp-fix-MADV_DONTNEED-vs-migrate_misplaced_transhuge_page-race-condition/20181014-143004
base:   https://github.com/thesofproject/linux master
config: arm64-defconfig (attached as .config)
compiler: aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.2.0 make.cross ARCH=arm64 

All errors (new ones prefixed by >>):

   mm/migrate.c: In function 'migrate_misplaced_transhuge_page':
>> mm/migrate.c:2054:32: error: 'end' undeclared (first use in this function); did you mean '_end'?
     flush_cache_range(vma, start, end + HPAGE_PMD_SIZE);
                                   ^~~
                                   _end
   mm/migrate.c:2054:32: note: each undeclared identifier is reported only once for each function it appears in

vim +2054 mm/migrate.c

  2005	
  2006	#if defined(CONFIG_NUMA_BALANCING) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
  2007	/*
  2008	 * Migrates a THP to a given target node. page must be locked and is unlocked
  2009	 * before returning.
  2010	 */
  2011	int migrate_misplaced_transhuge_page(struct mm_struct *mm,
  2012					struct vm_area_struct *vma,
  2013					pmd_t *pmd, pmd_t entry,
  2014					unsigned long address,
  2015					struct page *page, int node)
  2016	{
  2017		spinlock_t *ptl;
  2018		pg_data_t *pgdat = NODE_DATA(node);
  2019		int isolated = 0;
  2020		struct page *new_page = NULL;
  2021		int page_lru = page_is_file_cache(page);
  2022		unsigned long start = address & HPAGE_PMD_MASK;
  2023	
  2024		/*
  2025		 * Rate-limit the amount of data that is being migrated to a node.
  2026		 * Optimal placement is no good if the memory bus is saturated and
  2027		 * all the time is being spent migrating!
  2028		 */
  2029		if (numamigrate_update_ratelimit(pgdat, HPAGE_PMD_NR))
  2030			goto out_dropref;
  2031	
  2032		new_page = alloc_pages_node(node,
  2033			(GFP_TRANSHUGE_LIGHT | __GFP_THISNODE),
  2034			HPAGE_PMD_ORDER);
  2035		if (!new_page)
  2036			goto out_fail;
  2037		prep_transhuge_page(new_page);
  2038	
  2039		isolated = numamigrate_isolate_page(pgdat, page);
  2040		if (!isolated) {
  2041			put_page(new_page);
  2042			goto out_fail;
  2043		}
  2044	
  2045		/* Prepare a page as a migration target */
  2046		__SetPageLocked(new_page);
  2047		if (PageSwapBacked(page))
  2048			__SetPageSwapBacked(new_page);
  2049	
  2050		/* anon mapping, we can simply copy page->mapping to the new page: */
  2051		new_page->mapping = page->mapping;
  2052		new_page->index = page->index;
  2053		/* flush the cache before copying using the kernel virtual address */
> 2054		flush_cache_range(vma, start, end + HPAGE_PMD_SIZE);
  2055		migrate_page_copy(new_page, page);
  2056		WARN_ON(PageLRU(new_page));
  2057	
  2058		/* Recheck the target PMD */
  2059		ptl = pmd_lock(mm, pmd);
  2060		if (unlikely(!pmd_same(*pmd, entry) || !page_ref_freeze(page, 2))) {
  2061			spin_unlock(ptl);
  2062	
  2063			/* Reverse changes made by migrate_page_copy() */
  2064			if (TestClearPageActive(new_page))
  2065				SetPageActive(page);
  2066			if (TestClearPageUnevictable(new_page))
  2067				SetPageUnevictable(page);
  2068	
  2069			unlock_page(new_page);
  2070			put_page(new_page);		/* Free it */
  2071	
  2072			/* Retake the callers reference and putback on LRU */
  2073			get_page(page);
  2074			putback_lru_page(page);
  2075			mod_node_page_state(page_pgdat(page),
  2076				 NR_ISOLATED_ANON + page_lru, -HPAGE_PMD_NR);
  2077	
  2078			goto out_unlock;
  2079		}
  2080	
  2081		entry = mk_huge_pmd(new_page, vma->vm_page_prot);
  2082		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
  2083	
  2084		/*
  2085		 * Overwrite the old entry under pagetable lock and establish
  2086		 * the new PTE. Any parallel GUP will either observe the old
  2087		 * page blocking on the page lock, block on the page table
  2088		 * lock or observe the new page. The SetPageUptodate on the
  2089		 * new page and page_add_new_anon_rmap guarantee the copy is
  2090		 * visible before the pagetable update.
  2091		 */
  2092		page_add_anon_rmap(new_page, vma, start, true);
  2093		/*
  2094		 * At this point the pmd is numa/protnone (i.e. non present)
  2095		 * and the TLB has already been flushed globally. So no TLB
  2096		 * can be currently caching this non present pmd mapping.
  2097		 * There's no need of clearing the pmd before doing
  2098		 * set_pmd_at(), nor to flush the TLB after
  2099		 * set_pmd_at(). Clearing the pmd here would introduce a race
  2100		 * condition against MADV_DONTNEED, beacuse MADV_DONTNEED only
  2101		 * holds the mmap_sem for reading. If the pmd is set to NULL
  2102		 * at any given time, MADV_DONTNEED won't wait on the pmd lock
  2103		 * and it'll skip clearing this pmd.
  2104		 */
  2105		set_pmd_at(mm, start, pmd, entry);
  2106		update_mmu_cache_pmd(vma, address, &entry);
  2107	
  2108		page_ref_unfreeze(page, 2);
  2109		mlock_migrate_page(new_page, page);
  2110		page_remove_rmap(page, true);
  2111		set_page_owner_migrate_reason(new_page, MR_NUMA_MISPLACED);
  2112	
  2113		spin_unlock(ptl);
  2114	
  2115		/* Take an "isolate" reference and put new page on the LRU. */
  2116		get_page(new_page);
  2117		putback_lru_page(new_page);
  2118	
  2119		unlock_page(new_page);
  2120		unlock_page(page);
  2121		put_page(page);			/* Drop the rmap reference */
  2122		put_page(page);			/* Drop the LRU isolation reference */
  2123	
  2124		count_vm_events(PGMIGRATE_SUCCESS, HPAGE_PMD_NR);
  2125		count_vm_numa_events(NUMA_PAGE_MIGRATE, HPAGE_PMD_NR);
  2126	
  2127		mod_node_page_state(page_pgdat(page),
  2128				NR_ISOLATED_ANON + page_lru,
  2129				-HPAGE_PMD_NR);
  2130		return isolated;
  2131	
  2132	out_fail:
  2133		count_vm_events(PGMIGRATE_FAIL, HPAGE_PMD_NR);
  2134	out_dropref:
  2135		ptl = pmd_lock(mm, pmd);
  2136		if (pmd_same(*pmd, entry)) {
  2137			entry = pmd_modify(entry, vma->vm_page_prot);
  2138			set_pmd_at(mm, start, pmd, entry);
  2139			update_mmu_cache_pmd(vma, address, &entry);
  2140		}
  2141		spin_unlock(ptl);
  2142	
  2143	out_unlock:
  2144		unlock_page(page);
  2145		put_page(page);
  2146		return 0;
  2147	}
  2148	#endif /* CONFIG_NUMA_BALANCING */
  2149	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Andrea Arcangeli Oct. 14, 2018, 7:58 p.m. UTC | #2
On Sun, Oct 14, 2018 at 05:58:27PM +0800, kbuild test robot wrote:
> Hi Andrea,
> 
> Thank you for the patch! Yet something to improve:
> 
> [auto build test ERROR on linux-sof-driver/master]
> [also build test ERROR on v4.19-rc7 next-20181012]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/Andrea-Arcangeli/mm-thp-fix-MADV_DONTNEED-vs-migrate_misplaced_transhuge_page-race-condition/20181014-143004
> base:   https://github.com/thesofproject/linux master
> config: arm64-defconfig (attached as .config)
> compiler: aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
> reproduce:
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # save the attached .config to linux build tree
>         GCC_VERSION=7.2.0 make.cross ARCH=arm64 
> 
> All errors (new ones prefixed by >>):
> 
>    mm/migrate.c: In function 'migrate_misplaced_transhuge_page':
> >> mm/migrate.c:2054:32: error: 'end' undeclared (first use in this function); did you mean '_end'?
>      flush_cache_range(vma, start, end + HPAGE_PMD_SIZE);
>                                    ^~~
>                                    _end
>    mm/migrate.c:2054:32: note: each undeclared identifier is reported only once for each function it appears in

Nice non-x86 coverage. I intended converted "end" to "start +
HPAGE_PMD_SIZE" to delete the "end" variable purely to shut off a
warning about unused "end" var from gcc on x86, but the s/end/start/
was missed and it still build fine on x86 but not anymore on aarch64.

Anyway I'm waiting some feedback about the whole patchset, before
resending patch 3/3.

diff --git a/mm/migrate.c b/mm/migrate.c
index 9bf5fe9a1008..8afb41167641 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2050,7 +2050,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	new_page->mapping = page->mapping;
 	new_page->index = page->index;
 	/* flush the cache before copying using the kernel virtual address */
-	flush_cache_range(vma, start, end + HPAGE_PMD_SIZE);
+	flush_cache_range(vma, start, start + HPAGE_PMD_SIZE);
 	migrate_page_copy(new_page, page);
 	WARN_ON(PageLRU(new_page));
 

Thanks!
Andrea
Kirill A. Shutemov Oct. 15, 2018, 3:35 p.m. UTC | #3
On Sun, Oct 14, 2018 at 03:58:53PM -0400, Andrea Arcangeli wrote:
> On Sun, Oct 14, 2018 at 05:58:27PM +0800, kbuild test robot wrote:
> > Hi Andrea,
> > 
> > Thank you for the patch! Yet something to improve:
> > 
> > [auto build test ERROR on linux-sof-driver/master]
> > [also build test ERROR on v4.19-rc7 next-20181012]
> > [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> > 
> > url:    https://github.com/0day-ci/linux/commits/Andrea-Arcangeli/mm-thp-fix-MADV_DONTNEED-vs-migrate_misplaced_transhuge_page-race-condition/20181014-143004
> > base:   https://github.com/thesofproject/linux master
> > config: arm64-defconfig (attached as .config)
> > compiler: aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
> > reproduce:
> >         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> >         chmod +x ~/bin/make.cross
> >         # save the attached .config to linux build tree
> >         GCC_VERSION=7.2.0 make.cross ARCH=arm64 
> > 
> > All errors (new ones prefixed by >>):
> > 
> >    mm/migrate.c: In function 'migrate_misplaced_transhuge_page':
> > >> mm/migrate.c:2054:32: error: 'end' undeclared (first use in this function); did you mean '_end'?
> >      flush_cache_range(vma, start, end + HPAGE_PMD_SIZE);
> >                                    ^~~
> >                                    _end
> >    mm/migrate.c:2054:32: note: each undeclared identifier is reported only once for each function it appears in
> 
> Nice non-x86 coverage. I intended converted "end" to "start +
> HPAGE_PMD_SIZE" to delete the "end" variable purely to shut off a
> warning about unused "end" var from gcc on x86, but the s/end/start/
> was missed and it still build fine on x86 but not anymore on aarch64.
> 
> Anyway I'm waiting some feedback about the whole patchset, before
> resending patch 3/3.
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 9bf5fe9a1008..8afb41167641 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2050,7 +2050,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
>  	new_page->mapping = page->mapping;
>  	new_page->index = page->index;
>  	/* flush the cache before copying using the kernel virtual address */
> -	flush_cache_range(vma, start, end + HPAGE_PMD_SIZE);
> +	flush_cache_range(vma, start, start + HPAGE_PMD_SIZE);
>  	migrate_page_copy(new_page, page);
>  	WARN_ON(PageLRU(new_page));
>  
> 

Looks good to me with the fixup.

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
diff mbox series

Patch

diff --git a/mm/migrate.c b/mm/migrate.c
index c9e9b7db8b6d..9bf5fe9a1008 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2019,7 +2019,6 @@  int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	struct page *new_page = NULL;
 	int page_lru = page_is_file_cache(page);
 	unsigned long start = address & HPAGE_PMD_MASK;
-	unsigned long end = start + HPAGE_PMD_SIZE;
 
 	/*
 	 * Rate-limit the amount of data that is being migrated to a node.
@@ -2050,6 +2049,8 @@  int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	/* anon mapping, we can simply copy page->mapping to the new page: */
 	new_page->mapping = page->mapping;
 	new_page->index = page->index;
+	/* flush the cache before copying using the kernel virtual address */
+	flush_cache_range(vma, start, end + HPAGE_PMD_SIZE);
 	migrate_page_copy(new_page, page);
 	WARN_ON(PageLRU(new_page));
 
@@ -2087,7 +2088,6 @@  int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	 * new page and page_add_new_anon_rmap guarantee the copy is
 	 * visible before the pagetable update.
 	 */
-	flush_cache_range(vma, start, end);
 	page_add_anon_rmap(new_page, vma, start, true);
 	/*
 	 * At this point the pmd is numa/protnone (i.e. non present)