[v8,10/23] mm/shmem: Handle uffd-wp during fork()

Message ID 20220405014855.14468-1-peterx@redhat.com (mailing list archive)
State New
Series userfaultfd-wp: Support shmem and hugetlbfs

Commit Message

Peter Xu April 5, 2022, 1:48 a.m. UTC
Normally fork() skips copying the page tables for VM_SHARED shmem, but we can
no longer skip the copy if uffd-wp is enabled on the dst vma.  This should only
happen when the src uffd has UFFD_FEATURE_EVENT_FORK enabled on a uffd-wp
shmem vma, so that VM_UFFD_WP is propagated onto the dst vma too.  In that
case we must copy the pgtables together with the uffd-wp bits and pte markers,
because this information would otherwise be lost.

Since the condition checks will become even more complicated for deciding
"whether a vma needs to copy the pgtable during fork()", introduce a helper
vma_needs_copy() for it, so everything will be clearer.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/memory.c | 49 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 41 insertions(+), 8 deletions(-)
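
To illustrate the order of the checks described in the commit message, here is
a minimal user-space sketch.  The mock_ names, struct mock_vma and the flag
values are assumptions made up for illustration and are not kernel
definitions; only the order of the checks follows the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values, for illustration only; the kernel's differ. */
#define MOCK_VM_HUGETLB  0x1UL
#define MOCK_VM_PFNMAP   0x2UL
#define MOCK_VM_MIXEDMAP 0x4UL
#define MOCK_VM_UFFD_WP  0x8UL

/* Hypothetical stand-in for the two vma fields the helper looks at. */
struct mock_vma {
	unsigned long vm_flags;
	const void *anon_vma;	/* non-NULL once the vma has anonymous pages */
};

static bool mock_userfaultfd_wp(const struct mock_vma *vma)
{
	return (vma->vm_flags & MOCK_VM_UFFD_WP) != 0;
}

/* Same decision order as vma_needs_copy() in the patch. */
static bool mock_vma_needs_copy(const struct mock_vma *dst,
				const struct mock_vma *src)
{
	/* uffd-wp state lives only in the pgtable, so it must be copied. */
	if (mock_userfaultfd_wp(dst))
		return true;

	if (src->vm_flags & (MOCK_VM_HUGETLB | MOCK_VM_PFNMAP | MOCK_VM_MIXEDMAP))
		return true;

	if (src->anon_vma)
		return true;

	/* Everything else can be refilled lazily by page faults. */
	return false;
}
```

With these mocks, a plain shared mapping (no flags, no anon_vma) yields false,
while the same mapping with MOCK_VM_UFFD_WP set on the dst side yields true.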

Comments

kernel test robot April 6, 2022, 6:16 a.m. UTC | #1
Hi Peter,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on hnaz-mm/master]
[cannot apply to arnd-asm-generic/master linus/master linux/master v5.18-rc1 next-20220405]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Peter-Xu/userfaultfd-wp-Support-shmem-and-hugetlbfs/20220405-100136
base:   https://github.com/hnaz/linux-mm master
config: ia64-buildonly-randconfig-r005-20220405 (https://download.01.org/0day-ci/archive/20220406/202204061453.OXOxSh6e-lkp@intel.com/config)
compiler: ia64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/49e56558a1f453907d2813e1ba94d91f9d102e73
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Peter-Xu/userfaultfd-wp-Support-shmem-and-hugetlbfs/20220405-100136
        git checkout 49e56558a1f453907d2813e1ba94d91f9d102e73
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=ia64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from arch/ia64/include/asm/pgtable.h:153,
                    from include/linux/pgtable.h:6,
                    from arch/ia64/include/asm/uaccess.h:40,
                    from include/linux/uaccess.h:11,
                    from arch/ia64/include/asm/sections.h:11,
                    from include/linux/interrupt.h:21,
                    from include/linux/kernel_stat.h:9,
                    from mm/memory.c:42:
   arch/ia64/include/asm/mmu_context.h: In function 'reload_context':
   arch/ia64/include/asm/mmu_context.h:127:48: warning: variable 'old_rr4' set but not used [-Wunused-but-set-variable]
     127 |         unsigned long rr0, rr1, rr2, rr3, rr4, old_rr4;
         |                                                ^~~~~~~
   In file included from include/linux/mm_inline.h:9,
                    from mm/memory.c:44:
   include/linux/userfaultfd_k.h: In function 'pte_marker_entry_uffd_wp':
   include/linux/userfaultfd_k.h:260:16: error: implicit declaration of function 'is_pte_marker_entry' [-Werror=implicit-function-declaration]
     260 |         return is_pte_marker_entry(entry) &&
         |                ^~~~~~~~~~~~~~~~~~~
   include/linux/userfaultfd_k.h:261:14: error: implicit declaration of function 'pte_marker_get' [-Werror=implicit-function-declaration]
     261 |             (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
         |              ^~~~~~~~~~~~~~
   include/linux/userfaultfd_k.h:261:38: error: 'PTE_MARKER_UFFD_WP' undeclared (first use in this function)
     261 |             (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
         |                                      ^~~~~~~~~~~~~~~~~~
   include/linux/userfaultfd_k.h:261:38: note: each undeclared identifier is reported only once for each function it appears in
   In file included from include/linux/mm_inline.h:10,
                    from mm/memory.c:44:
   include/linux/swapops.h: At top level:
   include/linux/swapops.h:289:20: error: conflicting types for 'is_pte_marker_entry'; have 'bool(swp_entry_t)' {aka '_Bool(swp_entry_t)'}
     289 | static inline bool is_pte_marker_entry(swp_entry_t entry)
         |                    ^~~~~~~~~~~~~~~~~~~
   In file included from include/linux/mm_inline.h:9,
                    from mm/memory.c:44:
   include/linux/userfaultfd_k.h:260:16: note: previous implicit declaration of 'is_pte_marker_entry' with type 'int()'
     260 |         return is_pte_marker_entry(entry) &&
         |                ^~~~~~~~~~~~~~~~~~~
   In file included from include/linux/mm_inline.h:10,
                    from mm/memory.c:44:
   include/linux/swapops.h:294:26: error: conflicting types for 'pte_marker_get'; have 'pte_marker(swp_entry_t)' {aka 'long unsigned int(swp_entry_t)'}
     294 | static inline pte_marker pte_marker_get(swp_entry_t entry)
         |                          ^~~~~~~~~~~~~~
   In file included from include/linux/mm_inline.h:9,
                    from mm/memory.c:44:
   include/linux/userfaultfd_k.h:261:14: note: previous implicit declaration of 'pte_marker_get' with type 'int()'
     261 |             (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
         |              ^~~~~~~~~~~~~~
>> mm/memory.c:1238:1: warning: no previous prototype for 'vma_needs_copy' [-Wmissing-prototypes]
    1238 | vma_needs_copy(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
         | ^~~~~~~~~~~~~~
   In file included from include/linux/mm_inline.h:9,
                    from mm/memory.c:44:
   include/linux/userfaultfd_k.h: In function 'pte_marker_entry_uffd_wp':
   include/linux/userfaultfd_k.h:262:1: error: control reaches end of non-void function [-Werror=return-type]
     262 | }
         | ^
   cc1: some warnings being treated as errors


vim +/vma_needs_copy +1238 mm/memory.c

  1231	
  1232	/*
  1233	 * Return true if the vma needs to copy the pgtable during this fork().  Return
  1234	 * false when we can speed up fork() by allowing lazy page faults later until
  1235	 * when the child accesses the memory range.
  1236	 */
  1237	bool
> 1238	vma_needs_copy(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
  1239	{
  1240		/*
  1241		 * Always copy pgtables when dst_vma has uffd-wp enabled even if it's
  1242		 * file-backed (e.g. shmem). Because when uffd-wp is enabled, pgtable
  1243		 * contains uffd-wp protection information, that's something we can't
  1244		 * retrieve from page cache, and skip copying will lose those info.
  1245		 */
  1246		if (userfaultfd_wp(dst_vma))
  1247			return true;
  1248	
  1249		if (src_vma->vm_flags & (VM_HUGETLB | VM_PFNMAP | VM_MIXEDMAP))
  1250			return true;
  1251	
  1252		if (src_vma->anon_vma)
  1253			return true;
  1254	
  1255		/*
  1256		 * Don't copy ptes where a page fault will fill them correctly.  Fork
  1257		 * becomes much lighter when there are big shared or private readonly
  1258		 * mappings. The tradeoff is that copy_page_range is more efficient
  1259		 * than faulting.
  1260		 */
  1261		return false;
  1262	}
  1263
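
The new warning marked with >> is -Wmissing-prototypes: a function with
external linkage is defined without a prototype in scope.  The thread does not
say how it was resolved, so the following is only a sketch of the two usual
ways to satisfy the compiler.  The helper names are hypothetical.

```c
#include <stdbool.h>

/*
 * Option 1: a helper used in a single file gets internal linkage.
 * No earlier prototype is expected for a static function.
 */
static bool helper_is_local(int x)
{
	return x > 0;
}

/*
 * Option 2: a helper that must stay visible to other files gets a
 * prototype (normally placed in a header) before its definition.
 */
bool helper_is_exported(int x);

bool helper_is_exported(int x)
{
	return helper_is_local(x);
}
```

Since vma_needs_copy() is only called from copy_page_range() in the same
file in this patch, the first option looks like the natural fit here.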
Peter Xu April 6, 2022, 12:18 p.m. UTC | #2
I assume the report below is the same issue as the allnoconfig one Andrew
reported; in other words, once the fixup is squashed, this report, along with
the other one on patch 4, should go away.  Let me know otherwise.

Thanks,

On Wed, Apr 06, 2022 at 02:16:56PM +0800, kernel test robot wrote:
> All warnings (new ones prefixed by >>):
> 
>    In file included from arch/ia64/include/asm/pgtable.h:153,
>                     from include/linux/pgtable.h:6,
>                     from arch/ia64/include/asm/uaccess.h:40,
>                     from include/linux/uaccess.h:11,
>                     from arch/ia64/include/asm/sections.h:11,
>                     from include/linux/interrupt.h:21,
>                     from include/linux/kernel_stat.h:9,
>                     from mm/memory.c:42:
>    arch/ia64/include/asm/mmu_context.h: In function 'reload_context':
>    arch/ia64/include/asm/mmu_context.h:127:48: warning: variable 'old_rr4' set but not used [-Wunused-but-set-variable]
>      127 |         unsigned long rr0, rr1, rr2, rr3, rr4, old_rr4;
>          |                                                ^~~~~~~
>    In file included from include/linux/mm_inline.h:9,
>                     from mm/memory.c:44:
>    include/linux/userfaultfd_k.h: In function 'pte_marker_entry_uffd_wp':
>    include/linux/userfaultfd_k.h:260:16: error: implicit declaration of function 'is_pte_marker_entry' [-Werror=implicit-function-declaration]
>      260 |         return is_pte_marker_entry(entry) &&
>          |                ^~~~~~~~~~~~~~~~~~~
>    include/linux/userfaultfd_k.h:261:14: error: implicit declaration of function 'pte_marker_get' [-Werror=implicit-function-declaration]
>      261 |             (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
>          |              ^~~~~~~~~~~~~~
>    include/linux/userfaultfd_k.h:261:38: error: 'PTE_MARKER_UFFD_WP' undeclared (first use in this function)
>      261 |             (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
>          |                                      ^~~~~~~~~~~~~~~~~~
>    include/linux/userfaultfd_k.h:261:38: note: each undeclared identifier is reported only once for each function it appears in
>    In file included from include/linux/mm_inline.h:10,
>                     from mm/memory.c:44:
>    include/linux/swapops.h: At top level:
>    include/linux/swapops.h:289:20: error: conflicting types for 'is_pte_marker_entry'; have 'bool(swp_entry_t)' {aka '_Bool(swp_entry_t)'}
>      289 | static inline bool is_pte_marker_entry(swp_entry_t entry)
>          |                    ^~~~~~~~~~~~~~~~~~~
>    In file included from include/linux/mm_inline.h:9,
>                     from mm/memory.c:44:
>    include/linux/userfaultfd_k.h:260:16: note: previous implicit declaration of 'is_pte_marker_entry' with type 'int()'
>      260 |         return is_pte_marker_entry(entry) &&
>          |                ^~~~~~~~~~~~~~~~~~~
>    In file included from include/linux/mm_inline.h:10,
>                     from mm/memory.c:44:
>    include/linux/swapops.h:294:26: error: conflicting types for 'pte_marker_get'; have 'pte_marker(swp_entry_t)' {aka 'long unsigned int(swp_entry_t)'}
>      294 | static inline pte_marker pte_marker_get(swp_entry_t entry)
>          |                          ^~~~~~~~~~~~~~
>    In file included from include/linux/mm_inline.h:9,
>                     from mm/memory.c:44:
>    include/linux/userfaultfd_k.h:261:14: note: previous implicit declaration of 'pte_marker_get' with type 'int()'
>      261 |             (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
>          |              ^~~~~~~~~~~~~~
Patch

diff --git a/mm/memory.c b/mm/memory.c
index 1144845ff734..8ba1bb196095 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -867,6 +867,14 @@  copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		if (try_restore_exclusive_pte(src_pte, src_vma, addr))
 			return -EBUSY;
 		return -ENOENT;
+	} else if (is_pte_marker_entry(entry)) {
+		/*
+		 * We should only be copying the pgtable because dst_vma has
+		 * uffd-wp enabled; do a sanity check.
+		 */
+		WARN_ON_ONCE(!userfaultfd_wp(dst_vma));
+		set_pte_at(dst_mm, addr, dst_pte, pte);
+		return 0;
 	}
 	if (!userfaultfd_wp(dst_vma))
 		pte = pte_swp_clear_uffd_wp(pte);
@@ -1221,6 +1229,38 @@  copy_p4d_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
 	return 0;
 }
 
+/*
+ * Return true if the vma needs to copy the pgtable during this fork().  Return
+ * false when we can speed up fork() by allowing lazy page faults later until
+ * when the child accesses the memory range.
+ */
+bool
+vma_needs_copy(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
+{
+	/*
+	 * Always copy pgtables when dst_vma has uffd-wp enabled even if it's
+	 * file-backed (e.g. shmem). Because when uffd-wp is enabled, pgtable
+	 * contains uffd-wp protection information, that's something we can't
+	 * retrieve from page cache, and skip copying will lose those info.
+	 */
+	if (userfaultfd_wp(dst_vma))
+		return true;
+
+	if (src_vma->vm_flags & (VM_HUGETLB | VM_PFNMAP | VM_MIXEDMAP))
+		return true;
+
+	if (src_vma->anon_vma)
+		return true;
+
+	/*
+	 * Don't copy ptes where a page fault will fill them correctly.  Fork
+	 * becomes much lighter when there are big shared or private readonly
+	 * mappings. The tradeoff is that copy_page_range is more efficient
+	 * than faulting.
+	 */
+	return false;
+}
+
 int
 copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 {
@@ -1234,14 +1274,7 @@  copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 	bool is_cow;
 	int ret;
 
-	/*
-	 * Don't copy ptes where a page fault will fill them correctly.
-	 * Fork becomes much lighter when there are big shared or private
-	 * readonly mappings. The tradeoff is that copy_page_range is more
-	 * efficient than faulting.
-	 */
-	if (!(src_vma->vm_flags & (VM_HUGETLB | VM_PFNMAP | VM_MIXEDMAP)) &&
-	    !src_vma->anon_vma)
+	if (!vma_needs_copy(dst_vma, src_vma))
 		return 0;
 
 	if (is_vm_hugetlb_page(src_vma))
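
The first hunk can be summarised with a small user-space model of how a
non-present entry is carried over to the child.  The encoding below
(mock_pte_t and both MOCK_ bits) is an assumption made up for illustration and
does not match the kernel's swap-entry layout; the WARN_ON_ONCE() sanity check
is left out.

```c
#include <stdbool.h>

/* Hypothetical encoding, for illustration only. */
#define MOCK_SWP_UFFD_WP 0x1u	/* uffd-wp bit carried by a swap pte */
#define MOCK_PTE_MARKER  0x2u	/* entry is a pte marker with no page behind it */

typedef unsigned int mock_pte_t;

/* Return the value to install in the child for a non-present entry. */
static mock_pte_t mock_copy_nonpresent(mock_pte_t src, bool dst_uffd_wp)
{
	/* Markers are copied verbatim, as in the new else-if branch. */
	if (src & MOCK_PTE_MARKER)
		return src;

	/* Otherwise drop the uffd-wp bit when the child vma is not uffd-wp. */
	if (!dst_uffd_wp)
		src &= ~MOCK_SWP_UFFD_WP;

	return src;
}
```

In this model a marker entry passes through unchanged, while an ordinary swap
entry keeps its uffd-wp bit only when the dst vma has uffd-wp enabled.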