Message ID | 20210218231202.15426-1-peterx@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | hugetlb: Disable huge pmd unshare for uffd-wp |
Hi Peter,

On Fri, 19 Feb 2021 at 04:43, Peter Xu <peterx@redhat.com> wrote:
>
> Huge pmd sharing could bring problem to userfaultfd.  The thing is that
> userfaultfd is running its logic based on the special bits on page table
> entries, however the huge pmd sharing could potentially share page table
> entries for different address ranges.  That could cause issues on either:
>
> - When sharing huge pmd page tables for an uffd write protected range, the
>   newly mapped huge pmd range will also be write protected unexpectedly, or,
>
> - When we try to write protect a range of huge pmd shared range, we'll first
>   do huge_pmd_unshare() in hugetlb_change_protection(), however that also
>   means the UFFDIO_WRITEPROTECT could be silently skipped for the shared
>   region, which could lead to data loss.
>
> Since at it, a few other things are done altogether:
>
> - Move want_pmd_share() from mm/hugetlb.c into linux/hugetlb.h, because
>   that's definitely something that arch code would like to use too
>
> - ARM64 currently directly check against CONFIG_ARCH_WANT_HUGE_PMD_SHARE when
>   trying to share huge pmd.  Switch to the want_pmd_share() helper.
>
> Since at it, move vma_shareable() from huge_pmd_share() into want_pmd_share().
>
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  arch/arm64/mm/hugetlbpage.c   |  3 +--
>  include/linux/hugetlb.h       |  2 ++
>  include/linux/userfaultfd_k.h |  9 +++++++++
>  mm/hugetlb.c                  | 20 ++++++++++++++------
>  4 files changed, 26 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index 6e3bcffe2837..58987a98e179 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -284,8 +284,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
>  		 */
>  		ptep = pte_alloc_map(mm, pmdp, addr);
>  	} else if (sz == PMD_SIZE) {
> -		if (IS_ENABLED(CONFIG_ARCH_WANT_HUGE_PMD_SHARE) &&
> -		    pud_none(READ_ONCE(*pudp)))
> +		if (want_pmd_share(vma, addr) && pud_none(READ_ONCE(*pudp)))

While building the Linux next 20210310 tag for the arm64 architecture with

  CONFIG_ARM64_64K_PAGES=y

enabled, the build failed due to the below errors / warnings:

  make --silent --keep-going --jobs=8 \
    O=/home/tuxbuild/.cache/tuxmake/builds/1/tmp ARCH=arm64 \
    CROSS_COMPILE=aarch64-linux-gnu- 'CC=sccache aarch64-linux-gnu-gcc' \
    'HOSTCC=sccache gcc'

  aarch64-linux-gnu-ld: Unexpected GOT/PLT entries detected!
  aarch64-linux-gnu-ld: Unexpected run-time procedure linkages detected!
  aarch64-linux-gnu-ld: arch/arm64/mm/hugetlbpage.o: in function `huge_pte_alloc':
  hugetlbpage.c:(.text+0x7d8): undefined reference to `want_pmd_share'

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>

Steps to reproduce:
----------------------------
# TuxMake is a command line tool and Python library that provides
# portable and repeatable Linux kernel builds across a variety of
# architectures, toolchains, kernel configurations, and make targets.
#
# TuxMake supports the concept of runtimes.
# See https://docs.tuxmake.org/runtimes/; for that to work it requires
# that you install podman or docker on your system.
#
# To install tuxmake on your system globally:
# sudo pip3 install -U tuxmake
#
# See https://docs.tuxmake.org/ for complete documentation.

tuxmake --runtime podman --target-arch arm64 --toolchain gcc-9 \
  --kconfig defconfig \
  --kconfig-add https://builds.tuxbuild.com/1pYCSoc1oGtPWlPgLAJxbHx07kL/config

Build link,
https://builds.tuxbuild.com/1pYCSoc1oGtPWlPgLAJxbHx07kL/
On Wed, Mar 10, 2021 at 01:18:42PM +0530, Naresh Kamboju wrote:
> Hi Peter,

Hi, Naresh,

> On Fri, 19 Feb 2021 at 04:43, Peter Xu <peterx@redhat.com> wrote:

[...]

> While building Linux next 20210310 tag for arm64 architecture with
>
> - CONFIG_ARM64_64K_PAGES=y
>
> enabled the build failed due to below errors / warnings
>
> make --silent --keep-going --jobs=8
> O=/home/tuxbuild/.cache/tuxmake/builds/1/tmp ARCH=arm64
> CROSS_COMPILE=aarch64-linux-gnu- 'CC=sccache aarch64-linux-gnu-gcc'
> 'HOSTCC=sccache gcc'
> aarch64-linux-gnu-ld: Unexpected GOT/PLT entries detected!
> aarch64-linux-gnu-ld: Unexpected run-time procedure linkages detected!
> aarch64-linux-gnu-ld: arch/arm64/mm/hugetlbpage.o: in function `huge_pte_alloc':
> hugetlbpage.c:(.text+0x7d8): undefined reference to `want_pmd_share'
>
> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>

Sorry for the issue & thanks for the report.  Would you please check whether
the patch attached could fix the issue?
On Wed, 10 Mar 2021 at 22:27, Peter Xu <peterx@redhat.com> wrote:
>
> On Wed, Mar 10, 2021 at 01:18:42PM +0530, Naresh Kamboju wrote:
> > Hi Peter,
>
> Hi, Naresh,
>
[...]
> > aarch64-linux-gnu-ld: arch/arm64/mm/hugetlbpage.o: in function `huge_pte_alloc':
> > hugetlbpage.c:(.text+0x7d8): undefined reference to `want_pmd_share'
> >
> > Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
>
> Sorry for the issue & thanks for the report.  Would you please check whether
> the patch attached could fix the issue?

The attached patch was build tested, and the build passes for arm64, including
the 64k pages config:

  CONFIG_ARM64_64K_PAGES=y

- Naresh
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 6e3bcffe2837..58987a98e179 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -284,8 +284,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
 		 */
 		ptep = pte_alloc_map(mm, pmdp, addr);
 	} else if (sz == PMD_SIZE) {
-		if (IS_ENABLED(CONFIG_ARCH_WANT_HUGE_PMD_SHARE) &&
-		    pud_none(READ_ONCE(*pudp)))
+		if (want_pmd_share(vma, addr) && pud_none(READ_ONCE(*pudp)))
 			ptep = huge_pmd_share(mm, vma, addr, pudp);
 		else
 			ptep = (pte_t *)pmd_alloc(mm, pudp, addr);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index a6113fa6d21d..bc86f2f516e7 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -950,4 +950,6 @@ static inline __init void hugetlb_cma_check(void)
 }
 #endif
 
+bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr);
+
 #endif /* _LINUX_HUGETLB_H */
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index a8e5f3ea9bb2..c63ccdae3eab 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -52,6 +52,15 @@ static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
 	return vma->vm_userfaultfd_ctx.ctx == vm_ctx.ctx;
 }
 
+/*
+ * Never enable huge pmd sharing on uffd-wp registered vmas, because uffd-wp
+ * protect information is per pgtable entry.
+ */
+static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & VM_UFFD_WP;
+}
+
 static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 {
 	return vma->vm_flags & VM_UFFD_MISSING;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 07bb9bdc3282..8e8e2f3dfe06 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5292,6 +5292,18 @@ static bool vma_shareable(struct vm_area_struct *vma, unsigned long addr)
 	return false;
 }
 
+bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr)
+{
+#ifndef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
+	return false;
+#endif
+#ifdef CONFIG_USERFAULTFD
+	if (uffd_disable_huge_pmd_share(vma))
+		return false;
+#endif
+	return vma_shareable(vma, addr);
+}
+
 /*
  * Determine if start,end range within vma could be mapped by shared pmd.
  * If yes, adjust start and end to cover range associated with possible
@@ -5346,9 +5358,6 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 	pte_t *pte;
 	spinlock_t *ptl;
 
-	if (!vma_shareable(vma, addr))
-		return (pte_t *)pmd_alloc(mm, pud, addr);
-
 	i_mmap_assert_locked(mapping);
 	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
 		if (svma == vma)
@@ -5412,7 +5421,7 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
 	*addr = ALIGN(*addr, HPAGE_SIZE * PTRS_PER_PTE) - HPAGE_SIZE;
 	return 1;
 }
-#define want_pmd_share()	(1)
+
 #else /* !CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
 pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 		      unsigned long addr, pud_t *pud)
@@ -5430,7 +5439,6 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
 				unsigned long *start, unsigned long *end)
 {
 }
-#define want_pmd_share()	(0)
 #endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
 
 #ifdef CONFIG_ARCH_WANT_GENERAL_HUGETLB
@@ -5452,7 +5460,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
 		pte = (pte_t *)pud;
 	} else {
 		BUG_ON(sz != PMD_SIZE);
-		if (want_pmd_share() && pud_none(*pud))
+		if (want_pmd_share(vma, addr) && pud_none(*pud))
 			pte = huge_pmd_share(mm, vma, addr, pud);
 		else
 			pte = (pte_t *)pmd_alloc(mm, pud, addr);