Message ID | 20220718114748.2623-1-namit@vmware.com (mailing list archive) |
---|---|
Headers | show |
Series | userfaultfd: support access/write hints | expand |
On Mon, Jul 18, 2022 at 04:47:44AM -0700, Nadav Amit wrote: > From: Nadav Amit <namit@vmware.com> > > As the next patches are going to introduce more information that needs > to be propagated regarding handled user requests, introduce uffd_flags > that would be used to propagate this information. > > Remove the unused UFFD_FLAGS_SET to avoid confusion in the constant > names. > > Introducing uffd flags also allows to avoid mm/userfaultfd from being > using uapi (e.g., UFFDIO_COPY_MODE_WP). > > Cc: Mike Kravetz <mike.kravetz@oracle.com> > Cc: Hugh Dickins <hughd@google.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Axel Rasmussen <axelrasmussen@google.com> > Cc: Peter Xu <peterx@redhat.com> > Cc: Mike Rapoport <rppt@linux.ibm.com> > Acked-by: David Hildenbrand <david@redhat.com> > Signed-off-by: Nadav Amit <namit@vmware.com> Acked-by: Peter Xu <peterx@redhat.com>
On 18.07.22 13:47, Nadav Amit wrote: > From: Nadav Amit <namit@vmware.com> > > As the next patches are going to introduce more information that needs > to be propagated regarding handled user requests, introduce uffd_flags > that would be used to propagate this information. > > Remove the unused UFFD_FLAGS_SET to avoid confusion in the constant > names. > > Introducing uffd flags also allows to avoid mm/userfaultfd from being > using uapi (e.g., UFFDIO_COPY_MODE_WP). > > Cc: Mike Kravetz <mike.kravetz@oracle.com> > Cc: Hugh Dickins <hughd@google.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Axel Rasmussen <axelrasmussen@google.com> > Cc: Peter Xu <peterx@redhat.com> > Cc: Mike Rapoport <rppt@linux.ibm.com> > Acked-by: David Hildenbrand <david@redhat.com> > Signed-off-by: Nadav Amit <namit@vmware.com> [...] > > int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start, > - unsigned long len, bool enable_wp, > - atomic_t *mmap_changing) > + unsigned long len, > + atomic_t *mmap_changing, uffd_flags_t uffd_flags) > { > + bool enable_wp = uffd_flags & UFFD_FLAGS_WP; Could be that this will trigger a sparse warnings, but I haven't fully understood yet when/how sparse will start to complain. If so, this would have to be bool enable_wp = !!(uffd_flags & UFFD_FLAGS_WP); I stumbled into something like that in https://lore.kernel.org/lkml/202202252038.ij1YGn0d-lkp@intel.com/T/ Reviewed-by: David Hildenbrand <david@redhat.com>
On Jul 22, 2022, at 12:54 AM, David Hildenbrand <david@redhat.com> wrote: > ⚠ External Email > > On 18.07.22 13:47, Nadav Amit wrote: >> From: Nadav Amit <namit@vmware.com> >> >> As the next patches are going to introduce more information that needs >> to be propagated regarding handled user requests, introduce uffd_flags >> that would be used to propagate this information. >> >> Remove the unused UFFD_FLAGS_SET to avoid confusion in the constant >> names. >> >> Introducing uffd flags also allows to avoid mm/userfaultfd from being >> using uapi (e.g., UFFDIO_COPY_MODE_WP). >> >> Cc: Mike Kravetz <mike.kravetz@oracle.com> >> Cc: Hugh Dickins <hughd@google.com> >> Cc: Andrew Morton <akpm@linux-foundation.org> >> Cc: Axel Rasmussen <axelrasmussen@google.com> >> Cc: Peter Xu <peterx@redhat.com> >> Cc: Mike Rapoport <rppt@linux.ibm.com> >> Acked-by: David Hildenbrand <david@redhat.com> >> Signed-off-by: Nadav Amit <namit@vmware.com> > > [...] > >> int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start, >> - unsigned long len, bool enable_wp, >> - atomic_t *mmap_changing) >> + unsigned long len, >> + atomic_t *mmap_changing, uffd_flags_t uffd_flags) >> { >> + bool enable_wp = uffd_flags & UFFD_FLAGS_WP; > > Could be that this will trigger a sparse warnings, but I haven't fully > understood yet when/how sparse will start to complain. If so, this would > have to be > > bool enable_wp = !!(uffd_flags & UFFD_FLAGS_WP); > > I stumbled into something like that in > https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F202202252038.ij1YGn0d-lkp%40intel.com%2FT%2F&data=05%7C01%7Cnamit%40vmware.com%7Cc237032d11f04972fdb708da6bb77ada%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637940733049609220%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=qW0xe6sS7PjP3papl890GoPTbZ97iE%2Ffztt1rA9t6%2F0%3D&reserved=0 Oh, damn. Thanks for pointing it out. Sparse gives me segmentation faults for some reason, but I guess it should be addressed - just in case.
On Mon, Jul 18, 2022 at 04:47:44AM -0700, Nadav Amit wrote: > From: Nadav Amit <namit@vmware.com> > > As the next patches are going to introduce more information that needs > to be propagated regarding handled user requests, introduce uffd_flags > that would be used to propagate this information. > > Remove the unused UFFD_FLAGS_SET to avoid confusion in the constant > names. > > Introducing uffd flags also allows to avoid mm/userfaultfd from being > using uapi (e.g., UFFDIO_COPY_MODE_WP). > > Cc: Mike Kravetz <mike.kravetz@oracle.com> > Cc: Hugh Dickins <hughd@google.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Axel Rasmussen <axelrasmussen@google.com> > Cc: Peter Xu <peterx@redhat.com> > Cc: Mike Rapoport <rppt@linux.ibm.com> > Acked-by: David Hildenbrand <david@redhat.com> > Signed-off-by: Nadav Amit <namit@vmware.com> > --- > fs/userfaultfd.c | 22 +++++++++++--- > include/linux/hugetlb.h | 4 +-- > include/linux/shmem_fs.h | 8 +++-- > include/linux/userfaultfd_k.h | 24 +++++++++------ > mm/hugetlb.c | 3 +- > mm/shmem.c | 6 ++-- > mm/userfaultfd.c | 57 ++++++++++++++++++----------------- > 7 files changed, 73 insertions(+), 51 deletions(-) > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c > index e943370107d0..2ae24327beec 100644 > --- a/fs/userfaultfd.c > +++ b/fs/userfaultfd.c > @@ -1682,6 +1682,8 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, > struct uffdio_copy uffdio_copy; > struct uffdio_copy __user *user_uffdio_copy; > struct userfaultfd_wake_range range; > + bool mode_wp; > + uffd_flags_t uffd_flags; > > user_uffdio_copy = (struct uffdio_copy __user *) arg; > > @@ -1708,10 +1710,15 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, > goto out; > if (uffdio_copy.mode & ~(UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP)) > goto out; > + > + mode_wp = uffdio_copy.mode & UFFDIO_COPY_MODE_WP; This seems to be the only place where mode_wp is used in this function. I'd just drop it, and set uffd_flags directly from uffdio_copy.mode. E.g. something like uffd_flags_t uffd_flags = UFFD_FLAGS_NONE; ... if (uffdio_copy.mode & UFFDIO_COPY_MODE_WP) uffd_flags = UFFD_FLAGS_WP; Otherwise Acked-by: Mike Rapoport <rppt@linux.ibm.com> > + > + uffd_flags = mode_wp ? UFFD_FLAGS_WP : UFFD_FLAGS_NONE; > + > if (mmget_not_zero(ctx->mm)) { > ret = mcopy_atomic(ctx->mm, uffdio_copy.dst, uffdio_copy.src, > uffdio_copy.len, &ctx->mmap_changing, > - uffdio_copy.mode); > + uffd_flags); > mmput(ctx->mm); > } else { > return -ESRCH;
> On Jul 23, 2022, at 2:12 AM, Mike Rapoport <rppt@linux.ibm.com> wrote: > > On Mon, Jul 18, 2022 at 04:47:44AM -0700, Nadav Amit wrote: >> From: Nadav Amit <namit@vmware.com> >> >> As the next patches are going to introduce more information that needs >> to be propagated regarding handled user requests, introduce uffd_flags >> that would be used to propagate this information. >> >> Remove the unused UFFD_FLAGS_SET to avoid confusion in the constant >> names. >> >> Introducing uffd flags also allows to avoid mm/userfaultfd from being >> using uapi (e.g., UFFDIO_COPY_MODE_WP). >> >> Cc: Mike Kravetz <mike.kravetz@oracle.com> >> Cc: Hugh Dickins <hughd@google.com> >> Cc: Andrew Morton <akpm@linux-foundation.org> >> Cc: Axel Rasmussen <axelrasmussen@google.com> >> Cc: Peter Xu <peterx@redhat.com> >> Cc: Mike Rapoport <rppt@linux.ibm.com> >> Acked-by: David Hildenbrand <david@redhat.com> >> Signed-off-by: Nadav Amit <namit@vmware.com> >> --- >> fs/userfaultfd.c | 22 +++++++++++--- >> include/linux/hugetlb.h | 4 +-- >> include/linux/shmem_fs.h | 8 +++-- >> include/linux/userfaultfd_k.h | 24 +++++++++------ >> mm/hugetlb.c | 3 +- >> mm/shmem.c | 6 ++-- >> mm/userfaultfd.c | 57 ++++++++++++++++++----------------- >> 7 files changed, 73 insertions(+), 51 deletions(-) >> >> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c >> index e943370107d0..2ae24327beec 100644 >> --- a/fs/userfaultfd.c >> +++ b/fs/userfaultfd.c >> @@ -1682,6 +1682,8 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, >> struct uffdio_copy uffdio_copy; >> struct uffdio_copy __user *user_uffdio_copy; >> struct userfaultfd_wake_range range; >> + bool mode_wp; >> + uffd_flags_t uffd_flags; >> >> user_uffdio_copy = (struct uffdio_copy __user *) arg; >> >> @@ -1708,10 +1710,15 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, >> goto out; >> if (uffdio_copy.mode & ~(UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP)) >> goto out; >> + >> + mode_wp = uffdio_copy.mode & UFFDIO_COPY_MODE_WP; > > This seems to be the only place where mode_wp is used in this function. > I'd just drop it, and set uffd_flags directly from uffdio_copy.mode. E.g. > something like > > uffd_flags_t uffd_flags = UFFD_FLAGS_NONE; > > ... > > if (uffdio_copy.mode & UFFDIO_COPY_MODE_WP) > uffd_flags = UFFD_FLAGS_WP; Good point; taken.
From: Nadav Amit <namit@vmware.com> This patch-set introduces access/write hints for userfaultfd. Unlike the previous versions, the use of these hints in this version is limited. Yet, in order to keep introducing new features again and again, hints are introduced for all of uffd related ioctls. The access-hint is currently used to set the young bit, similarly to do_set_pte(). This has no effect on x86, but may on arm64. When a write-hint is provided on zeropage ioctl, a clear page is allocated instead of mapping the zero-page. Future patches would use the write-hint to decide whether to map the writable pages on write-(un)protect ioctl. Setting the access-bit and dirty-bit introduces a tradeoff. When the bit is set access/write is faster, but memory reclamation might be slower. Currently, in the common userfaultfd cases the access-bit is not set on and the dirty-bit is set. This is a questionable behavior. Allow userspace to control this behavior through hints access- and write-likely hints. These hints are used to control access- and dirty-bits. For zero-pages that with write-likely hint, allocate a clear page instead of mapping the zero-page. v1 -> v2: * Leave dirty-bit as it was before [Peter Xu] RFCv2 -> v1: * Adding hints to zeropage and continue * Fixing other issues pointed by David H. & Peter Xu * Adding tests to ./run_vmtests.sh Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Nadav Amit (5): userfaultfd: introduce uffd_flags userfaultfd: introduce access-likely mode for common operations userfaultfd: introduce write-likely mode for uffd operations userfaultfd: zero access/write hints selftest/userfaultfd: test read/write hints fs/userfaultfd.c | 77 ++++++++++++++-- include/linux/hugetlb.h | 4 +- include/linux/shmem_fs.h | 8 +- include/linux/userfaultfd_k.h | 26 ++++-- include/uapi/linux/userfaultfd.h | 31 ++++++- mm/hugetlb.c | 3 +- mm/internal.h | 13 +++ mm/memory.c | 12 --- mm/shmem.c | 6 +- mm/userfaultfd.c | 103 +++++++++++++++------- tools/testing/selftests/vm/run_vmtests.sh | 16 ++-- tools/testing/selftests/vm/userfaultfd.c | 54 +++++++++++- 12 files changed, 275 insertions(+), 78 deletions(-)