[v2,0/7] mm: Page fault enhancements

Message ID: 20190905101534.9637-1-peterx@redhat.com

Message

Peter Xu Sept. 5, 2019, 10:15 a.m. UTC
v2:
- resent previous version, rebase only

This series is split out of the userfaultfd-wp series to cover only
the general page fault changes, since it seems to make sense on its own.

Basically it does two things:

  (a) Allows the page fault handlers to respond not only to SIGKILL,
      but also to the rest of the userspace signals (especially for
      user-mode faults), and,

  (b) Allows the page fault retry (VM_FAULT_RETRY) to happen more
      than once.
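
For illustration only, point (b) can be modelled in a few lines of
userspace C. This is a sketch, not code from the series: the flag
names are hypothetical and only loosely modelled on the kernel's
FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_TRIED, and it assumes that the
single-retry behaviour comes from withdrawing the retry permission
after the first attempt. The TOY_MUST_BLOCK outcome is a
simplification of "the handler can no longer ask for a retry".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration; not the kernel's definitions. */
enum {
    SK_FLAG_ALLOW_RETRY = 1 << 0,  /* handler may ask the caller to retry */
    SK_FLAG_TRIED       = 1 << 1,  /* at least one attempt was already made */
};

/* Toy fault state: the handler wants `pending_retries` more retries. */
struct toy_fault {
    int pending_retries;
    int attempts;
};

enum toy_result { TOY_DONE, TOY_RETRY, TOY_MUST_BLOCK };

/* One attempt at resolving the fault. */
enum toy_result toy_handle_fault(struct toy_fault *f, unsigned int flags)
{
    assert(f);
    f->attempts++;
    if (f->pending_retries == 0)
        return TOY_DONE;
    if (!(flags & SK_FLAG_ALLOW_RETRY))
        return TOY_MUST_BLOCK;  /* wants to retry but is not allowed to */
    f->pending_retries--;
    return TOY_RETRY;
}

/* Drive the fault until it stops asking for a retry. */
enum toy_result toy_fault_loop(struct toy_fault *f, bool multi_retry)
{
    unsigned int flags = SK_FLAG_ALLOW_RETRY;
    enum toy_result r;

    while ((r = toy_handle_fault(f, flags)) == TOY_RETRY) {
        flags |= SK_FLAG_TRIED;
        if (!multi_retry)
            flags &= ~SK_FLAG_ALLOW_RETRY;  /* single-retry model */
    }
    return r;
}
```

In this model, a fault that needs three retries completes after four
attempts when multiple retries are allowed, while the single-retry
variant gives up on the second attempt.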

I'm keeping the CC list as in uffd-wp v5; hopefully I'm not sending
too much spam...

Instead of rewriting the cover letter, I'm just pasting the link to
the previous one here, which has more details on why we do this:

  https://patchwork.kernel.org/cover/10691991/

The major change since that version is that we introduce a new page
fault flag, FAULT_FLAG_INTERRUPTIBLE, as suggested by Linus [1], to
represent that we would like the fault handler to respond to
non-fatal signals.  Also, we're now more careful about when to return
immediately from the page fault for such signals.  For example, we
now only check signal_pending() for user-mode page faults, and we
keep the kernel-mode page fault path untouched.  More information can
be found in the separate patches.
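
As a rough illustration of the rule described above, here is a
userspace C sketch of the early-return decision. It is not the code
from the series: struct task_signals and fault_should_return_early()
are hypothetical names, and the two booleans stand in for what the
kernel would learn from fatal_signal_pending() and signal_pending().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the faulting task's signal state. */
struct task_signals {
    bool fatal_pending;     /* e.g. SIGKILL is queued */
    bool nonfatal_pending;  /* e.g. SIGINT or SIGSTOP is queued */
};

/*
 * Decide whether a fault that asked for a retry should instead return
 * right away so that the pending signal can be handled.
 *
 * Fatal signals always win.  Non-fatal signals only count for
 * user-mode faults; kernel-mode faults keep ignoring them.
 */
bool fault_should_return_early(const struct task_signals *sig,
                               bool user_mode_fault)
{
    assert(sig);
    if (sig->fatal_pending)
        return true;
    return user_mode_fault && sig->nonfatal_pending;
}
```

With only a non-fatal signal pending, the sketch returns true for a
user-mode fault and false for a kernel-mode one; with a fatal signal
pending, it returns true in both cases.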

The patchset is only lightly tested on x86.

All comments are very welcome.  Thanks,

[1] https://lkml.org/lkml/2019/6/25/1382

Peter Xu (7):
  mm/gup: Rename "nonblocking" to "locked" where proper
  mm: Introduce FAULT_FLAG_DEFAULT
  mm: Introduce FAULT_FLAG_INTERRUPTIBLE
  mm: Return faster for non-fatal signals in user mode faults
  userfaultfd: Don't retake mmap_sem to emulate NOPAGE
  mm: Allow VM_FAULT_RETRY for multiple times
  mm/gup: Allow VM_FAULT_RETRY for multiple times

 arch/alpha/mm/fault.c           |  7 +--
 arch/arc/mm/fault.c             |  8 +++-
 arch/arm/mm/fault.c             | 14 +++---
 arch/arm64/mm/fault.c           | 16 +++----
 arch/hexagon/mm/vm_fault.c      |  6 +--
 arch/ia64/mm/fault.c            |  6 +--
 arch/m68k/mm/fault.c            | 10 ++--
 arch/microblaze/mm/fault.c      |  6 +--
 arch/mips/mm/fault.c            |  6 +--
 arch/nds32/mm/fault.c           | 12 ++---
 arch/nios2/mm/fault.c           |  8 ++--
 arch/openrisc/mm/fault.c        |  6 +--
 arch/parisc/mm/fault.c          |  9 ++--
 arch/powerpc/mm/fault.c         | 10 ++--
 arch/riscv/mm/fault.c           | 12 ++---
 arch/s390/mm/fault.c            | 11 ++---
 arch/sh/mm/fault.c              |  7 ++-
 arch/sparc/mm/fault_32.c        |  5 +-
 arch/sparc/mm/fault_64.c        |  6 +--
 arch/um/kernel/trap.c           |  7 +--
 arch/unicore32/mm/fault.c       | 11 ++---
 arch/x86/mm/fault.c             |  6 +--
 arch/xtensa/mm/fault.c          |  6 +--
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 12 +++--
 fs/userfaultfd.c                | 28 +-----------
 include/linux/mm.h              | 81 +++++++++++++++++++++++++++++----
 include/linux/sched/signal.h    | 12 +++++
 mm/filemap.c                    |  2 +-
 mm/gup.c                        | 61 ++++++++++++++-----------
 mm/hugetlb.c                    | 14 +++---
 mm/shmem.c                      |  2 +-
 31 files changed, 227 insertions(+), 180 deletions(-)

Comments

Linus Torvalds Sept. 5, 2019, 9:06 p.m. UTC | #1
On Thu, Sep 5, 2019 at 3:15 AM Peter Xu <peterx@redhat.com> wrote:
>
> This series is split out of userfaultfd-wp series to only cover the
> general page fault changes, since it seems to make sense itself.

The series continues to look sane to me, but I'd like VM people to
take a look. I see a few reviewed-by's, it would be nice to see more
comments from people. I'd like to see Andrea in particular say "yeah,
this looks all good to me".

Also a question on how this will get to me - it smells like Andrew's
-mm tree to me, both from a VM and a userfaultfd angle (and looking
around, at least a couple of previous patches by Peter have gone that
way).

And it would be lovely to have actual _numbers_ for the alleged
latency improvements. I 100% believe them, but still, numbers rule.

Talking about latency, what about that retry loop in gup()? That's the
one I'm not at all convinced about. It doesn't check for signals, so
if there is some retry logic, it loops forever. Hmm?

             Linus

Peter Xu Sept. 6, 2019, 6:39 a.m. UTC | #2
On Thu, Sep 05, 2019 at 02:06:04PM -0700, Linus Torvalds wrote:
> On Thu, Sep 5, 2019 at 3:15 AM Peter Xu <peterx@redhat.com> wrote:
> >
> > This series is split out of userfaultfd-wp series to only cover the
> > general page fault changes, since it seems to make sense itself.
> 
> The series continues to look sane to me, but I'd like VM people to
> take a look. I see a few reviewed-by's, it would be nice to see more
> comments from people. I'd like to see Andrea in particular say "yeah,
> this looks all good to me".

Yes, I agree.  I would appreciate it if Andrea or any of the other
mm experts could comment on this patchset.

> 
> Also a question on how this will get to me - it smells like Andrew's
> -mm tree to me, both from a VM and a userfaultfd angle (and looking
> around, at least a couple of previous patches by Peter have gone that
> way).
> 
> And it would be lovely to have actual _numbers_ for the alleged
> latency improvements. I 100% believe them, but still, numbers rule.

If the question was about the userspace signal handling: IMHO it's
not really a latency number that I can measure, but rather a
functional difference, just like the one dfa37dc3fc1f6f wanted to
solve previously (though that solution seemed to cause another issue,
the invalid VMA access mentioned in the cover letter), while this
series should be a cleaner approach.

To be clear about the functional difference: without the userspace
non-fatal signal handling patch in this series ("mm: Return faster
for non-fatal signals in user mode faults"), we can't use Ctrl-C to
stop a program hanging in handle_userfault(), nor can we use gdb to
attach to that process (we can with dfa37dc3fc1f6f, but again it's
not the clean approach).  With this whole series (hence with "mm:
Return faster for non-fatal signals in user mode faults"), we can do
both (Ctrl-C to stop the process, or attach gdb to the hanging
process without hanging gdb).

> 
> Talking about latency, what about that retry loop in gup()? That's the
> one I'm not at all convinced about. It doesn't check for signals, so
> if there is some retry logic, it loops forever. Hmm?

Hmm, that seems to be a valid point... IMHO it'll be fine for
non-fatal signals, because GUP will still call handle_mm_fault()
without FAULT_FLAG_INTERRUPTIBLE, hence the page fault logic should
at least ignore non-fatal signals.  However, I agree that we probably
need a check for fatal signals in __get_user_pages_locked() now.
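
To make the concern concrete, here is a userspace C sketch of a retry
loop that checks for a fatal signal before every attempt. It is only
an assumption about what such a check could look like, not the actual
__get_user_pages_locked() code: struct gup_sim, the sim_* helpers and
the -1 return value are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simulation state; none of this is kernel API. */
struct gup_sim {
    int retries_wanted;  /* retries the fault path asks for; <0 means forever */
    int fatal_after;     /* fatal signal arrives after this many attempts; <0 means never */
    int attempts;        /* attempts made so far */
};

/* Stand-in for a fatal-signal check on the current task. */
bool sim_fatal_signal_pending(const struct gup_sim *s)
{
    assert(s);
    return s->fatal_after >= 0 && s->attempts >= s->fatal_after;
}

/* Stand-in for one fault attempt; returns true if a retry is requested. */
bool sim_fault_wants_retry(struct gup_sim *s)
{
    assert(s);
    s->attempts++;
    if (s->retries_wanted < 0)
        return true;   /* a fault that never stops asking for a retry */
    if (s->retries_wanted == 0)
        return false;  /* fault resolved */
    s->retries_wanted--;
    return true;
}

/* Returns 0 on success, -1 if a fatal signal interrupted the loop. */
int sim_gup_retry_loop(struct gup_sim *s)
{
    for (;;) {
        if (sim_fatal_signal_pending(s))
            return -1;  /* bail out instead of looping forever */
        if (!sim_fault_wants_retry(s))
            return 0;
    }
}
```

Without the check at the top of the loop, a fault path that keeps
asking for a retry would spin forever in this model; with it, the
loop ends as soon as the simulated fatal signal becomes pending.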

Thanks,
