Message ID | 20190923042523.10027-6-peterx@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm: Page fault enhancements | expand |
On Sun, Sep 22, 2019 at 9:26 PM Peter Xu <peterx@redhat.com> wrote: > > This patch is a preparation of removing that special path by allowing > the page fault to return even faster if we were interrupted by a > non-fatal signal during a user-mode page fault handling routine. So I really wish saome other vm person would also review these things, but looking over this series once more, this is the patch I probably like the least. And the reason I like it the least is that I have a hard time explaining to myself what the code does and why, and why it's so full of this pattern: > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > return; which isn't all that pretty. Why isn't this just static bool fault_signal_pending(unsigned int fault_flags, struct pt_regs *regs) { return (fault_flags & VM_FAULT_RETRY) && (fatal_signal_pending(current) || (user_mode(regs) && signal_pending(current))); } and then most of the users would be something like if (fault_signal_pending(fault, regs)) return; and the exceptions could do their own thing. Now the code is prettier and more understandable, I feel. And if something doesn't follow this pattern, maybe it either _should_ follow that pattern or it should just not use the helper but explain why it has an unusual pattern. Linus
On Mon, Sep 23, 2019 at 11:03:49AM -0700, Linus Torvalds wrote: > On Sun, Sep 22, 2019 at 9:26 PM Peter Xu <peterx@redhat.com> wrote: > > > > This patch is a preparation of removing that special path by allowing > > the page fault to return even faster if we were interrupted by a > > non-fatal signal during a user-mode page fault handling routine. > > So I really wish saome other vm person would also review these things, > but looking over this series once more, this is the patch I probably > like the least. > > And the reason I like it the least is that I have a hard time > explaining to myself what the code does and why, and why it's so full > of this pattern: > > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > > + if ((fault & VM_FAULT_RETRY) && > > + fault_should_check_signal(user_mode(regs))) > > return; > > which isn't all that pretty. > > Why isn't this just > > static bool fault_signal_pending(unsigned int fault_flags, struct > pt_regs *regs) > { > return (fault_flags & VM_FAULT_RETRY) && > (fatal_signal_pending(current) || > (user_mode(regs) && signal_pending(current))); > } > > and then most of the users would be something like > > if (fault_signal_pending(fault, regs)) > return; > > and the exceptions could do their own thing. > > Now the code is prettier and more understandable, I feel. > > And if something doesn't follow this pattern, maybe it either _should_ > follow that pattern or it should just not use the helper but explain > why it has an unusual pattern. I see the point on why this patch is disliked - Yeh it should look better to have a single function to cover the most common cases. Besides, I attempted to squash the extra signal_pending() check into some existing code path but maybe it's not really benefiting much while instead it makes the review even harder. So I plan to isolate those paths out too, from something like: ==================================== --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -291,14 +291,15 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) fault = __do_page_fault(mm, addr, fsr, flags, tsk); - /* If we need to retry but a fatal signal is pending, handle the + /* If we need to retry but a signal is pending, try to handle the * signal first. We do not need to release the mmap_sem because * it would already be released in __lock_page_or_retry in * mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - if (!user_mode(regs)) + if (unlikely(fault & VM_FAULT_RETRY && signal_pending(current))) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; - return 0; + if (user_mode(regs)) + return 0; } ==================================== into: ==================================== --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -301,6 +301,11 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) return 0; } + /* Fast path to handle user mode signals */ + if ((fault & VM_FAULT_RETRY) && user_mode(regs) && + signal_pending(current)) + return 0; + /* * Major/minor page fault accounting is only done on the * initial attempt. If we go through a retry, it is extremely ==================================== I hope it'll be better with that. A complete patch attached too. Thanks,
On Tue, Sep 24, 2019 at 10:47:21AM +0800, Peter Xu wrote: > On Mon, Sep 23, 2019 at 11:03:49AM -0700, Linus Torvalds wrote: > > On Sun, Sep 22, 2019 at 9:26 PM Peter Xu <peterx@redhat.com> wrote: > > > > > > This patch is a preparation of removing that special path by allowing > > > the page fault to return even faster if we were interrupted by a > > > non-fatal signal during a user-mode page fault handling routine. > > > > So I really wish saome other vm person would also review these things, > > but looking over this series once more, this is the patch I probably > > like the least. > > > > And the reason I like it the least is that I have a hard time > > explaining to myself what the code does and why, and why it's so full > > of this pattern: > > > > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > > > + if ((fault & VM_FAULT_RETRY) && > > > + fault_should_check_signal(user_mode(regs))) > > > return; > > > > which isn't all that pretty. > > > > Why isn't this just > > > > static bool fault_signal_pending(unsigned int fault_flags, struct > > pt_regs *regs) > > { > > return (fault_flags & VM_FAULT_RETRY) && > > (fatal_signal_pending(current) || > > (user_mode(regs) && signal_pending(current))); > > } > > > > and then most of the users would be something like > > > > if (fault_signal_pending(fault, regs)) > > return; > > > > and the exceptions could do their own thing. > > > > Now the code is prettier and more understandable, I feel. > > > > And if something doesn't follow this pattern, maybe it either _should_ > > follow that pattern or it should just not use the helper but explain > > why it has an unusual pattern. > +++ b/arch/alpha/mm/fault.c > @@ -150,7 +150,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr, > the fault. */ > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if (fault_signal_pending(fault, regs)) > return; > > if (unlikely(fault & VM_FAULT_ERROR)) { > +++ b/arch/arm/mm/fault.c > @@ -301,6 +301,11 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) > return 0; > } > > + /* Fast path to handle user mode signals */ > + if ((fault & VM_FAULT_RETRY) && user_mode(regs) && > + signal_pending(current)) > + return 0; But _why_ are they different? This is a good opportunity to make more code the same between architectures.
On Mon, Sep 23, 2019 at 07:54:47PM -0700, Matthew Wilcox wrote: > On Tue, Sep 24, 2019 at 10:47:21AM +0800, Peter Xu wrote: > > On Mon, Sep 23, 2019 at 11:03:49AM -0700, Linus Torvalds wrote: > > > On Sun, Sep 22, 2019 at 9:26 PM Peter Xu <peterx@redhat.com> wrote: > > > > > > > > This patch is a preparation of removing that special path by allowing > > > > the page fault to return even faster if we were interrupted by a > > > > non-fatal signal during a user-mode page fault handling routine. > > > > > > So I really wish saome other vm person would also review these things, > > > but looking over this series once more, this is the patch I probably > > > like the least. > > > > > > And the reason I like it the least is that I have a hard time > > > explaining to myself what the code does and why, and why it's so full > > > of this pattern: > > > > > > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > > > > + if ((fault & VM_FAULT_RETRY) && > > > > + fault_should_check_signal(user_mode(regs))) > > > > return; > > > > > > which isn't all that pretty. > > > > > > Why isn't this just > > > > > > static bool fault_signal_pending(unsigned int fault_flags, struct > > > pt_regs *regs) > > > { > > > return (fault_flags & VM_FAULT_RETRY) && > > > (fatal_signal_pending(current) || > > > (user_mode(regs) && signal_pending(current))); > > > } > > > > > > and then most of the users would be something like > > > > > > if (fault_signal_pending(fault, regs)) > > > return; > > > > > > and the exceptions could do their own thing. > > > > > > Now the code is prettier and more understandable, I feel. > > > > > > And if something doesn't follow this pattern, maybe it either _should_ > > > follow that pattern or it should just not use the helper but explain > > > why it has an unusual pattern. > > > +++ b/arch/alpha/mm/fault.c > > @@ -150,7 +150,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr, > > the fault. */ > > fault = handle_mm_fault(vma, address, flags); > > > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > > + if (fault_signal_pending(fault, regs)) > > return; > > > > if (unlikely(fault & VM_FAULT_ERROR)) { > > > +++ b/arch/arm/mm/fault.c > > @@ -301,6 +301,11 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) > > return 0; > > } > > > > + /* Fast path to handle user mode signals */ > > + if ((fault & VM_FAULT_RETRY) && user_mode(regs) && > > + signal_pending(current)) > > + return 0; > > But _why_ are they different? This is a good opportunity to make more > code the same between architectures. (Thanks for joining the discussion) I'd like to do these - my only worry is that I can't really test them well simply because I don't have all the hardwares. For now the changes are mostly straightforward so I'm relatively confident (not to mention the code needs proper reviews too, and of course I would appreciate much if anyone wants to smoke test it). If I change it in a drastic way, I won't be that confident without some tests at least on multiple archs (not to mention that even smoke testing across major archs will be a huge amount of work...). So IMHO those might be more suitable as follow-up for per-arch developers if we can at least reach a consensus on the whole idea of this patchset. Thanks,
On Tue, Sep 24, 2019 at 11:19:08AM +0800, Peter Xu wrote: > On Mon, Sep 23, 2019 at 07:54:47PM -0700, Matthew Wilcox wrote: > > On Tue, Sep 24, 2019 at 10:47:21AM +0800, Peter Xu wrote: > > > On Mon, Sep 23, 2019 at 11:03:49AM -0700, Linus Torvalds wrote: > > > > On Sun, Sep 22, 2019 at 9:26 PM Peter Xu <peterx@redhat.com> wrote: > > > > > > > > > > This patch is a preparation of removing that special path by allowing > > > > > the page fault to return even faster if we were interrupted by a > > > > > non-fatal signal during a user-mode page fault handling routine. > > > > > > > > So I really wish saome other vm person would also review these things, > > > > but looking over this series once more, this is the patch I probably > > > > like the least. > > > > > > > > And the reason I like it the least is that I have a hard time > > > > explaining to myself what the code does and why, and why it's so full > > > > of this pattern: > > > > > > > > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > > > > > + if ((fault & VM_FAULT_RETRY) && > > > > > + fault_should_check_signal(user_mode(regs))) > > > > > return; > > > > > > > > which isn't all that pretty. > > > > > > > > Why isn't this just > > > > > > > > static bool fault_signal_pending(unsigned int fault_flags, struct > > > > pt_regs *regs) > > > > { > > > > return (fault_flags & VM_FAULT_RETRY) && > > > > (fatal_signal_pending(current) || > > > > (user_mode(regs) && signal_pending(current))); > > > > } > > > > > > > > and then most of the users would be something like > > > > > > > > if (fault_signal_pending(fault, regs)) > > > > return; > > > > > > > > and the exceptions could do their own thing. > > > > > > > > Now the code is prettier and more understandable, I feel. > > > > > > > > And if something doesn't follow this pattern, maybe it either _should_ > > > > follow that pattern or it should just not use the helper but explain > > > > why it has an unusual pattern. > > > > > +++ b/arch/alpha/mm/fault.c > > > @@ -150,7 +150,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr, > > > the fault. */ > > > fault = handle_mm_fault(vma, address, flags); > > > > > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > > > + if (fault_signal_pending(fault, regs)) > > > return; > > > > > > if (unlikely(fault & VM_FAULT_ERROR)) { > > > > > +++ b/arch/arm/mm/fault.c > > > @@ -301,6 +301,11 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) > > > return 0; > > > } > > > > > > + /* Fast path to handle user mode signals */ > > > + if ((fault & VM_FAULT_RETRY) && user_mode(regs) && > > > + signal_pending(current)) > > > + return 0; > > > > But _why_ are they different? This is a good opportunity to make more > > code the same between architectures. > > (Thanks for joining the discussion) > > I'd like to do these - my only worry is that I can't really test them > well simply because I don't have all the hardwares. For now the > changes are mostly straightforward so I'm relatively confident (not to > mention the code needs proper reviews too, and of course I would > appreciate much if anyone wants to smoke test it). If I change it in > a drastic way, I won't be that confident without some tests at least > on multiple archs (not to mention that even smoke testing across major > archs will be a huge amount of work...). So IMHO those might be more > suitable as follow-up for per-arch developers if we can at least reach > a consensus on the whole idea of this patchset. I think the way to do this is to introduce fault_signal_pending(), converting the architectures to it that match that pattern. Then one patch per architecture to convert the ones which use a different pattern to the same pattern. Oh, and while you're looking at the callers of handle_mm_fault(), a lot of them don't check conditions in the right order. x86, at least, handles FAULT_RETRY before handling FAULT_ERROR, which is clearly wrong. Kirill and I recently discussed it here: https://lore.kernel.org/linux-mm/20190911152338.gqqgxrmqycodfocb@box/T/
On Tue, Sep 24, 2019 at 08:45:18AM -0700, Matthew Wilcox wrote: > On Tue, Sep 24, 2019 at 11:19:08AM +0800, Peter Xu wrote: > > On Mon, Sep 23, 2019 at 07:54:47PM -0700, Matthew Wilcox wrote: > > > On Tue, Sep 24, 2019 at 10:47:21AM +0800, Peter Xu wrote: > > > > On Mon, Sep 23, 2019 at 11:03:49AM -0700, Linus Torvalds wrote: > > > > > On Sun, Sep 22, 2019 at 9:26 PM Peter Xu <peterx@redhat.com> wrote: > > > > > > > > > > > > This patch is a preparation of removing that special path by allowing > > > > > > the page fault to return even faster if we were interrupted by a > > > > > > non-fatal signal during a user-mode page fault handling routine. > > > > > > > > > > So I really wish saome other vm person would also review these things, > > > > > but looking over this series once more, this is the patch I probably > > > > > like the least. > > > > > > > > > > And the reason I like it the least is that I have a hard time > > > > > explaining to myself what the code does and why, and why it's so full > > > > > of this pattern: > > > > > > > > > > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > > > > > > + if ((fault & VM_FAULT_RETRY) && > > > > > > + fault_should_check_signal(user_mode(regs))) > > > > > > return; > > > > > > > > > > which isn't all that pretty. > > > > > > > > > > Why isn't this just > > > > > > > > > > static bool fault_signal_pending(unsigned int fault_flags, struct > > > > > pt_regs *regs) > > > > > { > > > > > return (fault_flags & VM_FAULT_RETRY) && > > > > > (fatal_signal_pending(current) || > > > > > (user_mode(regs) && signal_pending(current))); > > > > > } > > > > > > > > > > and then most of the users would be something like > > > > > > > > > > if (fault_signal_pending(fault, regs)) > > > > > return; > > > > > > > > > > and the exceptions could do their own thing. > > > > > > > > > > Now the code is prettier and more understandable, I feel. > > > > > > > > > > And if something doesn't follow this pattern, maybe it either _should_ > > > > > follow that pattern or it should just not use the helper but explain > > > > > why it has an unusual pattern. > > > > > > > +++ b/arch/alpha/mm/fault.c > > > > @@ -150,7 +150,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr, > > > > the fault. */ > > > > fault = handle_mm_fault(vma, address, flags); > > > > > > > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > > > > + if (fault_signal_pending(fault, regs)) > > > > return; > > > > > > > > if (unlikely(fault & VM_FAULT_ERROR)) { > > > > > > > +++ b/arch/arm/mm/fault.c > > > > @@ -301,6 +301,11 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) > > > > return 0; > > > > } > > > > > > > > + /* Fast path to handle user mode signals */ > > > > + if ((fault & VM_FAULT_RETRY) && user_mode(regs) && > > > > + signal_pending(current)) > > > > + return 0; > > > > > > But _why_ are they different? This is a good opportunity to make more > > > code the same between architectures. > > > > (Thanks for joining the discussion) > > > > I'd like to do these - my only worry is that I can't really test them > > well simply because I don't have all the hardwares. For now the > > changes are mostly straightforward so I'm relatively confident (not to > > mention the code needs proper reviews too, and of course I would > > appreciate much if anyone wants to smoke test it). If I change it in > > a drastic way, I won't be that confident without some tests at least > > on multiple archs (not to mention that even smoke testing across major > > archs will be a huge amount of work...). So IMHO those might be more > > suitable as follow-up for per-arch developers if we can at least reach > > a consensus on the whole idea of this patchset. > > I think the way to do this is to introduce fault_signal_pending(), > converting the architectures to it that match that pattern. Then one > patch per architecture to convert the ones which use a different pattern > to the same pattern. Fair enough. I can start with a fault_signal_pending() only keeps the sigkill handling just like before, then convert all the archs, with the last patch to only touch fault_signal_pending() for non-fatal signals. > > Oh, and while you're looking at the callers of handle_mm_fault(), a > lot of them don't check conditions in the right order. x86, at least, > handles FAULT_RETRY before handling FAULT_ERROR, which is clearly wrong. > > Kirill and I recently discussed it here: > https://lore.kernel.org/linux-mm/20190911152338.gqqgxrmqycodfocb@box/T/ Hmm sure. These sound very reasonable. I must admit that I am not brave enough to continue grow my patchset on my own. The condition I'm facing right now is that I can't really find enough reviewers for this series (Linus helped me quite a lot, I really, really, appreciated that), while it's still growing. I hope the started discussion means that you'll be at least another potential reviewer (oh, should I count Kirill in as well? :) at least to the coming patches for the things mentioned above. Thanks,
On Tue, Sep 24, 2019 at 08:45:18AM -0700, Matthew Wilcox wrote: [...] > Oh, and while you're looking at the callers of handle_mm_fault(), a > lot of them don't check conditions in the right order. x86, at least, > handles FAULT_RETRY before handling FAULT_ERROR, which is clearly wrong. > > Kirill and I recently discussed it here: > https://lore.kernel.org/linux-mm/20190911152338.gqqgxrmqycodfocb@box/T/ Is there any existing path in master that we can get VM_FAULT_RETRY returned with any existing VM_FAULT_ERROR bit? It seems to me that above link is the first one that is going to introduce such case? If so, I'm uncertain now on whether I should have one patch to handle the ERROR case first as you suggested with this series, because otherwise that patch won't explain itself without a real benefit... Thanks,
On Sun, 22 Sep 2019 21:25:18 PDT (-0700), peterx@redhat.com wrote: > The idea comes from the upstream discussion between Linus and Andrea: > > https://lore.kernel.org/lkml/20171102193644.GB22686@redhat.com/ > > A summary to the issue: there was a special path in handle_userfault() > in the past that we'll return a VM_FAULT_NOPAGE when we detected > non-fatal signals when waiting for userfault handling. We did that by > reacquiring the mmap_sem before returning. However that brings a risk > in that the vmas might have changed when we retake the mmap_sem and > even we could be holding an invalid vma structure. > > This patch is a preparation of removing that special path by allowing > the page fault to return even faster if we were interrupted by a > non-fatal signal during a user-mode page fault handling routine. > > Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> > Suggested-by: Andrea Arcangeli <aarcange@redhat.com> > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > arch/alpha/mm/fault.c | 3 ++- > arch/arc/mm/fault.c | 5 +++++ > arch/arm/mm/fault.c | 9 +++++---- > arch/arm64/mm/fault.c | 9 +++++---- > arch/hexagon/mm/vm_fault.c | 3 ++- > arch/ia64/mm/fault.c | 3 ++- > arch/m68k/mm/fault.c | 5 +++-- > arch/microblaze/mm/fault.c | 3 ++- > arch/mips/mm/fault.c | 3 ++- > arch/nds32/mm/fault.c | 9 +++++---- > arch/nios2/mm/fault.c | 3 ++- > arch/openrisc/mm/fault.c | 3 ++- > arch/parisc/mm/fault.c | 3 ++- > arch/powerpc/mm/fault.c | 2 ++ > arch/riscv/mm/fault.c | 5 +++-- > arch/s390/mm/fault.c | 4 ++-- > arch/sh/mm/fault.c | 4 ++++ > arch/sparc/mm/fault_32.c | 2 +- > arch/sparc/mm/fault_64.c | 3 ++- > arch/um/kernel/trap.c | 4 +++- > arch/unicore32/mm/fault.c | 5 +++-- > arch/x86/mm/fault.c | 2 ++ > arch/xtensa/mm/fault.c | 3 ++- > include/linux/sched/signal.h | 12 ++++++++++++ > 24 files changed, 75 insertions(+), 32 deletions(-) > > diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c > index de4cc6936391..ab1d4212d658 100644 > --- a/arch/alpha/mm/fault.c > +++ b/arch/alpha/mm/fault.c > @@ -150,7 +150,8 @@ do_page_fault(unsigned long address, unsigned long mmcsr, > the fault. */ > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > return; > > if (unlikely(fault & VM_FAULT_ERROR)) { > diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c > index 61919e4e4eec..27adf4e608e4 100644 > --- a/arch/arc/mm/fault.c > +++ b/arch/arc/mm/fault.c > @@ -142,6 +142,11 @@ void do_page_fault(unsigned long address, struct pt_regs *regs) > goto no_context; > return; > } > + > + /* Allow user to handle non-fatal signals first */ > + if (signal_pending(current) && user_mode(regs)) > + return; > + > /* > * retry state machine > */ > diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c > index 2ae28ffec622..f00fb4eafe54 100644 > --- a/arch/arm/mm/fault.c > +++ b/arch/arm/mm/fault.c > @@ -291,14 +291,15 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) > > fault = __do_page_fault(mm, addr, fsr, flags, tsk); > > - /* If we need to retry but a fatal signal is pending, handle the > + /* If we need to retry but a signal is pending, try to handle the > * signal first. We do not need to release the mmap_sem because > * it would already be released in __lock_page_or_retry in > * mm/filemap.c. */ > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { > - if (!user_mode(regs)) > + if (unlikely(fault & VM_FAULT_RETRY && signal_pending(current))) { > + if (fatal_signal_pending(current) && !user_mode(regs)) > goto no_context; > - return 0; > + if (user_mode(regs)) > + return 0; > } > > /* > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c > index 613e7434c208..0d3fe0ea6a70 100644 > --- a/arch/arm64/mm/fault.c > +++ b/arch/arm64/mm/fault.c > @@ -479,15 +479,16 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, > > if (fault & VM_FAULT_RETRY) { > /* > - * If we need to retry but a fatal signal is pending, > + * If we need to retry but a signal is pending, try to > * handle the signal first. We do not need to release > * the mmap_sem because it would already be released > * in __lock_page_or_retry in mm/filemap.c. > */ > - if (fatal_signal_pending(current)) { > - if (!user_mode(regs)) > + if (signal_pending(current)) { > + if (fatal_signal_pending(current) && !user_mode(regs)) > goto no_context; > - return 0; > + if (user_mode(regs)) > + return 0; > } > > /* > diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c > index 223787e01bdd..88a2e5635bfb 100644 > --- a/arch/hexagon/mm/vm_fault.c > +++ b/arch/hexagon/mm/vm_fault.c > @@ -91,7 +91,8 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) > > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > return; > > /* The most common case -- we are done. */ > diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c > index d039b846f671..8d47acf50fda 100644 > --- a/arch/ia64/mm/fault.c > +++ b/arch/ia64/mm/fault.c > @@ -141,7 +141,8 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re > */ > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > return; > > if (unlikely(fault & VM_FAULT_ERROR)) { > diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c > index 8e734309ace9..103f93ba8139 100644 > --- a/arch/m68k/mm/fault.c > +++ b/arch/m68k/mm/fault.c > @@ -138,8 +138,9 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, > fault = handle_mm_fault(vma, address, flags); > pr_debug("handle_mm_fault returns %x\n", fault); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > - return 0; > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > + return; > > if (unlikely(fault & VM_FAULT_ERROR)) { > if (fault & VM_FAULT_OOM) > diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c > index 45c9f66c1dbc..8b0615eab4b6 100644 > --- a/arch/microblaze/mm/fault.c > +++ b/arch/microblaze/mm/fault.c > @@ -217,7 +217,8 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, > */ > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > return; > > if (unlikely(fault & VM_FAULT_ERROR)) { > diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c > index 6660b77ff8f3..48aac20a1ded 100644 > --- a/arch/mips/mm/fault.c > +++ b/arch/mips/mm/fault.c > @@ -154,7 +154,8 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, > */ > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > return; > > perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); > diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c > index a40de112a23a..baa44f9d0b4a 100644 > --- a/arch/nds32/mm/fault.c > +++ b/arch/nds32/mm/fault.c > @@ -206,14 +206,15 @@ void do_page_fault(unsigned long entry, unsigned long addr, > fault = handle_mm_fault(vma, addr, flags); > > /* > - * If we need to retry but a fatal signal is pending, handle the > + * If we need to retry but a signal is pending, try to handle the > * signal first. We do not need to release the mmap_sem because it > * would already be released in __lock_page_or_retry in mm/filemap.c. > */ > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { > - if (!user_mode(regs)) > + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) { > + if (fatal_signal_pending(current) && !user_mode(regs)) > goto no_context; > - return; > + if (user_mode(regs)) > + return; > } > > if (unlikely(fault & VM_FAULT_ERROR)) { > diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c > index a401b45cae47..f9f178484184 100644 > --- a/arch/nios2/mm/fault.c > +++ b/arch/nios2/mm/fault.c > @@ -133,7 +133,8 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, > */ > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > return; > > if (unlikely(fault & VM_FAULT_ERROR)) { > diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c > index fd1592a56238..8ba3696dd10c 100644 > --- a/arch/openrisc/mm/fault.c > +++ b/arch/openrisc/mm/fault.c > @@ -161,7 +161,8 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, > > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > return; > > if (unlikely(fault & VM_FAULT_ERROR)) { > diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c > index 355e3e13fa72..163dcb080c7b 100644 > --- a/arch/parisc/mm/fault.c > +++ b/arch/parisc/mm/fault.c > @@ -304,7 +304,8 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, > > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > return; > > if (unlikely(fault & VM_FAULT_ERROR)) { > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c > index 408ee769c470..d321a6c5fe62 100644 > --- a/arch/powerpc/mm/fault.c > +++ b/arch/powerpc/mm/fault.c > @@ -596,6 +596,8 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, > */ > flags &= ~FAULT_FLAG_ALLOW_RETRY; > flags |= FAULT_FLAG_TRIED; > + if (is_user && signal_pending(current)) > + return 0; > if (!fatal_signal_pending(current)) > goto retry; > } > diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c > index deeb820bd855..ea8f301de65b 100644 > --- a/arch/riscv/mm/fault.c > +++ b/arch/riscv/mm/fault.c > @@ -111,11 +111,12 @@ asmlinkage void do_page_fault(struct pt_regs *regs) > fault = handle_mm_fault(vma, addr, flags); > > /* > - * If we need to retry but a fatal signal is pending, handle the > + * If we need to retry but a signal is pending, try to handle the > * signal first. We do not need to release the mmap_sem because it > * would already be released in __lock_page_or_retry in mm/filemap.c. > */ > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(tsk)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > return; > > if (unlikely(fault & VM_FAULT_ERROR)) { Acked-by: Palmer Dabbelt <palmer@sifive.com> # RISC-V parts I'm assuming this is going in through some other tree. > diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c > index 74a77b2bca75..3ad77501deef 100644 > --- a/arch/s390/mm/fault.c > +++ b/arch/s390/mm/fault.c > @@ -480,8 +480,8 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) > * the fault. > */ > fault = handle_mm_fault(vma, address, flags); > - /* No reason to continue if interrupted by SIGKILL. */ > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) { > fault = VM_FAULT_SIGNAL; > if (flags & FAULT_FLAG_RETRY_NOWAIT) > goto out_up; > diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c > index becf0be267bb..f620282a37fd 100644 > --- a/arch/sh/mm/fault.c > +++ b/arch/sh/mm/fault.c > @@ -489,6 +489,10 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, > * have already released it in __lock_page_or_retry > * in mm/filemap.c. > */ > + > + if (user_mode(regs) && signal_pending(tsk)) > + return; > + > goto retry; > } > } > diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c > index 0863f6fdd2c5..9af0c3ad50d6 100644 > --- a/arch/sparc/mm/fault_32.c > +++ b/arch/sparc/mm/fault_32.c > @@ -237,7 +237,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write, > */ > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && fault_should_check_signal(from_user)) > return; > > if (unlikely(fault & VM_FAULT_ERROR)) { > diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c > index a1cba3eef79e..566f05f9040b 100644 > --- a/arch/sparc/mm/fault_64.c > +++ b/arch/sparc/mm/fault_64.c > @@ -421,7 +421,8 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs) > > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(flags & FAULT_FLAG_USER)) > goto exit_exception; > > if (unlikely(fault & VM_FAULT_ERROR)) { > diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c > index bc2756782d64..3c72111f27e9 100644 > --- a/arch/um/kernel/trap.c > +++ b/arch/um/kernel/trap.c > @@ -76,7 +76,9 @@ int handle_page_fault(unsigned long address, unsigned long ip, > > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(is_user)) > goto out_nosemaphore; > > if (unlikely(fault & VM_FAULT_ERROR)) { > diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c > index 60453c892c51..04c193439c97 100644 > --- a/arch/unicore32/mm/fault.c > +++ b/arch/unicore32/mm/fault.c > @@ -246,11 +246,12 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs) > > fault = __do_pf(mm, addr, fsr, flags, tsk); > > - /* If we need to retry but a fatal signal is pending, handle the > + /* If we need to retry but a signal is pending, try to handle the > * signal first. We do not need to release the mmap_sem because > * it would already be released in __lock_page_or_retry in > * mm/filemap.c. */ > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > return 0; > > if (!(fault & VM_FAULT_ERROR) && (flags & FAULT_FLAG_ALLOW_RETRY)) { > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > index 994c860ac2d8..f7836472961e 100644 > --- a/arch/x86/mm/fault.c > +++ b/arch/x86/mm/fault.c > @@ -1451,6 +1451,8 @@ void do_user_addr_fault(struct pt_regs *regs, > if (flags & FAULT_FLAG_ALLOW_RETRY) { > flags &= ~FAULT_FLAG_ALLOW_RETRY; > flags |= FAULT_FLAG_TRIED; > + if ((flags & FAULT_FLAG_USER) && signal_pending(tsk)) > + return; > if (!fatal_signal_pending(tsk)) > goto retry; > } > diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c > index d2b082908538..094606676c36 100644 > --- a/arch/xtensa/mm/fault.c > +++ b/arch/xtensa/mm/fault.c > @@ -110,7 +110,8 @@ void do_page_fault(struct pt_regs *regs) > */ > fault = handle_mm_fault(vma, address, flags); > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) > + if ((fault & VM_FAULT_RETRY) && > + fault_should_check_signal(user_mode(regs))) > return; > > if (unlikely(fault & VM_FAULT_ERROR)) { > diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h > index efd8ce7675ed..ccce63f2822d 100644 > --- a/include/linux/sched/signal.h > +++ b/include/linux/sched/signal.h > @@ -377,6 +377,18 @@ static inline int signal_pending_state(long state, struct task_struct *p) > return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p); > } > > +/* > + * This should only be used in fault handlers to decide whether we > + * should stop the current fault routine to handle the signals > + * instead. It should normally be used when a signal interrupted a > + * page fault which can lead to a VM_FAULT_RETRY. > + */ > +static inline bool fault_should_check_signal(bool is_user) > +{ > + return (fatal_signal_pending(current) || > + (is_user && signal_pending(current))); > +} > + > /* > * Reevaluate whether the task has signals pending delivery. > * Wake the task if so.
On Tue, Oct 08, 2019 at 03:43:19PM -0700, Palmer Dabbelt wrote: > > diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c > > index deeb820bd855..ea8f301de65b 100644 > > --- a/arch/riscv/mm/fault.c > > +++ b/arch/riscv/mm/fault.c > > @@ -111,11 +111,12 @@ asmlinkage void do_page_fault(struct pt_regs *regs) > > fault = handle_mm_fault(vma, addr, flags); > > > > /* > > - * If we need to retry but a fatal signal is pending, handle the > > + * If we need to retry but a signal is pending, try to handle the > > * signal first. We do not need to release the mmap_sem because it > > * would already be released in __lock_page_or_retry in mm/filemap.c. > > */ > > - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(tsk)) > > + if ((fault & VM_FAULT_RETRY) && > > + fault_should_check_signal(user_mode(regs))) > > return; > > > > if (unlikely(fault & VM_FAULT_ERROR)) { > > Acked-by: Palmer Dabbelt <palmer@sifive.com> # RISC-V parts > > I'm assuming this is going in through some other tree. Hi, Palmer, Thanks for reviewing! There's a new version here, please feel free to have a look too: https://lore.kernel.org/lkml/20190926093904.5090-1-peterx@redhat.com/ Regards,
diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c index de4cc6936391..ab1d4212d658 100644 --- a/arch/alpha/mm/fault.c +++ b/arch/alpha/mm/fault.c @@ -150,7 +150,8 @@ do_page_fault(unsigned long address, unsigned long mmcsr, the fault. */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c index 61919e4e4eec..27adf4e608e4 100644 --- a/arch/arc/mm/fault.c +++ b/arch/arc/mm/fault.c @@ -142,6 +142,11 @@ void do_page_fault(unsigned long address, struct pt_regs *regs) goto no_context; return; } + + /* Allow user to handle non-fatal signals first */ + if (signal_pending(current) && user_mode(regs)) + return; + /* * retry state machine */ diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c index 2ae28ffec622..f00fb4eafe54 100644 --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -291,14 +291,15 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) fault = __do_page_fault(mm, addr, fsr, flags, tsk); - /* If we need to retry but a fatal signal is pending, handle the + /* If we need to retry but a signal is pending, try to handle the * signal first. We do not need to release the mmap_sem because * it would already be released in __lock_page_or_retry in * mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - if (!user_mode(regs)) + if (unlikely(fault & VM_FAULT_RETRY && signal_pending(current))) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; - return 0; + if (user_mode(regs)) + return 0; } /* diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 613e7434c208..0d3fe0ea6a70 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -479,15 +479,16 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, if (fault & VM_FAULT_RETRY) { /* - * If we need to retry but a fatal signal is pending, + * If we need to retry but a signal is pending, try to * handle the signal first. We do not need to release * the mmap_sem because it would already be released * in __lock_page_or_retry in mm/filemap.c. */ - if (fatal_signal_pending(current)) { - if (!user_mode(regs)) + if (signal_pending(current)) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; - return 0; + if (user_mode(regs)) + return 0; } /* diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c index 223787e01bdd..88a2e5635bfb 100644 --- a/arch/hexagon/mm/vm_fault.c +++ b/arch/hexagon/mm/vm_fault.c @@ -91,7 +91,8 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs) fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) return; /* The most common case -- we are done. */ diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index d039b846f671..8d47acf50fda 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -141,7 +141,8 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index 8e734309ace9..103f93ba8139 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -138,8 +138,9 @@ int do_page_fault(struct pt_regs *regs, unsigned long address, fault = handle_mm_fault(vma, address, flags); pr_debug("handle_mm_fault returns %x\n", fault); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) - return 0; + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) + return; if (unlikely(fault & VM_FAULT_ERROR)) { if (fault & VM_FAULT_OOM) diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index 45c9f66c1dbc..8b0615eab4b6 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -217,7 +217,8 @@ void do_page_fault(struct pt_regs *regs, unsigned long address, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c index 6660b77ff8f3..48aac20a1ded 100644 --- a/arch/mips/mm/fault.c +++ b/arch/mips/mm/fault.c @@ -154,7 +154,8 @@ static void __kprobes __do_page_fault(struct pt_regs *regs, unsigned long write, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) return; perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c index a40de112a23a..baa44f9d0b4a 100644 --- a/arch/nds32/mm/fault.c +++ b/arch/nds32/mm/fault.c @@ -206,14 +206,15 @@ void do_page_fault(unsigned long entry, unsigned long addr, fault = handle_mm_fault(vma, addr, flags); /* - * If we need to retry but a fatal signal is pending, handle the + * If we need to retry but a signal is pending, try to handle the * signal first. We do not need to release the mmap_sem because it * would already be released in __lock_page_or_retry in mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { - if (!user_mode(regs)) + if ((fault & VM_FAULT_RETRY) && signal_pending(current)) { + if (fatal_signal_pending(current) && !user_mode(regs)) goto no_context; - return; + if (user_mode(regs)) + return; } if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c index a401b45cae47..f9f178484184 100644 --- a/arch/nios2/mm/fault.c +++ b/arch/nios2/mm/fault.c @@ -133,7 +133,8 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c index fd1592a56238..8ba3696dd10c 100644 --- a/arch/openrisc/mm/fault.c +++ b/arch/openrisc/mm/fault.c @@ -161,7 +161,8 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address, fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c index 355e3e13fa72..163dcb080c7b 100644 --- a/arch/parisc/mm/fault.c +++ b/arch/parisc/mm/fault.c @@ -304,7 +304,8 @@ void do_page_fault(struct pt_regs *regs, unsigned long code, fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 408ee769c470..d321a6c5fe62 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -596,6 +596,8 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, */ flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; + if (is_user && signal_pending(current)) + return 0; if (!fatal_signal_pending(current)) goto retry; } diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index deeb820bd855..ea8f301de65b 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -111,11 +111,12 @@ asmlinkage void do_page_fault(struct pt_regs *regs) fault = handle_mm_fault(vma, addr, flags); /* - * If we need to retry but a fatal signal is pending, handle the + * If we need to retry but a signal is pending, try to handle the * signal first. We do not need to release the mmap_sem because it * would already be released in __lock_page_or_retry in mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(tsk)) + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c index 74a77b2bca75..3ad77501deef 100644 --- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -480,8 +480,8 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) * the fault. */ fault = handle_mm_fault(vma, address, flags); - /* No reason to continue if interrupted by SIGKILL. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) { + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) { fault = VM_FAULT_SIGNAL; if (flags & FAULT_FLAG_RETRY_NOWAIT) goto out_up; diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c index becf0be267bb..f620282a37fd 100644 --- a/arch/sh/mm/fault.c +++ b/arch/sh/mm/fault.c @@ -489,6 +489,10 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, * have already released it in __lock_page_or_retry * in mm/filemap.c. */ + + if (user_mode(regs) && signal_pending(tsk)) + return; + goto retry; } } diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c index 0863f6fdd2c5..9af0c3ad50d6 100644 --- a/arch/sparc/mm/fault_32.c +++ b/arch/sparc/mm/fault_32.c @@ -237,7 +237,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write, */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && fault_should_check_signal(from_user)) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c index a1cba3eef79e..566f05f9040b 100644 --- a/arch/sparc/mm/fault_64.c +++ b/arch/sparc/mm/fault_64.c @@ -421,7 +421,8 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs) fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(flags & FAULT_FLAG_USER)) goto exit_exception; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c index bc2756782d64..3c72111f27e9 100644 --- a/arch/um/kernel/trap.c +++ b/arch/um/kernel/trap.c @@ -76,7 +76,9 @@ int handle_page_fault(unsigned long address, unsigned long ip, fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(is_user)) goto out_nosemaphore; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/arch/unicore32/mm/fault.c b/arch/unicore32/mm/fault.c index 60453c892c51..04c193439c97 100644 --- a/arch/unicore32/mm/fault.c +++ b/arch/unicore32/mm/fault.c @@ -246,11 +246,12 @@ static int do_pf(unsigned long addr, unsigned int fsr, struct pt_regs *regs) fault = __do_pf(mm, addr, fsr, flags, tsk); - /* If we need to retry but a fatal signal is pending, handle the + /* If we need to retry but a signal is pending, try to handle the * signal first. We do not need to release the mmap_sem because * it would already be released in __lock_page_or_retry in * mm/filemap.c. */ - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) return 0; if (!(fault & VM_FAULT_ERROR) && (flags & FAULT_FLAG_ALLOW_RETRY)) { diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 994c860ac2d8..f7836472961e 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -1451,6 +1451,8 @@ void do_user_addr_fault(struct pt_regs *regs, if (flags & FAULT_FLAG_ALLOW_RETRY) { flags &= ~FAULT_FLAG_ALLOW_RETRY; flags |= FAULT_FLAG_TRIED; + if ((flags & FAULT_FLAG_USER) && signal_pending(tsk)) + return; if (!fatal_signal_pending(tsk)) goto retry; } diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c index d2b082908538..094606676c36 100644 --- a/arch/xtensa/mm/fault.c +++ b/arch/xtensa/mm/fault.c @@ -110,7 +110,8 @@ void do_page_fault(struct pt_regs *regs) */ fault = handle_mm_fault(vma, address, flags); - if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) + if ((fault & VM_FAULT_RETRY) && + fault_should_check_signal(user_mode(regs))) return; if (unlikely(fault & VM_FAULT_ERROR)) { diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h index efd8ce7675ed..ccce63f2822d 100644 --- a/include/linux/sched/signal.h +++ b/include/linux/sched/signal.h @@ -377,6 +377,18 @@ static inline int signal_pending_state(long state, struct task_struct *p) return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p); } +/* + * This should only be used in fault handlers to decide whether we + * should stop the current fault routine to handle the signals + * instead. It should normally be used when a signal interrupted a + * page fault which can lead to a VM_FAULT_RETRY. + */ +static inline bool fault_should_check_signal(bool is_user) +{ + return (fatal_signal_pending(current) || + (is_user && signal_pending(current))); +} + /* * Reevaluate whether the task has signals pending delivery. * Wake the task if so.
The idea comes from the upstream discussion between Linus and Andrea: https://lore.kernel.org/lkml/20171102193644.GB22686@redhat.com/ A summary to the issue: there was a special path in handle_userfault() in the past that we'll return a VM_FAULT_NOPAGE when we detected non-fatal signals when waiting for userfault handling. We did that by reacquiring the mmap_sem before returning. However that brings a risk in that the vmas might have changed when we retake the mmap_sem and even we could be holding an invalid vma structure. This patch is a preparation of removing that special path by allowing the page fault to return even faster if we were interrupted by a non-fatal signal during a user-mode page fault handling routine. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> --- arch/alpha/mm/fault.c | 3 ++- arch/arc/mm/fault.c | 5 +++++ arch/arm/mm/fault.c | 9 +++++---- arch/arm64/mm/fault.c | 9 +++++---- arch/hexagon/mm/vm_fault.c | 3 ++- arch/ia64/mm/fault.c | 3 ++- arch/m68k/mm/fault.c | 5 +++-- arch/microblaze/mm/fault.c | 3 ++- arch/mips/mm/fault.c | 3 ++- arch/nds32/mm/fault.c | 9 +++++---- arch/nios2/mm/fault.c | 3 ++- arch/openrisc/mm/fault.c | 3 ++- arch/parisc/mm/fault.c | 3 ++- arch/powerpc/mm/fault.c | 2 ++ arch/riscv/mm/fault.c | 5 +++-- arch/s390/mm/fault.c | 4 ++-- arch/sh/mm/fault.c | 4 ++++ arch/sparc/mm/fault_32.c | 2 +- arch/sparc/mm/fault_64.c | 3 ++- arch/um/kernel/trap.c | 4 +++- arch/unicore32/mm/fault.c | 5 +++-- arch/x86/mm/fault.c | 2 ++ arch/xtensa/mm/fault.c | 3 ++- include/linux/sched/signal.h | 12 ++++++++++++ 24 files changed, 75 insertions(+), 32 deletions(-)