Message ID | 20200716153034.4935-1-penguin-kernel@I-love.SAKURA.ne.jp |
---|---|
State | New, archived |
Series | mm: Warn mmput() from memory reclaim context. |
On Fri, 17 Jul 2020, Tetsuo Handa wrote:

> syzbot is reporting that mmput() from a shrinker function has a risk of
> deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
> kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
> uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.
>
> However, it took 18 months to hit this race for the third time, for
> mmput() invokes __mmput() only when ->mm_users drops to 0. If we
> always warn like might_sleep(), we can detect the possibility of
> deadlock more easily.
>
> For now, I inlined the check under CONFIG_PROVE_LOCKING. If we find
> more locations, we could introduce a macro like might_sleep().

Hi Tetsuo,

It looks like this is one issue where mmput() interacted poorly while in
direct reclaim because of a uprobes issue; I'm not sure that we can make
a generalization that mmput() is *always* problematic when PF_MEMALLOC
is set. I'm also mindful of the (ab)use of PF_MEMALLOC outside just the
direct reclaim path.

Or maybe there is a way you can show that mmput() while PF_MEMALLOC is
set is always concerning?

> [1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45

I wasn't familiar with this particular report, but it seems like the fix
is simply to do the kzalloc() before taking delayed_uprobe_lock and
freeing it if delayed_uprobe_check() already finds one for that uprobe?

> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
>  kernel/fork.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index efc5493203ae..8717ce50ff0d 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1109,6 +1109,10 @@ static inline void __mmput(struct mm_struct *mm)
>  void mmput(struct mm_struct *mm)
>  {
>  	might_sleep();
> +#ifdef CONFIG_PROVE_LOCKING
> +	/* Calling mmput() from shrinker context can deadlock. */
> +	WARN_ON(current->flags & PF_MEMALLOC);
> +#endif
>
>  	if (atomic_dec_and_test(&mm->mm_users))
>  		__mmput(mm);
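[For reference, a minimal sketch of the alternative fix suggested above.
The locking is simplified so the function is self-contained; upstream,
delayed_uprobe_lock is actually taken by the caller, update_ref_ctr().
The point is only the ordering: do the GFP_KERNEL allocation before
taking delayed_uprobe_lock, so reclaim triggered by kzalloc() can never
run a shrinker while the lock is held.]

static int delayed_uprobe_add(struct uprobe *uprobe, struct mm_struct *mm)
{
	struct delayed_uprobe *du;

	/* Allocate first, while no locks are held. */
	du = kzalloc(sizeof(*du), GFP_KERNEL);
	if (!du)
		return -ENOMEM;

	mutex_lock(&delayed_uprobe_lock);
	if (delayed_uprobe_check(uprobe, mm)) {
		/* An entry already exists; discard ours. */
		mutex_unlock(&delayed_uprobe_lock);
		kfree(du);
		return 0;
	}
	du->uprobe = uprobe;
	du->mm = mm;
	list_add(&du->list, &delayed_uprobe_list);
	mutex_unlock(&delayed_uprobe_lock);
	return 0;
}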
On 2020/07/17 4:45, David Rientjes wrote:
> I wasn't familiar with this particular report, but it seems like the fix
> is simply to do the kzalloc() before taking delayed_uprobe_lock and
> freeing it if delayed_uprobe_check() already finds one for that uprobe?

The fix will be to use mmput_async()
( https://lkml.kernel.org/r/20200716162931.g3delsp7qmfjup6x@wittgenstein ).

We don't call mmput() from the OOM reaper context; I think the reason is
that __mmput() might do something more complicated which blocks
reclaiming memory.
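[For context, mmput_async() defers __mmput() to a workqueue when the
last reference is dropped, so none of the locks taken during mm teardown
(including delayed_uprobe_lock) are acquired from the caller's context.
A minimal usage sketch follows; release_mm_from_reclaim() is a
hypothetical helper name invented here for illustration.]

static void release_mm_from_reclaim(struct mm_struct *mm)
{
	/*
	 * In memory reclaim context (PF_MEMALLOC set), defer teardown:
	 * mmput_async() queues __mmput() on a workqueue instead of
	 * running it synchronously, so __mmput()'s locks, such as
	 * delayed_uprobe_lock, are never taken from inside reclaim.
	 */
	if (current->flags & PF_MEMALLOC)
		mmput_async(mm);
	else
		mmput(mm);	/* safe to tear down synchronously */
}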
diff --git a/kernel/fork.c b/kernel/fork.c
index efc5493203ae..8717ce50ff0d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1109,6 +1109,10 @@ static inline void __mmput(struct mm_struct *mm)
 void mmput(struct mm_struct *mm)
 {
 	might_sleep();
+#ifdef CONFIG_PROVE_LOCKING
+	/* Calling mmput() from shrinker context can deadlock. */
+	WARN_ON(current->flags & PF_MEMALLOC);
+#endif
 
 	if (atomic_dec_and_test(&mm->mm_users))
 		__mmput(mm);
syzbot is reporting that mmput() from a shrinker function has a risk of
deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.

However, it took 18 months to hit this race for the third time, for
mmput() invokes __mmput() only when ->mm_users drops to 0. If we always
warn like might_sleep(), we can detect the possibility of deadlock more
easily.

For now, I inlined the check under CONFIG_PROVE_LOCKING. If we find
more locations, we could introduce a macro like might_sleep().

[1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 kernel/fork.c | 4 ++++
 1 file changed, 4 insertions(+)
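[A sketch of the "macro like might_sleep()" idea from the last paragraph
of the commit message. The name might_deadlock_in_reclaim() is invented
here for illustration; no such macro exists upstream.]

/*
 * Hypothetical annotation in the spirit of might_sleep(): any function
 * that must not be called from memory reclaim context could use this
 * instead of open-coding the PF_MEMALLOC check as the patch does in
 * mmput().
 */
#ifdef CONFIG_PROVE_LOCKING
#define might_deadlock_in_reclaim()					\
	WARN_ONCE(current->flags & PF_MEMALLOC,				\
		  "%s() called from memory reclaim context\n", __func__)
#else
#define might_deadlock_in_reclaim()	do { } while (0)
#endif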