Message ID | 20190509200633.19678-1-daniel.vetter@ffwll.ch (mailing list archive)
---|---
State | New, archived
Series | kernel/locking/semaphore: use wake_q in up()
On Fri, May 10, 2019 at 7:50 AM Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
>
> On (05/09/19 22:06), Daniel Vetter wrote:
> [..]
> > +/* Functions for the contended case */
> > +
> > +struct semaphore_waiter {
> > +	struct list_head list;
> > +	struct task_struct *task;
> > +	bool up;
> > +};
> > +
> >  /**
> >   * up - release the semaphore
> >   * @sem: the semaphore to release
> > @@ -179,24 +187,25 @@ EXPORT_SYMBOL(down_timeout);
> >  void up(struct semaphore *sem)
> >  {
> >  	unsigned long flags;
> > +	struct semaphore_waiter *waiter;
> > +	DEFINE_WAKE_Q(wake_q);
> >
> >  	raw_spin_lock_irqsave(&sem->lock, flags);
> > -	if (likely(list_empty(&sem->wait_list)))
> > +	if (likely(list_empty(&sem->wait_list))) {
> >  		sem->count++;
> > -	else
> > -		__up(sem);
> > +	} else {
> > +		waiter = list_first_entry(&sem->wait_list,
> > +					  struct semaphore_waiter, list);
> > +		list_del(&waiter->list);
> > +		waiter->up = true;
> > +		wake_q_add(&wake_q, waiter->task);
> > +	}
> >  	raw_spin_unlock_irqrestore(&sem->lock, flags);
>
> So the new code still can printk/WARN under sem->lock in some buggy
> cases.
>
> E.g.
>	wake_q_add()
>	 get_task_struct()
>	  refcount_inc_checked()
>	   WARN_ONCE()
>
> Are we fine with that?

Hm, not great. It's not as bad as the one I'm trying to fix (or at least
not the same), because with the wake-up chain we have a few locks in
there, which allows lockdep to connect the loop and complain even when we
never actually hit that specific recursion. I.e. hitting a WARN_ON from
try_to_wake_up once is enough, and a totally separate callchain can then
close the semaphore.lock->scheduler-locks part of the cycle. Your chain
only goes boom if it happens from the console_lock's up().

wake_q_add_safe() would be an option, but then we somehow need to arrange
for down() to call get_task_struct(current) and release that reference,
but only if there's no waker who needs the task ref. Sounds tricky ...
Also not sure we want to stuff that trickery into the generic semaphore
code.
-Daniel
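[Editor's note: the shape of the fix under discussion — dequeue the waiter while holding the lock, but defer the actual wakeup until after the lock is dropped — can be sketched in plain userspace C. Everything below (`toy_sem`, `toy_up`, `toy_wake`, the `locked` flag) is an invented stand-in for the kernel types, not the real API:]

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; all names are invented. */
struct toy_waiter {
	struct toy_waiter *next;   /* singly-linked wait list */
	int task_id;
	bool up;
};

struct toy_sem {
	int count;
	bool locked;               /* stands in for raw_spin_lock_irqsave() */
	struct toy_waiter *wait_list;
};

/* Record which "task" we woke and whether the lock was held then. */
static int last_woken = -1;
static bool woke_under_lock = false;

static void toy_wake(struct toy_sem *sem, int task_id)
{
	last_woken = task_id;
	woke_under_lock = sem->locked; /* the recursion hazard the patch avoids */
}

/* Mirrors the patched up(): pick the waiter under the lock,
 * wake it only after the lock has been released. */
void toy_up(struct toy_sem *sem)
{
	struct toy_waiter *waiter = NULL;

	sem->locked = true;                /* raw_spin_lock_irqsave() */
	if (sem->wait_list == NULL) {
		sem->count++;
	} else {
		waiter = sem->wait_list;       /* list_first_entry() */
		sem->wait_list = waiter->next; /* list_del() */
		waiter->up = true;
		/* wake_q_add(): only remember the task, don't wake yet */
	}
	sem->locked = false;               /* raw_spin_unlock_irqrestore() */

	if (waiter)                        /* wake_up_q() */
		toy_wake(sem, waiter->task_id);
}
```

With a waiter queued, `toy_up` performs the wakeup only after `locked` is cleared; with an empty wait list it just bumps the count, matching the fast path of the patch.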
On Thu 2019-05-09 22:06:33, Daniel Vetter wrote:
> console_trylock, called from within printk, can be called from pretty
> much anywhere. Including try_to_wake_up. Note that this isn't common,
> usually the box is in pretty bad shape at that point already. But it
> really doesn't help when then lockdep jumps in and spams the logs,
> potentially obscuring the real backtrace we're really interested in.
> One case I've seen (slightly simplified backtrace):
>
> Fix this specific locking recursion by moving the wake_up_process out
> from under the semaphore.lock spinlock, using wake_q as recommended by
> Peter Zijlstra.

It might make sense to mention also the optimization effect mentioned
by Peter.

> diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
> index 561acdd39960..7a6f33715688 100644
> --- a/kernel/locking/semaphore.c
> +++ b/kernel/locking/semaphore.c
> @@ -169,6 +169,14 @@ int down_timeout(struct semaphore *sem, long timeout)
>  }
>  EXPORT_SYMBOL(down_timeout);
>
> +/* Functions for the contended case */
> +
> +struct semaphore_waiter {
> +	struct list_head list;
> +	struct task_struct *task;
> +	bool up;
> +};
> +
>  /**
>   * up - release the semaphore
>   * @sem: the semaphore to release
> @@ -179,24 +187,25 @@ EXPORT_SYMBOL(down_timeout);
>  void up(struct semaphore *sem)
>  {
>  	unsigned long flags;
> +	struct semaphore_waiter *waiter;
> +	DEFINE_WAKE_Q(wake_q);

We need to call wake_q_init(&wake_q) to make sure that
it is empty.

Best Regards,
Petr

>  	raw_spin_lock_irqsave(&sem->lock, flags);
> -	if (likely(list_empty(&sem->wait_list)))
> +	if (likely(list_empty(&sem->wait_list))) {
>  		sem->count++;
> -	else
> -		__up(sem);
> +	} else {
> +		waiter = list_first_entry(&sem->wait_list,
> +					  struct semaphore_waiter, list);
> +		list_del(&waiter->list);
> +		waiter->up = true;
> +		wake_q_add(&wake_q, waiter->task);
> +	}
>  	raw_spin_unlock_irqrestore(&sem->lock, flags);
> +
> +	wake_up_q(&wake_q);
>  }
>  EXPORT_SYMBOL(up);
>
> -/* Functions for the contended case */
> -
> -struct semaphore_waiter {
> -	struct list_head list;
> -	struct task_struct *task;
> -	bool up;
> -};
> -
>  /*
>   * Because this function is inlined, the 'state' parameter will be
>   * constant, and thus optimised away by the compiler. Likewise the
On Fri, May 10, 2019 at 11:28 AM Petr Mladek <pmladek@suse.com> wrote:
>
> On Thu 2019-05-09 22:06:33, Daniel Vetter wrote:
> > console_trylock, called from within printk, can be called from pretty
> > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > usually the box is in pretty bad shape at that point already. But it
> > really doesn't help when then lockdep jumps in and spams the logs,
> > potentially obscuring the real backtrace we're really interested in.
> > One case I've seen (slightly simplified backtrace):
> >
> > Fix this specific locking recursion by moving the wake_up_process out
> > from under the semaphore.lock spinlock, using wake_q as recommended by
> > Peter Zijlstra.
>
> It might make sense to mention also the optimization effect mentioned
> by Peter.
>
> > diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
> > index 561acdd39960..7a6f33715688 100644
> > --- a/kernel/locking/semaphore.c
> > +++ b/kernel/locking/semaphore.c
> > @@ -169,6 +169,14 @@ int down_timeout(struct semaphore *sem, long timeout)
> >  }
> >  EXPORT_SYMBOL(down_timeout);
> >
> > +/* Functions for the contended case */
> > +
> > +struct semaphore_waiter {
> > +	struct list_head list;
> > +	struct task_struct *task;
> > +	bool up;
> > +};
> > +
> >  /**
> >   * up - release the semaphore
> >   * @sem: the semaphore to release
> > @@ -179,24 +187,25 @@ EXPORT_SYMBOL(down_timeout);
> >  void up(struct semaphore *sem)
> >  {
> >  	unsigned long flags;
> > +	struct semaphore_waiter *waiter;
> > +	DEFINE_WAKE_Q(wake_q);
>
> We need to call wake_q_init(&wake_q) to make sure that
> it is empty.

DEFINE_WAKE_Q does that already, and if it didn't, I'd wonder how I
managed to boot with this patch.

console_lock is usually terribly contended because thanks to fbcon we
must do a full display modeset while holding it, which takes forever.
As long as anyone printks meanwhile (guaranteed while loading drivers
really) you have contention.
-Daniel

> Best Regards,
> Petr
>
> >  	raw_spin_lock_irqsave(&sem->lock, flags);
> > -	if (likely(list_empty(&sem->wait_list)))
> > +	if (likely(list_empty(&sem->wait_list))) {
> >  		sem->count++;
> > -	else
> > -		__up(sem);
> > +	} else {
> > +		waiter = list_first_entry(&sem->wait_list,
> > +					  struct semaphore_waiter, list);
> > +		list_del(&waiter->list);
> > +		waiter->up = true;
> > +		wake_q_add(&wake_q, waiter->task);
> > +	}
> >  	raw_spin_unlock_irqrestore(&sem->lock, flags);
> > +
> > +	wake_up_q(&wake_q);
> >  }
> >  EXPORT_SYMBOL(up);
> >
> > -/* Functions for the contended case */
> > -
> > -struct semaphore_waiter {
> > -	struct list_head list;
> > -	struct task_struct *task;
> > -	bool up;
> > -};
> > -
> >  /*
> >   * Because this function is inlined, the 'state' parameter will be
> >   * constant, and thus optimised away by the compiler. Likewise the
On Fri 2019-05-10 17:20:15, Daniel Vetter wrote:
> On Fri, May 10, 2019 at 11:28 AM Petr Mladek <pmladek@suse.com> wrote:
> >
> > On Thu 2019-05-09 22:06:33, Daniel Vetter wrote:
> > > console_trylock, called from within printk, can be called from pretty
> > > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > > usually the box is in pretty bad shape at that point already. But it
> > > really doesn't help when then lockdep jumps in and spams the logs,
> > > potentially obscuring the real backtrace we're really interested in.
> > > One case I've seen (slightly simplified backtrace):
> > >
> > > Fix this specific locking recursion by moving the wake_up_process out
> > > from under the semaphore.lock spinlock, using wake_q as recommended by
> > > Peter Zijlstra.
> >
> > It might make sense to mention also the optimization effect mentioned
> > by Peter.
> >
> > > diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
> > > index 561acdd39960..7a6f33715688 100644
> > > --- a/kernel/locking/semaphore.c
> > > +++ b/kernel/locking/semaphore.c
> > > @@ -169,6 +169,14 @@ int down_timeout(struct semaphore *sem, long timeout)
> > >  }
> > >  EXPORT_SYMBOL(down_timeout);
> > >
> > > +/* Functions for the contended case */
> > > +
> > > +struct semaphore_waiter {
> > > +	struct list_head list;
> > > +	struct task_struct *task;
> > > +	bool up;
> > > +};
> > > +
> > >  /**
> > >   * up - release the semaphore
> > >   * @sem: the semaphore to release
> > > @@ -179,24 +187,25 @@ EXPORT_SYMBOL(down_timeout);
> > >  void up(struct semaphore *sem)
> > >  {
> > >  	unsigned long flags;
> > > +	struct semaphore_waiter *waiter;
> > > +	DEFINE_WAKE_Q(wake_q);
> >
> > We need to call wake_q_init(&wake_q) to make sure that
> > it is empty.
>
> DEFINE_WAKE_Q does that already, and if it didn't, I'd wonder how I
> managed to boot with this patch.
>
> console_lock is usually terribly contended because thanks to fbcon we
> must do a full display modeset while holding it, which takes forever.
> As long as anyone printks meanwhile (guaranteed while loading drivers
> really) you have contention.
> -Daniel

You are right. It is initialized by DEFINE_WAKE_Q.

The patch looks correct to me then:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr
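[Editor's note: the point that DEFINE_WAKE_Q bundles declaration and initialization — so no separate wake_q_init() call is needed — can be illustrated with a userspace analogue. The structure below (`demo_q`, `DEFINE_DEMO_Q`, the tail-pointer layout) is an invented sketch modeled loosely on the kernel's wake_q, not the exact kernel definition:]

```c
#include <stddef.h>

/* Illustrative analogue of a wake queue: a singly linked list plus a
 * pointer to the tail's next-pointer slot for O(1) append. */
struct demo_node { struct demo_node *next; };
struct demo_q { struct demo_node *first; struct demo_node **lastp; };

/* Like DEFINE_WAKE_Q(): the macro both declares the variable and
 * initializes it to the empty state, so forgetting a separate
 * demo_q_init() call is impossible by construction. */
#define DEFINE_DEMO_Q(name) \
	struct demo_q name = { NULL, &name.first }

int demo_q_empty(const struct demo_q *q)
{
	return q->first == NULL;
}

void demo_q_add(struct demo_q *q, struct demo_node *n)
{
	n->next = NULL;
	*q->lastp = n;        /* append at the tail */
	q->lastp = &n->next;  /* tail slot is now the new node's next */
}
```

The initializer `{ NULL, &name.first }` is why the queue is valid immediately at its declaration, which is the behavior Daniel relies on above.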
diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index 561acdd39960..7a6f33715688 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -33,12 +33,12 @@
 #include <linux/semaphore.h>
 #include <linux/spinlock.h>
 #include <linux/ftrace.h>
+#include <linux/sched/wake_q.h>
 
 static noinline void __down(struct semaphore *sem);
 static noinline int __down_interruptible(struct semaphore *sem);
 static noinline int __down_killable(struct semaphore *sem);
 static noinline int __down_timeout(struct semaphore *sem, long timeout);
-static noinline void __up(struct semaphore *sem);
 
 /**
  * down - acquire the semaphore
@@ -169,6 +169,14 @@ int down_timeout(struct semaphore *sem, long timeout)
 }
 EXPORT_SYMBOL(down_timeout);
 
+/* Functions for the contended case */
+
+struct semaphore_waiter {
+	struct list_head list;
+	struct task_struct *task;
+	bool up;
+};
+
 /**
  * up - release the semaphore
  * @sem: the semaphore to release
@@ -179,24 +187,25 @@ EXPORT_SYMBOL(down_timeout);
 void up(struct semaphore *sem)
 {
 	unsigned long flags;
+	struct semaphore_waiter *waiter;
+	DEFINE_WAKE_Q(wake_q);
 
 	raw_spin_lock_irqsave(&sem->lock, flags);
-	if (likely(list_empty(&sem->wait_list)))
+	if (likely(list_empty(&sem->wait_list))) {
 		sem->count++;
-	else
-		__up(sem);
+	} else {
+		waiter = list_first_entry(&sem->wait_list,
+					  struct semaphore_waiter, list);
+		list_del(&waiter->list);
+		waiter->up = true;
+		wake_q_add(&wake_q, waiter->task);
+	}
 	raw_spin_unlock_irqrestore(&sem->lock, flags);
+
+	wake_up_q(&wake_q);
 }
 EXPORT_SYMBOL(up);
 
-/* Functions for the contended case */
-
-struct semaphore_waiter {
-	struct list_head list;
-	struct task_struct *task;
-	bool up;
-};
-
 /*
  * Because this function is inlined, the 'state' parameter will be
  * constant, and thus optimised away by the compiler. Likewise the
@@ -252,12 +261,3 @@ static noinline int __sched __down_timeout(struct semaphore *sem, long timeout)
 {
 	return __down_common(sem, TASK_UNINTERRUPTIBLE, timeout);
 }
-
-static noinline void __sched __up(struct semaphore *sem)
-{
-	struct semaphore_waiter *waiter = list_first_entry(&sem->wait_list,
-						struct semaphore_waiter, list);
-	list_del(&waiter->list);
-	waiter->up = true;
-	wake_up_process(waiter->task);
-}