Message ID | 1484313321-17196-1-git-send-email-pmladek@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, 13 Jan 2017 14:15:21 +0100 Petr Mladek <pmladek@suse.com> wrote: > --- > This is related to the thread > https://lkml.kernel.org/r/201612261954.FJE69201.OFLVtFJSQFOHMO@I-love.SAKURA.ne.jp > > kernel/printk/printk.c | 25 ++++++++++++------------- > 1 file changed, 12 insertions(+), 13 deletions(-) > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index 7180088cbb23..2ac54291230d 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -2150,7 +2150,7 @@ void console_unlock(void) > static u64 seen_seq; > unsigned long flags; > bool wake_klogd = false; > - bool do_cond_resched, retry; > + bool may_schedule_orig, retry; <bike-shedding> Hmm, I just hate the name of that variable. console_may_schedule_orig, keep the full name? </bike-shedding> > > if (console_suspended) { > up_console_sem(); > @@ -2158,17 +2158,15 @@ void console_unlock(void) > } > > /* > - * Console drivers are called under logbuf_lock, so > - * @console_may_schedule should be cleared before; however, we may > - * end up dumping a lot of lines, for example, if called from > - * console registration path, and should invoke cond_resched() > - * between lines if allowable. Not doing so can cause a very long > - * scheduling stall on a slow console leading to RCU stall and > - * softlockup warnings which exacerbate the issue with more > - * messages practically incapacitating the system. > + * Console drivers are called with interrupts disabled, so > + * @console_may_schedule must be cleared before. The original > + * value must be restored so that we could schedule between lines. > + * > + * console_trylock() is not able to detect the preemptive context when > + * CONFIG_PREEMPT_COUNT is disabled. Therefore the value must be > + * stored before the "again" goto label. > */ > - do_cond_resched = console_may_schedule; > - console_may_schedule = 0; > + may_schedule_orig = console_may_schedule; > > again: > /* > @@ -2235,12 +2233,13 @@ void console_unlock(void) > raw_spin_unlock(&logbuf_lock); > > stop_critical_timings(); /* don't trace print latency */ > + console_may_schedule = 0; > call_console_drivers(ext_text, ext_len, text, len); > + console_may_schedule = may_schedule_orig; > start_critical_timings(); > printk_safe_exit_irqrestore(flags); > > - if (do_cond_resched) > - cond_resched(); > + console_conditional_schedule(); Makes perfect sense to me. The only thing that worries me is that it does change the logic slightly, and I'm not sure if this will have any ramifications with it. That is, console_unlock() use to always leave with console_may_schedule equal to zero, where console_unlock() clears it. With this change, console_unlock() no longer clears that variable. Will that have any side effects that we are unaware of? -- Steve > } > console_locked = 0; > -- To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri 2017-01-13 11:05:42, Steven Rostedt wrote: > On Fri, 13 Jan 2017 14:15:21 +0100 > Petr Mladek <pmladek@suse.com> wrote: > > > --- > > This is related to the thread > > https://lkml.kernel.org/r/201612261954.FJE69201.OFLVtFJSQFOHMO@I-love.SAKURA.ne.jp > > > > kernel/printk/printk.c | 25 ++++++++++++------------- > > 1 file changed, 12 insertions(+), 13 deletions(-) > > > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > > index 7180088cbb23..2ac54291230d 100644 > > --- a/kernel/printk/printk.c > > +++ b/kernel/printk/printk.c > > @@ -2150,7 +2150,7 @@ void console_unlock(void) > > static u64 seen_seq; > > unsigned long flags; > > bool wake_klogd = false; > > - bool do_cond_resched, retry; > > + bool may_schedule_orig, retry; > > <bike-shedding> > Hmm, I just hate the name of that variable. > console_may_schedule_orig, keep the full name? > </bike-shedding> No problem. I will use the full name in the next iteration if there will be one. > > > > if (console_suspended) { > > up_console_sem(); > > @@ -2158,17 +2158,15 @@ void console_unlock(void) > > } > > > > /* > > - * Console drivers are called under logbuf_lock, so > > - * @console_may_schedule should be cleared before; however, we may > > - * end up dumping a lot of lines, for example, if called from > > - * console registration path, and should invoke cond_resched() > > - * between lines if allowable. Not doing so can cause a very long > > - * scheduling stall on a slow console leading to RCU stall and > > - * softlockup warnings which exacerbate the issue with more > > - * messages practically incapacitating the system. > > + * Console drivers are called with interrupts disabled, so > > + * @console_may_schedule must be cleared before. The original > > + * value must be restored so that we could schedule between lines. > > + * > > + * console_trylock() is not able to detect the preemptive context when > > + * CONFIG_PREEMPT_COUNT is disabled. Therefore the value must be > > + * stored before the "again" goto label. > > */ > > - do_cond_resched = console_may_schedule; > > - console_may_schedule = 0; > > + may_schedule_orig = console_may_schedule; > > > > again: > > /* > > @@ -2235,12 +2233,13 @@ void console_unlock(void) > > raw_spin_unlock(&logbuf_lock); > > > > stop_critical_timings(); /* don't trace print latency */ > > + console_may_schedule = 0; > > call_console_drivers(ext_text, ext_len, text, len); > > + console_may_schedule = may_schedule_orig; > > start_critical_timings(); > > printk_safe_exit_irqrestore(flags); > > > > - if (do_cond_resched) > > - cond_resched(); > > + console_conditional_schedule(); > > Makes perfect sense to me. The only thing that worries me is that it > does change the logic slightly, and I'm not sure if this will have any > ramifications with it. That is, console_unlock() use to always leave > with console_may_schedule equal to zero, where console_unlock() clears > it. With this change, console_unlock() no longer clears that variable. > Will that have any side effects that we are unaware of? Good question! If I get it correctly, the variable should never be used without the console semaphore. IMHO, if it was used without the semaphore or if it was not set correctly when the semaphore was taken, it would be a bug. It means that leaving the variable set might actually help to find a buggy usage if there is any. My findings: + console_may_lock is set only by functions that get the console semaphore. + The function that takes the semaphore and does not set the variable is resume_console(). IMHO, it is a bug. We are on the safe side because the function is called from the same context as suspend_console() and it allows rescheduling. + I am not aware of any use of the variable without the semaphore. But it is not easy to prove just be reading the code. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On (01/16/17 12:00), Petr Mladek wrote: [..] > > Makes perfect sense to me. The only thing that worries me is that it > > does change the logic slightly, and I'm not sure if this will have any > > ramifications with it. That is, console_unlock() use to always leave > > with console_may_schedule equal to zero, where console_unlock() clears > > it. With this change, console_unlock() no longer clears that variable. > > Will that have any side effects that we are unaware of? > > Good question! it does look a bit worrisome. > If I get it correctly, the variable should never be used without the > console semaphore. IMHO, if it was used without the semaphore or if > it was not set correctly when the semaphore was taken, it would be a > bug. It means that leaving the variable set might actually help > to find a buggy usage if there is any. > > My findings: > > + console_may_lock is set only by functions that get the console > semaphore. > > + The function that takes the semaphore and does not set the > variable is resume_console(). IMHO, it is a bug. > > We are on the safe side because the function is called from > the same context as suspend_console() and it allows rescheduling. > > > + I am not aware of any use of the variable without the > semaphore. But it is not easy to prove just be reading > the code. there is a function that clears @console_may_schedule out of console_sem scope - console_flush_on_panic(). so I *may be* can think about a worst case scenario of race condition between console_flush_on_panic()->console_may_schedule = 0 on panic CPU and console_unlock()->console_may_schedule = 1 from CPU that panic CPU failed to stop (smp_send_stop() can return with secondary CPUs still being online). thoughts? -ss -- To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On (01/18/17 14:45), Sergey Senozhatsky wrote: [..] > > there is a function that clears @console_may_schedule out of > console_sem scope - console_flush_on_panic(). > so I *may be* can think about a worst case scenario of race > condition between > console_flush_on_panic()->console_may_schedule = 0 on panic CPU > and > console_unlock()->console_may_schedule = 1 from CPU that panic CPU > failed to stop (smp_send_stop() can return with secondary CPUs still being > online). what I mean, is that we can have, let's say, 2 CPUs spinning in console_unlock(), both with @console_may_schedule == 1 (because secondary CPU restores global @console_may_schedule value). now, suppose, we have misbehaving scheduler (well, we are in panic after all). secondary CPU will cond_resched() and may be lockup somewhere in the scheduler. which is fine, we don't care about that secondary CPU anyway. but the same can happen to panic CPU as well. what do you think? -ss -- To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed 2017-01-18 16:21:41, Sergey Senozhatsky wrote: > On (01/18/17 14:45), Sergey Senozhatsky wrote: > [..] > > > > there is a function that clears @console_may_schedule out of > > console_sem scope - console_flush_on_panic(). > > so I *may be* can think about a worst case scenario of race > > condition between > > console_flush_on_panic()->console_may_schedule = 0 on panic CPU > > and > > console_unlock()->console_may_schedule = 1 from CPU that panic CPU > > failed to stop (smp_send_stop() can return with secondary CPUs still being > > online). > > what I mean, is that we can have, let's say, 2 CPUs spinning in > console_unlock(), both with @console_may_schedule == 1 (because secondary > CPU restores global @console_may_schedule value). now, suppose, we have > misbehaving scheduler (well, we are in panic after all). secondary CPU > will cond_resched() and may be lockup somewhere in the scheduler. which is > fine, we don't care about that secondary CPU anyway. but the same can happen > to panic CPU as well. > > what do you think? Great catch! console_flush_on_panic() is called after smp_send_stop(); so only one CPU should be running. But it is not guaranteed. Better be on the safe side. I am going to use a conservative solution that will only move the "again" goto label. Just some thoughts for a future work: The dependencies between console_sem, console_may_schedule, console_locked, and console_suspended are complex like hell. There are several surprises. For example, console_trylock() and console_lock() behave differently when console_suspended is set. console_trylock() completely fails. console_lock() succeeds but it does not modify console_locked and console_may_schedule. This is the reason why we do not need to check console_suspended after the "again" goto target. IMHO, the key to make it more straightforward is to split console flushing functionality from console_unlock(). It is a bit problematic. console_unlock() guarantees that all messages are flushed when the semaphore is finally released. IMHO, it might get more relaxed with some deferred techniques. The deferred handling is perfectly fine most of the time. In emergency situations, the console_sem is either available or we rely on console_flush_on_panic() anyway. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 7180088cbb23..2ac54291230d 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2150,7 +2150,7 @@ void console_unlock(void) static u64 seen_seq; unsigned long flags; bool wake_klogd = false; - bool do_cond_resched, retry; + bool may_schedule_orig, retry; if (console_suspended) { up_console_sem(); @@ -2158,17 +2158,15 @@ void console_unlock(void) } /* - * Console drivers are called under logbuf_lock, so - * @console_may_schedule should be cleared before; however, we may - * end up dumping a lot of lines, for example, if called from - * console registration path, and should invoke cond_resched() - * between lines if allowable. Not doing so can cause a very long - * scheduling stall on a slow console leading to RCU stall and - * softlockup warnings which exacerbate the issue with more - * messages practically incapacitating the system. + * Console drivers are called with interrupts disabled, so + * @console_may_schedule must be cleared before. The original + * value must be restored so that we could schedule between lines. + * + * console_trylock() is not able to detect the preemptive context when + * CONFIG_PREEMPT_COUNT is disabled. Therefore the value must be + * stored before the "again" goto label. */ - do_cond_resched = console_may_schedule; - console_may_schedule = 0; + may_schedule_orig = console_may_schedule; again: /* @@ -2235,12 +2233,13 @@ void console_unlock(void) raw_spin_unlock(&logbuf_lock); stop_critical_timings(); /* don't trace print latency */ + console_may_schedule = 0; call_console_drivers(ext_text, ext_len, text, len); + console_may_schedule = may_schedule_orig; start_critical_timings(); printk_safe_exit_irqrestore(flags); - if (do_cond_resched) - cond_resched(); + console_conditional_schedule(); } console_locked = 0;
Some console drivers code calls console_conditional_schedule() that looks at @console_may_schedule. The value must be cleared when the drivers are called from console_unlock() with interrupts disabled. But rescheduling is fine when the same code is called, for example, from tty operations where the console semaphore is taken via console_lock(). This is why @console_may_schedule is cleared before calling console drivers. The original value is stored to decide if we could sleep between lines. Now, @console_may_schedule is not cleared when we call console_trylock() and jump back to the "again" goto label. This has become a problem, since the commit 6b97a20d3a7909daa066 ("printk: set may_schedule for some of console_trylock() callers"). @console_may_schedule might get enabled now. There is also the opposite problem. console_lock() can be called only from preemptive context. It can always enable scheduling in the console code. But console_trylock() is not able to detect it when CONFIG_PREEMPT_COUNT is disabled. Therefore we should use the original @console_may_schedule value after re-acquiring the console semaphore in console_unlock(). This patch solves both problems by clearing and restoring the very original @may_schedule setting only around call_console_drivers(). Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Petr Mladek <pmladek@suse.com> --- This is related to the thread https://lkml.kernel.org/r/201612261954.FJE69201.OFLVtFJSQFOHMO@I-love.SAKURA.ne.jp kernel/printk/printk.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-)