Message ID | alpine.DEB.2.11.1605201740320.3639@nanos (mailing list archive) |
---|---|
State | New, archived |
Hi Thomas,

On Fri, May 20, 2016 at 05:42:17PM +0200, Thomas Gleixner wrote:
> do_work_pending() calls schedule() with interrupts disabled, which is just
> wrong. Fix it.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/arm/kernel/signal.c |    1 +
>  1 file changed, 1 insertion(+)
>
> --- a/arch/arm/kernel/signal.c
> +++ b/arch/arm/kernel/signal.c
> @@ -573,6 +573,7 @@ do_work_pending(struct pt_regs *regs, un
>          trace_hardirqs_off();
>          do {
>                  if (likely(thread_flags & _TIF_NEED_RESCHED)) {
> +                        local_irq_enable();
>                          schedule();
>                  } else {
>                          if (unlikely(!user_mode(regs)))

We may have the same bug on arm64 (arch/arm64/kernel/entry.S). Is there
a more fundamental problem with calling schedule() with IRQs off? The
__schedule() function disables the IRQs shortly after it is entered.

To silence IRQ trace warnings on arm64, we merged commit db3899a6477a
("arm64: Add trace_hardirqs_off annotation in ret_to_user"). But we were
also debating whether enabling the IRQs before calling schedule() in
arch/arm64/kernel/entry.S would make more sense. It looks like we need
to revisit this patch:

https://git.kernel.org/cgit/linux/kernel/git/mark/linux.git/commit/?h=arm64/entry-deasm&id=d244472af6e88c55603dc1ba342fae4e85cde31c

Thanks.
On Mon, May 23, 2016 at 11:54:20AM +0100, Catalin Marinas wrote:
> We may have the same bug on arm64 (arch/arm64/kernel/entry.S). Is there
> a more fundamental problem with calling schedule() with IRQs off? The
> __schedule() function disables the IRQs shortly after it is entered.

schedule() does other stuff before entering __schedule() though, such as
calling into the block layer. This code may have the expectation that
interrupts are enabled.

However, having interrupts enabled in this path (which I'd argue is
special in respect of the "thou shalt not enter schedule() with IRQs off"
rule) opens up the possibility to call into schedule() with the
need_resched flag cleared:

- need_resched was set when returning to userspace, we enter
  do_work_pending().
- do_work_pending() enables IRQs, and an IRQ was pending.
- IRQ is processed, and during that kernel preemption happens, clearing
  this thread's need_resched flag.
- we return to this thread, and now we will enter schedule() with
  need_resched clear.

Whether that matters or not is a different question - and I guess is a
question for scheduler people.

The likelihood of this happening depends on the IRQ load, but the
requirements are quite simple: need_resched set while returning to
userspace with a pending IRQ.
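The four steps listed above can be walked through with a small user-space
model. This is only a sketch under stated assumptions: every name in it
(struct model_thread, model_local_irq_enable(), model_do_work_pending() and
so on) is hypothetical and is not a kernel API, and preemption is reduced to
"the pending IRQ clears the flag", which is the only effect the scenario
depends on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_thread {
	bool need_resched;	/* stands in for _TIF_NEED_RESCHED */
	bool irqs_enabled;
	bool irq_pending;
};

/*
 * Models an IRQ during which kernel preemption happens: the thread is
 * scheduled out and back in, which clears its need_resched flag.
 */
static void model_irq_with_preemption(struct model_thread *t)
{
	t->irq_pending = false;
	t->need_resched = false;
}

/* Enabling IRQs lets a pending IRQ run immediately. */
static void model_local_irq_enable(struct model_thread *t)
{
	t->irqs_enabled = true;
	if (t->irq_pending)
		model_irq_with_preemption(t);
}

/*
 * Returns 1 if need_resched is still set on entry to schedule(),
 * 0 if it was cleared in the window opened by enabling IRQs,
 * and -1 if schedule() would not be called at all.
 */
int model_do_work_pending(bool irq_pending)
{
	struct model_thread t = {
		.need_resched = true,	/* set while returning to userspace */
		.irqs_enabled = false,
		.irq_pending = irq_pending,
	};

	if (!t.need_resched)		/* flag is tested with IRQs off */
		return -1;

	model_local_irq_enable(&t);	/* the line the patch adds */
	return t.need_resched ? 1 : 0;	/* state seen by schedule() */
}

/* Both outcomes of the scenario, expressed as checks. */
void model_race_self_check(void)
{
	assert(model_do_work_pending(false) == 1);
	assert(model_do_work_pending(true) == 0);
}
```

Without a pending IRQ the flag survives until schedule(); with one, the
model reaches schedule() with the flag already clear. Whether the real
scheduler cares is the open question raised in the mail, and the model says
nothing about that.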
On Fri, May 20, 2016 at 05:42:17PM +0200, Thomas Gleixner wrote:
> do_work_pending() calls schedule() with interrupts disabled, which is just
> wrong. Fix it.

Thomas; lockdep cannot currently catch this. It doesn't do IRQ state
validation other than ensuring the state matches with the hardware.

So things like:

        local_irq_disable();
        local_irq_disable();
        local_irq_save();
        local_irq_enable();

(an 'obviously' suspect sequence of IRQ events)

Are _fine_ by it.

The only time it will yell is if flipping IRQ state ends up marking an
actual held lock with ENABLED_HARDIRQ while it already had
USED_IN_HARDIRQ.
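The behaviour described above can be illustrated with a toy checker. It is a
sketch of the rule as stated in the mail, not lockdep's implementation: all
names (struct model_checker, MODEL_USED_IN_HARDIRQ, model_irq_enable() and
so on) are hypothetical, and the only check modelled is the conflict between
the two usage bits on a held lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; a toy model, not lockdep's code. */
enum {
	MODEL_USED_IN_HARDIRQ = 1u << 0, /* lock was taken in hardirq context */
	MODEL_ENABLED_HARDIRQ = 1u << 1, /* lock was held with hardirqs on */
};

struct model_lock {
	unsigned int usage;
	bool held;
};

struct model_checker {
	bool irqs_enabled;
	int warnings;
};

/* A redundant disable is accepted silently: only the state is tracked. */
static void model_irq_disable(struct model_checker *c)
{
	c->irqs_enabled = false;
}

/* Returns the previous state and disables, like a save-and-disable. */
static bool model_irq_save(struct model_checker *c)
{
	bool flags = c->irqs_enabled;

	c->irqs_enabled = false;
	return flags;
}

/*
 * Enabling marks every held lock ENABLED_HARDIRQ and warns only if
 * that lock already carries USED_IN_HARDIRQ.
 */
static void model_irq_enable(struct model_checker *c,
			     struct model_lock *locks, size_t n)
{
	c->irqs_enabled = true;
	for (size_t i = 0; i < n; i++) {
		if (!locks[i].held)
			continue;
		if (locks[i].usage & MODEL_USED_IN_HARDIRQ)
			c->warnings++;
		locks[i].usage |= MODEL_ENABLED_HARDIRQ;
	}
}

/* The suspect sequence from the mail, with no locks held. */
int model_suspect_sequence(void)
{
	struct model_checker c = { .irqs_enabled = true, .warnings = 0 };

	model_irq_disable(&c);
	model_irq_disable(&c);
	(void)model_irq_save(&c);	/* saved flags are never restored */
	model_irq_enable(&c, NULL, 0);
	return c.warnings;
}

/* Enabling IRQs while holding a lock that is also used in hardirq. */
int model_conflicting_lock(void)
{
	struct model_checker c = { .irqs_enabled = false, .warnings = 0 };
	struct model_lock l = { .usage = MODEL_USED_IN_HARDIRQ, .held = true };

	model_irq_enable(&c, &l, 1);
	return c.warnings;
}

void model_lockdep_self_check(void)
{
	assert(model_suspect_sequence() == 0);
	assert(model_conflicting_lock() == 1);
}
```

In this model the suspect sequence produces no warning because no held lock
is involved, while the second case warns exactly once. That mirrors why a
schedule() call with IRQs off would go unnoticed by such a check.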
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -573,6 +573,7 @@ do_work_pending(struct pt_regs *regs, un
         trace_hardirqs_off();
         do {
                 if (likely(thread_flags & _TIF_NEED_RESCHED)) {
+                        local_irq_enable();
                         schedule();
                 } else {
                         if (unlikely(!user_mode(regs)))
do_work_pending() calls schedule() with interrupts disabled, which is just
wrong. Fix it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/arm/kernel/signal.c |    1 +
 1 file changed, 1 insertion(+)