Message ID: 20240307160951.3607374-1-stefan.wiehler@nokia.com (mailing list archive)
State: New, archived
Series: arm: smp: Avoid false positive CPU hotplug Lockdep-RCU splat
On Thu, Mar 07, 2024 at 05:09:51PM +0100, Stefan Wiehler wrote:
> With CONFIG_PROVE_RCU_LIST=y and by executing
>
>   $ echo 0 > /sys/devices/system/cpu/cpu1/online
>
> one can trigger the following Lockdep-RCU splat on ARM:
>
>   =============================
>   WARNING: suspicious RCU usage
>   6.8.0-rc7-00001-g0db1d0ed8958 #10 Not tainted
>   -----------------------------
>   kernel/locking/lockdep.c:3762 RCU-list traversed in non-reader section!!
>
>   other info that might help us debug this:
>
>   RCU used illegally from offline CPU!
>   rcu_scheduler_active = 2, debug_locks = 1
>   no locks held by swapper/1/0.
>
>   stack backtrace:
>   CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.8.0-rc7-00001-g0db1d0ed8958 #10
>   Hardware name: Allwinner sun8i Family
>    unwind_backtrace from show_stack+0x10/0x14
>    show_stack from dump_stack_lvl+0x60/0x90
>    dump_stack_lvl from lockdep_rcu_suspicious+0x150/0x1a0
>    lockdep_rcu_suspicious from __lock_acquire+0x11fc/0x29f8
>    __lock_acquire from lock_acquire+0x10c/0x348
>    lock_acquire from _raw_spin_lock_irqsave+0x50/0x6c
>    _raw_spin_lock_irqsave from check_and_switch_context+0x7c/0x4a8
>    check_and_switch_context from arch_cpu_idle_dead+0x10/0x7c
>    arch_cpu_idle_dead from do_idle+0xbc/0x138
>    do_idle from cpu_startup_entry+0x28/0x2c
>    cpu_startup_entry from secondary_start_kernel+0x11c/0x124
>    secondary_start_kernel from 0x401018a0
>
> The CPU is already reported as offline from RCU perspective in
> cpuhp_report_idle_dead() before arch_cpu_idle_dead() is invoked. Above
> RCU-Lockdep splat is then triggered by check_and_switch_context() acquiring
> the ASID spinlock.
>
> Avoid the false-positive Lockdep-RCU splat by briefly reporting the CPU as
> online again while the spinlock is held.
>
> Signed-off-by: Stefan Wiehler <stefan.wiehler@nokia.com>

From an RCU perspective, this looks plausible. One question below.

							Thanx, Paul

> ---
>  arch/arm/kernel/smp.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 3431c0553f45..6875e2c5dd50 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -319,7 +319,14 @@ void __noreturn arch_cpu_idle_dead(void)
>  {
>  	unsigned int cpu = smp_processor_id();
>  
> +	/*
> +	 * Briefly report CPU as online again to avoid false positive
> +	 * Lockdep-RCU splat when check_and_switch_context() acquires ASID
> +	 * spinlock.
> +	 */
> +	rcutree_report_cpu_starting(cpu);
>  	idle_task_exit();
> +	rcutree_report_cpu_dead();
>  
>  	local_irq_disable();

Both rcutree_report_cpu_starting() and rcutree_report_cpu_dead() complain
bitterly via lockdep if interrupts are enabled. And the call sites have
interrupts disabled. So I don't understand what this local_irq_disable()
is needed for.
On Thu, Mar 07, 2024 at 09:45:36AM -0800, Paul E. McKenney wrote:
> On Thu, Mar 07, 2024 at 05:09:51PM +0100, Stefan Wiehler wrote:
> > With CONFIG_PROVE_RCU_LIST=y and by executing
> >
> >   $ echo 0 > /sys/devices/system/cpu/cpu1/online
> >
> > one can trigger the following Lockdep-RCU splat on ARM:
[...]
> > Avoid the false-positive Lockdep-RCU splat by briefly reporting the CPU as
> > online again while the spinlock is held.
> >
> > Signed-off-by: Stefan Wiehler <stefan.wiehler@nokia.com>
>
> From an RCU perspective, this looks plausible. One question below.

But one additional caution... If execution is delayed during that call to
idle_task_exit(), RCU will stall and won't have a reasonable way of
motivating this CPU. Such delays could be due to vCPU preemption or due
to firmware grabbing the CPU.

But this is only a caution, not opposition. After all, you could have the
same problem with an online CPU that gets similarly delayed while its
interrupts are disabled.

							Thanx, Paul

> > ---
[...]
> Both rcutree_report_cpu_starting() and rcutree_report_cpu_dead() complain
> bitterly via lockdep if interrupts are enabled. And the call sites have
> interrupts disabled. So I don't understand what this local_irq_disable()
> is needed for.
[Cc Frederic]

On Thu, Mar 07, 2024 at 05:09:51PM +0100, Stefan Wiehler wrote:
> With CONFIG_PROVE_RCU_LIST=y and by executing
>
>   $ echo 0 > /sys/devices/system/cpu/cpu1/online
>
> one can trigger the following Lockdep-RCU splat on ARM:
[...]
> The CPU is already reported as offline from RCU perspective in
> cpuhp_report_idle_dead() before arch_cpu_idle_dead() is invoked. Above
> RCU-Lockdep splat is then triggered by check_and_switch_context() acquiring
> the ASID spinlock.
>
> Avoid the false-positive Lockdep-RCU splat by briefly reporting the CPU as

I'm not sure this is a false positive: if you traverse an RCU list without
RCU watching the CPU, then a grace period may pass, the next list node can
be garbage, and you can go anywhere from the list. Or am I missing
something, i.e. that CPU hotplug can block a grace period?

But codewise, it looks good to me. Although I hope we can avoid calling
rcutree_report_cpu_{starting,dead}() here and use something simpler,
after a few attempts I haven't found a better way.

Regards,
Boqun

> online again while the spinlock is held.
>
> Signed-off-by: Stefan Wiehler <stefan.wiehler@nokia.com>
> ---
[...]
> --
> 2.42.0
On Thu, Mar 07, 2024 at 09:45:36AM -0800, Paul E. McKenney wrote:
> > +	rcutree_report_cpu_starting(cpu);
> >  	idle_task_exit();
> > +	rcutree_report_cpu_dead();
> >  
> >  	local_irq_disable();
>
> Both rcutree_report_cpu_starting() and rcutree_report_cpu_dead() complain
> bitterly via lockdep if interrupts are enabled. And the call sites have
> interrupts disabled. So I don't understand what this local_irq_disable()
> is needed for.

I think that's a question for this commit:

commit e78a7614f3876ac649b3df608789cb6ef74d0480
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Wed Jun 5 07:46:43 2019 -0700

Before this commit, arch_cpu_idle_dead() was called with IRQs enabled.
This commit moved the local_irq_disable() before calling
arch_cpu_idle_dead(), but it seems no one looked at the various arch
implementations to clean those up. Quite how arch people are supposed to
spot this and clean up after such a commit, I'm not sure.

The local_irq_disable() that you're asking about has been there ever since
the inception of SMP on 32-bit ARM in this commit:

commit a054a811597a17ffbe92bc4db04a4dc2f1b1ea55
Author: Russell King <rmk@dyn-67.arm.linux.org.uk>
Date:   Wed Nov 2 22:24:33 2005 +0000

where cpu_die() was later renamed to arch_cpu_idle_dead(). So it's purely
a case of a change being made to core code and the arch code not receiving
any fixups for it.
On Thu, Mar 07, 2024 at 06:54:38PM +0000, Russell King (Oracle) wrote:
> On Thu, Mar 07, 2024 at 09:45:36AM -0800, Paul E. McKenney wrote:
[...]
> > Both rcutree_report_cpu_starting() and rcutree_report_cpu_dead() complain
> > bitterly via lockdep if interrupts are enabled. And the call sites have
> > interrupts disabled. So I don't understand what this local_irq_disable()
> > is needed for.
>
> I think that's a question for this commit:
>
> commit e78a7614f3876ac649b3df608789cb6ef74d0480
> Author: Peter Zijlstra <peterz@infradead.org>
> Date:   Wed Jun 5 07:46:43 2019 -0700
>
> Before this commit, arch_cpu_idle_dead() was called with IRQs enabled.
> This commit moved the local_irq_disable() before calling
> arch_cpu_idle_dead() but it seems no one looked at the various arch
> implementations to clean those up. Quite how arch people are supposed
> to spot this and clean up after such a commit, I'm not sure.

Telepathy? ;-)

> The local_irq_disable() that you're asking about has been there ever
> since the inception of SMP on 32-bit ARM in this commit:
>
> commit a054a811597a17ffbe92bc4db04a4dc2f1b1ea55
> Author: Russell King <rmk@dyn-67.arm.linux.org.uk>
> Date:   Wed Nov 2 22:24:33 2005 +0000
>
> Where cpu_die() was later renamed to arch_cpu_idle_dead(). So it's
> purely a case of a change being made to core code and arch code not
> receiving any fixups for it.

Thank you for the info!

							Thanx, Paul
> I'm not sure this is a false-positive: if you traverse an RCU-list
> without RCU watching the CPU, then a grace period may pass, the next
> list can be garbage and you can go anywhere from the list. Or am I
> missing something that a CPU hotplug can block a grace period?

Thanks for the feedback! Yes, calling it a false positive is probably not
correct; I will remove that. However, as Russell explained in a previous
email, the spinlock is fundamental to ASID handling and cannot be removed.

@Russell, I will submit to the patch tracking system then.
On 3/7/2024 11:09 AM, Stefan Wiehler wrote:
> With CONFIG_PROVE_RCU_LIST=y and by executing
>
>   $ echo 0 > /sys/devices/system/cpu/cpu1/online
>
> one can trigger the following Lockdep-RCU splat on ARM:
[...]
> Avoid the false-positive Lockdep-RCU splat by briefly reporting the CPU as
> online again while the spinlock is held.
>
> Signed-off-by: Stefan Wiehler <stefan.wiehler@nokia.com>
> ---
[...]
> +	rcutree_report_cpu_starting(cpu);
>  	idle_task_exit();
> +	rcutree_report_cpu_dead();

I agree with the problem but disagree with the patch because it feels like
a terrible workaround.

Can we just use arch_spin_lock() for the cpu_asid_lock? This might require
acquiring the raw_lock within the raw_spinlock_t, but there is precedent:

arch/powerpc/kvm/book3s_hv_rm_mmu.c:245:
	arch_spin_lock(&kvm->mmu_lock.rlock.raw_lock);

IMO, lockdep tracking of this lock is not necessary or possible considering
the hotplug situation.

Or is there a reason you need lockdep working for the cpu_asid_lock?

thanks,

- Joel
> I agree with the problem but disagree with the patch because it feels like a
> terrible workaround.
>
> Can we just use arch_spin_lock() for the cpu_asid_lock? This might require
> acquiring the raw_lock within the raw_spinlock_t, but there is precedent:
>
> arch/powerpc/kvm/book3s_hv_rm_mmu.c:245:
> 	arch_spin_lock(&kvm->mmu_lock.rlock.raw_lock);
>
> IMO, lockdep tracking of this lock is not necessary or possible considering the
> hotplug situation.
>
> Or is there a reason you need lockdep working for the cpu_asid_lock?

I was not aware of this possibility to bypass lockdep tracking, but this
seems to work and indeed looks like less of a workaround:

diff --git a/arch/arm/mm/context.c b/arch/arm/mm/context.c
index 4204ffa2d104..4fc2c559f1b6 100644
--- a/arch/arm/mm/context.c
+++ b/arch/arm/mm/context.c
@@ -254,7 +254,8 @@ void check_and_switch_context(struct mm_struct *mm, struct task_struct *tsk)
 	    && atomic64_xchg(&per_cpu(active_asids, cpu), asid))
 		goto switch_mm_fastpath;
 
-	raw_spin_lock_irqsave(&cpu_asid_lock, flags);
+	local_irq_save(flags);
+	arch_spin_lock(&cpu_asid_lock.raw_lock);
 	/* Check that our ASID belongs to the current generation. */
 	asid = atomic64_read(&mm->context.id);
 	if ((asid ^ atomic64_read(&asid_generation)) >> ASID_BITS) {
@@ -269,7 +270,8 @@ void check_and_switch_context(struct mm_struct *mm, struct task_struct *tsk)
 
 	atomic64_set(&per_cpu(active_asids, cpu), asid);
 	cpumask_set_cpu(cpu, mm_cpumask(mm));
-	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
+	arch_spin_unlock(&cpu_asid_lock.raw_lock);
+	local_irq_restore(flags);
 
 switch_mm_fastpath:
 	cpu_switch_mm(mm->pgd, mm);

@Russell, what do you think?
On Sat, Mar 09, 2024 at 08:45:35AM +0100, Stefan Wiehler wrote:
> > I agree with the problem but disagree with the patch because it feels like a
> > terrible workaround.
> >
> > Can we just use arch_spin_lock() for the cpu_asid_lock? This might require
> > acquiring the raw_lock within the raw_spinlock_t, but there is precedent:
[...]
> I was not aware of this possibility to bypass lockdep tracking, but this
> seems to work and indeed looks like less of a workaround:
>
> diff --git a/arch/arm/mm/context.c b/arch/arm/mm/context.c
> index 4204ffa2d104..4fc2c559f1b6 100644
[...]
> @Russell, what do you think?

I think this is Will Deacon's code, so we ought to hear from Will...
On Thu, Mar 07, 2024 at 05:09:51PM +0100, Stefan Wiehler wrote:
> With CONFIG_PROVE_RCU_LIST=y and by executing
>
>   $ echo 0 > /sys/devices/system/cpu/cpu1/online
>
> one can trigger the following Lockdep-RCU splat on ARM:
[...]
> Avoid the false-positive Lockdep-RCU splat by briefly reporting the CPU as
> online again while the spinlock is held.
>
> Signed-off-by: Stefan Wiehler <stefan.wiehler@nokia.com>

So what do I do about this? I see you submitted this to the patch system
on the 8th March, but proposed a different approach on the 9th March. I
don't see evidence that Paul is happy with the original approach either,
and there are no ack/r-b's on anything here.
> So what do I do about this? I see you submitted this to the patch system
> on the 8th March, but proposed a different approach on the 9th March. I
> don't see evidence that Paul is happy with the original approach either
> and there's no ack/r-b's on anything here.

I think we need to wait for feedback from Will Deacon regarding the new
arch_spin_lock() based approach. So please abandon the original proposal
in the patch system, at least for now…
On Mon, Mar 11, 2024 at 05:17:43PM +0100, Stefan Wiehler wrote:
> > So what do I do about this? I see you submitted this to the patch system
> > on the 8th March, but proposed a different approach on the 9th March. I
> > don't see evidence that Paul is happy with the original approach either
> > and there's no ack/r-b's on anything here.
>
> I think we need to wait for feedback from Will Deacon regarding the new
> arch_spin_lock() based approach. So please abandon the original proposal
> in the patch system, at least for now…

I prefer the arch_spin_lock() version, but the other approach also works.
I agree with Stefan that it would also be good to get Will's feedback.

The downside of the arch_spin_lock() approach is that, in theory, it
prevents lockdep from seeing deadlocks involving that lock acquisition.
But in practice, that is a problem only if additional locks can be
acquired while that lock is held, and there are no such additional locks,
right?

The downside of the original approach is that it further complicates RCU's
interaction with offline CPUs, especially should people copy that approach
in other pieces of offline code.

So, again, I prefer the arch_spin_lock() approach.

							Thanx, Paul
Hi Russell,

On Sat, Mar 09, 2024 at 09:57:04AM +0000, Russell King (Oracle) wrote:
> On Sat, Mar 09, 2024 at 08:45:35AM +0100, Stefan Wiehler wrote:
> > diff --git a/arch/arm/mm/context.c b/arch/arm/mm/context.c
> > index 4204ffa2d104..4fc2c559f1b6 100644
[...]
> > @Russell, what do you think?
>
> I think this is Will Deacon's code, so we ought to hear from Will...

Thanks for adding me in.

Using arch_spin_lock() really feels like a bodge to me. This code isn't
run only on the hot-unplug path, but rather this is part of switch_mm()
and we really should be able to have lockdep work properly there for the
usual case.

Now, do we actually need to worry about the ASID when switching to the
init_mm? I'd have thought that would be confined to global (kernel)
mappings, so I wonder whether we could avoid this slow path code
altogether like we do on arm64 in __switch_mm(). But I must confess that
I don't recall the details of the pre-LPAE MMU configuration...

Will
On Tue, Mar 12, 2024 at 10:14:40PM +0000, Will Deacon wrote:
> Hi Russell,
>
> On Sat, Mar 09, 2024 at 09:57:04AM +0000, Russell King (Oracle) wrote:
[...]
> Thanks for adding me in.
>
> Using arch_spin_lock() really feels like a bodge to me. This code isn't
> run only on the hot-unplug path, but rather this is part of switch_mm()
> and we really should be able to have lockdep work properly there for
> the usual case.
>
> Now, do we actually need to worry about the ASID when switching to the
> init_mm? I'd have thought that would be confined to global (kernel)
> mappings, so I wonder whether we could avoid this slow path code
> altogether like we do on arm64 in __switch_mm(). But I must confess that
> I don't recall the details of the pre-LPAE MMU configuration...

As the init_mm shouldn't have any userspace mappings, isn't the ASID
entirely redundant? Couldn't check_and_switch_context() just simply do
the vmalloc seq check, set the reserved ASID, and then head to
switch_mm_fastpath to call the mm switch code?
On Tue, Mar 12, 2024 at 10:39:30PM +0000, Russell King (Oracle) wrote:
> On Tue, Mar 12, 2024 at 10:14:40PM +0000, Will Deacon wrote:
[...]
> > Now, do we actually need to worry about the ASID when switching to the
> > init_mm? I'd have thought that would be confined to global (kernel)
> > mappings, so I wonder whether we could avoid this slow path code
> > altogether like we do on arm64 in __switch_mm(). But I must confess that
> > I don't recall the details of the pre-LPAE MMU configuration...
>
> As the init_mm shouldn't have any userspace mappings, isn't the ASID
> entirely redundant? Couldn't check_and_switch_context() just simply
> do the vmalloc seq check, set the reserved ASID, and then head to
> switch_mm_fastpath to call the mm switch code?

Right, that's what I was thinking too, but I have some distant memories
of the module space causing potential issues in some configurations. Does
that ring a bell with you?

Will
On Wed, Mar 13, 2024 at 12:32:44AM +0000, Will Deacon wrote: > On Tue, Mar 12, 2024 at 10:39:30PM +0000, Russell King (Oracle) wrote: > > On Tue, Mar 12, 2024 at 10:14:40PM +0000, Will Deacon wrote: > > > On Sat, Mar 09, 2024 at 09:57:04AM +0000, Russell King (Oracle) wrote: > > > > On Sat, Mar 09, 2024 at 08:45:35AM +0100, Stefan Wiehler wrote: > > > > > diff --git a/arch/arm/mm/context.c b/arch/arm/mm/context.c > > > > > index 4204ffa2d104..4fc2c559f1b6 100644 > > > > > --- a/arch/arm/mm/context.c > > > > > +++ b/arch/arm/mm/context.c > > > > > @@ -254,7 +254,8 @@ void check_and_switch_context(struct mm_struct *mm, struct task_struct *tsk) > > > > > && atomic64_xchg(&per_cpu(active_asids, cpu), asid)) > > > > > goto switch_mm_fastpath; > > > > > > > > > > - raw_spin_lock_irqsave(&cpu_asid_lock, flags); > > > > > + local_irq_save(flags); > > > > > + arch_spin_lock(&cpu_asid_lock.raw_lock); > > > > > /* Check that our ASID belongs to the current generation. */ > > > > > asid = atomic64_read(&mm->context.id); > > > > > if ((asid ^ atomic64_read(&asid_generation)) >> ASID_BITS) { > > > > > @@ -269,7 +270,8 @@ void check_and_switch_context(struct mm_struct *mm, struct task_struct *tsk) > > > > > > > > > > atomic64_set(&per_cpu(active_asids, cpu), asid); > > > > > cpumask_set_cpu(cpu, mm_cpumask(mm)); > > > > > - raw_spin_unlock_irqrestore(&cpu_asid_lock, flags); > > > > > + arch_spin_unlock(&cpu_asid_lock.raw_lock); > > > > > + local_irq_restore(flags); > > > > > > > > > > switch_mm_fastpath: > > > > > cpu_switch_mm(mm->pgd, mm); > > > > > > > > > > @Russell, what do you think? > > > > > > > > I think this is Will Deacon's code, so we ought to hear from Will... > > > > > > Thanks for adding me in. > > > > > > Using arch_spin_lock() really feels like a bodge to me. This code isn't > > > run only on the hot-unplug path, but rather this is part of switch_mm() > > > and we really should be able to have lockdep work properly there for > > > the usual case. 
> > > Now, do we actually need to worry about the ASID when switching to the
> > > init_mm? I'd have thought that would be confined to global (kernel)
> > > mappings, so I wonder whether we could avoid this slow path code
> > > altogether like we do on arm64 in __switch_mm(). But I must confess that
> > > I don't recall the details of the pre-LPAE MMU configuration...
> >
> > As the init_mm shouldn't have any userspace mappings, isn't the ASID
> > entirely redundant? Couldn't check_and_switch_context() just simply
> > do the vmalloc seq check, set the reserved ASID, and then head to
> > switch_mm_fastpath to call the mm switch code?
>
> Right, that's what I was thinking too, but I have some distant memories
> of the module space causing potential issues in some configurations. Does
> that ring a bell with you?

For 32-bit non-LPAE, I can't recall any issues, nor can I think of any
because module space is just another few entries in the L1 page tables
below the direct mapping (which isn't a problem because we don't use
anything in hardware to separate the kernel space from user space in
the page tables.) TTBCR is set to 0.

For LPAE, there may be issues there because TTBR0 and TTBR1 are both
used, and TTBCR.T1SZ is set non-zero to:

arch/arm/include/asm/pgtable-3level-hwdef.h:#define TTBR1_SIZE (((PAGE_OFFSET >> 30) - 1) << 16)

so I suspect that's where the problems may lie - but then module
mappings should also exist in init_mm (swapper_pg_dir) and should be
global.
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 3431c0553f45..6875e2c5dd50 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -319,7 +319,14 @@ void __noreturn arch_cpu_idle_dead(void)
 {
 	unsigned int cpu = smp_processor_id();
 
+	/*
+	 * Briefly report CPU as online again to avoid false positive
+	 * Lockdep-RCU splat when check_and_switch_context() acquires ASID
+	 * spinlock.
+	 */
+	rcutree_report_cpu_starting(cpu);
 	idle_task_exit();
+	rcutree_report_cpu_dead();
 
 	local_irq_disable();
With CONFIG_PROVE_RCU_LIST=y and by executing

  $ echo 0 > /sys/devices/system/cpu/cpu1/online

one can trigger the following Lockdep-RCU splat on ARM:

  =============================
  WARNING: suspicious RCU usage
  6.8.0-rc7-00001-g0db1d0ed8958 #10 Not tainted
  -----------------------------
  kernel/locking/lockdep.c:3762 RCU-list traversed in non-reader section!!

  other info that might help us debug this:

  RCU used illegally from offline CPU!
  rcu_scheduler_active = 2, debug_locks = 1
  no locks held by swapper/1/0.

  stack backtrace:
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.8.0-rc7-00001-g0db1d0ed8958 #10
  Hardware name: Allwinner sun8i Family
   unwind_backtrace from show_stack+0x10/0x14
   show_stack from dump_stack_lvl+0x60/0x90
   dump_stack_lvl from lockdep_rcu_suspicious+0x150/0x1a0
   lockdep_rcu_suspicious from __lock_acquire+0x11fc/0x29f8
   __lock_acquire from lock_acquire+0x10c/0x348
   lock_acquire from _raw_spin_lock_irqsave+0x50/0x6c
   _raw_spin_lock_irqsave from check_and_switch_context+0x7c/0x4a8
   check_and_switch_context from arch_cpu_idle_dead+0x10/0x7c
   arch_cpu_idle_dead from do_idle+0xbc/0x138
   do_idle from cpu_startup_entry+0x28/0x2c
   cpu_startup_entry from secondary_start_kernel+0x11c/0x124
   secondary_start_kernel from 0x401018a0

The CPU is already reported as offline from RCU perspective in
cpuhp_report_idle_dead() before arch_cpu_idle_dead() is invoked. Above
RCU-Lockdep splat is then triggered by check_and_switch_context()
acquiring the ASID spinlock.

Avoid the false-positive Lockdep-RCU splat by briefly reporting the CPU
as online again while the spinlock is held.

Signed-off-by: Stefan Wiehler <stefan.wiehler@nokia.com>
---
 arch/arm/kernel/smp.c | 7 +++++++
 1 file changed, 7 insertions(+)