Message ID | 1568392064-3052-1-git-send-email-cai@lca.pw (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm/slub: fix a deadlock in shuffle_freelist() | expand |
On 2019-09-13 12:27:44 [-0400], Qian Cai wrote: … > Chain exists of: > random_write_wait.lock --> &rq->lock --> batched_entropy_u32.lock > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(batched_entropy_u32.lock); > lock(&rq->lock); > lock(batched_entropy_u32.lock); > lock(random_write_wait.lock); would this deadlock still occur if lockdep knew that batched_entropy_u32.lock on CPU0 could be acquired at the same time as CPU1 acquired its batched_entropy_u32.lock? Sebastian
On Mon, 2019-09-16 at 11:03 +0200, Sebastian Andrzej Siewior wrote: > On 2019-09-13 12:27:44 [-0400], Qian Cai wrote: > … > > Chain exists of: > > random_write_wait.lock --> &rq->lock --> batched_entropy_u32.lock > > > > Possible unsafe locking scenario: > > > > CPU0 CPU1 > > ---- ---- > > lock(batched_entropy_u32.lock); > > lock(&rq->lock); > > lock(batched_entropy_u32.lock); > > lock(random_write_wait.lock); > > would this deadlock still occur if lockdep knew that > batched_entropy_u32.lock on CPU0 could be acquired at the same time > as CPU1 acquired its batched_entropy_u32.lock? I suppose that might fix it too if it can teach the lockdep the trick, but it would be better if there is a patch if you have something in mind that could be tested to make sure.
On 2019-09-16 10:01:27 [-0400], Qian Cai wrote: > On Mon, 2019-09-16 at 11:03 +0200, Sebastian Andrzej Siewior wrote: > > On 2019-09-13 12:27:44 [-0400], Qian Cai wrote: > > … > > > Chain exists of: > > > random_write_wait.lock --> &rq->lock --> batched_entropy_u32.lock > > > > > > Possible unsafe locking scenario: > > > > > > CPU0 CPU1 > > > ---- ---- > > > lock(batched_entropy_u32.lock); > > > lock(&rq->lock); > > > lock(batched_entropy_u32.lock); > > > lock(random_write_wait.lock); > > > > would this deadlock still occur if lockdep knew that > > batched_entropy_u32.lock on CPU0 could be acquired at the same time > > as CPU1 acquired its batched_entropy_u32.lock? > > I suppose that might fix it too if it can teach the lockdep the trick, but it > would be better if there is a patch if you have something in mind that could be > tested to make sure. get_random_bytes() is heavier than get_random_int() so I would prefer to avoid its usage to fix what looks like a false positive report from lockdep. But no, I don't have a patch sitting around. A lock in per-CPU memory could lead to the scenario mentioned above if the lock could be obtained cross-CPU it just isn't so in that case. So I don't think it is that simple. Sebastian
On Mon, 2019-09-16 at 21:51 +0200, Sebastian Andrzej Siewior wrote: > On 2019-09-16 10:01:27 [-0400], Qian Cai wrote: > > On Mon, 2019-09-16 at 11:03 +0200, Sebastian Andrzej Siewior wrote: > > > On 2019-09-13 12:27:44 [-0400], Qian Cai wrote: > > > … > > > > Chain exists of: > > > > random_write_wait.lock --> &rq->lock --> batched_entropy_u32.lock > > > > > > > > Possible unsafe locking scenario: > > > > > > > > CPU0 CPU1 > > > > ---- ---- > > > > lock(batched_entropy_u32.lock); > > > > lock(&rq->lock); > > > > lock(batched_entropy_u32.lock); > > > > lock(random_write_wait.lock); > > > > > > would this deadlock still occur if lockdep knew that > > > batched_entropy_u32.lock on CPU0 could be acquired at the same time > > > as CPU1 acquired its batched_entropy_u32.lock? > > > > I suppose that might fix it too if it can teach the lockdep the trick, but it > > would be better if there is a patch if you have something in mind that could be > > tested to make sure. > > get_random_bytes() is heavier than get_random_int() so I would prefer to > avoid its usage to fix what looks like a false positive report from > lockdep. > But no, I don't have a patch sitting around. A lock in per-CPU memory > could lead to the scenario mentioned above if the lock could be obtained > cross-CPU it just isn't so in that case. So I don't think it is that > simple. get_random_u64() is also busted. [ 752.925079] WARNING: possible circular locking dependency detected [ 752.931951] 5.3.0-rc8-next-20190915+ #2 Tainted: G L [ 752.938906] ------------------------------------------------------ [ 752.945774] ls/9665 is trying to acquire lock: [ 752.950905] ffff90001311fef8 (random_write_wait.lock){..-.}, at: __wake_up_common_lock+0xa8/0x11c [ 752.960481] but task is already holding lock: [ 752.967698] ffff008abc7b9c00 (batched_entropy_u64.lock){....}, at: get_random_u64+0x6c/0x1dc [ 752.976835] which lock already depends on the new lock. [ 752.987089] the existing dependency chain (in reverse order) is: [ 752.995953] -> #4 (batched_entropy_u64.lock){....}: [ 753.003702] lock_acquire+0x320/0x364 [ 753.008577] _raw_spin_lock_irqsave+0x7c/0x9c [ 753.014145] get_random_u64+0x6c/0x1dc [ 753.019109] add_to_free_area_random+0x54/0x1c8 [ 753.024851] free_one_page+0x86c/0xc28 [ 753.029818] __free_pages_ok+0x69c/0xdac [ 753.034960] __free_pages+0xbc/0xf8 [ 753.039663] __free_pages_core+0x2ac/0x3c0 [ 753.044973] memblock_free_pages+0xe0/0xf8 [ 753.050281] __free_pages_memory+0xcc/0xfc [ 753.055588] __free_memory_core+0x70/0x78 [ 753.060809] free_low_memory_core_early+0x148/0x18c [ 753.066897] memblock_free_all+0x18/0x54 [ 753.072033] mem_init+0x9c/0x160 [ 753.076472] mm_init+0x14/0x38 [ 753.080737] start_kernel+0x19c/0x52c [ 753.085607] -> #3 (&(&zone->lock)->rlock){..-.}: [ 753.093092] lock_acquire+0x320/0x364 [ 753.097964] _raw_spin_lock+0x64/0x80 [ 753.102839] rmqueue_bulk+0x50/0x15a0 [ 753.107712] get_page_from_freelist+0x2260/0x29dc [ 753.113627] __alloc_pages_nodemask+0x36c/0x1ce0 [ 753.119457] alloc_page_interleave+0x34/0x17c [ 753.125023] alloc_pages_current+0x80/0xe0 [ 753.130334] allocate_slab+0xfc/0x1d80 [ 753.135296] ___slab_alloc+0x5d4/0xa70 [ 753.140257] kmem_cache_alloc+0x588/0x66c [ 753.145480] __debug_object_init+0x9d8/0xbac [ 753.150962] debug_object_init+0x40/0x50 [ 753.156098] hrtimer_init+0x38/0x2b4 [ 753.160885] init_dl_task_timer+0x24/0x44 [ 753.166108] __sched_fork+0xc0/0x168 [ 753.170894] init_idle+0x80/0x3d8 [ 753.175420] idle_thread_get+0x60/0x8c [ 753.180385] _cpu_up+0x10c/0x348 [ 753.184824] do_cpu_up+0x114/0x170 [ 753.189437] cpu_up+0x20/0x2c [ 753.193615] smp_init+0xf8/0x1bc [ 753.198054] kernel_init_freeable+0x198/0x26c [ 753.203622] kernel_init+0x18/0x334 [ 753.208323] ret_from_fork+0x10/0x18 [ 753.213107] -> #2 (&rq->lock){-.-.}: [ 753.219550] lock_acquire+0x320/0x364 [ 753.224423] _raw_spin_lock+0x64/0x80 [ 753.229299] task_fork_fair+0x64/0x22c [ 753.234261] sched_fork+0x24c/0x3d8 [ 753.238962] copy_process+0xa60/0x29b0 [ 753.243921] _do_fork+0xb8/0xa64 [ 753.248360] kernel_thread+0xc4/0xf4 [ 753.253147] rest_init+0x30/0x320 [ 753.257673] arch_call_rest_init+0x10/0x18 [ 753.262980] start_kernel+0x424/0x52c [ 753.267849] -> #1 (&p->pi_lock){-.-.}: [ 753.274467] lock_acquire+0x320/0x364 [ 753.279342] _raw_spin_lock_irqsave+0x7c/0x9c [ 753.284910] try_to_wake_up+0x74/0x128c [ 753.289959] default_wake_function+0x38/0x48 [ 753.295440] pollwake+0x118/0x158 [ 753.299967] __wake_up_common+0x16c/0x240 [ 753.305187] __wake_up_common_lock+0xc8/0x11c [ 753.310754] __wake_up+0x3c/0x4c [ 753.315193] account+0x390/0x3e0 [ 753.319632] extract_entropy+0x2cc/0x37c [ 753.324766] _xfer_secondary_pool+0x35c/0x3c4 [ 753.330333] push_to_pool+0x54/0x308 [ 753.335119] process_one_work+0x558/0xb1c [ 753.340339] worker_thread+0x494/0x650 [ 753.345300] kthread+0x1cc/0x1e8 [ 753.349739] ret_from_fork+0x10/0x18 [ 753.354522] -> #0 (random_write_wait.lock){..-.}: [ 753.362093] validate_chain+0xfcc/0x2fd4 [ 753.367227] __lock_acquire+0x868/0xc2c [ 753.372274] lock_acquire+0x320/0x364 [ 753.377147] _raw_spin_lock_irqsave+0x7c/0x9c [ 753.382715] __wake_up_common_lock+0xa8/0x11c [ 753.388282] __wake_up+0x3c/0x4c [ 753.392720] account+0x390/0x3e0 [ 753.397159] extract_entropy+0x2cc/0x37c [ 753.402292] crng_reseed+0x60/0x350 [ 753.406991] _extract_crng+0xd8/0x164 [ 753.411864] crng_reseed+0x7c/0x350 [ 753.416563] _extract_crng+0xd8/0x164 [ 753.421436] get_random_u64+0xec/0x1dc [ 753.426396] arch_mmap_rnd+0x18/0x78 [ 753.431187] load_elf_binary+0x6d0/0x1730 [ 753.436411] search_binary_handler+0x10c/0x35c [ 753.442067] __do_execve_file+0xb58/0xf7c [ 753.447287] __arm64_sys_execve+0x6c/0xa4 [ 753.452509] el0_svc_handler+0x170/0x240 [ 753.457643] el0_svc+0x8/0xc [ 753.461732] other info that might help us debug this: [ 753.471812] Chain exists of: random_write_wait.lock --> &(&zone->lock)->rlock --> batched_entropy_u64.lock [ 753.486588] Possible unsafe locking scenario: [ 753.493890] CPU0 CPU1 [ 753.499108] ---- ---- [ 753.504324] lock(batched_entropy_u64.lock); [ 753.509372] lock(&(&zone->lock)->rlock); [ 753.516675] lock(batched_entropy_u64.lock); [ 753.524238] lock(random_write_wait.lock); [ 753.529113] *** DEADLOCK *** [ 753.537111] 1 lock held by ls/9665: [ 753.541287] #0: ffff008abc7b9c00 (batched_entropy_u64.lock){....}, at: get_random_u64+0x6c/0x1dc [ 753.550858] stack backtrace: [ 753.556602] CPU: 121 PID: 9665 Comm: ls Tainted: G L 5.3.0- rc8-next-20190915+ #2 [ 753.565987] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019 [ 753.576414] Call trace: [ 753.579553] dump_backtrace+0x0/0x264 [ 753.583905] show_stack+0x20/0x2c [ 753.587911] dump_stack+0xd0/0x140 [ 753.592003] print_circular_bug+0x368/0x380 [ 753.596876] check_noncircular+0x28c/0x294 [ 753.601664] validate_chain+0xfcc/0x2fd4 [ 753.606276] __lock_acquire+0x868/0xc2c [ 753.610802] lock_acquire+0x320/0x364 [ 753.615154] _raw_spin_lock_irqsave+0x7c/0x9c [ 753.620202] __wake_up_common_lock+0xa8/0x11c [ 753.625248] __wake_up+0x3c/0x4c [ 753.629171] account+0x390/0x3e0 [ 753.633095] extract_entropy+0x2cc/0x37c [ 753.637708] crng_reseed+0x60/0x350 [ 753.641887] _extract_crng+0xd8/0x164 [ 753.646238] crng_reseed+0x7c/0x350 [ 753.650417] _extract_crng+0xd8/0x164 [ 753.654768] get_random_u64+0xec/0x1dc [ 753.659208] arch_mmap_rnd+0x18/0x78 [ 753.663474] load_elf_binary+0x6d0/0x1730 [ 753.668173] search_binary_handler+0x10c/0x35c [ 753.673308] __do_execve_file+0xb58/0xf7c [ 753.678007] __arm64_sys_execve+0x6c/0xa4 [ 753.682707] el0_svc_handler+0x170/0x240 [ 753.687319] el0_svc+0x8/0xc
On 2019-09-16 17:31:34 [-0400], Qian Cai wrote: … > get_random_u64() is also busted. … > [ 753.486588] Possible unsafe locking scenario: > > [ 753.493890] CPU0 CPU1 > [ 753.499108] ---- ---- > [ 753.504324] lock(batched_entropy_u64.lock); > [ 753.509372] lock(&(&zone->lock)->rlock); > [ 753.516675] lock(batched_entropy_u64.lock); > [ 753.524238] lock(random_write_wait.lock); > [ 753.529113] > *** DEADLOCK *** This is the same scenario as the previous one in regard to the batched_entropy_….lock. Sebastian
On Tue, 2019-09-17 at 09:16 +0200, Sebastian Andrzej Siewior wrote: > On 2019-09-16 17:31:34 [-0400], Qian Cai wrote: > … > > get_random_u64() is also busted. > > … > > [ 753.486588] Possible unsafe locking scenario: > > > > [ 753.493890] CPU0 CPU1 > > [ 753.499108] ---- ---- > > [ 753.504324] lock(batched_entropy_u64.lock); > > [ 753.509372] lock(&(&zone->lock)->rlock); > > [ 753.516675] lock(batched_entropy_u64.lock); > > [ 753.524238] lock(random_write_wait.lock); > > [ 753.529113] > > *** DEADLOCK *** > > This is the same scenario as the previous one in regard to the > batched_entropy_….lock. The commit 383776fa7527 ("locking/lockdep: Handle statically initialized percpu locks properly") which increased the chance of false positives for percpu locks significantly especially for large systems like in those examples since it makes all of them the same class. Once there happens a false positive, lockdep will become useless. In reality, each percpu lock is a different lock as we have seen in those examples where each CPU only take a local one. The only thing that should worry about is the path that another CPU could take a non-local percpu lock. For example, invalidate_batched_entropy() which is a for_each_possible_cpu() call. Is there any other place that another CPU could take a non-local percpu lock but not a for_each_possible_cpu() or similar iterator? Even before the above commit, if the system is running long enough, it could still catch a deadlock from those percpu lock iterators since it will register each percpu lock usage in lockdep Overall, it sounds to me the side-effects of commit 383776fa7527 outweight the benefits, and should be reverted. Do I miss anything?
On Fri, Sep 13, 2019 at 12:27:44PM -0400, Qian Cai wrote: > The commit b7d5dc21072c ("random: add a spinlock_t to struct > batched_entropy") insists on acquiring "batched_entropy_u32.lock" in > get_random_u32() which introduced the lock chain, > > "&rq->lock --> batched_entropy_u32.lock" > > even after crng init. As the result, it could result in deadlock below. > Fix it by using get_random_bytes() in shuffle_freelist() which does not > need to take on the batched_entropy locks. > > WARNING: possible circular locking dependency detected > 5.3.0-rc7-mm1+ #3 Tainted: G L > ------------------------------------------------------ > make/7937 is trying to acquire lock: > ffff900012f225f8 (random_write_wait.lock){....}, at: > __wake_up_common_lock+0xa8/0x11c > > but task is already holding lock: > ffff0096b9429c00 (batched_entropy_u32.lock){-.-.}, at: > get_random_u32+0x6c/0x1dc > > which lock already depends on the new lock. > > the existing dependency chain (in reverse order) is: > > -> #3 (batched_entropy_u32.lock){-.-.}: > lock_acquire+0x31c/0x360 > _raw_spin_lock_irqsave+0x7c/0x9c > get_random_u32+0x6c/0x1dc > new_slab+0x234/0x6c0 > ___slab_alloc+0x3c8/0x650 > kmem_cache_alloc+0x4b0/0x590 > __debug_object_init+0x778/0x8b4 > debug_object_init+0x40/0x50 > debug_init+0x30/0x29c > hrtimer_init+0x30/0x50 > init_dl_task_timer+0x24/0x44 > __sched_fork+0xc0/0x168 > init_idle+0x78/0x26c > fork_idle+0x12c/0x178 > idle_threads_init+0x108/0x178 > smp_init+0x20/0x1bc > kernel_init_freeable+0x198/0x26c > kernel_init+0x18/0x334 > ret_from_fork+0x10/0x18 > > -> #2 (&rq->lock){-.-.}: This relation is silly.. I suspect the below 'works'... diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 63900ca029e0..ec1d72f18b34 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6027,10 +6027,11 @@ void init_idle(struct task_struct *idle, int cpu) struct rq *rq = cpu_rq(cpu); unsigned long flags; + __sched_fork(0, idle); + raw_spin_lock_irqsave(&idle->pi_lock, flags); raw_spin_lock(&rq->lock); - __sched_fork(0, idle); idle->state = TASK_RUNNING; idle->se.exec_start = sched_clock(); idle->flags |= PF_IDLE;
On Wed, 2019-09-25 at 11:31 +0200, Peter Zijlstra wrote: > On Fri, Sep 13, 2019 at 12:27:44PM -0400, Qian Cai wrote: > > The commit b7d5dc21072c ("random: add a spinlock_t to struct > > batched_entropy") insists on acquiring "batched_entropy_u32.lock" in > > get_random_u32() which introduced the lock chain, > > > > "&rq->lock --> batched_entropy_u32.lock" > > > > even after crng init. As the result, it could result in deadlock below. > > Fix it by using get_random_bytes() in shuffle_freelist() which does not > > need to take on the batched_entropy locks. > > > > WARNING: possible circular locking dependency detected > > 5.3.0-rc7-mm1+ #3 Tainted: G L > > ------------------------------------------------------ > > make/7937 is trying to acquire lock: > > ffff900012f225f8 (random_write_wait.lock){....}, at: > > __wake_up_common_lock+0xa8/0x11c > > > > but task is already holding lock: > > ffff0096b9429c00 (batched_entropy_u32.lock){-.-.}, at: > > get_random_u32+0x6c/0x1dc > > > > which lock already depends on the new lock. > > > > the existing dependency chain (in reverse order) is: > > > > -> #3 (batched_entropy_u32.lock){-.-.}: > > lock_acquire+0x31c/0x360 > > _raw_spin_lock_irqsave+0x7c/0x9c > > get_random_u32+0x6c/0x1dc > > new_slab+0x234/0x6c0 > > ___slab_alloc+0x3c8/0x650 > > kmem_cache_alloc+0x4b0/0x590 > > __debug_object_init+0x778/0x8b4 > > debug_object_init+0x40/0x50 > > debug_init+0x30/0x29c > > hrtimer_init+0x30/0x50 > > init_dl_task_timer+0x24/0x44 > > __sched_fork+0xc0/0x168 > > init_idle+0x78/0x26c > > fork_idle+0x12c/0x178 > > idle_threads_init+0x108/0x178 > > smp_init+0x20/0x1bc > > kernel_init_freeable+0x198/0x26c > > kernel_init+0x18/0x334 > > ret_from_fork+0x10/0x18 > > > > -> #2 (&rq->lock){-.-.}: > > This relation is silly.. > > I suspect the below 'works'... Unfortunately, the relation is still there, copy_process()->rt_mutex_init_task()->"&p->pi_lock" [24438.676716][ T2] -> #2 (&rq->lock){-.-.}: [24438.676727][ T2] __lock_acquire+0x5b4/0xbf0 [24438.676736][ T2] lock_acquire+0x130/0x360 [24438.676754][ T2] _raw_spin_lock+0x54/0x80 [24438.676771][ T2] task_fork_fair+0x60/0x190 [24438.676788][ T2] sched_fork+0x128/0x270 [24438.676806][ T2] copy_process+0x7a4/0x1bf0 [24438.676823][ T2] _do_fork+0xac/0xac0 [24438.676841][ T2] kernel_thread+0x70/0xa0 [24438.676858][ T2] rest_init+0x4c/0x42c [24438.676884][ T2] start_kernel+0x778/0x7c0 [24438.676902][ T2] start_here_common+0x1c/0x334 Whole thing, [24438.675704][ T2] WARNING: possible circular locking dependency detected [24438.675714][ T2] 5.3.0-next-20190924 #2 Not tainted [24438.675722][ T2] ------------------------------------------------------ [24438.675731][ T2] kthreadd/2 is trying to acquire lock: [24438.675740][ T2] c0000000010a7450 (random_write_wait.lock){..-.}, at: __wake_up_common_lock+0x88/0x110 [24438.675768][ T2] [24438.675768][ T2] but task is already holding lock: [24438.675778][ T2] c000001ffd2f06e0 (batched_entropy_u64.lock){-...}, at: get_random_u64+0x60/0x100 [24438.675803][ T2] [24438.675803][ T2] which lock already depends on the new lock. [24438.675803][ T2] [24438.675816][ T2] [24438.675816][ T2] the existing dependency chain (in reverse order) is: [24438.675836][ T2] [24438.675836][ T2] -> #4 (batched_entropy_u64.lock){-...}: [24438.675860][ T2] __lock_acquire+0x5b4/0xbf0 [24438.675878][ T2] lock_acquire+0x130/0x360 [24438.675906][ T2] _raw_spin_lock_irqsave+0x70/0xa0 [24438.675923][ T2] get_random_u64+0x60/0x100 [24438.675944][ T2] add_to_free_area_random+0x164/0x1b0 [24438.675962][ T2] free_one_page+0xb24/0xcf0 [24438.675980][ T2] __free_pages_ok+0x448/0xbf0 [24438.675999][ T2] deferred_init_maxorder+0x404/0x4a4 [24438.676018][ T2] deferred_grow_zone+0x158/0x1f0 [24438.676035][ T2] get_page_from_freelist+0x1dc8/0x1e10 [24438.676063][ T2] __alloc_pages_nodemask+0x1d8/0x1940 [24438.676083][ T2] allocate_slab+0x130/0x2740 [24438.676091][ T2] new_slab+0xa8/0xe0 [24438.676101][ T2] kmem_cache_open+0x254/0x660 [24438.676119][ T2] __kmem_cache_create+0x44/0x2a0 [24438.676136][ T2] create_boot_cache+0xcc/0x110 [24438.676154][ T2] kmem_cache_init+0x90/0x1f0 [24438.676173][ T2] start_kernel+0x3b8/0x7c0 [24438.676191][ T2] start_here_common+0x1c/0x334 [24438.676208][ T2] [24438.676208][ T2] -> #3 (&(&zone->lock)->rlock){-.-.}: [24438.676221][ T2] __lock_acquire+0x5b4/0xbf0 [24438.676247][ T2] lock_acquire+0x130/0x360 [24438.676264][ T2] _raw_spin_lock+0x54/0x80 [24438.676282][ T2] rmqueue_bulk.constprop.23+0x64/0xf20 [24438.676300][ T2] get_page_from_freelist+0x718/0x1e10 [24438.676319][ T2] __alloc_pages_nodemask+0x1d8/0x1940 [24438.676339][ T2] alloc_page_interleave+0x34/0x170 [24438.676356][ T2] allocate_slab+0xd1c/0x2740 [24438.676374][ T2] new_slab+0xa8/0xe0 [24438.676391][ T2] ___slab_alloc+0x580/0xef0 [24438.676408][ T2] __slab_alloc+0x64/0xd0 [24438.676426][ T2] kmem_cache_alloc+0x5c4/0x6c0 [24438.676444][ T2] fill_pool+0x280/0x540 [24438.676461][ T2] __debug_object_init+0x60/0x6b0 [24438.676479][ T2] hrtimer_init+0x5c/0x310 [24438.676497][ T2] init_dl_task_timer+0x34/0x60 [24438.676516][ T2] __sched_fork+0x8c/0x110 [24438.676535][ T2] init_idle+0xb4/0x3c0 [24438.676553][ T2] idle_thread_get+0x78/0x120 [24438.676572][ T2] bringup_cpu+0x30/0x230 [24438.676590][ T2] cpuhp_invoke_callback+0x190/0x1580 [24438.676618][ T2] do_cpu_up+0x248/0x460 [24438.676636][ T2] smp_init+0x118/0x1c0 [24438.676662][ T2] kernel_init_freeable+0x3f8/0x8dc [24438.676681][ T2] kernel_init+0x2c/0x154 [24438.676699][ T2] ret_from_kernel_thread+0x5c/0x74 [24438.676716][ T2] [24438.676716][ T2] -> #2 (&rq->lock){-.-.}: [24438.676727][ T2] __lock_acquire+0x5b4/0xbf0 [24438.676736][ T2] lock_acquire+0x130/0x360 [24438.676754][ T2] _raw_spin_lock+0x54/0x80 [24438.676771][ T2] task_fork_fair+0x60/0x190 [24438.676788][ T2] sched_fork+0x128/0x270 [24438.676806][ T2] copy_process+0x7a4/0x1bf0 [24438.676823][ T2] _do_fork+0xac/0xac0 [24438.676841][ T2] kernel_thread+0x70/0xa0 [24438.676858][ T2] rest_init+0x4c/0x42c [24438.676884][ T2] start_kernel+0x778/0x7c0 [24438.676902][ T2] start_here_common+0x1c/0x334 [24438.676910][ T2] [24438.676910][ T2] -> #1 (&p->pi_lock){-.-.}: [24438.676921][ T2] __lock_acquire+0x5b4/0xbf0 [24438.676929][ T2] lock_acquire+0x130/0x360 [24438.676947][ T2] _raw_spin_lock_irqsave+0x70/0xa0 [24438.676973][ T2] try_to_wake_up+0x70/0x1600 [24438.676991][ T2] pollwake+0x88/0xc0 [24438.677009][ T2] __wake_up_common+0xec/0x280 [24438.677026][ T2] __wake_up_common_lock+0xac/0x110 [24438.677044][ T2] account.constprop.8+0x284/0x430 [24438.677061][ T2] extract_entropy.constprop.7+0xd4/0x330 [24438.677080][ T2] _xfer_secondary_pool+0x104/0x3e0 [24438.677097][ T2] push_to_pool+0x58/0x310 [24438.677116][ T2] process_one_work+0x300/0x8e0 [24438.677133][ T2] worker_thread+0x78/0x530 [24438.677151][ T2] kthread+0x1a8/0x1b0 [24438.677180][ T2] ret_from_kernel_thread+0x5c/0x74 [24438.677245][ T2] [24438.677245][ T2] -> #0 (random_write_wait.lock){..-.}: [24438.677329][ T2] check_prev_add+0x100/0x11b0 [24438.677377][ T2] validate_chain+0x868/0x1530 [24438.677446][ T2] __lock_acquire+0x5b4/0xbf0 [24438.677516][ T2] lock_acquire+0x130/0x360 [24438.677563][ T2] _raw_spin_lock_irqsave+0x70/0xa0 [24438.677618][ T2] __wake_up_common_lock+0x88/0x110 [24438.677678][ T2] account.constprop.8+0x284/0x430 [24438.677743][ T2] extract_entropy.constprop.7+0xd4/0x330 [24438.677802][ T2] crng_reseed+0x68/0x490 [24438.677867][ T2] _extract_crng+0x104/0x110 [24438.677914][ T2] crng_reseed+0x284/0x490 [24438.677983][ T2] _extract_crng+0x104/0x110 [24438.678032][ T2] get_random_u64+0xdc/0x100 [24438.678101][ T2] copy_process+0x2d8/0x1bf0 [24438.678148][ T2] _do_fork+0xac/0xac0 [24438.678208][ T2] kernel_thread+0x70/0xa0 [24438.678246][ T2] kthreadd+0x270/0x330 [24438.678301][ T2] ret_from_kernel_thread+0x5c/0x74 [24438.678342][ T2] [24438.678342][ T2] other info that might help us debug this: [24438.678342][ T2] [24438.678459][ T2] Chain exists of: [24438.678459][ T2] random_write_wait.lock --> &(&zone->lock)->rlock --> batched_entropy_u64.lock [24438.678459][ T2] [24438.678636][ T2] Possible unsafe locking scenario: [24438.678636][ T2] [24438.678692][ T2] CPU0 CPU1 [24438.678754][ T2] ---- ---- [24438.678814][ T2] lock(batched_entropy_u64.lock); [24438.678878][ T2] lock(&(&zone->lock)- >rlock); [24438.678951][ T2] lock(batched_entropy_u64.l ock); [24438.679038][ T2] lock(random_write_wait.lock); [24438.679098][ T2] [24438.679098][ T2] *** DEADLOCK *** [24438.679098][ T2] [24438.679174][ T2] 1 lock held by kthreadd/2: [24438.679230][ T2] #0: c000001ffd2f06e0 (batched_entropy_u64.lock){-...}, at: get_random_u64+0x60/0x100 [24438.679341][ T2] [24438.679341][ T2] stack backtrace: [24438.679413][ T2] CPU: 13 PID: 2 Comm: kthreadd Not tainted 5.3.0-next- 20190924 #2 [24438.679485][ T2] Call Trace: [24438.679507][ T2] [c00000002c84efe0] [c00000000091a574] dump_stack+0xe8/0x164 (unreliable) [24438.679618][ T2] [c00000002c84f030] [c0000000001cc9b8] print_circular_bug+0x3a8/0x420 [24438.679701][ T2] [c00000002c84f0e0] [c0000000001ccc90] check_noncircular+0x260/0x320 [24438.679769][ T2] [c00000002c84f1e0] [c0000000001ce7e0] check_prev_add+0x100/0x11b0 [24438.679868][ T2] [c00000002c84f2c0] [c0000000001d00f8] validate_chain+0x868/0x1530 [24438.679950][ T2] [c00000002c84f3f0] [c0000000001d3064] __lock_acquire+0x5b4/0xbf0 [24438.680059][ T2] [c00000002c84f4f0] [c0000000001d3ed0] lock_acquire+0x130/0x360 [24438.680122][ T2] [c00000002c84f5d0] [c000000000947d70] _raw_spin_lock_irqsave+0x70/0xa0 [24438.680207][ T2] [c00000002c84f610] [c0000000001a9488] __wake_up_common_lock+0x88/0x110 [24438.680298][ T2] [c00000002c84f690] [c0000000006f11a4] account.constprop.8+0x284/0x430 [24438.680399][ T2] [c00000002c84f750] [c0000000006f1554] extract_entropy.constprop.7+0xd4/0x330 [24438.680495][ T2] [c00000002c84f7d0] [c0000000006f1818] crng_reseed+0x68/0x490 [24438.680590][ T2] [c00000002c84f910] [c0000000006f4094] _extract_crng+0x104/0x110 [24438.680662][ T2] [c00000002c84f950] [c0000000006f1a34] crng_reseed+0x284/0x490 [24438.680751][ T2] [c00000002c84fa90] [c0000000006f4094] _extract_crng+0x104/0x110 [24438.680828][ T2] [c00000002c84fad0] [c0000000006f4c0c] get_random_u64+0xdc/0x100 [24438.680931][ T2] [c00000002c84fb10] [c000000000106988] copy_process+0x2d8/0x1bf0 [24438.681007][ T2] [c00000002c84fc30] [c00000000010861c] _do_fork+0xac/0xac0 [24438.681074][ T2] [c00000002c84fd10] [c0000000001090d0] kernel_thread+0x70/0xa0 [24438.681170][ T2] [c00000002c84fd80] [c0000000001518f0] kthreadd+0x270/0x330 [24438.681257][ T2] [c00000002c84fe20] [c00000000000b748] ret_from_kernel_thread+0x5c/0x74 > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 63900ca029e0..ec1d72f18b34 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -6027,10 +6027,11 @@ void init_idle(struct task_struct *idle, int cpu) > struct rq *rq = cpu_rq(cpu); > unsigned long flags; > > + __sched_fork(0, idle); > + > raw_spin_lock_irqsave(&idle->pi_lock, flags); > raw_spin_lock(&rq->lock); > > - __sched_fork(0, idle); > idle->state = TASK_RUNNING; > idle->se.exec_start = sched_clock(); > idle->flags |= PF_IDLE;
On Wed, Sep 25, 2019 at 11:18:47AM -0400, Qian Cai wrote: > On Wed, 2019-09-25 at 11:31 +0200, Peter Zijlstra wrote: > > On Fri, Sep 13, 2019 at 12:27:44PM -0400, Qian Cai wrote: > > > -> #3 (batched_entropy_u32.lock){-.-.}: > > > lock_acquire+0x31c/0x360 > > > _raw_spin_lock_irqsave+0x7c/0x9c > > > get_random_u32+0x6c/0x1dc > > > new_slab+0x234/0x6c0 > > > ___slab_alloc+0x3c8/0x650 > > > kmem_cache_alloc+0x4b0/0x590 > > > __debug_object_init+0x778/0x8b4 > > > debug_object_init+0x40/0x50 > > > debug_init+0x30/0x29c > > > hrtimer_init+0x30/0x50 > > > init_dl_task_timer+0x24/0x44 > > > __sched_fork+0xc0/0x168 > > > init_idle+0x78/0x26c > > > fork_idle+0x12c/0x178 > > > idle_threads_init+0x108/0x178 > > > smp_init+0x20/0x1bc > > > kernel_init_freeable+0x198/0x26c > > > kernel_init+0x18/0x334 > > > ret_from_fork+0x10/0x18 > > > > > > -> #2 (&rq->lock){-.-.}: > > > > This relation is silly.. > > > > I suspect the below 'works'... > > Unfortunately, the relation is still there, > > copy_process()->rt_mutex_init_task()->"&p->pi_lock" > > [24438.676716][ T2] -> #2 (&rq->lock){-.-.}: > [24438.676727][ T2] __lock_acquire+0x5b4/0xbf0 > [24438.676736][ T2] lock_acquire+0x130/0x360 > [24438.676754][ T2] _raw_spin_lock+0x54/0x80 > [24438.676771][ T2] task_fork_fair+0x60/0x190 > [24438.676788][ T2] sched_fork+0x128/0x270 > [24438.676806][ T2] copy_process+0x7a4/0x1bf0 > [24438.676823][ T2] _do_fork+0xac/0xac0 > [24438.676841][ T2] kernel_thread+0x70/0xa0 > [24438.676858][ T2] rest_init+0x4c/0x42c > [24438.676884][ T2] start_kernel+0x778/0x7c0 > [24438.676902][ T2] start_here_common+0x1c/0x334 That's the 'where we took #2 while holding #1' stacktrace and not relevant to our discussion. > [24438.675836][ T2] -> #4 (batched_entropy_u64.lock){-...}: > [24438.675860][ T2] __lock_acquire+0x5b4/0xbf0 > [24438.675878][ T2] lock_acquire+0x130/0x360 > [24438.675906][ T2] _raw_spin_lock_irqsave+0x70/0xa0 > [24438.675923][ T2] get_random_u64+0x60/0x100 > [24438.675944][ T2] add_to_free_area_random+0x164/0x1b0 > [24438.675962][ T2] free_one_page+0xb24/0xcf0 > [24438.675980][ T2] __free_pages_ok+0x448/0xbf0 > [24438.675999][ T2] deferred_init_maxorder+0x404/0x4a4 > [24438.676018][ T2] deferred_grow_zone+0x158/0x1f0 > [24438.676035][ T2] get_page_from_freelist+0x1dc8/0x1e10 > [24438.676063][ T2] __alloc_pages_nodemask+0x1d8/0x1940 > [24438.676083][ T2] allocate_slab+0x130/0x2740 > [24438.676091][ T2] new_slab+0xa8/0xe0 > [24438.676101][ T2] kmem_cache_open+0x254/0x660 > [24438.676119][ T2] __kmem_cache_create+0x44/0x2a0 > [24438.676136][ T2] create_boot_cache+0xcc/0x110 > [24438.676154][ T2] kmem_cache_init+0x90/0x1f0 > [24438.676173][ T2] start_kernel+0x3b8/0x7c0 > [24438.676191][ T2] start_here_common+0x1c/0x334 > [24438.676208][ T2] > [24438.676208][ T2] -> #3 (&(&zone->lock)->rlock){-.-.}: > [24438.676221][ T2] __lock_acquire+0x5b4/0xbf0 > [24438.676247][ T2] lock_acquire+0x130/0x360 > [24438.676264][ T2] _raw_spin_lock+0x54/0x80 > [24438.676282][ T2] rmqueue_bulk.constprop.23+0x64/0xf20 > [24438.676300][ T2] get_page_from_freelist+0x718/0x1e10 > [24438.676319][ T2] __alloc_pages_nodemask+0x1d8/0x1940 > [24438.676339][ T2] alloc_page_interleave+0x34/0x170 > [24438.676356][ T2] allocate_slab+0xd1c/0x2740 > [24438.676374][ T2] new_slab+0xa8/0xe0 > [24438.676391][ T2] ___slab_alloc+0x580/0xef0 > [24438.676408][ T2] __slab_alloc+0x64/0xd0 > [24438.676426][ T2] kmem_cache_alloc+0x5c4/0x6c0 > [24438.676444][ T2] fill_pool+0x280/0x540 > [24438.676461][ T2] __debug_object_init+0x60/0x6b0 > [24438.676479][ T2] hrtimer_init+0x5c/0x310 > [24438.676497][ T2] init_dl_task_timer+0x34/0x60 > [24438.676516][ T2] __sched_fork+0x8c/0x110 > [24438.676535][ T2] init_idle+0xb4/0x3c0 > [24438.676553][ T2] idle_thread_get+0x78/0x120 > [24438.676572][ T2] bringup_cpu+0x30/0x230 > [24438.676590][ T2] cpuhp_invoke_callback+0x190/0x1580 > [24438.676618][ T2] do_cpu_up+0x248/0x460 > [24438.676636][ T2] smp_init+0x118/0x1c0 > [24438.676662][ T2] kernel_init_freeable+0x3f8/0x8dc > [24438.676681][ T2] kernel_init+0x2c/0x154 > [24438.676699][ T2] ret_from_kernel_thread+0x5c/0x74 > [24438.676716][ T2] > [24438.676716][ T2] -> #2 (&rq->lock){-.-.}: This then shows we now have: rq->lock zone->lock.rlock batched_entropy_u64.lock Which, to me, appears to be distinctly different from the last time, which was: rq->lock batched_entropy_u32.lock Notable: "u32" != "u64". But #3 has: > [24438.676516][ T2] __sched_fork+0x8c/0x110 > [24438.676535][ T2] init_idle+0xb4/0x3c0 Which seems to suggest you didn't actually apply the patch; or rather, if you did, i'm not immediately seeing where #2 is acquired.
On Wed, 2019-09-25 at 18:45 +0200, Peter Zijlstra wrote: > On Wed, Sep 25, 2019 at 11:18:47AM -0400, Qian Cai wrote: > > On Wed, 2019-09-25 at 11:31 +0200, Peter Zijlstra wrote: > > > On Fri, Sep 13, 2019 at 12:27:44PM -0400, Qian Cai wrote: > > > > -> #3 (batched_entropy_u32.lock){-.-.}: > > > > lock_acquire+0x31c/0x360 > > > > _raw_spin_lock_irqsave+0x7c/0x9c > > > > get_random_u32+0x6c/0x1dc > > > > new_slab+0x234/0x6c0 > > > > ___slab_alloc+0x3c8/0x650 > > > > kmem_cache_alloc+0x4b0/0x590 > > > > __debug_object_init+0x778/0x8b4 > > > > debug_object_init+0x40/0x50 > > > > debug_init+0x30/0x29c > > > > hrtimer_init+0x30/0x50 > > > > init_dl_task_timer+0x24/0x44 > > > > __sched_fork+0xc0/0x168 > > > > init_idle+0x78/0x26c > > > > fork_idle+0x12c/0x178 > > > > idle_threads_init+0x108/0x178 > > > > smp_init+0x20/0x1bc > > > > kernel_init_freeable+0x198/0x26c > > > > kernel_init+0x18/0x334 > > > > ret_from_fork+0x10/0x18 > > > > > > > > -> #2 (&rq->lock){-.-.}: > > > > > > This relation is silly.. > > > > > > I suspect the below 'works'... > > > > Unfortunately, the relation is still there, > > > > copy_process()->rt_mutex_init_task()->"&p->pi_lock" > > > > [24438.676716][ T2] -> #2 (&rq->lock){-.-.}: > > [24438.676727][ T2] __lock_acquire+0x5b4/0xbf0 > > [24438.676736][ T2] lock_acquire+0x130/0x360 > > [24438.676754][ T2] _raw_spin_lock+0x54/0x80 > > [24438.676771][ T2] task_fork_fair+0x60/0x190 > > [24438.676788][ T2] sched_fork+0x128/0x270 > > [24438.676806][ T2] copy_process+0x7a4/0x1bf0 > > [24438.676823][ T2] _do_fork+0xac/0xac0 > > [24438.676841][ T2] kernel_thread+0x70/0xa0 > > [24438.676858][ T2] rest_init+0x4c/0x42c > > [24438.676884][ T2] start_kernel+0x778/0x7c0 > > [24438.676902][ T2] start_here_common+0x1c/0x334 > > That's the 'where we took #2 while holding #1' stacktrace and not > relevant to our discussion. Oh, you were talking about took #3 while holding #2. Anyway, your patch is working fine so far. Care to post/merge it officially or do you want me to post it? > > > [24438.675836][ T2] -> #4 (batched_entropy_u64.lock){-...}: > > [24438.675860][ T2] __lock_acquire+0x5b4/0xbf0 > > [24438.675878][ T2] lock_acquire+0x130/0x360 > > [24438.675906][ T2] _raw_spin_lock_irqsave+0x70/0xa0 > > [24438.675923][ T2] get_random_u64+0x60/0x100 > > [24438.675944][ T2] add_to_free_area_random+0x164/0x1b0 > > [24438.675962][ T2] free_one_page+0xb24/0xcf0 > > [24438.675980][ T2] __free_pages_ok+0x448/0xbf0 > > [24438.675999][ T2] deferred_init_maxorder+0x404/0x4a4 > > [24438.676018][ T2] deferred_grow_zone+0x158/0x1f0 > > [24438.676035][ T2] get_page_from_freelist+0x1dc8/0x1e10 > > [24438.676063][ T2] __alloc_pages_nodemask+0x1d8/0x1940 > > [24438.676083][ T2] allocate_slab+0x130/0x2740 > > [24438.676091][ T2] new_slab+0xa8/0xe0 > > [24438.676101][ T2] kmem_cache_open+0x254/0x660 > > [24438.676119][ T2] __kmem_cache_create+0x44/0x2a0 > > [24438.676136][ T2] create_boot_cache+0xcc/0x110 > > [24438.676154][ T2] kmem_cache_init+0x90/0x1f0 > > [24438.676173][ T2] start_kernel+0x3b8/0x7c0 > > [24438.676191][ T2] start_here_common+0x1c/0x334 > > [24438.676208][ T2] > > [24438.676208][ T2] -> #3 (&(&zone->lock)->rlock){-.-.}: > > [24438.676221][ T2] __lock_acquire+0x5b4/0xbf0 > > [24438.676247][ T2] lock_acquire+0x130/0x360 > > [24438.676264][ T2] _raw_spin_lock+0x54/0x80 > > [24438.676282][ T2] rmqueue_bulk.constprop.23+0x64/0xf20 > > [24438.676300][ T2] get_page_from_freelist+0x718/0x1e10 > > [24438.676319][ T2] __alloc_pages_nodemask+0x1d8/0x1940 > > [24438.676339][ T2] alloc_page_interleave+0x34/0x170 > > [24438.676356][ T2] allocate_slab+0xd1c/0x2740 > > [24438.676374][ T2] new_slab+0xa8/0xe0 > > [24438.676391][ T2] ___slab_alloc+0x580/0xef0 > > [24438.676408][ T2] __slab_alloc+0x64/0xd0 > > [24438.676426][ T2] kmem_cache_alloc+0x5c4/0x6c0 > > [24438.676444][ T2] fill_pool+0x280/0x540 > > [24438.676461][ T2] __debug_object_init+0x60/0x6b0 > > [24438.676479][ T2] hrtimer_init+0x5c/0x310 > > [24438.676497][ T2] init_dl_task_timer+0x34/0x60 > > [24438.676516][ T2] __sched_fork+0x8c/0x110 > > [24438.676535][ T2] init_idle+0xb4/0x3c0 > > [24438.676553][ T2] idle_thread_get+0x78/0x120 > > [24438.676572][ T2] bringup_cpu+0x30/0x230 > > [24438.676590][ T2] cpuhp_invoke_callback+0x190/0x1580 > > [24438.676618][ T2] do_cpu_up+0x248/0x460 > > [24438.676636][ T2] smp_init+0x118/0x1c0 > > [24438.676662][ T2] kernel_init_freeable+0x3f8/0x8dc > > [24438.676681][ T2] kernel_init+0x2c/0x154 > > [24438.676699][ T2] ret_from_kernel_thread+0x5c/0x74 > > [24438.676716][ T2] > > [24438.676716][ T2] -> #2 (&rq->lock){-.-.}: > > This then shows we now have: > > rq->lock > zone->lock.rlock > batched_entropy_u64.lock > > Which, to me, appears to be distinctly different from the last time, > which was: > > rq->lock > batched_entropy_u32.lock > > Notable: "u32" != "u64". > > But #3 has: > > > [24438.676516][ T2] __sched_fork+0x8c/0x110 > > [24438.676535][ T2] init_idle+0xb4/0x3c0 > > Which seems to suggest you didn't actually apply the patch; or rather, > if you did, i'm not immediately seeing where #2 is acquired. >
diff --git a/mm/slub.c b/mm/slub.c index 8834563cdb4b..96cdd36f9380 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1598,8 +1598,15 @@ static bool shuffle_freelist(struct kmem_cache *s, struct page *page) if (page->objects < 2 || !s->random_seq) return false; + /* + * Don't use get_random_int() here as it might deadlock due to + * "&rq->lock --> batched_entropy_u32.lock" chain. + */ + if (!arch_get_random_int((int *)&pos)) + get_random_bytes(&pos, sizeof(int)); + freelist_count = oo_objects(s->oo); - pos = get_random_int() % freelist_count; + pos %= freelist_count; page_limit = page->objects * s->size; start = fixup_red_left(s, page_address(page));
The commit b7d5dc21072c ("random: add a spinlock_t to struct batched_entropy") insists on acquiring "batched_entropy_u32.lock" in get_random_u32() which introduced the lock chain, "&rq->lock --> batched_entropy_u32.lock" even after crng init. As the result, it could result in deadlock below. Fix it by using get_random_bytes() in shuffle_freelist() which does not need to take on the batched_entropy locks. WARNING: possible circular locking dependency detected 5.3.0-rc7-mm1+ #3 Tainted: G L ------------------------------------------------------ make/7937 is trying to acquire lock: ffff900012f225f8 (random_write_wait.lock){....}, at: __wake_up_common_lock+0xa8/0x11c but task is already holding lock: ffff0096b9429c00 (batched_entropy_u32.lock){-.-.}, at: get_random_u32+0x6c/0x1dc which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (batched_entropy_u32.lock){-.-.}: lock_acquire+0x31c/0x360 _raw_spin_lock_irqsave+0x7c/0x9c get_random_u32+0x6c/0x1dc new_slab+0x234/0x6c0 ___slab_alloc+0x3c8/0x650 kmem_cache_alloc+0x4b0/0x590 __debug_object_init+0x778/0x8b4 debug_object_init+0x40/0x50 debug_init+0x30/0x29c hrtimer_init+0x30/0x50 init_dl_task_timer+0x24/0x44 __sched_fork+0xc0/0x168 init_idle+0x78/0x26c fork_idle+0x12c/0x178 idle_threads_init+0x108/0x178 smp_init+0x20/0x1bc kernel_init_freeable+0x198/0x26c kernel_init+0x18/0x334 ret_from_fork+0x10/0x18 -> #2 (&rq->lock){-.-.}: lock_acquire+0x31c/0x360 _raw_spin_lock+0x64/0x80 task_fork_fair+0x5c/0x1b0 sched_fork+0x15c/0x2dc copy_process+0x9e0/0x244c _do_fork+0xb8/0x644 kernel_thread+0xc4/0xf4 rest_init+0x30/0x238 arch_call_rest_init+0x10/0x18 start_kernel+0x424/0x52c -> #1 (&p->pi_lock){-.-.}: lock_acquire+0x31c/0x360 _raw_spin_lock_irqsave+0x7c/0x9c try_to_wake_up+0x74/0x8d0 default_wake_function+0x38/0x48 pollwake+0x118/0x158 __wake_up_common+0x130/0x1c4 __wake_up_common_lock+0xc8/0x11c __wake_up+0x3c/0x4c account+0x390/0x3e0 extract_entropy+0x2cc/0x37c _xfer_secondary_pool+0x35c/0x3c4 push_to_pool+0x54/0x308 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> #0 (random_write_wait.lock){....}: validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 _raw_spin_lock_irqsave+0x7c/0x9c __wake_up_common_lock+0xa8/0x11c __wake_up+0x3c/0x4c account+0x390/0x3e0 extract_entropy+0x2cc/0x37c crng_reseed+0x60/0x2f8 _extract_crng+0xd8/0x164 crng_reseed+0x7c/0x2f8 _extract_crng+0xd8/0x164 get_random_u32+0xec/0x1dc new_slab+0x234/0x6c0 ___slab_alloc+0x3c8/0x650 kmem_cache_alloc+0x4b0/0x590 getname_flags+0x44/0x1c8 user_path_at_empty+0x3c/0x68 vfs_statx+0xa4/0x134 __arm64_sys_newfstatat+0x94/0x120 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc other info that might help us debug this: Chain exists of: random_write_wait.lock --> &rq->lock --> batched_entropy_u32.lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(batched_entropy_u32.lock); lock(&rq->lock); lock(batched_entropy_u32.lock); lock(random_write_wait.lock); *** DEADLOCK *** 1 lock held by make/7937: #0: ffff0096b9429c00 (batched_entropy_u32.lock){-.-.}, at: get_random_u32+0x6c/0x1dc stack backtrace: CPU: 220 PID: 7937 Comm: make Tainted: G L 5.3.0-rc7-mm1+ Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019 Call trace: dump_backtrace+0x0/0x248 show_stack+0x20/0x2c dump_stack+0xd0/0x140 print_circular_bug+0x368/0x380 check_noncircular+0x248/0x250 validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 _raw_spin_lock_irqsave+0x7c/0x9c __wake_up_common_lock+0xa8/0x11c __wake_up+0x3c/0x4c account+0x390/0x3e0 extract_entropy+0x2cc/0x37c crng_reseed+0x60/0x2f8 _extract_crng+0xd8/0x164 crng_reseed+0x7c/0x2f8 _extract_crng+0xd8/0x164 get_random_u32+0xec/0x1dc new_slab+0x234/0x6c0 ___slab_alloc+0x3c8/0x650 kmem_cache_alloc+0x4b0/0x590 getname_flags+0x44/0x1c8 user_path_at_empty+0x3c/0x68 vfs_statx+0xa4/0x134 __arm64_sys_newfstatat+0x94/0x120 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc Signed-off-by: Qian Cai <cai@lca.pw> --- mm/slub.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)