Message ID | 1497228440-10349-1-git-send-email-stummala@codeaurora.org (mailing list archive) |
---|---|
State | New, archived |
On Mon 12-06-17 06:17:20, Sahitya Tummala wrote:
> __list_lru_walk_one() can hold the spin lock for longer duration
> if there are more number of entries to be isolated.
>
> This results in "BUG: spinlock lockup suspected" in the below path -
>
> [<ffffff8eca0fb0bc>] spin_bug+0x90
> [<ffffff8eca0fb220>] do_raw_spin_lock+0xfc
> [<ffffff8ecafb7798>] _raw_spin_lock+0x28
> [<ffffff8eca1ae884>] list_lru_add+0x28
> [<ffffff8eca1f5dac>] dput+0x1c8
> [<ffffff8eca1eb46c>] path_put+0x20
> [<ffffff8eca1eb73c>] terminate_walk+0x3c
> [<ffffff8eca1eee58>] path_lookupat+0x100
> [<ffffff8eca1f00fc>] filename_lookup+0x6c
> [<ffffff8eca1f0264>] user_path_at_empty+0x54
> [<ffffff8eca1e066c>] SyS_faccessat+0xd0
> [<ffffff8eca084e30>] el0_svc_naked+0x24
>
> This nlru->lock has been acquired by another CPU in this path -
>
> [<ffffff8eca1f5fd0>] d_lru_shrink_move+0x34
> [<ffffff8eca1f6180>] dentry_lru_isolate_shrink+0x48
> [<ffffff8eca1aeafc>] __list_lru_walk_one.isra.10+0x94
> [<ffffff8eca1aec34>] list_lru_walk_node+0x40
> [<ffffff8eca1f6620>] shrink_dcache_sb+0x60
> [<ffffff8eca1e56a8>] do_remount_sb+0xbc
> [<ffffff8eca1e583c>] do_emergency_remount+0xb0
> [<ffffff8eca0ba510>] process_one_work+0x228
> [<ffffff8eca0bb158>] worker_thread+0x2e0
> [<ffffff8eca0c040c>] kthread+0xf4
> [<ffffff8eca084dd0>] ret_from_fork+0x10
>
> Link: http://marc.info/?t=149511514800002&r=1&w=2
> Fix-suggested-by: Jan kara <jack@suse.cz>
> Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

Honza

> ---
>  mm/list_lru.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/list_lru.c b/mm/list_lru.c
> index 5d8dffd..1af0709 100644
> --- a/mm/list_lru.c
> +++ b/mm/list_lru.c
> @@ -249,6 +249,8 @@ restart:
>  		default:
>  			BUG();
>  		}
> +		if (cond_resched_lock(&nlru->lock))
> +			goto restart;
>  	}
>
>  	spin_unlock(&nlru->lock);
> --
> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
>
On Mon, 12 Jun 2017 06:17:20 +0530 Sahitya Tummala <stummala@codeaurora.org> wrote:

> __list_lru_walk_one() can hold the spin lock for longer duration
> if there are more number of entries to be isolated.
>
> This results in "BUG: spinlock lockup suspected" in the below path -
>
> [<ffffff8eca0fb0bc>] spin_bug+0x90
> [<ffffff8eca0fb220>] do_raw_spin_lock+0xfc
> [<ffffff8ecafb7798>] _raw_spin_lock+0x28
> [<ffffff8eca1ae884>] list_lru_add+0x28
> [<ffffff8eca1f5dac>] dput+0x1c8
> [<ffffff8eca1eb46c>] path_put+0x20
> [<ffffff8eca1eb73c>] terminate_walk+0x3c
> [<ffffff8eca1eee58>] path_lookupat+0x100
> [<ffffff8eca1f00fc>] filename_lookup+0x6c
> [<ffffff8eca1f0264>] user_path_at_empty+0x54
> [<ffffff8eca1e066c>] SyS_faccessat+0xd0
> [<ffffff8eca084e30>] el0_svc_naked+0x24
>
> This nlru->lock has been acquired by another CPU in this path -
>
> [<ffffff8eca1f5fd0>] d_lru_shrink_move+0x34
> [<ffffff8eca1f6180>] dentry_lru_isolate_shrink+0x48
> [<ffffff8eca1aeafc>] __list_lru_walk_one.isra.10+0x94
> [<ffffff8eca1aec34>] list_lru_walk_node+0x40
> [<ffffff8eca1f6620>] shrink_dcache_sb+0x60
> [<ffffff8eca1e56a8>] do_remount_sb+0xbc
> [<ffffff8eca1e583c>] do_emergency_remount+0xb0
> [<ffffff8eca0ba510>] process_one_work+0x228
> [<ffffff8eca0bb158>] worker_thread+0x2e0
> [<ffffff8eca0c040c>] kthread+0xf4
> [<ffffff8eca084dd0>] ret_from_fork+0x10
>
> Link: http://marc.info/?t=149511514800002&r=1&w=2
> Fix-suggested-by: Jan kara <jack@suse.cz>
> Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
> ---
>  mm/list_lru.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/list_lru.c b/mm/list_lru.c
> index 5d8dffd..1af0709 100644
> --- a/mm/list_lru.c
> +++ b/mm/list_lru.c
> @@ -249,6 +249,8 @@ restart:
>  		default:
>  			BUG();
>  		}
> +		if (cond_resched_lock(&nlru->lock))
> +			goto restart;
>  	}
>
>  	spin_unlock(&nlru->lock);

This is rather worrying.

a) Why are we spending so long holding that lock that this is occurring?

b) With this patch, we're restarting the entire scan. Are there
   situations in which this loop will never terminate, or will take a
   very long time? Suppose that this process is getting rescheds
   blasted at it for some reason?

IOW this looks like a bit of a band-aid and a deeper analysis and
understanding might be needed.
On 6/16/2017 2:35 AM, Andrew Morton wrote:
>> diff --git a/mm/list_lru.c b/mm/list_lru.c
>> index 5d8dffd..1af0709 100644
>> --- a/mm/list_lru.c
>> +++ b/mm/list_lru.c
>> @@ -249,6 +249,8 @@ restart:
>>  		default:
>>  			BUG();
>>  		}
>> +		if (cond_resched_lock(&nlru->lock))
>> +			goto restart;
>>  	}
>>
>>  	spin_unlock(&nlru->lock);
> This is rather worrying.
>
> a) Why are we spending so long holding that lock that this is occurring?

At the time of the crash, I see that __list_lru_walk_one() shows the
number of entries isolated as 1774475, with nr_items still pending as
130748. On my system, I see that for 100000 dentries, it takes around
75ms for __list_lru_walk_one() to complete. So for a total of about
1900000 dentries, as in the issue scenario, it will take up to 1425ms,
which explains why the spin lockup condition got hit on the other CPU.

It looks like __list_lru_walk_one() is expected to take more time if
there are more dentries present. And I think it is a valid scenario to
have that many dentries.

> b) With this patch, we're restarting the entire scan. Are there
>    situations in which this loop will never terminate, or will take a
>    very long time? Suppose that this process is getting rescheds
>    blasted at it for some reason?

In the above scenario, I observed that the dentry entries from the lru
list are removed all the time, i.e. LRU_REMOVED is returned from the
isolate (dentry_lru_isolate()) callback. I don't know if there is any
case where we skip several entries in the lru list and restart several
times due to this cond_resched_lock(). This can happen even with the
existing code if LRU_RETRY is returned often from the isolate callback.

> IOW this looks like a bit of a band-aid and a deeper analysis and
> understanding might be needed.
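The hold-time estimate in the message above can be reproduced as a back-of-the-envelope calculation. This is only a sketch: it assumes, as the message implicitly does, that the walk time scales linearly with the number of dentries, and the symbols N and t are introduced here for illustration.

```latex
\begin{align*}
N &= 1774475 + 130748 = 1905223 \approx 1.9 \times 10^{6} \\
t &\approx \frac{N}{10^{5}} \times 75\,\mathrm{ms}
   \approx 19 \times 75\,\mathrm{ms} = 1425\,\mathrm{ms}
\end{align*}
```

Without rounding N to 1.9 million the product comes out near 1429 ms, so the quoted 1425 ms is consistent with the measured 75 ms per 100000 dentries.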
Hello,

On Thu, Jun 15, 2017 at 02:05:23PM -0700, Andrew Morton wrote:
> On Mon, 12 Jun 2017 06:17:20 +0530 Sahitya Tummala <stummala@codeaurora.org> wrote:
>
> > __list_lru_walk_one() can hold the spin lock for longer duration
> > if there are more number of entries to be isolated.
> >
> > This results in "BUG: spinlock lockup suspected" in the below path -
> >
> > [<ffffff8eca0fb0bc>] spin_bug+0x90
> > [<ffffff8eca0fb220>] do_raw_spin_lock+0xfc
> > [<ffffff8ecafb7798>] _raw_spin_lock+0x28
> > [<ffffff8eca1ae884>] list_lru_add+0x28
> > [<ffffff8eca1f5dac>] dput+0x1c8
> > [<ffffff8eca1eb46c>] path_put+0x20
> > [<ffffff8eca1eb73c>] terminate_walk+0x3c
> > [<ffffff8eca1eee58>] path_lookupat+0x100
> > [<ffffff8eca1f00fc>] filename_lookup+0x6c
> > [<ffffff8eca1f0264>] user_path_at_empty+0x54
> > [<ffffff8eca1e066c>] SyS_faccessat+0xd0
> > [<ffffff8eca084e30>] el0_svc_naked+0x24
> >
> > This nlru->lock has been acquired by another CPU in this path -
> >
> > [<ffffff8eca1f5fd0>] d_lru_shrink_move+0x34
> > [<ffffff8eca1f6180>] dentry_lru_isolate_shrink+0x48
> > [<ffffff8eca1aeafc>] __list_lru_walk_one.isra.10+0x94
> > [<ffffff8eca1aec34>] list_lru_walk_node+0x40
> > [<ffffff8eca1f6620>] shrink_dcache_sb+0x60
> > [<ffffff8eca1e56a8>] do_remount_sb+0xbc
> > [<ffffff8eca1e583c>] do_emergency_remount+0xb0
> > [<ffffff8eca0ba510>] process_one_work+0x228
> > [<ffffff8eca0bb158>] worker_thread+0x2e0
> > [<ffffff8eca0c040c>] kthread+0xf4
> > [<ffffff8eca084dd0>] ret_from_fork+0x10
> >
> > Link: http://marc.info/?t=149511514800002&r=1&w=2
> > Fix-suggested-by: Jan kara <jack@suse.cz>
> > Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
> > ---
> >  mm/list_lru.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/list_lru.c b/mm/list_lru.c
> > index 5d8dffd..1af0709 100644
> > --- a/mm/list_lru.c
> > +++ b/mm/list_lru.c
> > @@ -249,6 +249,8 @@ restart:
> >  		default:
> >  			BUG();
> >  		}
> > +		if (cond_resched_lock(&nlru->lock))
> > +			goto restart;
> >  	}
> >
> >  	spin_unlock(&nlru->lock);
>
> This is rather worrying.
>
> a) Why are we spending so long holding that lock that this is occurring?
>
> b) With this patch, we're restarting the entire scan. Are there
>    situations in which this loop will never terminate, or will take a
>    very long time? Suppose that this process is getting rescheds
>    blasted at it for some reason?
>
> IOW this looks like a bit of a band-aid and a deeper analysis and
> understanding might be needed.

The goal of list_lru_walk is removing inactive entries from the lru list
(LRU_REMOVED). Memory shrinkers may also choose to move active entries
to the tail of the lru list (LRU_ROTATED). LRU_SKIP is supposed to be
returned only to avoid a possible deadlock. So I don't see how
restarting lru walk could have adverse effects.

However, I do find this patch kinda ugly, because:

 - list_lru_walk already gives you a way to avoid a lockup - just make
   the callback reschedule and return LRU_RETRY every now and then, see
   shadow_lru_isolate() for an example. Alternatively, you can limit the
   number of entries scanned in one go (nr_to_walk) and reschedule
   between calls. This is what shrink_slab() does: the number of
   dentries scanned without releasing the lock is limited to 1024, see
   how super_block::s_shrink is initialized.

 - Someone might want to call list_lru_walk with a spin lock held, and I
   don't see anything wrong in doing that. With your patch it can't be
   done anymore.

That said, I think it would be better to patch shrink_dcache_sb() or
dentry_lru_isolate_shrink() instead of list_lru_walk() in order to fix
this lockup.
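The second alternative mentioned above (limit nr_to_walk per call and reschedule between calls) can be illustrated with a small userspace model. This is a sketch only, not kernel code and not the fix that was eventually merged: `struct toy_lru`, `toy_walk()` and `toy_shrink_all()` are hypothetical names, the lock is modelled by a counter, and every entry is assumed to be removable (as if the isolate callback always returned LRU_REMOVED).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a per-node LRU list; not kernel code. */
struct toy_lru {
	unsigned long nr_items;      /* entries still on the list */
	unsigned long lock_sections; /* how many times the "lock" was taken */
	unsigned long max_hold;      /* most entries handled in one lock section */
};

/*
 * Walk at most nr_to_walk entries inside one "lock" section, removing each
 * one. Returns the number of entries removed.
 */
static unsigned long toy_walk(struct toy_lru *lru, unsigned long nr_to_walk)
{
	unsigned long removed = 0;

	lru->lock_sections++;	/* a real walker would take the spin lock here */
	while (lru->nr_items > 0 && nr_to_walk > 0) {
		lru->nr_items--;
		nr_to_walk--;
		removed++;
	}
	if (removed > lru->max_hold)
		lru->max_hold = removed;
	/* a real walker would release the spin lock here */
	return removed;
}

/*
 * Drain the list in bounded batches. No lock is held between batches, so
 * that gap is where a kernel caller could reschedule.
 */
static unsigned long toy_shrink_all(struct toy_lru *lru, unsigned long batch)
{
	unsigned long total = 0, freed;

	do {
		freed = toy_walk(lru, batch);
		total += freed;
		/* reschedule point: lock not held */
	} while (freed > 0);

	return total;
}
```

With the roughly 1.9 million entries from the report and a batch of 1024, the model never processes more than 1024 entries per lock section, at the cost of taking the lock many more times. The walker itself stays usable by callers that cannot sleep, because the rescheduling decision lives in the caller.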
Hello,

On 6/17/2017 4:44 PM, Vladimir Davydov wrote:
>
> That said, I think it would be better to patch shrink_dcache_sb() or
> dentry_lru_isolate_shrink() instead of list_lru_walk() in order to fix
> this lockup.

Thanks for the review. I will enhance the patch as per your suggestion.
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 5d8dffd..1af0709 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -249,6 +249,8 @@ restart:
 		default:
 			BUG();
 		}
+		if (cond_resched_lock(&nlru->lock))
+			goto restart;
 	}
 
 	spin_unlock(&nlru->lock);
__list_lru_walk_one() can hold the spin lock for longer duration
if there are more number of entries to be isolated.

This results in "BUG: spinlock lockup suspected" in the below path -

[<ffffff8eca0fb0bc>] spin_bug+0x90
[<ffffff8eca0fb220>] do_raw_spin_lock+0xfc
[<ffffff8ecafb7798>] _raw_spin_lock+0x28
[<ffffff8eca1ae884>] list_lru_add+0x28
[<ffffff8eca1f5dac>] dput+0x1c8
[<ffffff8eca1eb46c>] path_put+0x20
[<ffffff8eca1eb73c>] terminate_walk+0x3c
[<ffffff8eca1eee58>] path_lookupat+0x100
[<ffffff8eca1f00fc>] filename_lookup+0x6c
[<ffffff8eca1f0264>] user_path_at_empty+0x54
[<ffffff8eca1e066c>] SyS_faccessat+0xd0
[<ffffff8eca084e30>] el0_svc_naked+0x24

This nlru->lock has been acquired by another CPU in this path -

[<ffffff8eca1f5fd0>] d_lru_shrink_move+0x34
[<ffffff8eca1f6180>] dentry_lru_isolate_shrink+0x48
[<ffffff8eca1aeafc>] __list_lru_walk_one.isra.10+0x94
[<ffffff8eca1aec34>] list_lru_walk_node+0x40
[<ffffff8eca1f6620>] shrink_dcache_sb+0x60
[<ffffff8eca1e56a8>] do_remount_sb+0xbc
[<ffffff8eca1e583c>] do_emergency_remount+0xb0
[<ffffff8eca0ba510>] process_one_work+0x228
[<ffffff8eca0bb158>] worker_thread+0x2e0
[<ffffff8eca0c040c>] kthread+0xf4
[<ffffff8eca084dd0>] ret_from_fork+0x10

Link: http://marc.info/?t=149511514800002&r=1&w=2
Fix-suggested-by: Jan kara <jack@suse.cz>
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
---
 mm/list_lru.c | 2 ++
 1 file changed, 2 insertions(+)