Message ID | 20230420112946.2869956-2-yukuai1@huaweicloud.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Song Liu |
Series | md/raid1-10: limit the number of plugged bio |
On Thu, Apr 20, 2023 at 4:31 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> Currently, there is no limit for raid1/raid10 plugged bio. While flushing
> writes, raid1 has cond_resched() while raid10 doesn't, and too many
> writes can cause soft lockup.
>
> Follow up soft lockup can be triggered easily with writeback test for
> raid10 with ramdisks:
>
> watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
> Call Trace:
>  <TASK>
>  call_rcu+0x16/0x20
>  put_object+0x41/0x80
>  __delete_object+0x50/0x90
>  delete_object_full+0x2b/0x40
>  kmemleak_free+0x46/0xa0
>  slab_free_freelist_hook.constprop.0+0xed/0x1a0
>  kmem_cache_free+0xfd/0x300
>  mempool_free_slab+0x1f/0x30
>  mempool_free+0x3a/0x100
>  bio_free+0x59/0x80
>  bio_put+0xcf/0x2c0
>  free_r10bio+0xbf/0xf0
>  raid_end_bio_io+0x78/0xb0
>  one_write_done+0x8a/0xa0
>  raid10_end_write_request+0x1b4/0x430
>  bio_endio+0x175/0x320
>  brd_submit_bio+0x3b9/0x9b7 [brd]
>  __submit_bio+0x69/0xe0
>  submit_bio_noacct_nocheck+0x1e6/0x5a0
>  submit_bio_noacct+0x38c/0x7e0
>  flush_pending_writes+0xf0/0x240
>  raid10d+0xac/0x1ed0
>
> This patch fix the problem by adding cond_resched() to raid10 like what
> raid1 did.

nit: per submitting-patches.rst: Describe your changes in imperative mood,
e.g. "make xyzzy do frotz" instead of "[This patch] makes xyzzy do frotz"
or "[I] changed xyzzy to do frotz", as if you are giving orders to the
codebase to change its behaviour.

>
> Note that unlimited plugged bio still need to be optimized because in
> the case of writeback lots of dirty pages, this will take lots of memory
> and io latecy is quite bad.

typo: latency.

>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/raid10.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 6590aa49598c..a116b7c9d9f3 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -921,6 +921,7 @@ static void flush_pending_writes(struct r10conf *conf)
>                         else
>                                 submit_bio_noacct(bio);
>                         bio = next;
> +                       cond_resched();
>                 }
>                 blk_finish_plug(&plug);
>         } else
> @@ -1140,6 +1141,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
>                         else
>                                 submit_bio_noacct(bio);
>                         bio = next;
> +                       cond_resched();
>                 }
>                 kfree(plug);
>         }
> --
> 2.39.2
>
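The commit message above only says the lockup was hit with a "writeback test
for raid10 with ramdisks". A minimal sketch of that kind of setup is shown
below; the ramdisk sizes, fio parameters, and device names are illustrative
assumptions, not the exact workload from the report.

```sh
#!/bin/bash
# Hypothetical reproducer sketch: build a raid10 array on brd ramdisks and
# push a large buffered-write (writeback) workload through it.
set -e

modprobe brd rd_nr=4 rd_size=1048576            # 4 ramdisks, 1 GiB each
mdadm --create /dev/md0 --run --assume-clean --level=10 --raid-devices=4 \
      /dev/ram0 /dev/ram1 /dev/ram2 /dev/ram3

# Buffered (non-direct) writes, so that writeback hands the md thread a
# long list of plugged bios to flush in one go.
fio --name=wb --filename=/dev/md0 --rw=write --bs=1M --size=1800M \
    --ioengine=psync --direct=0 --numjobs=8 --group_reporting

# On an unpatched kernel, the soft lockup splat quoted above may appear here.
dmesg | grep -i "soft lockup" || echo "no soft lockup observed"

mdadm --stop /dev/md0
rmmod brd
```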
On Thu, Apr 20, 2023 at 4:31 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> Currently, there is no limit for raid1/raid10 plugged bio. While flushing
> writes, raid1 has cond_resched() while raid10 doesn't, and too many
> writes can cause soft lockup.
>
> Follow up soft lockup can be triggered easily with writeback test for
> raid10 with ramdisks:
>
> watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
> Call Trace:
>  <TASK>
>  call_rcu+0x16/0x20
>  put_object+0x41/0x80
>  __delete_object+0x50/0x90
>  delete_object_full+0x2b/0x40
>  kmemleak_free+0x46/0xa0
>  slab_free_freelist_hook.constprop.0+0xed/0x1a0
>  kmem_cache_free+0xfd/0x300
>  mempool_free_slab+0x1f/0x30
>  mempool_free+0x3a/0x100
>  bio_free+0x59/0x80
>  bio_put+0xcf/0x2c0
>  free_r10bio+0xbf/0xf0
>  raid_end_bio_io+0x78/0xb0
>  one_write_done+0x8a/0xa0
>  raid10_end_write_request+0x1b4/0x430
>  bio_endio+0x175/0x320
>  brd_submit_bio+0x3b9/0x9b7 [brd]
>  __submit_bio+0x69/0xe0
>  submit_bio_noacct_nocheck+0x1e6/0x5a0
>  submit_bio_noacct+0x38c/0x7e0
>  flush_pending_writes+0xf0/0x240
>  raid10d+0xac/0x1ed0

Is it possible to trigger this with a mdadm test?

Thanks,
Song

>
> This patch fix the problem by adding cond_resched() to raid10 like what
> raid1 did.
>
> Note that unlimited plugged bio still need to be optimized because in
> the case of writeback lots of dirty pages, this will take lots of memory
> and io latecy is quite bad.
>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/raid10.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 6590aa49598c..a116b7c9d9f3 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -921,6 +921,7 @@ static void flush_pending_writes(struct r10conf *conf)
>                         else
>                                 submit_bio_noacct(bio);
>                         bio = next;
> +                       cond_resched();
>                 }
>                 blk_finish_plug(&plug);
>         } else
> @@ -1140,6 +1141,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
>                         else
>                                 submit_bio_noacct(bio);
>                         bio = next;
> +                       cond_resched();
>                 }
>                 kfree(plug);
>         }
> --
> 2.39.2
>
Hi,

On 2023/04/25 8:23, Song Liu wrote:
> On Thu, Apr 20, 2023 at 4:31 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> Currently, there is no limit for raid1/raid10 plugged bio. While flushing
>> writes, raid1 has cond_resched() while raid10 doesn't, and too many
>> writes can cause soft lockup.
>>
>> Follow up soft lockup can be triggered easily with writeback test for
>> raid10 with ramdisks:
>>
>> watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
>> Call Trace:
>>  <TASK>
>>  call_rcu+0x16/0x20
>>  put_object+0x41/0x80
>>  __delete_object+0x50/0x90
>>  delete_object_full+0x2b/0x40
>>  kmemleak_free+0x46/0xa0
>>  slab_free_freelist_hook.constprop.0+0xed/0x1a0
>>  kmem_cache_free+0xfd/0x300
>>  mempool_free_slab+0x1f/0x30
>>  mempool_free+0x3a/0x100
>>  bio_free+0x59/0x80
>>  bio_put+0xcf/0x2c0
>>  free_r10bio+0xbf/0xf0
>>  raid_end_bio_io+0x78/0xb0
>>  one_write_done+0x8a/0xa0
>>  raid10_end_write_request+0x1b4/0x430
>>  bio_endio+0x175/0x320
>>  brd_submit_bio+0x3b9/0x9b7 [brd]
>>  __submit_bio+0x69/0xe0
>>  submit_bio_noacct_nocheck+0x1e6/0x5a0
>>  submit_bio_noacct+0x38c/0x7e0
>>  flush_pending_writes+0xf0/0x240
>>  raid10d+0xac/0x1ed0
>
> Is it possible to trigger this with a mdadm test?
>

The test I mentioned in patch 8 can trigger this problem reliably, so
I think adding a new test can achieve this.

Thanks,
Kuai

> Thanks,
> Song
>
>>
>> This patch fix the problem by adding cond_resched() to raid10 like what
>> raid1 did.
>>
>> Note that unlimited plugged bio still need to be optimized because in
>> the case of writeback lots of dirty pages, this will take lots of memory
>> and io latecy is quite bad.
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>  drivers/md/raid10.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index 6590aa49598c..a116b7c9d9f3 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -921,6 +921,7 @@ static void flush_pending_writes(struct r10conf *conf)
>>                         else
>>                                 submit_bio_noacct(bio);
>>                         bio = next;
>> +                       cond_resched();
>>                 }
>>                 blk_finish_plug(&plug);
>>         } else
>> @@ -1140,6 +1141,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
>>                         else
>>                                 submit_bio_noacct(bio);
>>                         bio = next;
>> +                       cond_resched();
>>                 }
>>                 kfree(plug);
>>         }
>> --
>> 2.39.2
>>
> .
>
On Mon, Apr 24, 2023 at 11:16 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> Hi,
>
> On 2023/04/25 8:23, Song Liu wrote:
> > On Thu, Apr 20, 2023 at 4:31 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
> >>
> >> From: Yu Kuai <yukuai3@huawei.com>
> >>
> >> Currently, there is no limit for raid1/raid10 plugged bio. While flushing
> >> writes, raid1 has cond_resched() while raid10 doesn't, and too many
> >> writes can cause soft lockup.
> >>
> >> Follow up soft lockup can be triggered easily with writeback test for
> >> raid10 with ramdisks:
> >>
> >> watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
> >> Call Trace:
> >>  <TASK>
> >>  call_rcu+0x16/0x20
> >>  put_object+0x41/0x80
> >>  __delete_object+0x50/0x90
> >>  delete_object_full+0x2b/0x40
> >>  kmemleak_free+0x46/0xa0
> >>  slab_free_freelist_hook.constprop.0+0xed/0x1a0
> >>  kmem_cache_free+0xfd/0x300
> >>  mempool_free_slab+0x1f/0x30
> >>  mempool_free+0x3a/0x100
> >>  bio_free+0x59/0x80
> >>  bio_put+0xcf/0x2c0
> >>  free_r10bio+0xbf/0xf0
> >>  raid_end_bio_io+0x78/0xb0
> >>  one_write_done+0x8a/0xa0
> >>  raid10_end_write_request+0x1b4/0x430
> >>  bio_endio+0x175/0x320
> >>  brd_submit_bio+0x3b9/0x9b7 [brd]
> >>  __submit_bio+0x69/0xe0
> >>  submit_bio_noacct_nocheck+0x1e6/0x5a0
> >>  submit_bio_noacct+0x38c/0x7e0
> >>  flush_pending_writes+0xf0/0x240
> >>  raid10d+0xac/0x1ed0
> >
> > Is it possible to trigger this with a mdadm test?
> >
>
> The test I mentioned in patch 8 can trigger this problem reliably, so
> I think adding a new test can achieve this.

To be clear, by "mdadm test" I mean the tests included in mdadm:

https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/tree/tests

Could you please try to add a test? If it works, we should add it to
mdadm.

Thanks,
Song
Hi,

On 2023/04/25 14:39, Song Liu wrote:
> On Mon, Apr 24, 2023 at 11:16 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> Hi,
>>
>> On 2023/04/25 8:23, Song Liu wrote:
>>> On Thu, Apr 20, 2023 at 4:31 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>>>
>>>> From: Yu Kuai <yukuai3@huawei.com>
>>>>
>>>> Currently, there is no limit for raid1/raid10 plugged bio. While flushing
>>>> writes, raid1 has cond_resched() while raid10 doesn't, and too many
>>>> writes can cause soft lockup.
>>>>
>>>> Follow up soft lockup can be triggered easily with writeback test for
>>>> raid10 with ramdisks:
>>>>
>>>> watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
>>>> Call Trace:
>>>>  <TASK>
>>>>  call_rcu+0x16/0x20
>>>>  put_object+0x41/0x80
>>>>  __delete_object+0x50/0x90
>>>>  delete_object_full+0x2b/0x40
>>>>  kmemleak_free+0x46/0xa0
>>>>  slab_free_freelist_hook.constprop.0+0xed/0x1a0
>>>>  kmem_cache_free+0xfd/0x300
>>>>  mempool_free_slab+0x1f/0x30
>>>>  mempool_free+0x3a/0x100
>>>>  bio_free+0x59/0x80
>>>>  bio_put+0xcf/0x2c0
>>>>  free_r10bio+0xbf/0xf0
>>>>  raid_end_bio_io+0x78/0xb0
>>>>  one_write_done+0x8a/0xa0
>>>>  raid10_end_write_request+0x1b4/0x430
>>>>  bio_endio+0x175/0x320
>>>>  brd_submit_bio+0x3b9/0x9b7 [brd]
>>>>  __submit_bio+0x69/0xe0
>>>>  submit_bio_noacct_nocheck+0x1e6/0x5a0
>>>>  submit_bio_noacct+0x38c/0x7e0
>>>>  flush_pending_writes+0xf0/0x240
>>>>  raid10d+0xac/0x1ed0
>>>
>>> Is it possible to trigger this with a mdadm test?
>>>
>>
>> The test I mentioned in patch 8 can trigger this problem reliably, so
>> I think adding a new test can achieve this.
>
> To be clear, by "mdadm test" I mean the tests included in mdadm:
>
> https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/tree/tests
>
> Could you please try to add a test? If it works, we should add it to
> mdadm.

Yes, of course. However, I'm not familiar with how mdadm tests work yet,
so it might take some time. By the way, it would be good if I could also
add the test to blktests if possible.

Thanks,
Kuai

>
> Thanks,
> Song
> .
>
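For the test being discussed, a script in the spirit of mdadm's tests/
directory might look roughly like the sketch below. It is only a sketch under
stated assumptions: the $dev0..$dev3 component-device variables and the
exit-status convention are placeholders for whatever the real mdadm (or
blktests) harness provides.

```sh
# Hypothetical sketch of an mdadm-style test (not part of the real suite).
# Assumes the harness supplies component devices in $dev0..$dev3 and treats
# a non-zero exit status as a test failure.
dmesg -C    # start from a clean kernel log so the check below sees only this run

mdadm --create /dev/md0 --run --level=10 --raid-devices=4 \
      $dev0 $dev1 $dev2 $dev3

# Push enough buffered writes through the array that raid10d has to flush
# a long list of plugged bios; before the fix this could soft lock up.
dd if=/dev/zero of=/dev/md0 bs=1M count=1024 status=none
sync

mdadm --stop /dev/md0

if dmesg | grep -q "soft lockup"; then
        echo "soft lockup detected while flushing raid10 writes" >&2
        exit 1
fi
exit 0
```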
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 6590aa49598c..a116b7c9d9f3 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -921,6 +921,7 @@ static void flush_pending_writes(struct r10conf *conf)
                        else
                                submit_bio_noacct(bio);
                        bio = next;
+                       cond_resched();
                }
                blk_finish_plug(&plug);
        } else
@@ -1140,6 +1141,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
                        else
                                submit_bio_noacct(bio);
                        bio = next;
+                       cond_resched();
                }
                kfree(plug);
        }
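After applying the patch, one quick sanity check (a sketch, run from a kernel
source tree) is to confirm that raid10 now matches raid1 in calling
cond_resched() on its pending-write flush paths:

```sh
# Both RAID personalities should now call cond_resched() while draining
# their pending-write bio lists (raid1 already did; raid10 gains it here).
git grep -n 'cond_resched()' -- drivers/md/raid1.c drivers/md/raid10.c
```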