Message ID | 20150825181152.GA26785@mtj.duckdns.org (mailing list archive) |
---|---|
State | New, archived |
On 08/25/2015 12:11 PM, Tejun Heo wrote:
> e79729123f63 ("writeback: don't issue wb_writeback_work if clean")
> updated writeback path to avoid kicking writeback work items if there
> are no inodes to be written out; unfortunately, the avoidance logic
> was too aggressive and broke sync_inodes_sb().
>
> * sync_inodes_sb() must write out I_DIRTY_TIME inodes but I_DIRTY_TIME
>   inodes don't contribute to bdi/wb_has_dirty_io() tests and were
>   being skipped over.
>
> * inodes are taken off wb->b_dirty/io/more_io lists after writeback
>   starts on them.  sync_inodes_sb() skipping wait_sb_inodes() when
>   bdi_has_dirty_io() is false breaks it by making it return while
>   writebacks are in-flight.
>
> This patch fixes the breakages by
>
> * Removing bdi_has_dirty_io() shortcut from bdi_split_work_to_wbs().
>   The callers are already testing the condition.
>
> * Removing bdi_has_dirty_io() shortcut from sync_inodes_sb() so that
>   it always calls into bdi_split_work_to_wbs() and wait_sb_inodes().
>
> * Making bdi_split_work_to_wbs() consider the b_dirty_time list for
>   WB_SYNC_ALL writebacks.
>
> Kudos to Eryu, Dave and Jan for tracking down the issue.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Fixes: e79729123f63 ("writeback: don't issue wb_writeback_work if clean")
> Link: http://lkml.kernel.org/g/20150812101204.GE17933@dhcp-13-216.nay.redhat.com
> Reported-and-bisected-by: Eryu Guan <eguan@redhat.com>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: Jan Kara <jack@suse.com>
> Cc: Ted Ts'o <tytso@google.com>

Added for 4.2.
On Tue 25-08-15 14:11:52, Tejun Heo wrote:
> e79729123f63 ("writeback: don't issue wb_writeback_work if clean")
> updated writeback path to avoid kicking writeback work items if there
> are no inodes to be written out; unfortunately, the avoidance logic
> was too aggressive and broke sync_inodes_sb().
>
> * sync_inodes_sb() must write out I_DIRTY_TIME inodes but I_DIRTY_TIME
>   inodes don't contribute to bdi/wb_has_dirty_io() tests and were
>   being skipped over.
>
> * inodes are taken off wb->b_dirty/io/more_io lists after writeback
>   starts on them.  sync_inodes_sb() skipping wait_sb_inodes() when
>   bdi_has_dirty_io() is false breaks it by making it return while
>   writebacks are in-flight.
>
> This patch fixes the breakages by
>
> * Removing bdi_has_dirty_io() shortcut from bdi_split_work_to_wbs().
>   The callers are already testing the condition.
>
> * Removing bdi_has_dirty_io() shortcut from sync_inodes_sb() so that
>   it always calls into bdi_split_work_to_wbs() and wait_sb_inodes().
>
> * Making bdi_split_work_to_wbs() consider the b_dirty_time list for
>   WB_SYNC_ALL writebacks.
>
> Kudos to Eryu, Dave and Jan for tracking down the issue.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Fixes: e79729123f63 ("writeback: don't issue wb_writeback_work if clean")
> Link: http://lkml.kernel.org/g/20150812101204.GE17933@dhcp-13-216.nay.redhat.com
> Reported-and-bisected-by: Eryu Guan <eguan@redhat.com>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: Jan Kara <jack@suse.com>
> Cc: Ted Ts'o <tytso@google.com>
> ---
>  fs/fs-writeback.c | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)

The patch looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.com>

								Honza

>
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -844,14 +844,15 @@ static void bdi_split_work_to_wbs(struct
>  	struct wb_iter iter;
>  
>  	might_sleep();
> -
> -	if (!bdi_has_dirty_io(bdi))
> -		return;
>  restart:
>  	rcu_read_lock();
>  	bdi_for_each_wb(wb, bdi, &iter, next_blkcg_id) {
> -		if (!wb_has_dirty_io(wb) ||
> -		    (skip_if_busy && writeback_in_progress(wb)))
> +		/* SYNC_ALL writes out I_DIRTY_TIME too */
> +		if (!wb_has_dirty_io(wb) &&
> +		    (base_work->sync_mode == WB_SYNC_NONE ||
> +		     list_empty(&wb->b_dirty_time)))
> +			continue;
> +		if (skip_if_busy && writeback_in_progress(wb))
>  			continue;
>  
>  		base_work->nr_pages = wb_split_bdi_pages(wb, nr_pages);
> @@ -899,8 +900,7 @@ static void bdi_split_work_to_wbs(struct
>  {
>  	might_sleep();
>  
> -	if (bdi_has_dirty_io(bdi) &&
> -	    (!skip_if_busy || !writeback_in_progress(&bdi->wb))) {
> +	if (!skip_if_busy || !writeback_in_progress(&bdi->wb)) {
>  		base_work->auto_free = 0;
>  		base_work->single_wait = 0;
>  		base_work->single_done = 0;
> @@ -2275,8 +2275,12 @@ void sync_inodes_sb(struct super_block *
>  	};
>  	struct backing_dev_info *bdi = sb->s_bdi;
>  
> -	/* Nothing to do? */
> -	if (!bdi_has_dirty_io(bdi) || bdi == &noop_backing_dev_info)
> +	/*
> +	 * Can't skip on !bdi_has_dirty() because we should wait for !dirty
> +	 * inodes under writeback and I_DIRTY_TIME inodes ignored by
> +	 * bdi_has_dirty() need to be written out too.
> +	 */
> +	if (bdi == &noop_backing_dev_info)
>  		return;
>  	WARN_ON(!rwsem_is_locked(&sb->s_umount));
>  
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -844,14 +844,15 @@ static void bdi_split_work_to_wbs(struct
 	struct wb_iter iter;
 
 	might_sleep();
-
-	if (!bdi_has_dirty_io(bdi))
-		return;
 restart:
 	rcu_read_lock();
 	bdi_for_each_wb(wb, bdi, &iter, next_blkcg_id) {
-		if (!wb_has_dirty_io(wb) ||
-		    (skip_if_busy && writeback_in_progress(wb)))
+		/* SYNC_ALL writes out I_DIRTY_TIME too */
+		if (!wb_has_dirty_io(wb) &&
+		    (base_work->sync_mode == WB_SYNC_NONE ||
+		     list_empty(&wb->b_dirty_time)))
+			continue;
+		if (skip_if_busy && writeback_in_progress(wb))
 			continue;
 
 		base_work->nr_pages = wb_split_bdi_pages(wb, nr_pages);
@@ -899,8 +900,7 @@ static void bdi_split_work_to_wbs(struct
 {
 	might_sleep();
 
-	if (bdi_has_dirty_io(bdi) &&
-	    (!skip_if_busy || !writeback_in_progress(&bdi->wb))) {
+	if (!skip_if_busy || !writeback_in_progress(&bdi->wb)) {
 		base_work->auto_free = 0;
 		base_work->single_wait = 0;
 		base_work->single_done = 0;
@@ -2275,8 +2275,12 @@ void sync_inodes_sb(struct super_block *
 	};
 	struct backing_dev_info *bdi = sb->s_bdi;
 
-	/* Nothing to do? */
-	if (!bdi_has_dirty_io(bdi) || bdi == &noop_backing_dev_info)
+	/*
+	 * Can't skip on !bdi_has_dirty() because we should wait for !dirty
+	 * inodes under writeback and I_DIRTY_TIME inodes ignored by
+	 * bdi_has_dirty() need to be written out too.
+	 */
+	if (bdi == &noop_backing_dev_info)
 		return;
 	WARN_ON(!rwsem_is_locked(&sb->s_umount));
 
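The per-wb skip test changed in the first hunk can be looked at in isolation. The following is only a userspace sketch, not kernel code: `struct wb_model`, `enum sync_mode_model`, `skip_before()` and `skip_after()` are hypothetical names, and the three booleans stand in for `wb_has_dirty_io(wb)`, `list_empty(&wb->b_dirty_time)` and `writeback_in_progress(wb)`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for base_work->sync_mode. */
enum sync_mode_model { SYNC_NONE_MODEL, SYNC_ALL_MODEL };

/* Hypothetical stand-in for the per-wb state the patch looks at. */
struct wb_model {
	bool has_dirty_io;      /* models wb_has_dirty_io(wb) */
	bool dirty_time_empty;  /* models list_empty(&wb->b_dirty_time) */
	bool in_progress;       /* models writeback_in_progress(wb) */
};

/* Pre-patch test: a wb with no dirty IO is always skipped. */
bool skip_before(const struct wb_model *wb, bool skip_if_busy)
{
	return !wb->has_dirty_io || (skip_if_busy && wb->in_progress);
}

/*
 * Post-patch test: a wb with no dirty IO is skipped only if the work is
 * not SYNC_ALL or the wb has no I_DIRTY_TIME inodes queued either.
 */
bool skip_after(const struct wb_model *wb, enum sync_mode_model mode,
		bool skip_if_busy)
{
	if (!wb->has_dirty_io &&
	    (mode == SYNC_NONE_MODEL || wb->dirty_time_empty))
		return true;
	return skip_if_busy && wb->in_progress;
}
```

For a wb that holds only I_DIRTY_TIME inodes (`has_dirty_io` false, `dirty_time_empty` false), `skip_before()` returns true, while `skip_after()` returns false for the SYNC_ALL case and still returns true for the SYNC_NONE case.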
e79729123f63 ("writeback: don't issue wb_writeback_work if clean")
updated writeback path to avoid kicking writeback work items if there
are no inodes to be written out; unfortunately, the avoidance logic
was too aggressive and broke sync_inodes_sb().

* sync_inodes_sb() must write out I_DIRTY_TIME inodes but I_DIRTY_TIME
  inodes don't contribute to bdi/wb_has_dirty_io() tests and were
  being skipped over.

* inodes are taken off wb->b_dirty/io/more_io lists after writeback
  starts on them.  sync_inodes_sb() skipping wait_sb_inodes() when
  bdi_has_dirty_io() is false breaks it by making it return while
  writebacks are in-flight.

This patch fixes the breakages by

* Removing bdi_has_dirty_io() shortcut from bdi_split_work_to_wbs().
  The callers are already testing the condition.

* Removing bdi_has_dirty_io() shortcut from sync_inodes_sb() so that
  it always calls into bdi_split_work_to_wbs() and wait_sb_inodes().

* Making bdi_split_work_to_wbs() consider the b_dirty_time list for
  WB_SYNC_ALL writebacks.

Kudos to Eryu, Dave and Jan for tracking down the issue.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: e79729123f63 ("writeback: don't issue wb_writeback_work if clean")
Link: http://lkml.kernel.org/g/20150812101204.GE17933@dhcp-13-216.nay.redhat.com
Reported-and-bisected-by: Eryu Guan <eguan@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.com>
Cc: Ted Ts'o <tytso@google.com>
---
 fs/fs-writeback.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)
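The second breakage described in the changelog can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel's data structures: `struct sb_model` and every function below are hypothetical, and plain counters stand in for the b_dirty/io/more_io lists, for inodes under writeback, and for the b_dirty_time list.

```c
#include <stdbool.h>

/* Hypothetical counters standing in for the kernel's inode lists. */
struct sb_model {
	int on_dirty_lists;  /* inodes on b_dirty/b_io/b_more_io */
	int in_flight;       /* inodes whose writeback has started */
	int dirty_time_only; /* I_DIRTY_TIME inodes on b_dirty_time */
};

/* Models bdi_has_dirty_io(): only the dirty lists are counted. */
bool has_dirty_io_model(const struct sb_model *sb)
{
	return sb->on_dirty_lists > 0;
}

/* Starting writeback takes the inode off the dirty lists. */
void start_writeback_model(struct sb_model *sb)
{
	if (sb->on_dirty_lists > 0) {
		sb->on_dirty_lists--;
		sb->in_flight++;
	}
}

/* The removed shortcut: return early when nothing looks dirty. */
bool old_sync_returns_early(const struct sb_model *sb)
{
	return !has_dirty_io_model(sb);
}

/* What a sync must still care about before it may return. */
bool sync_still_has_work(const struct sb_model *sb)
{
	return sb->on_dirty_lists > 0 || sb->in_flight > 0 ||
	       sb->dirty_time_only > 0;
}
```

Starting from one dirty inode and calling `start_writeback_model()` once leaves `old_sync_returns_early()` true while `sync_still_has_work()` is also true; the same holds for a model with only `dirty_time_only` set. In both states the removed shortcut would have returned too soon.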