Message ID | 20150429072530.39d38b00@notabene.brown (mailing list archive) |
---|---|
State | Not Applicable, archived |
Delegated to: | Mike Snitzer |
Headers | show |
On Wed, Apr 29, 2015 at 07:25:30AM +1000, NeilBrown wrote: > As bdi_set_min_ratio doesn't touch bdi->dev, there seems to be no need for > the test, or the warning. > > I wonder if it would make sense to move the bdi_set_min_ratio() call to > bdi_destroy, and discard bdi_unregister?? > There is a comment which suggests bdi_unregister might be of use later, but > it might be best to have a clean slate in which to add whatever might be > needed?? This seems fine to me from the block dev point of view. I don't really understand the bdi_min_ratio logic, but Peter might have a better idea. -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Wed, Apr 29, 2015 at 03:35:12PM +0200, Christoph Hellwig wrote: > On Wed, Apr 29, 2015 at 07:25:30AM +1000, NeilBrown wrote: > > As bdi_set_min_ratio doesn't touch bdi->dev, there seems to be no need for > > the test, or the warning. > > > > I wonder if it would make sense to move the bdi_set_min_ratio() call to > > bdi_destroy, and discard bdi_unregister?? > > There is a comment which suggests bdi_unregister might be of use later, but > > it might be best to have a clean slate in which to add whatever might be > > needed?? > > This seems fine to me from the block dev point of view. I don't really > understand the bdi_min_ratio logic, but Peter might have a better idea. Ah, that was a bit of digging, I've not looked at that in ages :-) So if you look at bdi_dirty_limit()'s comment: * The bdi's share of dirty limit will be adapting to its throughput and * bounded by the bdi->min_ratio and/or bdi->max_ratio parameters, if set. So the min_ratio is a minimum guaranteed fraction of the total throughput. Now the problem before commit ccb6108f5b0b ("mm/backing-dev.c: reset bdi min_ratio in bdi_unregister()") was that since bdi_set_min_ratio() keeps a global sum of bdi->min_ratio, you need to subtract from said global sum when taking the BDI away. Otherwise we loose/leak a fraction of the total throughput available (to the other BDIs). Which is what that bdi_set_min_ratio(bdi, 0) in unregister does. It resets the min_ratio for the bdi being taken out and frees up the min allocated bandwidth for the others. So I think moving that do destroy would be fine; assuming the delay between unregister and destroy is typically 'short'. Because without that you can 'leak' this min ratio for extended periods which means the bandwidth is unavailable for other BDIs. Does that make sense? -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
On Wed, 29 Apr 2015 18:02:58 +0200 Peter Zijlstra <peterz@infradead.org> wrote: > On Wed, Apr 29, 2015 at 03:35:12PM +0200, Christoph Hellwig wrote: > > On Wed, Apr 29, 2015 at 07:25:30AM +1000, NeilBrown wrote: > > > As bdi_set_min_ratio doesn't touch bdi->dev, there seems to be no need for > > > the test, or the warning. > > > > > > I wonder if it would make sense to move the bdi_set_min_ratio() call to > > > bdi_destroy, and discard bdi_unregister?? > > > There is a comment which suggests bdi_unregister might be of use later, but > > > it might be best to have a clean slate in which to add whatever might be > > > needed?? > > > > This seems fine to me from the block dev point of view. I don't really > > understand the bdi_min_ratio logic, but Peter might have a better idea. > > Ah, that was a bit of digging, I've not looked at that in ages :-) > > So if you look at bdi_dirty_limit()'s comment: > > * The bdi's share of dirty limit will be adapting to its throughput and > * bounded by the bdi->min_ratio and/or bdi->max_ratio parameters, if set. > > So the min_ratio is a minimum guaranteed fraction of the total > throughput. > > Now the problem before commit ccb6108f5b0b ("mm/backing-dev.c: reset bdi > min_ratio in bdi_unregister()") was that since bdi_set_min_ratio() > keeps a global sum of bdi->min_ratio, you need to subtract from said > global sum when taking the BDI away. Otherwise we loose/leak a fraction > of the total throughput available (to the other BDIs). > > Which is what that bdi_set_min_ratio(bdi, 0) in unregister does. It > resets the min_ratio for the bdi being taken out and frees up the min > allocated bandwidth for the others. > > So I think moving that do destroy would be fine; assuming the delay > between unregister and destroy is typically 'short'. Because without > that you can 'leak' this min ratio for extended periods which means the > bandwidth is unavailable for other BDIs. > > Does that make sense? Your assessment is almost exactly what I had come up with, so it definitely makes sense :-) 'destroy' does come very shortly after 'unregister' (and immediately before 'blk_put_queue' which actually frees the struct). However the driving force for this patch was a desire to move blk_cleanup_queue(), which calls 'destroy', earlier. So the net result is that bdi_set_min_ratio will be called slightly sooner. Thanks, NeilBrown -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
diff --git a/block/genhd.c b/block/genhd.c index e351fc521053..1d4435478e8a 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -657,7 +657,6 @@ void del_gendisk(struct gendisk *disk) disk->flags &= ~GENHD_FL_UP; sysfs_remove_link(&disk_to_dev(disk)->kobj, "bdi"); - bdi_unregister(&disk->queue->backing_dev_info); blk_unregister_queue(disk); blk_unregister_region(disk_devt(disk), disk->minors); diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h index aff923ae8c4b..d87d8eced064 100644 --- a/include/linux/backing-dev.h +++ b/include/linux/backing-dev.h @@ -116,7 +116,6 @@ __printf(3, 4) int bdi_register(struct backing_dev_info *bdi, struct device *parent, const char *fmt, ...); int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev); -void bdi_unregister(struct backing_dev_info *bdi); int __must_check bdi_setup_and_register(struct backing_dev_info *, char *); void bdi_start_writeback(struct backing_dev_info *bdi, long nr_pages, enum wb_reason reason); diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h index 880dd7437172..c178d13d6f4c 100644 --- a/include/trace/events/writeback.h +++ b/include/trace/events/writeback.h @@ -250,7 +250,6 @@ DEFINE_EVENT(writeback_class, name, \ DEFINE_WRITEBACK_EVENT(writeback_nowork); DEFINE_WRITEBACK_EVENT(writeback_wake_background); DEFINE_WRITEBACK_EVENT(writeback_bdi_register); -DEFINE_WRITEBACK_EVENT(writeback_bdi_unregister); DECLARE_EVENT_CLASS(wbc_class, TP_PROTO(struct writeback_control *wbc, struct backing_dev_info *bdi), diff --git a/mm/backing-dev.c b/mm/backing-dev.c index 6dc4580df2af..000e7b3b9896 100644 --- a/mm/backing-dev.c +++ b/mm/backing-dev.c @@ -359,23 +359,6 @@ static void bdi_wb_shutdown(struct backing_dev_info *bdi) flush_delayed_work(&bdi->wb.dwork); } -/* - * Called when the device behind @bdi has been removed or ejected. - * - * We can't really do much here except for reducing the dirty ratio at - * the moment. In the future we should be able to set a flag so that - * the filesystem can handle errors at mark_inode_dirty time instead - * of only at writeback time. - */ -void bdi_unregister(struct backing_dev_info *bdi) -{ - if (WARN_ON_ONCE(!bdi->dev)) - return; - - bdi_set_min_ratio(bdi, 0); -} -EXPORT_SYMBOL(bdi_unregister); - static void bdi_wb_init(struct bdi_writeback *wb, struct backing_dev_info *bdi) { memset(wb, 0, sizeof(*wb)); @@ -443,6 +426,7 @@ void bdi_destroy(struct backing_dev_info *bdi) int i; bdi_wb_shutdown(bdi); + bdi_set_min_ratio(bdi, 0); WARN_ON(!list_empty(&bdi->work_list)); WARN_ON(delayed_work_pending(&bdi->wb.dwork));