Message ID | 564F9AFF.3050605@sandisk.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi, I think you should squash the revert of v1 into this patch, and then document the crash the original patch caused and how this new patch is fixing that. A. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/20/2015 02:44 PM, Aaro Koskinen wrote: > I think you should squash the revert of v1 into this patch, and then > document the crash the original patch caused and how this new patch is > fixing that. Hello Aaro, I'd like to know the opinion of the SCSI maintainers about this. It's not impossible that they would prefer to submit the revert to Linus quickly and only send the reworked fix to Linus during the v4.5 merge window. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
What sort of re-registration is this? Seems like we should only release the minor number once the bdi is released. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/24/2015 03:13 PM, Christoph Hellwig wrote: > What sort of re-registration is this? Seems like we should only > release the minor number once the bdi is released. Hello Christoph, As you most likely know the BDI device name for disks is based on the device major and minor number: $ ls -l /dev/sda brw-rw---- 1 root disk 8, 0 Nov 24 14:53 /dev/sda $ ls -l /sys/block/sda/bdi lrwxrwxrwx 1 root root 0 Nov 24 15:17 /sys/block/sda/bdi -> ../../../../../../../../virtual/bdi/8:0 So if a driver stops using a (major, minor) number pair and the same device number is reused before the bdi device has been released the warning mentioned in the patch description at the start of this thread is triggered. This patch fixes that race by removing the bdi device from sysfs during the __scsi_remove_device() call instead of when the bdi device is released. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Bart, On Tue, Nov 24, 2015 at 03:23:21PM -0800, Bart Van Assche wrote: > So if a driver stops using a (major, minor) number pair and the same device > number is reused before the bdi device has been released the warning > mentioned in the patch description at the start of this thread is triggered. > This patch fixes that race by removing the bdi device from sysfs during the > __scsi_remove_device() call instead of when the bdi device is released. that's why I suggested only releasing the minor number (or rather dev_t) once we release the BDI, similar to what MD and DM do. But what I really wanted to ask for is what your reproducer looks like. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/25/15 00:47, Christoph Hellwig wrote:
> But what I really wanted to ask for is what your reproducer looks like.
Hello Christoph,
This race is hard to trigger. I can trigger it by repeatedly removing
and re-adding SRP SCSI devices. Enabling debug options like SLUB
debugging and kmemleak helps. I think that is because these debug
options slow down the SCSI device removal code and thereby increase the
chance that this race is triggered.
Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>> "Bart" == Bart Van Assche <bart.vanassche@sandisk.com> writes:
Bart,
Bart> This race is hard to trigger. I can trigger it by repeatedly
Bart> removing and re-adding SRP SCSI devices. Enabling debug options
Bart> like SLUB debugging and kmemleak helps. I think that is because
Bart> these debug options slow down the SCSI device removal code and
Bart> thereby increase the chance that this race is triggered.
Any updates on this? Your updated patch has no reviews.
Should I just revert the original patch for 4.4?
On 11/30/2015 04:57 PM, Martin K. Petersen wrote: >>>>>> "Bart" == Bart Van Assche <bart.vanassche@sandisk.com> writes: > > Bart> This race is hard to trigger. I can trigger it by repeatedly > Bart> removing and re-adding SRP SCSI devices. Enabling debug options > Bart> like SLUB debugging and kmemleak helps. I think that is because > Bart> these debug options slow down the SCSI device removal code and > Bart> thereby increase the chance that this race is triggered. > > Any updates on this? Your updated patch has no reviews. > > Should I just revert the original patch for 4.4? Hello Martin, Since the original patch caused a regression, please proceed with reverting the original patch. Regarding this patch: is there anyone on the CC-list of this e-mail who can review it ? Thanks, Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 30, 2015 at 05:18:50PM -0800, Bart Van Assche wrote: > Since the original patch caused a regression, please proceed with reverting > the original patch. > > Regarding this patch: is there anyone on the CC-list of this e-mail who can > review it ? I'm not too fond of the approach. I'd much prefer if SCSI would just release the dev_t later, similar to DM or MD. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/30/2015 11:23 PM, Christoph Hellwig wrote: > On Mon, Nov 30, 2015 at 05:18:50PM -0800, Bart Van Assche wrote: >> Since the original patch caused a regression, please proceed with reverting >> the original patch. >> >> Regarding this patch: is there anyone on the CC-list of this e-mail who can >> review it ? > > I'm not too fond of the approach. I'd much prefer if SCSI would just > release the dev_t later, similar to DM or MD. (replying to an e-mail of four months ago) Hello Christoph, Sorry that it took so long but last week I finally found the time to look further into this. I will post a third version of this patch. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index f5ace2b..8d64518 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -12,6 +12,7 @@ #include <linux/blkdev.h> #include <linux/device.h> #include <linux/pm_runtime.h> +#include <linux/backing-dev.h> #include <scsi/scsi.h> #include <scsi/scsi_device.h> @@ -1110,6 +1111,7 @@ void __scsi_remove_device(struct scsi_device *sdev) device_unregister(&sdev->sdev_dev); transport_remove_device(dev); scsi_dh_remove_device(sdev); + bdi_sysfs_del(&sdev->request_queue->backing_dev_info); device_del(dev); } else put_device(&sdev->sdev_dev); diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h index 1b4d69f..1a42ecb 100644 --- a/include/linux/backing-dev-defs.h +++ b/include/linux/backing-dev-defs.h @@ -135,6 +135,7 @@ struct bdi_writeback { struct backing_dev_info { struct list_head bdi_list; + bool is_visible; unsigned long ra_pages; /* max readahead in PAGE_CACHE_SIZE units */ unsigned int capabilities; /* Device capabilities */ congested_fn *congested_fn; /* Function pointer if device is md/dm */ diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h index c82794f..9004d90 100644 --- a/include/linux/backing-dev.h +++ b/include/linux/backing-dev.h @@ -24,6 +24,7 @@ __printf(3, 4) int bdi_register(struct backing_dev_info *bdi, struct device *parent, const char *fmt, ...); int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev); +void bdi_sysfs_del(struct backing_dev_info *bdi); void bdi_unregister(struct backing_dev_info *bdi); int __must_check bdi_setup_and_register(struct backing_dev_info *, char *); diff --git a/mm/backing-dev.c b/mm/backing-dev.c index 8ed2ffd..b56971f 100644 --- a/mm/backing-dev.c +++ b/mm/backing-dev.c @@ -774,6 +774,7 @@ int bdi_init(struct backing_dev_info *bdi) int ret; bdi->dev = NULL; + bdi->is_visible = false; bdi->min_ratio = 0; bdi->max_ratio = 100; @@ -806,6 +807,7 @@ int bdi_register(struct backing_dev_info *bdi, struct device *parent, return PTR_ERR(dev); bdi->dev = dev; + bdi->is_visible = true; bdi_debug_register(bdi, dev_name(dev)); set_bit(WB_registered, &bdi->wb.state); @@ -837,6 +839,28 @@ static void bdi_remove_from_list(struct backing_dev_info *bdi) synchronize_rcu_expedited(); } +/** + * bdi_sysfs_del - remove a BDI device from sysfs + * @bdi: BDI device pointer. + * + * It is safe to call this function more than once. + */ +void bdi_sysfs_del(struct backing_dev_info *bdi) +{ + bool is_visible = false; + + spin_lock_bh(&bdi_lock); + swap(bdi->is_visible, is_visible); + spin_unlock_bh(&bdi_lock); + + if (!is_visible) + return; + + bdi_debug_unregister(bdi); + device_del(bdi->dev); +} +EXPORT_SYMBOL(bdi_sysfs_del); + void bdi_unregister(struct backing_dev_info *bdi) { /* make sure nobody finds us on the bdi_list anymore */ @@ -845,8 +869,8 @@ void bdi_unregister(struct backing_dev_info *bdi) cgwb_bdi_destroy(bdi); if (bdi->dev) { - bdi_debug_unregister(bdi); - device_unregister(bdi->dev); + bdi_sysfs_del(bdi); + put_device(bdi->dev); bdi->dev = NULL; } }
Unregister and reregister BDI devices in the proper order. This patch avoids that the following kernel warning can be triggered during SCSI device reregistration: WARNING: CPU: 7 PID: 203 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x68/0x80() sysfs: cannot create duplicate filename '/devices/virtual/bdi/8:32' Workqueue: events_unbound async_run_entry_fn Call Trace: [<ffffffff814ff5a4>] dump_stack+0x4c/0x65 [<ffffffff810746ba>] warn_slowpath_common+0x8a/0xc0 [<ffffffff81074736>] warn_slowpath_fmt+0x46/0x50 [<ffffffff81237ca8>] sysfs_warn_dup+0x68/0x80 [<ffffffff81237d8e>] sysfs_create_dir_ns+0x7e/0x90 [<ffffffff81291f58>] kobject_add_internal+0xa8/0x320 [<ffffffff812923a0>] kobject_add+0x60/0xb0 [<ffffffff8138c937>] device_add+0x107/0x5e0 [<ffffffff8138d018>] device_create_groups_vargs+0xd8/0x100 [<ffffffff8138d05c>] device_create_vargs+0x1c/0x20 [<ffffffff8117f233>] bdi_register+0x63/0x2a0 [<ffffffff8117f497>] bdi_register_dev+0x27/0x30 [<ffffffff81281549>] add_disk+0x1a9/0x4e0 [<ffffffffa00c5739>] sd_probe_async+0x119/0x1d0 [sd_mod] [<ffffffff8109a81a>] async_run_entry_fn+0x4a/0x140 [<ffffffff81091078>] process_one_work+0x1d8/0x7c0 [<ffffffff81091774>] worker_thread+0x114/0x460 [<ffffffff81097878>] kthread+0xf8/0x110 [<ffffffff8150801f>] ret_from_fork+0x3f/0x70 Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Jens Axboe <axboe@fb.com> Cc: Tejun Heo <tj@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Hannes Reinecke <hare@suse.de> Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: <stable@vger.kernel.org> --- drivers/scsi/scsi_sysfs.c | 2 ++ include/linux/backing-dev-defs.h | 1 + include/linux/backing-dev.h | 1 + mm/backing-dev.c | 28 ++++++++++++++++++++++++++-- 4 files changed, 30 insertions(+), 2 deletions(-)