diff mbox

[02/11] block: Fix race of bdev open with gendisk shutdown

Message ID 20170306163404.1238-3-jack@suse.cz (mailing list archive)
State New, archived
Headers show

Commit Message

Jan Kara March 6, 2017, 4:33 p.m. UTC
blkdev_open() may race with gendisk shutdown in such a way that
del_gendisk() has already unhashed block device inode (and thus
bd_acquire() will end up creating new block device inode) however
gen_gendisk() will still return the gendisk that is being destroyed.
This will result in the new bdev inode being associated with bdi of the
request queue that is going away and when the device number gets
eventually reused, the block device will still be pointing to the stale
bdi.

Fix the problem by checking whether the gendisk is still alive when
associating bdev inode with it and its bdi. That way we are sure that
once we are unhashing block device inodes in del_gendisk(), newly
created bdev inodes cannot be associated with bdi of the deleted gendisk
anymore. We actually already do this check when opening a partition so
we need to add it only for the case when the whole device is opened.

Also add a warning that will tell us about unexpected inconsistencies
between bdi associated with the bdev inode and bdi associated with the
disk.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/genhd.c  | 7 ++++++-
 fs/block_dev.c | 5 ++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

Comments

Tejun Heo March 6, 2017, 10:04 p.m. UTC | #1
Hello,

On Mon, Mar 06, 2017 at 05:33:55PM +0100, Jan Kara wrote:
> +	disk->flags &= ~GENHD_FL_UP;
> +	/*
> +	 * Make sure __blkdev_open() sees the disk is going away before
> +	 * starting to unhash bdev inodes.
> +	 */
> +	smp_wmb();

But which rmb is this paired with?  Without paring memory barriers
don't do anything.  Given that this isn't a super hot path, I think
it'd be far better to stick to a simpler synchronization mechanism.

Thanks.
Jan Kara March 7, 2017, 11:14 a.m. UTC | #2
On Mon 06-03-17 17:04:59, Tejun Heo wrote:
> Hello,
> 
> On Mon, Mar 06, 2017 at 05:33:55PM +0100, Jan Kara wrote:
> > +	disk->flags &= ~GENHD_FL_UP;
> > +	/*
> > +	 * Make sure __blkdev_open() sees the disk is going away before
> > +	 * starting to unhash bdev inodes.
> > +	 */
> > +	smp_wmb();
> 
> But which rmb is this paired with?  Without paring memory barriers
> don't do anything.

It is paired with the fact that the test (disk->flags & GENHD_FL_UP) in
__blkdev_get() cannot be reordered before the bd_acquire() call in
blkdev_open() (or however the bdev is looked up). I was thinking so long
about how and where to sanely comment on that that I did not write anything
in the end.

I was also thinking about adding an explicit barrier before that test just
to make things explicit since as you say below the first blkdev open is not
that hot path. Maybe that would be a better option?

> Given that this isn't a super hot path, I think it'd be far better to
> stick to a simpler synchronization mechanism.

I'd be happy to do so however struct gendisk does not have any lock in it
and introducing a special lock just for that seems awkward.

								Honza
Tejun Heo March 7, 2017, 7:43 p.m. UTC | #3
Hello,

On Tue, Mar 07, 2017 at 12:14:20PM +0100, Jan Kara wrote:
> > Given that this isn't a super hot path, I think it'd be far better to
> > stick to a simpler synchronization mechanism.
> 
> I'd be happy to do so however struct gendisk does not have any lock in it
> and introducing a special lock just for that seems awkward.

I think even a awkward shared lock is better than memory barriers.
Barriers are the trickiest locking construct to get right and we
really shouldn't using that unless absolutely necessary.

Thanks.
Jan Kara March 8, 2017, 2:25 p.m. UTC | #4
On Tue 07-03-17 14:43:49, Tejun Heo wrote:
> Hello,
> 
> On Tue, Mar 07, 2017 at 12:14:20PM +0100, Jan Kara wrote:
> > > Given that this isn't a super hot path, I think it'd be far better to
> > > stick to a simpler synchronization mechanism.
> > 
> > I'd be happy to do so however struct gendisk does not have any lock in it
> > and introducing a special lock just for that seems awkward.
> 
> I think even a awkward shared lock is better than memory barriers.
> Barriers are the trickiest locking construct to get right and we
> really shouldn't using that unless absolutely necessary.

OK, I'll try to come up with something.

								Honza
diff mbox

Patch

diff --git a/block/genhd.c b/block/genhd.c
index b26a5ea115d0..e8df37de03af 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -662,6 +662,12 @@  void del_gendisk(struct gendisk *disk)
 	struct disk_part_iter piter;
 	struct hd_struct *part;
 
+	disk->flags &= ~GENHD_FL_UP;
+	/*
+	 * Make sure __blkdev_open() sees the disk is going away before
+	 * starting to unhash bdev inodes.
+	 */
+	smp_wmb();
 	blk_integrity_del(disk);
 	disk_del_events(disk);
 
@@ -678,7 +684,6 @@  void del_gendisk(struct gendisk *disk)
 	invalidate_partition(disk, 0);
 	bdev_unhash_inode(disk_devt(disk));
 	set_capacity(disk, 0);
-	disk->flags &= ~GENHD_FL_UP;
 
 	sysfs_remove_link(&disk_to_dev(disk)->kobj, "bdi");
 	/*
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 53e2389ae4d4..9e1993a2827f 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1560,7 +1560,7 @@  static int __blkdev_get(struct block_device *bdev, fmode_t mode, int for_part)
 		if (!partno) {
 			ret = -ENXIO;
 			bdev->bd_part = disk_get_part(disk, partno);
-			if (!bdev->bd_part)
+			if (!(disk->flags & GENHD_FL_UP) || !bdev->bd_part)
 				goto out_clear;
 
 			ret = 0;
@@ -1623,6 +1623,9 @@  static int __blkdev_get(struct block_device *bdev, fmode_t mode, int for_part)
 
 		if (bdev->bd_bdi == &noop_backing_dev_info)
 			bdev->bd_bdi = bdi_get(disk->queue->backing_dev_info);
+		else
+			WARN_ON_ONCE(bdev->bd_bdi !=
+				     disk->queue->backing_dev_info);
 	} else {
 		if (bdev->bd_contains == bdev) {
 			ret = 0;