Message ID | 20220216150901.4166235-2-hch@lst.de (mailing list archive) |
---|---|
State | New, archived |
Series | [1/2] block: fix surprise removal for drivers calling blk_set_queue_dying |
On 2/16/22 16:09, Christoph Hellwig wrote:
> For surprise removals that have already marked the disk dead, there is
> no point in calling fsync_bdev as all I/O will fail anyway, so skip it.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/genhd.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 626c8406f21a6..f68bdfe4f883b 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -584,7 +584,14 @@ void del_gendisk(struct gendisk *disk)
>  	blk_drop_partitions(disk);
>  	mutex_unlock(&disk->open_mutex);
>
> -	fsync_bdev(disk->part0);
> +	/*
> +	 * If this is not a surprise removal see if there is a file system
> +	 * mounted on this device and sync it (although this won't work for
> +	 * partitions). For surprise removals that have already marked the
> +	 * disk dead skip this call as no I/O is possible anyway.
> +	 */
> +	if (!test_bit(GD_DEAD, &disk->state))
> +		fsync_bdev(disk->part0);
>  	__invalidate_device(disk->part0, true);
>
>  	/*

My turn to be picky:
In the previous patch you use 'set_bit()' for GD_DEAD, which to my
knowledge doesn't imply a memory barrier.
Yet here you rely on that for the 'test_bit()' to return the correct/most
recent value.
Don't we need a memory barrier here somewhere?

Cheers,

Hannes
On Wed, Feb 16, 2022 at 04:18:43PM +0100, Hannes Reinecke wrote:
> My turn to be picky:
> In the previous patch you use 'set_bit()' for GD_DEAD, which to my
> knowledge doesn't imply a memory barrier.
> Yet here you rely on that for the 'test_bit()' to return the correct/most
> recent value.
> Don't we need a memory barrier here somewhere?

Well, we only do the test to skip useless work. A race is not a
problem here.
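
To illustrate the argument above, here is a minimal user-space sketch of an advisory flag check, written with C11 atomics rather than the kernel bitops. Everything named `fake_*` or `FAKE_*` is hypothetical and exists only for this example; it is not kernel code and does not reproduce `del_gendisk()`. The relaxed memory order is an assumption chosen to mirror an atomic but unordered flag update. The point it tries to show is that the flag only decides whether an optional step runs: the step that must always happen is outside the check, so a stale read of the flag costs a pointless sync attempt and nothing more.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a "disk is dead" state bit. */
#define FAKE_GD_DEAD (1UL << 0)

/* Hypothetical stand-in for a disk; the counters only record what ran. */
struct fake_disk {
	atomic_ulong state;
	int sync_calls;
	int invalidate_calls;
};

static void fake_disk_init(struct fake_disk *d)
{
	assert(d != NULL);
	atomic_init(&d->state, 0UL);
	d->sync_calls = 0;
	d->invalidate_calls = 0;
}

/* Atomic read-modify-write with no ordering guarantees (relaxed). */
static void fake_mark_dead(struct fake_disk *d)
{
	atomic_fetch_or_explicit(&d->state, FAKE_GD_DEAD,
				 memory_order_relaxed);
}

/* Relaxed load: the result may be stale relative to other threads. */
static bool fake_is_dead(struct fake_disk *d)
{
	unsigned long s = atomic_load_explicit(&d->state,
					       memory_order_relaxed);

	return (s & FAKE_GD_DEAD) != 0;
}

static void fake_teardown(struct fake_disk *d)
{
	/*
	 * Advisory check: skipping the sync is only an optimization.
	 * If the flag is read as clear on a dead disk, the sync is
	 * attempted anyway, which is wasted work but not incorrect.
	 */
	if (!fake_is_dead(d))
		d->sync_calls++;

	/* The mandatory step runs regardless of the flag. */
	d->invalidate_calls++;
}
```

Calling `fake_teardown()` on a disk that was never marked dead runs both steps; calling it after `fake_mark_dead()` runs only the mandatory one.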
On 2/16/22 16:20, Christoph Hellwig wrote:
> On Wed, Feb 16, 2022 at 04:18:43PM +0100, Hannes Reinecke wrote:
>> My turn to be picky:
>> In the previous patch you use 'set_bit()' for GD_DEAD, which to my
>> knowledge doesn't imply a memory barrier.
>> Yet here you rely on that for the 'test_bit()' to return the correct/most
>> recent value.
>> Don't we need a memory barrier here somewhere?
>
> Well, we only do the test to skip useless work. A race is not a
> problem here.

Ok.

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
On Wed, Feb 16, 2022 at 04:09:01PM +0100, Christoph Hellwig wrote:
> For surprise removals that have already marked the disk dead, there is
> no point in calling fsync_bdev as all I/O will fail anyway, so skip it.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/genhd.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 626c8406f21a6..f68bdfe4f883b 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -584,7 +584,14 @@ void del_gendisk(struct gendisk *disk)
>  	blk_drop_partitions(disk);

blk_drop_partitions() also invokes fsync_bdev() via delete_partition().
So why treat them differently?

>  	mutex_unlock(&disk->open_mutex);
>
> -	fsync_bdev(disk->part0);
> +	/*
> +	 * If this is not a surprise removal see if there is a file system
> +	 * mounted on this device and sync it (although this won't work for
> +	 * partitions). For surprise removals that have already marked the
> +	 * disk dead skip this call as no I/O is possible anyway.
> +	 */
> +	if (!test_bit(GD_DEAD, &disk->state))
> +		fsync_bdev(disk->part0);
>  	__invalidate_device(disk->part0, true);
>
>  	/*
> --
> 2.30.2
>
On Wed, Feb 16, 2022 at 04:09:01PM +0100, Christoph Hellwig wrote:
> -	fsync_bdev(disk->part0);
> +	/*
> +	 * If this is not a surprise removal see if there is a file system
> +	 * mounted on this device and sync it (although this won't work for
> +	 * partitions). For surprise removals that have already marked the
> +	 * disk dead skip this call as no I/O is possible anyway.
> +	 */
> +	if (!test_bit(GD_DEAD, &disk->state))
> +		fsync_bdev(disk->part0);
>  	__invalidate_device(disk->part0, true);

It used to be that any dirty pages would attempt to write, and get an
error on a surprise removal. Now that you're skipping the fsync_bdev(),
is something else taking responsibility to error those pages?
On Wed, Feb 16, 2022 at 07:37:22AM -0800, Keith Busch wrote:
> On Wed, Feb 16, 2022 at 04:09:01PM +0100, Christoph Hellwig wrote:
> > -	fsync_bdev(disk->part0);
> > +	/*
> > +	 * If this is not a surprise removal see if there is a file system
> > +	 * mounted on this device and sync it (although this won't work for
> > +	 * partitions). For surprise removals that have already marked the
> > +	 * disk dead skip this call as no I/O is possible anyway.
> > +	 */
> > +	if (!test_bit(GD_DEAD, &disk->state))
> > +		fsync_bdev(disk->part0);
> >  	__invalidate_device(disk->part0, true);
>
> It used to be that any dirty pages would attempt to write, and get an
> error on a surprise removal. Now that you're skipping the fsync_bdev(),
> is something else taking responsibility to error those pages?

truncate_inode_pages when closing the device.
On Wed, Feb 16, 2022 at 04:32:26PM +0100, Markus Blöchl wrote:
> On Wed, Feb 16, 2022 at 04:09:01PM +0100, Christoph Hellwig wrote:
> > For surprise removals that have already marked the disk dead, there is
> > no point in calling fsync_bdev as all I/O will fail anyway, so skip it.
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  block/genhd.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/genhd.c b/block/genhd.c
> > index 626c8406f21a6..f68bdfe4f883b 100644
> > --- a/block/genhd.c
> > +++ b/block/genhd.c
> > @@ -584,7 +584,14 @@ void del_gendisk(struct gendisk *disk)
> >  	blk_drop_partitions(disk);
>
> blk_drop_partitions() also invokes fsync_bdev() via delete_partition().
> So why treat them differently?

Yeah. I guess we should just skip this patch for now.
diff --git a/block/genhd.c b/block/genhd.c
index 626c8406f21a6..f68bdfe4f883b 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -584,7 +584,14 @@ void del_gendisk(struct gendisk *disk)
 	blk_drop_partitions(disk);
 	mutex_unlock(&disk->open_mutex);
 
-	fsync_bdev(disk->part0);
+	/*
+	 * If this is not a surprise removal see if there is a file system
+	 * mounted on this device and sync it (although this won't work for
+	 * partitions). For surprise removals that have already marked the
+	 * disk dead skip this call as no I/O is possible anyway.
+	 */
+	if (!test_bit(GD_DEAD, &disk->state))
+		fsync_bdev(disk->part0);
 	__invalidate_device(disk->part0, true);
 
 	/*
For surprise removals that have already marked the disk dead, there is
no point in calling fsync_bdev as all I/O will fail anyway, so skip it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/genhd.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)