[v2] block: Try to handle busy underlying device on discard

Message ID 20210222094809.21775-1-jack@suse.cz (mailing list archive)
State New, archived

Commit Message

Jan Kara Feb. 22, 2021, 9:48 a.m. UTC
Commit 384d87ef2c95 ("block: Do not discard buffers under a mounted
filesystem") made paths issuing discard or zeroout requests to the
underlying device try to grab the block device in exclusive mode. If
that failed, we returned EBUSY to userspace. This, however, caused
unexpected fallout in userspace: e.g. FUSE filesystems issue discard
requests from userspace daemons although the device is open exclusively
by the kernel. Also, shrinking a logical volume with LVM issues discard
requests to a device which may be claimed exclusively because there's
another LV on the same PV. So, to avoid these userspace regressions,
fall back to invalidate_inode_pages2_range() instead of returning EBUSY
to userspace, and return EBUSY only if that call fails as well (meaning
that someone is indeed using the particular device range we are trying
to discard).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=211167
Fixes: 384d87ef2c95 ("block: Do not discard buffers under a mounted filesystem")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/block_dev.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
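
The control flow described in the commit message can be sketched as a toy userspace model. This is only an illustration under stated assumptions, not kernel code: `struct toy_bdev` and the `toy_*` helpers are hypothetical stand-ins for the exclusive claim and for page cache invalidation, and each one reduces the real behaviour to a single flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a block device; not a kernel structure. */
struct toy_bdev {
	bool claimed_exclusively;	/* someone else holds the device exclusively */
	bool range_in_use;		/* cached pages in the range cannot be dropped */
};

/* Models the attempt to claim the device: fails if it is already claimed. */
static int toy_prepare_to_claim(const struct toy_bdev *bdev)
{
	return bdev->claimed_exclusively ? -EBUSY : 0;
}

/* Models invalidation: fails only if the range itself is busy. */
static int toy_invalidate_range(const struct toy_bdev *bdev)
{
	return bdev->range_in_use ? -EBUSY : 0;
}

/*
 * Models the patched behaviour: a failed claim no longer returns -EBUSY
 * directly. It falls back to invalidation, and only a failure there is
 * reported to the caller.
 */
int toy_truncate_bdev_range(const struct toy_bdev *bdev)
{
	assert(bdev);

	if (toy_prepare_to_claim(bdev) == 0)
		return 0;	/* claim succeeded: pages are truncated unconditionally */

	return toy_invalidate_range(bdev);
}
```

In this model, a device that is claimed by someone else but whose range is idle yields 0, which corresponds to the FUSE and LVM cases above. Only a claimed device with a busy range yields -EBUSY.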

Comments

Jan Kara Feb. 22, 2021, 11:59 a.m. UTC | #1
On Mon 22-02-21 10:48:09, Jan Kara wrote:

Before I forget: I'd like to add two Tested-by tags to give credit to
the people who helped with testing.

Tested-by: Richard W.M. Jones <rjones@redhat.com>
Tested-by: Andreas Klauer <Andreas.Klauer@metamorpher.de>

									Honza

Jan Kara March 4, 2021, 12:02 p.m. UTC | #2
On Mon 22-02-21 10:48:09, Jan Kara wrote:

Ping guys? Can we get this reviewed and merged? Thanks!

								Honza

Christoph Hellwig March 5, 2021, 1:12 p.m. UTC | #3
Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

Jens Axboe March 5, 2021, 6:27 p.m. UTC | #4
On 2/22/21 2:48 AM, Jan Kara wrote:

This missed -rc2, but I'll queue it up for -rc3.

Patch

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 235b5042672e..c33151020bcd 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -118,13 +118,22 @@ int truncate_bdev_range(struct block_device *bdev, fmode_t mode,
 	if (!(mode & FMODE_EXCL)) {
 		int err = bd_prepare_to_claim(bdev, truncate_bdev_range);
 		if (err)
-			return err;
+			goto invalidate;
 	}
 
 	truncate_inode_pages_range(bdev->bd_inode->i_mapping, lstart, lend);
 	if (!(mode & FMODE_EXCL))
 		bd_abort_claiming(bdev, truncate_bdev_range);
 	return 0;
+
+invalidate:
+	/*
+	 * Someone else has handle exclusively open. Try invalidating instead.
+	 * The 'end' argument is inclusive so the rounding is safe.
+	 */
+	return invalidate_inode_pages2_range(bdev->bd_inode->i_mapping,
+					     lstart >> PAGE_SHIFT,
+					     lend >> PAGE_SHIFT);
 }
 EXPORT_SYMBOL(truncate_bdev_range);
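
The comment about the inclusive 'end' argument can be checked with a little arithmetic. The sketch below is a userspace illustration only: `TOY_PAGE_SHIFT` assumes 4 KiB pages, `toy_byte_range_to_pages()` and `struct toy_page_span` are hypothetical names, and the kernel's signed `loff_t` offsets are replaced by `uint64_t` for simplicity. It maps an inclusive byte range to the inclusive range of page indices, the same way the two shifts in the patch do.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Inclusive range of page indices covering a byte range. */
struct toy_page_span {
	uint64_t first;
	uint64_t last;
};

/*
 * Both ends are shifted down. Because 'lend' is the last byte of the
 * range (inclusive), shifting it gives the index of the last page that
 * the range touches, so no page is missed and none beyond it is added.
 */
struct toy_page_span toy_byte_range_to_pages(uint64_t lstart, uint64_t lend)
{
	struct toy_page_span span;

	assert(lstart <= lend);
	span.first = lstart >> TOY_PAGE_SHIFT;
	span.last = lend >> TOY_PAGE_SHIFT;
	return span;
}
```

For example, under the 4 KiB assumption the byte range 4096..8191 maps to page 1 only, while 100..5000 maps to pages 0 and 1. Partially covered pages at either end fall inside the span.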