diff mbox series

[v2] btrfs: allow defrag to be interruptible

Message ID 9a755aa9a3528e385c6dee47e536cd2fe539ab59.1642513202.git.fdmanana@suse.com (mailing list archive)
State New, archived
Headers show
Series [v2] btrfs: allow defrag to be interruptible | expand

Commit Message

Filipe Manana Jan. 18, 2022, 1:43 p.m. UTC
From: Filipe Manana <fdmanana@suse.com>

During defrag, at btrfs_defrag_file(), we have this loop that iterates
over a file range in steps no larger than 256K subranges. If the range
is too long, there's no way to interrupt it. So make the loop check in
each iteration if there's signal pending, and if there is, break and
return -AGAIN to userspace.

Before kernel 5.16, we used to allow defrag to be cancelled through a
signal, but that was lost with commit 7b508037d4cac3 ("btrfs: defrag:
use defrag_one_cluster() to implement btrfs_defrag_file()").

This change adds back the possibility to cancel a defrag with a signal
and keeps the same semantics, returning -EAGAIN to user space (and not
the usually more expected -EINTR).

This is also motivated by a recent bug on 5.16 where defragging a 1 byte
file resulted in iterating from file range 0 to (u64)-1, as hitting the
bug triggered a too long loop, basically requiring one to reboot the
machine, as it was not possible to cancel defrag.

Fixes: 7b508037d4cac3 ("btrfs: defrag: use defrag_one_cluster() to implement btrfs_defrag_file()")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---

v2: Add Fixes tag, as it's actually a regression introduced in 5.16.
    Use the helper btrfs_defrag_cancelled() instead of fatal_signal_pending()
    and return -EAGAIN instead of -EINTR to userspace, as that is what used
    to be returned before 5.16.

 fs/btrfs/ioctl.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Qu Wenruo Jan. 19, 2022, 1:18 a.m. UTC | #1
On 2022/1/18 21:43, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> During defrag, at btrfs_defrag_file(), we have this loop that iterates
> over a file range in steps no larger than 256K subranges. If the range
> is too long, there's no way to interrupt it. So make the loop check in
> each iteration if there's signal pending, and if there is, break and
> return -AGAIN to userspace.
>
> Before kernel 5.16, we used to allow defrag to be cancelled through a
> signal, but that was lost with commit 7b508037d4cac3 ("btrfs: defrag:
> use defrag_one_cluster() to implement btrfs_defrag_file()").
>
> This change adds back the possibility to cancel a defrag with a signal
> and keeps the same semantics, returning -EAGAIN to user space (and not
> the usually more expected -EINTR).
>
> This is also motivated by a recent bug on 5.16 where defragging a 1 byte
> file resulted in iterating from file range 0 to (u64)-1, as hitting the
> bug triggered a too long loop, basically requiring one to reboot the
> machine, as it was not possible to cancel defrag.
>
> Fixes: 7b508037d4cac3 ("btrfs: defrag: use defrag_one_cluster() to implement btrfs_defrag_file()")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>
> v2: Add Fixes tag, as it's actually a regression introduced in 5.16.
>      Use the helper btrfs_defrag_cancelled() instead of fatal_signal_pending()
>      and return -EAGAIN instead of -EINTR to userspace, as that is what used
>      to be returned before 5.16.
>
>   fs/btrfs/ioctl.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 6ad2bc2e5af3..6391be7409d8 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1546,6 +1546,11 @@ int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
>   		/* The cluster size 256K should always be page aligned */
>   		BUILD_BUG_ON(!IS_ALIGNED(CLUSTER_SIZE, PAGE_SIZE));
>
> +		if (btrfs_defrag_cancelled(fs_info)) {
> +			ret = -EAGAIN;
> +			break;
> +		}
> +
>   		/* We want the cluster end at page boundary when possible */
>   		cluster_end = (((cur >> PAGE_SHIFT) +
>   			       (SZ_256K >> PAGE_SHIFT)) << PAGE_SHIFT) - 1;
David Sterba Jan. 19, 2022, 5:11 p.m. UTC | #2
On Tue, Jan 18, 2022 at 01:43:31PM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> During defrag, at btrfs_defrag_file(), we have this loop that iterates
> over a file range in steps no larger than 256K subranges. If the range
> is too long, there's no way to interrupt it. So make the loop check in
> each iteration if there's signal pending, and if there is, break and
> return -AGAIN to userspace.
> 
> Before kernel 5.16, we used to allow defrag to be cancelled through a
> signal, but that was lost with commit 7b508037d4cac3 ("btrfs: defrag:
> use defrag_one_cluster() to implement btrfs_defrag_file()").
> 
> This change adds back the possibility to cancel a defrag with a signal
> and keeps the same semantics, returning -EAGAIN to user space (and not
> the usually more expected -EINTR).
> 
> This is also motivated by a recent bug on 5.16 where defragging a 1 byte
> file resulted in iterating from file range 0 to (u64)-1, as hitting the
> bug triggered a too long loop, basically requiring one to reboot the
> machine, as it was not possible to cancel defrag.
> 
> Fixes: 7b508037d4cac3 ("btrfs: defrag: use defrag_one_cluster() to implement btrfs_defrag_file()")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
> 
> v2: Add Fixes tag, as it's actually a regression introduced in 5.16.
>     Use the helper btrfs_defrag_cancelled() instead of fatal_signal_pending()
>     and return -EAGAIN instead of -EINTR to userspace, as that is what used
>     to be returned before 5.16.

Added to misc-next, thanks.
diff mbox series

Patch

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 6ad2bc2e5af3..6391be7409d8 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1546,6 +1546,11 @@  int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 		/* The cluster size 256K should always be page aligned */
 		BUILD_BUG_ON(!IS_ALIGNED(CLUSTER_SIZE, PAGE_SIZE));
 
+		if (btrfs_defrag_cancelled(fs_info)) {
+			ret = -EAGAIN;
+			break;
+		}
+
 		/* We want the cluster end at page boundary when possible */
 		cluster_end = (((cur >> PAGE_SHIFT) +
 			       (SZ_256K >> PAGE_SHIFT)) << PAGE_SHIFT) - 1;