diff mbox

btrfs: add cancellation points to defrag

Message ID 1360453086-10683-1-git-send-email-dsterba@suse.cz (mailing list archive)
State New, archived
Headers show

Commit Message

David Sterba Feb. 9, 2013, 11:38 p.m. UTC
The defrag operation can take very long, we want to have a way how to
cancel it. The code checks for a pending signal at safe points in the
defrag loops and returns EAGAIN. This means a user can press ^C after
running 'btrfs fi defrag', woks for both defrag modes, files and root.

Returning from the command was instant in my light tests, but may take
longer depending on the aging factor of the filesystem.

Signed-off-by: David Sterba <dsterba@suse.cz>
---
 fs/btrfs/ctree.h       |    7 +++++++
 fs/btrfs/ioctl.c       |    6 ++++++
 fs/btrfs/transaction.c |    6 ++++++
 3 files changed, 19 insertions(+), 0 deletions(-)

Comments

Eric Sandeen Feb. 11, 2013, 4:59 p.m. UTC | #1
On 2/9/13 5:38 PM, David Sterba wrote:
> The defrag operation can take very long, we want to have a way how to
> cancel it. The code checks for a pending signal at safe points in the
> defrag loops and returns EAGAIN. This means a user can press ^C after
> running 'btrfs fi defrag', woks for both defrag modes, files and root.
> 
> Returning from the command was instant in my light tests, but may take
> longer depending on the aging factor of the filesystem.

When __btrfs_run_defrag_inode() calls btrfs_defrag_file() and gets
-EAGAIN back due to the cancellation, will it reset the defrag->
counters and call btrfs_requeue_inode_defrag()?  Is that ok?

Should __btrfs_run_defrag_inode explicitly check for and handle
an actual error returned to it?

Thanks,
-Eric

> Signed-off-by: David Sterba <dsterba@suse.cz>
> ---
>  fs/btrfs/ctree.h       |    7 +++++++
>  fs/btrfs/ioctl.c       |    6 ++++++
>  fs/btrfs/transaction.c |    6 ++++++
>  3 files changed, 19 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 547b7b0..4b41d7c 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -3745,4 +3745,11 @@ static inline int is_fstree(u64 rootid)
>  		return 1;
>  	return 0;
>  }
> +
> +static inline int btrfs_defrag_cancelled(struct btrfs_fs_info *fs_info)
> +{
> +	return signal_pending(current);
> +}
> +
> +
>  #endif
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 338f259..78a5580 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1206,6 +1206,12 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
>  		if (!(inode->i_sb->s_flags & MS_ACTIVE))
>  			break;
>  
> +		if (btrfs_defrag_cancelled(root->fs_info)) {
> +			printk(KERN_DEBUG "btrfs: defrag_file cancelled\n");
> +			ret = -EAGAIN;
> +			break;
> +		}
> +
>  		if (!should_defrag_range(inode, (u64)i << PAGE_CACHE_SHIFT,
>  					 extent_thresh, &last_len, &skip,
>  					 &defrag_end, range->flags &
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index fc03aa6..2c509c4 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -986,6 +986,12 @@ int btrfs_defrag_root(struct btrfs_root *root, int cacheonly)
>  
>  		if (btrfs_fs_closing(root->fs_info) || ret != -EAGAIN)
>  			break;
> +
> +		if (btrfs_defrag_cancelled(root->fs_info)) {
> +			printk(KERN_DEBUG "btrfs: defrag_root cancelled\n");
> +			ret = -EAGAIN;
> +			break;
> +		}
>  	}
>  	root->defrag_running = 0;
>  	return ret;
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba Feb. 11, 2013, 5:45 p.m. UTC | #2
On Mon, Feb 11, 2013 at 10:59:54AM -0600, Eric Sandeen wrote:
> On 2/9/13 5:38 PM, David Sterba wrote:
> > The defrag operation can take very long, we want to have a way how to
> > cancel it. The code checks for a pending signal at safe points in the
> > defrag loops and returns EAGAIN. This means a user can press ^C after
> > running 'btrfs fi defrag', woks for both defrag modes, files and root.
> > 
> > Returning from the command was instant in my light tests, but may take
> > longer depending on the aging factor of the filesystem.
> 
> When __btrfs_run_defrag_inode() calls btrfs_defrag_file() and gets
> -EAGAIN back due to the cancellation, will it reset the defrag->
> counters and call btrfs_requeue_inode_defrag()?  Is that ok?
> 
> Should __btrfs_run_defrag_inode explicitly check for and handle
> an actual error returned to it?

__btrfs_run_defrag_inode -> btrfs_defrag_file applies only in case of
autodefrag. The ioctl 'defrag' goes directly to btrfs_defrag_file and
what you describe does not happen here.

(The autodefrag loop runs within kernel threads and I want to avoid
enabling signals for them.)

I agree that the negative error code should be handled in
__btrfs_run_defrag_inode (there are two of them EINVAL and ENOMEM).
Requeing defrag might make sense in theory in the ENOMEM case, but
triggering more activity when the system is low on memory is not
practical.


david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Sandeen Feb. 11, 2013, 5:48 p.m. UTC | #3
On 2/11/13 11:45 AM, David Sterba wrote:
> On Mon, Feb 11, 2013 at 10:59:54AM -0600, Eric Sandeen wrote:
>> On 2/9/13 5:38 PM, David Sterba wrote:
>>> The defrag operation can take very long, we want to have a way how to
>>> cancel it. The code checks for a pending signal at safe points in the
>>> defrag loops and returns EAGAIN. This means a user can press ^C after
>>> running 'btrfs fi defrag', woks for both defrag modes, files and root.
>>>
>>> Returning from the command was instant in my light tests, but may take
>>> longer depending on the aging factor of the filesystem.
>>
>> When __btrfs_run_defrag_inode() calls btrfs_defrag_file() and gets
>> -EAGAIN back due to the cancellation, will it reset the defrag->
>> counters and call btrfs_requeue_inode_defrag()?  Is that ok?
>>
>> Should __btrfs_run_defrag_inode explicitly check for and handle
>> an actual error returned to it?
> 
> __btrfs_run_defrag_inode -> btrfs_defrag_file applies only in case of
> autodefrag. The ioctl 'defrag' goes directly to btrfs_defrag_file and
> what you describe does not happen here.

Ok, was thinking that might be the case, but wasn't sure what all that
worker thread handled.  So I guess there should be no signals in that case.

> (The autodefrag loop runs within kernel threads and I want to avoid
> enabling signals for them.)

Understood.

> I agree that the negative error code should be handled in
> __btrfs_run_defrag_inode (there are two of them EINVAL and ENOMEM).
> Requeing defrag might make sense in theory in the ENOMEM case, but
> triggering more activity when the system is low on memory is not
> practical.

*nod* - but a separate issue, I guess.

Thanks,
-Eric

> 
> david
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba Feb. 11, 2013, 6:14 p.m. UTC | #4
On Mon, Feb 11, 2013 at 11:48:20AM -0600, Eric Sandeen wrote:
> On 2/11/13 11:45 AM, David Sterba wrote:
> > __btrfs_run_defrag_inode -> btrfs_defrag_file applies only in case of
> > autodefrag. The ioctl 'defrag' goes directly to btrfs_defrag_file and
> > what you describe does not happen here.
> 
> Ok, was thinking that might be the case, but wasn't sure what all that
> worker thread handled.  So I guess there should be no signals in that case.

Just for the record, btrfs_run_defrag_inodes is called from cleaner
thread, so if we had a cleaner way to inform the thread to stop the
defrag it'd be possible to stop autodefrag as well. I tested defrag of a
1G file with ~90k of extents (produced by overwriting random 4k ranges
in the file) and then doing sequential rewrite of the file. Cpu and IO
activity went expectedly high and in case of many files under defrag
it'd be more flexible to actually control the internal defrag as well.
Not by a signal, because in case of more mounted filesystems one cannot
know which process to target.

> > I agree that the negative error code should be handled in
> > __btrfs_run_defrag_inode (there are two of them EINVAL and ENOMEM).
> > Requeing defrag might make sense in theory in the ENOMEM case, but
> > triggering more activity when the system is low on memory is not
> > practical.
> 
> *nod* - but a separate issue, I guess.

Yes.

david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 547b7b0..4b41d7c 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3745,4 +3745,11 @@  static inline int is_fstree(u64 rootid)
 		return 1;
 	return 0;
 }
+
+static inline int btrfs_defrag_cancelled(struct btrfs_fs_info *fs_info)
+{
+	return signal_pending(current);
+}
+
+
 #endif
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 338f259..78a5580 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1206,6 +1206,12 @@  int btrfs_defrag_file(struct inode *inode, struct file *file,
 		if (!(inode->i_sb->s_flags & MS_ACTIVE))
 			break;
 
+		if (btrfs_defrag_cancelled(root->fs_info)) {
+			printk(KERN_DEBUG "btrfs: defrag_file cancelled\n");
+			ret = -EAGAIN;
+			break;
+		}
+
 		if (!should_defrag_range(inode, (u64)i << PAGE_CACHE_SHIFT,
 					 extent_thresh, &last_len, &skip,
 					 &defrag_end, range->flags &
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index fc03aa6..2c509c4 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -986,6 +986,12 @@  int btrfs_defrag_root(struct btrfs_root *root, int cacheonly)
 
 		if (btrfs_fs_closing(root->fs_info) || ret != -EAGAIN)
 			break;
+
+		if (btrfs_defrag_cancelled(root->fs_info)) {
+			printk(KERN_DEBUG "btrfs: defrag_root cancelled\n");
+			ret = -EAGAIN;
+			break;
+		}
 	}
 	root->defrag_running = 0;
 	return ret;