Message ID | 20200717204221.2285627-1-josef@toxicpanda.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] btrfs: add an autodefrag property | expand |
On 2020/7/18 上午4:42, Josef Bacik wrote: > Autodefrag is very useful for somethings, like the 9000 sqllite files > that Firefox uses, but is way less useful for virt images. > Unfortunately this is only available currently as a whole mount option. > Fix this by adding an "autodefrag" property, that users can set on a per > file or per directory basis. Thus allowing them to control where > exactly the extra write activity is going to occur. > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> > --- > - This is an RFC because I want to make sure we're ok with this before I go and > add btrfs-progs support for this. I'm not married to the name or the value, > but I think the core goal is valuable. The idea looks pretty good to me. Although it would be much more convincing to bring some real-world micro bench to show the benefit. However I still have a concern related to defrag. Since it's on-disk flag, thus can be inherited by snapshot, then what would happen if an auto-defrag inode get snapshotted. Would any write to the auto-defrag inode in new snapshot break the space saving? > > fs/btrfs/btrfs_inode.h | 1 + > fs/btrfs/file.c | 16 ++++++++++------ > fs/btrfs/props.c | 41 +++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 52 insertions(+), 6 deletions(-) > > diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h > index e7d709505cb1..4f04f0535f90 100644 > --- a/fs/btrfs/btrfs_inode.h > +++ b/fs/btrfs/btrfs_inode.h > @@ -31,6 +31,7 @@ enum { > BTRFS_INODE_READDIO_NEED_LOCK, > BTRFS_INODE_HAS_PROPS, > BTRFS_INODE_SNAPSHOT_FLUSH, > + BTRFS_INODE_AUTODEFRAG, > }; > > /* in memory btrfs inode */ > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c > index 841c516079a9..cac2092bdcdf 100644 > --- a/fs/btrfs/file.c > +++ b/fs/btrfs/file.c > @@ -116,12 +116,16 @@ static int __btrfs_add_inode_defrag(struct btrfs_inode *inode, > return 0; > } > > -static inline int __need_auto_defrag(struct btrfs_fs_info *fs_info) > +static inline int __need_auto_defrag(struct btrfs_fs_info *fs_info, > + struct btrfs_inode *inode) We can grab fs_info from btrfs_inode, thus no need for the fs_info parameter. Thanks, Qu > { > - if (!btrfs_test_opt(fs_info, AUTO_DEFRAG)) > + if (btrfs_fs_closing(fs_info)) > return 0; > > - if (btrfs_fs_closing(fs_info)) > + if (inode && test_bit(BTRFS_INODE_AUTODEFRAG, &inode->runtime_flags)) > + return 1; > + > + if (!btrfs_test_opt(fs_info, AUTO_DEFRAG)) > return 0; > > return 1; > @@ -140,7 +144,7 @@ int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans, > u64 transid; > int ret; > > - if (!__need_auto_defrag(fs_info)) > + if (!__need_auto_defrag(fs_info, inode)) > return 0; > > if (test_bit(BTRFS_INODE_IN_DEFRAG, &inode->runtime_flags)) > @@ -187,7 +191,7 @@ static void btrfs_requeue_inode_defrag(struct btrfs_inode *inode, > struct btrfs_fs_info *fs_info = inode->root->fs_info; > int ret; > > - if (!__need_auto_defrag(fs_info)) > + if (!__need_auto_defrag(fs_info, inode)) > goto out; > > /* > @@ -348,7 +352,7 @@ int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info) > &fs_info->fs_state)) > break; > > - if (!__need_auto_defrag(fs_info)) > + if (btrfs_fs_closing(fs_info)) > break; > > /* find an inode to defrag */ > diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c > index 2dcb1cb21634..5c6411c8b19f 100644 > --- a/fs/btrfs/props.c > +++ b/fs/btrfs/props.c > @@ -310,6 +310,40 @@ static const char *prop_compression_extract(struct inode *inode) > return NULL; > } > > +static int prop_autodefrag_validate(const char *value, size_t len) > +{ > + if (!value) > + return 0; > + if (!strncmp("true", value, 4)) > + return 0; > + return -EINVAL; > +} > + > +static int prop_autodefrag_apply(struct inode *inode, const char *value, > + size_t len) > +{ > + if (len == 0) { > + clear_bit(BTRFS_INODE_AUTODEFRAG, > + &BTRFS_I(inode)->runtime_flags); > + return 0; > + } > + > + if (!strncmp("true", value, 4)) { > + set_bit(BTRFS_INODE_AUTODEFRAG, > + &BTRFS_I(inode)->runtime_flags); > + return 0; > + } > + > + return -EINVAL; > +} > + > +static const char *prop_autodefrag_extract(struct inode *inode) > +{ > + if (test_bit(BTRFS_INODE_AUTODEFRAG, &BTRFS_I(inode)->runtime_flags)) > + return "true"; > + return NULL; > +} > + > static struct prop_handler prop_handlers[] = { > { > .xattr_name = XATTR_BTRFS_PREFIX "compression", > @@ -318,6 +352,13 @@ static struct prop_handler prop_handlers[] = { > .extract = prop_compression_extract, > .inheritable = 1 > }, > + { > + .xattr_name = XATTR_BTRFS_PREFIX "autodefrag", > + .validate = prop_autodefrag_validate, > + .apply = prop_autodefrag_apply, > + .extract = prop_autodefrag_extract, > + .inheritable = 1 > + }, > }; > > static int inherit_props(struct btrfs_trans_handle *trans, >
On 7/17/20 7:46 PM, Qu Wenruo wrote: > > > On 2020/7/18 上午4:42, Josef Bacik wrote: >> Autodefrag is very useful for somethings, like the 9000 sqllite files >> that Firefox uses, but is way less useful for virt images. >> Unfortunately this is only available currently as a whole mount option. >> Fix this by adding an "autodefrag" property, that users can set on a per >> file or per directory basis. Thus allowing them to control where >> exactly the extra write activity is going to occur. >> >> Signed-off-by: Josef Bacik <josef@toxicpanda.com> >> --- >> - This is an RFC because I want to make sure we're ok with this before I go and >> add btrfs-progs support for this. I'm not married to the name or the value, >> but I think the core goal is valuable. > > The idea looks pretty good to me. > > Although it would be much more convincing to bring some real-world micro > bench to show the benefit. > The same benefit as what existed originally for autodefrag, firefox isn't unusably slow on spinning rust. > > However I still have a concern related to defrag. > Since it's on-disk flag, thus can be inherited by snapshot, then what > would happen if an auto-defrag inode get snapshotted. > > Would any write to the auto-defrag inode in new snapshot break the space > saving? Sure but that's the case for all defrag, and this is in fact better than mount -o autodefrag because you can limit the damage. Thanks, Josef
On 2020/7/18 上午7:58, Josef Bacik wrote: > On 7/17/20 7:46 PM, Qu Wenruo wrote: >> >> >> On 2020/7/18 上午4:42, Josef Bacik wrote: >>> Autodefrag is very useful for somethings, like the 9000 sqllite files >>> that Firefox uses, but is way less useful for virt images. >>> Unfortunately this is only available currently as a whole mount option. >>> Fix this by adding an "autodefrag" property, that users can set on a per >>> file or per directory basis. Thus allowing them to control where >>> exactly the extra write activity is going to occur. >>> >>> Signed-off-by: Josef Bacik <josef@toxicpanda.com> >>> --- >>> - This is an RFC because I want to make sure we're ok with this >>> before I go and >>> add btrfs-progs support for this. I'm not married to the name or >>> the value, >>> but I think the core goal is valuable. >> >> The idea looks pretty good to me. >> >> Although it would be much more convincing to bring some real-world micro >> bench to show the benefit. >> > > The same benefit as what existed originally for autodefrag, firefox > isn't unusably slow on spinning rust. > >> >> However I still have a concern related to defrag. >> Since it's on-disk flag, thus can be inherited by snapshot, then what >> would happen if an auto-defrag inode get snapshotted. >> >> Would any write to the auto-defrag inode in new snapshot break the space >> saving? > Sure but that's the case for all defrag, and this is in fact better than > mount -o autodefrag because you can limit the damage. Thanks, That's right. So no problem at all then. Thanks, Qu > > Josef
On 17.07.20 г. 23:42 ч., Josef Bacik wrote: > Autodefrag is very useful for somethings, like the 9000 sqllite files > that Firefox uses, but is way less useful for virt images. > Unfortunately this is only available currently as a whole mount option. > Fix this by adding an "autodefrag" property, that users can set on a per > file or per directory basis. Thus allowing them to control where > exactly the extra write activity is going to occur. > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> > --- > - This is an RFC because I want to make sure we're ok with this before I go and > add btrfs-progs support for this. I'm not married to the name or the value, > but I think the core goal is valuable. > > fs/btrfs/btrfs_inode.h | 1 + > fs/btrfs/file.c | 16 ++++++++++------ > fs/btrfs/props.c | 41 +++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 52 insertions(+), 6 deletions(-) > > diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h > index e7d709505cb1..4f04f0535f90 100644 > --- a/fs/btrfs/btrfs_inode.h > +++ b/fs/btrfs/btrfs_inode.h > @@ -31,6 +31,7 @@ enum { > BTRFS_INODE_READDIO_NEED_LOCK, > BTRFS_INODE_HAS_PROPS, > BTRFS_INODE_SNAPSHOT_FLUSH, > + BTRFS_INODE_AUTODEFRAG, > }; > > /* in memory btrfs inode */ > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c > index 841c516079a9..cac2092bdcdf 100644 > --- a/fs/btrfs/file.c > +++ b/fs/btrfs/file.c > @@ -116,12 +116,16 @@ static int __btrfs_add_inode_defrag(struct btrfs_inode *inode, > return 0; > } > > -static inline int __need_auto_defrag(struct btrfs_fs_info *fs_info) > +static inline int __need_auto_defrag(struct btrfs_fs_info *fs_info, > + struct btrfs_inode *inode) > { > - if (!btrfs_test_opt(fs_info, AUTO_DEFRAG)) > + if (btrfs_fs_closing(fs_info)) > return 0; > > - if (btrfs_fs_closing(fs_info)) > + if (inode && test_bit(BTRFS_INODE_AUTODEFRAG, &inode->runtime_flags)) > + return 1; Qu already made a comment about accessing fs_info from inode, but since you check for the presence of inode here, when do you expect for it to be NULL ? If not, then that check is redundant. > + > + if (!btrfs_test_opt(fs_info, AUTO_DEFRAG)) > return 0; > > return 1; nit: while at it use the occasion to convert the function to proper bool <snip>
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On Fri, 2020-07-17 at 16:42 -0400, Josef Bacik wrote: > Autodefrag is very useful for somethings, like the 9000 sqllite files > that Firefox uses, but is way less useful for virt images. > Unfortunately this is only available currently as a whole mount > option. > Fix this by adding an "autodefrag" property, that users can set on a > per > file or per directory basis. Thus allowing them to control where > exactly the extra write activity is going to occur. This would be cool so that openSUSE could drop their autodefrag plugin for zypper (essentially forces defragmentation of rpmdb) and we could write some code to set this property for RPMDB inside RPM itself. > Signed-off-by: Josef Bacik <josef@toxicpanda.com> > --- > - This is an RFC because I want to make sure we're ok with this > before I go and > add btrfs-progs support for this. I'm not married to the name or > the value, > but I think the core goal is valuable. > > fs/btrfs/btrfs_inode.h | 1 + > fs/btrfs/file.c | 16 ++++++++++------ > fs/btrfs/props.c | 41 > +++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 52 insertions(+), 6 deletions(-) > > diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h > index e7d709505cb1..4f04f0535f90 100644 > --- a/fs/btrfs/btrfs_inode.h > +++ b/fs/btrfs/btrfs_inode.h > @@ -31,6 +31,7 @@ enum { > BTRFS_INODE_READDIO_NEED_LOCK, > BTRFS_INODE_HAS_PROPS, > BTRFS_INODE_SNAPSHOT_FLUSH, > + BTRFS_INODE_AUTODEFRAG, > }; > > /* in memory btrfs inode */ > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c > index 841c516079a9..cac2092bdcdf 100644 > --- a/fs/btrfs/file.c > +++ b/fs/btrfs/file.c > @@ -116,12 +116,16 @@ static int __btrfs_add_inode_defrag(struct > btrfs_inode *inode, > return 0; > } > > -static inline int __need_auto_defrag(struct btrfs_fs_info *fs_info) > +static inline int __need_auto_defrag(struct btrfs_fs_info *fs_info, > + struct btrfs_inode *inode) > { > - if (!btrfs_test_opt(fs_info, AUTO_DEFRAG)) > + if (btrfs_fs_closing(fs_info)) > return 0; > > - if (btrfs_fs_closing(fs_info)) > + if (inode && test_bit(BTRFS_INODE_AUTODEFRAG, &inode- > >runtime_flags)) > + return 1; > + > + if (!btrfs_test_opt(fs_info, AUTO_DEFRAG)) > return 0; > > return 1; > @@ -140,7 +144,7 @@ int btrfs_add_inode_defrag(struct > btrfs_trans_handle *trans, > u64 transid; > int ret; > > - if (!__need_auto_defrag(fs_info)) > + if (!__need_auto_defrag(fs_info, inode)) > return 0; > > if (test_bit(BTRFS_INODE_IN_DEFRAG, &inode->runtime_flags)) > @@ -187,7 +191,7 @@ static void btrfs_requeue_inode_defrag(struct > btrfs_inode *inode, > struct btrfs_fs_info *fs_info = inode->root->fs_info; > int ret; > > - if (!__need_auto_defrag(fs_info)) > + if (!__need_auto_defrag(fs_info, inode)) > goto out; > > /* > @@ -348,7 +352,7 @@ int btrfs_run_defrag_inodes(struct btrfs_fs_info > *fs_info) > &fs_info->fs_state)) > break; > > - if (!__need_auto_defrag(fs_info)) > + if (btrfs_fs_closing(fs_info)) > break; > > /* find an inode to defrag */ > diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c > index 2dcb1cb21634..5c6411c8b19f 100644 > --- a/fs/btrfs/props.c > +++ b/fs/btrfs/props.c > @@ -310,6 +310,40 @@ static const char > *prop_compression_extract(struct inode *inode) > return NULL; > } > > +static int prop_autodefrag_validate(const char *value, size_t len) > +{ > + if (!value) > + return 0; > + if (!strncmp("true", value, 4)) > + return 0; > + return -EINVAL; > +} > + > +static int prop_autodefrag_apply(struct inode *inode, const char > *value, > + size_t len) > +{ > + if (len == 0) { > + clear_bit(BTRFS_INODE_AUTODEFRAG, > + &BTRFS_I(inode)->runtime_flags); > + return 0; > + } > + > + if (!strncmp("true", value, 4)) { > + set_bit(BTRFS_INODE_AUTODEFRAG, > + &BTRFS_I(inode)->runtime_flags); > + return 0; > + } > + > + return -EINVAL; > +} > + > +static const char *prop_autodefrag_extract(struct inode *inode) > +{ > + if (test_bit(BTRFS_INODE_AUTODEFRAG, &BTRFS_I(inode)- > >runtime_flags)) > + return "true"; > + return NULL; > +} > + > static struct prop_handler prop_handlers[] = { > { > .xattr_name = XATTR_BTRFS_PREFIX "compression", > @@ -318,6 +352,13 @@ static struct prop_handler prop_handlers[] = { > .extract = prop_compression_extract, > .inheritable = 1 > }, > + { > + .xattr_name = XATTR_BTRFS_PREFIX "autodefrag", > + .validate = prop_autodefrag_validate, > + .apply = prop_autodefrag_apply, > + .extract = prop_autodefrag_extract, > + .inheritable = 1 > + }, > }; > > static int inherit_props(struct btrfs_trans_handle *trans, - -- Igor Raits <igor.raits@gmail.com> -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEcwgJ58gsbV5f5dMcEV1auJxcHh4FAl8UXqgACgkQEV1auJxc Hh7QpxAAmC5pEaP1WTQQgfb+tS+pww2ln+Vk/3yTBRjOapyvisk7FIs4evPLlKFW kIGb2q+FsGTZ+f/8PWXEcXJSU2HSXYv3dbHh6gp6/ZFKMjZPAoxnErFTDx2ZT1af 0GcmKJdpV8RS24qQjs+cORnzOPF1HAyBSL6qHEfpdzkws1LmJXvPOxsToVV24h1D JKaCH3XdIRLf4JXuLyRup0ASpspA1oENHLr6+YcoQjN8IVTNjPaBdJcz+VkFarEi tIwWMJr3A6e1yFWkKldt3HBMtMR+QLWYga3b8d2tROZ2bAsU+JXeWtyUmBONx++r japV9wSquPEj7ht2QH58VsisU68tSFHqyyaGJ5a6CV6BsMkF5KhArK971F3Du3xl te8A5HVvSOnNLzHllhklMehDN2e0qsTjeZaU9DniCV1OkS5F8SNXftJ25YWJtrba mdchFezuXwcQzfC/G5WSZK+6YDNb/Gnind+rkV8fgtz7JiuVpluR3EMZMVD0mMN8 ahZH86xw+78VZeQ/4Q1FGjS+BrDVakn8cGKjSUjJAFF+owx1aXIPByObIKAKd6rf lEmsaPWj0roM81PU+93BxN16Ti8s9wzEM1eoBEmhVrFCKsKt7iSJ+ZJyLRpPIRCZ qAexm0iOKCxY8EMa0A7XbCxcvDSw6NXjwgbkNqFQRd4286uDdsk= =kcj9 -----END PGP SIGNATURE-----
diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index e7d709505cb1..4f04f0535f90 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -31,6 +31,7 @@ enum { BTRFS_INODE_READDIO_NEED_LOCK, BTRFS_INODE_HAS_PROPS, BTRFS_INODE_SNAPSHOT_FLUSH, + BTRFS_INODE_AUTODEFRAG, }; /* in memory btrfs inode */ diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 841c516079a9..cac2092bdcdf 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -116,12 +116,16 @@ static int __btrfs_add_inode_defrag(struct btrfs_inode *inode, return 0; } -static inline int __need_auto_defrag(struct btrfs_fs_info *fs_info) +static inline int __need_auto_defrag(struct btrfs_fs_info *fs_info, + struct btrfs_inode *inode) { - if (!btrfs_test_opt(fs_info, AUTO_DEFRAG)) + if (btrfs_fs_closing(fs_info)) return 0; - if (btrfs_fs_closing(fs_info)) + if (inode && test_bit(BTRFS_INODE_AUTODEFRAG, &inode->runtime_flags)) + return 1; + + if (!btrfs_test_opt(fs_info, AUTO_DEFRAG)) return 0; return 1; @@ -140,7 +144,7 @@ int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans, u64 transid; int ret; - if (!__need_auto_defrag(fs_info)) + if (!__need_auto_defrag(fs_info, inode)) return 0; if (test_bit(BTRFS_INODE_IN_DEFRAG, &inode->runtime_flags)) @@ -187,7 +191,7 @@ static void btrfs_requeue_inode_defrag(struct btrfs_inode *inode, struct btrfs_fs_info *fs_info = inode->root->fs_info; int ret; - if (!__need_auto_defrag(fs_info)) + if (!__need_auto_defrag(fs_info, inode)) goto out; /* @@ -348,7 +352,7 @@ int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info) &fs_info->fs_state)) break; - if (!__need_auto_defrag(fs_info)) + if (btrfs_fs_closing(fs_info)) break; /* find an inode to defrag */ diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c index 2dcb1cb21634..5c6411c8b19f 100644 --- a/fs/btrfs/props.c +++ b/fs/btrfs/props.c @@ -310,6 +310,40 @@ static const char *prop_compression_extract(struct inode *inode) return NULL; } +static int prop_autodefrag_validate(const char *value, size_t len) +{ + if (!value) + return 0; + if (!strncmp("true", value, 4)) + return 0; + return -EINVAL; +} + +static int prop_autodefrag_apply(struct inode *inode, const char *value, + size_t len) +{ + if (len == 0) { + clear_bit(BTRFS_INODE_AUTODEFRAG, + &BTRFS_I(inode)->runtime_flags); + return 0; + } + + if (!strncmp("true", value, 4)) { + set_bit(BTRFS_INODE_AUTODEFRAG, + &BTRFS_I(inode)->runtime_flags); + return 0; + } + + return -EINVAL; +} + +static const char *prop_autodefrag_extract(struct inode *inode) +{ + if (test_bit(BTRFS_INODE_AUTODEFRAG, &BTRFS_I(inode)->runtime_flags)) + return "true"; + return NULL; +} + static struct prop_handler prop_handlers[] = { { .xattr_name = XATTR_BTRFS_PREFIX "compression", @@ -318,6 +352,13 @@ static struct prop_handler prop_handlers[] = { .extract = prop_compression_extract, .inheritable = 1 }, + { + .xattr_name = XATTR_BTRFS_PREFIX "autodefrag", + .validate = prop_autodefrag_validate, + .apply = prop_autodefrag_apply, + .extract = prop_autodefrag_extract, + .inheritable = 1 + }, }; static int inherit_props(struct btrfs_trans_handle *trans,
Autodefrag is very useful for somethings, like the 9000 sqllite files that Firefox uses, but is way less useful for virt images. Unfortunately this is only available currently as a whole mount option. Fix this by adding an "autodefrag" property, that users can set on a per file or per directory basis. Thus allowing them to control where exactly the extra write activity is going to occur. Signed-off-by: Josef Bacik <josef@toxicpanda.com> --- - This is an RFC because I want to make sure we're ok with this before I go and add btrfs-progs support for this. I'm not married to the name or the value, but I think the core goal is valuable. fs/btrfs/btrfs_inode.h | 1 + fs/btrfs/file.c | 16 ++++++++++------ fs/btrfs/props.c | 41 +++++++++++++++++++++++++++++++++++++++++ 3 files changed, 52 insertions(+), 6 deletions(-)