Message ID | 20200628050715.60961-3-wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: add sysfs interface for qgroup | expand |
On Sun, Jun 28, 2020 at 01:07:15PM +0800, Qu Wenruo wrote: > +QGROUP_ATTR(rfer, reference); Note that this is 'referenced'. > +QGROUP_ATTR(excl, exclusive); > +QGROUP_ATTR(max_rfer, max_reference); And here max_referenced. > +QGROUP_ATTR(max_excl, max_exclusive); > +QGROUP_ATTR(lim_flags, limit_flags); > +QGROUP_RSV_ATTR(data, BTRFS_QGROUP_RSV_DATA); > +QGROUP_RSV_ATTR(meta_pertrans, BTRFS_QGROUP_RSV_META_PERTRANS); > +QGROUP_RSV_ATTR(meta_prealloc, BTRFS_QGROUP_RSV_META_PREALLOC); The two above fixed but otherwise it's good, thanks. The qgroup membership and relations could be added to the sysfs export too, but we're limited by the PAGE_SIZE output buffer so the information could be incomplete.
On 2020/6/30 上午5:30, David Sterba wrote: > On Sun, Jun 28, 2020 at 01:07:15PM +0800, Qu Wenruo wrote: >> +QGROUP_ATTR(rfer, reference); > > Note that this is 'referenced'. > >> +QGROUP_ATTR(excl, exclusive); >> +QGROUP_ATTR(max_rfer, max_reference); > > And here max_referenced. > >> +QGROUP_ATTR(max_excl, max_exclusive); >> +QGROUP_ATTR(lim_flags, limit_flags); >> +QGROUP_RSV_ATTR(data, BTRFS_QGROUP_RSV_DATA); >> +QGROUP_RSV_ATTR(meta_pertrans, BTRFS_QGROUP_RSV_META_PERTRANS); >> +QGROUP_RSV_ATTR(meta_prealloc, BTRFS_QGROUP_RSV_META_PREALLOC); > > The two above fixed but otherwise it's good, thanks. > > The qgroup membership and relations could be added to the sysfs export > too, but we're limited by the PAGE_SIZE output buffer so the information > could be incomplete. > Yep, PAGE_SIZE is one limitation. But we can also go another direction, just using a new dir for related qgroups, then we can workaround the limitation. But personally speaking, the main objective is still the rsv_* members for debug. I don't really want to turn the sysfs interface into a way to export all qgroup info yet. But if this inspired us to explore more usage of sysfs other than adding new ioctls to get information from btrfs, then I can't be more happier. Thanks, Qu
On Tue, Jun 30, 2020 at 07:17:25AM +0800, Qu Wenruo wrote: > On 2020/6/30 上午5:30, David Sterba wrote: > > On Sun, Jun 28, 2020 at 01:07:15PM +0800, Qu Wenruo wrote: > >> +QGROUP_ATTR(rfer, reference); > > > > Note that this is 'referenced'. > > > >> +QGROUP_ATTR(excl, exclusive); > >> +QGROUP_ATTR(max_rfer, max_reference); > > > > And here max_referenced. > > > >> +QGROUP_ATTR(max_excl, max_exclusive); > >> +QGROUP_ATTR(lim_flags, limit_flags); > >> +QGROUP_RSV_ATTR(data, BTRFS_QGROUP_RSV_DATA); > >> +QGROUP_RSV_ATTR(meta_pertrans, BTRFS_QGROUP_RSV_META_PERTRANS); > >> +QGROUP_RSV_ATTR(meta_prealloc, BTRFS_QGROUP_RSV_META_PREALLOC); > > > > The two above fixed but otherwise it's good, thanks. > > > > The qgroup membership and relations could be added to the sysfs export > > too, but we're limited by the PAGE_SIZE output buffer so the information > > could be incomplete. > > > Yep, PAGE_SIZE is one limitation. > But we can also go another direction, just using a new dir for related > qgroups, then we can workaround the limitation. > > But personally speaking, the main objective is still the rsv_* members > for debug. > I don't really want to turn the sysfs interface into a way to export all > qgroup info yet. Well, that's a bit of a problem, if you're designing a public interface for your own needs and neglecting that it will be used as a way to read qgroup information. Sooner or later someone will ask for the additional information to be added. If it's just for debugging then it should be under 'debug', but as the bug reporter does not want to apply patches/change config you ask to make it visible unconditionally. I'm ok with that as long as the interface is done right because any mistake will be potentially much harder to fix in the future. We need to at lest have a solid idea how to extend it, not neccessarily to implement it right now. Adding the group membership to qgroup directories under the qgroup directory as symlinks between the kobjects sounds ok-ish to me but it's a new idea and I need to think about it.
On Tue, Jun 30, 2020 at 10:07:35AM +0200, David Sterba wrote: > We need to at lest have a solid idea how to extend it, not neccessarily > to implement it right now. Adding the group membership to qgroup > directories under the qgroup directory as symlinks between the kobjects > sounds ok-ish to me but it's a new idea and I need to think about it. So my current winner idea for the hierarchy representation is to keep all child qgroups as symlinks in the parent group. This does not require extra kojbect to keep, and the path in sysfs would reflect the hierarchy, like: /sys/fs/btrfs/UUID/qgroups/4_1/3_1/2_1/1234_0 This tracks only the parent -> child pointers, so to see all parent qgroups one has to build the reverse graph or simply search which 1st level directories contain the group as child. Implementation is one more sysfs_create_link, the leaf qgroups representing subvolumes will contain only the files with stats. One potential drawback is the maximum size of path that will not allow to show deep qgroup hierarchies, but still about 200-400 in case of average qgroup id 20 or 10 respectively.
On Sun, Jun 28, 2020 at 01:07:15PM +0800, Qu Wenruo wrote: > @@ -1030,6 +1040,11 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) > btrfs_abort_transaction(trans, ret); > goto out_free_path; > } > + ret = btrfs_sysfs_add_one_qgroup(fs_info, qgroup); > + if (ret < 0) { > + btrfs_abort_transaction(trans, ret); > + goto out_free_path; > + } > } > ret = btrfs_next_item(tree_root, path); > if (ret < 0) { > @@ -1054,6 +1069,11 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) > btrfs_abort_transaction(trans, ret); > goto out_free_path; > } > + ret = btrfs_sysfs_add_one_qgroup(fs_info, qgroup); > + if (ret < 0) { > + btrfs_abort_transaction(trans, ret); > + goto out_free_path; > + } This adds 2 new transaction abort sites altough I don't think it's justified, the filesystem is fine. If system is that low on memory it's gonna be very bad elsewhere too so we don't need to make things worse jsut because of some missing sysfs entries. A warning would be better, though in that case the validity of the kobjects should be double checked where it's accessed.
On 2020/7/1 上午12:57, David Sterba wrote: > On Sun, Jun 28, 2020 at 01:07:15PM +0800, Qu Wenruo wrote: >> @@ -1030,6 +1040,11 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) >> btrfs_abort_transaction(trans, ret); >> goto out_free_path; >> } >> + ret = btrfs_sysfs_add_one_qgroup(fs_info, qgroup); >> + if (ret < 0) { >> + btrfs_abort_transaction(trans, ret); >> + goto out_free_path; >> + } >> } >> ret = btrfs_next_item(tree_root, path); >> if (ret < 0) { >> @@ -1054,6 +1069,11 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) >> btrfs_abort_transaction(trans, ret); >> goto out_free_path; >> } >> + ret = btrfs_sysfs_add_one_qgroup(fs_info, qgroup); >> + if (ret < 0) { >> + btrfs_abort_transaction(trans, ret); >> + goto out_free_path; >> + } > > This adds 2 new transaction abort sites altough I don't think it's > justified, the filesystem is fine. If system is that low on memory it's > gonna be very bad elsewhere too so we don't need to make things worse > jsut because of some missing sysfs entries. The problem here is, we don't have good enough way to revert back to previous status. This is common among a lot of qgroup code, and I prefer to fix them later, as it will be a big qgroup error patch cleanup. > > A warning would be better, though in that case the validity of the > kobjects should be double checked where it's accessed. > It would be even worse if the qgroup relationship is also exported through sysfs. In that case, warning is not good enough. So I still prefer error path cleanup as the ultimate fix. The objective is, if we hit any error during qgroup enabling or other qgroup operations, we revert to previous status if possible. For qgroup enable, if we hit any non-critical error, we don't abort trans at all, but remove all qgroups along with its qgroup items, remove the qgroup tree, then reverts back to qgroup disabled case. This includes -ENOMEM case. While for critical error like tree operations errors, we still abort transaction. In fact, I'm already working on a similar project, but for extent_changeset. So I guess it won't take too long for qgroup. Thanks, Qu
Hi Wenruo, While testing my pending patches on top of linux-next, I encountered a bug that seems related to this patch during btrfs unmount. Specifically, a null pointer dereference in kobject_del inside btrfs_sysfs_del_qgroups from close_ctree. The fix may be as simple as checking if the kobject is initialised, although perhaps it should always be initialised in this case, so I'll leave you to work out what the real issue is :-) RIP: kobject_del+0x1/0x20 [...] Call Trace: btrfs_sysfs_del_qgroups+0xa5/0xe0 close_ctree+0x1cd/0x2c0 generic_shutdown_super+0x6c/0x100 kill_anon_super+0x14/0x30 btrfs_kill_super+0x12/0x20 deactivate_locked_super+0x36/0x90 cleanup_mnt+0x12d/0x190 task_work_run+0x5c/0x90 __prepare_exit_to_usermode+0x164/0x170 [...] Thanks, Chris
On 2020/7/15 下午9:49, Chris Down wrote: > Hi Wenruo, > > While testing my pending patches on top of linux-next, I encountered a > bug that seems related to this patch during btrfs unmount. Specifically, > a null pointer dereference in kobject_del inside btrfs_sysfs_del_qgroups > from close_ctree. > > The fix may be as simple as checking if the kobject is initialised, > although perhaps it should always be initialised in this case, so I'll > leave you to work out what the real issue is :-) Thank you very much for the report. May I ask if the qgroup is enabled? Or qgroup is not enabled at all? Thanks, Qu > > > RIP: kobject_del+0x1/0x20 > > [...] > > Call Trace: > btrfs_sysfs_del_qgroups+0xa5/0xe0 > close_ctree+0x1cd/0x2c0 > generic_shutdown_super+0x6c/0x100 > kill_anon_super+0x14/0x30 > btrfs_kill_super+0x12/0x20 > deactivate_locked_super+0x36/0x90 > cleanup_mnt+0x12d/0x190 > task_work_run+0x5c/0x90 > __prepare_exit_to_usermode+0x164/0x170 > [...] > > Thanks, > > Chris
Qu Wenruo writes: >On 2020/7/15 下午9:49, Chris Down wrote: >> While testing my pending patches on top of linux-next, I encountered a >> bug that seems related to this patch during btrfs unmount. Specifically, >> a null pointer dereference in kobject_del inside btrfs_sysfs_del_qgroups >> from close_ctree. >> >> The fix may be as simple as checking if the kobject is initialised, >> although perhaps it should always be initialised in this case, so I'll >> leave you to work out what the real issue is :-) > >Thank you very much for the report. > >May I ask if the qgroup is enabled? Or qgroup is not enabled at all? I have quotas disabled, so I assume qgroup is also implicitly disabled.
On 2020/7/16 上午8:15, Qu Wenruo wrote: > > > On 2020/7/15 下午9:49, Chris Down wrote: >> Hi Wenruo, >> >> While testing my pending patches on top of linux-next, I encountered a >> bug that seems related to this patch during btrfs unmount. Specifically, >> a null pointer dereference in kobject_del inside btrfs_sysfs_del_qgroups >> from close_ctree. >> >> The fix may be as simple as checking if the kobject is initialised, >> although perhaps it should always be initialised in this case, so I'll >> leave you to work out what the real issue is :-) > > Thank you very much for the report. > > May I ask if the qgroup is enabled? Or qgroup is not enabled at all? BTW, after checking the code, it looks a little strange to me. Firstly, both kobject_del and kobject_put() has extra check on NULL pointers, thus if fs_info->qgroups_kobj is NULL, it should do nothing and exit. Secondly, the fs_info->qgroup_kobj is initialized to zero, by kvzalloc() in btrfs_mount_root(). Thus unless we modified it manually, it should always be NULL. And for the locations modifying qgroups_kobj, it's either allocating it, in btrfs_sysfs_add_qgroups(), or removing it and set it back to NULL in btrfs_sysfs_del_qgroups(). Thus this looks pretty weird. Would you please provide the full call trace (especially the address causing the NULL pointer deref) and the reproducer (if possible)? Thanks, Qu > > Thanks, > Qu >> >> >> RIP: kobject_del+0x1/0x20 >> >> [...] >> >> Call Trace: >> btrfs_sysfs_del_qgroups+0xa5/0xe0 >> close_ctree+0x1cd/0x2c0 >> generic_shutdown_super+0x6c/0x100 >> kill_anon_super+0x14/0x30 >> btrfs_kill_super+0x12/0x20 >> deactivate_locked_super+0x36/0x90 >> cleanup_mnt+0x12d/0x190 >> task_work_run+0x5c/0x90 >> __prepare_exit_to_usermode+0x164/0x170 >> [...] >> >> Thanks, >> >> Chris >
On 2020/7/16 上午8:40, Chris Down wrote: > Qu Wenruo writes: >> Would you please provide the full call trace (especially the address >> causing the NULL pointer deref) and the reproducer (if possible)? > > I'll try to reproduce it again when I have time. I didn't have kdump > set, and am not currently running linux-next, so until then, you can > have this crappy photo that I rushed to take earlier before panic=30 > took effect :-) No problem, the image is already good enough for me to lookinto the case. And the extra clue of linux-next would also help a lot. Thank you very much again for the report, Qu
On 2020/7/16 上午8:40, Chris Down wrote: > Qu Wenruo writes: >> Would you please provide the full call trace (especially the address >> causing the NULL pointer deref) and the reproducer (if possible)? > > I'll try to reproduce it again when I have time. I didn't have kdump > set, and am not currently running linux-next, so until then, you can > have this crappy photo that I rushed to take earlier before panic=30 > took effect :-) The linux-next is using an older version of that patch, which has a bad memory allocation inside atomic critical section. If you're unlucky enough, it could cause various weird bugs. Thankfully, David has exposed it and the latest version in btrfs tree should have it fixed. If you're uncertain if that's the case, feel free to try the latest misc-next branch to see if it's fixed. https://github.com/kdave/btrfs-devel/tree/misc-next Thanks for your report, Qu
Qu Wenruo writes: > > >On 2020/7/16 上午8:40, Chris Down wrote: >> Qu Wenruo writes: >>> Would you please provide the full call trace (especially the address >>> causing the NULL pointer deref) and the reproducer (if possible)? >> >> I'll try to reproduce it again when I have time. I didn't have kdump >> set, and am not currently running linux-next, so until then, you can >> have this crappy photo that I rushed to take earlier before panic=30 >> took effect :-) > >The linux-next is using an older version of that patch, which has a bad >memory allocation inside atomic critical section. > >If you're unlucky enough, it could cause various weird bugs. >Thankfully, David has exposed it and the latest version in btrfs tree >should have it fixed. Great, thanks! When I run my tests again on linux-next I'll be sure to reapply the latest version of your patch first, then. Thanks, Chris
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index d8301bf240e0..9c0394893e8e 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -779,6 +779,7 @@ struct btrfs_fs_info { u32 thread_pool_size; struct kobject *space_info_kobj; + struct kobject *qgroups_kobj; u64 total_pinned; diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index 014d32429165..fc90ec0a86f1 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -22,6 +22,7 @@ #include "extent_io.h" #include "qgroup.h" #include "block-group.h" +#include "sysfs.h" /* TODO XXX FIXME * - subvol delete -> delete when ref goes to 0? delete limits also? @@ -220,10 +221,12 @@ static struct btrfs_qgroup *add_qgroup_rb(struct btrfs_fs_info *fs_info, return qgroup; } -static void __del_qgroup_rb(struct btrfs_qgroup *qgroup) +static void __del_qgroup_rb(struct btrfs_fs_info *fs_info, + struct btrfs_qgroup *qgroup) { struct btrfs_qgroup_list *list; + btrfs_sysfs_del_one_qgroup(fs_info, qgroup); list_del(&qgroup->dirty); while (!list_empty(&qgroup->groups)) { list = list_first_entry(&qgroup->groups, @@ -252,7 +255,7 @@ static int del_qgroup_rb(struct btrfs_fs_info *fs_info, u64 qgroupid) return -ENOENT; rb_erase(&qgroup->node, &fs_info->qgroup_tree); - __del_qgroup_rb(qgroup); + __del_qgroup_rb(fs_info, qgroup); return 0; } @@ -351,6 +354,9 @@ int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info) goto out; } + ret = btrfs_sysfs_add_qgroups(fs_info); + if (ret < 0) + goto out; /* default this to quota off, in case no status key is found */ fs_info->qgroup_flags = 0; @@ -412,6 +418,10 @@ int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info) goto out; } } + ret = btrfs_sysfs_add_one_qgroup(fs_info, qgroup); + if (ret < 0) + goto out; + switch (found_key.type) { case BTRFS_QGROUP_INFO_KEY: { struct btrfs_qgroup_info_item *ptr; @@ -500,16 +510,12 @@ int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info) ulist_free(fs_info->qgroup_ulist); fs_info->qgroup_ulist = NULL; fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN; + btrfs_sysfs_del_qgroups(fs_info); } return ret < 0 ? ret : 0; } -static u64 btrfs_qgroup_subvolid(u64 qgroupid) -{ - return (qgroupid & ((1ULL << BTRFS_QGROUP_LEVEL_SHIFT) - 1)); -} - /* * Called in close_ctree() when quota is still enabled. This verifies we don't * leak some reserved space. @@ -562,7 +568,7 @@ void btrfs_free_qgroup_config(struct btrfs_fs_info *fs_info) while ((n = rb_first(&fs_info->qgroup_tree))) { qgroup = rb_entry(n, struct btrfs_qgroup, node); rb_erase(n, &fs_info->qgroup_tree); - __del_qgroup_rb(qgroup); + __del_qgroup_rb(fs_info, qgroup); } /* * We call btrfs_free_qgroup_config() when unmounting @@ -571,6 +577,7 @@ void btrfs_free_qgroup_config(struct btrfs_fs_info *fs_info) */ ulist_free(fs_info->qgroup_ulist); fs_info->qgroup_ulist = NULL; + btrfs_sysfs_del_qgroups(fs_info); } static int add_qgroup_relation_item(struct btrfs_trans_handle *trans, u64 src, @@ -943,6 +950,9 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) goto out; } + ret = btrfs_sysfs_add_qgroups(fs_info); + if (ret < 0) + goto out; /* * 1 for quota root item * 1 for BTRFS_QGROUP_STATUS item @@ -1030,6 +1040,11 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) btrfs_abort_transaction(trans, ret); goto out_free_path; } + ret = btrfs_sysfs_add_one_qgroup(fs_info, qgroup); + if (ret < 0) { + btrfs_abort_transaction(trans, ret); + goto out_free_path; + } } ret = btrfs_next_item(tree_root, path); if (ret < 0) { @@ -1054,6 +1069,11 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) btrfs_abort_transaction(trans, ret); goto out_free_path; } + ret = btrfs_sysfs_add_one_qgroup(fs_info, qgroup); + if (ret < 0) { + btrfs_abort_transaction(trans, ret); + goto out_free_path; + } ret = btrfs_commit_transaction(trans); trans = NULL; @@ -1089,6 +1109,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) fs_info->qgroup_ulist = NULL; if (trans) btrfs_end_transaction(trans); + btrfs_sysfs_del_qgroups(fs_info); } mutex_unlock(&fs_info->qgroup_ioctl_lock); return ret; @@ -1441,8 +1462,11 @@ int btrfs_create_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid) qgroup = add_qgroup_rb(fs_info, qgroupid); spin_unlock(&fs_info->qgroup_lock); - if (IS_ERR(qgroup)) + if (IS_ERR(qgroup)) { ret = PTR_ERR(qgroup); + goto out; + } + ret = btrfs_sysfs_add_one_qgroup(fs_info, qgroup); out: mutex_unlock(&fs_info->qgroup_ioctl_lock); return ret; @@ -2861,6 +2885,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid, unlock: spin_unlock(&fs_info->qgroup_lock); + if (!ret) + ret = btrfs_sysfs_add_one_qgroup(fs_info, dstgroup); out: if (!committing) mutex_unlock(&fs_info->qgroup_ioctl_lock); diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h index 3be5198a3719..0116779d34e5 100644 --- a/fs/btrfs/qgroup.h +++ b/fs/btrfs/qgroup.h @@ -8,6 +8,7 @@ #include <linux/spinlock.h> #include <linux/rbtree.h> +#include <linux/kobject.h> #include "ulist.h" #include "delayed-ref.h" @@ -223,8 +224,18 @@ struct btrfs_qgroup { */ u64 old_refcnt; u64 new_refcnt; + + /* + * Sysfs kobjectid + */ + struct kobject kobj; }; +static inline u64 btrfs_qgroup_subvolid(u64 qgroupid) +{ + return (qgroupid & ((1ULL << BTRFS_QGROUP_LEVEL_SHIFT) - 1)); +} + /* * For qgroup event trace points only */ diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index a39bff64ff24..3d7a939601a9 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -19,6 +19,7 @@ #include "volumes.h" #include "space-info.h" #include "block-group.h" +#include "qgroup.h" struct btrfs_feature_attr { struct kobj_attribute kobj_attr; @@ -1455,6 +1456,156 @@ int btrfs_sysfs_add_mounted(struct btrfs_fs_info *fs_info) return error; } +static struct btrfs_fs_info *qgroup_kobj_to_fs_info(struct kobject *kobj) +{ + return to_fs_info(kobj->parent->parent); +} + +#define QGROUP_ATTR(_member, _show_name) \ +static ssize_t btrfs_qgroup_show_##_member(struct kobject *qgroup_kobj, \ + struct kobj_attribute *a, \ + char *buf) \ +{ \ + struct btrfs_fs_info *fs_info = qgroup_kobj_to_fs_info( \ + qgroup_kobj); \ + struct btrfs_qgroup *qgroup = container_of(qgroup_kobj, \ + struct btrfs_qgroup, kobj); \ + \ + return btrfs_show_u64(&qgroup->_member, &fs_info->qgroup_lock, \ + buf); \ +} \ +BTRFS_ATTR(qgroup, _show_name, btrfs_qgroup_show_##_member) + +#define QGROUP_RSV_ATTR(_name, _type) \ +static ssize_t btrfs_qgroup_rsv_show_##_name(struct kobject *qgroup_kobj, \ + struct kobj_attribute *a, \ + char *buf) \ +{ \ + struct btrfs_fs_info *fs_info = qgroup_kobj_to_fs_info( \ + qgroup_kobj); \ + struct btrfs_qgroup *qgroup = container_of(qgroup_kobj, \ + struct btrfs_qgroup, kobj); \ + \ + return btrfs_show_u64(&qgroup->rsv.values[_type], \ + &fs_info->qgroup_lock, buf); \ +} \ +BTRFS_ATTR(qgroup, rsv_##_name, btrfs_qgroup_rsv_show_##_name) + +QGROUP_ATTR(rfer, reference); +QGROUP_ATTR(excl, exclusive); +QGROUP_ATTR(max_rfer, max_reference); +QGROUP_ATTR(max_excl, max_exclusive); +QGROUP_ATTR(lim_flags, limit_flags); +QGROUP_RSV_ATTR(data, BTRFS_QGROUP_RSV_DATA); +QGROUP_RSV_ATTR(meta_pertrans, BTRFS_QGROUP_RSV_META_PERTRANS); +QGROUP_RSV_ATTR(meta_prealloc, BTRFS_QGROUP_RSV_META_PREALLOC); + +static struct attribute *qgroup_attrs[] = { + BTRFS_ATTR_PTR(qgroup, reference), + BTRFS_ATTR_PTR(qgroup, exclusive), + BTRFS_ATTR_PTR(qgroup, max_reference), + BTRFS_ATTR_PTR(qgroup, max_exclusive), + BTRFS_ATTR_PTR(qgroup, limit_flags), + BTRFS_ATTR_PTR(qgroup, rsv_data), + BTRFS_ATTR_PTR(qgroup, rsv_meta_pertrans), + BTRFS_ATTR_PTR(qgroup, rsv_meta_prealloc), + NULL +}; +ATTRIBUTE_GROUPS(qgroup); + +static void qgroup_release(struct kobject *kobj) +{ + struct btrfs_qgroup *qgroup = container_of(kobj, struct btrfs_qgroup, + kobj); + + memset(&qgroup->kobj, 0, sizeof(*kobj)); +} + +static struct kobj_type qgroup_ktype = { + .sysfs_ops = &kobj_sysfs_ops, + .release = qgroup_release, + .default_groups = qgroup_groups, +}; + +int btrfs_sysfs_add_one_qgroup(struct btrfs_fs_info *fs_info, + struct btrfs_qgroup *qgroup) +{ + struct kobject *qgroups_kobj = fs_info->qgroups_kobj; + int ret; + + if (test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state)) + return 0; + if (qgroup->kobj.state_initialized) + return 0; + if (!qgroups_kobj) + return -EINVAL; + + ret = kobject_init_and_add(&qgroup->kobj, &qgroup_ktype, qgroups_kobj, + "%hu_%llu", btrfs_qgroup_level(qgroup->qgroupid), + btrfs_qgroup_subvolid(qgroup->qgroupid)); + if (ret < 0) + kobject_put(&qgroup->kobj); + + return ret; +} + +void btrfs_sysfs_del_qgroups(struct btrfs_fs_info *fs_info) +{ + struct btrfs_qgroup *qgroup; + struct btrfs_qgroup *next; + + if (test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state)) + return; + rbtree_postorder_for_each_entry_safe(qgroup, next, + &fs_info->qgroup_tree, node) + btrfs_sysfs_del_one_qgroup(fs_info, qgroup); + kobject_del(fs_info->qgroups_kobj); + kobject_put(fs_info->qgroups_kobj); + fs_info->qgroups_kobj = NULL; +} + +/* Called when qgroup get initialized, thus there is no need for extra lock. */ +int btrfs_sysfs_add_qgroups(struct btrfs_fs_info *fs_info) +{ + struct kobject *fsid_kobj = &fs_info->fs_devices->fsid_kobj; + struct btrfs_qgroup *qgroup; + struct btrfs_qgroup *next; + int ret = 0; + + if (test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state)) + return 0; + ASSERT(fsid_kobj); + if (fs_info->qgroups_kobj) + return 0; + + fs_info->qgroups_kobj = kobject_create_and_add("qgroups", fsid_kobj); + if (!fs_info->qgroups_kobj) { + ret = -ENOMEM; + goto out; + } + rbtree_postorder_for_each_entry_safe(qgroup, next, + &fs_info->qgroup_tree, node) { + ret = btrfs_sysfs_add_one_qgroup(fs_info, qgroup); + if (ret < 0) + goto out; + } + +out: + if (ret < 0) + btrfs_sysfs_del_qgroups(fs_info); + return ret; +} + +void btrfs_sysfs_del_one_qgroup(struct btrfs_fs_info *fs_info, + struct btrfs_qgroup *qgroup) +{ + if (test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state)) + return; + if (qgroup->kobj.state_initialized) { + kobject_del(&qgroup->kobj); + kobject_put(&qgroup->kobj); + } +} /* * Change per-fs features in /sys/fs/btrfs/UUID/features to match current diff --git a/fs/btrfs/sysfs.h b/fs/btrfs/sysfs.h index 718a26c97833..1e27a9c94c84 100644 --- a/fs/btrfs/sysfs.h +++ b/fs/btrfs/sysfs.h @@ -36,4 +36,10 @@ int btrfs_sysfs_add_space_info_type(struct btrfs_fs_info *fs_info, void btrfs_sysfs_remove_space_info(struct btrfs_space_info *space_info); void btrfs_sysfs_update_devid(struct btrfs_device *device); +int btrfs_sysfs_add_one_qgroup(struct btrfs_fs_info *fs_info, + struct btrfs_qgroup *qgroup); +void btrfs_sysfs_del_qgroups(struct btrfs_fs_info *fs_info); +int btrfs_sysfs_add_qgroups(struct btrfs_fs_info *fs_info); +void btrfs_sysfs_del_one_qgroup(struct btrfs_fs_info *fs_info, + struct btrfs_qgroup *qgroup); #endif
This patch will add the following sysfs interface: /sys/fs/btrfs/<UUID>/qgroups/<qgroup_id>/reference /sys/fs/btrfs/<UUID>/qgroups/<qgroup_id>/exclusive /sys/fs/btrfs/<UUID>/qgroups/<qgroup_id>/max_reference /sys/fs/btrfs/<UUID>/qgroups/<qgroup_id>/max_exclusive /sys/fs/btrfs/<UUID>/qgroups/<qgroup_id>/limit_flags ^^^ Above are already in "btrfs qgroup show" command output ^^^ /sys/fs/btrfs/<UUID>/qgroups/<qgroup_id>/rsv_data /sys/fs/btrfs/<UUID>/qgroups/<qgroup_id>/rsv_meta_pertrans /sys/fs/btrfs/<UUID>/qgroups/<qgroup_id>/rsv_meta_prealloc The last 3 rsv related members are not visible to users, but can be very useful to debug qgroup limit related bugs. Also, to avoid '/' used in <qgroup_id>, the separator between qgroup level and qgroup id is changed to '_'. The interface is not hidden behind 'debug' as I want this interface to be included into production build so we could have an easier life to debug qgroup rsv related bugs. Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/ctree.h | 1 + fs/btrfs/qgroup.c | 44 +++++++++++--- fs/btrfs/qgroup.h | 11 ++++ fs/btrfs/sysfs.c | 151 ++++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/sysfs.h | 6 ++ 5 files changed, 204 insertions(+), 9 deletions(-)