Message ID | 20250320041749.881-3-rakie.kim@sk.com |
---|---|
State | Superseded |
Series | Enhance sysfs handling for memory hotplug in weighted interleave |
On Thu, Mar 20, 2025 at 01:17:47PM +0900, Rakie Kim wrote:
> Previously, the weighted interleave sysfs structure was statically
> managed, preventing dynamic updates when nodes were added or removed.
>
> This patch restructures the weighted interleave sysfs to support
> dynamic insertion and deletion. The sysfs that was part of
> the 'weighted_interleave_group' is now globally accessible,
> allowing external access to that sysfs.
>
> With this change, sysfs management for weighted interleave is
> more flexible, supporting hotplug events and runtime updates
> more effectively.
>
> Signed-off-by: Rakie Kim <rakie.kim@sk.com>

Reviewed-by: Gregory Price <gourry@gourry.net>

1 nit

> ---
>  mm/mempolicy.c | 70 ++++++++++++++++++++++----------------------------
>  1 file changed, 30 insertions(+), 40 deletions(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 5950d5d5b85e..6c8843114afd 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -3388,6 +3388,13 @@ struct iw_node_attr {
>      int nid;
>  };
>
> +struct sysfs_wi_group {
> +    struct kobject wi_kobj;
> +    struct iw_node_attr *nattrs[];
> +};
> +
> +static struct sysfs_wi_group *sgrp;
> +

sgrp -> wi_group? Or something similar, sgrp is not very descriptive
for a global.

~Gregory
On Fri, 21 Mar 2025 10:09:12 -0400 Gregory Price <gourry@gourry.net> wrote:
> On Thu, Mar 20, 2025 at 01:17:47PM +0900, Rakie Kim wrote:
> > Previously, the weighted interleave sysfs structure was statically
> > managed, preventing dynamic updates when nodes were added or removed.
> >
> > This patch restructures the weighted interleave sysfs to support
> > dynamic insertion and deletion. The sysfs that was part of
> > the 'weighted_interleave_group' is now globally accessible,
> > allowing external access to that sysfs.
> >
> > With this change, sysfs management for weighted interleave is
> > more flexible, supporting hotplug events and runtime updates
> > more effectively.
> >
> > Signed-off-by: Rakie Kim <rakie.kim@sk.com>
>
> Reviewed-by: Gregory Price <gourry@gourry.net>
>
> 1 nit
>
> > ---
> >  mm/mempolicy.c | 70 ++++++++++++++++++++++----------------------------
> >  1 file changed, 30 insertions(+), 40 deletions(-)
> >
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index 5950d5d5b85e..6c8843114afd 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -3388,6 +3388,13 @@ struct iw_node_attr {
> >      int nid;
> >  };
> >
> > +struct sysfs_wi_group {
> > +    struct kobject wi_kobj;
> > +    struct iw_node_attr *nattrs[];
> > +};
> > +
> > +static struct sysfs_wi_group *sgrp;
> > +
>
> sgrp -> wi_group? Or something similar, sgrp is not very descriptive
> for a global.
>
> ~Gregory

Yes, I agree. `wi_group` is more descriptive than `sgrp`.
I will rename the structure to `wi_group` as suggested.

Rakie
Rakie Kim wrote:
> Previously, the weighted interleave sysfs structure was statically
> managed, preventing dynamic updates when nodes were added or removed.
>
> This patch restructures the weighted interleave sysfs to support
> dynamic insertion and deletion. The sysfs that was part of
> the 'weighted_interleave_group' is now globally accessible,
> allowing external access to that sysfs.
>
> With this change, sysfs management for weighted interleave is
> more flexible, supporting hotplug events and runtime updates
> more effectively.

I understand the urge to try to make a general case for a patch, but it
is better to state the explicit reason, especially when someone is later
reading the history and may not realize that this is part of a series.

So instead of making claims like "this is more flexible / more effective
for runtime updates", state that motivation explicitly. Something like:

"In preparation for enabling weighted-interleave sysfs attributes to
react to node-online/offline events, introduce sysfs_wi_node_add() and
sysfs_wi_node_delete() helpers to dynamically manage the
weighted-interleave attributes. A follow-on patch registers a
memory-hotplug notifier to use these helpers; for now, just refactor the
current "publish all possible nodes" approach to use
sysfs_wi_node_{add,delete}()."

>
> Signed-off-by: Rakie Kim <rakie.kim@sk.com>
> ---
>  mm/mempolicy.c | 70 ++++++++++++++++++++++----------------------------
>  1 file changed, 30 insertions(+), 40 deletions(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 5950d5d5b85e..6c8843114afd 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -3388,6 +3388,13 @@ struct iw_node_attr {
>      int nid;
>  };
>
> +struct sysfs_wi_group {
> +    struct kobject wi_kobj;
> +    struct iw_node_attr *nattrs[];
> +};
> +
> +static struct sysfs_wi_group *sgrp;
> +
>  static ssize_t node_show(struct kobject *kobj, struct kobj_attribute *attr,
>               char *buf)
>  {
> @@ -3430,27 +3437,23 @@ static ssize_t node_store(struct kobject *kobj, struct kobj_attribute *attr,
>      return count;
>  }
>
> -static struct iw_node_attr **node_attrs;
> -
> -static void sysfs_wi_node_release(struct iw_node_attr *node_attr,
> -                  struct kobject *parent)
> +static void sysfs_wi_node_release(int nid)

I called this sysfs_wi_node_delete() above because _release() is
typically the callback invoked on the last put of a kobject.

>  {
> -    if (!node_attr)
> +    if (!sgrp->nattrs[nid])
>          return;
> -    sysfs_remove_file(parent, &node_attr->kobj_attr.attr);
> -    kfree(node_attr->kobj_attr.attr.name);
> -    kfree(node_attr);
> +
> +    sysfs_remove_file(&sgrp->wi_kobj, &sgrp->nattrs[nid]->kobj_attr.attr);
> +    kfree(sgrp->nattrs[nid]->kobj_attr.attr.name);
> +    kfree(sgrp->nattrs[nid]);
>  }
>
>  static void sysfs_wi_release(struct kobject *wi_kobj)
>  {
> -    int i;
> -
> -    for (i = 0; i < nr_node_ids; i++)
> -        sysfs_wi_node_release(node_attrs[i], wi_kobj);
> +    int nid;
>
> -    kfree(node_attrs);
> -    kfree(wi_kobj);
> +    for (nid = 0; nid < nr_node_ids; nid++)
> +        sysfs_wi_node_release(nid);
> +    kfree(sgrp);

This looks broken. Are you sure that a kobject with a zero reference
count can still host child attributes?

The teardown flow I would expect is:

    sysfs_remove_file(node_attrs[i],
    kobject_del(wi_kobj)
    ...that does final kobject_put()...
        kfree(container_of(wi_kobj))

However, now I do not think patch1 is actually fixing anything because
there is never a kobject_del() of the mempolicy_kobj. Just like there is
never a kobject_del() of the mm_kobj.

So patch1 seems to potentially be addressing a bug introduced by this
dynamic work, which is caused by the original code being confused about
the kobject shutdown path.

The original problems are that sysfs_wi_release() has a kobject_put(),
which, yes, is broken, but equally problematic is that there is no
kobject_del() in sight for either of these kobjects, even with the new
changes.

mempolicy_kobj_release() seems to confuse the activities that I would
expect to be near a kobject_del() call with the minimal kfree() on final
put.

>  }
>
>  static const struct kobj_type wi_ktype = {
> @@ -3458,7 +3461,7 @@ static const struct kobj_type wi_ktype = {
>      .release = sysfs_wi_release,
>  };
>
> -static int add_weight_node(int nid, struct kobject *wi_kobj)
> +static int sysfs_wi_node_add(int nid)
>  {
>      struct iw_node_attr *node_attr;
>      char *name;
> @@ -3480,57 +3483,44 @@ static int add_weight_node(int nid, struct kobject *wi_kobj)
>      node_attr->kobj_attr.store = node_store;
>      node_attr->nid = nid;
>
> -    if (sysfs_create_file(wi_kobj, &node_attr->kobj_attr.attr)) {
> +    if (sysfs_create_file(&sgrp->wi_kobj, &node_attr->kobj_attr.attr)) {
>          kfree(node_attr->kobj_attr.attr.name);
>          kfree(node_attr);
>          pr_err("failed to add attribute to weighted_interleave\n");
>          return -ENOMEM;
>      }
>
> -    node_attrs[nid] = node_attr;
> +    sgrp->nattrs[nid] = node_attr;
>      return 0;
>  }
>
> -static int add_weighted_interleave_group(struct kobject *root_kobj)
> +static int add_weighted_interleave_group(struct kobject *mempolicy_kobj)
>  {
> -    struct kobject *wi_kobj;
>      int nid, err;
>
> -    node_attrs = kcalloc(nr_node_ids, sizeof(struct iw_node_attr *),
> -                 GFP_KERNEL);
> -    if (!node_attrs)
> +    sgrp = kzalloc(sizeof(struct sysfs_wi_group) + \
> +               nr_node_ids * sizeof(struct iw_node_attr *), \
> +               GFP_KERNEL);

The recommended way to allocate a struct with a flexible array is using
the struct_size() helper.

    kzalloc(struct_size(sgrp, nattrs, nr_node_ids), GFP_KERNEL)

...but overall I think the original code needs a cleanup and, to be
clear, I think there is no memory leak risk exposed to existing users
given the shutdown path is never invoked.
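
For readers unfamiliar with the struct_size() suggestion above, the idea can be sketched in userspace C. This is only an illustration under stated assumptions: `struct wi_group` here uses a plain `int` in place of `struct kobject`, and `wi_group_size()` and `wi_group_alloc()` are hypothetical names that do not appear in the patch. The sketch computes "header plus N trailing elements" with an overflow check and reports `SIZE_MAX` on overflow, which is also what the kernel helper is documented to return, so an oversized request fails instead of producing an undersized buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel types in the patch. */
struct iw_node_attr {
	int nid;
};

struct wi_group {
	int refcount;			/* placeholder for struct kobject */
	struct iw_node_attr *nattrs[];	/* flexible array, one slot per node */
};

/*
 * Bytes needed for a wi_group with 'count' trailing slots, or SIZE_MAX
 * if the computation would overflow.
 */
static size_t wi_group_size(size_t count)
{
	size_t elem = sizeof(struct iw_node_attr *);
	size_t base = sizeof(struct wi_group);

	if (count > (SIZE_MAX - base) / elem)
		return SIZE_MAX;
	return base + count * elem;
}

/* Zeroed allocation, in the spirit of kzalloc(struct_size(...), ...). */
static struct wi_group *wi_group_alloc(size_t count)
{
	size_t bytes = wi_group_size(count);

	if (bytes == SIZE_MAX)
		return NULL;
	return calloc(1, bytes);
}
```

A caller would use `wi_group_alloc(nr_nodes)` and treat a NULL return as an allocation failure. The open-coded `sizeof(...) + n * sizeof(...)` form in the patch computes the same value for sane inputs but silently wraps on overflow.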
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 5950d5d5b85e..6c8843114afd 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -3388,6 +3388,13 @@ struct iw_node_attr {
     int nid;
 };

+struct sysfs_wi_group {
+    struct kobject wi_kobj;
+    struct iw_node_attr *nattrs[];
+};
+
+static struct sysfs_wi_group *sgrp;
+
 static ssize_t node_show(struct kobject *kobj, struct kobj_attribute *attr,
              char *buf)
 {
@@ -3430,27 +3437,23 @@ static ssize_t node_store(struct kobject *kobj, struct kobj_attribute *attr,
     return count;
 }

-static struct iw_node_attr **node_attrs;
-
-static void sysfs_wi_node_release(struct iw_node_attr *node_attr,
-                  struct kobject *parent)
+static void sysfs_wi_node_release(int nid)
 {
-    if (!node_attr)
+    if (!sgrp->nattrs[nid])
         return;
-    sysfs_remove_file(parent, &node_attr->kobj_attr.attr);
-    kfree(node_attr->kobj_attr.attr.name);
-    kfree(node_attr);
+
+    sysfs_remove_file(&sgrp->wi_kobj, &sgrp->nattrs[nid]->kobj_attr.attr);
+    kfree(sgrp->nattrs[nid]->kobj_attr.attr.name);
+    kfree(sgrp->nattrs[nid]);
 }

 static void sysfs_wi_release(struct kobject *wi_kobj)
 {
-    int i;
-
-    for (i = 0; i < nr_node_ids; i++)
-        sysfs_wi_node_release(node_attrs[i], wi_kobj);
+    int nid;

-    kfree(node_attrs);
-    kfree(wi_kobj);
+    for (nid = 0; nid < nr_node_ids; nid++)
+        sysfs_wi_node_release(nid);
+    kfree(sgrp);
 }

 static const struct kobj_type wi_ktype = {
@@ -3458,7 +3461,7 @@ static const struct kobj_type wi_ktype = {
     .release = sysfs_wi_release,
 };

-static int add_weight_node(int nid, struct kobject *wi_kobj)
+static int sysfs_wi_node_add(int nid)
 {
     struct iw_node_attr *node_attr;
     char *name;
@@ -3480,57 +3483,44 @@ static int add_weight_node(int nid, struct kobject *wi_kobj)
     node_attr->kobj_attr.store = node_store;
     node_attr->nid = nid;

-    if (sysfs_create_file(wi_kobj, &node_attr->kobj_attr.attr)) {
+    if (sysfs_create_file(&sgrp->wi_kobj, &node_attr->kobj_attr.attr)) {
         kfree(node_attr->kobj_attr.attr.name);
         kfree(node_attr);
         pr_err("failed to add attribute to weighted_interleave\n");
         return -ENOMEM;
     }

-    node_attrs[nid] = node_attr;
+    sgrp->nattrs[nid] = node_attr;
     return 0;
 }

-static int add_weighted_interleave_group(struct kobject *root_kobj)
+static int add_weighted_interleave_group(struct kobject *mempolicy_kobj)
 {
-    struct kobject *wi_kobj;
     int nid, err;

-    node_attrs = kcalloc(nr_node_ids, sizeof(struct iw_node_attr *),
-                 GFP_KERNEL);
-    if (!node_attrs)
+    sgrp = kzalloc(sizeof(struct sysfs_wi_group) + \
+               nr_node_ids * sizeof(struct iw_node_attr *), \
+               GFP_KERNEL);
+    if (!sgrp)
         return -ENOMEM;

-    wi_kobj = kzalloc(sizeof(struct kobject), GFP_KERNEL);
-    if (!wi_kobj) {
-        err = -ENOMEM;
-        goto node_out;
-    }
-
-    err = kobject_init_and_add(wi_kobj, &wi_ktype, root_kobj,
+    err = kobject_init_and_add(&sgrp->wi_kobj, &wi_ktype, mempolicy_kobj,
                    "weighted_interleave");
-    if (err) {
-        kobject_put(wi_kobj);
+    if (err)
         goto err_out;
-    }

     for_each_node_state(nid, N_POSSIBLE) {
-        err = add_weight_node(nid, wi_kobj);
+        err = sysfs_wi_node_add(nid);
         if (err) {
             pr_err("failed to add sysfs [node%d]\n", nid);
-            break;
+            goto err_out;
         }
     }
-    if (err) {
-        kobject_put(wi_kobj);
-        goto err_out;
-    }

     return 0;

-node_out:
-    kfree(node_attrs);
 err_out:
+    kobject_put(&sgrp->wi_kobj);
     return err;
 }
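
The review comments on sysfs_wi_release() describe a teardown order in which attributes are removed and the object is unlinked before the final reference is dropped, leaving the release callback with nothing to do but free the containing structure. The userspace sketch below models that ordering only. It is not kernel code: `struct mock_kobj`, `mock_kobj_get()`, `mock_kobj_put()`, `wi_group_create()` and `wi_group_destroy()` are hypothetical names invented for this illustration, and the `linked` and `nr_files` fields merely stand in for the state that kobject_del() and sysfs_remove_file() would change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily simplified stand-in for struct kobject. */
struct mock_kobj {
	int refcount;
	int linked;	/* 1 while visible in the (imaginary) hierarchy */
	int nr_files;	/* attribute files still published */
	void (*release)(struct mock_kobj *kobj);
};

struct wi_group {
	struct mock_kobj wi_kobj;
};

static int wi_release_calls;	/* lets callers observe the final put */

static void mock_kobj_get(struct mock_kobj *kobj)
{
	kobj->refcount++;
}

static void mock_kobj_put(struct mock_kobj *kobj)
{
	assert(kobj->refcount > 0);
	if (--kobj->refcount == 0)
		kobj->release(kobj);
}

/* Release callback: runs on the last put and only frees memory. */
static void wi_release(struct mock_kobj *kobj)
{
	assert(kobj->nr_files == 0 && !kobj->linked);
	wi_release_calls++;
	free(container_of(kobj, struct wi_group, wi_kobj));
}

static struct wi_group *wi_group_create(int nr_nodes)
{
	struct wi_group *grp = calloc(1, sizeof(*grp));

	if (!grp)
		return NULL;
	grp->wi_kobj.refcount = 1;
	grp->wi_kobj.linked = 1;
	grp->wi_kobj.nr_files = nr_nodes;
	grp->wi_kobj.release = wi_release;
	return grp;
}

/* Explicit teardown: unpublish first, drop the reference last. */
static void wi_group_destroy(struct wi_group *grp)
{
	grp->wi_kobj.nr_files = 0;	/* analog of sysfs_remove_file() per node */
	grp->wi_kobj.linked = 0;	/* analog of kobject_del() */
	mock_kobj_put(&grp->wi_kobj);	/* may or may not be the final put */
}
```

In this model, if another holder took a reference with `mock_kobj_get()` before `wi_group_destroy()` ran, the memory stays valid until that holder calls `mock_kobj_put()`, while the attributes are already gone. Keeping file removal out of the release callback is what makes that safe here.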
Previously, the weighted interleave sysfs structure was statically
managed, preventing dynamic updates when nodes were added or removed.

This patch restructures the weighted interleave sysfs to support
dynamic insertion and deletion. The sysfs that was part of
the 'weighted_interleave_group' is now globally accessible,
allowing external access to that sysfs.

With this change, sysfs management for weighted interleave is
more flexible, supporting hotplug events and runtime updates
more effectively.

Signed-off-by: Rakie Kim <rakie.kim@sk.com>
---
 mm/mempolicy.c | 70 ++++++++++++++++++++++----------------------------
 1 file changed, 30 insertions(+), 40 deletions(-)