[v2,3/5] mm: demotion: Add support to set targets from userspace

Message ID: 20220413092206.73974-4-jvgediya@linux.ibm.com
State: New
Series: mm: demotion: Introduce new node state N_DEMOTION_TARGETS

Commit Message

Jagdish Gediya April 13, 2022, 9:22 a.m. UTC
Add support for setting node_states[N_DEMOTION_TARGETS] from
user space to override the default demotion targets.

When set from user space, demotion targets are restricted to
memory-only NUMA nodes.

Demotion targets can be set from user space with the command:
echo <nodelist> > /sys/kernel/mm/numa/demotion_target
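
For callers that prefer not to shell out to echo, the same write can be
sketched in C. This is only an illustration: write_nodelist() is a
hypothetical helper, and the path is the one proposed by this patch, so it
exists only on kernels that carry the series.

```c
#include <errno.h>
#include <stdio.h>

/* Path proposed by this patch; present only on kernels carrying the series. */
#define DEMOTION_TARGET_PATH "/sys/kernel/mm/numa/demotion_target"

/*
 * Hypothetical helper: write a nodelist such as "2-3" to a sysfs-style file.
 * Returns 0 on success or a negative errno value.
 */
static int write_nodelist(const char *path, const char *nodelist)
{
	FILE *f = fopen(path, "w");
	int ret = 0;

	if (!f)
		return -errno;

	if (fprintf(f, "%s\n", nodelist) < 0)
		ret = -EIO;

	/*
	 * stdio buffers the data, so a rejection from the store handler
	 * (for example -EINVAL) surfaces when fclose() flushes the buffer.
	 */
	if (fclose(f) != 0 && ret == 0)
		ret = -errno;

	return ret;
}
```

A caller would use write_nodelist(DEMOTION_TARGET_PATH, "2-3") and treat a
negative return value as the request having been rejected.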

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Jagdish Gediya <jvgediya@linux.ibm.com>
---
 .../ABI/testing/sysfs-kernel-mm-numa          | 12 +++++++
 mm/migrate.c                                  | 35 +++++++++++++++++++
 2 files changed, 47 insertions(+)

Comments

Wei Xu April 21, 2022, 4:26 a.m. UTC | #1
On Wed, Apr 13, 2022 at 2:22 AM Jagdish Gediya <jvgediya@linux.ibm.com> wrote:
>
> Add support to set node_states[N_DEMOTION_TARGETS] from
> user space to override the default demotion targets.
>
> Restrict demotion targets to memory only numa nodes
> while setting them from user space.

Why should we restrict demotion targets to memory only nodes if they
get set explicitly from user space? For example, if we use NUMA
emulation to test demotion without actual hardware, these emulated
NUMA nodes can have CPUs, but we still want some of them to serve as
demotion targets.

Wei Xu April 21, 2022, 5:31 a.m. UTC | #2
On Wed, Apr 13, 2022 at 2:22 AM Jagdish Gediya <jvgediya@linux.ibm.com> wrote:
>
> Add support to set node_states[N_DEMOTION_TARGETS] from
> user space to override the default demotion targets.
>
> Restrict demotion targets to memory only numa nodes
> while setting them from user space.
>
> Demotion targets can be set from userspace using command,
> echo <nodelist> > /sys/kernel/mm/numa/demotion_target
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Jagdish Gediya <jvgediya@linux.ibm.com>
> ---
>  .../ABI/testing/sysfs-kernel-mm-numa          | 12 +++++++
>  mm/migrate.c                                  | 35 +++++++++++++++++++
>  2 files changed, 47 insertions(+)
>
> diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-numa b/Documentation/ABI/testing/sysfs-kernel-mm-numa
> index 77e559d4ed80..10e9e643845c 100644
> --- a/Documentation/ABI/testing/sysfs-kernel-mm-numa
> +++ b/Documentation/ABI/testing/sysfs-kernel-mm-numa
> @@ -22,3 +22,15 @@ Description: Enable/disable demoting pages during reclaim
>                 the guarantees of cpusets.  This should not be enabled
>                 on systems which need strict cpuset location
>                 guarantees.
> +
> +What:          /sys/kernel/mm/numa/demotion_target

demotion_target -> demotion_targets?

Also, with the previous change, we already have
/sys/devices/system/node/has_demotion_targets (or demotion_targets as
I have suggested). Wouldn't it be simpler to make that sysfs file
writable instead of adding a parallel interface?
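
Whichever file ends up exposing the targets, the show handler in this patch
prints them with "%*pbl", that is, as a nodelist such as "0-1,3". A minimal
user-space sketch of reading that format back into a bitmask could look like
the following. parse_nodelist() is a hypothetical helper, limited to node ids
0..63 and to the "N" and "N-M" forms; the kernel's own list parser accepts
more syntax than this.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a nodelist such as "0-1,3" (optionally ending
 * in a newline) into a 64-bit mask with one bit per node id.
 * Returns 0 on success or -EINVAL on malformed input.
 */
static int parse_nodelist(const char *s, uint64_t *mask)
{
	uint64_t out = 0;

	while (*s && *s != '\n') {
		char *end;
		unsigned long lo, hi, n;

		if (*s < '0' || *s > '9')
			return -EINVAL;
		lo = strtoul(s, &end, 10);
		hi = lo;

		if (*end == '-') {		/* range "lo-hi" */
			s = end + 1;
			if (*s < '0' || *s > '9')
				return -EINVAL;
			hi = strtoul(s, &end, 10);
		}
		if (lo > hi || hi > 63)
			return -EINVAL;

		for (n = lo; n <= hi; n++)
			out |= UINT64_C(1) << n;

		if (*end == ',') {
			end++;
			/* A comma must be followed by another entry. */
			if (*end < '0' || *end > '9')
				return -EINVAL;
		} else if (*end != '\0' && *end != '\n') {
			return -EINVAL;
		}
		s = end;
	}

	*mask = out;
	return 0;
}
```

An empty line parses to an empty mask, which matches how an empty nodemask is
printed.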

Jagdish Gediya April 22, 2022, 9:13 a.m. UTC | #3
On Wed, Apr 20, 2022 at 09:26:34PM -0700, Wei Xu wrote:
> On Wed, Apr 13, 2022 at 2:22 AM Jagdish Gediya <jvgediya@linux.ibm.com> wrote:
> >
> > Add support to set node_states[N_DEMOTION_TARGETS] from
> > user space to override the default demotion targets.
> >
> > Restrict demotion targets to memory only numa nodes
> > while setting them from user space.
> 
> Why should we restrict demotion targets to memory only nodes if they
> get set explicitly from user space? For example, if we use NUMA
> emulation to test demotion without actual hardware, these emulated
> NUMA nodes can have CPUs, but we still want some of them to serve as
> demotion targets.

If a node has memory, it should be possible to use it as a demotion
target when the user sets it explicitly. I will correct this in the next
version if we settle on the same approach for overriding demotion targets,
as per the discussion on this series.
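
To make the difference concrete, here is a user-space model of the two
policies. It is a sketch only: the types and function names are hypothetical,
a 64-bit mask stands in for the kernel's nodemask_t, and the relaxed variant
is one possible reading of the change discussed above, not code from any
posted version.

```c
#include <stdbool.h>
#include <stdint.h>

/* User-space stand-in for nodemask_t: one bit per node id. */
typedef uint64_t nodemask_model_t;

struct node_states_model {
	nodemask_model_t memory;	/* models node_states[N_MEMORY] */
	nodemask_model_t cpu;		/* models node_states[N_CPU] */
};

/* Mirrors the two checks in the posted numa_demotion_target_store(). */
static bool targets_valid_v2(const struct node_states_model *s,
			     nodemask_model_t req)
{
	if (req & ~s->memory)		/* !nodes_subset(nodes, N_MEMORY) */
		return false;
	if (req & s->cpu)		/* nodes_intersects(nodes, N_CPU) */
		return false;
	return true;
}

/* Relaxed variant: any node that has memory is an acceptable target. */
static bool targets_valid_relaxed(const struct node_states_model *s,
				  nodemask_model_t req)
{
	return (req & ~s->memory) == 0;
}
```

With nodes 0-3 having memory and nodes 0-1 having CPUs, a request for node 1
is rejected by the v2 checks but accepted by the relaxed variant, which is
the NUMA emulation case raised in the review.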

Patch

diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-numa b/Documentation/ABI/testing/sysfs-kernel-mm-numa
index 77e559d4ed80..10e9e643845c 100644
--- a/Documentation/ABI/testing/sysfs-kernel-mm-numa
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-numa
@@ -22,3 +22,15 @@  Description:	Enable/disable demoting pages during reclaim
 		the guarantees of cpusets.  This should not be enabled
 		on systems which need strict cpuset location
 		guarantees.
+
+What:		/sys/kernel/mm/numa/demotion_target
+Date:		April 2022
+Contact:	Linux memory management mailing list <linux-mm@kvack.org>
+Description:	Configure demotion target nodes
+
+		Page migration during reclaim is intended for systems
+		with tiered memory configurations. This interface sets the
+		preferred migration target nodes, from which the kernel
+		builds its demotion table. If demotion is enabled, pages
+		are migrated to the configured demotion targets during
+		reclaim.
diff --git a/mm/migrate.c b/mm/migrate.c
index 516f4e1348c1..4d3d80ca0a7f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2564,12 +2564,47 @@  static ssize_t numa_demotion_enabled_store(struct kobject *kobj,
 	return count;
 }
 
+
+static ssize_t numa_demotion_target_show(struct kobject *kobj,
+					  struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%*pbl\n",
+			  nodemask_pr_args(&node_states[N_DEMOTION_TARGETS]));
+}
+
+static ssize_t numa_demotion_target_store(struct kobject *kobj,
+					  struct kobj_attribute *attr,
+					  const char *nodelist, size_t count)
+{
+	nodemask_t nodes;
+
+	if (nodelist_parse(nodelist, nodes))
+		return -EINVAL;
+
+	if (!nodes_subset(nodes, node_states[N_MEMORY]))
+		return -EINVAL;
+
+	if (nodes_intersects(nodes, node_states[N_CPU]))
+		return -EINVAL;
+
+	node_states[N_DEMOTION_TARGETS] = nodes;
+
+	set_migration_target_nodes();
+
+	return count;
+}
+
 static struct kobj_attribute numa_demotion_enabled_attr =
 	__ATTR(demotion_enabled, 0644, numa_demotion_enabled_show,
 	       numa_demotion_enabled_store);
 
+static struct kobj_attribute numa_demotion_target_attr =
+	__ATTR(demotion_target, 0644, numa_demotion_target_show,
+	       numa_demotion_target_store);
+
 static struct attribute *numa_attrs[] = {
 	&numa_demotion_enabled_attr.attr,
+	&numa_demotion_target_attr.attr,
 	NULL,
 };