[6/6] mm/migrate: export whether or not node is toptier in sysfs

Message ID 20220417034932.jborenmvfbqrfhlj@offworld (mailing list archive)
State New
Series mm: proactive reclaim and memory tiering topics

Commit Message

Davidlohr Bueso April 17, 2022, 3:49 a.m. UTC
This allows userspace to know if the node is considered fast
memory (with CPUs attached to it). While this can be already
derived without a new file, this helps further encapsulate the
concept.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
Resending, just noticed this patch was never posted.

  Documentation/ABI/stable/sysfs-devices-node |  6 ++++++
  drivers/base/node.c                         | 13 +++++++++++++
  2 files changed, 19 insertions(+)

--
2.26.2

Comments

Dave Hansen April 18, 2022, 3:34 p.m. UTC | #1
On 4/16/22 20:49, Davidlohr Bueso wrote:
> This allows userspace to know if the node is considered fast
> memory (with CPUs attached to it). While this can be already
> derived without a new file, this helps further encapsulate the
> concept.

What is userspace supposed to *do* with this, though?

What does "attached" mean?

Isn't it just asking for trouble to add (known) redundancy to the ABI?
It seems like a recipe for future inconsistency.

Davidlohr Bueso April 18, 2022, 4:45 p.m. UTC | #2
On Mon, 18 Apr 2022, Dave Hansen wrote:

>On 4/16/22 20:49, Davidlohr Bueso wrote:
>> This allows userspace to know if the node is considered fast
>> memory (with CPUs attached to it). While this can be already
>> derived without a new file, this helps further encapsulate the
>> concept.
>
>What is userspace supposed to *do* with this, though?

This came as a scratch to my own itch. I wanted to start testing
more tiering patches overall that I see pop up, and wanted a way
to differentiate the slow vs the fast memories in order to better
configure workload(s) working set sizes beyond what is your typical
grep MemTotal /proc/meminfo. If there is a better way I'm all
for it.

>
>What does "attached" mean?

I'll rephrase.

>Isn't it just asking for trouble to add (known) redundancy to the ABI?
>It seems like a recipe for future inconsistency.

Perhaps. It was mostly about the fact that the notion of top tier
could also change as technology evolves.

Thanks,
Davidlohr
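
For illustration, the per-node sizing mentioned above (going beyond grep MemTotal /proc/meminfo) can already be approximated from the existing /sys/devices/system/node/nodeX/meminfo file, whose lines look like "Node 0 MemTotal: 16314284 kB". The following is a minimal userspace sketch, not part of the patch; the helper names parse_node_memtotal_kb() and node_memtotal_kb() are hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper: extract the MemTotal value (in kB) from one line
 * of nodeX/meminfo, e.g. "Node 0 MemTotal:       16314284 kB".
 * Returns 0 on success, -1 if the line is not a MemTotal line.
 */
int parse_node_memtotal_kb(const char *line, unsigned long long *kb)
{
	return sscanf(line, "Node %*d MemTotal: %llu kB", kb) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: read MemTotal for node @nid from sysfs.
 * Returns 0 on success, -1 if the file is missing or has no MemTotal line.
 */
int node_memtotal_kb(int nid, unsigned long long *kb)
{
	char path[64], line[256];
	FILE *f;
	int ret = -1;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/node/node%d/meminfo", nid);
	f = fopen(path, "r");
	if (!f)
		return -1;

	/* Scan the file until the MemTotal line is found. */
	while (fgets(line, sizeof(line), f)) {
		if (parse_node_memtotal_kb(line, kb) == 0) {
			ret = 0;
			break;
		}
	}
	fclose(f);
	return ret;
}
```

Calling node_memtotal_kb() for each node of interest gives the per-node capacity that a test harness could use to size a working set against fast or slow memory separately.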

Dave Hansen April 18, 2022, 4:50 p.m. UTC | #3
On 4/18/22 09:45, Davidlohr Bueso wrote:
> On Mon, 18 Apr 2022, Dave Hansen wrote:
>> On 4/16/22 20:49, Davidlohr Bueso wrote:
>>> This allows userspace to know if the node is considered fast
>>> memory (with CPUs attached to it). While this can be already
>>> derived without a new file, this helps further encapsulate the
>>> concept.
>>
>> What is userspace supposed to *do* with this, though?
> 
> This came as a scratch to my own itch. I wanted to start testing
> more tiering patches overall that I see pop up, and wanted a way
> to differentiate the slow vs the fast memories in order to better
> configure workload(s) working set sizes beyond what is your typical
> grep MemTotal /proc/meminfo. If there is a better way I'm all
> for it.

But how does this help you?  Does it save you a few lines in a shell
script to find the nodes that have memory and CPUs?

>> Isn't it just asking for trouble to add (known) redundancy to the ABI?
>> It seems like a recipe for future inconsistency.
> 
> Perhaps. It was mostly about the fact that the notion of top tier
> could also change as technology evolves.

It seems like something arbitrary that everyone will just disagree on.
I think we should try to stick to cold, hard facts as much as possible
rather than trying to have the *kernel* dictate as a policy what is fast
versus slow.
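
As an illustration of the derivation discussed above (finding the nodes that have both memory and CPUs without a new file), a userspace sketch could intersect the existing /sys/devices/system/node/has_cpu and /sys/devices/system/node/has_memory node lists. This is only a sketch under stated assumptions: the helper names are hypothetical, and node ids are assumed to fit in 0..63.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/*
 * Hypothetical helper: parse a kernel-style node list such as "0-1,3"
 * into a bitmask (bit N set means node N is listed). Only node ids
 * 0..63 are supported. Returns 0 on success, -1 on malformed input.
 */
int parse_nodelist(const char *s, uint64_t *mask)
{
	*mask = 0;
	while (*s && *s != '\n') {
		char *end;
		unsigned long lo = strtoul(s, &end, 10);
		unsigned long hi = lo;

		if (end == s)
			return -1;
		if (*end == '-') {
			/* A range "lo-hi": parse the upper bound. */
			s = end + 1;
			hi = strtoul(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (hi < lo || hi > 63)
			return -1;
		for (unsigned long n = lo; n <= hi; n++)
			*mask |= UINT64_C(1) << n;

		if (*end == ',')
			s = end + 1;
		else if (*end == '\n' || *end == '\0')
			s = end;
		else
			return -1;
	}
	return 0;
}

/* Hypothetical helper: read one node-list file and parse it. */
int read_nodelist_file(const char *path, uint64_t *mask)
{
	char buf[256];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';	/* empty file: no nodes in this state */
	fclose(f);
	return parse_nodelist(buf, mask);
}

/* Nodes that have both CPUs and memory, as a bitmask. */
int nodes_with_cpu_and_memory(uint64_t *mask)
{
	uint64_t cpu, mem;

	if (read_nodelist_file("/sys/devices/system/node/has_cpu", &cpu) ||
	    read_nodelist_file("/sys/devices/system/node/has_memory", &mem))
		return -1;
	*mask = cpu & mem;
	return 0;
}
```

The intersection relies only on files the kernel already exports, which is the redundancy argument in a nutshell: the result reflects observable facts (CPUs present, memory present) rather than a kernel-defined notion of "fast".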

Yang Shi April 22, 2022, 5:37 p.m. UTC | #4
On Sat, Apr 16, 2022 at 8:49 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
>
>
>
> This allows userspace to know if the node is considered fast
> memory (with CPUs attached to it). While this can be already
> derived without a new file, this helps further encapsulate the
> concept.
>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> ---
> Resending, just noticed this patch was never posted.
>
>   Documentation/ABI/stable/sysfs-devices-node |  6 ++++++
>   drivers/base/node.c                         | 13 +++++++++++++
>   2 files changed, 19 insertions(+)
>
> diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
> index f620c6ae013c..1c21c3985535 100644
> --- a/Documentation/ABI/stable/sysfs-devices-node
> +++ b/Documentation/ABI/stable/sysfs-devices-node
> @@ -198,3 +198,9 @@ Date:               April 2022
>   Contact:      Davidlohr Bueso <dave@stgolabs.net>
>   Description:
>                 Shows nodes within the next tier of slower memory below this node.
> +
> +What:          /sys/devices/system/node/nodeX/memory_toptier
> +Date:          April 2022
> +Contact:       Davidlohr Bueso <dave@stgolabs.net>
> +Description:
> +               Node is attached to fast memory or not.
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index ab4bae777535..b9de5b0360f2 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -598,12 +598,25 @@ static ssize_t node_read_demotion_path(struct device *dev,
>   }
>   static DEVICE_ATTR(demotion_path, 0444, node_read_demotion_path, NULL);
>
> +static ssize_t node_read_memory_toptier(struct device *dev,
> +                                    struct device_attribute *attr, char *buf)
> +{
> +       int nid = dev->id;
> +       int len = 0;
> +
> +       len += sysfs_emit_at(buf, len, "%d\n", !!node_is_toptier(nid));

It is not guaranteed. Some hardware configurations have cpuless DRAM
nodes, but they should be treated as top tier nodes IMHO. Please see
https://lore.kernel.org/linux-mm/20220413092206.73974-1-jvgediya@linux.ibm.com/

> +
> +       return len;
> +}
> +static DEVICE_ATTR(memory_toptier, 0444, node_read_memory_toptier, NULL);
> +
>   static struct attribute *node_dev_attrs[] = {
>         &dev_attr_meminfo.attr,
>         &dev_attr_numastat.attr,
>         &dev_attr_distance.attr,
>         &dev_attr_vmstat.attr,
>         &dev_attr_demotion_path.attr,
> +       &dev_attr_memory_toptier.attr,
>         NULL
>   };
>
> --
> 2.26.2
>
Patch

diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
index f620c6ae013c..1c21c3985535 100644
--- a/Documentation/ABI/stable/sysfs-devices-node
+++ b/Documentation/ABI/stable/sysfs-devices-node
@@ -198,3 +198,9 @@  Date:		April 2022
  Contact:	Davidlohr Bueso <dave@stgolabs.net>
  Description:
		Shows nodes within the next tier of slower memory below this node.
+
+What:		/sys/devices/system/node/nodeX/memory_toptier
+Date:		April 2022
+Contact:	Davidlohr Bueso <dave@stgolabs.net>
+Description:
+		Node is attached to fast memory or not.
diff --git a/drivers/base/node.c b/drivers/base/node.c
index ab4bae777535..b9de5b0360f2 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -598,12 +598,25 @@  static ssize_t node_read_demotion_path(struct device *dev,
  }
  static DEVICE_ATTR(demotion_path, 0444, node_read_demotion_path, NULL);

+static ssize_t node_read_memory_toptier(struct device *dev,
+				     struct device_attribute *attr, char *buf)
+{
+	int nid = dev->id;
+	int len = 0;
+
+	len += sysfs_emit_at(buf, len, "%d\n", !!node_is_toptier(nid));
+
+	return len;
+}
+static DEVICE_ATTR(memory_toptier, 0444, node_read_memory_toptier, NULL);
+
  static struct attribute *node_dev_attrs[] = {
	&dev_attr_meminfo.attr,
	&dev_attr_numastat.attr,
	&dev_attr_distance.attr,
	&dev_attr_vmstat.attr,
	&dev_attr_demotion_path.attr,
+	&dev_attr_memory_toptier.attr,
	NULL
  };
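
For reference, a consumer of the proposed attribute might look like the sketch below. It assumes a kernel with this patch applied, so that /sys/devices/system/node/nodeX/memory_toptier exists and contains "0\n" or "1\n" as emitted by node_read_memory_toptier(). The helper names parse_toptier() and node_memory_toptier() are hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper: interpret the attribute text.
 * "1\n" -> 1, "0\n" -> 0, anything else -> -1.
 */
int parse_toptier(const char *buf)
{
	if ((buf[0] == '0' || buf[0] == '1') &&
	    (buf[1] == '\n' || buf[1] == '\0'))
		return buf[0] - '0';
	return -1;
}

/*
 * Hypothetical helper: returns 1 if node @nid is reported as top tier,
 * 0 if not, and -1 if the attribute is missing or unreadable (for
 * example on a kernel without this patch).
 */
int node_memory_toptier(int nid)
{
	char path[80], buf[8];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/node/node%d/memory_toptier", nid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);
	return parse_toptier(buf);
}
```

Because the file may be absent, a caller would need to treat -1 as "unknown" and fall back to information the kernel already exports.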