diff mbox series

[PATCHv4,05/13] Documentation/ABI: Add new node sysfs attributes

Message ID 20190116175804.30196-6-keith.busch@intel.com (mailing list archive)
State New, archived
Headers show
Series Heterogeneuos memory node attributes | expand

Commit Message

Keith Busch Jan. 16, 2019, 5:57 p.m. UTC
Add entries for memory initiator and target node class attributes.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 Documentation/ABI/stable/sysfs-devices-node | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

Comments

Rafael J. Wysocki Jan. 17, 2019, 11:41 a.m. UTC | #1
On Wed, Jan 16, 2019 at 6:59 PM Keith Busch <keith.busch@intel.com> wrote:
>
> Add entries for memory initiator and target node class attributes.
>
> Signed-off-by: Keith Busch <keith.busch@intel.com>

I would recommend combining this with the previous patch, as the way
it is now I need to look at two patches at the time. :-)

> ---
>  Documentation/ABI/stable/sysfs-devices-node | 25 ++++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
> index 3e90e1f3bf0a..a9c47b4b0eee 100644
> --- a/Documentation/ABI/stable/sysfs-devices-node
> +++ b/Documentation/ABI/stable/sysfs-devices-node
> @@ -90,4 +90,27 @@ Date:                December 2009
>  Contact:       Lee Schermerhorn <lee.schermerhorn@hp.com>
>  Description:
>                 The node's huge page size control/query attributes.
> -               See Documentation/admin-guide/mm/hugetlbpage.rst
> \ No newline at end of file
> +               See Documentation/admin-guide/mm/hugetlbpage.rst
> +
> +What:          /sys/devices/system/node/nodeX/classY/
> +Date:          December 2018
> +Contact:       Keith Busch <keith.busch@intel.com>
> +Description:
> +               The node's relationship to other nodes for access class "Y".
> +
> +What:          /sys/devices/system/node/nodeX/classY/initiator_nodelist
> +Date:          December 2018
> +Contact:       Keith Busch <keith.busch@intel.com>
> +Description:
> +               The node list of memory initiators that have class "Y" access
> +               to this node's memory. CPUs and other memory initiators in
> +               nodes not in the list accessing this node's memory may have
> +               different performance.

This does not follow the general "one value per file" rule of sysfs (I
know that there are other sysfs files with more than one value in
them, but it is better to follow this rule as long as that makes
sense).

> +
> +What:          /sys/devices/system/node/nodeX/classY/target_nodelist
> +Date:          December 2018
> +Contact:       Keith Busch <keith.busch@intel.com>
> +Description:
> +               The node list of memory targets that this initiator node has
> +               class "Y" access. Memory accesses from this node to nodes not
> +               in this list may have differet performance.
> --

Same here.

And if you follow the recommendation given in the previous message
(add "initiators" and "targets" subdirs under "classX"), you won't
even need the two files above.

And, of course, the symlinks part needs to be documented as well.  I
guess you can follow the
Documentation/ABI/testing/sysfs-devices-power_resources_D0 with that.
Jonathan Cameron Jan. 18, 2019, 11:21 a.m. UTC | #2
On Wed, 16 Jan 2019 10:57:56 -0700
Keith Busch <keith.busch@intel.com> wrote:

> Add entries for memory initiator and target node class attributes.
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
>  Documentation/ABI/stable/sysfs-devices-node | 25 ++++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
> index 3e90e1f3bf0a..a9c47b4b0eee 100644
> --- a/Documentation/ABI/stable/sysfs-devices-node
> +++ b/Documentation/ABI/stable/sysfs-devices-node
> @@ -90,4 +90,27 @@ Date:		December 2009
>  Contact:	Lee Schermerhorn <lee.schermerhorn@hp.com>
>  Description:
>  		The node's huge page size control/query attributes.
> -		See Documentation/admin-guide/mm/hugetlbpage.rst
> \ No newline at end of file
> +		See Documentation/admin-guide/mm/hugetlbpage.rst
> +
> +What:		/sys/devices/system/node/nodeX/classY/
> +Date:		December 2018
> +Contact:	Keith Busch <keith.busch@intel.com>
> +Description:
> +		The node's relationship to other nodes for access class "Y".
> +
> +What:		/sys/devices/system/node/nodeX/classY/initiator_nodelist
> +Date:		December 2018
> +Contact:	Keith Busch <keith.busch@intel.com>
> +Description:
> +		The node list of memory initiators that have class "Y" access
> +		to this node's memory. CPUs and other memory initiators in
> +		nodes not in the list accessing this node's memory may have
> +		different performance.
> +
> +What:		/sys/devices/system/node/nodeX/classY/target_nodelist
> +Date:		December 2018
> +Contact:	Keith Busch <keith.busch@intel.com>
> +Description:
> +		The node list of memory targets that this initiator node has
> +		class "Y" access. Memory accesses from this node to nodes not
> +		in this list may have differet performance.

Different performance from what?  In the other thread we established that
these target_nodelists are kind of a backwards reference, they all have
their characteristics anyway.  Perhaps this just needs to say:
"Memory access from this node to these targets may have different performance"?

i.e. Don't make the assumption I did that they should all be the same!
Dan Williams Jan. 18, 2019, 4:35 p.m. UTC | #3
On Fri, Jan 18, 2019 at 3:22 AM Jonathan Cameron
<jonathan.cameron@huawei.com> wrote:
>
> On Wed, 16 Jan 2019 10:57:56 -0700
> Keith Busch <keith.busch@intel.com> wrote:
>
> > Add entries for memory initiator and target node class attributes.
> >
> > Signed-off-by: Keith Busch <keith.busch@intel.com>
> > ---
> >  Documentation/ABI/stable/sysfs-devices-node | 25 ++++++++++++++++++++++++-
> >  1 file changed, 24 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
> > index 3e90e1f3bf0a..a9c47b4b0eee 100644
> > --- a/Documentation/ABI/stable/sysfs-devices-node
> > +++ b/Documentation/ABI/stable/sysfs-devices-node
> > @@ -90,4 +90,27 @@ Date:              December 2009
> >  Contact:     Lee Schermerhorn <lee.schermerhorn@hp.com>
> >  Description:
> >               The node's huge page size control/query attributes.
> > -             See Documentation/admin-guide/mm/hugetlbpage.rst
> > \ No newline at end of file
> > +             See Documentation/admin-guide/mm/hugetlbpage.rst
> > +
> > +What:                /sys/devices/system/node/nodeX/classY/
> > +Date:                December 2018
> > +Contact:     Keith Busch <keith.busch@intel.com>
> > +Description:
> > +             The node's relationship to other nodes for access class "Y".
> > +
> > +What:                /sys/devices/system/node/nodeX/classY/initiator_nodelist
> > +Date:                December 2018
> > +Contact:     Keith Busch <keith.busch@intel.com>
> > +Description:
> > +             The node list of memory initiators that have class "Y" access
> > +             to this node's memory. CPUs and other memory initiators in
> > +             nodes not in the list accessing this node's memory may have
> > +             different performance.
> > +
> > +What:                /sys/devices/system/node/nodeX/classY/target_nodelist
> > +Date:                December 2018
> > +Contact:     Keith Busch <keith.busch@intel.com>
> > +Description:
> > +             The node list of memory targets that this initiator node has
> > +             class "Y" access. Memory accesses from this node to nodes not
> > +             in this list may have differet performance.
>
> Different performance from what?  In the other thread we established that
> these target_nodelists are kind of a backwards reference, they all have
> their characteristics anyway.  Perhaps this just needs to say:
> "Memory access from this node to these targets may have different performance"?
>
> i.e. Don't make the assumption I did that they should all be the same!

I think a clarification of "class" is needed in this context. A
"class" is the the set of initiators that have the same rated
performance to a given target set. In other words "class" is a tuple
of (performance profile, initiator set, target set). Different
performance creates a different tuple / class.
Keith Busch Jan. 18, 2019, 8:42 p.m. UTC | #4
On Thu, Jan 17, 2019 at 12:41:19PM +0100, Rafael J. Wysocki wrote:
> On Wed, Jan 16, 2019 at 6:59 PM Keith Busch <keith.busch@intel.com> wrote:
> >
> > Add entries for memory initiator and target node class attributes.
> >
> > Signed-off-by: Keith Busch <keith.busch@intel.com>
> 
> I would recommend combining this with the previous patch, as the way
> it is now I need to look at two patches at the time. :-)

Will do.
 
> > ---
> >  Documentation/ABI/stable/sysfs-devices-node | 25 ++++++++++++++++++++++++-
> >  1 file changed, 24 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
> > index 3e90e1f3bf0a..a9c47b4b0eee 100644
> > --- a/Documentation/ABI/stable/sysfs-devices-node
> > +++ b/Documentation/ABI/stable/sysfs-devices-node
> > @@ -90,4 +90,27 @@ Date:                December 2009
> >  Contact:       Lee Schermerhorn <lee.schermerhorn@hp.com>
> >  Description:
> >                 The node's huge page size control/query attributes.
> > -               See Documentation/admin-guide/mm/hugetlbpage.rst
> > \ No newline at end of file
> > +               See Documentation/admin-guide/mm/hugetlbpage.rst
> > +
> > +What:          /sys/devices/system/node/nodeX/classY/
> > +Date:          December 2018
> > +Contact:       Keith Busch <keith.busch@intel.com>
> > +Description:
> > +               The node's relationship to other nodes for access class "Y".
> > +
> > +What:          /sys/devices/system/node/nodeX/classY/initiator_nodelist
> > +Date:          December 2018
> > +Contact:       Keith Busch <keith.busch@intel.com>
> > +Description:
> > +               The node list of memory initiators that have class "Y" access
> > +               to this node's memory. CPUs and other memory initiators in
> > +               nodes not in the list accessing this node's memory may have
> > +               different performance.
> 
> This does not follow the general "one value per file" rule of sysfs (I
> know that there are other sysfs files with more than one value in
> them, but it is better to follow this rule as long as that makes
> sense).

I think it actually does make sense to use "%*pbl" format here. The
type we're displaying is a bit array, so this is one value for that
attribute. There are many precedents for disapling bit arrays this
way. The existing node attributes that we're appending to show cpu
lists, and the top loevel shows node lists for "possible", "online",
"has_memory", "has_cpu". This new attribute should fit well with
exisiting similar attributes from this subsystem.
 
> > +
> > +What:          /sys/devices/system/node/nodeX/classY/target_nodelist
> > +Date:          December 2018
> > +Contact:       Keith Busch <keith.busch@intel.com>
> > +Description:
> > +               The node list of memory targets that this initiator node has
> > +               class "Y" access. Memory accesses from this node to nodes not
> > +               in this list may have differet performance.
> > --
> 
> Same here.
> 
> And if you follow the recommendation given in the previous message
> (add "initiators" and "targets" subdirs under "classX"), you won't
> even need the two files above.

I think that makes sense, I'll take try out this suggestion.
 
> And, of course, the symlinks part needs to be documented as well.  I
> guess you can follow the
> Documentation/ABI/testing/sysfs-devices-power_resources_D0 with that.
Dan Williams Jan. 18, 2019, 9:08 p.m. UTC | #5
On Thu, Jan 17, 2019 at 3:41 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Wed, Jan 16, 2019 at 6:59 PM Keith Busch <keith.busch@intel.com> wrote:
> >
> > Add entries for memory initiator and target node class attributes.
> >
> > Signed-off-by: Keith Busch <keith.busch@intel.com>
>
> I would recommend combining this with the previous patch, as the way
> it is now I need to look at two patches at the time. :-)
>
> > ---
> >  Documentation/ABI/stable/sysfs-devices-node | 25 ++++++++++++++++++++++++-
> >  1 file changed, 24 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
> > index 3e90e1f3bf0a..a9c47b4b0eee 100644
> > --- a/Documentation/ABI/stable/sysfs-devices-node
> > +++ b/Documentation/ABI/stable/sysfs-devices-node
> > @@ -90,4 +90,27 @@ Date:                December 2009
> >  Contact:       Lee Schermerhorn <lee.schermerhorn@hp.com>
> >  Description:
> >                 The node's huge page size control/query attributes.
> > -               See Documentation/admin-guide/mm/hugetlbpage.rst
> > \ No newline at end of file
> > +               See Documentation/admin-guide/mm/hugetlbpage.rst
> > +
> > +What:          /sys/devices/system/node/nodeX/classY/
> > +Date:          December 2018
> > +Contact:       Keith Busch <keith.busch@intel.com>
> > +Description:
> > +               The node's relationship to other nodes for access class "Y".
> > +
> > +What:          /sys/devices/system/node/nodeX/classY/initiator_nodelist
> > +Date:          December 2018
> > +Contact:       Keith Busch <keith.busch@intel.com>
> > +Description:
> > +               The node list of memory initiators that have class "Y" access
> > +               to this node's memory. CPUs and other memory initiators in
> > +               nodes not in the list accessing this node's memory may have
> > +               different performance.
>
> This does not follow the general "one value per file" rule of sysfs (I
> know that there are other sysfs files with more than one value in
> them, but it is better to follow this rule as long as that makes
> sense).
>
> > +
> > +What:          /sys/devices/system/node/nodeX/classY/target_nodelist
> > +Date:          December 2018
> > +Contact:       Keith Busch <keith.busch@intel.com>
> > +Description:
> > +               The node list of memory targets that this initiator node has
> > +               class "Y" access. Memory accesses from this node to nodes not
> > +               in this list may have differet performance.
> > --
>
> Same here.
>
> And if you follow the recommendation given in the previous message
> (add "initiators" and "targets" subdirs under "classX"), you won't
> even need the two files above.

This recommendation is in conflict with Greg's feedback about kobject
usage. If these are just "vanity" subdirs I think it's better to have
a multi-value sysfs file. This "list" style is already commonplace for
the /sys/devices/system hierarchy.
Greg KH Jan. 19, 2019, 9:01 a.m. UTC | #6
On Fri, Jan 18, 2019 at 01:08:02PM -0800, Dan Williams wrote:
> On Thu, Jan 17, 2019 at 3:41 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Wed, Jan 16, 2019 at 6:59 PM Keith Busch <keith.busch@intel.com> wrote:
> > >
> > > Add entries for memory initiator and target node class attributes.
> > >
> > > Signed-off-by: Keith Busch <keith.busch@intel.com>
> >
> > I would recommend combining this with the previous patch, as the way
> > it is now I need to look at two patches at the time. :-)
> >
> > > ---
> > >  Documentation/ABI/stable/sysfs-devices-node | 25 ++++++++++++++++++++++++-
> > >  1 file changed, 24 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
> > > index 3e90e1f3bf0a..a9c47b4b0eee 100644
> > > --- a/Documentation/ABI/stable/sysfs-devices-node
> > > +++ b/Documentation/ABI/stable/sysfs-devices-node
> > > @@ -90,4 +90,27 @@ Date:                December 2009
> > >  Contact:       Lee Schermerhorn <lee.schermerhorn@hp.com>
> > >  Description:
> > >                 The node's huge page size control/query attributes.
> > > -               See Documentation/admin-guide/mm/hugetlbpage.rst
> > > \ No newline at end of file
> > > +               See Documentation/admin-guide/mm/hugetlbpage.rst
> > > +
> > > +What:          /sys/devices/system/node/nodeX/classY/
> > > +Date:          December 2018
> > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > +Description:
> > > +               The node's relationship to other nodes for access class "Y".
> > > +
> > > +What:          /sys/devices/system/node/nodeX/classY/initiator_nodelist
> > > +Date:          December 2018
> > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > +Description:
> > > +               The node list of memory initiators that have class "Y" access
> > > +               to this node's memory. CPUs and other memory initiators in
> > > +               nodes not in the list accessing this node's memory may have
> > > +               different performance.
> >
> > This does not follow the general "one value per file" rule of sysfs (I
> > know that there are other sysfs files with more than one value in
> > them, but it is better to follow this rule as long as that makes
> > sense).
> >
> > > +
> > > +What:          /sys/devices/system/node/nodeX/classY/target_nodelist
> > > +Date:          December 2018
> > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > +Description:
> > > +               The node list of memory targets that this initiator node has
> > > +               class "Y" access. Memory accesses from this node to nodes not
> > > +               in this list may have differet performance.
> > > --
> >
> > Same here.
> >
> > And if you follow the recommendation given in the previous message
> > (add "initiators" and "targets" subdirs under "classX"), you won't
> > even need the two files above.
> 
> This recommendation is in conflict with Greg's feedback about kobject
> usage. If these are just "vanity" subdirs I think it's better to have
> a multi-value sysfs file. This "list" style is already commonplace for
> the /sys/devices/system hierarchy.

If you do a subdirectory "correctly" (i.e. a name for an attribute
group), that's fine.  Just do not ever create a kobject just for a
subdir, that will mess up userspace.

And I hate the "multi-value" sysfs files, where at all possible, please
do not copy past bad mistakes there.  If you can make them individual
files, please do that, it makes it easier to maintain and code the
kernel for.

thanks,

greg k-h
Dan Williams Jan. 19, 2019, 4:56 p.m. UTC | #7
On Sat, Jan 19, 2019 at 1:01 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Fri, Jan 18, 2019 at 01:08:02PM -0800, Dan Williams wrote:
> > On Thu, Jan 17, 2019 at 3:41 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > >
> > > On Wed, Jan 16, 2019 at 6:59 PM Keith Busch <keith.busch@intel.com> wrote:
> > > >
> > > > Add entries for memory initiator and target node class attributes.
> > > >
> > > > Signed-off-by: Keith Busch <keith.busch@intel.com>
> > >
> > > I would recommend combining this with the previous patch, as the way
> > > it is now I need to look at two patches at the time. :-)
> > >
> > > > ---
> > > >  Documentation/ABI/stable/sysfs-devices-node | 25 ++++++++++++++++++++++++-
> > > >  1 file changed, 24 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
> > > > index 3e90e1f3bf0a..a9c47b4b0eee 100644
> > > > --- a/Documentation/ABI/stable/sysfs-devices-node
> > > > +++ b/Documentation/ABI/stable/sysfs-devices-node
> > > > @@ -90,4 +90,27 @@ Date:                December 2009
> > > >  Contact:       Lee Schermerhorn <lee.schermerhorn@hp.com>
> > > >  Description:
> > > >                 The node's huge page size control/query attributes.
> > > > -               See Documentation/admin-guide/mm/hugetlbpage.rst
> > > > \ No newline at end of file
> > > > +               See Documentation/admin-guide/mm/hugetlbpage.rst
> > > > +
> > > > +What:          /sys/devices/system/node/nodeX/classY/
> > > > +Date:          December 2018
> > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > +Description:
> > > > +               The node's relationship to other nodes for access class "Y".
> > > > +
> > > > +What:          /sys/devices/system/node/nodeX/classY/initiator_nodelist
> > > > +Date:          December 2018
> > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > +Description:
> > > > +               The node list of memory initiators that have class "Y" access
> > > > +               to this node's memory. CPUs and other memory initiators in
> > > > +               nodes not in the list accessing this node's memory may have
> > > > +               different performance.
> > >
> > > This does not follow the general "one value per file" rule of sysfs (I
> > > know that there are other sysfs files with more than one value in
> > > them, but it is better to follow this rule as long as that makes
> > > sense).
> > >
> > > > +
> > > > +What:          /sys/devices/system/node/nodeX/classY/target_nodelist
> > > > +Date:          December 2018
> > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > +Description:
> > > > +               The node list of memory targets that this initiator node has
> > > > +               class "Y" access. Memory accesses from this node to nodes not
> > > > +               in this list may have differet performance.
> > > > --
> > >
> > > Same here.
> > >
> > > And if you follow the recommendation given in the previous message
> > > (add "initiators" and "targets" subdirs under "classX"), you won't
> > > even need the two files above.
> >
> > This recommendation is in conflict with Greg's feedback about kobject
> > usage. If these are just "vanity" subdirs I think it's better to have
> > a multi-value sysfs file. This "list" style is already commonplace for
> > the /sys/devices/system hierarchy.
>
> If you do a subdirectory "correctly" (i.e. a name for an attribute
> group), that's fine.  Just do not ever create a kobject just for a
> subdir, that will mess up userspace.
>
> And I hate the "multi-value" sysfs files, where at all possible, please
> do not copy past bad mistakes there.  If you can make them individual
> files, please do that, it makes it easier to maintain and code the
> kernel for.

I agree in general about multi-value sysfs, but in this case we're
talking about a mask. Masks are a single value. That said I can get on
board with calling what 'cpulist' does a design mistake (human
readable mask), but otherwise switching to one file per item in the
mask is a mess for userspace to consume.
Rafael J. Wysocki Jan. 20, 2019, 4:16 p.m. UTC | #8
On Sat, Jan 19, 2019 at 10:01 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Fri, Jan 18, 2019 at 01:08:02PM -0800, Dan Williams wrote:
> > On Thu, Jan 17, 2019 at 3:41 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > >
> > > On Wed, Jan 16, 2019 at 6:59 PM Keith Busch <keith.busch@intel.com> wrote:
> > > >
> > > > Add entries for memory initiator and target node class attributes.
> > > >
> > > > Signed-off-by: Keith Busch <keith.busch@intel.com>
> > >
> > > I would recommend combining this with the previous patch, as the way
> > > it is now I need to look at two patches at the time. :-)
> > >
> > > > ---
> > > >  Documentation/ABI/stable/sysfs-devices-node | 25 ++++++++++++++++++++++++-
> > > >  1 file changed, 24 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
> > > > index 3e90e1f3bf0a..a9c47b4b0eee 100644
> > > > --- a/Documentation/ABI/stable/sysfs-devices-node
> > > > +++ b/Documentation/ABI/stable/sysfs-devices-node
> > > > @@ -90,4 +90,27 @@ Date:                December 2009
> > > >  Contact:       Lee Schermerhorn <lee.schermerhorn@hp.com>
> > > >  Description:
> > > >                 The node's huge page size control/query attributes.
> > > > -               See Documentation/admin-guide/mm/hugetlbpage.rst
> > > > \ No newline at end of file
> > > > +               See Documentation/admin-guide/mm/hugetlbpage.rst
> > > > +
> > > > +What:          /sys/devices/system/node/nodeX/classY/
> > > > +Date:          December 2018
> > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > +Description:
> > > > +               The node's relationship to other nodes for access class "Y".
> > > > +
> > > > +What:          /sys/devices/system/node/nodeX/classY/initiator_nodelist
> > > > +Date:          December 2018
> > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > +Description:
> > > > +               The node list of memory initiators that have class "Y" access
> > > > +               to this node's memory. CPUs and other memory initiators in
> > > > +               nodes not in the list accessing this node's memory may have
> > > > +               different performance.
> > >
> > > This does not follow the general "one value per file" rule of sysfs (I
> > > know that there are other sysfs files with more than one value in
> > > them, but it is better to follow this rule as long as that makes
> > > sense).
> > >
> > > > +
> > > > +What:          /sys/devices/system/node/nodeX/classY/target_nodelist
> > > > +Date:          December 2018
> > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > +Description:
> > > > +               The node list of memory targets that this initiator node has
> > > > +               class "Y" access. Memory accesses from this node to nodes not
> > > > +               in this list may have differet performance.
> > > > --
> > >
> > > Same here.
> > >
> > > And if you follow the recommendation given in the previous message
> > > (add "initiators" and "targets" subdirs under "classX"), you won't
> > > even need the two files above.
> >
> > This recommendation is in conflict with Greg's feedback about kobject
> > usage. If these are just "vanity" subdirs I think it's better to have
> > a multi-value sysfs file. This "list" style is already commonplace for
> > the /sys/devices/system hierarchy.
>
> If you do a subdirectory "correctly" (i.e. a name for an attribute
> group), that's fine.

Yes, that's what I was thinking about: along the lines of the "power"
group under device kobjects.

Cheers,
Rafael
Rafael J. Wysocki Jan. 20, 2019, 4:19 p.m. UTC | #9
On Sat, Jan 19, 2019 at 5:56 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Sat, Jan 19, 2019 at 1:01 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Fri, Jan 18, 2019 at 01:08:02PM -0800, Dan Williams wrote:
> > > On Thu, Jan 17, 2019 at 3:41 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > >
> > > > On Wed, Jan 16, 2019 at 6:59 PM Keith Busch <keith.busch@intel.com> wrote:
> > > > >
> > > > > Add entries for memory initiator and target node class attributes.
> > > > >
> > > > > Signed-off-by: Keith Busch <keith.busch@intel.com>
> > > >
> > > > I would recommend combining this with the previous patch, as the way
> > > > it is now I need to look at two patches at the time. :-)
> > > >
> > > > > ---
> > > > >  Documentation/ABI/stable/sysfs-devices-node | 25 ++++++++++++++++++++++++-
> > > > >  1 file changed, 24 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
> > > > > index 3e90e1f3bf0a..a9c47b4b0eee 100644
> > > > > --- a/Documentation/ABI/stable/sysfs-devices-node
> > > > > +++ b/Documentation/ABI/stable/sysfs-devices-node
> > > > > @@ -90,4 +90,27 @@ Date:                December 2009
> > > > >  Contact:       Lee Schermerhorn <lee.schermerhorn@hp.com>
> > > > >  Description:
> > > > >                 The node's huge page size control/query attributes.
> > > > > -               See Documentation/admin-guide/mm/hugetlbpage.rst
> > > > > \ No newline at end of file
> > > > > +               See Documentation/admin-guide/mm/hugetlbpage.rst
> > > > > +
> > > > > +What:          /sys/devices/system/node/nodeX/classY/
> > > > > +Date:          December 2018
> > > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > > +Description:
> > > > > +               The node's relationship to other nodes for access class "Y".
> > > > > +
> > > > > +What:          /sys/devices/system/node/nodeX/classY/initiator_nodelist
> > > > > +Date:          December 2018
> > > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > > +Description:
> > > > > +               The node list of memory initiators that have class "Y" access
> > > > > +               to this node's memory. CPUs and other memory initiators in
> > > > > +               nodes not in the list accessing this node's memory may have
> > > > > +               different performance.
> > > >
> > > > This does not follow the general "one value per file" rule of sysfs (I
> > > > know that there are other sysfs files with more than one value in
> > > > them, but it is better to follow this rule as long as that makes
> > > > sense).
> > > >
> > > > > +
> > > > > +What:          /sys/devices/system/node/nodeX/classY/target_nodelist
> > > > > +Date:          December 2018
> > > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > > +Description:
> > > > > +               The node list of memory targets that this initiator node has
> > > > > +               class "Y" access. Memory accesses from this node to nodes not
> > > > > +               in this list may have differet performance.
> > > > > --
> > > >
> > > > Same here.
> > > >
> > > > And if you follow the recommendation given in the previous message
> > > > (add "initiators" and "targets" subdirs under "classX"), you won't
> > > > even need the two files above.
> > >
> > > This recommendation is in conflict with Greg's feedback about kobject
> > > usage. If these are just "vanity" subdirs I think it's better to have
> > > a multi-value sysfs file. This "list" style is already commonplace for
> > > the /sys/devices/system hierarchy.
> >
> > If you do a subdirectory "correctly" (i.e. a name for an attribute
> > group), that's fine.  Just do not ever create a kobject just for a
> > subdir, that will mess up userspace.
> >
> > And I hate the "multi-value" sysfs files, where at all possible, please
> > do not copy past bad mistakes there.  If you can make them individual
> > files, please do that, it makes it easier to maintain and code the
> > kernel for.
>
> I agree in general about multi-value sysfs, but in this case we're
> talking about a mask. Masks are a single value. That said I can get on
> board with calling what 'cpulist' does a design mistake (human
> readable mask), but otherwise switching to one file per item in the
> mask is a mess for userspace to consume.

Can you please refer to my response to Keith?

If you have "initiators" and "targets" under "classX" and a list of
symlinks in each of them, I don't see any kind of a mess here.
Dan Williams Jan. 20, 2019, 5:34 p.m. UTC | #10
On Sun, Jan 20, 2019 at 8:20 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Sat, Jan 19, 2019 at 5:56 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > On Sat, Jan 19, 2019 at 1:01 AM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Fri, Jan 18, 2019 at 01:08:02PM -0800, Dan Williams wrote:
> > > > On Thu, Jan 17, 2019 at 3:41 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > > >
> > > > > On Wed, Jan 16, 2019 at 6:59 PM Keith Busch <keith.busch@intel.com> wrote:
> > > > > >
> > > > > > Add entries for memory initiator and target node class attributes.
> > > > > >
> > > > > > Signed-off-by: Keith Busch <keith.busch@intel.com>
> > > > >
> > > > > I would recommend combining this with the previous patch, as the way
> > > > > it is now I need to look at two patches at the time. :-)
> > > > >
> > > > > > ---
> > > > > >  Documentation/ABI/stable/sysfs-devices-node | 25 ++++++++++++++++++++++++-
> > > > > >  1 file changed, 24 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
> > > > > > index 3e90e1f3bf0a..a9c47b4b0eee 100644
> > > > > > --- a/Documentation/ABI/stable/sysfs-devices-node
> > > > > > +++ b/Documentation/ABI/stable/sysfs-devices-node
> > > > > > @@ -90,4 +90,27 @@ Date:                December 2009
> > > > > >  Contact:       Lee Schermerhorn <lee.schermerhorn@hp.com>
> > > > > >  Description:
> > > > > >                 The node's huge page size control/query attributes.
> > > > > > -               See Documentation/admin-guide/mm/hugetlbpage.rst
> > > > > > \ No newline at end of file
> > > > > > +               See Documentation/admin-guide/mm/hugetlbpage.rst
> > > > > > +
> > > > > > +What:          /sys/devices/system/node/nodeX/classY/
> > > > > > +Date:          December 2018
> > > > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > > > +Description:
> > > > > > +               The node's relationship to other nodes for access class "Y".
> > > > > > +
> > > > > > +What:          /sys/devices/system/node/nodeX/classY/initiator_nodelist
> > > > > > +Date:          December 2018
> > > > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > > > +Description:
> > > > > > +               The node list of memory initiators that have class "Y" access
> > > > > > +               to this node's memory. CPUs and other memory initiators in
> > > > > > +               nodes not in the list accessing this node's memory may have
> > > > > > +               different performance.
> > > > >
> > > > > This does not follow the general "one value per file" rule of sysfs (I
> > > > > know that there are other sysfs files with more than one value in
> > > > > them, but it is better to follow this rule as long as that makes
> > > > > sense).
> > > > >
> > > > > > +
> > > > > > +What:          /sys/devices/system/node/nodeX/classY/target_nodelist
> > > > > > +Date:          December 2018
> > > > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > > > +Description:
> > > > > > +               The node list of memory targets that this initiator node has
> > > > > > +               class "Y" access. Memory accesses from this node to nodes not
> > > > > > +               in this list may have differet performance.
> > > > > > --
> > > > >
> > > > > Same here.
> > > > >
> > > > > And if you follow the recommendation given in the previous message
> > > > > (add "initiators" and "targets" subdirs under "classX"), you won't
> > > > > even need the two files above.
> > > >
> > > > This recommendation is in conflict with Greg's feedback about kobject
> > > > usage. If these are just "vanity" subdirs I think it's better to have
> > > > a multi-value sysfs file. This "list" style is already commonplace for
> > > > the /sys/devices/system hierarchy.
> > >
> > > If you do a subdirectory "correctly" (i.e. a name for an attribute
> > > group), that's fine.  Just do not ever create a kobject just for a
> > > subdir, that will mess up userspace.
> > >
> > > And I hate the "multi-value" sysfs files, where at all possible, please
> > > do not copy past bad mistakes there.  If you can make them individual
> > > files, please do that, it makes it easier to maintain and code the
> > > kernel for.
> >
> > I agree in general about multi-value sysfs, but in this case we're
> > talking about a mask. Masks are a single value. That said I can get on
> > board with calling what 'cpulist' does a design mistake (human
> > readable mask), but otherwise switching to one file per item in the
> > mask is a mess for userspace to consume.
>
> Can you please refer to my response to Keith?

Ah, ok I missed the patch4 comments and was reading this one in
isolation... which also bolsters your comment about squashing these
two patches together.

> If you have "initiators" and "targets" under "classX" and a list of
> symlinks in each of them, I don't see any kind of a mess here.

In this instance, I think having symlinks at all is "messy" vs just
having a mask. Yes, you're right, if we have the proposed symlinks
from patch4 there is no need for these _nodelist attributes, and those
symlinks would be better under "initiator" and "target" directories.
However, I'm arguing going the other way, just have the 2 mask
attributes and no symlinks. The HMAT erodes the concept of "numa
nodes" typically being a single digit number space per platform. Given
increasing numbers of memory target types and initiator devices its
going to be cumbersome to have userspace walk multiple symlinks vs
just reading a mask and opening the canonical path for a node
directly.

This is also part of the rationale for only emitting one "class"
(initiator / target performance profile) by default. There's an N-to-N
initiator-target description in the HMAT. When / if we decide emit
more classes the more work userspace would need to do to convert
directory structures back into data.
Rafael J. Wysocki Jan. 21, 2019, 9:54 a.m. UTC | #11
On Sun, Jan 20, 2019 at 6:34 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Sun, Jan 20, 2019 at 8:20 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Sat, Jan 19, 2019 at 5:56 PM Dan Williams <dan.j.williams@intel.com> wrote:
> > >
> > > On Sat, Jan 19, 2019 at 1:01 AM Greg Kroah-Hartman
> > > <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Fri, Jan 18, 2019 at 01:08:02PM -0800, Dan Williams wrote:
> > > > > On Thu, Jan 17, 2019 at 3:41 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > > > >
> > > > > > On Wed, Jan 16, 2019 at 6:59 PM Keith Busch <keith.busch@intel.com> wrote:
> > > > > > >
> > > > > > > Add entries for memory initiator and target node class attributes.
> > > > > > >
> > > > > > > Signed-off-by: Keith Busch <keith.busch@intel.com>
> > > > > >
> > > > > > I would recommend combining this with the previous patch, as the way
> > > > > > it is now I need to look at two patches at the time. :-)
> > > > > >
> > > > > > > ---
> > > > > > >  Documentation/ABI/stable/sysfs-devices-node | 25 ++++++++++++++++++++++++-
> > > > > > >  1 file changed, 24 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
> > > > > > > index 3e90e1f3bf0a..a9c47b4b0eee 100644
> > > > > > > --- a/Documentation/ABI/stable/sysfs-devices-node
> > > > > > > +++ b/Documentation/ABI/stable/sysfs-devices-node
> > > > > > > @@ -90,4 +90,27 @@ Date:                December 2009
> > > > > > >  Contact:       Lee Schermerhorn <lee.schermerhorn@hp.com>
> > > > > > >  Description:
> > > > > > >                 The node's huge page size control/query attributes.
> > > > > > > -               See Documentation/admin-guide/mm/hugetlbpage.rst
> > > > > > > \ No newline at end of file
> > > > > > > +               See Documentation/admin-guide/mm/hugetlbpage.rst
> > > > > > > +
> > > > > > > +What:          /sys/devices/system/node/nodeX/classY/
> > > > > > > +Date:          December 2018
> > > > > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > > > > +Description:
> > > > > > > +               The node's relationship to other nodes for access class "Y".
> > > > > > > +
> > > > > > > +What:          /sys/devices/system/node/nodeX/classY/initiator_nodelist
> > > > > > > +Date:          December 2018
> > > > > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > > > > +Description:
> > > > > > > +               The node list of memory initiators that have class "Y" access
> > > > > > > +               to this node's memory. CPUs and other memory initiators in
> > > > > > > +               nodes not in the list accessing this node's memory may have
> > > > > > > +               different performance.
> > > > > >
> > > > > > This does not follow the general "one value per file" rule of sysfs (I
> > > > > > know that there are other sysfs files with more than one value in
> > > > > > them, but it is better to follow this rule as long as that makes
> > > > > > sense).
> > > > > >
> > > > > > > +
> > > > > > > +What:          /sys/devices/system/node/nodeX/classY/target_nodelist
> > > > > > > +Date:          December 2018
> > > > > > > +Contact:       Keith Busch <keith.busch@intel.com>
> > > > > > > +Description:
> > > > > > > +               The node list of memory targets that this initiator node has
> > > > > > > +               class "Y" access. Memory accesses from this node to nodes not
> > > > > > > +               in this list may have differet performance.
> > > > > > > --
> > > > > >
> > > > > > Same here.
> > > > > >
> > > > > > And if you follow the recommendation given in the previous message
> > > > > > (add "initiators" and "targets" subdirs under "classX"), you won't
> > > > > > even need the two files above.
> > > > >
> > > > > This recommendation is in conflict with Greg's feedback about kobject
> > > > > usage. If these are just "vanity" subdirs I think it's better to have
> > > > > a multi-value sysfs file. This "list" style is already commonplace for
> > > > > the /sys/devices/system hierarchy.
> > > >
> > > > If you do a subdirectory "correctly" (i.e. a name for an attribute
> > > > group), that's fine.  Just do not ever create a kobject just for a
> > > > subdir, that will mess up userspace.
> > > >
> > > > And I hate the "multi-value" sysfs files, where at all possible, please
> > > > do not copy past bad mistakes there.  If you can make them individual
> > > > files, please do that, it makes it easier to maintain and code the
> > > > kernel for.
> > >
> > > I agree in general about multi-value sysfs, but in this case we're
> > > talking about a mask. Masks are a single value. That said I can get on
> > > board with calling what 'cpulist' does a design mistake (human
> > > readable mask), but otherwise switching to one file per item in the
> > > mask is a mess for userspace to consume.
> >
> > Can you please refer to my response to Keith?
>
> Ah, ok I missed the patch4 comments and was reading this one in
> isolation... which also bolsters your comment about squashing these
> two patches together.
>
> > If you have "initiators" and "targets" under "classX" and a list of
> > symlinks in each of them, I don't see any kind of a mess here.
>
> In this instance, I think having symlinks at all is "messy" vs just
> having a mask. Yes, you're right, if we have the proposed symlinks
> from patch4 there is no need for these _nodelist attributes, and those
> symlinks would be better under "initiator" and "target" directories.
> However, I'm arguing going the other way, just have the 2 mask
> attributes and no symlinks. The HMAT erodes the concept of "numa
> nodes" typically being a single digit number space per platform. Given
> increasing numbers of memory target types and initiator devices its
> going to be cumbersome to have userspace walk multiple symlinks vs
> just reading a mask and opening the canonical path for a node
> directly.

The symlinks only need to be walked once, however, and after walking
them (once) the user space program can simply create a mask for
itself.

To me, the question here is how to represent a graph in a filesystem,
and since nodes are naturally represented by directories, it is also
natural to represent connections between them as symlinks.

> This is also part of the rationale for only emitting one "class"
> (initiator / target performance profile) by default. There's an N-to-N
> initiator-target description in the HMAT. When / if we decide emit
> more classes the more work userspace would need to do to convert
> directory structures back into data.

I'm not convinced by this argument, as this conversion is a one-off
exercise.  Ultimately, a tool in user space would need to represent a
graph anyway and how it obtains the data to do that doesn't matter too
much to it.

However, IMO there is value in representing the graph in a filesystem
in such a way that, say, a file manager program (without special
modifications) can be used to walk it by a human.

Let alone the one-value-per-file rule of sysfs that doesn't need to be
violated if symlinks are used.
Keith Busch Jan. 22, 2019, 4:36 p.m. UTC | #12
On Sun, Jan 20, 2019 at 05:16:05PM +0100, Rafael J. Wysocki wrote:
> On Sat, Jan 19, 2019 at 10:01 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > If you do a subdirectory "correctly" (i.e. a name for an attribute
> > group), that's fine.
> 
> Yes, that's what I was thinking about: along the lines of the "power"
> group under device kobjects.

We can't append symlinks to an attribute group, though. I'd need to create
a lot of struct devices just to get the desired directory hiearchy. And
then each of those "devices" will have their own "power" group, which
really doesn't make any sense for what we're trying to show. Is that
really the right way to do this, or something else I'm missing?
Rafael J. Wysocki Jan. 22, 2019, 4:51 p.m. UTC | #13
On Tue, Jan 22, 2019 at 5:37 PM Keith Busch <keith.busch@intel.com> wrote:
>
> On Sun, Jan 20, 2019 at 05:16:05PM +0100, Rafael J. Wysocki wrote:
> > On Sat, Jan 19, 2019 at 10:01 AM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > If you do a subdirectory "correctly" (i.e. a name for an attribute
> > > group), that's fine.
> >
> > Yes, that's what I was thinking about: along the lines of the "power"
> > group under device kobjects.
>
> We can't append symlinks to an attribute group, though.

That's right, unfortunately.

> I'd need to create a lot of struct devices just to get the desired directory hiearchy.

No, you don't need to do that.  Kobjects can be added without
registering a struct device for each of them kind of along the lines
of what cpufreq does for its policy objects etc.  See
cpufreq_policy_alloc() and cpufreq_core_init() for examples.

> And then each of those "devices" will have their own "power" group, which
> really doesn't make any sense for what we're trying to show. Is that
> really the right way to do this, or something else I'm missing?

Above?
Rafael J. Wysocki Jan. 22, 2019, 4:54 p.m. UTC | #14
On Tue, Jan 22, 2019 at 5:51 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Jan 22, 2019 at 5:37 PM Keith Busch <keith.busch@intel.com> wrote:
> >
> > On Sun, Jan 20, 2019 at 05:16:05PM +0100, Rafael J. Wysocki wrote:
> > > On Sat, Jan 19, 2019 at 10:01 AM Greg Kroah-Hartman
> > > <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > If you do a subdirectory "correctly" (i.e. a name for an attribute
> > > > group), that's fine.
> > >
> > > Yes, that's what I was thinking about: along the lines of the "power"
> > > group under device kobjects.
> >
> > We can't append symlinks to an attribute group, though.
>
> That's right, unfortunately.

Scratch this.

You can add them using sysfs_add_link_to_group().  For example, see
what acpi_power_expose_list() does.
diff mbox series

Patch

diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
index 3e90e1f3bf0a..a9c47b4b0eee 100644
--- a/Documentation/ABI/stable/sysfs-devices-node
+++ b/Documentation/ABI/stable/sysfs-devices-node
@@ -90,4 +90,27 @@  Date:		December 2009
 Contact:	Lee Schermerhorn <lee.schermerhorn@hp.com>
 Description:
 		The node's huge page size control/query attributes.
-		See Documentation/admin-guide/mm/hugetlbpage.rst
\ No newline at end of file
+		See Documentation/admin-guide/mm/hugetlbpage.rst
+
+What:		/sys/devices/system/node/nodeX/classY/
+Date:		December 2018
+Contact:	Keith Busch <keith.busch@intel.com>
+Description:
+		The node's relationship to other nodes for access class "Y".
+
+What:		/sys/devices/system/node/nodeX/classY/initiator_nodelist
+Date:		December 2018
+Contact:	Keith Busch <keith.busch@intel.com>
+Description:
+		The node list of memory initiators that have class "Y" access
+		to this node's memory. CPUs and other memory initiators in
+		nodes not in the list accessing this node's memory may have
+		different performance.
+
+What:		/sys/devices/system/node/nodeX/classY/target_nodelist
+Date:		December 2018
+Contact:	Keith Busch <keith.busch@intel.com>
+Description:
+		The node list of memory targets that this initiator node has
+		class "Y" access. Memory accesses from this node to nodes not
+		in this list may have differet performance.