diff mbox series

[v3] PCI/DOE: Expose the DOE protocols via sysfs

Message ID 20230809232851.1004023-1-alistair.francis@wdc.com (mailing list archive)
State Superseded
Delegated to: Bjorn Helgaas
Headers show
Series [v3] PCI/DOE: Expose the DOE protocols via sysfs | expand

Commit Message

Alistair Francis Aug. 9, 2023, 11:28 p.m. UTC
The PCIe 6 specification added support for the Data Object Exchange (DOE).
When DOE is supported the Discovery Data Object Protocol must be
implemented. The protocol allows a requester to obtain information about
the other DOE protocols supported by the device.

The kernel is already querying the DOE protocols supported and cacheing
the values. This patch exposes the values via sysfs. This will allow
userspace to determine which DOE protocols are supported by the PCIe
device.

By exposing the information to userspace tools like lspci can relay the
information to users. By listing all of the supported protocols we can
allow userspace to parse and support the list, which might include
vendor specific protocols as well as yet to be supported protocols.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
v3:
 - Expose each DOE feature as a separate file

 Documentation/ABI/testing/sysfs-bus-pci |  10 +++
 drivers/pci/doe.c                       | 107 ++++++++++++++++++++++++
 drivers/pci/pci-sysfs.c                 |   7 ++
 include/linux/pci-doe.h                 |   1 +
 4 files changed, 125 insertions(+)

Comments

Randy Dunlap Aug. 9, 2023, 11:43 p.m. UTC | #1
Hi--

On 8/9/23 16:28, Alistair Francis wrote:
> The PCIe 6 specification added support for the Data Object Exchange (DOE).
> When DOE is supported the Discovery Data Object Protocol must be
> implemented. The protocol allows a requester to obtain information about
> the other DOE protocols supported by the device.
> 
> The kernel is already querying the DOE protocols supported and cacheing
> the values. This patch exposes the values via sysfs. This will allow
> userspace to determine which DOE protocols are supported by the PCIe
> device.
> 
> By exposing the information to userspace tools like lspci can relay the
> information to users. By listing all of the supported protocols we can
> allow userspace to parse and support the list, which might include
> vendor specific protocols as well as yet to be supported protocols.
> 
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> ---
> v3:
>  - Expose each DOE feature as a separate file
> 
>  Documentation/ABI/testing/sysfs-bus-pci |  10 +++
>  drivers/pci/doe.c                       | 107 ++++++++++++++++++++++++
>  drivers/pci/pci-sysfs.c                 |   7 ++
>  include/linux/pci-doe.h                 |   1 +
>  4 files changed, 125 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> index ecf47559f495..e754b8efdb69 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -500,3 +500,13 @@ Description:
>  		console drivers from the device.  Raw users of pci-sysfs
>  		resourceN attributes must be terminated prior to resizing.
>  		Success of the resizing operation is not guaranteed.
> +
> +What:		/sys/bus/pci/devices/.../doe_proto

Should this be                                   doe_protos
? like this:
+	.name	= "doe_protos",


> +Date:		July 2023
> +Contact:	Linux PCI developers <linux-pci@vger.kernel.org>
> +Description:
> +		This diectory contains a list of the supported Data Object Exchange (DOE)

		     directory

> +		features. Each feature is a single file.

		The feature values are in the file name; the files have no contents.
?

> +		The value comes from the device and specifies the vendor and
> +		data object type supported. The lower byte is the data object type and the next
> +		two bytes are the vendor ID.
Chaitanya Kulkarni Aug. 10, 2023, 12:13 a.m. UTC | #2
On 8/9/2023 4:28 PM, Alistair Francis wrote:
> The PCIe 6 specification added support for the Data Object Exchange (DOE).
> When DOE is supported the Discovery Data Object Protocol must be
> implemented. The protocol allows a requester to obtain information about
> the other DOE protocols supported by the device.
> 
> The kernel is already querying the DOE protocols supported and cacheing
> the values. This patch exposes the values via sysfs. This will allow
> userspace to determine which DOE protocols are supported by the PCIe
> device.
> 
> By exposing the information to userspace tools like lspci can relay the
> information to users. By listing all of the supported protocols we can
> allow userspace to parse and support the list, which might include
> vendor specific protocols as well as yet to be supported protocols.
> 
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> ---
> v3:
>   - Expose each DOE feature as a separate file
> 

it will be nice to have entire version history ...

>   Documentation/ABI/testing/sysfs-bus-pci |  10 +++
>   drivers/pci/doe.c                       | 107 ++++++++++++++++++++++++
>   drivers/pci/pci-sysfs.c                 |   7 ++
>   include/linux/pci-doe.h                 |   1 +
>   4 files changed, 125 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> index ecf47559f495..e754b8efdb69 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -500,3 +500,13 @@ Description:
>   		console drivers from the device.  Raw users of pci-sysfs
>   		resourceN attributes must be terminated prior to resizing.
>   		Success of the resizing operation is not guaranteed.
> +
> +What:		/sys/bus/pci/devices/.../doe_proto
> +Date:		July 2023
> +Contact:	Linux PCI developers <linux-pci@vger.kernel.org>
> +Description:
> +		This diectory contains a list of the supported Data Object Exchange (DOE)
> +		features. Each feature is a single file.
> +		The value comes from the device and specifies the vendor and
> +		data object type supported. The lower byte is the data object type and the next
> +		two bytes are the vendor ID.

can you run spell check above ?

> diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
> index 1b97a5ab71a9..d5cbcb6a457a 100644
> --- a/drivers/pci/doe.c
> +++ b/drivers/pci/doe.c
> @@ -56,6 +56,10 @@ struct pci_doe_mb {
>   	wait_queue_head_t wq;
>   	struct workqueue_struct *work_queue;
>   	unsigned long flags;
> +
> +#ifdef CONFIG_SYSFS
> +	struct device_attribute *sysfs_attrs;
> +#endif
>   };
>   
>   struct pci_doe_protocol {
> @@ -92,6 +96,109 @@ struct pci_doe_task {
>   	struct pci_doe_mb *doe_mb;
>   };
>   
> +#ifdef CONFIG_SYSFS
> +static struct attribute *pci_dev_doe_proto_attrs[] = {
> +	NULL,
> +};
> +
> +static const struct attribute_group pci_dev_doe_proto_group = {
> +	.name	= "doe_protos",
> +	.attrs	= pci_dev_doe_proto_attrs,
> +};
> +
> +static void pci_doe_sysfs_remove_desc(struct pci_doe_mb *doe_mb)
> +{
> +	struct device_attribute *attrs = doe_mb->sysfs_attrs;
> +	unsigned long i;
> +	void *entry;
> +
> +	if (!attrs)
> +		return;
> +

nit : following reads well and matches next NULL assignment, feel free
to ignore :-

	if (!doe_mb->sysfs_attrs)
		return;


> +	doe_mb->sysfs_attrs = NULL;
> +	xa_for_each(&doe_mb->prots, i, entry)
> +		kfree(attrs[i].attr.name);
> +
> +	kfree(attrs);
> +}
> +
> +static int pci_doe_sysfs_proto_supports(struct pci_dev *pdev, struct pci_doe_mb *doe_mb)
> +{
> +	struct device_attribute *attrs;
> +	struct device *dev = &pdev->dev;
> +	unsigned long i;
> +	int ret;
> +	unsigned long num_protos = 0;
> +	unsigned long vid, type;
> +	void *entry;
> +

cosmetic: feel free to ignore but I'd use, feel free to ignore :-

	struct device *dev = &pdev->dev;
	struct device_attribute *attrs;
	unsigned long num_protos = 0;
	unsigned long vid, type;
	unsigned long i;
	void *entry;
	int ret;

> +	xa_for_each(&doe_mb->prots, i, entry)
> +		num_protos++;
> +
> +	attrs = kcalloc(num_protos, sizeof(*attrs), GFP_KERNEL);
> +	if (!attrs)
> +		return -ENOMEM;
> +
> +	doe_mb->sysfs_attrs = attrs;
> +	xa_for_each(&doe_mb->prots, i, entry) {
> +		sysfs_attr_init(&attrs[i].attr);
> +		vid = xa_to_value(entry) >> 8;
> +		type = xa_to_value(entry) & 0xFF;
> +		attrs[i].attr.name = kasprintf(GFP_KERNEL, "0x%04lX:%02lX", vid, type);
> +		if (!attrs[i].attr.name) {
> +			ret = -ENOMEM;
> +			goto fail;
> +		}
> +
> +		attrs[i].attr.mode = 0444;
> +		attrs[i].show = NULL;
> +
> +		ret = sysfs_add_file_to_group(&dev->kobj, &attrs[i].attr,
> +					      pci_dev_doe_proto_group.name);
> +		if (ret)
> +			goto fail;
> +	}
> +
> +	return 0;
> +
> +fail:
> +	pci_doe_sysfs_remove_desc(doe_mb);
> +	return ret;
> +}
> +
> +int doe_sysfs_init(struct pci_dev *pdev)
> +{
> +	unsigned long index, j;
> +	int ret;
> +	struct pci_doe_mb *doe_mb;
> +	unsigned long total_protos = 0;
> +	void *entry;
> +

cosmetic: same as above feel free to ignore this comment :-

	unsigned long total_protos = 0;
	struct pci_doe_mb *doe_mb;
	unsigned long index, j;
	void *entry;
	int ret;

> +	xa_for_each(&pdev->doe_mbs, index, doe_mb) {
> +		xa_for_each(&doe_mb->prots, j, entry)
> +			total_protos++;
> +	}
> +
> +	if (total_protos == 0)
> +		return 0;
> +
> +	ret = devm_device_add_group(&pdev->dev, &pci_dev_doe_proto_group);
> +	if (ret) {
> +		pci_err(pdev, "can't create DOE goup: %d\n", ret);
> +		return ret;
> +	}
> +
> +	xa_for_each(&pdev->doe_mbs, index, doe_mb) {
> +		ret = pci_doe_sysfs_proto_supports(pdev, doe_mb);
> +
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return 0;
> +}
> +#endif
> +
>   static int pci_doe_wait(struct pci_doe_mb *doe_mb, unsigned long timeout)
>   {
>   	if (wait_event_timeout(doe_mb->wq,
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index ab32a91f287b..cb25aba081bc 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -16,6 +16,7 @@
>   #include <linux/kernel.h>
>   #include <linux/sched.h>
>   #include <linux/pci.h>
> +#include <linux/pci-doe.h>
>   #include <linux/stat.h>
>   #include <linux/export.h>
>   #include <linux/topology.h>
> @@ -1226,6 +1227,12 @@ static int pci_create_resource_files(struct pci_dev *pdev)
>   	int i;
>   	int retval;
>   
> +#ifdef CONFIG_PCI_DOE
> +	retval = doe_sysfs_init(pdev);
> +	if (retval)
> +		return retval;
> +#endif
> +

how about following? unless there is a bug then ignore this comment ..

	if (IS_ENABLED(CONFIG_PCI_DOE)) {
		retval = doe_sysfs_init(pdev);
		if (retval)
			return retval;
	}

>   	/* Expose the PCI resources from this device as files */
>   	for (i = 0; i < PCI_STD_NUM_BARS; i++) {
>   
> diff --git a/include/linux/pci-doe.h b/include/linux/pci-doe.h
> index 1f14aed4354b..4cc13d9ccb50 100644
> --- a/include/linux/pci-doe.h
> +++ b/include/linux/pci-doe.h
> @@ -22,4 +22,5 @@ int pci_doe(struct pci_doe_mb *doe_mb, u16 vendor, u8 type,
>   	    const void *request, size_t request_sz,
>   	    void *response, size_t response_sz);
>   
> +int doe_sysfs_init(struct pci_dev *pci_dev);
>   #endif

-ck
Greg Kroah-Hartman Aug. 10, 2023, 5:05 a.m. UTC | #3
On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> The PCIe 6 specification added support for the Data Object Exchange (DOE).
> When DOE is supported the Discovery Data Object Protocol must be
> implemented. The protocol allows a requester to obtain information about
> the other DOE protocols supported by the device.
> 
> The kernel is already querying the DOE protocols supported and cacheing
> the values. This patch exposes the values via sysfs. This will allow
> userspace to determine which DOE protocols are supported by the PCIe
> device.
> 
> By exposing the information to userspace tools like lspci can relay the
> information to users. By listing all of the supported protocols we can
> allow userspace to parse and support the list, which might include
> vendor specific protocols as well as yet to be supported protocols.
> 
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> ---
> v3:
>  - Expose each DOE feature as a separate file

But you don't actually have anything in the sysfs files, why not?

> --- a/drivers/pci/doe.c
> +++ b/drivers/pci/doe.c
> @@ -56,6 +56,10 @@ struct pci_doe_mb {
>  	wait_queue_head_t wq;
>  	struct workqueue_struct *work_queue;
>  	unsigned long flags;
> +
> +#ifdef CONFIG_SYSFS
> +	struct device_attribute *sysfs_attrs;
> +#endif

Please don't put #ifdefs in .c files if you can prevent it.  I think
this will work just fine if you don't have the #ifdef.  And who would be
using pci without sysfs?

> +		attrs[i].attr.mode = 0444;
> +		attrs[i].show = NULL;

Why set to NULL something that is already NULL?  Did you just forget to
actually set the proper show callback here?

> +#ifdef CONFIG_PCI_DOE
> +	retval = doe_sysfs_init(pdev);
> +	if (retval)
> +		return retval;
> +#endif

Again, no #ifdef in .c files please, put this in the .h file like
normal.

thanks,

greg k-h
Lukas Wunner Aug. 10, 2023, 7:34 a.m. UTC | #4
On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1226,6 +1227,12 @@ static int pci_create_resource_files(struct pci_dev *pdev)
>  	int i;
>  	int retval;
>  
> +#ifdef CONFIG_PCI_DOE
> +	retval = doe_sysfs_init(pdev);
> +	if (retval)
> +		return retval;
> +#endif
> +

The preferred way to expose PCI sysfs attributes nowadays is to add them
to pci_dev_attr_groups[] and use the ->is_visible callback to check
whether they're applicable to a particular pci_dev.  The alternative
via pci_create_resource_files() has race conditions which I think
still haven't been fixed. Bjorn recommended the ->is_visible approach
in response to the most recent attempt to fix the race:

https://lore.kernel.org/linux-pci/20230427161458.GA249886@bhelgaas/

To avoid (most of) the #ifdefs, you may want to consider adding a
doe-sysfs.c file like I've done for cma in this commit:

https://github.com/l1k/linux/commit/b057e2fb0ee0

Thanks,

Lukas
Lukas Wunner Aug. 10, 2023, 7:44 a.m. UTC | #5
On Thu, Aug 10, 2023 at 07:05:12AM +0200, Greg KH wrote:
> On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> > v3:
> >  - Expose each DOE feature as a separate file
> 
> But you don't actually have anything in the sysfs files, why not?

He wants to expose a list of supported protocols.

He first exposed the list in a single attribute, separated by newlines.
Which made sense because it allows users to grep for a specific protocol.

You told him not to expose multiple values in a single attribute.
So he's exposing the available protocols each in an empty file.
The file name contains the protocol.

You got what you asked for. ;)


> > --- a/drivers/pci/doe.c
> > +++ b/drivers/pci/doe.c
> > @@ -56,6 +56,10 @@ struct pci_doe_mb {
> >  	wait_queue_head_t wq;
> >  	struct workqueue_struct *work_queue;
> >  	unsigned long flags;
> > +
> > +#ifdef CONFIG_SYSFS
> > +	struct device_attribute *sysfs_attrs;
> > +#endif
> 
> Please don't put #ifdefs in .c files if you can prevent it.  I think
> this will work just fine if you don't have the #ifdef.  And who would be
> using pci without sysfs?

People with space-constrained devices such as routers.

It is perfectly legal to compile a kernel with CONFIG_PCI=y and
CONFIG_SYSFS=n.

And it is reasonable not to include code in the kernel which has
specifically been deselected in the kernel config.

Thanks,

Lukas
Greg Kroah-Hartman Aug. 10, 2023, 7:48 a.m. UTC | #6
On Thu, Aug 10, 2023 at 09:44:44AM +0200, Lukas Wunner wrote:
> On Thu, Aug 10, 2023 at 07:05:12AM +0200, Greg KH wrote:
> > On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> > > v3:
> > >  - Expose each DOE feature as a separate file
> > 
> > But you don't actually have anything in the sysfs files, why not?
> 
> He wants to expose a list of supported protocols.
> 
> He first exposed the list in a single attribute, separated by newlines.
> Which made sense because it allows users to grep for a specific protocol.
> 
> You told him not to expose multiple values in a single attribute.
> So he's exposing the available protocols each in an empty file.
> The file name contains the protocol.
> 
> You got what you asked for. ;)

But that's not what was documented, it should say "empty file",
otherwise this is going to be very odd when people try to read a file
that is marked as readable, but yet returns an error :(

> > > --- a/drivers/pci/doe.c
> > > +++ b/drivers/pci/doe.c
> > > @@ -56,6 +56,10 @@ struct pci_doe_mb {
> > >  	wait_queue_head_t wq;
> > >  	struct workqueue_struct *work_queue;
> > >  	unsigned long flags;
> > > +
> > > +#ifdef CONFIG_SYSFS
> > > +	struct device_attribute *sysfs_attrs;
> > > +#endif
> > 
> > Please don't put #ifdefs in .c files if you can prevent it.  I think
> > this will work just fine if you don't have the #ifdef.  And who would be
> > using pci without sysfs?
> 
> People with space-constrained devices such as routers.

So the extra pointer here is a real problem for them?  And how much
memory are you saving?

> It is perfectly legal to compile a kernel with CONFIG_PCI=y and
> CONFIG_SYSFS=n.

Sure, just not common.

> And it is reasonable not to include code in the kernel which has
> specifically been deselected in the kernel config.

Sure, but not at the expense of a zillion #ifdef lines :)

thanks,

greg k-h
Alistair Francis Aug. 10, 2023, 3:34 p.m. UTC | #7
On Thu, Aug 10, 2023 at 3:34 AM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> > --- a/drivers/pci/pci-sysfs.c
> > +++ b/drivers/pci/pci-sysfs.c
> > @@ -1226,6 +1227,12 @@ static int pci_create_resource_files(struct pci_dev *pdev)
> >       int i;
> >       int retval;
> >
> > +#ifdef CONFIG_PCI_DOE
> > +     retval = doe_sysfs_init(pdev);
> > +     if (retval)
> > +             return retval;
> > +#endif
> > +
>
> The preferred way to expose PCI sysfs attributes nowadays is to add them
> to pci_dev_attr_groups[] and use the ->is_visible callback to check
> whether they're applicable to a particular pci_dev.  The alternative
> via pci_create_resource_files() has race conditions which I think
> still haven't been fixed. Bjorn recommended the ->is_visible approach
> in response to the most recent attempt to fix the race:
>
> https://lore.kernel.org/linux-pci/20230427161458.GA249886@bhelgaas/

The is_visible doen't seem to work in this case.

AFAIK is_visible only applies to the attributes under the group. Which
means that every PCIe device will see a `doe_protos` directory, no
matter if DOE is supported.

On top of that is_visible() is only called
(fs/sysfs/group.c:create_files()) if there are sub attrs, which we
don't have. We will possibly end up with some attributes, but all of
them are dynamic. So we still end up needing to call doe_sysfs_init()
anyway.

Alistair

>
> To avoid (most of) the #ifdefs, you may want to consider adding a
> doe-sysfs.c file like I've done for cma in this commit:
>
> https://github.com/l1k/linux/commit/b057e2fb0ee0
>
> Thanks,
>
> Lukas
Lukas Wunner Aug. 12, 2023, 8:15 a.m. UTC | #8
On Thu, Aug 10, 2023 at 11:34:11AM -0400, Alistair Francis wrote:
> On Thu, Aug 10, 2023 at 3:34???AM Lukas Wunner <lukas@wunner.de> wrote:
> > On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> > > --- a/drivers/pci/pci-sysfs.c
> > > +++ b/drivers/pci/pci-sysfs.c
> > > @@ -1226,6 +1227,12 @@ static int pci_create_resource_files(struct pci_dev *pdev)
> > >       int i;
> > >       int retval;
> > >
> > > +#ifdef CONFIG_PCI_DOE
> > > +     retval = doe_sysfs_init(pdev);
> > > +     if (retval)
> > > +             return retval;
> > > +#endif
> > > +
> >
> > The preferred way to expose PCI sysfs attributes nowadays is to add them
> > to pci_dev_attr_groups[] and use the ->is_visible callback to check
> > whether they're applicable to a particular pci_dev.  The alternative
> > via pci_create_resource_files() has race conditions which I think
> > still haven't been fixed. Bjorn recommended the ->is_visible approach
> > in response to the most recent attempt to fix the race:
> >
> > https://lore.kernel.org/linux-pci/20230427161458.GA249886@bhelgaas/
> 
> The is_visible doen't seem to work in this case.
> 
> AFAIK is_visible only applies to the attributes under the group. Which
> means that every PCIe device will see a `doe_protos` directory, no
> matter if DOE is supported.

internal_create_group() in fs/sysfs/group.c does this:

	if (grp->name) {
			...
			kn = kernfs_create_dir_ns(kobj->sd, grp->name, ...

So I'm under the impression that if you set the ->name member of
struct attribute_group, the attributes in that group appear under
a directory of that name.

In fact, the kernel-doc for struct attribute_group claims as much:

 * struct attribute_group - data structure used to declare an attribute group.
 * @name:	Optional: Attribute group name
 *		If specified, the attribute group will be created in
 *		a new subdirectory with this name.

So I don't quite understand why you think that "every PCIe device will
see a `doe_protos` directory, no matter if DOE is supported"?

Am I missing something?

Thanks,

Lukas
Greg Kroah-Hartman Aug. 12, 2023, 8:26 a.m. UTC | #9
On Sat, Aug 12, 2023 at 10:15:26AM +0200, Lukas Wunner wrote:
> On Thu, Aug 10, 2023 at 11:34:11AM -0400, Alistair Francis wrote:
> > On Thu, Aug 10, 2023 at 3:34???AM Lukas Wunner <lukas@wunner.de> wrote:
> > > On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> > > > --- a/drivers/pci/pci-sysfs.c
> > > > +++ b/drivers/pci/pci-sysfs.c
> > > > @@ -1226,6 +1227,12 @@ static int pci_create_resource_files(struct pci_dev *pdev)
> > > >       int i;
> > > >       int retval;
> > > >
> > > > +#ifdef CONFIG_PCI_DOE
> > > > +     retval = doe_sysfs_init(pdev);
> > > > +     if (retval)
> > > > +             return retval;
> > > > +#endif
> > > > +
> > >
> > > The preferred way to expose PCI sysfs attributes nowadays is to add them
> > > to pci_dev_attr_groups[] and use the ->is_visible callback to check
> > > whether they're applicable to a particular pci_dev.  The alternative
> > > via pci_create_resource_files() has race conditions which I think
> > > still haven't been fixed. Bjorn recommended the ->is_visible approach
> > > in response to the most recent attempt to fix the race:
> > >
> > > https://lore.kernel.org/linux-pci/20230427161458.GA249886@bhelgaas/
> > 
> > The is_visible doen't seem to work in this case.
> > 
> > AFAIK is_visible only applies to the attributes under the group. Which
> > means that every PCIe device will see a `doe_protos` directory, no
> > matter if DOE is supported.
> 
> internal_create_group() in fs/sysfs/group.c does this:
> 
> 	if (grp->name) {
> 			...
> 			kn = kernfs_create_dir_ns(kobj->sd, grp->name, ...
> 
> So I'm under the impression that if you set the ->name member of
> struct attribute_group, the attributes in that group appear under
> a directory of that name.
> 
> In fact, the kernel-doc for struct attribute_group claims as much:
> 
>  * struct attribute_group - data structure used to declare an attribute group.
>  * @name:	Optional: Attribute group name
>  *		If specified, the attribute group will be created in
>  *		a new subdirectory with this name.
> 
> So I don't quite understand why you think that "every PCIe device will
> see a `doe_protos` directory, no matter if DOE is supported"?
> 
> Am I missing something?

I think the issue might be that the directory will be created even if no
attributes are present in it due to the is_visable() check not returning
any valid files?

If so, I had a patch somewhere around here where I was trying to fix
that up:
	https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/commit/?h=debugfs_cleanup&id=f670945dfbaf353fe068544c31e3fa45575da5b5
but it didn't seem to work properly and kept crashing.  I didn't spend
much time on looking into it, but if this is an issue, I can work on
fixing this properly.

thanks,

greg k-h
Alistair Francis Aug. 15, 2023, 1:44 p.m. UTC | #10
On Sat, Aug 12, 2023 at 4:26 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Sat, Aug 12, 2023 at 10:15:26AM +0200, Lukas Wunner wrote:
> > On Thu, Aug 10, 2023 at 11:34:11AM -0400, Alistair Francis wrote:
> > > On Thu, Aug 10, 2023 at 3:34???AM Lukas Wunner <lukas@wunner.de> wrote:
> > > > On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> > > > > --- a/drivers/pci/pci-sysfs.c
> > > > > +++ b/drivers/pci/pci-sysfs.c
> > > > > @@ -1226,6 +1227,12 @@ static int pci_create_resource_files(struct pci_dev *pdev)
> > > > >       int i;
> > > > >       int retval;
> > > > >
> > > > > +#ifdef CONFIG_PCI_DOE
> > > > > +     retval = doe_sysfs_init(pdev);
> > > > > +     if (retval)
> > > > > +             return retval;
> > > > > +#endif
> > > > > +
> > > >
> > > > The preferred way to expose PCI sysfs attributes nowadays is to add them
> > > > to pci_dev_attr_groups[] and use the ->is_visible callback to check
> > > > whether they're applicable to a particular pci_dev.  The alternative
> > > > via pci_create_resource_files() has race conditions which I think
> > > > still haven't been fixed. Bjorn recommended the ->is_visible approach
> > > > in response to the most recent attempt to fix the race:
> > > >
> > > > https://lore.kernel.org/linux-pci/20230427161458.GA249886@bhelgaas/
> > >
> > > The is_visible doen't seem to work in this case.
> > >
> > > AFAIK is_visible only applies to the attributes under the group. Which
> > > means that every PCIe device will see a `doe_protos` directory, no
> > > matter if DOE is supported.
> >
> > internal_create_group() in fs/sysfs/group.c does this:
> >
> >       if (grp->name) {
> >                       ...
> >                       kn = kernfs_create_dir_ns(kobj->sd, grp->name, ...
> >
> > So I'm under the impression that if you set the ->name member of
> > struct attribute_group, the attributes in that group appear under
> > a directory of that name.
> >
> > In fact, the kernel-doc for struct attribute_group claims as much:
> >
> >  * struct attribute_group - data structure used to declare an attribute group.
> >  * @name:     Optional: Attribute group name
> >  *            If specified, the attribute group will be created in
> >  *            a new subdirectory with this name.
> >
> > So I don't quite understand why you think that "every PCIe device will
> > see a `doe_protos` directory, no matter if DOE is supported"?
> >
> > Am I missing something?
>
> I think the issue might be that the directory will be created even if no
> attributes are present in it due to the is_visable() check not returning
> any valid files?

Yes, that's what I'm seeing. I see the directory for all PCIe devices

This is a WIP that I had:
https://github.com/alistair23/linux/commit/61925cd174c31386eaa7e51e3a1be606b38f847c

>
> If so, I had a patch somewhere around here where I was trying to fix
> that up:
>         https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/commit/?h=debugfs_cleanup&id=f670945dfbaf353fe068544c31e3fa45575da5b5
> but it didn't seem to work properly and kept crashing.  I didn't spend
> much time on looking into it, but if this is an issue, I can work on
> fixing this properly.

That patch sounds like it would fix the issue of empty directories
that I'm seeing. Do you mind fixing it up properly?

Then I just need to address the race condition and we can switch to
using .is_visible()

Alistair

>
> thanks,
>
> greg k-h
Greg Kroah-Hartman Aug. 15, 2023, 3:11 p.m. UTC | #11
On Tue, Aug 15, 2023 at 09:44:32AM -0400, Alistair Francis wrote:
> On Sat, Aug 12, 2023 at 4:26 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Sat, Aug 12, 2023 at 10:15:26AM +0200, Lukas Wunner wrote:
> > > On Thu, Aug 10, 2023 at 11:34:11AM -0400, Alistair Francis wrote:
> > > > On Thu, Aug 10, 2023 at 3:34???AM Lukas Wunner <lukas@wunner.de> wrote:
> > > > > On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> > > > > > --- a/drivers/pci/pci-sysfs.c
> > > > > > +++ b/drivers/pci/pci-sysfs.c
> > > > > > @@ -1226,6 +1227,12 @@ static int pci_create_resource_files(struct pci_dev *pdev)
> > > > > >       int i;
> > > > > >       int retval;
> > > > > >
> > > > > > +#ifdef CONFIG_PCI_DOE
> > > > > > +     retval = doe_sysfs_init(pdev);
> > > > > > +     if (retval)
> > > > > > +             return retval;
> > > > > > +#endif
> > > > > > +
> > > > >
> > > > > The preferred way to expose PCI sysfs attributes nowadays is to add them
> > > > > to pci_dev_attr_groups[] and use the ->is_visible callback to check
> > > > > whether they're applicable to a particular pci_dev.  The alternative
> > > > > via pci_create_resource_files() has race conditions which I think
> > > > > still haven't been fixed. Bjorn recommended the ->is_visible approach
> > > > > in response to the most recent attempt to fix the race:
> > > > >
> > > > > https://lore.kernel.org/linux-pci/20230427161458.GA249886@bhelgaas/
> > > >
> > > > The is_visible doen't seem to work in this case.
> > > >
> > > > AFAIK is_visible only applies to the attributes under the group. Which
> > > > means that every PCIe device will see a `doe_protos` directory, no
> > > > matter if DOE is supported.
> > >
> > > internal_create_group() in fs/sysfs/group.c does this:
> > >
> > >       if (grp->name) {
> > >                       ...
> > >                       kn = kernfs_create_dir_ns(kobj->sd, grp->name, ...
> > >
> > > So I'm under the impression that if you set the ->name member of
> > > struct attribute_group, the attributes in that group appear under
> > > a directory of that name.
> > >
> > > In fact, the kernel-doc for struct attribute_group claims as much:
> > >
> > >  * struct attribute_group - data structure used to declare an attribute group.
> > >  * @name:     Optional: Attribute group name
> > >  *            If specified, the attribute group will be created in
> > >  *            a new subdirectory with this name.
> > >
> > > So I don't quite understand why you think that "every PCIe device will
> > > see a `doe_protos` directory, no matter if DOE is supported"?
> > >
> > > Am I missing something?
> >
> > I think the issue might be that the directory will be created even if no
> > attributes are present in it due to the is_visable() check not returning
> > any valid files?
> 
> Yes, that's what I'm seeing. I see the directory for all PCIe devices
> 
> This is a WIP that I had:
> https://github.com/alistair23/linux/commit/61925cd174c31386eaa7e51e3a1be606b38f847c
> 
> >
> > If so, I had a patch somewhere around here where I was trying to fix
> > that up:
> >         https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/commit/?h=debugfs_cleanup&id=f670945dfbaf353fe068544c31e3fa45575da5b5
> > but it didn't seem to work properly and kept crashing.  I didn't spend
> > much time on looking into it, but if this is an issue, I can work on
> > fixing this properly.
> 
> That patch sounds like it would fix the issue of empty directories
> that I'm seeing. Do you mind fixing it up properly?

I am currently unable to do so due to travel and stuff for a few weeks,
sorry.  Feel free to take it and fix the boot crash that is seen with it
and make it part of your patch series if you can't wait that long.

thanks,

greg k-h
Alistair Francis Aug. 15, 2023, 7:50 p.m. UTC | #12
On Tue, Aug 15, 2023 at 11:11 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Tue, Aug 15, 2023 at 09:44:32AM -0400, Alistair Francis wrote:
> > On Sat, Aug 12, 2023 at 4:26 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Sat, Aug 12, 2023 at 10:15:26AM +0200, Lukas Wunner wrote:
> > > > On Thu, Aug 10, 2023 at 11:34:11AM -0400, Alistair Francis wrote:
> > > > > On Thu, Aug 10, 2023 at 3:34???AM Lukas Wunner <lukas@wunner.de> wrote:
> > > > > > On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> > > > > > > --- a/drivers/pci/pci-sysfs.c
> > > > > > > +++ b/drivers/pci/pci-sysfs.c
> > > > > > > @@ -1226,6 +1227,12 @@ static int pci_create_resource_files(struct pci_dev *pdev)
> > > > > > >       int i;
> > > > > > >       int retval;
> > > > > > >
> > > > > > > +#ifdef CONFIG_PCI_DOE
> > > > > > > +     retval = doe_sysfs_init(pdev);
> > > > > > > +     if (retval)
> > > > > > > +             return retval;
> > > > > > > +#endif
> > > > > > > +
> > > > > >
> > > > > > The preferred way to expose PCI sysfs attributes nowadays is to add them
> > > > > > to pci_dev_attr_groups[] and use the ->is_visible callback to check
> > > > > > whether they're applicable to a particular pci_dev.  The alternative
> > > > > > via pci_create_resource_files() has race conditions which I think
> > > > > > still haven't been fixed. Bjorn recommended the ->is_visible approach
> > > > > > in response to the most recent attempt to fix the race:
> > > > > >
> > > > > > https://lore.kernel.org/linux-pci/20230427161458.GA249886@bhelgaas/
> > > > >
> > > > > The is_visible doen't seem to work in this case.
> > > > >
> > > > > AFAIK is_visible only applies to the attributes under the group. Which
> > > > > means that every PCIe device will see a `doe_protos` directory, no
> > > > > matter if DOE is supported.
> > > >
> > > > internal_create_group() in fs/sysfs/group.c does this:
> > > >
> > > >       if (grp->name) {
> > > >                       ...
> > > >                       kn = kernfs_create_dir_ns(kobj->sd, grp->name, ...
> > > >
> > > > So I'm under the impression that if you set the ->name member of
> > > > struct attribute_group, the attributes in that group appear under
> > > > a directory of that name.
> > > >
> > > > In fact, the kernel-doc for struct attribute_group claims as much:
> > > >
> > > >  * struct attribute_group - data structure used to declare an attribute group.
> > > >  * @name:     Optional: Attribute group name
> > > >  *            If specified, the attribute group will be created in
> > > >  *            a new subdirectory with this name.
> > > >
> > > > So I don't quite understand why you think that "every PCIe device will
> > > > see a `doe_protos` directory, no matter if DOE is supported"?
> > > >
> > > > Am I missing something?
> > >
> > > I think the issue might be that the directory will be created even if no
> > > attributes are present in it due to the is_visable() check not returning
> > > any valid files?
> >
> > Yes, that's what I'm seeing. I see the directory for all PCIe devices
> >
> > This is a WIP that I had:
> > https://github.com/alistair23/linux/commit/61925cd174c31386eaa7e51e3a1be606b38f847c
> >
> > >
> > > If so, I had a patch somewhere around here where I was trying to fix
> > > that up:
> > >         https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/commit/?h=debugfs_cleanup&id=f670945dfbaf353fe068544c31e3fa45575da5b5
> > > but it didn't seem to work properly and kept crashing.  I didn't spend
> > > much time on looking into it, but if this is an issue, I can work on
> > > fixing this properly.
> >
> > That patch sounds like it would fix the issue of empty directories
> > that I'm seeing. Do you mind fixing it up properly?
>
> I am currently unable to do so due to travel and stuff for a few weeks,
> sorry.  Feel free to take it and fix the boot crash that is seen with it
> and make it part of your patch series if you can't wait that long.

No worries.

It's much harder than I first thought though. There are currently lots
of users who expect the group to remain even if empty, as they
dynamically add/merge properties later. Which is what we end up doing
for DOE as well

I'll keep looking into this and see if I can figure something out.

Would an .attr_is_visible() function pointer for struct
attribute_group something that the kernel would accept?

Alistair

>
> thanks,
>
> greg k-h
Greg Kroah-Hartman Aug. 15, 2023, 8:10 p.m. UTC | #13
On Tue, Aug 15, 2023 at 03:50:31PM -0400, Alistair Francis wrote:
> On Tue, Aug 15, 2023 at 11:11 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Aug 15, 2023 at 09:44:32AM -0400, Alistair Francis wrote:
> > > On Sat, Aug 12, 2023 at 4:26 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Sat, Aug 12, 2023 at 10:15:26AM +0200, Lukas Wunner wrote:
> > > > > On Thu, Aug 10, 2023 at 11:34:11AM -0400, Alistair Francis wrote:
> > > > > > On Thu, Aug 10, 2023 at 3:34???AM Lukas Wunner <lukas@wunner.de> wrote:
> > > > > > > On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> > > > > > > > --- a/drivers/pci/pci-sysfs.c
> > > > > > > > +++ b/drivers/pci/pci-sysfs.c
> > > > > > > > @@ -1226,6 +1227,12 @@ static int pci_create_resource_files(struct pci_dev *pdev)
> > > > > > > >       int i;
> > > > > > > >       int retval;
> > > > > > > >
> > > > > > > > +#ifdef CONFIG_PCI_DOE
> > > > > > > > +     retval = doe_sysfs_init(pdev);
> > > > > > > > +     if (retval)
> > > > > > > > +             return retval;
> > > > > > > > +#endif
> > > > > > > > +
> > > > > > >
> > > > > > > The preferred way to expose PCI sysfs attributes nowadays is to add them
> > > > > > > to pci_dev_attr_groups[] and use the ->is_visible callback to check
> > > > > > > whether they're applicable to a particular pci_dev.  The alternative
> > > > > > > via pci_create_resource_files() has race conditions which I think
> > > > > > > still haven't been fixed. Bjorn recommended the ->is_visible approach
> > > > > > > in response to the most recent attempt to fix the race:
> > > > > > >
> > > > > > > https://lore.kernel.org/linux-pci/20230427161458.GA249886@bhelgaas/
> > > > > >
> > > > > > The is_visible doen't seem to work in this case.
> > > > > >
> > > > > > AFAIK is_visible only applies to the attributes under the group. Which
> > > > > > means that every PCIe device will see a `doe_protos` directory, no
> > > > > > matter if DOE is supported.
> > > > >
> > > > > internal_create_group() in fs/sysfs/group.c does this:
> > > > >
> > > > >       if (grp->name) {
> > > > >                       ...
> > > > >                       kn = kernfs_create_dir_ns(kobj->sd, grp->name, ...
> > > > >
> > > > > So I'm under the impression that if you set the ->name member of
> > > > > struct attribute_group, the attributes in that group appear under
> > > > > a directory of that name.
> > > > >
> > > > > In fact, the kernel-doc for struct attribute_group claims as much:
> > > > >
> > > > >  * struct attribute_group - data structure used to declare an attribute group.
> > > > >  * @name:     Optional: Attribute group name
> > > > >  *            If specified, the attribute group will be created in
> > > > >  *            a new subdirectory with this name.
> > > > >
> > > > > So I don't quite understand why you think that "every PCIe device will
> > > > > see a `doe_protos` directory, no matter if DOE is supported"?
> > > > >
> > > > > Am I missing something?
> > > >
> > > > I think the issue might be that the directory will be created even if no
> > > > attributes are present in it due to the is_visable() check not returning
> > > > any valid files?
> > >
> > > Yes, that's what I'm seeing. I see the directory for all PCIe devices
> > >
> > > This is a WIP that I had:
> > > https://github.com/alistair23/linux/commit/61925cd174c31386eaa7e51e3a1be606b38f847c
> > >
> > > >
> > > > If so, I had a patch somewhere around here where I was trying to fix
> > > > that up:
> > > >         https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/commit/?h=debugfs_cleanup&id=f670945dfbaf353fe068544c31e3fa45575da5b5
> > > > but it didn't seem to work properly and kept crashing.  I didn't spend
> > > > much time on looking into it, but if this is an issue, I can work on
> > > > fixing this properly.
> > >
> > > That patch sounds like it would fix the issue of empty directories
> > > that I'm seeing. Do you mind fixing it up properly?
> >
> > I am currently unable to do so due to travel and stuff for a few weeks,
> > sorry.  Feel free to take it and fix the boot crash that is seen with it
> > and make it part of your patch series if you can't wait that long.
> 
> No worries.
> 
> It's much harder than I first thought though. There are currently lots
> of users who expect the group to remain even if empty, as they
> dynamically add/merge properties later. Which is what we end up doing
> for DOE as well
> 
> I'll keep looking into this and see if I can figure something out.

Yeah, now that I think about it, that's where stuff fell apart for me as
well.  We should be able to create the group and then only create the
file when they are added/merged, so I bet I missed a codepath somewhere.

> Would an .attr_is_visible() function pointer for struct
> attribute_group something that the kernel would accept?

Worst case, yes, that would be acceptable.  But try to see where I
messed up on the original patch, it should be able to be done
automatically somehow...

thanks,

greg k-h
Alistair Francis Aug. 17, 2023, 7:41 p.m. UTC | #14
On Tue, Aug 15, 2023 at 4:10 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Tue, Aug 15, 2023 at 03:50:31PM -0400, Alistair Francis wrote:
> > On Tue, Aug 15, 2023 at 11:11 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Tue, Aug 15, 2023 at 09:44:32AM -0400, Alistair Francis wrote:
> > > > On Sat, Aug 12, 2023 at 4:26 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > >
> > > > > On Sat, Aug 12, 2023 at 10:15:26AM +0200, Lukas Wunner wrote:
> > > > > > On Thu, Aug 10, 2023 at 11:34:11AM -0400, Alistair Francis wrote:
> > > > > > > On Thu, Aug 10, 2023 at 3:34???AM Lukas Wunner <lukas@wunner.de> wrote:
> > > > > > > > On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> > > > > > > > > --- a/drivers/pci/pci-sysfs.c
> > > > > > > > > +++ b/drivers/pci/pci-sysfs.c
> > > > > > > > > @@ -1226,6 +1227,12 @@ static int pci_create_resource_files(struct pci_dev *pdev)
> > > > > > > > >       int i;
> > > > > > > > >       int retval;
> > > > > > > > >
> > > > > > > > > +#ifdef CONFIG_PCI_DOE
> > > > > > > > > +     retval = doe_sysfs_init(pdev);
> > > > > > > > > +     if (retval)
> > > > > > > > > +             return retval;
> > > > > > > > > +#endif
> > > > > > > > > +
> > > > > > > >
> > > > > > > > The preferred way to expose PCI sysfs attributes nowadays is to add them
> > > > > > > > to pci_dev_attr_groups[] and use the ->is_visible callback to check
> > > > > > > > whether they're applicable to a particular pci_dev.  The alternative
> > > > > > > > via pci_create_resource_files() has race conditions which I think
> > > > > > > > still haven't been fixed. Bjorn recommended the ->is_visible approach
> > > > > > > > in response to the most recent attempt to fix the race:
> > > > > > > >
> > > > > > > > https://lore.kernel.org/linux-pci/20230427161458.GA249886@bhelgaas/
> > > > > > >
> > > > > > > The is_visible doen't seem to work in this case.
> > > > > > >
> > > > > > > AFAIK is_visible only applies to the attributes under the group. Which
> > > > > > > means that every PCIe device will see a `doe_protos` directory, no
> > > > > > > matter if DOE is supported.
> > > > > >
> > > > > > internal_create_group() in fs/sysfs/group.c does this:
> > > > > >
> > > > > >       if (grp->name) {
> > > > > >                       ...
> > > > > >                       kn = kernfs_create_dir_ns(kobj->sd, grp->name, ...
> > > > > >
> > > > > > So I'm under the impression that if you set the ->name member of
> > > > > > struct attribute_group, the attributes in that group appear under
> > > > > > a directory of that name.
> > > > > >
> > > > > > In fact, the kernel-doc for struct attribute_group claims as much:
> > > > > >
> > > > > >  * struct attribute_group - data structure used to declare an attribute group.
> > > > > >  * @name:     Optional: Attribute group name
> > > > > >  *            If specified, the attribute group will be created in
> > > > > >  *            a new subdirectory with this name.
> > > > > >
> > > > > > So I don't quite understand why you think that "every PCIe device will
> > > > > > see a `doe_protos` directory, no matter if DOE is supported"?
> > > > > >
> > > > > > Am I missing something?
> > > > >
> > > > > I think the issue might be that the directory will be created even if no
> > > > > attributes are present in it due to the is_visable() check not returning
> > > > > any valid files?
> > > >
> > > > Yes, that's what I'm seeing. I see the directory for all PCIe devices
> > > >
> > > > This is a WIP that I had:
> > > > https://github.com/alistair23/linux/commit/61925cd174c31386eaa7e51e3a1be606b38f847c
> > > >
> > > > >
> > > > > If so, I had a patch somewhere around here where I was trying to fix
> > > > > that up:
> > > > >         https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/commit/?h=debugfs_cleanup&id=f670945dfbaf353fe068544c31e3fa45575da5b5
> > > > > but it didn't seem to work properly and kept crashing.  I didn't spend
> > > > > much time on looking into it, but if this is an issue, I can work on
> > > > > fixing this properly.
> > > >
> > > > That patch sounds like it would fix the issue of empty directories
> > > > that I'm seeing. Do you mind fixing it up properly?
> > >
> > > I am currently unable to do so due to travel and stuff for a few weeks,
> > > sorry.  Feel free to take it and fix the boot crash that is seen with it
> > > and make it part of your patch series if you can't wait that long.
> >
> > No worries.
> >
> > It's much harder than I first thought though. There are currently lots
> > of users who expect the group to remain even if empty, as they
> > dynamically add/merge properties later. Which is what we end up doing
> > for DOE as well
> >
> > I'll keep looking into this and see if I can figure something out.
>
> Yeah, now that I think about it, that's where stuff fell apart for me as
> well.  We should be able to create the group and then only create the
> file when they are added/merged, so I bet I missed a codepath somewhere.

Yeah, it's tricky.

The documentation for sysfs_merge_group() specifically says

This function returns an error if the group doesn't exist or any of the
files already exist in that group, in which case none of the new files
are created.

as an empty group isn't created with your patch it doesn't work with
sysfs_merge_group().

I'm assuming we don't want to change those public functions by
creating the group in sysfs_merge_group() if it isn't created.

Creating the group is just creating the directory, so I don't see a
way we can create the group without creating the directory

Alistair

>
> > Would an .attr_is_visible() function pointer for struct
> > attribute_group something that the kernel would accept?
>
> Worst case, yes, that would be acceptable.  But try to see where I
> messed up on the original patch, it should be able to be done
> automatically somehow...
>
> thanks,
>
> greg k-h
Greg Kroah-Hartman Aug. 17, 2023, 8:21 p.m. UTC | #15
On Thu, Aug 17, 2023 at 03:41:23PM -0400, Alistair Francis wrote:
> On Tue, Aug 15, 2023 at 4:10 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Aug 15, 2023 at 03:50:31PM -0400, Alistair Francis wrote:
> > > On Tue, Aug 15, 2023 at 11:11 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Tue, Aug 15, 2023 at 09:44:32AM -0400, Alistair Francis wrote:
> > > > > On Sat, Aug 12, 2023 at 4:26 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > >
> > > > > > On Sat, Aug 12, 2023 at 10:15:26AM +0200, Lukas Wunner wrote:
> > > > > > > On Thu, Aug 10, 2023 at 11:34:11AM -0400, Alistair Francis wrote:
> > > > > > > > On Thu, Aug 10, 2023 at 3:34???AM Lukas Wunner <lukas@wunner.de> wrote:
> > > > > > > > > On Wed, Aug 09, 2023 at 07:28:51PM -0400, Alistair Francis wrote:
> > > > > > > > > > --- a/drivers/pci/pci-sysfs.c
> > > > > > > > > > +++ b/drivers/pci/pci-sysfs.c
> > > > > > > > > > @@ -1226,6 +1227,12 @@ static int pci_create_resource_files(struct pci_dev *pdev)
> > > > > > > > > >       int i;
> > > > > > > > > >       int retval;
> > > > > > > > > >
> > > > > > > > > > +#ifdef CONFIG_PCI_DOE
> > > > > > > > > > +     retval = doe_sysfs_init(pdev);
> > > > > > > > > > +     if (retval)
> > > > > > > > > > +             return retval;
> > > > > > > > > > +#endif
> > > > > > > > > > +
> > > > > > > > >
> > > > > > > > > The preferred way to expose PCI sysfs attributes nowadays is to add them
> > > > > > > > > to pci_dev_attr_groups[] and use the ->is_visible callback to check
> > > > > > > > > whether they're applicable to a particular pci_dev.  The alternative
> > > > > > > > > via pci_create_resource_files() has race conditions which I think
> > > > > > > > > still haven't been fixed. Bjorn recommended the ->is_visible approach
> > > > > > > > > in response to the most recent attempt to fix the race:
> > > > > > > > >
> > > > > > > > > https://lore.kernel.org/linux-pci/20230427161458.GA249886@bhelgaas/
> > > > > > > >
> > > > > > > > The is_visible doen't seem to work in this case.
> > > > > > > >
> > > > > > > > AFAIK is_visible only applies to the attributes under the group. Which
> > > > > > > > means that every PCIe device will see a `doe_protos` directory, no
> > > > > > > > matter if DOE is supported.
> > > > > > >
> > > > > > > internal_create_group() in fs/sysfs/group.c does this:
> > > > > > >
> > > > > > >       if (grp->name) {
> > > > > > >                       ...
> > > > > > >                       kn = kernfs_create_dir_ns(kobj->sd, grp->name, ...
> > > > > > >
> > > > > > > So I'm under the impression that if you set the ->name member of
> > > > > > > struct attribute_group, the attributes in that group appear under
> > > > > > > a directory of that name.
> > > > > > >
> > > > > > > In fact, the kernel-doc for struct attribute_group claims as much:
> > > > > > >
> > > > > > >  * struct attribute_group - data structure used to declare an attribute group.
> > > > > > >  * @name:     Optional: Attribute group name
> > > > > > >  *            If specified, the attribute group will be created in
> > > > > > >  *            a new subdirectory with this name.
> > > > > > >
> > > > > > > So I don't quite understand why you think that "every PCIe device will
> > > > > > > see a `doe_protos` directory, no matter if DOE is supported"?
> > > > > > >
> > > > > > > Am I missing something?
> > > > > >
> > > > > > I think the issue might be that the directory will be created even if no
> > > > > > attributes are present in it due to the is_visable() check not returning
> > > > > > any valid files?
> > > > >
> > > > > Yes, that's what I'm seeing. I see the directory for all PCIe devices
> > > > >
> > > > > This is a WIP that I had:
> > > > > https://github.com/alistair23/linux/commit/61925cd174c31386eaa7e51e3a1be606b38f847c
> > > > >
> > > > > >
> > > > > > If so, I had a patch somewhere around here where I was trying to fix
> > > > > > that up:
> > > > > >         https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/commit/?h=debugfs_cleanup&id=f670945dfbaf353fe068544c31e3fa45575da5b5
> > > > > > but it didn't seem to work properly and kept crashing.  I didn't spend
> > > > > > much time on looking into it, but if this is an issue, I can work on
> > > > > > fixing this properly.
> > > > >
> > > > > That patch sounds like it would fix the issue of empty directories
> > > > > that I'm seeing. Do you mind fixing it up properly?
> > > >
> > > > I am currently unable to do so due to travel and stuff for a few weeks,
> > > > sorry.  Feel free to take it and fix the boot crash that is seen with it
> > > > and make it part of your patch series if you can't wait that long.
> > >
> > > No worries.
> > >
> > > It's much harder than I first thought though. There are currently lots
> > > of users who expect the group to remain even if empty, as they
> > > dynamically add/merge properties later. Which is what we end up doing
> > > for DOE as well
> > >
> > > I'll keep looking into this and see if I can figure something out.
> >
> > Yeah, now that I think about it, that's where stuff fell apart for me as
> > well.  We should be able to create the group and then only create the
> > file when they are added/merged, so I bet I missed a codepath somewhere.
> 
> Yeah, it's tricky.
> 
> The documentation for sysfs_merge_group() specifically says
> 
> This function returns an error if the group doesn't exist or any of the
> files already exist in that group, in which case none of the new files
> are created.
> 
> as an empty group isn't created with your patch it doesn't work with
> sysfs_merge_group().
> 
> I'm assuming we don't want to change those public functions by
> creating the group in sysfs_merge_group() if it isn't created.

Please change away!  We are allowed to change anything we want here, as
long as you fix up all in-tree users of the functions.  Nothing is ever
"frozen" in the in-kernel apis.

> Creating the group is just creating the directory, so I don't see a
> way we can create the group without creating the directory

We can wait to actually create the directory when the first visible file
is present.  That is what my patch was attempting to do (well, it
removed it if non-visible) but as you point out, it's obviously broken :)

So switch it around, only create the directory when it is explicitly
needed.

thanks,

greg k-h
diff mbox series

Patch

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index ecf47559f495..e754b8efdb69 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -500,3 +500,13 @@  Description:
 		console drivers from the device.  Raw users of pci-sysfs
 		resourceN attributes must be terminated prior to resizing.
 		Success of the resizing operation is not guaranteed.
+
+What:		/sys/bus/pci/devices/.../doe_proto
+Date:		July 2023
+Contact:	Linux PCI developers <linux-pci@vger.kernel.org>
+Description:
+		This diectory contains a list of the supported Data Object Exchange (DOE)
+		features. Each feature is a single file.
+		The value comes from the device and specifies the vendor and
+		data object type supported. The lower byte is the data object type and the next
+		two bytes are the vendor ID.
diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
index 1b97a5ab71a9..d5cbcb6a457a 100644
--- a/drivers/pci/doe.c
+++ b/drivers/pci/doe.c
@@ -56,6 +56,10 @@  struct pci_doe_mb {
 	wait_queue_head_t wq;
 	struct workqueue_struct *work_queue;
 	unsigned long flags;
+
+#ifdef CONFIG_SYSFS
+	struct device_attribute *sysfs_attrs;
+#endif
 };
 
 struct pci_doe_protocol {
@@ -92,6 +96,109 @@  struct pci_doe_task {
 	struct pci_doe_mb *doe_mb;
 };
 
+#ifdef CONFIG_SYSFS
+static struct attribute *pci_dev_doe_proto_attrs[] = {
+	NULL,
+};
+
+static const struct attribute_group pci_dev_doe_proto_group = {
+	.name	= "doe_protos",
+	.attrs	= pci_dev_doe_proto_attrs,
+};
+
+static void pci_doe_sysfs_remove_desc(struct pci_doe_mb *doe_mb)
+{
+	struct device_attribute *attrs = doe_mb->sysfs_attrs;
+	unsigned long i;
+	void *entry;
+
+	if (!attrs)
+		return;
+
+	doe_mb->sysfs_attrs = NULL;
+	xa_for_each(&doe_mb->prots, i, entry)
+		kfree(attrs[i].attr.name);
+
+	kfree(attrs);
+}
+
+static int pci_doe_sysfs_proto_supports(struct pci_dev *pdev, struct pci_doe_mb *doe_mb)
+{
+	struct device_attribute *attrs;
+	struct device *dev = &pdev->dev;
+	unsigned long i;
+	int ret;
+	unsigned long num_protos = 0;
+	unsigned long vid, type;
+	void *entry;
+
+	xa_for_each(&doe_mb->prots, i, entry)
+		num_protos++;
+
+	attrs = kcalloc(num_protos, sizeof(*attrs), GFP_KERNEL);
+	if (!attrs)
+		return -ENOMEM;
+
+	doe_mb->sysfs_attrs = attrs;
+	xa_for_each(&doe_mb->prots, i, entry) {
+		sysfs_attr_init(&attrs[i].attr);
+		vid = xa_to_value(entry) >> 8;
+		type = xa_to_value(entry) & 0xFF;
+		attrs[i].attr.name = kasprintf(GFP_KERNEL, "0x%04lX:%02lX", vid, type);
+		if (!attrs[i].attr.name) {
+			ret = -ENOMEM;
+			goto fail;
+		}
+
+		attrs[i].attr.mode = 0444;
+		attrs[i].show = NULL;
+
+		ret = sysfs_add_file_to_group(&dev->kobj, &attrs[i].attr,
+					      pci_dev_doe_proto_group.name);
+		if (ret)
+			goto fail;
+	}
+
+	return 0;
+
+fail:
+	pci_doe_sysfs_remove_desc(doe_mb);
+	return ret;
+}
+
+int doe_sysfs_init(struct pci_dev *pdev)
+{
+	unsigned long index, j;
+	int ret;
+	struct pci_doe_mb *doe_mb;
+	unsigned long total_protos = 0;
+	void *entry;
+
+	xa_for_each(&pdev->doe_mbs, index, doe_mb) {
+		xa_for_each(&doe_mb->prots, j, entry)
+			total_protos++;
+	}
+
+	if (total_protos == 0)
+		return 0;
+
+	ret = devm_device_add_group(&pdev->dev, &pci_dev_doe_proto_group);
+	if (ret) {
+		pci_err(pdev, "can't create DOE goup: %d\n", ret);
+		return ret;
+	}
+
+	xa_for_each(&pdev->doe_mbs, index, doe_mb) {
+		ret = pci_doe_sysfs_proto_supports(pdev, doe_mb);
+
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+#endif
+
 static int pci_doe_wait(struct pci_doe_mb *doe_mb, unsigned long timeout)
 {
 	if (wait_event_timeout(doe_mb->wq,
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index ab32a91f287b..cb25aba081bc 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -16,6 +16,7 @@ 
 #include <linux/kernel.h>
 #include <linux/sched.h>
 #include <linux/pci.h>
+#include <linux/pci-doe.h>
 #include <linux/stat.h>
 #include <linux/export.h>
 #include <linux/topology.h>
@@ -1226,6 +1227,12 @@  static int pci_create_resource_files(struct pci_dev *pdev)
 	int i;
 	int retval;
 
+#ifdef CONFIG_PCI_DOE
+	retval = doe_sysfs_init(pdev);
+	if (retval)
+		return retval;
+#endif
+
 	/* Expose the PCI resources from this device as files */
 	for (i = 0; i < PCI_STD_NUM_BARS; i++) {
 
diff --git a/include/linux/pci-doe.h b/include/linux/pci-doe.h
index 1f14aed4354b..4cc13d9ccb50 100644
--- a/include/linux/pci-doe.h
+++ b/include/linux/pci-doe.h
@@ -22,4 +22,5 @@  int pci_doe(struct pci_doe_mb *doe_mb, u16 vendor, u8 type,
 	    const void *request, size_t request_sz,
 	    void *response, size_t response_sz);
 
+int doe_sysfs_init(struct pci_dev *pci_dev);
 #endif