diff mbox series

[V12,4/9] cxl/pci: Create PCI DOE mailbox's for memory devices

Message ID 20220628041527.742333-5-ira.weiny@intel.com (mailing list archive)
State Superseded
Headers show
Series CXL: Read CDAT and DSMAS data | expand

Commit Message

Ira Weiny June 28, 2022, 4:15 a.m. UTC
From: Ira Weiny <ira.weiny@intel.com>

DOE mailbox objects will be needed for various mailbox communications
with each memory device.

Iterate each DOE mailbox capability and create PCI DOE mailbox objects
as found.

It is not anticipated that this is the final resting place for the
iteration of the DOE devices.  The support of switch ports will drive
this code into the PCIe side.  In this imagined architecture the CXL
port driver would then query into the PCI device for the DOE mailbox
array.

For now creating the mailboxes in the CXL port is good enough for the
endpoints.  Later PCIe ports will need to support this to support switch
ports more generically.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V11:
	Drop review from: Ben Widawsky <bwidawsk@kernel.org>
	Remove irq code for now
	Adjust for pci_doe_get_int_msg_num()
	Adjust for pcim_doe_create_mb()
		(No longer need to handle the destroy.)
	Use xarray for DOE mailbox array

Changes from V9:
	Bug fix: ensure DOE mailboxes are iterated before memdev add
	Ben Widawsky
		Set use_irq to false and just return on error.
		Don't return a value from devm_cxl_pci_create_doe()
		Skip allocating doe_mb array if there are no mailboxes
		Skip requesting irqs if none found.
	Ben/Jonathan Cameron
		s/num_irqs/max_irqs

Changes from V8:
	Move PCI_DOE selection to CXL_BUS to support future patches
	which move queries into the port code.
	Remove Auxiliary device arch
	Squash the functionality of the auxiliary driver into this
	patch.
	Split out the irq handling a bit.

Changes from V7:
	Minor code clean ups
	Rebased on cxl-pending

Changes from V6:
	Move all the auxiliary device stuff to the CXL layer

Changes from V5:
	Split the CXL specific stuff off from the PCI DOE create
	auxiliary device code.
---
 drivers/cxl/Kconfig  |  1 +
 drivers/cxl/cxlmem.h |  3 +++
 drivers/cxl/pci.c    | 37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 41 insertions(+)

Comments

Jonathan Cameron June 28, 2022, 2:33 p.m. UTC | #1
On Mon, 27 Jun 2022 21:15:22 -0700
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> DOE mailbox objects will be needed for various mailbox communications
> with each memory device.
> 
> Iterate each DOE mailbox capability and create PCI DOE mailbox objects
> as found.
> 
> It is not anticipated that this is the final resting place for the
> iteration of the DOE devices.  The support of switch ports will drive
> this code into the PCIe side.  In this imagined architecture the CXL
> port driver would then query into the PCI device for the DOE mailbox
> array.
> 
> For now creating the mailboxes in the CXL port is good enough for the
> endpoints.  Later PCIe ports will need to support this to support switch
> ports more generically.
> 
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Lukas Wunner <lukas@wunner.de>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

My main comment on this is that we should not paper over any errors in
DOE setup.  Those indicate a bug or hardware fault, so like anything similar
we should at very least document why it makes sense to continue.  In most
cases I'd argue it doesn't as something is very wrong.

> 
> ---
> Changes from V11:
> 	Drop review from: Ben Widawsky <bwidawsk@kernel.org>
> 	Remove irq code for now
> 	Adjust for pci_doe_get_int_msg_num()
> 	Adjust for pcim_doe_create_mb()
> 		(No longer need to handle the destroy.)
> 	Use xarray for DOE mailbox array
> 
> Changes from V9:
> 	Bug fix: ensure DOE mailboxes are iterated before memdev add
> 	Ben Widawsky
> 		Set use_irq to false and just return on error.
> 		Don't return a value from devm_cxl_pci_create_doe()
> 		Skip allocating doe_mb array if there are no mailboxes
> 		Skip requesting irqs if none found.
> 	Ben/Jonathan Cameron
> 		s/num_irqs/max_irqs
> 
> Changes from V8:
> 	Move PCI_DOE selection to CXL_BUS to support future patches
> 	which move queries into the port code.
> 	Remove Auxiliary device arch
> 	Squash the functionality of the auxiliary driver into this
> 	patch.
> 	Split out the irq handling a bit.
> 
> Changes from V7:
> 	Minor code clean ups
> 	Rebased on cxl-pending
> 
> Changes from V6:
> 	Move all the auxiliary device stuff to the CXL layer
> 
> Changes from V5:
> 	Split the CXL specific stuff off from the PCI DOE create
> 	auxiliary device code.
> ---
>  drivers/cxl/Kconfig  |  1 +
>  drivers/cxl/cxlmem.h |  3 +++
>  drivers/cxl/pci.c    | 37 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 41 insertions(+)
> 
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index f64e3984689f..7adaaf80b302 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -2,6 +2,7 @@
>  menuconfig CXL_BUS
>  	tristate "CXL (Compute Express Link) Devices Support"
>  	depends on PCI
> +	select PCI_DOE
>  	help
>  	  CXL is a bus that is electrically compatible with PCI Express, but
>  	  layers three protocols on that signalling (CXL.io, CXL.cache, and
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 60d10ee1e7fc..360f282ef80c 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -191,6 +191,7 @@ struct cxl_endpoint_dvsec_info {
>   * @component_reg_phys: register base of component registers
>   * @info: Cached DVSEC information about the device.
>   * @serial: PCIe Device Serial Number
> + * @doe_mbs: PCI DOE mailbox array
>   * @mbox_send: @dev specific transport for transmitting mailbox commands
>   *
>   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> @@ -224,6 +225,8 @@ struct cxl_dev_state {
>  	resource_size_t component_reg_phys;
>  	u64 serial;
>  
> +	struct xarray doe_mbs;
> +
>  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
>  };
>  
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 5a0ae46d4989..5821e6c1253b 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -8,6 +8,7 @@
>  #include <linux/mutex.h>
>  #include <linux/list.h>
>  #include <linux/pci.h>
> +#include <linux/pci-doe.h>
>  #include <linux/io.h>
>  #include "cxlmem.h"
>  #include "cxlpci.h"
> @@ -386,6 +387,37 @@ static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>  	return rc;
>  }
>  
> +static void cxl_pci_destroy_doe(void *mbs)
> +{
> +	struct xarray *xa = mbs;

Local variable doesn't add anything...

> +
> +	xa_destroy(xa);
> +}
> +
> +static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	u16 off = 0;
> +
> +	pci_doe_for_each_off(pdev, off) {
> +		struct pci_doe_mb *doe_mb;
> +
> +		doe_mb = pcim_doe_create_mb(pdev, off, -1);
> +		if (IS_ERR(doe_mb)) {
> +			pci_err(pdev,
> +				"Failed to create MB object for MB @ %x\n",
> +				off);

Definitely at least need a comment for why papering over this failure is
fine.  My gut feeling is we shouldn't ignore it.

> +			doe_mb = NULL;
> +		}
> +
> +		if (xa_insert(&cxlds->doe_mbs, off, doe_mb, GFP_KERNEL))
> +			break;

If we hit that break something has gone horribly wrong and we shouldn't
paper over it either.  We might have a partial list of DOEs and callers
after this will have no way of knowing it isn't the full list.

> +
> +		pci_dbg(pdev, "Created DOE mailbox @%x\n", off);
> +	}
> +}
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct cxl_register_map map;
> @@ -408,6 +440,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (IS_ERR(cxlds))
>  		return PTR_ERR(cxlds);
>  
> +	xa_init(&cxlds->doe_mbs);
> +	devm_add_action(&pdev->dev, cxl_pci_destroy_doe, &cxlds->doe_mbs);

_or_reset()?  If the devm registration itself fails we want to bail out cleanly.
It's vanishingly unlikely to happen, but we should still handle that case.

> +
>  	cxlds->serial = pci_get_dsn(pdev);
>  	cxlds->cxl_dvsec = pci_find_dvsec_capability(
>  		pdev, PCI_DVSEC_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE);
> @@ -434,6 +469,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  
>  	cxlds->component_reg_phys = cxl_regmap_to_base(pdev, &map);
>  
> +	devm_cxl_pci_create_doe(cxlds);
> +
>  	rc = cxl_pci_setup_mailbox(cxlds);
>  	if (rc)
>  		return rc;
Ira Weiny June 30, 2022, 5:32 a.m. UTC | #2
On Tue, Jun 28, 2022 at 03:33:17PM +0100, Jonathan Cameron wrote:
> On Mon, 27 Jun 2022 21:15:22 -0700
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > DOE mailbox objects will be needed for various mailbox communications
> > with each memory device.
> > 
> > Iterate each DOE mailbox capability and create PCI DOE mailbox objects
> > as found.
> > 
> > It is not anticipated that this is the final resting place for the
> > iteration of the DOE devices.  The support of switch ports will drive
> > this code into the PCIe side.  In this imagined architecture the CXL
> > port driver would then query into the PCI device for the DOE mailbox
> > array.
> > 
> > For now creating the mailboxes in the CXL port is good enough for the
> > endpoints.  Later PCIe ports will need to support this to support switch
> > ports more generically.
> > 
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Davidlohr Bueso <dave@stgolabs.net>
> > Cc: Lukas Wunner <lukas@wunner.de>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> My main comment on this is that we should not paper over any errors in
> DOE setup.  Those indicate a bug or hardware fault, so like anything similar
> we should at very least document why it makes sense to continue.  In most
> cases I'd argue it doesn't as something is very wrong.
> 
> > 
> > ---
> > Changes from V11:
> > 	Drop review from: Ben Widawsky <bwidawsk@kernel.org>
> > 	Remove irq code for now
> > 	Adjust for pci_doe_get_int_msg_num()
> > 	Adjust for pcim_doe_create_mb()
> > 		(No longer need to handle the destroy.)
> > 	Use xarray for DOE mailbox array
> > 
> > Changes from V9:
> > 	Bug fix: ensure DOE mailboxes are iterated before memdev add
> > 	Ben Widawsky
> > 		Set use_irq to false and just return on error.
> > 		Don't return a value from devm_cxl_pci_create_doe()
> > 		Skip allocating doe_mb array if there are no mailboxes
> > 		Skip requesting irqs if none found.
> > 	Ben/Jonathan Cameron
> > 		s/num_irqs/max_irqs
> > 
> > Changes from V8:
> > 	Move PCI_DOE selection to CXL_BUS to support future patches
> > 	which move queries into the port code.
> > 	Remove Auxiliary device arch
> > 	Squash the functionality of the auxiliary driver into this
> > 	patch.
> > 	Split out the irq handling a bit.
> > 
> > Changes from V7:
> > 	Minor code clean ups
> > 	Rebased on cxl-pending
> > 
> > Changes from V6:
> > 	Move all the auxiliary device stuff to the CXL layer
> > 
> > Changes from V5:
> > 	Split the CXL specific stuff off from the PCI DOE create
> > 	auxiliary device code.
> > ---
> >  drivers/cxl/Kconfig  |  1 +
> >  drivers/cxl/cxlmem.h |  3 +++
> >  drivers/cxl/pci.c    | 37 +++++++++++++++++++++++++++++++++++++
> >  3 files changed, 41 insertions(+)
> > 
> > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > index f64e3984689f..7adaaf80b302 100644
> > --- a/drivers/cxl/Kconfig
> > +++ b/drivers/cxl/Kconfig
> > @@ -2,6 +2,7 @@
> >  menuconfig CXL_BUS
> >  	tristate "CXL (Compute Express Link) Devices Support"
> >  	depends on PCI
> > +	select PCI_DOE
> >  	help
> >  	  CXL is a bus that is electrically compatible with PCI Express, but
> >  	  layers three protocols on that signalling (CXL.io, CXL.cache, and
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 60d10ee1e7fc..360f282ef80c 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -191,6 +191,7 @@ struct cxl_endpoint_dvsec_info {
> >   * @component_reg_phys: register base of component registers
> >   * @info: Cached DVSEC information about the device.
> >   * @serial: PCIe Device Serial Number
> > + * @doe_mbs: PCI DOE mailbox array
> >   * @mbox_send: @dev specific transport for transmitting mailbox commands
> >   *
> >   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> > @@ -224,6 +225,8 @@ struct cxl_dev_state {
> >  	resource_size_t component_reg_phys;
> >  	u64 serial;
> >  
> > +	struct xarray doe_mbs;
> > +
> >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> >  };
> >  
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 5a0ae46d4989..5821e6c1253b 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -8,6 +8,7 @@
> >  #include <linux/mutex.h>
> >  #include <linux/list.h>
> >  #include <linux/pci.h>
> > +#include <linux/pci-doe.h>
> >  #include <linux/io.h>
> >  #include "cxlmem.h"
> >  #include "cxlpci.h"
> > @@ -386,6 +387,37 @@ static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> >  	return rc;
> >  }
> >  
> > +static void cxl_pci_destroy_doe(void *mbs)
> > +{
> > +	struct xarray *xa = mbs;
> 
> Local variable doesn't add anything...
> 
> > +
> > +	xa_destroy(xa);
> > +}
> > +
> > +static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
> > +{
> > +	struct device *dev = cxlds->dev;
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	u16 off = 0;
> > +
> > +	pci_doe_for_each_off(pdev, off) {
> > +		struct pci_doe_mb *doe_mb;
> > +
> > +		doe_mb = pcim_doe_create_mb(pdev, off, -1);
> > +		if (IS_ERR(doe_mb)) {
> > +			pci_err(pdev,
> > +				"Failed to create MB object for MB @ %x\n",
> > +				off);
> 
> Definitely at least need a comment for why papering over this failure is
> fine.  My gut feeling is we shouldn't ignore it.

I'm so confused at this point I don't really know any more.  :-/

> 
> > +			doe_mb = NULL;

This at least needs to be a continue.

> > +		}
> > +
> > +		if (xa_insert(&cxlds->doe_mbs, off, doe_mb, GFP_KERNEL))
> > +			break;
> 
> If we hit that break something has gone horribly wrong and we shouldn't
> paper over it either.  We might have a partial list of DOEs and callers
> after this will have no way of knowing it isn't the full list.

I _thought_ that we did not care if some mailboxes failed or not.

If CDAT is not supported on any of the mailboxes found then CDAT will not show
up on sysfs (as per Dan's last comment).  If it was supported on a mailbox but
no data is found the sysfs will show up but be 0 length.

At this layer I thought we agreed to skip over these errors.  If a protocol is
needed at a higher layer and it is not found on any of the mailboxes the errors
should show up in that layer.  In this series CDAT is not 100% necessary as
devices can work without it.  So the errors were mostly paper'ed over in favor
of just printing error messages and muddle on.

The xa_insert() deserves a pci_err() though.

And I'll add a comment indicating finding mailboxes is 'best effort' and higher
layers may error out the device depending on necessity.  How about this?

        /* 
         * Mailbox creation is best effort.  Higher layers must determine if
         * the lack of a mailbox for their protocol is a device failure or not.
         */

> 
> > +
> > +		pci_dbg(pdev, "Created DOE mailbox @%x\n", off);
> > +	}
> > +}
> > +
> >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  {
> >  	struct cxl_register_map map;
> > @@ -408,6 +440,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (IS_ERR(cxlds))
> >  		return PTR_ERR(cxlds);
> >  
> > +	xa_init(&cxlds->doe_mbs);
> > +	devm_add_action(&pdev->dev, cxl_pci_destroy_doe, &cxlds->doe_mbs);
> 
> _or_reset()?  If the devm registration itself fails we want to bail out cleanly.
> It's vanishingly unlikely to happen, but we should still handle that case.
> 

Actually no; xa_destroy does not need to be called at this point if we fail.
However, I do need to check the return of devm_add_action and fails if it is
not set.  Thanks again!

Ira

> > +
> >  	cxlds->serial = pci_get_dsn(pdev);
> >  	cxlds->cxl_dvsec = pci_find_dvsec_capability(
> >  		pdev, PCI_DVSEC_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE);
> > @@ -434,6 +469,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  
> >  	cxlds->component_reg_phys = cxl_regmap_to_base(pdev, &map);
> >  
> > +	devm_cxl_pci_create_doe(cxlds);
> > +
> >  	rc = cxl_pci_setup_mailbox(cxlds);
> >  	if (rc)
> >  		return rc;
>
Jonathan Cameron June 30, 2022, 3:32 p.m. UTC | #3
On Wed, 29 Jun 2022 22:32:57 -0700
Ira Weiny <ira.weiny@intel.com> wrote:

> On Tue, Jun 28, 2022 at 03:33:17PM +0100, Jonathan Cameron wrote:
> > On Mon, 27 Jun 2022 21:15:22 -0700
> > ira.weiny@intel.com wrote:
> >   
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > DOE mailbox objects will be needed for various mailbox communications
> > > with each memory device.
> > > 
> > > Iterate each DOE mailbox capability and create PCI DOE mailbox objects
> > > as found.
> > > 
> > > It is not anticipated that this is the final resting place for the
> > > iteration of the DOE devices.  The support of switch ports will drive
> > > this code into the PCIe side.  In this imagined architecture the CXL
> > > port driver would then query into the PCI device for the DOE mailbox
> > > array.
> > > 
> > > For now creating the mailboxes in the CXL port is good enough for the
> > > endpoints.  Later PCIe ports will need to support this to support switch
> > > ports more generically.
> > > 
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Davidlohr Bueso <dave@stgolabs.net>
> > > Cc: Lukas Wunner <lukas@wunner.de>
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>  
> > 
> > My main comment on this is that we should not paper over any errors in
> > DOE setup.  Those indicate a bug or hardware fault, so like anything similar
> > we should at very least document why it makes sense to continue.  In most
> > cases I'd argue it doesn't as something is very wrong.
> >   
> > > 
> > > ---
> > > Changes from V11:
> > > 	Drop review from: Ben Widawsky <bwidawsk@kernel.org>
> > > 	Remove irq code for now
> > > 	Adjust for pci_doe_get_int_msg_num()
> > > 	Adjust for pcim_doe_create_mb()
> > > 		(No longer need to handle the destroy.)
> > > 	Use xarray for DOE mailbox array
> > > 
> > > Changes from V9:
> > > 	Bug fix: ensure DOE mailboxes are iterated before memdev add
> > > 	Ben Widawsky
> > > 		Set use_irq to false and just return on error.
> > > 		Don't return a value from devm_cxl_pci_create_doe()
> > > 		Skip allocating doe_mb array if there are no mailboxes
> > > 		Skip requesting irqs if none found.
> > > 	Ben/Jonathan Cameron
> > > 		s/num_irqs/max_irqs
> > > 
> > > Changes from V8:
> > > 	Move PCI_DOE selection to CXL_BUS to support future patches
> > > 	which move queries into the port code.
> > > 	Remove Auxiliary device arch
> > > 	Squash the functionality of the auxiliary driver into this
> > > 	patch.
> > > 	Split out the irq handling a bit.
> > > 
> > > Changes from V7:
> > > 	Minor code clean ups
> > > 	Rebased on cxl-pending
> > > 
> > > Changes from V6:
> > > 	Move all the auxiliary device stuff to the CXL layer
> > > 
> > > Changes from V5:
> > > 	Split the CXL specific stuff off from the PCI DOE create
> > > 	auxiliary device code.
> > > ---
> > >  drivers/cxl/Kconfig  |  1 +
> > >  drivers/cxl/cxlmem.h |  3 +++
> > >  drivers/cxl/pci.c    | 37 +++++++++++++++++++++++++++++++++++++
> > >  3 files changed, 41 insertions(+)
> > > 
> > > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > > index f64e3984689f..7adaaf80b302 100644
> > > --- a/drivers/cxl/Kconfig
> > > +++ b/drivers/cxl/Kconfig
> > > @@ -2,6 +2,7 @@
> > >  menuconfig CXL_BUS
> > >  	tristate "CXL (Compute Express Link) Devices Support"
> > >  	depends on PCI
> > > +	select PCI_DOE
> > >  	help
> > >  	  CXL is a bus that is electrically compatible with PCI Express, but
> > >  	  layers three protocols on that signalling (CXL.io, CXL.cache, and
> > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > index 60d10ee1e7fc..360f282ef80c 100644
> > > --- a/drivers/cxl/cxlmem.h
> > > +++ b/drivers/cxl/cxlmem.h
> > > @@ -191,6 +191,7 @@ struct cxl_endpoint_dvsec_info {
> > >   * @component_reg_phys: register base of component registers
> > >   * @info: Cached DVSEC information about the device.
> > >   * @serial: PCIe Device Serial Number
> > > + * @doe_mbs: PCI DOE mailbox array
> > >   * @mbox_send: @dev specific transport for transmitting mailbox commands
> > >   *
> > >   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> > > @@ -224,6 +225,8 @@ struct cxl_dev_state {
> > >  	resource_size_t component_reg_phys;
> > >  	u64 serial;
> > >  
> > > +	struct xarray doe_mbs;
> > > +
> > >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> > >  };
> > >  
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index 5a0ae46d4989..5821e6c1253b 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -8,6 +8,7 @@
> > >  #include <linux/mutex.h>
> > >  #include <linux/list.h>
> > >  #include <linux/pci.h>
> > > +#include <linux/pci-doe.h>
> > >  #include <linux/io.h>
> > >  #include "cxlmem.h"
> > >  #include "cxlpci.h"
> > > @@ -386,6 +387,37 @@ static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> > >  	return rc;
> > >  }
> > >  
> > > +static void cxl_pci_destroy_doe(void *mbs)
> > > +{
> > > +	struct xarray *xa = mbs;  
> > 
> > Local variable doesn't add anything...
> >   
> > > +
> > > +	xa_destroy(xa);
> > > +}
> > > +
> > > +static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
> > > +{
> > > +	struct device *dev = cxlds->dev;
> > > +	struct pci_dev *pdev = to_pci_dev(dev);
> > > +	u16 off = 0;
> > > +
> > > +	pci_doe_for_each_off(pdev, off) {
> > > +		struct pci_doe_mb *doe_mb;
> > > +
> > > +		doe_mb = pcim_doe_create_mb(pdev, off, -1);
> > > +		if (IS_ERR(doe_mb)) {
> > > +			pci_err(pdev,
> > > +				"Failed to create MB object for MB @ %x\n",
> > > +				off);  
> > 
> > Definitely at least need a comment for why papering over this failure is
> > fine.  My gut feeling is we shouldn't ignore it.  
> 
> I'm so confused at this point I don't really know any more.  :-/
> 
> >   
> > > +			doe_mb = NULL;  
> 
> This at least needs to be a continue.
> 
> > > +		}
> > > +
> > > +		if (xa_insert(&cxlds->doe_mbs, off, doe_mb, GFP_KERNEL))
> > > +			break;  
> > 
> > If we hit that break something has gone horribly wrong and we shouldn't
> > paper over it either.  We might have a partial list of DOEs and callers
> > after this will have no way of knowing it isn't the full list.  
> 
> I _thought_ that we did not care if some mailboxes failed or not.

I have a different view to Dan on this.  In my view if your hardware is
not working in any way at all scream like mad don't carry on... Dan
is keen to try to muddle onwards.

> 
> If CDAT is not supported on any of the mailboxes found then CDAT will not show
> up on sysfs (as per Dan's last comment).  If it was supported on a mailbox but
> no data is found the sysfs will show up but be 0 length.
> 
> At this layer I thought we agreed to skip over these errors.  If a protocol is
> needed at a higher layer and it is not found on any of the mailboxes the errors
> should show up in that layer.  In this series CDAT is not 100% necessary as
> devices can work without it.  So the errors were mostly paper'ed over in favor
> of just printing error messages and muddle on.
> 
> The xa_insert() deserves a pci_err() though.

That's probably the minimum we should do.  The xa_insert() failing is something
horrible going wrong in our software / host afterall.

> 
> And I'll add a comment indicating finding mailboxes is 'best effort' and higher
> layers may error out the device depending on necessity.  How about this?
> 
>         /* 
>          * Mailbox creation is best effort.  Higher layers must determine if
>          * the lack of a mailbox for their protocol is a device failure or not.
>          */

Good.

> 
> >   
> > > +
> > > +		pci_dbg(pdev, "Created DOE mailbox @%x\n", off);
> > > +	}
> > > +}
> > > +
> > >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  {
> > >  	struct cxl_register_map map;
> > > @@ -408,6 +440,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  	if (IS_ERR(cxlds))
> > >  		return PTR_ERR(cxlds);
> > >  
> > > +	xa_init(&cxlds->doe_mbs);
> > > +	devm_add_action(&pdev->dev, cxl_pci_destroy_doe, &cxlds->doe_mbs);  
> > 
> > _or_reset()?  If the devm registration itself fails we want to bail out cleanly.
> > It's vanishingly unlikely to happen, but we should still handle that case.
> >   
> 
> Actually no; xa_destroy does not need to be called at this point if we fail.

Ah ok, I'd missed that.

> However, I do need to check the return of devm_add_action and fails if it is
> not set.  Thanks again!
> 
> Ira
> 
> > > +
> > >  	cxlds->serial = pci_get_dsn(pdev);
> > >  	cxlds->cxl_dvsec = pci_find_dvsec_capability(
> > >  		pdev, PCI_DVSEC_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE);
> > > @@ -434,6 +469,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  
> > >  	cxlds->component_reg_phys = cxl_regmap_to_base(pdev, &map);
> > >  
> > > +	devm_cxl_pci_create_doe(cxlds);
> > > +
> > >  	rc = cxl_pci_setup_mailbox(cxlds);
> > >  	if (rc)
> > >  		return rc;  
> >
Davidlohr Bueso June 30, 2022, 4:14 p.m. UTC | #4
On Thu, 30 Jun 2022, Jonathan Cameron wrote:

>On Wed, 29 Jun 2022 22:32:57 -0700 Ira Weiny <ira.weiny@intel.com> wrote:

>> I _thought_ that we did not care if some mailboxes failed or not.
>
>I have a different view to Dan on this.  In my view if your hardware is
>not working in any way at all scream like mad don't carry on... Dan
>is keen to try to muddle onwards.

I am also of the idea of not carrying on upon any indication of failure.

>>
>> If CDAT is not supported on any of the mailboxes found then CDAT will not show
>> up on sysfs (as per Dan's last comment).  If it was supported on a mailbox but
>> no data is found the sysfs will show up but be 0 length.
>>
>> At this layer I thought we agreed to skip over these errors.  If a protocol is
>> needed at a higher layer and it is not found on any of the mailboxes the errors
>> should show up in that layer.  In this series CDAT is not 100% necessary as
>> devices can work without it.  So the errors were mostly paper'ed over in favor
>> of just printing error messages and muddle on.
>>
>> The xa_insert() deserves a pci_err() though.
>
>That's probably the minimum we should do.  The xa_insert() failing is something
>horrible going wrong in our software / host afterall.

Yes. And in addition, devm_cxl_pci_create_doe() should return any error status, and
cxl_pci_probe() can choose to omit any errors, but it's still better to have it.

Thanks,
Davidlohr
Ira Weiny July 1, 2022, 7:52 p.m. UTC | #5
On Thu, Jun 30, 2022 at 09:14:06AM -0700, Davidlohr Bueso wrote:
> On Thu, 30 Jun 2022, Jonathan Cameron wrote:
> 
> > On Wed, 29 Jun 2022 22:32:57 -0700 Ira Weiny <ira.weiny@intel.com> wrote:
> 
> > > I _thought_ that we did not care if some mailboxes failed or not.
> > 
> > I have a different view to Dan on this.  In my view if your hardware is
> > not working in any way at all scream like mad don't carry on... Dan
> > is keen to try to muddle onwards.
> 
> I am also of the idea of not carrying on upon any indication of failure.
> 
> > > 
> > > If CDAT is not supported on any of the mailboxes found then CDAT will not show
> > > up on sysfs (as per Dan's last comment).  If it was supported on a mailbox but
> > > no data is found the sysfs will show up but be 0 length.
> > > 
> > > At this layer I thought we agreed to skip over these errors.  If a protocol is
> > > needed at a higher layer and it is not found on any of the mailboxes the errors
> > > should show up in that layer.  In this series CDAT is not 100% necessary as
> > > devices can work without it.  So the errors were mostly paper'ed over in favor
> > > of just printing error messages and muddle on.
> > > 
> > > The xa_insert() deserves a pci_err() though.
> > 
> > That's probably the minimum we should do.  The xa_insert() failing is something
> > horrible going wrong in our software / host afterall.
> 
> Yes. And in addition, devm_cxl_pci_create_doe() should return any error status, and
> cxl_pci_probe() can choose to omit any errors, but it's still better to have it.

I don't think this adds anything.  If devm_cxl_pci_create_doe() already prints
error messages what is cxl_pci_probe() going to do with the error if it does
not fail the probe?

I want devm_cxl_pci_create_doe() to continue trying to create the mailbox
objects.  So for now I'm going to print the errors in devm_cxl_pci_create_doe()
and keep going without returning any error.

Ira

> 
> Thanks,
> Davidlohr
diff mbox series

Patch

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index f64e3984689f..7adaaf80b302 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -2,6 +2,7 @@ 
 menuconfig CXL_BUS
 	tristate "CXL (Compute Express Link) Devices Support"
 	depends on PCI
+	select PCI_DOE
 	help
 	  CXL is a bus that is electrically compatible with PCI Express, but
 	  layers three protocols on that signalling (CXL.io, CXL.cache, and
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 60d10ee1e7fc..360f282ef80c 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -191,6 +191,7 @@  struct cxl_endpoint_dvsec_info {
  * @component_reg_phys: register base of component registers
  * @info: Cached DVSEC information about the device.
  * @serial: PCIe Device Serial Number
+ * @doe_mbs: PCI DOE mailbox array
  * @mbox_send: @dev specific transport for transmitting mailbox commands
  *
  * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
@@ -224,6 +225,8 @@  struct cxl_dev_state {
 	resource_size_t component_reg_phys;
 	u64 serial;
 
+	struct xarray doe_mbs;
+
 	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
 };
 
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 5a0ae46d4989..5821e6c1253b 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -8,6 +8,7 @@ 
 #include <linux/mutex.h>
 #include <linux/list.h>
 #include <linux/pci.h>
+#include <linux/pci-doe.h>
 #include <linux/io.h>
 #include "cxlmem.h"
 #include "cxlpci.h"
@@ -386,6 +387,37 @@  static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
 	return rc;
 }
 
+static void cxl_pci_destroy_doe(void *mbs)
+{
+	struct xarray *xa = mbs;
+
+	xa_destroy(xa);
+}
+
+static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
+{
+	struct device *dev = cxlds->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u16 off = 0;
+
+	pci_doe_for_each_off(pdev, off) {
+		struct pci_doe_mb *doe_mb;
+
+		doe_mb = pcim_doe_create_mb(pdev, off, -1);
+		if (IS_ERR(doe_mb)) {
+			pci_err(pdev,
+				"Failed to create MB object for MB @ %x\n",
+				off);
+			doe_mb = NULL;
+		}
+
+		if (xa_insert(&cxlds->doe_mbs, off, doe_mb, GFP_KERNEL))
+			break;
+
+		pci_dbg(pdev, "Created DOE mailbox @%x\n", off);
+	}
+}
+
 static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct cxl_register_map map;
@@ -408,6 +440,9 @@  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (IS_ERR(cxlds))
 		return PTR_ERR(cxlds);
 
+	xa_init(&cxlds->doe_mbs);
+	devm_add_action(&pdev->dev, cxl_pci_destroy_doe, &cxlds->doe_mbs);
+
 	cxlds->serial = pci_get_dsn(pdev);
 	cxlds->cxl_dvsec = pci_find_dvsec_capability(
 		pdev, PCI_DVSEC_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE);
@@ -434,6 +469,8 @@  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	cxlds->component_reg_phys = cxl_regmap_to_base(pdev, &map);
 
+	devm_cxl_pci_create_doe(cxlds);
+
 	rc = cxl_pci_setup_mailbox(cxlds);
 	if (rc)
 		return rc;