[2/5] cxl/memdev: Add support for the Clear Poison mailbox command

Message ID 091f50b2644f220f0607633a4a953184e9c88b53.1669781852.git.alison.schofield@intel.com
State Superseded
Series cxl: CXL Inject & Clear Poison

Commit Message

Alison Schofield Nov. 30, 2022, 4:34 a.m. UTC
From: Alison Schofield <alison.schofield@intel.com>

CXL devices optionally support the CLEAR POISON mailbox command. Add
a sysfs attribute and memdev driver support for clearing poison.

When a Device Physical Address (DPA) is written to the clear_poison
sysfs attribute send a clear poison command to the device for the
specified address.

Per the CXL Specification (8.2.9.8.4.3), after receiving a valid clear
poison request, the device removes the address from the device's Poison
List and writes 0 (zero) for 64 bytes starting at address. If the device
cannot clear poison from the address, it returns a permanent media error
and ENXIO is returned to the user.

Additionally, and per the spec also, it is not an error to clear poison
of an address that is not poisoned. No error is returned and the address
is not overwritten. The memdev driver performs basic sanity checking on
the address, however, it does not go as far as reading the poison list to
see if the address is poisoned before clearing. That discovery is left to
the device. The device safely handles that case.

Implementation note: Although the CXL specification defines the clear
command to accept 64 bytes of 'write-data' to be used when clearing
the poisoned address, this implementation always uses 0 (zeros) for
the write-data.

The clear_poison attribute is only visible for devices supporting the
capability.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---
 Documentation/ABI/testing/sysfs-bus-cxl | 17 +++++++++
 drivers/cxl/core/memdev.c               | 47 +++++++++++++++++++++++++
 drivers/cxl/cxlmem.h                    |  6 ++++
 3 files changed, 70 insertions(+)
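
For anyone wanting to exercise the attribute from userspace, a minimal sketch
follows. It is not part of the patch. The helper name cxl_clear_poison_write()
and the device name mem0 in the comment are assumptions for illustration; the
only behavior taken from the patch is that the store handler parses the value
with kstrtou64(buf, 0, ...), so a 0x-prefixed hex string is accepted.

```c
#include <errno.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write a DPA to a clear_poison-style sysfs attribute.
 * Returns 0 on success or a negative errno value on failure.
 * attr_path is chosen by the caller, for example
 * /sys/bus/cxl/devices/mem0/clear_poison (mem0 is an assumed name).
 */
int cxl_clear_poison_write(const char *attr_path, uint64_t dpa)
{
	char buf[32];
	int len, fd, rc = 0;
	ssize_t n;

	/* Base 0 parsing in the kernel accepts a 0x prefix, so send hex. */
	len = snprintf(buf, sizeof(buf), "0x%" PRIx64 "\n", dpa);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -EINVAL;

	fd = open(attr_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	/* A failed store surfaces here as write() returning -1 with errno set. */
	n = write(fd, buf, (size_t)len);
	if (n < 0)
		rc = -errno;
	else if (n != len)
		rc = -EIO;

	if (close(fd) < 0 && rc == 0)
		rc = -errno;
	return rc;
}
```

A caller would pass the attribute path and a DPA, and could treat a return of
-ENXIO as the "device cannot clear poison" case described above.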

Comments

Jonathan Cameron Nov. 30, 2022, 2:43 p.m. UTC | #1
On Tue, 29 Nov 2022 20:34:34 -0800
alison.schofield@intel.com wrote:

> From: Alison Schofield <alison.schofield@intel.com>
> 
> CXL devices optionally support the CLEAR POISON mailbox command. Add
> a sysfs attribute and memdev driver support for clearing poison.
> 
> When a Device Physical Address (DPA) is written to the clear_poison
> sysfs attribute send a clear poison command to the device for the
> specified address.
> 
> Per the CXL Specification (8.2.9.8.4.3), after receiving a valid clear
> poison request, the device removes the address from the device's Poison
> List and writes 0 (zero) for 64 bytes starting at address. If the device
> cannot clear poison from the address, it returns a permanent media error
> and ENXIO is returned to the user.

-ENXIO

> 
> Additionally, and per the spec also, it is not an error to clear poison
> of an address that is not poisoned. No error is returned and the address
> is not overwritten. The memdev driver performs basic sanity checking on
> the address, however, it does not go as far as reading the poison list to
> see if the address is poisoned before clearing. That discovery is left to
> the device. The device safely handles that case.
> 
> Implementation note: Although the CXL specification defines the clear
> command to accept 64 bytes of 'write-data' to be used when clearing
> the poisoned address, this implementation always uses 0 (zeros) for
> the write-data.

Maybe put a * above to refer to this note given the spec is referenced
for stuff different from what you are doing with it.  Nice to flag
up to anyone reading this that they shouldn't write a 'no that's not
what it says' comment before reading on. (who would do something
silly like that? :)

> 
> The clear_poison attribute is only visible for devices supporting the
> capability.
> 
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Otherwise, a few really trivial things inline + it made me notice I'd misread
the code for patch 1, hence the reply to my reply.

With this stuff tweaked.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  Documentation/ABI/testing/sysfs-bus-cxl | 17 +++++++++
>  drivers/cxl/core/memdev.c               | 47 +++++++++++++++++++++++++
>  drivers/cxl/cxlmem.h                    |  6 ++++
>  3 files changed, 70 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> index 20db97f7a1aa..9d2b0fa07e17 100644
> --- a/Documentation/ABI/testing/sysfs-bus-cxl
> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> @@ -435,3 +435,20 @@ Description:
>  		poison into an address that already has poison present and no
>  		error is returned. The inject_poison attribute is only visible
>                  for devices supporting the capability.
> +
> +
> +What:		/sys/bus/cxl/devices/memX/clear_poison
> +Date:		December, 2022
> +KernelVersion:	v6.2
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(WO) When a Device Physical Address (DPA) is written to this
> +		attribute the memdev driver sends a clear poison command to the
> +		device for the specified address. Clearing poison removes the
> +		address from the device's Poison List and writes 0 (zero)
> +		for 64 bytes starting at address. It is not an error to clear
> +		poison from an address that does not have poison set, and if
> +		poison was not set, the address is not overwritten. If the
> +		device cannot clear poison from the address, ENXIO is returned.

-ENXIO ?

> +		The clear_poison attribute is only visible for devices
> +		supporting the capability.
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 71130813030f..85caffd5a85c 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -187,6 +187,44 @@ static ssize_t inject_poison_store(struct device *dev,
>  }
>  static DEVICE_ATTR_WO(inject_poison);
>  
> +static ssize_t clear_poison_store(struct device *dev,
> +				  struct device_attribute *attr,
> +				  const char *buf, size_t len)
> +{
> +	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_mbox_clear_poison *pi;
> +	u64 dpa;
> +	int rc;
> +
> +	rc = kstrtou64(buf, 0, &dpa);
> +	if (rc)
> +		return rc;
> +	rc = cxl_validate_poison_dpa(cxlds, dpa);
> +	if (rc)
> +		return rc;
Trivial:
Add a blank line here.  It kind of makes sense to keep the string parser and
validation in one block, but it is good to then separate that from the next
bit of code.

> +	pi = kzalloc(sizeof(*pi), GFP_KERNEL);
> +	if (!pi)
> +		return -ENOMEM;
> +	/*
> +	 * In CXL 3.0 Spec 8.2.9.8.4.3, the Clear Poison mailbox command
> +	 * is defined to accept 64 bytes of 'write-data', along with the
> +	 * address to clear. The device writes 'write-data' into the DPA,
> +	 * atomically, while clearing poison if the location is marked as
> +	 * being poisoned.
> +	 *
> +	 * Always use '0' for the write-data.
> +	 */
> +	pi->address = cpu_to_le64(dpa);
> +	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_CLEAR_POISON, pi,
> +			       sizeof(*pi), NULL, cxlds->payload_size);
> +	if (rc)
> +		return rc;
> +
> +	return len;
> +}
> +static DEVICE_ATTR_WO(clear_poison);
...
Dave Jiang Dec. 1, 2022, 5:54 p.m. UTC | #2
On 11/29/2022 9:34 PM, alison.schofield@intel.com wrote:
> From: Alison Schofield <alison.schofield@intel.com>
> 
> CXL devices optionally support the CLEAR POISON mailbox command. Add
> a sysfs attribute and memdev driver support for clearing poison.
> 
> When a Device Physical Address (DPA) is written to the clear_poison
> sysfs attribute send a clear poison command to the device for the

comma between 'attribute' and 'send'

> specified address.
> 
> Per the CXL Specification (8.2.9.8.4.3), after receiving a valid clear

Please add spec version.

> poison request, the device removes the address from the device's Poison
> List and writes 0 (zero) for 64 bytes starting at address. If the device
> cannot clear poison from the address, it returns a permanent media error
> and ENXIO is returned to the user.
> 
> Additionally, and per the spec also, it is not an error to clear poison
> of an address that is not poisoned. No error is returned and the address
> is not overwritten. The memdev driver performs basic sanity checking on
> the address, however, it does not go as far as reading the poison list to
> see if the address is poisoned before clearing. That discovery is left to
> the device. The device safely handles that case.
> 
> Implementation note: Although the CXL specification defines the clear
> command to accept 64 bytes of 'write-data' to be used when clearing
> the poisoned address, this implementation always uses 0 (zeros) for
> the write-data.
> 
> The clear_poison attribute is only visible for devices supporting the
> capability.
> 
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> ---
>   Documentation/ABI/testing/sysfs-bus-cxl | 17 +++++++++
>   drivers/cxl/core/memdev.c               | 47 +++++++++++++++++++++++++
>   drivers/cxl/cxlmem.h                    |  6 ++++
>   3 files changed, 70 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> index 20db97f7a1aa..9d2b0fa07e17 100644
> --- a/Documentation/ABI/testing/sysfs-bus-cxl
> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> @@ -435,3 +435,20 @@ Description:
>   		poison into an address that already has poison present and no
>   		error is returned. The inject_poison attribute is only visible
>                   for devices supporting the capability.
> +
> +
> +What:		/sys/bus/cxl/devices/memX/clear_poison
> +Date:		December, 2022
> +KernelVersion:	v6.2
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(WO) When a Device Physical Address (DPA) is written to this
> +		attribute the memdev driver sends a clear poison command to the

comma between 'attribute' and 'the'.

DJ

> +		device for the specified address. Clearing poison removes the
> +		address from the device's Poison List and writes 0 (zero)
> +		for 64 bytes starting at address. It is not an error to clear
> +		poison from an address that does not have poison set, and if
> +		poison was not set, the address is not overwritten. If the
> +		device cannot clear poison from the address, ENXIO is returned.
> +		The clear_poison attribute is only visible for devices
> +		supporting the capability.
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 71130813030f..85caffd5a85c 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -187,6 +187,44 @@ static ssize_t inject_poison_store(struct device *dev,
>   }
>   static DEVICE_ATTR_WO(inject_poison);
>   
> +static ssize_t clear_poison_store(struct device *dev,
> +				  struct device_attribute *attr,
> +				  const char *buf, size_t len)
> +{
> +	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	struct cxl_mbox_clear_poison *pi;
> +	u64 dpa;
> +	int rc;
> +
> +	rc = kstrtou64(buf, 0, &dpa);
> +	if (rc)
> +		return rc;
> +	rc = cxl_validate_poison_dpa(cxlds, dpa);
> +	if (rc)
> +		return rc;
> +	pi = kzalloc(sizeof(*pi), GFP_KERNEL);
> +	if (!pi)
> +		return -ENOMEM;
> +	/*
> +	 * In CXL 3.0 Spec 8.2.9.8.4.3, the Clear Poison mailbox command
> +	 * is defined to accept 64 bytes of 'write-data', along with the
> +	 * address to clear. The device writes 'write-data' into the DPA,
> +	 * atomically, while clearing poison if the location is marked as
> +	 * being poisoned.
> +	 *
> +	 * Always use '0' for the write-data.
> +	 */
> +	pi->address = cpu_to_le64(dpa);
> +	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_CLEAR_POISON, pi,
> +			       sizeof(*pi), NULL, cxlds->payload_size);
> +	if (rc)
> +		return rc;
> +
> +	return len;
> +}
> +static DEVICE_ATTR_WO(clear_poison);
> +
>   static struct attribute *cxl_memdev_attributes[] = {
>   	&dev_attr_serial.attr,
>   	&dev_attr_firmware_version.attr,
> @@ -195,6 +233,7 @@ static struct attribute *cxl_memdev_attributes[] = {
>   	&dev_attr_numa_node.attr,
>   	&dev_attr_trigger_poison_list.attr,
>   	&dev_attr_inject_poison.attr,
> +	&dev_attr_clear_poison.attr,
>   	NULL,
>   };
>   
> @@ -228,6 +267,14 @@ static umode_t cxl_memdev_visible(struct kobject *kobj, struct attribute *a,
>   			      to_cxl_memdev(dev)->cxlds->enabled_cmds))
>   			return 0;
>   	}
> +	if (a == &dev_attr_clear_poison.attr) {
> +		struct device *dev = kobj_to_dev(kobj);
> +
> +		if (!test_bit(CXL_MEM_COMMAND_ID_CLEAR_POISON,
> +			      to_cxl_memdev(dev)->cxlds->enabled_cmds)) {
> +			return 0;
> +		}
> +	}
>   	return a->mode;
>   }
>   
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 0d4c34be7335..532adf9c3afd 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -399,6 +399,12 @@ struct cxl_mbox_poison_payload_out {
>   /* Inject & Clear Poison  CXL 3.0 Spec 8.2.9.8.4.2/3 */
>   #define CXL_POISON_INJECT_RESERVED	GENMASK_ULL(5, 0)
>   
> +/* Clear Poison  CXL 3.0 Spec 8.2.9.8.4.3 */
> +struct cxl_mbox_clear_poison {
> +	__le64 address;
> +	u8 write_data[64];
> +} __packed;
> +
>   /**
>    * struct cxl_mem_command - Driver representation of a memory device command
>    * @info: Command information as it exists for the UAPI
Alison Schofield Dec. 1, 2022, 8:09 p.m. UTC | #3
On Thu, Dec 01, 2022 at 10:54:47AM -0700, Dave Jiang wrote:
> 
> 
> On 11/29/2022 9:34 PM, alison.schofield@intel.com wrote:
> > From: Alison Schofield <alison.schofield@intel.com>
> > 
> > CXL devices optionally support the CLEAR POISON mailbox command. Add
> > a sysfs attribute and memdev driver support for clearing poison.
> > 
> > When a Device Physical Address (DPA) is written to the clear_poison
> > sysfs attribute send a clear poison command to the device for the
> 
> comma between 'attribute' and 'send'

Thanks for the review, Dave!  I've addressed this and your suggestions
below as well.

> 
> > specified address.
> > 
> > Per the CXL Specification (8.2.9.8.4.3), after receiving a valid clear
> 
> Please add spec version.
> 
> > poison request, the device removes the address from the device's Poison
> > List and writes 0 (zero) for 64 bytes starting at address. If the device
> > cannot clear poison from the address, it returns a permanent media error
> > and ENXIO is returned to the user.
> > 
> > Additionally, and per the spec also, it is not an error to clear poison
> > of an address that is not poisoned. No error is returned and the address
> > is not overwritten. The memdev driver performs basic sanity checking on
> > the address, however, it does not go as far as reading the poison list to
> > see if the address is poisoned before clearing. That discovery is left to
> > the device. The device safely handles that case.
> > 
> > Implementation note: Although the CXL specification defines the clear
> > command to accept 64 bytes of 'write-data' to be used when clearing
> > the poisoned address, this implementation always uses 0 (zeros) for
> > the write-data.
> > 
> > The clear_poison attribute is only visible for devices supporting the
> > capability.
> > 
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > ---
> >   Documentation/ABI/testing/sysfs-bus-cxl | 17 +++++++++
> >   drivers/cxl/core/memdev.c               | 47 +++++++++++++++++++++++++
> >   drivers/cxl/cxlmem.h                    |  6 ++++
> >   3 files changed, 70 insertions(+)
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> > index 20db97f7a1aa..9d2b0fa07e17 100644
> > --- a/Documentation/ABI/testing/sysfs-bus-cxl
> > +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> > @@ -435,3 +435,20 @@ Description:
> >   		poison into an address that already has poison present and no
> >   		error is returned. The inject_poison attribute is only visible
> >                   for devices supporting the capability.
> > +
> > +
> > +What:		/sys/bus/cxl/devices/memX/clear_poison
> > +Date:		December, 2022
> > +KernelVersion:	v6.2
> > +Contact:	linux-cxl@vger.kernel.org
> > +Description:
> > +		(WO) When a Device Physical Address (DPA) is written to this
> > +		attribute the memdev driver sends a clear poison command to the
> 
> comma between 'attribute' and 'the'.
> 
> DJ
> 
> > +		device for the specified address. Clearing poison removes the
> > +		address from the device's Poison List and writes 0 (zero)
> > +		for 64 bytes starting at address. It is not an error to clear
> > +		poison from an address that does not have poison set, and if
> > +		poison was not set, the address is not overwritten. If the
> > +		device cannot clear poison from the address, ENXIO is returned.
> > +		The clear_poison attribute is only visible for devices
> > +		supporting the capability.
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > index 71130813030f..85caffd5a85c 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> > @@ -187,6 +187,44 @@ static ssize_t inject_poison_store(struct device *dev,
> >   }
> >   static DEVICE_ATTR_WO(inject_poison);
> > +static ssize_t clear_poison_store(struct device *dev,
> > +				  struct device_attribute *attr,
> > +				  const char *buf, size_t len)
> > +{
> > +	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > +	struct cxl_mbox_clear_poison *pi;
> > +	u64 dpa;
> > +	int rc;
> > +
> > +	rc = kstrtou64(buf, 0, &dpa);
> > +	if (rc)
> > +		return rc;
> > +	rc = cxl_validate_poison_dpa(cxlds, dpa);
> > +	if (rc)
> > +		return rc;
> > +	pi = kzalloc(sizeof(*pi), GFP_KERNEL);
> > +	if (!pi)
> > +		return -ENOMEM;
> > +	/*
> > +	 * In CXL 3.0 Spec 8.2.9.8.4.3, the Clear Poison mailbox command
> > +	 * is defined to accept 64 bytes of 'write-data', along with the
> > +	 * address to clear. The device writes 'write-data' into the DPA,
> > +	 * atomically, while clearing poison if the location is marked as
> > +	 * being poisoned.
> > +	 *
> > +	 * Always use '0' for the write-data.
> > +	 */
> > +	pi->address = cpu_to_le64(dpa);
> > +	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_CLEAR_POISON, pi,
> > +			       sizeof(*pi), NULL, cxlds->payload_size);
> > +	if (rc)
> > +		return rc;
> > +
> > +	return len;
> > +}
> > +static DEVICE_ATTR_WO(clear_poison);
> > +
> >   static struct attribute *cxl_memdev_attributes[] = {
> >   	&dev_attr_serial.attr,
> >   	&dev_attr_firmware_version.attr,
> > @@ -195,6 +233,7 @@ static struct attribute *cxl_memdev_attributes[] = {
> >   	&dev_attr_numa_node.attr,
> >   	&dev_attr_trigger_poison_list.attr,
> >   	&dev_attr_inject_poison.attr,
> > +	&dev_attr_clear_poison.attr,
> >   	NULL,
> >   };
> > @@ -228,6 +267,14 @@ static umode_t cxl_memdev_visible(struct kobject *kobj, struct attribute *a,
> >   			      to_cxl_memdev(dev)->cxlds->enabled_cmds))
> >   			return 0;
> >   	}
> > +	if (a == &dev_attr_clear_poison.attr) {
> > +		struct device *dev = kobj_to_dev(kobj);
> > +
> > +		if (!test_bit(CXL_MEM_COMMAND_ID_CLEAR_POISON,
> > +			      to_cxl_memdev(dev)->cxlds->enabled_cmds)) {
> > +			return 0;
> > +		}
> > +	}
> >   	return a->mode;
> >   }
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 0d4c34be7335..532adf9c3afd 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -399,6 +399,12 @@ struct cxl_mbox_poison_payload_out {
> >   /* Inject & Clear Poison  CXL 3.0 Spec 8.2.9.8.4.2/3 */
> >   #define CXL_POISON_INJECT_RESERVED	GENMASK_ULL(5, 0)
> > +/* Clear Poison  CXL 3.0 Spec 8.2.9.8.4.3 */
> > +struct cxl_mbox_clear_poison {
> > +	__le64 address;
> > +	u8 write_data[64];
> > +} __packed;
> > +
> >   /**
> >    * struct cxl_mem_command - Driver representation of a memory device command
> >    * @info: Command information as it exists for the UAPI
Alison Schofield Dec. 1, 2022, 8:14 p.m. UTC | #4
On Wed, Nov 30, 2022 at 02:43:30PM +0000, Jonathan Cameron wrote:
> On Tue, 29 Nov 2022 20:34:34 -0800
> alison.schofield@intel.com wrote:
> 
> > From: Alison Schofield <alison.schofield@intel.com>
> > 
> > CXL devices optionally support the CLEAR POISON mailbox command. Add
> > a sysfs attribute and memdev driver support for clearing poison.
> > 
> > When a Device Physical Address (DPA) is written to the clear_poison
> > sysfs attribute send a clear poison command to the device for the
> > specified address.
> > 
> > Per the CXL Specification (8.2.9.8.4.3), after receiving a valid clear
> > poison request, the device removes the address from the device's Poison
> > List and writes 0 (zero) for 64 bytes starting at address. If the device
> > cannot clear poison from the address, it returns a permanent media error
> > and ENXIO is returned to the user.
> 
> -ENXIO
> 
> > 
> > Additionally, and per the spec also, it is not an error to clear poison
> > of an address that is not poisoned. No error is returned and the address
> > is not overwritten. The memdev driver performs basic sanity checking on
> > the address, however, it does not go as far as reading the poison list to
> > see if the address is poisoned before clearing. That discovery is left to
> > the device. The device safely handles that case.
> > 
> > Implementation note: Although the CXL specification defines the clear
> > command to accept 64 bytes of 'write-data' to be used when clearing
> > the poisoned address, this implementation always uses 0 (zeros) for
> > the write-data.
> 
> Maybe put a * above to refer to this note given the spec is referenced
> for stuff different from what you are doing with it.  Nice to flag
> up to anyone reading this that they shouldn't write a 'no that's not
> what it says' comment before reading on. (who would do something
> silly like that? :)
> 

Rereading the 2 paragraphs above, it's not flowing for me now either.
I hop between 'spec says' and 'driver does'. Let me give that another
pass.

> > 
> > The clear_poison attribute is only visible for devices supporting the
> > capability.
> > 
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Otherwise, a few really trivial things inline + it made me notice I'd misread
> the code for patch 1, hence the reply to my reply.
> 
> With this stuff tweaked.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

I'll pick up the stuff below too.
Thanks!

> 
> > ---
> >  Documentation/ABI/testing/sysfs-bus-cxl | 17 +++++++++
> >  drivers/cxl/core/memdev.c               | 47 +++++++++++++++++++++++++
> >  drivers/cxl/cxlmem.h                    |  6 ++++
> >  3 files changed, 70 insertions(+)
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> > index 20db97f7a1aa..9d2b0fa07e17 100644
> > --- a/Documentation/ABI/testing/sysfs-bus-cxl
> > +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> > @@ -435,3 +435,20 @@ Description:
> >  		poison into an address that already has poison present and no
> >  		error is returned. The inject_poison attribute is only visible
> >                  for devices supporting the capability.
> > +
> > +
> > +What:		/sys/bus/cxl/devices/memX/clear_poison
> > +Date:		December, 2022
> > +KernelVersion:	v6.2
> > +Contact:	linux-cxl@vger.kernel.org
> > +Description:
> > +		(WO) When a Device Physical Address (DPA) is written to this
> > +		attribute the memdev driver sends a clear poison command to the
> > +		device for the specified address. Clearing poison removes the
> > +		address from the device's Poison List and writes 0 (zero)
> > +		for 64 bytes starting at address. It is not an error to clear
> > +		poison from an address that does not have poison set, and if
> > +		poison was not set, the address is not overwritten. If the
> > +		device cannot clear poison from the address, ENXIO is returned.
> 
> -ENXIO ?
> 
> > +		The clear_poison attribute is only visible for devices
> > +		supporting the capability.
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > index 71130813030f..85caffd5a85c 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> > @@ -187,6 +187,44 @@ static ssize_t inject_poison_store(struct device *dev,
> >  }
> >  static DEVICE_ATTR_WO(inject_poison);
> >  
> > +static ssize_t clear_poison_store(struct device *dev,
> > +				  struct device_attribute *attr,
> > +				  const char *buf, size_t len)
> > +{
> > +	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > +	struct cxl_mbox_clear_poison *pi;
> > +	u64 dpa;
> > +	int rc;
> > +
> > +	rc = kstrtou64(buf, 0, &dpa);
> > +	if (rc)
> > +		return rc;
> > +	rc = cxl_validate_poison_dpa(cxlds, dpa);
> > +	if (rc)
> > +		return rc;
> Trivial:
> Add a blank line here.  It kind of makes sense to keep the string parser and
> validation in one block, but it is good to then separate that from the next
> bit of code.
> 
> > +	pi = kzalloc(sizeof(*pi), GFP_KERNEL);
> > +	if (!pi)
> > +		return -ENOMEM;
> > +	/*
> > +	 * In CXL 3.0 Spec 8.2.9.8.4.3, the Clear Poison mailbox command
> > +	 * is defined to accept 64 bytes of 'write-data', along with the
> > +	 * address to clear. The device writes 'write-data' into the DPA,
> > +	 * atomically, while clearing poison if the location is marked as
> > +	 * being poisoned.
> > +	 *
> > +	 * Always use '0' for the write-data.
> > +	 */
> > +	pi->address = cpu_to_le64(dpa);
> > +	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_CLEAR_POISON, pi,
> > +			       sizeof(*pi), NULL, cxlds->payload_size);
> > +	if (rc)
> > +		return rc;
> > +
> > +	return len;
> > +}
> > +static DEVICE_ATTR_WO(clear_poison);
> ...
> 
> 
>

Patch

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 20db97f7a1aa..9d2b0fa07e17 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -435,3 +435,20 @@  Description:
 		poison into an address that already has poison present and no
 		error is returned. The inject_poison attribute is only visible
                 for devices supporting the capability.
+
+
+What:		/sys/bus/cxl/devices/memX/clear_poison
+Date:		December, 2022
+KernelVersion:	v6.2
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(WO) When a Device Physical Address (DPA) is written to this
+		attribute the memdev driver sends a clear poison command to the
+		device for the specified address. Clearing poison removes the
+		address from the device's Poison List and writes 0 (zero)
+		for 64 bytes starting at address. It is not an error to clear
+		poison from an address that does not have poison set, and if
+		poison was not set, the address is not overwritten. If the
+		device cannot clear poison from the address, ENXIO is returned.
+		The clear_poison attribute is only visible for devices
+		supporting the capability.
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 71130813030f..85caffd5a85c 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -187,6 +187,44 @@  static ssize_t inject_poison_store(struct device *dev,
 }
 static DEVICE_ATTR_WO(inject_poison);
 
+static ssize_t clear_poison_store(struct device *dev,
+				  struct device_attribute *attr,
+				  const char *buf, size_t len)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_mbox_clear_poison *pi;
+	u64 dpa;
+	int rc;
+
+	rc = kstrtou64(buf, 0, &dpa);
+	if (rc)
+		return rc;
+	rc = cxl_validate_poison_dpa(cxlds, dpa);
+	if (rc)
+		return rc;
+	pi = kzalloc(sizeof(*pi), GFP_KERNEL);
+	if (!pi)
+		return -ENOMEM;
+	/*
+	 * In CXL 3.0 Spec 8.2.9.8.4.3, the Clear Poison mailbox command
+	 * is defined to accept 64 bytes of 'write-data', along with the
+	 * address to clear. The device writes 'write-data' into the DPA,
+	 * atomically, while clearing poison if the location is marked as
+	 * being poisoned.
+	 *
+	 * Always use '0' for the write-data.
+	 */
+	pi->address = cpu_to_le64(dpa);
+	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_CLEAR_POISON, pi,
+			       sizeof(*pi), NULL, cxlds->payload_size);
+	if (rc)
+		return rc;
+
+	return len;
+}
+static DEVICE_ATTR_WO(clear_poison);
+
 static struct attribute *cxl_memdev_attributes[] = {
 	&dev_attr_serial.attr,
 	&dev_attr_firmware_version.attr,
@@ -195,6 +233,7 @@  static struct attribute *cxl_memdev_attributes[] = {
 	&dev_attr_numa_node.attr,
 	&dev_attr_trigger_poison_list.attr,
 	&dev_attr_inject_poison.attr,
+	&dev_attr_clear_poison.attr,
 	NULL,
 };
 
@@ -228,6 +267,14 @@  static umode_t cxl_memdev_visible(struct kobject *kobj, struct attribute *a,
 			      to_cxl_memdev(dev)->cxlds->enabled_cmds))
 			return 0;
 	}
+	if (a == &dev_attr_clear_poison.attr) {
+		struct device *dev = kobj_to_dev(kobj);
+
+		if (!test_bit(CXL_MEM_COMMAND_ID_CLEAR_POISON,
+			      to_cxl_memdev(dev)->cxlds->enabled_cmds)) {
+			return 0;
+		}
+	}
 	return a->mode;
 }
 
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 0d4c34be7335..532adf9c3afd 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -399,6 +399,12 @@  struct cxl_mbox_poison_payload_out {
 /* Inject & Clear Poison  CXL 3.0 Spec 8.2.9.8.4.2/3 */
 #define CXL_POISON_INJECT_RESERVED	GENMASK_ULL(5, 0)
 
+/* Clear Poison  CXL 3.0 Spec 8.2.9.8.4.3 */
+struct cxl_mbox_clear_poison {
+	__le64 address;
+	u8 write_data[64];
+} __packed;
+
 /**
  * struct cxl_mem_command - Driver representation of a memory device command
  * @info: Command information as it exists for the UAPI
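
To make the payload layout concrete, here is a userspace sketch of what the
store handler sends: an 8-byte little-endian DPA followed by 64 bytes of
zeroed write-data, 72 bytes in total. It is an illustration, not the driver
code. The names clear_poison_payload, build_clear_poison_payload() and
POISON_RESERVED_MASK are made up for this sketch, and the alignment check is
an assumption about what cxl_validate_poison_dpa() does with
CXL_POISON_INJECT_RESERVED, since that helper is not shown in this patch. The
sketch fills a caller-provided structure; in the hunk above, pi comes from
kzalloc() and no kfree() is visible on any return path shown, so a payload
this small might equally live on the stack.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Low 6 bits of the DPA, mirroring GENMASK_ULL(5, 0) in the patch. */
#define POISON_RESERVED_MASK 0x3fULL

/* Userspace mirror of struct cxl_mbox_clear_poison: 8 + 64 bytes. */
struct clear_poison_payload {
	uint8_t address[8];	/* DPA, stored little-endian */
	uint8_t write_data[64];	/* always left as zeros */
};

/* Fill *out for the given DPA; returns 0 or -EINVAL. */
int build_clear_poison_payload(uint64_t dpa, struct clear_poison_payload *out)
{
	/* Assumed check: reject a DPA that is not 64-byte aligned. */
	if (dpa & POISON_RESERVED_MASK)
		return -EINVAL;

	/* Zeroing the whole payload gives the all-zero write-data. */
	memset(out, 0, sizeof(*out));

	/* Byte-wise store, equivalent to cpu_to_le64() on any host. */
	for (int i = 0; i < 8; i++)
		out->address[i] = (uint8_t)(dpa >> (8 * i));

	return 0;
}
```

Using byte arrays for both fields keeps the structure free of padding without
needing a packed attribute, which is why the sketch does not use uint64_t for
the address.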