diff mbox series

[NDCTL,v2,2/2] cxl: Add check for regions before disabling memdev

Message ID 170120423751.2725915.8152057882418377474.stgit@djiang5-mobl3 (mailing list archive)
State New, archived
Delegated to: Vishal Verma
Headers show
Series [NDCTL,v2,1/2] cxl: Save the number of decoders committed to a port | expand

Commit Message

Dave Jiang Nov. 28, 2023, 8:43 p.m. UTC
Add a check for memdev disable to see if there are active regions present
before disabling the device. This is necessary now regions are present to
fulfill the TODO that was left there. The best way to determine if a
region is active is to see if there are decoders enabled for the mem
device. This is also best effort as the state is only a snapshot the
kernel provides and is not atomic WRT the memdev disable operation. The
expectation is the admin issuing the command has full control of the mem
device and there are no other agents also attempt to control the device.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
v2:
- Warn if active region regardless of -f. (Alison)
- Expound on -f behavior in man page. (Vishal)
---
 Documentation/cxl/cxl-disable-memdev.txt |    4 +++-
 cxl/memdev.c                             |   17 ++++++++++++++---
 2 files changed, 17 insertions(+), 4 deletions(-)

Comments

Cao, Quanquan/曹 全全 Nov. 30, 2023, 1:41 a.m. UTC | #1
> Add a check for memdev disable to see if there are active regions present
> before disabling the device. This is necessary now regions are present to
> fulfill the TODO that was left there. The best way to determine if a
> region is active is to see if there are decoders enabled for the mem
> device. This is also best effort as the state is only a snapshot the
> kernel provides and is not atomic WRT the memdev disable operation. The
> expectation is the admin issuing the command has full control of the mem
> device and there are no other agents also attempt to control the device.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
> v2:
> - Warn if active region regardless of -f. (Alison)
> - Expound on -f behavior in man page. (Vishal)
> ---
>   Documentation/cxl/cxl-disable-memdev.txt |    4 +++-
>   cxl/memdev.c                             |   17 ++++++++++++++---
>   2 files changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/cxl/cxl-disable-memdev.txt b/Documentation/cxl/cxl-disable-memdev.txt
> index c4edb93ee94a..34b720288705 100644
> --- a/Documentation/cxl/cxl-disable-memdev.txt
> +++ b/Documentation/cxl/cxl-disable-memdev.txt
> @@ -27,7 +27,9 @@ include::bus-option.txt[]
>   	a device if the tool determines the memdev is in active usage. Recall
>   	that CXL memory ranges might have been established by platform
>   	firmware and disabling an active device is akin to force removing
> -	memory from a running system.
> +	memory from a running system. Going down this path does not offline
> +	active memory if they are currently online. User is recommended to
> +	offline and disable the appropriate regions before disabling the memdevs.
>   
>   -v::
>   	Turn on verbose debug messages in the library (if libcxl was built with
> diff --git a/cxl/memdev.c b/cxl/memdev.c
> index 2dd2e7fcc4dd..1d3121915284 100644
> --- a/cxl/memdev.c
> +++ b/cxl/memdev.c
> @@ -437,14 +437,25 @@ static int action_free_dpa(struct cxl_memdev *memdev,
>   
>   static int action_disable(struct cxl_memdev *memdev, struct action_context *actx)
>   {
> +	struct cxl_endpoint *ep;
> +	struct cxl_port *port;
> +
>   	if (!cxl_memdev_is_enabled(memdev))
>   		return 0;
>   
> -	if (!param.force) {
> -		/* TODO: actually detect rather than assume active */
> +	ep = cxl_memdev_get_endpoint(memdev);
> +	if (!ep)
> +		return -ENODEV;
> +
> +	port = cxl_endpoint_get_port(ep);
> +	if (!port)
> +		return -ENODEV;
> +
> +	if (cxl_port_decoders_committed(port)) {
>   		log_err(&ml, "%s is part of an active region\n",
>   			cxl_memdev_get_devname(memdev));
> -		return -EBUSY;
> +		if (!param.force)
> +			return -EBUSY;
>   	}
>   
>   	return cxl_memdev_disable_invalidate(memdev);
> 
> 

Reviewed-by: Quanquan Cao <caoqq@fujitsu.com>
Cao, Quanquan/曹 全全 Nov. 30, 2023, 8:29 a.m. UTC | #2
>   static int action_disable(struct cxl_memdev *memdev, struct action_context *actx)
>   {
> +	struct cxl_endpoint *ep;
> +	struct cxl_port *port;
> +
>   	if (!cxl_memdev_is_enabled(memdev))
>   		return 0;
>   
> -	if (!param.force) {
> -		/* TODO: actually detect rather than assume active */
> +	ep = cxl_memdev_get_endpoint(memdev);
> +	if (!ep)
> +		return -ENODEV;
> +
> +	port = cxl_endpoint_get_port(ep);
> +	if (!port)
> +		return -ENODEV;
> +
> +	if (cxl_port_decoders_committed(port)) {
>   		log_err(&ml, "%s is part of an active region\n",
>   			cxl_memdev_get_devname(memdev));
> -		return -EBUSY;
> +		if (!param.force)
> +			return -EBUSY;
>   	}
>   
>   	return cxl_memdev_disable_invalidate(memdev);
> 
> 
Hi Dave,
Do you think adding one more prompt message would be more user-friendly?

code:
         if (cxl_port_decoders_committed(port)) {
                 log_err(&ml, "%s is part of an active region\n",
                         cxl_memdev_get_devname(memdev));
                 if (!param.force)
                         return -EBUSY;
                 else
                         log_err(&ml,"Forcing memdev disable with an 
active region\n");
         }

output:
[root@fedora-37-client ndctl]# cxl disable-memdev mem0 -f
cxl memdev: action_disable: mem0 is part of an active region
cxl memdev: action_disable: Forcing memdev disable with an active region
cxl memdev: cmd_disable_memdev: disabled 1 mem
Dave Jiang Nov. 30, 2023, 4:07 p.m. UTC | #3
On 11/30/23 01:29, Cao, Quanquan/曹 全全 wrote:
> 
>>   static int action_disable(struct cxl_memdev *memdev, struct action_context *actx)
>>   {
>> +    struct cxl_endpoint *ep;
>> +    struct cxl_port *port;
>> +
>>       if (!cxl_memdev_is_enabled(memdev))
>>           return 0;
>>   -    if (!param.force) {
>> -        /* TODO: actually detect rather than assume active */
>> +    ep = cxl_memdev_get_endpoint(memdev);
>> +    if (!ep)
>> +        return -ENODEV;
>> +
>> +    port = cxl_endpoint_get_port(ep);
>> +    if (!port)
>> +        return -ENODEV;
>> +
>> +    if (cxl_port_decoders_committed(port)) {
>>           log_err(&ml, "%s is part of an active region\n",
>>               cxl_memdev_get_devname(memdev));
>> -        return -EBUSY;
>> +        if (!param.force)
>> +            return -EBUSY;
>>       }
>>         return cxl_memdev_disable_invalidate(memdev);
>>
>>
> Hi Dave,
> Do you think adding one more prompt message would be more user-friendly?

Yes good idea. I'll add.

> 
> code:
>         if (cxl_port_decoders_committed(port)) {
>                 log_err(&ml, "%s is part of an active region\n",
>                         cxl_memdev_get_devname(memdev));
>                 if (!param.force)
>                         return -EBUSY;
>                 else
>                         log_err(&ml,"Forcing memdev disable with an active region\n");
>         }
> 
> output:
> [root@fedora-37-client ndctl]# cxl disable-memdev mem0 -f
> cxl memdev: action_disable: mem0 is part of an active region
> cxl memdev: action_disable: Forcing memdev disable with an active region
> cxl memdev: cmd_disable_memdev: disabled 1 mem
>
diff mbox series

Patch

diff --git a/Documentation/cxl/cxl-disable-memdev.txt b/Documentation/cxl/cxl-disable-memdev.txt
index c4edb93ee94a..34b720288705 100644
--- a/Documentation/cxl/cxl-disable-memdev.txt
+++ b/Documentation/cxl/cxl-disable-memdev.txt
@@ -27,7 +27,9 @@  include::bus-option.txt[]
 	a device if the tool determines the memdev is in active usage. Recall
 	that CXL memory ranges might have been established by platform
 	firmware and disabling an active device is akin to force removing
-	memory from a running system.
+	memory from a running system. Going down this path does not offline
+	active memory if they are currently online. User is recommended to
+	offline and disable the appropriate regions before disabling the memdevs.
 
 -v::
 	Turn on verbose debug messages in the library (if libcxl was built with
diff --git a/cxl/memdev.c b/cxl/memdev.c
index 2dd2e7fcc4dd..1d3121915284 100644
--- a/cxl/memdev.c
+++ b/cxl/memdev.c
@@ -437,14 +437,25 @@  static int action_free_dpa(struct cxl_memdev *memdev,
 
 static int action_disable(struct cxl_memdev *memdev, struct action_context *actx)
 {
+	struct cxl_endpoint *ep;
+	struct cxl_port *port;
+
 	if (!cxl_memdev_is_enabled(memdev))
 		return 0;
 
-	if (!param.force) {
-		/* TODO: actually detect rather than assume active */
+	ep = cxl_memdev_get_endpoint(memdev);
+	if (!ep)
+		return -ENODEV;
+
+	port = cxl_endpoint_get_port(ep);
+	if (!port)
+		return -ENODEV;
+
+	if (cxl_port_decoders_committed(port)) {
 		log_err(&ml, "%s is part of an active region\n",
 			cxl_memdev_get_devname(memdev));
-		return -EBUSY;
+		if (!param.force)
+			return -EBUSY;
 	}
 
 	return cxl_memdev_disable_invalidate(memdev);