diff mbox series

[v2] cxl/region: don't try to cleanup after cxl_region_setup_targets() fails

Message ID 169703589120.1202031.14696100866518083806.stgit@bgt-140510-bm03.eng.stellus.in
State Accepted
Commit 0718588c7aaa7a1510b4de972370535b61dddd0d
Series [v2] cxl/region: don't try to cleanup after cxl_region_setup_targets() fails

Commit Message

Jim Harris Oct. 11, 2023, 2:51 p.m. UTC
Patch 5e42bcbc ("cxl/region: decrement ->nr_targets on error in
cxl_region_attach()") tried to avoid 'eiw' initialization errors when
->nr_targets exceeded 16, by just decrementing ->nr_targets when
cxl_region_setup_targets() failed. Patch 86987c76 ("cxl/region: Cleanup
target list on attach error") extended that cleanup to also clear
cxled->pos and p->targets[pos].

The initialization error was incidentally fixed separately by patch
8d4285425 ("cxl/region: Fix port setup uninitialized variable warnings")
which was merged a few days after 5e42bcbc.

But now the original cleanup when cxl_region_setup_targets() fails
prevents endpoint and switch decoder resources from being reused:

1) the cleanup does not set the decoder's region to NULL, which results
   in future dpa_size_store() calls returning -EBUSY
2) the decoder is not properly freed, which results in future commit
   errors associated with the upstream switch

Now that the initialization errors were fixed separately, the proper
cleanup for this case is to just return immediately. Then the resources
associated with this target get cleaned up as normal when the failed
region is deleted.
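
To illustrate the reasoning above, here is a minimal userspace sketch, not the kernel code. Every name in it (struct decoder, struct region, decoder_resize(), region_attach(), region_delete(), WAYS, setup_rc) is hypothetical and only models the bookkeeping described in this message: a decoder whose region pointer is still set reports -EBUSY, attach returns immediately when setup fails, and deleting the region is the one place that releases every target.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define WAYS 2  /* hypothetical interleave width for this sketch */

struct region;

struct decoder {
	struct region *region;  /* non-NULL while the decoder is claimed */
	int pos;
};

struct region {
	int nr_targets;
	int setup_rc;           /* stand-in for the result of target setup */
	struct decoder *targets[WAYS];
};

/* Stand-in for a resize request: a claimed decoder cannot be resized. */
static int decoder_resize(const struct decoder *d)
{
	return d->region ? -EBUSY : 0;
}

/* On setup failure, return at once and leave all bookkeeping in place. */
static int region_attach(struct region *r, struct decoder *d)
{
	assert(r && d);
	if (r->nr_targets >= WAYS)
		return -EINVAL;

	d->region = r;
	d->pos = r->nr_targets;
	r->targets[r->nr_targets++] = d;

	if (r->nr_targets == WAYS && r->setup_rc)
		return r->setup_rc;
	return 0;
}

/* Deleting the region is the single place that releases every target. */
static void region_delete(struct region *r)
{
	for (int i = 0; i < r->nr_targets; i++) {
		r->targets[i]->region = NULL;
		r->targets[i]->pos = -1;
		r->targets[i] = NULL;
	}
	r->nr_targets = 0;
}
```

In this model, a partial cleanup on the attach error path (dropping the target slot but leaving d->region set) would keep decoder_resize() returning -EBUSY even after the failed region is deleted, because the region would no longer know about that decoder. Leaving the state untouched lets region_delete() release everything consistently.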

The ->nr_targets decrement in the error case also helped prevent
a p->targets[] array overflow, so add a new check to guard against
that overflow.
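
As a rough illustration of the overflow guard, the following standalone sketch models only the counter and the array. The struct layout, MAX_TARGETS, and region_attach() are assumptions made for this example and are simplified relative to the driver, where the position is supplied by the caller rather than derived from the count.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_TARGETS 16  /* hypothetical capacity of the targets[] array */

struct endpoint {
	int pos;
};

struct region_params {
	int interleave_ways;   /* number of endpoints the region expects */
	int nr_targets;        /* number of endpoints attached so far */
	struct endpoint *targets[MAX_TARGETS];
};

/* Attach one endpoint; refuse once the region is already full. */
static int region_attach(struct region_params *p, struct endpoint *ep)
{
	assert(p && ep);
	assert(p->interleave_ways <= MAX_TARGETS);

	/*
	 * Without this check, an attach attempted after a failed setup
	 * (where nr_targets is no longer decremented) would write past
	 * the slots the region is configured to use.
	 */
	if (p->nr_targets >= p->interleave_ways)
		return -EINVAL;

	ep->pos = p->nr_targets;
	p->targets[p->nr_targets++] = ep;
	return 0;
}
```

With interleave_ways set to 2, the first two calls succeed and a third returns -EINVAL while nr_targets stays at 2.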

Tested by trying to create an invalid region for a 2 switch * 2 endpoint
topology, and then following up with creating a valid region.

Signed-off-by: Jim Harris <jim.harris@samsung.com>
---
 drivers/cxl/core/region.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Comments

Dan Carpenter Oct. 11, 2023, 2:57 p.m. UTC | #1
Acked-by: Dan Carpenter <dan.carpenter@linaro.org>

regards,
dan carpenter

Jonathan Cameron Oct. 11, 2023, 8:41 p.m. UTC | #2
On Wed, 11 Oct 2023 14:51:31 +0000
Jim Harris <jim.harris@samsung.com> wrote:

> Patch 5e42bcbc ("cxl/region: decrement ->nr_targets on error in
> cxl_region_attach()") tried to avoid 'eiw' initialization errors when
> ->nr_targets exceeded 16, by just decrementing ->nr_targets when  
> cxl_region_setup_targets() failed. Patch 86987c76 ("cxl/region: Cleanup
> target list on attach error") extended that cleanup to also clear
> cxled->pos and p->targets[pos].
> 
> The initialization error was incidentally fixed separately by patch
> 8d4285425 ("cxl/region: Fix port setup uninitialized variable warnings")
> which was merged a few days after 5e42bcbc.
> 
> But now the original cleanup when cxl_region_setup_targets() fails
> prevents endpoint and switch decoder resources from being reused:
> 
> 1) the cleanup does not set the decoder's region to NULL, which results
>    in future dpa_size_store() calls returning -EBUSY
> 2) the decoder is not properly freed, which results in future commit
>    errors associated with the upstream switch
> 
> Now that the initialization errors were fixed separately, the proper
> cleanup for this case is to just return immediately. Then the resources
> associated with this target get cleanup up as normal when the failed
> region is deleted.
> 
> The ->nr_targets decrement in the error case also helped prevent
> a p->targets[] array overflow, so add a new check to prevent against
> that overflow.
> 
> Tested by trying to create an invalid region for a 2 switch * 2 endpoint
> topology, and then following up with creating a valid region.
> 
> Signed-off-by: Jim Harris <jim.harris@samsung.com>

I agree with your analysis and that this seems to fix the cases you've called out.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>


> ---
>  drivers/cxl/core/region.c |   14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 6d63b8798c29..2b3b3c62d0a7 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1658,6 +1658,12 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  		return -ENXIO;
>  	}
>  
> +	if (p->nr_targets >= p->interleave_ways) {
> +		dev_dbg(&cxlr->dev, "region already has %d endpoints\n",
> +			p->nr_targets);
> +		return -EINVAL;
> +	}
> +
>  	ep_port = cxled_to_port(cxled);
>  	root_port = cxlrd_to_port(cxlrd);
>  	dport = cxl_find_dport_by_dev(root_port, ep_port->host_bridge);
> @@ -1750,7 +1756,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  	if (p->nr_targets == p->interleave_ways) {
>  		rc = cxl_region_setup_targets(cxlr);
>  		if (rc)
> -			goto err_decrement;
> +			return rc;
>  		p->state = CXL_CONFIG_ACTIVE;
>  	}
>  
> @@ -1762,12 +1768,6 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  	};
>  
>  	return 0;
> -
> -err_decrement:
> -	p->nr_targets--;
> -	cxled->pos = -1;
> -	p->targets[pos] = NULL;
> -	return rc;
>  }
>  
>  static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
>

Dave Jiang Oct. 13, 2023, 4:57 p.m. UTC | #3
On 10/11/23 07:51, Jim Harris wrote:
> Patch 5e42bcbc ("cxl/region: decrement ->nr_targets on error in
> cxl_region_attach()") tried to avoid 'eiw' initialization errors when
> ->nr_targets exceeded 16, by just decrementing ->nr_targets when
> cxl_region_setup_targets() failed. Patch 86987c76 ("cxl/region: Cleanup
> target list on attach error") extended that cleanup to also clear
> cxled->pos and p->targets[pos].
> 
> The initialization error was incidentally fixed separately by patch
> 8d4285425 ("cxl/region: Fix port setup uninitialized variable warnings")
> which was merged a few days after 5e42bcbc.
> 
> But now the original cleanup when cxl_region_setup_targets() fails
> prevents endpoint and switch decoder resources from being reused:
> 
> 1) the cleanup does not set the decoder's region to NULL, which results
>    in future dpa_size_store() calls returning -EBUSY
> 2) the decoder is not properly freed, which results in future commit
>    errors associated with the upstream switch
> 
> Now that the initialization errors were fixed separately, the proper
> cleanup for this case is to just return immediately. Then the resources
> associated with this target get cleanup up as normal when the failed
> region is deleted.
> 
> The ->nr_targets decrement in the error case also helped prevent
> a p->targets[] array overflow, so add a new check to prevent against
> that overflow.
> 
> Tested by trying to create an invalid region for a 2 switch * 2 endpoint
> topology, and then following up with creating a valid region.
> 
> Signed-off-by: Jim Harris <jim.harris@samsung.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  drivers/cxl/core/region.c |   14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 6d63b8798c29..2b3b3c62d0a7 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1658,6 +1658,12 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  		return -ENXIO;
>  	}
>  
> +	if (p->nr_targets >= p->interleave_ways) {
> +		dev_dbg(&cxlr->dev, "region already has %d endpoints\n",
> +			p->nr_targets);
> +		return -EINVAL;
> +	}
> +
>  	ep_port = cxled_to_port(cxled);
>  	root_port = cxlrd_to_port(cxlrd);
>  	dport = cxl_find_dport_by_dev(root_port, ep_port->host_bridge);
> @@ -1750,7 +1756,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  	if (p->nr_targets == p->interleave_ways) {
>  		rc = cxl_region_setup_targets(cxlr);
>  		if (rc)
> -			goto err_decrement;
> +			return rc;
>  		p->state = CXL_CONFIG_ACTIVE;
>  	}
>  
> @@ -1762,12 +1768,6 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  	};
>  
>  	return 0;
> -
> -err_decrement:
> -	p->nr_targets--;
> -	cxled->pos = -1;
> -	p->targets[pos] = NULL;
> -	return rc;
>  }
>  
>  static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
>

Dan Williams Oct. 24, 2023, 11:01 p.m. UTC | #4
Jim Harris wrote:
> Patch 5e42bcbc ("cxl/region: decrement ->nr_targets on error in
> cxl_region_attach()") tried to avoid 'eiw' initialization errors when
> ->nr_targets exceeded 16, by just decrementing ->nr_targets when
> cxl_region_setup_targets() failed. Patch 86987c76 ("cxl/region: Cleanup
> target list on attach error") extended that cleanup to also clear
> cxled->pos and p->targets[pos].
> 
> The initialization error was incidentally fixed separately by patch
> 8d4285425 ("cxl/region: Fix port setup uninitialized variable warnings")
> which was merged a few days after 5e42bcbc.

Patch looks good, but I did reflow the above paragraphs to have commit
references per checkpatch expectations. I believe it did not flag them
for you as it did not recognize "Patch <SHA>" as referring to a commit:

    Commit 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in
    cxl_region_attach()") tried to avoid 'eiw' initialization errors when
    ->nr_targets exceeded 16, by just decrementing ->nr_targets when
    cxl_region_setup_targets() failed.
    
    Commit 86987c766276 ("cxl/region: Cleanup target list on attach error")
    extended that cleanup to also clear cxled->pos and p->targets[pos]. The
    initialization error was incidentally fixed separately by: 
    Commit 8d4285425714 ("cxl/region: Fix port setup uninitialized variable
    warnings") which was merged a few days after 5e42bcbc3fef.

I also went ahead and added:

    Fixes: 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()")
    Cc: <stable@vger.kernel.org>

Otherwise, good find, thanks Jim!

Jim Harris Oct. 25, 2023, 1:37 a.m. UTC | #5
On Tue, Oct 24, 2023 at 04:01:19PM -0700, Dan Williams wrote:
> 
> Patch looks good, but I did reflow the above paragraphs to have commit
> references per checkpatch expectations. I believe it did not flag them
> for you as it did not recognize "Patch <SHA>" as referring to a commit:
> 
>     Commit 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in
>     cxl_region_attach()") tried to avoid 'eiw' initialization errors when
>     ->nr_targets exceeded 16, by just decrementing ->nr_targets when
>     cxl_region_setup_targets() failed.
>     
>     Commit 86987c766276 ("cxl/region: Cleanup target list on attach error")
>     extended that cleanup to also clear cxled->pos and p->targets[pos]. The
>     initialization error was incidentally fixed separately by: 
>     Commit 8d4285425714 ("cxl/region: Fix port setup uninitialized variable
>     warnings") which was merged a few days after 5e42bcbc3fef.
> 
> I also went ahead and added:
> 
>     Fixes: 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()")
>     Cc: <stable@vger.kernel.org>
> 
Thanks Dan, I'll keep an eye out for these in the future.

Patch

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 6d63b8798c29..2b3b3c62d0a7 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1658,6 +1658,12 @@  static int cxl_region_attach(struct cxl_region *cxlr,
 		return -ENXIO;
 	}
 
+	if (p->nr_targets >= p->interleave_ways) {
+		dev_dbg(&cxlr->dev, "region already has %d endpoints\n",
+			p->nr_targets);
+		return -EINVAL;
+	}
+
 	ep_port = cxled_to_port(cxled);
 	root_port = cxlrd_to_port(cxlrd);
 	dport = cxl_find_dport_by_dev(root_port, ep_port->host_bridge);
@@ -1750,7 +1756,7 @@  static int cxl_region_attach(struct cxl_region *cxlr,
 	if (p->nr_targets == p->interleave_ways) {
 		rc = cxl_region_setup_targets(cxlr);
 		if (rc)
-			goto err_decrement;
+			return rc;
 		p->state = CXL_CONFIG_ACTIVE;
 	}
 
@@ -1762,12 +1768,6 @@  static int cxl_region_attach(struct cxl_region *cxlr,
 	};
 
 	return 0;
-
-err_decrement:
-	p->nr_targets--;
-	cxled->pos = -1;
-	p->targets[pos] = NULL;
-	return rc;
 }
 
 static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)