diff mbox series

[v3,1/3] cxl/region: Fix a race condition in memory hotplug notifier

Message ID 20240618084639.1419629-2-ying.huang@intel.com
State Accepted
Commit a3483ee7e6a7f2d12b5950246f4e0ef94f4a5df0
Headers show
Series cxl/region: Support to calculate memory tier abstract distance | expand

Commit Message

Huang, Ying June 18, 2024, 8:46 a.m. UTC
In the memory hotplug notifier function of the CXL region,
cxl_region_perf_attrs_callback(), the node ID is obtained by checking
the host address range of the region. However, the address range
information is not available when the region is registered in
devm_cxl_add_region(). Additionally, this information may be removed
or added under the protection of cxl_region_rwsem during runtime. If
the memory notifier is called for nodes other than that backed by the
region, a race condition may occur, potentially leading to a NULL
dereference or an invalid address range.

The race condition is addressed by checking the availability of the
address range information under the protection of cxl_region_rwsem. To
enhance code readability and use guard(), the relevant code has been
moved into a newly added function: cxl_region_nid().

Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Bharata B Rao <bharata@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
---
 drivers/cxl/core/region.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

Comments

Jonathan Cameron June 20, 2024, 11:10 a.m. UTC | #1
On Tue, 18 Jun 2024 16:46:37 +0800
Huang Ying <ying.huang@intel.com> wrote:

> In the memory hotplug notifier function of the CXL region,
> cxl_region_perf_attrs_callback(), the node ID is obtained by checking
> the host address range of the region. However, the address range
> information is not available when the region is registered in
> devm_cxl_add_region(). Additionally, this information may be removed
> or added under the protection of cxl_region_rwsem during runtime. If
> the memory notifier is called for nodes other than that backed by the
> region, a race condition may occur, potentially leading to a NULL
> dereference or an invalid address range.
> 
> The race condition is addressed by checking the availability of the
> address range information under the protection of cxl_region_rwsem. To
> enhance code readability and use guard(), the relevant code has been
> moved into a newly added function: cxl_region_nid().
> 
> Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region")
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Looks good to me and matches similar cases.  Thanks for the detailed patch
description btw

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Davidlohr Bueso June 24, 2024, 4:18 p.m. UTC | #2
On Tue, 18 Jun 2024, Huang Ying wrote:

>In the memory hotplug notifier function of the CXL region,
>cxl_region_perf_attrs_callback(), the node ID is obtained by checking
>the host address range of the region. However, the address range
>information is not available when the region is registered in
>devm_cxl_add_region(). Additionally, this information may be removed
>or added under the protection of cxl_region_rwsem during runtime. If
>the memory notifier is called for nodes other than that backed by the
>region, a race condition may occur, potentially leading to a NULL
>dereference or an invalid address range.
>
>The race condition is addressed by checking the availability of the
>address range information under the protection of cxl_region_rwsem. To
>enhance code readability and use guard(), the relevant code has been
>moved into a newly added function: cxl_region_nid().

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Gregory Price July 23, 2024, 2:31 p.m. UTC | #3
On Tue, Jun 18, 2024 at 04:46:37PM +0800, Huang Ying wrote:
> In the memory hotplug notifier function of the CXL region,
> cxl_region_perf_attrs_callback(), the node ID is obtained by checking
> the host address range of the region. However, the address range
> information is not available when the region is registered in
> devm_cxl_add_region(). Additionally, this information may be removed
> or added under the protection of cxl_region_rwsem during runtime. If
> the memory notifier is called for nodes other than that backed by the
> region, a race condition may occur, potentially leading to a NULL
> dereference or an invalid address range.
> 
> The race condition is addressed by checking the availability of the
> address range information under the protection of cxl_region_rwsem. To
> enhance code readability and use guard(), the relevant code has been
> moved into a newly added function: cxl_region_nid().
> 
> Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region")
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>

Good catch

Reviewed-by: Gregory Price <gourry@gourry.net>
Fan Ni July 23, 2024, 5 p.m. UTC | #4
On Tue, Jun 18, 2024 at 04:46:37PM +0800, Huang Ying wrote:
> In the memory hotplug notifier function of the CXL region,
> cxl_region_perf_attrs_callback(), the node ID is obtained by checking
> the host address range of the region. However, the address range
> information is not available when the region is registered in
> devm_cxl_add_region(). Additionally, this information may be removed
> or added under the protection of cxl_region_rwsem during runtime. If
> the memory notifier is called for nodes other than that backed by the
> region, a race condition may occur, potentially leading to a NULL
> dereference or an invalid address range.
> 
> The race condition is addressed by checking the availability of the
> address range information under the protection of cxl_region_rwsem. To
> enhance code readability and use guard(), the relevant code has been
> moved into a newly added function: cxl_region_nid().
> 
> Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region")
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Vishal Verma <vishal.l.verma@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> ---

Reviewed-by: Fan Ni <fan.ni@samsung.com>

>  drivers/cxl/core/region.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 3c2b6144be23..51aeef2c012c 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2304,14 +2304,25 @@ static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
>  	return true;
>  }
>  
> +static int cxl_region_nid(struct cxl_region *cxlr)
> +{
> +	struct cxl_region_params *p = &cxlr->params;
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_decoder *cxld;
> +
> +	guard(rwsem_read)(&cxl_region_rwsem);
> +	cxled = p->targets[0];
> +	if (!cxled)
> +		return NUMA_NO_NODE;
> +	cxld = &cxled->cxld;
> +	return phys_to_target_node(cxld->hpa_range.start);
> +}
> +
>  static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
>  					  unsigned long action, void *arg)
>  {
>  	struct cxl_region *cxlr = container_of(nb, struct cxl_region,
>  					       memory_notifier);
> -	struct cxl_region_params *p = &cxlr->params;
> -	struct cxl_endpoint_decoder *cxled = p->targets[0];
> -	struct cxl_decoder *cxld = &cxled->cxld;
>  	struct memory_notify *mnb = arg;
>  	int nid = mnb->status_change_nid;
>  	int region_nid;
> @@ -2319,7 +2330,7 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
>  	if (nid == NUMA_NO_NODE || action != MEM_ONLINE)
>  		return NOTIFY_DONE;
>  
> -	region_nid = phys_to_target_node(cxld->hpa_range.start);
> +	region_nid = cxl_region_nid(cxlr);
>  	if (nid != region_nid)
>  		return NOTIFY_DONE;
>  
> -- 
> 2.39.2
>
diff mbox series

Patch

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 3c2b6144be23..51aeef2c012c 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2304,14 +2304,25 @@  static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
 	return true;
 }
 
+static int cxl_region_nid(struct cxl_region *cxlr)
+{
+	struct cxl_region_params *p = &cxlr->params;
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_decoder *cxld;
+
+	guard(rwsem_read)(&cxl_region_rwsem);
+	cxled = p->targets[0];
+	if (!cxled)
+		return NUMA_NO_NODE;
+	cxld = &cxled->cxld;
+	return phys_to_target_node(cxld->hpa_range.start);
+}
+
 static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
 					  unsigned long action, void *arg)
 {
 	struct cxl_region *cxlr = container_of(nb, struct cxl_region,
 					       memory_notifier);
-	struct cxl_region_params *p = &cxlr->params;
-	struct cxl_endpoint_decoder *cxled = p->targets[0];
-	struct cxl_decoder *cxld = &cxled->cxld;
 	struct memory_notify *mnb = arg;
 	int nid = mnb->status_change_nid;
 	int region_nid;
@@ -2319,7 +2330,7 @@  static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
 	if (nid == NUMA_NO_NODE || action != MEM_ONLINE)
 		return NOTIFY_DONE;
 
-	region_nid = phys_to_target_node(cxld->hpa_range.start);
+	region_nid = cxl_region_nid(cxlr);
 	if (nid != region_nid)
 		return NOTIFY_DONE;