diff mbox series

[v2] cxl/core/port: defer endpoint probes when ACPI likely hasn't finished

Message ID 20241004212504.1246-1-gourry@gourry.net
State New
Headers show
Series [v2] cxl/core/port: defer endpoint probes when ACPI likely hasn't finished | expand

Commit Message

Gregory Price Oct. 4, 2024, 9:25 p.m. UTC
In cxl_acpi_probe, we add dports and uports to host bridges iteratively:
- bus_for_each_dev(adev->dev.bus, NULL, root_port, add_host_bridge_dport);
- bus_for_each_dev(adev->dev.bus, NULL, root_port, add_host_bridge_uport);

Simultaneously, as ports are probed, memdev endpoints can also be
probed. This creates a race condition, where an endpoint can perceive
its path to the root being broken in devm_cxl_enumerate_ports.

The memdev/endpoint probe will see a heirarchy that may look like:
    mem1
      parent => 0000:c1:00.0
        parent => 0000:c0:01.1
	  parent->parent => NULL

This results in find_cxl_port() returning NULL (since the port hasn't
been associated with the host bridge yet), and add_port_attach_ep
fails because the grandparent's grandparent is NULL.

When the latter condition is detected, the comments suggest:
    /*
     * The iteration reached the topology root without finding the
     * CXL-root 'cxl_port' on a previous iteration, fail for now to
     * be re-probed after platform driver attaches.
     */

This case results in an -ENXIO; however, a re-probe never occurs. Change
this return condition to -EPROBE_DEFER to explicitly cause a reprobe.

(v2: additional debug information)

Signed-off-by: Gregory Price <gourry@gourry.net>
---
 drivers/cxl/core/port.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Dan Williams Oct. 5, 2024, 12:08 a.m. UTC | #1
Gregory Price wrote:
> In cxl_acpi_probe, we add dports and uports to host bridges iteratively:
> - bus_for_each_dev(adev->dev.bus, NULL, root_port, add_host_bridge_dport);
> - bus_for_each_dev(adev->dev.bus, NULL, root_port, add_host_bridge_uport);
> 
> Simultaneously, as ports are probed, memdev endpoints can also be
> probed. This creates a race condition, where an endpoint can perceive
> its path to the root being broken in devm_cxl_enumerate_ports.
> 
> The memdev/endpoint probe will see a heirarchy that may look like:
>     mem1
>       parent => 0000:c1:00.0
>         parent => 0000:c0:01.1
> 	  parent->parent => NULL
> 
> This results in find_cxl_port() returning NULL (since the port hasn't
> been associated with the host bridge yet), and add_port_attach_ep
> fails because the grandparent's grandparent is NULL.
> 
> When the latter condition is detected, the comments suggest:
>     /*
>      * The iteration reached the topology root without finding the
>      * CXL-root 'cxl_port' on a previous iteration, fail for now to
>      * be re-probed after platform driver attaches.
>      */
> 
> This case results in an -ENXIO; however, a re-probe never occurs. Change
> this return condition to -EPROBE_DEFER to explicitly cause a reprobe.

Ok, thanks for the additional debug. Like we chatted on the CXL Discord
I think this is potentially pointing to a bug in bus_rescan_devices()
where it checks dev->driver without holding the lock.

Can you give this fix a try to see if it also resolves the issue?
Effectively, cxl_bus_rescan() is always needed in case the cxl_acpi
driver loads waaaay after deferred probing has given up, and if this
works then EPROBE_DEFER can remain limited to cases where it is
absolutely known that no other device_attach() kick is coming to save
the day.

-- 8< --
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 1d5007e3795a..6c0cd94888a3 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -2088,11 +2088,18 @@ static void cxl_bus_remove(struct device *dev)
 
 static struct workqueue_struct *cxl_bus_wq;
 
-static void cxl_bus_rescan_queue(struct work_struct *w)
+static int attach_device(struct device *dev, void *data)
 {
-	int rc = bus_rescan_devices(&cxl_bus_type);
+	int rc = device_attach(dev);
+
+	dev_vdbg(dev, "rescan: %s\n", rc ? "attach" : "detached");
 
-	pr_debug("CXL bus rescan result: %d\n", rc);
+	return 0;
+}
+
+static void cxl_bus_rescan_queue(struct work_struct *w)
+{
+	bus_for_each_dev(&cxl_bus_type, NULL, NULL, attach_device);
 }
 
 void cxl_bus_rescan(void)
Gregory Price Oct. 5, 2024, 2:05 a.m. UTC | #2
On Fri, Oct 04, 2024 at 05:08:03PM -0700, Dan Williams wrote:
> Gregory Price wrote:
> > In cxl_acpi_probe, we add dports and uports to host bridges iteratively:
> > - bus_for_each_dev(adev->dev.bus, NULL, root_port, add_host_bridge_dport);
> > - bus_for_each_dev(adev->dev.bus, NULL, root_port, add_host_bridge_uport);
> > 
> > Simultaneously, as ports are probed, memdev endpoints can also be
> > probed. This creates a race condition, where an endpoint can perceive
> > its path to the root being broken in devm_cxl_enumerate_ports.
> > 
> > The memdev/endpoint probe will see a heirarchy that may look like:
> >     mem1
> >       parent => 0000:c1:00.0
> >         parent => 0000:c0:01.1
> > 	  parent->parent => NULL
> > 
> > This results in find_cxl_port() returning NULL (since the port hasn't
> > been associated with the host bridge yet), and add_port_attach_ep
> > fails because the grandparent's grandparent is NULL.
> > 
> > When the latter condition is detected, the comments suggest:
> >     /*
> >      * The iteration reached the topology root without finding the
> >      * CXL-root 'cxl_port' on a previous iteration, fail for now to
> >      * be re-probed after platform driver attaches.
> >      */
> > 
> > This case results in an -ENXIO; however, a re-probe never occurs. Change
> > this return condition to -EPROBE_DEFER to explicitly cause a reprobe.
> 
> Ok, thanks for the additional debug. Like we chatted on the CXL Discord
> I think this is potentially pointing to a bug in bus_rescan_devices()
> where it checks dev->driver without holding the lock.
> 
> Can you give this fix a try to see if it also resolves the issue?
> Effectively, cxl_bus_rescan() is always needed in case the cxl_acpi
> driver loads waaaay after deferred probing has given up, and if this
> works then EPROBE_DEFER can remain limited to cases where it is
> absolutely known that no other device_attach() kick is coming to save
> the day.
> 

Funny enough, not only did it not work, but now i get neither endpoint lol

$ ls /sys/bus/cxl/devices/
decoder0.0  decoder1.0  decoder2.0  decoder3.1  mem0  port1  port3  root0
decoder0.1  decoder1.1  decoder3.0  decoder4.0  mem1  port2  port4

w/ reprobe patch

# ls /sys/bus/cxl/devices
decoder0.0  decoder1.0  decoder2.0  decoder3.1  decoder5.0  decoder6.0  endpoint5  mem0  port1  port3  root0
decoder0.1  decoder1.1  decoder3.0  decoder4.0  decoder5.1  decoder6.1  endpoint6  mem1  port2  port4

> -- 8< --
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 1d5007e3795a..6c0cd94888a3 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -2088,11 +2088,18 @@ static void cxl_bus_remove(struct device *dev)
>  
>  static struct workqueue_struct *cxl_bus_wq;
>  
> -static void cxl_bus_rescan_queue(struct work_struct *w)
> +static int attach_device(struct device *dev, void *data)
>  {
> -	int rc = bus_rescan_devices(&cxl_bus_type);
> +	int rc = device_attach(dev);
> +
> +	dev_vdbg(dev, "rescan: %s\n", rc ? "attach" : "detached");
>  
> -	pr_debug("CXL bus rescan result: %d\n", rc);
> +	return 0;
> +}
> +
> +static void cxl_bus_rescan_queue(struct work_struct *w)
> +{
> +	bus_for_each_dev(&cxl_bus_type, NULL, NULL, attach_device);
>  }
>  
>  void cxl_bus_rescan(void)
diff mbox series

Patch

diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 1d5007e3795a..d6bebf70d142 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1553,7 +1553,7 @@  static int add_port_attach_ep(struct cxl_memdev *cxlmd,
 		 */
 		dev_dbg(&cxlmd->dev, "%s is a root dport\n",
 			dev_name(dport_dev));
-		return -ENXIO;
+		return -EPROBE_DEFER;
 	}
 
 	parent_port = find_cxl_port(dparent, &parent_dport);