diff mbox series

Qestion: of property has dead-lock to call .probe()

Message ID 87bk148g3v.wl-kuninori.morimoto.gx@renesas.com (mailing list archive)
State New, archived
Headers show
Series Qestion: of property has dead-lock to call .probe() | expand

Commit Message

Kuninori Morimoto Sept. 3, 2024, 11:36 p.m. UTC
Hi Saravana, Rob
Cc Mark/ALSA-ML, Geert, Laurent

I got dead-lock issue on drivers/of/property.c to call .probe() 
I'm now using v6.11-rc5.

I'm using 3 devices, and these are connected via OF-Graph.
These are connected like this so far. Let's call it as [Style-A]

[Style-A]
	+---+  +---+
	|(A)+--+   |
	+-+-+  |   |
	  |    |(B)|
	+-+-+  |   |
	|(C)+--+   |
	+---+  +---+

These are using port/endpoint to connect. It works well for now,
no issue I have so far.

(B) is connector for (A)-(C) data, thus, (A) and (C) should
be probed first, and (B) needs probe later
(For ALSA member, this (B) is Audio Graph Card2).

Here, (A)-(C) can connect directly for some data.
Now it is using both data connection (= (A)-(B)-(C) and (A)-(C))

Now, I want to disconnect (A)-(C) connection, like below
Let's call it as [Style-B]

[Style-B]
	+---+  +---+
	|(A)+--+   |
	+---+  |   |
	       |(B)|
	+---+  |   |
	|(C)+--+   |
	+---+  +---+

Then, it seems dead-lock happen.

In my debug, it seems...
	- (B) is handled as supplier for (A).
	- (B) probe() is called, but it needs (A) info which is not yet
	  probed. So it returns -EPROBE_DEFER.
	- Because (B) is not probed, (A) probe() never called

In [Style-A], it seems __fwnode_link_cycle() is called for (A)-(B)
connection, so the dead-lock will be solved.
But it is not called for [Style-B]. Because of that (B) is always handled
as supplier for (A).

If I used below patch, and use "non-supplier" property on (B), this
dead-lock issue was solved. But I know this is not a good solution.

I think it is very normal connection, not super special.
How can I solve this issue on correct way ? Or how can I indicate you
my issue more detail ?
I can add debug patch and test it if you can indicate it to me.

-----------------------------------------------
Thank you for your help !!

Best regards
---
Kuninori Morimoto

Comments

Saravana Kannan Sept. 4, 2024, 4:10 a.m. UTC | #1
On Tue, Sep 3, 2024 at 4:36 PM Kuninori Morimoto
<kuninori.morimoto.gx@renesas.com> wrote:
>
>
> Hi Saravana, Rob
> Cc Mark/ALSA-ML, Geert, Laurent
>
> I got dead-lock issue on drivers/of/property.c to call .probe()
> I'm now using v6.11-rc5.
>
> I'm using 3 devices, and these are connected via OF-Graph.
> These are connected like this so far. Let's call it as [Style-A]
>
> [Style-A]
>         +---+  +---+
>         |(A)+--+   |
>         +-+-+  |   |
>           |    |(B)|
>         +-+-+  |   |
>         |(C)+--+   |
>         +---+  +---+
>
> These are using port/endpoint to connect. It works well for now,
> no issue I have so far.
>
> (B) is connector for (A)-(C) data, thus, (A) and (C) should
> be probed first, and (B) needs probe later
> (For ALSA member, this (B) is Audio Graph Card2).
>
> Here, (A)-(C) can connect directly for some data.
> Now it is using both data connection (= (A)-(B)-(C) and (A)-(C))
>
> Now, I want to disconnect (A)-(C) connection, like below
> Let's call it as [Style-B]
>
> [Style-B]
>         +---+  +---+
>         |(A)+--+   |
>         +---+  |   |
>                |(B)|
>         +---+  |   |
>         |(C)+--+   |
>         +---+  +---+
>
> Then, it seems dead-lock happen.
>
> In my debug, it seems...
>         - (B) is handled as supplier for (A).
>         - (B) probe() is called, but it needs (A) info which is not yet
>           probed. So it returns -EPROBE_DEFER.
>         - Because (B) is not probed, (A) probe() never called
>
> In [Style-A], it seems __fwnode_link_cycle() is called for (A)-(B)
> connection, so the dead-lock will be solved.
> But it is not called for [Style-B]. Because of that (B) is always handled
> as supplier for (A).
>
> If I used below patch, and use "non-supplier" property on (B), this
> dead-lock issue was solved. But I know this is not a good solution.
>
> I think it is very normal connection, not super special.
> How can I solve this issue on correct way ? Or how can I indicate you
> my issue more detail ?
> I can add debug patch and test it if you can indicate it to me.

Kinda swamped, so skimmed your diagram. Looks like you are making up a
"non-supplier" property?

Anyway, I solved this recently in a general way. Use
"post-init-providers" property in the node of "A" and point it to "B".
This tells fw_devlink that B is not needed for A to probe.

So, NACK to this patch, but hope my response helps.

-Saravana

>
> -----------------------------------------------
> diff --git a/drivers/of/property.c b/drivers/of/property.c
> index def152c61049..2f08210c2ea4 100644
> --- a/drivers/of/property.c
> +++ b/drivers/of/property.c
> @@ -1502,11 +1502,22 @@ static struct device_node *parse_remote_endpoint(struct device_node *np,
>                                                  const char *prop_name,
>                                                  int index)
>  {
> +       struct device_node *node;
> +
>         /* Return NULL for index > 0 to signify end of remote-endpoints. */
>         if (index > 0 || strcmp(prop_name, "remote-endpoint"))
>                 return NULL;
>
> -       return of_graph_get_remote_port_parent(np);
> +       node = of_graph_get_remote_port_parent(np);
> +
> +       /*
> +        * There is clearly non-supplier node which is connected via "remote-endpoint".
> +        * Ignore it, otherwise dead-lock might occur
> +        */
> +       if (of_property_present(node, "non-supplier"))
> +               return NULL;
> +
> +       return node;
>  }
> -----------------------------------------------
>
>
> Thank you for your help !!
>
> Best regards
> ---
> Kuninori Morimoto
Kuninori Morimoto Sept. 4, 2024, 5:27 a.m. UTC | #2
Hi Saravana

> > [Style-A]
> >         +---+  +---+
> >         |(A)+--+   |
> >         +-+-+  |   |
> >           |    |(B)|
> >         +-+-+  |   |
> >         |(C)+--+   |
> >         +---+  +---+
(snip)
> > [Style-B]
> >         +---+  +---+
> >         |(A)+--+   |
> >         +---+  |   |
> >                |(B)|
> >         +---+  |   |
> >         |(C)+--+   |
> >         +---+  +---+
(snip)
> > In my debug, it seems...
> >         - (B) is handled as supplier for (A).
> >         - (B) probe() is called, but it needs (A) info which is not yet
> >           probed. So it returns -EPROBE_DEFER.
> >         - Because (B) is not probed, (A) probe() never called
(snip)
> Anyway, I solved this recently in a general way. Use
> "post-init-providers" property in the node of "A" and point it to "B".
> This tells fw_devlink that B is not needed for A to probe.

Thank you for your help !!
"post-init-providers" property helped my issue.

Best regards
---
Kuninori Morimoto
diff mbox series

Patch

diff --git a/drivers/of/property.c b/drivers/of/property.c
index def152c61049..2f08210c2ea4 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -1502,11 +1502,22 @@  static struct device_node *parse_remote_endpoint(struct device_node *np,
 						 const char *prop_name,
 						 int index)
 {
+	struct device_node *node;
+
 	/* Return NULL for index > 0 to signify end of remote-endpoints. */
 	if (index > 0 || strcmp(prop_name, "remote-endpoint"))
 		return NULL;
 
-	return of_graph_get_remote_port_parent(np);
+	node = of_graph_get_remote_port_parent(np);
+
+	/*
+	 * There is clearly non-supplier node which is connected via "remote-endpoint".
+	 * Ignore it, otherwise dead-lock might occur
+	 */
+	if (of_property_present(node, "non-supplier"))
+		return NULL;
+
+	return node;
 }
-----------------------------------------------