diff mbox series

[v2] usb: typec: qcom-pmic-typec: split HPD bridge alloc and registration

Message ID 20240408-qc-pmic-typec-hpd-split-v2-1-1704f5321b73@linaro.org (mailing list archive)
State Superseded
Headers show
Series [v2] usb: typec: qcom-pmic-typec: split HPD bridge alloc and registration | expand

Commit Message

Dmitry Baryshkov April 8, 2024, 1:06 a.m. UTC
If a probe function returns -EPROBE_DEFER after creating another device
there is a change of ending up in a probe deferral loop, (see commit
fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER").

In order to prevent such probe-defer loops caused by qcom-pmic-typec
driver, use the API added by Johan Hovold and move HPD bridge
registration to the end of the probe function.

Reported-by: Caleb Connolly <caleb.connolly@linaro.org>
Acked-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
---
Changes in v2:
- Fix commit message (Bryan)
- Link to v1: https://lore.kernel.org/r/20240405-qc-pmic-typec-hpd-split-v1-1-363daafb3c36@linaro.org
---
 drivers/usb/typec/tcpm/qcom/qcom_pmic_typec.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)


---
base-commit: a6bd6c9333397f5a0e2667d4d82fef8c970108f2
change-id: 20240405-qc-pmic-typec-hpd-split-22804201902b

Best regards,

Comments

Johan Hovold April 8, 2024, 7:11 a.m. UTC | #1
On Mon, Apr 08, 2024 at 04:06:40AM +0300, Dmitry Baryshkov wrote:
> If a probe function returns -EPROBE_DEFER after creating another device
> there is a change of ending up in a probe deferral loop, (see commit
> fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER").
> 
> In order to prevent such probe-defer loops caused by qcom-pmic-typec
> driver, use the API added by Johan Hovold and move HPD bridge
> registration to the end of the probe function.

You should be more specific here: which function called after
qcom_pmic_typec_probe() can trigger a probe deferral?

I doubt that applies to tcpm->port_start() and tcpm->pdphy_start() in
which case the bridge should be added before those calls unless there
are other reasons for not doing so, which then also should be mentioned.

I suspect the trouble is with tcpm_register_port(), but please spell
that out and mention in which scenarios that function may return
-EPROBE_DEFER.

Johan
Dmitry Baryshkov April 8, 2024, 10:49 a.m. UTC | #2
On Mon, Apr 08, 2024 at 09:11:32AM +0200, Johan Hovold wrote:
> On Mon, Apr 08, 2024 at 04:06:40AM +0300, Dmitry Baryshkov wrote:
> > If a probe function returns -EPROBE_DEFER after creating another device
> > there is a change of ending up in a probe deferral loop, (see commit
> > fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER").
> > 
> > In order to prevent such probe-defer loops caused by qcom-pmic-typec
> > driver, use the API added by Johan Hovold and move HPD bridge
> > registration to the end of the probe function.
> 
> You should be more specific here: which function called after
> qcom_pmic_typec_probe() can trigger a probe deferral?
> 
> I doubt that applies to tcpm->port_start() and tcpm->pdphy_start() in
> which case the bridge should be added before those calls unless there
> are other reasons for not doing so, which then also should be mentioned.
> 
> I suspect the trouble is with tcpm_register_port(), but please spell
> that out and mention in which scenarios that function may return
> -EPROBE_DEFER.

The probe loop comes from from tcpm_register_port(), you are right.
However then putting bridge registration before the _start() functions
is also incorrect as this will be prone to use-after-free errors that
you have fixed in pmic-glink.
Johan Hovold April 8, 2024, 11:44 a.m. UTC | #3
On Mon, Apr 08, 2024 at 01:49:48PM +0300, Dmitry Baryshkov wrote:
> On Mon, Apr 08, 2024 at 09:11:32AM +0200, Johan Hovold wrote:
> > On Mon, Apr 08, 2024 at 04:06:40AM +0300, Dmitry Baryshkov wrote:
> > > If a probe function returns -EPROBE_DEFER after creating another device
> > > there is a change of ending up in a probe deferral loop, (see commit
> > > fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER").
> > > 
> > > In order to prevent such probe-defer loops caused by qcom-pmic-typec
> > > driver, use the API added by Johan Hovold and move HPD bridge
> > > registration to the end of the probe function.
> > 
> > You should be more specific here: which function called after
> > qcom_pmic_typec_probe() can trigger a probe deferral?
> > 
> > I doubt that applies to tcpm->port_start() and tcpm->pdphy_start() in
> > which case the bridge should be added before those calls unless there
> > are other reasons for not doing so, which then also should be mentioned.
> > 
> > I suspect the trouble is with tcpm_register_port(), but please spell
> > that out and mention in which scenarios that function may return
> > -EPROBE_DEFER.
> 
> The probe loop comes from from tcpm_register_port(), you are right.
> However then putting bridge registration before the _start() functions
> is also incorrect as this will be prone to use-after-free errors that
> you have fixed in pmic-glink.

You obviously have to mention that in the commit message as that is a
separate change and also one that looks broken as you're now registering
resources after the device has gone "live".

So you also need to explain why you think that is safe, if it should be
done at all. You're essentially just papering over a DRM bug in the
unlikely event that probe fails.

Johan
Dmitry Baryshkov April 8, 2024, 11:48 a.m. UTC | #4
On Mon, 8 Apr 2024 at 14:44, Johan Hovold <johan@kernel.org> wrote:
>
> On Mon, Apr 08, 2024 at 01:49:48PM +0300, Dmitry Baryshkov wrote:
> > On Mon, Apr 08, 2024 at 09:11:32AM +0200, Johan Hovold wrote:
> > > On Mon, Apr 08, 2024 at 04:06:40AM +0300, Dmitry Baryshkov wrote:
> > > > If a probe function returns -EPROBE_DEFER after creating another device
> > > > there is a change of ending up in a probe deferral loop, (see commit
> > > > fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER").
> > > >
> > > > In order to prevent such probe-defer loops caused by qcom-pmic-typec
> > > > driver, use the API added by Johan Hovold and move HPD bridge
> > > > registration to the end of the probe function.
> > >
> > > You should be more specific here: which function called after
> > > qcom_pmic_typec_probe() can trigger a probe deferral?
> > >
> > > I doubt that applies to tcpm->port_start() and tcpm->pdphy_start() in
> > > which case the bridge should be added before those calls unless there
> > > are other reasons for not doing so, which then also should be mentioned.
> > >
> > > I suspect the trouble is with tcpm_register_port(), but please spell
> > > that out and mention in which scenarios that function may return
> > > -EPROBE_DEFER.
> >
> > The probe loop comes from from tcpm_register_port(), you are right.
> > However then putting bridge registration before the _start() functions
> > is also incorrect as this will be prone to use-after-free errors that
> > you have fixed in pmic-glink.
>
> You obviously have to mention that in the commit message as that is a
> separate change and also one that looks broken as you're now registering
> resources after the device has gone "live".

No. I'm registering a child device rather than a resource.

> So you also need to explain why you think that is safe, if it should be
> done at all. You're essentially just papering over a DRM bug in the
> unlikely event that probe fails.

Unfortunately, as pointed out by Reported-by, Caleb has actually hit
the probe failure loop.
Johan Hovold April 8, 2024, 12:14 p.m. UTC | #5
On Mon, Apr 08, 2024 at 02:48:44PM +0300, Dmitry Baryshkov wrote:
> On Mon, 8 Apr 2024 at 14:44, Johan Hovold <johan@kernel.org> wrote:
> >
> > On Mon, Apr 08, 2024 at 01:49:48PM +0300, Dmitry Baryshkov wrote:
> > > On Mon, Apr 08, 2024 at 09:11:32AM +0200, Johan Hovold wrote:
> > > > On Mon, Apr 08, 2024 at 04:06:40AM +0300, Dmitry Baryshkov wrote:
> > > > > If a probe function returns -EPROBE_DEFER after creating another device
> > > > > there is a change of ending up in a probe deferral loop, (see commit
> > > > > fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER").
> > > > >
> > > > > In order to prevent such probe-defer loops caused by qcom-pmic-typec
> > > > > driver, use the API added by Johan Hovold and move HPD bridge
> > > > > registration to the end of the probe function.
> > > >
> > > > You should be more specific here: which function called after
> > > > qcom_pmic_typec_probe() can trigger a probe deferral?
> > > >
> > > > I doubt that applies to tcpm->port_start() and tcpm->pdphy_start() in
> > > > which case the bridge should be added before those calls unless there
> > > > are other reasons for not doing so, which then also should be mentioned.
> > > >
> > > > I suspect the trouble is with tcpm_register_port(), but please spell
> > > > that out and mention in which scenarios that function may return
> > > > -EPROBE_DEFER.
> > >
> > > The probe loop comes from from tcpm_register_port(), you are right.
> > > However then putting bridge registration before the _start() functions
> > > is also incorrect as this will be prone to use-after-free errors that
> > > you have fixed in pmic-glink.
> >
> > You obviously have to mention that in the commit message as that is a
> > separate change and also one that looks broken as you're now registering
> > resources after the device has gone "live".
> 
> No. I'm registering a child device rather than a resource.

There's no difference. You're registering a resource for someone else to
consume.

And it's not obvious that this does not lead to missed events, etc. in
this case.

> > So you also need to explain why you think that is safe, if it should be
> > done at all. You're essentially just papering over a DRM bug in the
> > unlikely event that probe fails.
> 
> Unfortunately, as pointed out by Reported-by, Caleb has actually hit
> the probe failure loop.

The probe loop is probably real, I don't don't doubt that, but you
still need to explain when tcpm_register_port() can return
-EPROBE_DEFER.

But the point is, you don't have to register the bridge after *starting*
the port to fix the probe loop. You're doing that for other,
questionable reasons and you don't explain why in the commit message so
that others can evaluate your reasoning.

Johan
diff mbox series

Patch

diff --git a/drivers/usb/typec/tcpm/qcom/qcom_pmic_typec.c b/drivers/usb/typec/tcpm/qcom/qcom_pmic_typec.c
index e48412cdcb0f..96b41efae318 100644
--- a/drivers/usb/typec/tcpm/qcom/qcom_pmic_typec.c
+++ b/drivers/usb/typec/tcpm/qcom/qcom_pmic_typec.c
@@ -41,7 +41,7 @@  static int qcom_pmic_typec_probe(struct platform_device *pdev)
 	struct device_node *np = dev->of_node;
 	const struct pmic_typec_resources *res;
 	struct regmap *regmap;
-	struct device *bridge_dev;
+	struct auxiliary_device *bridge_dev;
 	u32 base;
 	int ret;
 
@@ -92,7 +92,7 @@  static int qcom_pmic_typec_probe(struct platform_device *pdev)
 	if (!tcpm->tcpc.fwnode)
 		return -EINVAL;
 
-	bridge_dev = drm_dp_hpd_bridge_register(tcpm->dev, to_of_node(tcpm->tcpc.fwnode));
+	bridge_dev = devm_drm_dp_hpd_bridge_alloc(tcpm->dev, to_of_node(tcpm->tcpc.fwnode));
 	if (IS_ERR(bridge_dev))
 		return PTR_ERR(bridge_dev);
 
@@ -110,6 +110,10 @@  static int qcom_pmic_typec_probe(struct platform_device *pdev)
 	if (ret)
 		goto fwnode_remove;
 
+	ret = devm_drm_dp_hpd_bridge_add(tcpm->dev, bridge_dev);
+	if (ret)
+		goto fwnode_remove;
+
 	return 0;
 
 fwnode_remove: