Message ID | 20240408-qc-pmic-typec-hpd-split-v2-1-1704f5321b73@linaro.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v2] usb: typec: qcom-pmic-typec: split HPD bridge alloc and registration | expand |
On Mon, Apr 08, 2024 at 04:06:40AM +0300, Dmitry Baryshkov wrote: > If a probe function returns -EPROBE_DEFER after creating another device > there is a change of ending up in a probe deferral loop, (see commit > fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER"). > > In order to prevent such probe-defer loops caused by qcom-pmic-typec > driver, use the API added by Johan Hovold and move HPD bridge > registration to the end of the probe function. You should be more specific here: which function called after qcom_pmic_typec_probe() can trigger a probe deferral? I doubt that applies to tcpm->port_start() and tcpm->pdphy_start() in which case the bridge should be added before those calls unless there are other reasons for not doing so, which then also should be mentioned. I suspect the trouble is with tcpm_register_port(), but please spell that out and mention in which scenarios that function may return -EPROBE_DEFER. Johan
On Mon, Apr 08, 2024 at 09:11:32AM +0200, Johan Hovold wrote: > On Mon, Apr 08, 2024 at 04:06:40AM +0300, Dmitry Baryshkov wrote: > > If a probe function returns -EPROBE_DEFER after creating another device > > there is a change of ending up in a probe deferral loop, (see commit > > fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER"). > > > > In order to prevent such probe-defer loops caused by qcom-pmic-typec > > driver, use the API added by Johan Hovold and move HPD bridge > > registration to the end of the probe function. > > You should be more specific here: which function called after > qcom_pmic_typec_probe() can trigger a probe deferral? > > I doubt that applies to tcpm->port_start() and tcpm->pdphy_start() in > which case the bridge should be added before those calls unless there > are other reasons for not doing so, which then also should be mentioned. > > I suspect the trouble is with tcpm_register_port(), but please spell > that out and mention in which scenarios that function may return > -EPROBE_DEFER. The probe loop comes from from tcpm_register_port(), you are right. However then putting bridge registration before the _start() functions is also incorrect as this will be prone to use-after-free errors that you have fixed in pmic-glink.
On Mon, Apr 08, 2024 at 01:49:48PM +0300, Dmitry Baryshkov wrote: > On Mon, Apr 08, 2024 at 09:11:32AM +0200, Johan Hovold wrote: > > On Mon, Apr 08, 2024 at 04:06:40AM +0300, Dmitry Baryshkov wrote: > > > If a probe function returns -EPROBE_DEFER after creating another device > > > there is a change of ending up in a probe deferral loop, (see commit > > > fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER"). > > > > > > In order to prevent such probe-defer loops caused by qcom-pmic-typec > > > driver, use the API added by Johan Hovold and move HPD bridge > > > registration to the end of the probe function. > > > > You should be more specific here: which function called after > > qcom_pmic_typec_probe() can trigger a probe deferral? > > > > I doubt that applies to tcpm->port_start() and tcpm->pdphy_start() in > > which case the bridge should be added before those calls unless there > > are other reasons for not doing so, which then also should be mentioned. > > > > I suspect the trouble is with tcpm_register_port(), but please spell > > that out and mention in which scenarios that function may return > > -EPROBE_DEFER. > > The probe loop comes from from tcpm_register_port(), you are right. > However then putting bridge registration before the _start() functions > is also incorrect as this will be prone to use-after-free errors that > you have fixed in pmic-glink. You obviously have to mention that in the commit message as that is a separate change and also one that looks broken as you're now registering resources after the device has gone "live". So you also need to explain why you think that is safe, if it should be done at all. You're essentially just papering over a DRM bug in the unlikely event that probe fails. Johan
On Mon, 8 Apr 2024 at 14:44, Johan Hovold <johan@kernel.org> wrote: > > On Mon, Apr 08, 2024 at 01:49:48PM +0300, Dmitry Baryshkov wrote: > > On Mon, Apr 08, 2024 at 09:11:32AM +0200, Johan Hovold wrote: > > > On Mon, Apr 08, 2024 at 04:06:40AM +0300, Dmitry Baryshkov wrote: > > > > If a probe function returns -EPROBE_DEFER after creating another device > > > > there is a change of ending up in a probe deferral loop, (see commit > > > > fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER"). > > > > > > > > In order to prevent such probe-defer loops caused by qcom-pmic-typec > > > > driver, use the API added by Johan Hovold and move HPD bridge > > > > registration to the end of the probe function. > > > > > > You should be more specific here: which function called after > > > qcom_pmic_typec_probe() can trigger a probe deferral? > > > > > > I doubt that applies to tcpm->port_start() and tcpm->pdphy_start() in > > > which case the bridge should be added before those calls unless there > > > are other reasons for not doing so, which then also should be mentioned. > > > > > > I suspect the trouble is with tcpm_register_port(), but please spell > > > that out and mention in which scenarios that function may return > > > -EPROBE_DEFER. > > > > The probe loop comes from from tcpm_register_port(), you are right. > > However then putting bridge registration before the _start() functions > > is also incorrect as this will be prone to use-after-free errors that > > you have fixed in pmic-glink. > > You obviously have to mention that in the commit message as that is a > separate change and also one that looks broken as you're now registering > resources after the device has gone "live". No. I'm registering a child device rather than a resource. > So you also need to explain why you think that is safe, if it should be > done at all. You're essentially just papering over a DRM bug in the > unlikely event that probe fails. Unfortunately, as pointed out by Reported-by, Caleb has actually hit the probe failure loop.
On Mon, Apr 08, 2024 at 02:48:44PM +0300, Dmitry Baryshkov wrote: > On Mon, 8 Apr 2024 at 14:44, Johan Hovold <johan@kernel.org> wrote: > > > > On Mon, Apr 08, 2024 at 01:49:48PM +0300, Dmitry Baryshkov wrote: > > > On Mon, Apr 08, 2024 at 09:11:32AM +0200, Johan Hovold wrote: > > > > On Mon, Apr 08, 2024 at 04:06:40AM +0300, Dmitry Baryshkov wrote: > > > > > If a probe function returns -EPROBE_DEFER after creating another device > > > > > there is a change of ending up in a probe deferral loop, (see commit > > > > > fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER"). > > > > > > > > > > In order to prevent such probe-defer loops caused by qcom-pmic-typec > > > > > driver, use the API added by Johan Hovold and move HPD bridge > > > > > registration to the end of the probe function. > > > > > > > > You should be more specific here: which function called after > > > > qcom_pmic_typec_probe() can trigger a probe deferral? > > > > > > > > I doubt that applies to tcpm->port_start() and tcpm->pdphy_start() in > > > > which case the bridge should be added before those calls unless there > > > > are other reasons for not doing so, which then also should be mentioned. > > > > > > > > I suspect the trouble is with tcpm_register_port(), but please spell > > > > that out and mention in which scenarios that function may return > > > > -EPROBE_DEFER. > > > > > > The probe loop comes from from tcpm_register_port(), you are right. > > > However then putting bridge registration before the _start() functions > > > is also incorrect as this will be prone to use-after-free errors that > > > you have fixed in pmic-glink. > > > > You obviously have to mention that in the commit message as that is a > > separate change and also one that looks broken as you're now registering > > resources after the device has gone "live". > > No. I'm registering a child device rather than a resource. There's no difference. You're registering a resource for someone else to consume. And it's not obvious that this does not lead to missed events, etc. in this case. > > So you also need to explain why you think that is safe, if it should be > > done at all. You're essentially just papering over a DRM bug in the > > unlikely event that probe fails. > > Unfortunately, as pointed out by Reported-by, Caleb has actually hit > the probe failure loop. The probe loop is probably real, I don't don't doubt that, but you still need to explain when tcpm_register_port() can return -EPROBE_DEFER. But the point is, you don't have to register the bridge after *starting* the port to fix the probe loop. You're doing that for other, questionable reasons and you don't explain why in the commit message so that others can evaluate your reasoning. Johan
diff --git a/drivers/usb/typec/tcpm/qcom/qcom_pmic_typec.c b/drivers/usb/typec/tcpm/qcom/qcom_pmic_typec.c index e48412cdcb0f..96b41efae318 100644 --- a/drivers/usb/typec/tcpm/qcom/qcom_pmic_typec.c +++ b/drivers/usb/typec/tcpm/qcom/qcom_pmic_typec.c @@ -41,7 +41,7 @@ static int qcom_pmic_typec_probe(struct platform_device *pdev) struct device_node *np = dev->of_node; const struct pmic_typec_resources *res; struct regmap *regmap; - struct device *bridge_dev; + struct auxiliary_device *bridge_dev; u32 base; int ret; @@ -92,7 +92,7 @@ static int qcom_pmic_typec_probe(struct platform_device *pdev) if (!tcpm->tcpc.fwnode) return -EINVAL; - bridge_dev = drm_dp_hpd_bridge_register(tcpm->dev, to_of_node(tcpm->tcpc.fwnode)); + bridge_dev = devm_drm_dp_hpd_bridge_alloc(tcpm->dev, to_of_node(tcpm->tcpc.fwnode)); if (IS_ERR(bridge_dev)) return PTR_ERR(bridge_dev); @@ -110,6 +110,10 @@ static int qcom_pmic_typec_probe(struct platform_device *pdev) if (ret) goto fwnode_remove; + ret = devm_drm_dp_hpd_bridge_add(tcpm->dev, bridge_dev); + if (ret) + goto fwnode_remove; + return 0; fwnode_remove: