[v2,1/2] iio: adc: rzg2l_adc: Drop devm_pm_runtime_enable()

Message ID 20250117114540.289248-2-claudiu.beznea.uj@bp.renesas.com (mailing list archive)
State New
Series iio: rzg2l_adc: Cleanups for rzg2l_adc driver

Commit Message

Claudiu Beznea Jan. 17, 2025, 11:45 a.m. UTC
From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>

On all systems where the rzg2l_adc driver is used, the ADC clocks are part
of a PM domain. The code that implements the PM domains support is in
drivers/clk/renesas/rzg2l-cpg.c, the functions of interest for this commit
being rzg2l_cpg_attach_dev() and rzg2l_cpg_detach_dev(). The PM
domains support is registered with GENPD_FLAG_PM_CLK which, according to
the documentation, instructs genpd to use the PM clk framework while
powering on/off attached devices.

During probe, the ADC device is attached to the PM domain
controlling the ADC clocks. Similarly, during removal, the ADC device is
detached from the PM domain.

The detachment call stack is as follows:

device_driver_detach() ->
  device_release_driver_internal() ->
    __device_release_driver() ->
      device_remove() ->
        platform_remove() ->
          dev_pm_domain_detach()

During driver unbind, after the ADC device is detached from its PM domain,
the device_unbind_cleanup() function is called, which subsequently invokes
devres_release_all(). This function handles devres resource cleanup.

If runtime PM is enabled via devm_pm_runtime_enable(), the cleanup process
triggers the action or reset function for disabling runtime PM. This
function is pm_runtime_disable_action(), which leads to the following call
stack of interest when called:

pm_runtime_disable_action() ->
  pm_runtime_dont_use_autosuspend() ->
    __pm_runtime_use_autosuspend() ->
      update_autosuspend() ->
        rpm_idle()

The rpm_idle() function attempts to runtime resume the ADC device. However,
at the point it is called, the ADC device is no longer part of the PM
domain (which manages the ADC clocks). Since the rzg2l_adc runtime PM
APIs directly modify hardware registers, the
rzg2l_adc_pm_runtime_resume() function is invoked without the ADC clocks
being enabled. This is because the PM domain no longer resumes along with
the ADC device. As a result, this leads to system aborts.

Drop the devres API for runtime PM enable along with the other devres APIs
after it (devm_request_irq(), devm_iio_device_register()).
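
The ordering problem described above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: every toy_*
name is hypothetical, and the model assumes that disabling runtime PM
may invoke one runtime PM callback that touches ADC registers. It relies
on the ordering in platform_remove(), where the driver's remove callback
runs before dev_pm_domain_detach(), while devres actions run only later,
from device_unbind_cleanup().

```c
#include <stdbool.h>

/* Toy model of the ADC device state; all names here are hypothetical. */
struct toy_adc {
	bool in_pm_domain;	/* attached to the PM domain that gates the clocks */
	bool rpm_enabled;	/* runtime PM currently enabled */
	int unclocked_accesses;	/* register accesses made without clocks */
};

/* Stands in for a runtime PM callback that touches hardware registers. */
static void toy_runtime_callback(struct toy_adc *adc)
{
	if (!adc->in_pm_domain)
		adc->unclocked_accesses++;
}

/* Stands in for disabling runtime PM; assumed to trigger one callback. */
static void toy_rpm_disable(struct toy_adc *adc)
{
	if (!adc->rpm_enabled)
		return;
	toy_runtime_callback(adc);
	adc->rpm_enabled = false;
}

/* Stands in for dev_pm_domain_detach(). */
static void toy_domain_detach(struct toy_adc *adc)
{
	adc->in_pm_domain = false;
}

/* devres ordering: the domain is detached first, devres actions run later. */
static int toy_unbind_devres(void)
{
	struct toy_adc adc = { .in_pm_domain = true, .rpm_enabled = true };

	toy_domain_detach(&adc);	/* platform_remove() */
	toy_rpm_disable(&adc);		/* device_unbind_cleanup() */
	return adc.unclocked_accesses;
}

/* Explicit ordering: the driver's remove callback runs before the detach. */
static int toy_unbind_explicit(void)
{
	struct toy_adc adc = { .in_pm_domain = true, .rpm_enabled = true };

	toy_rpm_disable(&adc);		/* driver remove callback */
	toy_domain_detach(&adc);	/* platform_remove(), after the callback */
	return adc.unclocked_accesses;
}
```

In this model toy_unbind_devres() reports one register access without
clocks and toy_unbind_explicit() reports none, which is the difference
between the devres-managed teardown and the explicit remove callback.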

Fixes: 89ee8174e8c8 ("iio: adc: rzg2l_adc: Simplify the runtime PM code")
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
---

Changes in v2:
- collected Ulf's tag
- add a comment above pm_runtime_enable() explaining the reason
  it shouldn't be converted to devres
- drop devres calls that request IRQ and register IIO device
  as proposed in the review process: Ulf, I still kept your Rb
  tag; please let me know otherwise

 drivers/iio/adc/rzg2l_adc.c | 60 ++++++++++++++++++++++++++++---------
 1 file changed, 46 insertions(+), 14 deletions(-)

Comments

Claudiu Beznea Feb. 4, 2025, 12:25 p.m. UTC | #1
Hi, Jonathan,

On 17.01.2025 13:45, Claudiu wrote:
> From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
> 
> On all systems where the rzg2l_adc driver is used, the ADC clocks are part
> of a PM domain. The code that implements the PM domains support is in
> drivers/clk/renesas/rzg2l-cpg.c, the functions of interest for this commit
> being rzg2l_cpg_attach_dev() and rzg2l_cpg_detach_dev(). The PM
> domains support is registered with GENPD_FLAG_PM_CLK which, according to
> the documentation, instructs genpd to use the PM clk framework while
> powering on/off attached devices.
> 
> During probe, the ADC device is attached to the PM domain
> controlling the ADC clocks. Similarly, during removal, the ADC device is
> detached from the PM domain.
> 
> The detachment call stack is as follows:
> 
> device_driver_detach() ->
>   device_release_driver_internal() ->
>     __device_release_driver() ->
>       device_remove() ->
>         platform_remove() ->
>           dev_pm_domain_detach()
> 
> During driver unbind, after the ADC device is detached from its PM domain,
> the device_unbind_cleanup() function is called, which subsequently invokes
> devres_release_all(). This function handles devres resource cleanup.
> 
> If runtime PM is enabled via devm_pm_runtime_enable(), the cleanup process
> triggers the action or reset function for disabling runtime PM. This
> function is pm_runtime_disable_action(), which leads to the following call
> stack of interest when called:
> 
> pm_runtime_disable_action() ->
>   pm_runtime_dont_use_autosuspend() ->
>     __pm_runtime_use_autosuspend() ->
>       update_autosuspend() ->
>         rpm_idle()
> 
> The rpm_idle() function attempts to runtime resume the ADC device. However,
> at the point it is called, the ADC device is no longer part of the PM
> domain (which manages the ADC clocks). Since the rzg2l_adc runtime PM
> APIs directly modify hardware registers, the
> rzg2l_adc_pm_runtime_resume() function is invoked without the ADC clocks
> being enabled. This is because the PM domain no longer resumes along with
> the ADC device. As a result, this leads to system aborts.
> 
> Drop the devres API for runtime PM enable along with the other devres APIs
> after it (devm_request_irq(), devm_iio_device_register()).
> 
> Fixes: 89ee8174e8c8 ("iio: adc: rzg2l_adc: Simplify the runtime PM code")
> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
> ---

As far as I understand, there is currently no conclusion from the
discussion at [1]. If it's not too early in the discussion, can you please
let me know how you would prefer to go forward with fixing this driver?

Thank you,
Claudiu

[1]
https://lore.kernel.org/all/20250103140042.1619703-2-claudiu.beznea.uj@bp.renesas.com/

Jonathan Cameron Feb. 4, 2025, 2:44 p.m. UTC | #2
On Tue, 4 Feb 2025 14:25:38 +0200
Claudiu Beznea <claudiu.beznea@tuxon.dev> wrote:

> Hi, Jonathan,
> 
> On 17.01.2025 13:45, Claudiu wrote:
> > From: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
> > 
> > On all systems where the rzg2l_adc driver is used, the ADC clocks are part
> > of a PM domain. The code that implements the PM domains support is in
> > drivers/clk/renesas/rzg2l-cpg.c, the functions of interest for this commit
> > being rzg2l_cpg_attach_dev() and rzg2l_cpg_detach_dev(). The PM
> > domains support is registered with GENPD_FLAG_PM_CLK which, according to
> > the documentation, instructs genpd to use the PM clk framework while
> > powering on/off attached devices.
> > 
> > During probe, the ADC device is attached to the PM domain
> > controlling the ADC clocks. Similarly, during removal, the ADC device is
> > detached from the PM domain.
> > 
> > The detachment call stack is as follows:
> > 
> > device_driver_detach() ->
> >   device_release_driver_internal() ->
> >     __device_release_driver() ->
> >       device_remove() ->
> >         platform_remove() ->
> >           dev_pm_domain_detach()
> > 
> > During driver unbind, after the ADC device is detached from its PM domain,
> > the device_unbind_cleanup() function is called, which subsequently invokes
> > devres_release_all(). This function handles devres resource cleanup.
> > 
> > If runtime PM is enabled via devm_pm_runtime_enable(), the cleanup process
> > triggers the action or reset function for disabling runtime PM. This
> > function is pm_runtime_disable_action(), which leads to the following call
> > stack of interest when called:
> > 
> > pm_runtime_disable_action() ->
> >   pm_runtime_dont_use_autosuspend() ->
> >     __pm_runtime_use_autosuspend() ->
> >       update_autosuspend() ->
> >         rpm_idle()
> > 
> > The rpm_idle() function attempts to runtime resume the ADC device. However,
> > at the point it is called, the ADC device is no longer part of the PM
> > domain (which manages the ADC clocks). Since the rzg2l_adc runtime PM
> > APIs directly modify hardware registers, the
> > rzg2l_adc_pm_runtime_resume() function is invoked without the ADC clocks
> > being enabled. This is because the PM domain no longer resumes along with
> > the ADC device. As a result, this leads to system aborts.
> > 
> > Drop the devres API for runtime PM enable along with the other devres APIs
> > after it (devm_request_irq(), devm_iio_device_register()).
> > 
> > Fixes: 89ee8174e8c8 ("iio: adc: rzg2l_adc: Simplify the runtime PM code")
> > Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
> > Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
> > ---  
> 
> As far as I understand, there is currently no conclusion from the
> discussion at [1]. If it's not too early in the discussion, can you please
> let me know how you would prefer to go forward with fixing this driver?
> 

Quickest might be to propose a patch similar to the one for i2c that thread
references. Post that as an RFC and see if Greg KH or anyone else shoots it
down? Also verify it fixes what you see here of course!

It was on my list to do, but quite a few other things on that list so
if you have time that would be great.

Thanks,

Jonathan

> Thank you,
> Claudiu
> 
> [1]
> https://lore.kernel.org/all/20250103140042.1619703-2-claudiu.beznea.uj@bp.renesas.com/
>

Patch

diff --git a/drivers/iio/adc/rzg2l_adc.c b/drivers/iio/adc/rzg2l_adc.c
index 883c167c0670..4742a727a80c 100644
--- a/drivers/iio/adc/rzg2l_adc.c
+++ b/drivers/iio/adc/rzg2l_adc.c
@@ -87,6 +87,7 @@  struct rzg2l_adc {
 	const struct rzg2l_adc_hw_params *hw_params;
 	struct completion completion;
 	struct mutex lock;
+	int irq;
 	u16 last_val[RZG2L_ADC_MAX_CHANNELS];
 	bool was_rpm_active;
 };
@@ -430,7 +431,6 @@  static int rzg2l_adc_probe(struct platform_device *pdev)
 	struct iio_dev *indio_dev;
 	struct rzg2l_adc *adc;
 	int ret;
-	int irq;
 
 	indio_dev = devm_iio_device_alloc(dev, sizeof(*adc));
 	if (!indio_dev)
@@ -464,25 +464,33 @@  static int rzg2l_adc_probe(struct platform_device *pdev)
 
 	pm_runtime_set_autosuspend_delay(dev, 300);
 	pm_runtime_use_autosuspend(dev);
-	ret = devm_pm_runtime_enable(dev);
-	if (ret)
-		return ret;
+	/*
+	 * Use non-devres APIs from this point onward, as the ADC clocks are
+	 * managed through its power domain. Otherwise, during repeated
+	 * unbind/bind operations, the ADC may be runtime resumed when it
+	 * is not part of its power domain, leading to accessing ADC
+	 * registers without its clocks being enabled and its PM domain
+	 * being turned on.
+	 */
+	pm_runtime_enable(dev);
 
 	platform_set_drvdata(pdev, indio_dev);
 
 	ret = rzg2l_adc_hw_init(dev, adc);
-	if (ret)
-		return dev_err_probe(&pdev->dev, ret,
-				     "failed to initialize ADC HW\n");
+	if (ret) {
+		dev_err_probe(&pdev->dev, ret, "failed to initialize ADC HW\n");
+		goto rpm_disable;
+	}
+
+	ret = platform_get_irq(pdev, 0);
+	if (ret < 0)
+		goto rpm_disable;
 
-	irq = platform_get_irq(pdev, 0);
-	if (irq < 0)
-		return irq;
+	adc->irq = ret;
 
-	ret = devm_request_irq(dev, irq, rzg2l_adc_isr,
-			       0, dev_name(dev), adc);
+	ret = request_irq(adc->irq, rzg2l_adc_isr, 0, dev_name(dev), adc);
 	if (ret < 0)
-		return ret;
+		goto rpm_disable;
 
 	init_completion(&adc->completion);
 
@@ -492,7 +500,30 @@  static int rzg2l_adc_probe(struct platform_device *pdev)
 	indio_dev->channels = adc->data->channels;
 	indio_dev->num_channels = adc->data->num_channels;
 
-	return devm_iio_device_register(dev, indio_dev);
+	ret = iio_device_register(indio_dev);
+	if (ret)
+		goto free_irq;
+
+	return 0;
+
+free_irq:
+	free_irq(adc->irq, adc);
+rpm_disable:
+	pm_runtime_disable(dev);
+	pm_runtime_dont_use_autosuspend(dev);
+	return ret;
+}
+
+static void rzg2l_adc_remove(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct iio_dev *indio_dev = dev_get_drvdata(dev);
+	struct rzg2l_adc *adc = iio_priv(indio_dev);
+
+	iio_device_unregister(indio_dev);
+	free_irq(adc->irq, adc);
+	pm_runtime_disable(dev);
+	pm_runtime_dont_use_autosuspend(dev);
 }
 
 static const struct rzg2l_adc_hw_params rzg2l_hw_params = {
@@ -614,6 +645,7 @@  static const struct dev_pm_ops rzg2l_adc_pm_ops = {
 
 static struct platform_driver rzg2l_adc_driver = {
 	.probe		= rzg2l_adc_probe,
+	.remove		= rzg2l_adc_remove,
 	.driver		= {
 		.name		= DRIVER_NAME,
 		.of_match_table = rzg2l_adc_match,