diff mbox series

Input: snvs_pwrkey - Add clk handling

Message ID 20210922094300.354227-1-u.kleine-koenig@pengutronix.de (mailing list archive)
State Superseded
Headers show
Series Input: snvs_pwrkey - Add clk handling | expand

Commit Message

Uwe Kleine-König Sept. 22, 2021, 9:43 a.m. UTC
On i.MX7S and i.MX8M* (but not i.MX6*) the pwrkey device has an
associated clock. Accessing the registers requires that this clock is
enabled. Binding the driver on at least i.MX7S and i.MX8MP while not
having the clock enabled results in a complete hang of the machine.
(This usually only happens if snvs_pwrkey is built as a module and the
rtc-snvs driver isn't already bound because at bootup the required clk
is on and only gets disabled when the clk framework disables unused clks
late during boot.)

This completes the fix in commit 135be16d3505 ("ARM: dts: imx7s: add
snvs clock to pwrkey").

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
 drivers/input/keyboard/snvs_pwrkey.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)


base-commit: 7d2a07b769330c34b4deabeed939325c77a7ec2f

Comments

Uwe Kleine-König Oct. 5, 2021, 8 p.m. UTC | #1
Hello,

On Wed, Sep 22, 2021 at 11:43:00AM +0200, Uwe Kleine-König wrote:
> On i.MX7S and i.MX8M* (but not i.MX6*) the pwrkey device has an
> associated clock. Accessing the registers requires that this clock is
> enabled. Binding the driver on at least i.MX7S and i.MX8MP while not
> having the clock enabled results in a complete hang of the machine.
> (This usually only happens if snvs_pwrkey is built as a module and the
> rtc-snvs driver isn't already bound because at bootup the required clk
> is on and only gets disabled when the clk framework disables unused clks
> late during boot.)
> 
> This completes the fix in commit 135be16d3505 ("ARM: dts: imx7s: add
> snvs clock to pwrkey").
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

This patch fixes a hard machine hang that occurs on an i.MX8MP based
machine in ~10% of the boot ups. In my eyes it's suitable to be applied
before v5.14 even.

Any feedback on it?

Best regards
Uwe

> ---
>  drivers/input/keyboard/snvs_pwrkey.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/drivers/input/keyboard/snvs_pwrkey.c b/drivers/input/keyboard/snvs_pwrkey.c
> index 2f5e3ab5ed63..b79a559b8329 100644
> --- a/drivers/input/keyboard/snvs_pwrkey.c
> +++ b/drivers/input/keyboard/snvs_pwrkey.c
> @@ -3,6 +3,7 @@
>  // Driver for the IMX SNVS ON/OFF Power Key
>  // Copyright (C) 2015 Freescale Semiconductor, Inc. All Rights Reserved.
>  
> +#include <linux/clk.h>
>  #include <linux/device.h>
>  #include <linux/err.h>
>  #include <linux/init.h>
> @@ -99,6 +100,11 @@ static irqreturn_t imx_snvs_pwrkey_interrupt(int irq, void *dev_id)
>  	return IRQ_HANDLED;
>  }
>  
> +static void imx_snvs_pwrkey_disable_clk(void *data)
> +{
> +	clk_disable_unprepare(data);
> +}
> +
>  static void imx_snvs_pwrkey_act(void *pdata)
>  {
>  	struct pwrkey_drv_data *pd = pdata;
> @@ -111,6 +117,7 @@ static int imx_snvs_pwrkey_probe(struct platform_device *pdev)
>  	struct pwrkey_drv_data *pdata;
>  	struct input_dev *input;
>  	struct device_node *np;
> +	struct clk *clk;
>  	int error;
>  	u32 vid;
>  
> @@ -134,6 +141,19 @@ static int imx_snvs_pwrkey_probe(struct platform_device *pdev)
>  		dev_warn(&pdev->dev, "KEY_POWER without setting in dts\n");
>  	}
>  
> +	clk = devm_clk_get_optional(&pdev->dev, NULL);
> +	if (IS_ERR(clk))
> +		return dev_err_probe(&pdev->dev, PTR_ERR(clk), "Failed to get clock\n");
> +
> +	error = clk_prepare_enable(clk);
> +	if (error)
> +		return dev_err_probe(&pdev->dev, PTR_ERR(clk), "Failed to enable clock\n");
> +
> +	error = devm_add_action_or_reset(&pdev->dev, imx_snvs_pwrkey_disable_clk, clk);
> +	if (error)
> +		return dev_err_probe(&pdev->dev, PTR_ERR(clk),
> +				     "Failed to register clock cleanup handler\n");
> +
>  	pdata->wakeup = of_property_read_bool(np, "wakeup-source");
>  
>  	pdata->irq = platform_get_irq(pdev, 0);
> 
> base-commit: 7d2a07b769330c34b4deabeed939325c77a7ec2f
> -- 
> 2.30.2
> 
> 
>
Dmitry Torokhov Oct. 12, 2021, 1:48 a.m. UTC | #2
Hi Uwe,

On Tue, Oct 05, 2021 at 10:00:05PM +0200, Uwe Kleine-König wrote:
> Hello,
> 
> On Wed, Sep 22, 2021 at 11:43:00AM +0200, Uwe Kleine-König wrote:
> > On i.MX7S and i.MX8M* (but not i.MX6*) the pwrkey device has an
> > associated clock. Accessing the registers requires that this clock is
> > enabled. Binding the driver on at least i.MX7S and i.MX8MP while not
> > having the clock enabled results in a complete hang of the machine.
> > (This usually only happens if snvs_pwrkey is built as a module and the
> > rtc-snvs driver isn't already bound because at bootup the required clk
> > is on and only gets disabled when the clk framework disables unused clks
> > late during boot.)
> > 
> > This completes the fix in commit 135be16d3505 ("ARM: dts: imx7s: add
> > snvs clock to pwrkey").
> > 
> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> 
> This patch fixes a hard machine hang that occurs on an i.MX8MP based
> machine in ~10% of the boot ups. In my eyes it's suitable to be applied
> before v5.14 even.
> 
> Any feedback on it?

Sorry for the delay. As you may know I strongly dislike dev_err_probe()
as it conflates the 2 issue - error printing and noting the deferral
event that should be implemented by the resource providers (and I
believe Rob had WIP patches to push this reporting down too providers).

Could you p lease resubmit with "normal" dev_err()/dev_warn()/etc and I
will be happy to apply.

Thanks.
Uwe Kleine-König Oct. 12, 2021, 7:39 a.m. UTC | #3
Hello Dmitry,

On Mon, Oct 11, 2021 at 06:48:51PM -0700, Dmitry Torokhov wrote:
> On Tue, Oct 05, 2021 at 10:00:05PM +0200, Uwe Kleine-König wrote:
> > On Wed, Sep 22, 2021 at 11:43:00AM +0200, Uwe Kleine-König wrote:
> > > On i.MX7S and i.MX8M* (but not i.MX6*) the pwrkey device has an
> > > associated clock. Accessing the registers requires that this clock is
> > > enabled. Binding the driver on at least i.MX7S and i.MX8MP while not
> > > having the clock enabled results in a complete hang of the machine.
> > > (This usually only happens if snvs_pwrkey is built as a module and the
> > > rtc-snvs driver isn't already bound because at bootup the required clk
> > > is on and only gets disabled when the clk framework disables unused clks
> > > late during boot.)
> > > 
> > > This completes the fix in commit 135be16d3505 ("ARM: dts: imx7s: add
> > > snvs clock to pwrkey").
> > > 
> > > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > 
> > This patch fixes a hard machine hang that occurs on an i.MX8MP based
> > machine in ~10% of the boot ups. In my eyes it's suitable to be applied
> > before v5.14 even.
> > 
> > Any feedback on it?
> 
> Sorry for the delay. As you may know I strongly dislike dev_err_probe()
> as it conflates the 2 issue - error printing and noting the deferral
> event that should be implemented by the resource providers (and I
> believe Rob had WIP patches to push this reporting down too providers).

I didn't know your dislike (and I probably will forget it again soon,
given that there seems to be disagreement among maintainers :-), and
from your words I don't understand it. The improved idea is that
devm_clk_get_optional() already registers the deferral event for the
clk? My first intuition is that this won't work, so I'd like to see the
WIP series. (Added Rob to Cc.) Someone has a link?

Also I don't share that sentiment, given that today
devm_clk_get_optional() and all the other resource providers don't do
the necessary stuff for deferral handling, I strongly prefer to use the
mechanism that is available today (even if it might be possible to
improve it) instead of open coding it. And if it's only because once the
improved variant is available it's easier to identify the code locations
that need adaption if they all use a common function instead of
identifying something like

	clk = devm_clk_get_optional(&pdev->dev, NULL);
	if (IS_ERR(clk)) {
		error = PTR_ERR(clk);
		if (error != -EPROBE_DEFER)
			dev_err(pdev->dev, "Failed to get clk: %pe\n", clk)
		else
			device_set_deferred_probe_reason(dev, oh_I_need_a_struct_va_format_how_do_I_get_this?);
		return error;
	}

instead of

	clk = devm_clk_get_optional(&pdev->dev, NULL);
	if (IS_ERR(clk))
		return dev_err_probe(&pdev->dev, PTR_ERR(clk), "Failed to get clock\n");
	
Even if the driver does not call device_set_deferred_probe_reason(), the
additional check for error != -EPROBE_DEFER is ugly, isn't it?

> Could you p lease resubmit with "normal" dev_err()/dev_warn()/etc and I
> will be happy to apply.

Is the above the variant you prefer? Maybe without the call to
device_set_deferred_probe_reason()? Or maybe even without the check for
-EPROBE_DEFER (which however might result in wrong error messages which
is IMHO worse than the ugliness of the additional check)?

Please advice. Given that adding clk handling prevents a machine hang,
I'm willing to add it in the way you prefer, even if I don't agree to
your reasoning.

Best regards
Uwe
Dmitry Torokhov Oct. 13, 2021, 3:06 a.m. UTC | #4
Hi Uwe,

On Tue, Oct 12, 2021 at 09:39:59AM +0200, Uwe Kleine-König wrote:
> Hello Dmitry,
> 
> On Mon, Oct 11, 2021 at 06:48:51PM -0700, Dmitry Torokhov wrote:
> > On Tue, Oct 05, 2021 at 10:00:05PM +0200, Uwe Kleine-König wrote:
> > > On Wed, Sep 22, 2021 at 11:43:00AM +0200, Uwe Kleine-König wrote:
> > > > On i.MX7S and i.MX8M* (but not i.MX6*) the pwrkey device has an
> > > > associated clock. Accessing the registers requires that this clock is
> > > > enabled. Binding the driver on at least i.MX7S and i.MX8MP while not
> > > > having the clock enabled results in a complete hang of the machine.
> > > > (This usually only happens if snvs_pwrkey is built as a module and the
> > > > rtc-snvs driver isn't already bound because at bootup the required clk
> > > > is on and only gets disabled when the clk framework disables unused clks
> > > > late during boot.)
> > > > 
> > > > This completes the fix in commit 135be16d3505 ("ARM: dts: imx7s: add
> > > > snvs clock to pwrkey").
> > > > 
> > > > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > > 
> > > This patch fixes a hard machine hang that occurs on an i.MX8MP based
> > > machine in ~10% of the boot ups. In my eyes it's suitable to be applied
> > > before v5.14 even.
> > > 
> > > Any feedback on it?
> > 
> > Sorry for the delay. As you may know I strongly dislike dev_err_probe()
> > as it conflates the 2 issue - error printing and noting the deferral
> > event that should be implemented by the resource providers (and I
> > believe Rob had WIP patches to push this reporting down too providers).
> 
> I didn't know your dislike (and I probably will forget it again soon,
> given that there seems to be disagreement among maintainers :-), and
> from your words I don't understand it. The improved idea is that
> devm_clk_get_optional() already registers the deferral event for the
> clk? My first intuition is that this won't work, so I'd like to see the
> WIP series. (Added Rob to Cc.) Someone has a link?

I think this is here:

https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/log/?h=dev_err-removal

I do not think it adds calls to device_set_deferred_probe_reason() but
that should be trivial to add, given we have device pointer and ID of
the resource, which should be enough to track it.

> 
> Also I don't share that sentiment, given that today
> devm_clk_get_optional() and all the other resource providers don't do
> the necessary stuff for deferral handling, I strongly prefer to use the
> mechanism that is available today (even if it might be possible to
> improve it) instead of open coding it. And if it's only because once the
> improved variant is available it's easier to identify the code locations
> that need adaption if they all use a common function instead of
> identifying something like
> 
> 	clk = devm_clk_get_optional(&pdev->dev, NULL);
> 	if (IS_ERR(clk)) {
> 		error = PTR_ERR(clk);
> 		if (error != -EPROBE_DEFER)
> 			dev_err(pdev->dev, "Failed to get clk: %pe\n", clk)
> 		else
> 			device_set_deferred_probe_reason(dev, oh_I_need_a_struct_va_format_how_do_I_get_this?);

You do not, you happily ignore it and wait for providers to do it for
you instead of forcing the change through all drivers.

> 		return error;
> 	}
> 
> instead of
> 
> 	clk = devm_clk_get_optional(&pdev->dev, NULL);
> 	if (IS_ERR(clk))
> 		return dev_err_probe(&pdev->dev, PTR_ERR(clk), "Failed to get clock\n");
> 	
> Even if the driver does not call device_set_deferred_probe_reason(), the
> additional check for error != -EPROBE_DEFER is ugly, isn't it?

I'd simply do

	clk = devm_clk_get_optional(...);
	error = PTR_ERR_OR_ZERO(clk);
	if (error) {
		dev_err(&pdev->dev, "...", error);
		return error;
	}

> 
> > Could you p lease resubmit with "normal" dev_err()/dev_warn()/etc and I
> > will be happy to apply.
> 
> Is the above the variant you prefer? Maybe without the call to
> device_set_deferred_probe_reason()? Or maybe even without the check for
> -EPROBE_DEFER (which however might result in wrong error messages which
> is IMHO worse than the ugliness of the additional check)?

Why are they wrong? They are not. They would literally say that some
resource was not obtained because it is not ready.

> 
> Please advice. Given that adding clk handling prevents a machine hang,
> I'm willing to add it in the way you prefer, even if I don't agree to
> your reasoning.

The form above would work for me.

Thank you.
Uwe Kleine-König Oct. 13, 2021, 7:26 a.m. UTC | #5
Helli Dmitry,

On Tue, Oct 12, 2021 at 08:06:34PM -0700, Dmitry Torokhov wrote:
> On Tue, Oct 12, 2021 at 09:39:59AM +0200, Uwe Kleine-König wrote:
> > On Mon, Oct 11, 2021 at 06:48:51PM -0700, Dmitry Torokhov wrote:
> > > On Tue, Oct 05, 2021 at 10:00:05PM +0200, Uwe Kleine-König wrote:
> > > > On Wed, Sep 22, 2021 at 11:43:00AM +0200, Uwe Kleine-König wrote:
> > > > > On i.MX7S and i.MX8M* (but not i.MX6*) the pwrkey device has an
> > > > > associated clock. Accessing the registers requires that this clock is
> > > > > enabled. Binding the driver on at least i.MX7S and i.MX8MP while not
> > > > > having the clock enabled results in a complete hang of the machine.
> > > > > (This usually only happens if snvs_pwrkey is built as a module and the
> > > > > rtc-snvs driver isn't already bound because at bootup the required clk
> > > > > is on and only gets disabled when the clk framework disables unused clks
> > > > > late during boot.)
> > > > > 
> > > > > This completes the fix in commit 135be16d3505 ("ARM: dts: imx7s: add
> > > > > snvs clock to pwrkey").
> > > > > 
> > > > > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > > > 
> > > > This patch fixes a hard machine hang that occurs on an i.MX8MP based
> > > > machine in ~10% of the boot ups. In my eyes it's suitable to be applied
> > > > before v5.14 even.
> > > > 
> > > > Any feedback on it?
> > > 
> > > Sorry for the delay. As you may know I strongly dislike dev_err_probe()
> > > as it conflates the 2 issue - error printing and noting the deferral
> > > event that should be implemented by the resource providers (and I
> > > believe Rob had WIP patches to push this reporting down too providers).
> > 
> > I didn't know your dislike (and I probably will forget it again soon,
> > given that there seems to be disagreement among maintainers :-), and
> > from your words I don't understand it. The improved idea is that
> > devm_clk_get_optional() already registers the deferral event for the
> > clk? My first intuition is that this won't work, so I'd like to see the
> > WIP series. (Added Rob to Cc.) Someone has a link?
> 
> I think this is here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/log/?h=dev_err-removal
> 
> I do not think it adds calls to device_set_deferred_probe_reason() but
> that should be trivial to add, given we have device pointer and ID of
> the resource, which should be enough to track it.

So the general idea is to push the error message into the resource
providers. I think this lowers the quality of error messages in some
cases. E.g. when clk_get() fails, the clk core can only emit something
like

	dev_err(dev, "Failed to get clock '%s' (%pe)\n", id, clk);

but the requesting driver has more information and might emit a more
helpful message. Or the requesting driver might even be able to handle
an error (even if it's not -ENODEV) and emitting an error is wrong. For
example because id is NULL but the driver knows a good name or a
detailed purpose of the clk can be mentioned to help debugging the
issue.

And IMHO even with this approach it's sensible to factor out the part:

	if (IS_ERR(x) && PTR_ERR(x) != -EPROBE_DEFER)
		dev_err(dev, ...);

	return PTR_ERR(x);

Ah, it could use dev_err_probe() which has exactly the required
semantic, so no need to reinvent the wheel.

I don't know your bubble of work, in mine there are often customers who
say something like "Look, my kernel error log contains these errors,
why is there a problem with device X?" and the explanation is just that
a driver emitted an error message in the -EPROBE_DEFER case, sometimes
even several of them, e.g. once whenever an USB thumb drive is inserted
into the machine. (ok, if this happens there is likely indeed a problem
to fix, but then a single error message would still be nicer.) So I
really like drivers that don't print an error on probe deferral. Also
having drivers write something helpful (today!) to
/sys/kernel/debug/devices_deferred is very appreciated, as it makes
debugging quite a bit easier. dev_err_probe() helps for both.

All in all pushing the error message reporting into the providers seems
to be orthogonal to the question how to best report an error on the
requester's side while the providers are still "unfixed". And not using
an approach that has upsides today and not only when a certain series
(that wasn't even posted yet) landed, seems wrong to me.

> > Also I don't share that sentiment, given that today
> > devm_clk_get_optional() and all the other resource providers don't do
> > the necessary stuff for deferral handling, I strongly prefer to use the
> > mechanism that is available today (even if it might be possible to
> > improve it) instead of open coding it. And if it's only because once the
> > improved variant is available it's easier to identify the code locations
> > that need adaption if they all use a common function instead of
> > identifying something like
> > 
> > 	clk = devm_clk_get_optional(&pdev->dev, NULL);
> > 	if (IS_ERR(clk)) {
> > 		error = PTR_ERR(clk);
> > 		if (error != -EPROBE_DEFER)
> > 			dev_err(pdev->dev, "Failed to get clk: %pe\n", clk)
> > 		else
> > 			device_set_deferred_probe_reason(dev, oh_I_need_a_struct_va_format_how_do_I_get_this?);
> 
> You do not, you happily ignore it and wait for providers to do it for
> you instead of forcing the change through all drivers.

I don't get this. What change is forced here? Today (where some
providers don't emit an error message) the driver has to emit an error
message itself. (Either using dev_err_probe() with the above mentioned
advantages, by open coding it or by using a simpler and inferior
procedure.) And once devm_clk_get_optional() is adapted to emit the
error message itself, you have to adapt all callers anyhow, and which of
the approaches a given driver selected doesn't make any relevant
difference, you just remove the error message emitting there.
But until devm_clk_get_optional() is "fixed" (in a year? Or two? Ever?)
you force an inferior behaviour for users with your refusal to allow
dev_err_probe().

> > 		return error;
> > 	}
> > 
> > instead of
> > 
> > 	clk = devm_clk_get_optional(&pdev->dev, NULL);
> > 	if (IS_ERR(clk))
> > 		return dev_err_probe(&pdev->dev, PTR_ERR(clk), "Failed to get clock\n");
> > 	
> > Even if the driver does not call device_set_deferred_probe_reason(), the
> > additional check for error != -EPROBE_DEFER is ugly, isn't it?
> 
> I'd simply do
> 
> 	clk = devm_clk_get_optional(...);
> 	error = PTR_ERR_OR_ZERO(clk);
> 	if (error) {
> 		dev_err(&pdev->dev, "...", error);
> 		return error;
> 	}

Done something similar now (not using PTR_ERR_OR_ZERO() which in my
experience is even more controversial than dev_err_probe()).

> > > Could you p lease resubmit with "normal" dev_err()/dev_warn()/etc and I
> > > will be happy to apply.
> > 
> > Is the above the variant you prefer? Maybe without the call to
> > device_set_deferred_probe_reason()? Or maybe even without the check for
> > -EPROBE_DEFER (which however might result in wrong error messages which
> > is IMHO worse than the ugliness of the additional check)?
> 
> Why are they wrong? They are not. They would literally say that some
> resource was not obtained because it is not ready.

With your approach you emit messages at the error level (i.e. where
people assume there is a real problem that needs addressing), but this
information is (in most cases) irrelevant and already wrong when someone
reads it. So the only effect is delaying the boot process a bit and
irritating people. Yes, technically it's right to emit an error on
-EPROBE_DEFER, but its usefulness is negative.

Best regards
Uwe
diff mbox series

Patch

diff --git a/drivers/input/keyboard/snvs_pwrkey.c b/drivers/input/keyboard/snvs_pwrkey.c
index 2f5e3ab5ed63..b79a559b8329 100644
--- a/drivers/input/keyboard/snvs_pwrkey.c
+++ b/drivers/input/keyboard/snvs_pwrkey.c
@@ -3,6 +3,7 @@ 
 // Driver for the IMX SNVS ON/OFF Power Key
 // Copyright (C) 2015 Freescale Semiconductor, Inc. All Rights Reserved.
 
+#include <linux/clk.h>
 #include <linux/device.h>
 #include <linux/err.h>
 #include <linux/init.h>
@@ -99,6 +100,11 @@  static irqreturn_t imx_snvs_pwrkey_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+static void imx_snvs_pwrkey_disable_clk(void *data)
+{
+	clk_disable_unprepare(data);
+}
+
 static void imx_snvs_pwrkey_act(void *pdata)
 {
 	struct pwrkey_drv_data *pd = pdata;
@@ -111,6 +117,7 @@  static int imx_snvs_pwrkey_probe(struct platform_device *pdev)
 	struct pwrkey_drv_data *pdata;
 	struct input_dev *input;
 	struct device_node *np;
+	struct clk *clk;
 	int error;
 	u32 vid;
 
@@ -134,6 +141,19 @@  static int imx_snvs_pwrkey_probe(struct platform_device *pdev)
 		dev_warn(&pdev->dev, "KEY_POWER without setting in dts\n");
 	}
 
+	clk = devm_clk_get_optional(&pdev->dev, NULL);
+	if (IS_ERR(clk))
+		return dev_err_probe(&pdev->dev, PTR_ERR(clk), "Failed to get clock\n");
+
+	error = clk_prepare_enable(clk);
+	if (error)
+		return dev_err_probe(&pdev->dev, PTR_ERR(clk), "Failed to enable clock\n");
+
+	error = devm_add_action_or_reset(&pdev->dev, imx_snvs_pwrkey_disable_clk, clk);
+	if (error)
+		return dev_err_probe(&pdev->dev, PTR_ERR(clk),
+				     "Failed to register clock cleanup handler\n");
+
 	pdata->wakeup = of_property_read_bool(np, "wakeup-source");
 
 	pdata->irq = platform_get_irq(pdev, 0);