diff mbox

of: Add generic handling for hardware incomplete fail state

Message ID 1460486275-12256-1-git-send-email-tony@atomide.com (mailing list archive)
State New, archived
Headers show

Commit Message

Tony Lindgren April 12, 2016, 6:37 p.m. UTC
We have devices that are in incomplete state, but still need to be
probed to allow properly idling them for PM. Some examples are
devices that are not pinned out on certain packages, or otherwise
not enabled for use on some SoCs.

Setting status = "disabled" cannot be used for this case. Setting
"disabled" makes Linux ignore these devices so they are not probed.

To proper way deal with the incomplete devices is to probe them,
then allow the driver to check the state, and the disable or idle
the device using PM runtime. To do this we need to often enable
the device and run some device specific code to idle the device.

Also boards may have needs to disable and idle unused devices.
This may be needed for example if all resources are not wired
for a certain board variant.

It seems we can use the ePAPR 1.1 status fail-sss to do this.
From "Table 2-4 Values for status property" we have "fail-sss":

"Indicates that the device is not operational. A serious error was
 detected in the device and it is unlikely to become operational
 without repair. The sss portion of the value is specific to the
 device and indicates the error condition detected."

At least some of these fail states can be handled in a generic
way. So let's introduce a generic status = "fail-hw-incomplete"
that describes a situation where a device driver probe should
just shut down or idle the device if possible and then bail out.
This allows the SoC variant and board specific dts files to set
the status "fail-hw-incomplete" as needed.

The suggested usage in a device driver probe is:

static int foo_probe(struct platform_device *pdev)
{
	int err;
	...

        pm_runtime_enable(&pdev->dev);
        pm_runtime_get_sync(&pdev->dev);
        pm_runtime_use_autosuspend(&pdev->dev);
	...

	/* Configure device, load firmware, idle device */
	...

	if (of_device_is_incomplete(pdev->dev.of_node)) {
		err = -ENODEV;
		goto out;
	}

	/* Continue normal device probe */
	...

	return 0;

out:
	pm_runtime_dont_use_autosuspend(&pdev->dev);
	pm_runtime_put_sync(&pdev->dev);
	pm_runtime_disable(&pdev->dev);

	return err;
}

Then as needed, handling for other generic status = "fail-sss"
states can be added if needed as per ePAPR 1.1.

Cc: Nishanth Menon <nm@ti.com>
Cc: Tero Kristo <t-kristo@ti.com>
Cc: Tom Rini <trini@konsulko.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
---
 drivers/of/base.c  | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 include/linux/of.h |  6 ++++++
 2 files changed, 58 insertions(+), 3 deletions(-)

Comments

Frank Rowand April 12, 2016, 8:13 p.m. UTC | #1
Hi Tony,

I agree with the need for some way of handling the incomplete
hardware issue.  I like the idea of having a uniform method for
all nodes.

I am stumbling over what the status property is supposed to convey
and what the "fail-hw-incomplete" is meant to convey.

The status property is meant to convey the current state of the
node.

"fail-hw-incomplete" is meant to describe the node implementation,
saying that some portions of hardware that the driver expects to
be present do not exist.  If I understood your explanation at ELC
correctly, an examples of this could be that a uart cell is not
routed to transmit and receive data pins or the interrupt line
from the cell is not routed to an interrupt controller.  So the
node is not useful, but it makes sense to be able to power manage
the node, turning off power so that it is not wasting power.

It seems to me that the info that needs to be conveyed is a
description of the hardware, stating:
  - some portions or features of the node are not present and/or
    are not usable
  - power management of the node is possible

Status of "fail-sss" is meant to indicate an error was detected in
the device, and that the error might (or might not) be repairable.

So the difference I see is state vs hardware description.

I would prefer to come up with a new boolean property (with a
standard name that any node binding could choose to implement)
that says something like "only power management is available for
this node, do not attempt to use any other feature of the node".

With that change, the bulk of your patch looks good, with
minor changes:

  __of_device_is_available() would not need to change.

  __of_device_is_incomplete() would change to check the new
  boolean property.  (And I would suggest renaming it to
  something that conveys it is ok to power manage the
  device, but do not do anything else to the device.)

-Frank
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren April 12, 2016, 8:34 p.m. UTC | #2
Hi,

* Frank Rowand <frowand.list@gmail.com> [160412 13:15]:
> Hi Tony,
> 
> I agree with the need for some way of handling the incomplete
> hardware issue.  I like the idea of having a uniform method for
> all nodes.
> 
> I am stumbling over what the status property is supposed to convey
> and what the "fail-hw-incomplete" is meant to convey.
> 
> The status property is meant to convey the current state of the
> node.
> 
> "fail-hw-incomplete" is meant to describe the node implementation,
> saying that some portions of hardware that the driver expects to
> be present do not exist.  If I understood your explanation at ELC
> correctly, an examples of this could be that a uart cell is not
> routed to transmit and receive data pins or the interrupt line
> from the cell is not routed to an interrupt controller.  So the
> node is not useful, but it makes sense to be able to power manage
> the node, turning off power so that it is not wasting power.

Yes cases like that are common.

> It seems to me that the info that needs to be conveyed is a
> description of the hardware, stating:
>   - some portions or features of the node are not present and/or
>     are not usable
>   - power management of the node is possible
> 
> Status of "fail-sss" is meant to indicate an error was detected in
> the device, and that the error might (or might not) be repairable.
> 
> So the difference I see is state vs hardware description.

OK thanks for the clarification. I don't see why "fail-hw-incomplete"
could not be set dynamically during the probe in some cases based
on the SoC revision detection for example. So from that point of
view using status with the "fail-sss" logic would make more sense.

> I would prefer to come up with a new boolean property (with a
> standard name that any node binding could choose to implement)
> that says something like "only power management is available for
> this node, do not attempt to use any other feature of the node".

Heh that's going to be a long property name :) How about
unusable-incomplete-idle-only :)

> With that change, the bulk of your patch looks good, with
> minor changes:
> 
>   __of_device_is_available() would not need to change.
> 
>   __of_device_is_incomplete() would change to check the new
>   boolean property.  (And I would suggest renaming it to
>   something that conveys it is ok to power manage the
>   device, but do not do anything else to the device.)

I'm fine with property too, but the runtime probe fail state
changes worry me a bit with that one.

I think Rob also preferred to use the status though while we
chatted at ELC?

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Frank Rowand April 12, 2016, 9:41 p.m. UTC | #3
On 4/12/2016 1:34 PM, Tony Lindgren wrote:
> Hi,
> 
> * Frank Rowand <frowand.list@gmail.com> [160412 13:15]:
>> Hi Tony,
>>
>> I agree with the need for some way of handling the incomplete
>> hardware issue.  I like the idea of having a uniform method for
>> all nodes.
>>
>> I am stumbling over what the status property is supposed to convey
>> and what the "fail-hw-incomplete" is meant to convey.
>>
>> The status property is meant to convey the current state of the
>> node.
>>
>> "fail-hw-incomplete" is meant to describe the node implementation,
>> saying that some portions of hardware that the driver expects to
>> be present do not exist.  If I understood your explanation at ELC
>> correctly, an examples of this could be that a uart cell is not
>> routed to transmit and receive data pins or the interrupt line
>> from the cell is not routed to an interrupt controller.  So the
>> node is not useful, but it makes sense to be able to power manage
>> the node, turning off power so that it is not wasting power.
> 
> Yes cases like that are common.
> 
>> It seems to me that the info that needs to be conveyed is a
>> description of the hardware, stating:
>>   - some portions or features of the node are not present and/or
>>     are not usable
>>   - power management of the node is possible
>>
>> Status of "fail-sss" is meant to indicate an error was detected in
>> the device, and that the error might (or might not) be repairable.
>>
>> So the difference I see is state vs hardware description.
> 
> OK thanks for the clarification. I don't see why "fail-hw-incomplete"
> could not be set dynamically during the probe in some cases based
> on the SoC revision detection for example. So from that point of
> view using status with the "fail-sss" logic would make more sense.

If the probe detects that the device should only be power managed
based on the SoC revision, then it would simply be one more
test added at the top of probe.  The patch would change from:

   if (of_device_is_incomplete(pdev->dev.of_node)) {

to:

   if (of_device_is_incomplete(pdev->dev.of_node) || socrev == XXX) {

That code would be the same whether the property involved was
status or something else.

> 
>> I would prefer to come up with a new boolean property (with a
>> standard name that any node binding could choose to implement)
>> that says something like "only power management is available for
>> this node, do not attempt to use any other feature of the node".
> 
> Heh that's going to be a long property name :) How about
> unusable-incomplete-idle-only :)

Or even pm-only.  Maybe I got a little carried away with my
verbosity. :)


>> With that change, the bulk of your patch looks good, with
>> minor changes:
>>
>>   __of_device_is_available() would not need to change.
>>
>>   __of_device_is_incomplete() would change to check the new
>>   boolean property.  (And I would suggest renaming it to
>>   something that conveys it is ok to power manage the
>>   device, but do not do anything else to the device.)
> 
> I'm fine with property too, but the runtime probe fail state
> changes worry me a bit with that one.

I don't understand what the concern is.  The change I suggested
would use exactly the same code for probe as the example patch
you provided, but just with a slight name change for the function.


> I think Rob also preferred to use the status though while we
> chatted at ELC?

That is the impression I got too.  We'll have to see if I can
convince him otherwise.


> Regards,
> 
> Tony
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren April 12, 2016, 10:02 p.m. UTC | #4
* Frank Rowand <frowand.list@gmail.com> [160412 14:42]:
> On 4/12/2016 1:34 PM, Tony Lindgren wrote:
> > 
> > OK thanks for the clarification. I don't see why "fail-hw-incomplete"
> > could not be set dynamically during the probe in some cases based
> > on the SoC revision detection for example. So from that point of
> > view using status with the "fail-sss" logic would make more sense.
> 
> If the probe detects that the device should only be power managed
> based on the SoC revision, then it would simply be one more
> test added at the top of probe.  The patch would change from:
> 
>    if (of_device_is_incomplete(pdev->dev.of_node)) {
> 
> to:
> 
>    if (of_device_is_incomplete(pdev->dev.of_node) || socrev == XXX) {
> 
> That code would be the same whether the property involved was
> status or something else.

Yeah that should work if we had a generic way to get the runtime
socrev somehow :) I guess the closest thing is the ARM system_rev.

> >> I would prefer to come up with a new boolean property (with a
> >> standard name that any node binding could choose to implement)
> >> that says something like "only power management is available for
> >> this node, do not attempt to use any other feature of the node".
> > 
> > Heh that's going to be a long property name :) How about
> > unusable-incomplete-idle-only :)
> 
> Or even pm-only.  Maybe I got a little carried away with my
> verbosity. :)

That works for me unless somebody comes up with a better one.
I can only think unusable-for-io, which is no better.

> >> With that change, the bulk of your patch looks good, with
> >> minor changes:
> >>
> >>   __of_device_is_available() would not need to change.
> >>
> >>   __of_device_is_incomplete() would change to check the new
> >>   boolean property.  (And I would suggest renaming it to
> >>   something that conveys it is ok to power manage the
> >>   device, but do not do anything else to the device.)
> > 
> > I'm fine with property too, but the runtime probe fail state
> > changes worry me a bit with that one.
> 
> I don't understand what the concern is.  The change I suggested
> would use exactly the same code for probe as the example patch
> you provided, but just with a slight name change for the function.

I guess I'm just wondering if using property vs status will make
things harder to change during runtime. For example, let's assume
u-boot needs to set some devices to "pm-only" state based on the
SoC revision on a board.

> > I think Rob also preferred to use the status though while we
> > chatted at ELC?
> 
> That is the impression I got too.  We'll have to see if I can
> convince him otherwise.

Yeah let's wait for his comments.

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rob Herring April 12, 2016, 10:20 p.m. UTC | #5
On Tue, Apr 12, 2016 at 4:41 PM, Frank Rowand <frowand.list@gmail.com> wrote:
> On 4/12/2016 1:34 PM, Tony Lindgren wrote:
>> Hi,
>>
>> * Frank Rowand <frowand.list@gmail.com> [160412 13:15]:
>>> Hi Tony,
>>>
>>> I agree with the need for some way of handling the incomplete
>>> hardware issue.  I like the idea of having a uniform method for
>>> all nodes.
>>>
>>> I am stumbling over what the status property is supposed to convey
>>> and what the "fail-hw-incomplete" is meant to convey.
>>>
>>> The status property is meant to convey the current state of the
>>> node.
>>>
>>> "fail-hw-incomplete" is meant to describe the node implementation,
>>> saying that some portions of hardware that the driver expects to
>>> be present do not exist.  If I understood your explanation at ELC
>>> correctly, an examples of this could be that a uart cell is not
>>> routed to transmit and receive data pins or the interrupt line
>>> from the cell is not routed to an interrupt controller.  So the
>>> node is not useful, but it makes sense to be able to power manage
>>> the node, turning off power so that it is not wasting power.
>>
>> Yes cases like that are common.
>>
>>> It seems to me that the info that needs to be conveyed is a
>>> description of the hardware, stating:
>>>   - some portions or features of the node are not present and/or
>>>     are not usable
>>>   - power management of the node is possible
>>>
>>> Status of "fail-sss" is meant to indicate an error was detected in
>>> the device, and that the error might (or might not) be repairable.
>>>
>>> So the difference I see is state vs hardware description.

The question to ask is are we indicating the "operational status of a
device"? If yes, that is the definition of status and using it would
be appropriate.

IMO, I think we are.

>> OK thanks for the clarification. I don't see why "fail-hw-incomplete"
>> could not be set dynamically during the probe in some cases based
>> on the SoC revision detection for example. So from that point of
>> view using status with the "fail-sss" logic would make more sense.
>
> If the probe detects that the device should only be power managed
> based on the SoC revision, then it would simply be one more
> test added at the top of probe.  The patch would change from:
>
>    if (of_device_is_incomplete(pdev->dev.of_node)) {
>
> to:
>
>    if (of_device_is_incomplete(pdev->dev.of_node) || socrev == XXX) {

I think Tony meant the bootloader or platform code would do this and
tweak the DT. We don't have much of a standard API for revision
checking, so drivers don't check SoC revisions generally.

> That code would be the same whether the property involved was
> status or something else.
>
>>
>>> I would prefer to come up with a new boolean property (with a
>>> standard name that any node binding could choose to implement)
>>> that says something like "only power management is available for
>>> this node, do not attempt to use any other feature of the node".
>>
>> Heh that's going to be a long property name :) How about
>> unusable-incomplete-idle-only :)
>
> Or even pm-only.  Maybe I got a little carried away with my
> verbosity. :)

I don't think we should define it so narrowly. I think DT just
indicates the device is in a non-usable state (somewhere between ok
and disabled) and the driver knows what to do with that information.

>>> With that change, the bulk of your patch looks good, with
>>> minor changes:
>>>
>>>   __of_device_is_available() would not need to change.
>>>
>>>   __of_device_is_incomplete() would change to check the new
>>>   boolean property.  (And I would suggest renaming it to
>>>   something that conveys it is ok to power manage the
>>>   device, but do not do anything else to the device.)
>>
>> I'm fine with property too, but the runtime probe fail state
>> changes worry me a bit with that one.
>
> I don't understand what the concern is.  The change I suggested
> would use exactly the same code for probe as the example patch
> you provided, but just with a slight name change for the function.
>
>
>> I think Rob also preferred to use the status though while we
>> chatted at ELC?
>
> That is the impression I got too.  We'll have to see if I can
> convince him otherwise.

I did, but I'm not wed to it. I think it depends on the question above.

Rob
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren April 12, 2016, 10:27 p.m. UTC | #6
* Rob Herring <robh+dt@kernel.org> [160412 15:22]:
> On Tue, Apr 12, 2016 at 4:41 PM, Frank Rowand <frowand.list@gmail.com> wrote:
> >>> Status of "fail-sss" is meant to indicate an error was detected in
> >>> the device, and that the error might (or might not) be repairable.
> >>>
> >>> So the difference I see is state vs hardware description.
> 
> The question to ask is are we indicating the "operational status of a
> device"? If yes, that is the definition of status and using it would
> be appropriate.
> 
> IMO, I think we are.
> 
> >> OK thanks for the clarification. I don't see why "fail-hw-incomplete"
> >> could not be set dynamically during the probe in some cases based
> >> on the SoC revision detection for example. So from that point of
> >> view using status with the "fail-sss" logic would make more sense.
> >
> > If the probe detects that the device should only be power managed
> > based on the SoC revision, then it would simply be one more
> > test added at the top of probe.  The patch would change from:
> >
> >    if (of_device_is_incomplete(pdev->dev.of_node)) {
> >
> > to:
> >
> >    if (of_device_is_incomplete(pdev->dev.of_node) || socrev == XXX) {
> 
> I think Tony meant the bootloader or platform code would do this and
> tweak the DT. We don't have much of a standard API for revision
> checking, so drivers don't check SoC revisions generally.

Yes bootloader may need to set these based on the SoC revision.
There are already many boards with multiple SoC variants
available. Would like to hear Tom's comments on this one as
well from the u-boot point of view.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Frank Rowand April 12, 2016, 10:37 p.m. UTC | #7
On 4/12/2016 3:20 PM, Rob Herring wrote:
> On Tue, Apr 12, 2016 at 4:41 PM, Frank Rowand <frowand.list@gmail.com> wrote:
>> On 4/12/2016 1:34 PM, Tony Lindgren wrote:
>>> Hi,
>>>
>>> * Frank Rowand <frowand.list@gmail.com> [160412 13:15]:
>>>> Hi Tony,
>>>>
>>>> I agree with the need for some way of handling the incomplete
>>>> hardware issue.  I like the idea of having a uniform method for
>>>> all nodes.
>>>>
>>>> I am stumbling over what the status property is supposed to convey
>>>> and what the "fail-hw-incomplete" is meant to convey.
>>>>
>>>> The status property is meant to convey the current state of the
>>>> node.
>>>>
>>>> "fail-hw-incomplete" is meant to describe the node implementation,
>>>> saying that some portions of hardware that the driver expects to
>>>> be present do not exist.  If I understood your explanation at ELC
>>>> correctly, an examples of this could be that a uart cell is not
>>>> routed to transmit and receive data pins or the interrupt line
>>>> from the cell is not routed to an interrupt controller.  So the
>>>> node is not useful, but it makes sense to be able to power manage
>>>> the node, turning off power so that it is not wasting power.
>>>
>>> Yes cases like that are common.
>>>
>>>> It seems to me that the info that needs to be conveyed is a
>>>> description of the hardware, stating:
>>>>   - some portions or features of the node are not present and/or
>>>>     are not usable
>>>>   - power management of the node is possible
>>>>
>>>> Status of "fail-sss" is meant to indicate an error was detected in
>>>> the device, and that the error might (or might not) be repairable.
>>>>
>>>> So the difference I see is state vs hardware description.
> 
> The question to ask is are we indicating the "operational status of a
> device"? If yes, that is the definition of status and using it would
> be appropriate.
> 
> IMO, I think we are.

I see the reasoning.  I could go either way, but I lean toward thinking
of it as hardware description.


>>> OK thanks for the clarification. I don't see why "fail-hw-incomplete"
>>> could not be set dynamically during the probe in some cases based
>>> on the SoC revision detection for example. So from that point of
>>> view using status with the "fail-sss" logic would make more sense.
>>
>> If the probe detects that the device should only be power managed
>> based on the SoC revision, then it would simply be one more
>> test added at the top of probe.  The patch would change from:
>>
>>    if (of_device_is_incomplete(pdev->dev.of_node)) {
>>
>> to:
>>
>>    if (of_device_is_incomplete(pdev->dev.of_node) || socrev == XXX) {
> 
> I think Tony meant the bootloader or platform code would do this and
> tweak the DT. We don't have much of a standard API for revision
> checking, so drivers don't check SoC revisions generally.

OK, that makes more sense to me.


>> That code would be the same whether the property involved was
>> status or something else.
>>
>>>
>>>> I would prefer to come up with a new boolean property (with a
>>>> standard name that any node binding could choose to implement)
>>>> that says something like "only power management is available for
>>>> this node, do not attempt to use any other feature of the node".
>>>
>>> Heh that's going to be a long property name :) How about
>>> unusable-incomplete-idle-only :)
>>
>> Or even pm-only.  Maybe I got a little carried away with my
>> verbosity. :)
> 
> I don't think we should define it so narrowly. I think DT just
> indicates the device is in a non-usable state (somewhere between ok
> and disabled) and the driver knows what to do with that information.

My concern is that "non-usable" state is really vague.  I would
prefer that the message (however it is communicated) tells the
driver either what is usable or what is unusable.


>>>> With that change, the bulk of your patch looks good, with
>>>> minor changes:
>>>>
>>>>   __of_device_is_available() would not need to change.
>>>>
>>>>   __of_device_is_incomplete() would change to check the new
>>>>   boolean property.  (And I would suggest renaming it to
>>>>   something that conveys it is ok to power manage the
>>>>   device, but do not do anything else to the device.)
>>>
>>> I'm fine with property too, but the runtime probe fail state
>>> changes worry me a bit with that one.
>>
>> I don't understand what the concern is.  The change I suggested
>> would use exactly the same code for probe as the example patch
>> you provided, but just with a slight name change for the function.
>>
>>
>>> I think Rob also preferred to use the status though while we
>>> chatted at ELC?
>>
>> That is the impression I got too.  We'll have to see if I can
>> convince him otherwise.
> 
> I did, but I'm not wed to it. I think it depends on the question above.
> 
> Rob
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Frank Rowand April 12, 2016, 10:39 p.m. UTC | #8
On 4/12/2016 1:13 PM, Frank Rowand wrote:
> Hi Tony,

< snip >

> With that change, the bulk of your patch looks good, with
> minor changes:
> 
>   __of_device_is_available() would not need to change.
> 
>   __of_device_is_incomplete() would change to check the new
>   boolean property.  (And I would suggest renaming it to
>   something that conveys it is ok to power manage the
>   device, but do not do anything else to the device.)
> 
> -Frank

One more thought...

Are there multiple drivers that need to follow this
pattern, or just one at the moment?  If just one driver,
then I would suggest open-coding accessing the property
in the probe routine instead of adding the helper
functions.  If more drivers appear with the same
pattern then the helper functions could be added.

-Frank

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tony Lindgren April 12, 2016, 11:18 p.m. UTC | #9
* Frank Rowand <frowand.list@gmail.com> [160412 15:40]:
> On 4/12/2016 1:13 PM, Frank Rowand wrote:
> > Hi Tony,
> 
> < snip >
> 
> > With that change, the bulk of your patch looks good, with
> > minor changes:
> > 
> >   __of_device_is_available() would not need to change.
> > 
> >   __of_device_is_incomplete() would change to check the new
> >   boolean property.  (And I would suggest renaming it to
> >   something that conveys it is ok to power manage the
> >   device, but do not do anything else to the device.)
> > 
> > -Frank
> 
> One more thought...
> 
> Are there multiple drivers that need to follow this
> pattern, or just one at the moment?  If just one driver,
> then I would suggest open-coding accessing the property
> in the probe routine instead of adding the helper
> functions.  If more drivers appear with the same
> pattern then the helper functions could be added.

Well we already have several workarounds for devices to reset
them for idle by accessing the device registers:

omap_dss_reset
omap_hdq1w_reset
omap_i2c_reset

At least the three above are doing access to the device
registers partially duplicating the driver probe functionality.

Then the PM init code has some workarounds ti idle device
wrapper IP on SoCs that don't have the video or modem:

omap3xxx_prm_iva_idle
omap3_prm_reset_modem

And I'm just aware of the ones that we have in the mainline kernel
for PM to work. I bet Nishanth has more examples in mind where the
bootloader needs set devices to incomplete state based on
the SoC revision.

In general, I would rather see board specific dts files set
the unused devices to incomplete status rather than set them
to disabled. If they are set to disabled, from PM point of
view SoC specific workarounds are needed to idle them.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Rini April 12, 2016, 11:22 p.m. UTC | #10
On Tue, Apr 12, 2016 at 03:39:30PM -0700, Frank Rowand wrote:
> On 4/12/2016 1:13 PM, Frank Rowand wrote:
> > Hi Tony,
> 
> < snip >
> 
> > With that change, the bulk of your patch looks good, with
> > minor changes:
> > 
> >   __of_device_is_available() would not need to change.
> > 
> >   __of_device_is_incomplete() would change to check the new
> >   boolean property.  (And I would suggest renaming it to
> >   something that conveys it is ok to power manage the
> >   device, but do not do anything else to the device.)
> > 
> > -Frank
> 
> One more thought...
> 
> Are there multiple drivers that need to follow this
> pattern, or just one at the moment?  If just one driver,
> then I would suggest open-coding accessing the property
> in the probe routine instead of adding the helper
> functions.  If more drivers appear with the same
> pattern then the helper functions could be added.

This is a many problem.  I was trying to describe this at the BoF but
it's something more along the lines of:

SoC family ABxx supports 20 IP blocks which will always be present on
the physical chip but depending on the exact 'xx' many of those 20
blocks will be powered but 100% unusable.  Based on the 'xx' we will
know this, and know that we must turn them off.

I'm under the impression that it's not just "turn them off for power
saving" but "turn them off so they don't break PM".
Tom Rini April 13, 2016, 12:11 a.m. UTC | #11
On Tue, Apr 12, 2016 at 03:27:32PM -0700, Tony Lindgren wrote:
> * Rob Herring <robh+dt@kernel.org> [160412 15:22]:
> > On Tue, Apr 12, 2016 at 4:41 PM, Frank Rowand <frowand.list@gmail.com> wrote:
> > >>> Status of "fail-sss" is meant to indicate an error was detected in
> > >>> the device, and that the error might (or might not) be repairable.
> > >>>
> > >>> So the difference I see is state vs hardware description.
> > 
> > The question to ask is are we indicating the "operational status of a
> > device"? If yes, that is the definition of status and using it would
> > be appropriate.
> > 
> > IMO, I think we are.
> > 
> > >> OK thanks for the clarification. I don't see why "fail-hw-incomplete"
> > >> could not be set dynamically during the probe in some cases based
> > >> on the SoC revision detection for example. So from that point of
> > >> view using status with the "fail-sss" logic would make more sense.
> > >
> > > If the probe detects that the device should only be power managed
> > > based on the SoC revision, then it would simply be one more
> > > test added at the top of probe.  The patch would change from:
> > >
> > >    if (of_device_is_incomplete(pdev->dev.of_node)) {
> > >
> > > to:
> > >
> > >    if (of_device_is_incomplete(pdev->dev.of_node) || socrev == XXX) {
> > 
> > I think Tony meant the bootloader or platform code would do this and
> > tweak the DT. We don't have much of a standard API for revision
> > checking, so drivers don't check SoC revisions generally.
> 
> Yes bootloader may need to set these based on the SoC revision.
> There are already many boards with multiple SoC variants
> available. Would like to hear Tom's comments on this one as
> well from the u-boot point of view.

So, the first part of it is that any "smart" bootloader will have to
tweak the DT.  A "dumb" bootloader will just pass along a pre-corrected
DT.  In fact, some of the problems are going to probably still have to
be solved by passing in a correct base DT, given that not all changes
are run-time detectable.  But from my point of view, the important part
is that it won't matter what vendor the SoC is from but that we can say
fdt_fixup_hw_incomplete(blob, compatible) or something like that.
diff mbox

Patch

diff --git a/drivers/of/base.c b/drivers/of/base.c
index b299de2..8eb161f 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -518,8 +518,8 @@  EXPORT_SYMBOL(of_machine_is_compatible);
  *
  *  @device: Node to check for availability, with locks already held
  *
- *  Returns true if the status property is absent or set to "okay" or "ok",
- *  false otherwise
+ *  Returns true if the status property is absent or set to "okay", "ok",
+ *  or "fail-hw-incomplete", false otherwise
  */
 static bool __of_device_is_available(const struct device_node *device)
 {
@@ -534,7 +534,8 @@  static bool __of_device_is_available(const struct device_node *device)
 		return true;
 
 	if (statlen > 0) {
-		if (!strcmp(status, "okay") || !strcmp(status, "ok"))
+		if (!strcmp(status, "okay") || !strcmp(status, "ok") ||
+		    !strcmp(status, "fail-hw-incomplete"))
 			return true;
 	}
 
@@ -563,6 +564,54 @@  bool of_device_is_available(const struct device_node *device)
 EXPORT_SYMBOL(of_device_is_available);
 
 /**
+ *  __of_device_is_incomplete - check if a device is incomplete
+ *
+ *  @device: Node to check for partial failure with locks already held
+ *
+ *  Returns true if the status is "fail-hw-incomplete", false otherwise.
+ */
+static bool __of_device_is_incomplete(const struct device_node *device)
+{
+	const char *status;
+	int statlen;
+
+	if (!device)
+		return false;
+
+	status = __of_get_property(device, "status", &statlen);
+	if (status == NULL)
+		return false;
+
+	if (statlen > 0) {
+		if (!strcmp(status, "fail-hw-incomplete"))
+			return true;
+	}
+
+	return false;
+}
+
+/**
+ *  of_device_is_incomplete - check if a device is incomplete
+ *
+ *  @device: Node to check for partial failure
+ *
+ *  Returns true if the status property is set to "fail-hw-incomplete",
+ *  false otherwise. Fore more information, see fail-sss in ePAPR 1.1
+ *  "Table 2-4 Values for status property".
+ */
+bool of_device_is_incomplete(const struct device_node *device)
+{
+	unsigned long flags;
+	bool res;
+
+	raw_spin_lock_irqsave(&devtree_lock, flags);
+	res = __of_device_is_incomplete(device);
+	raw_spin_unlock_irqrestore(&devtree_lock, flags);
+	return res;
+}
+EXPORT_SYMBOL(of_device_is_incomplete);
+
+/**
  *  of_device_is_big_endian - check if a device has BE registers
  *
  *  @device: Node to check for endianness
diff --git a/include/linux/of.h b/include/linux/of.h
index 7fcb681..1d37db2 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -308,6 +308,7 @@  extern int of_property_read_string_helper(const struct device_node *np,
 extern int of_device_is_compatible(const struct device_node *device,
 				   const char *);
 extern bool of_device_is_available(const struct device_node *device);
+extern bool of_device_is_incomplete(const struct device_node *device);
 extern bool of_device_is_big_endian(const struct device_node *device);
 extern const void *of_get_property(const struct device_node *node,
 				const char *name,
@@ -480,6 +481,11 @@  static inline bool of_device_is_available(const struct device_node *device)
 	return false;
 }
 
+static inline bool of_device_is_incomplete(const struct device_node *device)
+{
+	return false;
+}
+
 static inline bool of_device_is_big_endian(const struct device_node *device)
 {
 	return false;