diff mbox series

drm: bridge: thc63lvd1024: Print error message when DT parsing fails

Message ID 20240318160601.2813-1-laurent.pinchart+renesas@ideasonboard.com (mailing list archive)
State New, archived
Headers show
Series drm: bridge: thc63lvd1024: Print error message when DT parsing fails | expand

Commit Message

Laurent Pinchart March 18, 2024, 4:06 p.m. UTC
Commit 00084f0c01bf ("drm: bridge: thc63lvd1024: Switch to use
of_graph_get_remote_node()") simplified the thc63lvd1024 driver by
replacing hand-rolled code with a helper function. While doing so, it
created an error code path at probe time without any error message,
potentially causing probe issues that get annoying to debug. Fix it by
adding an error message.

Fixes: 00084f0c01bf ("drm: bridge: thc63lvd1024: Switch to use of_graph_get_remote_node()")
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
---
 drivers/gpu/drm/bridge/thc63lvd1024.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)


base-commit: 00084f0c01bf3a2591d007010b196e048281c455

Comments

Sui Jingfeng March 18, 2024, 4:42 p.m. UTC | #1
Hi,


On 2024/3/19 00:06, Laurent Pinchart wrote:
> Commit 00084f0c01bf ("drm: bridge: thc63lvd1024: Switch to use
> of_graph_get_remote_node()") simplified the thc63lvd1024 driver by
> replacing hand-rolled code with a helper function.


[...]


> While doing so, it
> created an error code path at probe time without any error message,

If this is a reason or a concern, then every drm bridges drivers will suffer from
such a concern. Right?


> potentially causing probe issues that get annoying to debug.

Sorry, let's keep it fair enough, it creates nothing annoyed.

If there is a probe issues, then, it is caused by ill-behavioral DT.
*NOT* my patch. And should be found during review stage.

If the of_graph_get_remote_node() function is not good enough,
I suggest to improve the of_graph_get_remote_node() function,
then all callers of it will benefits.

Well, the strong word here just terrifying new programmers to call
core function helpers. Please use more *soft* description in the
commit message.


> Fix it by
> adding an error message.
>
> Fixes: 00084f0c01bf ("drm: bridge: thc63lvd1024: Switch to use of_graph_get_remote_node()")


Please drop the fixes tag at here, append the tag to a real bug-fix patch will make more sense imo.
I suggest to improve the of_graph_get_remote_node() function, then all callers of it will benefits.
NOT a single implement like this.

> Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> ---
>   drivers/gpu/drm/bridge/thc63lvd1024.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/bridge/thc63lvd1024.c b/drivers/gpu/drm/bridge/thc63lvd1024.c
> index 5f99f9724081..674efc489e3a 100644
> --- a/drivers/gpu/drm/bridge/thc63lvd1024.c
> +++ b/drivers/gpu/drm/bridge/thc63lvd1024.c
> @@ -125,8 +125,11 @@ static int thc63_parse_dt(struct thc63_dev *thc63)
>   
>   	remote = of_graph_get_remote_node(thc63->dev->of_node,
>   					  THC63_RGB_OUT0, -1);
> -	if (!remote)
> +	if (!remote) {
> +		dev_err(thc63->dev, "No remote endpoint for port@%u\n",
> +			THC63_RGB_OUT0);
>   		return -ENODEV;
> +	}
>   
>   	thc63->next = of_drm_find_bridge(remote);
>   	of_node_put(remote);
>
> base-commit: 00084f0c01bf3a2591d007010b196e048281c455
Laurent Pinchart March 18, 2024, 6:04 p.m. UTC | #2
Hi Sui,

On Tue, Mar 19, 2024 at 12:42:41AM +0800, Sui Jingfeng wrote:
> On 2024/3/19 00:06, Laurent Pinchart wrote:
> > Commit 00084f0c01bf ("drm: bridge: thc63lvd1024: Switch to use
> > of_graph_get_remote_node()") simplified the thc63lvd1024 driver by
> > replacing hand-rolled code with a helper function.
> 
> [...]
> 
> > While doing so, it
> > created an error code path at probe time without any error message,
> 
> If this is a reason or a concern, then every drm bridges drivers will suffer from
> such a concern. Right?

Yes, bridge drivers (or any driver, really) should avoid failing probe
silently.

> > potentially causing probe issues that get annoying to debug.
> 
> Sorry, let's keep it fair enough, it creates nothing annoyed.
> 
> If there is a probe issues, then, it is caused by ill-behavioral DT.
> *NOT* my patch. And should be found during review stage.

Even before the review stage, in the DT development stage. My point is
that creating a silent failure path in probe will make it more difficult
for DT developers to debug issues.

> If the of_graph_get_remote_node() function is not good enough,
> I suggest to improve the of_graph_get_remote_node() function,
> then all callers of it will benefits.
> 
> Well, the strong word here just terrifying new programmers to call
> core function helpers. Please use more *soft* description in the
> commit message.

Could you please propose a wording that you would consider more soft ?

> > Fix it by
> > adding an error message.
> >
> > Fixes: 00084f0c01bf ("drm: bridge: thc63lvd1024: Switch to use of_graph_get_remote_node()")
> 
> Please drop the fixes tag at here, append the tag to a real bug-fix patch will make more sense imo.
> I suggest to improve the of_graph_get_remote_node() function, then all callers of it will benefits.
> NOT a single implement like this.

Improving core helpers is certainly a good idea, and if we do so, we can
simplify drivers. What I'm concerned is that commit 00084f0c01bf creates
a silent probe failure path, which didn't exist before it. This is why
this patch references it in the Fixes: tag, making sure that this patch
will get backported to any stable kernel that includes commit
00084f0c01bf. As far as I understand, this is business as usual. There's
nothing personal here, and no judgement on the quality of your code.

> > Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> > ---
> >   drivers/gpu/drm/bridge/thc63lvd1024.c | 5 ++++-
> >   1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/bridge/thc63lvd1024.c b/drivers/gpu/drm/bridge/thc63lvd1024.c
> > index 5f99f9724081..674efc489e3a 100644
> > --- a/drivers/gpu/drm/bridge/thc63lvd1024.c
> > +++ b/drivers/gpu/drm/bridge/thc63lvd1024.c
> > @@ -125,8 +125,11 @@ static int thc63_parse_dt(struct thc63_dev *thc63)
> >   
> >   	remote = of_graph_get_remote_node(thc63->dev->of_node,
> >   					  THC63_RGB_OUT0, -1);
> > -	if (!remote)
> > +	if (!remote) {
> > +		dev_err(thc63->dev, "No remote endpoint for port@%u\n",
> > +			THC63_RGB_OUT0);
> >   		return -ENODEV;
> > +	}
> >   
> >   	thc63->next = of_drm_find_bridge(remote);
> >   	of_node_put(remote);
> >
> > base-commit: 00084f0c01bf3a2591d007010b196e048281c455
Sui Jingfeng March 18, 2024, 6:18 p.m. UTC | #3
Hi,


On 2024/3/19 02:04, Laurent Pinchart wrote:
> Hi Sui,
>
> On Tue, Mar 19, 2024 at 12:42:41AM +0800, Sui Jingfeng wrote:
>> On 2024/3/19 00:06, Laurent Pinchart wrote:
>>> Commit 00084f0c01bf ("drm: bridge: thc63lvd1024: Switch to use
>>> of_graph_get_remote_node()") simplified the thc63lvd1024 driver by
>>> replacing hand-rolled code with a helper function.
>> [...]
>>
>>> While doing so, it
>>> created an error code path at probe time without any error message,
>> If this is a reason or a concern, then every drm bridges drivers will suffer from
>> such a concern. Right?
> Yes, bridge drivers (or any driver, really) should avoid failing probe
> silently.


Yes, I agree with you that bridge drivers should avoid failing probe.

But the real problem that deserve to discuss is that is it really *silently* ?

The of_graph_get_remote_node() function do have debug prints on failure:


   - pr_debug("no valid endpoint (%d, %d) for node %pOF\n", port, endpoint, node);
   - pr_debug("no valid remote node\n");
   - pr_debug("not available for remote node\n");

So it is not really *silently*.
Sui Jingfeng March 18, 2024, 7:23 p.m. UTC | #4
Hi,


On 2024/3/19 02:04, Laurent Pinchart wrote:
> Improving core helpers is certainly a good idea, and if we do so, we can
> simplify drivers. What I'm concerned is that commit 00084f0c01bf creates
> a silent probe failure path,


No, I can't agree here. It doesn't creates a silent probe failure path.

Simply because

1) It is NOT silent.
2) It should be exist at product level kernel.


> which didn't exist before it.


Again, it shouldn't be exist.

Otherwise it hints us that there is ill-behavior-ed DT in the mainstream kernel
or a specific product(or development board). If I were you, I would like to fix
the boot failure first.

In the earlier stage of my attempt to contribute, I also would like to enable
debug output as much as possible. Just like you, the benefit is obvious: It really
eliminate the pain on developing stage and when bugs happens.

But I was told many many times that mainstream kernel is not for debug, it is
for sound products. I bet you have seen some product level drivers print very less.
I'm not understand why in the past, but I think I could understand something now.
Probably because professional programmers really confident about what they have
wrote. As they have been tested and/or reviewed thousands or ten thousands times.

Enable this debug output by default can only prove to the community that you are
not confident about something, either the community's reviewing power on DTS or
your debug techniques.


> This is why
> this patch references it in the Fixes: tag, making sure that this patch
> will get backported to any stable kernel that includes commit
> 00084f0c01bf.


No, I keep insist on my judgement. A fixes tag is only meant for cases where your
patch fixes a bug. The bug should really be happened. All of the discussion ongoing
here are just things imaginary about the *debug* phase and development phase.


>   As far as I understand, this is business as usual. There's
> nothing personal here, and no judgement on the quality of your code.
>
Please don't misunderstanding, I do cares the quality of my code.
If it is really introduce a bug, I will responsible and help to solve.
But this is not the case. Sorry.


>>> Signed-off-by: Laurent Pinchart<laurent.pinchart+renesas@ideasonboard.com>
>>> ---
>>>    drivers/gpu/drm/bridge/thc63lvd1024.c | 5 ++++-
>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/bridge/thc63lvd1024.c b/drivers/gpu/drm/bridge/thc63lvd1024.c
>>> index 5f99f9724081..674efc489e3a 100644
>>> --- a/drivers/gpu/drm/bridge/thc63lvd1024.c
>>> +++ b/drivers/gpu/drm/bridge/thc63lvd1024.c
>>> @@ -125,8 +125,11 @@ static int thc63_parse_dt(struct thc63_dev *thc63)
>>>    
>>>    	remote = of_graph_get_remote_node(thc63->dev->of_node,
>>>    					  THC63_RGB_OUT0, -1);
>>> -	if (!remote)
>>> +	if (!remote) {
>>> +		dev_err(thc63->dev, "No remote endpoint for port@%u\n",
>>> +			THC63_RGB_OUT0);
>>>    		return -ENODEV;
>>> +	}
>>>    

An side effect of this patch is that we will add one more extra error message in the console.
As the of_graph_get_remote_node() function already print one for us if I add '#define DEBUG 1'
on the top of this source file. What's worse, it does not really tell us what's really the
error is.

It could be no valid endpoint or no valid remote node because of bad coding in DT, or It is
also simply because the remove node(or device) is being disabled intentionally by adding
'status = "disabled"' clause. Therefore, the error printing code added here is very confusing
in practice. It cannot really help for locating the root cause of the problem.

After think about this more than twice, either help to improve the core of_graph_get_remote_node()
function or just to drop this. This what I can tell as a ordinary reviewer. Despite you and/or
other more advanced programmer & reviewer could override what I said though.
Sui Jingfeng March 18, 2024, 7:33 p.m. UTC | #5
On 2024/3/19 03:23, Sui Jingfeng wrote:
> 2) It should be exist at product level kernel. 


It should NOT be exist at product level kernel.
Neil Armstrong March 19, 2024, 3:44 p.m. UTC | #6
On 18/03/2024 17:06, Laurent Pinchart wrote:
> Commit 00084f0c01bf ("drm: bridge: thc63lvd1024: Switch to use
> of_graph_get_remote_node()") simplified the thc63lvd1024 driver by
> replacing hand-rolled code with a helper function. While doing so, it
> created an error code path at probe time without any error message,
> potentially causing probe issues that get annoying to debug. Fix it by
> adding an error message.
> 
> Fixes: 00084f0c01bf ("drm: bridge: thc63lvd1024: Switch to use of_graph_get_remote_node()")
> Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> ---
>   drivers/gpu/drm/bridge/thc63lvd1024.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/bridge/thc63lvd1024.c b/drivers/gpu/drm/bridge/thc63lvd1024.c
> index 5f99f9724081..674efc489e3a 100644
> --- a/drivers/gpu/drm/bridge/thc63lvd1024.c
> +++ b/drivers/gpu/drm/bridge/thc63lvd1024.c
> @@ -125,8 +125,11 @@ static int thc63_parse_dt(struct thc63_dev *thc63)
>   
>   	remote = of_graph_get_remote_node(thc63->dev->of_node,
>   					  THC63_RGB_OUT0, -1);
> -	if (!remote)
> +	if (!remote) {
> +		dev_err(thc63->dev, "No remote endpoint for port@%u\n",
> +			THC63_RGB_OUT0);
>   		return -ENODEV;
> +	}
>   
>   	thc63->next = of_drm_find_bridge(remote);
>   	of_node_put(remote);
> 
> base-commit: 00084f0c01bf3a2591d007010b196e048281c455

Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Neil Armstrong March 19, 2024, 3:49 p.m. UTC | #7
On 18/03/2024 20:23, Sui Jingfeng wrote:
> Hi,
> 
> 
> On 2024/3/19 02:04, Laurent Pinchart wrote:
>> Improving core helpers is certainly a good idea, and if we do so, we can
>> simplify drivers. What I'm concerned is that commit 00084f0c01bf creates
>> a silent probe failure path,
> 
> 
> No, I can't agree here. It doesn't creates a silent probe failure path.

It doesn't _in debug mode_, I agree with Laurent, having a verbose probe error should be kept.

Neil

> 
> Simply because
> 
> 1) It is NOT silent.
> 2) It should be exist at product level kernel.
> 
> 
>> which didn't exist before it.
> 
> 
> Again, it shouldn't be exist.
> 
> Otherwise it hints us that there is ill-behavior-ed DT in the mainstream kernel
> or a specific product(or development board). If I were you, I would like to fix
> the boot failure first.
> 
> In the earlier stage of my attempt to contribute, I also would like to enable
> debug output as much as possible. Just like you, the benefit is obvious: It really
> eliminate the pain on developing stage and when bugs happens.
> 
> But I was told many many times that mainstream kernel is not for debug, it is
> for sound products. I bet you have seen some product level drivers print very less.
> I'm not understand why in the past, but I think I could understand something now.
> Probably because professional programmers really confident about what they have
> wrote. As they have been tested and/or reviewed thousands or ten thousands times.
> 
> Enable this debug output by default can only prove to the community that you are
> not confident about something, either the community's reviewing power on DTS or
> your debug techniques.
> 
> 
>> This is why
>> this patch references it in the Fixes: tag, making sure that this patch
>> will get backported to any stable kernel that includes commit
>> 00084f0c01bf.
> 
> 
> No, I keep insist on my judgement. A fixes tag is only meant for cases where your
> patch fixes a bug. The bug should really be happened. All of the discussion ongoing
> here are just things imaginary about the *debug* phase and development phase.
> 
> 
>>   As far as I understand, this is business as usual. There's
>> nothing personal here, and no judgement on the quality of your code.
>>
> Please don't misunderstanding, I do cares the quality of my code.
> If it is really introduce a bug, I will responsible and help to solve.
> But this is not the case. Sorry.
> 
> 
>>>> Signed-off-by: Laurent Pinchart<laurent.pinchart+renesas@ideasonboard.com>
>>>> ---
>>>>    drivers/gpu/drm/bridge/thc63lvd1024.c | 5 ++++-
>>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/bridge/thc63lvd1024.c b/drivers/gpu/drm/bridge/thc63lvd1024.c
>>>> index 5f99f9724081..674efc489e3a 100644
>>>> --- a/drivers/gpu/drm/bridge/thc63lvd1024.c
>>>> +++ b/drivers/gpu/drm/bridge/thc63lvd1024.c
>>>> @@ -125,8 +125,11 @@ static int thc63_parse_dt(struct thc63_dev *thc63)
>>>>        remote = of_graph_get_remote_node(thc63->dev->of_node,
>>>>                          THC63_RGB_OUT0, -1);
>>>> -    if (!remote)
>>>> +    if (!remote) {
>>>> +        dev_err(thc63->dev, "No remote endpoint for port@%u\n",
>>>> +            THC63_RGB_OUT0);
>>>>            return -ENODEV;
>>>> +    }
> 
> An side effect of this patch is that we will add one more extra error message in the console.
> As the of_graph_get_remote_node() function already print one for us if I add '#define DEBUG 1'
> on the top of this source file. What's worse, it does not really tell us what's really the
> error is.
> 
> It could be no valid endpoint or no valid remote node because of bad coding in DT, or It is
> also simply because the remove node(or device) is being disabled intentionally by adding
> 'status = "disabled"' clause. Therefore, the error printing code added here is very confusing
> in practice. It cannot really help for locating the root cause of the problem.
> 
> After think about this more than twice, either help to improve the core of_graph_get_remote_node()
> function or just to drop this. This what I can tell as a ordinary reviewer. Despite you and/or
> other more advanced programmer & reviewer could override what I said though.
>
Neil Armstrong March 19, 2024, 3:49 p.m. UTC | #8
Hi,

On Mon, 18 Mar 2024 18:06:01 +0200, Laurent Pinchart wrote:
> Commit 00084f0c01bf ("drm: bridge: thc63lvd1024: Switch to use
> of_graph_get_remote_node()") simplified the thc63lvd1024 driver by
> replacing hand-rolled code with a helper function. While doing so, it
> created an error code path at probe time without any error message,
> potentially causing probe issues that get annoying to debug. Fix it by
> adding an error message.
> 
> [...]

Thanks, Applied to https://gitlab.freedesktop.org/drm/misc/kernel.git (drm-misc-next)

[1/1] drm: bridge: thc63lvd1024: Print error message when DT parsing fails
      https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/974652d7a90be7ae3b24779794a65bfb90980044
Sui Jingfeng March 20, 2024, 8:04 p.m. UTC | #9
On 2024/3/19 23:49, Neil Armstrong wrote:
> On 18/03/2024 20:23, Sui Jingfeng wrote:
>> Hi,
>>
>>
>> On 2024/3/19 02:04, Laurent Pinchart wrote:
>>> Improving core helpers is certainly a good idea, and if we do so, we 
>>> can
>>> simplify drivers. What I'm concerned is that commit 00084f0c01bf 
>>> creates
>>> a silent probe failure path,
>>
>>
>> No, I can't agree here. It doesn't creates a silent probe failure path.
>
> It doesn't _in debug mode_, I agree with Laurent, having a verbose 
> probe error should be kept.
>
OK, I agree with you two then.

I means that we could replace the pr_debug() with the pr_err() in the
of_graph_get_remote_node() function, then all drivers will be benefit.
Is this idea too hard to be acceptable?


> Neil
diff mbox series

Patch

diff --git a/drivers/gpu/drm/bridge/thc63lvd1024.c b/drivers/gpu/drm/bridge/thc63lvd1024.c
index 5f99f9724081..674efc489e3a 100644
--- a/drivers/gpu/drm/bridge/thc63lvd1024.c
+++ b/drivers/gpu/drm/bridge/thc63lvd1024.c
@@ -125,8 +125,11 @@  static int thc63_parse_dt(struct thc63_dev *thc63)
 
 	remote = of_graph_get_remote_node(thc63->dev->of_node,
 					  THC63_RGB_OUT0, -1);
-	if (!remote)
+	if (!remote) {
+		dev_err(thc63->dev, "No remote endpoint for port@%u\n",
+			THC63_RGB_OUT0);
 		return -ENODEV;
+	}
 
 	thc63->next = of_drm_find_bridge(remote);
 	of_node_put(remote);