diff mbox series

memory: samsung: exynos5422-dmc: propagate error from exynos5_counters_get()

Message ID 20200804061210.5415-1-m.szyprowski@samsung.com (mailing list archive)
State Rejected
Headers show
Series memory: samsung: exynos5422-dmc: propagate error from exynos5_counters_get() | expand

Commit Message

Marek Szyprowski Aug. 4, 2020, 6:12 a.m. UTC
exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for
devfreq event counter is not yet probed. Propagate that error value to
the caller to ensure that the exynos5422-dmc driver will be probed again
when devfreq event contuner is available.

This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are
compiled as modules.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 drivers/memory/samsung/exynos5422-dmc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Krzysztof Kozlowski Aug. 4, 2020, 6:42 a.m. UTC | #1
On Tue, Aug 04, 2020 at 08:12:10AM +0200, Marek Szyprowski wrote:
> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for
> devfreq event counter is not yet probed. Propagate that error value to
> the caller to ensure that the exynos5422-dmc driver will be probed again
> when devfreq event contuner is available.
> 
> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are
> compiled as modules.
> 
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> ---
>  drivers/memory/samsung/exynos5422-dmc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks, looks good. I'll apply it to fixes after merge window.

Best regards,
Krzysztof
Lukasz Luba Aug. 4, 2020, 9:11 a.m. UTC | #2
Hi Marek,

On 8/4/20 7:12 AM, Marek Szyprowski wrote:
> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for
> devfreq event counter is not yet probed. Propagate that error value to
> the caller to ensure that the exynos5422-dmc driver will be probed again
> when devfreq event contuner is available.
> 
> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are
> compiled as modules.
> 
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> ---
>   drivers/memory/samsung/exynos5422-dmc.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/memory/samsung/exynos5422-dmc.c b/drivers/memory/samsung/exynos5422-dmc.c
> index b9c7956e5031..639811a3eecb 100644
> --- a/drivers/memory/samsung/exynos5422-dmc.c
> +++ b/drivers/memory/samsung/exynos5422-dmc.c
> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device *dev,
>   	} else {
>   		ret = exynos5_counters_get(dmc, &load, &total);
>   		if (ret < 0)
> -			return -EINVAL;
> +			return ret;
>   
>   		/* To protect from overflow, divide by 1024 */
>   		stat->busy_time = load >> 10;
> 

Thank you for the patch, LGTM.
Some questions are still there, though. The function
exynos5_performance_counters_init() should capture that the counters
couldn't be enabled or set. So the functions:
exynos5_counters_enable_edev() and exynos5_counters_set_event()
must pass gently because devfreq device is registered.
Then devfreq checks device status, and reaches the state when
counters 'get' function returns that they are not ready...

If that is a kind of 2-stage initialization, maybe we should add
another 'check' in the exynos5_performance_counters_init() and call
the devfreq_event_get_event() to make sure that we are ready to go,
otherwise return ret from that function (which is probably EPROBE_DEFER)
and not register the devfreq device.

Marek do want to submit such patch or I should bake it and submit on top
of this patch?

Could you tell me how I can reproduce this? Do you simply load one
module after another (exynos-ppmu than exynos5422-dmc) or in parallel?

Regards,
Lukasz
Marek Szyprowski Aug. 4, 2020, 12:19 p.m. UTC | #3
Hi Lukasz,

On 04.08.2020 11:11, Lukasz Luba wrote:
> Hi Marek,
>
> On 8/4/20 7:12 AM, Marek Szyprowski wrote:
>> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for
>> devfreq event counter is not yet probed. Propagate that error value to
>> the caller to ensure that the exynos5422-dmc driver will be probed again
>> when devfreq event contuner is available.
>>
>> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are
>> compiled as modules.
>>
>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>> ---
>>   drivers/memory/samsung/exynos5422-dmc.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/memory/samsung/exynos5422-dmc.c 
>> b/drivers/memory/samsung/exynos5422-dmc.c
>> index b9c7956e5031..639811a3eecb 100644
>> --- a/drivers/memory/samsung/exynos5422-dmc.c
>> +++ b/drivers/memory/samsung/exynos5422-dmc.c
>> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device 
>> *dev,
>>       } else {
>>           ret = exynos5_counters_get(dmc, &load, &total);
>>           if (ret < 0)
>> -            return -EINVAL;
>> +            return ret;
>>             /* To protect from overflow, divide by 1024 */
>>           stat->busy_time = load >> 10;
>>
>
> Thank you for the patch, LGTM.
> Some questions are still there, though. The function
> exynos5_performance_counters_init() should capture that the counters
> couldn't be enabled or set. So the functions:
> exynos5_counters_enable_edev() and exynos5_counters_set_event()
> must pass gently because devfreq device is registered.
> Then devfreq checks device status, and reaches the state when
> counters 'get' function returns that they are not ready...
>
> If that is a kind of 2-stage initialization, maybe we should add
> another 'check' in the exynos5_performance_counters_init() and call
> the devfreq_event_get_event() to make sure that we are ready to go,
> otherwise return ret from that function (which is probably EPROBE_DEFER)
> and not register the devfreq device.

I've finally investigated this further and it turned out that the issue 
is elsewhere. The $subject patch can be discarded, as it doesn't fix 
anything. The -EPROBE_DEFER is properly returned by 
exynos5_performance_counters_init(), which redirects exynos5_dmc_probe() 
to remove_clocks label. This causes disabling mout_bpll/fout_bpll clocks 
what in turn *sometimes* causes boot hang. This random behavior mislead 
me that the $subject patch fixes the issue, but then longer tests 
revealed that it didn't change anything.

It looks that the proper fix would be to keep fout_bpll enabled all the 
time.

>
> Marek do want to submit such patch or I should bake it and submit on top
> of this patch?
>
> Could you tell me how I can reproduce this? Do you simply load one
> module after another (exynos-ppmu than exynos5422-dmc) or in parallel?

I've just boot zImage built from multi_v7_defconfig with modules 
installed. Modules are automatically loaded by udev during boot.

Best regards
Lukasz Luba Aug. 4, 2020, 12:38 p.m. UTC | #4
On 8/4/20 1:19 PM, Marek Szyprowski wrote:
> Hi Lukasz,
> 
> On 04.08.2020 11:11, Lukasz Luba wrote:
>> Hi Marek,
>>
>> On 8/4/20 7:12 AM, Marek Szyprowski wrote:
>>> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for
>>> devfreq event counter is not yet probed. Propagate that error value to
>>> the caller to ensure that the exynos5422-dmc driver will be probed again
>>> when devfreq event contuner is available.
>>>
>>> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are
>>> compiled as modules.
>>>
>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>> ---
>>>    drivers/memory/samsung/exynos5422-dmc.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/memory/samsung/exynos5422-dmc.c
>>> b/drivers/memory/samsung/exynos5422-dmc.c
>>> index b9c7956e5031..639811a3eecb 100644
>>> --- a/drivers/memory/samsung/exynos5422-dmc.c
>>> +++ b/drivers/memory/samsung/exynos5422-dmc.c
>>> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device
>>> *dev,
>>>        } else {
>>>            ret = exynos5_counters_get(dmc, &load, &total);
>>>            if (ret < 0)
>>> -            return -EINVAL;
>>> +            return ret;
>>>              /* To protect from overflow, divide by 1024 */
>>>            stat->busy_time = load >> 10;
>>>
>>
>> Thank you for the patch, LGTM.
>> Some questions are still there, though. The function
>> exynos5_performance_counters_init() should capture that the counters
>> couldn't be enabled or set. So the functions:
>> exynos5_counters_enable_edev() and exynos5_counters_set_event()
>> must pass gently because devfreq device is registered.
>> Then devfreq checks device status, and reaches the state when
>> counters 'get' function returns that they are not ready...
>>
>> If that is a kind of 2-stage initialization, maybe we should add
>> another 'check' in the exynos5_performance_counters_init() and call
>> the devfreq_event_get_event() to make sure that we are ready to go,
>> otherwise return ret from that function (which is probably EPROBE_DEFER)
>> and not register the devfreq device.
> 
> I've finally investigated this further and it turned out that the issue
> is elsewhere. The $subject patch can be discarded, as it doesn't fix
> anything. The -EPROBE_DEFER is properly returned by
> exynos5_performance_counters_init(), which redirects exynos5_dmc_probe()
> to remove_clocks label. This causes disabling mout_bpll/fout_bpll clocks
> what in turn *sometimes* causes boot hang. This random behavior mislead
> me that the $subject patch fixes the issue, but then longer tests
> revealed that it didn't change anything.

Really good investigation, great work Marek!

> 
> It looks that the proper fix would be to keep fout_bpll enabled all the
> time.

Yes, I agree. I am looking for your next patch to test it then.

> 
>>
>> Marek do want to submit such patch or I should bake it and submit on top
>> of this patch?
>>
>> Could you tell me how I can reproduce this? Do you simply load one
>> module after another (exynos-ppmu than exynos5422-dmc) or in parallel?
> 
> I've just boot zImage built from multi_v7_defconfig with modules
> installed. Modules are automatically loaded by udev during boot.

Thank you sharing this test procedure.

Regards,
Lukasz
Krzysztof Kozlowski Aug. 17, 2020, 12:07 p.m. UTC | #5
On Tue, Aug 04, 2020 at 01:38:11PM +0100, Lukasz Luba wrote:
> 
> 
> On 8/4/20 1:19 PM, Marek Szyprowski wrote:
> > Hi Lukasz,
> > 
> > On 04.08.2020 11:11, Lukasz Luba wrote:
> > > Hi Marek,
> > > 
> > > On 8/4/20 7:12 AM, Marek Szyprowski wrote:
> > > > exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for
> > > > devfreq event counter is not yet probed. Propagate that error value to
> > > > the caller to ensure that the exynos5422-dmc driver will be probed again
> > > > when devfreq event contuner is available.
> > > > 
> > > > This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are
> > > > compiled as modules.
> > > > 
> > > > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> > > > ---
> > > >    drivers/memory/samsung/exynos5422-dmc.c | 2 +-
> > > >    1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/memory/samsung/exynos5422-dmc.c
> > > > b/drivers/memory/samsung/exynos5422-dmc.c
> > > > index b9c7956e5031..639811a3eecb 100644
> > > > --- a/drivers/memory/samsung/exynos5422-dmc.c
> > > > +++ b/drivers/memory/samsung/exynos5422-dmc.c
> > > > @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device
> > > > *dev,
> > > >        } else {
> > > >            ret = exynos5_counters_get(dmc, &load, &total);
> > > >            if (ret < 0)
> > > > -            return -EINVAL;
> > > > +            return ret;
> > > >              /* To protect from overflow, divide by 1024 */
> > > >            stat->busy_time = load >> 10;
> > > > 
> > > 
> > > Thank you for the patch, LGTM.
> > > Some questions are still there, though. The function
> > > exynos5_performance_counters_init() should capture that the counters
> > > couldn't be enabled or set. So the functions:
> > > exynos5_counters_enable_edev() and exynos5_counters_set_event()
> > > must pass gently because devfreq device is registered.
> > > Then devfreq checks device status, and reaches the state when
> > > counters 'get' function returns that they are not ready...
> > > 
> > > If that is a kind of 2-stage initialization, maybe we should add
> > > another 'check' in the exynos5_performance_counters_init() and call
> > > the devfreq_event_get_event() to make sure that we are ready to go,
> > > otherwise return ret from that function (which is probably EPROBE_DEFER)
> > > and not register the devfreq device.
> > 
> > I've finally investigated this further and it turned out that the issue
> > is elsewhere. The $subject patch can be discarded, as it doesn't fix
> > anything. The -EPROBE_DEFER is properly returned by
> > exynos5_performance_counters_init(), which redirects exynos5_dmc_probe()
> > to remove_clocks label. This causes disabling mout_bpll/fout_bpll clocks
> > what in turn *sometimes* causes boot hang. This random behavior mislead
> > me that the $subject patch fixes the issue, but then longer tests
> > revealed that it didn't change anything.
> 
> Really good investigation, great work Marek!
> 
> > 
> > It looks that the proper fix would be to keep fout_bpll enabled all the
> > time.
> 
> Yes, I agree. I am looking for your next patch to test it then.

Hi all,

Is the patch still useful then? Shall I apply it?

Best regards,
Krzysztof
Lukasz Luba Aug. 17, 2020, 12:27 p.m. UTC | #6
On 8/17/20 1:07 PM, Krzysztof Kozlowski wrote:
> On Tue, Aug 04, 2020 at 01:38:11PM +0100, Lukasz Luba wrote:
>>
>>
>> On 8/4/20 1:19 PM, Marek Szyprowski wrote:
>>> Hi Lukasz,
>>>
>>> On 04.08.2020 11:11, Lukasz Luba wrote:
>>>> Hi Marek,
>>>>
>>>> On 8/4/20 7:12 AM, Marek Szyprowski wrote:
>>>>> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for
>>>>> devfreq event counter is not yet probed. Propagate that error value to
>>>>> the caller to ensure that the exynos5422-dmc driver will be probed again
>>>>> when devfreq event contuner is available.
>>>>>
>>>>> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are
>>>>> compiled as modules.
>>>>>
>>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>>> ---
>>>>>     drivers/memory/samsung/exynos5422-dmc.c | 2 +-
>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/memory/samsung/exynos5422-dmc.c
>>>>> b/drivers/memory/samsung/exynos5422-dmc.c
>>>>> index b9c7956e5031..639811a3eecb 100644
>>>>> --- a/drivers/memory/samsung/exynos5422-dmc.c
>>>>> +++ b/drivers/memory/samsung/exynos5422-dmc.c
>>>>> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device
>>>>> *dev,
>>>>>         } else {
>>>>>             ret = exynos5_counters_get(dmc, &load, &total);
>>>>>             if (ret < 0)
>>>>> -            return -EINVAL;
>>>>> +            return ret;
>>>>>               /* To protect from overflow, divide by 1024 */
>>>>>             stat->busy_time = load >> 10;
>>>>>
>>>>
>>>> Thank you for the patch, LGTM.
>>>> Some questions are still there, though. The function
>>>> exynos5_performance_counters_init() should capture that the counters
>>>> couldn't be enabled or set. So the functions:
>>>> exynos5_counters_enable_edev() and exynos5_counters_set_event()
>>>> must pass gently because devfreq device is registered.
>>>> Then devfreq checks device status, and reaches the state when
>>>> counters 'get' function returns that they are not ready...
>>>>
>>>> If that is a kind of 2-stage initialization, maybe we should add
>>>> another 'check' in the exynos5_performance_counters_init() and call
>>>> the devfreq_event_get_event() to make sure that we are ready to go,
>>>> otherwise return ret from that function (which is probably EPROBE_DEFER)
>>>> and not register the devfreq device.
>>>
>>> I've finally investigated this further and it turned out that the issue
>>> is elsewhere. The $subject patch can be discarded, as it doesn't fix
>>> anything. The -EPROBE_DEFER is properly returned by
>>> exynos5_performance_counters_init(), which redirects exynos5_dmc_probe()
>>> to remove_clocks label. This causes disabling mout_bpll/fout_bpll clocks
>>> what in turn *sometimes* causes boot hang. This random behavior mislead
>>> me that the $subject patch fixes the issue, but then longer tests
>>> revealed that it didn't change anything.
>>
>> Really good investigation, great work Marek!
>>
>>>
>>> It looks that the proper fix would be to keep fout_bpll enabled all the
>>> time.
>>
>> Yes, I agree. I am looking for your next patch to test it then.
> 
> Hi all,
> 
> Is the patch still useful then? Shall I apply it?


Marek has created different patch for it, which fixes the clock:
https://lore.kernel.org/linux-clk/20200807133143.22748-1-m.szyprowski@samsung.com/

So you don't have to apply this one IMO.

Regards,
Lukasz
Marek Szyprowski Aug. 24, 2020, 7:44 a.m. UTC | #7
Hi,

On 17.08.2020 14:27, Lukasz Luba wrote:
> On 8/17/20 1:07 PM, Krzysztof Kozlowski wrote:
>> On Tue, Aug 04, 2020 at 01:38:11PM +0100, Lukasz Luba wrote:
>>> On 8/4/20 1:19 PM, Marek Szyprowski wrote:
>>>> On 04.08.2020 11:11, Lukasz Luba wrote:
>>>>> On 8/4/20 7:12 AM, Marek Szyprowski wrote:
>>>>>> exynos5_counters_get() might fail with -EPROBE_DEFER if the 
>>>>>> driver for
>>>>>> devfreq event counter is not yet probed. Propagate that error 
>>>>>> value to
>>>>>> the caller to ensure that the exynos5422-dmc driver will be 
>>>>>> probed again
>>>>>> when devfreq event contuner is available.
>>>>>>
>>>>>> This fixes boot hang if both exynos5422-dmc and exynos-ppmu 
>>>>>> drivers are
>>>>>> compiled as modules.
>>>>>>
>>>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>>>> ---
>>>>>>     drivers/memory/samsung/exynos5422-dmc.c | 2 +-
>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/memory/samsung/exynos5422-dmc.c
>>>>>> b/drivers/memory/samsung/exynos5422-dmc.c
>>>>>> index b9c7956e5031..639811a3eecb 100644
>>>>>> --- a/drivers/memory/samsung/exynos5422-dmc.c
>>>>>> +++ b/drivers/memory/samsung/exynos5422-dmc.c
>>>>>> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device
>>>>>> *dev,
>>>>>>         } else {
>>>>>>             ret = exynos5_counters_get(dmc, &load, &total);
>>>>>>             if (ret < 0)
>>>>>> -            return -EINVAL;
>>>>>> +            return ret;
>>>>>>               /* To protect from overflow, divide by 1024 */
>>>>>>             stat->busy_time = load >> 10;
>>>>>>
>>>>>
>>>>> Thank you for the patch, LGTM.
>>>>> Some questions are still there, though. The function
>>>>> exynos5_performance_counters_init() should capture that the counters
>>>>> couldn't be enabled or set. So the functions:
>>>>> exynos5_counters_enable_edev() and exynos5_counters_set_event()
>>>>> must pass gently because devfreq device is registered.
>>>>> Then devfreq checks device status, and reaches the state when
>>>>> counters 'get' function returns that they are not ready...
>>>>>
>>>>> If that is a kind of 2-stage initialization, maybe we should add
>>>>> another 'check' in the exynos5_performance_counters_init() and call
>>>>> the devfreq_event_get_event() to make sure that we are ready to go,
>>>>> otherwise return ret from that function (which is probably 
>>>>> EPROBE_DEFER)
>>>>> and not register the devfreq device.
>>>>
>>>> I've finally investigated this further and it turned out that the 
>>>> issue
>>>> is elsewhere. The $subject patch can be discarded, as it doesn't fix
>>>> anything. The -EPROBE_DEFER is properly returned by
>>>> exynos5_performance_counters_init(), which redirects 
>>>> exynos5_dmc_probe()
>>>> to remove_clocks label. This causes disabling mout_bpll/fout_bpll 
>>>> clocks
>>>> what in turn *sometimes* causes boot hang. This random behavior 
>>>> mislead
>>>> me that the $subject patch fixes the issue, but then longer tests
>>>> revealed that it didn't change anything.
>>>
>>> Really good investigation, great work Marek!
>>>
>>>>
>>>> It looks that the proper fix would be to keep fout_bpll enabled all 
>>>> the
>>>> time.
>>>
>>> Yes, I agree. I am looking for your next patch to test it then.
>>
>> Hi all,
>>
>> Is the patch still useful then? Shall I apply it?
>
> Marek has created different patch for it, which fixes the clock:
> https://lore.kernel.org/linux-clk/20200807133143.22748-1-m.szyprowski@samsung.com/ 
>
>
> So you don't have to apply this one IMO.

Indeed, you can drop this one.

Best regards
diff mbox series

Patch

diff --git a/drivers/memory/samsung/exynos5422-dmc.c b/drivers/memory/samsung/exynos5422-dmc.c
index b9c7956e5031..639811a3eecb 100644
--- a/drivers/memory/samsung/exynos5422-dmc.c
+++ b/drivers/memory/samsung/exynos5422-dmc.c
@@ -914,7 +914,7 @@  static int exynos5_dmc_get_status(struct device *dev,
 	} else {
 		ret = exynos5_counters_get(dmc, &load, &total);
 		if (ret < 0)
-			return -EINVAL;
+			return ret;
 
 		/* To protect from overflow, divide by 1024 */
 		stat->busy_time = load >> 10;