diff mbox series

[v4,v6.11,v5.15,v5.4,v4.19,1/1] selftests: breakpoints: use remaining time to check if suspend succeed

Message ID 20240930203603.2854629-1-yifei.l.liu@oracle.com (mailing list archive)
State Superseded
Headers show
Series [v4,v6.11,v5.15,v5.4,v4.19,1/1] selftests: breakpoints: use remaining time to check if suspend succeed | expand

Commit Message

Yifei Liu Sept. 30, 2024, 8:36 p.m. UTC
"step_after_suspend_test fails with device busy error while
writing to /sys/power/state to start suspend."  The test believes
it failed to enter suspend state with

$ sudo ./step_after_suspend_test
TAP version 13
Bail out! Failed to enter Suspend state

However, in the kernel message, I indeed see the system get
suspended and then wake up later.

[611172.033108] PM: suspend entry (s2idle)
[611172.044940] Filesystems sync: 0.006 seconds
[611172.052254] Freezing user space processes
[611172.059319] Freezing user space processes completed (elapsed 0.001 seconds)
[611172.067920] OOM killer disabled.
[611172.072465] Freezing remaining freezable tasks
[611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[611172.089724] printk: Suspending console(s) (use no_console_suspend to debug)
[611172.117126] serial 00:03: disabled
--- some other hardware get reconnected ---
[611203.136277] OOM killer enabled.
[611203.140637] Restarting tasks ...
[611203.141135] usb 1-8.1: USB disconnect, device number 7
[611203.141755] done.
[611203.155268] random: crng reseeded on system resumption
[611203.162059] PM: suspend exit

After investigation, I noticed that for the code block
if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
	ksft_exit_fail_msg("Failed to enter Suspend state\n");

The write will return -1 and errno is set to 16 (device busy).
It should be caused by the write function is not successfully returned
before the system suspend and the return value get messed when waking up.
As a result, It may be better to check the time passed of those few 
instructions to determine whether the suspend is executed correctly for
it is pretty hard to execute those few lines for 5 seconds.

The timer to wake up the system is set to expire after 5 seconds and 
no re-arm. If the timer remaining time is 0 second and 0 nano secomd,
it means the timer expired and wake the system up. Otherwise, the system
could be considered to enter the suspend state failed if there is any
remaining time.

After appling this patch, the test would not fail for it believes the
system does not go to suspend by mistake. It now could continue to the
rest part of the test after suspend.

Fixes: bfd092b8c272 ("selftests: breakpoint: add step_after_suspend_test")
Reported-by: Sinadin Shan <sinadin.shan@oracle.com>
Signed-off-by: Yifei Liu <yifei.l.liu@oracle.com>
---
 .../testing/selftests/breakpoints/step_after_suspend_test.c  | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Shuah Khan Sept. 30, 2024, 10:13 p.m. UTC | #1
On 9/30/24 14:36, Yifei Liu wrote:
> "step_after_suspend_test fails with device busy error while
> writing to /sys/power/state to start suspend."  The test believes
> it failed to enter suspend state with
> 
> $ sudo ./step_after_suspend_test
> TAP version 13
> Bail out! Failed to enter Suspend state
> 
> However, in the kernel message, I indeed see the system get
> suspended and then wake up later.
> 
> [611172.033108] PM: suspend entry (s2idle)
> [611172.044940] Filesystems sync: 0.006 seconds
> [611172.052254] Freezing user space processes
> [611172.059319] Freezing user space processes completed (elapsed 0.001 seconds)
> [611172.067920] OOM killer disabled.
> [611172.072465] Freezing remaining freezable tasks
> [611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> [611172.089724] printk: Suspending console(s) (use no_console_suspend to debug)
> [611172.117126] serial 00:03: disabled
> --- some other hardware get reconnected ---
> [611203.136277] OOM killer enabled.
> [611203.140637] Restarting tasks ...
> [611203.141135] usb 1-8.1: USB disconnect, device number 7
> [611203.141755] done.
> [611203.155268] random: crng reseeded on system resumption
> [611203.162059] PM: suspend exit
> 
> After investigation, I noticed that for the code block
> if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
> 	ksft_exit_fail_msg("Failed to enter Suspend state\n");
> 
> The write will return -1 and errno is set to 16 (device busy).
> It should be caused by the write function is not successfully returned
> before the system suspend and the return value get messed when waking up.
> As a result, It may be better to check the time passed of those few
> instructions to determine whether the suspend is executed correctly for
> it is pretty hard to execute those few lines for 5 seconds.
> 
> The timer to wake up the system is set to expire after 5 seconds and
> no re-arm. If the timer remaining time is 0 second and 0 nano secomd,
> it means the timer expired and wake the system up. Otherwise, the system
> could be considered to enter the suspend state failed if there is any
> remaining time.
> 
> After appling this patch, the test would not fail for it believes the
> system does not go to suspend by mistake. It now could continue to the
> rest part of the test after suspend.
> 
> Fixes: bfd092b8c272 ("selftests: breakpoint: add step_after_suspend_test")
> Reported-by: Sinadin Shan <sinadin.shan@oracle.com>
> Signed-off-by: Yifei Liu <yifei.l.liu@oracle.com>
> ---

In the future add version to version changes to make it easier for
reviewers to check for the changes they asked for.

>   .../testing/selftests/breakpoints/step_after_suspend_test.c  | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
> index dfec31fb9b30d..8d275f03e977f 100644
> --- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c
> +++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
> @@ -152,7 +152,10 @@ void suspend(void)
>   	if (err < 0)
>   		ksft_exit_fail_msg("timerfd_settime() failed\n");
>   
> -	if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
> +	system("(echo mem > /sys/power/state) 2> /dev/null");
> +
> +	timerfd_gettime(timerfd, &spec);
> +	if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0)
>   		ksft_exit_fail_msg("Failed to enter Suspend state\n");
>   
>   	close(timerfd);

thanks,
-- Shuah
Shuah Khan Sept. 30, 2024, 10:19 p.m. UTC | #2
On 9/30/24 16:13, Shuah Khan wrote:
> On 9/30/24 14:36, Yifei Liu wrote:
>> "step_after_suspend_test fails with device busy error while

You don't need quotes

>> writing to /sys/power/state to start suspend."

Same here.

  The test believes
>> it failed to enter suspend state with
>>
>> $ sudo ./step_after_suspend_test
>> TAP version 13
>> Bail out! Failed to enter Suspend state
>>
>> However, in the kernel message, I indeed see the system get
>> suspended and then wake up later.
>>
>> [611172.033108] PM: suspend entry (s2idle)
>> [611172.044940] Filesystems sync: 0.006 seconds
>> [611172.052254] Freezing user space processes
>> [611172.059319] Freezing user space processes completed (elapsed 0.001 seconds)
>> [611172.067920] OOM killer disabled.
>> [611172.072465] Freezing remaining freezable tasks
>> [611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
>> [611172.089724] printk: Suspending console(s) (use no_console_suspend to debug)
>> [611172.117126] serial 00:03: disabled
>> --- some other hardware get reconnected ---

The above shouldn't be there. git am throws away everything after
this. Get rid of the above line.

>> [611203.136277] OOM killer enabled.
>> [611203.140637] Restarting tasks ...
>> [611203.141135] usb 1-8.1: USB disconnect, device number 7
>> [611203.141755] done.
>> [611203.155268] random: crng reseeded on system resumption
>> [611203.162059] PM: suspend exit
>>
>> After investigation, I noticed that for the code block
>> if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
>>     ksft_exit_fail_msg("Failed to enter Suspend state\n");
>>
>> The write will return -1 and errno is set to 16 (device busy).
>> It should be caused by the write function is not successfully returned
>> before the system suspend and the return value get messed when waking up.
>> As a result, It may be better to check the time passed of those few
>> instructions to determine whether the suspend is executed correctly for
>> it is pretty hard to execute those few lines for 5 seconds.
>>
>> The timer to wake up the system is set to expire after 5 seconds and
>> no re-arm. If the timer remaining time is 0 second and 0 nano secomd,
>> it means the timer expired and wake the system up. Otherwise, the system
>> could be considered to enter the suspend state failed if there is any
>> remaining time.
>>
>> After appling this patch, the test would not fail for it believes the
>> system does not go to suspend by mistake. It now could continue to the
>> rest part of the test after suspend.
>>
>> Fixes: bfd092b8c272 ("selftests: breakpoint: add step_after_suspend_test")
>> Reported-by: Sinadin Shan <sinadin.shan@oracle.com>

Do you have a thread for this report. If so please add
Closes: tag with link to the report.

>> Signed-off-by: Yifei Liu <yifei.l.liu@oracle.com>
>> ---
> 
> In the future add version to version changes to make it easier for
> reviewers to check for the changes they asked for.
> 
>>   .../testing/selftests/breakpoints/step_after_suspend_test.c  | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
>> index dfec31fb9b30d..8d275f03e977f 100644
>> --- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c
>> +++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
>> @@ -152,7 +152,10 @@ void suspend(void)
>>       if (err < 0)
>>           ksft_exit_fail_msg("timerfd_settime() failed\n");
>> -    if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
>> +    system("(echo mem > /sys/power/state) 2> /dev/null");
>> +
>> +    timerfd_gettime(timerfd, &spec);
>> +    if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0)
>>           ksft_exit_fail_msg("Failed to enter Suspend state\n");
>>       close(timerfd);
> 

Send another version with these changes.

thanks,
-- Shuah
Yifei Liu Sept. 30, 2024, 10:36 p.m. UTC | #3
Hi Shuah, 

> On Sep 30, 2024, at 3:19 PM, Shuah Khan <skhan@linuxfoundation.org> wrote:
> 
> On 9/30/24 16:13, Shuah Khan wrote:
>> On 9/30/24 14:36, Yifei Liu wrote:
>>> "step_after_suspend_test fails with device busy error while
> 
> You don't need quotes
> 
>>> writing to /sys/power/state to start suspend."
> 
> Same here.
Sure, I will remove them. 
> The test believes
>>> it failed to enter suspend state with
>>> 
>>> $ sudo ./step_after_suspend_test
>>> TAP version 13
>>> Bail out! Failed to enter Suspend state
>>> 
>>> However, in the kernel message, I indeed see the system get
>>> suspended and then wake up later.
>>> 
>>> [611172.033108] PM: suspend entry (s2idle)
>>> [611172.044940] Filesystems sync: 0.006 seconds
>>> [611172.052254] Freezing user space processes
>>> [611172.059319] Freezing user space processes completed (elapsed 0.001 seconds)
>>> [611172.067920] OOM killer disabled.
>>> [611172.072465] Freezing remaining freezable tasks
>>> [611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
>>> [611172.089724] printk: Suspending console(s) (use no_console_suspend to debug)
>>> [611172.117126] serial 00:03: disabled
>>> --- some other hardware get reconnected ---
> 
> The above shouldn't be there. git am throws away everything after
> this. Get rid of the above line.
Sure, I will remove that. 
> 
>>> [611203.136277] OOM killer enabled.
>>> [611203.140637] Restarting tasks ...
>>> [611203.141135] usb 1-8.1: USB disconnect, device number 7
>>> [611203.141755] done.
>>> [611203.155268] random: crng reseeded on system resumption
>>> [611203.162059] PM: suspend exit
>>> 
>>> After investigation, I noticed that for the code block
>>> if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
>>>     ksft_exit_fail_msg("Failed to enter Suspend state\n");
>>> 
>>> The write will return -1 and errno is set to 16 (device busy).
>>> It should be caused by the write function is not successfully returned
>>> before the system suspend and the return value get messed when waking up.
>>> As a result, It may be better to check the time passed of those few
>>> instructions to determine whether the suspend is executed correctly for
>>> it is pretty hard to execute those few lines for 5 seconds.
>>> 
>>> The timer to wake up the system is set to expire after 5 seconds and
>>> no re-arm. If the timer remaining time is 0 second and 0 nano secomd,
>>> it means the timer expired and wake the system up. Otherwise, the system
>>> could be considered to enter the suspend state failed if there is any
>>> remaining time.
>>> 
>>> After appling this patch, the test would not fail for it believes the
>>> system does not go to suspend by mistake. It now could continue to the
>>> rest part of the test after suspend.
>>> 
>>> Fixes: bfd092b8c272 ("selftests: breakpoint: add step_after_suspend_test")
>>> Reported-by: Sinadin Shan <sinadin.shan@oracle.com>
> 
> Do you have a thread for this report. If so please add
> Closes: tag with link to the report.
That is reported internally by Sinadin, we do not have a public link for that. 
> 
>>> Signed-off-by: Yifei Liu <yifei.l.liu@oracle.com>
>>> ---
>> In the future add version to version changes to make it easier for
>> reviewers to check for the changes they asked for.
>>>   .../testing/selftests/breakpoints/step_after_suspend_test.c  | 5 ++++-
>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
>>> index dfec31fb9b30d..8d275f03e977f 100644
>>> --- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c
>>> +++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
>>> @@ -152,7 +152,10 @@ void suspend(void)
>>>       if (err < 0)
>>>           ksft_exit_fail_msg("timerfd_settime() failed\n");
>>> -    if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
>>> +    system("(echo mem > /sys/power/state) 2> /dev/null");
>>> +
>>> +    timerfd_gettime(timerfd, &spec);
>>> +    if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0)
>>>           ksft_exit_fail_msg("Failed to enter Suspend state\n");
>>>       close(timerfd);
> 
> Send another version with these changes.
> 
> thanks,
> -- Shuah
>
diff mbox series

Patch

diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
index dfec31fb9b30d..8d275f03e977f 100644
--- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c
+++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
@@ -152,7 +152,10 @@  void suspend(void)
 	if (err < 0)
 		ksft_exit_fail_msg("timerfd_settime() failed\n");
 
-	if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
+	system("(echo mem > /sys/power/state) 2> /dev/null");
+
+	timerfd_gettime(timerfd, &spec);
+	if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0)
 		ksft_exit_fail_msg("Failed to enter Suspend state\n");
 
 	close(timerfd);