Message ID | 20240923204718.1862074-1-yifei.l.liu@oracle.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v6.11,v5.15,v5.4,v4.19,1/1] selftests: breakpoints: use remaining time to check if suspend succeed | expand |
On 9/23/24 14:47, Yifei Liu wrote: > We recently notice that the step_after_suspend_test would noticed > fail on our plenty devices. The test believesit failed to "several of our devices" believes it > enter suspend state with Please do fix the above as pointed out earlier on v1. > > $ sudo ./step_after_suspend_test > TAP version 13 > Bail out! Failed to enter Suspend state > > However, in the kernel message, I indeed see the system get > suspended and then wake up later. > > [611172.033108] PM: suspend entry (s2idle) > [611172.044940] Filesystems sync: 0.006 seconds > [611172.052254] Freezing user space processes > [611172.059319] Freezing user space processes completed (elapsed 0.001 seconds) > [611172.067920] OOM killer disabled. > [611172.072465] Freezing remaining freezable tasks > [611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) > [611172.089724] printk: Suspending console(s) (use no_console_suspend to debug) > [611172.117126] serial 00:03: disabled > --- some other hardware get reconnected --- > [611203.136277] OOM killer enabled. > [611203.140637] Restarting tasks ... > [611203.141135] usb 1-8.1: USB disconnect, device number 7 > [611203.141755] done. > [611203.155268] random: crng reseeded on system resumption > [611203.162059] PM: suspend exit > > After investigation, I notice that for the code block noticed? > if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem")) > ksft_exit_fail_msg("Failed to enter Suspend state\n"); > > The write will return -1 and errno is set to 16 (device busy). > It should be caused by the write function is not successfully returned > before the system suspend and the return value get messed when waking up. > As a result, It may be better to check the time passed of those few instructions > to determine whether the suspend is executed correctly for it is pretty hard to > execute those few lines for 5 seconds. > > The timer to wake up the system is set to expired after 5 seconds and no-rearm. > If the timer remaining time is 0 second and 0 nano secomd, it means the timer > expired and wake the system up. Otherwise, the system could be considered to > enter the suspend state failed if there is any remaining time. What would you get if you were to cat /sys/power/state at this point? Does this fix the problem you are seeing? What is the outcome? > > Fixes: bfd092b8c2728 ("selftests: breakpoint: add step_after_suspend_test") > Reported-by: Sinadin Shan <sinadin.shan@oracle.com> > Signed-off-by: Yifei Liu <yifei.l.liu@oracle.com> > --- > .../testing/selftests/breakpoints/step_after_suspend_test.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c > index dfec31fb9b30d..33f5542bf741d 100644 > --- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c > +++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c > @@ -152,7 +152,10 @@ void suspend(void) > if (err < 0) > ksft_exit_fail_msg("timerfd_settime() failed\n"); > > - if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem")) > + system("(echo mem > /sys/power/state) 2> /dev/null"); > + > + timerfd_gettime(timerfd,&spec); > + if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0) > ksft_exit_fail_msg("Failed to enter Suspend state\n"); > > close(timerfd); thanks, -- Shuah
> On Sep 23, 2024, at 1:56 PM, Shuah Khan <skhan@linuxfoundation.org> wrote: > > On 9/23/24 14:47, Yifei Liu wrote: >> We recently notice that the step_after_suspend_test would > > noticed How about change this to find? > >> fail on our plenty devices. The test believesit failed to > > "several of our devices" believes it >> enter suspend state with > > Please do fix the above as pointed out earlier on v1. Apologies, forget to fix. Will fix that. >> $ sudo ./step_after_suspend_test >> TAP version 13 >> Bail out! Failed to enter Suspend state >> However, in the kernel message, I indeed see the system get >> suspended and then wake up later. >> [611172.033108] PM: suspend entry (s2idle) >> [611172.044940] Filesystems sync: 0.006 seconds >> [611172.052254] Freezing user space processes >> [611172.059319] Freezing user space processes completed (elapsed 0.001 seconds) >> [611172.067920] OOM killer disabled. >> [611172.072465] Freezing remaining freezable tasks >> [611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) >> [611172.089724] printk: Suspending console(s) (use no_console_suspend to debug) >> [611172.117126] serial 00:03: disabled >> --- some other hardware get reconnected --- >> [611203.136277] OOM killer enabled. >> [611203.140637] Restarting tasks ... >> [611203.141135] usb 1-8.1: USB disconnect, device number 7 >> [611203.141755] done. >> [611203.155268] random: crng reseeded on system resumption >> [611203.162059] PM: suspend exit >> After investigation, I notice that for the code block > > noticed? How about use find here? > >> if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem")) >> ksft_exit_fail_msg("Failed to enter Suspend state\n"); >> The write will return -1 and errno is set to 16 (device busy). >> It should be caused by the write function is not successfully returned >> before the system suspend and the return value get messed when waking up. >> As a result, It may be better to check the time passed of those few instructions >> to determine whether the suspend is executed correctly for it is pretty hard to >> execute those few lines for 5 seconds. >> The timer to wake up the system is set to expired after 5 seconds and no-rearm. >> If the timer remaining time is 0 second and 0 nano secomd, it means the timer >> expired and wake the system up. Otherwise, the system could be considered to >> enter the suspend state failed if there is any remaining time. > > What would you get if you were to cat /sys/power/state at this point? According to file kernel/power/suspend, the cat will always return freeze mem disk. > Does this fix the problem you are seeing? What is the outcome? Yes, it fixes. The test would not fail for it believes the system does not went to suspend by mistake. It now could continue to the rest part of the test after suspend. >> Fixes: bfd092b8c2728 ("selftests: breakpoint: add step_after_suspend_test") >> Reported-by: Sinadin Shan <sinadin.shan@oracle.com> >> Signed-off-by: Yifei Liu <yifei.l.liu@oracle.com> >> --- >> .../testing/selftests/breakpoints/step_after_suspend_test.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c >> index dfec31fb9b30d..33f5542bf741d 100644 >> --- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c >> +++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c >> @@ -152,7 +152,10 @@ void suspend(void) >> if (err < 0) >> ksft_exit_fail_msg("timerfd_settime() failed\n"); >> - if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem")) >> + system("(echo mem > /sys/power/state) 2> /dev/null"); >> + >> + timerfd_gettime(timerfd,&spec); >> + if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0) >> ksft_exit_fail_msg("Failed to enter Suspend state\n"); >> close(timerfd); > > thanks, > -- Shuah
On 9/23/24 15:18, Yifei Liu wrote: > > >> On Sep 23, 2024, at 1:56 PM, Shuah Khan <skhan@linuxfoundation.org> wrote: >> >> On 9/23/24 14:47, Yifei Liu wrote: >>> We recently notice that the step_after_suspend_test would >> >> noticed > How about change this to find? Either is fine - it would found instead of find - or you could write it as below: "step_after_suspend_test fails with device busy error while writing to /sys/power/state to start suspend." >> >>> fail on our plenty devices. The test believesit failed to >> >> "several of our devices" believes it >>> enter suspend state with >> >> Please do fix the above as pointed out earlier on v1. > Apologies, forget to fix. Will fix that. >>> $ sudo ./step_after_suspend_test >>> TAP version 13 >>> Bail out! Failed to enter Suspend state >>> However, in the kernel message, I indeed see the system get >>> suspended and then wake up later. >>> [611172.033108] PM: suspend entry (s2idle) >>> [611172.044940] Filesystems sync: 0.006 seconds >>> [611172.052254] Freezing user space processes >>> [611172.059319] Freezing user space processes completed (elapsed 0.001 seconds) >>> [611172.067920] OOM killer disabled. >>> [611172.072465] Freezing remaining freezable tasks >>> [611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) >>> [611172.089724] printk: Suspending console(s) (use no_console_suspend to debug) >>> [611172.117126] serial 00:03: disabled >>> --- some other hardware get reconnected --- >>> [611203.136277] OOM killer enabled. >>> [611203.140637] Restarting tasks ... >>> [611203.141135] usb 1-8.1: USB disconnect, device number 7 >>> [611203.141755] done. >>> [611203.155268] random: crng reseeded on system resumption >>> [611203.162059] PM: suspend exit >>> After investigation, I notice that for the code block >> >> noticed? > How about use find here? It is the tense issue - found or noticed would work. >> >>> if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem")) >>> ksft_exit_fail_msg("Failed to enter Suspend state\n"); >>> The write will return -1 and errno is set to 16 (device busy). >>> It should be caused by the write function is not successfully returned >>> before the system suspend and the return value get messed when waking up. >>> As a result, It may be better to check the time passed of those few instructions >>> to determine whether the suspend is executed correctly for it is pretty hard to >>> execute those few lines for 5 seconds. >>> The timer to wake up the system is set to expired after 5 seconds and no-rearm. >>> If the timer remaining time is 0 second and 0 nano secomd, it means the timer >>> expired and wake the system up. Otherwise, the system could be considered to >>> enter the suspend state failed if there is any remaining time. >> >> What would you get if you were to cat /sys/power/state at this point? > According to file kernel/power/suspend, the cat will always return freeze mem disk. > >> Does this fix the problem you are seeing? What is the outcome? > Yes, it fixes. The test would not fail for it believes the system does not went to suspend by mistake. It now could continue to the rest part of the test after suspend. Can you add this to the change log. >>> Fixes: bfd092b8c2728 ("selftests: breakpoint: add step_after_suspend_test") >>> Reported-by: Sinadin Shan <sinadin.shan@oracle.com> >>> Signed-off-by: Yifei Liu <yifei.l.liu@oracle.com> >>> --- >>> .../testing/selftests/breakpoints/step_after_suspend_test.c | 5 ++++- >>> 1 file changed, 4 insertions(+), 1 deletion(-) >>> diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c >>> index dfec31fb9b30d..33f5542bf741d 100644 >>> --- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c >>> +++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c >>> @@ -152,7 +152,10 @@ void suspend(void) >>> if (err < 0) >>> ksft_exit_fail_msg("timerfd_settime() failed\n"); >>> - if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem")) >>> + system("(echo mem > /sys/power/state) 2> /dev/null"); >>> + >>> + timerfd_gettime(timerfd,&spec); >>> + if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0) >>> ksft_exit_fail_msg("Failed to enter Suspend state\n"); >>> close(timerfd); >> thanks, -- Shuah
> On Sep 23, 2024, at 2:46 PM, Shuah Khan <skhan@linuxfoundation.org> wrote: > > On 9/23/24 15:18, Yifei Liu wrote: >>> On Sep 23, 2024, at 1:56 PM, Shuah Khan <skhan@linuxfoundation.org> wrote: >>> >>> On 9/23/24 14:47, Yifei Liu wrote: >>>> We recently notice that the step_after_suspend_test would >>> >>> noticed >> How about change this to find? > > Either is fine - it would found instead of find - or you could write > it as below: > > "step_after_suspend_test fails with device busy error while > writing to /sys/power/state to start suspend.” Thanks > >>> >>>> fail on our plenty devices. The test believesit failed to >>> >>> "several of our devices" believes it >>>> enter suspend state with >>> >>> Please do fix the above as pointed out earlier on v1. >> Apologies, forget to fix. Will fix that. >>>> $ sudo ./step_after_suspend_test >>>> TAP version 13 >>>> Bail out! Failed to enter Suspend state >>>> However, in the kernel message, I indeed see the system get >>>> suspended and then wake up later. >>>> [611172.033108] PM: suspend entry (s2idle) >>>> [611172.044940] Filesystems sync: 0.006 seconds >>>> [611172.052254] Freezing user space processes >>>> [611172.059319] Freezing user space processes completed (elapsed 0.001 seconds) >>>> [611172.067920] OOM killer disabled. >>>> [611172.072465] Freezing remaining freezable tasks >>>> [611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) >>>> [611172.089724] printk: Suspending console(s) (use no_console_suspend to debug) >>>> [611172.117126] serial 00:03: disabled >>>> --- some other hardware get reconnected --- >>>> [611203.136277] OOM killer enabled. >>>> [611203.140637] Restarting tasks ... >>>> [611203.141135] usb 1-8.1: USB disconnect, device number 7 >>>> [611203.141755] done. >>>> [611203.155268] random: crng reseeded on system resumption >>>> [611203.162059] PM: suspend exit >>>> After investigation, I notice that for the code block >>> >>> noticed? >> How about use find here? > > It is the tense issue - found or noticed would work. Thanks >>> >>>> if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem")) >>>> ksft_exit_fail_msg("Failed to enter Suspend state\n"); >>>> The write will return -1 and errno is set to 16 (device busy). >>>> It should be caused by the write function is not successfully returned >>>> before the system suspend and the return value get messed when waking up. >>>> As a result, It may be better to check the time passed of those few instructions >>>> to determine whether the suspend is executed correctly for it is pretty hard to >>>> execute those few lines for 5 seconds. >>>> The timer to wake up the system is set to expired after 5 seconds and no-rearm. >>>> If the timer remaining time is 0 second and 0 nano secomd, it means the timer >>>> expired and wake the system up. Otherwise, the system could be considered to >>>> enter the suspend state failed if there is any remaining time. >>> >>> What would you get if you were to cat /sys/power/state at this point? >> According to file kernel/power/suspend, the cat will always return freeze mem disk. >>> Does this fix the problem you are seeing? What is the outcome? >> Yes, it fixes. The test would not fail for it believes the system does not went to suspend by mistake. It now could continue to the rest part of the test after suspend. > > Can you add this to the change log. Sure. I will send a v3 later. > >>>> Fixes: bfd092b8c2728 ("selftests: breakpoint: add step_after_suspend_test") >>>> Reported-by: Sinadin Shan <sinadin.shan@oracle.com> >>>> Signed-off-by: Yifei Liu <yifei.l.liu@oracle.com> >>>> --- >>>> .../testing/selftests/breakpoints/step_after_suspend_test.c | 5 ++++- >>>> 1 file changed, 4 insertions(+), 1 deletion(-) >>>> diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c >>>> index dfec31fb9b30d..33f5542bf741d 100644 >>>> --- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c >>>> +++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c >>>> @@ -152,7 +152,10 @@ void suspend(void) >>>> if (err < 0) >>>> ksft_exit_fail_msg("timerfd_settime() failed\n"); >>>> - if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem")) >>>> + system("(echo mem > /sys/power/state) 2> /dev/null"); >>>> + >>>> + timerfd_gettime(timerfd,&spec); >>>> + if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0) >>>> ksft_exit_fail_msg("Failed to enter Suspend state\n"); >>>> close(timerfd); >>> > > thanks, > — Shuah Yifei
diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c index dfec31fb9b30d..33f5542bf741d 100644 --- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c +++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c @@ -152,7 +152,10 @@ void suspend(void) if (err < 0) ksft_exit_fail_msg("timerfd_settime() failed\n"); - if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem")) + system("(echo mem > /sys/power/state) 2> /dev/null"); + + timerfd_gettime(timerfd,&spec); + if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0) ksft_exit_fail_msg("Failed to enter Suspend state\n"); close(timerfd);
We recently notice that the step_after_suspend_test would fail on our plenty devices. The test believesit failed to enter suspend state with $ sudo ./step_after_suspend_test TAP version 13 Bail out! Failed to enter Suspend state However, in the kernel message, I indeed see the system get suspended and then wake up later. [611172.033108] PM: suspend entry (s2idle) [611172.044940] Filesystems sync: 0.006 seconds [611172.052254] Freezing user space processes [611172.059319] Freezing user space processes completed (elapsed 0.001 seconds) [611172.067920] OOM killer disabled. [611172.072465] Freezing remaining freezable tasks [611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) [611172.089724] printk: Suspending console(s) (use no_console_suspend to debug) [611172.117126] serial 00:03: disabled --- some other hardware get reconnected --- [611203.136277] OOM killer enabled. [611203.140637] Restarting tasks ... [611203.141135] usb 1-8.1: USB disconnect, device number 7 [611203.141755] done. [611203.155268] random: crng reseeded on system resumption [611203.162059] PM: suspend exit After investigation, I notice that for the code block if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem")) ksft_exit_fail_msg("Failed to enter Suspend state\n"); The write will return -1 and errno is set to 16 (device busy). It should be caused by the write function is not successfully returned before the system suspend and the return value get messed when waking up. As a result, It may be better to check the time passed of those few instructions to determine whether the suspend is executed correctly for it is pretty hard to execute those few lines for 5 seconds. The timer to wake up the system is set to expired after 5 seconds and no-rearm. If the timer remaining time is 0 second and 0 nano secomd, it means the timer expired and wake the system up. Otherwise, the system could be considered to enter the suspend state failed if there is any remaining time. Fixes: bfd092b8c2728 ("selftests: breakpoint: add step_after_suspend_test") Reported-by: Sinadin Shan <sinadin.shan@oracle.com> Signed-off-by: Yifei Liu <yifei.l.liu@oracle.com> --- .../testing/selftests/breakpoints/step_after_suspend_test.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)