Message ID | 20230725232913.2981357-2-joel@joelfernandes.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | misc RCU fixes and cleanups | expand |
On Tue, Jul 25, 2023 at 11:29:06PM +0000, Joel Fernandes (Google) wrote: > The stuttering code isn't functioning as expected. Ideally, it should > pause the torture threads for a designated period before resuming. Yet, > it fails to halt the test for the correct duration. Additionally, a race > condition exists, potentially causing the stuttering code to pause for > an extended period if the 'spt' variable is non-zero due to the stutter > orchestration thread's inadequate CPU time. > > Moreover, over-stuttering can hinder RCU's progress on TREE07 kernels. > This happens as the stuttering code may run within a softirq due to RCU > callbacks. Consequently, ksoftirqd keeps a CPU busy for several seconds, > thus obstructing RCU's progress. This situation triggers a warning > message in the logs: > > [ 2169.481783] rcu_torture_writer: rtort_pipe_count: 9 > > This warning suggests that an RCU torture object, although invisible to > RCU readers, couldn't make it past the pipe array and be freed -- a > strong indication that there weren't enough grace periods during the > stutter interval. > > To address these issues, this patch sets the "stutter end" time to an > absolute point in the future set by the main stutter thread. This is > then used for waiting in stutter_wait(). While the stutter thread still > defines this absolute time, the waiters' waiting logic doesn't rely on > the stutter thread receiving sufficient CPU time to halt the stuttering > as the halting is now self-controlled. > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> > --- > kernel/torture.c | 46 +++++++++++++--------------------------------- > 1 file changed, 13 insertions(+), 33 deletions(-) > > diff --git a/kernel/torture.c b/kernel/torture.c > index 68dba4ecab5c..63f8f2a7d960 100644 > --- a/kernel/torture.c > +++ b/kernel/torture.c > @@ -719,7 +719,7 @@ static void torture_shutdown_cleanup(void) > * suddenly applied to or removed from the system. > */ > static struct task_struct *stutter_task; > -static int stutter_pause_test; > +static ktime_t stutter_till_abs_time; > static int stutter; > static int stutter_gap; > > @@ -729,30 +729,17 @@ static int stutter_gap; > */ > bool stutter_wait(const char *title) > { > - unsigned int i = 0; > bool ret = false; > - int spt; > + ktime_t now_ns, till_ns; > > cond_resched_tasks_rcu_qs(); > - spt = READ_ONCE(stutter_pause_test); > - for (; spt; spt = READ_ONCE(stutter_pause_test)) { > - if (!ret && !rt_task(current)) { > - sched_set_normal(current, MAX_NICE); > - ret = true; > - } > - if (spt == 1) { > - torture_hrtimeout_jiffies(1, NULL); > - } else if (spt == 2) { > - while (READ_ONCE(stutter_pause_test)) { > - if (!(i++ & 0xffff)) > - torture_hrtimeout_us(10, 0, NULL); > - cond_resched(); > - } > - } else { > - torture_hrtimeout_jiffies(round_jiffies_relative(HZ), NULL); > - } > - torture_shutdown_absorb(title); > + now_ns = ktime_get(); > + till_ns = READ_ONCE(stutter_till_abs_time); > + if (till_ns && ktime_before(now_ns, till_ns)) { > + torture_hrtimeout_ns(ktime_sub(till_ns, now_ns), 0, NULL); This ktime_sub() is roughly cancelled out by a ktime_add_safe() in __hrtimer_start_range_ns(). Perhaps torture_hrtimeout_ns() needs to take a mode argument as in the patch at the end of this email, allowing you to ditch that ktime_sub() in favor of HRTIMER_MODE_ABS. Thanx, Paul > + ret = true; > } > + torture_shutdown_absorb(title); > return ret; > } > EXPORT_SYMBOL_GPL(stutter_wait); > @@ -763,23 +750,16 @@ EXPORT_SYMBOL_GPL(stutter_wait); > */ > static int torture_stutter(void *arg) > { > - DEFINE_TORTURE_RANDOM(rand); > - int wtime; > + ktime_t till_ns; > > VERBOSE_TOROUT_STRING("torture_stutter task started"); > do { > if (!torture_must_stop() && stutter > 1) { > - wtime = stutter; > - if (stutter > 2) { > - WRITE_ONCE(stutter_pause_test, 1); > - wtime = stutter - 3; > - torture_hrtimeout_jiffies(wtime, &rand); > - wtime = 2; > - } > - WRITE_ONCE(stutter_pause_test, 2); > - torture_hrtimeout_jiffies(wtime, NULL); > + till_ns = ktime_add_ns(ktime_get(), > + jiffies_to_nsecs(stutter)); > + WRITE_ONCE(stutter_till_abs_time, till_ns); > + torture_hrtimeout_jiffies(stutter - 1, NULL); > } > - WRITE_ONCE(stutter_pause_test, 0); > if (!torture_must_stop()) > torture_hrtimeout_jiffies(stutter_gap, NULL); > torture_shutdown_absorb("torture_stutter"); ------------------------------------------------------------------------ commit 6445f014c3d4d20a0b69bd97d3b76a222820612f Author: Paul E. McKenney <paulmck@kernel.org> Date: Wed Jul 26 13:57:03 2023 -0700 torture: Make torture_hrtimeout_ns() take an hrtimer mode parameter The current torture-test sleeps are waiting for a duration, but there are situations where it is better to wait for an absolute time, for example, when ending a stutter interval. This commit therefore adds an hrtimer mode parameter to torture_hrtimeout_ns(). Why not also the other torture_hrtimeout_*() functions? The theory is that most absolute times will be in nanoseconds, especially not (say) jiffies. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> diff --git a/include/linux/torture.h b/include/linux/torture.h index bb466eec01e4..017f0f710815 100644 --- a/include/linux/torture.h +++ b/include/linux/torture.h @@ -81,7 +81,8 @@ static inline void torture_random_init(struct torture_random_state *trsp) } /* Definitions for high-resolution-timer sleeps. */ -int torture_hrtimeout_ns(ktime_t baset_ns, u32 fuzzt_ns, struct torture_random_state *trsp); +int torture_hrtimeout_ns(ktime_t baset_ns, u32 fuzzt_ns, const enum hrtimer_mode mode, + struct torture_random_state *trsp); int torture_hrtimeout_us(u32 baset_us, u32 fuzzt_ns, struct torture_random_state *trsp); int torture_hrtimeout_ms(u32 baset_ms, u32 fuzzt_us, struct torture_random_state *trsp); int torture_hrtimeout_jiffies(u32 baset_j, struct torture_random_state *trsp); diff --git a/kernel/torture.c b/kernel/torture.c index 68dba4ecab5c..6ba62e5993e7 100644 --- a/kernel/torture.c +++ b/kernel/torture.c @@ -87,14 +87,15 @@ EXPORT_SYMBOL_GPL(verbose_torout_sleep); * nanosecond random fuzz. This function and its friends desynchronize * testing from the timer wheel. */ -int torture_hrtimeout_ns(ktime_t baset_ns, u32 fuzzt_ns, struct torture_random_state *trsp) +int torture_hrtimeout_ns(ktime_t baset_ns, u32 fuzzt_ns, const enum hrtimer_mode mode, + struct torture_random_state *trsp) { ktime_t hto = baset_ns; if (trsp) hto += torture_random(trsp) % fuzzt_ns; set_current_state(TASK_IDLE); - return schedule_hrtimeout(&hto, HRTIMER_MODE_REL); + return schedule_hrtimeout(&hto, mode); } EXPORT_SYMBOL_GPL(torture_hrtimeout_ns); @@ -106,7 +107,7 @@ int torture_hrtimeout_us(u32 baset_us, u32 fuzzt_ns, struct torture_random_state { ktime_t baset_ns = baset_us * NSEC_PER_USEC; - return torture_hrtimeout_ns(baset_ns, fuzzt_ns, trsp); + return torture_hrtimeout_ns(baset_ns, fuzzt_ns, HRTIMER_MODE_REL, trsp); } EXPORT_SYMBOL_GPL(torture_hrtimeout_us); @@ -123,7 +124,7 @@ int torture_hrtimeout_ms(u32 baset_ms, u32 fuzzt_us, struct torture_random_state fuzzt_ns = (u32)~0U; else fuzzt_ns = fuzzt_us * NSEC_PER_USEC; - return torture_hrtimeout_ns(baset_ns, fuzzt_ns, trsp); + return torture_hrtimeout_ns(baset_ns, fuzzt_ns, HRTIMER_MODE_REL, trsp); } EXPORT_SYMBOL_GPL(torture_hrtimeout_ms); @@ -136,7 +137,7 @@ int torture_hrtimeout_jiffies(u32 baset_j, struct torture_random_state *trsp) { ktime_t baset_ns = jiffies_to_nsecs(baset_j); - return torture_hrtimeout_ns(baset_ns, jiffies_to_nsecs(1), trsp); + return torture_hrtimeout_ns(baset_ns, jiffies_to_nsecs(1), HRTIMER_MODE_REL, trsp); } EXPORT_SYMBOL_GPL(torture_hrtimeout_jiffies); @@ -153,7 +154,7 @@ int torture_hrtimeout_s(u32 baset_s, u32 fuzzt_ms, struct torture_random_state * fuzzt_ns = (u32)~0U; else fuzzt_ns = fuzzt_ms * NSEC_PER_MSEC; - return torture_hrtimeout_ns(baset_ns, fuzzt_ns, trsp); + return torture_hrtimeout_ns(baset_ns, fuzzt_ns, HRTIMER_MODE_REL, trsp); } EXPORT_SYMBOL_GPL(torture_hrtimeout_s);
On Wed, Jul 26, 2023 at 4:59 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > On Tue, Jul 25, 2023 at 11:29:06PM +0000, Joel Fernandes (Google) wrote: > > The stuttering code isn't functioning as expected. Ideally, it should > > pause the torture threads for a designated period before resuming. Yet, > > it fails to halt the test for the correct duration. Additionally, a race > > condition exists, potentially causing the stuttering code to pause for > > an extended period if the 'spt' variable is non-zero due to the stutter > > orchestration thread's inadequate CPU time. > > > > Moreover, over-stuttering can hinder RCU's progress on TREE07 kernels. > > This happens as the stuttering code may run within a softirq due to RCU > > callbacks. Consequently, ksoftirqd keeps a CPU busy for several seconds, > > thus obstructing RCU's progress. This situation triggers a warning > > message in the logs: > > > > [ 2169.481783] rcu_torture_writer: rtort_pipe_count: 9 > > > > This warning suggests that an RCU torture object, although invisible to > > RCU readers, couldn't make it past the pipe array and be freed -- a > > strong indication that there weren't enough grace periods during the > > stutter interval. > > > > To address these issues, this patch sets the "stutter end" time to an > > absolute point in the future set by the main stutter thread. This is > > then used for waiting in stutter_wait(). While the stutter thread still > > defines this absolute time, the waiters' waiting logic doesn't rely on > > the stutter thread receiving sufficient CPU time to halt the stuttering > > as the halting is now self-controlled. > > > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> > > --- > > kernel/torture.c | 46 +++++++++++++--------------------------------- > > 1 file changed, 13 insertions(+), 33 deletions(-) > > > > diff --git a/kernel/torture.c b/kernel/torture.c > > index 68dba4ecab5c..63f8f2a7d960 100644 > > --- a/kernel/torture.c > > +++ b/kernel/torture.c > > @@ -719,7 +719,7 @@ static void torture_shutdown_cleanup(void) > > * suddenly applied to or removed from the system. > > */ > > static struct task_struct *stutter_task; > > -static int stutter_pause_test; > > +static ktime_t stutter_till_abs_time; > > static int stutter; > > static int stutter_gap; > > > > @@ -729,30 +729,17 @@ static int stutter_gap; > > */ > > bool stutter_wait(const char *title) > > { > > - unsigned int i = 0; > > bool ret = false; > > - int spt; > > + ktime_t now_ns, till_ns; > > > > cond_resched_tasks_rcu_qs(); > > - spt = READ_ONCE(stutter_pause_test); > > - for (; spt; spt = READ_ONCE(stutter_pause_test)) { > > - if (!ret && !rt_task(current)) { > > - sched_set_normal(current, MAX_NICE); > > - ret = true; > > - } > > - if (spt == 1) { > > - torture_hrtimeout_jiffies(1, NULL); > > - } else if (spt == 2) { > > - while (READ_ONCE(stutter_pause_test)) { > > - if (!(i++ & 0xffff)) > > - torture_hrtimeout_us(10, 0, NULL); > > - cond_resched(); > > - } > > - } else { > > - torture_hrtimeout_jiffies(round_jiffies_relative(HZ), NULL); > > - } > > - torture_shutdown_absorb(title); > > + now_ns = ktime_get(); > > + till_ns = READ_ONCE(stutter_till_abs_time); > > + if (till_ns && ktime_before(now_ns, till_ns)) { > > + torture_hrtimeout_ns(ktime_sub(till_ns, now_ns), 0, NULL); > > This ktime_sub() is roughly cancelled out by a ktime_add_safe() in > __hrtimer_start_range_ns(). Yes, functionally it is the same but your suggestion is more robust I think. > Perhaps torture_hrtimeout_ns() needs to > take a mode argument as in the patch at the end of this email, allowing > you to ditch that ktime_sub() in favor of HRTIMER_MODE_ABS. Sure, or we can add a new API and keep the default as relative? Or have 2 APIs: torture_hrtimeout_relative_ns(); and: torture_hrtimeout_absolute_ns(); That makes it more readable IMHO. Also, do you want me to make both changes (API and usage) in the same patch? Or were you planning to have a separate patch yourself in -dev which I can use? Let me know either way, and then I'll refresh the patch. thanks, - Joel [..]
On Wed, Jul 26, 2023 at 11:01:40PM -0400, Joel Fernandes wrote: > On Wed, Jul 26, 2023 at 4:59 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Tue, Jul 25, 2023 at 11:29:06PM +0000, Joel Fernandes (Google) wrote: > > > The stuttering code isn't functioning as expected. Ideally, it should > > > pause the torture threads for a designated period before resuming. Yet, > > > it fails to halt the test for the correct duration. Additionally, a race > > > condition exists, potentially causing the stuttering code to pause for > > > an extended period if the 'spt' variable is non-zero due to the stutter > > > orchestration thread's inadequate CPU time. > > > > > > Moreover, over-stuttering can hinder RCU's progress on TREE07 kernels. > > > This happens as the stuttering code may run within a softirq due to RCU > > > callbacks. Consequently, ksoftirqd keeps a CPU busy for several seconds, > > > thus obstructing RCU's progress. This situation triggers a warning > > > message in the logs: > > > > > > [ 2169.481783] rcu_torture_writer: rtort_pipe_count: 9 > > > > > > This warning suggests that an RCU torture object, although invisible to > > > RCU readers, couldn't make it past the pipe array and be freed -- a > > > strong indication that there weren't enough grace periods during the > > > stutter interval. > > > > > > To address these issues, this patch sets the "stutter end" time to an > > > absolute point in the future set by the main stutter thread. This is > > > then used for waiting in stutter_wait(). While the stutter thread still > > > defines this absolute time, the waiters' waiting logic doesn't rely on > > > the stutter thread receiving sufficient CPU time to halt the stuttering > > > as the halting is now self-controlled. > > > > > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> > > > --- > > > kernel/torture.c | 46 +++++++++++++--------------------------------- > > > 1 file changed, 13 insertions(+), 33 deletions(-) > > > > > > diff --git a/kernel/torture.c b/kernel/torture.c > > > index 68dba4ecab5c..63f8f2a7d960 100644 > > > --- a/kernel/torture.c > > > +++ b/kernel/torture.c > > > @@ -719,7 +719,7 @@ static void torture_shutdown_cleanup(void) > > > * suddenly applied to or removed from the system. > > > */ > > > static struct task_struct *stutter_task; > > > -static int stutter_pause_test; > > > +static ktime_t stutter_till_abs_time; > > > static int stutter; > > > static int stutter_gap; > > > > > > @@ -729,30 +729,17 @@ static int stutter_gap; > > > */ > > > bool stutter_wait(const char *title) > > > { > > > - unsigned int i = 0; > > > bool ret = false; > > > - int spt; > > > + ktime_t now_ns, till_ns; > > > > > > cond_resched_tasks_rcu_qs(); > > > - spt = READ_ONCE(stutter_pause_test); > > > - for (; spt; spt = READ_ONCE(stutter_pause_test)) { > > > - if (!ret && !rt_task(current)) { > > > - sched_set_normal(current, MAX_NICE); > > > - ret = true; > > > - } > > > - if (spt == 1) { > > > - torture_hrtimeout_jiffies(1, NULL); > > > - } else if (spt == 2) { > > > - while (READ_ONCE(stutter_pause_test)) { > > > - if (!(i++ & 0xffff)) > > > - torture_hrtimeout_us(10, 0, NULL); > > > - cond_resched(); > > > - } > > > - } else { > > > - torture_hrtimeout_jiffies(round_jiffies_relative(HZ), NULL); > > > - } > > > - torture_shutdown_absorb(title); > > > + now_ns = ktime_get(); > > > + till_ns = READ_ONCE(stutter_till_abs_time); > > > + if (till_ns && ktime_before(now_ns, till_ns)) { > > > + torture_hrtimeout_ns(ktime_sub(till_ns, now_ns), 0, NULL); > > > > This ktime_sub() is roughly cancelled out by a ktime_add_safe() in > > __hrtimer_start_range_ns(). > > Yes, functionally it is the same but your suggestion is more robust I think. > > > Perhaps torture_hrtimeout_ns() needs to > > take a mode argument as in the patch at the end of this email, allowing > > you to ditch that ktime_sub() in favor of HRTIMER_MODE_ABS. > > Sure, or we can add a new API and keep the default as relative? > > Or have 2 APIs: > torture_hrtimeout_relative_ns(); > > and: > torture_hrtimeout_absolute_ns(); > > That makes it more readable IMHO. > > Also, do you want me to make both changes (API and usage) in the same > patch? Or were you planning to have a separate patch yourself in -dev > which I can use? Let me know either way, and then I'll refresh the > patch. I queued the patch on the -rcu tree's "dev" branch. It turns out that torture_hrtimeout_ns() isn't called very many times, so adding the parameter was straightforward. Plus the compiler might well optimize it away anyway. Thanx, Paul
On 7/27/23 00:01, Paul E. McKenney wrote: > On Wed, Jul 26, 2023 at 11:01:40PM -0400, Joel Fernandes wrote: >> On Wed, Jul 26, 2023 at 4:59 PM Paul E. McKenney <paulmck@kernel.org> wrote: >>> >>> On Tue, Jul 25, 2023 at 11:29:06PM +0000, Joel Fernandes (Google) wrote: >>>> The stuttering code isn't functioning as expected. Ideally, it should >>>> pause the torture threads for a designated period before resuming. Yet, >>>> it fails to halt the test for the correct duration. Additionally, a race >>>> condition exists, potentially causing the stuttering code to pause for >>>> an extended period if the 'spt' variable is non-zero due to the stutter >>>> orchestration thread's inadequate CPU time. >>>> >>>> Moreover, over-stuttering can hinder RCU's progress on TREE07 kernels. >>>> This happens as the stuttering code may run within a softirq due to RCU >>>> callbacks. Consequently, ksoftirqd keeps a CPU busy for several seconds, >>>> thus obstructing RCU's progress. This situation triggers a warning >>>> message in the logs: >>>> >>>> [ 2169.481783] rcu_torture_writer: rtort_pipe_count: 9 >>>> >>>> This warning suggests that an RCU torture object, although invisible to >>>> RCU readers, couldn't make it past the pipe array and be freed -- a >>>> strong indication that there weren't enough grace periods during the >>>> stutter interval. >>>> >>>> To address these issues, this patch sets the "stutter end" time to an >>>> absolute point in the future set by the main stutter thread. This is >>>> then used for waiting in stutter_wait(). While the stutter thread still >>>> defines this absolute time, the waiters' waiting logic doesn't rely on >>>> the stutter thread receiving sufficient CPU time to halt the stuttering >>>> as the halting is now self-controlled. >>>> >>>> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> >>>> --- >>>> kernel/torture.c | 46 +++++++++++++--------------------------------- >>>> 1 file changed, 13 insertions(+), 33 deletions(-) >>>> >>>> diff --git a/kernel/torture.c b/kernel/torture.c >>>> index 68dba4ecab5c..63f8f2a7d960 100644 >>>> --- a/kernel/torture.c >>>> +++ b/kernel/torture.c >>>> @@ -719,7 +719,7 @@ static void torture_shutdown_cleanup(void) >>>> * suddenly applied to or removed from the system. >>>> */ >>>> static struct task_struct *stutter_task; >>>> -static int stutter_pause_test; >>>> +static ktime_t stutter_till_abs_time; >>>> static int stutter; >>>> static int stutter_gap; >>>> >>>> @@ -729,30 +729,17 @@ static int stutter_gap; >>>> */ >>>> bool stutter_wait(const char *title) >>>> { >>>> - unsigned int i = 0; >>>> bool ret = false; >>>> - int spt; >>>> + ktime_t now_ns, till_ns; >>>> >>>> cond_resched_tasks_rcu_qs(); >>>> - spt = READ_ONCE(stutter_pause_test); >>>> - for (; spt; spt = READ_ONCE(stutter_pause_test)) { >>>> - if (!ret && !rt_task(current)) { >>>> - sched_set_normal(current, MAX_NICE); >>>> - ret = true; >>>> - } >>>> - if (spt == 1) { >>>> - torture_hrtimeout_jiffies(1, NULL); >>>> - } else if (spt == 2) { >>>> - while (READ_ONCE(stutter_pause_test)) { >>>> - if (!(i++ & 0xffff)) >>>> - torture_hrtimeout_us(10, 0, NULL); >>>> - cond_resched(); >>>> - } >>>> - } else { >>>> - torture_hrtimeout_jiffies(round_jiffies_relative(HZ), NULL); >>>> - } >>>> - torture_shutdown_absorb(title); >>>> + now_ns = ktime_get(); >>>> + till_ns = READ_ONCE(stutter_till_abs_time); >>>> + if (till_ns && ktime_before(now_ns, till_ns)) { >>>> + torture_hrtimeout_ns(ktime_sub(till_ns, now_ns), 0, NULL); >>> >>> This ktime_sub() is roughly cancelled out by a ktime_add_safe() in >>> __hrtimer_start_range_ns(). >> >> Yes, functionally it is the same but your suggestion is more robust I think. >> >>> Perhaps torture_hrtimeout_ns() needs to >>> take a mode argument as in the patch at the end of this email, allowing >>> you to ditch that ktime_sub() in favor of HRTIMER_MODE_ABS. >> >> Sure, or we can add a new API and keep the default as relative? >> >> Or have 2 APIs: >> torture_hrtimeout_relative_ns(); >> >> and: >> torture_hrtimeout_absolute_ns(); >> >> That makes it more readable IMHO. >> >> Also, do you want me to make both changes (API and usage) in the same >> patch? Or were you planning to have a separate patch yourself in -dev >> which I can use? Let me know either way, and then I'll refresh the >> patch. > > I queued the patch on the -rcu tree's "dev" branch. It turns out that > torture_hrtimeout_ns() isn't called very many times, so adding the > parameter was straightforward. Plus the compiler might well optimize > it away anyway. Ok sounds good, I will make use of it in this patch and send it again after testing. thanks, - Joel
diff --git a/kernel/torture.c b/kernel/torture.c index 68dba4ecab5c..63f8f2a7d960 100644 --- a/kernel/torture.c +++ b/kernel/torture.c @@ -719,7 +719,7 @@ static void torture_shutdown_cleanup(void) * suddenly applied to or removed from the system. */ static struct task_struct *stutter_task; -static int stutter_pause_test; +static ktime_t stutter_till_abs_time; static int stutter; static int stutter_gap; @@ -729,30 +729,17 @@ static int stutter_gap; */ bool stutter_wait(const char *title) { - unsigned int i = 0; bool ret = false; - int spt; + ktime_t now_ns, till_ns; cond_resched_tasks_rcu_qs(); - spt = READ_ONCE(stutter_pause_test); - for (; spt; spt = READ_ONCE(stutter_pause_test)) { - if (!ret && !rt_task(current)) { - sched_set_normal(current, MAX_NICE); - ret = true; - } - if (spt == 1) { - torture_hrtimeout_jiffies(1, NULL); - } else if (spt == 2) { - while (READ_ONCE(stutter_pause_test)) { - if (!(i++ & 0xffff)) - torture_hrtimeout_us(10, 0, NULL); - cond_resched(); - } - } else { - torture_hrtimeout_jiffies(round_jiffies_relative(HZ), NULL); - } - torture_shutdown_absorb(title); + now_ns = ktime_get(); + till_ns = READ_ONCE(stutter_till_abs_time); + if (till_ns && ktime_before(now_ns, till_ns)) { + torture_hrtimeout_ns(ktime_sub(till_ns, now_ns), 0, NULL); + ret = true; } + torture_shutdown_absorb(title); return ret; } EXPORT_SYMBOL_GPL(stutter_wait); @@ -763,23 +750,16 @@ EXPORT_SYMBOL_GPL(stutter_wait); */ static int torture_stutter(void *arg) { - DEFINE_TORTURE_RANDOM(rand); - int wtime; + ktime_t till_ns; VERBOSE_TOROUT_STRING("torture_stutter task started"); do { if (!torture_must_stop() && stutter > 1) { - wtime = stutter; - if (stutter > 2) { - WRITE_ONCE(stutter_pause_test, 1); - wtime = stutter - 3; - torture_hrtimeout_jiffies(wtime, &rand); - wtime = 2; - } - WRITE_ONCE(stutter_pause_test, 2); - torture_hrtimeout_jiffies(wtime, NULL); + till_ns = ktime_add_ns(ktime_get(), + jiffies_to_nsecs(stutter)); + WRITE_ONCE(stutter_till_abs_time, till_ns); + torture_hrtimeout_jiffies(stutter - 1, NULL); } - WRITE_ONCE(stutter_pause_test, 0); if (!torture_must_stop()) torture_hrtimeout_jiffies(stutter_gap, NULL); torture_shutdown_absorb("torture_stutter");
The stuttering code isn't functioning as expected. Ideally, it should pause the torture threads for a designated period before resuming. Yet, it fails to halt the test for the correct duration. Additionally, a race condition exists, potentially causing the stuttering code to pause for an extended period if the 'spt' variable is non-zero due to the stutter orchestration thread's inadequate CPU time. Moreover, over-stuttering can hinder RCU's progress on TREE07 kernels. This happens as the stuttering code may run within a softirq due to RCU callbacks. Consequently, ksoftirqd keeps a CPU busy for several seconds, thus obstructing RCU's progress. This situation triggers a warning message in the logs: [ 2169.481783] rcu_torture_writer: rtort_pipe_count: 9 This warning suggests that an RCU torture object, although invisible to RCU readers, couldn't make it past the pipe array and be freed -- a strong indication that there weren't enough grace periods during the stutter interval. To address these issues, this patch sets the "stutter end" time to an absolute point in the future set by the main stutter thread. This is then used for waiting in stutter_wait(). While the stutter thread still defines this absolute time, the waiters' waiting logic doesn't rely on the stutter thread receiving sufficient CPU time to halt the stuttering as the halting is now self-controlled. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> --- kernel/torture.c | 46 +++++++++++++--------------------------------- 1 file changed, 13 insertions(+), 33 deletions(-)