diff mbox series

[v2,net,5/7] net/sched: taprio: fix delayed switching to new schedule after timer expiry

Message ID 20231107112023.676016-6-faizal.abdul.rahim@linux.intel.com (mailing list archive)
State Deferred
Delegated to: Netdev Maintainers
Headers show
Series qbv cycle time extension/truncation | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1316 this patch: 1316
netdev/cc_maintainers success CCed 9 of 9 maintainers
netdev/build_clang success Errors and warnings before: 1343 this patch: 1343
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1344 this patch: 1344
netdev/checkpatch warning WARNING: line length of 82 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Abdul Rahim, Faizal Nov. 7, 2023, 11:20 a.m. UTC
If a new GCL is triggered and the new admin base time falls before the
expiry of advance_timer (current running entry from oper),
taprio_start_sched() resets the current advance_timer expiry to the
new admin base time. However, upon expiry, advance_sched() doesn't
immediately switch to the admin schedule. It continues running entries
from the old oper schedule, and only switches to the new admin schedule
much later. Ideally, if the advance_timer is shorten to align with the
new admin base time, when the timer expires, advance_sched() should
trigger switch_schedules() at the beginning.

To resolve this issue, set the cycle_time_correction to a non-initialized
value in taprio_start_sched(). advance_sched() will use it to initiate
switch_schedules() at the beginning.

Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule")
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
---
 net/sched/sch_taprio.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

Comments

Vladimir Oltean Nov. 9, 2023, 11:50 a.m. UTC | #1
On Tue, Nov 07, 2023 at 06:20:21AM -0500, Faizal Rahim wrote:
> If a new GCL is triggered and the new admin base time falls before the
> expiry of advance_timer (current running entry from oper),
> taprio_start_sched() resets the current advance_timer expiry to the
> new admin base time. However, upon expiry, advance_sched() doesn't
> immediately switch to the admin schedule. It continues running entries
> from the old oper schedule, and only switches to the new admin schedule
> much later. Ideally, if the advance_timer is shorten to align with the
> new admin base time, when the timer expires, advance_sched() should
> trigger switch_schedules() at the beginning.
> 
> To resolve this issue, set the cycle_time_correction to a non-initialized
> value in taprio_start_sched(). advance_sched() will use it to initiate
> switch_schedules() at the beginning.
> 
> Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule")

Did the commit you blame really introduce this issue, or was it your
rework to trigger switch_schedules() based on the correction?

> Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
> ---
>  net/sched/sch_taprio.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> index f18a5fe12f0c..01b114edec30 100644
> --- a/net/sched/sch_taprio.c
> +++ b/net/sched/sch_taprio.c
> @@ -1379,14 +1379,19 @@ static void setup_first_end_time(struct taprio_sched *q,
>  }
>  
>  static void taprio_start_sched(struct Qdisc *sch,
> -			       ktime_t start, struct sched_gate_list *new)
> +			       ktime_t new_base_time,
> +			       struct sched_gate_list *new)
>  {
>  	struct taprio_sched *q = qdisc_priv(sch);
> -	ktime_t expires;
> +	struct sched_gate_list *oper = NULL;
> +	ktime_t expires, start;
>  
>  	if (FULL_OFFLOAD_IS_ENABLED(q->flags))
>  		return;
>  
> +	oper = rcu_dereference_protected(q->oper_sched,
> +					 lockdep_is_held(&q->current_entry_lock));
> +
>  	expires = hrtimer_get_expires(&q->advance_timer);
>  	if (expires == 0)
>  		expires = KTIME_MAX;
> @@ -1395,7 +1400,17 @@ static void taprio_start_sched(struct Qdisc *sch,
>  	 * reprogram it to the earliest one, so we change the admin
>  	 * schedule to the operational one at the right time.
>  	 */
> -	start = min_t(ktime_t, start, expires);
> +	start = min_t(ktime_t, new_base_time, expires);
> +
> +	if (expires != KTIME_MAX &&
> +	    ktime_compare(start, new_base_time) == 0) {
> +		/* Since timer was changed to align to the new admin schedule,
> +		 * setting the variable below to a non-initialized value will

I find the wording "setting the variable below to a non-initialized value"
confusing. 0 is non-initialized? You're talking about a value different
than INIT_CYCLE_TIME_CORRECTION. What about "setting a specific cycle
correction will indicate ..."?

> +		 * indicate to advance_sched() to call switch_schedules() after
> +		 * this timer expires.
> +		 */
> +		oper->cycle_time_correction = 0;

Why 0 and not ktime_sub(new_base_time, oper->cycle_end_time)? Doesn't
the precise correction value make a difference?

> +	}
>  
>  	hrtimer_start(&q->advance_timer, start, HRTIMER_MODE_ABS);
>  }
> -- 
> 2.25.1
>
Vladimir Oltean Nov. 9, 2023, 12:24 p.m. UTC | #2
On Tue, Nov 07, 2023 at 06:20:21AM -0500, Faizal Rahim wrote:
>  static void taprio_start_sched(struct Qdisc *sch,
> -			       ktime_t start, struct sched_gate_list *new)
> +			       ktime_t new_base_time,
> +			       struct sched_gate_list *new)
>  {
>  	struct taprio_sched *q = qdisc_priv(sch);
> -	ktime_t expires;
> +	struct sched_gate_list *oper = NULL;

No point in initializing with NULL if this will be overwritten later.

> +	ktime_t expires, start;
>  
>  	if (FULL_OFFLOAD_IS_ENABLED(q->flags))
>  		return;
>  
> +	oper = rcu_dereference_protected(q->oper_sched,
> +					 lockdep_is_held(&q->current_entry_lock));
> +
Abdul Rahim, Faizal Nov. 15, 2023, 11:56 a.m. UTC | #3
On 9/11/2023 7:50 pm, Vladimir Oltean wrote:
> On Tue, Nov 07, 2023 at 06:20:21AM -0500, Faizal Rahim wrote:
>> If a new GCL is triggered and the new admin base time falls before the
>> expiry of advance_timer (current running entry from oper),
>> taprio_start_sched() resets the current advance_timer expiry to the
>> new admin base time. However, upon expiry, advance_sched() doesn't
>> immediately switch to the admin schedule. It continues running entries
>> from the old oper schedule, and only switches to the new admin schedule
>> much later. Ideally, if the advance_timer is shorten to align with the
>> new admin base time, when the timer expires, advance_sched() should
>> trigger switch_schedules() at the beginning.
>>
>> To resolve this issue, set the cycle_time_correction to a non-initialized
>> value in taprio_start_sched(). advance_sched() will use it to initiate
>> switch_schedules() at the beginning.
>>
>> Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule")
> 
> Did the commit you blame really introduce this issue, or was it your
> rework to trigger switch_schedules() based on the correction?
> 

Ohh actually this issue happens even without my whole patch set.

>> Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
>> ---
>>   net/sched/sch_taprio.c | 21 ++++++++++++++++++---
>>   1 file changed, 18 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
>> index f18a5fe12f0c..01b114edec30 100644
>> --- a/net/sched/sch_taprio.c
>> +++ b/net/sched/sch_taprio.c
>> @@ -1379,14 +1379,19 @@ static void setup_first_end_time(struct taprio_sched *q,
>>   }
>>   
>>   static void taprio_start_sched(struct Qdisc *sch,
>> -			       ktime_t start, struct sched_gate_list *new)
>> +			       ktime_t new_base_time,
>> +			       struct sched_gate_list *new)
>>   {
>>   	struct taprio_sched *q = qdisc_priv(sch);
>> -	ktime_t expires;
>> +	struct sched_gate_list *oper = NULL;
>> +	ktime_t expires, start;
>>   
>>   	if (FULL_OFFLOAD_IS_ENABLED(q->flags))
>>   		return;
>>   
>> +	oper = rcu_dereference_protected(q->oper_sched,
>> +					 lockdep_is_held(&q->current_entry_lock));
>> +
>>   	expires = hrtimer_get_expires(&q->advance_timer);
>>   	if (expires == 0)
>>   		expires = KTIME_MAX;
>> @@ -1395,7 +1400,17 @@ static void taprio_start_sched(struct Qdisc *sch,
>>   	 * reprogram it to the earliest one, so we change the admin
>>   	 * schedule to the operational one at the right time.
>>   	 */
>> -	start = min_t(ktime_t, start, expires);
>> +	start = min_t(ktime_t, new_base_time, expires);
>> +
>> +	if (expires != KTIME_MAX &&
>> +	    ktime_compare(start, new_base_time) == 0) {
>> +		/* Since timer was changed to align to the new admin schedule,
>> +		 * setting the variable below to a non-initialized value will
> 
> I find the wording "setting the variable below to a non-initialized value"
> confusing. 0 is non-initialized? You're talking about a value different
> than INIT_CYCLE_TIME_CORRECTION. What about "setting a specific cycle
> correction will indicate ..."?
> 

Sure

>> +		 * indicate to advance_sched() to call switch_schedules() after
>> +		 * this timer expires.
>> +		 */
>> +		oper->cycle_time_correction = 0;
> 
> Why 0 and not ktime_sub(new_base_time, oper->cycle_end_time)? Doesn't
> the precise correction value make a difference?
> 

Negative correction and its calculation is a separate problem handled in 
different patch.

My intention is to highlight a specific issue and address it with a single 
patch. The core problem stemmed from the new admin schedule not making an 
immediate transition in advance_sched().

I'll rework this patch to focus specifically on resolving this problem 
while gradually aligning with the overall series. Importantly, I won't be 
removing anything from this patch in the process.

Is that okay ?

>> +	}
>>   
>>   	hrtimer_start(&q->advance_timer, start, HRTIMER_MODE_ABS);
>>   }
>> -- 
>> 2.25.1
>>
>
diff mbox series

Patch

diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index f18a5fe12f0c..01b114edec30 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -1379,14 +1379,19 @@  static void setup_first_end_time(struct taprio_sched *q,
 }
 
 static void taprio_start_sched(struct Qdisc *sch,
-			       ktime_t start, struct sched_gate_list *new)
+			       ktime_t new_base_time,
+			       struct sched_gate_list *new)
 {
 	struct taprio_sched *q = qdisc_priv(sch);
-	ktime_t expires;
+	struct sched_gate_list *oper = NULL;
+	ktime_t expires, start;
 
 	if (FULL_OFFLOAD_IS_ENABLED(q->flags))
 		return;
 
+	oper = rcu_dereference_protected(q->oper_sched,
+					 lockdep_is_held(&q->current_entry_lock));
+
 	expires = hrtimer_get_expires(&q->advance_timer);
 	if (expires == 0)
 		expires = KTIME_MAX;
@@ -1395,7 +1400,17 @@  static void taprio_start_sched(struct Qdisc *sch,
 	 * reprogram it to the earliest one, so we change the admin
 	 * schedule to the operational one at the right time.
 	 */
-	start = min_t(ktime_t, start, expires);
+	start = min_t(ktime_t, new_base_time, expires);
+
+	if (expires != KTIME_MAX &&
+	    ktime_compare(start, new_base_time) == 0) {
+		/* Since timer was changed to align to the new admin schedule,
+		 * setting the variable below to a non-initialized value will
+		 * indicate to advance_sched() to call switch_schedules() after
+		 * this timer expires.
+		 */
+		oper->cycle_time_correction = 0;
+	}
 
 	hrtimer_start(&q->advance_timer, start, HRTIMER_MODE_ABS);
 }