Message ID | 20231107112023.676016-2-faizal.abdul.rahim@linux.intel.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | qbv cycle time extension/truncation | expand |
On Tue, Nov 07, 2023 at 06:20:17AM -0500, Faizal Rahim wrote: > In the current taprio code for dynamic schedule change, > admin/oper schedule switching happens immediately when > should_change_schedules() is true. Then the last entry of > the old admin schedule stops being valid anymore from > taprio_dequeue_from_txq’s perspective. Admittedly, I may have become a bit detached from this code base in the past months, but I don't understand the reasoning here. Could you please explain what makes the last entry of the old admin schedule be invalid from taprio_dequeue_from_txq()'s perspective? What I see is that when should_change_schedules() is true, we change q->oper_sched and q->admin_sched through the switch_schedules() call, but we don't change q->current_entry, so I fail to understand the connection you are implying. On the other hand (and I see I did mention this in the other thread), it seems that taprio_skb_exceeds_queue_max_sdu() - called from the enqueue() path - looks at q->oper_sched, and that's a valid reason why we'd want to delay the schedule switch until admin's actual base time, rather than the current oper's cycle_end_time. But please, let's spare no expense in providing a proper problem description, justification for the change and Fixes: tag. This is not optional. > To solve this, we have to delay the switch_schedules() call via > the new cycle_time_correction variable. The variable serves 2 > purposes: > 1. Upon entering advance_sched(), if the value is set to a > non-initialized value, it indicates that we need to change > schedule. > 2. Store the cycle time correction value which will be used for > negative or positive correction. It needs to be stated much more clearly that only purpose 1 is relevant here (I would even go as far as to omit its secondary purpose here). The only reason we are using the correction variable is because it happens that we'll need that in later changes. > > Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule") I believe that since the only observable problem has to do with taprio_skb_exceeds_queue_max_sdu(), the Fixes: tag should be the commit which added that logic. Which is: Fixes: a878fd46fe43 ("net/sched: keep the max_frm_len information inside struct sched_gate_list") > Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> > --- > net/sched/sch_taprio.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c > index 2e1949de4171..dee103647823 100644 > --- a/net/sched/sch_taprio.c > +++ b/net/sched/sch_taprio.c > @@ -41,6 +41,7 @@ static struct static_key_false taprio_have_working_mqprio; > #define TXTIME_ASSIST_IS_ENABLED(flags) ((flags) & TCA_TAPRIO_ATTR_FLAG_TXTIME_ASSIST) > #define FULL_OFFLOAD_IS_ENABLED(flags) ((flags) & TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD) > #define TAPRIO_FLAGS_INVALID U32_MAX > +#define INIT_CYCLE_TIME_CORRECTION S64_MIN I would prefer naming it CYCLE_TIME_CORRECTION_INVALID or _UNSPEC. It is not just used as the "initial" value. > > struct sched_entry { > /* Durations between this GCL entry and the GCL entry where the > @@ -75,6 +76,7 @@ struct sched_gate_list { > ktime_t cycle_end_time; > s64 cycle_time; > s64 cycle_time_extension; > + s64 cycle_time_correction; > s64 base_time; > }; > > @@ -940,8 +942,10 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer) > admin = rcu_dereference_protected(q->admin_sched, > lockdep_is_held(&q->current_entry_lock)); > > - if (!oper) > + if (!oper || oper->cycle_time_correction != INIT_CYCLE_TIME_CORRECTION) { You could introduce even as early as this change a "static bool sched_switch_pending(struct sched_gate_list *oper)" function, which incorporates the entire body of this "if" expression. > + oper->cycle_time_correction = INIT_CYCLE_TIME_CORRECTION; > switch_schedules(q, &admin, &oper); > + } > > /* This can happen in two cases: 1. this is the very first run > * of this function (i.e. we weren't running any schedule > @@ -981,7 +985,7 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer) > * schedule runs. > */ > end_time = sched_base_time(admin); > - switch_schedules(q, &admin, &oper); > + oper->cycle_time_correction = 0; > } > > next->end_time = end_time; > @@ -1174,6 +1178,7 @@ static int parse_taprio_schedule(struct taprio_sched *q, struct nlattr **tb, > } > > taprio_calculate_gate_durations(q, new); > + new->cycle_time_correction = INIT_CYCLE_TIME_CORRECTION; > > return 0; > } > -- > 2.25.1 >
On Tue, Nov 07, 2023 at 06:20:17AM -0500, Faizal Rahim wrote: > In the current taprio code for dynamic schedule change, > admin/oper schedule switching happens immediately when > should_change_schedules() is true. Then the last entry of > the old admin schedule stops being valid anymore from > taprio_dequeue_from_txq’s perspective. > > To solve this, we have to delay the switch_schedules() call via > the new cycle_time_correction variable. The variable serves 2 > purposes: > 1. Upon entering advance_sched(), if the value is set to a > non-initialized value, it indicates that we need to change > schedule. > 2. Store the cycle time correction value which will be used for > negative or positive correction. > > Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule") > Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> > --- > net/sched/sch_taprio.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c > index 2e1949de4171..dee103647823 100644 > --- a/net/sched/sch_taprio.c > +++ b/net/sched/sch_taprio.c > @@ -41,6 +41,7 @@ static struct static_key_false taprio_have_working_mqprio; > #define TXTIME_ASSIST_IS_ENABLED(flags) ((flags) & TCA_TAPRIO_ATTR_FLAG_TXTIME_ASSIST) > #define FULL_OFFLOAD_IS_ENABLED(flags) ((flags) & TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD) > #define TAPRIO_FLAGS_INVALID U32_MAX > +#define INIT_CYCLE_TIME_CORRECTION S64_MIN > > struct sched_entry { > /* Durations between this GCL entry and the GCL entry where the > @@ -75,6 +76,7 @@ struct sched_gate_list { > ktime_t cycle_end_time; > s64 cycle_time; > s64 cycle_time_extension; > + s64 cycle_time_correction; > s64 base_time; > }; > > @@ -940,8 +942,10 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer) > admin = rcu_dereference_protected(q->admin_sched, > lockdep_is_held(&q->current_entry_lock)); > > - if (!oper) > + if (!oper || oper->cycle_time_correction != INIT_CYCLE_TIME_CORRECTION) { Hi Faizal, The first condition above expects that oper may be NULL, but the line below dereferences it unconditionally. This doesn't seem correct, one way or the other. As flagged by Smatch and Coccinelle. > + oper->cycle_time_correction = INIT_CYCLE_TIME_CORRECTION; > switch_schedules(q, &admin, &oper); > + } > > /* This can happen in two cases: 1. this is the very first run > * of this function (i.e. we weren't running any schedule ...
On 9/11/2023 6:27 am, Vladimir Oltean wrote: > On Tue, Nov 07, 2023 at 06:20:17AM -0500, Faizal Rahim wrote: >> In the current taprio code for dynamic schedule change, >> admin/oper schedule switching happens immediately when >> should_change_schedules() is true. Then the last entry of >> the old admin schedule stops being valid anymore from >> taprio_dequeue_from_txq’s perspective. > > Admittedly, I may have become a bit detached from this code base in the > past months, but I don't understand the reasoning here. > > Could you please explain what makes the last entry of the old admin > schedule be invalid from taprio_dequeue_from_txq()'s perspective? > > What I see is that when should_change_schedules() is true, we change > q->oper_sched and q->admin_sched through the switch_schedules() call, > but we don't change q->current_entry, so I fail to understand the > connection you are implying. My bad – I used part of the explanation from the original author without thoroughly checking it. I have some guesses about taprio_dequeue_from_txq(), but they're not solid without more testing. So, I'll swap it with your suggestion below, which highlights the obvious issue. > On the other hand (and I see I did mention this in the other thread), > it seems that taprio_skb_exceeds_queue_max_sdu() - called from the > enqueue() path - looks at q->oper_sched, and that's a valid reason why > we'd want to delay the schedule switch until admin's actual base time, > rather than the current oper's cycle_end_time. Agree, observed this too. > But please, let's spare no expense in providing a proper problem > description, justification for the change and Fixes: tag. This is not > optional. > >> To solve this, we have to delay the switch_schedules() call via >> the new cycle_time_correction variable. The variable serves 2 >> purposes: >> 1. Upon entering advance_sched(), if the value is set to a >> non-initialized value, it indicates that we need to change >> schedule. >> 2. Store the cycle time correction value which will be used for >> negative or positive correction. > > It needs to be stated much more clearly that only purpose 1 is relevant > here (I would even go as far as to omit its secondary purpose here). > The only reason we are using the correction variable is because it > happens that we'll need that in later changes. Got it. > >> >> Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule") > > I believe that since the only observable problem has to do with > taprio_skb_exceeds_queue_max_sdu(), the Fixes: tag should be the commit > which added that logic. Which is: > > Fixes: a878fd46fe43 ("net/sched: keep the max_frm_len information inside struct sched_gate_list") Will replace the patch explanation with this. Thanks. >> Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> >> --- >> net/sched/sch_taprio.c | 9 +++++++-- >> 1 file changed, 7 insertions(+), 2 deletions(-) >> >> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c >> index 2e1949de4171..dee103647823 100644 >> --- a/net/sched/sch_taprio.c >> +++ b/net/sched/sch_taprio.c >> @@ -41,6 +41,7 @@ static struct static_key_false taprio_have_working_mqprio; >> #define TXTIME_ASSIST_IS_ENABLED(flags) ((flags) & TCA_TAPRIO_ATTR_FLAG_TXTIME_ASSIST) >> #define FULL_OFFLOAD_IS_ENABLED(flags) ((flags) & TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD) >> #define TAPRIO_FLAGS_INVALID U32_MAX >> +#define INIT_CYCLE_TIME_CORRECTION S64_MIN > > I would prefer naming it CYCLE_TIME_CORRECTION_INVALID or _UNSPEC. > It is not just used as the "initial" value. > >> >> struct sched_entry { >> /* Durations between this GCL entry and the GCL entry where the >> @@ -75,6 +76,7 @@ struct sched_gate_list { >> ktime_t cycle_end_time; >> s64 cycle_time; >> s64 cycle_time_extension; >> + s64 cycle_time_correction; >> s64 base_time; >> }; >> >> @@ -940,8 +942,10 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer) >> admin = rcu_dereference_protected(q->admin_sched, >> lockdep_is_held(&q->current_entry_lock)); >> >> - if (!oper) >> + if (!oper || oper->cycle_time_correction != INIT_CYCLE_TIME_CORRECTION) { > > You could introduce even as early as this change a "static bool > sched_switch_pending(struct sched_gate_list *oper)" function, which > incorporates the entire body of this "if" expression. > >> + oper->cycle_time_correction = INIT_CYCLE_TIME_CORRECTION; >> switch_schedules(q, &admin, &oper); >> + } >> >> /* This can happen in two cases: 1. this is the very first run >> * of this function (i.e. we weren't running any schedule >> @@ -981,7 +985,7 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer) >> * schedule runs. >> */ >> end_time = sched_base_time(admin); >> - switch_schedules(q, &admin, &oper); >> + oper->cycle_time_correction = 0; >> } >> >> next->end_time = end_time; >> @@ -1174,6 +1178,7 @@ static int parse_taprio_schedule(struct taprio_sched *q, struct nlattr **tb, >> } >> >> taprio_calculate_gate_durations(q, new); >> + new->cycle_time_correction = INIT_CYCLE_TIME_CORRECTION; >> >> return 0; >> } >> -- >> 2.25.1 >>
On 12/11/2023 6:31 pm, Simon Horman wrote: > On Tue, Nov 07, 2023 at 06:20:17AM -0500, Faizal Rahim wrote: >> In the current taprio code for dynamic schedule change, >> admin/oper schedule switching happens immediately when >> should_change_schedules() is true. Then the last entry of >> the old admin schedule stops being valid anymore from >> taprio_dequeue_from_txq’s perspective. >> >> To solve this, we have to delay the switch_schedules() call via >> the new cycle_time_correction variable. The variable serves 2 >> purposes: >> 1. Upon entering advance_sched(), if the value is set to a >> non-initialized value, it indicates that we need to change >> schedule. >> 2. Store the cycle time correction value which will be used for >> negative or positive correction. >> >> Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule") >> Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> >> --- >> net/sched/sch_taprio.c | 9 +++++++-- >> 1 file changed, 7 insertions(+), 2 deletions(-) >> >> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c >> index 2e1949de4171..dee103647823 100644 >> --- a/net/sched/sch_taprio.c >> +++ b/net/sched/sch_taprio.c >> @@ -41,6 +41,7 @@ static struct static_key_false taprio_have_working_mqprio; >> #define TXTIME_ASSIST_IS_ENABLED(flags) ((flags) & TCA_TAPRIO_ATTR_FLAG_TXTIME_ASSIST) >> #define FULL_OFFLOAD_IS_ENABLED(flags) ((flags) & TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD) >> #define TAPRIO_FLAGS_INVALID U32_MAX >> +#define INIT_CYCLE_TIME_CORRECTION S64_MIN >> >> struct sched_entry { >> /* Durations between this GCL entry and the GCL entry where the >> @@ -75,6 +76,7 @@ struct sched_gate_list { >> ktime_t cycle_end_time; >> s64 cycle_time; >> s64 cycle_time_extension; >> + s64 cycle_time_correction; >> s64 base_time; >> }; >> >> @@ -940,8 +942,10 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer) >> admin = rcu_dereference_protected(q->admin_sched, >> lockdep_is_held(&q->current_entry_lock)); >> >> - if (!oper) >> + if (!oper || oper->cycle_time_correction != INIT_CYCLE_TIME_CORRECTION) { > > Hi Faizal, > > The first condition above expects that oper may be NULL, but the line below > dereferences it unconditionally. This doesn't seem correct, one way or the > other. > Thanks for pointing this out, sorry for this blunder. Will update. > As flagged by Smatch and Coccinelle. > >> + oper->cycle_time_correction = INIT_CYCLE_TIME_CORRECTION; >> switch_schedules(q, &admin, &oper); >> + } >> >> /* This can happen in two cases: 1. this is the very first run >> * of this function (i.e. we weren't running any schedule > > ... >
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c index 2e1949de4171..dee103647823 100644 --- a/net/sched/sch_taprio.c +++ b/net/sched/sch_taprio.c @@ -41,6 +41,7 @@ static struct static_key_false taprio_have_working_mqprio; #define TXTIME_ASSIST_IS_ENABLED(flags) ((flags) & TCA_TAPRIO_ATTR_FLAG_TXTIME_ASSIST) #define FULL_OFFLOAD_IS_ENABLED(flags) ((flags) & TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD) #define TAPRIO_FLAGS_INVALID U32_MAX +#define INIT_CYCLE_TIME_CORRECTION S64_MIN struct sched_entry { /* Durations between this GCL entry and the GCL entry where the @@ -75,6 +76,7 @@ struct sched_gate_list { ktime_t cycle_end_time; s64 cycle_time; s64 cycle_time_extension; + s64 cycle_time_correction; s64 base_time; }; @@ -940,8 +942,10 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer) admin = rcu_dereference_protected(q->admin_sched, lockdep_is_held(&q->current_entry_lock)); - if (!oper) + if (!oper || oper->cycle_time_correction != INIT_CYCLE_TIME_CORRECTION) { + oper->cycle_time_correction = INIT_CYCLE_TIME_CORRECTION; switch_schedules(q, &admin, &oper); + } /* This can happen in two cases: 1. this is the very first run * of this function (i.e. we weren't running any schedule @@ -981,7 +985,7 @@ static enum hrtimer_restart advance_sched(struct hrtimer *timer) * schedule runs. */ end_time = sched_base_time(admin); - switch_schedules(q, &admin, &oper); + oper->cycle_time_correction = 0; } next->end_time = end_time; @@ -1174,6 +1178,7 @@ static int parse_taprio_schedule(struct taprio_sched *q, struct nlattr **tb, } taprio_calculate_gate_durations(q, new); + new->cycle_time_correction = INIT_CYCLE_TIME_CORRECTION; return 0; }
In the current taprio code for dynamic schedule change, admin/oper schedule switching happens immediately when should_change_schedules() is true. Then the last entry of the old admin schedule stops being valid anymore from taprio_dequeue_from_txq’s perspective. To solve this, we have to delay the switch_schedules() call via the new cycle_time_correction variable. The variable serves 2 purposes: 1. Upon entering advance_sched(), if the value is set to a non-initialized value, it indicates that we need to change schedule. 2. Store the cycle time correction value which will be used for negative or positive correction. Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule") Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> --- net/sched/sch_taprio.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)