Message ID | 20211014175918.60188-3-eric.dumazet@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | e72aeb9ee0e34c57dc90793d0bf82cab9624d64e |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net/sched: implement L4S style ce_threshold_ect1 marking | expand |
Eric Dumazet <eric.dumazet@gmail.com> writes: > From: Eric Dumazet <edumazet@google.com> > > Add TCA_FQ_CODEL_CE_THRESHOLD_ECT1 boolean option to select Low Latency, > Low Loss, Scalable Throughput (L4S) style marking, along with ce_threshold. > > If enabled, only packets with ECT(1) can be transformed to CE > if their sojourn time is above the ce_threshold. > > Note that this new option does not change rules for codel law. > In particular, if TCA_FQ_CODEL_ECN is left enabled (this is > the default when fq_codel qdisc is created), ECT(0) packets can > still get CE if codel law (as governed by limit/target) decides so. The ability to have certain packets receive a shallow marking threshold and others regular ECN semantics is no doubt useful. However, given that it is by no means certain how the L4S experiment will pan out (and I for one remain sceptical that the real-world benefits will turn out to match the tech demos), I think it's premature to bake the ECT(1) semantics into UAPI. So how about tying this behaviour to a configurable skb->mark instead? That way users can get the shallow marking behaviour for any subset of packets they want, simply by installing a suitable filter on the qdisc... -Toke
On Thu, Oct 14, 2021 at 12:54 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Eric Dumazet <eric.dumazet@gmail.com> writes: > > > From: Eric Dumazet <edumazet@google.com> > > > > Add TCA_FQ_CODEL_CE_THRESHOLD_ECT1 boolean option to select Low Latency, > > Low Loss, Scalable Throughput (L4S) style marking, along with ce_threshold. > > > > If enabled, only packets with ECT(1) can be transformed to CE > > if their sojourn time is above the ce_threshold. > > > > Note that this new option does not change rules for codel law. > > In particular, if TCA_FQ_CODEL_ECN is left enabled (this is > > the default when fq_codel qdisc is created), ECT(0) packets can > > still get CE if codel law (as governed by limit/target) decides so. > > The ability to have certain packets receive a shallow marking threshold > and others regular ECN semantics is no doubt useful. However, given that > it is by no means certain how the L4S experiment will pan out (and I for > one remain sceptical that the real-world benefits will turn out to match > the tech demos), I think it's premature to bake the ECT(1) semantics > into UAPI. Chicken and egg problem. We had fq_codel in linux kernel years before RFC after all :) > > So how about tying this behaviour to a configurable skb->mark instead? > That way users can get the shallow marking behaviour for any subset of > packets they want, simply by installing a suitable filter on the > qdisc... This seems an idea, but do you really expect users installing a sophisticated filter ? Please provide more details, and cost analysis. (Having to install a filter is probably more expensive than testing a boolean, after the sojourn time has exceeded the threshold) Given that INET_ECN_set_ce(skb) only operates on ECT(1) and ECT(0), I guess we could use a bitmask of two bits so that users can decide which code points can become CE.
Eric Dumazet <edumazet@google.com> writes: > On Thu, Oct 14, 2021 at 12:54 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Eric Dumazet <eric.dumazet@gmail.com> writes: >> >> > From: Eric Dumazet <edumazet@google.com> >> > >> > Add TCA_FQ_CODEL_CE_THRESHOLD_ECT1 boolean option to select Low Latency, >> > Low Loss, Scalable Throughput (L4S) style marking, along with ce_threshold. >> > >> > If enabled, only packets with ECT(1) can be transformed to CE >> > if their sojourn time is above the ce_threshold. >> > >> > Note that this new option does not change rules for codel law. >> > In particular, if TCA_FQ_CODEL_ECN is left enabled (this is >> > the default when fq_codel qdisc is created), ECT(0) packets can >> > still get CE if codel law (as governed by limit/target) decides so. >> >> The ability to have certain packets receive a shallow marking threshold >> and others regular ECN semantics is no doubt useful. However, given that >> it is by no means certain how the L4S experiment will pan out (and I for >> one remain sceptical that the real-world benefits will turn out to match >> the tech demos), I think it's premature to bake the ECT(1) semantics >> into UAPI. > > Chicken and egg problem. > We had fq_codel in linux kernel years before RFC after all :) Sure, but fq_codel is a self-contained algorithm, it doesn't add new meanings to bits of the IP header... :) >> So how about tying this behaviour to a configurable skb->mark instead? >> That way users can get the shallow marking behaviour for any subset of >> packets they want, simply by installing a suitable filter on the >> qdisc... > > This seems an idea, but do you really expect users installing a sophisticated > filter ? Please provide more details, and cost analysis. Not sure it's that sophisticated; pretty simple to do with tc-u32 (although it's complicated a bit by having to restore the default hashing behaviour of fq_codel with a second filter). Something like: # tc qdisc replace dev $DEV handle 1: fq_codel # tc filter add dev $DEV parent 1: pref 1 protocol ipv6 u32 match u32 00100000 00100000 action skbedit mark 2 continue # tc filter add dev $DEV parent 1: pref 2 protocol ip u32 match ip dsfield 1 1 action skbedit mark 2 continue # tc filter add dev $DEV parent 1: handle 1 pref 3 protocol all flow hash keys src,dst,proto,proto-src,proto-dst divisor 1024 or one could write a single BPF program that combines all three to save some cycles walking the filter chain. > (Having to install a filter is probably more expensive than testing a > boolean, after the sojourn time has exceeded the threshold) No doubt, all other things being equal. But odds are they're not: if you're already running a BPF filter somewhere in the path, adding the logic above to an existing filter reduces it back down to a couple of boolean comparisons, for instance. But even if it does add a bit of overhead, IMO the flexibility makes up for this. We can always revisit it if L4S becomes a standards-track RFC at some point :) > Given that INET_ECN_set_ce(skb) only operates on ECT(1) and ECT(0), > I guess we could use a bitmask of two bits so that users can decide > which code points can become CE. That would be an improvement. But if we're doing bitmasks, and since the code is reading the whole dsfield anyway, why not extend that bitmask to the whole dsfield? -Toke
Eric, Because the threshold is in time units, I suggest the condition for exceeding it needs to be AND'd with (*backlog > mtu), otherwise you can get 100% solid marking at low link rates. When ce_threshold is for DCs, low link rates are unlikely. However, given ce_threshold_ect1 is mainly for the Internet, during testing with 1ms threshold we encountered solid marking at low link rates, so we had to add a 1 packet floor: https://bobbriscoe.net/projects/latency/dctth_journal_draft20190726.pdf Sorry to chime in after your patch went to net-next. Bob On 14/10/2021 18:59, Eric Dumazet wrote: > From: Eric Dumazet <edumazet@google.com> > > Add TCA_FQ_CODEL_CE_THRESHOLD_ECT1 boolean option to select Low Latency, > Low Loss, Scalable Throughput (L4S) style marking, along with ce_threshold. > > If enabled, only packets with ECT(1) can be transformed to CE > if their sojourn time is above the ce_threshold. > > Note that this new option does not change rules for codel law. > In particular, if TCA_FQ_CODEL_ECN is left enabled (this is > the default when fq_codel qdisc is created), ECT(0) packets can > still get CE if codel law (as governed by limit/target) decides so. > > Section 4.3.b of current draft [1] states: > > b. A scheduler with per-flow queues such as FQ-CoDel or FQ-PIE can > be used for L4S. For instance within each queue of an FQ-CoDel > system, as well as a CoDel AQM, there is typically also ECN > marking at an immediate (unsmoothed) shallow threshold to support > use in data centres (see Sec.5.2.7 of [RFC8290]). This can be > modified so that the shallow threshold is solely applied to > ECT(1) packets. Then if there is a flow of non-ECN or ECT(0) > packets in the per-flow-queue, the Classic AQM (e.g. CoDel) is > applied; while if there is a flow of ECT(1) packets in the queue, > the shallower (typically sub-millisecond) threshold is applied. > > Tested: > > tc qd replace dev eth1 root fq_codel ce_threshold_ect1 50usec > > netperf ... -t TCP_STREAM -- K dctcp > > tc -s -d qd sh dev eth1 > qdisc fq_codel 8022: root refcnt 32 limit 10240p flows 1024 quantum 9212 target 5ms ce_threshold_ect1 49us interval 100ms memory_limit 32Mb ecn drop_batch 64 > Sent 14388596616 bytes 9543449 pkt (dropped 0, overlimits 0 requeues 152013) > backlog 0b 0p requeues 152013 > maxpacket 68130 drop_overlimit 0 new_flow_count 95678 ecn_mark 0 ce_mark 7639 > new_flows_len 0 old_flows_len 0 > > [1] L4S current draft: > https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-l4s-arch > > Signed-off-by: Eric Dumazet <edumazet@google.com> > Cc: Neal Cardwell <ncardwell@google.com> > Cc: Ingemar Johansson S <ingemar.s.johansson@ericsson.com> > Cc: Tom Henderson <tomh@tomh.org> > Cc: Bob Briscoe <in@bobbriscoe.net> > --- > include/net/codel.h | 2 ++ > include/net/codel_impl.h | 18 +++++++++++++++--- > include/uapi/linux/pkt_sched.h | 1 + > net/mac80211/sta_info.c | 1 + > net/sched/sch_fq_codel.c | 15 +++++++++++---- > 5 files changed, 30 insertions(+), 7 deletions(-) > > diff --git a/include/net/codel.h b/include/net/codel.h > index a6e428f801350809322aaff08d92904e059c3b5a..5e8b181b76b829d6af3c57809d9bc5f0578dd112 100644 > --- a/include/net/codel.h > +++ b/include/net/codel.h > @@ -102,6 +102,7 @@ static inline u32 codel_time_to_us(codel_time_t val) > * @interval: width of moving time window > * @mtu: device mtu, or minimal queue backlog in bytes. > * @ecn: is Explicit Congestion Notification enabled > + * @ce_threshold_ect1: if ce_threshold only marks ECT(1) packets > */ > struct codel_params { > codel_time_t target; > @@ -109,6 +110,7 @@ struct codel_params { > codel_time_t interval; > u32 mtu; > bool ecn; > + bool ce_threshold_ect1; > }; > > /** > diff --git a/include/net/codel_impl.h b/include/net/codel_impl.h > index d289b91dcd65ecdc96dc0c9bf85d4a4be6961022..7af2c3eb3c43c24364519120aad5be77522854a6 100644 > --- a/include/net/codel_impl.h > +++ b/include/net/codel_impl.h > @@ -54,6 +54,7 @@ static void codel_params_init(struct codel_params *params) > params->interval = MS2TIME(100); > params->target = MS2TIME(5); > params->ce_threshold = CODEL_DISABLED_THRESHOLD; > + params->ce_threshold_ect1 = false; > params->ecn = false; > } > > @@ -246,9 +247,20 @@ static struct sk_buff *codel_dequeue(void *ctx, > vars->rec_inv_sqrt); > } > end: > - if (skb && codel_time_after(vars->ldelay, params->ce_threshold) && > - INET_ECN_set_ce(skb)) > - stats->ce_mark++; > + if (skb && codel_time_after(vars->ldelay, params->ce_threshold)) { > + bool set_ce = true; > + > + if (params->ce_threshold_ect1) { > + /* Note: if skb_get_dsfield() returns -1, following > + * gives INET_ECN_MASK, which is != INET_ECN_ECT_1. > + */ > + u8 ecn = skb_get_dsfield(skb) & INET_ECN_MASK; > + > + set_ce = (ecn == INET_ECN_ECT_1); > + } > + if (set_ce && INET_ECN_set_ce(skb)) > + stats->ce_mark++; > + } > return skb; > } > > diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h > index ec88590b3198441f18cc9def7bd40c48f0bc82a1..6be9a84cccfa79bace1f3f7123d02f484b67a25e 100644 > --- a/include/uapi/linux/pkt_sched.h > +++ b/include/uapi/linux/pkt_sched.h > @@ -840,6 +840,7 @@ enum { > TCA_FQ_CODEL_CE_THRESHOLD, > TCA_FQ_CODEL_DROP_BATCH_SIZE, > TCA_FQ_CODEL_MEMORY_LIMIT, > + TCA_FQ_CODEL_CE_THRESHOLD_ECT1, > __TCA_FQ_CODEL_MAX > }; > > diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c > index 2b5acb37587f7068e2d11fe842ec963a556f1eb1..a39830418434d4bb74d238373f63a4858230fce5 100644 > --- a/net/mac80211/sta_info.c > +++ b/net/mac80211/sta_info.c > @@ -513,6 +513,7 @@ struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata, > sta->cparams.target = MS2TIME(20); > sta->cparams.interval = MS2TIME(100); > sta->cparams.ecn = true; > + sta->cparams.ce_threshold_ect1 = false; > > sta_dbg(sdata, "Allocated STA %pM\n", sta->sta.addr); > > diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c > index bb0cd6d3d2c2749d54e26368fb2558beedea85c9..033d65d06eb136ff704cddd3ee950a5c3a5d9831 100644 > --- a/net/sched/sch_fq_codel.c > +++ b/net/sched/sch_fq_codel.c > @@ -362,6 +362,7 @@ static const struct nla_policy fq_codel_policy[TCA_FQ_CODEL_MAX + 1] = { > [TCA_FQ_CODEL_CE_THRESHOLD] = { .type = NLA_U32 }, > [TCA_FQ_CODEL_DROP_BATCH_SIZE] = { .type = NLA_U32 }, > [TCA_FQ_CODEL_MEMORY_LIMIT] = { .type = NLA_U32 }, > + [TCA_FQ_CODEL_CE_THRESHOLD_ECT1] = { .type = NLA_U8 }, > }; > > static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt, > @@ -408,6 +409,9 @@ static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt, > q->cparams.ce_threshold = (val * NSEC_PER_USEC) >> CODEL_SHIFT; > } > > + if (tb[TCA_FQ_CODEL_CE_THRESHOLD_ECT1]) > + q->cparams.ce_threshold_ect1 = !!nla_get_u8(tb[TCA_FQ_CODEL_CE_THRESHOLD_ECT1]); > + > if (tb[TCA_FQ_CODEL_INTERVAL]) { > u64 interval = nla_get_u32(tb[TCA_FQ_CODEL_INTERVAL]); > > @@ -544,10 +548,13 @@ static int fq_codel_dump(struct Qdisc *sch, struct sk_buff *skb) > q->flows_cnt)) > goto nla_put_failure; > > - if (q->cparams.ce_threshold != CODEL_DISABLED_THRESHOLD && > - nla_put_u32(skb, TCA_FQ_CODEL_CE_THRESHOLD, > - codel_time_to_us(q->cparams.ce_threshold))) > - goto nla_put_failure; > + if (q->cparams.ce_threshold != CODEL_DISABLED_THRESHOLD) { > + if (nla_put_u32(skb, TCA_FQ_CODEL_CE_THRESHOLD, > + codel_time_to_us(q->cparams.ce_threshold))) > + goto nla_put_failure; > + if (nla_put_u8(skb, TCA_FQ_CODEL_CE_THRESHOLD_ECT1, q->cparams.ce_threshold_ect1)) > + goto nla_put_failure; > + } > > return nla_nest_end(skb, opts); >
On Fri, Oct 15, 2021 at 5:59 AM Bob Briscoe <ietf@bobbriscoe.net> wrote: > > Eric, > > Because the threshold is in time units, I suggest the condition for > exceeding it needs to be AND'd with (*backlog > mtu), otherwise you can > get 100% solid marking at low link rates. > > When ce_threshold is for DCs, low link rates are unlikely. > However, given ce_threshold_ect1 is mainly for the Internet, during > testing with 1ms threshold we encountered solid marking at low link > rates, so we had to add a 1 packet floor: > https://bobbriscoe.net/projects/latency/dctth_journal_draft20190726.pdf > > Sorry to chime in after your patch went to net-next. > What you describe about a minimal backlog was already there with ce_threshold handling ? Or is it something exclusive to L4S ? This deserves a separate patch, if anything :) > > Bob > > > On 14/10/2021 18:59, Eric Dumazet wrote: > > From: Eric Dumazet <edumazet@google.com> > > > > Add TCA_FQ_CODEL_CE_THRESHOLD_ECT1 boolean option to select Low Latency, > > Low Loss, Scalable Throughput (L4S) style marking, along with ce_threshold. > > > > If enabled, only packets with ECT(1) can be transformed to CE > > if their sojourn time is above the ce_threshold. > > > > Note that this new option does not change rules for codel law. > > In particular, if TCA_FQ_CODEL_ECN is left enabled (this is > > the default when fq_codel qdisc is created), ECT(0) packets can > > still get CE if codel law (as governed by limit/target) decides so. > > > > Section 4.3.b of current draft [1] states: > > > > b. A scheduler with per-flow queues such as FQ-CoDel or FQ-PIE can > > be used for L4S. For instance within each queue of an FQ-CoDel > > system, as well as a CoDel AQM, there is typically also ECN > > marking at an immediate (unsmoothed) shallow threshold to support > > use in data centres (see Sec.5.2.7 of [RFC8290]). This can be > > modified so that the shallow threshold is solely applied to > > ECT(1) packets. Then if there is a flow of non-ECN or ECT(0) > > packets in the per-flow-queue, the Classic AQM (e.g. CoDel) is > > applied; while if there is a flow of ECT(1) packets in the queue, > > the shallower (typically sub-millisecond) threshold is applied. > > > > Tested: > > > > tc qd replace dev eth1 root fq_codel ce_threshold_ect1 50usec > > > > netperf ... -t TCP_STREAM -- K dctcp > > > > tc -s -d qd sh dev eth1 > > qdisc fq_codel 8022: root refcnt 32 limit 10240p flows 1024 quantum 9212 target 5ms ce_threshold_ect1 49us interval 100ms memory_limit 32Mb ecn drop_batch 64 > > Sent 14388596616 bytes 9543449 pkt (dropped 0, overlimits 0 requeues 152013) > > backlog 0b 0p requeues 152013 > > maxpacket 68130 drop_overlimit 0 new_flow_count 95678 ecn_mark 0 ce_mark 7639 > > new_flows_len 0 old_flows_len 0 > > > > [1] L4S current draft: > > https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-l4s-arch > > > > Signed-off-by: Eric Dumazet <edumazet@google.com> > > Cc: Neal Cardwell <ncardwell@google.com> > > Cc: Ingemar Johansson S <ingemar.s.johansson@ericsson.com> > > Cc: Tom Henderson <tomh@tomh.org> > > Cc: Bob Briscoe <in@bobbriscoe.net> > > --- > > include/net/codel.h | 2 ++ > > include/net/codel_impl.h | 18 +++++++++++++++--- > > include/uapi/linux/pkt_sched.h | 1 + > > net/mac80211/sta_info.c | 1 + > > net/sched/sch_fq_codel.c | 15 +++++++++++---- > > 5 files changed, 30 insertions(+), 7 deletions(-) > > > > diff --git a/include/net/codel.h b/include/net/codel.h > > index a6e428f801350809322aaff08d92904e059c3b5a..5e8b181b76b829d6af3c57809d9bc5f0578dd112 100644 > > --- a/include/net/codel.h > > +++ b/include/net/codel.h > > @@ -102,6 +102,7 @@ static inline u32 codel_time_to_us(codel_time_t val) > > * @interval: width of moving time window > > * @mtu: device mtu, or minimal queue backlog in bytes. > > * @ecn: is Explicit Congestion Notification enabled > > + * @ce_threshold_ect1: if ce_threshold only marks ECT(1) packets > > */ > > struct codel_params { > > codel_time_t target; > > @@ -109,6 +110,7 @@ struct codel_params { > > codel_time_t interval; > > u32 mtu; > > bool ecn; > > + bool ce_threshold_ect1; > > }; > > > > /** > > diff --git a/include/net/codel_impl.h b/include/net/codel_impl.h > > index d289b91dcd65ecdc96dc0c9bf85d4a4be6961022..7af2c3eb3c43c24364519120aad5be77522854a6 100644 > > --- a/include/net/codel_impl.h > > +++ b/include/net/codel_impl.h > > @@ -54,6 +54,7 @@ static void codel_params_init(struct codel_params *params) > > params->interval = MS2TIME(100); > > params->target = MS2TIME(5); > > params->ce_threshold = CODEL_DISABLED_THRESHOLD; > > + params->ce_threshold_ect1 = false; > > params->ecn = false; > > } > > > > @@ -246,9 +247,20 @@ static struct sk_buff *codel_dequeue(void *ctx, > > vars->rec_inv_sqrt); > > } > > end: > > - if (skb && codel_time_after(vars->ldelay, params->ce_threshold) && > > - INET_ECN_set_ce(skb)) > > - stats->ce_mark++; > > + if (skb && codel_time_after(vars->ldelay, params->ce_threshold)) { > > + bool set_ce = true; > > + > > + if (params->ce_threshold_ect1) { > > + /* Note: if skb_get_dsfield() returns -1, following > > + * gives INET_ECN_MASK, which is != INET_ECN_ECT_1. > > + */ > > + u8 ecn = skb_get_dsfield(skb) & INET_ECN_MASK; > > + > > + set_ce = (ecn == INET_ECN_ECT_1); > > + } > > + if (set_ce && INET_ECN_set_ce(skb)) > > + stats->ce_mark++; > > + } > > return skb; > > } > > > > diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h > > index ec88590b3198441f18cc9def7bd40c48f0bc82a1..6be9a84cccfa79bace1f3f7123d02f484b67a25e 100644 > > --- a/include/uapi/linux/pkt_sched.h > > +++ b/include/uapi/linux/pkt_sched.h > > @@ -840,6 +840,7 @@ enum { > > TCA_FQ_CODEL_CE_THRESHOLD, > > TCA_FQ_CODEL_DROP_BATCH_SIZE, > > TCA_FQ_CODEL_MEMORY_LIMIT, > > + TCA_FQ_CODEL_CE_THRESHOLD_ECT1, > > __TCA_FQ_CODEL_MAX > > }; > > > > diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c > > index 2b5acb37587f7068e2d11fe842ec963a556f1eb1..a39830418434d4bb74d238373f63a4858230fce5 100644 > > --- a/net/mac80211/sta_info.c > > +++ b/net/mac80211/sta_info.c > > @@ -513,6 +513,7 @@ struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata, > > sta->cparams.target = MS2TIME(20); > > sta->cparams.interval = MS2TIME(100); > > sta->cparams.ecn = true; > > + sta->cparams.ce_threshold_ect1 = false; > > > > sta_dbg(sdata, "Allocated STA %pM\n", sta->sta.addr); > > > > diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c > > index bb0cd6d3d2c2749d54e26368fb2558beedea85c9..033d65d06eb136ff704cddd3ee950a5c3a5d9831 100644 > > --- a/net/sched/sch_fq_codel.c > > +++ b/net/sched/sch_fq_codel.c > > @@ -362,6 +362,7 @@ static const struct nla_policy fq_codel_policy[TCA_FQ_CODEL_MAX + 1] = { > > [TCA_FQ_CODEL_CE_THRESHOLD] = { .type = NLA_U32 }, > > [TCA_FQ_CODEL_DROP_BATCH_SIZE] = { .type = NLA_U32 }, > > [TCA_FQ_CODEL_MEMORY_LIMIT] = { .type = NLA_U32 }, > > + [TCA_FQ_CODEL_CE_THRESHOLD_ECT1] = { .type = NLA_U8 }, > > }; > > > > static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt, > > @@ -408,6 +409,9 @@ static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt, > > q->cparams.ce_threshold = (val * NSEC_PER_USEC) >> CODEL_SHIFT; > > } > > > > + if (tb[TCA_FQ_CODEL_CE_THRESHOLD_ECT1]) > > + q->cparams.ce_threshold_ect1 = !!nla_get_u8(tb[TCA_FQ_CODEL_CE_THRESHOLD_ECT1]); > > + > > if (tb[TCA_FQ_CODEL_INTERVAL]) { > > u64 interval = nla_get_u32(tb[TCA_FQ_CODEL_INTERVAL]); > > > > @@ -544,10 +548,13 @@ static int fq_codel_dump(struct Qdisc *sch, struct sk_buff *skb) > > q->flows_cnt)) > > goto nla_put_failure; > > > > - if (q->cparams.ce_threshold != CODEL_DISABLED_THRESHOLD && > > - nla_put_u32(skb, TCA_FQ_CODEL_CE_THRESHOLD, > > - codel_time_to_us(q->cparams.ce_threshold))) > > - goto nla_put_failure; > > + if (q->cparams.ce_threshold != CODEL_DISABLED_THRESHOLD) { > > + if (nla_put_u32(skb, TCA_FQ_CODEL_CE_THRESHOLD, > > + codel_time_to_us(q->cparams.ce_threshold))) > > + goto nla_put_failure; > > + if (nla_put_u8(skb, TCA_FQ_CODEL_CE_THRESHOLD_ECT1, q->cparams.ce_threshold_ect1)) > > + goto nla_put_failure; > > + } > > > > return nla_nest_end(skb, opts); > > > > -- > ________________________________________________________________ > Bob Briscoe http://bobbriscoe.net/ >
On Fri, Oct 15, 2021 at 10:08 AM Eric Dumazet <edumazet@google.com> wrote: > > On Fri, Oct 15, 2021 at 5:59 AM Bob Briscoe <ietf@bobbriscoe.net> wrote: > > > > Eric, > > > > Because the threshold is in time units, I suggest the condition for > > exceeding it needs to be AND'd with (*backlog > mtu), otherwise you can > > get 100% solid marking at low link rates. > > > > When ce_threshold is for DCs, low link rates are unlikely. > > However, given ce_threshold_ect1 is mainly for the Internet, during > > testing with 1ms threshold we encountered solid marking at low link > > rates, so we had to add a 1 packet floor: > > https://bobbriscoe.net/projects/latency/dctth_journal_draft20190726.pdf > > > > Sorry to chime in after your patch went to net-next. > > > > What you describe about a minimal backlog was already there with > ce_threshold handling ? For my education, do you have a pointer to where the ce_threshold marking logic has a minimum backlog size requirement in packets or bytes? AFAICT the ce_threshold marking in include/net/codel_impl.h happens regardless of the current size of the backlog. > Or is it something exclusive to L4S ? I don't think it's exclusive to L4S. I think Bob is raising a general issue about improving ECN marking based on ce_threshold. My interpretation of Bob's point is that there is sort of a quantization issue at very low link speeds, where the serialization delay for a packet is at or above the ce_threshold delay. In such cases it seems there can be behavior where the bottleneck marks every packet CE all the time, causing any ECN-based algorithm (even DCTCP) to suffer poor utilization. I suppose with a fixed-speed link the operator could adjust the ce_threshold based on the serialization delays implied by the link speed, but perhaps in general this is infeasible due to variable-speed (e.g., radio) links. I guess perhaps this could be reproduced/tested with DCTCP (using ECT(0)), a ce_threshold of 1ms (for ECT(0)), and an emulated bottleneck link speed with a serialization delay well above 1ms (so a link speed well below 12Mbps). > This deserves a separate patch, if anything :) Agreed, in the Linux development model this would make sense as a separate patch, since it is conceptually separate and there do not need to be any dependencies between the two changes. :-) neal
> On 15 Oct, 2021, at 2:24 am, Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >>>> Add TCA_FQ_CODEL_CE_THRESHOLD_ECT1 boolean option to select Low Latency, >>>> Low Loss, Scalable Throughput (L4S) style marking, along with ce_threshold. >>>> >>>> If enabled, only packets with ECT(1) can be transformed to CE >>>> if their sojourn time is above the ce_threshold. >>>> >>>> Note that this new option does not change rules for codel law. >>>> In particular, if TCA_FQ_CODEL_ECN is left enabled (this is >>>> the default when fq_codel qdisc is created), ECT(0) packets can >>>> still get CE if codel law (as governed by limit/target) decides so. >>> >>> The ability to have certain packets receive a shallow marking threshold >>> and others regular ECN semantics is no doubt useful. However, given that >>> it is by no means certain how the L4S experiment will pan out (and I for >>> one remain sceptical that the real-world benefits will turn out to match >>> the tech demos), I think it's premature to bake the ECT(1) semantics >>> into UAPI. >> >> Chicken and egg problem. >> We had fq_codel in linux kernel years before RFC after all :) > > Sure, but fq_codel is a self-contained algorithm, it doesn't add new > meanings to bits of the IP header... :) I'll be blunter: In its original (and currently stable) form, fq_codel is RFC-compliant. It conforms, in particular, to RFC-3168 (ECN). There's a relatively low threshold for adding RFC-compliant network algorithms to Linux, and it is certainly not required to have a published RFC specifically describing each qdisc's operating principles before it can be upstreamed. It just so happens that fq_codel (and some other notable algorithms such as CUBIC) proved sufficiently useful in practice to warrant post-hoc documentation in RFC form. However, this patch adds an option which, when enabled, makes fq_codel *non-compliant* with RFC-3168, specifically the requirement to treat ECT(0) and ECT(1) identically, unless conforming to another published RFC which permits different behaviour. There is a path via RFC-8311 to experiment with alternative ECN semantics in this way, but the way ECT(1) is used by L4S is specifically mentioned as requiring a published RFC for public deployments. The L4S Internet Drafts have *just failed* an IETF WGLC, which means they are *not* advancing to publication as RFCs in their current form. The primary reason for this failure is L4S' fundamental incompatibility with existing Internet traffic, despite its stated goal of general Internet deployment. It is my considered opinion, indeed, that moving *away* from ECT(1) as the L4S identifier is the best option for improving that compatibility. I believe there is a much higher threshold required for adding such things to publicly maintained versions of Linux (as opposed to privately maintained experimental versions). - Jonathan Morton
Eric, (thanks Neal) On 15/10/2021 16:49, Neal Cardwell wrote: > On Fri, Oct 15, 2021 at 10:08 AM Eric Dumazet <edumazet@google.com> wrote: >> On Fri, Oct 15, 2021 at 5:59 AM Bob Briscoe <ietf@bobbriscoe.net> wrote: >>> Eric, >>> >>> Because the threshold is in time units, I suggest the condition for >>> exceeding it needs to be AND'd with (*backlog > mtu), otherwise you can >>> get 100% solid marking at low link rates. >>> >>> When ce_threshold is for DCs, low link rates are unlikely. >>> However, given ce_threshold_ect1 is mainly for the Internet, during >>> testing with 1ms threshold we encountered solid marking at low link >>> rates, so we had to add a 1 packet floor: >>> https://bobbriscoe.net/projects/latency/dctth_journal_draft20190726.pdf >>> >>> Sorry to chime in after your patch went to net-next. >>> >> What you describe about a minimal backlog was already there with >> ce_threshold handling ? > For my education, do you have a pointer to where the ce_threshold > marking logic has a minimum backlog size requirement in packets or > bytes? AFAICT the ce_threshold marking in include/net/codel_impl.h > happens regardless of the current size of the backlog. [BB] When I checked before my original posting, the only check for single packet backlog was within should_drop() here: https://elixir.bootlin.com/linux/latest/source/include/net/codel_impl.h#L125 However, whether or not that causes should_drop() to return false, codel_dequeue() still always falls through to the ce_threshold marking after end: https://elixir.bootlin.com/linux/latest/source/include/net/codel_impl.h#L249 > >> Or is it something exclusive to L4S ? > I don't think it's exclusive to L4S. I think Bob is raising a general > issue about improving ECN marking based on ce_threshold. My > interpretation of Bob's point is that there is sort of a quantization > issue at very low link speeds, where the serialization delay for a > packet is at or above the ce_threshold delay. In such cases it seems > there can be behavior where the bottleneck marks every packet CE all > the time, causing any ECN-based algorithm (even DCTCP) to suffer poor > utilization. > > I suppose with a fixed-speed link the operator could adjust the > ce_threshold based on the serialization delays implied by the link > speed, but perhaps in general this is infeasible due to variable-speed > (e.g., radio) links. > > I guess perhaps this could be reproduced/tested with DCTCP (using > ECT(0)), a ce_threshold of 1ms (for ECT(0)), and an emulated > bottleneck link speed with a serialization delay well above 1ms (so a > link speed well below 12Mbps). [BB] Yes. > >> This deserves a separate patch, if anything :) > Agreed, in the Linux development model this would make sense as a > separate patch, since it is conceptually separate and there do not > need to be any dependencies between the two changes. :-) [BB] Sure. We'll see to it. The (loose/indirect) dependency I saw was just that ce_threshold_ect1 opens up the possibility of using the ce_threshold on the public Internet not just in DCs. So low rate links become a certainty, rather than a mere theoretical possibility. Bob > > neal
On 16/10/2021 08:39, Jonathan Morton wrote: >> On 15 Oct, 2021, at 2:24 am, Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >>>>> Add TCA_FQ_CODEL_CE_THRESHOLD_ECT1 boolean option to select Low Latency, >>>>> Low Loss, Scalable Throughput (L4S) style marking, along with ce_threshold. >>>>> >>>>> If enabled, only packets with ECT(1) can be transformed to CE >>>>> if their sojourn time is above the ce_threshold. >>>>> >>>>> Note that this new option does not change rules for codel law. >>>>> In particular, if TCA_FQ_CODEL_ECN is left enabled (this is >>>>> the default when fq_codel qdisc is created), ECT(0) packets can >>>>> still get CE if codel law (as governed by limit/target) decides so. >>>> The ability to have certain packets receive a shallow marking threshold >>>> and others regular ECN semantics is no doubt useful. However, given that >>>> it is by no means certain how the L4S experiment will pan out (and I for >>>> one remain sceptical that the real-world benefits will turn out to match >>>> the tech demos), I think it's premature to bake the ECT(1) semantics >>>> into UAPI. >>> Chicken and egg problem. >>> We had fq_codel in linux kernel years before RFC after all :) >> Sure, but fq_codel is a self-contained algorithm, it doesn't add new >> meanings to bits of the IP header... :) > I'll be blunter: > > In its original (and currently stable) form, fq_codel is RFC-compliant. It conforms, in particular, to RFC-3168 (ECN). There's a relatively low threshold for adding RFC-compliant network algorithms to Linux, and it is certainly not required to have a published RFC specifically describing each qdisc's operating principles before it can be upstreamed. It just so happens that fq_codel (and some other notable algorithms such as CUBIC) proved sufficiently useful in practice to warrant post-hoc documentation in RFC form. > > However, this patch adds an option which, when enabled, makes fq_codel *non-compliant* with RFC-3168, specifically the requirement to treat ECT(0) and ECT(1) identically, unless conforming to another published RFC which permits different behaviour. > > There is a path via RFC-8311 to experiment with alternative ECN semantics in this way, but the way ECT(1) is used by L4S is specifically mentioned as requiring a published RFC for public deployments. The L4S Internet Drafts have *just failed* an IETF WGLC, which means they are *not* advancing to publication as RFCs in their current form. [BB] Clarification of IETF process: A first Working Group Last Call (WGLC) is nearly always the beginning of the end of the IETF's RFC publication process. Usually the majority of detailed comments arrive during a WGLC. Then the draft has to be fixed, and then it goes either directly through to the next stage (in this case, an IETF-wide last call), or to another WGLC. > The primary reason for this failure is L4S' fundamental incompatibility with existing Internet traffic, despite its stated goal of general Internet deployment. [BB] s/The primary reason /JM's primary objection / There is no ranking of the reasons for more work being needed. The WG had already developed a way to mitigate this objection. Otherwise, a WGLC would not have been started in the first place. Further work on this issue is now more likely to be wordsmithing. I hope this level of brevity was useful for netdev. See tsvwg@ietf.org for details. Bob > It is my considered opinion, indeed, that moving *away* from ECT(1) as the L4S identifier is the best option for improving that compatibility. > > I believe there is a much higher threshold required for adding such things to publicly maintained versions of Linux (as opposed to privately maintained experimental versions). > > - Jonathan Morton
> On 17 Oct, 2021, at 2:22 pm, Bob Briscoe <ietf@bobbriscoe.net> wrote: > >> I'll be blunter: >> >> In its original (and currently stable) form, fq_codel is RFC-compliant. It conforms, in particular, to RFC-3168 (ECN). There's a relatively low threshold for adding RFC-compliant network algorithms to Linux, and it is certainly not required to have a published RFC specifically describing each qdisc's operating principles before it can be upstreamed. It just so happens that fq_codel (and some other notable algorithms such as CUBIC) proved sufficiently useful in practice to warrant post-hoc documentation in RFC form. >> >> However, this patch adds an option which, when enabled, makes fq_codel *non-compliant* with RFC-3168, specifically the requirement to treat ECT(0) and ECT(1) identically, unless conforming to another published RFC which permits different behaviour. >> >> There is a path via RFC-8311 to experiment with alternative ECN semantics in this way, but the way ECT(1) is used by L4S is specifically mentioned as requiring a published RFC for public deployments. The L4S Internet Drafts have *just failed* an IETF WGLC, which means they are *not* advancing to publication as RFCs in their current form. > > [BB] Clarification of IETF process: A first Working Group Last Call (WGLC) is nearly always the beginning of the end of the IETF's RFC publication process. Usually the majority of detailed comments arrive during a WGLC. Then the draft has to be fixed, and then it goes either directly through to the next stage (in this case, an IETF-wide last call), or to another WGLC. Further clarification: this is already the second WGLC for L4S. The one two years previously (at Montreal) yielded a number of major technical objections, which remained unresolved as of this latest WGLC. >> The primary reason for this failure is L4S' fundamental incompatibility with existing Internet traffic, despite its stated goal of general Internet deployment. > > [BB] s/The primary reason /JM's primary objection / > There is no ranking of the reasons for more work being needed. The WG had already developed a way to mitigate this objection. Otherwise, a WGLC would not have been started in the first place. Further work on this issue is now more likely to be wordsmithing. Given that the objections cited by the TSVWG Chairs were technical in nature, and related specifically to the incompatibility between L4S and existing conventional traffic, it is clear to me that wordsmithing will *not* be sufficient to render L4S publishable in RFC form, nor deployable at Internet scale. To quote David Black, one of the aforementioned Chairs and also an author of RFC-8311: > Two overall conclusions are that a) the WGLC has been productive, and shows significant continuing support for L4S, and b) the L4S drafts should be revised to address the WGLC concerns raised. The WG chairs strongly suggest that the revisions include limiting the scope and impact of initial L4S experiments on RFC 3168 functionality (both existing usage and potential deployment) to ensure that the L4S experiments are safe to perform on the Internet, paying particular attention to potential impacts on networks and users that are not participating in the L4S experiments. It is my recommendation to netdev to stay out of this ongoing mess, by rejecting this patch. - Jonathan Morton
Normally I would comment in-line on the patches but was somehow unsubscribed to netdev as of oct 11. I'm glad more eyeballs are on this, finally, however. 1) I would prefer this series bake in the l4s and bbrv2 trees a while, be tested on vms, containers, on cloudy substrates, and home routers. Note, I just said series. pie, fq_pie, and cake all can use rfc3168-style ecn. Worse, fq_codel is also used in the wifi stack, but not as a qdisc. I think a safer and simpler path forward for existing assumptions and callers for rfc3168 -style is that the existing INET_ECN_set_ce and related calls be reworked to exclude ect1. (as seen here) https://code.woboq.org/linux/linux/include/net/codel_impl.h.html#182 and the existing ce_threshold parameter enabled by default (at some acceptible threshold) but only on detecting ect_1. That eliminates the change to the uapi, and more correctly supports both standards. Not having the ce_threshold param exposed currently in wifi and not knowing how to make l4s-style signalling (at least vs tcp prague, maybe not against bbrv2), it seems safest to make ect_1 - > drop reno style on wifi, presently. 2) I don't know what is supposed to happen with GRO/GSO packets. This is actually a long standing confusion of mine in the present day linux stack - if a GRO packet (say an IW10 burst) is marked - do all the packets get the marking? or just one? Is it driver or hardware dependent? (software GRO is the bane of my existence) 3) My other long standing confusion is "triggered" by how we do statistics keeping for packets - ce_mark gets overloaded by this patch, and we end up tracking two very different instances of the same idea in the kernel statistics (much like the confusion along the path). Given the frequency of marking in the l4s case is much higher than the rfc3168 case, I'd advocate for a separate, 64 bit statistic. Please note that in general I find the "packets" stat in the kernel, given gso/gro, kind of hard to deal with in the first place, "bytes_marked" would be a better statistic to track than packets.
Your summary seems way off the mark to me, and I suggest you ask Wes Eddy for an actual summary of the current status ... Gorry > On 18 Oct 2021, at 16:43, Jonathan Morton <chromatix99@gmail.com> wrote: > > >> >>> On 17 Oct, 2021, at 2:22 pm, Bob Briscoe <ietf@bobbriscoe.net> wrote: >>> >>> I'll be blunter: >>> >>> In its original (and currently stable) form, fq_codel is RFC-compliant. It conforms, in particular, to RFC-3168 (ECN). There's a relatively low threshold for adding RFC-compliant network algorithms to Linux, and it is certainly not required to have a published RFC specifically describing each qdisc's operating principles before it can be upstreamed. It just so happens that fq_codel (and some other notable algorithms such as CUBIC) proved sufficiently useful in practice to warrant post-hoc documentation in RFC form. >>> >>> However, this patch adds an option which, when enabled, makes fq_codel *non-compliant* with RFC-3168, specifically the requirement to treat ECT(0) and ECT(1) identically, unless conforming to another published RFC which permits different behaviour. >>> >>> There is a path via RFC-8311 to experiment with alternative ECN semantics in this way, but the way ECT(1) is used by L4S is specifically mentioned as requiring a published RFC for public deployments. The L4S Internet Drafts have *just failed* an IETF WGLC, which means they are *not* advancing to publication as RFCs in their current form. >> >> [BB] Clarification of IETF process: A first Working Group Last Call (WGLC) is nearly always the beginning of the end of the IETF's RFC publication process. Usually the majority of detailed comments arrive during a WGLC. Then the draft has to be fixed, and then it goes either directly through to the next stage (in this case, an IETF-wide last call), or to another WGLC. > > Further clarification: this is already the second WGLC for L4S. The one two years previously (at Montreal) yielded a number of major technical objections, which remained unresolved as of this latest WGLC. > >>> The primary reason for this failure is L4S' fundamental incompatibility with existing Internet traffic, despite its stated goal of general Internet deployment. >> >> [BB] s/The primary reason /JM's primary objection / >> There is no ranking of the reasons for more work being needed. The WG had already developed a way to mitigate this objection. Otherwise, a WGLC would not have been started in the first place. Further work on this issue is now more likely to be wordsmithing. > > Given that the objections cited by the TSVWG Chairs were technical in nature, and related specifically to the incompatibility between L4S and existing conventional traffic, it is clear to me that wordsmithing will *not* be sufficient to render L4S publishable in RFC form, nor deployable at Internet scale. > > To quote David Black, one of the aforementioned Chairs and also an author of RFC-8311: > >> Two overall conclusions are that a) the WGLC has been productive, and shows significant continuing support for L4S, and b) the L4S drafts should be revised to address the WGLC concerns raised. The WG chairs strongly suggest that the revisions include limiting the scope and impact of initial L4S experiments on RFC 3168 functionality (both existing usage and potential deployment) to ensure that the L4S experiments are safe to perform on the Internet, paying particular attention to potential impacts on networks and users that are not participating in the L4S experiments. > > It is my recommendation to netdev to stay out of this ongoing mess, by rejecting this patch. > > - Jonathan Morton
diff --git a/include/net/codel.h b/include/net/codel.h index a6e428f801350809322aaff08d92904e059c3b5a..5e8b181b76b829d6af3c57809d9bc5f0578dd112 100644 --- a/include/net/codel.h +++ b/include/net/codel.h @@ -102,6 +102,7 @@ static inline u32 codel_time_to_us(codel_time_t val) * @interval: width of moving time window * @mtu: device mtu, or minimal queue backlog in bytes. * @ecn: is Explicit Congestion Notification enabled + * @ce_threshold_ect1: if ce_threshold only marks ECT(1) packets */ struct codel_params { codel_time_t target; @@ -109,6 +110,7 @@ struct codel_params { codel_time_t interval; u32 mtu; bool ecn; + bool ce_threshold_ect1; }; /** diff --git a/include/net/codel_impl.h b/include/net/codel_impl.h index d289b91dcd65ecdc96dc0c9bf85d4a4be6961022..7af2c3eb3c43c24364519120aad5be77522854a6 100644 --- a/include/net/codel_impl.h +++ b/include/net/codel_impl.h @@ -54,6 +54,7 @@ static void codel_params_init(struct codel_params *params) params->interval = MS2TIME(100); params->target = MS2TIME(5); params->ce_threshold = CODEL_DISABLED_THRESHOLD; + params->ce_threshold_ect1 = false; params->ecn = false; } @@ -246,9 +247,20 @@ static struct sk_buff *codel_dequeue(void *ctx, vars->rec_inv_sqrt); } end: - if (skb && codel_time_after(vars->ldelay, params->ce_threshold) && - INET_ECN_set_ce(skb)) - stats->ce_mark++; + if (skb && codel_time_after(vars->ldelay, params->ce_threshold)) { + bool set_ce = true; + + if (params->ce_threshold_ect1) { + /* Note: if skb_get_dsfield() returns -1, following + * gives INET_ECN_MASK, which is != INET_ECN_ECT_1. + */ + u8 ecn = skb_get_dsfield(skb) & INET_ECN_MASK; + + set_ce = (ecn == INET_ECN_ECT_1); + } + if (set_ce && INET_ECN_set_ce(skb)) + stats->ce_mark++; + } return skb; } diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h index ec88590b3198441f18cc9def7bd40c48f0bc82a1..6be9a84cccfa79bace1f3f7123d02f484b67a25e 100644 --- a/include/uapi/linux/pkt_sched.h +++ b/include/uapi/linux/pkt_sched.h @@ -840,6 +840,7 @@ enum { TCA_FQ_CODEL_CE_THRESHOLD, TCA_FQ_CODEL_DROP_BATCH_SIZE, TCA_FQ_CODEL_MEMORY_LIMIT, + TCA_FQ_CODEL_CE_THRESHOLD_ECT1, __TCA_FQ_CODEL_MAX }; diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c index 2b5acb37587f7068e2d11fe842ec963a556f1eb1..a39830418434d4bb74d238373f63a4858230fce5 100644 --- a/net/mac80211/sta_info.c +++ b/net/mac80211/sta_info.c @@ -513,6 +513,7 @@ struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata, sta->cparams.target = MS2TIME(20); sta->cparams.interval = MS2TIME(100); sta->cparams.ecn = true; + sta->cparams.ce_threshold_ect1 = false; sta_dbg(sdata, "Allocated STA %pM\n", sta->sta.addr); diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c index bb0cd6d3d2c2749d54e26368fb2558beedea85c9..033d65d06eb136ff704cddd3ee950a5c3a5d9831 100644 --- a/net/sched/sch_fq_codel.c +++ b/net/sched/sch_fq_codel.c @@ -362,6 +362,7 @@ static const struct nla_policy fq_codel_policy[TCA_FQ_CODEL_MAX + 1] = { [TCA_FQ_CODEL_CE_THRESHOLD] = { .type = NLA_U32 }, [TCA_FQ_CODEL_DROP_BATCH_SIZE] = { .type = NLA_U32 }, [TCA_FQ_CODEL_MEMORY_LIMIT] = { .type = NLA_U32 }, + [TCA_FQ_CODEL_CE_THRESHOLD_ECT1] = { .type = NLA_U8 }, }; static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt, @@ -408,6 +409,9 @@ static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt, q->cparams.ce_threshold = (val * NSEC_PER_USEC) >> CODEL_SHIFT; } + if (tb[TCA_FQ_CODEL_CE_THRESHOLD_ECT1]) + q->cparams.ce_threshold_ect1 = !!nla_get_u8(tb[TCA_FQ_CODEL_CE_THRESHOLD_ECT1]); + if (tb[TCA_FQ_CODEL_INTERVAL]) { u64 interval = nla_get_u32(tb[TCA_FQ_CODEL_INTERVAL]); @@ -544,10 +548,13 @@ static int fq_codel_dump(struct Qdisc *sch, struct sk_buff *skb) q->flows_cnt)) goto nla_put_failure; - if (q->cparams.ce_threshold != CODEL_DISABLED_THRESHOLD && - nla_put_u32(skb, TCA_FQ_CODEL_CE_THRESHOLD, - codel_time_to_us(q->cparams.ce_threshold))) - goto nla_put_failure; + if (q->cparams.ce_threshold != CODEL_DISABLED_THRESHOLD) { + if (nla_put_u32(skb, TCA_FQ_CODEL_CE_THRESHOLD, + codel_time_to_us(q->cparams.ce_threshold))) + goto nla_put_failure; + if (nla_put_u8(skb, TCA_FQ_CODEL_CE_THRESHOLD_ECT1, q->cparams.ce_threshold_ect1)) + goto nla_put_failure; + } return nla_nest_end(skb, opts);