Message ID | 20220505160357.298794-1-vladimir.oltean@nxp.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC,net] net/sched: taprio: account for L1 overhead when calculating transmit time | expand |
Hi Vladimir, Vladimir Oltean <vladimir.oltean@nxp.com> writes: > The taprio scheduler underestimates the packet transmission time, which > means that packets can be scheduled for transmission in time slots in > which they are never going to fit. > > When this function was added in commit 4cfd5779bd6e ("taprio: Add > support for txtime-assist mode"), the only implication was that time > triggered packets would overrun its time slot and eat from the next one, > because with txtime-assist there isn't really any emulation of a "gate > close" event that would stop a packet from being transmitted. > > However, commit b5b73b26b3ca ("taprio: Fix allowing too small > intervals") started using this function too, in all modes of operation > (software, txtime-assist and full offload). So we now accept time slots > which we know we won't be ever able to fulfill. > > It's difficult to say which issue is more pressing, I'd say both are > visible with testing, even though the second would be more obvious > because of a black&white result (trying to send small packets in an > insufficiently large window blocks the queue). > > Issue found through code inspection, the code was not even compile > tested. > > The L1 overhead chosen here is an approximation, because various network > equipment has configurable IFG, however I don't think Linux is aware of > this. When testing CBS, I remember using tc-stab: https://man7.org/linux/man-pages/man8/tc-stab.8.html To set the 'overhead' to some value. That value should be used in the calculation. I agree that it's not ideal, in the ideal world we would have a way to retrieve the link overhead from the netdevice. But I would think that it gets complicated really quickly when using netdevices that are not Ethernet-based. > > Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode") > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> > --- > net/sched/sch_taprio.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c > index b9c71a304d39..8c8681c37d4f 100644 > --- a/net/sched/sch_taprio.c > +++ b/net/sched/sch_taprio.c > @@ -176,7 +176,10 @@ static ktime_t get_interval_end_time(struct sched_gate_list *sched, > > static int length_to_duration(struct taprio_sched *q, int len) > { > - return div_u64(len * atomic64_read(&q->picos_per_byte), 1000); > + /* The duration of frame transmission should account for L1 overhead > + * (12 octets IFG, 7 octets of preamble, 1 octet SFD, 4 octets FCS) > + */ > + return div_u64((24 + len) * atomic64_read(&q->picos_per_byte), 1000); > } > > /* Returns the entry corresponding to next available interval. If > -- > 2.25.1 >
On Thu, May 05, 2022 at 10:25:44AM -0700, Vinicius Costa Gomes wrote: > Hi Vladimir, > > Vladimir Oltean <vladimir.oltean@nxp.com> writes: > > > The taprio scheduler underestimates the packet transmission time, which > > means that packets can be scheduled for transmission in time slots in > > which they are never going to fit. > > > > When this function was added in commit 4cfd5779bd6e ("taprio: Add > > support for txtime-assist mode"), the only implication was that time > > triggered packets would overrun its time slot and eat from the next one, > > because with txtime-assist there isn't really any emulation of a "gate > > close" event that would stop a packet from being transmitted. > > > > However, commit b5b73b26b3ca ("taprio: Fix allowing too small > > intervals") started using this function too, in all modes of operation > > (software, txtime-assist and full offload). So we now accept time slots > > which we know we won't be ever able to fulfill. > > > > It's difficult to say which issue is more pressing, I'd say both are > > visible with testing, even though the second would be more obvious > > because of a black&white result (trying to send small packets in an > > insufficiently large window blocks the queue). > > > > Issue found through code inspection, the code was not even compile > > tested. > > > > The L1 overhead chosen here is an approximation, because various network > > equipment has configurable IFG, however I don't think Linux is aware of > > this. > > When testing CBS, I remember using tc-stab: > > https://man7.org/linux/man-pages/man8/tc-stab.8.html > > To set the 'overhead' to some value. > > That value should be used in the calculation. > > I agree that it's not ideal, in the ideal world we would have a way to > retrieve the link overhead from the netdevice. But I would think that it > gets complicated really quickly when using netdevices that are not > Ethernet-based. Interesting. So because the majority of length_to_duration() calls take qdisc_pkt_len(skb) as argument, a user-supplied overhead is taken into account. The exception is the bare ETH_ZLEN. For that, we'd have to change the prototype of __qdisc_calculate_pkt_len to return an int, and change qdisc_calculate_pkt_len like this: static inline void qdisc_calculate_pkt_len(struct sk_buff *skb, const struct Qdisc *sch) { #ifdef CONFIG_NET_SCHED struct qdisc_size_table *stab = rcu_dereference_bh(sch->stab); if (stab) qdisc_skb_cb(skb)->pkt_len = __qdisc_calculate_pkt_len(skb->len, stab); #endif } then we would use __qdisc_calculate_pkt_len(ETH_ZLEN, rtnl_dereference(q->root->stab)). Again completely untested. Also, maybe the dependency on tc-stab for correct operation at least in txtime assist mode should be mentioned in the man page, maybe? I don't think it's completely obvious.
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c index b9c71a304d39..8c8681c37d4f 100644 --- a/net/sched/sch_taprio.c +++ b/net/sched/sch_taprio.c @@ -176,7 +176,10 @@ static ktime_t get_interval_end_time(struct sched_gate_list *sched, static int length_to_duration(struct taprio_sched *q, int len) { - return div_u64(len * atomic64_read(&q->picos_per_byte), 1000); + /* The duration of frame transmission should account for L1 overhead + * (12 octets IFG, 7 octets of preamble, 1 octet SFD, 4 octets FCS) + */ + return div_u64((24 + len) * atomic64_read(&q->picos_per_byte), 1000); } /* Returns the entry corresponding to next available interval. If
The taprio scheduler underestimates the packet transmission time, which means that packets can be scheduled for transmission in time slots in which they are never going to fit. When this function was added in commit 4cfd5779bd6e ("taprio: Add support for txtime-assist mode"), the only implication was that time triggered packets would overrun its time slot and eat from the next one, because with txtime-assist there isn't really any emulation of a "gate close" event that would stop a packet from being transmitted. However, commit b5b73b26b3ca ("taprio: Fix allowing too small intervals") started using this function too, in all modes of operation (software, txtime-assist and full offload). So we now accept time slots which we know we won't be ever able to fulfill. It's difficult to say which issue is more pressing, I'd say both are visible with testing, even though the second would be more obvious because of a black&white result (trying to send small packets in an insufficiently large window blocks the queue). Issue found through code inspection, the code was not even compile tested. The L1 overhead chosen here is an approximation, because various network equipment has configurable IFG, however I don't think Linux is aware of this. Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> --- net/sched/sch_taprio.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)