Message ID | 1620868260-32984-4-git-send-email-linyunsheng@huawei.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | fix packet stuck problem for lockless qdisc | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | success | CCed 20 of 20 maintainers |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 10 this patch: 10 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 67 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 10 this patch: 10 |
netdev/header_inline | success | Link |
On Thu, 13 May 2021 09:11:00 +0800 Yunsheng Lin wrote: > The netdev qeueue might be stopped when byte queue limit has > reached or tx hw ring is full, net_tx_action() may still be > rescheduled endlessly if STATE_MISSED is set, which consumes > a lot of cpu without dequeuing and transmiting any skb because > the netdev queue is stopped, see qdisc_run_end(). > > This patch fixes it by checking the netdev queue state before > calling qdisc_run() and clearing STATE_MISSED if netdev queue is > stopped during qdisc_run(), the net_tx_action() is recheduled > again when netdev qeueue is restarted, see netif_tx_wake_queue(). > > As there is time window betewwn netif_xmit_frozen_or_stopped() > checking and STATE_MISSED clearing, between which STATE_MISSED > is set by net_tx_action() scheduled by netif_tx_wake_queue(), > so set the STATE_MISSED again if netdev queue is restarted. > > Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking") > Reported-by: Michal Kubecek <mkubecek@suse.cz> > Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> > @@ -35,6 +35,25 @@ > const struct Qdisc_ops *default_qdisc_ops = &pfifo_fast_ops; > EXPORT_SYMBOL(default_qdisc_ops); > > +static void qdisc_maybe_stop_tx(struct Qdisc *q, nit: qdisc_maybe_clear_missed()?
diff --git a/net/core/dev.c b/net/core/dev.c index d596cd7..ef8cf76 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -3853,7 +3853,8 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q, if (q->flags & TCQ_F_NOLOCK) { rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK; - qdisc_run(q); + if (likely(!netif_xmit_frozen_or_stopped(txq))) + qdisc_run(q); if (unlikely(to_free)) kfree_skb_list(to_free); diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index d86c4cc..b129d9b 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -35,6 +35,25 @@ const struct Qdisc_ops *default_qdisc_ops = &pfifo_fast_ops; EXPORT_SYMBOL(default_qdisc_ops); +static void qdisc_maybe_stop_tx(struct Qdisc *q, + const struct netdev_queue *txq) +{ + clear_bit(__QDISC_STATE_MISSED, &q->state); + + /* Make sure the below netif_xmit_frozen_or_stopped() + * checking happens after clearing STATE_MISSED. + */ + smp_mb__after_atomic(); + + /* Checking netif_xmit_frozen_or_stopped() again to + * make sure STATE_MISSED is set if the STATE_MISSED + * set by netif_tx_wake_queue()'s rescheduling of + * net_tx_action() is cleared by the above clear_bit(). + */ + if (!netif_xmit_frozen_or_stopped(txq)) + set_bit(__QDISC_STATE_MISSED, &q->state); +} + /* Main transmission queue. */ /* Modifications to data participating in scheduling must be protected with @@ -74,6 +93,7 @@ static inline struct sk_buff *__skb_dequeue_bad_txq(struct Qdisc *q) } } else { skb = SKB_XOFF_MAGIC; + qdisc_maybe_stop_tx(q, txq); } } @@ -242,6 +262,7 @@ static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate, } } else { skb = NULL; + qdisc_maybe_stop_tx(q, txq); } if (lock) spin_unlock(lock); @@ -251,8 +272,10 @@ static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate, *validate = true; if ((q->flags & TCQ_F_ONETXQUEUE) && - netif_xmit_frozen_or_stopped(txq)) + netif_xmit_frozen_or_stopped(txq)) { + qdisc_maybe_stop_tx(q, txq); return skb; + } skb = qdisc_dequeue_skb_bad_txq(q); if (unlikely(skb)) { @@ -311,6 +334,8 @@ bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q, HARD_TX_LOCK(dev, txq, smp_processor_id()); if (!netif_xmit_frozen_or_stopped(txq)) skb = dev_hard_start_xmit(skb, dev, txq, &ret); + else + qdisc_maybe_stop_tx(q, txq); HARD_TX_UNLOCK(dev, txq); } else {
The netdev qeueue might be stopped when byte queue limit has reached or tx hw ring is full, net_tx_action() may still be rescheduled endlessly if STATE_MISSED is set, which consumes a lot of cpu without dequeuing and transmiting any skb because the netdev queue is stopped, see qdisc_run_end(). This patch fixes it by checking the netdev queue state before calling qdisc_run() and clearing STATE_MISSED if netdev queue is stopped during qdisc_run(), the net_tx_action() is recheduled again when netdev qeueue is restarted, see netif_tx_wake_queue(). As there is time window betewwn netif_xmit_frozen_or_stopped() checking and STATE_MISSED clearing, between which STATE_MISSED is set by net_tx_action() scheduled by netif_tx_wake_queue(), so set the STATE_MISSED again if netdev queue is restarted. Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking") Reported-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> --- V7: Fix the netif_tx_wake_queue() data race noted by Jakub. V6: Drop NET_XMIT_DROP checking for it is not really relevant to this patch, and it may cause performance performance regression with multi pktgen threads on dummy netdev with pfifo_fast qdisc case. --- net/core/dev.c | 3 ++- net/sched/sch_generic.c | 27 ++++++++++++++++++++++++++- 2 files changed, 28 insertions(+), 2 deletions(-)