
[net,v2] net: sched: fix packet stuck problem for lockless qdisc

Message ID 1616552677-39016-1-git-send-email-linyunsheng@huawei.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [net,v2] net: sched: fix packet stuck problem for lockless qdisc

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers success CCed 15 of 15 maintainers
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit fail Errors and warnings before: 4510 this patch: 4061
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 99 lines checked
netdev/build_allmodconfig_warn fail Errors and warnings before: 4750 this patch: 4313
netdev/header_inline success Link

Commit Message

Yunsheng Lin March 24, 2021, 2:24 a.m. UTC
Lockless qdisc has the below concurrency problem:
    cpu0                 cpu1
     .                     .
q->enqueue                 .
     .                     .
qdisc_run_begin()          .
     .                     .
dequeue_skb()              .
     .                     .
sch_direct_xmit()          .
     .                     .
     .                q->enqueue
     .             qdisc_run_begin()
     .            return and do nothing
     .                     .
qdisc_run_end()            .

cpu1 enqueues a skb without calling __qdisc_run() because cpu0
has not released the lock yet and spin_trylock() returns false
for cpu1 in qdisc_run_begin(), and cpu0 does not see the skb
enqueued by cpu1 when calling dequeue_skb() because cpu1 may
enqueue the skb after cpu0 calls dequeue_skb() and before
cpu0 calls qdisc_run_end().

Lockless qdisc also has the below concurrency problem when
tx_action is involved:

cpu0(serving tx_action)     cpu1             cpu2
          .                   .                .
          .              q->enqueue            .
          .            qdisc_run_begin()       .
          .              dequeue_skb()         .
          .                   .            q->enqueue
          .                   .                .
          .             sch_direct_xmit()      .
          .                   .         qdisc_run_begin()
          .                   .       return and do nothing
          .                   .                .
 clear __QDISC_STATE_SCHED    .                .
 qdisc_run_begin()            .                .
 return and do nothing        .                .
          .                   .                .
          .            qdisc_run_end()         .

This patch fixes the above data race by:
1. Getting the flag before doing spin_trylock().
2. If the first spin_trylock() returns false and the flag was not
   set before the first spin_trylock(), setting the flag and retrying
   another spin_trylock(), in case the other CPU may not see the new
   flag after it releases the lock.
3. Rescheduling if the flag is set after the lock is released
   at the end of qdisc_run_end().

For the tx_action case, the flag is also set when cpu1 is at the
end of qdisc_run_end(), so tx_action will be rescheduled again
to dequeue the skb enqueued by cpu2.

Only clear the flag before retrying a dequeue when the dequeue
returns NULL, in order to reduce the overhead of the above double
spin_trylock() and the __netif_schedule() call.
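
For reference, the flow described in the above three steps can be
sketched roughly as below. This is only a simplified illustration of
the patch further down: the nolock_qdisc_run_begin()/nolock_qdisc_run_end()
names are made up here and stand in for the TCQ_F_NOLOCK branches of
the real qdisc_run_begin()/qdisc_run_end(), with the seqcount,
qdisc->empty and __QDISC_STATE_DEACTIVATED handling omitted:

static inline bool nolock_qdisc_run_begin(struct Qdisc *qdisc)
{
	/* step 1: sample the flag before trying the lock */
	bool dont_retry = test_bit(__QDISC_STATE_NEED_RESCHEDULE,
				   &qdisc->state);

	if (spin_trylock(&qdisc->seqlock))
		return true;

	/* the lock holder either saw the flag already or will see it
	 * in qdisc_run_end() and reschedule, so it dequeues for us
	 */
	if (dont_retry)
		return false;

	/* step 2: set the flag and retry once, in case the lock holder
	 * released the lock before the new flag was visible to it
	 */
	set_bit(__QDISC_STATE_NEED_RESCHEDULE, &qdisc->state);

	return spin_trylock(&qdisc->seqlock);
}

static inline void nolock_qdisc_run_end(struct Qdisc *qdisc)
{
	spin_unlock(&qdisc->seqlock);

	/* step 3: if another cpu set the flag while we held the lock,
	 * reschedule so its skb gets dequeued
	 */
	if (unlikely(test_bit(__QDISC_STATE_NEED_RESCHEDULE,
			      &qdisc->state)))
		__netif_schedule(qdisc);
}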

The performance impact of this patch, tested using pktgen and a
dummy netdev with the pfifo_fast qdisc attached:

 threads  without this patch  with this patch    delta
    1          2.6 Mpps          2.6 Mpps        +0.0%
    2          3.9 Mpps          3.8 Mpps        -2.5%
    4          5.6 Mpps          5.6 Mpps        -0.0%
    8          2.7 Mpps          2.8 Mpps        +3.7%
   16          2.2 Mpps          2.2 Mpps        +0.0%

Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
---
V2: Avoid the overhead of fixing the data race as much as
    possible.
---
 include/net/sch_generic.h | 48 ++++++++++++++++++++++++++++++++++++++++++++++-
 net/sched/sch_generic.c   | 12 ++++++++++++
 2 files changed, 59 insertions(+), 1 deletion(-)

Comments

Cong Wang March 24, 2021, 7:20 p.m. UTC | #1
On Tue, Mar 23, 2021 at 7:24 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
> @@ -176,8 +207,23 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
>  static inline void qdisc_run_end(struct Qdisc *qdisc)
>  {
>         write_seqcount_end(&qdisc->running);
> -       if (qdisc->flags & TCQ_F_NOLOCK)
> +       if (qdisc->flags & TCQ_F_NOLOCK) {
>                 spin_unlock(&qdisc->seqlock);
> +
> +               /* qdisc_run_end() is protected by RCU lock, and
> +                * qdisc reset will do a synchronize_net() after
> +                * setting __QDISC_STATE_DEACTIVATED, so testing
> +                * the below two bits separately should be fine.

Hmm, why is synchronize_net() after setting this bit fine? It could
still be flipped right after you test the RESCHEDULE bit.


> +                * For qdisc_run() in net_tx_action() case, we
> +                * really should provide rcu protection explicitly
> +                * for document purposes or PREEMPT_RCU.
> +                */
> +               if (unlikely(test_bit(__QDISC_STATE_NEED_RESCHEDULE,
> +                                     &qdisc->state) &&
> +                            !test_bit(__QDISC_STATE_DEACTIVATED,
> +                                      &qdisc->state)))

Why do you want to test the __QDISC_STATE_DEACTIVATED bit at all?
dev_deactivate_many() will wait for those scheduled but being
deactivated, so what's the problem with scheduling it even with this bit set?

Thanks.
Yunsheng Lin March 25, 2021, 2:08 a.m. UTC | #2
On 2021/3/25 3:20, Cong Wang wrote:
> On Tue, Mar 23, 2021 at 7:24 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>> @@ -176,8 +207,23 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
>>  static inline void qdisc_run_end(struct Qdisc *qdisc)
>>  {
>>         write_seqcount_end(&qdisc->running);
>> -       if (qdisc->flags & TCQ_F_NOLOCK)
>> +       if (qdisc->flags & TCQ_F_NOLOCK) {
>>                 spin_unlock(&qdisc->seqlock);
>> +
>> +               /* qdisc_run_end() is protected by RCU lock, and
>> +                * qdisc reset will do a synchronize_net() after
>> +                * setting __QDISC_STATE_DEACTIVATED, so testing
>> +                * the below two bits separately should be fine.
> 
> Hmm, why is synchronize_net() after setting this bit fine? It could
> still be flipped right after you test the RESCHEDULE bit.

That depends on when it will be flipped again.

As I see it:
1. __QDISC_STATE_DEACTIVATED is set during the dev_deactivate() process,
   which should also wait for all processing related to "test_bit(
   __QDISC_STATE_NEED_RESCHEDULE, &q->state)" to finish, by calling
   synchronize_net() and checking some_qdisc_is_busy().

2. It is cleared during the dev_activate() process.

And dev_deactivate() and dev_activate() are protected by the RTNL lock,
or serialized by linkwatch.

> 
> 
>> +                * For qdisc_run() in net_tx_action() case, we
>> +                * really should provide rcu protection explicitly
>> +                * for document purposes or PREEMPT_RCU.
>> +                */
>> +               if (unlikely(test_bit(__QDISC_STATE_NEED_RESCHEDULE,
>> +                                     &qdisc->state) &&
>> +                            !test_bit(__QDISC_STATE_DEACTIVATED,
>> +                                      &qdisc->state)))
> 
> Why do you want to test the __QDISC_STATE_DEACTIVATED bit at all?
> dev_deactivate_many() will wait for those scheduled but being
> deactivated, so what's the problem with scheduling it even with this bit set?

The problem I tried to fix is:

  CPU0(calling dev_deactivate)   CPU1(calling qdisc_run_end)   CPU2(calling tx_action)
             .                       __netif_schedule()                   .
             .                     set __QDISC_STATE_SCHED                .
             .                                .                           .
set __QDISC_STATE_DEACTIVATED                 .                           .
     synchronize_net()                        .                           .
             .                                .                           .
             .                                .              clear __QDISC_STATE_SCHED
             .                                .                           .
 some_qdisc_is_busy() return false            .                           .
             .                                .                           .
             .                                .                      qdisc_run()

some_qdisc_is_busy() checks whether the qdisc is busy by testing
__QDISC_STATE_SCHED and spin_is_locked(&qdisc->seqlock) for a lockless
qdisc. It returns false for CPU0 because CPU2 has already cleared
__QDISC_STATE_SCHED but has not taken qdisc->seqlock yet, even though
the qdisc is clearly still busy when qdisc_run() is run by CPU2 later.
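
To illustrate, the lockless part of that check boils down to roughly
the below (a simplified sketch only, not the exact some_qdisc_is_busy()
code, which also handles the qdisc_lock()-protected case):

static bool lockless_qdisc_is_busy(struct Qdisc *q)
{
	/* busy if a net_tx_action() run is still scheduled for the
	 * qdisc, or if someone currently holds its seqlock
	 */
	return test_bit(__QDISC_STATE_SCHED, &q->state) ||
	       spin_is_locked(&q->seqlock);
}

The race window is exactly the gap after CPU2 has cleared
__QDISC_STATE_SCHED in net_tx_action() and before it takes
qdisc->seqlock in qdisc_run_begin(): in that window both conditions
above are false although a qdisc_run() is still pending.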

So you are right, testing __QDISC_STATE_DEACTIVATED does not completely
solve the above data race, and there are also __netif_schedule() calls
from dev_requeue_skb() and __qdisc_run() which need the same fix.

So I will remove the __QDISC_STATE_DEACTIVATED test from this patch
first, and deal with it later.

> 
> Thanks.

Patch

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index f7a6e14..09a755d 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -36,6 +36,7 @@  struct qdisc_rate_table {
 enum qdisc_state_t {
 	__QDISC_STATE_SCHED,
 	__QDISC_STATE_DEACTIVATED,
+	__QDISC_STATE_NEED_RESCHEDULE,
 };
 
 struct qdisc_size_table {
@@ -159,12 +160,42 @@  static inline bool qdisc_is_empty(const struct Qdisc *qdisc)
 static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 {
 	if (qdisc->flags & TCQ_F_NOLOCK) {
+		bool dont_retry = test_bit(__QDISC_STATE_NEED_RESCHEDULE,
+					   &qdisc->state);
+
+		if (spin_trylock(&qdisc->seqlock))
+			goto out;
+
+		/* If the flag is set before doing the spin_trylock() and
+		 * the above spin_trylock() returns false, it means the other
+		 * cpu holding the lock will do the dequeuing for us, or it
+		 * will see the flag set after releasing the lock and
+		 * reschedule the net_tx_action() to do the dequeuing.
+		 */
+		if (dont_retry)
+			return false;
+
+		/* We could do the set_bit() before the first spin_trylock()
+		 * and avoid the second spin_trylock() completely, but then
+		 * multiple cpus could be doing the test_bit(). Here use
+		 * dont_retry to avoid the test_bit() and the second
+		 * spin_trylock(), which gives a 5% performance improvement
+		 * over doing the set_bit() before the first spin_trylock().
+		 */
+		set_bit(__QDISC_STATE_NEED_RESCHEDULE,
+			&qdisc->state);
+
+		/* Retry again in case other CPU may not see the new flag
+		 * after it releases the lock at the end of qdisc_run_end().
+		 */
 		if (!spin_trylock(&qdisc->seqlock))
 			return false;
 		WRITE_ONCE(qdisc->empty, false);
 	} else if (qdisc_is_running(qdisc)) {
 		return false;
 	}
+
+out:
 	/* Variant of write_seqcount_begin() telling lockdep a trylock
 	 * was attempted.
 	 */
@@ -176,8 +207,23 @@  static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 static inline void qdisc_run_end(struct Qdisc *qdisc)
 {
 	write_seqcount_end(&qdisc->running);
-	if (qdisc->flags & TCQ_F_NOLOCK)
+	if (qdisc->flags & TCQ_F_NOLOCK) {
 		spin_unlock(&qdisc->seqlock);
+
+		/* qdisc_run_end() is protected by RCU lock, and
+		 * qdisc reset will do a synchronize_net() after
+		 * setting __QDISC_STATE_DEACTIVATED, so testing
+		 * the below two bits separately should be fine.
+		 * For qdisc_run() in net_tx_action() case, we
+		 * really should provide rcu protection explicitly
+		 * for document purposes or PREEMPT_RCU.
+		 */
+		if (unlikely(test_bit(__QDISC_STATE_NEED_RESCHEDULE,
+				      &qdisc->state) &&
+			     !test_bit(__QDISC_STATE_DEACTIVATED,
+				       &qdisc->state)))
+			__netif_schedule(qdisc);
+	}
 }
 
 static inline bool qdisc_may_bulk(const struct Qdisc *qdisc)
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 44991ea..7e3426b 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -640,8 +640,10 @@  static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
 {
 	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
 	struct sk_buff *skb = NULL;
+	bool need_retry = true;
 	int band;
 
+retry:
 	for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) {
 		struct skb_array *q = band2list(priv, band);
 
@@ -652,6 +654,16 @@  static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
 	}
 	if (likely(skb)) {
 		qdisc_update_stats_at_dequeue(qdisc, skb);
+	} else if (need_retry &&
+		   test_and_clear_bit(__QDISC_STATE_NEED_RESCHEDULE,
+				      &qdisc->state)) {
+		/* do another round of dequeuing after clearing the flag
+		 * to avoid calling __netif_schedule().
+		 */
+		smp_mb__after_atomic();
+		need_retry = false;
+
+		goto retry;
 	} else {
 		WRITE_ONCE(qdisc->empty, true);
 	}