[v4] net/bridge: Optimizing read-write locks in ebtables.c

When running wrk tests with traffic forwarded through a bridge, the softirq
load of the system becomes very high. If the network load is too high, it may
cause abnormal load in ebt_do_table of the kernel ebtables module, leading
to excessive soft interrupts and sometimes even directly causing a CPU soft
lockup.

Test setup:
1) Create the bridges on test machine A:
``` bash
brctl addbr br-a
brctl addbr br-b
brctl addif br-a enp1s0f0 enp1s0f1
brctl addif br-b enp130s0f0 enp130s0f1
ifconfig br-a up
ifconfig br-b up
```
2) Generate load with wrk from another machine B:
``` bash
ulimit -n 2048
./wrk -t48 -c2000 -d6000 -R10000 -s request.lua http://4.4.4.2:80/4k.html &
./wrk -t48 -c2000 -d6000 -R10000 -s request.lua http://5.5.5.2:80/4k.html &
```
At this point, the softirq load on machine A is relatively high. The data
below was collected on an arm Kunpeng-920 (96 CPUs) machine. When only the
wrk tests are running, the softirq of the system rapidly increases to about 25%:

02:50:07 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
02:50:25 PM  all    0.00    0.00    0.05    0.00    0.72   23.20    0.00    0.00    0.00   76.03
02:50:26 PM  all    0.00    0.00    0.08    0.00    0.72   24.53    0.00    0.00    0.00   74.67
02:50:27 PM  all    0.01    0.00    0.13    0.00    0.75   24.89    0.00    0.00    0.00   74.23

3) Run ebtables operations on machine A:
``` bash

for i in {0..100000}
do
        ebtables -t nat -Lc
        ebtables -t nat -F
        ebtables -t nat -Lc
        ebtables -t nat -A PREROUTING -j PREROUTING_direct
done
```
If ebtables queries, updates, and other operations are executed continuously
at this time, softirq increases again to about 50%:
02:52:23 PM  all    0.00    0.00    1.18    0.00    0.54   48.91    0.00    0.00    0.00   49.36
02:52:24 PM  all    0.00    0.00    1.19    0.00    0.43   48.23    0.00    0.00    0.00   50.15
02:52:25 PM  all    0.00    0.00    1.20    0.00    0.50   48.29    0.00    0.00    0.00   50.01

More seriously, a soft lockup may occur:

Message from syslogd@localhost at Sep 25 14:52:22 ...
 kernel:watchdog: BUG: soft lockup - CPU#88 stuck for 23s! [ebtables:3896]

dmesg:

[ 1376.653884] watchdog: BUG: soft lockup - CPU#88 stuck for 23s! [ebtables:3896]
[ 1376.661131] CPU: 88 PID: 3896 Comm: ebtables Kdump: loaded Not tainted 4.19.90-2305.1.0.0199.82.uel20.aarch64 #1
[ 1376.661132] Hardware name: Yunke China KunTai R722/BC82AMDDA, BIOS 6.59 07/18/2023
[ 1376.661133] pstate: 20400009 (nzCv daif +PAN -UAO)
[ 1376.661137] pc : queued_write_lock_slowpath+0x70/0x128
...
[ 1376.661156] Call trace:
[ 1376.661157]  queued_write_lock_slowpath+0x70/0x128
[ 1376.661164]  copy_counters_to_user.part.2+0x110/0x140 [ebtables]
[ 1376.661166]  copy_everything_to_user+0x3c4/0x730 [ebtables]
[ 1376.661168]  do_ebt_get_ctl+0x1c0/0x270 [ebtables]
[ 1376.661172]  nf_getsockopt+0x64/0xa8
[ 1376.661175]  ip_getsockopt+0x12c/0x1b0
[ 1376.661178]  raw_getsockopt+0x88/0xb0
[ 1376.661182]  sock_common_getsockopt+0x54/0x68
[ 1376.661185]  __arm64_sys_getsockopt+0x94/0x108
[ 1376.661190]  el0_svc_handler+0x80/0x168
[ 1376.661192]  el0_svc+0x8/0x6c0

Analysis showed that the ebtables code has not been optimized for a long
time and still uses read-write locks. The arp/ip/ip6 tables, by contrast,
were optimized long ago, when the read-write lock was identified as a
performance bottleneck.

So I followed the approach used for the arp/ip/ip6 tables to optimize the
read-write lock in ebtables.c.

Ref: '7f5c6d4f665b ("netfilter: get rid of atomic ops in fast path")'
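
For readers unfamiliar with the technique, here is a minimal userspace sketch
of a per-CPU sequence counter guarding packet/byte counters, which is my
understanding of the general idea behind the referenced commit. It is not the
code from this patch: NR_CPUS, struct pcpu_counter, counter_add() and
counter_sum() are hypothetical names, and C11 atomics with default ordering
stand in for the kernel's seqcount and barrier primitives.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Hypothetical per-CPU slot: a sequence counter plus the rule counters. */
struct pcpu_counter {
	atomic_uint seq;	/* odd while an update is in progress */
	_Atomic uint64_t pcnt;	/* packets */
	_Atomic uint64_t bcnt;	/* bytes */
};

static struct pcpu_counter counters[NR_CPUS];

/*
 * Fast path: each CPU only ever writes its own slot, so no shared lock
 * is taken. The sequence is bumped before and after the update.
 */
static void counter_add(unsigned int cpu, uint64_t bytes)
{
	struct pcpu_counter *c = &counters[cpu];

	atomic_fetch_add(&c->seq, 1);	/* now odd: update in progress */
	atomic_store(&c->pcnt, atomic_load(&c->pcnt) + 1);
	atomic_store(&c->bcnt, atomic_load(&c->bcnt) + bytes);
	atomic_fetch_add(&c->seq, 1);	/* now even: update complete */
}

/*
 * Slow path (e.g. dumping counters to user space): for each slot, retry
 * the snapshot until the sequence is even and unchanged across the read.
 * The reader never blocks the writers.
 */
static void counter_sum(uint64_t *pcnt, uint64_t *bcnt)
{
	unsigned int cpu;

	*pcnt = 0;
	*bcnt = 0;
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		struct pcpu_counter *c = &counters[cpu];
		unsigned int start;
		uint64_t p, b;

		do {
			start = atomic_load(&c->seq);
			p = atomic_load(&c->pcnt);
			b = atomic_load(&c->bcnt);
		} while ((start & 1) || start != atomic_load(&c->seq));

		*pcnt += p;
		*bcnt += b;
	}
}
```

The point of the sketch is that the packet path only touches its own slot,
while the counter dump pays the retry cost instead of taking a write lock
that stalls every CPU forwarding packets.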

After the patch:
03:17:11 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
03:17:12 PM  all    0.02    0.00    0.03    0.00    0.64    4.80    0.00    0.00    0.00   94.51
03:17:13 PM  all    0.00    0.00    0.03    0.00    0.60    4.68    0.00    0.00    0.00   94.69
03:17:14 PM  all    0.02    0.00    0.00    0.00    0.63    4.60    0.00    0.00    0.00   94.74

When performing ebtables query and update operations:
03:17:50 PM  all    0.97    0.00    1.16    0.00    0.59    4.37    0.00    0.00    0.00   92.92
03:17:51 PM  all    0.71    0.00    1.20    0.00    0.56    3.97    0.00    0.00    0.00   93.56
03:17:52 PM  all    1.02    0.00    1.02    0.00    0.59    4.02    0.00    0.00    0.00   93.36
03:17:53 PM  all    0.90    0.00    1.10    0.00    0.54    4.07    0.00    0.00    0.00   93.38

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: yushengjin <yushengjin@uniontech.com>
Link: https://lore.kernel.org/all/CANn89iJCBRCM3aHDy-7gxWu_+agXC9M1R=hwFuh2G9RSLu_6bg@mail.gmail.com/
---
 include/linux/netfilter_bridge/ebtables.h |   1 -
 net/bridge/netfilter/ebtables.c           | 140 ++++++++++++++++------
 2 files changed, 102 insertions(+), 39 deletions(-)

Message ID	A872628EC4B98B9E+20240925083745.179397-1-yushengjin@uniontech.com (mailing list archive)
State	Awaiting Upstream
Delegated to:	Netdev Maintainers
Headers	From: yushengjin <yushengjin@uniontech.com>
	To: pablo@netfilter.org
	Cc: kadlec@netfilter.org, roopa@nvidia.com, razor@blackwall.org, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, netfilter-devel@vger.kernel.org, coreteam@netfilter.org, bridge@lists.linux.dev, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, gouhao@uniontech.com, yushengjin <yushengjin@uniontech.com>
	Subject: [PATCH v4] net/bridge: Optimizing read-write locks in ebtables.c
	Date: Wed, 25 Sep 2024 16:37:45 +0800
	Message-ID: <A872628EC4B98B9E+20240925083745.179397-1-yushengjin@uniontech.com>
	X-Mailer: git-send-email 2.43.0
	X-Mailing-List: netdev@vger.kernel.org
	X-Patchwork-Delegate: kuba@kernel.org
Series	[v4] net/bridge: Optimizing read-write locks in ebtables.c

Context	Check	Description
netdev/series_format	warning	Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection	success	Guessed tree name to be net-next
netdev/ynl	success	Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 16 this patch: 16
netdev/build_tools	success	Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers	success	CCed 11 of 11 maintainers
netdev/build_clang	success	Errors and warnings before: 16 this patch: 16
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 19 this patch: 19
netdev/checkpatch	warning	CHECK: Alignment should match open parenthesis CHECK: Blank lines aren't necessary after an open brace '{' CHECK: Please don't use multiple blank lines WARNING: memory barrier without comment WARNING: suspect code indent for conditional statements (8, 12)
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0
netdev/contest	success	net-next-2024-09-26--21-00 (tests: 768)
