[net,v2,1/2] lwt: disable BH too in run_lwt_bpf()

Message ID 20201205075946.497763-1-xiyou.wangcong@gmail.com (mailing list archive)
State Accepted
Commit d9054a1ff585ba01029584ab730efc794603d68f
Delegated to: BPF
Series [net,v2,1/2] lwt: disable BH too in run_lwt_bpf()

Checks

Context                         Check    Description
netdev/cover_letter             success  Link
netdev/fixes_present            fail     Series targets non-next tree, but doesn't contain any Fixes tags
netdev/patch_count              success  Link
netdev/tree_selection           success  Clearly marked for net
netdev/subject_prefix           success  Link
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 1 this patch: 1
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 22 lines checked
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1 this patch: 1
netdev/header_inline            success  Link
netdev/stable                   success  Stable not CCed

Commit Message

Cong Wang Dec. 5, 2020, 7:59 a.m. UTC
From: Dongdong Wang <wangdongdong.6@bytedance.com>

The per-cpu bpf_redirect_info is shared among all skb_do_redirect()
and BPF redirect helpers. Callers on the RX path all run in BH context,
so disabling preemption alone is not sufficient to keep a BH from
interrupting a caller that runs outside BH context.

In production, we observed strange packet drops because of the race
condition between LWT xmit and TC ingress, and we verified this issue
is fixed after we disable BH.
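
The race described above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not kernel code:
struct sim_cpu, tc_ingress_bh(), maybe_run_bh(), lwt_xmit() and
simulate() are hypothetical names, the ifindex values 3 and 7 are
arbitrary, and the "softirq" is modeled as a function call at the one
point where an interruption matters. The model also assumes that the
consumer of the per-cpu slot reads the target and then clears it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-cpu bpf_redirect_info. */
struct redirect_info {
	int tgt_index;
};

/* One simulated CPU: its per-cpu slot plus softirq state. */
struct sim_cpu {
	struct redirect_info ri;
	bool bh_disabled;	/* models local_bh_disable() being in effect */
	bool bh_pending;	/* a TC ingress run is waiting to execute */
};

/* Models TC ingress in BH context: it sets and then consumes the slot. */
static void tc_ingress_bh(struct sim_cpu *cpu)
{
	cpu->ri.tgt_index = 7;	/* TC program requests a redirect */
	cpu->ri.tgt_index = 0;	/* its redirect step consumes and clears it */
	cpu->bh_pending = false;
}

/* A point at which pending BH work may interrupt the current code. */
static void maybe_run_bh(struct sim_cpu *cpu)
{
	if (cpu->bh_pending && !cpu->bh_disabled)
		tc_ingress_bh(cpu);
}

/* Models LWT xmit; returns the ifindex it ends up redirecting to. */
static int lwt_xmit(struct sim_cpu *cpu, bool disable_bh)
{
	int ifindex;

	assert(!cpu->bh_disabled);	/* the model does not nest */
	cpu->bh_disabled = disable_bh;

	cpu->ri.tgt_index = 3;		/* LWT program requests a redirect */
	maybe_run_bh(cpu);		/* window before the redirect step */
	ifindex = cpu->ri.tgt_index;	/* redirect step reads the slot */
	cpu->ri.tgt_index = 0;

	cpu->bh_disabled = false;
	maybe_run_bh(cpu);		/* deferred BH work runs once enabled */
	return ifindex;
}

/* Runs one LWT xmit with a TC ingress BH already pending. */
static int simulate(bool disable_bh)
{
	struct sim_cpu cpu = { .bh_pending = true };

	return lwt_xmit(&cpu, disable_bh);
}
```

In this model, simulate(false) returns 0 because the interrupting BH
overwrote and cleared the slot before the LWT path read it, which is
one way a packet could end up dropped. simulate(true) returns 3, the
target the LWT program asked for, because the BH is deferred until
after the slot has been consumed.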

Although this bug has technically been present from the beginning, that
is, since commit 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel
infrastructure"), at that time call_rcu() had to be call_rcu_bh() to
match the RCU context. So this patch may not work well on kernels that
predate the RCU flavor consolidation, which was completed around v5.0.

Update the comments above the code too, as call_rcu() is now BH friendly.

Cc: Thomas Graf <tgraf@suug.ch>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Dongdong Wang <wangdongdong.6@bytedance.com>
---
 net/core/lwt_bpf.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

patchwork-bot+netdevbpf@kernel.org Dec. 7, 2020, 8:10 p.m. UTC | #1
Hello:

This series was applied to bpf/bpf.git (refs/heads/master):

On Fri,  4 Dec 2020 23:59:45 -0800 you wrote:
> From: Dongdong Wang <wangdongdong.6@bytedance.com>
> 
> The per-cpu bpf_redirect_info is shared among all skb_do_redirect()
> and BPF redirect helpers. Callers on RX path are all in BH context,
> disabling preemption is not sufficient to prevent BH interruption.
> 
> In production, we observed strange packet drops because of the race
> condition between LWT xmit and TC ingress, and we verified this issue
> is fixed after we disable BH.
> 
> [...]

Here is the summary with links:
  - [net,v2,1/2] lwt: disable BH too in run_lwt_bpf()
    https://git.kernel.org/bpf/bpf/c/d9054a1ff585
  - [net,v2,2/2] lwt_bpf: replace preempt_disable() with migrate_disable()
    https://git.kernel.org/bpf/bpf/c/e3366884b383

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

Patch

diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index 7d3438215f32..4f3cb7c15ddf 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -39,12 +39,11 @@  static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt,
 {
 	int ret;
 
-	/* Preempt disable is needed to protect per-cpu redirect_info between
-	 * BPF prog and skb_do_redirect(). The call_rcu in bpf_prog_put() and
-	 * access to maps strictly require a rcu_read_lock() for protection,
-	 * mixing with BH RCU lock doesn't work.
+	/* Preempt disable and BH disable are needed to protect per-cpu
+	 * redirect_info between BPF prog and skb_do_redirect().
 	 */
 	preempt_disable();
+	local_bh_disable();
 	bpf_compute_data_pointers(skb);
 	ret = bpf_prog_run_save_cb(lwt->prog, skb);
 
@@ -78,6 +77,7 @@  static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt,
 		break;
 	}
 
+	local_bh_enable();
 	preempt_enable();
 
 	return ret;