Message ID | 20240724152149.11003-1-aha310510@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] tun: Remove nested call to bpf_net_ctx_set() in do_xdp_generic() | expand |
Jeongjun Park wrote: > In the previous commit, bpf_net_context handling was added to > tun_sendmsg() and do_xdp_generic(), but if you write code like this, > bpf_net_context overlaps in the call trace below, causing various > memory corruptions. I'm no expert on this code, but commit 401cb7dae813 that introduced bpf_net_ctx_set explicitly states that nested calls are allowed. And the function does imply that: static inline struct bpf_net_context *bpf_net_ctx_set(struct bpf_net_context *bpf_net_ctx) { struct task_struct *tsk = current; if (tsk->bpf_net_context != NULL) return NULL; bpf_net_ctx->ri.kern_flags = 0; tsk->bpf_net_context = bpf_net_ctx; return bpf_net_ctx; } > <Call trace> > ... > tun_sendmsg() // bpf_net_ctx_set() > tun_xdp_one() > do_xdp_generic() // bpf_net_ctx_set() <-- nested > ... > > This patch removes the bpf_net_context handling that exists in > do_xdp_generic() and modifies it to handle it in the parent function. Is tun_xdp_one missing? That also calls do_xdp_generic.
Willem de Bruijn wrote: > I'm no expert on this code, but commit 401cb7dae813 that introduced > bpf_net_ctx_set explicitly states that nested calls are allowed. > > And the function does imply that: > > static inline struct bpf_net_context *bpf_net_ctx_set(struct bpf_net_context *bpf_net_ctx) > { > struct task_struct *tsk = current; > > if (tsk->bpf_net_context != NULL) > return NULL; > bpf_net_ctx->ri.kern_flags = 0; > > tsk->bpf_net_context = bpf_net_ctx; > return bpf_net_ctx; > } I'm not an expert on this code either. As you said, there is a possibility that the bug is not caused by overlapping calls, but various memory corruptions are occurring due to the handling of bpf_net_context in do_xdp_generic. Therefore, it is appropriate to modify it to handle it in the parent function rather than in do_xdp_generic. > Is tun_xdp_one missing? That also calls do_xdp_generic. This is no problem since tun_xdp_one is only called from tun_sendmsg and tun_sendmsg already does the bpf_net_context handling. Regards, Jeongjun Park.
On 7/25/24 04:43, Willem de Bruijn wrote: > Jeongjun Park wrote: >> In the previous commit, bpf_net_context handling was added to >> tun_sendmsg() and do_xdp_generic(), but if you write code like this, >> bpf_net_context overlaps in the call trace below, causing various >> memory corruptions. > > I'm no expert on this code, but commit 401cb7dae813 that introduced > bpf_net_ctx_set explicitly states that nested calls are allowed. > > And the function does imply that: > > static inline struct bpf_net_context *bpf_net_ctx_set(struct bpf_net_context *bpf_net_ctx) > { > struct task_struct *tsk = current; > > if (tsk->bpf_net_context != NULL) > return NULL; > bpf_net_ctx->ri.kern_flags = 0; > > tsk->bpf_net_context = bpf_net_ctx; > return bpf_net_ctx; > } I agree with Willem, the ctx nesting looks legit generally speaking. @Jeongjun: you need to track down more accurately the issue root cause and include such info into the commit message. Skimming over the code I *think* do_xdp_generic() is not cleaning the nested context in all the paths before return and that could cause the reported issue. Thanks, Paolo
Paolo Abeni wrote: > > On 7/25/24 04:43, Willem de Bruijn wrote: > > Jeongjun Park wrote: > >> In the previous commit, bpf_net_context handling was added to > >> tun_sendmsg() and do_xdp_generic(), but if you write code like this, > >> bpf_net_context overlaps in the call trace below, causing various > >> memory corruptions. > > > > I'm no expert on this code, but commit 401cb7dae813 that introduced > > bpf_net_ctx_set explicitly states that nested calls are allowed. > > > > And the function does imply that: > > > > static inline struct bpf_net_context *bpf_net_ctx_set(struct bpf_net_context *bpf_net_ctx) > > { > > struct task_struct *tsk = current; > > > > if (tsk->bpf_net_context != NULL) > > return NULL; > > bpf_net_ctx->ri.kern_flags = 0; > > > > tsk->bpf_net_context = bpf_net_ctx; > > return bpf_net_ctx; > > } > > I agree with Willem, the ctx nesting looks legit generally speaking. > @Jeongjun: you need to track down more accurately the issue root cause > and include such info into the commit message. > > Skimming over the code I *think* do_xdp_generic() is not cleaning the > nested context in all the paths before return and that could cause the > reported issue. Thanks to your comment, I re-read the code and found the root cause. I will send a patch for that bug. Regards, Jeongjun Park
diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 9b24861464bc..095ada4a525e 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1919,10 +1919,12 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile, if (skb_xdp) { struct bpf_prog *xdp_prog; + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; int ret; local_bh_disable(); rcu_read_lock(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); xdp_prog = rcu_dereference(tun->xdp_prog); if (xdp_prog) { ret = do_xdp_generic(xdp_prog, &skb); @@ -1932,6 +1934,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile, goto unlock_frags; } } + bpf_net_ctx_clear(bpf_net_ctx); rcu_read_unlock(); local_bh_enable(); } diff --git a/net/core/dev.c b/net/core/dev.c index 6ea1d20676fb..26f9fdd66e64 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5126,14 +5126,11 @@ static DEFINE_STATIC_KEY_FALSE(generic_xdp_needed_key); int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff **pskb) { - struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; - if (xdp_prog) { struct xdp_buff xdp; u32 act; int err; - bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); act = netif_receive_generic_xdp(pskb, &xdp, xdp_prog); if (act != XDP_PASS) { switch (act) { @@ -5147,13 +5144,11 @@ int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff **pskb) generic_xdp_tx(*pskb, xdp_prog); break; } - bpf_net_ctx_clear(bpf_net_ctx); return XDP_DROP; } } return XDP_PASS; out_redir: - bpf_net_ctx_clear(bpf_net_ctx); kfree_skb_reason(*pskb, SKB_DROP_REASON_XDP); return XDP_DROP; } @@ -5475,10 +5470,13 @@ static int __netif_receive_skb_core(struct sk_buff **pskb, bool pfmemalloc, if (static_branch_unlikely(&generic_xdp_needed_key)) { int ret2; + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; migrate_disable(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); ret2 = do_xdp_generic(rcu_dereference(skb->dev->xdp_prog), &skb); + bpf_net_ctx_clear(bpf_net_ctx); migrate_enable(); if (ret2 != XDP_PASS) {
In the previous commit, bpf_net_context handling was added to tun_sendmsg() and do_xdp_generic(), but if you write code like this, bpf_net_context overlaps in the call trace below, causing various memory corruptions. <Call trace> ... tun_sendmsg() // bpf_net_ctx_set() tun_xdp_one() do_xdp_generic() // bpf_net_ctx_set() <-- nested ... This patch removes the bpf_net_context handling that exists in do_xdp_generic() and modifies it to handle it in the parent function. Reported-by: syzbot+44623300f057a28baf1e@syzkaller.appspotmail.com Fixes: fecef4cd42c6 ("tun: Assign missing bpf_net_context.") Signed-off-by: Jeongjun Park <aha310510@gmail.com> --- drivers/net/tun.c | 3 +++ net/core/dev.c | 8 +++----- 2 files changed, 6 insertions(+), 5 deletions(-) --