Message ID | 20250313183911.SPAmGLyw@linutronix.de (mailing list archive) |
---|---|
State | RFC |
Delegated to: | BPF |
Headers | show |
Series | [RFC] Use after free in BPF/ XDP during XDP_REDIRECT | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Guessing tree name failed - patch did not apply |
bpf/vmtest-bpf-PR | fail | merge-conflict |
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: > Hi, > > Ricardo reported a KASAN related use after free > https://lore.kernel.org/all/20250226-20250204-kasan-slab-use-after-free-read-in-dev_map_enqueue__submit-v3-0-360efec441ba@igalia.com/ > > in v6.6 stable and suggest a backport of commits > 401cb7dae8130 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") > fecef4cd42c68 ("tun: Assign missing bpf_net_context.") > 9da49aa80d686 ("tun: Add missing bpf_net_ctx_clear() in do_xdp_generic()") > > as a fix. In the meantime I have the syz reproducer+config and was able > to investigate. > It looks as if the syzbot starts a BPF program via xdp_test_run_batch() > which assigns ri->tgt_value via dev_hash_map_redirect() and the return code > isn't XDP_REDIRECT it looks like nonsense. So the print in > bpf_warn_invalid_xdp_action() appears once. Everything goes as planned. > Then the TUN driver runs another BPF program which returns XDP_REDIRECT > without setting ri->tgt_value. This appears to be a trick because it > invoked bpf_trace_printk() which printed four characters. Anyway, this > is enough to get xdp_do_redirect() going. > > The commits in questions do fix it because the bpf_redirect_info becomes > not only per-task but gets invalidated after the XDP context is left. > > Now that I understand it I would suggest something smaller instead as a > stable fix, (instead the proposed patches). Any objections to the > following: > > diff --git a/net/core/filter.c b/net/core/filter.c > index be313928d272..1d906b7a541d 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -9000,8 +9000,12 @@ static bool xdp_is_valid_access(int off, int size, > > void bpf_warn_invalid_xdp_action(struct net_device *dev, struct bpf_prog *prog, u32 act) > { > + struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); > const u32 act_max = XDP_REDIRECT; > > + ri->map_id = INT_MAX; > + ri->map_type = BPF_MAP_TYPE_UNSPEC; > + > pr_warn_once("%s XDP return value %u on prog %s (id %d) dev %s, expect packet loss!\n", > act > act_max ? "Illegal" : "Driver unsupported", > act, prog->aux->name, prog->aux->id, dev ? dev->name : "N/A"); From your description above, this will fix the particular error encountered, but what happens if the initial return code is not in fact nonsense (so the warn_invalid_action) is not triggered? I.e., bpf_redirect_map(...); return XDP_DROP; would still leave ri->map_id and ri->map_type set for the later tun driver invocation, no? -Toke
On 2025-03-13 20:28:06 [+0100], Toke Høiland-Jørgensen wrote: > Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: > > > Hi, > > > > Ricardo reported a KASAN related use after free > > https://lore.kernel.org/all/20250226-20250204-kasan-slab-use-after-free-read-in-dev_map_enqueue__submit-v3-0-360efec441ba@igalia.com/ > > > > in v6.6 stable and suggest a backport of commits > > 401cb7dae8130 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") > > fecef4cd42c68 ("tun: Assign missing bpf_net_context.") > > 9da49aa80d686 ("tun: Add missing bpf_net_ctx_clear() in do_xdp_generic()") > > > > as a fix. In the meantime I have the syz reproducer+config and was able > > to investigate. > > It looks as if the syzbot starts a BPF program via xdp_test_run_batch() > > which assigns ri->tgt_value via dev_hash_map_redirect() and the return code > > isn't XDP_REDIRECT it looks like nonsense. So the print in > > bpf_warn_invalid_xdp_action() appears once. Everything goes as planned. > > Then the TUN driver runs another BPF program which returns XDP_REDIRECT > > without setting ri->tgt_value. This appears to be a trick because it > > invoked bpf_trace_printk() which printed four characters. Anyway, this > > is enough to get xdp_do_redirect() going. > > > > The commits in questions do fix it because the bpf_redirect_info becomes > > not only per-task but gets invalidated after the XDP context is left. > > > > Now that I understand it I would suggest something smaller instead as a > > stable fix, (instead the proposed patches). Any objections to the > > following: > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > index be313928d272..1d906b7a541d 100644 > > --- a/net/core/filter.c > > +++ b/net/core/filter.c > > @@ -9000,8 +9000,12 @@ static bool xdp_is_valid_access(int off, int size, > > > > void bpf_warn_invalid_xdp_action(struct net_device *dev, struct bpf_prog *prog, u32 act) > > { > > + struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); > > const u32 act_max = XDP_REDIRECT; > > > > + ri->map_id = INT_MAX; > > + ri->map_type = BPF_MAP_TYPE_UNSPEC; > > + > > pr_warn_once("%s XDP return value %u on prog %s (id %d) dev %s, expect packet loss!\n", > > act > act_max ? "Illegal" : "Driver unsupported", > > act, prog->aux->name, prog->aux->id, dev ? dev->name : "N/A"); > > From your description above, this will fix the particular error > encountered, but what happens if the initial return code is not in fact > nonsense (so the warn_invalid_action) is not triggered? > > I.e., > > bpf_redirect_map(...); > return XDP_DROP; > > would still leave ri->map_id and ri->map_type set for the later tun > driver invocation, no? Right. So if it returns XDP_PASS or XDP_DROP instead of nonsense then the buffer remains set. And another driver could use it. But this would mean we would have to tackle each bpf_prog_run_xdp() invocation and reset it afterwards… So maybe the backport instead? We have | $ git grep bpf_prog_run_xdp | wc -l | 55 call sites. > -Toke Sebastian
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: > On 2025-03-13 20:28:06 [+0100], Toke Høiland-Jørgensen wrote: >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: >> >> > Hi, >> > >> > Ricardo reported a KASAN related use after free >> > https://lore.kernel.org/all/20250226-20250204-kasan-slab-use-after-free-read-in-dev_map_enqueue__submit-v3-0-360efec441ba@igalia.com/ >> > >> > in v6.6 stable and suggest a backport of commits >> > 401cb7dae8130 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") >> > fecef4cd42c68 ("tun: Assign missing bpf_net_context.") >> > 9da49aa80d686 ("tun: Add missing bpf_net_ctx_clear() in do_xdp_generic()") >> > >> > as a fix. In the meantime I have the syz reproducer+config and was able >> > to investigate. >> > It looks as if the syzbot starts a BPF program via xdp_test_run_batch() >> > which assigns ri->tgt_value via dev_hash_map_redirect() and the return code >> > isn't XDP_REDIRECT it looks like nonsense. So the print in >> > bpf_warn_invalid_xdp_action() appears once. Everything goes as planned. >> > Then the TUN driver runs another BPF program which returns XDP_REDIRECT >> > without setting ri->tgt_value. This appears to be a trick because it >> > invoked bpf_trace_printk() which printed four characters. Anyway, this >> > is enough to get xdp_do_redirect() going. >> > >> > The commits in questions do fix it because the bpf_redirect_info becomes >> > not only per-task but gets invalidated after the XDP context is left. >> > >> > Now that I understand it I would suggest something smaller instead as a >> > stable fix, (instead the proposed patches). Any objections to the >> > following: >> > >> > diff --git a/net/core/filter.c b/net/core/filter.c >> > index be313928d272..1d906b7a541d 100644 >> > --- a/net/core/filter.c >> > +++ b/net/core/filter.c >> > @@ -9000,8 +9000,12 @@ static bool xdp_is_valid_access(int off, int size, >> > >> > void bpf_warn_invalid_xdp_action(struct net_device *dev, struct bpf_prog *prog, u32 act) >> > { >> > + struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); >> > const u32 act_max = XDP_REDIRECT; >> > >> > + ri->map_id = INT_MAX; >> > + ri->map_type = BPF_MAP_TYPE_UNSPEC; >> > + >> > pr_warn_once("%s XDP return value %u on prog %s (id %d) dev %s, expect packet loss!\n", >> > act > act_max ? "Illegal" : "Driver unsupported", >> > act, prog->aux->name, prog->aux->id, dev ? dev->name : "N/A"); >> >> From your description above, this will fix the particular error >> encountered, but what happens if the initial return code is not in fact >> nonsense (so the warn_invalid_action) is not triggered? >> >> I.e., >> >> bpf_redirect_map(...); >> return XDP_DROP; >> >> would still leave ri->map_id and ri->map_type set for the later tun >> driver invocation, no? > > Right. So if it returns XDP_PASS or XDP_DROP instead of nonsense then > the buffer remains set. And another driver could use it. > But this would mean we would have to tackle each bpf_prog_run_xdp() > invocation and reset it afterwards… So maybe the backport instead? We > have > | $ git grep bpf_prog_run_xdp | wc -l > | 55 > > call sites. Hmm, how about putting the reset (essentially the changes you have above) into bpf_prog_run_xdp() itself, before executing the BPF program? -Toke
On 2025-03-14 10:21:15 [+0100], Toke Høiland-Jørgensen wrote: > Hmm, how about putting the reset (essentially the changes you have > above) into bpf_prog_run_xdp() itself, before executing the BPF program? That would be the snippet below. It does work as far as the testcase goes. It is just and unconditional write which might look like a waste but given the circumstances… While at it, is there anything that ensures that only bpf_prog_run_xdp() can invoke the map_redirect callback? Mainline only assigns the task pointer in NAPI callback so any usage outside of bpf_prog_run_xdp() will lead to a segfault and I haven't seen a report yet so… --- a/include/net/xdp.h +++ b/include/net/xdp.h @@ -486,7 +486,12 @@ static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog, * under local_bh_disable(), which provides the needed RCU protection * for accessing map entries. */ - u32 act = __bpf_prog_run(prog, xdp, BPF_DISPATCHER_FUNC(xdp)); + struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + u32 act; + + ri->map_id = INT_MAX; + ri->map_type = BPF_MAP_TYPE_UNSPEC; + act = __bpf_prog_run(prog, xdp, BPF_DISPATCHER_FUNC(xdp)); if (static_branch_unlikely(&bpf_master_redirect_enabled_key)) { if (act == XDP_TX && netif_is_bond_slave(xdp->rxq->dev))
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: > On 2025-03-14 10:21:15 [+0100], Toke Høiland-Jørgensen wrote: >> Hmm, how about putting the reset (essentially the changes you have >> above) into bpf_prog_run_xdp() itself, before executing the BPF program? > > That would be the snippet below. It does work as far as the testcase > goes. It is just and unconditional write which might look like a waste > but given the circumstances… Hmm, yeah, it would slow down applications that never redirect, I suppose. Hmm, we could avoid the write by checking the values first? See below. > While at it, is there anything that ensures that only bpf_prog_run_xdp() > can invoke the map_redirect callback? Mainline only assigns the task > pointer in NAPI callback so any usage outside of bpf_prog_run_xdp() will > lead to a segfault and I haven't seen a report yet so… Yes, the verifier restricts which program types can call the map_redirect helper. > --- a/include/net/xdp.h > +++ b/include/net/xdp.h > @@ -486,7 +486,12 @@ static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog, > * under local_bh_disable(), which provides the needed RCU protection > * for accessing map entries. > */ > - u32 act = __bpf_prog_run(prog, xdp, BPF_DISPATCHER_FUNC(xdp)); > + struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); > + u32 act; > + Add an if here like if (ri->map_id | ri->map_type) { /* single | to make it a single branch */ > + ri->map_id = INT_MAX; > + ri->map_type = BPF_MAP_TYPE_UNSPEC; } Also, ri->map_id should be set to 0, not INT_MAX. -Toke
On 2025-03-14 17:03:35 [+0100], Toke Høiland-Jørgensen wrote: > > While at it, is there anything that ensures that only bpf_prog_run_xdp() > > can invoke the map_redirect callback? Mainline only assigns the task > > pointer in NAPI callback so any usage outside of bpf_prog_run_xdp() will > > lead to a segfault and I haven't seen a report yet so… > > Yes, the verifier restricts which program types can call the > map_redirect helper. Okay. So checks for the BPF_PROG_TYPE_XDP type for the map_redirect and that is the only one setting it. Okay. Now I remember Alexei mentioning something… > > --- a/include/net/xdp.h > > +++ b/include/net/xdp.h > > @@ -486,7 +486,12 @@ static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog, > > * under local_bh_disable(), which provides the needed RCU protection > > * for accessing map entries. > > */ > > - u32 act = __bpf_prog_run(prog, xdp, BPF_DISPATCHER_FUNC(xdp)); > > + struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); > > + u32 act; > > + > > Add an if here like > > if (ri->map_id | ri->map_type) { /* single | to make it a single branch */ > > > + ri->map_id = INT_MAX; > > + ri->map_type = BPF_MAP_TYPE_UNSPEC; > > } > > Also, ri->map_id should be set to 0, not INT_MAX. The or variant does | add %gs:this_cpu_off(%rip), %rax # this_cpu_off, tcp_ptr__ | movl 32(%rax), %edx # _51->map_id, _51->map_id | orl 36(%rax), %edx # _51->map_type, tmp311 | je .L1546 #, | movq $0, 32(%rax) #, MEM <vector(2) unsigned int> [(unsigned int *)_51 + 32B] | .L1546: while the || does | add %gs:this_cpu_off(%rip), %rax # this_cpu_off, tcp_ptr__ | cmpq $0, 32(%rax) #, *_51 | je .L1546 #, | movq $0, 32(%rax) #, MEM <vector(2) unsigned int> [(unsigned int *)_51 + 32B] | .L1546: gcc isn't bad at optimizing here ;) This is the or version as asked for. I don't mind doing any of the both. I everyone agrees then I would send it to Greg. --- a/include/net/xdp.h +++ b/include/net/xdp.h @@ -486,7 +486,14 @@ static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog, * under local_bh_disable(), which provides the needed RCU protection * for accessing map entries. */ - u32 act = __bpf_prog_run(prog, xdp, BPF_DISPATCHER_FUNC(xdp)); + struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + u32 act; + + if (ri->map_id | ri->map_type) { + ri->map_id = 0; + ri->map_type = BPF_MAP_TYPE_UNSPEC; + } + act = __bpf_prog_run(prog, xdp, BPF_DISPATCHER_FUNC(xdp)); if (static_branch_unlikely(&bpf_master_redirect_enabled_key)) { if (act == XDP_TX && netif_is_bond_slave(xdp->rxq->dev)) > -Toke Sebastian
diff --git a/net/core/filter.c b/net/core/filter.c index be313928d272..1d906b7a541d 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -9000,8 +9000,12 @@ static bool xdp_is_valid_access(int off, int size, void bpf_warn_invalid_xdp_action(struct net_device *dev, struct bpf_prog *prog, u32 act) { + struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); const u32 act_max = XDP_REDIRECT; + ri->map_id = INT_MAX; + ri->map_type = BPF_MAP_TYPE_UNSPEC; + pr_warn_once("%s XDP return value %u on prog %s (id %d) dev %s, expect packet loss!\n", act > act_max ? "Illegal" : "Driver unsupported", act, prog->aux->name, prog->aux->id, dev ? dev->name : "N/A");