Message ID | CAPPBnEYn-CiWVTuRq_Fq=TP0f-W+_hcJVU61xwnxqpFr3jRcyQ@mail.gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | bpf: Avoid deadlock caused by nested kprobe and fentry bpf programs | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Guessing tree name failed - patch did not apply |
bpf/vmtest-bpf-PR | fail | merge-conflict |
On Thu, 12 Dec 2024 at 18:58, Priya Bala Govindasamy <pgovind2@uci.edu> wrote: > > BPF program types like kprobe and fentry can cause deadlocks in certain > situations. If a function takes a lock and one of these bpf programs is > hooked to some point in the function's critical section, and if the > bpf program tries to call the same function and take the same lock it will > lead to deadlock. These situations have been reported in the following > bug reports. > > In percpu_freelist - > Link: https://lore.kernel.org/bpf/CAADnVQLAHwsa+2C6j9+UC6ScrDaN9Fjqv1WjB1pP9AzJLhKuLQ@mail.gmail.com/T/ > Link: https://lore.kernel.org/bpf/CAPPBnEYm+9zduStsZaDnq93q1jPLqO-PiKX9jy0MuL8LCXmCrQ@mail.gmail.com/T/ > In bpf_lru_list - > Link: https://lore.kernel.org/bpf/CAPPBnEajj+DMfiR_WRWU5=6A7KKULdB5Rob_NJopFLWF+i9gCA@mail.gmail.com/T/ > Link: https://lore.kernel.org/bpf/CAPPBnEZQDVN6VqnQXvVqGoB+ukOtHGZ9b9U0OLJJYvRoSsMY_g@mail.gmail.com/T/ > Link: https://lore.kernel.org/bpf/CAPPBnEaCB1rFAYU7Wf8UxqcqOWKmRPU1Nuzk3_oLk6qXR7LBOA@mail.gmail.com/T/ > > Similar bugs have been reported by syzbot. > In queue_stack_maps - > Link: https://lore.kernel.org/lkml/0000000000004c3fc90615f37756@google.com/ > Link: https://lore.kernel.org/all/20240418230932.2689-1-hdanton@sina.com/T/ > In lpm_trie - > Link: https://lore.kernel.org/linux-kernel/00000000000035168a061a47fa38@google.com/T/ > In ringbuf - > Link: https://lore.kernel.org/bpf/20240313121345.2292-1-hdanton@sina.com/T/ > > Prevent kprobe and fentry bpf programs from attaching to these critical > sections by removing CC_FLAGS_FTRACE for percpu_freelist.o, > bpf_lru_list.o, queue_stack_maps.o, lpm_trie.o, ringbuf.o files. > I think the current solution is to use a per-CPU variable to prevent deadlocks. You can look at the hashmap implementation for reference. However, ABBA deadlocks are still possible, so to avoid these, I think the BPF community is working towards implementing resilient spinlocks. I was planning to send patches for some of these bugs earlier. I'm wondering if per-CPU checks would still be valid once resilient spinlocks are introduced? > The bugs reported by syzbot are due to tracepoint bpf programs being > called in the critical sections. This patch does not aim to fix deadlocks > caused by tracepoint programs. However, it does prevent deadlocks from > occurring in similar situations due to kprobe and fentry programs. > > Signed-off-by: Priya Bala Govindasamy <pgovind2@uci.edu> > --- > kernel/bpf/Makefile | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile > index 7eb9ad3a3ae6..121ebcdc26cc 100644 > --- a/kernel/bpf/Makefile > +++ b/kernel/bpf/Makefile > @@ -52,3 +52,10 @@ obj-$(CONFIG_BPF_PRELOAD) += preload/ > obj-$(CONFIG_BPF_SYSCALL) += relo_core.o > $(obj)/relo_core.o: $(srctree)/tools/lib/bpf/relo_core.c FORCE > $(call if_changed_rule,cc_o_c) > + > +CFLAGS_REMOVE_percpu_freelist.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_bpf_lru_list.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_queue_stack_maps.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_lpm_trie.o = $(CC_FLAGS_FTRACE) > +CFLAGS_REMOVE_ringbuf.o = $(CC_FLAGS_FTRACE) > + > -- > 2.34.1 >
On Thu, Dec 12, 2024 at 4:41 PM Siddharth Chintamaneni <sidchintamaneni@gmail.com> wrote: > > On Thu, 12 Dec 2024 at 18:58, Priya Bala Govindasamy <pgovind2@uci.edu> wrote: > > > > BPF program types like kprobe and fentry can cause deadlocks in certain > > situations. If a function takes a lock and one of these bpf programs is > > hooked to some point in the function's critical section, and if the > > bpf program tries to call the same function and take the same lock it will > > lead to deadlock. These situations have been reported in the following > > bug reports. > > > > In percpu_freelist - > > Link: https://lore.kernel.org/bpf/CAADnVQLAHwsa+2C6j9+UC6ScrDaN9Fjqv1WjB1pP9AzJLhKuLQ@mail.gmail.com/T/ > > Link: https://lore.kernel.org/bpf/CAPPBnEYm+9zduStsZaDnq93q1jPLqO-PiKX9jy0MuL8LCXmCrQ@mail.gmail.com/T/ > > In bpf_lru_list - > > Link: https://lore.kernel.org/bpf/CAPPBnEajj+DMfiR_WRWU5=6A7KKULdB5Rob_NJopFLWF+i9gCA@mail.gmail.com/T/ > > Link: https://lore.kernel.org/bpf/CAPPBnEZQDVN6VqnQXvVqGoB+ukOtHGZ9b9U0OLJJYvRoSsMY_g@mail.gmail.com/T/ > > Link: https://lore.kernel.org/bpf/CAPPBnEaCB1rFAYU7Wf8UxqcqOWKmRPU1Nuzk3_oLk6qXR7LBOA@mail.gmail.com/T/ > > > > Similar bugs have been reported by syzbot. > > In queue_stack_maps - > > Link: https://lore.kernel.org/lkml/0000000000004c3fc90615f37756@google.com/ > > Link: https://lore.kernel.org/all/20240418230932.2689-1-hdanton@sina.com/T/ > > In lpm_trie - > > Link: https://lore.kernel.org/linux-kernel/00000000000035168a061a47fa38@google.com/T/ > > In ringbuf - > > Link: https://lore.kernel.org/bpf/20240313121345.2292-1-hdanton@sina.com/T/ > > > > Prevent kprobe and fentry bpf programs from attaching to these critical > > sections by removing CC_FLAGS_FTRACE for percpu_freelist.o, > > bpf_lru_list.o, queue_stack_maps.o, lpm_trie.o, ringbuf.o files. > > > > I think the current solution is to use a per-CPU variable to prevent > deadlocks. You can look at the hashmap implementation for reference. > However, ABBA deadlocks are still possible, so to avoid these, I think > the BPF community is working towards implementing resilient spinlocks. Right. The resilient spinlocks are wip, but in the meantime we need to stop the bleeding. > I was planning to send patches for some of these bugs earlier. I'm > wondering if per-CPU checks would still be valid once resilient > spinlocks are introduced? The wip patches with res_spin_lock remove these per-cpu recursion counters from hash map and other places.
On Thu, 12 Dec 2024 at 22:26, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Thu, Dec 12, 2024 at 4:41 PM Siddharth Chintamaneni > <sidchintamaneni@gmail.com> wrote: > > > > On Thu, 12 Dec 2024 at 18:58, Priya Bala Govindasamy <pgovind2@uci.edu> wrote: > > > > > > BPF program types like kprobe and fentry can cause deadlocks in certain > > > situations. If a function takes a lock and one of these bpf programs is > > > hooked to some point in the function's critical section, and if the > > > bpf program tries to call the same function and take the same lock it will > > > lead to deadlock. These situations have been reported in the following > > > bug reports. > > > > > > In percpu_freelist - > > > Link: https://lore.kernel.org/bpf/CAADnVQLAHwsa+2C6j9+UC6ScrDaN9Fjqv1WjB1pP9AzJLhKuLQ@mail.gmail.com/T/ > > > Link: https://lore.kernel.org/bpf/CAPPBnEYm+9zduStsZaDnq93q1jPLqO-PiKX9jy0MuL8LCXmCrQ@mail.gmail.com/T/ > > > In bpf_lru_list - > > > Link: https://lore.kernel.org/bpf/CAPPBnEajj+DMfiR_WRWU5=6A7KKULdB5Rob_NJopFLWF+i9gCA@mail.gmail.com/T/ > > > Link: https://lore.kernel.org/bpf/CAPPBnEZQDVN6VqnQXvVqGoB+ukOtHGZ9b9U0OLJJYvRoSsMY_g@mail.gmail.com/T/ > > > Link: https://lore.kernel.org/bpf/CAPPBnEaCB1rFAYU7Wf8UxqcqOWKmRPU1Nuzk3_oLk6qXR7LBOA@mail.gmail.com/T/ > > > > > > Similar bugs have been reported by syzbot. > > > In queue_stack_maps - > > > Link: https://lore.kernel.org/lkml/0000000000004c3fc90615f37756@google.com/ > > > Link: https://lore.kernel.org/all/20240418230932.2689-1-hdanton@sina.com/T/ > > > In lpm_trie - > > > Link: https://lore.kernel.org/linux-kernel/00000000000035168a061a47fa38@google.com/T/ > > > In ringbuf - > > > Link: https://lore.kernel.org/bpf/20240313121345.2292-1-hdanton@sina.com/T/ > > > > > > Prevent kprobe and fentry bpf programs from attaching to these critical > > > sections by removing CC_FLAGS_FTRACE for percpu_freelist.o, > > > bpf_lru_list.o, queue_stack_maps.o, lpm_trie.o, ringbuf.o files. > > > > > > > I think the current solution is to use a per-CPU variable to prevent > > deadlocks. You can look at the hashmap implementation for reference. > > However, ABBA deadlocks are still possible, so to avoid these, I think > > the BPF community is working towards implementing resilient spinlocks. > > Right. The resilient spinlocks are wip, but in the meantime > we need to stop the bleeding. > Ok I can resend the patches I was working on. https://lore.kernel.org/all/202405041108.2Up5HT0H-lkp@intel.com/T/ I remember that you shared the RFC patch set for resilient spinlocks with me, but I didn't get a chance to check them at the time. Now that I have more free time, I'd be happy to help you test that work if you'd like. > > I was planning to send patches for some of these bugs earlier. I'm > > wondering if per-CPU checks would still be valid once resilient > > spinlocks are introduced? > > The wip patches with res_spin_lock remove these per-cpu > recursion counters from hash map and other places.
On Thu, Dec 12, 2024 at 7:39 PM Siddharth Chintamaneni <sidchintamaneni@gmail.com> wrote: > > On Thu, 12 Dec 2024 at 22:26, Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Thu, Dec 12, 2024 at 4:41 PM Siddharth Chintamaneni > > <sidchintamaneni@gmail.com> wrote: > > > > > > On Thu, 12 Dec 2024 at 18:58, Priya Bala Govindasamy <pgovind2@uci.edu> wrote: > > > > > > > > BPF program types like kprobe and fentry can cause deadlocks in certain > > > > situations. If a function takes a lock and one of these bpf programs is > > > > hooked to some point in the function's critical section, and if the > > > > bpf program tries to call the same function and take the same lock it will > > > > lead to deadlock. These situations have been reported in the following > > > > bug reports. > > > > > > > > In percpu_freelist - > > > > Link: https://lore.kernel.org/bpf/CAADnVQLAHwsa+2C6j9+UC6ScrDaN9Fjqv1WjB1pP9AzJLhKuLQ@mail.gmail.com/T/ > > > > Link: https://lore.kernel.org/bpf/CAPPBnEYm+9zduStsZaDnq93q1jPLqO-PiKX9jy0MuL8LCXmCrQ@mail.gmail.com/T/ > > > > In bpf_lru_list - > > > > Link: https://lore.kernel.org/bpf/CAPPBnEajj+DMfiR_WRWU5=6A7KKULdB5Rob_NJopFLWF+i9gCA@mail.gmail.com/T/ > > > > Link: https://lore.kernel.org/bpf/CAPPBnEZQDVN6VqnQXvVqGoB+ukOtHGZ9b9U0OLJJYvRoSsMY_g@mail.gmail.com/T/ > > > > Link: https://lore.kernel.org/bpf/CAPPBnEaCB1rFAYU7Wf8UxqcqOWKmRPU1Nuzk3_oLk6qXR7LBOA@mail.gmail.com/T/ > > > > > > > > Similar bugs have been reported by syzbot. > > > > In queue_stack_maps - > > > > Link: https://lore.kernel.org/lkml/0000000000004c3fc90615f37756@google.com/ > > > > Link: https://lore.kernel.org/all/20240418230932.2689-1-hdanton@sina.com/T/ > > > > In lpm_trie - > > > > Link: https://lore.kernel.org/linux-kernel/00000000000035168a061a47fa38@google.com/T/ > > > > In ringbuf - > > > > Link: https://lore.kernel.org/bpf/20240313121345.2292-1-hdanton@sina.com/T/ > > > > > > > > Prevent kprobe and fentry bpf programs from attaching to these critical > > > > sections by removing CC_FLAGS_FTRACE for percpu_freelist.o, > > > > bpf_lru_list.o, queue_stack_maps.o, lpm_trie.o, ringbuf.o files. > > > > > > > > > > I think the current solution is to use a per-CPU variable to prevent > > > deadlocks. You can look at the hashmap implementation for reference. > > > However, ABBA deadlocks are still possible, so to avoid these, I think > > > the BPF community is working towards implementing resilient spinlocks. > > > > Right. The resilient spinlocks are wip, but in the meantime > > we need to stop the bleeding. > > > > Ok I can resend the patches I was working on. > https://lore.kernel.org/all/202405041108.2Up5HT0H-lkp@intel.com/T/ No need. It's just more to revert later. > I remember that you shared the RFC patch set for resilient spinlocks > with me, but I didn't get a chance to check them at the time. Now that > I have more free time, I'd be happy to help you test that work if > you'd like. That would be great. Pls help review when they are posted publicly hopefully in a couple weeks.
On Thu, Dec 12, 2024 at 3:58 PM Priya Bala Govindasamy <pgovind2@uci.edu> wrote: > > BPF program types like kprobe and fentry can cause deadlocks in certain > situations. If a function takes a lock and one of these bpf programs is > hooked to some point in the function's critical section, and if the > bpf program tries to call the same function and take the same lock it will > lead to deadlock. These situations have been reported in the following > bug reports. > > In percpu_freelist - > Link: https://lore.kernel.org/bpf/CAADnVQLAHwsa+2C6j9+UC6ScrDaN9Fjqv1WjB1pP9AzJLhKuLQ@mail.gmail.com/T/ > Link: https://lore.kernel.org/bpf/CAPPBnEYm+9zduStsZaDnq93q1jPLqO-PiKX9jy0MuL8LCXmCrQ@mail.gmail.com/T/ > In bpf_lru_list - > Link: https://lore.kernel.org/bpf/CAPPBnEajj+DMfiR_WRWU5=6A7KKULdB5Rob_NJopFLWF+i9gCA@mail.gmail.com/T/ > Link: https://lore.kernel.org/bpf/CAPPBnEZQDVN6VqnQXvVqGoB+ukOtHGZ9b9U0OLJJYvRoSsMY_g@mail.gmail.com/T/ > Link: https://lore.kernel.org/bpf/CAPPBnEaCB1rFAYU7Wf8UxqcqOWKmRPU1Nuzk3_oLk6qXR7LBOA@mail.gmail.com/T/ > > Similar bugs have been reported by syzbot. > In queue_stack_maps - > Link: https://lore.kernel.org/lkml/0000000000004c3fc90615f37756@google.com/ > Link: https://lore.kernel.org/all/20240418230932.2689-1-hdanton@sina.com/T/ > In lpm_trie - > Link: https://lore.kernel.org/linux-kernel/00000000000035168a061a47fa38@google.com/T/ > In ringbuf - > Link: https://lore.kernel.org/bpf/20240313121345.2292-1-hdanton@sina.com/T/ > > Prevent kprobe and fentry bpf programs from attaching to these critical > sections by removing CC_FLAGS_FTRACE for percpu_freelist.o, > bpf_lru_list.o, queue_stack_maps.o, lpm_trie.o, ringbuf.o files. > > The bugs reported by syzbot are due to tracepoint bpf programs being > called in the critical sections. This patch does not aim to fix deadlocks > caused by tracepoint programs. However, it does prevent deadlocks from > occurring in similar situations due to kprobe and fentry programs. > > Signed-off-by: Priya Bala Govindasamy <pgovind2@uci.edu> > --- > kernel/bpf/Makefile | 7 +++++++ > 1 file changed, 7 insertions(+) Applying: bpf: Avoid deadlock caused by nested kprobe and fentry bpf programs Using index info to reconstruct a base tree... M kernel/bpf/Makefile error: patch failed: kernel/bpf/Makefile:52 error: kernel/bpf/Makefile: patch does not apply error: Did you hand edit your patch? Pls make sure that patch applies before sending it. After than pls respin with [PATCH bpf-next] bpf: ... in the subject, so CI knows what tree to use to test it.
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index 7eb9ad3a3ae6..121ebcdc26cc 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -52,3 +52,10 @@ obj-$(CONFIG_BPF_PRELOAD) += preload/ obj-$(CONFIG_BPF_SYSCALL) += relo_core.o $(obj)/relo_core.o: $(srctree)/tools/lib/bpf/relo_core.c FORCE $(call if_changed_rule,cc_o_c) + +CFLAGS_REMOVE_percpu_freelist.o = $(CC_FLAGS_FTRACE) +CFLAGS_REMOVE_bpf_lru_list.o = $(CC_FLAGS_FTRACE) +CFLAGS_REMOVE_queue_stack_maps.o = $(CC_FLAGS_FTRACE) +CFLAGS_REMOVE_lpm_trie.o = $(CC_FLAGS_FTRACE) +CFLAGS_REMOVE_ringbuf.o = $(CC_FLAGS_FTRACE) +
BPF program types like kprobe and fentry can cause deadlocks in certain situations. If a function takes a lock and one of these bpf programs is hooked to some point in the function's critical section, and if the bpf program tries to call the same function and take the same lock it will lead to deadlock. These situations have been reported in the following bug reports. In percpu_freelist - Link: https://lore.kernel.org/bpf/CAADnVQLAHwsa+2C6j9+UC6ScrDaN9Fjqv1WjB1pP9AzJLhKuLQ@mail.gmail.com/T/ Link: https://lore.kernel.org/bpf/CAPPBnEYm+9zduStsZaDnq93q1jPLqO-PiKX9jy0MuL8LCXmCrQ@mail.gmail.com/T/ In bpf_lru_list - Link: https://lore.kernel.org/bpf/CAPPBnEajj+DMfiR_WRWU5=6A7KKULdB5Rob_NJopFLWF+i9gCA@mail.gmail.com/T/ Link: https://lore.kernel.org/bpf/CAPPBnEZQDVN6VqnQXvVqGoB+ukOtHGZ9b9U0OLJJYvRoSsMY_g@mail.gmail.com/T/ Link: https://lore.kernel.org/bpf/CAPPBnEaCB1rFAYU7Wf8UxqcqOWKmRPU1Nuzk3_oLk6qXR7LBOA@mail.gmail.com/T/ Similar bugs have been reported by syzbot. In queue_stack_maps - Link: https://lore.kernel.org/lkml/0000000000004c3fc90615f37756@google.com/ Link: https://lore.kernel.org/all/20240418230932.2689-1-hdanton@sina.com/T/ In lpm_trie - Link: https://lore.kernel.org/linux-kernel/00000000000035168a061a47fa38@google.com/T/ In ringbuf - Link: https://lore.kernel.org/bpf/20240313121345.2292-1-hdanton@sina.com/T/ Prevent kprobe and fentry bpf programs from attaching to these critical sections by removing CC_FLAGS_FTRACE for percpu_freelist.o, bpf_lru_list.o, queue_stack_maps.o, lpm_trie.o, ringbuf.o files. The bugs reported by syzbot are due to tracepoint bpf programs being called in the critical sections. This patch does not aim to fix deadlocks caused by tracepoint programs. However, it does prevent deadlocks from occurring in similar situations due to kprobe and fentry programs. Signed-off-by: Priya Bala Govindasamy <pgovind2@uci.edu> --- kernel/bpf/Makefile | 7 +++++++ 1 file changed, 7 insertions(+) -- 2.34.1