Message ID | 20240209040608.98927-21-alexei.starovoitov@gmail.com (mailing list archive)
---|---
State | New
Series | bpf: Introduce BPF arena.
On Fri, Feb 9, 2024 at 11:06 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> On Fri, 9 Feb 2024 at 05:07, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Convert simple page_frag allocator to per-cpu page_frag to further stress test
> > a combination of __arena global and static variables and alloc/free from arena.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
>
> I know this organically grew from a toy implementation, but since
> people will most likely be looking at selftests as usage examples, it
> might be better to expose bpf_preempt_disable/enable and use it in the
> case of per-CPU page_frag allocator? No need to block on this, can be
> added on top later.
>
> The kfunc is useful on its own for writing safe per-CPU data
> structures or other memory allocators like bpf_ma on top of arenas.
> It is also necessary as a building block for writing spin locks
> natively in BPF on top of the arena map which we may add later.
> I have a patch lying around for this, verifier plumbing is mostly the
> same as rcu_read_lock.
> I can send it out with tests, or otherwise if you want to add it to
> this series, you go ahead.

Please send it.
I think the verifier checks need to be tighter than for rcu_read_lock.
preempt_enable/disable should be as strict as bpf_spin_lock.

The plan is to add bpf_arena_spin_lock() in a follow-up and use it in this
bpf page_frag allocator to make it work properly outside of tracing context.
I'm not sure yet whether bpf_preempt_disable() will be sufficient.

And in the long run the idea is to convert all these bpf_arena* facilities
into a libc equivalent. Probably not part of libbpf, but some new package.
Name TBD.
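As a rough sketch of what the discussion above suggests (not part of this series), the per-CPU page_frag allocator could be wrapped in preempt-disable kfuncs along the following lines. The bpf_preempt_disable()/bpf_preempt_enable() kfunc names and the wrapper itself are assumptions taken from the thread, not an API introduced by this patch:

/*
 * Sketch only: assumes bpf_preempt_disable()/bpf_preempt_enable() kfuncs
 * are exposed as proposed above; they are not part of this series.
 */
extern void bpf_preempt_disable(void) __ksym;
extern void bpf_preempt_enable(void) __ksym;

/* Keep the bpf_get_smp_processor_id() read and the per-CPU page_frag state
 * update on the same CPU by disallowing preemption around the allocation.
 */
static inline void __arena *bpf_alloc_nopreempt(unsigned int size)
{
	void __arena *obj;

	bpf_preempt_disable();
	obj = bpf_alloc(size);
	bpf_preempt_enable();
	return obj;
}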
diff --git a/tools/testing/selftests/bpf/bpf_arena_alloc.h b/tools/testing/selftests/bpf/bpf_arena_alloc.h
index 0f4cb399b4c7..c27678299e0c 100644
--- a/tools/testing/selftests/bpf/bpf_arena_alloc.h
+++ b/tools/testing/selftests/bpf/bpf_arena_alloc.h
@@ -10,14 +10,19 @@
 #define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1)
 #endif
 
-void __arena *cur_page;
-int cur_offset;
+#ifdef __BPF__
+#define NR_CPUS (sizeof(struct cpumask) * 8)
+
+static void __arena * __arena page_frag_cur_page[NR_CPUS];
+static int __arena page_frag_cur_offset[NR_CPUS];
 
 /* Simple page_frag allocator */
 static inline void __arena* bpf_alloc(unsigned int size)
 {
 	__u64 __arena *obj_cnt;
-	void __arena *page = cur_page;
+	__u32 cpu = bpf_get_smp_processor_id();
+	void __arena *page = page_frag_cur_page[cpu];
+	int __arena *cur_offset = &page_frag_cur_offset[cpu];
 	int offset;
 
 	size = round_up(size, 8);
@@ -29,8 +34,8 @@ static inline void __arena* bpf_alloc(unsigned int size)
 		if (!page)
 			return NULL;
 		cast_kern(page);
-		cur_page = page;
-		cur_offset = PAGE_SIZE - 8;
+		page_frag_cur_page[cpu] = page;
+		*cur_offset = PAGE_SIZE - 8;
 		obj_cnt = page + PAGE_SIZE - 8;
 		*obj_cnt = 0;
 	} else {
@@ -38,12 +43,12 @@ static inline void __arena* bpf_alloc(unsigned int size)
 		obj_cnt = page + PAGE_SIZE - 8;
 	}
 
-	offset = cur_offset - size;
+	offset = *cur_offset - size;
 	if (offset < 0)
 		goto refill;
 
 	(*obj_cnt)++;
-	cur_offset = offset;
+	*cur_offset = offset;
 
 	return page + offset;
 }
@@ -56,3 +61,7 @@ static inline void bpf_free(void __arena *addr)
 	if (--(*obj_cnt) == 0)
 		bpf_arena_free_pages(&arena, addr, 1);
 }
+#else
+static inline void __arena* bpf_alloc(unsigned int size) { return NULL; }
+static inline void bpf_free(void __arena *addr) {}
+#endif
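For context, a minimal (hypothetical) BPF program using this header could look roughly as follows; the arena map sizing, struct elem, and the section name are illustrative assumptions modeled on the other arena selftests, not part of this patch:

#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

/* The arena map must be defined before including bpf_arena_alloc.h,
 * since bpf_alloc()/bpf_free() reference &arena. Sizes are examples only.
 */
struct {
	__uint(type, BPF_MAP_TYPE_ARENA);
	__uint(map_flags, BPF_F_MMAPABLE);
	__uint(max_entries, 100); /* number of arena pages */
} arena SEC(".maps");

#include "bpf_arena_alloc.h"

struct elem {
	int value;
};

SEC("syscall")
int alloc_free_one(void *ctx)
{
	/* allocation is served from this CPU's current page_frag page */
	struct elem __arena *e = bpf_alloc(sizeof(*e));

	if (!e)
		return 1;
	e->value = 42;
	/* the page goes back to the arena once its object count drops to 0 */
	bpf_free(e);
	return 0;
}

char _license[] SEC("license") = "GPL";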