Message ID | 20220921073826.2365800-1-houtao@huaweicloud.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 1d8b82c613297f24354b4d750413a7456b5cd92c |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next,RESEND] bpf: Always use raw spinlock for hash bucket lock | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Clearly marked for bpf-next |
netdev/fixes_present | success | Fixes tag not required for -next series |
netdev/subject_prefix | success | Link |
netdev/cover_letter | success | Single patches do not need cover letters |
netdev/patch_count | success | Link |
netdev/header_inline | success | No static functions without inline keyword in header files |
netdev/build_32bit | success | Errors and warnings before: 12 this patch: 12 |
netdev/cc_maintainers | warning | 2 maintainers not CCed: song@kernel.org martin.lau@linux.dev |
netdev/build_clang | success | Errors and warnings before: 5 this patch: 5 |
netdev/module_param | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Signed-off-by tag matches author and committer |
netdev/check_selftest | success | No net selftest shell script |
netdev/verify_fixes | success | No Fixes tag |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 12 this patch: 12 |
netdev/checkpatch | warning | CHECK: Alignment should match open parenthesis |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/source_inline | success | Was 1 now: 0 |
bpf/vmtest-bpf-next-PR | success | PR summary |
bpf/vmtest-bpf-next-VM_Test-1 | success | Logs for build for s390x with gcc |
bpf/vmtest-bpf-next-VM_Test-16 | success | Logs for test_verifier on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-13 | success | Logs for test_progs_no_alu32 on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-14 | success | Logs for test_progs_no_alu32 on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-17 | success | Logs for test_verifier on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-8 | success | Logs for test_maps on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-10 | success | Logs for test_progs on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-11 | success | Logs for test_progs on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-7 | success | Logs for test_maps on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-15 | success | Logs for test_verifier on s390x with gcc |
bpf/vmtest-bpf-next-VM_Test-9 | success | Logs for test_progs on s390x with gcc |
bpf/vmtest-bpf-next-VM_Test-12 | success | Logs for test_progs_no_alu32 on s390x with gcc |
bpf/vmtest-bpf-next-VM_Test-6 | success | Logs for test_maps on s390x with gcc |
bpf/vmtest-bpf-next-VM_Test-2 | success | Logs for build for x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-3 | success | Logs for build for x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-4 | success | Logs for llvm-toolchain |
bpf/vmtest-bpf-next-VM_Test-5 | success | Logs for set-matrix |
Hello: This patch was applied to bpf/bpf-next.git (master) by Alexei Starovoitov <ast@kernel.org>: On Wed, 21 Sep 2022 15:38:26 +0800 you wrote: > From: Hou Tao <houtao1@huawei.com> > > For a non-preallocated hash map on RT kernel, regular spinlock instead > of raw spinlock is used for bucket lock. The reason is that on RT kernel > memory allocation is forbidden under atomic context and regular spinlock > is sleepable under RT. > > [...] Here is the summary with links: - [bpf-next,RESEND] bpf: Always use raw spinlock for hash bucket lock https://git.kernel.org/bpf/bpf-next/c/1d8b82c61329 You are awesome, thank you!
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index 86aec20c22d0..ed3f8a53603b 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -68,24 +68,16 @@ * In theory the BPF locks could be converted to regular spinlocks as well, * but the bucket locks and percpu_freelist locks can be taken from * arbitrary contexts (perf, kprobes, tracepoints) which are required to be - * atomic contexts even on RT. These mechanisms require preallocated maps, - * so there is no need to invoke memory allocations within the lock held - * sections. - * - * BPF maps which need dynamic allocation are only used from (forced) - * thread context on RT and can therefore use regular spinlocks which in - * turn allows to invoke memory allocations from the lock held section. - * - * On a non RT kernel this distinction is neither possible nor required. - * spinlock maps to raw_spinlock and the extra code is optimized out by the - * compiler. + * atomic contexts even on RT. Before the introduction of bpf_mem_alloc, + * it is only safe to use raw spinlock for preallocated hash map on a RT kernel, + * because there is no memory allocation within the lock held sections. However + * after hash map was fully converted to use bpf_mem_alloc, there will be + * non-synchronous memory allocation for non-preallocated hash map, so it is + * safe to always use raw spinlock for bucket lock. */ struct bucket { struct hlist_nulls_head head; - union { - raw_spinlock_t raw_lock; - spinlock_t lock; - }; + raw_spinlock_t raw_lock; }; #define HASHTAB_MAP_LOCK_COUNT 8 @@ -141,26 +133,15 @@ static inline bool htab_is_prealloc(const struct bpf_htab *htab) return !(htab->map.map_flags & BPF_F_NO_PREALLOC); } -static inline bool htab_use_raw_lock(const struct bpf_htab *htab) -{ - return (!IS_ENABLED(CONFIG_PREEMPT_RT) || htab_is_prealloc(htab)); -} - static void htab_init_buckets(struct bpf_htab *htab) { unsigned int i; for (i = 0; i < htab->n_buckets; i++) { INIT_HLIST_NULLS_HEAD(&htab->buckets[i].head, i); - if (htab_use_raw_lock(htab)) { - raw_spin_lock_init(&htab->buckets[i].raw_lock); - lockdep_set_class(&htab->buckets[i].raw_lock, + raw_spin_lock_init(&htab->buckets[i].raw_lock); + lockdep_set_class(&htab->buckets[i].raw_lock, &htab->lockdep_key); - } else { - spin_lock_init(&htab->buckets[i].lock); - lockdep_set_class(&htab->buckets[i].lock, - &htab->lockdep_key); - } cond_resched(); } } @@ -170,28 +151,17 @@ static inline int htab_lock_bucket(const struct bpf_htab *htab, unsigned long *pflags) { unsigned long flags; - bool use_raw_lock; hash = hash & HASHTAB_MAP_LOCK_MASK; - use_raw_lock = htab_use_raw_lock(htab); - if (use_raw_lock) - preempt_disable(); - else - migrate_disable(); + preempt_disable(); if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) { __this_cpu_dec(*(htab->map_locked[hash])); - if (use_raw_lock) - preempt_enable(); - else - migrate_enable(); + preempt_enable(); return -EBUSY; } - if (use_raw_lock) - raw_spin_lock_irqsave(&b->raw_lock, flags); - else - spin_lock_irqsave(&b->lock, flags); + raw_spin_lock_irqsave(&b->raw_lock, flags); *pflags = flags; return 0; @@ -201,18 +171,10 @@ static inline void htab_unlock_bucket(const struct bpf_htab *htab, struct bucket *b, u32 hash, unsigned long flags) { - bool use_raw_lock = htab_use_raw_lock(htab); - hash = hash & HASHTAB_MAP_LOCK_MASK; - if (use_raw_lock) - raw_spin_unlock_irqrestore(&b->raw_lock, flags); - else - spin_unlock_irqrestore(&b->lock, flags); + raw_spin_unlock_irqrestore(&b->raw_lock, flags); __this_cpu_dec(*(htab->map_locked[hash])); - if (use_raw_lock) - preempt_enable(); - else - migrate_enable(); + preempt_enable(); } static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);