Message ID | 20221114191547.1694267-13-memxor@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | Allocated objects, BPF linked lists | expand |
On 11/14/22 2:15 PM, Kumar Kartikeya Dwivedi wrote: > Global variables reside in maps accessible using direct_value_addr > callbacks, so giving each load instruction's rewrite a unique reg->id > disallows us from holding locks which are global. > > The reason for preserving reg->id as a unique value for registers that > may point to spin lock is that two separate lookups are treated as two > separate memory regions, and any possible aliasing is ignored for the > purposes of spin lock correctness. > > This is not great especially for the global variable case, which are > served from maps that have max_entries == 1, i.e. they always lead to > map values pointing into the same map value. > > So refactor the active_spin_lock into a 'active_lock' structure which > represents the lock identity, and instead of the reg->id, remember two > fields, a pointer and the reg->id. The pointer will store reg->map_ptr > or reg->btf. It's only necessary to distinguish for the id == 0 case of > global variables, but always setting the pointer to a non-NULL value and > using the pointer to check whether the lock is held simplifies code in > the verifier. > > This is generic enough to allow it for global variables, map lookups, > and allocated objects at the same time. > > Note that while whether a lock is held can be answered by just comparing > active_lock.ptr to NULL, to determine whether the register is pointing > to the same held lock requires comparing _both_ ptr and id. > > Finally, as a result of this refactoring, pseudo load instructions are > not given a unique reg->id, as they are doing lookup for the same map > value (max_entries is never greater than 1). > > Essentially, we consider that the tuple of (ptr, id) will always be > unique for any kind of argument to bpf_spin_{lock,unlock}. > > Note that this can be extended in the future to also remember offset > used for locking, so that we can introduce multiple bpf_spin_lock fields > in the same allocation. > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > --- > include/linux/bpf_verifier.h | 10 ++++++++- > kernel/bpf/verifier.c | 41 ++++++++++++++++++++++++------------ > 2 files changed, 37 insertions(+), 14 deletions(-) > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h > index 1a32baa78ce2..fa738abea267 100644 > --- a/include/linux/bpf_verifier.h > +++ b/include/linux/bpf_verifier.h > @@ -323,7 +323,15 @@ struct bpf_verifier_state { > u32 branches; > u32 insn_idx; > u32 curframe; > - u32 active_spin_lock; > + struct { > + /* This can either be reg->map_ptr or reg->btf, but it is only > + * used to check whether the lock is held or not by comparing to > + * NULL. > + */ > + void *ptr; > + /* This will be reg->id */ > + u32 id; > + } active_lock; I didn't get back to you re: naming here, but I think these names are clear, especially with comments elaborating on the details. The first comment can be clarified a bit, though. Sounds like "is only used to check whether lock is held or not by comparing to NULL" is saying that active_lock.ptr is only compared to NULL in this patch, but changes to process_spin_lock check which results in verbose(env, "bpf_spin_unlock of different lock\n") are comparing it to another ptr. Maybe you're trying to say "if active_lock.ptr is NULL, there's no active lock and other fields in this struct don't hold anything valid. If non-NULL, there is an active lock held"? Separately, the line in patch summary with "we consider that the tuple of (ptr, id) will always be unique" would help with clarity if it was in the comments here. > bool speculative; > > /* first and last insn idx of this verifier state */ > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 070d003a99f0..99b5edb56978 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -1215,7 +1215,8 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state, > } > dst_state->speculative = src->speculative; > dst_state->curframe = src->curframe; > - dst_state->active_spin_lock = src->active_spin_lock; > + dst_state->active_lock.ptr = src->active_lock.ptr; > + dst_state->active_lock.id = src->active_lock.id; > dst_state->branches = src->branches; > dst_state->parent = src->parent; > dst_state->first_insn_idx = src->first_insn_idx; > @@ -5587,7 +5588,7 @@ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state > * Since only one bpf_spin_lock is allowed the checks are simpler than > * reg_is_refcounted() logic. The verifier needs to remember only > * one spin_lock instead of array of acquired_refs. > - * cur_state->active_spin_lock remembers which map value element got locked > + * cur_state->active_lock remembers which map value element got locked Maybe remove "map value" here? Since previous patch adds support for allocated object. "remembers which element got locked" or "remembers which map value or allocated object" better reflect current state of spin_lock support after these patches. > * and clears it after bpf_spin_unlock. > */ > static int process_spin_lock(struct bpf_verifier_env *env, int regno, > @@ -5636,22 +5637,35 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno, > return -EINVAL; > } > if (is_lock) { > - if (cur->active_spin_lock) { > + if (cur->active_lock.ptr) { > verbose(env, > "Locking two bpf_spin_locks are not allowed\n"); > return -EINVAL; > } > - cur->active_spin_lock = reg->id; > + if (map) > + cur->active_lock.ptr = map; > + else > + cur->active_lock.ptr = btf; > + cur->active_lock.id = reg->id; > } else { > - if (!cur->active_spin_lock) { > + void *ptr; > + > + if (map) > + ptr = map; > + else > + ptr = btf; > + > + if (!cur->active_lock.ptr) { > verbose(env, "bpf_spin_unlock without taking a lock\n"); > return -EINVAL; > } > - if (cur->active_spin_lock != reg->id) { > + if (cur->active_lock.ptr != ptr || > + cur->active_lock.id != reg->id) { > verbose(env, "bpf_spin_unlock of different lock\n"); > return -EINVAL; > } > - cur->active_spin_lock = 0; > + cur->active_lock.ptr = NULL; > + cur->active_lock.id = 0; > } > return 0; > } > @@ -10582,8 +10596,8 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) > insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) { > dst_reg->type = PTR_TO_MAP_VALUE; > dst_reg->off = aux->map_off; > - if (btf_record_has_field(map->record, BPF_SPIN_LOCK)) > - dst_reg->id = ++env->id_gen; > + WARN_ON_ONCE(map->max_entries != 1); > + /* We want reg->id to be same (0) as map_value is not distinct */ > } else if (insn->src_reg == BPF_PSEUDO_MAP_FD || > insn->src_reg == BPF_PSEUDO_MAP_IDX) { > dst_reg->type = CONST_PTR_TO_MAP; > @@ -10661,7 +10675,7 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn) > return err; > } > > - if (env->cur_state->active_spin_lock) { > + if (env->cur_state->active_lock.ptr) { > verbose(env, "BPF_LD_[ABS|IND] cannot be used inside bpf_spin_lock-ed region\n"); > return -EINVAL; > } > @@ -11927,7 +11941,8 @@ static bool states_equal(struct bpf_verifier_env *env, > if (old->speculative && !cur->speculative) > return false; > > - if (old->active_spin_lock != cur->active_spin_lock) > + if (old->active_lock.ptr != cur->active_lock.ptr || > + old->active_lock.id != cur->active_lock.id) > return false; > > /* for states to be equal callsites have to be the same > @@ -12572,7 +12587,7 @@ static int do_check(struct bpf_verifier_env *env) > return -EINVAL; > } > > - if (env->cur_state->active_spin_lock && > + if (env->cur_state->active_lock.ptr && > (insn->src_reg == BPF_PSEUDO_CALL || > insn->imm != BPF_FUNC_spin_unlock)) { > verbose(env, "function calls are not allowed while holding a lock\n"); > @@ -12609,7 +12624,7 @@ static int do_check(struct bpf_verifier_env *env) > return -EINVAL; > } > > - if (env->cur_state->active_spin_lock) { > + if (env->cur_state->active_lock.ptr) { > verbose(env, "bpf_spin_unlock is missing\n"); > return -EINVAL; > }
On Tue, Nov 15, 2022 at 11:03:02PM IST, Dave Marchevsky wrote: > On 11/14/22 2:15 PM, Kumar Kartikeya Dwivedi wrote: > > Global variables reside in maps accessible using direct_value_addr > > callbacks, so giving each load instruction's rewrite a unique reg->id > > disallows us from holding locks which are global. > > > > The reason for preserving reg->id as a unique value for registers that > > may point to spin lock is that two separate lookups are treated as two > > separate memory regions, and any possible aliasing is ignored for the > > purposes of spin lock correctness. > > > > This is not great especially for the global variable case, which are > > served from maps that have max_entries == 1, i.e. they always lead to > > map values pointing into the same map value. > > > > So refactor the active_spin_lock into a 'active_lock' structure which > > represents the lock identity, and instead of the reg->id, remember two > > fields, a pointer and the reg->id. The pointer will store reg->map_ptr > > or reg->btf. It's only necessary to distinguish for the id == 0 case of > > global variables, but always setting the pointer to a non-NULL value and > > using the pointer to check whether the lock is held simplifies code in > > the verifier. > > > > This is generic enough to allow it for global variables, map lookups, > > and allocated objects at the same time. > > > > Note that while whether a lock is held can be answered by just comparing > > active_lock.ptr to NULL, to determine whether the register is pointing > > to the same held lock requires comparing _both_ ptr and id. > > > > Finally, as a result of this refactoring, pseudo load instructions are > > not given a unique reg->id, as they are doing lookup for the same map > > value (max_entries is never greater than 1). > > > > Essentially, we consider that the tuple of (ptr, id) will always be > > unique for any kind of argument to bpf_spin_{lock,unlock}. > > > > Note that this can be extended in the future to also remember offset > > used for locking, so that we can introduce multiple bpf_spin_lock fields > > in the same allocation. > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > --- > > include/linux/bpf_verifier.h | 10 ++++++++- > > kernel/bpf/verifier.c | 41 ++++++++++++++++++++++++------------ > > 2 files changed, 37 insertions(+), 14 deletions(-) > > > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h > > index 1a32baa78ce2..fa738abea267 100644 > > --- a/include/linux/bpf_verifier.h > > +++ b/include/linux/bpf_verifier.h > > @@ -323,7 +323,15 @@ struct bpf_verifier_state { > > u32 branches; > > u32 insn_idx; > > u32 curframe; > > - u32 active_spin_lock; > > + struct { > > + /* This can either be reg->map_ptr or reg->btf, but it is only > > + * used to check whether the lock is held or not by comparing to > > + * NULL. > > + */ > > + void *ptr; > > + /* This will be reg->id */ > > + u32 id; > > + } active_lock; > > I didn't get back to you re: naming here, but I think these names are clear, > especially with comments elaborating on the details. The first comment can > be clarified a bit, though. Sounds like "is only used to check whether lock is > held or not by comparing to NULL" is saying that active_lock.ptr is only > compared to NULL in this patch, but changes to process_spin_lock check > which results in verbose(env, "bpf_spin_unlock of different lock\n") are > comparing it to another ptr. > > Maybe you're trying to say "if active_lock.ptr > is NULL, there's no active lock and other fields in this struct don't hold > anything valid. If non-NULL, there is an active lock held"? > Makes sense, I'll improve the comment. > Separately, the line in patch summary with "we consider that the tuple of > (ptr, id) will always be unique" would help with clarity if it was in the > comments here. > Ack. > > bool speculative; > > > > /* first and last insn idx of this verifier state */ > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 070d003a99f0..99b5edb56978 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -1215,7 +1215,8 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state, > > } > > dst_state->speculative = src->speculative; > > dst_state->curframe = src->curframe; > > - dst_state->active_spin_lock = src->active_spin_lock; > > + dst_state->active_lock.ptr = src->active_lock.ptr; > > + dst_state->active_lock.id = src->active_lock.id; > > dst_state->branches = src->branches; > > dst_state->parent = src->parent; > > dst_state->first_insn_idx = src->first_insn_idx; > > @@ -5587,7 +5588,7 @@ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state > > * Since only one bpf_spin_lock is allowed the checks are simpler than > > * reg_is_refcounted() logic. The verifier needs to remember only > > * one spin_lock instead of array of acquired_refs. > > - * cur_state->active_spin_lock remembers which map value element got locked > > + * cur_state->active_lock remembers which map value element got locked > > Maybe remove "map value" here? Since previous patch adds support for allocated > object. "remembers which element got locked" or "remembers which map value > or allocated object" better reflect current state of spin_lock support after > these patches. > Ack, I should update other discrepancies wrt the previous update in this whole comment block. Thanks for the review.
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 1a32baa78ce2..fa738abea267 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -323,7 +323,15 @@ struct bpf_verifier_state { u32 branches; u32 insn_idx; u32 curframe; - u32 active_spin_lock; + struct { + /* This can either be reg->map_ptr or reg->btf, but it is only + * used to check whether the lock is held or not by comparing to + * NULL. + */ + void *ptr; + /* This will be reg->id */ + u32 id; + } active_lock; bool speculative; /* first and last insn idx of this verifier state */ diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 070d003a99f0..99b5edb56978 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1215,7 +1215,8 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state, } dst_state->speculative = src->speculative; dst_state->curframe = src->curframe; - dst_state->active_spin_lock = src->active_spin_lock; + dst_state->active_lock.ptr = src->active_lock.ptr; + dst_state->active_lock.id = src->active_lock.id; dst_state->branches = src->branches; dst_state->parent = src->parent; dst_state->first_insn_idx = src->first_insn_idx; @@ -5587,7 +5588,7 @@ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state * Since only one bpf_spin_lock is allowed the checks are simpler than * reg_is_refcounted() logic. The verifier needs to remember only * one spin_lock instead of array of acquired_refs. - * cur_state->active_spin_lock remembers which map value element got locked + * cur_state->active_lock remembers which map value element got locked * and clears it after bpf_spin_unlock. */ static int process_spin_lock(struct bpf_verifier_env *env, int regno, @@ -5636,22 +5637,35 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno, return -EINVAL; } if (is_lock) { - if (cur->active_spin_lock) { + if (cur->active_lock.ptr) { verbose(env, "Locking two bpf_spin_locks are not allowed\n"); return -EINVAL; } - cur->active_spin_lock = reg->id; + if (map) + cur->active_lock.ptr = map; + else + cur->active_lock.ptr = btf; + cur->active_lock.id = reg->id; } else { - if (!cur->active_spin_lock) { + void *ptr; + + if (map) + ptr = map; + else + ptr = btf; + + if (!cur->active_lock.ptr) { verbose(env, "bpf_spin_unlock without taking a lock\n"); return -EINVAL; } - if (cur->active_spin_lock != reg->id) { + if (cur->active_lock.ptr != ptr || + cur->active_lock.id != reg->id) { verbose(env, "bpf_spin_unlock of different lock\n"); return -EINVAL; } - cur->active_spin_lock = 0; + cur->active_lock.ptr = NULL; + cur->active_lock.id = 0; } return 0; } @@ -10582,8 +10596,8 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) { dst_reg->type = PTR_TO_MAP_VALUE; dst_reg->off = aux->map_off; - if (btf_record_has_field(map->record, BPF_SPIN_LOCK)) - dst_reg->id = ++env->id_gen; + WARN_ON_ONCE(map->max_entries != 1); + /* We want reg->id to be same (0) as map_value is not distinct */ } else if (insn->src_reg == BPF_PSEUDO_MAP_FD || insn->src_reg == BPF_PSEUDO_MAP_IDX) { dst_reg->type = CONST_PTR_TO_MAP; @@ -10661,7 +10675,7 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn) return err; } - if (env->cur_state->active_spin_lock) { + if (env->cur_state->active_lock.ptr) { verbose(env, "BPF_LD_[ABS|IND] cannot be used inside bpf_spin_lock-ed region\n"); return -EINVAL; } @@ -11927,7 +11941,8 @@ static bool states_equal(struct bpf_verifier_env *env, if (old->speculative && !cur->speculative) return false; - if (old->active_spin_lock != cur->active_spin_lock) + if (old->active_lock.ptr != cur->active_lock.ptr || + old->active_lock.id != cur->active_lock.id) return false; /* for states to be equal callsites have to be the same @@ -12572,7 +12587,7 @@ static int do_check(struct bpf_verifier_env *env) return -EINVAL; } - if (env->cur_state->active_spin_lock && + if (env->cur_state->active_lock.ptr && (insn->src_reg == BPF_PSEUDO_CALL || insn->imm != BPF_FUNC_spin_unlock)) { verbose(env, "function calls are not allowed while holding a lock\n"); @@ -12609,7 +12624,7 @@ static int do_check(struct bpf_verifier_env *env) return -EINVAL; } - if (env->cur_state->active_spin_lock) { + if (env->cur_state->active_lock.ptr) { verbose(env, "bpf_spin_unlock is missing\n"); return -EINVAL; }
Global variables reside in maps accessible using direct_value_addr callbacks, so giving each load instruction's rewrite a unique reg->id disallows us from holding locks which are global. The reason for preserving reg->id as a unique value for registers that may point to spin lock is that two separate lookups are treated as two separate memory regions, and any possible aliasing is ignored for the purposes of spin lock correctness. This is not great especially for the global variable case, which are served from maps that have max_entries == 1, i.e. they always lead to map values pointing into the same map value. So refactor the active_spin_lock into a 'active_lock' structure which represents the lock identity, and instead of the reg->id, remember two fields, a pointer and the reg->id. The pointer will store reg->map_ptr or reg->btf. It's only necessary to distinguish for the id == 0 case of global variables, but always setting the pointer to a non-NULL value and using the pointer to check whether the lock is held simplifies code in the verifier. This is generic enough to allow it for global variables, map lookups, and allocated objects at the same time. Note that while whether a lock is held can be answered by just comparing active_lock.ptr to NULL, to determine whether the register is pointing to the same held lock requires comparing _both_ ptr and id. Finally, as a result of this refactoring, pseudo load instructions are not given a unique reg->id, as they are doing lookup for the same map value (max_entries is never greater than 1). Essentially, we consider that the tuple of (ptr, id) will always be unique for any kind of argument to bpf_spin_{lock,unlock}. Note that this can be extended in the future to also remember offset used for locking, so that we can introduce multiple bpf_spin_lock fields in the same allocation. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> --- include/linux/bpf_verifier.h | 10 ++++++++- kernel/bpf/verifier.c | 41 ++++++++++++++++++++++++------------ 2 files changed, 37 insertions(+), 14 deletions(-)