[bpf-next,v4,14/24] bpf: Allow locking bpf_spin_lock global variables

Message ID: 20221103191013.1236066-15-memxor@gmail.com (mailing list archive)
State: Superseded
Delegated to: BPF
Series: Local kptrs, BPF linked lists

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count fail Series longer than 15 patches (and no cover letter)
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 67 this patch: 66
netdev/cc_maintainers warning 8 maintainers not CCed: sdf@google.com kpsingh@kernel.org haoluo@google.com yhs@fb.com jolsa@kernel.org martin.lau@linux.dev song@kernel.org john.fastabend@gmail.com
netdev/build_clang success Errors and warnings before: 19 this patch: 17
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 67 this patch: 66
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-2 fail Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-1 pending Logs for ${{ matrix.test }} on ${{ matrix.arch }} with ${{ matrix.toolchain }}
bpf/vmtest-bpf-next-VM_Test-3 fail Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 fail Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-6 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-7 success Logs for set-matrix
bpf/vmtest-bpf-next-PR fail merge-conflict

Commit Message

Kumar Kartikeya Dwivedi Nov. 3, 2022, 7:10 p.m. UTC
Global variables reside in maps accessible via direct_value_addr
callbacks, so giving each load instruction's rewrite a unique reg->id
prevents us from holding locks that live in global variables.

This is not great, so refactor active_spin_lock into two separate
fields, active_spin_lock_ptr and active_spin_lock_id, a representation
generic enough to cover global variables, map lookups, and local kptr
registers at the same time.

Whether a lock is held is indicated by active_spin_lock_ptr, which
stores the reg->map_ptr or reg->btf pointer of the register used to
lock the spin lock. active_spin_lock_id must also be compared, to
verify that bpf_spin_unlock is called on the same register.

Next, pseudo load instructions are no longer given a unique reg->id,
as they all look up the same map value (max_entries is never greater
than 1).

Essentially, we consider that the tuple (active_spin_lock_ptr,
active_spin_lock_id) will always be unique for any kind of argument to
bpf_spin_{lock,unlock}.

Note that this can be extended in the future to also remember offset
used for locking, so that we can introduce multiple bpf_spin_lock fields
in the same allocation.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf_verifier.h |  3 ++-
 kernel/bpf/verifier.c        | 39 +++++++++++++++++++++++++-----------
 2 files changed, 29 insertions(+), 13 deletions(-)
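
As a minimal sketch of what this change permits, a program along these
lines (symbol and section names are illustrative, not taken from this
series) can now take a spin lock that lives in a global variable:

  #include <vmlinux.h>
  #include <bpf/bpf_helpers.h>

  /* The custom .data section becomes an array map with
   * max_entries == 1, accessed through direct_value_addr
   * pseudo loads.
   */
  #define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))

  private(A) struct bpf_spin_lock glock;
  private(A) int gcounter;

  SEC("tc")
  int inc_counter(struct __sk_buff *ctx)
  {
          bpf_spin_lock(&glock);
          gcounter++;
          bpf_spin_unlock(&glock);
          return 0;
  }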

Comments

Dave Marchevsky Nov. 4, 2022, 2:54 a.m. UTC | #1
On 11/3/22 3:10 PM, Kumar Kartikeya Dwivedi wrote:
> [...]
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 1a32baa78ce2..bb71c59f21f6 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -323,7 +323,8 @@ struct bpf_verifier_state {
>  	u32 branches;
>  	u32 insn_idx;
>  	u32 curframe;
> -	u32 active_spin_lock;
> +	void *active_spin_lock_ptr;
> +	u32 active_spin_lock_id;
>  	bool speculative;
Back in first RFC of this series we talked about turning this "spin lock
identity" concept into a proper struct [0]. But to save you the click:

Dave:
"""
It would be good to make this "(lock_ptr, lock_id) is identifier for lock"
concept more concrete by grouping these fields in a struct w/ type enum + union,
or something similar. Will make it more obvious that they should be used / set
together.

But if you'd prefer to keep it as two fields, active_spin_lock_ptr is a
confusing name. In the future with no context as to what that field is, I'd
assume that it holds a pointer to a spin_lock instead of a "spin lock identity
pointer".
"""

Kumar: 
"""
That's a good point.

I'm thinking
struct active_lock {
  void *id_ptr;
  u32 offset;
  u32 reg_id;
};
How does that look?
"""

I didn't get back to you, but I think that looks reasonable, and "this can be
extended in the future to also remember offset used for locking" in this
patch summary supports the desire to group up those fields. (I agree that
offset isn't necessary for now, though).

  [0]: https://lore.kernel.org/bpf/311eb0d0-777a-4240-9fa0-59134344f051@fb.com/
Kumar Kartikeya Dwivedi Nov. 4, 2022, 7:56 a.m. UTC | #2
On Fri, Nov 04, 2022 at 08:24:22AM IST, Dave Marchevsky wrote:
> On 11/3/22 3:10 PM, Kumar Kartikeya Dwivedi wrote:
> > [...]
> Back in first RFC of this series we talked about turning this "spin lock
> identity" concept into a proper struct [0]. But to save you the click:
>
> Dave:
> """
> It would be good to make this "(lock_ptr, lock_id) is identifier for lock"
> concept more concrete by grouping these fields in a struct w/ type enum + union,
> or something similar. Will make it more obvious that they should be used / set
> together.
>
> But if you'd prefer to keep it as two fields, active_spin_lock_ptr is a
> confusing name. In the future with no context as to what that field is, I'd
> assume that it holds a pointer to a spin_lock instead of a "spin lock identity
> pointer".
> """
>
> Kumar:
> """
> That's a good point.
>
> I'm thinking
> struct active_lock {
>   void *id_ptr;
>   u32 offset;
>   u32 reg_id;
> };
> How does that look?
> """
>
> I didn't get back to you, but I think that looks reasonable, and "this can be
> extended in the future to also remember offset used for locking" in this
> patch summary supports the desire to group up those fields. (I agree that
> offset isn't necessary for now, though).
>

I will make this change in v5.

However, do you have any suggestions on what we can call the id_ptr thing? In
patch 22, in the big comment above check_reg_allocation_locked, I call it the
lock class, but I'm not sure whether that helps or is more confusing for people.

In active_spin_lock_ptr, 'ptr' alone is confusing as you've pointed out before.
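
For reference, the grouping sketched above would sit in
bpf_verifier_state roughly as follows (a sketch of the proposal under
discussion, with illustrative comments; the exact naming was still
open at this point):

  /* Identity of the currently held spin lock: 'ptr' is the
   * reg->map_ptr or reg->btf of the register the lock was taken
   * through (its lock "class"), 'id' is that register's reg->id.
   * (NULL, 0) means no lock is held. An 'offset' field could later
   * be added to allow multiple bpf_spin_lock fields per allocation.
   */
  struct {
          void *ptr;
          u32 id;
  } active_lock;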

Patch

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 1a32baa78ce2..bb71c59f21f6 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -323,7 +323,8 @@  struct bpf_verifier_state {
 	u32 branches;
 	u32 insn_idx;
 	u32 curframe;
-	u32 active_spin_lock;
+	void *active_spin_lock_ptr;
+	u32 active_spin_lock_id;
 	bool speculative;
 
 	/* first and last insn idx of this verifier state */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c31f20aed30c..4a43cde0ff4c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1201,7 +1201,8 @@  static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 	}
 	dst_state->speculative = src->speculative;
 	dst_state->curframe = src->curframe;
-	dst_state->active_spin_lock = src->active_spin_lock;
+	dst_state->active_spin_lock_ptr = src->active_spin_lock_ptr;
+	dst_state->active_spin_lock_id = src->active_spin_lock_id;
 	dst_state->branches = src->branches;
 	dst_state->parent = src->parent;
 	dst_state->first_insn_idx = src->first_insn_idx;
@@ -5470,22 +5471,35 @@  static int process_spin_lock(struct bpf_verifier_env *env, int regno,
 		return -EINVAL;
 	}
 	if (is_lock) {
-		if (cur->active_spin_lock) {
+		if (cur->active_spin_lock_ptr) {
 			verbose(env,
 				"Locking two bpf_spin_locks are not allowed\n");
 			return -EINVAL;
 		}
-		cur->active_spin_lock = reg->id;
+		if (map)
+			cur->active_spin_lock_ptr = map;
+		else
+			cur->active_spin_lock_ptr = btf;
+		cur->active_spin_lock_id = reg->id;
 	} else {
-		if (!cur->active_spin_lock) {
+		void *ptr;
+
+		if (map)
+			ptr = map;
+		else
+			ptr = btf;
+
+		if (!cur->active_spin_lock_ptr) {
 			verbose(env, "bpf_spin_unlock without taking a lock\n");
 			return -EINVAL;
 		}
-		if (cur->active_spin_lock != reg->id) {
+		if (cur->active_spin_lock_ptr != ptr ||
+		    cur->active_spin_lock_id != reg->id) {
 			verbose(env, "bpf_spin_unlock of different lock\n");
 			return -EINVAL;
 		}
-		cur->active_spin_lock = 0;
+		cur->active_spin_lock_ptr = NULL;
+		cur->active_spin_lock_id = 0;
 	}
 	return 0;
 }
@@ -10393,8 +10407,8 @@  static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	    insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) {
 		dst_reg->type = PTR_TO_MAP_VALUE;
 		dst_reg->off = aux->map_off;
-		if (btf_record_has_field(map->record, BPF_SPIN_LOCK))
-			dst_reg->id = ++env->id_gen;
+		WARN_ON_ONCE(map->max_entries != 1);
+		/* We want reg->id to be same (0) as map_value is not distinct */
 	} else if (insn->src_reg == BPF_PSEUDO_MAP_FD ||
 		   insn->src_reg == BPF_PSEUDO_MAP_IDX) {
 		dst_reg->type = CONST_PTR_TO_MAP;
@@ -10472,7 +10486,7 @@  static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
 		return err;
 	}
 
-	if (env->cur_state->active_spin_lock) {
+	if (env->cur_state->active_spin_lock_ptr) {
 		verbose(env, "BPF_LD_[ABS|IND] cannot be used inside bpf_spin_lock-ed region\n");
 		return -EINVAL;
 	}
@@ -11738,7 +11752,8 @@  static bool states_equal(struct bpf_verifier_env *env,
 	if (old->speculative && !cur->speculative)
 		return false;
 
-	if (old->active_spin_lock != cur->active_spin_lock)
+	if (old->active_spin_lock_ptr != cur->active_spin_lock_ptr ||
+	    old->active_spin_lock_id != cur->active_spin_lock_id)
 		return false;
 
 	/* for states to be equal callsites have to be the same
@@ -12377,7 +12392,7 @@  static int do_check(struct bpf_verifier_env *env)
 					return -EINVAL;
 				}
 
-				if (env->cur_state->active_spin_lock &&
+				if (env->cur_state->active_spin_lock_ptr &&
 				    (insn->src_reg == BPF_PSEUDO_CALL ||
 				     insn->imm != BPF_FUNC_spin_unlock)) {
 					verbose(env, "function calls are not allowed while holding a lock\n");
@@ -12414,7 +12429,7 @@  static int do_check(struct bpf_verifier_env *env)
 					return -EINVAL;
 				}
 
-				if (env->cur_state->active_spin_lock) {
+				if (env->cur_state->active_spin_lock_ptr) {
 					verbose(env, "bpf_spin_unlock is missing\n");
 					return -EINVAL;
 				}
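
To illustrate the two-part identity check in process_spin_lock() above:
two global locks placed in different custom sections end up in two
different array maps, and both of their pseudo loads now get
reg->id == 0, so only the new active_spin_lock_ptr comparison can tell
them apart. A hypothetical program like the following (names are
illustrative) is rejected with "bpf_spin_unlock of different lock":

  #include <vmlinux.h>
  #include <bpf/bpf_helpers.h>

  /* Each custom .data section becomes its own array map. */
  #define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))

  private(A) struct bpf_spin_lock lock_a;
  private(B) struct bpf_spin_lock lock_b;

  SEC("tc")
  int bad_unlock(struct __sk_buff *ctx)
  {
          bpf_spin_lock(&lock_a);
          /* lock_b lives in a different map, so active_spin_lock_ptr
           * differs even though both ids are 0: rejected.
           */
          bpf_spin_unlock(&lock_b);
          return 0;
  }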