| Message ID | 20230609210143.2625430-4-eddyz87@gmail.com |
|---|---|
| State | Superseded |
| Delegated to | BPF |
| Series | verify scalar ids mapping in regsafe() |
On Fri, Jun 9, 2023 at 2:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Make sure that the following unsafe example is rejected by verifier:
>
> 1: r9 = ... some pointer with range X ...
> 2: r6 = ... unbound scalar ID=a ...
> 3: r7 = ... unbound scalar ID=b ...
> 4: if (r6 > r7) goto +1
> 5: r6 = r7
> 6: if (r6 > X) goto ...
> --- checkpoint ---
> 7: r9 += r7
> 8: *(u64 *)r9 = Y
>
> This example is unsafe because not all execution paths verify r7 range.
> Because of the jump at (4) the verifier would arrive at (6) in two states:
> I. r6{.id=b}, r7{.id=b} via path 1-6;
> II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
>
> Currently regsafe() does not call check_ids() for scalar registers,
> thus from POV of regsafe() states (I) and (II) are identical. If the
> path 1-6 is taken by verifier first, and checkpoint is created at (6)
> the path [1-4, 6] would be considered safe.
>
> This commit adds a new function: check_scalar_ids() and updates
> regsafe() to call it for precise scalar registers.
> check_scalar_ids() differs from check_ids() in order to:
> - treat ID zero as a unique scalar ID;
> - disallow mapping same 'cur_id' to multiple 'old_id'.
>
> To minimize the impact on verification performance, avoid generating
> bpf_reg_state::id for constant scalar values when processing BPF_MOV
> in check_alu_op(). Scalar IDs are utilized by find_equal_scalars() to
> propagate information about value ranges for registers that hold the
> same value. However, there is no need to propagate range information
> for constants.
>
> Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  include/linux/bpf_verifier.h |  1 +
>  kernel/bpf/verifier.c        | 77 +++++++++++++++++++++++++++++++++---
>  2 files changed, 73 insertions(+), 5 deletions(-)
>

I have lots of gripes with specifics in this patch, as you can see
below. But ultimately this should fix the issue and we can discuss the
rest separately.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 73a98f6240fd..1bdda17fad70 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -584,6 +584,7 @@ struct bpf_verifier_env {
>         u32 used_map_cnt;               /* number of used maps */
>         u32 used_btf_cnt;               /* number of used BTF objects */
>         u32 id_gen;                     /* used to generate unique reg IDs */
> +       u32 tmp_id_gen;
>         bool explore_alu_limits;
>         bool allow_ptr_leaks;
>         bool allow_uninit_stack;
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index f719de396c61..c6797742f38e 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -12942,12 +12942,14 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
>                 if (BPF_SRC(insn->code) == BPF_X) {
>                         struct bpf_reg_state *src_reg = regs + insn->src_reg;
>                         struct bpf_reg_state *dst_reg = regs + insn->dst_reg;
> +                       bool need_id = src_reg->type == SCALAR_VALUE && !src_reg->id &&
> +                                      !tnum_is_const(src_reg->var_off);
>
>                         if (BPF_CLASS(insn->code) == BPF_ALU64) {
>                                 /* case: R1 = R2
>                                  * copy register state to dest reg
>                                  */
> -                               if (src_reg->type == SCALAR_VALUE && !src_reg->id)
> +                               if (need_id)
>                                         /* Assign src and dst registers the same ID
>                                          * that will be used by find_equal_scalars()
>                                          * to propagate min/max range.
> @@ -12966,7 +12968,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
>                         } else if (src_reg->type == SCALAR_VALUE) {
>                                 bool is_src_reg_u32 = src_reg->umax_value <= U32_MAX;
>
> -                               if (is_src_reg_u32 && !src_reg->id)
> +                               if (is_src_reg_u32 && need_id)
>                                         src_reg->id = ++env->id_gen;
>                                 copy_register_state(dst_reg, src_reg);
>                                 /* Make sure ID is cleared if src_reg is not in u32 range otherwise
> @@ -15148,6 +15150,36 @@ static bool check_ids(u32 old_id, u32 cur_id, struct bpf_id_pair *idmap)
>         return false;
> }
>
> +/* Similar to check_ids(), but:
> + * - disallow mapping of different 'old_id' values to same 'cur_id' value;
> + * - for zero 'old_id' or 'cur_id' allocate a unique temporary ID
> + *   to allow pairs like '0 vs unique ID', 'unique ID vs 0'.
> + */
> +static bool check_scalar_ids(struct bpf_verifier_env *env, u32 old_id, u32 cur_id,
> +                            struct bpf_id_pair *idmap)
> +{
> +       unsigned int i;
> +
> +       old_id = old_id ? old_id : ++env->tmp_id_gen;
> +       cur_id = cur_id ? cur_id : ++env->tmp_id_gen;
> +
> +       for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
> +               if (!idmap[i].old) {
> +                       /* Reached an empty slot; haven't seen this id before */
> +                       idmap[i].old = old_id;
> +                       idmap[i].cur = cur_id;
> +                       return true;
> +               }
> +               if (idmap[i].old == old_id)
> +                       return idmap[i].cur == cur_id;
> +               if (idmap[i].cur == cur_id)
> +                       return false;

As I mentioned, I think we should do it always (even if in some
*partial* cases this might not be necessary to guarantee correctness),
I believe this is what idmap semantics is promising to check, but we
actually don't. But we can discuss that separately.

> +       }
> +       /* We ran out of idmap slots, which should be impossible */
> +       WARN_ON_ONCE(1);
> +       return false;
> +}
> +
>  static void clean_func_state(struct bpf_verifier_env *env,
>                               struct bpf_func_state *st)
>  {
> @@ -15253,6 +15285,15 @@ static bool regs_exact(const struct bpf_reg_state *rold,
>                check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap);
>  }
>
> +static bool scalar_regs_exact(struct bpf_verifier_env *env,
> +                              const struct bpf_reg_state *rold,
> +                              const struct bpf_reg_state *rcur,
> +                              struct bpf_id_pair *idmap)
> +{
> +       return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
> +              check_scalar_ids(env, rold->id, rcur->id, idmap);

At this point I don't know if there is any benefit to having this
special scalar_regs_exact() implementation. We are just assuming that
memcmp() of 88 bytes is significantly faster than doing range_within()
(8 straightforward comparisons) and tnum_in() (few bit operations and
one comparison). And if this doesn't work out, we pay the price of
both memcmp and range_within+tnum_in. Plus check_scalar_ids() in both
cases.

I suspect that just dropping memcmp() will be at least not worse, if not better.

> +}
> +
>  /* Returns true if (rold safe implies rcur safe) */
>  static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
>                     struct bpf_reg_state *rcur, struct bpf_id_pair *idmap)
> @@ -15292,15 +15333,39 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
>
>         switch (base_type(rold->type)) {
>         case SCALAR_VALUE:
> -               if (regs_exact(rold, rcur, idmap))
> +               if (scalar_regs_exact(env, rold, rcur, idmap))
>                         return true;
>                 if (env->explore_alu_limits)
>                         return false;
>                 if (!rold->precise)
>                         return true;
> -               /* new val must satisfy old val knowledge */
> +               /* Why check_ids() for scalar registers?
> +                *
> +                * Consider the following BPF code:
> +                * 1: r6 = ... unbound scalar, ID=a ...
> +                * 2: r7 = ... unbound scalar, ID=b ...
> +                * 3: if (r6 > r7) goto +1
> +                * 4: r6 = r7
> +                * 5: if (r6 > X) goto ...
> +                * 6: ... memory operation using r7 ...
> +                *
> +                * First verification path is [1-6]:
> +                * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
> +                * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
> +                *   r7 <= X, because r6 and r7 share same id.
> +                * Next verification path is [1-4, 6].
> +                *
> +                * Instruction (6) would be reached in two states:
> +                * I. r6{.id=b}, r7{.id=b} via path 1-6;
> +                * II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
> +                *
> +                * Use check_ids() to distinguish these states.
> +                * ---
> +                * Also verify that new value satisfies old value range knowledge.
> +                */
>                 return range_within(rold, rcur) &&
> -                      tnum_in(rold->var_off, rcur->var_off);
> +                      tnum_in(rold->var_off, rcur->var_off) &&
> +                      check_scalar_ids(env, rold->id, rcur->id, idmap);
>         case PTR_TO_MAP_KEY:
>         case PTR_TO_MAP_VALUE:
>         case PTR_TO_MEM:
> @@ -15542,6 +15607,8 @@ static bool states_equal(struct bpf_verifier_env *env,
>         if (old->active_rcu_lock != cur->active_rcu_lock)
>                 return false;
>
> +       env->tmp_id_gen = env->id_gen;
> +

sigh, this is kind of ugly, but ok :( Ideally we have

struct idmap_scratch {
        int id_gen;
        struct bpf_id_pair mapping[BPF_ID_MAP_SIZE];
}

and just initialize idmap_scratch.id_gen = env->id_gen and keep all
this very local to id_map.

But I'm nitpicking.

>         /* for states to be equal callsites have to be the same
>          * and all frame states need to be equivalent
>          */
> --
> 2.40.1
>
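Fleshed out, the idmap_scratch idea from the note above could look roughly like the following sketch. The struct name bpf_idmap, its field names, and the slimmer check_scalar_ids() signature are assumptions for illustration, not code from any posted revision of the series:

```c
/* Hypothetical scratch area bundling the id map with the temporary ID
 * generator, so check_scalar_ids() no longer reaches into bpf_verifier_env.
 */
struct bpf_idmap {
        u32 tmp_id_gen;                          /* seeded from env->id_gen */
        struct bpf_id_pair map[BPF_ID_MAP_SIZE];
};

static bool check_scalar_ids(u32 old_id, u32 cur_id, struct bpf_idmap *idmap)
{
        struct bpf_id_pair *map = idmap->map;
        unsigned int i;

        /* fresh temporary IDs now come from the scratch area itself */
        old_id = old_id ? old_id : ++idmap->tmp_id_gen;
        cur_id = cur_id ? cur_id : ++idmap->tmp_id_gen;

        for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
                if (!map[i].old) {
                        /* empty slot: record a fresh old->cur pairing */
                        map[i].old = old_id;
                        map[i].cur = cur_id;
                        return true;
                }
                if (map[i].old == old_id)
                        return map[i].cur == cur_id;
                if (map[i].cur == cur_id)
                        return false;
        }
        WARN_ON_ONCE(1);
        return false;
}
```

states_equal() would then seed the scratch area (idmap_scratch.tmp_id_gen = env->id_gen) instead of writing into bpf_verifier_env, keeping the temporary IDs local to a single states_equal() invocation.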
On Fri, 2023-06-09 at 16:11 -0700, Andrii Nakryiko wrote:
> On Fri, Jun 9, 2023 at 2:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > Make sure that the following unsafe example is rejected by verifier:
> >
> > 1: r9 = ... some pointer with range X ...
> > 2: r6 = ... unbound scalar ID=a ...
> > 3: r7 = ... unbound scalar ID=b ...
> > 4: if (r6 > r7) goto +1
> > 5: r6 = r7
> > 6: if (r6 > X) goto ...
> > --- checkpoint ---
> > 7: r9 += r7
> > 8: *(u64 *)r9 = Y
> >
> > This example is unsafe because not all execution paths verify r7 range.
> > Because of the jump at (4) the verifier would arrive at (6) in two states:
> > I. r6{.id=b}, r7{.id=b} via path 1-6;
> > II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
> >
> > Currently regsafe() does not call check_ids() for scalar registers,
> > thus from POV of regsafe() states (I) and (II) are identical. If the
> > path 1-6 is taken by verifier first, and checkpoint is created at (6)
> > the path [1-4, 6] would be considered safe.
> >
> > This commit adds a new function: check_scalar_ids() and updates
> > regsafe() to call it for precise scalar registers.
> > check_scalar_ids() differs from check_ids() in order to:
> > - treat ID zero as a unique scalar ID;
> > - disallow mapping same 'cur_id' to multiple 'old_id'.
> >
> > To minimize the impact on verification performance, avoid generating
> > bpf_reg_state::id for constant scalar values when processing BPF_MOV
> > in check_alu_op(). Scalar IDs are utilized by find_equal_scalars() to
> > propagate information about value ranges for registers that hold the
> > same value. However, there is no need to propagate range information
> > for constants.
> >
> > Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > ---
> >  include/linux/bpf_verifier.h |  1 +
> >  kernel/bpf/verifier.c        | 77 +++++++++++++++++++++++++++++++++---
> >  2 files changed, 73 insertions(+), 5 deletions(-)
> >
>
> I have lots of gripes with specifics in this patch, as you can see
> below. But ultimately this should fix the issue and we can discuss the
> rest separately.
>
> Acked-by: Andrii Nakryiko <andrii@kernel.org>
>
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 73a98f6240fd..1bdda17fad70 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -584,6 +584,7 @@ struct bpf_verifier_env {
> >         u32 used_map_cnt;               /* number of used maps */
> >         u32 used_btf_cnt;               /* number of used BTF objects */
> >         u32 id_gen;                     /* used to generate unique reg IDs */
> > +       u32 tmp_id_gen;
> >         bool explore_alu_limits;
> >         bool allow_ptr_leaks;
> >         bool allow_uninit_stack;
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index f719de396c61..c6797742f38e 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -12942,12 +12942,14 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
> >                 if (BPF_SRC(insn->code) == BPF_X) {
> >                         struct bpf_reg_state *src_reg = regs + insn->src_reg;
> >                         struct bpf_reg_state *dst_reg = regs + insn->dst_reg;
> > +                       bool need_id = src_reg->type == SCALAR_VALUE && !src_reg->id &&
> > +                                      !tnum_is_const(src_reg->var_off);
> >
> >                         if (BPF_CLASS(insn->code) == BPF_ALU64) {
> >                                 /* case: R1 = R2
> >                                  * copy register state to dest reg
> >                                  */
> > -                               if (src_reg->type == SCALAR_VALUE && !src_reg->id)
> > +                               if (need_id)
> >                                         /* Assign src and dst registers the same ID
> >                                          * that will be used by find_equal_scalars()
> >                                          * to propagate min/max range.
> > @@ -12966,7 +12968,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
> >                         } else if (src_reg->type == SCALAR_VALUE) {
> >                                 bool is_src_reg_u32 = src_reg->umax_value <= U32_MAX;
> >
> > -                               if (is_src_reg_u32 && !src_reg->id)
> > +                               if (is_src_reg_u32 && need_id)
> >                                         src_reg->id = ++env->id_gen;
> >                                 copy_register_state(dst_reg, src_reg);
> >                                 /* Make sure ID is cleared if src_reg is not in u32 range otherwise
> > @@ -15148,6 +15150,36 @@ static bool check_ids(u32 old_id, u32 cur_id, struct bpf_id_pair *idmap)
> >         return false;
> > }
> >
> > +/* Similar to check_ids(), but:
> > + * - disallow mapping of different 'old_id' values to same 'cur_id' value;
> > + * - for zero 'old_id' or 'cur_id' allocate a unique temporary ID
> > + *   to allow pairs like '0 vs unique ID', 'unique ID vs 0'.
> > + */
> > +static bool check_scalar_ids(struct bpf_verifier_env *env, u32 old_id, u32 cur_id,
> > +                            struct bpf_id_pair *idmap)
> > +{
> > +       unsigned int i;
> > +
> > +       old_id = old_id ? old_id : ++env->tmp_id_gen;
> > +       cur_id = cur_id ? cur_id : ++env->tmp_id_gen;
> > +
> > +       for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
> > +               if (!idmap[i].old) {
> > +                       /* Reached an empty slot; haven't seen this id before */
> > +                       idmap[i].old = old_id;
> > +                       idmap[i].cur = cur_id;
> > +                       return true;
> > +               }
> > +               if (idmap[i].old == old_id)
> > +                       return idmap[i].cur == cur_id;
> > +               if (idmap[i].cur == cur_id)
> > +                       return false;
>
> As I mentioned, I think we should do it always (even if in some
> *partial* cases this might not be necessary to guarantee correctness),
> I believe this is what idmap semantics is promising to check, but we
> actually don't. But we can discuss that separately.

I'll compile the list of use-cases and we can discuss.

> > +       }
> > +       /* We ran out of idmap slots, which should be impossible */
> > +       WARN_ON_ONCE(1);
> > +       return false;
> > +}
> > +
> >  static void clean_func_state(struct bpf_verifier_env *env,
> >                               struct bpf_func_state *st)
> >  {
> > @@ -15253,6 +15285,15 @@ static bool regs_exact(const struct bpf_reg_state *rold,
> >                check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap);
> >  }
> >
> > +static bool scalar_regs_exact(struct bpf_verifier_env *env,
> > +                              const struct bpf_reg_state *rold,
> > +                              const struct bpf_reg_state *rcur,
> > +                              struct bpf_id_pair *idmap)
> > +{
> > +       return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
> > +              check_scalar_ids(env, rold->id, rcur->id, idmap);
>
> At this point I don't know if there is any benefit to having this
> special scalar_regs_exact() implementation. We are just assuming that
> memcmp() of 88 bytes is significantly faster than doing range_within()
> (8 straightforward comparisons) and tnum_in() (few bit operations and
> one comparison). And if this doesn't work out, we pay the price of
> both memcmp and range_within+tnum_in. Plus check_scalar_ids() in both
> cases.
>
> I suspect that just dropping memcmp() will be at least not worse, if not better.

Sounds reasonable, but I'm a bit hesitant here because of the
"explore_alu_limits":

        if (scalar_regs_exact(env, rold, rcur, idmap))
                return true;
        if (env->explore_alu_limits)
                return false;

tbh, I don't understand if this special case is necessary for
"explore_alu_limits", will think about it.

> > +}
> > +
> >  /* Returns true if (rold safe implies rcur safe) */
> >  static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> >                     struct bpf_reg_state *rcur, struct bpf_id_pair *idmap)
> > @@ -15292,15 +15333,39 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> >
> >         switch (base_type(rold->type)) {
> >         case SCALAR_VALUE:
> > -               if (regs_exact(rold, rcur, idmap))
> > +               if (scalar_regs_exact(env, rold, rcur, idmap))
> >                         return true;
> >                 if (env->explore_alu_limits)
> >                         return false;
> >                 if (!rold->precise)
> >                         return true;
> > -               /* new val must satisfy old val knowledge */
> > +               /* Why check_ids() for scalar registers?
> > +                *
> > +                * Consider the following BPF code:
> > +                * 1: r6 = ... unbound scalar, ID=a ...
> > +                * 2: r7 = ... unbound scalar, ID=b ...
> > +                * 3: if (r6 > r7) goto +1
> > +                * 4: r6 = r7
> > +                * 5: if (r6 > X) goto ...
> > +                * 6: ... memory operation using r7 ...
> > +                *
> > +                * First verification path is [1-6]:
> > +                * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
> > +                * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
> > +                *   r7 <= X, because r6 and r7 share same id.
> > +                * Next verification path is [1-4, 6].
> > +                *
> > +                * Instruction (6) would be reached in two states:
> > +                * I. r6{.id=b}, r7{.id=b} via path 1-6;
> > +                * II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
> > +                *
> > +                * Use check_ids() to distinguish these states.
> > +                * ---
> > +                * Also verify that new value satisfies old value range knowledge.
> > +                */
> >                 return range_within(rold, rcur) &&
> > -                      tnum_in(rold->var_off, rcur->var_off);
> > +                      tnum_in(rold->var_off, rcur->var_off) &&
> > +                      check_scalar_ids(env, rold->id, rcur->id, idmap);
> >         case PTR_TO_MAP_KEY:
> >         case PTR_TO_MAP_VALUE:
> >         case PTR_TO_MEM:
> > @@ -15542,6 +15607,8 @@ static bool states_equal(struct bpf_verifier_env *env,
> >         if (old->active_rcu_lock != cur->active_rcu_lock)
> >                 return false;
> >
> > +       env->tmp_id_gen = env->id_gen;
> > +
>
> sigh, this is kind of ugly, but ok :( Ideally we have
>
> struct idmap_scratch {
>         int id_gen;
>         struct bpf_id_pair mapping[BPF_ID_MAP_SIZE];
> }
>
> and just initialize idmap_scratch.id_gen = env->id_gen and keep all
> this very local to id_map.

This looks better, will update the patch.

>
> But I'm nitpicking.
>
> >         /* for states to be equal callsites have to be the same
> >          * and all frame states need to be equivalent
> >          */
> > --
> > 2.40.1
> >
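For reference, one way to reconcile "just drop memcmp()" with the explore_alu_limits hesitation above is to keep the byte-exact comparison only for that mode and send every other precise scalar straight to the range checks. The following is a sketch of that shape, not code from this series:

```c
/* Sketch of regsafe()'s SCALAR_VALUE branch without scalar_regs_exact().
 * explore_alu_limits still demands byte-for-byte equality (pruning in that
 * mode must not rely on range subsumption); all other precise scalars go
 * through the range/tnum checks plus the scalar ID mapping check.
 */
case SCALAR_VALUE:
        if (env->explore_alu_limits)
                return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
                       check_scalar_ids(env, rold->id, rcur->id, idmap);
        if (!rold->precise)
                return true;
        return range_within(rold, rcur) &&
               tnum_in(rold->var_off, rcur->var_off) &&
               check_scalar_ids(env, rold->id, rcur->id, idmap);
```

Semantics are preserved: an exact match trivially satisfies range_within(), tnum_in(), and the ID check, so the only case that still needs special handling is the strict equality required by explore_alu_limits.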
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 73a98f6240fd..1bdda17fad70 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -584,6 +584,7 @@ struct bpf_verifier_env {
         u32 used_map_cnt;               /* number of used maps */
         u32 used_btf_cnt;               /* number of used BTF objects */
         u32 id_gen;                     /* used to generate unique reg IDs */
+        u32 tmp_id_gen;
         bool explore_alu_limits;
         bool allow_ptr_leaks;
         bool allow_uninit_stack;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f719de396c61..c6797742f38e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12942,12 +12942,14 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
                 if (BPF_SRC(insn->code) == BPF_X) {
                         struct bpf_reg_state *src_reg = regs + insn->src_reg;
                         struct bpf_reg_state *dst_reg = regs + insn->dst_reg;
+                        bool need_id = src_reg->type == SCALAR_VALUE && !src_reg->id &&
+                                       !tnum_is_const(src_reg->var_off);

                         if (BPF_CLASS(insn->code) == BPF_ALU64) {
                                 /* case: R1 = R2
                                  * copy register state to dest reg
                                  */
-                                if (src_reg->type == SCALAR_VALUE && !src_reg->id)
+                                if (need_id)
                                         /* Assign src and dst registers the same ID
                                          * that will be used by find_equal_scalars()
                                          * to propagate min/max range.
@@ -12966,7 +12968,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
                         } else if (src_reg->type == SCALAR_VALUE) {
                                 bool is_src_reg_u32 = src_reg->umax_value <= U32_MAX;

-                                if (is_src_reg_u32 && !src_reg->id)
+                                if (is_src_reg_u32 && need_id)
                                         src_reg->id = ++env->id_gen;
                                 copy_register_state(dst_reg, src_reg);
                                 /* Make sure ID is cleared if src_reg is not in u32 range otherwise
@@ -15148,6 +15150,36 @@ static bool check_ids(u32 old_id, u32 cur_id, struct bpf_id_pair *idmap)
         return false;
 }

+/* Similar to check_ids(), but:
+ * - disallow mapping of different 'old_id' values to same 'cur_id' value;
+ * - for zero 'old_id' or 'cur_id' allocate a unique temporary ID
+ *   to allow pairs like '0 vs unique ID', 'unique ID vs 0'.
+ */
+static bool check_scalar_ids(struct bpf_verifier_env *env, u32 old_id, u32 cur_id,
+                             struct bpf_id_pair *idmap)
+{
+        unsigned int i;
+
+        old_id = old_id ? old_id : ++env->tmp_id_gen;
+        cur_id = cur_id ? cur_id : ++env->tmp_id_gen;
+
+        for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
+                if (!idmap[i].old) {
+                        /* Reached an empty slot; haven't seen this id before */
+                        idmap[i].old = old_id;
+                        idmap[i].cur = cur_id;
+                        return true;
+                }
+                if (idmap[i].old == old_id)
+                        return idmap[i].cur == cur_id;
+                if (idmap[i].cur == cur_id)
+                        return false;
+        }
+        /* We ran out of idmap slots, which should be impossible */
+        WARN_ON_ONCE(1);
+        return false;
+}
+
 static void clean_func_state(struct bpf_verifier_env *env,
                              struct bpf_func_state *st)
 {
@@ -15253,6 +15285,15 @@ static bool regs_exact(const struct bpf_reg_state *rold,
                check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap);
 }

+static bool scalar_regs_exact(struct bpf_verifier_env *env,
+                              const struct bpf_reg_state *rold,
+                              const struct bpf_reg_state *rcur,
+                              struct bpf_id_pair *idmap)
+{
+        return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
+               check_scalar_ids(env, rold->id, rcur->id, idmap);
+}
+
 /* Returns true if (rold safe implies rcur safe) */
 static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
                     struct bpf_reg_state *rcur, struct bpf_id_pair *idmap)
@@ -15292,15 +15333,39 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,

         switch (base_type(rold->type)) {
         case SCALAR_VALUE:
-                if (regs_exact(rold, rcur, idmap))
+                if (scalar_regs_exact(env, rold, rcur, idmap))
                         return true;
                 if (env->explore_alu_limits)
                         return false;
                 if (!rold->precise)
                         return true;
-                /* new val must satisfy old val knowledge */
+                /* Why check_ids() for scalar registers?
+                 *
+                 * Consider the following BPF code:
+                 * 1: r6 = ... unbound scalar, ID=a ...
+                 * 2: r7 = ... unbound scalar, ID=b ...
+                 * 3: if (r6 > r7) goto +1
+                 * 4: r6 = r7
+                 * 5: if (r6 > X) goto ...
+                 * 6: ... memory operation using r7 ...
+                 *
+                 * First verification path is [1-6]:
+                 * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
+                 * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
+                 *   r7 <= X, because r6 and r7 share same id.
+                 * Next verification path is [1-4, 6].
+                 *
+                 * Instruction (6) would be reached in two states:
+                 * I. r6{.id=b}, r7{.id=b} via path 1-6;
+                 * II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
+                 *
+                 * Use check_ids() to distinguish these states.
+                 * ---
+                 * Also verify that new value satisfies old value range knowledge.
+                 */
                 return range_within(rold, rcur) &&
-                       tnum_in(rold->var_off, rcur->var_off);
+                       tnum_in(rold->var_off, rcur->var_off) &&
+                       check_scalar_ids(env, rold->id, rcur->id, idmap);
         case PTR_TO_MAP_KEY:
         case PTR_TO_MAP_VALUE:
         case PTR_TO_MEM:
@@ -15542,6 +15607,8 @@ static bool states_equal(struct bpf_verifier_env *env,
         if (old->active_rcu_lock != cur->active_rcu_lock)
                 return false;

+        env->tmp_id_gen = env->id_gen;
+
         /* for states to be equal callsites have to be the same
          * and all frame states need to be equivalent
          */
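To make the two new mapping rules concrete, the standalone user-space harness below mirrors the check_scalar_ids() logic from the diff above and replays the r6/r7 scenario from the commit message. The main() driver, the global tmp_id_gen, the map size, and the ID values (a = 1, b = 2) are all illustrative assumptions:

```c
#include <stdbool.h>
#include <stdio.h>

#define BPF_ID_MAP_SIZE 64      /* arbitrary size for the demo */

struct bpf_id_pair { unsigned int old, cur; };

static unsigned int tmp_id_gen = 1000; /* stand-in for env->tmp_id_gen */

static bool check_scalar_ids(unsigned int old_id, unsigned int cur_id,
                             struct bpf_id_pair *idmap)
{
        unsigned int i;

        /* ID zero means "no ID": materialize a unique one, so an ID-less
         * register can pair with anything once but never alias a second
         * register */
        old_id = old_id ? old_id : ++tmp_id_gen;
        cur_id = cur_id ? cur_id : ++tmp_id_gen;

        for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
                if (!idmap[i].old) {
                        idmap[i].old = old_id;
                        idmap[i].cur = cur_id;
                        return true;
                }
                if (idmap[i].old == old_id)
                        return idmap[i].cur == cur_id;
                if (idmap[i].cur == cur_id)
                        return false; /* two old IDs may not share one cur ID */
        }
        return false;
}

int main(void)
{
        struct bpf_id_pair idmap[BPF_ID_MAP_SIZE] = {0};

        /* Old (checkpoint) state: r6{.id=b}, r7{.id=b} are linked.
         * Cur state:              r6{.id=a}, r7{.id=b} are not.
         * With a = 1, b = 2: */
        printf("r6: %d\n", check_scalar_ids(2, 1, idmap)); /* 1: maps b -> a */
        printf("r7: %d\n", check_scalar_ids(2, 2, idmap)); /* 0: b already maps to a */
        return 0;
}
```

The second call fails precisely because the old state promises that r6 and r7 move together, while the current state cannot honor that link, which is the unsafe situation described in the commit message.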
Make sure that the following unsafe example is rejected by verifier:

1: r9 = ... some pointer with range X ...
2: r6 = ... unbound scalar ID=a ...
3: r7 = ... unbound scalar ID=b ...
4: if (r6 > r7) goto +1
5: r6 = r7
6: if (r6 > X) goto ...
--- checkpoint ---
7: r9 += r7
8: *(u64 *)r9 = Y

This example is unsafe because not all execution paths verify r7 range.
Because of the jump at (4) the verifier would arrive at (6) in two states:
I. r6{.id=b}, r7{.id=b} via path 1-6;
II. r6{.id=a}, r7{.id=b} via path 1-4, 6.

Currently regsafe() does not call check_ids() for scalar registers,
thus from POV of regsafe() states (I) and (II) are identical. If the
path 1-6 is taken by verifier first, and checkpoint is created at (6)
the path [1-4, 6] would be considered safe.

This commit adds a new function: check_scalar_ids() and updates
regsafe() to call it for precise scalar registers.
check_scalar_ids() differs from check_ids() in order to:
- treat ID zero as a unique scalar ID;
- disallow mapping same 'cur_id' to multiple 'old_id'.

To minimize the impact on verification performance, avoid generating
bpf_reg_state::id for constant scalar values when processing BPF_MOV
in check_alu_op(). Scalar IDs are utilized by find_equal_scalars() to
propagate information about value ranges for registers that hold the
same value. However, there is no need to propagate range information
for constants.

Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf_verifier.h |  1 +
 kernel/bpf/verifier.c        | 77 +++++++++++++++++++++++++++++++++---
 2 files changed, 73 insertions(+), 5 deletions(-)
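The unsafe pattern above translates into a verifier selftest along the following lines, written in the inline-assembly style used under tools/testing/selftests/bpf. This is a sketch only: the map definition, section name, label layout, and especially the expected __msg string are guesses for illustration, not the actual test added by the series:

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"

struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __uint(max_entries, 1);
        __type(key, __u32);
        __type(value, __u64);
} map_array SEC(".maps");

SEC("socket")
__failure __msg("math between map_value pointer and register") /* guessed message */
__naked void linked_scalars_range_not_propagated(void)
{
        asm volatile (
        "r1 = %[map_array] ll;"
        "r2 = r10;"
        "r2 += -8;"
        "r0 = 0;"
        "*(u64 *)(r2 + 0) = r0;"
        "call %[bpf_map_lookup_elem];"
        "if r0 == 0 goto l0_%=;"
        "r9 = r0;"                      /* r9 = map value, 8 bytes      */
        "call %[bpf_get_prandom_u32];"
        "r6 = r0;"                      /* r6 gets scalar ID=a          */
        "call %[bpf_get_prandom_u32];"
        "r7 = r0;"                      /* r7 gets scalar ID=b          */
        "if r6 > r7 goto l1_%=;"        /* (4) fork: skip assignment    */
        "r6 = r7;"                      /* (5) r6 joins ID=b            */
"l1_%=:"
        "if r6 > 3 goto l0_%=;"         /* (6) bounds r7 only via ID=b  */
        "r9 += r7;"                     /* (7) unsafe if r7 not bounded */
        "r0 = *(u8 *)(r9 + 0);"         /* (8)                          */
"l0_%=:"
        "r0 = 0;"
        "exit;"
        :
        : __imm(bpf_map_lookup_elem),
          __imm(bpf_get_prandom_u32),
          __imm_addr(map_array)
        : __clobber_all);
}

char _license[] SEC("license") = "GPL";
```

On the path that takes the jump at (4), r6 and r7 do not share an ID, so the bound established at (6) never reaches r7; with this fix the checkpoint created around (6) on the fall-through path can no longer prune that state.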