Message ID | 20231103000822.2509815-5-andrii@kernel.org (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | BPF register bounds range vs range support | expand |
On Thu, Nov 2, 2023 at 5:08 PM Andrii Nakryiko <andrii@kernel.org> wrote: > > Add simple sanity checks that validate well-formed ranges (min <= max) > across u64, s64, u32, and s32 ranges. Also for cases when the value is > constant (either 64-bit or 32-bit), we validate that ranges and tnums > are in agreement. > > These bounds checks are performed at the end of BPF_ALU/BPF_ALU64 > operations, on conditional jumps, and for LDX instructions (where subreg > zero/sign extension is probably the most important to check). This > covers most of the interesting cases. > > Also, we validate the sanity of the return register when manually > adjusting it for some special helpers. > > By default, sanity violation will trigger a warning in verifier log and > resetting register bounds to "unbounded" ones. But to aid development > and debugging, BPF_F_TEST_SANITY_STRICT flag is added, which will > trigger hard failure of verification with -EFAULT on register bounds > violations. This allows selftests to catch such issues. veristat will > also gain a CLI option to enable this behavior. > BTW, besides two verifier_bounds "artificial" selftests, we seem to have one more violation in more real-world-like test: bpf_cubic.bpf.c's bpf_cubic_cong_avoid: ; if (!(x & (~0ull << (BITS_PER_U64-1)))) 166: (65) if r1 s> 0xffffffff goto pc+1 REG SANITY VIOLATION (true_reg1): range bounds violation u64=[0x4000000000000000, 0x1fd809fd00000000] s64=[0x0, 0x1fd809fd00000000] u32=[0x0, 0x0] s32=[0x0, 0x0] var_off=(0x0, 0x1fd809fd00000000) We get to the point where R1 state is like this (before violation checks detection was added): R1=scalar(id=59,smin=umin=4611686018427387904,smax=umax=2294594992376643584,smin32=0,smax32=umax32=0,var_off=(0x0; 0x1fd809fd00000000)) umin >= umax, smin >= smax. All this before any of the verifier changes we landed in the previous patch set. I.e., none of my changes caused this breakage, it's pre-existing problem. > Signed-off-by: Andrii Nakryiko <andrii@kernel.org> > --- > include/linux/bpf_verifier.h | 1 + > include/uapi/linux/bpf.h | 3 + > kernel/bpf/syscall.c | 3 +- > kernel/bpf/verifier.c | 117 ++++++++++++++++++++++++++------- > tools/include/uapi/linux/bpf.h | 3 + > 5 files changed, 101 insertions(+), 26 deletions(-) > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h > index 24213a99cc79..402b6bc44a1b 100644 > --- a/include/linux/bpf_verifier.h > +++ b/include/linux/bpf_verifier.h > @@ -602,6 +602,7 @@ struct bpf_verifier_env { > int stack_size; /* number of states to be processed */ > bool strict_alignment; /* perform strict pointer alignment checks */ > bool test_state_freq; /* test verifier with different pruning frequency */ > + bool test_sanity_strict; /* fail verification on sanity violations */ > struct bpf_verifier_state *cur_state; /* current verifier state */ > struct bpf_verifier_state_list **explored_states; /* search pruning optimization */ > struct bpf_verifier_state_list *free_list; > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index 0f6cdf52b1da..b99c1e0e2730 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -1200,6 +1200,9 @@ enum bpf_perf_event_type { > */ > #define BPF_F_XDP_DEV_BOUND_ONLY (1U << 6) > > +/* The verifier internal test flag. Behavior is undefined */ > +#define BPF_F_TEST_SANITY_STRICT (1U << 7) > + > /* link_create.kprobe_multi.flags used in LINK_CREATE command for > * BPF_TRACE_KPROBE_MULTI attach type to create return probe. > */ > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index 0ed286b8a0f0..f266e03ba342 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -2573,7 +2573,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) > BPF_F_SLEEPABLE | > BPF_F_TEST_RND_HI32 | > BPF_F_XDP_HAS_FRAGS | > - BPF_F_XDP_DEV_BOUND_ONLY)) > + BPF_F_XDP_DEV_BOUND_ONLY | > + BPF_F_TEST_SANITY_STRICT)) > return -EINVAL; > > if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 8691cacd3ad3..af4e2fecbef2 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -2615,6 +2615,56 @@ static void reg_bounds_sync(struct bpf_reg_state *reg) > __update_reg_bounds(reg); > } > > +static int reg_bounds_sanity_check(struct bpf_verifier_env *env, > + struct bpf_reg_state *reg, const char *ctx) > +{ > + const char *msg; > + > + if (reg->umin_value > reg->umax_value || > + reg->smin_value > reg->smax_value || > + reg->u32_min_value > reg->u32_max_value || > + reg->s32_min_value > reg->s32_max_value) { > + msg = "range bounds violation"; > + goto out; > + } > + > + if (tnum_is_const(reg->var_off)) { > + u64 uval = reg->var_off.value; > + s64 sval = (s64)uval; > + > + if (reg->umin_value != uval || reg->umax_value != uval || > + reg->smin_value != sval || reg->smax_value != sval) { > + msg = "const tnum out of sync with range bounds"; > + goto out; > + } > + } > + > + if (tnum_subreg_is_const(reg->var_off)) { > + u32 uval32 = tnum_subreg(reg->var_off).value; > + s32 sval32 = (s32)uval32; > + > + if (reg->u32_min_value != uval32 || reg->u32_max_value != uval32 || > + reg->s32_min_value != sval32 || reg->s32_max_value != sval32) { > + msg = "const subreg tnum out of sync with range bounds"; > + goto out; > + } > + } > + > + return 0; > +out: > + verbose(env, "REG SANITY VIOLATION (%s): %s u64=[%#llx, %#llx] " > + "s64=[%#llx, %#llx] u32=[%#x, %#x] s32=[%#x, %#x] var_off=(%#llx, %#llx)\n", > + ctx, msg, reg->umin_value, reg->umax_value, > + reg->smin_value, reg->smax_value, > + reg->u32_min_value, reg->u32_max_value, > + reg->s32_min_value, reg->s32_max_value, > + reg->var_off.value, reg->var_off.mask); > + if (env->test_sanity_strict) > + return -EFAULT; > + __mark_reg_unbounded(reg); > + return 0; > +} > + > static bool __reg32_bound_s64(s32 a) > { > return a >= 0 && a <= S32_MAX; > @@ -9928,14 +9978,15 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx) > return 0; > } > > -static void do_refine_retval_range(struct bpf_reg_state *regs, int ret_type, > - int func_id, > - struct bpf_call_arg_meta *meta) > +static int do_refine_retval_range(struct bpf_verifier_env *env, > + struct bpf_reg_state *regs, int ret_type, > + int func_id, > + struct bpf_call_arg_meta *meta) > { > struct bpf_reg_state *ret_reg = ®s[BPF_REG_0]; > > if (ret_type != RET_INTEGER) > - return; > + return 0; > > switch (func_id) { > case BPF_FUNC_get_stack: > @@ -9961,6 +10012,8 @@ static void do_refine_retval_range(struct bpf_reg_state *regs, int ret_type, > reg_bounds_sync(ret_reg); > break; > } > + > + return reg_bounds_sanity_check(env, ret_reg, "retval"); > } > > static int > @@ -10612,7 +10665,9 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn > regs[BPF_REG_0].ref_obj_id = id; > } > > - do_refine_retval_range(regs, fn->ret_type, func_id, &meta); > + err = do_refine_retval_range(env, regs, fn->ret_type, func_id, &meta); > + if (err) > + return err; > > err = check_map_func_compatibility(env, meta.map_ptr, func_id); > if (err) > @@ -14079,13 +14134,12 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) > > /* check dest operand */ > err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK); > + err = err ?: adjust_reg_min_max_vals(env, insn); > if (err) > return err; > - > - return adjust_reg_min_max_vals(env, insn); > } > > - return 0; > + return reg_bounds_sanity_check(env, ®s[insn->dst_reg], "alu"); > } > > static void find_good_pkt_pointers(struct bpf_verifier_state *vstate, > @@ -14609,18 +14663,21 @@ static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state > * Technically we can do similar adjustments for pointers to the same object, > * but we don't support that right now. > */ > -static void reg_set_min_max(struct bpf_reg_state *true_reg1, > - struct bpf_reg_state *true_reg2, > - struct bpf_reg_state *false_reg1, > - struct bpf_reg_state *false_reg2, > - u8 opcode, bool is_jmp32) > +static int reg_set_min_max(struct bpf_verifier_env *env, > + struct bpf_reg_state *true_reg1, > + struct bpf_reg_state *true_reg2, > + struct bpf_reg_state *false_reg1, > + struct bpf_reg_state *false_reg2, > + u8 opcode, bool is_jmp32) > { > + int err; > + > /* If either register is a pointer, we can't learn anything about its > * variable offset from the compare (unless they were a pointer into > * the same object, but we don't bother with that). > */ > if (false_reg1->type != SCALAR_VALUE || false_reg2->type != SCALAR_VALUE) > - return; > + return 0; > > /* fallthrough (FALSE) branch */ > regs_refine_cond_op(false_reg1, false_reg2, rev_opcode(opcode), is_jmp32); > @@ -14631,6 +14688,12 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg1, > regs_refine_cond_op(true_reg1, true_reg2, opcode, is_jmp32); > reg_bounds_sync(true_reg1); > reg_bounds_sync(true_reg2); > + > + err = reg_bounds_sanity_check(env, true_reg1, "true_reg1"); > + err = err ?: reg_bounds_sanity_check(env, true_reg2, "true_reg2"); > + err = err ?: reg_bounds_sanity_check(env, false_reg1, "false_reg1"); > + err = err ?: reg_bounds_sanity_check(env, false_reg2, "false_reg2"); > + return err; > } > > static void mark_ptr_or_null_reg(struct bpf_func_state *state, > @@ -14924,15 +14987,20 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, > other_branch_regs = other_branch->frame[other_branch->curframe]->regs; > > if (BPF_SRC(insn->code) == BPF_X) { > - reg_set_min_max(&other_branch_regs[insn->dst_reg], > - &other_branch_regs[insn->src_reg], > - dst_reg, src_reg, opcode, is_jmp32); > + err = reg_set_min_max(env, > + &other_branch_regs[insn->dst_reg], > + &other_branch_regs[insn->src_reg], > + dst_reg, src_reg, opcode, is_jmp32); > } else /* BPF_SRC(insn->code) == BPF_K */ { > - reg_set_min_max(&other_branch_regs[insn->dst_reg], > - src_reg /* fake one */, > - dst_reg, src_reg /* same fake one */, > - opcode, is_jmp32); > + err = reg_set_min_max(env, > + &other_branch_regs[insn->dst_reg], > + src_reg /* fake one */, > + dst_reg, src_reg /* same fake one */, > + opcode, is_jmp32); > } > + if (err) > + return err; > + > if (BPF_SRC(insn->code) == BPF_X && > src_reg->type == SCALAR_VALUE && src_reg->id && > !WARN_ON_ONCE(src_reg->id != other_branch_regs[insn->src_reg].id)) { > @@ -17435,10 +17503,8 @@ static int do_check(struct bpf_verifier_env *env) > insn->off, BPF_SIZE(insn->code), > BPF_READ, insn->dst_reg, false, > BPF_MODE(insn->code) == BPF_MEMSX); > - if (err) > - return err; > - > - err = save_aux_ptr_type(env, src_reg_type, true); > + err = err ?: save_aux_ptr_type(env, src_reg_type, true); > + err = err ?: reg_bounds_sanity_check(env, ®s[insn->dst_reg], "ldx"); > if (err) > return err; > } else if (class == BPF_STX) { > @@ -20725,6 +20791,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3 > > if (is_priv) > env->test_state_freq = attr->prog_flags & BPF_F_TEST_STATE_FREQ; > + env->test_sanity_strict = attr->prog_flags & BPF_F_TEST_SANITY_STRICT; > > env->explored_states = kvcalloc(state_htab_size(env), > sizeof(struct bpf_verifier_state_list *), > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h > index 0f6cdf52b1da..b99c1e0e2730 100644 > --- a/tools/include/uapi/linux/bpf.h > +++ b/tools/include/uapi/linux/bpf.h > @@ -1200,6 +1200,9 @@ enum bpf_perf_event_type { > */ > #define BPF_F_XDP_DEV_BOUND_ONLY (1U << 6) > > +/* The verifier internal test flag. Behavior is undefined */ > +#define BPF_F_TEST_SANITY_STRICT (1U << 7) > + > /* link_create.kprobe_multi.flags used in LINK_CREATE command for > * BPF_TRACE_KPROBE_MULTI attach type to create return probe. > */ > -- > 2.34.1 >
On Thu, 2023-11-02 at 17:08 -0700, Andrii Nakryiko wrote: > Add simple sanity checks that validate well-formed ranges (min <= max) > across u64, s64, u32, and s32 ranges. Also for cases when the value is > constant (either 64-bit or 32-bit), we validate that ranges and tnums > are in agreement. > > These bounds checks are performed at the end of BPF_ALU/BPF_ALU64 > operations, on conditional jumps, and for LDX instructions (where subreg > zero/sign extension is probably the most important to check). This > covers most of the interesting cases. > > Also, we validate the sanity of the return register when manually > adjusting it for some special helpers. > > By default, sanity violation will trigger a warning in verifier log and > resetting register bounds to "unbounded" ones. But to aid development > and debugging, BPF_F_TEST_SANITY_STRICT flag is added, which will > trigger hard failure of verification with -EFAULT on register bounds > violations. This allows selftests to catch such issues. veristat will > also gain a CLI option to enable this behavior. This is a useful check but I'm not sure about placement. It might be useful to guard calls to coerce_subreg_to_size_sx() as well. Maybe insert it as a part of the main do_check() loop but filter by instruction class (and also force on stack_pop)? > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org> > --- > include/linux/bpf_verifier.h | 1 + > include/uapi/linux/bpf.h | 3 + > kernel/bpf/syscall.c | 3 +- > kernel/bpf/verifier.c | 117 ++++++++++++++++++++++++++------- > tools/include/uapi/linux/bpf.h | 3 + > 5 files changed, 101 insertions(+), 26 deletions(-) > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h > index 24213a99cc79..402b6bc44a1b 100644 > --- a/include/linux/bpf_verifier.h > +++ b/include/linux/bpf_verifier.h > @@ -602,6 +602,7 @@ struct bpf_verifier_env { > int stack_size; /* number of states to be processed */ > bool strict_alignment; /* perform strict pointer alignment checks */ > bool test_state_freq; /* test verifier with different pruning frequency */ > + bool test_sanity_strict; /* fail verification on sanity violations */ > struct bpf_verifier_state *cur_state; /* current verifier state */ > struct bpf_verifier_state_list **explored_states; /* search pruning optimization */ > struct bpf_verifier_state_list *free_list; > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index 0f6cdf52b1da..b99c1e0e2730 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -1200,6 +1200,9 @@ enum bpf_perf_event_type { > */ > #define BPF_F_XDP_DEV_BOUND_ONLY (1U << 6) > > +/* The verifier internal test flag. Behavior is undefined */ > +#define BPF_F_TEST_SANITY_STRICT (1U << 7) > + > /* link_create.kprobe_multi.flags used in LINK_CREATE command for > * BPF_TRACE_KPROBE_MULTI attach type to create return probe. > */ > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index 0ed286b8a0f0..f266e03ba342 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -2573,7 +2573,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) > BPF_F_SLEEPABLE | > BPF_F_TEST_RND_HI32 | > BPF_F_XDP_HAS_FRAGS | > - BPF_F_XDP_DEV_BOUND_ONLY)) > + BPF_F_XDP_DEV_BOUND_ONLY | > + BPF_F_TEST_SANITY_STRICT)) > return -EINVAL; > > if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 8691cacd3ad3..af4e2fecbef2 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -2615,6 +2615,56 @@ static void reg_bounds_sync(struct bpf_reg_state *reg) > __update_reg_bounds(reg); > } > > +static int reg_bounds_sanity_check(struct bpf_verifier_env *env, > + struct bpf_reg_state *reg, const char *ctx) > +{ > + const char *msg; > + > + if (reg->umin_value > reg->umax_value || > + reg->smin_value > reg->smax_value || > + reg->u32_min_value > reg->u32_max_value || > + reg->s32_min_value > reg->s32_max_value) { > + msg = "range bounds violation"; > + goto out; > + } > + > + if (tnum_is_const(reg->var_off)) { > + u64 uval = reg->var_off.value; > + s64 sval = (s64)uval; > + > + if (reg->umin_value != uval || reg->umax_value != uval || > + reg->smin_value != sval || reg->smax_value != sval) { > + msg = "const tnum out of sync with range bounds"; > + goto out; > + } > + } > + > + if (tnum_subreg_is_const(reg->var_off)) { > + u32 uval32 = tnum_subreg(reg->var_off).value; > + s32 sval32 = (s32)uval32; > + > + if (reg->u32_min_value != uval32 || reg->u32_max_value != uval32 || > + reg->s32_min_value != sval32 || reg->s32_max_value != sval32) { > + msg = "const subreg tnum out of sync with range bounds"; > + goto out; > + } > + } > + > + return 0; > +out: > + verbose(env, "REG SANITY VIOLATION (%s): %s u64=[%#llx, %#llx] " > + "s64=[%#llx, %#llx] u32=[%#x, %#x] s32=[%#x, %#x] var_off=(%#llx, %#llx)\n", > + ctx, msg, reg->umin_value, reg->umax_value, > + reg->smin_value, reg->smax_value, > + reg->u32_min_value, reg->u32_max_value, > + reg->s32_min_value, reg->s32_max_value, > + reg->var_off.value, reg->var_off.mask); > + if (env->test_sanity_strict) > + return -EFAULT; > + __mark_reg_unbounded(reg); > + return 0; > +} > + > static bool __reg32_bound_s64(s32 a) > { > return a >= 0 && a <= S32_MAX; > @@ -9928,14 +9978,15 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx) > return 0; > } > > -static void do_refine_retval_range(struct bpf_reg_state *regs, int ret_type, > - int func_id, > - struct bpf_call_arg_meta *meta) > +static int do_refine_retval_range(struct bpf_verifier_env *env, > + struct bpf_reg_state *regs, int ret_type, > + int func_id, > + struct bpf_call_arg_meta *meta) > { > struct bpf_reg_state *ret_reg = ®s[BPF_REG_0]; > > if (ret_type != RET_INTEGER) > - return; > + return 0; > > switch (func_id) { > case BPF_FUNC_get_stack: > @@ -9961,6 +10012,8 @@ static void do_refine_retval_range(struct bpf_reg_state *regs, int ret_type, > reg_bounds_sync(ret_reg); > break; > } > + > + return reg_bounds_sanity_check(env, ret_reg, "retval"); > } > > static int > @@ -10612,7 +10665,9 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn > regs[BPF_REG_0].ref_obj_id = id; > } > > - do_refine_retval_range(regs, fn->ret_type, func_id, &meta); > + err = do_refine_retval_range(env, regs, fn->ret_type, func_id, &meta); > + if (err) > + return err; > > err = check_map_func_compatibility(env, meta.map_ptr, func_id); > if (err) > @@ -14079,13 +14134,12 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) > > /* check dest operand */ > err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK); > + err = err ?: adjust_reg_min_max_vals(env, insn); > if (err) > return err; > - > - return adjust_reg_min_max_vals(env, insn); > } > > - return 0; > + return reg_bounds_sanity_check(env, ®s[insn->dst_reg], "alu"); > } > > static void find_good_pkt_pointers(struct bpf_verifier_state *vstate, > @@ -14609,18 +14663,21 @@ static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state > * Technically we can do similar adjustments for pointers to the same object, > * but we don't support that right now. > */ > -static void reg_set_min_max(struct bpf_reg_state *true_reg1, > - struct bpf_reg_state *true_reg2, > - struct bpf_reg_state *false_reg1, > - struct bpf_reg_state *false_reg2, > - u8 opcode, bool is_jmp32) > +static int reg_set_min_max(struct bpf_verifier_env *env, > + struct bpf_reg_state *true_reg1, > + struct bpf_reg_state *true_reg2, > + struct bpf_reg_state *false_reg1, > + struct bpf_reg_state *false_reg2, > + u8 opcode, bool is_jmp32) > { > + int err; > + > /* If either register is a pointer, we can't learn anything about its > * variable offset from the compare (unless they were a pointer into > * the same object, but we don't bother with that). > */ > if (false_reg1->type != SCALAR_VALUE || false_reg2->type != SCALAR_VALUE) > - return; > + return 0; > > /* fallthrough (FALSE) branch */ > regs_refine_cond_op(false_reg1, false_reg2, rev_opcode(opcode), is_jmp32); > @@ -14631,6 +14688,12 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg1, > regs_refine_cond_op(true_reg1, true_reg2, opcode, is_jmp32); > reg_bounds_sync(true_reg1); > reg_bounds_sync(true_reg2); > + > + err = reg_bounds_sanity_check(env, true_reg1, "true_reg1"); > + err = err ?: reg_bounds_sanity_check(env, true_reg2, "true_reg2"); > + err = err ?: reg_bounds_sanity_check(env, false_reg1, "false_reg1"); > + err = err ?: reg_bounds_sanity_check(env, false_reg2, "false_reg2"); > + return err; > } > > static void mark_ptr_or_null_reg(struct bpf_func_state *state, > @@ -14924,15 +14987,20 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, > other_branch_regs = other_branch->frame[other_branch->curframe]->regs; > > if (BPF_SRC(insn->code) == BPF_X) { > - reg_set_min_max(&other_branch_regs[insn->dst_reg], > - &other_branch_regs[insn->src_reg], > - dst_reg, src_reg, opcode, is_jmp32); > + err = reg_set_min_max(env, > + &other_branch_regs[insn->dst_reg], > + &other_branch_regs[insn->src_reg], > + dst_reg, src_reg, opcode, is_jmp32); > } else /* BPF_SRC(insn->code) == BPF_K */ { > - reg_set_min_max(&other_branch_regs[insn->dst_reg], > - src_reg /* fake one */, > - dst_reg, src_reg /* same fake one */, > - opcode, is_jmp32); > + err = reg_set_min_max(env, > + &other_branch_regs[insn->dst_reg], > + src_reg /* fake one */, > + dst_reg, src_reg /* same fake one */, > + opcode, is_jmp32); > } > + if (err) > + return err; > + > if (BPF_SRC(insn->code) == BPF_X && > src_reg->type == SCALAR_VALUE && src_reg->id && > !WARN_ON_ONCE(src_reg->id != other_branch_regs[insn->src_reg].id)) { > @@ -17435,10 +17503,8 @@ static int do_check(struct bpf_verifier_env *env) > insn->off, BPF_SIZE(insn->code), > BPF_READ, insn->dst_reg, false, > BPF_MODE(insn->code) == BPF_MEMSX); > - if (err) > - return err; > - > - err = save_aux_ptr_type(env, src_reg_type, true); > + err = err ?: save_aux_ptr_type(env, src_reg_type, true); > + err = err ?: reg_bounds_sanity_check(env, ®s[insn->dst_reg], "ldx"); > if (err) > return err; > } else if (class == BPF_STX) { > @@ -20725,6 +20791,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3 > > if (is_priv) > env->test_state_freq = attr->prog_flags & BPF_F_TEST_STATE_FREQ; > + env->test_sanity_strict = attr->prog_flags & BPF_F_TEST_SANITY_STRICT; > > env->explored_states = kvcalloc(state_htab_size(env), > sizeof(struct bpf_verifier_state_list *), > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h > index 0f6cdf52b1da..b99c1e0e2730 100644 > --- a/tools/include/uapi/linux/bpf.h > +++ b/tools/include/uapi/linux/bpf.h > @@ -1200,6 +1200,9 @@ enum bpf_perf_event_type { > */ > #define BPF_F_XDP_DEV_BOUND_ONLY (1U << 6) > > +/* The verifier internal test flag. Behavior is undefined */ > +#define BPF_F_TEST_SANITY_STRICT (1U << 7) > + > /* link_create.kprobe_multi.flags used in LINK_CREATE command for > * BPF_TRACE_KPROBE_MULTI attach type to create return probe. > */
On Fri, Nov 3, 2023 at 10:56 AM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Thu, 2023-11-02 at 17:08 -0700, Andrii Nakryiko wrote: > > Add simple sanity checks that validate well-formed ranges (min <= max) > > across u64, s64, u32, and s32 ranges. Also for cases when the value is > > constant (either 64-bit or 32-bit), we validate that ranges and tnums > > are in agreement. > > > > These bounds checks are performed at the end of BPF_ALU/BPF_ALU64 > > operations, on conditional jumps, and for LDX instructions (where subreg > > zero/sign extension is probably the most important to check). This > > covers most of the interesting cases. > > > > Also, we validate the sanity of the return register when manually > > adjusting it for some special helpers. > > > > By default, sanity violation will trigger a warning in verifier log and > > resetting register bounds to "unbounded" ones. But to aid development > > and debugging, BPF_F_TEST_SANITY_STRICT flag is added, which will > > trigger hard failure of verification with -EFAULT on register bounds > > violations. This allows selftests to catch such issues. veristat will > > also gain a CLI option to enable this behavior. > > This is a useful check but I'm not sure about placement. > It might be useful to guard calls to coerce_subreg_to_size_sx() as well. Those are covered as part of the ALU/ALU64 check. My initial idea was to add it into reg_bounds_sync() and make reg_bounds_sync() return int (right now it's void). But discussing with Alexei we came to the conclusion that it would be a bit too much code churn for little gain. This coerce_subreg...() stuff, it's also void, so we'd need to propagate errors out of it as well. In the end I think I'm covering basically all relevant cases (ALU, LDX, RETVAL, COND_JUMP). > Maybe insert it as a part of the main do_check() loop but filter > by instruction class (and also force on stack_pop)? That would be a) a bit wasteful, and b) I'd need to re-interpret BPF_X vs BPF_K and all the other idiosyncrasies of instruction encoding. So it doesn't seem like a good idea. > > > > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org> > > --- > > include/linux/bpf_verifier.h | 1 + > > include/uapi/linux/bpf.h | 3 + > > kernel/bpf/syscall.c | 3 +- > > kernel/bpf/verifier.c | 117 ++++++++++++++++++++++++++------- > > tools/include/uapi/linux/bpf.h | 3 + > > 5 files changed, 101 insertions(+), 26 deletions(-) > > trimming is good [...]
On Fri, 2023-11-03 at 14:11 -0700, Andrii Nakryiko wrote: [...] > > This is a useful check but I'm not sure about placement. > > It might be useful to guard calls to coerce_subreg_to_size_sx() as well. > > Those are covered as part of the ALU/ALU64 check. Oh, right, sorry. > My initial idea was to add it into reg_bounds_sync() and make > reg_bounds_sync() return int (right now it's void). But discussing > with Alexei we came to the conclusion that it would be a bit too much > code churn for little gain. This coerce_subreg...() stuff, it's also > void, so we'd need to propagate errors out of it as well. > > In the end I think I'm covering basically all relevant cases (ALU, > LDX, RETVAL, COND_JUMP). > > > Maybe insert it as a part of the main do_check() loop but filter > > by instruction class (and also force on stack_pop)? > > That would be a) a bit wasteful, and b) I'd need to re-interpret BPF_X > vs BPF_K and all the other idiosyncrasies of instruction encoding. So > it doesn't seem like a good idea. tbh I think that compartmentalizing this check worth a little bit of churn, but ok, not that important. Acked-by: Eduard Zingerman <eddyz87@gmail.com>
On Thu, Nov 02, 2023 at 05:08:13PM -0700, Andrii Nakryiko wrote: > Add simple sanity checks that validate well-formed ranges (min <= max) > across u64, s64, u32, and s32 ranges. Also for cases when the value is > constant (either 64-bit or 32-bit), we validate that ranges and tnums > are in agreement. > > These bounds checks are performed at the end of BPF_ALU/BPF_ALU64 > operations, on conditional jumps, and for LDX instructions (where subreg > zero/sign extension is probably the most important to check). This > covers most of the interesting cases. > > Also, we validate the sanity of the return register when manually > adjusting it for some special helpers. > > By default, sanity violation will trigger a warning in verifier log and > resetting register bounds to "unbounded" ones. But to aid development > and debugging, BPF_F_TEST_SANITY_STRICT flag is added, which will > trigger hard failure of verification with -EFAULT on register bounds > violations. This allows selftests to catch such issues. veristat will > also gain a CLI option to enable this behavior. > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org> > --- > include/linux/bpf_verifier.h | 1 + > include/uapi/linux/bpf.h | 3 + > kernel/bpf/syscall.c | 3 +- > kernel/bpf/verifier.c | 117 ++++++++++++++++++++++++++------- > tools/include/uapi/linux/bpf.h | 3 + > 5 files changed, 101 insertions(+), 26 deletions(-) [...] > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 8691cacd3ad3..af4e2fecbef2 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -2615,6 +2615,56 @@ static void reg_bounds_sync(struct bpf_reg_state *reg) > __update_reg_bounds(reg); > } > > +static int reg_bounds_sanity_check(struct bpf_verifier_env *env, > + struct bpf_reg_state *reg, const char *ctx) > +{ > + const char *msg; > + > + if (reg->umin_value > reg->umax_value || > + reg->smin_value > reg->smax_value || > + reg->u32_min_value > reg->u32_max_value || > + reg->s32_min_value > reg->s32_max_value) { > + msg = "range bounds violation"; > + goto out; > + } Maybe Check tnum validity before comparing it with min/max? The mask bit and value bit at the same position can not both be set. if (reg->var_off.mask & reg->var_off.value) { msg = "tnum invalid"; goto out; } Unfortunately doing tnum_intersect() on two non-overlapping tnum still gives a valid tnum; so we can't readily detect such cases like how we do for ranges above. > + if (tnum_is_const(reg->var_off)) { > + u64 uval = reg->var_off.value; > + s64 sval = (s64)uval; > + > + if (reg->umin_value != uval || reg->umax_value != uval || > + reg->smin_value != sval || reg->smax_value != sval) { > + msg = "const tnum out of sync with range bounds"; > + goto out; > + } > + } > + > + if (tnum_subreg_is_const(reg->var_off)) { > + u32 uval32 = tnum_subreg(reg->var_off).value; > + s32 sval32 = (s32)uval32; > + > + if (reg->u32_min_value != uval32 || reg->u32_max_value != uval32 || > + reg->s32_min_value != sval32 || reg->s32_max_value != sval32) { > + msg = "const subreg tnum out of sync with range bounds"; > + goto out; > + } > + } > + > + return 0; > +out: > + verbose(env, "REG SANITY VIOLATION (%s): %s u64=[%#llx, %#llx] " > + "s64=[%#llx, %#llx] u32=[%#x, %#x] s32=[%#x, %#x] var_off=(%#llx, %#llx)\n", > + ctx, msg, reg->umin_value, reg->umax_value, > + reg->smin_value, reg->smax_value, > + reg->u32_min_value, reg->u32_max_value, > + reg->s32_min_value, reg->s32_max_value, > + reg->var_off.value, reg->var_off.mask); > + if (env->test_sanity_strict) > + return -EFAULT; > + __mark_reg_unbounded(reg); > + return 0; > +} [...]
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 24213a99cc79..402b6bc44a1b 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -602,6 +602,7 @@ struct bpf_verifier_env { int stack_size; /* number of states to be processed */ bool strict_alignment; /* perform strict pointer alignment checks */ bool test_state_freq; /* test verifier with different pruning frequency */ + bool test_sanity_strict; /* fail verification on sanity violations */ struct bpf_verifier_state *cur_state; /* current verifier state */ struct bpf_verifier_state_list **explored_states; /* search pruning optimization */ struct bpf_verifier_state_list *free_list; diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 0f6cdf52b1da..b99c1e0e2730 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1200,6 +1200,9 @@ enum bpf_perf_event_type { */ #define BPF_F_XDP_DEV_BOUND_ONLY (1U << 6) +/* The verifier internal test flag. Behavior is undefined */ +#define BPF_F_TEST_SANITY_STRICT (1U << 7) + /* link_create.kprobe_multi.flags used in LINK_CREATE command for * BPF_TRACE_KPROBE_MULTI attach type to create return probe. */ diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 0ed286b8a0f0..f266e03ba342 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2573,7 +2573,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) BPF_F_SLEEPABLE | BPF_F_TEST_RND_HI32 | BPF_F_XDP_HAS_FRAGS | - BPF_F_XDP_DEV_BOUND_ONLY)) + BPF_F_XDP_DEV_BOUND_ONLY | + BPF_F_TEST_SANITY_STRICT)) return -EINVAL; if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 8691cacd3ad3..af4e2fecbef2 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2615,6 +2615,56 @@ static void reg_bounds_sync(struct bpf_reg_state *reg) __update_reg_bounds(reg); } +static int reg_bounds_sanity_check(struct bpf_verifier_env *env, + struct bpf_reg_state *reg, const char *ctx) +{ + const char *msg; + + if (reg->umin_value > reg->umax_value || + reg->smin_value > reg->smax_value || + reg->u32_min_value > reg->u32_max_value || + reg->s32_min_value > reg->s32_max_value) { + msg = "range bounds violation"; + goto out; + } + + if (tnum_is_const(reg->var_off)) { + u64 uval = reg->var_off.value; + s64 sval = (s64)uval; + + if (reg->umin_value != uval || reg->umax_value != uval || + reg->smin_value != sval || reg->smax_value != sval) { + msg = "const tnum out of sync with range bounds"; + goto out; + } + } + + if (tnum_subreg_is_const(reg->var_off)) { + u32 uval32 = tnum_subreg(reg->var_off).value; + s32 sval32 = (s32)uval32; + + if (reg->u32_min_value != uval32 || reg->u32_max_value != uval32 || + reg->s32_min_value != sval32 || reg->s32_max_value != sval32) { + msg = "const subreg tnum out of sync with range bounds"; + goto out; + } + } + + return 0; +out: + verbose(env, "REG SANITY VIOLATION (%s): %s u64=[%#llx, %#llx] " + "s64=[%#llx, %#llx] u32=[%#x, %#x] s32=[%#x, %#x] var_off=(%#llx, %#llx)\n", + ctx, msg, reg->umin_value, reg->umax_value, + reg->smin_value, reg->smax_value, + reg->u32_min_value, reg->u32_max_value, + reg->s32_min_value, reg->s32_max_value, + reg->var_off.value, reg->var_off.mask); + if (env->test_sanity_strict) + return -EFAULT; + __mark_reg_unbounded(reg); + return 0; +} + static bool __reg32_bound_s64(s32 a) { return a >= 0 && a <= S32_MAX; @@ -9928,14 +9978,15 @@ static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx) return 0; } -static void do_refine_retval_range(struct bpf_reg_state *regs, int ret_type, - int func_id, - struct bpf_call_arg_meta *meta) +static int do_refine_retval_range(struct bpf_verifier_env *env, + struct bpf_reg_state *regs, int ret_type, + int func_id, + struct bpf_call_arg_meta *meta) { struct bpf_reg_state *ret_reg = ®s[BPF_REG_0]; if (ret_type != RET_INTEGER) - return; + return 0; switch (func_id) { case BPF_FUNC_get_stack: @@ -9961,6 +10012,8 @@ static void do_refine_retval_range(struct bpf_reg_state *regs, int ret_type, reg_bounds_sync(ret_reg); break; } + + return reg_bounds_sanity_check(env, ret_reg, "retval"); } static int @@ -10612,7 +10665,9 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn regs[BPF_REG_0].ref_obj_id = id; } - do_refine_retval_range(regs, fn->ret_type, func_id, &meta); + err = do_refine_retval_range(env, regs, fn->ret_type, func_id, &meta); + if (err) + return err; err = check_map_func_compatibility(env, meta.map_ptr, func_id); if (err) @@ -14079,13 +14134,12 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) /* check dest operand */ err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK); + err = err ?: adjust_reg_min_max_vals(env, insn); if (err) return err; - - return adjust_reg_min_max_vals(env, insn); } - return 0; + return reg_bounds_sanity_check(env, ®s[insn->dst_reg], "alu"); } static void find_good_pkt_pointers(struct bpf_verifier_state *vstate, @@ -14609,18 +14663,21 @@ static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state * Technically we can do similar adjustments for pointers to the same object, * but we don't support that right now. */ -static void reg_set_min_max(struct bpf_reg_state *true_reg1, - struct bpf_reg_state *true_reg2, - struct bpf_reg_state *false_reg1, - struct bpf_reg_state *false_reg2, - u8 opcode, bool is_jmp32) +static int reg_set_min_max(struct bpf_verifier_env *env, + struct bpf_reg_state *true_reg1, + struct bpf_reg_state *true_reg2, + struct bpf_reg_state *false_reg1, + struct bpf_reg_state *false_reg2, + u8 opcode, bool is_jmp32) { + int err; + /* If either register is a pointer, we can't learn anything about its * variable offset from the compare (unless they were a pointer into * the same object, but we don't bother with that). */ if (false_reg1->type != SCALAR_VALUE || false_reg2->type != SCALAR_VALUE) - return; + return 0; /* fallthrough (FALSE) branch */ regs_refine_cond_op(false_reg1, false_reg2, rev_opcode(opcode), is_jmp32); @@ -14631,6 +14688,12 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg1, regs_refine_cond_op(true_reg1, true_reg2, opcode, is_jmp32); reg_bounds_sync(true_reg1); reg_bounds_sync(true_reg2); + + err = reg_bounds_sanity_check(env, true_reg1, "true_reg1"); + err = err ?: reg_bounds_sanity_check(env, true_reg2, "true_reg2"); + err = err ?: reg_bounds_sanity_check(env, false_reg1, "false_reg1"); + err = err ?: reg_bounds_sanity_check(env, false_reg2, "false_reg2"); + return err; } static void mark_ptr_or_null_reg(struct bpf_func_state *state, @@ -14924,15 +14987,20 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, other_branch_regs = other_branch->frame[other_branch->curframe]->regs; if (BPF_SRC(insn->code) == BPF_X) { - reg_set_min_max(&other_branch_regs[insn->dst_reg], - &other_branch_regs[insn->src_reg], - dst_reg, src_reg, opcode, is_jmp32); + err = reg_set_min_max(env, + &other_branch_regs[insn->dst_reg], + &other_branch_regs[insn->src_reg], + dst_reg, src_reg, opcode, is_jmp32); } else /* BPF_SRC(insn->code) == BPF_K */ { - reg_set_min_max(&other_branch_regs[insn->dst_reg], - src_reg /* fake one */, - dst_reg, src_reg /* same fake one */, - opcode, is_jmp32); + err = reg_set_min_max(env, + &other_branch_regs[insn->dst_reg], + src_reg /* fake one */, + dst_reg, src_reg /* same fake one */, + opcode, is_jmp32); } + if (err) + return err; + if (BPF_SRC(insn->code) == BPF_X && src_reg->type == SCALAR_VALUE && src_reg->id && !WARN_ON_ONCE(src_reg->id != other_branch_regs[insn->src_reg].id)) { @@ -17435,10 +17503,8 @@ static int do_check(struct bpf_verifier_env *env) insn->off, BPF_SIZE(insn->code), BPF_READ, insn->dst_reg, false, BPF_MODE(insn->code) == BPF_MEMSX); - if (err) - return err; - - err = save_aux_ptr_type(env, src_reg_type, true); + err = err ?: save_aux_ptr_type(env, src_reg_type, true); + err = err ?: reg_bounds_sanity_check(env, ®s[insn->dst_reg], "ldx"); if (err) return err; } else if (class == BPF_STX) { @@ -20725,6 +20791,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3 if (is_priv) env->test_state_freq = attr->prog_flags & BPF_F_TEST_STATE_FREQ; + env->test_sanity_strict = attr->prog_flags & BPF_F_TEST_SANITY_STRICT; env->explored_states = kvcalloc(state_htab_size(env), sizeof(struct bpf_verifier_state_list *), diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 0f6cdf52b1da..b99c1e0e2730 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1200,6 +1200,9 @@ enum bpf_perf_event_type { */ #define BPF_F_XDP_DEV_BOUND_ONLY (1U << 6) +/* The verifier internal test flag. Behavior is undefined */ +#define BPF_F_TEST_SANITY_STRICT (1U << 7) + /* link_create.kprobe_multi.flags used in LINK_CREATE command for * BPF_TRACE_KPROBE_MULTI attach type to create return probe. */
Add simple sanity checks that validate well-formed ranges (min <= max) across u64, s64, u32, and s32 ranges. Also for cases when the value is constant (either 64-bit or 32-bit), we validate that ranges and tnums are in agreement. These bounds checks are performed at the end of BPF_ALU/BPF_ALU64 operations, on conditional jumps, and for LDX instructions (where subreg zero/sign extension is probably the most important to check). This covers most of the interesting cases. Also, we validate the sanity of the return register when manually adjusting it for some special helpers. By default, sanity violation will trigger a warning in verifier log and resetting register bounds to "unbounded" ones. But to aid development and debugging, BPF_F_TEST_SANITY_STRICT flag is added, which will trigger hard failure of verification with -EFAULT on register bounds violations. This allows selftests to catch such issues. veristat will also gain a CLI option to enable this behavior. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> --- include/linux/bpf_verifier.h | 1 + include/uapi/linux/bpf.h | 3 + kernel/bpf/syscall.c | 3 +- kernel/bpf/verifier.c | 117 ++++++++++++++++++++++++++------- tools/include/uapi/linux/bpf.h | 3 + 5 files changed, 101 insertions(+), 26 deletions(-)