From patchwork Tue Apr 25 23:49:02 2023
From: Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH bpf-next 01/10] veristat: add -t flag for adding BPF_F_TEST_STATE_FREQ program flag
Date: Tue, 25 Apr 2023 16:49:02 -0700
Message-ID: <20230425234911.2113352-2-andrii@kernel.org>
In-Reply-To: <20230425234911.2113352-1-andrii@kernel.org>

Sometimes during debugging it's important that a BPF program is loaded
with the BPF_F_TEST_STATE_FREQ flag set to force the verifier to do
frequent state checkpointing. Teach veristat to do this when the -t
("test states") flag is specified.
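For reference, the same effect can be achieved programmatically through
libbpf before loading an object. A minimal sketch (the object path
"prog.bpf.o" is a placeholder, error handling abbreviated):

	#include <errno.h>
	#include <bpf/libbpf.h>

	int load_with_test_state_freq(void)
	{
		struct bpf_object *obj;
		struct bpf_program *prog;
		int err;

		obj = bpf_object__open_file("prog.bpf.o", NULL);
		if (!obj)
			return -errno;

		/* same thing veristat -t does for each program */
		bpf_object__for_each_program(prog, obj)
			bpf_program__set_flags(prog, bpf_program__flags(prog) | BPF_F_TEST_STATE_FREQ);

		err = bpf_object__load(obj);
		bpf_object__close(obj);
		return err;
	}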
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/testing/selftests/bpf/veristat.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/testing/selftests/bpf/veristat.c b/tools/testing/selftests/bpf/veristat.c
index 1db7185181da..655095810d4a 100644
--- a/tools/testing/selftests/bpf/veristat.c
+++ b/tools/testing/selftests/bpf/veristat.c
@@ -141,6 +141,7 @@ static struct env {
 	bool verbose;
 	bool debug;
 	bool quiet;
+	bool force_checkpoints;
 	enum resfmt out_fmt;
 	bool show_version;
 	bool comparison_mode;
@@ -209,6 +210,8 @@ static const struct argp_option opts[] = {
 	{ "log-level", 'l', "LEVEL", 0, "Verifier log level (default 0 for normal mode, 1 for verbose mode)" },
 	{ "log-fixed", OPT_LOG_FIXED, NULL, 0, "Disable verifier log rotation" },
 	{ "log-size", OPT_LOG_SIZE, "BYTES", 0, "Customize verifier log size (default to 16MB)" },
+	{ "test-states", 't', NULL, 0,
+	  "Force frequent BPF verifier state checkpointing (set BPF_F_TEST_STATE_FREQ program flag)" },
 	{ "quiet", 'q', NULL, 0, "Quiet mode" },
 	{ "emit", 'e', "SPEC", 0, "Specify stats to be emitted" },
 	{ "sort", 's', "SPEC", 0, "Specify sort order" },
@@ -284,6 +287,9 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state)
 			argp_usage(state);
 		}
 		break;
+	case 't':
+		env.force_checkpoints = true;
+		break;
 	case 'C':
 		env.comparison_mode = true;
 		break;
@@ -989,6 +995,9 @@ static int process_prog(const char *filename, struct bpf_object *obj, struct bpf
 	/* increase chances of successful BPF object loading */
 	fixup_obj(obj, prog, base_filename);
 
+	if (env.force_checkpoints)
+		bpf_program__set_flags(prog, bpf_program__flags(prog) | BPF_F_TEST_STATE_FREQ);
+
 	err = bpf_object__load(obj);
 	env.progs_processed++;
From patchwork Tue Apr 25 23:49:03 2023
From: Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH bpf-next 02/10] bpf: mark relevant stack slots scratched for register read instructions
Date: Tue, 25 Apr 2023 16:49:03 -0700
Message-ID: <20230425234911.2113352-3-andrii@kernel.org>
In-Reply-To: <20230425234911.2113352-1-andrii@kernel.org>

When handling instructions that read stack slots, mark the relevant
stack slots as scratched so that the verifier log contains those slots'
states, in addition to the currently emitted registers with stack slot
offsets.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/verifier.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index fbcf5a4e2fcd..fea6fe4acba2 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4067,6 +4067,7 @@ static void mark_reg_stack_read(struct bpf_verifier_env *env,
 	for (i = min_off; i < max_off; i++) {
 		slot = -i - 1;
 		spi = slot / BPF_REG_SIZE;
+		mark_stack_slot_scratched(env, spi);
 		stype = ptr_state->stack[spi].slot_type;
 		if (stype[slot % BPF_REG_SIZE] != STACK_ZERO)
 			break;
@@ -4118,6 +4119,8 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
 	stype = reg_state->stack[spi].slot_type;
 	reg = &reg_state->stack[spi].spilled_ptr;
 
+	mark_stack_slot_scratched(env, spi);
+
 	if (is_spilled_reg(&reg_state->stack[spi])) {
 		u8 spill_size = 1;
From patchwork Tue Apr 25 23:49:04 2023
From: Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH bpf-next 03/10] bpf: encapsulate precision backtracking bookkeeping
Date: Tue, 25 Apr 2023 16:49:04 -0700
Message-ID: <20230425234911.2113352-4-andrii@kernel.org>
In-Reply-To: <20230425234911.2113352-1-andrii@kernel.org>

Add struct backtrack_state and a straightforward API around it to keep
track of register and stack masks used and maintained during the
precision backtracking process. Keeping this logic separate not only
makes the high-level backtracking algorithm cleaner, it also sets us up
to cleanly track register and stack masks per frame, allowing (with
some further logic adjustments) precision backpropagation across
multiple frames (i.e., subprog calls).
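To make the mask encoding concrete before diving into the diff: bit i of
a frame's register mask tracks r<i>, and bit spi of its stack mask
tracks the 8-byte slot at fp-8*(spi+1). A small standalone sketch of
that convention (userspace C, not kernel code):

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		uint32_t reg_mask = (1u << 0) | (1u << 2) | (1u << 4); /* r0, r2, r4 */
		uint64_t stack_mask = (1ull << 0) | (1ull << 2);       /* fp-8, fp-24 */

		for (int i = 0; i < 32; i++)
			if (reg_mask & (1u << i))
				printf("r%d ", i);
		for (int spi = 0; spi < 64; spi++)
			if (stack_mask & (1ull << spi))
				printf("fp%d ", -(spi + 1) * 8);
		printf("\n"); /* prints: r0 r2 r4 fp-8 fp-24 */
		return 0;
	}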
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/bpf_verifier.h |  15 ++
 kernel/bpf/verifier.c        | 258 ++++++++++++++++++++++++++---------
 2 files changed, 206 insertions(+), 67 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 3dd29a53b711..185bfaf0ec6b 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -238,6 +238,10 @@ enum bpf_stack_slot_type {
 
 #define BPF_REG_SIZE 8	/* size of eBPF register in bytes */
 
+#define BPF_REGMASK_ARGS ((1 << BPF_REG_1) | (1 << BPF_REG_2) | \
+			  (1 << BPF_REG_3) | (1 << BPF_REG_4) | \
+			  (1 << BPF_REG_5))
+
 #define BPF_DYNPTR_SIZE		sizeof(struct bpf_dynptr_kern)
 #define BPF_DYNPTR_NR_SLOTS		(BPF_DYNPTR_SIZE / BPF_REG_SIZE)
 
@@ -541,6 +545,16 @@ struct bpf_subprog_info {
 	bool is_async_cb;
 };
 
+struct bpf_verifier_env;
+
+struct backtrack_state {
+	struct bpf_verifier_env *env;
+	u32 frame;
+	u32 bitcnt;
+	u32 reg_masks[MAX_CALL_FRAMES];
+	u64 stack_masks[MAX_CALL_FRAMES];
+};
+
 /* single container for all structs
  * one verifier_env per bpf_check() call
  */
@@ -578,6 +592,7 @@ struct bpf_verifier_env {
 		int *insn_stack;
 		int cur_stack;
 	} cfg;
+	struct backtrack_state bt;
 	u32 pass_cnt; /* number of times do_check() was called */
 	u32 subprog_cnt;
 	/* number of instructions analyzed by the verifier */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index fea6fe4acba2..1cb89fe00507 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1254,6 +1254,12 @@ static bool is_spilled_reg(const struct bpf_stack_state *stack)
 	return stack->slot_type[BPF_REG_SIZE - 1] == STACK_SPILL;
 }
 
+static bool is_spilled_scalar_reg(const struct bpf_stack_state *stack)
+{
+	return stack->slot_type[BPF_REG_SIZE - 1] == STACK_SPILL &&
+	       stack->spilled_ptr.type == SCALAR_VALUE;
+}
+
 static void scrub_spilled_slot(u8 *stype)
 {
 	if (*stype != STACK_INVALID)
@@ -3144,12 +3150,137 @@ static const char *disasm_kfunc_name(void *data, const struct bpf_insn *insn)
 	return btf_name_by_offset(desc_btf, func->name_off);
 }
 
+static inline void bt_init(struct backtrack_state *bt, u32 frame)
+{
+	bt->frame = frame;
+}
+
+static inline void bt_reset(struct backtrack_state *bt)
+{
+	struct bpf_verifier_env *env = bt->env;
+
+	memset(bt, 0, sizeof(*bt));
+	bt->env = env;
+}
+
+static inline u32 bt_bitcnt(struct backtrack_state *bt)
+{
+	return bt->bitcnt;
+}
+
+static inline int bt_subprog_enter(struct backtrack_state *bt)
+{
+	if (bt->frame == MAX_CALL_FRAMES - 1) {
+		verbose(bt->env, "BUG subprog enter from frame %d\n", bt->frame);
+		WARN_ONCE(1, "verifier backtracking bug");
+		return -EFAULT;
+	}
+	bt->frame++;
+	return 0;
+}
+
+static inline int bt_subprog_exit(struct backtrack_state *bt)
+{
+	if (bt->frame == 0) {
+		verbose(bt->env, "BUG subprog exit from frame 0\n");
+		WARN_ONCE(1, "verifier backtracking bug");
+		return -EFAULT;
+	}
+	bt->frame--;
+	return 0;
+}
+
+static inline void bt_set_frame_reg(struct backtrack_state *bt, u32 frame, u32 reg)
+{
+	if (bt->reg_masks[frame] & (1 << reg))
+		return;
+
+	bt->reg_masks[frame] |= 1 << reg;
+	bt->bitcnt++;
+}
+
+static inline void bt_clear_frame_reg(struct backtrack_state *bt, u32 frame, u32 reg)
+{
+	if (!(bt->reg_masks[frame] & (1 << reg)))
+		return;
+
+	bt->reg_masks[frame] &= ~(1 << reg);
+	bt->bitcnt--;
+}
+
+static inline void bt_set_reg(struct backtrack_state *bt, u32 reg)
+{
+	bt_set_frame_reg(bt, bt->frame, reg);
+}
+
+static inline void bt_clear_reg(struct backtrack_state *bt, u32 reg)
+{
+	bt_clear_frame_reg(bt, bt->frame, reg);
+}
+
+static inline void bt_set_frame_slot(struct backtrack_state *bt, u32 frame, u32 slot)
+{
+	if (bt->stack_masks[frame] & (1ull << slot))
+		return;
+
+	bt->stack_masks[frame] |= 1ull << slot;
+	bt->bitcnt++;
+}
+
+static inline void bt_clear_frame_slot(struct backtrack_state *bt, u32 frame, u32 slot)
+{
+	if (!(bt->stack_masks[frame] & (1ull << slot)))
+		return;
+
+	bt->stack_masks[frame] &= ~(1ull << slot);
+	bt->bitcnt--;
+}
+
+static inline void bt_set_slot(struct backtrack_state *bt, u32 slot)
+{
+	bt_set_frame_slot(bt, bt->frame, slot);
+}
+
+static inline void bt_clear_slot(struct backtrack_state *bt, u32 slot)
+{
+	bt_clear_frame_slot(bt, bt->frame, slot);
+}
+
+static inline u32 bt_frame_reg_mask(struct backtrack_state *bt, u32 frame)
+{
+	return bt->reg_masks[frame];
+}
+
+static inline u32 bt_reg_mask(struct backtrack_state *bt)
+{
+	return bt->reg_masks[bt->frame];
+}
+
+static inline u64 bt_frame_stack_mask(struct backtrack_state *bt, u32 frame)
+{
+	return bt->stack_masks[frame];
+}
+
+static inline u64 bt_stack_mask(struct backtrack_state *bt)
+{
+	return bt->stack_masks[bt->frame];
+}
+
+static inline bool bt_is_reg_set(struct backtrack_state *bt, u32 reg)
+{
+	return bt->reg_masks[bt->frame] & (1 << reg);
+}
+
+static inline bool bt_is_slot_set(struct backtrack_state *bt, u32 slot)
+{
+	return bt->stack_masks[bt->frame] & (1ull << slot);
+}
+
 /* For given verifier state backtrack_insn() is called from the last insn to
  * the first insn. Its purpose is to compute a bitmask of registers and
  * stack slots that needs precision in the parent verifier state.
  */
 static int backtrack_insn(struct bpf_verifier_env *env, int idx,
-			  u32 *reg_mask, u64 *stack_mask)
+			  struct backtrack_state *bt)
 {
 	const struct bpf_insn_cbs cbs = {
 		.cb_call	= disasm_kfunc_name,
@@ -3160,20 +3291,20 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
 	u8 class = BPF_CLASS(insn->code);
 	u8 opcode = BPF_OP(insn->code);
 	u8 mode = BPF_MODE(insn->code);
-	u32 dreg = 1u << insn->dst_reg;
-	u32 sreg = 1u << insn->src_reg;
+	u32 dreg = insn->dst_reg;
+	u32 sreg = insn->src_reg;
 	u32 spi;
 
 	if (insn->code == 0)
 		return 0;
 	if (env->log.level & BPF_LOG_LEVEL2) {
-		verbose(env, "regs=%x stack=%llx before ", *reg_mask, *stack_mask);
+		verbose(env, "regs=%x stack=%llx before ", bt_reg_mask(bt), bt_stack_mask(bt));
 		verbose(env, "%d: ", idx);
 		print_bpf_insn(&cbs, insn, env->allow_ptr_leaks);
 	}
 
 	if (class == BPF_ALU || class == BPF_ALU64) {
-		if (!(*reg_mask & dreg))
+		if (!bt_is_reg_set(bt, dreg))
 			return 0;
 		if (opcode == BPF_MOV) {
 			if (BPF_SRC(insn->code) == BPF_X) {
@@ -3181,8 +3312,8 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
 				 * dreg needs precision after this insn
 				 * sreg needs precision before this insn
 				 */
-				*reg_mask &= ~dreg;
-				*reg_mask |= sreg;
+				bt_clear_reg(bt, dreg);
+				bt_set_reg(bt, sreg);
 			} else {
 				/* dreg = K
 				 * dreg needs precision after this insn.
@@ -3190,7 +3321,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
 				 * as precise=true in this verifier state.
 				 * No further markings in parent are necessary
 				 */
-				*reg_mask &= ~dreg;
+				bt_clear_reg(bt, dreg);
 			}
 		} else {
 			if (BPF_SRC(insn->code) == BPF_X) {
@@ -3198,15 +3329,15 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
 				 * both dreg and sreg need precision
 				 * before this insn
 				 */
-				*reg_mask |= sreg;
+				bt_set_reg(bt, sreg);
 			} /* else dreg += K
 			   * dreg still needs precision before this insn
 			   */
 		}
 	} else if (class == BPF_LDX) {
-		if (!(*reg_mask & dreg))
+		if (!bt_is_reg_set(bt, dreg))
 			return 0;
-		*reg_mask &= ~dreg;
+		bt_clear_reg(bt, dreg);
 
 		/* scalars can only be spilled into stack w/o losing precision.
 		 * Load from any other memory can be zero extended.
@@ -3227,9 +3358,9 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
 			WARN_ONCE(1, "verifier backtracking bug");
 			return -EFAULT;
 		}
-		*stack_mask |= 1ull << spi;
+		bt_set_slot(bt, spi);
 	} else if (class == BPF_STX || class == BPF_ST) {
-		if (*reg_mask & dreg)
+		if (bt_is_reg_set(bt, dreg))
 			/* stx & st shouldn't be using _scalar_ dst_reg
 			 * to access memory. It means backtracking
 			 * encountered a case of pointer subtraction.
@@ -3244,11 +3375,11 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
 			WARN_ONCE(1, "verifier backtracking bug");
 			return -EFAULT;
 		}
-		if (!(*stack_mask & (1ull << spi)))
+		if (!bt_is_slot_set(bt, spi))
 			return 0;
-		*stack_mask &= ~(1ull << spi);
+		bt_clear_slot(bt, spi);
 		if (class == BPF_STX)
-			*reg_mask |= sreg;
+			bt_set_reg(bt, sreg);
 	} else if (class == BPF_JMP || class == BPF_JMP32) {
 		if (opcode == BPF_CALL) {
 			if (insn->src_reg == BPF_PSEUDO_CALL)
@@ -3265,19 +3396,19 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
 			if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL && insn->imm == 0)
 				return -ENOTSUPP;
 			/* regular helper call sets R0 */
-			*reg_mask &= ~1;
-			if (*reg_mask & 0x3f) {
+			bt_clear_reg(bt, BPF_REG_0);
+			if (bt_reg_mask(bt) & BPF_REGMASK_ARGS) {
 				/* if backtracing was looking for registers R1-R5
 				 * they should have been found already.
 				 */
-				verbose(env, "BUG regs %x\n", *reg_mask);
+				verbose(env, "BUG regs %x\n", bt_reg_mask(bt));
 				WARN_ONCE(1, "verifier backtracking bug");
 				return -EFAULT;
 			}
 		} else if (opcode == BPF_EXIT) {
 			return -ENOTSUPP;
 		} else if (BPF_SRC(insn->code) == BPF_X) {
-			if (!(*reg_mask & (dreg | sreg)))
+			if (!bt_is_reg_set(bt, dreg) && !bt_is_reg_set(bt, sreg))
 				return 0;
 			/* dreg <cond> sreg
 			 * Both dreg and sreg need precision before
 			 * this insn. If only sreg was marked precise
 			 * before it would be equally necessary to
 			 * propagate it to dreg.
 			 */
-			*reg_mask |= (sreg | dreg);
+			bt_set_reg(bt, dreg);
+			bt_set_reg(bt, sreg);
 			 /* else dreg <cond> K
 			  * Only dreg still needs precision before
 			  * this insn, so for the K-based conditional
 			  * there is nothing new to be marked.
 			  */
 		}
 	} else if (class == BPF_LD) {
-		if (!(*reg_mask & dreg))
+		if (!bt_is_reg_set(bt, dreg))
 			return 0;
-		*reg_mask &= ~dreg;
+		bt_clear_reg(bt, dreg);
 		/* It's ld_imm64 or ld_abs or ld_ind.
 		 * For ld_imm64 no further tracking of precision
 		 * into parent is necessary
@@ -3508,20 +3640,21 @@ static void mark_all_scalars_imprecise(struct bpf_verifier_env *env, struct bpf_verifier_state *st)
 static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno,
 				  int spi)
 {
+	struct backtrack_state *bt = &env->bt;
 	struct bpf_verifier_state *st = env->cur_state;
 	int first_idx = st->first_insn_idx;
 	int last_idx = env->insn_idx;
 	struct bpf_func_state *func;
 	struct bpf_reg_state *reg;
-	u32 reg_mask = regno >= 0 ? 1u << regno : 0;
-	u64 stack_mask = spi >= 0 ? 1ull << spi : 0;
 	bool skip_first = true;
-	bool new_marks = false;
 	int i, err;
 
 	if (!env->bpf_capable)
 		return 0;
 
+	/* set frame number from which we are starting to backtrack */
+	bt_init(bt, frame);
+
 	/* Do sanity checks against current state of register and/or stack
 	 * slot, but don't set precise flag in current state, as precision
 	 * tracking in the current state is unnecessary.
	 */
@@ -3533,26 +3666,17 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno,
 			WARN_ONCE(1, "backtracing misuse");
 			return -EFAULT;
 		}
-		new_marks = true;
+		bt_set_reg(bt, regno);
 	}
 
 	while (spi >= 0) {
-		if (!is_spilled_reg(&func->stack[spi])) {
-			stack_mask = 0;
-			break;
-		}
-		reg = &func->stack[spi].spilled_ptr;
-		if (reg->type != SCALAR_VALUE) {
-			stack_mask = 0;
+		if (!is_spilled_scalar_reg(&func->stack[spi]))
 			break;
-		}
-		new_marks = true;
+		bt_set_slot(bt, spi);
 		break;
 	}
 
-	if (!new_marks)
-		return 0;
-	if (!reg_mask && !stack_mask)
+	if (bt_bitcnt(bt) == 0)
 		return 0;
 
 	for (;;) {
@@ -3571,12 +3695,13 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno,
 			if (st->curframe == 0 &&
 			    st->frame[0]->subprogno > 0 &&
 			    st->frame[0]->callsite == BPF_MAIN_FUNC &&
-			    stack_mask == 0 && (reg_mask & ~0x3e) == 0) {
-				bitmap_from_u64(mask, reg_mask);
+			    bt_stack_mask(bt) == 0 &&
+			    (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) == 0) {
+				bitmap_from_u64(mask, bt_reg_mask(bt));
 				for_each_set_bit(i, mask, 32) {
 					reg = &st->frame[0]->regs[i];
 					if (reg->type != SCALAR_VALUE) {
-						reg_mask &= ~(1u << i);
+						bt_clear_reg(bt, i);
 						continue;
 					}
 					reg->precise = true;
@@ -3584,8 +3709,8 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno,
 				return 0;
 			}
 
-			verbose(env, "BUG backtracing func entry subprog %d reg_mask %x stack_mask %llx\n",
-				st->frame[0]->subprogno, reg_mask, stack_mask);
+			verbose(env, "BUG backtracking func entry subprog %d reg_mask %x stack_mask %llx\n",
+				st->frame[0]->subprogno, bt_reg_mask(bt), bt_stack_mask(bt));
 			WARN_ONCE(1, "verifier backtracking bug");
 			return -EFAULT;
 		}
@@ -3595,15 +3720,16 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno,
 				err = 0;
 				skip_first = false;
 			} else {
-				err = backtrack_insn(env, i, &reg_mask, &stack_mask);
+				err = backtrack_insn(env, i, bt);
 			}
 			if (err == -ENOTSUPP) {
 				mark_all_scalars_precise(env, st);
+				bt_reset(bt);
 				return 0;
 			} else if (err) {
 				return err;
 			}
-			if (!reg_mask && !stack_mask)
+			if (bt_bitcnt(bt) == 0)
 				/* Found assignment(s) into tracked register in this state.
 				 * Since this state is already marked, just return.
 				 * Nothing to be tracked further in the parent state.
@@ -3628,21 +3754,21 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno,
 		if (!st)
 			break;
 
-		new_marks = false;
 		func = st->frame[frame];
-		bitmap_from_u64(mask, reg_mask);
+		bitmap_from_u64(mask, bt_reg_mask(bt));
 		for_each_set_bit(i, mask, 32) {
 			reg = &func->regs[i];
 			if (reg->type != SCALAR_VALUE) {
-				reg_mask &= ~(1u << i);
+				bt_clear_reg(bt, i);
 				continue;
 			}
-			if (!reg->precise)
-				new_marks = true;
-			reg->precise = true;
+			if (reg->precise)
+				bt_clear_reg(bt, i);
+			else
+				reg->precise = true;
 		}
 
-		bitmap_from_u64(mask, stack_mask);
+		bitmap_from_u64(mask, bt_stack_mask(bt));
 		for_each_set_bit(i, mask, 64) {
 			if (i >= func->allocated_stack / BPF_REG_SIZE) {
 				/* the sequence of instructions:
 				 * 2: (bf) r3 = r10
 				 * 3: (7b) *(u64 *)(r3 -8) = r0
 				 * 4: (79) r4 = *(u64 *)(r10 -8)
 				 * doesn't contain jmps. It's backtracked
 				 * as a single block.
 				 * During backtracking insn 3 is not recognized as
 				 * stack access, so at the end of backtracking
 				 * stack slot fp-8 is still marked in stack_mask.
 				 * However the parent state may not have accessed
 				 * fp-8 and it's "unallocated" stack space.
 				 * In such case fallback to conservative.
				 */
 				mark_all_scalars_precise(env, st);
+				bt_reset(bt);
 				return 0;
 			}
 
-			if (!is_spilled_reg(&func->stack[i])) {
-				stack_mask &= ~(1ull << i);
+			if (!is_spilled_scalar_reg(&func->stack[i])) {
+				bt_clear_slot(bt, i);
 				continue;
 			}
 			reg = &func->stack[i].spilled_ptr;
-			if (reg->type != SCALAR_VALUE) {
-				stack_mask &= ~(1ull << i);
-				continue;
-			}
-			if (!reg->precise)
-				new_marks = true;
-			reg->precise = true;
+			if (reg->precise)
+				bt_clear_slot(bt, i);
+			else
+				reg->precise = true;
 		}
 
 		if (env->log.level & BPF_LOG_LEVEL2) {
 			verbose(env, "parent %s regs=%x stack=%llx marks:",
-				new_marks ? "didn't have" : "already had",
-				reg_mask, stack_mask);
+				bt_bitcnt(bt) ? "didn't have" : "already had",
+				bt_reg_mask(bt), bt_stack_mask(bt));
 			print_verifier_state(env, func, true);
 		}
 
-		if (!reg_mask && !stack_mask)
-			break;
-		if (!new_marks)
+		if (bt_bitcnt(bt) == 0)
 			break;
 
 		last_idx = st->last_insn_idx;
@@ -18809,6 +18931,8 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u32 uattr_size)
 	if (!env)
 		return -ENOMEM;
 
+	env->bt.env = env;
+
 	len = (*prog)->len;
 	env->insn_aux_data = vzalloc(array_size(sizeof(struct bpf_insn_aux_data), len));

From patchwork Tue Apr 25 23:49:05 2023
From: Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH bpf-next 04/10] bpf: improve precision backtrack logging
Date: Tue, 25 Apr 2023 16:49:05 -0700
Message-ID: <20230425234911.2113352-5-andrii@kernel.org>
In-Reply-To: <20230425234911.2113352-1-andrii@kernel.org>
Add helpers to format register and stack masks in a more human-readable
format. Adjust logging a bit during backtrack propagation, and
especially during the forced precision fallback logic, to make it
clearer what's going on (with log_level=2, of course), and also start
reporting the affected frame depth. This is in preparation for having
more than one active frame later, when precision propagation between
subprog calls is added.
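The "at least 306 bytes" figure in the TMP_STR_BUF_LEN comment below
checks out: a full 64-slot stack mask prints offsets -8 through -512,
which takes 243 characters of signed numbers (2 for "-8", 3 each for
"-16".."-96", 4 each for "-104".."-512") plus 63 commas = 306, and the
buffer is sized 320 to leave room for the NUL terminator and slack. A
userspace mirror of the fmt_stack_mask() logic (a sketch, not kernel
code) verifies this:

	#include <stdio.h>
	#include <string.h>
	#include <stdint.h>
	#include <sys/types.h>

	static void fmt_stack_mask(char *buf, ssize_t buf_sz, uint64_t stack_mask)
	{
		int first = 1, n;

		buf[0] = '\0';
		for (int i = 0; i < 64; i++) {
			if (!(stack_mask & (1ull << i)))
				continue;
			n = snprintf(buf, buf_sz, "%s%d", first ? "" : ",", -(i + 1) * 8);
			first = 0;
			buf += n;
			buf_sz -= n;
			if (buf_sz < 0)
				break;
		}
	}

	int main(void)
	{
		char buf[320];

		fmt_stack_mask(buf, sizeof(buf), ~0ull); /* all 64 slots set */
		printf("%zu: %s\n", strlen(buf), buf);   /* prints "306: -8,-16,...,-512" */
		return 0;
	}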
"" : ",", -(i + 1) * 8); + first = false; + buf += n; + buf_sz -= n; + if (buf_sz < 0) + break; + } +} + /* For given verifier state backtrack_insn() is called from the last insn to * the first insn. Its purpose is to compute a bitmask of registers and * stack slots that needs precision in the parent verifier state. @@ -3298,7 +3337,11 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, if (insn->code == 0) return 0; if (env->log.level & BPF_LOG_LEVEL2) { - verbose(env, "regs=%x stack=%llx before ", bt_reg_mask(bt), bt_stack_mask(bt)); + fmt_reg_mask(env->tmp_str_buf, TMP_STR_BUF_LEN, bt_reg_mask(bt)); + verbose(env, "mark_precise: frame%d: regs(0x%x)=%s ", + bt->frame, bt_reg_mask(bt), env->tmp_str_buf); + fmt_stack_mask(env->tmp_str_buf, TMP_STR_BUF_LEN, bt_stack_mask(bt)); + verbose(env, "stack(0x%llx)=%s before ", bt_stack_mask(bt), env->tmp_str_buf); verbose(env, "%d: ", idx); print_bpf_insn(&cbs, insn, env->allow_ptr_leaks); } @@ -3498,6 +3541,11 @@ static void mark_all_scalars_precise(struct bpf_verifier_env *env, struct bpf_reg_state *reg; int i, j; + if (env->log.level & BPF_LOG_LEVEL2) { + verbose(env, "mark_precise: frame%d: falling back to forcing all scalars precise\n", + st->curframe); + } + /* big hammer: mark all scalars precise in this path. * pop_stack may still get !precise scalars. * We also skip current state and go straight to first parent state, @@ -3509,17 +3557,25 @@ static void mark_all_scalars_precise(struct bpf_verifier_env *env, func = st->frame[i]; for (j = 0; j < BPF_REG_FP; j++) { reg = &func->regs[j]; - if (reg->type != SCALAR_VALUE) + if (reg->type != SCALAR_VALUE || reg->precise) continue; reg->precise = true; + if (env->log.level & BPF_LOG_LEVEL2) { + verbose(env, "force_precise: frame%d: forcing r%d to be precise\n", + i, j); + } } for (j = 0; j < func->allocated_stack / BPF_REG_SIZE; j++) { if (!is_spilled_reg(&func->stack[j])) continue; reg = &func->stack[j].spilled_ptr; - if (reg->type != SCALAR_VALUE) + if (reg->type != SCALAR_VALUE || reg->precise) continue; reg->precise = true; + if (env->log.level & BPF_LOG_LEVEL2) { + verbose(env, "force_precise: frame%d: forcing fp%d to be precise\n", + i, -(j + 1) * 8); + } } } } @@ -3683,8 +3739,10 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r DECLARE_BITMAP(mask, 64); u32 history = st->jmp_history_cnt; - if (env->log.level & BPF_LOG_LEVEL2) - verbose(env, "last_idx %d first_idx %d\n", last_idx, first_idx); + if (env->log.level & BPF_LOG_LEVEL2) { + verbose(env, "mark_precise: frame%d: last_idx %d first_idx %d\n", + bt->frame, last_idx, first_idx); + } if (last_idx < 0) { /* we are at the entry into subprog, which diff --git a/tools/testing/selftests/bpf/verifier/precise.c b/tools/testing/selftests/bpf/verifier/precise.c index 6c03a7d805f9..fce831667b06 100644 --- a/tools/testing/selftests/bpf/verifier/precise.c +++ b/tools/testing/selftests/bpf/verifier/precise.c @@ -38,25 +38,24 @@ .fixup_map_array_48b = { 1 }, .result = VERBOSE_ACCEPT, .errstr = - "26: (85) call bpf_probe_read_kernel#113\ - last_idx 26 first_idx 20\ - regs=4 stack=0 before 25\ - regs=4 stack=0 before 24\ - regs=4 stack=0 before 23\ - regs=4 stack=0 before 22\ - regs=4 stack=0 before 20\ - parent didn't have regs=4 stack=0 marks\ - last_idx 19 first_idx 10\ - regs=4 stack=0 before 19\ - regs=200 stack=0 before 18\ - regs=300 stack=0 before 17\ - regs=201 stack=0 before 15\ - regs=201 stack=0 before 14\ - regs=200 stack=0 before 13\ - regs=200 stack=0 before 12\ - regs=200 stack=0 before 
11\ - regs=200 stack=0 before 10\ - parent already had regs=0 stack=0 marks", + "mark_precise: frame0: last_idx 26 first_idx 20\ + mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 25\ + mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 24\ + mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 23\ + mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 22\ + mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 20\ + parent didn't have regs=4 stack=0 marks:\ + mark_precise: frame0: last_idx 19 first_idx 10\ + mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 19\ + mark_precise: frame0: regs(0x200)=r9 stack(0x0)= before 18\ + mark_precise: frame0: regs(0x300)=r8,r9 stack(0x0)= before 17\ + mark_precise: frame0: regs(0x201)=r0,r9 stack(0x0)= before 15\ + mark_precise: frame0: regs(0x201)=r0,r9 stack(0x0)= before 14\ + mark_precise: frame0: regs(0x200)=r9 stack(0x0)= before 13\ + mark_precise: frame0: regs(0x200)=r9 stack(0x0)= before 12\ + mark_precise: frame0: regs(0x200)=r9 stack(0x0)= before 11\ + mark_precise: frame0: regs(0x200)=r9 stack(0x0)= before 10\ + parent already had regs=0 stack=0 marks:", }, { "precise: test 2", @@ -100,20 +99,20 @@ .flags = BPF_F_TEST_STATE_FREQ, .errstr = "26: (85) call bpf_probe_read_kernel#113\ - last_idx 26 first_idx 22\ - regs=4 stack=0 before 25\ - regs=4 stack=0 before 24\ - regs=4 stack=0 before 23\ - regs=4 stack=0 before 22\ - parent didn't have regs=4 stack=0 marks\ - last_idx 20 first_idx 20\ - regs=4 stack=0 before 20\ - parent didn't have regs=4 stack=0 marks\ - last_idx 19 first_idx 17\ - regs=4 stack=0 before 19\ - regs=200 stack=0 before 18\ - regs=300 stack=0 before 17\ - parent already had regs=0 stack=0 marks", + mark_precise: frame0: last_idx 26 first_idx 22\ + mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 25\ + mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 24\ + mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 23\ + mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 22\ + parent didn't have regs=4 stack=0 marks:\ + mark_precise: frame0: last_idx 20 first_idx 20\ + mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 20\ + parent didn't have regs=4 stack=0 marks:\ + mark_precise: frame0: last_idx 19 first_idx 17\ + mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 19\ + mark_precise: frame0: regs(0x200)=r9 stack(0x0)= before 18\ + mark_precise: frame0: regs(0x300)=r8,r9 stack(0x0)= before 17\ + parent already had regs=0 stack=0 marks:", }, { "precise: cross frame pruning", @@ -153,15 +152,15 @@ }, .prog_type = BPF_PROG_TYPE_XDP, .flags = BPF_F_TEST_STATE_FREQ, - .errstr = "5: (2d) if r4 > r0 goto pc+0\ - last_idx 5 first_idx 5\ - parent didn't have regs=10 stack=0 marks\ - last_idx 4 first_idx 2\ - regs=10 stack=0 before 4\ - regs=10 stack=0 before 3\ - regs=0 stack=1 before 2\ - last_idx 5 first_idx 5\ - parent didn't have regs=1 stack=0 marks", + .errstr = "mark_precise: frame0: last_idx 5 first_idx 5\ + parent didn't have regs=10 stack=0 marks:\ + mark_precise: frame0: last_idx 4 first_idx 2\ + mark_precise: frame0: regs(0x10)=r4 stack(0x0)= before 4\ + mark_precise: frame0: regs(0x10)=r4 stack(0x0)= before 3\ + mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 2\ + mark_precise: frame0: falling back to forcing all scalars precise\ + mark_precise: frame0: last_idx 5 first_idx 5\ + parent didn't have regs=1 stack=0 marks:", .result = VERBOSE_ACCEPT, .retval = -1, }, @@ -179,16 +178,19 @@ }, .prog_type = BPF_PROG_TYPE_XDP, .flags = BPF_F_TEST_STATE_FREQ, - .errstr = "last_idx 6 first_idx 6\ 
- parent didn't have regs=10 stack=0 marks\ - last_idx 5 first_idx 3\ - regs=10 stack=0 before 5\ - regs=10 stack=0 before 4\ - regs=0 stack=1 before 3\ - last_idx 6 first_idx 6\ - parent didn't have regs=1 stack=0 marks\ - last_idx 5 first_idx 3\ - regs=1 stack=0 before 5", + .errstr = "mark_precise: frame0: last_idx 6 first_idx 6\ + parent didn't have regs=10 stack=0 marks:\ + mark_precise: frame0: last_idx 5 first_idx 3\ + mark_precise: frame0: regs(0x10)=r4 stack(0x0)= before 5\ + mark_precise: frame0: regs(0x10)=r4 stack(0x0)= before 4\ + mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 3\ + mark_precise: frame0: falling back to forcing all scalars precise\ + force_precise: frame0: forcing r0 to be precise\ + force_precise: frame0: forcing r0 to be precise\ + mark_precise: frame0: last_idx 6 first_idx 6\ + parent didn't have regs=1 stack=0 marks:\ + mark_precise: frame0: last_idx 5 first_idx 3\ + mark_precise: frame0: regs(0x1)=r0 stack(0x0)= before 5", .result = VERBOSE_ACCEPT, .retval = -1, }, From patchwork Tue Apr 25 23:49:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13223908 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CBA26C77B61 for ; Tue, 25 Apr 2023 23:49:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237034AbjDYXtl convert rfc822-to-8bit (ORCPT ); Tue, 25 Apr 2023 19:49:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60442 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237037AbjDYXtj (ORCPT ); Tue, 25 Apr 2023 19:49:39 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 004F213C3B for ; Tue, 25 Apr 2023 16:49:37 -0700 (PDT) Received: from pps.filterd (m0089730.ppops.net [127.0.0.1]) by m0089730.ppops.net (8.17.1.19/8.17.1.19) with ESMTP id 33PLEAP0029783 for ; Tue, 25 Apr 2023 16:49:37 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by m0089730.ppops.net (PPS) with ESMTPS id 3q6mgthrqq-4 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 25 Apr 2023 16:49:37 -0700 Received: from twshared6687.46.prn1.facebook.com (2620:10d:c085:208::11) by mail.thefacebook.com (2620:10d:c085:11d::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Tue, 25 Apr 2023 16:49:32 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 084982F2D83B2; Tue, 25 Apr 2023 16:49:25 -0700 (PDT) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 05/10] bpf: maintain bitmasks across all active frames in __mark_chain_precision Date: Tue, 25 Apr 2023 16:49:06 -0700 Message-ID: <20230425234911.2113352-6-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230425234911.2113352-1-andrii@kernel.org> References: <20230425234911.2113352-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: bbKAVPc_kgwNGxyg8WiMmR6WQI4NFF_k X-Proofpoint-ORIG-GUID: bbKAVPc_kgwNGxyg8WiMmR6WQI4NFF_k X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 
Teach __mark_chain_precision logic to maintain register/stack masks
across all active frames when going from child state to parent state.
Currently this should be mostly a no-op, as precision backtracking
usually bails out when encountering subprog entry/exit.

It's not very apparent from the diff due to increased indentation, but
the logic remains the same, except that everything is done on a
specific `fr` frame index. Calls to bt_clear_reg() and bt_clear_slot()
are replaced with the frame-specific bt_clear_frame_reg() and
bt_clear_frame_slot(), where the frame index is passed explicitly
instead of using the current frame number.

We also adjust logging to emit the affected frame number, and add
better logging of human-readable register and stack slot masks, similar
to the previous patch.
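As an illustration of the state involved, consider a hypothetical
two-frame backtrack where frame #0 needs r6 and fp-8 and frame #1 needs
r1. With struct backtrack_state from the earlier patch, the masks would
look like this (a sketch with made-up values, env pointer omitted):

	struct backtrack_state bt = {
		.frame       = 1,                    /* deepest active frame */
		.bitcnt      = 3,                    /* total bits set below */
		.reg_masks   = { 1u << 6, 1u << 1 }, /* frame #0: r6; frame #1: r1 */
		.stack_masks = { 1ull << 0, 0 },     /* frame #0: fp-8 */
	};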
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/verifier.c                         | 101 ++++++++++--------
 .../testing/selftests/bpf/verifier/precise.c  |  18 ++--
 2 files changed, 63 insertions(+), 56 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8faf9170acf0..0b19b3d9af65 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3703,7 +3703,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno,
 	struct bpf_func_state *func;
 	struct bpf_reg_state *reg;
 	bool skip_first = true;
-	int i, err;
+	int i, fr, err;
 
 	if (!env->bpf_capable)
 		return 0;
@@ -3812,56 +3812,63 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno,
 		if (!st)
 			break;
 
-		func = st->frame[frame];
-		bitmap_from_u64(mask, bt_reg_mask(bt));
-		for_each_set_bit(i, mask, 32) {
-			reg = &func->regs[i];
-			if (reg->type != SCALAR_VALUE) {
-				bt_clear_reg(bt, i);
-				continue;
+		for (fr = bt->frame; fr >= 0; fr--) {
+			func = st->frame[fr];
+			bitmap_from_u64(mask, bt_frame_reg_mask(bt, fr));
+			for_each_set_bit(i, mask, 32) {
+				reg = &func->regs[i];
+				if (reg->type != SCALAR_VALUE) {
+					bt_clear_frame_reg(bt, fr, i);
+					continue;
+				}
+				if (reg->precise)
+					bt_clear_frame_reg(bt, fr, i);
+				else
+					reg->precise = true;
 			}
-			if (reg->precise)
-				bt_clear_reg(bt, i);
-			else
-				reg->precise = true;
-		}
 
-		bitmap_from_u64(mask, bt_stack_mask(bt));
-		for_each_set_bit(i, mask, 64) {
-			if (i >= func->allocated_stack / BPF_REG_SIZE) {
-				/* the sequence of instructions:
-				 * 2: (bf) r3 = r10
-				 * 3: (7b) *(u64 *)(r3 -8) = r0
-				 * 4: (79) r4 = *(u64 *)(r10 -8)
-				 * doesn't contain jmps. It's backtracked
-				 * as a single block.
-				 * During backtracking insn 3 is not recognized as
-				 * stack access, so at the end of backtracking
-				 * stack slot fp-8 is still marked in stack_mask.
-				 * However the parent state may not have accessed
-				 * fp-8 and it's "unallocated" stack space.
-				 * In such case fallback to conservative.
-				 */
-				mark_all_scalars_precise(env, st);
-				bt_reset(bt);
-				return 0;
-			}
+			bitmap_from_u64(mask, bt_frame_stack_mask(bt, fr));
+			for_each_set_bit(i, mask, 64) {
+				if (i >= func->allocated_stack / BPF_REG_SIZE) {
+					/* the sequence of instructions:
+					 * 2: (bf) r3 = r10
+					 * 3: (7b) *(u64 *)(r3 -8) = r0
+					 * 4: (79) r4 = *(u64 *)(r10 -8)
+					 * doesn't contain jmps. It's backtracked
+					 * as a single block.
+					 * During backtracking insn 3 is not recognized as
+					 * stack access, so at the end of backtracking
+					 * stack slot fp-8 is still marked in stack_mask.
+					 * However the parent state may not have accessed
+					 * fp-8 and it's "unallocated" stack space.
+					 * In such case fallback to conservative.
+					 */
+					mark_all_scalars_precise(env, st);
+					bt_reset(bt);
+					return 0;
+				}
 
-			if (!is_spilled_scalar_reg(&func->stack[i])) {
-				bt_clear_slot(bt, i);
-				continue;
+				if (!is_spilled_scalar_reg(&func->stack[i])) {
+					bt_clear_frame_slot(bt, fr, i);
+					continue;
+				}
+				reg = &func->stack[i].spilled_ptr;
+				if (reg->precise)
+					bt_clear_frame_slot(bt, fr, i);
+				else
+					reg->precise = true;
+			}
+			if (env->log.level & BPF_LOG_LEVEL2) {
+				fmt_reg_mask(env->tmp_str_buf, TMP_STR_BUF_LEN,
+					     bt_frame_reg_mask(bt, fr));
+				verbose(env, "mark_precise: frame%d: parent state regs(0x%x)=%s ",
+					fr, bt_frame_reg_mask(bt, fr), env->tmp_str_buf);
+				fmt_stack_mask(env->tmp_str_buf, TMP_STR_BUF_LEN,
+					       bt_frame_stack_mask(bt, fr));
+				verbose(env, "stack(0x%llx)=%s: ",
+					bt_frame_stack_mask(bt, fr), env->tmp_str_buf);
+				print_verifier_state(env, func, true);
 			}
-			reg = &func->stack[i].spilled_ptr;
-			if (reg->precise)
-				bt_clear_slot(bt, i);
-			else
-				reg->precise = true;
-		}
-		if (env->log.level & BPF_LOG_LEVEL2) {
-			verbose(env, "parent %s regs=%x stack=%llx marks:",
-				bt_bitcnt(bt) ? "didn't have" : "already had",
-				bt_reg_mask(bt), bt_stack_mask(bt));
-			print_verifier_state(env, func, true);
 		}
 
 		if (bt_bitcnt(bt) == 0)
diff --git a/tools/testing/selftests/bpf/verifier/precise.c b/tools/testing/selftests/bpf/verifier/precise.c
index fce831667b06..ac9be4c576d6 100644
--- a/tools/testing/selftests/bpf/verifier/precise.c
+++ b/tools/testing/selftests/bpf/verifier/precise.c
@@ -44,7 +44,7 @@
 	mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 23\
 	mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 22\
 	mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 20\
-	parent didn't have regs=4 stack=0 marks:\
+	mark_precise: frame0: parent state regs(0x4)=r2 stack(0x0)=:\
 	mark_precise: frame0: last_idx 19 first_idx 10\
 	mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 19\
 	mark_precise: frame0: regs(0x200)=r9 stack(0x0)= before 18\
@@ -55,7 +55,7 @@
 	mark_precise: frame0: regs(0x200)=r9 stack(0x0)= before 12\
 	mark_precise: frame0: regs(0x200)=r9 stack(0x0)= before 11\
 	mark_precise: frame0: regs(0x200)=r9 stack(0x0)= before 10\
-	parent already had regs=0 stack=0 marks:",
+	mark_precise: frame0: parent state regs(0x0)= stack(0x0)=:",
 },
 {
 	"precise: test 2",
@@ -104,15 +104,15 @@
 	mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 24\
 	mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 23\
 	mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 22\
-	parent didn't have regs=4 stack=0 marks:\
+	mark_precise: frame0: parent state regs(0x4)=r2 stack(0x0)=:\
 	mark_precise: frame0: last_idx 20 first_idx 20\
 	mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 20\
-	parent didn't have regs=4 stack=0 marks:\
+	mark_precise: frame0: parent state regs(0x4)=r2 stack(0x0)=:\
 	mark_precise: frame0: last_idx 19 first_idx 17\
 	mark_precise: frame0: regs(0x4)=r2 stack(0x0)= before 19\
 	mark_precise: frame0: regs(0x200)=r9 stack(0x0)= before 18\
 	mark_precise: frame0: regs(0x300)=r8,r9 stack(0x0)= before 17\
-	parent already had regs=0 stack=0 marks:",
+	mark_precise: frame0: parent state regs(0x0)= stack(0x0)=:",
 },
 {
 	"precise: cross frame pruning",
@@ -153,14 +153,14 @@
 	.prog_type = BPF_PROG_TYPE_XDP,
 	.flags = BPF_F_TEST_STATE_FREQ,
 	.errstr = "mark_precise: frame0: last_idx 5 first_idx 5\
-	parent didn't have regs=10 stack=0 marks:\
+	mark_precise: frame0: parent state regs(0x10)=r4 stack(0x0)=:\
 	mark_precise: frame0: last_idx 4 first_idx 2\
 	mark_precise: frame0: regs(0x10)=r4 stack(0x0)= before 4\
 	mark_precise: frame0: regs(0x10)=r4 stack(0x0)= before 3\
 	mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 2\
 	mark_precise: frame0: falling back to forcing all scalars precise\
 	mark_precise: frame0: last_idx 5 first_idx 5\
-	parent didn't have regs=1 stack=0 marks:",
+	mark_precise: frame0: parent state regs(0x1)=r0 stack(0x0)=:",
 	.result = VERBOSE_ACCEPT,
 	.retval = -1,
 },
@@ -179,7 +179,7 @@
 	.prog_type = BPF_PROG_TYPE_XDP,
 	.flags = BPF_F_TEST_STATE_FREQ,
 	.errstr = "mark_precise: frame0: last_idx 6 first_idx 6\
-	parent didn't have regs=10 stack=0 marks:\
+	mark_precise: frame0: parent state regs(0x10)=r4 stack(0x0)=:\
 	mark_precise: frame0: last_idx 5 first_idx 3\
 	mark_precise: frame0: regs(0x10)=r4 stack(0x0)= before 5\
 	mark_precise: frame0: regs(0x10)=r4 stack(0x0)= before 4\
@@ -188,7 +188,7 @@
 	force_precise: frame0: forcing r0 to be precise\
 	force_precise: frame0: forcing r0 to be precise\
 	mark_precise: frame0: last_idx 6 first_idx 6\
-	parent didn't have regs=1 stack=0 marks:\
+	mark_precise: frame0: parent state regs(0x1)=r0 stack(0x0)=:\
 	mark_precise: frame0: last_idx 5 first_idx 3\
 	mark_precise: frame0: regs(0x1)=r0 stack(0x0)= before 5",
 	.result = VERBOSE_ACCEPT,

From patchwork Tue Apr 25 23:49:07 2023
From: Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH bpf-next 06/10] bpf: fix propagate_precision() logic for inner frames
Date: Tue, 25 Apr 2023 16:49:07 -0700
Message-ID: <20230425234911.2113352-7-andrii@kernel.org>
In-Reply-To: <20230425234911.2113352-1-andrii@kernel.org>
Fix propagate_precision() logic to perform propagation of all necessary
registers and stack slots across all active frames *in one batch step*.

Doing this for each register/slot in each individual frame is wasteful,
but the main problem is that backtracking of an instruction in any
frame except the deepest one just doesn't work. This is due to the
backtracking logic relying on jump history, and the available jump
history always starts (or ends, depending on how you view it) in the
current frame. So, if prog A (frame #0) called subprog B (frame #1) and
we need to propagate precision of, say, register R6 (callee-saved)
within frame #0, we actually don't even know where the jump history
that corresponds to prog A starts. We'd need to skip the subprog part
of the jump history first to be able to do this.

Luckily, with struct backtrack_state and __mark_chain_precision()
handling bitmask tracking/propagation across all active frames at the
same time (added in the previous patch), propagate_precision() can be
both fixed and sped up by setting all the necessary bits across all
frames and then performing one __mark_chain_precision() pass. This
makes it unnecessary to skip subprog parts of jump history.

We also improve logging along the way, to clearly specify which
registers' and slots' precision markings are propagated within which
frame.
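A hypothetical instruction sequence illustrating the problem (made up
for illustration; r6 is callee-saved, so frame #0's value survives the
subprog call):

	/* frame #0 (prog A) */
	0: r6 = 42
	1: call pc+3		/* enter subprog B, frame #1 */
	2: r1 = r6		/* back in frame #0 */
	3: if r1 > 7 goto pc+1	/* forces precision on r6 */
	4: exit
	/* subprog B (frame #1) */
	5: r0 = 0
	6: exit

If precision of frame #0's r6 has to be propagated while jump history
begins inside subprog B, per-register backtracking can't reach insn 0;
batching all frames' masks into one __mark_chain_precision() pass
avoids having to skip the subprog portion of the history.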
",r%d", i); + } + bt_set_frame_reg(&env->bt, fr, i); + first = false; } for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) { @@ -15332,14 +15335,24 @@ static int propagate_precision(struct bpf_verifier_env *env, !state_reg->precise || !(state_reg->live & REG_LIVE_READ)) continue; - if (env->log.level & BPF_LOG_LEVEL2) - verbose(env, "frame %d: propagating fp%d\n", - fr, (-i - 1) * BPF_REG_SIZE); - err = mark_chain_precision_stack_frame(env, fr, i); - if (err < 0) - return err; + if (env->log.level & BPF_LOG_LEVEL2) { + if (first) + verbose(env, "frame %d: propagating fp%d", + fr, (-i - 1) * BPF_REG_SIZE); + else + verbose(env, ",fp%d", (-i - 1) * BPF_REG_SIZE); + } + bt_set_frame_slot(&env->bt, fr, i); + first = false; } + if (!first) + verbose(env, "\n"); } + + err = mark_chain_precision_batch(env, old->curframe); + if (err < 0) + return err; + return 0; } From patchwork Tue Apr 25 23:49:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13223907 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 79476C77B61 for ; Tue, 25 Apr 2023 23:49:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237020AbjDYXtg convert rfc822-to-8bit (ORCPT ); Tue, 25 Apr 2023 19:49:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60382 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237025AbjDYXte (ORCPT ); Tue, 25 Apr 2023 19:49:34 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 41409B234 for ; Tue, 25 Apr 2023 16:49:33 -0700 (PDT) Received: from pps.filterd (m0044012.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 33PLEOsA012524 for ; Tue, 25 Apr 2023 16:49:33 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3q687r6ku5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 25 Apr 2023 16:49:32 -0700 Received: from twshared6687.46.prn1.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:11d::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Tue, 25 Apr 2023 16:49:32 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 20F7F2F2D842F; Tue, 25 Apr 2023 16:49:29 -0700 (PDT) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 07/10] bpf: fix mark_all_scalars_precise use in mark_chain_precision Date: Tue, 25 Apr 2023 16:49:08 -0700 Message-ID: <20230425234911.2113352-8-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230425234911.2113352-1-andrii@kernel.org> References: <20230425234911.2113352-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: 2nni14xNUU2Lmc6Lp--X2cdQ2XCjb-A9 X-Proofpoint-GUID: 2nni14xNUU2Lmc6Lp--X2cdQ2XCjb-A9 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-04-25_10,2023-04-25_01,2023-02-09_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net 
When precision backtracking bails out due to some unsupported sequence of instructions (e.g., stack access through a register other than r10), we need to mark all SCALAR registers as precise to be safe. Currently, though, we mark SCALARs precise only starting from the state in which we detected the unsupported condition, which could be one of the parent states of the actual current state. This will leave some registers potentially not marked as precise, even though they should be. So make sure we start marking scalars as precise from the current state (env->cur_state).

Further, we don't currently detect a situation where we end up with some stack slots marked as needing precision, but we ran out of available states to find the instructions that populate those stack slots. This is akin to the `i >= func->allocated_stack / BPF_REG_SIZE` check and should be handled similarly by falling back to marking all SCALARs precise. Add this check when we run out of states.

Signed-off-by: Andrii Nakryiko --- kernel/bpf/verifier.c | 16 +++++++++++++--- tools/testing/selftests/bpf/verifier/precise.c | 9 +++++---- 2 files changed, 18 insertions(+), 7 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 66d64ac10fb1..35f34c977819 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3781,7 +3781,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r err = backtrack_insn(env, i, bt); } if (err == -ENOTSUPP) { - mark_all_scalars_precise(env, st); + mark_all_scalars_precise(env, env->cur_state); bt_reset(bt); return 0; } else if (err) { @@ -3843,7 +3843,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r * fp-8 and it's "unallocated" stack space. * In such case fallback to conservative. */ - mark_all_scalars_precise(env, st); + mark_all_scalars_precise(env, env->cur_state); bt_reset(bt); return 0; } @@ -3872,11 +3872,21 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r } if (bt_bitcnt(bt) == 0) - break; + return 0; last_idx = st->last_insn_idx; first_idx = st->first_insn_idx; } + + /* if we still have requested precise regs or slots, we missed + * something (e.g., stack access through non-r10 register), so + * fallback to marking all precise + */ + if (bt_bitcnt(bt) != 0) { + mark_all_scalars_precise(env, env->cur_state); + bt_reset(bt); + } + return 0; } diff --git a/tools/testing/selftests/bpf/verifier/precise.c b/tools/testing/selftests/bpf/verifier/precise.c index ac9be4c576d6..f4f65cb9f9b1 100644 --- a/tools/testing/selftests/bpf/verifier/precise.c +++ b/tools/testing/selftests/bpf/verifier/precise.c @@ -159,8 +159,9 @@ mark_precise: frame0: regs(0x10)=r4 stack(0x0)= before 3\ mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 2\ mark_precise: frame0: falling back to forcing all scalars precise\ + force_precise: frame0: forcing r0 to be precise\ mark_precise: frame0: last_idx 5 first_idx 5\ - mark_precise: frame0: parent state regs(0x1)=r0 stack(0x0)=:", + mark_precise: frame0: parent state regs(0x0)= stack(0x0)=:", .result = VERBOSE_ACCEPT, .retval = -1, }, @@ -187,10 +188,10 @@ mark_precise: frame0: falling back to forcing all scalars precise\ force_precise: frame0: forcing r0 to be precise\ force_precise: frame0: forcing r0 to be precise\ + force_precise: frame0: forcing r0 to be precise\ + force_precise: frame0: forcing r0 to be precise\ mark_precise: frame0: last_idx 6 first_idx 6\ - mark_precise: frame0: parent state regs(0x1)=r0 stack(0x0)=:\ - mark_precise: frame0: last_idx 5 first_idx 
3\ - mark_precise: frame0: regs(0x1)=r0 stack(0x0)= before 5", + mark_precise: frame0: parent state regs(0x0)= stack(0x0)=:", .result = VERBOSE_ACCEPT, .retval = -1, }, From patchwork Tue Apr 25 23:49:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13223910 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18461C77B61 for ; Tue, 25 Apr 2023 23:49:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237040AbjDYXtp convert rfc822-to-8bit (ORCPT ); Tue, 25 Apr 2023 19:49:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60520 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236633AbjDYXto (ORCPT ); Tue, 25 Apr 2023 19:49:44 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7596D13FAF for ; Tue, 25 Apr 2023 16:49:41 -0700 (PDT) Received: from pps.filterd (m0109334.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 33PLEw8K008763 for ; Tue, 25 Apr 2023 16:49:41 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3q66vn78f3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 25 Apr 2023 16:49:40 -0700 Received: from twshared24695.38.frc1.facebook.com (2620:10d:c085:208::f) by mail.thefacebook.com (2620:10d:c085:11d::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Tue, 25 Apr 2023 16:49:39 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 2F61B2F2D8470; Tue, 25 Apr 2023 16:49:31 -0700 (PDT) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 08/10] bpf: support precision propagation in the presence of subprogs Date: Tue, 25 Apr 2023 16:49:09 -0700 Message-ID: <20230425234911.2113352-9-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230425234911.2113352-1-andrii@kernel.org> References: <20230425234911.2113352-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: oVV_LVHltjXBZSg4Dmym47SCZzbIofoI X-Proofpoint-GUID: oVV_LVHltjXBZSg4Dmym47SCZzbIofoI X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-04-25_10,2023-04-25_01,2023-02-09_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net

Add support for precision backtracking in the presence of subprogram frames in jump history. This means supporting a few different kinds of subprogram invocation situations, all requiring slightly different handling in the precision backtracking logic:

- static subprogram calls;
- global subprogram calls;
- callback-calling helpers/kfuncs.

For each of those we need to handle a few precision propagation cases:

- what to do with precision of subprog returns (r0);
- what to do with precision of input arguments;
- for all of them, callee-saved registers in the caller function should be propagated ignoring the subprog/callback part of jump history.

N.B. 
Async callback-calling helpers (currently only bpf_timer_set_callback()) are transparent to all this because they set up a separate async callback environment, and thus the callback's history is not shared with the main program's history. So as far as all the changes in this commit go, such a helper is just a regular helper.

Let's look at all these situations in more detail. Let's start with a static subprogram being called, using an excerpt of a simple main program and its static subprog, indenting the subprog's frame slightly to make everything clear.

frame 0                          frame 1                precision set
=======                          =======                =============
 9: r6 = 456;
10: r1 = 123;                                           r6
11: call pc+10;                                         r1, r6
            22: r0 = r1;                                r1
            23: exit                                    r0
12: r1 = <map_pointer>                                  r0, r6
13: r1 += r0;                                           r0, r6
14: r1 += r6;                                           r6
15: exit

As can be seen above, the main function is passing 123 as a single argument to an identity (`return x;`) subprog. The returned value is used to adjust a map pointer offset, which forces r0 to be marked as precise. Then instruction #14 does the same for callee-saved r6, which will have to be backtracked all the way to instruction #9. For brevity, precision sets for instructions #13 and #14 are combined in the diagram above.

First, for subprog calls, r0 returned from the subprog (in frame 0) has to go into the subprog's frame 1, and should be cleared from frame 0. So we go back into the subprog's frame knowing we need to mark r0 precise. We then see that insn #22 sets r0 from r1, so now we care about marking r1 precise. When we pop up from the subprog's frame back into the caller at insn #11 we keep r1, as it's an argument-passing register, so we eventually find `10: r1 = 123;` and satisfy the precision propagation chain for insn #13.

This example demonstrates two sets of rules:
- r0 returned after a subprog call has to be moved into the subprog's r0 set;
- *static* subprog arguments (r1-r5) are moved back to the caller's precision set.

Let's look at what happens with callee-saved precision propagation. Insn #14 marks r6 as precise. When we get into the subprog's frame, we keep r6 in frame 0's precision set *only*. The subprog itself has its own set of independent r6-r10 registers and is not affected. When we eventually make our way out of the subprog's frame we keep r6 in the precision set until we reach `9: r6 = 456;`, satisfying propagation. r6-r10 propagation is perhaps the simplest aspect: it always stays in its original frame.

That's pretty much all we have to do to support precision propagation across a *static subprog* invocation.

Let's look at what happens when we have a global subprog invocation.

frame 0                          frame 1                precision set
=======                          =======                =============
 9: r6 = 456;
10: r1 = 123;                                           r6
11: call pc+10; # global subprog                        r6
12: r1 = <map_pointer>                                  r0, r6
13: r1 += r0;                                           r0, r6
14: r1 += r6;                                           r6
15: exit

Starting from insn #13, r0 has to be precise. We backtrack all the way to insn #11 (call pc+10) and see that the subprog is global, so it was already validated in isolation. As opposed to a static subprog, a global subprog always returns an unknown scalar in r0, so that satisfies precision propagation and we drop r0 from the precision set. We are done for insn #13.

Now for insn #14. r6 is in the precision set; we backtrack to `call pc+10;`. Here we need to recognize that this is effectively both an exit from and an entry to the global subprog, which means we stay in the caller's frame. So we carry on with r6 still in the precision set, until we satisfy it at insn #9. The only hard part with global subprogs is just knowing when it's a global func. 
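As a concrete reference for that last point, the check itself is a one-liner over BTF func info; this is the subprog_is_global() helper exactly as the diff further down in this patch adds it (repeated here only to make the prose above easier to follow):

	static bool subprog_is_global(const struct bpf_verifier_env *env, int subprog)
	{
		struct bpf_func_info_aux *aux = env->prog->aux->func_info_aux;

		/* global subprogs are marked with BTF_FUNC_GLOBAL linkage */
		return aux && aux[subprog].linkage == BTF_FUNC_GLOBAL;
	}

backtrack_insn() consults this at each `call pc+N` instruction to pick between the global rule (drop r0, no r1-r5 expected) and the static rule (move r1-r5 back to the caller's precision set).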
Lastly, callback-calling helpers and kfuncs do simulate subprog calls, so jump history will have subprog instructions in between the caller program's instructions, but the rules for propagating r0 and r1-r5 differ, because we don't actually directly call the callback. We actually call a helper/kfunc, which at runtime will call the subprog, so the only difference from normal helper/kfunc handling is that we need to make sure to skip the callback-simulating part of jump history.

Let's look at an example to make this clearer.

frame 0                          frame 1                precision set
=======                          =======                =============
 8: r6 = 456;
 9: r1 = 123;                                           r6
10: r2 = &callback;                                     r6
11: call bpf_loop;                                      r6
            22: r0 = r1;
            23: exit
12: r1 = <map_pointer>                                  r0, r6
13: r1 += r0;                                           r0, r6
14: r1 += r6;                                           r6
15: exit

Again, insn #13 forces r0 to be precise. As soon as we get to `23: exit` we see that this isn't actually a static subprog call (it's a `call bpf_loop;` helper call instead). So we clear r0 from the precision set. For the callee-saved register, there is no difference: it stays in frame 0's precision set, we go through insns #22 and #23, ignoring them until we get back to caller frame 0, eventually satisfying precision backtrack logic at insn #8 (`r6 = 456;`).

Assuming the callback needed to set r0 as precise at insn #23, we'd backtrack to insn #22, switching from r0 to r1, and then at the point when we pop back to frame 0 at insn #11, we'd clear r1-r5 from the precision set, as we don't really do a subprog call directly, so there is no input argument precision propagation.

That's pretty much it. With these changes, it seems like the only still unsupported situation for precision backpropagation is the case when a program is accessing the stack through registers other than r10. This is still left as an unsupported (though rare) case for now.

As for results. For selftests, there are a few positive changes for bigger programs, with cls_redirect in its dynptr variant benefitting the most:

[vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results.csv ~/subprog-precise-after-results.csv -f @veristat.cfg -e file,prog,insns -f 'insns_diff!=0'
File                                     Program       Insns (A) Insns (B) Insns    (DIFF)
---------------------------------------- ------------- --------- --------- ----------------
pyperf600_bpf_loop.bpf.linked1.o         on_event           2060      2002      -58 (-2.82%)
test_cls_redirect_dynptr.bpf.linked1.o   cls_redirect      15660      2914  -12746 (-81.39%)
test_cls_redirect_subprogs.bpf.linked1.o cls_redirect      61620     59088   -2532 (-4.11%)
xdp_synproxy_kern.bpf.linked1.o          syncookie_tc     109980     86278  -23702 (-21.55%)
xdp_synproxy_kern.bpf.linked1.o          syncookie_xdp     97716     85147  -12569 (-12.86%)

Cilium programs don't really regress. They don't use subprogs and are mostly unaffected, but some other fixes and improvements could have changed something. 
That doesn't appear to be the case, though:

[vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-cilium.csv ~/subprog-precise-after-results-cilium.csv -e file,prog,insns -f 'insns_diff!=0'
File          Program                        Insns (A) Insns (B) Insns (DIFF)
------------- ------------------------------ --------- --------- ------------
bpf_host.o    tail_nodeport_nat_ingress_ipv6      4983      5003 +20 (+0.40%)
bpf_lxc.o     tail_nodeport_nat_ingress_ipv6      4983      5003 +20 (+0.40%)
bpf_overlay.o tail_nodeport_nat_ingress_ipv6      4983      5003 +20 (+0.40%)
bpf_xdp.o     tail_handle_nat_fwd_ipv6           12475     12504 +29 (+0.23%)
bpf_xdp.o     tail_nodeport_nat_ingress_ipv6      6363      6371  +8 (+0.13%)

Looking at (somewhat anonymized) Meta production programs, we see mostly insignificant variation in the number of instructions, with one program (syar_bind6_protect6) benefitting the most at -17%.

[vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-fbcode.csv ~/subprog-precise-after-results-fbcode.csv -e prog,insns -f 'insns_diff!=0'
Program                  Insns (A) Insns (B) Insns    (DIFF)
------------------------ --------- --------- ----------------
on_request_context_event       597       585     -12 (-2.01%)
read_async_py_stack          43789     43657    -132 (-0.30%)
read_sync_py_stack           35041     37599   +2558 (+7.30%)
rrm_usdt                       946       940      -6 (-0.63%)
sysarmor_inet6_bind          28863     28249    -614 (-2.13%)
sysarmor_inet_bind           28845     28240    -605 (-2.10%)
syar_bind4_protect4         154145    147640   -6505 (-4.22%)
syar_bind6_protect6         165242    137088  -28154 (-17.04%)
syar_task_exit_setgid        21289     19720   -1569 (-7.37%)
syar_task_exit_setuid        21290     19721   -1569 (-7.37%)
do_uprobe                    19967     19413    -554 (-2.77%)
tw_twfw_ingress             215877    204833  -11044 (-5.12%)
tw_twfw_tc_in               215877    204833  -11044 (-5.12%)

But checking duration (wall clock) differences, that is, the actual time taken by the verifier to validate programs, we see sometimes dramatic improvements, all the way up to about 16x:

[vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-meta.csv ~/subprog-precise-after-results-meta.csv -e prog,duration -s duration_diff^ | head -n20
Program                                  Duration (us) (A) Duration (us) (B) Duration (us) (DIFF)
---------------------------------------- ----------------- ----------------- --------------------
tw_twfw_ingress                                    4488374            272836   -4215538 (-93.92%)
tw_twfw_tc_in                                      4339111            268175   -4070936 (-93.82%)
tw_twfw_egress                                     3521816            270751   -3251065 (-92.31%)
tw_twfw_tc_eg                                      3472878            284294   -3188584 (-91.81%)
balancer_ingress                                    343119            291391     -51728 (-15.08%)
syar_bind6_protect6                                  78992             64782     -14210 (-17.99%)
ttls_tc_ingress                                      11739              8176      -3563 (-30.35%)
kprobe__security_inode_link                          13864             11341      -2523 (-18.20%)
read_sync_py_stack                                   21927             19442      -2485 (-11.33%)
read_async_py_stack                                  30444             28136      -2308 (-7.58%)
syar_task_exit_setuid                                10256              8440      -1816 (-17.71%)

Signed-off-by: Andrii Nakryiko --- kernel/bpf/verifier.c | 149 ++++++++++++++++++++++++++++++++++++------ 1 file changed, 129 insertions(+), 20 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 35f34c977819..dd436bf96e81 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -240,6 +240,12 @@ static void bpf_map_key_store(struct bpf_insn_aux_data *aux, u64 state) (poisoned ? 
BPF_MAP_KEY_POISON : 0ULL); } +static bool bpf_helper_call(const struct bpf_insn *insn) +{ + return insn->code == (BPF_JMP | BPF_CALL) && + insn->src_reg == 0; +} + static bool bpf_pseudo_call(const struct bpf_insn *insn) { return insn->code == (BPF_JMP | BPF_CALL) && @@ -468,6 +474,13 @@ static struct btf_record *reg_btf_record(const struct bpf_reg_state *reg) return rec; } +static bool subprog_is_global(const struct bpf_verifier_env *env, int subprog) +{ + struct bpf_func_info_aux *aux = env->prog->aux->func_info_aux; + + return aux && aux[subprog].linkage == BTF_FUNC_GLOBAL; +} + static bool reg_may_point_to_spin_lock(const struct bpf_reg_state *reg) { return btf_record_has_field(reg_btf_record(reg), BPF_SPIN_LOCK); @@ -515,6 +528,8 @@ static bool is_dynptr_ref_function(enum bpf_func_id func_id) return func_id == BPF_FUNC_dynptr_data; } +static bool is_callback_calling_kfunc(u32 btf_id); + static bool is_callback_calling_function(enum bpf_func_id func_id) { return func_id == BPF_FUNC_for_each_map_elem || @@ -524,6 +539,11 @@ static bool is_callback_calling_function(enum bpf_func_id func_id) func_id == BPF_FUNC_user_ringbuf_drain; } +static bool is_async_callback_calling_function(enum bpf_func_id func_id) +{ + return func_id == BPF_FUNC_timer_set_callback; +} + static bool is_storage_get_function(enum bpf_func_id func_id) { return func_id == BPF_FUNC_sk_storage_get || @@ -3317,8 +3337,13 @@ static void fmt_stack_mask(char *buf, ssize_t buf_sz, u64 stack_mask) /* For given verifier state backtrack_insn() is called from the last insn to * the first insn. Its purpose is to compute a bitmask of registers and * stack slots that needs precision in the parent verifier state. + * + * @idx is an index of the instruction we are currently processing; + * @subseq_idx is an index of the subsequent instruction that: + * - *would be* executed next, if jump history is viewed in forward order; + * - *was* processed previously during backtracking. 
 */ -static int backtrack_insn(struct bpf_verifier_env *env, int idx, +static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, struct backtrack_state *bt) { const struct bpf_insn_cbs cbs = { @@ -3332,7 +3357,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, u8 mode = BPF_MODE(insn->code); u32 dreg = insn->dst_reg; u32 sreg = insn->src_reg; - u32 spi; + u32 spi, i; if (insn->code == 0) return 0; @@ -3424,14 +3449,72 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, if (class == BPF_STX) bt_set_reg(bt, sreg); } else if (class == BPF_JMP || class == BPF_JMP32) { - if (opcode == BPF_CALL) { - if (insn->src_reg == BPF_PSEUDO_CALL) - return -ENOTSUPP; - /* BPF helpers that invoke callback subprogs are - * equivalent to BPF_PSEUDO_CALL above + if (bpf_pseudo_call(insn)) { + int subprog_insn_idx, subprog; + bool is_global; + + subprog_insn_idx = idx + insn->imm + 1; + subprog = find_subprog(env, subprog_insn_idx); + if (subprog < 0) + return -EFAULT; + is_global = subprog_is_global(env, subprog); + + if (is_global) { + /* r1-r5 are invalidated after subprog call, + * so for global func call they shouldn't be set + * anymore + */ + if (bt_reg_mask(bt) & BPF_REGMASK_ARGS) + return -EFAULT; + /* global subprog always sets R0 */ + bt_clear_reg(bt, BPF_REG_0); + return 0; + } else { + /* static subprog call instruction, which + * means that we are exiting the current subprog, + * so only r1-r5 could still be requested as + * precise, r0 and r6-r10 or any stack slot in + * the current frame should be zero by now + */ + if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) + return -EFAULT; + /* we don't track register spills perfectly, + * so fall back to force-precise instead of failing */ + if (bt_stack_mask(bt) != 0) + return -ENOTSUPP; + /* propagate r1-r5 to the caller */ + for (i = BPF_REG_1; i <= BPF_REG_5; i++) { + if (bt_is_reg_set(bt, i)) { + bt_clear_reg(bt, i); + bt_set_frame_reg(bt, bt->frame - 1, i); + } + } + if (bt_subprog_exit(bt)) + return -EFAULT; + return 0; + } + } else if ((bpf_helper_call(insn) && + is_callback_calling_function(insn->imm) && + !is_async_callback_calling_function(insn->imm)) || + (bpf_pseudo_kfunc_call(insn) && is_callback_calling_kfunc(insn->imm))) { + /* callback-calling helper or kfunc call, which means + * we are exiting from subprog, but unlike the subprog + * call handling above, we shouldn't propagate + * precision of r1-r5 (if any requested), as they are + * not actually arguments passed directly to callback + * subprogs */ - if (insn->src_reg == 0 && is_callback_calling_function(insn->imm)) + if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) + return -EFAULT; + if (bt_stack_mask(bt) != 0) return -ENOTSUPP; + /* clear r1-r5 in callback subprog's mask */ + for (i = BPF_REG_1; i <= BPF_REG_5; i++) + bt_clear_reg(bt, i); + if (bt_subprog_exit(bt)) + return -EFAULT; + return 0; + } else if (opcode == BPF_CALL) { /* kfunc with imm==0 is invalid and fixup_kfunc_call will * catch this error later. Make backtracking conservative * with ENOTSUPP. @@ -3449,7 +3532,39 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, return -EFAULT; } } else if (opcode == BPF_EXIT) { - return -ENOTSUPP; + bool r0_precise; + + if (bt_reg_mask(bt) & BPF_REGMASK_ARGS) { + /* if backtracing was looking for registers R1-R5 + * they should have been found already. 
+ */ + verbose(env, "BUG regs %x\n", bt_reg_mask(bt)); + WARN_ONCE(1, "verifier backtracking bug"); + return -EFAULT; + } + + /* BPF_EXIT in subprog or callback always jumps to right + * after the call instruction, so by checking whether the + * instruction at subseq_idx-1 is a subprog call or not we + * can distinguish actual exit from *subprog* from + * exit from *callback*. In the former case, we need + * to propagate r0 precision, if necessary. In the + * latter case we never do that. + */ + r0_precise = subseq_idx - 1 >= 0 && + bpf_pseudo_call(&env->prog->insnsi[subseq_idx - 1]) && + bt_is_reg_set(bt, BPF_REG_0); + + bt_clear_reg(bt, BPF_REG_0); + if (bt_subprog_enter(bt)) + return -EFAULT; + + if (r0_precise) + bt_set_reg(bt, BPF_REG_0); + /* r6-r9 and stack slots will stay set in caller frame + * bitmasks until we return from callee(s) + */ + return 0; } else if (BPF_SRC(insn->code) == BPF_X) { if (!bt_is_reg_set(bt, dreg) && !bt_is_reg_set(bt, sreg)) return 0; @@ -3703,7 +3818,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r struct bpf_func_state *func; struct bpf_reg_state *reg; bool skip_first = true; - int i, fr, err; + int i, prev_i, fr, err; if (!env->bpf_capable) return 0; @@ -3773,12 +3888,12 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r return -EFAULT; } - for (i = last_idx;;) { + for (i = last_idx, prev_i = -1;;) { if (skip_first) { err = 0; skip_first = false; } else { - err = backtrack_insn(env, i, bt); + err = backtrack_insn(env, i, prev_i, bt); } if (err == -ENOTSUPP) { mark_all_scalars_precise(env, env->cur_state); @@ -3795,6 +3910,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r return 0; if (i == first_idx) break; + prev_i = i; i = get_prev_insn_idx(st, i, &history); if (i >= env->prog->len) { /* This can happen if backtracking reached insn 0 @@ -8376,17 +8492,13 @@ static int set_callee_state(struct bpf_verifier_env *env, struct bpf_func_state *caller, struct bpf_func_state *callee, int insn_idx); -static bool is_callback_calling_kfunc(u32 btf_id); - static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, int *insn_idx, int subprog, set_callee_state_fn set_callee_state_cb) { struct bpf_verifier_state *state = env->cur_state; - struct bpf_func_info_aux *func_info_aux; struct bpf_func_state *caller, *callee; int err; - bool is_global = false; if (state->curframe + 1 >= MAX_CALL_FRAMES) { verbose(env, "the call stack of %d frames is too deep\n", @@ -8401,13 +8513,10 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn return -EFAULT; } - func_info_aux = env->prog->aux->func_info_aux; - if (func_info_aux) - is_global = func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL; err = btf_check_subprog_call(env, subprog, caller->regs); if (err == -EFAULT) return err; - if (is_global) { + if (subprog_is_global(env, subprog)) { if (err) { verbose(env, "Caller passes invalid args into func#%d\n", subprog); From patchwork Tue Apr 25 23:49:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13223911 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 72E7EC77B7F for ; Tue, 25 Apr 2023 23:49:50 +0000 (UTC) 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237043AbjDYXtr convert rfc822-to-8bit (ORCPT ); Tue, 25 Apr 2023 19:49:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60562 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237048AbjDYXtr (ORCPT ); Tue, 25 Apr 2023 19:49:47 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D7E45B232 for ; Tue, 25 Apr 2023 16:49:43 -0700 (PDT) Received: from pps.filterd (m0148461.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 33PLEjBS031728 for ; Tue, 25 Apr 2023 16:49:43 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3q6hju3dh9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 25 Apr 2023 16:49:43 -0700 Received: from twshared52232.38.frc1.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:21d::4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Tue, 25 Apr 2023 16:49:42 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 3BC162F2D84BA; Tue, 25 Apr 2023 16:49:33 -0700 (PDT) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 09/10] selftests/bpf: add precision propagation tests in the presence of subprogs Date: Tue, 25 Apr 2023 16:49:10 -0700 Message-ID: <20230425234911.2113352-10-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230425234911.2113352-1-andrii@kernel.org> References: <20230425234911.2113352-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: BwS6-fbi0MXfo9kV6v775l6q4Y-R9n4s X-Proofpoint-ORIG-GUID: BwS6-fbi0MXfo9kV6v775l6q4Y-R9n4s X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-04-25_10,2023-04-25_01,2023-02-09_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net

Add a bunch of tests validating the verifier's precision backpropagation logic in the presence of subprog calls and/or callback-calling helpers/kfuncs.

We validate the following conditions:
- subprog_result_precise: static subprog r0 result precision handling;
- global_subprog_result_precise: global subprog r0 precision shortcutting, similar to BPF helper handling;
- callback_result_precise: similarly, r0 marking precise for callback-calling helpers;
- parent_callee_saved_reg_precise, parent_callee_saved_reg_precise_global: propagation of precision for callee-saved registers bypassing static/global subprogs;
- parent_callee_saved_reg_precise_with_callback: same as above, but in the presence of a callback-calling helper;
- parent_stack_slot_precise, parent_stack_slot_precise_global: similar to above, but instead propagating precision of a stack slot (spilled SCALAR reg);
- parent_stack_slot_precise_with_callback: same as above, but in the presence of a callback-calling helper;
- subprog_arg_precise: propagation of precision of a static subprog's input argument back to the caller;
- subprog_spill_into_parent_stack_slot_precise: negative test validating that the verifier currently can't support backtracking of stack access through a non-r10 register; we validate that we fall back to forcing precision for all SCALARs. 
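For reproducing these locally, the new tests run under the usual test_progs harness; a minimal sketch, assuming an already-built bpf selftests tree (exact paths and required privileges depend on your setup):

	$ cd tools/testing/selftests/bpf
	$ make
	$ sudo ./test_progs -t verifier_subprog_precision

All of these assert on verifier log fragments via __msg() and therefore load programs with __log_level(2), as can be seen in the test definitions below.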
Signed-off-by: Andrii Nakryiko --- .../selftests/bpf/prog_tests/verifier.c | 2 + tools/testing/selftests/bpf/progs/bpf_misc.h | 4 + .../bpf/progs/verifier_subprog_precision.c | 536 ++++++++++++++++++ 3 files changed, 542 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/verifier_subprog_precision.c diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c index 2497716ee379..531621adef42 100644 --- a/tools/testing/selftests/bpf/prog_tests/verifier.c +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c @@ -55,6 +55,7 @@ #include "verifier_spill_fill.skel.h" #include "verifier_spin_lock.skel.h" #include "verifier_stack_ptr.skel.h" +#include "verifier_subprog_precision.skel.h" #include "verifier_subreg.skel.h" #include "verifier_uninit.skel.h" #include "verifier_unpriv.skel.h" @@ -154,6 +155,7 @@ void test_verifier_sock(void) { RUN(verifier_sock); } void test_verifier_spill_fill(void) { RUN(verifier_spill_fill); } void test_verifier_spin_lock(void) { RUN(verifier_spin_lock); } void test_verifier_stack_ptr(void) { RUN(verifier_stack_ptr); } +void test_verifier_subprog_precision(void) { RUN(verifier_subprog_precision); } void test_verifier_subreg(void) { RUN(verifier_subreg); } void test_verifier_uninit(void) { RUN(verifier_uninit); } void test_verifier_unpriv(void) { RUN(verifier_unpriv); } diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h index d3c1217ba79a..38a57a2e70db 100644 --- a/tools/testing/selftests/bpf/progs/bpf_misc.h +++ b/tools/testing/selftests/bpf/progs/bpf_misc.h @@ -86,6 +86,10 @@ #define POINTER_VALUE 0xcafe4all #define TEST_DATA_LEN 64 +#ifndef __used +#define __used __attribute__((used)) +#endif + #if defined(__TARGET_ARCH_x86) #define SYSCALL_WRAPPER 1 #define SYS_PREFIX "__x64_" diff --git a/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c b/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c new file mode 100644 index 000000000000..ece317c0a7ec --- /dev/null +++ b/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c @@ -0,0 +1,536 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. 
*/ + +#include <errno.h> +#include <string.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" + +#define ARRAY_SIZE(x) (sizeof(x) / sizeof(x[0])) + +int vals[] SEC(".data.vals") = {1, 2, 3, 4}; + +__naked __noinline __used +static unsigned long identity_subprog() +{ + /* the simplest *static* 64-bit identity function */ + asm volatile ( + "r0 = r1;" + "exit;" + ); +} + +__noinline __used +unsigned long global_identity_subprog(__u64 x) +{ + /* the simplest *global* 64-bit identity function */ + return x; +} + +__naked __noinline __used +static unsigned long callback_subprog() +{ + /* the simplest callback function */ + asm volatile ( + "r0 = 0;" + "exit;" + ); +} + +SEC("?raw_tp") +__success __log_level(2) +__msg("7: (0f) r1 += r0") +__msg("mark_precise: frame0: regs(0x1)=r0 stack(0x0)= before 6: (bf) r1 = r7") +__msg("mark_precise: frame0: regs(0x1)=r0 stack(0x0)= before 5: (27) r0 *= 4") +__msg("mark_precise: frame0: regs(0x1)=r0 stack(0x0)= before 11: (95) exit") +__msg("mark_precise: frame1: regs(0x1)=r0 stack(0x0)= before 10: (bf) r0 = r1") +__msg("mark_precise: frame1: regs(0x2)=r1 stack(0x0)= before 4: (85) call pc+5") +__msg("mark_precise: frame0: regs(0x2)=r1 stack(0x0)= before 3: (bf) r1 = r6") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 2: (b7) r6 = 3") +__naked int subprog_result_precise(void) +{ + asm volatile ( + "r6 = 3;" + /* pass r6 through r1 into subprog to get it back as r0; + * this whole chain will have to be marked as precise later + */ + "r1 = r6;" + "call identity_subprog;" + /* now use subprog's returned value (which is a + * r6 -> r1 -> r0 chain), as index into vals array, forcing + * all of that to be known precisely + */ + "r0 *= 4;" + "r1 = %[vals];" + /* here r0->r1->r6 chain is forced to be precise and has to be + * propagated back to the beginning, including through the + * subprog call + */ + "r1 += r0;" + "r0 = *(u32 *)(r1 + 0);" + "exit;" + : + : __imm_ptr(vals) + : __clobber_common, "r6" + ); +} + +SEC("?raw_tp") +__success __log_level(2) +__msg("9: (0f) r1 += r0") +__msg("mark_precise: frame0: last_idx 9 first_idx 0") +__msg("mark_precise: frame0: regs(0x1)=r0 stack(0x0)= before 8: (bf) r1 = r7") +__msg("mark_precise: frame0: regs(0x1)=r0 stack(0x0)= before 7: (27) r0 *= 4") +__msg("mark_precise: frame0: regs(0x1)=r0 stack(0x0)= before 5: (a5) if r0 < 0x4 goto pc+1") +__msg("mark_precise: frame0: regs(0x1)=r0 stack(0x0)= before 4: (85) call pc+7") +__naked int global_subprog_result_precise(void) +{ + asm volatile ( + "r6 = 3;" + /* pass r6 through r1 into subprog to get it back as r0; + * given global_identity_subprog is global, precision won't + * propagate all the way back to r6 + */ + "r1 = r6;" + "call global_identity_subprog;" + /* now use subprog's returned value (which is unknown now, so + * we need to clamp it), as index into vals array, forcing r0 + * to be marked precise (with no effect on r6, though) + */ + "if r0 < %[vals_arr_sz] goto 1f;" + "r0 = %[vals_arr_sz] - 1;" + "1:" + "r0 *= 4;" + "r1 = %[vals];" + /* here r0 is forced to be precise and has to be + * propagated back to the global subprog call, but it + * shouldn't go all the way to mark r6 as precise + */ + "r1 += r0;" + "r0 = *(u32 *)(r1 + 0);" + "exit;" + : + : __imm_ptr(vals), + __imm_const(vals_arr_sz, ARRAY_SIZE(vals)) + : __clobber_common, "r6" + ); +} + +SEC("?raw_tp") +__success __log_level(2) +__msg("14: (0f) r1 += r6") +__msg("mark_precise: frame0: last_idx 14 first_idx 10") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 13: (bf) r1 = r7") +__msg("mark_precise: 
frame0: regs(0x40)=r6 stack(0x0)= before 12: (27) r6 *= 4") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 11: (25) if r6 > 0x3 goto pc+4") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 10: (bf) r6 = r0") +__msg("mark_precise: frame0: parent state regs(0x1)=r0 stack(0x0)=:") +__msg("mark_precise: frame0: last_idx 18 first_idx 0") +__msg("mark_precise: frame0: regs(0x1)=r0 stack(0x0)= before 18: (95) exit") +__naked int callback_result_precise(void) +{ + asm volatile ( + "r6 = 3;" + + /* call subprog and use result; r0 shouldn't propagate back to + * callback_subprog + */ + "r1 = r6;" /* nr_loops */ + "r2 = %[callback_subprog];" /* callback_fn */ + "r3 = 0;" /* callback_ctx */ + "r4 = 0;" /* flags */ + "call %[bpf_loop];" + + "r6 = r0;" + "if r6 > 3 goto 1f;" + "r6 *= 4;" + "r1 = %[vals];" + /* here r6 is forced to be precise and has to be propagated + * back to the bpf_loop() call, but not beyond + */ + "r1 += r6;" + "r0 = *(u32 *)(r1 + 0);" + "1:" + "exit;" + : + : __imm_ptr(vals), + __imm_ptr(callback_subprog), + __imm(bpf_loop) + : __clobber_common, "r6" + ); +} + +SEC("?raw_tp") +__success __log_level(2) +__msg("7: (0f) r1 += r6") +__msg("mark_precise: frame0: last_idx 7 first_idx 0") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 6: (bf) r1 = r7") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 5: (27) r6 *= 4") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 11: (95) exit") +__msg("mark_precise: frame1: regs(0x0)= stack(0x0)= before 10: (bf) r0 = r1") +__msg("mark_precise: frame1: regs(0x0)= stack(0x0)= before 4: (85) call pc+5") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 3: (b7) r1 = 0") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 2: (b7) r6 = 3") +__naked int parent_callee_saved_reg_precise(void) +{ + asm volatile ( + "r6 = 3;" + + /* call subprog and ignore result; we need this call only to + * complicate jump history + */ + "r1 = 0;" + "call identity_subprog;" + + "r6 *= 4;" + "r1 = %[vals];" + /* here r6 is forced to be precise and has to be propagated + * back to the beginning, handling (and ignoring) subprog call + */ + "r1 += r6;" + "r0 = *(u32 *)(r1 + 0);" + "exit;" + : + : __imm_ptr(vals) + : __clobber_common, "r6" + ); +} + +SEC("?raw_tp") +__success __log_level(2) +__msg("7: (0f) r1 += r6") +__msg("mark_precise: frame0: last_idx 7 first_idx 0") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 6: (bf) r1 = r7") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 5: (27) r6 *= 4") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 4: (85) call pc+5") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 3: (b7) r1 = 0") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 2: (b7) r6 = 3") +__naked int parent_callee_saved_reg_precise_global(void) +{ + asm volatile ( + "r6 = 3;" + + /* call subprog and ignore result; we need this call only to + * complicate jump history + */ + "r1 = 0;" + "call global_identity_subprog;" + + "r6 *= 4;" + "r1 = %[vals];" + /* here r6 is forced to be precise and has to be propagated + * back to the beginning, handling (and ignoring) subprog call + */ + "r1 += r6;" + "r0 = *(u32 *)(r1 + 0);" + "exit;" + : + : __imm_ptr(vals) + : __clobber_common, "r6" + ); +} + +SEC("?raw_tp") +__success __log_level(2) +__msg("12: (0f) r1 += r6") +__msg("mark_precise: frame0: last_idx 12 first_idx 10") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 11: (bf) r1 = r7") 
+__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 10: (27) r6 *= 4") +__msg("mark_precise: frame0: parent state regs(0x40)=r6 stack(0x0)=:") +__msg("mark_precise: frame0: last_idx 16 first_idx 0") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 16: (95) exit") +__msg("mark_precise: frame1: regs(0x0)= stack(0x0)= before 15: (b7) r0 = 0") +__msg("mark_precise: frame1: regs(0x0)= stack(0x0)= before 9: (85) call bpf_loop#181") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 8: (b7) r4 = 0") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 7: (b7) r3 = 0") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 6: (bf) r2 = r8") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 5: (b7) r1 = 1") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 4: (b7) r6 = 3") +__naked int parent_callee_saved_reg_precise_with_callback(void) +{ + asm volatile ( + "r6 = 3;" + + /* call subprog and ignore result; we need this call only to + * complicate jump history + */ + "r1 = 1;" /* nr_loops */ + "r2 = %[callback_subprog];" /* callback_fn */ + "r3 = 0;" /* callback_ctx */ + "r4 = 0;" /* flags */ + "call %[bpf_loop];" + + "r6 *= 4;" + "r1 = %[vals];" + /* here r6 is forced to be precise and has to be propagated + * back to the beginning, handling (and ignoring) callback call + */ + "r1 += r6;" + "r0 = *(u32 *)(r1 + 0);" + "exit;" + : + : __imm_ptr(vals), + __imm_ptr(callback_subprog), + __imm(bpf_loop) + : __clobber_common, "r6" + ); +} + +SEC("?raw_tp") +__success __log_level(2) +__msg("9: (0f) r1 += r6") +__msg("mark_precise: frame0: last_idx 9 first_idx 6") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 8: (bf) r1 = r7") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 7: (27) r6 *= 4") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 6: (79) r6 = *(u64 *)(r10 -8)") +__msg("mark_precise: frame0: parent state regs(0x0)= stack(0x1)=-8:") +__msg("mark_precise: frame0: last_idx 13 first_idx 0") +__msg("mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 13: (95) exit") +__msg("mark_precise: frame1: regs(0x0)= stack(0x0)= before 12: (bf) r0 = r1") +__msg("mark_precise: frame1: regs(0x0)= stack(0x0)= before 5: (85) call pc+6") +__msg("mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 4: (b7) r1 = 0") +__msg("mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 3: (7b) *(u64 *)(r10 -8) = r6") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 2: (b7) r6 = 3") +__naked int parent_stack_slot_precise(void) +{ + asm volatile ( + /* spill reg */ + "r6 = 3;" + "*(u64 *)(r10 - 8) = r6;" + + /* call subprog and ignore result; we need this call only to + * complicate jump history + */ + "r1 = 0;" + "call identity_subprog;" + + /* restore reg from stack; in this case we'll be carrying + * stack mask when going back into subprog through jump + * history + */ + "r6 = *(u64 *)(r10 - 8);" + + "r6 *= 4;" + "r1 = %[vals];" + /* here r6 is forced to be precise and has to be propagated + * back to the beginning, handling (and ignoring) subprog call + */ + "r1 += r6;" + "r0 = *(u32 *)(r1 + 0);" + "exit;" + : + : __imm_ptr(vals) + : __clobber_common, "r6" + ); +} + +SEC("?raw_tp") +__success __log_level(2) +__msg("9: (0f) r1 += r6") +__msg("mark_precise: frame0: last_idx 9 first_idx 6") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 8: (bf) r1 = r7") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 7: (27) r6 *= 4") +__msg("mark_precise: frame0: 
regs(0x40)=r6 stack(0x0)= before 6: (79) r6 = *(u64 *)(r10 -8)") +__msg("mark_precise: frame0: parent state regs(0x0)= stack(0x1)=-8:") +__msg("mark_precise: frame0: last_idx 5 first_idx 0") +__msg("mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 5: (85) call pc+6") +__msg("mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 4: (b7) r1 = 0") +__msg("mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 3: (7b) *(u64 *)(r10 -8) = r6") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 2: (b7) r6 = 3") +__naked int parent_stack_slot_precise_global(void) +{ + asm volatile ( + /* spill reg */ + "r6 = 3;" + "*(u64 *)(r10 - 8) = r6;" + + /* call subprog and ignore result; we need this call only to + * complicate jump history + */ + "r1 = 0;" + "call global_identity_subprog;" + + /* restore reg from stack; in this case we'll be carrying + * stack mask when going back into subprog through jump + * history + */ + "r6 = *(u64 *)(r10 - 8);" + + "r6 *= 4;" + "r1 = %[vals];" + /* here r6 is forced to be precise and has to be propagated + * back to the beginning, handling (and ignoring) subprog call + */ + "r1 += r6;" + "r0 = *(u32 *)(r1 + 0);" + "exit;" + : + : __imm_ptr(vals) + : __clobber_common, "r6" + ); +} + +SEC("?raw_tp") +__success __log_level(2) +__msg("14: (0f) r1 += r6") +__msg("mark_precise: frame0: last_idx 14 first_idx 11") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 13: (bf) r1 = r7") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 12: (27) r6 *= 4") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 11: (79) r6 = *(u64 *)(r10 -8)") +__msg("mark_precise: frame0: parent state regs(0x0)= stack(0x1)=-8:") +__msg("mark_precise: frame0: last_idx 18 first_idx 0") +__msg("mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 18: (95) exit") +__msg("mark_precise: frame1: regs(0x0)= stack(0x0)= before 17: (b7) r0 = 0") +__msg("mark_precise: frame1: regs(0x0)= stack(0x0)= before 10: (85) call bpf_loop#181") +__msg("mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 9: (b7) r4 = 0") +__msg("mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 8: (b7) r3 = 0") +__msg("mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 7: (bf) r2 = r8") +__msg("mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 6: (bf) r1 = r6") +__msg("mark_precise: frame0: regs(0x0)= stack(0x1)=-8 before 5: (7b) *(u64 *)(r10 -8) = r6") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 4: (b7) r6 = 3") +__naked int parent_stack_slot_precise_with_callback(void) +{ + asm volatile ( + /* spill reg */ + "r6 = 3;" + "*(u64 *)(r10 - 8) = r6;" + + /* ensure we have callback frame in jump history */ + "r1 = r6;" /* nr_loops */ + "r2 = %[callback_subprog];" /* callback_fn */ + "r3 = 0;" /* callback_ctx */ + "r4 = 0;" /* flags */ + "call %[bpf_loop];" + + /* restore reg from stack; in this case we'll be carrying + * stack mask when going back into subprog through jump + * history + */ + "r6 = *(u64 *)(r10 - 8);" + + "r6 *= 4;" + "r1 = %[vals];" + /* here r6 is forced to be precise and has to be propagated + * back to the beginning, handling (and ignoring) subprog call + */ + "r1 += r6;" + "r0 = *(u32 *)(r1 + 0);" + "exit;" + : + : __imm_ptr(vals), + __imm_ptr(callback_subprog), + __imm(bpf_loop) + : __clobber_common, "r6" + ); +} + +__noinline __used +static __u64 subprog_with_precise_arg(__u64 x) +{ + return vals[x]; /* x is forced to be precise */ +} + +SEC("?raw_tp") +__success __log_level(2) +__msg("8: (0f) r2 += r1") +__msg("mark_precise: frame1: 
last_idx 8 first_idx 0") +__msg("mark_precise: frame1: regs(0x2)=r1 stack(0x0)= before 6: (18) r2 = ") +__msg("mark_precise: frame1: regs(0x2)=r1 stack(0x0)= before 5: (67) r1 <<= 2") +__msg("mark_precise: frame1: regs(0x2)=r1 stack(0x0)= before 2: (85) call pc+2") +__msg("mark_precise: frame0: regs(0x2)=r1 stack(0x0)= before 1: (bf) r1 = r6") +__msg("mark_precise: frame0: regs(0x40)=r6 stack(0x0)= before 0: (b7) r6 = 3") +__naked int subprog_arg_precise(void) +{ + asm volatile ( + "r6 = 3;" + "r1 = r6;" + /* subprog_with_precise_arg expects its argument to be + * precise, so r1->r6 will be marked precise from inside the + * subprog + */ + "call subprog_with_precise_arg;" + "r0 += r6;" + "exit;" + : + : + : __clobber_common, "r6" + ); +} + +/* r1 is pointer to stack slot; + * r2 is a register to spill into that slot + * subprog also spills r2 into its own stack slot + */ +__naked __noinline __used +static __u64 subprog_spill_reg_precise(void) +{ + asm volatile ( + /* spill to parent stack */ + "*(u64 *)(r1 + 0) = r2;" + /* spill to subprog stack (we use -16 offset to avoid + * accidental confusion with parent's -8 stack slot in + * verifier log output) + */ + "*(u64 *)(r10 - 16) = r2;" + /* use both spills as return result to propagete precision everywhere */ + "r0 = *(u64 *)(r10 - 16);" + "r2 = *(u64 *)(r1 + 0);" + "r0 += r2;" + "exit;" + ); +} + +SEC("?raw_tp") +__success __log_level(2) +/* precision backtracking can't currently handle stack access not through r10, + * so we won't be able to mark stack slot fp-8 as precise, and so will + * fallback to forcing all as precise + */ +__msg("mark_precise: frame0: falling back to forcing all scalars precise") +__naked int subprog_spill_into_parent_stack_slot_precise(void) +{ + asm volatile ( + "r6 = 1;" + + /* pass pointer to stack slot and r6 to subprog; + * r6 will be marked precise and spilled into fp-8 slot, which + * also should be marked precise + */ + "r1 = r10;" + "r1 += -8;" + "r2 = r6;" + "call subprog_spill_reg_precise;" + + /* restore reg from stack; in this case we'll be carrying + * stack mask when going back into subprog through jump + * history + */ + "r7 = *(u64 *)(r10 - 8);" + + "r7 *= 4;" + "r1 = %[vals];" + /* here r7 is forced to be precise and has to be propagated + * back to the beginning, handling subprog call and logic + */ + "r1 += r7;" + "r0 = *(u32 *)(r1 + 0);" + "exit;" + : + : __imm_ptr(vals) + : __clobber_common, "r6", "r7" + ); +} + +__naked __noinline __used +static __u64 subprog_with_checkpoint(void) +{ + asm volatile ( + "r0 = 0;" + /* guaranteed checkpoint if BPF_F_TEST_STATE_FREQ is used */ + "goto +0;" + "exit;" + ); +} + +char _license[] SEC("license") = "GPL"; From patchwork Tue Apr 25 23:49:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13223912 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2CC8DC77B7C for ; Tue, 25 Apr 2023 23:49:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237052AbjDYXtv convert rfc822-to-8bit (ORCPT ); Tue, 25 Apr 2023 19:49:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60634 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237060AbjDYXtt (ORCPT ); 
Tue, 25 Apr 2023 19:49:49 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BC58A13FAF for ; Tue, 25 Apr 2023 16:49:48 -0700 (PDT) Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 33PLEljM004044 for ; Tue, 25 Apr 2023 16:49:48 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3q6f8ucgyk-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 25 Apr 2023 16:49:47 -0700 Received: from twshared35445.38.frc1.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Tue, 25 Apr 2023 16:49:47 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 4623C2F2D84D9; Tue, 25 Apr 2023 16:49:35 -0700 (PDT) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 10/10] selftests/bpf: revert iter test subprog precision workaround Date: Tue, 25 Apr 2023 16:49:11 -0700 Message-ID: <20230425234911.2113352-11-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230425234911.2113352-1-andrii@kernel.org> References: <20230425234911.2113352-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: Yw9VWz8DLnfqSFPYYOSICi6GOh7ndpe3 X-Proofpoint-ORIG-GUID: Yw9VWz8DLnfqSFPYYOSICi6GOh7ndpe3 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-04-25_10,2023-04-25_01,2023-02-09_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net

Now that precision propagation is supported fully in the presence of subprogs, there is no need to work around the iter test. Revert the original workaround.

This reverts commit be7dbd275dc6 ("selftests/bpf: avoid mark_all_scalars_precise() trigger in one of iter tests"). 
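Concretely, the revert goes back to passing the array size into the subprogs explicitly instead of hardcoding a shared ARR_SZ constant; e.g., the fill() helper returns to the shape below (taken verbatim from the diff that follows):

	static __noinline void fill(struct bpf_iter_num *it, int *arr, __u32 n, int mul)
	{
		int *t, i;

		while ((t = bpf_iter_num_next(it))) {
			i = *t;
			if (i >= n)
				break;
			arr[i] = i * mul;
		}
	}

With precision propagation now working across subprog boundaries, the `i >= n` bound is tracked precisely through the call, so no mark_all_scalars_precise() fallback is triggered.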
Signed-off-by: Andrii Nakryiko --- tools/testing/selftests/bpf/progs/iters.c | 26 ++++++++++------------- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/tools/testing/selftests/bpf/progs/iters.c b/tools/testing/selftests/bpf/progs/iters.c index be16143ae292..6b9b3c56f009 100644 --- a/tools/testing/selftests/bpf/progs/iters.c +++ b/tools/testing/selftests/bpf/progs/iters.c @@ -651,29 +651,25 @@ int iter_stack_array_loop(const void *ctx) return sum; } -#define ARR_SZ 16 - -static __noinline void fill(struct bpf_iter_num *it, int *arr, int mul) +static __noinline void fill(struct bpf_iter_num *it, int *arr, __u32 n, int mul) { - int *t; - __u64 i; + int *t, i; while ((t = bpf_iter_num_next(it))) { i = *t; - if (i >= ARR_SZ) + if (i >= n) break; arr[i] = i * mul; } } -static __noinline int sum(struct bpf_iter_num *it, int *arr) +static __noinline int sum(struct bpf_iter_num *it, int *arr, __u32 n) { - int *t, sum = 0;; - __u64 i; + int *t, i, sum = 0;; while ((t = bpf_iter_num_next(it))) { i = *t; - if (i >= ARR_SZ) + if (i >= n) break; sum += arr[i]; } @@ -685,7 +681,7 @@ SEC("raw_tp") __success int iter_pass_iter_ptr_to_subprog(const void *ctx) { - int arr1[ARR_SZ], arr2[ARR_SZ]; + int arr1[16], arr2[32]; struct bpf_iter_num it; int n, sum1, sum2; @@ -694,25 +690,25 @@ int iter_pass_iter_ptr_to_subprog(const void *ctx) /* fill arr1 */ n = ARRAY_SIZE(arr1); bpf_iter_num_new(&it, 0, n); - fill(&it, arr1, 2); + fill(&it, arr1, n, 2); bpf_iter_num_destroy(&it); /* fill arr2 */ n = ARRAY_SIZE(arr2); bpf_iter_num_new(&it, 0, n); - fill(&it, arr2, 10); + fill(&it, arr2, n, 10); bpf_iter_num_destroy(&it); /* sum arr1 */ n = ARRAY_SIZE(arr1); bpf_iter_num_new(&it, 0, n); - sum1 = sum(&it, arr1); + sum1 = sum(&it, arr1, n); bpf_iter_num_destroy(&it); /* sum arr2 */ n = ARRAY_SIZE(arr2); bpf_iter_num_new(&it, 0, n); - sum2 = sum(&it, arr2); + sum2 = sum(&it, arr2, n); bpf_iter_num_destroy(&it); bpf_printk("sum1=%d, sum2=%d", sum1, sum2);