From patchwork Wed Sep 21 16:42:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 12983982 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E177DC6FA8E for ; Wed, 21 Sep 2022 16:49:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231470AbiIUQty convert rfc822-to-8bit (ORCPT ); Wed, 21 Sep 2022 12:49:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53208 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231130AbiIUQt1 (ORCPT ); Wed, 21 Sep 2022 12:49:27 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AD8B410FDC for ; Wed, 21 Sep 2022 09:43:08 -0700 (PDT) Received: from pps.filterd (m0148461.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 28LFUc55006436 for ; Wed, 21 Sep 2022 09:43:08 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3jqm806rxw-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 21 Sep 2022 09:43:08 -0700 Received: from twshared22593.02.prn5.facebook.com (2620:10d:c085:208::f) by mail.thefacebook.com (2620:10d:c085:21d::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Wed, 21 Sep 2022 09:43:07 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 0106F1F2B4F95; Wed, 21 Sep 2022 09:42:56 -0700 (PDT) From: Andrii Nakryiko To: , , CC: , Subject: [PATCH v2 bpf-next 1/4] selftests/bpf: fix double bpf_object__close() in veristate Date: Wed, 21 Sep 2022 09:42:51 -0700 Message-ID: <20220921164254.3630690-2-andrii@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220921164254.3630690-1-andrii@kernel.org> References: <20220921164254.3630690-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: -hEA8ox7XWTN8YSEcSbWbfWn6OOi8Rwm X-Proofpoint-GUID: -hEA8ox7XWTN8YSEcSbWbfWn6OOi8Rwm X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.528,FMLib:17.11.122.1 definitions=2022-09-21_09,2022-09-20_02,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net bpf_object__close(obj) is called twice for BPF object files with single BPF program in it. This causes crash. Fix this by not calling bpf_object__close() unnecessarily. Fixes: c8bc5e050976 ("selftests/bpf: Add veristat tool for mass-verifying BPF object files") Signed-off-by: Andrii Nakryiko --- tools/testing/selftests/bpf/veristat.c | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/testing/selftests/bpf/veristat.c b/tools/testing/selftests/bpf/veristat.c index 39e6dc41e504..c0c8a65cda52 100644 --- a/tools/testing/selftests/bpf/veristat.c +++ b/tools/testing/selftests/bpf/veristat.c @@ -300,7 +300,6 @@ static int process_obj(const char *filename) prog = bpf_object__next_program(obj, NULL); bpf_program__set_autoload(prog, true); process_prog(filename, obj, prog); - bpf_object__close(obj); goto cleanup; } From patchwork Wed Sep 21 16:42:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 12983984 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6EC9DC6FA8B for ; Wed, 21 Sep 2022 16:50:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231535AbiIUQt4 convert rfc822-to-8bit (ORCPT ); Wed, 21 Sep 2022 12:49:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50900 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231556AbiIUQtd (ORCPT ); Wed, 21 Sep 2022 12:49:33 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1E37F9AF87 for ; Wed, 21 Sep 2022 09:43:42 -0700 (PDT) Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 28LFUjZH015200 for ; Wed, 21 Sep 2022 09:43:41 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3jqbbub3q6-8 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 21 Sep 2022 09:43:41 -0700 Received: from snc-exhub201.TheFacebook.com (2620:10d:c085:21d::7) by snc-exhub103.TheFacebook.com (2620:10d:c085:11d::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Wed, 21 Sep 2022 09:43:36 -0700 Received: from twshared15978.04.prn5.facebook.com (2620:10d:c085:208::11) by mail.thefacebook.com (2620:10d:c085:21d::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Wed, 21 Sep 2022 09:43:07 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 138791F2B4F9A; Wed, 21 Sep 2022 09:42:59 -0700 (PDT) From: Andrii Nakryiko To: , , CC: , Subject: [PATCH v2 bpf-next 2/4] selftests/bpf: add CSV output mode for veristat Date: Wed, 21 Sep 2022 09:42:52 -0700 Message-ID: <20220921164254.3630690-3-andrii@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220921164254.3630690-1-andrii@kernel.org> References: <20220921164254.3630690-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: q6in8Ouq-kDPvkFsgW9_YCm5PlhC5zfx X-Proofpoint-GUID: q6in8Ouq-kDPvkFsgW9_YCm5PlhC5zfx X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.528,FMLib:17.11.122.1 definitions=2022-09-21_09,2022-09-20_02,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Teach veristat to output results as CSV table for easier programmatic processing. Change what was --output/-o argument to now be --emit/-e. And then use --output-format/-o to specify output format. Currently "table" and "csv" is supported, table being default. For CSV output mode veristat is using spec identifiers as column names. E.g., instead of "Total states" veristat uses "total_states" as a CSV header name. Internally veristat recognizes three formats, one of them (RESFMT_TABLE_CALCLEN) is a special format instructing veristat to calculate column widths for table output. This felt a bit cleaner and more uniform than either creating separate functions just for this. Signed-off-by: Andrii Nakryiko --- tools/testing/selftests/bpf/veristat.c | 111 +++++++++++++++++-------- 1 file changed, 75 insertions(+), 36 deletions(-) diff --git a/tools/testing/selftests/bpf/veristat.c b/tools/testing/selftests/bpf/veristat.c index c0c8a65cda52..0472bfae3c9d 100644 --- a/tools/testing/selftests/bpf/veristat.c +++ b/tools/testing/selftests/bpf/veristat.c @@ -46,10 +46,17 @@ struct stat_specs { int lens[ALL_STATS_CNT]; }; +enum resfmt { + RESFMT_TABLE, + RESFMT_TABLE_CALCLEN, /* fake format to pre-calculate table's column widths */ + RESFMT_CSV, +}; + static struct env { char **filenames; int filename_cnt; bool verbose; + enum resfmt out_fmt; struct verif_stats *prog_stats; int prog_stat_cnt; @@ -78,8 +85,9 @@ const char argp_program_doc[] = static const struct argp_option opts[] = { { NULL, 'h', NULL, OPTION_HIDDEN, "Show the full help" }, { "verbose", 'v', NULL, 0, "Verbose mode" }, - { "output", 'o', "SPEC", 0, "Specify output stats" }, + { "emit", 'e', "SPEC", 0, "Specify stats to be emitted" }, { "sort", 's', "SPEC", 0, "Specify sort order" }, + { "output-format", 'o', "FMT", 0, "Result output format (table, csv), default is table." }, {}, }; @@ -97,7 +105,7 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state) case 'v': env.verbose = true; break; - case 'o': + case 'e': err = parse_stats(arg, &env.output_spec); if (err) return err; @@ -107,6 +115,16 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state) if (err) return err; break; + case 'o': + if (strcmp(arg, "table") == 0) { + env.out_fmt = RESFMT_TABLE; + } else if (strcmp(arg, "csv") == 0) { + env.out_fmt = RESFMT_CSV; + } else { + fprintf(stderr, "Unrecognized output format '%s'\n", arg); + return -EINVAL; + } + break; case ARGP_KEY_ARG: tmp = realloc(env.filenames, (env.filename_cnt + 1) * sizeof(*env.filenames)); if (!tmp) @@ -147,7 +165,7 @@ static struct stat_def { [FILE_NAME] = { "File", {"file_name", "filename", "file"}, true /* asc */ }, [PROG_NAME] = { "Program", {"prog_name", "progname", "prog"}, true /* asc */ }, [VERDICT] = { "Verdict", {"verdict"}, true /* asc: failure, success */ }, - [DURATION] = { "Duration, us", {"duration", "dur"}, }, + [DURATION] = { "Duration (us)", {"duration", "dur"}, }, [TOTAL_INSNS] = { "Total insns", {"total_insns", "insns"}, }, [TOTAL_STATES] = { "Total states", {"total_states", "states"}, }, [PEAK_STATES] = { "Peak states", {"peak_states"}, }, @@ -385,7 +403,21 @@ static int cmp_prog_stats(const void *v1, const void *v2) #define HEADER_CHAR '-' #define COLUMN_SEP " " -static void output_headers(bool calc_len) +static void output_header_underlines(void) +{ + int i, j, len; + + for (i = 0; i < env.output_spec.spec_cnt; i++) { + len = env.output_spec.lens[i]; + + printf("%s", i == 0 ? "" : COLUMN_SEP); + for (j = 0; j < len; j++) + printf("%c", HEADER_CHAR); + } + printf("\n"); +} + +static void output_headers(enum resfmt fmt) { int i, len; @@ -393,34 +425,30 @@ static void output_headers(bool calc_len) int id = env.output_spec.ids[i]; int *max_len = &env.output_spec.lens[i]; - if (calc_len) { + switch (fmt) { + case RESFMT_TABLE_CALCLEN: len = snprintf(NULL, 0, "%s", stat_defs[id].header); if (len > *max_len) *max_len = len; - } else { + break; + case RESFMT_TABLE: printf("%s%-*s", i == 0 ? "" : COLUMN_SEP, *max_len, stat_defs[id].header); + if (i == env.output_spec.spec_cnt - 1) + printf("\n"); + break; + case RESFMT_CSV: + printf("%s%s", i == 0 ? "" : ",", stat_defs[id].names[0]); + if (i == env.output_spec.spec_cnt - 1) + printf("\n"); + break; } } - if (!calc_len) - printf("\n"); + if (fmt == RESFMT_TABLE) + output_header_underlines(); } -static void output_header_underlines(void) -{ - int i, j, len; - - for (i = 0; i < env.output_spec.spec_cnt; i++) { - len = env.output_spec.lens[i]; - - printf("%s", i == 0 ? "" : COLUMN_SEP); - for (j = 0; j < len; j++) - printf("%c", HEADER_CHAR); - } - printf("\n"); -} - -static void output_stats(const struct verif_stats *s, bool calc_len) +static void output_stats(const struct verif_stats *s, enum resfmt fmt, bool last) { int i; @@ -453,23 +481,36 @@ static void output_stats(const struct verif_stats *s, bool calc_len) exit(1); } - if (calc_len) { + switch (fmt) { + case RESFMT_TABLE_CALCLEN: if (str) len = snprintf(NULL, 0, "%s", str); else len = snprintf(NULL, 0, "%ld", val); if (len > *max_len) *max_len = len; - } else { + break; + case RESFMT_TABLE: if (str) printf("%s%-*s", i == 0 ? "" : COLUMN_SEP, *max_len, str); else printf("%s%*ld", i == 0 ? "" : COLUMN_SEP, *max_len, val); + if (i == env.output_spec.spec_cnt - 1) + printf("\n"); + break; + case RESFMT_CSV: + if (str) + printf("%s%s", i == 0 ? "" : ",", str); + else + printf("%s%ld", i == 0 ? "" : ",", val); + if (i == env.output_spec.spec_cnt - 1) + printf("\n"); + break; } } - if (!calc_len) - printf("\n"); + if (last && fmt == RESFMT_TABLE) + output_header_underlines(); } int main(int argc, char **argv) @@ -505,20 +546,18 @@ int main(int argc, char **argv) qsort(env.prog_stats, env.prog_stat_cnt, sizeof(*env.prog_stats), cmp_prog_stats); - /* calculate column widths */ - output_headers(true); - for (i = 0; i < env.prog_stat_cnt; i++) { - output_stats(&env.prog_stats[i], true); + if (env.out_fmt == RESFMT_TABLE) { + /* calculate column widths */ + output_headers(RESFMT_TABLE_CALCLEN); + for (i = 0; i < env.prog_stat_cnt; i++) + output_stats(&env.prog_stats[i], RESFMT_TABLE_CALCLEN, false); } /* actually output the table */ - output_headers(false); - output_header_underlines(); + output_headers(env.out_fmt); for (i = 0; i < env.prog_stat_cnt; i++) { - output_stats(&env.prog_stats[i], false); + output_stats(&env.prog_stats[i], env.out_fmt, i == env.prog_stat_cnt - 1); } - output_header_underlines(); - printf("\n"); printf("Done. Processed %d object files, %d programs.\n", env.filename_cnt, env.prog_stat_cnt); From patchwork Wed Sep 21 16:42:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 12983981 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA5A1C6FA8B for ; Wed, 21 Sep 2022 16:49:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231444AbiIUQty convert rfc822-to-8bit (ORCPT ); Wed, 21 Sep 2022 12:49:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50538 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231473AbiIUQt1 (ORCPT ); Wed, 21 Sep 2022 12:49:27 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EC1759A9E0 for ; Wed, 21 Sep 2022 09:43:09 -0700 (PDT) Received: from pps.filterd (m0148460.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 28LFUkGn006908 for ; Wed, 21 Sep 2022 09:43:09 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3jr18hac5f-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 21 Sep 2022 09:43:09 -0700 Received: from twshared15978.04.prn5.facebook.com (2620:10d:c085:208::11) by mail.thefacebook.com (2620:10d:c085:21d::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Wed, 21 Sep 2022 09:43:07 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 21F111F2B4F9E; Wed, 21 Sep 2022 09:43:01 -0700 (PDT) From: Andrii Nakryiko To: , , CC: , Subject: [PATCH v2 bpf-next 3/4] selftests/bpf: add comparison mode to veristat Date: Wed, 21 Sep 2022 09:42:53 -0700 Message-ID: <20220921164254.3630690-4-andrii@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220921164254.3630690-1-andrii@kernel.org> References: <20220921164254.3630690-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: qJEbQiVa6IcEUAoKtKwK4kTxGXq79SPw X-Proofpoint-GUID: qJEbQiVa6IcEUAoKtKwK4kTxGXq79SPw X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.528,FMLib:17.11.122.1 definitions=2022-09-21_09,2022-09-20_02,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Add ability to compare and contrast two veristat runs, previously recorded with veristat using CSV output format. When veristat is called with -C (--compare) flag, veristat expects exactly two input files specified, both should be in CSV format. Expectation is that it's output from previous veristat runs, but as long as column names and formats match, it should just work. First CSV file is designated as a "baseline" provided, and the second one is comparison (experiment) data set. Establishing baseline matters later when calculating difference percentages, see below. Veristat parses these two CSV files and "reconstructs" verifier stats (it could be just a subset of all possible stats). File and program names are mandatory as they are used as joining key (these two "stats" are designated as "key stats" in the code). Veristat currently enforces that the set of stats recorded in both CSV has to exactly match, down to exact order. This is just a simplifying condition which can be lifted with a bit of additional pre-processing to reorded stat specs internally, which I didn't bother doing, yet. For all the non-key stats, veristat will output three columns: one for baseline data, one for comparison data, and one with an absolute and relative percentage difference. If either baseline or comparison values are missing (that is, respective CSV file doesn't have a row with *exactly* matching file and program name), those values are assumed to be empty or zero. In such case relative percentages are forced to +100% or -100% output, for consistency with a typical case. Veristat's -e (--emit) and -s (--sort) specs still apply, so even if CSV contains lots of stats, user can request to compare only a subset of them (and specify desired column order as well). Similarly, both CSV and human-readable table output is honored. Note that input is currently always expected to be CSV. Here's an example shell session, recording data for biosnoop tool on two different kernels and comparing them afterwards, outputting data in table format. # on slightly older production kernel $ sudo ./veristat biosnoop_bpf.o File Program Verdict Duration (us) Total insns Total states Peak states -------------- ------------------------ ------- ------------- ----------- ------------ ----------- biosnoop_bpf.o blk_account_io_merge_bio success 37 24 1 1 biosnoop_bpf.o blk_account_io_start failure 0 0 0 0 biosnoop_bpf.o block_rq_complete success 76 104 6 6 biosnoop_bpf.o block_rq_insert success 83 85 7 7 biosnoop_bpf.o block_rq_issue success 79 85 7 7 -------------- ------------------------ ------- ------------- ----------- ------------ ----------- Done. Processed 1 object files, 5 programs. $ sudo ./veristat ~/local/tmp/fbcode-bpf-objs/biosnoop_bpf.o -o csv > baseline.csv $ cat baseline.csv file_name,prog_name,verdict,duration,total_insns,total_states,peak_states biosnoop_bpf.o,blk_account_io_merge_bio,success,36,24,1,1 biosnoop_bpf.o,blk_account_io_start,failure,0,0,0,0 biosnoop_bpf.o,block_rq_complete,success,82,104,6,6 biosnoop_bpf.o,block_rq_insert,success,78,85,7,7 biosnoop_bpf.o,block_rq_issue,success,74,85,7,7 # on latest bpf-next kernel $ sudo ./veristat biosnoop_bpf.o File Program Verdict Duration (us) Total insns Total states Peak states -------------- ------------------------ ------- ------------- ----------- ------------ ----------- biosnoop_bpf.o blk_account_io_merge_bio success 31 24 1 1 biosnoop_bpf.o blk_account_io_start failure 0 0 0 0 biosnoop_bpf.o block_rq_complete success 76 104 6 6 biosnoop_bpf.o block_rq_insert success 83 91 7 7 biosnoop_bpf.o block_rq_issue success 74 91 7 7 -------------- ------------------------ ------- ------------- ----------- ------------ ----------- Done. Processed 1 object files, 5 programs. $ sudo ./veristat biosnoop_bpf.o -o csv > comparison.csv $ cat comparison.csv file_name,prog_name,verdict,duration,total_insns,total_states,peak_states biosnoop_bpf.o,blk_account_io_merge_bio,success,71,24,1,1 biosnoop_bpf.o,blk_account_io_start,failure,0,0,0,0 biosnoop_bpf.o,block_rq_complete,success,82,104,6,6 biosnoop_bpf.o,block_rq_insert,success,83,91,7,7 biosnoop_bpf.o,block_rq_issue,success,87,91,7,7 # now let's compare with human-readable output (note that no sudo needed) # we also ignore verification duration in this case to shortned output $ ./veristat -C baseline.csv comparison.csv -e file,prog,verdict,insns File Program Verdict (A) Verdict (B) Verdict (DIFF) Total insns (A) Total insns (B) Total insns (DIFF) -------------- ------------------------ ----------- ----------- -------------- --------------- --------------- ------------------ biosnoop_bpf.o blk_account_io_merge_bio success success MATCH 24 24 +0 (+0.00%) biosnoop_bpf.o blk_account_io_start failure failure MATCH 0 0 +0 (+100.00%) biosnoop_bpf.o block_rq_complete success success MATCH 104 104 +0 (+0.00%) biosnoop_bpf.o block_rq_insert success success MATCH 91 85 -6 (-6.59%) biosnoop_bpf.o block_rq_issue success success MATCH 91 85 -6 (-6.59%) -------------- ------------------------ ----------- ----------- -------------- --------------- --------------- ------------------ While not particularly exciting example (it turned out to be kind of hard to quickly find a nice example with significant difference just because of kernel version bump), it should demonstrate main features. Signed-off-by: Andrii Nakryiko --- tools/testing/selftests/bpf/veristat.c | 543 ++++++++++++++++++++++--- 1 file changed, 492 insertions(+), 51 deletions(-) diff --git a/tools/testing/selftests/bpf/veristat.c b/tools/testing/selftests/bpf/veristat.c index 0472bfae3c9d..c6837bac357f 100644 --- a/tools/testing/selftests/bpf/veristat.c +++ b/tools/testing/selftests/bpf/veristat.c @@ -43,7 +43,7 @@ struct stat_specs { int spec_cnt; enum stat_id ids[ALL_STATS_CNT]; bool asc[ALL_STATS_CNT]; - int lens[ALL_STATS_CNT]; + int lens[ALL_STATS_CNT * 3]; /* 3x for comparison mode */ }; enum resfmt { @@ -57,16 +57,20 @@ static struct env { int filename_cnt; bool verbose; enum resfmt out_fmt; + bool comparison_mode; struct verif_stats *prog_stats; int prog_stat_cnt; + /* baseline_stats is allocated and used only in comparsion mode */ + struct verif_stats *baseline_stats; + int baseline_stat_cnt; + struct stat_specs output_spec; struct stat_specs sort_spec; } env; -static int libbpf_print_fn(enum libbpf_print_level level, - const char *format, va_list args) +static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args) { if (!env.verbose) return 0; @@ -78,9 +82,10 @@ static int libbpf_print_fn(enum libbpf_print_level level, const char *argp_program_version = "veristat"; const char *argp_program_bug_address = ""; const char argp_program_doc[] = -"veristat BPF verifier stats collection tool.\n" +"veristat BPF verifier stats collection and comparison tool.\n" "\n" -"USAGE: veristat [...]\n"; +"USAGE: veristat [...]\n" +" OR: veristat -C \n"; static const struct argp_option opts[] = { { NULL, 'h', NULL, OPTION_HIDDEN, "Show the full help" }, @@ -88,6 +93,7 @@ static const struct argp_option opts[] = { { "emit", 'e', "SPEC", 0, "Specify stats to be emitted" }, { "sort", 's', "SPEC", 0, "Specify sort order" }, { "output-format", 'o', "FMT", 0, "Result output format (table, csv), default is table." }, + { "compare", 'C', NULL, 0, "Comparison mode" }, {}, }; @@ -125,6 +131,9 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state) return -EINVAL; } break; + case 'C': + env.comparison_mode = true; + break; case ARGP_KEY_ARG: tmp = realloc(env.filenames, (env.filename_cnt + 1) * sizeof(*env.filenames)); if (!tmp) @@ -141,6 +150,12 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state) return 0; } +static const struct argp argp = { + .options = opts, + .parser = parse_arg, + .doc = argp_program_doc, +}; + static const struct stat_specs default_output_spec = { .spec_cnt = 7, .ids = { @@ -219,6 +234,20 @@ static int parse_stats(const char *stats_str, struct stat_specs *specs) return 0; } +static void free_verif_stats(struct verif_stats *stats, size_t stat_cnt) +{ + int i; + + if (!stats) + return; + + for (i = 0; i < stat_cnt; i++) { + free(stats[i].file_name); + free(stats[i].prog_name); + } + free(stats); +} + static char verif_log_buf[64 * 1024]; static int parse_verif_log(const char *buf, size_t buf_sz, struct verif_stats *s) @@ -448,6 +477,33 @@ static void output_headers(enum resfmt fmt) output_header_underlines(); } +static void prepare_value(const struct verif_stats *s, enum stat_id id, + const char **str, long *val) +{ + switch (id) { + case FILE_NAME: + *str = s->file_name; + break; + case PROG_NAME: + *str = s->prog_name; + break; + case VERDICT: + *str = s->stats[VERDICT] ? "success" : "failure"; + break; + case DURATION: + case TOTAL_INSNS: + case TOTAL_STATES: + case PEAK_STATES: + case MAX_STATES_PER_INSN: + case MARK_READ_MAX_LEN: + *val = s->stats[id]; + break; + default: + fprintf(stderr, "Unrecognized stat #%d\n", id); + exit(1); + } +} + static void output_stats(const struct verif_stats *s, enum resfmt fmt, bool last) { int i; @@ -458,28 +514,7 @@ static void output_stats(const struct verif_stats *s, enum resfmt fmt, bool last const char *str = NULL; long val = 0; - switch (id) { - case FILE_NAME: - str = s->file_name; - break; - case PROG_NAME: - str = s->prog_name; - break; - case VERDICT: - str = s->stats[VERDICT] ? "success" : "failure"; - break; - case DURATION: - case TOTAL_INSNS: - case TOTAL_STATES: - case PEAK_STATES: - case MAX_STATES_PER_INSN: - case MARK_READ_MAX_LEN: - val = s->stats[id]; - break; - default: - fprintf(stderr, "Unrecognized stat #%d\n", id); - exit(1); - } + prepare_value(s, id, &str, &val); switch (fmt) { case RESFMT_TABLE_CALCLEN: @@ -509,38 +544,28 @@ static void output_stats(const struct verif_stats *s, enum resfmt fmt, bool last } } - if (last && fmt == RESFMT_TABLE) + if (last && fmt == RESFMT_TABLE) { output_header_underlines(); + printf("Done. Processed %d object files, %d programs.\n", + env.filename_cnt, env.prog_stat_cnt); + } } -int main(int argc, char **argv) +static int handle_verif_mode(void) { - static const struct argp argp = { - .options = opts, - .parser = parse_arg, - .doc = argp_program_doc, - }; - int err = 0, i; - - if (argp_parse(&argp, argc, argv, 0, NULL, NULL)) - return 1; + int i, err; if (env.filename_cnt == 0) { fprintf(stderr, "Please provide path to BPF object file!\n"); argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat"); - return 1; + return -EINVAL; } - if (env.output_spec.spec_cnt == 0) - env.output_spec = default_output_spec; - if (env.sort_spec.spec_cnt == 0) - env.sort_spec = default_sort_spec; - for (i = 0; i < env.filename_cnt; i++) { err = process_obj(env.filenames[i]); if (err) { fprintf(stderr, "Failed to process '%s': %d\n", env.filenames[i], err); - goto cleanup; + return err; } } @@ -559,15 +584,431 @@ int main(int argc, char **argv) output_stats(&env.prog_stats[i], env.out_fmt, i == env.prog_stat_cnt - 1); } - printf("Done. Processed %d object files, %d programs.\n", - env.filename_cnt, env.prog_stat_cnt); + return 0; +} + +static int parse_stat_value(const char *str, enum stat_id id, struct verif_stats *st) +{ + switch (id) { + case FILE_NAME: + st->file_name = strdup(str); + if (!st->file_name) + return -ENOMEM; + break; + case PROG_NAME: + st->prog_name = strdup(str); + if (!st->prog_name) + return -ENOMEM; + break; + case VERDICT: + if (strcmp(str, "success") == 0) { + st->stats[VERDICT] = true; + } else if (strcmp(str, "failure") == 0) { + st->stats[VERDICT] = false; + } else { + fprintf(stderr, "Unrecognized verification verdict '%s'\n", str); + return -EINVAL; + } + break; + case DURATION: + case TOTAL_INSNS: + case TOTAL_STATES: + case PEAK_STATES: + case MAX_STATES_PER_INSN: + case MARK_READ_MAX_LEN: { + long val; + int err, n; + + if (sscanf(str, "%ld %n", &val, &n) != 1 || n != strlen(str)) { + err = -errno; + fprintf(stderr, "Failed to parse '%s' as integer\n", str); + return err; + } + + st->stats[id] = val; + break; + } + default: + fprintf(stderr, "Unrecognized stat #%d\n", id); + return -EINVAL; + } + return 0; +} + +static int parse_stats_csv(const char *filename, struct stat_specs *specs, + struct verif_stats **statsp, int *stat_cntp) +{ + char line[4096]; + FILE *f; + int err = 0; + bool header = true; + + f = fopen(filename, "r"); + if (!f) { + err = -errno; + fprintf(stderr, "Failed to open '%s': %d\n", filename, err); + return err; + } + + *stat_cntp = 0; + + while (fgets(line, sizeof(line), f)) { + char *input = line, *state = NULL, *next; + struct verif_stats *st = NULL; + int col = 0; + + if (!header) { + void *tmp; + + tmp = realloc(*statsp, (*stat_cntp + 1) * sizeof(**statsp)); + if (!tmp) { + err = -ENOMEM; + goto cleanup; + } + *statsp = tmp; + st = &(*statsp)[*stat_cntp]; + *stat_cntp += 1; + } + + while ((next = strtok_r(state ? NULL : input, ",\n", &state))) { + if (header) { + /* for the first line, set up spec stats */ + err = parse_stat(next, specs); + if (err) + goto cleanup; + continue; + } + + /* for all other lines, parse values based on spec */ + if (col >= specs->spec_cnt) { + fprintf(stderr, "Found extraneous column #%d in row #%d of '%s'\n", + col, *stat_cntp, filename); + err = -EINVAL; + goto cleanup; + } + err = parse_stat_value(next, specs->ids[col], st); + if (err) + goto cleanup; + col++; + } + + if (!header && col < specs->spec_cnt) { + fprintf(stderr, "Not enough columns in row #%d in '%s'\n", + *stat_cntp, filename); + err = -EINVAL; + goto cleanup; + } + + header = false; + } + + if (!feof(f)) { + err = -errno; + fprintf(stderr, "Failed I/O for '%s': %d\n", filename, err); + } cleanup: - for (i = 0; i < env.prog_stat_cnt; i++) { - free(env.prog_stats[i].file_name); - free(env.prog_stats[i].prog_name); + fclose(f); + return err; +} + +/* empty/zero stats for mismatched rows */ +static const struct verif_stats fallback_stats = { .file_name = "", .prog_name = "" }; + +static bool is_key_stat(enum stat_id id) +{ + return id == FILE_NAME || id == PROG_NAME; +} + +static void output_comp_header_underlines(void) +{ + int i, j, k; + + for (i = 0; i < env.output_spec.spec_cnt; i++) { + int id = env.output_spec.ids[i]; + int max_j = is_key_stat(id) ? 1 : 3; + + for (j = 0; j < max_j; j++) { + int len = env.output_spec.lens[3 * i + j]; + + printf("%s", i + j == 0 ? "" : COLUMN_SEP); + + for (k = 0; k < len; k++) + printf("%c", HEADER_CHAR); + } + } + printf("\n"); +} + +static void output_comp_headers(enum resfmt fmt) +{ + static const char *table_sfxs[3] = {" (A)", " (B)", " (DIFF)"}; + static const char *name_sfxs[3] = {"_base", "_comp", "_diff"}; + int i, j, len; + + for (i = 0; i < env.output_spec.spec_cnt; i++) { + int id = env.output_spec.ids[i]; + /* key stats don't have A/B/DIFF columns, they are common for both data sets */ + int max_j = is_key_stat(id) ? 1 : 3; + + for (j = 0; j < max_j; j++) { + int *max_len = &env.output_spec.lens[3 * i + j]; + bool last = (i == env.output_spec.spec_cnt - 1) && (j == max_j - 1); + const char *sfx; + + switch (fmt) { + case RESFMT_TABLE_CALCLEN: + sfx = is_key_stat(id) ? "" : table_sfxs[j]; + len = snprintf(NULL, 0, "%s%s", stat_defs[id].header, sfx); + if (len > *max_len) + *max_len = len; + break; + case RESFMT_TABLE: + sfx = is_key_stat(id) ? "" : table_sfxs[j]; + printf("%s%-*s%s", i + j == 0 ? "" : COLUMN_SEP, + *max_len - (int)strlen(sfx), stat_defs[id].header, sfx); + if (last) + printf("\n"); + break; + case RESFMT_CSV: + sfx = is_key_stat(id) ? "" : name_sfxs[j]; + printf("%s%s%s", i + j == 0 ? "" : ",", stat_defs[id].names[0], sfx); + if (last) + printf("\n"); + break; + } + } + } + + if (fmt == RESFMT_TABLE) + output_comp_header_underlines(); +} + +static void output_comp_stats(const struct verif_stats *base, const struct verif_stats *comp, + enum resfmt fmt, bool last) +{ + char base_buf[1024] = {}, comp_buf[1024] = {}, diff_buf[1024] = {}; + int i; + + for (i = 0; i < env.output_spec.spec_cnt; i++) { + int id = env.output_spec.ids[i], len; + int *max_len_base = &env.output_spec.lens[3 * i + 0]; + int *max_len_comp = &env.output_spec.lens[3 * i + 1]; + int *max_len_diff = &env.output_spec.lens[3 * i + 2]; + const char *base_str = NULL, *comp_str = NULL; + long base_val = 0, comp_val = 0, diff_val = 0; + + prepare_value(base, id, &base_str, &base_val); + prepare_value(comp, id, &comp_str, &comp_val); + + /* normalize all the outputs to be in string buffers for simplicity */ + if (is_key_stat(id)) { + /* key stats (file and program name) are always strings */ + if (base != &fallback_stats) + snprintf(base_buf, sizeof(base_buf), "%s", base_str); + else + snprintf(base_buf, sizeof(base_buf), "%s", comp_str); + } else if (base_str) { + snprintf(base_buf, sizeof(base_buf), "%s", base_str); + snprintf(comp_buf, sizeof(comp_buf), "%s", comp_str); + if (strcmp(base_str, comp_str) == 0) + snprintf(diff_buf, sizeof(diff_buf), "%s", "MATCH"); + else + snprintf(diff_buf, sizeof(diff_buf), "%s", "MISMATCH"); + } else { + snprintf(base_buf, sizeof(base_buf), "%ld", base_val); + snprintf(comp_buf, sizeof(comp_buf), "%ld", comp_val); + + diff_val = comp_val - base_val; + if (base == &fallback_stats || comp == &fallback_stats || base_val == 0) { + snprintf(diff_buf, sizeof(diff_buf), "%+ld (%+.2lf%%)", + diff_val, comp_val < base_val ? -100.0 : 100.0); + } else { + snprintf(diff_buf, sizeof(diff_buf), "%+ld (%+.2lf%%)", + diff_val, diff_val * 100.0 / base_val); + } + } + + switch (fmt) { + case RESFMT_TABLE_CALCLEN: + len = strlen(base_buf); + if (len > *max_len_base) + *max_len_base = len; + if (!is_key_stat(id)) { + len = strlen(comp_buf); + if (len > *max_len_comp) + *max_len_comp = len; + len = strlen(diff_buf); + if (len > *max_len_diff) + *max_len_diff = len; + } + break; + case RESFMT_TABLE: { + /* string outputs are left-aligned, number outputs are right-aligned */ + const char *fmt = base_str ? "%s%-*s" : "%s%*s"; + + printf(fmt, i == 0 ? "" : COLUMN_SEP, *max_len_base, base_buf); + if (!is_key_stat(id)) { + printf(fmt, COLUMN_SEP, *max_len_comp, comp_buf); + printf(fmt, COLUMN_SEP, *max_len_diff, diff_buf); + } + if (i == env.output_spec.spec_cnt - 1) + printf("\n"); + break; + } + case RESFMT_CSV: + printf("%s%s", i == 0 ? "" : ",", base_buf); + if (!is_key_stat(id)) { + printf("%s%s", i == 0 ? "" : ",", comp_buf); + printf("%s%s", i == 0 ? "" : ",", diff_buf); + } + if (i == env.output_spec.spec_cnt - 1) + printf("\n"); + break; + } + } + + if (last && fmt == RESFMT_TABLE) + output_comp_header_underlines(); +} + +static int cmp_stats_key(const struct verif_stats *base, const struct verif_stats *comp) +{ + int r; + + r = strcmp(base->file_name, comp->file_name); + if (r != 0) + return r; + return strcmp(base->prog_name, comp->prog_name); +} + +static int handle_comparison_mode(void) +{ + struct stat_specs base_specs = {}, comp_specs = {}; + enum resfmt cur_fmt; + int err, i, j; + + if (env.filename_cnt != 2) { + fprintf(stderr, "Comparison mode expects exactly two input CSV files!\n"); + argp_help(&argp, stderr, ARGP_HELP_USAGE, "veristat"); + return -EINVAL; + } + + err = parse_stats_csv(env.filenames[0], &base_specs, + &env.baseline_stats, &env.baseline_stat_cnt); + if (err) { + fprintf(stderr, "Failed to parse stats from '%s': %d\n", env.filenames[0], err); + return err; + } + err = parse_stats_csv(env.filenames[1], &comp_specs, + &env.prog_stats, &env.prog_stat_cnt); + if (err) { + fprintf(stderr, "Failed to parse stats from '%s': %d\n", env.filenames[1], err); + return err; } - free(env.prog_stats); + + /* To keep it simple we validate that the set and order of stats in + * both CSVs are exactly the same. This can be lifted with a bit more + * pre-processing later. + */ + if (base_specs.spec_cnt != comp_specs.spec_cnt) { + fprintf(stderr, "Number of stats in '%s' and '%s' differs (%d != %d)!\n", + env.filenames[0], env.filenames[1], + base_specs.spec_cnt, comp_specs.spec_cnt); + return -EINVAL; + } + for (i = 0; i < base_specs.spec_cnt; i++) { + if (base_specs.ids[i] != comp_specs.ids[i]) { + fprintf(stderr, "Stats composition differs between '%s' and '%s' (%s != %s)!\n", + env.filenames[0], env.filenames[1], + stat_defs[base_specs.ids[i]].names[0], + stat_defs[comp_specs.ids[i]].names[0]); + return -EINVAL; + } + } + + qsort(env.prog_stats, env.prog_stat_cnt, sizeof(*env.prog_stats), cmp_prog_stats); + qsort(env.baseline_stats, env.baseline_stat_cnt, sizeof(*env.baseline_stats), cmp_prog_stats); + + /* for human-readable table output we need to do extra pass to + * calculate column widths, so we substitute current output format + * with RESFMT_TABLE_CALCLEN and later revert it back to RESFMT_TABLE + * and do everything again. + */ + if (env.out_fmt == RESFMT_TABLE) + cur_fmt = RESFMT_TABLE_CALCLEN; + else + cur_fmt = env.out_fmt; + +one_more_time: + output_comp_headers(cur_fmt); + + /* If baseline and comparison datasets have different subset of rows + * (we match by 'object + prog' as a unique key) then assume + * empty/missing/zero value for rows that are missing in the opposite + * data set + */ + i = j = 0; + while (i < env.baseline_stat_cnt || j < env.prog_stat_cnt) { + bool last = (i == env.baseline_stat_cnt - 1) || (j == env.prog_stat_cnt - 1); + const struct verif_stats *base, *comp; + int r; + + base = i < env.baseline_stat_cnt ? &env.baseline_stats[i] : &fallback_stats; + comp = j < env.prog_stat_cnt ? &env.prog_stats[j] : &fallback_stats; + + if (!base->file_name || !base->prog_name) { + fprintf(stderr, "Entry #%d in '%s' doesn't have file and/or program name specified!\n", + i, env.filenames[0]); + return -EINVAL; + } + if (!comp->file_name || !comp->prog_name) { + fprintf(stderr, "Entry #%d in '%s' doesn't have file and/or program name specified!\n", + j, env.filenames[1]); + return -EINVAL; + } + + r = cmp_stats_key(base, comp); + if (r == 0) { + output_comp_stats(base, comp, cur_fmt, last); + i++; + j++; + } else if (comp == &fallback_stats || r < 0) { + output_comp_stats(base, &fallback_stats, cur_fmt, last); + i++; + } else { + output_comp_stats(&fallback_stats, comp, cur_fmt, last); + j++; + } + } + + if (cur_fmt == RESFMT_TABLE_CALCLEN) { + cur_fmt = RESFMT_TABLE; + goto one_more_time; /* ... this time with feeling */ + } + + return 0; +} + +int main(int argc, char **argv) +{ + int err = 0, i; + + if (argp_parse(&argp, argc, argv, 0, NULL, NULL)) + return 1; + + if (env.output_spec.spec_cnt == 0) + env.output_spec = default_output_spec; + if (env.sort_spec.spec_cnt == 0) + env.sort_spec = default_sort_spec; + + if (env.comparison_mode) + err = handle_comparison_mode(); + else + err = handle_verif_mode(); + + free_verif_stats(env.prog_stats, env.prog_stat_cnt); + free_verif_stats(env.baseline_stats, env.baseline_stat_cnt); for (i = 0; i < env.filename_cnt; i++) free(env.filenames[i]); free(env.filenames); From patchwork Wed Sep 21 16:42:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 12983983 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 74A90ECAAD8 for ; Wed, 21 Sep 2022 16:49:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231533AbiIUQt4 convert rfc822-to-8bit (ORCPT ); Wed, 21 Sep 2022 12:49:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50878 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231553AbiIUQtd (ORCPT ); Wed, 21 Sep 2022 12:49:33 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4F8AE9AF9D for ; Wed, 21 Sep 2022 09:43:43 -0700 (PDT) Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 28LFUjZJ015200 for ; Wed, 21 Sep 2022 09:43:42 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3jqbbub3q6-10 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 21 Sep 2022 09:43:42 -0700 Received: from snc-exhub201.TheFacebook.com (2620:10d:c085:21d::7) by snc-exhub103.TheFacebook.com (2620:10d:c085:11d::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Wed, 21 Sep 2022 09:43:37 -0700 Received: from twshared15978.04.prn5.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:21d::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Wed, 21 Sep 2022 09:43:07 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 3464E1F2B4FA7; Wed, 21 Sep 2022 09:43:03 -0700 (PDT) From: Andrii Nakryiko To: , , CC: , Subject: [PATCH v2 bpf-next 4/4] selftests/bpf: add ability to filter programs in veristat Date: Wed, 21 Sep 2022 09:42:54 -0700 Message-ID: <20220921164254.3630690-5-andrii@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220921164254.3630690-1-andrii@kernel.org> References: <20220921164254.3630690-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: SzJN1qZQOHn1y-lMuutnw4OfKQlJRe6S X-Proofpoint-GUID: SzJN1qZQOHn1y-lMuutnw4OfKQlJRe6S X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.528,FMLib:17.11.122.1 definitions=2022-09-21_09,2022-09-20_02,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Add -f (--filter) argument which accepts glob-based filters for narrowing down what BPF object files and programs within them should be processed by veristat. This filtering applies both to comparison and main (verification) mode. Filter can be of two forms: - file (object) filter: 'strobemeta*'; in this case all the programs within matching files are implicitly allowed (or denied, depending if it's positive or negative rule, see below); - file and prog filter: 'strobemeta*/*unroll*' will further filter programs within matching files to only allow those program names that match '*unroll*' glob. As mentioned, filters can be positive (allowlisting) and negative (denylisting). Negative filters should start with '!': '!strobemeta*' will deny any filename which basename starts with "strobemeta". Further, one extra special syntax is supported to allow more convenient use in practice. Instead of specifying rule on the command line, veristat allows to specify file that contains rules, both positive and negative, one line per one filter. This is achieved with -f @ use, where points to a text file containing rules (negative and positive rules can be mixed). For convenience empty lines and lines starting with '#' are ignored. This feature is useful to have some pre-canned list of object files and program names that are tested repeatedly, allowing to check in a list of rules and quickly specify them on the command line. As a demonstration (and a short cut for nearest future), create a small list of "interesting" BPF object files from selftests/bpf and commit it as veristat.cfg. It currently includes 73 programs, most of which are the most complex and largest BPF programs in selftests, as judged by total verified instruction count and verifier states total. If there is overlap between positive or negative filters, negative filter takes precedence (denylisting is stronger than allowlisting). If no allow filter is specified, veristat implicitly assumes '*/*' rule. If no deny rule is specified, veristat (logically) assumes no negative filters. Also note that -f (just like -e and -s) can be specified multiple times and their effect is cumulative. Signed-off-by: Andrii Nakryiko --- tools/testing/selftests/bpf/veristat.c | 212 ++++++++++++++++++++++- tools/testing/selftests/bpf/veristat.cfg | 17 ++ 2 files changed, 227 insertions(+), 2 deletions(-) create mode 100644 tools/testing/selftests/bpf/veristat.cfg diff --git a/tools/testing/selftests/bpf/veristat.c b/tools/testing/selftests/bpf/veristat.c index c6837bac357f..51030234b60a 100644 --- a/tools/testing/selftests/bpf/veristat.c +++ b/tools/testing/selftests/bpf/veristat.c @@ -52,6 +52,11 @@ enum resfmt { RESFMT_CSV, }; +struct filter { + char *file_glob; + char *prog_glob; +}; + static struct env { char **filenames; int filename_cnt; @@ -68,6 +73,11 @@ static struct env { struct stat_specs output_spec; struct stat_specs sort_spec; + + struct filter *allow_filters; + struct filter *deny_filters; + int allow_filter_cnt; + int deny_filter_cnt; } env; static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args) @@ -94,10 +104,13 @@ static const struct argp_option opts[] = { { "sort", 's', "SPEC", 0, "Specify sort order" }, { "output-format", 'o', "FMT", 0, "Result output format (table, csv), default is table." }, { "compare", 'C', NULL, 0, "Comparison mode" }, + { "filter", 'f', "FILTER", 0, "Filter expressions (or @filename for file with expressions)." }, {}, }; static int parse_stats(const char *stats_str, struct stat_specs *specs); +static int append_filter(struct filter **filters, int *cnt, const char *str); +static int append_filter_file(const char *path); static error_t parse_arg(int key, char *arg, struct argp_state *state) { @@ -134,6 +147,18 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state) case 'C': env.comparison_mode = true; break; + case 'f': + if (arg[0] == '@') + err = append_filter_file(arg + 1); + else if (arg[0] == '!') + err = append_filter(&env.deny_filters, &env.deny_filter_cnt, arg + 1); + else + err = append_filter(&env.allow_filters, &env.allow_filter_cnt, arg); + if (err) { + fprintf(stderr, "Failed to collect program filter expressions: %d\n", err); + return err; + } + break; case ARGP_KEY_ARG: tmp = realloc(env.filenames, (env.filename_cnt + 1) * sizeof(*env.filenames)); if (!tmp) @@ -156,6 +181,150 @@ static const struct argp argp = { .doc = argp_program_doc, }; + +/* Adapted from perf/util/string.c */ +static bool glob_matches(const char *str, const char *pat) +{ + while (*str && *pat && *pat != '*') { + if (*str != *pat) + return false; + str++; + pat++; + } + /* Check wild card */ + if (*pat == '*') { + while (*pat == '*') + pat++; + if (!*pat) /* Tail wild card matches all */ + return true; + while (*str) + if (glob_matches(str++, pat)) + return true; + } + return !*str && !*pat; +} + +static bool should_process_file(const char *filename) +{ + int i; + + if (env.deny_filter_cnt > 0) { + for (i = 0; i < env.deny_filter_cnt; i++) { + if (glob_matches(filename, env.deny_filters[i].file_glob)) + return false; + } + } + + if (env.allow_filter_cnt == 0) + return true; + + for (i = 0; i < env.allow_filter_cnt; i++) { + if (glob_matches(filename, env.allow_filters[i].file_glob)) + return true; + } + + return false; +} + +static bool should_process_prog(const char *filename, const char *prog_name) +{ + int i; + + if (env.deny_filter_cnt > 0) { + for (i = 0; i < env.deny_filter_cnt; i++) { + if (glob_matches(filename, env.deny_filters[i].file_glob)) + return false; + if (!env.deny_filters[i].prog_glob) + continue; + if (glob_matches(prog_name, env.deny_filters[i].prog_glob)) + return false; + } + } + + if (env.allow_filter_cnt == 0) + return true; + + for (i = 0; i < env.allow_filter_cnt; i++) { + if (!glob_matches(filename, env.allow_filters[i].file_glob)) + continue; + /* if filter specifies only filename glob part, it implicitly + * allows all progs within that file + */ + if (!env.allow_filters[i].prog_glob) + return true; + if (glob_matches(prog_name, env.allow_filters[i].prog_glob)) + return true; + } + + return false; +} + +static int append_filter(struct filter **filters, int *cnt, const char *str) +{ + struct filter *f; + void *tmp; + const char *p; + + tmp = realloc(*filters, (*cnt + 1) * sizeof(**filters)); + if (!tmp) + return -ENOMEM; + *filters = tmp; + + f = &(*filters)[*cnt]; + f->file_glob = f->prog_glob = NULL; + + /* filter can be specified either as "" or "/" */ + p = strchr(str, '/'); + if (!p) { + f->file_glob = strdup(str); + if (!f->file_glob) + return -ENOMEM; + } else { + f->file_glob = strndup(str, p - str); + f->prog_glob = strdup(p + 1); + if (!f->file_glob || !f->prog_glob) { + free(f->file_glob); + free(f->prog_glob); + f->file_glob = f->prog_glob = NULL; + return -ENOMEM; + } + } + + *cnt = *cnt + 1; + return 0; +} + +static int append_filter_file(const char *path) +{ + char buf[1024]; + FILE *f; + int err = 0; + + f = fopen(path, "r"); + if (!f) { + err = -errno; + fprintf(stderr, "Failed to open '%s': %d\n", path, err); + return err; + } + + while (fscanf(f, " %1023[^\n]\n", buf) == 1) { + /* lines starting with # are comments, skip them */ + if (buf[0] == '\0' || buf[0] == '#') + continue; + /* lines starting with ! are negative match filters */ + if (buf[0] == '!') + err = append_filter(&env.deny_filters, &env.deny_filter_cnt, buf + 1); + else + err = append_filter(&env.allow_filters, &env.allow_filter_cnt, buf); + if (err) + goto cleanup; + } + +cleanup: + fclose(f); + return err; +} + static const struct stat_specs default_output_spec = { .spec_cnt = 7, .ids = { @@ -283,6 +452,9 @@ static int process_prog(const char *filename, struct bpf_object *obj, struct bpf int err = 0; void *tmp; + if (!should_process_prog(basename(filename), bpf_program__name(prog))) + return 0; + tmp = realloc(env.prog_stats, (env.prog_stat_cnt + 1) * sizeof(*env.prog_stats)); if (!tmp) return -ENOMEM; @@ -330,6 +502,9 @@ static int process_obj(const char *filename) LIBBPF_OPTS(bpf_object_open_opts, opts); int err = 0, prog_cnt = 0; + if (!should_process_file(basename(filename))) + return 0; + old_libbpf_print_fn = libbpf_set_print(libbpf_print_fn); obj = bpf_object__open_file(filename, &opts); @@ -666,7 +841,10 @@ static int parse_stats_csv(const char *filename, struct stat_specs *specs, goto cleanup; } *statsp = tmp; + st = &(*statsp)[*stat_cntp]; + memset(st, 0, sizeof(*st)); + *stat_cntp += 1; } @@ -692,14 +870,34 @@ static int parse_stats_csv(const char *filename, struct stat_specs *specs, col++; } - if (!header && col < specs->spec_cnt) { + if (header) { + header = false; + continue; + } + + if (col < specs->spec_cnt) { fprintf(stderr, "Not enough columns in row #%d in '%s'\n", *stat_cntp, filename); err = -EINVAL; goto cleanup; } - header = false; + if (!st->file_name || !st->prog_name) { + fprintf(stderr, "Row #%d in '%s' is missing file and/or program name\n", + *stat_cntp, filename); + err = -EINVAL; + goto cleanup; + } + + /* in comparison mode we can only check filters after we + * parsed entire line; if row should be ignored we pretend we + * never parsed it + */ + if (!should_process_prog(st->file_name, st->prog_name)) { + free(st->file_name); + free(st->prog_name); + *stat_cntp -= 1; + } } if (!feof(f)) { @@ -1012,5 +1210,15 @@ int main(int argc, char **argv) for (i = 0; i < env.filename_cnt; i++) free(env.filenames[i]); free(env.filenames); + for (i = 0; i < env.allow_filter_cnt; i++) { + free(env.allow_filters[i].file_glob); + free(env.allow_filters[i].prog_glob); + } + free(env.allow_filters); + for (i = 0; i < env.deny_filter_cnt; i++) { + free(env.deny_filters[i].file_glob); + free(env.deny_filters[i].prog_glob); + } + free(env.deny_filters); return -err; } diff --git a/tools/testing/selftests/bpf/veristat.cfg b/tools/testing/selftests/bpf/veristat.cfg new file mode 100644 index 000000000000..1a385061618d --- /dev/null +++ b/tools/testing/selftests/bpf/veristat.cfg @@ -0,0 +1,17 @@ +# pre-canned list of rather complex selftests/bpf BPF object files to monitor +# BPF verifier's performance on +bpf_flow* +bpf_loop_bench* +loop* +netif_receive_skb* +profiler* +pyperf* +strobemeta* +test_cls_redirect* +test_l4lb +test_sysctl* +test_tcp_hdr_* +test_usdt* +test_verif_scale* +test_xdp_noinline* +xdp_synproxy*