From patchwork Thu Nov 18 01:04:04 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Joanne Koong
X-Patchwork-Id: 12625817
X-Patchwork-Delegate: bpf@iogearbox.net
From: Joanne Koong
Subject: [PATCH bpf-next 3/3] selftest/bpf/benchs: add bpf_for_each benchmark
Date: Wed, 17 Nov 2021 17:04:04 -0800
Message-ID: <20211118010404.2415864-4-joannekoong@fb.com>
In-Reply-To: <20211118010404.2415864-1-joannekoong@fb.com>
References: <20211118010404.2415864-1-joannekoong@fb.com>
X-Mailing-List: bpf@vger.kernel.org

Add benchmark to measure the overhead of the bpf_for_each call
for a specified number of iterations.

Testing this on qemu on my dev machine on 1 thread, the data
is as follows:

        nr_iterations: 1
        bpf_for_each helper - total callbacks called:  42.949 ± 1.404M/s

        nr_iterations: 10
        bpf_for_each helper - total callbacks called:  73.645 ± 2.077M/s

        nr_iterations: 100
        bpf_for_each helper - total callbacks called:  73.058 ± 1.256M/s

        nr_iterations: 500
        bpf_for_each helper - total callbacks called:  78.255 ± 2.845M/s

        nr_iterations: 1000
        bpf_for_each helper - total callbacks called:  79.439 ± 1.805M/s

        nr_iterations: 5000
        bpf_for_each helper - total callbacks called:  81.639 ± 2.053M/s

        nr_iterations: 10000
        bpf_for_each helper - total callbacks called:  80.577 ± 1.824M/s

        nr_iterations: 50000
        bpf_for_each helper - total callbacks called:  76.773 ± 1.578M/s

        nr_iterations: 100000
        bpf_for_each helper - total callbacks called:  77.073 ± 2.200M/s

        nr_iterations: 500000
        bpf_for_each helper - total callbacks called:  75.136 ± 0.552M/s

        nr_iterations: 1000000
        bpf_for_each helper - total callbacks called:  76.364 ± 1.690M/s

From this data, we can see that we are able to run the loop at
least 40 million times per second on an empty callback function.

From this data, we can also see that as the number of iterations
increases, the overhead per iteration decreases and steadies towards
a constant value.

Signed-off-by: Joanne Koong
---
 tools/testing/selftests/bpf/Makefile          |   3 +-
 tools/testing/selftests/bpf/bench.c           |   4 +
 .../selftests/bpf/benchs/bench_for_each.c     | 105 ++++++++++++++++++
 .../bpf/benchs/run_bench_for_each.sh          |  16 +++
 .../selftests/bpf/progs/for_each_helper.c     |  13 +++
 5 files changed, 140 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_for_each.c
 create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_for_each.sh

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index f49cb5fc85af..b55fc72b8ef0 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -537,7 +537,8 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o $(OUTPUT)/testing_helpers.o \
 		 $(OUTPUT)/bench_rename.o \
 		 $(OUTPUT)/bench_trigger.o \
 		 $(OUTPUT)/bench_ringbufs.o \
-		 $(OUTPUT)/bench_bloom_filter_map.o
+		 $(OUTPUT)/bench_bloom_filter_map.o \
+		 $(OUTPUT)/bench_for_each.o
 	$(call msg,BINARY,,$@)
 	$(Q)$(CC) $(LDFLAGS) -o $@ $(filter %.a %.o,$^) $(LDLIBS)
 
diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
index cc4722f693e9..d8b3d537a700 100644
--- a/tools/testing/selftests/bpf/bench.c
+++ b/tools/testing/selftests/bpf/bench.c
@@ -171,10 +171,12 @@ static const struct argp_option opts[] = {
 
 extern struct argp bench_ringbufs_argp;
 extern struct argp bench_bloom_map_argp;
+extern struct argp bench_for_each_argp;
 
 static const struct argp_child bench_parsers[] = {
 	{ &bench_ringbufs_argp, 0, "Ring buffers benchmark", 0 },
 	{ &bench_bloom_map_argp, 0, "Bloom filter map benchmark", 0 },
+	{ &bench_for_each_argp, 0, "bpf_for_each helper benchmark", 0 },
 	{},
 };
 
@@ -368,6 +370,7 @@ extern const struct bench bench_bloom_update;
 extern const struct bench bench_bloom_false_positive;
 extern const struct bench bench_hashmap_without_bloom;
 extern const struct bench bench_hashmap_with_bloom;
+extern const struct bench bench_for_each_helper;
 
 static const struct bench *benchs[] = {
 	&bench_count_global,
@@ -394,6 +397,7 @@ static const struct bench *benchs[] = {
 	&bench_bloom_false_positive,
 	&bench_hashmap_without_bloom,
 	&bench_hashmap_with_bloom,
+	&bench_for_each_helper,
 };
 
 static void setup_benchmark()
diff --git a/tools/testing/selftests/bpf/benchs/bench_for_each.c b/tools/testing/selftests/bpf/benchs/bench_for_each.c
new file mode 100644
index 000000000000..3372d5b7d67b
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/bench_for_each.c
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Facebook */
+
+#include <argp.h>
+#include "bench.h"
+#include "for_each_helper.skel.h"
+
+/* BPF triggering benchmarks */
+static struct ctx {
+	struct for_each_helper *skel;
+} ctx;
+
+static struct {
+	__u32 nr_iters;
+} args = {
+	.nr_iters = 10,
+};
+
+enum {
+	ARG_NR_ITERS = 4000,
+};
+
+static const struct argp_option opts[] = {
+	{ "nr_iters", ARG_NR_ITERS, "nr_iters", 0,
+		"Set number of iterations for the bpf_for_each helper"},
+	{},
+};
+
+static error_t parse_arg(int key, char *arg, struct argp_state *state)
+{
+	switch (key) {
+	case ARG_NR_ITERS:
+		args.nr_iters = strtol(arg, NULL, 10);
+		break;
+	default:
+		return ARGP_ERR_UNKNOWN;
+	}
+
+	return 0;
+}
+
+/* exported into benchmark runner */
+const struct argp bench_for_each_argp = {
+	.options = opts,
+	.parser = parse_arg,
+};
+
+static void validate(void)
+{
+	if (env.consumer_cnt != 1) {
+		fprintf(stderr, "benchmark doesn't support multi-consumer!\n");
+		exit(1);
+	}
+}
+
+static void *producer(void *input)
+{
+	while (true)
+		/* trigger the bpf program */
+		syscall(__NR_getpgid);
+
+	return NULL;
+}
+
+static void *consumer(void *input)
+{
+	return NULL;
+}
+
+static void measure(struct bench_res *res)
+{
+	res->hits = atomic_swap(&ctx.skel->bss->hits, 0);
+}
+
+static void setup(void)
+{
+	struct bpf_link *link;
+
+	setup_libbpf();
+
+	ctx.skel = for_each_helper__open_and_load();
+	if (!ctx.skel) {
+		fprintf(stderr, "failed to open skeleton\n");
+		exit(1);
+	}
+
+	link = bpf_program__attach(ctx.skel->progs.benchmark);
+	if (!link) {
+		fprintf(stderr, "failed to attach program!\n");
+		exit(1);
+	}
+
+	ctx.skel->bss->nr_iterations = args.nr_iters;
+}
+
+const struct bench bench_for_each_helper = {
+	.name = "for-each-helper",
+	.validate = validate,
+	.setup = setup,
+	.producer_thread = producer,
+	.consumer_thread = consumer,
+	.measure = measure,
+	.report_progress = hits_drops_report_progress,
+	.report_final = hits_drops_report_final,
+};
diff --git a/tools/testing/selftests/bpf/benchs/run_bench_for_each.sh b/tools/testing/selftests/bpf/benchs/run_bench_for_each.sh
new file mode 100755
index 000000000000..5f11a1ad66d3
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/run_bench_for_each.sh
@@ -0,0 +1,16 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+source ./benchs/run_common.sh
+
+set -eufo pipefail
+
+for t in 1 4 8 12 16; do
+	printf "\n"
+	for i in 1 10 100 500 1000 5000 10000 50000 100000 500000 1000000; do
+		subtitle "nr_iterations: $i, nr_threads: $t"
+		summarize "bpf_for_each helper - total callbacks called: " \
+			"$($RUN_BENCH -p $t --nr_iters $i for-each-helper)"
+		printf "\n"
+	done
+done
diff --git a/tools/testing/selftests/bpf/progs/for_each_helper.c b/tools/testing/selftests/bpf/progs/for_each_helper.c
index 4404d0cb32a6..b95551d99f75 100644
--- a/tools/testing/selftests/bpf/progs/for_each_helper.c
+++ b/tools/testing/selftests/bpf/progs/for_each_helper.c
@@ -14,6 +14,8 @@ struct callback_ctx {
 u32 nr_iterations;
 u32 stop_index = -1;
 
+long hits;
+
 /* Making these global variables so that the userspace program
  * can verify the output through the skeleton
  */
@@ -67,3 +69,14 @@ int prog_invalid_flags(struct __sk_buff *skb)
 
 	return 0;
 }
+
+SEC("fentry/__x64_sys_getpgid")
+int benchmark(void *ctx)
+{
+	for (int i = 0; i < 1000; i++) {
+		bpf_for_each(nr_iterations, empty_callback_fn, NULL, 0);
+
+		__sync_add_and_fetch(&hits, nr_iterations);
+	}
+	return 0;
+}