From patchwork Tue Nov 29 00:03:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Brown
X-Patchwork-Id: 13058017
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 8BFFDC4321E
	for ; Tue, 29 Nov 2022 00:04:25 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S234279AbiK2AEY (ORCPT ); Mon, 28 Nov 2022 19:04:24 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39440 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S234805AbiK2AEX (ORCPT ); Mon, 28 Nov 2022 19:04:23 -0500
Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8059C1AD92
	for ; Mon, 28 Nov 2022 16:04:22 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dfw.source.kernel.org (Postfix) with ESMTPS id 28C2961511
	for ; Tue, 29 Nov 2022 00:04:22 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 26D4DC4347C;
	Tue, 29 Nov 2022 00:04:19 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1669680261;
	bh=NDM4IpK1Mj0AGBGcu4ocxuGv8IyHd9RxuVw91kBfjLo=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=R02H6km0mIoxwI3OT9rRXqBt7YZygebWS1uPDvA/Vl4sZ469lqyNpPWBzQ4IVdwr6
	 uup7bY4f8q/AYVGlFWd0e3AqzN0D95wCdW4mlbXVWhI79bDgmk9PHQNWs+sZQmrX9B
	 Y/2PFGskzzWo1cb2ZxHLeSh4r4WAUKdfd6wGP1XPfRshq9FZMtRHq95NOJR2x+9bWH
	 3opLYBZkBL1XKceV8I9EefYD0ENW82LPPVevCjmTYEIcXoOlNO3NkRhbhvoVsbkToU
	 c4Afhx7cobUf2lyjOLVGGecAC8l3kq0aT6yTmMsO/m1WBMLh3BPBQUBl+KF6GWPANe
	 VYop4bY57tu/Q==
From: Mark Brown
To: Catalin Marinas , Will
	Deacon , Shuah Khan
Cc: linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org,
	Mark Brown
Subject: [PATCH v1 1/3] kselftest/arm64: Hold fp-stress children until they're all spawned
Date: Tue, 29 Nov 2022 00:03:53 +0000
Message-Id: <20221129000355.812425-2-broonie@kernel.org>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221129000355.812425-1-broonie@kernel.org>
References: <20221129000355.812425-1-broonie@kernel.org>
MIME-Version: 1.0
X-Developer-Signature: v=1; a=openpgp-sha256; l=3511; i=broonie@kernel.org;
	h=from:subject; bh=NDM4IpK1Mj0AGBGcu4ocxuGv8IyHd9RxuVw91kBfjLo=;
	b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBjhUxoJLgwylCPMbz6VfJBPb5ktw1QGE8a0BHY8MTS
	 duAFTMqJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCY4VMaAAKCRAk1otyXVSH0Ie6B/
	 90SBqTu2fvepPiFTNRyZ8WWrDfM+sKbHwoKfxCPK0/YklTnACBRzPY5aUoHrPfLKqiPLBShlL1wlyN
	 D/hr32ZBjIPocjNKUVJLm+kqUn0YHp3hxANSJDadS8EVnjqehWnnmq7gvzeVVlsKwmFnwGs50SsBfr
	 62q2T3FiwfKiQhktVuVpwym5cb8WwmLJiC8ya6+Z8ID1rVZGJPZKQ0HXZ98NKhv/fTYXz0k5UEA8w7
	 HENfjH1jp2k0rlJxpkd/PSZYg+N6A7rkNDT2WPd8d/cM7MmLYP7nBHxa35NaIp/Mhdocd1s9oWQljN
	 bpLu3qDxHQoFD4GNXEtWup5rt8Groi
X-Developer-Key: i=broonie@kernel.org; a=openpgp;
	fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB
Precedence: bulk
List-ID:
X-Mailing-List: linux-kselftest@vger.kernel.org

At present fp-stress has a bit of a thundering herd problem since the
children it spawns start running immediately, meaning that they can start
starving the parent process of CPU before it has even started all the
children. This is much more severe on virtual platforms since they tend
to support far more SVE and SME vector lengths, are slower in general and
in some cases have performance issues when simulating multiple CPUs.

We can mitigate this problem by having all the child processes block
before starting the test program, meaning that we at least have all the
child processes started before we start heavily using CPU.
We still have the same load issues while waiting for the actual stress
test programs to start up and produce output but they're at least all
ready to go before that kicks in, resulting in substantial reductions in
overall runtime on some of the severely affected systems. One test was
showing about 20% improvement.

Signed-off-by: Mark Brown
---
 tools/testing/selftests/arm64/fp/fp-stress.c | 41 +++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/arm64/fp/fp-stress.c b/tools/testing/selftests/arm64/fp/fp-stress.c
index 4e62a9199f97..9a3a621cc958 100644
--- a/tools/testing/selftests/arm64/fp/fp-stress.c
+++ b/tools/testing/selftests/arm64/fp/fp-stress.c
@@ -44,6 +44,8 @@ static bool terminate;
 
 static void drain_output(bool flush);
 
+static int startup_pipe[2];
+
 static int num_processors(void)
 {
 	long nproc = sysconf(_SC_NPROCESSORS_CONF);
@@ -81,13 +83,37 @@ static void child_start(struct child_data *child, const char *program)
 			exit(EXIT_FAILURE);
 		}
 
+		/*
+		 * Duplicate the read side of the startup pipe to
+		 * FD 3 so we can close everything else.
+		 */
+		ret = dup2(startup_pipe[0], 3);
+		if (ret == -1) {
+			fprintf(stderr, "dup2() %d\n", errno);
+			exit(EXIT_FAILURE);
+		}
+
 		/*
 		 * Very dumb mechanism to clean open FDs other than
 		 * stdio. We don't want O_CLOEXEC for the pipes...
 		 */
-		for (i = 3; i < 8192; i++)
+		for (i = 4; i < 8192; i++)
 			close(i);
 
+		/*
+		 * Read from the startup pipe, there should be no data
+		 * and we should block until it is closed. We just
+		 * carry on on error since this isn't super critical.
+		 */
+		ret = read(3, &i, sizeof(i));
+		if (ret < 0)
+			fprintf(stderr, "read(startup pipe) failed: %s (%d)\n",
+				strerror(errno), errno);
+		if (ret > 0)
+			fprintf(stderr, "%d bytes of data on startup pipe\n",
+				ret);
+		close(3);
+
 		ret = execl(program, program, NULL);
 		fprintf(stderr, "execl(%s) failed: %d (%s)\n",
 			program, errno, strerror(errno));
@@ -465,6 +491,12 @@ int main(int argc, char **argv)
 				   strerror(errno), ret);
 	epoll_fd = ret;
 
+	/* Create a pipe which children will block on before execing */
+	ret = pipe(startup_pipe);
+	if (ret != 0)
+		ksft_exit_fail_msg("Failed to create startup pipe: %s (%d)\n",
+				   strerror(errno), errno);
+
 	/* Get signal handers ready before we start any children */
 	memset(&sa, 0, sizeof(sa));
 	sa.sa_sigaction = handle_exit_signal;
@@ -497,6 +529,13 @@ int main(int argc, char **argv)
 		}
 	}
 
+	/*
+	 * All children started, close the startup pipe and let them
+	 * run.
+	 */
+	close(startup_pipe[0]);
+	close(startup_pipe[1]);
+
 	for (;;) {
 		/* Did we get a signal asking us to exit? */
 		if (terminate)

From patchwork Tue Nov 29 00:03:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Brown
X-Patchwork-Id: 13058019
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 8EEC7C43217
	for ; Tue, 29 Nov 2022 00:04:28 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S234851AbiK2AE1 (ORCPT ); Mon, 28 Nov 2022 19:04:27 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39452 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S234812AbiK2AE0 (ORCPT ); Mon, 28 Nov 2022 19:04:26 -0500
Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 42202186D1
	for ; Mon, 28 Nov 2022 16:04:26 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ams.source.kernel.org (Postfix) with ESMTPS id B60C1B80FE9
	for ; Tue, 29 Nov 2022 00:04:24 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 06D87C433B5;
	Tue, 29 Nov 2022 00:04:21 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1669680263;
	bh=ffhxUDMKp+LpVrt45dMJeTgyberGxql4NgI1JZQLIGQ=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=oQuZcrpUNJDuk9BahSvRvIsixECtTA9JhfRT/VtJi5aWkggEDQTqpjj5nMGNnVPIX
	 BlNvq5AYq4qalttRB0TjNhCMBqLzf3Uz7lLuQSdZ4s8tKUYAua1rygutbNsZPJvp3K
	 G0o3jBwmWR+RKjPNqOSwFRXvJwLgepJDb4L306CYUug3yPTGioLYnNCuc6omdKFHWp
	 XTatiNuvRwuFK3Y4atMOGuhLes7GMYHvEda+VhPOsi/KXoFsCVGODgIpTEGYHHy5uu
	 AvD7vplIZOqHk0DuHXU1ewRLfjb5/aCGB7LA5ub/pIUYaukBDRAS4n5xRTgq8OFdc7
	 NpVd4Jkz9YcUQ==
From: Mark Brown
To: Catalin Marinas ,
	Will Deacon , Shuah Khan
Cc: linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org,
	Mark Brown
Subject: [PATCH v1 2/3] kselftest/arm64: Don't drain output while spawning children
Date: Tue, 29 Nov 2022 00:03:54 +0000
Message-Id: <20221129000355.812425-3-broonie@kernel.org>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221129000355.812425-1-broonie@kernel.org>
References: <20221129000355.812425-1-broonie@kernel.org>
MIME-Version: 1.0
X-Developer-Signature: v=1; a=openpgp-sha256; l=1139; i=broonie@kernel.org;
	h=from:subject; bh=ffhxUDMKp+LpVrt45dMJeTgyberGxql4NgI1JZQLIGQ=;
	b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBjhUxp5Y1PuXboQC8O3kWSuJde5MJwdGEvVVgHo8Wd
	 NuTmi0SJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCY4VMaQAKCRAk1otyXVSH0ODcB/
	 9wJH19WeEQFgzm9rfUCiHbEa7Dko4RXfIIdVdIKhoaCMLUpryUI37tS79NAThYeu3wDZfRU3jlF2ys
	 +FcJ7MYyEIosdgZmLWLX19x8+AVgHBixJWsxu/coAvxCHLkqseHoWzRgvqiqTEQ+HvsQqpb4+0rU5X
	 X8N3qiNeRc4ZlAbRacrddo/4THoXQ6KT8mOsDM5pj/Nlrb3iB2vV3cdrWJBTEgYkbQx5iCch9MTIDK
	 8+bQRFanHuC038YYLPFIfJbWvENebtR+PkmWwSiWlufpg523qPdLXHxwrWg9TLu6SK0FUgi+oZV9Q5
	 K08pzT5jzDQsRBW21UEtnRgB575dLd
X-Developer-Key: i=broonie@kernel.org; a=openpgp;
	fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB
Precedence: bulk
List-ID:
X-Mailing-List: linux-kselftest@vger.kernel.org

Now that we hold execution of the stress test programs until all children
are started, there is no need to drain output while that is happening.

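For readers unfamiliar with the hold mechanism this relies on, here is a
minimal standalone sketch of a pipe used as a start gate. It is not code
from fp-stress.c: spawn_gated_children() and its return convention are
hypothetical, and the children simply exit where fp-stress would exec the
test program. The sketch closes the child's write end explicitly, whereas
fp-stress gets the same effect from its loop closing FDs 4 and up.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from fp-stress.c: fork n_children
 * processes that all block on a shared startup pipe, release them
 * together by closing the pipe in the parent, then reap them.
 * Returns the number of children that saw a clean EOF, or -1 if the
 * pipe or any fork() failed.
 */
int spawn_gated_children(int n_children)
{
	int gate[2];
	int spawned = 0, released = 0, failed = 0;
	int i, status;
	ssize_t ret;
	char dummy;
	pid_t pid;

	assert(n_children > 0);

	if (pipe(gate) != 0)
		return -1;

	for (i = 0; i < n_children; i++) {
		pid = fork();
		if (pid < 0) {
			failed = 1;
			break;
		}

		if (pid == 0) {
			/* Child: drop our copy of the write end first */
			close(gate[1]);

			/* No data is ever written, so this blocks until EOF */
			ret = read(gate[0], &dummy, sizeof(dummy));
			close(gate[0]);

			/* fp-stress would execl() the test program here */
			_exit(ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
		}

		spawned++;
	}

	/* Everything is spawned: closing both ends releases the children */
	close(gate[0]);
	close(gate[1]);

	for (i = 0; i < spawned; i++) {
		if (wait(&status) > 0 && WIFEXITED(status) &&
		    WEXITSTATUS(status) == EXIT_SUCCESS)
			released++;
	}

	return failed ? -1 : released;
}
```

Calling spawn_gated_children(4) from a test main() should report four
released children. No child can get past read() until the parent has
finished forking, which is why output cannot appear during spawning.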
Signed-off-by: Mark Brown
---
 tools/testing/selftests/arm64/fp/fp-stress.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/tools/testing/selftests/arm64/fp/fp-stress.c b/tools/testing/selftests/arm64/fp/fp-stress.c
index 9a3a621cc958..65262cf30b09 100644
--- a/tools/testing/selftests/arm64/fp/fp-stress.c
+++ b/tools/testing/selftests/arm64/fp/fp-stress.c
@@ -42,8 +42,6 @@ static struct child_data *children;
 static int num_children;
 static bool terminate;
 
-static void drain_output(bool flush);
-
 static int startup_pipe[2];
 
 static int num_processors(void)
@@ -138,12 +136,6 @@ static void child_start(struct child_data *child, const char *program)
 			ksft_exit_fail_msg("%s EPOLL_CTL_ADD failed: %s (%d)\n",
 					   child->name, strerror(errno), errno);
 		}
-
-		/*
-		 * Keep output flowing during child startup so logs
-		 * are more timely, can help debugging.
-		 */
-		drain_output(false);
 	}
 }
 

From patchwork Tue Nov 29 00:03:55 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Brown
X-Patchwork-Id: 13058020
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 23110C4321E
	for ; Tue, 29 Nov 2022 00:04:29 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S234812AbiK2AE2 (ORCPT ); Mon, 28 Nov 2022 19:04:28 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39454 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S234760AbiK2AE0 (ORCPT ); Mon, 28 Nov 2022 19:04:26 -0500
Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 493801AD92
	for ; Mon, 28 Nov 2022 16:04:26 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with
	cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dfw.source.kernel.org (Postfix) with ESMTPS id D954B61516
	for ; Tue, 29 Nov 2022 00:04:25 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id D7244C43141;
	Tue, 29 Nov 2022 00:04:23 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1669680265;
	bh=Y43qJ/bNGK9Ddo25ogKXBarnhJnlJyovlWA7DkFzBLE=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=Q7TjS4Ihh0OtWz32ArdI3FbEP9c/5tx+0f1FrR3O/fm0U/F9maIjVrCg4XZzrjADc
	 g9QrOJdtQj58hg3lH6RpChyJpmZ06qWKwBu56daQbuweWGvMn2CrEXp29i3L8AAE5B
	 fMycNHEDbCEihMhICWVYjvi2hBnW14hc8P/J5DTU3RIzGvl1XEYzhzRkTLP4wLwvXc
	 j/J6vSkm+mWFgMK5pLd2LqXOKQa/4z8/pScnAgSFRRvtW8hhVoesCK+kjOulPEX/Ul
	 sniDrlqKMrKk3joAmmwIKpV8eiX+5GRCbuuDbXYJbZ6xGN8jzxrIK1Wrtr/06fTssa
	 EhLuZ0BmCYtpw==
From: Mark Brown
To: Catalin Marinas , Will Deacon , Shuah Khan
Cc: linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org,
	Mark Brown
Subject: [PATCH v1 3/3] kselftest/arm64: Allow epoll_wait() to return more than one result
Date: Tue, 29 Nov 2022 00:03:55 +0000
Message-Id: <20221129000355.812425-4-broonie@kernel.org>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221129000355.812425-1-broonie@kernel.org>
References: <20221129000355.812425-1-broonie@kernel.org>
MIME-Version: 1.0
X-Developer-Signature: v=1; a=openpgp-sha256; l=3344; i=broonie@kernel.org;
	h=from:subject; bh=Y43qJ/bNGK9Ddo25ogKXBarnhJnlJyovlWA7DkFzBLE=;
	b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBjhUxqhIIw+/0qMtilo7qwuKBAbrQGk/zdnqtiunQb
	 UH+OizmJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCY4VMagAKCRAk1otyXVSH0BUgB/
	 9/acYMk3PSV+yuqpG5TG1+9R2clCgHlt/XdScreju0VnajaqncYZjjCgK09lMvWJU2URpIOeQaYAEz
	 K0ALu8kKxKdvxKzBNndRWyVZAi+GhWvl31EhlmQkCuRj56vFVGy4/g6Gu59CeYcLO4wp5f9wxL5iE0
	 qEWi6hRN+Q1TwBJM6g0YsfQ6qKeeyocB8QhD0VK1FPk2H4pYsUl/sk+m+SRndEMNbC21ygg1hLcjZD
	 046yZfje1ZnDHyU9wWb8d/DllyFqnGPUIiRVH2awAW6Ym434ptFzCSBppNszx1ZjypUjbKrphx4MPR
	 V6KKjEPUvr8yFJ8yGu2vDJISdMKwKd
X-Developer-Key: i=broonie@kernel.org; a=openpgp;
	fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB
Precedence: bulk
List-ID:
X-Mailing-List: linux-kselftest@vger.kernel.org

When everything is starting up we are likely to have a lot of child
processes producing output at once. This means that we can reduce
overhead a bit by allowing epoll_wait() to return more than one
descriptor at once, which cuts down on the number of system calls we need
to make. On virtual platforms, where the syscall overhead is a bit more
noticeable and we're likely to have a lot more children active, this can
make a small but noticeable difference. On physical platforms the
relatively small number of processes being run and vastly improved speeds
push the effects of this change into the noise.

Signed-off-by: Mark Brown
---
 tools/testing/selftests/arm64/fp/fp-stress.c | 27 +++++++++++++-------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/arm64/fp/fp-stress.c b/tools/testing/selftests/arm64/fp/fp-stress.c
index 65262cf30b09..d22f6e356440 100644
--- a/tools/testing/selftests/arm64/fp/fp-stress.c
+++ b/tools/testing/selftests/arm64/fp/fp-stress.c
@@ -39,6 +39,8 @@ struct child_data {
 
 static int epoll_fd;
 static struct child_data *children;
+static struct epoll_event *evs;
+static int tests;
 static int num_children;
 static bool terminate;
 
@@ -393,11 +395,11 @@ static void probe_vls(int vls[], int *vl_count, int set_vl)
 /* Handle any pending output without blocking */
 static void drain_output(bool flush)
 {
-	struct epoll_event ev;
 	int ret = 1;
+	int i;
 
 	while (ret > 0) {
-		ret = epoll_wait(epoll_fd, &ev, 1, 0);
+		ret = epoll_wait(epoll_fd, evs, tests, 0);
 		if (ret < 0) {
 			if (errno == EINTR)
 				continue;
@@ -405,8 +407,8 @@ static void drain_output(bool flush)
 				       strerror(errno), errno);
 		}
 
-		if (ret == 1)
-			child_output(ev.data.ptr, ev.events, flush);
+		for (i = 0; i < ret; i++)
+			child_output(evs[i].data.ptr, evs[i].events, flush);
 	}
 }
 
@@ -419,10 +421,9 @@ int main(int argc, char **argv)
 {
 	int ret;
 	int timeout = 10;
-	int cpus, tests, i, j, c;
+	int cpus, i, j, c;
 	int sve_vl_count, sme_vl_count, fpsimd_per_cpu;
 	int sve_vls[MAX_VLS], sme_vls[MAX_VLS];
-	struct epoll_event ev;
 	struct sigaction sa;
 
 	while ((c = getopt_long(argc, argv, "t:", options, NULL)) != -1) {
@@ -508,6 +509,11 @@ int main(int argc, char **argv)
 		ksft_print_msg("Failed to install SIGCHLD handler: %s (%d)\n",
 			       strerror(errno), errno);
 
+	evs = calloc(tests, sizeof(*evs));
+	if (!evs)
+		ksft_exit_fail_msg("Failed to allocate %d epoll events\n",
+				   tests);
+
 	for (i = 0; i < cpus; i++) {
 		for (j = 0; j < fpsimd_per_cpu; j++)
 			start_fpsimd(&children[num_children++], i, j);
@@ -541,7 +547,7 @@ int main(int argc, char **argv)
 		 * useful in emulation where we will both be slow and
 		 * likely to have a large set of VLs.
 		 */
-		ret = epoll_wait(epoll_fd, &ev, 1, 1000);
+		ret = epoll_wait(epoll_fd, evs, tests, 1000);
 		if (ret < 0) {
 			if (errno == EINTR)
 				continue;
@@ -550,8 +556,11 @@ int main(int argc, char **argv)
 		}
 
 		/* Output? */
-		if (ret == 1) {
-			child_output(ev.data.ptr, ev.events, false);
+		if (ret > 0) {
+			for (i = 0; i < ret; i++) {
+				child_output(evs[i].data.ptr, evs[i].events,
+					     false);
+			}
 			continue;
 		}
 
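
As a rough illustration of the batching in patch 3/3, the standalone
sketch below makes several pipes readable and then asks epoll_wait() for
events in a single call. It is not code from fp-stress.c: the helper name
ready_in_one_call() and its parameters are assumptions made up for this
example, and pipes written by the same process stand in for children
producing output.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: make n_pipes pipes readable, then ask
 * epoll_wait() for up to max_events events in one call. Returns the
 * number of events that single call reported, or -1 on failure.
 */
int ready_in_one_call(int n_pipes, int max_events)
{
	struct epoll_event ev = { 0 };
	struct epoll_event *evs;
	int (*pipes)[2];
	int epfd, i, created = 0, ret = -1;

	assert(n_pipes > 0 && max_events > 0);

	pipes = calloc(n_pipes, sizeof(*pipes));
	evs = calloc(max_events, sizeof(*evs));
	epfd = epoll_create1(0);
	if (!pipes || !evs || epfd < 0)
		goto out;

	for (i = 0; i < n_pipes; i++) {
		if (pipe(pipes[i]) != 0)
			goto out;
		created++;

		ev.events = EPOLLIN;
		ev.data.fd = pipes[i][0];
		if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipes[i][0], &ev) != 0)
			goto out;

		/* One byte of stand-in "child output" per pipe */
		if (write(pipes[i][1], "x", 1) != 1)
			goto out;
	}

	/* Timeout 0: report whatever is ready now without blocking */
	ret = epoll_wait(epfd, evs, max_events, 0);

out:
	for (i = 0; i < created; i++) {
		close(pipes[i][0]);
		close(pipes[i][1]);
	}
	if (epfd >= 0)
		close(epfd);
	free(evs);
	free(pipes);
	return ret;
}
```

With four readable pipes, a max_events of 1 reports one event per call,
so draining them takes at least four system calls, while a max_events of
4 or more reports all four in one call. That is the saving the patch is
after when many children produce output at once.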