From patchwork Tue Nov 29 21:59:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 13059261 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 74F7EC4321E for ; Tue, 29 Nov 2022 22:00:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232970AbiK2V7h (ORCPT ); Tue, 29 Nov 2022 16:59:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41116 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237115AbiK2V7f (ORCPT ); Tue, 29 Nov 2022 16:59:35 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E50236F377 for ; Tue, 29 Nov 2022 13:59:34 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 81AB061958 for ; Tue, 29 Nov 2022 21:59:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7FDF6C433B5; Tue, 29 Nov 2022 21:59:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1669759173; bh=R3+mZLjzV/gcr5knY0VrHVoRWwAETPfnU+ZHglXcR8Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=rsFEnqzsZf7ADxCr27zUv8YXqBWKKo9hbsV1qqzkTtaZgvrLJyQqtRcs6Z74njr5v v/E2gvfA0GcYpintGiR1W3ftzKxrZcEWlVDq4RnjsBQ802QWySEc7PfwdL8vUYnDXw J9go+4PjqTezo7ZByRenSPHRvJf9OfOd7dytLZYnH3QV8yN7FRFmBq6A6U2ftci76e yA0dt2PfP0rqakOAAwTAxtTO50jRx3kgtwahQu7yFMGcT5cEKCVj18Aq+Oi4OGTSvw FRVaavrKcttiMk/Pv3yV53tLzoFIn2f5CJMB224A1zGf9mSqr5nyjEp39NVPOdPQra bltzRwGGRGtCA== From: Mark Brown To: Catalin Marinas , Will 
Deacon , Shuah Khan Cc: linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, Mark Brown Subject: [PATCH v2 1/3] kselftest/arm64: Hold fp-stress children until they're all spawned Date: Tue, 29 Nov 2022 21:59:23 +0000 Message-Id: <20221129215926.442895-2-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221129215926.442895-1-broonie@kernel.org> References: <20221129215926.442895-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3511; i=broonie@kernel.org; h=from:subject; bh=R3+mZLjzV/gcr5knY0VrHVoRWwAETPfnU+ZHglXcR8Q=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBjhoC7GO8Mr1R0x1UQagteYI0RHQ2q1qlM8tari4MH UrIcutGJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCY4aAuwAKCRAk1otyXVSH0HDVB/ 4v4wEC8zW6w45wRu1bJWOKqLbOYWUhqVlia8PI7FvzUnGhePb5qrm7nYfVWGQ/pFIEHa/GMUAumr1i c7rGg0fp6fGmTSneIrGKxOdJufZNbdV5BeoiNKp6MGsAvibaOICtPJYmLsMofsdcZAdh2FKkEUMd2Z Y4bWBAZEG8kUWiig7gizDTUmvGglaXt75MkEqG1a6r6p0jW5MooTHfoaYRNpj1JkTfTGpXwTyJvcc8 Sk8NAJSSN9LVfDr+5rWfyeZXxEBCeeSH1d6doYcIs5oFZkU22SzWAMcsIi8Du3lD1zVvnwmwB9zoUL jm+vkTvX+BN+pc+WpUzdUK/XC7hL6m X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org At present fp-stress has a bit of a thundering herd problem since the children it spawns start running immediately, meaning that they can start starving the parent process of CPU before it has even started all the children. This is much more severe on virtual platforms since they tend to support far more SVE and SME vector lengths, are slower in general and in some cases have performance issues when simulating multiple CPUs. We can mitigate this problem by having all the child processes block before starting the test program, meaning that we at least have all the child processes started before we start heavily using CPU. 
We still have the same load issues while waiting for the actual stress test programs to start up and produce output but they're at least all ready to go before that kicks in, resulting in substantial reductions in overall runtime on some of the severely affected systems. One test was showing about 20% improvement. Signed-off-by: Mark Brown --- tools/testing/selftests/arm64/fp/fp-stress.c | 41 +++++++++++++++++++- 1 file changed, 40 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/arm64/fp/fp-stress.c b/tools/testing/selftests/arm64/fp/fp-stress.c index 69ca4a5f7e6e..7c04f5001648 100644 --- a/tools/testing/selftests/arm64/fp/fp-stress.c +++ b/tools/testing/selftests/arm64/fp/fp-stress.c @@ -44,6 +44,8 @@ static bool terminate; static void drain_output(bool flush); +static int startup_pipe[2]; + static int num_processors(void) { long nproc = sysconf(_SC_NPROCESSORS_CONF); @@ -81,13 +83,37 @@ static void child_start(struct child_data *child, const char *program) exit(EXIT_FAILURE); } + /* + * Duplicate the read side of the startup pipe to + * FD 3 so we can close everything else. + */ + ret = dup2(startup_pipe[0], 3); + if (ret == -1) { + fprintf(stderr, "dup2() %d\n", errno); + exit(EXIT_FAILURE); + } + /* * Very dumb mechanism to clean open FDs other than * stdio. We don't want O_CLOEXEC for the pipes... */ - for (i = 3; i < 8192; i++) + for (i = 4; i < 8192; i++) close(i); + /* + * Read from the startup pipe, there should be no data + * and we should block until it is closed. We just + * carry on on error since this isn't super critical. 
+ */ + ret = read(3, &i, sizeof(i)); + if (ret < 0) + fprintf(stderr, "read(startup pipe) failed: %s (%d)\n", + strerror(errno), errno); + if (ret > 0) + fprintf(stderr, "%d bytes of data on startup pipe\n", + ret); + close(3); + ret = execl(program, program, NULL); fprintf(stderr, "execl(%s) failed: %d (%s)\n", program, errno, strerror(errno)); @@ -467,6 +493,12 @@ int main(int argc, char **argv) strerror(errno), ret); epoll_fd = ret; + /* Create a pipe which children will block on before execing */ + ret = pipe(startup_pipe); + if (ret != 0) + ksft_exit_fail_msg("Failed to create startup pipe: %s (%d)\n", + strerror(errno), errno); + /* Get signal handers ready before we start any children */ memset(&sa, 0, sizeof(sa)); sa.sa_sigaction = handle_exit_signal; @@ -499,6 +531,13 @@ int main(int argc, char **argv) } } + /* + * All children started, close the startup pipe and let them + * run. + */ + close(startup_pipe[0]); + close(startup_pipe[1]); + for (;;) { /* Did we get a signal asking us to exit? 
*/ if (terminate) From patchwork Tue Nov 29 21:59:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 13059262 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68C1CC433FE for ; Tue, 29 Nov 2022 22:00:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233743AbiK2WAJ (ORCPT ); Tue, 29 Nov 2022 17:00:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41140 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234423AbiK2V7h (ORCPT ); Tue, 29 Nov 2022 16:59:37 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B52626F351 for ; Tue, 29 Nov 2022 13:59:36 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 531A661929 for ; Tue, 29 Nov 2022 21:59:36 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5E8A0C433C1; Tue, 29 Nov 2022 21:59:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1669759175; bh=+W8adVt0qNP/+1paRvySD60I/TQ9TpFvbpjiyvAbBAg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=GFX7m9ExCmU3ohxMn3j8fx39hCLVPwqgMANpN0txWwxWaK0eU4BeMAC9JVxjBLQl9 khEL5bz7LkPjc5I5ZTjyhJi6kqoxeSxxkg5TxMqoRcLsJSna4zpaSpPIb13STZ8udG 43by6c9EHB05xCS50AHKkyg/Gm5FiJr0qNVRrYtUQeiqdEIP33qTT9S0FsRxDOsRGi ldYHb63nWlwFs5Zd46O6vvXFY5mbzfd6InXISh3ep83IXBYg0QE1lCscKPKFW51hsn WYbhzYLibJKb1yvQA1+TOY77c3/GWuzN30DEOclthVWmm6Hlh4nUh4A3yl+WDB0Yg5 47pHtcS8Nr28g== From: Mark Brown To: Catalin 
Marinas , Will Deacon , Shuah Khan Cc: linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, Mark Brown Subject: [PATCH v2 2/3] kselftest/arm64: Don't drain output while spawning children Date: Tue, 29 Nov 2022 21:59:24 +0000 Message-Id: <20221129215926.442895-3-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221129215926.442895-1-broonie@kernel.org> References: <20221129215926.442895-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1139; i=broonie@kernel.org; h=from:subject; bh=+W8adVt0qNP/+1paRvySD60I/TQ9TpFvbpjiyvAbBAg=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBjhoC8UMtEjXfLbwbQEU4Rxagc/CjBu3qp2TKLbzO4 N2gxArSJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCY4aAvAAKCRAk1otyXVSH0Mo0B/ 4st+ImVjOz9wuVjI/EGgH9bJW89NAHeVMCYcPZV1+PuWsaFDD79YX9N5q/8cdxqBFrF/89TZrzN7Bo EbqZfn1t+ywNF7F7AOXGOSpdUh5SalcD/ZO02Bvjo7BC/rDHOD+1XGsMH+L5TgXmvK/39cZIM3i+qF gqnrlWYlDEPMxrbQip5yE5B08a29E6IX32GnuSg/O8CJqBFJurMqU30tRJ1l26q8fLpmAj8gNv9RXV cm+6yoPUaV5iJ5e/6MGhGISf2JoL9dl67dd+5gl/0VBtVfh50g00uJ2ugTgaIp3HWQFbN36Dqf3Q8T KDcSt+CrsLqJQXHFweIcDrRTSXm4Mg X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Now that we hold execution of the stress test programs until all children are started, there is no need to drain output while that is happening. 
Signed-off-by: Mark Brown --- tools/testing/selftests/arm64/fp/fp-stress.c | 8 -------- 1 file changed, 8 deletions(-) diff --git a/tools/testing/selftests/arm64/fp/fp-stress.c b/tools/testing/selftests/arm64/fp/fp-stress.c index 7c04f5001648..b3bbfe8d9f56 100644 --- a/tools/testing/selftests/arm64/fp/fp-stress.c +++ b/tools/testing/selftests/arm64/fp/fp-stress.c @@ -42,8 +42,6 @@ static struct child_data *children; static int num_children; static bool terminate; -static void drain_output(bool flush); - static int startup_pipe[2]; static int num_processors(void) @@ -138,12 +136,6 @@ static void child_start(struct child_data *child, const char *program) ksft_exit_fail_msg("%s EPOLL_CTL_ADD failed: %s (%d)\n", child->name, strerror(errno), errno); } - - /* - * Keep output flowing during child startup so logs - * are more timely, can help debugging. - */ - drain_output(false); } } From patchwork Tue Nov 29 21:59:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Brown X-Patchwork-Id: 13059263 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 77D8FC46467 for ; Tue, 29 Nov 2022 22:00:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234423AbiK2WAK (ORCPT ); Tue, 29 Nov 2022 17:00:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41190 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236192AbiK2V7l (ORCPT ); Tue, 29 Nov 2022 16:59:41 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3DC8C6F0EB for ; Tue, 29 Nov 2022 13:59:40 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id DCFE5B81982 for ; Tue, 29 Nov 2022 21:59:38 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 33A31C433D6; Tue, 29 Nov 2022 21:59:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1669759177; bh=ljBbwXbpFXylIv5/T0HUheeXkTvpejFf0PIQp2UDE54=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XjqEg0MFHoe2NRdlYELnD55gyH5YlG+KHhReuPjnTS1om0cBDp4S6rCedAkYY0Fjz D0NaesqX4kUbJqfyyVlaQUx/zyuxS4C8HnhyqWn9+sn5/WjeuNldmKrc7a87FWwkCQ eR1wnAIBht6423NP2cX6NT0ELUvA62wk0Ynw2pDilkoyOdKDUKRr+VzDs1rBOtAZr/ q9xz2Ai7hV3Z7D3znA0CXK6f0+7JjYvQe7uBjk4BnKRRylXxX8X1wRv5RW+u5ouKH/ YiLLwnrsIIh/XLz/49oRG5SpVnHEW0TZPyZ4pn+ps899IhsicpV1TH/yfBR3+jdI7p dVI9oqSHwhMqA== From: Mark Brown To: Catalin Marinas , Will Deacon , Shuah Khan Cc: linux-arm-kernel@lists.infradead.org, linux-kselftest@vger.kernel.org, Mark Brown Subject: [PATCH v2 3/3] kselftest/arm64: Allow epoll_wait() to return more than one result Date: Tue, 29 Nov 2022 21:59:25 +0000 Message-Id: <20221129215926.442895-4-broonie@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221129215926.442895-1-broonie@kernel.org> References: <20221129215926.442895-1-broonie@kernel.org> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3405; i=broonie@kernel.org; h=from:subject; bh=ljBbwXbpFXylIv5/T0HUheeXkTvpejFf0PIQp2UDE54=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBjhoC9A5hGVU7vrA26t1EofYEAYFWU+d06yuxERowW NHf5mnOJATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCY4aAvQAKCRAk1otyXVSH0HutB/ 9EtlLjCLWE6479Xq7SRHRmkOeZHGiW8VuS2J1o1KWRgT4ApriYUULxzrOrEHfPK4zTlKXngz36MAYt SMp0guIKlf/BbM/ZTtbqSyfkQJcxVT5o0edBS69yO0PIcR4D5OOZTAcJ4fZKuzB2yqp2U+pP6FVk2e gdgquW/u5YIFceUL1ycSd9FdjyaoNpeh5AhlusBjx6SxRHQQ/iHDdtCpvHdv3uWx+VozhwtvqW87eU FMx3zEdfRT/hfXVmCDMyWqrfsRQPV0qaV4PIWUYiQws+eLWWEKhuh4tHQexzOLnNjdNDkpJt5ASrJE Rael97M7oOifNEO8oUT0uaAmNshYBv 
X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org When everything is starting up we are likely to have a lot of child processes producing output at once. This means that we can reduce overhead a bit by allowing epoll_wait() to return more than one descriptor at once. This cuts down on the number of system calls we need to make, which can make a small but noticeable difference on virtual platforms, where the syscall overhead is a bit more noticeable and we're likely to have a lot more children active. On physical platforms the relatively small number of processes being run and vastly improved speeds push the effects of this change into the noise. Signed-off-by: Mark Brown --- tools/testing/selftests/arm64/fp/fp-stress.c | 27 +++++++++++++------- 1 file changed, 18 insertions(+), 9 deletions(-) diff --git a/tools/testing/selftests/arm64/fp/fp-stress.c b/tools/testing/selftests/arm64/fp/fp-stress.c index b3bbfe8d9f56..f8b2f41aac36 100644 --- a/tools/testing/selftests/arm64/fp/fp-stress.c +++ b/tools/testing/selftests/arm64/fp/fp-stress.c @@ -39,6 +39,8 @@ struct child_data { static int epoll_fd; static struct child_data *children; +static struct epoll_event *evs; +static int tests; static int num_children; static bool terminate; @@ -393,11 +395,11 @@ static void probe_vls(int vls[], int *vl_count, int set_vl) /* Handle any pending output without blocking */ static void drain_output(bool flush) { - struct epoll_event ev; int ret = 1; + int i; while (ret > 0) { - ret = epoll_wait(epoll_fd, &ev, 1, 0); + ret = epoll_wait(epoll_fd, evs, tests, 0); if (ret < 0) { if (errno == EINTR) continue; @@ -405,8 +407,8 @@ static void drain_output(bool flush) strerror(errno), errno); } - if (ret == 1) - child_output(ev.data.ptr, ev.events, flush); + for (i = 0; i < ret; i++) + child_output(evs[i].data.ptr, evs[i].events, flush); } } @@ -419,12 +421,11 @@ int main(int argc, 
char **argv) { int ret; int timeout = 10; - int cpus, tests, i, j, c; + int cpus, i, j, c; int sve_vl_count, sme_vl_count, fpsimd_per_cpu; bool all_children_started = false; int seen_children; int sve_vls[MAX_VLS], sme_vls[MAX_VLS]; - struct epoll_event ev; struct sigaction sa; while ((c = getopt_long(argc, argv, "t:", options, NULL)) != -1) { @@ -510,6 +511,11 @@ int main(int argc, char **argv) ksft_print_msg("Failed to install SIGCHLD handler: %s (%d)\n", strerror(errno), errno); + evs = calloc(tests, sizeof(*evs)); + if (!evs) + ksft_exit_fail_msg("Failed to allocate %d epoll events\n", + tests); + for (i = 0; i < cpus; i++) { for (j = 0; j < fpsimd_per_cpu; j++) start_fpsimd(&children[num_children++], i, j); @@ -543,7 +549,7 @@ int main(int argc, char **argv) * useful in emulation where we will both be slow and * likely to have a large set of VLs. */ - ret = epoll_wait(epoll_fd, &ev, 1, 1000); + ret = epoll_wait(epoll_fd, evs, tests, 1000); if (ret < 0) { if (errno == EINTR) continue; @@ -552,8 +558,11 @@ int main(int argc, char **argv) } /* Output? */ - if (ret == 1) { - child_output(ev.data.ptr, ev.events, false); + if (ret > 0) { + for (i = 0; i < ret; i++) { + child_output(evs[i].data.ptr, evs[i].events, + false); + } continue; }
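
The effect of the larger epoll_wait() event array in patch 3 can be seen in isolation with a small sketch. This is not code from fp-stress: count_batched_events() is a hypothetical helper, the pipes stand in for child output, and it assumes Linux with level-triggered epoll. It makes several pipes readable and then collects every ready descriptor with a single epoll_wait() call, where a one-entry event array would need one call per descriptor.

```c
#include <stdlib.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of fp-stress: make npipes pipes
 * readable, then collect the ready events with one epoll_wait() call.
 * Returns the number of events that single call reported, or -1 on
 * error.
 */
static int count_batched_events(int npipes)
{
	struct epoll_event ev = { 0 };
	struct epoll_event *evs = NULL;
	int (*fds)[2] = NULL;
	int epoll_fd, opened = 0, i, ret = -1;

	if (npipes <= 0)
		return -1;

	epoll_fd = epoll_create1(0);
	if (epoll_fd == -1)
		return -1;

	/* One event slot per descriptor, as the patch does with "tests" */
	evs = calloc(npipes, sizeof(*evs));
	fds = calloc(npipes, sizeof(*fds));
	if (!evs || !fds)
		goto out;

	for (i = 0; i < npipes; i++) {
		if (pipe(fds[i]) != 0)
			goto out;
		opened++;

		/* Stand-in for a child producing output */
		if (write(fds[i][1], "x", 1) != 1)
			goto out;

		ev.events = EPOLLIN;
		ev.data.fd = fds[i][0];
		if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, fds[i][0], &ev) != 0)
			goto out;
	}

	/* One system call, up to npipes results, zero timeout */
	ret = epoll_wait(epoll_fd, evs, npipes, 0);

out:
	for (i = 0; i < opened; i++) {
		close(fds[i][0]);
		close(fds[i][1]);
	}
	free(fds);
	free(evs);
	close(epoll_fd);
	return ret;
}
```

Calling count_batched_events(4) from a test program should report all four descriptors from the single epoll_wait() call. With a maxevents of 1, as in the code before this patch, the same work would take four calls, which is the overhead the commit message describes for virtual platforms.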