From patchwork Wed Jan 20 17:41:45 2021
X-Patchwork-Submitter: Diab Neiroukh
X-Patchwork-Id: 12033277
From: Diab Neiroukh
To: clang-built-linux@googlegroups.com
Cc: Diab Neiroukh, Danny Lin, Masahiro Yamada, Michal Marek,
 Nathan Chancellor, Nick Desaulniers, Kees Cook, Masami Hiramatsu,
 "Steven Rostedt (VMware)", Sami Tolvanen, Valentin Schneider,
 Nick Terrell, Quentin Perret, Johannes Weiner,
 linux-kbuild@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH] kbuild: Add support for Clang's polyhedral loop optimizer.
Date: Wed, 20 Jan 2021 17:41:45 +0000
Message-Id: <20210120174146.12287-1-lazerl0rd@thezest.dev>
X-Mailer: git-send-email 2.30.0
X-Mailing-List: linux-kbuild@vger.kernel.org

Polly is able to optimize various loops throughout the kernel for
cache locality. A mathematical representation of the program, based on
polyhedra, is analysed to find opportunistic optimisations in memory
access patterns, which then lead to loop transformations.

Polly is not built with LLVM by default; LLVM must be compiled with the
Polly "project" enabled.
This can be done by adding Polly to -DLLVM_ENABLE_PROJECTS, for example:

  -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi;polly"

Preliminary benchmarking seems to show an improvement of up to around
two percent across the benchmarks below:

Benchmark                          | Control    | Polly
--------------------------------------------------------
bonnie++ -x 2 -s 4096 -r 0         | 12.610s    | 12.547s
perf bench futex requeue           | 33.553s    | 33.094s
perf bench futex wake              |  1.032s    |  1.021s
perf bench futex wake-parallel     |  1.049s    |  1.025s
perf bench futex requeue           |  1.037s    |  1.020s

Furthermore, Polly does not noticeably increase the image size (12064
bytes, roughly 0.13%, in the comparison below), making it an essentially
"free" optimisation. A comparison of a bzImage for a kernel with and
without Polly is shown below:

bzImage        | stat --printf="%s\n"
-------------------------------------
Control        | 9333728
Polly          | 9345792

Compile times differed by one percent at most, which is well within the
range of noise. Therefore, I can say with certainty that Polly has
minimal effect on compile times, if any.

Suggested-by: Danny Lin
Signed-off-by: Diab Neiroukh
---
 Makefile     | 16 ++++++++++++++++
 init/Kconfig | 13 +++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/Makefile b/Makefile
index b9d3a47c57cf..00f15bde5f8b 100644
--- a/Makefile
+++ b/Makefile
@@ -740,6 +740,22 @@ else ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
 KBUILD_CFLAGS += -Os
 endif
 
+ifdef CONFIG_POLLY_CLANG
+KBUILD_CFLAGS += -mllvm -polly \
+		 -mllvm -polly-ast-use-context \
+		 -mllvm -polly-invariant-load-hoisting \
+		 -mllvm -polly-opt-fusion=max \
+		 -mllvm -polly-run-inliner \
+		 -mllvm -polly-vectorizer=stripmine
+# Polly may optimise loops with dead paths beyond what the linker
+# can understand. This may negate the effect of the linker's DCE
+# so we tell Polly to perform proven DCE on the loops it optimises
+# in order to preserve the overall effect of the linker's DCE.
+ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
+KBUILD_CFLAGS += -mllvm -polly-run-dce
+endif
+endif
+
 # Tell gcc to never replace conditional load with a non-conditional one
 KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0)
 KBUILD_CFLAGS += $(call cc-option,-fno-allow-store-data-races)
diff --git a/init/Kconfig b/init/Kconfig
index 05131b3ad0f2..266d7d03ccd1 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -177,6 +177,19 @@ config BUILD_SALT
 	  This is mostly useful for distributions which want to ensure the
 	  build is unique between builds. It's safe to leave the default.
 
+config POLLY_CLANG
+	bool "Use Clang Polly optimizations"
+	depends on CC_IS_CLANG && $(cc-option,-mllvm -polly)
+	depends on !COMPILE_TEST
+	help
+	  This option enables Clang's polyhedral loop optimizer known as
+	  Polly. Polly is able to optimize various loops throughout the
+	  kernel for cache locality. This requires a Clang toolchain
+	  compiled with support for Polly. More information can be found
+	  from Polly's website:
+
+	  https://polly.llvm.org
+
 config HAVE_KERNEL_GZIP
 	bool
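
For readers unfamiliar with what "optimizing loops for cache locality"
means in practice, here is a minimal sketch of loop interchange, one of
the classic transformations a polyhedral optimizer can apply. It is an
illustration only: the function names and the matrix size N are
hypothetical, it is not taken from the kernel, and nothing in this patch
establishes that Polly transforms any particular kernel loop in this way.

```c
#include <stddef.h>

#define N 64 /* hypothetical matrix dimension, chosen for illustration */

/*
 * Column-first traversal: the inner loop advances i, so consecutive
 * accesses are N ints apart in memory and make poor use of each
 * cache line.
 */
long sum_column_first(int m[N][N])
{
	long total = 0;

	for (size_t j = 0; j < N; j++)
		for (size_t i = 0; i < N; i++)
			total += m[i][j];
	return total;
}

/*
 * The same computation after interchanging the two loops: the inner
 * loop now advances j and walks contiguous memory. The result is
 * identical; only the memory access order changes.
 */
long sum_row_first(int m[N][N])
{
	long total = 0;

	for (size_t i = 0; i < N; i++)
		for (size_t j = 0; j < N; j++)
			total += m[i][j];
	return total;
}
```

Both functions return the same value for any input. The second form is
the kind of result a loop optimizer aims for when it can prove that
reordering the iterations is safe.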