From patchwork Sun Jul 21 13:36:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lasse Collin X-Patchwork-Id: 13738010 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 99169C3DA63 for ; Sun, 21 Jul 2024 13:38:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=N/pf140CXFriVjM5XTzVcQtyyhrrCsSVAZJ+KQwARu8=; b=znWP36FWqWhquA hqzManY+2C1TPqo6NwikrDtMmQVpG0p1l3Kc8VWog9Ad6W7u6e4SD7dlymN8kIb/wdAmlpH+XpeQi 0Zma7ZkWrNPuhyQvRx25ssnofLOIeyOW68GaZFWbOGb48UsP7+iD4Pgx4lSGFBtEBJPLXvezgutn3 ro2A+Y7lluE+d6YyoYLY3CCZI+ohvsAFB4NgwpMoaGCEepsYu5m9/poG6m5ZMbKsD6wz3EVNnKC+g 6FMZ2RKQF4Tnr3KFfxlOH8hw40BSDcsfDlSftT4zX17t68h0PYgJm1o3uolnMks9rIuJyjyIfnc3s 4OUPFVMtauJCb/ta4Zgw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sVWlS-00000006uFr-2Kua; Sun, 21 Jul 2024 13:38:14 +0000 Received: from mailscanner06.zoner.fi ([5.44.246.15]) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sVWka-00000006tvz-1GLl; Sun, 21 Jul 2024 13:37:23 +0000 Received: from www25.zoner.fi (www25.zoner.fi [84.34.147.45]) by mailscanner06.zoner.fi (Postfix) with ESMTPS id 83C3021320; Sun, 21 Jul 2024 16:37:15 +0300 (EEST) Received: from mail.zoner.fi ([84.34.147.244]) by www25.zoner.fi with esmtp (Exim 4.97.1) (envelope-from ) id 1sVWkS-00000001SmU-0R5F; Sun, 21 Jul 2024 16:37:15 +0300 From: Lasse Collin To: Andrew Morton Cc: Lasse Collin , Sam James , linux-kernel@vger.kernel.org, Simon Glass , Catalin Marinas , Will Deacon , Paul Walmsley , Palmer Dabbelt , Albert Ou , Jubin Zhong , Jules Maselbas , linux-arm-kernel@lists.infradead.org, linux-riscv@lists.infradead.org Subject: [PATCH v2 14/16] xz: Adjust arch-specific options for better kernel compression Date: Sun, 21 Jul 2024 16:36:29 +0300 Message-ID: <20240721133633.47721-15-lasse.collin@tukaani.org> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240721133633.47721-1-lasse.collin@tukaani.org> References: <20240721133633.47721-1-lasse.collin@tukaani.org> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240721_063720_689868_7CF2A486 X-CRM114-Status: GOOD ( 23.71 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Use LZMA2 options that match the arch-specific alignment of instructions. This change reduces compressed kernel size 0-2 % depending on the arch. On 1-byte-aligned x86 it makes no difference and on 4-byte-aligned archs it helps the most. Use the ARM-Thumb filter for ARM-Thumb2 kernels. This reduces compressed kernel size about 5 %.[1] Previously such kernels were compressed using the ARM filter which didn't do anything useful with ARM-Thumb2 code. Add BCJ filter support for ARM64 and RISC-V. Compared to unfiltered XZ or plain LZMA, the compressed kernel size is reduced about 5 % on ARM64 and 7 % on RISC-V. A new enough version of the xz tool is required: 5.4.0 for ARM64 and 5.6.0 for RISC-V. With an old xz version, a message is printed to standard error and the kernel is compressed without the filter. Update lib/decompress_unxz.c to match the changes to xz_wrap.sh. Update the CONFIG_KERNEL_XZ help text in init/Kconfig: - Add the RISC-V and ARM64 filters. - Clarify that the PowerPC filter is for big endian only. - Omit IA-64. Link: https://lore.kernel.org/lkml/1637379771-39449-1-git-send-email-zhongjubin@huawei.com/ [1] Cc: Simon Glass Cc: Catalin Marinas Cc: Will Deacon Cc: Paul Walmsley Cc: Palmer Dabbelt Cc: Albert Ou Cc: Jubin Zhong Cc: Jules Maselbas Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Reviewed-by: Sam James Signed-off-by: Lasse Collin --- Notes: v2: Avoid the "eval" command. The use of "eval" on xz's output might scare people so this should make the script less scary. This should address the concerns in [2]. See also my replies [3] and [4]. [2]: https://lore.kernel.org/lkml/27db456edeb6f72e7e229c2333c5d8449718c26e.camel@16bits.net/ [3]: https://lore.kernel.org/lkml/20240403225903.0773746d@kaneli/ [4]: https://lore.kernel.org/lkml/20240404170103.1bc382b3@kaneli/ init/Kconfig | 5 +- lib/decompress_unxz.c | 14 ++++- scripts/xz_wrap.sh | 142 ++++++++++++++++++++++++++++++++++++++++-- 3 files changed, 152 insertions(+), 9 deletions(-) diff --git a/init/Kconfig b/init/Kconfig index 964355d1757e..236105e4d441 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -310,8 +310,9 @@ config KERNEL_XZ BCJ filters which can improve compression ratio of executable code. The size of the kernel is about 30% smaller with XZ in comparison to gzip. On architectures for which there is a BCJ - filter (i386, x86_64, ARM, IA-64, PowerPC, and SPARC), XZ - will create a few percent smaller kernel than plain LZMA. + filter (i386, x86_64, ARM, ARM64, RISC-V, big endian PowerPC, + and SPARC), XZ will create a few percent smaller kernel than + plain LZMA. The speed is about the same as with LZMA: The decompression speed of XZ is better than that of bzip2 but worse than gzip diff --git a/lib/decompress_unxz.c b/lib/decompress_unxz.c index 46aa3be13fc5..cae00395d7a6 100644 --- a/lib/decompress_unxz.c +++ b/lib/decompress_unxz.c @@ -126,11 +126,21 @@ #ifdef CONFIG_X86 # define XZ_DEC_X86 #endif -#ifdef CONFIG_PPC +#if defined(CONFIG_PPC) && defined(CONFIG_CPU_BIG_ENDIAN) # define XZ_DEC_POWERPC #endif #ifdef CONFIG_ARM -# define XZ_DEC_ARM +# ifdef CONFIG_THUMB2_KERNEL +# define XZ_DEC_ARMTHUMB +# else +# define XZ_DEC_ARM +# endif +#endif +#ifdef CONFIG_ARM64 +# define XZ_DEC_ARM64 +#endif +#ifdef CONFIG_RISCV +# define XZ_DEC_RISCV #endif #ifdef CONFIG_SPARC # define XZ_DEC_SPARC diff --git a/scripts/xz_wrap.sh b/scripts/xz_wrap.sh index c8c36441ab70..f19369687030 100755 --- a/scripts/xz_wrap.sh +++ b/scripts/xz_wrap.sh @@ -6,14 +6,146 @@ # # Author: Lasse Collin +# This has specialized settings for the following archs. However, +# XZ-compressed kernel isn't currently supported on every listed arch. +# +# Arch Align Notes +# arm 2/4 ARM and ARM-Thumb2 +# arm64 4 +# csky 2 +# loongarch 4 +# mips 2/4 MicroMIPS is 2-byte aligned +# parisc 4 +# powerpc 4 Uses its own wrapper for compressors instead of this. +# riscv 2/4 +# s390 2 +# sh 2 +# sparc 4 +# x86 1 + +# A few archs use 2-byte or 4-byte aligned instructions depending on +# the kernel config. This function is used to check if the relevant +# config option is set to "y". +is_enabled() +{ + grep -q "^$1=y$" include/config/auto.conf +} + +# XZ_VERSION is needed to disable features that aren't available in +# old XZ Utils versions. +XZ_VERSION=$($XZ --robot --version) || exit +XZ_VERSION=$(printf '%s\n' "$XZ_VERSION" | sed -n 's/^XZ_VERSION=//p') + +# Assume that no BCJ filter is available. BCJ= -LZMA2OPTS= +# Set the instruction alignment to 1, 2, or 4 bytes. +# +# Set the BCJ filter if one is available. +# It must match the #ifdef usage in lib/decompress_unxz.c. case $SRCARCH in - x86) BCJ=--x86 ;; - powerpc) BCJ=--powerpc ;; - arm) BCJ=--arm ;; - sparc) BCJ=--sparc ;; + arm) + if is_enabled CONFIG_THUMB2_KERNEL; then + ALIGN=2 + BCJ=--armthumb + else + ALIGN=4 + BCJ=--arm + fi + ;; + + arm64) + ALIGN=4 + + # ARM64 filter was added in XZ Utils 5.4.0. + if [ "$XZ_VERSION" -ge 50040002 ]; then + BCJ=--arm64 + else + echo "$0: Upgrading to xz >= 5.4.0" \ + "would enable the ARM64 filter" \ + "for better compression" >&2 + fi + ;; + + csky) + ALIGN=2 + ;; + + loongarch) + ALIGN=4 + ;; + + mips) + if is_enabled CONFIG_CPU_MICROMIPS; then + ALIGN=2 + else + ALIGN=4 + fi + ;; + + parisc) + ALIGN=4 + ;; + + powerpc) + ALIGN=4 + + # The filter is only for big endian instruction encoding. + if is_enabled CONFIG_CPU_BIG_ENDIAN; then + BCJ=--powerpc + fi + ;; + + riscv) + if is_enabled CONFIG_RISCV_ISA_C; then + ALIGN=2 + else + ALIGN=4 + fi + + # RISC-V filter was added in XZ Utils 5.6.0. + if [ "$XZ_VERSION" -ge 50060002 ]; then + BCJ=--riscv + else + echo "$0: Upgrading to xz >= 5.6.0" \ + "would enable the RISC-V filter" \ + "for better compression" >&2 + fi + ;; + + s390) + ALIGN=2 + ;; + + sh) + ALIGN=2 + ;; + + sparc) + ALIGN=4 + BCJ=--sparc + ;; + + x86) + ALIGN=1 + BCJ=--x86 + ;; + + *) + echo "$0: Arch-specific tuning is missing for '$SRCARCH'" >&2 + + # Guess 2-byte-aligned instructions. Guessing too low + # should hurt less than guessing too high. + ALIGN=2 + ;; +esac + +# Select the LZMA2 options matching the instruction alignment. +case $ALIGN in + 1) LZMA2OPTS= ;; + 2) LZMA2OPTS=lp=1 ;; + 4) LZMA2OPTS=lp=2,lc=2 ;; + *) echo "$0: ALIGN wrong or missing" >&2; exit 1 ;; esac # Use single-threaded mode because it compresses a little better