From patchwork Tue Apr 5 13:57:48 2022
X-Patchwork-Submitter: Catalin Marinas
X-Patchwork-Id: 12801539
From: Catalin Marinas
To: Will Deacon, Marc Zyngier, Arnd Bergmann, Greg Kroah-Hartman,
 Andrew Morton, Linus Torvalds
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, Herbert Xu, "David S. Miller", Mark Brown,
 Alasdair Kergon, Mike Snitzer, Daniel Vetter, "Rafael J. Wysocki"
Subject: [PATCH 00/10] mm, arm64: Reduce ARCH_KMALLOC_MINALIGN below the
 cache line size
Date: Tue, 5 Apr 2022 14:57:48 +0100
Message-Id: <20220405135758.774016-1-catalin.marinas@arm.com>

Hi,

On arm64 ARCH_DMA_MINALIGN (and therefore ARCH_KMALLOC_MINALIGN) is 128.
While the majority of arm64 SoCs have a 64-byte cache line size (or
rather CWG - cache writeback granule), we chose a less than optimal
value in order to support all SoCs in a single kernel image.

The aim of this series is to allow smaller default ARCH_KMALLOC_MINALIGN
with kmalloc() caches configured at boot time to be safe when an SoC has
a larger DMA alignment requirement.

The first patch decouples ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN
with the aim to only use the latter in DMA-specific compile-time
annotations. ARCH_KMALLOC_MINALIGN becomes the minimum (static)
guaranteed kmalloc() alignment but not necessarily safe for non-coherent
DMA. Patches 2-7 change some drivers/ code to use ARCH_DMA_MINALIGN
instead of ARCH_KMALLOC_MINALIGN.

Patch 8 introduces the dynamic arch_kmalloc_minalign() and the slab code
changes to set the corresponding minimum alignment on the newly created
kmalloc() caches.
Patch 10 defines arch_kmalloc_minalign() for arm64 returning
cache_line_size() together with reducing ARCH_KMALLOC_MINALIGN to 64.
ARCH_DMA_MINALIGN remains 128 on arm64.

I don't have access to it but there's the Fujitsu A64FX with a CWG of
256 (the arm64 cache_line_size() returns 256). This series will bump the
smallest kmalloc cache to kmalloc-256. The platform is known to be fully
cache coherent (or so I think) and we decided long ago not to bump
ARCH_DMA_MINALIGN to 256. If problematic, we could make the dynamic
kmalloc() alignment on arm64 min(ARCH_DMA_MINALIGN, cache_line_size()).

This series is beneficial to arm64 even if it's only reducing the
kmalloc() minimum alignment to 64. While it would be nice to reduce this
further to 8 (or 16) on SoCs known to be fully DMA coherent, detecting
this via arch_setup_dma_ops() is problematic, especially with late
probed devices. I'd leave it for an additional RFC series on top of this
(there are ideas like bounce buffering for non-coherent devices if the
SoC was deemed coherent).

Thanks.

Catalin Marinas (10):
  mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN
  drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/gpu: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/md: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/spi: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/usb: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  mm/slab: Allow dynamic kmalloc() minimum alignment
  mm/slab: Simplify create_kmalloc_cache() args and make it static
  arm64: Enable dynamic kmalloc() minimum alignment

 arch/arm64/include/asm/cache.h |  1 +
 arch/arm64/kernel/cacheinfo.c  |  7 ++++++
 drivers/base/devres.c          |  4 ++--
 drivers/gpu/drm/drm_managed.c  |  4 ++--
 drivers/md/dm-crypt.c          |  2 +-
 drivers/spi/spidev.c           |  2 +-
 drivers/usb/core/buffer.c      |  8 +++----
 drivers/usb/misc/usbtest.c     |  2 +-
 include/linux/crypto.h         |  2 +-
 include/linux/slab.h           | 25 ++++++++++++++++-----
 mm/slab.c                      |  6 +----
 mm/slab.h                      |  5 ++---
 mm/slab_common.c               | 40 ++++++++++++++++++++++------------
 13 files changed, 69 insertions(+), 39 deletions(-)
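
As a rough illustration of the alignment selection described in the
cover letter, the user-space sketch below models how the boot-time
kmalloc() minimum alignment might be derived from the CWG. It is not
the code from the patches. The function model_kmalloc_minalign() and
its parameters cwg and clamp_to_dma are hypothetical names: cwg stands
in for what cache_line_size() would return, and clamp_to_dma models the
min(ARCH_DMA_MINALIGN, cache_line_size()) fallback mentioned for
platforms such as the A64FX. Treating the static ARCH_KMALLOC_MINALIGN
as a lower bound is an assumption based on its description as the
minimum guaranteed alignment.

```c
#include <assert.h>
#include <stdbool.h>

/* Values quoted in the cover letter for arm64 after this series. */
#define ARCH_DMA_MINALIGN     128u
#define ARCH_KMALLOC_MINALIGN  64u

/*
 * Hypothetical user-space model of the dynamic minimum alignment.
 * cwg:          stand-in for the value cache_line_size() would return.
 * clamp_to_dma: if true, apply min(ARCH_DMA_MINALIGN, cwg), the
 *               fallback suggested in the cover letter.
 */
static unsigned int model_kmalloc_minalign(unsigned int cwg, bool clamp_to_dma)
{
	unsigned int align = cwg;

	/* Cache writeback granules are powers of two. */
	assert(cwg != 0 && (cwg & (cwg - 1)) == 0);

	if (clamp_to_dma && align > ARCH_DMA_MINALIGN)
		align = ARCH_DMA_MINALIGN;

	/* Assumption: never drop below the static compile-time guarantee. */
	if (align < ARCH_KMALLOC_MINALIGN)
		align = ARCH_KMALLOC_MINALIGN;

	return align;
}
```

Under this model a CWG of 64 gives a 64-byte minimum, a CWG of 128
gives 128, and a CWG of 256 gives 256 unless the clamp is enabled, in
which case it is limited to 128.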