From patchwork Fri Dec 17 08:56:11 2021
X-Patchwork-Submitter: Xiongfeng Wang
X-Patchwork-Id: 12696587
From: Xiongfeng Wang
Subject: [PATCH] asm-generic: introduce io_stop_wc() and add implementation for ARM64
Date: Fri, 17 Dec 2021 16:56:11 +0800
Message-ID: <20211217085611.111999-1-wangxiongfeng2@huawei.com>

For memory accesses with Normal Non-cacheable attributes, the CPU may
do write combining. In some situations this hurts performance, because
a prior access may wait too long just to be merged with later ones.

Introduce io_stop_wc() to prevent Normal-NC memory accesses before this
macro from being merged with memory accesses after it. Add an ARM64
implementation using the DGH instruction, and provide a no-op fallback
for other architectures.
Signed-off-by: Xiongfeng Wang
---
 Documentation/memory-barriers.txt | 9 +++++++++
 arch/arm64/include/asm/barrier.h  | 9 +++++++++
 include/asm-generic/barrier.h     | 11 +++++++++++
 3 files changed, 29 insertions(+)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 7367ada13208..b868b51b1801 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1950,6 +1950,15 @@ There are some more advanced barrier functions:
      For load from persistent memory, existing read memory barriers are
      sufficient to ensure read ordering.
 
+ (*) io_stop_wc();
+
+     For memory accesses with Normal Non-cacheable attributes (e.g. those
+     returned by ioremap_wc()), the CPU may do write combining. In some
+     situations this hurts performance, because a prior access may wait
+     too long just to be merged. io_stop_wc() can be used to prevent
+     merging memory accesses with Normal Non-cacheable attributes before
+     this macro with any memory accesses appearing after this macro.
+
 ===============================
 IMPLICIT KERNEL MEMORY BARRIERS
 ===============================

diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 1c5a00598458..62217be36217 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -26,6 +26,14 @@
 #define __tsb_csync()	asm volatile("hint #18" : : : "memory")
 #define csdb()		asm volatile("hint #20" : : : "memory")
 
+/*
+ * Data Gathering Hint:
+ * This instruction prevents merging memory accesses with Normal-NC or
+ * Device-GRE attributes before the hint instruction with any memory accesses
+ * appearing after the hint instruction.
+ */
+#define dgh()		asm volatile("hint #6" : : : "memory")
+
 #ifdef CONFIG_ARM64_PSEUDO_NMI
 #define pmr_sync()						\
 	do {							\
@@ -46,6 +54,7 @@
 #define dma_rmb()	dmb(oshld)
 #define dma_wmb()	dmb(oshst)
 
+#define io_stop_wc()	dgh()
 
 #define tsb_csync()						\
 	do {							\

diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 640f09479bdf..083be6d34cb9 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -251,5 +251,16 @@ do {							\
 #define pmem_wmb()	wmb()
 #endif
 
+/*
+ * ioremap_wc() maps I/O memory as memory with Normal Non-cacheable
+ * attributes. The CPU may do write combining for this kind of memory
+ * access. io_stop_wc() prevents merging memory accesses with Normal
+ * Non-cacheable attributes before this macro with any memory accesses
+ * appearing after this macro.
+ */
+#ifndef io_stop_wc
+#define io_stop_wc() do { } while (0)
+#endif
+
 #endif /* !__ASSEMBLY__ */
 #endif /* __ASM_GENERIC_BARRIER_H */