From patchwork Sun Sep 20 11:00:09 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Serge Semin
X-Patchwork-Id: 11787415
From: Serge Semin
To: Thomas Bogendoerfer
CC: Serge Semin, Serge Semin, Alexey Malahov, Pavel Parkhomenko,
 Vadim Vlasov, "Maciej W. Rozycki"
Subject: [PATCH 1/2] mips: Add strong UC ordering config
Date: Sun, 20 Sep 2020 14:00:09 +0300
Message-ID: <20200920110010.16796-2-Sergey.Semin@baikalelectronics.ru>
In-Reply-To: <20200920110010.16796-1-Sergey.Semin@baikalelectronics.ru>
References: <20200920110010.16796-1-Sergey.Semin@baikalelectronics.ru>
X-Mailing-List: linux-mips@vger.kernel.org

In accordance with [1, 2] memory transactions using CCA=2 (Uncached
Cacheability and Coherency Attribute) are always strongly ordered. This
means the younger memory accesses using CCA=2 are never allowed to be
executed before older memory accesses using CCA=2 (no bypassing is
allowed), and Loads and Stores using CCA=2 are never speculative. It is
expected by the specification that the rest of the system maintains these
properties for processor initiated uncached accesses. So the system IO
interconnect doesn't reorder uncached transactions once they have left the
processor subsystem. Taking into account these properties and what [3]
says about the relaxed IO-accessors we can infer that normal Loads and
Stores from/to CCA=2 memory without any additional execution barriers
will fully comply with the requirements of the {read,write}X_relaxed()
methods.

Let's then convert the currently generated relaxed IO-accessors to pure
Loads and Stores. Since commit 3d474dacae72 ("MIPS: Enforce strong
ordering for MMIO accessors") and commit 8b656253a7a4 ("MIPS: Provide
actually relaxed MMIO accessors") have already prepared the corresponding
macro, we can do that just by replacing the use of the "barrier"
parameter with the "relax" one. Note the "barrier" macro argument can be
removed, since it isn't really used anyway: it is always passed as 1.
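As a rough illustration of the resulting condition, here is a minimal
user-space sketch of how the patched accessors choose their leading
barrier. The enum and function names are hypothetical, and
CONFIG_STRONG_UC_ORDERING is modelled as a plain bool argument rather
than a real Kconfig symbol; only the boolean expression itself is taken
from the diff.

```c
#include <stdbool.h>

/* Hypothetical labels for the two branches of the patched write accessor. */
enum io_barrier {
	IO_BARRIER_RW,		/* stands in for iobarrier_rw() */
	IO_BARRIER_WAR_WMB,	/* stands in for war_io_reorder_wmb() */
};

/*
 * Model of the write path: the full barrier is skipped only when the
 * accessor is a relaxed one AND the platform selected strong UC ordering.
 */
static enum io_barrier write_barrier_for(bool relax, bool strong_uc_ordering)
{
	if (!(relax && strong_uc_ordering))
		return IO_BARRIER_RW;
	return IO_BARRIER_WAR_WMB;
}

/* Model of the read path: same condition, but there is no else branch. */
static bool read_needs_iobarrier_rw(bool relax, bool strong_uc_ordering)
{
	return !(relax && strong_uc_ordering);
}
```

In this model, with the config disabled every accessor still takes the
iobarrier_rw() branch, which matches the previous behaviour where
"barrier" was always 1.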

Of course it would be foolish to believe that all the available MIPS-based
CPUs completely follow the denoted specification, especially considering
how old the architecture is. Instead we introduce a dedicated kernel
config, which when enabled will convert the relaxed IO-accessors to pure
Loads and Stores without any additional barriers around them. So if some
CPU supports strongly ordered UC memory accesses, it can enable that
config and use fully optimized relaxed IO-methods. For instance, the
Baikal-T1 architecture support code will do that.

[1] MIPS Coherence Protocol Specification, Document Number: MD00605,
    Revision 01.01. September 14, 2015, 4.2 Execution Order Behavior,
    p. 33

[2] MIPS Coherence Protocol Specification, Document Number: MD00605,
    Revision 01.01. September 14, 2015, 4.8.1 IO Device Access, p. 58

[3] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt,
    Section "KERNEL I/O BARRIER EFFECTS"

Signed-off-by: Serge Semin
Cc: Maciej W. Rozycki
Reviewed-by: Jiaxun Yang
---
 arch/mips/Kconfig          |  8 ++++++++
 arch/mips/include/asm/io.h | 20 ++++++++++----------
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index c95fa3a2484c..2c82d927347d 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -2066,6 +2066,14 @@ config WEAK_ORDERING
 #
 config WEAK_REORDERING_BEYOND_LLSC
 	bool
+
+#
+# CPU may not reorder reads and writes R->R, R->W, W->R, W->W within Uncached
+# Cacheability and Coherency Attribute (CCA=2)
+#
+config STRONG_UC_ORDERING
+	bool
+
 endmenu
 
 #
diff --git a/arch/mips/include/asm/io.h b/arch/mips/include/asm/io.h
index 78537aa23500..130c4b6458fc 100644
--- a/arch/mips/include/asm/io.h
+++ b/arch/mips/include/asm/io.h
@@ -213,7 +213,7 @@ void iounmap(const volatile void __iomem *addr);
 #define war_io_reorder_wmb()		barrier()
 #endif
 
-#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, barrier, relax, irq)	\
+#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, relax, irq)		\
 									\
 static inline void pfx##write##bwlq(type val,				\
 				    volatile void __iomem *mem)		\
@@ -221,7 +221,7 @@ static inline void pfx##write##bwlq(type val,				\
 	volatile type *__mem;						\
 	type __val;							\
 									\
-	if (barrier)							\
+	if (!(relax && IS_ENABLED(CONFIG_STRONG_UC_ORDERING)))		\
 		iobarrier_rw();						\
 	else								\
 		war_io_reorder_wmb();					\
@@ -262,7 +262,7 @@ static inline type pfx##read##bwlq(const volatile void __iomem *mem)	\
 									\
 	__mem = (void *)__swizzle_addr_##bwlq((unsigned long)(mem));	\
 									\
-	if (barrier)							\
+	if (!(relax && IS_ENABLED(CONFIG_STRONG_UC_ORDERING)))		\
 		iobarrier_rw();						\
 									\
 	if (sizeof(type) != sizeof(u64) || sizeof(u64) == sizeof(long))	\
@@ -294,14 +294,14 @@ static inline type pfx##read##bwlq(const volatile void __iomem *mem)	\
 	return pfx##ioswab##bwlq(__mem, __val);				\
 }
 
-#define __BUILD_IOPORT_SINGLE(pfx, bwlq, type, barrier, relax, p)	\
+#define __BUILD_IOPORT_SINGLE(pfx, bwlq, type, relax, p)		\
 									\
 static inline void pfx##out##bwlq##p(type val, unsigned long port)	\
 {									\
 	volatile type *__addr;						\
 	type __val;							\
 									\
-	if (barrier)							\
+	if (!(relax && IS_ENABLED(CONFIG_STRONG_UC_ORDERING)))		\
 		iobarrier_rw();						\
 	else								\
 		war_io_reorder_wmb();					\
@@ -325,7 +325,7 @@ static inline type pfx##in##bwlq##p(unsigned long port)		\
 									\
 	BUILD_BUG_ON(sizeof(type) > sizeof(unsigned long));		\
 									\
-	if (barrier)							\
+	if (!(relax && IS_ENABLED(CONFIG_STRONG_UC_ORDERING)))		\
 		iobarrier_rw();						\
 									\
 	__val = *__addr;						\
@@ -338,7 +338,7 @@ static inline type pfx##in##bwlq##p(unsigned long port)		\
 
 #define __BUILD_MEMORY_PFX(bus, bwlq, type, relax)			\
 									\
-__BUILD_MEMORY_SINGLE(bus, bwlq, type, 1, relax, 1)
+__BUILD_MEMORY_SINGLE(bus, bwlq, type, relax, 1)
 
 #define BUILDIO_MEM(bwlq, type)						\
 									\
@@ -358,8 +358,8 @@ __BUILD_MEMORY_PFX(__mem_, q, u64, 0)
 #endif
 
 #define __BUILD_IOPORT_PFX(bus, bwlq, type)				\
-	__BUILD_IOPORT_SINGLE(bus, bwlq, type, 1, 0,)			\
-	__BUILD_IOPORT_SINGLE(bus, bwlq, type, 1, 0, _p)
+	__BUILD_IOPORT_SINGLE(bus, bwlq, type, 0,)			\
+	__BUILD_IOPORT_SINGLE(bus, bwlq, type, 0, _p)
 
 #define BUILDIO_IOPORT(bwlq, type)					\
 	__BUILD_IOPORT_PFX(, bwlq, type)				\
@@ -374,7 +374,7 @@ BUILDIO_IOPORT(q, u64)
 
 #define __BUILDIO(bwlq, type)						\
 									\
-__BUILD_MEMORY_SINGLE(____raw_, bwlq, type, 1, 0, 0)
+__BUILD_MEMORY_SINGLE(____raw_, bwlq, type, 0, 0)
 
 __BUILDIO(q, u64)
 

From patchwork Sun Sep 20 11:00:10 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Serge Semin
X-Patchwork-Id: 11787411
From: Serge Semin
To: Thomas Bogendoerfer
CC: Serge Semin, Serge Semin, Alexey Malahov, Pavel Parkhomenko,
 Vadim Vlasov, "Maciej W. Rozycki"
Subject: [PATCH 2/2] mips: Introduce MIPS CM2 GCR Control register accessors
Date: Sun, 20 Sep 2020 14:00:10 +0300
Message-ID: <20200920110010.16796-3-Sergey.Semin@baikalelectronics.ru>
In-Reply-To: <20200920110010.16796-1-Sergey.Semin@baikalelectronics.ru>
References: <20200920110010.16796-1-Sergey.Semin@baikalelectronics.ru>
X-Mailing-List: linux-mips@vger.kernel.org

For some reason these accessors have been absent from the MIPS kernel,
while some of them can be used to tune the MIPS code execution up (the
default values are fully acceptable though). For instance, in the
framework of MIPS P5600/P6600 (see [1] for details) if we are sure the IO
interconnect doesn't reorder the requests we can freely set
GCR_CONTROL.SYNCDIS, which will make CM2 respond to SYNCs just after a
request is accepted on the L2/Memory interface instead of executing the
legacy SYNC and waiting for the full response from L2/Memory. Needless to
say, this will significantly speed up the {read,write}X() IO-accessors
due to having more lightweight barriers around the IO Loads and Stores.
There are other MIPS Coherency Manager optimizations available in that
register, like cache ops serialization limits, speculative read enable,
etc, which can be useful for the various MIPS platforms.

[1] MIPS32 P5600 Multiprocessing System Software User's Manual,
    Document Number: MD01025, Revision 01.60, April 19, 2016, p. 400

Signed-off-by: Serge Semin

---
Folks, do you think it would be better to implement a dedicated config for
arch/mips/kernel/mips-cm.c code, which would disable the SI_SyncTxEn
acceptance by setting the GCR_CONTROL.SYNCDIS bit? Currently I intend to
set it in our platform-specific prom_init() method.
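
For illustration only, here is a user-space sketch of the
read-modify-write that such a prom_init() hook might perform. It assumes
GCR_ACCESSOR_RW() generates read_gcr_control()/write_gcr_control() the
same way it does for the other GCR registers; here those two functions
are stand-ins backed by a plain variable, and cm2_disable_legacy_sync()
is a hypothetical helper name.

```c
#include <stdint.h>

#define BIT(n)			(UINT64_C(1) << (n))
#define CM_GCR_CONTROL_SYNCDIS	BIT(5)	/* bit position taken from the patch */

/* Plain variable standing in for the memory-mapped GCR_CONTROL register. */
static uint64_t fake_gcr_control;

/* Stand-ins for the accessors assumed to come from GCR_ACCESSOR_RW(). */
static uint64_t read_gcr_control(void)
{
	return fake_gcr_control;
}

static void write_gcr_control(uint64_t val)
{
	fake_gcr_control = val;
}

/* Hypothetical helper: set SYNCDIS while preserving all other bits. */
static void cm2_disable_legacy_sync(void)
{
	write_gcr_control(read_gcr_control() | CM_GCR_CONTROL_SYNCDIS);
}
```

In real platform code the same update would go through the generated
accessors against the actual register, and only on systems where the IO
interconnect is known not to reorder requests.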
---
 arch/mips/include/asm/mips-cm.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/mips/include/asm/mips-cm.h b/arch/mips/include/asm/mips-cm.h
index aeae2effa123..17b2adf57e0c 100644
--- a/arch/mips/include/asm/mips-cm.h
+++ b/arch/mips/include/asm/mips-cm.h
@@ -143,6 +143,21 @@ GCR_ACCESSOR_RW(64, 0x008, base)
 #define CM_GCR_BASE_CMDEFTGT_IOCU0		2
 #define CM_GCR_BASE_CMDEFTGT_IOCU1		3
 
+/* GCR_CONTROL - Global CM2 Settings */
+GCR_ACCESSOR_RW(64, 0x010, control)
+#define CM_GCR_CONTROL_SYNCCTL			BIT(16)
+#define CM_GCR_CONTROL_SYNCDIS			BIT(5)
+#define CM_GCR_CONTROL_IVU_EN			BIT(4)
+#define CM_GCR_CONTROL_SHST_EN			BIT(3)
+#define CM_GCR_CONTROL_PARK_EN			BIT(2)
+#define CM_GCR_CONTROL_MMIO_LIMIT_DIS		BIT(1)
+#define CM_GCR_CONTROL_SPEC_READ_EN		BIT(0)
+
+/* GCR_CONTROL2 - Global CM2 Settings (continue) */
+GCR_ACCESSOR_RW(64, 0x018, control2)
+#define CM_GCR_CONTROL2_L2_CACHEOP_LIMIT	GENMASK(19, 16)
+#define CM_GCR_CONTROL2_L1_CACHEOP_LIMIT	GENMASK(3, 0)
+
 /* GCR_ACCESS - Controls core/IOCU access to GCRs */
 GCR_ACCESSOR_RW(32, 0x020, access)
 #define CM_GCR_ACCESS_ACCESSEN			GENMASK(7, 0)
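
As a side note on the GENMASK()-style definitions above, the following
user-space sketch shows how the two GCR_CONTROL2 fields could be read and
updated. The GENMASK() macro here is a simplified 64-bit re-implementation
for the example, and field_get()/field_set() are hypothetical helpers;
kernel code would more likely use FIELD_GET()/FIELD_PREP() from
<linux/bitfield.h>.

```c
#include <stdint.h>

/* Simplified 64-bit GENMASK for user space: bits h..l inclusive are set. */
#define GENMASK(h, l) \
	(((~UINT64_C(0)) << (l)) & (~UINT64_C(0) >> (63 - (h))))

/* Field masks as defined in the patch. */
#define CM_GCR_CONTROL2_L2_CACHEOP_LIMIT	GENMASK(19, 16)
#define CM_GCR_CONTROL2_L1_CACHEOP_LIMIT	GENMASK(3, 0)

/* Lowest set bit of the mask, i.e. the weight of the field's bit 0. */
static uint64_t mask_lsb(uint64_t mask)
{
	return mask & (~mask + 1);
}

/* Hypothetical helper: extract the field selected by mask from reg. */
static uint64_t field_get(uint64_t mask, uint64_t reg)
{
	return (reg & mask) / mask_lsb(mask);
}

/* Hypothetical helper: return reg with the selected field set to val. */
static uint64_t field_set(uint64_t mask, uint64_t reg, uint64_t val)
{
	return (reg & ~mask) | ((val * mask_lsb(mask)) & mask);
}
```

The sketch only demonstrates the bit arithmetic; which limit values make
sense for a given platform is defined by the hardware manual, not by this
example.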