From patchwork Wed Apr  4 15:52:20 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Sinan Kaya <okaya@codeaurora.org>
X-Patchwork-Id: 10322813
Return-Path: <linux-arm-msm-owner@kernel.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	168CF60390 for <patchwork-linux-arm-msm@patchwork.kernel.org>;
	Wed,  4 Apr 2018 15:52:39 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 03F24284B9
	for <patchwork-linux-arm-msm@patchwork.kernel.org>;
	Wed,  4 Apr 2018 15:52:39 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id E9BB728517; Wed,  4 Apr 2018 15:52:38 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED,
	RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=unavailable version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 73C05284B9
	for <patchwork-linux-arm-msm@patchwork.kernel.org>;
	Wed,  4 Apr 2018 15:52:38 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752121AbeDDPw0 (ORCPT
	<rfc822;patchwork-linux-arm-msm@patchwork.kernel.org>);
	Wed, 4 Apr 2018 11:52:26 -0400
Received: from smtp.codeaurora.org ([198.145.29.96]:33762 "EHLO
	smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752020AbeDDPwX (ORCPT
	<rfc822;linux-arm-msm@vger.kernel.org>);
	Wed, 4 Apr 2018 11:52:23 -0400
Received: by smtp.codeaurora.org (Postfix, from userid 1000)
	id 3384B607E5; Wed,  4 Apr 2018 15:52:23 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org;
	s=default; t=1522857143;
	bh=NMWU3O0G152eA3E4vzdqpH2oiVUOusDA372SGYfMpvk=;
	h=Subject:To:Cc:References:From:Date:In-Reply-To:From;
	b=Xd2wQWge8RX3DqEir+uofAnqdceCumuwKY1YGcja+87lRlyLlxupdF8NKVM0ge8xv
	iKmfE8zOtBgZsYJTwzvzr1jJ87vsTqwr84m9+WxBJsSS7OGeIAbgKhM69k8kUQaKgf
	voMaNWkGiCdpkc1CDwBzmW6Fvh3R9IN+nwt2oulc=
Received: from [10.235.228.150] (global_nat1_iad_fw.qualcomm.com
	[129.46.232.65])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128
	bits)) (No client certificate requested)
	(Authenticated sender: okaya@smtp.codeaurora.org)
	by smtp.codeaurora.org (Postfix) with ESMTPSA id 5321B607A2;
	Wed,  4 Apr 2018 15:52:21 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org;
	s=default; t=1522857142;
	bh=NMWU3O0G152eA3E4vzdqpH2oiVUOusDA372SGYfMpvk=;
	h=Subject:To:Cc:References:From:Date:In-Reply-To:From;
	b=A+4SfT9tqmh5FAo9MDaZHGAi4sls4qChi0lHOQkZIR824D/6KLwjSnMvS5hGVfOa0
	4MiFKFF2BshAEyU01cLBQwglBpQ1dCk6ELQZ7adTZsSZHEIwEb7fcnRgL8tEhA8qA8
	E4Cp9AmAxwraeGt8PNpGbwIocNO6yXVtKY9CUViY=
DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 5321B607A2
Authentication-Results: pdx-caf-mail.web.codeaurora.org;
	dmarc=none (p=none dis=none)
	header.from=codeaurora.org
Authentication-Results: pdx-caf-mail.web.codeaurora.org;
	spf=none smtp.mailfrom=okaya@codeaurora.org
Subject: Re: [PATCH v2 2/2] io: prevent compiler reordering on the default
	readX() implementation
To: Palmer Dabbelt <palmer@sifive.com>, Arnd Bergmann <arnd@arndb.de>
Cc: mark.rutland@arm.com, timur@codeaurora.org, sulrich@codeaurora.org,
	linux-arm-msm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org
References: <mhng-fe49c525-788d-4ce7-9703-95e2b3eaeca6@palmer-si-x1c4>
From: Sinan Kaya <okaya@codeaurora.org>
Message-ID: <691b903c-e97d-0a25-28c5-690318bb215a@codeaurora.org>
Date: Wed, 4 Apr 2018 11:52:20 -0400
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
	Thunderbird/52.6.0
MIME-Version: 1.0
In-Reply-To: <mhng-fe49c525-788d-4ce7-9703-95e2b3eaeca6@palmer-si-x1c4>
Content-Language: en-US
Sender: linux-arm-msm-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-arm-msm.vger.kernel.org>
X-Mailing-List: linux-arm-msm@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

On 4/3/2018 6:29 PM, Palmer Dabbelt wrote:
> On Tue, 03 Apr 2018 05:56:18 PDT (-0700), Arnd Bergmann wrote:
>> On Tue, Apr 3, 2018 at 2:44 PM, Sinan Kaya <okaya@codeaurora.org> wrote:
>>> On 4/3/2018 7:13 AM, Arnd Bergmann wrote:
>>>> On Tue, Apr 3, 2018 at 12:49 PM, Mark Rutland <mark.rutland@arm.com> wrote:
>>>>> Hi,
>>>>>
>>>>> On Fri, Mar 30, 2018 at 11:58:13AM -0400, Sinan Kaya wrote:
>>>>>> The default implementation of mapping readX() to __raw_readX() is wrong.
>>>>>> readX() has stronger ordering semantics. Compiler is allowed to reorder
>>>>>> __raw_readX().
>>>>>
>>>>> Could you please specify what the compiler is potentially reordering
>>>>> __raw_readX() against, and why this would be wrong?
>>>>>
>>>>> e.g. do we care about prior normal memory accesses, subsequent normal
>>>>> memory accesses, and/or other IO accesses?
>>>>>
>>>>> I assume that the asm-generic __raw_{read,write}X() implementations are
>>>>> all ordered w.r.t. each other (at least for a specific device).
>>>>
>>>> I think that is correct: the compiler won't reorder those because of the
>>>> 'volatile' pointer dereference, but it can reorder access to a normal
>>>> pointer against a __raw_readl()/__raw_writel(), which breaks the scenario
>>>> of using writel to trigger a DMA, or using a readl to see if a DMA has
>>>> completed.
>>>
>>> Yes, we are worried about memory update vs. IO update ordering here.
>>> That was the reason why barrier() was introduced in this patch. I'll try to
>>> clarify that better in the commit text.
>>>
>>>>
>>>> The question is whether we should use a stronger barrier such
>>>> as rmb() and wmb() here rather than a simple compiler barrier.
>>>>
>>>> I would assume that on complex architectures with write buffers and
>>>> out-of-order prefetching, those are required, while on architectures
>>>> without those features, the barriers are cheap.
>>>
>>> That's my reasoning too. I'm trying to follow the x86 example here where there
>>> is a compiler barrier in writeX() and readX() family of functions.
>>
>> I think x86 is the special case here because it implicitly guarantees
>> the strict ordering in the hardware, as long as the compiler gets it
>> right. For the asm-generic version, it may be better to play safe and
>> do the safest version, requiring architectures to override that barrier
>> if they want to be faster.
>>
>> We could use the same macros that riscv has, using __io_br(),
>> __io_ar(), __io_bw() and __io_aw() for before/after read/write.
> 
> FWIW, when I wrote this I wasn't sure what the RISC-V memory model was going to be so I just picked something generic.  In other words, it's already a generic interface, just one that we're the only users of :).
> 
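
To make the memory vs. IO ordering concern quoted above concrete, here is a
small user-space sketch. It is not kernel code: desc, dma_buf, fake_status,
fake_doorbell and the raw_readl()/raw_writel() helpers are hypothetical
stand-ins for a DMA descriptor, a DMA buffer, two device registers and
__raw_readl()/__raw_writel(). A single-threaded run cannot show the hazard
itself; the sketch only marks where the compiler barrier has to sit.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier, the usual GCC/Clang idiom. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-ins for normal memory shared with a DMA engine. */
struct fake_desc { uint32_t addr; uint32_t len; };
static struct fake_desc desc;
static uint32_t dma_buf[4];

/* Hypothetical stand-ins for device registers. */
static volatile uint32_t fake_status;
static volatile uint32_t fake_doorbell;

/* Stand-ins for __raw_readl()/__raw_writel(): plain volatile accesses. */
static inline uint32_t raw_readl(const volatile uint32_t *reg)
{
	return *reg;
}

static inline void raw_writel(uint32_t v, volatile uint32_t *reg)
{
	*reg = v;
}

/* Write side: the descriptor must be in memory before the doorbell rings. */
void start_dma(uint32_t addr, uint32_t len)
{
	desc.addr = addr;
	desc.len = len;
	barrier();	/* keep the normal stores above the register write */
	raw_writel(1, &fake_doorbell);
}

/* Read side: the buffer must not be loaded before the status register. */
uint32_t dma_result(void)
{
	uint32_t done = raw_readl(&fake_status);

	barrier();	/* keep the dma_buf load below the register read */
	return done ? dma_buf[0] : 0;
}

int run_dma_demo(void)
{
	start_dma(0x1000, 64);
	assert(fake_doorbell == 1 && desc.len == 64);

	/* Pretend the device filled the buffer and then set the status bit. */
	dma_buf[0] = 42;
	fake_status = 1;
	assert(dma_result() == 42);
	return 0;
}
```

Without the barrier() calls, only the volatile accesses are ordered against
each other; the compiler is free to move the plain accesses to desc and
dma_buf across them. barrier() constrains the compiler only, so a CPU with
write buffers or out-of-order loads would still need rmb()/wmb() here.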

Are we looking for something like this?
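
The shape I have in mind is a pair of hooks with generic defaults that an
architecture can override, wrapped around the raw accessor. Here is a
user-space sketch of just that pattern, separate from the actual diff
below. The names io_br(), io_ar(), sketch_raw_readl(), sketch_readl() and
the after_read_hits counter are made up for illustration; the counter
exists only so the sketch can observe that the overriding hook ran. Byte
swapping is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier, the usual GCC/Clang idiom. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Pretend "architecture" override, defined before the generic defaults.
 * A real override would use whatever barrier the architecture needs.
 */
static unsigned int after_read_hits;
#define io_ar() do { barrier(); after_read_hits++; } while (0)

/* Generic defaults, used only where no override exists. */
#ifndef io_br
#define io_br() do {} while (0)
#endif

#ifndef io_ar
#define io_ar() barrier()
#endif

/* Stand-in for __raw_readl(): a plain volatile load. */
static inline uint32_t sketch_raw_readl(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* Ordered read: before-hook, raw access, after-hook. */
static inline uint32_t sketch_readl(const volatile void *addr)
{
	uint32_t v;

	io_br();
	v = sketch_raw_readl(addr);
	io_ar();
	return v;
}

int run_hook_demo(void)
{
	uint32_t fake_reg = 0xdeadbeefu;

	assert(sketch_readl(&fake_reg) == 0xdeadbeefu);
	assert(after_read_hits == 1);	/* the override ran, not the default */
	return 0;
}
```

The sketch keeps the accessor as a static inline function instead of a
macro. That is only one option, but it keeps type checking on the address
argument and leaves a "#define readl readl" style guard intact, whereas a
function-like macro of the same name redefines the guard, which compilers
diagnose.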

diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index e8c2078..693a82f 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -101,6 +101,16 @@ static inline void __raw_writeq(u64 value, volatile void __iomem *addr)
 #endif
 #endif /* CONFIG_64BIT */
 
+#ifndef __io_br
+#define __io_br()	do {} while (0)
+#endif
+
+#ifdef rmb
+#define __io_ar()	rmb()
+#else
+#define __io_ar()	barrier()
+#endif
+
 /*
  * {read,write}{b,w,l,q}() access little endian memory and return result in
  * native endianness.
@@ -108,35 +118,46 @@ static inline void __raw_writeq(u64 value, volatile void __iomem *addr)
 
 #ifndef readb
 #define readb readb
-static inline u8 readb(const volatile void __iomem *addr)
-{
-	return __raw_readb(addr);
-}
+#define readb(c)				\
+	({ u8  __v;				\
+	 __io_br();				\
+	 __v = __raw_readb(c);			\
+	 __io_ar();				\
+	 __v; })
 #endif
 
 #ifndef readw
 #define readw readw
-static inline u16 readw(const volatile void __iomem *addr)
-{
-	return __le16_to_cpu(__raw_readw(addr));
-}
+#define readw(c)				\
+	({ u16 __v;				\
+						\
+	 __io_br();				\
+	 __v = __le16_to_cpu(__raw_readw(c));	\
+	 __io_ar();				\
+	 __v; })
 #endif
 
 #ifndef readl
 #define readl readl
-static inline u32 readl(const volatile void __iomem *addr)
-{
-	return __le32_to_cpu(__raw_readl(addr));
-}
+#define readl(c)				\
+	({ u32 __v;				\
+						\
+	 __io_br();				\
+	 __v = __le32_to_cpu(__raw_readl(c));	\
+	 __io_ar();				\
+	 __v; })
 #endif
 
 #ifdef CONFIG_64BIT
 #ifndef readq
 #define readq readq
-static inline u64 readq(const volatile void __iomem *addr)
-{
-	return __le64_to_cpu(__raw_readq(addr));
-}
+#define readq(c)				\
+	({ u64 __v;				\
+						\
+	 __io_br();				\
+	 __v = __le64_to_cpu(__raw_readq(c));	\
+	 __io_ar();				\
+	 __v; })
 #endif
 #endif /* CONFIG_64BIT */