diff mbox

RFC on writel and writel_relaxed

Message ID 20180327095745.GB29373@arm.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

Will Deacon March 27, 2018, 9:57 a.m. UTC
[+ locking/ordering/docs people]

On Tue, Mar 27, 2018 at 07:56:59PM +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2018-03-27 at 09:56 +0200, Arnd Bergmann wrote:
> > On Tue, Mar 27, 2018 at 12:27 AM, Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > On Tue, Mar 27, 2018 at 09:01:57AM +1100, Benjamin Herrenschmidt wrote:
> > > > On Mon, 2018-03-26 at 17:46 -0400, Sinan Kaya wrote:
> > > 
> > > I even see patches adding wmb() based on actual observed memory
> > > corruption during testing on Intel:
> > > 
> > > https://patchwork.kernel.org/patch/10177207/
> > > 
> > > So you think all of this is unnecessary and writel is totally strongly
> > > ordered, even on multi-socket Intel?
> > 
> > This example adds a wmb() between two writes to a coherent DMA
> > area, it is definitely required there. 
> 
> Ah you are right, I incorrectly assumed that the "prod_db" function was
> an MMIO. So we do NOT have a counter example where wmb is needed on
> x86, pfiew ! :-)
> 
> > I'm pretty sure I've never seen
> > any bug reports pointing to a missing wmb() between memory
> > and MMIO write accesses, but if you remember seeing them in the
> > list, maybe you can look again for some evidence of something going
> > wrong on x86 without it?
> 
> The interesting thing is that we do seem to have a whole LOT of these
> spurrious wmb before writel all over the tree, I suspect because of
> that incorrect recommendation in memory-barriers.txt.
> 
> We should fix that.

Patch below. Thoughts?

Will

--->8

From db0daeaf94f0f6232f8206fc07a74211324b11d9 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Tue, 27 Mar 2018 10:49:58 +0100
Subject: [PATCH] docs/memory-barriers.txt: Fix broken DMA vs MMIO ordering
 example

The section of memory-barriers.txt that describes the dma_Xmb() barriers
has an incorrect example claiming that a wmb() is required after writing
to coherent memory in order for those writes to be visible to a device
before a subsequent MMIO access using writel() can reach the device.

In fact, this ordering guarantee is provided (at significant cost on some
architectures such as arm and power) by writel, so the wmb() is not
necessary. writel_relaxed exists for cases where this ordering is not
required.

Fix the example and update the text to make this clearer.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Reported-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 Documentation/memory-barriers.txt | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

Comments

Arnd Bergmann March 27, 2018, 10:05 a.m. UTC | #1
On Tue, Mar 27, 2018 at 11:57 AM, Will Deacon <will.deacon@arm.com> wrote:

>
> From db0daeaf94f0f6232f8206fc07a74211324b11d9 Mon Sep 17 00:00:00 2001
> From: Will Deacon <will.deacon@arm.com>
> Date: Tue, 27 Mar 2018 10:49:58 +0100
> Subject: [PATCH] docs/memory-barriers.txt: Fix broken DMA vs MMIO ordering
>  example
>
> The section of memory-barriers.txt that describes the dma_Xmb() barriers
> has an incorrect example claiming that a wmb() is required after writing
> to coherent memory in order for those writes to be visible to a device
> before a subsequent MMIO access using writel() can reach the device.
>
> In fact, this ordering guarantee is provided (at significant cost on some
> architectures such as arm and power) by writel, so the wmb() is not
> necessary. writel_relaxed exists for cases where this ordering is not
> required.
>
> Fix the example and update the text to make this clearer.
>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Reported-by: Sinan Kaya <okaya@codeaurora.org>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  Documentation/memory-barriers.txt | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> index a863009849a3..2556b4b0e6f9 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -1909,9 +1909,6 @@ There are some more advanced barrier functions:
>                 /* assign ownership */
>                 desc->status = DEVICE_OWN;
>
> -               /* force memory to sync before notifying device via MMIO */
> -               wmb();
> -
>                 /* notify device of new descriptors */
>                 writel(DESC_NOTIFY, doorbell);
>         }
> @@ -1919,11 +1916,16 @@ There are some more advanced barrier functions:
>       The dma_rmb() allows us guarantee the device has released ownership
>       before we read the data from the descriptor, and the dma_wmb() allows
>       us to guarantee the data is written to the descriptor before the device
> -     can see it now has ownership.  The wmb() is needed to guarantee that the
> -     cache coherent memory writes have completed before attempting a write to
> -     the cache incoherent MMIO region.
> -
> -     See Documentation/DMA-API.txt for more information on consistent memory.
> +     can see it now has ownership.  Note that, when using writel(), a prior
> +     wmb() is not needed to guarantee that the cache coherent memory writes
> +     have completed before writing to the cache incoherent MMIO region.
> +     If this ordering between incoherent MMIO and coherent memory regions
> +     is not required, writel_relaxed() can be used instead and is significantly
> +     cheaper on some weakly-ordered architectures.

I think that's a great improvement, but I'm a bit worried about recommending
writel_relaxed() too much: I've seen a lot of drivers that just always use
writel_relaxed() over write(), and some of them get that wrong when they
don't understand the difference but end up using DMA without explicit
barriers anyway.

Also, having an architecture-independent driver use wmb()+writel_relaxed()
ends up being more expensive than just using write(). Not sure how to
best phrase it though.

      Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Will Deacon March 27, 2018, 10:09 a.m. UTC | #2
On Tue, Mar 27, 2018 at 12:05:06PM +0200, Arnd Bergmann wrote:
> On Tue, Mar 27, 2018 at 11:57 AM, Will Deacon <will.deacon@arm.com> wrote:
> 
> >
> > From db0daeaf94f0f6232f8206fc07a74211324b11d9 Mon Sep 17 00:00:00 2001
> > From: Will Deacon <will.deacon@arm.com>
> > Date: Tue, 27 Mar 2018 10:49:58 +0100
> > Subject: [PATCH] docs/memory-barriers.txt: Fix broken DMA vs MMIO ordering
> >  example
> >
> > The section of memory-barriers.txt that describes the dma_Xmb() barriers
> > has an incorrect example claiming that a wmb() is required after writing
> > to coherent memory in order for those writes to be visible to a device
> > before a subsequent MMIO access using writel() can reach the device.
> >
> > In fact, this ordering guarantee is provided (at significant cost on some
> > architectures such as arm and power) by writel, so the wmb() is not
> > necessary. writel_relaxed exists for cases where this ordering is not
> > required.
> >
> > Fix the example and update the text to make this clearer.
> >
> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: Jason Gunthorpe <jgg@ziepe.ca>
> > Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Jonathan Corbet <corbet@lwn.net>
> > Reported-by: Sinan Kaya <okaya@codeaurora.org>
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > ---
> >  Documentation/memory-barriers.txt | 18 ++++++++++--------
> >  1 file changed, 10 insertions(+), 8 deletions(-)
> >
> > diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> > index a863009849a3..2556b4b0e6f9 100644
> > --- a/Documentation/memory-barriers.txt
> > +++ b/Documentation/memory-barriers.txt
> > @@ -1909,9 +1909,6 @@ There are some more advanced barrier functions:
> >                 /* assign ownership */
> >                 desc->status = DEVICE_OWN;
> >
> > -               /* force memory to sync before notifying device via MMIO */
> > -               wmb();
> > -
> >                 /* notify device of new descriptors */
> >                 writel(DESC_NOTIFY, doorbell);
> >         }
> > @@ -1919,11 +1916,16 @@ There are some more advanced barrier functions:
> >       The dma_rmb() allows us guarantee the device has released ownership
> >       before we read the data from the descriptor, and the dma_wmb() allows
> >       us to guarantee the data is written to the descriptor before the device
> > -     can see it now has ownership.  The wmb() is needed to guarantee that the
> > -     cache coherent memory writes have completed before attempting a write to
> > -     the cache incoherent MMIO region.
> > -
> > -     See Documentation/DMA-API.txt for more information on consistent memory.
> > +     can see it now has ownership.  Note that, when using writel(), a prior
> > +     wmb() is not needed to guarantee that the cache coherent memory writes
> > +     have completed before writing to the cache incoherent MMIO region.
> > +     If this ordering between incoherent MMIO and coherent memory regions
> > +     is not required, writel_relaxed() can be used instead and is significantly
> > +     cheaper on some weakly-ordered architectures.
> 
> I think that's a great improvement, but I'm a bit worried about recommending
> writel_relaxed() too much: I've seen a lot of drivers that just always use
> writel_relaxed() over write(), and some of them get that wrong when they
> don't understand the difference but end up using DMA without explicit
> barriers anyway.
> 
> Also, having an architecture-independent driver use wmb()+writel_relaxed()
> ends up being more expensive than just using write(). Not sure how to
> best phrase it though.

Perhaps I add reword that with a simple example to say:

  If this ordering between incoherent MMIO and coherent memory regions
  is not required (e.g. in a sequence of accesses all to the MMIO region)
  [...]

since that seems to be the usual case where the _relaxed accessors help.

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann March 27, 2018, 10:53 a.m. UTC | #3
On Tue, Mar 27, 2018 at 12:09 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Tue, Mar 27, 2018 at 12:05:06PM +0200, Arnd Bergmann wrote:
>> > -
>> > -     See Documentation/DMA-API.txt for more information on consistent memory.
>> > +     can see it now has ownership.  Note that, when using writel(), a prior
>> > +     wmb() is not needed to guarantee that the cache coherent memory writes
>> > +     have completed before writing to the cache incoherent MMIO region.
>> > +     If this ordering between incoherent MMIO and coherent memory regions

One more thing: I think the term "incoherent MMIO" is a bit confusing, I'd
prefer just "MMIO" here. At least I don't have the faintest clue what the
difference between "coherent MMIO" and "incoherent MMIO" would be ;-)

>> > +     is not required, writel_relaxed() can be used instead and is significantly
>> > +     cheaper on some weakly-ordered architectures.
>>
>> I think that's a great improvement, but I'm a bit worried about recommending
>> writel_relaxed() too much: I've seen a lot of drivers that just always use
>> writel_relaxed() over write(), and some of them get that wrong when they
>> don't understand the difference but end up using DMA without explicit
>> barriers anyway.
>>
>> Also, having an architecture-independent driver use wmb()+writel_relaxed()
>> ends up being more expensive than just using write(). Not sure how to
>> best phrase it though.
>
> Perhaps I add reword that with a simple example to say:
>
>   If this ordering between incoherent MMIO and coherent memory regions
>   is not required (e.g. in a sequence of accesses all to the MMIO region)
>   [...]
>
> since that seems to be the usual case where the _relaxed accessors help.

That still doesn't quite capture what I'd like driver writes to do: in essence
I would recommend them to use writel() all the time, except in performance
critical code that has been shown to be correct and has a comment to explain
why _relaxed() is ok in that particular function.

Maybe it can just be rephrased to warn against the use of writel_relaxed()
here, and explain the difference that way:

     can see it now has ownership.  Note that, when using writel(), a prior
     wmb() is not needed to guarantee that the cache coherent memory writes
     have completed before writing to the cache incoherent MMIO region.
     The cheaper writel_relaxed() does not guarantee the DMA to be visible
     to the device and must not be used here.

       Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Benjamin Herrenschmidt March 27, 2018, 11:21 a.m. UTC | #4
On Tue, 2018-03-27 at 10:57 +0100, Will Deacon wrote:
> > 
> > The interesting thing is that we do seem to have a whole LOT of these
> > spurrious wmb before writel all over the tree, I suspect because of
> > that incorrect recommendation in memory-barriers.txt.
> > 
> > We should fix that.
> 
> Patch below. Thoughts?

Looks good, we should probably also have a clearer (explicit)
definition of that ordering in the driver-api doco.

Now, to remove all those useless wmb's and find what other bugs they
were papering over ... ;-)

Cheers,
Ben.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index a863009849a3..2556b4b0e6f9 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1909,9 +1909,6 @@  There are some more advanced barrier functions:
 		/* assign ownership */
 		desc->status = DEVICE_OWN;
 
-		/* force memory to sync before notifying device via MMIO */
-		wmb();
-
 		/* notify device of new descriptors */
 		writel(DESC_NOTIFY, doorbell);
 	}
@@ -1919,11 +1916,16 @@  There are some more advanced barrier functions:
      The dma_rmb() allows us guarantee the device has released ownership
      before we read the data from the descriptor, and the dma_wmb() allows
      us to guarantee the data is written to the descriptor before the device
-     can see it now has ownership.  The wmb() is needed to guarantee that the
-     cache coherent memory writes have completed before attempting a write to
-     the cache incoherent MMIO region.
-
-     See Documentation/DMA-API.txt for more information on consistent memory.
+     can see it now has ownership.  Note that, when using writel(), a prior
+     wmb() is not needed to guarantee that the cache coherent memory writes
+     have completed before writing to the cache incoherent MMIO region.
+     If this ordering between incoherent MMIO and coherent memory regions
+     is not required, writel_relaxed() can be used instead and is significantly
+     cheaper on some weakly-ordered architectures.
+
+     See the subsection "Kernel I/O barrier effects" for more information on
+     relaxed I/O accessors and the Documentation/DMA-API.txt file for more
+     information on consistent memory.
 
 
 MMIO WRITE BARRIER