diff mbox

[RFC,1/3] pstore-ram: use write-combine mappings

Message ID 1365563297-12480-1-git-send-email-robherring2@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Rob Herring April 10, 2013, 3:08 a.m. UTC
From: Rob Herring <rob.herring@calxeda.com>

Atomic operations are undefined behavior on ARM for device or strongly
ordered memory types. So use write-combine variants for mappings. This
corresponds to normal, non-cacheable memory on ARM. For many other
architectures, this change should not change the mapping type.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Anton Vorontsov <cbouatmailru@gmail.com>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-kernel@vger.kernel.org
---
 fs/pstore/ram_core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Colin Cross April 10, 2013, 3:53 a.m. UTC | #1
On Tue, Apr 9, 2013 at 8:08 PM, Rob Herring <robherring2@gmail.com> wrote:
> From: Rob Herring <rob.herring@calxeda.com>
>
> Atomic operations are undefined behavior on ARM for device or strongly
> ordered memory types. So use write-combine variants for mappings. This
> corresponds to normal, non-cacheable memory on ARM. For many other
> architectures, this change should not change the mapping type.

This is going to make ramconsole less reliable.  A debugging printk
followed by a __raw_writel that causes an immediate hard crash is
likely to lose the last updates, including the most useful message, in
the write buffers.

Also, isn't this patch unnecessary after patch 3 in this set?

> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> Cc: Anton Vorontsov <cbouatmailru@gmail.com>
> Cc: Colin Cross <ccross@android.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: linux-kernel@vger.kernel.org
> ---
>  fs/pstore/ram_core.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
> index 0306303..e126d9f 100644
> --- a/fs/pstore/ram_core.c
> +++ b/fs/pstore/ram_core.c
> @@ -337,7 +337,7 @@ static void *persistent_ram_vmap(phys_addr_t start, size_t size)
>         page_start = start - offset_in_page(start);
>         page_count = DIV_ROUND_UP(size + offset_in_page(start), PAGE_SIZE);
>
> -       prot = pgprot_noncached(PAGE_KERNEL);
> +       prot = pgprot_writecombine(PAGE_KERNEL);
Is this necessary?  Won't pgprot_noncached already be normal memory?

>         pages = kmalloc(sizeof(struct page *) * page_count, GFP_KERNEL);
>         if (!pages) {
> @@ -364,7 +364,7 @@ static void *persistent_ram_iomap(phys_addr_t start, size_t size)
>                 return NULL;
>         }
>
> -       return ioremap(start, size);
> +       return ioremap_wc(start, size);

ioremap_wc corresponds to MT_DEVICE_WC, which is still device memory,
so I don't see how this helps solve the problem in the commit message.

>  }
>
>  static int persistent_ram_buffer_map(phys_addr_t start, phys_addr_t size,
> --
> 1.7.10.4
>
Rob Herring April 10, 2013, 1:30 p.m. UTC | #2
On 04/09/2013 10:53 PM, Colin Cross wrote:
> On Tue, Apr 9, 2013 at 8:08 PM, Rob Herring <robherring2@gmail.com> wrote:
>> From: Rob Herring <rob.herring@calxeda.com>
>>
>> Atomic operations are undefined behavior on ARM for device or strongly
>> ordered memory types. So use write-combine variants for mappings. This
>> corresponds to normal, non-cacheable memory on ARM. For many other
>> architectures, this change should not change the mapping type.
> 
> This is going to make ramconsole less reliable.  A debugging printk
> followed by a __raw_writel that causes an immediate hard crash is
> likely to lose the last updates, including the most useful message, in
> the write buffers.

It would have to be a write that hangs the bus. In my experience with
AXI, the bus doesn't actually hang until you hit max outstanding
transactions.

I think exclusive stores will limit the buffering, but that is probably
not architecturally guaranteed.

I could put a wb() in at the end of persistent_ram_write.

> Also, isn't this patch unnecessary after patch 3 in this set?

It is still needed in the main memory case to be architecturally correct
to avoid multiple mappings of different memory types and exclusive
accesses to device memory. At least on an A9, it doesn't really seem to
matter. I could remove this for the ioremap case.

Rob

>> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
>> Cc: Anton Vorontsov <cbouatmailru@gmail.com>
>> Cc: Colin Cross <ccross@android.com>
>> Cc: Kees Cook <keescook@chromium.org>
>> Cc: Tony Luck <tony.luck@intel.com>
>> Cc: linux-kernel@vger.kernel.org
>> ---
>>  fs/pstore/ram_core.c |    4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
>> index 0306303..e126d9f 100644
>> --- a/fs/pstore/ram_core.c
>> +++ b/fs/pstore/ram_core.c
>> @@ -337,7 +337,7 @@ static void *persistent_ram_vmap(phys_addr_t start, size_t size)
>>         page_start = start - offset_in_page(start);
>>         page_count = DIV_ROUND_UP(size + offset_in_page(start), PAGE_SIZE);
>>
>> -       prot = pgprot_noncached(PAGE_KERNEL);
>> +       prot = pgprot_writecombine(PAGE_KERNEL);
> Is this necessary?  Won't pgprot_noncached already be normal memory?
> 
>>         pages = kmalloc(sizeof(struct page *) * page_count, GFP_KERNEL);
>>         if (!pages) {
>> @@ -364,7 +364,7 @@ static void *persistent_ram_iomap(phys_addr_t start, size_t size)
>>                 return NULL;
>>         }
>>
>> -       return ioremap(start, size);
>> +       return ioremap_wc(start, size);
> 
> ioremap_wc corresponds to MT_DEVICE_WC, which is still device memory,
> so I don't see how this helps solve the problem in the commit message.
> 
>>  }
>>
>>  static int persistent_ram_buffer_map(phys_addr_t start, phys_addr_t size,
>> --
>> 1.7.10.4
>>
Colin Cross April 15, 2013, 10:21 p.m. UTC | #3
On Wed, Apr 10, 2013 at 6:30 AM, Rob Herring <robherring2@gmail.com> wrote:
> On 04/09/2013 10:53 PM, Colin Cross wrote:
>> On Tue, Apr 9, 2013 at 8:08 PM, Rob Herring <robherring2@gmail.com> wrote:
>>> From: Rob Herring <rob.herring@calxeda.com>
>>>
>>> Atomic operations are undefined behavior on ARM for device or strongly
>>> ordered memory types. So use write-combine variants for mappings. This
>>> corresponds to normal, non-cacheable memory on ARM. For many other
>>> architectures, this change should not change the mapping type.
>>
>> This is going to make ramconsole less reliable.  A debugging printk
>> followed by a __raw_writel that causes an immediate hard crash is
>> likely to lose the last updates, including the most useful message, in
>> the write buffers.
>
> It would have to be a write that hangs the bus. In my experience with
> AXI, the bus doesn't actually hang until you hit max outstanding
> transactions.

I've seen many cases where a single write to device memory in an
unclocked slave will completely and instantly hang all cpus, and the
next write will never happen.

> I think exclusive stores will limit the buffering, but that is probably
> not architecturally guaranteed.
>
> I could put a wb() in at the end of persistent_ram_write.
>
>> Also, isn't this patch unnecessary after patch 3 in this set?
>
> It is still needed in the main memory case to be architecturally correct
> to avoid multiple mappings of different memory types and exclusive
> accesses to device memory. At least on an A9, it doesn't really seem to
> matter. I could remove this for the ioremap case.

According to my reading of the latest ARM ARM (Issue C, section
A3.5.7), and Catalin's excellent explanation
(http://lists.linaro.org/pipermail/linaro-dev/2012-February/010239.html),
it is no longer considered unpredictable to have both cached and
non-cached mappings to the same memory, as long as you use proper
cache maintenance between accessing the two mappings.

In pstore_ram the cached mapping will never be accessed (and we don't
care about speculative accesses), so no cache maintenance is
necessary.  I don't see any need for this patch, and I see plenty of
possible problems.
Rob Herring April 15, 2013, 11:59 p.m. UTC | #4
On 04/15/2013 05:21 PM, Colin Cross wrote:
> On Wed, Apr 10, 2013 at 6:30 AM, Rob Herring <robherring2@gmail.com> wrote:
>> On 04/09/2013 10:53 PM, Colin Cross wrote:
>>> On Tue, Apr 9, 2013 at 8:08 PM, Rob Herring <robherring2@gmail.com> wrote:
>>>> From: Rob Herring <rob.herring@calxeda.com>
>>>>
>>>> Atomic operations are undefined behavior on ARM for device or strongly
>>>> ordered memory types. So use write-combine variants for mappings. This
>>>> corresponds to normal, non-cacheable memory on ARM. For many other
>>>> architectures, this change should not change the mapping type.
>>>
>>> This is going to make ramconsole less reliable.  A debugging printk
>>> followed by a __raw_writel that causes an immediate hard crash is
>>> likely to lose the last updates, including the most useful message, in
>>> the write buffers.
>>
>> It would have to be a write that hangs the bus. In my experience with
>> AXI, the bus doesn't actually hang until you hit max outstanding
>> transactions.
> 
> I've seen many cases where a single write to device memory in an
> unclocked slave will completely and instantly hang all cpus, and the
> next write will never happen.
> 
>> I think exclusive stores will limit the buffering, but that is probably
>> not architecturally guaranteed.
>>
>> I could put a wb() in at the end of persistent_ram_write.
>>
>>> Also, isn't this patch unnecessary after patch 3 in this set?
>>
>> It is still needed in the main memory case to be architecturally correct
>> to avoid multiple mappings of different memory types and exclusive
>> accesses to device memory. At least on an A9, it doesn't really seem to
>> matter. I could remove this for the ioremap case.
> 
> According to my reading of the latest ARM ARM (Issue C, section
> A3.5.7), and Catalin's excellent explanation
> (http://lists.linaro.org/pipermail/linaro-dev/2012-February/010239.html),
> it is no longer considered unpredictable to have both cached and
> non-cached mappings to the same memory, as long as you use proper
> cache maintenance between accessing the two mappings.
> 
> In pstore_ram the cached mapping will never be accessed (and we don't
> care about speculative accesses), so no cache maintenance is
> necessary.  I don't see any need for this patch, and I see plenty of
> possible problems.

Exclusive accesses still have further restrictions. From section 3.4.5:

• It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be
performed to a memory region
   with the Device or Strongly-ordered memory attribute. Unless the
implementation documentation explicitly
  states that LDREX and STREX operations to a memory region with the
Device or Strongly-ordered attribute are
 permitted, the effect of such operations is UNPREDICTABLE.


Given that it is implementation defined, I don't see how Linux can rely
on that behavior.

Rob
Colin Cross April 16, 2013, 12:43 a.m. UTC | #5
On Mon, Apr 15, 2013 at 4:59 PM, Rob Herring <robherring2@gmail.com> wrote:
> On 04/15/2013 05:21 PM, Colin Cross wrote:
>> On Wed, Apr 10, 2013 at 6:30 AM, Rob Herring <robherring2@gmail.com> wrote:
>>> On 04/09/2013 10:53 PM, Colin Cross wrote:
>>>> On Tue, Apr 9, 2013 at 8:08 PM, Rob Herring <robherring2@gmail.com> wrote:
>>>>> From: Rob Herring <rob.herring@calxeda.com>
>>>>>
>>>>> Atomic operations are undefined behavior on ARM for device or strongly
>>>>> ordered memory types. So use write-combine variants for mappings. This
>>>>> corresponds to normal, non-cacheable memory on ARM. For many other
>>>>> architectures, this change should not change the mapping type.
>>>>
>>>> This is going to make ramconsole less reliable.  A debugging printk
>>>> followed by a __raw_writel that causes an immediate hard crash is
>>>> likely to lose the last updates, including the most useful message, in
>>>> the write buffers.
>>>
>>> It would have to be a write that hangs the bus. In my experience with
>>> AXI, the bus doesn't actually hang until you hit max outstanding
>>> transactions.
>>
>> I've seen many cases where a single write to device memory in an
>> unclocked slave will completely and instantly hang all cpus, and the
>> next write will never happen.
>>
>>> I think exclusive stores will limit the buffering, but that is probably
>>> not architecturally guaranteed.
>>>
>>> I could put a wb() in at the end of persistent_ram_write.
>>>
>>>> Also, isn't this patch unnecessary after patch 3 in this set?
>>>
>>> It is still needed in the main memory case to be architecturally correct
>>> to avoid multiple mappings of different memory types and exclusive
>>> accesses to device memory. At least on an A9, it doesn't really seem to
>>> matter. I could remove this for the ioremap case.
>>
>> According to my reading of the latest ARM ARM (Issue C, section
>> A3.5.7), and Catalin's excellent explanation
>> (http://lists.linaro.org/pipermail/linaro-dev/2012-February/010239.html),
>> it is no longer considered unpredictable to have both cached and
>> non-cached mappings to the same memory, as long as you use proper
>> cache maintenance between accessing the two mappings.
>>
>> In pstore_ram the cached mapping will never be accessed (and we don't
>> care about speculative accesses), so no cache maintenance is
>> necessary.  I don't see any need for this patch, and I see plenty of
>> possible problems.
>
> Exclusive accesses still have further restrictions. From section 3.4.5:
>
> • It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be
> performed to a memory region
>    with the Device or Strongly-ordered memory attribute. Unless the
> implementation documentation explicitly
>   states that LDREX and STREX operations to a memory region with the
> Device or Strongly-ordered attribute are
>  permitted, the effect of such operations is UNPREDICTABLE.
>
>
> Given that it is implementation defined, I don't see how Linux can rely
> on that behavior.

I see, the problem is that while noncached and writecombined appear to
be similar mappings, noncached is mapped in PRRR to strongly-ordered,
while writecombined is mapped to unbufferable normal memory.

I think adding a wmb() to persistent_ram_write is going to be
expensive on cpus with outer caches like the L2X0, where wmb() will
result in a spinlock.  Is there a real SoC where this doesn't work?
Will Deacon April 16, 2013, 8:44 a.m. UTC | #6
On Tue, Apr 16, 2013 at 01:43:09AM +0100, Colin Cross wrote:
> On Mon, Apr 15, 2013 at 4:59 PM, Rob Herring <robherring2@gmail.com> wrote:
> > Exclusive accesses still have further restrictions. From section 3.4.5:
> >
> > • It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be
> > performed to a memory region
> >    with the Device or Strongly-ordered memory attribute. Unless the
> > implementation documentation explicitly
> >   states that LDREX and STREX operations to a memory region with the
> > Device or Strongly-ordered attribute are
> >  permitted, the effect of such operations is UNPREDICTABLE.
> >
> >
> > Given that it is implementation defined, I don't see how Linux can rely
> > on that behavior.
> 
> I see, the problem is that while noncached and writecombined appear to
> be similar mappings, noncached is mapped in PRRR to strongly-ordered,
> while writecombined is mapped to unbufferable normal memory.
> 
> I think adding a wmb() to persistent_ram_write is going to be
> expensive on cpus with outer caches like the L2X0, where wmb() will
> result in a spinlock.  Is there a real SoC where this doesn't work?

A real SoC where exclusives don't work to memory not mapped as normal? Take
your pick...

Will
Rob Herring April 16, 2013, 12:58 p.m. UTC | #7
On 04/16/2013 03:44 AM, Will Deacon wrote:
> On Tue, Apr 16, 2013 at 01:43:09AM +0100, Colin Cross wrote:
>> On Mon, Apr 15, 2013 at 4:59 PM, Rob Herring <robherring2@gmail.com> wrote:
>>> Exclusive accesses still have further restrictions. From section 3.4.5:
>>>
>>> • It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be
>>> performed to a memory region
>>>    with the Device or Strongly-ordered memory attribute. Unless the
>>> implementation documentation explicitly
>>>   states that LDREX and STREX operations to a memory region with the
>>> Device or Strongly-ordered attribute are
>>>  permitted, the effect of such operations is UNPREDICTABLE.
>>>
>>>
>>> Given that it is implementation defined, I don't see how Linux can rely
>>> on that behavior.
>>
>> I see, the problem is that while noncached and writecombined appear to
>> be similar mappings, noncached is mapped in PRRR to strongly-ordered,
>> while writecombined is mapped to unbufferable normal memory.
>>
>> I think adding a wmb() to persistent_ram_write is going to be
>> expensive on cpus with outer caches like the L2X0, where wmb() will
>> result in a spinlock.  Is there a real SoC where this doesn't work?
> 
> A real SoC where exclusives don't work to memory not mapped as normal? Take
> your pick...

This patch doesn't actually fix problems for me. Exclusives to DDR work
for any memory type for me as the DDR controller has an exclusive
monitor. It takes write-thru cache mapping to get internal RAM to work.

Rob
Catalin Marinas April 16, 2013, 1:48 p.m. UTC | #8
On Tue, Apr 16, 2013 at 01:58:27PM +0100, Rob Herring wrote:
> On 04/16/2013 03:44 AM, Will Deacon wrote:
> > On Tue, Apr 16, 2013 at 01:43:09AM +0100, Colin Cross wrote:
> >> On Mon, Apr 15, 2013 at 4:59 PM, Rob Herring <robherring2@gmail.com> wrote:
> >>> Exclusive accesses still have further restrictions. From section 3.4.5:
> >>>
> >>> • It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be
> >>> performed to a memory region
> >>>    with the Device or Strongly-ordered memory attribute. Unless the
> >>> implementation documentation explicitly
> >>>   states that LDREX and STREX operations to a memory region with the
> >>> Device or Strongly-ordered attribute are
> >>>  permitted, the effect of such operations is UNPREDICTABLE.
> >>>
> >>>
> >>> Given that it is implementation defined, I don't see how Linux can rely
> >>> on that behavior.
> >>
> >> I see, the problem is that while noncached and writecombined appear to
> >> be similar mappings, noncached is mapped in PRRR to strongly-ordered,
> >> while writecombined is mapped to unbufferable normal memory.
> >>
> >> I think adding a wmb() to persistent_ram_write is going to be
> >> expensive on cpus with outer caches like the L2X0, where wmb() will
> >> result in a spinlock.  Is there a real SoC where this doesn't work?
> > 
> > A real SoC where exclusives don't work to memory not mapped as normal? Take
> > your pick...
> 
> This patch doesn't actually fix problems for me. Exclusives to DDR work
> for any memory type for me as the DDR controller has an exclusive
> monitor. It takes write-thru cache mapping to get internal RAM to work.

I can't find any reference in the ARM ARM but I think you would need
cacheable memory for the exclusives to work. A9 for example uses the
cacheline exclusiveness to emulate the global monitor.
Russell King - ARM Linux April 19, 2013, 9:54 a.m. UTC | #9
On Tue, Apr 09, 2013 at 08:53:18PM -0700, Colin Cross wrote:
> On Tue, Apr 9, 2013 at 8:08 PM, Rob Herring <robherring2@gmail.com> wrote:
> > -       return ioremap(start, size);
> > +       return ioremap_wc(start, size);
> 
> ioremap_wc corresponds to MT_DEVICE_WC, which is still device memory,
> so I don't see how this helps solve the problem in the commit message.

In reality it isn't, because there's no such thing as "write combining
device memory" in the ARM memory model.

There are three major memory types: strongly ordered, device and normal
memory.  Only normal memory can be cached in any way, which includes
using write combining.

#define ioremap_wc(cookie,size) __arm_ioremap((cookie), (size), MT_DEVICE_WC)

        [MT_DEVICE_WC] = {      /* ioremap_wc */
                .prot_pte       = PROT_PTE_DEVICE | L_PTE_MT_DEV_WC,

         *                      n       TR      IR      OR
         *   BUFFERABLE         001     10      00      00
         *   DEV_WC             001     10
(see arch/arm/mm/proc-v7-2level.S for the rest of the table and its
description.)

So, DEV_WC is an alias for BUFFERABLE, which is normal memory,
uncacheable in both inner and outer caches.  This means that at the
moment, ioremap_wc() memory has the same properties as system memory
- with all the out of ordering effects you get there.

I don't put any guarantee on this though - we may end up having to
change it if we find a SoC needing this to really be device memory...
diff mbox

Patch

diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
index 0306303..e126d9f 100644
--- a/fs/pstore/ram_core.c
+++ b/fs/pstore/ram_core.c
@@ -337,7 +337,7 @@  static void *persistent_ram_vmap(phys_addr_t start, size_t size)
 	page_start = start - offset_in_page(start);
 	page_count = DIV_ROUND_UP(size + offset_in_page(start), PAGE_SIZE);
 
-	prot = pgprot_noncached(PAGE_KERNEL);
+	prot = pgprot_writecombine(PAGE_KERNEL);
 
 	pages = kmalloc(sizeof(struct page *) * page_count, GFP_KERNEL);
 	if (!pages) {
@@ -364,7 +364,7 @@  static void *persistent_ram_iomap(phys_addr_t start, size_t size)
 		return NULL;
 	}
 
-	return ioremap(start, size);
+	return ioremap_wc(start, size);
 }
 
 static int persistent_ram_buffer_map(phys_addr_t start, phys_addr_t size,