
ARM: mm: avoid attempting to flush the gate_vma with VIVT caches

Message ID 1342455826-9425-1-git-send-email-will.deacon@arm.com (mailing list archive)
State: New, archived

Commit Message

Will Deacon July 16, 2012, 4:23 p.m. UTC
The vivt_flush_cache_{range,page} functions check that the mm_struct
of the VMA being flushed has been active on the current CPU before
performing the cache maintenance.

The gate_vma has a NULL mm_struct pointer and, as such, will cause a
kernel fault if we try to flush it with the above operations. This
happens during ELF core dumps, which include the gate_vma as it may be
useful for debugging purposes.

This patch adds checks to the VIVT cache flushing functions so that VMAs
with a NULL mm_struct are ignored.

Cc: Uros Bizjak <ubizjak@gmail.com>
Reported-by: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/cacheflush.h |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

Comments

Will Deacon July 19, 2012, 12:28 p.m. UTC | #1
Gilles, Uros,

On Mon, Jul 16, 2012 at 05:23:46PM +0100, Will Deacon wrote:
> The vivt_flush_cache_{range,page} functions check that the mm_struct
> of the VMA being flushed has been active on the current CPU before
> performing the cache maintenance.
> 
> The gate_vma has a NULL mm_struct pointer and, as such, will cause a
> kernel fault if we try to flush it with the above operations. This
> happens during ELF core dumps, which include the gate_vma as it may be
> useful for debugging purposes.
> 
> This patch adds checks to the VIVT cache flushing functions so that VMAs
> with a NULL mm_struct are ignored.

Would one of you be able to test this patch please? I've not managed to
trigger the bug you reported on my boards, so it would be useful to know
whether or not this patch solves the problem for you.

Thanks,

Will

> diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
> index 004c1bc..8cf828e 100644
> --- a/arch/arm/include/asm/cacheflush.h
> +++ b/arch/arm/include/asm/cacheflush.h
> @@ -215,7 +215,9 @@ static inline void vivt_flush_cache_mm(struct mm_struct *mm)
>  static inline void
>  vivt_flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
>  {
> -	if (cpumask_test_cpu(smp_processor_id(), mm_cpumask(vma->vm_mm)))
> +	struct mm_struct *mm = vma->vm_mm;
> +
> +	if (mm && cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm)))
>  		__cpuc_flush_user_range(start & PAGE_MASK, PAGE_ALIGN(end),
>  					vma->vm_flags);
>  }
> @@ -223,7 +225,9 @@ vivt_flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned
>  static inline void
>  vivt_flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr, unsigned long pfn)
>  {
> -	if (cpumask_test_cpu(smp_processor_id(), mm_cpumask(vma->vm_mm))) {
> +	struct mm_struct *mm = vma->vm_mm;
> +
> +	if (mm && cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm))) {
>  		unsigned long addr = user_addr & PAGE_MASK;
>  		__cpuc_flush_user_range(addr, addr + PAGE_SIZE, vma->vm_flags);
>  	}
> -- 
> 1.7.4.1
>
Uros Bizjak July 19, 2012, 1:03 p.m. UTC | #2
On Thu, Jul 19, 2012 at 2:28 PM, Will Deacon <will.deacon@arm.com> wrote:

> On Mon, Jul 16, 2012 at 05:23:46PM +0100, Will Deacon wrote:
>> The vivt_flush_cache_{range,page} functions check that the mm_struct
>> of the VMA being flushed has been active on the current CPU before
>> performing the cache maintenance.
>>
>> The gate_vma has a NULL mm_struct pointer and, as such, will cause a
>> kernel fault if we try to flush it with the above operations. This
>> happens during ELF core dumps, which include the gate_vma as it may be
>> useful for debugging purposes.
>>
>> This patch adds checks to the VIVT cache flushing functions so that VMAs
>> with a NULL mm_struct are ignored.
>
> Would one of you be able to test this patch please? I've not managed to
> trigger the bug you reported on my boards, so it would be useful to know
> whether or not this patch solves the problem for you.

The patched kernel works as expected. A backport to 3.4 (where the
original problem was triggered) also works, so I'd suggest backporting
this patch to 3.4.

Tested-by: Uros Bizjak <ubizjak@gmail.com>

Thanks,
Uros.
Will Deacon July 19, 2012, 4:37 p.m. UTC | #3
On Thu, Jul 19, 2012 at 02:03:50PM +0100, Uros Bizjak wrote:
> On Thu, Jul 19, 2012 at 2:28 PM, Will Deacon <will.deacon@arm.com> wrote:
> 
> > On Mon, Jul 16, 2012 at 05:23:46PM +0100, Will Deacon wrote:
> >> The vivt_flush_cache_{range,page} functions check that the mm_struct
> >> of the VMA being flushed has been active on the current CPU before
> >> performing the cache maintenance.
> >>
> >> The gate_vma has a NULL mm_struct pointer and, as such, will cause a
> >> kernel fault if we try to flush it with the above operations. This
> >> happens during ELF core dumps, which include the gate_vma as it may be
> >> useful for debugging purposes.
> >>
> >> This patch adds checks to the VIVT cache flushing functions so that VMAs
> >> with a NULL mm_struct are ignored.
> >
> > Would one of you be able to test this patch please? I've not managed to
> > trigger the bug you reported on my boards, so it would be useful to know
> > whether or not this patch solves the problem for you.
> 
> The patched kernel works as expected. A backport to 3.4 (where the
> original problem was triggered) also works, so I'd suggest backporting
> this patch to 3.4.
> 
> Tested-by: Uros Bizjak <ubizjak@gmail.com>

Thanks Uros, I'll add that and CC stable on the patch too.

Will
Gilles Chanteperdrix July 20, 2012, 8:41 p.m. UTC | #4
On 07/19/2012 02:28 PM, Will Deacon wrote:
> Gilles, Uros,
> 
> On Mon, Jul 16, 2012 at 05:23:46PM +0100, Will Deacon wrote:
>> The vivt_flush_cache_{range,page} functions check that the mm_struct
>> of the VMA being flushed has been active on the current CPU before
>> performing the cache maintenance.
>>
>> The gate_vma has a NULL mm_struct pointer and, as such, will cause a
>> kernel fault if we try to flush it with the above operations. This
>> happens during ELF core dumps, which include the gate_vma as it may be
>> useful for debugging purposes.
>>
>> This patch adds checks to the VIVT cache flushing functions so that VMAs
>> with a NULL mm_struct are ignored.
> 
> Would one of you be able to test this patch please?

Sorry for the delay, I am only seeing this mail now.

> I've not managed to
> trigger the bug you reported on my boards,

I found this bug with the LTP testsuite, more precisely the test named
"abort01".

> so it would be useful to know
> whether or not this patch solves the problem for you.

This fixes the Linux 3.4 bug for me. But are you sure this is the right
fix? You are adding an almost always useless test to many cache flushes,
except in one corner case. Wouldn't it make more sense to make the fix
local, and fix gate_vma to have a static mm struct with a valid cpumask?
Being 0 or 1 whether we want to flush the vector page (I believe we do
not want to flush it, but am not sure).
Gilles Chanteperdrix July 21, 2012, 1:18 p.m. UTC | #5
On 07/20/2012 10:41 PM, Gilles Chanteperdrix wrote:
> Being 0 or 1 whether we want to flush the vector page (I believe we do
> not want to flush it, but am not sure).

Actually, I believe we want to flush the vector page, at least on
systems with VIVT cache: on systems with VIVT cache, the vector page is
writeable in kernel mode, so may have been modified, and the address
used by elf_core_dump is not the vectors address, but the address in the
kernel direct-mapped RAM region where the vector page was allocated, so
there is a cache aliasing issue.
Will Deacon July 21, 2012, 2:35 p.m. UTC | #6
Hi Gilles,

On Sat, Jul 21, 2012 at 02:18:35PM +0100, Gilles Chanteperdrix wrote:
> On 07/20/2012 10:41 PM, Gilles Chanteperdrix wrote:
> > Being 0 or 1 whether we want to flush the vector page (I believe we do
> > not want to flush it, but am not sure).
> 
> Actually, I believe we want to flush the vector page, at least on
> systems with VIVT cache: on systems with VIVT cache, the vector page is
> writeable in kernel mode, so may have been modified, and the address
> used by elf_core_dump is not the vectors address, but the address in the
> kernel direct-mapped RAM region where the vector page was allocated, so
> there is a cache aliasing issue.

It may be writable, but we never actually write to it after it has been
initialised so there's no need to worry about caching issues (the cache is
flushed in devicemaps_init).

As for the NULL check, it's likely to be a single additional cycle
before a cacheflush. I really consider it to be insignificant.

Will
Gilles Chanteperdrix July 21, 2012, 2:40 p.m. UTC | #7
On 07/21/2012 04:35 PM, Will Deacon wrote:
> Hi Gilles,
> 
> On Sat, Jul 21, 2012 at 02:18:35PM +0100, Gilles Chanteperdrix wrote:
>> On 07/20/2012 10:41 PM, Gilles Chanteperdrix wrote:
>>> Being 0 or 1 whether we want to flush the vector page (I believe we do
>>> not want to flush it, but am not sure).
>>
>> Actually, I believe we want to flush the vector page, at least on
>> systems with VIVT cache: on systems with VIVT cache, the vector page is
>> writeable in kernel mode, so may have been modified, and the address
>> used by elf_core_dump is not the vectors address, but the address in the
>> kernel direct-mapped RAM region where the vector page was allocated, so
>> there is a cache aliasing issue.
> 
> It may be writable, but we never actually write to it after it has been
> initialised so there's no need to worry about caching issues (the cache is
> flushed in devicemaps_init).

Except if CONFIG_TLS_REG_EMUL is enabled, or if some faulty code wrote
by accident to the vector page, which caused the application to crash.
What is the reason to include the vector page in the core dump if not to
help debugging?
Gilles Chanteperdrix July 21, 2012, 2:47 p.m. UTC | #8
On 07/21/2012 04:40 PM, Gilles Chanteperdrix wrote:
> On 07/21/2012 04:35 PM, Will Deacon wrote:
>> Hi Gilles,
>>
>> On Sat, Jul 21, 2012 at 02:18:35PM +0100, Gilles Chanteperdrix wrote:
>>> On 07/20/2012 10:41 PM, Gilles Chanteperdrix wrote:
>>>> Being 0 or 1 whether we want to flush the vector page (I believe we do
>>>> not want to flush it, but am not sure).
>>>
>>> Actually, I believe we want to flush the vector page, at least on
>>> systems with VIVT cache: on systems with VIVT cache, the vector page is
>>> writeable in kernel mode, so may have been modified, and the address
>>> used by elf_core_dump is not the vectors address, but the address in the
>>> kernel direct-mapped RAM region where the vector page was allocated, so
>>> there is a cache aliasing issue.
>>
>> It may be writable, but we never actually write to it after it has been
>> initialised so there's no need to worry about caching issues (the cache is
>> flushed in devicemaps_init).
> 
> Except if CONFIG_TLS_REG_EMUL is enabled

is disabled I mean.

Patch

diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index 004c1bc..8cf828e 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -215,7 +215,9 @@ static inline void vivt_flush_cache_mm(struct mm_struct *mm)
 static inline void
 vivt_flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
 {
-	if (cpumask_test_cpu(smp_processor_id(), mm_cpumask(vma->vm_mm)))
+	struct mm_struct *mm = vma->vm_mm;
+
+	if (mm && cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm)))
 		__cpuc_flush_user_range(start & PAGE_MASK, PAGE_ALIGN(end),
 					vma->vm_flags);
 }
@@ -223,7 +225,9 @@ vivt_flush_cache_range(struct vm_area_struct *vma, unsigned
 static inline void
 vivt_flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr, unsigned long pfn)
 {
-	if (cpumask_test_cpu(smp_processor_id(), mm_cpumask(vma->vm_mm))) {
+	struct mm_struct *mm = vma->vm_mm;
+
+	if (mm && cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm))) {
 		unsigned long addr = user_addr & PAGE_MASK;
 		__cpuc_flush_user_range(addr, addr + PAGE_SIZE, vma->vm_flags);
 	}