Message ID | 1342455826-9425-1-git-send-email-will.deacon@arm.com (mailing list archive)
State      | New, archived
Gilles, Uros,

On Mon, Jul 16, 2012 at 05:23:46PM +0100, Will Deacon wrote:
> The vivt_flush_cache_{range,page} functions check that the mm_struct
> of the VMA being flushed has been active on the current CPU before
> performing the cache maintenance.
>
> The gate_vma has a NULL mm_struct pointer and, as such, will cause a
> kernel fault if we try to flush it with the above operations. This
> happens during ELF core dumps, which include the gate_vma as it may be
> useful for debugging purposes.
>
> This patch adds checks to the VIVT cache flushing functions so that VMAs
> with a NULL mm_struct are ignored.

Would one of you be able to test this patch please? I've not managed to
trigger the bug you reported on my boards, so it would be useful to know
whether or not this patch solves the problem for you.

Thanks,

Will

> diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
> index 004c1bc..8cf828e 100644
> --- a/arch/arm/include/asm/cacheflush.h
> +++ b/arch/arm/include/asm/cacheflush.h
> @@ -215,7 +215,9 @@ static inline void vivt_flush_cache_mm(struct mm_struct *mm)
>  static inline void
>  vivt_flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
>  {
> -	if (cpumask_test_cpu(smp_processor_id(), mm_cpumask(vma->vm_mm)))
> +	struct mm_struct *mm = vma->vm_mm;
> +
> +	if (mm && cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm)))
>  		__cpuc_flush_user_range(start & PAGE_MASK, PAGE_ALIGN(end),
>  					vma->vm_flags);
>  }
> @@ -223,7 +225,9 @@ vivt_flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned
>  static inline void
>  vivt_flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr, unsigned long pfn)
>  {
> -	if (cpumask_test_cpu(smp_processor_id(), mm_cpumask(vma->vm_mm))) {
> +	struct mm_struct *mm = vma->vm_mm;
> +
> +	if (mm && cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm))) {
>  		unsigned long addr = user_addr & PAGE_MASK;
>  		__cpuc_flush_user_range(addr, addr + PAGE_SIZE, vma->vm_flags);
>  	}
> --
> 1.7.4.1
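For context, the gate_vma that trips these flush functions is a statically
allocated VMA. A trimmed sketch, approximating the 3.4-era
arch/arm/kernel/process.c (field values are approximate, shown only to
illustrate that vm_mm is never assigned and therefore stays NULL):

static struct vm_area_struct gate_vma;

static int __init gate_vma_init(void)
{
	if (vectors_high()) {		/* vectors mapped at 0xffff0000 */
		gate_vma.vm_start	= 0xffff0000;
		gate_vma.vm_end		= 0xffff0000 + PAGE_SIZE;
		gate_vma.vm_page_prot	= PAGE_READONLY_EXEC;
		gate_vma.vm_flags	= VM_READ | VM_EXEC;
	}
	/* vm_mm is never set, so mm_cpumask(vma->vm_mm) faults on NULL. */
	return 0;
}
arch_initcall(gate_vma_init);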
On Thu, Jul 19, 2012 at 2:28 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Mon, Jul 16, 2012 at 05:23:46PM +0100, Will Deacon wrote:
>> The vivt_flush_cache_{range,page} functions check that the mm_struct
>> of the VMA being flushed has been active on the current CPU before
>> performing the cache maintenance.
>>
>> The gate_vma has a NULL mm_struct pointer and, as such, will cause a
>> kernel fault if we try to flush it with the above operations. This
>> happens during ELF core dumps, which include the gate_vma as it may be
>> useful for debugging purposes.
>>
>> This patch adds checks to the VIVT cache flushing functions so that VMAs
>> with a NULL mm_struct are ignored.
>
> Would one of you be able to test this patch please? I've not managed to
> trigger the bug you reported on my boards, so it would be useful to know
> whether or not this patch solves the problem for you.

The patched kernel works as expected. The backport to 3.4 (where the
original problem was triggered) also works OK, so I'd suggest backporting
this patch to 3.4 as well.

Tested-by: Uros Bizjak <ubizjak@gmail.com>

Thanks,
Uros.
On Thu, Jul 19, 2012 at 02:03:50PM +0100, Uros Bizjak wrote:
> On Thu, Jul 19, 2012 at 2:28 PM, Will Deacon <will.deacon@arm.com> wrote:
>> Would one of you be able to test this patch please? I've not managed to
>> trigger the bug you reported on my boards, so it would be useful to know
>> whether or not this patch solves the problem for you.
>
> The patched kernel works as expected. The backport to 3.4 (where the
> original problem was triggered) also works OK, so I'd suggest backporting
> this patch to 3.4 as well.
>
> Tested-by: Uros Bizjak <ubizjak@gmail.com>

Thanks Uros, I'll add that and CC stable on the patch too.

Will
On 07/19/2012 02:28 PM, Will Deacon wrote:
> Gilles, Uros,
>
> On Mon, Jul 16, 2012 at 05:23:46PM +0100, Will Deacon wrote:
>> The vivt_flush_cache_{range,page} functions check that the mm_struct
>> of the VMA being flushed has been active on the current CPU before
>> performing the cache maintenance.
>>
>> The gate_vma has a NULL mm_struct pointer and, as such, will cause a
>> kernel fault if we try to flush it with the above operations. This
>> happens during ELF core dumps, which include the gate_vma as it may be
>> useful for debugging purposes.
>>
>> This patch adds checks to the VIVT cache flushing functions so that VMAs
>> with a NULL mm_struct are ignored.
>
> Would one of you be able to test this patch please?

Sorry for the delay, I am only getting this mail now.

> I've not managed to trigger the bug you reported on my boards,

I found this bug with the LTP test suite, more precisely with the test
named "abort01".

> so it would be useful to know whether or not this patch solves the
> problem for you.

It fixes the Linux 3.4 bug for me. But... are you sure this is the right
fix? I mean, you are adding an almost always useless test to many cache
flushes, needed only in one corner case. Wouldn't it make more sense to
keep the fix local, and give gate_vma a static mm_struct with a valid
cpumask? The mask bits would be 0 or 1 depending on whether we want to
flush the vector page (I believe we do not want to flush it, but I am
not sure).
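For illustration, a rough, hypothetical sketch of the alternative proposed
above. This was never merged, and the names gate_mm and gate_mm_init are
invented for this sketch: the idea is to give the gate_vma a static
mm_struct whose CPU mask is pre-populated, so the existing mm_cpumask()
test keeps working without a NULL check.

static struct mm_struct gate_mm;

static int __init gate_mm_init(void)
{
	/* Set every bit to always flush the vector page, or use
	 * cpumask_clear() instead to never flush it; which of the two
	 * is wanted is exactly the open question raised above. */
	cpumask_setall(mm_cpumask(&gate_mm));
	gate_vma.vm_mm = &gate_mm;
	return 0;
}
arch_initcall(gate_mm_init);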
On 07/20/2012 10:41 PM, Gilles Chanteperdrix wrote:
> The mask bits would be 0 or 1 depending on whether we want to flush the
> vector page (I believe we do not want to flush it, but I am not sure).

Actually, I believe we do want to flush the vector page, at least on
systems with a VIVT cache: on such systems, the vector page is writeable
in kernel mode, so it may have been modified, and the address used by
elf_core_dump is not the vectors address, but the address in the kernel
direct-mapped RAM region where the vector page was allocated. So there
is a cache aliasing issue.
Hi Gilles,

On Sat, Jul 21, 2012 at 02:18:35PM +0100, Gilles Chanteperdrix wrote:
> On 07/20/2012 10:41 PM, Gilles Chanteperdrix wrote:
>> The mask bits would be 0 or 1 depending on whether we want to flush the
>> vector page (I believe we do not want to flush it, but I am not sure).
>
> Actually, I believe we do want to flush the vector page, at least on
> systems with a VIVT cache: on such systems, the vector page is writeable
> in kernel mode, so it may have been modified, and the address used by
> elf_core_dump is not the vectors address, but the address in the kernel
> direct-mapped RAM region where the vector page was allocated. So there
> is a cache aliasing issue.

It may be writable, but we never actually write to it after it has been
initialised, so there's no need to worry about caching issues (the cache
is flushed in devicemaps_init).

As for the NULL check, it's likely to be a single additional cycle before
a cacheflush. I really consider it to be insignificant.

Will
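The boot-time flush Will refers to happens when the vector page is first
populated. A heavily trimmed and approximate sketch of the 3.x-era
early_trap_init() in arch/arm/kernel/traps.c (details simplified; the
real function also copies kuser helpers and signal return code):

void __init early_trap_init(void)
{
	unsigned long vectors = CONFIG_VECTORS_BASE;	/* typically 0xffff0000 */

	/* Copy the exception vectors and stubs into the vector page. */
	memcpy((void *)vectors, __vectors_start, __vectors_end - __vectors_start);
	memcpy((void *)vectors + 0x200, __stubs_start, __stubs_end - __stubs_start);

	/* Flush so these one-time writes are visible; after this point
	 * the kernel does not normally write to the page again. */
	flush_icache_range(vectors, vectors + PAGE_SIZE);
}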
On 07/21/2012 04:35 PM, Will Deacon wrote:
> Hi Gilles,
>
> On Sat, Jul 21, 2012 at 02:18:35PM +0100, Gilles Chanteperdrix wrote:
>> Actually, I believe we do want to flush the vector page, at least on
>> systems with a VIVT cache: on such systems, the vector page is writeable
>> in kernel mode, so it may have been modified, and the address used by
>> elf_core_dump is not the vectors address, but the address in the kernel
>> direct-mapped RAM region where the vector page was allocated. So there
>> is a cache aliasing issue.
>
> It may be writable, but we never actually write to it after it has been
> initialised, so there's no need to worry about caching issues (the cache
> is flushed in devicemaps_init).

Except if CONFIG_TLS_REG_EMUL is enabled, or if some faulty code wrote to
the vector page by accident, which caused the application to crash. What
is the reason to include the vector page in the core dump, if not to help
debugging?
On 07/21/2012 04:40 PM, Gilles Chanteperdrix wrote:
> On 07/21/2012 04:35 PM, Will Deacon wrote:
>> It may be writable, but we never actually write to it after it has been
>> initialised, so there's no need to worry about caching issues (the cache
>> is flushed in devicemaps_init).
>
> Except if CONFIG_TLS_REG_EMUL is enabled

is disabled, I mean.
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index 004c1bc..8cf828e 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -215,7 +215,9 @@ static inline void vivt_flush_cache_mm(struct mm_struct *mm)
 static inline void
 vivt_flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
 {
-	if (cpumask_test_cpu(smp_processor_id(), mm_cpumask(vma->vm_mm)))
+	struct mm_struct *mm = vma->vm_mm;
+
+	if (mm && cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm)))
 		__cpuc_flush_user_range(start & PAGE_MASK, PAGE_ALIGN(end),
 					vma->vm_flags);
 }
@@ -223,7 +225,9 @@ vivt_flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned
 static inline void
 vivt_flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr, unsigned long pfn)
 {
-	if (cpumask_test_cpu(smp_processor_id(), mm_cpumask(vma->vm_mm))) {
+	struct mm_struct *mm = vma->vm_mm;
+
+	if (mm && cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm))) {
 		unsigned long addr = user_addr & PAGE_MASK;
 		__cpuc_flush_user_range(addr, addr + PAGE_SIZE, vma->vm_flags);
 	}
The vivt_flush_cache_{range,page} functions check that the mm_struct
of the VMA being flushed has been active on the current CPU before
performing the cache maintenance.

The gate_vma has a NULL mm_struct pointer and, as such, will cause a
kernel fault if we try to flush it with the above operations. This
happens during ELF core dumps, which include the gate_vma as it may be
useful for debugging purposes.

This patch adds checks to the VIVT cache flushing functions so that VMAs
with a NULL mm_struct are ignored.

Cc: Uros Bizjak <ubizjak@gmail.com>
Reported-by: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/cacheflush.h |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)
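For completeness, a minimal user-space reproducer modelled on the LTP
"abort01" scenario mentioned above (an illustrative sketch; any process
that dumps core exercises the same path, since the core dump walks every
VMA, including the gate_vma):

/* Enable core dumps, then abort: the resulting ELF core dump reaches
 * the VIVT flush paths above with vma->vm_mm == NULL on an unpatched
 * kernel, causing the reported oops. */
#include <stdlib.h>
#include <sys/resource.h>

int main(void)
{
	struct rlimit rl = { RLIM_INFINITY, RLIM_INFINITY };

	setrlimit(RLIMIT_CORE, &rl);	/* allow an unlimited core file */
	abort();			/* SIGABRT triggers the ELF core dump */
}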