diff mbox

[RFC] fix parisc runtime hangs wrt pa_tlb_lock

Message ID 4A2ADEC9.2090403@gmx.de (mailing list archive)
State Superseded
Delegated to: Helge Deller
Headers show

Commit Message

Helge Deller June 6, 2009, 9:25 p.m. UTC
John David Anglin wrote:
>> Since many kernel versions I regularly faced reproducibly kernel hangs when
>> compiling some bigger files. The machine suddenly just seemed to hang.
>>
>> With spinlock debugging turned on I found this:
>>
>> BUG: spinlock recursion on CPU#0, tool/7263
>>  lock: 10644000, .magic: dead4ead, .owner: tool/7263, .owner_cpu: 0
>> Backtrace:
>>  [<10113a94>] show_stack+0x18/0x28
>>
>> BUG: spinlock lockup on CPU#0, tool/7263, 10644000
>> Backtrace:
>>  [<10113a94>] show_stack+0x18/0x28
>>
>> BUG: soft lockup - CPU#0 stuck for 61s! [tool:7263]
>> IASQ: 00000000 00000000 IAOQ: 102d55dc 102d557c
>>  IIR: 03c008b3    ISR: 00000000  IOR: 00000000
>>  CPU:        0   CR30: 7d1a4000 CR31: 11111111
>>  ORIG_R28: 00000000
>>  IAOQ[0]: _raw_spin_lock+0x15c/0x1c0
>>  IAOQ[1]: _raw_spin_lock+0xfc/0x1c0
>>  RP(r2): _raw_spin_lock+0x18c/0x1c0
>> Backtrace:
>>  [<102d560c>] _raw_spin_lock+0x18c/0x1c0
>>
>> Kernel panic - not syncing: softlockup: hung tasks
>> Backtrace:
>>  [<10113a94>] show_stack+0x18/0x28
> 
> I have tested the proposed change using 2.6.30-rc6 and a modified version
> of 2.6.22.10.  I tested using SMP kernels running on a rp3440 and UP
> kernels on a c3750.  I also tested changing the macros to just use
> preempt_disable/preempt_enable.
> 
> The patch doesn't cause any new problems as far as I can tell.  

That's good. Thanks for testing !

> However,
> it doesn't fix any of the problems that I currently see on these two
> machines.  In particular, I see the occasional gcc testsuite timeout
> using SMP kernels.  Programs that usually take a few seconds to run
> timeout after three minutes.  These timeouts don't occur with UP kernels.

Ok, it fixes at least a kernel hang I faced regularily with my compilations...

> On the rp3440, the spinlock is definitely needed.  With just
> preempt_disable/preempt_enable, a crash occurs during bootstrap
> at the point unused memory is recovered.  Thus, the tlb purge
> issue referred to in the preceeding comment affects more than
> just N class.
> 
> On the otherhand, it doesn't seem necessary to disable interrupts
> during the purge with UP kernels.  

My kernel crashes happened with UP-kernels on a B2000 (1 CPU).
So, I still have the feeling that it's necessary to disable interrupts 
on UP kernels as well.

The new cleaned-up attached patch below shows, that arch/parisc/kernel/pci-dma.c
is affected. My kernel crashes always happened when the machine was pretty 
much loaded, during memory pressure when the system still wanted to load
something from disk.
In pci-dma.c the TLB flushing code is called from the DMA handlers, so
it does indicate that IRQ would need to be disabled.
One example I could imagine (with a UP-kernel):
- Process A runs some code, gets into kernel and executes clear_user_page() and locks
  the pa_tlb_lock spinlock (inside the critical section).
- Suddenly Process B gets data from disk via DMA handlers in pci-dma.c
- DMA-Interrupt kicks in, the DMA handler tries to get the pa_tlb_lock and fails
  since process A still holds the lock
-> machine deadlocks and kernel starts reporting about "hung tasks after 61seconds".
  (just search the mailing list for such reports..)

So, even a UP-kernel is affected.

> With SMP kernels, it would be nice
> to know if the lockup was caused by an interruption during the tlb
> purge, or a preemption issue.
> 
> I have the sense that disabling interrupts is wrong.  That is any
> given CPU can only generate one PxTLB inter processor broadcast
> at a time.  Disabling interrupts could cause a deadlock if a TLB
> purge was needed while the purge code was executing.
> 
> The other alternative is to allow the processor that holds the lock
> to enter the flush code.  This would fix the deadlock.  Don't know how
> to code this (atomic compare and exchange?).
> 
> The preempt_disable/preempt_enable crash on the rp3440 made me wonder
> if all tlb purge operations are properly protected with the tlb spinlock.
> I think we need to look at flush_tlb_all_local and copy_user_page_asm.
> They don't seem protected.

New patch attached.

Helge


--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Helge Deller June 14, 2009, 10:05 p.m. UTC | #1
On 06/06/2009 11:25 PM, Helge Deller wrote:
> John David Anglin wrote:
>> On the rp3440, the spinlock is definitely needed.  With just
>> preempt_disable/preempt_enable, a crash occurs during bootstrap
>> at the point unused memory is recovered.  Thus, the tlb purge
>> issue referred to in the preceeding comment affects more than
>> just N class.
>>
>> On the otherhand, it doesn't seem necessary to disable interrupts
>> during the purge with UP kernels.
>
> My kernel crashes happened with UP-kernels on a B2000 (1 CPU).
> So, I still have the feeling that it's necessary to disable interrupts
> on UP kernels as well.

Hi Dave,

I did further testing. Especially I wanted to clarify if disabling
interrupts are necessary or not.

To test it, I added a WARN_ON(in_interrupt()) to the purge_tlb_start() macro like this:
+#define purge_tlb_start(flags)	WARN_ON(in_interrupt()); spin_lock_irqsave(&pa_tlb_lock, flags)
and did ran the compile-test which usually hang my system.

The result is that we really need to disable interrupts.
The WARN_ON() did triggered for me again as it always did hang the system.
This is with a UP-kernel (2.6.30-rc8) on a UP-machine (B2000):


Badness at arch/parisc/kernel/cache.c:461



      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI

PSW: 00000000000001101111100100001111 Tainted: G        W

r00-03  0006f90f 10640000 10112760 106f4c40

r04-07  5eadc000 2418e30c 0004eadc 113336f0

r08-11  2418e31c 0000007c 106f4a88 7c5b3c7c

r12-15  104c45ec 106c9da0 00001000 10000000

r16-19  00000fff 1067f13c 105e0560 00000100

r20-23  0000ff00 5eadc10a 5eadc100 00000040

r24-27  00000000 5eadcfc0 5eadd000 10640680

r28-31  106f4000 0000000a 106f4c80 471249e8

sr00-03  00000000 00000000 00000000 00000025

sr04-07  00000000 00000000 00000000 00000000



IASQ: 00000000 00000000 IAOQ: 10112774 10112778

  IIR: 03ffe01f    ISR: 0424013a  IOR: b72dcfc0

  CPU:        0   CR30: 106f4000 CR31: 11111111

  ORIG_R28: 106f4dc0

  IAOQ[0]: flush_kernel_dcache_page_addr+0x30/0xa0

  IAOQ[1]: flush_kernel_dcache_page_addr+0x34/0xa0

  RP(r2): flush_kernel_dcache_page_addr+0x1c/0xa0

Backtrace:

  [<10112760>] flush_kernel_dcache_page_addr+0x1c/0xa0



Sadly I didn't got a full backtrace.
The WARN_ON() triggered around 20 times during the phase where my machine was
pretty much loaded.
Interestingly it was always inflush_kernel_dcache_page_addr().

So, I still think my patch at http://patchwork.kernel.org/patch/28458/ should
be applied to all kernels.

Helge
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John David Anglin June 14, 2009, 11:20 p.m. UTC | #2
On Mon, 15 Jun 2009, Helge Deller wrote:

> On 06/06/2009 11:25 PM, Helge Deller wrote:
>> John David Anglin wrote:
>>> On the rp3440, the spinlock is definitely needed.  With just
>>> preempt_disable/preempt_enable, a crash occurs during bootstrap
>>> at the point unused memory is recovered.  Thus, the tlb purge
>>> issue referred to in the preceeding comment affects more than
>>> just N class.
>>>
>>> On the otherhand, it doesn't seem necessary to disable interrupts
>>> during the purge with UP kernels.
>>
>> My kernel crashes happened with UP-kernels on a B2000 (1 CPU).
>> So, I still have the feeling that it's necessary to disable interrupts
>> on UP kernels as well.
>
> Hi Dave,
>
> I did further testing. Especially I wanted to clarify if disabling
> interrupts are necessary or not.
>
> To test it, I added a WARN_ON(in_interrupt()) to the purge_tlb_start() 
> macro like this:
> +#define purge_tlb_start(flags)	WARN_ON(in_interrupt()); 
> spin_lock_irqsave(&pa_tlb_lock, flags)
> and did ran the compile-test which usually hang my system.
>
> The result is that we really need to disable interrupts.
> The WARN_ON() did triggered for me again as it always did hang the system.
> This is with a UP-kernel (2.6.30-rc8) on a UP-machine (B2000):
>
>
> Badness at arch/parisc/kernel/cache.c:461
>
>
>
>      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
>
> PSW: 00000000000001101111100100001111 Tainted: G        W
>
> r00-03  0006f90f 10640000 10112760 106f4c40
>
> r04-07  5eadc000 2418e30c 0004eadc 113336f0
>
> r08-11  2418e31c 0000007c 106f4a88 7c5b3c7c
>
> r12-15  104c45ec 106c9da0 00001000 10000000
>
> r16-19  00000fff 1067f13c 105e0560 00000100
>
> r20-23  0000ff00 5eadc10a 5eadc100 00000040
>
> r24-27  00000000 5eadcfc0 5eadd000 10640680
>
> r28-31  106f4000 0000000a 106f4c80 471249e8
>
> sr00-03  00000000 00000000 00000000 00000025
>
> sr04-07  00000000 00000000 00000000 00000000
>
>
>
> IASQ: 00000000 00000000 IAOQ: 10112774 10112778
>
>  IIR: 03ffe01f    ISR: 0424013a  IOR: b72dcfc0
>
>  CPU:        0   CR30: 106f4000 CR31: 11111111
>
>  ORIG_R28: 106f4dc0
>
>  IAOQ[0]: flush_kernel_dcache_page_addr+0x30/0xa0
>
>  IAOQ[1]: flush_kernel_dcache_page_addr+0x34/0xa0
>
>  RP(r2): flush_kernel_dcache_page_addr+0x1c/0xa0
>
> Backtrace:
>
>  [<10112760>] flush_kernel_dcache_page_addr+0x1c/0xa0
>
>
>
> Sadly I didn't got a full backtrace.
> The WARN_ON() triggered around 20 times during the phase where my machine 
> was
> pretty much loaded.
> Interestingly it was always inflush_kernel_dcache_page_addr().
>
> So, I still think my patch at http://patchwork.kernel.org/patch/28458/ 
> should
> be applied to all kernels.

I am convinced that interrupts need to be disabled on SMP kernels to
prevent deadlock.  On UP kernels, I am not convinced that anything bad
happens if we do a tlb purge while handling an interrupt since we don't
have to worry about preventing bus conflicts.  The UP code can't
deadlock.  I'm thinking that we can stay with disabling preemption.

Dave
diff mbox

Patch

diff --git a/arch/parisc/include/asm/tlbflush.h b/arch/parisc/include/asm/tlbflush.h
index 1f6fd4f..217588e 100644
--- a/arch/parisc/include/asm/tlbflush.h
+++ b/arch/parisc/include/asm/tlbflush.h
@@ -18,8 +18,8 @@ 
  */
 extern spinlock_t pa_tlb_lock;
 
-#define purge_tlb_start(x) spin_lock(&pa_tlb_lock)
-#define purge_tlb_end(x) spin_unlock(&pa_tlb_lock)
+#define purge_tlb_start(flags)	spin_lock_irqsave(&pa_tlb_lock, flags)
+#define purge_tlb_end(flags)	spin_unlock_irqrestore(&pa_tlb_lock, flags)
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_all_local(void *);
@@ -64,13 +64,14 @@  static inline void flush_tlb_page(struct vm_area_struct *vma,
 	unsigned long addr)
 {
 	/* For one page, it's not worth testing the split_tlb variable */
+	unsigned long flags;
 
 	mb();
 	mtsp(vma->vm_mm->context,1);
-	purge_tlb_start();
+	purge_tlb_start(flags);
 	pdtlb(addr);
 	pitlb(addr);
-	purge_tlb_end();
+	purge_tlb_end(flags);
 }
 
 void __flush_tlb_range(unsigned long sid,
diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
index 837530e..e3c55f7 100644
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -400,10 +400,11 @@  void clear_user_page_asm(void *page, unsigned long vaddr)
 {
 	/* This function is implemented in assembly in pacache.S */
 	extern void __clear_user_page_asm(void *page, unsigned long vaddr);
+	unsigned long flags;
 
-	purge_tlb_start();
+	purge_tlb_start(flags);
 	__clear_user_page_asm(page, vaddr);
-	purge_tlb_end();
+	purge_tlb_end(flags);
 }
 
 #define FLUSH_THRESHOLD 0x80000 /* 0.5MB */
@@ -444,20 +445,22 @@  extern void clear_user_page_asm(void *page, unsigned long vaddr);
 
 void clear_user_page(void *page, unsigned long vaddr, struct page *pg)
 {
+	unsigned long flags;
 	purge_kernel_dcache_page((unsigned long)page);
-	purge_tlb_start();
+	purge_tlb_start(flags);
 	pdtlb_kernel(page);
-	purge_tlb_end();
+	purge_tlb_end(flags);
 	clear_user_page_asm(page, vaddr);
 }
 EXPORT_SYMBOL(clear_user_page);
 
 void flush_kernel_dcache_page_addr(void *addr)
 {
+	unsigned long flags;
 	flush_kernel_dcache_page_asm(addr);
-	purge_tlb_start();
+	purge_tlb_start(flags);
 	pdtlb_kernel(addr);
-	purge_tlb_end();
+	purge_tlb_end(flags);
 }
 EXPORT_SYMBOL(flush_kernel_dcache_page_addr);
 
@@ -490,8 +493,9 @@  void __flush_tlb_range(unsigned long sid, unsigned long start,
 	if (npages >= 512)  /* 2MB of space: arbitrary, should be tuned */
 		flush_tlb_all();
 	else {
+		unsigned long flags;
 		mtsp(sid, 1);
-		purge_tlb_start();
+		purge_tlb_start(flags);
 		if (split_tlb) {
 			while (npages--) {
 				pdtlb(start);
@@ -504,7 +508,7 @@  void __flush_tlb_range(unsigned long sid, unsigned long start,
 				start += PAGE_SIZE;
 			}
 		}
-		purge_tlb_end();
+		purge_tlb_end(flags);
 	}
 }
 
diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index 7d927ea..b8ec5a0 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -90,12 +90,13 @@  static inline int map_pte_uncached(pte_t * pte,
 	if (end > PMD_SIZE)
 		end = PMD_SIZE;
 	do {
+		unsigned long flags;
 		if (!pte_none(*pte))
 			printk(KERN_ERR "map_pte_uncached: page already exists\n");
 		set_pte(pte, __mk_pte(*paddr_ptr, PAGE_KERNEL_UNC));
-		purge_tlb_start();
+		purge_tlb_start(flags);
 		pdtlb_kernel(orig_vaddr);
-		purge_tlb_end();
+		purge_tlb_end(flags);
 		vaddr += PAGE_SIZE;
 		orig_vaddr += PAGE_SIZE;
 		(*paddr_ptr) += PAGE_SIZE;
@@ -168,11 +169,12 @@  static inline void unmap_uncached_pte(pmd_t * pmd, unsigned long vaddr,
 	if (end > PMD_SIZE)
 		end = PMD_SIZE;
 	do {
+		unsigned long flags;
 		pte_t page = *pte;
 		pte_clear(&init_mm, vaddr, pte);
-		purge_tlb_start();
+		purge_tlb_start(flags);
 		pdtlb_kernel(orig_vaddr);
-		purge_tlb_end();
+		purge_tlb_end(flags);
 		vaddr += PAGE_SIZE;
 		orig_vaddr += PAGE_SIZE;
 		pte++;