diff mbox series

usb: xhci: enable interrupt only after xhci_start()

Message ID 20200218235024.15266-1-ajayg@nvidia.com (mailing list archive)
State New, archived
Headers show
Series usb: xhci: enable interrupt only after xhci_start() | expand

Commit Message

Ajay Gupta Feb. 18, 2020, 11:50 p.m. UTC
From: Ajay Gupta <ajayg@nvidia.com>

Xhci interrupt must be enabled only after controller is
initialized and started. Currently interrupt is enabled
first in xhci_run() and later hcd state is set to running
in xhci_run_finished().

On a slow system (such as FPGA based platform) the time
difference between enabling interrupt and setting the hcd
state becomes huge enough where interrupt is triggered but
controller initialization is not complete yet.

Fixing the same by moving the interrupt enable (CMD_EIE)
part of code snippet from xhci_run() to xhci_run_finished().

Signed-off-by: Ajay Gupta <ajayg@nvidia.com>
---
 drivers/usb/host/xhci.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

Comments

Mathias Nyman Feb. 20, 2020, 1:18 p.m. UTC | #1
On 19.2.2020 1.50, Ajay Gupta wrote:
> From: Ajay Gupta <ajayg@nvidia.com>
> 
> Xhci interrupt must be enabled only after controller is
> initialized and started. Currently interrupt is enabled
> first in xhci_run() and later hcd state is set to running
> in xhci_run_finished().
> 
> On a slow system (such as FPGA based platform) the time
> difference between enabling interrupt and setting the hcd
> state becomes huge enough where interrupt is triggered but
> controller initialization is not complete yet.
> 
> Fixing the same by moving the interrupt enable (CMD_EIE)
> part of code snippet from xhci_run() to xhci_run_finished().
> 
> Signed-off-by: Ajay Gupta <ajayg@nvidia.com>
> ---

Sounds reasonable, but xHCI specs wants interrupts set and enabled before
xHC is running state.

I see this can be an issue if we get a port event for a USB 3 port before
the USB3 hcd is added.
What kind of issues did you see? I'd guess NULL pointer dereference in 
handle_port_status()?.

We could move interrupt enabling to xhci_run_finished() before 
xhci_start() is called, then the USB3 hcd should be initialized before
we receive interrupts.

Does that work for you?

Details:
xHCI section 4.2 "Host Controller Initialization" has the following sequence:

- Enable host system interrupt (CMD_EIE),
- Enable primary interupter (set IE bit in IMAN register)
- set run bit in USBCMD register.

And section 5.5.2 has a note:

"All registers of the Primary Interrupter shall be initialized before setting the
Run/Stop (RS) flag in the USBCMD register to ‘1’."

-Mathias
Ajay Gupta Feb. 20, 2020, 6:01 p.m. UTC | #2
Hi Mathias,

> On 19.2.2020 1.50, Ajay Gupta wrote:
> > From: Ajay Gupta <ajayg@nvidia.com>
> >
> > Xhci interrupt must be enabled only after controller is initialized
> > and started. Currently interrupt is enabled first in xhci_run() and
> > later hcd state is set to running in xhci_run_finished().
> >
> > On a slow system (such as FPGA based platform) the time difference
> > between enabling interrupt and setting the hcd state becomes huge
> > enough where interrupt is triggered but controller initialization is
> > not complete yet.
> >
> > Fixing the same by moving the interrupt enable (CMD_EIE) part of code
> > snippet from xhci_run() to xhci_run_finished().
> >
> > Signed-off-by: Ajay Gupta <ajayg@nvidia.com>
> > ---
> 
> Sounds reasonable, but xHCI specs wants interrupts set and enabled before
> xHC is running state.
> 
> I see this can be an issue if we get a port event for a USB 3 port before the
> USB3 hcd is added.
> What kind of issues did you see? I'd guess NULL pointer dereference in
> handle_port_status()?.
We saw a kernel soft lockup with following dump when a FPGA based system
is booted with ASUS xHCI (Gen 2) PCIe card attached. It was seen with kernel
v4.9.

[    6.110120] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver       
[    6.110602] ehci-pci: EHCI PCI platform driver                               
[    6.112104] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver           
[    6.112586] ohci-pci: OHCI PCI platform driver                               
[    6.113982] ohci-platform: OHCI generic platform driver                      
[    6.117577] xhci_hcd 0000:01:00.0: xHCI Host Controller                      
[    6.118022] xhci_hcd 0000:01:00.0: new USB bus registered, assigned bus numb1
[    6.131674] xhci_hcd 0000:01:00.0: hcc params 0x0200eec1 hci version 0x110 q0
[    6.134882] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002    
[    6.135731] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber1
[    6.136172] usb usb1: Product: xHCI Host Controller                          
[    6.136488] usb usb1: Manufacturer: Linux 4.9.38-tegra-g0177c0b-dirty xhci-hd
[    6.136917] usb usb1: SerialNumber: 0000:01:00.0                             
[    6.142334] hub 1-0:1.0: USB hub found                                       
[    6.142866] hub 1-0:1.0: 2 ports detected                                    
[    6.148110] xhci_hcd 0000:01:00.0: xHCI Host Controller                      
[    6.148500] xhci_hcd 0000:01:00.0: new USB bus registered, assigned bus numb2
[    6.156660] random: fast init done                                           
[    6.754981] random: crng init done                                           
[   27.147341] INFO: rcu_preempt self-detected stall on CPU                     
[   27.147757]  0-...: (5250 ticks this GP) idle=1ed/140000000000002/0 softirq= 
[   27.148243]   (t=5250 jiffies g=622 c=621 q=12)                              
[   27.148618] rcu_preempt kthread starved for 5250 jiffies! g622 c621 f0x0 RCU1
[   27.149158] rcu_preempt     S    0     7      2 0x00000000                   
[   27.149563] Call trace:                                                      
[   27.149799] [<ffffff8008085790>] __switch_to+0x3c/0x48                       
[   27.150149] [<ffffff8008ee6374>] __schedule+0x25c/0x744                      
[   27.150479] [<ffffff8008ee6898>] schedule+0x3c/0xa0                          
[   27.150808] [<ffffff8008ee9aa8>] schedule_timeout+0x18c/0x37c                
[   27.151186] [<ffffff800811d988>] rcu_gp_kthread+0x4d8/0x79c                  
[   27.151571] [<ffffff80080cb1a0>] kthread+0xf0/0x104                          
[   27.151888] [<ffffff8008082ad0>] ret_from_fork+0x10/0x40                     
[   27.152272] Task dump for CPU 0:                                             
[   27.152497] swapper/0       R  running task        0     1      0 0x00000002 
[   27.153020] Call trace:                                                      
[   27.153224] [<ffffff8008089c5c>] dump_backtrace+0x0/0x16c                    
[   27.153582] [<ffffff8008089f70>] show_stack+0x14/0x1c                        
[   27.153914] [<ffffff800819bc94>] sched_show_task.part.135+0x6c/0x78          
[   27.154318] [<ffffff80080deb30>] dump_cpu_task+0x88/0xc8                     
[   27.154667] [<ffffff800819c0d4>] rcu_dump_cpu_stacks+0xa4/0xec               
[   27.155036] [<ffffff800811aacc>] check_cpu_stall.isra.61+0x334/0x5c8         
[   27.155419] [<ffffff800811ad94>] __rcu_pending+0x34/0x18c                    
[   27.155772] [<ffffff800811e040>] rcu_check_callbacks+0x104/0x2c4             
[   27.156166] [<ffffff8008123e60>] update_process_times+0x60/0xac              
[   27.156547] [<ffffff8008134f80>] tick_sched_timer+0x6c/0xec                  
[   27.156911] [<ffffff80081246e0>] __hrtimer_run_queues+0xd4/0x330             
[   27.157291] [<ffffff80081253c8>] hrtimer_interrupt+0x9c/0x1e0                
[   27.157679] [<ffffff80089e2f10>] arch_timer_handler_phys+0x2c/0x38           
[   27.158069] [<ffffff8008113444>] handle_percpu_devid_irq+0x80/0x248          
[   27.158483] [<ffffff800810e304>] __handle_domain_irq+0x70/0xc4               
[   27.158848] [<ffffff8008080de4>] gic_handle_irq+0x50/0xa0                    
[   27.159196] [<ffffff8008082198>] el1_irq+0xd8/0x168                          
[   27.159530] [<ffffff80080adf18>] irq_exit+0xbc/0xe8                          
[   27.159850] [<ffffff800810e308>] __handle_domain_irq+0x74/0xc4               
[   27.160213] [<ffffff8008080de4>] gic_handle_irq+0x50/0xa0                    
[   27.160560] [<ffffff8008082198>] el1_irq+0xd8/0x168                          
[   27.160897] [<ffffff80088bca88>] xhci_handshake+0x38/0x90                    
[   27.161250] [<ffffff80088bd3cc>] xhci_run+0x300/0x534                        
[   27.161574] [<ffffff8008888ef8>] usb_add_hcd+0x2f0/0x868                     
[   27.161929] [<ffffff80088d2598>] xhci_pci_probe+0x120/0x224                  
[   27.162300] [<ffffff80084c1f10>] pci_device_probe+0x98/0x10c                 
[   27.162671] [<ffffff8008655740>] driver_probe_device+0x29c/0x438             
[   27.163044] [<ffffff80086559ec>] __driver_attach+0x110/0x114                 
[   27.163445] [<ffffff8008653038>] bus_for_each_dev+0x64/0xb4                  
[   27.163801] [<ffffff8008654f9c>] driver_attach+0x20/0x28                     
[   27.164147] [<ffffff8008654a74>] bus_add_driver+0x23c/0x2a8                  
[   27.164506] [<ffffff80086569b0>] driver_register+0x74/0x10c                  
[   27.164862] [<ffffff80084c0c44>] __pci_register_driver+0x44/0x4c             
[   27.165252] [<ffffff8009530614>] xhci_pci_init+0x54/0x60                     
[   27.165601] [<ffffff80080831b4>] do_one_initcall+0x48/0x150                  
[   27.165997] [<ffffff80094f0e54>] kernel_init_freeable+0x1d0/0x284            
[   27.166400] [<ffffff8008ee27f8>] kernel_init+0x10/0xfc                       
[   27.166723] [<ffffff8008082ad0>] ret_from_fork+0x10/0x40   
 
> We could move interrupt enabling to xhci_run_finished() before
> xhci_start() is called, then the USB3 hcd should be initialized before we receive
> interrupts.
> Does that work for you?
Let me check if I can find the system and test it.

Thanks
>nvpublic
> 
> Details:
> xHCI section 4.2 "Host Controller Initialization" has the following sequence:
> 
> - Enable host system interrupt (CMD_EIE),
> - Enable primary interupter (set IE bit in IMAN register)
> - set run bit in USBCMD register.
> 
> And section 5.5.2 has a note:
> 
> "All registers of the Primary Interrupter shall be initialized before setting the
> Run/Stop (RS) flag in the USBCMD register to ‘1’."
> 
> -Mathias
diff mbox series

Patch

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index dbac0fa9748d..ee2877fe0a58 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -604,6 +604,8 @@  static int xhci_init(struct usb_hcd *hcd)
 
 static int xhci_run_finished(struct xhci_hcd *xhci)
 {
+	u32 temp;
+
 	if (xhci_start(xhci)) {
 		xhci_halt(xhci);
 		return -ENODEV;
@@ -611,6 +613,13 @@  static int xhci_run_finished(struct xhci_hcd *xhci)
 	xhci->shared_hcd->state = HC_STATE_RUNNING;
 	xhci->cmd_ring_state = CMD_RING_STATE_RUNNING;
 
+	/* Set the HCD state before we enable the irqs */
+	temp = readl(&xhci->op_regs->command);
+	temp |= (CMD_EIE);
+	xhci_dbg_trace(xhci, trace_xhci_dbg_init,
+			"// Enable interrupts, cmd = 0x%x.", temp);
+	writel(temp, &xhci->op_regs->command);
+
 	if (xhci->quirks & XHCI_NEC_HOST)
 		xhci_ring_cmd_db(xhci);
 
@@ -664,13 +673,6 @@  int xhci_run(struct usb_hcd *hcd)
 	temp |= (xhci->imod_interval / 250) & ER_IRQ_INTERVAL_MASK;
 	writel(temp, &xhci->ir_set->irq_control);
 
-	/* Set the HCD state before we enable the irqs */
-	temp = readl(&xhci->op_regs->command);
-	temp |= (CMD_EIE);
-	xhci_dbg_trace(xhci, trace_xhci_dbg_init,
-			"// Enable interrupts, cmd = 0x%x.", temp);
-	writel(temp, &xhci->op_regs->command);
-
 	temp = readl(&xhci->ir_set->irq_pending);
 	xhci_dbg_trace(xhci, trace_xhci_dbg_init,
 			"// Enabling event ring interrupter %p by writing 0x%x to irq_pending",