Message ID | 20230718112600.3969141-2-xu.yang_2@nxp.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] usb: host: ehci-sched: add itd/sitd_unlink_urb() functions | expand |
On Tue, Jul 18, 2023 at 07:26:00PM +0800, Xu Yang wrote: > In current design, the ehci driver will not unlink itd/sitds from the > hardware list when dequeue isochronous urbs. Rather just wait until they > complete normally or their time slot expires. However, this will cause > issues if the controller has stopped periodic schedule before finished > all periodic schedule. The urb will not be done forever in this case and > then usb_kill/poison_urb() will always wait there. > > The ChipIdea IP exactly has a bug: if frame babble occurs during periodic > transfer, PE (PORTSC.bit2) will be cleared and the controller will stop > periodic schedule immediately. So if the user tries to kill or poison > related urb, it will wait there since the urb can't be done forever. > > This patch will check if this issue occurs, then it will unlink itd/sitds > from the hardware list depends on the result. This is not the right approach. There already is code in the driver for unlinking itds/sitds when the periodic schedule isn't running: See how the "live" variable is used in scan_isoc(). You don't need to add new code to do the same thing, you simply have to make sure that live is set to false if the controller has stopped the periodic schedule unexpectedly. (Be very careful about handling the case where CMD_PSE is set and STS_PSS is clear. This can happen when the controller has been told to start the periodic schedule but it hasn't done so yet.) Alan Stern
Hi Alan, > On Tue, Jul 18, 2023 at 07:26:00PM +0800, Xu Yang wrote: > > In current design, the ehci driver will not unlink itd/sitds from the > > hardware list when dequeue isochronous urbs. Rather just wait until they > > complete normally or their time slot expires. However, this will cause > > issues if the controller has stopped periodic schedule before finished > > all periodic schedule. The urb will not be done forever in this case and > > then usb_kill/poison_urb() will always wait there. > > > > The ChipIdea IP exactly has a bug: if frame babble occurs during periodic > > transfer, PE (PORTSC.bit2) will be cleared and the controller will stop > > periodic schedule immediately. So if the user tries to kill or poison > > related urb, it will wait there since the urb can't be done forever. > > > > This patch will check if this issue occurs, then it will unlink itd/sitds > > from the hardware list depends on the result. > > This is not the right approach. There already is code in the driver for > unlinking itds/sitds when the periodic schedule isn't running: See how > the "live" variable is used in scan_isoc(). You don't need to add new > code to do the same thing, you simply have to make sure that live is set > to false if the controller has stopped the periodic schedule > unexpectedly. > > (Be very careful about handling the case where CMD_PSE is set and > STS_PSS is clear. This can happen when the controller has been told to > start the periodic schedule but it hasn't done so yet.) Many thanks for your comments and suggestions. Yes, there is a "live" variable in scan_isoc() to handle the case where root hub has stopped periodic schedule. I have rechecked the root cause of the issue , that is the USB controller has disabled the port (PE cleared) and asserted USBERRINT when frame babble is detected, but PEC is not asserted. Therefore, the SW didn't aware that port has been disabled. The periodic schedule keeps running at this moment. Then the SW keeps sending packets to this port, but all of the transfers will fail. But periodic schedule will also be disabled after a period of time. Finally, all of the linked itds are penging there. The code can't enter into scan_isoc() if only USBERRINT is asserted. For this issue, I think the SW needs to aware that the port has been disabled although PEC not asserted by HW. I will send another patch to fix this issue later. Thanks, Xu Yang > > Alan Stern
On Sat, Jul 29, 2023 at 06:55:07AM +0000, Xu Yang wrote: > Many thanks for your comments and suggestions. > > Yes, there is a "live" variable in scan_isoc() to handle the case where > root hub has stopped periodic schedule. I have rechecked the root cause > of the issue , that is the USB controller has disabled the port (PE cleared) > and asserted USBERRINT when frame babble is detected, but PEC is not > asserted. Therefore, the SW didn't aware that port has been disabled. > The periodic schedule keeps running at this moment. Then the SW keeps > sending packets to this port, but all of the transfers will fail. But periodic > schedule will also be disabled after a period of time. Finally, all of the linked > itds are penging there. The code can't enter into scan_isoc() if only USBERRINT > is asserted. That's not true. The io_watchdog timer continues to fire periodically (at 100 ms intervals) as long as isoc_count > 0. The timer's handler is ehci_work(), which calls scan_isoc() if isoc_count > 0. > For this issue, I think the SW needs to aware that the port has been disabled > although PEC not asserted by HW. I will send another patch to fix this issue > later. Yes, that sounds like a nasty problem. The kernel wouldn't realize that the device had disconnected. Alan Stern
Hi Alan, > On Sat, Jul 29, 2023 at 06:55:07AM +0000, Xu Yang wrote: > > Many thanks for your comments and suggestions. > > > > Yes, there is a "live" variable in scan_isoc() to handle the case where > > root hub has stopped periodic schedule. I have rechecked the root cause > > of the issue , that is the USB controller has disabled the port (PE cleared) > > and asserted USBERRINT when frame babble is detected, but PEC is not > > asserted. Therefore, the SW didn't aware that port has been disabled. > > The periodic schedule keeps running at this moment. Then the SW keeps > > sending packets to this port, but all of the transfers will fail. But periodic > > schedule will also be disabled after a period of time. Finally, all of the linked > > itds are penging there. The code can't enter into scan_isoc() if only USBERRINT > > is asserted. > > That's not true. The io_watchdog timer continues to fire periodically > (at 100 ms intervals) as long as isoc_count > 0. The timer's handler is > ehci_work(), which calls scan_isoc() if isoc_count > 0. Yes, the io_watchdog timer will cleanup the isoc periodically as long as isoc_count > 0. I did see all of the linked itds are pending there in my case at the end. After my debug, I found the chipidea/host.c had set ehci->need_io_watchdog to 0 which will have impact on turn_on_io_watchdog(). The host has enabled 1 intr endpoint and 2 isoc endpoints from USB camera. Therefore, ehci->intr_count is always 1 and ehci->isoc_count is changing from time to time during camera is running. When PE is cleared by HW, isoc_count may be decreased to 0 after scan_isoc(). When turn_on_io_watchdog() is called, EHCI_HRTIMER_IO_WATCHDOG event will not be enabled due to isoc_count=0 and need_io_watchdog=0 too. When isoc urb is submited again, enable_periodic() will still not enable EHCI_HRTIMER_IO_WATCHDOG event due to periodic_count>0. Then, the linked itds are pending there as long as intr urb has not been completed. Thanks, Xu Yang > > > For this issue, I think the SW needs to aware that the port has been disabled > > although PEC not asserted by HW. I will send another patch to fix this issue > > later. > > Yes, that sounds like a nasty problem. The kernel wouldn't realize > that the device had disconnected. > > Alan Stern
On Tue, Aug 08, 2023 at 10:03:39AM +0000, Xu Yang wrote: > Hi Alan, > > > On Sat, Jul 29, 2023 at 06:55:07AM +0000, Xu Yang wrote: > > > Many thanks for your comments and suggestions. > > > > > > Yes, there is a "live" variable in scan_isoc() to handle the case where > > > root hub has stopped periodic schedule. I have rechecked the root cause > > > of the issue , that is the USB controller has disabled the port (PE cleared) > > > and asserted USBERRINT when frame babble is detected, but PEC is not > > > asserted. Therefore, the SW didn't aware that port has been disabled. > > > The periodic schedule keeps running at this moment. Then the SW keeps > > > sending packets to this port, but all of the transfers will fail. But periodic > > > schedule will also be disabled after a period of time. Finally, all of the linked > > > itds are penging there. The code can't enter into scan_isoc() if only USBERRINT > > > is asserted. > > > > That's not true. The io_watchdog timer continues to fire periodically > > (at 100 ms intervals) as long as isoc_count > 0. The timer's handler is > > ehci_work(), which calls scan_isoc() if isoc_count > 0. > > Yes, the io_watchdog timer will cleanup the isoc periodically as long as > isoc_count > 0. > > I did see all of the linked itds are pending there in my case at the end. After my > debug, I found the chipidea/host.c had set ehci->need_io_watchdog to 0 which > will have impact on turn_on_io_watchdog(). > > The host has enabled 1 intr endpoint and 2 isoc endpoints from USB camera. > Therefore, ehci->intr_count is always 1 and ehci->isoc_count is changing from > time to time during camera is running. When PE is cleared by HW, isoc_count > may be decreased to 0 after scan_isoc(). When turn_on_io_watchdog() is called, > EHCI_HRTIMER_IO_WATCHDOG event will not be enabled due to isoc_count=0 > and need_io_watchdog=0 too. When isoc urb is submited again, enable_periodic() > will still not enable EHCI_HRTIMER_IO_WATCHDOG event due to periodic_count>0. Ooh, that's a bug. enable_periodic() should call turn_on_io_watchdog() no matter what value periodic_count has. Would you like to fix it? Alan Stern
Hi Alan, > On Tue, Aug 08, 2023 at 10:03:39AM +0000, Xu Yang wrote: > > Hi Alan, > > > > > On Sat, Jul 29, 2023 at 06:55:07AM +0000, Xu Yang wrote: > > > > Many thanks for your comments and suggestions. > > > > > > > > Yes, there is a "live" variable in scan_isoc() to handle the case where > > > > root hub has stopped periodic schedule. I have rechecked the root cause > > > > of the issue , that is the USB controller has disabled the port (PE cleared) > > > > and asserted USBERRINT when frame babble is detected, but PEC is not > > > > asserted. Therefore, the SW didn't aware that port has been disabled. > > > > The periodic schedule keeps running at this moment. Then the SW keeps > > > > sending packets to this port, but all of the transfers will fail. But periodic > > > > schedule will also be disabled after a period of time. Finally, all of the linked > > > > itds are penging there. The code can't enter into scan_isoc() if only USBERRINT > > > > is asserted. > > > > > > That's not true. The io_watchdog timer continues to fire periodically > > > (at 100 ms intervals) as long as isoc_count > 0. The timer's handler is > > > ehci_work(), which calls scan_isoc() if isoc_count > 0. > > > > Yes, the io_watchdog timer will cleanup the isoc periodically as long as > > isoc_count > 0. > > > > I did see all of the linked itds are pending there in my case at the end. After my > > debug, I found the chipidea/host.c had set ehci->need_io_watchdog to 0 which > > will have impact on turn_on_io_watchdog(). > > > > The host has enabled 1 intr endpoint and 2 isoc endpoints from USB camera. > > Therefore, ehci->intr_count is always 1 and ehci->isoc_count is changing from > > time to time during camera is running. When PE is cleared by HW, isoc_count > > may be decreased to 0 after scan_isoc(). When turn_on_io_watchdog() is called, > > EHCI_HRTIMER_IO_WATCHDOG event will not be enabled due to isoc_count=0 > > and need_io_watchdog=0 too. When isoc urb is submited again, enable_periodic() > > will still not enable EHCI_HRTIMER_IO_WATCHDOG event due to periodic_count>0. > > Ooh, that's a bug. enable_periodic() should call turn_on_io_watchdog() > no matter what value periodic_count has. Would you like to fix it? Sure. Will try to fix it. Thanks, Xu Yang > > Alan Stern
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c index a1930db0da1c..26dc1d1ae5e8 100644 --- a/drivers/usb/host/ehci-hcd.c +++ b/drivers/usb/host/ehci-hcd.c @@ -930,10 +930,41 @@ static int ehci_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status) if (usb_pipetype(urb->pipe) == PIPE_ISOCHRONOUS) { /* - * We don't expedite dequeue for isochronous URBs. + * 1. We don't expedite dequeue for isochronous URBs. * Just wait until they complete normally or their * time slot expires. + * + * 2. The ChipIdea IP has a bug: if frame babble occurs, + * PE will be cleared and the controller will stop periodic + * schedule. So if we don't force dequeue this urb, it + * won't be done forever. Here, a force dequeue is needed + * for this case. */ + unsigned i = HCS_N_PORTS (ehci->hcs_params); + bool need_force_dequeue = false; + + while (i--) { + int pstatus; + + pstatus = ehci_readl(ehci, + &ehci->regs->port_status[i]); + + /* Any cleared PE means controller has stopped + * periodic schedule. + */ + if (!(pstatus & PORT_PE)) { + need_force_dequeue = true; + break; + } + } + + if (!need_force_dequeue) + goto done; + + if (urb->dev->speed == USB_SPEED_HIGH) + itd_unlink_urb(ehci, urb); + else + sitd_unlink_urb(ehci, urb); } else { qh = (struct ehci_qh *) urb->hcpriv; qh->unlink_reason |= QH_UNLINK_REQUESTED;
In current design, the ehci driver will not unlink itd/sitds from the hardware list when dequeue isochronous urbs. Rather just wait until they complete normally or their time slot expires. However, this will cause issues if the controller has stopped periodic schedule before finished all periodic schedule. The urb will not be done forever in this case and then usb_kill/poison_urb() will always wait there. The ChipIdea IP exactly has a bug: if frame babble occurs during periodic transfer, PE (PORTSC.bit2) will be cleared and the controller will stop periodic schedule immediately. So if the user tries to kill or poison related urb, it will wait there since the urb can't be done forever. This patch will check if this issue occurs, then it will unlink itd/sitds from the hardware list depends on the result. Signed-off-by: Xu Yang <xu.yang_2@nxp.com> --- drivers/usb/host/ehci-hcd.c | 33 ++++++++++++++++++++++++++++++++- 1 file changed, 32 insertions(+), 1 deletion(-)