Message ID | 20240912193935.1916426-2-quic_wcheng@quicinc.com (mailing list archive) |
---|---|
State | Superseded |
Series | Introduce QC USB SND audio offloading support |
Hi,

> Expose xhci_stop_endpoint_sync() which is a synchronous variant of
> xhci_queue_stop_endpoint(). This is useful for client drivers that are
> using the secondary interrupters, and need to stop/clean up the current
> session. The stop endpoint command handler will also take care of
> cleaning up the ring.

I'm not entirely sure what you meant by "cleaning up the ring" (maybe a
comment would be in order?), but I see nothing being done here after the
command completes and FYI xhci-ring.c will not run the default handler if
the command is queued with a completion, like here.

At least that's the case for certain command types and there is probably
a story behind each of them. I know that xhci_stop_device() queues a
Stop EP with completion (and also a few without(?)). Maybe it's a bug...

Regards,
Michal
Hi Michal,

On 9/13/2024 1:32 AM, Michał Pecio wrote:
> Hi,
>
>> Expose xhci_stop_endpoint_sync() which is a synchronous variant of
>> xhci_queue_stop_endpoint(). This is useful for client drivers that are
>> using the secondary interrupters, and need to stop/clean up the current
>> session. The stop endpoint command handler will also take care of
>> cleaning up the ring.
> I'm not entirely sure what you meant by "cleaning up the ring" (maybe a
> comment would be in order?), but I see nothing being done here after the
> command completes and FYI xhci-ring.c will not run the default handler if
> the command is queued with a completion, like here.
>
> At least that's the case for certain command types and there is probably
> a story behind each of them. I know that xhci_stop_device() queues a
> Stop EP with completion (and also a few without(?)). Maybe it's a bug...

Maybe the last sentence is not needed. When we are using the
secondary interrupters, at least in the offload use case that I've
verified with, the XHCI is completely unaware of what TDs have been
queued, etc... So technically, even if we did call the default
handler (ie xhci_handle_cmd_stop_ep), most of the routines to
invalidate TDs are going to be no-ops.

Thanks
Wesley Cheng
Hi,

> Maybe the last sentence is not needed. When we are using the
> secondary interrupters, at least in the offload use case that I've
> verified with, the XHCI is completely unaware of what TDs have been
> queued, etc... So technically, even if we did call the default
> handler (ie xhci_handle_cmd_stop_ep), most of the routines to
> invalidate TDs are going to be no-ops.

Yes, the cancellation machinery will return immediately if there are
no TDs queued by xhci_hcd itself.

But xhci_handle_cmd_stop_ep() does a few more things for you - it
checks if the command has actually succeeded, clears any halt condition
which may be preventing stopping the endpoint, and it sometimes retries
the command (only on "bad" chips, AFAIK).

This new code does none of the above, so in the general case it can't
even guarantee that the endpoint is stopped when it returns zero. This
should ideally be documented in some way, or fixed, before somebody is
tempted to call it with unrealistically high expectations ;)

As far as I see, it only works for you because isochronous never halts
and Qualcomm HW is (hopefully) free of those stop-after-restart bugs.
There will be problems if the SB tries to use any other endpoint type.

Regards,
Michal
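
[Editorial illustration, not part of the thread.] The point about a zero return not guaranteeing a stopped endpoint can be sketched with a small userspace model. Everything below is an assumption made for illustration: the `model_*` names, the three endpoint states, and the simplified rule that a Stop Endpoint command only succeeds on a running endpoint. None of it is xhci_hcd code, and the "checked" variant is only one possible way to tighten the contract.

```c
#include <assert.h>

/* Hypothetical model types; none of these names exist in xhci_hcd. */
enum model_ep_state { MODEL_EP_RUNNING, MODEL_EP_HALTED, MODEL_EP_STOPPED };
enum model_status { MODEL_SUCCESS, MODEL_CTX_STATE_ERROR, MODEL_ABORTED };

struct model_ep {
	enum model_ep_state state;
};

/* Model controller: Stop Endpoint only succeeds on a running endpoint. */
static enum model_status model_hw_stop(struct model_ep *ep)
{
	if (ep->state != MODEL_EP_RUNNING)
		return MODEL_CTX_STATE_ERROR;
	ep->state = MODEL_EP_STOPPED;
	return MODEL_SUCCESS;
}

/*
 * Mirrors the shape of the posted helper: only an aborted command
 * (the model folds "ring stopped" into this) is reported as an error.
 */
static int model_stop_sync_unchecked(struct model_ep *ep)
{
	assert(ep);
	return model_hw_stop(ep) == MODEL_ABORTED ? -1 : 0;
}

/* Assumed stricter variant: zero means the endpoint really is stopped. */
static int model_stop_sync_checked(struct model_ep *ep)
{
	assert(ep);
	if (model_hw_stop(ep) == MODEL_ABORTED)
		return -1;
	return ep->state == MODEL_EP_STOPPED ? 0 : -1;
}
```

In this model, calling `model_stop_sync_unchecked()` on a halted endpoint returns zero while the endpoint stays halted, whereas `model_stop_sync_checked()` reports a failure. An endpoint that is already stopped is accepted by both.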
Hi Michal,

On 9/15/2024 12:55 AM, Michał Pecio wrote:
> Hi,
>
>> Maybe the last sentence is not needed. When we are using the
>> secondary interrupters, at least in the offload use case that I've
>> verified with, the XHCI is completely unaware of what TDs have been
>> queued, etc... So technically, even if we did call the default
>> handler (ie xhci_handle_cmd_stop_ep), most of the routines to
>> invalidate TDs are going to be no-ops.
> Yes, the cancellation machinery will return immediately if there are
> no TDs queued by xhci_hcd itself.
>
> But xhci_handle_cmd_stop_ep() does a few more things for you - it
> checks if the command has actually succeeded, clears any halt condition
> which may be preventing stopping the endpoint, and it sometimes retries
> the command (only on "bad" chips, AFAIK).
>
> This new code does none of the above, so in the general case it can't
> even guarantee that the endpoint is stopped when it returns zero. This
> should ideally be documented in some way, or fixed, before somebody is
> tempted to call it with unrealistically high expectations ;)
>
> As far as I see, it only works for you because isochronous never halts
> and Qualcomm HW is (hopefully) free of those stop-after-restart bugs.
> There will be problems if the SB tries to use any other endpoint type.

So what I ended up doing was to split off the context error handling
into a separate helper API, which can be also called for the sync ep
stop API. From there, based on say....the helper re queuing the stop
EP command, it would return a specific value to signify that it has
done so. The sync based API will then re-wait for the completion of
the subsequent stop endpoint command that was queued.

In all other context error cases, it'd return the error to the caller,
and its up to them to handle it accordingly.

Thanks
Wesley Cheng
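
[Editorial illustration, not part of the thread.] The control flow described above (a shared helper that may re-queue the command and signal that it did so, and a synchronous caller that then waits again) might look roughly like the following userspace sketch. All `model_*` names, the retry bound, and the "spurious failure" counter standing in for a buggy controller are assumptions; the actual helper was not posted in this message.

```c
#include <assert.h>

/* All names here are hypothetical; this models the control flow only. */
enum model_ep_state { MODEL_EP_RUNNING, MODEL_EP_HALTED, MODEL_EP_STOPPED };
enum model_status { MODEL_SUCCESS, MODEL_CTX_STATE_ERROR };
enum model_verdict { MODEL_FAILED = -1, MODEL_DONE = 0, MODEL_REQUEUED = 1 };

#define MODEL_MAX_RETRIES 5	/* assumed bound so the loop cannot spin forever */

struct model_ep {
	enum model_ep_state state;
	int spurious_failures;		/* stops a "buggy chip" fails while still running */
	int commands_queued;		/* total Stop Endpoint commands issued */
	enum model_status pending;	/* status of the most recently queued command */
};

/* Queue one Stop Endpoint command; the model "controller" executes it at once. */
static void model_queue_stop(struct model_ep *ep)
{
	ep->commands_queued++;
	if (ep->state == MODEL_EP_RUNNING && ep->spurious_failures == 0) {
		ep->state = MODEL_EP_STOPPED;
		ep->pending = MODEL_SUCCESS;
		return;
	}
	if (ep->state == MODEL_EP_RUNNING)
		ep->spurious_failures--;
	ep->pending = MODEL_CTX_STATE_ERROR;
}

/* Stand-in for waiting on the command completion. */
static enum model_status model_wait(const struct model_ep *ep)
{
	return ep->pending;
}

/* Shared helper: classify a completion and re-queue the command if needed. */
static enum model_verdict model_handle_stop_status(struct model_ep *ep,
						   enum model_status status)
{
	if (status == MODEL_SUCCESS || ep->state == MODEL_EP_STOPPED)
		return MODEL_DONE;
	if (ep->state == MODEL_EP_RUNNING) {
		model_queue_stop(ep);	/* retry, and tell the caller to wait again */
		return MODEL_REQUEUED;
	}
	return MODEL_FAILED;		/* e.g. halted: left to the caller */
}

/* Synchronous stop: wait, let the helper decide, and re-wait after a re-queue. */
static int model_stop_endpoint_sync(struct model_ep *ep)
{
	enum model_verdict verdict;
	int tries = 0;

	assert(ep);
	model_queue_stop(ep);
	do {
		verdict = model_handle_stop_status(ep, model_wait(ep));
	} while (verdict == MODEL_REQUEUED && ++tries < MODEL_MAX_RETRIES);

	return verdict == MODEL_DONE ? 0 : -1;
}
```

With a running endpoint and two spurious failures, the model issues three commands in total before reporting success. A halted endpoint is reported as an error after a single command, which corresponds to the "return the error to the caller" case.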
Hi,

> So what I ended up doing was to split off the context error handling
> into a separate helper API, which can be also called for the sync ep
> stop API. From there, based on say....the helper re queuing the stop
> EP command, it would return a specific value to signify that it has
> done so. The sync based API will then re-wait for the completion of
> the subsequent stop endpoint command that was queued.

AFAIK retries are only necessary on buggy hardware. I don't see them on
my controllers except for two old ones, both with the same buggy chip.

> In all other context error cases, it'd return the error to the caller,
> and its up to them to handle it accordingly.

For the record, all existing callers end up ignoring this return value.

Honestly, I don't know if improving this function is worth your effort
if it's working for you as-is. There are no users except xhci-sideband
and probably shouldn't be - besides failing to fix stalled endpoints,
this function also does nothing to prevent automatic restart of the EP
when new URBs are submitted through xhci_hcd, so it is mainly relevant
for sideband users who never submit URBs the usual way.

My issue with this function is that it is simply poorly documented what
it is or isn't expected to achieve (both here and in the calling code
in xhci-sideband.c), and the changelog message is wrong to suggest that
the default completion handler will run (unless somewhere there are
patches to make it happen), making it look like this code can do things
that it really cannot do. And this is apparently a public, exported API.

Regards,
Michal
Hi Michal

On 9/22/2024 4:23 PM, Michał Pecio wrote:
> Hi,
>
>> So what I ended up doing was to split off the context error handling
>> into a separate helper API, which can be also called for the sync ep
>> stop API. From there, based on say....the helper re queuing the stop
>> EP command, it would return a specific value to signify that it has
>> done so. The sync based API will then re-wait for the completion of
>> the subsequent stop endpoint command that was queued.
> AFAIK retries are only necessary on buggy hardware. I don't see them on
> my controllers except for two old ones, both with the same buggy chip.
>
>> In all other context error cases, it'd return the error to the caller,
>> and its up to them to handle it accordingly.
> For the record, all existing callers end up ignoring this return value.
>
> Honestly, I don't know if improving this function is worth your effort
> if it's working for you as-is. There are no users except xhci-sideband
> and probably shouldn't be - besides failing to fix stalled endpoints,
> this function also does nothing to prevent automatic restart of the EP
> when new URBs are submitted through xhci_hcd, so it is mainly relevant
> for sideband users who never submit URBs the usual way.
>
> My issue with this function is that it is simply poorly documented what
> it is or isn't expected to achieve (both here and in the calling code
> in xhci-sideband.c), and the changelog message is wrong to suggest that
> the default completion handler will run (unless somewhere there are
> patches to make it happen), making it look like this code can do things
> that it really cannot do. And this is apparently a public, exported API.

Thanks for the clarifications. Yes, unfortunately, I can't really test
any scenarios where this would be exercised in the current path, so I
will leave the code out for now, and just add some comments and updates
to the commit message.

Can revisit when there is some other users for utilizing secondary
interrupters.

Thanks
Wesley Cheng
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index ed1bb7ed44b0..1fafba95d407 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -2784,6 +2784,45 @@ static int xhci_reserve_bandwidth(struct xhci_hcd *xhci,
 	return -ENOMEM;
 }
 
+/*
+ * Synchronous XHCI stop endpoint helper. Issues the stop endpoint command and
+ * waits for the command completion before returning.
+ */
+int xhci_stop_endpoint_sync(struct xhci_hcd *xhci, struct xhci_virt_ep *ep, int suspend,
+			    gfp_t gfp_flags)
+{
+	struct xhci_command *command;
+	unsigned long flags;
+	int ret;
+
+	command = xhci_alloc_command(xhci, true, gfp_flags);
+	if (!command)
+		return -ENOMEM;
+
+	spin_lock_irqsave(&xhci->lock, flags);
+	ret = xhci_queue_stop_endpoint(xhci, command, ep->vdev->slot_id,
+				       ep->ep_index, suspend);
+	if (ret < 0) {
+		spin_unlock_irqrestore(&xhci->lock, flags);
+		goto out;
+	}
+
+	xhci_ring_cmd_db(xhci);
+	spin_unlock_irqrestore(&xhci->lock, flags);
+
+	wait_for_completion(command->completion);
+
+	if (command->status == COMP_COMMAND_ABORTED ||
+	    command->status == COMP_COMMAND_RING_STOPPED) {
+		xhci_warn(xhci, "Timeout while waiting for stop endpoint command\n");
+		ret = -ETIME;
+	}
+out:
+	xhci_free_command(xhci, command);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(xhci_stop_endpoint_sync);
 
 /* Issue a configure endpoint command or evaluate context command
  * and wait for it to finish.
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 324644165d93..51a992d8ffcf 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1917,6 +1917,8 @@ void xhci_ring_doorbell_for_active_rings(struct xhci_hcd *xhci,
 void xhci_cleanup_command_queue(struct xhci_hcd *xhci);
 void inc_deq(struct xhci_hcd *xhci, struct xhci_ring *ring);
 unsigned int count_trbs(u64 addr, u64 len);
+int xhci_stop_endpoint_sync(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
+			    int suspend, gfp_t gfp_flags);
 
 /* xHCI roothub code */
 void xhci_set_link_state(struct xhci_hcd *xhci, struct xhci_port *port,