[03/15] usb: xhci: Don't skip on Stopped - Length Invalid

Message ID 20250306144954.3507700-4-mathias.nyman@linux.intel.com (mailing list archive)
State Superseded
Commit 58d0a3fab5f4fdc112c16a4c6d382f62097afd1c
Series xhci features for usb-next

Commit Message

Mathias Nyman March 6, 2025, 2:49 p.m. UTC
From: Michal Pecio <michal.pecio@gmail.com>

Up until commit d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are
returned when isoc ring is stopped") in v6.11, the driver didn't skip
missed isochronous TDs when handling Stopped and Stopped - Length
Invalid events. Instead, it erroneously cleared the skip flag, which
would cause the ring to get stuck, as future events won't match the
missed TD which is never removed from the queue until it's cancelled.

This buggy logic seems to have been in place substantially unchanged
since the 3.x series over 10 years ago, which probably speaks first
and foremost about the relative rarity of this case in normal usage, but
by the spec I see no reason why it shouldn't be possible.

After d56b0b2ab142, TDs are immediately skipped when handling those
Stopped events. This poses a potential problem in case of Stopped -
Length Invalid, which occurs either on completed TDs (likely already
given back) or Link and No-Op TRBs. Such an event won't be recognized
as matching any TD (unless it's the rare Link TRB inside a TD) and
will result in skipping all pending TDs, giving them back possibly
before they are done, risking isoc data loss and maybe UAF by HW.

As a compromise, don't skip and don't clear the skip flag on this
kind of event. Then the next event will skip missed TDs. A downside
of not handling Stopped - Length Invalid on a Link inside a TD is
that if the TD is cancelled, its actual length will not be updated
to account for TRBs (silently) completed before the TD was stopped.
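
To make the difference concrete, here is a toy model (not driver code) of
an event that matches none of the queued isoc TDs while the skip flag is
set. The enum, the function name and the `patched` switch are all
hypothetical; only the idea of bailing out on Stopped - Length Invalid
comes from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two completion codes discussed above. */
enum comp_code { CC_STOPPED, CC_STOPPED_LENGTH_INVALID };

/*
 * Toy model: the skip flag is set and the event matches none of the
 * n queued isoc TDs. Marks every TD that would be given back and
 * returns how many were skipped.
 */
size_t skip_on_unmatched_event(enum comp_code code, bool patched,
			       bool *given_back, size_t n)
{
	size_t skipped = 0;

	assert(given_back != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* With the patch: bail out, keep the TDs and the skip flag. */
		if (patched && code == CC_STOPPED_LENGTH_INVALID)
			break;

		given_back[i] = true;	/* models skip_isoc_td() */
		skipped++;
	}
	return skipped;
}
```

With three pending TDs and a Stopped - Length Invalid event, the unpatched
path gives all three back, while the patched path gives back none and
leaves skipping to the next event.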

I had no luck producing this sequence of completion events so there
is no compelling demonstration of any resulting disaster. It may be
a very rare, obscure condition. The sole motivation for this patch
is that if such an unlikely event does occur, I'd rather risk reporting
a cancelled partially done isoc frame as empty than gamble with UAF.

This will be fixed more properly by looking at Stopped event's TRB
pointer when making skipping decisions, but such rework is unlikely
to be backported to v6.12, which will stay around for a few years.
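
The rework itself is not described here, so the following is only one
possible reading of "looking at Stopped event's TRB pointer", written as
a standalone sketch. It assumes that TDs ending before the TRB the event
points to are the only ones that can have been missed. `struct toy_td`,
`count_skippable_tds()` and the flat TRB positions (no ring wrap-around)
are hypothetical simplifications, not anything taken from xhci-ring.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical TD described by the positions of its first and last TRB. */
struct toy_td {
	size_t first_trb;
	size_t last_trb;
};

/*
 * Count how many TDs at the head of the queue end before the TRB the
 * Stopped event points to. Only those could have been missed; the rest
 * are still pending and are left alone. Assumes positions increase
 * monotonically (no ring wrap-around) and TDs are queued in order.
 */
size_t count_skippable_tds(const struct toy_td *tds, size_t n,
			   size_t event_trb)
{
	size_t skippable = 0;

	assert(tds != NULL || n == 0);

	while (skippable < n && tds[skippable].last_trb < event_trb)
		skippable++;
	return skippable;
}
```

In this model an event on a Link TRB sitting between two TDs skips only
the TDs queued before it, instead of all of them or none of them.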

Fixes: d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are returned when isoc ring is stopped")
Cc: stable@vger.kernel.org
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Greg KH March 6, 2025, 2:52 p.m. UTC | #1
On Thu, Mar 06, 2025 at 04:49:42PM +0200, Mathias Nyman wrote:
> From: Michal Pecio <michal.pecio@gmail.com>
> 
> Up until commit d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are
> returned when isoc ring is stopped") in v6.11, the driver didn't skip
> missed isochronous TDs when handling Stopped and Stopped - Length
> Invalid events. Instead, it erroneously cleared the skip flag, which
> would cause the ring to get stuck, as future events won't match the
> missed TD which is never removed from the queue until it's cancelled.
> 
> This buggy logic seems to have been in place substantially unchanged
> since the 3.x series over 10 years ago, which probably speaks first
> and foremost about the relative rarity of this case in normal usage, but
> by the spec I see no reason why it shouldn't be possible.
> 
> After d56b0b2ab142, TDs are immediately skipped when handling those
> Stopped events. This poses a potential problem in case of Stopped -
> Length Invalid, which occurs either on completed TDs (likely already
> given back) or Link and No-Op TRBs. Such an event won't be recognized
> as matching any TD (unless it's the rare Link TRB inside a TD) and
> will result in skipping all pending TDs, giving them back possibly
> before they are done, risking isoc data loss and maybe UAF by HW.
> 
> As a compromise, don't skip and don't clear the skip flag on this
> kind of event. Then the next event will skip missed TDs. A downside
> of not handling Stopped - Length Invalid on a Link inside a TD is
> that if the TD is cancelled, its actual length will not be updated
> to account for TRBs (silently) completed before the TD was stopped.
> 
> I had no luck producing this sequence of completion events so there
> is no compelling demonstration of any resulting disaster. It may be
> a very rare, obscure condition. The sole motivation for this patch
> is that if such an unlikely event does occur, I'd rather risk reporting
> a cancelled partially done isoc frame as empty than gamble with UAF.
> 
> This will be fixed more properly by looking at Stopped event's TRB
> pointer when making skipping decisions, but such rework is unlikely
> to be backported to v6.12, which will stay around for a few years.
> 
> Fixes: d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are returned when isoc ring is stopped")
> Cc: stable@vger.kernel.org
> Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>

Why is a patch cc: stable buried here in a series for linux-next?  It
will be many many weeks before it gets out to anyone else, is that
intentional?

Same for the other commit in this series tagged that way.

thanks,

greg k-h

Mathias Nyman March 6, 2025, 3:29 p.m. UTC | #2
On 6.3.2025 16.52, Greg KH wrote:
> On Thu, Mar 06, 2025 at 04:49:42PM +0200, Mathias Nyman wrote:
> Why is a patch cc: stable buried here in a series for linux-next?  It
> will be many many weeks before it gets out to anyone else, is that
> intentional?
> 
> Same for the other commit in this series tagged that way.

These are both kind of half theoretical issues that have been
around for years without more complaints. No need to rush them to
stable. It's a balance between regression risk and adding them to stable.

This patch for example states:

"I had no luck producing this sequence of completion events so there
  is no compelling demonstration of any resulting disaster. It may be
  a very rare, obscure condition. The sole motivation for this patch
  is that if such unlikely event does occur, I'd rather risk reporting
  a cancelled partially done isoc frame as empty than gamble with UAF."

Thanks
Mathias

Greg KH March 6, 2025, 3:42 p.m. UTC | #3
On Thu, Mar 06, 2025 at 05:29:30PM +0200, Mathias Nyman wrote:
> On 6.3.2025 16.52, Greg KH wrote:
> > On Thu, Mar 06, 2025 at 04:49:42PM +0200, Mathias Nyman wrote:
> > Why is a patch cc: stable buried here in a series for linux-next?  It
> > will be many many weeks before it gets out to anyone else, is that
> > intentional?
> > 
> > Same for the other commit in this series tagged that way.
> 
> These are both kind of half theoretical issues that have been
> around for years without more complaints. No need to rush them to
> stable. It's a balance between regression risk and adding them to stable.
> 
> This patch for example states:
> 
> "I had no luck producing this sequence of completion events so there
>  is no compelling demonstration of any resulting disaster. It may be
>  a very rare, obscure condition. The sole motivation for this patch
>  is that if such unlikely event does occur, I'd rather risk reporting
>  a cancelled partially done isoc frame as empty than gamble with UAF."

Ok, fair enough, just seeing patches languish in -next that are tagged
for stable looks odd.

thanks,

greg k-h

Patch

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 23cf20026359..6fb48d30ec21 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2828,6 +2828,10 @@  static int handle_tx_event(struct xhci_hcd *xhci,
 		if (!ep_seg) {
 
 			if (ep->skip && usb_endpoint_xfer_isoc(&td->urb->ep->desc)) {
+				/* this event is unlikely to match any TD, don't skip them all */
+				if (trb_comp_code == COMP_STOPPED_LENGTH_INVALID)
+					return 0;
+
 				skip_isoc_td(xhci, td, ep, status);
 				if (!list_empty(&ep_ring->td_list))
 					continue;