mbox series

[v3,0/4] Add logic to consolidate TRBs for Synopsys xHC

Message ID cover.1590415123.git.joglekar@synopsys.com (mailing list archive)
Headers show
Series Add logic to consolidate TRBs for Synopsys xHC | expand

Message

Tejas Joglekar May 27, 2020, 10:40 a.m. UTC
The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
for HS. The controller loads and updates the TRB cache from the
transfer ring in system memory whenever the driver issues a start
transfer or update transfer command.

For chained TRBs, the Synopsys xHC requires that the total amount of
bytes for all TRBs loaded in the TRB cache be greater than or equal to
1 MPS. Or the chain ends within the TRB cache (with a last TRB).

If this requirement is not met, the controller will not be able to
send or receive a packet and it will hang causing a driver timeout and
error.

This patch set adds logic to the XHCI driver to detect and prevent this
from happening along with the quirk to enable this logic for Synopsys
HAPS platform.

Based on Mathias's feedback on previous implementation where consolidation
was done in TRB cache, with this patch series the implementation is done
during mapping of the URB by consolidating the SG list into a temporary
buffer if the SG list buffer sizes within TRB_CACHE_SIZE is less than MPS.

Changes since v2:
 - Modified the xhci_unmap_temp_buffer function to unmap dma and clear
   the dma flag
 - Removed RFC tag

Changes since v1:
 - Comments from Greg are addressed on [PATCH 4/4] and [PATCH 1/4]
 - Renamed the property and quirk as in other patches based on [PATCH 1/4]

Tejas Joglekar (4):
  dt-bindings: usb: Add documentation for SG trb cache size quirk
  usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK
  usb: dwc3: Add device property sgl-trb-cache-size-quirk
  usb: xhci: Use temporary buffer to consolidate SG

 Documentation/devicetree/bindings/usb/dwc3.txt     |   4 +
 Documentation/devicetree/bindings/usb/usb-xhci.txt |   3 +
 drivers/usb/dwc3/core.c                            |   2 +
 drivers/usb/dwc3/core.h                            |   2 +
 drivers/usb/dwc3/dwc3-haps.c                       |   1 +
 drivers/usb/dwc3/host.c                            |   6 +-
 drivers/usb/host/xhci-pci.c                        |   3 +
 drivers/usb/host/xhci-plat.c                       |   4 +
 drivers/usb/host/xhci-ring.c                       |   2 +-
 drivers/usb/host/xhci.c                            | 135 +++++++++++++++++++++
 drivers/usb/host/xhci.h                            |   5 +
 11 files changed, 165 insertions(+), 2 deletions(-)

Comments

Tejas Joglekar June 8, 2020, 4:32 a.m. UTC | #1
Hi Mathias,
   Will this be added to your next branch ?
On 5/27/2020 4:10 PM, Tejas Joglekar wrote:
> The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
> each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
> for HS. The controller loads and updates the TRB cache from the
> transfer ring in system memory whenever the driver issues a start
> transfer or update transfer command.
> 
> For chained TRBs, the Synopsys xHC requires that the total amount of
> bytes for all TRBs loaded in the TRB cache be greater than or equal to
> 1 MPS. Or the chain ends within the TRB cache (with a last TRB).
> 
> If this requirement is not met, the controller will not be able to
> send or receive a packet and it will hang causing a driver timeout and
> error.
> 
> This patch set adds logic to the XHCI driver to detect and prevent this
> from happening along with the quirk to enable this logic for Synopsys
> HAPS platform.
> 
> Based on Mathias's feedback on previous implementation where consolidation
> was done in TRB cache, with this patch series the implementation is done
> during mapping of the URB by consolidating the SG list into a temporary
> buffer if the SG list buffer sizes within TRB_CACHE_SIZE is less than MPS.
> 
> Changes since v2:
>  - Modified the xhci_unmap_temp_buffer function to unmap dma and clear
>    the dma flag
>  - Removed RFC tag
> 
> Changes since v1:
>  - Comments from Greg are addressed on [PATCH 4/4] and [PATCH 1/4]
>  - Renamed the property and quirk as in other patches based on [PATCH 1/4]
> 
> Tejas Joglekar (4):
>   dt-bindings: usb: Add documentation for SG trb cache size quirk
>   usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK
>   usb: dwc3: Add device property sgl-trb-cache-size-quirk
>   usb: xhci: Use temporary buffer to consolidate SG
> 
>  Documentation/devicetree/bindings/usb/dwc3.txt     |   4 +
>  Documentation/devicetree/bindings/usb/usb-xhci.txt |   3 +
>  drivers/usb/dwc3/core.c                            |   2 +
>  drivers/usb/dwc3/core.h                            |   2 +
>  drivers/usb/dwc3/dwc3-haps.c                       |   1 +
>  drivers/usb/dwc3/host.c                            |   6 +-
>  drivers/usb/host/xhci-pci.c                        |   3 +
>  drivers/usb/host/xhci-plat.c                       |   4 +
>  drivers/usb/host/xhci-ring.c                       |   2 +-
>  drivers/usb/host/xhci.c                            | 135 +++++++++++++++++++++
>  drivers/usb/host/xhci.h                            |   5 +
>  11 files changed, 165 insertions(+), 2 deletions(-)
> 

Thanks and Regards,
 Tejas Joglekar
Mathias Nyman June 9, 2020, 8:57 a.m. UTC | #2
On 8.6.2020 7.32, Tejas Joglekar wrote:
> Hi Mathias,
>    Will this be added to your next branch ?

There are still some opens Rob Herring pointed out regarding devicetree.
Adding a compatible string for the synopsys xhci and setting quirk based
on that sounds good to me.

Not sure how that works in cases where the xhci device is created by the DWC3 driver.
Once we have a solution that Felipe and Rob agrees with I can take the whole
series.  

The xhci parts and PCI case looks good to me.

-Mathias
Tejas Joglekar June 11, 2020, 6:07 p.m. UTC | #3
Hi,
On 6/9/2020 2:27 PM, Mathias Nyman wrote:
> On 8.6.2020 7.32, Tejas Joglekar wrote:
>> Hi Mathias,
>>    Will this be added to your next branch ?
> 
> There are still some opens Rob Herring pointed out regarding devicetree.
> Adding a compatible string for the synopsys xhci and setting quirk based
> on that sounds good to me.
> 
> Not sure how that works in cases where the xhci device is created by the DWC3 driver.
> Once we have a solution that Felipe and Rob agrees with I can take the whole
> series.  
> 
@Felipe: Can you please have a look and see if the current implementation is ok or anything to be changed?

> The xhci parts and PCI case looks good to me.
> 
> -Mathias
> 
> 
Thanks & Regards,
 Tejas Joglekar
Tejas Joglekar June 30, 2020, 6:28 a.m. UTC | #4
Hi Felipe,
On 6/11/2020 11:37 PM, Tejas Joglekar wrote:
> Hi,
> On 6/9/2020 2:27 PM, Mathias Nyman wrote:
>> On 8.6.2020 7.32, Tejas Joglekar wrote:
>>> Hi Mathias,
>>>    Will this be added to your next branch ?
>>
>> There are still some opens Rob Herring pointed out regarding devicetree.
>> Adding a compatible string for the synopsys xhci and setting quirk based
>> on that sounds good to me.
>>
>> Not sure how that works in cases where the xhci device is created by the DWC3 driver.
>> Once we have a solution that Felipe and Rob agrees with I can take the whole
>> series.  
>>
> @Felipe: Can you please have a look and see if the current implementation is ok or anything to be changed?
> 
Can you please have review my patch and suggest the changes required, so Mathias can accept the series?

>> The xhci parts and PCI case looks good to me.
>>
>> -Mathias
>>
>>
> Thanks & Regards,
>  Tejas Joglekar
> 

Thanks & Regards,
  Tejas Joglekar
Jun Li July 23, 2020, 10:35 a.m. UTC | #5
Tejas Joglekar <Tejas.Joglekar@synopsys.com> 于2020年5月27日周三 下午7:54写道:
>
> The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
> each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
> for HS. The controller loads and updates the TRB cache from the
> transfer ring in system memory whenever the driver issues a start
> transfer or update transfer command.
>
> For chained TRBs, the Synopsys xHC requires that the total amount of
> bytes for all TRBs loaded in the TRB cache be greater than or equal to
> 1 MPS. Or the chain ends within the TRB cache (with a last TRB).
>
> If this requirement is not met, the controller will not be able to
> send or receive a packet and it will hang causing a driver timeout and
> error.

Hi Tejas Joglekar

I am debugging  a similar issue on Synipsys XHC, it's not the same case
but I am wondering if it also linked to this HW limitation.

My Synopsys XHC based host enable UAS, when enumerates a UAS
HDD, one BULK-IN EP with stream enabled will not generate event for
trb(with stream ID 1) after a 16/4096 bytes(with stream ID 2) finished in
previous trb.

If I change the last OK urb/trb's buffer length from 4096 to 512, the issue
will gone.

following is the sequence of the question EP-IN:

<idle>-0     [000] d.h1   154.961710: xhci_urb_giveback: ep3in-bulk:
urb ffff0001775f6f00 pipe 3221324672 slot 1 length 36/36 sgs 1/1
stream 1 flags 00040200
<idle>-0     [000] d.h1   154.962023: xhci_urb_giveback: ep3in-bulk:
urb ffff000177d00400 pipe 3221324672 slot 1 length 96/96 sgs 1/1
stream 1 flags 00040200
<idle>-0     [000] d.h1   154.970395: xhci_urb_giveback: ep3in-bulk:
urb ffff000177d00400 pipe 3221324672 slot 1 length 11/255 sgs 1/1
stream 1 flags 00040200
<idle>-0     [000] d.h1   154.970562: xhci_urb_giveback: ep3in-bulk:
urb ffff000177d00400 pipe 3221324672 slot 1 length 20/255 sgs 1/1
stream 1 flags 00040200
<idle>-0     [000] d.h1   154.970786: xhci_urb_giveback: ep3in-bulk:
urb ffff000177d00400 pipe 3221324672 slot 1 length 60/255 sgs 1/1
stream 1 flags 00040200
<idle>-0     [000] d.h1   155.851600: xhci_urb_giveback: ep3in-bulk:
urb ffff000177d00200 pipe 3221324672 slot 1 length 16/4096 sgs 1/1
stream 2 flags 00040200

/* then the next ep3-in trb will not generate event and stopped, so
driver timeout in the end */
kworker/u8:2-349   [003] d..3   155.851987: xhci_urb_enqueue:
ep3in-bulk: urb ffff000170492400 pipe 3221324672 slot 1 length 0/32
sgs 1/1 stream 1 flags 00040200
kworker/u8:2-349   [003] d..4   155.851989: xhci_queue_trb: STREAM:
Buffer 00000000c19cf000 length 32 TD size 0 intr 0 type 'Normal' flags
b:i:I:c:s:I:e:c
kworker/u8:2-349   [003] d..4   155.851991: xhci_inc_enq: STREAM
ffff000177f86f80: enq 0x00000000be0eb060(0x00000000be0eb000) deq
0x00000000be0eb050(0x00000000be0eb000) segs 2 stream 1 free_trbs 508
bounce 1024 cycle 1

Do you have any ideas?

thanks
Li Jun
>
> This patch set adds logic to the XHCI driver to detect and prevent this
> from happening along with the quirk to enable this logic for Synopsys
> HAPS platform.
>
> Based on Mathias's feedback on previous implementation where consolidation
> was done in TRB cache, with this patch series the implementation is done
> during mapping of the URB by consolidating the SG list into a temporary
> buffer if the SG list buffer sizes within TRB_CACHE_SIZE is less than MPS.
>
> Changes since v2:
>  - Modified the xhci_unmap_temp_buffer function to unmap dma and clear
>    the dma flag
>  - Removed RFC tag
>
> Changes since v1:
>  - Comments from Greg are addressed on [PATCH 4/4] and [PATCH 1/4]
>  - Renamed the property and quirk as in other patches based on [PATCH 1/4]
>
> Tejas Joglekar (4):
>   dt-bindings: usb: Add documentation for SG trb cache size quirk
>   usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK
>   usb: dwc3: Add device property sgl-trb-cache-size-quirk
>   usb: xhci: Use temporary buffer to consolidate SG
>
>  Documentation/devicetree/bindings/usb/dwc3.txt     |   4 +
>  Documentation/devicetree/bindings/usb/usb-xhci.txt |   3 +
>  drivers/usb/dwc3/core.c                            |   2 +
>  drivers/usb/dwc3/core.h                            |   2 +
>  drivers/usb/dwc3/dwc3-haps.c                       |   1 +
>  drivers/usb/dwc3/host.c                            |   6 +-
>  drivers/usb/host/xhci-pci.c                        |   3 +
>  drivers/usb/host/xhci-plat.c                       |   4 +
>  drivers/usb/host/xhci-ring.c                       |   2 +-
>  drivers/usb/host/xhci.c                            | 135 +++++++++++++++++++++
>  drivers/usb/host/xhci.h                            |   5 +
>  11 files changed, 165 insertions(+), 2 deletions(-)
>
> --
> 2.11.0
>
Tejas Joglekar July 24, 2020, 12:15 a.m. UTC | #6
Hello,
On 7/23/2020 4:05 PM, Jun Li wrote:
> Tejas Joglekar <Tejas.Joglekar@synopsys.com> 于2020年5月27日周三 下午7:54写道:
>>
>> The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
>> each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
>> for HS. The controller loads and updates the TRB cache from the
>> transfer ring in system memory whenever the driver issues a start
>> transfer or update transfer command.
>>
>> For chained TRBs, the Synopsys xHC requires that the total amount of
>> bytes for all TRBs loaded in the TRB cache be greater than or equal to
>> 1 MPS. Or the chain ends within the TRB cache (with a last TRB).
>>
>> If this requirement is not met, the controller will not be able to
>> send or receive a packet and it will hang causing a driver timeout and
>> error.
> 
> Hi Tejas Joglekar
> 
> I am debugging  a similar issue on Synipsys XHC, it's not the same case
> but I am wondering if it also linked to this HW limitation.
> 
> My Synopsys XHC based host enable UAS, when enumerates a UAS
> HDD, one BULK-IN EP with stream enabled will not generate event for
> trb(with stream ID 1) after a 16/4096 bytes(with stream ID 2) finished in
> previous trb.
> 
> If I change the last OK urb/trb's buffer length from 4096 to 512, the issue
> will gone.
> 
> following is the sequence of the question EP-IN:
> 
> <idle>-0     [000] d.h1   154.961710: xhci_urb_giveback: ep3in-bulk:
> urb ffff0001775f6f00 pipe 3221324672 slot 1 length 36/36 sgs 1/1
> stream 1 flags 00040200
> <idle>-0     [000] d.h1   154.962023: xhci_urb_giveback: ep3in-bulk:
> urb ffff000177d00400 pipe 3221324672 slot 1 length 96/96 sgs 1/1
> stream 1 flags 00040200
> <idle>-0     [000] d.h1   154.970395: xhci_urb_giveback: ep3in-bulk:
> urb ffff000177d00400 pipe 3221324672 slot 1 length 11/255 sgs 1/1
> stream 1 flags 00040200
> <idle>-0     [000] d.h1   154.970562: xhci_urb_giveback: ep3in-bulk:
> urb ffff000177d00400 pipe 3221324672 slot 1 length 20/255 sgs 1/1
> stream 1 flags 00040200
> <idle>-0     [000] d.h1   154.970786: xhci_urb_giveback: ep3in-bulk:
> urb ffff000177d00400 pipe 3221324672 slot 1 length 60/255 sgs 1/1
> stream 1 flags 00040200
> <idle>-0     [000] d.h1   155.851600: xhci_urb_giveback: ep3in-bulk:
> urb ffff000177d00200 pipe 3221324672 slot 1 length 16/4096 sgs 1/1
> stream 2 flags 00040200
> 
> /* then the next ep3-in trb will not generate event and stopped, so
> driver timeout in the end */
> kworker/u8:2-349   [003] d..3   155.851987: xhci_urb_enqueue:
> ep3in-bulk: urb ffff000170492400 pipe 3221324672 slot 1 length 0/32
> sgs 1/1 stream 1 flags 00040200
> kworker/u8:2-349   [003] d..4   155.851989: xhci_queue_trb: STREAM:
> Buffer 00000000c19cf000 length 32 TD size 0 intr 0 type 'Normal' flags
> b:i:I:c:s:I:e:c
> kworker/u8:2-349   [003] d..4   155.851991: xhci_inc_enq: STREAM
> ffff000177f86f80: enq 0x00000000be0eb060(0x00000000be0eb000) deq
> 0x00000000be0eb050(0x00000000be0eb000) segs 2 stream 1 free_trbs 508
> bounce 1024 cycle 1
> 
> Do you have any ideas?
> 
From initial observation of your issue it seems to be unrelated with the
changes in my patch. But could you please provide, USB analyzer trace logs
for the working and non-working case? Also it would be helpful if you can
provide device details which you are testing.

> thanks
> Li Jun

Thanks & Regards,
 Tejas Joglekar