
[1/4] usb: xhci: add Immediate Data Transfer support

Message ID 1556285012-28186-2-git-send-email-mathias.nyman@linux.intel.com (mailing list archive)
State New, archived
Series xhci features for usb-next

Commit Message

Mathias Nyman April 26, 2019, 1:23 p.m. UTC
From: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>

Immediate data transfers (IDT) allow the HCD to copy small chunks of
data (up to 8 bytes) directly into its output transfer TRBs. This avoids
the somewhat expensive DMA mappings that are performed by default on
most URB submissions.

When a URB is suitable for IDT, the data is copied directly into the
"Data Buffer Pointer" region of the TRB and the IDT flag is set. Instead
of triggering memory accesses, the HC uses the data directly.
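
As a rough illustration of the paragraph above, the sketch below packs a
small payload into the two 32-bit words that would otherwise carry the
buffer pointer. It is not the driver code: struct demo_trb,
demo_fill_idt_trb() and DEMO_IDT_MAX_SIZE are hypothetical names, the
length word is simplified, and the value is assembled byte by byte
(rather than with memcpy, as the patch does) so the result does not
depend on host byte order. Only the IDT bit position (1<<6) is taken
from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_IDT_MAX_SIZE 8u        /* hypothetical; mirrors the 8-byte limit */
#define DEMO_TRB_IDT      (1u << 6) /* same bit position as TRB_IDT in the patch */

/* Simplified stand-in for a transfer TRB: four 32-bit words. */
struct demo_trb {
	uint32_t field[4];
};

/*
 * Place up to 8 payload bytes where the buffer pointer would normally go
 * and set the IDT flag. Returns 0 on success, -1 if the payload is too big.
 */
static int demo_fill_idt_trb(struct demo_trb *trb, const void *data, size_t len)
{
	const unsigned char *p = data;
	uint64_t imm = 0; /* unused high bytes stay zero */
	size_t i;

	if (len > DEMO_IDT_MAX_SIZE)
		return -1;

	/* Byte 0 of the payload ends up in the lowest byte of the value. */
	for (i = 0; i < len; i++)
		imm |= (uint64_t)p[i] << (8 * i);

	trb->field[0] = (uint32_t)(imm & 0xffffffffu); /* low half of "pointer" */
	trb->field[1] = (uint32_t)(imm >> 32);         /* high half of "pointer" */
	trb->field[2] = (uint32_t)len;                 /* simplified length word */
	trb->field[3] = DEMO_TRB_IDT;
	return 0;
}
```

Calling demo_fill_idt_trb() with a 6-byte payload fills field[0] with
the first four bytes and the low half of field[1] with the remaining
two; a 9-byte payload is rejected.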

The implementation could cover all kinds of output endpoints. Yet
isochronous endpoints are bypassed, as I was unable to find one that
matched IDT's constraints. As we try to bypass the default DMA mappings
on URB buffers, we'd need to find an isochronous device with an
urb->transfer_buffer_length <= 8 bytes.

The implementation takes into account that the 8-byte buffers provided
by the URB will never cross a 64KB boundary.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 12 ++++++++++++
 drivers/usb/host/xhci.c      | 16 ++++++++++++++++
 drivers/usb/host/xhci.h      | 17 +++++++++++++++++
 3 files changed, 45 insertions(+)

Comments

Marek Szyprowski May 9, 2019, 10:32 a.m. UTC | #1
Dear All,

On 2019-04-26 15:23, Mathias Nyman wrote:
> From: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
>
> Immediate data transfers (IDT) allow the HCD to copy small chunks of
> data (up to 8bytes) directly into its output transfer TRBs. This avoids
> the somewhat expensive DMA mappings that are performed by default on
> most URBs submissions.
>
> In the case an URB was suitable for IDT. The data is directly copied
> into the "Data Buffer Pointer" region of the TRB and the IDT flag is
> set. Instead of triggering memory accesses the HC will use the data
> directly.
>
> The implementation could cover all kind of output endpoints. Yet
> Isochronous endpoints are bypassed as I was unable to find one that
> matched IDT's constraints. As we try to bypass the default DMA mappings
> on URB buffers we'd need to find a Isochronous device with an
> urb->transfer_buffer_length <= 8 bytes.
>
> The implementation takes into account that the 8 byte buffers provided
> by the URB will never cross a 64KB boundary.
>
> Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> Reviewed-by: Felipe Balbi <felipe.balbi@linux.intel.com>
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>

I've noticed that this patch causes a regression on various Samsung Exynos
5420/5422/5800 boards, which have USB 3.0 host ports provided by the
DWC3/XHCI hardware module. The regression can be observed with an ASIX USB
2.0 ethernet dongle, which stops working after applying this patch (the eth0
interface is registered, but no packets are transmitted/received). I can
provide more debugging information or do some tests, just let me know
what you need. Reverting this commit makes the ASIX USB ethernet dongle
operational again.

Here is some more information from one of my test systems:

# lsusb
Bus 006 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 005 Device 002: ID 0b95:772b ASIX Electronics Corp. AX88772B
Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 002: ID 2232:1056 Silicon Motion
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

# lsusb -t
/:  Bus 06.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
/:  Bus 05.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 480M
     |__ Port 1: Dev 2, If 0, Class=Vendor Specific Class, Driver=asix, 480M
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 480M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=exynos-ohci/3p, 12M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=exynos-ehci/3p, 480M
     |__ Port 1: Dev 2, If 0, Class=Video, Driver=, 480M
     |__ Port 1: Dev 2, If 1, Class=Video, Driver=, 480M

# dmesg | grep usb
[    1.484210] reg-fixed-voltage regulator-usb300: GPIO lookup for 
consumer (null)
[    1.484219] reg-fixed-voltage regulator-usb300: using device tree for 
GPIO lookup
[    1.484241] of_get_named_gpiod_flags: can't parse 'gpios' property of 
node '/regulator-usb300[0]'
[    1.484276] of_get_named_gpiod_flags: parsed 'gpio' property of node 
'/regulator-usb300[0]' - status (0)
[    1.484749] reg-fixed-voltage regulator-usb301: GPIO lookup for 
consumer (null)
[    1.484758] reg-fixed-voltage regulator-usb301: using device tree for 
GPIO lookup
[    1.484777] of_get_named_gpiod_flags: can't parse 'gpios' property of 
node '/regulator-usb301[0]'
[    1.484807] of_get_named_gpiod_flags: parsed 'gpio' property of node 
'/regulator-usb301[0]' - status (0)
[    1.490275] usbcore: registered new interface driver usbfs
[    1.495521] usbcore: registered new interface driver hub
[    1.500897] usbcore: registered new device driver usb
[    2.014966] samsung-usb2-phy 12130000.phy: 12130000.phy supply vbus 
not found, using dummy regulator
[    2.024093] exynos5_usb3drd_phy 12100000.phy: 12100000.phy supply 
vbus-boost not found, using dummy regulator
[    2.033232] exynos5_usb3drd_phy 12500000.phy: 12500000.phy supply 
vbus-boost not found, using dummy regulator
[    2.347306] usbcore: registered new interface driver r8152
[    2.352427] usbcore: registered new interface driver asix
[    2.357877] usbcore: registered new interface driver ax88179_178a
[    2.363860] usbcore: registered new interface driver cdc_ether
[    2.369730] usbcore: registered new interface driver smsc75xx
[    2.375421] usbcore: registered new interface driver smsc95xx
[    2.381198] usbcore: registered new interface driver net1080
[    2.386760] usbcore: registered new interface driver cdc_subset
[    2.392699] usbcore: registered new interface driver zaurus
[    2.398280] usbcore: registered new interface driver cdc_ncm
[    2.404280] exynos-dwc3 soc:usb3-0: soc:usb3-0 supply vdd33 not 
found, using dummy regulator
[    2.412485] exynos-dwc3 soc:usb3-0: soc:usb3-0 supply vdd10 not 
found, using dummy regulator
[    2.427570] exynos-dwc3 soc:usb3-1: soc:usb3-1 supply vdd33 not 
found, using dummy regulator
[    2.434748] exynos-dwc3 soc:usb3-1: soc:usb3-1 supply vdd10 not 
found, using dummy regulator
[    2.459866] of_get_named_gpiod_flags: can't parse 'samsung,vbus-gpio' 
property of node '/soc/usb@12110000[0]'
[    2.460177] exynos-ehci 12110000.usb: EHCI Host Controller
[    2.465329] exynos-ehci 12110000.usb: new USB bus registered, 
assigned bus number 1
[    2.473161] exynos-ehci 12110000.usb: irq 90, io mem 0x12110000
[    2.507027] exynos-ehci 12110000.usb: USB 2.0 started, EHCI 1.00
[    2.512197] usb usb1: New USB device found, idVendor=1d6b, 
idProduct=0002, bcdDevice= 5.01
[    2.519880] usb usb1: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[    2.527075] usb usb1: Product: EHCI Host Controller
[    2.531875] usb usb1: Manufacturer: Linux 
5.1.0-rc3-00114-g95e060e68bd9 ehci_hcd
[    2.539298] usb usb1: SerialNumber: 12110000.usb
[    2.562423] exynos-ohci 12120000.usb: USB Host Controller
[    2.567689] exynos-ohci 12120000.usb: new USB bus registered, 
assigned bus number 2
[    2.575502] exynos-ohci 12120000.usb: irq 90, io mem 0x12120000
[    2.651364] usb usb2: New USB device found, idVendor=1d6b, 
idProduct=0001, bcdDevice= 5.01
[    2.658219] usb usb2: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[    2.665376] usb usb2: Product: USB Host Controller
[    2.670180] usb usb2: Manufacturer: Linux 
5.1.0-rc3-00114-g95e060e68bd9 ohci_hcd
[    2.677560] usb usb2: SerialNumber: 12120000.usb
[    2.719349] usb usb3: New USB device found, idVendor=1d6b, 
idProduct=0002, bcdDevice= 5.01
[    2.726870] usb usb3: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[    2.734107] usb usb3: Product: xHCI Host Controller
[    2.738947] usb usb3: Manufacturer: Linux 
5.1.0-rc3-00114-g95e060e68bd9 xhci-hcd
[    2.746297] usb usb3: SerialNumber: xhci-hcd.3.auto
[    2.778642] usb usb4: We don't know the algorithms for LPM for this 
host, disabling LPM.
[    2.786800] usb usb4: New USB device found, idVendor=1d6b, 
idProduct=0003, bcdDevice= 5.01
[    2.794836] usb usb4: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[    2.802022] usb usb4: Product: xHCI Host Controller
[    2.806837] usb usb4: Manufacturer: Linux 
5.1.0-rc3-00114-g95e060e68bd9 xhci-hcd
[    2.814258] usb usb4: SerialNumber: xhci-hcd.3.auto
[    2.855879] usb usb5: New USB device found, idVendor=1d6b, 
idProduct=0002, bcdDevice= 5.01
[    2.863456] usb usb5: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[    2.870894] usb usb5: Product: xHCI Host Controller
[    2.875457] usb usb5: Manufacturer: Linux 
5.1.0-rc3-00114-g95e060e68bd9 xhci-hcd
[    2.882884] usb usb5: SerialNumber: xhci-hcd.4.auto
[    2.915149] usb usb6: We don't know the algorithms for LPM for this 
host, disabling LPM.
[    2.917137] usb 1-1: new high-speed USB device number 2 using exynos-ehci
[    2.923623] usb usb6: New USB device found, idVendor=1d6b, 
idProduct=0003, bcdDevice= 5.01
[    2.938433] usb usb6: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[    2.945302] usb usb6: Product: xHCI Host Controller
[    2.950196] usb usb6: Manufacturer: Linux 
5.1.0-rc3-00114-g95e060e68bd9 xhci-hcd
[    2.957579] usb usb6: SerialNumber: xhci-hcd.4.auto
[    2.970795] usbcore: registered new interface driver uas
[    2.975423] usbcore: registered new interface driver usb-storage
[    3.208523] usb 1-1: New USB device found, idVendor=2232, 
idProduct=1056, bcdDevice= 0.35
[    3.219999] usb 1-1: New USB device strings: Mfr=3, Product=1, 
SerialNumber=2
[    3.227034] usb 1-1: Product: WebCam SC-10HDM13631N
[    3.227041] usb 1-1: Manufacturer: Generic
[    3.235923] usb 1-1: SerialNumber: 200901010001
[    3.257013] usb 5-1: new high-speed USB device number 2 using xhci-hcd
[    3.459744] usb 5-1: New USB device found, idVendor=0b95, 
idProduct=772b, bcdDevice= 0.01
[    3.460560] usbcore: registered new interface driver usbhid
[    3.460572] usbhid: USB HID core driver
[    3.474443] usb 5-1: New USB device strings: Mfr=1, Product=2, 
SerialNumber=3
[    3.491163] usb 5-1: Product: AX88772B
[    3.500613] usb 5-1: Manufacturer: ASIX Elec. Corp.
[    3.509275] usb 5-1: SerialNumber: 1892F2
[    4.133358] asix 5-1:1.0 eth0: register 'asix' at 
usb-xhci-hcd.4.auto-1, ASIX AX88772B USB 2.0 Ethernet, 00:50:b6:18:92:f2

> ---
>   drivers/usb/host/xhci-ring.c | 12 ++++++++++++
>   drivers/usb/host/xhci.c      | 16 ++++++++++++++++
>   drivers/usb/host/xhci.h      | 17 +++++++++++++++++
>   3 files changed, 45 insertions(+)
>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 9215a28..2825031 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -3275,6 +3275,12 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
>   			field |= TRB_IOC;
>   			more_trbs_coming = false;
>   			td->last_trb = ring->enqueue;
> +
> +			if (xhci_urb_suitable_for_idt(urb)) {
> +				memcpy(&send_addr, urb->transfer_buffer,
> +				       trb_buff_len);
> +				field |= TRB_IDT;
> +			}
>   		}
>   
>   		/* Only set interrupt on short packet for IN endpoints */
> @@ -3414,6 +3420,12 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
>   	if (urb->transfer_buffer_length > 0) {
>   		u32 length_field, remainder;
>   
> +		if (xhci_urb_suitable_for_idt(urb)) {
> +			memcpy(&urb->transfer_dma, urb->transfer_buffer,
> +			       urb->transfer_buffer_length);
> +			field |= TRB_IDT;
> +		}
> +
>   		remainder = xhci_td_remainder(xhci, 0,
>   				urb->transfer_buffer_length,
>   				urb->transfer_buffer_length,
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 7fa58c9..255f93f 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -1238,6 +1238,21 @@ EXPORT_SYMBOL_GPL(xhci_resume);
>   
>   /*-------------------------------------------------------------------------*/
>   
> +/*
> + * Bypass the DMA mapping if URB is suitable for Immediate Transfer (IDT),
> + * we'll copy the actual data into the TRB address register. This is limited to
> + * transfers up to 8 bytes on output endpoints of any kind with wMaxPacketSize
> + * >= 8 bytes. If suitable for IDT only one Transfer TRB per TD is allowed.
> + */
> +static int xhci_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
> +				gfp_t mem_flags)
> +{
> +	if (xhci_urb_suitable_for_idt(urb))
> +		return 0;
> +
> +	return usb_hcd_map_urb_for_dma(hcd, urb, mem_flags);
> +}
> +
>   /**
>    * xhci_get_endpoint_index - Used for passing endpoint bitmasks between the core and
>    * HCDs.  Find the index for an endpoint given its descriptor.  Use the return
> @@ -5154,6 +5169,7 @@ static const struct hc_driver xhci_hc_driver = {
>   	/*
>   	 * managing i/o requests and associated device resources
>   	 */
> +	.map_urb_for_dma =      xhci_map_urb_for_dma,
>   	.urb_enqueue =		xhci_urb_enqueue,
>   	.urb_dequeue =		xhci_urb_dequeue,
>   	.alloc_dev =		xhci_alloc_dev,
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index 9334cde..abbd481 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -1303,6 +1303,8 @@ enum xhci_setup_dev {
>   #define TRB_IOC			(1<<5)
>   /* The buffer pointer contains immediate data */
>   #define TRB_IDT			(1<<6)
> +/* TDs smaller than this might use IDT */
> +#define TRB_IDT_MAX_SIZE	8
>   
>   /* Block Event Interrupt */
>   #define	TRB_BEI			(1<<9)
> @@ -2149,6 +2151,21 @@ static inline struct xhci_ring *xhci_urb_to_transfer_ring(struct xhci_hcd *xhci,
>   					urb->stream_id);
>   }
>   
> +/*
> + * TODO: As per spec Isochronous IDT transmissions are supported. We bypass
> + * them anyways as we where unable to find a device that matches the
> + * constraints.
> + */
> +static inline bool xhci_urb_suitable_for_idt(struct urb *urb)
> +{
> +	if (!usb_endpoint_xfer_isoc(&urb->ep->desc) && usb_urb_dir_out(urb) &&
> +	    usb_endpoint_maxp(&urb->ep->desc) >= TRB_IDT_MAX_SIZE &&
> +	    urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE)
> +		return true;
> +
> +	return false;
> +}
> +
>   static inline char *xhci_slot_state_string(u32 state)
>   {
>   	switch (state) {

Best regards
Mathias Nyman May 9, 2019, 11:40 a.m. UTC | #2
On 9.5.2019 13.32, Marek Szyprowski wrote:
> Dear All,
> 
> On 2019-04-26 15:23, Mathias Nyman wrote:
>> From: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
>>
>> Immediate data transfers (IDT) allow the HCD to copy small chunks of
>> data (up to 8bytes) directly into its output transfer TRBs. This avoids
>> the somewhat expensive DMA mappings that are performed by default on
>> most URBs submissions.
>>
>> In the case an URB was suitable for IDT. The data is directly copied
>> into the "Data Buffer Pointer" region of the TRB and the IDT flag is
>> set. Instead of triggering memory accesses the HC will use the data
>> directly.
>>
>> The implementation could cover all kind of output endpoints. Yet
>> Isochronous endpoints are bypassed as I was unable to find one that
>> matched IDT's constraints. As we try to bypass the default DMA mappings
>> on URB buffers we'd need to find a Isochronous device with an
>> urb->transfer_buffer_length <= 8 bytes.
>>
>> The implementation takes into account that the 8 byte buffers provided
>> by the URB will never cross a 64KB boundary.
>>
>> Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
>> Reviewed-by: Felipe Balbi <felipe.balbi@linux.intel.com>
>> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> 
> I've noticed that this patch causes regression on various Samsung Exynos
> 5420/5422/5800 boards, which have USB3.0 host ports provided by
> DWC3/XHCI hardware module. The regression can be observed with ASIX USB
> 2.0 ethernet dongle, which stops working after applying this patch (eth0
> interface is registered, but no packets are transmitted/received). I can
> provide more debugging information or do some tests, just let me know
> what do you need. Reverting this commit makes ASIX USB ethernet dongle
> operational again.
> 

Thanks for reporting.

Would it be possible to check if your ASIX ethernet dongle works on some
desktop/laptop setup with this same IDT patch?

Exynos xhci traces could also help; they would show the content of the TRBs using IDT.
Maybe the byte order gets messed up?

Take traces with:

mount -t debugfs none /sys/kernel/debug
echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable

<connect ASIX eth dongle, try to use it>

send /sys/kernel/debug/tracing/trace content to me

If we can't get this fixed, I'll revert the IDT patch.

Thanks
Mathias
Nicolas Saenz Julienne May 9, 2019, 11:51 a.m. UTC | #3
On Thu, 2019-05-09 at 14:40 +0300, Mathias Nyman wrote:
> On 9.5.2019 13.32, Marek Szyprowski wrote:
> > Dear All,
> > 
> > On 2019-04-26 15:23, Mathias Nyman wrote:
> > > From: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> > > 
> > > Immediate data transfers (IDT) allow the HCD to copy small chunks of
> > > data (up to 8bytes) directly into its output transfer TRBs. This avoids
> > > the somewhat expensive DMA mappings that are performed by default on
> > > most URBs submissions.
> > > 
> > > In the case an URB was suitable for IDT. The data is directly copied
> > > into the "Data Buffer Pointer" region of the TRB and the IDT flag is
> > > set. Instead of triggering memory accesses the HC will use the data
> > > directly.
> > > 
> > > The implementation could cover all kind of output endpoints. Yet
> > > Isochronous endpoints are bypassed as I was unable to find one that
> > > matched IDT's constraints. As we try to bypass the default DMA mappings
> > > on URB buffers we'd need to find a Isochronous device with an
> > > urb->transfer_buffer_length <= 8 bytes.
> > > 
> > > The implementation takes into account that the 8 byte buffers provided
> > > by the URB will never cross a 64KB boundary.
> > > 
> > > Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> > > Reviewed-by: Felipe Balbi <felipe.balbi@linux.intel.com>
> > > Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> > 
> > I've noticed that this patch causes regression on various Samsung Exynos
> > 5420/5422/5800 boards, which have USB3.0 host ports provided by
> > DWC3/XHCI hardware module. The regression can be observed with ASIX USB
> > 2.0 ethernet dongle, which stops working after applying this patch (eth0
> > interface is registered, but no packets are transmitted/received). I can
> > provide more debugging information or do some tests, just let me know
> > what do you need. Reverting this commit makes ASIX USB ethernet dongle
> > operational again.
> > 
> 
> Thanks for reporting.
> 
> Would it be possible to check if your ASIX ethernet dongle works on some
> desktop/laptop setup with this same IDT patch?
> 
> Also Exynos xhci traces could help, they would show the content of the TRBs
> using IDT.
> Maybe byte order gets messed up?
> 
> Take traces with:
> 
> mount -t debugfs none /sys/kernel/debug
> echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
> echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
> 
> <connect ASIX eth dongle, try to use it>
> 
> send /sys/kernel/debug/tracing/trace content to me
> 
> If we can't get this fixed I'll revert the IDT patch

Hi Mathias, thanks for your help.

I'll also be looking into it, so please send me the logs too.

Regards,
Nicolas
Mathias Nyman May 9, 2019, 3:10 p.m. UTC | #4
On 9.5.2019 14.51, Nicolas Saenz Julienne wrote:
> On Thu, 2019-05-09 at 14:40 +0300, Mathias Nyman wrote:
>> On 9.5.2019 13.32, Marek Szyprowski wrote:
>>> Dear All,
>>>
>>> On 2019-04-26 15:23, Mathias Nyman wrote:
>>>> From: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
>>>>
>>>> Immediate data transfers (IDT) allow the HCD to copy small chunks of
>>>> data (up to 8bytes) directly into its output transfer TRBs. This avoids
>>>> the somewhat expensive DMA mappings that are performed by default on
>>>> most URBs submissions.
>>>>
>>>> In the case an URB was suitable for IDT. The data is directly copied
>>>> into the "Data Buffer Pointer" region of the TRB and the IDT flag is
>>>> set. Instead of triggering memory accesses the HC will use the data
>>>> directly.
>>>>
>>>> The implementation could cover all kind of output endpoints. Yet
>>>> Isochronous endpoints are bypassed as I was unable to find one that
>>>> matched IDT's constraints. As we try to bypass the default DMA mappings
>>>> on URB buffers we'd need to find a Isochronous device with an
>>>> urb->transfer_buffer_length <= 8 bytes.
>>>>
>>>> The implementation takes into account that the 8 byte buffers provided
>>>> by the URB will never cross a 64KB boundary.
>>>>
>>>> Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
>>>> Reviewed-by: Felipe Balbi <felipe.balbi@linux.intel.com>
>>>> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
>>>
>>> I've noticed that this patch causes regression on various Samsung Exynos
>>> 5420/5422/5800 boards, which have USB3.0 host ports provided by
>>> DWC3/XHCI hardware module. The regression can be observed with ASIX USB
>>> 2.0 ethernet dongle, which stops working after applying this patch (eth0
>>> interface is registered, but no packets are transmitted/received). I can
>>> provide more debugging information or do some tests, just let me know
>>> what do you need. Reverting this commit makes ASIX USB ethernet dongle
>>> operational again.
>>>
>>
>> Thanks for reporting.
>>
>> Would it be possible to check if your ASIX ethernet dongle works on some
>> desktop/laptop setup with this same IDT patch?
>>
>> Also Exynos xhci traces could help, they would show the content of the TRBs
>> using IDT.
>> Maybe byte order gets messed up?
>>
>> Take traces with:
>>
>> mount -t debugfs none /sys/kernel/debug
>> echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
>> echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
>>
>> <connect ASIX eth dongle, try to use it>
>>
>> send /sys/kernel/debug/tracing/trace content to me
>>
>> If we can't get this fixed I'll revert the IDT patch
> 
> Hi Matthias, thanks for your help.
> 
> I'll also be looking into it, so please send me the logs too.
> 

Got the logs off list, thanks

The "Buffer" data in the Control transfer Data stage looks suspicious.

grep "flags I:" trace_fail  | grep Data
kworker/0:2-124   [000] d..1    63.092399: xhci_queue_trb: CTRL: Buffer 0000000018b65000 length 6 TD size 0 intr 0 type 'Data Stage' flags I:i:c:s:i:e:C
ifconfig-1429  [005] d..1    93.181231: xhci_queue_trb: CTRL: Buffer 0000000018b65000 length 6 TD size 0 intr 0 type 'Data Stage' flags I:i:c:s:i:e:C
ifconfig-1429  [007] dn.2    93.182050: xhci_queue_trb: CTRL: Buffer 0000000000000000 length 8 TD size 0 intr 0 type 'Data Stage' flags I:i:c:s:i:e:C
ifconfig-1429  [007] d..2    93.182499: xhci_queue_trb: CTRL: Buffer 0000000080000000 length 8 TD size 0 intr 0 type 'Data Stage' flags I:i:c:s:i:e:C
ifconfig-1429  [007] d..2    93.182736: xhci_queue_trb: CTRL: Buffer 0000000080000000 length 8 TD size 0 intr 0 type 'Data Stage' flags I:i:c:s:i:e:C
kworker/0:3-1409  [000] d..3    93.382630: xhci_queue_trb: CTRL: Buffer 0000000080000000 length 8 TD size 0 intr 0 type 'Data Stage' flags I:i:c:s:i:e:C

My first guess is that if the URB has URB_NO_TRANSFER_DMA_MAP set, the data
is already mapped and urb->transfer_dma is already set.
The IDT patch uses urb->transfer_dma as a temporary buffer and copies the
contents of urb->transfer_buffer there.
If the transfer buffer is already DMA mapped, urb->transfer_buffer can be garbage
(it shouldn't be, but it can be).
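
To make the suspected problem concrete, here is a small userspace
sketch. It is not kernel code: struct demo_urb and both helpers are
hypothetical, reduced stand-ins. The first helper follows the pattern of
the original patch and reuses the URB's DMA address field as scratch
space; the second keeps the immediate data in a local. Unlike the test
patch, the sketch zero-initializes the local so unused bytes are
deterministic. Both helpers assume transfer_buffer_length <= 8.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for struct urb. */
struct demo_urb {
	const void *transfer_buffer;     /* CPU-visible payload */
	uint64_t transfer_dma;           /* DMA address if the caller pre-mapped it */
	uint32_t transfer_buffer_length; /* must be <= 8 for these helpers */
};

/* Pattern of the original patch: the DMA address field is overwritten. */
static uint64_t demo_idt_clobbering(struct demo_urb *urb)
{
	memcpy(&urb->transfer_dma, urb->transfer_buffer,
	       urb->transfer_buffer_length);
	return urb->transfer_dma;
}

/* Pattern of the proposed fix: a local holds the immediate data. */
static uint64_t demo_idt_local(const struct demo_urb *urb)
{
	uint64_t addr = 0; /* zeroed here; an assumption of this sketch */

	memcpy(&addr, urb->transfer_buffer, urb->transfer_buffer_length);
	return addr;
}
```

With a pre-set transfer_dma, demo_idt_local() leaves the address
untouched, while demo_idt_clobbering() replaces it with the payload
bytes, so any later user of urb->transfer_dma sees the wrong value.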

The code below avoids IDT if URB_NO_TRANSFER_DMA_MAP is set, and doesn't touch
urb->transfer_dma (patch attached):

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index fed3385..f080054 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3423,11 +3423,14 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
  
         if (urb->transfer_buffer_length > 0) {
                 u32 length_field, remainder;
+               u64 addr;
  
                 if (xhci_urb_suitable_for_idt(urb)) {
-                       memcpy(&urb->transfer_dma, urb->transfer_buffer,
+                       memcpy(&addr, urb->transfer_buffer,
                                urb->transfer_buffer_length);
                         field |= TRB_IDT;
+               } else {
+                       addr = (u64) urb->transfer_dma;
                 }
  
                 remainder = xhci_td_remainder(xhci, 0,
@@ -3440,8 +3443,8 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
                 if (setup->bRequestType & USB_DIR_IN)
                         field |= TRB_DIR_IN;
                 queue_trb(xhci, ep_ring, true,
-                               lower_32_bits(urb->transfer_dma),
-                               upper_32_bits(urb->transfer_dma),
+                               lower_32_bits(addr),
+                               upper_32_bits(addr),
                                 length_field,
                                 field | ep_ring->cycle_state);
         }
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index a450a99..7f8b950 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -2160,7 +2160,8 @@ static inline bool xhci_urb_suitable_for_idt(struct urb *urb)
  {
         if (!usb_endpoint_xfer_isoc(&urb->ep->desc) && usb_urb_dir_out(urb) &&
             usb_endpoint_maxp(&urb->ep->desc) >= TRB_IDT_MAX_SIZE &&
-           urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE)
+           urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE &&
+           !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP))
                 return true;
  
         return false;

If that doesn't help, then it's possible that DATA TRBs in control transfers
can't use IDT at all. IDT is supported for Normal TRBs, which have a different
TRB type than DATA TRBs in control transfers.

Also, xHCI spec section 4.11.7 limits IDT usage:

"If the IDT flag is set in one TRB of a TD, then it shall be the only Transfer
  TRB of the TD"

A whole control transfer is one TD, and it already contains a SETUP transfer TRB
which is using the IDT flag.
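
The restriction above, combined with the pre-mapped buffer concern, can
be sketched as a standalone predicate. This is only a userspace model:
struct demo_request, enum demo_xfer_type and demo_suitable_for_idt() are
hypothetical names, and the premapped field merely models
URB_NO_TRANSFER_DMA_MAP.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_IDT_MAX_SIZE 8u /* hypothetical; mirrors TRB_IDT_MAX_SIZE */

enum demo_xfer_type { DEMO_CONTROL, DEMO_ISOC, DEMO_BULK, DEMO_INT };

/* Hypothetical summary of the URB properties the check looks at. */
struct demo_request {
	enum demo_xfer_type type;
	bool dir_out;    /* host-to-device transfer */
	bool premapped;  /* models URB_NO_TRANSFER_DMA_MAP */
	uint16_t maxp;   /* endpoint max packet size */
	uint32_t length; /* transfer buffer length */
};

static bool demo_suitable_for_idt(const struct demo_request *r)
{
	return r->type != DEMO_CONTROL &&       /* SETUP TRB already uses IDT */
	       r->type != DEMO_ISOC &&          /* bypassed, as in the patch */
	       r->dir_out &&                    /* immediate data is output only */
	       !r->premapped &&                 /* no benefit if already mapped */
	       r->maxp >= DEMO_IDT_MAX_SIZE &&
	       r->length <= DEMO_IDT_MAX_SIZE;
}
```

An 8-byte bulk OUT request passes the check; the same request on a
control endpoint, with a pre-mapped buffer, or with 9 bytes does not.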

The following disables IDT for control transfers (test patch attached as well):

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index fed3385..4c1c9ad 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3424,12 +3424,6 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
         if (urb->transfer_buffer_length > 0) {
                 u32 length_field, remainder;
  
-               if (xhci_urb_suitable_for_idt(urb)) {
-                       memcpy(&urb->transfer_dma, urb->transfer_buffer,
-                              urb->transfer_buffer_length);
-                       field |= TRB_IDT;
-               }
-
                 remainder = xhci_td_remainder(xhci, 0,
                                 urb->transfer_buffer_length,
                                 urb->transfer_buffer_length,
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index a450a99..2e16ff7 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -2158,9 +2158,11 @@ static inline struct xhci_ring *xhci_urb_to_transfer_ring(struct xhci_hcd *xhci,
   */
  static inline bool xhci_urb_suitable_for_idt(struct urb *urb)
  {
-       if (!usb_endpoint_xfer_isoc(&urb->ep->desc) && usb_urb_dir_out(urb) &&
+       if (!usb_endpoint_xfer_control(&urb->ep->desc) &&
+           !usb_endpoint_xfer_isoc(&urb->ep->desc) && usb_urb_dir_out(urb) &&
             usb_endpoint_maxp(&urb->ep->desc) >= TRB_IDT_MAX_SIZE &&
-           urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE)
+           urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE &&
+           !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP))
                 return true;
  
         return false;

-Mathias
From c92d0e83f24d9a8f401ef5c91ebc8263fd9d2a56 Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman@linux.intel.com>
Date: Thu, 9 May 2019 17:47:28 +0300
Subject: [TESTPATCH2] xhci: Don't use IDT transfers when not supported

The control transfer data stage can't support IDT, as xHCI can't have two
IDT flags in the same TD.
The SETUP stage is already using IDT;
see xhci 4.11.7 for details.

Also don't use IDT if the transfer buffer is already DMA mapped.
There is no benefit with IDT then.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 6 ------
 drivers/usb/host/xhci.h      | 6 ++++--
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index fed3385..4c1c9ad 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3424,12 +3424,6 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 	if (urb->transfer_buffer_length > 0) {
 		u32 length_field, remainder;
 
-		if (xhci_urb_suitable_for_idt(urb)) {
-			memcpy(&urb->transfer_dma, urb->transfer_buffer,
-			       urb->transfer_buffer_length);
-			field |= TRB_IDT;
-		}
-
 		remainder = xhci_td_remainder(xhci, 0,
 				urb->transfer_buffer_length,
 				urb->transfer_buffer_length,
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index a450a99..2e16ff7 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -2158,9 +2158,11 @@ static inline struct xhci_ring *xhci_urb_to_transfer_ring(struct xhci_hcd *xhci,
  */
 static inline bool xhci_urb_suitable_for_idt(struct urb *urb)
 {
-	if (!usb_endpoint_xfer_isoc(&urb->ep->desc) && usb_urb_dir_out(urb) &&
+	if (!usb_endpoint_xfer_control(&urb->ep->desc) &&
+	    !usb_endpoint_xfer_isoc(&urb->ep->desc) && usb_urb_dir_out(urb) &&
 	    usb_endpoint_maxp(&urb->ep->desc) >= TRB_IDT_MAX_SIZE &&
-	    urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE)
+	    urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE &&
+	    !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP))
 		return true;
 
 	return false;
Nicolas Saenz Julienne May 9, 2019, 3:38 p.m. UTC | #5
Hi Mathias, thanks for spending the time debugging this :)

On Thu, 2019-05-09 at 18:10 +0300, Mathias Nyman wrote:
> Got the logs off list, thanks
> 
> The "Buffer" data in Control transfer Data stage look suspicious.
> 
> grep "flags I:" trace_fail  | grep Data
> kworker/0:2-124   [000] d..1    63.092399: xhci_queue_trb: CTRL: Buffer
> 0000000018b65000 length 6 TD size 0 intr 0 type 'Data Stage' flags
> I:i:c:s:i:e:C
> ifconfig-1429  [005] d..1    93.181231: xhci_queue_trb: CTRL: Buffer
> 0000000018b65000 length 6 TD size 0 intr 0 type 'Data Stage' flags
> I:i:c:s:i:e:C
> ifconfig-1429  [007] dn.2    93.182050: xhci_queue_trb: CTRL: Buffer
> 0000000000000000 length 8 TD size 0 intr 0 type 'Data Stage' flags
> I:i:c:s:i:e:C
> ifconfig-1429  [007] d..2    93.182499: xhci_queue_trb: CTRL: Buffer
> 0000000080000000 length 8 TD size 0 intr 0 type 'Data Stage' flags
> I:i:c:s:i:e:C
> ifconfig-1429  [007] d..2    93.182736: xhci_queue_trb: CTRL: Buffer
> 0000000080000000 length 8 TD size 0 intr 0 type 'Data Stage' flags
> I:i:c:s:i:e:C
> kworker/0:3-1409  [000] d..3    93.382630: xhci_queue_trb: CTRL: Buffer
> 0000000080000000 length 8 TD size 0 intr 0 type 'Data Stage' flags
> I:i:c:s:i:e:C
> 
> First guess would be that in case URB has URB_NO_TRANSFER_DMA_MAP set then
> data
> will be mapped and urb->transfer_dma is already set.
> The IDT patch uses urb->transfer_dma as a temporary buffer, and copies the
> urb->transfer_buffer there.
> if transfer buffer is already dma mapped the urb->transfer_buffer can be
> garbage,
> (shouldn't, but it can be)
> 
> Below code avoids IDT if URB_NO_TRANSFER_DMA_MAP is set, and doesn't touch
> urb->transfer_dma (patch attached)
> 
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index fed3385..f080054 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -3423,11 +3423,14 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t
> mem_flags,
>   
>          if (urb->transfer_buffer_length > 0) {
>                  u32 length_field, remainder;
> +               u64 addr;
>   
>                  if (xhci_urb_suitable_for_idt(urb)) {
> -                       memcpy(&urb->transfer_dma, urb->transfer_buffer,
> +                       memcpy(&addr, urb->transfer_buffer,
>                                 urb->transfer_buffer_length);
>                          field |= TRB_IDT;
> +               } else {
> +                       addr = (u64) urb->transfer_dma;
>                  }
>   
>                  remainder = xhci_td_remainder(xhci, 0,
> @@ -3440,8 +3443,8 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t
> mem_flags,
>                  if (setup->bRequestType & USB_DIR_IN)
>                          field |= TRB_DIR_IN;
>                  queue_trb(xhci, ep_ring, true,
> -                               lower_32_bits(urb->transfer_dma),
> -                               upper_32_bits(urb->transfer_dma),
> +                               lower_32_bits(addr),
> +                               upper_32_bits(addr),
>                                  length_field,
>                                  field | ep_ring->cycle_state);
>          }
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index a450a99..7f8b950 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -2160,7 +2160,8 @@ static inline bool xhci_urb_suitable_for_idt(struct urb
> *urb)
>   {
>          if (!usb_endpoint_xfer_isoc(&urb->ep->desc) && usb_urb_dir_out(urb)
> &&
>              usb_endpoint_maxp(&urb->ep->desc) >= TRB_IDT_MAX_SIZE &&
> -           urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE)
> +           urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE &&
> +           !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP))
>                  return true;
>   
>          return false;
> 

This could be it, I broadly checked and assumed everyone was playing nice with
the transfer_dma pointer, but I guess I might have missed something.
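
To illustrate the hypothesis above, here is a minimal userspace sketch of
packing immediate data into a local 64-bit scratch value instead of into the
URB's DMA address field. The names idt_pack and struct trb_words are
hypothetical and only model the idea; zero-initialising the scratch value is
an assumption of this sketch, not something the proposed patch does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IDT_MAX_SIZE 8	/* mirrors TRB_IDT_MAX_SIZE from the patch */

/* Hypothetical stand-in for the two 32-bit buffer words of a TRB. */
struct trb_words {
	uint32_t lo;
	uint32_t hi;
};

/*
 * Copy up to 8 payload bytes into a local scratch value and split it into
 * the low and high TRB words. Nothing owned by the caller is modified, so a
 * DMA address that was already set up elsewhere stays intact.
 */
static struct trb_words idt_pack(const void *buf, size_t len)
{
	uint64_t addr = 0;	/* zeroed so unused bytes are deterministic */
	struct trb_words w;

	assert(len <= IDT_MAX_SIZE);
	memcpy(&addr, buf, len);

	w.lo = (uint32_t)(addr & 0xffffffffu);
	w.hi = (uint32_t)(addr >> 32);
	return w;
}
```

Recombining w.hi and w.lo into one 64-bit value gives back the same bytes in
memory order, whatever the host byte order is.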

> If that doesn't help, then it's possible DATA trbs in control transfer can't
> use IDT at all. IDT is supported for Normal TRBs, which have a different trb
> type than DATA trbs in control transfers.
> 
> Also xhci specs 4.11.7 limit IDT usage:
> 
> "If the IDT flag is set in one TRB of a TD, then it shall be the only Transfer
>   TRB of the TD"
> 
> A whole control transfer is one TD, and it already contains a SETUP transfer
> TRB
> which is using the IDT flag.
> 
> Following disables IDT for control transfers (testpatch attached as well)

I'm not so sure about this one, as the standard defines a control transfer as
a 2 or 3 TD operation (see 4.11.2.2):

"The Control Transfer Ring may contain Setup Stage and Status Stage TDs, and
optionally Data Stage TDs."

Also, for what it's worth, I spent some time testing that specific case on my
Intel laptop while preparing the patch.

Regards,
Nicolas
Mathias Nyman May 10, 2019, 6:11 a.m. UTC | #6
On 9.5.2019 18.38, Nicolas Saenz Julienne wrote:
> Hi Mathias, thanks for spending the time debugging this :)
> 
> On Thu, 2019-05-09 at 18:10 +0300, Mathias Nyman wrote:
>> Got the logs off list, thanks
>>
>> The "Buffer" data in Control transfer Data stage look suspicious.
>>
>> First guess would be that in case URB has URB_NO_TRANSFER_DMA_MAP set then
>> data
>> will be mapped and urb->transfer_dma is already set.
>> The IDT patch uses urb->transfer_dma as a temporary buffer, and copies the
>> urb->transfer_buffer there.
>> if transfer buffer is already dma mapped the urb->transfer_buffer can be
>> garbage,
>> (shouldn't, but it can be)
> 
> This could be it, I broadly checked and assumed everyone was playing nice with
> the transfer_dma pointer, but I guess I might have missed something.
> 
>> If that doesn't help, then it's possible DATA trbs in control transfer can't
>> use IDT at all. IDT is supported for Normal TRBs, which have a different trb
>> type than DATA trbs in control transfers.
>>
>> Also xhci specs 4.11.7 limit IDT usage:
>>
>> "If the IDT flag is set in one TRB of a TD, then it shall be the only Transfer
>>    TRB of the TD"
>>
>> A whole control transfer is one TD, and it already contains a SETUP transfer
>> TRB
>> which is using the IDT flag.
> 
> This one I'm not so sure as the standard defines a control transfer as a 2 or 3
> TD operation (see 4.11.2.2):
> 
> "The Control Transfer Ring may contain Setup Stage and Status Stage TDs, and
> optionally Data Stage TDs."

True, the xhci driver treats a control transfer as one TD, but the TRBs aren't
chained, so from the hw point of view they are separate TDs.
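
A small userspace sketch of that point, assuming the chain bit is bit 4 of the
TRB control word (TRB_CHAIN in xhci.h, to my understanding) and that a TD ends
at the first TRB with the chain bit clear. count_tds is a hypothetical helper
and not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRB_CHAIN	(1u << 4)	/* assumed to match xhci.h */
#define TRB_IDT		(1u << 6)	/* as in the patch */

/*
 * Count the TDs the hardware would see in a run of TRB control words.
 * A TD is closed by the first TRB whose chain bit is clear; a trailing run
 * of chained TRBs without a closing TRB is not counted.
 */
static size_t count_tds(const uint32_t *ctrl, size_t n)
{
	size_t tds = 0;
	size_t i;

	assert(ctrl != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!(ctrl[i] & TRB_CHAIN))
			tds++;
	}
	return tds;
}
```

With three unchained TRBs (setup, data, status) this returns 3, so an IDT
flag on the setup TRB and another on the data TRB would each sit in a TD of
their own.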

> 
> Also, for what is worth, I spent some time testing that specific case on my
> intel laptop while preparing the patch.

And a closer look at the spec shows that both Normal and Data Stage TRBs
support IDT. So this is not likely the cause.

-Mathias
Marek Szyprowski May 10, 2019, 6:28 a.m. UTC | #7
Hi Mathias,

On 2019-05-09 17:10, Mathias Nyman wrote:
> On 9.5.2019 14.51, Nicolas Saenz Julienne wrote:
>> On Thu, 2019-05-09 at 14:40 +0300, Mathias Nyman wrote:
>>> On 9.5.2019 13.32, Marek Szyprowski wrote:
>>>> Dear All,
>>>>
>>>> On 2019-04-26 15:23, Mathias Nyman wrote:
>>>>> From: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
>>>>>
>>>>> Immediate data transfers (IDT) allow the HCD to copy small chunks of
>>>>> data (up to 8bytes) directly into its output transfer TRBs. This 
>>>>> avoids
>>>>> the somewhat expensive DMA mappings that are performed by default on
>>>>> most URBs submissions.
>>>>>
>>>>> In the case an URB was suitable for IDT. The data is directly copied
>>>>> into the "Data Buffer Pointer" region of the TRB and the IDT flag is
>>>>> set. Instead of triggering memory accesses the HC will use the data
>>>>> directly.
>>>>>
>>>>> The implementation could cover all kind of output endpoints. Yet
>>>>> Isochronous endpoints are bypassed as I was unable to find one that
>>>>> matched IDT's constraints. As we try to bypass the default DMA 
>>>>> mappings
>>>>> on URB buffers we'd need to find a Isochronous device with an
>>>>> urb->transfer_buffer_length <= 8 bytes.
>>>>>
>>>>> The implementation takes into account that the 8 byte buffers 
>>>>> provided
>>>>> by the URB will never cross a 64KB boundary.
>>>>>
>>>>> Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
>>>>> Reviewed-by: Felipe Balbi <felipe.balbi@linux.intel.com>
>>>>> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
>>>>
>>>> I've noticed that this patch causes regression on various Samsung 
>>>> Exynos
>>>> 5420/5422/5800 boards, which have USB3.0 host ports provided by
>>>> DWC3/XHCI hardware module. The regression can be observed with ASIX 
>>>> USB
>>>> 2.0 ethernet dongle, which stops working after applying this patch 
>>>> (eth0
>>>> interface is registered, but no packets are transmitted/received). 
>>>> I can
>>>> provide more debugging information or do some tests, just let me know
>>>> what do you need. Reverting this commit makes ASIX USB ethernet dongle
>>>> operational again.
>>>>
>>>
>>> Thanks for reporting.
>>>
>>> Would it be possible to check if your ASIX ethernet dongle works on 
>>> some
>>> desktop/laptop setup with this same IDT patch?
>>>
>>> Also Exynos xhci traces could help, they would show the content of 
>>> the TRBs
>>> using IDT.
>>> Maybe byte order gets messed up?
>>>
>>> Take traces with:
>>>
>>> mount -t debugfs none /sys/kernel/debug
>>> echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
>>> echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
>>>
>>> <connect ASIX eth dongle, try to use it>
>>>
>>> send /sys/kernel/debug/tracing/trace content to me
>>>
>>> If we can't get this fixed I'll revert the IDT patch
>>
>> Hi Matthias, thanks for your help.
>>
>> I'll also be looking into it, so please send me the logs too.
>>
>
> Got the logs off list, thanks
>
> The "Buffer" data in Control transfer Data stage look suspicious.
>
> grep "flags I:" trace_fail  | grep Data
> kworker/0:2-124   [000] d..1    63.092399: xhci_queue_trb: CTRL: 
> Buffer 0000000018b65000 length 6 TD size 0 intr 0 type 'Data Stage' 
> flags I:i:c:s:i:e:C
> ifconfig-1429  [005] d..1    93.181231: xhci_queue_trb: CTRL: Buffer 
> 0000000018b65000 length 6 TD size 0 intr 0 type 'Data Stage' flags 
> I:i:c:s:i:e:C
> ifconfig-1429  [007] dn.2    93.182050: xhci_queue_trb: CTRL: Buffer 
> 0000000000000000 length 8 TD size 0 intr 0 type 'Data Stage' flags 
> I:i:c:s:i:e:C
> ifconfig-1429  [007] d..2    93.182499: xhci_queue_trb: CTRL: Buffer 
> 0000000080000000 length 8 TD size 0 intr 0 type 'Data Stage' flags 
> I:i:c:s:i:e:C
> ifconfig-1429  [007] d..2    93.182736: xhci_queue_trb: CTRL: Buffer 
> 0000000080000000 length 8 TD size 0 intr 0 type 'Data Stage' flags 
> I:i:c:s:i:e:C
> kworker/0:3-1409  [000] d..3    93.382630: xhci_queue_trb: CTRL: 
> Buffer 0000000080000000 length 8 TD size 0 intr 0 type 'Data Stage' 
> flags I:i:c:s:i:e:C
>
> First guess would be that in case URB has URB_NO_TRANSFER_DMA_MAP set 
> then data
> will be mapped and urb->transfer_dma is already set.
> The IDT patch uses urb->transfer_dma as a temporary buffer, and copies the
> urb->transfer_buffer there.
> if transfer buffer is already dma mapped the urb->transfer_buffer can 
> be garbage,
> (shouldn't, but it can be)
>
> Below code avoids IDT if URB_NO_TRANSFER_DMA_MAP is set, and doesn't 
> touch
> urb->transfer_dma (patch attached)
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index fed3385..f080054 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -3423,11 +3423,14 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, 
> gfp_t mem_flags,
>
>         if (urb->transfer_buffer_length > 0) {
>                 u32 length_field, remainder;
> +               u64 addr;
>
>                 if (xhci_urb_suitable_for_idt(urb)) {
> -                       memcpy(&urb->transfer_dma, urb->transfer_buffer,
> +                       memcpy(&addr, urb->transfer_buffer,
>                                urb->transfer_buffer_length);
>                         field |= TRB_IDT;
> +               } else {
> +                       addr = (u64) urb->transfer_dma;
>                 }
>
>                 remainder = xhci_td_remainder(xhci, 0,
> @@ -3440,8 +3443,8 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, 
> gfp_t mem_flags,
>                 if (setup->bRequestType & USB_DIR_IN)
>                         field |= TRB_DIR_IN;
>                 queue_trb(xhci, ep_ring, true,
> -                               lower_32_bits(urb->transfer_dma),
> -                               upper_32_bits(urb->transfer_dma),
> +                               lower_32_bits(addr),
> +                               upper_32_bits(addr),
>                                 length_field,
>                                 field | ep_ring->cycle_state);
>         }
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index a450a99..7f8b950 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -2160,7 +2160,8 @@ static inline bool 
> xhci_urb_suitable_for_idt(struct urb *urb)
>  {
>         if (!usb_endpoint_xfer_isoc(&urb->ep->desc) && 
> usb_urb_dir_out(urb) &&
>             usb_endpoint_maxp(&urb->ep->desc) >= TRB_IDT_MAX_SIZE &&
> -           urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE)
> +           urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE &&
> +           !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP))
>                 return true;
>
>         return false;
>
> If that doesn't help, then it's possible DATA trbs in control
> transfer can't
> use IDT at all. IDT is supported for Normal TRBs, which have a 
> different trb
> type than DATA trbs in control transfers.
>
> Also xhci specs 4.11.7 limit IDT usage:
>
> "If the IDT flag is set in one TRB of a TD, then it shall be the only 
> Transfer
>  TRB of the TD"
>
> A whole control transfer is one TD, and it already contains a SETUP 
> transfer TRB
> which is using the IDT flag.
>
> Following disables IDT for control transfers (testpatch attached as well)
>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index fed3385..4c1c9ad 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -3424,12 +3424,6 @@ int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, 
> gfp_t mem_flags,
>         if (urb->transfer_buffer_length > 0) {
>                 u32 length_field, remainder;
>
> -               if (xhci_urb_suitable_for_idt(urb)) {
> -                       memcpy(&urb->transfer_dma, urb->transfer_buffer,
> -                              urb->transfer_buffer_length);
> -                       field |= TRB_IDT;
> -               }
> -
>                 remainder = xhci_td_remainder(xhci, 0,
>                                 urb->transfer_buffer_length,
>                                 urb->transfer_buffer_length,
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index a450a99..2e16ff7 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -2158,9 +2158,11 @@ static inline struct xhci_ring 
> *xhci_urb_to_transfer_ring(struct xhci_hcd *xhci,
>   */
>  static inline bool xhci_urb_suitable_for_idt(struct urb *urb)
>  {
> -       if (!usb_endpoint_xfer_isoc(&urb->ep->desc) && 
> usb_urb_dir_out(urb) &&
> +       if (!usb_endpoint_xfer_control(&urb->ep->desc) &&
> +           !usb_endpoint_xfer_isoc(&urb->ep->desc) && 
> usb_urb_dir_out(urb) &&
>             usb_endpoint_maxp(&urb->ep->desc) >= TRB_IDT_MAX_SIZE &&
> -           urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE)
> +           urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE &&
> +           !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP))
>                 return true;
>
>         return false;
>
> -Mathias


Thanks for the patches to test! Each patch applied separately (without
the other one) fixes the issue with the ASIX USB dongle, but from the
discussion I assume that the first one
(0001-xhci-don-t-use-IDT-transfer-buffer-is-already-dma-ma.patch) really
fixes the issue, while the second one is just a workaround.
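
For reference, the condition that the first patch ends up checking can be
modelled in userspace roughly as follows. struct urb_model and its fields are
hypothetical stand-ins for the endpoint descriptor and URB accessors used in
xhci_urb_suitable_for_idt(); this is a sketch of the logic, not the driver
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IDT_MAX_SIZE 8	/* mirrors TRB_IDT_MAX_SIZE from the patch */

/* Hypothetical flattened view of the fields the check reads. */
struct urb_model {
	bool is_isoc;		/* usb_endpoint_xfer_isoc() */
	bool dir_out;		/* usb_urb_dir_out() */
	unsigned int maxp;	/* usb_endpoint_maxp() */
	unsigned int len;	/* urb->transfer_buffer_length */
	bool premapped;		/* URB_NO_TRANSFER_DMA_MAP is set */
};

/*
 * IDT is only considered for small OUT transfers on non-isochronous
 * endpoints, and never when the submitter already DMA-mapped the buffer.
 */
static bool suitable_for_idt(const struct urb_model *u)
{
	assert(u != NULL);

	return !u->is_isoc && u->dir_out &&
	       u->maxp >= IDT_MAX_SIZE &&
	       u->len <= IDT_MAX_SIZE &&
	       !u->premapped;
}
```

An 8-byte OUT transfer on a 64-byte endpoint qualifies under this model; the
same transfer with the pre-mapped flag set does not and would take the normal
DMA path.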

You can add:

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>


Best regards
Mathias Nyman May 10, 2019, 7:30 a.m. UTC | #8
On 10.5.2019 9.28, Marek Szyprowski wrote:
> Hi Mathias,
> 
> Thanks for the patches to test! Both patches applied separately (without
> the other one) fixes the issue with ASIX USB dongle, but from the
> discussion I assume that the first one
> (0001-xhci-don-t-use-IDT-transfer-buffer-is-already-dma-ma.patch) really
> fixes the issue, while the second one is just a workaround.
> 
> You can add:
> 
> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
> 
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> 

Great, thanks, I'll send the first patch forward after the merge window.

Mathias

Patch

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 9215a28..2825031 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3275,6 +3275,12 @@  int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 			field |= TRB_IOC;
 			more_trbs_coming = false;
 			td->last_trb = ring->enqueue;
+
+			if (xhci_urb_suitable_for_idt(urb)) {
+				memcpy(&send_addr, urb->transfer_buffer,
+				       trb_buff_len);
+				field |= TRB_IDT;
+			}
 		}
 
 		/* Only set interrupt on short packet for IN endpoints */
@@ -3414,6 +3420,12 @@  int xhci_queue_ctrl_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 	if (urb->transfer_buffer_length > 0) {
 		u32 length_field, remainder;
 
+		if (xhci_urb_suitable_for_idt(urb)) {
+			memcpy(&urb->transfer_dma, urb->transfer_buffer,
+			       urb->transfer_buffer_length);
+			field |= TRB_IDT;
+		}
+
 		remainder = xhci_td_remainder(xhci, 0,
 				urb->transfer_buffer_length,
 				urb->transfer_buffer_length,
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 7fa58c9..255f93f 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1238,6 +1238,21 @@  EXPORT_SYMBOL_GPL(xhci_resume);
 
 /*-------------------------------------------------------------------------*/
 
+/*
+ * Bypass the DMA mapping if URB is suitable for Immediate Transfer (IDT),
+ * we'll copy the actual data into the TRB address register. This is limited to
+ * transfers up to 8 bytes on output endpoints of any kind with wMaxPacketSize
+ * >= 8 bytes. If suitable for IDT only one Transfer TRB per TD is allowed.
+ */
+static int xhci_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
+				gfp_t mem_flags)
+{
+	if (xhci_urb_suitable_for_idt(urb))
+		return 0;
+
+	return usb_hcd_map_urb_for_dma(hcd, urb, mem_flags);
+}
+
 /**
  * xhci_get_endpoint_index - Used for passing endpoint bitmasks between the core and
  * HCDs.  Find the index for an endpoint given its descriptor.  Use the return
@@ -5154,6 +5169,7 @@  static const struct hc_driver xhci_hc_driver = {
 	/*
 	 * managing i/o requests and associated device resources
 	 */
+	.map_urb_for_dma =      xhci_map_urb_for_dma,
 	.urb_enqueue =		xhci_urb_enqueue,
 	.urb_dequeue =		xhci_urb_dequeue,
 	.alloc_dev =		xhci_alloc_dev,
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 9334cde..abbd481 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1303,6 +1303,8 @@  enum xhci_setup_dev {
 #define TRB_IOC			(1<<5)
 /* The buffer pointer contains immediate data */
 #define TRB_IDT			(1<<6)
+/* TDs smaller than this might use IDT */
+#define TRB_IDT_MAX_SIZE	8
 
 /* Block Event Interrupt */
 #define	TRB_BEI			(1<<9)
@@ -2149,6 +2151,21 @@  static inline struct xhci_ring *xhci_urb_to_transfer_ring(struct xhci_hcd *xhci,
 					urb->stream_id);
 }
 
+/*
+ * TODO: As per spec Isochronous IDT transmissions are supported. We bypass
+ * them anyways as we were unable to find a device that matches the
+ * constraints.
+ */
+static inline bool xhci_urb_suitable_for_idt(struct urb *urb)
+{
+	if (!usb_endpoint_xfer_isoc(&urb->ep->desc) && usb_urb_dir_out(urb) &&
+	    usb_endpoint_maxp(&urb->ep->desc) >= TRB_IDT_MAX_SIZE &&
+	    urb->transfer_buffer_length <= TRB_IDT_MAX_SIZE)
+		return true;
+
+	return false;
+}
+
 static inline char *xhci_slot_state_string(u32 state)
 {
 	switch (state) {