diff mbox series

dmaengine: ti: omap-dma: Configure LCH_TYPE for OMAP1

Message ID 20181119104040.12885-1-peter.ujfalusi@ti.com (mailing list archive)
State New, archived
Headers show
Series dmaengine: ti: omap-dma: Configure LCH_TYPE for OMAP1 | expand

Commit Message

Peter Ujfalusi Nov. 19, 2018, 10:40 a.m. UTC
When the channel is configured for slave operation the LCH_TYPE needs to be
set to LCh-P. For memcpy channels the LCH_TYPE must be set to LCh-2D.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
---
 drivers/dma/ti/omap-dma.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Aaro Koskinen Nov. 19, 2018, 6:46 p.m. UTC | #1
Hi,

On Mon, Nov 19, 2018 at 12:40:40PM +0200, Peter Ujfalusi wrote:
> When the channel is configured for slave operation the LCH_TYPE needs to be
> set to LCh-P. For memcpy channels the LCH_TYPE must be set to LCh-2D.
> 
> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>

I don't have the documentation, but based on what omap_udc driver (still
using the legacy OMAP DMA API) does this seems to be correct.

I tested the patch on Nokia 770 with MMC and couldn't see any negative
impact.

Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>

A.

> ---
>  drivers/dma/ti/omap-dma.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/dma/ti/omap-dma.c b/drivers/dma/ti/omap-dma.c
> index a4a931ddf6f6..a18cfd497f04 100644
> --- a/drivers/dma/ti/omap-dma.c
> +++ b/drivers/dma/ti/omap-dma.c
> @@ -185,6 +185,10 @@ enum {
>  
>  	CLNK_CTRL_ENABLE_LNK	= BIT(15),
>  
> +	/* OMAP1 only */
> +	LCH_CTRL_LCH_2D		= 0,
> +	LCH_CTRL_LCH_P		= 2,
> +
>  	CDP_DST_VALID_INC	= 0 << 0,
>  	CDP_DST_VALID_RELOAD	= 1 << 0,
>  	CDP_DST_VALID_REUSE	= 2 << 0,
> @@ -529,6 +533,7 @@ static void omap_dma_start_sg(struct omap_chan *c, struct omap_desc *d)
>  
>  static void omap_dma_start_desc(struct omap_chan *c)
>  {
> +	struct omap_dmadev *od = to_omap_dma_dev(c->vc.chan.device);
>  	struct virt_dma_desc *vd = vchan_next_desc(&c->vc);
>  	struct omap_desc *d;
>  	unsigned cxsa, cxei, cxfi;
> @@ -570,6 +575,12 @@ static void omap_dma_start_desc(struct omap_chan *c)
>  	omap_dma_chan_write(c, CSDP, d->csdp);
>  	omap_dma_chan_write(c, CLNK_CTRL, d->clnk_ctrl);
>  
> +	if (dma_omap1() && !__dma_omap15xx(od->plat->dma_attr)) {
> +		if (is_slave_direction(d->dir))
> +			omap_dma_chan_write(c, LCH_CTRL, LCH_CTRL_LCH_P);
> +		else
> +			omap_dma_chan_write(c, LCH_CTRL, LCH_CTRL_LCH_2D);
> +	}
>  	omap_dma_start_sg(c, d);
>  }
>  
> -- 
> Peter
> 
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
>
Peter Ujfalusi Nov. 20, 2018, 7:28 a.m. UTC | #2
Aaro,

On 19/11/2018 20.46, Aaro Koskinen wrote:
> Hi,
> 
> On Mon, Nov 19, 2018 at 12:40:40PM +0200, Peter Ujfalusi wrote:
>> When the channel is configured for slave operation the LCH_TYPE needs to be
>> set to LCh-P. For memcpy channels the LCH_TYPE must be set to LCh-2D.
>>
>> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
> 
> I don't have the documentation, but based on what omap_udc driver (still
> using the legacy OMAP DMA API) does this seems to be correct.

They are hard to fine, true. From the omap1710 TRM:

Logical channel types (LCh types) supported are:
- LCh-2D for nonsynchronized transfers (memory transfers, 1D and 2D)
- LCh-P for synchronized transfers (mostly peripheral transfers)
- LCh-PD similar to LCh-P but runs on a dedicated physical channel
- LCh-G for graphical transfers/operations
- LCh-D for display transfers

Looking at other part it looks like hat LCH-2D channel mode can happily
service a peripheral. LCH-P supports the same features as LCH-2D, but it
lacks support for Single/Double-indexed addressing mode on the
peripheral port side.

So, this patch might not be needed at all. Can you test the omap_udc
with s/OMAP_DMA_LCH_P/OMAP_DMA_LCH_2D

If USB works, then we can just drop this patch.

Note: if we ever need the port_window support in OMAP1 then we need
double indexing on the peripheral side.

> I tested the patch on Nokia 770 with MMC and couldn't see any negative
> impact.
> 
> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
> 
> A.
> 
>> ---
>>  drivers/dma/ti/omap-dma.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/drivers/dma/ti/omap-dma.c b/drivers/dma/ti/omap-dma.c
>> index a4a931ddf6f6..a18cfd497f04 100644
>> --- a/drivers/dma/ti/omap-dma.c
>> +++ b/drivers/dma/ti/omap-dma.c
>> @@ -185,6 +185,10 @@ enum {
>>  
>>  	CLNK_CTRL_ENABLE_LNK	= BIT(15),
>>  
>> +	/* OMAP1 only */
>> +	LCH_CTRL_LCH_2D		= 0,
>> +	LCH_CTRL_LCH_P		= 2,
>> +
>>  	CDP_DST_VALID_INC	= 0 << 0,
>>  	CDP_DST_VALID_RELOAD	= 1 << 0,
>>  	CDP_DST_VALID_REUSE	= 2 << 0,
>> @@ -529,6 +533,7 @@ static void omap_dma_start_sg(struct omap_chan *c, struct omap_desc *d)
>>  
>>  static void omap_dma_start_desc(struct omap_chan *c)
>>  {
>> +	struct omap_dmadev *od = to_omap_dma_dev(c->vc.chan.device);
>>  	struct virt_dma_desc *vd = vchan_next_desc(&c->vc);
>>  	struct omap_desc *d;
>>  	unsigned cxsa, cxei, cxfi;
>> @@ -570,6 +575,12 @@ static void omap_dma_start_desc(struct omap_chan *c)
>>  	omap_dma_chan_write(c, CSDP, d->csdp);
>>  	omap_dma_chan_write(c, CLNK_CTRL, d->clnk_ctrl);
>>  
>> +	if (dma_omap1() && !__dma_omap15xx(od->plat->dma_attr)) {
>> +		if (is_slave_direction(d->dir))
>> +			omap_dma_chan_write(c, LCH_CTRL, LCH_CTRL_LCH_P);
>> +		else
>> +			omap_dma_chan_write(c, LCH_CTRL, LCH_CTRL_LCH_2D);
>> +	}
>>  	omap_dma_start_sg(c, d);
>>  }
>>  
>> -- 
>> Peter
>>
>> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
>> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
>>

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Aaro Koskinen Nov. 20, 2018, 9:04 p.m. UTC | #3
Hi,

On Tue, Nov 20, 2018 at 09:28:37AM +0200, Peter Ujfalusi wrote:
> On 19/11/2018 20.46, Aaro Koskinen wrote:
> > On Mon, Nov 19, 2018 at 12:40:40PM +0200, Peter Ujfalusi wrote:
> >> When the channel is configured for slave operation the LCH_TYPE needs to be
> >> set to LCh-P. For memcpy channels the LCH_TYPE must be set to LCh-2D.
> >>
> >> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
> > 
> > I don't have the documentation, but based on what omap_udc driver (still
> > using the legacy OMAP DMA API) does this seems to be correct.
> 
> They are hard to fine, true. From the omap1710 TRM:
> 
> Logical channel types (LCh types) supported are:
> - LCh-2D for nonsynchronized transfers (memory transfers, 1D and 2D)
> - LCh-P for synchronized transfers (mostly peripheral transfers)
> - LCh-PD similar to LCh-P but runs on a dedicated physical channel
> - LCh-G for graphical transfers/operations
> - LCh-D for display transfers

(I found a public document "OMAP5912 Multimedia Processor Direct
Memory Access (DMA) Support Reference Guide", documenting these; easy
to confuse with "OMAP5910 Dual-Core Processor System DMA Controller
Reference Guide".)

> Looking at other part it looks like hat LCH-2D channel mode can happily
> service a peripheral. LCH-P supports the same features as LCH-2D, but it
> lacks support for Single/Double-indexed addressing mode on the
> peripheral port side.
> 
> So, this patch might not be needed at all. Can you test the omap_udc
> with s/OMAP_DMA_LCH_P/OMAP_DMA_LCH_2D
> 
> If USB works, then we can just drop this patch.

Unfortunately omap_udc does not seem to work at all anymore with DMA on
my 770 setup. :-(

I had switched to PIO mode in 2015 since the WARNs about legacy DMA
API were too annoying and flooding the console. And now that I tried
using DMA again with g_ether, it doesn't work anymore. The device get's
recognized on host side, but no traffic goes through. Switching back to
PIO makes it to work again.

A.
Peter Ujfalusi Nov. 22, 2018, 8:31 a.m. UTC | #4
On 20/11/2018 23.04, Aaro Koskinen wrote:
> Hi,
> 
> On Tue, Nov 20, 2018 at 09:28:37AM +0200, Peter Ujfalusi wrote:
>> On 19/11/2018 20.46, Aaro Koskinen wrote:
>>> On Mon, Nov 19, 2018 at 12:40:40PM +0200, Peter Ujfalusi wrote:
>>>> When the channel is configured for slave operation the LCH_TYPE needs to be
>>>> set to LCh-P. For memcpy channels the LCH_TYPE must be set to LCh-2D.
>>>>
>>>> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
>>>
>>> I don't have the documentation, but based on what omap_udc driver (still
>>> using the legacy OMAP DMA API) does this seems to be correct.
>>
>> They are hard to fine, true. From the omap1710 TRM:
>>
>> Logical channel types (LCh types) supported are:
>> - LCh-2D for nonsynchronized transfers (memory transfers, 1D and 2D)
>> - LCh-P for synchronized transfers (mostly peripheral transfers)
>> - LCh-PD similar to LCh-P but runs on a dedicated physical channel
>> - LCh-G for graphical transfers/operations
>> - LCh-D for display transfers
> 
> (I found a public document "OMAP5912 Multimedia Processor Direct
> Memory Access (DMA) Support Reference Guide", documenting these; easy
> to confuse with "OMAP5910 Dual-Core Processor System DMA Controller
> Reference Guide".)
> 
>> Looking at other part it looks like hat LCH-2D channel mode can happily
>> service a peripheral. LCH-P supports the same features as LCH-2D, but it
>> lacks support for Single/Double-indexed addressing mode on the
>> peripheral port side.
>>
>> So, this patch might not be needed at all. Can you test the omap_udc
>> with s/OMAP_DMA_LCH_P/OMAP_DMA_LCH_2D
>>
>> If USB works, then we can just drop this patch.
> 
> Unfortunately omap_udc does not seem to work at all anymore with DMA on
> my 770 setup. :-(
> 
> I had switched to PIO mode in 2015 since the WARNs about legacy DMA
> API were too annoying and flooding the console. And now that I tried
> using DMA again with g_ether, it doesn't work anymore. The device get's
> recognized on host side, but no traffic goes through. Switching back to
> PIO makes it to work again.

After some tinkering I got omap_udc working with DMA (not DMAengine,
yet) on 770 - using nfsroot. Configuring the channels to OMAP_DMA_LCH_2D
works as expected.

This patch can be dropped.

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Russell King (Oracle) Nov. 22, 2018, 10:29 a.m. UTC | #5
On Tue, Nov 20, 2018 at 11:04:06PM +0200, Aaro Koskinen wrote:
> I had switched to PIO mode in 2015 since the WARNs about legacy DMA
> API were too annoying and flooding the console. And now that I tried
> using DMA again with g_ether, it doesn't work anymore. The device get's
> recognized on host side, but no traffic goes through. Switching back to
> PIO makes it to work again.

A solution to that would be to do what the warning message says, and
update the driver to the DMAengine API.

The reason it didn't get updated when the DMAengine conversion happened
is because I don't have hardware for it, so had no way to test, and no
one seemed to know that anyone was using it.  Eventually, the WARN_ON()
was added to try and root out any users and generate interest in
updating the drivers.  Obviously that didn't happen, because people
just worked around the warning rather than saying anything.

I'm afraid we're long past the time that I'd be willing to update the
omap_udc driver now as I've dropped most of my knowledge on that as
it's been four years, and Peter has been looking after OMAP DMAengine
issues since.

Sorry.
Russell King (Oracle) Nov. 22, 2018, 3:12 p.m. UTC | #6
On Thu, Nov 22, 2018 at 10:29:48AM +0000, Russell King - ARM Linux wrote:
> On Tue, Nov 20, 2018 at 11:04:06PM +0200, Aaro Koskinen wrote:
> > I had switched to PIO mode in 2015 since the WARNs about legacy DMA
> > API were too annoying and flooding the console. And now that I tried
> > using DMA again with g_ether, it doesn't work anymore. The device get's
> > recognized on host side, but no traffic goes through. Switching back to
> > PIO makes it to work again.
> 
> A solution to that would be to do what the warning message says, and
> update the driver to the DMAengine API.

Here's a partial conversion (not even build tested) - it only supports
OUT transfers with dmaengine at the moment.

There's at least one thing that doesn't make sense - the driver
apparently can transfer more than req->length bytes, surely if it does
that, it's a serious problem - shouldn't it be noisy about that?

Also we can't deal with the omap_set_dma_dest_burst_mode() setting -
DMAengine always uses a 64 byte burst, but udc wants a smaller burst
setting.  Does this matter?

I've kept the old code for reference (and the driver will fall back if
we can't get a dmaengine channel.)  I haven't been through and checked
that we result in the channel setup largely the same either.

There will be one major difference - UDC uses element sync, where
an element is 16bits, ep->ep.maxpacket/2 in a frame, and "packets"
frames.  DMAengine is using frame sync, with a 16bit element, one
element in a frame, and packets*ep->ep.maxpacket/2 frames.  This
should be functionally equivalent but I'd like confirmation of that.

I'm also not sure about this:

        if (cpu_is_omap15xx())
                end++;

in dma_dest_len() - is that missing from the omap-dma driver?  It looks
like a work-around for some problem on OMAP15xx, but I can't make sense
about why it's in the UDC driver rather than the legacy DMA driver.

I'm also confused by:

        end |= start & (0xffff << 16);

also in dma_dest_len() - omap_get_dma_dst_pos() returns in the high 16
bits the full address:

        if (dma_omap1())
                offset |= (p->dma_read(CDSA, lch) & 0xFFFF0000);

so if the address crosses a 64k physical address boundary, surely orring
in the start address is wrong?  The more I look at dma_dest_len(), the
more I wonder whether that and dma_src_len() are anywhere near correct,
and whether that is a source of breakage for Aaro.

As I've already said, I've no way to test this on hardware.

Please review and let me know whether I missed anything on the OUT
handling path.

Fixing the IN path is going to be a bit more head-scratching.

 drivers/usb/gadget/udc/omap_udc.c | 154 +++++++++++++++++++++++++++++---------
 drivers/usb/gadget/udc/omap_udc.h |   2 +
 2 files changed, 120 insertions(+), 36 deletions(-)

diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c
index 3a16431da321..a37e1d2f0f3e 100644
--- a/drivers/usb/gadget/udc/omap_udc.c
+++ b/drivers/usb/gadget/udc/omap_udc.c
@@ -204,6 +204,7 @@ static int omap_ep_enable(struct usb_ep *_ep,
 	ep->dma_channel = 0;
 	ep->has_dma = 0;
 	ep->lch = -1;
+	ep->dma = NULL;
 	use_ep(ep, UDC_EP_SEL);
 	omap_writew(udc->clr_halt, UDC_CTRL);
 	ep->ackwait = 0;
@@ -576,21 +577,49 @@ static void finish_in_dma(struct omap_ep *ep, struct omap_req *req, int status)
 static void next_out_dma(struct omap_ep *ep, struct omap_req *req)
 {
 	unsigned packets = req->req.length - req->req.actual;
-	int dma_trigger = 0;
+	struct dma_async_tx_descriptor *tx;
+	struct dma_chan *dma = ep->dma;
+	dma_cookie_t cookie;
 	u16 w;
 
-	/* set up this DMA transfer, enable the fifo, start */
-	packets /= ep->ep.maxpacket;
-	packets = min(packets, (unsigned)UDC_RXN_TC + 1);
+	packets = min_t(unsigned, packets / ep->ep.maxpacket, UDC_RXN_TC + 1);
 	req->dma_bytes = packets * ep->ep.maxpacket;
-	omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S16,
-			ep->ep.maxpacket >> 1, packets,
-			OMAP_DMA_SYNC_ELEMENT,
-			dma_trigger, 0);
-	omap_set_dma_dest_params(ep->lch, OMAP_DMA_PORT_EMIFF,
-		OMAP_DMA_AMODE_POST_INC, req->req.dma + req->req.actual,
-		0, 0);
-	ep->dma_counter = omap_get_dma_dst_pos(ep->lch);
+
+	if (dma) {
+		struct dma_slave_config cfg = {
+			.direction = DMA_DEV_TO_MEM,
+			.src_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTE,
+			/*
+			 * DMAengine uses frame sync mode, setting maxburst=1
+			 * is equivalent to element sync mode.
+			 */
+			.src_maxburst = 1,
+			.src_addr = UDC_DATA_DMA,
+		};
+
+		if (WARN_ON(dmaengine_slave_config(dma, &cfg)))
+			return;
+
+		tx = dmaengine_prep_slave_single(dma,
+						 req->req.dma + req->req.actual,
+						 req->dma_bytes,
+						 DMA_DEV_TO_MEM, 0);
+		if (WARN_ON(!tx))
+			return;
+	} else {
+		int dma_trigger = 0;
+
+		/* set up this DMA transfer, enable the fifo, start */
+		/* dt = S16, cen = ep->ep.maxpacket / 2, cfn = packets */
+		omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S16,
+				ep->ep.maxpacket >> 1, packets,
+				OMAP_DMA_SYNC_ELEMENT,
+				dma_trigger, 0);
+		omap_set_dma_dest_params(ep->lch, OMAP_DMA_PORT_EMIFF,
+			OMAP_DMA_AMODE_POST_INC, req->req.dma + req->req.actual,
+			0, 0);
+		ep->dma_counter = omap_get_dma_dst_pos(ep->lch);
+	}
 
 	omap_writew(UDC_RXN_STOP | (packets - 1), UDC_RXDMA(ep->dma_channel));
 	w = omap_readw(UDC_DMA_IRQ_EN);
@@ -599,7 +628,15 @@ static void next_out_dma(struct omap_ep *ep, struct omap_req *req)
 	omap_writew(ep->bEndpointAddress & 0xf, UDC_EP_NUM);
 	omap_writew(UDC_SET_FIFO_EN, UDC_CTRL);
 
-	omap_start_dma(ep->lch);
+	if (dma) {
+		cookie = dmaengine_submit(tx);
+		if (WARN_ON(dma_submit_error(cookie)))
+			return;
+		ep->dma_cookie = cookie;
+		dma_async_issue_pending(dma);
+	} else {
+		omap_start_dma(ep->lch);
+	}
 }
 
 static void
@@ -607,21 +644,39 @@ finish_out_dma(struct omap_ep *ep, struct omap_req *req, int status, int one)
 {
 	u16	count, w;
 
-	if (status == 0)
-		ep->dma_counter = (u16) (req->req.dma + req->req.actual);
-	count = dma_dest_len(ep, req->req.dma + req->req.actual);
+	if (ep->dma) {
+		struct dma_tx_state state;
+
+		dmaengine_tx_status(ep->dma, ep->dma_cookie, &state);
+
+		count = req->dma_bytes - state.residual;
+	} else {
+		if (status == 0)
+			ep->dma_counter = (u16) (req->req.dma + req->req.actual);
+		count = dma_dest_len(ep, req->req.dma + req->req.actual);
+	}
+
 	count += req->req.actual;
 	if (one)
 		count--;
+
+	/*
+	 * FIXME: Surely if count > req->req.length, something has gone
+	 * seriously wrong and we've scribbled over memory we should not...
+	 * so surely we should be a WARN_ON() at the very least?
+	 */
 	if (count <= req->req.length)
 		req->req.actual = count;
 
-	if (count != req->dma_bytes || status)
-		omap_stop_dma(ep->lch);
-
+	if (count != req->dma_bytes || status) {
+		if (ep->dma)
+			dmaengine_terminate_async(ep->dma);
+		else
+			omap_stop_dma(ep->lch);
 	/* if this wasn't short, request may need another transfer */
-	else if (req->req.actual < req->req.length)
+	} else if (req->req.actual < req->req.length) {
 		return;
+	}
 
 	/* rx completion */
 	w = omap_readw(UDC_DMA_IRQ_EN);
@@ -709,6 +764,7 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
 
 	ep->dma_channel = 0;
 	ep->lch = -1;
+	ep->dma = NULL;
 	if (channel == 0 || channel > 3) {
 		if ((reg & 0x0f00) == 0)
 			channel = 3;
@@ -742,26 +798,44 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
 				0, 0);
 		}
 	} else {
+		struct dma_chan *dma;
+
 		dma_channel = OMAP_DMA_USB_W2FC_RX0 - 1 + channel;
-		status = omap_request_dma(dma_channel,
-			ep->ep.name, dma_error, ep, &ep->lch);
-		if (status == 0) {
+
+		dma = __dma_request_channel(&mask, omap_dma_filter_fn,
+					    (void *)dma_channel);
+		if (dma) {
+			ep->dma = dma;
 			omap_writew(reg, UDC_RXDMA_CFG);
-			/* TIPB */
-			omap_set_dma_src_params(ep->lch,
-				OMAP_DMA_PORT_TIPB,
-				OMAP_DMA_AMODE_CONSTANT,
-				UDC_DATA_DMA,
-				0, 0);
-			/* EMIFF or SDRC */
-			omap_set_dma_dest_burst_mode(ep->lch,
-						OMAP_DMA_DATA_BURST_4);
-			omap_set_dma_dest_data_pack(ep->lch, 1);
+		} else {
+			status = omap_request_dma(dma_channel,
+				ep->ep.name, dma_error, ep, &ep->lch);
+			if (status == 0) {
+				omap_writew(reg, UDC_RXDMA_CFG);
+				/* TIPB */
+				omap_set_dma_src_params(ep->lch,
+					OMAP_DMA_PORT_TIPB,
+					OMAP_DMA_AMODE_CONSTANT,
+					UDC_DATA_DMA,
+					0, 0);
+				/* EMIFF or SDRC */
+				/*
+				 * not ok - CSDP_DST_BURST_64 selected, but this selects
+				 * CSDP_DST_BURST_16 on omap2+ and CSDP_DST_BURST_32 on
+				 * omap1.
+				 */
+				omap_set_dma_dest_burst_mode(ep->lch,
+							OMAP_DMA_DATA_BURST_4);
+				/* ok - CSDP_DST_PACKED set for dmaengine */
+				omap_set_dma_dest_data_pack(ep->lch, 1);
+			}
 		}
 	}
-	if (status)
+	if (d->dma) {
+		ep->has_dma = 1;
+	} else if (status) {
 		ep->dma_channel = 0;
-	else {
+	} else {
 		ep->has_dma = 1;
 		omap_disable_dma_irq(ep->lch, OMAP_DMA_BLOCK_IRQ);
 
@@ -777,6 +851,10 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
 	if (status)
 		DBG("%s no dma channel: %d%s\n", ep->ep.name, status,
 			restart ? " (restart)" : "");
+	else if (d->dma)
+		DBG("%s claimed %cxdma%d dmaengine %s%s\n", ep->ep.name,
+			is_in ? 't' : 'r', ep->dma_channel - 1,
+			dma_chan_name(d->dma), restart ? " (restart)" : "");
 	else
 		DBG("%s claimed %cxdma%d lch %d%s\n", ep->ep.name,
 			is_in ? 't' : 'r',
@@ -850,9 +928,13 @@ static void dma_channel_release(struct omap_ep *ep)
 		if (req)
 			finish_out_dma(ep, req, -ECONNRESET, 0);
 	}
-	omap_free_dma(ep->lch);
+	if (ep->dma)
+		dma_release_channel(ep->dma);
+	else
+		omap_free_dma(ep->lch);
 	ep->dma_channel = 0;
 	ep->lch = -1;
+	ep->dma = NULL;
 	/* has_dma still set, till endpoint is fully quiesced */
 }
 
diff --git a/drivers/usb/gadget/udc/omap_udc.h b/drivers/usb/gadget/udc/omap_udc.h
index 00f9e608e755..68857ae8d763 100644
--- a/drivers/usb/gadget/udc/omap_udc.h
+++ b/drivers/usb/gadget/udc/omap_udc.h
@@ -153,6 +153,8 @@ struct omap_ep {
 	u8				dma_channel;
 	u16				dma_counter;
 	int				lch;
+	struct dma_chan			*dma;
+	dma_cookie_t			dma_cookie;
 	struct omap_udc			*udc;
 	struct timer_list		timer;
 };
Aaro Koskinen Nov. 22, 2018, 10:01 p.m. UTC | #7
Hi,

On Thu, Nov 22, 2018 at 10:31:31AM +0200, Peter Ujfalusi wrote:
> On 20/11/2018 23.04, Aaro Koskinen wrote:
> > On Tue, Nov 20, 2018 at 09:28:37AM +0200, Peter Ujfalusi wrote:
> >> On 19/11/2018 20.46, Aaro Koskinen wrote:
> >>> On Mon, Nov 19, 2018 at 12:40:40PM +0200, Peter Ujfalusi wrote:
> >>>> When the channel is configured for slave operation the LCH_TYPE needs to be
> >>>> set to LCh-P. For memcpy channels the LCH_TYPE must be set to LCh-2D.
> >>>>
> >>>> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
> >>>
> >>> I don't have the documentation, but based on what omap_udc driver (still
> >>> using the legacy OMAP DMA API) does this seems to be correct.
> >>
> >> They are hard to fine, true. From the omap1710 TRM:
> >>
> >> Logical channel types (LCh types) supported are:
> >> - LCh-2D for nonsynchronized transfers (memory transfers, 1D and 2D)
> >> - LCh-P for synchronized transfers (mostly peripheral transfers)
> >> - LCh-PD similar to LCh-P but runs on a dedicated physical channel
> >> - LCh-G for graphical transfers/operations
> >> - LCh-D for display transfers
> > 
> > (I found a public document "OMAP5912 Multimedia Processor Direct
> > Memory Access (DMA) Support Reference Guide", documenting these; easy
> > to confuse with "OMAP5910 Dual-Core Processor System DMA Controller
> > Reference Guide".)
> > 
> >> Looking at other part it looks like hat LCH-2D channel mode can happily
> >> service a peripheral. LCH-P supports the same features as LCH-2D, but it
> >> lacks support for Single/Double-indexed addressing mode on the
> >> peripheral port side.
> >>
> >> So, this patch might not be needed at all. Can you test the omap_udc
> >> with s/OMAP_DMA_LCH_P/OMAP_DMA_LCH_2D
> >>
> >> If USB works, then we can just drop this patch.
> > 
> > Unfortunately omap_udc does not seem to work at all anymore with DMA on
> > my 770 setup. :-(
> > 
> > I had switched to PIO mode in 2015 since the WARNs about legacy DMA
> > API were too annoying and flooding the console. And now that I tried
> > using DMA again with g_ether, it doesn't work anymore. The device get's
> > recognized on host side, but no traffic goes through. Switching back to
> > PIO makes it to work again.
> 
> After some tinkering I got omap_udc working with DMA (not DMAengine,
> yet) on 770 - using nfsroot. Configuring the channels to OMAP_DMA_LCH_2D
> works as expected.

Would be interesting to know how you got it working with DMA. Which
gadget driver were you using?

I bisected my issue, and got:

commit 387f869d2579e379ee343f5493dcd360be60f5c6 (refs/bisect/bad)
Author: Felipe Balbi <felipe.balbi@linux.intel.com>
Date:   Wed Mar 22 13:25:18 2017 +0200

    usb: gadget: u_ether: conditionally align transfer size

With that reverted, the DMA works OK (and I can also now confirm that
OMAP_DMA_LCH_2D works). I haven't yet checked if we actually need that
quirk in OMAP UDC, or if this is related to RMK's findings of potential
bugs in the driver. Anyway, there is clearly yet another regression.

A.
Aaro Koskinen Nov. 22, 2018, 10:24 p.m. UTC | #8
Hi,

On Thu, Nov 22, 2018 at 03:12:36PM +0000, Russell King - ARM Linux wrote:
> On Thu, Nov 22, 2018 at 10:29:48AM +0000, Russell King - ARM Linux wrote:
> > On Tue, Nov 20, 2018 at 11:04:06PM +0200, Aaro Koskinen wrote:
> > > I had switched to PIO mode in 2015 since the WARNs about legacy DMA
> > > API were too annoying and flooding the console. And now that I tried
> > > using DMA again with g_ether, it doesn't work anymore. The device get's
> > > recognized on host side, but no traffic goes through. Switching back to
> > > PIO makes it to work again.
> > 
> > A solution to that would be to do what the warning message says, and
> > update the driver to the DMAengine API.

Fully agreed, but I was busy debugging other more serious issues, and
just wanted to get a reliable ssh or USB serial access to the device
without any extra noise, so switching to PIO using a module parameter
is probably what most users do in such situations.

> Here's a partial conversion (not even build tested) - it only supports
> OUT transfers with dmaengine at the moment.

Thanks, I'll take a closer look and try to do some testing hopefully
during the weekend.

A.
Russell King (Oracle) Nov. 23, 2018, 12:25 a.m. UTC | #9
On Fri, Nov 23, 2018 at 12:24:26AM +0200, Aaro Koskinen wrote:
> Hi,
> 
> On Thu, Nov 22, 2018 at 03:12:36PM +0000, Russell King - ARM Linux wrote:
> > On Thu, Nov 22, 2018 at 10:29:48AM +0000, Russell King - ARM Linux wrote:
> > > On Tue, Nov 20, 2018 at 11:04:06PM +0200, Aaro Koskinen wrote:
> > > > I had switched to PIO mode in 2015 since the WARNs about legacy DMA
> > > > API were too annoying and flooding the console. And now that I tried
> > > > using DMA again with g_ether, it doesn't work anymore. The device get's
> > > > recognized on host side, but no traffic goes through. Switching back to
> > > > PIO makes it to work again.
> > > 
> > > A solution to that would be to do what the warning message says, and
> > > update the driver to the DMAengine API.
> 
> Fully agreed, but I was busy debugging other more serious issues, and
> just wanted to get a reliable ssh or USB serial access to the device
> without any extra noise, so switching to PIO using a module parameter
> is probably what most users do in such situations.
> 
> > Here's a partial conversion (not even build tested) - it only supports
> > OUT transfers with dmaengine at the moment.
> 
> Thanks, I'll take a closer look and try to do some testing hopefully
> during the weekend.

The patch was more for Peter to take a peek at - there's definitely
some bits missing in the dmaengine driver (like the write to the
LCH_CTRL register) that would need to be fixed somehow.

However, it's worth noting that there is exactly one user of
omap_set_dma_channel_mode(), which is omap-udc, which means any DMA
channel made use of by omap-udc will have the LCH_CTRL register
modified to LCH_P, and it will remain that way even if someone else
subsequently makes use of the same channel.  That's rather suspicious
to me... maybe we can just initialise all LCH_CTRL registers to LCH_P
in the dmaengine driver in that case!  If not, then there's a bug
right there.
Aaro Koskinen Nov. 23, 2018, 1:23 a.m. UTC | #10
Hi,

On Fri, Nov 23, 2018 at 12:25:49AM +0000, Russell King - ARM Linux wrote:
> The patch was more for Peter to take a peek at

OK, I'll hack with other platforms meanwhile.

A.
Peter Ujfalusi Nov. 23, 2018, 11:45 a.m. UTC | #11
On 23/11/2018 0.01, Aaro Koskinen wrote:
> Hi,
> 
> On Thu, Nov 22, 2018 at 10:31:31AM +0200, Peter Ujfalusi wrote:
>> On 20/11/2018 23.04, Aaro Koskinen wrote:
>>> On Tue, Nov 20, 2018 at 09:28:37AM +0200, Peter Ujfalusi wrote:
>>>> On 19/11/2018 20.46, Aaro Koskinen wrote:
>>>>> On Mon, Nov 19, 2018 at 12:40:40PM +0200, Peter Ujfalusi wrote:
>>>>>> When the channel is configured for slave operation the LCH_TYPE needs to be
>>>>>> set to LCh-P. For memcpy channels the LCH_TYPE must be set to LCh-2D.
>>>>>>
>>>>>> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
>>>>>
>>>>> I don't have the documentation, but based on what omap_udc driver (still
>>>>> using the legacy OMAP DMA API) does this seems to be correct.
>>>>
>>>> They are hard to fine, true. From the omap1710 TRM:
>>>>
>>>> Logical channel types (LCh types) supported are:
>>>> - LCh-2D for nonsynchronized transfers (memory transfers, 1D and 2D)
>>>> - LCh-P for synchronized transfers (mostly peripheral transfers)
>>>> - LCh-PD similar to LCh-P but runs on a dedicated physical channel
>>>> - LCh-G for graphical transfers/operations
>>>> - LCh-D for display transfers
>>>
>>> (I found a public document "OMAP5912 Multimedia Processor Direct
>>> Memory Access (DMA) Support Reference Guide", documenting these; easy
>>> to confuse with "OMAP5910 Dual-Core Processor System DMA Controller
>>> Reference Guide".)
>>>
>>>> Looking at other part it looks like hat LCH-2D channel mode can happily
>>>> service a peripheral. LCH-P supports the same features as LCH-2D, but it
>>>> lacks support for Single/Double-indexed addressing mode on the
>>>> peripheral port side.
>>>>
>>>> So, this patch might not be needed at all. Can you test the omap_udc
>>>> with s/OMAP_DMA_LCH_P/OMAP_DMA_LCH_2D
>>>>
>>>> If USB works, then we can just drop this patch.
>>>
>>> Unfortunately omap_udc does not seem to work at all anymore with DMA on
>>> my 770 setup. :-(
>>>
>>> I had switched to PIO mode in 2015 since the WARNs about legacy DMA
>>> API were too annoying and flooding the console. And now that I tried
>>> using DMA again with g_ether, it doesn't work anymore. The device get's
>>> recognized on host side, but no traffic goes through. Switching back to
>>> PIO makes it to work again.
>>
>> After some tinkering I got omap_udc working with DMA (not DMAengine,
>> yet) on 770 - using nfsroot. Configuring the channels to OMAP_DMA_LCH_2D
>> works as expected.
> 
> Would be interesting to know how you got it working with DMA. Which
> gadget driver were you using?
> 
> I bisected my issue, and got:
> 
> commit 387f869d2579e379ee343f5493dcd360be60f5c6 (refs/bisect/bad)
> Author: Felipe Balbi <felipe.balbi@linux.intel.com>
> Date:   Wed Mar 22 13:25:18 2017 +0200
> 
>     usb: gadget: u_ether: conditionally align transfer size

I just:
commit 0d61d79625202c1c4fcf07fb960e27984a3657a3
Author: Peter Ujfalusi <peter.ujfalusi@ti.com>
Date:   Thu Nov 22 10:36:55 2018 +0200

    usb: gadget: omap_udc: HACK: Make RX dma work
    
    partial packets do work...
    
    Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>

diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c
index 94128eb69d97..0748c3841a25 100644
--- a/drivers/usb/gadget/udc/omap_udc.c
+++ b/drivers/usb/gadget/udc/omap_udc.c
@@ -886,14 +886,14 @@ omap_ep_queue(struct usb_ep *_ep, struct usb_request *_req, gfp_t gfp_flags)
 
        /* this isn't bogus, but OMAP DMA isn't the only hardware to
         * have a hard time with partial packet reads...  reject it.
+        * Wait a minute, it does work :o
         */
        if (use_dma
                        && ep->has_dma
                        && ep->bEndpointAddress != 0
                        && (ep->bEndpointAddress & USB_DIR_IN) == 0
                        && (req->req.length % ep->ep.maxpacket) != 0) {
-               DBG("%s, no partial packet OUT reads\n", __func__);
-               return -EMSGSIZE;
+               DBG("%s, partial packet OUT, might not work?\n", __func__);
        }
 
        udc = ep->udc;

> With that reverted, the DMA works OK (and I can also now confirm that
> OMAP_DMA_LCH_2D works). I haven't yet checked if we actually need that
> quirk in OMAP UDC,

The omap_udc driver is a bit of a mess, need to check it myself, but for
now we can just set the quirk_ep_out_aligned_size and investigate later.

> or if this is related to RMK's findings of potential
> bugs in the driver. Anyway, there is clearly yet another regression.

I'll check Russell's mail.

> 
> A.
> 

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Peter Ujfalusi Nov. 23, 2018, 11:49 a.m. UTC | #12
On 22/11/2018 12.29, Russell King - ARM Linux wrote:
> On Tue, Nov 20, 2018 at 11:04:06PM +0200, Aaro Koskinen wrote:
>> I had switched to PIO mode in 2015 since the WARNs about legacy DMA
>> API were too annoying and flooding the console. And now that I tried
>> using DMA again with g_ether, it doesn't work anymore. The device get's
>> recognized on host side, but no traffic goes through. Switching back to
>> PIO makes it to work again.
> 
> A solution to that would be to do what the warning message says, and
> update the driver to the DMAengine API.

Yep, omap_udc is the last user of legacy omap_dma API. It is a slow
progress as I do the conversion in my free time, onenand/omap2 and tusb
was converted not too long ago, let's see how the omap_udc is going to go.

> The reason it didn't get updated when the DMAengine conversion happened
> is because I don't have hardware for it, so had no way to test, and no
> one seemed to know that anyone was using it.  Eventually, the WARN_ON()
> was added to try and root out any users and generate interest in
> updating the drivers.  Obviously that didn't happen, because people
> just worked around the warning rather than saying anything.
> 
> I'm afraid we're long past the time that I'd be willing to update the
> omap_udc driver now as I've dropped most of my knowledge on that as
> it's been four years, and Peter has been looking after OMAP DMAengine
> issues since.
> 
> Sorry.
> 

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Peter Ujfalusi Nov. 23, 2018, 11:54 a.m. UTC | #13
On 23/11/2018 2.25, Russell King - ARM Linux wrote:
> On Fri, Nov 23, 2018 at 12:24:26AM +0200, Aaro Koskinen wrote:
>> Hi,
>>
>> On Thu, Nov 22, 2018 at 03:12:36PM +0000, Russell King - ARM Linux wrote:
>>> On Thu, Nov 22, 2018 at 10:29:48AM +0000, Russell King - ARM Linux wrote:
>>>> On Tue, Nov 20, 2018 at 11:04:06PM +0200, Aaro Koskinen wrote:
>>>>> I had switched to PIO mode in 2015 since the WARNs about legacy DMA
>>>>> API were too annoying and flooding the console. And now that I tried
>>>>> using DMA again with g_ether, it doesn't work anymore. The device get's
>>>>> recognized on host side, but no traffic goes through. Switching back to
>>>>> PIO makes it to work again.
>>>>
>>>> A solution to that would be to do what the warning message says, and
>>>> update the driver to the DMAengine API.
>>
>> Fully agreed, but I was busy debugging other more serious issues, and
>> just wanted to get a reliable ssh or USB serial access to the device
>> without any extra noise, so switching to PIO using a module parameter
>> is probably what most users do in such situations.
>>
>>> Here's a partial conversion (not even build tested) - it only supports
>>> OUT transfers with dmaengine at the moment.
>>
>> Thanks, I'll take a closer look and try to do some testing hopefully
>> during the weekend.
> 
> The patch was more for Peter to take a peek at - there's definitely
> some bits missing in the dmaengine driver (like the write to the
> LCH_CTRL register) that would need to be fixed somehow.
> 
> However, it's worth noting that there is exactly one user of
> omap_set_dma_channel_mode(), which is omap-udc, which means any DMA
> channel made use of by omap-udc will have the LCH_CTRL register
> modified to LCH_P, and it will remain that way even if someone else
> subsequently makes use of the same channel.  That's rather suspicious
> to me... maybe we can just initialise all LCH_CTRL registers to LCH_P
> in the dmaengine driver in that case!  If not, then there's a bug
> right there.

Hrm, right. memcpy will break if we take a channel which was used by
omap_udc at some point as LCH_P is not capable of dealing with it.

With this patch it should be fine as we configure the LCH_CTRL to P or
2D depending on the transfer type (slave vs non-slave).
But, we might just set the lch to 2D regardless as it works for slave
and memcpy channels fine.

> 

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Peter Ujfalusi Nov. 23, 2018, 12:35 p.m. UTC | #14
On 22/11/2018 17.12, Russell King - ARM Linux wrote:
> On Thu, Nov 22, 2018 at 10:29:48AM +0000, Russell King - ARM Linux wrote:
>> On Tue, Nov 20, 2018 at 11:04:06PM +0200, Aaro Koskinen wrote:
>>> I had switched to PIO mode in 2015 since the WARNs about legacy DMA
>>> API were too annoying and flooding the console. And now that I tried
>>> using DMA again with g_ether, it doesn't work anymore. The device get's
>>> recognized on host side, but no traffic goes through. Switching back to
>>> PIO makes it to work again.
>>
>> A solution to that would be to do what the warning message says, and
>> update the driver to the DMAengine API.
> 
> Here's a partial conversion (not even build tested) - it only supports
> OUT transfers with dmaengine at the moment.

Thanks!

What I learned with the tusb that we need to rearrange things for
DMAengine (4cadc711cdc7 usb: musb: tusb6010_omap: Allocate DMA channels
upfront)

But that was within the musb framework, so omap_udc might be simpler.

> There's at least one thing that doesn't make sense - the driver
> apparently can transfer more than req->length bytes, surely if it does
> that, it's a serious problem - shouldn't it be noisy about that?


> Also we can't deal with the omap_set_dma_dest_burst_mode() setting -
> DMAengine always uses a 64 byte burst, but udc wants a smaller burst
> setting.  Does this matter?

The tusb also fiddled with the burst before the conversion, I believe
what the DMAengine driver is doing should be fine. If not then we fix it
while converting the omap_udc.

> 
> I've kept the old code for reference (and the driver will fall back if
> we can't get a dmaengine channel.)  I haven't been through and checked
> that we result in the channel setup largely the same either.
> 
> There will be one major difference - UDC uses element sync, where
> an element is 16bits, ep->ep.maxpacket/2 in a frame, and "packets"
> frames.  DMAengine is using frame sync, with a 16bit element, one
> element in a frame, and packets*ep->ep.maxpacket/2 frames.  This
> should be functionally equivalent but I'd like confirmation of that.

Yes, I think it should be fine also.

> 
> I'm also not sure about this:
> 
>         if (cpu_is_omap15xx())
>                 end++;
> 
> in dma_dest_len() - is that missing from the omap-dma driver?  It looks
> like a work-around for some problem on OMAP15xx, but I can't make sense
> about why it's in the UDC driver rather than the legacy DMA driver.

afaik no other legacy drivers were doing similar thing, this must be
something which is needed for the omap_udc driver to fix up something?

> 
> I'm also confused by:
> 
>         end |= start & (0xffff << 16);
> 
> also in dma_dest_len() - omap_get_dma_dst_pos() returns in the high 16
> bits the full address:
> 
>         if (dma_omap1())
>                 offset |= (p->dma_read(CDSA, lch) & 0xFFFF0000);

CDSA is OMAP_DMA_REG_2X16BIT for omap1
The CPC/CDAC holds the LSB of the _current_ DMA pointer. The code gets
the MSB of the address from the CDSA registers.

> 
> so if the address crosses a 64k physical address boundary, surely orring
> in the start address is wrong?  The more I look at dma_dest_len(), the
> more I wonder whether that and dma_src_len() are anywhere near correct,
> and whether that is a source of breakage for Aaro.

Hrm, again... the position reporting on OMAP1 is certainly not something
I would put my life on :o

> As I've already said, I've no way to test this on hardware.
> 
> Please review and let me know whether I missed anything on the OUT
> handling path.
> 
> Fixing the IN path is going to be a bit more head-scratching.
> 
>  drivers/usb/gadget/udc/omap_udc.c | 154 +++++++++++++++++++++++++++++---------
>  drivers/usb/gadget/udc/omap_udc.h |   2 +
>  2 files changed, 120 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c
> index 3a16431da321..a37e1d2f0f3e 100644
> --- a/drivers/usb/gadget/udc/omap_udc.c
> +++ b/drivers/usb/gadget/udc/omap_udc.c
> @@ -204,6 +204,7 @@ static int omap_ep_enable(struct usb_ep *_ep,
>  	ep->dma_channel = 0;
>  	ep->has_dma = 0;
>  	ep->lch = -1;
> +	ep->dma = NULL;
>  	use_ep(ep, UDC_EP_SEL);
>  	omap_writew(udc->clr_halt, UDC_CTRL);
>  	ep->ackwait = 0;
> @@ -576,21 +577,49 @@ static void finish_in_dma(struct omap_ep *ep, struct omap_req *req, int status)
>  static void next_out_dma(struct omap_ep *ep, struct omap_req *req)
>  {
>  	unsigned packets = req->req.length - req->req.actual;
> -	int dma_trigger = 0;
> +	struct dma_async_tx_descriptor *tx;
> +	struct dma_chan *dma = ep->dma;
> +	dma_cookie_t cookie;
>  	u16 w;
>  
> -	/* set up this DMA transfer, enable the fifo, start */
> -	packets /= ep->ep.maxpacket;
> -	packets = min(packets, (unsigned)UDC_RXN_TC + 1);
> +	packets = min_t(unsigned, packets / ep->ep.maxpacket, UDC_RXN_TC + 1);
>  	req->dma_bytes = packets * ep->ep.maxpacket;
> -	omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S16,
> -			ep->ep.maxpacket >> 1, packets,
> -			OMAP_DMA_SYNC_ELEMENT,
> -			dma_trigger, 0);
> -	omap_set_dma_dest_params(ep->lch, OMAP_DMA_PORT_EMIFF,
> -		OMAP_DMA_AMODE_POST_INC, req->req.dma + req->req.actual,
> -		0, 0);
> -	ep->dma_counter = omap_get_dma_dst_pos(ep->lch);
> +
> +	if (dma) {
> +		struct dma_slave_config cfg = {
> +			.direction = DMA_DEV_TO_MEM,
> +			.src_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTE,
> +			/*
> +			 * DMAengine uses frame sync mode, setting maxburst=1
> +			 * is equivalent to element sync mode.
> +			 */
> +			.src_maxburst = 1,

We should fix the omap-dma driver for slave_sg  instead:

if (!burst)
	burst = 1;

I thought that I already did that.

> +			.src_addr = UDC_DATA_DMA,
> +		};
> +
> +		if (WARN_ON(dmaengine_slave_config(dma, &cfg)))
> +			return;
> +
> +		tx = dmaengine_prep_slave_single(dma,
> +						 req->req.dma + req->req.actual,
> +						 req->dma_bytes,
> +						 DMA_DEV_TO_MEM, 0);
> +		if (WARN_ON(!tx))
> +			return;
> +	} else {
> +		int dma_trigger = 0;
> +
> +		/* set up this DMA transfer, enable the fifo, start */
> +		/* dt = S16, cen = ep->ep.maxpacket / 2, cfn = packets */
> +		omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S16,
> +				ep->ep.maxpacket >> 1, packets,
> +				OMAP_DMA_SYNC_ELEMENT,
> +				dma_trigger, 0);
> +		omap_set_dma_dest_params(ep->lch, OMAP_DMA_PORT_EMIFF,
> +			OMAP_DMA_AMODE_POST_INC, req->req.dma + req->req.actual,
> +			0, 0);
> +		ep->dma_counter = omap_get_dma_dst_pos(ep->lch);
> +	}
>  
>  	omap_writew(UDC_RXN_STOP | (packets - 1), UDC_RXDMA(ep->dma_channel));
>  	w = omap_readw(UDC_DMA_IRQ_EN);
> @@ -599,7 +628,15 @@ static void next_out_dma(struct omap_ep *ep, struct omap_req *req)
>  	omap_writew(ep->bEndpointAddress & 0xf, UDC_EP_NUM);
>  	omap_writew(UDC_SET_FIFO_EN, UDC_CTRL);
>  
> -	omap_start_dma(ep->lch);
> +	if (dma) {
> +		cookie = dmaengine_submit(tx);
> +		if (WARN_ON(dma_submit_error(cookie)))
> +			return;
> +		ep->dma_cookie = cookie;
> +		dma_async_issue_pending(dma);
> +	} else {
> +		omap_start_dma(ep->lch);
> +	}
>  }
>  
>  static void
> @@ -607,21 +644,39 @@ finish_out_dma(struct omap_ep *ep, struct omap_req *req, int status, int one)
>  {
>  	u16	count, w;
>  
> -	if (status == 0)
> -		ep->dma_counter = (u16) (req->req.dma + req->req.actual);
> -	count = dma_dest_len(ep, req->req.dma + req->req.actual);
> +	if (ep->dma) {
> +		struct dma_tx_state state;
> +
> +		dmaengine_tx_status(ep->dma, ep->dma_cookie, &state);
> +
> +		count = req->dma_bytes - state.residual;
> +	} else {
> +		if (status == 0)
> +			ep->dma_counter = (u16) (req->req.dma + req->req.actual);
> +		count = dma_dest_len(ep, req->req.dma + req->req.actual);
> +	}
> +
>  	count += req->req.actual;
>  	if (one)
>  		count--;
> +
> +	/*
> +	 * FIXME: Surely if count > req->req.length, something has gone
> +	 * seriously wrong and we've scribbled over memory we should not...
> +	 * so surely we should be a WARN_ON() at the very least?
> +	 */
>  	if (count <= req->req.length)
>  		req->req.actual = count;
>  
> -	if (count != req->dma_bytes || status)
> -		omap_stop_dma(ep->lch);
> -
> +	if (count != req->dma_bytes || status) {
> +		if (ep->dma)
> +			dmaengine_terminate_async(ep->dma);
> +		else
> +			omap_stop_dma(ep->lch);
>  	/* if this wasn't short, request may need another transfer */
> -	else if (req->req.actual < req->req.length)
> +	} else if (req->req.actual < req->req.length) {
>  		return;
> +	}
>  
>  	/* rx completion */
>  	w = omap_readw(UDC_DMA_IRQ_EN);
> @@ -709,6 +764,7 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
>  
>  	ep->dma_channel = 0;
>  	ep->lch = -1;
> +	ep->dma = NULL;
>  	if (channel == 0 || channel > 3) {
>  		if ((reg & 0x0f00) == 0)
>  			channel = 3;
> @@ -742,26 +798,44 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
>  				0, 0);
>  		}
>  	} else {
> +		struct dma_chan *dma;
> +
>  		dma_channel = OMAP_DMA_USB_W2FC_RX0 - 1 + channel;
> -		status = omap_request_dma(dma_channel,
> -			ep->ep.name, dma_error, ep, &ep->lch);
> -		if (status == 0) {
> +
> +		dma = __dma_request_channel(&mask, omap_dma_filter_fn,
> +					    (void *)dma_channel);

dma_request_chan(dev, "ch_name");

where ch_name: rx0/1/2, tx0/1/2

and we don't need the omap_dma_filter_fn in here as all taken care via
the dma_slave_map


> +		if (dma) {
> +			ep->dma = dma;
>  			omap_writew(reg, UDC_RXDMA_CFG);
> -			/* TIPB */
> -			omap_set_dma_src_params(ep->lch,
> -				OMAP_DMA_PORT_TIPB,
> -				OMAP_DMA_AMODE_CONSTANT,
> -				UDC_DATA_DMA,
> -				0, 0);
> -			/* EMIFF or SDRC */
> -			omap_set_dma_dest_burst_mode(ep->lch,
> -						OMAP_DMA_DATA_BURST_4);
> -			omap_set_dma_dest_data_pack(ep->lch, 1);
> +		} else {
> +			status = omap_request_dma(dma_channel,
> +				ep->ep.name, dma_error, ep, &ep->lch);
> +			if (status == 0) {
> +				omap_writew(reg, UDC_RXDMA_CFG);
> +				/* TIPB */
> +				omap_set_dma_src_params(ep->lch,
> +					OMAP_DMA_PORT_TIPB,
> +					OMAP_DMA_AMODE_CONSTANT,
> +					UDC_DATA_DMA,
> +					0, 0);
> +				/* EMIFF or SDRC */
> +				/*
> +				 * not ok - CSDP_DST_BURST_64 selected, but this selects
> +				 * CSDP_DST_BURST_16 on omap2+ and CSDP_DST_BURST_32 on
> +				 * omap1.
> +				 */
> +				omap_set_dma_dest_burst_mode(ep->lch,
> +							OMAP_DMA_DATA_BURST_4);
> +				/* ok - CSDP_DST_PACKED set for dmaengine */
> +				omap_set_dma_dest_data_pack(ep->lch, 1);
> +			}
>  		}
>  	}
> -	if (status)
> +	if (d->dma) {
> +		ep->has_dma = 1;
> +	} else if (status) {
>  		ep->dma_channel = 0;
> -	else {
> +	} else {
>  		ep->has_dma = 1;
>  		omap_disable_dma_irq(ep->lch, OMAP_DMA_BLOCK_IRQ);
>  
> @@ -777,6 +851,10 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
>  	if (status)
>  		DBG("%s no dma channel: %d%s\n", ep->ep.name, status,
>  			restart ? " (restart)" : "");
> +	else if (d->dma)
> +		DBG("%s claimed %cxdma%d dmaengine %s%s\n", ep->ep.name,
> +			is_in ? 't' : 'r', ep->dma_channel - 1,
> +			dma_chan_name(d->dma), restart ? " (restart)" : "");
>  	else
>  		DBG("%s claimed %cxdma%d lch %d%s\n", ep->ep.name,
>  			is_in ? 't' : 'r',
> @@ -850,9 +928,13 @@ static void dma_channel_release(struct omap_ep *ep)
>  		if (req)
>  			finish_out_dma(ep, req, -ECONNRESET, 0);
>  	}
> -	omap_free_dma(ep->lch);
> +	if (ep->dma)
> +		dma_release_channel(ep->dma);
> +	else
> +		omap_free_dma(ep->lch);
>  	ep->dma_channel = 0;
>  	ep->lch = -1;
> +	ep->dma = NULL;
>  	/* has_dma still set, till endpoint is fully quiesced */
>  }
>  
> diff --git a/drivers/usb/gadget/udc/omap_udc.h b/drivers/usb/gadget/udc/omap_udc.h
> index 00f9e608e755..68857ae8d763 100644
> --- a/drivers/usb/gadget/udc/omap_udc.h
> +++ b/drivers/usb/gadget/udc/omap_udc.h
> @@ -153,6 +153,8 @@ struct omap_ep {
>  	u8				dma_channel;
>  	u16				dma_counter;
>  	int				lch;
> +	struct dma_chan			*dma;
> +	dma_cookie_t			dma_cookie;
>  	struct omap_udc			*udc;
>  	struct timer_list		timer;
>  };

I try to give this a try, thanks Russell for the patch!

> 

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Russell King (Oracle) Nov. 23, 2018, 3:43 p.m. UTC | #15
On Fri, Nov 23, 2018 at 02:35:04PM +0200, Peter Ujfalusi wrote:
> > Also we can't deal with the omap_set_dma_dest_burst_mode() setting -
> > DMAengine always uses a 64 byte burst, but udc wants a smaller burst
> > setting.  Does this matter?
> 
> The tusb also fiddled with the burst before the conversion, I believe
> what the DMAengine driver is doing should be fine. If not then we fix it
> while converting the omap_udc.

That's good news, so I can ignore that difference.

> > I'm also not sure about this:
> > 
> >         if (cpu_is_omap15xx())
> >                 end++;
> > 
> > in dma_dest_len() - is that missing from the omap-dma driver?  It looks
> > like a work-around for some problem on OMAP15xx, but I can't make sense
> > about why it's in the UDC driver rather than the legacy DMA driver.
> 
> afaik no other legacy drivers were doing similar thing, this must be
> something which is needed for the omap_udc driver to fix up something?
> 
> > 
> > I'm also confused by:
> > 
> >         end |= start & (0xffff << 16);
> > 
> > also in dma_dest_len() - omap_get_dma_dst_pos() returns in the high 16
> > bits the full address:
> > 
> >         if (dma_omap1())
> >                 offset |= (p->dma_read(CDSA, lch) & 0xFFFF0000);
> 
> CDSA is OMAP_DMA_REG_2X16BIT for omap1
> The CPC/CDAC holds the LSB of the _current_ DMA pointer. The code gets
> the MSB of the address from the CDSA registers.
> 
> > 
> > so if the address crosses a 64k physical address boundary, surely orring
> > in the start address is wrong?  The more I look at dma_dest_len(), the
> > more I wonder whether that and dma_src_len() are anywhere near correct,
> > and whether that is a source of breakage for Aaro.
> 
> Hrm, again... the position reporting on OMAP1 is certainly not something
> I would put my life on :o

My feeling is - if the code in plat-omap/dma.c doesn't work, we've got
the same problems in the dmaengine driver, so the reported residue will
be wrong.  Any workarounds need to be within the dmaengine driver, not
in individual drivers.  We can't just go subtracting 1 from the residue
reported by dmaengine.

> > diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c
> > index 3a16431da321..a37e1d2f0f3e 100644
> > --- a/drivers/usb/gadget/udc/omap_udc.c
> > +++ b/drivers/usb/gadget/udc/omap_udc.c
> > @@ -204,6 +204,7 @@ static int omap_ep_enable(struct usb_ep *_ep,
> >  	ep->dma_channel = 0;
> >  	ep->has_dma = 0;
> >  	ep->lch = -1;
> > +	ep->dma = NULL;
> >  	use_ep(ep, UDC_EP_SEL);
> >  	omap_writew(udc->clr_halt, UDC_CTRL);
> >  	ep->ackwait = 0;
> > @@ -576,21 +577,49 @@ static void finish_in_dma(struct omap_ep *ep, struct omap_req *req, int status)
> >  static void next_out_dma(struct omap_ep *ep, struct omap_req *req)
> >  {
> >  	unsigned packets = req->req.length - req->req.actual;
> > -	int dma_trigger = 0;
> > +	struct dma_async_tx_descriptor *tx;
> > +	struct dma_chan *dma = ep->dma;
> > +	dma_cookie_t cookie;
> >  	u16 w;
> >  
> > -	/* set up this DMA transfer, enable the fifo, start */
> > -	packets /= ep->ep.maxpacket;
> > -	packets = min(packets, (unsigned)UDC_RXN_TC + 1);
> > +	packets = min_t(unsigned, packets / ep->ep.maxpacket, UDC_RXN_TC + 1);
> >  	req->dma_bytes = packets * ep->ep.maxpacket;
> > -	omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S16,
> > -			ep->ep.maxpacket >> 1, packets,
> > -			OMAP_DMA_SYNC_ELEMENT,
> > -			dma_trigger, 0);
> > -	omap_set_dma_dest_params(ep->lch, OMAP_DMA_PORT_EMIFF,
> > -		OMAP_DMA_AMODE_POST_INC, req->req.dma + req->req.actual,
> > -		0, 0);
> > -	ep->dma_counter = omap_get_dma_dst_pos(ep->lch);
> > +
> > +	if (dma) {
> > +		struct dma_slave_config cfg = {
> > +			.direction = DMA_DEV_TO_MEM,
> > +			.src_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTE,
> > +			/*
> > +			 * DMAengine uses frame sync mode, setting maxburst=1
> > +			 * is equivalent to element sync mode.
> > +			 */
> > +			.src_maxburst = 1,
> 
> We should fix the omap-dma driver for slave_sg  instead:
> 
> if (!burst)
> 	burst = 1;
> 
> I thought that I already did that.

It isn't in 4.19, and I see no changes between 4.19 and 4.20-rc for
ti/omap-dma.c.

> > +		struct dma_chan *dma;
> > +
> >  		dma_channel = OMAP_DMA_USB_W2FC_RX0 - 1 + channel;
> > -		status = omap_request_dma(dma_channel,
> > -			ep->ep.name, dma_error, ep, &ep->lch);
> > -		if (status == 0) {
> > +
> > +		dma = __dma_request_channel(&mask, omap_dma_filter_fn,
> > +					    (void *)dma_channel);
> 
> dma_request_chan(dev, "ch_name");
> 
> where ch_name: rx0/1/2, tx0/1/2
> 
> and we don't need the omap_dma_filter_fn in here as all taken care via
> the dma_slave_map

Yea, we can switch to that once the DT has been modified, but let's
try to keep the conversion as separate small steps at the moment.

> I try to give this a try, thanks Russell for the patch!

Thanks for the response, I'll rip out the old non-DMAengine handling
for OUT transfers given your responses so far - I've been though all
the register constructions, and it looks like I've broadly got it
right, with the differences that I've already noted.

I'll send an updated patch shortly.

I've just spotted that I've missed a call to dma_dest_len() in
proc_ep_show() though...
Russell King (Oracle) Nov. 23, 2018, 4:16 p.m. UTC | #16
Hi Peter,

Here's the patch, which should now support IN as well as OUT.
Completely untested, as mentioned before.

 drivers/usb/gadget/udc/omap_udc.c | 286 ++++++++++++++++++--------------------
 drivers/usb/gadget/udc/omap_udc.h |   3 +-
 2 files changed, 135 insertions(+), 154 deletions(-)

diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c
index 3a16431da321..ad6f315e4327 100644
--- a/drivers/usb/gadget/udc/omap_udc.c
+++ b/drivers/usb/gadget/udc/omap_udc.c
@@ -28,6 +28,7 @@
 #include <linux/usb/ch9.h>
 #include <linux/usb/gadget.h>
 #include <linux/usb/otg.h>
+#include <linux/dmaengine.h>
 #include <linux/dma-mapping.h>
 #include <linux/clk.h>
 #include <linux/err.h>
@@ -203,7 +204,7 @@ static int omap_ep_enable(struct usb_ep *_ep,
 	/* set endpoint to initial state */
 	ep->dma_channel = 0;
 	ep->has_dma = 0;
-	ep->lch = -1;
+	ep->dma = NULL;
 	use_ep(ep, UDC_EP_SEL);
 	omap_writew(udc->clr_halt, UDC_CTRL);
 	ep->ackwait = 0;
@@ -468,43 +469,6 @@ static int read_fifo(struct omap_ep *ep, struct omap_req *req)
 
 /*-------------------------------------------------------------------------*/
 
-static u16 dma_src_len(struct omap_ep *ep, dma_addr_t start)
-{
-	dma_addr_t	end;
-
-	/* IN-DMA needs this on fault/cancel paths, so 15xx misreports
-	 * the last transfer's bytecount by more than a FIFO's worth.
-	 */
-	if (cpu_is_omap15xx())
-		return 0;
-
-	end = omap_get_dma_src_pos(ep->lch);
-	if (end == ep->dma_counter)
-		return 0;
-
-	end |= start & (0xffff << 16);
-	if (end < start)
-		end += 0x10000;
-	return end - start;
-}
-
-static u16 dma_dest_len(struct omap_ep *ep, dma_addr_t start)
-{
-	dma_addr_t	end;
-
-	end = omap_get_dma_dst_pos(ep->lch);
-	if (end == ep->dma_counter)
-		return 0;
-
-	end |= start & (0xffff << 16);
-	if (cpu_is_omap15xx())
-		end++;
-	if (end < start)
-		end += 0x10000;
-	return end - start;
-}
-
-
 /* Each USB transfer request using DMA maps to one or more DMA transfers.
  * When DMA completion isn't request completion, the UDC continues with
  * the next DMA transfer for that USB transfer.
@@ -512,34 +476,53 @@ static u16 dma_dest_len(struct omap_ep *ep, dma_addr_t start)
 
 static void next_in_dma(struct omap_ep *ep, struct omap_req *req)
 {
-	u16		txdma_ctrl, w;
-	unsigned	length = req->req.length - req->req.actual;
-	const int	sync_mode = cpu_is_omap15xx()
-				? OMAP_DMA_SYNC_FRAME
-				: OMAP_DMA_SYNC_ELEMENT;
-	int		dma_trigger = 0;
+	struct dma_async_tx_descriptor *tx;
+	struct dma_chan *dma = ep->dma;
+	dma_cookie_t cookie;
+	unsigned burst, length;
+	u16 txdma_ctrl, w;
+	struct dma_slave_config omap_udc_in_cfg = {
+		.direction = DMA_MEM_TO_DEV,
+		.dst_addr = UDC_DATA_DMA,
+	};
+
+	length = req->req.length - req->req.actual;
 
 	/* measure length in either bytes or packets */
-	if ((cpu_is_omap16xx() && length <= UDC_TXN_TSC)
-			|| (cpu_is_omap15xx() && length < ep->maxpacket)) {
+	if ((cpu_is_omap16xx() && length <= UDC_TXN_TSC) ||
+	    (cpu_is_omap15xx() && length < ep->maxpacket)) {
+		omap_udc_in_cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
 		txdma_ctrl = UDC_TXN_EOT | length;
-		omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S8,
-				length, 1, sync_mode, dma_trigger, 0);
+		burst = length;
 	} else {
-		length = min(length / ep->maxpacket,
-				(unsigned) UDC_TXN_TSC + 1);
+		omap_udc_in_cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTE;
+		length = min_t(unsigned, length / ep->maxpacket,
+		               UDC_TXN_TSC + 1);
 		txdma_ctrl = length;
-		omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S16,
-				ep->ep.maxpacket >> 1, length, sync_mode,
-				dma_trigger, 0);
 		length *= ep->maxpacket;
+		burst = ep->ep.maxpacket >> 1;
 	}
-	omap_set_dma_src_params(ep->lch, OMAP_DMA_PORT_EMIFF,
-		OMAP_DMA_AMODE_POST_INC, req->req.dma + req->req.actual,
-		0, 0);
 
-	omap_start_dma(ep->lch);
-	ep->dma_counter = omap_get_dma_src_pos(ep->lch);
+	if (!cpu_is_omap15xx())
+		burst = 1;
+
+	omap_udc_in_cfg.dst_maxburst = burst;
+
+	if (WARN_ON(dmaengine_slave_config(dma, &omap_udc_in_cfg)))
+		return;
+
+	tx = dmaengine_prep_slave_single(dma, req->req.dma + req->req.actual,
+					 length, DMA_MEM_TO_DEV, 0);
+	if (WARN_ON(!tx))
+		return;
+
+	cookie = dmaengine_submit(tx);
+	if (WARN_ON(dma_submit_error(cookie)))
+		return;
+
+	ep->dma_cookie = cookie;
+	dma_async_issue_pending(dma);
+
 	w = omap_readw(UDC_DMA_IRQ_EN);
 	w |= UDC_TX_DONE_IE(ep->dma_channel);
 	omap_writew(w, UDC_DMA_IRQ_EN);
@@ -549,11 +532,14 @@ static void next_in_dma(struct omap_ep *ep, struct omap_req *req)
 
 static void finish_in_dma(struct omap_ep *ep, struct omap_req *req, int status)
 {
+	struct dma_tx_state state;
 	u16 w;
 
-	if (status == 0) {
-		req->req.actual += req->dma_bytes;
+	dmaengine_tx_status(ep->dma, ep->dma_cookie, &state);
 
+	req->req.actual += req->dma_bytes - state.residual;
+
+	if (status == 0) {
 		/* return if this request needs to send data or zlp */
 		if (req->req.actual < req->req.length)
 			return;
@@ -561,36 +547,47 @@ static void finish_in_dma(struct omap_ep *ep, struct omap_req *req, int status)
 				&& req->dma_bytes != 0
 				&& (req->req.actual % ep->maxpacket) == 0)
 			return;
-	} else
-		req->req.actual += dma_src_len(ep, req->req.dma
-							+ req->req.actual);
+	}
 
 	/* tx completion */
-	omap_stop_dma(ep->lch);
+	dmaengine_terminate_async(ep->dma);
+
 	w = omap_readw(UDC_DMA_IRQ_EN);
 	w &= ~UDC_TX_DONE_IE(ep->dma_channel);
 	omap_writew(w, UDC_DMA_IRQ_EN);
 	done(ep, req, status);
 }
 
+static const struct dma_slave_config omap_udc_out_cfg = {
+	.direction = DMA_DEV_TO_MEM,
+	.src_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTE,
+	/*
+	 * DMAengine uses frame sync mode, setting maxburst=1
+	 * is equivalent to element sync mode.
+	 */
+	.src_maxburst = 1,
+	.src_addr = UDC_DATA_DMA,
+};
+
 static void next_out_dma(struct omap_ep *ep, struct omap_req *req)
 {
-	unsigned packets = req->req.length - req->req.actual;
-	int dma_trigger = 0;
+	struct dma_async_tx_descriptor *tx;
+	struct dma_chan *dma = ep->dma;
+	dma_cookie_t cookie;
+	unsigned packets, length;
 	u16 w;
 
-	/* set up this DMA transfer, enable the fifo, start */
-	packets /= ep->ep.maxpacket;
-	packets = min(packets, (unsigned)UDC_RXN_TC + 1);
-	req->dma_bytes = packets * ep->ep.maxpacket;
-	omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S16,
-			ep->ep.maxpacket >> 1, packets,
-			OMAP_DMA_SYNC_ELEMENT,
-			dma_trigger, 0);
-	omap_set_dma_dest_params(ep->lch, OMAP_DMA_PORT_EMIFF,
-		OMAP_DMA_AMODE_POST_INC, req->req.dma + req->req.actual,
-		0, 0);
-	ep->dma_counter = omap_get_dma_dst_pos(ep->lch);
+	length = req->req.length - req->req.actual;
+	packets = min_t(unsigned, length / ep->ep.maxpacket, UDC_RXN_TC + 1);
+	length = packets * ep->ep.maxpacket;
+
+	if (WARN_ON(dmaengine_slave_config(dma, &omap_udc_out_cfg)))
+		return;
+
+	tx = dmaengine_prep_slave_single(dma, req->req.dma + req->req.actual,
+					 length, DMA_DEV_TO_MEM, 0);
+	if (WARN_ON(!tx))
+		return;
 
 	omap_writew(UDC_RXN_STOP | (packets - 1), UDC_RXDMA(ep->dma_channel));
 	w = omap_readw(UDC_DMA_IRQ_EN);
@@ -599,29 +596,42 @@ static void next_out_dma(struct omap_ep *ep, struct omap_req *req)
 	omap_writew(ep->bEndpointAddress & 0xf, UDC_EP_NUM);
 	omap_writew(UDC_SET_FIFO_EN, UDC_CTRL);
 
-	omap_start_dma(ep->lch);
+	cookie = dmaengine_submit(tx);
+	if (WARN_ON(dma_submit_error(cookie)))
+		return;
+
+	ep->dma_cookie = cookie;
+	dma_async_issue_pending(dma);
+	req->dma_bytes = length;
 }
 
 static void
 finish_out_dma(struct omap_ep *ep, struct omap_req *req, int status, int one)
 {
+	struct dma_tx_state state;
 	u16	count, w;
 
-	if (status == 0)
-		ep->dma_counter = (u16) (req->req.dma + req->req.actual);
-	count = dma_dest_len(ep, req->req.dma + req->req.actual);
+	dmaengine_tx_status(ep->dma, ep->dma_cookie, &state);
+
+	count = req->dma_bytes - state.residual;
 	count += req->req.actual;
 	if (one)
 		count--;
+
+	/*
+	 * FIXME: Surely if count > req->req.length, something has gone
+	 * seriously wrong and we've scribbled over memory we should not...
+	 * so surely we should be a WARN_ON() at the very least?
+	 */
 	if (count <= req->req.length)
 		req->req.actual = count;
 
-	if (count != req->dma_bytes || status)
-		omap_stop_dma(ep->lch);
-
+	if (count != req->dma_bytes || status) {
+		dmaengine_terminate_async(ep->dma);
 	/* if this wasn't short, request may need another transfer */
-	else if (req->req.actual < req->req.length)
+	} else if (req->req.actual < req->req.length) {
 		return;
+	}
 
 	/* rx completion */
 	w = omap_readw(UDC_DMA_IRQ_EN);
@@ -683,19 +693,10 @@ static void dma_irq(struct omap_udc *udc, u16 irq_src)
 	}
 }
 
-static void dma_error(int lch, u16 ch_status, void *data)
-{
-	struct omap_ep	*ep = data;
-
-	/* if ch_status & OMAP_DMA_DROP_IRQ ... */
-	/* if ch_status & OMAP1_DMA_TOUT_IRQ ... */
-	ERR("%s dma error, lch %d status %02x\n", ep->ep.name, lch, ch_status);
-
-	/* complete current transfer ... */
-}
-
 static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
 {
+	dma_cap_mask_t mask;
+	struct dma_chan *dma;
 	u16	reg;
 	int	status, restart, is_in;
 	int	dma_channel;
@@ -708,7 +709,7 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
 	reg |= UDC_DMA_REQ;		/* "pulse" activated */
 
 	ep->dma_channel = 0;
-	ep->lch = -1;
+	ep->dma = NULL;
 	if (channel == 0 || channel > 3) {
 		if ((reg & 0x0f00) == 0)
 			channel = 3;
@@ -722,65 +723,41 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
 		}
 	}
 	reg |= (0x0f & ep->bEndpointAddress) << (4 * (channel - 1));
-	ep->dma_channel = channel;
 
-	if (is_in) {
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_SLAVE, mask);
+
+	if (is_in)
 		dma_channel = OMAP_DMA_USB_W2FC_TX0 - 1 + channel;
-		status = omap_request_dma(dma_channel,
-			ep->ep.name, dma_error, ep, &ep->lch);
-		if (status == 0) {
-			omap_writew(reg, UDC_TXDMA_CFG);
-			/* EMIFF or SDRC */
-			omap_set_dma_src_burst_mode(ep->lch,
-						OMAP_DMA_DATA_BURST_4);
-			omap_set_dma_src_data_pack(ep->lch, 1);
-			/* TIPB */
-			omap_set_dma_dest_params(ep->lch,
-				OMAP_DMA_PORT_TIPB,
-				OMAP_DMA_AMODE_CONSTANT,
-				UDC_DATA_DMA,
-				0, 0);
-		}
-	} else {
+	else
 		dma_channel = OMAP_DMA_USB_W2FC_RX0 - 1 + channel;
-		status = omap_request_dma(dma_channel,
-			ep->ep.name, dma_error, ep, &ep->lch);
-		if (status == 0) {
+
+	dma = __dma_request_channel(&mask, omap_dma_filter_fn,
+				    (void *)dma_channel);
+	if (dma) {
+		ep->dma_channel = channel;
+		ep->dma = dma;
+		if (is_in)
+			omap_writew(reg, UDC_TXDMA_CFG);
+		else
 			omap_writew(reg, UDC_RXDMA_CFG);
-			/* TIPB */
-			omap_set_dma_src_params(ep->lch,
-				OMAP_DMA_PORT_TIPB,
-				OMAP_DMA_AMODE_CONSTANT,
-				UDC_DATA_DMA,
-				0, 0);
-			/* EMIFF or SDRC */
-			omap_set_dma_dest_burst_mode(ep->lch,
-						OMAP_DMA_DATA_BURST_4);
-			omap_set_dma_dest_data_pack(ep->lch, 1);
-		}
-	}
-	if (status)
-		ep->dma_channel = 0;
-	else {
 		ep->has_dma = 1;
-		omap_disable_dma_irq(ep->lch, OMAP_DMA_BLOCK_IRQ);
-
-		/* channel type P: hw synch (fifo) */
-		if (!cpu_is_omap15xx())
-			omap_set_dma_channel_mode(ep->lch, OMAP_DMA_LCH_P);
+		status = 0;
+	} else {
+		ep->dma_channel = 0;
+		status = -EINVAL;
 	}
 
 just_restart:
 	/* restart any queue, even if the claim failed  */
 	restart = !ep->stopped && !list_empty(&ep->queue);
 
-	if (status)
-		DBG("%s no dma channel: %d%s\n", ep->ep.name, status,
-			restart ? " (restart)" : "");
+	if (d->dma)
+		DBG("%s claimed %cxdma%d dmaengine %s%s\n", ep->ep.name,
+			is_in ? 't' : 'r', ep->dma_channel - 1,
+			dma_chan_name(d->dma), restart ? " (restart)" : "");
 	else
-		DBG("%s claimed %cxdma%d lch %d%s\n", ep->ep.name,
-			is_in ? 't' : 'r',
-			ep->dma_channel - 1, ep->lch,
+		DBG("%s no dma channel: %d%s\n", ep->ep.name, status,
 			restart ? " (restart)" : "");
 
 	if (restart) {
@@ -814,7 +791,8 @@ static void dma_channel_release(struct omap_ep *ep)
 	else
 		req = NULL;
 
-	active = omap_get_dma_active_status(ep->lch);
+	active = dma_async_is_tx_complete(ep->dma, ep->dma_cookie, NULL, NULL)
+			== DMA_IN_PROGRESS;
 
 	DBG("%s release %s %cxdma%d %p\n", ep->ep.name,
 			active ? "active" : "idle",
@@ -850,9 +828,9 @@ static void dma_channel_release(struct omap_ep *ep)
 		if (req)
 			finish_out_dma(ep, req, -ECONNRESET, 0);
 	}
-	omap_free_dma(ep->lch);
+	dma_release_channel(ep->dma);
 	ep->dma_channel = 0;
-	ep->lch = -1;
+	ep->dma = NULL;
 	/* has_dma still set, till endpoint is fully quiesced */
 }
 
@@ -2146,9 +2124,9 @@ static void proc_ep_show(struct seq_file *s, struct omap_ep *ep)
 	use_ep(ep, 0);
 
 	if (use_dma && ep->has_dma)
-		snprintf(buf, sizeof buf, "(%cxdma%d lch%d) ",
+		snprintf(buf, sizeof buf, "(%cxdma%d dma %s) ",
 			(ep->bEndpointAddress & USB_DIR_IN) ? 't' : 'r',
-			ep->dma_channel - 1, ep->lch);
+			ep->dma_channel - 1, dma_chan_name(ep->dma));
 	else
 		buf[0] = 0;
 
@@ -2194,9 +2172,11 @@ static void proc_ep_show(struct seq_file *s, struct omap_ep *ep)
 			unsigned	length = req->req.actual;
 
 			if (use_dma && buf[0]) {
-				length += ((ep->bEndpointAddress & USB_DIR_IN)
-						? dma_src_len : dma_dest_len)
-					(ep, req->req.dma + length);
+				struct dma_tx_state state;
+
+				dmaengine_tx_status(ep->dma, ep->dma_cookie,
+						    &state);
+				length += req->dma_bytes - state.residual;
 				buf[0] = 0;
 			}
 			seq_printf(s, "\treq %p len %d/%d buf %p\n",
diff --git a/drivers/usb/gadget/udc/omap_udc.h b/drivers/usb/gadget/udc/omap_udc.h
index 00f9e608e755..e04c48f669ed 100644
--- a/drivers/usb/gadget/udc/omap_udc.h
+++ b/drivers/usb/gadget/udc/omap_udc.h
@@ -152,7 +152,8 @@ struct omap_ep {
 	u8				ackwait;
 	u8				dma_channel;
 	u16				dma_counter;
-	int				lch;
+	struct dma_chan			*dma;
+	dma_cookie_t			dma_cookie;
 	struct omap_udc			*udc;
 	struct timer_list		timer;
 };
Aaro Koskinen Nov. 23, 2018, 6:52 p.m. UTC | #17
Hi,

On Fri, Nov 23, 2018 at 02:35:04PM +0200, Peter Ujfalusi wrote:
> On 22/11/2018 17.12, Russell King - ARM Linux wrote:
> > I'm also not sure about this:
> > 
> >         if (cpu_is_omap15xx())
> >                 end++;
> > 
> > in dma_dest_len() - is that missing from the omap-dma driver?  It looks
> > like a work-around for some problem on OMAP15xx, but I can't make sense
> > about why it's in the UDC driver rather than the legacy DMA driver.
> 
> afaik no other legacy drivers were doing similar thing, this must be
> something which is needed for the omap_udc driver to fix up something?

Here's the patch that added it: https://marc.info/?l=linux-omap&m=119634396324221&w=2

"Make DMA-OUT behave on the 1510 ... the 1510 CPC register was just
off-by-one with respect to the 1611 CDAC"

A.
Russell King (Oracle) Nov. 23, 2018, 11:27 p.m. UTC | #18
On Fri, Nov 23, 2018 at 04:16:59PM +0000, Russell King - ARM Linux wrote:
> Hi Peter,
> 
> Here's the patch, which should now support IN as well as OUT.
> Completely untested, as mentioned before.

Now compile tested...

 drivers/usb/gadget/udc/omap_udc.c | 291 ++++++++++++++++++--------------------
 drivers/usb/gadget/udc/omap_udc.h |   3 +-
 2 files changed, 137 insertions(+), 157 deletions(-)

diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c
index 3a16431da321..dd85476ec234 100644
--- a/drivers/usb/gadget/udc/omap_udc.c
+++ b/drivers/usb/gadget/udc/omap_udc.c
@@ -28,6 +28,7 @@
 #include <linux/usb/ch9.h>
 #include <linux/usb/gadget.h>
 #include <linux/usb/otg.h>
+#include <linux/dmaengine.h>
 #include <linux/dma-mapping.h>
 #include <linux/clk.h>
 #include <linux/err.h>
@@ -203,7 +204,7 @@ static int omap_ep_enable(struct usb_ep *_ep,
 	/* set endpoint to initial state */
 	ep->dma_channel = 0;
 	ep->has_dma = 0;
-	ep->lch = -1;
+	ep->dma = NULL;
 	use_ep(ep, UDC_EP_SEL);
 	omap_writew(udc->clr_halt, UDC_CTRL);
 	ep->ackwait = 0;
@@ -468,43 +469,6 @@ static int read_fifo(struct omap_ep *ep, struct omap_req *req)
 
 /*-------------------------------------------------------------------------*/
 
-static u16 dma_src_len(struct omap_ep *ep, dma_addr_t start)
-{
-	dma_addr_t	end;
-
-	/* IN-DMA needs this on fault/cancel paths, so 15xx misreports
-	 * the last transfer's bytecount by more than a FIFO's worth.
-	 */
-	if (cpu_is_omap15xx())
-		return 0;
-
-	end = omap_get_dma_src_pos(ep->lch);
-	if (end == ep->dma_counter)
-		return 0;
-
-	end |= start & (0xffff << 16);
-	if (end < start)
-		end += 0x10000;
-	return end - start;
-}
-
-static u16 dma_dest_len(struct omap_ep *ep, dma_addr_t start)
-{
-	dma_addr_t	end;
-
-	end = omap_get_dma_dst_pos(ep->lch);
-	if (end == ep->dma_counter)
-		return 0;
-
-	end |= start & (0xffff << 16);
-	if (cpu_is_omap15xx())
-		end++;
-	if (end < start)
-		end += 0x10000;
-	return end - start;
-}
-
-
 /* Each USB transfer request using DMA maps to one or more DMA transfers.
  * When DMA completion isn't request completion, the UDC continues with
  * the next DMA transfer for that USB transfer.
@@ -512,34 +476,53 @@ static u16 dma_dest_len(struct omap_ep *ep, dma_addr_t start)
 
 static void next_in_dma(struct omap_ep *ep, struct omap_req *req)
 {
-	u16		txdma_ctrl, w;
-	unsigned	length = req->req.length - req->req.actual;
-	const int	sync_mode = cpu_is_omap15xx()
-				? OMAP_DMA_SYNC_FRAME
-				: OMAP_DMA_SYNC_ELEMENT;
-	int		dma_trigger = 0;
+	struct dma_async_tx_descriptor *tx;
+	struct dma_chan *dma = ep->dma;
+	dma_cookie_t cookie;
+	unsigned burst, length;
+	u16 txdma_ctrl, w;
+	struct dma_slave_config omap_udc_in_cfg = {
+		.direction = DMA_MEM_TO_DEV,
+		.dst_addr = UDC_DATA_DMA,
+	};
+
+	length = req->req.length - req->req.actual;
 
 	/* measure length in either bytes or packets */
-	if ((cpu_is_omap16xx() && length <= UDC_TXN_TSC)
-			|| (cpu_is_omap15xx() && length < ep->maxpacket)) {
+	if ((cpu_is_omap16xx() && length <= UDC_TXN_TSC) ||
+	    (cpu_is_omap15xx() && length < ep->maxpacket)) {
+		omap_udc_in_cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
 		txdma_ctrl = UDC_TXN_EOT | length;
-		omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S8,
-				length, 1, sync_mode, dma_trigger, 0);
+		burst = length;
 	} else {
-		length = min(length / ep->maxpacket,
-				(unsigned) UDC_TXN_TSC + 1);
+		omap_udc_in_cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES;
+		length = min_t(unsigned, length / ep->maxpacket,
+		               UDC_TXN_TSC + 1);
 		txdma_ctrl = length;
-		omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S16,
-				ep->ep.maxpacket >> 1, length, sync_mode,
-				dma_trigger, 0);
 		length *= ep->maxpacket;
+		burst = ep->ep.maxpacket >> 1;
 	}
-	omap_set_dma_src_params(ep->lch, OMAP_DMA_PORT_EMIFF,
-		OMAP_DMA_AMODE_POST_INC, req->req.dma + req->req.actual,
-		0, 0);
 
-	omap_start_dma(ep->lch);
-	ep->dma_counter = omap_get_dma_src_pos(ep->lch);
+	if (!cpu_is_omap15xx())
+		burst = 1;
+
+	omap_udc_in_cfg.dst_maxburst = burst;
+
+	if (WARN_ON(dmaengine_slave_config(dma, &omap_udc_in_cfg)))
+		return;
+
+	tx = dmaengine_prep_slave_single(dma, req->req.dma + req->req.actual,
+					 length, DMA_MEM_TO_DEV, 0);
+	if (WARN_ON(!tx))
+		return;
+
+	cookie = dmaengine_submit(tx);
+	if (WARN_ON(dma_submit_error(cookie)))
+		return;
+
+	ep->dma_cookie = cookie;
+	dma_async_issue_pending(dma);
+
 	w = omap_readw(UDC_DMA_IRQ_EN);
 	w |= UDC_TX_DONE_IE(ep->dma_channel);
 	omap_writew(w, UDC_DMA_IRQ_EN);
@@ -549,11 +532,14 @@ static void next_in_dma(struct omap_ep *ep, struct omap_req *req)
 
 static void finish_in_dma(struct omap_ep *ep, struct omap_req *req, int status)
 {
+	struct dma_tx_state state;
 	u16 w;
 
-	if (status == 0) {
-		req->req.actual += req->dma_bytes;
+	dmaengine_tx_status(ep->dma, ep->dma_cookie, &state);
+
+	req->req.actual += req->dma_bytes - state.residue;
 
+	if (status == 0) {
 		/* return if this request needs to send data or zlp */
 		if (req->req.actual < req->req.length)
 			return;
@@ -561,36 +547,47 @@ static void finish_in_dma(struct omap_ep *ep, struct omap_req *req, int status)
 				&& req->dma_bytes != 0
 				&& (req->req.actual % ep->maxpacket) == 0)
 			return;
-	} else
-		req->req.actual += dma_src_len(ep, req->req.dma
-							+ req->req.actual);
+	}
 
 	/* tx completion */
-	omap_stop_dma(ep->lch);
+	dmaengine_terminate_async(ep->dma);
+
 	w = omap_readw(UDC_DMA_IRQ_EN);
 	w &= ~UDC_TX_DONE_IE(ep->dma_channel);
 	omap_writew(w, UDC_DMA_IRQ_EN);
 	done(ep, req, status);
 }
 
+static struct dma_slave_config omap_udc_out_cfg = {
+	.direction = DMA_DEV_TO_MEM,
+	.src_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES,
+	/*
+	 * DMAengine uses frame sync mode, setting maxburst=1
+	 * is equivalent to element sync mode.
+	 */
+	.src_maxburst = 1,
+	.src_addr = UDC_DATA_DMA,
+};
+
 static void next_out_dma(struct omap_ep *ep, struct omap_req *req)
 {
-	unsigned packets = req->req.length - req->req.actual;
-	int dma_trigger = 0;
+	struct dma_async_tx_descriptor *tx;
+	struct dma_chan *dma = ep->dma;
+	dma_cookie_t cookie;
+	unsigned packets, length;
 	u16 w;
 
-	/* set up this DMA transfer, enable the fifo, start */
-	packets /= ep->ep.maxpacket;
-	packets = min(packets, (unsigned)UDC_RXN_TC + 1);
-	req->dma_bytes = packets * ep->ep.maxpacket;
-	omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S16,
-			ep->ep.maxpacket >> 1, packets,
-			OMAP_DMA_SYNC_ELEMENT,
-			dma_trigger, 0);
-	omap_set_dma_dest_params(ep->lch, OMAP_DMA_PORT_EMIFF,
-		OMAP_DMA_AMODE_POST_INC, req->req.dma + req->req.actual,
-		0, 0);
-	ep->dma_counter = omap_get_dma_dst_pos(ep->lch);
+	length = req->req.length - req->req.actual;
+	packets = min_t(unsigned, length / ep->ep.maxpacket, UDC_RXN_TC + 1);
+	length = packets * ep->ep.maxpacket;
+
+	if (WARN_ON(dmaengine_slave_config(dma, &omap_udc_out_cfg)))
+		return;
+
+	tx = dmaengine_prep_slave_single(dma, req->req.dma + req->req.actual,
+					 length, DMA_DEV_TO_MEM, 0);
+	if (WARN_ON(!tx))
+		return;
 
 	omap_writew(UDC_RXN_STOP | (packets - 1), UDC_RXDMA(ep->dma_channel));
 	w = omap_readw(UDC_DMA_IRQ_EN);
@@ -599,29 +596,42 @@ static void next_out_dma(struct omap_ep *ep, struct omap_req *req)
 	omap_writew(ep->bEndpointAddress & 0xf, UDC_EP_NUM);
 	omap_writew(UDC_SET_FIFO_EN, UDC_CTRL);
 
-	omap_start_dma(ep->lch);
+	cookie = dmaengine_submit(tx);
+	if (WARN_ON(dma_submit_error(cookie)))
+		return;
+
+	ep->dma_cookie = cookie;
+	dma_async_issue_pending(dma);
+	req->dma_bytes = length;
 }
 
 static void
 finish_out_dma(struct omap_ep *ep, struct omap_req *req, int status, int one)
 {
+	struct dma_tx_state state;
 	u16	count, w;
 
-	if (status == 0)
-		ep->dma_counter = (u16) (req->req.dma + req->req.actual);
-	count = dma_dest_len(ep, req->req.dma + req->req.actual);
+	dmaengine_tx_status(ep->dma, ep->dma_cookie, &state);
+
+	count = req->dma_bytes - state.residue;
 	count += req->req.actual;
 	if (one)
 		count--;
+
+	/*
+	 * FIXME: Surely if count > req->req.length, something has gone
+	 * seriously wrong and we've scribbled over memory we should not...
+	 * so surely we should be a WARN_ON() at the very least?
+	 */
 	if (count <= req->req.length)
 		req->req.actual = count;
 
-	if (count != req->dma_bytes || status)
-		omap_stop_dma(ep->lch);
-
+	if (count != req->dma_bytes || status) {
+		dmaengine_terminate_async(ep->dma);
 	/* if this wasn't short, request may need another transfer */
-	else if (req->req.actual < req->req.length)
+	} else if (req->req.actual < req->req.length) {
 		return;
+	}
 
 	/* rx completion */
 	w = omap_readw(UDC_DMA_IRQ_EN);
@@ -683,32 +693,25 @@ static void dma_irq(struct omap_udc *udc, u16 irq_src)
 	}
 }
 
-static void dma_error(int lch, u16 ch_status, void *data)
-{
-	struct omap_ep	*ep = data;
-
-	/* if ch_status & OMAP_DMA_DROP_IRQ ... */
-	/* if ch_status & OMAP1_DMA_TOUT_IRQ ... */
-	ERR("%s dma error, lch %d status %02x\n", ep->ep.name, lch, ch_status);
-
-	/* complete current transfer ... */
-}
-
 static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
 {
+	dma_cap_mask_t mask;
+	struct dma_chan *dma;
+	u32 dma_cfg;
 	u16	reg;
 	int	status, restart, is_in;
 	int	dma_channel;
 
 	is_in = ep->bEndpointAddress & USB_DIR_IN;
 	if (is_in)
-		reg = omap_readw(UDC_TXDMA_CFG);
+		dma_cfg = UDC_TXDMA_CFG;
 	else
-		reg = omap_readw(UDC_RXDMA_CFG);
+		dma_cfg = UDC_RXDMA_CFG;
+	reg = omap_readw(dma_cfg);
 	reg |= UDC_DMA_REQ;		/* "pulse" activated */
 
 	ep->dma_channel = 0;
-	ep->lch = -1;
+	ep->dma = NULL;
 	if (channel == 0 || channel > 3) {
 		if ((reg & 0x0f00) == 0)
 			channel = 3;
@@ -722,65 +725,38 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
 		}
 	}
 	reg |= (0x0f & ep->bEndpointAddress) << (4 * (channel - 1));
-	ep->dma_channel = channel;
 
-	if (is_in) {
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_SLAVE, mask);
+
+	if (is_in)
 		dma_channel = OMAP_DMA_USB_W2FC_TX0 - 1 + channel;
-		status = omap_request_dma(dma_channel,
-			ep->ep.name, dma_error, ep, &ep->lch);
-		if (status == 0) {
-			omap_writew(reg, UDC_TXDMA_CFG);
-			/* EMIFF or SDRC */
-			omap_set_dma_src_burst_mode(ep->lch,
-						OMAP_DMA_DATA_BURST_4);
-			omap_set_dma_src_data_pack(ep->lch, 1);
-			/* TIPB */
-			omap_set_dma_dest_params(ep->lch,
-				OMAP_DMA_PORT_TIPB,
-				OMAP_DMA_AMODE_CONSTANT,
-				UDC_DATA_DMA,
-				0, 0);
-		}
-	} else {
+	else
 		dma_channel = OMAP_DMA_USB_W2FC_RX0 - 1 + channel;
-		status = omap_request_dma(dma_channel,
-			ep->ep.name, dma_error, ep, &ep->lch);
-		if (status == 0) {
-			omap_writew(reg, UDC_RXDMA_CFG);
-			/* TIPB */
-			omap_set_dma_src_params(ep->lch,
-				OMAP_DMA_PORT_TIPB,
-				OMAP_DMA_AMODE_CONSTANT,
-				UDC_DATA_DMA,
-				0, 0);
-			/* EMIFF or SDRC */
-			omap_set_dma_dest_burst_mode(ep->lch,
-						OMAP_DMA_DATA_BURST_4);
-			omap_set_dma_dest_data_pack(ep->lch, 1);
-		}
-	}
-	if (status)
-		ep->dma_channel = 0;
-	else {
-		ep->has_dma = 1;
-		omap_disable_dma_irq(ep->lch, OMAP_DMA_BLOCK_IRQ);
 
-		/* channel type P: hw synch (fifo) */
-		if (!cpu_is_omap15xx())
-			omap_set_dma_channel_mode(ep->lch, OMAP_DMA_LCH_P);
+	dma = __dma_request_channel(&mask, omap_dma_filter_fn,
+				    (void *)dma_channel);
+	if (dma) {
+		omap_writew(reg, dma_cfg);
+		ep->dma_channel = channel;
+		ep->dma = dma;
+		ep->has_dma = 1;
+		status = 0;
+	} else {
+		ep->dma_channel = 0;
+		status = -EINVAL;
 	}
 
 just_restart:
 	/* restart any queue, even if the claim failed  */
 	restart = !ep->stopped && !list_empty(&ep->queue);
 
-	if (status)
-		DBG("%s no dma channel: %d%s\n", ep->ep.name, status,
-			restart ? " (restart)" : "");
+	if (ep->dma)
+		DBG("%s claimed %cxdma%d dmaengine %s%s\n", ep->ep.name,
+			is_in ? 't' : 'r', ep->dma_channel - 1,
+			dma_chan_name(ep->dma), restart ? " (restart)" : "");
 	else
-		DBG("%s claimed %cxdma%d lch %d%s\n", ep->ep.name,
-			is_in ? 't' : 'r',
-			ep->dma_channel - 1, ep->lch,
+		DBG("%s no dma channel: %d%s\n", ep->ep.name, status,
 			restart ? " (restart)" : "");
 
 	if (restart) {
@@ -814,7 +790,8 @@ static void dma_channel_release(struct omap_ep *ep)
 	else
 		req = NULL;
 
-	active = omap_get_dma_active_status(ep->lch);
+	active = dma_async_is_tx_complete(ep->dma, ep->dma_cookie, NULL, NULL)
+			== DMA_IN_PROGRESS;
 
 	DBG("%s release %s %cxdma%d %p\n", ep->ep.name,
 			active ? "active" : "idle",
@@ -850,9 +827,9 @@ static void dma_channel_release(struct omap_ep *ep)
 		if (req)
 			finish_out_dma(ep, req, -ECONNRESET, 0);
 	}
-	omap_free_dma(ep->lch);
+	dma_release_channel(ep->dma);
 	ep->dma_channel = 0;
-	ep->lch = -1;
+	ep->dma = NULL;
 	/* has_dma still set, till endpoint is fully quiesced */
 }
 
@@ -2146,9 +2123,9 @@ static void proc_ep_show(struct seq_file *s, struct omap_ep *ep)
 	use_ep(ep, 0);
 
 	if (use_dma && ep->has_dma)
-		snprintf(buf, sizeof buf, "(%cxdma%d lch%d) ",
+		snprintf(buf, sizeof buf, "(%cxdma%d dma %s) ",
 			(ep->bEndpointAddress & USB_DIR_IN) ? 't' : 'r',
-			ep->dma_channel - 1, ep->lch);
+			ep->dma_channel - 1, dma_chan_name(ep->dma));
 	else
 		buf[0] = 0;
 
@@ -2194,9 +2171,11 @@ static void proc_ep_show(struct seq_file *s, struct omap_ep *ep)
 			unsigned	length = req->req.actual;
 
 			if (use_dma && buf[0]) {
-				length += ((ep->bEndpointAddress & USB_DIR_IN)
-						? dma_src_len : dma_dest_len)
-					(ep, req->req.dma + length);
+				struct dma_tx_state state;
+
+				dmaengine_tx_status(ep->dma, ep->dma_cookie,
+						    &state);
+				length += req->dma_bytes - state.residue;
 				buf[0] = 0;
 			}
 			seq_printf(s, "\treq %p len %d/%d buf %p\n",
diff --git a/drivers/usb/gadget/udc/omap_udc.h b/drivers/usb/gadget/udc/omap_udc.h
index 00f9e608e755..e04c48f669ed 100644
--- a/drivers/usb/gadget/udc/omap_udc.h
+++ b/drivers/usb/gadget/udc/omap_udc.h
@@ -152,7 +152,8 @@ struct omap_ep {
 	u8				ackwait;
 	u8				dma_channel;
 	u16				dma_counter;
-	int				lch;
+	struct dma_chan			*dma;
+	dma_cookie_t			dma_cookie;
 	struct omap_udc			*udc;
 	struct timer_list		timer;
 };
Aaro Koskinen Nov. 24, 2018, 12:17 a.m. UTC | #19
Hi,

On Fri, Nov 23, 2018 at 01:45:46PM +0200, Peter Ujfalusi wrote:
> On 23/11/2018 0.01, Aaro Koskinen wrote:
> > With that reverted, the DMA works OK (and I can also now confirm that
> > OMAP_DMA_LCH_2D works). I haven't yet checked if we actually need that
> > quirk in OMAP UDC,
> 
> The omap_udc driver is a bit of a mess, need to check it myself, but for
> now we can just set the quirk_ep_out_aligned_size and investigate later.

OK, with quirk_ep_out_aligned_size we get 770/16xx DMA working again,
but on 15xx the omap_udc DMA still doesn't work (tested today for the
first time ever, I have no idea if it has ever worked and if so, when?).

A.
Russell King (Oracle) Nov. 24, 2018, 5:48 p.m. UTC | #20
On Sat, Nov 24, 2018 at 02:17:40AM +0200, Aaro Koskinen wrote:
> Hi,
> 
> On Fri, Nov 23, 2018 at 01:45:46PM +0200, Peter Ujfalusi wrote:
> > On 23/11/2018 0.01, Aaro Koskinen wrote:
> > > With that reverted, the DMA works OK (and I can also now confirm that
> > > OMAP_DMA_LCH_2D works). I haven't yet checked if we actually need that
> > > quirk in OMAP UDC,
> > 
> > The omap_udc driver is a bit of a mess, need to check it myself, but for
> > now we can just set the quirk_ep_out_aligned_size and investigate later.
> 
> OK, with quirk_ep_out_aligned_size we get 770/16xx DMA working again,
> but on 15xx the omap_udc DMA still doesn't work (tested today for the
> first time ever, I have no idea if it has ever worked and if so, when?).

Hmm, there's more questionable stuff in this driver, and the gadget
layer.

Fundamental fact of struct device - it's a ref-counted structure and
will only be freed when the last reference is dropped.  dev_unregister()
merely drops the refcount, it doesn't guarantee that it's dropped to
zero (iow, there can be other users).  Only when the refcount drops
to zero is the dev.release function called.  However:

usb_add_gadget_udc_release(..., release)
{
        if (release)
                gadget->dev.release = release;
        else
                gadget->dev.release = usb_udc_nop_release;
        device_initialize(&gadget->dev);
        ret = device_add(&gadget->dev);
}

At this point, that struct device is registered, so its refcount can
be increased by other users.

void usb_del_gadget_udc(struct usb_gadget *gadget)
{
...
        device_unregister(&gadget->dev);
        memset(&gadget->dev, 0x00, sizeof(gadget->dev));
}

That memset() is down-right wrong - the refcount on this struct device
may _not_ be zero at this point, the struct device could well be in
use by another thread.  That memset will trample over the contents of
the structure potentially while someone else is using it, and
_potentially_ before the gadget->dev.release function has been called.

However, that _may_ be a good thing when you read the omap_udc code:

        status = usb_add_gadget_udc_release(&pdev->dev, &udc->gadget,
                        omap_udc_release);

During the omap_udc_remove() function:
{
...
        usb_del_gadget_udc(&udc->gadget);
        if (udc->driver)
                return -EBUSY;

        udc->done = &done;

... more dereferences of udc, which is a _global_ variable ...

        wait_for_completion(&done);
}

Now, omap_udc_release() does this:

        complete(udc->done);
        kfree(udc);
        udc = NULL;

So, when usb_del_gadget_udc() is called, if device_unregister() within
there drops the last reference count, omap_udc_release() will be called
immediately.  Since udc->done hasn't been setup at that point, that
complete() will fail with a NULL pointer dereference.  If that doesn't
happen, then the kfree() and following set of the global 'udc' variable
to NULL means that all future references to 'udc' after the call to
usb_del_gadget_udc() in omap_udc_remove() will be dereferencing a NULL
pointer.  So one way or the other, this leads to a kernel OOPS.

If, on the other hand, omap_udc_release() was not called in
device_unregister(), the function pointer will be zeroed by the
memset(), which will lead to 'udc' never being freed - in other words,
we leak memory.

What's more is that 'done' is never "completed" so we end up hanging
at the wait_for_completion().

Then there's the pointless:

        if (udc->driver)
                return -EBUSY;

in omap_udc_remove().  The effect of returning an error is... what
exactly?  It doesn't prevent the device being removed at all, it
doesn't delay it, in fact the whole "remove returns an int" is
nothing but confusion - the return value from all driver remove
methods is completely ignored.

If udc->driver is still set at this point, it basically means that
we skip the rest of the tear down, but the platform device will
still be unbound from the driver, leaving (eg) the transceiver phy
still claimed, the procfs file still registered, the interrupts
still claimed, the memory region still registered, etc.  If omap_udc
is built as a module, the module could even be removed while all
that is still registered.

So, whatever way I look at this, the code in the removal path both
in omap_udc and the gadget removal code higher up looks very wrong
and broken to me.
Aaro Koskinen Nov. 24, 2018, 7:06 p.m. UTC | #21
Hello,

On Sat, Nov 24, 2018 at 05:48:23PM +0000, Russell King - ARM Linux wrote:
> Hmm, there's more questionable stuff in this driver, and the gadget
> layer.

[...]

> So, whatever way I look at this, the code in the removal path both
> in omap_udc and the gadget removal code higher up looks very wrong
> and broken to me.

Yes, week ago I saw omap_udc crashing on both probe failure and
module removal and sent some fixes for the most obvious failures (see
https://marc.info/?l=linux-usb&m=154258778316932&w=2).

Is there any good driver that uses usb_add_gadget_udc_release() correctly?
Looking at fsl_qe_udc.c and fsl_udc_core.c they should also crash if
usb_add_gadget_udc_release() fails.

A.
Russell King (Oracle) Nov. 24, 2018, 7:29 p.m. UTC | #22
On Sat, Nov 24, 2018 at 09:06:48PM +0200, Aaro Koskinen wrote:
> Hello,
> 
> On Sat, Nov 24, 2018 at 05:48:23PM +0000, Russell King - ARM Linux wrote:
> > Hmm, there's more questionable stuff in this driver, and the gadget
> > layer.
> 
> [...]
> 
> > So, whatever way I look at this, the code in the removal path both
> > in omap_udc and the gadget removal code higher up looks very wrong
> > and broken to me.
> 
> Yes, week ago I saw omap_udc crashing on both probe failure and
> module removal and sent some fixes for the most obvious failures (see
> https://marc.info/?l=linux-usb&m=154258778316932&w=2).

The effect of your patch is basically to replace the release function
with a no-op function.

> Is there any good driver that uses usb_add_gadget_udc_release() correctly?
> Looking at fsl_qe_udc.c and fsl_udc_core.c they should also crash if
> usb_add_gadget_udc_release() fails.

usb_add_gadget_udc_release() itself will call the release function
automatically on error.

The release function should _also_ be called when usb_del_gadget_udc()
is called (and would be guaranteed if the memset() is removed.)

So, moving the cleanup in the remove path into the release function
would solve the problem with omap_udc, and removing the memset()
would solve the problem with the core code.

It does leave a problem if the omap_udc module is removed - the
release function _could_ be called after the module has been removed
which would lead to an oops.  That's presumably why there's a
completion.  One solution to that would be to move the assignment
of udc->done before the call to usb_del_gadget_udc().

However, using a completion for something like this tends to be
frowned upon, but I don't see any other way to ensure correctness
here.
Russell King (Oracle) Nov. 24, 2018, 8:09 p.m. UTC | #23
On Fri, Nov 23, 2018 at 08:52:15PM +0200, Aaro Koskinen wrote:
> Hi,
> 
> On Fri, Nov 23, 2018 at 02:35:04PM +0200, Peter Ujfalusi wrote:
> > On 22/11/2018 17.12, Russell King - ARM Linux wrote:
> > > I'm also not sure about this:
> > > 
> > >         if (cpu_is_omap15xx())
> > >                 end++;
> > > 
> > > in dma_dest_len() - is that missing from the omap-dma driver?  It looks
> > > like a work-around for some problem on OMAP15xx, but I can't make sense
> > > about why it's in the UDC driver rather than the legacy DMA driver.
> > 
> > afaik no other legacy drivers were doing similar thing, this must be
> > something which is needed for the omap_udc driver to fix up something?
> 
> Here's the patch that added it: https://marc.info/?l=linux-omap&m=119634396324221&w=2
> 
> "Make DMA-OUT behave on the 1510 ... the 1510 CPC register was just
> off-by-one with respect to the 1611 CDAC"

... which suggests that's a problem with the CPC register itself, and
we should fix that in the DMAengine driver rather than the USB gadget
driver.

Tony, any input on this?
Tony Lindgren Nov. 25, 2018, 1:07 a.m. UTC | #24
* Russell King - ARM Linux <linux@armlinux.org.uk> [181124 20:10]:
> On Fri, Nov 23, 2018 at 08:52:15PM +0200, Aaro Koskinen wrote:
> > Hi,
> > 
> > On Fri, Nov 23, 2018 at 02:35:04PM +0200, Peter Ujfalusi wrote:
> > > On 22/11/2018 17.12, Russell King - ARM Linux wrote:
> > > > I'm also not sure about this:
> > > > 
> > > >         if (cpu_is_omap15xx())
> > > >                 end++;
> > > > 
> > > > in dma_dest_len() - is that missing from the omap-dma driver?  It looks
> > > > like a work-around for some problem on OMAP15xx, but I can't make sense
> > > > about why it's in the UDC driver rather than the legacy DMA driver.
> > > 
> > > afaik no other legacy drivers were doing similar thing, this must be
> > > something which is needed for the omap_udc driver to fix up something?
> > 
> > Here's the patch that added it: https://marc.info/?l=linux-omap&m=119634396324221&w=2
> > 
> > "Make DMA-OUT behave on the 1510 ... the 1510 CPC register was just
> > off-by-one with respect to the 1611 CDAC"
> 
> ... which suggests that's a problem with the CPC register itself, and
> we should fix that in the DMAengine driver rather than the USB gadget
> driver.
> 
> Tony, any input on this?

Yeah that sounds like some hardware work-around for 15xx as described
in the DMA_DEST_LAST macro reading CSAC on 15xx instead of CDAC. Seems
like it should be done in the dmaengine driver.. My guess is that other
dma users never needed to read CSAC register?

Regards,

Tony
Tony Lindgren Nov. 25, 2018, 1:11 a.m. UTC | #25
* Tony Lindgren <tony@atomide.com> [181125 01:07]:
> * Russell King - ARM Linux <linux@armlinux.org.uk> [181124 20:10]:
> > On Fri, Nov 23, 2018 at 08:52:15PM +0200, Aaro Koskinen wrote:
> > > Hi,
> > > 
> > > On Fri, Nov 23, 2018 at 02:35:04PM +0200, Peter Ujfalusi wrote:
> > > > On 22/11/2018 17.12, Russell King - ARM Linux wrote:
> > > > > I'm also not sure about this:
> > > > > 
> > > > >         if (cpu_is_omap15xx())
> > > > >                 end++;
> > > > > 
> > > > > in dma_dest_len() - is that missing from the omap-dma driver?  It looks
> > > > > like a work-around for some problem on OMAP15xx, but I can't make sense
> > > > > about why it's in the UDC driver rather than the legacy DMA driver.
> > > > 
> > > > afaik no other legacy drivers were doing similar thing, this must be
> > > > something which is needed for the omap_udc driver to fix up something?
> > > 
> > > Here's the patch that added it: https://marc.info/?l=linux-omap&m=119634396324221&w=2
> > > 
> > > "Make DMA-OUT behave on the 1510 ... the 1510 CPC register was just
> > > off-by-one with respect to the 1611 CDAC"
> > 
> > ... which suggests that's a problem with the CPC register itself, and
> > we should fix that in the DMAengine driver rather than the USB gadget
> > driver.
> > 
> > Tony, any input on this?
> 
> Yeah that sounds like some hardware work-around for 15xx as described
> in the DMA_DEST_LAST macro reading CSAC on 15xx instead of CDAC. Seems
> like it should be done in the dmaengine driver.. My guess is that other
> dma users never needed to read CSAC register?

And it looks like for 15xx we have CPC and CSAC both at offset 0x18 in
arch/arm/mach-omap1/dma.c, seems like the dma driver is missing handling
for the CPC register that's there only for 15xx.

Regards,

Tony
Russell King (Oracle) Nov. 25, 2018, 11:11 a.m. UTC | #26
On Sat, Nov 24, 2018 at 05:07:17PM -0800, Tony Lindgren wrote:
> * Russell King - ARM Linux <linux@armlinux.org.uk> [181124 20:10]:
> > On Fri, Nov 23, 2018 at 08:52:15PM +0200, Aaro Koskinen wrote:
> > > Hi,
> > > 
> > > On Fri, Nov 23, 2018 at 02:35:04PM +0200, Peter Ujfalusi wrote:
> > > > On 22/11/2018 17.12, Russell King - ARM Linux wrote:
> > > > > I'm also not sure about this:
> > > > > 
> > > > >         if (cpu_is_omap15xx())
> > > > >                 end++;
> > > > > 
> > > > > in dma_dest_len() - is that missing from the omap-dma driver?  It looks
> > > > > like a work-around for some problem on OMAP15xx, but I can't make sense
> > > > > about why it's in the UDC driver rather than the legacy DMA driver.
> > > > 
> > > > afaik no other legacy drivers were doing similar thing, this must be
> > > > something which is needed for the omap_udc driver to fix up something?
> > > 
> > > Here's the patch that added it: https://marc.info/?l=linux-omap&m=119634396324221&w=2
> > > 
> > > "Make DMA-OUT behave on the 1510 ... the 1510 CPC register was just
> > > off-by-one with respect to the 1611 CDAC"
> > 
> > ... which suggests that's a problem with the CPC register itself, and
> > we should fix that in the DMAengine driver rather than the USB gadget
> > driver.
> > 
> > Tony, any input on this?
> 
> Yeah that sounds like some hardware work-around for 15xx as described
> in the DMA_DEST_LAST macro reading CSAC on 15xx instead of CDAC. Seems
> like it should be done in the dmaengine driver.. My guess is that other
> dma users never needed to read CSAC register?

Hmm, reading the OMAP1510 TRM for the CPC register, it seems that
omap-dma won't handle this particularly well.  The fact that the
register only updates after the last request in an element or frame
means that if we try to use this value as the current source /
destination address before the first transfer, it can be wildly
wrong.

Saving the current value at the beginning of a request, and detecting
if it's changed (like omap_udc) isn't going to work well in the
generic case.  If, say, the register happens to contain 0x0004, and
our next transfer is using 32-bit elements in element sync mode
starting at address with lsb 16 bits as 0, it would mean we'd see
0x0004 in this register after the _second_ element has been
transferred, and we'd assume that the register hasn't yet changed -
but we would have in reality transferred two elements.

However, omap-dma.c zeros the CPC register before each transfer,
which is interesting, because in one place the OMAP1510 TRM says
that the CPC register is read/write, but in the actual description
of this register, it says it's read-only.

What it also means is that, in such a case, after the 2nd element has
been transferred, where the register contains 0x0004, the address
we'd be looking for (to calculate the residual) should be 0x0008,
not 0x0005, so we actually need to be adding the number of bytes
according to element size.

Looking at omap-dma.c, someone has added support for the residue
granularity:

        if (__dma_omap15xx(od->plat->dma_attr))
                od->ddev.residue_granularity =
                                DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
        else
                od->ddev.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;

If OMAP15xx is truely descriptor granularity, then we can't use
omap-dma for omap_udc, because omap_udc needs to know exactly
how many bytes were transferred.

So... hmm, OMAP15xx DMA looks like a complete mess, and the only
way to know what would and wouldn't work is to actually have the
hardware.

I think we're better off leaving omap-udc well alone, and if it's
now broken with DMA, then that's unfortunate - it would require
someone with the hardware to diagnose the problem and fix it.  I
think trying to convert it to dmaengine would be risking way more
problems than its worth.
Russell King (Oracle) Nov. 25, 2018, 11:57 a.m. UTC | #27
On Sun, Nov 25, 2018 at 11:11:05AM +0000, Russell King - ARM Linux wrote:
> On Sat, Nov 24, 2018 at 05:07:17PM -0800, Tony Lindgren wrote:
> > * Russell King - ARM Linux <linux@armlinux.org.uk> [181124 20:10]:
> > > On Fri, Nov 23, 2018 at 08:52:15PM +0200, Aaro Koskinen wrote:
> > > > Hi,
> > > > 
> > > > On Fri, Nov 23, 2018 at 02:35:04PM +0200, Peter Ujfalusi wrote:
> > > > > On 22/11/2018 17.12, Russell King - ARM Linux wrote:
> > > > > > I'm also not sure about this:
> > > > > > 
> > > > > >         if (cpu_is_omap15xx())
> > > > > >                 end++;
> > > > > > 
> > > > > > in dma_dest_len() - is that missing from the omap-dma driver?  It looks
> > > > > > like a work-around for some problem on OMAP15xx, but I can't make sense
> > > > > > about why it's in the UDC driver rather than the legacy DMA driver.
> > > > > 
> > > > > afaik no other legacy drivers were doing similar thing, this must be
> > > > > something which is needed for the omap_udc driver to fix up something?
> > > > 
> > > > Here's the patch that added it: https://marc.info/?l=linux-omap&m=119634396324221&w=2
> > > > 
> > > > "Make DMA-OUT behave on the 1510 ... the 1510 CPC register was just
> > > > off-by-one with respect to the 1611 CDAC"
> > > 
> > > ... which suggests that's a problem with the CPC register itself, and
> > > we should fix that in the DMAengine driver rather than the USB gadget
> > > driver.
> > > 
> > > Tony, any input on this?
> > 
> > Yeah that sounds like some hardware work-around for 15xx as described
> > in the DMA_DEST_LAST macro reading CSAC on 15xx instead of CDAC. Seems
> > like it should be done in the dmaengine driver.. My guess is that other
> > dma users never needed to read CSAC register?
> 
> Hmm, reading the OMAP1510 TRM for the CPC register, it seems that
> omap-dma won't handle this particularly well.  The fact that the
> register only updates after the last request in an element or frame
> means that if we try to use this value as the current source /
> destination address before the first transfer, it can be wildly
> wrong.
> 
> Saving the current value at the beginning of a request, and detecting
> if it's changed (like omap_udc) isn't going to work well in the
> generic case.  If, say, the register happens to contain 0x0004, and
> our next transfer is using 32-bit elements in element sync mode
> starting at address with lsb 16 bits as 0, it would mean we'd see
> 0x0004 in this register after the _second_ element has been
> transferred, and we'd assume that the register hasn't yet changed -
> but we would have in reality transferred two elements.
> 
> However, omap-dma.c zeros the CPC register before each transfer,
> which is interesting, because in one place the OMAP1510 TRM says
> that the CPC register is read/write, but in the actual description
> of this register, it says it's read-only.
> 
> What it also means is that, in such a case, after the 2nd element has
> been transferred, where the register contains 0x0004, the address
> we'd be looking for (to calculate the residual) should be 0x0008,
> not 0x0005, so we actually need to be adding the number of bytes
> according to element size.
> 
> Looking at omap-dma.c, someone has added support for the residue
> granularity:
> 
>         if (__dma_omap15xx(od->plat->dma_attr))
>                 od->ddev.residue_granularity =
>                                 DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
>         else
>                 od->ddev.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;

I'll point out that this was introduced by:

commit c9bd0946da243a8eb86b44ff613e2c813f9b683b
Author: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Date:   Tue Jun 5 18:59:57 2018 +0200

    dmaengine: ti: omap-dma: Fix OMAP1510 incorrect residue_granularity
...
    It was verified already before that omap_get_dma_src_pos() from
    arch/arm/plat-omap/dma.c didn't work correctly for OMAP1510 - see
    commit 1bdd7419910c ("ASoC: OMAP: fix OMAP1510 broken PCM pointer
    callback") for details.  Apparently the same applies to its successor,
    omap_dma_get_src_pos() from drivers/dma/ti/omap-dma.c.

So, this now presents us with bigger problems - if we fix it now for
omap_udc, should the above commit be reverted, and if we do revert
the above commit, will it regress OMAP1510 audio.

The intention of omap-dma was always to report an accurate residue.
Aaro Koskinen Nov. 25, 2018, 4:58 p.m. UTC | #28
Hi,

On Sun, Nov 25, 2018 at 11:11:05AM +0000, Russell King - ARM Linux wrote:
> I think we're better off leaving omap-udc well alone, and if it's
> now broken with DMA, then that's unfortunate - it would require
> someone with the hardware to diagnose the problem and fix it.  I
> think trying to convert it to dmaengine would be risking way more
> problems than its worth.

Well, there's also an option to use dmaengine only for 16xx at the
beginning.

My current guess is that 15xx DMA has been broken at least since
65111084c63d ("USB: more omap_udc updates (dma and omap1710)").

There are two changes in that patch that broke it:

"use 16 bit DMA access" ==> CPC off-by-one becomes now off-by-two...?

"allow burst/pack for memory access" ==> no idea why

Below changes get traffic going with DMA & g_ether...

A.

diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c
index fcf13ef33b31..8094a0380057 100644
--- a/drivers/usb/gadget/udc/omap_udc.c
+++ b/drivers/usb/gadget/udc/omap_udc.c
@@ -498,7 +498,7 @@ static u16 dma_dest_len(struct omap_ep *ep, dma_addr_t start)
 
 	end |= start & (0xffff << 16);
 	if (cpu_is_omap15xx())
-		end++;
+		end += 2;
 	if (end < start)
 		end += 0x10000;
 	return end - start;
@@ -730,10 +730,12 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
 			ep->ep.name, dma_error, ep, &ep->lch);
 		if (status == 0) {
 			omap_writew(reg, UDC_TXDMA_CFG);
-			/* EMIFF or SDRC */
-			omap_set_dma_src_burst_mode(ep->lch,
-						OMAP_DMA_DATA_BURST_4);
-			omap_set_dma_src_data_pack(ep->lch, 1);
+			if (!cpu_is_omap15xx()) {
+				/* EMIFF or SDRC */
+				omap_set_dma_src_burst_mode(ep->lch,
+							OMAP_DMA_DATA_BURST_4);
+				omap_set_dma_src_data_pack(ep->lch, 1);
+			}
 			/* TIPB */
 			omap_set_dma_dest_params(ep->lch,
 				OMAP_DMA_PORT_TIPB,
@@ -753,10 +755,12 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
 				OMAP_DMA_AMODE_CONSTANT,
 				UDC_DATA_DMA,
 				0, 0);
-			/* EMIFF or SDRC */
-			omap_set_dma_dest_burst_mode(ep->lch,
-						OMAP_DMA_DATA_BURST_4);
-			omap_set_dma_dest_data_pack(ep->lch, 1);
+			if (!cpu_is_omap15xx()) {
+				/* EMIFF or SDRC */
+				omap_set_dma_dest_burst_mode(ep->lch,
+							OMAP_DMA_DATA_BURST_4);
+				omap_set_dma_dest_data_pack(ep->lch, 1);
+			}
 		}
 	}
 	if (status)
Tony Lindgren Nov. 25, 2018, 5:14 p.m. UTC | #29
* Aaro Koskinen <aaro.koskinen@iki.fi> [181125 16:58]:
> Below changes get traffic going with DMA & g_ether...

Oh cool, if you have dma and g_ether working, you should
test it with a variable size ping test loop :) That should
expose any issues within few minutes.

Below is the test script I was using earlier, the
tusb6010 comment there is probably no longer valid.

Regards,

Tony

8< ----------------
#!/bin/bash

#
# At least tusb6010 dma needs 32-bit aligned buffers.
# That can be done by setting no_skb_reserve with:
# musb->g.quirk_avoids_skb_reserve = 1;
#

device=$1
size=$2
wraps=0

while [ 1 ]; do
	#echo "Pinging with size $size"
	if ! ping -w0 -c1 -s$size $device > /dev/null 2>&1; then
		break;
	fi
	size=$(expr $size + 1)

	if [ $size -gt 8192 ]; then
		wraps=$(expr $wraps + 1)
		echo "wrapping ($wraps) at $size"
		size=1
	fi
done
echo "Test ran up to $size"
Aaro Koskinen Dec. 17, 2018, 7:16 p.m. UTC | #30
On Thu, Nov 22, 2018 at 03:12:36PM +0000, Russell King - ARM Linux wrote:
> Also we can't deal with the omap_set_dma_dest_burst_mode() setting -
> DMAengine always uses a 64 byte burst, but udc wants a smaller burst
> setting.  Does this matter?

Looking at OMAP1 docs, it seems it supports only 16 bytes. Then checking
DMAengine code, I don't think these CSDP bit values are not valid
for OMAP1:

        CSDP_SRC_BURST_1        = 0 << 7,
        CSDP_SRC_BURST_16       = 1 << 7,
        CSDP_SRC_BURST_32       = 2 << 7,
        CSDP_SRC_BURST_64       = 3 << 7,

From TI SPRU674 document, pages 50-51:

	0	single access (no burst)
	1	single access (no burst)
	2	burst 4
	3	reserved (do not use this setting)

So if CSDP_SRC_BURST_64 (3) gets programmed OMAP1, I wonder what is the
end result, no burst or burst 4...

A.
Aaro Koskinen Dec. 17, 2018, 11:47 p.m. UTC | #31
Hi,

On Sun, Nov 25, 2018 at 09:14:28AM -0800, Tony Lindgren wrote:
> * Aaro Koskinen <aaro.koskinen@iki.fi> [181125 16:58]:
> > Below changes get traffic going with DMA & g_ether...
> 
> Oh cool, if you have dma and g_ether working, you should
> test it with a variable size ping test loop :) That should
> expose any issues within few minutes.

The ping test is working fine. Also setting MTU higher works fine.

I was also able to reduce the needed changes further, so only the below
are now needed for 15xx DMA.

The dma_dest_len() change I can understand.

But why the BURST_4 is not working in out direction?

--- a/drivers/usb/gadget/udc/omap_udc.c
+++ b/drivers/usb/gadget/udc/omap_udc.c
@@ -498,7 +498,7 @@ static u16 dma_dest_len(struct omap_ep *ep, dma_addr_t start)
 
 	end |= start & (0xffff << 16);
 	if (cpu_is_omap15xx())
-		end++;
+		end += sizeof(u16);
 	if (end < start)
 		end += 0x10000;
 	return end - start;
@@ -754,8 +754,9 @@ static void dma_channel_claim(struct omap_ep *ep, unsigned channel)
 				UDC_DATA_DMA,
 				0, 0);
 			/* EMIFF or SDRC */
-			omap_set_dma_dest_burst_mode(ep->lch,
-						OMAP_DMA_DATA_BURST_4);
+			if (!cpu_is_omap15xx())
+				omap_set_dma_dest_burst_mode(ep->lch,
+							OMAP_DMA_DATA_BURST_4);
 			omap_set_dma_dest_data_pack(ep->lch, 1);
 		}
 	}

A.
Peter Ujfalusi Dec. 18, 2018, 10:11 a.m. UTC | #32
On 17/12/2018 21.16, Aaro Koskinen wrote:
> On Thu, Nov 22, 2018 at 03:12:36PM +0000, Russell King - ARM Linux wrote:
>> Also we can't deal with the omap_set_dma_dest_burst_mode() setting -
>> DMAengine always uses a 64 byte burst, but udc wants a smaller burst
>> setting.  Does this matter?
> 
> Looking at OMAP1 docs, it seems it supports only 16 bytes. Then checking
> DMAengine code, I don't think these CSDP bit values are not valid
> for OMAP1:
> 
>         CSDP_SRC_BURST_1        = 0 << 7,
>         CSDP_SRC_BURST_16       = 1 << 7,
>         CSDP_SRC_BURST_32       = 2 << 7,
>         CSDP_SRC_BURST_64       = 3 << 7,
> 
> From TI SPRU674 document, pages 50-51:
> 
> 	0	single access (no burst)
> 	1	single access (no burst)
> 	2	burst 4

In omap1510 it is 4 x data_type
In omap1610/1710 it is 4 x data_type (only data_type == 32bit is supported)
From omap2420+ 32 bytes (8x32bit/4x64bit)

So for OMAP1 we need to have different handling of the burst:
only enable if data_type is 32bit.

> 	3	reserved (do not use this setting)
> 
> So if CSDP_SRC_BURST_64 (3) gets programmed OMAP1, I wonder what is the
> end result, no burst or burst 4...
> 
> A.
> 

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Tony Lindgren Dec. 18, 2018, 3:55 p.m. UTC | #33
* Aaro Koskinen <aaro.koskinen@iki.fi> [181217 23:47]:
> Hi,
> 
> On Sun, Nov 25, 2018 at 09:14:28AM -0800, Tony Lindgren wrote:
> > * Aaro Koskinen <aaro.koskinen@iki.fi> [181125 16:58]:
> > > Below changes get traffic going with DMA & g_ether...
> > 
> > Oh cool, if you have dma and g_ether working, you should
> > test it with a variable size ping test loop :) That should
> > expose any issues within few minutes.
> 
> The ping test is working fine. Also setting MTU higher works fine.

Thanks for checking, good to hear.

Regards,

Tony
diff mbox series

Patch

diff --git a/drivers/dma/ti/omap-dma.c b/drivers/dma/ti/omap-dma.c
index a4a931ddf6f6..a18cfd497f04 100644
--- a/drivers/dma/ti/omap-dma.c
+++ b/drivers/dma/ti/omap-dma.c
@@ -185,6 +185,10 @@  enum {
 
 	CLNK_CTRL_ENABLE_LNK	= BIT(15),
 
+	/* OMAP1 only */
+	LCH_CTRL_LCH_2D		= 0,
+	LCH_CTRL_LCH_P		= 2,
+
 	CDP_DST_VALID_INC	= 0 << 0,
 	CDP_DST_VALID_RELOAD	= 1 << 0,
 	CDP_DST_VALID_REUSE	= 2 << 0,
@@ -529,6 +533,7 @@  static void omap_dma_start_sg(struct omap_chan *c, struct omap_desc *d)
 
 static void omap_dma_start_desc(struct omap_chan *c)
 {
+	struct omap_dmadev *od = to_omap_dma_dev(c->vc.chan.device);
 	struct virt_dma_desc *vd = vchan_next_desc(&c->vc);
 	struct omap_desc *d;
 	unsigned cxsa, cxei, cxfi;
@@ -570,6 +575,12 @@  static void omap_dma_start_desc(struct omap_chan *c)
 	omap_dma_chan_write(c, CSDP, d->csdp);
 	omap_dma_chan_write(c, CLNK_CTRL, d->clnk_ctrl);
 
+	if (dma_omap1() && !__dma_omap15xx(od->plat->dma_attr)) {
+		if (is_slave_direction(d->dir))
+			omap_dma_chan_write(c, LCH_CTRL, LCH_CTRL_LCH_P);
+		else
+			omap_dma_chan_write(c, LCH_CTRL, LCH_CTRL_LCH_2D);
+	}
 	omap_dma_start_sg(c, d);
 }