tty: serial: 8250_omap: do not defer termios changes
diff mbox

Message ID 8737r7ght7.fsf@linutronix.de
State New
Headers show

Commit Message

John Ogness March 31, 2016, 8:41 a.m. UTC
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

It has been observed that the TX-DMA can stall if termios changes
occur while a TX-DMA operation is in progress. Previously this was
handled by allowing set_termios to return immediately and deferring
the change until the operation is completed. Now set_termios will
block until the operation is completed, proceed with the changes,
then schedule any pending next operation.

Commit message and forward port by John Ogness.

Tested-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 Patch against next-20160331.

 drivers/tty/serial/8250/8250_omap.c |   57 ++++++++++++++++----------
 1 file changed, 37 insertions(+), 20 deletions(-)

Comments

John Ogness March 31, 2016, 10:51 a.m. UTC | #1
On 2016-03-31, John Ogness <john.ogness@linutronix.de> wrote:
> diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
> index 6f76051..9459b4d 100644
> --- a/drivers/tty/serial/8250/8250_omap.c
> +++ b/drivers/tty/serial/8250/8250_omap.c
> @@ -460,6 +451,24 @@ static void omap_8250_set_termios(struct uart_port *port,
>  			priv->efr |= OMAP_UART_SW_TX;
>  		}
>  	}
> +
> +	if (dma && dma->tx_running) {
> +		/*
> +		 * TCSANOW requests the change to occur immediately, however
> +		 * if we have a TX-DMA operation in progress then it has been
> +		 * observed that it might stall and never complete. Therefore
> +		 * we wait until DMA completes to prevent this hang from
> +		 * happening.
> +		 */
> +
> +		dma->tx_running = 2;
> +
> +		spin_unlock_irq(&up->port.lock);
> +		wait_event_interruptible(priv->termios_wait,
> +					 dma->tx_running == 3);
> +		spin_lock_irq(&up->port.lock);
> +		complete_dma = 1;
> +	}

Sorry, I just realized that wait_event_interruptible() is used
here. IMHO this should be changed to wait_event(), otherwise a signal
could trigger the TX-DMA deadlock this code is trying to prevent.

If hardware flow control is active, it is possible that wait_event()
leads to the caller waiting forever. But allowing interruptible causes
too many problems since set_termios() cannot communicate errors.

John Ogness
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Cox March 31, 2016, 2:33 p.m. UTC | #2
> +		 * TCSANOW requests the change to occur immediately, however
> +		 * if we have a TX-DMA operation in progress then it has been
> +		 * observed that it might stall and never complete. Therefore
> +		 * we wait until DMA completes to prevent this hang from
> +		 * happening.
> +		 */
> +
> +		dma->tx_running = 2;
> +
> +		spin_unlock_irq(&up->port.lock);
> +		wait_event_interruptible(priv->termios_wait,
> +					 dma->tx_running == 3);
> +		spin_lock_irq(&up->port.lock);
> +		complete_dma = 1;
> +	}

This look wrong if a signal is received. termios setting is not an
interruptible event and this would mean for example that a random other
signal to a terminal process would cause mysterious non-setting of
termios changes.

Should this therefore not just be a straight wait_event or even a
completion() handler ?

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hurley April 5, 2016, 4:07 a.m. UTC | #3
On 03/31/2016 01:41 AM, John Ogness wrote:
> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> 
> It has been observed that the TX-DMA can stall

Does this happen on any other OMAP part besides am335x?
I looked back over the LKML history of this and didn't see any other
design implicated in this problem.

> if termios changes
> occur while a TX-DMA operation is in progress. Previously this was
> handled by allowing set_termios to return immediately and deferring
> the change until the operation is completed. Now set_termios will
> block until the operation is completed, proceed with the changes,
> then schedule any pending next operation.

I ask because I wonder if this is related to the tx dma trigger
problem (worked around by writing a byte to the tx fifo on am335x)?

If so, then
1) It should use the OMAP_DMA_TX_KICK bug flag
2) It could pause dma, complete set_termios(), then resume dma which would
keep everything nice and linear.

Or even just drop DMA for the am335x and only use the normal 8250_dma
support for newer OMAP designs.

Regards,
Peter Hurley


> Commit message and forward port by John Ogness.
> 
> Tested-by: John Ogness <john.ogness@linutronix.de>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  Patch against next-20160331.
> 
>  drivers/tty/serial/8250/8250_omap.c |   57 ++++++++++++++++----------
>  1 file changed, 37 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
> index 6f76051..9459b4d 100644
> --- a/drivers/tty/serial/8250/8250_omap.c
> +++ b/drivers/tty/serial/8250/8250_omap.c
> @@ -100,9 +100,9 @@ struct omap8250_priv {
>  	u8 wer;
>  	u8 xon;
>  	u8 xoff;
> -	u8 delayed_restore;
>  	u16 quot;
>  
> +	wait_queue_head_t termios_wait;
>  	bool is_suspending;
>  	int wakeirq;
>  	int wakeups_enabled;
> @@ -257,18 +257,6 @@ static void omap8250_update_mdr1(struct uart_8250_port *up,
>  static void omap8250_restore_regs(struct uart_8250_port *up)
>  {
>  	struct omap8250_priv *priv = up->port.private_data;
> -	struct uart_8250_dma	*dma = up->dma;
> -
> -	if (dma && dma->tx_running) {
> -		/*
> -		 * TCSANOW requests the change to occur immediately however if
> -		 * we have a TX-DMA operation in progress then it has been
> -		 * observed that it might stall and never complete. Therefore we
> -		 * delay DMA completes to prevent this hang from happen.
> -		 */
> -		priv->delayed_restore = 1;
> -		return;
> -	}
>  
>  	serial_out(up, UART_LCR, UART_LCR_CONF_MODE_B);
>  	serial_out(up, UART_EFR, UART_EFR_ECB);
> @@ -310,6 +298,7 @@ static void omap8250_restore_regs(struct uart_8250_port *up)
>  	up->port.ops->set_mctrl(&up->port, up->port.mctrl);
>  }
>  
> +static void omap_8250_dma_tx_complete(void *param);
>  /*
>   * OMAP can use "CLK / (16 or 13) / div" for baud rate. And then we have have
>   * some differences in how we want to handle flow control.
> @@ -320,6 +309,8 @@ static void omap_8250_set_termios(struct uart_port *port,
>  {
>  	struct uart_8250_port *up = up_to_u8250p(port);
>  	struct omap8250_priv *priv = up->port.private_data;
> +	struct uart_8250_dma *dma = up->dma;
> +	unsigned int complete_dma = 0;
>  	unsigned char cval = 0;
>  	unsigned int baud;
>  
> @@ -430,7 +421,7 @@ static void omap_8250_set_termios(struct uart_port *port,
>  	priv->scr = OMAP_UART_SCR_RX_TRIG_GRANU1_MASK | OMAP_UART_SCR_TX_EMPTY |
>  		OMAP_UART_SCR_TX_TRIG_GRANU1_MASK;
>  
> -	if (up->dma)
> +	if (dma)
>  		priv->scr |= OMAP_UART_SCR_DMAMODE_1 |
>  			OMAP_UART_SCR_DMAMODE_CTL;
>  
> @@ -460,6 +451,24 @@ static void omap_8250_set_termios(struct uart_port *port,
>  			priv->efr |= OMAP_UART_SW_TX;
>  		}
>  	}
> +
> +	if (dma && dma->tx_running) {
> +		/*
> +		 * TCSANOW requests the change to occur immediately, however
> +		 * if we have a TX-DMA operation in progress then it has been
> +		 * observed that it might stall and never complete. Therefore
> +		 * we wait until DMA completes to prevent this hang from
> +		 * happening.
> +		 */
> +
> +		dma->tx_running = 2;
> +
> +		spin_unlock_irq(&up->port.lock);
> +		wait_event_interruptible(priv->termios_wait,
> +					 dma->tx_running == 3);
> +		spin_lock_irq(&up->port.lock);
> +		complete_dma = 1;
> +	}
>  	omap8250_restore_regs(up);
>  
>  	spin_unlock_irq(&up->port.lock);
> @@ -475,6 +484,8 @@ static void omap_8250_set_termios(struct uart_port *port,
>  	/* Don't rewrite B0 */
>  	if (tty_termios_baud_rate(termios))
>  		tty_termios_encode_baud_rate(termios, baud, baud);
> +	if (complete_dma)
> +		omap_8250_dma_tx_complete(up);
>  }
>  
>  /* same as 8250 except that we may have extra flow bits set in EFR */
> @@ -893,17 +904,18 @@ static void omap_8250_dma_tx_complete(void *param)
>  
>  	spin_lock_irqsave(&p->port.lock, flags);
>  
> +	if (dma->tx_running == 2) {
> +		dma->tx_running = 3;
> +		wake_up(&priv->termios_wait);
> +		goto out;
> +	}
> +
>  	dma->tx_running = 0;
>  
>  	xmit->tail += dma->tx_size;
>  	xmit->tail &= UART_XMIT_SIZE - 1;
>  	p->port.icount.tx += dma->tx_size;
>  
> -	if (priv->delayed_restore) {
> -		priv->delayed_restore = 0;
> -		omap8250_restore_regs(p);
> -	}
> -
>  	if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS)
>  		uart_write_wakeup(&p->port);
>  
> @@ -923,7 +935,7 @@ static void omap_8250_dma_tx_complete(void *param)
>  		p->ier |= UART_IER_THRI;
>  		serial_port_out(&p->port, UART_IER, p->ier);
>  	}
> -
> +out:
>  	spin_unlock_irqrestore(&p->port.lock, flags);
>  }
>  
> @@ -1088,6 +1100,10 @@ static inline int omap_8250_rx_dma(struct uart_8250_port *p, unsigned int iir)
>  {
>  	return -EINVAL;
>  }
> +
> +static void omap_8250_dma_tx_complete(void *param)
> +{
> +}
>  #endif
>  
>  static int omap8250_no_handle_irq(struct uart_port *port)
> @@ -1241,6 +1257,7 @@ static int omap8250_probe(struct platform_device *pdev)
>  			priv->omap8250_dma.rx_size = RX_TRIGGER;
>  			priv->omap8250_dma.rxconf.src_maxburst = RX_TRIGGER;
>  			priv->omap8250_dma.txconf.dst_maxburst = TX_TRIGGER;
> +			init_waitqueue_head(&priv->termios_wait);
>  
>  			if (of_machine_is_compatible("ti,am33xx"))
>  				priv->habit |= OMAP_DMA_TX_KICK;
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Ogness April 11, 2016, 8:18 a.m. UTC | #4
On 2016-04-05, Peter Hurley <peter@hurleysoftware.com> wrote:
> On 03/31/2016 01:41 AM, John Ogness wrote:
>> It has been observed that the TX-DMA can stall
>
> Does this happen on any other OMAP part besides am335x?
> I looked back over the LKML history of this and didn't see
> any other design implicated in this problem.

I just ran the tests again using 4.6-rc2. I am able to reproduce the
dma-tx stall with am335x/edma and dra7/sdma.

Note: To achieve the stall, both the delayed_restore and the
      rx_dma_broken features of the mainline 8250_omap driver needed to
      be removed.

John Ogness
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hurley April 11, 2016, 5:53 p.m. UTC | #5
On 04/11/2016 01:18 AM, John Ogness wrote:
> On 2016-04-05, Peter Hurley <peter@hurleysoftware.com> wrote:
>> On 03/31/2016 01:41 AM, John Ogness wrote:
>>> It has been observed that the TX-DMA can stall
>>
>> Does this happen on any other OMAP part besides am335x?
>> I looked back over the LKML history of this and didn't see
>> any other design implicated in this problem.
> 
> I just ran the tests again using 4.6-rc2. I am able to reproduce the
> dma-tx stall with am335x/edma and dra7/sdma.

I thought we already established sdma was not to be used since
the hardware does not actually support pausing without data loss.

http://www.spinics.net/lists/linux-serial/msg18503.html


So I'm wondering if we're carrying all this extra DMA complexity
and workarounds for just am335x?


> Note: To achieve the stall, both the delayed_restore and the
>       rx_dma_broken features of the mainline 8250_omap driver needed to
>       be removed.
> 
> John Ogness
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sebastian Andrzej Siewior April 11, 2016, 6:31 p.m. UTC | #6
On 04/11/2016 07:53 PM, Peter Hurley wrote:
> On 04/11/2016 01:18 AM, John Ogness wrote:
>> On 2016-04-05, Peter Hurley <peter@hurleysoftware.com> wrote:
>>> On 03/31/2016 01:41 AM, John Ogness wrote:
>>>> It has been observed that the TX-DMA can stall
>>>
>>> Does this happen on any other OMAP part besides am335x?
>>> I looked back over the LKML history of this and didn't see
>>> any other design implicated in this problem.
>>
>> I just ran the tests again using 4.6-rc2. I am able to reproduce the
>> dma-tx stall with am335x/edma and dra7/sdma.
> 
> I thought we already established sdma was not to be used since
> the hardware does not actually support pausing without data loss.

This workaround was not invented for sdma but for edma (with am335x).

> http://www.spinics.net/lists/linux-serial/msg18503.html

This could be fixed. See
  http://www.spinics.net/lists/linux-serial/msg18517.html
  http://www.spinics.net/lists/linux-serial/msg18531.html

rmk was fine with it from what I read. So what is missing is just
refurbish the patch (update the comment according to rmk replay) and
then we could re-enable DMA again.

> So I'm wondering if we're carrying all this extra DMA complexity
> and workarounds for just am335x?
> 

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hurley April 11, 2016, 8:10 p.m. UTC | #7
On 04/11/2016 11:31 AM, Sebastian Andrzej Siewior wrote:
> On 04/11/2016 07:53 PM, Peter Hurley wrote:
>> On 04/11/2016 01:18 AM, John Ogness wrote:
>>> On 2016-04-05, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>> On 03/31/2016 01:41 AM, John Ogness wrote:
>>>>> It has been observed that the TX-DMA can stall
>>>>
>>>> Does this happen on any other OMAP part besides am335x?
>>>> I looked back over the LKML history of this and didn't see
>>>> any other design implicated in this problem.
>>>
>>> I just ran the tests again using 4.6-rc2. I am able to reproduce the
>>> dma-tx stall with am335x/edma and dra7/sdma.
>>
>> I thought we already established sdma was not to be used since
>> the hardware does not actually support pausing without data loss.
> 
> This workaround was not invented for sdma but for edma (with am335x).

According to John above, dra7/sdma requires this workaround.


>> http://www.spinics.net/lists/linux-serial/msg18503.html
> 
> This could be fixed. See
>   http://www.spinics.net/lists/linux-serial/msg18517.html
>   http://www.spinics.net/lists/linux-serial/msg18531.html
> 
> rmk was fine with it from what I read. So what is missing is just
> refurbish the patch (update the comment according to rmk replay) and
> then we could re-enable DMA again.

That's hardly all that is required.

1. edma pause returns an error if the descriptor has already been retired
   when a pause is attempted. This makes distinguishing between reporting an
   error for unsupported feature indistinguishable from a transient dma
   error that can simply be logged.
2. The question of a spurious uart interrupt with every dma transaction
   on am335x is still unanswered.
3. Handling XON/XOFF transmit is mandatory; I don't see a way to do that
   without pause/resume.
4. Since virt-dma uses tasklets which since 3.8 are no longer serviced
   in a timely manner, rx dma is unreliable, since it's often kicked out
   to regular interrupts.
5. omap dma maintenance is not keeping up with baseline dma.


IOW, omap dma has turned into one big tangle of workarounds.

Let's start with making a list of which TI designs need which workarounds.

*am335x*
- requires write to tx fifo to trigger tx dma (ie. OMAP_DMA_TX_KICK
  workaround necessitating completely different tx dma completion handler)
- requires rx dma already queued before UART data ready interrupt
  (ie., necessitates completely different irq handler and rx dma completion
  handler)
- hangs changing some unknown register if tx dma in progress
  (ie., this termios change workaround)
- generates spurious uart interrupt for every rx dma transaction
  (ie., necessitates acking every UART interrupt, even UART_IIR_NO_INT)
  _Even with this workaround_, it still generates spurious interrupt warning
  which shuts off interrupts for several ms while logging the error
  message to the console, virtually guaranteeing lost data.


Can any TI design use the baseline 8250 tx dma transaction flow without
workarounds?  I know the am335x can't; any others?
Can any TI design use the baseline 8250 rx dma transaction flow without
workarounds?  Again, I know the am335x can't; any others?



>> So I'm wondering if we're carrying all this extra DMA complexity
>> and workarounds for just am335x?
>>
> 
> Sebastian
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sebastian Andrzej Siewior April 12, 2016, 5:03 p.m. UTC | #8
On 04/11/2016 10:10 PM, Peter Hurley wrote:
> On 04/11/2016 11:31 AM, Sebastian Andrzej Siewior wrote:
>> On 04/11/2016 07:53 PM, Peter Hurley wrote:
>>> On 04/11/2016 01:18 AM, John Ogness wrote:
>>>> On 2016-04-05, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>> On 03/31/2016 01:41 AM, John Ogness wrote:
>>>>>> It has been observed that the TX-DMA can stall
>>>>>
>>>>> Does this happen on any other OMAP part besides am335x?
>>>>> I looked back over the LKML history of this and didn't see
>>>>> any other design implicated in this problem.
>>>>
>>>> I just ran the tests again using 4.6-rc2. I am able to reproduce the
>>>> dma-tx stall with am335x/edma and dra7/sdma.
>>>
>>> I thought we already established sdma was not to be used since
>>> the hardware does not actually support pausing without data loss.
>>
>> This workaround was not invented for sdma but for edma (with am335x).
> 
> According to John above, dra7/sdma requires this workaround.

It was reported by Frans Klaver against am335x
 http://lkml.kernel.org/r/20140908183353.GB4686@ci00147.xsens-tech.local

and I managed to reproduce this with his yocto image on dra7 and am335x:
 http://lkml.kernel.org/r/20140921204100.GA10111@linutronix.de

>>> http://www.spinics.net/lists/linux-serial/msg18503.html
>>
>> This could be fixed. See
>>   http://www.spinics.net/lists/linux-serial/msg18517.html
>>   http://www.spinics.net/lists/linux-serial/msg18531.html
>>
>> rmk was fine with it from what I read. So what is missing is just
>> refurbish the patch (update the comment according to rmk replay) and
>> then we could re-enable DMA again.
> 
> That's hardly all that is required.

well it would enable pause of RX transfers. TX would not work (unless
TI's HW people can confirm that it will).

> 1. edma pause returns an error if the descriptor has already been retired
>    when a pause is attempted. This makes distinguishing between reporting an
>    error for unsupported feature indistinguishable from a transient dma
>    error that can simply be logged.
> 2. The question of a spurious uart interrupt with every dma transaction
>    on am335x is still unanswered.

This is correct. If I remember correctly, the Intel people see the same
thing and I *think* John told me that the Intel manual says that RDI
should be disabled if DMA is used.

> 3. Handling XON/XOFF transmit is mandatory; I don't see a way to do that
>    without pause/resume.
Yes, not doing XON/XOFF with DMA is not good. Using hardware flow
control is one workaround but the user has no chance of knowing that
XON/XOFF has been silently disabled.

You could send the x_char after TX transfer completed. After all you
need to ensure that you have some space in the TX-FIFO. However if you
send a 4KiB of data you might want to send x_char rather sooner than
later. I *think* even with pause the hardware will complete the last
burst before stopping but is probably better than waiting for the 4KiB
to complete.

> 4. Since virt-dma uses tasklets which since 3.8 are no longer serviced
>    in a timely manner, rx dma is unreliable, since it's often kicked out
>    to regular interrupts.

Is this only the delay in omap_dma_callback() (which you don't have
!cyclic) or something else? omap_dma_issue_pending() seems to program
the transfer right away. Oh now I see the same thing in
edma_completion_handler(). Okay but this affects now everyone that
relies on low latency?

> 5. omap dma maintenance is not keeping up with baseline dma.

John switched to cylic mode so he was not effected very much non-pause
problem.

> IOW, omap dma has turned into one big tangle of workarounds.
Most of them are hardware shortcomings. I think disabling RX-DMA due to
missing pause in omap-dma is the only workaround that could be avoided
if the driver would be changed.

> Let's start with making a list of which TI designs need which workarounds.
> 
> *am335x*

I am not sure if the limitations are based on the DMA engine or the

> - requires write to tx fifo to trigger tx dma (ie. OMAP_DMA_TX_KICK
>   workaround necessitating completely different tx dma completion handler)

This one for instance I don't see on BeagleBoard xM / omap36xx and
DRA7x. Both (not affected) use SDMA instead EDMA. It would be
interesting to see if DRA7x is affected once it uses EDMA.

> - requires rx dma already queued before UART data ready interrupt
>   (ie., necessitates completely different irq handler and rx dma completion
>   handler)

true. But is this something that would work for others, too?

> - hangs changing some unknown register if tx dma in progress
>   (ie., this termios change workaround)

I think some registers are the baud-rate registers which pause engine.

> - generates spurious uart interrupt for every rx dma transaction
>   (ie., necessitates acking every UART interrupt, even UART_IIR_NO_INT)
>   _Even with this workaround_, it still generates spurious interrupt warning
>   which shuts off interrupts for several ms while logging the error
>   message to the console, virtually guaranteeing lost data.

as I wrote in my other email I think RDI should be disabled with DMA
according the Intel manual and I *think* someone here reported that
they see the same problem.

> Can any TI design use the baseline 8250 tx dma transaction flow without
> workarounds?  I know the am335x can't; any others?
Am335x. Has edma and so has dm814x. According to the code, dm814x based
HW does not need it, can this be confirmed? Sekhar, TOny?

> Can any TI design use the baseline 8250 rx dma transaction flow without
> workarounds?  Again, I know the am335x can't; any others?

Is dra7 out? Because that one needs to enqueue RX transfers asap. And
omap36xx (aka BeagleBoard-xm) as well. I don't kown anything about
later SoCs (like am437x and so on) but I would assume so.

Sebastian

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hurley April 12, 2016, 6:42 p.m. UTC | #9
On 04/12/2016 10:03 AM, Sebastian Andrzej Siewior wrote:
> On 04/11/2016 10:10 PM, Peter Hurley wrote:
>> On 04/11/2016 11:31 AM, Sebastian Andrzej Siewior wrote:
>>> On 04/11/2016 07:53 PM, Peter Hurley wrote:
>>>> On 04/11/2016 01:18 AM, John Ogness wrote:
>>>>> On 2016-04-05, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>>> On 03/31/2016 01:41 AM, John Ogness wrote:
>>>>>>> It has been observed that the TX-DMA can stall
>>>>>>
>>>>>> Does this happen on any other OMAP part besides am335x?
>>>>>> I looked back over the LKML history of this and didn't see
>>>>>> any other design implicated in this problem.
>>>>>
>>>>> I just ran the tests again using 4.6-rc2. I am able to reproduce the
>>>>> dma-tx stall with am335x/edma and dra7/sdma.
>>>>
>>>> I thought we already established sdma was not to be used since
>>>> the hardware does not actually support pausing without data loss.
>>>
>>> This workaround was not invented for sdma but for edma (with am335x).
>>
>> According to John above, dra7/sdma requires this workaround.
> 
> It was reported by Frans Klaver against am335x
>  http://lkml.kernel.org/r/20140908183353.GB4686@ci00147.xsens-tech.local
> 
> and I managed to reproduce this with his yocto image on dra7 and am335x:
>  http://lkml.kernel.org/r/20140921204100.GA10111@linutronix.de

[...]

>> - hangs changing some unknown register if tx dma in progress
>>   (ie., this termios change workaround)
> 
> I think some registers are the baud-rate registers which pause engine.

Let's back up here and focus on just this problem for now.

Since this is observable on dra7/sdma, then it's not related to
the OMAP_DMA_TX_KICK problem. IOW, dropping tx dma support for am335x
does not make this go away, which is what I was asking.

Now, if the DLL/DLH/MDR1 register writes are causing dma to stall,
then skipping those if they're unchanged should fix this, and then
pause/terminating in-progress DMA if any of these registers are being
written would be acceptable since some data loss is to be expected
when changing the baud rate without waiting.
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hurley April 12, 2016, 11:20 p.m. UTC | #10
[ +to Heikki, Andy ]

On 04/12/2016 10:03 AM, Sebastian Andrzej Siewior wrote:
> On 04/11/2016 10:10 PM, Peter Hurley wrote:
>> On 04/11/2016 11:31 AM, Sebastian Andrzej Siewior wrote:
>>> On 04/11/2016 07:53 PM, Peter Hurley wrote:

[ elided parts not relevant to shared 8250 dma discussion ]

>> 2. The question of a spurious uart interrupt with every dma transaction
>>    on am335x is still unanswered.
> 
> This is correct. If I remember correctly, the Intel people see the same
> thing and I *think* John told me that the Intel manual says that RDI
> should be disabled if DMA is used.

Not sure we're talking about the same thing here?

I mean the 100000 UART_IIR_NO_INTs that trigger irq disable, for which
John submitted a patch to ack these. I've never seen any discussion
regarding this problem on intel platforms.

An *extra* 6000+ interrupts/sec. for no purpose is bad.


>> 3. Handling XON/XOFF transmit is mandatory; I don't see a way to do that
>>    without pause/resume.
>
> Yes, not doing XON/XOFF with DMA is not good. Using hardware flow
> control is one workaround but the user has no chance of knowing that
> XON/XOFF has been silently disabled.
> 
> You could send the x_char after TX transfer completed. After all you
> need to ensure that you have some space in the TX-FIFO. However if you
> send a 4KiB of data you might want to send x_char rather sooner than
> later. I *think* even with pause the hardware will complete the last
> burst before stopping but is probably better than waiting for the 4KiB
> to complete.

Yeah, it doesn't need to be the very next byte but a better effort should
be made. 4k is lot of scroll-by.


>> 4. Since virt-dma uses tasklets which since 3.8 are no longer serviced
>>    in a timely manner, rx dma is unreliable, since it's often kicked out
>>    to regular interrupts.
> 
> Is this only the delay in omap_dma_callback() (which you don't have
> !cyclic) or something else? omap_dma_issue_pending() seems to program
> the transfer right away. Oh now I see the same thing in
> edma_completion_handler(). Okay but this affects now everyone that
> relies on low latency?

Well, the real problem is that only one rx buffer is being used serially,
first filled by the dma h/w, then emptied by the driver, then resubmitted.
This creates a gap of time between the dma h/w completion interrupt and
the resubmission where data loss is possible (and happens).

In the omap8250 rx dma flow:

1. Submit 1st rx dma transaction
2. h/w finishes 1st rx dma transaction
3. * schedule completion handler * which may be delayed significant
   amount of time ( > 1ms )
4. data still arriving at uart
5. completion handler eventually runs but too late, so the
   2nd rx dma transaction is not submitted in time to prevent data
   loss.

This rx dma flow creates a hard deadline starting at step 3 and ending at
step 5 above that Linux won't meet.

The normal 8250 dma flow has a similar gap but it seems to be less
frequent, probably because of the disparity in rx buffer size
(4k vs. 48 bytes)

In the 8250 rx dma flow:

1. Receive rx interrupt, submit 1st rx dma transaction
2. h/w finishes 1st rx dma transaction
3. * schedule completion handler * which may be delayed significant
   amount of time ( > 1ms )
4. data still arriving at uart, more rx interrupts received but
   rx_running == T so no add'l transaction scheduled
5. completion handler eventually runs but too late, so the
   2nd rx dma transaction is not submitted in time to prevent data
   loss.

Instead, both platforms should use a ping-pong scheme (or more generally
a ring of buffers): initially submit multiple rx buffers which allows
the dmaengine driver to start immediately on next buffer instead of
waiting for resubmission. Not sure if this will work though, haven't
really experimented with it at all yet.

But that's why I'd like to bring the two implementations closer, so that
maybe both can be replaced with a single rx dma transaction flow.
[ And perhaps extending tty buffer to perform direct fill, skipping the
  buffer copy ]



>> 5. omap dma maintenance is not keeping up with baseline dma.
> 
> John switched to cylic mode so he was not effected very much non-pause
> problem.

But the issue here is mainline, so it's still a problem.
[ FWIW, my main issue with John's approach was the excessive buffering. ]


[ On omap workarounds: ]
>> - requires rx dma already queued before UART data ready interrupt
>>   (ie., necessitates completely different irq handler and rx dma completion
>>   handler)
> 
> true. But is this something that would work for others, too?

Good point, I don't see why not.
Let's find out what the Intel guys think?


--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hurley April 13, 2016, midnight UTC | #11
On 04/12/2016 10:03 AM, Sebastian Andrzej Siewior wrote:
> On 04/11/2016 10:10 PM, Peter Hurley wrote:
>> On 04/11/2016 11:31 AM, Sebastian Andrzej Siewior wrote:
>>> On 04/11/2016 07:53 PM, Peter Hurley wrote:
>>>> On 04/11/2016 01:18 AM, John Ogness wrote:
>>>>> On 2016-04-05, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>>> On 03/31/2016 01:41 AM, John Ogness wrote:
>>>>>>> It has been observed that the TX-DMA can stall

[...]

>>>> http://www.spinics.net/lists/linux-serial/msg18503.html
>>>
>>> This could be fixed. See
>>>   http://www.spinics.net/lists/linux-serial/msg18517.html
>>>   http://www.spinics.net/lists/linux-serial/msg18531.html
>>>
>>> rmk was fine with it from what I read. So what is missing is just
>>> refurbish the patch (update the comment according to rmk replay) and
>>> then we could re-enable DMA again.
>>
>> That's hardly all that is required.
> 
> well it would enable pause of RX transfers.

I'm not totally convinced of that.

Does a RX timeout ensure that sdma won't try to complete the transaction
if more rx data arrives between the raising of the rx timeout interrupt
and the pausing of rx transaction?

My guess is not, which will result in lost data in normal operation.


> TX would not work (unless TI's HW people can confirm that it will).
> 
>> 1. edma pause returns an error if the descriptor has already been retired
>>    when a pause is attempted. This makes distinguishing between reporting an
>>    error for unsupported feature indistinguishable from a transient dma
>>    error that can simply be logged.

Also, this error is reported during normal operation, which is bogus.
The h/w descriptor has completed but the completion handler has not yet been
called, so the "error" is expected.

Which, laughably, is the same error code (-EINVAL) omap-dma uses to "creatively"
report it doesn't actually support pause/resume, despite what dma_get_slave_caps()
reports.

[ split spurious uart interrupts, XON/XOFF transmit, rx dma flow, and code
  code maintenance to separate email discussion ]


>> IOW, omap dma has turned into one big tangle of workarounds.
>
> Most of them are hardware shortcomings.

Totally agree. But at some point, working around h/w becomes prohibitive;
maybe not for a vendor tree, but for mainline, yes.


> I think disabling RX-DMA due to missing pause in omap-dma is the only
> workaround that could be avoided if the driver would be changed.
> 
>> Let's start with making a list of which TI designs need which workarounds.
>>
>> *am335x*
> 
> I am not sure if the limitations are based on the DMA engine or the
> 
>> - requires write to tx fifo to trigger tx dma (ie. OMAP_DMA_TX_KICK
>>   workaround necessitating completely different tx dma completion handler)
> 
> This one for instance I don't see on BeagleBoard xM / omap36xx and
> DRA7x. Both (not affected) use SDMA instead EDMA.

Ok.
A goal here would be to use serial8250_tx_dma() for these designs
(the delayed restore workaround makes that not possible right now though).

> It would be interesting to see if DRA7x is affected once it uses EDMA.

Agree; because if this problem is limited to the am335x then maybe
tx dma isn't worth doing for that design.


[ cut to "8250 dma issues" ]

[ cut to my 1st reply      ]


>> - generates spurious uart interrupt for every rx dma transaction
>>   (ie., necessitates acking every UART interrupt, even UART_IIR_NO_INT)
>>   _Even with this workaround_, it still generates spurious interrupt warning
>>   which shuts off interrupts for several ms while logging the error
>>   message to the console, virtually guaranteeing lost data.
> 
> as I wrote in my other email I think RDI should be disabled with DMA


I'll test to see if disabling RDI eliminates the UART_IIR_NO_INT spurious
interrupts.

> according the Intel manual and I *think* someone here reported that
> they see the same problem.

Let's confirm with the Intel folks that this is true, which would argue
for using the omap-style rx dma flow.



>> Can any TI design use the baseline 8250 tx dma transaction flow without
>> workarounds?  I know the am335x can't; any others?
>
> Am335x. Has edma and so has dm814x. According to the code, dm814x based
> HW does not need it, can this be confirmed? Sekhar, TOny?

Slight misunderstanding; what I mean is what TI designs can actually use
serial8250_tx_dma()?

Right now, it seems both am335x and dra7 require omap_8250_tx_dma
(for the delayed_restore workaround and the direct write to tx fifo to
kick tx dma).

[ Assume that the UART_CAP_RPM functionality is added to serial8250_tx_dma() ]



>> Can any TI design use the baseline 8250 rx dma transaction flow without
>> workarounds?  Again, I know the am335x can't; any others?
> 
> Is dra7 out? Because that one needs to enqueue RX transfers asap. And
> omap36xx (aka BeagleBoard-xm) as well. I don't kown anything about
> later SoCs (like am437x and so on) but I would assume so.

Same misunderstanding but I get what you're saying; to your knowledge
all the TI designs require the omap-style rx dma transaction flow
(ie., submit transaction before data is received, and resubmit with each
completed transaction).

One thing that I wonder about then is the initial condition when
the port is first opened; what if data is being received then?



--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sekhar Nori April 13, 2016, 11:11 a.m. UTC | #12
On Wednesday 13 April 2016 05:30 AM, Peter Hurley wrote:

>>> - generates spurious uart interrupt for every rx dma transaction
>>>   (ie., necessitates acking every UART interrupt, even UART_IIR_NO_INT)
>>>   _Even with this workaround_, it still generates spurious interrupt warning
>>>   which shuts off interrupts for several ms while logging the error
>>>   message to the console, virtually guaranteeing lost data.
>>
>> as I wrote in my other email I think RDI should be disabled with DMA
> 
> 
> I'll test to see if disabling RDI eliminates the UART_IIR_NO_INT spurious
> interrupts.
> 
>> according the Intel manual and I *think* someone here reported that
>> they see the same problem.
> 
> Let's confirm with the Intel folks that this is true, which would argue
> for using the omap-style rx dma flow.

Andy Shevchenko pointed this out here: https://lkml.org/lkml/2016/2/23/588

Regards,
Sekhar

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hurley April 14, 2016, 1:14 a.m. UTC | #13
On 04/13/2016 04:11 AM, Sekhar Nori wrote:
> On Wednesday 13 April 2016 05:30 AM, Peter Hurley wrote:
> 
>>>> - generates spurious uart interrupt for every rx dma transaction
>>>>   (ie., necessitates acking every UART interrupt, even UART_IIR_NO_INT)
>>>>   _Even with this workaround_, it still generates spurious interrupt warning
>>>>   which shuts off interrupts for several ms while logging the error
>>>>   message to the console, virtually guaranteeing lost data.
>>>
>>> as I wrote in my other email I think RDI should be disabled with DMA
>>
>>
>> I'll test to see if disabling RDI eliminates the UART_IIR_NO_INT spurious
>> interrupts.

Ok; disabling UART_IER_RDI eliminates the UART_IIR_NO_INT spurious
interrupts.

However, disabling RDI disables RX timeout as well, so data just sits in
the RX fifo with no way to get it out. AFAICT that's a showstopper.


>>> according the Intel manual and I *think* someone here reported that
>>> they see the same problem.
>>
>> Let's confirm with the Intel folks that this is true, which would argue
>> for using the omap-style rx dma flow.
> 
> Andy Shevchenko pointed this out here: https://lkml.org/lkml/2016/2/23/588

which Andy noted as well:

On 02/23/2016 08:56 AM, Andy Shevchenko wrote:
> The problem is that we have no separate bit to control timeout
> interrupts from UART.

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Cox April 14, 2016, 3:07 p.m. UTC | #14
> >> 3. Handling XON/XOFF transmit is mandatory; I don't see a way to do that
> >>    without pause/resume.  
> >
> > Yes, not doing XON/XOFF with DMA is not good. Using hardware flow
> > control is one workaround but the user has no chance of knowing that
> > XON/XOFF has been silently disabled.

You can clear the bits in the termios when the termios is set and the
application *should* interpret that as not supported. I doubt many
applications do for the XON/XOFF case. Equally you can just say that soft
flow control turns off DMA or reduces buffering depending upon the data
rate. We have plenty of hardware in the kernel that is more optimal in
some configurations than others.

This also shouldn't be about whether 4K is a lot - it's about time to
respond. Thus the _time_ latency of getting the ^S/^Q out is what matters
at higher rates. At low speed (1200-9600 etc) you want to be able to
respond within a few characters because the chances are the device the
other end is not very bright.

> > the transfer right away. Oh now I see the same thing in
> > edma_completion_handler(). Okay but this affects now everyone that
> > relies on low latency?  
> 
> Well, the real problem is that only one rx buffer is being used serially,
> first filled by the dma h/w, then emptied by the driver, then resubmitted.
> This creates a gap of time between the dma h/w completion interrupt and
> the resubmission where data loss is possible (and happens).

Most low latency users are concerned about the latency between transmit
and receive. The usual case is windowless protocols like firmware
downloaders. For higher speed that tends to be driven by the DMA
timeouts, for lower baud rates you can perhaps mitigate this by using
chains of very small buffers or just turning off DMA just as we turn off
some of the FIFOs at very low speed ?

> But that's why I'd like to bring the two implementations closer, so that
> maybe both can be replaced with a single rx dma transaction flow.
> [ And perhaps extending tty buffer to perform direct fill, skipping the
>   buffer copy ]

For the general case what IMHO is needed is probably not a direct fill of
the tty buffer (which is surprisingly locking hard - we used to have one
but it was broken) but rather a fastpath around it. With the specific
exception of N_TTY I think every single other line discipline we have is
capable of accepting a pointer and length to a block of data that ceases
to be valid the moment the function returns. All the networking ones
certainly are and it would speed up the usual culprits (3G modems over
USB, bluetooth over onboard 3.3v uart etc). 

So a way to call

	port->fast_rx(data, flags, len);

with a rule that you never mix fast and tty buffers, and with an atomic
swap of port->fast_rx between tty_buffer queueing logic, discard and
ldisc->fast_rx pointers done when the ldisc is set or changes.


There are very few cases where n_tty is the one that needs the optimized
path: uucp died a long time ago 8)

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hurley April 14, 2016, 4:03 p.m. UTC | #15
On 04/12/2016 11:42 AM, Peter Hurley wrote:
> On 04/12/2016 10:03 AM, Sebastian Andrzej Siewior wrote:
>> On 04/11/2016 10:10 PM, Peter Hurley wrote:
>>> On 04/11/2016 11:31 AM, Sebastian Andrzej Siewior wrote:
>>>> On 04/11/2016 07:53 PM, Peter Hurley wrote:
>>>>> On 04/11/2016 01:18 AM, John Ogness wrote:
>>>>>> On 2016-04-05, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>>>> On 03/31/2016 01:41 AM, John Ogness wrote:
>>>>>>>> It has been observed that the TX-DMA can stall
>>>>>>>
>>>>>>> Does this happen on any other OMAP part besides am335x?
>>>>>>> I looked back over the LKML history of this and didn't see
>>>>>>> any other design implicated in this problem.
>>>>>>
>>>>>> I just ran the tests again using 4.6-rc2. I am able to reproduce the
>>>>>> dma-tx stall with am335x/edma and dra7/sdma.

What "test" is reproducing this problem?


>>>>> I thought we already established sdma was not to be used since
>>>>> the hardware does not actually support pausing without data loss.
>>>>
>>>> This workaround was not invented for sdma but for edma (with am335x).
>>>
>>> According to John above, dra7/sdma requires this workaround.
>>
>> It was reported by Frans Klaver against am335x
>>  http://lkml.kernel.org/r/20140908183353.GB4686@ci00147.xsens-tech.local
>>
>> and I managed to reproduce this with his yocto image on dra7 and am335x:
>>  http://lkml.kernel.org/r/20140921204100.GA10111@linutronix.de

I overlooked Sebastian's statement from that message:

This includes
changing the baudrate (not by yocto but the driver sets it to 0 and then
to the requested one) and this seems to be responsible for the "bad
bytes".

What is changing the baud rate to 0? How about get a stack trace for that?



> [...]
> 
>>> - hangs changing some unknown register if tx dma in progress
>>>   (ie., this termios change workaround)
>>
>> I think some registers are the baud-rate registers which pause engine.
> 
> Let's back up here and focus on just this problem for now.
> 
> Since this is observable on dra7/sdma, then it's not related to
> the OMAP_DMA_TX_KICK problem. IOW, dropping tx dma support for am335x
> does not make this go away, which is what I was asking.
> 
> Now, if the DLL/DLH/MDR1 register writes are causing dma to stall,
> then skipping those if they're unchanged should fix this, and then
> pause/terminating in-progress DMA if any of these registers are being
> written would be acceptable since some data loss is to be expected
> when changing the baud rate without waiting.
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hurley April 14, 2016, 5:54 p.m. UTC | #16
On 04/14/2016 08:07 AM, One Thousand Gnomes wrote:
>>>> 3. Handling XON/XOFF transmit is mandatory; I don't see a way to do that
>>>>    without pause/resume.  
>>>
>>> Yes, not doing XON/XOFF with DMA is not good. Using hardware flow
>>> control is one workaround but the user has no chance of knowing that
>>> XON/XOFF has been silently disabled.
> 
> You can clear the bits in the termios when the termios is set and the
> application *should* interpret that as not supported. I doubt many
> applications do for the XON/XOFF case. Equally you can just say that soft
> flow control turns off DMA or reduces buffering depending upon the data
> rate. We have plenty of hardware in the kernel that is more optimal in
> some configurations than others.
> 
> This also shouldn't be about whether 4K is a lot - it's about time to
> respond. Thus the _time_ latency of getting the ^S/^Q out is what matters
> at higher rates. At low speed (1200-9600 etc) you want to be able to
> respond within a few characters because the chances are the device the
> other end is not very bright.

Currently, tcflow(TCIOFF/TCION) ignores the flow control modes and always
sends XON/XOFF, so disabling DMA based on flow control mode won't work.


>>> the transfer right away. Oh now I see the same thing in
>>> edma_completion_handler(). Okay but this affects now everyone that
>>> relies on low latency?  
>>
>> Well, the real problem is that only one rx buffer is being used serially,
>> first filled by the dma h/w, then emptied by the driver, then resubmitted.
>> This creates a gap of time between the dma h/w completion interrupt and
>> the resubmission where data loss is possible (and happens).
> 
> Most low latency users are concerned about the latency between transmit
> and receive.

Sebastian really confused this problem by relating it to low latency.
That's not the problem.

The problem is that softirqs and tasklets are not reliable bottom halves
anymore, because since 3.8 they don't always run when interrupts are
re-enabled. They can now be deferred to the lowest priority process,
ksoftirqd.

For a uart trying not to overrun a 64-byte fifo at 3Mbaud creates a hard
213us deadline for the uart driver to give the dmaengine driver the
_next_ RX buffer to fill. Right now, the 8250 driver is only using 1
RX buffer, so until the DMA completion handler runs (in tasklet), the
8250 driver doesn't supply the dmaengine driver a new one.



> The usual case is windowless protocols like firmware
> downloaders. For higher speed that tends to be driven by the DMA
> timeouts, for lower baud rates you can perhaps mitigate this by using
> chains of very small buffers or just turning off DMA just as we turn off
> some of the FIFOs at very low speed ?
> 
>> But that's why I'd like to bring the two implementations closer, so that
>> maybe both can be replaced with a single rx dma transaction flow.
>> [ And perhaps extending tty buffer to perform direct fill, skipping the
>>   buffer copy ]
> 
> For the general case what IMHO is needed is probably not a direct fill of
> the tty buffer (which is surprisingly locking hard - we used to have one
> but it was broken) but rather a fastpath around it. With the specific
> exception of N_TTY I think every single other line discipline we have is
> capable of accepting a pointer and length to a block of data that ceases
> to be valid the moment the function returns. All the networking ones
> certainly are and it would speed up the usual culprits (3G modems over
> USB, bluetooth over onboard 3.3v uart etc). 
> 
> So a way to call
> 
> 	port->fast_rx(data, flags, len);
> 
> with a rule that you never mix fast and tty buffers, and with an atomic
> swap of port->fast_rx between tty_buffer queueing logic, discard and
> ldisc->fast_rx pointers done when the ldisc is set or changes.

Yes, that's one possible alternative. The drawback is that makes
testing more difficult, but maybe the tradeoff would be worth it.

Regards,
Peter Hurley


> There are very few cases where n_tty is the one that needs the optimized
> path: uucp died a long time ago 8)
> 
> Alan
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vignesh Raghavendra May 3, 2016, noon UTC | #17
Hi,

On 04/12/2016 10:33 PM, Sebastian Andrzej Siewior wrote:
> On 04/11/2016 10:10 PM, Peter Hurley wrote:
>> On 04/11/2016 11:31 AM, Sebastian Andrzej Siewior wrote:
>>> On 04/11/2016 07:53 PM, Peter Hurley wrote:
>>>> On 04/11/2016 01:18 AM, John Ogness wrote:
>>>>> On 2016-04-05, Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>>> On 03/31/2016 01:41 AM, John Ogness wrote:

[...]

>> *am335x*
> 
> I am not sure if the limitations are based on the DMA engine or the
> 
>> - requires write to tx fifo to trigger tx dma (ie. OMAP_DMA_TX_KICK
>>   workaround necessitating completely different tx dma completion handler)
> 
> This one for instance I don't see on BeagleBoard xM / omap36xx and
> DRA7x. Both (not affected) use SDMA instead EDMA. It would be
> interesting to see if DRA7x is affected once it uses EDMA.
> 

8250 UART DMA works fine with EDMA as the DMA engine.
As per my testing, am335x + EDMA, am437x + EDMA, dra7xx + EDMA and
dra72x + EDMA require OMAP_DMA_TX_KICK quirk. However when using SDMA
OMAP_DMA_TX_KICK is not needed (I tested on dra7xx platform). So, it can
be concluded that when using EDMA with 8250 UART tx kick is needed.

Patch
diff mbox

diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
index 6f76051..9459b4d 100644
--- a/drivers/tty/serial/8250/8250_omap.c
+++ b/drivers/tty/serial/8250/8250_omap.c
@@ -100,9 +100,9 @@  struct omap8250_priv {
 	u8 wer;
 	u8 xon;
 	u8 xoff;
-	u8 delayed_restore;
 	u16 quot;
 
+	wait_queue_head_t termios_wait;
 	bool is_suspending;
 	int wakeirq;
 	int wakeups_enabled;
@@ -257,18 +257,6 @@  static void omap8250_update_mdr1(struct uart_8250_port *up,
 static void omap8250_restore_regs(struct uart_8250_port *up)
 {
 	struct omap8250_priv *priv = up->port.private_data;
-	struct uart_8250_dma	*dma = up->dma;
-
-	if (dma && dma->tx_running) {
-		/*
-		 * TCSANOW requests the change to occur immediately however if
-		 * we have a TX-DMA operation in progress then it has been
-		 * observed that it might stall and never complete. Therefore we
-		 * delay DMA completes to prevent this hang from happen.
-		 */
-		priv->delayed_restore = 1;
-		return;
-	}
 
 	serial_out(up, UART_LCR, UART_LCR_CONF_MODE_B);
 	serial_out(up, UART_EFR, UART_EFR_ECB);
@@ -310,6 +298,7 @@  static void omap8250_restore_regs(struct uart_8250_port *up)
 	up->port.ops->set_mctrl(&up->port, up->port.mctrl);
 }
 
+static void omap_8250_dma_tx_complete(void *param);
 /*
  * OMAP can use "CLK / (16 or 13) / div" for baud rate. And then we have have
  * some differences in how we want to handle flow control.
@@ -320,6 +309,8 @@  static void omap_8250_set_termios(struct uart_port *port,
 {
 	struct uart_8250_port *up = up_to_u8250p(port);
 	struct omap8250_priv *priv = up->port.private_data;
+	struct uart_8250_dma *dma = up->dma;
+	unsigned int complete_dma = 0;
 	unsigned char cval = 0;
 	unsigned int baud;
 
@@ -430,7 +421,7 @@  static void omap_8250_set_termios(struct uart_port *port,
 	priv->scr = OMAP_UART_SCR_RX_TRIG_GRANU1_MASK | OMAP_UART_SCR_TX_EMPTY |
 		OMAP_UART_SCR_TX_TRIG_GRANU1_MASK;
 
-	if (up->dma)
+	if (dma)
 		priv->scr |= OMAP_UART_SCR_DMAMODE_1 |
 			OMAP_UART_SCR_DMAMODE_CTL;
 
@@ -460,6 +451,24 @@  static void omap_8250_set_termios(struct uart_port *port,
 			priv->efr |= OMAP_UART_SW_TX;
 		}
 	}
+
+	if (dma && dma->tx_running) {
+		/*
+		 * TCSANOW requests the change to occur immediately, however
+		 * if we have a TX-DMA operation in progress then it has been
+		 * observed that it might stall and never complete. Therefore
+		 * we wait until DMA completes to prevent this hang from
+		 * happening.
+		 */
+
+		dma->tx_running = 2;
+
+		spin_unlock_irq(&up->port.lock);
+		wait_event_interruptible(priv->termios_wait,
+					 dma->tx_running == 3);
+		spin_lock_irq(&up->port.lock);
+		complete_dma = 1;
+	}
 	omap8250_restore_regs(up);
 
 	spin_unlock_irq(&up->port.lock);
@@ -475,6 +484,8 @@  static void omap_8250_set_termios(struct uart_port *port,
 	/* Don't rewrite B0 */
 	if (tty_termios_baud_rate(termios))
 		tty_termios_encode_baud_rate(termios, baud, baud);
+	if (complete_dma)
+		omap_8250_dma_tx_complete(up);
 }
 
 /* same as 8250 except that we may have extra flow bits set in EFR */
@@ -893,17 +904,18 @@  static void omap_8250_dma_tx_complete(void *param)
 
 	spin_lock_irqsave(&p->port.lock, flags);
 
+	if (dma->tx_running == 2) {
+		dma->tx_running = 3;
+		wake_up(&priv->termios_wait);
+		goto out;
+	}
+
 	dma->tx_running = 0;
 
 	xmit->tail += dma->tx_size;
 	xmit->tail &= UART_XMIT_SIZE - 1;
 	p->port.icount.tx += dma->tx_size;
 
-	if (priv->delayed_restore) {
-		priv->delayed_restore = 0;
-		omap8250_restore_regs(p);
-	}
-
 	if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS)
 		uart_write_wakeup(&p->port);
 
@@ -923,7 +935,7 @@  static void omap_8250_dma_tx_complete(void *param)
 		p->ier |= UART_IER_THRI;
 		serial_port_out(&p->port, UART_IER, p->ier);
 	}
-
+out:
 	spin_unlock_irqrestore(&p->port.lock, flags);
 }
 
@@ -1088,6 +1100,10 @@  static inline int omap_8250_rx_dma(struct uart_8250_port *p, unsigned int iir)
 {
 	return -EINVAL;
 }
+
+static void omap_8250_dma_tx_complete(void *param)
+{
+}
 #endif
 
 static int omap8250_no_handle_irq(struct uart_port *port)
@@ -1241,6 +1257,7 @@  static int omap8250_probe(struct platform_device *pdev)
 			priv->omap8250_dma.rx_size = RX_TRIGGER;
 			priv->omap8250_dma.rxconf.src_maxburst = RX_TRIGGER;
 			priv->omap8250_dma.txconf.dst_maxburst = TX_TRIGGER;
+			init_waitqueue_head(&priv->termios_wait);
 
 			if (of_machine_is_compatible("ti,am33xx"))
 				priv->habit |= OMAP_DMA_TX_KICK;