diff mbox

Revert "OMAP/serial: Fix incorrect Rx FIFO threshold setting, LSR validation on Tx, and Tx FIFO IRQ generation"

Message ID alpine.DEB.2.00.1304010519260.28273@utopia.booyaka.com (mailing list archive)
State New, archived
Headers show

Commit Message

Paul Walmsley April 1, 2013, 5:25 a.m. UTC
This reverts commit 1776fd059c40907297d6c26c51876575d63fd9e2.

Commit 1776fd059c40 causes UART sluggishness on the OMAP37xx EVM.
This can be demonstrated by pasting in a ten-character string, like
"ffffffffff", at the serial console.  The string will be echoed back
two to three characters at a time, with about a one-second pause
between groups.  This causes one of the OMAP PM validation tests to
time out:

http://www.pwsan.com/omap/testlogs/test_v3.9-rc5/20130331205513/pm/37xxevm/37xxevm_log.txt

With commit 1776fd059c40 reverted, the test succeeds:

http://www.pwsan.com/omap/testlogs/serial_fix_37xx_v3.9-rc/20130331230517/pm/37xxevm/37xxevm_log.txt

This problem has been present since v3.9-rc1, but was incorrectly 
suspected to be due to commit 6aa9707099c4b25700940eb3d016f16c4434360d 
("lockdep: check that no locks held at freeze time"), due to limitations
in my local testbed.

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Alexey Pelykh <alexey.pelykh@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---

Alexey, care to make another attempt for 3.11 at fixing the issue you were 
originally debugging, but one that also works on OMAP37xx EVM?  It would 
be good to cc the <linux-omap@vger.kernel.org> on your OMAP work.  If you 
don't have a 37xx EVM, someone there might be able to help you test.

 drivers/tty/serial/omap-serial.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Alexey Pelykh April 1, 2013, 8:31 a.m. UTC | #1
Actually, I've tested my patch on DM3730, and, at least, can prove
that original settings of UART are incorrect according to TRM of
processor. What settings of UART you were using to reproduce issue?
I'd like to kindly ask you to describe your test environment, since
I've never experiences issues that you've described nor in debug
console, nor in regular UART usage.

On Mon, Apr 1, 2013 at 8:25 AM, Paul Walmsley <paul@pwsan.com> wrote:
>
> This reverts commit 1776fd059c40907297d6c26c51876575d63fd9e2.
>
> Commit 1776fd059c40 causes UART sluggishness on the OMAP37xx EVM.
> This can be demonstrated by pasting in a ten-character string, like
> "ffffffffff", at the serial console.  The string will be echoed back
> two to three characters at a time, with about a one-second pause
> between groups.  This causes one of the OMAP PM validation tests to
> time out:
>
> http://www.pwsan.com/omap/testlogs/test_v3.9-rc5/20130331205513/pm/37xxevm/37xxevm_log.txt
>
> With commit 1776fd059c40 reverted, the test succeeds:
>
> http://www.pwsan.com/omap/testlogs/serial_fix_37xx_v3.9-rc/20130331230517/pm/37xxevm/37xxevm_log.txt
>
> This problem has been present since v3.9-rc1, but was incorrectly
> suspected to be due to commit 6aa9707099c4b25700940eb3d016f16c4434360d
> ("lockdep: check that no locks held at freeze time"), due to limitations
> in my local testbed.
>
> Signed-off-by: Paul Walmsley <paul@pwsan.com>
> Cc: Alexey Pelykh <alexey.pelykh@gmail.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>
> Alexey, care to make another attempt for 3.11 at fixing the issue you were
> originally debugging, but one that also works on OMAP37xx EVM?  It would
> be good to cc the <linux-omap@vger.kernel.org> on your OMAP work.  If you
> don't have a 37xx EVM, someone there might be able to help you test.
>
>  drivers/tty/serial/omap-serial.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
> index 4dc4140..9915e4d 100644
> --- a/drivers/tty/serial/omap-serial.c
> +++ b/drivers/tty/serial/omap-serial.c
> @@ -59,7 +59,6 @@
>
>  /* SCR register bitmasks */
>  #define OMAP_UART_SCR_RX_TRIG_GRANU1_MASK              (1 << 7)
> -#define OMAP_UART_SCR_TX_TRIG_GRANU1_MASK              (1 << 6)
>  #define OMAP_UART_SCR_TX_EMPTY                 (1 << 3)
>
>  /* FCR register bitmasks */
> @@ -321,6 +320,9 @@ static void transmit_chars(struct uart_omap_port *up, unsigned int lsr)
>         struct circ_buf *xmit = &up->port.state->xmit;
>         int count;
>
> +       if (!(lsr & UART_LSR_THRE))
> +               return;
> +
>         if (up->port.x_char) {
>                 serial_out(up, UART_TX, up->port.x_char);
>                 up->port.icount.tx++;
> @@ -862,7 +864,7 @@ serial_omap_set_termios(struct uart_port *port, struct ktermios *termios,
>         serial_out(up, UART_IER, up->ier);
>         serial_out(up, UART_LCR, cval);         /* reset DLAB */
>         up->lcr = cval;
> -       up->scr = 0;
> +       up->scr = OMAP_UART_SCR_TX_EMPTY;
>
>         /* FIFOs and DMA Settings */
>
> @@ -886,6 +888,8 @@ serial_omap_set_termios(struct uart_port *port, struct ktermios *termios,
>         serial_out(up, UART_MCR, up->mcr | UART_MCR_TCRTLR);
>         /* FIFO ENABLE, DMA MODE */
>
> +       up->scr |= OMAP_UART_SCR_RX_TRIG_GRANU1_MASK;
> +
>         /* Set receive FIFO threshold to 16 characters and
>          * transmit FIFO threshold to 16 spaces
>          */
> --
> 1.7.10.4
>
Paul Walmsley April 1, 2013, 9:13 a.m. UTC | #2
Hi

On Mon, 1 Apr 2013, Alexey Pelykh wrote:

> Actually, I've tested my patch on DM3730,

What board, bootloader, and test steps did you use?  Can you post a dmesg?

> and, at least, can prove that original settings of UART are incorrect 
> according to TRM of processor. 

The TRM could be buggy or wrong; or your patch could be correct as far as 
the UART goes, but could be triggering a separate bug.  We're reaching the 
end of the v3.9-rc fixes period, so we need to deal with the v3.9-rc 
regression quickly so folks don't wind up with a broken v3.9 kernel.  
Then the issue can be debugged and tested, and the revised fix targeted 
for the v3.11 kernel.

> What settings of UART you were using to reproduce issue? I'd like to 
> kindly ask you to describe your test environment, since I've never 
> experiences issues that you've described nor in debug console, nor in 
> regular UART usage.

Could you be more specific about what information you're looking for, 
beyond what's in:

http://www.pwsan.com/omap/testlogs/test_v3.9-rc5/20130331205513/pm/37xxevm/37xxevm_log.txt 

?  As you can see from the log, it's using UART1 at 115kbps -- init=/bin/sh, 
so no userspace to speak of.

It's also worth mentioning that the 3730 Beagle XM here doesn't fail the 
PM test, UART3 at 230kbps in a "full" userspace:

http://www.pwsan.com/omap/testlogs/test_v3.9-rc5/20130331205513/pm/3730beaglexm/3730beaglexm_log.txt


- Paul
Alexey Pelykh April 1, 2013, 10:01 a.m. UTC | #3
Hi,

On Mon, Apr 1, 2013 at 12:13 PM, Paul Walmsley <paul@pwsan.com> wrote:
> Hi
>
> On Mon, 1 Apr 2013, Alexey Pelykh wrote:
>
>> Actually, I've tested my patch on DM3730,
>
> What board, bootloader, and test steps did you use?  Can you post a dmesg?

I've used LogicPD Torpedo Wireless SoM + DevKit board. Bootloader is
stock from their BSP, u-boot 2011.06 + patches from LogicPD.
I can provide dmesg of 3.8(inital kernel this patch was intented for)
a bit later during the day, and a bit more later same patch rebased
onto latest sources.
Also, I should mention that LogicPD does not support 3.8, so most
LogicPD-specific devices are not working.

And I'd like to point out that after patch gaps I've seen are not gone
completely, ~1 packet out of 200 at 1mbaud is still corrupted, but
only when PM is enabled in kernel,
what gives me a clue that it must be something in PM also. Maybe my
patch just reveals PM-specific issue. Have you tried without PM to
reproduce issues you're experiencing?

>
>> and, at least, can prove that original settings of UART are incorrect
>> according to TRM of processor.
>
> The TRM could be buggy or wrong; or your patch could be correct as far as
> the UART goes, but could be triggering a separate bug.  We're reaching the
> end of the v3.9-rc fixes period, so we need to deal with the v3.9-rc
> regression quickly so folks don't wind up with a broken v3.9 kernel.
> Then the issue can be debugged and tested, and the revised fix targeted
> for the v3.11 kernel.

Indeed, TRMs appear to contain wrong information, but I believe that
in this particular case, bug is somewhere around, but not in these
lines.
I believe if you take a look on related TRM pages, it seems quite
logical that configuration of SCR (before my patch) is not quite
proper:
 - trigger interrupt only when it's entirely empty, not when threshold
level is reached, what is incorrect, in my opinion.
 - (less influencing) Granularity of counter indicates opposite to
what is stated in comments

These issues may have never been experienced by others, since
originally I've hit them on 1mbaud speed, what is not supported by
original driver at all.

>
>> What settings of UART you were using to reproduce issue? I'd like to
>> kindly ask you to describe your test environment, since I've never
>> experiences issues that you've described nor in debug console, nor in
>> regular UART usage.
>
> Could you be more specific about what information you're looking for,
> beyond what's in:
>
> http://www.pwsan.com/omap/testlogs/test_v3.9-rc5/20130331205513/pm/37xxevm/37xxevm_log.txt
>
> ?  As you can see from the log, it's using UART1 at 115kbps -- init=/bin/sh,
> so no userspace to speak of.
>

Sorry, I've missed that file at all. After patch I've never expericed
any issues on 115200 speed of debug console, and it works flawlessly
for me.

> It's also worth mentioning that the 3730 Beagle XM here doesn't fail the
> PM test, UART3 at 230kbps in a "full" userspace:
>
> http://www.pwsan.com/omap/testlogs/test_v3.9-rc5/20130331205513/pm/3730beaglexm/3730beaglexm_log.txt
>
>
> - Paul
Paul Walmsley April 1, 2013, 10:19 a.m. UTC | #4
cc Kevin, Felipe

Hi

On Mon, 1 Apr 2013, Alexey Pelykh wrote:

> On Mon, Apr 1, 2013 at 12:13 PM, Paul Walmsley <paul@pwsan.com> wrote:
> > On Mon, 1 Apr 2013, Alexey Pelykh wrote:
> >
> >> Actually, I've tested my patch on DM3730,
> >
> > What board, bootloader, and test steps did you use?  Can you post a dmesg?
> 
> I've used LogicPD Torpedo Wireless SoM + DevKit board. Bootloader is
> stock from their BSP, u-boot 2011.06 + patches from LogicPD.
> I can provide dmesg of 3.8(inital kernel this patch was intented for)
> a bit later during the day, and a bit more later same patch rebased
> onto latest sources.
> Also, I should mention that LogicPD does not support 3.8, so most
> LogicPD-specific devices are not working.

A dmesg from v3.9-rc5 would be ideal, but sounds like it's a non-mainline 
kernel?

Are you using a serial console on that board, and if so, what UART is it 
on?

> And I'd like to point out that after patch gaps I've seen are not gone
> completely, ~1 packet out of 200 at 1mbaud is still corrupted, but
> only when PM is enabled in kernel,

Are you seeing corruption caused by the OMAP UART's receive path, or by 
the transmit path?

When you say "only when PM is enabled" do you mean by enabling CONFIG_PM, 
or do you mean after setting an autosuspend timeout on the serial drivers, 
or something else?

> what gives me a clue that it must be something in PM also. Maybe my
> patch just reveals PM-specific issue. Have you tried without PM to
> reproduce issues you're experiencing?

So I'm using omap2plus_defconfig, which has CONFIG_PM=y, but the issues 
appear before serial autosuspend is enabled.

> Indeed, TRMs appear to contain wrong information, but I believe that
> in this particular case, bug is somewhere around, but not in these
> lines.
> I believe if you take a look on related TRM pages, it seems quite
> logical that configuration of SCR (before my patch) is not quite
> proper:
>  - trigger interrupt only when it's entirely empty, not when threshold
> level is reached, what is incorrect, in my opinion.
>  - (less influencing) Granularity of counter indicates opposite to
> what is stated in comments
> 
> These issues may have never been experienced by others, since
> originally I've hit them on 1mbaud speed, what is not supported by
> original driver at all.

It wouldn't surprise me at all if something is wrong with the 
original driver's FIFO setup.  The challenge at this point is to keep 
the driver working while trying to fix what's wrong with it.


- Paul
Alexey Pelykh April 1, 2013, 10:45 a.m. UTC | #5
On Mon, Apr 1, 2013 at 1:19 PM, Paul Walmsley <paul@pwsan.com> wrote:
> cc Kevin, Felipe
>
> Hi
>
> On Mon, 1 Apr 2013, Alexey Pelykh wrote:
>
>> On Mon, Apr 1, 2013 at 12:13 PM, Paul Walmsley <paul@pwsan.com> wrote:
>> > On Mon, 1 Apr 2013, Alexey Pelykh wrote:
>> >
>> >> Actually, I've tested my patch on DM3730,
>> >
>> > What board, bootloader, and test steps did you use?  Can you post a dmesg?
>>
>> I've used LogicPD Torpedo Wireless SoM + DevKit board. Bootloader is
>> stock from their BSP, u-boot 2011.06 + patches from LogicPD.
>> I can provide dmesg of 3.8(inital kernel this patch was intented for)
>> a bit later during the day, and a bit more later same patch rebased
>> onto latest sources.
>> Also, I should mention that LogicPD does not support 3.8, so most
>> LogicPD-specific devices are not working.
>
> A dmesg from v3.9-rc5 would be ideal, but sounds like it's a non-mainline
> kernel?

It's true, mainline kernel won't run on that board out of the box, but
patches to
support it go only in mach-types.h and board-omap3logic.c

>
> Are you using a serial console on that board, and if so, what UART is it
> on?

Yes, it's serial console port on UART0 (console=ttyO0,115200n8)

>
>> And I'd like to point out that after patch gaps I've seen are not gone
>> completely, ~1 packet out of 200 at 1mbaud is still corrupted, but
>> only when PM is enabled in kernel,
>
> Are you seeing corruption caused by the OMAP UART's receive path, or by
> the transmit path?

Corruption is on Tx path. Without my patch (but with patch to allow 1Mbaud):
Interrupt to refill Tx FIFO is raised only after entire FIFO is
transmitted (that is, no data left in FIFO to be sent),
and it takes kernel 25us to handle interrupt and refill FIFO. Issue is
in condition when UART triggers interrupt.
According to original sources, this interrupt is configured to be
raised only when TX FIFO is (fully) empty, while
ignoring Tx FIFO threshold level completely, although it's being
(almost) properly configured and enabled.
So, what my patch changes is basically when interrupt to refill Tx
FIFO is raised:
 - from "when empty" to "when threshold is reached", refs:
OMAP_UART_SCR_TX_EMPTY flag
   should not be set, and UART_LSR_THRE check is also invalid, since
blocks threshold level situation.
 - setting OMAP_UART_SCR_RX_TRIG_GRANU1_MASK will lead to incorrect
interpretation
   of threshold level settings (inconsistent with comments in driver
sources itself).

>
> When you say "only when PM is enabled" do you mean by enabling CONFIG_PM,
> or do you mean after setting an autosuspend timeout on the serial drivers,
> or something else?
>

I mean that I've seen gap issues both with CONFIG_PM=y and CONFIG_PM=n
After my patch I saw gap issues only several times, and _probably_
that is due to CONFIG_PM=y,
since I've never seen them after that with CONFIG_PM=n

I should point that tests I run are completely user-space, so I've
never tested UARTs
using oscilloscope during kernel boot. But, no issues in serial
console seen, as I've
already pointed out.

>> what gives me a clue that it must be something in PM also. Maybe my
>> patch just reveals PM-specific issue. Have you tried without PM to
>> reproduce issues you're experiencing?
>
> So I'm using omap2plus_defconfig, which has CONFIG_PM=y, but the issues
> appear before serial autosuspend is enabled.
>
>> Indeed, TRMs appear to contain wrong information, but I believe that
>> in this particular case, bug is somewhere around, but not in these
>> lines.
>> I believe if you take a look on related TRM pages, it seems quite
>> logical that configuration of SCR (before my patch) is not quite
>> proper:
>>  - trigger interrupt only when it's entirely empty, not when threshold
>> level is reached, what is incorrect, in my opinion.
>>  - (less influencing) Granularity of counter indicates opposite to
>> what is stated in comments
>>
>> These issues may have never been experienced by others, since
>> originally I've hit them on 1mbaud speed, what is not supported by
>> original driver at all.
>
> It wouldn't surprise me at all if something is wrong with the
> original driver's FIFO setup.  The challenge at this point is to keep
> the driver working while trying to fix what's wrong with it.
>
>
> - Paul
Alexey Pelykh April 1, 2013, 11:18 a.m. UTC | #6
Paul,

Please, correct me if I'm wrong, you're experiencing issues on Tx, or
Rx direction? You've said that characters are echoed incorrectly, but
haven't mentioned in what direction issue appears?
For reference, I'll quote TI DM3730 Rev.R TRM:
TRM page 2953, LSR_REG
Bit #5: TX_FIFO_E
 0x0 Transmit hold register (TX FIFO) is not empty.
 0x1 Transmit hold register (TX FIFO) is empty. The transmission is
not necessarily complete.

Before my patch: there was a check for TX_FIFO_E to be 0x1 before
filling Tx FIFO
After: If THR interrupt is raised, then there is a space in FIFO and
threshold has reached it's level, no extra check is required

TRM page 2972, SCR_REG
Bit #4: TX_EMPTY_CTL_IT
 0x1 The THR interrupt is generated when TX FIFO and TX shift register
are empty.
 0x0 Normal mode for THR interrupt (see Table 19-33 for details about
UART mode interrupts)

Before: THR interrput is raised only when TX FIFO and TX shift
register are empty, for shortcut "all Tx lane empty"
After: THR interrupt is raised both when Tx lane is empty, or when
size of data in Tx lane falls below Tx threshold level

Bit #7: RX_TRIG_GRANU1
 0x0 Disables the granularity of 1 for TRIGGER RX level
 0x1 Enables the granularity of 1 for TRIGGER RX level

Before: RX FIFO Trigger Level "Defined by the concatenated value of
RX_FIFO_TRIG_DMA and
RX_FIFO_TRIG (from 1 to 63 characters with a granularity of 1 character).
Note: The combination of RX_FIFO_TRIG_DMA = 0x0 and RX_FIFO_TRIG =
0x0 (all zeros) is not supported (minimum 1 character required). All
zeros result
in unpredictable behavior."

After: RX FIFO Trigger Level "Defined by the UARTi.FCR_REG[7:6]
RX_FIFO_TRIG field (8,16, 56, or 60
characters)" (as it should be according to comments in source code)

On Mon, Apr 1, 2013 at 1:45 PM, Alexey Pelykh <alexey.pelykh@gmail.com> wrote:
> On Mon, Apr 1, 2013 at 1:19 PM, Paul Walmsley <paul@pwsan.com> wrote:
>> cc Kevin, Felipe
>>
>> Hi
>>
>> On Mon, 1 Apr 2013, Alexey Pelykh wrote:
>>
>>> On Mon, Apr 1, 2013 at 12:13 PM, Paul Walmsley <paul@pwsan.com> wrote:
>>> > On Mon, 1 Apr 2013, Alexey Pelykh wrote:
>>> >
>>> >> Actually, I've tested my patch on DM3730,
>>> >
>>> > What board, bootloader, and test steps did you use?  Can you post a dmesg?
>>>
>>> I've used LogicPD Torpedo Wireless SoM + DevKit board. Bootloader is
>>> stock from their BSP, u-boot 2011.06 + patches from LogicPD.
>>> I can provide dmesg of 3.8(inital kernel this patch was intented for)
>>> a bit later during the day, and a bit more later same patch rebased
>>> onto latest sources.
>>> Also, I should mention that LogicPD does not support 3.8, so most
>>> LogicPD-specific devices are not working.
>>
>> A dmesg from v3.9-rc5 would be ideal, but sounds like it's a non-mainline
>> kernel?
>
> It's true, mainline kernel won't run on that board out of the box, but
> patches to
> support it go only in mach-types.h and board-omap3logic.c
>
>>
>> Are you using a serial console on that board, and if so, what UART is it
>> on?
>
> Yes, it's serial console port on UART0 (console=ttyO0,115200n8)
>
>>
>>> And I'd like to point out that after patch gaps I've seen are not gone
>>> completely, ~1 packet out of 200 at 1mbaud is still corrupted, but
>>> only when PM is enabled in kernel,
>>
>> Are you seeing corruption caused by the OMAP UART's receive path, or by
>> the transmit path?
>
> Corruption is on Tx path. Without my patch (but with patch to allow 1Mbaud):
> Interrupt to refill Tx FIFO is raised only after entire FIFO is
> transmitted (that is, no data left in FIFO to be sent),
> and it takes kernel 25us to handle interrupt and refill FIFO. Issue is
> in condition when UART triggers interrupt.
> According to original sources, this interrupt is configured to be
> raised only when TX FIFO is (fully) empty, while
> ignoring Tx FIFO threshold level completely, although it's being
> (almost) properly configured and enabled.
> So, what my patch changes is basically when interrupt to refill Tx
> FIFO is raised:
>  - from "when empty" to "when threshold is reached", refs:
> OMAP_UART_SCR_TX_EMPTY flag
>    should not be set, and UART_LSR_THRE check is also invalid, since
> blocks threshold level situation.
>  - setting OMAP_UART_SCR_RX_TRIG_GRANU1_MASK will lead to incorrect
> interpretation
>    of threshold level settings (inconsistent with comments in driver
> sources itself).
>
>>
>> When you say "only when PM is enabled" do you mean by enabling CONFIG_PM,
>> or do you mean after setting an autosuspend timeout on the serial drivers,
>> or something else?
>>
>
> I mean that I've seen gap issues both with CONFIG_PM=y and CONFIG_PM=n
> After my patch I saw gap issues only several times, and _probably_
> that is due to CONFIG_PM=y,
> since I've never seen them after that with CONFIG_PM=n
>
> I should point that tests I run are completely user-space, so I've
> never tested UARTs
> using oscilloscope during kernel boot. But, no issues in serial
> console seen, as I've
> already pointed out.
>
>>> what gives me a clue that it must be something in PM also. Maybe my
>>> patch just reveals PM-specific issue. Have you tried without PM to
>>> reproduce issues you're experiencing?
>>
>> So I'm using omap2plus_defconfig, which has CONFIG_PM=y, but the issues
>> appear before serial autosuspend is enabled.
>>
>>> Indeed, TRMs appear to contain wrong information, but I believe that
>>> in this particular case, bug is somewhere around, but not in these
>>> lines.
>>> I believe if you take a look on related TRM pages, it seems quite
>>> logical that configuration of SCR (before my patch) is not quite
>>> proper:
>>>  - trigger interrupt only when it's entirely empty, not when threshold
>>> level is reached, what is incorrect, in my opinion.
>>>  - (less influencing) Granularity of counter indicates opposite to
>>> what is stated in comments
>>>
>>> These issues may have never been experienced by others, since
>>> originally I've hit them on 1mbaud speed, what is not supported by
>>> original driver at all.
>>
>> It wouldn't surprise me at all if something is wrong with the
>> original driver's FIFO setup.  The challenge at this point is to keep
>> the driver working while trying to fix what's wrong with it.
>>
>>
>> - Paul
Paul Walmsley April 1, 2013, 5:09 p.m. UTC | #7
On Mon, 1 Apr 2013, Alexey Pelykh wrote:

> Please, correct me if I'm wrong, you're experiencing issues on Tx, or
> Rx direction? You've said that characters are echoed incorrectly, but
> haven't mentioned in what direction issue appears?

The problem seems to be on the Rx path.  If I 'cat' a file, it's 
transmitted without hesitation.  But when I paste in this Simpsons quote 
into the serial console via X:

-----
Mr. Burns: A lifetime of working with nuclear power has left me with a healthy green glow... and left me as impotent as a Nevada boxing commissioner. 
-----

... everything up to the end of "boxing" is echoed immediately, and then 
"commissioner." stutters its way across the screen over the next five 
seconds.  


- Paul
Alexey Pelykh April 1, 2013, 6:01 p.m. UTC | #8
Paul, can you please try to comment out change related to Rx line
(granulation one). That's only Rx line related change in the patch.

On Mon, Apr 1, 2013 at 8:09 PM, Paul Walmsley <paul@pwsan.com> wrote:
> On Mon, 1 Apr 2013, Alexey Pelykh wrote:
>
>> Please, correct me if I'm wrong, you're experiencing issues on Tx, or
>> Rx direction? You've said that characters are echoed incorrectly, but
>> haven't mentioned in what direction issue appears?
>
> The problem seems to be on the Rx path.  If I 'cat' a file, it's
> transmitted without hesitation.  But when I paste in this Simpsons quote
> into the serial console via X:
>
> -----
> Mr. Burns: A lifetime of working with nuclear power has left me with a healthy green glow... and left me as impotent as a Nevada boxing commissioner.
> -----
>
> ... everything up to the end of "boxing" is echoed immediately, and then
> "commissioner." stutters its way across the screen over the next five
> seconds.
>
>
> - Paul
Paul Walmsley April 2, 2013, 6:24 p.m. UTC | #9
On Mon, 1 Apr 2013, Alexey Pelykh wrote:

> Paul, can you please try to comment out change related to Rx line
> (granulation one). That's only Rx line related change in the patch.

Yes, looks like simply reverting the commit 1776fd059c40 ("OMAP/serial: 
Fix incorrect Rx FIFO threshold setting, LSR validation on Tx, and Tx FIFO 
IRQ generation") change to OMAP_UART_SCR_RX_TRIG_GRANU1_MASK fixes it.

Care to send a patch doing so ASAP for v3.9-rc fixes?  That would moot the 
revert of the entire commit 1776fd059c40.


- Paul
Alexey Pelykh April 2, 2013, 9:19 p.m. UTC | #10
Hi Paul,

No problem, I'll prepare patch in next 24 hours and will send in ASAP.

But, since we've found the issue, I suggest that we should look at it
closer, especially since at this moment only you can reproduce it. As
far as I understand, but I may be wrong,
this comment is wrong, since if to specify
OMAP_UART_SCR_RX_TRIG_GRANU1_MASK, it effectively sets Rx threshold to
1 character, instead of 16.
/* Set receive FIFO threshold to 16 characters and
 * transmit FIFO threshold to 16 spaces
 */

On Tue, Apr 2, 2013 at 9:24 PM, Paul Walmsley <paul@pwsan.com> wrote:
> On Mon, 1 Apr 2013, Alexey Pelykh wrote:
>
>> Paul, can you please try to comment out change related to Rx line
>> (granulation one). That's only Rx line related change in the patch.
>
> Yes, looks like simply reverting the commit 1776fd059c40 ("OMAP/serial:
> Fix incorrect Rx FIFO threshold setting, LSR validation on Tx, and Tx FIFO
> IRQ generation") change to OMAP_UART_SCR_RX_TRIG_GRANU1_MASK fixes it.
>
> Care to send a patch doing so ASAP for v3.9-rc fixes?  That would moot the
> revert of the entire commit 1776fd059c40.
>
>
> - Paul
Alexey Pelykh April 3, 2013, 3:55 p.m. UTC | #11
Paul,

Should I start a new email chain with that patch or add it here?

On Wed, Apr 3, 2013 at 12:19 AM, Alexey Pelykh <alexey.pelykh@gmail.com> wrote:
> Hi Paul,
>
> No problem, I'll prepare patch in next 24 hours and will send in ASAP.
>
> But, since we've found the issue, I suggest that we should look at it
> closer, especially since at this moment only you can reproduce it. As
> far as I understand, but I may be wrong,
> this comment is wrong, since if to specify
> OMAP_UART_SCR_RX_TRIG_GRANU1_MASK, it effectively sets Rx threshold to
> 1 character, instead of 16.
> /* Set receive FIFO threshold to 16 characters and
>  * transmit FIFO threshold to 16 spaces
>  */
>
> On Tue, Apr 2, 2013 at 9:24 PM, Paul Walmsley <paul@pwsan.com> wrote:
>> On Mon, 1 Apr 2013, Alexey Pelykh wrote:
>>
>>> Paul, can you please try to comment out change related to Rx line
>>> (granulation one). That's only Rx line related change in the patch.
>>
>> Yes, looks like simply reverting the commit 1776fd059c40 ("OMAP/serial:
>> Fix incorrect Rx FIFO threshold setting, LSR validation on Tx, and Tx FIFO
>> IRQ generation") change to OMAP_UART_SCR_RX_TRIG_GRANU1_MASK fixes it.
>>
>> Care to send a patch doing so ASAP for v3.9-rc fixes?  That would moot the
>> revert of the entire commit 1776fd059c40.
>>
>>
>> - Paul
Alexey Pelykh April 3, 2013, 7:04 p.m. UTC | #12
I've submitted patch that should remove this regression

On Wed, Apr 3, 2013 at 6:55 PM, Alexey Pelykh <alexey.pelykh@gmail.com> wrote:
> Paul,
>
> Should I start a new email chain with that patch or add it here?
>
> On Wed, Apr 3, 2013 at 12:19 AM, Alexey Pelykh <alexey.pelykh@gmail.com> wrote:
>> Hi Paul,
>>
>> No problem, I'll prepare patch in next 24 hours and will send in ASAP.
>>
>> But, since we've found the issue, I suggest that we should look at it
>> closer, especially since at this moment only you can reproduce it. As
>> far as I understand, but I may be wrong,
>> this comment is wrong, since if to specify
>> OMAP_UART_SCR_RX_TRIG_GRANU1_MASK, it effectively sets Rx threshold to
>> 1 character, instead of 16.
>> /* Set receive FIFO threshold to 16 characters and
>>  * transmit FIFO threshold to 16 spaces
>>  */
>>
>> On Tue, Apr 2, 2013 at 9:24 PM, Paul Walmsley <paul@pwsan.com> wrote:
>>> On Mon, 1 Apr 2013, Alexey Pelykh wrote:
>>>
>>>> Paul, can you please try to comment out change related to Rx line
>>>> (granulation one). That's only Rx line related change in the patch.
>>>
>>> Yes, looks like simply reverting the commit 1776fd059c40 ("OMAP/serial:
>>> Fix incorrect Rx FIFO threshold setting, LSR validation on Tx, and Tx FIFO
>>> IRQ generation") change to OMAP_UART_SCR_RX_TRIG_GRANU1_MASK fixes it.
>>>
>>> Care to send a patch doing so ASAP for v3.9-rc fixes?  That would moot the
>>> revert of the entire commit 1776fd059c40.
>>>
>>>
>>> - Paul
Paul Walmsley April 3, 2013, 8:17 p.m. UTC | #13
On Wed, 3 Apr 2013, Alexey Pelykh wrote:

> I've submitted patch that should remove this regression

Thanks, indeed it does.  So the revert patch that I sent at the beginning 
of this thread is now mooted.

- Paul
Paul Walmsley April 3, 2013, 8:21 p.m. UTC | #14
Hi,

On Wed, 3 Apr 2013, Alexey Pelykh wrote:

> But, since we've found the issue, I suggest that we should look at it
> closer, especially since at this moment only you can reproduce it. As
> far as I understand, but I may be wrong,
> this comment is wrong, since if to specify
> OMAP_UART_SCR_RX_TRIG_GRANU1_MASK, it effectively sets Rx threshold to
> 1 character, instead of 16.
> /* Set receive FIFO threshold to 16 characters and
>  * transmit FIFO threshold to 16 spaces
>  */

If you want to continue to experiment with it, I'd be willing to do an 
occasional test boot on 37xx EVM.  It's nice to see community patches...


- Paul
Alexey Pelykh April 3, 2013, 8:46 p.m. UTC | #15
Hi

On Wed, Apr 3, 2013 at 11:21 PM, Paul Walmsley <paul@pwsan.com> wrote:
> Hi,
>
> On Wed, 3 Apr 2013, Alexey Pelykh wrote:
>
>> But, since we've found the issue, I suggest that we should look at it
>> closer, especially since at this moment only you can reproduce it. As
>> far as I understand, but I may be wrong,
>> this comment is wrong, since if to specify
>> OMAP_UART_SCR_RX_TRIG_GRANU1_MASK, it effectively sets Rx threshold to
>> 1 character, instead of 16.
>> /* Set receive FIFO threshold to 16 characters and
>>  * transmit FIFO threshold to 16 spaces
>>  */
>
> If you want to continue to experiment with it, I'd be willing to do an
> occasional test boot on 37xx EVM.  It's nice to see community patches...
>

I'm pretty sure that current configuration of UART indeed sets Rx FIFO
threshold to
1 character, as opposite to 16 characters mentioned in comments. Kernel 3.3
is the origin of Rx 1-char-granularity bit setting, but states clearly
that Rx FIFO
is being set to 1 character in non-DMA mode. The comment appears between
3.6 and 3.7 in commit 6721ab7f77f2614ab43e3de2f908b1d7436331df by Felipe Balbi.
3.0-3.2 kernels don't have 1-char-granularity enabled, and Rx FIFO
threshold level
was effectively set to 16 characters, what should be the same as my
patch that turned off
1-char-granularity

Paul, can you confirm (if possible, surely) if 3.0-3.2 kernels had same issue?

Felipe, it would be great to get your comments on this?

>
> - Paul
Grazvydas Ignotas April 3, 2013, 8:48 p.m. UTC | #16
Hi,

On Wed, Apr 3, 2013 at 12:19 AM, Alexey Pelykh <alexey.pelykh@gmail.com> wrote:
>
> But, since we've found the issue, I suggest that we should look at it
> closer, especially since at this moment only you can reproduce it. As

FWIW I was also affected by it, maybe because my console is on UART3
and that is on different power (or was is clock?) domain than UART1.


--
Gražvydas
Paul Walmsley April 3, 2013, 8:57 p.m. UTC | #17
Hi Alexey,

On Wed, 3 Apr 2013, Alexey Pelykh wrote:

> But, since we've found the issue, I suggest that we should look at it
> closer, especially since at this moment only you can reproduce it.

Well probably no one else is testing it :-)

> As far as I understand, but I may be wrong, this comment is wrong, since 
> if to specify OMAP_UART_SCR_RX_TRIG_GRANU1_MASK, it effectively sets Rx 
> threshold to 1 character, instead of 16. /* Set receive FIFO threshold 
> to 16 characters and
>  * transmit FIFO threshold to 16 spaces
>  */

Again it certainly would not surprise me... that UART IP block seems to be 
poorly understood :-(  Bizarre, I know.

- Paul
Alexey Pelykh April 3, 2013, 9:06 p.m. UTC | #18
Hi Paul,

Is it possible to check behavior on 3.0-3.2 kernels?

On Wed, Apr 3, 2013 at 11:57 PM, Paul Walmsley <paul@pwsan.com> wrote:
> Hi Alexey,
>
> On Wed, 3 Apr 2013, Alexey Pelykh wrote:
>
>> But, since we've found the issue, I suggest that we should look at it
>> closer, especially since at this moment only you can reproduce it.
>
> Well probably no one else is testing it :-)
>
>> As far as I understand, but I may be wrong, this comment is wrong, since
>> if to specify OMAP_UART_SCR_RX_TRIG_GRANU1_MASK, it effectively sets Rx
>> threshold to 1 character, instead of 16. /* Set receive FIFO threshold
>> to 16 characters and
>>  * transmit FIFO threshold to 16 spaces
>>  */
>
> Again it certainly would not surprise me... that UART IP block seems to be
> poorly understood :-(  Bizarre, I know.
>
> - Paul
Grazvydas Ignotas April 4, 2013, 10:54 a.m. UTC | #19
On Thu, Apr 4, 2013 at 12:06 AM, Alexey Pelykh <alexey.pelykh@gmail.com> wrote:
>
> Is it possible to check behavior on 3.0-3.2 kernels?

We use 3.2 on pandora as production kernel and it doesn't have this
issue that 3.9 had. 3.2 serial PM code is vastly different from what's
in later ones though.

>
> On Wed, Apr 3, 2013 at 11:57 PM, Paul Walmsley <paul@pwsan.com> wrote:
>> Hi Alexey,
>>
>> On Wed, 3 Apr 2013, Alexey Pelykh wrote:
>>
>>> But, since we've found the issue, I suggest that we should look at it
>>> closer, especially since at this moment only you can reproduce it.
>>
>> Well probably no one else is testing it :-)
>>
>>> As far as I understand, but I may be wrong, this comment is wrong, since
>>> if to specify OMAP_UART_SCR_RX_TRIG_GRANU1_MASK, it effectively sets Rx
>>> threshold to 1 character, instead of 16. /* Set receive FIFO threshold
>>> to 16 characters and
>>>  * transmit FIFO threshold to 16 spaces
>>>  */
>>
>> Again it certainly would not surprise me... that UART IP block seems to be
>> poorly understood :-(  Bizarre, I know.
>>
>> - Paul

--
Gražvydas
diff mbox

Patch

diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
index 4dc4140..9915e4d 100644
--- a/drivers/tty/serial/omap-serial.c
+++ b/drivers/tty/serial/omap-serial.c
@@ -59,7 +59,6 @@ 
 
 /* SCR register bitmasks */
 #define OMAP_UART_SCR_RX_TRIG_GRANU1_MASK		(1 << 7)
-#define OMAP_UART_SCR_TX_TRIG_GRANU1_MASK		(1 << 6)
 #define OMAP_UART_SCR_TX_EMPTY			(1 << 3)
 
 /* FCR register bitmasks */
@@ -321,6 +320,9 @@  static void transmit_chars(struct uart_omap_port *up, unsigned int lsr)
 	struct circ_buf *xmit = &up->port.state->xmit;
 	int count;
 
+	if (!(lsr & UART_LSR_THRE))
+		return;
+
 	if (up->port.x_char) {
 		serial_out(up, UART_TX, up->port.x_char);
 		up->port.icount.tx++;
@@ -862,7 +864,7 @@  serial_omap_set_termios(struct uart_port *port, struct ktermios *termios,
 	serial_out(up, UART_IER, up->ier);
 	serial_out(up, UART_LCR, cval);		/* reset DLAB */
 	up->lcr = cval;
-	up->scr = 0;
+	up->scr = OMAP_UART_SCR_TX_EMPTY;
 
 	/* FIFOs and DMA Settings */
 
@@ -886,6 +888,8 @@  serial_omap_set_termios(struct uart_port *port, struct ktermios *termios,
 	serial_out(up, UART_MCR, up->mcr | UART_MCR_TCRTLR);
 	/* FIFO ENABLE, DMA MODE */
 
+	up->scr |= OMAP_UART_SCR_RX_TRIG_GRANU1_MASK;
+
 	/* Set receive FIFO threshold to 16 characters and
 	 * transmit FIFO threshold to 16 spaces
 	 */