diff mbox

[REPOST,1/2] serial/amba-pl011: Activate TX IRQ passively

Message ID 1425472054-7543-2-git-send-email-Dave.Martin@arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Dave Martin March 4, 2015, 12:27 p.m. UTC
The current PL011 driver transmits a dummy character when the UART
is opened, to assert the TX IRQ for the first time
(see pl011_startup()).  The UART is put in loopback mode temporarily,
so the receiver presumably shouldn't see anything.

However...

At least some platforms containing a PL011 send characters down the
wire even when loopback mode is enabled.  This means that a
spurious NUL character may be seen at the receiver when the PL011 is
opened through the TTY layer.

The current code also temporarily sets the baud rate to maximum and
the character width to the minimum, to that the dummy TX completes
as quickly as possible.  If this is seen by the receiver it will
result in a framing error and can knock the receiver out of sync --
turning subsequent output into garbage until synchronisation
is reestablished.  (Particularly problematic during boot with systemd.)

To avoid spurious transmissions, this patch removes assumptions about
whether the TX IRQ will fire until at least one TX IRQ has been seen.

Instead, the UART will unmask the TX IRQ and then slow-start via
polling and timer-based soft IRQs initially.  If the TTY layer writes
enough data to fill the FIFO to the interrupt threshold in one go,
the TX IRQ should assert, at which point the driver changes to
fully interrupt-driven TX.

In this way, the TX IRQ is activated as a side-effect instead of
being done deliberately.

This should also mean that the driver works on the SBSA Generic
UART[1] (a cut-down PL011) without invasive changes.  The Generic
UART lacks some features needed for the dummy TX approach to work
(FIFO disabling and loopback).

[1] Server Base System Architecture (ARM-DEN-0029-v2.3)
    http://infocenter.arm.com/
    (click-thru required :/)

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 drivers/tty/serial/amba-pl011.c |  165 +++++++++++++++++++++++++++------------
 1 file changed, 115 insertions(+), 50 deletions(-)

Comments

Russell King - ARM Linux March 12, 2015, 11:03 a.m. UTC | #1
On Wed, Mar 04, 2015 at 12:27:33PM +0000, Dave Martin wrote:
> The current PL011 driver transmits a dummy character when the UART
> is opened, to assert the TX IRQ for the first time
> (see pl011_startup()).  The UART is put in loopback mode temporarily,
> so the receiver presumably shouldn't see anything.

We do this so that we know for certain that the PL011 has room to accept
at least one character.

> However...
> 
> At least some platforms containing a PL011 send characters down the
> wire even when loopback mode is enabled.  This means that a
> spurious NUL character may be seen at the receiver when the PL011 is
> opened through the TTY layer.

... which is an annoyance.

> The current code also temporarily sets the baud rate to maximum and
> the character width to the minimum, to that the dummy TX completes
> as quickly as possible.  If this is seen by the receiver it will
> result in a framing error and can knock the receiver out of sync --
> turning subsequent output into garbage until synchronisation
> is reestablished.  (Particularly problematic during boot with systemd.)

Yea, I have my own issues with systemd, but let's stay away from that
potential argument. :)

> To avoid spurious transmissions, this patch removes assumptions about
> whether the TX IRQ will fire until at least one TX IRQ has been seen.
> 
> Instead, the UART will unmask the TX IRQ and then slow-start via
> polling and timer-based soft IRQs initially.  If the TTY layer writes
> enough data to fill the FIFO to the interrupt threshold in one go,
> the TX IRQ should assert, at which point the driver changes to
> fully interrupt-driven TX.

I'm concerned about this.  What happens if the PL011 is also being used
as a console, and the UART TX FIFO is fully stuffed?

Reading the updated code, it seems that we can call pl011_tx_chars()
irrespective of whether the TX FIFO is full or not.  pl011_tx_chars()
makes the assumption that the TX FIFO can always accept the next
character, and it results in (eg, in the x_char handling) the next
character being written, even if the FIFO is full.

If hardware CTS flow control is enabled, the problem gets worse - in
that mode, the TX FIFO can remain fully occupied for an indeterminant
amount of time.

This introduces a whole new unreliability to the driver which wasn't
there to start with.

So, I'm afraid this gets a NAK.

You need to check the TX FIFO status before trying to stuff it with
characters.
Dave Martin March 12, 2015, 12:55 p.m. UTC | #2
On Thu, Mar 12, 2015 at 11:03:45AM +0000, Russell King - ARM Linux wrote:
> On Wed, Mar 04, 2015 at 12:27:33PM +0000, Dave Martin wrote:
> > The current PL011 driver transmits a dummy character when the UART
> > is opened, to assert the TX IRQ for the first time
> > (see pl011_startup()).  The UART is put in loopback mode temporarily,
> > so the receiver presumably shouldn't see anything.
> 
> We do this so that we know for certain that the PL011 has room to accept
> at least one character.

Thanks for taking a look at this.

[Greg, please hold these patches until I've reworked them]

> 
> > However...
> > 
> > At least some platforms containing a PL011 send characters down the
> > wire even when loopback mode is enabled.  This means that a
> > spurious NUL character may be seen at the receiver when the PL011 is
> > opened through the TTY layer.
> 
> ... which is an annoyance.
> 
> > The current code also temporarily sets the baud rate to maximum and
> > the character width to the minimum, to that the dummy TX completes
> > as quickly as possible.  If this is seen by the receiver it will
> > result in a framing error and can knock the receiver out of sync --
> > turning subsequent output into garbage until synchronisation
> > is reestablished.  (Particularly problematic during boot with systemd.)
> 
> Yea, I have my own issues with systemd, but let's stay away from that
> potential argument. :)

(dunno what you could be suggesting...)

> 
> > To avoid spurious transmissions, this patch removes assumptions about
> > whether the TX IRQ will fire until at least one TX IRQ has been seen.
> > 
> > Instead, the UART will unmask the TX IRQ and then slow-start via
> > polling and timer-based soft IRQs initially.  If the TTY layer writes
> > enough data to fill the FIFO to the interrupt threshold in one go,
> > the TX IRQ should assert, at which point the driver changes to
> > fully interrupt-driven TX.
> 
> I'm concerned about this.  What happens if the PL011 is also being used
> as a console, and the UART TX FIFO is fully stuffed?

I don't think my patches change this situation.  For normal printk,
pl011_console_write() will disable IRQs at the CPU and take port->lock:
so either pl011_int() finishes all its work safely before the console
output is done, or the console output is done first.

In the latter case, the condition that triggered the IRQ (i.e., TX
FIFO has space) may no longer be true: the lock protects pl011_int()
from the race that would otherwise exist between the check on TXIS
and the actual writes to the FIFO.

Only in special cases (sysrq and oopses) might the lock not be
held.  But this problem exists independently of my changes AFAICT.
Correct output seems to be best-effort in these cases, which is
probably the right thing to do.

> 
> Reading the updated code, it seems that we can call pl011_tx_chars()
> irrespective of whether the TX FIFO is full or not.  pl011_tx_chars()
> makes the assumption that the TX FIFO can always accept the next
> character, and it results in (eg, in the x_char handling) the next
> character being written, even if the FIFO is full.

There are multiple paths into pl011_tx_chars() now, and the original
invariant (TXIS is asserted and nothing else is filling the FIFO,
therefore the FIFO is at least half-empty) is replaced with a more
complex one.

Now, the FIFO is definitely half-empty if tx_irq_seen >= 2.  Otherwise,
the FIFO fill level is unknown, and TXFF is polled for each write.


I've been thinking about this, and my code is really more complex than
it needs to be here.

All that's really needed is an explicit argument to say whether
pl011_tx_chars() was called due to a FIFO-definitely-has-space
condition (IRQ or DMA underflow), or simply because there was
some data to write (the pl011_start_tx() and softirq cases) --
this just boils down to identifying the call site.

> If hardware CTS flow control is enabled, the problem gets worse - in
> that mode, the TX FIFO can remain fully occupied for an indeterminant
> amount of time.

Not really?  Regardless of why the FIFO is full, pl011_int() won't
call pl011_tx_chars() because a full FIFO means TXIS is deasserted.

In the start_tx/softirq case TXFF will be checked, as I try to
explain above.


> This introduces a whole new unreliability to the driver which wasn't
> there to start with.

I don't _think_ this adds unreliability (except for a dumbass bug I
noticed where I call spin_lock() instead of spin_lock_irq() in
pl011_tx_softirq()).

But it does add complexity and will likely make the code harder
to maintiain.


Unless you have any other ideas I will try to simplify the code
along the above lines.

> You need to check the TX FIFO status before trying to stuff it with
> characters.

Indeed.

Cheers
---Dave
Russell King - ARM Linux March 12, 2015, 2:39 p.m. UTC | #3
On Thu, Mar 12, 2015 at 12:55:19PM +0000, Dave Martin wrote:
> I don't think my patches change this situation.  For normal printk,
> pl011_console_write() will disable IRQs at the CPU and take port->lock:
> so either pl011_int() finishes all its work safely before the console
> output is done, or the console output is done first.
> 
> In the latter case, the condition that triggered the IRQ (i.e., TX
> FIFO has space) may no longer be true: the lock protects pl011_int()
> from the race that would otherwise exist between the check on TXIS
> and the actual writes to the FIFO.
> 
> Only in special cases (sysrq and oopses) might the lock not be
> held.  But this problem exists independently of my changes AFAICT.
> Correct output seems to be best-effort in these cases, which is
> probably the right thing to do.

Actually, yes, you're right, I'm wrong.  The console won't drop the
spinlock until the last character has left the TX FIFO and has been
sent on the wire, so at the point I was originally thinking of, we
guarantee that the FIFO is empty (if it weren't like that, it'd
almost certainly be full.)

> > If hardware CTS flow control is enabled, the problem gets worse - in
> > that mode, the TX FIFO can remain fully occupied for an indeterminant
> > amount of time.
> 
> Not really?  Regardless of why the FIFO is full, pl011_int() won't
> call pl011_tx_chars() because a full FIFO means TXIS is deasserted.
> 
> In the start_tx/softirq case TXFF will be checked, as I try to
> explain above.

Yes, in that case, pl011_int() won't be called for transmit, but what
about the softirq case.

Looking at the patch again, I don't see pl011_tx_chars() ever returning
true - so the while (pl011_tx_chars()); in the softirq looks strange
(if it always returns false, we won't ever iterate.)

> Unless you have any other ideas I will try to simplify the code
> along the above lines.

I forget whether we have problems with UARTs which don't generate any TX
irqs - I hope we don't.

What about doing this instead, which should result in fewer changes to
the driver (especially to the fast paths):

- On startup, set a flag to indicate that the TX state is unknown (so we
  can detect the first start_tx().)
- When we get our first start_tx(), check the flag.
  - read the FIFO flags and interrupt status.  If the FIFO flags indicate
    that the TX FIFO is empty, but there is no transmit interrupt, read
    and load the first character into the FIFO.  Instead of just loading
    the first character, as we've checked the FIFO status, we could
    alternatively load the first half of the FIFO via a call to
    pl011_tx_chars().
  - clear the flag.

This should mean that we keep the driver structure we have today.

The only concern I have about that is (from what I remember) we don't
expect a call to ->start_tx to cause a tx stop event, though I don't
see anything in serial_core which would get upset if we did.
Dave Martin March 12, 2015, 4:34 p.m. UTC | #4
On Thu, Mar 12, 2015 at 02:39:30PM +0000, Russell King - ARM Linux wrote:
> On Thu, Mar 12, 2015 at 12:55:19PM +0000, Dave Martin wrote:
> > I don't think my patches change this situation.  For normal printk,
> > pl011_console_write() will disable IRQs at the CPU and take port->lock:
> > so either pl011_int() finishes all its work safely before the console
> > output is done, or the console output is done first.
> > 
> > In the latter case, the condition that triggered the IRQ (i.e., TX
> > FIFO has space) may no longer be true: the lock protects pl011_int()
> > from the race that would otherwise exist between the check on TXIS
> > and the actual writes to the FIFO.
> > 
> > Only in special cases (sysrq and oopses) might the lock not be
> > held.  But this problem exists independently of my changes AFAICT.
> > Correct output seems to be best-effort in these cases, which is
> > probably the right thing to do.
> 
> Actually, yes, you're right, I'm wrong.  The console won't drop the
> spinlock until the last character has left the TX FIFO and has been
> sent on the wire, so at the point I was originally thinking of, we
> guarantee that the FIFO is empty (if it weren't like that, it'd
> almost certainly be full.)

Yes, that's also true.

> > > If hardware CTS flow control is enabled, the problem gets worse - in
> > > that mode, the TX FIFO can remain fully occupied for an indeterminant
> > > amount of time.
> > 
> > Not really?  Regardless of why the FIFO is full, pl011_int() won't
> > call pl011_tx_chars() because a full FIFO means TXIS is deasserted.
> > 
> > In the start_tx/softirq case TXFF will be checked, as I try to
> > explain above.
> 
> Yes, in that case, pl011_int() won't be called for transmit, but what
> about the softirq case.
> 
> Looking at the patch again, I don't see pl011_tx_chars() ever returning
> true - so the while (pl011_tx_chars()); in the softirq looks strange
> (if it always returns false, we won't ever iterate.)

Oops, you're right.  This is a legacy from an older spin of the patches,
where tx_softirq had to spin around tx_chars in the case of an xchar,
just like pl011_int must at present.

That seemed contrived, so I got rid of this requirement, but that
makes the return value and the loop redundant.

> 
> > Unless you have any other ideas I will try to simplify the code
> > along the above lines.
> 
> I forget whether we have problems with UARTs which don't generate any TX
> irqs - I hope we don't.

Currently the code relies on a working TX IRQ anyway, so I'm hoping
it's safe to assume we don't have to cope with that.

> What about doing this instead, which should result in fewer changes to
> the driver (especially to the fast paths):
> 
> - On startup, set a flag to indicate that the TX state is unknown (so we
>   can detect the first start_tx().)
> - When we get our first start_tx(), check the flag.
>   - read the FIFO flags and interrupt status.  If the FIFO flags indicate
>     that the TX FIFO is empty, but there is no transmit interrupt, read
>     and load the first character into the FIFO.  Instead of just loading
>     the first character, as we've checked the FIFO status, we could
>     alternatively load the first half of the FIFO via a call to
>     pl011_tx_chars().
>   - clear the flag.
> 
> This should mean that we keep the driver structure we have today.

I had originally hoped to do things that way, but there is a problem:
unless pl011_tx_chars() sees TXFF at some point, there is no way to
know how full the FIFO got.  For example, if there are fifosize/2
chars queued from serial_core initially, it's likely that the FIFO
won't get half-full.  This permits a situation where we have to leave
the flag set (to be on the safe side), but a TX IRQ does actually
happen.

So, the flag can't be cleared unconditionally -- I took an "I'll believe
it when I see it" approach to the TX IRQ.  tx_irq_seen should really be
a bool though -- the counter-style thing works around a race which could
only happen because I mastakenly wasn't disabling interrupts in
pl011_tx_softirq().

There could still be merit in transmitting the initial chars directly
instead of waiting for a TX IRQ, even if we already know the TX IRQ
works, especially if start_tx() is called because of a newly pending
x_char.  But this is really an optimisation, so I may present it as
a separate patch.

> The only concern I have about that is (from what I remember) we don't
> expect a call to ->start_tx to cause a tx stop event, though I don't
> see anything in serial_core which would get upset if we did.

That's a fair point.  stop_tx will normally happen asynchronously, but
the port lock has to be released first.  8250 does stop_tx from its
softirq in the analogous situation, so we could do the same here.


I'll rework the series around my current ideas first, and then we
can see if it needs further restructuring...

Cheers
---Dave
diff mbox

Patch

diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index 8d94c19..69910f7 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -58,6 +58,7 @@ 
 #include <linux/pinctrl/consumer.h>
 #include <linux/sizes.h>
 #include <linux/io.h>
+#include <linux/workqueue.h>
 
 #define UART_NR			14
 
@@ -156,7 +157,9 @@  struct uart_amba_port {
 	unsigned int		lcrh_tx;	/* vendor-specific */
 	unsigned int		lcrh_rx;	/* vendor-specific */
 	unsigned int		old_cr;		/* state during shutdown */
+	struct delayed_work	tx_softirq_work;
 	bool			autorts;
+	unsigned int		tx_irq_seen;	/* 0=none, 1=1, 2=2 or more */
 	char			type[12];
 #ifdef CONFIG_DMA_ENGINE
 	/* DMA stuff */
@@ -440,8 +443,9 @@  static void pl011_dma_remove(struct uart_amba_port *uap)
 		dma_release_channel(uap->dmarx.chan);
 }
 
-/* Forward declare this for the refill routine */
+/* Forward declare these for the refill routine */
 static int pl011_dma_tx_refill(struct uart_amba_port *uap);
+static void pl011_start_tx_pio(struct uart_amba_port *uap);
 
 /*
  * The current DMA TX buffer has been sent.
@@ -479,14 +483,13 @@  static void pl011_dma_tx_callback(void *data)
 		return;
 	}
 
-	if (pl011_dma_tx_refill(uap) <= 0) {
+	if (pl011_dma_tx_refill(uap) <= 0)
 		/*
 		 * We didn't queue a DMA buffer for some reason, but we
 		 * have data pending to be sent.  Re-enable the TX IRQ.
 		 */
-		uap->im |= UART011_TXIM;
-		writew(uap->im, uap->port.membase + UART011_IMSC);
-	}
+		pl011_start_tx_pio(uap);
+
 	spin_unlock_irqrestore(&uap->port.lock, flags);
 }
 
@@ -664,12 +667,10 @@  static inline bool pl011_dma_tx_start(struct uart_amba_port *uap)
 		if (!uap->dmatx.queued) {
 			if (pl011_dma_tx_refill(uap) > 0) {
 				uap->im &= ~UART011_TXIM;
-				ret = true;
-			} else {
-				uap->im |= UART011_TXIM;
+				writew(uap->im, uap->port.membase +
+				       UART011_IMSC);
+			} else
 				ret = false;
-			}
-			writew(uap->im, uap->port.membase + UART011_IMSC);
 		} else if (!(uap->dmacr & UART011_TXDMAE)) {
 			uap->dmacr |= UART011_TXDMAE;
 			writew(uap->dmacr,
@@ -1208,15 +1209,24 @@  static void pl011_stop_tx(struct uart_port *port)
 	pl011_dma_tx_stop(uap);
 }
 
+static bool pl011_tx_chars(struct uart_amba_port *uap);
+
+/* Start TX with programmed I/O only (no DMA) */
+static void pl011_start_tx_pio(struct uart_amba_port *uap)
+{
+	uap->im |= UART011_TXIM;
+	writew(uap->im, uap->port.membase + UART011_IMSC);
+	if (!uap->tx_irq_seen)
+		pl011_tx_chars(uap);
+}
+
 static void pl011_start_tx(struct uart_port *port)
 {
 	struct uart_amba_port *uap =
 	    container_of(port, struct uart_amba_port, port);
 
-	if (!pl011_dma_tx_start(uap)) {
-		uap->im |= UART011_TXIM;
-		writew(uap->im, uap->port.membase + UART011_IMSC);
-	}
+	if (!pl011_dma_tx_start(uap))
+		pl011_start_tx_pio(uap);
 }
 
 static void pl011_stop_rx(struct uart_port *port)
@@ -1274,40 +1284,87 @@  __acquires(&uap->port.lock)
 	spin_lock(&uap->port.lock);
 }
 
-static void pl011_tx_chars(struct uart_amba_port *uap)
+/*
+ * Transmit a character
+ * There must be at least one free entry in the TX FIFO to accept the char.
+ *
+ * Returns true if the FIFO might have space in it afterwards;
+ * returns false if the FIFO definitely became full.
+ */
+static bool pl011_tx_char(struct uart_amba_port *uap, unsigned char c)
+{
+	writew(c, uap->port.membase + UART01x_DR);
+	uap->port.icount.tx++;
+
+	if (likely(uap->tx_irq_seen > 1))
+		return true;
+
+	return !(readw(uap->port.membase + UART01x_FR) & UART01x_FR_TXFF);
+}
+
+static bool pl011_tx_chars(struct uart_amba_port *uap)
 {
 	struct circ_buf *xmit = &uap->port.state->xmit;
 	int count;
 
+	if (unlikely(uap->tx_irq_seen < 2))
+		/*
+		 * Initial FIFO fill level unknown: we must check TXFF
+		 * after each write, so just try to fill up the FIFO.
+		 */
+		count = uap->fifosize;
+	else /* tx_irq_seen >= 2 */
+		/*
+		 * FIFO initially at least half-empty, so we can simply
+		 * write half the FIFO without polling TXFF.
+
+		 * Note: the *first* TX IRQ can still race with
+		 * pl011_start_tx_pio(), which can result in the FIFO
+		 * being fuller than expected in that case.
+		 */
+		count = uap->fifosize >> 1;
+
+	/*
+	 * If the FIFO is full we're guaranteed a TX IRQ at some later point,
+	 * and can't transmit immediately in any case:
+	 */
+	if (unlikely(uap->tx_irq_seen < 2 &&
+		     readw(uap->port.membase + UART01x_FR) & UART01x_FR_TXFF))
+		return false;
+
 	if (uap->port.x_char) {
-		writew(uap->port.x_char, uap->port.membase + UART01x_DR);
-		uap->port.icount.tx++;
+		pl011_tx_char(uap, uap->port.x_char);
 		uap->port.x_char = 0;
-		return;
+		--count;
 	}
 	if (uart_circ_empty(xmit) || uart_tx_stopped(&uap->port)) {
 		pl011_stop_tx(&uap->port);
-		return;
+		goto done;
 	}
 
 	/* If we are using DMA mode, try to send some characters. */
 	if (pl011_dma_tx_irq(uap))
-		return;
+		goto done;
 
-	count = uap->fifosize >> 1;
-	do {
-		writew(xmit->buf[xmit->tail], uap->port.membase + UART01x_DR);
+	while (count-- > 0 && pl011_tx_char(uap, xmit->buf[xmit->tail])) {
 		xmit->tail = (xmit->tail + 1) & (UART_XMIT_SIZE - 1);
-		uap->port.icount.tx++;
 		if (uart_circ_empty(xmit))
 			break;
-	} while (--count > 0);
+	}
 
 	if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS)
 		uart_write_wakeup(&uap->port);
 
-	if (uart_circ_empty(xmit))
+	if (uart_circ_empty(xmit)) {
 		pl011_stop_tx(&uap->port);
+		goto done;
+	}
+
+	if (unlikely(!uap->tx_irq_seen))
+		schedule_delayed_work(&uap->tx_softirq_work, uap->port.timeout);
+
+done:
+	return false;
 }
 
 static void pl011_modem_status(struct uart_amba_port *uap)
@@ -1334,6 +1391,28 @@  static void pl011_modem_status(struct uart_amba_port *uap)
 	wake_up_interruptible(&uap->port.state->port.delta_msr_wait);
 }
 
+static void pl011_tx_softirq(struct work_struct *work)
+{
+	struct delayed_work *dwork = to_delayed_work(work);
+	struct uart_amba_port *uap =
+		container_of(dwork, struct uart_amba_port, tx_softirq_work);
+
+	spin_lock(&uap->port.lock);
+	while (pl011_tx_chars(uap)) ;
+	spin_unlock(&uap->port.lock);
+}
+
+static void pl011_tx_irq_seen(struct uart_amba_port *uap)
+{
+	if (likely(uap->tx_irq_seen > 1))
+		return;
+
+	uap->tx_irq_seen++;
+	if (uap->tx_irq_seen < 2)
+		/* first TX IRQ */
+		cancel_delayed_work(&uap->tx_softirq_work);
+}
+
 static irqreturn_t pl011_int(int irq, void *dev_id)
 {
 	struct uart_amba_port *uap = dev_id;
@@ -1372,8 +1451,10 @@  static irqreturn_t pl011_int(int irq, void *dev_id)
 			if (status & (UART011_DSRMIS|UART011_DCDMIS|
 				      UART011_CTSMIS|UART011_RIMIS))
 				pl011_modem_status(uap);
-			if (status & UART011_TXIS)
+			if (status & UART011_TXIS) {
+				pl011_tx_irq_seen(uap);
 				pl011_tx_chars(uap);
+			}
 
 			if (pass_counter-- == 0)
 				break;
@@ -1577,7 +1658,7 @@  static int pl011_startup(struct uart_port *port)
 {
 	struct uart_amba_port *uap =
 	    container_of(port, struct uart_amba_port, port);
-	unsigned int cr, lcr_h, fbrd, ibrd;
+	unsigned int cr;
 	int retval;
 
 	retval = pl011_hwinit(port);
@@ -1595,29 +1676,10 @@  static int pl011_startup(struct uart_port *port)
 
 	writew(uap->vendor->ifls, uap->port.membase + UART011_IFLS);
 
-	/*
-	 * Provoke TX FIFO interrupt into asserting. Taking care to preserve
-	 * baud rate and data format specified by FBRD, IBRD and LCRH as the
-	 * UART may already be in use as a console.
-	 */
-	spin_lock_irq(&uap->port.lock);
-
-	fbrd = readw(uap->port.membase + UART011_FBRD);
-	ibrd = readw(uap->port.membase + UART011_IBRD);
-	lcr_h = readw(uap->port.membase + uap->lcrh_rx);
-
-	cr = UART01x_CR_UARTEN | UART011_CR_TXE | UART011_CR_LBE;
-	writew(cr, uap->port.membase + UART011_CR);
-	writew(0, uap->port.membase + UART011_FBRD);
-	writew(1, uap->port.membase + UART011_IBRD);
-	pl011_write_lcr_h(uap, 0);
-	writew(0, uap->port.membase + UART01x_DR);
-	while (readw(uap->port.membase + UART01x_FR) & UART01x_FR_BUSY)
-		barrier();
+	/* Assume that TX IRQ doesn't work until we see one: */
+	uap->tx_irq_seen = 0;
 
-	writew(fbrd, uap->port.membase + UART011_FBRD);
-	writew(ibrd, uap->port.membase + UART011_IBRD);
-	pl011_write_lcr_h(uap, lcr_h);
+	spin_lock_irq(&uap->port.lock);
 
 	/* restore RTS and DTR */
 	cr = uap->old_cr & (UART011_CR_RTS | UART011_CR_DTR);
@@ -1672,6 +1734,8 @@  static void pl011_shutdown(struct uart_port *port)
 	    container_of(port, struct uart_amba_port, port);
 	unsigned int cr;
 
+	cancel_delayed_work_sync(&uap->tx_softirq_work);
+
 	/*
 	 * disable all interrupts
 	 */
@@ -2218,6 +2282,7 @@  static int pl011_probe(struct amba_device *dev, const struct amba_id *id)
 	uap->port.ops = &amba_pl011_pops;
 	uap->port.flags = UPF_BOOT_AUTOCONF;
 	uap->port.line = i;
+	INIT_DELAYED_WORK(&uap->tx_softirq_work, pl011_tx_softirq);
 	pl011_dma_probe(&dev->dev, uap);
 
 	/* Ensure interrupts from this UART are masked and cleared */