Message ID | 20240902152451.862-7-johan+linaro@kernel.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | serial: qcom-geni: fix console corruption | expand |
Hi Johan, kernel test robot noticed the following build warnings: [auto build test WARNING on tty/tty-testing] [also build test WARNING on tty/tty-next tty/tty-linus driver-core/driver-core-testing driver-core/driver-core-next driver-core/driver-core-linus usb/usb-testing usb/usb-next usb/usb-linus linus/master v6.11-rc6 next-20240902] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Johan-Hovold/serial-qcom-geni-fix-fifo-polling-timeout/20240902-232801 base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git tty-testing patch link: https://lore.kernel.org/r/20240902152451.862-7-johan%2Blinaro%40kernel.org patch subject: [PATCH 6/8] serial: qcom-geni: fix console corruption config: arc-randconfig-001-20240903 (https://download.01.org/0day-ci/archive/20240903/202409031521.hMYVfAjO-lkp@intel.com/config) compiler: arc-elf-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240903/202409031521.hMYVfAjO-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202409031521.hMYVfAjO-lkp@intel.com/ All warnings (new ones prefixed by >>): >> drivers/tty/serial/qcom_geni_serial.c:314:13: warning: 'qcom_geni_serial_drain_fifo' defined but not used [-Wunused-function] 314 | static void qcom_geni_serial_drain_fifo(struct uart_port *uport) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ vim +/qcom_geni_serial_drain_fifo +314 drivers/tty/serial/qcom_geni_serial.c 313 > 314 static void qcom_geni_serial_drain_fifo(struct uart_port *uport) 315 { 316 struct qcom_geni_serial_port *port = to_dev_port(uport); 317 318 if (!qcom_geni_serial_main_active(uport)) 319 return; 320 321 qcom_geni_serial_poll_bitfield(uport, SE_GENI_M_GP_LENGTH, GP_LENGTH, 322 port->tx_queued); 323 } 324
Hi, On Mon, Sep 2, 2024 at 8:26 AM Johan Hovold <johan+linaro@kernel.org> wrote: > > +static void qcom_geni_serial_drain_fifo(struct uart_port *uport) > +{ > + struct qcom_geni_serial_port *port = to_dev_port(uport); > + > + if (!qcom_geni_serial_main_active(uport)) > + return; It seems like all callers already do the check and only ever call you if the port is active. Do you really need to re-check? > @@ -308,6 +311,17 @@ static bool qcom_geni_serial_poll_bit(struct uart_port *uport, > return qcom_geni_serial_poll_bitfield(uport, offset, field, set ? field : 0); > } > > +static void qcom_geni_serial_drain_fifo(struct uart_port *uport) > +{ > + struct qcom_geni_serial_port *port = to_dev_port(uport); > + > + if (!qcom_geni_serial_main_active(uport)) > + return; > + > + qcom_geni_serial_poll_bitfield(uport, SE_GENI_M_GP_LENGTH, GP_LENGTH, > + port->tx_queued); nit: indent "port->tx_queued" to match open parenthesis? ...also: as the kernel test robot reported, w/ certain CONFIGs this is defined / not used. Aside from the nit / robot issue, this solution looks reasonable to me. It's been long enough that I've already paged out much of the past digging I did into this driver, but this seems like it should work. Feel free to add my Reviewed-by when the robot issue is fixed. -Doug
On Wed, Sep 04, 2024 at 02:51:15PM -0700, Doug Anderson wrote: > On Mon, Sep 2, 2024 at 8:26 AM Johan Hovold <johan+linaro@kernel.org> wrote: > > > > +static void qcom_geni_serial_drain_fifo(struct uart_port *uport) > > +{ > > + struct qcom_geni_serial_port *port = to_dev_port(uport); > > + > > + if (!qcom_geni_serial_main_active(uport)) > > + return; > > It seems like all callers already do the check and only ever call you > if the port is active. Do you really need to re-check? I wanted to make the helper self-contained and work in both cases. But since I ended up only using this helper only in the console code and will need to move it anyway (under the console ifdef), perhaps I can consider dropping it. But then again, it's just one register read. > > @@ -308,6 +311,17 @@ static bool qcom_geni_serial_poll_bit(struct uart_port *uport, > > return qcom_geni_serial_poll_bitfield(uport, offset, field, set ? field : 0); > > } > > > > +static void qcom_geni_serial_drain_fifo(struct uart_port *uport) > > +{ > > + struct qcom_geni_serial_port *port = to_dev_port(uport); > > + > > + if (!qcom_geni_serial_main_active(uport)) > > + return; > > + > > + qcom_geni_serial_poll_bitfield(uport, SE_GENI_M_GP_LENGTH, GP_LENGTH, > > + port->tx_queued); > > nit: indent "port->tx_queued" to match open parenthesis? No, I don't use open-parenthesis alignment unless that's the (consistent) style of the code I'm changing (e.g. to avoid unnecessary realignments when symbol names change and to make a point about checkpatch --pedantic warnings not being part of the coding style). > ...also: as the kernel test robot reported, w/ certain CONFIGs this is > defined / not used. Yes, I need to move the helper under the console ifdef. I was just waiting to see if there was any further feedback before respinning. > Aside from the nit / robot issue, this solution looks reasonable to > me. It's been long enough that I've already paged out much of the past > digging I did into this driver, but this seems like it should work. > Feel free to add my Reviewed-by when the robot issue is fixed. Thanks for reviewing. Johan
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c index 7029c39a9a21..be620c5703f5 100644 --- a/drivers/tty/serial/qcom_geni_serial.c +++ b/drivers/tty/serial/qcom_geni_serial.c @@ -131,6 +131,7 @@ struct qcom_geni_serial_port { bool brk; unsigned int tx_remaining; + unsigned int tx_queued; int wakeup_irq; bool rx_tx_swap; bool cts_rts_swap; @@ -144,6 +145,8 @@ static const struct uart_ops qcom_geni_uart_pops; static struct uart_driver qcom_geni_console_driver; static struct uart_driver qcom_geni_uart_driver; +static void qcom_geni_serial_cancel_tx_cmd(struct uart_port *uport); + static inline struct qcom_geni_serial_port *to_dev_port(struct uart_port *uport) { return container_of(uport, struct qcom_geni_serial_port, uport); @@ -308,6 +311,17 @@ static bool qcom_geni_serial_poll_bit(struct uart_port *uport, return qcom_geni_serial_poll_bitfield(uport, offset, field, set ? field : 0); } +static void qcom_geni_serial_drain_fifo(struct uart_port *uport) +{ + struct qcom_geni_serial_port *port = to_dev_port(uport); + + if (!qcom_geni_serial_main_active(uport)) + return; + + qcom_geni_serial_poll_bitfield(uport, SE_GENI_M_GP_LENGTH, GP_LENGTH, + port->tx_queued); +} + static void qcom_geni_serial_setup_tx(struct uart_port *uport, u32 xmit_size) { u32 m_cmd; @@ -476,7 +490,6 @@ static void qcom_geni_serial_console_write(struct console *co, const char *s, struct qcom_geni_serial_port *port; bool locked = true; unsigned long flags; - u32 geni_status; WARN_ON(co->index < 0 || co->index >= GENI_UART_CONS_PORTS); @@ -490,34 +503,20 @@ static void qcom_geni_serial_console_write(struct console *co, const char *s, else uart_port_lock_irqsave(uport, &flags); - geni_status = readl(uport->membase + SE_GENI_STATUS); + if (qcom_geni_serial_main_active(uport)) { + /* Wait for completion or drain FIFO */ + if (!locked || port->tx_remaining == 0) + qcom_geni_serial_poll_tx_done(uport); + else + qcom_geni_serial_drain_fifo(uport); - if (!locked) { - /* - * We can only get here if an oops is in progress then we were - * unable to get the lock. This means we can't safely access - * our state variables like tx_remaining. About the best we - * can do is wait for the FIFO to be empty before we start our - * transfer, so we'll do that. - */ - qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS, - M_TX_FIFO_NOT_EMPTY_EN, false); - } else if ((geni_status & M_GENI_CMD_ACTIVE) && !port->tx_remaining) { - /* - * It seems we can't interrupt existing transfers if all data - * has been sent, in which case we need to look for done first. - */ - qcom_geni_serial_poll_tx_done(uport); + qcom_geni_serial_cancel_tx_cmd(uport); } __qcom_geni_serial_console_write(uport, s, count); - - if (locked) { - if (port->tx_remaining) - qcom_geni_serial_setup_tx(uport, port->tx_remaining); + if (locked) uart_port_unlock_irqrestore(uport, flags); - } } static void handle_rx_console(struct uart_port *uport, u32 bytes, bool drop) @@ -698,6 +697,7 @@ static void qcom_geni_serial_cancel_tx_cmd(struct uart_port *uport) writel(M_CMD_CANCEL_EN, uport->membase + SE_GENI_M_IRQ_CLEAR); port->tx_remaining = 0; + port->tx_queued = 0; } static void qcom_geni_serial_handle_rx_fifo(struct uart_port *uport, bool drop) @@ -924,6 +924,7 @@ static void qcom_geni_serial_handle_tx_fifo(struct uart_port *uport, if (!port->tx_remaining) { qcom_geni_serial_setup_tx(uport, pending); port->tx_remaining = pending; + port->tx_queued = 0; irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN); if (!(irq_en & M_TX_FIFO_WATERMARK_EN)) @@ -932,6 +933,7 @@ static void qcom_geni_serial_handle_tx_fifo(struct uart_port *uport, } qcom_geni_serial_send_chunk_fifo(uport, chunk); + port->tx_queued += chunk; /* * The tx fifo watermark is level triggered and latched. Though we had
The Qualcomm serial console implementation is broken and can lose characters when the serial port is also used for tty output. Specifically, the console code only waits for the current tx command to complete when all data has already been written to the fifo. When there are on-going longer transfers this often means that console output is lost when the console code inadvertently "hijacks" the current tx command instead of starting a new one. This can, for example, be observed during boot when console output that should have been interspersed with init output is truncated: [ 9.462317] qcom-snps-eusb2-hsphy fde000.phy: Registered Qcom-eUSB2 phy [ OK ] Found device KBG50ZNS256G KIOXIA Wi[ 9.471743ndows. [ 9.539915] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller Add a new state variable to track how much data has been written to the fifo and use it to determine when the fifo and shift register are both empty. This is needed since there is currently no other known way to determine when the shift register is empty. This in turn allows the console code to interrupt long transfers without losing data. Note that the oops-in-progress case is similarly broken as it does not cancel any active command and also waits for the wrong status flag when attempting to drain the fifo (TX_FIFO_NOT_EMPTY_EN is only set when cancelling a command leaves data in the fifo). Fixes: c4f528795d1a ("tty: serial: msm_geni_serial: Add serial driver support for GENI based QUP") Fixes: a1fee899e5be ("tty: serial: qcom_geni_serial: Fix softlock") Fixes: 9e957a155005 ("serial: qcom-geni: Don't cancel/abort if we can't get the port lock") Cc: stable@vger.kernel.org # 4.17 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> --- drivers/tty/serial/qcom_geni_serial.c | 48 ++++++++++++++------------- 1 file changed, 25 insertions(+), 23 deletions(-)