From patchwork Fri Dec 29 09:50:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Marek_Marczykowski-G=C3=B3recki?= X-Patchwork-Id: 13506420 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 21FD6C46CD4 for ; Fri, 29 Dec 2023 09:51:07 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.660247.1029808 (Exim 4.92) (envelope-from ) id 1rJ9W3-0002gM-8S; Fri, 29 Dec 2023 09:50:55 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 660247.1029808; Fri, 29 Dec 2023 09:50:55 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1rJ9W3-0002gF-5l; Fri, 29 Dec 2023 09:50:55 +0000 Received: by outflank-mailman (input) for mailman id 660247; Fri, 29 Dec 2023 09:50:53 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1rJ9W1-0002ft-6a for xen-devel@lists.xenproject.org; Fri, 29 Dec 2023 09:50:53 +0000 Received: from out2-smtp.messagingengine.com (out2-smtp.messagingengine.com [66.111.4.26]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id be9e17d2-a62f-11ee-9b0f-b553b5be7939; Fri, 29 Dec 2023 10:50:50 +0100 (CET) Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailout.nyi.internal (Postfix) with ESMTP id 1C1EE5C013B for ; Fri, 29 Dec 2023 04:50:48 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Fri, 29 Dec 2023 04:50:48 -0500 Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Fri, 29 Dec 2023 04:50:47 -0500 (EST) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: be9e17d2-a62f-11ee-9b0f-b553b5be7939 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= invisiblethingslab.com; h=cc:content-type:content-type:date:date :from:from:in-reply-to:message-id:mime-version:reply-to:subject :subject:to:to; s=fm2; t=1703843448; x=1703929848; bh=SgYlradmGw /uy51fJCWRH4SKeV000a0rZhH6j/UlL/M=; b=rKwpLUjBgJBtIlCHN6l6z1GcPM BuR9/KumSptfte0BZqqHNhGuFmIev4dEHbhBUaHXR3wXYn1R4kO0qqXAcOJQTKPL ZX16oRoq6AVlATpX4gQIOrhGmjkdeJYAgqDxm8v8jEM33sI3GWRE+77Zpi1PR9l2 TDSbKHdrwvYpCzANkxFUkZSpR7FmFiNs2rO+C0R8xPhTmFGlRFHk8t4f3zTL4iAP 5tYZQ6io2eKw32yc/d18BmuwIziWKH6DIrz+HmnkW8IbrPij5Y4LVUUDh8YWUS0g 5PiI4gRLS9KginPKaG+CqSHNCgA4fPy6noC50LV74DccKmbGQR3PrmjggTYg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:message-id :mime-version:reply-to:subject:subject:to:to:x-me-proxy :x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t= 1703843448; x=1703929848; bh=SgYlradmGw/uy51fJCWRH4SKeV000a0rZhH 6j/UlL/M=; b=1bUDhoLBmwSmNS0/IL0dluuBZCsDwWiT6UY1Akgwsr9m6uoZnM+ /lNbSx1mEKSN2+mU0SNprtth3MB8dd2OjcrxUEaSsfhz+hsLBoy8jCV2jHuvMWvL fYUvIgl+oQid59C91yDHIzwnipG0nnd2wz3r5FDwHzEtE4dXTAkMUs/2m3gSG4J9 yCjeFL0f06MfAgc68pyrw8il10Wr7lMjrd24BZUPfWwVC4k5h3rnHwF4c94sWT5m UO3GqKxw+h1LUCr4EKXig5jl60/zfKlDSWDl/Y5cZ6OXCWY9Okgg35jiMFhN1BdS KKq/P9ttEw/1jX3o+AKpgKas3ca2LMPbzXg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrvdeffedgtdelucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkgggtugesghdtreertd dtjeenucfhrhhomhepofgrrhgvkhcuofgrrhgtiiihkhhofihskhhiqdfikphrvggtkhhi uceomhgrrhhmrghrvghksehinhhvihhsihgslhgvthhhihhnghhslhgrsgdrtghomheqne cuggftrfgrthhtvghrnhepvddviefgjefhvddvjeduueeihfdvvdffjeevieektdejueei ffetffefieethfefnecuffhomhgrihhnpehquhgsvghsqdhoshdrohhrghenucevlhhush htvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehmrghrmhgrrhgvkhes ihhnvhhishhisghlvghthhhinhhgshhlrggsrdgtohhm X-ME-Proxy: Feedback-ID: i1568416f:Fastmail Date: Fri, 29 Dec 2023 10:50:43 +0100 From: Marek =?utf-8?q?Marczykowski-G=C3=B3recki?= To: xen-devel Subject: Serial console stuck during boot, unblocked with xl debug-key Message-ID: MIME-Version: 1.0 Content-Disposition: inline Hi, This is continuation from matrix chat. There is an occasional failure on qubes-hw2 gitlab runner that console become stuck during boot. I can now reproduce it _much_ more often on another system, and the serial console output ends with: (XEN) Allocated console ring of 256 KiB. (XEN) Using HWP for cpufreq (XEN) mwait-idle: does not run on family 6 It should be: (XEN) Allocated console ring of 256 KiB. (XEN) Using HWP for cpufreq (XEN) mwait-idle: does not run on family 6 model 183 (XEN) VMX: Supported advanced features: (XEN) - APIC MMIO access virtualisation (XEN) - APIC TPR shadow ... Otherwise the system works perfectly fine, the logs are available in full via `xl dmesg` etc. Doing (any?) `xl debug-key` unblocks the console and missing logs gets dumped there too. I narrowed it down to the serial console tx buffer and collected some info with the attacked patch (it collects info still during boot, after the place where it usually breaks). When it works, I get: (XEN) SERIAL DEBUG: txbufc: 0x5b5, txbufp: 0x9f7, uart intr_works: 1, serial_txbufsz: 0x4000, tx_ready: 0, lsr_mask: 0x20, msi: 0, io_size: 8, skipped_interrupts: 0 And when it breaks, I get: (XEN) SERIAL DEBUG: txbufc: 0x70, txbufp: 0x9fd, uart intr_works: 1, serial_txbufsz: 0x4000, tx_ready: 16, lsr_mask: 0x20, msi: 0, io_size: 8, skipped_interrupts: 0 So, I haven't found yet why it stops sending data. I'll continue adding some debug prints etc, but if anyone has some ideas what is going on, I'd appreciate hints. Full `xl dmesg` of a broken case can be seen here: https://openqa.qubes-os.org/tests/89100/logfile?filename=serial_terminal.txt Similarly, example working case is here: https://openqa.qubes-os.org/tests/89099/logfile?filename=serial0.txt diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c index 5bbed3a36a..39ad935266 100644 --- a/xen/arch/x86/pv/dom0_build.c +++ b/xen/arch/x86/pv/dom0_build.c @@ -354,6 +354,8 @@ static struct page_info * __init alloc_chunk(struct domain *d, return page; } +extern void serial_debug(void); + int __init dom0_construct_pv(struct domain *d, const module_t *image, unsigned long image_headroom, @@ -412,6 +414,7 @@ int __init dom0_construct_pv(struct domain *d, /* Machine address of next candidate page-table page. */ paddr_t mpt_alloc; + serial_debug(); printk(XENLOG_INFO "*** Building a PV Dom%d ***\n", d->domain_id); d->max_pages = ~0U; diff --git a/xen/drivers/char/serial.c b/xen/drivers/char/serial.c index 324024c29a..d6a4d62a07 100644 --- a/xen/drivers/char/serial.c +++ b/xen/drivers/char/serial.c @@ -71,6 +71,80 @@ void serial_rx_interrupt(struct serial_port *port, struct cpu_user_regs *regs) (*fn)(c & 0x7f, regs); } +#include +#include +#include +#include +#include +#include +#include +#include +#ifdef NS16550_PCI +#include +#include +#include +#endif +#include +#include +#include +#ifdef CONFIG_HAS_DEVICE_TREE +#include +#endif +#ifdef CONFIG_X86 +#include +#endif + +#define NS16550_PCI + +struct ns16550 { + int baud, clock_hz, data_bits, parity, stop_bits, fifo_size, irq; + u64 io_base; /* I/O port or memory-mapped I/O address. */ + u64 io_size; + int reg_shift; /* Bits to shift register offset by */ + int reg_width; /* Size of access to use, the registers + * themselves are still bytes */ + char __iomem *remapped_io_base; /* Remapped virtual address of MMIO. */ + /* UART with IRQ line: interrupt-driven I/O. */ + struct irqaction irqaction; + u8 lsr_mask; +#ifdef CONFIG_ARM + struct vuart_info vuart; +#endif + /* UART with no IRQ line: periodically-polled I/O. */ + struct timer timer; + struct timer resume_timer; + unsigned int timeout_ms; + bool intr_works; + bool dw_usr_bsy; +#ifdef NS16550_PCI + /* PCI card parameters. */ + bool pb_bdf_enable; /* if =1, pb-bdf effective, port behind bridge */ + bool ps_bdf_enable; /* if =1, ps_bdf effective, port on pci card */ + unsigned int pb_bdf[3]; /* pci bridge BDF */ + unsigned int ps_bdf[3]; /* pci serial port BDF */ + u32 bar; + u32 bar64; + u16 cr; + u8 bar_idx; + bool msi; + const struct ns16550_config_param *param; /* Points into .init.*! */ +#endif +}; + +int skipped_interrupts = 0; + +void serial_debug(void) +{ + struct serial_port *port = &com[SERHND_COM1]; + struct ns16550 *uart = port->uart; + printk("SERIAL DEBUG: txbufc: %#x, txbufp: %#x, uart intr_works: %d, serial_txbufsz: %#x, tx_ready: %d, " + "lsr_mask: %#x, msi: %d, io_size: %ld, skipped_interrupts: %d\n", + port->txbufc, port->txbufp, uart->intr_works, + serial_txbufsz, + port->driver->tx_ready(port), + uart->lsr_mask, uart->msi, uart->io_size, skipped_interrupts); +} + void serial_tx_interrupt(struct serial_port *port, struct cpu_user_regs *regs) { int i, n; @@ -85,8 +159,10 @@ void serial_tx_interrupt(struct serial_port *port, struct cpu_user_regs *regs) */ while ( !spin_trylock(&port->tx_lock) ) { - if ( port->driver->tx_ready(port) <= 0 ) + if ( port->driver->tx_ready(port) <= 0 ) { + skipped_interrupts++; goto out; + } cpu_relax(); }