From patchwork Sun Jan 17 15:16:43 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Andrew Cooper
X-Patchwork-Id: 8050751
To: Håkon Alstadheim
References: <5698D0C1.6000808@alstadheim.priv.no> <5698D297.8030700@citrix.com> <569BAA4A.8010501@alstadheim.priv.no>
From: Andrew Cooper
Message-ID: <569BB05B.1000801@citrix.com>
Date: Sun, 17 Jan 2016 15:16:43 +0000
In-Reply-To: <569BAA4A.8010501@alstadheim.priv.no>
Subject: Re: [Xen-devel] [BUG] Assertion '(sp == 0) || (peoi[sp-1].vector < vector)' failed at irq.c:1163
List-Id: Xen developer discussion

On 17/01/16 14:50, Håkon Alstadheim wrote:
> On 15 Jan 2016 12:05, Andrew Cooper wrote:
>> On 15/01/16 10:58, Håkon Alstadheim wrote:
>>> CPUINFO:
>>> vendor_id : GenuineIntel
>>> cpu family : 6
>>> model : 63
>>> model name : Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
>>>
>>> # smbios-sys-info
>>> Libsmbios version: 2.2.28
>>> Product Name: Z10PE-D8 WS
>>> Vendor: ASUSTeK COMPUTER INC.
>>> BIOS Version: 3101
>>>
>>>
>>> I have been experiencing issues with domains with passed through PCIe
>>> devices since I first installed xen. Then at version 4.5.x , I'm now
>>> at 4.6.0 with gentoo patches. Crashes SEEM mostly related to this pci
>>> pass through and interrupts (usb-cards, sound cards).
>>>
>>> Recently the system has been more stable, whether it is because I pass
>>> through as few things as possible, or because of improvements in Xen I
>>> do not know. I have also taken to building with debug, which leads to
>>> more abrupt but less mysterious failures. Earlier (w/o debug and under
>>> xen 4.5 ) stuff would just gradually stop working and end up in total
>>> hang of everything. So, hey, things are improving :-b
>> This isn't the first time we have seen this on Haswell processors. Do
>> you have microcode loading set up?
>>
>> ~Andrew
>>
> Still happening with kernel-genkernel-x86_64-4.1.15-gentoo and updated
> cpu microcode, using microcode from 20151106.

Ok - I previously investigated this issue, but my repro evaporated from
under my feet with a firmware update, and I never got to the bottom of it.

Please can you start with the following patch, which will dump the
contents of the pending EOI stack when the assertion is about to fail.

---8<---
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 1228568..588b562 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1165,6 +1165,13 @@ static void __do_IRQ_guest(int irq)
         if ( action->ack_type == ACKTYPE_EOI )
         {
             sp = pending_eoi_sp(peoi);
+            if ( unlikely(!((sp == 0) || (peoi[sp-1].vector < vector))) )
+            {
+                int p;
+                for ( p = sp; p > 0; --p )
+                    printk("**peoi[%d] = {%d, 0x%02x, %d}\n",
+                           p-1, peoi[p-1].irq, peoi[p-1].vector, peoi[p-1].ready);
+            }
             ASSERT((sp == 0) || (peoi[sp-1].vector < vector));
             ASSERT(sp < (NR_DYNAMIC_VECTORS-1));
             peoi[sp].irq = irq;
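
For anyone following along without a Xen tree to hand, here is a rough
userspace sketch of what the extra check does. It is not the hypervisor
code: struct pending_eoi_demo, MAX_PENDING and the three helper functions
are names made up for this illustration, and printf() stands in for
printk(). The only part taken from the patch is the condition itself,
(sp == 0) || (peoi[sp-1].vector < vector), i.e. a new entry may only be
pushed if its vector is strictly higher than the one on top of the stack.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical capacity; stands in for NR_DYNAMIC_VECTORS in the patch. */
#define MAX_PENDING 8

/* Hypothetical stand-in for an entry of the pending EOI stack. */
struct pending_eoi_demo {
    int irq;
    unsigned int vector;
    int ready;
};

/* Same condition as the ASSERT(): empty stack, or strictly ascending. */
static int eoi_order_ok(const struct pending_eoi_demo *peoi, int sp,
                        unsigned int vector)
{
    return (sp == 0) || (peoi[sp - 1].vector < vector);
}

/* Print the stack from top to bottom; returns the number of entries. */
static int dump_pending_eoi(const struct pending_eoi_demo *peoi, int sp)
{
    int p, printed = 0;

    for ( p = sp; p > 0; --p )
    {
        printf("**peoi[%d] = {%d, 0x%02x, %d}\n",
               p - 1, peoi[p - 1].irq, peoi[p - 1].vector, peoi[p - 1].ready);
        printed++;
    }
    return printed;
}

/*
 * Push an entry. Returns the new stack pointer, or -1 (after dumping the
 * stack) if the ordering condition would be violated.
 */
static int push_pending_eoi(struct pending_eoi_demo *peoi, int sp,
                            int irq, unsigned int vector)
{
    assert(sp < MAX_PENDING - 1);

    if ( !eoi_order_ok(peoi, sp, vector) )
    {
        dump_pending_eoi(peoi, sp);
        return -1;
    }

    peoi[sp].irq = irq;
    peoi[sp].vector = vector;
    peoi[sp].ready = 0;
    return sp + 1;
}
```

Pushing vectors 0x30 and then 0x40 succeeds; trying to push 0x38 on top of
that makes push_pending_eoi() dump both entries and return -1, which is
the situation the debug patch is meant to capture in the crash log.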