X-Patchwork-Id: 8092491
Date: Fri, 22 Jan 2016 20:23:43 +0100
From: Marek Marczykowski-Górecki
To: Joao Martins
Message-ID: <20160122192343.GL31058@mail-itl>
In-Reply-To: <56A0CF78.1090901@oracle.com>
Cc:
netdev@vger.kernel.org, xen-devel, Annie Li, David Vrabel, Boris Ostrovsky
Subject: Re: [Xen-devel] xen-netfront crash when detaching network while some network activity

On Thu, Jan 21, 2016 at 12:30:48PM +0000, Joao Martins wrote:
>
>
> On 01/20/2016 09:59 PM, Konrad Rzeszutek Wilk wrote:
> > On Tue, Dec 01, 2015 at 11:32:58PM +0100, Marek Marczykowski-Górecki wrote:
> >> On Tue, Dec 01, 2015 at 05:00:42PM -0500, Konrad Rzeszutek Wilk wrote:
> >>> On Tue, Nov 17, 2015 at 03:45:15AM +0100, Marek Marczykowski-Górecki wrote:
> >>>> On Wed, Oct 21, 2015 at 08:57:34PM +0200, Marek Marczykowski-Górecki wrote:
> >>>>> On Wed, May 27, 2015 at 12:03:12AM +0200, Marek Marczykowski-Górecki wrote:
> >>>>>> On Tue, May 26, 2015 at 11:56:00AM +0100, David Vrabel wrote:
> >>>>>>> On 22/05/15 12:49, Marek Marczykowski-Górecki wrote:
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> I'm experiencing xen-netfront crash when doing xl network-detach while
> >>>>>>>> some network activity is going on at the same time. It happens only when
> >>>>>>>> domU has more than one vcpu. Not sure if this matters, but the backend
> >>>>>>>> is in another domU (not dom0). I'm using Xen 4.2.2. It happens on kernel
> >>>>>>>> 3.9.4 and 4.1-rc1 as well.
> >>>>>>>>
> >>>>>>>> Steps to reproduce:
> >>>>>>>> 1. Start the domU with some network interface
> >>>>>>>> 2. Call there 'ping -f some-IP'
> >>>>>>>> 3. Call 'xl network-detach NAME 0'
> >>>
> >>> Do you see this all the time or just on occassions?
> >>
> >> Using above procedure - all the time.
> >>
> >>> I tried to reproduce it and couldn't see it. Is your VM an PV or HVM?
> >>
> >> PV, started by libvirt. This may have something to do, the problem didn't
> >> existed on older Xen (4.1) and started by xl. I'm not sure about kernel
> >> version there, but I think I've tried there 3.18 too, which has this
> >> problem.
> >>
> >> But I don't see anything special in domU config file (neither backend
> >> nor frontend) - it may be some libvirt default. If that's really the
> >> cause. Can I (and how) get any useful information about that?
> >
> > libvirt naturally does some libxl calls, and they may be different.
> >
> > Any chance you could give me an idea of:
> >  - What commands you use in libvirt?
> >  - Do you use a bond or bridge?
> >  - What version of libvirt you are using?
> >
> > Thanks!
> > CC-ing Joao just in case he has seen this.
> >>
> Hm, So far I couldn't reproduce the issue with upstream Xen/linux/libvirt, using
> both libvirt or plain xl (both on a bridge setup) and also irrespective of the
> both load and direction of traffic (be it a ping flood, pktgen with min.
> sized packets or iperf).

I've run the test again, on vanilla 4.4, and collected some info:
 - xenstore dump of frontend (xs-frontend-before.txt)
 - xenstore dump of backend (xs-backend-before.txt)
 - kernel messages (console output) (console.log)
 - kernel config (config-4.4)
 - libvirt config of that domain (netdebug.conf)

Versions:
 - kernel 4.4 (frontend), 4.2.8 (backend)
 - libvirt 1.2.20
 - xen 4.6.0

In the backend domain there is no bridge or anything like that - only
routing. The same in the frontend - nothing fancy - just an IP set on
eth0 there.

Steps to reproduce were the same:
 - start frontend domain (virsh create ...)
 - call ping -f
 - xl network-detach NAME 0

Note that the crash doesn't happen with the attached patch applied (as
noted in my mail on Oct 21), but I have no idea whether it is a proper
fix, or just prevents the crash by coincidence.

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index f821a97..a5efbb0 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1065,9 +1069,10 @@ static void xennet_release_tx_bufs(struct netfront_queue *queue)
 
 		skb = queue->tx_skbs[i].skb;
 		get_page(queue->grant_tx_page[i]);
-		gnttab_end_foreign_access(queue->grant_tx_ref[i],
-					  GNTMAP_readonly,
-			(unsigned long)page_address(queue->grant_tx_page[i]));
+		gnttab_end_foreign_access_ref(
+			queue->grant_tx_ref[i], GNTMAP_readonly);
+		gnttab_release_grant_reference(
+			&queue->gref_tx_head, queue->grant_tx_ref[i]);
 		queue->grant_tx_page[i] = NULL;
 		queue->grant_tx_ref[i] = GRANT_INVALID_REF;
 		add_id_to_freelist(&queue->tx_skb_freelist, queue->tx_skbs, i);
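
To make the bookkeeping difference between the two variants easier to
discuss, here is a minimal userspace sketch. It is not kernel code and
not a diagnosis of the crash. Every toy_* name is hypothetical. It
assumes that the old single call also drops one page reference once the
grant is revoked, that the two calls in the patch leave the page count
alone, and that the skb (holding one page reference) is freed right
after the lines shown in the hunk. Those assumptions should be checked
against drivers/xen/grant-table.c for the kernel in question.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; every toy_* name is hypothetical, not a kernel API. */
struct toy_page {
	int refcount;		/* stands in for the page reference count */
};

struct toy_grant {
	bool granted;		/* backend may still access the page */
	bool on_freelist;	/* reference returned to the free pool */
};

static void toy_get_page(struct toy_page *p)
{
	p->refcount++;
}

static void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);	/* catch an unbalanced put */
	p->refcount--;
}

/* Assumed shape of the old call: revoke the grant, recycle the
 * reference, and drop one page reference in a single step. */
static void toy_end_access_and_put(struct toy_grant *g, struct toy_page *p)
{
	g->granted = false;
	g->on_freelist = true;
	toy_put_page(p);
}

/* Assumed shape of the two calls used in the patch: revoke the grant,
 * then recycle the reference; the page count is not touched. */
static void toy_end_access_ref(struct toy_grant *g)
{
	g->granted = false;
}

static void toy_release_ref(struct toy_grant *g)
{
	g->on_freelist = true;
}

/* Releases one TX slot and returns the page refcount left afterwards. */
int toy_release_slot(bool patched)
{
	struct toy_page page = { .refcount = 1 };	/* held via the skb */
	struct toy_grant grant = { .granted = true, .on_freelist = false };

	toy_get_page(&page);		/* mirrors get_page() in the hunk */
	if (patched) {
		toy_end_access_ref(&grant);
		toy_release_ref(&grant);
	} else {
		toy_end_access_and_put(&grant, &page);
	}
	toy_put_page(&page);		/* assumed: skb freed afterwards */

	assert(!grant.granted && grant.on_freelist);
	return page.refcount;
}
```

Under these assumptions the unpatched sequence ends with the count at
zero, while the patched sequence leaves one reference outstanding
because the get_page() is no longer balanced. If the real code behaves
the same way, that extra reference is one thing worth checking when
deciding whether the patch is a proper fix.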