From patchwork Thu Oct 20 07:34:06 2016
X-Patchwork-Submitter: "Xulei (Stone)"
X-Patchwork-Id: 9386137
From: "Xulei (Stone)"
To: Marc-André Lureau
Cc: "Huangweidong \(C\)", "wangxin \(U\)", wangyunjian, qemu-devel, "Gonglei \(Arei\)", marcandre lureau, i maximets, "pbonzini@redhat.com"
Subject: Re: [Qemu-devel] [Problem] qemu crash when vhost_net_start
Date: Thu, 20 Oct 2016 07:34:06 +0000
Message-ID: <8E78D212B8C25246BE4CE7EA0E645FE545837F@SZXEMI504-MBS.china.huawei.com>

>
> Hi
>
> ----- Original Message -----
> > Hi, all
> > Recently, I have a VM with a vhost-user network card created by qemu 2.6.0.
> > Once, I restart OpenVSwitch service
> > and start this VM in the same time.
> > I found qemu may probably crash
> > with following stack:
> >
> > (gdb) bt
> > #0 0x00007f0f9179a5d7 in raise () from /usr/lib64/libc.so.6
> > #1 0x00007f0f9179bcc8 in abort () from /usr/lib64/libc.so.6
> > #2 0x000000000045a202 in kvm_io_ioeventfd_add ()
> > #3 0x000000000045cffc in address_space_add_del_ioeventfds ()
> > #4 0x000000000045fa0e in address_space_update_ioeventfds ()
> > #5 0x0000000000460f40 in memory_region_transaction_commit ()
> > #6 0x0000000000461ce5 in memory_region_add_eventfd ()
> > #7 0x000000000066a1e5 in virtio_pci_set_host_notifier_internal ()
> > #8 0x00000000004ae08a in vhost_dev_enable_notifiers ()
> > #9 0x0000000000492743 in vhost_net_start_one ()
> > #10 0x00000000004930bf in vhost_net_start ()
> > #11 0x000000000048efd4 in virtio_net_vhost_status ()
> > #12 0x000000000048f16a in virtio_net_set_status ()
> > #13 0x0000000000686bcd in qmp_set_link ()
> > #14 0x000000000068dcc3 in net_vhost_user_event ()
> > #15 0x000000000051f043 in tcp_chr_new_client ()
> > #16 0x000000000051f10f in qemu_chr_socket_connected ()
> > #17 0x000000000073cb10 in qio_task_complete ()
> > #18 0x000000000073cb7b in gio_task_thread_result ()
> > #19 0x00007f0f929fb99a in g_main_context_dispatch () from
> > /usr/lib64/libglib-2.0.so.0
> > #20 0x00000000006d2275 in os_host_main_loop_wait ()
> > #21 0x00000000006d2393 in main_loop_wait ()
> > #22 0x000000000052a0f2 in main_loop ()
> > #23 0x000000000041bcd3 in main ()
> >
> > This seems a bug triggering when backend starts vhost_net and
> > meanwhile the frontend rmmod/modprobe virtio-net.
> > Is this a known issue or any patch can fix this?
> >
>
> Thanks for the report.
>
> Could you provide step-by-step instructions on how to reproduce?
>
> (if you could bisect qemu.git that would be also helpful !)
>
> thanks

Thanks for the reply. Your patch "vhost-user: check vhost_user_{read,write}()
return value" and Gonglei's "vhost-user: fix unreasonable return value when
vhost-user read failed" gave me an idea.
QEMU 2.6 has not merged your patch, so vhost_user_init() will get a random
feature value when vhost_user_{write,read}() fails. I think the crash is
related to this, because the following modification makes the problem
reproduce every time (start a VM, then restart openvswitch):

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 1580929..3628382 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -469,6 +469,12 @@ static int vhost_user_get_u64(struct vhost_dev *dev, int request, uint64_t *u64)
         return 0;
     }
 
+    if (request == VHOST_USER_GET_FEATURES) {
+        vhost_user_features_init(u64);
+        return 0;
+    }
+
     if (vhost_user_write(dev, &msg, NULL, 0) < 0) {

So I guess the crash is related to the vhost-user features. I then
experimented to find out which feature matters, and finally found that the
following patch fixes the crash:

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 1580929..e861e8a 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -455,7 +455,8 @@ static int vhost_user_set_protocol_features(struct vhost_dev *dev,
 static void vhost_user_features_init(void *arg)
 {
 #define VIRTIO_NET_F_MRG_RXBUF 15 /* Host can merge receive buffers. */
-    *((__u64 *) arg) = ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | (1ULL << VHOST_F_LOG_ALL));
+    *((__u64 *) arg) = ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | (1ULL << VHOST_F_LOG_ALL)
+                        |(1ULL << VHOST_USER_F_PROTOCOL_FEATURES));
 }
 
 static int vhost_user_get_u64(struct vhost_dev *dev, int request, uint64_t *u64)
@@ -469,6 +470,7 @@ static int vhost_user_get_u64(struct vhost_dev *dev, int request, uint64_t *u64)
     if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
+        if (request == VHOST_USER_GET_FEATURES) {
+            vhost_user_features_init(u64);
+        }
         return 0;
     }

However, I could not figure out why the VHOST_USER_F_PROTOCOL_FEATURES
feature could lead to the crash. I hope the above information helps you tell
me the reason.
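
To make the two experiments above easier to follow outside the QEMU tree, here
is a minimal standalone sketch of the same idea: build a fallback feature mask
from bit numbers and fill it in when the backend cannot be reached, instead of
leaving the output value untouched. This is not the QEMU code. The helpers
fallback_features(), has_feature() and get_features() are hypothetical names
used only for illustration. VIRTIO_NET_F_MRG_RXBUF (15) comes from the patch
above; the values 26 for VHOST_F_LOG_ALL and 30 for
VHOST_USER_F_PROTOCOL_FEATURES are what I believe the Linux vhost header and
the vhost-user protocol define, so please check them against your tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit numbers; 15 is from the patch above, 26 and 30 are assumed values. */
#define VIRTIO_NET_F_MRG_RXBUF          15
#define VHOST_F_LOG_ALL                 26
#define VHOST_USER_F_PROTOCOL_FEATURES  30

/* Hypothetical helper: the default mask used when the backend is unreachable. */
static uint64_t fallback_features(bool with_protocol_features)
{
    uint64_t features = (1ULL << VIRTIO_NET_F_MRG_RXBUF) |
                        (1ULL << VHOST_F_LOG_ALL);

    if (with_protocol_features) {
        features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
    }
    return features;
}

/* Hypothetical helper: test a single feature bit in a 64-bit mask. */
static bool has_feature(uint64_t features, unsigned bit)
{
    assert(bit < 64);
    return ((features >> bit) & 1ULL) != 0;
}

/*
 * Hypothetical stand-in for a feature getter. When the backend cannot be
 * reached, it stores a defined fallback mask in *out rather than returning
 * with *out unmodified.
 */
static int get_features(bool backend_reachable, uint64_t backend_value,
                        uint64_t *out)
{
    assert(out != NULL);

    if (!backend_reachable) {
        *out = fallback_features(true);
        return 0;
    }
    *out = backend_value;
    return 0;
}
```

With these assumed bit numbers, the mask without the protocol-features bit is
0x4008000 and the mask with it is 0x44008000, so the only difference between
the crashing and the non-crashing experiment is bit 30.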