From patchwork Thu Dec 14 04:57:54 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jia He
X-Patchwork-Id: 10111441
Subject: Re: [PATCH] KVM: arm/arm64: don't set vtimer->cnt_ctl in kvm_arch_timer_handler
To: Christoffer Dall
References: <1513148407-2611-1-git-send-email-hejianet@gmail.com> <20171213091803.GQ910@cbox>
From: Jia He
Date: Thu, 14 Dec 2017 12:57:54 +0800
In-Reply-To: <20171213091803.GQ910@cbox>
Cc: Marc Zyngier, Jia He, kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org

Hi Christoffer,

I have tried your newer level-mapped-v7 branch, but the bug is still there.

There is no special load on either the host or the guest. The guest
(kernel 4.14) often hangs while booting.

The guest kernel log:

[ OK ] Reached target Remote File Systems.
Starting File System Check on /dev/mapper/fedora-root...
[ OK ] Started File System Check on /dev/mapper/fedora-root.
Mounting /sysroot...
[ 2.670764] SGI XFS with ACLs, security attributes, no debug enabled
[ 2.678180] XFS (dm-0): Mounting V5 Filesystem
[ 2.740364] XFS (dm-0): Ending clean mount
[ OK ] Mounted /sysroot.
[ OK ] Reached target Initrd Root File System.
Starting Reload Configuration from the Real Root...
[ 61.288215] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 61.290791] 1-...!: (0 ticks this GP) idle=574/0/0 softirq=5/5 fqs=1
[ 61.293664] (detected by 0, t=6002 jiffies, g=-263, c=-264, q=39760)
[ 61.296480] Task dump for CPU 1:
[ 61.297938] swapper/1 R running task 0 0 1 0x00000020
[ 61.300643] Call trace:
[ 61.301260] __switch_to+0x6c/0x78
[ 61.302095] cpu_number+0x0/0x8
[ 61.302867] rcu_sched kthread starved for 6000 jiffies! g18446744073709551353 c18446744073709551352 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1
[ 61.305941] rcu_sched I 0 8 2 0x00000020
[ 61.307250] Call trace:
[ 61.307854] __switch_to+0x6c/0x78
[ 61.308693] __schedule+0x268/0x8f0
[ 61.309545] schedule+0x2c/0x88
[ 61.310325] schedule_timeout+0x84/0x3b8
[ 61.311278] rcu_gp_kthread+0x4d4/0x7d8
[ 61.312213] kthread+0x134/0x138
[ 61.313001] ret_from_fork+0x10/0x1c

Maybe my previous patch was not good enough; thanks for your comments.

I dug into it further. Do you think the code sequence below could be
problematic?

vtimer_save_state           (vtimer->loaded = false, cntv_ctl is 0)
kvm_arch_timer_handler      (reads cntv_ctl and sets vtimer->cnt_ctl = 0)
vtimer_restore_state        (writes vtimer->cnt_ctl to cntv_ctl, so cntv_ctl
                             stays 0 forever)

If the above analysis is reasonable, how about the patch below? It is
already tested on my arm64 server.
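
To illustrate the sequence above, here is a minimal user-space sketch in C.
It is only a model under my assumptions, not the kernel code: struct
vtimer_model, hw_cntv_ctl, the model_* helpers and run_sequence() are
hypothetical names, and the save path is simplified to "copy the register,
then write 0 to it".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CTL_ENABLE 0x1u /* CNTV_CTL ENABLE bit (bit 0) */

/* Hypothetical, simplified stand-in for the per-vcpu vtimer context. */
struct vtimer_model {
    bool loaded;      /* guest timer state is live in the hardware register */
    uint32_t cnt_ctl; /* saved copy of the guest's CNTV_CTL */
    bool irq_level;   /* interrupt line level already reported to the guest */
};

/* Models the CNTV_CTL_EL0 hardware register. */
static uint32_t hw_cntv_ctl;

/* Save path: keep the guest value, then disable the timer in "hardware". */
static void model_save(struct vtimer_model *vt)
{
    assert(vt->loaded);
    vt->cnt_ctl = hw_cntv_ctl;
    hw_cntv_ctl = 0;
    vt->loaded = false;
}

/*
 * Interrupt handler: without the guard it re-reads the register even when
 * the state is not loaded, so it overwrites the saved value with 0.
 */
static void model_handler(struct vtimer_model *vt, bool check_loaded)
{
    if (check_loaded && !vt->loaded)
        return;
    if (!vt->irq_level)
        vt->cnt_ctl = hw_cntv_ctl;
}

/* Restore path: write the saved guest value back to "hardware". */
static void model_restore(struct vtimer_model *vt)
{
    assert(!vt->loaded);
    hw_cntv_ctl = vt->cnt_ctl;
    vt->loaded = true;
}

/* Runs save -> handler -> restore and returns the final register value. */
uint32_t run_sequence(bool check_loaded)
{
    struct vtimer_model vt = { .loaded = true, .cnt_ctl = 0, .irq_level = false };

    hw_cntv_ctl = CTL_ENABLE; /* the guest had its timer enabled */
    model_save(&vt);
    model_handler(&vt, check_loaded);
    model_restore(&vt);
    return hw_cntv_ctl;
}
```

In this model, run_sequence(false) leaves the register at 0, so the timer
stays disabled after restore, while run_sequence(true), which mirrors a
vtimer->loaded test in the handler, keeps the ENABLE bit across the
save/handler/restore round trip.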

Cheers,
Jia

On 12/13/2017 5:18 PM, Christoffer Dall wrote:
> On Tue, Dec 12, 2017 at 11:00:07PM -0800, Jia He wrote:
>> In our Armv8a server (qualcomm Amberwing, non VHE), after applying
>> Christoffer's timer optimizing patchset(Optimize arch timer register
>> handling), the guest is hang during kernel booting.
>>
>> The error root cause might be as follows:
>> 1. in kvm_arch_timer_handler, it reset vtimer->cnt_ctl with current
>> cntv_ctl register value. And then it missed some cases to update timer's
>> irq (irq.level) when kvm_timer_irq_can_fire() is false
> Why should it set the irq level to true when the timer cannot fire?
>
>> 2. It causes kvm_vcpu_check_block return 0 instead of -EINTR
>>     kvm_vcpu_check_block
>>         kvm_cpu_has_pending_timer
>>             kvm_timer_is_pending
>>                 kvm_timer_should_fire
>> 3. Thus, the kvm hyp code can not break the loop in kvm_vcpu_block (halt
>> poll process) and the guest is hang forever
> This is just a polling loop which will expire after some time, so it
> shouldn't halt the guest indefinitely, but merely slow it down for some
> while, if we have a bug. Is that the behavior you're seeing or are you
> seeing the guest coming to a complete halt?
>
>> Fixes: b103cc3f10c0 ("KVM: arm/arm64: Avoid timer save/restore in vcpu entry/exit")
>> Signed-off-by: Jia He
>> ---
>>  virt/kvm/arm/arch_timer.c | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>> index f9555b1..bb86433 100644
>> --- a/virt/kvm/arm/arch_timer.c
>> +++ b/virt/kvm/arm/arch_timer.c
>> @@ -100,7 +100,6 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
>>          vtimer = vcpu_vtimer(vcpu);
>>
>>          if (!vtimer->irq.level) {
>> -                vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
> This fix is clearly not correct, as it would prevent forwarding timer
> interrupts in some cases.
>
>>                  if (kvm_timer_irq_can_fire(vtimer))
>>                          kvm_timer_update_irq(vcpu, true, vtimer);
>>          }
>> --
>> 2.7.4
>>
> I actually don't see how the above scenario you painted can happen.
>
> If you're in the polling loop, that means that the timer state is loaded
> on the vcpu, and that means you can take interrupts from the timer, and
> when you take interrupts, you will set the irq.level.
>
> And here's the first bit of logic in kvm_timer_is_pending():
>
>     if (vtimer->irq.level || ptimer->irq.level)
>         return true;
>
> So that would break the loop.
>
> I'm not able to reproduce on my side with a non-VHE platform.
>
> What is the workload you're running to reproduce this, and what is the
> exact kernel tree and kernel configuration you're using?
>
> Thanks,
> -Christoffer
>
>

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index f9555b1..ee6dd3f 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -99,7 +99,7 @@ static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
         }
         vtimer = vcpu_vtimer(vcpu);
 
-        if (!vtimer->irq.level) {
+        if (vtimer->loaded && !vtimer->irq.level) {
                 vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
                 if (kvm_timer_irq_can_fire(vtimer))
                         kvm_timer_update_irq(vcpu, true, vtimer);