From patchwork Wed Jan 11 09:37:50 2023
X-Patchwork-Submitter: David Woodhouse
X-Patchwork-Id: 13096352
Message-ID: <03f0a9ddf3db211d969ff4eb4e0aeb8789683776.camel@infradead.org>
Subject: [PATCH 2/3] KVM: x86/xen: Fix
 potential deadlock in kvm_xen_update_runstate_guest()
From: David Woodhouse
To: Paolo Bonzini, paul, Sean Christopherson
Cc: kvm, Peter Zijlstra, Michal Luczaj
Date: Wed, 11 Jan 2023 09:37:50 +0000
In-Reply-To: <99b1da6ca8293b201fe0a89fd973a9b2f70dc450.camel@infradead.org>
References: <99b1da6ca8293b201fe0a89fd973a9b2f70dc450.camel@infradead.org>
X-Mailing-List: kvm@vger.kernel.org

From: David Woodhouse

The kvm_xen_update_runstate_guest() function can be called when the vCPU
is being scheduled out, from a preempt notifier. It *opportunistically*
updates the runstate area in the guest memory, if the gfn_to_pfn_cache
which caches the appropriate address is still valid.

If there is *contention* when it attempts to obtain gpc->lock, then the
locking inside the priority inheritance checks may cause a deadlock.
Lockdep reports:

[13890.148997] Chain exists of:
                 &gpc->lock --> &p->pi_lock --> &rq->__lock

[13890.149002]  Possible unsafe locking scenario:

[13890.149003]        CPU0                    CPU1
[13890.149004]        ----                    ----
[13890.149005]   lock(&rq->__lock);
[13890.149007]                                lock(&p->pi_lock);
[13890.149009]                                lock(&rq->__lock);
[13890.149011]   lock(&gpc->lock);
[13890.149013]
                *** DEADLOCK ***

In the general case, if there's contention for a read lock on gpc->lock,
that's going to be because something else is either invalidating or
revalidating the cache. Either way, we've raced with seeing it in an
invalid state, in which case we would have aborted the opportunistic
update anyway. So in the 'atomic' case when called from the preempt
notifier, just switch to using read_trylock() and avoid the PI handling
altogether.
Signed-off-by: David Woodhouse
---
First patch in this series was '[PATCH] KVM: x86/xen: Fix lockdep
warning on "recursive" gpc locking' but now there are three.

 arch/x86/kvm/xen.c | 20 ++++++++++++++++----
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 07e61cc9881e..c444948ab1ac 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -272,7 +272,15 @@ static void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, bool atomic)
 	 * Attempt to obtain the GPC lock on *both* (if there are two)
 	 * gfn_to_pfn caches that cover the region.
 	 */
+retry:
+	local_irq_save(flags);
+	if (!read_trylock(&gpc1->lock)) {
+		if (atomic) {
+			local_irq_restore(flags);
+			return;
+		}
+		read_lock(&gpc1->lock);
+	}
-	read_lock_irqsave(&gpc1->lock, flags);
 	while (!kvm_gpc_check(gpc1, user_len1)) {
 		read_unlock_irqrestore(&gpc1->lock, flags);
 
@@ -283,7 +291,7 @@ static void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, bool atomic)
 		if (kvm_gpc_refresh(gpc1, user_len1))
 			return;
 
-		read_lock_irqsave(&gpc1->lock, flags);
+		goto retry;
 	}
 
 	if (likely(!user_len2)) {
@@ -309,7 +317,13 @@ static void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, bool atomic)
 		 * gpc1 lock to make lockdep shut up about it.
 		 */
 		lock_set_subclass(&gpc1->lock.dep_map, 1, _THIS_IP_);
-		read_lock(&gpc2->lock);
+		if (!read_trylock(&gpc2->lock)) {
+			if (atomic) {
+				read_unlock_irqrestore(&gpc1->lock, flags);
+				return;
+			}
+			read_lock(&gpc2->lock);
+		}
 
 		if (!kvm_gpc_check(gpc2, user_len2)) {
 			read_unlock(&gpc2->lock);