From patchwork Tue Apr 10 15:05:40 2018
X-Patchwork-Submitter: Christoffer Dall
X-Patchwork-Id: 10333323
Date: Tue, 10 Apr 2018 17:05:40 +0200
From: Christoffer Dall
To: Mark Rutland
Cc: Marc Zyngier, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Shannon Zhao
Subject: Re: [PATCH] KVM: arm/arm64: Close VMID generation race
Message-ID: <20180410150540.GK10904@cbox>
References: <20180409170706.23541-1-marc.zyngier@arm.com>
	<20180409205139.GH10904@cbox>
	<20180410105119.yzzzd4lyvlsvtbfy@lakrids.cambridge.arm.com>
In-Reply-To: <20180410105119.yzzzd4lyvlsvtbfy@lakrids.cambridge.arm.com>

On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> On Mon, Apr 09, 2018 at 10:51:39PM +0200, Christoffer Dall wrote:
> > On Mon, Apr 09, 2018 at 06:07:06PM +0100, Marc Zyngier wrote:
> > > Before entering the guest, we check whether our VMID is still
> > > part of the current generation. In order to avoid taking a lock,
> > > we start with checking that the generation is still current, and
> > > only if not current do we take the lock, recheck, and update the
> > > generation and VMID.
> > >
> > > This leaves open a small race: A vcpu can bump up the global
> > > generation number as well as the VM's, but has not updated
> > > the VMID itself yet.
> > >
> > > At that point another vcpu from the same VM comes in, checks
> > > the generation (and finds it not needing anything), and jumps
> > > into the guest. At this point, we end up with two vcpus belonging
> > > to the same VM running with two different VMIDs. Eventually, the
> > > VMID used by the second vcpu will get reassigned, and things will
> > > really go wrong...
> > >
> > > A simple solution would be to drop this initial check, and always take
> > > the lock. This is likely to cause performance issues. A middle ground
> > > is to convert the spinlock to a rwlock, and only take the read lock
> > > on the fast path. If the check fails at that point, drop it and
> > > acquire the write lock, rechecking the condition.
> > >
> > > This ensures that the above scenario doesn't occur.
> > >
> > > Reported-by: Mark Rutland
> > > Signed-off-by: Marc Zyngier
> > > ---
> > > I haven't seen any reply from Shannon, so reposting this to
> > > a slightly wider audience for feedback.
> > >
> > >  virt/kvm/arm/arm.c | 15 ++++++++++-----
> > >  1 file changed, 10 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > > index dba629c5f8ac..a4c1b76240df 100644
> > > --- a/virt/kvm/arm/arm.c
> > > +++ b/virt/kvm/arm/arm.c
> > > @@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
> > >  static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
> > >  static u32 kvm_next_vmid;
> > >  static unsigned int kvm_vmid_bits __read_mostly;
> > > -static DEFINE_SPINLOCK(kvm_vmid_lock);
> > > +static DEFINE_RWLOCK(kvm_vmid_lock);
> > >
> > >  static bool vgic_present;
> > >
> > > @@ -473,11 +473,16 @@ static void update_vttbr(struct kvm *kvm)
> > >  {
> > >  	phys_addr_t pgd_phys;
> > >  	u64 vmid;
> > > +	bool new_gen;
> > >
> > > -	if (!need_new_vmid_gen(kvm))
> > > +	read_lock(&kvm_vmid_lock);
> > > +	new_gen = need_new_vmid_gen(kvm);
> > > +	read_unlock(&kvm_vmid_lock);
> > > +
> > > +	if (!new_gen)
> > >  		return;
> > >
> > > -	spin_lock(&kvm_vmid_lock);
> > > +	write_lock(&kvm_vmid_lock);
> > >
> > >  	/*
> > >  	 * We need to re-check the vmid_gen here to ensure that if another vcpu
> > > @@ -485,7 +490,7 @@ static void update_vttbr(struct kvm *kvm)
> > >  	 * use the same vmid.
> > >  	 */
> > >  	if (!need_new_vmid_gen(kvm)) {
> > > -		spin_unlock(&kvm_vmid_lock);
> > > +		write_unlock(&kvm_vmid_lock);
> > >  		return;
> > >  	}
> > >
> > > @@ -519,7 +524,7 @@ static void update_vttbr(struct kvm *kvm)
> > >  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> > >  	kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
> > >
> > > -	spin_unlock(&kvm_vmid_lock);
> > > +	write_unlock(&kvm_vmid_lock);
> > >  }
> > >
> > >  static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> > > --
> > > 2.14.2
> > >
> >
> > The above looks correct to me. I am wondering if something like the
> > following would also work, which may be slightly more efficient,
> > although I doubt the difference can be measured:
>
> [...]
>
> I think we also need to update kvm->arch.vttbr before updating
> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> old VMID).
>
> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> the critical section, I think that works, modulo using READ_ONCE() and
> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> locklessly.

Indeed, you're right. It would look something like this, then:

It's probably easier to convince ourselves about the correctness of
Marc's code using a rwlock instead, though.

Thoughts?
Thanks,
-Christoffer

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 2e43f9d42bd5..6cb08995e7ff 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
  */
 static bool need_new_vmid_gen(struct kvm *kvm)
 {
-	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
+	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
+	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
+	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
 }
 
 /**
@@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
 		kvm_call_hyp(__kvm_flush_vm_context);
 	}
 
-	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
 	kvm->arch.vmid = kvm_next_vmid;
 	kvm_next_vmid++;
 	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
@@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
 	pgd_phys = virt_to_phys(kvm->arch.pgd);
 	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
 	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
-	kvm->arch.vttbr = pgd_phys | vmid;
+	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
+
+	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
+	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
 
 	spin_unlock(&kvm_vmid_lock);
 }
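
To make the fast-path/slow-path pattern of Marc's patch concrete, the locking
scheme can be modelled as a small standalone userspace program. This is only
an illustrative sketch, not the kernel code: pthread rwlocks and C11 atomics
stand in for the kernel primitives, every name in it (vm_ctx, global_gen,
need_new_gen, update_vmid) is invented for the sketch, and the generation
rollover and TLB flush that the real update_vttbr() performs are elided.

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static atomic_uint_fast64_t global_gen = 1;  /* stands in for kvm_vmid_gen */
static pthread_rwlock_t gen_lock = PTHREAD_RWLOCK_INITIALIZER;

struct vm_ctx {
	uint64_t gen;   /* stands in for kvm->arch.vmid_gen */
	uint64_t vmid;  /* stands in for kvm->arch.vmid */
};

static bool need_new_gen(struct vm_ctx *vm)
{
	return vm->gen != atomic_load(&global_gen);
}

static void update_vmid(struct vm_ctx *vm, uint64_t next_vmid)
{
	bool stale;

	/*
	 * Fast path: take only the read lock to check the generation.
	 * The read lock excludes the write-side update below, so a vcpu
	 * can no longer observe a generation that was bumped before the
	 * VMID itself was rewritten, which is the race in the original.
	 */
	pthread_rwlock_rdlock(&gen_lock);
	stale = need_new_gen(vm);
	pthread_rwlock_unlock(&gen_lock);

	if (!stale)
		return;

	/*
	 * Slow path: the lock was dropped above, so recheck under the
	 * write lock in case another vcpu of this VM already updated it.
	 */
	pthread_rwlock_wrlock(&gen_lock);
	if (need_new_gen(vm)) {
		/* Real code also rolls the generation and flushes here. */
		vm->vmid = next_vmid;                /* new VMID first...    */
		vm->gen = atomic_load(&global_gen);  /* ...then mark current */
	}
	pthread_rwlock_unlock(&gen_lock);
}

int main(void)
{
	struct vm_ctx vm = { .gen = 0, .vmid = 0 };

	update_vmid(&vm, 1);  /* stale generation: allocates the new VMID  */
	update_vmid(&vm, 2);  /* generation now current: fast path, no-op  */
	return 0;
}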
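
The ordering requirement Mark raises (the new VTTBR must become visible
before the generation looks current) is a publish/consume pairing. Below is a
similarly hedged sketch of that pairing, again with invented names
(publish_vttbr, try_use_vttbr), using C11 release/acquire fences in place of
smp_wmb()/smp_rmb() and relaxed atomics in place of WRITE_ONCE()/READ_ONCE().
It models the pairing itself rather than the exact barrier placement of the
diff above.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static atomic_uint_fast64_t global_gen = 1;  /* stands in for kvm_vmid_gen */

struct vm_ctx {
	atomic_uint_fast64_t gen;    /* stands in for kvm->arch.vmid_gen */
	atomic_uint_fast64_t vttbr;  /* stands in for kvm->arch.vttbr */
};

/*
 * Writer side, at the end of the locked critical section: store the new
 * vttbr before making the generation look current, so no reader can see
 * a current generation together with a stale vttbr.
 */
static void publish_vttbr(struct vm_ctx *vm, uint64_t new_vttbr)
{
	atomic_store_explicit(&vm->vttbr, new_vttbr, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);  /* models smp_wmb() */
	atomic_store_explicit(&vm->gen,
			      atomic_load_explicit(&global_gen,
						   memory_order_relaxed),
			      memory_order_relaxed);
}

/*
 * Lockless reader side: if the generation reads as current, the fence
 * pairing guarantees the vttbr load below sees the published value.
 */
static bool try_use_vttbr(struct vm_ctx *vm, uint64_t *vttbr)
{
	uint64_t cur = atomic_load_explicit(&global_gen, memory_order_relaxed);

	if (atomic_load_explicit(&vm->gen, memory_order_relaxed) != cur)
		return false;  /* stale: caller takes the locked slow path */

	atomic_thread_fence(memory_order_acquire);  /* models smp_rmb() */
	*vttbr = atomic_load_explicit(&vm->vttbr, memory_order_relaxed);
	return true;
}

Dropping either fence allows a reader to observe a current generation with a
stale vttbr, which is exactly the failure mode described above; the rwlock
version avoids this reasoning entirely, which supports the point that it is
easier to verify.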