From patchwork Sun Jun  4 11:57:37 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christoffer Dall <cdall@linaro.org>
X-Patchwork-Id: 9764783
Date: Sun, 4 Jun 2017 13:57:37 +0200
From: Christoffer Dall <cdall@linaro.org>
To: Andrew Jones <drjones@redhat.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>, kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
	Eric Auger <eric.auger@redhat.com>
Subject: Re: [PATCH v2 3/3] KVM: arm/arm64: Simplify active_change_prepare
	and plug race
Message-ID: <20170604115737.GA9464@cbox>
References: <20170516100431.4101-1-cdall@linaro.org>
	<20170516100431.4101-4-cdall@linaro.org>
	<9a412ed2-70ea-b651-dd75-18d723ee8b71@arm.com>
	<20170523084329.GC6319@cbox>
	<20170604081557.rhda3xp34mgkf7ow@hawk.localdomain>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <20170604081557.rhda3xp34mgkf7ow@hawk.localdomain>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

On Sun, Jun 04, 2017 at 10:15:57AM +0200, Andrew Jones wrote:
> On Tue, May 23, 2017 at 10:43:29AM +0200, Christoffer Dall wrote:
> > On Mon, May 22, 2017 at 04:30:22PM +0100, Marc Zyngier wrote:
> > > On 16/05/17 11:04, Christoffer Dall wrote:
> > > > We don't need to stop a specific VCPU when changing the active state,
> > > > because private IRQs can only be modified by a running VCPU for the
> > > > VCPU itself and it is therefore already stopped.
> > > > 
> > > > However, it is also possible for two VCPUs to be modifying the active
> > > > > state of SPIs at the same time, which can cause the thread to be stuck
> > > > in the loop that checks other VCPU threads for a potentially very long
> > > > time, or to modify the active state of a running VCPU.  Fix this by
> > > > serializing all accesses to setting and clearing the active state of
> > > > interrupts using the KVM mutex.
> > > > 
> > > > Reported-by: Andrew Jones <drjones@redhat.com>
> > > > Signed-off-by: Christoffer Dall <cdall@linaro.org>
> > > > ---
> > > >  arch/arm/include/asm/kvm_host.h   |  2 --
> > > >  arch/arm64/include/asm/kvm_host.h |  2 --
> > > >  virt/kvm/arm/arm.c                | 20 ++++----------------
> > > >  virt/kvm/arm/vgic/vgic-mmio.c     | 18 ++++++++++--------
> > > >  virt/kvm/arm/vgic/vgic.c          | 11 ++++++-----
> > > >  5 files changed, 20 insertions(+), 33 deletions(-)
> > > > 
> > > > diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> > > > index f0e6657..12274d4 100644
> > > > --- a/arch/arm/include/asm/kvm_host.h
> > > > +++ b/arch/arm/include/asm/kvm_host.h
> > > > @@ -233,8 +233,6 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
> > > >  struct kvm_vcpu __percpu **kvm_get_running_vcpus(void);
> > > >  void kvm_arm_halt_guest(struct kvm *kvm);
> > > >  void kvm_arm_resume_guest(struct kvm *kvm);
> > > > -void kvm_arm_halt_vcpu(struct kvm_vcpu *vcpu);
> > > > -void kvm_arm_resume_vcpu(struct kvm_vcpu *vcpu);
> > > >  
> > > >  int kvm_arm_copy_coproc_indices(struct kvm_vcpu *vcpu, u64 __user *uindices);
> > > >  unsigned long kvm_arm_num_coproc_regs(struct kvm_vcpu *vcpu);
> > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > > index 5e19165..32cbe8a 100644
> > > > --- a/arch/arm64/include/asm/kvm_host.h
> > > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > > @@ -333,8 +333,6 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
> > > >  struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
> > > >  void kvm_arm_halt_guest(struct kvm *kvm);
> > > >  void kvm_arm_resume_guest(struct kvm *kvm);
> > > > -void kvm_arm_halt_vcpu(struct kvm_vcpu *vcpu);
> > > > -void kvm_arm_resume_vcpu(struct kvm_vcpu *vcpu);
> > > >  
> > > >  u64 __kvm_call_hyp(void *hypfn, ...);
> > > >  #define kvm_call_hyp(f, ...) __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__)
> > > > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > > > index 3417e18..3c387fd 100644
> > > > --- a/virt/kvm/arm/arm.c
> > > > +++ b/virt/kvm/arm/arm.c
> > > > @@ -539,27 +539,15 @@ void kvm_arm_halt_guest(struct kvm *kvm)
> > > >  	kvm_make_all_cpus_request(kvm, KVM_REQ_VCPU_EXIT);
> > > >  }
> > > >  
> > > > -void kvm_arm_halt_vcpu(struct kvm_vcpu *vcpu)
> > > > -{
> > > > -	vcpu->arch.pause = true;
> > > > -	kvm_vcpu_kick(vcpu);
> > > > -}
> > > > -
> > > > -void kvm_arm_resume_vcpu(struct kvm_vcpu *vcpu)
> > > > -{
> > > > -	struct swait_queue_head *wq = kvm_arch_vcpu_wq(vcpu);
> > > > -
> > > > -	vcpu->arch.pause = false;
> > > > -	swake_up(wq);
> > > > -}
> > > > -
> > > >  void kvm_arm_resume_guest(struct kvm *kvm)
> > > >  {
> > > >  	int i;
> > > >  	struct kvm_vcpu *vcpu;
> > > >  
> > > > -	kvm_for_each_vcpu(i, vcpu, kvm)
> > > > -		kvm_arm_resume_vcpu(vcpu);
> > > > +	kvm_for_each_vcpu(i, vcpu, kvm) {
> > > > +		vcpu->arch.pause = false;
> > > > +		swake_up(kvm_arch_vcpu_wq(vcpu));
> > > > +	}
> > > >  }
> > > >  
> > > >  static void vcpu_sleep(struct kvm_vcpu *vcpu)
> > > > diff --git a/virt/kvm/arm/vgic/vgic-mmio.c b/virt/kvm/arm/vgic/vgic-mmio.c
> > > > index 64cbcb4..c1e4bdd 100644
> > > > --- a/virt/kvm/arm/vgic/vgic-mmio.c
> > > > +++ b/virt/kvm/arm/vgic/vgic-mmio.c
> > > > @@ -231,23 +231,21 @@ static void vgic_mmio_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
> > > >   * be migrated while we don't hold the IRQ locks and we don't want to be
> > > >   * chasing moving targets.
> > > >   *
> > > > - * For private interrupts, we only have to make sure the single and only VCPU
> > > > - * that can potentially queue the IRQ is stopped.
> > > > + * For private interrupts we don't have to do anything because userspace
> > > > + * accesses to the VGIC state already require all VCPUs to be stopped, and
> > > > + * only the VCPU itself can modify its private interrupts active state, which
> > > > + * guarantees that the VCPU is not running.
> > > >   */
> > > >  static void vgic_change_active_prepare(struct kvm_vcpu *vcpu, u32 intid)
> > > >  {
> > > > -	if (intid < VGIC_NR_PRIVATE_IRQS)
> > > > -		kvm_arm_halt_vcpu(vcpu);
> > > > -	else
> > > > > +	if (intid >= VGIC_NR_PRIVATE_IRQS)
> > > > >  		kvm_arm_halt_guest(vcpu->kvm);
> > > > >  }
> > > > >  
> > > > >  /* See vgic_change_active_prepare */
> > > > >  static void vgic_change_active_finish(struct kvm_vcpu *vcpu, u32 intid)
> > > > >  {
> > > > > -	if (intid < VGIC_NR_PRIVATE_IRQS)
> > > > > -		kvm_arm_resume_vcpu(vcpu);
> > > > > -	else
> > > > > +	if (intid >= VGIC_NR_PRIVATE_IRQS)
> > > >  		kvm_arm_resume_guest(vcpu->kvm);
> > > >  }
> > > >  
> > > > @@ -271,11 +269,13 @@ void vgic_mmio_write_cactive(struct kvm_vcpu *vcpu,
> > > >  {
> > > >  	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
> > > >  
> > > > +	mutex_lock(&vcpu->kvm->lock);
> > > >  	vgic_change_active_prepare(vcpu, intid);
> > > >  
> > > >  	__vgic_mmio_write_cactive(vcpu, addr, len, val);
> > > >  
> > > >  	vgic_change_active_finish(vcpu, intid);
> > > > +	mutex_unlock(&vcpu->kvm->lock);
> > > 
> > > Any reason not to move the lock/unlock calls to prepare/finish? Also, do
> > > we need to take that mutex if intid is a PPI?
> > 
> > I guess we strictly don't need to take the mutex if it's a PPI, no.
> > 
> > But I actually preferred this symmetry because you can easily tell we
> > don't have a bug (famous last words) by locking and unlocking the mutex
> > in the same function.
> > 
> > I don't feel strongly about it though, so I can move it if you prefer
> > it.
> 
> Actually we must move the locking into the prepare/finish functions, in
> order to tuck them into the VGIC_NR_PRIVATE_IRQS conditions. Otherwise,
> with gicv3, when userspace accesses ISACTIVER0/ICACTIVER0, which are
> SGI_base offsets, then the vgic_v3_sgibase_registers table is used. That
> table doesn't provide the uaccess functions, so we try to lock twice
> again.
> 

Nice find.

How about we always use the uaccess functions for userspace accesses
instead, as in the diff at the end of this mail?
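
To make the double locking concrete, here is a minimal userspace C sketch
of the dispatch problem. It is only a toy model under stated assumptions:
all names (toy_vm, toy_region, toy_uaccess, and so on) are hypothetical and
are not kernel APIs, the mutex is modelled as a depth counter so a nested
acquisition is recorded rather than deadlocking, and the fallback from a
missing uaccess handler to the MMIO handler is assumed from the description
above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the VM; lock_depth models a non-recursive mutex. */
struct toy_vm {
	int lock_depth;    /* a depth above 1 would self-deadlock for real */
	int double_locks;  /* nested acquisitions observed */
	uint32_t active;   /* active bits of the 32 private interrupts */
};

typedef void (*toy_write_fn)(struct toy_vm *vm, uint32_t val);

/* Hypothetical register region: the uaccess handler is optional. */
struct toy_region {
	toy_write_fn write;          /* guest MMIO path, takes the lock */
	toy_write_fn uaccess_write;  /* userspace path, may be NULL */
};

static void toy_lock(struct toy_vm *vm)
{
	if (++vm->lock_depth > 1)
		vm->double_locks++;
}

static void toy_unlock(struct toy_vm *vm)
{
	assert(vm->lock_depth > 0);
	vm->lock_depth--;
}

/* MMIO handler: serializes active-state changes by taking the lock. */
static void toy_write_sactive(struct toy_vm *vm, uint32_t val)
{
	toy_lock(vm);
	vm->active |= val;
	toy_unlock(vm);
}

/* Uaccess handler: relies on the caller already holding the lock. */
static void toy_uaccess_write_sactive(struct toy_vm *vm, uint32_t val)
{
	vm->active |= val;
}

/* Userspace access path: holds the lock around the whole access. */
void toy_uaccess(struct toy_vm *vm, const struct toy_region *r, uint32_t val)
{
	toy_lock(vm);
	if (r->uaccess_write)
		r->uaccess_write(vm, val);
	else
		r->write(vm, val);  /* assumed fallback: locks a second time */
	toy_unlock(vm);
}

/* Table entry without a uaccess handler, and one with it. */
const struct toy_region toy_region_mmio_only = {
	toy_write_sactive, NULL
};
const struct toy_region toy_region_with_uaccess = {
	toy_write_sactive, toy_uaccess_write_sactive
};
```

Calling toy_uaccess() with toy_region_mmio_only records one nested
acquisition in double_locks, which is the case a real mutex would hang on.
With toy_region_with_uaccess the lock is taken exactly once.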


Thanks,
-Christoffer

diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
index 23c0d564..e25567a 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
@@ -528,12 +528,14 @@ static const struct vgic_register_region vgic_v3_sgibase_registers[] = {
 		vgic_mmio_read_pending, vgic_mmio_write_cpending,
 		vgic_mmio_read_raz, vgic_mmio_write_wi, 4,
 		VGIC_ACCESS_32bit),
-	REGISTER_DESC_WITH_LENGTH(GICR_ISACTIVER0,
-		vgic_mmio_read_active, vgic_mmio_write_sactive, 4,
-		VGIC_ACCESS_32bit),
-	REGISTER_DESC_WITH_LENGTH(GICR_ICACTIVER0,
-		vgic_mmio_read_active, vgic_mmio_write_cactive, 4,
-		VGIC_ACCESS_32bit),
+	REGISTER_DESC_WITH_LENGTH_UACCESS(GICR_ISACTIVER0,
+		vgic_mmio_read_active, vgic_mmio_write_sactive,
+		NULL, vgic_mmio_uaccess_write_sactive,
+		4, VGIC_ACCESS_32bit),
+	REGISTER_DESC_WITH_LENGTH_UACCESS(GICR_ICACTIVER0,
+		vgic_mmio_read_active, vgic_mmio_write_cactive,
+		NULL, vgic_mmio_uaccess_write_cactive,
+		4, VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_LENGTH(GICR_IPRIORITYR0,
 		vgic_mmio_read_priority, vgic_mmio_write_priority, 32,
 		VGIC_ACCESS_32bit | VGIC_ACCESS_8bit),