From patchwork Thu Mar 10 14:40:30 2016
X-Patchwork-Submitter: Xiao Guangrong
X-Patchwork-Id: 8557571
Subject: Re: [PATCH] KVM: Remove redundant smp_mb() in the kvm_mmu_commit_zap_page()
From: Xiao Guangrong
To: Paolo Bonzini, Lan Tianyu, Thomas Gleixner
Cc: gleb@kernel.org, mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
 kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Thu, 10 Mar 2016 22:40:30 +0800
Message-ID: <56E1875E.8060007@linux.intel.com>
In-Reply-To: <56DEEF69.20208@redhat.com>
References: <1457055312-27067-1-git-send-email-tianyu.lan@intel.com>
 <56D9354E.9040908@intel.com> <56D94BFE.1080406@redhat.com>
 <56DE8F1A.9000401@intel.com> <56DEEF69.20208@redhat.com>
List-ID: kvm@vger.kernel.org

On 03/08/2016 11:27 PM, Paolo Bonzini wrote:
>
> On 08/03/2016 09:36, Lan Tianyu wrote:
>> A summary of the smp_mb()s we have met in this thread. If I have
>> misunderstood anything, please correct me. Thanks.
>>
>> The smp_mb() in kvm_flush_remote_tlbs() was introduced by commit
>> a4ee1ca4, and it seems intended to order the read of kvm->tlbs_dirty
>> against the later cmpxchg on it.
>>
>> Quote from Avi:
>> | I don't think we need to flush immediately; set a "tlb dirty" bit
>> | somewhere that is cleared when we flush the tlb.
>> | kvm_mmu_notifier_invalidate_page() can consult the bit and force a
>> | flush if set.
>>
>> Signed-off-by: Xiao Guangrong
>> Signed-off-by: Marcelo Tosatti
>
> Unfortunately that patch added a bad memory barrier: 1) it lacks a
> comment; 2) it lacks obvious pairing; 3) it is an smp_mb() after a read,
> so it is not even obvious that this memory barrier has anything to do
> with the immediately preceding read of kvm->tlbs_dirty.
> It is also not
> documented in Documentation/virtual/kvm/mmu.txt (Guangrong documented
> most of his other work there, back in 2013, but not this one :)).
>
> The cmpxchg is ordered against the read anyway, because 1) x86 has
> implicit ordering between earlier loads and later stores; 2) even
> store-load barriers are unnecessary for accesses to the same variable
> (in this case kvm->tlbs_dirty).
>
> So offhand, I cannot say what it orders. There are four possibilities:
>
> 1) it orders the read of tlbs_dirty with the read of mode. In this
> case, an smp_rmb() would have been enough, and it is not clear where
> the matching smp_wmb() is.
>
> 2) it orders the read of tlbs_dirty with the KVM_REQ_TLB_FLUSH request.
> In this case an smp_load_acquire would be better.
>
> 3) it does the same as kvm_mmu_commit_zap_page's smp_mb(), but for
> other callers of kvm_flush_remote_tlbs(). In this case, we know what
> the matching memory barrier is (walk_shadow_page_lockless_*).
>
> 4) it is completely unnecessary.

Sorry, memory barriers were missed in sync_page(); this diff should fix it:

diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 91e939b..4cad57f 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -948,6 +948,12 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 			return -EINVAL;
 
 		if (FNAME(prefetch_invalid_gpte)(vcpu, sp, &sp->spt[i], gpte)) {
+			/*
+			 * Update the spte before increasing tlbs_dirty to make
+			 * sure no TLB flush is lost after the spte is zapped;
+			 * see the comments in kvm_flush_remote_tlbs().
+			 */
+			smp_wmb();
 			vcpu->kvm->tlbs_dirty++;
 			continue;
 		}
@@ -963,6 +969,8 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 
 		if (gfn != sp->gfns[i]) {
 			drop_spte(vcpu->kvm, &sp->spt[i]);
+			/* The same as above where we do prefetch_invalid_gpte(). */
+			smp_wmb();
 			vcpu->kvm->tlbs_dirty++;
 			continue;
 		}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 314c777..82c86ea 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -193,7 +193,12 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
 {
 	long dirty_count = kvm->tlbs_dirty;
 
+	/*
+	 * Read tlbs_dirty before doing the TLB flush to make sure no
+	 * TLB request is lost.
+	 */
 	smp_mb();
+
 	if (kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH))
 		++kvm->stat.remote_tlb_flush;
 	cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);

Any comment?
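[Editor's note: for readers less familiar with the pattern above, here is a
minimal userspace sketch of the pairing the diff relies on. It is not KVM
code: C11 fences stand in for the kernel's smp_wmb()/smp_mb(), and the
names zap_spte() and try_flush() are hypothetical stand-ins for the
sync_page() and kvm_flush_remote_tlbs() sides.]

#include <stdatomic.h>
#include <stdio.h>
#include <pthread.h>

static _Atomic long spte = 1;    /* stands in for a shadow pte          */
static _Atomic long tlbs_dirty; /* stands in for kvm->tlbs_dirty       */

/* Writer side, as in sync_page(): zap the spte, then publish the fact. */
static void zap_spte(void)
{
	atomic_store_explicit(&spte, 0, memory_order_relaxed);
	/* plays the role of smp_wmb(): the spte update must be visible
	 * before the tlbs_dirty increment is */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_add_explicit(&tlbs_dirty, 1, memory_order_relaxed);
}

/* Reader side, as in kvm_flush_remote_tlbs(): snapshot, flush, clear. */
static void try_flush(void)
{
	long dirty_count = atomic_load_explicit(&tlbs_dirty,
						memory_order_relaxed);
	/* plays the role of smp_mb(): read tlbs_dirty before issuing the
	 * (simulated) remote flush request */
	atomic_thread_fence(memory_order_seq_cst);
	/* ... here KVM would send KVM_REQ_TLB_FLUSH to all vcpus ... */
	/* clear the counter only if no new increment raced with us */
	atomic_compare_exchange_strong_explicit(&tlbs_dirty, &dirty_count, 0,
						memory_order_seq_cst,
						memory_order_relaxed);
}

static void *writer(void *arg) { (void)arg; zap_spte(); return NULL; }
static void *reader(void *arg) { (void)arg; try_flush(); return NULL; }

int main(void)
{
	pthread_t w, r;
	pthread_create(&w, NULL, writer, NULL);
	pthread_create(&r, NULL, reader, NULL);
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	printf("tlbs_dirty = %ld\n", (long)atomic_load(&tlbs_dirty));
	return 0;
}

The intent of the pairing: if the flusher's read of tlbs_dirty observes the
increment, the release fence guarantees it also observes the zapped spte, so
the flush it performs covers that spte; if it reads the stale value, the
cmpxchg fails to clear the new increment, and a later flush picks it up.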