KVM: Remove redundant smp_mb() in the kvm_mmu_commit_zap_page()

Message ID 56E1875E.8060007@linux.intel.com
State New, archived

Commit Message

Xiao Guangrong March 10, 2016, 2:40 p.m. UTC
On 03/08/2016 11:27 PM, Paolo Bonzini wrote:
>
>
> On 08/03/2016 09:36, Lan Tianyu wrote:
>> Summary about smp_mb()s we met in this thread. If misunderstood, please
>> correct me. Thanks.
>>
>> The smp_mb() in the kvm_flush_remote_tlbs() was introduced by the commit
>> a4ee1ca4 and it seems to keep the order of reading and cmpxchg
>> kvm->tlbs_dirty.
>>
>>      Quote from Avi:
>>      | I don't think we need to flush immediately; set a "tlb dirty" bit somewhere
>>      | that is cleared when we flush the tlb.  kvm_mmu_notifier_invalidate_page()
>>      | can consult the bit and force a flush if set.
>>
>>      Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
>>      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>
> Unfortunately that patch added a bad memory barrier: 1) it lacks a
> comment; 2) it lacks obvious pairing; 3) it is an smp_mb() after a read,
> so it's not even obvious that this memory barrier has to do with the
> immediately preceding read of kvm->tlbs_dirty.  It also is not
> documented in Documentation/virtual/kvm/mmu.txt (Guangrong documented
> there most of his other work, back in 2013, but not this one :)).
>
> The cmpxchg is ordered anyway against the read, because 1) x86 has
> implicit ordering between earlier loads and later stores; 2) even
> store-load barriers are unnecessary for accesses to the same variable
> (in this case kvm->tlbs_dirty).
>
> So offhand, I cannot say what it orders.  There are four possibilities:
>
> 1) it orders the read of tlbs_dirty with the read of mode.  In this
> case, a smp_rmb() would have been enough, and it's not clear where
> the matching smp_wmb() is.
>
> 2) it orders the read of tlbs_dirty with the KVM_REQ_TLB_FLUSH request.
> In this case a smp_load_acquire would be better.
>
> 3) it does the same as kvm_mmu_commit_zap_page's smp_mb() but for other
> callers of kvm_flush_remote_tlbs().  In this case, we know what the
> matching memory barrier is (walk_shadow_page_lockless_*).
>
> 4) it is completely unnecessary.

Sorry, the memory barriers were missed in sync_page(); this diff should fix it:



Any comment?
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Paolo Bonzini March 10, 2016, 3:26 p.m. UTC | #1
On 10/03/2016 15:40, Xiao Guangrong wrote:
>      long dirty_count = kvm->tlbs_dirty;
> 
> +        /*
> +         * read tlbs_dirty before doing tlb flush to make sure no tlb request is
> +         * lost.
> +         */
>      smp_mb();
> +
>      if (kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH))
>          ++kvm->stat.remote_tlb_flush;
>      cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
> 
> 
> Any comment?

Compared to smp_load_acquire(), smp_mb() adds an ordering between stores
and loads.  Is the

The load of kvm->tlbs_dirty should then be

	/*
	 * Read tlbs_dirty before setting KVM_REQ_TLB_FLUSH in
	 * kvm_make_all_cpus_request.  This
	 */
	long dirty_count = smp_load_acquire(&kvm->tlbs_dirty);

Tianyu, I think Xiao provided the information that I was missing.  Would
you like to prepare the patch?

Thanks,

Paolo
Paolo Bonzini March 10, 2016, 3:31 p.m. UTC | #2
On 10/03/2016 16:26, Paolo Bonzini wrote:
> Compared to smp_load_acquire(), smp_mb() adds an ordering between stores
> and loads.

Here, the ordering is load-store, hence...

> The load of kvm->tlbs_dirty should then be
> 
> 	/*
> 	 * Read tlbs_dirty before setting KVM_REQ_TLB_FLUSH in
> 	 * kvm_make_all_cpus_request.  This
> 	 */
> 	long dirty_count = smp_load_acquire(&kvm->tlbs_dirty);
> 
> Tianyu, I think Xiao provided the information that I was missing.  Would
> you like to prepare the patch?

Thanks,

Paolo
Xiao Guangrong March 10, 2016, 3:45 p.m. UTC | #3
On 03/10/2016 11:31 PM, Paolo Bonzini wrote:
>
>
> On 10/03/2016 16:26, Paolo Bonzini wrote:
>> Compared to smp_load_acquire(), smp_mb() adds an ordering between stores
>> and loads.
>
> Here, the ordering is load-store, hence...

Yes, this is why I put smp_mb() in the code. :)

Paolo Bonzini March 10, 2016, 4:04 p.m. UTC | #4
On 10/03/2016 16:45, Xiao Guangrong wrote:
>>
>>> Compared to smp_load_acquire(), smp_mb() adds an ordering between stores
>>> and loads.
>>
>> Here, the ordering is load-store, hence...
> 
> Yes, this is why i put smp_mb() in the code. :)

Here is a table of barriers:


    '. after|                   |
before '.   |    load           |    store
__________'.|___________________|________________________
            |                   |
            |  smp_rmb          | smp_load_acquire
load        |  smp_load_acquire | smp_store_release    XX
            |  smp_mb           | smp_mb
____________|___________________|________________________
            |                   |
            |                   | smp_wmb
store       |  smp_mb           | smp_store_release
            |                   | smp_mb
            |                   |

Your case is the one marked with XX, so a smp_load_acquire() is
enough---and it's preferable, because it's cheaper than smp_mb() and
more self-documenting.

Paolo
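
To make the XX cell of the table concrete, here is a minimal user-space
sketch using C11 atomics. It is not kernel code. It assumes that
memory_order_acquire is a fair stand-in for smp_load_acquire() and that
atomic_compare_exchange_strong() is a fair stand-in for cmpxchg(). The
names struct kvm_model, request_tlb_flush() and flush_remote_tlbs_model()
are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the two pieces of state used here. */
struct kvm_model {
	atomic_long tlbs_dirty;      /* sptes zapped without a TLB flush yet */
	atomic_bool flush_requested; /* models KVM_REQ_TLB_FLUSH being raised */
};

/* Models kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH) as a plain store. */
static void request_tlb_flush(struct kvm_model *kvm)
{
	atomic_store_explicit(&kvm->flush_requested, true, memory_order_relaxed);
}

/*
 * Returns true if tlbs_dirty was reset to 0, false if it changed after the
 * snapshot was taken, in which case a later flush is still owed.
 */
static bool flush_remote_tlbs_model(struct kvm_model *kvm)
{
	long dirty_count;

	assert(kvm != NULL);

	/*
	 * Load-acquire: the read of tlbs_dirty is ordered before every later
	 * load and store, including the store that raises the flush request.
	 * This is the load-store ordering marked XX in the table.
	 */
	dirty_count = atomic_load_explicit(&kvm->tlbs_dirty,
					   memory_order_acquire);

	request_tlb_flush(kvm);

	/* Clear the counter only if nobody bumped it since the snapshot. */
	return atomic_compare_exchange_strong(&kvm->tlbs_dirty, &dirty_count, 0);
}
```

A full fence (atomic_thread_fence(memory_order_seq_cst)) after the load
would also order later stores, but it additionally orders earlier stores
against later loads, which this code path does not need.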
Xiao Guangrong March 10, 2016, 5:57 p.m. UTC | #5
On 03/11/2016 12:04 AM, Paolo Bonzini wrote:
>
>
> On 10/03/2016 16:45, Xiao Guangrong wrote:
>>>
>>>> Compared to smp_load_acquire(), smp_mb() adds an ordering between stores
>>>> and loads.
>>>
>>> Here, the ordering is load-store, hence...
>>
>> Yes, this is why i put smp_mb() in the code. :)
>
> Here is a table of barriers:
>
>
>      '. after|                   |
> before '.   |    load           |    store
> __________'.|___________________|________________________
>              |                   |
>              |  smp_rmb          | smp_load_acquire
> load        |  smp_load_acquire | smp_store_release    XX
>              |  smp_mb           | smp_mb
> ____________|___________________|________________________
>              |                   |
>              |                   | smp_wmb
> store       |  smp_mb           | smp_store_release
>              |                   | smp_mb
>              |                   |
>
> Your case is the one marked with XX, so a smp_load_acquire() is
> enough---and it's preferable, because it's cheaper than smp_mb() and
> more self-documenting.

Yes, you are right and thank you for pointing it out.

lan,Tianyu March 11, 2016, 1:13 a.m. UTC | #6
On 2016-03-10 23:26, Paolo Bonzini wrote:
> 
> 
> On 10/03/2016 15:40, Xiao Guangrong wrote:
>>      long dirty_count = kvm->tlbs_dirty;
>>
>> +        /*
>> +         * read tlbs_dirty before doing tlb flush to make sure no tlb request is
>> +         * lost.
>> +         */
>>      smp_mb();
>> +
>>      if (kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH))
>>          ++kvm->stat.remote_tlb_flush;
>>      cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
>>
>>
>> Any comment?
> 
> Compared to smp_load_acquire(), smp_mb() adds an ordering between stores
> and loads.  Is the
> 
> The load of kvm->tlbs_dirty should then be
> 
> 	/*
> 	 * Read tlbs_dirty before setting KVM_REQ_TLB_FLUSH in
> 	 * kvm_make_all_cpus_request.  This
> 	 */
> 	long dirty_count = smp_load_acquire(&kvm->tlbs_dirty);
> 
> Tianyu, I think Xiao provided the information that I was missing.  Would
> you like to prepare the patch?

Paolo:
	Sure. I will do that.

Guangrong:
	Thanks a lot for your input.
Patch

diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 91e939b..4cad57f 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -948,6 +948,12 @@  static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
  			return -EINVAL;

  		if (FNAME(prefetch_invalid_gpte)(vcpu, sp, &sp->spt[i], gpte)) {
+			/*
+			 * update spte before increasing tlbs_dirty to make sure no tlb
+			 * flush is lost after spte is zapped, see the comments in
+			 * kvm_flush_remote_tlbs().
+			 */
+			smp_wmb();
  			vcpu->kvm->tlbs_dirty++;
  			continue;
  		}
@@ -963,6 +969,8 @@  static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)

  		if (gfn != sp->gfns[i]) {
  			drop_spte(vcpu->kvm, &sp->spt[i]);
+			/* the same as above where we are doing prefetch_invalid_gpte(). */
+			smp_wmb();
  			vcpu->kvm->tlbs_dirty++;
  			continue;
  		}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 314c777..82c86ea 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -193,7 +193,12 @@  void kvm_flush_remote_tlbs(struct kvm *kvm)
  {
  	long dirty_count = kvm->tlbs_dirty;

+	/*
+	 * read tlbs_dirty before doing tlb flush to make sure no tlb request is
+	 * lost.
+	 */
  	smp_mb();
+
  	if (kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH))
  		++kvm->stat.remote_tlb_flush;
  	cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
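
The pairing that the comments in this patch describe can be modelled in
user space with C11 atomics. The sketch below is an illustration under
assumptions, not the kernel implementation: atomic_thread_fence() with
memory_order_release stands in for smp_wmb() (it is somewhat stronger),
an atomic fetch-add stands in for the plain tlbs_dirty++, and every name
(struct mmu_model, zap_spte_model(), flush_begin_model(),
flush_commit_model()) is hypothetical. The flush is split into two halves
so that a zap racing with it can be replayed deterministically.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model state; none of these names exist in the kernel. */
struct mmu_model {
	_Atomic uint64_t spte;  /* one shadow page table entry */
	atomic_long tlbs_dirty; /* zaps not yet followed by a TLB flush */
	atomic_int flushes;     /* number of flush requests issued */
};

/* sync_page() side: zap the spte, then publish that a flush is owed. */
static void zap_spte_model(struct mmu_model *m)
{
	assert(m != NULL);
	atomic_store_explicit(&m->spte, 0, memory_order_relaxed);
	/* Stand-in for smp_wmb(): spte store is ordered before the increment. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_add_explicit(&m->tlbs_dirty, 1, memory_order_relaxed);
}

/* First half of the flush: snapshot the counter, then request the flush. */
static long flush_begin_model(struct mmu_model *m)
{
	long snapshot = atomic_load_explicit(&m->tlbs_dirty,
					     memory_order_acquire);

	atomic_fetch_add_explicit(&m->flushes, 1, memory_order_relaxed);
	return snapshot;
}

/* Second half: reset the counter unless a zap raced with the flush. */
static bool flush_commit_model(struct mmu_model *m, long snapshot)
{
	return atomic_compare_exchange_strong(&m->tlbs_dirty, &snapshot, 0);
}
```

If zap_spte_model() runs between flush_begin_model() and
flush_commit_model(), the compare-exchange fails and tlbs_dirty stays
non-zero, so the next flush still covers the late zap. That is the sense
in which no flush request is lost.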