diff mbox

KVM: arm/arm64: Close VMID generation race

Message ID 20180410150540.GK10904@cbox (mailing list archive)
State New, archived
Headers show

Commit Message

Christoffer Dall April 10, 2018, 3:05 p.m. UTC
On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> On Mon, Apr 09, 2018 at 10:51:39PM +0200, Christoffer Dall wrote:
> > On Mon, Apr 09, 2018 at 06:07:06PM +0100, Marc Zyngier wrote:
> > > Before entering the guest, we check whether our VMID is still
> > > part of the current generation. In order to avoid taking a lock,
> > > we start with checking that the generation is still current, and
> > > only if not current do we take the lock, recheck, and update the
> > > generation and VMID.
> > > 
> > > This leaves open a small race: A vcpu can bump up the global
> > > generation number as well as the VM's, but has not updated
> > > the VMID itself yet.
> > > 
> > > At that point another vcpu from the same VM comes in, checks
> > > the generation (and finds it not needing anything), and jumps
> > > into the guest. At this point, we end-up with two vcpus belonging
> > > to the same VM running with two different VMIDs. Eventually, the
> > > VMID used by the second vcpu will get reassigned, and things will
> > > really go wrong...
> > > 
> > > A simple solution would be to drop this initial check, and always take
> > > the lock. This is likely to cause performance issues. A middle ground
> > > is to convert the spinlock to a rwlock, and only take the read lock
> > > on the fast path. If the check fails at that point, drop it and
> > > acquire the write lock, rechecking the condition.
> > > 
> > > This ensures that the above scenario doesn't occur.
> > > 
> > > Reported-by: Mark Rutland <mark.rutland@arm.com>
> > > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > > ---
> > > I haven't seen any reply from Shannon, so reposting this to
> > > a slightly wider audience for feedback.
> > > 
> > >  virt/kvm/arm/arm.c | 15 ++++++++++-----
> > >  1 file changed, 10 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > > index dba629c5f8ac..a4c1b76240df 100644
> > > --- a/virt/kvm/arm/arm.c
> > > +++ b/virt/kvm/arm/arm.c
> > > @@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
> > >  static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
> > >  static u32 kvm_next_vmid;
> > >  static unsigned int kvm_vmid_bits __read_mostly;
> > > -static DEFINE_SPINLOCK(kvm_vmid_lock);
> > > +static DEFINE_RWLOCK(kvm_vmid_lock);
> > >  
> > >  static bool vgic_present;
> > >  
> > > @@ -473,11 +473,16 @@ static void update_vttbr(struct kvm *kvm)
> > >  {
> > >  	phys_addr_t pgd_phys;
> > >  	u64 vmid;
> > > +	bool new_gen;
> > >  
> > > -	if (!need_new_vmid_gen(kvm))
> > > +	read_lock(&kvm_vmid_lock);
> > > +	new_gen = need_new_vmid_gen(kvm);
> > > +	read_unlock(&kvm_vmid_lock);
> > > +
> > > +	if (!new_gen)
> > >  		return;
> > >  
> > > -	spin_lock(&kvm_vmid_lock);
> > > +	write_lock(&kvm_vmid_lock);
> > >  
> > >  	/*
> > >  	 * We need to re-check the vmid_gen here to ensure that if another vcpu
> > > @@ -485,7 +490,7 @@ static void update_vttbr(struct kvm *kvm)
> > >  	 * use the same vmid.
> > >  	 */
> > >  	if (!need_new_vmid_gen(kvm)) {
> > > -		spin_unlock(&kvm_vmid_lock);
> > > +		write_unlock(&kvm_vmid_lock);
> > >  		return;
> > >  	}
> > >  
> > > @@ -519,7 +524,7 @@ static void update_vttbr(struct kvm *kvm)
> > >  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> > >  	kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
> > >  
> > > -	spin_unlock(&kvm_vmid_lock);
> > > +	write_unlock(&kvm_vmid_lock);
> > >  }
> > >  
> > >  static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> > > -- 
> > > 2.14.2
> > > 
> > 
> > The above looks correct to me.  I am wondering if something like the
> > following would also work, which may be slightly more efficient,
> > although I doubt the difference can be measured:
> > 

[...]

> 
> I think we also need to update kvm->arch.vttbr before updating
> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> old VMID).
> 
> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> the critical section, I think that works, modulo using READ_ONCE() and
> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> locklessly.

Indeed, you're right.  I would look something like this, then:


It's probably easier to convince ourselves about the correctness of
Marc's code using a rwlock instead, though.  Thoughts?

Thanks,
-Christoffer

Comments

Mark Rutland April 10, 2018, 3:24 p.m. UTC | #1
On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> > I think we also need to update kvm->arch.vttbr before updating
> > kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> > vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> > old VMID).
> > 
> > With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> > the critical section, I think that works, modulo using READ_ONCE() and
> > WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> > locklessly.
> 
> Indeed, you're right.  I would look something like this, then:
> 
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 2e43f9d42bd5..6cb08995e7ff 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>   */
>  static bool need_new_vmid_gen(struct kvm *kvm)
>  {
> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>  }
>  
>  /**
> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>  		kvm_call_hyp(__kvm_flush_vm_context);
>  	}
>  
> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>  	kvm->arch.vmid = kvm_next_vmid;
>  	kvm_next_vmid++;
>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> -	kvm->arch.vttbr = pgd_phys | vmid;
> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> +
> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>  
>  	spin_unlock(&kvm_vmid_lock);
>  }

I think that's right, yes.

We could replace the smp_{r,w}mb() barriers with an acquire of the
kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
trying to optimize things there are larger algorithmic changes necessary
anyhow.

> It's probably easier to convince ourselves about the correctness of
> Marc's code using a rwlock instead, though.  Thoughts?

I believe that Marc's preference was the rwlock; I have no preference
either way.

Thanks,
Mark.
Christoffer Dall April 10, 2018, 3:35 p.m. UTC | #2
On Tue, Apr 10, 2018 at 04:24:20PM +0100, Mark Rutland wrote:
> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> > On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> > > I think we also need to update kvm->arch.vttbr before updating
> > > kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> > > vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> > > old VMID).
> > > 
> > > With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> > > the critical section, I think that works, modulo using READ_ONCE() and
> > > WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> > > locklessly.
> > 
> > Indeed, you're right.  I would look something like this, then:
> > 
> > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> > index 2e43f9d42bd5..6cb08995e7ff 100644
> > --- a/virt/kvm/arm/arm.c
> > +++ b/virt/kvm/arm/arm.c
> > @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
> >   */
> >  static bool need_new_vmid_gen(struct kvm *kvm)
> >  {
> > -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> > +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> > +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> > +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
> >  }
> >  
> >  /**
> > @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
> >  		kvm_call_hyp(__kvm_flush_vm_context);
> >  	}
> >  
> > -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >  	kvm->arch.vmid = kvm_next_vmid;
> >  	kvm_next_vmid++;
> >  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> > @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
> >  	pgd_phys = virt_to_phys(kvm->arch.pgd);
> >  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
> >  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> > -	kvm->arch.vttbr = pgd_phys | vmid;
> > +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> > +
> > +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> > +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >  
> >  	spin_unlock(&kvm_vmid_lock);
> >  }
> 
> I think that's right, yes.
> 
> We could replace the smp_{r,w}mb() barriers with an acquire of the
> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> trying to optimize things there are larger algorithmic changes necessary
> anyhow.
> 
> > It's probably easier to convince ourselves about the correctness of
> > Marc's code using a rwlock instead, though.  Thoughts?
> 
> I believe that Marc's preference was the rwlock; I have no preference
> either way.
> 

I'm fine with both approaches as well, but it was educational for me to
see if this could be done in the lockless way as well.  Thanks for
having a look at that!

-Christoffer
Marc Zyngier April 10, 2018, 3:37 p.m. UTC | #3
On 10/04/18 16:24, Mark Rutland wrote:
> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
>> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
>>> I think we also need to update kvm->arch.vttbr before updating
>>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
>>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
>>> old VMID).
>>>
>>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
>>> the critical section, I think that works, modulo using READ_ONCE() and
>>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
>>> locklessly.
>>
>> Indeed, you're right.  I would look something like this, then:
>>
>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index 2e43f9d42bd5..6cb08995e7ff 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>>   */
>>  static bool need_new_vmid_gen(struct kvm *kvm)
>>  {
>> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
>> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
>> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>>  }
>>  
>>  /**
>> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>>  		kvm_call_hyp(__kvm_flush_vm_context);
>>  	}
>>  
>> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>  	kvm->arch.vmid = kvm_next_vmid;
>>  	kvm_next_vmid++;
>>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
>> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>> -	kvm->arch.vttbr = pgd_phys | vmid;
>> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
>> +
>> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
>> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>  
>>  	spin_unlock(&kvm_vmid_lock);
>>  }
> 
> I think that's right, yes.
> 
> We could replace the smp_{r,w}mb() barriers with an acquire of the
> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> trying to optimize things there are larger algorithmic changes necessary
> anyhow.
> 
>> It's probably easier to convince ourselves about the correctness of
>> Marc's code using a rwlock instead, though.  Thoughts?
> 
> I believe that Marc's preference was the rwlock; I have no preference
> either way.

I don't mind either way. If you can be bothered to write a proper commit
log for this, I'll take it. What I'd really want is Shannon to indicate
whether or not this solves the issue he was seeing.

Thanks,

	M.
Christoffer Dall April 10, 2018, 3:48 p.m. UTC | #4
On Tue, Apr 10, 2018 at 04:37:12PM +0100, Marc Zyngier wrote:
> On 10/04/18 16:24, Mark Rutland wrote:
> > On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
> >> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
> >>> I think we also need to update kvm->arch.vttbr before updating
> >>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
> >>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
> >>> old VMID).
> >>>
> >>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
> >>> the critical section, I think that works, modulo using READ_ONCE() and
> >>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
> >>> locklessly.
> >>
> >> Indeed, you're right.  I would look something like this, then:
> >>
> >> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> >> index 2e43f9d42bd5..6cb08995e7ff 100644
> >> --- a/virt/kvm/arm/arm.c
> >> +++ b/virt/kvm/arm/arm.c
> >> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
> >>   */
> >>  static bool need_new_vmid_gen(struct kvm *kvm)
> >>  {
> >> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
> >> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
> >> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
> >> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
> >>  }
> >>  
> >>  /**
> >> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
> >>  		kvm_call_hyp(__kvm_flush_vm_context);
> >>  	}
> >>  
> >> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >>  	kvm->arch.vmid = kvm_next_vmid;
> >>  	kvm_next_vmid++;
> >>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> >> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
> >>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
> >>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
> >>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
> >> -	kvm->arch.vttbr = pgd_phys | vmid;
> >> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
> >> +
> >> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
> >> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
> >>  
> >>  	spin_unlock(&kvm_vmid_lock);
> >>  }
> > 
> > I think that's right, yes.
> > 
> > We could replace the smp_{r,w}mb() barriers with an acquire of the
> > kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
> > trying to optimize things there are larger algorithmic changes necessary
> > anyhow.
> > 
> >> It's probably easier to convince ourselves about the correctness of
> >> Marc's code using a rwlock instead, though.  Thoughts?
> > 
> > I believe that Marc's preference was the rwlock; I have no preference
> > either way.
> 
> I don't mind either way. If you can be bothered to write a proper commit
> log for this, I'll take it. 

You've already done the work, and your patch is easier to read, so let's
just go ahead with that.

I was just curious to which degree my original implementation was
broken; was I trying to achieve something impossible or was I just
writing buggy code.  Seems the latter.  Oh well.

> What I'd really want is Shannon to indicate
> whether or not this solves the issue he was seeing.
> 

Agreed, would like to see that too.

Thanks (and sorry for being noisy),
-Christoffer
Shannon Zhao April 11, 2018, 1:30 a.m. UTC | #5
On 2018/4/10 23:37, Marc Zyngier wrote:
> On 10/04/18 16:24, Mark Rutland wrote:
>> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
>>> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
>>>> I think we also need to update kvm->arch.vttbr before updating
>>>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
>>>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
>>>> old VMID).
>>>>
>>>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
>>>> the critical section, I think that works, modulo using READ_ONCE() and
>>>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
>>>> locklessly.
>>>
>>> Indeed, you're right.  I would look something like this, then:
>>>
>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>>> index 2e43f9d42bd5..6cb08995e7ff 100644
>>> --- a/virt/kvm/arm/arm.c
>>> +++ b/virt/kvm/arm/arm.c
>>> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>>>   */
>>>  static bool need_new_vmid_gen(struct kvm *kvm)
>>>  {
>>> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>>> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
>>> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
>>> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>>>  }
>>>  
>>>  /**
>>> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>>>  		kvm_call_hyp(__kvm_flush_vm_context);
>>>  	}
>>>  
>>> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>  	kvm->arch.vmid = kvm_next_vmid;
>>>  	kvm_next_vmid++;
>>>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
>>> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>>>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>>>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>>>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>>> -	kvm->arch.vttbr = pgd_phys | vmid;
>>> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
>>> +
>>> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
>>> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>  
>>>  	spin_unlock(&kvm_vmid_lock);
>>>  }
>>
>> I think that's right, yes.
>>
>> We could replace the smp_{r,w}mb() barriers with an acquire of the
>> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
>> trying to optimize things there are larger algorithmic changes necessary
>> anyhow.
>>
>>> It's probably easier to convince ourselves about the correctness of
>>> Marc's code using a rwlock instead, though.  Thoughts?
>>
>> I believe that Marc's preference was the rwlock; I have no preference
>> either way.
> 
> I don't mind either way. If you can be bothered to write a proper commit
> log for this, I'll take it. What I'd really want is Shannon to indicate
> whether or not this solves the issue he was seeing.
> 
I'll test Marc's patch. This will take about 3 days since it's not 100%
reproducible.

Thanks,
Shannon Zhao April 16, 2018, 10:05 a.m. UTC | #6
On 2018/4/11 9:30, Shannon Zhao wrote:
> 
> On 2018/4/10 23:37, Marc Zyngier wrote:
>> > On 10/04/18 16:24, Mark Rutland wrote:
>>> >> On Tue, Apr 10, 2018 at 05:05:40PM +0200, Christoffer Dall wrote:
>>>> >>> On Tue, Apr 10, 2018 at 11:51:19AM +0100, Mark Rutland wrote:
>>>>> >>>> I think we also need to update kvm->arch.vttbr before updating
>>>>> >>>> kvm->arch.vmid_gen, otherwise another CPU can come in, see that the
>>>>> >>>> vmid_gen is up-to-date, jump to hyp, and program a stale VTTBR (with the
>>>>> >>>> old VMID).
>>>>> >>>>
>>>>> >>>> With the smp_wmb() and update of kvm->arch.vmid_gen moved to the end of
>>>>> >>>> the critical section, I think that works, modulo using READ_ONCE() and
>>>>> >>>> WRITE_ONCE() to ensure single-copy-atomicity of the fields we access
>>>>> >>>> locklessly.
>>>> >>>
>>>> >>> Indeed, you're right.  I would look something like this, then:
>>>> >>>
>>>> >>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>>>> >>> index 2e43f9d42bd5..6cb08995e7ff 100644
>>>> >>> --- a/virt/kvm/arm/arm.c
>>>> >>> +++ b/virt/kvm/arm/arm.c
>>>> >>> @@ -450,7 +450,9 @@ void force_vm_exit(const cpumask_t *mask)
>>>> >>>   */
>>>> >>>  static bool need_new_vmid_gen(struct kvm *kvm)
>>>> >>>  {
>>>> >>> -	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
>>>> >>> +	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>> >>> +	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
>>>> >>> +	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
>>>> >>>  }
>>>> >>>  
>>>> >>>  /**
>>>> >>> @@ -500,7 +502,6 @@ static void update_vttbr(struct kvm *kvm)
>>>> >>>  		kvm_call_hyp(__kvm_flush_vm_context);
>>>> >>>  	}
>>>> >>>  
>>>> >>> -	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>> >>>  	kvm->arch.vmid = kvm_next_vmid;
>>>> >>>  	kvm_next_vmid++;
>>>> >>>  	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
>>>> >>> @@ -509,7 +510,10 @@ static void update_vttbr(struct kvm *kvm)
>>>> >>>  	pgd_phys = virt_to_phys(kvm->arch.pgd);
>>>> >>>  	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
>>>> >>>  	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
>>>> >>> -	kvm->arch.vttbr = pgd_phys | vmid;
>>>> >>> +	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
>>>> >>> +
>>>> >>> +	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
>>>> >>> +	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
>>>> >>>  
>>>> >>>  	spin_unlock(&kvm_vmid_lock);
>>>> >>>  }
>>> >>
>>> >> I think that's right, yes.
>>> >>
>>> >> We could replace the smp_{r,w}mb() barriers with an acquire of the
>>> >> kvm_vmid_gen and a release of kvm->arch.vmid_gen, but if we're really
>>> >> trying to optimize things there are larger algorithmic changes necessary
>>> >> anyhow.
>>> >>
>>>> >>> It's probably easier to convince ourselves about the correctness of
>>>> >>> Marc's code using a rwlock instead, though.  Thoughts?
>>> >>
>>> >> I believe that Marc's preference was the rwlock; I have no preference
>>> >> either way.
>> > 
>> > I don't mind either way. If you can be bothered to write a proper commit
>> > log for this, I'll take it. What I'd really want is Shannon to indicate
>> > whether or not this solves the issue he was seeing.
>> > 
> I'll test Marc's patch. This will take about 3 days since it's not 100%
> reproducible.
Hi Marc,

I've run the test for about 4 days. The issue doesn't appear.
So Tested-by: Shannon Zhao <zhaoshenglong@huawei.com>

Thanks,
Marc Zyngier April 16, 2018, 10:20 a.m. UTC | #7
On 16/04/18 11:05, Shannon Zhao wrote:
> 
> 
> On 2018/4/11 9:30, Shannon Zhao wrote:
>>
>> On 2018/4/10 23:37, Marc Zyngier wrote:

[...]

>>>> I don't mind either way. If you can be bothered to write a proper commit
>>>> log for this, I'll take it. What I'd really want is Shannon to indicate
>>>> whether or not this solves the issue he was seeing.
>>>>
>> I'll test Marc's patch. This will take about 3 days since it's not 100%
>> reproducible.
> Hi Marc,
> 
> I've run the test for about 4 days. The issue doesn't appear.
> So Tested-by: Shannon Zhao <zhaoshenglong@huawei.com>

Thanks Shannon, much appreciated. I'll send the fix upstream towards the
end of the week.

Cheers,

	M.
diff mbox

Patch

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 2e43f9d42bd5..6cb08995e7ff 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -450,7 +450,9 @@  void force_vm_exit(const cpumask_t *mask)
  */
 static bool need_new_vmid_gen(struct kvm *kvm)
 {
-	return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
+	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
+	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
+	return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
 }
 
 /**
@@ -500,7 +502,6 @@  static void update_vttbr(struct kvm *kvm)
 		kvm_call_hyp(__kvm_flush_vm_context);
 	}
 
-	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
 	kvm->arch.vmid = kvm_next_vmid;
 	kvm_next_vmid++;
 	kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
@@ -509,7 +510,10 @@  static void update_vttbr(struct kvm *kvm)
 	pgd_phys = virt_to_phys(kvm->arch.pgd);
 	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
 	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
-	kvm->arch.vttbr = pgd_phys | vmid;
+	WRITE_ONCE(kvm->arch.vttbr, pgd_phys | vmid);
+
+	smp_wmb(); /* Ensure vttbr update is observed before vmid_gen update */
+	kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
 
 	spin_unlock(&kvm_vmid_lock);
 }