
[3/5] ARM: KVM: make sure maintenance operation complete before world switch

Message ID 1371648006-8036-4-git-send-email-marc.zyngier@arm.com (mailing list archive)
State New, archived

Commit Message

Marc Zyngier June 19, 2013, 1:20 p.m. UTC
We may have preempted the guest while it was performing a maintenance
operation (TLB invalidation, for example). Make sure it completes
before we do anything else by adding the necessary barriers.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/interrupts.S | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Christoffer Dall June 20, 2013, 12:18 a.m. UTC | #1
On Wed, Jun 19, 2013 at 02:20:04PM +0100, Marc Zyngier wrote:
> We may have preempted the guest while it was performing a maintainance
> operation (TLB invalidation, for example). Make sure it completes
> before we do anything else by adding the necessary barriers.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm/kvm/interrupts.S | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> index afa6c04..3124e0f 100644
> --- a/arch/arm/kvm/interrupts.S
> +++ b/arch/arm/kvm/interrupts.S
> @@ -149,6 +149,15 @@ __kvm_vcpu_return:
>  	 * r0: vcpu pointer
>  	 * r1: exception code
>  	 */
> +
> +	/*
> +	 * We may have preempted the guest while it was performing a
> +	 * maintainance operation (TLB invalidation, for example). Make
> +	 * sure it completes before we do anything else.
> +	 */

Can you explain what could go wrong here without these two instructions?

> +	dsb
> +	isb
> +
>  	save_guest_regs
>  
>  	@ Set VMID == 0
> -- 
> 1.8.2.3
> 
Marc Zyngier June 20, 2013, 8:13 a.m. UTC | #2
On 20/06/13 01:18, Christoffer Dall wrote:
> On Wed, Jun 19, 2013 at 02:20:04PM +0100, Marc Zyngier wrote:
>> We may have preempted the guest while it was performing a maintainance
>> operation (TLB invalidation, for example). Make sure it completes
>> before we do anything else by adding the necessary barriers.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm/kvm/interrupts.S | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
>> index afa6c04..3124e0f 100644
>> --- a/arch/arm/kvm/interrupts.S
>> +++ b/arch/arm/kvm/interrupts.S
>> @@ -149,6 +149,15 @@ __kvm_vcpu_return:
>>  	 * r0: vcpu pointer
>>  	 * r1: exception code
>>  	 */
>> +
>> +	/*
>> +	 * We may have preempted the guest while it was performing a
>> +	 * maintainance operation (TLB invalidation, for example). Make
>> +	 * sure it completes before we do anything else.
>> +	 */
> 
> Can you explain what could go wrong here without these two instructions?

There would be no guarantee that the TLB invalidation has actually
completed and is visible to other CPUs. I'm not sure that would be a
massive issue in any decent guest OS, but I thought it was worth plugging.

Another (more serious) thing I had doubts about was that we're about to
switch VMID to restore the host context. The ARM ARM doesn't clearly
specify the interaction between pending TLB maintenance and a VMID
switch, and I'm worried that you could end up performing the TLB
maintenance on the *host* TLBs rather than on the guest's.

Having this dsb/isb sequence before switching VMID gives us a strong
guarantee that such a mix-up cannot occur.
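Illustratively, the ordering being enforced here looks like this (a
simplified sketch of the world-switch path, not the exact code; the
VTTBR access is shown only to make the VMID switch explicit):

```asm
	@ the guest may have issued TLB maintenance just before trapping
	dsb				@ outstanding maintenance completes...
	isb				@ ...and the CPU resynchronizes

	@ only now is it safe to restore the host context
	mov	r2, #0
	mov	r3, #0
	mcrr	p15, 6, r2, r3, c2	@ write VTTBR: VMID == 0 (host)
```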

	M.
Will Deacon June 20, 2013, 10:48 a.m. UTC | #3
On Wed, Jun 19, 2013 at 02:20:04PM +0100, Marc Zyngier wrote:
> We may have preempted the guest while it was performing a maintainance
> operation (TLB invalidation, for example). Make sure it completes
> before we do anything else by adding the necessary barriers.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm/kvm/interrupts.S | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> index afa6c04..3124e0f 100644
> --- a/arch/arm/kvm/interrupts.S
> +++ b/arch/arm/kvm/interrupts.S
> @@ -149,6 +149,15 @@ __kvm_vcpu_return:
>  	 * r0: vcpu pointer
>  	 * r1: exception code
>  	 */
> +
> +	/*
> +	 * We may have preempted the guest while it was performing a
> +	 * maintainance operation (TLB invalidation, for example). Make
> +	 * sure it completes before we do anything else.
> +	 */
> +	dsb

Same here; you can use the inner-shareable version.

Will
Christoffer Dall June 20, 2013, 5:14 p.m. UTC | #4
On Thu, Jun 20, 2013 at 09:13:22AM +0100, Marc Zyngier wrote:
> On 20/06/13 01:18, Christoffer Dall wrote:
> > On Wed, Jun 19, 2013 at 02:20:04PM +0100, Marc Zyngier wrote:
> >> We may have preempted the guest while it was performing a maintainance
> >> operation (TLB invalidation, for example). Make sure it completes
> >> before we do anything else by adding the necessary barriers.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  arch/arm/kvm/interrupts.S | 9 +++++++++
> >>  1 file changed, 9 insertions(+)
> >>
> >> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> >> index afa6c04..3124e0f 100644
> >> --- a/arch/arm/kvm/interrupts.S
> >> +++ b/arch/arm/kvm/interrupts.S
> >> @@ -149,6 +149,15 @@ __kvm_vcpu_return:
> >>  	 * r0: vcpu pointer
> >>  	 * r1: exception code
> >>  	 */
> >> +
> >> +	/*
> >> +	 * We may have preempted the guest while it was performing a
> >> +	 * maintainance operation (TLB invalidation, for example). Make
> >> +	 * sure it completes before we do anything else.
> >> +	 */
> > 
> > Can you explain what could go wrong here without these two instructions?
> 
> There would be no guarantee that the TLB invalidation has effectively
> completed, and is visible by other CPUs. Not sure that would be a
> massive issue in any decent guest OS, but I thought it was worth plugging.

ok, I was trying to think about how it would break. If a guest needs a
TLB invalidation to be visible to other CPUs, it would have to issue a
dsb/isb itself after the operation, and that would eventually be
executed once the VCPU was rescheduled, potentially on another CPU; but
then I wonder if the PCPU migration on the host wouldn't take care of
it?

It sounds like you're not 100% sure it actually breaks something (or am
I reading it wrong?), but if the performance impact is minor, why not be
on the safe side, I guess.

> 
> Another (more serious) thing I had doubts about was that we're about to
> switch VMID to restore the host context. The ARM ARM doesn't clearly
> specify the interaction between pending TLB maintainance and VMID
> switch, and I'm worried that you could end up performing the TLB
> maintainance on the *host* TLBs rather than on the guest's.
> 
> Having this dsb/isb sequence before switching VMID gives us a strong
> guarantee that such a mixup cannot occur.
> 
This is really hurting my brain.

Again, it seems the argument is, why not, and maybe it's required.
And indeed, if it gives us peace of mind, I'm ok with it.

Sorry about this OCD.

-Christoffer
Marc Zyngier June 20, 2013, 5:29 p.m. UTC | #5
On 20/06/13 18:14, Christoffer Dall wrote:
> On Thu, Jun 20, 2013 at 09:13:22AM +0100, Marc Zyngier wrote:
>> On 20/06/13 01:18, Christoffer Dall wrote:
>>> On Wed, Jun 19, 2013 at 02:20:04PM +0100, Marc Zyngier wrote:
>>>> We may have preempted the guest while it was performing a maintainance
>>>> operation (TLB invalidation, for example). Make sure it completes
>>>> before we do anything else by adding the necessary barriers.
>>>>
>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>> ---
>>>>  arch/arm/kvm/interrupts.S | 9 +++++++++
>>>>  1 file changed, 9 insertions(+)
>>>>
>>>> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
>>>> index afa6c04..3124e0f 100644
>>>> --- a/arch/arm/kvm/interrupts.S
>>>> +++ b/arch/arm/kvm/interrupts.S
>>>> @@ -149,6 +149,15 @@ __kvm_vcpu_return:
>>>>  	 * r0: vcpu pointer
>>>>  	 * r1: exception code
>>>>  	 */
>>>> +
>>>> +	/*
>>>> +	 * We may have preempted the guest while it was performing a
>>>> +	 * maintainance operation (TLB invalidation, for example). Make
>>>> +	 * sure it completes before we do anything else.
>>>> +	 */
>>>
>>> Can you explain what could go wrong here without these two instructions?
>>
>> There would be no guarantee that the TLB invalidation has effectively
>> completed, and is visible by other CPUs. Not sure that would be a
>> massive issue in any decent guest OS, but I thought it was worth plugging.
> 
> ok, I was trying to think about how it would break, and if a guest needs
> a TLB invalidation to be visisble by other CPUs it would have to have a
> dsb/isb itself after the operation, and that would eventually be
> executed once the VCPU was rescheduled, but potentially on another CPU,
> but then I wonder if the PCPU migration on the host wouldn't take care
> of it?
> 
> It sounds like you're not 100% sure it actually breaks something (or am
> I reading it wrong?), but if the performance impact is minor, why not be
> on the safe side I guess.

I think a well-written guest wouldn't be affected.

>>
>> Another (more serious) thing I had doubts about was that we're about to
>> switch VMID to restore the host context. The ARM ARM doesn't clearly
>> specify the interaction between pending TLB maintainance and VMID
>> switch, and I'm worried that you could end up performing the TLB
>> maintainance on the *host* TLBs rather than on the guest's.
>>
>> Having this dsb/isb sequence before switching VMID gives us a strong
>> guarantee that such a mixup cannot occur.
>>
> This is really hurting my brain.
> 
> Again, it seems the argument is, why not, and maybe it's required.
> And indeed, if it gives us peace of mind, I'm ok with it.

I guess my problem here is that the spec isn't 100% clear about what
happens. Which means a compliant implementation could do things that
would go horribly wrong.

I'm fairly confident that Cortex-A15 doesn't require this. But other
implementations might, and that's what I'm trying to cover here.

> Sorry about this OCD.

No worries.

	M.
Will Deacon June 20, 2013, 6:15 p.m. UTC | #6
On Thu, Jun 20, 2013 at 06:14:09PM +0100, Christoffer Dall wrote:
> On Thu, Jun 20, 2013 at 09:13:22AM +0100, Marc Zyngier wrote:
> > On 20/06/13 01:18, Christoffer Dall wrote:
> > > On Wed, Jun 19, 2013 at 02:20:04PM +0100, Marc Zyngier wrote:
> > >> We may have preempted the guest while it was performing a maintainance
> > >> operation (TLB invalidation, for example). Make sure it completes
> > >> before we do anything else by adding the necessary barriers.
> > >>
> > >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > >> ---
> > >>  arch/arm/kvm/interrupts.S | 9 +++++++++
> > >>  1 file changed, 9 insertions(+)
> > >>
> > >> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> > >> index afa6c04..3124e0f 100644
> > >> --- a/arch/arm/kvm/interrupts.S
> > >> +++ b/arch/arm/kvm/interrupts.S
> > >> @@ -149,6 +149,15 @@ __kvm_vcpu_return:
> > >>  	 * r0: vcpu pointer
> > >>  	 * r1: exception code
> > >>  	 */
> > >> +
> > >> +	/*
> > >> +	 * We may have preempted the guest while it was performing a
> > >> +	 * maintainance operation (TLB invalidation, for example). Make
> > >> +	 * sure it completes before we do anything else.
> > >> +	 */
> > > 
> > > Can you explain what could go wrong here without these two instructions?
> > 
> > There would be no guarantee that the TLB invalidation has effectively
> > completed, and is visible by other CPUs. Not sure that would be a
> > massive issue in any decent guest OS, but I thought it was worth plugging.
> 
> ok, I was trying to think about how it would break, and if a guest needs
> a TLB invalidation to be visisble by other CPUs it would have to have a
> dsb/isb itself after the operation, and that would eventually be
> executed once the VCPU was rescheduled, but potentially on another CPU,
> but then I wonder if the PCPU migration on the host wouldn't take care
> of it?

Actually, it's worse than both of you think :)

The dsb *must* be executed on the same physical CPU as the TLB invalidation.
The same virtual CPU isn't enough, which is all that is guaranteed by the
guest. If you don't have a dsb on your vcpu migration path, then you need
something here.

The same thing applies to cache maintenance operations.

Will
Christoffer Dall June 20, 2013, 6:28 p.m. UTC | #7
On Thu, Jun 20, 2013 at 07:15:25PM +0100, Will Deacon wrote:
> On Thu, Jun 20, 2013 at 06:14:09PM +0100, Christoffer Dall wrote:
> > On Thu, Jun 20, 2013 at 09:13:22AM +0100, Marc Zyngier wrote:
> > > On 20/06/13 01:18, Christoffer Dall wrote:
> > > > On Wed, Jun 19, 2013 at 02:20:04PM +0100, Marc Zyngier wrote:
> > > >> We may have preempted the guest while it was performing a maintainance
> > > >> operation (TLB invalidation, for example). Make sure it completes
> > > >> before we do anything else by adding the necessary barriers.
> > > >>
> > > >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > > >> ---
> > > >>  arch/arm/kvm/interrupts.S | 9 +++++++++
> > > >>  1 file changed, 9 insertions(+)
> > > >>
> > > >> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> > > >> index afa6c04..3124e0f 100644
> > > >> --- a/arch/arm/kvm/interrupts.S
> > > >> +++ b/arch/arm/kvm/interrupts.S
> > > >> @@ -149,6 +149,15 @@ __kvm_vcpu_return:
> > > >>  	 * r0: vcpu pointer
> > > >>  	 * r1: exception code
> > > >>  	 */
> > > >> +
> > > >> +	/*
> > > >> +	 * We may have preempted the guest while it was performing a
> > > >> +	 * maintainance operation (TLB invalidation, for example). Make
> > > >> +	 * sure it completes before we do anything else.
> > > >> +	 */
> > > > 
> > > > Can you explain what could go wrong here without these two instructions?
> > > 
> > > There would be no guarantee that the TLB invalidation has effectively
> > > completed, and is visible by other CPUs. Not sure that would be a
> > > massive issue in any decent guest OS, but I thought it was worth plugging.
> > 
> > ok, I was trying to think about how it would break, and if a guest needs
> > a TLB invalidation to be visisble by other CPUs it would have to have a
> > dsb/isb itself after the operation, and that would eventually be
> > executed once the VCPU was rescheduled, but potentially on another CPU,
> > but then I wonder if the PCPU migration on the host wouldn't take care
> > of it?
> 
> Actually, it's worse than both of you think :)
> 
> The dsb *must* be executed on the same physical CPU as the TLB invalidation.
> The same virtual CPU isn't enough, which is all that is guaranteed by the
> guest. If you don't have a dsb on your vcpu migration path, then you need
> something here.
> 
> The same thing applies to cache maintenance operations.
> 
But are we not sure that a dsb will happen anywhere in the kernel if a
process is migrated to a different core?

-Christoffer
Will Deacon June 20, 2013, 6:38 p.m. UTC | #8
On Thu, Jun 20, 2013 at 07:28:47PM +0100, Christoffer Dall wrote:
> On Thu, Jun 20, 2013 at 07:15:25PM +0100, Will Deacon wrote:
> > On Thu, Jun 20, 2013 at 06:14:09PM +0100, Christoffer Dall wrote:
> > > ok, I was trying to think about how it would break, and if a guest needs
> > > a TLB invalidation to be visisble by other CPUs it would have to have a
> > > dsb/isb itself after the operation, and that would eventually be
> > > executed once the VCPU was rescheduled, but potentially on another CPU,
> > > but then I wonder if the PCPU migration on the host wouldn't take care
> > > of it?
> > 
> > Actually, it's worse than both of you think :)
> > 
> > The dsb *must* be executed on the same physical CPU as the TLB invalidation.
> > The same virtual CPU isn't enough, which is all that is guaranteed by the
> > guest. If you don't have a dsb on your vcpu migration path, then you need
> > something here.
> > 
> > The same thing applies to cache maintenance operations.
> > 
> But are we not sure that a dsb will happen anywhere in the kernel if a
> process is migrated to a different core?

Yes, we have a dsb when we unlock the runqueue for a CPU. That's why Linux
doesn't crash and burn usually. If vcpu migration always goes through the
usual scheduling paths, then you don't have a problem.
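For reference, the barrier in question sits at the tail of the ARM
ticket-lock unlock path; roughly (a sketch of the idea, not the exact
kernel source):

```asm
	dmb			@ order prior accesses before the release
	strh	r1, [r0]	@ store the new owner ticket: lock released
	dsb			@ dsb_sev(): complete outstanding maintenance...
	sev			@ ...and wake any CPUs waiting in wfe
```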

Will
Christoffer Dall June 20, 2013, 6:50 p.m. UTC | #9
On Thu, Jun 20, 2013 at 07:38:18PM +0100, Will Deacon wrote:
> On Thu, Jun 20, 2013 at 07:28:47PM +0100, Christoffer Dall wrote:
> > On Thu, Jun 20, 2013 at 07:15:25PM +0100, Will Deacon wrote:
> > > On Thu, Jun 20, 2013 at 06:14:09PM +0100, Christoffer Dall wrote:
> > > > ok, I was trying to think about how it would break, and if a guest needs
> > > > a TLB invalidation to be visisble by other CPUs it would have to have a
> > > > dsb/isb itself after the operation, and that would eventually be
> > > > executed once the VCPU was rescheduled, but potentially on another CPU,
> > > > but then I wonder if the PCPU migration on the host wouldn't take care
> > > > of it?
> > > 
> > > Actually, it's worse than both of you think :)
> > > 
> > > The dsb *must* be executed on the same physical CPU as the TLB invalidation.
> > > The same virtual CPU isn't enough, which is all that is guaranteed by the
> > > guest. If you don't have a dsb on your vcpu migration path, then you need
> > > something here.
> > > 
> > > The same thing applies to cache maintenance operations.
> > > 
> > But are we not sure that a dsb will happen anywhere in the kernel if a
> > process is migrated to a different core?
> 
> Yes, we have a dsb when we unlock the runqueue for a CPU. That's why Linux
> doesn't crash and burn usually. If vcpu migration always goes through the
> usual scheduling paths, then you don't have a problem.
> 
Right, a vcpu is simply a thread, a process, so it gets migrated on the
host like any other process.

I gather this means we don't need these, except maybe for the VMID
rollover case, which I honestly didn't fully understand, but maybe it
can be added for that specific case instead?

-Christoffer

Patch

diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
index afa6c04..3124e0f 100644
--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -149,6 +149,15 @@  __kvm_vcpu_return:
 	 * r0: vcpu pointer
 	 * r1: exception code
 	 */
+
+	/*
+	 * We may have preempted the guest while it was performing a
+	 * maintenance operation (TLB invalidation, for example). Make
+	 * sure it completes before we do anything else.
+	 */
+	dsb
+	isb
+
 	save_guest_regs
 
 	@ Set VMID == 0