diff mbox

[3/4] Nested SVM: Implement INVLPGA v2

Message ID 1242730443-15656-4-git-send-email-agraf@suse.de (mailing list archive)
State New, archived
Headers show

Commit Message

Alexander Graf May 19, 2009, 10:54 a.m. UTC
SVM adds another way to do INVLPG by ASID which Hyper-V makes use of,
so let's implement it!

For now we just do the same thing invlpg does, as asid switching
means we flush the mmu anyways. That might change one day though.

v2 makes invlpga do the same as invlpg, not flush the whole mmu

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/x86/kvm/svm.c |   15 ++++++++++++++-
 1 files changed, 14 insertions(+), 1 deletions(-)

Comments

Avi Kivity May 19, 2009, 12:58 p.m. UTC | #1
Alexander Graf wrote:
> SVM adds another way to do INVLPG by ASID which Hyper-V makes use of,
> so let's implement it!
>
> For now we just do the same thing invlpg does, as asid switching
> means we flush the mmu anyways. That might change one day though.
>
> v2 makes invlpga do the same as invlpg, not flush the whole mmu
>
>  
> +static int invlpga_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
> +{
> +	struct kvm_vcpu *vcpu = &svm->vcpu;
> +	nsvm_printk("INVLPGA\n");
> +
> +	/* Let's treat INVLPGA the same as INVLPG */
> +	kvm_mmu_invlpg(vcpu, vcpu->arch.regs[VCPU_REGS_RAX]);
> +
> +	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
> +	skip_emulated_instruction(&svm->vcpu);
> +	return 1;
> +}
>   

I think that for ASID!=0 you can actually do nothing.  The guest entry 
is a cr3 switch, so we'll both get a tlb flush and a resync on any 
modified ptes.

For ASID==0 you can do the invlpg thing.

Marcelo?
Alexander Graf May 19, 2009, 1:10 p.m. UTC | #2
On 19.05.2009, at 14:58, Avi Kivity wrote:

> Alexander Graf wrote:
>> SVM adds another way to do INVLPG by ASID which Hyper-V makes use of,
>> so let's implement it!
>>
>> For now we just do the same thing invlpg does, as asid switching
>> means we flush the mmu anyways. That might change one day though.
>>
>> v2 makes invlpga do the same as invlpg, not flush the whole mmu
>>
>> +static int invlpga_interception(struct vcpu_svm *svm, struct  
>> kvm_run *kvm_run)
>> +{
>> +	struct kvm_vcpu *vcpu = &svm->vcpu;
>> +	nsvm_printk("INVLPGA\n");
>> +
>> +	/* Let's treat INVLPGA the same as INVLPG */
>> +	kvm_mmu_invlpg(vcpu, vcpu->arch.regs[VCPU_REGS_RAX]);
>> +
>> +	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
>> +	skip_emulated_instruction(&svm->vcpu);
>> +	return 1;
>> +}
>>
>
> I think that for ASID!=0 you can actually do nothing.  The guest  
> entry is a cr3 switch, so we'll both get a tlb flush and a resync on  
> any modified ptes.

Right, the only situation I can imagine this isn't fulfilled is when  
INVLPGA isn't trapped in the 1st level guest, but issued in the 2nd  
level one. That should be rather rare though ;-).

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity May 19, 2009, 1:19 p.m. UTC | #3
Alexander Graf wrote:
>> I think that for ASID!=0 you can actually do nothing.  The guest 
>> entry is a cr3 switch, so we'll both get a tlb flush and a resync on 
>> any modified ptes.
>
> Right, the only situation I can imagine this isn't fulfilled is when 
> INVLPGA isn't trapped in the 1st level guest, but issued in the 2nd 
> level one. That should be rather rare though ;-).

Good catch.

Would be better to get it right; changing the test to asid != 
current_asid should suffice.
Marcelo Tosatti May 19, 2009, 1:54 p.m. UTC | #4
On Tue, May 19, 2009 at 03:58:52PM +0300, Avi Kivity wrote:
> Alexander Graf wrote:
>> SVM adds another way to do INVLPG by ASID which Hyper-V makes use of,
>> so let's implement it!
>>
>> For now we just do the same thing invlpg does, as asid switching
>> means we flush the mmu anyways. That might change one day though.
>>
>> v2 makes invlpga do the same as invlpg, not flush the whole mmu
>>
>>  +static int invlpga_interception(struct vcpu_svm *svm, struct kvm_run 
>> *kvm_run)
>> +{
>> +	struct kvm_vcpu *vcpu = &svm->vcpu;
>> +	nsvm_printk("INVLPGA\n");
>> +
>> +	/* Let's treat INVLPGA the same as INVLPG */
>> +	kvm_mmu_invlpg(vcpu, vcpu->arch.regs[VCPU_REGS_RAX]);
>> +
>> +	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
>> +	skip_emulated_instruction(&svm->vcpu);
>> +	return 1;
>> +}
>>   
>
> I think that for ASID!=0 you can actually do nothing.  The guest entry  
> is a cr3 switch, so we'll both get a tlb flush and a resync on any  
> modified ptes.
>
> For ASID==0 you can do the invlpg thing.
>
> Marcelo?

kvm_mmu_invlpg is cheap, better just invalidate the entry. If hyper-v
uses invlpga to invalidate TLB entries which it has updated pte's in
memory for, and you skip the invalidation now and somehow later use an
unsync spte, you're toast.


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity May 19, 2009, 1:56 p.m. UTC | #5
Marcelo Tosatti wrote:
>> I think that for ASID!=0 you can actually do nothing.  The guest entry  
>> is a cr3 switch, so we'll both get a tlb flush and a resync on any  
>> modified ptes.
>>
>> For ASID==0 you can do the invlpg thing.
>>
>> Marcelo?
>>     
>
> kvm_mmu_invlpg is cheap, better just invalidate the entry. If hyper-v
> uses invlpga to invalidate TLB entries which it has updated pte's in
> memory for, and you skip the invalidation now and somehow later use an
> unsync spte, you're toast.
>   

But won't the guest entry cause a resync?

Doing nothing is even cheaper.
Marcelo Tosatti May 19, 2009, 1:58 p.m. UTC | #6
On Tue, May 19, 2009 at 04:56:48PM +0300, Avi Kivity wrote:
> Marcelo Tosatti wrote:
>>> I think that for ASID!=0 you can actually do nothing.  The guest 
>>> entry  is a cr3 switch, so we'll both get a tlb flush and a resync on 
>>> any  modified ptes.
>>>
>>> For ASID==0 you can do the invlpg thing.
>>>
>>> Marcelo?
>>>     
>>
>> kvm_mmu_invlpg is cheap, better just invalidate the entry. If hyper-v
>> uses invlpga to invalidate TLB entries which it has updated pte's in
>> memory for, and you skip the invalidation now and somehow later use an
>> unsync spte, you're toast.
>>   
>
> But won't the guest entry cause a resync?

If its a cr3/cr4 exit, yes. 

> Doing nothing is even cheaper.

My brain is nested.


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Graf May 19, 2009, 3:18 p.m. UTC | #7
On 19.05.2009, at 15:58, Marcelo Tosatti wrote:

> On Tue, May 19, 2009 at 04:56:48PM +0300, Avi Kivity wrote:
>> Marcelo Tosatti wrote:
>>>> I think that for ASID!=0 you can actually do nothing.  The guest
>>>> entry  is a cr3 switch, so we'll both get a tlb flush and a  
>>>> resync on
>>>> any  modified ptes.
>>>>
>>>> For ASID==0 you can do the invlpg thing.
>>>>
>>>> Marcelo?
>>>>
>>>
>>> kvm_mmu_invlpg is cheap, better just invalidate the entry. If  
>>> hyper-v
>>> uses invlpga to invalidate TLB entries which it has updated pte's in
>>> memory for, and you skip the invalidation now and somehow later  
>>> use an
>>> unsync spte, you're toast.
>>>
>>
>> But won't the guest entry cause a resync?
>
> If its a cr3/cr4 exit, yes.

Well it has to be. Either we're switching from one NPT to the other  
(todo) or do a normal cr3+cr4 switch.

So I guess we can optimize here. Is it worth it?

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Marcelo Tosatti May 19, 2009, 4:33 p.m. UTC | #8
On Tue, May 19, 2009 at 05:18:07PM +0200, Alexander Graf wrote:
>
> On 19.05.2009, at 15:58, Marcelo Tosatti wrote:
>
>> On Tue, May 19, 2009 at 04:56:48PM +0300, Avi Kivity wrote:
>>> Marcelo Tosatti wrote:
>>>>> I think that for ASID!=0 you can actually do nothing.  The guest
>>>>> entry  is a cr3 switch, so we'll both get a tlb flush and a  
>>>>> resync on
>>>>> any  modified ptes.
>>>>>
>>>>> For ASID==0 you can do the invlpg thing.
>>>>>
>>>>> Marcelo?
>>>>>
>>>>
>>>> kvm_mmu_invlpg is cheap, better just invalidate the entry. If  
>>>> hyper-v
>>>> uses invlpga to invalidate TLB entries which it has updated pte's in
>>>> memory for, and you skip the invalidation now and somehow later  
>>>> use an
>>>> unsync spte, you're toast.
>>>>
>>>
>>> But won't the guest entry cause a resync?
>>
>> If its a cr3/cr4 exit, yes.
>
> Well it has to be. Either we're switching from one NPT to the other  
> (todo) or do a normal cr3+cr4 switch.
>
> So I guess we can optimize here. Is it worth it?

IMHO better leave it the way it is, perhaps add a comment that 
the optimization is possible, and do it later if worthwhile.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity May 19, 2009, 4:35 p.m. UTC | #9
Alexander Graf wrote:
>>>> kvm_mmu_invlpg is cheap, better just invalidate the entry. If hyper-v
>>>> uses invlpga to invalidate TLB entries which it has updated pte's in
>>>> memory for, and you skip the invalidation now and somehow later use an
>>>> unsync spte, you're toast.
>>>>
>>>
>>> But won't the guest entry cause a resync?
>>
>> If its a cr3/cr4 exit, yes.
>
> Well it has to be. Either we're switching from one NPT to the other 
> (todo) or do a normal cr3+cr4 switch.
>
> So I guess we can optimize here. Is it worth it?
>

I think so.  We also need to make sure the entry causes a resync, even 
if cr3 doesn't change.

Oh, exit needs to force a resync as well, in case the guest foolishly 
let its guest touch its page tables and issue invlpga asid=0.
diff mbox

Patch

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 4b4eadd..fa2a710 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1785,6 +1785,19 @@  static int clgi_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 	return 1;
 }
 
+static int invlpga_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
+{
+	struct kvm_vcpu *vcpu = &svm->vcpu;
+	nsvm_printk("INVLPGA\n");
+
+	/* Let's treat INVLPGA the same as INVLPG */
+	kvm_mmu_invlpg(vcpu, vcpu->arch.regs[VCPU_REGS_RAX]);
+
+	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
+	skip_emulated_instruction(&svm->vcpu);
+	return 1;
+}
+
 static int invalid_op_interception(struct vcpu_svm *svm,
 				   struct kvm_run *kvm_run)
 {
@@ -2130,7 +2143,7 @@  static int (*svm_exit_handlers[])(struct vcpu_svm *svm,
 	[SVM_EXIT_INVD]                         = emulate_on_interception,
 	[SVM_EXIT_HLT]				= halt_interception,
 	[SVM_EXIT_INVLPG]			= invlpg_interception,
-	[SVM_EXIT_INVLPGA]			= invalid_op_interception,
+	[SVM_EXIT_INVLPGA]			= invlpga_interception,
 	[SVM_EXIT_IOIO] 		  	= io_interception,
 	[SVM_EXIT_MSR]				= msr_interception,
 	[SVM_EXIT_TASK_SWITCH]			= task_switch_interception,