IO on guest is 20 times slower than host

Message ID 49CFC7A2.3030808@redhat.com (mailing list archive)
State New, archived

Commit Message

Avi Kivity March 29, 2009, 7:10 p.m. UTC
Avi Kivity wrote:
> Kurt Yoder wrote:
>> slow host cpu information, core 1 of 16:
>>
>> processor       : 0
>> vendor_id       : AuthenticAMD
>> cpu family      : 16
>> model           : 4
>> model name      : Quad-Core AMD Opteron(tm) Processor 8382
>> stepping        : 2
>> cpu MHz         : 2611.998
>> cache size      : 512 KB
>> physical id     : 0
>> siblings        : 4
>> core id         : 0
>> cpu cores       : 4
>> apicid          : 0
>> initial apicid  : 0
>> fpu             : yes
>> fpu_exception   : yes
>> cpuid level     : 5
>> wp              : yes
>> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>> cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall mmxext fxsr_opt
>> pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl pni monitor
>> cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a
>> misalignsse 3dnowprefetch osvw ibs skinit wdt
>> bogomips        : 5223.97
>> TLB size        : 1024 4K pages
>> clflush size    : 64
>> cache_alignment : 64
>> address sizes   : 48 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>
>>
>>   
>
> Can you try loading kvm_amd on this host with 'modprobe kvm-amd npt=0'?
>

If it helps, then the guest is messing up the cpu cache.  Try the 
attached patch.

Comments

Joerg Roedel March 31, 2009, 9:59 a.m. UTC | #1
On Sun, Mar 29, 2009 at 10:10:26PM +0300, Avi Kivity wrote:
> Avi Kivity wrote:
>> Kurt Yoder wrote:
>>> slow host cpu information, core 1 of 16:
>>>
>>> processor       : 0
>>> vendor_id       : AuthenticAMD
>>> cpu family      : 16
>>> model           : 4
>>> model name      : Quad-Core AMD Opteron(tm) Processor 8382
>>> stepping        : 2
>>> cpu MHz         : 2611.998
>>> cache size      : 512 KB
>>> physical id     : 0
>>> siblings        : 4
>>> core id         : 0
>>> cpu cores       : 4
>>> apicid          : 0
>>> initial apicid  : 0
>>> fpu             : yes
>>> fpu_exception   : yes
>>> cpuid level     : 5
>>> wp              : yes
>>> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>>> cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall mmxext fxsr_opt
>>> pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl pni monitor
>>> cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a
>>> misalignsse 3dnowprefetch osvw ibs skinit wdt
>>> bogomips        : 5223.97
>>> TLB size        : 1024 4K pages
>>> clflush size    : 64
>>> cache_alignment : 64
>>> address sizes   : 48 bits physical, 48 bits virtual
>>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>>
>>>
>>>   
>>
>> Can you try loading kvm_amd on this host with 'modprobe kvm-amd npt=0'?
>>
>
> If it helps, then the guest is messing up the cpu cache.  Try the  
> attached patch.
>
> -- 
> I have a truly marvellous patch that fixes the bug which this
> signature is too narrow to contain.
>

> diff --git a/kernel/x86/kvm/svm.c b/kernel/x86/kvm/svm.c
> index 1fcbc17..d9774e9 100644
> --- a/kernel/x86/kvm/svm.c
> +++ b/kernel/x86/kvm/svm.c
> @@ -575,7 +575,7 @@ static void init_vmcb(struct vcpu_svm *svm)
>  						INTERCEPT_CR3_MASK);
>  		control->intercept_cr_write &= ~(INTERCEPT_CR0_MASK|
>  						 INTERCEPT_CR3_MASK);
> -		save->g_pat = 0x0007040600070406ULL;
> +		save->g_pat = 0x0606060606060606ULL;
>  		/* enable caching because the QEMU Bios doesn't enable it */
>  		save->cr0 = X86_CR0_ET;
>  		save->cr3 = 0;

Yeah, that patch makes sense. But I think we need some more work on this
because the guest may change the pat msr afterwards. Best would be a simple
shadow of the pat msr. Last question is how this will affect pci passthrough.

	Joerg
Avi Kivity March 31, 2009, 10:02 a.m. UTC | #2
Joerg Roedel wrote:
>> --- a/kernel/x86/kvm/svm.c
>> +++ b/kernel/x86/kvm/svm.c
>> @@ -575,7 +575,7 @@ static void init_vmcb(struct vcpu_svm *svm)
>>  						INTERCEPT_CR3_MASK);
>>  		control->intercept_cr_write &= ~(INTERCEPT_CR0_MASK|
>>  						 INTERCEPT_CR3_MASK);
>> -		save->g_pat = 0x0007040600070406ULL;
>> +		save->g_pat = 0x0606060606060606ULL;
>>  		/* enable caching because the QEMU Bios doesn't enable it */
>>  		save->cr0 = X86_CR0_ET;
>>  		save->cr3 = 0;
>>     
>
> Yeah, that patch makes sense. But I think we need some more work on this
> because the guest may change the pat msr afterwards. Best would be a simple
> shadow of the pat msr. Last question is how this will affect pci passthrough.

This is just a stopgap; we can later add proper pat shadowing.
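
For illustration only, here is a minimal user-space sketch of what "a simple shadow of the pat msr" could look like. It is not the code that was eventually merged. The struct and function names (pat_shadow, pat_shadow_write, and so on) are hypothetical, and the sketch assumes the hypervisor intercepts guest reads and writes of the PAT MSR. The guest sees the value it last wrote, while the value handed to hardware stays all write-back, as in the stopgap patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAT_RESET  0x0007040600070406ULL	/* architectural reset value */
#define PAT_ALL_WB 0x0606060606060606ULL	/* value used by the stopgap patch */

/* Hypothetical per-vCPU state; this is NOT the real struct vcpu_svm. */
struct pat_shadow {
	uint64_t guest_pat;	/* what the guest believes the PAT MSR holds */
	uint64_t hw_pat;	/* what would actually be placed in save->g_pat */
};

/* A PAT value is valid if every byte is one of 0, 1, 4, 5, 6 or 7. */
static bool pat_valid(uint64_t pat)
{
	int i;

	for (i = 0; i < 8; i++) {
		uint8_t entry = (uint8_t)(pat >> (8 * i));

		if (entry == 2 || entry == 3 || entry > 7)
			return false;
	}
	return true;
}

static void pat_shadow_init(struct pat_shadow *s)
{
	s->guest_pat = PAT_RESET;
	s->hw_pat = PAT_ALL_WB;
}

/* Intercepted guest WRMSR; returns false if the write should raise #GP. */
static bool pat_shadow_write(struct pat_shadow *s, uint64_t value)
{
	if (!pat_valid(value))
		return false;
	s->guest_pat = value;	/* remember it, but leave hw_pat untouched */
	return true;
}

/* Intercepted guest RDMSR; returns what the guest last wrote. */
static uint64_t pat_shadow_read(const struct pat_shadow *s)
{
	return s->guest_pat;
}
```

A real implementation would also have to decide when hw_pat must follow guest_pat, for example when a device is assigned to the guest; the sketch deliberately leaves that open.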
Avi Kivity April 4, 2009, 11:47 a.m. UTC | #3
Joerg Roedel wrote:
>> index 1fcbc17..d9774e9 100644
>> --- a/kernel/x86/kvm/svm.c
>> +++ b/kernel/x86/kvm/svm.c
>> @@ -575,7 +575,7 @@ static void init_vmcb(struct vcpu_svm *svm)
>>  						INTERCEPT_CR3_MASK);
>>  		control->intercept_cr_write &= ~(INTERCEPT_CR0_MASK|
>>  						 INTERCEPT_CR3_MASK);
>> -		save->g_pat = 0x0007040600070406ULL;
>> +		save->g_pat = 0x0606060606060606ULL;
>>  		/* enable caching because the QEMU Bios doesn't enable it */
>>  		save->cr0 = X86_CR0_ET;
>>  		save->cr3 = 0;
>>     
>
> Yeah, that patch makes sense. But I think we need some more work on this
> because the guest may change the pat msr afterwards. Best would be a simple
> shadow of the pat msr. Last question is how this will affect pci passthrough.
>
>   

I've noticed that Windows (and likely Linux, didn't test) maps the 
cirrus framebuffer with PWT=1, which should slow down the emulated 
framebuffer.  So this patch should speed up things.
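
To make the PWT=1 case concrete, the sketch below shows how a 4 KiB page table entry selects a PAT entry: the PAT, PCD and PWT bits form a 3-bit index into the eight PAT bytes. The helper names are made up for this example, and the sketch ignores MTRRs and any combining with host-side attributes under nested paging, so it only shows the guest-PAT part of the picture.

```c
#include <stdint.h>

#define PTE_PWT (1ULL << 3)	/* page-level write-through */
#define PTE_PCD (1ULL << 4)	/* page-level cache disable */
#define PTE_PAT (1ULL << 7)	/* PAT bit in a 4 KiB page table entry */

/* PAT index selected by a 4 KiB PTE: PAT:PCD:PWT read as a 3-bit number. */
static unsigned int pte_pat_index(uint64_t pte)
{
	return ((pte & PTE_PAT) ? 4u : 0u) |
	       ((pte & PTE_PCD) ? 2u : 0u) |
	       ((pte & PTE_PWT) ? 1u : 0u);
}

/*
 * Memory type encoding selected by the PTE for a given PAT value:
 * 0 = UC, 1 = WC, 4 = WT, 5 = WP, 6 = WB, 7 = UC-.
 */
static unsigned int pte_memory_type(uint64_t pat, uint64_t pte)
{
	return (unsigned int)(pat >> (8 * pte_pat_index(pte))) & 0x7;
}
```

With the old g_pat value 0x0007040600070406, a mapping with only PWT set selects entry 1, which is write-through (4). With 0x0606060606060606 the same mapping selects write-back (6).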

If a device is assigned, we must respect the guest PAT, so cirrus 
performance will be low.  On Intel there's an 'ignore PAT' bit which can 
be set on an ept pte for the framebuffer.  Any trick we can do on AMD to 
achieve a similar result?

Patch

diff --git a/kernel/x86/kvm/svm.c b/kernel/x86/kvm/svm.c
index 1fcbc17..d9774e9 100644
--- a/kernel/x86/kvm/svm.c
+++ b/kernel/x86/kvm/svm.c
@@ -575,7 +575,7 @@  static void init_vmcb(struct vcpu_svm *svm)
 						INTERCEPT_CR3_MASK);
 		control->intercept_cr_write &= ~(INTERCEPT_CR0_MASK|
 						 INTERCEPT_CR3_MASK);
-		save->g_pat = 0x0007040600070406ULL;
+		save->g_pat = 0x0606060606060606ULL;
 		/* enable caching because the QEMU Bios doesn't enable it */
 		save->cr0 = X86_CR0_ET;
 		save->cr3 = 0;
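
As a reading aid for the two constants in the patch, the following stand-alone sketch decodes a PAT value byte by byte. It is not part of the patch, and the helper names are invented for this example. Each byte of the MSR is one entry (PA0 in the lowest byte), and the low three bits of each entry encode the memory type.

```c
#include <stdint.h>
#include <stdio.h>

/* Name of the memory type encoded in one PAT entry (low 3 bits are used). */
static const char *pat_type_name(uint8_t entry)
{
	switch (entry & 0x7) {
	case 0: return "UC";
	case 1: return "WC";
	case 4: return "WT";
	case 5: return "WP";
	case 6: return "WB";
	case 7: return "UC-";
	default: return "reserved";	/* encodings 2 and 3 */
	}
}

/* Extract entry PA0..PA7; each entry occupies one byte of the MSR. */
static uint8_t pat_entry(uint64_t pat, unsigned int index)
{
	return (uint8_t)(pat >> (8 * (index & 7)));
}

/* Print all eight entries of a PAT value, one per line. */
static void pat_dump(const char *label, uint64_t pat)
{
	unsigned int i;

	printf("%s = 0x%016llx\n", label, (unsigned long long)pat);
	for (i = 0; i < 8; i++)
		printf("  PA%u: %s\n", i, pat_type_name(pat_entry(pat, i)));
}
```

Calling pat_dump() on 0x0007040600070406ULL lists WB, WT, UC-, UC for PA0 to PA3 and the same again for PA4 to PA7; on 0x0606060606060606ULL every entry is WB, which is why the guest's page attribute bits no longer select an uncached or write-through type.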