KVM: SVM: obey guest PAT

Message ID 20171026071327.15427-1-pbonzini@redhat.com
State New

Commit Message

Paolo Bonzini Oct. 26, 2017, 7:13 a.m. UTC
For many years some users of assigned devices have reported worse
performance on AMD processors with NPT than on AMD without NPT,
Intel or bare metal.

The reason turned out to be that SVM discards the guest PAT
setting and uses the default (PA0=PA4=WB, PA1=PA5=WT, PA2=PA6=UC-,
PA3=PA7=UC).  The guest might be using a different setting and,
in particular, might want write combining but not get it
(getting slow UC or UC- accesses instead).

Thanks a lot to geoff@hostfission.com for noticing the relation
to the g_pat setting.  The patch has also been tested by a number
of people on VFIO user forums.

Fixes: 709ddebf81cb40e3c36c6109a7892e8b93a09464
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=196409
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm.c | 7 +++++++
 1 file changed, 7 insertions(+)
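
To make the PAT layout described in the commit message concrete, here is a minimal user-space sketch that decodes a 64-bit PAT value into its eight entries. The helper names `pat_type_name` and `pat_entry_name` are hypothetical and do not exist in KVM. The type encodings (0=UC, 1=WC, 4=WT, 5=WP, 6=WB, 7=UC-) and the reset value 0x0007040600070406 are the architectural ones; the second constant is only an example value in which PA1 and PA5 select WC.

```c
#include <stdint.h>

/* Architectural power-on default: PA0=PA4=WB, PA1=PA5=WT, PA2=PA6=UC-, PA3=PA7=UC. */
#define PAT_RESET_DEFAULT 0x0007040600070406ULL
/* Example value only (an assumption for illustration): PA1=PA5=WC instead of WT. */
#define PAT_EXAMPLE_WC    0x0007010600070106ULL

/* Hypothetical helper: map one PAT memory-type encoding to its name. */
static const char *pat_type_name(uint8_t type)
{
	switch (type) {
	case 0x00: return "UC";
	case 0x01: return "WC";
	case 0x04: return "WT";
	case 0x05: return "WP";
	case 0x06: return "WB";
	case 0x07: return "UC-";
	default:   return "reserved";
	}
}

/* Hypothetical helper: name of entry PAn (n = 0..7); each entry is one byte. */
static const char *pat_entry_name(uint64_t pat, unsigned int n)
{
	return pat_type_name((uint8_t)(pat >> (n * 8)));
}
```

Calling `pat_entry_name(PAT_RESET_DEFAULT, 1)` yields "WT", while the same entry in `PAT_EXAMPLE_WC` yields "WC". A guest that programs the second value but runs under the default g_pat would not get write combining for pages that select PA1.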

Comments

David Hildenbrand Oct. 26, 2017, 8:17 a.m. UTC | #1
On 26.10.2017 09:13, Paolo Bonzini wrote:
> For many years some users of assigned devices have reported worse
> performance on AMD processors with NPT than on AMD without NPT,
> Intel or bare metal.
> 
> The reason turned out to be that SVM is discarding the guest PAT
> setting and uses the default (PA0=PA4=WB, PA1=PA5=WT, PA2=PA6=UC-,
> PA3=UC).  The guest might be using a different setting, and
> especially might want write combining but isn't getting it
> (instead getting slow UC or UC- accesses).
> 
> Thanks a lot to geoff@hostfission.com for noticing the relation
> to the g_pat setting.  The patch has been tested also by a bunch
> of people on VFIO users forums.
> 
> Fixes: 709ddebf81cb40e3c36c6109a7892e8b93a09464
> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=196409
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/svm.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index af256b786a70..af09baa3d736 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -3626,6 +3626,13 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  	u32 ecx = msr->index;
>  	u64 data = msr->data;
>  	switch (ecx) {
> +	case MSR_IA32_CR_PAT:
> +		if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> +			return 1;
> +		vcpu->arch.pat = data;
> +		svm->vmcb->save.g_pat = data;
> +		mark_dirty(svm->vmcb, VMCB_NPT);
> +		break;
>  	case MSR_IA32_TSC:
>  		kvm_write_tsc(vcpu, msr);
>  		break;
> 

Although I am no SVM expert, this looks good to me, judging by the
way it is handled on VMX.

Reviewed-by: David Hildenbrand <david@redhat.com>

Nick Sarnie Oct. 26, 2017, 3 p.m. UTC | #2
On Thu, Oct 26, 2017 at 4:17 AM, David Hildenbrand <david@redhat.com> wrote:
> Although no SVM expert, looking at the way it is handled on VMX, this
> looks good to me.
>
> Reviewed-by: David Hildenbrand <david@redhat.com>
>
> --
>
> Thanks,
>
> David

Tested-by: Nick Sarnie <commendsarnex@gmail.com>

You're a legend.

Thanks,
Sarnex

Radim Krčmář Nov. 10, 2017, 9:27 p.m. UTC | #3
2017-10-26 09:13+0200, Paolo Bonzini:
> For many years some users of assigned devices have reported worse
> performance on AMD processors with NPT than on AMD without NPT,
> Intel or bare metal.
> 
> The reason turned out to be that SVM is discarding the guest PAT
> setting and uses the default (PA0=PA4=WB, PA1=PA5=WT, PA2=PA6=UC-,
> PA3=UC).  The guest might be using a different setting, and
> especially might want write combining but isn't getting it
> (instead getting slow UC or UC- accesses).
> 
> Thanks a lot to geoff@hostfission.com for noticing the relation
> to the g_pat setting.  The patch has been tested also by a bunch
> of people on VFIO users forums.
> 
> Fixes: 709ddebf81cb40e3c36c6109a7892e8b93a09464
> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=196409
> Cc: stable@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Applied, thanks.

Patch

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index af256b786a70..af09baa3d736 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3626,6 +3626,13 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 	u32 ecx = msr->index;
 	u64 data = msr->data;
 	switch (ecx) {
+	case MSR_IA32_CR_PAT:
+		if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
+			return 1;
+		vcpu->arch.pat = data;
+		svm->vmcb->save.g_pat = data;
+		mark_dirty(svm->vmcb, VMCB_NPT);
+		break;
 	case MSR_IA32_TSC:
 		kvm_write_tsc(vcpu, msr);
 		break;
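
The control flow of the new case can be modelled outside the kernel. The sketch below assumes that the PAT branch of `kvm_mtrr_valid()` accepts a value only if each of its eight bytes holds one of the defined memory types (0, 1, 4, 5, 6, 7); the in-kernel check may differ in detail. `struct fake_vcpu`, `pat_value_valid` and `fake_set_pat` are hypothetical stand-ins, not KVM code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the state the patch touches. */
struct fake_vcpu {
	uint64_t arch_pat;   /* models vcpu->arch.pat */
	uint64_t g_pat;      /* models svm->vmcb->save.g_pat */
	bool npt_dirty;      /* models mark_dirty(svm->vmcb, VMCB_NPT) */
};

/* Valid encodings are 0, 1, 4, 5, 6, 7; mask 0xf3 has exactly those bits set. */
static bool pat_type_valid(unsigned int type)
{
	return type < 8 && ((1u << type) & 0xf3u);
}

/* Assumed validity rule: every one of the eight entries must be a defined type. */
static bool pat_value_valid(uint64_t pat)
{
	for (unsigned int i = 0; i < 8; i++)
		if (!pat_type_valid((pat >> (i * 8)) & 0xff))
			return false;
	return true;
}

/* Same shape as the patch: reject bad values with 1, otherwise update both copies. */
static int fake_set_pat(struct fake_vcpu *v, uint64_t data)
{
	if (!pat_value_valid(data))
		return 1;
	v->arch_pat = data;
	v->g_pat = data;
	v->npt_dirty = true;   /* the cached VMCB state must be reloaded */
	return 0;
}
```

The point of the model is that a rejected write leaves both copies untouched, while an accepted write keeps the software copy and the VMCB copy in sync and flags the relevant VMCB state as dirty.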