[3/3] KVM: VMX: Allow I/O port 0x80 bypass when userspace prefer

Message ID: 1523943962-25415-4-git-send-email-wanpengli@tencent.com
State: New, archived

Commit Message

Wanpeng Li April 17, 2018, 5:46 a.m. UTC
From: Wanpeng Li <wanpengli@tencent.com>

Tim Shearer reported that "There is a guest which is running a packet 
forwarding app based on the DPDK (dpdk.org). The packet receive routine 
writes to 0xc070 using glibc's "outw_p" function which does an additional 
write to I/O port 80. It does this write for every packet that's received, 
causing a flood of KVM userspace context switches". Using mpstat to
observe a CPU performing L2 packet forwarding on a pinned guest vCPU,
he measured 95 percent guest time with the I/O port 0x80 bypass enabled,
versus 65.78 percent with the bypass disabled.

This patch allows I/O port 0x80 bypass when userspace prefer.
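
For readers unfamiliar with the "_p" variants: on x86, glibc's outw_p
issues the requested 16-bit port write and then a dummy byte write to
port 0x80 as a short delay, so every call touches port 0x80 once. Below
is a minimal user-space model of that behavior. It is not glibc's code
and performs no real port I/O; toy_port_log, toy_outw_p and
toy_count_writes_to are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_DELAY_PORT 0x80u   /* port used for the dummy delay write */
#define TOY_LOG_MAX    16u     /* capacity of the toy write log */

/* Records which ports were written, in order. Illustration only. */
struct toy_port_log {
	uint16_t ports[TOY_LOG_MAX];
	size_t count;
};

static void toy_record(struct toy_port_log *log, uint16_t port)
{
	assert(log->count < TOY_LOG_MAX);
	log->ports[log->count++] = port;
}

/*
 * Models outw_p(value, port): the real 16-bit write to 'port',
 * followed by a dummy byte write to port 0x80.
 */
static void toy_outw_p(struct toy_port_log *log, uint16_t value, uint16_t port)
{
	(void)value;                      /* the value does not matter here */
	toy_record(log, port);            /* outw value -> port */
	toy_record(log, TOY_DELAY_PORT);  /* outb al -> 0x80 (delay) */
}

/* Counts how many logged writes targeted 'port'. */
static size_t toy_count_writes_to(const struct toy_port_log *log, uint16_t port)
{
	size_t i, n = 0;

	for (i = 0; i < log->count; i++)
		if (log->ports[i] == port)
			n++;
	return n;
}
```

In this model, three calls to toy_outw_p() on port 0xc070 log six
writes in total, three of which go to port 0x80.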

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Tim Shearer <tshearer@advaoptical.com>
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kvm/vmx.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Konrad Rzeszutek Wilk May 11, 2018, 3:44 p.m. UTC | #1
On Mon, Apr 16, 2018 at 10:46:02PM -0700, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@tencent.com>
> 
> Tim Shearer reported that "There is a guest which is running a packet 
> forwarding app based on the DPDK (dpdk.org). The packet receive routine 
> writes to 0xc070 using glibc's "outw_p" function which does an additional 
> write to I/O port 80. It does this write for every packet that's received, 
> causing a flood of KVM userspace context switches". Using mpstat to
> observe a CPU performing L2 packet forwarding on a pinned guest vCPU,
> he measured 95 percent guest time with the I/O port 0x80 bypass enabled,
> versus 65.78 percent with the bypass disabled.
> 
> This patch allows I/O port 0x80 bypass when userspace prefer.

s/prefer/requests it/
> 

Perhaps:

Reported-by: Tim Shearer as well?

> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Tim Shearer <tshearer@advaoptical.com>
> Cc: Liran Alon <liran.alon@oracle.com>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
>  arch/x86/kvm/vmx.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index ebf1140..d3e5fef 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -10118,6 +10118,13 @@ static int vmx_vm_init(struct kvm *kvm)
>  			goto out;
>  		memset(kvm_vmx->vmx_io_bitmap[i], 0xff, PAGE_SIZE);
>  	}
> +	if (kvm->arch.ioport_disable_intercept)	{
> +		/*
> +		 * Allow direct access to the PC debug port (it is often used for I/O
> +		 * delays, but the vmexits simply slow things down).
> +		 */
> +		clear_bit(0x80, kvm_vmx->vmx_io_bitmap[VMX_IO_BITMAP_A]);
> +	}
>  	return 0;
>  
>  out:
> -- 
> 2.7.4
>
Jim Mattson May 15, 2018, 9:57 p.m. UTC | #2
This does seem to allow a DoS from userspace if userspace prefers it.
That doesn't seem wise.


Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ebf1140..d3e5fef 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -10118,6 +10118,13 @@  static int vmx_vm_init(struct kvm *kvm)
 			goto out;
 		memset(kvm_vmx->vmx_io_bitmap[i], 0xff, PAGE_SIZE);
 	}
+	if (kvm->arch.ioport_disable_intercept)	{
+		/*
+		 * Allow direct access to the PC debug port (it is often used for I/O
+		 * delays, but the vmexits simply slow things down).
+		 */
+		clear_bit(0x80, kvm_vmx->vmx_io_bitmap[VMX_IO_BITMAP_A]);
+	}
 	return 0;
 
 out:
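
As background on what the clear_bit() call changes: when the VMX "use
I/O bitmaps" control is set, bitmap A covers ports 0x0000-0x7fff and
bitmap B covers ports 0x8000-0xffff, one bit per port, and a set bit
makes a guest access to that port cause a VM exit. The following is a
small user-space sketch of that lookup, assuming single-byte accesses
only. The toy_* names are hypothetical and this is not the kernel's
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_BITMAP_BYTES     4096u                    /* one 4 KiB page */
#define TOY_PORTS_PER_BITMAP (TOY_BITMAP_BYTES * 8u)  /* 32768 ports */

/* Toy stand-in for the two VMX I/O bitmaps. Illustration only. */
struct toy_io_bitmaps {
	uint8_t a[TOY_BITMAP_BYTES];  /* ports 0x0000-0x7fff */
	uint8_t b[TOY_BITMAP_BYTES];  /* ports 0x8000-0xffff */
};

/* Mirrors the memset(..., 0xff, PAGE_SIZE) loop: intercept every port. */
static void toy_intercept_all(struct toy_io_bitmaps *bm)
{
	memset(bm->a, 0xff, sizeof(bm->a));
	memset(bm->b, 0xff, sizeof(bm->b));
}

/* Clears the bit for one port so accesses to it no longer exit. */
static void toy_pass_through(struct toy_io_bitmaps *bm, uint16_t port)
{
	uint8_t *map = port < TOY_PORTS_PER_BITMAP ? bm->a : bm->b;
	unsigned int bit = port % TOY_PORTS_PER_BITMAP;

	map[bit / 8] &= (uint8_t)~(1u << (bit % 8));
}

/* Returns true if a one-byte access to 'port' would cause a VM exit. */
static bool toy_port_exits(const struct toy_io_bitmaps *bm, uint16_t port)
{
	const uint8_t *map = port < TOY_PORTS_PER_BITMAP ? bm->a : bm->b;
	unsigned int bit = port % TOY_PORTS_PER_BITMAP;

	return (map[bit / 8] >> (bit % 8)) & 1u;
}
```

After toy_intercept_all() followed by toy_pass_through(&bm, 0x80), only
port 0x80 stops exiting in this model; neighbouring ports such as 0x81
and high ports such as 0xc070 are still intercepted.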