
[PATCHv3] kvm/irqchip: Speed up KVM_SET_GSI_ROUTING

Message ID 52D83BF2.6030705@de.ibm.com (mailing list archive)
State New, archived
Commit Message

Christian Borntraeger Jan. 16, 2014, 8:07 p.m. UTC
On 16/01/14 19:55, Michael S. Tsirkin wrote:
> On Thu, Jan 16, 2014 at 01:44:20PM +0100, Christian Borntraeger wrote:
[...]
>> I converted most of the rcu routines to srcu. Review for the unconverted
[...]

> That's nice but did you try to measure the overhead
> on some interrupt-intensive workloads, such as RX with 10G ethernet?
> srcu locks aren't free like rcu ones.

Right, but the read side is only acting on CPU-local data structures, so the overhead
is probably very small compared with the hlist work and the injection itself.
You have a more compelling review comment, though:

[...]
>> -	synchronize_rcu();
>> +	synchronize_srcu_expedited(&irq_srcu);
> 
> Hmm, it's a bit strange that you also do _expedited here.

Well, I just did what the original mail thread suggested :-)

> What if this synchronize_rcu is replaced by synchronize_rcu_expedited
> and no other changes are made?
> Maybe that's enough?

Yes, it's enough. (It seems slightly slower than v2, but fast enough.)

Patch below:


[PATCHv3] kvm/irqchip: Speed up KVM_SET_GSI_ROUTING

When starting lots of dataplane devices the bootup takes very long
on my s390 system (prototype irqfd code). With larger setups we are even
able to trigger some timeouts in some components.
It turns out that the KVM_SET_GSI_ROUTING ioctl takes very
long (strace claims up to 0.1 sec) with multiple CPUs.
This is caused by synchronize_rcu and the HZ=100 of s390.

Let's use the expedited variant to speed things up, as suggested by
Michael S. Tsirkin.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 virt/kvm/irqchip.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Michael S. Tsirkin Jan. 16, 2014, 8:22 p.m. UTC | #1
On Thu, Jan 16, 2014 at 09:07:14PM +0100, Christian Borntraeger wrote:
> On 16/01/14 19:55, Michael S. Tsirkin wrote:
> > On Thu, Jan 16, 2014 at 01:44:20PM +0100, Christian Borntraeger wrote:
> [...]
> >> I converted most of the rcu routines to srcu. Review for the unconverted
> [...]
> 
> > That's nice but did you try to measure the overhead
> > on some interrupt-intensive workloads, such as RX with 10G ethernet?
> > srcu locks aren't free like rcu ones.
> 
> Right, but the read side is only acting on CPU-local data structures, so the overhead
> is probably very small compared with the hlist work and the injection itself.

My testing of VM exit paths shows the overhead of read-side RCU
is not negligible; IIRC it involves memory barriers, which
are expensive operations.

> You have a more compelling review comment, though:
> 
> [...]
> >> -	synchronize_rcu();
> >> +	synchronize_srcu_expedited(&irq_srcu);
> > 
> > Hmm, it's a bit strange that you also do _expedited here.
> 
> Well, I just did what the original mail thread suggested :-)
> 
> > What if this synchronize_rcu is replaced by synchronize_rcu_expedited
> > and no other changes are made?
> > Maybe that's enough?
> 
> Yes, it's enough. (It seems slightly slower than v2, but fast enough.)
> 
> Patch below:
> 
> 
> [PATCHv3] kvm/irqchip: Speed up KVM_SET_GSI_ROUTING
> 
> When starting lots of dataplane devices the bootup takes very long
> on my s390 system (prototype irqfd code). With larger setups we are even
> able to trigger some timeouts in some components.
> It turns out that the KVM_SET_GSI_ROUTING ioctl takes very
> long (strace claims up to 0.1 sec) with multiple CPUs.
> This is caused by synchronize_rcu and the HZ=100 of s390.
> 
> Let's use the expedited variant to speed things up, as suggested by
> Michael S. Tsirkin.
> 
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>  virt/kvm/irqchip.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
> index 20dc9e4..dbcfde7 100644
> --- a/virt/kvm/irqchip.c
> +++ b/virt/kvm/irqchip.c
> @@ -226,7 +226,7 @@ int kvm_set_irq_routing(struct kvm *kvm,
>  	kvm_irq_routing_update(kvm, new);
>  	mutex_unlock(&kvm->irq_lock);
>  
> -	synchronize_rcu();
> +	synchronize_rcu_expedited();
>  
>  	new = old;
>  	r = 0;
Paolo Bonzini Jan. 17, 2014, 2:03 p.m. UTC | #2
Il 16/01/2014 21:22, Michael S. Tsirkin ha scritto:
>> [PATCHv3] kvm/irqchip: Speed up KVM_SET_GSI_ROUTING
>>
>> When starting lots of dataplane devices the bootup takes very long
>> on my s390 system (prototype irqfd code). With larger setups we are even
>> able to trigger some timeouts in some components.
>> It turns out that the KVM_SET_GSI_ROUTING ioctl takes very
>> long (strace claims up to 0.1 sec) with multiple CPUs.
>> This is caused by synchronize_rcu and the HZ=100 of s390.
>>
>> Let's use the expedited variant to speed things up, as suggested by
>> Michael S. Tsirkin.
>>
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>>  virt/kvm/irqchip.c |    2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
>> index 20dc9e4..dbcfde7 100644
>> --- a/virt/kvm/irqchip.c
>> +++ b/virt/kvm/irqchip.c
>> @@ -226,7 +226,7 @@ int kvm_set_irq_routing(struct kvm *kvm,
>>  	kvm_irq_routing_update(kvm, new);
>>  	mutex_unlock(&kvm->irq_lock);
>>  
>> -	synchronize_rcu();
>> +	synchronize_rcu_expedited();
>>  
>>  	new = old;
>>  	r = 0;

Well... I love to contradict myself, so: no way this can be accepted
this close to the end of the merge window.  :(

synchronize_rcu_expedited() forces a context switch on all CPUs, even
those that are not running KVM.  Thus, this patch might help a guest DoS
its host by changing the IRQ routing tables in a loop.

So this will have to wait for 3.15.  We have ~2 months to do performance
measurements on the v2 patch.  Sorry.

Thanks,

Paolo
Christian Borntraeger Jan. 17, 2014, 3:03 p.m. UTC | #3
On 17/01/14 15:03, Paolo Bonzini wrote:
> Il 16/01/2014 21:22, Michael S. Tsirkin ha scritto:
>>> [PATCHv3] kvm/irqchip: Speed up KVM_SET_GSI_ROUTING
>>>
>>> When starting lots of dataplane devices the bootup takes very long
>>> on my s390 system (prototype irqfd code). With larger setups we are even
>>> able to trigger some timeouts in some components.
>>> It turns out that the KVM_SET_GSI_ROUTING ioctl takes very
>>> long (strace claims up to 0.1 sec) with multiple CPUs.
>>> This is caused by synchronize_rcu and the HZ=100 of s390.
>>>
>>> Let's use the expedited variant to speed things up, as suggested by
>>> Michael S. Tsirkin.
>>>
>>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>>> ---
>>>  virt/kvm/irqchip.c |    2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
>>> index 20dc9e4..dbcfde7 100644
>>> --- a/virt/kvm/irqchip.c
>>> +++ b/virt/kvm/irqchip.c
>>> @@ -226,7 +226,7 @@ int kvm_set_irq_routing(struct kvm *kvm,
>>>  	kvm_irq_routing_update(kvm, new);
>>>  	mutex_unlock(&kvm->irq_lock);
>>>  
>>> -	synchronize_rcu();
>>> +	synchronize_rcu_expedited();
>>>  
>>>  	new = old;
>>>  	r = 0;
> 
> Well... I love to contradict myself, so: no way this can be accepted
> this close to the end of the merge window.  :(
> 
> synchronize_rcu_expedited() forces a context switch on all CPUs, even
> those that are not running KVM.  Thus, this patch might help a guest DoS
> its host by changing the IRQ routing tables in a loop.
> 
> So this will have to wait for 3.15.  We have ~2 months to do performance
> measurements on the v2 patch.  Sorry.

Any chance that you or Michael can give some performance feedback on v2? All
my lab systems are s390 and not x86...


Paolo Bonzini Jan. 17, 2014, 3:26 p.m. UTC | #4
Il 17/01/2014 16:03, Christian Borntraeger ha scritto:
> > Well... I love to contradict myself, so: no way this can be accepted
> > this close to the end of the merge window.  :(
> > 
> > synchronize_rcu_expedited() forces a context switch on all CPUs, even
> > those that are not running KVM.  Thus, this patch might help a guest DoS
> > its host by changing the IRQ routing tables in a loop.
> > 
> > So this will have to wait for 3.15.  We have ~2 months to do performance
> > measurements on the v2 patch.  Sorry.
> 
> Any chance that you or Michael can give some performance feedback on v2? All
> my lab systems are s390 and not x86...

Yes, we will help.

Paolo

Patch

diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index 20dc9e4..dbcfde7 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -226,7 +226,7 @@  int kvm_set_irq_routing(struct kvm *kvm,
 	kvm_irq_routing_update(kvm, new);
 	mutex_unlock(&kvm->irq_lock);
 
-	synchronize_rcu();
+	synchronize_rcu_expedited();
 
 	new = old;
 	r = 0;