Message ID: 20240321082442.195631-1-d-tatianin@yandex-team.ru (mailing list archive)
State: New, archived
Series: kvm_host: bump KVM_MAX_IRQ_ROUTES to 128k
ping :)

On 3/21/24 11:24 AM, Daniil Tatianin wrote:
> We would like to be able to create large VMs (up to 224 vCPUs atm) with
> up to 128 virtio-net cards, where each card needs a TX+RX queue per vCPU
> for optimal performance (as well as config & control interrupts per
> card). Adding in extra virtio-blk controllers with a queue per vCPU (up
> to 192 disks) yields a total of about ~100k IRQ routes, rounded up to
> 128k for extra headroom and flexibility.
>
> The current limit of 4096 was set in 2018 and is too low for modern
> demands. It also seems to be there for no good reason as routes are
> allocated lazily by the kernel anyway (depending on the largest GSI
> requested by the VM).
>
> Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
> ---
>  include/linux/kvm_host.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 48f31dcd318a..10a141add2a8 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2093,7 +2093,7 @@ static inline bool mmu_invalidate_retry_gfn_unsafe(struct kvm *kvm,
>
>  #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>
> -#define KVM_MAX_IRQ_ROUTES 4096 /* might need extension/rework in the future */
> +#define KVM_MAX_IRQ_ROUTES 131072 /* might need extension/rework in the future */
>
>  bool kvm_arch_can_set_irq_routing(struct kvm *kvm);
>  int kvm_set_irq_routing(struct kvm *kvm,
Ping :)

Is anyone interested in this change? Our internal testing has shown no
problems with this, so I think it should be fine. Thanks!

On 3/21/24 11:24 AM, Daniil Tatianin wrote:
> We would like to be able to create large VMs (up to 224 vCPUs atm) with
> up to 128 virtio-net cards, where each card needs a TX+RX queue per vCPU
> for optimal performance (as well as config & control interrupts per
> card). Adding in extra virtio-blk controllers with a queue per vCPU (up
> to 192 disks) yields a total of about ~100k IRQ routes, rounded up to
> 128k for extra headroom and flexibility.
>
> The current limit of 4096 was set in 2018 and is too low for modern
> demands. It also seems to be there for no good reason as routes are
> allocated lazily by the kernel anyway (depending on the largest GSI
> requested by the VM).
>
> Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
> ---
>  include/linux/kvm_host.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 48f31dcd318a..10a141add2a8 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2093,7 +2093,7 @@ static inline bool mmu_invalidate_retry_gfn_unsafe(struct kvm *kvm,
>
>  #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>
> -#define KVM_MAX_IRQ_ROUTES 4096 /* might need extension/rework in the future */
> +#define KVM_MAX_IRQ_ROUTES 131072 /* might need extension/rework in the future */
>
>  bool kvm_arch_can_set_irq_routing(struct kvm *kvm);
>  int kvm_set_irq_routing(struct kvm *kvm,
On Thu, 21 Mar 2024 11:24:42 +0300
Daniil Tatianin <d-tatianin@yandex-team.ru> wrote:

> We would like to be able to create large VMs (up to 224 vCPUs atm) with
> up to 128 virtio-net cards, where each card needs a TX+RX queue per vCPU
> for optimal performance (as well as config & control interrupts per
> card). Adding in extra virtio-blk controllers with a queue per vCPU (up
> to 192 disks) yields a total of about ~100k IRQ routes, rounded up to
> 128k for extra headroom and flexibility.
>
> The current limit of 4096 was set in 2018 and is too low for modern
> demands. It also seems to be there for no good reason as routes are
> allocated lazily by the kernel anyway (depending on the largest GSI
> requested by the VM).
>
> Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>

LGTM

Acked-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  include/linux/kvm_host.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 48f31dcd318a..10a141add2a8 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2093,7 +2093,7 @@ static inline bool mmu_invalidate_retry_gfn_unsafe(struct kvm *kvm,
>
>  #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>
> -#define KVM_MAX_IRQ_ROUTES 4096 /* might need extension/rework in the future */
> +#define KVM_MAX_IRQ_ROUTES 131072 /* might need extension/rework in the future */
>
>  bool kvm_arch_can_set_irq_routing(struct kvm *kvm);
>  int kvm_set_irq_routing(struct kvm *kvm,
On 6/18/24 3:28 PM, Igor Mammedov wrote:
> On Thu, 21 Mar 2024 11:24:42 +0300
> Daniil Tatianin <d-tatianin@yandex-team.ru> wrote:
>
>> We would like to be able to create large VMs (up to 224 vCPUs atm) with
>> up to 128 virtio-net cards, where each card needs a TX+RX queue per vCPU
>> for optimal performance (as well as config & control interrupts per
>> card). Adding in extra virtio-blk controllers with a queue per vCPU (up
>> to 192 disks) yields a total of about ~100k IRQ routes, rounded up to
>> 128k for extra headroom and flexibility.
>>
>> The current limit of 4096 was set in 2018 and is too low for modern
>> demands. It also seems to be there for no good reason as routes are
>> allocated lazily by the kernel anyway (depending on the largest GSI
>> requested by the VM).
>>
>> Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
> LGTM
>
> Acked-by: Igor Mammedov <imammedo@redhat.com>

Thank you!

I want to ping everyone once again to take a look at this, I think this
patch is quite trivial and unlocks larger VMs for free, would really
appreciate a review from anyone interested!

>> ---
>>  include/linux/kvm_host.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 48f31dcd318a..10a141add2a8 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -2093,7 +2093,7 @@ static inline bool mmu_invalidate_retry_gfn_unsafe(struct kvm *kvm,
>>
>>  #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>>
>> -#define KVM_MAX_IRQ_ROUTES 4096 /* might need extension/rework in the future */
>> +#define KVM_MAX_IRQ_ROUTES 131072 /* might need extension/rework in the future */
>>
>>  bool kvm_arch_can_set_irq_routing(struct kvm *kvm);
>>  int kvm_set_irq_routing(struct kvm *kvm,
KVM: for the scope

On Thu, Mar 21, 2024, Daniil Tatianin wrote:
> We would like to be able to create large VMs (up to 224 vCPUs atm) with
> up to 128 virtio-net cards, where each card needs a TX+RX queue per vCPU
> for optimal performance (as well as config & control interrupts per
> card). Adding in extra virtio-blk controllers with a queue per vCPU (up
> to 192 disks) yields a total of about ~100k IRQ routes, rounded up to
> 128k for extra headroom and flexibility.
>
> The current limit of 4096 was set in 2018 and is too low for modern
> demands. It also seems to be there for no good reason as routes are
> allocated lazily by the kernel anyway (depending on the largest GSI
> requested by the VM).
>
> Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
> ---
>  include/linux/kvm_host.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 48f31dcd318a..10a141add2a8 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2093,7 +2093,7 @@ static inline bool mmu_invalidate_retry_gfn_unsafe(struct kvm *kvm,
>
>  #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>
> -#define KVM_MAX_IRQ_ROUTES 4096 /* might need extension/rework in the future */
> +#define KVM_MAX_IRQ_ROUTES 131072 /* might need extension/rework in the future */

I am not comfortable simply bumping the max. Yeah, it's allocated
on-demand, but if my math is correct, that means a max of ~8MiB for the
table, plus another 8MiB for the entries. And when handling
KVM_SET_GSI_ROUTING, KVM will have 2 tables (old and new), and another
4MiB for duplicating the userspace array. Those allocations are
accounted, but that's still a lot of potential thrash.

And KVM's handling is also grossly inefficient, e.g. reallocating
everything just to change one routing entry is awful. Maybe painfully
slow updates are fine for your use case, but some OSes have a bad habit
of round-robining IRQ destinations on a regular basis.

So it might be "free" in the sense that it costs you nothing to get your
use case working, but there's very much a cost for KVM in the form of
technical debt that someone will have to eventually pay for.

I don't have any concrete thoughts on how to make KVM's implementation
less sucky, but I do think we need to give it some attention before
increasing the maximum number of IRQ routes, especially before
increasing it by 32x.
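For readers following along: the reallocation cost described above stems
from KVM_SET_GSI_ROUTING replacing the entire routing table in a single
call. Below is a minimal userspace sketch of that, assuming an
already-created VM fd and a caller-maintained shadow array of entries;
the helper name and the shadow array are hypothetical, while the ioctl
and structs are the real KVM uAPI from <linux/kvm.h>.

#include <linux/kvm.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>

static int update_one_msi_route(int vm_fd,
                                struct kvm_irq_routing_entry *entries,
                                unsigned int nr_entries,
                                unsigned int idx,
                                const struct kvm_irq_routing_msi *new_msi)
{
        struct kvm_irq_routing *table;
        int ret;

        /* Patch the one entry we actually care about... */
        entries[idx].type = KVM_IRQ_ROUTING_MSI;
        entries[idx].u.msi = *new_msi;

        /* ...but the ioctl still takes the full array every time. */
        table = calloc(1, sizeof(*table) + nr_entries * sizeof(*entries));
        if (!table)
                return -1;

        table->nr = nr_entries;
        memcpy(table->entries, entries, nr_entries * sizeof(*entries));

        ret = ioctl(vm_fd, KVM_SET_GSI_ROUTING, table);
        free(table);
        return ret;
}

With KVM_MAX_IRQ_ROUTES at 131072, that "full array" is tens of MiB of
copying and kernel-side rebuilding per single-entry change, which is the
thrash concern raised in the review.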
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 48f31dcd318a..10a141add2a8 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2093,7 +2093,7 @@ static inline bool mmu_invalidate_retry_gfn_unsafe(struct kvm *kvm,
 
 #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 
-#define KVM_MAX_IRQ_ROUTES 4096 /* might need extension/rework in the future */
+#define KVM_MAX_IRQ_ROUTES 131072 /* might need extension/rework in the future */
 
 bool kvm_arch_can_set_irq_routing(struct kvm *kvm);
 int kvm_set_irq_routing(struct kvm *kvm,
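For context on how the constant is consumed (a paraphrase of the
KVM_SET_GSI_ROUTING handler in virt/kvm/kvm_main.c, modulo kernel
version; not part of the patch): it is only an upper bound validated
against the table userspace passes in, and no array of this size is
allocated up front, which is what the commit message below means by
lazy allocation.

        r = -EINVAL;
        if (routing.nr > KVM_MAX_IRQ_ROUTES)
                goto out;       /* reject oversized routing tables */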
We would like to be able to create large VMs (up to 224 vCPUs atm) with
up to 128 virtio-net cards, where each card needs a TX+RX queue per vCPU
for optimal performance (as well as config & control interrupts per
card). Adding in extra virtio-blk controllers with a queue per vCPU (up
to 192 disks) yields a total of about ~100k IRQ routes, rounded up to
128k for extra headroom and flexibility.

The current limit of 4096 was set in 2018 and is too low for modern
demands. It also seems to be there for no good reason as routes are
allocated lazily by the kernel anyway (depending on the largest GSI
requested by the VM).

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
---
 include/linux/kvm_host.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
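One plausible reconstruction of the ~100k figure (the per-device
breakdown below is an inference from the numbers in the message, not
spelled out explicitly):

#define NR_VCPUS     224
#define NR_NET_CARDS 128
#define NR_BLK_DISKS 192

/* TX+RX queue per vCPU, plus config & control interrupts per card */
#define NET_ROUTES   (NR_NET_CARDS * (NR_VCPUS * 2 + 2))  /* 57600 */
/* one queue per vCPU per disk */
#define BLK_ROUTES   (NR_BLK_DISKS * NR_VCPUS)            /* 43008 */

#define TOTAL_ROUTES (NET_ROUTES + BLK_ROUTES)            /* 100608, ~100k */

which is then rounded up to 131072 (128k) for the headroom mentioned
above.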