Message ID | 20231122092845.973949-3-harshpb@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Series | Introduce SPAPR_NR_IPIS and fix max-cpus |
Hi Harsh,

On 22/11/23 10:28, Harsh Prateek Bora wrote:
> Initialize the machine specific max_cpus limit as per the maximum range
> of CPU IPIs available. Keeping between 4096 to 8192 will throw IRQ not
> free error due to XIVE/XICS limitation and keeping beyond 8192 will hit
> assert in tcg_region_init or spapr_xive_claim_irq.
>
> Logs:
>
> Without patch fix:
>
> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
> qemu-system-ppc64: IRQ 4096 is not free
> [root@host build]#
>
> On LPAR:
> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
> **
> ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
> (region_size >= 2 * page_size)
> Bail out! ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
> (region_size >= 2 * page_size)
> Aborted (core dumped)
> [root@host build]#
>
> On x86:
> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
> qemu-system-ppc64: ../hw/intc/spapr_xive.c:596: spapr_xive_claim_irq:
> Assertion `lisn < xive->nr_irqs' failed.
> Aborted (core dumped)
> [root@host build]#
>
> With patch fix:
> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
> qemu-system-ppc64: Invalid SMP CPUs 4097. The max CPUs supported by
> machine 'pseries-8.2' is 4096
> [root@host build]#
>
> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
> ---
> hw/ppc/spapr.c | 9 +++------
> 1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index df09aa9d6a..0de11a4458 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -4647,13 +4647,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      mc->block_default_type = IF_SCSI;
>
>      /*
> -     * Setting max_cpus to INT32_MAX. Both KVM and TCG max_cpus values
> -     * should be limited by the host capability instead of hardcoded.
> -     * max_cpus for KVM guests will be checked in kvm_init(), and TCG
> -     * guests are welcome to have as many CPUs as the host are capable
> -     * of emulate.
> +     * While KVM determines max cpus in kvm_init() using kvm_max_vcpus(),
> +     * In TCG the limit is restricted by the range of CPU IPIs available.
>       */
> -    mc->max_cpus = INT32_MAX;
> +    mc->max_cpus = SPAPR_NR_IPIS;

Is SPAPR_NR_IPIS also the upper limit for KVM?

>      mc->no_parallel = 1;
>      mc->default_boot_order = "";
Hi Philippe,

On 11/22/23 16:46, Philippe Mathieu-Daudé wrote:
> Hi Harsh,
>
> On 22/11/23 10:28, Harsh Prateek Bora wrote:
>> Initialize the machine specific max_cpus limit as per the maximum range
>> of CPU IPIs available. Keeping between 4096 to 8192 will throw IRQ not
>> free error due to XIVE/XICS limitation and keeping beyond 8192 will hit
>> assert in tcg_region_init or spapr_xive_claim_irq.
>>
>> Logs:
>>
>> Without patch fix:
>>
>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
>> qemu-system-ppc64: IRQ 4096 is not free
>> [root@host build]#
>>
>> On LPAR:
>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
>> **
>> ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
>> (region_size >= 2 * page_size)
>> Bail out! ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
>> (region_size >= 2 * page_size)
>> Aborted (core dumped)
>> [root@host build]#
>>
>> On x86:
>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
>> qemu-system-ppc64: ../hw/intc/spapr_xive.c:596: spapr_xive_claim_irq:
>> Assertion `lisn < xive->nr_irqs' failed.
>> Aborted (core dumped)
>> [root@host build]#
>>
>> With patch fix:
>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
>> qemu-system-ppc64: Invalid SMP CPUs 4097. The max CPUs supported by
>> machine 'pseries-8.2' is 4096
>> [root@host build]#
>>
>> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
>> ---
>> hw/ppc/spapr.c | 9 +++------
>> 1 file changed, 3 insertions(+), 6 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index df09aa9d6a..0de11a4458 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -4647,13 +4647,10 @@ static void
>> spapr_machine_class_init(ObjectClass *oc, void *data)
>>      mc->block_default_type = IF_SCSI;
>>      /*
>> -     * Setting max_cpus to INT32_MAX. Both KVM and TCG max_cpus values
>> -     * should be limited by the host capability instead of hardcoded.
>> -     * max_cpus for KVM guests will be checked in kvm_init(), and TCG
>> -     * guests are welcome to have as many CPUs as the host are capable
>> -     * of emulate.
>> +     * While KVM determines max cpus in kvm_init() using
>> kvm_max_vcpus(),
>> +     * In TCG the limit is restricted by the range of CPU IPIs
>> available.
>>       */
>> -    mc->max_cpus = INT32_MAX;
>> +    mc->max_cpus = SPAPR_NR_IPIS;
>
> Is SPAPR_NR_IPIS also the upper limit for KVM?

In KVM mode, the limit is restricted to what is supported by KVM which
is checked using kvm_ioctl via wrappers in kvm_init and appears to be
evaluating to 2048. So, having a larger default works for both cases.

regards,
Harsh

>
>>      mc->no_parallel = 1;
>>      mc->default_boot_order = "";
>
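
The interplay described in this reply can be sketched in a few lines of C. This is only an illustration under stated assumptions, not QEMU code: `sketch_effective_max_cpus` and the `SKETCH_*` constants are hypothetical names, 4096 is the machine limit reported in the patch logs, and 2048 is the KVM value quoted in the reply.

```c
#include <assert.h>

/* Machine-level cap after the patch; 4096 is the value reported in the logs. */
#define SKETCH_SPAPR_NR_IPIS 4096
/* KVM limit as quoted in the thread ("appears to be evaluating to 2048"). */
#define SKETCH_KVM_MAX_VCPUS 2048

/*
 * Hypothetical helper: return the vCPU count that can actually be used
 * once both the machine cap and the accelerator cap are applied.
 * accel_max <= 0 stands for "no accelerator-specific limit" (e.g. TCG).
 */
static int sketch_effective_max_cpus(int machine_max, int accel_max)
{
    assert(machine_max > 0);
    if (accel_max > 0 && accel_max < machine_max) {
        return accel_max;   /* the accelerator is the tighter bound */
    }
    return machine_max;     /* the machine cap is the tighter bound */
}
```

With these assumed values, `sketch_effective_max_cpus(SKETCH_SPAPR_NR_IPIS, SKETCH_KVM_MAX_VCPUS)` yields 2048 and the TCG case (`accel_max` of 0) yields 4096. Any accelerator limit above 4096 would still be capped by the machine value.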
On 11/23/23 06:03, Harsh Prateek Bora wrote:
> Hi Philippe,
>
> On 11/22/23 16:46, Philippe Mathieu-Daudé wrote:
>> Hi Harsh,
>>
>> On 22/11/23 10:28, Harsh Prateek Bora wrote:
>>> Initialize the machine specific max_cpus limit as per the maximum range
>>> of CPU IPIs available. Keeping between 4096 to 8192 will throw IRQ not
>>> free error due to XIVE/XICS limitation and keeping beyond 8192 will hit
>>> assert in tcg_region_init or spapr_xive_claim_irq.
>>>
>>> Logs:
>>>
>>> Without patch fix:
>>>
>>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
>>> qemu-system-ppc64: IRQ 4096 is not free
>>> [root@host build]#
>>>
>>> On LPAR:
>>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
>>> **
>>> ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
>>> (region_size >= 2 * page_size)
>>> Bail out! ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
>>> (region_size >= 2 * page_size)
>>> Aborted (core dumped)
>>> [root@host build]#
>>>
>>> On x86:
>>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
>>> qemu-system-ppc64: ../hw/intc/spapr_xive.c:596: spapr_xive_claim_irq:
>>> Assertion `lisn < xive->nr_irqs' failed.
>>> Aborted (core dumped)
>>> [root@host build]#
>>>
>>> With patch fix:
>>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
>>> qemu-system-ppc64: Invalid SMP CPUs 4097. The max CPUs supported by
>>> machine 'pseries-8.2' is 4096
>>> [root@host build]#
>>>
>>> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
>>> ---
>>> hw/ppc/spapr.c | 9 +++------
>>> 1 file changed, 3 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index df09aa9d6a..0de11a4458 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -4647,13 +4647,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>>>      mc->block_default_type = IF_SCSI;
>>>      /*
>>> -     * Setting max_cpus to INT32_MAX. Both KVM and TCG max_cpus values
>>> -     * should be limited by the host capability instead of hardcoded.
>>> -     * max_cpus for KVM guests will be checked in kvm_init(), and TCG
>>> -     * guests are welcome to have as many CPUs as the host are capable
>>> -     * of emulate.
>>> +     * While KVM determines max cpus in kvm_init() using kvm_max_vcpus(),
>>> +     * In TCG the limit is restricted by the range of CPU IPIs available.
>>>       */
>>> -    mc->max_cpus = INT32_MAX;
>>> +    mc->max_cpus = SPAPR_NR_IPIS;
>>
>> Is SPAPR_NR_IPIS also the upper limit for KVM?
>
> In KVM mode, the limit is restricted to what is supported by KVM which is checked using kvm_ioctl via wrappers in kvm_init and appears to be evaluating to 2048. So, having a larger default works for both cases.

QEMU sets the number of cpus with KVM ioctls:

   KVM_DEV_XICS_NR_SERVERS
   KVM_DEV_XIVE_NR_SERVERS

This is important for the host since the interrupt controller is then
configured with these values through FW.

The default value is indeed 2K but this is large and wastes a lot of
HW resources, page mappings, etc.

Thanks,

C.
On 23/11/23 09:47, Cédric Le Goater wrote:
> On 11/23/23 06:03, Harsh Prateek Bora wrote:
>> Hi Philippe,
>>
>> On 11/22/23 16:46, Philippe Mathieu-Daudé wrote:
>>> Hi Harsh,
>>>
>>> On 22/11/23 10:28, Harsh Prateek Bora wrote:
>>>> Initialize the machine specific max_cpus limit as per the maximum range
>>>> of CPU IPIs available. Keeping between 4096 to 8192 will throw IRQ not
>>>> free error due to XIVE/XICS limitation and keeping beyond 8192 will hit
>>>> assert in tcg_region_init or spapr_xive_claim_irq.
>>>>
>>>> Logs:
>>>>
>>>> Without patch fix:
>>>>
>>>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
>>>> qemu-system-ppc64: IRQ 4096 is not free
>>>> [root@host build]#
>>>>
>>>> On LPAR:
>>>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
>>>> **
>>>> ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
>>>> (region_size >= 2 * page_size)
>>>> Bail out! ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
>>>> (region_size >= 2 * page_size)
>>>> Aborted (core dumped)
>>>> [root@host build]#
>>>>
>>>> On x86:
>>>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
>>>> qemu-system-ppc64: ../hw/intc/spapr_xive.c:596: spapr_xive_claim_irq:
>>>> Assertion `lisn < xive->nr_irqs' failed.
>>>> Aborted (core dumped)
>>>> [root@host build]#
>>>>
>>>> With patch fix:
>>>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
>>>> qemu-system-ppc64: Invalid SMP CPUs 4097. The max CPUs supported by
>>>> machine 'pseries-8.2' is 4096
>>>> [root@host build]#
>>>>
>>>> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
>>>> ---
>>>> hw/ppc/spapr.c | 9 +++------
>>>> 1 file changed, 3 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>> index df09aa9d6a..0de11a4458 100644
>>>> --- a/hw/ppc/spapr.c
>>>> +++ b/hw/ppc/spapr.c
>>>> @@ -4647,13 +4647,10 @@ static void
>>>> spapr_machine_class_init(ObjectClass *oc, void *data)
>>>>      mc->block_default_type = IF_SCSI;
>>>>      /*
>>>> -     * Setting max_cpus to INT32_MAX. Both KVM and TCG max_cpus values
>>>> -     * should be limited by the host capability instead of hardcoded.
>>>> -     * max_cpus for KVM guests will be checked in kvm_init(), and TCG
>>>> -     * guests are welcome to have as many CPUs as the host are capable
>>>> -     * of emulate.
>>>> +     * While KVM determines max cpus in kvm_init() using
>>>> kvm_max_vcpus(),
>>>> +     * In TCG the limit is restricted by the range of CPU IPIs
>>>> available.
>>>>       */
>>>> -    mc->max_cpus = INT32_MAX;
>>>> +    mc->max_cpus = SPAPR_NR_IPIS;
>>>
>>> Is SPAPR_NR_IPIS also the upper limit for KVM?
>>
>> In KVM mode, the limit is restricted to what is supported by KVM which
>> is checked using kvm_ioctl via wrappers in kvm_init and appears to be
>> evaluating to 2048. So, having a larger default works for both cases.
>
> QEMU sets the number of cpus with KVM ioctls:
>
>    KVM_DEV_XICS_NR_SERVERS
>    KVM_DEV_XIVE_NR_SERVERS
>
> This is important for the host since the interrupt controller is then
> configured with these values through FW.
>
> The default value is indeed 2K but this is large and wastes a lot of
> HW resources, page mappings, etc.

I was wondering if one day KVM raises its limit to 5k, then the
machine will clamp to 4k, and someone will have to debug that.
Not a big deal ;)
On 11/23/23 11:26, Philippe Mathieu-Daudé wrote:
> On 23/11/23 09:47, Cédric Le Goater wrote:
>> On 11/23/23 06:03, Harsh Prateek Bora wrote:
>>> Hi Philippe,
>>>
>>> On 11/22/23 16:46, Philippe Mathieu-Daudé wrote:
>>>> Hi Harsh,
>>>>
>>>> On 22/11/23 10:28, Harsh Prateek Bora wrote:
>>>>> Initialize the machine specific max_cpus limit as per the maximum range
>>>>> of CPU IPIs available. Keeping between 4096 to 8192 will throw IRQ not
>>>>> free error due to XIVE/XICS limitation and keeping beyond 8192 will hit
>>>>> assert in tcg_region_init or spapr_xive_claim_irq.
>>>>>
>>>>> Logs:
>>>>>
>>>>> Without patch fix:
>>>>>
>>>>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
>>>>> qemu-system-ppc64: IRQ 4096 is not free
>>>>> [root@host build]#
>>>>>
>>>>> On LPAR:
>>>>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
>>>>> **
>>>>> ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
>>>>> (region_size >= 2 * page_size)
>>>>> Bail out! ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
>>>>> (region_size >= 2 * page_size)
>>>>> Aborted (core dumped)
>>>>> [root@host build]#
>>>>>
>>>>> On x86:
>>>>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
>>>>> qemu-system-ppc64: ../hw/intc/spapr_xive.c:596: spapr_xive_claim_irq:
>>>>> Assertion `lisn < xive->nr_irqs' failed.
>>>>> Aborted (core dumped)
>>>>> [root@host build]#
>>>>>
>>>>> With patch fix:
>>>>> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
>>>>> qemu-system-ppc64: Invalid SMP CPUs 4097. The max CPUs supported by
>>>>> machine 'pseries-8.2' is 4096
>>>>> [root@host build]#
>>>>>
>>>>> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
>>>>> ---
>>>>> hw/ppc/spapr.c | 9 +++------
>>>>> 1 file changed, 3 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>>> index df09aa9d6a..0de11a4458 100644
>>>>> --- a/hw/ppc/spapr.c
>>>>> +++ b/hw/ppc/spapr.c
>>>>> @@ -4647,13 +4647,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>>>>>      mc->block_default_type = IF_SCSI;
>>>>>      /*
>>>>> -     * Setting max_cpus to INT32_MAX. Both KVM and TCG max_cpus values
>>>>> -     * should be limited by the host capability instead of hardcoded.
>>>>> -     * max_cpus for KVM guests will be checked in kvm_init(), and TCG
>>>>> -     * guests are welcome to have as many CPUs as the host are capable
>>>>> -     * of emulate.
>>>>> +     * While KVM determines max cpus in kvm_init() using kvm_max_vcpus(),
>>>>> +     * In TCG the limit is restricted by the range of CPU IPIs available.
>>>>>       */
>>>>> -    mc->max_cpus = INT32_MAX;
>>>>> +    mc->max_cpus = SPAPR_NR_IPIS;
>>>>
>>>> Is SPAPR_NR_IPIS also the upper limit for KVM?
>>>
>>> In KVM mode, the limit is restricted to what is supported by KVM which is checked using kvm_ioctl via wrappers in kvm_init and appears to be evaluating to 2048. So, having a larger default works for both cases.
>>
>> QEMU sets the number of cpus with KVM ioctls:
>>
>>    KVM_DEV_XICS_NR_SERVERS
>>    KVM_DEV_XIVE_NR_SERVERS
>>
>> This is important for the host since the interrupt controller is then
>> configured with these values through FW.
>>
>> The default value is indeed 2K but this is large and wastes a lot of
>> HW resources, page mappings, etc.
>
> I was wondering if one day KVM raises its limit to 5k, then the
> machine will clamp to 4k, and someone will have to debug that.
> Not a big deal ;)

Changing the number of CPUs will require some work in Linux first.
I think we (as ppc) tried to push it to 8K but there was some push
back upstream. Anyhow, there are DT issues also, memory layout, etc.
It won't happen without being noticed I am sure :)

Anyhow, if we need more IPIs to support more CPUs, the IRQ number
space will need an extra range after the device range to preserve
compatibility.

Thanks,

C.
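
To make the "extra range after the device range" idea concrete, here is a rough C sketch. It is not taken from QEMU and every name in it is hypothetical. The 4096-entry IPI range comes from the thread; the device range base and size are assumptions chosen to be consistent with the 4096 and 8192 boundaries in the logs, and the extra IPI range does not exist today.

```c
/* All SKETCH_* values are illustrative assumptions, not QEMU definitions. */
#define SKETCH_IPI_BASE        0x0000  /* first IPI number */
#define SKETCH_NR_IPIS         0x1000  /* 4096 IPIs, one per CPU (from the thread) */
#define SKETCH_DEVICE_BASE     0x1000  /* assumed: device IRQs start after the IPIs */
#define SKETCH_NR_DEVICE_IRQS  0x1000  /* assumed size of the device range */
/* Hypothetical future range placed after the device range. */
#define SKETCH_EXTRA_IPI_BASE  (SKETCH_DEVICE_BASE + SKETCH_NR_DEVICE_IRQS)

/*
 * Hypothetical mapping from CPU index to IPI number. CPUs below
 * SKETCH_NR_IPIS keep their existing numbers, so the current layout is
 * preserved; CPUs beyond that land in the extra range instead of
 * colliding with device IRQs. Returns -1 for a negative index.
 */
static int sketch_cpu_to_ipi(int cpu_index)
{
    if (cpu_index < 0) {
        return -1;
    }
    if (cpu_index < SKETCH_NR_IPIS) {
        return SKETCH_IPI_BASE + cpu_index;
    }
    return SKETCH_EXTRA_IPI_BASE + (cpu_index - SKETCH_NR_IPIS);
}
```

The point of the layout is that existing IPI and device numbers stay where they are, which is what keeps older guests and migration streams compatible under these assumptions.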
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index df09aa9d6a..0de11a4458 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4647,13 +4647,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->block_default_type = IF_SCSI;

     /*
-     * Setting max_cpus to INT32_MAX. Both KVM and TCG max_cpus values
-     * should be limited by the host capability instead of hardcoded.
-     * max_cpus for KVM guests will be checked in kvm_init(), and TCG
-     * guests are welcome to have as many CPUs as the host are capable
-     * of emulate.
+     * While KVM determines max cpus in kvm_init() using kvm_max_vcpus(),
+     * In TCG the limit is restricted by the range of CPU IPIs available.
      */
-    mc->max_cpus = INT32_MAX;
+    mc->max_cpus = SPAPR_NR_IPIS;
     mc->no_parallel = 1;
     mc->default_boot_order = "";
Initialize the machine specific max_cpus limit as per the maximum range
of CPU IPIs available. Keeping between 4096 to 8192 will throw IRQ not
free error due to XIVE/XICS limitation and keeping beyond 8192 will hit
assert in tcg_region_init or spapr_xive_claim_irq.

Logs:

Without patch fix:

[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
qemu-system-ppc64: IRQ 4096 is not free
[root@host build]#

On LPAR:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
**
ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
(region_size >= 2 * page_size)
Bail out! ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
(region_size >= 2 * page_size)
Aborted (core dumped)
[root@host build]#

On x86:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
qemu-system-ppc64: ../hw/intc/spapr_xive.c:596: spapr_xive_claim_irq:
Assertion `lisn < xive->nr_irqs' failed.
Aborted (core dumped)
[root@host build]#

With patch fix:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
qemu-system-ppc64: Invalid SMP CPUs 4097. The max CPUs supported by
machine 'pseries-8.2' is 4096
[root@host build]#

Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
hw/ppc/spapr.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
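
The "With patch fix" behaviour amounts to an early bounds check against the machine's max_cpus, which replaces the later IRQ error or assertion with a readable message. The C sketch below illustrates that kind of check. It is not QEMU's implementation: `sketch_check_maxcpus` is a hypothetical helper, and only the message text is copied from the log above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical early validation: reject a requested maxcpus value that
 * exceeds the machine limit and write an error message into err.
 * Returns true when the request is acceptable (err is then emptied).
 */
static bool sketch_check_maxcpus(int requested, int machine_max,
                                 const char *machine_name,
                                 char *err, size_t err_len)
{
    if (requested > machine_max) {
        /* Message format copied from the "With patch fix" log. */
        snprintf(err, err_len,
                 "Invalid SMP CPUs %d. The max CPUs supported by "
                 "machine '%s' is %d",
                 requested, machine_name, machine_max);
        return false;
    }
    if (err_len > 0) {
        err[0] = '\0';
    }
    return true;
}
```

Called with 4097, 4096 and "pseries-8.2", the helper fails and fills the buffer with the same text as the patched QEMU output; a request of exactly 4096 passes.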