Message ID | 20231122074802.868083-1-harshpb@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Series | ppc/spapr: Initialize max_cpus limit to an allowed usable limit. |
On 11/22/23 08:48, Harsh Prateek Bora wrote:
> Initialize the machine specific max_cpus limit to a usable limit 4096.
> Keeping between 4096 to 8192 will throw IRQ not free error due to XIVE
> limitation and keeping beyond 8192 will hit assert in tcg_region_init
> or spapr_xive_claim_irq.

The IRQ number space is defined in include/hw/ppc/spapr_irq.h. XICS and
XIVE have the same IRQ number space, it is not a XIVE limitation. It
is how we organized interrupt numbers in the pseries-3.1 machine.

SPAPR_XIRQ_BASE defines an offset, at which the device IRQ numbers
start, and below that offset, the range of IRQ numbers is reserved
for IPIs. An assumption is made on the fact that both ranges, IPIs and
devices, are contiguous and there is a little shortcut being done with
the SPAPR_XIRQ_BASE define.

hw/ppc/spapr_irq.c:        qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
hw/ppc/spapr_irq.c:                                smc->nr_xirqs + SPAPR_XIRQ_BASE);

This should use a SPAPR_NR_IPIS define (like we have a SPAPR_NR_XIRQS
define) instead, which could be used to define mc->max_cpus like we
define smc->nr_xirqs.

Thanks,

C.

> Logs:
>
> Without patch fix:
>
> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
> qemu-system-ppc64: IRQ 4096 is not free
> [root@host build]#
>
> On LPAR:
> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
> **
> ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
> (region_size >= 2 * page_size)
> Bail out! ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
> (region_size >= 2 * page_size)
> Aborted (core dumped)
> [root@host build]#
>
> On x86:
> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
> qemu-system-ppc64: ../hw/intc/spapr_xive.c:596: spapr_xive_claim_irq:
> Assertion `lisn < xive->nr_irqs' failed.
> Aborted (core dumped)
> [root@host build]#
>
> With patch fix:
> [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
> qemu-system-ppc64: Invalid SMP CPUs 4097. The max CPUs supported by
> machine 'pseries-8.2' is 4096
> [root@host build]#
>
> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
> ---
>  hw/ppc/spapr.c         | 9 +++------
>  include/hw/ppc/spapr.h | 1 +
>  2 files changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index df09aa9d6a..1995949ea5 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -4647,13 +4647,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      mc->block_default_type = IF_SCSI;
>
>      /*
> -     * Setting max_cpus to INT32_MAX. Both KVM and TCG max_cpus values
> -     * should be limited by the host capability instead of hardcoded.
> -     * max_cpus for KVM guests will be checked in kvm_init(), and TCG
> -     * guests are welcome to have as many CPUs as the host are capable
> -     * of emulate.
> +     * While KVM determines max cpus in kvm_init() using kvm_max_vcpus(),
> +     * In TCG the limit is restricted by max-irqs setup by XIVE which is 4096.
>       */
> -    mc->max_cpus = INT32_MAX;
> +    mc->max_cpus = SPAPR_MAX_CPUS;
>
>      mc->no_parallel = 1;
>      mc->default_boot_order = "";
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index e91791a1a9..210849a494 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -23,6 +23,7 @@ typedef struct SpaprPendingHpt SpaprPendingHpt;
>
>  typedef struct Vof Vof;
>
> +#define SPAPR_MAX_CPUS 4096
>  #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
>  #define SPAPR_ENTRY_POINT       0x100
>
On 11/22/23 14:06, Cédric Le Goater wrote:
> On 11/22/23 08:48, Harsh Prateek Bora wrote:
>> Initialize the machine specific max_cpus limit to a usable limit 4096.
>> Keeping between 4096 to 8192 will throw IRQ not free error due to XIVE
>> limitation and keeping beyond 8192 will hit assert in tcg_region_init
>> or spapr_xive_claim_irq.
>
> The IRQ number space is defined in include/hw/ppc/spapr_irq.h. XICS and
> XIVE have the same IRQ number space, it is not a XIVE limitation. It
> is how we organized interrupt numbers in the pseries-3.1 machine.
>
> SPAPR_XIRQ_BASE defines an offset, at which the device IRQ numbers
> start, and below that offset, the range of IRQ numbers is reserved
> for IPIs. An assumption is made on the fact that both ranges, IPIs and
> devices, are contiguous and there is a little shortcut being done with
> the SPAPR_XIRQ_BASE define.
>
> hw/ppc/spapr_irq.c:        qdev_prop_set_uint32(dev, "nr-irqs",
> smc->nr_xirqs + SPAPR_XIRQ_BASE);
> hw/ppc/spapr_irq.c:                                smc->nr_xirqs +
> SPAPR_XIRQ_BASE);
>
> This should use a SPAPR_NR_IPIS define (like we have a SPAPR_NR_XIRQS
> define) instead, which could be used to define mc->max_cpus like we
> define smc->nr_xirqs.
>

Thanks Cedric for your review comments.
I have posted a v2 incorporating your suggestion.

regards,
Harsh

> Thanks,
>
> C.
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index df09aa9d6a..1995949ea5 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4647,13 +4647,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->block_default_type = IF_SCSI;
 
     /*
-     * Setting max_cpus to INT32_MAX. Both KVM and TCG max_cpus values
-     * should be limited by the host capability instead of hardcoded.
-     * max_cpus for KVM guests will be checked in kvm_init(), and TCG
-     * guests are welcome to have as many CPUs as the host are capable
-     * of emulate.
+     * While KVM determines max cpus in kvm_init() using kvm_max_vcpus(),
+     * In TCG the limit is restricted by max-irqs setup by XIVE which is 4096.
      */
-    mc->max_cpus = INT32_MAX;
+    mc->max_cpus = SPAPR_MAX_CPUS;
 
     mc->no_parallel = 1;
     mc->default_boot_order = "";
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index e91791a1a9..210849a494 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -23,6 +23,7 @@ typedef struct SpaprPendingHpt SpaprPendingHpt;
 
 typedef struct Vof Vof;
 
+#define SPAPR_MAX_CPUS 4096
 #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
 #define SPAPR_ENTRY_POINT       0x100
 
Initialize the machine specific max_cpus limit to a usable limit 4096.
Keeping between 4096 to 8192 will throw IRQ not free error due to XIVE
limitation and keeping beyond 8192 will hit assert in tcg_region_init
or spapr_xive_claim_irq.

Logs:

Without patch fix:

[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
qemu-system-ppc64: IRQ 4096 is not free
[root@host build]#

On LPAR:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
**
ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
(region_size >= 2 * page_size)
Bail out! ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
(region_size >= 2 * page_size)
Aborted (core dumped)
[root@host build]#

On x86:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
qemu-system-ppc64: ../hw/intc/spapr_xive.c:596: spapr_xive_claim_irq:
Assertion `lisn < xive->nr_irqs' failed.
Aborted (core dumped)
[root@host build]#

With patch fix:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
qemu-system-ppc64: Invalid SMP CPUs 4097. The max CPUs supported by
machine 'pseries-8.2' is 4096
[root@host build]#

Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr.c         | 9 +++------
 include/hw/ppc/spapr.h | 1 +
 2 files changed, 4 insertions(+), 6 deletions(-)