Message ID | 1468392620-25599-1-git-send-email-bharata@linux.vnet.ibm.com (mailing list archive)
---|---
State | New, archived
On Wed, 13 Jul 2016 12:20:20 +0530 Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> If CPU core addition or removal is allowed in random order leading to
> holes in the core id range (and hence in the cpu_index range), migration
> can fail as migration with holes in cpu_index range isn't yet handled
> correctly.
>
> Prevent this situation by enforcing the addition in contiguous order
> and removal in LIFO order so that we never end up with holes in
> cpu_index range.

Adding this limitation looks better than adding migration_id, as it will
allow libvirt to use the current -numa cpus=... while doing the same
amount of guesswork as it does now.

A similar patch for x86 won't be so simple, as cpu-add can add CPUs with
gaps (breaking migration at that), so I'd need to keep it that way with
some compat code, but that shouldn't be an issue.

> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
> While there is work in progress to support migration when there are holes
> in cpu_index range resulting from out-of-order plug or unplug, this patch
> is intended as a last resort if no easy, risk-free and elegant solution
> emerges before 2.7 dev cycle ends.
> [...]
On Wed, 13 Jul 2016 09:42:54 +0200 Igor Mammedov <imammedo@redhat.com> wrote:
> On Wed, 13 Jul 2016 12:20:20 +0530 Bharata B Rao <bharata@linux.vnet.ibm.com> wrote:
> > Prevent this situation by enforcing the addition in contiguous order
> > and removal in LIFO order so that we never end up with holes in
> > cpu_index range.
> Adding this limitation looks better than adding migration_id as
> it will allow libvirt to use the current -numa cpus=... while
> doing the same amount of guess work as it does now.
> [...]

CCing Peter for libvirt's opinion on this turn of events.
On Wed, 13 Jul 2016 10:19:00 +0200 Igor Mammedov <imammedo@redhat.com> wrote:
> CCing Peter for libvirt's opinion on this turn of events

Didn't actually CC him :/
On Wed, 13 Jul 2016 09:42:54 +0200 Igor Mammedov <imammedo@redhat.com> wrote:
> Similar patch for x86 won't be so simple as cpu-add can add cpus
> with gaps (breaking migration at that), so I'd need to keep it
> that way with some compat code, but that shouldn't be issue.

And does libvirt take care not to add CPUs with gaps? If so, maybe it
can do the same with PPC, and we'd simply warn the user that migration
is broken if we introduce gaps. Makes sense?
On Wed, Jul 13, 2016 at 12:20:20PM +0530, Bharata B Rao wrote:
> If CPU core addition or removal is allowed in random order leading to
> holes in the core id range (and hence in the cpu_index range), migration
> can fail as migration with holes in cpu_index range isn't yet handled
> correctly.
>
> Prevent this situation by enforcing the addition in contiguous order
> and removal in LIFO order so that we never end up with holes in
> cpu_index range.
> [...]

Applied to ppc-for-2.7. We can revert it once the problems with
cpu_index are sorted out.
On Thu, 14 Jul 2016 10:51:27 +1000 David Gibson <david@gibson.dropbear.id.au> wrote:
> Applied to ppc-for-2.7. We can revert it once the problems with
> cpu_index are sorted out.

You'd need to add a machine-type-specific compat option here, so that
new-qemu -M 2.7 wouldn't allow out-of-order hotplug either, and could
be migrated to old-qemu -M 2.7.
On Thu, Jul 14, 2016 at 10:27:15AM +0200, Igor Mammedov wrote:
> You'd need to add a machine-type-specific compat option here, so that
> new-qemu -M 2.7 wouldn't allow out-of-order hotplug either, and could
> be migrated to old-qemu -M 2.7.

Hmm, do we care about migration from newer back to older versions of
qemu upstream? If so, then I guess we do need this option. Though
strictly we don't need it until we actually do allow any-order hotplug.
On Fri, Jul 15, 2016 at 03:29:01PM +1000, David Gibson wrote:
> Hmm, do we care about migration from newer back to older versions of
> qemu upstream? If so, then I guess we do need this option. Though
> strictly we don't need it until we actually do allow any-order hotplug.

Right. I too thought that when we relax this restriction, say in 2.8,
we could have a sPAPRMachineClass option to allow out-of-order hotplug
from 2.8 onwards and disable it for 2.7 and older. With this, a 2.7
guest started with new-qemu can be migrated to old-qemu.

Regards,
Bharata.
On Fri, 15 Jul 2016 15:29:01 +1000 David Gibson <david@gibson.dropbear.id.au> wrote:
> Hmm, do we care about migration from newer back to older versions of
> qemu upstream?

Upstream we don't, though it would reduce maintenance headache.
(If it isn't hard, then why not do it upstream as well.)
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index bc52b3c..4bfc96b 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -126,12 +126,23 @@ static void spapr_core_release(DeviceState *dev, void *opaque)
 void spapr_core_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
                        Error **errp)
 {
+    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
     CPUCore *cc = CPU_CORE(dev);
     sPAPRDRConnector *drc =
         spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cc->core_id);
     sPAPRDRConnectorClass *drck;
     Error *local_err = NULL;
+    int smt = kvmppc_smt_threads();
+    int index = cc->core_id / smt;
+    int spapr_max_cores = max_cpus / smp_threads;
+    int i;
 
+    for (i = spapr_max_cores - 1; i > index; i--) {
+        if (spapr->cores[i]) {
+            error_setg(errp, "core-id %d should be removed first", i * smt);
+            return;
+        }
+    }
     g_assert(drc);
 
     drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
@@ -214,7 +225,7 @@ void spapr_core_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(OBJECT(hotplug_dev));
     sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
     int spapr_max_cores = max_cpus / smp_threads;
-    int index;
+    int index, i;
     int smt = kvmppc_smt_threads();
     Error *local_err = NULL;
     CPUCore *cc = CPU_CORE(dev);
@@ -252,6 +263,14 @@ void spapr_core_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         goto out;
     }
 
+    for (i = 0; i < index; i++) {
+        if (!spapr->cores[i]) {
+            error_setg(&local_err, "core-id %d should be added first",
+                       i * smt);
+            goto out;
+        }
+    }
+
 out:
     g_free(base_core_type);
     error_propagate(errp, local_err);
If CPU core addition or removal is allowed in random order leading to
holes in the core id range (and hence in the cpu_index range), migration
can fail as migration with holes in cpu_index range isn't yet handled
correctly.

Prevent this situation by enforcing the addition in contiguous order
and removal in LIFO order so that we never end up with holes in
cpu_index range.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
While there is work in progress to support migration when there are holes
in cpu_index range resulting from out-of-order plug or unplug, this patch
is intended as a last resort if no easy, risk-free and elegant solution
emerges before 2.7 dev cycle ends.

 hw/ppc/spapr_cpu_core.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)