Message ID | 20190107183946.7230-7-clg@kaod.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | spapr: add KVM support to the XIVE interrupt mode | expand |
On Mon, Jan 07, 2019 at 07:39:39PM +0100, Cédric Le Goater wrote: > When the thread interrupt management state is retrieved from the KVM > VCPU, word2 is saved under the QEMU XIVE thread context to print out > the OS CAM line under the QEMU monitor. > > This breaks the migration of a TCG guest (and with KVM when > kernel_irqchip=off) because the matching algorithm of the presenter > relies on the OS CAM value. Fix with an extra reset of the thread > contexts to restore the expected value. > > Signed-off-by: Cédric Le Goater <clg@kaod.org> Why is the CAM value you get from KVM different from the one you expect in qemu? > --- > hw/ppc/spapr_irq.c | 26 +++++++++++++++++++++++++- > 1 file changed, 25 insertions(+), 1 deletion(-) > > diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c > index 233c97c5ecd9..ba27d9d8e972 100644 > --- a/hw/ppc/spapr_irq.c > +++ b/hw/ppc/spapr_irq.c > @@ -363,7 +363,31 @@ static void spapr_irq_cpu_intc_create_xive(sPAPRMachineState *spapr, > > static int spapr_irq_post_load_xive(sPAPRMachineState *spapr, int version_id) > { > - return spapr_xive_post_load(spapr->xive, version_id); > + CPUState *cs; > + int ret; > + > + ret = spapr_xive_post_load(spapr->xive, version_id); > + if (ret) { > + return ret; > + } > + > + /* > + * When the states are collected from the KVM XIVE device, word2 > + * of the XiveTCTX is set to print out the OS CAM line under the > + * QEMU monitor. > + * > + * This breaks the migration on a TCG guest (or on KVM with > + * kernel_irqchip=off) because the matching algorithm of the > + * presenter relies on the OS CAM value. Fix with an extra reset > + * of the thread contexts to restore the expected value. > + */ > + CPU_FOREACH(cs) { > + PowerPCCPU *cpu = POWERPC_CPU(cs); > + > + /* (TCG) Set the OS CAM line of the thread interrupt context. */ > + spapr_xive_set_tctx_os_cam(cpu->tctx); > + } > + return 0; > } > > static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
On 2/8/19 6:36 AM, David Gibson wrote: > On Mon, Jan 07, 2019 at 07:39:39PM +0100, Cédric Le Goater wrote: >> When the thread interrupt management state is retrieved from the KVM >> VCPU, word2 is saved under the QEMU XIVE thread context to print out >> the OS CAM line under the QEMU monitor. >> >> This breaks the migration of a TCG guest (and with KVM when >> kernel_irqchip=off) because the matching algorithm of the presenter >> relies on the OS CAM value. Fix with an extra reset of the thread >> contexts to restore the expected value. >> >> Signed-off-by: Cédric Le Goater <clg@kaod.org> > > Why is the CAM value you get from KVM different from the one you > expect in qemu? An NVT base identifier is allocated for each VM at the OPAL level and each vCPU getd an increment of this value. It is pushed in the OS CAM line when the vCPU is scheduled to run. KVM XIVE names this identifier a VP id. The QEMU emulation of XIVE uses a fixed value for the NVT base identifier. C. >> --- >> hw/ppc/spapr_irq.c | 26 +++++++++++++++++++++++++- >> 1 file changed, 25 insertions(+), 1 deletion(-) >> >> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c >> index 233c97c5ecd9..ba27d9d8e972 100644 >> --- a/hw/ppc/spapr_irq.c >> +++ b/hw/ppc/spapr_irq.c >> @@ -363,7 +363,31 @@ static void spapr_irq_cpu_intc_create_xive(sPAPRMachineState *spapr, >> >> static int spapr_irq_post_load_xive(sPAPRMachineState *spapr, int version_id) >> { >> - return spapr_xive_post_load(spapr->xive, version_id); >> + CPUState *cs; >> + int ret; >> + >> + ret = spapr_xive_post_load(spapr->xive, version_id); >> + if (ret) { >> + return ret; >> + } >> + >> + /* >> + * When the states are collected from the KVM XIVE device, word2 >> + * of the XiveTCTX is set to print out the OS CAM line under the >> + * QEMU monitor. >> + * >> + * This breaks the migration on a TCG guest (or on KVM with >> + * kernel_irqchip=off) because the matching algorithm of the >> + * presenter relies on the OS CAM value. Fix with an extra reset >> + * of the thread contexts to restore the expected value. >> + */ >> + CPU_FOREACH(cs) { >> + PowerPCCPU *cpu = POWERPC_CPU(cs); >> + >> + /* (TCG) Set the OS CAM line of the thread interrupt context. */ >> + spapr_xive_set_tctx_os_cam(cpu->tctx); >> + } >> + return 0; >> } >> >> static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp) >
On Fri, Feb 08, 2019 at 08:12:12AM +0100, Cédric Le Goater wrote: > On 2/8/19 6:36 AM, David Gibson wrote: > > On Mon, Jan 07, 2019 at 07:39:39PM +0100, Cédric Le Goater wrote: > >> When the thread interrupt management state is retrieved from the KVM > >> VCPU, word2 is saved under the QEMU XIVE thread context to print out > >> the OS CAM line under the QEMU monitor. > >> > >> This breaks the migration of a TCG guest (and with KVM when > >> kernel_irqchip=off) because the matching algorithm of the presenter > >> relies on the OS CAM value. Fix with an extra reset of the thread > >> contexts to restore the expected value. > >> > >> Signed-off-by: Cédric Le Goater <clg@kaod.org> > > > > Why is the CAM value you get from KVM different from the one you > > expect in qemu? > > An NVT base identifier is allocated for each VM at the OPAL level > and each vCPU getd an increment of this value. It is pushed in the > OS CAM line when the vCPU is scheduled to run. > > KVM XIVE names this identifier a VP id. > > The QEMU emulation of XIVE uses a fixed value for the NVT base > identifier. Ah, I see. Hmm. Really this highlights why I'm uneasy migrating the whole TCTX as just a blob of registers, even though only some of them are really runtime state, and others are machine configuration that can be worked out separately at the two ends.
On 2/12/19 1:22 AM, David Gibson wrote: > On Fri, Feb 08, 2019 at 08:12:12AM +0100, Cédric Le Goater wrote: >> On 2/8/19 6:36 AM, David Gibson wrote: >>> On Mon, Jan 07, 2019 at 07:39:39PM +0100, Cédric Le Goater wrote: >>>> When the thread interrupt management state is retrieved from the KVM >>>> VCPU, word2 is saved under the QEMU XIVE thread context to print out >>>> the OS CAM line under the QEMU monitor. >>>> >>>> This breaks the migration of a TCG guest (and with KVM when >>>> kernel_irqchip=off) because the matching algorithm of the presenter >>>> relies on the OS CAM value. Fix with an extra reset of the thread >>>> contexts to restore the expected value. >>>> >>>> Signed-off-by: Cédric Le Goater <clg@kaod.org> >>> >>> Why is the CAM value you get from KVM different from the one you >>> expect in qemu? >> >> An NVT base identifier is allocated for each VM at the OPAL level >> and each vCPU getd an increment of this value. It is pushed in the >> OS CAM line when the vCPU is scheduled to run. >> >> KVM XIVE names this identifier a VP id. >> >> The QEMU emulation of XIVE uses a fixed value for the NVT base >> identifier. > > Ah, I see. > > Hmm. Really this highlights why I'm uneasy migrating the whole TCTX > as just a blob of registers, even though only some of them are really > runtime state, and others are machine configuration that can be worked > out separately at the two ends. This is really a special case : 1. migration kernel_irqchip=on -> kernel_irqchip=off 2. debug info from KVM is squashing TCG state We could ignore the VP id configured by KVM but it seems interesting to retrieve. C.
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c index 233c97c5ecd9..ba27d9d8e972 100644 --- a/hw/ppc/spapr_irq.c +++ b/hw/ppc/spapr_irq.c @@ -363,7 +363,31 @@ static void spapr_irq_cpu_intc_create_xive(sPAPRMachineState *spapr, static int spapr_irq_post_load_xive(sPAPRMachineState *spapr, int version_id) { - return spapr_xive_post_load(spapr->xive, version_id); + CPUState *cs; + int ret; + + ret = spapr_xive_post_load(spapr->xive, version_id); + if (ret) { + return ret; + } + + /* + * When the states are collected from the KVM XIVE device, word2 + * of the XiveTCTX is set to print out the OS CAM line under the + * QEMU monitor. + * + * This breaks the migration on a TCG guest (or on KVM with + * kernel_irqchip=off) because the matching algorithm of the + * presenter relies on the OS CAM value. Fix with an extra reset + * of the thread contexts to restore the expected value. + */ + CPU_FOREACH(cs) { + PowerPCCPU *cpu = POWERPC_CPU(cs); + + /* (TCG) Set the OS CAM line of the thread interrupt context. */ + spapr_xive_set_tctx_os_cam(cpu->tctx); + } + return 0; } static void spapr_irq_reset_xive(sPAPRMachineState *spapr, Error **errp)
When the thread interrupt management state is retrieved from the KVM VCPU, word2 is saved under the QEMU XIVE thread context to print out the OS CAM line under the QEMU monitor. This breaks the migration of a TCG guest (and with KVM when kernel_irqchip=off) because the matching algorithm of the presenter relies on the OS CAM value. Fix with an extra reset of the thread contexts to restore the expected value. Signed-off-by: Cédric Le Goater <clg@kaod.org> --- hw/ppc/spapr_irq.c | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-)