Message ID | 1477370856-8940-4-git-send-email-mdroth@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Oct 24, 2016 at 11:47:29PM -0500, Michael Roth wrote: > In some cases, ibm,client-architecture-support calls can fail. This > could happen in the current code for situations where the modified > device tree segment exceeds the buffer size provided by the guest > via the call parameters. In these cases, QEMU will reset, allowing > an opportunity to regenerate the device tree from scratch via > boot-time handling. There are potentially other scenarios as well, > not currently reachable in the current code, but possible in theory, > such as cases where device-tree properties or nodes need to be removed. > > We currently don't handle either of these properly for option vector > capabilities however. Instead of carrying the negotiated capability > beyond the reset and creating the boot-time device tree accordingly, > we start from scratch, generating the same boot-time device tree as we > did prior to the CAS-generated and the same device tree updates as we > did before. This could (in theory) cause us to get stuck in a reset > loop. This hasn't been observed, but depending on the extensiveness > of CAS-induced device tree updates in the future, could eventually > become an issue. > > Address this by pulling capability-related device tree > updates resulting from CAS calls into a common routine, > spapr_dt_cas_updates(), and adding an sPAPROptionVector* > parameter that allows us to test for newly-negotiated capabilities. > We invoke it as follows: > > 1) When ibm,client-architecture-support gets called, we > call spapr_dt_cas_updates() with the set of capabilities > added since the previous call to ibm,client-architecture-support. > For the initial boot, or a system reset generated by something > other than the CAS call itself, this set will consist of *all* > options supported both the platform and the guest. For calls > to ibm,client-architecture-support immediately after a CAS-induced > reset, we call spapr_dt_cas_updates() with only the set > of capabilities added since the previous call, since the other > capabilities will have already been addressed by the boot-time > device-tree this time around. In the unlikely event that > capabilities are *removed* since the previous CAS, we will > generate a CAS-induced reset. In the unlikely event that we > cannot fit the device-tree updates into the buffer provided > by the guest, well generate a CAS-induced reset. > > 2) When a CAS update results in the need to reset the machine and > include the updates in the boot-time device tree, we call the > spapr_dt_cas_updates() using the full set of negotiated > capabilities as part of the reset path. At initial boot, or after > a reset generated by something other than the CAS call itself, > this set will be empty, resulting in what should be the same > boot-time device-tree as we generated prior to this patch. For > CAS-induced reset, this routine will be called with the full set of > capabilities negotiated by the platform/guest in the previous > CAS call, which should result in CAS updates from previous call > being accounted for in the initial boot-time device tree. > > Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> > Reviewed-by: David Gibson <david@gibson.dropbear.id.au> One little nit.. [snip] > @@ -1013,13 +1013,27 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_, > * of guest input. To model these properly we'd want some sort of mask, > * but since they only currently apply to memory migration as defined > * by LoPAPR 1.1, 14.5.4.8, which QEMU doesn't implement, we don't need > - * to worry about this. > + * to worry about this for now. > */ > + ov5_cas_old = spapr_ovec_clone(spapr->ov5_cas); > + /* full range of negotiated ov5 capabilities */ > spapr_ovec_intersect(spapr->ov5_cas, spapr->ov5, ov5_guest); > spapr_ovec_cleanup(ov5_guest); > + /* capabilities that have been added since CAS-generated guest reset. > + * if capabilities have since been removed, generate another reset > + */ > + ov5_updates = spapr_ovec_new(); > + spapr->cas_reboot = spapr_ovec_diff(ov5_updates, > + ov5_cas_old, spapr->ov5_cas); > + > + if (!spapr->cas_reboot) { > + spapr->cas_reboot = > + spapr_h_cas_compose_response(spapr, args[1], args[2], cpu_update, > + ov5_updates); spapr->cas_reboot is a bool, whereas spapr_h_cas_compose_response() returns an int error code. Now that C has real bools, I think the compiler will do the right thing here, but I'd prefer to see an explicit cas_reboot = (spapr_h_cas_compose_response() != 0) for clarity. > + } > + spapr_ovec_cleanup(ov5_updates); > > - if (spapr_h_cas_compose_response(spapr, args[1], args[2], > - cpu_update)) { > + if (spapr->cas_reboot) { > qemu_system_reset_request(); > } > > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h > index a37eee8..b6f9f1b 100644 > --- a/include/hw/ppc/spapr.h > +++ b/include/hw/ppc/spapr.h > @@ -75,6 +75,7 @@ struct sPAPRMachineState { > bool has_graphics; > sPAPROptionVector *ov5; /* QEMU-supported option vectors */ > sPAPROptionVector *ov5_cas; /* negotiated (via CAS) option vectors */ > + bool cas_reboot; > > uint32_t check_exception_irq; > Notifier epow_notifier; > @@ -586,7 +587,8 @@ void spapr_events_init(sPAPRMachineState *sm); > void spapr_dt_events(void *fdt, uint32_t check_exception_irq); > int spapr_h_cas_compose_response(sPAPRMachineState *sm, > target_ulong addr, target_ulong size, > - bool cpu_update); > + bool cpu_update, > + sPAPROptionVector *ov5_updates); > sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn); > void spapr_tce_table_enable(sPAPRTCETable *tcet, > uint32_t page_shift, uint64_t bus_offset,
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index af5a239..3b64580 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -655,13 +655,28 @@ out: return ret; } +static int spapr_dt_cas_updates(sPAPRMachineState *spapr, void *fdt, + sPAPROptionVector *ov5_updates) +{ + sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr); + int ret = 0; + + /* Generate ibm,dynamic-reconfiguration-memory node if required */ + if (spapr_ovec_test(ov5_updates, OV5_DRCONF_MEMORY)) { + g_assert(smc->dr_lmb_enabled); + ret = spapr_populate_drconf_memory(spapr, fdt); + } + + return ret; +} + int spapr_h_cas_compose_response(sPAPRMachineState *spapr, target_ulong addr, target_ulong size, - bool cpu_update) + bool cpu_update, + sPAPROptionVector *ov5_updates) { void *fdt, *fdt_skel; sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 }; - sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine()); size -= sizeof(hdr); @@ -680,10 +695,8 @@ int spapr_h_cas_compose_response(sPAPRMachineState *spapr, _FDT((spapr_fixup_cpu_dt(fdt, spapr))); } - /* Generate ibm,dynamic-reconfiguration-memory node if required */ - if (spapr_ovec_test(spapr->ov5_cas, OV5_DRCONF_MEMORY)) { - g_assert(smc->dr_lmb_enabled); - _FDT((spapr_populate_drconf_memory(spapr, fdt))); + if (spapr_dt_cas_updates(spapr, fdt, ov5_updates)) { + return -1; } /* Pack resulting tree */ @@ -972,6 +985,13 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr, _FDT((fdt_add_mem_rsv(fdt, spapr->initrd_base, spapr->initrd_size))); } + /* ibm,client-architecture-support updates */ + ret = spapr_dt_cas_updates(spapr, fdt, spapr->ov5_cas); + if (ret < 0) { + error_report("couldn't setup CAS properties fdt"); + exit(1); + } + return fdt; } @@ -1137,6 +1157,13 @@ static void ppc_spapr_reset(void) rtas_addr = rtas_limit - RTAS_MAX_SIZE; fdt_addr = rtas_addr - FDT_MAX_SIZE; + /* if this reset wasn't generated by CAS, we should reset our + * negotiated options and start from scratch */ + if (!spapr->cas_reboot) { + spapr_ovec_cleanup(spapr->ov5_cas); + spapr->ov5_cas = spapr_ovec_new(); + } + fdt = spapr_build_fdt(spapr, rtas_addr, spapr->rtas_size); spapr_load_rtas(spapr, fdt, rtas_addr); @@ -1164,6 +1191,7 @@ static void ppc_spapr_reset(void) first_cpu->halted = 0; first_ppc_cpu->env.nip = SPAPR_ENTRY_POINT; + spapr->cas_reboot = false; } static void spapr_create_nvram(sPAPRMachineState *spapr) diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c index f1d081b..d277813 100644 --- a/hw/ppc/spapr_hcall.c +++ b/hw/ppc/spapr_hcall.c @@ -950,7 +950,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_, unsigned compat_lvl = 0, cpu_version = 0; unsigned max_lvl = get_compat_level(cpu_->max_compat); int counter; - sPAPROptionVector *ov5_guest; + sPAPROptionVector *ov5_guest, *ov5_cas_old, *ov5_updates; /* Parse PVR list */ for (counter = 0; counter < 512; ++counter) { @@ -1013,13 +1013,27 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_, * of guest input. To model these properly we'd want some sort of mask, * but since they only currently apply to memory migration as defined * by LoPAPR 1.1, 14.5.4.8, which QEMU doesn't implement, we don't need - * to worry about this. + * to worry about this for now. */ + ov5_cas_old = spapr_ovec_clone(spapr->ov5_cas); + /* full range of negotiated ov5 capabilities */ spapr_ovec_intersect(spapr->ov5_cas, spapr->ov5, ov5_guest); spapr_ovec_cleanup(ov5_guest); + /* capabilities that have been added since CAS-generated guest reset. + * if capabilities have since been removed, generate another reset + */ + ov5_updates = spapr_ovec_new(); + spapr->cas_reboot = spapr_ovec_diff(ov5_updates, + ov5_cas_old, spapr->ov5_cas); + + if (!spapr->cas_reboot) { + spapr->cas_reboot = + spapr_h_cas_compose_response(spapr, args[1], args[2], cpu_update, + ov5_updates); + } + spapr_ovec_cleanup(ov5_updates); - if (spapr_h_cas_compose_response(spapr, args[1], args[2], - cpu_update)) { + if (spapr->cas_reboot) { qemu_system_reset_request(); } diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h index a37eee8..b6f9f1b 100644 --- a/include/hw/ppc/spapr.h +++ b/include/hw/ppc/spapr.h @@ -75,6 +75,7 @@ struct sPAPRMachineState { bool has_graphics; sPAPROptionVector *ov5; /* QEMU-supported option vectors */ sPAPROptionVector *ov5_cas; /* negotiated (via CAS) option vectors */ + bool cas_reboot; uint32_t check_exception_irq; Notifier epow_notifier; @@ -586,7 +587,8 @@ void spapr_events_init(sPAPRMachineState *sm); void spapr_dt_events(void *fdt, uint32_t check_exception_irq); int spapr_h_cas_compose_response(sPAPRMachineState *sm, target_ulong addr, target_ulong size, - bool cpu_update); + bool cpu_update, + sPAPROptionVector *ov5_updates); sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn); void spapr_tce_table_enable(sPAPRTCETable *tcet, uint32_t page_shift, uint64_t bus_offset,