Message ID | ce92465a-8a54-e8b3-035f-46b695704169@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86: move vgc_flags to struct pv_vcpu | expand |
On 20/12/2019 13:55, Jan Beulich wrote: > There's been effectively no use of the field for HVM. > > Also shrink the field to unsigned int, even if this doesn't immediately > yield any space benefit for the structure itself. The resulting 32-bit > padding slot can eventually be used for some other field. The change in > size makes accesses slightly more efficient though, as no REX.W prefix > is going to be needed anymore on the respective insns. > > Mirror the HVM side change here (dropping of setting the field to > VGCF_online) also to Arm, on the assumption that it was cloned like > this originally. VGCF_online really should simply and consistently be > the guest view of the inverse of VPF_down, and hence needs representing > only in the get/set vCPU context interfaces. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Hi Jan, On 20/12/2019 14:55, Jan Beulich wrote: > There's been effectively no use of the field for HVM. > > Also shrink the field to unsigned int, even if this doesn't immediately > yield any space benefit for the structure itself. The resulting 32-bit > padding slot can eventually be used for some other field. The change in > size makes accesses slightly more efficient though, as no REX.W prefix > is going to be needed anymore on the respective insns. > > Mirror the HVM side change here (dropping of setting the field to > VGCF_online) also to Arm, on the assumption that it was cloned like > this originally. VGCF_online really should simply and consistently be > the guest view of the inverse of VPF_down, and hence needs representing > only in the get/set vCPU context interfaces. But vPSCI is just a wrapper to get/set vCPU context interfaces. Your changes below will clearly break bring-up of secondary vCPUs on Arm. This is because arch_set_guest_info() will rely on this flag to clear/set VPF_down in the pause flags. So I think the line in arm/vpsci.c should be left alone. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > > --- a/xen/arch/arm/vpsci.c > +++ b/xen/arch/arm/vpsci.c > @@ -78,7 +78,6 @@ static int do_common_cpu_on(register_t t > ctxt->user_regs.x0 = context_id; > } > #endif > - ctxt->flags = VGCF_online; > > domain_lock(d); > rc = arch_set_info_guest(v, ctxt); Cheers,
On 23.12.2019 18:33, Julien Grall wrote: > Hi Jan, > > On 20/12/2019 14:55, Jan Beulich wrote: >> There's been effectively no use of the field for HVM. >> >> Also shrink the field to unsigned int, even if this doesn't immediately >> yield any space benefit for the structure itself. The resulting 32-bit >> padding slot can eventually be used for some other field. The change in >> size makes accesses slightly more efficient though, as no REX.W prefix >> is going to be needed anymore on the respective insns. >> >> Mirror the HVM side change here (dropping of setting the field to >> VGCF_online) also to Arm, on the assumption that it was cloned like >> this originally. VGCF_online really should simply and consistently be >> the guest view of the inverse of VPF_down, and hence needs representing >> only in the get/set vCPU context interfaces. > > But vPSCI is just a wrapper to get/set vCPU context interfaces. Your > changes below will clearly break bring-up of secondary vCPUs on Arm. > > This is because arch_set_guest_info() will rely on this flag to > clear/set VPF_down in the pause flags. > > So I think the line in arm/vpsci.c should be left alone. Oh, I see - I didn't notice this (ab)use despite ... >> --- a/xen/arch/arm/vpsci.c >> +++ b/xen/arch/arm/vpsci.c >> @@ -78,7 +78,6 @@ static int do_common_cpu_on(register_t t >> ctxt->user_regs.x0 = context_id; >> } >> #endif >> - ctxt->flags = VGCF_online; > >> domain_lock(d); >> rc = arch_set_info_guest(v, ctxt); ... it actually being in context. Thanks for noticing. Jan
Hi Jan, On Fri, 27 Dec 2019, 09:22 Jan Beulich, <JBeulich@suse.com> wrote: > On 23.12.2019 18:33, Julien Grall wrote: > > Hi Jan, > > > > On 20/12/2019 14:55, Jan Beulich wrote: > >> There's been effectively no use of the field for HVM. > >> > >> Also shrink the field to unsigned int, even if this doesn't immediately > >> yield any space benefit for the structure itself. The resulting 32-bit > >> padding slot can eventually be used for some other field. The change in > >> size makes accesses slightly more efficient though, as no REX.W prefix > >> is going to be needed anymore on the respective insns. > >> > >> Mirror the HVM side change here (dropping of setting the field to > >> VGCF_online) also to Arm, on the assumption that it was cloned like > >> this originally. VGCF_online really should simply and consistently be > >> the guest view of the inverse of VPF_down, and hence needs representing > >> only in the get/set vCPU context interfaces. > > > > But vPSCI is just a wrapper to get/set vCPU context interfaces. Your > > changes below will clearly break bring-up of secondary vCPUs on Arm. > > > > This is because arch_set_guest_info() will rely on this flag to > > clear/set VPF_down in the pause flags. > > > > So I think the line in arm/vpsci.c should be left alone. > > Oh, I see - I didn't notice this (ab)use despite ... > Out of Interest, why do you think it is an abuse here and not in the toolstack? How do you suggest to improve it? I can add it in my improvement list for Arm. Cheers,
On 27.12.2019 12:27, Julien Grall wrote: > Hi Jan, > > On Fri, 27 Dec 2019, 09:22 Jan Beulich, <JBeulich@suse.com> wrote: > >> On 23.12.2019 18:33, Julien Grall wrote: >>> Hi Jan, >>> >>> On 20/12/2019 14:55, Jan Beulich wrote: >>>> There's been effectively no use of the field for HVM. >>>> >>>> Also shrink the field to unsigned int, even if this doesn't immediately >>>> yield any space benefit for the structure itself. The resulting 32-bit >>>> padding slot can eventually be used for some other field. The change in >>>> size makes accesses slightly more efficient though, as no REX.W prefix >>>> is going to be needed anymore on the respective insns. >>>> >>>> Mirror the HVM side change here (dropping of setting the field to >>>> VGCF_online) also to Arm, on the assumption that it was cloned like >>>> this originally. VGCF_online really should simply and consistently be >>>> the guest view of the inverse of VPF_down, and hence needs representing >>>> only in the get/set vCPU context interfaces. >>> >>> But vPSCI is just a wrapper to get/set vCPU context interfaces. Your >>> changes below will clearly break bring-up of secondary vCPUs on Arm. >>> >>> This is because arch_set_guest_info() will rely on this flag to >>> clear/set VPF_down in the pause flags. >>> >>> So I think the line in arm/vpsci.c should be left alone. >> >> Oh, I see - I didn't notice this (ab)use despite ... >> > > Out of Interest, why do you think it is an abuse here and not in the > toolstack? > > How do you suggest to improve it? I can add it in my improvement list for > Arm. Oh, "abuse" was about the arch_set_guest_info() use, not the use of the flag by the tool stack. Jan
Hi, On 27/12/2019 12:14, Jan Beulich wrote: > On 27.12.2019 12:27, Julien Grall wrote: >> Hi Jan, >> >> On Fri, 27 Dec 2019, 09:22 Jan Beulich, <JBeulich@suse.com> wrote: >> >>> On 23.12.2019 18:33, Julien Grall wrote: >>>> Hi Jan, >>>> >>>> On 20/12/2019 14:55, Jan Beulich wrote: >>>>> There's been effectively no use of the field for HVM. >>>>> >>>>> Also shrink the field to unsigned int, even if this doesn't immediately >>>>> yield any space benefit for the structure itself. The resulting 32-bit >>>>> padding slot can eventually be used for some other field. The change in >>>>> size makes accesses slightly more efficient though, as no REX.W prefix >>>>> is going to be needed anymore on the respective insns. >>>>> >>>>> Mirror the HVM side change here (dropping of setting the field to >>>>> VGCF_online) also to Arm, on the assumption that it was cloned like >>>>> this originally. VGCF_online really should simply and consistently be >>>>> the guest view of the inverse of VPF_down, and hence needs representing >>>>> only in the get/set vCPU context interfaces. >>>> >>>> But vPSCI is just a wrapper to get/set vCPU context interfaces. Your >>>> changes below will clearly break bring-up of secondary vCPUs on Arm. >>>> >>>> This is because arch_set_guest_info() will rely on this flag to >>>> clear/set VPF_down in the pause flags. >>>> >>>> So I think the line in arm/vpsci.c should be left alone. >>> >>> Oh, I see - I didn't notice this (ab)use despite ... >>> >> >> Out of Interest, why do you think it is an abuse here and not in the >> toolstack? >> >> How do you suggest to improve it? I can add it in my improvement list for >> Arm. > > Oh, "abuse" was about the arch_set_guest_info() use, not the use of > the flag by the tool stack. I may have read incorrectly your e-mail, although I think my questions about why this is an abuse and how do you suggest to improve are still relevant. Cheers,
On 03.01.2020 11:56, Julien Grall wrote: > Hi, > > On 27/12/2019 12:14, Jan Beulich wrote: >> On 27.12.2019 12:27, Julien Grall wrote: >>> Hi Jan, >>> >>> On Fri, 27 Dec 2019, 09:22 Jan Beulich, <JBeulich@suse.com> wrote: >>> >>>> On 23.12.2019 18:33, Julien Grall wrote: >>>>> Hi Jan, >>>>> >>>>> On 20/12/2019 14:55, Jan Beulich wrote: >>>>>> There's been effectively no use of the field for HVM. >>>>>> >>>>>> Also shrink the field to unsigned int, even if this doesn't immediately >>>>>> yield any space benefit for the structure itself. The resulting 32-bit >>>>>> padding slot can eventually be used for some other field. The change in >>>>>> size makes accesses slightly more efficient though, as no REX.W prefix >>>>>> is going to be needed anymore on the respective insns. >>>>>> >>>>>> Mirror the HVM side change here (dropping of setting the field to >>>>>> VGCF_online) also to Arm, on the assumption that it was cloned like >>>>>> this originally. VGCF_online really should simply and consistently be >>>>>> the guest view of the inverse of VPF_down, and hence needs representing >>>>>> only in the get/set vCPU context interfaces. >>>>> >>>>> But vPSCI is just a wrapper to get/set vCPU context interfaces. Your >>>>> changes below will clearly break bring-up of secondary vCPUs on Arm. >>>>> >>>>> This is because arch_set_guest_info() will rely on this flag to >>>>> clear/set VPF_down in the pause flags. >>>>> >>>>> So I think the line in arm/vpsci.c should be left alone. >>>> >>>> Oh, I see - I didn't notice this (ab)use despite ... >>>> >>> >>> Out of Interest, why do you think it is an abuse here and not in the >>> toolstack? >>> >>> How do you suggest to improve it? I can add it in my improvement list for >>> Arm. >> >> Oh, "abuse" was about the arch_set_guest_info() use, not the use of >> the flag by the tool stack. > > I may have read incorrectly your e-mail, although I think my questions > about why this is an abuse and how do you suggest to improve are still > relevant. arch_set_info_guest() is intended to be used for exactly one purpose - vCPU context initialization via hypercall. With this, and _without_ me knowing anything about PSCI, it _looks_ to me to be an abuse. I'd expect there to be something in x86 that could be used for comparison, and whatever that is - it doesn't need a similar extra use of arch_set_info_guest(). (As a result, I don't see how I could reasonably give a concrete suggestion towards improvement. In fact I may be entirely wrong with my feeling of this being an abuse in the first place.) Jan
Hi, On 03/01/2020 11:05, Jan Beulich wrote: > On 03.01.2020 11:56, Julien Grall wrote: >> Hi, >> >> On 27/12/2019 12:14, Jan Beulich wrote: >>> On 27.12.2019 12:27, Julien Grall wrote: >>>> Hi Jan, >>>> >>>> On Fri, 27 Dec 2019, 09:22 Jan Beulich, <JBeulich@suse.com> wrote: >>>> >>>>> On 23.12.2019 18:33, Julien Grall wrote: >>>>>> Hi Jan, >>>>>> >>>>>> On 20/12/2019 14:55, Jan Beulich wrote: >>>>>>> There's been effectively no use of the field for HVM. >>>>>>> >>>>>>> Also shrink the field to unsigned int, even if this doesn't immediately >>>>>>> yield any space benefit for the structure itself. The resulting 32-bit >>>>>>> padding slot can eventually be used for some other field. The change in >>>>>>> size makes accesses slightly more efficient though, as no REX.W prefix >>>>>>> is going to be needed anymore on the respective insns. >>>>>>> >>>>>>> Mirror the HVM side change here (dropping of setting the field to >>>>>>> VGCF_online) also to Arm, on the assumption that it was cloned like >>>>>>> this originally. VGCF_online really should simply and consistently be >>>>>>> the guest view of the inverse of VPF_down, and hence needs representing >>>>>>> only in the get/set vCPU context interfaces. >>>>>> >>>>>> But vPSCI is just a wrapper to get/set vCPU context interfaces. Your >>>>>> changes below will clearly break bring-up of secondary vCPUs on Arm. >>>>>> >>>>>> This is because arch_set_guest_info() will rely on this flag to >>>>>> clear/set VPF_down in the pause flags. >>>>>> >>>>>> So I think the line in arm/vpsci.c should be left alone. >>>>> >>>>> Oh, I see - I didn't notice this (ab)use despite ... >>>>> >>>> >>>> Out of Interest, why do you think it is an abuse here and not in the >>>> toolstack? >>>> >>>> How do you suggest to improve it? I can add it in my improvement list for >>>> Arm. >>> >>> Oh, "abuse" was about the arch_set_guest_info() use, not the use of >>> the flag by the tool stack. >> >> I may have read incorrectly your e-mail, although I think my questions >> about why this is an abuse and how do you suggest to improve are still >> relevant. > > arch_set_info_guest() is intended to be used for exactly one purpose > - vCPU context initialization via hypercall. With this, and _without_ > me knowing anything about PSCI, it _looks_ to me to be an abuse. PSCI (Power State Coordination Interface) is a generic way to manage the power on your plarform (e.g CPU bring-up). The CPU_ON call will actually initialize the vCPU context and then start the vCPU. Whilst, this is not a Xen specific interface, they are still hypercall based. > I'd > expect there to be something in x86 that could be used for > comparison, and whatever that is - it doesn't need a similar extra > use of arch_set_info_guest(). How do you manage secondary CPUs on HVM/PVH guest? Cheers,
On 03.01.2020 12:19, Julien Grall wrote:
> How do you manage secondary CPUs on HVM/PVH guest?
Secondary CPUs have architectural state they start with, and
there's very little control an OS has over initial register
state: There's just an 8-bit value specifying (part of) the
address the CPU should start executing from. All other
registers get set to hard coded values. And that 8-bit value
is part of the IPI message the primary CPU sends to the AP
to be brought up (i.e. there's no hypercall involved here).
For PVH, a variant of the normal PV model of starting vCPU-s
gets used, i.e. via VCPUOP_initialise.
Jan
Hi Jan, Thank you for the information. On 03/01/2020 11:31, Jan Beulich wrote: > On 03.01.2020 12:19, Julien Grall wrote: >> How do you manage secondary CPUs on HVM/PVH guest? > > Secondary CPUs have architectural state they start with, and > there's very little control an OS has over initial register > state: There's just an 8-bit value specifying (part of) the > address the CPU should start executing from. All other > registers get set to hard coded values. And that 8-bit value > is part of the IPI message the primary CPU sends to the AP > to be brought up (i.e. there's no hypercall involved here). Do you have any pointer to this code? Can a CPU be turned off afterwards and then boot again? > > For PVH, a variant of the normal PV model of starting vCPU-s > gets used, i.e. via VCPUOP_initialise. In the case of PSCI, I think it is between the two. We are using a generic hypercall, yet most of the state is fixed. But as the guest OS may run a CPU for a while, turning off and then boot again, we need to be able to set the state again. Hence, the arch_set_guest_info() is quite convenient to use to reset the CPU state. Cheers,
On 03.01.2020 12:48, Julien Grall wrote: > Hi Jan, > > Thank you for the information. > > On 03/01/2020 11:31, Jan Beulich wrote: >> On 03.01.2020 12:19, Julien Grall wrote: >>> How do you manage secondary CPUs on HVM/PVH guest? >> >> Secondary CPUs have architectural state they start with, and >> there's very little control an OS has over initial register >> state: There's just an 8-bit value specifying (part of) the >> address the CPU should start executing from. All other >> registers get set to hard coded values. And that 8-bit value >> is part of the IPI message the primary CPU sends to the AP >> to be brought up (i.e. there's no hypercall involved here). > > Do you have any pointer to this code? Perhaps best look at how Xen _uses_ this mechanism: smpboot.c:wakeup_secondary_cpu(). > Can a CPU be turned off afterwards > and then boot again? Yes, by sending it an INIT signal (as done by e.g. wakeup_secondary_cpu()). Jan
--- a/xen/arch/arm/vpsci.c +++ b/xen/arch/arm/vpsci.c @@ -78,7 +78,6 @@ static int do_common_cpu_on(register_t t ctxt->user_regs.x0 = context_id; } #endif - ctxt->flags = VGCF_online; domain_lock(d); rc = arch_set_info_guest(v, ctxt); --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -896,6 +896,8 @@ int arch_set_info_guest( if ( ((c(ldt_base) & (PAGE_SIZE - 1)) != 0) || (c(ldt_ents) > 8192) ) return -EINVAL; + + v->arch.pv.vgc_flags = flags; } v->arch.flags |= TF_kernel_mode; @@ -908,8 +910,6 @@ int arch_set_info_guest( !is_hvm_domain(d) && !is_pv_32bit_domain(d) ) v->arch.flags &= ~TF_kernel_mode; - v->arch.vgc_flags = flags; - vcpu_setup_fpu(v, v->arch.xsave_area, flags & VGCF_I387_VALID ? &c.nat->fpu_ctxt : NULL, FCW_DEFAULT); @@ -1488,7 +1488,7 @@ static void load_segments(struct vcpu *n domain_crash(n->domain); } - if ( n->arch.vgc_flags & VGCF_failsafe_disables_events ) + if ( n->arch.pv.vgc_flags & VGCF_failsafe_disables_events ) vcpu_info(n, evtchn_upcall_mask) = 1; regs->entry_vector |= TRAP_syscall; @@ -1527,7 +1527,7 @@ static void load_segments(struct vcpu *n domain_crash(n->domain); } - if ( n->arch.vgc_flags & VGCF_failsafe_disables_events ) + if ( n->arch.pv.vgc_flags & VGCF_failsafe_disables_events ) vcpu_info(n, evtchn_upcall_mask) = 1; regs->entry_vector |= TRAP_syscall; --- a/xen/arch/x86/domctl.c +++ b/xen/arch/x86/domctl.c @@ -1500,7 +1500,10 @@ void arch_get_info_guest(struct vcpu *v, #define c(fld) (!compat ? (c.nat->fld) : (c.cmp->fld)) memcpy(&c.nat->fpu_ctxt, v->arch.fpu_ctxt, sizeof(c.nat->fpu_ctxt)); - c(flags = v->arch.vgc_flags & ~(VGCF_i387_valid|VGCF_in_kernel)); + if ( is_pv_domain(d) ) + c(flags = v->arch.pv.vgc_flags & ~(VGCF_i387_valid|VGCF_in_kernel)); + else + c(flags = 0); if ( v->fpu_initialised ) c(flags |= VGCF_i387_valid); if ( !(v->pause_flags & VPF_down) ) --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1151,8 +1151,6 @@ static int hvm_load_cpu_ctxt(struct doma v->arch.dr6 = ctxt.dr6; v->arch.dr7 = ctxt.dr7; - v->arch.vgc_flags = VGCF_online; - /* Auxiliary processors should be woken immediately. */ v->is_initialised = 1; clear_bit(_VPF_down, &v->pause_flags); @@ -3864,8 +3862,6 @@ void hvm_vcpu_reset_state(struct vcpu *v v->arch.xsave_area->xsave_hdr.xstate_bv = 0; vcpu_setup_fpu(v, v->arch.xsave_area, NULL, FCW_RESET); - v->arch.vgc_flags = VGCF_online; - arch_vcpu_regs_init(v); v->arch.user_regs.rip = ip; --- a/xen/arch/x86/pv/callback.c +++ b/xen/arch/x86/pv/callback.c @@ -82,17 +82,17 @@ static long register_guest_callback(stru case CALLBACKTYPE_failsafe: curr->arch.pv.failsafe_callback_eip = reg->address; if ( reg->flags & CALLBACKF_mask_events ) - curr->arch.vgc_flags |= VGCF_failsafe_disables_events; + curr->arch.pv.vgc_flags |= VGCF_failsafe_disables_events; else - curr->arch.vgc_flags &= ~VGCF_failsafe_disables_events; + curr->arch.pv.vgc_flags &= ~VGCF_failsafe_disables_events; break; case CALLBACKTYPE_syscall: curr->arch.pv.syscall_callback_eip = reg->address; if ( reg->flags & CALLBACKF_mask_events ) - curr->arch.vgc_flags |= VGCF_syscall_disables_events; + curr->arch.pv.vgc_flags |= VGCF_syscall_disables_events; else - curr->arch.vgc_flags &= ~VGCF_syscall_disables_events; + curr->arch.pv.vgc_flags &= ~VGCF_syscall_disables_events; break; case CALLBACKTYPE_syscall32: @@ -226,9 +226,9 @@ static long compat_register_guest_callba curr->arch.pv.failsafe_callback_cs = reg->address.cs; curr->arch.pv.failsafe_callback_eip = reg->address.eip; if ( reg->flags & CALLBACKF_mask_events ) - curr->arch.vgc_flags |= VGCF_failsafe_disables_events; + curr->arch.pv.vgc_flags |= VGCF_failsafe_disables_events; else - curr->arch.vgc_flags &= ~VGCF_failsafe_disables_events; + curr->arch.pv.vgc_flags &= ~VGCF_failsafe_disables_events; break; case CALLBACKTYPE_syscall32: --- a/xen/arch/x86/x86_64/asm-offsets.c +++ b/xen/arch/x86/x86_64/asm-offsets.c @@ -69,7 +69,7 @@ void __dummy__(void) OFFSET(VCPU_kernel_sp, struct vcpu, arch.pv.kernel_sp); OFFSET(VCPU_kernel_ss, struct vcpu, arch.pv.kernel_ss); OFFSET(VCPU_iopl, struct vcpu, arch.pv.iopl); - OFFSET(VCPU_guest_context_flags, struct vcpu, arch.vgc_flags); + OFFSET(VCPU_guest_context_flags, struct vcpu, arch.pv.vgc_flags); OFFSET(VCPU_cr3, struct vcpu, arch.cr3); OFFSET(VCPU_arch_msrs, struct vcpu, arch.msrs); OFFSET(VCPU_nmi_pending, struct vcpu, nmi_pending); --- a/xen/include/asm-x86/domain.h +++ b/xen/include/asm-x86/domain.h @@ -476,6 +476,8 @@ struct pv_vcpu /* map_domain_page() mapping cache. */ struct mapcache_vcpu mapcache; + unsigned int vgc_flags; + struct trap_info *trap_ctxt; unsigned long gdt_frames[FIRST_RESERVED_GDT_PAGE]; @@ -542,7 +544,6 @@ struct arch_vcpu */ void *fpu_ctxt; - unsigned long vgc_flags; struct cpu_user_regs user_regs; /* Debug registers. */
There's been effectively no use of the field for HVM. Also shrink the field to unsigned int, even if this doesn't immediately yield any space benefit for the structure itself. The resulting 32-bit padding slot can eventually be used for some other field. The change in size makes accesses slightly more efficient though, as no REX.W prefix is going to be needed anymore on the respective insns. Mirror the HVM side change here (dropping of setting the field to VGCF_online) also to Arm, on the assumption that it was cloned like this originally. VGCF_online really should simply and consistently be the guest view of the inverse of VPF_down, and hence needs representing only in the get/set vCPU context interfaces. Signed-off-by: Jan Beulich <jbeulich@suse.com>