Message ID | ac4cf44e-8edf-ba18-193a-79dfde53dd5c@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 11/10/2016 09:55 AM, Andrew Cooper wrote: > On 10/11/16 14:50, Boris Ostrovsky wrote: >> Currently hypervisor provides PV guest's CPUID(1).EBX[31:24] (initial >> APIC ID) with contents of that field on the processor that launched >> the guest. This results in the guest reporting different initial >> APIC IDs across runs. >> >> We should be consistent in how this value is reported, let's set >> it to 0 (which is also what Linux guests expect). >> >> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > This surely wants to go along with: Probably, although Linux PV always reports APIC ID as zero (whole PV APIC is a mess there as it is tied to topology discovery and we don't do this well, to put it charitably). > andrewcoop@andrewcoop:/local/xen.git/xen$ git diff > diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c > index b51b51b..bdf9339 100644 > --- a/xen/arch/x86/traps.c > +++ b/xen/arch/x86/traps.c > @@ -985,6 +985,10 @@ void pv_cpuid(struct cpu_user_regs *regs) > uint32_t tmp, _ecx, _ebx; > > case 0x00000001: > + /* Fix up VLAPIC details. */ > + b &= 0x00FFFFFFu; > + b |= (curr->vcpu_id * 2) << 24; Do we also need to multiply by two for PV guests? Or is it just to be consistent with HVM? -boris > + > c &= pv_featureset[FEATURESET_1c]; > d &= pv_featureset[FEATURESET_1d]; > > > Which brings the PV CPUID handling in line with HVM handling. Otherwise > a guest will see an APIC ID of 0 in all vcpus, which will surely confuse it. > > ~Andrew > >> --- >> >> I think this should go to stable branches as well. This has been causing >> problems lately in Linux with introduction of topology maps. >> >> >> tools/libxc/xc_cpuid_x86.c | 6 ++++++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c >> index d761805..1f26294 100644 >> --- a/tools/libxc/xc_cpuid_x86.c >> +++ b/tools/libxc/xc_cpuid_x86.c >> @@ -618,6 +618,12 @@ static void xc_cpuid_pv_policy(xc_interface *xch, >> /* Host topology exposed to PV guest. Provide host value. */ >> bool host_htt = regs[3] & bitmaskof(X86_FEATURE_HTT); >> >> + /* >> + * Don't pick host's Initial APIC ID which can change from run >> + * to run. >> + */ >> + regs[1] &= 0x00ffffffu; >> + >> regs[2] = info->featureset[featureword_of(X86_FEATURE_SSE3)]; >> regs[3] = (info->featureset[featureword_of(X86_FEATURE_FPU)] & >> ~bitmaskof(X86_FEATURE_HTT));
On 10/11/16 15:05, Boris Ostrovsky wrote: > On 11/10/2016 09:55 AM, Andrew Cooper wrote: >> On 10/11/16 14:50, Boris Ostrovsky wrote: >>> Currently hypervisor provides PV guest's CPUID(1).EBX[31:24] (initial >>> APIC ID) with contents of that field on the processor that launched >>> the guest. This results in the guest reporting different initial >>> APIC IDs across runs. >>> >>> We should be consistent in how this value is reported, let's set >>> it to 0 (which is also what Linux guests expect). >>> >>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> >> This surely wants to go along with: > Probably, although Linux PV always reports APIC ID as zero (whole PV > APIC is a mess there as it is tied to topology discovery and we don't do > this well, to put it charitably). If PV linux always overrides this to 0, why do you need the toolstack fix in the first place? > >> andrewcoop@andrewcoop:/local/xen.git/xen$ git diff >> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c >> index b51b51b..bdf9339 100644 >> --- a/xen/arch/x86/traps.c >> +++ b/xen/arch/x86/traps.c >> @@ -985,6 +985,10 @@ void pv_cpuid(struct cpu_user_regs *regs) >> uint32_t tmp, _ecx, _ebx; >> >> case 0x00000001: >> + /* Fix up VLAPIC details. */ >> + b &= 0x00FFFFFFu; >> + b |= (curr->vcpu_id * 2) << 24; > Do we also need to multiply by two for PV guests? Or is it just to be > consistent with HVM? Frankly, until I get CPUID phase 2 sorted, this is all held together with good wishes, rather than duck tape. I am astounded it has held together this long. HVM chooses an even APIC ID to prevent the VM thinking it has hyperthreads. ~Andrew
On 11/10/2016 10:08 AM, Andrew Cooper wrote: > On 10/11/16 15:05, Boris Ostrovsky wrote: >> On 11/10/2016 09:55 AM, Andrew Cooper wrote: >>> On 10/11/16 14:50, Boris Ostrovsky wrote: >>>> Currently hypervisor provides PV guest's CPUID(1).EBX[31:24] (initial >>>> APIC ID) with contents of that field on the processor that launched >>>> the guest. This results in the guest reporting different initial >>>> APIC IDs across runs. >>>> >>>> We should be consistent in how this value is reported, let's set >>>> it to 0 (which is also what Linux guests expect). >>>> >>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> >>> This surely wants to go along with: >> Probably, although Linux PV always reports APIC ID as zero (whole PV >> APIC is a mess there as it is tied to topology discovery and we don't do >> this well, to put it charitably). > If PV linux always overrides this to 0, why do you need the toolstack > fix in the first place? I meant that Linux overrides APICID read from APIC (which, of course, we don't have) to zero, not CPUID. > >>> andrewcoop@andrewcoop:/local/xen.git/xen$ git diff >>> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c >>> index b51b51b..bdf9339 100644 >>> --- a/xen/arch/x86/traps.c >>> +++ b/xen/arch/x86/traps.c >>> @@ -985,6 +985,10 @@ void pv_cpuid(struct cpu_user_regs *regs) >>> uint32_t tmp, _ecx, _ebx; >>> >>> case 0x00000001: >>> + /* Fix up VLAPIC details. */ >>> + b &= 0x00FFFFFFu; >>> + b |= (curr->vcpu_id * 2) << 24; >> Do we also need to multiply by two for PV guests? Or is it just to be >> consistent with HVM? > Frankly, until I get CPUID phase 2 sorted, this is all held together > with good wishes, rather than duck tape. I am astounded it has held > together this long. It does not anymore. We are being yelled at by Linux x86 maintainers to get our stuff in order. > > HVM chooses an even APIC ID to prevent the VM thinking it has hyperthreads. Right, but it's not needed (I think) by PV. But we might do it that way to keep the two in sync. Let me test it and see what breaks. -boris
>>> On 10.11.16 at 15:55, <andrew.cooper3@citrix.com> wrote: > On 10/11/16 14:50, Boris Ostrovsky wrote: >> Currently hypervisor provides PV guest's CPUID(1).EBX[31:24] (initial >> APIC ID) with contents of that field on the processor that launched >> the guest. This results in the guest reporting different initial >> APIC IDs across runs. >> >> We should be consistent in how this value is reported, let's set >> it to 0 (which is also what Linux guests expect). >> >> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > > This surely wants to go along with: > > andrewcoop@andrewcoop:/local/xen.git/xen$ git diff > diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c > index b51b51b..bdf9339 100644 > --- a/xen/arch/x86/traps.c > +++ b/xen/arch/x86/traps.c > @@ -985,6 +985,10 @@ void pv_cpuid(struct cpu_user_regs *regs) > uint32_t tmp, _ecx, _ebx; > > case 0x00000001: > + /* Fix up VLAPIC details. */ > + b &= 0x00FFFFFFu; > + b |= (curr->vcpu_id * 2) << 24; > + > c &= pv_featureset[FEATURESET_1c]; > d &= pv_featureset[FEATURESET_1d]; > > > Which brings the PV CPUID handling in line with HVM handling. Otherwise > a guest will see an APIC ID of 0 in all vcpus, which will surely confuse it. Which will still result in multiple identical APIC IDs once there are 128 or more vCPU-s in a PV guest. Jan
On 11/10/2016 11:12 AM, Jan Beulich wrote: >>>> On 10.11.16 at 15:55, <andrew.cooper3@citrix.com> wrote: >> On 10/11/16 14:50, Boris Ostrovsky wrote: >>> Currently hypervisor provides PV guest's CPUID(1).EBX[31:24] (initial >>> APIC ID) with contents of that field on the processor that launched >>> the guest. This results in the guest reporting different initial >>> APIC IDs across runs. >>> >>> We should be consistent in how this value is reported, let's set >>> it to 0 (which is also what Linux guests expect). >>> >>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> >> This surely wants to go along with: >> >> andrewcoop@andrewcoop:/local/xen.git/xen$ git diff >> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c >> index b51b51b..bdf9339 100644 >> --- a/xen/arch/x86/traps.c >> +++ b/xen/arch/x86/traps.c >> @@ -985,6 +985,10 @@ void pv_cpuid(struct cpu_user_regs *regs) >> uint32_t tmp, _ecx, _ebx; >> >> case 0x00000001: >> + /* Fix up VLAPIC details. */ >> + b &= 0x00FFFFFFu; >> + b |= (curr->vcpu_id * 2) << 24; >> + >> c &= pv_featureset[FEATURESET_1c]; >> d &= pv_featureset[FEATURESET_1d]; >> >> >> Which brings the PV CPUID handling in line with HVM handling. Otherwise >> a guest will see an APIC ID of 0 in all vcpus, which will surely confuse it. > Which will still result in multiple identical APIC IDs once there are > 128 or more vCPU-s in a PV guest. And this change (either with or without *2) makes Linux PV problem of mismatched APIC[0x20] vs CPUID(1).EBX[31:24] to be back (yes, because Linux is broken in that it currently makes APIC[0x20] return zero). You'd obviously be within your rights to say that it's up to Linux to deal with this but I will ask you to consider taking only the toolstack part of this for now. -boris
On 10/11/16 16:24, Boris Ostrovsky wrote: > On 11/10/2016 11:12 AM, Jan Beulich wrote: >>>>> On 10.11.16 at 15:55, <andrew.cooper3@citrix.com> wrote: >>> On 10/11/16 14:50, Boris Ostrovsky wrote: >>>> Currently hypervisor provides PV guest's CPUID(1).EBX[31:24] (initial >>>> APIC ID) with contents of that field on the processor that launched >>>> the guest. This results in the guest reporting different initial >>>> APIC IDs across runs. >>>> >>>> We should be consistent in how this value is reported, let's set >>>> it to 0 (which is also what Linux guests expect). >>>> >>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> >>> This surely wants to go along with: >>> >>> andrewcoop@andrewcoop:/local/xen.git/xen$ git diff >>> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c >>> index b51b51b..bdf9339 100644 >>> --- a/xen/arch/x86/traps.c >>> +++ b/xen/arch/x86/traps.c >>> @@ -985,6 +985,10 @@ void pv_cpuid(struct cpu_user_regs *regs) >>> uint32_t tmp, _ecx, _ebx; >>> >>> case 0x00000001: >>> + /* Fix up VLAPIC details. */ >>> + b &= 0x00FFFFFFu; >>> + b |= (curr->vcpu_id * 2) << 24; >>> + >>> c &= pv_featureset[FEATURESET_1c]; >>> d &= pv_featureset[FEATURESET_1d]; >>> >>> >>> Which brings the PV CPUID handling in line with HVM handling. Otherwise >>> a guest will see an APIC ID of 0 in all vcpus, which will surely confuse it. >> Which will still result in multiple identical APIC IDs once there are >> 128 or more vCPU-s in a PV guest. You can already crash Xen with fewer PV vcpus than that, due to long running for_each_vcpu() loops. The current ABI limit of 8192 is implausible for production, as is the current 128 limit for HVM guests. > > And this change (either with or without *2) makes Linux PV problem of > mismatched APIC[0x20] vs CPUID(1).EBX[31:24] to be back (yes, because > Linux is broken in that it currently makes APIC[0x20] return zero). > > You'd obviously be within your rights to say that it's up to Linux to > deal with this but I will ask you to consider taking only the toolstack > part of this for now. If the toolstack along is sufficient duct tape, perhaps best to go with just that for now. I will have to change all of this for CPUID phase-2. We can see about fixing things more correctly then. ~Andrew
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index b51b51b..bdf9339 100644 --- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -985,6 +985,10 @@ void pv_cpuid(struct cpu_user_regs *regs) uint32_t tmp, _ecx, _ebx; case 0x00000001: + /* Fix up VLAPIC details. */ + b &= 0x00FFFFFFu; + b |= (curr->vcpu_id * 2) << 24; + c &= pv_featureset[FEATURESET_1c]; d &= pv_featureset[FEATURESET_1d];