Message ID | 1467132449-1030-3-git-send-email-vkuznets@redhat.com (mailing list archive) |
---|---|
State | New, archived |
On 28/06/16 17:47, Vitaly Kuznetsov wrote:
> @@ -1808,6 +1822,8 @@ static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action,
>  	int cpu = (long)hcpu;
>  	switch (action) {
>  	case CPU_UP_PREPARE:
> +		/* vLAPIC_ID == Xen's vCPU_ID * 2 for HVM guests */
> +		per_cpu(xen_vcpu_id, cpu) = cpu_physical_id(cpu) / 2;

Please do not assume or propagate this brokenness. It is incorrect in
the general case, and I will be fixing in the hypervisor in due course.

Always read the APIC_ID from the LAPIC, per regular hardware.

~Andrew

Andrew Cooper <andrew.cooper3@citrix.com> writes:

> On 28/06/16 17:47, Vitaly Kuznetsov wrote:
>> @@ -1808,6 +1822,8 @@ static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action,
>>  	int cpu = (long)hcpu;
>>  	switch (action) {
>>  	case CPU_UP_PREPARE:
>> +		/* vLAPIC_ID == Xen's vCPU_ID * 2 for HVM guests */
>> +		per_cpu(xen_vcpu_id, cpu) = cpu_physical_id(cpu) / 2;
>
> Please do not assume or propagate this brokenness. It is incorrect in
> the general case, and I will be fixing in the hypervisor in due course.
>
> Always read the APIC_ID from the LAPIC, per regular hardware.

(I'm probably missing something important - please bear with me)

The problem here is that I need to get the _other_ CPU's id before any
code is executed on that CPU (or, at least, this is the current state of
affairs if you look at xen_hvm_cpu_up()), so I can't use CPUID, do MSR
reads, etc. The only option I see here is to rely on ACPI (MADT) data,
which is stored in x86_cpu_to_apicid (and that's what cpu_physical_id()
gives us). MADT also has a processor id which connects it to the DSDT,
but I'm not sure Linux keeps this data. But this is something fixable,
I guess.

On 29/06/16 13:16, Vitaly Kuznetsov wrote:
> Andrew Cooper <andrew.cooper3@citrix.com> writes:
>
>> On 28/06/16 17:47, Vitaly Kuznetsov wrote:
>>> @@ -1808,6 +1822,8 @@ static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action,
>>>  	int cpu = (long)hcpu;
>>>  	switch (action) {
>>>  	case CPU_UP_PREPARE:
>>> +		/* vLAPIC_ID == Xen's vCPU_ID * 2 for HVM guests */
>>> +		per_cpu(xen_vcpu_id, cpu) = cpu_physical_id(cpu) / 2;
>> Please do not assume or propagate this brokenness. It is incorrect in
>> the general case, and I will be fixing in the hypervisor in due course.
>>
>> Always read the APIC_ID from the LAPIC, per regular hardware.
> (I'm probably missing something important - please bear with me)
>
> The problem here is that I need to get _other_ CPU's id before any code
> is executed on that CPU (or, at least, this is the current state of
> affairs if you look at xen_hvm_cpu_up()) so I can't use CPUID/do MSR
> reads/... The only option I see here is to rely on ACPI (MADT) data
> which is stored in x86_cpu_to_apicid (and that's what cpu_physical_id()
> gives us). MADT also has processor id which connects it to DSDT but I'm
> not sure Linux keeps this data. But this is something fixable I guess.

Hmm yes - that is a tricky issue.

It is not safe or correct to assume that xen_vcpu_id is APICID / 2.

This is currently the case for most modern versions of Xen, but isn't
the case for older versions, and won't be the case in the future when I
(or someone else) fixes topology representation for guests.

For this to work, we need one or more of:

1) to provide the guest a full mapping from APIC_ID to vcpu id at boot
time.
2) add a new interface where the guest can explicitly query "what is the
vcpu id for the entity with this APIC_ID".
3) Allow HVM guests to identify a vcpu in a hypercall by APIC_ID.

3 is the cleaner approach, but given that vcpu ids have already leaked
into an HVM domain's idea of the world, 1 or 2 is probably a better
ladder to dig us out of this hole.

~Andrew.

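To make option 1 concrete, here is a minimal userspace sketch of what a guest-side lookup over such a boot-time table could look like. It is not kernel code, and nothing in it is an existing Xen or Linux interface: `struct apic_vcpu_map_entry`, `VCPU_ID_UNKNOWN` and `apic_id_to_vcpu_id()` are hypothetical names, and how the table would actually reach the guest is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry of a boot-time APIC_ID -> vcpu id table (option 1). */
struct apic_vcpu_map_entry {
	uint32_t apic_id;
	uint32_t vcpu_id;
};

/* Returned when the table has no entry for the requested APIC_ID. */
#define VCPU_ID_UNKNOWN (-1)

/*
 * Linear lookup of the vcpu id for a given APIC_ID. No arithmetic
 * relation between the two ids is assumed; the table is the only
 * source of truth.
 */
static int apic_id_to_vcpu_id(const struct apic_vcpu_map_entry *map,
			      size_t nr, uint32_t apic_id)
{
	assert(map != NULL || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		if (map[i].apic_id == apic_id)
			return (int)map[i].vcpu_id;
	}
	return VCPU_ID_UNKNOWN;
}
```

With a table like this the guest would never have to derive a vcpu id from an APIC_ID by division, so a change in how Xen lays out APIC IDs would not break it.
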
Andrew Cooper <andrew.cooper3@citrix.com> writes:

> On 29/06/16 13:16, Vitaly Kuznetsov wrote:
>> Andrew Cooper <andrew.cooper3@citrix.com> writes:
>>
>>> On 28/06/16 17:47, Vitaly Kuznetsov wrote:
>>>> @@ -1808,6 +1822,8 @@ static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action,
>>>>  	int cpu = (long)hcpu;
>>>>  	switch (action) {
>>>>  	case CPU_UP_PREPARE:
>>>> +		/* vLAPIC_ID == Xen's vCPU_ID * 2 for HVM guests */
>>>> +		per_cpu(xen_vcpu_id, cpu) = cpu_physical_id(cpu) / 2;
>>> Please do not assume or propagate this brokenness. It is incorrect in
>>> the general case, and I will be fixing in the hypervisor in due course.
>>>
>>> Always read the APIC_ID from the LAPIC, per regular hardware.
>> (I'm probably missing something important - please bear with me)
>>
>> The problem here is that I need to get _other_ CPU's id before any code
>> is executed on that CPU (or, at least, this is the current state of
>> affairs if you look at xen_hvm_cpu_up()) so I can't use CPUID/do MSR
>> reads/... The only option I see here is to rely on ACPI (MADT) data
>> which is stored in x86_cpu_to_apicid (and that's what cpu_physical_id()
>> gives us). MADT also has processor id which connects it to DSDT but I'm
>> not sure Linux keeps this data. But this is something fixable I guess.
>
> Hmm yes - that is a tricky issue.
>
> It is not safe or correct to assume that xen_vcpu_id is APICID / 2.
>
> This is currently the case for most modern versions of Xen, but isn't
> the case for older versions, and won't be the case in the future when I
> (or someone else) fixes topology representation for guests.
>
> For this to work, we need one or more of:
>
> 1) to provide the guest a full mapping from APIC_ID to vcpu id at boot
> time.

So can we rely on ACPI data? Especially on the MADT and the processor
ids there? I think we can always guarantee that the processor ids there
match vCPU ids. If yes, I can try saving this data when we parse the
MADT.

> 2) add a new interface where the guest can explicitly query "what is the
> vcpu id for the entity with this APIC_ID".
> 3) Allow HVM guests to identify a vcpu in a hypercall by APIC_ID.
>
> 3 is the cleaner approach, but given that vcpu ids have already leaked
> into an HVM domain's idea of the world, 1 or 2 is probably a better
> ladder to dig us out of this hole.

It would be nice to avoid hypervisor changes, but if we have to modify
the hypervisor we can, for now, fail all secondary CPUs when we detect
that CPU0's vCPU id is not 0 (and CPU0 gets its id with CPUID).

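The fallback in the last paragraph (refuse to bring up secondary CPUs once CPU0 is known not to be vCPU 0) can be sketched as a small decision function. This is a userspace illustration only: `secondary_vcpu_id()` and `VCPU_ID_INVALID` are hypothetical names, and the identity mapping used in the non-diverged case is an assumption of this sketch (the posted patch derives secondary ids from the APIC ID instead).

```c
#include <assert.h>

/* Marks a CPU whose vCPU id could not be determined safely. */
#define VCPU_ID_INVALID (-1)

/*
 * Decide which vCPU id a secondary Linux CPU should use.
 *
 * boot_vcpu_id: the vCPU id the boot CPU obtained via CPUID.
 * cpu:          Linux CPU number of the secondary CPU (> 0).
 *
 * If the boot CPU is not vCPU 0, Linux and Xen numbering have diverged
 * (e.g. kdump after a crash on a secondary vCPU), so the caller is
 * expected to fail the bring-up of this CPU. Otherwise this sketch
 * ASSUMES the identity mapping still holds.
 */
static int secondary_vcpu_id(int boot_vcpu_id, int cpu)
{
	assert(cpu > 0);

	if (boot_vcpu_id != 0)
		return VCPU_ID_INVALID;

	return cpu;
}
```

A caller in the CPU bring-up path would treat `VCPU_ID_INVALID` as "do not online this CPU", which leaves a diverged guest running on a single CPU rather than issuing hypercalls against the wrong vCPU.
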
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 75cd734..ea99ca2 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -46,6 +46,10 @@ struct shared_info *HYPERVISOR_shared_info = (void *)&xen_dummy_shared_info;
 DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
 static struct vcpu_info __percpu *xen_vcpu_info;
 
+/* Linux <-> Xen vCPU id mapping */
+DEFINE_PER_CPU(int, xen_vcpu_id) = -1;
+EXPORT_SYMBOL_GPL(xen_vcpu_id);
+
 /* These are unused until we support booting "pre-ballooned" */
 unsigned long xen_released_pages;
 struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
@@ -179,6 +183,9 @@ static void xen_percpu_init(void)
 	pr_info("Xen: initializing cpu%d\n", cpu);
 	vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
 
+	/* Direct vCPU id mapping for ARM guests. */
+	per_cpu(xen_vcpu_id, cpu) = cpu;
+
 	info.mfn = virt_to_gfn(vcpup);
 	info.offset = xen_offset_in_page(vcpup);
 
@@ -328,6 +335,9 @@ static int __init xen_guest_init(void)
 	if (xen_vcpu_info == NULL)
 		return -ENOMEM;
 
+	/* Direct vCPU id mapping for ARM guests. */
+	per_cpu(xen_vcpu_id, 0) = 0;
+
 	if (gnttab_setup_auto_xlat_frames(grant_frames)) {
 		free_percpu(xen_vcpu_info);
 		return -ENOMEM;
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 760789a..69f4c0c 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -59,6 +59,7 @@
 #include <asm/xen/pci.h>
 #include <asm/xen/hypercall.h>
 #include <asm/xen/hypervisor.h>
+#include <asm/xen/cpuid.h>
 #include <asm/fixmap.h>
 #include <asm/processor.h>
 #include <asm/proto.h>
@@ -118,6 +119,10 @@ DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
  */
 DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
 
+/* Linux <-> Xen vCPU id mapping */
+DEFINE_PER_CPU(int, xen_vcpu_id) = -1;
+EXPORT_SYMBOL_GPL(xen_vcpu_id);
+
 enum xen_domain_type xen_domain_type = XEN_NATIVE;
 EXPORT_SYMBOL_GPL(xen_domain_type);
 
@@ -1137,8 +1142,11 @@ void xen_setup_vcpu_info_placement(void)
 {
 	int cpu;
 
-	for_each_possible_cpu(cpu)
+	for_each_possible_cpu(cpu) {
+		/* Set up direct vCPU id mapping for PV guests. */
+		per_cpu(xen_vcpu_id, cpu) = cpu;
 		xen_vcpu_setup(cpu);
+	}
 
 	/* xen_vcpu_setup managed to place the vcpu_info within the
 	 * percpu area for all cpus, so make use of it. Note that for
@@ -1797,6 +1805,12 @@ static void __init init_hvm_pv_info(void)
 
 	xen_setup_features();
 
+	cpuid(base + 4, &eax, &ebx, &ecx, &edx);
+	if (eax & XEN_HVM_CPUID_VCPU_ID_PRESENT)
+		this_cpu_write(xen_vcpu_id, ebx);
+	else
+		this_cpu_write(xen_vcpu_id, smp_processor_id());
+
 	pv_info.name = "Xen HVM";
 
 	xen_domain_type = XEN_HVM_DOMAIN;
@@ -1808,6 +1822,8 @@ static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action,
 	int cpu = (long)hcpu;
 	switch (action) {
 	case CPU_UP_PREPARE:
+		/* vLAPIC_ID == Xen's vCPU_ID * 2 for HVM guests */
+		per_cpu(xen_vcpu_id, cpu) = cpu_physical_id(cpu) / 2;
 		xen_vcpu_setup(cpu);
 		if (xen_have_vector_callback) {
 			if (xen_feature(XENFEAT_hvm_safe_pvclock))
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index 86abe07..b02a343 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -8,6 +8,7 @@
 #include <xen/interface/vcpu.h>
 
 DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
+DECLARE_PER_CPU(int, xen_vcpu_id);
 
 void xen_arch_pre_suspend(void);
 void xen_arch_post_suspend(int suspend_cancelled);

It may happen that Xen's and Linux's ideas of vCPU id diverge. In
particular, when we crash on a secondary vCPU we may want to do kdump
and, unlike plain kexec where we do migrate_to_reboot_cpu(), we try
booting on the vCPU which crashed. This doesn't work very well for
PVHVM guests as we have a number of hypercalls where we pass the vCPU id
as a parameter. These hypercalls either fail or do something
unexpected.

To solve the issue, introduce a percpu xen_vcpu_id mapping. ARM and PV
guests get a direct mapping for now. The boot CPU of a PVHVM guest gets
its id from CPUID. With secondary CPUs it is a bit trickier. Currently,
we initialize IPI vectors before these CPUs boot, so we can't use CPUID.
However, we know that the physical CPU id (vLAPIC id) is Xen's vCPU
id * 2, so we can piggyback on that. Alternatively, we could disable
all secondary CPUs once we detect that Xen's and Linux's ideas of vCPU
id have diverged.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/arm/xen/enlighten.c | 10 ++++++++++
 arch/x86/xen/enlighten.c | 18 +++++++++++++++++-
 include/xen/xen-ops.h    |  1 +
 3 files changed, 28 insertions(+), 1 deletion(-)
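
The rules described in this commit message can be summarised in a small userspace model. This is only an illustration of the mapping policy, not the kernel code: `model_xen_vcpu_id()`, `enum guest_type` and `CPUID_VCPU_ID_ABSENT` are hypothetical names, and the secondary-CPU branch reproduces the patch's "APIC ID / 2" assumption, which is disputed in the review comments.

```c
#include <assert.h>

enum guest_type { GUEST_PV, GUEST_ARM, GUEST_PVHVM };

/* Stand-in for "the hypervisor does not report a vCPU id via CPUID". */
#define CPUID_VCPU_ID_ABSENT (-1)

/*
 * Model of how the patch fills xen_vcpu_id for one Linux CPU.
 *
 * cpu:           Linux CPU number.
 * is_boot_cpu:   non-zero for the CPU the kernel booted on.
 * cpuid_vcpu_id: vCPU id reported by CPUID on the boot CPU, or
 *                CPUID_VCPU_ID_ABSENT.
 * apic_id:       APIC ID known from ACPI (MADT) for this CPU.
 */
static int model_xen_vcpu_id(enum guest_type type, int cpu, int is_boot_cpu,
			     int cpuid_vcpu_id, int apic_id)
{
	assert(cpu >= 0);

	/* ARM and PV guests get a direct mapping. */
	if (type != GUEST_PVHVM)
		return cpu;

	/* PVHVM boot CPU: trust CPUID if it reports an id. */
	if (is_boot_cpu) {
		if (cpuid_vcpu_id != CPUID_VCPU_ID_ABSENT)
			return cpuid_vcpu_id;
		return cpu;
	}

	/* PVHVM secondary CPU: the patch's (disputed) APIC ID / 2 rule. */
	return apic_id / 2;
}
```

In the kdump case from the commit message, a kernel that boots on the crashed vCPU 3 has Linux CPU 0 mapped to vCPU 3, so `model_xen_vcpu_id(GUEST_PVHVM, 0, 1, 3, 6)` yields 3 rather than 0.
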