[linux,2/8] xen: introduce xen_vcpu_id mapping

Message ID 1467132449-1030-3-git-send-email-vkuznets@redhat.com (mailing list archive)
State New, archived

Commit Message

Vitaly Kuznetsov June 28, 2016, 4:47 p.m. UTC
It may happen that Xen's and Linux's ideas of vCPU id diverge. In
particular, when we crash on a secondary vCPU we may want to do kdump
and, unlike plain kexec where we do migrate_to_reboot_cpu(), we try
booting on the vCPU which crashed. This doesn't work very well for PVHVM
guests as we have a number of hypercalls where we pass the vCPU id as a
parameter. These hypercalls either fail or do something unexpected. To
solve the issue, introduce a percpu xen_vcpu_id mapping. ARM and PV
guests get a direct mapping for now. The boot CPU of a PVHVM guest gets
its id from CPUID. With secondary CPUs it is a bit trickier. Currently,
we initialize IPI vectors before these CPUs boot so we can't use CPUID.
However, we know that the physical CPU id (vLAPIC id) is Xen's vCPU
id * 2, so we can piggyback on that. Alternatively, we could disable all
secondary CPUs once we detect that Xen's and Linux's ideas of vCPU id
have diverged.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/arm/xen/enlighten.c | 10 ++++++++++
 arch/x86/xen/enlighten.c | 18 +++++++++++++++++-
 include/xen/xen-ops.h    |  1 +
 3 files changed, 28 insertions(+), 1 deletion(-)
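
For readers following the commit message, here is a minimal userspace
sketch of the idea, not the kernel code itself. It models the per-CPU
xen_vcpu_id variable as a plain array and the boot-CPU CPUID check as an
ordinary function. All model_* names and the MODEL_VCPU_ID_PRESENT value
are assumptions made for this illustration only.

```c
#define MODEL_NR_CPUS 4

/* Assumed flag bit for this model only; the kernel takes the real
 * XEN_HVM_CPUID_VCPU_ID_PRESENT from <asm/xen/cpuid.h>. */
#define MODEL_VCPU_ID_PRESENT (1u << 3)

/* Stand-in for the per-CPU xen_vcpu_id variable: the index is the Linux
 * CPU number, the value is Xen's vCPU id, and -1 means "not set yet". */
static int model_xen_vcpu_id[MODEL_NR_CPUS] = { -1, -1, -1, -1 };

/* Boot CPU: take the id from CPUID (ebx) when Xen advertises it,
 * otherwise fall back to the Linux CPU number. */
static int model_boot_vcpu_id(unsigned int eax, unsigned int ebx,
			      int linux_cpu)
{
	if (eax & MODEL_VCPU_ID_PRESENT)
		return (int)ebx;
	return linux_cpu;
}

/* Record the mapping for one Linux CPU. */
static void model_set_vcpu_id(int linux_cpu, int vcpu_id)
{
	model_xen_vcpu_id[linux_cpu] = vcpu_id;
}

/* Hypothetical helper: the vCPU id a hypercall issued on behalf of
 * linux_cpu should pass to Xen. */
static int model_hypercall_vcpu_arg(int linux_cpu)
{
	return model_xen_vcpu_id[linux_cpu];
}
```

In the kdump case described above, a kernel booting on the crashed vCPU 3
runs as Linux CPU 0, so the model would store 3 in slot 0 and hypercalls
would then name vCPU 3 rather than 0.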

Comments

Andrew Cooper June 28, 2016, 5:28 p.m. UTC | #1
On 28/06/16 17:47, Vitaly Kuznetsov wrote:
> @@ -1808,6 +1822,8 @@ static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action,
>  	int cpu = (long)hcpu;
>  	switch (action) {
>  	case CPU_UP_PREPARE:
> +		/* vLAPIC_ID == Xen's vCPU_ID * 2 for HVM guests */
> +		per_cpu(xen_vcpu_id, cpu) = cpu_physical_id(cpu) / 2;

Please do not assume or propagate this brokenness.  It is incorrect in
the general case, and I will be fixing it in the hypervisor in due course.

Always read the APIC_ID from the LAPIC, per regular hardware.

~Andrew
Vitaly Kuznetsov June 29, 2016, 12:16 p.m. UTC | #2
Andrew Cooper <andrew.cooper3@citrix.com> writes:

> On 28/06/16 17:47, Vitaly Kuznetsov wrote:
>> @@ -1808,6 +1822,8 @@ static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action,
>>  	int cpu = (long)hcpu;
>>  	switch (action) {
>>  	case CPU_UP_PREPARE:
>> +		/* vLAPIC_ID == Xen's vCPU_ID * 2 for HVM guests */
>> +		per_cpu(xen_vcpu_id, cpu) = cpu_physical_id(cpu) / 2;
>
> Please do not assume or propagate this brokenness.  It is incorrect in
> the general case, and I will be fixing it in the hypervisor in due course.
>
> Always read the APIC_ID from the LAPIC, per regular hardware.

(I'm probably missing something important - please bear with me)

The problem here is that I need to get _other_ CPU's id before any code
is executed on that CPU (or, at least, this is the current state of
affairs if you look at xen_hvm_cpu_up()) so I can't use CPUID/do MSR
reads/... The only option I see here is to rely on ACPI (MADT) data
which is stored in x86_cpu_to_apicid (and that's what cpu_physical_id()
gives us). MADT also has processor id which connects it to DSDT but I'm
not sure Linux keeps this data. But this is something fixable I guess.
Andrew Cooper June 29, 2016, 12:30 p.m. UTC | #3
On 29/06/16 13:16, Vitaly Kuznetsov wrote:
> Andrew Cooper <andrew.cooper3@citrix.com> writes:
>
>> On 28/06/16 17:47, Vitaly Kuznetsov wrote:
>>> @@ -1808,6 +1822,8 @@ static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action,
>>>  	int cpu = (long)hcpu;
>>>  	switch (action) {
>>>  	case CPU_UP_PREPARE:
>>> +		/* vLAPIC_ID == Xen's vCPU_ID * 2 for HVM guests */
>>> +		per_cpu(xen_vcpu_id, cpu) = cpu_physical_id(cpu) / 2;
>> Please do not assume or propagate this brokenness.  It is incorrect in
>> the general case, and I will be fixing it in the hypervisor in due course.
>>
>> Always read the APIC_ID from the LAPIC, per regular hardware.
> (I'm probably missing something important - please bear with me)
>
> The problem here is that I need to get _other_ CPU's id before any code
> is executed on that CPU (or, at least, this is the current state of
> affairs if you look at xen_hvm_cpu_up()) so I can't use CPUID/do MSR
> reads/... The only option I see here is to rely on ACPI (MADT) data
> which is stored in x86_cpu_to_apicid (and that's what cpu_physical_id()
> gives us). MADT also has processor id which connects it to DSDT but I'm
> not sure Linux keeps this data. But this is something fixable I guess.

Hmm yes - that is a tricky issue.

It is not safe or correct to assume that xen_vcpu_id is APICID / 2.

This is currently the case for most modern versions of Xen, but isn't
the case for older versions, and won't be the case in the future when I
(or someone else) fixes topology representation for guests.

For this to work, we need one or more of:

1) to provide the guest a full mapping from APIC_ID to vcpu id at boot time.
2) add a new interface where the guest can explicitly query "what is the
vcpu id for the entity with this APIC_ID".
3) Allow HVM guests to identify a vcpu in a hypercall by APIC_ID.

3 is the cleaner approach, but given that vcpu ids have already leaked
into an HVM domain's idea of the world, 1 or 2 is probably a better
ladder to dig us out of this hole.

~Andrew.
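
As an illustration of option 1 above, the following is a rough userspace
sketch of how a guest might consume a boot-time APIC_ID to vcpu id table.
No such interface exists in this patch; struct apic_vcpu_entry,
lookup_vcpu_id() and example_map are hypothetical names, and the table
contents are made up to show a layout where APIC_ID / 2 gives the wrong
answer.

```c
#include <stddef.h>

/* Hypothetical table entry: one APIC_ID and the vcpu id Xen uses for it. */
struct apic_vcpu_entry {
	unsigned int apic_id;
	int vcpu_id;
};

/* Linear lookup; returns -1 when the APIC_ID is not in the table. */
static int lookup_vcpu_id(const struct apic_vcpu_entry *map, size_t n,
			  unsigned int apic_id)
{
	for (size_t i = 0; i < n; i++) {
		if (map[i].apic_id == apic_id)
			return map[i].vcpu_id;
	}
	return -1;
}

/* Made-up layout where APIC_ID equals the vcpu id, i.e. not the "* 2"
 * relation the patch assumes. */
static const struct apic_vcpu_entry example_map[] = {
	{ 0, 0 }, { 1, 1 }, { 2, 2 }, { 3, 3 },
};
```

With this table, APIC_ID 2 resolves to vcpu 2, whereas the APIC_ID / 2
shortcut would pick vcpu 1. An explicit mapping avoids baking in either
layout.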
Vitaly Kuznetsov June 29, 2016, 12:50 p.m. UTC | #4
Andrew Cooper <andrew.cooper3@citrix.com> writes:

> On 29/06/16 13:16, Vitaly Kuznetsov wrote:
>> Andrew Cooper <andrew.cooper3@citrix.com> writes:
>>
>>> On 28/06/16 17:47, Vitaly Kuznetsov wrote:
>>>> @@ -1808,6 +1822,8 @@ static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action,
>>>>  	int cpu = (long)hcpu;
>>>>  	switch (action) {
>>>>  	case CPU_UP_PREPARE:
>>>> +		/* vLAPIC_ID == Xen's vCPU_ID * 2 for HVM guests */
>>>> +		per_cpu(xen_vcpu_id, cpu) = cpu_physical_id(cpu) / 2;
>>> Please do not assume or propagate this brokenness.  It is incorrect in
>>> the general case, and I will be fixing it in the hypervisor in due course.
>>>
>>> Always read the APIC_ID from the LAPIC, per regular hardware.
>> (I'm probably missing something important - please bear with me)
>>
>> The problem here is that I need to get _other_ CPU's id before any code
>> is executed on that CPU (or, at least, this is the current state of
>> affairs if you look at xen_hvm_cpu_up()) so I can't use CPUID/do MSR
>> reads/... The only option I see here is to rely on ACPI (MADT) data
>> which is stored in x86_cpu_to_apicid (and that's what cpu_physical_id()
>> gives us). MADT also has processor id which connects it to DSDT but I'm
>> not sure Linux keeps this data. But this is something fixable I guess.
>
> Hmm yes - that is a tricky issue.
>
> It is not safe or correct to assume that xen_vcpu_id is APICID / 2.
>
> This is currently the case for most modern versions of Xen, but isn't
> the case for older versions, and won't be the case in the future when I
> (or someone else) fixes topology representation for guests.
>
> For this to work, we need one or more of:
>
> 1) to provide the guest a full mapping from APIC_ID to vcpu id at boot
> time.

So can we rely on ACPI data? Especially on MADT and processor ids there?
I think we can always guarantee that processor ids there match vCPU
ids. If yes I can try saving this data when we parse MADT.

> 2) add a new interface where the guest can explicitly query "what is the
> vcpu id for the entity with this APIC_ID".
> 3) Allow HVM guests to identify a vcpu in a hypercall by APIC_ID.
>
> 3 is the cleaner approach, but given that vcpu ids have already leaked
> into an HVM domain's idea of the world, 1 or 2 is probably a better
> ladder to dig us out of this hole.

It would be nice to avoid hypervisor changes but if we have to modify it
we can fail all secondary CPUs for now when we detect that CPU0's vCPU
id is not 0 (and CPU0 gets its id with CPUID).
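
The fallback proposed here could look roughly like the sketch below. It
is a userspace model under the assumption that secondary CPUs keep the
direct mapping whenever CPU0's vCPU id is 0 and are refused otherwise.
model_prepare_secondary() is a hypothetical name, not a function from
the patch.

```c
#include <errno.h>

/*
 * Model of a CPU_UP_PREPARE step for a secondary CPU.
 * boot_vcpu_id is the id CPU0 obtained via CPUID.
 * Returns 0 and stores the direct mapping in *vcpu_id_out when the ids
 * have not diverged, or -ENODEV (leaving *vcpu_id_out untouched) when
 * they have.
 */
static int model_prepare_secondary(int cpu, int boot_vcpu_id,
				   int *vcpu_id_out)
{
	if (boot_vcpu_id != 0)
		return -ENODEV;	/* ids diverged: refuse to bring the CPU up */

	*vcpu_id_out = cpu;	/* ids agree: direct mapping is safe */
	return 0;
}
```

This trades SMP in the kdump kernel for correctness: a crash kernel that
started on a secondary vCPU would run uniprocessor rather than guess at
the ids of the other vCPUs.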

Patch

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 75cd734..ea99ca2 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -46,6 +46,10 @@  struct shared_info *HYPERVISOR_shared_info = (void *)&xen_dummy_shared_info;
 DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
 static struct vcpu_info __percpu *xen_vcpu_info;
 
+/* Linux <-> Xen vCPU id mapping */
+DEFINE_PER_CPU(int, xen_vcpu_id) = -1;
+EXPORT_SYMBOL_GPL(xen_vcpu_id);
+
 /* These are unused until we support booting "pre-ballooned" */
 unsigned long xen_released_pages;
 struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
@@ -179,6 +183,9 @@  static void xen_percpu_init(void)
 	pr_info("Xen: initializing cpu%d\n", cpu);
 	vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
 
+	/* Direct vCPU id mapping for ARM guests. */
+	per_cpu(xen_vcpu_id, cpu) = cpu;
+
 	info.mfn = virt_to_gfn(vcpup);
 	info.offset = xen_offset_in_page(vcpup);
 
@@ -328,6 +335,9 @@  static int __init xen_guest_init(void)
 	if (xen_vcpu_info == NULL)
 		return -ENOMEM;
 
+	/* Direct vCPU id mapping for ARM guests. */
+	per_cpu(xen_vcpu_id, 0) = 0;
+
 	if (gnttab_setup_auto_xlat_frames(grant_frames)) {
 		free_percpu(xen_vcpu_info);
 		return -ENOMEM;
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 760789a..69f4c0c 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -59,6 +59,7 @@ 
 #include <asm/xen/pci.h>
 #include <asm/xen/hypercall.h>
 #include <asm/xen/hypervisor.h>
+#include <asm/xen/cpuid.h>
 #include <asm/fixmap.h>
 #include <asm/processor.h>
 #include <asm/proto.h>
@@ -118,6 +119,10 @@  DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
  */
 DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
 
+/* Linux <-> Xen vCPU id mapping */
+DEFINE_PER_CPU(int, xen_vcpu_id) = -1;
+EXPORT_SYMBOL_GPL(xen_vcpu_id);
+
 enum xen_domain_type xen_domain_type = XEN_NATIVE;
 EXPORT_SYMBOL_GPL(xen_domain_type);
 
@@ -1137,8 +1142,11 @@  void xen_setup_vcpu_info_placement(void)
 {
 	int cpu;
 
-	for_each_possible_cpu(cpu)
+	for_each_possible_cpu(cpu) {
+		/* Set up direct vCPU id mapping for PV guests. */
+		per_cpu(xen_vcpu_id, cpu) = cpu;
 		xen_vcpu_setup(cpu);
+	}
 
 	/* xen_vcpu_setup managed to place the vcpu_info within the
 	 * percpu area for all cpus, so make use of it. Note that for
@@ -1797,6 +1805,12 @@  static void __init init_hvm_pv_info(void)
 
 	xen_setup_features();
 
+	cpuid(base + 4, &eax, &ebx, &ecx, &edx);
+	if (eax & XEN_HVM_CPUID_VCPU_ID_PRESENT)
+		this_cpu_write(xen_vcpu_id, ebx);
+	else
+		this_cpu_write(xen_vcpu_id, smp_processor_id());
+
 	pv_info.name = "Xen HVM";
 
 	xen_domain_type = XEN_HVM_DOMAIN;
@@ -1808,6 +1822,8 @@  static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action,
 	int cpu = (long)hcpu;
 	switch (action) {
 	case CPU_UP_PREPARE:
+		/* vLAPIC_ID == Xen's vCPU_ID * 2 for HVM guests */
+		per_cpu(xen_vcpu_id, cpu) = cpu_physical_id(cpu) / 2;
 		xen_vcpu_setup(cpu);
 		if (xen_have_vector_callback) {
 			if (xen_feature(XENFEAT_hvm_safe_pvclock))
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index 86abe07..b02a343 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -8,6 +8,7 @@ 
 #include <xen/interface/vcpu.h>
 
 DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
+DECLARE_PER_CPU(int, xen_vcpu_id);
 
 void xen_arch_pre_suspend(void);
 void xen_arch_post_suspend(int suspend_cancelled);