diff mbox

[v12,3/7] x86, acpi, cpu-hotplug: Introduce cpuid_to_apicid[] array to store persistent cpuid <-> apicid mapping.

Message ID 1472114120-3281-4-git-send-email-douly.fnst@cn.fujitsu.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Dou Liyang Aug. 25, 2016, 8:35 a.m. UTC
From: Gu Zheng <guz.fnst@cn.fujitsu.com>

The whole patch-set aims at making cpuid <-> nodeid mapping persistent. So that,
when node online/offline happens, cache based on cpuid <-> nodeid mapping such as
wq_numa_possible_cpumask will not cause any problem.
It contains 4 steps:
1. Enable apic registeration flow to handle both enabled and disabled cpus.
2. Introduce a new array storing all possible cpuid <-> apicid mapping.
3. Enable _MAT and MADT relative apis to return non-presnet or disabled cpus' apicid.
4. Establish all possible cpuid <-> nodeid mapping.

This patch finishes step 2.

In this patch, we introduce a new static array named cpuid_to_apicid[],
which is large enough to store info for all possible cpus.

And then, we modify the cpuid calculation. In generic_processor_info(),
it simply finds the next unused cpuid. And it is also why the cpuid <-> nodeid
mapping changes with node hotplug.

After this patch, we find the next unused cpuid, map it to an apicid,
and store the mapping in cpuid_to_apicid[], so that cpuid <-> apicid
mapping will be persistent.

And finally we will use this array to make cpuid <-> nodeid persistent.

cpuid <-> apicid mapping is established at local apic registeration time.
But non-present or disabled cpus are ignored.

In this patch, we establish all possible cpuid <-> apicid mapping when
registering local apic.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
---
 arch/x86/include/asm/mpspec.h |  1 +
 arch/x86/kernel/acpi/boot.c   |  7 +----
 arch/x86/kernel/apic/apic.c   | 61 ++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 60 insertions(+), 9 deletions(-)

Comments

Dou Liyang Oct. 7, 2016, 4:35 a.m. UTC | #1
Hi Yinghai

At 10/07/2016 05:20 AM, Yinghai Lu wrote:
> On Thu, Oct 6, 2016 at 1:06 AM, Dou Liyang <douly.fnst@cn.fujitsu.com> wrote:
>
>> I seem to remember that in x2APIC Spec the x2APIC ID may be at 255 or
>> greater.
>
> Good to know. Maybe later when one package have more cores like 30 cores etc.
>
>> If we do that judgment, it may be affect x2APIC's work in some other places.
>>
>> I saw the MADT, the main reason may be that we define 0xff to acpi_id
>> in LAPIC mode.
>> As you said, it was like:
>> [   42.107902] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
>> [   42.120125] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
>> [   42.132361] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
>> ...
>>
>> How about doing the acpi_id check when we parse it in
>> acpi_parse_lapic().
>>
>> 8<----------------
>>
>> --- a/arch/x86/kernel/acpi/boot.c
>> +++ b/arch/x86/kernel/acpi/boot.c
>> @@ -233,6 +233,11 @@ acpi_parse_lapic(struct acpi_subtable_header * header,
>> const unsigned long end)
>>
>>         acpi_table_print_madt_entry(header);
>>
>> +       if (processor->id >= 255) {
>> +               ++disabled_cpus;
>> +               return -EINVAL;
>> +       }
>> +
>>         /*
>>          * We need to register disabled CPU as well to permit
>>          * counting disabled CPUs. This allows us to size
>
> Yes, that should work. but should do the same thing for x2apic
>
> in acpi_parse_x2apic should have
>
>> +       if (processor->local_apic_id == -1) {
>> +               ++disabled_cpus;
>> +               return -EINVAL;
>> +       }
>
> that is the reason why i want to extend acpi_register_lapic()
> to take extra disabled_id (one is 0xff and another is 0xffffffff)
> so could save some lines.
>

Yes, I understood.
But I think adding an extra disabled_id is not a good way for
validating the apic_id. If the disabled_id is not just one id(-1 or
255), may be two or more, even be a range. what should we do for
extending our code?

Firstly, I am not sure that the "-1" could appear in the MADT, even if
the ACPI tables is unreasonable.

Seondly, I guess if we need the check, there are some reserved methods
in the kernel, such as "default_apic_id_valid", "x2apic_apic_id_valid"
and so on. we should extend all of them and use them for check.


CC'ed: Rafael and Lv

May I ask a question?

Is it possible that the "-1/oxffffffff" could appear in the MADT which 
is one of the ACPI tables?


> Thanks
>
> Yinghai
>
>


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Oct. 7, 2016, 12:50 p.m. UTC | #2
On Fri, 7 Oct 2016, Dou Liyang wrote:
> Is it possible that the "-1/oxffffffff" could appear in the MADT which is one
> of the ACPI tables?

According to the SDM the x2apic id is a 32bit ID, so 0xffffffff is a
legitimate value.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Oct. 7, 2016, 1 p.m. UTC | #3
On Fri, 7 Oct 2016, Thomas Gleixner wrote:
> On Fri, 7 Oct 2016, Dou Liyang wrote:
> > Is it possible that the "-1/oxffffffff" could appear in the MADT which is one
> > of the ACPI tables?
> 
> According to the SDM the x2apic id is a 32bit ID, so 0xffffffff is a
> legitimate value.

The ACPI spec says that bit 0 of the x2apic flags field tells whether the
logical processor is present or not. So the proper check for x2apic is that
flag.

The lapic structure has the same flag, but the kernel ignores the flags for
both lapic and x2apic.

I'm going to apply the minimal fix of checking for id == 0xff in
acpi_lapic_parse() for now, but this needs to be revisited and fixed
proper.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Oct. 7, 2016, 6:55 p.m. UTC | #4
On Fri, Oct 7, 2016 at 6:00 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Fri, 7 Oct 2016, Thomas Gleixner wrote:
>> On Fri, 7 Oct 2016, Dou Liyang wrote:
>> > Is it possible that the "-1/oxffffffff" could appear in the MADT which is one
>> > of the ACPI tables?
>>
>> According to the SDM the x2apic id is a 32bit ID, so 0xffffffff is a
>> legitimate value.
>
> The ACPI spec says that bit 0 of the x2apic flags field tells whether the
> logical processor is present or not. So the proper check for x2apic is that
> flag.
>
> The lapic structure has the same flag, but the kernel ignores the flags for
> both lapic and x2apic.
>
> I'm going to apply the minimal fix of checking for id == 0xff in
> acpi_lapic_parse() for now, but this needs to be revisited and fixed
> proper.

Good to me. Thanks for fixing it.

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dou Liyang Oct. 8, 2016, 5:22 a.m. UTC | #5
Hi tglx,

At 10/07/2016 09:00 PM, Thomas Gleixner wrote:
> On Fri, 7 Oct 2016, Thomas Gleixner wrote:
>> On Fri, 7 Oct 2016, Dou Liyang wrote:
>>> Is it possible that the "-1/oxffffffff" could appear in the MADT which is one
>>> of the ACPI tables?
>>
>> According to the SDM the x2apic id is a 32bit ID, so 0xffffffff is a
>> legitimate value.

Yes, I see.

>
> The ACPI spec says that bit 0 of the x2apic flags field tells whether the
> logical processor is present or not. So the proper check for x2apic is that
> flag.
>
> The lapic structure has the same flag, but the kernel ignores the flags for
> both lapic and x2apic.

It seems the kernel uses the flags in this sentence:

	enabled = processor->lapic_flags & ACPI_MADT_ENABLED;


>
> I'm going to apply the minimal fix of checking for id == 0xff in
> acpi_lapic_parse() for now, but this needs to be revisited and fixed
> proper.

Yes, I will do it.


Thanks

	Dou.


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h
index b07233b..db902d8 100644
--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -86,6 +86,7 @@  static inline void early_reserve_e820_mpc_new(void) { }
 #endif
 
 int generic_processor_info(int apicid, int version);
+int __generic_processor_info(int apicid, int version, bool enabled);
 
 #define PHYSID_ARRAY_SIZE	BITS_TO_LONGS(MAX_LOCAL_APIC)
 
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 90d84c3..abd939c 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -176,15 +176,10 @@  static int acpi_register_lapic(int id, u32 acpiid, u8 enabled)
 		return -EINVAL;
 	}
 
-	if (!enabled) {
-		++disabled_cpus;
-		return -EINVAL;
-	}
-
 	if (boot_cpu_physical_apicid != -1U)
 		ver = apic_version[boot_cpu_physical_apicid];
 
-	cpu = generic_processor_info(id, ver);
+	cpu = __generic_processor_info(id, ver, enabled);
 	if (cpu >= 0)
 		early_per_cpu(x86_cpu_to_acpiid, cpu) = acpiid;
 
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index e5612a9..7aa9863 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2024,7 +2024,53 @@  void disconnect_bsp_APIC(int virt_wire_setup)
 	apic_write(APIC_LVT1, value);
 }
 
-static int __generic_processor_info(int apicid, int version, bool enabled)
+/*
+ * The number of allocated logical CPU IDs. Since logical CPU IDs are allocated
+ * contiguously, it equals to current allocated max logical CPU ID plus 1.
+ * All allocated CPU ID should be in [0, nr_logical_cpuidi), so the maximum of
+ * nr_logical_cpuids is nr_cpu_ids.
+ *
+ * NOTE: Reserve 0 for BSP.
+ */
+static int nr_logical_cpuids = 1;
+
+/*
+ * Used to store mapping between logical CPU IDs and APIC IDs.
+ */
+static int cpuid_to_apicid[] = {
+	[0 ... NR_CPUS - 1] = -1,
+};
+
+/*
+ * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
+ * and cpuid_to_apicid[] synchronized.
+ */
+static int allocate_logical_cpuid(int apicid)
+{
+	int i;
+
+	/*
+	 * cpuid <-> apicid mapping is persistent, so when a cpu is up,
+	 * check if the kernel has allocated a cpuid for it.
+	 */
+	for (i = 0; i < nr_logical_cpuids; i++) {
+		if (cpuid_to_apicid[i] == apicid)
+			return i;
+	}
+
+	/* Allocate a new cpuid. */
+	if (nr_logical_cpuids >= nr_cpu_ids) {
+		WARN_ONCE(1, "Only %d processors supported."
+			     "Processor %d/0x%x and the rest are ignored.\n",
+			     nr_cpu_ids - 1, nr_logical_cpuids, apicid);
+		return -1;
+	}
+
+	cpuid_to_apicid[nr_logical_cpuids] = apicid;
+	return nr_logical_cpuids++;
+}
+
+int __generic_processor_info(int apicid, int version, bool enabled)
 {
 	int cpu, max = nr_cpu_ids;
 	bool boot_cpu_detected = physid_isset(boot_cpu_physical_apicid,
@@ -2099,8 +2145,17 @@  static int __generic_processor_info(int apicid, int version, bool enabled)
 		 * for BSP.
 		 */
 		cpu = 0;
-	} else
-		cpu = cpumask_next_zero(-1, cpu_present_mask);
+
+		/* Logical cpuid 0 is reserved for BSP. */
+		cpuid_to_apicid[0] = apicid;
+	} else {
+		cpu = allocate_logical_cpuid(apicid);
+		if (cpu < 0) {
+
+			disabled_cpus++;
+			return -EINVAL;
+		}
+	}
 
 	/*
 	 * This can happen on physical hotplug. The sanity check at boot time