diff mbox

[4/5] powernv:idle: Move initialization of sibling pacas to pnv_alloc_idle_core_states

Message ID 1499272696-28751-5-git-send-email-ego@linux.vnet.ibm.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Gautham R Shenoy July 5, 2017, 4:38 p.m. UTC
From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

On POWER9 DD1, in order to get around a hardware issue, we store in
every CPU thread's paca the paca pointers of all its siblings.

Move this code into pnv_alloc_idle_core_states() soon after the space
for saving the sibling pacas is allocated.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/idle.c | 45 +++++++++++++++++------------------
 1 file changed, 22 insertions(+), 23 deletions(-)

Comments

Nicholas Piggin July 6, 2017, 3:16 p.m. UTC | #1
On Wed,  5 Jul 2017 22:08:15 +0530
"Gautham R. Shenoy" <ego@linux.vnet.ibm.com> wrote:

> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> On POWER9 DD1, in order to get around a hardware issue, we store in
> every CPU thread's paca the paca pointers of all its siblings.
> 
> Move this code into pnv_alloc_idle_core_states() soon after the space
> for saving the sibling pacas is allocated.
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

> -	if (cpu_has_feature(CPU_FTR_POWER9_DD1)) {
> -		int cpu;
> -
> -		pr_info("powernv: idle: Saving PACA pointers of all CPUs in their thread sibling PACA\n");
> -		for_each_possible_cpu(cpu) {
> -			int base_cpu = cpu_first_thread_sibling(cpu);
> -			int idx = cpu_thread_in_core(cpu);
> -			int i;
> -

You could move the thread_sibling_pacas allocation to here?

Speaking of which... core_idle_state and thread_sibling_pacas are
allocated with kmalloc_node... What happens if we take an SLB miss
in the idle wakeup code on these guys? Nothing good I think. Perhaps
we should put them into the pacas or somewhere in bolted memory.

Good cleanup though.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Gautham R Shenoy July 7, 2017, 3:04 p.m. UTC | #2
On Fri, Jul 07, 2017 at 01:16:09AM +1000, Nicholas Piggin wrote:
> On Wed,  5 Jul 2017 22:08:15 +0530
> "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> wrote:
> 
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> > 
> > On POWER9 DD1, in order to get around a hardware issue, we store in
> > every CPU thread's paca the paca pointers of all its siblings.
> > 
> > Move this code into pnv_alloc_idle_core_states() soon after the space
> > for saving the sibling pacas is allocated.
> > 
> > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> 
> > -	if (cpu_has_feature(CPU_FTR_POWER9_DD1)) {
> > -		int cpu;
> > -
> > -		pr_info("powernv: idle: Saving PACA pointers of all CPUs in their thread sibling PACA\n");
> > -		for_each_possible_cpu(cpu) {
> > -			int base_cpu = cpu_first_thread_sibling(cpu);
> > -			int idx = cpu_thread_in_core(cpu);
> > -			int i;
> > -
> 
> You could move the thread_sibling_pacas allocation to here?
> 
> Speaking of which... core_idle_state and thread_sibling_pacas are
> allocated with kmalloc_node... What happens if we take an SLB miss
> in the idle wakeup code on these guys? Nothing good I think. Perhaps
> we should put them into the pacas or somewhere in bolted memory.

Yes, though the SLB miss hasn't yet been encountered in practise so
far!

While one can define thread_sibling_pacas in PACA, it doesn't make
sense to allocate space for core_idle_state in PACA since the
allocated value of the secondary threads will never be used.

What is the right way to ensure that these allocations fall in the
bolted range ?

> 
> Good cleanup though.
> 
> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
>
Nicholas Piggin July 8, 2017, 9 a.m. UTC | #3
On Fri, 7 Jul 2017 20:34:16 +0530
Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:

> On Fri, Jul 07, 2017 at 01:16:09AM +1000, Nicholas Piggin wrote:
> > On Wed,  5 Jul 2017 22:08:15 +0530
> > "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> wrote:
> >   
> > > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> > > 
> > > On POWER9 DD1, in order to get around a hardware issue, we store in
> > > every CPU thread's paca the paca pointers of all its siblings.
> > > 
> > > Move this code into pnv_alloc_idle_core_states() soon after the space
> > > for saving the sibling pacas is allocated.
> > > 
> > > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>  
> >   
> > > -	if (cpu_has_feature(CPU_FTR_POWER9_DD1)) {
> > > -		int cpu;
> > > -
> > > -		pr_info("powernv: idle: Saving PACA pointers of all CPUs in their thread sibling PACA\n");
> > > -		for_each_possible_cpu(cpu) {
> > > -			int base_cpu = cpu_first_thread_sibling(cpu);
> > > -			int idx = cpu_thread_in_core(cpu);
> > > -			int i;
> > > -  
> > 
> > You could move the thread_sibling_pacas allocation to here?
> > 
> > Speaking of which... core_idle_state and thread_sibling_pacas are
> > allocated with kmalloc_node... What happens if we take an SLB miss
> > in the idle wakeup code on these guys? Nothing good I think. Perhaps
> > we should put them into the pacas or somewhere in bolted memory.  
> 
> Yes, though the SLB miss hasn't yet been encountered in practise so
> far!

Considering it's a node-affine allocation, it may actually be possible
to hit in practice on very large memory systems in practice.

> While one can define thread_sibling_pacas in PACA, it doesn't make
> sense to allocate space for core_idle_state in PACA since the
> allocated value of the secondary threads will never be used.

Well, same for core_idle_state, although that's smaller.

> What is the right way to ensure that these allocations fall in the
> bolted range ?

I'm not sure, I guess the memblock allocator is not up anymore at this
point. I think we'd have to move it earlier. You could allocate another
array of them along side the paca allocation.

Thanks,
Nick
Michael Ellerman July 10, 2017, 4:34 a.m. UTC | #4
Nicholas Piggin <npiggin@gmail.com> writes:
> On Fri, 7 Jul 2017 20:34:16 +0530
> Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
>> On Fri, Jul 07, 2017 at 01:16:09AM +1000, Nicholas Piggin wrote:
>> > 
>> > Speaking of which... core_idle_state and thread_sibling_pacas are
>> > allocated with kmalloc_node... What happens if we take an SLB miss
>> > in the idle wakeup code on these guys? Nothing good I think. Perhaps
>> > we should put them into the pacas or somewhere in bolted memory.  
>> 
>> Yes, though the SLB miss hasn't yet been encountered in practise so
>> far!
>
> Considering it's a node-affine allocation, it may actually be possible
> to hit in practice on very large memory systems in practice.

You can boot with disable_1tb_segments on the kernel command line to
increase the change of hitting it.

cheers
diff mbox

Patch

diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index c400ff9..254a0db8 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -194,6 +194,28 @@  static void pnv_alloc_idle_core_states(void)
 		}
 	}
 
+	/*
+	 * For each CPU, record its PACA address in each of it's
+	 * sibling thread's PACA at the slot corresponding to this
+	 * CPU's index in the core.
+	 */
+	if (cpu_has_feature(CPU_FTR_POWER9_DD1)) {
+		int cpu;
+
+		pr_info("powernv: idle: Saving PACA pointers of all CPUs in their thread sibling PACA\n");
+		for_each_possible_cpu(cpu) {
+			int base_cpu = cpu_first_thread_sibling(cpu);
+			int idx = cpu_thread_in_core(cpu);
+			int i;
+
+			for (i = 0; i < threads_per_core; i++) {
+				int j = base_cpu + i;
+
+				paca[j].thread_sibling_pacas[idx] = &paca[cpu];
+			}
+		}
+	}
+
 	update_subcore_sibling_mask();
 
 	if (supported_cpuidle_states & OPAL_PM_LOSE_FULL_CONTEXT)
@@ -898,31 +920,8 @@  static int __init pnv_init_idle_states(void)
 	if (pnv_probe_idle_states())
 		goto out;
 
-
 	pnv_alloc_idle_core_states();
 
-	/*
-	 * For each CPU, record its PACA address in each of it's
-	 * sibling thread's PACA at the slot corresponding to this
-	 * CPU's index in the core.
-	 */
-	if (cpu_has_feature(CPU_FTR_POWER9_DD1)) {
-		int cpu;
-
-		pr_info("powernv: idle: Saving PACA pointers of all CPUs in their thread sibling PACA\n");
-		for_each_possible_cpu(cpu) {
-			int base_cpu = cpu_first_thread_sibling(cpu);
-			int idx = cpu_thread_in_core(cpu);
-			int i;
-
-			for (i = 0; i < threads_per_core; i++) {
-				int j = base_cpu + i;
-
-				paca[j].thread_sibling_pacas[idx] = &paca[cpu];
-			}
-		}
-	}
-
 out:
 	return 0;
 }