Message ID | 1499272696-28751-5-git-send-email-ego@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Wed, 5 Jul 2017 22:08:15 +0530 "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> wrote: > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> > > On POWER9 DD1, in order to get around a hardware issue, we store in > every CPU thread's paca the paca pointers of all its siblings. > > Move this code into pnv_alloc_idle_core_states() soon after the space > for saving the sibling pacas is allocated. > > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> > - if (cpu_has_feature(CPU_FTR_POWER9_DD1)) { > - int cpu; > - > - pr_info("powernv: idle: Saving PACA pointers of all CPUs in their thread sibling PACA\n"); > - for_each_possible_cpu(cpu) { > - int base_cpu = cpu_first_thread_sibling(cpu); > - int idx = cpu_thread_in_core(cpu); > - int i; > - You could move the thread_sibling_pacas allocation to here? Speaking of which... core_idle_state and thread_sibling_pacas are allocated with kmalloc_node... What happens if we take an SLB miss in the idle wakeup code on these guys? Nothing good I think. Perhaps we should put them into the pacas or somewhere in bolted memory. Good cleanup though. Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
On Fri, Jul 07, 2017 at 01:16:09AM +1000, Nicholas Piggin wrote: > On Wed, 5 Jul 2017 22:08:15 +0530 > "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> wrote: > > > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> > > > > On POWER9 DD1, in order to get around a hardware issue, we store in > > every CPU thread's paca the paca pointers of all its siblings. > > > > Move this code into pnv_alloc_idle_core_states() soon after the space > > for saving the sibling pacas is allocated. > > > > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> > > > - if (cpu_has_feature(CPU_FTR_POWER9_DD1)) { > > - int cpu; > > - > > - pr_info("powernv: idle: Saving PACA pointers of all CPUs in their thread sibling PACA\n"); > > - for_each_possible_cpu(cpu) { > > - int base_cpu = cpu_first_thread_sibling(cpu); > > - int idx = cpu_thread_in_core(cpu); > > - int i; > > - > > You could move the thread_sibling_pacas allocation to here? > > Speaking of which... core_idle_state and thread_sibling_pacas are > allocated with kmalloc_node... What happens if we take an SLB miss > in the idle wakeup code on these guys? Nothing good I think. Perhaps > we should put them into the pacas or somewhere in bolted memory. Yes, though the SLB miss hasn't yet been encountered in practise so far! While one can define thread_sibling_pacas in PACA, it doesn't make sense to allocate space for core_idle_state in PACA since the allocated value of the secondary threads will never be used. What is the right way to ensure that these allocations fall in the bolted range ? > > Good cleanup though. > > Reviewed-by: Nicholas Piggin <npiggin@gmail.com> >
On Fri, 7 Jul 2017 20:34:16 +0530 Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote: > On Fri, Jul 07, 2017 at 01:16:09AM +1000, Nicholas Piggin wrote: > > On Wed, 5 Jul 2017 22:08:15 +0530 > > "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> wrote: > > > > > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> > > > > > > On POWER9 DD1, in order to get around a hardware issue, we store in > > > every CPU thread's paca the paca pointers of all its siblings. > > > > > > Move this code into pnv_alloc_idle_core_states() soon after the space > > > for saving the sibling pacas is allocated. > > > > > > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> > > > > > - if (cpu_has_feature(CPU_FTR_POWER9_DD1)) { > > > - int cpu; > > > - > > > - pr_info("powernv: idle: Saving PACA pointers of all CPUs in their thread sibling PACA\n"); > > > - for_each_possible_cpu(cpu) { > > > - int base_cpu = cpu_first_thread_sibling(cpu); > > > - int idx = cpu_thread_in_core(cpu); > > > - int i; > > > - > > > > You could move the thread_sibling_pacas allocation to here? > > > > Speaking of which... core_idle_state and thread_sibling_pacas are > > allocated with kmalloc_node... What happens if we take an SLB miss > > in the idle wakeup code on these guys? Nothing good I think. Perhaps > > we should put them into the pacas or somewhere in bolted memory. > > Yes, though the SLB miss hasn't yet been encountered in practise so > far! Considering it's a node-affine allocation, it may actually be possible to hit in practice on very large memory systems in practice. > While one can define thread_sibling_pacas in PACA, it doesn't make > sense to allocate space for core_idle_state in PACA since the > allocated value of the secondary threads will never be used. Well, same for core_idle_state, although that's smaller. > What is the right way to ensure that these allocations fall in the > bolted range ? I'm not sure, I guess the memblock allocator is not up anymore at this point. I think we'd have to move it earlier. You could allocate another array of them along side the paca allocation. Thanks, Nick
Nicholas Piggin <npiggin@gmail.com> writes: > On Fri, 7 Jul 2017 20:34:16 +0530 > Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote: >> On Fri, Jul 07, 2017 at 01:16:09AM +1000, Nicholas Piggin wrote: >> > >> > Speaking of which... core_idle_state and thread_sibling_pacas are >> > allocated with kmalloc_node... What happens if we take an SLB miss >> > in the idle wakeup code on these guys? Nothing good I think. Perhaps >> > we should put them into the pacas or somewhere in bolted memory. >> >> Yes, though the SLB miss hasn't yet been encountered in practise so >> far! > > Considering it's a node-affine allocation, it may actually be possible > to hit in practice on very large memory systems in practice. You can boot with disable_1tb_segments on the kernel command line to increase the change of hitting it. cheers
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c index c400ff9..254a0db8 100644 --- a/arch/powerpc/platforms/powernv/idle.c +++ b/arch/powerpc/platforms/powernv/idle.c @@ -194,6 +194,28 @@ static void pnv_alloc_idle_core_states(void) } } + /* + * For each CPU, record its PACA address in each of it's + * sibling thread's PACA at the slot corresponding to this + * CPU's index in the core. + */ + if (cpu_has_feature(CPU_FTR_POWER9_DD1)) { + int cpu; + + pr_info("powernv: idle: Saving PACA pointers of all CPUs in their thread sibling PACA\n"); + for_each_possible_cpu(cpu) { + int base_cpu = cpu_first_thread_sibling(cpu); + int idx = cpu_thread_in_core(cpu); + int i; + + for (i = 0; i < threads_per_core; i++) { + int j = base_cpu + i; + + paca[j].thread_sibling_pacas[idx] = &paca[cpu]; + } + } + } + update_subcore_sibling_mask(); if (supported_cpuidle_states & OPAL_PM_LOSE_FULL_CONTEXT) @@ -898,31 +920,8 @@ static int __init pnv_init_idle_states(void) if (pnv_probe_idle_states()) goto out; - pnv_alloc_idle_core_states(); - /* - * For each CPU, record its PACA address in each of it's - * sibling thread's PACA at the slot corresponding to this - * CPU's index in the core. - */ - if (cpu_has_feature(CPU_FTR_POWER9_DD1)) { - int cpu; - - pr_info("powernv: idle: Saving PACA pointers of all CPUs in their thread sibling PACA\n"); - for_each_possible_cpu(cpu) { - int base_cpu = cpu_first_thread_sibling(cpu); - int idx = cpu_thread_in_core(cpu); - int i; - - for (i = 0; i < threads_per_core; i++) { - int j = base_cpu + i; - - paca[j].thread_sibling_pacas[idx] = &paca[cpu]; - } - } - } - out: return 0; }