Message ID | 1635896196-18961-1-git-send-email-boris.ostrovsky@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/smp: Factor out parts of native_smp_prepare_cpus() | expand |
On 03.11.21 00:36, Boris Ostrovsky wrote: > Commit 66558b730f25 ("sched: Add cluster scheduler level for x86") > introduced cpu_l2c_shared_map mask which is expected to be initialized > by smp_op.smp_prepare_cpus(). That commit only updated > native_smp_prepare_cpus() version but not xen_pv_smp_prepare_cpus(). > As result Xen PV guests crash in set_cpu_sibling_map(). > > While the new mask can be allocated in xen_pv_smp_prepare_cpus() one can > see that both versions of smp_prepare_cpus ops share a number of common > operations that can be factored out. So do that instead. > > Fixes: 66558b730f25 ("sched: Add cluster scheduler level for x86") > Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Juergen
On Tue, Nov 02, 2021 at 07:36:36PM -0400, Boris Ostrovsky wrote: > Commit 66558b730f25 ("sched: Add cluster scheduler level for x86") > introduced cpu_l2c_shared_map mask which is expected to be initialized > by smp_op.smp_prepare_cpus(). That commit only updated > native_smp_prepare_cpus() version but not xen_pv_smp_prepare_cpus(). > As result Xen PV guests crash in set_cpu_sibling_map(). > > While the new mask can be allocated in xen_pv_smp_prepare_cpus() one can > see that both versions of smp_prepare_cpus ops share a number of common > operations that can be factored out. So do that instead. > > Fixes: 66558b730f25 ("sched: Add cluster scheduler level for x86") > Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Thanks! I'll go stick that somewhere /urgent (I've had another report on that here: https://lkml.kernel.org/r/20211105074139.GE174703@worktop.programming.kicks-ass.net ) But looking at those functions; there seems to be more spurious differences. For example, the whole sched_topology thing. Should we re-architect this whole smp_prepare_cpus() thing instead? Have a common function and a guest function? HyperV for instance seems to call native_smp_prepare_cpus() and then does something extra (as does xen_hvm).
On 11/8/21 10:11 AM, Peter Zijlstra wrote: > On Tue, Nov 02, 2021 at 07:36:36PM -0400, Boris Ostrovsky wrote: >> Commit 66558b730f25 ("sched: Add cluster scheduler level for x86") >> introduced cpu_l2c_shared_map mask which is expected to be initialized >> by smp_op.smp_prepare_cpus(). That commit only updated >> native_smp_prepare_cpus() version but not xen_pv_smp_prepare_cpus(). >> As result Xen PV guests crash in set_cpu_sibling_map(). >> >> While the new mask can be allocated in xen_pv_smp_prepare_cpus() one can >> see that both versions of smp_prepare_cpus ops share a number of common >> operations that can be factored out. So do that instead. >> >> Fixes: 66558b730f25 ("sched: Add cluster scheduler level for x86") >> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > Thanks! I'll go stick that somewhere /urgent (I've had another report on > that here: > > https://lkml.kernel.org/r/20211105074139.GE174703@worktop.programming.kicks-ass.net > ) Thank you. (I don't see this message btw) > > But looking at those functions; there seems to be more spurious > differences. For example, the whole sched_topology thing. I did look at that and thought this should be benign given that Xen PV is not really topology-aware. I didn't see anything that would be a cause for concern but perhaps you can point me to things I missed. > > Should we re-architect this whole smp_prepare_cpus() thing instead? Have > a common function and a guest function? HyperV for instance seems to > call native_smp_prepare_cpus() and then does something extra (as does > xen_hvm). Something like void smp_prepare_cpus() { // Code that this patch moved to smp_prepare_cpus_common(); smp_ops.smp_prepare_cpus(); // Including baremetal } ? XenHVM and hyperV will need to call native smp_op too. Not sure this will be prettier than what it is now? -boris -boris
On Mon, Nov 08, 2021 at 12:20:26PM -0500, Boris Ostrovsky wrote: > > On 11/8/21 10:11 AM, Peter Zijlstra wrote: > > On Tue, Nov 02, 2021 at 07:36:36PM -0400, Boris Ostrovsky wrote: > > > Commit 66558b730f25 ("sched: Add cluster scheduler level for x86") > > > introduced cpu_l2c_shared_map mask which is expected to be initialized > > > by smp_op.smp_prepare_cpus(). That commit only updated > > > native_smp_prepare_cpus() version but not xen_pv_smp_prepare_cpus(). > > > As result Xen PV guests crash in set_cpu_sibling_map(). > > > > > > While the new mask can be allocated in xen_pv_smp_prepare_cpus() one can > > > see that both versions of smp_prepare_cpus ops share a number of common > > > operations that can be factored out. So do that instead. > > > > > > Fixes: 66558b730f25 ("sched: Add cluster scheduler level for x86") > > > Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > > Thanks! I'll go stick that somewhere /urgent (I've had another report on > > that here: > > > > https://lkml.kernel.org/r/20211105074139.GE174703@worktop.programming.kicks-ass.net > > ) > > > Thank you. (I don't see this message btw) Urgh, that thread never went to lkml :/ > > But looking at those functions; there seems to be more spurious > > differences. For example, the whole sched_topology thing. > > > I did look at that and thought this should be benign given that Xen PV > is not really topology-aware. I didn't see anything that would be a > cause for concern but perhaps you can point me to things I missed. And me not being Xen aware... What does Xen-PV guests see of the CPUID topology fields? Does it fully sanitize the CPUID data, or is it a clean pass-through from whatever CPU the vCPU happens to run on at the time? > > Should we re-architect this whole smp_prepare_cpus() thing instead? Have > > a common function and a guest function? HyperV for instance seems to > > call native_smp_prepare_cpus() and then does something extra (as does > > xen_hvm). > > > Something like > > > void smp_prepare_cpus() > > { > > // Code that this patch moved to smp_prepare_cpus_common(); > > > smp_ops.smp_prepare_cpus(); // Including baremetal > > } > > > ? > > > XenHVM and hyperV will need to call native smp_op too. Not sure this > will be prettier than what it is now? Hurmph, yeah. I was thinking it would allow pre and post common code, but yeah, doesn't seem to make sense for now.
On 09.11.21 16:10, Peter Zijlstra wrote: > On Mon, Nov 08, 2021 at 12:20:26PM -0500, Boris Ostrovsky wrote: >> >> On 11/8/21 10:11 AM, Peter Zijlstra wrote: >>> On Tue, Nov 02, 2021 at 07:36:36PM -0400, Boris Ostrovsky wrote: >>>> Commit 66558b730f25 ("sched: Add cluster scheduler level for x86") >>>> introduced cpu_l2c_shared_map mask which is expected to be initialized >>>> by smp_op.smp_prepare_cpus(). That commit only updated >>>> native_smp_prepare_cpus() version but not xen_pv_smp_prepare_cpus(). >>>> As result Xen PV guests crash in set_cpu_sibling_map(). >>>> >>>> While the new mask can be allocated in xen_pv_smp_prepare_cpus() one can >>>> see that both versions of smp_prepare_cpus ops share a number of common >>>> operations that can be factored out. So do that instead. >>>> >>>> Fixes: 66558b730f25 ("sched: Add cluster scheduler level for x86") >>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> >>> Thanks! I'll go stick that somewhere /urgent (I've had another report on >>> that here: >>> >>> https://lkml.kernel.org/r/20211105074139.GE174703@worktop.programming.kicks-ass.net >>> ) >> >> >> Thank you. (I don't see this message btw) > > Urgh, that thread never went to lkml :/ > >>> But looking at those functions; there seems to be more spurious >>> differences. For example, the whole sched_topology thing. >> >> >> I did look at that and thought this should be benign given that Xen PV >> is not really topology-aware. I didn't see anything that would be a >> cause for concern but perhaps you can point me to things I missed. > > And me not being Xen aware... What does Xen-PV guests see of the CPUID > topology fields? Does it fully sanitize the CPUID data, or is it a clean > pass-through from whatever CPU the vCPU happens to run on at the time? The latter. :-( Juergen
On 09/11/2021 15:10, Peter Zijlstra wrote: > On Mon, Nov 08, 2021 at 12:20:26PM -0500, Boris Ostrovsky wrote: >>> But looking at those functions; there seems to be more spurious >>> differences. For example, the whole sched_topology thing. >> >> I did look at that and thought this should be benign given that Xen PV >> is not really topology-aware. I didn't see anything that would be a >> cause for concern but perhaps you can point me to things I missed. > And me not being Xen aware... What does Xen-PV guests see of the CPUID > topology fields? Does it fully sanitize the CPUID data, or is it a clean > pass-through from whatever CPU the vCPU happens to run on at the time? That depends on hardware support (CPUID Faulting or not), version of Xen (anything before Xen 4.7 is totally insane. Anything more recent is only moderately insane), and whether the kernel asks via the enlightened CPUID path or not. On hardware lacking CPUID faulting, and for a kernel using native_cpuid() where it ought to be using the PVOP, it sees the real hardware value of the CPU it happens to be running on. ~Andrew
On 11/3/21 00:36, Boris Ostrovsky wrote: > Commit 66558b730f25 ("sched: Add cluster scheduler level for x86") > introduced cpu_l2c_shared_map mask which is expected to be initialized > by smp_op.smp_prepare_cpus(). That commit only updated > native_smp_prepare_cpus() version but not xen_pv_smp_prepare_cpus(). > As result Xen PV guests crash in set_cpu_sibling_map(). > > While the new mask can be allocated in xen_pv_smp_prepare_cpus() one can > see that both versions of smp_prepare_cpus ops share a number of common > operations that can be factored out. So do that instead. > > Fixes: 66558b730f25 ("sched: Add cluster scheduler level for x86") > Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > --- > arch/x86/include/asm/smp.h | 1 + > arch/x86/kernel/smpboot.c | 19 +++++++++++++------ > arch/x86/xen/smp_pv.c | 11 ++--------- > 3 files changed, 16 insertions(+), 15 deletions(-) > > diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h > index 08b0e90623ad..81a0211a372d 100644 > --- a/arch/x86/include/asm/smp.h > +++ b/arch/x86/include/asm/smp.h > @@ -126,6 +126,7 @@ static inline void arch_send_call_function_ipi_mask(const struct cpumask *mask) > > void cpu_disable_common(void); > void native_smp_prepare_boot_cpu(void); > +void smp_prepare_cpus_common(void); > void native_smp_prepare_cpus(unsigned int max_cpus); > void calculate_max_logical_packages(void); > void native_smp_cpus_done(unsigned int max_cpus); > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c > index 8241927addff..d7429198c22f 100644 > --- a/arch/x86/kernel/smpboot.c > +++ b/arch/x86/kernel/smpboot.c > @@ -1350,12 +1350,7 @@ static void __init smp_get_logical_apicid(void) > cpu0_logical_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR)); > } > > -/* > - * Prepare for SMP bootup. > - * @max_cpus: configured maximum number of CPUs, It is a legacy parameter > - * for common interface support. > - */ > -void __init native_smp_prepare_cpus(unsigned int max_cpus) > +void __init smp_prepare_cpus_common(void) > { > unsigned int i; Testing out this patch I got a warning that i is unused. Which it is, since for_each_possible_cpu(i) is removed. > > @@ -1386,6 +1381,18 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) > set_sched_topology(x86_topology); > > set_cpu_sibling_map(0); > +} > + > +/* > + * Prepare for SMP bootup. > + * @max_cpus: configured maximum number of CPUs, It is a legacy parameter > + * for common interface support. > + */ > +void __init native_smp_prepare_cpus(unsigned int max_cpus) > +{ > + > + smp_prepare_cpus_common(); > + > init_freq_invariance(false, false); > smp_sanity_check(); > > diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c > index 9e55bcbfcd33..69e91d0d3ca4 100644 > --- a/arch/x86/xen/smp_pv.c > +++ b/arch/x86/xen/smp_pv.c > @@ -238,16 +238,9 @@ static void __init xen_pv_smp_prepare_cpus(unsigned int max_cpus) > } > xen_init_lock_cpu(0); > > - smp_store_boot_cpu_info(); > - cpu_data(0).x86_max_cores = 1; > + smp_prepare_cpus_common(); > > - for_each_possible_cpu(i) { > - zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL); > - zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL); > - zalloc_cpumask_var(&per_cpu(cpu_die_map, i), GFP_KERNEL); > - zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL); > - } > - set_cpu_sibling_map(0); > + cpu_data(0).x86_max_cores = 1; > > speculative_store_bypass_ht_init(); >
On Wed, Nov 10, 2021 at 10:52:09PM +0100, Josef Johansson wrote: > On 11/3/21 00:36, Boris Ostrovsky wrote: > > Commit 66558b730f25 ("sched: Add cluster scheduler level for x86") > > introduced cpu_l2c_shared_map mask which is expected to be initialized > > by smp_op.smp_prepare_cpus(). That commit only updated > > native_smp_prepare_cpus() version but not xen_pv_smp_prepare_cpus(). > > As result Xen PV guests crash in set_cpu_sibling_map(). > > > > While the new mask can be allocated in xen_pv_smp_prepare_cpus() one can > > see that both versions of smp_prepare_cpus ops share a number of common > > operations that can be factored out. So do that instead. > > > > Fixes: 66558b730f25 ("sched: Add cluster scheduler level for x86") > > Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > > --- > > arch/x86/include/asm/smp.h | 1 + > > arch/x86/kernel/smpboot.c | 19 +++++++++++++------ > > arch/x86/xen/smp_pv.c | 11 ++--------- > > 3 files changed, 16 insertions(+), 15 deletions(-) > > > > diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h > > index 08b0e90623ad..81a0211a372d 100644 > > --- a/arch/x86/include/asm/smp.h > > +++ b/arch/x86/include/asm/smp.h > > @@ -126,6 +126,7 @@ static inline void arch_send_call_function_ipi_mask(const struct cpumask *mask) > > > > void cpu_disable_common(void); > > void native_smp_prepare_boot_cpu(void); > > +void smp_prepare_cpus_common(void); > > void native_smp_prepare_cpus(unsigned int max_cpus); > > void calculate_max_logical_packages(void); > > void native_smp_cpus_done(unsigned int max_cpus); > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c > > index 8241927addff..d7429198c22f 100644 > > --- a/arch/x86/kernel/smpboot.c > > +++ b/arch/x86/kernel/smpboot.c > > @@ -1350,12 +1350,7 @@ static void __init smp_get_logical_apicid(void) > > cpu0_logical_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR)); > > } > > > > -/* > > - * Prepare for SMP bootup. > > - * @max_cpus: configured maximum number of CPUs, It is a legacy parameter > > - * for common interface support. > > - */ > > -void __init native_smp_prepare_cpus(unsigned int max_cpus) > > +void __init smp_prepare_cpus_common(void) > > { > > unsigned int i; > Testing out this patch I got a warning that i is unused. Which it is, > since for_each_possible_cpu(i) is removed. Fixed. Can I add your Tested-by ?
On 11/11/21 11:15, Peter Zijlstra wrote: > On Wed, Nov 10, 2021 at 10:52:09PM +0100, Josef Johansson wrote: >> On 11/3/21 00:36, Boris Ostrovsky wrote: >>> Commit 66558b730f25 ("sched: Add cluster scheduler level for x86") >>> introduced cpu_l2c_shared_map mask which is expected to be initialized >>> by smp_op.smp_prepare_cpus(). That commit only updated >>> native_smp_prepare_cpus() version but not xen_pv_smp_prepare_cpus(). >>> As result Xen PV guests crash in set_cpu_sibling_map(). >>> >>> While the new mask can be allocated in xen_pv_smp_prepare_cpus() one can >>> see that both versions of smp_prepare_cpus ops share a number of common >>> operations that can be factored out. So do that instead. >>> >>> Fixes: 66558b730f25 ("sched: Add cluster scheduler level for x86") >>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> >>> --- >>> arch/x86/include/asm/smp.h | 1 + >>> arch/x86/kernel/smpboot.c | 19 +++++++++++++------ >>> arch/x86/xen/smp_pv.c | 11 ++--------- >>> 3 files changed, 16 insertions(+), 15 deletions(-) >>> >>> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h >>> index 08b0e90623ad..81a0211a372d 100644 >>> --- a/arch/x86/include/asm/smp.h >>> +++ b/arch/x86/include/asm/smp.h >>> @@ -126,6 +126,7 @@ static inline void arch_send_call_function_ipi_mask(const struct cpumask *mask) >>> >>> void cpu_disable_common(void); >>> void native_smp_prepare_boot_cpu(void); >>> +void smp_prepare_cpus_common(void); >>> void native_smp_prepare_cpus(unsigned int max_cpus); >>> void calculate_max_logical_packages(void); >>> void native_smp_cpus_done(unsigned int max_cpus); >>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c >>> index 8241927addff..d7429198c22f 100644 >>> --- a/arch/x86/kernel/smpboot.c >>> +++ b/arch/x86/kernel/smpboot.c >>> @@ -1350,12 +1350,7 @@ static void __init smp_get_logical_apicid(void) >>> cpu0_logical_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR)); >>> } >>> >>> -/* >>> - * Prepare for SMP bootup. >>> - * @max_cpus: configured maximum number of CPUs, It is a legacy parameter >>> - * for common interface support. >>> - */ >>> -void __init native_smp_prepare_cpus(unsigned int max_cpus) >>> +void __init smp_prepare_cpus_common(void) >>> { >>> unsigned int i; >> Testing out this patch I got a warning that i is unused. Which it is, >> since for_each_possible_cpu(i) is removed. > Fixed. Can I add your Tested-by ? Yes, I tested with tip. Regards Josef
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h index 08b0e90623ad..81a0211a372d 100644 --- a/arch/x86/include/asm/smp.h +++ b/arch/x86/include/asm/smp.h @@ -126,6 +126,7 @@ static inline void arch_send_call_function_ipi_mask(const struct cpumask *mask) void cpu_disable_common(void); void native_smp_prepare_boot_cpu(void); +void smp_prepare_cpus_common(void); void native_smp_prepare_cpus(unsigned int max_cpus); void calculate_max_logical_packages(void); void native_smp_cpus_done(unsigned int max_cpus); diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 8241927addff..d7429198c22f 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1350,12 +1350,7 @@ static void __init smp_get_logical_apicid(void) cpu0_logical_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR)); } -/* - * Prepare for SMP bootup. - * @max_cpus: configured maximum number of CPUs, It is a legacy parameter - * for common interface support. - */ -void __init native_smp_prepare_cpus(unsigned int max_cpus) +void __init smp_prepare_cpus_common(void) { unsigned int i; @@ -1386,6 +1381,18 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) set_sched_topology(x86_topology); set_cpu_sibling_map(0); +} + +/* + * Prepare for SMP bootup. + * @max_cpus: configured maximum number of CPUs, It is a legacy parameter + * for common interface support. + */ +void __init native_smp_prepare_cpus(unsigned int max_cpus) +{ + + smp_prepare_cpus_common(); + init_freq_invariance(false, false); smp_sanity_check(); diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c index 9e55bcbfcd33..69e91d0d3ca4 100644 --- a/arch/x86/xen/smp_pv.c +++ b/arch/x86/xen/smp_pv.c @@ -238,16 +238,9 @@ static void __init xen_pv_smp_prepare_cpus(unsigned int max_cpus) } xen_init_lock_cpu(0); - smp_store_boot_cpu_info(); - cpu_data(0).x86_max_cores = 1; + smp_prepare_cpus_common(); - for_each_possible_cpu(i) { - zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL); - zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL); - zalloc_cpumask_var(&per_cpu(cpu_die_map, i), GFP_KERNEL); - zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL); - } - set_cpu_sibling_map(0); + cpu_data(0).x86_max_cores = 1; speculative_store_bypass_ht_init();
Commit 66558b730f25 ("sched: Add cluster scheduler level for x86") introduced cpu_l2c_shared_map mask which is expected to be initialized by smp_op.smp_prepare_cpus(). That commit only updated native_smp_prepare_cpus() version but not xen_pv_smp_prepare_cpus(). As result Xen PV guests crash in set_cpu_sibling_map(). While the new mask can be allocated in xen_pv_smp_prepare_cpus() one can see that both versions of smp_prepare_cpus ops share a number of common operations that can be factored out. So do that instead. Fixes: 66558b730f25 ("sched: Add cluster scheduler level for x86") Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> --- arch/x86/include/asm/smp.h | 1 + arch/x86/kernel/smpboot.c | 19 +++++++++++++------ arch/x86/xen/smp_pv.c | 11 ++--------- 3 files changed, 16 insertions(+), 15 deletions(-)