[core-for-CI] x86/topology: fix erroneous smp_num_siblings on Intel Hybrid platform

Message ID 20230327121116.1785979-1-imre.deak@intel.com (mailing list archive)
State New, archived

Commit Message

Imre Deak March 27, 2023, 12:11 p.m. UTC
From: Zhang Rui <rui.zhang@intel.com>

The SMT siblings value returned by CPUID.1F SMT level EBX differs
among CPUs on Intel Hybrid platforms like AlderLake and MeteorLake.
It returns 2 for Pcore CPUs which have SMT siblings and returns 1 for
Ecore CPUs which do not have SMT siblings.
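
For illustration, the fields of one sub-leaf of the extended topology
leaf can be decoded as in the minimal sketch below. It assumes the usual
layout (EBX[15:0] = logical processors at this level, ECX[15:8] = level
type, EAX[4:0] = APIC ID shift), which is what the kernel's
LEVEL_MAX_SIBLINGS() macro relies on. The struct and function names are
hypothetical, and the sketch works on raw register values instead of
executing CPUID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Level types reported in ECX[15:8] of CPUID leaf 0xB/0x1F sub-leaves. */
enum { LEVEL_TYPE_INVALID = 0, LEVEL_TYPE_SMT = 1, LEVEL_TYPE_CORE = 2 };

/* Hypothetical container for the decoded fields of one sub-leaf. */
struct topo_level {
	unsigned int type;       /* ECX[15:8]: level type */
	unsigned int shift;      /* EAX[4:0]: APIC ID bits to shift for next level */
	unsigned int nr_logical; /* EBX[15:0]: logical processors at this level */
};

/* Decode raw register values; no CPUID instruction is executed here. */
void decode_topo_level(uint32_t eax, uint32_t ebx, uint32_t ecx,
		       struct topo_level *out)
{
	assert(out != NULL);
	out->type = (ecx >> 8) & 0xffu;
	out->shift = eax & 0x1fu;
	out->nr_logical = ebx & 0xffffu;
}
```

With made-up register values, an SMT-level EBX of 2 decodes to two
siblings (the Pcore case) and an EBX of 1 to a single thread (the Ecore
case).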

Today, the CPU boot code sets the global variable smp_num_siblings when
every CPU thread is brought up. The last thread to boot will overwrite
it with the number of siblings of *that* thread. That last thread to
boot will "win". If the thread is a Pcore, smp_num_siblings == 2.  If it
is an Ecore, smp_num_siblings == 1.

smp_num_siblings describes if the *system* supports SMT.  It should
specify the maximum number of SMT threads among all cores.

Ensure that smp_num_siblings represents the system-wide maximum number
of siblings by always increasing its value. Never allow it to decrease.
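
The difference between the two update policies can be sketched in plain
userspace C. This is only a model under stated assumptions, not the
kernel code: the function names are hypothetical, the array stands in
for the per-CPU value read from CPUID in boot order, and the initial
value of 1 is an assumption about the global's default.

```c
#include <assert.h>
#include <stddef.h>

/* Old behaviour: every CPU overwrites the global, so the last one wins. */
int boot_last_writer_wins(const int *siblings, size_t nr_cpus)
{
	int smp_num_siblings = 1; /* assumed default before any CPU is probed */

	assert(siblings != NULL);
	for (size_t i = 0; i < nr_cpus; i++)
		smp_num_siblings = siblings[i];
	return smp_num_siblings;
}

/* Patched behaviour: the global only ever grows (system-wide maximum). */
int boot_keep_maximum(const int *siblings, size_t nr_cpus)
{
	int smp_num_siblings = 1; /* assumed default before any CPU is probed */

	assert(siblings != NULL);
	for (size_t i = 0; i < nr_cpus; i++)
		if (siblings[i] > smp_num_siblings)
			smp_num_siblings = siblings[i];
	return smp_num_siblings;
}
```

For a boot order such as {2, 2, 1, 1} (Pcore threads first, Ecores last),
the first function ends up with 1 and the second with 2, independent of
which CPU happens to boot last.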

On the MeteorLake-P platform, this fixes a problem where the Ecore CPUs
are not updated in any CPU sibling map because the system is treated as
a UP system when the Ecore CPUs are probed.

The following shows part of the CPU topology information before and after
the fix, for both a Pcore and an Ecore CPU (cpu0 is a Pcore, cpu12 is an
Ecore).
...
-/sys/devices/system/cpu/cpu0/topology/package_cpus:000fff
-/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-11
+/sys/devices/system/cpu/cpu0/topology/package_cpus:3fffff
+/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-21
...
-/sys/devices/system/cpu/cpu12/topology/package_cpus:001000
-/sys/devices/system/cpu/cpu12/topology/package_cpus_list:12
+/sys/devices/system/cpu/cpu12/topology/package_cpus:3fffff
+/sys/devices/system/cpu/cpu12/topology/package_cpus_list:0-21
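
To cross-check the masks against the lists above, the number of CPUs in
a sysfs-style hex mask can be counted with a small helper. A minimal
sketch, assuming the mask is a plain hex string (with optional commas
between 32-bit groups, as sysfs prints for wider masks); the function
name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the CPUs set in a hex cpumask string such as "3fffff".
 * Returns -1 if the string contains a character that is neither a hex
 * digit nor a comma.
 */
int cpumask_hex_weight(const char *mask)
{
	int weight = 0;

	assert(mask != NULL);
	for (; *mask; mask++) {
		int c = *mask;
		int nibble;

		if (c == ',')
			continue; /* separator between 32-bit groups */
		if (c >= '0' && c <= '9')
			nibble = c - '0';
		else if (c >= 'a' && c <= 'f')
			nibble = c - 'a' + 10;
		else if (c >= 'A' && c <= 'F')
			nibble = c - 'A' + 10;
		else
			return -1;

		/* Clear the lowest set bit until none remain. */
		for (; nibble; nibble &= nibble - 1)
			weight++;
	}
	return weight;
}
```

Here "000fff" gives 12 CPUs (list 0-11), "001000" gives 1 CPU (list 12),
and "3fffff" gives 22 CPUs (list 0-21), matching the listing.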

The bug also breaks userspace tools like lscpu; output before and after the fix:
-Core(s) per socket:  1
-Socket(s):           11
+Core(s) per socket:  16
+Socket(s):           1

CC: stable@kernel.org
Fixes: bbb65d2d365e ("x86: use cpuid vector 0xb when available for detecting cpu topology")
Fixes: 95f3d39ccf7a ("x86/cpu/topology: Provide detect_extended_topology_early()")
Suggested-by: Len Brown <len.brown@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[Imre: resend for core-for-CI]
References: https://lore.kernel.org/lkml/20230323015640.27906-1-rui.zhang@intel.com
References: https://gitlab.freedesktop.org/drm/intel/-/issues/8317
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 arch/x86/kernel/cpu/topology.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Saarinen, Jani March 27, 2023, 12:46 p.m. UTC | #1
Hi, 
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Imre Deak
> Sent: Monday, March 27, 2023 15:11
> To: intel-gfx@lists.freedesktop.org
> Subject: [Intel-gfx] [core-for-CI] x86/topology: fix erroneous smp_num_siblings on
> Intel Hybrid platform
> 
> From: Zhang Rui <rui.zhang@intel.com>
> 
> The SMT siblings value returned by CPUID.1F SMT level EBX differs among CPUs on
> Intel Hybrid platforms like AlderLake and MeteorLake.
> It returns 2 for Pcore CPUs which have SMT siblings and returns 1 for Ecore CPUs
> which do not have SMT siblings.
> 
> Today, the CPU boot code sets the global variable smp_num_siblings when every
> CPU thread is brought up. The last thread to boot will overwrite it with the number
> of siblings of *that* thread. That last thread to boot will "win". If the thread is a
> Pcore, smp_num_siblings == 2.  If it is an Ecore, smp_num_siblings == 1.
> 
> smp_num_siblings describes if the *system* supports SMT.  It should specify the
> maximum number of SMT threads among all cores.
> 
> Ensure that smp_num_siblings represents the system-wide maximum number of
> siblings by always increasing its value. Never allow it to decrease.
> 
> On the MeteorLake-P platform, this fixes a problem where the Ecore CPUs are not
> updated in any CPU sibling map because the system is treated as a UP system when
> the Ecore CPUs are probed.
> 
> The following shows part of the CPU topology information before and after the fix,
> for both a Pcore and an Ecore CPU (cpu0 is a Pcore, cpu12 is an Ecore).

Tested-by: Jani Saarinen <jani.saarinen@intel.com> on a local MTL setup.
Also tested earlier on other CI systems with the trybot series
https://patchwork.freedesktop.org/series/115601/
The related issue is https://gitlab.freedesktop.org/drm/intel/-/issues/8317

Br,
Jani

> ...
> -/sys/devices/system/cpu/cpu0/topology/package_cpus:000fff
> -/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-11
> +/sys/devices/system/cpu/cpu0/topology/package_cpus:3fffff
> +/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-21
> ...
> -/sys/devices/system/cpu/cpu12/topology/package_cpus:001000
> -/sys/devices/system/cpu/cpu12/topology/package_cpus_list:12
> +/sys/devices/system/cpu/cpu12/topology/package_cpus:3fffff
> +/sys/devices/system/cpu/cpu12/topology/package_cpus_list:0-21
> 
> And this also breaks userspace tools like lscpu
> -Core(s) per socket:  1
> -Socket(s):           11
> +Core(s) per socket:  16
> +Socket(s):           1
> 
> CC: stable@kernel.org
> Fixes: bbb65d2d365e ("x86: use cpuid vector 0xb when available for detecting cpu topology")
> Fixes: 95f3d39ccf7a ("x86/cpu/topology: Provide detect_extended_topology_early()")
> Suggested-by: Len Brown <len.brown@intel.com>
> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> [Imre: resend for core-for-CI]
> References: https://lore.kernel.org/lkml/20230323015640.27906-1-rui.zhang@intel.com
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/8317
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
>  arch/x86/kernel/cpu/topology.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
> index 5e868b62a7c4e..0270925fe013b 100644
> --- a/arch/x86/kernel/cpu/topology.c
> +++ b/arch/x86/kernel/cpu/topology.c
> @@ -79,7 +79,7 @@ int detect_extended_topology_early(struct cpuinfo_x86 *c)
>  	 * initial apic id, which also represents 32-bit extended x2apic id.
>  	 */
>  	c->initial_apicid = edx;
> -	smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
> +	smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
>  #endif
>  	return 0;
>  }
> @@ -109,7 +109,8 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
>  	 */
>  	cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
>  	c->initial_apicid = edx;
> -	core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
> +	core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
> +	smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
>  	core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
>  	die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
>  	pkg_mask_width = die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
> --
> 2.37.2
Imre Deak March 27, 2023, 9:26 p.m. UTC | #2
Hi Rui,

after applying your
"x86/topology: fix erroneous smp_num_siblings on Intel Hybrid platform"

fix on the drm-tip tree (see the patchwork URL below) the CI tests show
some regression on a HSW and a KBL machine (see [2] and [4] below) in
the i915 driver. However I think they can't be related to your changes,
since on these machines all cores will report the same number of CPU
siblings. Could you confirm this and that in general the reported
siblings can only vary on platforms with both E and P cores (ADL-P being
the first such platform)?

Thanks,
Imre

On Mon, Mar 27, 2023 at 07:02:25PM +0000, Patchwork wrote:
> == Series Details ==
> 
> Series: x86/topology: fix erroneous smp_num_siblings on Intel Hybrid platform
> URL   : https://patchwork.freedesktop.org/series/115661/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_12921 -> Patchwork_115661v1
> ====================================================
> 
> Summary
> -------
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with Patchwork_115661v1 absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in Patchwork_115661v1, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/index.html
> 
> Participating hosts (37 -> 37)
> ------------------------------
> 
>   No changes in participating hosts
> 
> Possible new issues
> -------------------
> 
>   Here are the unknown changes that may have been introduced in Patchwork_115661v1:
> 
> ### IGT changes ###
> 
> #### Possible regressions ####
> 
>   * igt@i915_selftest@live@hangcheck:
>     - fi-hsw-4770:        [PASS][1] -> [DMESG-WARN][2]
>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html
>    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html
> 
>   
> #### Suppressed ####
> 
>   The following results come from untrusted machines, tests, or statuses.
>   They do not affect the overall result.
> 
>   * igt@fbdev@info:
>     - {bat-kbl-2}:        [SKIP][3] ([fdo#109271]) -> [ABORT][4]
>    [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-kbl-2/igt@fbdev@info.html
>    [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-kbl-2/igt@fbdev@info.html
> 
>   
> Known issues
> ------------
> 
>   Here are the changes found in Patchwork_115661v1 that come from known issues:
> 
> ### IGT changes ###
> 
> #### Issues hit ####
> 
>   * igt@gem_exec_suspend@basic-s3@lmem0:
>     - bat-dg2-11:         [PASS][5] -> [INCOMPLETE][6] ([i915#6311])
>    [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-dg2-11/igt@gem_exec_suspend@basic-s3@lmem0.html
>    [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-dg2-11/igt@gem_exec_suspend@basic-s3@lmem0.html
> 
>   * igt@i915_selftest@live@reset:
>     - bat-rpls-1:         [PASS][7] -> [ABORT][8] ([i915#4983])
>    [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-rpls-1/igt@i915_selftest@live@reset.html
>    [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-rpls-1/igt@i915_selftest@live@reset.html
> 
>   * igt@i915_selftest@live@slpc:
>     - bat-rpls-2:         NOTRUN -> [DMESG-FAIL][9] ([i915#6367] / [i915#7913] / [i915#7996])
>    [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-rpls-2/igt@i915_selftest@live@slpc.html
> 
>   * igt@i915_suspend@basic-s3-without-i915:
>     - bat-rpls-2:         NOTRUN -> [ABORT][10] ([i915#6687] / [i915#7978])
>    [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-rpls-2/igt@i915_suspend@basic-s3-without-i915.html
> 
>   * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1:
>     - bat-dg2-8:          [PASS][11] -> [FAIL][12] ([i915#7932])
>    [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1.html
>    [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1.html
> 
>   
> #### Possible fixes ####
> 
>   * igt@i915_selftest@live@gt_heartbeat:
>     - fi-kbl-soraka:      [DMESG-FAIL][13] ([i915#5334] / [i915#7872]) -> [PASS][14]
>    [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html
>    [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html
> 
>   * igt@i915_selftest@live@reset:
>     - bat-rpls-2:         [ABORT][15] ([i915#4983] / [i915#7913]) -> [PASS][16]
>    [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-rpls-2/igt@i915_selftest@live@reset.html
>    [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-rpls-2/igt@i915_selftest@live@reset.html
> 
>   
>   {name}: This element is suppressed. This means it is ignored when computing
>           the status of the difference (SUCCESS, WARNING, or FAILURE).
> 
>   [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
>   [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983
>   [i915#5334]: https://gitlab.freedesktop.org/drm/intel/issues/5334
>   [i915#6311]: https://gitlab.freedesktop.org/drm/intel/issues/6311
>   [i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367
>   [i915#6687]: https://gitlab.freedesktop.org/drm/intel/issues/6687
>   [i915#7872]: https://gitlab.freedesktop.org/drm/intel/issues/7872
>   [i915#7913]: https://gitlab.freedesktop.org/drm/intel/issues/7913
>   [i915#7932]: https://gitlab.freedesktop.org/drm/intel/issues/7932
>   [i915#7978]: https://gitlab.freedesktop.org/drm/intel/issues/7978
>   [i915#7996]: https://gitlab.freedesktop.org/drm/intel/issues/7996
> 
> 
> Build changes
> -------------
> 
>   * Linux: CI_DRM_12921 -> Patchwork_115661v1
> 
>   CI-20190529: 20190529
>   CI_DRM_12921: 3de6040ce9900a94ec626662d5c6a227b37eeb1c @ git://anongit.freedesktop.org/gfx-ci/linux
>   IGT_7221: 4b77c6d85024d22ca521d510f8eee574128fe04f @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
>   Patchwork_115661v1: 3de6040ce9900a94ec626662d5c6a227b37eeb1c @ git://anongit.freedesktop.org/gfx-ci/linux
> 
> 
> ### Linux commits
> 
> 83d9e76610d5 x86/topology: fix erroneous smp_num_siblings on Intel Hybrid platform
> 
> == Logs ==
> 
> For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/index.html
Zhang Rui March 28, 2023, 1:14 a.m. UTC | #3
Hi, Imre,

Thanks for raising this.

On Tue, 2023-03-28 at 00:26 +0300, Imre Deak wrote:
> Hi Rui,
> 
> after applying your
> "x86/topology: fix erroneous smp_num_siblings on Intel Hybrid
> platform"
> 
> fix on the drm-tip tree (see the patchwork URL below) the CI tests
> show
> some regression on a HSW and a KBL machine (see [2] and [4] below) in
> the i915 driver. However I think they can't be related to your
> changes,
> since on these machines all cores will report the same number of CPU
> siblings.

Right.

>  Could you confirm this and that in general the reported
> siblings can only vary on platforms with both E and P cores (ADL-P
> being
> the first such platform)?

Right.

I don't think the patch could cause any related change.
It only affects hybrid platforms.

Thanks,
rui
Zhang Rui March 28, 2023, 1:57 a.m. UTC | #4
On Tue, 2023-03-28 at 09:14 +0800, Zhang Rui wrote:
> Hi, Imre,
> 
> Thanks for raising this.
> 
> On Tue, 2023-03-28 at 00:26 +0300, Imre Deak wrote:
> > Hi Rui,
> > 
> > after applying your
> > "x86/topology: fix erroneous smp_num_siblings on Intel Hybrid
> > platform"
> > 
> > fix on the drm-tip tree (see the patchwork URL below) the CI tests
> > show
> > some regression on a HSW and a KBL machine (see [2] and [4] below)
> > in
> > the i915 driver. However I think they can't be related to your
> > changes,
> > since on these machines all cores will report the same number of
> > CPU
> > siblings.
> 
> Right.
> 
> >  Could you confirm this and that in general the reported
> > siblings can only vary on platforms with both E and P cores (ADL-P
> > being
> > the first such platform)?
> 
> Right.
> 
> I don't think the patch could bring any change related.
> It only affects hybrid platforms.

Is this topology fix the only patch applied, or was it applied together
with some other patches?

I can hardly imagine that the fix can trigger such issues, so I suspect
they are intermittent. Is the regression 100% reproducible?
Does the warning/failure ever show up without the patch?

BTW, I just happened to see this thread:
https://lore.kernel.org/all/DM8PR11MB565580BCF44661B6A392F0CEE08B9@DM8PR11MB5655.namprd11.prod.outlook.com/
If the problem at hand has been verified to be unrelated to the
topology fix, can we post an update in this thread as well?
https://lore.kernel.org/all/20230323015640.27906-1-rui.zhang@intel.com/
This is another issue that the patch fixes, and it would be better to
have a Buglink/Tested-by tag in the commit.

thanks,
rui

Saarinen, Jani March 28, 2023, 6:34 a.m. UTC | #5
Hi. 
> -----Original Message-----
> From: Zhang, Rui <rui.zhang@intel.com>
> Sent: Tuesday, March 28, 2023 4:57
> To: Deak, Imre <imre.deak@intel.com>
> Cc: Saarinen, Jani <jani.saarinen@intel.com>; intel-gfx@lists.freedesktop.org
> Subject: Re: ✗ Fi.CI.BAT: failure for x86/topology: fix erroneous smp_num_siblings
> on Intel Hybrid platform
> 
> On Tue, 2023-03-28 at 09:14 +0800, Zhang Rui wrote:
> > Hi, Imre,
> >
> > Thanks for raising this.
> >
> > On Tue, 2023-03-28 at 00:26 +0300, Imre Deak wrote:
> > > Hi Rui,
> > >
> > > after applying your
> > > "x86/topology: fix erroneous smp_num_siblings on Intel Hybrid
> > > platform"
> > >
> > > fix on the drm-tip tree (see the patchwork URL below) the CI tests
> > > show some regression on a HSW and a KBL machine (see [2] and [4]
> > > below) in the i915 driver. However I think they can't be related to
> > > your changes, since on these machines all cores will report the same
> > > number of CPU siblings.
> >
> > Right.
> >
> > >  Could you confirm this and that in general the reported siblings
> > > can only vary on platforms with both E and P cores (ADL-P being the
> > > first such platform)?
> >
> > Right.
> >
> > I don't think the patch could bring any change related.
> > It only affects hybrid platforms.
> 
> Is this topology fix patch the only patch applied? or together with some other patches?
Only this patch.
> 
> I can hardly imagine that the fix patch can trigger such issues, so I suspect they are
> intermittent issues. say is the regression 100% reproducible?
This is not a regression. I assume drm-tip is missing this patch (as it was not part of 6.3-rc yet).

> does the warning/failure ever show without the patch?
Yes, it is seen on all (3) of our local systems.
> 
> BTW, I just happened to see this thread
> https://lore.kernel.org/all/DM8PR11MB565580BCF44661B6A392F0CEE08B9@DM8PR11MB5655.namprd11.prod.outlook.com/
> If the problem on hand has been verified to be not related with the topology fix, can
> we update in this thread as well?
> https://lore.kernel.org/all/20230323015640.27906-1-rui.zhang@intel.com/
> This is another issue that the patch fixes. And it's better to have a Buglink/Tested-by
> tag in the commit.
> 
> thanks,
> rui
> 
> >
> > Thanks,
> > rui
> > > Thanks,
> > > Imre
> > >
> > > On Mon, Mar 27, 2023 at 07:02:25PM +0000, Patchwork wrote:
> > > > == Series Details ==
> > > >
> > > > Series: x86/topology: fix erroneous smp_num_siblings on Intel
> > > > Hybrid platform
> > > > URL   : https://patchwork.freedesktop.org/series/115661/
> > > > State : failure
> > > >
> > > > == Summary ==
> > > >
> > > > CI Bug Log - changes from CI_DRM_12921 -> Patchwork_115661v1
> > > > ====================================================
> > > >
> > > > Summary
> > > > -------
> > > >
> > > >   **FAILURE**
> > > >
> > > > >   Serious unknown changes coming with Patchwork_115661v1 absolutely need to be
> > > > >   verified manually.
> > > > >
> > > > >   If you think the reported changes have nothing to do with the changes
> > > > >   introduced in Patchwork_115661v1, please notify your bug team to allow them
> > > > >   to document this new failure mode, which will reduce false positives in CI.
> > > > >
> > > > >   External URL:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/index.html
> > > >
> > > > Participating hosts (37 -> 37)
> > > > ------------------------------
> > > >
> > > >   No changes in participating hosts
> > > >
> > > > Possible new issues
> > > > -------------------
> > > >
> > > >   Here are the unknown changes that may have been introduced in
> > > > Patchwork_115661v1:
> > > >
> > > > ### IGT changes ###
> > > >
> > > > #### Possible regressions ####
> > > >
> > > >   * igt@i915_selftest@live@hangcheck:
> > > >     - fi-hsw-4770:        [PASS][1] -> [DMESG-WARN][2]
> > > >    [1]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html
> > > > >    [2]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html
> > > >
> > > >
> > > > #### Suppressed ####
> > > >
> > > >   The following results come from untrusted machines, tests, or
> > > > statuses.
> > > >   They do not affect the overall result.
> > > >
> > > >   * igt@fbdev@info:
> > > >     - {bat-kbl-2}:        [SKIP][3] ([fdo#109271]) -> [ABORT][4]
> > > >    [3]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-kbl-2/igt@fbdev@info.html
> > > > >    [4]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-kbl-2/igt@fbdev@info.html
> > > >
> > > >
> > > > Known issues
> > > > ------------
> > > >
> > > >   Here are the changes found in Patchwork_115661v1 that come from
> > > > known issues:
> > > >
> > > > ### IGT changes ###
> > > >
> > > > #### Issues hit ####
> > > >
> > > > >   * igt@gem_exec_suspend@basic-s3@lmem0:
> > > > >     - bat-dg2-11:         [PASS][5] -> [INCOMPLETE][6] ([i915#6311])
> > > > >    [5]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-dg2-11/igt@gem_exec_suspend@basic-s3@lmem0.html
> > > > >    [6]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-dg2-11/igt@gem_exec_suspend@basic-s3@lmem0.html
> > > > >
> > > > >   * igt@i915_selftest@live@reset:
> > > > >     - bat-rpls-1:         [PASS][7] -> [ABORT][8] ([i915#4983])
> > > > >    [7]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-rpls-1/igt@i915_selftest@live@reset.html
> > > > >    [8]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-rpls-1/igt@i915_selftest@live@reset.html
> > > > >
> > > > >   * igt@i915_selftest@live@slpc:
> > > > >     - bat-rpls-2:         NOTRUN -> [DMESG-FAIL][9] ([i915#6367] / [i915#7913] / [i915#7996])
> > > > >    [9]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-rpls-2/igt@i915_selftest@live@slpc.html
> > > > >
> > > > >   * igt@i915_suspend@basic-s3-without-i915:
> > > > >     - bat-rpls-2:         NOTRUN -> [ABORT][10] ([i915#6687] / [i915#7978])
> > > > >    [10]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-rpls-2/igt@i915_suspend@basic-s3-without-i915.html
> > > > >
> > > > >   * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1:
> > > > >     - bat-dg2-8:          [PASS][11] -> [FAIL][12] ([i915#7932])
> > > > >    [11]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1.html
> > > > >    [12]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1.html
> > > >
> > > >
> > > > #### Possible fixes ####
> > > >
> > > > >   * igt@i915_selftest@live@gt_heartbeat:
> > > > >     - fi-kbl-soraka:      [DMESG-FAIL][13] ([i915#5334] / [i915#7872]) -> [PASS][14]
> > > > >    [13]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html
> > > > >    [14]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html
> > > > >
> > > > >   * igt@i915_selftest@live@reset:
> > > > >     - bat-rpls-2:         [ABORT][15] ([i915#4983] / [i915#7913]) -> [PASS][16]
> > > > >    [15]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-rpls-2/igt@i915_selftest@live@reset.html
> > > > >    [16]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-rpls-2/igt@i915_selftest@live@reset.html
> > > >
> > > >
> > > >   {name}: This element is suppressed. This means it is ignored
> > > > when computing
> > > >           the status of the difference (SUCCESS, WARNING, or
> > > > FAILURE).
> > > >
> > > >   [fdo#109271]:
> > > > https://bugs.freedesktop.org/show_bug.cgi?id=109271
> > > >   [i915#4983]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/4983
> > > >   [i915#5334]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/5334
> > > >   [i915#6311]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/6311
> > > >   [i915#6367]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/6367
> > > >   [i915#6687]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/6687
> > > >   [i915#7872]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/7872
> > > >   [i915#7913]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/7913
> > > >   [i915#7932]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/7932
> > > >   [i915#7978]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/7978
> > > >   [i915#7996]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/7996
> > > >
> > > >
> > > > Build changes
> > > > -------------
> > > >
> > > >   * Linux: CI_DRM_12921 -> Patchwork_115661v1
> > > >
> > > >   CI-20190529: 20190529
> > > >   CI_DRM_12921: 3de6040ce9900a94ec626662d5c6a227b37eeb1c @
> > > > git://anongit.freedesktop.org/gfx-ci/linux
> > > >   IGT_7221: 4b77c6d85024d22ca521d510f8eee574128fe04f @
> > > > https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
> > > >   Patchwork_115661v1: 3de6040ce9900a94ec626662d5c6a227b37eeb1c @
> > > > git://anongit.freedesktop.org/gfx-ci/linux
> > > >
> > > >
> > > > ### Linux commits
> > > >
> > > > 83d9e76610d5 x86/topology: fix erroneous smp_num_siblings on Intel
> > > > Hybrid platform
> > > >
> > > > == Logs ==
> > > >
> > > > For more details see:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/index.html
Saarinen, Jani March 28, 2023, 6:47 a.m. UTC | #6
Hi,

> -----Original Message-----
> From: Saarinen, Jani
> Sent: tiistai 28. maaliskuuta 2023 9.35
> To: Zhang, Rui <rui.zhang@intel.com>; Deak, Imre <imre.deak@intel.com>
> Cc: intel-gfx@lists.freedesktop.org
> Subject: RE: ✗ Fi.CI.BAT: failure for x86/topology: fix erroneous
> smp_num_siblings on Intel Hybrid platform
> 
> Hi.
> > -----Original Message-----
> > From: Zhang, Rui <rui.zhang@intel.com>
> > Sent: tiistai 28. maaliskuuta 2023 4.57
> > To: Deak, Imre <imre.deak@intel.com>
> > Cc: Saarinen, Jani <jani.saarinen@intel.com>;
> > intel-gfx@lists.freedesktop.org
> > Subject: Re: ✗ Fi.CI.BAT: failure for x86/topology: fix erroneous
> > smp_num_siblings on Intel Hybrid platform
> >
> > On Tue, 2023-03-28 at 09:14 +0800, Zhang Rui wrote:
> > > Hi, Imre,
> > >
> > > Thanks for raising this.
> > >
> > > On Tue, 2023-03-28 at 00:26 +0300, Imre Deak wrote:
> > > > Hi Rui,
> > > >
> > > > after applying your
> > > > "x86/topology: fix erroneous smp_num_siblings on Intel Hybrid
> > > > platform"
> > > >
> > > > fix on the drm-tip tree (see the patchwork URL below) the CI tests
> > > > show some regression on a HSW and a KBL machine (see [2] and [4]
> > > > below) in the i915 driver. However I think they can't be related
> > > > to your changes, since on these machines all cores will report the
> > > > same number of CPU siblings.
> > >
> > > Right.
> > >
> > > >  Could you confirm this and that in general the reported siblings
> > > > can only vary on platforms with both E and P cores (ADL-P being
> > > > the first such platform)?
> > >
> > > Right.
> > >
> > > I don't think the patch could bring any change related.
> > > It only affects hybrid platforms.
> >
> > Is this topology fix patch the only patch applied? or together with some other
> patches?
> This only.
> >
> > I can hardly imagine that the fix patch can trigger such issues, so I
> > suspect they are intermittent issues. say is the regression 100% reproducible?
> This is not regression. I assume drm-tip misses this patch (as was not part of 6.3rc
> yet.)
Ignore this comment; I misread the mail. I assume it is hard to say, as this test is run in CI (pre-merge).
Let Imre comment here.
> 
> > does the warning/failure ever show without the patch?
> Yes, On our local (3) system's seen on all.
Again, ignore this too.
> >
> > BTW, I just happened to see this thread
> >
> https://lore.kernel.org/all/DM8PR11MB565580BCF44661B6A392F0CEE08B9@DM8
> > P
> > R11MB5655.namprd11.prod.outlook.com/
> > If the problem on hand has been verified to be not related with the
> > topology fix, can we update in this thread as well?
> > > https://lore.kernel.org/all/20230323015640.27906-1-rui.zhang@intel.com/
> > > This is another issue that the patch fixes. And it's better to have
> > > a Buglink/Tested-by tag in the commit.
> >
> > thanks,
> > rui
> >
> > >
> > > Thanks,
> > > rui
> > > > Thanks,
> > > > Imre
> > > >
> > > > On Mon, Mar 27, 2023 at 07:02:25PM +0000, Patchwork wrote:
> > > > > == Series Details ==
> > > > >
> > > > > Series: x86/topology: fix erroneous smp_num_siblings on Intel
> > > > > Hybrid platform
> > > > > URL   : https://patchwork.freedesktop.org/series/115661/
> > > > > State : failure
> > > > >
> > > > > == Summary ==
> > > > >
> > > > > CI Bug Log - changes from CI_DRM_12921 -> Patchwork_115661v1
> > > > > ====================================================
> > > > >
> > > > > Summary
> > > > > -------
> > > > >
> > > > >   **FAILURE**
> > > > >
> > > > >   Serious unknown changes coming with Patchwork_115661v1
> > > > > absolutely need to be
> > > > >   verified manually.
> > > > >
> > > > >   If you think the reported changes have nothing to do with the
> > > > > changes
> > > > >   introduced in Patchwork_115661v1, please notify your bug team
> > > > > to allow them
> > > > >   to document this new failure mode, which will reduce false
> > > > > positives in CI.
> > > > >
> > > > >   External URL:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/index.
> > > > > html
> > > > >
> > > > > Participating hosts (37 -> 37)
> > > > > ------------------------------
> > > > >
> > > > >   No changes in participating hosts
> > > > >
> > > > > Possible new issues
> > > > > -------------------
> > > > >
> > > > >   Here are the unknown changes that may have been introduced in
> > > > > Patchwork_115661v1:
> > > > >
> > > > > ### IGT changes ###
> > > > >
> > > > > #### Possible regressions ####
> > > > >
> > > > >   * igt@i915_selftest@live@hangcheck:
> > > > >     - fi-hsw-4770:        [PASS][1] -> [DMESG-WARN][2]
> > > > >    [1]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/fi-hsw-
> > 4770/igt@i915_selftest@live@hangcheck.html
> > > > >    [2]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/fi-h
> > > > > sw -4770/igt@i915_selftest@live@hangcheck.html
> > > > >
> > > > >
> > > > > #### Suppressed ####
> > > > >
> > > > >   The following results come from untrusted machines, tests, or
> > > > > statuses.
> > > > >   They do not affect the overall result.
> > > > >
> > > > >   * igt@fbdev@info:
> > > > >     - {bat-kbl-2}:        [SKIP][3] ([fdo#109271]) -> [ABORT][4]
> > > > >    [3]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-kbl-
> > 2/igt@fbdev@info.html
> > > > >    [4]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-
> > > > > kb
> > > > > l-2/igt@fbdev@info.html
> > > > >
> > > > >
> > > > > Known issues
> > > > > ------------
> > > > >
> > > > >   Here are the changes found in Patchwork_115661v1 that come
> > > > > from known issues:
> > > > >
> > > > > ### IGT changes ###
> > > > >
> > > > > #### Issues hit ####
> > > > >
> > > > >   * igt@gem_exec_suspend@basic-s3@lmem0:
> > > > >     - bat-dg2-11:         [PASS][5] -> [INCOMPLETE][6]
> > > > > ([i915#6311])
> > > > >    [5]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-dg2-
> > 11/igt@gem_exec_suspend@basic-s3@lmem0.html
> > > > >    [6]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-
> > > > > dg 2-11/igt@gem_exec_suspend@basic-s3@lmem0.html
> > > > >
> > > > >   * igt@i915_selftest@live@reset:
> > > > >     - bat-rpls-1:         [PASS][7] -> [ABORT][8] ([i915#4983])
> > > > >    [7]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-rpls-
> > 1/igt@i915_selftest@live@reset.html
> > > > >    [8]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-
> > > > > rp ls-1/igt@i915_selftest@live@reset.html
> > > > >
> > > > >   * igt@i915_selftest@live@slpc:
> > > > >     - bat-rpls-2:         NOTRUN -> [DMESG-FAIL][9] ([i915#6367]
> > > > > /
> > > > > [i915#7913] / [i915#7996])
> > > > >    [9]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-
> > > > > rp ls-2/igt@i915_selftest@live@slpc.html
> > > > >
> > > > >   * igt@i915_suspend@basic-s3-without-i915:
> > > > >     - bat-rpls-2:         NOTRUN -> [ABORT][10] ([i915#6687] /
> > > > > [i915#7978])
> > > > >    [10]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-
> > > > > rp ls-2/igt@i915_suspend@basic-s3-without-i915.html
> > > > >
> > > > >   *
> > > > > igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-
> > > > > dp-1:
> > > > >     - bat-dg2-8:          [PASS][11] -> [FAIL][12] ([i915#7932])
> > > > >    [11]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-dg2-
> > 8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1.ht
> > ml
> > > > >    [12]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-
> > > > > dg
> > > > > 2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d
> > > > > -d
> > > > > p-1.html
> > > > >
> > > > >
> > > > > #### Possible fixes ####
> > > > >
> > > > >   * igt@i915_selftest@live@gt_heartbeat:
> > > > >     - fi-kbl-soraka:      [DMESG-FAIL][13] ([i915#5334] /
> > > > > [i915#7872]) -> [PASS][14]
> > > > >    [13]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/fi-kbl-
> > soraka/igt@i915_selftest@live@gt_heartbeat.html
> > > > >    [14]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/fi-k
> > > > > bl -soraka/igt@i915_selftest@live@gt_heartbeat.html
> > > > >
> > > > >   * igt@i915_selftest@live@reset:
> > > > >     - bat-rpls-2:         [ABORT][15] ([i915#4983] / [i915#7913])
> > > > > -> [PASS][16]
> > > > >    [15]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-rpls-
> > 2/igt@i915_selftest@live@reset.html
> > > > >    [16]:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-
> > > > > rp ls-2/igt@i915_selftest@live@reset.html
> > > > >
> > > > >
> > > > >   {name}: This element is suppressed. This means it is ignored
> > > > > when computing
> > > > >           the status of the difference (SUCCESS, WARNING, or
> > > > > FAILURE).
> > > > >
> > > > >   [fdo#109271]:
> > > > > https://bugs.freedesktop.org/show_bug.cgi?id=109271
> > > > >   [i915#4983]:
> > > > > https://gitlab.freedesktop.org/drm/intel/issues/4983
> > > > >   [i915#5334]:
> > > > > https://gitlab.freedesktop.org/drm/intel/issues/5334
> > > > >   [i915#6311]:
> > > > > https://gitlab.freedesktop.org/drm/intel/issues/6311
> > > > >   [i915#6367]:
> > > > > https://gitlab.freedesktop.org/drm/intel/issues/6367
> > > > >   [i915#6687]:
> > > > > https://gitlab.freedesktop.org/drm/intel/issues/6687
> > > > >   [i915#7872]:
> > > > > https://gitlab.freedesktop.org/drm/intel/issues/7872
> > > > >   [i915#7913]:
> > > > > https://gitlab.freedesktop.org/drm/intel/issues/7913
> > > > >   [i915#7932]:
> > > > > https://gitlab.freedesktop.org/drm/intel/issues/7932
> > > > >   [i915#7978]:
> > > > > https://gitlab.freedesktop.org/drm/intel/issues/7978
> > > > >   [i915#7996]:
> > > > > https://gitlab.freedesktop.org/drm/intel/issues/7996
> > > > >
> > > > >
> > > > > Build changes
> > > > > -------------
> > > > >
> > > > >   * Linux: CI_DRM_12921 -> Patchwork_115661v1
> > > > >
> > > > >   CI-20190529: 20190529
> > > > >   CI_DRM_12921: 3de6040ce9900a94ec626662d5c6a227b37eeb1c @
> > > > > git://anongit.freedesktop.org/gfx-ci/linux
> > > > >   IGT_7221: 4b77c6d85024d22ca521d510f8eee574128fe04f @
> > > > > https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
> > > > >   Patchwork_115661v1: 3de6040ce9900a94ec626662d5c6a227b37eeb1c @
> > > > > git://anongit.freedesktop.org/gfx-ci/linux
> > > > >
> > > > >
> > > > > ### Linux commits
> > > > >
> > > > > 83d9e76610d5 x86/topology: fix erroneous smp_num_siblings on
> > > > > Intel Hybrid platform
> > > > >
> > > > > == Logs ==
> > > > >
> > > > > For more details see:
> > > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/index.
> > > > > html
Imre Deak March 28, 2023, 7:23 a.m. UTC | #7
On Tue, Mar 28, 2023 at 04:57:14AM +0300, Zhang, Rui wrote:
> On Tue, 2023-03-28 at 09:14 +0800, Zhang Rui wrote:
> > Hi, Imre,
> >
> > Thanks for raising this.
> >
> > On Tue, 2023-03-28 at 00:26 +0300, Imre Deak wrote:
> > > Hi Rui,
> > >
> > > after applying your
> > > "x86/topology: fix erroneous smp_num_siblings on Intel Hybrid platform"
> > >
> > > fix on the drm-tip tree (see the patchwork URL below) the CI tests
> > > show some regression on a HSW and a KBL machine (see [2] and [4]
> > > below) in the i915 driver. However I think they can't be related
> > > to your changes, since on these machines all cores will report the
> > > same number of CPU siblings.
> >
> > Right.
> >
> > >  Could you confirm this and that in general the reported siblings
> > >  can only vary on platforms with both E and P cores (ADL-P being
> > >  the first such platform)?
> >
> > Right.
> >
> > I don't think the patch could bring any change related.
> > It only affects hybrid platforms.

Ok, thanks for confirming.

> Is this topology fix patch the only patch applied?

Yes.

> or together with some other patches?
> 
> I can hardly imagine that the fix patch can trigger such issues, so I
> suspect they are intermittent issues. say
> is the regression 100% reproducible?

No, the problems reported by CI here happened already earlier, before
this patch was applied, so they seem to be sporadic. I don't think
either that they are related to the fix; nevertheless wanted to get
the above clarification from you.

> does the warning/failure ever show without the patch?

Yes, they also happened in CI builds before the patch was applied.

> BTW, I just happened to see this thread
> https://lore.kernel.org/all/DM8PR11MB565580BCF44661B6A392F0CEE08B9@DM8PR11MB5655.namprd11.prod.outlook.com/
> If the problem on hand has been verified to be not related with the
> topology fix, can we update in this thread as well?

This email is in the same thread as the above message.

> https://lore.kernel.org/all/20230323015640.27906-1-rui.zhang@intel.com/
> This is another issue that the patch fixes. 

I added the above link to the patch with a References: trailer.

> And it's better to have a Buglink/Tested-by tag in the commit.

The patch has a link to a bug Jani opened, and his Tested-by can
be added while the fix is applied.

Thanks,
Imre

> thanks,
> rui
> 
> >
> > Thanks,
> > rui
> > > Thanks,
> > > Imre
> > >
> > > On Mon, Mar 27, 2023 at 07:02:25PM +0000, Patchwork wrote:
> > > > == Series Details ==
> > > >
> > > > Series: x86/topology: fix erroneous smp_num_siblings on Intel
> > > > Hybrid platform
> > > > URL   : https://patchwork.freedesktop.org/series/115661/
> > > > State : failure
> > > >
> > > > == Summary ==
> > > >
> > > > CI Bug Log - changes from CI_DRM_12921 -> Patchwork_115661v1
> > > > ====================================================
> > > >
> > > > Summary
> > > > -------
> > > >
> > > >   **FAILURE**
> > > >
> > > >   Serious unknown changes coming with Patchwork_115661v1
> > > > absolutely
> > > > need to be
> > > >   verified manually.
> > > >
> > > >   If you think the reported changes have nothing to do with the
> > > > changes
> > > >   introduced in Patchwork_115661v1, please notify your bug team
> > > > to
> > > > allow them
> > > >   to document this new failure mode, which will reduce false
> > > > positives in CI.
> > > >
> > > >   External URL:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/index.html
> > > >
> > > > Participating hosts (37 -> 37)
> > > > ------------------------------
> > > >
> > > >   No changes in participating hosts
> > > >
> > > > Possible new issues
> > > > -------------------
> > > >
> > > >   Here are the unknown changes that may have been introduced in
> > > > Patchwork_115661v1:
> > > >
> > > > ### IGT changes ###
> > > >
> > > > #### Possible regressions ####
> > > >
> > > >   * igt@i915_selftest@live@hangcheck:
> > > >     - fi-hsw-4770:        [PASS][1] -> [DMESG-WARN][2]
> > > >    [1]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html
> > > >    [2]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html
> > > >
> > > >
> > > > #### Suppressed ####
> > > >
> > > >   The following results come from untrusted machines, tests, or
> > > > statuses.
> > > >   They do not affect the overall result.
> > > >
> > > >   * igt@fbdev@info:
> > > >     - {bat-kbl-2}:        [SKIP][3] ([fdo#109271]) -> [ABORT][4]
> > > >    [3]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-kbl-2/igt@fbdev@info.html
> > > >    [4]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-kbl-2/igt@fbdev@info.html
> > > >
> > > >
> > > > Known issues
> > > > ------------
> > > >
> > > >   Here are the changes found in Patchwork_115661v1 that come from
> > > > known issues:
> > > >
> > > > ### IGT changes ###
> > > >
> > > > #### Issues hit ####
> > > >
> > > >   * igt@gem_exec_suspend@basic-s3@lmem0:
> > > >     - bat-dg2-11:         [PASS][5] -> [INCOMPLETE][6]
> > > > ([i915#6311])
> > > >    [5]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-dg2-11/igt@gem_exec_suspend@basic-s3@lmem0.html
> > > >    [6]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-dg2-11/igt@gem_exec_suspend@basic-s3@lmem0.html
> > > >
> > > >   * igt@i915_selftest@live@reset:
> > > >     - bat-rpls-1:         [PASS][7] -> [ABORT][8] ([i915#4983])
> > > >    [7]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-rpls-1/igt@i915_selftest@live@reset.html
> > > >    [8]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-rpls-1/igt@i915_selftest@live@reset.html
> > > >
> > > >   * igt@i915_selftest@live@slpc:
> > > >     - bat-rpls-2:         NOTRUN -> [DMESG-FAIL][9] ([i915#6367]
> > > > /
> > > > [i915#7913] / [i915#7996])
> > > >    [9]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-rpls-2/igt@i915_selftest@live@slpc.html
> > > >
> > > >   * igt@i915_suspend@basic-s3-without-i915:
> > > >     - bat-rpls-2:         NOTRUN -> [ABORT][10] ([i915#6687] /
> > > > [i915#7978])
> > > >    [10]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-rpls-2/igt@i915_suspend@basic-s3-without-i915.html
> > > >
> > > >   * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-
> > > > dp-1:
> > > >     - bat-dg2-8:          [PASS][11] -> [FAIL][12] ([i915#7932])
> > > >    [11]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1.html
> > > >    [12]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1.html
> > > >
> > > >
> > > > #### Possible fixes ####
> > > >
> > > >   * igt@i915_selftest@live@gt_heartbeat:
> > > >     - fi-kbl-soraka:      [DMESG-FAIL][13] ([i915#5334] /
> > > > [i915#7872]) -> [PASS][14]
> > > >    [13]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html
> > > >    [14]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html
> > > >
> > > >   * igt@i915_selftest@live@reset:
> > > >     - bat-rpls-2:         [ABORT][15] ([i915#4983] / [i915#7913])
> > > > -> [PASS][16]
> > > >    [15]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12921/bat-rpls-2/igt@i915_selftest@live@reset.html
> > > >    [16]:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/bat-rpls-2/igt@i915_selftest@live@reset.html
> > > >
> > > >
> > > >   {name}: This element is suppressed. This means it is ignored
> > > > when
> > > > computing
> > > >           the status of the difference (SUCCESS, WARNING, or
> > > > FAILURE).
> > > >
> > > >   [fdo#109271]:
> > > > https://bugs.freedesktop.org/show_bug.cgi?id=109271
> > > >   [i915#4983]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/4983
> > > >   [i915#5334]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/5334
> > > >   [i915#6311]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/6311
> > > >   [i915#6367]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/6367
> > > >   [i915#6687]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/6687
> > > >   [i915#7872]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/7872
> > > >   [i915#7913]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/7913
> > > >   [i915#7932]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/7932
> > > >   [i915#7978]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/7978
> > > >   [i915#7996]:
> > > > https://gitlab.freedesktop.org/drm/intel/issues/7996
> > > >
> > > >
> > > > Build changes
> > > > -------------
> > > >
> > > >   * Linux: CI_DRM_12921 -> Patchwork_115661v1
> > > >
> > > >   CI-20190529: 20190529
> > > >   CI_DRM_12921: 3de6040ce9900a94ec626662d5c6a227b37eeb1c @
> > > > git://anongit.freedesktop.org/gfx-ci/linux
> > > >   IGT_7221: 4b77c6d85024d22ca521d510f8eee574128fe04f @
> > > > https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
> > > >   Patchwork_115661v1: 3de6040ce9900a94ec626662d5c6a227b37eeb1c @
> > > > git://anongit.freedesktop.org/gfx-ci/linux
> > > >
> > > >
> > > > ### Linux commits
> > > >
> > > > 83d9e76610d5 x86/topology: fix erroneous smp_num_siblings on
> > > > Intel
> > > > Hybrid platform
> > > >
> > > > == Logs ==
> > > >
> > > > For more details see:
> > > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115661v1/index.html
Jani Nikula March 28, 2023, 3:59 p.m. UTC | #8
On Mon, 27 Mar 2023, Imre Deak <imre.deak@intel.com> wrote:
> From: Zhang Rui <rui.zhang@intel.com>
>
> The SMT siblings value returned by CPUID.1F SMT level EBX differs
> among CPUs on Intel Hybrid platforms like AlderLake and MeteorLake.
> It returns 2 for Pcore CPUs which have SMT siblings and returns 1 for
> Ecore CPUs which do not have SMT siblings.
>
> Today, the CPU boot code sets the global variable smp_num_siblings when
> every CPU thread is brought up. The last thread to boot will overwrite
> it with the number of siblings of *that* thread. That last thread to
> boot will "win". If the thread is a Pcore, smp_num_siblings == 2.  If it
> is an Ecore, smp_num_siblings == 1.
>
> smp_num_siblings describes if the *system* supports SMT.  It should
> specify the maximum number of SMT threads among all cores.
>
> Ensure that smp_num_siblings represents the system-wide maximum number
> of siblings by always increasing its value. Never allow it to decrease.
>
> On MeteorLake-P platform, this fixes a problem that the Ecore CPUs are
> not updated in any cpu sibling map because the system is treated as an
> UP system when probing Ecore CPUs.
>
> Below shows part of the CPU topology information before and after the
> fix, for both Pcore and Ecore CPU (cpu0 is Pcore, cpu 12 is Ecore).
> ...
> -/sys/devices/system/cpu/cpu0/topology/package_cpus:000fff
> -/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-11
> +/sys/devices/system/cpu/cpu0/topology/package_cpus:3fffff
> +/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-21
> ...
> -/sys/devices/system/cpu/cpu12/topology/package_cpus:001000
> -/sys/devices/system/cpu/cpu12/topology/package_cpus_list:12
> +/sys/devices/system/cpu/cpu12/topology/package_cpus:3fffff
> +/sys/devices/system/cpu/cpu12/topology/package_cpus_list:0-21
>
> And this also breaks userspace tools like lscpu
> -Core(s) per socket:  1
> -Socket(s):           11
> +Core(s) per socket:  16
> +Socket(s):           1
>
> CC: stable@kernel.org
> Fixes: bbb65d2d365e ("x86: use cpuid vector 0xb when available for detecting cpu topology")
> Fixes: 95f3d39ccf7a ("x86/cpu/topology: Provide detect_extended_topology_early()")
> Suggested-by: Len Brown <len.brown@intel.com>
> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> [Imre: resend for core-for-CI]
> References: https://lore.kernel.org/lkml/20230323015640.27906-1-rui.zhang@intel.com
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/8317
> Signed-off-by: Imre Deak <imre.deak@intel.com>

Pushed to topic/core-for-CI as a stopgap measure.

BR,
Jani.

> ---
>  arch/x86/kernel/cpu/topology.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
> index 5e868b62a7c4e..0270925fe013b 100644
> --- a/arch/x86/kernel/cpu/topology.c
> +++ b/arch/x86/kernel/cpu/topology.c
> @@ -79,7 +79,7 @@ int detect_extended_topology_early(struct cpuinfo_x86 *c)
>  	 * initial apic id, which also represents 32-bit extended x2apic id.
>  	 */
>  	c->initial_apicid = edx;
> -	smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
> +	smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
>  #endif
>  	return 0;
>  }
> @@ -109,7 +109,8 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
>  	 */
>  	cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
>  	c->initial_apicid = edx;
> -	core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
> +	core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
> +	smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
>  	core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
>  	die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
>  	pkg_mask_width = die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
diff mbox series

Patch

diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 5e868b62a7c4e..0270925fe013b 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -79,7 +79,7 @@  int detect_extended_topology_early(struct cpuinfo_x86 *c)
 	 * initial apic id, which also represents 32-bit extended x2apic id.
 	 */
 	c->initial_apicid = edx;
-	smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
+	smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
 #endif
 	return 0;
 }
@@ -109,7 +109,8 @@  int detect_extended_topology(struct cpuinfo_x86 *c)
 	 */
 	cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
 	c->initial_apicid = edx;
-	core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
+	core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
+	smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx));
 	core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
 	die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
 	pkg_mask_width = die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);