Message ID | 20211121122502.9844-1-wangyanan55@huawei.com (mailing list archive) |
---|---|
Series | ARM virt: Introduce CPU clusters topology support |
Ping...

On 2021/11/21 20:24, Yanan Wang wrote:
> Hi,
>
> This series introduces the new CPU clusters topology parameter
> and enables support for it on ARM virt machines.
>
> Background and descriptions:
> The new Cluster-Aware Scheduling support has landed in Linux 5.16,
> and has been shown to benefit scheduling performance (e.g. the
> load balance and wake_affine strategies) on both x86_64 and AArch64.
> See Kernel PR [1] and the latest patch set [2] for reference.
>
> So now in Linux 5.16 we have a four-level arch-neutral CPU topology
> definition like the one below, and a new scheduler level for clusters.
> struct cpu_topology {
>     int thread_id;
>     int core_id;
>     int cluster_id;
>     int package_id;
>     int llc_id;
>     cpumask_t thread_sibling;
>     cpumask_t core_sibling;
>     cpumask_t cluster_sibling;
>     cpumask_t llc_sibling;
> };
>
> A cluster generally means a group of CPU cores which share an L2 cache
> or other mid-level resources, and it is these shared resources that
> are used to improve the scheduler's behavior. In terms of size, a
> cluster sits between a CPU die and a CPU core. For example, on
> some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node,
> and 4 CPU cores in each cluster. The 4 CPU cores share a separate
> L2 cache and an L3 cache tag, which brings a cache affinity advantage.
>
> [1] https://lore.kernel.org/lkml/163572864855.3357115.17938524897008353101.tglx@xen13/
> [2] https://lkml.org/lkml/2021/9/24/178
>
> In virtualization, on hosts which have pClusters, if we design
> a vCPU topology with a cluster level for the guest kernel and
> use dedicated vCPU pinning, a cluster-aware guest kernel can
> also make use of the cache affinity of CPU clusters to gain
> similar scheduling performance.
>
> This series consists of two parts:
> The first part (patches 1-3):
> Implement infrastructure for CPU cluster level topology support,
> including the SMP documentation, configuration and parsing.
>
> The second part (patches 4-10):
> Enable CPU cluster support on ARM virt machines, so that users
> can specify a 4-level CPU hierarchy sockets/clusters/cores/threads.
> The 4-level topology will be described to the guest kernel through
> ACPI PPTT and DT cpu-map.
>
> Changelog:
> v3->v4:
> - Significant change from v3 to v4, since the whole series is reworked
>   based on the latest QEMU SMP framework.
> - v3: https://lore.kernel.org/qemu-devel/20210516103228.37792-1-wangyanan55@huawei.com/
>
> Yanan Wang (10):
>   qemu-options: Improve readability of SMP related Docs
>   hw/core/machine: Introduce CPU cluster topology support
>   hw/core/machine: Wrap target specific parameters together
>   hw/arm/virt: Support clusters on ARM virt machines
>   hw/arm/virt: Support cluster level in DT cpu-map
>   hw/acpi/aml-build: Improve scalability of PPTT generation
>   hw/arm/virt-acpi-build: Make an ARM specific PPTT generator
>   tests/acpi/bios-tables-test: Allow changes to virt/PPTT file
>   hw/acpi/virt-acpi-build: Support cluster level in PPTT generation
>   tests/acpi/bios-table-test: Update expected virt/PPTT file
>
>  hw/acpi/aml-build.c         | 66 ++------------------------
>  hw/arm/virt-acpi-build.c    | 92 +++++++++++++++++++++++++++++++++++-
>  hw/arm/virt.c               | 16 ++++---
>  hw/core/machine-smp.c       | 29 +++++++++---
>  hw/core/machine.c           |  3 ++
>  include/hw/acpi/aml-build.h |  5 +-
>  include/hw/boards.h         |  6 ++-
>  qapi/machine.json           |  5 +-
>  qemu-options.hx             | 91 +++++++++++++++++++++++++++--------
>  softmmu/vl.c                |  3 ++
>  tests/data/acpi/virt/PPTT   | Bin 76 -> 96 bytes
>  11 files changed, 214 insertions(+), 102 deletions(-)
>
> --
> 2.19.1
>
> .
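For readers following the thread, here is a minimal sketch of how a linear vCPU index could be decomposed into the 4-level sockets/clusters/cores/threads hierarchy described in the cover letter. It is not code from the series. The type and function names (`topo_shape`, `topo_ids`, `topo_total_cpus`, `topo_ids_from_index`) are hypothetical, and the numbering scheme (threads vary fastest, IDs are relative to the parent level) is an assumption made for illustration only; QEMU and the guest kernel may number CPUs differently.

```c
#include <assert.h>

/* Hypothetical illustration types; not taken from QEMU or the kernel. */
struct topo_shape {
    unsigned sockets;   /* sockets in the machine */
    unsigned clusters;  /* clusters per socket */
    unsigned cores;     /* cores per cluster */
    unsigned threads;   /* threads per core */
};

struct topo_ids {
    unsigned socket_id;
    unsigned cluster_id; /* relative to its socket */
    unsigned core_id;    /* relative to its cluster */
    unsigned thread_id;  /* relative to its core */
};

/* Total number of CPUs is the product of all four levels. */
static unsigned topo_total_cpus(const struct topo_shape *s)
{
    return s->sockets * s->clusters * s->cores * s->threads;
}

/*
 * Split a linear CPU index into per-level IDs.
 * Assumption: threads vary fastest, then cores, clusters, sockets.
 */
static struct topo_ids topo_ids_from_index(const struct topo_shape *s,
                                           unsigned index)
{
    struct topo_ids ids;

    assert(index < topo_total_cpus(s));

    ids.thread_id = index % s->threads;
    index /= s->threads;
    ids.core_id = index % s->cores;
    index /= s->cores;
    ids.cluster_id = index % s->clusters;
    index /= s->clusters;
    ids.socket_id = index;

    return ids;
}
```

Under these assumptions, a shape of 2 sockets, 2 clusters per socket, 2 cores per cluster and 1 thread per core gives 8 CPUs, and index 5 maps to socket 1, cluster 0, core 1, thread 0.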
I have sent a v5 with four new patches added, so this v4 can be ignored.

v5: https://patchew.org/QEMU/20211228092221.21068-1-wangyanan55@huawei.com/

Thanks,
Yanan