
[RFC,v2,0/4] hw/arm/virt: Introduce cluster cpu topology support

Message ID 20210413083147.34236-1-wangyanan55@huawei.com (mailing list archive)

Yanan Wang April 13, 2021, 8:31 a.m. UTC
Hi,

This series is a new version of [0]. It introduces cluster cpu topology
support for the ARM platform, in addition to the existing sockets, cores,
and threads. The code has been rewritten on top of patch series [1].
[0] https://patchwork.kernel.org/project/qemu-devel/cover/20210331095343.12172-1-wangyanan55@huawei.com/
[1] https://patchwork.kernel.org/project/qemu-devel/cover/20210413080745.33004-1-wangyanan55@huawei.com/

Changelogs:
v1->v2:
- Only focus on cluster support for ARM platform
- Rebase the code on patch series [1]

Description:
A cluster is a group of cores that share some resources (e.g. cache)
below the LLC level. For example, the ARM64 server chip Kunpeng 920 has
6 or 8 clusters in each NUMA node, and each cluster has 4 cores. All
clusters share L3 cache data, while the cores within each cluster share
the L2 cache.

The cache affinity of clusters has been shown to improve Linux kernel
scheduling performance, and a patchset [2] has already been posted that
adds a general sched_domain for clusters and a cluster level to the
arch-neutral cpu topology struct, as shown below.
struct cpu_topology {
    int thread_id;
    int core_id;
    int cluster_id;
    int package_id;
    int llc_id;
    cpumask_t thread_sibling;
    cpumask_t core_sibling;
    cpumask_t cluster_sibling;
    cpumask_t llc_sibling;
};

Also, the kernel doc [3], Documentation/devicetree/bindings/cpu/cpu-topology.txt,
defines a four-level CPU topology hierarchy: socket/cluster/core/thread.
According to it, the child nodes of a socket node must be one or more
cluster nodes, and the child nodes of a cluster node must be either one or
more cluster nodes or one or more core nodes.
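
As a rough illustration of that hierarchy, the sketch below maps a linear
CPU index to a cpu-map node path. It is not code from this series: the
struct and function names are hypothetical, and it assumes CPUs are numbered
thread-first, then by core, cluster, and socket. Thread nodes are only
emitted when there is more than one thread per core.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of an -smp layout (not a QEMU type). */
struct topo_layout {
    unsigned sockets;
    unsigned clusters; /* clusters per socket */
    unsigned cores;    /* cores per cluster */
    unsigned threads;  /* threads per core */
};

/*
 * Format the cpu-map path of a linear CPU index into buf.
 * Assumption: CPUs are numbered thread-first, then core, cluster, socket.
 * Returns the value returned by snprintf().
 */
static int cpu_map_path(const struct topo_layout *l, unsigned cpu,
                        char *buf, size_t len)
{
    assert(l->sockets && l->clusters && l->cores && l->threads);

    unsigned thread  = cpu % l->threads;
    unsigned core    = (cpu / l->threads) % l->cores;
    unsigned cluster = (cpu / (l->threads * l->cores)) % l->clusters;
    unsigned socket  = cpu / (l->threads * l->cores * l->clusters);

    assert(socket < l->sockets);

    if (l->threads > 1) {
        /* SMT: the leaf is a thread node under the core node. */
        return snprintf(buf, len,
                        "/cpus/cpu-map/socket%u/cluster%u/core%u/thread%u",
                        socket, cluster, core, thread);
    }
    /* No SMT: the core node itself is the leaf. */
    return snprintf(buf, len, "/cpus/cpu-map/socket%u/cluster%u/core%u",
                    socket, cluster, core);
}
```

With sockets=2, clusters=6, cores=4, threads=2, CPU 0 maps to
/cpus/cpu-map/socket0/cluster0/core0/thread0 under these assumptions.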

So let's add -smp, clusters=* command line support for ARM cpus, so that a
future guest OS can make use of the cluster cpu topology for better
scheduling performance. For ARM machines, a four-level cpu hierarchy can
then be defined: sockets/clusters/cores/threads.
[2] https://patchwork.kernel.org/project/linux-arm-kernel/cover/20210319041618.14316-1-song.bao.hua@hisilicon.com/
[3] https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/cpu/cpu-topology.txt

Test results:
After applying this patch series, launch a guest with the virt-6.0 machine
type and the cpu topology configured with:
-smp 96,sockets=2,clusters=6,cores=4,threads=2
The VM's cpu topology is then reported as below.
lscpu:
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              96
On-line CPU(s) list: 0-95
Thread(s) per core:  2
Core(s) per socket:  24
Socket(s):           2
NUMA node(s):        1
Vendor ID:           0x48
Model:               0
Stepping:            0x1
BogoMIPS:            200.00
NUMA node0 CPU(s):   0-95

Topology information of clusters can also be obtained:
cat /sys/devices/system/cpu/cpu0/topology/cluster_cpus_list: 0-7
cat /sys/devices/system/cpu/cpu0/topology/cluster_id: 56

cat /sys/devices/system/cpu/cpu8/topology/cluster_cpus_list: 8-15
cat /sys/devices/system/cpu/cpu8/topology/cluster_id: 316
...
cat /sys/devices/system/cpu/cpu95/topology/cluster_cpus_list: 88-95
cat /sys/devices/system/cpu/cpu95/topology/cluster_id: 2936
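
The numbers above can be cross-checked with a little arithmetic. The sketch
below is only an illustration, not code from this series: the names are
hypothetical, and it assumes CPUs are numbered linearly so that each cluster
covers a contiguous range of cores * threads CPUs. It does not model the
cluster_id values, which the guest kernel assigns.

```c
#include <assert.h>

/* Hypothetical container for the -smp values (not a QEMU type). */
struct smp_values {
    unsigned cpus;
    unsigned sockets;
    unsigned clusters; /* clusters per socket */
    unsigned cores;    /* cores per cluster */
    unsigned threads;  /* threads per core */
};

/* Nonzero if sockets * clusters * cores * threads equals cpus. */
static int smp_values_consistent(const struct smp_values *s)
{
    return s->sockets * s->clusters * s->cores * s->threads == s->cpus;
}

/* Cores per socket, as lscpu reports it: clusters * cores. */
static unsigned cores_per_socket(const struct smp_values *s)
{
    return s->clusters * s->cores;
}

/*
 * First and last CPU sharing a cluster with cpu.
 * Assumption: linear numbering, so a cluster spans cores * threads
 * consecutive CPU indexes.
 */
static void cluster_cpu_range(const struct smp_values *s, unsigned cpu,
                              unsigned *first, unsigned *last)
{
    unsigned per_cluster = s->cores * s->threads;

    assert(per_cluster > 0 && cpu < s->cpus);
    *first = cpu / per_cluster * per_cluster;
    *last = *first + per_cluster - 1;
}
```

For the configuration tested here, 2 * 6 * 4 * 2 = 96 CPUs, 6 * 4 = 24 cores
per socket, and 4 * 2 = 8 CPUs per cluster, which matches the lscpu output
and the cluster_cpus_list ranges 0-7, 8-15, and 88-95.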

Yanan Wang (4):
  vl.c: Add -smp, clusters=* command line support for ARM cpu
  hw/arm/virt: Parse -smp cluster parameter in virt_smp_parse
  hw/arm/virt-acpi-build: Add cluster level for PPTT table
  hw/arm/virt: Add cluster level for device tree

 hw/arm/virt-acpi-build.c | 55 ++++++++++++++++++++++++----------------
 hw/arm/virt.c            | 44 +++++++++++++++++++-------------
 include/hw/arm/virt.h    |  1 +
 qemu-options.hx          | 26 +++++++++++--------
 softmmu/vl.c             |  3 +++
 5 files changed, 78 insertions(+), 51 deletions(-)