From patchwork Tue Feb 20 09:24:56 2024
Content-Type: text/plain; charset="utf-8"
X-Patchwork-Submitter: Zhao Liu
X-Patchwork-Id: 13563696
From: Zhao Liu
To: Daniel P. Berrangé, Eduardo Habkost, Marcel Apfelbaum,
 Philippe Mathieu-Daudé, Yanan Wang, Michael S. Tsirkin, Paolo Bonzini,
 Richard Henderson, Eric Blake, Markus Armbruster, Marcelo Tosatti,
 Alex Bennée, Peter Maydell, Jonathan Cameron, Sia Jee Heng
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, qemu-riscv@nongnu.org,
 qemu-arm@nongnu.org, Zhenyu Wang, Dapeng Mi, Yongwei Ma, Zhao Liu
Subject: [RFC 0/8] Introduce SMP Cache Topology
Date: Tue, 20 Feb 2024 17:24:56 +0800
Message-Id: <20240220092504.726064-1-zhao1.liu@linux.intel.com>
X-Mailing-List: kvm@vger.kernel.org

Hi list,

This is our proposal for supporting (SMP) cache topology in -smp, as in
the following example:

-smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
     l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die

With the new cache topology options ("l1d-cache", "l1i-cache",
"l2-cache" and "l3-cache"), we can adjust the cache topology via -smp.

This patch set is rebased on our i386 module series:
https://lore.kernel.org/qemu-devel/20240131101350.109512-1-zhao1.liu@linux.intel.com/

Since the ARM [1] and RISC-V [2] folks have similar needs for the cache
topology, I also cc'd the ARM and RISC-V folks and lists.

Your feedback is welcome!


Introduction
============

Background
----------

Intel client platforms (ADL/RPL/MTL) and E-core server platforms (SRF)
share the L2 cache domain among multiple E-cores (in the same module).

Thus we need a way to adjust the cache topology so that users can
create a cache topology for the Guest that is nearly identical to the
Host's. This is necessary when vCPUs are bound to host CPUs, especially
considering that Guest scheduling often takes the cache topology into
account as well (e.g. Linux cluster aware scheduling, i.e. L2 cache
scheduling).
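To make the effect of the topology level concrete, here is a minimal
sketch (not code from this series) of how the level chosen for a cache
determines how many vCPUs share it. All names in it (TopoLevel,
SmpConfig, cpus_per_container and so on) are hypothetical and exist only
for this illustration; it also assumes a symmetric topology where every
container at a level holds the same number of children.

```c
#include <assert.h>

/* Hypothetical names for illustration; not the identifiers in the series. */
typedef enum {
    TOPO_THREAD,
    TOPO_CORE,
    TOPO_MODULE,
    TOPO_DIE,
    TOPO_SOCKET,
    TOPO_LEVEL_MAX
} TopoLevel;

/* Counts as given on the -smp command line (each is "per parent"). */
typedef struct {
    unsigned sockets, dies, modules, cores, threads;
} SmpConfig;

/* Number of vCPUs (threads) inside one container of the given level. */
static unsigned cpus_per_container(const SmpConfig *c, TopoLevel level)
{
    /* children[i]: how many next-lower containers one level-i container holds */
    const unsigned children[TOPO_LEVEL_MAX] = {
        [TOPO_THREAD] = 1,
        [TOPO_CORE]   = c->threads,
        [TOPO_MODULE] = c->cores,
        [TOPO_DIE]    = c->modules,
        [TOPO_SOCKET] = c->dies,
    };
    unsigned n = 1;

    assert(level < TOPO_LEVEL_MAX);
    for (int i = TOPO_THREAD; i <= (int)level; i++) {
        n *= children[i];
    }
    return n;
}

/* Total number of vCPUs in the machine. */
static unsigned total_cpus(const SmpConfig *c)
{
    return c->sockets * cpus_per_container(c, TOPO_SOCKET);
}

/* Number of cache instances when a cache is placed at the given level. */
static unsigned cache_instances(const SmpConfig *c, TopoLevel level)
{
    return total_cpus(c) / cpus_per_container(c, level);
}
```

For the example -smp line above (sockets=2, dies=2, modules=2, cores=2,
threads=2) this gives 32 vCPUs. A cache at core level is shared by 2
vCPUs (16 instances), a cache at module level by 4 vCPUs (8 instances),
and a cache at die level by 8 vCPUs (4 instances).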

Previously, we introduced an x86-specific option to adjust the cache
topology:

-cpu x-l2-cache-topo=[core|module] [3]

However, considering the needs of other arches, we re-implemented
generic cache topology support (also in response to Michael's [4] and
Daniel's [5] comments) in this series.


Cache Topology Representation
-----------------------------

We define the cache topology based on CPU topology levels for two
reasons:

1. In practice, a cache is always bound to a CPU container, where
   "CPU container" refers to the set of CPUs at a certain level of the
   CPU topology. The cache is either private to that CPU container or
   shared among multiple containers.

2. The x86 cache-related CPUID leaves encode cache topology based on
   the APIC ID's CPU topology layout. And the ACPI PPTT table that
   ARM/RISC-V rely on also requires CPU containers (CPU topology) to
   help indicate the private/shared hierarchy of the caches.

Therefore, for SMP systems, it is natural to use the CPU topology
hierarchy directly in QEMU to define the cache topology.

Currently, separate L1 caches (L1 data cache and L1 instruction cache)
with unified higher-level caches (e.g., unified L2 and L3 caches) are
the most common cache architecture.

Thus, we define the topology for L1 D-cache, L1 I-cache, L2 cache and
L3 cache in MachineState as the basic cache topology support:

typedef struct CacheTopology {
    CPUTopoLevel l1d;
    CPUTopoLevel l1i;
    CPUTopoLevel l2;
    CPUTopoLevel l3;
} CacheTopology;

Machines may also allow only a subset of the cache topology to be
configured in -smp by setting the SMP property of MachineClass:

typedef struct {
    ...
    bool l1_separated_cache_supported;
    bool l2_unified_cache_supported;
    bool l3_unified_cache_supported;
} SMPCompatProps;


Cache Topology Configuration in -smp
------------------------------------

Further, we add new parameters to -smp:
* l1d-cache=level
* l1i-cache=level
* l2-cache=level
* l3-cache=level

These cache topology parameters accept CPU topology level names (such
as "drawer", "book", "socket", "die", "cluster", "module", "core" or
"thread"). Exactly which topology level strings are accepted depends
on the machine's support for the corresponding CPU topology level.

Unsupported cache topology parameters will be omitted, and
correspondingly, the target CPU's cache topology will use its default
cache topology setting.

In this series, we add the cache topology support in -smp for the x86
PC machine.

The following example defines a 3-level cache topology hierarchy (L1
D-cache per core, L1 I-cache per core, L2 cache per core and L3 cache
per die) for the PC machine.

-smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
     l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die


Reference
---------

[1]: [ARM] Jonathan's proposal to adjust cache topology:
     https://lore.kernel.org/qemu-devel/20230808115713.2613-2-Jonathan.Cameron@huawei.com/
[2]: [RISC-V] Discussion between JeeHeng and Jonathan about cache
     topology:
     https://lore.kernel.org/qemu-devel/20240131155336.000068d1@Huawei.com/
[3]: Previous x86 specific cache topology option:
     https://lore.kernel.org/qemu-devel/20230914072159.1177582-22-zhao1.liu@linux.intel.com/
[4]: Michael's comment about generic cache topology support:
     https://lore.kernel.org/qemu-devel/20231003085516-mutt-send-email-mst@kernel.org/
[5]: Daniel's question about how x86 supports L2 cache domain (cluster)
     configuration:
     https://lore.kernel.org/qemu-devel/ZcUG0Uc8KylEQhUW@redhat.com/

Thanks and Best Regards,
Zhao

---
Zhao Liu (8):
  hw/core: Rename CpuTopology to CPUTopology
  hw/core: Move CPU topology enumeration into arch-agnostic file
  hw/core: Define cache topology for machine
  hw/core: Add cache topology options in -smp
  i386/cpu: Support thread and module level cache topology
  i386/cpu: Update cache topology with machine's configuration
  i386/pc: Support cache topology in -smp for PC machine
  qemu-options: Add the cache topology description of -smp

 MAINTAINERS                     |   2 +
 hw/core/cpu-topology.c          |  56 ++++++++++++++
 hw/core/machine-smp.c           | 128 ++++++++++++++++++++++++++++++++
 hw/core/machine.c               |   9 +++
 hw/core/meson.build             |   1 +
 hw/i386/pc.c                    |   3 +
 hw/s390x/cpu-topology.c         |   6 +-
 include/hw/boards.h             |  33 +++++++-
 include/hw/core/cpu-topology.h  |  40 ++++++++++
 include/hw/i386/topology.h      |  18 +----
 include/hw/s390x/cpu-topology.h |   6 +-
 qapi/machine.json               |  14 +++-
 qemu-options.hx                 |  54 ++++++++++++--
 system/vl.c                     |  15 ++++
 target/i386/cpu.c               |  55 ++++++++++----
 target/i386/cpu.h               |   2 +-
 tests/unit/meson.build          |   3 +-
 tests/unit/test-smp-parse.c     |  14 ++--
 18 files changed, 399 insertions(+), 60 deletions(-)
 create mode 100644 hw/core/cpu-topology.c
 create mode 100644 include/hw/core/cpu-topology.h