Message ID | 20240220092504.726064-9-zhao1.liu@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | Introduce SMP Cache Topology |
On Tue, 20 Feb 2024 17:25:04 +0800
Zhao Liu <zhao1.liu@linux.intel.com> wrote:

> From: Zhao Liu <zhao1.liu@intel.com>
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>

Hi,

A trivial comment, but also a possibly more significant one about
whether the defaults are correctly verified.

Jonathan

> ---
>  qemu-options.hx | 54 ++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 47 insertions(+), 7 deletions(-)
>
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 70eaf3256685..85c78c99a3b0 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -281,7 +281,9 @@ ERST
>
>  DEF("smp", HAS_ARG, QEMU_OPTION_smp,
>      "-smp [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets]\n"
> -    " [,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
> +    " [,dies=dies][,clusters=clusters][,modules=modules][,cores=cores]\n"
> +    " [,threads=threads][,l1d-cache=level][,l1i-cache=level][,l2-cache=level]\n"
burns more characters but I'd go with
l1d-cache=topo_level

As level for a cache has a totally different meaning!

> +    " [,l3-cache=level]\n"
>      " set the number of initial CPUs to 'n' [default=1]\n"
>      " maxcpus= maximum number of total CPUs, including\n"
>      " offline CPUs for hotplug, etc\n"
> @@ -290,9 +292,14 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
>      " sockets= number of sockets in one book\n"
>      " dies= number of dies in one socket\n"
>      " clusters= number of clusters in one die\n"
> -    " cores= number of cores in one cluster\n"
> +    " modules= number of modules in one cluster\n"
> +    " cores= number of cores in one module\n"
>      " threads= number of threads in one core\n"
> -    "Note: Different machines may have different subsets of the CPU topology\n"
> +    " l1d-cache= topology level of L1 D-cache\n"
> +    " l1i-cache= topology level of L1 I-cache\n"
> +    " l2-cache= topology level of L2 cache\n"
> +    " l3-cache= topology level of L3 cache\n"
> +    "Note: Different machines may have different subsets of the CPU and cache topology\n"
>

>          -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32
>
> +    The following sub-option defines a CPU topology hierarchy (2 sockets
> +    totally on the machine, 2 dies per socket, 2 modules per die, 2 cores per
> +    module, 2 threads per core) with 3-level cache topology hierarchy (L1
> +    D-cache per core, L1 I-cache per core, L2 cache per core and L3 cache per
> +    die) for PC machines which support sockets/dies/modules/cores/threads.
> +    Some members of the CPU topology option can be omitted but their values
> +    will be automatically computed. Some members of the cache topology
> +    option can also be omitted and target CPU will use the default topology.:

Given the default could be inconsistent I wonder if we should 'push' levels
up. So if L2 not defined it is set either to default or equal to max of
l1i and l1d level. L3 either default or same level as l2.

Won't always correspond to a sensible system so maybe just rejecting
cases where default isn't possible is the best plan. However I don't
see that verification as the checks on higher levels are gated on them
being specified.

> +
> +    ::
> +
> +        -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
> +             l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die
> +
>      The following sub-option defines a CPU topology hierarchy (2 sockets
>      totally on the machine, 2 clusters per socket, 2 cores per cluster,
>      2 threads per core) for ARM virt machines which support sockets/clusters

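The two options Jonathan raises above (push unspecified cache levels up, or reject combinations where the default cannot apply) can be sketched in C roughly as follows. This is an illustration only, not code from the series: the `TopoLevel` enum, the `CacheTopo` struct and `cache_topo_resolve()` are hypothetical names, and the consistency rule assumed here is that a higher-numbered cache must not sit at a narrower topology level than a lower-numbered one. The validation runs on the final values, so it is not gated on the user having specified the higher levels.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical topology levels, ordered from narrowest to widest.
 * TOPO_INVALID (0) stands for "not specified by the user".
 */
typedef enum {
    TOPO_INVALID = 0,
    TOPO_THREAD,
    TOPO_CORE,
    TOPO_MODULE,
    TOPO_CLUSTER,
    TOPO_DIE,
    TOPO_SOCKET,
} TopoLevel;

/* Hypothetical container for the four cache topology parameters. */
typedef struct {
    TopoLevel l1d, l1i, l2, l3;
} CacheTopo;

static TopoLevel topo_max(TopoLevel a, TopoLevel b)
{
    return a > b ? a : b;
}

/*
 * Merge user settings with the CPU model defaults into *out.
 * With push_up set, an unspecified L2/L3 is raised to at least the level
 * of the cache below it. Returns false if the final result is inconsistent
 * (assumed rule: L3 >= L2 >= max(L1D, L1I) in topology width).
 */
static bool cache_topo_resolve(const CacheTopo *user, const CacheTopo *def,
                               bool push_up, CacheTopo *out)
{
    TopoLevel l1_max;

    /* The defaults are assumed to be fully specified. */
    assert(def->l1d && def->l1i && def->l2 && def->l3);

    out->l1d = user->l1d ? user->l1d : def->l1d;
    out->l1i = user->l1i ? user->l1i : def->l1i;
    out->l2 = user->l2 ? user->l2 : def->l2;
    out->l3 = user->l3 ? user->l3 : def->l3;

    l1_max = topo_max(out->l1d, out->l1i);

    if (push_up) {
        if (!user->l2) {
            out->l2 = topo_max(out->l2, l1_max);
        }
        if (!user->l3) {
            out->l3 = topo_max(out->l3, out->l2);
        }
    }

    /* Check the final values whether or not the user supplied them. */
    return out->l2 >= l1_max && out->l3 >= out->l2;
}
```

With defaults of core/core/core/die and only `l1d-cache=die` given, the strict variant rejects the configuration (the default L2 at core is narrower than the L1 D-cache at die), while the push-up variant raises L2 to die and accepts it. An explicitly specified L2 is never moved, so `l1d-cache=die,l2-cache=core` is rejected in both variants.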
Hi Jonathan,

On Mon, Feb 26, 2024 at 03:47:34PM +0000, Jonathan Cameron wrote:
> Date: Mon, 26 Feb 2024 15:47:34 +0000
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Subject: Re: [RFC 8/8] qemu-options: Add the cache topology description of
>  -smp
> X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32)
>
> On Tue, 20 Feb 2024 17:25:04 +0800
> Zhao Liu <zhao1.liu@linux.intel.com> wrote:
>
> > From: Zhao Liu <zhao1.liu@intel.com>
> >
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
>
> Hi,
>
> A trivial comment, but also a possibly more significant one about
> whether the defaults are correctly verified.
>
> Jonathan
>
> > ---
> >  qemu-options.hx | 54 ++++++++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 47 insertions(+), 7 deletions(-)
> >
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index 70eaf3256685..85c78c99a3b0 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -281,7 +281,9 @@ ERST
> >
> >  DEF("smp", HAS_ARG, QEMU_OPTION_smp,
> >      "-smp [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets]\n"
> > -    " [,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
> > +    " [,dies=dies][,clusters=clusters][,modules=modules][,cores=cores]\n"
> > +    " [,threads=threads][,l1d-cache=level][,l1i-cache=level][,l2-cache=level]\n"
> burns more characters but I'd go with
> l1d-cache=topo_level
>
> As level for a cache has a totally different meaning!

Yes, good catch! Thanks.

>
> > +    " [,l3-cache=level]\n"
> >      " set the number of initial CPUs to 'n' [default=1]\n"
> >      " maxcpus= maximum number of total CPUs, including\n"
> >      " offline CPUs for hotplug, etc\n"
> > @@ -290,9 +292,14 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
> >      " sockets= number of sockets in one book\n"
> >      " dies= number of dies in one socket\n"
> >      " clusters= number of clusters in one die\n"
> > -    " cores= number of cores in one cluster\n"
> > +    " modules= number of modules in one cluster\n"
> > +    " cores= number of cores in one module\n"
> >      " threads= number of threads in one core\n"
> > -    "Note: Different machines may have different subsets of the CPU topology\n"
> > +    " l1d-cache= topology level of L1 D-cache\n"
> > +    " l1i-cache= topology level of L1 I-cache\n"
> > +    " l2-cache= topology level of L2 cache\n"
> > +    " l3-cache= topology level of L3 cache\n"
> > +    "Note: Different machines may have different subsets of the CPU and cache topology\n"
> >
>
> >          -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32
> >
> > +    The following sub-option defines a CPU topology hierarchy (2 sockets
> > +    totally on the machine, 2 dies per socket, 2 modules per die, 2 cores per
> > +    module, 2 threads per core) with 3-level cache topology hierarchy (L1
> > +    D-cache per core, L1 I-cache per core, L2 cache per core and L3 cache per
> > +    die) for PC machines which support sockets/dies/modules/cores/threads.
> > +    Some members of the CPU topology option can be omitted but their values
> > +    will be automatically computed. Some members of the cache topology
> > +    option can also be omitted and target CPU will use the default topology.:
>
> Given the default could be inconsistent I wonder if we should 'push' levels
> up. So if L2 not defined it is set either to default or equal to max of
> l1i and l1d level. L3 either default or same level as l2.

Hmm, IIUC, I think there may be this case:

The user sets the L2 cache as per core and omits the L3 cache. In this
case, if L3 is per core (as L2) by default, how could we identify whether
that per-core L3 is the default or comes from the user? We need to
identify this because x86's L3 is shared at die by default and L2 is
shared at core level for current CPU models.

To resolve this issue, we can add a status field in SMPCompatProps,
e.g., has_l3_cache, just like the current SMPCompatProps.has_clusters, to
explicitly indicate that the L3 cache topo is set by the user. Then the
other caches would also need similar fields... It doesn't look as simple
as the current default invalid topology level.

> Won't always correspond to a sensible system so maybe just rejecting
> cases where default isn't possible is the best plan. However I don't
> see that verification as the checks on higher levels are gated on them
> being specified.
>
> > +
> > +    ::
> > +
> > +        -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
> > +             l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die
> > +
> >      The following sub-option defines a CPU topology hierarchy (2 sockets
> >      totally on the machine, 2 clusters per socket, 2 cores per cluster,
> >      2 threads per core) for ARM virt machines which support sockets/clusters
>

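The trade-off Zhao describes above can be sketched in C as follows. This is a rough illustration under stated assumptions, not code from the series: `SmpCacheSentinel`, `SmpCacheFlagged` and the helper functions are hypothetical names. Only the `has_l3_cache` field name (proposed above by analogy with `SMPCompatProps.has_clusters`) and the x86 defaults (L2 per core, L3 per die) come from the thread. Option A keeps an invalid level as the "unspecified" marker, so the target can still apply its own default. Option B lets generic code fill in a concrete level inherited from L2, which then needs an extra flag per cache to record whether the user set it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical topology levels; TOPO_INVALID means "not specified". */
typedef enum {
    TOPO_INVALID = 0,
    TOPO_THREAD,
    TOPO_CORE,
    TOPO_MODULE,
    TOPO_CLUSTER,
    TOPO_DIE,
    TOPO_SOCKET,
} TopoLevel;

/* Option A: an invalid level doubles as the "use target default" marker. */
typedef struct {
    TopoLevel l2, l3;
} SmpCacheSentinel;

static TopoLevel x86_l3_from_sentinel(const SmpCacheSentinel *c)
{
    /* x86 default from the thread: L3 is shared at die level. */
    return c->l3 != TOPO_INVALID ? c->l3 : TOPO_DIE;
}

/* Option B: generic code always stores a concrete level plus a flag. */
typedef struct {
    TopoLevel l2, l3;
    bool has_l3_cache; /* true only if the user specified l3-cache */
} SmpCacheFlagged;

static void generic_fill_l3(SmpCacheFlagged *c, TopoLevel user_l3)
{
    assert(c->l2 != TOPO_INVALID);
    c->has_l3_cache = user_l3 != TOPO_INVALID;
    /* An omitted L3 inherits the L2 level ("push up"). */
    c->l3 = c->has_l3_cache ? user_l3 : c->l2;
}

static TopoLevel x86_l3_from_flagged(const SmpCacheFlagged *c)
{
    /* Without the flag, an inherited per-core L3 would be
     * indistinguishable from an explicit l3-cache=core. */
    return c->has_l3_cache ? c->l3 : TOPO_DIE;
}
```

In both variants `l2-cache=core` with L3 omitted ends up with a per-die L3 on x86, and an explicit `l3-cache=core` is honoured. The difference is that Option B needs one such flag for every cache parameter.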
diff --git a/qemu-options.hx b/qemu-options.hx
index 70eaf3256685..85c78c99a3b0 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -281,7 +281,9 @@ ERST
 
 DEF("smp", HAS_ARG, QEMU_OPTION_smp,
     "-smp [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets]\n"
-    " [,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
+    " [,dies=dies][,clusters=clusters][,modules=modules][,cores=cores]\n"
+    " [,threads=threads][,l1d-cache=level][,l1i-cache=level][,l2-cache=level]\n"
+    " [,l3-cache=level]\n"
     " set the number of initial CPUs to 'n' [default=1]\n"
     " maxcpus= maximum number of total CPUs, including\n"
     " offline CPUs for hotplug, etc\n"
@@ -290,9 +292,14 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
     " sockets= number of sockets in one book\n"
     " dies= number of dies in one socket\n"
     " clusters= number of clusters in one die\n"
-    " cores= number of cores in one cluster\n"
+    " modules= number of modules in one cluster\n"
+    " cores= number of cores in one module\n"
     " threads= number of threads in one core\n"
-    "Note: Different machines may have different subsets of the CPU topology\n"
+    " l1d-cache= topology level of L1 D-cache\n"
+    " l1i-cache= topology level of L1 I-cache\n"
+    " l2-cache= topology level of L2 cache\n"
+    " l3-cache= topology level of L3 cache\n"
+    "Note: Different machines may have different subsets of the CPU and cache topology\n"
     " parameters supported, so the actual meaning of the supported parameters\n"
     " will vary accordingly. For example, for a machine type that supports a\n"
     " three-level CPU hierarchy of sockets/cores/threads, the parameters will\n"
@@ -306,7 +313,7 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
     " must be set as 1 in the purpose of correct parsing.\n",
     QEMU_ARCH_ALL)
 SRST
-``-smp [[cpus=]n][,maxcpus=maxcpus][,sockets=sockets][,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]``
+``-smp [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets][,dies=dies][,clusters=clusters][,modules=modules][,cores=cores][,threads=threads][,l1d-cache=level][,l1i-cache=level][,l2-cache=level][,l3-cache=level]``
     Simulate a SMP system with '\ ``n``\ ' CPUs initially present on
     the machine type board. On boards supporting CPU hotplug, the optional
     '\ ``maxcpus``\ ' parameter can be set to enable further CPUs to be
@@ -320,15 +327,34 @@ SRST
     Both parameters are subject to an upper limit that is determined by
     the specific machine type chosen.
 
+    CPU topology parameters include '\ ``drawers``\ ', '\ ``books``\ ',
+    '\ ``sockets``\ ', '\ ``dies``\ ', '\ ``clusters``\ ', '\ ``modules``\ ',
+    '\ ``cores``\ ' and '\ ``threads``\ '. These CPU parameters accept only
+    integers and are used to specify the number of specific topology domains
+    under the corresponding topology level.
+
     To control reporting of CPU topology information, values of the topology
     parameters can be specified. Machines may only support a subset of the
-    parameters and different machines may have different subsets supported
-    which vary depending on capacity of the corresponding CPU targets. So
-    for a particular machine type board, an expected topology hierarchy can
+    CPU topology parameters and different machines may have different subsets
+    supported which vary depending on capacity of the corresponding CPU targets.
+    So for a particular machine type board, an expected topology hierarchy can
     be defined through the supported sub-option. Unsupported parameters can
     also be provided in addition to the sub-option, but their values must be
     set as 1 in the purpose of correct parsing.
 
+    Cache topology parameters include '\ ``l1d-cache``\ ', '\ ``l1i-cache``\ ',
+    '\ ``l2-cache``\ ' and '\ ``l3-cache``\ '. These cache topology parameters
+    accept the strings of CPU topology levels (such as '\ ``drawer``\ ', '\ ``book``\ ',
+    '\ ``socket``\ ', '\ ``die``\ ', '\ ``cluster``\ ', '\ ``module``\ ',
+    '\ ``core``\ ' or '\ ``thread``\ '). Exactly which topology level strings
+    could be accepted as the parameter depends on the machine's support for the
+    corresponding CPU topology level.
+
+    Machines may also only support a subset of the cache topology parameters.
+    Unsupported cache topology parameters will be omitted, and correspondingly,
+    the target CPU's cache topology will use the its default cache topology
+    setting.
+
     Either the initial CPU count, or at least one of the topology parameters
     must be specified. The specified parameters must be greater than zero,
     explicit configuration like "cpus=0" is not allowed. Values for any
@@ -354,6 +380,20 @@ SRST
 
         -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32
 
+    The following sub-option defines a CPU topology hierarchy (2 sockets
+    totally on the machine, 2 dies per socket, 2 modules per die, 2 cores per
+    module, 2 threads per core) with 3-level cache topology hierarchy (L1
+    D-cache per core, L1 I-cache per core, L2 cache per core and L3 cache per
+    die) for PC machines which support sockets/dies/modules/cores/threads.
+    Some members of the CPU topology option can be omitted but their values
+    will be automatically computed. Some members of the cache topology
+    option can also be omitted and target CPU will use the default topology.:
+
+    ::
+
+        -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
+             l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die
+
     The following sub-option defines a CPU topology hierarchy (2 sockets
     totally on the machine, 2 clusters per socket, 2 cores per cluster,
     2 threads per core) for ARM virt machines which support sockets/clusters