[RFC,0/5] hw/arm/virt: Introduce cpu topology support

Message ID 20210225085627.2263-1-fangying1@huawei.com (mailing list archive)

Message

fangying Feb. 25, 2021, 8:56 a.m. UTC
An accurate cpu topology may help improve the cpu scheduler's decision
making when dealing with multi-core systems, so a cpu topology description
is helpful to provide the guest with the right view. Dario Faggioli's talk
in [0] also shows that the virtual topology may have an impact on
scheduling performance. Thus this patch series is posted to introduce cpu
topology support for the arm platform.

Both fdt and ACPI are used to present the cpu topology. To describe the
cpu topology via ACPI, a PPTT table is built from processor hierarchy
node structures. This series is derived from [1], where we tried to bring
both cpu and cache topology support to the arm platform, but there are
still some issues to solve for the cache hierarchy. So we split the cpu
topology part out and send it separately. The patch series supporting the
cache hierarchy will be sent later, since Salil Mehta's cpu hotplug
feature needs the cpu topology enabled first and he is waiting for it to
be upstreamed.
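
For illustration, with sockets:cores:threads = 2:4:2 the fdt route ends
up exposing a /cpus/cpu-map hierarchy shaped roughly like the sketch
below. This follows the Linux cpu-map device tree binding; the exact node
names and nesting generated by this series may differ:

  cpus {
      cpu-map {
          socket0 {
              core0 {
                  thread0 { cpu = <&cpu0>; };
                  thread1 { cpu = <&cpu1>; };
              };
              core1 { /* cpus 2-3 */ };
              core2 { /* cpus 4-5 */ };
              core3 { /* cpus 6-7 */ };
          };
          socket1 { /* cores 4-7, cpus 8-15, same shape as socket0 */ };
      };
  };

On the ACPI side, the PPTT encodes the same tree as a list of processor
hierarchy node structures, each carrying a flags field and a reference to
its parent node.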

This patch series was initially based on the patches posted by Andrew
Jones [2]. I jumped in on it since some OS vendor cooperative partners
are eager for it. Thanks to Andrew for his contribution.

After applying this patch series, launch a guest with the virt-6.0
machine type and a cpu topology configured with sockets:cores:threads =
2:4:2, and you will get the output below from the lscpu command.
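
For example, such a guest might be launched with a command line along
these lines (an illustrative sketch: KVM is assumed on the host, and the
memory size and the elided disk/kernel options are placeholders, not
taken from this posting):

  qemu-system-aarch64 -machine virt-6.0,accel=kvm -cpu host \
      -smp 16,sockets=2,cores=4,threads=2 -m 4G \
      -numa node,cpus=0-7,nodeid=0 \
      -numa node,cpus=8-15,nodeid=1 \
      ...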

-----------------------------------------
Architecture:                    aarch64
CPU op-mode(s):                  64-bit
Byte Order:                      Little Endian
CPU(s):                          16
On-line CPU(s) list:             0-15
Thread(s) per core:              2
Core(s) per socket:              4
Socket(s):                       2
NUMA node(s):                    2
Vendor ID:                       HiSilicon
Model:                           0
Model name:                      Kunpeng-920
Stepping:                        0x1
BogoMIPS:                        200.00
NUMA node0 CPU(s):               0-7
NUMA node1 CPU(s):               8-15

[0] https://kvmforum2020.sched.com/event/eE1y/virtual-topology-for-virtual-machines-friend-or-foe-dario-faggioli-suse
[1] https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg02166.html
[2] https://patchwork.ozlabs.org/project/qemu-devel/cover/20180704124923.32483-1-drjones@redhat.com

Ying Fang (5):
  device_tree: Add qemu_fdt_add_path
  hw/arm/virt: Add cpu-map to device tree
  hw/arm/virt-acpi-build: distinguish possible and present cpus
  hw/acpi/aml-build: add processor hierarchy node structure
  hw/arm/virt-acpi-build: add PPTT table

 hw/acpi/aml-build.c          | 40 ++++++++++++++++++++++
 hw/arm/virt-acpi-build.c     | 64 +++++++++++++++++++++++++++++++++---
 hw/arm/virt.c                | 40 +++++++++++++++++++++-
 include/hw/acpi/acpi-defs.h  | 13 ++++++++
 include/hw/acpi/aml-build.h  |  7 ++++
 include/hw/arm/virt.h        |  1 +
 include/sysemu/device_tree.h |  1 +
 softmmu/device_tree.c        | 45 +++++++++++++++++++++++--
 8 files changed, 204 insertions(+), 7 deletions(-)

Comments

Andrew Jones Feb. 25, 2021, 12:02 p.m. UTC | #1
On Thu, Feb 25, 2021 at 04:56:22PM +0800, Ying Fang wrote:
> An accurate cpu topology may help improve the cpu scheduler's decision
> making when dealing with multi-core systems, so a cpu topology description
> is helpful to provide the guest with the right view. Dario Faggioli's talk
> in [0] also shows that the virtual topology may have an impact on
> scheduling performance. Thus this patch series is posted to introduce cpu
> topology support for the arm platform.
> 
> Both fdt and ACPI are used to present the cpu topology. To describe the
> cpu topology via ACPI, a PPTT table is built from processor hierarchy
> node structures. This series is derived from [1], where we tried to bring
> both cpu and cache topology support to the arm platform, but there are
> still some issues to solve for the cache hierarchy. So we split the cpu
> topology part out and send it separately. The patch series supporting the
> cache hierarchy will be sent later, since Salil Mehta's cpu hotplug
> feature needs the cpu topology enabled first and he is waiting for it to
> be upstreamed.
> 
> This patch series was initially based on the patches posted by Andrew
> Jones [2]. I jumped in on it since some OS vendor cooperative partners
> are eager for it. Thanks to Andrew for his contribution.
> 
> After applying this patch series, launch a guest with the virt-6.0
> machine type and a cpu topology configured with sockets:cores:threads =
> 2:4:2, and you will get the output below from the lscpu command.
> 
> -----------------------------------------
> Architecture:                    aarch64
> CPU op-mode(s):                  64-bit
> Byte Order:                      Little Endian
> CPU(s):                          16
> On-line CPU(s) list:             0-15
> Thread(s) per core:              2

What CPU model was used? Did it actually support threads? If these were
KVM VCPUs, then I guess MPIDR.MT was not set on the CPUs. Apparently
that didn't confuse Linux? See [1] for how I once tried to deal with
threads.

[1] https://github.com/rhdrjones/qemu/commit/60218e0dd7b331031b644872d56f2aca42d0ff1e

> Core(s) per socket:              4
> Socket(s):                       2

Good, but what happens if you specify '-smp 16'? Do you get 16 sockets
each with 1 core? Or, do you get 1 socket with 16 cores? And, which do
we want and why? If you look at [2], then you'll see I was assuming we
want to prefer cores over sockets, since without topology descriptions
that's what the Linux guest kernel would do.

[2] https://github.com/rhdrjones/qemu/commit/c0670b1bccb4d08c7cf7c6957cc8878a2af131dd

> NUMA node(s):                    2

Why do we have two NUMA nodes in the guest? The two sockets in the
guest should not imply this.

Thanks,
drew

fangying Feb. 26, 2021, 8:41 a.m. UTC | #2
On 2/25/2021 8:02 PM, Andrew Jones wrote:
> On Thu, Feb 25, 2021 at 04:56:22PM +0800, Ying Fang wrote:
>> An accurate cpu topology may help improve the cpu scheduler's decision
>> making when dealing with multi-core systems, so a cpu topology description
>> is helpful to provide the guest with the right view. Dario Faggioli's talk
>> in [0] also shows that the virtual topology may have an impact on
>> scheduling performance. Thus this patch series is posted to introduce cpu
>> topology support for the arm platform.
>>
>> Both fdt and ACPI are used to present the cpu topology. To describe the
>> cpu topology via ACPI, a PPTT table is built from processor hierarchy
>> node structures. This series is derived from [1], where we tried to bring
>> both cpu and cache topology support to the arm platform, but there are
>> still some issues to solve for the cache hierarchy. So we split the cpu
>> topology part out and send it separately. The patch series supporting the
>> cache hierarchy will be sent later, since Salil Mehta's cpu hotplug
>> feature needs the cpu topology enabled first and he is waiting for it to
>> be upstreamed.
>>
>> This patch series was initially based on the patches posted by Andrew
>> Jones [2]. I jumped in on it since some OS vendor cooperative partners
>> are eager for it. Thanks to Andrew for his contribution.
>>
>> After applying this patch series, launch a guest with the virt-6.0
>> machine type and a cpu topology configured with sockets:cores:threads =
>> 2:4:2, and you will get the output below from the lscpu command.
>>
>> -----------------------------------------
>> Architecture:                    aarch64
>> CPU op-mode(s):                  64-bit
>> Byte Order:                      Little Endian
>> CPU(s):                          16
>> On-line CPU(s) list:             0-15
>> Thread(s) per core:              2
> 
> What CPU model was used? Did it actually support threads? If these were

It's tested on the Huawei Kunpeng 920 CPU with vcpu host-passthrough.
It does not support threads for now, but the next generation, the 930,
may support them. Here we emulate a virtual cpu topology; two virtual
threads per core are used to do the test.


> KVM VCPUs, then I guess MPIDR.MT was not set on the CPUs. Apparently
> that didn't confuse Linux? See [1] for how I once tried to deal with
> threads.
> 
> [1] https://github.com/rhdrjones/qemu/commit/60218e0dd7b331031b644872d56f2aca42d0ff1e
> 

If an ACPI PPTT table is specified, the Linux kernel won't check the
MPIDR register to populate the cpu topology. Moreover, MPIDR does not
guarantee a correct cpu topology, so it is not a problem if MPIDR.MT is
not set.

>> Core(s) per socket:              4
>> Socket(s):                       2
> 
> Good, but what happens if you specify '-smp 16'? Do you get 16 sockets
> each with 1 core? Or, do you get 1 socket with 16 cores? And, which do
> we want and why? If you look at [2], then you'll see I was assuming we
> want to prefer cores over sockets, since without topology descriptions
> that's what the Linux guest kernel would do.
> 
> [2] https://github.com/rhdrjones/qemu/commit/c0670b1bccb4d08c7cf7c6957cc8878a2af131dd
> 
>> NUMA node(s):                    2
> 
> Why do we have two NUMA nodes in the guest? The two sockets in the
> guest should not imply this.

The two NUMA nodes are emulated by QEMU since we already have the guest
NUMA topology feature, so the two sockets in the guest have nothing to do
with it. Actually, even one socket may contain two NUMA nodes in a real
CPU model.

Andrew Jones March 1, 2021, 9:48 a.m. UTC | #3
On Fri, Feb 26, 2021 at 04:41:45PM +0800, Ying Fang wrote:
> 
> 
> On 2/25/2021 8:02 PM, Andrew Jones wrote:
> > On Thu, Feb 25, 2021 at 04:56:22PM +0800, Ying Fang wrote:
> > > An accurate cpu topology may help improve the cpu scheduler's decision
> > > making when dealing with multi-core systems, so a cpu topology description
> > > is helpful to provide the guest with the right view. Dario Faggioli's talk
> > > in [0] also shows that the virtual topology may have an impact on
> > > scheduling performance. Thus this patch series is posted to introduce cpu
> > > topology support for the arm platform.
> > > 
> > > Both fdt and ACPI are used to present the cpu topology. To describe the
> > > cpu topology via ACPI, a PPTT table is built from processor hierarchy
> > > node structures. This series is derived from [1], where we tried to bring
> > > both cpu and cache topology support to the arm platform, but there are
> > > still some issues to solve for the cache hierarchy. So we split the cpu
> > > topology part out and send it separately. The patch series supporting the
> > > cache hierarchy will be sent later, since Salil Mehta's cpu hotplug
> > > feature needs the cpu topology enabled first and he is waiting for it to
> > > be upstreamed.
> > > 
> > > This patch series was initially based on the patches posted by Andrew
> > > Jones [2]. I jumped in on it since some OS vendor cooperative partners
> > > are eager for it. Thanks to Andrew for his contribution.
> > > 
> > > After applying this patch series, launch a guest with the virt-6.0
> > > machine type and a cpu topology configured with sockets:cores:threads =
> > > 2:4:2, and you will get the output below from the lscpu command.
> > > 
> > > -----------------------------------------
> > > Architecture:                    aarch64
> > > CPU op-mode(s):                  64-bit
> > > Byte Order:                      Little Endian
> > > CPU(s):                          16
> > > On-line CPU(s) list:             0-15
> > > Thread(s) per core:              2
> > 
> > What CPU model was used? Did it actually support threads? If these were
> 
> It's tested on the Huawei Kunpeng 920 CPU with vcpu host-passthrough.
> It does not support threads for now, but the next generation, the 930,
> may support them. Here we emulate a virtual cpu topology; two virtual
> threads per core are used to do the test.
> 
> 
> > KVM VCPUs, then I guess MPIDR.MT was not set on the CPUs. Apparently
> > that didn't confuse Linux? See [1] for how I once tried to deal with
> > threads.
> > 
> > [1] https://github.com/rhdrjones/qemu/commit/60218e0dd7b331031b644872d56f2aca42d0ff1e
> > 
> 
> If an ACPI PPTT table is specified, the Linux kernel won't check the
> MPIDR register to populate the cpu topology. Moreover, MPIDR does not
> guarantee a correct cpu topology, so it is not a problem if MPIDR.MT is
> not set.

OK, so Linux doesn't care about MPIDR.MT with ACPI. What happens with
DT?

> 
> > > Core(s) per socket:              4
> > > Socket(s):                       2
> > 
> > Good, but what happens if you specify '-smp 16'? Do you get 16 sockets
              ^^ You didn't answer this question.

> > each with 1 core? Or, do you get 1 socket with 16 cores? And, which do
> > we want and why? If you look at [2], then you'll see I was assuming we
> > want to prefer cores over sockets, since without topology descriptions
> > that's what the Linux guest kernel would do.
> > 
> > [2] https://github.com/rhdrjones/qemu/commit/c0670b1bccb4d08c7cf7c6957cc8878a2af131dd
> > 
> > > NUMA node(s):                    2
> > 
> > Why do we have two NUMA nodes in the guest? The two sockets in the
> > guest should not imply this.
> 
> The two NUMA nodes are emulated by QEMU since we already have the guest
> NUMA topology feature.

That's what I suspected, and I presume only a single node is present when
you don't use QEMU's NUMA feature - even when you supply a VCPU topology
with multiple sockets?

Thanks,
drew

fangying March 5, 2021, 6:14 a.m. UTC | #4
On 3/1/2021 5:48 PM, Andrew Jones wrote:
> On Fri, Feb 26, 2021 at 04:41:45PM +0800, Ying Fang wrote:
>>
>>
>> On 2/25/2021 8:02 PM, Andrew Jones wrote:
>>> On Thu, Feb 25, 2021 at 04:56:22PM +0800, Ying Fang wrote:
>>>> An accurate cpu topology may help improve the cpu scheduler's decision
>>>> making when dealing with multi-core systems, so a cpu topology description
>>>> is helpful to provide the guest with the right view. Dario Faggioli's talk
>>>> in [0] also shows that the virtual topology may have an impact on
>>>> scheduling performance. Thus this patch series is posted to introduce cpu
>>>> topology support for the arm platform.
>>>>
>>>> Both fdt and ACPI are used to present the cpu topology. To describe the
>>>> cpu topology via ACPI, a PPTT table is built from processor hierarchy
>>>> node structures. This series is derived from [1], where we tried to bring
>>>> both cpu and cache topology support to the arm platform, but there are
>>>> still some issues to solve for the cache hierarchy. So we split the cpu
>>>> topology part out and send it separately. The patch series supporting the
>>>> cache hierarchy will be sent later, since Salil Mehta's cpu hotplug
>>>> feature needs the cpu topology enabled first and he is waiting for it to
>>>> be upstreamed.
>>>>
>>>> This patch series was initially based on the patches posted by Andrew
>>>> Jones [2]. I jumped in on it since some OS vendor cooperative partners
>>>> are eager for it. Thanks to Andrew for his contribution.
>>>>
>>>> After applying this patch series, launch a guest with the virt-6.0
>>>> machine type and a cpu topology configured with sockets:cores:threads =
>>>> 2:4:2, and you will get the output below from the lscpu command.
>>>>
>>>> -----------------------------------------
>>>> Architecture:                    aarch64
>>>> CPU op-mode(s):                  64-bit
>>>> Byte Order:                      Little Endian
>>>> CPU(s):                          16
>>>> On-line CPU(s) list:             0-15
>>>> Thread(s) per core:              2
>>>
>>> What CPU model was used? Did it actually support threads? If these were
>>
>> It's tested on the Huawei Kunpeng 920 CPU with vcpu host-passthrough.
>> It does not support threads for now, but the next generation, the 930,
>> may support them. Here we emulate a virtual cpu topology; two virtual
>> threads per core are used to do the test.
>>
>>
>>> KVM VCPUs, then I guess MPIDR.MT was not set on the CPUs. Apparently
>>> that didn't confuse Linux? See [1] for how I once tried to deal with
>>> threads.
>>>
>>> [1] https://github.com/rhdrjones/qemu/commit/60218e0dd7b331031b644872d56f2aca42d0ff1e
>>>
>>
>> If an ACPI PPTT table is specified, the Linux kernel won't check the
>> MPIDR register to populate the cpu topology. Moreover, MPIDR does not
>> guarantee a correct cpu topology, so it is not a problem if MPIDR.MT is
>> not set.
> 
> OK, so Linux doesn't care about MPIDR.MT with ACPI. What happens with
> DT?

Looking at the Linux kernel logic, it tries to parse the cpu topology in
smp_prepare_cpus (arch/arm64/kernel/topology.c). If the cpu topology is
provided via DT, the kernel won't check MPIDR any more; this is the same
as when ACPI is enabled.
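
For reference, the arm64 fallback logic looks roughly like this (a
simplified paraphrase of store_cpu_topology() in
arch/arm64/kernel/topology.c, not the literal kernel code):

  void store_cpu_topology(unsigned int cpuid)
  {
      struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
      u64 mpidr;

      /* Topology already populated from DT or ACPI PPTT: keep it. */
      if (cpuid_topo->package_id != -1)
          goto topology_populated;

      /* Otherwise derive a topology from the MPIDR affinity fields. */
      mpidr = read_cpuid_mpidr();
      if (mpidr & MPIDR_MT_BITMASK) {
          /* Multithreaded cores: Aff0 numbers the threads. */
          cpuid_topo->thread_id  = MPIDR_AFFINITY_LEVEL(mpidr, 0);
          cpuid_topo->core_id    = MPIDR_AFFINITY_LEVEL(mpidr, 1);
          cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
      } else {
          cpuid_topo->thread_id  = -1;
          cpuid_topo->core_id    = MPIDR_AFFINITY_LEVEL(mpidr, 0);
          cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
      }

  topology_populated:
      update_siblings_masks(cpuid);
  }

So MPIDR (and its MT bit) only matters when neither DT nor ACPI provides
a topology.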

> 
>>
>>>> Core(s) per socket:              4
>>>> Socket(s):                       2
>>>
>>> Good, but what happens if you specify '-smp 16'? Do you get 16 sockets
>                ^^ You didn't answer this question.

The latest QEMU uses smp_parse() to parse the -smp command line; by
default, if -smp 16 is given, the arm64 virt machine will get 16 sockets,
each with one core.
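
For reference, the missing-value backfill in the generic smp_parse()
works roughly like this when only the cpu count is given (a simplified
sketch of the QEMU logic, not the literal code):

  /* -smp 16, with sockets/cores/threads all omitted */
  cores   = cores   > 0 ? cores   : 1;
  threads = threads > 0 ? threads : 1;
  sockets = sockets > 0 ? sockets : cpus / (cores * threads);
  /* result: sockets = 16, cores = 1, threads = 1 */

i.e. missing values are computed preferring sockets over cores, which is
the opposite of the cores-first default proposed in [2].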

> 
>>> each with 1 core? Or, do you get 1 socket with 16 cores? And, which do
>>> we want and why? If you look at [2], then you'll see I was assuming we
>>> want to prefer cores over sockets, since without topology descriptions
>>> that's what the Linux guest kernel would do.
>>>
>>> [2] https://github.com/rhdrjones/qemu/commit/c0670b1bccb4d08c7cf7c6957cc8878a2af131dd
>>>

Thanks, I'll check what Linux does by default.

>>>> NUMA node(s):                    2
>>>
>>> Why do we have two NUMA nodes in the guest? The two sockets in the
>>> guest should not imply this.
>>
>> The two NUMA nodes are emulated by QEMU since we already have the guest
>> NUMA topology feature.
> 
> That's what I suspected, and I presume only a single node is present when
> you don't use QEMU's NUMA feature - even when you supply a VCPU topology
> with multiple sockets?

Agreed, I would expect a single NUMA node too if we do not use the guest
NUMA feature. Here I provided the guest with two NUMA nodes and set the
cpu affinity only to run a test.


fangying March 10, 2021, 9:43 a.m. UTC | #5
-----Original Message-----
From: Andrew Jones [mailto:drjones@redhat.com]
Sent: March 10, 2021, 17:21
To: fangying <fangying1@huawei.com>
Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Introduce cpu topology support


> Hi Ying Fang,
> 
> Do you plan to repost this soon? It'd be great if it got into 6.0.
>
> Thanks,
> drew


Hi Andrew,
Thanks for the reminder.
Yes, I will repost and update this series soon.
It seems the 6.0 soft feature freeze is on March 16, so the deadline
is close.
