[v14,11/11] docs/s390x/cpu topology: document s390x cpu topology

Message ID 20230105145313.168489-12-pmorel@linux.ibm.com (mailing list archive)
State New, archived
Series s390x: CPU Topology

Commit Message

Pierre Morel Jan. 5, 2023, 2:53 p.m. UTC
Add some basic examples for the definition of cpu topology
in s390x.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 docs/system/s390x/cpu-topology.rst | 292 +++++++++++++++++++++++++++++
 docs/system/target-s390x.rst       |   1 +
 2 files changed, 293 insertions(+)
 create mode 100644 docs/system/s390x/cpu-topology.rst

Comments

Thomas Huth Jan. 12, 2023, 11:46 a.m. UTC | #1
On 05/01/2023 15.53, Pierre Morel wrote:
> Add some basic examples for the definition of cpu topology
> in s390x.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>   docs/system/s390x/cpu-topology.rst | 292 +++++++++++++++++++++++++++++
>   docs/system/target-s390x.rst       |   1 +
>   2 files changed, 293 insertions(+)
>   create mode 100644 docs/system/s390x/cpu-topology.rst
> 
> diff --git a/docs/system/s390x/cpu-topology.rst b/docs/system/s390x/cpu-topology.rst
> new file mode 100644
> index 0000000000..0020b70b50
> --- /dev/null
> +++ b/docs/system/s390x/cpu-topology.rst
> @@ -0,0 +1,292 @@
> +CPU Topology on s390x
> +=====================
> +
> +CPU Topology on S390x provides up to 5 levels of topology containers:

You sometimes write "Topology" with a capital T and sometimes in lower case.
I'd suggest writing it in lower case consistently everywhere.

> +nodes, drawers, books, sockets and CPUs.

Hmm, so here you mention that "nodes" are usable on s390x, too? ... in 
another spot below, you don't mention these anymore...

> +While the higher level containers, Containers Topology List Entries,
> +(Containers TLE) define a tree hierarchy, the lowest level of topology
> +definition, the CPU Topology List Entry (CPU TLE), provides the placement
> +of the CPUs inside the parent container.
> +
> +Currently QEMU CPU topology uses a single level of container: the sockets.
> +
> +For backward compatibility, threads can be declared on the ``-smp`` command
> +line. They will be seen as CPUs by the guest as long as multithreading
> +is not really supported by QEMU for S390.

Maybe mention that threads are not allowed with machine types >= 7.2 anymore?

> +Beside the topological tree, S390x provides 3 CPU attributes:
> +- CPU type
> +- polarity entitlement
> +- dedication
> +
> +Prerequisites
> +-------------
> +
> +To use CPU Topology a Linux QEMU/KVM machine providing the CPU Topology facility
> +(STFLE bit 11) is required.
> +
> +However, since this facility has been enabled by default in an early version
> +of QEMU, we use a capability, ``KVM_CAP_S390_CPU_TOPOLOGY``, to notify KVM
> +QEMU use of the CPU Topology.

Has it? I thought bit 11 was not enabled by default in the past?

> +Enabling CPU topology
> +---------------------
> +
> +Currently, CPU topology is only enabled in the host model.

add a "by default if support is available in the host kernel" at the end of 
the sentence?

> +Enabling CPU topology in a CPU model is done by setting the CPU flag
> +``ctop`` to ``on`` like in:
> +
> +.. code-block:: bash
> +
> +   -cpu gen16b,ctop=on
> +
> +Having the topology disabled by default allows migration between
> +old and new QEMU without adding new flags.
> +
> +Default topology usage
> +----------------------
> +
> +The CPU Topology, can be specified on the QEMU command line
> +with the ``-smp`` or the ``-device`` QEMU command arguments
> +without using any new attributes.
> +In this case, the topology will be calculated by simply adding
> +to the topology the cores based on the core-id starting with
> +core-0 at position 0 of socket-0, book-0, drawer-0 with default

... here you don't mention "nodes" anymore (which you still mentioned at the 
beginning of the doc).

> +modifier attributes: horizontal polarity and no dedication.
> +
> +In the following machine we define 8 sockets with 4 cores each.
> +Note that S390 QEMU machines do not implement multithreading.

I'd use s390x instead of S390 to avoid confusion with 31-bit machines.

> +.. code-block:: bash
> +
> +  $ qemu-system-s390x -m 2G \
> +    -cpu gen16b,ctop=on \
> +    -smp cpus=5,sockets=8,cores=4,maxcpus=32 \
> +    -device host-s390x-cpu,core-id=14 \
> +
> +New CPUs can be plugged using the device_add hmp command like in:
> +
> +.. code-block:: bash
> +
> +  (qemu) device_add gen16b-s390x-cpu,core-id=9
> +
> +The core-id defines the placement of the core in the topology by
> +starting with core 0 in socket 0 up to maxcpus.
> +
> +In the example above:
> +
> +* There are 5 CPUs provided to the guest with the ``-smp`` command line
> +  They will take the core-ids 0,1,2,3,4
> +  As we have 4 cores in a socket, we have 4 CPUs provided
> +  to the guest in socket 0, with core-ids 0,1,2,3.
> +  The last cpu, with core-id 4, will be on socket 1.
> +
> +* the core with ID 14 provided by the ``-device`` command line will
> +  be placed in socket 3, with core-id 14
> +
> +* the core with ID 9 provided by the ``device_add`` qmp command will
> +  be placed in socket 2, with core-id 9
> +
> +Note that the core ID is machine wide and the CPU TLE masks provided
> +by the STSI instruction will be written in a big endian mask:
> +
> +* in socket 0: 0xf000000000000000 (core id 0,1,2,3)
> +* in socket 1: 0x0800000000000000 (core id 4)
> +* in socket 2: 0x0040000000000000 (core id 9)
> +* in socket 3: 0x0002000000000000 (core id 14)

Hmm, who's supposed to be the audience of this documentation? Users? 
Developers? For a doc in docs/system/ I'd expect this to be documentation 
for users, so this already seems to be way too much implementation detail. 
If this is supposed to be a doc for developers instead, the file should 
rather go into docs/devel/. Or maybe you want both? ... then you should 
split the information in here into two files, I think, one in docs/system/ 
and one in docs/devel/ .
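
For readers following the arithmetic in the quoted example, here is a minimal
Python sketch of how the default socket and the big endian CPU TLE mask can be
derived from a core-id. It assumes 4 cores per socket (from the quoted
``-smp`` line) and a mask origin of 0; the function names are made up for
illustration and are not QEMU code.

```python
CORES_PER_SOCKET = 4  # assumption: taken from "-smp ...,sockets=8,cores=4"


def default_socket(core_id, cores_per_socket=CORES_PER_SOCKET):
    """Socket a core lands in when only core-id is given (hypothetical helper)."""
    return core_id // cores_per_socket


def tle_mask(core_ids, origin=0):
    """64-bit big endian mask: core `origin` maps to the most significant bit."""
    mask = 0
    for core_id in core_ids:
        bit = core_id - origin
        if not 0 <= bit < 64:
            raise ValueError(f"core-id {core_id} does not fit a mask at origin {origin}")
        mask |= 1 << (63 - bit)
    return mask


def masks_by_socket(core_ids):
    """Group core-ids by default socket and format one mask per socket."""
    by_socket = {}
    for core_id in core_ids:
        by_socket.setdefault(default_socket(core_id), []).append(core_id)
    return {s: f"0x{tle_mask(ids):016x}" for s, ids in sorted(by_socket.items())}


if __name__ == "__main__":
    # Cores from the quoted example: 0-4 via -smp, 14 via -device, 9 via device_add.
    for socket, mask in masks_by_socket([0, 1, 2, 3, 4, 9, 14]).items():
        print(f"socket {socket}: {mask}")
```

With the core-ids of the example this gives the four masks listed in the
quoted text, which suggests the masks there are relative to origin 0.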

> +Defining the topology on command line
> +-------------------------------------
> +
> +The topology can be defined entirely during the CPU definition,
> +with the exception of CPU 0 which must be defined with the -smp
> +argument.
> +
> +For example, here we set the position of the cores 1,2,3 on
> +drawer 1, book 1, socket 2 and cores 0,9 and 14 on drawer 0,
> +book 0, socket 0 with all horizontal polarity and not dedicated.
> +The core 4, will be set on its default position on socket 1
> +(since we have 4 core per socket) and we define it with dedication and
> +vertical high entitlement.
> +
> +.. code-block:: bash
> +
> +  $ qemu-system-s390x -m 2G \
> +    -cpu gen16b,ctop=on \
> +    -smp cpus=1,sockets=8,cores=4,maxcpus=32 \
> +    \
> +    -device gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=1 \
> +    -device gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=2 \
> +    -device gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=3 \
> +    \
> +    -device gen16b-s390x-cpu,drawer-id=0,book-id=0,socket-id=0,core-id=9 \
> +    -device gen16b-s390x-cpu,drawer-id=0,book-id=0,socket-id=0,core-id=14 \
> +    \
> +    -device gen16b-s390x-cpu,core-id=4,dedicated=on,polarity=3 \
> +
> +Polarity and dedication
> +-----------------------

Since you are already using the terms "polarity" and "dedication" in the 
previous paragraphs, it might make sense to move this section earlier in 
the document, so that users learn about these terms before they are used 
elsewhere?

> +Polarity can be of two types: horizontal or vertical.
> +
> +The horizontal polarization specifies that all guest's vCPUs get
> +almost the same amount of provisioning of real CPU by the host.
> +
> +The vertical polarization specifies that guest's vCPU can get
> +different  real CPU provisions:

Please remove one space between "different" and "real".

> +- a vCPU with Vertical high entitlement specifies that this
> +  vCPU gets 100% of the real CPU provisioning.
> +
> +- a vCPU with Vertical medium entitlement specifies that this
> +  vCPU shares the real CPU with other vCPU.

"with *one* other vCPU" or rather "with other vCPU*s*" ?

> +
> +- a vCPU with Vertical low entitlement specifies that this
> +  vCPU only get real CPU provisioning when no other vCPU need it.
> +
> +In the case a vCPU with vertical high entitlement does not use
> +the real CPU, the unused "slack" can be dispatched to other vCPU
> +with medium or low entitlement.
> +
> +The host indicates to the guest how the real CPU resources are
> +provided to the vCPUs through the SYSIB with two polarity bits
> +inside the CPU TLE.
> +
> +Bits d - Polarization
> +0 0      Horizontal
> +0 1      Vertical low entitlement
> +1 0      Vertical medium entitlement
> +1 1      Vertical high entitlement

That SYSIB stuff looks like details for developers again ... I think you 
should either add more explanations here (I assume the average user does not 
know the term SYSIB), move it to a separate developers file or drop it.

> +A subsystem reset puts all vCPU of the configuration into the
> +horizontal polarization.
> +
> +The admin specifies the dedicated bit when the vCPU is dedicated
> +to a single real CPU.
> +
> +As for the Linux admin, the dedicated bit is an indication on the
> +affinity of a vCPU for a real CPU while the entitlement indicates the
> +sharing or exclusivity of use.
> +
> +QAPI interface for topology
> +---------------------------

A "grep -r QAPI docs/system/" shows hardly any entries there. I think QAPI 
documentation should go into docs/devel instead.

> +Let's start QEMU with the following command:
> +
> +.. code-block:: bash
> +
> + sudo /usr/local/bin/qemu-system-s390x \
> +    -enable-kvm \
> +    -cpu z14,ctop=on \
> +    -smp 1,drawers=3,books=3,sockets=2,cores=2,maxcpus=36 \
> +    \
> +    -device z14-s390x-cpu,core-id=19,polarity=3 \
> +    -device z14-s390x-cpu,core-id=11,polarity=1 \
> +    -device z14-s390x-cpu,core-id=12,polarity=3 \
> +   ...
> +
> +and see the result when using of the QAPI interface.
> +
> +query-topology
> ++++++++++++++++
> +
> +The command cpu-topology allows the admin to query the topology

Not sure if the average admin runs QMP directly ... maybe rather talk about 
the "upper layers like libvirt" here or something similar.
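
As an aside for anyone checking the numbers: the drawer, book and socket
values in the query-topology reply of the patch are consistent with a plain
division of the core-id by the container sizes given on the quoted ``-smp``
line (drawers=3,books=3,sockets=2,cores=2). The Python sketch below shows that
arithmetic. It is an assumption inferred from the example, and
``default_placement`` is a hypothetical name, not a QEMU function.

```python
def default_placement(core_id, drawers, books, sockets, cores):
    """Derive (drawer, book, socket) from a machine-wide core-id.

    books, sockets and cores are counts per parent container, as on -smp.
    """
    max_cores = drawers * books * sockets * cores
    if not 0 <= core_id < max_cores:
        raise ValueError(f"core-id {core_id} is outside 0..{max_cores - 1}")
    socket_index = core_id // cores        # machine-wide socket number
    book_index = socket_index // sockets   # machine-wide book number
    return {
        "drawer": book_index // books,
        "book": book_index % books,
        "socket": socket_index % sockets,
    }


if __name__ == "__main__":
    # Core-ids used in the quoted example: 0 from -smp, 11, 12 and 19 from -device.
    for core_id in (0, 11, 12, 19):
        print(core_id, default_placement(core_id, drawers=3, books=3, sockets=2, cores=2))
```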

> +tree and modifier for all configured vCPU.
> +
> +.. code-block:: QMP
> +
> + -> { "execute": "query-topology" }
> +    {"return":
> +        [
> +            {
> +            "origin": 0,
> +            "dedicated": false,
> +            "book": 0,
> +            "socket": 0,
> +            "drawer": 0,
> +            "polarity": 0,
> +            "mask": "0x8000000000000000"
> +            },
> +            {
> +                "origin": 0,
> +                "dedicated": false,
> +                "book": 2,
> +                "socket": 1,
> +                "drawer": 0,
> +                "polarity": 1,
> +                "mask": "0x0010000000000000"
> +            },
> +            {
> +                "origin": 0,
> +                "dedicated": false,
> +                "book": 0,
> +                "socket": 0,
> +                "drawer": 1,
> +                "polarity": 3,
> +                "mask": "0x0008000000000000"
> +            },
> +            {
> +                "origin": 0,
> +                "dedicated": false,
> +                "book": 1,
> +                "socket": 1,
> +                "drawer": 1,
> +                "polarity": 3,
> +                "mask": "0x0000100000000000"
> +            }
> +        ]
> +    }
> +
> +change-topology
> ++++++++++++++++
> +
> +The command change-topology allows the admin to modify the topology
> +tree or the topology modifiers of a vCPU in the configuration.
> +
> +.. code-block:: QMP
> +
> + -> { "execute": "change-topology",
> +      "arguments": {
> +         "core": 11,
> +         "socket": 0,
> +         "book": 0,
> +         "drawer": 0,
> +         "polarity": 0,
> +         "dedicated": false
> +      }
> +    }
> + <- {"return": {}}
> +
> +
> +event POLARITY_CHANGE
> ++++++++++++++++++++++
> +
> +When a guest is requesting a modification of the polarity,
> +QEMU sends a POLARITY_CHANGE event.
> +
> +When requesting the change, the guest only specifies horizontal or
> +vertical polarity.
> +The dedication and fine grain vertical entitlement depends on admin
> +to set according to its response to this event.
> +
> +Note that a vertical polarized dedicated vCPU can only have a high
> +entitlement, this gives 6 possibilities for a vCPU polarity:
> +
> +- Horizontal
> +- Horizontal dedicated
> +- Vertical low
> +- Vertical medium
> +- Vertical high
> +- Vertical high dedicated
> +
> +Example of the event received when the guest issues PTF(0) to request

Please mention that PTF is a CPU instruction (and provide the full name).
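
The six possibilities listed in the quoted text follow from one rule: a
dedicated vCPU with vertical polarity must have high entitlement. A small
Python sketch of that rule is shown below. It assumes the numeric encoding
from the polarity bits table in the patch (0 horizontal, 1 vertical low,
2 vertical medium, 3 vertical high); the helper names are illustrative only.

```python
from itertools import product

# Assumption: encoding taken from the "Bits d - Polarization" table in the patch.
POLARITY = {
    0: "horizontal",
    1: "vertical low",
    2: "vertical medium",
    3: "vertical high",
}


def is_valid(polarity, dedicated):
    """True if the polarity/dedication pair is allowed by the rule above."""
    if polarity not in POLARITY:
        return False
    # A dedicated vCPU may be horizontal or vertical high, never low or medium.
    if dedicated and polarity in (1, 2):
        return False
    return True


def valid_combinations():
    """List every allowed combination as a readable label."""
    return [
        POLARITY[p] + (" dedicated" if d else "")
        for p, d in product(sorted(POLARITY), (False, True))
        if is_valid(p, d)
    ]


if __name__ == "__main__":
    for label in valid_combinations():
        print(label)
```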

  Thomas
Daniel P. Berrangé Jan. 12, 2023, 11:58 a.m. UTC | #2
On Thu, Jan 05, 2023 at 03:53:13PM +0100, Pierre Morel wrote:
> Add some basic examples for the definition of cpu topology
> in s390x.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>  docs/system/s390x/cpu-topology.rst | 292 +++++++++++++++++++++++++++++
>  docs/system/target-s390x.rst       |   1 +
>  2 files changed, 293 insertions(+)
>  create mode 100644 docs/system/s390x/cpu-topology.rst
> 
> diff --git a/docs/system/s390x/cpu-topology.rst b/docs/system/s390x/cpu-topology.rst
> new file mode 100644
> index 0000000000..0020b70b50
> --- /dev/null
> +++ b/docs/system/s390x/cpu-topology.rst
> @@ -0,0 +1,292 @@
> +CPU Topology on s390x
> +=====================
> +
> +CPU Topology on S390x provides up to 5 levels of topology containers:
> +nodes, drawers, books, sockets and CPUs.

The last level should be 'cores' not CPUs for QEMU terminology.

> +While the higher level containers, Containers Topology List Entries,
> +(Containers TLE) define a tree hierarchy, the lowest level of topology
> +definition, the CPU Topology List Entry (CPU TLE), provides the placement
> +of the CPUs inside the parent container.

With regards,
Daniel
Pierre Morel Jan. 18, 2023, 5:10 p.m. UTC | #3
On 1/12/23 12:58, Daniel P. Berrangé wrote:
> On Thu, Jan 05, 2023 at 03:53:13PM +0100, Pierre Morel wrote:
>> Add some basic examples for the definition of cpu topology
>> in s390x.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   docs/system/s390x/cpu-topology.rst | 292 +++++++++++++++++++++++++++++
>>   docs/system/target-s390x.rst       |   1 +
>>   2 files changed, 293 insertions(+)
>>   create mode 100644 docs/system/s390x/cpu-topology.rst
>>
>> diff --git a/docs/system/s390x/cpu-topology.rst b/docs/system/s390x/cpu-topology.rst
>> new file mode 100644
>> index 0000000000..0020b70b50
>> --- /dev/null
>> +++ b/docs/system/s390x/cpu-topology.rst
>> @@ -0,0 +1,292 @@
>> +CPU Topology on s390x
>> +=====================
>> +
>> +CPU Topology on S390x provides up to 5 levels of topology containers:
>> +nodes, drawers, books, sockets and CPUs.
> 
> The last level should be 'cores' not CPUs for QEMU terminology.

Yes, thanks.

Regards,
Pierre
Pierre Morel Jan. 19, 2023, 2:48 p.m. UTC | #4
On 1/12/23 12:46, Thomas Huth wrote:
> On 05/01/2023 15.53, Pierre Morel wrote:
>> Add some basic examples for the definition of cpu topology
>> in s390x.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   docs/system/s390x/cpu-topology.rst | 292 +++++++++++++++++++++++++++++
>>   docs/system/target-s390x.rst       |   1 +
>>   2 files changed, 293 insertions(+)
>>   create mode 100644 docs/system/s390x/cpu-topology.rst
>>
>> diff --git a/docs/system/s390x/cpu-topology.rst 
>> b/docs/system/s390x/cpu-topology.rst
>> new file mode 100644
>> index 0000000000..0020b70b50
>> --- /dev/null
>> +++ b/docs/system/s390x/cpu-topology.rst
>> @@ -0,0 +1,292 @@
>> +CPU Topology on s390x
>> +=====================
>> +
>> +CPU Topology on S390x provides up to 5 levels of topology containers:
> 
> You sometimes write "Topology" with a capital T, sometimes lower case 
> ... I'd suggest to write it lower case consistently everywhere.

OK

> 
>> +nodes, drawers, books, sockets and CPUs.
> 
> Hmm, so here you mention that "nodes" are usable on s390x, too? ... in 
> another spot below, you don't mention these anymore...

No, nodes are not needed here; I will remove that.

> 
>> +While the higher level containers, Containers Topology List Entries,
>> +(Containers TLE) define a tree hierarchy, the lowest level of topology
>> +definition, the CPU Topology List Entry (CPU TLE), provides the 
>> placement
>> +of the CPUs inside the parent container.
>> +
>> +Currently QEMU CPU topology uses a single level of container: the 
>> sockets.
>> +
>> +For backward compatibility, threads can be declared on the ``-smp`` 
>> command
>> +line. They will be seen as CPUs by the guest as long as multithreading
>> +is not really supported by QEMU for S390.
> 
> Maybe mention that threads are not allowed with machine types >= 7.2 
> anymore?

yes

> 
>> +Beside the topological tree, S390x provides 3 CPU attributes:
>> +- CPU type
>> +- polarity entitlement
>> +- dedication
>> +
>> +Prerequisites
>> +-------------
>> +
>> +To use CPU Topology a Linux QEMU/KVM machine providing the CPU 
>> Topology facility
>> +(STFLE bit 11) is required.
>> +
>> +However, since this facility has been enabled by default in an early 
>> version
>> +of QEMU, we use a capability, ``KVM_CAP_S390_CPU_TOPOLOGY``, to 
>> notify KVM
>> +QEMU use of the CPU Topology.
> 
> Has it? I thought bit 11 was not enabled by default in the past?

Bit 11 was enabled by default in QEMU, but not in KVM.
However, no code was provided to support STSI(15) and the PTF 
instruction, which are both enabled by facility 11.

So if we had enabled facility 11 in KVM without precaution, a guest 
seeing facility 11 would use the PTF instruction and get a program interrupt.

Therefore we need a KVM capability to enable bit 11 in KVM.


> 
>> +Enabling CPU topology
>> +---------------------
>> +
>> +Currently, CPU topology is only enabled in the host model.
> 
> add a "by default if support is available in the host kernel" at the end 
> of the sentence?

yes, thx

> 
>> +Enabling CPU topology in a CPU model is done by setting the CPU flag
>> +``ctop`` to ``on`` like in:
>> +
>> +.. code-block:: bash
>> +
>> +   -cpu gen16b,ctop=on
>> +
>> +Having the topology disabled by default allows migration between
>> +old and new QEMU without adding new flags.
>> +
>> +Default topology usage
>> +----------------------
>> +
>> +The CPU Topology, can be specified on the QEMU command line
>> +with the ``-smp`` or the ``-device`` QEMU command arguments
>> +without using any new attributes.
>> +In this case, the topology will be calculated by simply adding
>> +to the topology the cores based on the core-id starting with
>> +core-0 at position 0 of socket-0, book-0, drawer-0 with default
> 
> ... here you don't mention "nodes" anymore (which you still mentioned at 
> the beginning of the doc).

I removed it

> 
>> +modifier attributes: horizontal polarity and no dedication.
>> +
>> +In the following machine we define 8 sockets with 4 cores each.
>> +Note that S390 QEMU machines do not implement multithreading.
> 
> I'd use s390x instead of S390 to avoid confusion with 31-bit machines.

OK

> 
>> +.. code-block:: bash
>> +
>> +  $ qemu-system-s390x -m 2G \
>> +    -cpu gen16b,ctop=on \
>> +    -smp cpus=5,sockets=8,cores=4,maxcpus=32 \
>> +    -device host-s390x-cpu,core-id=14 \
>> +
>> +New CPUs can be plugged using the device_add hmp command like in:
>> +
>> +.. code-block:: bash
>> +
>> +  (qemu) device_add gen16b-s390x-cpu,core-id=9
>> +
>> +The core-id defines the placement of the core in the topology by
>> +starting with core 0 in socket 0 up to maxcpus.
>> +
>> +In the example above:
>> +
>> +* There are 5 CPUs provided to the guest with the ``-smp`` command line
>> +  They will take the core-ids 0,1,2,3,4
>> +  As we have 4 cores in a socket, we have 4 CPUs provided
>> +  to the guest in socket 0, with core-ids 0,1,2,3.
>> +  The last cpu, with core-id 4, will be on socket 1.
>> +
>> +* the core with ID 14 provided by the ``-device`` command line will
>> +  be placed in socket 3, with core-id 14
>> +
>> +* the core with ID 9 provided by the ``device_add`` qmp command will
>> +  be placed in socket 2, with core-id 9
>> +
>> +Note that the core ID is machine wide and the CPU TLE masks provided
>> +by the STSI instruction will be written in a big endian mask:
>> +
>> +* in socket 0: 0xf000000000000000 (core id 0,1,2,3)
>> +* in socket 1: 0x0800000000000000 (core id 4)
>> +* in socket 2: 0x0040000000000000 (core id 9)
>> +* in socket 3: 0x0002000000000000 (core id 14)
> 
> Hmm, who's supposed to be the audience of this documentation? Users? 
> Developers? For a doc in docs/system/ I'd expect this to be a 
> documentation for users, so this seems to be way too much of 
> implementation detail here already. If this is supposed to be a doc for 
> developers instead, the file should likely rather go into doc/devel/ 
> instead. Or maybe you want both? ... then you should split the 
> information in here in two files, I think, one in docs/system/ and one 
> in docs/devel/ .

I am not sure: what is in devel is documentation on the QAPI interface 
itself, not on individual commands.
On the other hand, QAPI seems to have its own way of documenting the 
commands.

So what I wrote here is more detailed than the QAPI documentation.
Maybe I should write these details there and remove them here,
just naming the command, info and event without details.


> 
>> +Defining the topology on command line
>> +-------------------------------------
>> +
>> +The topology can be defined entirely during the CPU definition,
>> +with the exception of CPU 0 which must be defined with the -smp
>> +argument.
>> +
>> +For example, here we set the position of the cores 1,2,3 on
>> +drawer 1, book 1, socket 2 and cores 0,9 and 14 on drawer 0,
>> +book 0, socket 0 with all horizontal polarity and not dedicated.
>> +The core 4, will be set on its default position on socket 1
>> +(since we have 4 core per socket) and we define it with dedication and
>> +vertical high entitlement.
>> +
>> +.. code-block:: bash
>> +
>> +  $ qemu-system-s390x -m 2G \
>> +    -cpu gen16b,ctop=on \
>> +    -smp cpus=1,sockets=8,cores=4,maxcpus=32 \
>> +    \
>> +    -device 
>> gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=1 \
>> +    -device 
>> gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=2 \
>> +    -device 
>> gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=3 \
>> +    \
>> +    -device 
>> gen16b-s390x-cpu,drawer-id=0,book-id=0,socket-id=0,core-id=9 \
>> +    -device 
>> gen16b-s390x-cpu,drawer-id=0,book-id=0,socket-id=0,core-id=14 \
>> +    \
>> +    -device gen16b-s390x-cpu,core-id=4,dedicated=on,polarity=3 \
>> +
>> +Polarity and dedication
>> +-----------------------
> 
> Since you are using the terms "polarity" and "dedication" in the 
> previous paragraphs already, it might make sense to move this section 
> here earlier in the document to teach the users about this first, before 
> using the terms in the other paragraphs?

yes

> 
>> +Polarity can be of two types: horizontal or vertical.
>> +
>> +The horizontal polarization specifies that all guest's vCPUs get
>> +almost the same amount of provisioning of real CPU by the host.
>> +
>> +The vertical polarization specifies that guest's vCPU can get
>> +different  real CPU provisions:
> 
> Please remove one space between "different" and "real".

OK

> 
>> +- a vCPU with Vertical high entitlement specifies that this
>> +  vCPU gets 100% of the real CPU provisioning.
>> +
>> +- a vCPU with Vertical medium entitlement specifies that this
>> +  vCPU shares the real CPU with other vCPU.
> 
> "with *one* other vCPU" or rather "with other vCPU*s*" ?

thx, vCPUs

> 
>> +
>> +- a vCPU with Vertical low entitlement specifies that this
>> +  vCPU only get real CPU provisioning when no other vCPU need it.
>> +
>> +In the case a vCPU with vertical high entitlement does not use
>> +the real CPU, the unused "slack" can be dispatched to other vCPU
>> +with medium or low entitlement.
>> +
>> +The host indicates to the guest how the real CPU resources are
>> +provided to the vCPUs through the SYSIB with two polarity bits
>> +inside the CPU TLE.
>> +
>> +Bits d - Polarization
>> +0 0      Horizontal
>> +0 1      Vertical low entitlement
>> +1 0      Vertical medium entitlement
>> +1 1      Vertical high entitlement
> 
> That SYSIB stuff looks like details for developers again ... I think you 
> should either add more explanations here (I assume the average user does 
> not know the term SYSIB), move it to a separate developers file or drop it.
> 

OK, I drop it

>> +A subsystem reset puts all vCPU of the configuration into the
>> +horizontal polarization.
>> +
>> +The admin specifies the dedicated bit when the vCPU is dedicated
>> +to a single real CPU.
>> +
>> +As for the Linux admin, the dedicated bit is an indication on the
>> +affinity of a vCPU for a real CPU while the entitlement indicates the
>> +sharing or exclusivity of use.
>> +
>> +QAPI interface for topology
>> +---------------------------
> 
> A "grep -r QAPI docs/system/" shows hardly any entries there. I think 
> QAPI documentation should go into docs/devel instead.

See the discussion above.
I will either enhance the QAPI internal doc or move this into devel.


> 
>> +Let's start QEMU with the following command:
>> +
>> +.. code-block:: bash
>> +
>> + sudo /usr/local/bin/qemu-system-s390x \
>> +    -enable-kvm \
>> +    -cpu z14,ctop=on \
>> +    -smp 1,drawers=3,books=3,sockets=2,cores=2,maxcpus=36 \
>> +    \
>> +    -device z14-s390x-cpu,core-id=19,polarity=3 \
>> +    -device z14-s390x-cpu,core-id=11,polarity=1 \
>> +    -device z14-s390x-cpu,core-id=12,polarity=3 \
>> +   ...
>> +
>> +and see the result when using of the QAPI interface.
>> +
>> +query-topology
>> ++++++++++++++++
>> +
>> +The command cpu-topology allows the admin to query the topology
> 
> Not sure if the average admin runs QMP directly ... maybe rather talk 
> about the "upper layers like libvirt" here or something similar.
> 
>> +tree and modifier for all configured vCPU.
>> +
>> +.. code-block:: QMP
>> +
>> + -> { "execute": "query-topology" }
>> +    {"return":
>> +        [
>> +            {
>> +            "origin": 0,
>> +            "dedicated": false,
>> +            "book": 0,
>> +            "socket": 0,
>> +            "drawer": 0,
>> +            "polarity": 0,
>> +            "mask": "0x8000000000000000"
>> +            },
>> +            {
>> +                "origin": 0,
>> +                "dedicated": false,
>> +                "book": 2,
>> +                "socket": 1,
>> +                "drawer": 0,
>> +                "polarity": 1,
>> +                "mask": "0x0010000000000000"
>> +            },
>> +            {
>> +                "origin": 0,
>> +                "dedicated": false,
>> +                "book": 0,
>> +                "socket": 0,
>> +                "drawer": 1,
>> +                "polarity": 3,
>> +                "mask": "0x0008000000000000"
>> +            },
>> +            {
>> +                "origin": 0,
>> +                "dedicated": false,
>> +                "book": 1,
>> +                "socket": 1,
>> +                "drawer": 1,
>> +                "polarity": 3,
>> +                "mask": "0x0000100000000000"
>> +            }
>> +        ]
>> +    }
>> +
>> +change-topology
>> ++++++++++++++++
>> +
>> +The command change-topology allows the admin to modify the topology
>> +tree or the topology modifiers of a vCPU in the configuration.
>> +
>> +.. code-block:: QMP
>> +
>> + -> { "execute": "change-topology",
>> +      "arguments": {
>> +         "core": 11,
>> +         "socket": 0,
>> +         "book": 0,
>> +         "drawer": 0,
>> +         "polarity": 0,
>> +         "dedicated": false
>> +      }
>> +    }
>> + <- {"return": {}}
>> +
>> +
>> +event POLARITY_CHANGE
>> ++++++++++++++++++++++
>> +
>> +When a guest is requesting a modification of the polarity,
>> +QEMU sends a POLARITY_CHANGE event.
>> +
>> +When requesting the change, the guest only specifies horizontal or
>> +vertical polarity.
>> +The dedication and fine grain vertical entitlement depends on admin
>> +to set according to its response to this event.
>> +
>> +Note that a vertical polarized dedicated vCPU can only have a high
>> +entitlement, this gives 6 possibilities for a vCPU polarity:
>> +
>> +- Horizontal
>> +- Horizontal dedicated
>> +- Vertical low
>> +- Vertical medium
>> +- Vertical high
>> +- Vertical high dedicated
>> +
>> +Example of the event received when the guest issues PTF(0) to request
> 
> Please mention that PTF is a CPU instruction (and provide the full name).

Yes, thanks.

regards,
Pierre


> 
>   Thomas
>

Patch

diff --git a/docs/system/s390x/cpu-topology.rst b/docs/system/s390x/cpu-topology.rst
new file mode 100644
index 0000000000..0020b70b50
--- /dev/null
+++ b/docs/system/s390x/cpu-topology.rst
@@ -0,0 +1,292 @@ 
+CPU Topology on s390x
+=====================
+
+CPU Topology on S390x provides up to 5 levels of topology containers:
+nodes, drawers, books, sockets and CPUs.
+While the higher level containers, Containers Topology List Entries,
+(Containers TLE) define a tree hierarchy, the lowest level of topology
+definition, the CPU Topology List Entry (CPU TLE), provides the placement
+of the CPUs inside the parent container.
+
+Currently QEMU CPU topology uses a single level of container: the sockets.
+
+For backward compatibility, threads can be declared on the ``-smp`` command
+line. They will be seen as CPUs by the guest as long as multithreading
+is not really supported by QEMU for S390.
+
+Beside the topological tree, S390x provides 3 CPU attributes:
+- CPU type
+- polarity entitlement
+- dedication
+
+Prerequisites
+-------------
+
+To use CPU Topology a Linux QEMU/KVM machine providing the CPU Topology facility
+(STFLE bit 11) is required.
+
+However, since this facility has been enabled by default in an early version
+of QEMU, we use a capability, ``KVM_CAP_S390_CPU_TOPOLOGY``, to notify KVM
+QEMU use of the CPU Topology.
+
+Enabling CPU topology
+---------------------
+
+Currently, CPU topology is only enabled in the host model.
+
+Enabling CPU topology in a CPU model is done by setting the CPU flag
+``ctop`` to ``on`` like in:
+
+.. code-block:: bash
+
+   -cpu gen16b,ctop=on
+
+Having the topology disabled by default allows migration between
+old and new QEMU without adding new flags.
+
+Default topology usage
+----------------------
+
+The CPU Topology, can be specified on the QEMU command line
+with the ``-smp`` or the ``-device`` QEMU command arguments
+without using any new attributes.
+In this case, the topology will be calculated by simply adding
+to the topology the cores based on the core-id starting with
+core-0 at position 0 of socket-0, book-0, drawer-0 with default
+modifier attributes: horizontal polarity and no dedication.
+
+In the following machine we define 8 sockets with 4 cores each.
+Note that S390 QEMU machines do not implement multithreading.
+
+.. code-block:: bash
+
+  $ qemu-system-s390x -m 2G \
+    -cpu gen16b,ctop=on \
+    -smp cpus=5,sockets=8,cores=4,maxcpus=32 \
+    -device host-s390x-cpu,core-id=14 \
+
+New CPUs can be plugged using the device_add hmp command like in:
+
+.. code-block:: bash
+
+  (qemu) device_add gen16b-s390x-cpu,core-id=9
+
+The core-id defines the placement of the core in the topology by
+starting with core 0 in socket 0 up to maxcpus.
+
+In the example above:
+
+* There are 5 CPUs provided to the guest with the ``-smp`` command line.
+  They take the core-ids 0, 1, 2, 3 and 4.
+  As we have 4 cores in a socket, 4 of these CPUs are in socket 0,
+  with core-ids 0, 1, 2 and 3.
+  The last CPU, with core-id 4, is in socket 1.
+
+* The core with ID 14 provided by the ``-device`` command line is
+  placed in socket 3, with core-id 14.
+
+* The core with ID 9 provided by the ``device_add`` HMP command is
+  placed in socket 2, with core-id 9.
+
+Note that the core ID is machine wide and that the CPU TLE masks provided
+by the STSI instruction are written as big-endian masks, in which the most
+significant bit stands for core 0:
+
+* in socket 0: 0xf000000000000000 (core id 0,1,2,3)
+* in socket 1: 0x0800000000000000 (core id 4)
+* in socket 2: 0x0040000000000000 (core id 9)
+* in socket 3: 0x0002000000000000 (core id 14)
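+
+The default placement and the masks above can be reproduced with a short
+Python sketch. It is not part of QEMU: the helper names ``default_socket``
+and ``core_mask`` are made up for illustration, and it assumes a 64-bit mask
+with origin 0 in which the most significant bit stands for core 0, as in the
+list above.
+
+.. code-block:: python
+
+   CORES_PER_SOCKET = 4  # from "-smp ...,cores=4" in the example above
+
+   def default_socket(core_id, cores_per_socket=CORES_PER_SOCKET):
+       """Socket a core lands in when cores fill sockets in core-id order."""
+       return core_id // cores_per_socket
+
+   def core_mask(core_ids):
+       """64-bit mask whose most significant bit stands for core 0."""
+       mask = 0
+       for core_id in core_ids:
+           mask |= 1 << (63 - core_id)
+       return f"0x{mask:016x}"
+
+   # The four groups of cores from the example above.
+   for cores in ([0, 1, 2, 3], [4], [9], [14]):
+       print(default_socket(cores[0]), core_mask(cores))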
+
+Defining the topology on command line
+-------------------------------------
+
+The topology can be defined entirely during the CPU definition,
+with the exception of CPU 0, which must be defined with the ``-smp``
+argument.
+
+For example, here we place cores 1, 2 and 3 on drawer 1, book 1,
+socket 2, and cores 0, 9 and 14 on drawer 0, book 0, socket 0, all
+with horizontal polarity and no dedication.
+Core 4 is placed at its default position on socket 1
+(since we have 4 cores per socket), and we define it as dedicated with
+vertical high entitlement.
+
+.. code-block:: bash
+
+  $ qemu-system-s390x -m 2G \
+    -cpu gen16b,ctop=on \
+    -smp cpus=1,sockets=8,cores=4,maxcpus=32 \
+    \
+    -device gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=1 \
+    -device gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=2 \
+    -device gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=3 \
+    \
+    -device gen16b-s390x-cpu,drawer-id=0,book-id=0,socket-id=0,core-id=9 \
+    -device gen16b-s390x-cpu,drawer-id=0,book-id=0,socket-id=0,core-id=14 \
+    \
+    -device gen16b-s390x-cpu,core-id=4,dedicated=on,polarity=3
+
+Polarity and dedication
+-----------------------
+
+Polarity can be of two types: horizontal or vertical.
+
+Horizontal polarization specifies that all of the guest's vCPUs get
+almost the same amount of real CPU provisioning from the host.
+
+Vertical polarization specifies that the guest's vCPUs can get
+different real CPU provisioning:
+
+- a vCPU with vertical high entitlement gets 100% of the real CPU
+  provisioning.
+
+- a vCPU with vertical medium entitlement shares the real CPU with
+  other vCPUs.
+
+- a vCPU with vertical low entitlement only gets real CPU provisioning
+  when no other vCPU needs it.
+
+When a vCPU with vertical high entitlement does not use the real CPU,
+the unused "slack" can be dispatched to other vCPUs with medium or low
+entitlement.
+
+The host indicates to the guest how the real CPU resources are
+provided to the vCPUs through the SYSIB with two polarity bits
+inside the CPU TLE.
+
+==== ===========================
+Bits Polarization
+==== ===========================
+0 0  Horizontal
+0 1  Vertical low entitlement
+1 0  Vertical medium entitlement
+1 1  Vertical high entitlement
+==== ===========================
+
+A subsystem reset puts all vCPUs of the configuration into
+horizontal polarization.
+
+The admin specifies the dedicated bit when the vCPU is dedicated
+to a single real CPU.
+
+For the Linux admin, the dedicated bit indicates the affinity of a
+vCPU for a real CPU, while the entitlement indicates whether the use
+of the real CPU is shared or exclusive.
+
+QAPI interface for topology
+---------------------------
+
+Let's start QEMU with the following command:
+
+.. code-block:: bash
+
+ sudo /usr/local/bin/qemu-system-s390x \
+    -enable-kvm \
+    -cpu z14,ctop=on \
+    -smp 1,drawers=3,books=3,sockets=2,cores=2,maxcpus=36 \
+    \
+    -device z14-s390x-cpu,core-id=19,polarity=3 \
+    -device z14-s390x-cpu,core-id=11,polarity=1 \
+    -device z14-s390x-cpu,core-id=12,polarity=3 \
+   ...
+
+and see the result when using the QAPI interface.
+
+query-topology
++++++++++++++++
+
+The command query-topology allows the admin to query the topology
+tree and modifiers for all configured vCPUs.
+
+.. code-block:: QMP
+
+ -> { "execute": "query-topology" }
+ <- {"return":
+        [
+            {
+                "origin": 0,
+                "dedicated": false,
+                "book": 0,
+                "socket": 0,
+                "drawer": 0,
+                "polarity": 0,
+                "mask": "0x8000000000000000"
+            },
+            {
+                "origin": 0,
+                "dedicated": false,
+                "book": 2,
+                "socket": 1,
+                "drawer": 0,
+                "polarity": 1,
+                "mask": "0x0010000000000000"
+            },
+            {
+                "origin": 0,
+                "dedicated": false,
+                "book": 0,
+                "socket": 0,
+                "drawer": 1,
+                "polarity": 3,
+                "mask": "0x0008000000000000"
+            },
+            {
+                "origin": 0,
+                "dedicated": false,
+                "book": 1,
+                "socket": 1,
+                "drawer": 1,
+                "polarity": 3,
+                "mask": "0x0000100000000000"
+            }
+        ]
+    }
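+
+The ``mask`` and ``origin`` fields can be turned back into core-ids. The
+following Python sketch is only an illustration: ``cores_in_entry`` is a
+hypothetical helper, and it assumes that the most significant bit of the
+64-bit mask stands for the core whose ID equals ``origin``, which matches
+the output above.
+
+.. code-block:: python
+
+   def cores_in_entry(entry):
+       """List the core-ids set in one entry of the query-topology result."""
+       mask = int(entry["mask"], 16)
+       return [entry["origin"] + bit for bit in range(64)
+               if mask & (1 << (63 - bit))]
+
+   # Two entries copied from the output above.
+   entries = [
+       {"origin": 0, "drawer": 0, "book": 2, "socket": 1,
+        "mask": "0x0010000000000000"},
+       {"origin": 0, "drawer": 1, "book": 1, "socket": 1,
+        "mask": "0x0000100000000000"},
+   ]
+   for entry in entries:
+       print(entry["drawer"], entry["book"], entry["socket"],
+             cores_in_entry(entry))
+
+Under this assumption the two entries decode to core-ids 11 and 19, two of
+the cores added with ``-device`` on the command line.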
+
+change-topology
++++++++++++++++
+
+The command change-topology allows the admin to modify the topology
+tree or the topology modifiers of a vCPU in the configuration.
+
+.. code-block:: QMP
+
+ -> { "execute": "change-topology",
+      "arguments": {
+         "core": 11,
+         "socket": 0,
+         "book": 0,
+         "drawer": 0,
+         "polarity": 0,
+         "dedicated": false
+      }
+    }
+ <- {"return": {}}
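+
+Such a command can also be generated by a script. The Python sketch below
+only builds the JSON text; sending it over a QMP socket is left out. The
+helper ``change_topology_cmd`` is hypothetical, and the argument names are
+copied from the example above.
+
+.. code-block:: python
+
+   import json
+
+   def change_topology_cmd(core, socket, book, drawer, polarity, dedicated):
+       """Build the change-topology QMP command as a JSON string."""
+       return json.dumps({
+           "execute": "change-topology",
+           "arguments": {
+               "core": core, "socket": socket, "book": book,
+               "drawer": drawer, "polarity": polarity,
+               "dedicated": dedicated,
+           },
+       })
+
+   # Same request as above: core 11 to drawer 0, book 0, socket 0,
+   # horizontal polarity, not dedicated.
+   print(change_topology_cmd(11, 0, 0, 0, 0, False))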
+
+
+event POLARITY_CHANGE
++++++++++++++++++++++
+
+When a guest requests a modification of the polarity,
+QEMU sends a POLARITY_CHANGE event.
+
+When requesting the change, the guest only specifies horizontal or
+vertical polarity.
+The dedication and the fine-grained vertical entitlement are left to the
+admin, who sets them in response to this event.
+
+Note that a vertically polarized dedicated vCPU can only have a high
+entitlement. This gives 6 possibilities for a vCPU polarity:
+
+- Horizontal
+- Horizontal dedicated
+- Vertical low
+- Vertical medium
+- Vertical high
+- Vertical high dedicated
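+
+The six combinations can be checked mechanically. The Python sketch below
+is an illustration only, not QEMU code: ``is_valid_modifier`` is a
+hypothetical helper, and the numeric values follow the polarity bits listed
+in the "Polarity and dedication" section.
+
+.. code-block:: python
+
+   HORIZONTAL, VERTICAL_LOW, VERTICAL_MEDIUM, VERTICAL_HIGH = 0, 1, 2, 3
+
+   def is_valid_modifier(polarity, dedicated):
+       """A dedicated vCPU is either horizontal or vertical high."""
+       if polarity not in (HORIZONTAL, VERTICAL_LOW,
+                           VERTICAL_MEDIUM, VERTICAL_HIGH):
+           return False
+       if dedicated:
+           return polarity in (HORIZONTAL, VERTICAL_HIGH)
+       return True
+
+   valid = [(p, d) for p in range(4) for d in (False, True)
+            if is_valid_modifier(p, d)]
+   print(len(valid))  # 6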
+
+Example of the event received when the guest issues PTF(0) to request
+a horizontal polarity:
+
+.. code-block:: QMP
+
+ <- { "event": "POLARITY_CHANGE",
+      "data": { "polarity": 0 },
+      "timestamp": { "seconds": 1401385907, "microseconds": 422329 } }
+
+
diff --git a/docs/system/target-s390x.rst b/docs/system/target-s390x.rst
index c636f64113..ff0ffe04f3 100644
--- a/docs/system/target-s390x.rst
+++ b/docs/system/target-s390x.rst
@@ -33,3 +33,4 @@  Architectural features
 .. toctree::
    s390x/bootdevices
    s390x/protvirt
+   s390x/cpu-topology