[0/4] target/ppc: TCG SMT support for spapr

Message ID: 20230605112323.179259-1-npiggin@gmail.com

Message

Nicholas Piggin June 5, 2023, 11:23 a.m. UTC
The previous RFC is here:

https://lists.gnu.org/archive/html/qemu-ppc/2023-05/msg00453.html

This series drops patch 1 from the previous RFC, which is more of
a standalone bugfix.

It also addresses Cédric's comments, except for a nicer way to
set cpu_index vs PIR/TIR SPRs, which is not quite trivial.

This limits support for SMT to POWER8 and newer. It is also
incompatible with nested-HV so that is checked for too.

I kept the iteration over CPUs to find siblings for now, because
similar loops exist in a few places, and it is not conceptually
difficult for SMT, just fiddly code to improve. For now it
should not be much of a performance concern.
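
To illustrate the kind of loop meant here, a toy model in Python (not
the QEMU C code). The VCpu class and smt_siblings() are hypothetical
names, and the assumption that siblings are the CPUs sharing
cpu_index // threads_per_core is a simplification for the sketch:

```python
from dataclasses import dataclass


@dataclass
class VCpu:
    # Hypothetical stand-in for a vCPU; only a flat index is modelled.
    cpu_index: int


def smt_siblings(cpus, cpu, threads_per_core):
    """Return every CPU in the same core as `cpu`, including `cpu` itself.

    Assumption: threads of one core occupy consecutive cpu_index values,
    so the core number is cpu_index // threads_per_core.
    """
    core = cpu.cpu_index // threads_per_core
    # Linear scan over all CPUs: O(n) per lookup, which is the cost the
    # cover letter accepts for now.
    return [c for c in cpus if c.cpu_index // threads_per_core == core]
```

The scan visits every CPU on each lookup, so the cost grows with the
total CPU count rather than with the number of threads per core.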

I removed hypervisor msgsnd support from patch 3, which is not
required for spapr and added significantly to the patch.

For now nobody has objected to the way shared SPR access is
handled (serialised with TCG atomics support) so we'll keep
going with it.

Thanks,
Nick

Nicholas Piggin (4):
  target/ppc: Add initial flags and helpers for SMT support
  target/ppc: Add support for SMT CTRL register
  target/ppc: Add msgsndp and DPDES SMT support
  spapr: Allow up to 8 threads SMT on POWER8 and newer

 hw/ppc/ppc.c             |  6 ++++
 hw/ppc/spapr.c           | 16 +++++++---
 hw/ppc/spapr_caps.c      | 14 ++++++++
 hw/ppc/spapr_cpu_core.c  |  7 ++--
 include/hw/ppc/ppc.h     |  1 +
 target/ppc/cpu.h         |  9 ++++++
 target/ppc/cpu_init.c    |  5 +++
 target/ppc/excp_helper.c | 30 ++++++++++++++---
 target/ppc/gdbstub.c     |  2 +-
 target/ppc/helper.h      |  2 ++
 target/ppc/misc_helper.c | 69 ++++++++++++++++++++++++++++++++++++----
 target/ppc/translate.c   | 46 ++++++++++++++++++++++++++-
 12 files changed, 188 insertions(+), 19 deletions(-)

Comments

Cédric Le Goater June 6, 2023, 2:09 p.m. UTC | #1
On 6/5/23 13:23, Nicholas Piggin wrote:
> Previous RFC here
> 
> https://lists.gnu.org/archive/html/qemu-ppc/2023-05/msg00453.html
> 
> This series drops patch 1 from the previous, which is more of
> a standalone bugfix.
> 
> Also accounted for Cedric's comments, except a nicer way to
> set cpu_index vs PIR/TIR SPRs, which is not quite trivial.
> 
> This limits support for SMT to POWER8 and newer. It is also
> incompatible with nested-HV so that is checked for too.
> 
> Iterating CPUs to find siblings for now I kept because similar
> loops exist in a few places, and it is not conceptually
> difficult for SMT, just fiddly code to improve. For now it
> should not be much performance concern.
> 
> I removed hypervisor msgsnd support from patch 3, which is not
> required for spapr and added significantly to the patch.
> 
> For now nobody has objected to the way shared SPR access is
> handled (serialised with TCG atomics support) so we'll keep
> going with it.

Cc:ing more people for possible feedback.

Thanks,

C.
Nicholas Piggin June 20, 2023, 10:12 a.m. UTC | #2
On Wed Jun 7, 2023 at 12:09 AM AEST, Cédric Le Goater wrote:
> On 6/5/23 13:23, Nicholas Piggin wrote:
> > Previous RFC here
> > 
> > https://lists.gnu.org/archive/html/qemu-ppc/2023-05/msg00453.html
> > 
> > This series drops patch 1 from the previous, which is more of
> > a standalone bugfix.
> > 
> > Also accounted for Cedric's comments, except a nicer way to
> > set cpu_index vs PIR/TIR SPRs, which is not quite trivial.
> > 
> > This limits support for SMT to POWER8 and newer. It is also
> > incompatible with nested-HV so that is checked for too.
> > 
> > Iterating CPUs to find siblings for now I kept because similar
> > loops exist in a few places, and it is not conceptually
> > difficult for SMT, just fiddly code to improve. For now it
> > should not be much performance concern.
> > 
> > I removed hypervisor msgsnd support from patch 3, which is not
> > required for spapr and added significantly to the patch.
> > 
> > For now nobody has objected to the way shared SPR access is
> > handled (serialised with TCG atomics support) so we'll keep
> > going with it.
>
> Cc:ing more people for possible feedback.

Not much feedback so I'll plan to go with this.

A more performant implementation might try to synchronize
threads at the register level rather than serialize everything,
but SMT shared registers are not too performance critical so
this should do for now.

Thanks,
Nick
Cédric Le Goater June 20, 2023, 10:27 a.m. UTC | #3
On 6/20/23 12:12, Nicholas Piggin wrote:
> On Wed Jun 7, 2023 at 12:09 AM AEST, Cédric Le Goater wrote:
>> On 6/5/23 13:23, Nicholas Piggin wrote:
>>> Previous RFC here
>>>
>>> https://lists.gnu.org/archive/html/qemu-ppc/2023-05/msg00453.html
>>>
>>> This series drops patch 1 from the previous, which is more of
>>> a standalone bugfix.
>>>
>>> Also accounted for Cedric's comments, except a nicer way to
>>> set cpu_index vs PIR/TIR SPRs, which is not quite trivial.
>>>
>>> This limits support for SMT to POWER8 and newer. It is also
>>> incompatible with nested-HV so that is checked for too.
>>>
>>> Iterating CPUs to find siblings for now I kept because similar
>>> loops exist in a few places, and it is not conceptually
>>> difficult for SMT, just fiddly code to improve. For now it
>>> should not be much performance concern.
>>>
>>> I removed hypervisor msgsnd support from patch 3, which is not
>>> required for spapr and added significantly to the patch.
>>>
>>> For now nobody has objected to the way shared SPR access is
>>> handled (serialised with TCG atomics support) so we'll keep
>>> going with it.
>>
>> Cc:ing more people for possible feedback.
> 
> Not much feedback so I'll plan to go with this.
> 
> A more performant implementation might try to synchronize
> threads at the register level rather than serialize everything,
> but SMT shared registers are not too performance critical so
> this should do for now.

Yes. Could you please rebase this series on upstream?

It would be good to add tests for SMT. Maybe we could extend:

   tests/avocado/ppc_pseries.py

with a couple of extra QEMU configs adding 'threads=' (if possible) and
check for:

   "CPU maps initialized for Y threads per core"

and

   "smp: Brought up 1 node, X*Y CPUs"

?
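
A rough sketch of such a check in plain Python, independent of the
avocado API. The function name check_smt_boot() and the two regexes are
illustrative assumptions built from the messages quoted above, not
existing test code:

```python
import re

# Patterns for the two guest console messages quoted above.
THREADS_RE = re.compile(r"CPU maps initialized for (\d+) threads? per core")
BROUGHT_UP_RE = re.compile(r"smp: Brought up (\d+) nodes?, (\d+) CPUs?")


def check_smt_boot(console_log, cores, threads):
    """Return True if the log reports `threads` per core and cores*threads CPUs."""
    m = THREADS_RE.search(console_log)
    if m is None or int(m.group(1)) != threads:
        return False
    m = BROUGHT_UP_RE.search(console_log)
    # Group 2 is the total CPU count, expected to be X*Y.
    return m is not None and int(m.group(2)) == cores * threads
```

For example, a guest started with 2 cores and threads=4 would be
expected to pass check_smt_boot(log, cores=2, threads=4).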

Thanks,

C.
Nicholas Piggin June 20, 2023, 10:45 a.m. UTC | #4
On Tue Jun 20, 2023 at 8:27 PM AEST, Cédric Le Goater wrote:
> On 6/20/23 12:12, Nicholas Piggin wrote:
> > On Wed Jun 7, 2023 at 12:09 AM AEST, Cédric Le Goater wrote:
> >> On 6/5/23 13:23, Nicholas Piggin wrote:
> >>> Previous RFC here
> >>>
> >>> https://lists.gnu.org/archive/html/qemu-ppc/2023-05/msg00453.html
> >>>
> >>> This series drops patch 1 from the previous, which is more of
> >>> a standalone bugfix.
> >>>
> >>> Also accounted for Cedric's comments, except a nicer way to
> >>> set cpu_index vs PIR/TIR SPRs, which is not quite trivial.
> >>>
> >>> This limits support for SMT to POWER8 and newer. It is also
> >>> incompatible with nested-HV so that is checked for too.
> >>>
> >>> Iterating CPUs to find siblings for now I kept because similar
> >>> loops exist in a few places, and it is not conceptually
> >>> difficult for SMT, just fiddly code to improve. For now it
> >>> should not be much performance concern.
> >>>
> >>> I removed hypervisor msgsnd support from patch 3, which is not
> >>> required for spapr and added significantly to the patch.
> >>>
> >>> For now nobody has objected to the way shared SPR access is
> >>> handled (serialised with TCG atomics support) so we'll keep
> >>> going with it.
> >>
> >> Cc:ing more people for possible feedback.
> > 
> > Not much feedback so I'll plan to go with this.
> > 
> > A more performant implementation might try to synchronize
> > threads at the register level rather than serialize everything,
> > but SMT shared registers are not too performance critical so
> > this should do for now.
>
> yes. Could you please rebase this series on upstream ?

Oh yeah I should have said, I will rebase and resend.

> It would be good to add tests for SMT. May be we could extend :
>
>    tests/avocado/ppc_pseries.py
>
> with a couple of extra QEMU configs adding 'threads=' (if possible) and
> check :
>
>    "CPU maps initialized for Y threads per core"
>
> and
>
>    "smp: Brought up 1 node, X*Y CPUs"
>
> ?

Yeah that could be a good idea, I'll try it.

Thanks,
Nick
Cédric Le Goater June 22, 2023, 7:26 a.m. UTC | #5
On 6/20/23 12:45, Nicholas Piggin wrote:
> On Tue Jun 20, 2023 at 8:27 PM AEST, Cédric Le Goater wrote:
>> On 6/20/23 12:12, Nicholas Piggin wrote:
>>> On Wed Jun 7, 2023 at 12:09 AM AEST, Cédric Le Goater wrote:
>>>> On 6/5/23 13:23, Nicholas Piggin wrote:
>>>>> Previous RFC here
>>>>>
>>>>> https://lists.gnu.org/archive/html/qemu-ppc/2023-05/msg00453.html
>>>>>
>>>>> This series drops patch 1 from the previous, which is more of
>>>>> a standalone bugfix.
>>>>>
>>>>> Also accounted for Cedric's comments, except a nicer way to
>>>>> set cpu_index vs PIR/TIR SPRs, which is not quite trivial.
>>>>>
>>>>> This limits support for SMT to POWER8 and newer. It is also
>>>>> incompatible with nested-HV so that is checked for too.
>>>>>
>>>>> Iterating CPUs to find siblings for now I kept because similar
>>>>> loops exist in a few places, and it is not conceptually
>>>>> difficult for SMT, just fiddly code to improve. For now it
>>>>> should not be much performance concern.
>>>>>
>>>>> I removed hypervisor msgsnd support from patch 3, which is not
>>>>> required for spapr and added significantly to the patch.
>>>>>
>>>>> For now nobody has objected to the way shared SPR access is
>>>>> handled (serialised with TCG atomics support) so we'll keep
>>>>> going with it.
>>>>
>>>> Cc:ing more people for possible feedback.
>>>
>>> Not much feedback so I'll plan to go with this.
>>>
>>> A more performant implementation might try to synchronize
>>> threads at the register level rather than serialize everything,
>>> but SMT shared registers are not too performance critical so
>>> this should do for now.
>>
>> yes. Could you please rebase this series on upstream ?
> 
> Oh yeah I should have said, I will rebase and resend.

Here is a tree to check the rebase against:

   https://github.com/legoater/qemu/commits/ppc-next

>> It would be good to add tests for SMT. May be we could extend :
>>
>>     tests/avocado/ppc_pseries.py
>>
>> with a couple of extra QEMU configs adding 'threads=' (if possible) and
>> check :
>>
>>     "CPU maps initialized for Y threads per core"
>>
>> and
>>
>>     "smp: Brought up 1 node, X*Y CPUs"
>>
>> ?
> 
> Yeah that could be a good idea, I'll try it.

It can come later.

Thanks,

C.