diff mbox series

[v7,01/16] hw/cpu: introduce CPU clusters

Message ID 20181123091729.29921-2-luc.michel@greensocs.com (mailing list archive)
State New, archived
Headers show
Series gdbstub: support for the multiprocess extension | expand

Commit Message

Luc Michel Nov. 23, 2018, 9:17 a.m. UTC
This commit adds the cpu-cluster type. It aims at gathering CPUs from
the same cluster in a machine.

For now it only has a `cluster-id` property.

Signed-off-by: Luc Michel <luc.michel@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
---
 include/hw/cpu/cluster.h | 38 ++++++++++++++++++++++++++
 hw/cpu/cluster.c         | 59 ++++++++++++++++++++++++++++++++++++++++
 MAINTAINERS              |  2 ++
 hw/cpu/Makefile.objs     |  2 +-
 4 files changed, 100 insertions(+), 1 deletion(-)
 create mode 100644 include/hw/cpu/cluster.h
 create mode 100644 hw/cpu/cluster.c

Comments

Eduardo Habkost Nov. 23, 2018, 6:10 p.m. UTC | #1
Hi,

Sorry for not reviewing this series earlier.  I just stumbled
upon this part of the code:

On Fri, Nov 23, 2018 at 10:17:14AM +0100, Luc Michel wrote:
> This commit adds the cpu-cluster type. It aims at gathering CPUs from
> the same cluster in a machine.
> 
> For now it only has a `cluster-id` property.
> 
> Signed-off-by: Luc Michel <luc.michel@greensocs.com>
> Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
[...]
> +static void cpu_cluster_init(Object *obj)
> +{
> +    static uint32_t cluster_id_auto_increment;
> +    CPUClusterState *cluster = CPU_CLUSTER(obj);
> +
> +    cluster->cluster_id = cluster_id_auto_increment++;

I see that you implemented this after a suggestion from Philippe,
but I'm worried about this kind of side-effect on object/device
code.  I'm afraid this will bite us back in the future.  We were
bitten by problems caused by automatic cpu_index assignment on
CPU instance_init, and we took a while to fix that.

If you really want to do this and assign cluster_id
automatically, please do it on realize, where it won't have
unexpected side-effects after a simple `qom-list-properties` QMP
command.

I would also add a huge warning above the cluster_id field
declaration, mentioning that the field is not supposed to be used
for anything except debugging.  I think there's a large risk of
people trying to reuse the field incorrectly, just like cpu_index
was reused for multiple (conflicting) purposes in the past.


> +}
> +
> +static Property cpu_cluster_properties[] = {
> +    DEFINE_PROP_UINT32("cluster-id", CPUClusterState, cluster_id, 0),
> +    DEFINE_PROP_END_OF_LIST()
> +};
[...]
Peter Maydell Nov. 23, 2018, 6:14 p.m. UTC | #2
On Fri, 23 Nov 2018 at 18:11, Eduardo Habkost <ehabkost@redhat.com> wrote:
> On Fri, Nov 23, 2018 at 10:17:14AM +0100, Luc Michel wrote:
> > This commit adds the cpu-cluster type. It aims at gathering CPUs from
> > the same cluster in a machine.
> >
> > For now it only has a `cluster-id` property.
> >
> > Signed-off-by: Luc Michel <luc.michel@greensocs.com>
> > Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
> > Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
> [...]
> > +static void cpu_cluster_init(Object *obj)
> > +{
> > +    static uint32_t cluster_id_auto_increment;
> > +    CPUClusterState *cluster = CPU_CLUSTER(obj);
> > +
> > +    cluster->cluster_id = cluster_id_auto_increment++;
>
> I see that you implemented this after a suggestion from Philippe,
> but I'm worried about this kind of side-effect on object/device
> code.  I'm afraid this will bite us back in the future.  We were
> bitten by problems caused by automatic cpu_index assignment on
> CPU instance_init, and we took a while to fix that.
>
> If you really want to do this and assign cluster_id
> automatically, please do it on realize, where it won't have
> unexpected side-effects after a simple `qom-list-properties` QMP
> command.
>
> I would also add a huge warning above the cluster_id field
> declaration, mentioning that the field is not supposed to be used
> for anything except debugging.  I think there's a large risk of
> people trying to reuse the field incorrectly, just like cpu_index
> was reused for multiple (conflicting) purposes in the past.

One thing I would like to do with this new "cpu cluster"
concept is to use it to handle a problem we have at the
moment with TCG, where we assume all CPUs have the same
view of physical memory (and so if CPU A executes from physical
address X it can share translated code with CPU B executing
from physical address X). The idea is that we should include
the CPU cluster number in the TCG hash key that we use to
look up cached translation blocks, so that only CPUs in
the same cluster (assumed to have the same view of memory
and to be identical) share TBs.

If we don't have a unique integer key for the cluster, what
should we use instead ?

thanks
-- PMM
Eduardo Habkost Nov. 23, 2018, 6:24 p.m. UTC | #3
On Fri, Nov 23, 2018 at 06:14:28PM +0000, Peter Maydell wrote:
> On Fri, 23 Nov 2018 at 18:11, Eduardo Habkost <ehabkost@redhat.com> wrote:
> > On Fri, Nov 23, 2018 at 10:17:14AM +0100, Luc Michel wrote:
> > > This commit adds the cpu-cluster type. It aims at gathering CPUs from
> > > the same cluster in a machine.
> > >
> > > For now it only has a `cluster-id` property.
> > >
> > > Signed-off-by: Luc Michel <luc.michel@greensocs.com>
> > > Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
> > > Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
> > [...]
> > > +static void cpu_cluster_init(Object *obj)
> > > +{
> > > +    static uint32_t cluster_id_auto_increment;
> > > +    CPUClusterState *cluster = CPU_CLUSTER(obj);
> > > +
> > > +    cluster->cluster_id = cluster_id_auto_increment++;
> >
> > I see that you implemented this after a suggestion from Philippe,
> > but I'm worried about this kind of side-effect on object/device
> > code.  I'm afraid this will bite us back in the future.  We were
> > bitten by problems caused by automatic cpu_index assignment on
> > CPU instance_init, and we took a while to fix that.
> >
> > If you really want to do this and assign cluster_id
> > automatically, please do it on realize, where it won't have
> > unexpected side-effects after a simple `qom-list-properties` QMP
> > command.
> >
> > I would also add a huge warning above the cluster_id field
> > declaration, mentioning that the field is not supposed to be used
> > for anything except debugging.  I think there's a large risk of
> > people trying to reuse the field incorrectly, just like cpu_index
> > was reused for multiple (conflicting) purposes in the past.
> 
> One thing I would like to do with this new "cpu cluster"
> concept is to use it to handle a problem we have at the
> moment with TCG, where we assume all CPUs have the same
> view of physical memory (and so if CPU A executes from physical
> address X it can share translated code with CPU B executing
> from physical address X). The idea is that we should include
> the CPU cluster number in the TCG hash key that we use to
> look up cached translation blocks, so that only CPUs in
> the same cluster (assumed to have the same view of memory
> and to be identical) share TBs.
> 
> If we don't have a unique integer key for the cluster, what
> should we use instead ?

This sounds like a reasonable use of cluster_id as implemented in
this patch.  The ID would be only used internally and not exposed
to the outside, right?

I'm more worried about cases where we could end up exposing the
ID in an external interface (either to guests, or through QMP or
the command-line).  This happened to cpu_index and we took a long
time to fix the mess.
Peter Maydell Nov. 23, 2018, 6:53 p.m. UTC | #4
On Fri, 23 Nov 2018 at 18:24, Eduardo Habkost <ehabkost@redhat.com> wrote:
>
> On Fri, Nov 23, 2018 at 06:14:28PM +0000, Peter Maydell wrote:
> > One thing I would like to do with this new "cpu cluster"
> > concept is to use it to handle a problem we have at the
> > moment with TCG, where we assume all CPUs have the same
> > view of physical memory (and so if CPU A executes from physical
> > address X it can share translated code with CPU B executing
> > from physical address X). The idea is that we should include
> > the CPU cluster number in the TCG hash key that we use to
> > look up cached translation blocks, so that only CPUs in
> > the same cluster (assumed to have the same view of memory
> > and to be identical) share TBs.
> >
> > If we don't have a unique integer key for the cluster, what
> > should we use instead ?
>
> This sounds like a reasonable use of cluster_id as implemented in
> this patch.  The ID would be only used internally and not exposed
> to the outside, right?

It would be internal to QEMU (not exposed to the guest or
to the user), yes.

> I'm more worried about cases where we could end up exposing the
> ID in an external interface (either to guests, or through QMP or
> the command-line).  This happened to cpu_index and we took a long
> time to fix the mess.

I see, thanks.

My other question about this code was a slightly different one -- are
we guaranteed to be holding the iothread lock when we create
new QOM objects? (ie that we won't have races between two threads
which both try to create new objects and increment the variable)

thanks
-- PMM
Eduardo Habkost Nov. 23, 2018, 10:38 p.m. UTC | #5
On Fri, Nov 23, 2018 at 06:53:07PM +0000, Peter Maydell wrote:
> On Fri, 23 Nov 2018 at 18:24, Eduardo Habkost <ehabkost@redhat.com> wrote:
> >
> > On Fri, Nov 23, 2018 at 06:14:28PM +0000, Peter Maydell wrote:
> > > One thing I would like to do with this new "cpu cluster"
> > > concept is to use it to handle a problem we have at the
> > > moment with TCG, where we assume all CPUs have the same
> > > view of physical memory (and so if CPU A executes from physical
> > > address X it can share translated code with CPU B executing
> > > from physical address X). The idea is that we should include
> > > the CPU cluster number in the TCG hash key that we use to
> > > look up cached translation blocks, so that only CPUs in
> > > the same cluster (assumed to have the same view of memory
> > > and to be identical) share TBs.
> > >
> > > If we don't have a unique integer key for the cluster, what
> > > should we use instead ?
> >
> > This sounds like a reasonable use of cluster_id as implemented in
> > this patch.  The ID would be only used internally and not exposed
> > to the outside, right?
> 
> It would be internal to QEMU (not exposed to the guest or
> to the user), yes.
> 
> > I'm more worried about cases where we could end up exposing the
> > ID in an external interface (either to guests, or through QMP or
> > the command-line).  This happened to cpu_index and we took a long
> > time to fix the mess.
> 
> I see, thanks.
> 
> My other question about this code was a slightly different one -- are
> we guaranteed to be holding the iothread lock when we create
> new QOM objects? (ie that we won't have races between two threads
> which both try to create new objects and increment the variable)

I assume we are, because type_initialize() (called by
object_new()) isn't thread-safe.
Philippe Mathieu-Daudé Nov. 25, 2018, 9:27 p.m. UTC | #6
Hi Eduardo,

On 23/11/18 19:10, Eduardo Habkost wrote:
> Hi,
> 
> Sorry for not reviewing this series earlier.  I just stumbled
> upon this part of the code:
> 
> On Fri, Nov 23, 2018 at 10:17:14AM +0100, Luc Michel wrote:
>> This commit adds the cpu-cluster type. It aims at gathering CPUs from
>> the same cluster in a machine.
>>
>> For now it only has a `cluster-id` property.
>>
>> Signed-off-by: Luc Michel <luc.michel@greensocs.com>
>> Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
>> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
> [...]
>> +static void cpu_cluster_init(Object *obj)
>> +{
>> +    static uint32_t cluster_id_auto_increment;
>> +    CPUClusterState *cluster = CPU_CLUSTER(obj);
>> +
>> +    cluster->cluster_id = cluster_id_auto_increment++;
> 
> I see that you implemented this after a suggestion from Philippe,
> but I'm worried about this kind of side-effect on object/device
> code.  I'm afraid this will bite us back in the future.  We were
> bitten by problems caused by automatic cpu_index assignment on
> CPU instance_init, and we took a while to fix that.

Oops, thanks for catching this. I'm not aware of the cpu_index past issue.

> 
> If you really want to do this and assign cluster_id
> automatically, please do it on realize, where it won't have
> unexpected side-effects after a simple `qom-list-properties` QMP
> command.

This looks clean enough to me.
Do you prefer we don't use static auto_increment at all?

> 
> I would also add a huge warning above the cluster_id field
> declaration, mentioning that the field is not supposed to be used
> for anything except debugging.  I think there's a large risk of
> people trying to reuse the field incorrectly, just like cpu_index
> was reused for multiple (conflicting) purposes in the past.
> 
> 
>> +}
>> +
>> +static Property cpu_cluster_properties[] = {
>> +    DEFINE_PROP_UINT32("cluster-id", CPUClusterState, cluster_id, 0),
>> +    DEFINE_PROP_END_OF_LIST()
>> +};
> [...]
>
Eduardo Habkost Nov. 26, 2018, 1:27 p.m. UTC | #7
On Sun, Nov 25, 2018 at 10:27:04PM +0100, Philippe Mathieu-Daudé wrote:
> Hi Eduardo,
> 
> On 23/11/18 19:10, Eduardo Habkost wrote:
> > Hi,
> > 
> > Sorry for not reviewing this series earlier.  I just stumbled
> > upon this part of the code:
> > 
> > On Fri, Nov 23, 2018 at 10:17:14AM +0100, Luc Michel wrote:
> >> This commit adds the cpu-cluster type. It aims at gathering CPUs from
> >> the same cluster in a machine.
> >>
> >> For now it only has a `cluster-id` property.
> >>
> >> Signed-off-by: Luc Michel <luc.michel@greensocs.com>
> >> Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
> >> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> >> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> >> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
> > [...]
> >> +static void cpu_cluster_init(Object *obj)
> >> +{
> >> +    static uint32_t cluster_id_auto_increment;
> >> +    CPUClusterState *cluster = CPU_CLUSTER(obj);
> >> +
> >> +    cluster->cluster_id = cluster_id_auto_increment++;
> > 
> > I see that you implemented this after a suggestion from Philippe,
> > but I'm worried about this kind of side-effect on object/device
> > code.  I'm afraid this will bite us back in the future.  We were
> > bitten by problems caused by automatic cpu_index assignment on
> > CPU instance_init, and we took a while to fix that.
> 
> Oops, thanks for catching this. I'm not aware of the cpu_index past issue.
> 
> > 
> > If you really want to do this and assign cluster_id
> > automatically, please do it on realize, where it won't have
> > unexpected side-effects after a simple `qom-list-properties` QMP
> > command.
> 
> This looks clean enough to me.
> Do you prefer we don't use static auto_increment at all?

I would prefer to avoid the static variable if possible, but I
won't reject it if the alternatives make the API too complex to
use.

In either case, I'd like to ensure the caveats of the cluster_id
field are clearly documented: no guarantees about ordering or
predictability, making it not appropriate for external
interfaces.


> 
> > 
> > I would also add a huge warning above the cluster_id field
> > declaration, mentioning that the field is not supposed to be used
> > for anything except debugging.  I think there's a large risk of
> > people trying to reuse the field incorrectly, just like cpu_index
> > was reused for multiple (conflicting) purposes in the past.
> > 
> > 
> >> +}
> >> +
> >> +static Property cpu_cluster_properties[] = {
> >> +    DEFINE_PROP_UINT32("cluster-id", CPUClusterState, cluster_id, 0),
> >> +    DEFINE_PROP_END_OF_LIST()
> >> +};
> > [...]
> >
Peter Maydell Nov. 30, 2018, 4:52 p.m. UTC | #8
On Mon, 26 Nov 2018 at 13:27, Eduardo Habkost <ehabkost@redhat.com> wrote:
>
> On Sun, Nov 25, 2018 at 10:27:04PM +0100, Philippe Mathieu-Daudé wrote:
> > Hi Eduardo,
> >
> > On 23/11/18 19:10, Eduardo Habkost wrote:
> > > If you really want to do this and assign cluster_id
> > > automatically, please do it on realize, where it won't have
> > > unexpected side-effects after a simple `qom-list-properties` QMP
> > > command.
> >
> > This looks clean enough to me.
> > Do you prefer we don't use static auto_increment at all?
>
> I would prefer to avoid the static variable if possible, but I
> won't reject it if the alternatives make the API too complex to
> use.

I guess the alternative would be to require the cluster ID
to be a QOM property on the cluster object. Usage would then
be something like
    object_initialize_child(OBJECT(s), "rpu-cluster", &s->rpu_cluster,
                            sizeof(s->rpu_cluster), TYPE_CPU_CLUSTER,
                            &error_abort, NULL);
    object_property_set_int(OBJECT(s), 1, "cluster-id", &error_abort);
    qdev_init_nofail(DEVICE(&s->rpu_cluster));

(ie one extra line setting the cluster ID explicitly).

SoC objects can probably reasonably assume that they are
the only SoC in the system, ie that they can just assign
cluster IDs themselves). If that turns out not to be true
we can make the SoC take a property to specify a cluster ID
base or something.

I guess if we've been bitten by cpu-ID auto-assignment
in the past, starting out with "user picks the cluster ID"
would be safer, and it's not too much pain at the callsite.
Plus it avoids the gdbstub changing its mind about which
order the cluster "processes" appear in if we refactor the
board code.

> In either case, I'd like to ensure the caveats of the cluster_id
> field are clearly documented: no guarantees about ordering or
> predictability, making it not appropriate for external
> interfaces.

The gdb stub is sort of an external interface, of course...

Luc: what are the requirements on boards using CPU cluster
objects? I assume these are both OK:
 * does not use cluster objects at all
   (the gdbstub puts all the CPUs in one process?)
 * all CPUs created are in some CPU cluster
   (the gdbstub uses one process per CPU cluster)
but what about
 * some CPUs are created in a CPU cluster, but some
   are "loose", not in any cluster at all
?
Is that just invalid, or do the "loose" CPUs end up in
a default cluster (gdbstub process), or do they get
one cluster each?

If it's invalid (which doesn't seem like a bad idea),
is there an assert somewhere so that getting it wrong
results in the board failing to start (ie a "make check"
failure) ?

thanks
-- PMM
Eduardo Habkost Nov. 30, 2018, 6:14 p.m. UTC | #9
On Fri, Nov 30, 2018 at 04:52:31PM +0000, Peter Maydell wrote:
> On Mon, 26 Nov 2018 at 13:27, Eduardo Habkost <ehabkost@redhat.com> wrote:
> >
> > On Sun, Nov 25, 2018 at 10:27:04PM +0100, Philippe Mathieu-Daudé wrote:
> > > Hi Eduardo,
> > >
> > > On 23/11/18 19:10, Eduardo Habkost wrote:
> > > > If you really want to do this and assign cluster_id
> > > > automatically, please do it on realize, where it won't have
> > > > unexpected side-effects after a simple `qom-list-properties` QMP
> > > > command.
> > >
> > > This looks clean enough to me.
> > > Do you prefer we don't use static auto_increment at all?
> >
> > I would prefer to avoid the static variable if possible, but I
> > won't reject it if the alternatives make the API too complex to
> > use.
> 
> I guess the alternative would be to require the cluster ID
> to be a QOM property on the cluster object. Usage would then
> be something like
>     object_initialize_child(OBJECT(s), "rpu-cluster", &s->rpu_cluster,
>                             sizeof(s->rpu_cluster), TYPE_CPU_CLUSTER,
>                             &error_abort, NULL);
>     object_property_set_int(OBJECT(s), 1, "cluster-id", &error_abort);
>     qdev_init_nofail(DEVICE(&s->rpu_cluster));
> 
> (ie one extra line setting the cluster ID explicitly).
> 
> SoC objects can probably reasonably assume that they are
> the only SoC in the system, ie that they can just assign
> cluster IDs themselves). If that turns out not to be true
> we can make the SoC take a property to specify a cluster ID
> base or something.
> 
> I guess if we've been bitten by cpu-ID auto-assignment
> in the past, starting out with "user picks the cluster ID"
> would be safer, and it's not too much pain at the callsite.
> Plus it avoids the gdbstub changing its mind about which
> order the cluster "processes" appear in if we refactor the
> board code.

Agreed.

> 
> > In either case, I'd like to ensure the caveats of the cluster_id
> > field are clearly documented: no guarantees about ordering or
> > predictability, making it not appropriate for external
> > interfaces.
> 
> The gdb stub is sort of an external interface, of course...

Right, and it works if you treat it as just an opaque identifier.
It won't work only if you need it to be stable and/or
predictable.

Letting the board set "cluster-id" seems to be safer, and not too
complex.

> 
> Luc: what are the requirements on boards using CPU cluster
> objects? I assume these are both OK:
>  * does not use cluster objects at all
>    (the gdbstub puts all the CPUs in one process?)
>  * all CPUs created are in some CPU cluster
>    (the gdbstub uses one process per CPU cluster)
> but what about
>  * some CPUs are created in a CPU cluster, but some
>    are "loose", not in any cluster at all
> ?
> Is that just invalid, or do the "loose" CPUs end up in
> a default cluster (gdbstub process), or do they get
> one cluster each?
> 
> If it's invalid (which doesn't seem like a bad idea),
> is there an assert somewhere so that getting it wrong
> results in the board failing to start (ie a "make check"
> failure) ?

I'm having the impression that "cluster" has a very clear meaning
and purpose in this discussion, but this is not being documented
anywhere.  Can we clarify at cpu/cluster.c what exactly a cluster
is, and when it's appropriate (or necessary) to group the CPUs
into clusters?

Is it really possible describe the meaning of "cluster" in a way
that doesn't refer to GDB at all?  Is this all about memory
address spaces?  If this is all about memory address spaces,
can't the GDB code detect that automatically by looking at the
address space objects?
Luc Michel Dec. 3, 2018, 11:20 a.m. UTC | #10
On 11/30/18 5:52 PM, Peter Maydell wrote:
> On Mon, 26 Nov 2018 at 13:27, Eduardo Habkost <ehabkost@redhat.com> wrote:
>>
>> On Sun, Nov 25, 2018 at 10:27:04PM +0100, Philippe Mathieu-Daudé wrote:
>>> Hi Eduardo,
>>>
>>> On 23/11/18 19:10, Eduardo Habkost wrote:
>>>> If you really want to do this and assign cluster_id
>>>> automatically, please do it on realize, where it won't have
>>>> unexpected side-effects after a simple `qom-list-properties` QMP
>>>> command.
>>>
>>> This looks clean enough to me.
>>> Do you prefer we don't use static auto_increment at all?
>>
>> I would prefer to avoid the static variable if possible, but I
>> won't reject it if the alternatives make the API too complex to
>> use.
> 
> I guess the alternative would be to require the cluster ID
> to be a QOM property on the cluster object. Usage would then
> be something like
>     object_initialize_child(OBJECT(s), "rpu-cluster", &s->rpu_cluster,
>                             sizeof(s->rpu_cluster), TYPE_CPU_CLUSTER,
>                             &error_abort, NULL);
>     object_property_set_int(OBJECT(s), 1, "cluster-id", &error_abort);
>     qdev_init_nofail(DEVICE(&s->rpu_cluster));
> 
> (ie one extra line setting the cluster ID explicitly).
I had such a line in v3 which was removed with the Philippe's
auto-assign suggestion. Discussions seems to converge to the removal of
the auto-assign mechanism though. I can rollback to manual assign for my
next re-roll.
Philippe: is it OK for you?

> 
> SoC objects can probably reasonably assume that they are
> the only SoC in the system, ie that they can just assign
> cluster IDs themselves). If that turns out not to be true
> we can make the SoC take a property to specify a cluster ID
> base or something.
> 
> I guess if we've been bitten by cpu-ID auto-assignment
> in the past, starting out with "user picks the cluster ID"
> would be safer, and it's not too much pain at the callsite.
> Plus it avoids the gdbstub changing its mind about which
> order the cluster "processes" appear in if we refactor the
> board code.
> 
>> In either case, I'd like to ensure the caveats of the cluster_id
>> field are clearly documented: no guarantees about ordering or
>> predictability, making it not appropriate for external
>> interfaces.
> 
> The gdb stub is sort of an external interface, of course...
> 
> Luc: what are the requirements on boards using CPU cluster
> objects? I assume these are both OK:
>  * does not use cluster objects at all
>    (the gdbstub puts all the CPUs in one process?)
Yes, when no clusters are found, a process with PID 1 is created and
used for all CPUs.
>  * all CPUs created are in some CPU cluster
>    (the gdbstub uses one process per CPU cluster)
> but what about
>  * some CPUs are created in a CPU cluster, but some
>    are "loose", not in any cluster at all> ?
> Is that just invalid, or do the "loose" CPUs end up in
> a default cluster (gdbstub process), or do they get
> one cluster each?
Currently this is valid and the "loose" CPUs end up in the first
available process (which will be PID 1 if we are in your first case,
i.e. no clusters in the machine).

> 
> If it's invalid (which doesn't seem like a bad idea),
> is there an assert somewhere so that getting it wrong
> results in the board failing to start (ie a "make check"
> failure) ?
I could add such a check if you think it's a good idea, i.e. assert that
"no cluster exist" or "all CPUs are in a cluster"

But maybe in the end this check should not be done in the gdb stub? For
the same reasons maybe you want to ensure that all CPUs have a well
known address space or whatever information the cluster will provide?

Thanks.
Peter Maydell Dec. 3, 2018, 11:23 a.m. UTC | #11
On Mon, 3 Dec 2018 at 11:21, Luc Michel <luc.michel@greensocs.com> wrote:
>
> On 11/30/18 5:52 PM, Peter Maydell wrote:
> > Luc: what are the requirements on boards using CPU cluster
> > objects? I assume these are both OK:
> >  * does not use cluster objects at all
> >    (the gdbstub puts all the CPUs in one process?)
> Yes, when no clusters are found, a process with PID 1 is created and
> used for all CPUs.
> >  * all CPUs created are in some CPU cluster
> >    (the gdbstub uses one process per CPU cluster)
> > but what about
> >  * some CPUs are created in a CPU cluster, but some
> >    are "loose", not in any cluster at all> ?
> > Is that just invalid, or do the "loose" CPUs end up in
> > a default cluster (gdbstub process), or do they get
> > one cluster each?
> Currently this is valid and the "loose" CPUs end up in the first
> available process (which will be PID 1 if we are in your first case,
> i.e. no clusters in the machine).

So if there are some defined clusters 1 and 2, and some
loose CPUs, the clusters get PID 1 and PID 2, and the
loose CPUs end up in PID 3? That seems reasonable.

thanks
-- PMM
Luc Michel Dec. 3, 2018, 2:09 p.m. UTC | #12
On 12/3/18 12:23 PM, Peter Maydell wrote:
> On Mon, 3 Dec 2018 at 11:21, Luc Michel <luc.michel@greensocs.com> wrote:
>>
>> On 11/30/18 5:52 PM, Peter Maydell wrote:
>>> Luc: what are the requirements on boards using CPU cluster
>>> objects? I assume these are both OK:
>>>  * does not use cluster objects at all
>>>    (the gdbstub puts all the CPUs in one process?)
>> Yes, when no clusters are found, a process with PID 1 is created and
>> used for all CPUs.
>>>  * all CPUs created are in some CPU cluster
>>>    (the gdbstub uses one process per CPU cluster)
>>> but what about
>>>  * some CPUs are created in a CPU cluster, but some
>>>    are "loose", not in any cluster at all> ?
>>> Is that just invalid, or do the "loose" CPUs end up in
>>> a default cluster (gdbstub process), or do they get
>>> one cluster each?
>> Currently this is valid and the "loose" CPUs end up in the first
>> available process (which will be PID 1 if we are in your first case,
>> i.e. no clusters in the machine).
> 
> So if there are some defined clusters 1 and 2, and some
> loose CPUs, the clusters get PID 1 and PID 2, and the
> loose CPUs end up in PID 3? That seems reasonable.
No sorry I was not clear, the loose CPUs would end up on PID 1 in that case.

The current behaviour is as follows:
During gdb stub initialization:
  - A process is created per cluster.
  - If no cluster are found, an unique process is created with PID 1

When trying to find a CPU PID:
  - If it is attached to a cluster, return the associated process,
  - If it is loose, return the first available process.
Eduardo Habkost Dec. 4, 2018, 6:06 p.m. UTC | #13
On Mon, Dec 03, 2018 at 03:09:14PM +0100, Luc Michel wrote:
> 
> 
> On 12/3/18 12:23 PM, Peter Maydell wrote:
> > On Mon, 3 Dec 2018 at 11:21, Luc Michel <luc.michel@greensocs.com> wrote:
> >>
> >> On 11/30/18 5:52 PM, Peter Maydell wrote:
> >>> Luc: what are the requirements on boards using CPU cluster
> >>> objects? I assume these are both OK:
> >>>  * does not use cluster objects at all
> >>>    (the gdbstub puts all the CPUs in one process?)
> >> Yes, when no clusters are found, a process with PID 1 is created and
> >> used for all CPUs.
> >>>  * all CPUs created are in some CPU cluster
> >>>    (the gdbstub uses one process per CPU cluster)
> >>> but what about
> >>>  * some CPUs are created in a CPU cluster, but some
> >>>    are "loose", not in any cluster at all> ?
> >>> Is that just invalid, or do the "loose" CPUs end up in
> >>> a default cluster (gdbstub process), or do they get
> >>> one cluster each?
> >> Currently this is valid and the "loose" CPUs end up in the first
> >> available process (which will be PID 1 if we are in your first case,
> >> i.e. no clusters in the machine).
> > 
> > So if there are some defined clusters 1 and 2, and some
> > loose CPUs, the clusters get PID 1 and PID 2, and the
> > loose CPUs end up in PID 3? That seems reasonable.
> No sorry I was not clear, the loose CPUs would end up on PID 1 in that case.
> 
> The current behaviour is as follows:
> During gdb stub initialization:
>   - A process is created per cluster.
>   - If no cluster are found, an unique process is created with PID 1
> 
> When trying to find a CPU PID:
>   - If it is attached to a cluster, return the associated process,
>   - If it is loose, return the first available process.

The behavior described by Peter would be more reasonable to me.
Otherwise what was just an accident (getting the CPU assigned to
the first process) will become expected behavior of the API, and
it will be hard to change it later.

In either case, I'm still missing a clear description of what a
cluster is supposed to represent, exactly (see my previous reply
on this thread).
Peter Maydell Dec. 4, 2018, 6:24 p.m. UTC | #14
On Tue, 4 Dec 2018 at 18:06, Eduardo Habkost <ehabkost@redhat.com> wrote:
> In either case, I'm still missing a clear description of what a
> cluster is supposed to represent, exactly (see my previous reply
> on this thread).

Here's my attempt:

A cluster is a group of CPUs which are all identical and have
the same view of the rest of the system.
If CPUs are not identical (for example, Cortex-A53 and
Cortex-A57 CPUs in an Arm big.LITTLE system) they should be
in different clusters.
If the CPUs do not have the same view of memory (for example
the main CPU and a management controller processor) they should
be in different clusters.

I agree that this is slightly confusing, because the concept
is on the boundary between something that's real in hardware
(eg in a big.LITTLE setup the CPUs are in separate hardware
clusters, and of coures a BMC processor and the main CPU
are definitely different things) and something that we're
defining for its effects on the GDB UI and so we can make
sure we don't share TCG translated code where we shouldn't.

thanks
-- PMM
Eduardo Habkost Dec. 4, 2018, 7:05 p.m. UTC | #15
On Tue, Dec 04, 2018 at 06:24:19PM +0000, Peter Maydell wrote:
> On Tue, 4 Dec 2018 at 18:06, Eduardo Habkost <ehabkost@redhat.com> wrote:
> > In either case, I'm still missing a clear description of what a
> > cluster is supposed to represent, exactly (see my previous reply
> > on this thread).
> 
> Here's my attempt:
> 
> A cluster is a group of CPUs which are all identical and have
> the same view of the rest of the system.
> If CPUs are not identical (for example, Cortex-A53 and
> Cortex-A57 CPUs in an Arm big.LITTLE system) they should be
> in different clusters.
> If the CPUs do not have the same view of memory (for example
> the main CPU and a management controller processor) they should
> be in different clusters.
> 
> I agree that this is slightly confusing, because the concept
> is on the boundary between something that's real in hardware
> (eg in a big.LITTLE setup the CPUs are in separate hardware
> clusters, and of coures a BMC processor and the main CPU
> are definitely different things) and something that we're
> defining for its effects on the GDB UI and so we can make
> sure we don't share TCG translated code where we shouldn't.
> 

Thanks!

With that definition in mind, why can't QEMU cluster CPUs
automatically by looking at CPU models and address space objects?
Peter Maydell Dec. 4, 2018, 7:16 p.m. UTC | #16
On Tue, 4 Dec 2018 at 19:05, Eduardo Habkost <ehabkost@redhat.com> wrote:
> On Tue, Dec 04, 2018 at 06:24:19PM +0000, Peter Maydell wrote:
> > A cluster is a group of CPUs which are all identical and have
> > the same view of the rest of the system.

> With that definition in mind, why can't QEMU cluster CPUs
> automatically by looking at CPU models and address space objects?

That sounds like it is in theory feasible and in practice
quite tricky. You would have to look not just at the CPU
model name but also introspect all its properties for
ones which change features of the CPU and are set differently
on different CPUs (and I don't think there's any way to
automatically tell which properties are ones which make
the CPU different for which-cluster purposes and which aren't).
And if we automatically checked whether address space objects
were the same it would rule out implementing devices with
per-cpu banked memory mapped registers by mapping different
things into the AS for each CPU (though that's a hypothetical
at the moment -- I've thought about implementing stuff that
way but we tend to implement that sort of thing by looking
at current_cpu->cpu_index at the moment).

thanks
-- PMM
Eduardo Habkost Dec. 4, 2018, 7:45 p.m. UTC | #17
On Tue, Dec 04, 2018 at 07:16:39PM +0000, Peter Maydell wrote:
> On Tue, 4 Dec 2018 at 19:05, Eduardo Habkost <ehabkost@redhat.com> wrote:
> > On Tue, Dec 04, 2018 at 06:24:19PM +0000, Peter Maydell wrote:
> > > A cluster is a group of CPUs which are all identical and have
> > > the same view of the rest of the system.
> 
> > With that definition in mind, why can't QEMU cluster CPUs
> > automatically by looking at CPU models and address space objects?
> 
> That sounds like it is in theory feasible and in practice
> quite tricky. You would have to look not just at the CPU
> model name but also introspect all its properties for
> ones which change features of the CPU and are set differently
> on different CPUs (and I don't think there's any way to
> automatically tell which properties are ones which make
> the CPU different for which-cluster purposes and which aren't).
> And if we automatically checked whether address space objects
> were the same it would rule out implementing devices with
> per-cpu banked memory mapped registers by mapping different
> things into the AS for each CPU (though that's a hypothetical
> at the moment -- I've thought about implementing stuff that
> way but we tend to implement that sort of thing by looking
> at current_cpu->cpu_index at the moment).

I see.

Can't we at least do something to make sure the cluster objects
make sense?  e.g. by ensuring at least QOM CPU type is the same,
and that cpu->address_space somehow points to
cpu->cluster->address_space?

CCing Paolo, so he can correct me if this doesn't make sense.
Luc Michel Dec. 5, 2018, 1:44 p.m. UTC | #18
On 12/4/18 7:24 PM, Peter Maydell wrote:
> On Tue, 4 Dec 2018 at 18:06, Eduardo Habkost <ehabkost@redhat.com> wrote:
>> In either case, I'm still missing a clear description of what a
>> cluster is supposed to represent, exactly (see my previous reply
>> on this thread).
> 
> Here's my attempt:
> 
> A cluster is a group of CPUs which are all identical and have
> the same view of the rest of the system.
> If CPUs are not identical (for example, Cortex-A53 and
> Cortex-A57 CPUs in an Arm big.LITTLE system) they should be
> in different clusters.
> If the CPUs do not have the same view of memory (for example
> the main CPU and a management controller processor) they should
> be in different clusters.
Thanks, I'll use that as a start point for the documentation in cluster.h
Luc Michel Dec. 5, 2018, 1:47 p.m. UTC | #19
On 12/4/18 7:06 PM, Eduardo Habkost wrote:
> On Mon, Dec 03, 2018 at 03:09:14PM +0100, Luc Michel wrote:
>>
>>
>> On 12/3/18 12:23 PM, Peter Maydell wrote:
>>> On Mon, 3 Dec 2018 at 11:21, Luc Michel <luc.michel@greensocs.com> wrote:
>>>>
>>>> On 11/30/18 5:52 PM, Peter Maydell wrote:
>>>>> Luc: what are the requirements on boards using CPU cluster
>>>>> objects? I assume these are both OK:
>>>>>  * does not use cluster objects at all
>>>>>    (the gdbstub puts all the CPUs in one process?)
>>>> Yes, when no clusters are found, a process with PID 1 is created and
>>>> used for all CPUs.
>>>>>  * all CPUs created are in some CPU cluster
>>>>>    (the gdbstub uses one process per CPU cluster)
>>>>> but what about
>>>>>  * some CPUs are created in a CPU cluster, but some
>>>>>    are "loose", not in any cluster at all> ?
>>>>> Is that just invalid, or do the "loose" CPUs end up in
>>>>> a default cluster (gdbstub process), or do they get
>>>>> one cluster each?
>>>> Currently this is valid and the "loose" CPUs end up in the first
>>>> available process (which will be PID 1 if we are in your first case,
>>>> i.e. no clusters in the machine).
>>>
>>> So if there are some defined clusters 1 and 2, and some
>>> loose CPUs, the clusters get PID 1 and PID 2, and the
>>> loose CPUs end up in PID 3? That seems reasonable.
>> No sorry I was not clear, the loose CPUs would end up on PID 1 in that case.
>>
>> The current behaviour is as follows:
>> During gdb stub initialization:
>>   - A process is created per cluster.
>>   - If no cluster are found, an unique process is created with PID 1
>>
>> When trying to find a CPU PID:
>>   - If it is attached to a cluster, return the associated process,
>>   - If it is loose, return the first available process.
> 
> The behavior described by Peter would be more reasonable to me.
> Otherwise what was just an accident (getting the CPU assigned to
> the first process) will become expected behavior of the API, and
> it will be hard to change it later.
Make sense, I'll change that to adopt the behaviour described by Peter.

> 
> In either case, I'm still missing a clear description of what a
> cluster is supposed to represent, exactly (see my previous reply
> on this thread).
I'll add a description in the next re-roll, using the Peter's one as a
starting point.

Thanks.
Luc Michel Dec. 5, 2018, 2:02 p.m. UTC | #20
On 12/4/18 8:45 PM, Eduardo Habkost wrote:
> On Tue, Dec 04, 2018 at 07:16:39PM +0000, Peter Maydell wrote:
>> On Tue, 4 Dec 2018 at 19:05, Eduardo Habkost <ehabkost@redhat.com> wrote:
>>> On Tue, Dec 04, 2018 at 06:24:19PM +0000, Peter Maydell wrote:
>>>> A cluster is a group of CPUs which are all identical and have
>>>> the same view of the rest of the system.
>>
>>> With that definition in mind, why can't QEMU cluster CPUs
>>> automatically by looking at CPU models and address space objects?
>>
>> That sounds like it is in theory feasible and in practice
>> quite tricky. You would have to look not just at the CPU
>> model name but also introspect all its properties for
>> ones which change features of the CPU and are set differently
>> on different CPUs (and I don't think there's any way to
>> automatically tell which properties are ones which make
>> the CPU different for which-cluster purposes and which aren't).
>> And if we automatically checked whether address space objects
>> were the same it would rule out implementing devices with
>> per-cpu banked memory mapped registers by mapping different
>> things into the AS for each CPU (though that's a hypothetical
>> at the moment -- I've thought about implementing stuff that
>> way but we tend to implement that sort of thing by looking
>> at current_cpu->cpu_index at the moment).
> 
> I see.
> 
> Can't we at least do something to make sure the cluster objects
> make sense?  e.g. by ensuring at least QOM CPU type is the same,
> and that cpu->address_space somehow points to
> cpu->cluster->address_space?
Where such a check should be placed? Cluster realize function is not
good since children can still be added after device realization. A good
place would be when object_property_add_child() is called, but I'm not
aware of a way of hooking into that with the QOM API...
Eduardo Habkost Dec. 5, 2018, 4:47 p.m. UTC | #21
On Wed, Dec 05, 2018 at 03:02:45PM +0100, Luc Michel wrote:
> On 12/4/18 8:45 PM, Eduardo Habkost wrote:
> > On Tue, Dec 04, 2018 at 07:16:39PM +0000, Peter Maydell wrote:
> >> On Tue, 4 Dec 2018 at 19:05, Eduardo Habkost <ehabkost@redhat.com> wrote:
> >>> On Tue, Dec 04, 2018 at 06:24:19PM +0000, Peter Maydell wrote:
> >>>> A cluster is a group of CPUs which are all identical and have
> >>>> the same view of the rest of the system.
> >>
> >>> With that definition in mind, why can't QEMU cluster CPUs
> >>> automatically by looking at CPU models and address space objects?
> >>
> >> That sounds like it is in theory feasible and in practice
> >> quite tricky. You would have to look not just at the CPU
> >> model name but also introspect all its properties for
> >> ones which change features of the CPU and are set differently
> >> on different CPUs (and I don't think there's any way to
> >> automatically tell which properties are ones which make
> >> the CPU different for which-cluster purposes and which aren't).
> >> And if we automatically checked whether address space objects
> >> were the same it would rule out implementing devices with
> >> per-cpu banked memory mapped registers by mapping different
> >> things into the AS for each CPU (though that's a hypothetical
> >> at the moment -- I've thought about implementing stuff that
> >> way but we tend to implement that sort of thing by looking
> >> at current_cpu->cpu_index at the moment).
> > 
> > I see.
> > 
> > Can't we at least do something to make sure the cluster objects
> > make sense?  e.g. by ensuring at least QOM CPU type is the same,
> > and that cpu->address_space somehow points to
> > cpu->cluster->address_space?
> Where such a check should be placed? Cluster realize function is not
> good since children can still be added after device realization. A good
> place would be when object_property_add_child() is called, but I'm not
> aware of a way of hooking into that with the QOM API...

Not sure.  Maybe it can be a post-machine_init hook.  Maybe this
can be done by a separate (avocado_qemu-based?) test script.

It looks like address space info is available through HMP ("info
mtree") and not QMP, so we'd need to fix that first.  CCing HMP
and QMP maintainers.

I don't think this should block the series, but it would be a
welcome addition.
diff mbox series

Patch

diff --git a/include/hw/cpu/cluster.h b/include/hw/cpu/cluster.h
new file mode 100644
index 0000000000..11f50d5f6b
--- /dev/null
+++ b/include/hw/cpu/cluster.h
@@ -0,0 +1,38 @@ 
+/*
+ * QEMU CPU cluster
+ *
+ * Copyright (c) 2018 GreenSocs SAS
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see
+ * <http://www.gnu.org/licenses/gpl-2.0.html>
+ */
+#ifndef HW_CPU_CLUSTER_H
+#define HW_CPU_CLUSTER_H
+
+#include "qemu/osdep.h"
+#include "hw/qdev.h"
+
+#define TYPE_CPU_CLUSTER "cpu-cluster"
+#define CPU_CLUSTER(obj) \
+    OBJECT_CHECK(CPUClusterState, (obj), TYPE_CPU_CLUSTER)
+
+typedef struct CPUClusterState {
+    /*< private >*/
+    DeviceState parent_obj;
+
+    /*< public >*/
+    uint32_t cluster_id;
+} CPUClusterState;
+
+#endif
diff --git a/hw/cpu/cluster.c b/hw/cpu/cluster.c
new file mode 100644
index 0000000000..e0ffd76152
--- /dev/null
+++ b/hw/cpu/cluster.c
@@ -0,0 +1,59 @@ 
+/*
+ * QEMU CPU cluster
+ *
+ * Copyright (c) 2018 GreenSocs SAS
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see
+ * <http://www.gnu.org/licenses/gpl-2.0.html>
+ */
+
+#include "qemu/osdep.h"
+#include "hw/cpu/cluster.h"
+#include "qapi/error.h"
+#include "qemu/module.h"
+
+static void cpu_cluster_init(Object *obj)
+{
+    static uint32_t cluster_id_auto_increment;
+    CPUClusterState *cluster = CPU_CLUSTER(obj);
+
+    cluster->cluster_id = cluster_id_auto_increment++;
+}
+
+static Property cpu_cluster_properties[] = {
+    DEFINE_PROP_UINT32("cluster-id", CPUClusterState, cluster_id, 0),
+    DEFINE_PROP_END_OF_LIST()
+};
+
+static void cpu_cluster_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->props = cpu_cluster_properties;
+}
+
+static const TypeInfo cpu_cluster_type_info = {
+    .name = TYPE_CPU_CLUSTER,
+    .parent = TYPE_DEVICE,
+    .instance_size = sizeof(CPUClusterState),
+    .instance_init = cpu_cluster_init,
+    .class_init = cpu_cluster_class_init,
+};
+
+static void cpu_cluster_register_types(void)
+{
+    type_register_static(&cpu_cluster_type_info);
+}
+
+type_init(cpu_cluster_register_types)
diff --git a/MAINTAINERS b/MAINTAINERS
index 1032406c56..f7d47d2b1d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1120,11 +1120,13 @@  Machine core
 M: Eduardo Habkost <ehabkost@redhat.com>
 M: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
 S: Supported
 F: hw/core/machine.c
 F: hw/core/null-machine.c
+F: hw/cpu/cluster.c
 F: include/hw/boards.h
+F: include/hw/cpu/cluster.h
 T: git https://github.com/ehabkost/qemu.git machine-next
 
 Xtensa Machines
 ---------------
 sim
diff --git a/hw/cpu/Makefile.objs b/hw/cpu/Makefile.objs
index cd52d20b65..8db9e8a7b3 100644
--- a/hw/cpu/Makefile.objs
+++ b/hw/cpu/Makefile.objs
@@ -1,5 +1,5 @@ 
 obj-$(CONFIG_ARM11MPCORE) += arm11mpcore.o
 obj-$(CONFIG_REALVIEW) += realview_mpcore.o
 obj-$(CONFIG_A9MPCORE) += a9mpcore.o
 obj-$(CONFIG_A15MPCORE) += a15mpcore.o
-common-obj-y += core.o
+common-obj-y += core.o cluster.o