[RFC,2/2] linux-user: Fix cpu_index generation

Message ID 1468483025-1084-3-git-send-email-david@gibson.dropbear.id.au (mailing list archive)
State New, archived

Commit Message

David Gibson July 14, 2016, 7:57 a.m. UTC
With CONFIG_USER_ONLY, generation of cpu_index values is done differently
than for full system targets.  This method turns out to be broken, since
it can fairly easily result in duplicate cpu_index values for
simultaneously active cpus (i.e. threads in the emulated process).

Consider this sequence:
    Create thread 1
    Create thread 2
    Exit thread 1
    Create thread 3

With the current logic thread 1 will get cpu_index 1, thread 2 will get
cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2
threads in the cpus list at the point of its creation).

We mostly get away with this because cpu_index values aren't that important
for userspace emulation.  Still, it can't be good, so this patch fixes it
by making CONFIG_USER_ONLY use the same bitmap based allocation that full
system targets already use.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 exec.c | 19 -------------------
 1 file changed, 19 deletions(-)
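
The collision described in the commit message can be reproduced with a small stand-alone sketch. This is not QEMU code: the `demo_*` names and the fixed-size slot array are assumptions made only for illustration, standing in for the cpus list and for the count-the-list allocation that the patch removes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_THREADS 8 /* hypothetical size, for the demo only */

/* Stand-in for the cpus list: live[i] is true while slot i holds a thread. */
struct demo_cpu_list {
    bool live[DEMO_MAX_THREADS];
    int index[DEMO_MAX_THREADS];
};

/* Mimics the removed user-only logic: the new index is the number of live CPUs. */
static int demo_count_live(const struct demo_cpu_list *l)
{
    int n = 0;
    for (size_t i = 0; i < DEMO_MAX_THREADS; i++) {
        if (l->live[i]) {
            n++;
        }
    }
    return n;
}

/* Adds a thread to the list and returns the slot it occupies. */
static int demo_create(struct demo_cpu_list *l)
{
    int idx = demo_count_live(l);
    for (size_t i = 0; i < DEMO_MAX_THREADS; i++) {
        if (!l->live[i]) {
            l->live[i] = true;
            l->index[i] = idx;
            return (int)i;
        }
    }
    assert(!"demo list full");
    return -1;
}

/* Removes a thread; like the removed code, nothing is done about its index. */
static void demo_exit(struct demo_cpu_list *l, int slot)
{
    l->live[slot] = false;
}

/* Replays the sequence above; true means threads 2 and 3 share an index. */
static bool demo_sequence_collides(void)
{
    struct demo_cpu_list l = {0};
    int initial = demo_create(&l); /* initial thread, index 0 */
    int t1 = demo_create(&l);      /* index 1 */
    int t2 = demo_create(&l);      /* index 2 */
    demo_exit(&l, t1);
    int t3 = demo_create(&l);      /* two live entries remain, so index 2 again */
    (void)initial;
    return l.index[t2] == l.index[t3];
}
```

Calling `demo_sequence_collides()` returns true, matching the "thread 3 will also get cpu_index 2" case.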

Comments

Peter Maydell July 14, 2016, 9:54 a.m. UTC | #1
On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote:
> With CONFIG_USER_ONLY, generation of cpu_index values is done differently
> than for full system targets.  This method turns out to be broken, since
> it can fairly easily result in duplicate cpu_index values for
> simultaneously active cpus (i.e. threads in the emulated process).
>
> Consider this sequence:
>     Create thread 1
>     Create thread 2
>     Exit thread 1
>     Create thread 3
>
> With the current logic thread 1 will get cpu_index 1, thread 2 will get
> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2
> threads in the cpus list at the point of its creation).
>
> We mostly get away with this because cpu_index values aren't that important
> for userspace emulation.  Still, it can't be good, so this patch fixes it
> by making CONFIG_USER_ONLY use the same bitmap based allocation that full
> system targets already use.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  exec.c | 19 -------------------
>  1 file changed, 19 deletions(-)
>
> diff --git a/exec.c b/exec.c
> index 011babd..e410dab 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
>  }
>  #endif
>
> -#ifndef CONFIG_USER_ONLY
>  static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
>
>  static int cpu_get_free_index(Error **errp)
> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
>  {
>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
>  }
> -#else
> -
> -static int cpu_get_free_index(Error **errp)
> -{
> -    CPUState *some_cpu;
> -    int cpu_index = 0;
> -
> -    CPU_FOREACH(some_cpu) {
> -        cpu_index++;
> -    }
> -    return cpu_index;
> -}
> -
> -static void cpu_release_index(CPUState *cpu)
> -{
> -    return;
> -}
> -#endif

Won't this change impose a maximum limit of 256 simultaneous
threads? That seems a little low for comfort.

thanks
-- PMM
Bharata B Rao July 14, 2016, 10:20 a.m. UTC | #2
On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote:
>> With CONFIG_USER_ONLY, generation of cpu_index values is done differently
>> than for full system targets.  This method turns out to be broken, since
>> it can fairly easily result in duplicate cpu_index values for
>> simultaneously active cpus (i.e. threads in the emulated process).
>>
>> Consider this sequence:
>>     Create thread 1
>>     Create thread 2
>>     Exit thread 1
>>     Create thread 3
>>
>> With the current logic thread 1 will get cpu_index 1, thread 2 will get
>> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2
>> threads in the cpus list at the point of its creation).
>>
>> We mostly get away with this because cpu_index values aren't that important
>> for userspace emulation.  Still, it can't be good, so this patch fixes it
>> by making CONFIG_USER_ONLY use the same bitmap based allocation that full
>> system targets already use.
>>
>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>> ---
>>  exec.c | 19 -------------------
>>  1 file changed, 19 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index 011babd..e410dab 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
>>  }
>>  #endif
>>
>> -#ifndef CONFIG_USER_ONLY
>>  static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
>>
>>  static int cpu_get_free_index(Error **errp)
>> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
>>  {
>>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
>>  }
>> -#else
>> -
>> -static int cpu_get_free_index(Error **errp)
>> -{
>> -    CPUState *some_cpu;
>> -    int cpu_index = 0;
>> -
>> -    CPU_FOREACH(some_cpu) {
>> -        cpu_index++;
>> -    }
>> -    return cpu_index;
>> -}
>> -
>> -static void cpu_release_index(CPUState *cpu)
>> -{
>> -    return;
>> -}
>> -#endif
>
> Won't this change impose a maximum limit of 256 simultaneous
> threads? That seems a little low for comfort.

This was the reason why the bitmap logic wasn't applied to
CONFIG_USER_ONLY when it was introduced.

https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html

But then we didn't have actual removal, but we do now.

Regards,
Bharata.
David Gibson July 14, 2016, 11:59 a.m. UTC | #3
On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote:
> On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote:
> >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently
> >> than for full system targets.  This method turns out to be broken, since
> >> it can fairly easily result in duplicate cpu_index values for
> >> simultaneously active cpus (i.e. threads in the emulated process).
> >>
> >> Consider this sequence:
> >>     Create thread 1
> >>     Create thread 2
> >>     Exit thread 1
> >>     Create thread 3
> >>
> >> With the current logic thread 1 will get cpu_index 1, thread 2 will get
> >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2
> >> threads in the cpus list at the point of its creation).
> >>
> >> We mostly get away with this because cpu_index values aren't that important
> >> for userspace emulation.  Still, it can't be good, so this patch fixes it
> >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full
> >> system targets already use.
> >>
> >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >> ---
> >>  exec.c | 19 -------------------
> >>  1 file changed, 19 deletions(-)
> >>
> >> diff --git a/exec.c b/exec.c
> >> index 011babd..e410dab 100644
> >> --- a/exec.c
> >> +++ b/exec.c
> >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
> >>  }
> >>  #endif
> >>
> >> -#ifndef CONFIG_USER_ONLY
> >>  static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> >>
> >>  static int cpu_get_free_index(Error **errp)
> >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
> >>  {
> >>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> >>  }
> >> -#else
> >> -
> >> -static int cpu_get_free_index(Error **errp)
> >> -{
> >> -    CPUState *some_cpu;
> >> -    int cpu_index = 0;
> >> -
> >> -    CPU_FOREACH(some_cpu) {
> >> -        cpu_index++;
> >> -    }
> >> -    return cpu_index;
> >> -}
> >> -
> >> -static void cpu_release_index(CPUState *cpu)
> >> -{
> >> -    return;
> >> -}
> >> -#endif
> >
> > Won't this change impose a maximum limit of 256 simultaneous
> > threads? That seems a little low for comfort.
> 
> This was the reason why the bitmap logic wasn't applied to
> CONFIG_USER_ONLY when it was introduced.
> 
> https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html

Ah.. good point.

Hrm, ok, my next idea would be to just (globally) sequentially
allocate cpu_index values for CONFIG_USER, and never try to re-use
them.  Does that seem reasonable?
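
A minimal sketch of that idea, assuming a single process-wide counter. The `demo_*` names are hypothetical and the use of C11 atomics is an assumption made for the example, not something taken from the QEMU tree.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical global counter: incremented on every allocation, never reused. */
static atomic_uint demo_next_cpu_index;

/* Returns the next index; atomic so concurrent thread creation stays unique. */
static unsigned demo_alloc_sequential(void)
{
    return atomic_fetch_add(&demo_next_cpu_index, 1u);
}

/* Replays create/create/exit/create; an exit releases nothing. */
static bool demo_sequence_is_unique(void)
{
    unsigned t1 = demo_alloc_sequential();
    unsigned t2 = demo_alloc_sequential();
    /* thread 1 exits here: its index is simply abandoned */
    unsigned t3 = demo_alloc_sequential();
    return t1 != t2 && t2 != t3 && t1 != t3;
}
```

With this scheme, duplicates can only appear once the counter wraps around.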

> But then we didn't have actual removal, but we do now.

You mean patch 1/2 in this set?  Or something else?

Even so, 256 does seem a bit low for a number of simultaneously active
threads - there are some big hairy multi-threaded programs out there.
Greg Kurz July 15, 2016, 10:11 p.m. UTC | #4
On Thu, 14 Jul 2016 21:59:45 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote:
> > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote:  
> > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote:  
> > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently
> > >> than for full system targets.  This method turns out to be broken, since
> > >> it can fairly easily result in duplicate cpu_index values for
> > >> simultaneously active cpus (i.e. threads in the emulated process).
> > >>
> > >> Consider this sequence:
> > >>     Create thread 1
> > >>     Create thread 2
> > >>     Exit thread 1
> > >>     Create thread 3
> > >>
> > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get
> > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2
> > >> threads in the cpus list at the point of its creation).
> > >>
> > >> We mostly get away with this because cpu_index values aren't that important
> > >> for userspace emulation.  Still, it can't be good, so this patch fixes it
> > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full
> > >> system targets already use.
> > >>
> > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > >> ---
> > >>  exec.c | 19 -------------------
> > >>  1 file changed, 19 deletions(-)
> > >>
> > >> diff --git a/exec.c b/exec.c
> > >> index 011babd..e410dab 100644
> > >> --- a/exec.c
> > >> +++ b/exec.c
> > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
> > >>  }
> > >>  #endif
> > >>
> > >> -#ifndef CONFIG_USER_ONLY
> > >>  static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> > >>
> > >>  static int cpu_get_free_index(Error **errp)
> > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
> > >>  {
> > >>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> > >>  }
> > >> -#else
> > >> -
> > >> -static int cpu_get_free_index(Error **errp)
> > >> -{
> > >> -    CPUState *some_cpu;
> > >> -    int cpu_index = 0;
> > >> -
> > >> -    CPU_FOREACH(some_cpu) {
> > >> -        cpu_index++;
> > >> -    }
> > >> -    return cpu_index;
> > >> -}
> > >> -
> > >> -static void cpu_release_index(CPUState *cpu)
> > >> -{
> > >> -    return;
> > >> -}
> > >> -#endif  
> > >
> > > Won't this change impose a maximum limit of 256 simultaneous
> > > threads? That seems a little low for comfort.  
> > 
> > This was the reason why the bitmap logic wasn't applied to
> > CONFIG_USER_ONLY when it was introduced.
> > 
> > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html  
> 
> Ah.. good point.
> 
> Hrm, ok, my next idea would be to just (globally) sequentially
> allocate cpu_index values for CONFIG_USER, and never try to re-use
> them.  Does that seem reasonable?
> 

Isn't it only deferring the problem to later ?

Maybe it is possible to define MAX_CPUMASK_BITS to a much higher
value for CONFIG_USER only instead ?

> > But then we didn't have actual removal, but we do now.  
> 
> You mean patch 1/2 in this set?  Or something else?
> 
> Even so, 256 does seem a bit low for a number of simultaneously active
> threads - there are some big hairy multi-threaded programs out there.
>
David Gibson July 18, 2016, 1:17 a.m. UTC | #5
On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote:
> On Thu, 14 Jul 2016 21:59:45 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote:
> > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote:  
> > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote:  
> > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently
> > > >> than for full system targets.  This method turns out to be broken, since
> > > >> it can fairly easily result in duplicate cpu_index values for
> > > >> simultaneously active cpus (i.e. threads in the emulated process).
> > > >>
> > > >> Consider this sequence:
> > > >>     Create thread 1
> > > >>     Create thread 2
> > > >>     Exit thread 1
> > > >>     Create thread 3
> > > >>
> > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get
> > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2
> > > >> threads in the cpus list at the point of its creation).
> > > >>
> > > >> We mostly get away with this because cpu_index values aren't that important
> > > >> for userspace emulation.  Still, it can't be good, so this patch fixes it
> > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full
> > > >> system targets already use.
> > > >>
> > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > >> ---
> > > >>  exec.c | 19 -------------------
> > > >>  1 file changed, 19 deletions(-)
> > > >>
> > > >> diff --git a/exec.c b/exec.c
> > > >> index 011babd..e410dab 100644
> > > >> --- a/exec.c
> > > >> +++ b/exec.c
> > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
> > > >>  }
> > > >>  #endif
> > > >>
> > > >> -#ifndef CONFIG_USER_ONLY
> > > >>  static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> > > >>
> > > >>  static int cpu_get_free_index(Error **errp)
> > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
> > > >>  {
> > > >>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> > > >>  }
> > > >> -#else
> > > >> -
> > > >> -static int cpu_get_free_index(Error **errp)
> > > >> -{
> > > >> -    CPUState *some_cpu;
> > > >> -    int cpu_index = 0;
> > > >> -
> > > >> -    CPU_FOREACH(some_cpu) {
> > > >> -        cpu_index++;
> > > >> -    }
> > > >> -    return cpu_index;
> > > >> -}
> > > >> -
> > > >> -static void cpu_release_index(CPUState *cpu)
> > > >> -{
> > > >> -    return;
> > > >> -}
> > > >> -#endif  
> > > >
> > > > Won't this change impose a maximum limit of 256 simultaneous
> > > > threads? That seems a little low for comfort.  
> > > 
> > > This was the reason why the bitmap logic wasn't applied to
> > > CONFIG_USER_ONLY when it was introduced.
> > > 
> > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html  
> > 
> > Ah.. good point.
> > 
> > Hrm, ok, my next idea would be to just (globally) sequentially
> > allocate cpu_index values for CONFIG_USER, and never try to re-use
> > them.  Does that seem reasonable?
> > 
> 
> Isn't it only deferring the problem to later ?

You mean that we could get duplicate indexes after the value wraps
around?

I suppose, but duplicates after spawning 4 billion threads seems like
a substantial improvement over duplicates after spawning 3 in the
wrong order..

> Maybe it is possible to define MAX_CPUMASK_BITS to a much higher
> value for CONFIG_USER only instead ?

Perhaps.  It does mean carrying around a huge bitmap, though.
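
To put a rough number on the bitmap cost, here is a sketch of a lowest-free-bit allocator with a configurable size. It does not use QEMU's bitmap helpers; `DEMO_MAX_INDEXES` and the `demo_*` functions are hypothetical names chosen for the example.

```c
#include <limits.h>
#include <stddef.h>

#define DEMO_MAX_INDEXES 1024 /* hypothetical limit, for the demo only */
#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS \
    ((DEMO_MAX_INDEXES + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD)

static unsigned long demo_index_map[DEMO_WORDS];

/* Returns the lowest free index and marks it used, or -1 when exhausted. */
static int demo_bitmap_get(void)
{
    for (size_t i = 0; i < DEMO_MAX_INDEXES; i++) {
        unsigned long mask = 1UL << (i % DEMO_BITS_PER_WORD);
        if (!(demo_index_map[i / DEMO_BITS_PER_WORD] & mask)) {
            demo_index_map[i / DEMO_BITS_PER_WORD] |= mask;
            return (int)i;
        }
    }
    return -1;
}

/* Marks an index as free again; idx must come from demo_bitmap_get(). */
static void demo_bitmap_release(int idx)
{
    size_t i = (size_t)idx;
    demo_index_map[i / DEMO_BITS_PER_WORD] &= ~(1UL << (i % DEMO_BITS_PER_WORD));
}

/* Storage needed for a bitmap of nbits, rounded up to whole bytes. */
static size_t demo_bitmap_bytes(size_t nbits)
{
    return (nbits + CHAR_BIT - 1) / CHAR_BIT;
}
```

By this arithmetic a 256-bit map needs 32 bytes and a map for about a million indexes needs 128 KiB, while covering a full 32-bit index space would take 512 MiB.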

Another option is to remove cpu_index entirely for the user only
case.  I have some patches for this, which are very ugly but it's
possible they can be cleaned up to something reasonable (the biggest
chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY
for what I think are registers that aren't accessible in user mode).


> > > But then we didn't have actual removal, but we do now.  
> > 
> > You mean patch 1/2 in this set?  Or something else?
> > 
> > Even so, 256 does seem a bit low for a number of simultaneously active
> > threads - there are some big hairy multi-threaded programs out there.
> > 
>
Igor Mammedov July 18, 2016, 7:25 a.m. UTC | #6
On Mon, 18 Jul 2016 11:17:25 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote:
> > On Thu, 14 Jul 2016 21:59:45 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote:  
> > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote:    
> > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote:    
> > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently
> > > > >> than for full system targets.  This method turns out to be broken, since
> > > > >> it can fairly easily result in duplicate cpu_index values for
> > > > >> simultaneously active cpus (i.e. threads in the emulated process).
> > > > >>
> > > > >> Consider this sequence:
> > > > >>     Create thread 1
> > > > >>     Create thread 2
> > > > >>     Exit thread 1
> > > > >>     Create thread 3
> > > > >>
> > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get
> > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2
> > > > >> threads in the cpus list at the point of its creation).
> > > > >>
> > > > >> We mostly get away with this because cpu_index values aren't that important
> > > > >> for userspace emulation.  Still, it can't be good, so this patch fixes it
> > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full
> > > > >> system targets already use.
> > > > >>
> > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > >> ---
> > > > >>  exec.c | 19 -------------------
> > > > >>  1 file changed, 19 deletions(-)
> > > > >>
> > > > >> diff --git a/exec.c b/exec.c
> > > > >> index 011babd..e410dab 100644
> > > > >> --- a/exec.c
> > > > >> +++ b/exec.c
> > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
> > > > >>  }
> > > > >>  #endif
> > > > >>
> > > > >> -#ifndef CONFIG_USER_ONLY
> > > > >>  static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> > > > >>
> > > > >>  static int cpu_get_free_index(Error **errp)
> > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
> > > > >>  {
> > > > >>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> > > > >>  }
> > > > >> -#else
> > > > >> -
> > > > >> -static int cpu_get_free_index(Error **errp)
> > > > >> -{
> > > > >> -    CPUState *some_cpu;
> > > > >> -    int cpu_index = 0;
> > > > >> -
> > > > >> -    CPU_FOREACH(some_cpu) {
> > > > >> -        cpu_index++;
> > > > >> -    }
> > > > >> -    return cpu_index;
> > > > >> -}
> > > > >> -
> > > > >> -static void cpu_release_index(CPUState *cpu)
> > > > >> -{
> > > > >> -    return;
> > > > >> -}
> > > > >> -#endif    
> > > > >
> > > > > Won't this change impose a maximum limit of 256 simultaneous
> > > > > threads? That seems a little low for comfort.    
> > > > 
> > > > This was the reason why the bitmap logic wasn't applied to
> > > > CONFIG_USER_ONLY when it was introduced.
> > > > 
> > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html    
> > > 
> > > Ah.. good point.
> > > 
> > > Hrm, ok, my next idea would be to just (globally) sequentially
> > > allocate cpu_index values for CONFIG_USER, and never try to re-use
> > > them.  Does that seem reasonable?
> > >   
> > 
> > Isn't it only deferring the problem to later ?  
> 
> You mean that we could get duplicate indexes after the value wraps
> around?
> 
> I suppose, but duplicates after spawning 4 billion threads seems like
> a substantial improvement over duplicates after spawning 3 in the
> wrong order..
> 
> > Maybe it is possible to define MAX_CPUMASK_BITS to a much higher
> > value for CONFIG_USER only instead ?  
> 
> Perhaps.  It does mean carrying around a huge bitmap, though.
> 
> Another option is to remove cpu_index entirely for the user only
> case.  I have some patches for this, which are very ugly but it's
> possible they can be cleaned up to something reasonable (the biggest
> chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY
> for what I think are registers that aren't accessible in user mode).

Could we remove cpu_index altogether for both *-user and *-softmmu targets?

> 
> 
> > > > But then we didn't have actual removal, but we do now.    
> > > 
> > > You mean patch 1/2 in this set?  Or something else?
> > > 
> > > Even so, 256 does seem a bit low for a number of simultaneously active
> > > threads - there are some big hairy multi-threaded programs out there.
> > >   
> >   
> 
> 
>
Greg Kurz July 18, 2016, 8:52 a.m. UTC | #7
On Mon, 18 Jul 2016 11:17:25 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote:
> > On Thu, 14 Jul 2016 21:59:45 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote:  
> > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote:    
> > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote:    
> > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently
> > > > >> than for full system targets.  This method turns out to be broken, since
> > > > >> it can fairly easily result in duplicate cpu_index values for
> > > > >> simultaneously active cpus (i.e. threads in the emulated process).
> > > > >>
> > > > >> Consider this sequence:
> > > > >>     Create thread 1
> > > > >>     Create thread 2
> > > > >>     Exit thread 1
> > > > >>     Create thread 3
> > > > >>
> > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get
> > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2
> > > > >> threads in the cpus list at the point of its creation).
> > > > >>
> > > > >> We mostly get away with this because cpu_index values aren't that important
> > > > >> for userspace emulation.  Still, it can't be good, so this patch fixes it
> > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full
> > > > >> system targets already use.
> > > > >>
> > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > >> ---
> > > > >>  exec.c | 19 -------------------
> > > > >>  1 file changed, 19 deletions(-)
> > > > >>
> > > > >> diff --git a/exec.c b/exec.c
> > > > >> index 011babd..e410dab 100644
> > > > >> --- a/exec.c
> > > > >> +++ b/exec.c
> > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
> > > > >>  }
> > > > >>  #endif
> > > > >>
> > > > >> -#ifndef CONFIG_USER_ONLY
> > > > >>  static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> > > > >>
> > > > >>  static int cpu_get_free_index(Error **errp)
> > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
> > > > >>  {
> > > > >>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> > > > >>  }
> > > > >> -#else
> > > > >> -
> > > > >> -static int cpu_get_free_index(Error **errp)
> > > > >> -{
> > > > >> -    CPUState *some_cpu;
> > > > >> -    int cpu_index = 0;
> > > > >> -
> > > > >> -    CPU_FOREACH(some_cpu) {
> > > > >> -        cpu_index++;
> > > > >> -    }
> > > > >> -    return cpu_index;
> > > > >> -}
> > > > >> -
> > > > >> -static void cpu_release_index(CPUState *cpu)
> > > > >> -{
> > > > >> -    return;
> > > > >> -}
> > > > >> -#endif    
> > > > >
> > > > > Won't this change impose a maximum limit of 256 simultaneous
> > > > > threads? That seems a little low for comfort.    
> > > > 
> > > > This was the reason why the bitmap logic wasn't applied to
> > > > CONFIG_USER_ONLY when it was introduced.
> > > > 
> > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html    
> > > 
> > > Ah.. good point.
> > > 
> > > Hrm, ok, my next idea would be to just (globally) sequentially
> > > allocate cpu_index values for CONFIG_USER, and never try to re-use
> > > them.  Does that seem reasonable?
> > >   
> > 
> > Isn't it only deferring the problem to later ?  
> 
> You mean that we could get duplicate indexes after the value wraps
> around?
> 

Yes.

> I suppose, but duplicates after spawning 4 billion threads seems like
> a substantial improvement over duplicates after spawning 3 in the
> wrong order..
> 

Agreed.

It takes ~20 seconds for user-mode QEMU to spawn 10000 threads on my palmetto
box, so the wraparound would occur after ~100 days. :)
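
The estimate can be checked with a short calculation. This sketch assumes an unsigned 32-bit counter (the "4 billion" figure quoted above) and a constant spawn rate; `demo_days_until_wrap` is a hypothetical helper, not part of QEMU.

```c
/* Days until a 32-bit index counter wraps at a given thread spawn rate. */
static double demo_days_until_wrap(double threads, double seconds)
{
    const double index_space = 4294967296.0; /* 2^32, assumed counter range */
    const double seconds_per_day = 86400.0;
    double rate = threads / seconds;         /* threads spawned per second */
    return index_space / rate / seconds_per_day;
}
```

With 10000 threads per 20 seconds the rate is 500 threads per second, and `demo_days_until_wrap(10000.0, 20.0)` comes out at a little over 99 days.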

> > Maybe it is possible to define MAX_CPUMASK_BITS to a much higher
> > value for CONFIG_USER only instead ?  
> 
> Perhaps.  It does mean carrying around a huge bitmap, though.
> 
> Another option is to remove cpu_index entirely for the user only
> case.  I have some patches for this, which are very ugly but it's
> possible they can be cleaned up to something reasonable (the biggest
> chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY
> for what I think are registers that aren't accessible in user mode).
> 
> 
> > > > But then we didn't have actual removal, but we do now.    
> > > 
> > > You mean patch 1/2 in this set?  Or something else?
> > > 
> > > Even so, 256 does seem a bit low for a number of simultaneously active
> > > threads - there are some big hairy multi-threaded programs out there.
> > >   
> >   
> 
> 
>
David Gibson July 18, 2016, 9:50 a.m. UTC | #8
On Mon, Jul 18, 2016 at 10:52:39AM +0200, Greg Kurz wrote:
> On Mon, 18 Jul 2016 11:17:25 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote:
> > > On Thu, 14 Jul 2016 21:59:45 +1000
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >   
> > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote:  
> > > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote:    
> > > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote:    
> > > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently
> > > > > >> than for full system targets.  This method turns out to be broken, since
> > > > > >> it can fairly easily result in duplicate cpu_index values for
> > > > > >> simultaneously active cpus (i.e. threads in the emulated process).
> > > > > >>
> > > > > >> Consider this sequence:
> > > > > >>     Create thread 1
> > > > > >>     Create thread 2
> > > > > >>     Exit thread 1
> > > > > >>     Create thread 3
> > > > > >>
> > > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get
> > > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2
> > > > > >> threads in the cpus list at the point of its creation).
> > > > > >>
> > > > > >> We mostly get away with this because cpu_index values aren't that important
> > > > > >> for userspace emulation.  Still, it can't be good, so this patch fixes it
> > > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full
> > > > > >> system targets already use.
> > > > > >>
> > > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > >> ---
> > > > > >>  exec.c | 19 -------------------
> > > > > >>  1 file changed, 19 deletions(-)
> > > > > >>
> > > > > >> diff --git a/exec.c b/exec.c
> > > > > >> index 011babd..e410dab 100644
> > > > > >> --- a/exec.c
> > > > > >> +++ b/exec.c
> > > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
> > > > > >>  }
> > > > > >>  #endif
> > > > > >>
> > > > > >> -#ifndef CONFIG_USER_ONLY
> > > > > >>  static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> > > > > >>
> > > > > >>  static int cpu_get_free_index(Error **errp)
> > > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
> > > > > >>  {
> > > > > >>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> > > > > >>  }
> > > > > >> -#else
> > > > > >> -
> > > > > >> -static int cpu_get_free_index(Error **errp)
> > > > > >> -{
> > > > > >> -    CPUState *some_cpu;
> > > > > >> -    int cpu_index = 0;
> > > > > >> -
> > > > > >> -    CPU_FOREACH(some_cpu) {
> > > > > >> -        cpu_index++;
> > > > > >> -    }
> > > > > >> -    return cpu_index;
> > > > > >> -}
> > > > > >> -
> > > > > >> -static void cpu_release_index(CPUState *cpu)
> > > > > >> -{
> > > > > >> -    return;
> > > > > >> -}
> > > > > >> -#endif    
> > > > > >
> > > > > > Won't this change impose a maximum limit of 256 simultaneous
> > > > > > threads? That seems a little low for comfort.    
> > > > > 
> > > > > This was the reason why the bitmap logic wasn't applied to
> > > > > CONFIG_USER_ONLY when it was introduced.
> > > > > 
> > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html    
> > > > 
> > > > Ah.. good point.
> > > > 
> > > > Hrm, ok, my next idea would be to just (globally) sequentially
> > > > allocate cpu_index values for CONFIG_USER, and never try to re-use
> > > > them.  Does that seem reasonable?
> > > >   
> > > 
> > > Isn't it only deferring the problem to later ?  
> > 
> > You mean that we could get duplicate indexes after the value wraps
> > around?
> > 
> 
> Yes.
> 
> > I suppose, but duplicates after spawning 4 billion threads seems like
> > a substantial improvement over duplicates after spawning 3 in the
> > wrong order..
> > 
> 
> Agreed.
> 
> It takes ~20 seconds for user-mode QEMU to spawn 10000 threads on my palmetto
> box, so the wraparound would occur after ~100 days. :)

Yeah, I think we can live with that.  Especially since the fact this
hasn't come up before kind of indicates the duplication isn't that
fatal anyway.
David Gibson July 18, 2016, 9:58 a.m. UTC | #9
On Mon, Jul 18, 2016 at 09:25:58AM +0200, Igor Mammedov wrote:
> On Mon, 18 Jul 2016 11:17:25 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote:
> > > On Thu, 14 Jul 2016 21:59:45 +1000
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >   
> > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote:  
> > > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote:    
> > > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote:    
> > > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently
> > > > > >> than for full system targets.  This method turns out to be broken, since
> > > > > >> it can fairly easily result in duplicate cpu_index values for
> > > > > >> simultaneously active cpus (i.e. threads in the emulated process).
> > > > > >>
> > > > > >> Consider this sequence:
> > > > > >>     Create thread 1
> > > > > >>     Create thread 2
> > > > > >>     Exit thread 1
> > > > > >>     Create thread 3
> > > > > >>
> > > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get
> > > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2
> > > > > >> threads in the cpus list at the point of its creation).
> > > > > >>
> > > > > >> We mostly get away with this because cpu_index values aren't that important
> > > > > >> for userspace emulation.  Still, it can't be good, so this patch fixes it
> > > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full
> > > > > >> system targets already use.
> > > > > >>
> > > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > >> ---
> > > > > >>  exec.c | 19 -------------------
> > > > > >>  1 file changed, 19 deletions(-)
> > > > > >>
> > > > > >> diff --git a/exec.c b/exec.c
> > > > > >> index 011babd..e410dab 100644
> > > > > >> --- a/exec.c
> > > > > >> +++ b/exec.c
> > > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
> > > > > >>  }
> > > > > >>  #endif
> > > > > >>
> > > > > >> -#ifndef CONFIG_USER_ONLY
> > > > > >>  static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> > > > > >>
> > > > > >>  static int cpu_get_free_index(Error **errp)
> > > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
> > > > > >>  {
> > > > > >>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> > > > > >>  }
> > > > > >> -#else
> > > > > >> -
> > > > > >> -static int cpu_get_free_index(Error **errp)
> > > > > >> -{
> > > > > >> -    CPUState *some_cpu;
> > > > > >> -    int cpu_index = 0;
> > > > > >> -
> > > > > >> -    CPU_FOREACH(some_cpu) {
> > > > > >> -        cpu_index++;
> > > > > >> -    }
> > > > > >> -    return cpu_index;
> > > > > >> -}
> > > > > >> -
> > > > > >> -static void cpu_release_index(CPUState *cpu)
> > > > > >> -{
> > > > > >> -    return;
> > > > > >> -}
> > > > > >> -#endif    
> > > > > >
> > > > > > Won't this change impose a maximum limit of 256 simultaneous
> > > > > > threads? That seems a little low for comfort.    
> > > > > 
> > > > > This was the reason why the bitmap logic wasn't applied to
> > > > > CONFIG_USER_ONLY when it was introduced.
> > > > > 
> > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html    
> > > > 
> > > > Ah.. good point.
> > > > 
> > > > Hrm, ok, my next idea would be to just (globally) sequentially
> > > > allocate cpu_index values for CONFIG_USER, and never try to re-use
> > > > them.  Does that seem reasonable?
> > > >   
> > > 
> > > Isn't it only deferring the problem to later ?  
> > 
> > You mean that we could get duplicate indexes after the value wraps
> > around?
> > 
> > I suppose, but duplicates after spawning 4 billion threads seems like
> > a substantial improvement over duplicates after spawning 3 in the
> > wrong order..
> > 
> > > Maybe it is possible to define MAX_CPUMASK_BITS to a much higher
> > > value for CONFIG_USER only instead ?  
> > 
> > Perhaps.  It does mean carrying around a huge bitmap, though.
> > 
> > Another option is to remove cpu_index entirely for the user only
> > case.  I have some patches for this, which are very ugly but it's
> > possible they can be cleaned up to something reasonable (the biggest
> > chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY
> > for what I think are registers that aren't accessible in user mode).
> Could we remove cpu_index altogether for both *-user and *-softmmu targets?

Well.. not in the same way I'm looking at removing it for *-user, at
any rate.  From looking through all the users of cpu_index, nearly all
of them are in two categories:

    1) Labelling debug or error messages with a CPU #

There's nothing obvious to replace this with for *-softmmu.  For
*-user, however, we can use the host tid, which is probably more
useful than an essentially arbitrary cpu index.

    2) Initializing cpu specific registers

That's "cpu specific" in both the sense of ISA specific and in the
sense of specific to a particular CPU in an SMP system.  These
registers are generally privileged and so don't need to be emulated
for *-user.  Finding a substitute for *-softmmu is rather harder.
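
For illustration only (this is not code from the patch or from QEMU): a
small standalone model of the create/exit sequence in the commit message.
It compares the count-based scheme that the patch removes with the
"allocate sequentially and never re-use" idea suggested earlier in the
thread.  The struct and every function name below are hypothetical, and
the model tracks only the length of the cpus list, not real CPUState
objects.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model, not QEMU code: holds only what the two schemes need. */
struct index_model {
    int live_cpus;        /* entries currently in the cpus list */
    unsigned next_index;  /* next value for the never-reuse scheme */
};

/* Count-based scheme (like the removed CONFIG_USER_ONLY code):
 * the new index is the number of cpus already in the list. */
static unsigned count_based_get(struct index_model *m)
{
    return (unsigned)m->live_cpus++;
}

/* Sequential scheme: hand out increasing values and never reuse them. */
static unsigned sequential_get(struct index_model *m)
{
    m->live_cpus++;
    return m->next_index++;
}

/* A thread exit only shrinks the list; neither scheme recycles an index. */
static void model_exit(struct index_model *m)
{
    assert(m->live_cpus > 0);
    m->live_cpus--;
}

/* Replays: main thread, create 1, create 2, exit 1, create 3.
 * Returns 1 if threads 2 and 3 end up sharing an index, else 0. */
static int replay_has_duplicate(int sequential)
{
    struct index_model m = { 0, 0 };
    unsigned idx[4];
    int i;

    for (i = 0; i < 3; i++) {           /* main thread, thread 1, thread 2 */
        idx[i] = sequential ? sequential_get(&m) : count_based_get(&m);
    }
    model_exit(&m);                     /* thread 1 exits */
    idx[3] = sequential ? sequential_get(&m) : count_based_get(&m);

    printf("%s: thread 2 -> %u, thread 3 -> %u\n",
           sequential ? "sequential" : "count-based", idx[2], idx[3]);
    return idx[2] == idx[3];
}
```

Calling replay_has_duplicate(0) reproduces the collision described in the
commit message, while replay_has_duplicate(1) does not collide for this
sequence.  As noted above, the sequential scheme can still produce
duplicates once the counter wraps around.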

Patch

diff --git a/exec.c b/exec.c
index 011babd..e410dab 100644
--- a/exec.c
+++ b/exec.c
@@ -596,7 +596,6 @@  AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
 }
 #endif
 
-#ifndef CONFIG_USER_ONLY
 static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
 
 static int cpu_get_free_index(Error **errp)
@@ -617,24 +616,6 @@  static void cpu_release_index(CPUState *cpu)
 {
     bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
 }
-#else
-
-static int cpu_get_free_index(Error **errp)
-{
-    CPUState *some_cpu;
-    int cpu_index = 0;
-
-    CPU_FOREACH(some_cpu) {
-        cpu_index++;
-    }
-    return cpu_index;
-}
-
-static void cpu_release_index(CPUState *cpu)
-{
-    return;
-}
-#endif
 
 void cpu_exec_exit(CPUState *cpu)
 {