Message ID | 1468483025-1084-3-git-send-email-david@gibson.dropbear.id.au (mailing list archive) |
---|---|
State | New, archived |
On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > With CONFIG_USER_ONLY, generation of cpu_index values is done differently > than for full system targets. This method turns out to be broken, since > it can fairly easily result in duplicate cpu_index values for > simultaneously active cpus (i.e. threads in the emulated process). > > Consider this sequence: > Create thread 1 > Create thread 2 > Exit thread 1 > Create thread 3 > > With the current logic thread 1 will get cpu_index 1, thread 2 will get > cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > threads in the cpus list at the point of its creation). > > We mostly get away with this because cpu_index values aren't that important > for userspace emulation. Still, it can't be good, so this patch fixes it > by making CONFIG_USER_ONLY use the same bitmap based allocation that full > system targets already use. > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > --- > exec.c | 19 ------------------- > 1 file changed, 19 deletions(-) > > diff --git a/exec.c b/exec.c > index 011babd..e410dab 100644 > --- a/exec.c > +++ b/exec.c > @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > } > #endif > > -#ifndef CONFIG_USER_ONLY > static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > static int cpu_get_free_index(Error **errp) > @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > { > bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > } > -#else > - > -static int cpu_get_free_index(Error **errp) > -{ > - CPUState *some_cpu; > - int cpu_index = 0; > - > - CPU_FOREACH(some_cpu) { > - cpu_index++; > - } > - return cpu_index; > -} > - > -static void cpu_release_index(CPUState *cpu) > -{ > - return; > -} > -#endif Won't this change impose a maximum limit of 256 simultaneous threads? That seems a little low for comfort. thanks -- PMM
On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently >> than for full system targets. This method turns out to be broken, since >> it can fairly easily result in duplicate cpu_index values for >> simultaneously active cpus (i.e. threads in the emulated process). >> >> Consider this sequence: >> Create thread 1 >> Create thread 2 >> Exit thread 1 >> Create thread 3 >> >> With the current logic thread 1 will get cpu_index 1, thread 2 will get >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 >> threads in the cpus list at the point of its creation). >> >> We mostly get away with this because cpu_index values aren't that important >> for userspace emulation. Still, it can't be good, so this patch fixes it >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full >> system targets already use. 
>> >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> >> --- >> exec.c | 19 ------------------- >> 1 file changed, 19 deletions(-) >> >> diff --git a/exec.c b/exec.c >> index 011babd..e410dab 100644 >> --- a/exec.c >> +++ b/exec.c >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) >> } >> #endif >> >> -#ifndef CONFIG_USER_ONLY >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); >> >> static int cpu_get_free_index(Error **errp) >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) >> { >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); >> } >> -#else >> - >> -static int cpu_get_free_index(Error **errp) >> -{ >> - CPUState *some_cpu; >> - int cpu_index = 0; >> - >> - CPU_FOREACH(some_cpu) { >> - cpu_index++; >> - } >> - return cpu_index; >> -} >> - >> -static void cpu_release_index(CPUState *cpu) >> -{ >> - return; >> -} >> -#endif > > Won't this change impose a maximum limit of 256 simultaneous > threads? That seems a little low for comfort. This was the reason why the bitmap logic wasn't applied to CONFIG_USER_ONLY when it was introduced. https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html But then we didn't have actual removal, but we do now. Regards, Bharata.
On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > >> than for full system targets. This method turns out to be broken, since > >> it can fairly easily result in duplicate cpu_index values for > >> simultaneously active cpus (i.e. threads in the emulated process). > >> > >> Consider this sequence: > >> Create thread 1 > >> Create thread 2 > >> Exit thread 1 > >> Create thread 3 > >> > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > >> threads in the cpus list at the point of its creation). > >> > >> We mostly get away with this because cpu_index values aren't that important > >> for userspace emulation. Still, it can't be good, so this patch fixes it > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > >> system targets already use. 
> >>
> >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >> ---
> >>  exec.c | 19 -------------------
> >>  1 file changed, 19 deletions(-)
> >>
> >> diff --git a/exec.c b/exec.c
> >> index 011babd..e410dab 100644
> >> --- a/exec.c
> >> +++ b/exec.c
> >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
> >>  }
> >>  #endif
> >>
> >> -#ifndef CONFIG_USER_ONLY
> >>  static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> >>
> >>  static int cpu_get_free_index(Error **errp)
> >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
> >>  {
> >>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> >>  }
> >> -#else
> >> -
> >> -static int cpu_get_free_index(Error **errp)
> >> -{
> >> -    CPUState *some_cpu;
> >> -    int cpu_index = 0;
> >> -
> >> -    CPU_FOREACH(some_cpu) {
> >> -        cpu_index++;
> >> -    }
> >> -    return cpu_index;
> >> -}
> >> -
> >> -static void cpu_release_index(CPUState *cpu)
> >> -{
> >> -    return;
> >> -}
> >> -#endif
> >
> > Won't this change impose a maximum limit of 256 simultaneous
> > threads? That seems a little low for comfort.
>
> This was the reason why the bitmap logic wasn't applied to
> CONFIG_USER_ONLY when it was introduced.
>
> https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html

Ah.. good point.

Hrm, ok, my next idea would be to just (globally) sequentially
allocate cpu_index values for CONFIG_USER, and never try to re-use
them.  Does that seem reasonable?

> But then we didn't have actual removal, but we do now.

You mean patch 1/2 in this set?  Or something else?

Even so, 256 does seem a bit low for a number of simultaneously active
threads - there are some big hairy multi-threaded programs out there.
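The "sequentially allocate and never re-use" idea can be sketched as a single global counter. This is only an illustration under stated assumptions, not the patch that was eventually posted: `next_cpu_index`, `alloc_cpu_index()`, `release_cpu_index()` and `replay_sequence()` are hypothetical names, the 32-bit width is assumed, and a C11 atomic stands in for whatever locking the real callers hold.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical global counter; the name is not taken from QEMU. */
static _Atomic uint32_t next_cpu_index;

/* Hand out the next index and never recycle it, so two live threads
 * cannot share a value until the counter wraps around. */
uint32_t alloc_cpu_index(void)
{
    return atomic_fetch_add(&next_cpu_index, 1);
}

/* Releasing is deliberately a no-op: indexes are never reused. */
void release_cpu_index(uint32_t index)
{
    (void)index;
}

/* Replays create/create/exit/create, the sequence from the commit message. */
void replay_sequence(void)
{
    uint32_t t1 = alloc_cpu_index();
    uint32_t t2 = alloc_cpu_index();
    release_cpu_index(t1);
    uint32_t t3 = alloc_cpu_index();
    assert(t3 != t2); /* the duplicate produced by counting live CPUs cannot occur */
}
```

The trade-off is that index values grow without bound over the life of the process, so anything that sizes a table by cpu_index would need a different key.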
On Thu, 14 Jul 2016 21:59:45 +1000 David Gibson <david@gibson.dropbear.id.au> wrote: > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > > >> than for full system targets. This method turns out to be broken, since > > >> it can fairly easily result in duplicate cpu_index values for > > >> simultaneously active cpus (i.e. threads in the emulated process). > > >> > > >> Consider this sequence: > > >> Create thread 1 > > >> Create thread 2 > > >> Exit thread 1 > > >> Create thread 3 > > >> > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > >> threads in the cpus list at the point of its creation). > > >> > > >> We mostly get away with this because cpu_index values aren't that important > > >> for userspace emulation. Still, it can't be good, so this patch fixes it > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > > >> system targets already use. 
> > >>
> > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > >> ---
> > >>  exec.c | 19 -------------------
> > >>  1 file changed, 19 deletions(-)
> > >>
> > >> diff --git a/exec.c b/exec.c
> > >> index 011babd..e410dab 100644
> > >> --- a/exec.c
> > >> +++ b/exec.c
> > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
> > >>  }
> > >>  #endif
> > >>
> > >> -#ifndef CONFIG_USER_ONLY
> > >>  static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> > >>
> > >>  static int cpu_get_free_index(Error **errp)
> > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
> > >>  {
> > >>      bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> > >>  }
> > >> -#else
> > >> -
> > >> -static int cpu_get_free_index(Error **errp)
> > >> -{
> > >> -    CPUState *some_cpu;
> > >> -    int cpu_index = 0;
> > >> -
> > >> -    CPU_FOREACH(some_cpu) {
> > >> -        cpu_index++;
> > >> -    }
> > >> -    return cpu_index;
> > >> -}
> > >> -
> > >> -static void cpu_release_index(CPUState *cpu)
> > >> -{
> > >> -    return;
> > >> -}
> > >> -#endif
> > >
> > > Won't this change impose a maximum limit of 256 simultaneous
> > > threads? That seems a little low for comfort.
> >
> > This was the reason why the bitmap logic wasn't applied to
> > CONFIG_USER_ONLY when it was introduced.
> >
> > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html
>
> Ah.. good point.
>
> Hrm, ok, my next idea would be to just (globally) sequentially
> allocate cpu_index values for CONFIG_USER, and never try to re-use
> them.  Does that seem reasonable?
>

Isn't it only deferring the problem to later?

Maybe it is possible to define MAX_CPUMASK_BITS to a much higher
value for CONFIG_USER only instead?

> > But then we didn't have actual removal, but we do now.
>
> You mean patch 1/2 in this set?  Or something else?
>
> Even so, 256 does seem a bit low for a number of simultaneously active
> threads - there are some big hairy multi-threaded programs out there.
>
On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote: > On Thu, 14 Jul 2016 21:59:45 +1000 > David Gibson <david@gibson.dropbear.id.au> wrote: > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > > > >> than for full system targets. This method turns out to be broken, since > > > >> it can fairly easily result in duplicate cpu_index values for > > > >> simultaneously active cpus (i.e. threads in the emulated process). > > > >> > > > >> Consider this sequence: > > > >> Create thread 1 > > > >> Create thread 2 > > > >> Exit thread 1 > > > >> Create thread 3 > > > >> > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > > >> threads in the cpus list at the point of its creation). > > > >> > > > >> We mostly get away with this because cpu_index values aren't that important > > > >> for userspace emulation. Still, it can't be good, so this patch fixes it > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > > > >> system targets already use. 
> > > >> > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > >> --- > > > >> exec.c | 19 ------------------- > > > >> 1 file changed, 19 deletions(-) > > > >> > > > >> diff --git a/exec.c b/exec.c > > > >> index 011babd..e410dab 100644 > > > >> --- a/exec.c > > > >> +++ b/exec.c > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > > > >> } > > > >> #endif > > > >> > > > >> -#ifndef CONFIG_USER_ONLY > > > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > > >> > > > >> static int cpu_get_free_index(Error **errp) > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > > > >> { > > > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > > > >> } > > > >> -#else > > > >> - > > > >> -static int cpu_get_free_index(Error **errp) > > > >> -{ > > > >> - CPUState *some_cpu; > > > >> - int cpu_index = 0; > > > >> - > > > >> - CPU_FOREACH(some_cpu) { > > > >> - cpu_index++; > > > >> - } > > > >> - return cpu_index; > > > >> -} > > > >> - > > > >> -static void cpu_release_index(CPUState *cpu) > > > >> -{ > > > >> - return; > > > >> -} > > > >> -#endif > > > > > > > > Won't this change impose a maximum limit of 256 simultaneous > > > > threads? That seems a little low for comfort. > > > > > > This was the reason why the bitmap logic wasn't applied to > > > CONFIG_USER_ONLY when it was introduced. > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html > > > > Ah.. good point. > > > > Hrm, ok, my next idea would be to just (globally) sequentially > > allocate cpu_index values for CONFIG_USER, and never try to re-use > > them. Does that seem reasonable? > > > > Isn't it only deferring the problem to later ? You mean that we could get duplicate indexes after the value wraps around? I suppose, but duplicates after spawning 4 billion threads seems like a substantial improvement over duplicates after spawning 3 in the wrong order.. 
> Maybe it is possible to define MAX_CPUMASK_BITS to a much higher
> value for CONFIG_USER only instead?

Perhaps.  It does mean carrying around a huge bitmap, though.

Another option is to remove cpu_index entirely for the user only
case.  I have some patches for this, which are very ugly but it's
possible they can be cleaned up to something reasonable (the biggest
chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY
for what I think are registers that aren't accessible in user mode).

> > > But then we didn't have actual removal, but we do now.
> >
> > You mean patch 1/2 in this set?  Or something else?
> >
> > Even so, 256 does seem a bit low for a number of simultaneously active
> > threads - there are some big hairy multi-threaded programs out there.
> >
>
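To put a rough number on the "huge bitmap" concern, the helper below computes how many bytes a one-bit-per-index map needs. It is a back-of-the-envelope sketch: `bitmap_bytes()` is a hypothetical name, and the real DECLARE_BITMAP macro rounds up to whole longs rather than bytes, so actual sizes may be slightly larger.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for a bitmap holding one bit per possible cpu_index.
 * Written as divide-then-round-up so large inputs cannot overflow. */
uint64_t bitmap_bytes(uint64_t nbits)
{
    assert(nbits > 0); /* an empty index space makes no sense here */
    return nbits / 8 + (nbits % 8 != 0);
}
```

With the 256-entry limit mentioned in the thread this is 32 bytes; covering about a million indexes takes 128 KiB, and covering a full 32-bit index space takes 512 MiB, which is why simply raising the limit does not scale to "never run out".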
On Mon, 18 Jul 2016 11:17:25 +1000 David Gibson <david@gibson.dropbear.id.au> wrote: > On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote: > > On Thu, 14 Jul 2016 21:59:45 +1000 > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > > > > >> than for full system targets. This method turns out to be broken, since > > > > >> it can fairly easily result in duplicate cpu_index values for > > > > >> simultaneously active cpus (i.e. threads in the emulated process). > > > > >> > > > > >> Consider this sequence: > > > > >> Create thread 1 > > > > >> Create thread 2 > > > > >> Exit thread 1 > > > > >> Create thread 3 > > > > >> > > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > > > >> threads in the cpus list at the point of its creation). > > > > >> > > > > >> We mostly get away with this because cpu_index values aren't that important > > > > >> for userspace emulation. Still, it can't be good, so this patch fixes it > > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > > > > >> system targets already use. 
> > > > >> > > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > > >> --- > > > > >> exec.c | 19 ------------------- > > > > >> 1 file changed, 19 deletions(-) > > > > >> > > > > >> diff --git a/exec.c b/exec.c > > > > >> index 011babd..e410dab 100644 > > > > >> --- a/exec.c > > > > >> +++ b/exec.c > > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > > > > >> } > > > > >> #endif > > > > >> > > > > >> -#ifndef CONFIG_USER_ONLY > > > > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > > > >> > > > > >> static int cpu_get_free_index(Error **errp) > > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > > > > >> { > > > > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > > > > >> } > > > > >> -#else > > > > >> - > > > > >> -static int cpu_get_free_index(Error **errp) > > > > >> -{ > > > > >> - CPUState *some_cpu; > > > > >> - int cpu_index = 0; > > > > >> - > > > > >> - CPU_FOREACH(some_cpu) { > > > > >> - cpu_index++; > > > > >> - } > > > > >> - return cpu_index; > > > > >> -} > > > > >> - > > > > >> -static void cpu_release_index(CPUState *cpu) > > > > >> -{ > > > > >> - return; > > > > >> -} > > > > >> -#endif > > > > > > > > > > Won't this change impose a maximum limit of 256 simultaneous > > > > > threads? That seems a little low for comfort. > > > > > > > > This was the reason why the bitmap logic wasn't applied to > > > > CONFIG_USER_ONLY when it was introduced. > > > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html > > > > > > Ah.. good point. > > > > > > Hrm, ok, my next idea would be to just (globally) sequentially > > > allocate cpu_index values for CONFIG_USER, and never try to re-use > > > them. Does that seem reasonable? > > > > > > > Isn't it only deferring the problem to later ? > > You mean that we could get duplicate indexes after the value wraps > around? 
>
> I suppose, but duplicates after spawning 4 billion threads seems like
> a substantial improvement over duplicates after spawning 3 in the
> wrong order..
>
> > Maybe it is possible to define MAX_CPUMASK_BITS to a much higher
> > value for CONFIG_USER only instead?
>
> Perhaps.  It does mean carrying around a huge bitmap, though.
>
> Another option is to remove cpu_index entirely for the user only
> case.  I have some patches for this, which are very ugly but it's
> possible they can be cleaned up to something reasonable (the biggest
> chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY
> for what I think are registers that aren't accessible in user mode).

Could we remove cpu_index altogether for both *-user and *-softmmu targets?

> >
> > > > But then we didn't have actual removal, but we do now.
> > >
> > > You mean patch 1/2 in this set?  Or something else?
> > >
> > > Even so, 256 does seem a bit low for a number of simultaneously active
> > > threads - there are some big hairy multi-threaded programs out there.
> > >
> >
> >
>
On Mon, 18 Jul 2016 11:17:25 +1000 David Gibson <david@gibson.dropbear.id.au> wrote: > On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote: > > On Thu, 14 Jul 2016 21:59:45 +1000 > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > > > > >> than for full system targets. This method turns out to be broken, since > > > > >> it can fairly easily result in duplicate cpu_index values for > > > > >> simultaneously active cpus (i.e. threads in the emulated process). > > > > >> > > > > >> Consider this sequence: > > > > >> Create thread 1 > > > > >> Create thread 2 > > > > >> Exit thread 1 > > > > >> Create thread 3 > > > > >> > > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > > > >> threads in the cpus list at the point of its creation). > > > > >> > > > > >> We mostly get away with this because cpu_index values aren't that important > > > > >> for userspace emulation. Still, it can't be good, so this patch fixes it > > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > > > > >> system targets already use. 
> > > > >> > > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > > >> --- > > > > >> exec.c | 19 ------------------- > > > > >> 1 file changed, 19 deletions(-) > > > > >> > > > > >> diff --git a/exec.c b/exec.c > > > > >> index 011babd..e410dab 100644 > > > > >> --- a/exec.c > > > > >> +++ b/exec.c > > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > > > > >> } > > > > >> #endif > > > > >> > > > > >> -#ifndef CONFIG_USER_ONLY > > > > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > > > >> > > > > >> static int cpu_get_free_index(Error **errp) > > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > > > > >> { > > > > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > > > > >> } > > > > >> -#else > > > > >> - > > > > >> -static int cpu_get_free_index(Error **errp) > > > > >> -{ > > > > >> - CPUState *some_cpu; > > > > >> - int cpu_index = 0; > > > > >> - > > > > >> - CPU_FOREACH(some_cpu) { > > > > >> - cpu_index++; > > > > >> - } > > > > >> - return cpu_index; > > > > >> -} > > > > >> - > > > > >> -static void cpu_release_index(CPUState *cpu) > > > > >> -{ > > > > >> - return; > > > > >> -} > > > > >> -#endif > > > > > > > > > > Won't this change impose a maximum limit of 256 simultaneous > > > > > threads? That seems a little low for comfort. > > > > > > > > This was the reason why the bitmap logic wasn't applied to > > > > CONFIG_USER_ONLY when it was introduced. > > > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html > > > > > > Ah.. good point. > > > > > > Hrm, ok, my next idea would be to just (globally) sequentially > > > allocate cpu_index values for CONFIG_USER, and never try to re-use > > > them. Does that seem reasonable? > > > > > > > Isn't it only deferring the problem to later ? > > You mean that we could get duplicate indexes after the value wraps > around? > Yes. 
> I suppose, but duplicates after spawning 4 billion threads seems like
> a substantial improvement over duplicates after spawning 3 in the
> wrong order..
>

Agreed.

It takes user-mode QEMU ~20 seconds to spawn 10000 threads on my palmetto
box, so the wraparound would occur after ~100 days. :)

> > Maybe it is possible to define MAX_CPUMASK_BITS to a much higher
> > value for CONFIG_USER only instead?
>
> Perhaps.  It does mean carrying around a huge bitmap, though.
>
> Another option is to remove cpu_index entirely for the user only
> case.  I have some patches for this, which are very ugly but it's
> possible they can be cleaned up to something reasonable (the biggest
> chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY
> for what I think are registers that aren't accessible in user mode).
>
> >
> > > > But then we didn't have actual removal, but we do now.
> > >
> > > You mean patch 1/2 in this set?  Or something else?
> > >
> > > Even so, 256 does seem a bit low for a number of simultaneously active
> > > threads - there are some big hairy multi-threaded programs out there.
> > >
> >
> >
>
On Mon, Jul 18, 2016 at 10:52:39AM +0200, Greg Kurz wrote: > On Mon, 18 Jul 2016 11:17:25 +1000 > David Gibson <david@gibson.dropbear.id.au> wrote: > > > On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote: > > > On Thu, 14 Jul 2016 21:59:45 +1000 > > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > > > > > >> than for full system targets. This method turns out to be broken, since > > > > > >> it can fairly easily result in duplicate cpu_index values for > > > > > >> simultaneously active cpus (i.e. threads in the emulated process). > > > > > >> > > > > > >> Consider this sequence: > > > > > >> Create thread 1 > > > > > >> Create thread 2 > > > > > >> Exit thread 1 > > > > > >> Create thread 3 > > > > > >> > > > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > > > > >> threads in the cpus list at the point of its creation). > > > > > >> > > > > > >> We mostly get away with this because cpu_index values aren't that important > > > > > >> for userspace emulation. Still, it can't be good, so this patch fixes it > > > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > > > > > >> system targets already use. 
> > > > > >> > > > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > > > >> --- > > > > > >> exec.c | 19 ------------------- > > > > > >> 1 file changed, 19 deletions(-) > > > > > >> > > > > > >> diff --git a/exec.c b/exec.c > > > > > >> index 011babd..e410dab 100644 > > > > > >> --- a/exec.c > > > > > >> +++ b/exec.c > > > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > > > > > >> } > > > > > >> #endif > > > > > >> > > > > > >> -#ifndef CONFIG_USER_ONLY > > > > > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > > > > >> > > > > > >> static int cpu_get_free_index(Error **errp) > > > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > > > > > >> { > > > > > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > > > > > >> } > > > > > >> -#else > > > > > >> - > > > > > >> -static int cpu_get_free_index(Error **errp) > > > > > >> -{ > > > > > >> - CPUState *some_cpu; > > > > > >> - int cpu_index = 0; > > > > > >> - > > > > > >> - CPU_FOREACH(some_cpu) { > > > > > >> - cpu_index++; > > > > > >> - } > > > > > >> - return cpu_index; > > > > > >> -} > > > > > >> - > > > > > >> -static void cpu_release_index(CPUState *cpu) > > > > > >> -{ > > > > > >> - return; > > > > > >> -} > > > > > >> -#endif > > > > > > > > > > > > Won't this change impose a maximum limit of 256 simultaneous > > > > > > threads? That seems a little low for comfort. > > > > > > > > > > This was the reason why the bitmap logic wasn't applied to > > > > > CONFIG_USER_ONLY when it was introduced. > > > > > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html > > > > > > > > Ah.. good point. > > > > > > > > Hrm, ok, my next idea would be to just (globally) sequentially > > > > allocate cpu_index values for CONFIG_USER, and never try to re-use > > > > them. Does that seem reasonable? > > > > > > > > > > Isn't it only deferring the problem to later ? 
> >
> > You mean that we could get duplicate indexes after the value wraps
> > around?
> >
>
> Yes.
>
> > I suppose, but duplicates after spawning 4 billion threads seems like
> > a substantial improvement over duplicates after spawning 3 in the
> > wrong order..
> >
>
> Agreed.
>
> It takes user-mode QEMU ~20 seconds to spawn 10000 threads on my palmetto
> box, so the wraparound would occur after ~100 days. :)

Yeah, I think we can live with that.  Especially since the fact this
hasn't come up before kind of indicates the duplication isn't that
fatal anyway.
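The ~100 day figure can be reproduced with a few lines of arithmetic. This sketch assumes an index space of 2^32 values (the "4 billion" in the thread) and a steady spawn rate equal to the one measured above; `days_until_wrap()` is a hypothetical helper name.

```c
#include <assert.h>

/* Days until a counter with `index_space` distinct values wraps, given
 * that `threads` threads are spawned every `seconds` seconds. */
double days_until_wrap(double index_space, double threads, double seconds)
{
    assert(threads > 0 && seconds > 0);
    double spawns_per_second = threads / seconds;
    return index_space / spawns_per_second / 86400.0; /* 86400 s per day */
}
```

Calling `days_until_wrap(4294967296.0, 10000, 20)` gives a little over 99 days. If the index were stored in a signed 32-bit int and only non-negative values were usable, the same call with 2147483648.0 would give just under 50 days.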
On Mon, Jul 18, 2016 at 09:25:58AM +0200, Igor Mammedov wrote: > On Mon, 18 Jul 2016 11:17:25 +1000 > David Gibson <david@gibson.dropbear.id.au> wrote: > > > On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote: > > > On Thu, 14 Jul 2016 21:59:45 +1000 > > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > > > > > >> than for full system targets. This method turns out to be broken, since > > > > > >> it can fairly easily result in duplicate cpu_index values for > > > > > >> simultaneously active cpus (i.e. threads in the emulated process). > > > > > >> > > > > > >> Consider this sequence: > > > > > >> Create thread 1 > > > > > >> Create thread 2 > > > > > >> Exit thread 1 > > > > > >> Create thread 3 > > > > > >> > > > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > > > > >> threads in the cpus list at the point of its creation). > > > > > >> > > > > > >> We mostly get away with this because cpu_index values aren't that important > > > > > >> for userspace emulation. Still, it can't be good, so this patch fixes it > > > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > > > > > >> system targets already use. 
> > > > > >> > > > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > > > >> --- > > > > > >> exec.c | 19 ------------------- > > > > > >> 1 file changed, 19 deletions(-) > > > > > >> > > > > > >> diff --git a/exec.c b/exec.c > > > > > >> index 011babd..e410dab 100644 > > > > > >> --- a/exec.c > > > > > >> +++ b/exec.c > > > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > > > > > >> } > > > > > >> #endif > > > > > >> > > > > > >> -#ifndef CONFIG_USER_ONLY > > > > > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > > > > >> > > > > > >> static int cpu_get_free_index(Error **errp) > > > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > > > > > >> { > > > > > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > > > > > >> } > > > > > >> -#else > > > > > >> - > > > > > >> -static int cpu_get_free_index(Error **errp) > > > > > >> -{ > > > > > >> - CPUState *some_cpu; > > > > > >> - int cpu_index = 0; > > > > > >> - > > > > > >> - CPU_FOREACH(some_cpu) { > > > > > >> - cpu_index++; > > > > > >> - } > > > > > >> - return cpu_index; > > > > > >> -} > > > > > >> - > > > > > >> -static void cpu_release_index(CPUState *cpu) > > > > > >> -{ > > > > > >> - return; > > > > > >> -} > > > > > >> -#endif > > > > > > > > > > > > Won't this change impose a maximum limit of 256 simultaneous > > > > > > threads? That seems a little low for comfort. > > > > > > > > > > This was the reason why the bitmap logic wasn't applied to > > > > > CONFIG_USER_ONLY when it was introduced. > > > > > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html > > > > > > > > Ah.. good point. > > > > > > > > Hrm, ok, my next idea would be to just (globally) sequentially > > > > allocate cpu_index values for CONFIG_USER, and never try to re-use > > > > them. Does that seem reasonable? > > > > > > > > > > Isn't it only deferring the problem to later ? 
> >
> > You mean that we could get duplicate indexes after the value wraps
> > around?
> >
> > I suppose, but duplicates after spawning 4 billion threads seems like
> > a substantial improvement over duplicates after spawning 3 in the
> > wrong order..
> >
> > > Maybe it is possible to define MAX_CPUMASK_BITS to a much higher
> > > value for CONFIG_USER only instead?
> >
> > Perhaps.  It does mean carrying around a huge bitmap, though.
> >
> > Another option is to remove cpu_index entirely for the user only
> > case.  I have some patches for this, which are very ugly but it's
> > possible they can be cleaned up to something reasonable (the biggest
> > chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY
> > for what I think are registers that aren't accessible in user mode).
> Could we remove cpu_index altogether for both *-user and *-softmmu targets?

Well.. not in the same way I'm looking at removing it for *-user, at
any rate.

From looking through all the users of cpu_index, nearly all of them
are in two categories:

1) Labelling debug or error messages with a CPU #

There's nothing obvious to replace this with for *-softmmu.  For
*-user, however, we can use the host tid, which is probably more
useful than an essentially arbitrary cpu index.

2) Initializing cpu specific registers

That's "cpu specific" in both the sense of ISA specific and in the
sense of specific to a particular CPU in an SMP system.  These
registers are generally privileged and so don't need to be emulated
for *-user.  Finding a substitute for *-softmmu is rather harder.
diff --git a/exec.c b/exec.c
index 011babd..e410dab 100644
--- a/exec.c
+++ b/exec.c
@@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
 }
 #endif
 
-#ifndef CONFIG_USER_ONLY
 static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
 
 static int cpu_get_free_index(Error **errp)
@@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
 {
     bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
 }
-#else
-
-static int cpu_get_free_index(Error **errp)
-{
-    CPUState *some_cpu;
-    int cpu_index = 0;
-
-    CPU_FOREACH(some_cpu) {
-        cpu_index++;
-    }
-    return cpu_index;
-}
-
-static void cpu_release_index(CPUState *cpu)
-{
-    return;
-}
-#endif
 
 void cpu_exec_exit(CPUState *cpu)
 {
With CONFIG_USER_ONLY, generation of cpu_index values is done differently
than for full system targets.  This method turns out to be broken, since
it can fairly easily result in duplicate cpu_index values for
simultaneously active cpus (i.e. threads in the emulated process).

Consider this sequence:
    Create thread 1
    Create thread 2
    Exit thread 1
    Create thread 3

With the current logic thread 1 will get cpu_index 1, thread 2 will get
cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2
threads in the cpus list at the point of its creation).

We mostly get away with this because cpu_index values aren't that important
for userspace emulation.  Still, it can't be good, so this patch fixes it
by making CONFIG_USER_ONLY use the same bitmap based allocation that full
system targets already use.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 exec.c | 19 -------------------
 1 file changed, 19 deletions(-)
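The duplicate described in the commit message can be reproduced with a toy model of the removed code. This is a simplified sketch, not QEMU code: the live-thread table, `MAX_LIVE`, and the function names are all hypothetical, and the initial thread is assumed to hold cpu_index 0, which is consistent with "thread 1" receiving index 1.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LIVE 16                /* hypothetical cap for this toy model */

static int live_index[MAX_LIVE];   /* cpu_index of each live thread */
static int live_count;             /* number of live threads */

/* Mimics the removed user-mode allocator: the new index is simply the
 * number of CPUs currently on the list. */
int count_based_create(void)
{
    assert(live_count < MAX_LIVE);
    int index = live_count;
    live_index[live_count++] = index;
    return index;
}

/* Remove one live thread with the given index; nothing is recorded
 * about the freed value, matching the no-op release in the old code. */
void count_based_exit(int index)
{
    for (int i = 0; i < live_count; i++) {
        if (live_index[i] == index) {
            live_index[i] = live_index[--live_count];
            return;
        }
    }
}

/* True if two live threads currently share a cpu_index. */
bool has_duplicate_index(void)
{
    for (int i = 0; i < live_count; i++) {
        for (int j = i + 1; j < live_count; j++) {
            if (live_index[i] == live_index[j]) {
                return true;
            }
        }
    }
    return false;
}
```

Creating the initial thread plus threads 1 and 2, exiting thread 1, and then creating thread 3 leaves two live threads with index 2, so `has_duplicate_index()` returns true at that point.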