diff mbox

return default values for apic probe functions.

Message ID 1239945321-3903-1-git-send-email-glommer@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Glauber Costa April 17, 2009, 5:15 a.m. UTC
As KVM cpus runs on threads, it is possible that
we call kvm_load_registers() from a cpu thread, while the
apic has not yet fully initialized. kvm_load_registers() is called
from ap_main_loop.

This is not a problem when we're starting the whole machine together,
but is a problem for hotplug, since we don't have the protection
of the locks that protect machine initialization. Currently, some executions
of cpu hotplug on rainy sundays fail with a segfault.

Moving apic initialization to before kvm_init_vpcu proved fruitful,
as there are some dependencies involved. (kvm irqchip would fail to
initialize).

This patch provides default values to be used for tpr and apic_base,
that will be returned when the apic is not yet properly initialized.
It is aimed at kvm, where the problem exists, but it could equally be
used for qemu too, if there is agreement.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 qemu/hw/apic.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

Comments

Jan Kiszka April 17, 2009, 6:56 a.m. UTC | #1
Glauber Costa wrote:
> As KVM cpus runs on threads, it is possible that
> we call kvm_load_registers() from a cpu thread, while the
> apic has not yet fully initialized. kvm_load_registers() is called
> from ap_main_loop.
> 
> This is not a problem when we're starting the whole machine together,
> but is a problem for hotplug, since we don't have the protection
> of the locks that protect machine initialization. Currently, some executions
> of cpu hotplug on rainy sundays fail with a segfault.
> 
> Moving apic initialization to before kvm_init_vpcu proved fruitful,
> as there are some dependencies involved. (kvm irqchip would fail to
> initialize).
> 
> This patch provides default values to be used for tpr and apic_base,
> that will be returned when the apic is not yet properly initialized.
> It is aimed at kvm, where the problem exists, but it could equally be
> used for qemu too, if there is agreement.
> 
> Signed-off-by: Glauber Costa <glommer@redhat.com>
> ---
>  qemu/hw/apic.c |   12 ++++++++++--
>  1 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/qemu/hw/apic.c b/qemu/hw/apic.c
> index b926508..06fb9b5 100644
> --- a/qemu/hw/apic.c
> +++ b/qemu/hw/apic.c
> @@ -301,7 +301,12 @@ uint64_t cpu_get_apic_base(CPUState *env)
>  #ifdef DEBUG_APIC
>      printf("cpu_get_apic_base: %016" PRIx64 "\n", (uint64_t)s->apicbase);
>  #endif
> -    return s->apicbase;
> +    if (s) {
> +        return s->apicbase;
> +    }
> +    else {
> +        return 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
> +    }
>  }

Even on sunny days, this collides with QEMU commit #7048. :)

Does Intel specify what non-existent MSRs should return, ie. is your
version still correct if !s->apicbase means that there is actually no
APIC? And does kvm depend on the default base? If so, I would say:
provide a patch against upstream.

>  
>  void cpu_set_apic_tpr(CPUX86State *env, uint8_t val)
> @@ -314,7 +319,10 @@ void cpu_set_apic_tpr(CPUX86State *env, uint8_t val)
>  uint8_t cpu_get_apic_tpr(CPUX86State *env)
>  {
>      APICState *s = env->apic_state;
> -    return s->tpr >> 4;
> +    if (s)
> +        return s->tpr >> 4;
> +    else
> +        return 0;
>  }
>  
>  /* return -1 if no bit is set */

This is already upstream.

Jan
Glauber Costa April 17, 2009, 1:22 p.m. UTC | #2
> Even on sunny days, this collides with QEMU commit #7048. :)
>
> Does Intel specify what non-existent MSRs should return, ie. is your
> version still correct if !s->apicbase means that there is actually no
> APIC? And does kvm depend on the default base? If so, I would say:
> provide a patch against upstream.

hummm, I missed this one going in.
But sadly, your patch also breaks cpu hotplug. Not a segfault anymore, but the
VM will freeze instead of shutting down, if we ask too. It does not even respond
to ^C anymore.

By leaving your patch as is, and changing the apic base return to

   return s ? s->apicbase : (0xfee00000 | MSR_IA32_APICBASE_ENABLE);

fixes the issue.

I'm not sure about what the manual says (will check now), but I
believe if we ever try to
read from apic, we should return a meaningful value. Can you verify if
this also works for your
test case?

>
>>
>>  void cpu_set_apic_tpr(CPUX86State *env, uint8_t val)
>> @@ -314,7 +319,10 @@ void cpu_set_apic_tpr(CPUX86State *env, uint8_t val)
>>  uint8_t cpu_get_apic_tpr(CPUX86State *env)
>>  {
>>      APICState *s = env->apic_state;
>> -    return s->tpr >> 4;
>> +    if (s)
>> +        return s->tpr >> 4;
>> +    else
>> +        return 0;
>>  }
>>
>>  /* return -1 if no bit is set */
>
> This is already upstream.

Yeah, and the rest of your patch is totally ok for me.
Glauber Costa April 17, 2009, 1:40 p.m. UTC | #3
On Fri, Apr 17, 2009 at 10:22 AM, Glauber Costa <glommer@gmail.com> wrote:
>> Even on sunny days, this collides with QEMU commit #7048. :)
>>
>> Does Intel specify what non-existent MSRs should return, ie. is your
>> version still correct if !s->apicbase means that there is actually no
>> APIC? And does kvm depend on the default base? If so, I would say:
>> provide a patch against upstream.
>
> hummm, I missed this one going in.
> But sadly, your patch also breaks cpu hotplug. Not a segfault anymore, but the
> VM will freeze instead of shutting down, if we ask too. It does not even respond
> to ^C anymore.
>
> By leaving your patch as is, and changing the apic base return to
>
>   return s ? s->apicbase : (0xfee00000 | MSR_IA32_APICBASE_ENABLE);
>
> fixes the issue.
>
After reading the manual, my understanding is that only the flag must
be set. I tried,
and in fact:

   return s ? s->apicbase :  MSR_IA32_APICBASE_ENABLE;

still fixes it.
If it works for you, I believe this is a good solution, and will send
a descriptive patch.
If we ever read the apic as disabled, we will have problems enabling
it again. So for
my test case, kvm should never see a disabled apic.

For yours, you'll still see the apic base address as zero.

what do you think?
Marcelo Tosatti April 17, 2009, 1:53 p.m. UTC | #4
Hi Glauber,

On Fri, Apr 17, 2009 at 01:15:21AM -0400, Glauber Costa wrote:
> As KVM cpus runs on threads, it is possible that
> we call kvm_load_registers() from a cpu thread, while the
> apic has not yet fully initialized. kvm_load_registers() is called
> from ap_main_loop.
> 
> This is not a problem when we're starting the whole machine together,
> but is a problem for hotplug, since we don't have the protection
> of the locks that protect machine initialization. Currently, some executions
> of cpu hotplug on rainy sundays fail with a segfault.

    /* and wait for machine initialization */
    while (!qemu_system_ready)
        qemu_cond_wait(&qemu_system_cond);
    pthread_mutex_unlock(&qemu_mutex);

Shouldnt this cover the cpu hotplug case too? Perhaps have:

    /* wait for machine initialization */
    while (!qemu_system_ready)
        qemu_cond_wait(&qemu_system_cond);
    /* wait for vcpu initialization */
    while (!env->initialized)
        qemu_cond_wait(&qemu_system_cond);
    pthread_mutex_unlock(&qemu_mutex);

And then set env->initialized when the cpu is good to go.

Because there could be other dependencies other than APIC 
initialization, for eg in pc_new_cpu

        if (cpu != 0)
            env->halted = 1;

> Moving apic initialization to before kvm_init_vpcu proved fruitful,
> as there are some dependencies involved. (kvm irqchip would fail to
> initialize).
> 
> This patch provides default values to be used for tpr and apic_base,
> that will be returned when the apic is not yet properly initialized.
> It is aimed at kvm, where the problem exists, but it could equally be
> used for qemu too, if there is agreement.
> 
> Signed-off-by: Glauber Costa <glommer@redhat.com>
> ---
>  qemu/hw/apic.c |   12 ++++++++++--
>  1 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/qemu/hw/apic.c b/qemu/hw/apic.c
> index b926508..06fb9b5 100644
> --- a/qemu/hw/apic.c
> +++ b/qemu/hw/apic.c
> @@ -301,7 +301,12 @@ uint64_t cpu_get_apic_base(CPUState *env)
>  #ifdef DEBUG_APIC
>      printf("cpu_get_apic_base: %016" PRIx64 "\n", (uint64_t)s->apicbase);
>  #endif
> -    return s->apicbase;
> +    if (s) {
> +        return s->apicbase;
> +    }
> +    else {
> +        return 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
> +    }
>  }
>  
>  void cpu_set_apic_tpr(CPUX86State *env, uint8_t val)
> @@ -314,7 +319,10 @@ void cpu_set_apic_tpr(CPUX86State *env, uint8_t val)
>  uint8_t cpu_get_apic_tpr(CPUX86State *env)
>  {
>      APICState *s = env->apic_state;
> -    return s->tpr >> 4;
> +    if (s)
> +        return s->tpr >> 4;
> +    else
> +        return 0;
>  }
>  
>  /* return -1 if no bit is set */
> -- 
> 1.5.6.6
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Glauber Costa April 17, 2009, 1:59 p.m. UTC | #5
On Fri, Apr 17, 2009 at 10:53 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> Hi Glauber,
>
> On Fri, Apr 17, 2009 at 01:15:21AM -0400, Glauber Costa wrote:
>> As KVM cpus runs on threads, it is possible that
>> we call kvm_load_registers() from a cpu thread, while the
>> apic has not yet fully initialized. kvm_load_registers() is called
>> from ap_main_loop.
>>
>> This is not a problem when we're starting the whole machine together,
>> but is a problem for hotplug, since we don't have the protection
>> of the locks that protect machine initialization. Currently, some executions
>> of cpu hotplug on rainy sundays fail with a segfault.
>
>    /* and wait for machine initialization */
>    while (!qemu_system_ready)
>        qemu_cond_wait(&qemu_system_cond);
>    pthread_mutex_unlock(&qemu_mutex);
>
> Shouldnt this cover the cpu hotplug case too? Perhaps have:
>
>    /* wait for machine initialization */
>    while (!qemu_system_ready)
>        qemu_cond_wait(&qemu_system_cond);
>    /* wait for vcpu initialization */
>    while (!env->initialized)
>        qemu_cond_wait(&qemu_system_cond);
>    pthread_mutex_unlock(&qemu_mutex);
>
> And then set env->initialized when the cpu is good to go.
From my understanding, all this is only useful when the whole machine
is starting, since they are global locks that wait for a system wide condition.

This is not the case with cpu hotplug, since the box is already on.

>
> Because there could be other dependencies other than APIC
> initialization, for eg in pc_new_cpu
>
>        if (cpu != 0)
>            env->halted = 1;

it is okay for the cpu to be halted. Btw, I believe this should be
moved inside cpu init.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka April 17, 2009, 2:15 p.m. UTC | #6
Glauber Costa wrote:
> On Fri, Apr 17, 2009 at 10:22 AM, Glauber Costa <glommer@gmail.com> wrote:
>>> Even on sunny days, this collides with QEMU commit #7048. :)
>>>
>>> Does Intel specify what non-existent MSRs should return, ie. is your
>>> version still correct if !s->apicbase means that there is actually no
>>> APIC? And does kvm depend on the default base? If so, I would say:
>>> provide a patch against upstream.
>> hummm, I missed this one going in.
>> But sadly, your patch also breaks cpu hotplug. Not a segfault anymore, but the
>> VM will freeze instead of shutting down, if we ask too. It does not even respond
>> to ^C anymore.
>>
>> By leaving your patch as is, and changing the apic base return to
>>
>>   return s ? s->apicbase : (0xfee00000 | MSR_IA32_APICBASE_ENABLE);
>>
>> fixes the issue.
>>
> After reading the manual, my understanding is that only the flag must
> be set. I tried,
> and in fact:
> 
>    return s ? s->apicbase :  MSR_IA32_APICBASE_ENABLE;
> 
> still fixes it.
> If it works for you, I believe this is a good solution, and will send
> a descriptive patch.
> If we ever read the apic as disabled, we will have problems enabling
> it again. So for
> my test case, kvm should never see a disabled apic.
> 
> For yours, you'll still see the apic base address as zero.
> 
> what do you think?
> 

My use case (you can try it yourself: -M isapc) will likely already be
fine if there is no segv on accidentally reading that msr. I was more
concerned about the correct value according to the spec - if that case
isn't simply "undefined".

On the other hand, I didn't get the precise race yet, and my feeling
about this patch as a fix for something else than an inexistent apic is
not that good. But it's just a feeling.

Jan
Marcelo Tosatti April 17, 2009, 2:18 p.m. UTC | #7
On Fri, Apr 17, 2009 at 10:59:11AM -0300, Glauber Costa wrote:
> On Fri, Apr 17, 2009 at 10:53 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > Hi Glauber,
> >
> > On Fri, Apr 17, 2009 at 01:15:21AM -0400, Glauber Costa wrote:
> >> As KVM cpus runs on threads, it is possible that
> >> we call kvm_load_registers() from a cpu thread, while the
> >> apic has not yet fully initialized. kvm_load_registers() is called
> >> from ap_main_loop.
> >>
> >> This is not a problem when we're starting the whole machine together,
> >> but is a problem for hotplug, since we don't have the protection
> >> of the locks that protect machine initialization. Currently, some executions
> >> of cpu hotplug on rainy sundays fail with a segfault.
> >
> >    /* and wait for machine initialization */
> >    while (!qemu_system_ready)
> >        qemu_cond_wait(&qemu_system_cond);
> >    pthread_mutex_unlock(&qemu_mutex);
> >
> > Shouldnt this cover the cpu hotplug case too? Perhaps have:
> >
> >    /* wait for machine initialization */
> >    while (!qemu_system_ready)
> >        qemu_cond_wait(&qemu_system_cond);
> >    /* wait for vcpu initialization */
> >    while (!env->initialized)
       ^^^^^^^^^^^

> >        qemu_cond_wait(&qemu_system_cond);
> >    pthread_mutex_unlock(&qemu_mutex);
> >
> > And then set env->initialized when the cpu is good to go.
> >From my understanding, all this is only useful when the whole machine
> is starting, since they are global locks that wait for a system wide condition.
> 
> This is not the case with cpu hotplug, since the box is already on.

Yes, but the above proposes the addition of vcpu "initialized" state
in addition to the system wide state, which would fix the current cpu
hotplug breakage with apic, or any other access to uninitialized state
from the vcpu thread.

> > Because there could be other dependencies other than APIC
> > initialization, for eg in pc_new_cpu
> >
> >        if (cpu != 0)
> >            env->halted = 1;
> 
> it is okay for the cpu to be halted. Btw, I believe this should be
> moved inside cpu init.

Point here is the vcpu might not be halted by the time the vcpu thread
starts to run, while it should (ie: its an example of uninitialized
state other than the apic).

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity April 19, 2009, 8:45 a.m. UTC | #8
Glauber Costa wrote:
> As KVM cpus runs on threads, it is possible that
> we call kvm_load_registers() from a cpu thread, while the
> apic has not yet fully initialized. kvm_load_registers() is called
> from ap_main_loop.
>
> This is not a problem when we're starting the whole machine together,
> but is a problem for hotplug, since we don't have the protection
> of the locks that protect machine initialization. Currently, some executions
> of cpu hotplug on rainy sundays fail with a segfault.
>
> Moving apic initialization to before kvm_init_vpcu proved fruitful,
> as there are some dependencies involved. (kvm irqchip would fail to
> initialize).
>   

I presume you mean unfruitful (or perhaps a nasty kind of fruit).

> This patch provides default values to be used for tpr and apic_base,
> that will be returned when the apic is not yet properly initialized.
> It is aimed at kvm, where the problem exists, but it could equally be
> used for qemu too, if there is agreement.
>   

Seems like a hack... can you try not to make the vcpu visible until it 
is completely initialized?

(and what is the problem exactly - someone accessing the registers from 
a different thread? that shouldn't happen)
diff mbox

Patch

diff --git a/qemu/hw/apic.c b/qemu/hw/apic.c
index b926508..06fb9b5 100644
--- a/qemu/hw/apic.c
+++ b/qemu/hw/apic.c
@@ -301,7 +301,12 @@  uint64_t cpu_get_apic_base(CPUState *env)
 #ifdef DEBUG_APIC
     printf("cpu_get_apic_base: %016" PRIx64 "\n", (uint64_t)s->apicbase);
 #endif
-    return s->apicbase;
+    if (s) {
+        return s->apicbase;
+    }
+    else {
+        return 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
+    }
 }
 
 void cpu_set_apic_tpr(CPUX86State *env, uint8_t val)
@@ -314,7 +319,10 @@  void cpu_set_apic_tpr(CPUX86State *env, uint8_t val)
 uint8_t cpu_get_apic_tpr(CPUX86State *env)
 {
     APICState *s = env->apic_state;
-    return s->tpr >> 4;
+    if (s)
+        return s->tpr >> 4;
+    else
+        return 0;
 }
 
 /* return -1 if no bit is set */