diff mbox series

xen/x86: clear per cpu stub page information in cpu_smpboot_free()

Message ID 20200108110148.18988-1-jgross@suse.com (mailing list archive)
State Superseded
Headers show
Series xen/x86: clear per cpu stub page information in cpu_smpboot_free() | expand

Commit Message

Jürgen Groß Jan. 8, 2020, 11:01 a.m. UTC
cpu_smpboot_free() removes the stubs for the cpu going offline, but it
isn't clearing the related percpu variables. This will result in
crashes when a stub page is released due to all related cpus gone
offline and one of those cpus going online later.

Fix that by clearing stubs.addr and stubs.mfn in order to allocate a
new stub page when needed.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/arch/x86/smpboot.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Wei Liu Jan. 8, 2020, 12:16 p.m. UTC | #1
On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote:
> cpu_smpboot_free() removes the stubs for the cpu going offline, but it
> isn't clearing the related percpu variables. This will result in
> crashes when a stub page is released due to all related cpus gone
> offline and one of those cpus going online later.
> 
> Fix that by clearing stubs.addr and stubs.mfn in order to allocate a
> new stub page when needed.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>  xen/arch/x86/smpboot.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
> index 7e29704080..46c0729214 100644
> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove)
>                               (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1);
>          if ( i == STUBS_PER_PAGE )
>              free_domheap_page(mfn_to_page(mfn));
> +        per_cpu(stubs.addr, cpu) = 0;
> +        per_cpu(stubs.mfn, cpu) = 0;

Shouldn't the mfn be set to INVALID_MFN instead?

Wei.

>      }
>  
>      FREE_XENHEAP_PAGE(per_cpu(compat_gdt, cpu));
> -- 
> 2.16.4
>
Jürgen Groß Jan. 8, 2020, 12:18 p.m. UTC | #2
On 08.01.20 13:16, Wei Liu wrote:
> On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote:
>> cpu_smpboot_free() removes the stubs for the cpu going offline, but it
>> isn't clearing the related percpu variables. This will result in
>> crashes when a stub page is released due to all related cpus gone
>> offline and one of those cpus going online later.
>>
>> Fix that by clearing stubs.addr and stubs.mfn in order to allocate a
>> new stub page when needed.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   xen/arch/x86/smpboot.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>> index 7e29704080..46c0729214 100644
>> --- a/xen/arch/x86/smpboot.c
>> +++ b/xen/arch/x86/smpboot.c
>> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove)
>>                                (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1);
>>           if ( i == STUBS_PER_PAGE )
>>               free_domheap_page(mfn_to_page(mfn));
>> +        per_cpu(stubs.addr, cpu) = 0;
>> +        per_cpu(stubs.mfn, cpu) = 0;
> 
> Shouldn't the mfn be set to INVALID_MFN instead?

This would require modifying alloc_stub_page():

     if ( *mfn )
         pg = mfn_to_page(_mfn(*mfn));
     else


Juergen
Andrew Cooper Jan. 8, 2020, 12:20 p.m. UTC | #3
On 08/01/2020 12:18, Jürgen Groß wrote:
> On 08.01.20 13:16, Wei Liu wrote:
>> On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote:
>>> cpu_smpboot_free() removes the stubs for the cpu going offline, but it
>>> isn't clearing the related percpu variables. This will result in
>>> crashes when a stub page is released due to all related cpus gone
>>> offline and one of those cpus going online later.
>>>
>>> Fix that by clearing stubs.addr and stubs.mfn in order to allocate a
>>> new stub page when needed.
>>>
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>> ---
>>>   xen/arch/x86/smpboot.c | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>>> index 7e29704080..46c0729214 100644
>>> --- a/xen/arch/x86/smpboot.c
>>> +++ b/xen/arch/x86/smpboot.c
>>> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu,
>>> bool remove)
>>>                                (per_cpu(stubs.addr, cpu) |
>>> ~PAGE_MASK) + 1);
>>>           if ( i == STUBS_PER_PAGE )
>>>               free_domheap_page(mfn_to_page(mfn));
>>> +        per_cpu(stubs.addr, cpu) = 0;
>>> +        per_cpu(stubs.mfn, cpu) = 0;
>>
>> Shouldn't the mfn be set to INVALID_MFN instead?
>
> This would require modifying alloc_stub_page():
>
>     if ( *mfn )
>         pg = mfn_to_page(_mfn(*mfn));
>     else

Correct.  per-cpu data is initialised to 0, not to a custom default, so
using INVALID_MFN is more complicated.

~Andrew
Wei Liu Jan. 8, 2020, 12:23 p.m. UTC | #4
On Wed, Jan 08, 2020 at 01:18:46PM +0100, Jürgen Groß wrote:
> On 08.01.20 13:16, Wei Liu wrote:
> > On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote:
> > > cpu_smpboot_free() removes the stubs for the cpu going offline, but it
> > > isn't clearing the related percpu variables. This will result in
> > > crashes when a stub page is released due to all related cpus gone
> > > offline and one of those cpus going online later.
> > > 
> > > Fix that by clearing stubs.addr and stubs.mfn in order to allocate a
> > > new stub page when needed.
> > > 
> > > Signed-off-by: Juergen Gross <jgross@suse.com>
> > > ---
> > >   xen/arch/x86/smpboot.c | 2 ++
> > >   1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
> > > index 7e29704080..46c0729214 100644
> > > --- a/xen/arch/x86/smpboot.c
> > > +++ b/xen/arch/x86/smpboot.c
> > > @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove)
> > >                                (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1);
> > >           if ( i == STUBS_PER_PAGE )
> > >               free_domheap_page(mfn_to_page(mfn));
> > > +        per_cpu(stubs.addr, cpu) = 0;
> > > +        per_cpu(stubs.mfn, cpu) = 0;
> > 
> > Shouldn't the mfn be set to INVALID_MFN instead?
> 
> This would require modifying alloc_stub_page():
> 
>     if ( *mfn )
>         pg = mfn_to_page(_mfn(*mfn));
>     else

OK. I think the chance of allocating mfn 0 from the allocator is
exceedingly low, so I certainly have no objection to reset it to 0.

Wei.

> 
> 
> Juergen
Jürgen Groß Jan. 8, 2020, 12:26 p.m. UTC | #5
On 08.01.20 13:23, Wei Liu wrote:
> On Wed, Jan 08, 2020 at 01:18:46PM +0100, Jürgen Groß wrote:
>> On 08.01.20 13:16, Wei Liu wrote:
>>> On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote:
>>>> cpu_smpboot_free() removes the stubs for the cpu going offline, but it
>>>> isn't clearing the related percpu variables. This will result in
>>>> crashes when a stub page is released due to all related cpus gone
>>>> offline and one of those cpus going online later.
>>>>
>>>> Fix that by clearing stubs.addr and stubs.mfn in order to allocate a
>>>> new stub page when needed.
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> ---
>>>>    xen/arch/x86/smpboot.c | 2 ++
>>>>    1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>>>> index 7e29704080..46c0729214 100644
>>>> --- a/xen/arch/x86/smpboot.c
>>>> +++ b/xen/arch/x86/smpboot.c
>>>> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove)
>>>>                                 (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1);
>>>>            if ( i == STUBS_PER_PAGE )
>>>>                free_domheap_page(mfn_to_page(mfn));
>>>> +        per_cpu(stubs.addr, cpu) = 0;
>>>> +        per_cpu(stubs.mfn, cpu) = 0;
>>>
>>> Shouldn't the mfn be set to INVALID_MFN instead?
>>
>> This would require modifying alloc_stub_page():
>>
>>      if ( *mfn )
>>          pg = mfn_to_page(_mfn(*mfn));
>>      else
> 
> OK. I think the chance of allocating mfn 0 from the allocator is
> exceedingly low, so I certainly have no objection to reset it to 0.

The chance should be exactly zero. Otherwise we'd have a big problem
due to L1TF...


Juergen
Wei Liu Jan. 8, 2020, 12:29 p.m. UTC | #6
On Wed, Jan 08, 2020 at 01:26:49PM +0100, Jürgen Groß wrote:
> On 08.01.20 13:23, Wei Liu wrote:
> > On Wed, Jan 08, 2020 at 01:18:46PM +0100, Jürgen Groß wrote:
> > > On 08.01.20 13:16, Wei Liu wrote:
> > > > On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote:
> > > > > cpu_smpboot_free() removes the stubs for the cpu going offline, but it
> > > > > isn't clearing the related percpu variables. This will result in
> > > > > crashes when a stub page is released due to all related cpus gone
> > > > > offline and one of those cpus going online later.
> > > > > 
> > > > > Fix that by clearing stubs.addr and stubs.mfn in order to allocate a
> > > > > new stub page when needed.
> > > > > 
> > > > > Signed-off-by: Juergen Gross <jgross@suse.com>
> > > > > ---
> > > > >    xen/arch/x86/smpboot.c | 2 ++
> > > > >    1 file changed, 2 insertions(+)
> > > > > 
> > > > > diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
> > > > > index 7e29704080..46c0729214 100644
> > > > > --- a/xen/arch/x86/smpboot.c
> > > > > +++ b/xen/arch/x86/smpboot.c
> > > > > @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove)
> > > > >                                 (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1);
> > > > >            if ( i == STUBS_PER_PAGE )
> > > > >                free_domheap_page(mfn_to_page(mfn));
> > > > > +        per_cpu(stubs.addr, cpu) = 0;
> > > > > +        per_cpu(stubs.mfn, cpu) = 0;
> > > > 
> > > > Shouldn't the mfn be set to INVALID_MFN instead?
> > > 
> > > This would require modifying alloc_stub_page():
> > > 
> > >      if ( *mfn )
> > >          pg = mfn_to_page(_mfn(*mfn));
> > >      else
> > 
> > OK. I think the chance of allocating mfn 0 from the allocator is
> > exceedingly low, so I certainly have no objection to reset it to 0.
> 
> The chance should be exactly zero. Otherwise we'd have a big problem
> due to L1TF...

Heh...

Reviewed-by: Wei Liu <wl@xen.org>
Andrew Cooper Jan. 8, 2020, 12:31 p.m. UTC | #7
On 08/01/2020 12:26, Jürgen Groß wrote:
> On 08.01.20 13:23, Wei Liu wrote:
>> On Wed, Jan 08, 2020 at 01:18:46PM +0100, Jürgen Groß wrote:
>>> On 08.01.20 13:16, Wei Liu wrote:
>>>> On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote:
>>>>> cpu_smpboot_free() removes the stubs for the cpu going offline,
>>>>> but it
>>>>> isn't clearing the related percpu variables. This will result in
>>>>> crashes when a stub page is released due to all related cpus gone
>>>>> offline and one of those cpus going online later.
>>>>>
>>>>> Fix that by clearing stubs.addr and stubs.mfn in order to allocate a
>>>>> new stub page when needed.
>>>>>
>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>> ---
>>>>>    xen/arch/x86/smpboot.c | 2 ++
>>>>>    1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>>>>> index 7e29704080..46c0729214 100644
>>>>> --- a/xen/arch/x86/smpboot.c
>>>>> +++ b/xen/arch/x86/smpboot.c
>>>>> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu,
>>>>> bool remove)
>>>>>                                 (per_cpu(stubs.addr, cpu) |
>>>>> ~PAGE_MASK) + 1);
>>>>>            if ( i == STUBS_PER_PAGE )
>>>>>                free_domheap_page(mfn_to_page(mfn));
>>>>> +        per_cpu(stubs.addr, cpu) = 0;
>>>>> +        per_cpu(stubs.mfn, cpu) = 0;
>>>>
>>>> Shouldn't the mfn be set to INVALID_MFN instead?
>>>
>>> This would require modifying alloc_stub_page():
>>>
>>>      if ( *mfn )
>>>          pg = mfn_to_page(_mfn(*mfn));
>>>      else
>>
>> OK. I think the chance of allocating mfn 0 from the allocator is
>> exceedingly low, so I certainly have no objection to reset it to 0.
>
> The chance should be exactly zero. Otherwise we'd have a big problem
> due to L1TF...

Also MFN 0 contains the IVT and BDA.

The IVT at least must remain valid at all times (even on EFI systems),
because AP boot needs to have at least a no-op NMI handler.  A base of 0
and limit of 0xffff is the architectural INIT/RESET state for IDTR. 
(Also for the other system data structures, but they have to be moved
elsewhere before they can be used).

~Andrew
Jan Beulich Jan. 8, 2020, 1:20 p.m. UTC | #8
On 08.01.2020 12:01, Juergen Gross wrote:
> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove)
>                               (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1);
>          if ( i == STUBS_PER_PAGE )
>              free_domheap_page(mfn_to_page(mfn));
> +        per_cpu(stubs.addr, cpu) = 0;
> +        per_cpu(stubs.mfn, cpu) = 0;
>      }

Afaict this was a regression from the introduction of CPU parking:
Prior to that, per-CPU data would have been zeroed in all cases
before a new CPU was unleashed. I think it would be helpful it
this was mentioned in the description (possibly by way of a Fixes:
tag).

Jan
Jürgen Groß Jan. 8, 2020, 1:26 p.m. UTC | #9
On 08.01.20 14:20, Jan Beulich wrote:
> On 08.01.2020 12:01, Juergen Gross wrote:
>> --- a/xen/arch/x86/smpboot.c
>> +++ b/xen/arch/x86/smpboot.c
>> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove)
>>                                (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1);
>>           if ( i == STUBS_PER_PAGE )
>>               free_domheap_page(mfn_to_page(mfn));
>> +        per_cpu(stubs.addr, cpu) = 0;
>> +        per_cpu(stubs.mfn, cpu) = 0;
>>       }
> 
> Afaict this was a regression from the introduction of CPU parking:
> Prior to that, per-CPU data would have been zeroed in all cases
> before a new CPU was unleashed. I think it would be helpful it
> this was mentioned in the description (possibly by way of a Fixes:
> tag).

Okay, will do. Just to be sure - you are thinking of commit 2e6c8f182?


Juergen
Jan Beulich Jan. 8, 2020, 1:31 p.m. UTC | #10
On 08.01.2020 14:26, Jürgen Groß wrote:
> On 08.01.20 14:20, Jan Beulich wrote:
>> On 08.01.2020 12:01, Juergen Gross wrote:
>>> --- a/xen/arch/x86/smpboot.c
>>> +++ b/xen/arch/x86/smpboot.c
>>> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove)
>>>                                (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1);
>>>           if ( i == STUBS_PER_PAGE )
>>>               free_domheap_page(mfn_to_page(mfn));
>>> +        per_cpu(stubs.addr, cpu) = 0;
>>> +        per_cpu(stubs.mfn, cpu) = 0;
>>>       }
>>
>> Afaict this was a regression from the introduction of CPU parking:
>> Prior to that, per-CPU data would have been zeroed in all cases
>> before a new CPU was unleashed. I think it would be helpful it
>> this was mentioned in the description (possibly by way of a Fixes:
>> tag).
> 
> Okay, will do. Just to be sure - you are thinking of commit 2e6c8f182?

Yes, this looks to be the one.

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 7e29704080..46c0729214 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -945,6 +945,8 @@  static void cpu_smpboot_free(unsigned int cpu, bool remove)
                              (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1);
         if ( i == STUBS_PER_PAGE )
             free_domheap_page(mfn_to_page(mfn));
+        per_cpu(stubs.addr, cpu) = 0;
+        per_cpu(stubs.mfn, cpu) = 0;
     }
 
     FREE_XENHEAP_PAGE(per_cpu(compat_gdt, cpu));