Message ID | 20200108110148.18988-1-jgross@suse.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | xen/x86: clear per cpu stub page information in cpu_smpboot_free() | expand |
On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote: > cpu_smpboot_free() removes the stubs for the cpu going offline, but it > isn't clearing the related percpu variables. This will result in > crashes when a stub page is released due to all related cpus gone > offline and one of those cpus going online later. > > Fix that by clearing stubs.addr and stubs.mfn in order to allocate a > new stub page when needed. > > Signed-off-by: Juergen Gross <jgross@suse.com> > --- > xen/arch/x86/smpboot.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c > index 7e29704080..46c0729214 100644 > --- a/xen/arch/x86/smpboot.c > +++ b/xen/arch/x86/smpboot.c > @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove) > (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1); > if ( i == STUBS_PER_PAGE ) > free_domheap_page(mfn_to_page(mfn)); > + per_cpu(stubs.addr, cpu) = 0; > + per_cpu(stubs.mfn, cpu) = 0; Shouldn't the mfn be set to INVALID_MFN instead? Wei. > } > > FREE_XENHEAP_PAGE(per_cpu(compat_gdt, cpu)); > -- > 2.16.4 >
On 08.01.20 13:16, Wei Liu wrote: > On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote: >> cpu_smpboot_free() removes the stubs for the cpu going offline, but it >> isn't clearing the related percpu variables. This will result in >> crashes when a stub page is released due to all related cpus gone >> offline and one of those cpus going online later. >> >> Fix that by clearing stubs.addr and stubs.mfn in order to allocate a >> new stub page when needed. >> >> Signed-off-by: Juergen Gross <jgross@suse.com> >> --- >> xen/arch/x86/smpboot.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c >> index 7e29704080..46c0729214 100644 >> --- a/xen/arch/x86/smpboot.c >> +++ b/xen/arch/x86/smpboot.c >> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove) >> (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1); >> if ( i == STUBS_PER_PAGE ) >> free_domheap_page(mfn_to_page(mfn)); >> + per_cpu(stubs.addr, cpu) = 0; >> + per_cpu(stubs.mfn, cpu) = 0; > > Shouldn't the mfn be set to INVALID_MFN instead? This would require modifying alloc_stub_page(): if ( *mfn ) pg = mfn_to_page(_mfn(*mfn)); else Juergen
On 08/01/2020 12:18, Jürgen Groß wrote: > On 08.01.20 13:16, Wei Liu wrote: >> On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote: >>> cpu_smpboot_free() removes the stubs for the cpu going offline, but it >>> isn't clearing the related percpu variables. This will result in >>> crashes when a stub page is released due to all related cpus gone >>> offline and one of those cpus going online later. >>> >>> Fix that by clearing stubs.addr and stubs.mfn in order to allocate a >>> new stub page when needed. >>> >>> Signed-off-by: Juergen Gross <jgross@suse.com> >>> --- >>> xen/arch/x86/smpboot.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c >>> index 7e29704080..46c0729214 100644 >>> --- a/xen/arch/x86/smpboot.c >>> +++ b/xen/arch/x86/smpboot.c >>> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, >>> bool remove) >>> (per_cpu(stubs.addr, cpu) | >>> ~PAGE_MASK) + 1); >>> if ( i == STUBS_PER_PAGE ) >>> free_domheap_page(mfn_to_page(mfn)); >>> + per_cpu(stubs.addr, cpu) = 0; >>> + per_cpu(stubs.mfn, cpu) = 0; >> >> Shouldn't the mfn be set to INVALID_MFN instead? > > This would require modifying alloc_stub_page(): > > if ( *mfn ) > pg = mfn_to_page(_mfn(*mfn)); > else Correct. per-cpu data is initialised to 0, not to a custom default, so using INVALID_MFN is more complicated. ~Andrew
On Wed, Jan 08, 2020 at 01:18:46PM +0100, Jürgen Groß wrote: > On 08.01.20 13:16, Wei Liu wrote: > > On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote: > > > cpu_smpboot_free() removes the stubs for the cpu going offline, but it > > > isn't clearing the related percpu variables. This will result in > > > crashes when a stub page is released due to all related cpus gone > > > offline and one of those cpus going online later. > > > > > > Fix that by clearing stubs.addr and stubs.mfn in order to allocate a > > > new stub page when needed. > > > > > > Signed-off-by: Juergen Gross <jgross@suse.com> > > > --- > > > xen/arch/x86/smpboot.c | 2 ++ > > > 1 file changed, 2 insertions(+) > > > > > > diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c > > > index 7e29704080..46c0729214 100644 > > > --- a/xen/arch/x86/smpboot.c > > > +++ b/xen/arch/x86/smpboot.c > > > @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove) > > > (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1); > > > if ( i == STUBS_PER_PAGE ) > > > free_domheap_page(mfn_to_page(mfn)); > > > + per_cpu(stubs.addr, cpu) = 0; > > > + per_cpu(stubs.mfn, cpu) = 0; > > > > Shouldn't the mfn be set to INVALID_MFN instead? > > This would require modifying alloc_stub_page(): > > if ( *mfn ) > pg = mfn_to_page(_mfn(*mfn)); > else OK. I think the chance of allocating mfn 0 from the allocator is exceedingly low, so I certainly have no objection to reset it to 0. Wei. > > > Juergen
On 08.01.20 13:23, Wei Liu wrote: > On Wed, Jan 08, 2020 at 01:18:46PM +0100, Jürgen Groß wrote: >> On 08.01.20 13:16, Wei Liu wrote: >>> On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote: >>>> cpu_smpboot_free() removes the stubs for the cpu going offline, but it >>>> isn't clearing the related percpu variables. This will result in >>>> crashes when a stub page is released due to all related cpus gone >>>> offline and one of those cpus going online later. >>>> >>>> Fix that by clearing stubs.addr and stubs.mfn in order to allocate a >>>> new stub page when needed. >>>> >>>> Signed-off-by: Juergen Gross <jgross@suse.com> >>>> --- >>>> xen/arch/x86/smpboot.c | 2 ++ >>>> 1 file changed, 2 insertions(+) >>>> >>>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c >>>> index 7e29704080..46c0729214 100644 >>>> --- a/xen/arch/x86/smpboot.c >>>> +++ b/xen/arch/x86/smpboot.c >>>> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove) >>>> (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1); >>>> if ( i == STUBS_PER_PAGE ) >>>> free_domheap_page(mfn_to_page(mfn)); >>>> + per_cpu(stubs.addr, cpu) = 0; >>>> + per_cpu(stubs.mfn, cpu) = 0; >>> >>> Shouldn't the mfn be set to INVALID_MFN instead? >> >> This would require modifying alloc_stub_page(): >> >> if ( *mfn ) >> pg = mfn_to_page(_mfn(*mfn)); >> else > > OK. I think the chance of allocating mfn 0 from the allocator is > exceedingly low, so I certainly have no objection to reset it to 0. The chance should be exactly zero. Otherwise we'd have a big problem due to L1TF... Juergen
On Wed, Jan 08, 2020 at 01:26:49PM +0100, Jürgen Groß wrote: > On 08.01.20 13:23, Wei Liu wrote: > > On Wed, Jan 08, 2020 at 01:18:46PM +0100, Jürgen Groß wrote: > > > On 08.01.20 13:16, Wei Liu wrote: > > > > On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote: > > > > > cpu_smpboot_free() removes the stubs for the cpu going offline, but it > > > > > isn't clearing the related percpu variables. This will result in > > > > > crashes when a stub page is released due to all related cpus gone > > > > > offline and one of those cpus going online later. > > > > > > > > > > Fix that by clearing stubs.addr and stubs.mfn in order to allocate a > > > > > new stub page when needed. > > > > > > > > > > Signed-off-by: Juergen Gross <jgross@suse.com> > > > > > --- > > > > > xen/arch/x86/smpboot.c | 2 ++ > > > > > 1 file changed, 2 insertions(+) > > > > > > > > > > diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c > > > > > index 7e29704080..46c0729214 100644 > > > > > --- a/xen/arch/x86/smpboot.c > > > > > +++ b/xen/arch/x86/smpboot.c > > > > > @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove) > > > > > (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1); > > > > > if ( i == STUBS_PER_PAGE ) > > > > > free_domheap_page(mfn_to_page(mfn)); > > > > > + per_cpu(stubs.addr, cpu) = 0; > > > > > + per_cpu(stubs.mfn, cpu) = 0; > > > > > > > > Shouldn't the mfn be set to INVALID_MFN instead? > > > > > > This would require modifying alloc_stub_page(): > > > > > > if ( *mfn ) > > > pg = mfn_to_page(_mfn(*mfn)); > > > else > > > > OK. I think the chance of allocating mfn 0 from the allocator is > > exceedingly low, so I certainly have no objection to reset it to 0. > > The chance should be exactly zero. Otherwise we'd have a big problem > due to L1TF... Heh... Reviewed-by: Wei Liu <wl@xen.org>
On 08/01/2020 12:26, Jürgen Groß wrote: > On 08.01.20 13:23, Wei Liu wrote: >> On Wed, Jan 08, 2020 at 01:18:46PM +0100, Jürgen Groß wrote: >>> On 08.01.20 13:16, Wei Liu wrote: >>>> On Wed, Jan 08, 2020 at 12:01:48PM +0100, Juergen Gross wrote: >>>>> cpu_smpboot_free() removes the stubs for the cpu going offline, >>>>> but it >>>>> isn't clearing the related percpu variables. This will result in >>>>> crashes when a stub page is released due to all related cpus gone >>>>> offline and one of those cpus going online later. >>>>> >>>>> Fix that by clearing stubs.addr and stubs.mfn in order to allocate a >>>>> new stub page when needed. >>>>> >>>>> Signed-off-by: Juergen Gross <jgross@suse.com> >>>>> --- >>>>> xen/arch/x86/smpboot.c | 2 ++ >>>>> 1 file changed, 2 insertions(+) >>>>> >>>>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c >>>>> index 7e29704080..46c0729214 100644 >>>>> --- a/xen/arch/x86/smpboot.c >>>>> +++ b/xen/arch/x86/smpboot.c >>>>> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, >>>>> bool remove) >>>>> (per_cpu(stubs.addr, cpu) | >>>>> ~PAGE_MASK) + 1); >>>>> if ( i == STUBS_PER_PAGE ) >>>>> free_domheap_page(mfn_to_page(mfn)); >>>>> + per_cpu(stubs.addr, cpu) = 0; >>>>> + per_cpu(stubs.mfn, cpu) = 0; >>>> >>>> Shouldn't the mfn be set to INVALID_MFN instead? >>> >>> This would require modifying alloc_stub_page(): >>> >>> if ( *mfn ) >>> pg = mfn_to_page(_mfn(*mfn)); >>> else >> >> OK. I think the chance of allocating mfn 0 from the allocator is >> exceedingly low, so I certainly have no objection to reset it to 0. > > The chance should be exactly zero. Otherwise we'd have a big problem > due to L1TF... Also MFN 0 contains the IVT and BDA. The IVT at least must remain valid at all times (even on EFI systems), because AP boot needs to have at least a no-op NMI handler. A base of 0 and limit of 0xffff is the architectural INIT/RESET state for IDTR. (Also for the other system data structures, but they have to be moved elsewhere before they can be used). ~Andrew
On 08.01.2020 12:01, Juergen Gross wrote: > --- a/xen/arch/x86/smpboot.c > +++ b/xen/arch/x86/smpboot.c > @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove) > (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1); > if ( i == STUBS_PER_PAGE ) > free_domheap_page(mfn_to_page(mfn)); > + per_cpu(stubs.addr, cpu) = 0; > + per_cpu(stubs.mfn, cpu) = 0; > } Afaict this was a regression from the introduction of CPU parking: Prior to that, per-CPU data would have been zeroed in all cases before a new CPU was unleashed. I think it would be helpful it this was mentioned in the description (possibly by way of a Fixes: tag). Jan
On 08.01.20 14:20, Jan Beulich wrote: > On 08.01.2020 12:01, Juergen Gross wrote: >> --- a/xen/arch/x86/smpboot.c >> +++ b/xen/arch/x86/smpboot.c >> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove) >> (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1); >> if ( i == STUBS_PER_PAGE ) >> free_domheap_page(mfn_to_page(mfn)); >> + per_cpu(stubs.addr, cpu) = 0; >> + per_cpu(stubs.mfn, cpu) = 0; >> } > > Afaict this was a regression from the introduction of CPU parking: > Prior to that, per-CPU data would have been zeroed in all cases > before a new CPU was unleashed. I think it would be helpful it > this was mentioned in the description (possibly by way of a Fixes: > tag). Okay, will do. Just to be sure - you are thinking of commit 2e6c8f182? Juergen
On 08.01.2020 14:26, Jürgen Groß wrote: > On 08.01.20 14:20, Jan Beulich wrote: >> On 08.01.2020 12:01, Juergen Gross wrote: >>> --- a/xen/arch/x86/smpboot.c >>> +++ b/xen/arch/x86/smpboot.c >>> @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove) >>> (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1); >>> if ( i == STUBS_PER_PAGE ) >>> free_domheap_page(mfn_to_page(mfn)); >>> + per_cpu(stubs.addr, cpu) = 0; >>> + per_cpu(stubs.mfn, cpu) = 0; >>> } >> >> Afaict this was a regression from the introduction of CPU parking: >> Prior to that, per-CPU data would have been zeroed in all cases >> before a new CPU was unleashed. I think it would be helpful it >> this was mentioned in the description (possibly by way of a Fixes: >> tag). > > Okay, will do. Just to be sure - you are thinking of commit 2e6c8f182? Yes, this looks to be the one. Jan
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index 7e29704080..46c0729214 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -945,6 +945,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove) (per_cpu(stubs.addr, cpu) | ~PAGE_MASK) + 1); if ( i == STUBS_PER_PAGE ) free_domheap_page(mfn_to_page(mfn)); + per_cpu(stubs.addr, cpu) = 0; + per_cpu(stubs.mfn, cpu) = 0; } FREE_XENHEAP_PAGE(per_cpu(compat_gdt, cpu));
cpu_smpboot_free() removes the stubs for the cpu going offline, but it isn't clearing the related percpu variables. This will result in crashes when a stub page is released due to all related cpus gone offline and one of those cpus going online later. Fix that by clearing stubs.addr and stubs.mfn in order to allocate a new stub page when needed. Signed-off-by: Juergen Gross <jgross@suse.com> --- xen/arch/x86/smpboot.c | 2 ++ 1 file changed, 2 insertions(+)