Message ID | 20191031105609.21819-1-sergey.dyasli@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3,for,4.13] x86/e820: fix 640k - 1M region reservation logic | expand |
On 31.10.2019 11:56, Sergey Dyasli wrote: > --- a/xen/arch/x86/cpu/common.c > +++ b/xen/arch/x86/cpu/common.c > @@ -274,6 +274,15 @@ static inline u32 phys_pkg_id(u32 cpuid_apic, int index_msb) > return _phys_pkg_id(get_apic_id(), index_msb); > } > > +/* > + * Sometimes it's too early to use cpu_has_hypervisor which is available only > + * after early_cpu_init(). > + */ > +bool __init early_cpu_has_hypervisor(void) > +{ > + return cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_HYPERVISOR); > +} OOI, did you consider introducing a more general early_cpu_has(), with X86_FEATURE_* passed as an argument? > @@ -318,9 +319,10 @@ static int __init copy_e820_map(struct e820entry * biosmap, unsigned int nr_map) > > /* > * Some BIOSes claim RAM in the 640k - 1M region. > - * Not right. Fix it up. > + * Not right. Fix it up, but only when running on bare metal. > */ > - if (type == E820_RAM) { > + if ( !early_cpu_has_hypervisor() && type == E820_RAM ) > + { > if (start < 0x100000ULL && end > 0xA0000ULL) { > if (start < 0xA0000ULL) > add_memory_region(start, 0xA0000ULL-start, type); Seeing original line and lower context - aren't you corrupting previously reasonably consistent (Linux) style here? (This could be easily taken care of while committing, but I'd first like the point below be clarified.) > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -5943,7 +5943,7 @@ const struct platform_bad_page *__init get_platform_badpages(unsigned int *array > case 0x000806e0: /* erratum KBL??? */ > case 0x000906e0: /* errata KBL??? / KBW114 / CFW103 */ > *array_size = (cpuid_eax(0) >= 7 && > - !(cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_HYPERVISOR)) && > + !early_cpu_has_hypervisor() && Strictly speaking this makes clear that in early_cpu_has_hypervisor() you should also check cpuid_eax(0) >= 1. We don't currently seem to object to running on a system with only basic leaf 0 available (we do object to there not being extended leaf 1). Andrew, do you have any clear opinion one way or the other? Jan
On 31/10/2019 11:07, Jan Beulich wrote: > On 31.10.2019 11:56, Sergey Dyasli wrote: >> --- a/xen/arch/x86/cpu/common.c >> +++ b/xen/arch/x86/cpu/common.c >> @@ -274,6 +274,15 @@ static inline u32 phys_pkg_id(u32 cpuid_apic, int index_msb) >> return _phys_pkg_id(get_apic_id(), index_msb); >> } >> >> +/* >> + * Sometimes it's too early to use cpu_has_hypervisor which is available only >> + * after early_cpu_init(). >> + */ >> +bool __init early_cpu_has_hypervisor(void) >> +{ >> + return cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_HYPERVISOR); >> +} > > OOI, did you consider introducing a more general early_cpu_has(), > with X86_FEATURE_* passed as an argument? I couldn't find any other users of cpuid_e{c,d}x(0x1), so I don't see the point in doing it now. >> @@ -318,9 +319,10 @@ static int __init copy_e820_map(struct e820entry * biosmap, unsigned int nr_map) >> >> /* >> * Some BIOSes claim RAM in the 640k - 1M region. >> - * Not right. Fix it up. >> + * Not right. Fix it up, but only when running on bare metal. >> */ >> - if (type == E820_RAM) { >> + if ( !early_cpu_has_hypervisor() && type == E820_RAM ) >> + { >> if (start < 0x100000ULL && end > 0xA0000ULL) { >> if (start < 0xA0000ULL) >> add_memory_region(start, 0xA0000ULL-start, type); > > Seeing original line and lower context - aren't you corrupting > previously reasonably consistent (Linux) style here? (This could > be easily taken care of while committing, but I'd first like the > point below be clarified.) This file has mixed coding style and I don't have any preference here. >> --- a/xen/arch/x86/mm.c >> +++ b/xen/arch/x86/mm.c >> @@ -5943,7 +5943,7 @@ const struct platform_bad_page *__init get_platform_badpages(unsigned int *array >> case 0x000806e0: /* erratum KBL??? */ >> case 0x000906e0: /* errata KBL??? / KBW114 / CFW103 */ >> *array_size = (cpuid_eax(0) >= 7 && >> - !(cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_HYPERVISOR)) && >> + !early_cpu_has_hypervisor() && > > Strictly speaking this makes clear that in early_cpu_has_hypervisor() > you should also check cpuid_eax(0) >= 1. We don't currently seem to > object to running on a system with only basic leaf 0 available (we > do object to there not being extended leaf 1). Andrew, do you have > any clear opinion one way or the other? What gave you that impression? early_cpu_init() and generic_identify() do cpuid(0x1) unconditionally. -- Thanks, Sergey
On 31.10.2019 12:45, Sergey Dyasli wrote: > On 31/10/2019 11:07, Jan Beulich wrote: >> On 31.10.2019 11:56, Sergey Dyasli wrote: >>> --- a/xen/arch/x86/mm.c >>> +++ b/xen/arch/x86/mm.c >>> @@ -5943,7 +5943,7 @@ const struct platform_bad_page *__init get_platform_badpages(unsigned int *array >>> case 0x000806e0: /* erratum KBL??? */ >>> case 0x000906e0: /* errata KBL??? / KBW114 / CFW103 */ >>> *array_size = (cpuid_eax(0) >= 7 && >>> - !(cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_HYPERVISOR)) && >>> + !early_cpu_has_hypervisor() && >> >> Strictly speaking this makes clear that in early_cpu_has_hypervisor() >> you should also check cpuid_eax(0) >= 1. We don't currently seem to >> object to running on a system with only basic leaf 0 available (we >> do object to there not being extended leaf 1). Andrew, do you have >> any clear opinion one way or the other? > > What gave you that impression? early_cpu_init() and generic_identify() > do cpuid(0x1) unconditionally. "do cpuid(0x1) unconditionally" != "object to running on a system with only basic leaf 0 available", i.e. that code is simply buggy as long as we don't elsewhere panic() if there's only leaf 0 available. Jan
On 31/10/2019 11:07, Jan Beulich wrote: > On 31.10.2019 11:56, Sergey Dyasli wrote: >> --- a/xen/arch/x86/cpu/common.c >> +++ b/xen/arch/x86/cpu/common.c >> @@ -274,6 +274,15 @@ static inline u32 phys_pkg_id(u32 cpuid_apic, int index_msb) >> return _phys_pkg_id(get_apic_id(), index_msb); >> } >> >> +/* >> + * Sometimes it's too early to use cpu_has_hypervisor which is available only >> + * after early_cpu_init(). >> + */ >> +bool __init early_cpu_has_hypervisor(void) >> +{ >> + return cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_HYPERVISOR); >> +} Static inline alongside the other cpuid helpers please. This assembles to smaller than the call required to get to a separate translation unit. > OOI, did you consider introducing a more general early_cpu_has(), > with X86_FEATURE_* passed as an argument? This is interim code. Personally, I'm still go with what I said at the very beginning, of opencoding it just in the hunk below. > >> @@ -318,9 +319,10 @@ static int __init copy_e820_map(struct e820entry * biosmap, unsigned int nr_map) >> >> /* >> * Some BIOSes claim RAM in the 640k - 1M region. >> - * Not right. Fix it up. >> + * Not right. Fix it up, but only when running on bare metal. >> */ >> - if (type == E820_RAM) { >> + if ( !early_cpu_has_hypervisor() && type == E820_RAM ) >> + { >> if (start < 0x100000ULL && end > 0xA0000ULL) { >> if (start < 0xA0000ULL) >> add_memory_region(start, 0xA0000ULL-start, type); > Seeing original line and lower context - aren't you corrupting > previously reasonably consistent (Linux) style here? (This could > be easily taken care of while committing, but I'd first like the > point below be clarified.) > >> --- a/xen/arch/x86/mm.c >> +++ b/xen/arch/x86/mm.c >> @@ -5943,7 +5943,7 @@ const struct platform_bad_page *__init get_platform_badpages(unsigned int *array >> case 0x000806e0: /* erratum KBL??? */ >> case 0x000906e0: /* errata KBL??? / KBW114 / CFW103 */ >> *array_size = (cpuid_eax(0) >= 7 && >> - !(cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_HYPERVISOR)) && >> + !early_cpu_has_hypervisor() && > Strictly speaking this makes clear that in early_cpu_has_hypervisor() > you should also check cpuid_eax(0) >= 1. We don't currently seem to > object to running on a system with only basic leaf 0 available (we > do object to there not being extended leaf 1). Andrew, do you have > any clear opinion one way or the other? By the time we are running in C, we know the CPU actually has long mode and therefore max_leaf > 4(?) and max_extd_leaf >= 0x80000008 I don't see any value in not relying on this property, and a cost (extra branches) to not relying on it. ~Andrew
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c index 6c6bd63301..adedb557df 100644 --- a/xen/arch/x86/cpu/common.c +++ b/xen/arch/x86/cpu/common.c @@ -274,6 +274,15 @@ static inline u32 phys_pkg_id(u32 cpuid_apic, int index_msb) return _phys_pkg_id(get_apic_id(), index_msb); } +/* + * Sometimes it's too early to use cpu_has_hypervisor which is available only + * after early_cpu_init(). + */ +bool __init early_cpu_has_hypervisor(void) +{ + return cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_HYPERVISOR); +} + /* Do minimum CPU detection early. Fields really needed: vendor, cpuid_level, family, model, mask, cache alignment. The others are not touched to avoid unwanted side effects. diff --git a/xen/arch/x86/e820.c b/xen/arch/x86/e820.c index 8e8a2c4e1b..dbbe6cac0a 100644 --- a/xen/arch/x86/e820.c +++ b/xen/arch/x86/e820.c @@ -10,6 +10,7 @@ #include <asm/mtrr.h> #include <asm/msr.h> #include <asm/guest.h> +#include <asm/setup.h> /* * opt_mem: Limit maximum address of physical RAM. @@ -318,9 +319,10 @@ static int __init copy_e820_map(struct e820entry * biosmap, unsigned int nr_map) /* * Some BIOSes claim RAM in the 640k - 1M region. - * Not right. Fix it up. + * Not right. Fix it up, but only when running on bare metal. */ - if (type == E820_RAM) { + if ( !early_cpu_has_hypervisor() && type == E820_RAM ) + { if (start < 0x100000ULL && end > 0xA0000ULL) { if (start < 0xA0000ULL) add_memory_region(start, 0xA0000ULL-start, type); diff --git a/xen/arch/x86/guest/xen.c b/xen/arch/x86/guest/xen.c index 7b7a5badab..48ea3224ea 100644 --- a/xen/arch/x86/guest/xen.c +++ b/xen/arch/x86/guest/xen.c @@ -31,6 +31,7 @@ #include <asm/guest.h> #include <asm/msr.h> #include <asm/processor.h> +#include <asm/setup.h> #include <public/arch-x86/cpuid.h> #include <public/hvm/params.h> @@ -72,8 +73,7 @@ void __init probe_hypervisor(void) if ( xen_guest ) return; - /* Too early to use cpu_has_hypervisor */ - if ( !(cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_HYPERVISOR)) ) + if ( !early_cpu_has_hypervisor() ) return; find_xen_leaves(); diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index 99816fc67c..df1641634c 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -5943,7 +5943,7 @@ const struct platform_bad_page *__init get_platform_badpages(unsigned int *array case 0x000806e0: /* erratum KBL??? */ case 0x000906e0: /* errata KBL??? / KBW114 / CFW103 */ *array_size = (cpuid_eax(0) >= 7 && - !(cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_HYPERVISOR)) && + !early_cpu_has_hypervisor() && (cpuid_count_ebx(7, 0) & cpufeat_mask(X86_FEATURE_HLE))); return &hle_bad_page; } diff --git a/xen/include/asm-x86/setup.h b/xen/include/asm-x86/setup.h index 861d46d6ac..ddd37907f1 100644 --- a/xen/include/asm-x86/setup.h +++ b/xen/include/asm-x86/setup.h @@ -16,6 +16,7 @@ extern unsigned long xenheap_initial_phys_start; void early_cpu_init(void); void early_time_init(void); +bool early_cpu_has_hypervisor(void); void set_nr_cpu_ids(unsigned int max_cpus);
Converting a guest from PV to PV-in-PVH makes the guest to have 384k less memory, which may confuse guest's balloon driver. This happens because Xen unconditionally reserves 640k - 1M region in E820 despite the fact that it's really a usable RAM in PVH boot mode. Fix this by skipping region type change in virtualised environments, trusting whatever memory map our hypervisor has provided. Take a refactoring opportunity to introduce early_cpu_has_hypervisor(). Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> --- v2 --> v3: - early_cpu_has_hypervisor() is added CC: Jan Beulich <jbeulich@suse.com> CC: Andrew Cooper <andrew.cooper3@citrix.com> CC: Wei Liu <wl@xen.org> CC: "Roger Pau Monné" <roger.pau@citrix.com> CC: Juergen Gross <jgross@suse.com> --- xen/arch/x86/cpu/common.c | 9 +++++++++ xen/arch/x86/e820.c | 6 ++++-- xen/arch/x86/guest/xen.c | 4 ++-- xen/arch/x86/mm.c | 2 +- xen/include/asm-x86/setup.h | 1 + 5 files changed, 17 insertions(+), 5 deletions(-)