diff mbox

[v2,3/3] paravirt: rename paravirt_enabled to paravirt_legacy

Message ID 20160208163526.GI28980@pd.tnic (mailing list archive)
State New, archived
Headers show

Commit Message

Borislav Petkov Feb. 8, 2016, 4:35 p.m. UTC
On Mon, Feb 08, 2016 at 11:31:04AM -0500, Boris Ostrovsky wrote:
> I think we are OK for PV because this code will be executed after pvops are
> set and so we will be calling xen_cpuid().

Not for the early loader - it is too early for pvops then. So you're
saying something like that won't work?

---

Comments

Andrew Cooper Feb. 8, 2016, 4:38 p.m. UTC | #1
On 08/02/16 16:35, Borislav Petkov wrote:
> On Mon, Feb 08, 2016 at 11:31:04AM -0500, Boris Ostrovsky wrote:
>> I think we are OK for PV because this code will be executed after pvops are
>> set and so we will be calling xen_cpuid().
> Not for the early loader - it is too early for pvops then. So you're
> saying something like that won't work?

Correct.  PV guests are ring-deprivilelged so the cpuid instruction
doesn't trap in general.  (It does on more modern Intel hardware with
cpuid faulting enabled, but that is only IvyBridge and newer).

Does the early loader have extable support?  If so, this is fairly easy
to fix.  If not, we have a problem.

~Andrew
Boris Ostrovsky Feb. 8, 2016, 4:41 p.m. UTC | #2
On 02/08/2016 11:35 AM, Borislav Petkov wrote:
> On Mon, Feb 08, 2016 at 11:31:04AM -0500, Boris Ostrovsky wrote:
>> I think we are OK for PV because this code will be executed after pvops are
>> set and so we will be calling xen_cpuid().
> Not for the early loader - it is too early for pvops then. So you're
> saying something like that won't work?

Keep in mind that Xen PV doesn't go through startup_32|64(). It starts 
at xen_start_kernel (save for a small stub before that), which sets 
pvops.  It "joins" regular/baremetal code in 
i386_start_kernel/x86_64_start_reservation().

-boris
Borislav Petkov Feb. 8, 2016, 4:45 p.m. UTC | #3
On Mon, Feb 08, 2016 at 04:38:40PM +0000, Andrew Cooper wrote:
> Does the early loader have extable support?  If so, this is fairly easy
> to fix.  If not, we have a problem.

It doesn't and regardless, you want to have this CPUID querying as
simple as possible. No special handling, no special prefixes as it
should be able to run on other hypervisors too.

If one can't execute a simple CPUID(0x4...) on a xen guest and get the
results back, then for early, we will have to do what we've done until
now and simply emulate the MSR accesses.

Later code can use then xen_cpuid() and all is fine. We should still get
rid of paravirt_enabled() though.
Borislav Petkov Feb. 8, 2016, 4:52 p.m. UTC | #4
On Mon, Feb 08, 2016 at 11:41:16AM -0500, Boris Ostrovsky wrote:
> Keep in mind that Xen PV doesn't go through startup_32|64(). It starts at
> xen_start_kernel (save for a small stub before that), which sets pvops.  It
> "joins" regular/baremetal code in
> i386_start_kernel/x86_64_start_reservation().

Ah, so one more reason not to do the CPUID thing for the early loader.

So it's up to you how to replace that paravirt_enabled() thing - CPUID
or test static_cpu_has(X86_FEATURE_XENPV)...
Boris Ostrovsky Feb. 8, 2016, 4:52 p.m. UTC | #5
On 02/08/2016 11:45 AM, Borislav Petkov wrote:
> On Mon, Feb 08, 2016 at 04:38:40PM +0000, Andrew Cooper wrote:
>> Does the early loader have extable support?  If so, this is fairly easy
>> to fix.  If not, we have a problem.
> It doesn't and regardless, you want to have this CPUID querying as
> simple as possible. No special handling, no special prefixes as it
> should be able to run on other hypervisors too.
>
> If one can't execute a simple CPUID(0x4...) on a xen guest and get the
> results back, then for early, we will have to do what we've done until
> now and simply emulate the MSR accesses.

I think xen_hypervisor check can be done in microcode_init() as this is 
first time PV kernel deals with microcode.

Let me try it --- I want to see what happens on hotplug and resume but I 
am reasonably certain this should work during boot.

-boris

>
> Later code can use then xen_cpuid() and all is fine. We should still get
> rid of paravirt_enabled() though.
>
Andrew Cooper Feb. 8, 2016, 4:53 p.m. UTC | #6
On 08/02/16 16:45, Borislav Petkov wrote:
> On Mon, Feb 08, 2016 at 04:38:40PM +0000, Andrew Cooper wrote:
>> Does the early loader have extable support?  If so, this is fairly easy
>> to fix.  If not, we have a problem.
> It doesn't and regardless, you want to have this CPUID querying as
> simple as possible. No special handling, no special prefixes as it
> should be able to run on other hypervisors too.
>
> If one can't execute a simple CPUID(0x4...) on a xen guest and get the
> results back, then for early, we will have to do what we've done until
> now and simply emulate the MSR accesses.
>
> Later code can use then xen_cpuid() and all is fine. We should still get
> rid of paravirt_enabled() though.
>

The force emulation prefix starts with a ud2a instruction, so extable is
to prevent it breaking on non-Xen systems.  However, if extable isn't
available, this point is moot.

As an alternative check which should be doable this early on, peeking in
the head of hypercall_page should work.  If Linux was booted as a PV
guest, the hypercall_page will have been constructed by the domain
builder, and won't have 0x90's in it.

~Andrew
Borislav Petkov Feb. 8, 2016, 5:13 p.m. UTC | #7
On Mon, Feb 08, 2016 at 04:53:00PM +0000, Andrew Cooper wrote:
> As an alternative check which should be doable this early on, peeking in
> the head of hypercall_page should work.  If Linux was booted as a PV
> guest, the hypercall_page will have been constructed by the domain
> builder, and won't have 0x90's in it.

Good to know, we might need it for something. :)

Thanks.
Boris Ostrovsky Feb. 8, 2016, 8:45 p.m. UTC | #8
On 02/08/2016 11:52 AM, Boris Ostrovsky wrote:
>
>
> On 02/08/2016 11:45 AM, Borislav Petkov wrote:
>> On Mon, Feb 08, 2016 at 04:38:40PM +0000, Andrew Cooper wrote:
>>> Does the early loader have extable support?  If so, this is fairly easy
>>> to fix.  If not, we have a problem.
>> It doesn't and regardless, you want to have this CPUID querying as
>> simple as possible. No special handling, no special prefixes as it
>> should be able to run on other hypervisors too.
>>
>> If one can't execute a simple CPUID(0x4...) on a xen guest and get the
>> results back, then for early, we will have to do what we've done until
>> now and simply emulate the MSR accesses.
>
> I think xen_hypervisor check can be done in microcode_init() as this 
> is first time PV kernel deals with microcode.
>
> Let me try it --- I want to see what happens on hotplug and resume but 
> I am reasonably certain this should work during boot.

So it looks like we can just simply revert a18a0f6850 because the very 
next patch to microcode code (fbae4ba8c4a) makes the original problem 
(of using __pa_nodebug, which we shouldn't use on PV) go away: we don't 
call load_ucode_ap from resume path anymore.

-boris
Borislav Petkov Feb. 8, 2016, 9:06 p.m. UTC | #9
On Mon, Feb 08, 2016 at 03:45:21PM -0500, Boris Ostrovsky wrote:
> So it looks like we can just simply revert a18a0f6850 because the very next
> patch to microcode code (fbae4ba8c4a) makes the original problem (of using
> __pa_nodebug, which we shouldn't use on PV) go away: we don't call
> load_ucode_ap from resume path anymore.

Even better.

Care to cut a proper patch explaining why we don't need the
paravirt_enabled() check anymore, and prep it ontop of

http://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/log/?h=tip-microcode

?

I'll queue it for 4.6.

Thanks!
Luis Chamberlain Feb. 9, 2016, 6:22 a.m. UTC | #10
On Mon, Feb 08, 2016 at 04:53:00PM +0000, Andrew Cooper wrote:
> On 08/02/16 16:45, Borislav Petkov wrote:
> > On Mon, Feb 08, 2016 at 04:38:40PM +0000, Andrew Cooper wrote:
> >> Does the early loader have extable support?  If so, this is fairly easy
> >> to fix.  If not, we have a problem.
> > It doesn't and regardless, you want to have this CPUID querying as
> > simple as possible. No special handling, no special prefixes as it
> > should be able to run on other hypervisors too.
> >
> > If one can't execute a simple CPUID(0x4...) on a xen guest and get the
> > results back, then for early, we will have to do what we've done until
> > now and simply emulate the MSR accesses.
> >
> > Later code can use then xen_cpuid() and all is fine. We should still get
> > rid of paravirt_enabled() though.
> >
> 
> The force emulation prefix starts with a ud2a instruction, so extable is
> to prevent it breaking on non-Xen systems.  However, if extable isn't
> available, this point is moot.
> 
> As an alternative check which should be doable this early on, peeking in
> the head of hypercall_page should work.  If Linux was booted as a PV
> guest, the hypercall_page will have been constructed by the domain
> builder, and won't have 0x90's in it.

Most of the paravirt_enabled() checks can be replaced with a hardware_subarch
check once we modify the xen_start_kernel() to add that, today XEN is unused
even though it was added eons ago. Part of my work was to remove as many
paravirt_enabled() checks. I'll reply to Andy's original e-mail now indicating
which ones I could address, it sounds like with this nugget and some other
work we might be able to address all.

I should note, in the future the check for subarch would be an explicit part of
the x86 early init init routines to avoid further issues but only in between
x86_64_start_reservations() and setup_arch(). How *early* can such a hypercall
_page check be *safely* be called? I say safely here as if we're not on Xen are
we OK to muck around and check the same address space?

Provided we use the subarch for PV to remove a lot of the paravirt_enabled()
checks, is HVMLite still OK if it ends up using  PC subarch and there not being
a paravirt_enabled() anyamore? Boris O's HVMLite series added paravirt_enabled
= 1 for the new HVMLite.

  Luis
diff mbox

Patch

diff --git a/arch/x86/include/asm/hypervisor.h b/arch/x86/include/asm/hypervisor.h
index 055ea9941dd5..f3e563e8b5c3 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -56,9 +56,11 @@  extern const struct hypervisor_x86 x86_hyper_kvm;
 extern void init_hypervisor(struct cpuinfo_x86 *c);
 extern void init_hypervisor_platform(void);
 extern bool hypervisor_x2apic_available(void);
+bool is_xen_hypervisor(void);
 #else
 static inline void init_hypervisor(struct cpuinfo_x86 *c) { }
 static inline void init_hypervisor_platform(void) { }
 static inline bool hypervisor_x2apic_available(void) { return false; }
+static inline bool is_xen_hypervisor(void) { return false; }
 #endif /* CONFIG_HYPERVISOR_GUEST */
 #endif /* _ASM_X86_HYPERVISOR_H */
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index d820d8eae96b..bda29017f946 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -85,3 +85,24 @@  bool __init hypervisor_x2apic_available(void)
 	       x86_hyper->x2apic_available &&
 	       x86_hyper->x2apic_available();
 }
+
+bool is_xen_hypervisor(void)
+{
+	u32 eax, ebx, ecx, edx;
+
+	eax = 0x4000000;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+
+	if (ebx == 0x566e6558 && ecx == 0x65584d4d && edx == 0x4d4d566e)
+		return true;
+
+	eax = 0x40000100;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+
+	if (ebx == 0x566e6558 && ecx == 0x65584d4d && edx == 0x4d4d566e)
+		return true;
+
+	return false;
+}
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index cea8552e2b3a..0a941ff8095c 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -35,6 +35,7 @@ 
 #include <asm/microcode_intel.h>
 #include <asm/cpu_device_id.h>
 #include <asm/microcode_amd.h>
+#include <asm/hypervisor.h>
 #include <asm/perf_event.h>
 #include <asm/microcode.h>
 #include <asm/processor.h>
@@ -86,7 +87,7 @@  static bool __init check_loader_disabled_bsp(void)
 	bool *res = &dis_ucode_ldr;
 #endif
 
-	if (cmdline_find_option_bool(cmdline, option))
+	if (cmdline_find_option_bool(cmdline, option) || is_xen_hypervisor())
 		*res = true;
 
 	return *res;