[3/3] x86/xen: use guest_late_init to detect Xen PVH guest

Message ID 20171108090739.26491-4-jgross@suse.com (mailing list archive)
State New, archived

Commit Message

Jürgen Groß Nov. 8, 2017, 9:07 a.m. UTC
In case we are booted via the default boot entry by a generic loader
like grub or OVMF it is necessary to distinguish between a HVM guest
with a device model supporting legacy devices and a PVH guest without
device model.

PVH guests will always have x86_platform.legacy.no_vga set and
x86_platform.legacy.rtc cleared, while both won't be true for HVM
guests.

Test for both conditions in the guest_late_init hook and set xen_pvh
to true if they are met.

Move some of the early PVH initializations to the new hook in order
to avoid duplicated code.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 arch/x86/xen/enlighten_hvm.c | 24 ++++++++++++++++++++++--
 arch/x86/xen/enlighten_pvh.c |  9 ---------
 2 files changed, 22 insertions(+), 11 deletions(-)
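
The detection rule described in the commit message can be sketched as a small userspace model. This is only an illustration, not the kernel code: `struct legacy_features`, `detect_pvh()` and `booted_via_pvh_entry` are names made up for this sketch; they stand in for `x86_platform.legacy` and the `xen_pvh` flag that the PVH boot path is assumed to have set already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for the two x86_platform.legacy fields the patch tests. */
struct legacy_features {
	bool rtc;	/* true if a legacy RTC is present */
	bool no_vga;	/* true if there is no legacy VGA */
};

/*
 * Returns true if the guest should be treated as PVH.
 * booted_via_pvh_entry models xen_pvh having been set by the PVH boot
 * path, which overrides whatever the legacy flags say.
 */
bool detect_pvh(bool booted_via_pvh_entry, const struct legacy_features *legacy)
{
	assert(legacy != NULL);

	/* Same condition as in the patch: an RTC or a VGA means "not PVH". */
	if (!booted_via_pvh_entry && (legacy->rtc || !legacy->no_vga))
		return false;

	return true;
}
```

A guest entered through the default boot entry is classified purely by the two flags, while a guest that came through the PVH entry point stays PVH regardless of them.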

Comments

Jan Beulich Nov. 8, 2017, 11:18 a.m. UTC | #1
>>> On 08.11.17 at 10:07, <jgross@suse.com> wrote:
> In case we are booted via the default boot entry by a generic loader
> like grub or OVMF it is necessary to distinguish between a HVM guest
> with a device model supporting legacy devices and a PVH guest without
> device model.
> 
> PVH guests will always have x86_platform.legacy.no_vga set and
> x86_platform.legacy.rtc cleared, while both won't be true for HVM
> guests.
> 
> Test for both conditions in the guest_late_init hook and set xen_pvh
> to true if they are met.

This sounds pretty fragile to me: I can't see a reason why a proper
HVM guest couldn't come without VGA and RTC. That's not possible
today, agreed, but certainly an option down the road if virtualization
follows bare metal's road towards being legacy free.

Jan
Jürgen Groß Nov. 8, 2017, 11:55 a.m. UTC | #2
On 08/11/17 12:18, Jan Beulich wrote:
>>>> On 08.11.17 at 10:07, <jgross@suse.com> wrote:
>> In case we are booted via the default boot entry by a generic loader
>> like grub or OVMF it is necessary to distinguish between a HVM guest
>> with a device model supporting legacy devices and a PVH guest without
>> device model.
>>
>> PVH guests will always have x86_platform.legacy.no_vga set and
>> x86_platform.legacy.rtc cleared, while both won't be true for HVM
>> guests.
>>
>> Test for both conditions in the guest_late_init hook and set xen_pvh
>> to true if they are met.
> 
> This sounds pretty fragile to me: I can't see a reason why a proper
> HVM guest couldn't come without VGA and RTC. That's not possible
> today, agreed, but certainly an option down the road if virtualization
> follows bare metal's road towards being legacy free.

From guest's perspective: what is the difference between a legacy free
HVM domain and PVH? In the end the need for differentiating is to avoid
access to legacy features in PVH as those would require a device model.


Juergen
Paolo Bonzini Nov. 8, 2017, 12:03 p.m. UTC | #3
On 08/11/2017 12:55, Juergen Gross wrote:
> On 08/11/17 12:18, Jan Beulich wrote:
>>>>> On 08.11.17 at 10:07, <jgross@suse.com> wrote:
>>> In case we are booted via the default boot entry by a generic loader
>>> like grub or OVMF it is necessary to distinguish between a HVM guest
>>> with a device model supporting legacy devices and a PVH guest without
>>> device model.
>>>
>>> PVH guests will always have x86_platform.legacy.no_vga set and
>>> x86_platform.legacy.rtc cleared, while both won't be true for HVM
>>> guests.
>>>
>>> Test for both conditions in the guest_late_init hook and set xen_pvh
>>> to true if they are met.
>>
>> This sounds pretty fragile to me: I can't see a reason why a proper
>> HVM guest couldn't come without VGA and RTC. That's not possible
>> today, agreed, but certainly an option down the road if virtualization
>> follows bare metal's road towards being legacy free.
> 
> From guest's perspective: what is the difference between a legacy free
> HVM domain and PVH? In the end the need for differentiating is to avoid
> access to legacy features in PVH as those would require a device model.

My understanding of Xen is very rusty at this point, but I think a
"completely" legacy-free HVM domain will still have a PCI bus and the
Xen platform device on that bus.

A PVH domain just knows how to access the Xen PV features.

Paolo
Jürgen Groß Nov. 8, 2017, 12:24 p.m. UTC | #4
On 08/11/17 13:03, Paolo Bonzini wrote:
> On 08/11/2017 12:55, Juergen Gross wrote:
>> On 08/11/17 12:18, Jan Beulich wrote:
>>>>>> On 08.11.17 at 10:07, <jgross@suse.com> wrote:
>>>> In case we are booted via the default boot entry by a generic loader
>>>> like grub or OVMF it is necessary to distinguish between a HVM guest
>>>> with a device model supporting legacy devices and a PVH guest without
>>>> device model.
>>>>
>>>> PVH guests will always have x86_platform.legacy.no_vga set and
>>>> x86_platform.legacy.rtc cleared, while both won't be true for HVM
>>>> guests.
>>>>
>>>> Test for both conditions in the guest_late_init hook and set xen_pvh
>>>> to true if they are met.
>>>
>>> This sounds pretty fragile to me: I can't see a reason why a proper
>>> HVM guest couldn't come without VGA and RTC. That's not possible
>>> today, agreed, but certainly an option down the road if virtualization
>>> follows bare metal's road towards being legacy free.
>>
>> From guest's perspective: what is the difference between a legacy free
>> HVM domain and PVH? In the end the need for differentiating is to avoid
>> access to legacy features in PVH as those would require a device model.
> 
> My understanding of Xen is very rusty at this point, but I think a
> "completely" legacy-free HVM domain will still have a PCI bus and the
> Xen platform device on that bus.
> 
> A PVH domain just knows how to access the Xen PV features.

A HVM domain does so, too. Today maybe only partially, but e.g. event
channels work in a HVM domain even without the Xen platform device.
Grant tables can be made working without the platform device, too,
and I'm already preparing a patch to do exactly that.

The only need for the platform device will then be to provide an
interface for unplugging emulated boot devices in favor of the pv
devices. Without the platform device we can just skip that step
without doing any harm, as this can happen only in two cases: for PVH,
where we have no need to do the unplug, or for HVM explicitly
configured to work without the platform device, which has to continue
using the emulated devices as it does today.


Juergen
Paolo Bonzini Nov. 8, 2017, 12:26 p.m. UTC | #5
On 08/11/2017 13:24, Juergen Gross wrote:
>> My understanding of Xen is very rusty at this point, but I think a
>> "completely" legacy-free HVM domain will still have a PCI bus and the
>> Xen platform device on that bus.
>>
>> A PVH domain just knows how to access the Xen PV features.
>
> A HVM domain does so, too. Today maybe only partially, but e.g. event
> channels work in a HVM domain even without the Xen platform device.
> Grant tables can be made working without the platform device, too,
> and I'm already preparing a patch to do exactly that.

What about assigned PCI devices?  I think they are not PV pcifront for
HVM.  So the main difference in the end is the PCI bus.

Paolo
Jan Beulich Nov. 8, 2017, 12:31 p.m. UTC | #6
>>> On 08.11.17 at 12:55, <jgross@suse.com> wrote:
> On 08/11/17 12:18, Jan Beulich wrote:
>>>>> On 08.11.17 at 10:07, <jgross@suse.com> wrote:
>>> In case we are booted via the default boot entry by a generic loader
>>> like grub or OVMF it is necessary to distinguish between a HVM guest
>>> with a device model supporting legacy devices and a PVH guest without
>>> device model.
>>>
>>> PVH guests will always have x86_platform.legacy.no_vga set and
>>> x86_platform.legacy.rtc cleared, while both won't be true for HVM
>>> guests.
>>>
>>> Test for both conditions in the guest_late_init hook and set xen_pvh
>>> to true if they are met.
>> 
>> This sounds pretty fragile to me: I can't see a reason why a proper
>> HVM guest couldn't come without VGA and RTC. That's not possible
>> today, agreed, but certainly an option down the road if virtualization
>> follows bare metal's road towards being legacy free.
> 
> From guest's perspective: what is the difference between a legacy free
> HVM domain and PVH? In the end the need for differentiating is to avoid
> access to legacy features in PVH as those would require a device model.

My point is that "legacy free" would likely be reached over time (and
even once fully reached, hybrid configurations would be possible).
I.e. there could be a setup with PIC, but with neither VGA nor RTC.
That's still not PVH then. Nor do all legacy features require a device
model in the first place - some of them are being emulated entirely
in the hypervisor.

Furthermore, PVH absolutely requires guest awareness afaict, while
legacy-free pure HVM guests (with an OS only aware of the possible
absence of legacy devices) would still be possible.

Jan
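
The fragility concern raised above can be made concrete with a toy model. Everything here is hypothetical: `struct guest_config` and both functions are invented for illustration, and the hybrid configuration (PIC present, no VGA, no RTC) is the one described above as not possible today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a guest; not a kernel or Xen structure. */
struct guest_config {
	bool has_pic;	/* legacy PIC emulated */
	bool has_rtc;	/* legacy RTC emulated */
	bool has_vga;	/* legacy VGA emulated */
	bool is_pvh;	/* ground truth: guest really is PVH */
};

/* The heuristic under discussion: no RTC and no VGA is taken to mean PVH. */
bool heuristic_says_pvh(const struct guest_config *g)
{
	assert(g != NULL);
	return !g->has_rtc && !g->has_vga;
}

/* True if the heuristic disagrees with the ground truth. */
bool heuristic_misclassifies(const struct guest_config *g)
{
	return heuristic_says_pvh(g) != g->is_pvh;
}
```

With today's configurations the heuristic agrees with the ground truth; it only goes wrong for a partially legacy-free HVM guest such as the PIC-only setup.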
Jürgen Groß Nov. 8, 2017, 12:40 p.m. UTC | #7
On 08/11/17 13:26, Paolo Bonzini wrote:
> On 08/11/2017 13:24, Juergen Gross wrote:
>>> My understanding of Xen is very rusty at this point, but I think a
>>> "completely" legacy-free HVM domain will still have a PCI bus and the
>>> Xen platform device on that bus.
>>>
>>> A PVH domain just knows how to access the Xen PV features.
>>
>> A HVM domain does so, too. Today maybe only partially, but e.g. event
>> channels work in a HVM domain even without the Xen platform device.
>> Grant tables can be made working without the platform device, too,
>> and I'm already preparing a patch to do exactly that.
> 
> What about assigned PCI devices?  I think they are not PV pcifront for
> HVM.  So the main difference in the end is the PCI bus.

Sure, but this is easily detectable, isn't it?


Juergen
Jürgen Groß Nov. 8, 2017, 12:45 p.m. UTC | #8
On 08/11/17 13:31, Jan Beulich wrote:
>>>> On 08.11.17 at 12:55, <jgross@suse.com> wrote:
>> On 08/11/17 12:18, Jan Beulich wrote:
>>>>>> On 08.11.17 at 10:07, <jgross@suse.com> wrote:
>>>> In case we are booted via the default boot entry by a generic loader
>>>> like grub or OVMF it is necessary to distinguish between a HVM guest
>>>> with a device model supporting legacy devices and a PVH guest without
>>>> device model.
>>>>
>>>> PVH guests will always have x86_platform.legacy.no_vga set and
>>>> x86_platform.legacy.rtc cleared, while both won't be true for HVM
>>>> guests.
>>>>
>>>> Test for both conditions in the guest_late_init hook and set xen_pvh
>>>> to true if they are met.
>>>
>>> This sounds pretty fragile to me: I can't see a reason why a proper
>>> HVM guest couldn't come without VGA and RTC. That's not possible
>>> today, agreed, but certainly an option down the road if virtualization
>>> follows bare metal's road towards being legacy free.
>>
>> From guest's perspective: what is the difference between a legacy free
>> HVM domain and PVH? In the end the need for differentiating is to avoid
>> access to legacy features in PVH as those would require a device model.
> 
> My point is that "legacy free" would likely be reached over time (and
> even once fully reached, hybrid configurations would be possible).
> I.e. there could be a setup with PIC, but with neither VGA nor RTC.
> That's still not PVH then. Nor do all legacy features require a device
> model in the first place - some of them are being emulated entirely
> in the hypervisor.
> 
> Furthermore, PVH absolutely requires guest awareness afaict, while
> legacy-free pure HVM guests (with an OS only aware of the possible
> absence of legacy devices) would still be possible.

Hmm, where else do you expect PVH awareness to be required? Maybe for
vcpu hotplugging, but this could easily be solved by adding a Xenstore
entry containing the required information. Is there any other problem to
be expected before Xenstore access is possible?


Juergen
Jan Beulich Nov. 8, 2017, 12:58 p.m. UTC | #9
>>> On 08.11.17 at 13:45, <jgross@suse.com> wrote:
> On 08/11/17 13:31, Jan Beulich wrote:
>>>>> On 08.11.17 at 12:55, <jgross@suse.com> wrote:
>>> On 08/11/17 12:18, Jan Beulich wrote:
>>>>>>> On 08.11.17 at 10:07, <jgross@suse.com> wrote:
>>>>> In case we are booted via the default boot entry by a generic loader
>>>>> like grub or OVMF it is necessary to distinguish between a HVM guest
>>>>> with a device model supporting legacy devices and a PVH guest without
>>>>> device model.
>>>>>
>>>>> PVH guests will always have x86_platform.legacy.no_vga set and
>>>>> x86_platform.legacy.rtc cleared, while both won't be true for HVM
>>>>> guests.
>>>>>
>>>>> Test for both conditions in the guest_late_init hook and set xen_pvh
>>>>> to true if they are met.
>>>>
>>>> This sounds pretty fragile to me: I can't see a reason why a proper
>>>> HVM guest couldn't come without VGA and RTC. That's not possible
>>>> today, agreed, but certainly an option down the road if virtualization
>>>> follows bare metal's road towards being legacy free.
>>>
>>> From guest's perspective: what is the difference between a legacy free
>>> HVM domain and PVH? In the end the need for differentiating is to avoid
>>> access to legacy features in PVH as those would require a device model.
>> 
>> My point is that "legacy free" would likely be reached over time (and
>> even once fully reached, hybrid configurations would be possible).
>> I.e. there could be a setup with PIC, but with neither VGA nor RTC.
>> That's still not PVH then. Nor do all legacy features require a device
>> model in the first place - some of them are being emulated entirely
>> in the hypervisor.
>> 
>> Furthermore, PVH absolutely requires guest awareness afaict, while
>> legacy-free pure HVM guests (with an OS only aware of the possible
>> absence of legacy devices) would still be possible.
> 
> Hmm, where else do you expect PVH awareness to be required? Maybe for
> vcpu hotplugging, but this could easily be solved by adding a Xenstore
> entry containing the required information. Is there any other problem to
> be expected before Xenstore access is possible?

Let me ask the question the other way around: What's all the PVH
specific code for under arch/x86/xen/ if there's no difference? One
thing I seem to remember is that getting hold of the ACPI tables
is different between PVH and HVM. Iirc the distinct PVH entry point
is (in part) for that purpose. In the end - with that separate entry
point - it is not really clear to me why any "detection" needs to be
done in the first place: You'd know which mode you're in by knowing
which entry point path you've taken.

Jan
Jürgen Groß Nov. 8, 2017, 1:36 p.m. UTC | #10
On 08/11/17 13:58, Jan Beulich wrote:
>>>> On 08.11.17 at 13:45, <jgross@suse.com> wrote:
>> On 08/11/17 13:31, Jan Beulich wrote:
>>>>>> On 08.11.17 at 12:55, <jgross@suse.com> wrote:
>>>> On 08/11/17 12:18, Jan Beulich wrote:
>>>>>>>> On 08.11.17 at 10:07, <jgross@suse.com> wrote:
>>>>>> In case we are booted via the default boot entry by a generic loader
>>>>>> like grub or OVMF it is necessary to distinguish between a HVM guest
>>>>>> with a device model supporting legacy devices and a PVH guest without
>>>>>> device model.
>>>>>>
>>>>>> PVH guests will always have x86_platform.legacy.no_vga set and
>>>>>> x86_platform.legacy.rtc cleared, while both won't be true for HVM
>>>>>> guests.
>>>>>>
>>>>>> Test for both conditions in the guest_late_init hook and set xen_pvh
>>>>>> to true if they are met.
>>>>>
>>>>> This sounds pretty fragile to me: I can't see a reason why a proper
>>>>> HVM guest couldn't come without VGA and RTC. That's not possible
>>>>> today, agreed, but certainly an option down the road if virtualization
>>>>> follows bare metal's road towards being legacy free.
>>>>
>>>> From guest's perspective: what is the difference between a legacy free
>>>> HVM domain and PVH? In the end the need for differentiating is to avoid
>>>> access to legacy features in PVH as those would require a device model.
>>>
>>> My point is that "legacy free" would likely be reached over time (and
>>> even once fully reached, hybrid configurations would be possible).
>>> I.e. there could be a setup with PIC, but with neither VGA nor RTC.
>>> That's still not PVH then. Nor do all legacy features require a device
>>> model in the first place - some of them are being emulated entirely
>>> in the hypervisor.
>>>
>>> Furthermore, PVH absolutely requires guest awareness afaict, while
>>> legacy-free pure HVM guests (with an OS only aware of the possible
>>> absence of legacy devices) would still be possible.
>>
>> Hmm, where else do you expect PVH awareness to be required? Maybe for
>> vcpu hotplugging, but this could easily be solved by adding a Xenstore
>> entry containing the required information. Is there any other problem to
>> be expected before Xenstore access is possible?
> 
> Let me ask the question the other way around: What's all the PVH
> specific code for under arch/x86/xen/ if there's no difference? One

Most of it is for early boot when coming through the PVH specific
boot entry.

> thing I seem to remember is that getting hold of the ACPI tables
> is different between PVH and HVM. Iirc the distinct PVH entry point
> is (in part) for that purpose. In the end - with that separate entry
> point - it is not really clear to me why any "detection" needs to be
> done in the first place: You'd know which mode you're in by knowing
> which entry point path you've taken.

It's all in the commit message: I am trying to enable a boot loader to
use the default kernel boot entry for PVH. This will reduce the needed
modifications in the loader.

Regarding ACPI tables: current PVH implementation in Linux kernel
seems not to make use of the special information presented in the boot
information block.


Juergen
Boris Ostrovsky Nov. 8, 2017, 2:10 p.m. UTC | #11
On 11/08/2017 08:36 AM, Juergen Gross wrote:
>
> Regarding ACPI tables: current PVH implementation in Linux kernel
> seems not to make use of the special information presented in the boot
> information block.

It will need to do so for dom0 (and, then, for simplicity, for all PVH
guests).

-boris
Jürgen Groß Nov. 8, 2017, 2:17 p.m. UTC | #12
On 08/11/17 15:10, Boris Ostrovsky wrote:
> On 11/08/2017 08:36 AM, Juergen Gross wrote:
>>
>> Regarding ACPI tables: current PVH implementation in Linux kernel
>> seems not to make use of the special information presented in the boot
>> information block.
> 
> It will need to do so for dom0 (and, then, for simplicity, for all PVH
> guests).

What about: for all PVH guests booted via the PVH specific boot entry?
A guest booted via the default boot entry won't know it is PVH until
ACPI tables have been scanned.


Juergen
Boris Ostrovsky Nov. 8, 2017, 2:24 p.m. UTC | #13
On 11/08/2017 09:17 AM, Juergen Gross wrote:
> On 08/11/17 15:10, Boris Ostrovsky wrote:
>> On 11/08/2017 08:36 AM, Juergen Gross wrote:
>>> Regarding ACPI tables: current PVH implementation in Linux kernel
>>> seems not to make use of the special information presented in the boot
>>> information block.
>> It will need to do so for dom0 (and, then, for simplicity, for all PVH
>> guests).
> What about: for all PVH guests booted via the PVH specific boot entry?
> A guest booted via the default boot entry won't know it is PVH until
> ACPI tables have been scanned.

Right. A guest booted from the default entry will have to discover the
RSDP in the usual way.

-boris
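
For readers unfamiliar with RSDP discovery, here is a minimal userspace sketch of the signature scan used on legacy BIOS systems, assuming that is what "the usual way" refers to here (with UEFI firmware such as OVMF the RSDP address comes from the EFI configuration table instead). `find_rsdp()` and `rsdp_checksum_ok()` are names invented for this sketch and are not kernel functions; the scan works on an ordinary buffer rather than on the real BIOS memory areas.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RSDP_SIG	"RSD PTR "	/* 8 bytes, note the trailing space */
#define RSDP_SIG_LEN	8
#define RSDP_V1_LEN	20		/* bytes covered by the ACPI 1.0 checksum */
#define RSDP_ALIGN	16		/* RSDP sits on a 16-byte boundary */

/* The first 20 bytes of a valid RSDP sum to zero modulo 256. */
static bool rsdp_checksum_ok(const uint8_t *p)
{
	uint8_t sum = 0;

	for (size_t i = 0; i < RSDP_V1_LEN; i++)
		sum = (uint8_t)(sum + p[i]);

	return sum == 0;
}

/*
 * Scan mem[0..len) on 16-byte boundaries for a structure carrying the
 * RSDP signature and a valid checksum. Returns a pointer to it, or NULL.
 */
const uint8_t *find_rsdp(const uint8_t *mem, size_t len)
{
	assert(mem != NULL);

	for (size_t off = 0; off + RSDP_V1_LEN <= len; off += RSDP_ALIGN) {
		if (memcmp(mem + off, RSDP_SIG, RSDP_SIG_LEN) == 0 &&
		    rsdp_checksum_ok(mem + off))
			return mem + off;
	}

	return NULL;
}
```

On real hardware the regions to scan are the first 1 KiB of the EBDA and the BIOS area from 0xE0000 to 0xFFFFF; the sketch leaves mapping those regions out.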

Patch

diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index de503c225ae1..d7d68a39073a 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -1,3 +1,4 @@ 
+#include <linux/acpi.h>
 #include <linux/cpu.h>
 #include <linux/kexec.h>
 #include <linux/memblock.h>
@@ -188,8 +189,6 @@  static void __init xen_hvm_guest_init(void)
 	xen_hvm_init_time_ops();
 	xen_hvm_init_mmu_ops();
 
-	if (xen_pvh_domain())
-		machine_ops.emergency_restart = xen_emergency_restart;
 #ifdef CONFIG_KEXEC_CORE
 	machine_ops.shutdown = xen_hvm_shutdown;
 	machine_ops.crash_shutdown = xen_hvm_crash_shutdown;
@@ -226,6 +225,26 @@  static uint32_t __init xen_platform_hvm(void)
 	return xen_cpuid_base();
 }
 
+static __init void xen_hvm_guest_late_init(void)
+{
+#ifdef CONFIG_XEN_PVH
+	/* Test for PVH domain (PVH boot path taken overrides ACPI flags). */
+	if (!xen_pvh &&
+	    (x86_platform.legacy.rtc || !x86_platform.legacy.no_vga))
+		return;
+
+	/* PVH detected. */
+	xen_pvh = true;
+
+	/* Make sure we don't fall back to (default) ACPI_IRQ_MODEL_PIC. */
+	if (!nr_ioapics && acpi_irq_model == ACPI_IRQ_MODEL_PIC)
+		acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
+
+	machine_ops.emergency_restart = xen_emergency_restart;
+	pv_info.name = "Xen PVH";
+#endif
+}
+
 const struct hypervisor_x86 x86_hyper_xen_hvm = {
 	.name                   = "Xen HVM",
 	.detect                 = xen_platform_hvm,
@@ -233,5 +252,6 @@  const struct hypervisor_x86 x86_hyper_xen_hvm = {
 	.pin_vcpu               = xen_pin_vcpu,
 	.x2apic_available       = xen_x2apic_para_available,
 	.init_mem_mapping	= xen_hvm_init_mem_mapping,
+	.guest_late_init	= xen_hvm_guest_late_init,
 };
 EXPORT_SYMBOL(x86_hyper_xen_hvm);
diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
index 7bd3ee08393e..436c4f003e17 100644
--- a/arch/x86/xen/enlighten_pvh.c
+++ b/arch/x86/xen/enlighten_pvh.c
@@ -25,13 +25,6 @@  struct boot_params pvh_bootparams __attribute__((section(".data")));
 struct hvm_start_info pvh_start_info;
 unsigned int pvh_start_info_sz = sizeof(pvh_start_info);
 
-static void xen_pvh_arch_setup(void)
-{
-	/* Make sure we don't fall back to (default) ACPI_IRQ_MODEL_PIC. */
-	if (nr_ioapics == 0)
-		acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
-}
-
 static void __init init_pvh_bootparams(void)
 {
 	struct xen_memory_map memmap;
@@ -102,6 +95,4 @@  void __init xen_prepare_pvh(void)
 	wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32));
 
 	init_pvh_bootparams();
-
-	x86_init.oem.arch_setup = xen_pvh_arch_setup;
 }