diff mbox series

[v8,1/7] xen/pvh: Split CONFIG_XEN_PVH into CONFIG_PVH and CONFIG_XEN_PVH

Message ID 1544076257-21792-1-git-send-email-maran.wilson@oracle.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: Allow Qemu/KVM to use PVH entry point | expand

Commit Message

Maran Wilson Dec. 6, 2018, 6:04 a.m. UTC
In order to pave the way for hypervisors other than Xen to use the PVH
entry point for VMs, we need to factor the PVH entry code into Xen specific
and hypervisor agnostic components. The first step in doing that, is to
create a new config option for PVH entry that can be enabled
independently from CONFIG_XEN.

Signed-off-by: Maran Wilson <maran.wilson@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 arch/x86/Kconfig          | 6 ++++++
 arch/x86/kernel/head_64.S | 2 +-
 arch/x86/xen/Kconfig      | 3 ++-
 3 files changed, 9 insertions(+), 2 deletions(-)

Comments

Paolo Bonzini Dec. 6, 2018, 10:11 p.m. UTC | #1
On 06/12/18 07:04, Maran Wilson wrote:
> +config PVH
> +	bool "Support for running PVH guests"
> +	---help---
> +	  This option enables the PVH entry point for guest virtual machines
> +	  as specified in the x86/HVM direct boot ABI.
> +

IIUC this breaks "normal" bzImage boot, so we should have something like

	The resulting kernel will not boot with most x86 boot loaders
	such as GRUB or SYSLINUX.  Unless you plan to start the kernel
	using QEMU or Xen, you probably want to say N here.

and also

	depends on !EFI

because even though in principle it would be possible to write a PVH
loader for UEFI, PVH's start info does not support the EFI handover
protocol.

Paolo
Boris Ostrovsky Dec. 6, 2018, 10:34 p.m. UTC | #2
On 12/6/18 5:11 PM, Paolo Bonzini wrote:
> On 06/12/18 07:04, Maran Wilson wrote:
>> +config PVH
>> +	bool "Support for running PVH guests"
>> +	---help---
>> +	  This option enables the PVH entry point for guest virtual machines
>> +	  as specified in the x86/HVM direct boot ABI.
>> +
> IIUC this breaks "normal" bzImage boot, so we should have something like
>
> 	The resulting kernel will not boot with most x86 boot loaders
> 	such as GRUB 


Grub support for PVH guests (for Xen) is well under way.


> or SYSLINUX.  Unless you plan to start the kernel
> 	using QEMU or Xen, you probably want to say N here.

I think PVH should not be user-selectable at all. It should be selected
by either XEN_PVH or KVM_GUEST_PVH (which you suggested to drop).

>
> and also
>
> 	depends on !EFI
>
> because even though in principle it would be possible to write a PVH
> loader for UEFI, PVH's start info does not support the EFI handover
> protocol.

But we should be able to build the binary with both EFI and PVH?

-boris
Paolo Bonzini Dec. 6, 2018, 10:49 p.m. UTC | #3
On 06/12/18 23:34, Boris Ostrovsky wrote:
> On 12/6/18 5:11 PM, Paolo Bonzini wrote:
>> On 06/12/18 07:04, Maran Wilson wrote:
>>> +config PVH
>>> +	bool "Support for running PVH guests"
>>> +	---help---
>>> +	  This option enables the PVH entry point for guest virtual machines
>>> +	  as specified in the x86/HVM direct boot ABI.
>>> +
>> IIUC this breaks "normal" bzImage boot, so we should have something like
>>
>> 	The resulting kernel will not boot with most x86 boot loaders
>> 	such as GRUB 
> 
> 
> Grub support for PVH guests (for Xen) is well under way.

Oh, nice. :)

>> or SYSLINUX.  Unless you plan to start the kernel
>> 	using QEMU or Xen, you probably want to say N here.
> 
> I think PVH should not be user-selectable at all. It should be selected
> by either XEN_PVH or KVM_GUEST_PVH (which you suggested to drop).

KVM_GUEST_PVH is not entirely accurate because it's not just for KVM (it
can be used with QEMU and Apple's Hypervisor.framework for example).

It's also not necessarily just for QEMU (it could be implemented for
kvmtool if desired), but as long as it's in the help I guess it's
acceptable.

I think we could just drop the sentence about boot loaders from my
suggestion.

>>
>> and also
>>
>> 	depends on !EFI
>>
>> because even though in principle it would be possible to write a PVH
>> loader for UEFI, PVH's start info does not support the EFI handover
>> protocol.
> 
> But we should be able to build the binary with both EFI and PVH?

Can you?  It's a completely different binary format, the EFI handover
protocol is invoked via a special entry point and needs the Linux header
format, not ELF.

Paolo
Boris Ostrovsky Dec. 6, 2018, 11:11 p.m. UTC | #4
On 12/6/18 5:49 PM, Paolo Bonzini wrote:
> On 06/12/18 23:34, Boris Ostrovsky wrote:
>> On 12/6/18 5:11 PM, Paolo Bonzini wrote:
>>
>>> and also
>>>
>>> 	depends on !EFI
>>>
>>> because even though in principle it would be possible to write a PVH
>>> loader for UEFI, PVH's start info does not support the EFI handover
>>> protocol.
>> But we should be able to build the binary with both EFI and PVH?
> Can you?  It's a completely different binary format, the EFI handover
> protocol is invoked via a special entry point and needs the Linux header
> format, not ELF.

Right, but I think it is desirable to be able to build both from the
same config file.

-boris
Paolo Bonzini Dec. 6, 2018, 11:30 p.m. UTC | #5
On 07/12/18 00:11, Boris Ostrovsky wrote:
> On 12/6/18 5:49 PM, Paolo Bonzini wrote:
>> On 06/12/18 23:34, Boris Ostrovsky wrote:
>>> On 12/6/18 5:11 PM, Paolo Bonzini wrote:
>>>
>>>> and also
>>>>
>>>> 	depends on !EFI
>>>>
>>>> because even though in principle it would be possible to write a PVH
>>>> loader for UEFI, PVH's start info does not support the EFI handover
>>>> protocol.
>>> But we should be able to build the binary with both EFI and PVH?
>> Can you?  It's a completely different binary format, the EFI handover
>> protocol is invoked via a special entry point and needs the Linux header
>> format, not ELF.
> 
> Right, but I think it is desirable to be able to build both from the
> same config file.

Ah, "make bzImage" and use the vmlinux for PVH, because PVH fetches the
entry point from the special note.  That's clever. :)

I don't see why it should not work, and if so the "depends on !EFI" is
indeed unnecessary.

Paolo
Andrew Cooper Dec. 6, 2018, 11:36 p.m. UTC | #6
On 06/12/2018 23:30, Paolo Bonzini wrote:
> On 07/12/18 00:11, Boris Ostrovsky wrote:
>> On 12/6/18 5:49 PM, Paolo Bonzini wrote:
>>> On 06/12/18 23:34, Boris Ostrovsky wrote:
>>>> On 12/6/18 5:11 PM, Paolo Bonzini wrote:
>>>>
>>>>> and also
>>>>>
>>>>> 	depends on !EFI
>>>>>
>>>>> because even though in principle it would be possible to write a PVH
>>>>> loader for UEFI, PVH's start info does not support the EFI handover
>>>>> protocol.
>>>> But we should be able to build the binary with both EFI and PVH?
>>> Can you?  It's a completely different binary format, the EFI handover
>>> protocol is invoked via a special entry point and needs the Linux header
>>> format, not ELF.
>> Right, but I think it is desirable to be able to build both from the
>> same config file.
> Ah, "make bzImage" and use the vmlinux for PVH, because PVH fetches the
> entry point from the special note.  That's clever. :)
>
> I don't see why it should not work, and if so the "depends on !EFI" is
> indeed unnecessary.

We do strive for single binaries in the Xen world, because that is how
people actually want to consume Xen.

It is for this reason why a single xen.gz binary can be loaded as a
straight ELF (including this PVH boot protocol), or via Multiboot 1 or
2, and even do full EFI if your bootloader is up to date on its
Multiboot2 spec :)

~Andrew
Jürgen Groß Dec. 7, 2018, 6:02 a.m. UTC | #7
On 06/12/2018 23:11, Paolo Bonzini wrote:
> On 06/12/18 07:04, Maran Wilson wrote:
>> +config PVH
>> +	bool "Support for running PVH guests"
>> +	---help---
>> +	  This option enables the PVH entry point for guest virtual machines
>> +	  as specified in the x86/HVM direct boot ABI.
>> +
> 
> IIUC this breaks "normal" bzImage boot, so we should have something like
> 
> 	The resulting kernel will not boot with most x86 boot loaders
> 	such as GRUB or SYSLINUX.  Unless you plan to start the kernel
> 	using QEMU or Xen, you probably want to say N here.

The resulting kernel should be able to be booted either in PVH mode
via the PVH entry point or the "normal" way via the still existing
old entry point(s).

It is an _additional_ way to boot the kernel, not an exclusive
alternative.


Juergen
Paolo Bonzini Dec. 7, 2018, 1:41 p.m. UTC | #8
On 07/12/18 07:02, Juergen Gross wrote:
> On 06/12/2018 23:11, Paolo Bonzini wrote:
>> On 06/12/18 07:04, Maran Wilson wrote:
>>> +config PVH
>>> +	bool "Support for running PVH guests"
>>> +	---help---
>>> +	  This option enables the PVH entry point for guest virtual machines
>>> +	  as specified in the x86/HVM direct boot ABI.
>>> +
>>
>> IIUC this breaks "normal" bzImage boot, so we should have something like
>>
>> 	The resulting kernel will not boot with most x86 boot loaders
>> 	such as GRUB or SYSLINUX.  Unless you plan to start the kernel
>> 	using QEMU or Xen, you probably want to say N here.
> 
> The resulting kernel should be able to be booted either in PVH mode
> via the PVH entry point or the "normal" way via the still existing
> old entry point(s).
> 
> It is an _additional_ way to boot the kernel, not an exclusive
> alternative.

Yup, understood now.  Different binaries but one build.

Paolo
Jürgen Groß Dec. 7, 2018, 1:50 p.m. UTC | #9
On 07/12/2018 14:41, Paolo Bonzini wrote:
> On 07/12/18 07:02, Juergen Gross wrote:
>> On 06/12/2018 23:11, Paolo Bonzini wrote:
>>> On 06/12/18 07:04, Maran Wilson wrote:
>>>> +config PVH
>>>> +	bool "Support for running PVH guests"
>>>> +	---help---
>>>> +	  This option enables the PVH entry point for guest virtual machines
>>>> +	  as specified in the x86/HVM direct boot ABI.
>>>> +
>>>
>>> IIUC this breaks "normal" bzImage boot, so we should have something like
>>>
>>> 	The resulting kernel will not boot with most x86 boot loaders
>>> 	such as GRUB or SYSLINUX.  Unless you plan to start the kernel
>>> 	using QEMU or Xen, you probably want to say N here.
>>
>> The resulting kernel should be able to be booted either in PVH mode
>> via the PVH entry point or the "normal" way via the still existing
>> old entry point(s).
>>
>> It is an _additional_ way to boot the kernel, not an exclusive
>> alternative.
> 
> Yup, understood now.  Different binaries but one build.

The PVH boot entry is in the same bzImage binary as the normal one.
Its just another entry, similar to the Xen PV boot entry. So the binary
arch/x86/boot/bzimage can be booted either on bare metal via grub2 or
other boot-loaders, as Xen PV-guest, as Xen PVH-guest, or as KVM
PVH-guest. So one build and one binary. The non-standard boot entries
(PV- or PVH-node) are found via ELF-notes by the boot loader (qemu in
case of KVM).


Juergen
Paolo Bonzini Dec. 7, 2018, 1:52 p.m. UTC | #10
On 07/12/18 14:50, Juergen Gross wrote:
> The PVH boot entry is in the same bzImage binary as the normal one.
> Its just another entry, similar to the Xen PV boot entry. So the binary
> arch/x86/boot/bzimage can be booted either on bare metal via grub2 or
> other boot-loaders, as Xen PV-guest, as Xen PVH-guest, or as KVM
> PVH-guest. So one build and one binary. The non-standard boot entries
> (PV- or PVH-node) are found via ELF-notes by the boot loader (qemu in
> case of KVM).
> 

But the bzImage is not an ELF binary, and it is compressed, isn't it?
/me is confused. :)

Paolo
Jürgen Groß Dec. 7, 2018, 1:58 p.m. UTC | #11
On 07/12/2018 14:52, Paolo Bonzini wrote:
> On 07/12/18 14:50, Juergen Gross wrote:
>> The PVH boot entry is in the same bzImage binary as the normal one.
>> Its just another entry, similar to the Xen PV boot entry. So the binary
>> arch/x86/boot/bzimage can be booted either on bare metal via grub2 or
>> other boot-loaders, as Xen PV-guest, as Xen PVH-guest, or as KVM
>> PVH-guest. So one build and one binary. The non-standard boot entries
>> (PV- or PVH-node) are found via ELF-notes by the boot loader (qemu in
>> case of KVM).
>>
> 
> But the bzImage is not an ELF binary, and it is compressed, isn't it?
> /me is confused. :)

grub2 (and qemu, too) can decompress it. And the decompressed result
"vmlinux" is an ELF-binary.


Juergen
Paolo Bonzini Dec. 7, 2018, 3:14 p.m. UTC | #12
On 07/12/18 14:58, Juergen Gross wrote:
> On 07/12/2018 14:52, Paolo Bonzini wrote:
>> On 07/12/18 14:50, Juergen Gross wrote:
>>> The PVH boot entry is in the same bzImage binary as the normal one.
>>> Its just another entry, similar to the Xen PV boot entry. So the binary
>>> arch/x86/boot/bzimage can be booted either on bare metal via grub2 or
>>> other boot-loaders, as Xen PV-guest, as Xen PVH-guest, or as KVM
>>> PVH-guest. So one build and one binary. The non-standard boot entries
>>> (PV- or PVH-node) are found via ELF-notes by the boot loader (qemu in
>>> case of KVM).
>>
>> But the bzImage is not an ELF binary, and it is compressed, isn't it?
>> /me is confused. :)
> 
> grub2 (and qemu, too) can decompress it. And the decompressed result
> "vmlinux" is an ELF-binary.

Aha - for KVM, the main advantage of PVH would be to skip the cost of
decompression, and that is what confused me (also we probably prefer not
having any decompression code running in QEMU, since that increases the
attack surface; there's no real disadvantage to using the existing
linuxboot code if we're given a bzImage).  So, at least for now, KVM
would go with the two-binaries/one-config approach.

Sorry for having you lecture me, it's much clearer now.  Thanks,

Paolo
Maran Wilson Dec. 7, 2018, 6:21 p.m. UTC | #13
On 12/7/2018 7:14 AM, Paolo Bonzini wrote:
> On 07/12/18 14:58, Juergen Gross wrote:
>> On 07/12/2018 14:52, Paolo Bonzini wrote:
>>> On 07/12/18 14:50, Juergen Gross wrote:
>>>> The PVH boot entry is in the same bzImage binary as the normal one.
>>>> Its just another entry, similar to the Xen PV boot entry. So the binary
>>>> arch/x86/boot/bzimage can be booted either on bare metal via grub2 or
>>>> other boot-loaders, as Xen PV-guest, as Xen PVH-guest, or as KVM
>>>> PVH-guest. So one build and one binary. The non-standard boot entries
>>>> (PV- or PVH-node) are found via ELF-notes by the boot loader (qemu in
>>>> case of KVM).
>>> But the bzImage is not an ELF binary, and it is compressed, isn't it?
>>> /me is confused. :)
>> grub2 (and qemu, too) can decompress it. And the decompressed result
>> "vmlinux" is an ELF-binary.
> Aha - for KVM, the main advantage of PVH would be to skip the cost of
> decompression, and that is what confused me (also we probably prefer not
> having any decompression code running in QEMU, since that increases the
> attack surface; there's no real disadvantage to using the existing
> linuxboot code if we're given a bzImage).  So, at least for now, KVM
> would go with the two-binaries/one-config approach.

Yeah, the way we have been testing with the Qemu/qboot changes that Liam 
has out for review, if you provide the bzImage binary in the -kernel 
argument, legacy behavior is followed. However if you provide the 
vmlinux binary (from the same kernel build), Qemu recognizes it as an 
ELF binary, looks for the presence of the PVH ELFNOTE, and (if the 
ELFNOTE in question is found) uses that entry point instead.

So at this point, the only feedback/comment still outstanding from you 
is the one about removing KVM_GUEST_PVH right?

I'll hold off on sending a v9 until next week to see if there is any 
additional feedback or test results.

Thanks,
-Maran

> Sorry for having you lecture me, it's much clearer now.  Thanks,
>
> Paolo
diff mbox series

Patch

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8689e794a43c..c2a22a74abee 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -796,6 +796,12 @@  config KVM_GUEST
 	  underlying device model, the host provides the guest with
 	  timing infrastructure such as time of day, and system time
 
+config PVH
+	bool "Support for running PVH guests"
+	---help---
+	  This option enables the PVH entry point for guest virtual machines
+	  as specified in the x86/HVM direct boot ABI.
+
 config KVM_DEBUG_FS
 	bool "Enable debug information for KVM Guests in debugfs"
 	depends on KVM_GUEST && DEBUG_FS
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 747c758f67b7..d1dbe8e4eb82 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -386,7 +386,7 @@  NEXT_PAGE(early_dynamic_pgts)
 
 	.data
 
-#if defined(CONFIG_XEN_PV) || defined(CONFIG_XEN_PVH)
+#if defined(CONFIG_XEN_PV) || defined(CONFIG_PVH)
 NEXT_PGD_PAGE(init_top_pgt)
 	.quad   level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC
 	.org    init_top_pgt + L4_PAGE_OFFSET*8, 0
diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 1ef391aa184d..e07abefd3d26 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -74,6 +74,7 @@  config XEN_DEBUG_FS
 	  Enabling this option may incur a significant performance overhead.
 
 config XEN_PVH
-	bool "Support for running as a PVH guest"
+	bool "Support for running as a Xen PVH guest"
 	depends on XEN && XEN_PVHVM && ACPI
+	select PVH
 	def_bool n