diff mbox series

[hyperv-next,v5,03/11] Drivers: hv: Enable VTL mode for arm64

Message ID 20250307220304.247725-4-romank@linux.microsoft.com (mailing list archive)
State Handled Elsewhere
Delegated to: Krzysztof Wilczyński
Headers show
Series arm64: hyperv: Support Virtual Trust Level Boot | expand

Commit Message

Roman Kisel March 7, 2025, 10:02 p.m. UTC
Kconfig dependencies for arm64 guests on Hyper-V require that be
ACPI enabled, and limit VTL mode to x86/x64. To enable VTL mode
on arm64 as well, update the dependencies. Since VTL mode requires
DeviceTree instead of ACPI, don’t require arm64 guests on Hyper-V
to have ACPI unconditionally.

Signed-off-by: Roman Kisel <romank@linux.microsoft.com>
---
 drivers/hv/Kconfig | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Comments

Arnd Bergmann March 8, 2025, 9:05 p.m. UTC | #1
On Fri, Mar 7, 2025, at 23:02, Roman Kisel wrote:
> @@ -5,18 +5,20 @@ menu "Microsoft Hyper-V guest support"
>  config HYPERV
>  	tristate "Microsoft Hyper-V client drivers"
>  	depends on (X86 && X86_LOCAL_APIC && HYPERVISOR_GUEST) \
> -		|| (ACPI && ARM64 && !CPU_BIG_ENDIAN)
> +		|| (ARM64 && !CPU_BIG_ENDIAN)
> +	depends on (ACPI || HYPERV_VTL_MODE)
>  	select PARAVIRT
>  	select X86_HV_CALLBACK_VECTOR if X86
> -	select OF_EARLY_FLATTREE if OF
>  	help
>  	  Select this option to run Linux as a Hyper-V client operating
>  	  system.
> 
>  config HYPERV_VTL_MODE
>  	bool "Enable Linux to boot in VTL context"
> -	depends on X86_64 && HYPERV
> +	depends on (X86_64 || ARM64)
>  	depends on SMP
> +	select OF_EARLY_FLATTREE
> +	select OF
>  	default n
>  	help

Having the dependency below the top-level Kconfig entry feels a little
counterintuitive. You could flip that back as it was before by doing

      select HYPERV_VTL_MODE if !ACPI
      depends on ACPI || SMP

in the HYPERV option, leaving the dependency on HYPERV in
HYPERV_VTL_MODE.

Is OF_EARLY_FLATTREE actually needed on x86?

      Arnd
Roman Kisel March 10, 2025, 5:35 p.m. UTC | #2
On 3/8/2025 1:05 PM, Arnd Bergmann wrote:
> On Fri, Mar 7, 2025, at 23:02, Roman Kisel wrote:
>> @@ -5,18 +5,20 @@ menu "Microsoft Hyper-V guest support"
>>   config HYPERV
>>   	tristate "Microsoft Hyper-V client drivers"
>>   	depends on (X86 && X86_LOCAL_APIC && HYPERVISOR_GUEST) \
>> -		|| (ACPI && ARM64 && !CPU_BIG_ENDIAN)
>> +		|| (ARM64 && !CPU_BIG_ENDIAN)
>> +	depends on (ACPI || HYPERV_VTL_MODE)
>>   	select PARAVIRT
>>   	select X86_HV_CALLBACK_VECTOR if X86
>> -	select OF_EARLY_FLATTREE if OF
>>   	help
>>   	  Select this option to run Linux as a Hyper-V client operating
>>   	  system.
>>
>>   config HYPERV_VTL_MODE
>>   	bool "Enable Linux to boot in VTL context"
>> -	depends on X86_64 && HYPERV
>> +	depends on (X86_64 || ARM64)
>>   	depends on SMP
>> +	select OF_EARLY_FLATTREE
>> +	select OF
>>   	default n
>>   	help
> 
> Having the dependency below the top-level Kconfig entry feels a little
> counterintuitive. You could flip that back as it was before by doing
> 
>        select HYPERV_VTL_MODE if !ACPI
>        depends on ACPI || SMP
> 
> in the HYPERV option, leaving the dependency on HYPERV in
> HYPERV_VTL_MODE.
> 

I was implementing Michael's suggestion, and might've gone a bit
overboard, my bad. I'll fix this, thanks a lot for reviewing!

> Is OF_EARLY_FLATTREE actually needed on x86?
> 

No, it is not needed on x86. It is only needed when VTL mode is used.

>        Arnd
Michael Kelley March 10, 2025, 9:01 p.m. UTC | #3
From: Arnd Bergmann <arnd@arndb.de> Sent: Saturday, March 8, 2025 1:05 PM
> 
> On Fri, Mar 7, 2025, at 23:02, Roman Kisel wrote:
> > @@ -5,18 +5,20 @@ menu "Microsoft Hyper-V guest support"
> >  config HYPERV
> >  	tristate "Microsoft Hyper-V client drivers"
> >  	depends on (X86 && X86_LOCAL_APIC && HYPERVISOR_GUEST) \
> > -		|| (ACPI && ARM64 && !CPU_BIG_ENDIAN)
> > +		|| (ARM64 && !CPU_BIG_ENDIAN)
> > +	depends on (ACPI || HYPERV_VTL_MODE)
> >  	select PARAVIRT
> >  	select X86_HV_CALLBACK_VECTOR if X86
> > -	select OF_EARLY_FLATTREE if OF
> >  	help
> >  	  Select this option to run Linux as a Hyper-V client operating
> >  	  system.
> >
> >  config HYPERV_VTL_MODE
> >  	bool "Enable Linux to boot in VTL context"
> > -	depends on X86_64 && HYPERV
> > +	depends on (X86_64 || ARM64)
> >  	depends on SMP
> > +	select OF_EARLY_FLATTREE
> > +	select OF
> >  	default n
> >  	help
> 
> Having the dependency below the top-level Kconfig entry feels a little
> counterintuitive. You could flip that back as it was before by doing
> 
>       select HYPERV_VTL_MODE if !ACPI
>       depends on ACPI || SMP
> 
> in the HYPERV option, leaving the dependency on HYPERV in
> HYPERV_VTL_MODE.

I would argue that we don't ever want to implicitly select
HYPERV_VTL_MODE because of some other config setting or
lack thereof.  VTL mode is enough of a special case that it should
only be explicitly selected. If someone omits ACPI, then HYPERV
should not be selectable unless HYPERV_VTL_MODE is explicitly
selected.

The last line of the comment for HYPERV_VTL_MODE says
"A kernel built with this option must run at VTL2, and will not run
as a normal guest."  In other words, don't choose this unless you
100% know that VTL2 is what you want.

Michael

> 
> Is OF_EARLY_FLATTREE actually needed on x86?
> 
>       Arnd
Arnd Bergmann March 10, 2025, 9:20 p.m. UTC | #4
On Mon, Mar 10, 2025, at 22:01, Michael Kelley wrote:
> From: Arnd Bergmann <arnd@arndb.de> Sent: Saturday, March 8, 2025 1:05 PM
>> >  config HYPERV_VTL_MODE
>> >  	bool "Enable Linux to boot in VTL context"
>> > -	depends on X86_64 && HYPERV
>> > +	depends on (X86_64 || ARM64)
>> >  	depends on SMP
>> > +	select OF_EARLY_FLATTREE
>> > +	select OF
>> >  	default n
>> >  	help
>> 
>> Having the dependency below the top-level Kconfig entry feels a little
>> counterintuitive. You could flip that back as it was before by doing
>> 
>>       select HYPERV_VTL_MODE if !ACPI
>>       depends on ACPI || SMP
>> 
>> in the HYPERV option, leaving the dependency on HYPERV in
>> HYPERV_VTL_MODE.
>
> I would argue that we don't ever want to implicitly select
> HYPERV_VTL_MODE because of some other config setting or
> lack thereof.  VTL mode is enough of a special case that it should
> only be explicitly selected. If someone omits ACPI, then HYPERV
> should not be selectable unless HYPERV_VTL_MODE is explicitly
> selected.
>
> The last line of the comment for HYPERV_VTL_MODE says
> "A kernel built with this option must run at VTL2, and will not run
> as a normal guest."  In other words, don't choose this unless you
> 100% know that VTL2 is what you want.

It sounds like the latter is the real problem: enabling a feature
should never prevent something else from working. Can you describe
what VTL context is and why it requires an exception to a rather
fundamental rule here? If you build a kernel that runs on every
single piece of arm64 hardware and every hypervisor, why can't
you add HYPERV_VTL_MODE to that as an option?

      Arnd
Michael Kelley March 10, 2025, 10:18 p.m. UTC | #5
From: Arnd Bergmann <arnd@arndb.de> Sent: Monday, March 10, 2025 2:21 PM
> 
> On Mon, Mar 10, 2025, at 22:01, Michael Kelley wrote:
> > From: Arnd Bergmann <arnd@arndb.de> Sent: Saturday, March 8, 2025 1:05 PM
> >> >  config HYPERV_VTL_MODE
> >> >  	bool "Enable Linux to boot in VTL context"
> >> > -	depends on X86_64 && HYPERV
> >> > +	depends on (X86_64 || ARM64)
> >> >  	depends on SMP
> >> > +	select OF_EARLY_FLATTREE
> >> > +	select OF
> >> >  	default n
> >> >  	help
> >>
> >> Having the dependency below the top-level Kconfig entry feels a little
> >> counterintuitive. You could flip that back as it was before by doing
> >>
> >>       select HYPERV_VTL_MODE if !ACPI
> >>       depends on ACPI || SMP
> >>
> >> in the HYPERV option, leaving the dependency on HYPERV in
> >> HYPERV_VTL_MODE.
> >
> > I would argue that we don't ever want to implicitly select
> > HYPERV_VTL_MODE because of some other config setting or
> > lack thereof.  VTL mode is enough of a special case that it should
> > only be explicitly selected. If someone omits ACPI, then HYPERV
> > should not be selectable unless HYPERV_VTL_MODE is explicitly
> > selected.
> >
> > The last line of the comment for HYPERV_VTL_MODE says
> > "A kernel built with this option must run at VTL2, and will not run
> > as a normal guest."  In other words, don't choose this unless you
> > 100% know that VTL2 is what you want.
> 
> It sounds like the latter is the real problem: enabling a feature
> should never prevent something else from working. Can you describe
> what VTL context is and why it requires an exception to a rather
> fundamental rule here? If you build a kernel that runs on every
> single piece of arm64 hardware and every hypervisor, why can't
> you add HYPERV_VTL_MODE to that as an option?
> 

VTL = Virtual Trust Level, and VSM = Virtual Secure Mode, are Hyper-V's
terminology for offering multiple execution environments with
hierarchical trust in the context of a single VM. A normal guest
operating system runs at VTL 0, and there are no other VTLs in use.
But in some environments, additional software may run as a paravisor
layer between the normal guest OS and the hypervisor. This software
runs at some other VTL > 0, and has a higher privilege level within
the VM than software running at VTL 0 (which is the lowest privilege).
VTL 2 is used today in the Azure cloud with CoCo VMs to run a
paravisor, and there may be other uses in the future. See [1] if you
want more details on VSM and VTLs. Also [2] for the CoCo VM use
case.

Ideally, a Linux kernel image could detect at runtime what VTL it is
running at, and "do the right thing". Unfortunately, on x86 Linux this
has proved difficult (or perhaps impossible) because the amount of
boot-time setup required to ask the question about the current VTL
is significant. The idiosyncrasies and historical baggage of x86 requires
that Linux do some x86-specific initialization steps for VTL > 0
before the question can be asked. Hence the introduction of
CONFIG_HYPERV_VTL_MODE, and the behavior that when it is
selected, the kernel image won't run normally in VTL 0.

I'll go out on a limb and say that I suspect on arm64 a runtime
determination based on querying the VTL *could* be made (though
I'm not the person writing the code). But taking advantage of that
on arm64 produces an undesirable dichotomy with x86.

Roman may have further thoughts on the topic, but that's
what I know about how we got here.

Michael

[1] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/vsm
[2] https://techcommunity.microsoft.com/blog/windowsosplatform/openhcl-the-new-open-source-paravisor/4273172
Roman Kisel March 12, 2025, 6:33 p.m. UTC | #6
On 3/10/2025 3:18 PM, Michael Kelley wrote:
> From: Arnd Bergmann <arnd@arndb.de> Sent: Monday, March 10, 2025 2:21 PM
>>
>> On Mon, Mar 10, 2025, at 22:01, Michael Kelley wrote:
>>> From: Arnd Bergmann <arnd@arndb.de> Sent: Saturday, March 8, 2025 1:05 PM
>>>>>   config HYPERV_VTL_MODE
>>>>>   	bool "Enable Linux to boot in VTL context"
>>>>> -	depends on X86_64 && HYPERV
>>>>> +	depends on (X86_64 || ARM64)
>>>>>   	depends on SMP
>>>>> +	select OF_EARLY_FLATTREE
>>>>> +	select OF
>>>>>   	default n
>>>>>   	help
>>>>
>>>> Having the dependency below the top-level Kconfig entry feels a little
>>>> counterintuitive. You could flip that back as it was before by doing
>>>>
>>>>        select HYPERV_VTL_MODE if !ACPI
>>>>        depends on ACPI || SMP
>>>>
>>>> in the HYPERV option, leaving the dependency on HYPERV in
>>>> HYPERV_VTL_MODE.
>>>
>>> I would argue that we don't ever want to implicitly select
>>> HYPERV_VTL_MODE because of some other config setting or
>>> lack thereof.  VTL mode is enough of a special case that it should
>>> only be explicitly selected. If someone omits ACPI, then HYPERV
>>> should not be selectable unless HYPERV_VTL_MODE is explicitly
>>> selected.
>>>
>>> The last line of the comment for HYPERV_VTL_MODE says
>>> "A kernel built with this option must run at VTL2, and will not run
>>> as a normal guest."  In other words, don't choose this unless you
>>> 100% know that VTL2 is what you want.
>>
>> It sounds like the latter is the real problem: enabling a feature
>> should never prevent something else from working. Can you describe
>> what VTL context is and why it requires an exception to a rather
>> fundamental rule here? If you build a kernel that runs on every
>> single piece of arm64 hardware and every hypervisor, why can't
>> you add HYPERV_VTL_MODE to that as an option?
>>

In the VTL mode, we're running the kernel as secure firmware inside the
guest (one might see VTL2 working as Intel SMM or Secure World on ARM).

[...]

> 
> Ideally, a Linux kernel image could detect at runtime what VTL it is
> running at, and "do the right thing". Unfortunately, on x86 Linux this
> has proved difficult (or perhaps impossible) because the amount of
> boot-time setup required to ask the question about the current VTL
> is significant. The idiosyncrasies and historical baggage of x86 requires
> that Linux do some x86-specific initialization steps for VTL > 0
> before the question can be asked. Hence the introduction of
> CONFIG_HYPERV_VTL_MODE, and the behavior that when it is
> selected, the kernel image won't run normally in VTL 0.
> 
> I'll go out on a limb and say that I suspect on arm64 a runtime
> determination based on querying the VTL *could* be made (though
> I'm not the person writing the code). But taking advantage of that
> on arm64 produces an undesirable dichotomy with x86.

On arm64 that is much easier, I agree. On x86 we'd need a kludge of

static void __naked __init __aligned(4096) early_hvcall_pg(void)
{
	/*
	 * Fill the early hvcall page with `0xF1` aka `INT1` to catch
	 * programming errors. The hypervisor will overlay the page with
	 * the vendor-specific code sequences to make hypercalls on x86(_64).
	 */
	asm (".skip 4096, 0xf1");
}

static u8 __init early_hvcall_pg_input[4096] 
__attribute__((aligned(4096)));
static u8 __init early_hvcall_pg_output[4096] 
__attribute__((aligned(4096)));

static void __init early_connect_to_hv(void)
{
	union hv_x64_msr_hypercall_contents hypercall_msr;
	u64 guest_id;

	guest_id = hv_generate_guest_id(LINUX_VERSION_CODE);
	wrmsrl(HV_X64_MSR_GUEST_OS_ID, guest_id);
	rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
	hypercall_msr.enable = 1;
	hypercall_msr.guest_physical_address = 
__phys_to_pfn(virt_to_phys(early_hvcall_pg));
	wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
}

or variations thereof.

What's very nice about arm64 in this case at least, it's got SMCCC, hvc,
OF/DT and a history of options of being power-efficient and embedded.
Conversely, on x86(_64) the code sequences for hyeprcalls vary from the
first vendor to the second one so we have to have the hvcall page to
make this regular in the code. Support for OF/DT on x86 was added for
Intel set top boxes (MID, ~2015 iirc), and it took a bit of huffing and
puffing to make that work for us on the large/NUMA systems (and there
might be something about supporting x2apic that had to be figured out).

All told, we can have nicer things in our arm64 code yet diverging the
code much from x86(_64) is not very desirable. I am not sure yet what
the tradeoff should be, and my knowledge of Kconfig is rather basic.
Certainly I cannot propose to arm64 maintainers that we'd like to do
quirky things in Kconfig because of x86(-64), legacy specs, etc.

Perhaps, we could go back to the V2's option of

  config HYPERV
  	tristate "Microsoft Hyper-V client drivers"
  	depends on (X86 && X86_LOCAL_APIC && HYPERVISOR_GUEST) \
-		|| (ACPI && ARM64 && !CPU_BIG_ENDIAN)
+		|| (ARM64 && !CPU_BIG_ENDIAN)
  	select PARAVIRT
  	select X86_HV_CALLBACK_VECTOR if X86
  	select OF_EARLY_FLATTREE if OF
@@ -15,7 +15,7 @@ config HYPERV

  config HYPERV_VTL_MODE
  	bool "Enable Linux to boot in VTL context"
-	depends on X86_64 && HYPERV
+	depends on HYPERV
  	depends on SMP
  	default n
  	help
@@ -31,7 +31,7 @@ config HYPERV_VTL_MODE

  	  Select this option to build a Linux kernel to run at a VTL other than
  	  the normal VTL0, which currently is only VTL2.  This option
-	  initializes the x86 platform for VTL2, and adds the ability to boot
+	  initializes the kernel to run in VTL2, and adds the ability to boot
  	  secondary CPUs directly into 64-bit context as required for VTLs other
  	  than 0.  A kernel built with this option must run at VTL2, and will
  	  not run as a normal guest.

That's a minimal extension, its surprise factor is very low. It has not
been seen to cause issues. If no one has strong opinions against that,
I'd send that in V6.

> 
> Roman may have further thoughts on the topic, but that's
> what I know about how we got here.
> 
> Michael
> 
> [1] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/vsm
> [2] https://techcommunity.microsoft.com/blog/windowsosplatform/openhcl-the-new-open-source-paravisor/4273172
Arnd Bergmann March 12, 2025, 8:25 p.m. UTC | #7
On Wed, Mar 12, 2025, at 19:33, Roman Kisel wrote:
> On 3/10/2025 3:18 PM, Michael Kelley wrote:
>
> That's a minimal extension, its surprise factor is very low. It has not
> been seen to cause issues. If no one has strong opinions against that,
> I'd send that in V6.
>

Works for me. Thanks for your detailed explanations.

       Arnd
Wei Liu March 12, 2025, 8:31 p.m. UTC | #8
On Wed, Mar 12, 2025 at 11:33:11AM -0700, Roman Kisel wrote:
> 
> 
> On 3/10/2025 3:18 PM, Michael Kelley wrote:
> > From: Arnd Bergmann <arnd@arndb.de> Sent: Monday, March 10, 2025 2:21 PM
> > > 
> > > On Mon, Mar 10, 2025, at 22:01, Michael Kelley wrote:
> > > > From: Arnd Bergmann <arnd@arndb.de> Sent: Saturday, March 8, 2025 1:05 PM
> > > > > >   config HYPERV_VTL_MODE
> > > > > >   	bool "Enable Linux to boot in VTL context"
> > > > > > -	depends on X86_64 && HYPERV
> > > > > > +	depends on (X86_64 || ARM64)
> > > > > >   	depends on SMP
> > > > > > +	select OF_EARLY_FLATTREE
> > > > > > +	select OF
> > > > > >   	default n
> > > > > >   	help
> > > > > 
> > > > > Having the dependency below the top-level Kconfig entry feels a little
> > > > > counterintuitive. You could flip that back as it was before by doing
> > > > > 
> > > > >        select HYPERV_VTL_MODE if !ACPI
> > > > >        depends on ACPI || SMP
> > > > > 
> > > > > in the HYPERV option, leaving the dependency on HYPERV in
> > > > > HYPERV_VTL_MODE.
> > > > 
> > > > I would argue that we don't ever want to implicitly select
> > > > HYPERV_VTL_MODE because of some other config setting or
> > > > lack thereof.  VTL mode is enough of a special case that it should
> > > > only be explicitly selected. If someone omits ACPI, then HYPERV
> > > > should not be selectable unless HYPERV_VTL_MODE is explicitly
> > > > selected.
> > > > 
> > > > The last line of the comment for HYPERV_VTL_MODE says
> > > > "A kernel built with this option must run at VTL2, and will not run
> > > > as a normal guest."  In other words, don't choose this unless you
> > > > 100% know that VTL2 is what you want.
> > > 
> > > It sounds like the latter is the real problem: enabling a feature
> > > should never prevent something else from working. Can you describe
> > > what VTL context is and why it requires an exception to a rather
> > > fundamental rule here? If you build a kernel that runs on every
> > > single piece of arm64 hardware and every hypervisor, why can't
> > > you add HYPERV_VTL_MODE to that as an option?
> > > 
> 
> In the VTL mode, we're running the kernel as secure firmware inside the
> guest (one might see VTL2 working as Intel SMM or Secure World on ARM).
> 
> [...]
> 
> > 
> > Ideally, a Linux kernel image could detect at runtime what VTL it is
> > running at, and "do the right thing". Unfortunately, on x86 Linux this
> > has proved difficult (or perhaps impossible) because the amount of
> > boot-time setup required to ask the question about the current VTL
> > is significant. The idiosyncrasies and historical baggage of x86 requires
> > that Linux do some x86-specific initialization steps for VTL > 0
> > before the question can be asked. Hence the introduction of
> > CONFIG_HYPERV_VTL_MODE, and the behavior that when it is
> > selected, the kernel image won't run normally in VTL 0.
> > 
> > I'll go out on a limb and say that I suspect on arm64 a runtime
> > determination based on querying the VTL *could* be made (though
> > I'm not the person writing the code). But taking advantage of that
> > on arm64 produces an undesirable dichotomy with x86.
> 
> On arm64 that is much easier, I agree. On x86 we'd need a kludge of
> 
> static void __naked __init __aligned(4096) early_hvcall_pg(void)
> {
> 	/*
> 	 * Fill the early hvcall page with `0xF1` aka `INT1` to catch
> 	 * programming errors. The hypervisor will overlay the page with
> 	 * the vendor-specific code sequences to make hypercalls on x86(_64).
> 	 */
> 	asm (".skip 4096, 0xf1");
> }
> 
> static u8 __init early_hvcall_pg_input[4096] __attribute__((aligned(4096)));
> static u8 __init early_hvcall_pg_output[4096]
> __attribute__((aligned(4096)));
> 
> static void __init early_connect_to_hv(void)
> {
> 	union hv_x64_msr_hypercall_contents hypercall_msr;
> 	u64 guest_id;
> 
> 	guest_id = hv_generate_guest_id(LINUX_VERSION_CODE);
> 	wrmsrl(HV_X64_MSR_GUEST_OS_ID, guest_id);
> 	rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> 	hypercall_msr.enable = 1;
> 	hypercall_msr.guest_physical_address =
> __phys_to_pfn(virt_to_phys(early_hvcall_pg));
> 	wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> }
> 
> or variations thereof.

OT here but what's stopping us from doing this on x86?

It seems to me there is some value in setting up the hypercall page as
early as possible. The same page can be used through the lifetime of the
partition. The early input and output pages should be reclaimed.

Also, since the hypervisor will insert an overlay page, it makes sense
to not allocate a page from Linux at all. When I ported Xen to run as
a guest on Hyper-V, I used that approach. The setup worked just fine.

All being said, things work today, so I'm in no hurry to change things.

Wei.
Roman Kisel March 12, 2025, 9:21 p.m. UTC | #9
On 3/12/2025 1:25 PM, Arnd Bergmann wrote:
> On Wed, Mar 12, 2025, at 19:33, Roman Kisel wrote:
>> On 3/10/2025 3:18 PM, Michael Kelley wrote:
>>
>> That's a minimal extension, its surprise factor is very low. It has not
>> been seen to cause issues. If no one has strong opinions against that,
>> I'd send that in V6.
>>
> 
> Works for me. Thanks for your detailed explanations.
> 

Thank you for your review very much!

>         Arnd
Roman Kisel March 12, 2025, 9:30 p.m. UTC | #10
On 3/12/2025 1:31 PM, Wei Liu wrote:
> On Wed, Mar 12, 2025 at 11:33:11AM -0700, Roman Kisel wrote:
>>
>>
>> On 3/10/2025 3:18 PM, Michael Kelley wrote:
>>> From: Arnd Bergmann <arnd@arndb.de> Sent: Monday, March 10, 2025 2:21 PM
>>>>
>>>> On Mon, Mar 10, 2025, at 22:01, Michael Kelley wrote:
>>>>> From: Arnd Bergmann <arnd@arndb.de> Sent: Saturday, March 8, 2025 1:05 PM
>>>>>>>    config HYPERV_VTL_MODE
>>>>>>>    	bool "Enable Linux to boot in VTL context"
>>>>>>> -	depends on X86_64 && HYPERV
>>>>>>> +	depends on (X86_64 || ARM64)
>>>>>>>    	depends on SMP
>>>>>>> +	select OF_EARLY_FLATTREE
>>>>>>> +	select OF
>>>>>>>    	default n
>>>>>>>    	help
>>>>>>
>>>>>> Having the dependency below the top-level Kconfig entry feels a little
>>>>>> counterintuitive. You could flip that back as it was before by doing
>>>>>>
>>>>>>         select HYPERV_VTL_MODE if !ACPI
>>>>>>         depends on ACPI || SMP
>>>>>>
>>>>>> in the HYPERV option, leaving the dependency on HYPERV in
>>>>>> HYPERV_VTL_MODE.
>>>>>
>>>>> I would argue that we don't ever want to implicitly select
>>>>> HYPERV_VTL_MODE because of some other config setting or
>>>>> lack thereof.  VTL mode is enough of a special case that it should
>>>>> only be explicitly selected. If someone omits ACPI, then HYPERV
>>>>> should not be selectable unless HYPERV_VTL_MODE is explicitly
>>>>> selected.
>>>>>
>>>>> The last line of the comment for HYPERV_VTL_MODE says
>>>>> "A kernel built with this option must run at VTL2, and will not run
>>>>> as a normal guest."  In other words, don't choose this unless you
>>>>> 100% know that VTL2 is what you want.
>>>>
>>>> It sounds like the latter is the real problem: enabling a feature
>>>> should never prevent something else from working. Can you describe
>>>> what VTL context is and why it requires an exception to a rather
>>>> fundamental rule here? If you build a kernel that runs on every
>>>> single piece of arm64 hardware and every hypervisor, why can't
>>>> you add HYPERV_VTL_MODE to that as an option?
>>>>
>>
>> In the VTL mode, we're running the kernel as secure firmware inside the
>> guest (one might see VTL2 working as Intel SMM or Secure World on ARM).
>>
>> [...]
>>
>>>
>>> Ideally, a Linux kernel image could detect at runtime what VTL it is
>>> running at, and "do the right thing". Unfortunately, on x86 Linux this
>>> has proved difficult (or perhaps impossible) because the amount of
>>> boot-time setup required to ask the question about the current VTL
>>> is significant. The idiosyncrasies and historical baggage of x86 requires
>>> that Linux do some x86-specific initialization steps for VTL > 0
>>> before the question can be asked. Hence the introduction of
>>> CONFIG_HYPERV_VTL_MODE, and the behavior that when it is
>>> selected, the kernel image won't run normally in VTL 0.
>>>
>>> I'll go out on a limb and say that I suspect on arm64 a runtime
>>> determination based on querying the VTL *could* be made (though
>>> I'm not the person writing the code). But taking advantage of that
>>> on arm64 produces an undesirable dichotomy with x86.
>>
>> On arm64 that is much easier, I agree. On x86 we'd need a kludge of
>>
>> static void __naked __init __aligned(4096) early_hvcall_pg(void)
>> {
>> 	/*
>> 	 * Fill the early hvcall page with `0xF1` aka `INT1` to catch
>> 	 * programming errors. The hypervisor will overlay the page with
>> 	 * the vendor-specific code sequences to make hypercalls on x86(_64).
>> 	 */
>> 	asm (".skip 4096, 0xf1");
>> }
>>
>> static u8 __init early_hvcall_pg_input[4096] __attribute__((aligned(4096)));
>> static u8 __init early_hvcall_pg_output[4096]
>> __attribute__((aligned(4096)));
>>
>> static void __init early_connect_to_hv(void)
>> {
>> 	union hv_x64_msr_hypercall_contents hypercall_msr;
>> 	u64 guest_id;
>>
>> 	guest_id = hv_generate_guest_id(LINUX_VERSION_CODE);
>> 	wrmsrl(HV_X64_MSR_GUEST_OS_ID, guest_id);
>> 	rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
>> 	hypercall_msr.enable = 1;
>> 	hypercall_msr.guest_physical_address =
>> __phys_to_pfn(virt_to_phys(early_hvcall_pg));
>> 	wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
>> }
>>
>> or variations thereof.
> 
> OT here but what's stopping us from doing this on x86?
> 

At the first glance, seems like nothing I think. For the conf scenarios
like TDX and SEV-SNP, due to the early hvcall I/O pages above allocated
in BSS, might need to mark the pages as decrypted and zero them out so
they look like proper BSS section (the page contents are scrambled after
flipping the page encryption bit iirc).

> It seems to me there is some value in setting up the hypercall page as
> early as possible. The same page can be used through the lifetime of the
> partition. The early input and output pages should be reclaimed.
> 

Wholeheartedly agree!

> Also, since the hypervisor will insert an overlay page, it makes sense
> to not allocate a page from Linux at all. When I ported Xen to run as
> a guest on Hyper-V, I used that approach. The setup worked just fine.
> 
> All being said, things work today, so I'm in no hurry to change things.
> 

I'll try fleshing this out soon-ish if no one beats me to that :)

> Wei.
diff mbox series

Patch

diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
index 862c47b191af..c37b1a44e580 100644
--- a/drivers/hv/Kconfig
+++ b/drivers/hv/Kconfig
@@ -5,18 +5,20 @@  menu "Microsoft Hyper-V guest support"
 config HYPERV
 	tristate "Microsoft Hyper-V client drivers"
 	depends on (X86 && X86_LOCAL_APIC && HYPERVISOR_GUEST) \
-		|| (ACPI && ARM64 && !CPU_BIG_ENDIAN)
+		|| (ARM64 && !CPU_BIG_ENDIAN)
+	depends on (ACPI || HYPERV_VTL_MODE)
 	select PARAVIRT
 	select X86_HV_CALLBACK_VECTOR if X86
-	select OF_EARLY_FLATTREE if OF
 	help
 	  Select this option to run Linux as a Hyper-V client operating
 	  system.
 
 config HYPERV_VTL_MODE
 	bool "Enable Linux to boot in VTL context"
-	depends on X86_64 && HYPERV
+	depends on (X86_64 || ARM64)
 	depends on SMP
+	select OF_EARLY_FLATTREE
+	select OF
 	default n
 	help
 	  Virtual Secure Mode (VSM) is a set of hypervisor capabilities and
@@ -31,7 +33,7 @@  config HYPERV_VTL_MODE
 
 	  Select this option to build a Linux kernel to run at a VTL other than
 	  the normal VTL0, which currently is only VTL2.  This option
-	  initializes the x86 platform for VTL2, and adds the ability to boot
+	  initializes the kernel to run in VTL2, and adds the ability to boot
 	  secondary CPUs directly into 64-bit context as required for VTLs other
 	  than 0.  A kernel built with this option must run at VTL2, and will
 	  not run as a normal guest.