KVM: nVMX: mask unrestricted_guest if disabled on L0

Message ID 20150224163005.GB2186@potion.brq.redhat.com (mailing list archive)
State New, archived

Commit Message

Radim Krčmář Feb. 24, 2015, 4:30 p.m. UTC
2015-02-23 19:05+0100, Kashyap Chamarthy:
> Tested with the _correct_ Kernel[1] (that has Radim's patch) now --
> applied it on both L0 and L1.
> 
> Result: Same as before -- Booting L2 causes L1 to reboot. However, the
>         stack trace from `dmesg` on L0 took a slightly different path than
>         before -- it's using MSR handling:

Thanks, the problem was deeper ... L1 enabled unrestricted mode while L0
had it disabled.  L1 could then vmrun a L2 state that L0 would have to
emulate, but that doesn't work.  There are at least these solutions:

 1) don't expose unrestricted_guest when L0 doesn't have it
 2) fix unrestricted mode emulation code
 3) handle the failure without killing L1

I'd do just (1) -- emulating unrestricted mode is a loss.

I have done initial testing and at least qemu-sanity-check works now:

---8<---
If EPT was enabled, unrestricted_guest was allowed in L1 regardless of
L0.  L1 triple faulted when running L2 guest that required emulation.

Another side effect was 'WARN_ON_ONCE(vmx->nested.nested_run_pending)'
in L0's dmesg:
  WARNING: CPU: 0 PID: 0 at arch/x86/kvm/vmx.c:9190 nested_vmx_vmexit+0x96e/0xb00 [kvm_intel] ()

Prevent this scenario by masking SECONDARY_EXEC_UNRESTRICTED_GUEST when
the host doesn't have it enabled.

Fixes: 78051e3b7e35 ("KVM: nVMX: Disable unrestricted mode if ept=0")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
 arch/x86/kvm/vmx.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jan Kiszka Feb. 24, 2015, 4:39 p.m. UTC | #1
On 2015-02-24 17:30, Radim Krčmář wrote:
> 2015-02-23 19:05+0100, Kashyap Chamarthy:
>> Tested with the _correct_ Kernel[1] (that has Radim's patch) now --
>> applied it on both L0 and L1.
>>
>> Result: Same as before -- Booting L2 causes L1 to reboot. However, the
>>         stack trace from `dmesg` on L0 took a slightly different path than
>>         before -- it's using MSR handling:
> 
> Thanks, the problem was deeper ... L1 enabled unrestricted mode while L0
> had it disabled.  L1 could then vmrun a L2 state that L0 would have to
> emulate, but that doesn't work.  There are at least these solutions:
> 
>  1) don't expose unrestricted_guest when L0 doesn't have it

Reminds me of a patch called "KVM: nVMX: Disable unrestricted mode if
ept=0" by Bandan. I thought that would have caught it - apparently not.

>  2) fix unrestricted mode emulation code
>  3) handle the failure without killing L1
> 
> I'd do just (1) -- emulating unrestricted mode is a loss.

Agreed.

Jan

> 
> I have done initial testing and at least qemu-sanity-check works now:
> 
> ---8<---
> If EPT was enabled, unrestricted_guest was allowed in L1 regardless of
> L0.  L1 triple faulted when running L2 guest that required emulation.
> 
> Another side effect was 'WARN_ON_ONCE(vmx->nested.nested_run_pending)'
> in L0's dmesg:
>   WARNING: CPU: 0 PID: 0 at arch/x86/kvm/vmx.c:9190 nested_vmx_vmexit+0x96e/0xb00 [kvm_intel] ()
> 
> Prevent this scenario by masking SECONDARY_EXEC_UNRESTRICTED_GUEST when
> the host doesn't have it enabled.
> 
> Fixes: 78051e3b7e35 ("KVM: nVMX: Disable unrestricted mode if ept=0")
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> ---
>  arch/x86/kvm/vmx.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index f7b20b417a3a..dbabea21357b 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2476,8 +2476,7 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
>  	if (enable_ept) {
>  		/* nested EPT: emulate EPT also to L1 */
>  		vmx->nested.nested_vmx_secondary_ctls_high |=
> -			SECONDARY_EXEC_ENABLE_EPT |
> -			SECONDARY_EXEC_UNRESTRICTED_GUEST;
> +			SECONDARY_EXEC_ENABLE_EPT;
>  		vmx->nested.nested_vmx_ept_caps = VMX_EPT_PAGE_WALK_4_BIT |
>  			 VMX_EPTP_WB_BIT | VMX_EPT_2MB_PAGE_BIT |
>  			 VMX_EPT_INVEPT_BIT;
> @@ -2491,6 +2490,10 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
>  	} else
>  		vmx->nested.nested_vmx_ept_caps = 0;
>  
> +	if (enable_unrestricted_guest)
> +		vmx->nested.nested_vmx_secondary_ctls_high |=
> +			SECONDARY_EXEC_UNRESTRICTED_GUEST;
> +
>  	/* miscellaneous data */
>  	rdmsr(MSR_IA32_VMX_MISC,
>  		vmx->nested.nested_vmx_misc_low,
>
Bandan Das Feb. 24, 2015, 6:32 p.m. UTC | #2
Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 2015-02-24 17:30, Radim Krčmář wrote:
>> 2015-02-23 19:05+0100, Kashyap Chamarthy:
>>> Tested with the _correct_ Kernel[1] (that has Radim's patch) now --
>>> applied it on both L0 and L1.
>>>
>>> Result: Same as before -- Booting L2 causes L1 to reboot. However, the
>>>         stack trace from `dmesg` on L0 took a slightly different path than
>>>         before -- it's using MSR handling:
>> 
>> Thanks, the problem was deeper ... L1 enabled unrestricted mode while L0
>> had it disabled.  L1 could then vmrun a L2 state that L0 would have to
>> emulate, but that doesn't work.  There are at least these solutions:
>> 
>>  1) don't expose unrestricted_guest when L0 doesn't have it
>
> Reminds me of a patch called "KVM: nVMX: Disable unrestricted mode if
> ept=0" by Bandan. I thought that would have caught it - apparently not.

Yeah... Unrestricted guest could be disabled even if ept=1,
and I incorrectly didn't take that into account.

>>  2) fix unrestricted mode emulation code
>>  3) handle the failure without killing L1
>> 
>> I'd do just (1) -- emulating unrestricted mode is a loss.
>
> Agreed.
>
> Jan
>
>> 
>> I have done initial testing and at least qemu-sanity-check works now:
>> 
>> ---8<---
>> If EPT was enabled, unrestricted_guest was allowed in L1 regardless of
>> L0.  L1 triple faulted when running L2 guest that required emulation.
>> 
>> Another side effect was 'WARN_ON_ONCE(vmx->nested.nested_run_pending)'
>> in L0's dmesg:
>>   WARNING: CPU: 0 PID: 0 at arch/x86/kvm/vmx.c:9190 nested_vmx_vmexit+0x96e/0xb00 [kvm_intel] ()
>> 
>> Prevent this scenario by masking SECONDARY_EXEC_UNRESTRICTED_GUEST when
>> the host doesn't have it enabled.
>> 
>> Fixes: 78051e3b7e35 ("KVM: nVMX: Disable unrestricted mode if ept=0")
>> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

We should Cc stable on this patch.

Bandan
>> ---
>>  arch/x86/kvm/vmx.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index f7b20b417a3a..dbabea21357b 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -2476,8 +2476,7 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
>>  	if (enable_ept) {
>>  		/* nested EPT: emulate EPT also to L1 */
>>  		vmx->nested.nested_vmx_secondary_ctls_high |=
>> -			SECONDARY_EXEC_ENABLE_EPT |
>> -			SECONDARY_EXEC_UNRESTRICTED_GUEST;
>> +			SECONDARY_EXEC_ENABLE_EPT;
>>  		vmx->nested.nested_vmx_ept_caps = VMX_EPT_PAGE_WALK_4_BIT |
>>  			 VMX_EPTP_WB_BIT | VMX_EPT_2MB_PAGE_BIT |
>>  			 VMX_EPT_INVEPT_BIT;
>> @@ -2491,6 +2490,10 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
>>  	} else
>>  		vmx->nested.nested_vmx_ept_caps = 0;
>>  
>> +	if (enable_unrestricted_guest)
>> +		vmx->nested.nested_vmx_secondary_ctls_high |=
>> +			SECONDARY_EXEC_UNRESTRICTED_GUEST;
>> +
>>  	/* miscellaneous data */
>>  	rdmsr(MSR_IA32_VMX_MISC,
>>  		vmx->nested.nested_vmx_misc_low,
>> 
Kashyap Chamarthy Feb. 25, 2015, 3:50 p.m. UTC | #3
On Tue, Feb 24, 2015 at 05:30:06PM +0100, Radim Krčmář wrote:
> 2015-02-23 19:05+0100, Kashyap Chamarthy:
> > Tested with the _correct_ Kernel[1] (that has Radim's patch) now --
> > applied it on both L0 and L1.
> > 
> > Result: Same as before -- Booting L2 causes L1 to reboot. However, the
> >         stack trace from `dmesg` on L0 took a slightly different path than
> >         before -- it's using MSR handling:
> 
> Thanks, the problem was deeper ... L1 enabled unrestricted mode while L0
> had it disabled.  L1 could then vmrun a L2 state that L0 would have to
> emulate, but that doesn't work.  There are at least these solutions:
> 
>  1) don't expose unrestricted_guest when L0 doesn't have it
>  2) fix unrestricted mode emulation code
>  3) handle the failure without killing L1
> 
> I'd do just (1) -- emulating unrestricted mode is a loss.
> 
> I have done initial testing and at least qemu-sanity-check works now:
> 
> ---8<---
> If EPT was enabled, unrestricted_guest was allowed in L1 regardless of
> L0.  L1 triple faulted when running L2 guest that required emulation.
> 
> Another side effect was 'WARN_ON_ONCE(vmx->nested.nested_run_pending)'
> in L0's dmesg:
>   WARNING: CPU: 0 PID: 0 at arch/x86/kvm/vmx.c:9190 nested_vmx_vmexit+0x96e/0xb00 [kvm_intel] ()
> 
> Prevent this scenario by masking SECONDARY_EXEC_UNRESTRICTED_GUEST when
> the host doesn't have it enabled.
> 
> Fixes: 78051e3b7e35 ("KVM: nVMX: Disable unrestricted mode if ept=0")
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>


I just built[1] a Kernel with this patch and tested it on L0 and L1 and
can confirm, the patch fixes the issue -- Booting L2 does not cause L1
to reboot.

So:

    Tested-by: Kashyap Chamarthy <kchamart@redhat.com>

Thanks for investigating, Radim!

[1] https://kashyapc.fedorapeople.org/kernel-4.0.0-0.rc1.git1.1.kashyap1.fc23-with-nvmx-fix2-radim/
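
When repeating this kind of test, it can help to confirm what each level
actually has enabled. The following is only a minimal sketch, assuming the
kvm_intel module is loaded and exposes its `ept`, `unrestricted_guest` and
`nested` parameters under /sys/module/kvm_intel/parameters/. The helper
names `parse_kvm_bool` and `read_kvm_intel_param` are made up for
illustration and are not part of any kernel or QEMU interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Interpret the text of a boolean module parameter.
 * Returns 1 for "Y"/"1", 0 for "N"/"0", and -1 for anything else.
 */
int parse_kvm_bool(const char *s)
{
	if (s == NULL)
		return -1;

	switch (s[0]) {
	case 'Y': case 'y': case '1':
		return 1;
	case 'N': case 'n': case '0':
		return 0;
	default:
		return -1;
	}
}

/*
 * Read one kvm_intel parameter from sysfs (hypothetical helper).
 * Returns 1 or 0 for the parameter value, or -1 if the file cannot
 * be opened or read, e.g. when kvm_intel is not loaded.
 */
int read_kvm_intel_param(const char *name)
{
	char path[256];
	char buf[16];
	FILE *f;
	int n;

	assert(name != NULL);

	n = snprintf(path, sizeof(path),
		     "/sys/module/kvm_intel/parameters/%s", name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (f == NULL)
		return -1;

	if (fgets(buf, sizeof(buf), f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);

	return parse_kvm_bool(buf);
}
```

Calling `read_kvm_intel_param("unrestricted_guest")` on L0 and again inside
L1 gives the value each level sees, so the two can be compared; a result of
-1 means the value could not be determined.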


> ---
>  arch/x86/kvm/vmx.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index f7b20b417a3a..dbabea21357b 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2476,8 +2476,7 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
>  	if (enable_ept) {
>  		/* nested EPT: emulate EPT also to L1 */
>  		vmx->nested.nested_vmx_secondary_ctls_high |=
> -			SECONDARY_EXEC_ENABLE_EPT |
> -			SECONDARY_EXEC_UNRESTRICTED_GUEST;
> +			SECONDARY_EXEC_ENABLE_EPT;
>  		vmx->nested.nested_vmx_ept_caps = VMX_EPT_PAGE_WALK_4_BIT |
>  			 VMX_EPTP_WB_BIT | VMX_EPT_2MB_PAGE_BIT |
>  			 VMX_EPT_INVEPT_BIT;
> @@ -2491,6 +2490,10 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
>  	} else
>  		vmx->nested.nested_vmx_ept_caps = 0;
>  
> +	if (enable_unrestricted_guest)
> +		vmx->nested.nested_vmx_secondary_ctls_high |=
> +			SECONDARY_EXEC_UNRESTRICTED_GUEST;
> +
>  	/* miscellaneous data */
>  	rdmsr(MSR_IA32_VMX_MISC,
>  		vmx->nested.nested_vmx_misc_low,

Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f7b20b417a3a..dbabea21357b 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2476,8 +2476,7 @@  static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
 	if (enable_ept) {
 		/* nested EPT: emulate EPT also to L1 */
 		vmx->nested.nested_vmx_secondary_ctls_high |=
-			SECONDARY_EXEC_ENABLE_EPT |
-			SECONDARY_EXEC_UNRESTRICTED_GUEST;
+			SECONDARY_EXEC_ENABLE_EPT;
 		vmx->nested.nested_vmx_ept_caps = VMX_EPT_PAGE_WALK_4_BIT |
 			 VMX_EPTP_WB_BIT | VMX_EPT_2MB_PAGE_BIT |
 			 VMX_EPT_INVEPT_BIT;
@@ -2491,6 +2490,10 @@  static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
 	} else
 		vmx->nested.nested_vmx_ept_caps = 0;
 
+	if (enable_unrestricted_guest)
+		vmx->nested.nested_vmx_secondary_ctls_high |=
+			SECONDARY_EXEC_UNRESTRICTED_GUEST;
+
 	/* miscellaneous data */
 	rdmsr(MSR_IA32_VMX_MISC,
 		vmx->nested.nested_vmx_misc_low,
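
For readers following along, the effect of the patch above can be sketched
as a small userspace model. This is not kernel code: `struct l0_config` and
the functions `secondary_ctls_high_old`, `secondary_ctls_high_new` and
`run_self_check` are hypothetical names, and only the two control-bit names
come from the patch (their values here are the bit positions used by the
kernel's vmx.h). The old variant ties unrestricted guest to EPT; the new one
ties it to L0's own `enable_unrestricted_guest` setting. How the kernel
derives those two flags in the first place is outside this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Secondary processor-based VM-execution control bits. */
#define SECONDARY_EXEC_ENABLE_EPT          0x00000002u
#define SECONDARY_EXEC_UNRESTRICTED_GUEST  0x00000080u

/* Hypothetical stand-in for the relevant L0 settings. */
struct l0_config {
	bool enable_ept;
	bool enable_unrestricted_guest;
};

/* Before the patch: unrestricted guest is offered to L1 whenever EPT is on. */
uint32_t secondary_ctls_high_old(const struct l0_config *cfg)
{
	uint32_t high = 0;

	if (cfg->enable_ept)
		high |= SECONDARY_EXEC_ENABLE_EPT |
			SECONDARY_EXEC_UNRESTRICTED_GUEST;
	return high;
}

/* After the patch: each capability follows L0's own setting. */
uint32_t secondary_ctls_high_new(const struct l0_config *cfg)
{
	uint32_t high = 0;

	if (cfg->enable_ept)
		high |= SECONDARY_EXEC_ENABLE_EPT;
	if (cfg->enable_unrestricted_guest)
		high |= SECONDARY_EXEC_UNRESTRICTED_GUEST;
	return high;
}

/* Compare both variants on the configurations discussed in the thread. */
void run_self_check(void)
{
	struct l0_config ept_only = { .enable_ept = true,
				      .enable_unrestricted_guest = false };
	struct l0_config both = { .enable_ept = true,
				  .enable_unrestricted_guest = true };
	struct l0_config neither = { .enable_ept = false,
				     .enable_unrestricted_guest = false };

	/* The failing setup: EPT on, unrestricted guest off on L0. */
	assert(secondary_ctls_high_old(&ept_only) &
	       SECONDARY_EXEC_UNRESTRICTED_GUEST);
	assert(!(secondary_ctls_high_new(&ept_only) &
		 SECONDARY_EXEC_UNRESTRICTED_GUEST));

	/* The other two setups are unaffected by the change. */
	assert(secondary_ctls_high_old(&both) ==
	       secondary_ctls_high_new(&both));
	assert(secondary_ctls_high_old(&neither) ==
	       secondary_ctls_high_new(&neither));
}
```

In this model only the "EPT on, unrestricted guest off" configuration
changes, which matches the setup reported as failing in the thread.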