diff mbox

Use WARN_ON_ONCE for missing X86_FEATURE_NRIPS

Message ID 560D1C6E.2060803@suse.com (mailing list archive)
State New, archived
Headers show

Commit Message

Dirk Mueller Oct. 1, 2015, 11:43 a.m. UTC
The cpu feature flags are not ever going to change, so warning
everytime can cause a lot of kernel log spam
(in our case more than 10GB/hour).

Signed-off-by: Dirk Mueller <dmueller@suse.com>

Comments

Paolo Bonzini Oct. 1, 2015, 12:25 p.m. UTC | #1
On 01/10/2015 13:43, Dirk Müller wrote:
> The cpu feature flags are not ever going to change, so warning
> everytime can cause a lot of kernel log spam
> (in our case more than 10GB/hour).
> 
> Signed-off-by: Dirk Mueller <dmueller@suse.com>
> 

Applied.  Do you also know what caused the warning, and/or would you
like me to take a look?

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini Oct. 1, 2015, 12:31 p.m. UTC | #2
On 01/10/2015 13:43, Dirk Müller wrote:
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 94b7d15..0a42859 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -514,7 +514,7 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  
>  	if (svm->vmcb->control.next_rip != 0) {
> -		WARN_ON(!static_cpu_has(X86_FEATURE_NRIPS));
> +		WARN_ON_ONCE(!static_cpu_has(X86_FEATURE_NRIPS));
>  		svm->next_rip = svm->vmcb->control.next_rip;
>  	}
>  

Bandan, what was the reason for warning here?

Should we change the "if" condition to static_cpu_has(X86_FEATURE_NRIPS)
instead of Dirk's patch?

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dirk Mueller Oct. 1, 2015, 12:45 p.m. UTC | #3
On 01.10.2015 14:25, Paolo Bonzini wrote:

Hi Paolo,

 > Applied.  Do you also know what caused the warning, and/or would you
> like me to take a look?

The trace we're getting is this:

  [<ffffffff8100471d>] dump_trace+0x7d/0x2d0
  [<ffffffff81004a04>] show_stack_log_lvl+0x94/0x170
  [<ffffffff81005cc1>] show_stack+0x21/0x50
  [<ffffffff81510f42>] dump_stack+0x41/0x51
  [<ffffffff81055362>] warn_slowpath_common+0x82/0xc0
  [<ffffffffa0388cc1>] skip_emulated_instruction+0xe1/0x160 [kvm_amd]
  [<ffffffffa038908b>] io_interception+0x3b/0x80 [kvm_amd]
  [<ffffffffa031667c>] vcpu_enter_guest+0x6bc/0xc70 [kvm]
  [<ffffffffa031aa08>] kvm_arch_vcpu_ioctl_run+0x1d8/0x450 [kvm]
  [<ffffffffa0305ec1>] kvm_vcpu_ioctl+0x2e1/0x580 [kvm]
  [<ffffffff811b3f24>] do_vfs_ioctl+0x2d4/0x4b0
  [<ffffffff811b4188>] SyS_ioctl+0x88/0xa0
  [<ffffffff8151f209>] system_call_fastpath+0x16/0x1b
  [<00007f8f7b5fabe7>] 0x7f8f7b5fabe6


The colleague who was investigating this is not here today, but from 
what I remember it seems to only happen when nested virtualisation is 
being used. I don't know more about it right now, bisecting pointed out 
f104765b4f81fd74d69e0eb161e89096deade2db . Since it was logging so 
rapidly on our autotester machines (which are running 200-250 VMs 
nested) that we are continuously run out of disk space I just tried to 
come up with a stop-gap fix until this has been properly analyzed.

Thanks,
Dirk

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bandan Das Oct. 1, 2015, 10:31 p.m. UTC | #4
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 01/10/2015 13:43, Dirk Müller wrote:
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index 94b7d15..0a42859 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -514,7 +514,7 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
>>  	struct vcpu_svm *svm = to_svm(vcpu);
>>  
>>  	if (svm->vmcb->control.next_rip != 0) {
>> -		WARN_ON(!static_cpu_has(X86_FEATURE_NRIPS));
>> +		WARN_ON_ONCE(!static_cpu_has(X86_FEATURE_NRIPS));
>>  		svm->next_rip = svm->vmcb->control.next_rip;
>>  	}
>>  
>
> Bandan, what was the reason for warning here?

I added the warning so that we catch if the next_rip field is being written
to (even if the feature isn't supported) by a buggy L1 hypervisor.

From the commit:

 
-	if (svm->vmcb->control.next_rip != 0)
+	if (svm->vmcb->control.next_rip != 0) {
+		WARN_ON(!static_cpu_has(X86_FEATURE_NRIPS));
 		svm->next_rip = svm->vmcb->control.next_rip;
+	}
 
 	if (!svm->next_rip) {
 		if (emulate_instruction(vcpu, EMULTYPE_SKIP) !=
@@ -4317,7 +4319,9 @@ static int svm_check_intercept(struct kvm_vcpu *vcpu,
 		break;
 	}
 
-	vmcb->control.next_rip  = info->next_rip;
+	/* TODO: Advertise NRIPS to guest hypervisor unconditionally */
+	if (static_cpu_has(X86_FEATURE_NRIPS))
+		vmcb->control.next_rip  = info->next_rip;
 	vmcb->control.exit_code = icpt_info.exit_code;
 	vmexit = nested_svm_exit_handled(svm);
...

> Should we change the "if" condition to static_cpu_has(X86_FEATURE_NRIPS)
> instead of Dirk's patch?

Yes, seems ok to me. If decodeassist isn't supported then it's
mostly a stale value. It's interesting that that L1 still works even
after we hit this warning!

> Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dirk Mueller Oct. 2, 2015, 6:43 a.m. UTC | #5
> I added the warning so that we catch if the next_rip field is being written
> to (even if the feature isn't supported) by a buggy L1 hypervisor.

Interesting, so how about this patch?
Thanks,
Dirk
Bandan Das Oct. 5, 2015, 1:15 a.m. UTC | #6
Dirk Müller <dmueller@suse.com> writes:

>> I added the warning so that we catch if the next_rip field is being written
>> to (even if the feature isn't supported) by a buggy L1 hypervisor.
>
> Interesting, so how about this patch?
>
>
> From c5c8ea255d680f972cbdfc835cdf352fa78897ae Mon Sep 17 00:00:00 2001
> From: Dirk Mueller <dirk@dmllr.de>
> Date: Fri, 2 Oct 2015 08:35:24 +0200
> Subject: [PATCH] KVM: nSVM: Check for NRIP support before accepting
>  control.next_rip
>
> NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS], remove
> a WARN_ON_(once) and check for it directly.
>
> Signed-off-by: Dirk Mueller <dmueller@suse.com>
> ---
>  arch/x86/kvm/svm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 0a42859..33d36da 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -513,8 +513,8 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  
> -	if (svm->vmcb->control.next_rip != 0) {
> -		WARN_ON_ONCE(!static_cpu_has(X86_FEATURE_NRIPS));
> +	if (static_cpu_has(X86_FEATURE_NRIPS) &&
> +	    svm->vmcb->control.next_rip != 0) {
>  		svm->next_rip = svm->vmcb->control.next_rip;
>  	}

Ok, looks good to me. Still, probably a good idea to let the user know if this condition is
hit.

Bandan

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joerg Roedel Oct. 5, 2015, 9:50 a.m. UTC | #7
On Thu, Oct 01, 2015 at 06:31:27PM -0400, Bandan Das wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
> > On 01/10/2015 13:43, Dirk Müller wrote:
> >> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> >> index 94b7d15..0a42859 100644
> >> --- a/arch/x86/kvm/svm.c
> >> +++ b/arch/x86/kvm/svm.c
> >> @@ -514,7 +514,7 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
> >>  	struct vcpu_svm *svm = to_svm(vcpu);
> >>  
> >>  	if (svm->vmcb->control.next_rip != 0) {
> >> -		WARN_ON(!static_cpu_has(X86_FEATURE_NRIPS));
> >> +		WARN_ON_ONCE(!static_cpu_has(X86_FEATURE_NRIPS));
> >>  		svm->next_rip = svm->vmcb->control.next_rip;
> >>  	}
> >>  
> >
> > Bandan, what was the reason for warning here?
> 
> I added the warning so that we catch if the next_rip field is being written
> to (even if the feature isn't supported) by a buggy L1 hypervisor.

Even if the L1 hypervisor writes to the next_rip field in the VMCB, we
would never see it in this code path, as we access the shadow VMCB in
this statement.

We don't even care if the L1 hypervisor writes to its next_rip field
because we only write to this field on an emulatated VMEXIT and never
read it back.

So what's the point in adding a guest-triggerable warning at all?



	Joerg

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bandan Das Oct. 5, 2015, 4:54 p.m. UTC | #8
Joerg Roedel <joro@8bytes.org> writes:

> On Thu, Oct 01, 2015 at 06:31:27PM -0400, Bandan Das wrote:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>> 
>> > On 01/10/2015 13:43, Dirk Müller wrote:
>> >> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> >> index 94b7d15..0a42859 100644
>> >> --- a/arch/x86/kvm/svm.c
>> >> +++ b/arch/x86/kvm/svm.c
>> >> @@ -514,7 +514,7 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
>> >>  	struct vcpu_svm *svm = to_svm(vcpu);
>> >>  
>> >>  	if (svm->vmcb->control.next_rip != 0) {
>> >> -		WARN_ON(!static_cpu_has(X86_FEATURE_NRIPS));
>> >> +		WARN_ON_ONCE(!static_cpu_has(X86_FEATURE_NRIPS));
>> >>  		svm->next_rip = svm->vmcb->control.next_rip;
>> >>  	}
>> >>  
>> >
>> > Bandan, what was the reason for warning here?
>> 
>> I added the warning so that we catch if the next_rip field is being written
>> to (even if the feature isn't supported) by a buggy L1 hypervisor.
>
> Even if the L1 hypervisor writes to the next_rip field in the VMCB, we
> would never see it in this code path, as we access the shadow VMCB in
> this statement.
>
> We don't even care if the L1 hypervisor writes to its next_rip field
> because we only write to this field on an emulatated VMEXIT and never
> read it back.

The problems is that the next_rip field could be stale. If the processor supports
next_rip, then it will clear it out on the next entry. If it doesn't,
an old value just sits there (no matter who wrote it) and the problem
happens when skip_emulated_instruction advances the rip with an incorrect
value.

> So what's the point in adding a guest-triggerable warning at all?

So, yes, maybe this doesn't have to be a guest specific warning but we still
need to warn if this unsupported field is being written to.

>
>
> 	Joerg
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joerg Roedel Oct. 5, 2015, 5:15 p.m. UTC | #9
On Mon, Oct 05, 2015 at 12:54:43PM -0400, Bandan Das wrote:
> Joerg Roedel <joro@8bytes.org> writes:
> The problems is that the next_rip field could be stale. If the processor supports
> next_rip, then it will clear it out on the next entry. If it doesn't,
> an old value just sits there (no matter who wrote it) and the problem
> happens when skip_emulated_instruction advances the rip with an incorrect
> value.

So the right fix would be to just set the guests next_rip to 0 when we
emulate vmrun, just like real hardware does, no?


	Joerg

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bandan Das Oct. 5, 2015, 5:42 p.m. UTC | #10
Joerg Roedel <joro@8bytes.org> writes:

> On Mon, Oct 05, 2015 at 12:54:43PM -0400, Bandan Das wrote:
>> Joerg Roedel <joro@8bytes.org> writes:
>> The problems is that the next_rip field could be stale. If the processor supports
>> next_rip, then it will clear it out on the next entry. If it doesn't,
>> an old value just sits there (no matter who wrote it) and the problem
>> happens when skip_emulated_instruction advances the rip with an incorrect
>> value.
>
> So the right fix would be to just set the guests next_rip to 0 when we
> emulate vmrun, just like real hardware does, no?

Agreed, resetting to 0 if nrips isn't supported seems right. It would still
help having a printk_once in this case IMO :)

>
> 	Joerg
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dirk Mueller Oct. 5, 2015, 8:12 p.m. UTC | #11
> So the right fix would be to just set the guests next_rip to 0 when we
> emulate vmrun, just like real hardware does, no?

Like this? (Note: I’m not sure what I’m doing here..). I Agree with you that the warning
for this seems excessive, I’ve just removed it.
Thanks,
Dirk
Bandan Das Oct. 5, 2015, 10 p.m. UTC | #12
Dirk Müller <dmueller@suse.com> writes:

>> So the right fix would be to just set the guests next_rip to 0 when we
>> emulate vmrun, just like real hardware does, no?
>
> Like this? (Note: I’m not sure what I’m doing here..). I Agree with you that the warning
> for this seems excessive, I’ve just removed it. 
>
>
> From 47db81837b6fe58aa302317bbf2a092b40af0a36 Mon Sep 17 00:00:00 2001
> From: Dirk Mueller <dirk@dmllr.de>
> Date: Fri, 2 Oct 2015 08:35:24 +0200
> Subject: [PATCH] KVM: nSVM: Check for NRIP support before passing on
>  control.next_rip
>
> NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS], so
> ignore it if the cpu feature is not available. Remove the
> guest-triggerable WARN_ON for this as it just emulates real
> hardware behavior.
>
> Signed-off-by: Dirk Mueller <dmueller@suse.com>
> ---
>  arch/x86/kvm/svm.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 0a42859..66e3a4c 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -514,8 +514,10 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  
>  	if (svm->vmcb->control.next_rip != 0) {
> -		WARN_ON_ONCE(!static_cpu_has(X86_FEATURE_NRIPS));
> -		svm->next_rip = svm->vmcb->control.next_rip;
> +		if (static_cpu_has(X86_FEATURE_NRIPS))
> +			svm->next_rip = svm->vmcb->control.next_rip;
> +		else
> +			svm->next_rip = 0;

skip_emulated_instruction probably isn't the right place.

>  	}

So, L1 writes this field in svm_check_intercept.

I went back to the spec and it says (15.7.1):
The next sequential instruction pointer (nRIP) is saved in the guest VMCB
control area at location C8h on all #VMEXITs that are due to instruction
intercepts, as defined in Section 15.9 on page 458, as well as MSR and
IOIO intercepts and exceptions caused by the INT3, INTO, and BOUND
instructions. For all other intercepts, nRIP is reset to zero.

So, I think a better way would be to reset the field on vmexit from
L2 to L0 if NRIP isn't supported. Maybe it's enough to do this only for
the intercepts mentioned above but I am not sure.

Also, please include a debug message (pr_debug_once will do). It seems
unnecessary today, but in 2 years when someone is scratching his head why his
nested guests won't run, he will thank you!

BTW, it seems possible to unconditionally advertise SVM_FEATURE_NRIP..

>  	if (!svm->next_rip) {
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joerg Roedel Oct. 6, 2015, 10:23 a.m. UTC | #13
On Mon, Oct 05, 2015 at 01:42:43PM -0400, Bandan Das wrote:
> Joerg Roedel <joro@8bytes.org> writes:
> 
> > On Mon, Oct 05, 2015 at 12:54:43PM -0400, Bandan Das wrote:
> >> Joerg Roedel <joro@8bytes.org> writes:
> >> The problems is that the next_rip field could be stale. If the processor supports
> >> next_rip, then it will clear it out on the next entry. If it doesn't,
> >> an old value just sits there (no matter who wrote it) and the problem
> >> happens when skip_emulated_instruction advances the rip with an incorrect
> >> value.
> >
> > So the right fix would be to just set the guests next_rip to 0 when we
> > emulate vmrun, just like real hardware does, no?
> 
> Agreed, resetting to 0 if nrips isn't supported seems right. It would still
> help having a printk_once in this case IMO :)

I meant to reset it always to 0 on vmrun, like real hardware does.



	Joerg

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joerg Roedel Oct. 6, 2015, 10:28 a.m. UTC | #14
On Thu, Oct 01, 2015 at 06:31:27PM -0400, Bandan Das wrote:
> >> @@ -514,7 +514,7 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
> >>  	struct vcpu_svm *svm = to_svm(vcpu);
> >>  
> >>  	if (svm->vmcb->control.next_rip != 0) {
> >> -		WARN_ON(!static_cpu_has(X86_FEATURE_NRIPS));
> >> +		WARN_ON_ONCE(!static_cpu_has(X86_FEATURE_NRIPS));
> >>  		svm->next_rip = svm->vmcb->control.next_rip;
> >>  	}

I looked again how this could possibly be triggered, and I am somewhat
confused now.

So svm->vmcb->control.next_rip is only written by hardware or in
svm_check_intercept(). Both cases write only to this field, if the
hardware supports X86_FEATURE_NRIPS. The write in nested_svm_vmexit only
targets the guests VMCB, and we don't use that one again.

So I can't see how the WARN_ON above could be triggered. Do I miss
something or might this also be a miscompilation of static_cpu_has?


	Joerg

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bandan Das Oct. 6, 2015, 5:59 p.m. UTC | #15
Joerg Roedel <joro@8bytes.org> writes:

> On Thu, Oct 01, 2015 at 06:31:27PM -0400, Bandan Das wrote:
>> >> @@ -514,7 +514,7 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
>> >>  	struct vcpu_svm *svm = to_svm(vcpu);
>> >>  
>> >>  	if (svm->vmcb->control.next_rip != 0) {
>> >> -		WARN_ON(!static_cpu_has(X86_FEATURE_NRIPS));
>> >> +		WARN_ON_ONCE(!static_cpu_has(X86_FEATURE_NRIPS));
>> >>  		svm->next_rip = svm->vmcb->control.next_rip;
>> >>  	}
>
> I looked again how this could possibly be triggered, and I am somewhat
> confused now.
>
> So svm->vmcb->control.next_rip is only written by hardware or in
> svm_check_intercept(). Both cases write only to this field, if the
> hardware supports X86_FEATURE_NRIPS. The write in nested_svm_vmexit only

Not until commit f104765b4f81fd74d69e0eb161e89096deade2db. So, an older L1
kernel will trigger it.

> targets the guests VMCB, and we don't use that one again.
>
> So I can't see how the WARN_ON above could be triggered. Do I miss
> something or might this also be a miscompilation of static_cpu_has?
>
>
> 	Joerg
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bandan Das Oct. 6, 2015, 6:02 p.m. UTC | #16
Joerg Roedel <joro@8bytes.org> writes:

> On Mon, Oct 05, 2015 at 01:42:43PM -0400, Bandan Das wrote:
>> Joerg Roedel <joro@8bytes.org> writes:
>> 
>> > On Mon, Oct 05, 2015 at 12:54:43PM -0400, Bandan Das wrote:
>> >> Joerg Roedel <joro@8bytes.org> writes:
>> >> The problems is that the next_rip field could be stale. If the processor supports
>> >> next_rip, then it will clear it out on the next entry. If it doesn't,
>> >> an old value just sits there (no matter who wrote it) and the problem
>> >> happens when skip_emulated_instruction advances the rip with an incorrect
>> >> value.
>> >
>> > So the right fix would be to just set the guests next_rip to 0 when we
>> > emulate vmrun, just like real hardware does, no?
>> 
>> Agreed, resetting to 0 if nrips isn't supported seems right. It would still
>> help having a printk_once in this case IMO :)
>
> I meant to reset it always to 0 on vmrun, like real hardware does.

Atleast the spec don't mention this, I don't know how I got that idea :) The spec
just say that it gets written to by hardware on certain intercepts and for others
it gets reset to 0 on #VMEXIT.

>
>
> 	Joerg
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joerg Roedel Oct. 7, 2015, 11:03 a.m. UTC | #17
On Tue, Oct 06, 2015 at 01:59:27PM -0400, Bandan Das wrote:
> Joerg Roedel <joro@8bytes.org> writes:
> >
> > So svm->vmcb->control.next_rip is only written by hardware or in
> > svm_check_intercept(). Both cases write only to this field, if the
> > hardware supports X86_FEATURE_NRIPS. The write in nested_svm_vmexit only
> 
> Not until commit f104765b4f81fd74d69e0eb161e89096deade2db. So, an older L1
> kernel will trigger it.

But we don't care if L1 writes something into its own next_rip, as we
never read this value from its VMCB. We only copy the next_rip value we
get from our shadow-vmcb to it on an emulated vmexit. So I still don't
understand what triggers the reported problem or why the WARN_ON is
necessary.


	Joerg

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bandan Das Oct. 7, 2015, 2:58 p.m. UTC | #18
Joerg Roedel <joro@8bytes.org> writes:

> On Tue, Oct 06, 2015 at 01:59:27PM -0400, Bandan Das wrote:
>> Joerg Roedel <joro@8bytes.org> writes:
>> >
>> > So svm->vmcb->control.next_rip is only written by hardware or in
>> > svm_check_intercept(). Both cases write only to this field, if the
>> > hardware supports X86_FEATURE_NRIPS. The write in nested_svm_vmexit only
>> 
>> Not until commit f104765b4f81fd74d69e0eb161e89096deade2db. So, an older L1
>> kernel will trigger it.
>
> But we don't care if L1 writes something into its own next_rip, as we
> never read this value from its VMCB. We only copy the next_rip value we
> get from our shadow-vmcb to it on an emulated vmexit. So I still don't
> understand what triggers the reported problem or why the WARN_ON is
> necessary.

Ok, looks like I am making some incorrect "vmx" assumptions here. What happens
when we exit from L2 to L0, arent' we looking at the VMCB L1 is using to run
L2 ? Wouldn't that trigger the warning if the host processor does not support
nrips and the field is set ?


>
> 	Joerg
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joerg Roedel Oct. 7, 2015, 3:24 p.m. UTC | #19
On Wed, Oct 07, 2015 at 10:58:07AM -0400, Bandan Das wrote:
> Ok, looks like I am making some incorrect "vmx" assumptions here. What happens
> when we exit from L2 to L0, arent' we looking at the VMCB L1 is using to run
> L2 ? Wouldn't that trigger the warning if the host processor does not support
> nrips and the field is set ?

No, because the L1 hypervisor can't write to the shadow-vmcb, and this
is the only one where we _read_ next_rip from.



	Joerg

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

From 40c7213e34a1a1caf0b98db832c06c593f888b11 Mon Sep 17 00:00:00 2001
From: Dirk Mueller <dirk@dmllr.de>
Date: Thu, 1 Oct 2015 13:10:10 +0200
Subject: [PATCH] Use WARN_ON_ONCE for missing X86_FEATURE_NRIPS

The cpu feature flags are not ever going to change, so warning
everytime can cause a lot of kernel log spam
(in our case more than 10GB/hour).

Signed-off-by: Dirk Mueller <dmueller@suse.com>
---
 arch/x86/kvm/svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 94b7d15..0a42859 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -514,7 +514,7 @@  static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	if (svm->vmcb->control.next_rip != 0) {
-		WARN_ON(!static_cpu_has(X86_FEATURE_NRIPS));
+		WARN_ON_ONCE(!static_cpu_has(X86_FEATURE_NRIPS));
 		svm->next_rip = svm->vmcb->control.next_rip;
 	}
 
-- 
2.5.1