diff mbox series

[3/3] KVM: SVM: allow to intercept all exceptions for debug

Message ID 20210315221020.661693-4-mlevitsk@redhat.com (mailing list archive)
State New, archived
Headers show
Series KVM: my debug patch queue | expand

Commit Message

Maxim Levitsky March 15, 2021, 10:10 p.m. UTC
Add a new debug module param 'debug_intercept_exceptions' which will allow the
KVM to intercept any guest exception, and forward it to the guest.

This can be very useful for guest debugging and/or KVM debugging with kvm trace.
This is not intended to be used on production systems.

This is based on an idea first shown here:
https://patchwork.kernel.org/project/kvm/patch/20160301192822.GD22677@pd.tnic/

CC: Borislav Petkov <bp@suse.de>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  2 +
 arch/x86/kvm/svm/svm.c          | 77 ++++++++++++++++++++++++++++++++-
 arch/x86/kvm/svm/svm.h          |  5 ++-
 arch/x86/kvm/x86.c              |  5 ++-
 4 files changed, 85 insertions(+), 4 deletions(-)

Comments

Joerg Roedel March 16, 2021, 8:32 a.m. UTC | #1
Hi Maxim,

On Tue, Mar 16, 2021 at 12:10:20AM +0200, Maxim Levitsky wrote:
> -static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
> +static int (*svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {

Can you keep this const and always set the necessary handlers? If
exceptions are not intercepted they will not be used.

> @@ -333,7 +334,9 @@ static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
>  	struct vmcb *vmcb = svm->vmcb01.ptr;
>  
>  	WARN_ON_ONCE(bit >= 32);
> -	vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
> +
> +	if (!((1 << bit) & debug_intercept_exceptions))
> +		vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);

This will break SEV-ES guests, as those will not cause an intercept but
now start to get #VC exceptions on every other exception that is raised.
SEV-ES guests are not prepared for that and will not even boot, so
please don't enable this feature for them.
Borislav Petkov March 16, 2021, 8:34 a.m. UTC | #2
On Tue, Mar 16, 2021 at 12:10:20AM +0200, Maxim Levitsky wrote:
> Add a new debug module param 'debug_intercept_exceptions' which will allow the
> KVM to intercept any guest exception, and forward it to the guest.
> 
> This can be very useful for guest debugging and/or KVM debugging with kvm trace.
> This is not intended to be used on production systems.
> 
> This is based on an idea first shown here:
> https://patchwork.kernel.org/project/kvm/patch/20160301192822.GD22677@pd.tnic/
> 
> CC: Borislav Petkov <bp@suse.de>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>

> ---
>  arch/x86/include/asm/kvm_host.h |  2 +
>  arch/x86/kvm/svm/svm.c          | 77 ++++++++++++++++++++++++++++++++-
>  arch/x86/kvm/svm/svm.h          |  5 ++-
>  arch/x86/kvm/x86.c              |  5 ++-
>  4 files changed, 85 insertions(+), 4 deletions(-)

Looks interesting, I'll give it a try when I get a chance in the coming days.

Thx!
Maxim Levitsky March 16, 2021, 10:51 a.m. UTC | #3
On Tue, 2021-03-16 at 09:32 +0100, Joerg Roedel wrote:
> Hi Maxim,
> 
> On Tue, Mar 16, 2021 at 12:10:20AM +0200, Maxim Levitsky wrote:
> > -static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
> > +static int (*svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
> 
> Can you keep this const and always set the necessary handlers? If
> exceptions are not intercepted they will not be used.
> 
> > @@ -333,7 +334,9 @@ static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
> >  	struct vmcb *vmcb = svm->vmcb01.ptr;
> >  
> >  	WARN_ON_ONCE(bit >= 32);
> > -	vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
> > +
> > +	if (!((1 << bit) & debug_intercept_exceptions))
> > +		vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
> 
> This will break SEV-ES guests, as those will not cause an intercept but
> now start to get #VC exceptions on every other exception that is raised.
> SEV-ES guests are not prepared for that and will not even boot, so
> please don't enable this feature for them.

I agree but what is wrong with that? 
This is a debug feature, and it only can be enabled by the root,
and so someone might actually want this case to happen
(e.g to see if a SEV guest can cope with extra #VC exceptions).

I have nothing against not allowing this for SEV-ES guests though.
What do you think?


Best regards,
	Maxim Levitsky
Joerg Roedel March 18, 2021, 9:19 a.m. UTC | #4
On Tue, Mar 16, 2021 at 12:51:20PM +0200, Maxim Levitsky wrote:
> I agree but what is wrong with that? 
> This is a debug feature, and it only can be enabled by the root,
> and so someone might actually want this case to happen
> (e.g to see if a SEV guest can cope with extra #VC exceptions).

That doesn't make sense, we know that and SEV-ES guest can't cope with
extra #VC exceptions, so there is no point in testing this. It is more a
way to shot oneself into the foot for the user and a potential source of
bug reports for SEV-ES guests.


> I have nothing against not allowing this for SEV-ES guests though.
> What do you think?

I think SEV-ES guests should only have the intercept bits set which
guests acutally support.

Regards,

	Joerg
Maxim Levitsky March 18, 2021, 9:24 a.m. UTC | #5
On Thu, 2021-03-18 at 10:19 +0100, Joerg Roedel wrote:
> On Tue, Mar 16, 2021 at 12:51:20PM +0200, Maxim Levitsky wrote:
> > I agree but what is wrong with that? 
> > This is a debug feature, and it only can be enabled by the root,
> > and so someone might actually want this case to happen
> > (e.g to see if a SEV guest can cope with extra #VC exceptions).
> 
> That doesn't make sense, we know that and SEV-ES guest can't cope with
> extra #VC exceptions, so there is no point in testing this. It is more a
> way to shot oneself into the foot for the user and a potential source of
> bug reports for SEV-ES guests.

But again this is a debug feature, and it is intended to allow the user
to shoot himself in the foot. Bug reports for a debug feature
are autoclosed. It is no different from say poking kernel memory with
its built-in gdbstub, for example.

Best regards,
	Maxim Levitsky

> 
> 
> > I have nothing against not allowing this for SEV-ES guests though.
> > What do you think?
> 
> I think SEV-ES guests should only have the intercept bits set which
> guests acutally support

> 
> Regards,
> 
> 	Joerg
>
Joerg Roedel March 18, 2021, 3:47 p.m. UTC | #6
On Thu, Mar 18, 2021 at 11:24:25AM +0200, Maxim Levitsky wrote:
> But again this is a debug feature, and it is intended to allow the user
> to shoot himself in the foot.

And one can't debug SEV-ES guests with it, so what is the point of
enabling it for them too?

Regards,

	Joerg
Sean Christopherson March 18, 2021, 4:35 p.m. UTC | #7
On Thu, Mar 18, 2021, Joerg Roedel wrote:
> On Thu, Mar 18, 2021 at 11:24:25AM +0200, Maxim Levitsky wrote:
> > But again this is a debug feature, and it is intended to allow the user
> > to shoot himself in the foot.
> 
> And one can't debug SEV-ES guests with it, so what is the point of
> enabling it for them too?

Agreed.  I can see myself enabling debug features by default, it would be nice
to not having to go out of my way to disable them for SEV-ES/SNP guests.

Skipping SEV-ES guests should not be difficult; KVM could probably even
print a message stating that the debug hook is being ignored.  One thought would
be to snapshot debug_intercept_exceptions at VM creation, and simply zero it out
for incompatible guests.  That would also allow changing debug_intercept_exceptions
without reloading KVM, which IMO would be very convenient.
Maxim Levitsky March 18, 2021, 4:41 p.m. UTC | #8
On Thu, 2021-03-18 at 16:35 +0000, Sean Christopherson wrote:
> On Thu, Mar 18, 2021, Joerg Roedel wrote:
> > On Thu, Mar 18, 2021 at 11:24:25AM +0200, Maxim Levitsky wrote:
> > > But again this is a debug feature, and it is intended to allow the user
> > > to shoot himself in the foot.
> > 
> > And one can't debug SEV-ES guests with it, so what is the point of
> > enabling it for them too?
You can create a special SEV-ES guest which does handle all exceptions via
#VC, or just observe it fail which can be useful for some whatever reason.
> 
> Agreed.  I can see myself enabling debug features by default, it would be nice
> to not having to go out of my way to disable them for SEV-ES/SNP guests.
This does sound like a valid reason to disable this for SEV-ES.

> 
> Skipping SEV-ES guests should not be difficult; KVM could probably even
> print a message stating that the debug hook is being ignored.  One thought would
> be to snapshot debug_intercept_exceptions at VM creation, and simply zero it out
> for incompatible guests.  That would also allow changing debug_intercept_exceptions
> without reloading KVM, which IMO would be very convenient.
> 
So all right I'll disable this for SEV-ES. 
The idea to change the debug_intercept_exceptions on the fly is also a good idea,
I will implement it in next version of the patches.

Thanks for the review,
Best regards,
	Maxim Levitsky
Sean Christopherson March 18, 2021, 5:22 p.m. UTC | #9
On Thu, Mar 18, 2021, Maxim Levitsky wrote:
> On Thu, 2021-03-18 at 16:35 +0000, Sean Christopherson wrote:
> > Skipping SEV-ES guests should not be difficult; KVM could probably even
> > print a message stating that the debug hook is being ignored.  One thought would
> > be to snapshot debug_intercept_exceptions at VM creation, and simply zero it out
> > for incompatible guests.  That would also allow changing debug_intercept_exceptions
> > without reloading KVM, which IMO would be very convenient.
> > 
> So all right I'll disable this for SEV-ES. 

Belated thought.  KVM doesn't know a guest will be an SEV-ES guest until
sev_es_guest_init(), and KVM currently doesn't prevent creating vCPUs before
KVM_SEV_ES_INIT.  But, I'm 99% confident that's a KVM bug.  For your purposes,
I think you can assume kvm->arch.debug_intercept_exceptions will _not_ change
after vCPU creation.

> The idea to change the debug_intercept_exceptions on the fly is also a good idea,
> I will implement it in next version of the patches.

Can you also move the module param to x86?  It doesn't need to be wired up for
VMX right away, but it makes sense to do it at some point, and ideally folks
won't have to update their scripts when that happens.
diff mbox series

Patch

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a52f973bdff6d..c8f44a88b3153 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1564,6 +1564,8 @@  int kvm_emulate_rdpmc(struct kvm_vcpu *vcpu);
 void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr);
 void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
 void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr, unsigned long payload);
+void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr,
+			     u32 error_code, unsigned long payload);
 void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr);
 void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
 void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 271196400495f..94156a367a663 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -197,6 +197,9 @@  module_param(sev_es, int, 0444);
 bool __read_mostly dump_invalid_vmcb;
 module_param(dump_invalid_vmcb, bool, 0644);
 
+uint debug_intercept_exceptions;
+module_param(debug_intercept_exceptions, uint, 0444);
+
 static bool svm_gp_erratum_intercept = true;
 
 static u8 rsm_ins_bytes[] = "\x0f\xaa";
@@ -220,6 +223,8 @@  static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
 #define MSRS_RANGE_SIZE 2048
 #define MSRS_IN_RANGE (MSRS_RANGE_SIZE * 8 / 2)
 
+static void init_debug_exceptions_intercept(struct vcpu_svm *svm);
+
 u32 svm_msrpm_offset(u32 msr)
 {
 	u32 offset;
@@ -1137,6 +1142,8 @@  static void init_vmcb(struct kvm_vcpu *vcpu)
 	set_exception_intercept(svm, MC_VECTOR);
 	set_exception_intercept(svm, AC_VECTOR);
 	set_exception_intercept(svm, DB_VECTOR);
+
+	init_debug_exceptions_intercept(svm);
 	/*
 	 * Guest access to VMware backdoor ports could legitimately
 	 * trigger #GP because of TSS I/O permission bitmap.
@@ -1913,6 +1920,17 @@  static int pf_interception(struct kvm_vcpu *vcpu)
 	u64 fault_address = svm->vmcb->control.exit_info_2;
 	u64 error_code = svm->vmcb->control.exit_info_1;
 
+	if ((debug_intercept_exceptions & (1 << PF_VECTOR)))
+		if (npt_enabled && !vcpu->arch.apf.host_apf_flags) {
+			/* If #PF was only intercepted for debug, inject
+			 * it directly to the guest, since the mmu code
+			 * is not ready to deal with such page faults
+			 */
+			kvm_queue_exception_e_p(vcpu, PF_VECTOR,
+						error_code, fault_address);
+			return 1;
+		}
+
 	return kvm_handle_page_fault(vcpu, error_code, fault_address,
 			static_cpu_has(X86_FEATURE_DECODEASSISTS) ?
 			svm->vmcb->control.insn_bytes : NULL,
@@ -3025,7 +3043,7 @@  static int invpcid_interception(struct kvm_vcpu *vcpu)
 	return kvm_handle_invpcid(vcpu, type, gva);
 }
 
-static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
+static int (*svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 	[SVM_EXIT_READ_CR0]			= cr_interception,
 	[SVM_EXIT_READ_CR3]			= cr_interception,
 	[SVM_EXIT_READ_CR4]			= cr_interception,
@@ -3099,6 +3117,63 @@  static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 	[SVM_EXIT_VMGEXIT]			= sev_handle_vmgexit,
 };
 
+static int generic_exception_interception(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * Generic exception handler which forwards a guest exception
+	 * as-is to the guest.
+	 * For exceptions that don't have a special intercept handler.
+	 *
+	 * Used for 'debug_intercept_exceptions' KVM debug feature only.
+	 */
+	struct vcpu_svm *svm = to_svm(vcpu);
+	int exc = svm->vmcb->control.exit_code - SVM_EXIT_EXCP_BASE;
+
+	WARN_ON(exc < 0 || exc > 31);
+
+	if (exc == TS_VECTOR) {
+		/*
+		 * SVM doesn't provide us with an error code to be able to
+		 * re-inject the #TS exception, so just disable its
+		 * interception, and let the guest re-execute the instruction.
+		 */
+		vmcb_clr_intercept(&svm->vmcb01.ptr->control,
+				   INTERCEPT_EXCEPTION_OFFSET + TS_VECTOR);
+		recalc_intercepts(svm);
+		return 1;
+	} else if (exc == DF_VECTOR) {
+		/* SVM doesn't provide us with an error code for the #DF */
+		kvm_queue_exception_e(vcpu, exc, 0);
+		return 1;
+	}
+
+	if (x86_exception_has_error_code(exc))
+		kvm_queue_exception_e(vcpu, exc, svm->vmcb->control.exit_info_1);
+	else
+		kvm_queue_exception(vcpu, exc);
+	return 1;
+}
+
+static void init_debug_exceptions_intercept(struct vcpu_svm *svm)
+{
+	int exc;
+
+	for (exc = 0 ; exc < 32 ; exc++) {
+		if (!(debug_intercept_exceptions & (1 << exc)))
+			continue;
+
+		/* Those are defined to have undefined behavior in the SVM spec */
+		if (exc == 2 || exc == 9)
+			continue;
+
+		set_exception_intercept(svm, exc);
+
+		if (!svm_exit_handlers[SVM_EXIT_EXCP_BASE + exc])
+			svm_exit_handlers[SVM_EXIT_EXCP_BASE + exc] =
+					generic_exception_interception;
+	}
+}
+
 static void dump_vmcb(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 8e276c4fb33df..e0ff9ca996df8 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -32,6 +32,7 @@  static const u32 host_save_user_msrs[] = {
 #define MSRPM_OFFSETS	16
 extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
 extern bool npt_enabled;
+extern uint debug_intercept_exceptions;
 
 enum {
 	VMCB_INTERCEPTS, /* Intercept vectors, TSC offset,
@@ -333,7 +334,9 @@  static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
 	struct vmcb *vmcb = svm->vmcb01.ptr;
 
 	WARN_ON_ONCE(bit >= 32);
-	vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
+
+	if (!((1 << bit) & debug_intercept_exceptions))
+		vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
 
 	recalc_intercepts(svm);
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b75d990fcf12b..be509944622bd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -627,12 +627,13 @@  void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr,
 }
 EXPORT_SYMBOL_GPL(kvm_queue_exception_p);
 
-static void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr,
-				    u32 error_code, unsigned long payload)
+void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr,
+			     u32 error_code, unsigned long payload)
 {
 	kvm_multiple_exception(vcpu, nr, true, error_code,
 			       true, payload, false);
 }
+EXPORT_SYMBOL_GPL(kvm_queue_exception_e_p);
 
 int kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err)
 {