
[41/62] x86/sev-es: Handle MSR events

Message ID: 20200211135256.24617-42-joro@8bytes.org
State: New, archived
Series: Linux as SEV-ES Guest Support

Commit Message

Joerg Roedel Feb. 11, 2020, 1:52 p.m. UTC
From: Tom Lendacky <thomas.lendacky@amd.com>

Implement a handler for #VC exceptions caused by RDMSR/WRMSR
instructions.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Adapt to #VC handling infrastructure ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/kernel/sev-es.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

Comments

Dave Hansen Feb. 13, 2020, 3:45 p.m. UTC | #1
On 2/11/20 5:52 AM, Joerg Roedel wrote:
> Implement a handler for #VC exceptions caused by RDMSR/WRMSR
> instructions.

As a general comment on all of these event handlers: Why do we bother
having the hypercalls in the interrupt handler as opposed to just
calling them directly?  What you have is:

	wrmsr()
	-> #VC exception
	   hcall()

But we could make our rd/wrmsr() wrappers just do:

	if (running_on_sev_es())
		hcall(HCALL_MSR_WHATEVER...)
	else
		wrmsr()

and then we don't have any of the nastiness of exception handling.
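
Spelled out in C, such a wrapper might look like the sketch below. The
names running_on_sev_es() and sev_es_ghcb_wrmsr() are hypothetical and
used only for illustration; they are not helpers from this series.

	#include <asm/msr.h>	/* native_wrmsr() */

	/*
	 * Minimal sketch of the runtime-check idea above. Both helpers are
	 * hypothetical: running_on_sev_es() would report whether the guest
	 * runs with SEV-ES enabled, and sev_es_ghcb_wrmsr() would perform
	 * the MSR write through the GHCB hypercall path instead of letting
	 * the WRMSR instruction fault into the #VC handler.
	 */
	static inline void sev_aware_wrmsr(unsigned int msr, u32 low, u32 high)
	{
		if (running_on_sev_es())
			sev_es_ghcb_wrmsr(msr, low, high);	/* hcall via the GHCB */
		else
			native_wrmsr(msr, low, high);		/* plain WRMSR */
	}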
Joerg Roedel Feb. 14, 2020, 7:23 a.m. UTC | #2
On Thu, Feb 13, 2020 at 07:45:00AM -0800, Dave Hansen wrote:
> On 2/11/20 5:52 AM, Joerg Roedel wrote:
> > Implement a handler for #VC exceptions caused by RDMSR/WRMSR
> > instructions.
> 
> As a general comment on all of these event handlers: Why do we bother
> having the hypercalls in the interrupt handler as opposed to just
> calling them directly?  What you have is:
> 
> 	wrmsr()
> 	-> #VC exception
> 	   hcall()
> 
> But we could make our rd/wrmsr() wrappers just do:
> 
> 	if (running_on_sev_es())
> 		hcall(HCALL_MSR_WHATEVER...)
> 	else
> 		wrmsr()
> 
> and then we don't have any of the nastiness of exception handling.

Yes, investigating this is on the list for future optimizations (besides
caching CPUID results). My idea is to use alternatives patching for
this. But the exception handling is needed anyway because #VC
exceptions happen very early already, basically the first thing after
setting up a stack is calling verify_cpu(), which uses CPUID.
The other reason is that things like MMIO and IOIO instructions can't be
easily patched by alternatives. Those would work with the runtime
checking you showed above, though.
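
A rough sketch of what the alternatives-patched variant could look
like is below. The feature bit name is an assumption for illustration
(a cpu_feature flag for SEV-ES is only set later in the boot flow, see
further down the thread), and sev_es_ghcb_wrmsr() is again a made-up
helper:

	#include <asm/cpufeature.h>	/* cpu_feature_enabled() */
	#include <asm/msr.h>		/* native_wrmsr() */

	/*
	 * Sketch only: cpu_feature_enabled() compiles down to code that is
	 * rewritten by the alternatives mechanism at boot, so the
	 * non-SEV-ES path is reduced to a patched jump/NOPs rather than a
	 * runtime function call. X86_FEATURE_SEV_ES and sev_es_ghcb_wrmsr()
	 * are assumed names, not definitions from this patch set.
	 */
	static inline void sev_aware_wrmsr(unsigned int msr, u32 low, u32 high)
	{
		if (cpu_feature_enabled(X86_FEATURE_SEV_ES))
			sev_es_ghcb_wrmsr(msr, low, high);
		else
			native_wrmsr(msr, low, high);
	}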

Regards,

	Joerg
Dave Hansen Feb. 14, 2020, 4:59 p.m. UTC | #3
On 2/13/20 11:23 PM, Joerg Roedel wrote:
> Yes, investigating this is on the list for future optimizations (besides
> caching CPUID results). My idea is to use alternatives patching for
> this. But the exception handling is needed anyway because #VC
> exceptions happen very early already, basically the first thing after
> setting up a stack is calling verify_cpu(), which uses CPUID.

Ahh, bummer.  How does a guest know that it's running under SEV-ES?
What's the enumeration mechanism if CPUID doesn't "work"?

> The other reason is that things like MMIO and IOIO instructions can't be
> easily patched by alternatives. Those would work with the runtime
> checking you showed above, though.

Is there a reason we can't make a rule that you *must* do MMIO through
an accessor function so we *can* patch them?  I know random drivers
might break the rule, but are SEV-ES guests going to be running random
drivers?  I would think that they mostly if not all want to use virtio.
Joerg Roedel Feb. 15, 2020, 12:45 p.m. UTC | #4
On Fri, Feb 14, 2020 at 08:59:39AM -0800, Dave Hansen wrote:
> On 2/13/20 11:23 PM, Joerg Roedel wrote:
> > Yes, investigating this is on the list for future optimizations (besides
> > caching CPUID results). My idea is to use alternatives patching for
> > this. But the exception handling is needed anyway because #VC
> > exceptions happen very early already, basically the first thing after
> > setting up a stack is calling verify_cpu(), which uses CPUID.
> 
> Ahh, bummer.  How does a guest know that it's running under SEV-ES?
> What's the enumeration mechanism if CPUID doesn't "work"?

There are two ways a guest can find out:

	1) Read the SEV_STATUS_MSR and check for the SEV-ES bit
	2) If a #VC exception is raised it also knows it runs as an
	   SEV-ES guest

This patch-set implements both ways at the appropriate stages of the
boot process. Very early it just installs a #VC handler without checking
whether it is running under SEV-ES and handles the exceptions when they
are raised.

Later in the boot process it also reads the SEV_STATUS_MSR and sets a
cpu_feature flag to do alternative patching based on its value.
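
For reference, check 1) could be expressed roughly as below.
MSR_AMD64_SEV is the SEV status MSR (0xc0010131); the SEV-ES-enabled
bit (bit 1) is assumed to be exposed as MSR_AMD64_SEV_ES_ENABLED by
this series:

	#include <asm/msr.h>		/* rdmsrl_safe() */
	#include <asm/msr-index.h>	/* MSR_AMD64_SEV */

	/*
	 * Sketch of check 1) above. Bit 0 of the SEV status MSR reports SEV,
	 * bit 1 reports SEV-ES. rdmsrl_safe() is used so the read does not
	 * crash on a machine where the MSR does not exist.
	 */
	static bool sev_es_guest(void)
	{
		u64 status;

		if (rdmsrl_safe(MSR_AMD64_SEV, &status))
			return false;

		return !!(status & MSR_AMD64_SEV_ES_ENABLED);
	}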

> > The other reason is that things like MMIO and IOIO instructions can't be
> > easily patched by alternatives. Those would work with the runtime
> > checking you showed above, though.
> 
> Is there a reason we can't make a rule that you *must* do MMIO through
> an accessor function so we *can* patch them?  I know random drivers
> might break the rule, but are SEV-ES guests going to be running random
> drivers?  I would think that they mostly if not all want to use
> virtio.

Yeah, there are already defined accessor functions for MMIO, like
read/write[bwlq] and memcpy_toio/memcpy_fromio. It is probably worth
testing what performance overhead is involved in overloading these to
call directly into the paravirt path when SEV-ES is enabled. With
alternatives patching it would still add a couple of NOPs for the
non-SEV-ES case.
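
An overloaded MMIO accessor could look something like the sketch
below; sev_es_mmio_read32() is a made-up stand-in for a GHCB-based
emulation helper, and the feature check is the same alternatives-patched
test as in the wrmsr sketch earlier in the thread:

	#include <asm/cpufeature.h>	/* cpu_feature_enabled() */
	#include <asm/io.h>		/* readl() */

	/*
	 * Sketch of an overloaded MMIO accessor. readl() is the existing
	 * native accessor; sev_es_mmio_read32() is a hypothetical helper
	 * that would go through the GHCB instead of faulting into the
	 * #VC handler.
	 */
	static inline u32 sev_aware_readl(const volatile void __iomem *addr)
	{
		if (cpu_feature_enabled(X86_FEATURE_SEV_ES))
			return sev_es_mmio_read32(addr);

		return readl(addr);
	}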

But all that does not remove the need for the #VC exception handler, as
#VC exceptions can also be triggered by user-space, and the instruction
emulation for MMIO will be needed to allow MMIO in user-space (the
patch-set currently does not allow that, but it could be needed in the
future).

Regards,

	Joerg

Patch

diff --git a/arch/x86/kernel/sev-es.c b/arch/x86/kernel/sev-es.c
index 84b5b8f7897a..b27d5b0a8ae1 100644
--- a/arch/x86/kernel/sev-es.c
+++ b/arch/x86/kernel/sev-es.c
@@ -134,6 +134,35 @@  static phys_addr_t es_slow_virt_to_phys(struct ghcb *ghcb, long vaddr)
 /* Include code shared with pre-decompression boot stage */
 #include "sev-es-shared.c"
 
+static enum es_result handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
+{
+	struct pt_regs *regs = ctxt->regs;
+	enum es_result ret;
+	bool write;
+	u64 exit_info_1;
+
+	write = (ctxt->insn.opcode.bytes[1] == 0x30);
+
+	ghcb_set_rcx(ghcb, regs->cx);
+	if (write) {
+		ghcb_set_rax(ghcb, regs->ax);
+		ghcb_set_rdx(ghcb, regs->dx);
+		exit_info_1 = 1;
+	} else {
+		exit_info_1 = 0;
+	}
+
+	ret = ghcb_hv_call(ghcb, ctxt, SVM_EXIT_MSR, exit_info_1, 0);
+	if (ret != ES_OK)
+		return ret;
+	else if (!write) {
+		regs->ax = ghcb->save.rax;
+		regs->dx = ghcb->save.rdx;
+	}
+
+	return ret;
+}
+
 /*
  * This function runs on the first #VC exception after the kernel
  * switched to virtual addresses.
@@ -196,6 +225,9 @@  static enum es_result handle_vc_exception(struct es_em_ctxt *ctxt,
 	case SVM_EXIT_IOIO:
 		result = handle_ioio(ghcb, ctxt);
 		break;
+	case SVM_EXIT_MSR:
+		result = handle_msr(ghcb, ctxt);
+		break;
 	case SVM_EXIT_NPF:
 		result = handle_mmio(ghcb, ctxt);
 		break;