diff mbox series

[v8,08/40] x86/sev: Check the vmpl level

Message ID 20211210154332.11526-9-brijesh.singh@amd.com (mailing list archive)
State New, archived
Series Add AMD Secure Nested Paging (SEV-SNP) Guest Support

Commit Message

Brijesh Singh Dec. 10, 2021, 3:43 p.m. UTC
The Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP architecture
allows a guest VM to divide its address space into four levels. The levels
can be used to provide hardware-isolated abstraction layers within a VM.
VMPL0 is the most privileged level and VMPL3 is the least privileged.
Certain operations must be done by the VMPL0 software, such as:

* Validate or invalidate memory range (PVALIDATE instruction)
* Allocate VMSA page (RMPADJUST instruction when VMSA=1)

The initial SEV-SNP support requires that the guest kernel is running at
VMPL0. Add a check to make sure that the kernel is running at VMPL0 before
continuing the boot. There is no easy method to query the current VMPL,
so use the RMPADJUST instruction to determine whether the guest is
running at VMPL0.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/boot/compressed/sev.c    | 34 ++++++++++++++++++++++++++++---
 arch/x86/include/asm/sev-common.h |  1 +
 arch/x86/include/asm/sev.h        | 16 +++++++++++++++
 3 files changed, 48 insertions(+), 3 deletions(-)

Comments

Venu Busireddy Dec. 16, 2021, 8:24 p.m. UTC | #1
On 2021-12-10 09:43:00 -0600, Brijesh Singh wrote:
> Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP architecture
> allows a guest VM to divide its address space into four levels. The level
> can be used to provide the hardware isolated abstraction layers with a VM.
> The VMPL0 is the highest privilege, and VMPL3 is the least privilege.
> Certain operations must be done by the VMPL0 software, such as:
> 
> * Validate or invalidate memory range (PVALIDATE instruction)
> * Allocate VMSA page (RMPADJUST instruction when VMSA=1)
> 
> The initial SEV-SNP support requires that the guest kernel is running on
> VMPL0. Add a check to make sure that kernel is running at VMPL0 before
> continuing the boot. There is no easy method to query the current VMPL
> level, so use the RMPADJUST instruction to determine whether the guest is
> running at the VMPL0.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/boot/compressed/sev.c    | 34 ++++++++++++++++++++++++++++---
>  arch/x86/include/asm/sev-common.h |  1 +
>  arch/x86/include/asm/sev.h        | 16 +++++++++++++++
>  3 files changed, 48 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index a0708f359a46..9be369f72299 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -212,6 +212,31 @@ static inline u64 rd_sev_status_msr(void)
>  	return ((high << 32) | low);
>  }
>  
> +static void enforce_vmpl0(void)
> +{
> +	u64 attrs;
> +	int err;
> +
> +	/*
> +	 * There is no straightforward way to query the current VMPL level. The
> +	 * simplest method is to use the RMPADJUST instruction to change a page
> +	 * permission to a VMPL level-1, and if the guest kernel is launched at
> +	 * a level <= 1, then RMPADJUST instruction will return an error.

Perhaps a nit. When you say "level <= 1", do you mean a level lower than or
equal to 1 semantically, or numerically?

Venu
Mikolaj Lisik Dec. 16, 2021, 11:39 p.m. UTC | #2
On Thu, Dec 16, 2021 at 12:24 PM Venu Busireddy
<venu.busireddy@oracle.com> wrote:
>
> On 2021-12-10 09:43:00 -0600, Brijesh Singh wrote:
> > Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP architecture
> > allows a guest VM to divide its address space into four levels. The level
> > can be used to provide the hardware isolated abstraction layers with a VM.
> > The VMPL0 is the highest privilege, and VMPL3 is the least privilege.
> > Certain operations must be done by the VMPL0 software, such as:
> >
> > * Validate or invalidate memory range (PVALIDATE instruction)
> > * Allocate VMSA page (RMPADJUST instruction when VMSA=1)
> >
> > The initial SEV-SNP support requires that the guest kernel is running on
> > VMPL0. Add a check to make sure that kernel is running at VMPL0 before
> > continuing the boot. There is no easy method to query the current VMPL
> > level, so use the RMPADJUST instruction to determine whether the guest is
> > running at the VMPL0.
> >
> > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> > ---
> >  arch/x86/boot/compressed/sev.c    | 34 ++++++++++++++++++++++++++++---
> >  arch/x86/include/asm/sev-common.h |  1 +
> >  arch/x86/include/asm/sev.h        | 16 +++++++++++++++
> >  3 files changed, 48 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> > index a0708f359a46..9be369f72299 100644
> > --- a/arch/x86/boot/compressed/sev.c
> > +++ b/arch/x86/boot/compressed/sev.c
> > @@ -212,6 +212,31 @@ static inline u64 rd_sev_status_msr(void)
> >       return ((high << 32) | low);
> >  }
> >
> > +static void enforce_vmpl0(void)
> > +{
> > +     u64 attrs;
> > +     int err;
> > +
> > +     /*
> > +      * There is no straightforward way to query the current VMPL level. The
> > +      * simplest method is to use the RMPADJUST instruction to change a page
> > +      * permission to a VMPL level-1, and if the guest kernel is launched at
> > +      * a level <= 1, then RMPADJUST instruction will return an error.
>
> Perhaps a nit. When you say "level <= 1", do you mean a level lower than or
> equal to 1 semantically, or numerically?
>

+1 to this. Additionally I found the "level-1" confusing which I
interpreted as "level minus one".

Perhaps phrasing it as "level one", or "level=1" would be more explicit?
Brijesh Singh Dec. 17, 2021, 10:19 p.m. UTC | #3
On 12/16/21 5:39 PM, Mikolaj Lisik wrote:
> On Thu, Dec 16, 2021 at 12:24 PM Venu Busireddy
> <venu.busireddy@oracle.com> wrote:
>> On 2021-12-10 09:43:00 -0600, Brijesh Singh wrote:
>>> Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP architecture
>>> allows a guest VM to divide its address space into four levels. The level
>>> can be used to provide the hardware isolated abstraction layers with a VM.
>>> The VMPL0 is the highest privilege, and VMPL3 is the least privilege.
>>> Certain operations must be done by the VMPL0 software, such as:
>>>
>>> * Validate or invalidate memory range (PVALIDATE instruction)
>>> * Allocate VMSA page (RMPADJUST instruction when VMSA=1)
>>>
>>> The initial SEV-SNP support requires that the guest kernel is running on
>>> VMPL0. Add a check to make sure that kernel is running at VMPL0 before
>>> continuing the boot. There is no easy method to query the current VMPL
>>> level, so use the RMPADJUST instruction to determine whether the guest is
>>> running at the VMPL0.
>>>
>>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
>>> ---
>>>  arch/x86/boot/compressed/sev.c    | 34 ++++++++++++++++++++++++++++---
>>>  arch/x86/include/asm/sev-common.h |  1 +
>>>  arch/x86/include/asm/sev.h        | 16 +++++++++++++++
>>>  3 files changed, 48 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
>>> index a0708f359a46..9be369f72299 100644
>>> --- a/arch/x86/boot/compressed/sev.c
>>> +++ b/arch/x86/boot/compressed/sev.c
>>> @@ -212,6 +212,31 @@ static inline u64 rd_sev_status_msr(void)
>>>       return ((high << 32) | low);
>>>  }
>>>
>>> +static void enforce_vmpl0(void)
>>> +{
>>> +     u64 attrs;
>>> +     int err;
>>> +
>>> +     /*
>>> +      * There is no straightforward way to query the current VMPL level. The
>>> +      * simplest method is to use the RMPADJUST instruction to change a page
>>> +      * permission to a VMPL level-1, and if the guest kernel is launched at
>>> +      * a level <= 1, then RMPADJUST instruction will return an error.
>> Perhaps a nit. When you say "level <= 1", do you mean a level lower than or
>> equal to 1 semantically, or numerically?

It's numerical; please see the AMD APM vol 3.

Here is the snippet from the APM description of RMPADJUST:

IF (TARGET_VMPL <= CURRENT_VMPL)  // Only permissions for numerically
        EAX = FAIL_PERMISSION     // higher VMPL can be modified
        EXIT


> +1 to this. Additionally I found the "level-1" confusing which I
> interpreted as "level minus one".
>
> Perhaps phrasing it as "level one", or "level=1" would be more explicit?
>
Sure, I will make it clear that it's target VMPL 1 and not (target
level - 1).

thanks
Tom Lendacky Dec. 17, 2021, 10:33 p.m. UTC | #4
On 12/17/21 4:19 PM, Brijesh Singh wrote:
> 
> On 12/16/21 5:39 PM, Mikolaj Lisik wrote:
>> On Thu, Dec 16, 2021 at 12:24 PM Venu Busireddy
>> <venu.busireddy@oracle.com> wrote:
>>> On 2021-12-10 09:43:00 -0600, Brijesh Singh wrote:
>>>> Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP architecture
>>>> allows a guest VM to divide its address space into four levels. The level
>>>> can be used to provide the hardware isolated abstraction layers with a VM.
>>>> The VMPL0 is the highest privilege, and VMPL3 is the least privilege.
>>>> Certain operations must be done by the VMPL0 software, such as:
>>>>
>>>> * Validate or invalidate memory range (PVALIDATE instruction)
>>>> * Allocate VMSA page (RMPADJUST instruction when VMSA=1)
>>>>
>>>> The initial SEV-SNP support requires that the guest kernel is running on
>>>> VMPL0. Add a check to make sure that kernel is running at VMPL0 before
>>>> continuing the boot. There is no easy method to query the current VMPL
>>>> level, so use the RMPADJUST instruction to determine whether the guest is
>>>> running at the VMPL0.
>>>>
>>>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
>>>> ---
>>>>   arch/x86/boot/compressed/sev.c    | 34 ++++++++++++++++++++++++++++---
>>>>   arch/x86/include/asm/sev-common.h |  1 +
>>>>   arch/x86/include/asm/sev.h        | 16 +++++++++++++++
>>>>   3 files changed, 48 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
>>>> index a0708f359a46..9be369f72299 100644
>>>> --- a/arch/x86/boot/compressed/sev.c
>>>> +++ b/arch/x86/boot/compressed/sev.c
>>>> @@ -212,6 +212,31 @@ static inline u64 rd_sev_status_msr(void)
>>>>        return ((high << 32) | low);
>>>>   }
>>>>
>>>> +static void enforce_vmpl0(void)
>>>> +{
>>>> +     u64 attrs;
>>>> +     int err;
>>>> +
>>>> +     /*
>>>> +      * There is no straightforward way to query the current VMPL level. The
>>>> +      * simplest method is to use the RMPADJUST instruction to change a page
>>>> +      * permission to a VMPL level-1, and if the guest kernel is launched at
>>>> +      * a level <= 1, then RMPADJUST instruction will return an error.
>>> Perhaps a nit. When you say "level <= 1", do you mean a level lower than or
>>> equal to 1 semantically, or numerically?
> 
> Its numerically, please see the AMD APM vol 3.

Actually it is not numerical...  if it were numerical, then 0 <= 1 
would return an error, but VMPL0 is the highest permission level.

> 
> Here is the snippet from the APM RMPAJUST.
> 
> IF (TARGET_VMPL <= CURRENT_VMPL)  // Only permissions for numerically

Notice that the target VMPL is checked against the current VMPL. So if 
the target VMPL is numerically less than or equal to the current VMPL 
(e.g. you are trying to modify permissions for VMPL1 when you are running 
at VMPL2), that is a permission error. So, similar to CPL, 0 is the highest 
permission, followed by 1, then 2, then 3.

Thanks,
Tom

> 
>          EAX = FAIL_PERMISSION                // higher VMPL can be modified
> 
>          EXIT
> 
> 
>> +1 to this. Additionally I found the "level-1" confusing which I
>> interpreted as "level minus one".
>>
>> Perhaps phrasing it as "level one", or "level=1" would be more explicit?
>>
> Sure, I will make it clear that its target vmpl level 1 and not (target
> level - 1).
> 
> thanks
> 
>
Borislav Petkov Dec. 20, 2021, 6:10 p.m. UTC | #5
On Fri, Dec 17, 2021 at 04:33:02PM -0600, Tom Lendacky wrote:
> > > > > +      * There is no straightforward way to query the current VMPL level. The
> > > > > +      * simplest method is to use the RMPADJUST instruction to change a page
> > > > > +      * permission to a VMPL level-1, and if the guest kernel is launched at
> > > > > +      * a level <= 1, then RMPADJUST instruction will return an error.
> > > > Perhaps a nit. When you say "level <= 1", do you mean a level lower than or
> > > > equal to 1 semantically, or numerically?
> > 
> > Its numerically, please see the AMD APM vol 3.
> 
> Actually it is not numerically...  if it was numerically, then 0 <= 1 would
> return an error, but VMPL0 is the highest permission level.

Just write in that comment exactly what this function does:

"RMPADJUST modifies RMP permissions of a lesser-privileged (numerically
higher) privilege level. Here, clear the VMPL1 permission mask of the
GHCB page. If the guest is not running at VMPL0, this will fail.

If the guest is running at VMPL0, it will succeed. Even if that operation
modifies permission bits, it is still ok to do currently because Linux
SNP guests are supported only on VMPL0 so VMPL1 or higher permission
masks changing is a don't-care."

and then everything is clear wrt numbering, privilege, etc.

Ok?
Brijesh Singh Jan. 4, 2022, 3:23 p.m. UTC | #6
On 12/20/21 12:10 PM, Borislav Petkov wrote:
> On Fri, Dec 17, 2021 at 04:33:02PM -0600, Tom Lendacky wrote:
>>>>>> +      * There is no straightforward way to query the current VMPL level. The
>>>>>> +      * simplest method is to use the RMPADJUST instruction to change a page
>>>>>> +      * permission to a VMPL level-1, and if the guest kernel is launched at
>>>>>> +      * a level <= 1, then RMPADJUST instruction will return an error.
>>>>> Perhaps a nit. When you say "level <= 1", do you mean a level lower than or
>>>>> equal to 1 semantically, or numerically?
>>>
>>> Its numerically, please see the AMD APM vol 3.
>>
>> Actually it is not numerically...  if it was numerically, then 0 <= 1 would
>> return an error, but VMPL0 is the highest permission level.
> 
> Just write in that comment exactly what this function does:
> 
> "RMPADJUST modifies RMP permissions of a lesser-privileged (numerically
> higher) privilege level. Here, clear the VMPL1 permission mask of the
> GHCB page. If the guest is not running at VMPL0, this will fail.
> 
> If the guest is running at VMPL0, it will succeed. Even if that operation
> modifies permission bits, it is still ok to do currently because Linux
> SNP guests are supported only on VMPL0 so VMPL1 or higher permission
> masks changing is a don't-care."
> 
> and then everything is clear wrt numbering, privilege, etc.
> 
> Ok?
> 

Noted.

thanks
Patch

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index a0708f359a46..9be369f72299 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -212,6 +212,31 @@  static inline u64 rd_sev_status_msr(void)
 	return ((high << 32) | low);
 }
 
+static void enforce_vmpl0(void)
+{
+	u64 attrs;
+	int err;
+
+	/*
+	 * RMPADJUST modifies the RMP permissions of a less-privileged (numerically
+	 * higher) VMPL. There is no straightforward way to query the current VMPL,
+	 * so target VMPL1 here: the instruction fails unless the guest is running
+	 * at VMPL0.
+	 */
+	attrs = 1;
+
+	/*
+	 * Any page-aligned virtual address is sufficient to test the VMPL level.
+	 * The boot_ghcb_page is page-aligned memory, so use it for the test.
+	 *
+	 * The RMPADJUST operation below clears the VMPL1 permissions for the
+	 * boot_ghcb_page. There is no need to restore them afterwards: Linux SNP
+	 * guests are supported only at VMPL0, so the VMPL1 mask is a don't-care.
+	 */
+	if (rmpadjust((unsigned long)&boot_ghcb_page, RMP_PG_SIZE_4K, attrs))
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_NOT_VMPL0);
+}
+
 void sev_enable(struct boot_params *bp)
 {
 	unsigned int eax, ebx, ecx, edx;
@@ -252,11 +277,14 @@  void sev_enable(struct boot_params *bp)
 	/*
 	 * SNP is supported in v2 of the GHCB spec which mandates support for HV
 	 * features. If SEV-SNP is enabled, then check if the hypervisor supports
-	 * the SEV-SNP features.
+	 * the SEV-SNP features and that the guest is running at VMPL0.
 	 */
-	if (sev_status & MSR_AMD64_SEV_SNP_ENABLED && !(get_hv_features() & GHCB_HV_FT_SNP))
-		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+	if (sev_status & MSR_AMD64_SEV_SNP_ENABLED) {
+		if (!(get_hv_features() & GHCB_HV_FT_SNP))
+			sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
 
+		enforce_vmpl0();
+	}
 
 	sme_me_mask = BIT_ULL(ebx & 0x3f);
 }
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 6f037c29a46e..7ac5842e32b6 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -89,6 +89,7 @@ 
 #define GHCB_TERM_REGISTER		0	/* GHCB GPA registration failure */
 #define GHCB_TERM_PSC			1	/* Page State Change failure */
 #define GHCB_TERM_PVALIDATE		2	/* Pvalidate failure */
+#define GHCB_TERM_NOT_VMPL0		3	/* SNP guest is not running at VMPL-0 */
 
 #define GHCB_RESP_CODE(v)		((v) & GHCB_MSR_INFO_MASK)
 
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 4ee98976aed8..e37451849165 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -63,6 +63,9 @@  extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 /* Software defined (when rFlags.CF = 1) */
 #define PVALIDATE_FAIL_NOUPDATE		255
 
+/* RMP page size */
+#define RMP_PG_SIZE_4K			0
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 extern struct static_key_false sev_es_enable_key;
 extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -90,6 +93,18 @@  extern enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
 					  struct es_em_ctxt *ctxt,
 					  u64 exit_code, u64 exit_info_1,
 					  u64 exit_info_2);
+static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs)
+{
+	int rc;
+
+	/* "rmpadjust" mnemonic support in binutils 2.36 and newer */
+	asm volatile(".byte 0xF3,0x0F,0x01,0xFE\n\t"
+		     : "=a"(rc)
+		     : "a"(vaddr), "c"(rmp_psize), "d"(attrs)
+		     : "memory", "cc");
+
+	return rc;
+}
 static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
 {
 	bool no_rmpupdate;
@@ -114,6 +129,7 @@  static inline int sev_es_setup_ap_jump_table(struct real_mode_header *rmh) { ret
 static inline void sev_es_nmi_complete(void) { }
 static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
 static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
+static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
 #endif
 
 #endif