
[RFC,v7,14/64] x86/sev: Add the host SEV-SNP initialization support

Message ID 20221214194056.161492-15-michael.roth@amd.com (mailing list archive)
State New, archived
Series Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support

Commit Message

Michael Roth Dec. 14, 2022, 7:40 p.m. UTC
From: Brijesh Singh <brijesh.singh@amd.com>

The memory integrity guarantees of SEV-SNP are enforced through a new
structure called the Reverse Map Table (RMP). The RMP is a single data
structure shared across the system that contains one entry for every 4K
page of DRAM that may be used by SEV-SNP VMs. The goal of the RMP is to
track the owner of each page of memory. Pages of memory can be owned by
the hypervisor, owned by a specific VM, or owned by the AMD-SP. See APM2
section 15.36.3 for more detail on the RMP.

The RMP table is used to enforce access control to memory. The table itself
is not directly writable by software. New CPU instructions (RMPUPDATE,
PVALIDATE, RMPADJUST) are used to manipulate the RMP entries.

Based on the platform configuration, the BIOS reserves the memory used
for the RMP table. The start and end address of the RMP table must be
queried by reading the RMP_BASE and RMP_END MSRs. If RMP_BASE and
RMP_END are not set, the SEV-SNP feature is disabled.

The SEV-SNP feature is enabled only after the RMP table is successfully
initialized.

Also set SYSCFG.MFDM when enabling SNP, as SEV-SNP FW >= 1.51 requires
that SYSCFG.MFDM be set.

The RMP table entry format is non-architectural: it can vary by processor
and is defined by the PPR. Restrict SNP support to the known CPU models
and families for which the RMP table entry format is currently defined.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/disabled-features.h |   8 +-
 arch/x86/include/asm/msr-index.h         |  11 +-
 arch/x86/kernel/sev.c                    | 180 +++++++++++++++++++++++
 3 files changed, 197 insertions(+), 2 deletions(-)

Comments

Sabin Rapan Jan. 11, 2023, 2:50 p.m. UTC | #1
On 14.12.2022 21:40, Michael Roth wrote:
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +# define DISABLE_SEV_SNP       0
> +#else
> +# define DISABLE_SEV_SNP       (1 << (X86_FEATURE_SEV_SNP & 31))
> +#endif
> +

Would it make sense to split the SEV-* feature family into its own
config flag(s)?
I'm thinking in the context of SEV-SNP running on systems with
Transparent SME enabled in the BIOS. In this case, enabling
CONFIG_AMD_MEM_ENCRYPT will also enable SME in the kernel, which is a
bit strange and not necessarily useful.
Commit 4e2c87949f2b ("crypto: ccp - When TSME and SME both detected
notify user") highlights it.

--
Sabin.



Jeremi Piotrowski Jan. 18, 2023, 3:55 p.m. UTC | #2
On Wed, Dec 14, 2022 at 01:40:06PM -0600, Michael Roth wrote:
> From: Brijesh Singh <brijesh.singh@amd.com>
> 
> The memory integrity guarantees of SEV-SNP are enforced through a new
> structure called the Reverse Map Table (RMP). The RMP is a single data
> structure shared across the system that contains one entry for every 4K
> page of DRAM that may be used by SEV-SNP VMs. The goal of RMP is to
> track the owner of each page of memory. Pages of memory can be owned by
> the hypervisor, owned by a specific VM or owned by the AMD-SP. See APM2
> section 15.36.3 for more detail on RMP.
> 
> The RMP table is used to enforce access control to memory. The table itself
> is not directly writable by the software. New CPU instructions (RMPUPDATE,
> PVALIDATE, RMPADJUST) are used to manipulate the RMP entries.
> 
> Based on the platform configuration, the BIOS reserves the memory used
> for the RMP table. The start and end address of the RMP table must be
> queried by reading the RMP_BASE and RMP_END MSRs. If the RMP_BASE and
> RMP_END are not set then disable the SEV-SNP feature.
> 
> The SEV-SNP feature is enabled only after the RMP table is successfully
> initialized.
> 
> Also set SYSCFG.MFMD when enabling SNP as SEV-SNP FW >= 1.51 requires
> that SYSCFG.MFMD must be se
> 
> RMP table entry format is non-architectural and it can vary by processor
> and is defined by the PPR. Restrict SNP support on the known CPU model
> and family for which the RMP table entry format is currently defined for.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-b: Ashish Kalra <ashish.kalra@amd.com>
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> ---
>  arch/x86/include/asm/disabled-features.h |   8 +-
>  arch/x86/include/asm/msr-index.h         |  11 +-
>  arch/x86/kernel/sev.c                    | 180 +++++++++++++++++++++++
>  3 files changed, 197 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
> index 33d2cd04d254..9b5a2cc8064a 100644
> --- a/arch/x86/include/asm/disabled-features.h
> +++ b/arch/x86/include/asm/disabled-features.h
> @@ -87,6 +87,12 @@
>  # define DISABLE_TDX_GUEST	(1 << (X86_FEATURE_TDX_GUEST & 31))
>  #endif
>  
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +# define DISABLE_SEV_SNP	0
> +#else
> +# define DISABLE_SEV_SNP	(1 << (X86_FEATURE_SEV_SNP & 31))
> +#endif
> +
>  /*
>   * Make sure to add features to the correct mask
>   */
> @@ -110,7 +116,7 @@
>  			 DISABLE_ENQCMD)
>  #define DISABLED_MASK17	0
>  #define DISABLED_MASK18	0
> -#define DISABLED_MASK19	0
> +#define DISABLED_MASK19	(DISABLE_SEV_SNP)
>  #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20)
>  
>  #endif /* _ASM_X86_DISABLED_FEATURES_H */
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 10ac52705892..35100c630617 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -565,6 +565,8 @@
>  #define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
>  #define MSR_AMD64_SEV_ES_ENABLED	BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
>  #define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
> +#define MSR_AMD64_RMP_BASE		0xc0010132
> +#define MSR_AMD64_RMP_END		0xc0010133
>  
>  #define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
>  
> @@ -649,7 +651,14 @@
>  #define MSR_K8_TOP_MEM2			0xc001001d
>  #define MSR_AMD64_SYSCFG		0xc0010010
>  #define MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT	23
> -#define MSR_AMD64_SYSCFG_MEM_ENCRYPT	BIT_ULL(MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT)
> +#define MSR_AMD64_SYSCFG_MEM_ENCRYPT		BIT_ULL(MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT)
> +#define MSR_AMD64_SYSCFG_SNP_EN_BIT		24
> +#define MSR_AMD64_SYSCFG_SNP_EN		BIT_ULL(MSR_AMD64_SYSCFG_SNP_EN_BIT)
> +#define MSR_AMD64_SYSCFG_SNP_VMPL_EN_BIT	25
> +#define MSR_AMD64_SYSCFG_SNP_VMPL_EN		BIT_ULL(MSR_AMD64_SYSCFG_SNP_VMPL_EN_BIT)
> +#define MSR_AMD64_SYSCFG_MFDM_BIT		19
> +#define MSR_AMD64_SYSCFG_MFDM			BIT_ULL(MSR_AMD64_SYSCFG_MFDM_BIT)
> +
>  #define MSR_K8_INT_PENDING_MSG		0xc0010055
>  /* C1E active bits in int pending message */
>  #define K8_INTP_C1E_ACTIVE_MASK		0x18000000
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index a428c62330d3..687a91284506 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -22,6 +22,9 @@
>  #include <linux/efi.h>
>  #include <linux/platform_device.h>
>  #include <linux/io.h>
> +#include <linux/cpumask.h>
> +#include <linux/iommu.h>
> +#include <linux/amd-iommu.h>
>  
>  #include <asm/cpu_entry_area.h>
>  #include <asm/stacktrace.h>
> @@ -38,6 +41,7 @@
>  #include <asm/apic.h>
>  #include <asm/cpuid.h>
>  #include <asm/cmdline.h>
> +#include <asm/iommu.h>
>  
>  #define DR7_RESET_VALUE        0x400
>  
> @@ -57,6 +61,12 @@
>  #define AP_INIT_CR0_DEFAULT		0x60000010
>  #define AP_INIT_MXCSR_DEFAULT		0x1f80
>  
> +/*
> + * The first 16KB from the RMP_BASE is used by the processor for the
> + * bookkeeping, the range needs to be added during the RMP entry lookup.
> + */
> +#define RMPTABLE_CPU_BOOKKEEPING_SZ	0x4000
> +
>  /* For early boot hypervisor communication in SEV-ES enabled guests */
>  static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
>  
> @@ -69,6 +79,9 @@ static struct ghcb *boot_ghcb __section(".data");
>  /* Bitmap of SEV features supported by the hypervisor */
>  static u64 sev_hv_features __ro_after_init;
>  
> +static unsigned long rmptable_start __ro_after_init;
> +static unsigned long rmptable_end __ro_after_init;
> +
>  /* #VC handler runtime per-CPU data */
>  struct sev_es_runtime_data {
>  	struct ghcb ghcb_page;
> @@ -2260,3 +2273,170 @@ static int __init snp_init_platform_device(void)
>  	return 0;
>  }
>  device_initcall(snp_init_platform_device);
> +
> +#undef pr_fmt
> +#define pr_fmt(fmt)	"SEV-SNP: " fmt
> +
> +static int __mfd_enable(unsigned int cpu)
> +{
> +	u64 val;
> +
> +	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
> +		return 0;
> +
> +	rdmsrl(MSR_AMD64_SYSCFG, val);
> +
> +	val |= MSR_AMD64_SYSCFG_MFDM;
> +
> +	wrmsrl(MSR_AMD64_SYSCFG, val);
> +
> +	return 0;
> +}
> +
> +static __init void mfd_enable(void *arg)
> +{
> +	__mfd_enable(smp_processor_id());
> +}
> +
> +static int __snp_enable(unsigned int cpu)
> +{
> +	u64 val;
> +
> +	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
> +		return 0;
> +
> +	rdmsrl(MSR_AMD64_SYSCFG, val);
> +
> +	val |= MSR_AMD64_SYSCFG_SNP_EN;
> +	val |= MSR_AMD64_SYSCFG_SNP_VMPL_EN;
> +
> +	wrmsrl(MSR_AMD64_SYSCFG, val);
> +
> +	return 0;
> +}
> +
> +static __init void snp_enable(void *arg)
> +{
> +	__snp_enable(smp_processor_id());
> +}
> +
> +static bool get_rmptable_info(u64 *start, u64 *len)
> +{
> +	u64 calc_rmp_sz, rmp_sz, rmp_base, rmp_end;
> +
> +	rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
> +	rdmsrl(MSR_AMD64_RMP_END, rmp_end);
> +
> +	if (!rmp_base || !rmp_end) {
> +		pr_err("Memory for the RMP table has not been reserved by BIOS\n");
> +		return false;
> +	}
> +
> +	rmp_sz = rmp_end - rmp_base + 1;
> +
> +	/*
> +	 * Calculate the amount the memory that must be reserved by the BIOS to
> +	 * address the whole RAM. The reserved memory should also cover the
> +	 * RMP table itself.
> +	 */
> +	calc_rmp_sz = (((rmp_sz >> PAGE_SHIFT) + totalram_pages()) << 4) + RMPTABLE_CPU_BOOKKEEPING_SZ;

Since the rmptable is indexed by page number, I believe this check should be
using max_pfn:

    calc_rmp_sz = (max_pfn << 4) + RMPTABLE_CPU_BOOKKEEPING_SZ;

This accounts for holes/offsets in the memory map which lead to the top of
memory having pfn > totalram_pages().

> +
> +	if (calc_rmp_sz > rmp_sz) {
> +		pr_err("Memory reserved for the RMP table does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
> +		       calc_rmp_sz, rmp_sz);
> +		return false;
> +	}
> +
> +	*start = rmp_base;
> +	*len = rmp_sz;
> +
> +	pr_info("RMP table physical address [0x%016llx - 0x%016llx]\n", rmp_base, rmp_end);
> +
> +	return true;
> +}
> +
> +static __init int __snp_rmptable_init(void)
> +{
> +	u64 rmp_base, sz;
> +	void *start;
> +	u64 val;
> +
> +	if (!get_rmptable_info(&rmp_base, &sz))
> +		return 1;
> +
> +	start = memremap(rmp_base, sz, MEMREMAP_WB);
> +	if (!start) {
> +		pr_err("Failed to map RMP table addr 0x%llx size 0x%llx\n", rmp_base, sz);
> +		return 1;
> +	}
> +
> +	/*
> +	 * Check if SEV-SNP is already enabled, this can happen in case of
> +	 * kexec boot.
> +	 */
> +	rdmsrl(MSR_AMD64_SYSCFG, val);
> +	if (val & MSR_AMD64_SYSCFG_SNP_EN)
> +		goto skip_enable;
> +
> +	/* Initialize the RMP table to zero */
> +	memset(start, 0, sz);
> +
> +	/* Flush the caches to ensure that data is written before SNP is enabled. */
> +	wbinvd_on_all_cpus();
> +
> +	/* MFDM must be enabled on all the CPUs prior to enabling SNP. */
> +	on_each_cpu(mfd_enable, NULL, 1);
> +
> +	/* Enable SNP on all CPUs. */
> +	on_each_cpu(snp_enable, NULL, 1);
> +
> +skip_enable:
> +	rmptable_start = (unsigned long)start;
> +	rmptable_end = rmptable_start + sz - 1;
> +
> +	return 0;
> +}
> +
> +static int __init snp_rmptable_init(void)
> +{
> +	int family, model;
> +
> +	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
> +		return 0;
> +
> +	family = boot_cpu_data.x86;
> +	model  = boot_cpu_data.x86_model;
> +
> +	/*
> +	 * RMP table entry format is not architectural and it can vary by processor and
> +	 * is defined by the per-processor PPR. Restrict SNP support on the known CPU
> +	 * model and family for which the RMP table entry format is currently defined for.
> +	 */
> +	if (family != 0x19 || model > 0xaf)
> +		goto nosnp;
> +
> +	if (amd_iommu_snp_enable())
> +		goto nosnp;
> +
> +	if (__snp_rmptable_init())
> +		goto nosnp;
> +
> +	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __snp_enable, NULL);
> +
> +	return 0;
> +
> +nosnp:
> +	setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
> +	return -ENOSYS;
> +}
> +
> +/*
> + * This must be called after the PCI subsystem. This is because amd_iommu_snp_enable()
> + * is called to ensure the IOMMU supports the SEV-SNP feature, which can only be
> + * called after subsys_initcall().
> + *
> + * NOTE: IOMMU is enforced by SNP to ensure that hypervisor cannot program DMA
> + * directly into guest private memory. In case of SNP, the IOMMU ensures that
> + * the page(s) used for DMA are hypervisor owned.
> + */
> +fs_initcall(snp_rmptable_init);
> -- 
> 2.25.1
>
Kalra, Ashish Jan. 19, 2023, 4:26 p.m. UTC | #3
On 1/11/2023 8:50 AM, Sabin Rapan wrote:
> 
> 
> On 14.12.2022 21:40, Michael Roth wrote:
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +# define DISABLE_SEV_SNP       0
>> +#else
>> +# define DISABLE_SEV_SNP       (1 << (X86_FEATURE_SEV_SNP & 31))
>> +#endif
>> +
> 
> Would it make sense to split the SEV-* feature family into their own
> config flag(s) ?
> I'm thinking in the context of SEV-SNP running on systems with
> Transparent SME enabled in the bios. In this case, enabling
> CONFIG_AMD_MEM_ENCRYPT will also enable SME in the kernel, which is a
> bit strange and not necessarily useful.
> Commit 4e2c87949f2b ("crypto: ccp - When TSME and SME both detected
> notify user") highlights it.
> 

Yes, we plan to move the SNP host initialization stuff into a separate 
source file and under a different config flag such as CONFIG_KVM_AMD_SEV
or something.

Thanks,
Ashish
Kalra, Ashish Jan. 19, 2023, 11:59 p.m. UTC | #4
Hello Jeremi,

On 1/18/2023 9:55 AM, Jeremi Piotrowski wrote:
> On Wed, Dec 14, 2022 at 01:40:06PM -0600, Michael Roth wrote:
>> From: Brijesh Singh <brijesh.singh@amd.com>
>>
>> The memory integrity guarantees of SEV-SNP are enforced through a new
>> structure called the Reverse Map Table (RMP). The RMP is a single data
>> structure shared across the system that contains one entry for every 4K
>> page of DRAM that may be used by SEV-SNP VMs. The goal of RMP is to
>> track the owner of each page of memory. Pages of memory can be owned by
>> the hypervisor, owned by a specific VM or owned by the AMD-SP. See APM2
>> section 15.36.3 for more detail on RMP.
>>
>> The RMP table is used to enforce access control to memory. The table itself
>> is not directly writable by the software. New CPU instructions (RMPUPDATE,
>> PVALIDATE, RMPADJUST) are used to manipulate the RMP entries.
>>
>> Based on the platform configuration, the BIOS reserves the memory used
>> for the RMP table. The start and end address of the RMP table must be
>> queried by reading the RMP_BASE and RMP_END MSRs. If the RMP_BASE and
>> RMP_END are not set then disable the SEV-SNP feature.
>>
>> The SEV-SNP feature is enabled only after the RMP table is successfully
>> initialized.
>>
>> Also set SYSCFG.MFMD when enabling SNP as SEV-SNP FW >= 1.51 requires
>> that SYSCFG.MFMD must be se
>>
>> RMP table entry format is non-architectural and it can vary by processor
>> and is defined by the PPR. Restrict SNP support on the known CPU model
>> and family for which the RMP table entry format is currently defined for.
>>
>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
>> Signed-off-b: Ashish Kalra <ashish.kalra@amd.com>
>> Signed-off-by: Michael Roth <michael.roth@amd.com>
>> ---
>>   arch/x86/include/asm/disabled-features.h |   8 +-
>>   arch/x86/include/asm/msr-index.h         |  11 +-
>>   arch/x86/kernel/sev.c                    | 180 +++++++++++++++++++++++
>>   3 files changed, 197 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
>> index 33d2cd04d254..9b5a2cc8064a 100644
>> --- a/arch/x86/include/asm/disabled-features.h
>> +++ b/arch/x86/include/asm/disabled-features.h
>> @@ -87,6 +87,12 @@
>>   # define DISABLE_TDX_GUEST	(1 << (X86_FEATURE_TDX_GUEST & 31))
>>   #endif
>>   
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +# define DISABLE_SEV_SNP	0
>> +#else
>> +# define DISABLE_SEV_SNP	(1 << (X86_FEATURE_SEV_SNP & 31))
>> +#endif
>> +
>>   /*
>>    * Make sure to add features to the correct mask
>>    */
>> @@ -110,7 +116,7 @@
>>   			 DISABLE_ENQCMD)
>>   #define DISABLED_MASK17	0
>>   #define DISABLED_MASK18	0
>> -#define DISABLED_MASK19	0
>> +#define DISABLED_MASK19	(DISABLE_SEV_SNP)
>>   #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20)
>>   
>>   #endif /* _ASM_X86_DISABLED_FEATURES_H */
>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>> index 10ac52705892..35100c630617 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -565,6 +565,8 @@
>>   #define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
>>   #define MSR_AMD64_SEV_ES_ENABLED	BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
>>   #define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
>> +#define MSR_AMD64_RMP_BASE		0xc0010132
>> +#define MSR_AMD64_RMP_END		0xc0010133
>>   
>>   #define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
>>   
>> @@ -649,7 +651,14 @@
>>   #define MSR_K8_TOP_MEM2			0xc001001d
>>   #define MSR_AMD64_SYSCFG		0xc0010010
>>   #define MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT	23
>> -#define MSR_AMD64_SYSCFG_MEM_ENCRYPT	BIT_ULL(MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT)
>> +#define MSR_AMD64_SYSCFG_MEM_ENCRYPT		BIT_ULL(MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT)
>> +#define MSR_AMD64_SYSCFG_SNP_EN_BIT		24
>> +#define MSR_AMD64_SYSCFG_SNP_EN		BIT_ULL(MSR_AMD64_SYSCFG_SNP_EN_BIT)
>> +#define MSR_AMD64_SYSCFG_SNP_VMPL_EN_BIT	25
>> +#define MSR_AMD64_SYSCFG_SNP_VMPL_EN		BIT_ULL(MSR_AMD64_SYSCFG_SNP_VMPL_EN_BIT)
>> +#define MSR_AMD64_SYSCFG_MFDM_BIT		19
>> +#define MSR_AMD64_SYSCFG_MFDM			BIT_ULL(MSR_AMD64_SYSCFG_MFDM_BIT)
>> +
>>   #define MSR_K8_INT_PENDING_MSG		0xc0010055
>>   /* C1E active bits in int pending message */
>>   #define K8_INTP_C1E_ACTIVE_MASK		0x18000000
>> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
>> index a428c62330d3..687a91284506 100644
>> --- a/arch/x86/kernel/sev.c
>> +++ b/arch/x86/kernel/sev.c
>> @@ -22,6 +22,9 @@
>>   #include <linux/efi.h>
>>   #include <linux/platform_device.h>
>>   #include <linux/io.h>
>> +#include <linux/cpumask.h>
>> +#include <linux/iommu.h>
>> +#include <linux/amd-iommu.h>
>>   
>>   #include <asm/cpu_entry_area.h>
>>   #include <asm/stacktrace.h>
>> @@ -38,6 +41,7 @@
>>   #include <asm/apic.h>
>>   #include <asm/cpuid.h>
>>   #include <asm/cmdline.h>
>> +#include <asm/iommu.h>
>>   
>>   #define DR7_RESET_VALUE        0x400
>>   
>> @@ -57,6 +61,12 @@
>>   #define AP_INIT_CR0_DEFAULT		0x60000010
>>   #define AP_INIT_MXCSR_DEFAULT		0x1f80
>>   
>> +/*
>> + * The first 16KB from the RMP_BASE is used by the processor for the
>> + * bookkeeping, the range needs to be added during the RMP entry lookup.
>> + */
>> +#define RMPTABLE_CPU_BOOKKEEPING_SZ	0x4000
>> +
>>   /* For early boot hypervisor communication in SEV-ES enabled guests */
>>   static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
>>   
>> @@ -69,6 +79,9 @@ static struct ghcb *boot_ghcb __section(".data");
>>   /* Bitmap of SEV features supported by the hypervisor */
>>   static u64 sev_hv_features __ro_after_init;
>>   
>> +static unsigned long rmptable_start __ro_after_init;
>> +static unsigned long rmptable_end __ro_after_init;
>> +
>>   /* #VC handler runtime per-CPU data */
>>   struct sev_es_runtime_data {
>>   	struct ghcb ghcb_page;
>> @@ -2260,3 +2273,170 @@ static int __init snp_init_platform_device(void)
>>   	return 0;
>>   }
>>   device_initcall(snp_init_platform_device);
>> +
>> +#undef pr_fmt
>> +#define pr_fmt(fmt)	"SEV-SNP: " fmt
>> +
>> +static int __mfd_enable(unsigned int cpu)
>> +{
>> +	u64 val;
>> +
>> +	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
>> +		return 0;
>> +
>> +	rdmsrl(MSR_AMD64_SYSCFG, val);
>> +
>> +	val |= MSR_AMD64_SYSCFG_MFDM;
>> +
>> +	wrmsrl(MSR_AMD64_SYSCFG, val);
>> +
>> +	return 0;
>> +}
>> +
>> +static __init void mfd_enable(void *arg)
>> +{
>> +	__mfd_enable(smp_processor_id());
>> +}
>> +
>> +static int __snp_enable(unsigned int cpu)
>> +{
>> +	u64 val;
>> +
>> +	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
>> +		return 0;
>> +
>> +	rdmsrl(MSR_AMD64_SYSCFG, val);
>> +
>> +	val |= MSR_AMD64_SYSCFG_SNP_EN;
>> +	val |= MSR_AMD64_SYSCFG_SNP_VMPL_EN;
>> +
>> +	wrmsrl(MSR_AMD64_SYSCFG, val);
>> +
>> +	return 0;
>> +}
>> +
>> +static __init void snp_enable(void *arg)
>> +{
>> +	__snp_enable(smp_processor_id());
>> +}
>> +
>> +static bool get_rmptable_info(u64 *start, u64 *len)
>> +{
>> +	u64 calc_rmp_sz, rmp_sz, rmp_base, rmp_end;
>> +
>> +	rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
>> +	rdmsrl(MSR_AMD64_RMP_END, rmp_end);
>> +
>> +	if (!rmp_base || !rmp_end) {
>> +		pr_err("Memory for the RMP table has not been reserved by BIOS\n");
>> +		return false;
>> +	}
>> +
>> +	rmp_sz = rmp_end - rmp_base + 1;
>> +
>> +	/*
>> +	 * Calculate the amount the memory that must be reserved by the BIOS to
>> +	 * address the whole RAM. The reserved memory should also cover the
>> +	 * RMP table itself.
>> +	 */
>> +	calc_rmp_sz = (((rmp_sz >> PAGE_SHIFT) + totalram_pages()) << 4) + RMPTABLE_CPU_BOOKKEEPING_SZ;
> 
> Since the rmptable is indexed by page number, I believe this check should be
> using max_pfn:
> 
>      calc_rmp_sz = (max_pfn << 4) + RMPTABLE_CPU_BOOKKEEPING_SZ;
> 
> This accounts for holes/offsets in the memory map which lead to the top of
> memory having pfn > totalram_pages().
> 

I agree that this check should use the max. addressable PFN to account for
holes in the physical address map. The BIOS will probably also be
computing the RMP table size to cover the entire physical memory, which
should be the max. addressable PFN.

But then we primarily need to check that all available RAM pages are
covered by the RMP table, so the above check is sufficient for that, right?

Also, I assume that max_pfn will take into account any hotplugged memory,
as I do know that totalram_pages() handles hotplugged memory.

Thanks,
Ashish

>> +
>> +	if (calc_rmp_sz > rmp_sz) {
>> +		pr_err("Memory reserved for the RMP table does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
>> +		       calc_rmp_sz, rmp_sz);
>> +		return false;
>> +	}
>> +
>> +	*start = rmp_base;
>> +	*len = rmp_sz;
>> +
>> +	pr_info("RMP table physical address [0x%016llx - 0x%016llx]\n", rmp_base, rmp_end);
>> +
>> +	return true;
>> +}
>> +
Kalra, Ashish Jan. 20, 2023, 4:51 p.m. UTC | #5
On 1/19/2023 5:59 PM, Kalra, Ashish wrote:
> Hello Jeremi,
> 
> On 1/18/2023 9:55 AM, Jeremi Piotrowski wrote:
>> On Wed, Dec 14, 2022 at 01:40:06PM -0600, Michael Roth wrote:
>>> From: Brijesh Singh <brijesh.singh@amd.com>
>>>
>>> The memory integrity guarantees of SEV-SNP are enforced through a new
>>> structure called the Reverse Map Table (RMP). The RMP is a single data
>>> structure shared across the system that contains one entry for every 4K
>>> page of DRAM that may be used by SEV-SNP VMs. The goal of RMP is to
>>> track the owner of each page of memory. Pages of memory can be owned by
>>> the hypervisor, owned by a specific VM or owned by the AMD-SP. See APM2
>>> section 15.36.3 for more detail on RMP.
>>>
>>> The RMP table is used to enforce access control to memory. The table 
>>> itself
>>> is not directly writable by the software. New CPU instructions 
>>> (RMPUPDATE,
>>> PVALIDATE, RMPADJUST) are used to manipulate the RMP entries.
>>>
>>> Based on the platform configuration, the BIOS reserves the memory used
>>> for the RMP table. The start and end address of the RMP table must be
>>> queried by reading the RMP_BASE and RMP_END MSRs. If the RMP_BASE and
>>> RMP_END are not set then disable the SEV-SNP feature.
>>>
>>> The SEV-SNP feature is enabled only after the RMP table is successfully
>>> initialized.
>>>
>>> Also set SYSCFG.MFMD when enabling SNP as SEV-SNP FW >= 1.51 requires
>>> that SYSCFG.MFMD must be se
>>>
>>> RMP table entry format is non-architectural and it can vary by processor
>>> and is defined by the PPR. Restrict SNP support on the known CPU model
>>> and family for which the RMP table entry format is currently defined 
>>> for.
>>>
>>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
>>> Signed-off-b: Ashish Kalra <ashish.kalra@amd.com>
>>> Signed-off-by: Michael Roth <michael.roth@amd.com>
>>> ---
>>>   arch/x86/include/asm/disabled-features.h |   8 +-
>>>   arch/x86/include/asm/msr-index.h         |  11 +-
>>>   arch/x86/kernel/sev.c                    | 180 +++++++++++++++++++++++
>>>   3 files changed, 197 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/x86/include/asm/disabled-features.h 
>>> b/arch/x86/include/asm/disabled-features.h
>>> index 33d2cd04d254..9b5a2cc8064a 100644
>>> --- a/arch/x86/include/asm/disabled-features.h
>>> +++ b/arch/x86/include/asm/disabled-features.h
>>> @@ -87,6 +87,12 @@
>>>   # define DISABLE_TDX_GUEST    (1 << (X86_FEATURE_TDX_GUEST & 31))
>>>   #endif
>>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>>> +# define DISABLE_SEV_SNP    0
>>> +#else
>>> +# define DISABLE_SEV_SNP    (1 << (X86_FEATURE_SEV_SNP & 31))
>>> +#endif
>>> +
>>>   /*
>>>    * Make sure to add features to the correct mask
>>>    */
>>> @@ -110,7 +116,7 @@
>>>                DISABLE_ENQCMD)
>>>   #define DISABLED_MASK17    0
>>>   #define DISABLED_MASK18    0
>>> -#define DISABLED_MASK19    0
>>> +#define DISABLED_MASK19    (DISABLE_SEV_SNP)
>>>   #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20)
>>>   #endif /* _ASM_X86_DISABLED_FEATURES_H */
>>> diff --git a/arch/x86/include/asm/msr-index.h 
>>> b/arch/x86/include/asm/msr-index.h
>>> index 10ac52705892..35100c630617 100644
>>> --- a/arch/x86/include/asm/msr-index.h
>>> +++ b/arch/x86/include/asm/msr-index.h
>>> @@ -565,6 +565,8 @@
>>>   #define MSR_AMD64_SEV_ENABLED        
>>> BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
>>>   #define MSR_AMD64_SEV_ES_ENABLED    
>>> BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
>>>   #define MSR_AMD64_SEV_SNP_ENABLED    
>>> BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
>>> +#define MSR_AMD64_RMP_BASE        0xc0010132
>>> +#define MSR_AMD64_RMP_END        0xc0010133
>>>   #define MSR_AMD64_VIRT_SPEC_CTRL    0xc001011f
>>> @@ -649,7 +651,14 @@
>>>   #define MSR_K8_TOP_MEM2            0xc001001d
>>>   #define MSR_AMD64_SYSCFG        0xc0010010
>>>   #define MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT    23
>>> -#define MSR_AMD64_SYSCFG_MEM_ENCRYPT    
>>> BIT_ULL(MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT)
>>> +#define MSR_AMD64_SYSCFG_MEM_ENCRYPT        
>>> BIT_ULL(MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT)
>>> +#define MSR_AMD64_SYSCFG_SNP_EN_BIT        24
>>> +#define MSR_AMD64_SYSCFG_SNP_EN        
>>> BIT_ULL(MSR_AMD64_SYSCFG_SNP_EN_BIT)
>>> +#define MSR_AMD64_SYSCFG_SNP_VMPL_EN_BIT    25
>>> +#define MSR_AMD64_SYSCFG_SNP_VMPL_EN        
>>> BIT_ULL(MSR_AMD64_SYSCFG_SNP_VMPL_EN_BIT)
>>> +#define MSR_AMD64_SYSCFG_MFDM_BIT        19
>>> +#define MSR_AMD64_SYSCFG_MFDM            
>>> BIT_ULL(MSR_AMD64_SYSCFG_MFDM_BIT)
>>> +
>>>   #define MSR_K8_INT_PENDING_MSG        0xc0010055
>>>   /* C1E active bits in int pending message */
>>>   #define K8_INTP_C1E_ACTIVE_MASK        0x18000000
>>> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
>>> index a428c62330d3..687a91284506 100644
>>> --- a/arch/x86/kernel/sev.c
>>> +++ b/arch/x86/kernel/sev.c
>>> @@ -22,6 +22,9 @@
>>>   #include <linux/efi.h>
>>>   #include <linux/platform_device.h>
>>>   #include <linux/io.h>
>>> +#include <linux/cpumask.h>
>>> +#include <linux/iommu.h>
>>> +#include <linux/amd-iommu.h>
>>>   #include <asm/cpu_entry_area.h>
>>>   #include <asm/stacktrace.h>
>>> @@ -38,6 +41,7 @@
>>>   #include <asm/apic.h>
>>>   #include <asm/cpuid.h>
>>>   #include <asm/cmdline.h>
>>> +#include <asm/iommu.h>
>>>   #define DR7_RESET_VALUE        0x400
>>> @@ -57,6 +61,12 @@
>>>   #define AP_INIT_CR0_DEFAULT        0x60000010
>>>   #define AP_INIT_MXCSR_DEFAULT        0x1f80
>>> +/*
>>> + * The first 16KB from the RMP_BASE is used by the processor for the
>>> + * bookkeeping, the range needs to be added during the RMP entry 
>>> lookup.
>>> + */
>>> +#define RMPTABLE_CPU_BOOKKEEPING_SZ    0x4000
>>> +
>>>   /* For early boot hypervisor communication in SEV-ES enabled guests */
>>>   static struct ghcb boot_ghcb_page __bss_decrypted 
>>> __aligned(PAGE_SIZE);
>>> @@ -69,6 +79,9 @@ static struct ghcb *boot_ghcb __section(".data");
>>>   /* Bitmap of SEV features supported by the hypervisor */
>>>   static u64 sev_hv_features __ro_after_init;
>>> +static unsigned long rmptable_start __ro_after_init;
>>> +static unsigned long rmptable_end __ro_after_init;
>>> +
>>>   /* #VC handler runtime per-CPU data */
>>>   struct sev_es_runtime_data {
>>>       struct ghcb ghcb_page;
>>> @@ -2260,3 +2273,170 @@ static int __init snp_init_platform_device(void)
>>>       return 0;
>>>   }
>>>   device_initcall(snp_init_platform_device);
>>> +
>>> +#undef pr_fmt
>>> +#define pr_fmt(fmt)    "SEV-SNP: " fmt
>>> +
>>> +static int __mfd_enable(unsigned int cpu)
>>> +{
>>> +    u64 val;
>>> +
>>> +    if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
>>> +        return 0;
>>> +
>>> +    rdmsrl(MSR_AMD64_SYSCFG, val);
>>> +
>>> +    val |= MSR_AMD64_SYSCFG_MFDM;
>>> +
>>> +    wrmsrl(MSR_AMD64_SYSCFG, val);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static __init void mfd_enable(void *arg)
>>> +{
>>> +    __mfd_enable(smp_processor_id());
>>> +}
>>> +
>>> +static int __snp_enable(unsigned int cpu)
>>> +{
>>> +    u64 val;
>>> +
>>> +    if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
>>> +        return 0;
>>> +
>>> +    rdmsrl(MSR_AMD64_SYSCFG, val);
>>> +
>>> +    val |= MSR_AMD64_SYSCFG_SNP_EN;
>>> +    val |= MSR_AMD64_SYSCFG_SNP_VMPL_EN;
>>> +
>>> +    wrmsrl(MSR_AMD64_SYSCFG, val);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static __init void snp_enable(void *arg)
>>> +{
>>> +    __snp_enable(smp_processor_id());
>>> +}
>>> +
>>> +static bool get_rmptable_info(u64 *start, u64 *len)
>>> +{
>>> +    u64 calc_rmp_sz, rmp_sz, rmp_base, rmp_end;
>>> +
>>> +    rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
>>> +    rdmsrl(MSR_AMD64_RMP_END, rmp_end);
>>> +
>>> +    if (!rmp_base || !rmp_end) {
>>> +        pr_err("Memory for the RMP table has not been reserved by 
>>> BIOS\n");
>>> +        return false;
>>> +    }
>>> +
>>> +    rmp_sz = rmp_end - rmp_base + 1;
>>> +
>>> +    /*
>>> +     * Calculate the amount the memory that must be reserved by the 
>>> BIOS to
>>> +     * address the whole RAM. The reserved memory should also cover the
>>> +     * RMP table itself.
>>> +     */
>>> +    calc_rmp_sz = (((rmp_sz >> PAGE_SHIFT) + totalram_pages()) << 4) 
>>> + RMPTABLE_CPU_BOOKKEEPING_SZ;
>>
>> Since the rmptable is indexed by page number, I believe this check 
>> should be
>> using max_pfn:
>>
>>      calc_rmp_sz = (max_pfn << 4) + RMPTABLE_CPU_BOOKKEEPING_SZ;
>>
>> This accounts for holes/offsets in the memory map which lead to the 
>> top of
>> memory having pfn > totalram_pages().
>>
> 
> I agree that this check should use max. addressable pfn to account for 
> holes in the physical address map. The BIOS will probably also be 
> computing RMP table size to cover the entire physical memory, which 
> should be max. addressable PFN.
> 
> But, then we primarily need to check that all available RAM pages are 
> covered by the RMP table, so the above check is sufficient for that, 
> right ?
> 
> Also, I assume that max_pfn will take into account any hotplugged memory, 
> as I do know that totalram_pages() handles hotplugged memory.
> 

But essentially you are correct: as the RMP table is indexed by PFN, we 
have to take into account these physical memory holes so that we can have 
entries for the max DRAM SPA, so I will fix this to use the max. 
addressable PFN, i.e., max_pfn.

Thanks,
Ashish

> 
>>> +
>>> +    if (calc_rmp_sz > rmp_sz) {
>>> +        pr_err("Memory reserved for the RMP table does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
>>> +               calc_rmp_sz, rmp_sz);
>>> +        return false;
>>> +    }
>>> +
>>> +    *start = rmp_base;
>>> +    *len = rmp_sz;
>>> +
>>> +    pr_info("RMP table physical address [0x%016llx - 0x%016llx]\n", rmp_base, rmp_end);
>>> +
>>> +    return true;
>>> +}
>>> +
Borislav Petkov Feb. 2, 2023, 11:16 a.m. UTC | #6
On Wed, Dec 14, 2022 at 01:40:06PM -0600, Michael Roth wrote:
> From: Brijesh Singh <brijesh.singh@amd.com>
> 
> The memory integrity guarantees of SEV-SNP are enforced through a new
> structure called the Reverse Map Table (RMP). The RMP is a single data
> structure shared across the system that contains one entry for every 4K
> page of DRAM that may be used by SEV-SNP VMs. The goal of RMP is to
> track the owner of each page of memory. Pages of memory can be owned by
> the hypervisor, owned by a specific VM or owned by the AMD-SP. See APM2
> section 15.36.3 for more detail on RMP.
> 
> The RMP table is used to enforce access control to memory. The table itself
> is not directly writable by the software. New CPU instructions (RMPUPDATE,
> PVALIDATE, RMPADJUST) are used to manipulate the RMP entries.
> 
> Based on the platform configuration, the BIOS reserves the memory used
> for the RMP table. The start and end address of the RMP table must be
> queried by reading the RMP_BASE and RMP_END MSRs. If the RMP_BASE and
> RMP_END are not set then disable the SEV-SNP feature.
> 
> The SEV-SNP feature is enabled only after the RMP table is successfully
> initialized.
> 
> Also set SYSCFG.MFMD when enabling SNP as SEV-SNP FW >= 1.51 requires
> that SYSCFG.MFMD must be se

			   set.
> 
> RMP table entry format is non-architectural and it can vary by processor
> and is defined by the PPR. Restrict SNP support on the known CPU model
> and family for which the RMP table entry format is currently defined for.
> 
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-b: Ashish Kalra <ashish.kalra@amd.com>
	     ^^

Somebody ate a 'y' here. :)

> Signed-off-by: Michael Roth <michael.roth@amd.com>
> ---
>  arch/x86/include/asm/disabled-features.h |   8 +-
>  arch/x86/include/asm/msr-index.h         |  11 +-
>  arch/x86/kernel/sev.c                    | 180 +++++++++++++++++++++++
>  3 files changed, 197 insertions(+), 2 deletions(-)

...

> +static __init int __snp_rmptable_init(void)

Why is this one carved out of snp_rmptable_init() ?

> +{
> +	u64 rmp_base, sz;
> +	void *start;
> +	u64 val;
> +
> +	if (!get_rmptable_info(&rmp_base, &sz))
> +		return 1;
> +
> +	start = memremap(rmp_base, sz, MEMREMAP_WB);
> +	if (!start) {
> +		pr_err("Failed to map RMP table addr 0x%llx size 0x%llx\n", rmp_base, sz);
> +		return 1;
> +	}
> +
> +	/*
> +	 * Check if SEV-SNP is already enabled, this can happen in case of
> +	 * kexec boot.
> +	 */
> +	rdmsrl(MSR_AMD64_SYSCFG, val);
> +	if (val & MSR_AMD64_SYSCFG_SNP_EN)
> +		goto skip_enable;
> +
> +	/* Initialize the RMP table to zero */

Useless comment.

> +	memset(start, 0, sz);
> +
> +	/* Flush the caches to ensure that data is written before SNP is enabled. */
> +	wbinvd_on_all_cpus();
> +
> +	/* MFDM must be enabled on all the CPUs prior to enabling SNP. */
> +	on_each_cpu(mfd_enable, NULL, 1);
> +
> +	/* Enable SNP on all CPUs. */
> +	on_each_cpu(snp_enable, NULL, 1);

What happens if someone boots the machine with maxcpus=N, where N is
less than all CPUs on the machine? The hotplug notifier should handle it
but have you checked that it works fine?

> +skip_enable:
> +	rmptable_start = (unsigned long)start;
> +	rmptable_end = rmptable_start + sz - 1;
> +
> +	return 0;
> +}
> +
> +static int __init snp_rmptable_init(void)
> +{
> +	int family, model;
> +
> +	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
> +		return 0;
> +
> +	family = boot_cpu_data.x86;
> +	model  = boot_cpu_data.x86_model;

Looks useless - just use boot_cpu_data directly below.

> +
> +	/*
> +	 * RMP table entry format is not architectural and it can vary by processor and
> +	 * is defined by the per-processor PPR. Restrict SNP support on the known CPU
> +	 * model and family for which the RMP table entry format is currently defined for.
> +	 */
> +	if (family != 0x19 || model > 0xaf)
> +		goto nosnp;
> +
> +	if (amd_iommu_snp_enable())
> +		goto nosnp;
> +
> +	if (__snp_rmptable_init())
> +		goto nosnp;
> +
> +	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __snp_enable, NULL);
> +
> +	return 0;
> +
> +nosnp:
> +	setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
> +	return -ENOSYS;
> +}

Thx.
diff mbox series

Patch

diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 33d2cd04d254..9b5a2cc8064a 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -87,6 +87,12 @@ 
 # define DISABLE_TDX_GUEST	(1 << (X86_FEATURE_TDX_GUEST & 31))
 #endif
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+# define DISABLE_SEV_SNP	0
+#else
+# define DISABLE_SEV_SNP	(1 << (X86_FEATURE_SEV_SNP & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -110,7 +116,7 @@ 
 			 DISABLE_ENQCMD)
 #define DISABLED_MASK17	0
 #define DISABLED_MASK18	0
-#define DISABLED_MASK19	0
+#define DISABLED_MASK19	(DISABLE_SEV_SNP)
 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 20)
 
 #endif /* _ASM_X86_DISABLED_FEATURES_H */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 10ac52705892..35100c630617 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -565,6 +565,8 @@ 
 #define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
 #define MSR_AMD64_SEV_ES_ENABLED	BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
 #define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
+#define MSR_AMD64_RMP_BASE		0xc0010132
+#define MSR_AMD64_RMP_END		0xc0010133
 
 #define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
 
@@ -649,7 +651,14 @@ 
 #define MSR_K8_TOP_MEM2			0xc001001d
 #define MSR_AMD64_SYSCFG		0xc0010010
 #define MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT	23
-#define MSR_AMD64_SYSCFG_MEM_ENCRYPT	BIT_ULL(MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT)
+#define MSR_AMD64_SYSCFG_MEM_ENCRYPT		BIT_ULL(MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT)
+#define MSR_AMD64_SYSCFG_SNP_EN_BIT		24
+#define MSR_AMD64_SYSCFG_SNP_EN		BIT_ULL(MSR_AMD64_SYSCFG_SNP_EN_BIT)
+#define MSR_AMD64_SYSCFG_SNP_VMPL_EN_BIT	25
+#define MSR_AMD64_SYSCFG_SNP_VMPL_EN		BIT_ULL(MSR_AMD64_SYSCFG_SNP_VMPL_EN_BIT)
+#define MSR_AMD64_SYSCFG_MFDM_BIT		19
+#define MSR_AMD64_SYSCFG_MFDM			BIT_ULL(MSR_AMD64_SYSCFG_MFDM_BIT)
+
 #define MSR_K8_INT_PENDING_MSG		0xc0010055
 /* C1E active bits in int pending message */
 #define K8_INTP_C1E_ACTIVE_MASK		0x18000000
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index a428c62330d3..687a91284506 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -22,6 +22,9 @@ 
 #include <linux/efi.h>
 #include <linux/platform_device.h>
 #include <linux/io.h>
+#include <linux/cpumask.h>
+#include <linux/iommu.h>
+#include <linux/amd-iommu.h>
 
 #include <asm/cpu_entry_area.h>
 #include <asm/stacktrace.h>
@@ -38,6 +41,7 @@ 
 #include <asm/apic.h>
 #include <asm/cpuid.h>
 #include <asm/cmdline.h>
+#include <asm/iommu.h>
 
 #define DR7_RESET_VALUE        0x400
 
@@ -57,6 +61,12 @@ 
 #define AP_INIT_CR0_DEFAULT		0x60000010
 #define AP_INIT_MXCSR_DEFAULT		0x1f80
 
+/*
+ * The first 16KB from the RMP_BASE is used by the processor for the
+ * bookkeeping, the range needs to be added during the RMP entry lookup.
+ */
+#define RMPTABLE_CPU_BOOKKEEPING_SZ	0x4000
+
 /* For early boot hypervisor communication in SEV-ES enabled guests */
 static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
 
@@ -69,6 +79,9 @@  static struct ghcb *boot_ghcb __section(".data");
 /* Bitmap of SEV features supported by the hypervisor */
 static u64 sev_hv_features __ro_after_init;
 
+static unsigned long rmptable_start __ro_after_init;
+static unsigned long rmptable_end __ro_after_init;
+
 /* #VC handler runtime per-CPU data */
 struct sev_es_runtime_data {
 	struct ghcb ghcb_page;
@@ -2260,3 +2273,170 @@  static int __init snp_init_platform_device(void)
 	return 0;
 }
 device_initcall(snp_init_platform_device);
+
+#undef pr_fmt
+#define pr_fmt(fmt)	"SEV-SNP: " fmt
+
+static int __mfd_enable(unsigned int cpu)
+{
+	u64 val;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return 0;
+
+	rdmsrl(MSR_AMD64_SYSCFG, val);
+
+	val |= MSR_AMD64_SYSCFG_MFDM;
+
+	wrmsrl(MSR_AMD64_SYSCFG, val);
+
+	return 0;
+}
+
+static __init void mfd_enable(void *arg)
+{
+	__mfd_enable(smp_processor_id());
+}
+
+static int __snp_enable(unsigned int cpu)
+{
+	u64 val;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return 0;
+
+	rdmsrl(MSR_AMD64_SYSCFG, val);
+
+	val |= MSR_AMD64_SYSCFG_SNP_EN;
+	val |= MSR_AMD64_SYSCFG_SNP_VMPL_EN;
+
+	wrmsrl(MSR_AMD64_SYSCFG, val);
+
+	return 0;
+}
+
+static __init void snp_enable(void *arg)
+{
+	__snp_enable(smp_processor_id());
+}
+
+static bool get_rmptable_info(u64 *start, u64 *len)
+{
+	u64 calc_rmp_sz, rmp_sz, rmp_base, rmp_end;
+
+	rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
+	rdmsrl(MSR_AMD64_RMP_END, rmp_end);
+
+	if (!rmp_base || !rmp_end) {
+		pr_err("Memory for the RMP table has not been reserved by BIOS\n");
+		return false;
+	}
+
+	rmp_sz = rmp_end - rmp_base + 1;
+
+	/*
+	 * Calculate the amount the memory that must be reserved by the BIOS to
+	 * address the whole RAM. The reserved memory should also cover the
+	 * RMP table itself.
+	 */
+	calc_rmp_sz = (((rmp_sz >> PAGE_SHIFT) + totalram_pages()) << 4) + RMPTABLE_CPU_BOOKKEEPING_SZ;
+
+	if (calc_rmp_sz > rmp_sz) {
+		pr_err("Memory reserved for the RMP table does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
+		       calc_rmp_sz, rmp_sz);
+		return false;
+	}
+
+	*start = rmp_base;
+	*len = rmp_sz;
+
+	pr_info("RMP table physical address [0x%016llx - 0x%016llx]\n", rmp_base, rmp_end);
+
+	return true;
+}
+
+static __init int __snp_rmptable_init(void)
+{
+	u64 rmp_base, sz;
+	void *start;
+	u64 val;
+
+	if (!get_rmptable_info(&rmp_base, &sz))
+		return 1;
+
+	start = memremap(rmp_base, sz, MEMREMAP_WB);
+	if (!start) {
+		pr_err("Failed to map RMP table addr 0x%llx size 0x%llx\n", rmp_base, sz);
+		return 1;
+	}
+
+	/*
+	 * Check if SEV-SNP is already enabled, this can happen in case of
+	 * kexec boot.
+	 */
+	rdmsrl(MSR_AMD64_SYSCFG, val);
+	if (val & MSR_AMD64_SYSCFG_SNP_EN)
+		goto skip_enable;
+
+	/* Initialize the RMP table to zero */
+	memset(start, 0, sz);
+
+	/* Flush the caches to ensure that data is written before SNP is enabled. */
+	wbinvd_on_all_cpus();
+
+	/* MFDM must be enabled on all the CPUs prior to enabling SNP. */
+	on_each_cpu(mfd_enable, NULL, 1);
+
+	/* Enable SNP on all CPUs. */
+	on_each_cpu(snp_enable, NULL, 1);
+
+skip_enable:
+	rmptable_start = (unsigned long)start;
+	rmptable_end = rmptable_start + sz - 1;
+
+	return 0;
+}
+
+static int __init snp_rmptable_init(void)
+{
+	int family, model;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return 0;
+
+	family = boot_cpu_data.x86;
+	model  = boot_cpu_data.x86_model;
+
+	/*
+	 * RMP table entry format is not architectural and it can vary by processor and
+	 * is defined by the per-processor PPR. Restrict SNP support on the known CPU
+	 * model and family for which the RMP table entry format is currently defined for.
+	 */
+	if (family != 0x19 || model > 0xaf)
+		goto nosnp;
+
+	if (amd_iommu_snp_enable())
+		goto nosnp;
+
+	if (__snp_rmptable_init())
+		goto nosnp;
+
+	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __snp_enable, NULL);
+
+	return 0;
+
+nosnp:
+	setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
+	return -ENOSYS;
+}
+
+/*
+ * This must be called after the PCI subsystem. This is because amd_iommu_snp_enable()
+ * is called to ensure the IOMMU supports the SEV-SNP feature, which can only be
+ * called after subsys_initcall().
+ *
+ * NOTE: IOMMU is enforced by SNP to ensure that hypervisor cannot program DMA
+ * directly into guest private memory. In case of SNP, the IOMMU ensures that
+ * the page(s) used for DMA are hypervisor owned.
+ */
+fs_initcall(snp_rmptable_init);