Message ID | 20231017202505.340906-3-rick.p.edgecombe@intel.com (mailing list archive)
---|---
State | New |
Series | Handle set_memory_XXcrypted() errors
* Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:

> Kernel memory is "encrypted" by default. Some callers may "decrypt" it
> in order to share it with things outside the kernel like a device or an
> untrusted VMM.
>
> There is nothing to stop set_memory_encrypted() from being passed memory
> that is already "encrypted" (aka. "private" on TDX). In fact, some
> callers do this because ... $REASONS. Unfortunately, part of the TDX
> decrypted=>encrypted transition is truly one way*. It can't handle
> being asked to encrypt an already encrypted page
>
> Allow __set_memory_enc_pgtable() to detect already-encrypted memory
> before it hits the TDX code.
>
> * The one way part is "page acceptance"
>
> [commit log written by Dave Hansen]
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
>  arch/x86/mm/pat/set_memory.c | 41 +++++++++++++++++++++++++++++++++++-
>  1 file changed, 40 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index bda9f129835e..1238b0db3e33 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -2122,6 +2122,21 @@ int set_memory_global(unsigned long addr, int numpages)
>  				    __pgprot(_PAGE_GLOBAL), 0);
>  }
>
> +static bool kernel_vaddr_encryped(unsigned long addr, bool enc)
> +{
> +	unsigned int level;
> +	pte_t *pte;
> +
> +	pte = lookup_address(addr, &level);
> +	if (!pte)
> +		return false;
> +
> +	if (enc)
> +		return pte_val(*pte) == cc_mkenc(pte_val(*pte));
> +
> +	return pte_val(*pte) == cc_mkdec(pte_val(*pte));
> +}
> +
>  /*
>   * __set_memory_enc_pgtable() is used for the hypervisors that get
>   * informed about "encryption" status via page tables.
> @@ -2130,7 +2145,7 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
>  {
>  	pgprot_t empty = __pgprot(0);
>  	struct cpa_data cpa;
> -	int ret;
> +	int ret, numpages_in_state = 0;
>
>  	/* Should not be working on unaligned addresses */
>  	if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
> @@ -2143,6 +2158,30 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
>  	cpa.mask_clr = enc ? pgprot_decrypted(empty) : pgprot_encrypted(empty);
>  	cpa.pgd = init_mm.pgd;
>
> +	/*
> +	 * If any page is already in the right state, bail with an error
> +	 * because the code doesn't handled it. This is likely because

Grammar mistake here.

> +	 * something has gone wrong and isn't worth optimizing for.
> +	 *
> +	 * If all the memory pages are already in the desired state return
> +	 * success.

Missing comma.

> +	 *
> +	 * kernel_vaddr_encryped() does not synchronize against huge page
> +	 * splits so take pgd_lock. A caller doing strange things could

Missing comma.

Thanks,

	Ingo
On Wed, 2023-10-18 at 10:44 +0200, Ingo Molnar wrote:
> > +	/*
> > +	 * If any page is already in the right state, bail with an error
> > +	 * because the code doesn't handled it. This is likely because
>
> Grammar mistake here.
>
> > +	 * something has gone wrong and isn't worth optimizing for.
> > +	 *
> > +	 * If all the memory pages are already in the desired state return
> > +	 * success.
>
> Missing comma.
>
> > +	 *
> > +	 * kernel_vaddr_encryped() does not synchronize against huge page
> > +	 * splits so take pgd_lock. A caller doing strange things could
>
> Missing comma.

Oh, yep. Thanks.
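For reference, a version of the comment with all three of Ingo's fixes applied might read as follows (this is one plausible rewording, not text posted in the thread):

	/*
	 * If any page is already in the right state, bail with an error
	 * because the code doesn't handle it. This is likely because
	 * something has gone wrong and isn't worth optimizing for.
	 *
	 * If all the memory pages are already in the desired state, return
	 * success.
	 *
	 * kernel_vaddr_encryped() does not synchronize against huge page
	 * splits, so take pgd_lock. A caller doing strange things could
	 * get a new PMD mid level PTE confused with a huge PMD entry. Just
	 * lock to tie up loose ends.
	 */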
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index bda9f129835e..1238b0db3e33 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2122,6 +2122,21 @@ int set_memory_global(unsigned long addr, int numpages)
 				    __pgprot(_PAGE_GLOBAL), 0);
 }
 
+static bool kernel_vaddr_encryped(unsigned long addr, bool enc)
+{
+	unsigned int level;
+	pte_t *pte;
+
+	pte = lookup_address(addr, &level);
+	if (!pte)
+		return false;
+
+	if (enc)
+		return pte_val(*pte) == cc_mkenc(pte_val(*pte));
+
+	return pte_val(*pte) == cc_mkdec(pte_val(*pte));
+}
+
 /*
  * __set_memory_enc_pgtable() is used for the hypervisors that get
  * informed about "encryption" status via page tables.
@@ -2130,7 +2145,7 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
 {
 	pgprot_t empty = __pgprot(0);
 	struct cpa_data cpa;
-	int ret;
+	int ret, numpages_in_state = 0;
 
 	/* Should not be working on unaligned addresses */
 	if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
@@ -2143,6 +2158,30 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
 	cpa.mask_clr = enc ? pgprot_decrypted(empty) : pgprot_encrypted(empty);
 	cpa.pgd = init_mm.pgd;
 
+	/*
+	 * If any page is already in the right state, bail with an error
+	 * because the code doesn't handled it. This is likely because
+	 * something has gone wrong and isn't worth optimizing for.
+	 *
+	 * If all the memory pages are already in the desired state return
+	 * success.
+	 *
+	 * kernel_vaddr_encryped() does not synchronize against huge page
+	 * splits so take pgd_lock. A caller doing strange things could
+	 * get a new PMD mid level PTE confused with a huge PMD entry. Just
+	 * lock to tie up loose ends.
+	 */
+	spin_lock(&pgd_lock);
+	for (int i = 0; i < numpages; i++) {
+		if (kernel_vaddr_encryped(addr + (PAGE_SIZE * i), enc))
+			numpages_in_state++;
+	}
+	spin_unlock(&pgd_lock);
+	if (numpages_in_state == numpages)
+		return 0;
+	else if (numpages_in_state)
+		return 1;
+
 	/* Must avoid aliasing mappings in the highmem code */
 	kmap_flush_unused();
 	vm_unmap_aliases();
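One consequence for callers: with this change, set_memory_encrypted() and set_memory_decrypted() can return a nonzero value when the range was only partially in the expected state, in addition to the pre-existing failure cases. A minimal sketch of a hypothetical caller coping with that follows; the function name and buffer handling are illustrative, not taken from the series:

#include <linux/gfp.h>
#include <linux/printk.h>
#include <linux/set_memory.h>

/*
 * Hypothetical teardown path for a buffer that was shared with the
 * host: convert it back to private before freeing. Any nonzero
 * return -- a conversion failure or a partially-converted range --
 * means the pages are in an unknown state, so leak them rather than
 * return them to the page allocator.
 */
static int example_teardown(void *vaddr, int numpages)
{
	int ret;

	ret = set_memory_encrypted((unsigned long)vaddr, numpages);
	if (ret) {
		pr_warn("re-encryption of %d pages failed (%d), leaking them\n",
			numpages, ret);
		return ret;
	}

	free_pages((unsigned long)vaddr, get_order(numpages << PAGE_SHIFT));
	return 0;
}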
Kernel memory is "encrypted" by default. Some callers may "decrypt" it
in order to share it with things outside the kernel like a device or an
untrusted VMM.

There is nothing to stop set_memory_encrypted() from being passed memory
that is already "encrypted" (aka. "private" on TDX). In fact, some
callers do this because ... $REASONS. Unfortunately, part of the TDX
decrypted=>encrypted transition is truly one way*. It can't handle
being asked to encrypt an already encrypted page.

Allow __set_memory_enc_pgtable() to detect already-encrypted memory
before it hits the TDX code.

* The one way part is "page acceptance"

[commit log written by Dave Hansen]

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/mm/pat/set_memory.c | 41 +++++++++++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)
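One way an already-encrypted page can reach set_memory_encrypted() in principle is an error path that unconditionally undoes a decryption that never fully happened. The sketch below is purely illustrative (the commit message deliberately leaves the real callers' motivations as "$REASONS", and every function name here is hypothetical):

#include <linux/errno.h>
#include <linux/set_memory.h>

/*
 * Illustrative stand-in for whatever makes the shared range visible
 * to the host; not part of the patch. Always fails here to exercise
 * the cleanup path.
 */
static int example_register_with_vmm(unsigned long addr, int numpages)
{
	return -EIO;
}

static int example_setup(unsigned long addr, int numpages)
{
	int ret;

	ret = set_memory_decrypted(addr, numpages);
	if (ret)
		goto err;

	ret = example_register_with_vmm(addr, numpages);
	if (ret)
		goto err;

	return 0;
err:
	/*
	 * If the failure happened before (or during) the decrypt, this
	 * re-encrypts pages that are still private. Before this patch,
	 * that could reach the one-way TDX acceptance path; with it,
	 * __set_memory_enc_pgtable() detects the state mismatch first.
	 */
	set_memory_encrypted(addr, numpages);
	return ret;
}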