diff mbox series

[RFC,44/62] x86/mm: Set KeyIDs in encrypted VMAs for MKTME

Message ID 20190508144422.13171-45-kirill.shutemov@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series Intel MKTME enabling | expand

Commit Message

Kirill A . Shutemov May 8, 2019, 2:44 p.m. UTC
From: Alison Schofield <alison.schofield@intel.com>

MKTME architecture requires the KeyID to be placed in PTE bits 51:46.
To create an encrypted VMA, place the KeyID in the upper bits of
vm_page_prot that matches the position of those PTE bits.

When the VMA is assigned a KeyID it is always considered a KeyID
change. The VMA is either going from not encrypted to encrypted,
or from encrypted with any KeyID to encrypted with any other KeyID.
To make the change safely, remove the user pages held by the VMA
and unlink the VMA's anonymous chain.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/mktme.h |  4 ++++
 arch/x86/mm/mktme.c          | 26 ++++++++++++++++++++++++++
 include/linux/mm.h           |  6 ++++++
 3 files changed, 36 insertions(+)

Comments

Peter Zijlstra June 14, 2019, 11:44 a.m. UTC | #1
On Wed, May 08, 2019 at 05:44:04PM +0300, Kirill A. Shutemov wrote:
> From: Alison Schofield <alison.schofield@intel.com>
> 
> MKTME architecture requires the KeyID to be placed in PTE bits 51:46.
> To create an encrypted VMA, place the KeyID in the upper bits of
> vm_page_prot that matches the position of those PTE bits.
> 
> When the VMA is assigned a KeyID it is always considered a KeyID
> change. The VMA is either going from not encrypted to encrypted,
> or from encrypted with any KeyID to encrypted with any other KeyID.
> To make the change safely, remove the user pages held by the VMA
> and unlink the VMA's anonymous chain.

This does not look like a transformation that preserves content; is
mprotect() still a suitable name?
Alison Schofield June 14, 2019, 5:33 p.m. UTC | #2
On Fri, Jun 14, 2019 at 01:44:08PM +0200, Peter Zijlstra wrote:
> On Wed, May 08, 2019 at 05:44:04PM +0300, Kirill A. Shutemov wrote:
> > From: Alison Schofield <alison.schofield@intel.com>
> > 
> > MKTME architecture requires the KeyID to be placed in PTE bits 51:46.
> > To create an encrypted VMA, place the KeyID in the upper bits of
> > vm_page_prot that matches the position of those PTE bits.
> > 
> > When the VMA is assigned a KeyID it is always considered a KeyID
> > change. The VMA is either going from not encrypted to encrypted,
> > or from encrypted with any KeyID to encrypted with any other KeyID.
> > To make the change safely, remove the user pages held by the VMA
> > and unlink the VMA's anonymous chain.
> 
> This does not look like a transformation that preserves content; is
> mprotect() still a suitable name?

Data is not preserved across KeyID changes, by design.

Background:
We chose to implement encrypt_mprotect() as an extension
to the legacy mprotect so that memory allocated in any
method could be encrypted. ie. we didn't want to be tied
to mmap. As an mprotect extension, encrypt_mprotect also
supports the changing of access flags.

The usage we suggest is:
1) alloc the memory w PROT_NONE to prevent any usage before
   encryption
2) use encrypt_mprotect() to add the key and change the
   access to  PROT_WRITE|PROT_READ.

Preserving the data across encryption key changes has not
been a requirement. I'm not clear if it was ever considered
and rejected. I believe that copying in order to preserve
the data was never considered.

Back to your naming question:
Since it is an mprotect extension, it seems we need to keep
the mprotect in the name. 

Thanks for bringing it up. It would be good to hear more
thoughts on this.
Dave Hansen June 14, 2019, 6:26 p.m. UTC | #3
On 6/14/19 10:33 AM, Alison Schofield wrote:
> Preserving the data across encryption key changes has not
> been a requirement. I'm not clear if it was ever considered
> and rejected. I believe that copying in order to preserve
> the data was never considered.

We could preserve the data pretty easily.  It's just annoying, though.
Right now, our only KeyID conversions happen in the page allocator.  If
we were to convert in-place, we'd need something along the lines of:

	1. Allocate a scratch page
	2. Unmap target page, or at least make it entirely read-only
	3. Copy plaintext into scratch page
	4. Do cache KeyID conversion of page being converted:
	   Flush caches, change page_ext metadata
	5. Copy plaintext back into target page from scratch area
	6. Re-establish PTEs with new KeyID

#2 is *really* hard.  It's similar to the problems that the poor
filesystem guys are having with RDMA these days when RDMA is doing writes.

What we have here (destroying existing data) is certainly the _simplest_
semantic.  We can certainly give it a different name, or even non-PROT_*
semantics where it shares none of mprotect()'s functionality.

Doesn't really matter to me at all.
Alison Schofield June 14, 2019, 6:46 p.m. UTC | #4
On Fri, Jun 14, 2019 at 11:26:10AM -0700, Dave Hansen wrote:
> On 6/14/19 10:33 AM, Alison Schofield wrote:
> > Preserving the data across encryption key changes has not
> > been a requirement. I'm not clear if it was ever considered
> > and rejected. I believe that copying in order to preserve
> > the data was never considered.
> 
> We could preserve the data pretty easily.  It's just annoying, though.
> Right now, our only KeyID conversions happen in the page allocator.  If
> we were to convert in-place, we'd need something along the lines of:
> 
> 	1. Allocate a scratch page
> 	2. Unmap target page, or at least make it entirely read-only
> 	3. Copy plaintext into scratch page
> 	4. Do cache KeyID conversion of page being converted:
> 	   Flush caches, change page_ext metadata
> 	5. Copy plaintext back into target page from scratch area
> 	6. Re-establish PTEs with new KeyID

Seems like the 'Copy plaintext' steps might disappoint the user, as
much as the 'we don't preserve your data' design. Would users be happy
w the plain text steps ?
Alison

> 
> #2 is *really* hard.  It's similar to the problems that the poor
> filesystem guys are having with RDMA these days when RDMA is doing writes.
> 
> What we have here (destroying existing data) is certainly the _simplest_
> semantic.  We can certainly give it a different name, or even non-PROT_*
> semantics where it shares none of mprotect()'s functionality.
> 
> Doesn't really matter to me at all.
Dave Hansen June 14, 2019, 7:11 p.m. UTC | #5
On 6/14/19 11:46 AM, Alison Schofield wrote:
> On Fri, Jun 14, 2019 at 11:26:10AM -0700, Dave Hansen wrote:
>> On 6/14/19 10:33 AM, Alison Schofield wrote:
>>> Preserving the data across encryption key changes has not
>>> been a requirement. I'm not clear if it was ever considered
>>> and rejected. I believe that copying in order to preserve
>>> the data was never considered.
>>
>> We could preserve the data pretty easily.  It's just annoying, though.
>> Right now, our only KeyID conversions happen in the page allocator.  If
>> we were to convert in-place, we'd need something along the lines of:
>>
>> 	1. Allocate a scratch page
>> 	2. Unmap target page, or at least make it entirely read-only
>> 	3. Copy plaintext into scratch page
>> 	4. Do cache KeyID conversion of page being converted:
>> 	   Flush caches, change page_ext metadata
>> 	5. Copy plaintext back into target page from scratch area
>> 	6. Re-establish PTEs with new KeyID
> 
> Seems like the 'Copy plaintext' steps might disappoint the user, as
> much as the 'we don't preserve your data' design. Would users be happy
> w the plain text steps ?

Well, it got to be plaintext because they wrote it to memory in
plaintext in the first place, so it's kinda hard to disappoint them. :)

IMNHO, the *vast* majority of cases, folks will allocate memory and then
put a secret in it.  They aren't going to *get* a secret in some
mysterious fashion and then later decide they want to protect it.  In
other words, the inability to convert it is pretty academic and not
worth the complexity.
Peter Zijlstra June 17, 2019, 9:10 a.m. UTC | #6
On Fri, Jun 14, 2019 at 12:11:23PM -0700, Dave Hansen wrote:
> On 6/14/19 11:46 AM, Alison Schofield wrote:
> > On Fri, Jun 14, 2019 at 11:26:10AM -0700, Dave Hansen wrote:
> >> On 6/14/19 10:33 AM, Alison Schofield wrote:
> >>> Preserving the data across encryption key changes has not
> >>> been a requirement. I'm not clear if it was ever considered
> >>> and rejected. I believe that copying in order to preserve
> >>> the data was never considered.
> >>
> >> We could preserve the data pretty easily.  It's just annoying, though.
> >> Right now, our only KeyID conversions happen in the page allocator.  If
> >> we were to convert in-place, we'd need something along the lines of:
> >>
> >> 	1. Allocate a scratch page
> >> 	2. Unmap target page, or at least make it entirely read-only
> >> 	3. Copy plaintext into scratch page
> >> 	4. Do cache KeyID conversion of page being converted:
> >> 	   Flush caches, change page_ext metadata
> >> 	5. Copy plaintext back into target page from scratch area
> >> 	6. Re-establish PTEs with new KeyID
> > 
> > Seems like the 'Copy plaintext' steps might disappoint the user, as
> > much as the 'we don't preserve your data' design. Would users be happy
> > w the plain text steps ?
> 
> Well, it got to be plaintext because they wrote it to memory in
> plaintext in the first place, so it's kinda hard to disappoint them. :)
> 
> IMNHO, the *vast* majority of cases, folks will allocate memory and then
> put a secret in it.  They aren't going to *get* a secret in some
> mysterious fashion and then later decide they want to protect it.  In
> other words, the inability to convert it is pretty academic and not
> worth the complexity.

I'm not saying it is (required to preserve); but I do think it is
somewhat surprising to have an mprotect() call destroy content. It's
traditionally specified to not do that.
diff mbox series

Patch

diff --git a/arch/x86/include/asm/mktme.h b/arch/x86/include/asm/mktme.h
index bd6707e73219..0e6df07f1921 100644
--- a/arch/x86/include/asm/mktme.h
+++ b/arch/x86/include/asm/mktme.h
@@ -12,6 +12,10 @@  extern phys_addr_t mktme_keyid_mask;
 extern int mktme_nr_keyids;
 extern int mktme_keyid_shift;
 
+/* Set the encryption keyid bits in a VMA */
+extern void mprotect_set_encrypt(struct vm_area_struct *vma, int newkeyid,
+				unsigned long start, unsigned long end);
+
 DECLARE_STATIC_KEY_FALSE(mktme_enabled_key);
 static inline bool mktme_enabled(void)
 {
diff --git a/arch/x86/mm/mktme.c b/arch/x86/mm/mktme.c
index 024165c9c7f3..91b49e88ca3f 100644
--- a/arch/x86/mm/mktme.c
+++ b/arch/x86/mm/mktme.c
@@ -1,5 +1,6 @@ 
 #include <linux/mm.h>
 #include <linux/highmem.h>
+#include <linux/rmap.h>
 #include <asm/mktme.h>
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
@@ -53,6 +54,31 @@  int __vma_keyid(struct vm_area_struct *vma)
 	return (prot & mktme_keyid_mask) >> mktme_keyid_shift;
 }
 
+/* Set the encryption keyid bits in a VMA */
+void mprotect_set_encrypt(struct vm_area_struct *vma, int newkeyid,
+			  unsigned long start, unsigned long end)
+{
+	int oldkeyid = vma_keyid(vma);
+	pgprotval_t newprot;
+
+	/* Unmap pages with old KeyID if there's any. */
+	zap_page_range(vma, start, end - start);
+
+	if (oldkeyid == newkeyid)
+		return;
+
+	newprot = pgprot_val(vma->vm_page_prot);
+	newprot &= ~mktme_keyid_mask;
+	newprot |= (unsigned long)newkeyid << mktme_keyid_shift;
+	vma->vm_page_prot = __pgprot(newprot);
+
+	/*
+	 * The VMA doesn't have any inherited pages.
+	 * Start anon VMA tree from scratch.
+	 */
+	unlink_anon_vmas(vma);
+}
+
 /* Prepare page to be used for encryption. Called from page allocator. */
 void __prep_encrypted_page(struct page *page, int order, int keyid, bool zero)
 {
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2684245f8503..c027044de9bf 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2825,5 +2825,11 @@  void __init setup_nr_node_ids(void);
 static inline void setup_nr_node_ids(void) {}
 #endif
 
+#ifndef CONFIG_X86_INTEL_MKTME
+static inline void mprotect_set_encrypt(struct vm_area_struct *vma,
+					int newkeyid,
+					unsigned long start,
+					unsigned long end) {}
+#endif /* CONFIG_X86_INTEL_MKTME */
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */