Message ID | 20190508144422.13171-14-kirill.shutemov@linux.intel.com (mailing list archive)
---|---
State | New, archived
Series | Intel MKTME enabling
On Wed, May 08, 2019 at 05:43:33PM +0300, Kirill A. Shutemov wrote:
> +/* Prepare page to be used for encryption. Called from page allocator. */
> +void __prep_encrypted_page(struct page *page, int order, int keyid, bool zero)
> +{
> +	int i;
> +
> +	/*
> +	 * The hardware/CPU does not enforce coherency between mappings
> +	 * of the same physical page with different KeyIDs or
> +	 * encryption keys. We are responsible for cache management.
> +	 */

On alloc we should flush the unencrypted (key=0) range, while on free
(below) we should flush the encrypted (key!=0) range.

But I seem to have missed where page_address() does the right thing
here.

> +	clflush_cache_range(page_address(page), PAGE_SIZE * (1UL << order));
> +
> +	for (i = 0; i < (1 << order); i++) {
> +		/* All pages coming out of the allocator should have KeyID 0 */
> +		WARN_ON_ONCE(lookup_page_ext(page)->keyid);
> +		lookup_page_ext(page)->keyid = keyid;
> +

So presumably page_address() is affected by this keyid, and the below
clear_highpage() then accesses the 'right' location?

> +		/* Clear the page after the KeyID is set. */
> +		if (zero)
> +			clear_highpage(page);
> +
> +		page++;
> +	}
> +}
> +
> +/*
> + * Handles freeing of encrypted page.
> + * Called from page allocator on freeing encrypted page.
> + */
> +void free_encrypted_page(struct page *page, int order)
> +{
> +	int i;
> +
> +	/*
> +	 * The hardware/CPU does not enforce coherency between mappings
> +	 * of the same physical page with different KeyIDs or
> +	 * encryption keys. We are responsible for cache management.
> +	 */

I still don't like that comment much; yes the hardware doesn't do it,
and yes we have to do it, but it doesn't explain the actual scheme
employed to do so.

> +	clflush_cache_range(page_address(page), PAGE_SIZE * (1UL << order));
> +
> +	for (i = 0; i < (1 << order); i++) {
> +		/* Check if the page has reasonable KeyID */
> +		WARN_ON_ONCE(lookup_page_ext(page)->keyid > mktme_nr_keyids);

It should also check keyid > 0, so maybe:

	(unsigned)(keyid - 1) > keyids-1

instead?

> +		lookup_page_ext(page)->keyid = 0;
> +		page++;
> +	}
> +}
> --
> 2.20.1
>
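The single-comparison range check suggested above is a common kernel idiom: casting to unsigned makes keyid == 0 wrap around to UINT_MAX, so one compare rejects both out-of-range ends. A minimal standalone sketch of the idiom follows; mktme_nr_keyids here is only an illustrative stand-in for the per-boot value the series detects elsewhere, not the patch's code:

	#include <stdio.h>

	/* Illustrative stand-in for the per-boot KeyID count. */
	static unsigned int mktme_nr_keyids = 63;

	/* True when keyid is outside the valid range 1..mktme_nr_keyids. */
	static int keyid_out_of_range(int keyid)
	{
		/*
		 * keyid == 0 makes (unsigned)(keyid - 1) wrap to UINT_MAX,
		 * so a single unsigned comparison catches both keyid < 1
		 * and keyid > mktme_nr_keyids.
		 */
		return (unsigned int)(keyid - 1) > mktme_nr_keyids - 1;
	}

	int main(void)
	{
		printf("%d %d %d\n",
		       keyid_out_of_range(0),   /* 1: KeyID 0 is not encrypted */
		       keyid_out_of_range(1),   /* 0: valid */
		       keyid_out_of_range(64)); /* 1: above mktme_nr_keyids */
		return 0;
	}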
On Fri, Jun 14, 2019 at 11:34:09AM +0200, Peter Zijlstra wrote:
> On Wed, May 08, 2019 at 05:43:33PM +0300, Kirill A. Shutemov wrote:
>
> > +		lookup_page_ext(page)->keyid = keyid;
> > +		lookup_page_ext(page)->keyid = 0;

Also, perhaps paranoid; but do we want something like:

static inline void page_set_keyid(struct page *page, int keyid)
{
	/* ensure nothing creeps after changing the keyid */
	barrier();
	WRITE_ONCE(lookup_page_ext(page)->keyid, keyid);
	barrier();
	/* ensure nothing creeps before changing the keyid */
}

And this is very much assuming there is no concurrency through the
allocator locks.
On Fri, Jun 14, 2019 at 11:34:09AM +0200, Peter Zijlstra wrote:
> On Wed, May 08, 2019 at 05:43:33PM +0300, Kirill A. Shutemov wrote:
> > +/* Prepare page to be used for encryption. Called from page allocator. */
> > +void __prep_encrypted_page(struct page *page, int order, int keyid, bool zero)
> > +{
> > +	int i;
> > +
> > +	/*
> > +	 * The hardware/CPU does not enforce coherency between mappings
> > +	 * of the same physical page with different KeyIDs or
> > +	 * encryption keys. We are responsible for cache management.
> > +	 */
>
> On alloc we should flush the unencrypted (key=0) range, while on free
> (below) we should flush the encrypted (key!=0) range.
>
> But I seem to have missed where page_address() does the right thing
> here.

As you've seen by now, it will be addressed later in the patchset. I'll
update the changelog to indicate that page_address() handles KeyIDs
correctly.

> > +	clflush_cache_range(page_address(page), PAGE_SIZE * (1UL << order));
> > +
> > +	for (i = 0; i < (1 << order); i++) {
> > +		/* All pages coming out of the allocator should have KeyID 0 */
> > +		WARN_ON_ONCE(lookup_page_ext(page)->keyid);
> > +		lookup_page_ext(page)->keyid = keyid;
> > +
>
> So presumably page_address() is affected by this keyid, and the below
> clear_highpage() then accesses the 'right' location?

Yes. clear_highpage() -> kmap_atomic() -> page_address().

> > +		/* Clear the page after the KeyID is set. */
> > +		if (zero)
> > +			clear_highpage(page);
> > +
> > +		page++;
> > +	}
> > +}
> > +
> > +/*
> > + * Handles freeing of encrypted page.
> > + * Called from page allocator on freeing encrypted page.
> > + */
> > +void free_encrypted_page(struct page *page, int order)
> > +{
> > +	int i;
> > +
> > +	/*
> > +	 * The hardware/CPU does not enforce coherency between mappings
> > +	 * of the same physical page with different KeyIDs or
> > +	 * encryption keys. We are responsible for cache management.
> > +	 */
>
> I still don't like that comment much; yes the hardware doesn't do it,
> and yes we have to do it, but it doesn't explain the actual scheme
> employed to do so.

Fair enough. I'll do better.

> > +	clflush_cache_range(page_address(page), PAGE_SIZE * (1UL << order));
> > +
> > +	for (i = 0; i < (1 << order); i++) {
> > +		/* Check if the page has reasonable KeyID */
> > +		WARN_ON_ONCE(lookup_page_ext(page)->keyid > mktme_nr_keyids);
>
> It should also check keyid > 0, so maybe:
>
> 	(unsigned)(keyid - 1) > keyids-1
>
> instead?

Makes sense.
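For readers following along: the KeyID-aware page_address() that Kirill refers to is introduced later in the series. A rough, hypothetical sketch of the idea is below; the per-KeyID aliasing of the direct mapping and the direct_mapping_size and page_keyid() helpers are assumptions about those later patches, not something this patch provides:

	/*
	 * Hypothetical sketch only: each KeyID gets its own alias of the
	 * kernel direct mapping, direct_mapping_size bytes apart, so the
	 * virtual address page_address() returns carries the KeyID.
	 */
	static inline void *keyid_page_address(const struct page *page)
	{
		unsigned long addr = (unsigned long)lowmem_page_address(page);

		/* Step from the KeyID-0 alias to this page's KeyID alias. */
		addr += (unsigned long)page_keyid(page) * direct_mapping_size;
		return (void *)addr;
	}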
On Fri, Jun 14, 2019 at 01:04:58PM +0200, Peter Zijlstra wrote:
> On Fri, Jun 14, 2019 at 11:34:09AM +0200, Peter Zijlstra wrote:
> > On Wed, May 08, 2019 at 05:43:33PM +0300, Kirill A. Shutemov wrote:
> >
> > > +		lookup_page_ext(page)->keyid = keyid;
> > > +		lookup_page_ext(page)->keyid = 0;
>
> Also, perhaps paranoid; but do we want something like:
>
> static inline void page_set_keyid(struct page *page, int keyid)
> {
> 	/* ensure nothing creeps after changing the keyid */
> 	barrier();
> 	WRITE_ONCE(lookup_page_ext(page)->keyid, keyid);
> 	barrier();
> 	/* ensure nothing creeps before changing the keyid */
> }
>
> And this is very much assuming there is no concurrency through the
> allocator locks.

There's no concurrency for this page: it has been taken off the free
list, but has not yet been passed on to the user. Nobody else sees the
page before allocation is finished.

And barriers/WRITE_ONCE() look excessive to me. It's just yet another
bit of the page's metadata and I don't see why it has to be handled in
a special way.

Does it relax your paranoia? :P
On Fri, Jun 14, 2019 at 04:28:36PM +0300, Kirill A. Shutemov wrote:
> On Fri, Jun 14, 2019 at 01:04:58PM +0200, Peter Zijlstra wrote:
> > On Fri, Jun 14, 2019 at 11:34:09AM +0200, Peter Zijlstra wrote:
> > > On Wed, May 08, 2019 at 05:43:33PM +0300, Kirill A. Shutemov wrote:
> > >
> > > > +		lookup_page_ext(page)->keyid = keyid;
> > > > +		lookup_page_ext(page)->keyid = 0;
> >
> > Also, perhaps paranoid; but do we want something like:
> >
> > static inline void page_set_keyid(struct page *page, int keyid)
> > {
> > 	/* ensure nothing creeps after changing the keyid */
> > 	barrier();
> > 	WRITE_ONCE(lookup_page_ext(page)->keyid, keyid);
> > 	barrier();
> > 	/* ensure nothing creeps before changing the keyid */
> > }
> >
> > And this is very much assuming there is no concurrency through the
> > allocator locks.
>
> There's no concurrency for this page: it has been taken off the free
> list, but has not yet been passed on to the user. Nobody else sees the
> page before allocation is finished.
>
> And barriers/WRITE_ONCE() look excessive to me. It's just yet another
> bit of the page's metadata and I don't see why it has to be handled in
> a special way.
>
> Does it relax your paranoia? :P

Not really, it all 'works' because clflush_cache_range() includes mb()
and page_address() has an address dependency on the store, and there are
no other sites that will ever change 'keyid', which is all kind of
fragile.

At the very least that should be explicitly called out in a comment.
On Fri, Jun 14, 2019 at 03:43:35PM +0200, Peter Zijlstra wrote:
> On Fri, Jun 14, 2019 at 04:28:36PM +0300, Kirill A. Shutemov wrote:
> > On Fri, Jun 14, 2019 at 01:04:58PM +0200, Peter Zijlstra wrote:
> > > On Fri, Jun 14, 2019 at 11:34:09AM +0200, Peter Zijlstra wrote:
> > > > On Wed, May 08, 2019 at 05:43:33PM +0300, Kirill A. Shutemov wrote:
> > > >
> > > > > +		lookup_page_ext(page)->keyid = keyid;
> > > > > +		lookup_page_ext(page)->keyid = 0;
> > >
> > > Also, perhaps paranoid; but do we want something like:
> > >
> > > static inline void page_set_keyid(struct page *page, int keyid)
> > > {
> > > 	/* ensure nothing creeps after changing the keyid */
> > > 	barrier();
> > > 	WRITE_ONCE(lookup_page_ext(page)->keyid, keyid);
> > > 	barrier();
> > > 	/* ensure nothing creeps before changing the keyid */
> > > }
> > >
> > > And this is very much assuming there is no concurrency through the
> > > allocator locks.
> >
> > There's no concurrency for this page: it has been taken off the free
> > list, but has not yet been passed on to the user. Nobody else sees the
> > page before allocation is finished.
> >
> > And barriers/WRITE_ONCE() look excessive to me. It's just yet another
> > bit of the page's metadata and I don't see why it has to be handled in
> > a special way.
> >
> > Does it relax your paranoia? :P
>
> Not really, it all 'works' because clflush_cache_range() includes mb()
> and page_address() has an address dependency on the store, and there are
> no other sites that will ever change 'keyid', which is all kind of
> fragile.

Hm. I don't follow how the mb() in clflush_cache_range() is relevant...

Any following access of the page's memory by the kernel will go through
page_keyid() and therefore I believe there's always an address dependency
on the store.

Am I missing something?

> At the very least that should be explicitly called out in a comment.
On Sat, Jun 15, 2019 at 01:41:31AM +0300, Kirill A. Shutemov wrote:
> On Fri, Jun 14, 2019 at 03:43:35PM +0200, Peter Zijlstra wrote:
> > Not really, it all 'works' because clflush_cache_range() includes mb()
> > and page_address() has an address dependency on the store, and there are
> > no other sites that will ever change 'keyid', which is all kind of
> > fragile.
>
> Hm. I don't follow how the mb() in clflush_cache_range() is relevant...
>
> Any following access of the page's memory by the kernel will go through
> page_keyid() and therefore I believe there's always an address dependency
> on the store.
>
> Am I missing something?

The dependency doesn't help with prior calls; consider:

	addr = page_address(page);

	*addr = foo;

	page->key_id = bar;

	addr2 = page_address(page);

Without a barrier() between '*addr = foo' and 'page->key_id = bar', the
compiler is allowed to reorder these stores.

Now, the clflush stuff we do, that already hard orders things -- we need
to be done writing before we start flushing -- so we can/do rely on
that, but we should explicitly mention that.

Now, for the second part, addr2 must observe bar, because of the address
dependency, the compiler is not allowed to mess that up, but again, that
is something we should put in a comment.
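The hazard and the two orderings Peter describes can be made concrete in a self-contained sketch. Everything below is illustrative: fake_page, the toy aliases array, and fake_page_address() are placeholders, and the barrier() macro simply mirrors the kernel's compiler barrier; none of this is code from the series:

	/* Compiler barrier, same definition as the kernel's barrier(). */
	#define barrier() __asm__ __volatile__("" ::: "memory")

	struct fake_page { int keyid; };

	/* Toy stand-in for per-KeyID direct-map aliases of one page. */
	static char aliases[2][64];

	/* Toy KeyID-aware page_address(): the result depends on keyid. */
	static char *fake_page_address(struct fake_page *page)
	{
		return aliases[page->keyid & 1];
	}

	void keyid_reorder_sketch(struct fake_page *page)
	{
		char *addr = fake_page_address(page);

		*addr = 1;	/* write through the old (KeyID 0) alias */

		/*
		 * Without this barrier() the compiler may sink '*addr = 1'
		 * below the keyid store. In the patch, the mb() inside
		 * clflush_cache_range() provides this "done writing before
		 * flushing" ordering.
		 */
		barrier();

		page->keyid = 1;	/* switch the page to a new KeyID */

		/*
		 * addr2 has an address dependency on the keyid store: it is
		 * computed from the new value, so subsequent accesses hit
		 * the new alias without any extra barrier.
		 */
		char *addr2 = fake_page_address(page);
		*addr2 = 2;	/* write through the new alias */
	}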
diff --git a/arch/x86/include/asm/mktme.h b/arch/x86/include/asm/mktme.h
index b5afa31b4526..6e604126f0bc 100644
--- a/arch/x86/include/asm/mktme.h
+++ b/arch/x86/include/asm/mktme.h
@@ -40,6 +40,23 @@ static inline int vma_keyid(struct vm_area_struct *vma)
 	return __vma_keyid(vma);
 }
 
+#define prep_encrypted_page prep_encrypted_page
+void __prep_encrypted_page(struct page *page, int order, int keyid, bool zero);
+static inline void prep_encrypted_page(struct page *page, int order,
+		int keyid, bool zero)
+{
+	if (keyid)
+		__prep_encrypted_page(page, order, keyid, zero);
+}
+
+#define HAVE_ARCH_FREE_PAGE
+void free_encrypted_page(struct page *page, int order);
+static inline void arch_free_page(struct page *page, int order)
+{
+	if (page_keyid(page))
+		free_encrypted_page(page, order);
+}
+
 #else
 #define mktme_keyid_mask	((phys_addr_t)0)
 #define mktme_nr_keyids	0
diff --git a/arch/x86/mm/mktme.c b/arch/x86/mm/mktme.c
index d4a1a9e9b1c0..43489c098e60 100644
--- a/arch/x86/mm/mktme.c
+++ b/arch/x86/mm/mktme.c
@@ -1,4 +1,5 @@
 #include <linux/mm.h>
+#include <linux/highmem.h>
 #include <asm/mktme.h>
 
 /* Mask to extract KeyID from physical address. */
@@ -37,3 +38,51 @@ int __vma_keyid(struct vm_area_struct *vma)
 	pgprotval_t prot = pgprot_val(vma->vm_page_prot);
 	return (prot & mktme_keyid_mask) >> mktme_keyid_shift;
 }
+
+/* Prepare page to be used for encryption. Called from page allocator. */
+void __prep_encrypted_page(struct page *page, int order, int keyid, bool zero)
+{
+	int i;
+
+	/*
+	 * The hardware/CPU does not enforce coherency between mappings
+	 * of the same physical page with different KeyIDs or
+	 * encryption keys. We are responsible for cache management.
+	 */
+	clflush_cache_range(page_address(page), PAGE_SIZE * (1UL << order));
+
+	for (i = 0; i < (1 << order); i++) {
+		/* All pages coming out of the allocator should have KeyID 0 */
+		WARN_ON_ONCE(lookup_page_ext(page)->keyid);
+		lookup_page_ext(page)->keyid = keyid;
+
+		/* Clear the page after the KeyID is set. */
+		if (zero)
+			clear_highpage(page);
+
+		page++;
+	}
+}
+
+/*
+ * Handles freeing of encrypted page.
+ * Called from page allocator on freeing encrypted page.
+ */
+void free_encrypted_page(struct page *page, int order)
+{
+	int i;
+
+	/*
+	 * The hardware/CPU does not enforce coherency between mappings
+	 * of the same physical page with different KeyIDs or
+	 * encryption keys. We are responsible for cache management.
+	 */
+	clflush_cache_range(page_address(page), PAGE_SIZE * (1UL << order));
+
+	for (i = 0; i < (1 << order); i++) {
+		/* Check if the page has reasonable KeyID */
+		WARN_ON_ONCE(lookup_page_ext(page)->keyid > mktme_nr_keyids);
+		lookup_page_ext(page)->keyid = 0;
+		page++;
+	}
+}
Hook up into page allocator to allocate and free encrypted pages properly.

The hardware/CPU does not enforce coherency between mappings of the same
physical page with different KeyIDs or encryption keys. We are responsible
for cache management.

Flush cache on allocating an encrypted page and on returning the page to
the free pool.

prep_encrypted_page() also takes care of zeroing the page. We have to do
this after the KeyID is set for the page.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/mktme.h | 17 +++++++++++++
 arch/x86/mm/mktme.c          | 49 ++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+)
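For orientation, here is a hedged sketch of how the allocator side is expected to use these hooks. The call site and the source of keyid are assumptions about other patches in the series, not something this patch establishes:

	/*
	 * Hedged sketch, not from this patch: allocation passes the VMA's
	 * KeyID down so prep_encrypted_page() can flush, tag and (if
	 * requested) zero the pages; freeing goes through the
	 * arch_free_page() hook enabled by HAVE_ARCH_FREE_PAGE above.
	 */
	static void alloc_side_sketch(struct page *page, unsigned int order,
				      gfp_t gfp_flags, int keyid)
	{
		bool zero = !!(gfp_flags & __GFP_ZERO);

		/* No-op for KeyID 0 thanks to the inline wrapper in mktme.h. */
		prep_encrypted_page(page, order, keyid, zero);
	}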