Message ID | 20180717112029.42378-7-kirill.shutemov@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 07/17/2018 04:20 AM, Kirill A. Shutemov wrote: > khugepaged allocates page in advance, before we found a VMA for > collapse. We don't yet know which KeyID to use for the allocation. That's not really true. We have the VMA and the address in the caller (khugepaged_scan_pmd()), but we drop the lock and have to revalidate the VMA. > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > index 5ae34097aed1..d116f4ebb622 100644 > --- a/mm/khugepaged.c > +++ b/mm/khugepaged.c > @@ -1056,6 +1056,16 @@ static void collapse_huge_page(struct mm_struct *mm, > */ > anon_vma_unlock_write(vma->anon_vma); > > + /* > + * At this point new_page is allocated as non-encrypted. > + * If VMA's KeyID is non-zero, we need to prepare it to be encrypted > + * before coping data. > + */ > + if (vma_keyid(vma)) { > + prep_encrypted_page(new_page, HPAGE_PMD_ORDER, > + vma_keyid(vma), false); > + } I guess this isn't horribly problematic now, but if we ever keep pools of preassigned-keyids, this won't work any more.
On Wed, Jul 18, 2018 at 04:11:57PM -0700, Dave Hansen wrote: > On 07/17/2018 04:20 AM, Kirill A. Shutemov wrote: > > khugepaged allocates page in advance, before we found a VMA for > > collapse. We don't yet know which KeyID to use for the allocation. > > That's not really true. We have the VMA and the address in the caller > (khugepaged_scan_pmd()), but we drop the lock and have to revalidate the > VMA. For !NUMA we allocate the page in khugepaged_do_scan(), well before we know VMA. > > > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > > index 5ae34097aed1..d116f4ebb622 100644 > > --- a/mm/khugepaged.c > > +++ b/mm/khugepaged.c > > @@ -1056,6 +1056,16 @@ static void collapse_huge_page(struct mm_struct *mm, > > */ > > anon_vma_unlock_write(vma->anon_vma); > > > > + /* > > + * At this point new_page is allocated as non-encrypted. > > + * If VMA's KeyID is non-zero, we need to prepare it to be encrypted > > + * before coping data. > > + */ > > + if (vma_keyid(vma)) { > > + prep_encrypted_page(new_page, HPAGE_PMD_ORDER, > > + vma_keyid(vma), false); > > + } > > I guess this isn't horribly problematic now, but if we ever keep pools > of preassigned-keyids, this won't work any more. I don't get this. What pools of preassigned-keyids are you talking about?
On 07/19/2018 01:59 AM, Kirill A. Shutemov wrote: > On Wed, Jul 18, 2018 at 04:11:57PM -0700, Dave Hansen wrote: >> On 07/17/2018 04:20 AM, Kirill A. Shutemov wrote: >>> khugepaged allocates page in advance, before we found a VMA for >>> collapse. We don't yet know which KeyID to use for the allocation. >> >> That's not really true. We have the VMA and the address in the caller >> (khugepaged_scan_pmd()), but we drop the lock and have to revalidate the >> VMA. > > For !NUMA we allocate the page in khugepaged_do_scan(), well before we > know VMA. Ahh, thanks for clarifying. That's some more very good information about the design and progression of your patch that belongs in the changelog. >>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >>> index 5ae34097aed1..d116f4ebb622 100644 >>> --- a/mm/khugepaged.c >>> +++ b/mm/khugepaged.c >>> @@ -1056,6 +1056,16 @@ static void collapse_huge_page(struct mm_struct *mm, >>> */ >>> anon_vma_unlock_write(vma->anon_vma); >>> >>> + /* >>> + * At this point new_page is allocated as non-encrypted. >>> + * If VMA's KeyID is non-zero, we need to prepare it to be encrypted >>> + * before coping data. >>> + */ >>> + if (vma_keyid(vma)) { >>> + prep_encrypted_page(new_page, HPAGE_PMD_ORDER, >>> + vma_keyid(vma), false); >>> + } >> >> I guess this isn't horribly problematic now, but if we ever keep pools >> of preassigned-keyids, this won't work any more. > > I don't get this. What pools of preassigned-keyids are you talking about? My point was that if we ever teach the allocator or something _near_ the allocator to keep pools of pre-zeroed and/or pre-cache-cleared pages, this approach will need to get changed otherwise we will double-prep pages. My overall concern with prep_encrypted_page() in this patch set is that it's inserted pretty ad-hoc. It seems easy to miss spots where it should be. I'm also unsure of the failure mode and anything we've done to ensure that if we get this wrong, we scream clearly and loudly about what happened. Do we do something like that?
On Thu, Jul 19, 2018 at 07:13:39AM -0700, Dave Hansen wrote: > On 07/19/2018 01:59 AM, Kirill A. Shutemov wrote: > > On Wed, Jul 18, 2018 at 04:11:57PM -0700, Dave Hansen wrote: > >> On 07/17/2018 04:20 AM, Kirill A. Shutemov wrote: > >>> khugepaged allocates page in advance, before we found a VMA for > >>> collapse. We don't yet know which KeyID to use for the allocation. > >> > >> That's not really true. We have the VMA and the address in the caller > >> (khugepaged_scan_pmd()), but we drop the lock and have to revalidate the > >> VMA. > > > > For !NUMA we allocate the page in khugepaged_do_scan(), well before we > > know VMA. > > Ahh, thanks for clarifying. That's some more very good information > about the design and progression of your patch that belongs in the > changelog. Okay. > >>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c > >>> index 5ae34097aed1..d116f4ebb622 100644 > >>> --- a/mm/khugepaged.c > >>> +++ b/mm/khugepaged.c > >>> @@ -1056,6 +1056,16 @@ static void collapse_huge_page(struct mm_struct *mm, > >>> */ > >>> anon_vma_unlock_write(vma->anon_vma); > >>> > >>> + /* > >>> + * At this point new_page is allocated as non-encrypted. > >>> + * If VMA's KeyID is non-zero, we need to prepare it to be encrypted > >>> + * before coping data. > >>> + */ > >>> + if (vma_keyid(vma)) { > >>> + prep_encrypted_page(new_page, HPAGE_PMD_ORDER, > >>> + vma_keyid(vma), false); > >>> + } > >> > >> I guess this isn't horribly problematic now, but if we ever keep pools > >> of preassigned-keyids, this won't work any more. > > > > I don't get this. What pools of preassigned-keyids are you talking about? > > My point was that if we ever teach the allocator or something _near_ the > allocator to keep pools of pre-zeroed and/or pre-cache-cleared pages, > this approach will need to get changed otherwise we will double-prep pages. It shouldn't be a problem here. It's pretty slow path. We often wait memory to be compacted before page for khugepaged gets allocated. Double-prep shouldn't have visible impact. > My overall concern with prep_encrypted_page() in this patch set is that > it's inserted pretty ad-hoc. It seems easy to miss spots where it > should be. I'm also unsure of the failure mode and anything we've done > to ensure that if we get this wrong, we scream clearly and loudly about > what happened. Do we do something like that? I have debugging patch that puts BUG_ONs around set_pte_at() to check if the page's keyid matches VMA's keyid. But that's not very systematic. We would need something better than this.
diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 5ae34097aed1..d116f4ebb622 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1056,6 +1056,16 @@ static void collapse_huge_page(struct mm_struct *mm, */ anon_vma_unlock_write(vma->anon_vma); + /* + * At this point new_page is allocated as non-encrypted. + * If VMA's KeyID is non-zero, we need to prepare it to be encrypted + * before coping data. + */ + if (vma_keyid(vma)) { + prep_encrypted_page(new_page, HPAGE_PMD_ORDER, + vma_keyid(vma), false); + } + __collapse_huge_page_copy(pte, new_page, vma, address, pte_ptl); pte_unmap(pte); __SetPageUptodate(new_page);
khugepaged allocates page in advance, before we found a VMA for collapse. We don't yet know which KeyID to use for the allocation. The page is allocated with KeyID-0. Once we know that the VMA is suitable for collapsing, we prepare the page for KeyID we need, based on vma_keyid(). Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> --- mm/khugepaged.c | 10 ++++++++++ 1 file changed, 10 insertions(+)