diff mbox series

[v2,3/4] drm/ttm, drm/vmwgfx: Correctly support AMD memory encryption

Message ID 20190903131504.18935-4-thomas_os@shipmail.org (mailing list archive)
State New, archived
Headers show
Series Have TTM support SEV encryption with coherent memory | expand

Commit Message

Thomas Hellström (Intel) Sept. 3, 2019, 1:15 p.m. UTC
From: Thomas Hellstrom <thellstrom@vmware.com>

With TTM pages allocated out of the DMA pool, use the
force_dma_unencrypted() function to set up the correct page protection.
Previously the protection was unconditionally set to encrypted, which only
works with SME encryption on devices with a large enough DMA
mask.

Tested with vmwgfx and sev-es. Screen garbage without this patch and normal
functionality with it.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/ttm/ttm_bo_util.c        | 17 +++++++++++++----
 drivers/gpu/drm/ttm/ttm_bo_vm.c          | 21 ++++++++++-----------
 drivers/gpu/drm/ttm/ttm_page_alloc_dma.c |  4 ++++
 drivers/gpu/drm/vmwgfx/vmwgfx_blit.c     |  6 ++++--
 include/drm/ttm/ttm_bo_driver.h          |  8 +++++---
 include/drm/ttm/ttm_tt.h                 |  1 +
 6 files changed, 37 insertions(+), 20 deletions(-)

Comments

Dave Hansen Sept. 3, 2019, 7:38 p.m. UTC | #1
This whole thing looks like a fascinating collection of hacks. :)

ttm is taking a stack-allocated "VMA" and handing it to vmf_insert_*()
which obviously are expecting "real" VMAs that are linked into the mm.
It's extracting some pgprot_t information from the real VMA, making a
pseudo-temporary VMA, then passing the temporary one back into the
insertion functions:

> static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
> {
...
>         struct vm_area_struct cvma;
...
>                 if (vma->vm_flags & VM_MIXEDMAP)
>                         ret = vmf_insert_mixed(&cvma, address,
>                                         __pfn_to_pfn_t(pfn, PFN_DEV));
>                 else
>                         ret = vmf_insert_pfn(&cvma, address, pfn);

I can totally see why this needs new exports.  But, man, it doesn't seem
like something we want to keep *feeding*.

The real problem here is that the encryption bits from the device VMA's
"true" vma->vm_page_prot don't match the ones that actually get
inserted, probably because the device ptes need the encryption bits
cleared but the system memory PTEs need them set *and* they're mixed
under one VMA.

The thing we need to stop is having mixed encryption rules under one VMA.
Daniel Vetter Sept. 3, 2019, 7:51 p.m. UTC | #2
On Tue, Sep 3, 2019 at 9:38 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> This whole thing looks like a fascinating collection of hacks. :)
>
> ttm is taking a stack-alllocated "VMA" and handing it to vmf_insert_*()
> which obviously are expecting "real" VMAs that are linked into the mm.
> It's extracting some pgprot_t information from the real VMA, making a
> psuedo-temporary VMA, then passing the temporary one back into the
> insertion functions:
>
> > static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
> > {
> ...
> >         struct vm_area_struct cvma;
> ...
> >                 if (vma->vm_flags & VM_MIXEDMAP)
> >                         ret = vmf_insert_mixed(&cvma, address,
> >                                         __pfn_to_pfn_t(pfn, PFN_DEV));
> >                 else
> >                         ret = vmf_insert_pfn(&cvma, address, pfn);
>
> I can totally see why this needs new exports.  But, man, it doesn't seem
> like something we want to keep *feeding*.
>
> The real problem here is that the encryption bits from the device VMA's
> "true" vma->vm_page_prot don't match the ones that actually get
> inserted, probably because the device ptes need the encryption bits
> cleared but the system memory PTEs need them set *and* they're mixed
> under one VMA.
>
> The thing we need to stop is having mixed encryption rules under one VMA.

The point here is that we want this. We need to be able to move the
buffer between device ptes and system memory ptes, transparently,
behind userspace's back, without races. And the fast path (which is "no
pte exists for this vma") must be real fast, so taking mmap_sem and
replacing the vma is no-go.
-Daniel
Dave Hansen Sept. 3, 2019, 7:55 p.m. UTC | #3
On 9/3/19 12:51 PM, Daniel Vetter wrote:
>> The thing we need to stop is having mixed encryption rules under one VMA.
> The point here is that we want this. We need to be able to move the
> buffer between device ptes and system memory ptes, transparently,
> behind userspace back, without races. And the fast path (which is "no
> pte exists for this vma") must be real fast, so taking mmap_sem and
> replacing the vma is no-go.

So, when the user asks for encryption and we say, "sure, we'll encrypt
that", then we want the device driver to be able to transparently undo
that encryption under the covers for device memory?  That seems suboptimal.

I'd rather the device driver just say: "Nope, you can't encrypt my VMA".
 Because that's the truth.
Thomas Hellström (Intel) Sept. 3, 2019, 8:36 p.m. UTC | #4
On 9/3/19 9:55 PM, Dave Hansen wrote:
> On 9/3/19 12:51 PM, Daniel Vetter wrote:
>>> The thing we need to stop is having mixed encryption rules under one VMA.
>> The point here is that we want this. We need to be able to move the
>> buffer between device ptes and system memory ptes, transparently,
>> behind userspace back, without races. And the fast path (which is "no
>> pte exists for this vma") must be real fast, so taking mmap_sem and
>> replacing the vma is no-go.
> So, when the user asks for encryption and we say, "sure, we'll encrypt
> that", then we want the device driver to be able to transparently undo
> that encryption under the covers for device memory?  That seems suboptimal.
>
> I'd rather the device driver just say: "Nope, you can't encrypt my VMA".
>   Because that's the truth.

The thing here is that it's the underlying physical memory that defines 
the correct encryption flags. If it's DMA memory with SEV active, or PCI 
memory, it's always unencrypted. User-space in a SEV VM should 
always, from a data protection point of view, *assume* that graphics 
buffers are unencrypted. (Which will of course limit the use of GPUs and 
display controllers in a SEV VM.) Platform code sets the vma encryption 
to on by default.

So the question here should really be, can we determine already at mmap 
time whether backing memory will be unencrypted and adjust the *real* 
vma->vm_page_prot under the mmap_sem?

Possibly, but that requires populating the buffer with memory at mmap 
time rather than at first fault time.

And it still requires knowledge whether the device DMA is always 
unencrypted (or if SEV is active).

/Thomas
Dave Hansen Sept. 3, 2019, 8:51 p.m. UTC | #5
On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
> So the question here should really be, can we determine already at mmap
> time whether backing memory will be unencrypted and adjust the *real*
> vma->vm_page_prot under the mmap_sem?
> 
> Possibly, but that requires populating the buffer with memory at mmap
> time rather than at first fault time.

I'm not connecting the dots.

vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
are created at mmap() or fault time.  If we establish a good
vma->vm_page_prot, can't we just use it forever for demand faults?

Or, are you concerned that if an attempt is made to demand-fault page
that's incompatible with vma->vm_page_prot that we have to SEGV?

> And it still requires knowledge whether the device DMA is always
> unencrypted (or if SEV is active).

I may be getting mixed up on MKTME (the Intel memory encryption) and
SEV.  Is SEV supported on all memory types?  Page cache, hugetlbfs,
anonymous?  Or just anonymous?
Thomas Hellström (Intel) Sept. 3, 2019, 9:05 p.m. UTC | #6
On 9/3/19 10:51 PM, Dave Hansen wrote:
> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>> So the question here should really be, can we determine already at mmap
>> time whether backing memory will be unencrypted and adjust the *real*
>> vma->vm_page_prot under the mmap_sem?
>>
>> Possibly, but that requires populating the buffer with memory at mmap
>> time rather than at first fault time.
> I'm not connecting the dots.
>
> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
> are created at mmap() or fault time.  If we establish a good
> vma->vm_page_prot, can't we just use it forever for demand faults?

With SEV I think that we could possibly establish the encryption flags 
at vma creation time. But thinking of it, it would actually break with 
SME where buffer content can be moved between encrypted system memory 
and unencrypted graphics card PCI memory behind user-space's back. That 
would imply killing all user-space encrypted PTEs and at fault time set 
up new ones pointing to unencrypted PCI memory..

>
> Or, are you concerned that if an attempt is made to demand-fault page
> that's incompatible with vma->vm_page_prot that we have to SEGV?
>
>> And it still requires knowledge whether the device DMA is always
>> unencrypted (or if SEV is active).
> I may be getting mixed up on MKTME (the Intel memory encryption) and
> SEV.  Is SEV supported on all memory types?  Page cache, hugetlbfs,
> anonymous?  Or just anonymous?

SEV AFAIK encrypts *all* memory except DMA memory. To do that it uses a 
SWIOTLB backed by unencrypted memory, and it also flips coherent DMA 
memory to unencrypted (which is a very slow operation and patch 4 deals 
with caching such memory).
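
As a rough illustration of what the "flips coherent DMA memory to 
unencrypted" step looks like, modelled on the x86 dma-direct allocator 
of this era (the helper name and the error handling are simplified and 
hypothetical):

#include <linux/dma-direct.h>	/* force_dma_unencrypted() */
#include <linux/gfp.h>
#include <linux/mm.h>
#include <asm/set_memory.h>	/* set_memory_decrypted() */

/*
 * Allocate one page for a coherent DMA buffer and, when the device
 * requires unencrypted memory (e.g. under SEV), clear the encryption
 * bit in the kernel linear mapping. This transition is the slow
 * operation that patch 4 in the series caches.
 */
static void *example_alloc_coherent_page(struct device *dev, gfp_t gfp)
{
	struct page *page = alloc_page(gfp);

	if (!page)
		return NULL;

	if (force_dma_unencrypted(dev))
		set_memory_decrypted((unsigned long)page_address(page), 1);

	return page_address(page);
}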

/Thomas
Andy Lutomirski Sept. 3, 2019, 9:46 p.m. UTC | #7
On Tue, Sep 3, 2019 at 2:05 PM Thomas Hellström (VMware)
<thomas_os@shipmail.org> wrote:
>
> On 9/3/19 10:51 PM, Dave Hansen wrote:
> > On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
> >> So the question here should really be, can we determine already at mmap
> >> time whether backing memory will be unencrypted and adjust the *real*
> >> vma->vm_page_prot under the mmap_sem?
> >>
> >> Possibly, but that requires populating the buffer with memory at mmap
> >> time rather than at first fault time.
> > I'm not connecting the dots.
> >
> > vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
> > are created at mmap() or fault time.  If we establish a good
> > vma->vm_page_prot, can't we just use it forever for demand faults?
>
> With SEV I think that we could possibly establish the encryption flags
> at vma creation time. But thinking of it, it would actually break with
> SME where buffer content can be moved between encrypted system memory
> and unencrypted graphics card PCI memory behind user-space's back. That
> would imply killing all user-space encrypted PTEs and at fault time set
> up new ones pointing to unencrypted PCI memory..
>
> >
> > Or, are you concerned that if an attempt is made to demand-fault page
> > that's incompatible with vma->vm_page_prot that we have to SEGV?
> >
> >> And it still requires knowledge whether the device DMA is always
> >> unencrypted (or if SEV is active).
> > I may be getting mixed up on MKTME (the Intel memory encryption) and
> > SEV.  Is SEV supported on all memory types?  Page cache, hugetlbfs,
> > anonymous?  Or just anonymous?
>
> SEV AFAIK encrypts *all* memory except DMA memory. To do that it uses a
> SWIOTLB backed by unencrypted memory, and it also flips coherent DMA
> memory to unencrypted (which is a very slow operation and patch 4 deals
> with caching such memory).
>

I'm still lost.  You have some fancy VMA where the backing pages
change behind the application's back.  This isn't particularly novel
-- plain old anonymous memory and plain old mapped files do this too.
Can't you call the insert_pfn APIs and call it a day?  What's so
special that you need all this magic?  ISTM you should be able to
allocate memory that's addressable by the device (dma_alloc_coherent()
or whatever) and then map it into user memory just like you'd map any
other page.

I feel like I'm missing something here.
Thomas Hellström (Intel) Sept. 3, 2019, 10:08 p.m. UTC | #8
On 9/3/19 11:46 PM, Andy Lutomirski wrote:
> On Tue, Sep 3, 2019 at 2:05 PM Thomas Hellström (VMware)
> <thomas_os@shipmail.org> wrote:
>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>> So the question here should really be, can we determine already at mmap
>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>> vma->vm_page_prot under the mmap_sem?
>>>>
>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>> time rather than at first fault time.
>>> I'm not connecting the dots.
>>>
>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>> are created at mmap() or fault time.  If we establish a good
>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>> With SEV I think that we could possibly establish the encryption flags
>> at vma creation time. But thinking of it, it would actually break with
>> SME where buffer content can be moved between encrypted system memory
>> and unencrypted graphics card PCI memory behind user-space's back. That
>> would imply killing all user-space encrypted PTEs and at fault time set
>> up new ones pointing to unencrypted PCI memory..
>>
>>> Or, are you concerned that if an attempt is made to demand-fault page
>>> that's incompatible with vma->vm_page_prot that we have to SEGV?
>>>
>>>> And it still requires knowledge whether the device DMA is always
>>>> unencrypted (or if SEV is active).
>>> I may be getting mixed up on MKTME (the Intel memory encryption) and
>>> SEV.  Is SEV supported on all memory types?  Page cache, hugetlbfs,
>>> anonymous?  Or just anonymous?
>> SEV AFAIK encrypts *all* memory except DMA memory. To do that it uses a
>> SWIOTLB backed by unencrypted memory, and it also flips coherent DMA
>> memory to unencrypted (which is a very slow operation and patch 4 deals
>> with caching such memory).
>>
> I'm still lost.  You have some fancy VMA where the backing pages
> change behind the application's back.  This isn't particularly novel
> -- plain old anonymous memory and plain old mapped files do this too.
> Can't you all the insert_pfn APIs and call it a day?  What's so
> special that you need all this magic?  ISTM you should be able to
> allocate memory that's addressable by the device (dma_alloc_coherent()
> or whatever) and then map it into user memory just like you'd map any
> other page.
>
> I feel like I'm missing something here.

Yes, so in this case we use dma_alloc_coherent().

With SEV, that gives us unencrypted pages. (Pages whose linear kernel 
map is marked unencrypted). With SME that (typically) gives us encrypted 
pages. In both these cases, vm_get_page_prot() returns
an encrypted page protection, which lands in vma->vm_page_prot.

In the SEV case, we therefore need to modify the page protection to 
unencrypted. Hence we need to know whether we're running under SEV and 
therefore need to modify the protection. If not, the user-space PTE 
would incorrectly have the encryption flag set.
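
A minimal sketch of that idea, combining into one hypothetical helper 
what the patch below spreads over ttm_dma_populate() 
(TTM_PAGE_FLAG_DECRYPTED) and ttm_io_prot():

#include <linux/dma-direct.h>	/* force_dma_unencrypted() */
#include <linux/mm.h>		/* vm_get_page_prot() */
#include <asm/pgtable.h>	/* pgprot_decrypted() on x86 */

/*
 * Start from the protection the core mm would use for this VMA and
 * clear the encryption bit when the DMA layer forces unencrypted
 * memory for the device (SEV). Otherwise the user-space PTE would
 * incorrectly carry the encryption flag.
 */
static pgprot_t example_ttm_page_prot(struct device *dev,
				      struct vm_area_struct *vma)
{
	pgprot_t prot = vm_get_page_prot(vma->vm_flags);

	if (force_dma_unencrypted(dev))
		prot = pgprot_decrypted(prot);

	return prot;
}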

/Thomas
Thomas Hellström (Intel) Sept. 3, 2019, 10:15 p.m. UTC | #9
On 9/4/19 12:08 AM, Thomas Hellström (VMware) wrote:
> On 9/3/19 11:46 PM, Andy Lutomirski wrote:
>> On Tue, Sep 3, 2019 at 2:05 PM Thomas Hellström (VMware)
>> <thomas_os@shipmail.org> wrote:
>>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>>> So the question here should really be, can we determine already at 
>>>>> mmap
>>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>>> vma->vm_page_prot under the mmap_sem?
>>>>>
>>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>>> time rather than at first fault time.
>>>> I'm not connecting the dots.
>>>>
>>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>>> are created at mmap() or fault time.  If we establish a good
>>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>>> With SEV I think that we could possibly establish the encryption flags
>>> at vma creation time. But thinking of it, it would actually break with
>>> SME where buffer content can be moved between encrypted system memory
>>> and unencrypted graphics card PCI memory behind user-space's back. That
>>> would imply killing all user-space encrypted PTEs and at fault time set
>>> up new ones pointing to unencrypted PCI memory..
>>>
>>>> Or, are you concerned that if an attempt is made to demand-fault page
>>>> that's incompatible with vma->vm_page_prot that we have to SEGV?
>>>>
>>>>> And it still requires knowledge whether the device DMA is always
>>>>> unencrypted (or if SEV is active).
>>>> I may be getting mixed up on MKTME (the Intel memory encryption) and
>>>> SEV.  Is SEV supported on all memory types?  Page cache, hugetlbfs,
>>>> anonymous?  Or just anonymous?
>>> SEV AFAIK encrypts *all* memory except DMA memory. To do that it uses a
>>> SWIOTLB backed by unencrypted memory, and it also flips coherent DMA
>>> memory to unencrypted (which is a very slow operation and patch 4 deals
>>> with caching such memory).
>>>
>> I'm still lost.  You have some fancy VMA where the backing pages
>> change behind the application's back.  This isn't particularly novel
>> -- plain old anonymous memory and plain old mapped files do this too.
>> Can't you all the insert_pfn APIs and call it a day?  What's so
>> special that you need all this magic?  ISTM you should be able to
>> allocate memory that's addressable by the device (dma_alloc_coherent()
>> or whatever) and then map it into user memory just like you'd map any
>> other page.
>>
>> I feel like I'm missing something here.
>
> Yes, so in this case we use dma_alloc_coherent().
>
> With SEV, that gives us unencrypted pages. (Pages whose linear kernel 
> map is marked unencrypted). With SME that (typcially) gives us 
> encrypted pages. In both these cases, vm_get_page_prot() returns
> an encrypted page protection, which lands in vma->vm_page_prot.
>
> In the SEV case, we therefore need to modify the page protection to 
> unencrypted. Hence we need to know whether we're running under SEV and 
> therefore need to modify the protection. If not, the user-space PTE 
> would incorrectly have the encryption flag set.
>
> /Thomas
>
>
And, of course, had we not been "fancy", we could have used 
dma_mmap_coherent(), which in theory should set up the correct 
user-space page protection. But now we're moving stuff around so we can't.
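
For comparison, the "non-fancy" path would look roughly like this; the 
struct and field names are hypothetical, and it only works if the 
buffer never moves:

#include <linux/dma-mapping.h>
#include <linux/fs.h>
#include <linux/mm.h>

struct example_bo {		/* hypothetical buffer object */
	struct device *dev;
	void *cpu_addr;
	dma_addr_t dma_addr;
};

static int example_mmap(struct file *filp, struct vm_area_struct *vma)
{
	struct example_bo *bo = filp->private_data;

	/* The DMA layer picks the page protection, including encryption. */
	return dma_mmap_coherent(bo->dev, vma, bo->cpu_addr, bo->dma_addr,
				 vma->vm_end - vma->vm_start);
}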

/Thomas
Dave Hansen Sept. 3, 2019, 11:10 p.m. UTC | #10
Thomas, this series has garnered a nak and a whole pile of thoroughly
confused reviewers.

Could you take another stab at this along with a more ample changelog
explaining the context of the problem?  I suspect that's a better place
to start than having us all piece together the disparate parts of the
thread.
Andy Lutomirski Sept. 3, 2019, 11:15 p.m. UTC | #11
> On Sep 3, 2019, at 3:15 PM, Thomas Hellström (VMware) <thomas_os@shipmail.org> wrote:
> 
>> On 9/4/19 12:08 AM, Thomas Hellström (VMware) wrote:
>>> On 9/3/19 11:46 PM, Andy Lutomirski wrote:
>>> On Tue, Sep 3, 2019 at 2:05 PM Thomas Hellström (VMware)
>>> <thomas_os@shipmail.org> wrote:
>>>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>>>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>>>> So the question here should really be, can we determine already at mmap
>>>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>>>> vma->vm_page_prot under the mmap_sem?
>>>>>> 
>>>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>>>> time rather than at first fault time.
>>>>> I'm not connecting the dots.
>>>>> 
>>>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>>>> are created at mmap() or fault time.  If we establish a good
>>>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>>>> With SEV I think that we could possibly establish the encryption flags
>>>> at vma creation time. But thinking of it, it would actually break with
>>>> SME where buffer content can be moved between encrypted system memory
>>>> and unencrypted graphics card PCI memory behind user-space's back. That
>>>> would imply killing all user-space encrypted PTEs and at fault time set
>>>> up new ones pointing to unencrypted PCI memory..
>>>> 
>>>>> Or, are you concerned that if an attempt is made to demand-fault page
>>>>> that's incompatible with vma->vm_page_prot that we have to SEGV?
>>>>> 
>>>>>> And it still requires knowledge whether the device DMA is always
>>>>>> unencrypted (or if SEV is active).
>>>>> I may be getting mixed up on MKTME (the Intel memory encryption) and
>>>>> SEV.  Is SEV supported on all memory types?  Page cache, hugetlbfs,
>>>>> anonymous?  Or just anonymous?
>>>> SEV AFAIK encrypts *all* memory except DMA memory. To do that it uses a
>>>> SWIOTLB backed by unencrypted memory, and it also flips coherent DMA
>>>> memory to unencrypted (which is a very slow operation and patch 4 deals
>>>> with caching such memory).
>>>> 
>>> I'm still lost.  You have some fancy VMA where the backing pages
>>> change behind the application's back.  This isn't particularly novel
>>> -- plain old anonymous memory and plain old mapped files do this too.
>>> Can't you all the insert_pfn APIs and call it a day?  What's so
>>> special that you need all this magic?  ISTM you should be able to
>>> allocate memory that's addressable by the device (dma_alloc_coherent()
>>> or whatever) and then map it into user memory just like you'd map any
>>> other page.
>>> 
>>> I feel like I'm missing something here.
>> 
>> Yes, so in this case we use dma_alloc_coherent().
>> 
>> With SEV, that gives us unencrypted pages. (Pages whose linear kernel map is marked unencrypted). With SME that (typcially) gives us encrypted pages. In both these cases, vm_get_page_prot() returns
>> an encrypted page protection, which lands in vma->vm_page_prot.
>> 
>> In the SEV case, we therefore need to modify the page protection to unencrypted. Hence we need to know whether we're running under SEV and therefore need to modify the protection. If not, the user-space PTE would incorrectly have the encryption flag set.
>> 

I’m still confused. You got unencrypted pages with an unencrypted PFN. Why do you need to fiddle?  You have a PFN, and you’re inserting it with vmf_insert_pfn().  This should just work, no?  There doesn’t seem to be any real funny business in dma_mmap_attrs() or dma_common_mmap().

But, reading this, I have more questions:

Can’t you get rid of cvma by using vmf_insert_pfn_prot()?

Would it make sense to add a vmf_insert_dma_page() to directly do exactly what you’re trying to do?

And a broader question just because I’m still confused: why isn’t the encryption bit in the PFN?  The whole SEV/SME system seems like it’s trying a bit too hard to be fully invisible to the kernel.
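
To make the vmf_insert_pfn_prot() suggestion concrete, roughly 
(illustrative only, untested, and ignoring the VM_MIXEDMAP path):

#include <linux/mm.h>

/*
 * Instead of copying the VMA into a stack-local cvma just to carry a
 * modified vm_page_prot, compute the pgprot and pass it directly.
 */
static vm_fault_t example_fault_insert(struct vm_fault *vmf,
				       unsigned long pfn, pgprot_t prot)
{
	return vmf_insert_pfn_prot(vmf->vma, vmf->address, pfn, prot);
}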
Thomas Hellström (Intel) Sept. 4, 2019, 6:49 a.m. UTC | #12
On 9/4/19 1:15 AM, Andy Lutomirski wrote:
>
>> On Sep 3, 2019, at 3:15 PM, Thomas Hellström (VMware) <thomas_os@shipmail.org> wrote:
>>
>>> On 9/4/19 12:08 AM, Thomas Hellström (VMware) wrote:
>>>> On 9/3/19 11:46 PM, Andy Lutomirski wrote:
>>>> On Tue, Sep 3, 2019 at 2:05 PM Thomas Hellström (VMware)
>>>> <thomas_os@shipmail.org> wrote:
>>>>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>>>>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>>>>> So the question here should really be, can we determine already at mmap
>>>>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>>>>> vma->vm_page_prot under the mmap_sem?
>>>>>>>
>>>>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>>>>> time rather than at first fault time.
>>>>>> I'm not connecting the dots.
>>>>>>
>>>>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>>>>> are created at mmap() or fault time.  If we establish a good
>>>>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>>>>> With SEV I think that we could possibly establish the encryption flags
>>>>> at vma creation time. But thinking of it, it would actually break with
>>>>> SME where buffer content can be moved between encrypted system memory
>>>>> and unencrypted graphics card PCI memory behind user-space's back. That
>>>>> would imply killing all user-space encrypted PTEs and at fault time set
>>>>> up new ones pointing to unencrypted PCI memory..
>>>>>
>>>>>> Or, are you concerned that if an attempt is made to demand-fault page
>>>>>> that's incompatible with vma->vm_page_prot that we have to SEGV?
>>>>>>
>>>>>>> And it still requires knowledge whether the device DMA is always
>>>>>>> unencrypted (or if SEV is active).
>>>>>> I may be getting mixed up on MKTME (the Intel memory encryption) and
>>>>>> SEV.  Is SEV supported on all memory types?  Page cache, hugetlbfs,
>>>>>> anonymous?  Or just anonymous?
>>>>> SEV AFAIK encrypts *all* memory except DMA memory. To do that it uses a
>>>>> SWIOTLB backed by unencrypted memory, and it also flips coherent DMA
>>>>> memory to unencrypted (which is a very slow operation and patch 4 deals
>>>>> with caching such memory).
>>>>>
>>>> I'm still lost.  You have some fancy VMA where the backing pages
>>>> change behind the application's back.  This isn't particularly novel
>>>> -- plain old anonymous memory and plain old mapped files do this too.
>>>> Can't you all the insert_pfn APIs and call it a day?  What's so
>>>> special that you need all this magic?  ISTM you should be able to
>>>> allocate memory that's addressable by the device (dma_alloc_coherent()
>>>> or whatever) and then map it into user memory just like you'd map any
>>>> other page.
>>>>
>>>> I feel like I'm missing something here.
>>> Yes, so in this case we use dma_alloc_coherent().
>>>
>>> With SEV, that gives us unencrypted pages. (Pages whose linear kernel map is marked unencrypted). With SME that (typcially) gives us encrypted pages. In both these cases, vm_get_page_prot() returns
>>> an encrypted page protection, which lands in vma->vm_page_prot.
>>>
>>> In the SEV case, we therefore need to modify the page protection to unencrypted. Hence we need to know whether we're running under SEV and therefore need to modify the protection. If not, the user-space PTE would incorrectly have the encryption flag set.
>>>
> I’m still confused. You got unencrypted pages with an unencrypted PFN. Why do you need to fiddle?  You have a PFN, and you’re inserting it with vmf_insert_pfn().  This should just work, no?

OK now I see what causes the confusion.

With SEV, the encryption state is, while *physically* encoded in an 
address bit, from what I can tell, not *logically* encoded in the pfn, 
but in the page_prot for cpu mapping purposes.  That is, page_to_pfn()  
returns the same pfn whether the page is encrypted or unencrypted. Hence 
nobody can tell from the pfn whether the page is unencrypted or encrypted.

For device DMA address purposes, the encryption status is encoded in the 
dma address by the dma layer in phys_to_dma().
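
To illustrate, simplified from the x86 mem_encrypt/dma-direct code of 
this era (details approximated, names changed to mark it as a sketch): 
the pfn itself never carries the encryption state; the C-bit is only 
ORed in when a physical address becomes a device (DMA) address.

#include <linux/dma-mapping.h>
#include <linux/mem_encrypt.h>	/* sme_me_mask */

/* Roughly what __sme_set()/phys_to_dma() do; bus offsets omitted. */
static inline dma_addr_t example_phys_to_dma(struct device *dev,
					     phys_addr_t paddr)
{
	return paddr | sme_me_mask;	/* sme_me_mask is 0 without SME/SEV */
}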


>   There doesn’t seem to be any real funny business in dma_mmap_attrs() or dma_common_mmap().

No, from what I can tell the call in these functions to dma_pgprot() 
generates an incorrect page protection since it doesn't take unencrypted 
coherent memory into account. I don't think anybody has used these 
functions yet with SEV.

>
> But, reading this, I have more questions:
>
> Can’t you get rid of cvma by using vmf_insert_pfn_prot()?

It looks like that, although there are comments in the code about 
serious performance problems using VM_PFNMAP / vmf_insert_pfn() with 
write-combining and PAT, so that would require some serious testing with 
hardware I don't have. But I guess there is definitely room for 
improvement here. Ideally we'd like to be able to change the 
vma->vm_page_prot within fault(). But we can't.

>
> Would it make sense to add a vmf_insert_dma_page() to directly do exactly what you’re trying to do?

Yes, but as a longer term solution I would prefer a general dma_pgprot() 
exported, so that we could, in a dma-compliant way, use coherent pages 
with other apis, like kmap_atomic_prot() and vmap(). That is, basically 
split coherent page allocation in two steps: Allocation and mapping.

>
> And a broader question just because I’m still confused: why isn’t the encryption bit in the PFN?  The whole SEV/SME system seems like it’s trying a bit to hard to be fully invisible to the kernel.

I guess you'd have to ask AMD about that. But my understanding is that 
encoding it in an address bit does make it trivial to do decryption / 
encryption on the fly for DMA devices that are not otherwise aware of it, 
just by handing them a special physical address. For cpu mapping 
purposes it might become awkward to encode it in the pfn since 
pfn_to_page and friends would need knowledge about this. Personally I 
think it would have made sense to track it like PAT in track_pfn_insert().

Thanks,

Thomas
Christian König Sept. 4, 2019, 7:33 a.m. UTC | #13
Am 03.09.19 um 23:05 schrieb Thomas Hellström (VMware):
> On 9/3/19 10:51 PM, Dave Hansen wrote:
>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>> So the question here should really be, can we determine already at mmap
>>> time whether backing memory will be unencrypted and adjust the *real*
>>> vma->vm_page_prot under the mmap_sem?
>>>
>>> Possibly, but that requires populating the buffer with memory at mmap
>>> time rather than at first fault time.
>> I'm not connecting the dots.
>>
>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>> are created at mmap() or fault time.  If we establish a good
>> vma->vm_page_prot, can't we just use it forever for demand faults?
>
> With SEV I think that we could possibly establish the encryption flags 
> at vma creation time. But thinking of it, it would actually break with 
> SME where buffer content can be moved between encrypted system memory 
> and unencrypted graphics card PCI memory behind user-space's back. 
> That would imply killing all user-space encrypted PTEs and at fault 
> time set up new ones pointing to unencrypted PCI memory..

Well my problem is where do you see encrypted system memory here?

At least for AMD GPUs all memory accessed must be unencrypted and that 
counts for both system as well as PCI memory.

So I don't get why we can't assume always unencrypted and keep it like that.

Regards,
Christian.
Daniel Vetter Sept. 4, 2019, 7:53 a.m. UTC | #14
On Wed, Sep 4, 2019 at 8:49 AM Thomas Hellström (VMware)
<thomas_os@shipmail.org> wrote:
> On 9/4/19 1:15 AM, Andy Lutomirski wrote:
> > But, reading this, I have more questions:
> >
> > Can’t you get rid of cvma by using vmf_insert_pfn_prot()?
>
> It looks like that, although there are comments in the code about
> serious performance problems using VM_PFNMAP / vmf_insert_pfn() with
> write-combining and PAT, so that would require some serious testing with
> hardware I don't have. But I guess there is definitely room for
> improvement here. Ideally we'd like to be able to change the
> vma->vm_page_prot within fault(). But we can

Just a quick comment on this: It's the repeated (per-pfn/pte) lookup
of the PAT tables, which are dead slow. If you have a struct
io_mapping then that can be done once, and then just blindly inserted.
See remap_io_mapping in i915.
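
A rough sketch of that pattern (io_mapping_create_wc() is the generic 
part; remap_io_mapping() itself is an i915-internal helper at this 
point, so it is only referenced in the comment):

#include <linux/io-mapping.h>

/* Once at init: a single WC/PAT reservation for the whole aperture. */
static struct io_mapping *example_iomap_init(resource_size_t base,
					     unsigned long size)
{
	return io_mapping_create_wc(base, size);
}

/*
 * In the fault handler, a helper like i915's remap_io_mapping() can
 * then insert a whole range of PTEs using the pgprot cached in the
 * io_mapping, without a per-pfn PAT lookup.
 */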
-Daniel
Thomas Hellström (Intel) Sept. 4, 2019, 8:19 a.m. UTC | #15
Hi, Christian,

On 9/4/19 9:33 AM, Koenig, Christian wrote:
> Am 03.09.19 um 23:05 schrieb Thomas Hellström (VMware):
>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>> So the question here should really be, can we determine already at mmap
>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>> vma->vm_page_prot under the mmap_sem?
>>>>
>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>> time rather than at first fault time.
>>> I'm not connecting the dots.
>>>
>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>> are created at mmap() or fault time.  If we establish a good
>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>> With SEV I think that we could possibly establish the encryption flags
>> at vma creation time. But thinking of it, it would actually break with
>> SME where buffer content can be moved between encrypted system memory
>> and unencrypted graphics card PCI memory behind user-space's back.
>> That would imply killing all user-space encrypted PTEs and at fault
>> time set up new ones pointing to unencrypted PCI memory..
> Well my problem is where do you see encrypted system memory here?
>
> At least for AMD GPUs all memory accessed must be unencrypted and that
> counts for both system as well as PCI memory.

We're talking SME now right?

The current SME setup is that if a device's DMA mask says it's capable 
of addressing the encryption bit, coherent memory will be encrypted. The 
memory controllers will decrypt for the device on the fly. Otherwise 
coherent memory will be decrypted.
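
Sketched out, that decision looks roughly like this (modelled on the x86 
force_dma_unencrypted() implementation as I understand it; the mask 
handling is simplified):

#include <linux/bitops.h>
#include <linux/dma-mapping.h>
#include <linux/mem_encrypt.h>	/* sme_active(), sev_active(), sme_me_mask */

static bool example_force_dma_unencrypted(struct device *dev)
{
	if (sev_active())		/* SEV: DMA memory must be unencrypted */
		return true;

	if (sme_active() && dev->dma_mask) {
		/* Can the device address the C-bit? If not, go unencrypted. */
		u64 enc_mask = DMA_BIT_MASK(__ffs64(sme_me_mask));

		if ((*dev->dma_mask & enc_mask) != enc_mask)
			return true;
	}

	return false;
}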

>
> So I don't get why we can't assume always unencrypted and keep it like that.

I see two reasons. First, it would break with a real device that signals 
it's capable of addressing the encryption bit.

Second I can imagine unaccelerated setups (something like vkms using 
prime feeding a VNC connection) where we actually want the TTM buffers 
encrypted to protect data.

But at least the latter reason is way far out in the future.

So for me I'm ok with that if that works for you?

/Thomas


>
> Regards,
> Christian.
Thomas Hellström (Intel) Sept. 4, 2019, 8:34 a.m. UTC | #16
Hi, Dave,

On 9/4/19 1:10 AM, Dave Hansen wrote:
> Thomas, this series has garnered a nak and a whole pile of thoroughly
> confused reviewers.
>
> Could you take another stab at this along with a more ample changelog
> explaining the context of the problem?  I suspect that's a better place
> to start than having us all piece together the disparate parts of the
> thread.

Sure.

I was just trying to follow up on the emails to get a better 
understanding of what got people confused in the first place.

Thanks,

Thomas
Thomas Hellström (Intel) Sept. 4, 2019, 8:42 a.m. UTC | #17
On 9/4/19 10:19 AM, Thomas Hellström (VMware) wrote:
> Hi, Christian,
>
> On 9/4/19 9:33 AM, Koenig, Christian wrote:
>> Am 03.09.19 um 23:05 schrieb Thomas Hellström (VMware):
>>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>>> So the question here should really be, can we determine already at 
>>>>> mmap
>>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>>> vma->vm_page_prot under the mmap_sem?
>>>>>
>>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>>> time rather than at first fault time.
>>>> I'm not connecting the dots.
>>>>
>>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>>> are created at mmap() or fault time.  If we establish a good
>>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>>> With SEV I think that we could possibly establish the encryption flags
>>> at vma creation time. But thinking of it, it would actually break with
>>> SME where buffer content can be moved between encrypted system memory
>>> and unencrypted graphics card PCI memory behind user-space's back.
>>> That would imply killing all user-space encrypted PTEs and at fault
>>> time set up new ones pointing to unencrypted PCI memory..
>> Well my problem is where do you see encrypted system memory here?
>>
>> At least for AMD GPUs all memory accessed must be unencrypted and that
>> counts for both system as well as PCI memory.
>
> We're talking SME now right?
>
> The current SME setup is that if a device's DMA mask says it's capable 
> of addressing the encryption bit, coherent memory will be encrypted. 
> The memory controllers will decrypt for the device on the fly. 
> Otherwise coherent memory will be decrypted.
>
>>
>> So I don't get why we can't assume always unencrypted and keep it 
>> like that.
>
> I see two reasons. First, it would break with a real device that 
> signals it's capable of addressing the encryption bit.
>
> Second I can imagine unaccelerated setups (something like vkms using 
> prime feeding a VNC connection) where we actually want the TTM buffers 
> encrypted to protect data.
>
> But at least the latter reason is way far out in the future.
>
> So for me I'm ok with that if that works for you?

Hmm, BTW,

Are you sure the AMD GPUs use unencrypted system memory rather than 
relying on the memory controllers to decrypt?

In that case it seems strange that they get away with encrypted TTM 
PTEs, whereas vmwgfx doesn't...

/Thomas

>
> /Thomas
>
>
>>
>> Regards,
>> Christian.
>
>
Thomas Hellström (Intel) Sept. 4, 2019, 10:37 a.m. UTC | #18
On 9/4/19 9:53 AM, Daniel Vetter wrote:
> On Wed, Sep 4, 2019 at 8:49 AM Thomas Hellström (VMware)
> <thomas_os@shipmail.org> wrote:
>> On 9/4/19 1:15 AM, Andy Lutomirski wrote:
>>> But, reading this, I have more questions:
>>>
>>> Can’t you get rid of cvma by using vmf_insert_pfn_prot()?
>> It looks like that, although there are comments in the code about
>> serious performance problems using VM_PFNMAP / vmf_insert_pfn() with
>> write-combining and PAT, so that would require some serious testing with
>> hardware I don't have. But I guess there is definitely room for
>> improvement here. Ideally we'd like to be able to change the
>> vma->vm_page_prot within fault(). But we can
> Just a quick comment on this: It's the repeated (per-pfn/pte) lookup
> of the PAT tables, which are dead slow. If you have a struct
> io_mapping then that can be done once, and then just blindly inserted.
> See remap_io_mapping in i915.
> -Daniel

Thanks, Daniel.

Indeed looks a lot like remap_pfn_range(), but usable at fault time?

/Thomas
Christian König Sept. 4, 2019, 11:10 a.m. UTC | #19
Am 04.09.19 um 10:19 schrieb Thomas Hellström (VMware):
> Hi, Christian,
>
> On 9/4/19 9:33 AM, Koenig, Christian wrote:
>> Am 03.09.19 um 23:05 schrieb Thomas Hellström (VMware):
>>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>>> So the question here should really be, can we determine already at 
>>>>> mmap
>>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>>> vma->vm_page_prot under the mmap_sem?
>>>>>
>>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>>> time rather than at first fault time.
>>>> I'm not connecting the dots.
>>>>
>>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>>> are created at mmap() or fault time.  If we establish a good
>>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>>> With SEV I think that we could possibly establish the encryption flags
>>> at vma creation time. But thinking of it, it would actually break with
>>> SME where buffer content can be moved between encrypted system memory
>>> and unencrypted graphics card PCI memory behind user-space's back.
>>> That would imply killing all user-space encrypted PTEs and at fault
>>> time set up new ones pointing to unencrypted PCI memory..
>> Well my problem is where do you see encrypted system memory here?
>>
>> At least for AMD GPUs all memory accessed must be unencrypted and that
>> counts for both system as well as PCI memory.
>
> We're talking SME now right?
>
> The current SME setup is that if a device's DMA mask says it's capable 
> of addressing the encryption bit, coherent memory will be encrypted. 
> The memory controllers will decrypt for the device on the fly. 
> Otherwise coherent memory will be decrypted.
>
>>
>> So I don't get why we can't assume always unencrypted and keep it 
>> like that.
>
> I see two reasons. First, it would break with a real device that 
> signals it's capable of addressing the encryption bit.

Why? Because we don't use dma_mmap_coherent()?

I've already talked with Christoph that we probably want to switch TTM 
over to using that instead to also get rid of the ttm_io_prot() hack.

Regards,
Christian.

>
> Second I can imagine unaccelerated setups (something like vkms using 
> prime feeding a VNC connection) where we actually want the TTM buffers 
> encrypted to protect data.
>
> But at least the latter reason is way far out in the future.
>
> So for me I'm ok with that if that works for you?
>
> /Thomas
>
>
>>
>> Regards,
>> Christian.
>
>
Daniel Vetter Sept. 4, 2019, 11:43 a.m. UTC | #20
On Wed, Sep 4, 2019 at 12:38 PM Thomas Hellström (VMware)
<thomas_os@shipmail.org> wrote:
>
> On 9/4/19 9:53 AM, Daniel Vetter wrote:
> > On Wed, Sep 4, 2019 at 8:49 AM Thomas Hellström (VMware)
> > <thomas_os@shipmail.org> wrote:
> >> On 9/4/19 1:15 AM, Andy Lutomirski wrote:
> >>> But, reading this, I have more questions:
> >>>
> >>> Can’t you get rid of cvma by using vmf_insert_pfn_prot()?
> >> It looks like that, although there are comments in the code about
> >> serious performance problems using VM_PFNMAP / vmf_insert_pfn() with
> >> write-combining and PAT, so that would require some serious testing with
> >> hardware I don't have. But I guess there is definitely room for
> >> improvement here. Ideally we'd like to be able to change the
> >> vma->vm_page_prot within fault(). But we can
> > Just a quick comment on this: It's the repeated (per-pfn/pte) lookup
> > of the PAT tables, which are dead slow. If you have a struct
> > io_mapping then that can be done once, and then just blindly inserted.
> > See remap_io_mapping in i915.
> > -Daniel
>
> Thanks, Daniel.
>
> Indeed looks a lot like remap_pfn_range(), but usable at fault time?

Yeah, we call it from our fault handler. It's essentially vm_insert_pfn,
except the PAT tracking isn't there; it instead relies on the PAT
tracking io_mapping has done already.
-Daniel
Thomas Hellström (Intel) Sept. 4, 2019, 12:35 p.m. UTC | #21
On 9/4/19 1:10 PM, Koenig, Christian wrote:
> Am 04.09.19 um 10:19 schrieb Thomas Hellström (VMware):
>> Hi, Christian,
>>
>> On 9/4/19 9:33 AM, Koenig, Christian wrote:
>>> Am 03.09.19 um 23:05 schrieb Thomas Hellström (VMware):
>>>> On 9/3/19 10:51 PM, Dave Hansen wrote:
>>>>> On 9/3/19 1:36 PM, Thomas Hellström (VMware) wrote:
>>>>>> So the question here should really be, can we determine already at
>>>>>> mmap
>>>>>> time whether backing memory will be unencrypted and adjust the *real*
>>>>>> vma->vm_page_prot under the mmap_sem?
>>>>>>
>>>>>> Possibly, but that requires populating the buffer with memory at mmap
>>>>>> time rather than at first fault time.
>>>>> I'm not connecting the dots.
>>>>>
>>>>> vma->vm_page_prot is used to create a VMA's PTEs regardless of if they
>>>>> are created at mmap() or fault time.  If we establish a good
>>>>> vma->vm_page_prot, can't we just use it forever for demand faults?
>>>> With SEV I think that we could possibly establish the encryption flags
>>>> at vma creation time. But thinking of it, it would actually break with
>>>> SME where buffer content can be moved between encrypted system memory
>>>> and unencrypted graphics card PCI memory behind user-space's back.
>>>> That would imply killing all user-space encrypted PTEs and at fault
>>>> time set up new ones pointing to unencrypted PCI memory..
>>> Well my problem is where do you see encrypted system memory here?
>>>
>>> At least for AMD GPUs all memory accessed must be unencrypted and that
>>> counts for both system as well as PCI memory.
>> We're talking SME now right?
>>
>> The current SME setup is that if a device's DMA mask says it's capable
>> of addressing the encryption bit, coherent memory will be encrypted.
>> The memory controllers will decrypt for the device on the fly.
>> Otherwise coherent memory will be decrypted.
>>
>>> So I don't get why we can't assume always unencrypted and keep it
>>> like that.
>> I see two reasons. First, it would break with a real device that
>> signals it's capable of addressing the encryption bit.
> Why? Because we don't use dma_mmap_coherent()?

Well, assuming always unencrypted would obviously break on a real device 
with encrypted coherent memory?

dma_mmap_coherent() would work from the encryption point of view 
(although I think it's currently buggy and will send out an RFC for what 
I believe is a fix for that).

>
> I've already talked with Christoph that we probably want to switch TTM
> over to using that instead to also get rid of the ttm_io_prot() hack.

OK, would that mean us ditching other memory modes completely? And 
on-the-fly caching transitions? or is it just for the special case of 
cached coherent memory? Do we need to cache the coherent kernel mappings 
in TTM as well, for ttm_bo_kmap()?

/Thomas

>
> Regards,
> Christian.
>
>> Second I can imagine unaccelerated setups (something like vkms using
>> prime feeding a VNC connection) where we actually want the TTM buffers
>> encrypted to protect data.
>>
>> But at least the latter reason is way far out in the future.
>>
>> So for me I'm ok with that if that works for you?
>>
>> /Thomas
>>
>>
>>> Regards,
>>> Christian.
>>
Thomas Hellström (Intel) Sept. 4, 2019, 1:05 p.m. UTC | #22
On 9/4/19 2:35 PM, Thomas Hellström (VMware) wrote:
>
>>
>> I've already talked with Christoph that we probably want to switch TTM
>> over to using that instead to also get rid of the ttm_io_prot() hack.
>
> OK, would that mean us ditching other memory modes completely? And 
> on-the-fly caching transitions? or is it just for the special case of 
> cached coherent memory? Do we need to cache the coherent kernel 
> mappings in TTM as well, for ttm_bo_kmap()?

Reading this again, I wanted to point out that I'm not against this. 
Just curious.

/Thomas
diff mbox series

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index fe81c565e7ef..d5ad8f03b63f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -419,11 +419,13 @@  int ttm_bo_move_memcpy(struct ttm_buffer_object *bo,
 		page = i * dir + add;
 		if (old_iomap == NULL) {
 			pgprot_t prot = ttm_io_prot(old_mem->placement,
+						    ttm->page_flags,
 						    PAGE_KERNEL);
 			ret = ttm_copy_ttm_io_page(ttm, new_iomap, page,
 						   prot);
 		} else if (new_iomap == NULL) {
 			pgprot_t prot = ttm_io_prot(new_mem->placement,
+						    ttm->page_flags,
 						    PAGE_KERNEL);
 			ret = ttm_copy_io_ttm_page(ttm, old_iomap, page,
 						   prot);
@@ -526,11 +528,11 @@  static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 	return 0;
 }
 
-pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp)
+pgprot_t ttm_io_prot(u32 caching_flags, u32 tt_page_flags, pgprot_t tmp)
 {
 	/* Cached mappings need no adjustment */
 	if (caching_flags & TTM_PL_FLAG_CACHED)
-		return tmp;
+		goto check_encryption;
 
 #if defined(__i386__) || defined(__x86_64__)
 	if (caching_flags & TTM_PL_FLAG_WC)
@@ -548,6 +550,11 @@  pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp)
 #if defined(__sparc__)
 	tmp = pgprot_noncached(tmp);
 #endif
+
+check_encryption:
+	if (tt_page_flags & TTM_PAGE_FLAG_DECRYPTED)
+		tmp = pgprot_decrypted(tmp);
+
 	return tmp;
 }
 EXPORT_SYMBOL(ttm_io_prot);
@@ -594,7 +601,8 @@  static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
 	if (ret)
 		return ret;
 
-	if (num_pages == 1 && (mem->placement & TTM_PL_FLAG_CACHED)) {
+	if (num_pages == 1 && (mem->placement & TTM_PL_FLAG_CACHED) &&
+	    !(ttm->page_flags & TTM_PAGE_FLAG_DECRYPTED)) {
 		/*
 		 * We're mapping a single page, and the desired
 		 * page protection is consistent with the bo.
@@ -608,7 +616,8 @@  static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
 		 * We need to use vmap to get the desired page protection
 		 * or to make the buffer object look contiguous.
 		 */
-		prot = ttm_io_prot(mem->placement, PAGE_KERNEL);
+		prot = ttm_io_prot(mem->placement, ttm->page_flags,
+				   PAGE_KERNEL);
 		map->bo_kmap_type = ttm_bo_map_vmap;
 		map->virtual = vmap(ttm->pages + start_page, num_pages,
 				    0, prot);
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 76eedb963693..194d8d618d23 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -226,12 +226,7 @@  static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 	 * by mmap_sem in write mode.
 	 */
 	cvma = *vma;
-	cvma.vm_page_prot = vm_get_page_prot(cvma.vm_flags);
-
-	if (bo->mem.bus.is_iomem) {
-		cvma.vm_page_prot = ttm_io_prot(bo->mem.placement,
-						cvma.vm_page_prot);
-	} else {
+	if (!bo->mem.bus.is_iomem) {
 		struct ttm_operation_ctx ctx = {
 			.interruptible = false,
 			.no_wait_gpu = false,
@@ -240,14 +235,18 @@  static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 		};
 
 		ttm = bo->ttm;
-		cvma.vm_page_prot = ttm_io_prot(bo->mem.placement,
-						cvma.vm_page_prot);
-
-		/* Allocate all page at once, most common usage */
-		if (ttm_tt_populate(ttm, &ctx)) {
+		if (ttm_tt_populate(bo->ttm, &ctx)) {
 			ret = VM_FAULT_OOM;
 			goto out_io_unlock;
 		}
+		cvma.vm_page_prot = ttm_io_prot(bo->mem.placement,
+						ttm->page_flags,
+						cvma.vm_page_prot);
+	} else {
+		/* Iomem should not be marked encrypted */
+		cvma.vm_page_prot = ttm_io_prot(bo->mem.placement,
+						TTM_PAGE_FLAG_DECRYPTED,
+						cvma.vm_page_prot);
 	}
 
 	/*
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
index 7d78e6deac89..9b15df8ecd49 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
@@ -48,6 +48,7 @@ 
 #include <linux/atomic.h>
 #include <linux/device.h>
 #include <linux/kthread.h>
+#include <linux/dma-direct.h>
 #include <drm/ttm/ttm_bo_driver.h>
 #include <drm/ttm/ttm_page_alloc.h>
 #include <drm/ttm/ttm_set_memory.h>
@@ -984,6 +985,9 @@  int ttm_dma_populate(struct ttm_dma_tt *ttm_dma, struct device *dev,
 	}
 
 	ttm->state = tt_unbound;
+	if (force_dma_unencrypted(dev))
+		ttm->page_flags |= TTM_PAGE_FLAG_DECRYPTED;
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(ttm_dma_populate);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
index bb46ca0c458f..d3ced89a37e9 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
@@ -483,8 +483,10 @@  int vmw_bo_cpu_blit(struct ttm_buffer_object *dst,
 	d.src_pages = src->ttm->pages;
 	d.dst_num_pages = dst->num_pages;
 	d.src_num_pages = src->num_pages;
-	d.dst_prot = ttm_io_prot(dst->mem.placement, PAGE_KERNEL);
-	d.src_prot = ttm_io_prot(src->mem.placement, PAGE_KERNEL);
+	d.dst_prot = ttm_io_prot(dst->mem.placement, dst->ttm->page_flags,
+				 PAGE_KERNEL);
+	d.src_prot = ttm_io_prot(src->mem.placement, src->ttm->page_flags,
+				 PAGE_KERNEL);
 	d.diff = diff;
 
 	for (j = 0; j < h; ++j) {
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 6f536caea368..68ead1bd3042 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -893,13 +893,15 @@  int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo);
 /**
  * ttm_io_prot
  *
- * @c_state: Caching state.
+ * @caching_flags: The caching flags of the map.
+ * @tt_page_flags: The tt_page_flags of the map, TTM_PAGE_FLAG_*
  * @tmp: Page protection flag for a normal, cached mapping.
  *
  * Utility function that returns the pgprot_t that should be used for
- * setting up a PTE with the caching model indicated by @c_state.
+ * setting up a PTE with the caching model indicated by @caching_flags,
+ * and encryption state indicated by @tt_page_flags,
  */
-pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp);
+pgprot_t ttm_io_prot(u32 caching_flags, u32 tt_page_flags, pgprot_t tmp);
 
 extern const struct ttm_mem_type_manager_func ttm_bo_manager_func;
 
diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index c0e928abf592..45cc26355513 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -41,6 +41,7 @@  struct ttm_operation_ctx;
 #define TTM_PAGE_FLAG_DMA32           (1 << 7)
 #define TTM_PAGE_FLAG_SG              (1 << 8)
 #define TTM_PAGE_FLAG_NO_RETRY	      (1 << 9)
+#define TTM_PAGE_FLAG_DECRYPTED       (1 << 10)
 
 enum ttm_caching_state {
 	tt_uncached,