
[v2,2/2] drm/ttm: Fix vm page protection handling

Message ID 20191203104853.4378-3-thomas_os@shipmail.org (mailing list archive)
State New, archived
Series [v2,1/2] mm: Add and export vmf_insert_mixed_prot()

Commit Message

Thomas Hellström (Intel) Dec. 3, 2019, 10:48 a.m. UTC
From: Thomas Hellstrom <thellstrom@vmware.com>

TTM graphics buffer objects may, transparently to user-space, move
between IO and system memory. When that happens, all PTEs pointing to the
old location are zapped before the move and then faulted in again if
needed. When they are faulted in again, the page protection caching-mode
and encryption bits may change and differ from those of
struct vm_area_struct::vm_page_prot.

We were using an ugly hack to set the page protection correctly.
Fix that and instead use vmf_insert_mixed_prot() and/or
vmf_insert_pfn_prot().
Also get the default page protection from
struct vm_area_struct::vm_page_prot rather than using vm_get_page_prot().
This way we catch modifications done by the vm system for drivers that
want write-notification.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
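
For context, the zap-and-refault mechanism described above is driven by
unmap_mapping_range(): zapping removes every user-space PTE of the object,
and the next access re-faults through ttm_bo_vm_fault(), which rebuilds the
PTEs with the protection of the new placement. A minimal sketch, with
bo_mmap_offset() and bo_size() as hypothetical helpers standing in for the
TTM internals:

#include <linux/mm.h>
#include <drm/ttm/ttm_bo_api.h>

/*
 * Sketch of the zap-before-move step; not the actual TTM code, and the
 * bo_mmap_offset()/bo_size() helpers are hypothetical. After this call
 * every user-space PTE of the object is gone; the next access faults
 * through ttm_bo_vm_fault(), which rebuilds the PTE with the protection
 * bits of the buffer's new placement.
 */
static void zap_bo_ptes(struct address_space *mapping,
			struct ttm_buffer_object *bo)
{
	unmap_mapping_range(mapping, bo_mmap_offset(bo), bo_size(bo), 1);
}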

Comments

Michal Hocko Dec. 4, 2019, 1:52 p.m. UTC | #1
On Tue 03-12-19 11:48:53, Thomas Hellström (VMware) wrote:
> [...]

So essentially this shouldn't have any new side effect on functionality;
it is just making hacky/ugly code less so? In other words, what are the
consequences of having page protection inconsistent with the vma's?

Thomas Hellström (Intel) Dec. 4, 2019, 2:16 p.m. UTC | #2
On 12/4/19 2:52 PM, Michal Hocko wrote:
> On Tue 03-12-19 11:48:53, Thomas Hellström (VMware) wrote:
>> [...]
> So essentially this shouldn't have any new side effect on functionality;
> it is just making hacky/ugly code less so?

Functionality is unchanged. The use of an on-stack vma copy was severely
frowned upon in an earlier thread, which also points to another similar
example using vmf_insert_pfn_prot():

https://lore.kernel.org/lkml/20190905103541.4161-2-thomas_os@shipmail.org/
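
Condensed from the diff in this patch, the hack in question and its
replacement: the old code copied the vma onto the stack purely to carry a
different vm_page_prot into vmf_insert_mixed()/vmf_insert_pfn(), while the
new code passes the pgprot_t explicitly:

/* Old: on-stack vma copy, only to smuggle in a different vm_page_prot. */
struct vm_area_struct cvma = *vma;

cvma.vm_page_prot = ttm_io_prot(bo->mem.placement, prot);
ret = vmf_insert_mixed(&cvma, address, __pfn_to_pfn_t(pfn, PFN_DEV));

/* New: pass the protection explicitly; no vma copy needed. */
prot = ttm_io_prot(bo->mem.placement, prot);
ret = vmf_insert_mixed_prot(vma, address, __pfn_to_pfn_t(pfn, PFN_DEV),
			    prot);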

> In other words, what are the
> consequences of having page protection inconsistent with the vma's?

Over the years, it looks like the caching and encryption flags of
vma::vm_page_prot have been largely removed from usage. From what I can
tell, there are no places left that can affect TTM. We discussed
__split_huge_pmd_locked() towards the end of that thread, but that
doesn't affect TTM even with huge page-table entries.

/Thomas
Michal Hocko Dec. 4, 2019, 2:35 p.m. UTC | #3
On Wed 04-12-19 15:16:09, Thomas Hellström (VMware) wrote:
> [...]
> Over the years, it looks like the caching and encryption flags of
> vma::vm_page_prot have been largely removed from usage. From what I can
> tell, there are no places left that can affect TTM. We discussed
> __split_huge_pmd_locked() towards the end of that thread, but that doesn't
> affect TTM even with huge page-table entries.

Please state all those details/assumptions you are operating on in the
changelog.
Thomas Hellström (Intel) Dec. 4, 2019, 2:36 p.m. UTC | #4
On 12/4/19 3:35 PM, Michal Hocko wrote:
> [...]
> Please state all those details/assumptions you are operating on in the
> changelog.

Thanks. I'll update the patchset and add that.

/Thomas
Michal Hocko Dec. 4, 2019, 2:42 p.m. UTC | #5
On Wed 04-12-19 15:36:58, Thomas Hellström (VMware) wrote:
> > [...]
> > Please state all those details/assumptions you are operating on in the
> > changelog.
> 
> Thanks. I'll update the patchset and add that.

And thinking about that, this also begs for a comment in the code to
explain that some (which?) mappings might have a mismatch and that
generic code has to be careful. Because as things stand now, this seems
really subtle: it happens to work _now_ and might break in the future.
Or what prevents generic code from stumbling over this discrepancy?
Thomas Hellström (Intel) Dec. 4, 2019, 3:19 p.m. UTC | #6
On 12/4/19 3:42 PM, Michal Hocko wrote:
> [...]
> And thinking about that, this also begs for a comment in the code to
> explain that some (which?) mappings might have a mismatch and that
> generic code has to be careful. Because as things stand now, this seems
> really subtle: it happens to work _now_ and might break in the future.
> Or what prevents generic code from stumbling over this discrepancy?

Yes, we had that discussion in the thread I pointed to. I initially
suggested and argued for updating vma::vm_page_prot using a
WRITE_ONCE() (we only hold the mmap_sem in read mode); there seem to be
other places in generic code that do the same.
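
A minimal sketch of that rejected alternative (not what this patch does),
assuming it were done from the TTM fault path:

/*
 * Rejected alternative (sketch only): update the vma's protection in
 * place from the fault handler. Only mmap_sem is held for read here,
 * so the store relies on WRITE_ONCE() for atomicity, as generic code
 * already does in vma_set_page_prot(). The patch instead passes the
 * pgprot_t explicitly to the vmf_insert_*_prot() helpers.
 */
pgprot_t new_prot = ttm_io_prot(bo->mem.placement, vma->vm_page_prot);

WRITE_ONCE(vma->vm_page_prot, new_prot);
ret = vmf_insert_pfn(vma, address, pfn);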

But I was convinced by Andy that passing the protection explicitly was
the right way, and that it is also used elsewhere.

(See also 
https://elixir.bootlin.com/linux/latest/source/arch/x86/entry/vdso/vma.c#L116)

I guess to have this properly formulated, what's required is that 
generic code doesn't build page-table entries using vma::vm_page_prot 
for VM_PFNMAP and VM_MIXEDMAP outside of driver control.

/Thomas
Michal Hocko Dec. 4, 2019, 3:26 p.m. UTC | #7
On Wed 04-12-19 16:19:27, Thomas Hellström (VMware) wrote:
> > [...]
> > And thinking about that, this also begs for a comment in the code to
> > explain that some (which?) mappings might have a mismatch and that
> > generic code has to be careful. Because as things stand now, this seems
> > really subtle: it happens to work _now_ and might break in the future.
> > Or what prevents generic code from stumbling over this discrepancy?
> 
> Yes, we had that discussion in the thread I pointed to. I initially
> suggested and argued for updating vma::vm_page_prot using a
> WRITE_ONCE() (we only hold the mmap_sem in read mode); there seem to be
> other places in generic code that do the same.
> 
> But I was convinced by Andy that passing the protection explicitly was
> the right way, and that it is also used elsewhere.
> 
> (See also https://elixir.bootlin.com/linux/latest/source/arch/x86/entry/vdso/vma.c#L116)
> 
> I guess to have this properly formulated, what's required is that generic
> code doesn't build page-table entries using vma::vm_page_prot for VM_PFNMAP
> and VM_MIXEDMAP outside of driver control.

Let me repeat that this belongs in the code, somewhere everybody can see
it, rather than in a "random" discussion on a mailing list.

Thanks!
Thomas Hellström (Intel) Dec. 4, 2019, 3:29 p.m. UTC | #8
On 12/4/19 4:26 PM, Michal Hocko wrote:
>> [...]
>> I guess to have this properly formulated, what's required is that generic
>> code doesn't build page-table entries using vma::vm_page_prot for VM_PFNMAP
>> and VM_MIXEDMAP outside of driver control.
> Let me repeat that this belongs in the code, somewhere everybody can see
> it, rather than in a "random" discussion on a mailing list.
>
> Thanks!

Yes, I agree. I'll of course follow up with the comments added to the code.

Thomas
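
The promised comment would presumably encode the invariant stated earlier
in the thread. A sketch of what could be added near the top of
ttm_bo_vm_fault_reserved() (not the wording that eventually landed):

/*
 * Sketch of the comment being discussed, not the final upstream text:
 *
 * For buffer objects that can move between IO and system memory, the
 * caching-mode and encryption bits of the PTEs we insert here may
 * differ from vma->vm_page_prot. This is safe only because generic mm
 * code never builds page-table entries from vma->vm_page_prot for
 * VM_PFNMAP / VM_MIXEDMAP mappings outside of driver control.
 */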

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index e6495ca2630b..2098f8d4dfc5 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -173,7 +173,6 @@  vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
 				    pgoff_t num_prefault)
 {
 	struct vm_area_struct *vma = vmf->vma;
-	struct vm_area_struct cvma = *vma;
 	struct ttm_buffer_object *bo = vma->vm_private_data;
 	struct ttm_bo_device *bdev = bo->bdev;
 	unsigned long page_offset;
@@ -244,7 +243,7 @@  vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
 		goto out_io_unlock;
 	}
 
-	cvma.vm_page_prot = ttm_io_prot(bo->mem.placement, prot);
+	prot = ttm_io_prot(bo->mem.placement, prot);
 	if (!bo->mem.bus.is_iomem) {
 		struct ttm_operation_ctx ctx = {
 			.interruptible = false,
@@ -260,7 +259,7 @@  vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
 		}
 	} else {
 		/* Iomem should not be marked encrypted */
-		cvma.vm_page_prot = pgprot_decrypted(cvma.vm_page_prot);
+		prot = pgprot_decrypted(prot);
 	}
 
 	/*
@@ -284,10 +283,11 @@  vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
 		}
 
 		if (vma->vm_flags & VM_MIXEDMAP)
-			ret = vmf_insert_mixed(&cvma, address,
-					__pfn_to_pfn_t(pfn, PFN_DEV));
+			ret = vmf_insert_mixed_prot(vma, address,
+						    __pfn_to_pfn_t(pfn, PFN_DEV),
+						    prot);
 		else
-			ret = vmf_insert_pfn(&cvma, address, pfn);
+			ret = vmf_insert_pfn_prot(vma, address, pfn, prot);
 
 		/* Never error on prefaulted PTEs */
 		if (unlikely((ret & VM_FAULT_ERROR))) {
@@ -319,7 +319,7 @@  vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 	if (ret)
 		return ret;
 
-	prot = vm_get_page_prot(vma->vm_flags);
+	prot = vma->vm_page_prot;
 	ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT);
 	if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
 		return ret;
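
For reference, the two insertion helpers used above: vmf_insert_pfn_prot()
is pre-existing in mm/memory.c, while vmf_insert_mixed_prot() is added by
patch 1/2 of this series; the signatures below are inferred from their
usage here:

vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma,
			       unsigned long addr, unsigned long pfn,
			       pgprot_t pgprot);

vm_fault_t vmf_insert_mixed_prot(struct vm_area_struct *vma,
				 unsigned long addr, pfn_t pfn,
				 pgprot_t pgprot);

The switch from vm_get_page_prot(vma->vm_flags) to vma->vm_page_prot in the
last hunk matters because the mm may have write-protected vm_page_prot for
drivers that want write notification; recomputing it from vm_flags would
silently undo that.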