drm/ttm: Do not put non-struct page memory into PUD/PMDs

Message ID: 0-v1-69e7da97f81f+21c-ttm_pmd_jgg@nvidia.com (mailing list archive)
State: New, archived
Series: drm/ttm: Do not put non-struct page memory into PUD/PMDs

Commit Message

Jason Gunthorpe Oct. 19, 2021, 6:21 p.m. UTC
PUD and PMD entries do not have a special bit.

get_user_pages_fast() considers any page that passed pmd_huge() as
usable:

		if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
			     pmd_devmap(pmd))) {

And vmf_insert_pfn_pmd_prot() unconditionally sets

	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));

e.g. on x86 the page will be _PAGE_PRESENT | _PAGE_PSE.

As such gup_huge_pmd() will try to deref a struct page:

	head = try_grab_compound_head(pmd_page(orig), refs, flags);

and thus crash.
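
(For illustration, a trimmed excerpt of gup_huge_pmd() from mm/gup.c
around the base commit; devmap handling and error paths elided:)

	if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
		return 0;

	/*
	 * pmd_page() is essentially pfn_to_page(pmd_pfn(orig)). An IO PFN
	 * inserted via vmf_insert_pfn_pmd_prot() has no struct page behind
	 * this address, so the pointer computed here is garbage ...
	 */
	page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
	refs = record_subpages(page, addr, end, pages + *nr);

	/* ... and the refcount atomics in here dereference it and crash */
	head = try_grab_compound_head(pmd_page(orig), refs, flags);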

Prevent this by never using IO memory with vmf_insert_pfn_pud/pmd_prot().

Fixes: 314b6580adc5 ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

AFAICT only the vmwgfx driver makes use of this, and I can't tell which path
it is taking.

This function will also try to install a PUD - does vmwgfx have a case where
it has obtained a 1GB high order page - or is that dead code now?


base-commit: 519d81956ee277b4419c723adfb154603c2565ba

Comments

Thomas Hellstrom Oct. 19, 2021, 6:49 p.m. UTC | #1
Hi,

On 10/19/21 20:21, Jason Gunthorpe wrote:
> PUD and PMD entries do not have a special bit.
>
> get_user_pages_fast() considers any page that passed pmd_huge() as
> usable:
>
> 		if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
> 			     pmd_devmap(pmd))) {
>
> And vmf_insert_pfn_pmd_prot() unconditionally sets
>
> 	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
>
> e.g. on x86 the page will be _PAGE_PRESENT | _PAGE_PSE.
>
> As such gup_huge_pmd() will try to deref a struct page:
>
> 	head = try_grab_compound_head(pmd_page(orig), refs, flags);
>
> and thus crash.
>
> Prevent this by never using IO memory with vmf_insert_pfn_pud/pmd_prot().

Actually I think fast gup will break even page-backed memory, since the 
backing drivers don't assume anybody else takes a refcount / 
pincount. Normal pages have PTE_SPECIAL and VM_PFNMAP to block that.
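
(For comparison, the PTE-level check that saves ordinary PFN-map PTEs,
trimmed from gup_pte_range() in mm/gup.c; exact surroundings elided:)

	pte_t pte = ptep_get_lockless(ptep);

	if (!pte_access_permitted(pte, flags & FOLL_WRITE))
		goto pte_unmap;

	/*
	 * A special PTE has no struct page for gup to take a reference on;
	 * fast gup bails out here, and the slow path then refuses
	 * VM_PFNMAP / VM_IO vmas outright. Nothing equivalent exists at
	 * the PMD/PUD level.
	 */
	if (pte_special(pte))
		goto pte_unmap;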

(Side note: I was recommended to introduce a PTE_HUGESPECIAL bit for this 
and basically had a patch ready, but got scared off when trying to 
handle 64-bit PTE atomic updates on x86-32.)
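
(To illustrate that pain: with PAE the entries are 64-bit while the CPU
can only store 32 bits atomically, so native_set_pte() in
arch/x86/include/asm/pgtable-3level.h has to write the entry in halves:)

	static inline void native_set_pte(pte_t *ptep, pte_t pte)
	{
		ptep->pte_high = pte.pte_high;
		smp_wmb();
		/* present bit lives in the low half, so write it last */
		ptep->pte_low = pte.pte_low;
	}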

It might be that we (Intel) try to resurrect this code using 
PTE_HUGESPECIAL in the near future for i915, but until then, I think 
it's the safest option to disable it completely.

Thanks,

Thomas

>
> Fixes: 314b6580adc5 ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/gpu/drm/ttm/ttm_bo_vm.c | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
>
> AFAICT only the vmwgfx driver makes use of this, and I can't tell which path
> it is taking.
>
> This function will also try to install a PUD - does vmwgfx have a case where
> it has obtained a 1GB high order page - or is that dead code now?
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index f56be5bc0861ec..91d02c345fbba8 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> @@ -203,10 +203,13 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
>   	if (page_offset + fault_page_size > bo->resource->num_pages)
>   		goto out_fallback;
>   
> +	/*
> +	 * vmf_insert_pfn_pud/pmd_prot() can only be called with struct page
> +	 * backed memory
> +	 */
>   	if (bo->resource->bus.is_iomem)
> -		pfn = ttm_bo_io_mem_pfn(bo, page_offset);
> -	else
> -		pfn = page_to_pfn(ttm->pages[page_offset]);
> +		goto out_fallback;
> +	pfn = page_to_pfn(ttm->pages[page_offset]);
>   
>   	/* pfn must be fault_page_size aligned. */
>   	if ((pfn & (fault_page_size - 1)) != 0)
>
> base-commit: 519d81956ee277b4419c723adfb154603c2565ba
Jason Gunthorpe Oct. 19, 2021, 6:52 p.m. UTC | #2
On Tue, Oct 19, 2021 at 08:49:29PM +0200, Thomas Hellström wrote:
> Hi,
> 
> On 10/19/21 20:21, Jason Gunthorpe wrote:
> > PUD and PMD entries do not have a special bit.
> > 
> > get_user_pages_fast() considers any page that passed pmd_huge() as
> > usable:
> > 
> > 		if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
> > 			     pmd_devmap(pmd))) {
> > 
> > And vmf_insert_pfn_pmd_prot() unconditionally sets
> > 
> > 	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
> > 
> > e.g. on x86 the page will be _PAGE_PRESENT | _PAGE_PSE.
> > 
> > As such gup_huge_pmd() will try to deref a struct page:
> > 
> > 	head = try_grab_compound_head(pmd_page(orig), refs, flags);
> > 
> > and thus crash.
> > 
> > Prevent this by never using IO memory with vmf_insert_pfn_pud/pmd_prot().
> 
> Actually I think fast gup will break even page-backed memory, since the
> backing drivers don't assume anybody else takes a refcount / pincount.
> Normal pages have PTE_SPECIAL and VM_PFNMAP to block that.

Erk, yes, that is even worse.

> (Side note: I was recommended to introduce a PTE_HUGESPECIAL bit for
> this and basically had a patch ready, but got scared off when trying
> to handle 64-bit PTE atomic updates on x86-32.)

Right, a PMD_SPECIAL bit is needed to make this work. 
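
(Hypothetical sketch only; no pmd_special() or PMD special bit exists in
the tree today. The guard would presumably sit in the fast-gup PMD
dispatch quoted above, roughly:)

	if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
		     pmd_devmap(pmd))) {
		/* hypothetical: punt special huge entries to slow gup */
		if (pmd_special(pmd))
			return 0;
	}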

> It might be that we (Intel) try to resurrect this code using
> PTE_HUGESPECIAL in the near future for i915, but until then, I think
> it's the safest option to disable it completely.

Okay, do you want a patch to just delete this function?

Jason
Thomas Hellstrom Oct. 19, 2021, 6:56 p.m. UTC | #3
On 10/19/21 20:52, Jason Gunthorpe wrote:
> On Tue, Oct 19, 2021 at 08:49:29PM +0200, Thomas Hellström wrote:
>> Hi,
>>
>> On 10/19/21 20:21, Jason Gunthorpe wrote:
>>> PUD and PMD entries do not have a special bit.
>>>
>>> get_user_pages_fast() considers any page that passed pmd_huge() as
>>> usable:
>>>
>>> 		if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
>>> 			     pmd_devmap(pmd))) {
>>>
>>> And vmf_insert_pfn_pmd_prot() unconditionally sets
>>>
>>> 	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
>>>
>>> e.g. on x86 the page will be _PAGE_PRESENT | _PAGE_PSE.
>>>
>>> As such gup_huge_pmd() will try to deref a struct page:
>>>
>>> 	head = try_grab_compound_head(pmd_page(orig), refs, flags);
>>>
>>> and thus crash.
>>>
>>> Prevent this by never using IO memory with vmf_insert_pfn_pud/pmd_prot().
>> Actually I think fast gup will break even page-backed memory, since the
>> backing drivers don't assume anybody else takes a refcount / pincount.
>> Normal pages have PTE_SPECIAL and VM_PFNMAP to block that.
> Erk, yes, that is even worse.
>
>> (Side note: I was recommended to introduce a PTE_HUGESPECIAL bit for
>> this and basically had a patch ready, but got scared off when trying
>> to handle 64-bit PTE atomic updates on x86-32.)
> Right, a PMD_SPECIAL bit is needed to make this work.
Yes, PMD_SPECIAL it was :)
>
>> It might be that we (Intel) try to resurrect this code using
>> PTE_HUGESPECIAL in the near future for i915, but until then, I think
>> it's the safest option to disable it completely.
> Okay, do you want a patch to just delete this function?

That'd be great.

Thanks,

Thomas


>
> Jason

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index f56be5bc0861ec..91d02c345fbba8 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -203,10 +203,13 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
 	if (page_offset + fault_page_size > bo->resource->num_pages)
 		goto out_fallback;
 
+	/*
+	 * vmf_insert_pfn_pud/pmd_prot() can only be called with struct page
+	 * backed memory
+	 */
 	if (bo->resource->bus.is_iomem)
-		pfn = ttm_bo_io_mem_pfn(bo, page_offset);
-	else
-		pfn = page_to_pfn(ttm->pages[page_offset]);
+		goto out_fallback;
+	pfn = page_to_pfn(ttm->pages[page_offset]);
 
 	/* pfn must be fault_page_size aligned. */
 	if ((pfn & (fault_page_size - 1)) != 0)
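
For context, with this change every is_iomem fault takes the out_fallback
path, which (as of the base commit, if I read it right) just punts back to
the normal PTE-sized fault handling:

	out_fallback:
		count_vm_event(THP_FAULT_FALLBACK);
		return VM_FAULT_FALLBACK;

So IO memory is still mapped, only with 4K PTEs as it was before the huge
pagefault support was added.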