diff mbox series

[V2] drm/ttm: increase ttm pre-fault value to PMD size

Message ID 20240604084934.225738-1-lingshan.zhu@amd.com (mailing list archive)
State New, archived
Headers show
Series [V2] drm/ttm: increase ttm pre-fault value to PMD size | expand

Commit Message

Zhu Lingshan June 4, 2024, 8:49 a.m. UTC
ttm page fault handler ttm_bo_vm_fault_reserved() maps
TTM_BO_VM_NUM_PREFAULT more pages beforehand
due to the principle of locality.

However, on some platform the page faults are more costly, this
patch intends to increase the number of ttm pre-fault to relieve
the number of page faults.

When multiple levels of page table is supported, the new default
value would be the PMD size, similar to huge page.

Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
Reported-and-tested-by: Li Jingxiang <jingxiang.li@ecarxgroup.com>
---
 include/drm/ttm/ttm_bo.h | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Huang Rui June 4, 2024, 10:18 a.m. UTC | #1
On Tue, Jun 04, 2024 at 04:49:34PM +0800, Zhu, Lingshan wrote:
> ttm page fault handler ttm_bo_vm_fault_reserved() maps
> TTM_BO_VM_NUM_PREFAULT more pages beforehand
> due to the principle of locality.
> 
> However, on some platform the page faults are more costly, this
> patch intends to increase the number of ttm pre-fault to relieve
> the number of page faults.
> 
> When multiple levels of page table is supported, the new default
> value would be the PMD size, similar to huge page.
> 
> Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
> Reported-and-tested-by: Li Jingxiang <jingxiang.li@ecarxgroup.com>

Acked-by: Huang Rui <ray.huang@amd.com>

> ---
>  include/drm/ttm/ttm_bo.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
> index 6ccf96c91f3a..ef0f52f56ebc 100644
> --- a/include/drm/ttm/ttm_bo.h
> +++ b/include/drm/ttm/ttm_bo.h
> @@ -39,7 +39,11 @@
>  #include "ttm_device.h"
>  
>  /* Default number of pre-faulted pages in the TTM fault handler */
> +#if CONFIG_PGTABLE_LEVELS > 2
> +#define TTM_BO_VM_NUM_PREFAULT (1 << (PMD_SHIFT - PAGE_SHIFT))
> +#else
>  #define TTM_BO_VM_NUM_PREFAULT 16
> +#endif
>  
>  struct iosys_map;
>  
> -- 
> 2.45.1
>
Christian König June 4, 2024, 10:24 a.m. UTC | #2
Am 04.06.24 um 12:18 schrieb Huang Rui:
> On Tue, Jun 04, 2024 at 04:49:34PM +0800, Zhu, Lingshan wrote:
>> ttm page fault handler ttm_bo_vm_fault_reserved() maps
>> TTM_BO_VM_NUM_PREFAULT more pages beforehand
>> due to the principle of locality.
>>
>> However, on some platform the page faults are more costly, this
>> patch intends to increase the number of ttm pre-fault to relieve
>> the number of page faults.
>>
>> When multiple levels of page table is supported, the new default
>> value would be the PMD size, similar to huge page.
>>
>> Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
>> Reported-and-tested-by: Li Jingxiang <jingxiang.li@ecarxgroup.com>
> Acked-by: Huang Rui <ray.huang@amd.com>

Not sure if there really is an architecture with less than 3 page table 
levels, but the build robots should be able to tell us if everything is 
fine here.

Reviewed-by: Christian König <christian.koenig@amd.com>

>
>> ---
>>   include/drm/ttm/ttm_bo.h | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
>> index 6ccf96c91f3a..ef0f52f56ebc 100644
>> --- a/include/drm/ttm/ttm_bo.h
>> +++ b/include/drm/ttm/ttm_bo.h
>> @@ -39,7 +39,11 @@
>>   #include "ttm_device.h"
>>   
>>   /* Default number of pre-faulted pages in the TTM fault handler */
>> +#if CONFIG_PGTABLE_LEVELS > 2
>> +#define TTM_BO_VM_NUM_PREFAULT (1 << (PMD_SHIFT - PAGE_SHIFT))
>> +#else
>>   #define TTM_BO_VM_NUM_PREFAULT 16
>> +#endif
>>   
>>   struct iosys_map;
>>   
>> -- 
>> 2.45.1
>>
Alex Deucher June 19, 2024, 4:09 p.m. UTC | #3
On Tue, Jun 4, 2024 at 10:02 AM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 04.06.24 um 12:18 schrieb Huang Rui:
> > On Tue, Jun 04, 2024 at 04:49:34PM +0800, Zhu, Lingshan wrote:
> >> ttm page fault handler ttm_bo_vm_fault_reserved() maps
> >> TTM_BO_VM_NUM_PREFAULT more pages beforehand
> >> due to the principle of locality.
> >>
> >> However, on some platform the page faults are more costly, this
> >> patch intends to increase the number of ttm pre-fault to relieve
> >> the number of page faults.
> >>
> >> When multiple levels of page table is supported, the new default
> >> value would be the PMD size, similar to huge page.
> >>
> >> Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
> >> Reported-and-tested-by: Li Jingxiang <jingxiang.li@ecarxgroup.com>
> > Acked-by: Huang Rui <ray.huang@amd.com>
>
> Not sure if there really is an architecture with less than 3 page table
> levels, but the build robots should be able to tell us if everything is
> fine here.
>
> Reviewed-by: Christian König <christian.koenig@amd.com>

Has this been pushed to drm-misc?  If not, can you push it?

Thanks,

Alex

>
> >
> >> ---
> >>   include/drm/ttm/ttm_bo.h | 4 ++++
> >>   1 file changed, 4 insertions(+)
> >>
> >> diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
> >> index 6ccf96c91f3a..ef0f52f56ebc 100644
> >> --- a/include/drm/ttm/ttm_bo.h
> >> +++ b/include/drm/ttm/ttm_bo.h
> >> @@ -39,7 +39,11 @@
> >>   #include "ttm_device.h"
> >>
> >>   /* Default number of pre-faulted pages in the TTM fault handler */
> >> +#if CONFIG_PGTABLE_LEVELS > 2
> >> +#define TTM_BO_VM_NUM_PREFAULT (1 << (PMD_SHIFT - PAGE_SHIFT))
> >> +#else
> >>   #define TTM_BO_VM_NUM_PREFAULT 16
> >> +#endif
> >>
> >>   struct iosys_map;
> >>
> >> --
> >> 2.45.1
> >>
>
Christian König June 20, 2024, 2:01 p.m. UTC | #4
Am 19.06.24 um 18:09 schrieb Alex Deucher:
> On Tue, Jun 4, 2024 at 10:02 AM Christian König
> <christian.koenig@amd.com> wrote:
>> Am 04.06.24 um 12:18 schrieb Huang Rui:
>>> On Tue, Jun 04, 2024 at 04:49:34PM +0800, Zhu, Lingshan wrote:
>>>> ttm page fault handler ttm_bo_vm_fault_reserved() maps
>>>> TTM_BO_VM_NUM_PREFAULT more pages beforehand
>>>> due to the principle of locality.
>>>>
>>>> However, on some platform the page faults are more costly, this
>>>> patch intends to increase the number of ttm pre-fault to relieve
>>>> the number of page faults.
>>>>
>>>> When multiple levels of page table is supported, the new default
>>>> value would be the PMD size, similar to huge page.
>>>>
>>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
>>>> Reported-and-tested-by: Li Jingxiang <jingxiang.li@ecarxgroup.com>
>>> Acked-by: Huang Rui <ray.huang@amd.com>
>> Not sure if there really is an architecture with less than 3 page table
>> levels, but the build robots should be able to tell us if everything is
>> fine here.
>>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
> Has this been pushed to drm-misc?  If not, can you push it?

Done.

Christian.

>
> Thanks,
>
> Alex
>
>>>> ---
>>>>    include/drm/ttm/ttm_bo.h | 4 ++++
>>>>    1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
>>>> index 6ccf96c91f3a..ef0f52f56ebc 100644
>>>> --- a/include/drm/ttm/ttm_bo.h
>>>> +++ b/include/drm/ttm/ttm_bo.h
>>>> @@ -39,7 +39,11 @@
>>>>    #include "ttm_device.h"
>>>>
>>>>    /* Default number of pre-faulted pages in the TTM fault handler */
>>>> +#if CONFIG_PGTABLE_LEVELS > 2
>>>> +#define TTM_BO_VM_NUM_PREFAULT (1 << (PMD_SHIFT - PAGE_SHIFT))
>>>> +#else
>>>>    #define TTM_BO_VM_NUM_PREFAULT 16
>>>> +#endif
>>>>
>>>>    struct iosys_map;
>>>>
>>>> --
>>>> 2.45.1
>>>>
Yanko Kaneti July 23, 2024, 7:42 p.m. UTC | #5
Hello,

Noticed and bisected a rawhide (with the new 6.11-rc0 snapshots) regression to this commit:

  0ddd2ae586d2 drm/ttm: increase ttm pre-fault value to PMD size

The regression manifests in choppy youtube video playback in google-chrome-stable.
 https://www.youtube.com/watch?v=uOpl2XNOgMA
 google-chrome-stable-126.0.6478.182-1.x86_64
 VP9 video, 
 Chrome -> Override software rendering list -> on
 Chrome -> Hardware-accelerated video decode - default enabled

No other visible graphics issues.

Its a desktop system with Ryzen 7 5700GRyzen 7 5700G  IGP
[AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638]

Tested with linus tip and just reverting the commit fixes the issue.

Sorry for the brevity, not sure what other details might be relevant.

- Yanko
Christian König July 24, 2024, 7:02 a.m. UTC | #6
Hi Yanko,

interesting. What do you mean with "choppy"? E.g. lag on startup?

Regards,
Christian.

Am 23.07.24 um 21:42 schrieb Yanko Kaneti:
> Hello,
>
> Noticed and bisected a rawhide (with the new 6.11-rc0 snapshots) regression to this commit:
>
>    0ddd2ae586d2 drm/ttm: increase ttm pre-fault value to PMD size
>
> The regression manifests in choppy youtube video playback in google-chrome-stable.
>   https://www.youtube.com/watch?v=uOpl2XNOgMA
>   google-chrome-stable-126.0.6478.182-1.x86_64
>   VP9 video,
>   Chrome -> Override software rendering list -> on
>   Chrome -> Hardware-accelerated video decode - default enabled
>
> No other visible graphics issues.
>
> Its a desktop system with Ryzen 7 5700GRyzen 7 5700G  IGP
> [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638]
>
> Tested with linus tip and just reverting the commit fixes the issue.
>
> Sorry for the brevity, not sure what other details might be relevant.
>
> - Yanko
>
>
Yanko Kaneti July 24, 2024, 7:13 a.m. UTC | #7
Hi,

Well, it starts, then within a second or two it begins stuttering with
long (half a second/second) freezes of the video , while the audio seems
to work fine. Nothing in the log from chrome or kernel , AFAICS,  to
show anything is wrong.

Regards
Yanko

On Wed, 2024-07-24 at 09:02 +0200, Christian König wrote:
> Hi Yanko,
> 
> interesting. What do you mean with "choppy"? E.g. lag on startup?
> 
> Regards,
> Christian.
> 
> Am 23.07.24 um 21:42 schrieb Yanko Kaneti:
> > Hello,
> > 
> > Noticed and bisected a rawhide (with the new 6.11-rc0 snapshots) regression to this commit:
> > 
> >    0ddd2ae586d2 drm/ttm: increase ttm pre-fault value to PMD size
> > 
> > The regression manifests in choppy youtube video playback in google-chrome-stable.
> >   https://www.youtube.com/watch?v=uOpl2XNOgMA
> >   google-chrome-stable-126.0.6478.182-1.x86_64
> >   VP9 video,
> >   Chrome -> Override software rendering list -> on
> >   Chrome -> Hardware-accelerated video decode - default enabled
> > 
> > No other visible graphics issues.
> > 
> > Its a desktop system with Ryzen 7 5700GRyzen 7 5700G  IGP
> > [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638]
> > 
> > Tested with linus tip and just reverting the commit fixes the issue.
> > 
> > Sorry for the brevity, not sure what other details might be relevant.
> > 
> > - Yanko
> > 
> > 
>
Yanko Kaneti Aug. 2, 2024, 9:40 a.m. UTC | #8
Hi, 

So, can't reproduce this any more with with recent rawhide (rc1+).
Tried also with the same old kernels but this time its with newer mesa
and google-chrome (126 -> 127). The same scenario as before now works
ok.

Cheers and sorry for the noise.
- Yanko

On Wed, 2024-07-24 at 10:13 +0300, Yanko Kaneti wrote:
> Hi,
> 
> Well, it starts, then within a second or two it begins stuttering with
> long (half a second/second) freezes of the video , while the audio seems
> to work fine. Nothing in the log from chrome or kernel , AFAICS,  to
> show anything is wrong.
> 
> Regards
> Yanko
> 
> On Wed, 2024-07-24 at 09:02 +0200, Christian König wrote:
> > Hi Yanko,
> > 
> > interesting. What do you mean with "choppy"? E.g. lag on startup?
> > 
> > Regards,
> > Christian.
> > 
> > Am 23.07.24 um 21:42 schrieb Yanko Kaneti:
> > > Hello,
> > > 
> > > Noticed and bisected a rawhide (with the new 6.11-rc0 snapshots) regression to this commit:
> > > 
> > >    0ddd2ae586d2 drm/ttm: increase ttm pre-fault value to PMD size
> > > 
> > > The regression manifests in choppy youtube video playback in google-chrome-stable.
> > >   https://www.youtube.com/watch?v=uOpl2XNOgMA
> > >   google-chrome-stable-126.0.6478.182-1.x86_64
> > >   VP9 video,
> > >   Chrome -> Override software rendering list -> on
> > >   Chrome -> Hardware-accelerated video decode - default enabled
> > > 
> > > No other visible graphics issues.
> > > 
> > > Its a desktop system with Ryzen 7 5700GRyzen 7 5700G  IGP
> > > [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638]
> > > 
> > > Tested with linus tip and just reverting the commit fixes the issue.
> > > 
> > > Sorry for the brevity, not sure what other details might be relevant.
> > > 
> > > - Yanko
> > > 
> > > 
> > 
>
Christian König Aug. 2, 2024, 9:46 a.m. UTC | #9
I'm still scratching my head what the heck this could be.

Increasing the TTM prefault number has minimal more CPU overhead on the 
first access but makes subsequent accesses to the same buffer faster 
(because the buffer is already completely present).

So as long as Chrome didn't wrote some single bytes repeatably on newly 
allocated buffers I don't see how the change could affect video playback 
at all.

Maybe Chrome did exactly that because of a bug or something, but in 
general such an application behavior wouldn't make much sense (except if 
you want to burn CPU cycles).

Anyway not going to look further into that issue.

Thanks,
Christian.

Am 02.08.24 um 11:40 schrieb Yanko Kaneti:
> Hi,
>
> So, can't reproduce this any more with with recent rawhide (rc1+).
> Tried also with the same old kernels but this time its with newer mesa
> and google-chrome (126 -> 127). The same scenario as before now works
> ok.
>
> Cheers and sorry for the noise.
> - Yanko
>
> On Wed, 2024-07-24 at 10:13 +0300, Yanko Kaneti wrote:
>> Hi,
>>
>> Well, it starts, then within a second or two it begins stuttering with
>> long (half a second/second) freezes of the video , while the audio seems
>> to work fine. Nothing in the log from chrome or kernel , AFAICS,  to
>> show anything is wrong.
>>
>> Regards
>> Yanko
>>
>> On Wed, 2024-07-24 at 09:02 +0200, Christian König wrote:
>>> Hi Yanko,
>>>
>>> interesting. What do you mean with "choppy"? E.g. lag on startup?
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 23.07.24 um 21:42 schrieb Yanko Kaneti:
>>>> Hello,
>>>>
>>>> Noticed and bisected a rawhide (with the new 6.11-rc0 snapshots) regression to this commit:
>>>>
>>>>     0ddd2ae586d2 drm/ttm: increase ttm pre-fault value to PMD size
>>>>
>>>> The regression manifests in choppy youtube video playback in google-chrome-stable.
>>>>    https://www.youtube.com/watch?v=uOpl2XNOgMA
>>>>    google-chrome-stable-126.0.6478.182-1.x86_64
>>>>    VP9 video,
>>>>    Chrome -> Override software rendering list -> on
>>>>    Chrome -> Hardware-accelerated video decode - default enabled
>>>>
>>>> No other visible graphics issues.
>>>>
>>>> Its a desktop system with Ryzen 7 5700GRyzen 7 5700G  IGP
>>>> [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638]
>>>>
>>>> Tested with linus tip and just reverting the commit fixes the issue.
>>>>
>>>> Sorry for the brevity, not sure what other details might be relevant.
>>>>
>>>> - Yanko
>>>>
>>>>
John Rowley Aug. 13, 2024, 5:24 p.m. UTC | #10
Apologies for resurrecting an old thread, but I'm experiencing the same issue.

For me it manifests mostly as choppy scrolling in Mullvad Browser (Flatpak),
also Firefox.

E.g. on https://github.com/dandavison/delta, when trackpad flick-scrolling down
the page, it's quite smooth up until the sections headed "Line numbers" and
"Merge conflicts", there the scrolling hangs for maybe half a sec, then
continues onwards.

I've yet to debug further why these sections in particular trip up scrolling.
But it happens on other websites too (not all, i.e. simple pages are fine).

I independently bisected it down to the same commit as Yanko; commit
0ddd2ae586d28e521d37393364d989ce118802e0: drm/ttm: increase ttm pre-fault value
to PMD size.

Reverting this commit on top of mainline master (d74da846046a) fixes the problem
and scrolling is then smooth again.

I'm also using a Ryzen IGP; Radeon 780M (1002:15bf) as part of 7840U. I'm
running Arch Linux with all packages up to date, nothing else really abstract
about my setup.

Please let me know if I can help to investigate further.

On 02/08/2024 10:46, Christian König wrote:
> I'm still scratching my head what the heck this could be.
>
> Increasing the TTM prefault number has minimal more CPU overhead on the first access but makes subsequent accesses to the same buffer faster (because the buffer is already completely present).
>
> So as long as Chrome didn't wrote some single bytes repeatably on newly allocated buffers I don't see how the change could affect video playback at all.
>
> Maybe Chrome did exactly that because of a bug or something, but in general such an application behavior wouldn't make much sense (except if you want to burn CPU cycles).
>
> Anyway not going to look further into that issue.
>
> Thanks,
> Christian.
>
> Am 02.08.24 um 11:40 schrieb Yanko Kaneti:
>> Hi,
>>
>> So, can't reproduce this any more with with recent rawhide (rc1+).
>> Tried also with the same old kernels but this time its with newer mesa
>> and google-chrome (126 -> 127). The same scenario as before now works
>> ok.
>>
>> Cheers and sorry for the noise.
>> - Yanko
>>
>> On Wed, 2024-07-24 at 10:13 +0300, Yanko Kaneti wrote:
>>> Hi,
>>>
>>> Well, it starts, then within a second or two it begins stuttering with
>>> long (half a second/second) freezes of the video , while the audio seems
>>> to work fine. Nothing in the log from chrome or kernel , AFAICS,  to
>>> show anything is wrong.
>>>
>>> Regards
>>> Yanko
>>>
>>> On Wed, 2024-07-24 at 09:02 +0200, Christian König wrote:
>>>> Hi Yanko,
>>>>
>>>> interesting. What do you mean with "choppy"? E.g. lag on startup?
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 23.07.24 um 21:42 schrieb Yanko Kaneti:
>>>>> Hello,
>>>>>
>>>>> Noticed and bisected a rawhide (with the new 6.11-rc0 snapshots) regression to this commit:
>>>>>
>>>>>     0ddd2ae586d2 drm/ttm: increase ttm pre-fault value to PMD size
>>>>>
>>>>> The regression manifests in choppy youtube video playback in google-chrome-stable.
>>>>>    https://www.youtube.com/watch?v=uOpl2XNOgMA
>>>>>    google-chrome-stable-126.0.6478.182-1.x86_64
>>>>>    VP9 video,
>>>>>    Chrome -> Override software rendering list -> on
>>>>>    Chrome -> Hardware-accelerated video decode - default enabled
>>>>>
>>>>> No other visible graphics issues.
>>>>>
>>>>> Its a desktop system with Ryzen 7 5700GRyzen 7 5700G  IGP
>>>>> [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638]
>>>>>
>>>>> Tested with linus tip and just reverting the commit fixes the issue.
>>>>>
>>>>> Sorry for the brevity, not sure what other details might be relevant.
>>>>>
>>>>> - Yanko
>>>>>
>>>>>
>
diff mbox series

Patch

diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index 6ccf96c91f3a..ef0f52f56ebc 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -39,7 +39,11 @@ 
 #include "ttm_device.h"
 
 /* Default number of pre-faulted pages in the TTM fault handler */
+#if CONFIG_PGTABLE_LEVELS > 2
+#define TTM_BO_VM_NUM_PREFAULT (1 << (PMD_SHIFT - PAGE_SHIFT))
+#else
 #define TTM_BO_VM_NUM_PREFAULT 16
+#endif
 
 struct iosys_map;