Message ID | 20240604084934.225738-1-lingshan.zhu@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [V2] drm/ttm: increase ttm pre-fault value to PMD size | expand |
On Tue, Jun 04, 2024 at 04:49:34PM +0800, Zhu, Lingshan wrote: > ttm page fault handler ttm_bo_vm_fault_reserved() maps > TTM_BO_VM_NUM_PREFAULT more pages beforehand > due to the principle of locality. > > However, on some platform the page faults are more costly, this > patch intends to increase the number of ttm pre-fault to relieve > the number of page faults. > > When multiple levels of page table is supported, the new default > value would be the PMD size, similar to huge page. > > Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> > Reported-and-tested-by: Li Jingxiang <jingxiang.li@ecarxgroup.com> Acked-by: Huang Rui <ray.huang@amd.com> > --- > include/drm/ttm/ttm_bo.h | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h > index 6ccf96c91f3a..ef0f52f56ebc 100644 > --- a/include/drm/ttm/ttm_bo.h > +++ b/include/drm/ttm/ttm_bo.h > @@ -39,7 +39,11 @@ > #include "ttm_device.h" > > /* Default number of pre-faulted pages in the TTM fault handler */ > +#if CONFIG_PGTABLE_LEVELS > 2 > +#define TTM_BO_VM_NUM_PREFAULT (1 << (PMD_SHIFT - PAGE_SHIFT)) > +#else > #define TTM_BO_VM_NUM_PREFAULT 16 > +#endif > > struct iosys_map; > > -- > 2.45.1 >
Am 04.06.24 um 12:18 schrieb Huang Rui: > On Tue, Jun 04, 2024 at 04:49:34PM +0800, Zhu, Lingshan wrote: >> ttm page fault handler ttm_bo_vm_fault_reserved() maps >> TTM_BO_VM_NUM_PREFAULT more pages beforehand >> due to the principle of locality. >> >> However, on some platform the page faults are more costly, this >> patch intends to increase the number of ttm pre-fault to relieve >> the number of page faults. >> >> When multiple levels of page table is supported, the new default >> value would be the PMD size, similar to huge page. >> >> Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> >> Reported-and-tested-by: Li Jingxiang <jingxiang.li@ecarxgroup.com> > Acked-by: Huang Rui <ray.huang@amd.com> Not sure if there really is an architecture with less than 3 page table levels, but the build robots should be able to tell us if everything is fine here. Reviewed-by: Christian König <christian.koenig@amd.com> > >> --- >> include/drm/ttm/ttm_bo.h | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h >> index 6ccf96c91f3a..ef0f52f56ebc 100644 >> --- a/include/drm/ttm/ttm_bo.h >> +++ b/include/drm/ttm/ttm_bo.h >> @@ -39,7 +39,11 @@ >> #include "ttm_device.h" >> >> /* Default number of pre-faulted pages in the TTM fault handler */ >> +#if CONFIG_PGTABLE_LEVELS > 2 >> +#define TTM_BO_VM_NUM_PREFAULT (1 << (PMD_SHIFT - PAGE_SHIFT)) >> +#else >> #define TTM_BO_VM_NUM_PREFAULT 16 >> +#endif >> >> struct iosys_map; >> >> -- >> 2.45.1 >>
On Tue, Jun 4, 2024 at 10:02 AM Christian König <christian.koenig@amd.com> wrote: > > Am 04.06.24 um 12:18 schrieb Huang Rui: > > On Tue, Jun 04, 2024 at 04:49:34PM +0800, Zhu, Lingshan wrote: > >> ttm page fault handler ttm_bo_vm_fault_reserved() maps > >> TTM_BO_VM_NUM_PREFAULT more pages beforehand > >> due to the principle of locality. > >> > >> However, on some platform the page faults are more costly, this > >> patch intends to increase the number of ttm pre-fault to relieve > >> the number of page faults. > >> > >> When multiple levels of page table is supported, the new default > >> value would be the PMD size, similar to huge page. > >> > >> Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> > >> Reported-and-tested-by: Li Jingxiang <jingxiang.li@ecarxgroup.com> > > Acked-by: Huang Rui <ray.huang@amd.com> > > Not sure if there really is an architecture with less than 3 page table > levels, but the build robots should be able to tell us if everything is > fine here. > > Reviewed-by: Christian König <christian.koenig@amd.com> Has this been pushed to drm-misc? If not, can you push it? Thanks, Alex > > > > >> --- > >> include/drm/ttm/ttm_bo.h | 4 ++++ > >> 1 file changed, 4 insertions(+) > >> > >> diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h > >> index 6ccf96c91f3a..ef0f52f56ebc 100644 > >> --- a/include/drm/ttm/ttm_bo.h > >> +++ b/include/drm/ttm/ttm_bo.h > >> @@ -39,7 +39,11 @@ > >> #include "ttm_device.h" > >> > >> /* Default number of pre-faulted pages in the TTM fault handler */ > >> +#if CONFIG_PGTABLE_LEVELS > 2 > >> +#define TTM_BO_VM_NUM_PREFAULT (1 << (PMD_SHIFT - PAGE_SHIFT)) > >> +#else > >> #define TTM_BO_VM_NUM_PREFAULT 16 > >> +#endif > >> > >> struct iosys_map; > >> > >> -- > >> 2.45.1 > >> >
Am 19.06.24 um 18:09 schrieb Alex Deucher: > On Tue, Jun 4, 2024 at 10:02 AM Christian König > <christian.koenig@amd.com> wrote: >> Am 04.06.24 um 12:18 schrieb Huang Rui: >>> On Tue, Jun 04, 2024 at 04:49:34PM +0800, Zhu, Lingshan wrote: >>>> ttm page fault handler ttm_bo_vm_fault_reserved() maps >>>> TTM_BO_VM_NUM_PREFAULT more pages beforehand >>>> due to the principle of locality. >>>> >>>> However, on some platform the page faults are more costly, this >>>> patch intends to increase the number of ttm pre-fault to relieve >>>> the number of page faults. >>>> >>>> When multiple levels of page table is supported, the new default >>>> value would be the PMD size, similar to huge page. >>>> >>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> >>>> Reported-and-tested-by: Li Jingxiang <jingxiang.li@ecarxgroup.com> >>> Acked-by: Huang Rui <ray.huang@amd.com> >> Not sure if there really is an architecture with less than 3 page table >> levels, but the build robots should be able to tell us if everything is >> fine here. >> >> Reviewed-by: Christian König <christian.koenig@amd.com> > Has this been pushed to drm-misc? If not, can you push it? Done. Christian. > > Thanks, > > Alex > >>>> --- >>>> include/drm/ttm/ttm_bo.h | 4 ++++ >>>> 1 file changed, 4 insertions(+) >>>> >>>> diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h >>>> index 6ccf96c91f3a..ef0f52f56ebc 100644 >>>> --- a/include/drm/ttm/ttm_bo.h >>>> +++ b/include/drm/ttm/ttm_bo.h >>>> @@ -39,7 +39,11 @@ >>>> #include "ttm_device.h" >>>> >>>> /* Default number of pre-faulted pages in the TTM fault handler */ >>>> +#if CONFIG_PGTABLE_LEVELS > 2 >>>> +#define TTM_BO_VM_NUM_PREFAULT (1 << (PMD_SHIFT - PAGE_SHIFT)) >>>> +#else >>>> #define TTM_BO_VM_NUM_PREFAULT 16 >>>> +#endif >>>> >>>> struct iosys_map; >>>> >>>> -- >>>> 2.45.1 >>>>
Hello, Noticed and bisected a rawhide (with the new 6.11-rc0 snapshots) regression to this commit: 0ddd2ae586d2 drm/ttm: increase ttm pre-fault value to PMD size The regression manifests in choppy youtube video playback in google-chrome-stable. https://www.youtube.com/watch?v=uOpl2XNOgMA google-chrome-stable-126.0.6478.182-1.x86_64 VP9 video, Chrome -> Override software rendering list -> on Chrome -> Hardware-accelerated video decode - default enabled No other visible graphics issues. Its a desktop system with Ryzen 7 5700GRyzen 7 5700G IGP [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638] Tested with linus tip and just reverting the commit fixes the issue. Sorry for the brevity, not sure what other details might be relevant. - Yanko
Hi Yanko, interesting. What do you mean with "choppy"? E.g. lag on startup? Regards, Christian. Am 23.07.24 um 21:42 schrieb Yanko Kaneti: > Hello, > > Noticed and bisected a rawhide (with the new 6.11-rc0 snapshots) regression to this commit: > > 0ddd2ae586d2 drm/ttm: increase ttm pre-fault value to PMD size > > The regression manifests in choppy youtube video playback in google-chrome-stable. > https://www.youtube.com/watch?v=uOpl2XNOgMA > google-chrome-stable-126.0.6478.182-1.x86_64 > VP9 video, > Chrome -> Override software rendering list -> on > Chrome -> Hardware-accelerated video decode - default enabled > > No other visible graphics issues. > > Its a desktop system with Ryzen 7 5700GRyzen 7 5700G IGP > [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638] > > Tested with linus tip and just reverting the commit fixes the issue. > > Sorry for the brevity, not sure what other details might be relevant. > > - Yanko > >
Hi, Well, it starts, then within a second or two it begins stuttering with long (half a second/second) freezes of the video , while the audio seems to work fine. Nothing in the log from chrome or kernel , AFAICS, to show anything is wrong. Regards Yanko On Wed, 2024-07-24 at 09:02 +0200, Christian König wrote: > Hi Yanko, > > interesting. What do you mean with "choppy"? E.g. lag on startup? > > Regards, > Christian. > > Am 23.07.24 um 21:42 schrieb Yanko Kaneti: > > Hello, > > > > Noticed and bisected a rawhide (with the new 6.11-rc0 snapshots) regression to this commit: > > > > 0ddd2ae586d2 drm/ttm: increase ttm pre-fault value to PMD size > > > > The regression manifests in choppy youtube video playback in google-chrome-stable. > > https://www.youtube.com/watch?v=uOpl2XNOgMA > > google-chrome-stable-126.0.6478.182-1.x86_64 > > VP9 video, > > Chrome -> Override software rendering list -> on > > Chrome -> Hardware-accelerated video decode - default enabled > > > > No other visible graphics issues. > > > > Its a desktop system with Ryzen 7 5700GRyzen 7 5700G IGP > > [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638] > > > > Tested with linus tip and just reverting the commit fixes the issue. > > > > Sorry for the brevity, not sure what other details might be relevant. > > > > - Yanko > > > > >
Hi, So, can't reproduce this any more with with recent rawhide (rc1+). Tried also with the same old kernels but this time its with newer mesa and google-chrome (126 -> 127). The same scenario as before now works ok. Cheers and sorry for the noise. - Yanko On Wed, 2024-07-24 at 10:13 +0300, Yanko Kaneti wrote: > Hi, > > Well, it starts, then within a second or two it begins stuttering with > long (half a second/second) freezes of the video , while the audio seems > to work fine. Nothing in the log from chrome or kernel , AFAICS, to > show anything is wrong. > > Regards > Yanko > > On Wed, 2024-07-24 at 09:02 +0200, Christian König wrote: > > Hi Yanko, > > > > interesting. What do you mean with "choppy"? E.g. lag on startup? > > > > Regards, > > Christian. > > > > Am 23.07.24 um 21:42 schrieb Yanko Kaneti: > > > Hello, > > > > > > Noticed and bisected a rawhide (with the new 6.11-rc0 snapshots) regression to this commit: > > > > > > 0ddd2ae586d2 drm/ttm: increase ttm pre-fault value to PMD size > > > > > > The regression manifests in choppy youtube video playback in google-chrome-stable. > > > https://www.youtube.com/watch?v=uOpl2XNOgMA > > > google-chrome-stable-126.0.6478.182-1.x86_64 > > > VP9 video, > > > Chrome -> Override software rendering list -> on > > > Chrome -> Hardware-accelerated video decode - default enabled > > > > > > No other visible graphics issues. > > > > > > Its a desktop system with Ryzen 7 5700GRyzen 7 5700G IGP > > > [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638] > > > > > > Tested with linus tip and just reverting the commit fixes the issue. > > > > > > Sorry for the brevity, not sure what other details might be relevant. > > > > > > - Yanko > > > > > > > > >
I'm still scratching my head what the heck this could be. Increasing the TTM prefault number has minimal more CPU overhead on the first access but makes subsequent accesses to the same buffer faster (because the buffer is already completely present). So as long as Chrome didn't wrote some single bytes repeatably on newly allocated buffers I don't see how the change could affect video playback at all. Maybe Chrome did exactly that because of a bug or something, but in general such an application behavior wouldn't make much sense (except if you want to burn CPU cycles). Anyway not going to look further into that issue. Thanks, Christian. Am 02.08.24 um 11:40 schrieb Yanko Kaneti: > Hi, > > So, can't reproduce this any more with with recent rawhide (rc1+). > Tried also with the same old kernels but this time its with newer mesa > and google-chrome (126 -> 127). The same scenario as before now works > ok. > > Cheers and sorry for the noise. > - Yanko > > On Wed, 2024-07-24 at 10:13 +0300, Yanko Kaneti wrote: >> Hi, >> >> Well, it starts, then within a second or two it begins stuttering with >> long (half a second/second) freezes of the video , while the audio seems >> to work fine. Nothing in the log from chrome or kernel , AFAICS, to >> show anything is wrong. >> >> Regards >> Yanko >> >> On Wed, 2024-07-24 at 09:02 +0200, Christian König wrote: >>> Hi Yanko, >>> >>> interesting. What do you mean with "choppy"? E.g. lag on startup? >>> >>> Regards, >>> Christian. >>> >>> Am 23.07.24 um 21:42 schrieb Yanko Kaneti: >>>> Hello, >>>> >>>> Noticed and bisected a rawhide (with the new 6.11-rc0 snapshots) regression to this commit: >>>> >>>> 0ddd2ae586d2 drm/ttm: increase ttm pre-fault value to PMD size >>>> >>>> The regression manifests in choppy youtube video playback in google-chrome-stable. >>>> https://www.youtube.com/watch?v=uOpl2XNOgMA >>>> google-chrome-stable-126.0.6478.182-1.x86_64 >>>> VP9 video, >>>> Chrome -> Override software rendering list -> on >>>> Chrome -> Hardware-accelerated video decode - default enabled >>>> >>>> No other visible graphics issues. >>>> >>>> Its a desktop system with Ryzen 7 5700GRyzen 7 5700G IGP >>>> [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638] >>>> >>>> Tested with linus tip and just reverting the commit fixes the issue. >>>> >>>> Sorry for the brevity, not sure what other details might be relevant. >>>> >>>> - Yanko >>>> >>>>
Apologies for resurrecting an old thread, but I'm experiencing the same issue. For me it manifests mostly as choppy scrolling in Mullvad Browser (Flatpak), also Firefox. E.g. on https://github.com/dandavison/delta, when trackpad flick-scrolling down the page, it's quite smooth up until the sections headed "Line numbers" and "Merge conflicts", there the scrolling hangs for maybe half a sec, then continues onwards. I've yet to debug further why these sections in particular trip up scrolling. But it happens on other websites too (not all, i.e. simple pages are fine). I independently bisected it down to the same commit as Yanko; commit 0ddd2ae586d28e521d37393364d989ce118802e0: drm/ttm: increase ttm pre-fault value to PMD size. Reverting this commit on top of mainline master (d74da846046a) fixes the problem and scrolling is then smooth again. I'm also using a Ryzen IGP; Radeon 780M (1002:15bf) as part of 7840U. I'm running Arch Linux with all packages up to date, nothing else really abstract about my setup. Please let me know if I can help to investigate further. On 02/08/2024 10:46, Christian König wrote: > I'm still scratching my head what the heck this could be. > > Increasing the TTM prefault number has minimal more CPU overhead on the first access but makes subsequent accesses to the same buffer faster (because the buffer is already completely present). > > So as long as Chrome didn't wrote some single bytes repeatably on newly allocated buffers I don't see how the change could affect video playback at all. > > Maybe Chrome did exactly that because of a bug or something, but in general such an application behavior wouldn't make much sense (except if you want to burn CPU cycles). > > Anyway not going to look further into that issue. > > Thanks, > Christian. > > Am 02.08.24 um 11:40 schrieb Yanko Kaneti: >> Hi, >> >> So, can't reproduce this any more with with recent rawhide (rc1+). >> Tried also with the same old kernels but this time its with newer mesa >> and google-chrome (126 -> 127). The same scenario as before now works >> ok. >> >> Cheers and sorry for the noise. >> - Yanko >> >> On Wed, 2024-07-24 at 10:13 +0300, Yanko Kaneti wrote: >>> Hi, >>> >>> Well, it starts, then within a second or two it begins stuttering with >>> long (half a second/second) freezes of the video , while the audio seems >>> to work fine. Nothing in the log from chrome or kernel , AFAICS, to >>> show anything is wrong. >>> >>> Regards >>> Yanko >>> >>> On Wed, 2024-07-24 at 09:02 +0200, Christian König wrote: >>>> Hi Yanko, >>>> >>>> interesting. What do you mean with "choppy"? E.g. lag on startup? >>>> >>>> Regards, >>>> Christian. >>>> >>>> Am 23.07.24 um 21:42 schrieb Yanko Kaneti: >>>>> Hello, >>>>> >>>>> Noticed and bisected a rawhide (with the new 6.11-rc0 snapshots) regression to this commit: >>>>> >>>>> 0ddd2ae586d2 drm/ttm: increase ttm pre-fault value to PMD size >>>>> >>>>> The regression manifests in choppy youtube video playback in google-chrome-stable. >>>>> https://www.youtube.com/watch?v=uOpl2XNOgMA >>>>> google-chrome-stable-126.0.6478.182-1.x86_64 >>>>> VP9 video, >>>>> Chrome -> Override software rendering list -> on >>>>> Chrome -> Hardware-accelerated video decode - default enabled >>>>> >>>>> No other visible graphics issues. >>>>> >>>>> Its a desktop system with Ryzen 7 5700GRyzen 7 5700G IGP >>>>> [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638] >>>>> >>>>> Tested with linus tip and just reverting the commit fixes the issue. >>>>> >>>>> Sorry for the brevity, not sure what other details might be relevant. >>>>> >>>>> - Yanko >>>>> >>>>> >
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h index 6ccf96c91f3a..ef0f52f56ebc 100644 --- a/include/drm/ttm/ttm_bo.h +++ b/include/drm/ttm/ttm_bo.h @@ -39,7 +39,11 @@ #include "ttm_device.h" /* Default number of pre-faulted pages in the TTM fault handler */ +#if CONFIG_PGTABLE_LEVELS > 2 +#define TTM_BO_VM_NUM_PREFAULT (1 << (PMD_SHIFT - PAGE_SHIFT)) +#else #define TTM_BO_VM_NUM_PREFAULT 16 +#endif struct iosys_map;
ttm page fault handler ttm_bo_vm_fault_reserved() maps TTM_BO_VM_NUM_PREFAULT more pages beforehand due to the principle of locality. However, on some platform the page faults are more costly, this patch intends to increase the number of ttm pre-fault to relieve the number of page faults. When multiple levels of page table is supported, the new default value would be the PMD size, similar to huge page. Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reported-and-tested-by: Li Jingxiang <jingxiang.li@ecarxgroup.com> --- include/drm/ttm/ttm_bo.h | 4 ++++ 1 file changed, 4 insertions(+)