Message ID | 1596697741-3561-1-git-send-email-yangtiezhu@loongson.cn (mailing list archive) |
---|---|
State | Rejected |
Series | MIPS: Introduce cmdline argument writecombine= |
On 2020/8/6 3:09 PM, Tiezhu Yang wrote:
> Loongson processors have a writecombine issue that may fail to
> write back the framebuffer used with ATI Radeon or AMD GPUs at times.
> After commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
> mapping for MIPS"), there are errors such as a blurred screen
> and lockups.
>
> With this patch, writecombine is disabled by default for Loongson64 to
> work well with ATI Radeon or AMD GPUs. Other platforms are not
> affected, because writecombine remains enabled by default there.
>
> Additionally, if necessary, writecombine=on can be set manually
> in the cmdline to enhance the performance of Loongson LS7A integrated
> graphics in the future.
>
> [ 60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec
> [ 60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0)
> [ 60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec
> [ 60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3)

Hi Tiezhu,

Thanks for your patch.
Personally I haven't had any issues with writecombine on my test systems, but there are some complaints from users about unstable graphics cards. So in general a cmdline writecombine switch is necessary.

>
> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> ---
>  arch/mips/include/asm/pgtable.h |  4 ++++
>  arch/mips/kernel/cpu-probe.c    | 19 +++++++++++++++++++
>  2 files changed, 23 insertions(+)
>
> diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
> index dd7a0f5..34869f7 100644
> --- a/arch/mips/include/asm/pgtable.h
> +++ b/arch/mips/include/asm/pgtable.h
> @@ -473,6 +473,10 @@ static inline pgprot_t pgprot_noncached(pgprot_t _prot)
>  static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
>  {
>  	unsigned long prot = pgprot_val(_prot);
> +	extern bool mips_writecombine;
> +
> +	if (!mips_writecombine)
> +		return pgprot_noncached(_prot);

You can simply override c->writecombine to _CACHE_UNCACHED in cpu-probe.c without this kind of hijack.

>
>  	/* cpu_data[0].writecombine is already shifted by _CACHE_SHIFT */
>  	prot = (prot & ~_CACHE_MASK) | cpu_data[0].writecombine;
> diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
> index e2955f1..98777ca 100644
> --- a/arch/mips/kernel/cpu-probe.c
> +++ b/arch/mips/kernel/cpu-probe.c
> @@ -459,6 +459,25 @@ static int __init ftlb_disable(char *s)
>
>  __setup("noftlb", ftlb_disable);
>
> +#ifdef CONFIG_MACH_LOONGSON64
> +bool mips_writecombine; /* initialise to false by default */
> +#else
> +bool mips_writecombine = true;
> +#endif
> +EXPORT_SYMBOL(mips_writecombine);

There is no need to export this symbol; see the comment above.

> +
> +static int __init writecombine_setup(char *str)
> +{
> +	if (strcmp(str, "on") == 0)
> +		mips_writecombine = true;
> +	else if (strcmp(str, "off") == 0)
> +		mips_writecombine = false;
> +
> +	return 1;
> +}
> +
> +__setup("writecombine=", writecombine_setup);

Using early_param here seems more reasonable; it will be parsed earlier.

> +
>  /*
>   * Check if the CPU has per tc perf counters
>   */

Thanks

- Jiaxun
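The two review suggestions above (override the probed writecombine attribute instead of hooking pgprot_writecombine(), and parse the option early) can be modelled in user space roughly as follows. This is only a sketch under stated assumptions, not kernel code: every `model_*` / `MODEL_*` name and the numeric cache-attribute values are hypothetical stand-ins. In the kernel, the parser would be registered with early_param() and the field would be cpu_data[0].writecombine.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's cache-attribute encodings. */
#define MODEL_CACHE_SHIFT			9
#define MODEL_CACHE_MASK			(7UL << MODEL_CACHE_SHIFT)
#define MODEL_CACHE_UNCACHED			(2UL << MODEL_CACHE_SHIFT)
#define MODEL_CACHE_UNCACHED_ACCELERATED	(7UL << MODEL_CACHE_SHIFT)

/* Hypothetical, reduced model of the per-CPU info structure. */
struct model_cpuinfo {
	unsigned long writecombine;	/* already shifted, as in the patch comment */
};

/* Off by default, mirroring the patch's Loongson64 default. */
static bool model_wc_enabled = false;

/*
 * Models an early_param()-style handler: returns 0 on success and a
 * negative value for an unrecognised argument.
 */
static int model_parse_writecombine(const char *str)
{
	if (!str)
		return -1;
	if (strcmp(str, "on") == 0)
		model_wc_enabled = true;
	else if (strcmp(str, "off") == 0)
		model_wc_enabled = false;
	else
		return -1;
	return 0;
}

/* Models the probe step: downgrade the attribute once, at probe time. */
static void model_cpu_probe(struct model_cpuinfo *c)
{
	c->writecombine = MODEL_CACHE_UNCACHED_ACCELERATED;
	if (!model_wc_enabled)
		c->writecombine = MODEL_CACHE_UNCACHED;
}

/* The mapping helper stays unchanged: no extra flag, no exported symbol. */
static unsigned long model_pgprot_writecombine(const struct model_cpuinfo *c,
					       unsigned long prot)
{
	return (prot & ~MODEL_CACHE_MASK) | c->writecombine;
}
```

With this shape, the decision is taken once during probing, so the inline helper needs no extern variable and nothing has to be exported to modules.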
On 08/06/2020 03:39 PM, Jiaxun Yang wrote:
>
>
> On 2020/8/6 3:09 PM, Tiezhu Yang wrote:
>> Loongson processors have a writecombine issue that may fail to
>> write back the framebuffer used with ATI Radeon or AMD GPUs at times.
>> After commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>> mapping for MIPS"), there are errors such as a blurred screen
>> and lockups.
>>
>> With this patch, writecombine is disabled by default for Loongson64 to
>> work well with ATI Radeon or AMD GPUs. Other platforms are not
>> affected, because writecombine remains enabled by default there.
>>
>> Additionally, if necessary, writecombine=on can be set manually
>> in the cmdline to enhance the performance of Loongson LS7A integrated
>> graphics in the future.
>>
>> [ 60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec
>> [ 60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0)
>> [ 60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec
>> [ 60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3)
> Hi Tiezhu,
>
> Thanks for your patch.
> Personally I haven't had any issues with writecombine on my test
> systems, but there are some complaints from users about unstable
> graphics cards. So in general a cmdline writecombine switch is
> necessary.
>
>>
>> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
>> ---
>>  arch/mips/include/asm/pgtable.h |  4 ++++
>>  arch/mips/kernel/cpu-probe.c    | 19 +++++++++++++++++++
>>  2 files changed, 23 insertions(+)
>>
>> diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
>> index dd7a0f5..34869f7 100644
>> --- a/arch/mips/include/asm/pgtable.h
>> +++ b/arch/mips/include/asm/pgtable.h
>> @@ -473,6 +473,10 @@ static inline pgprot_t pgprot_noncached(pgprot_t _prot)
>>  static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
>>  {
>>  	unsigned long prot = pgprot_val(_prot);
>> +	extern bool mips_writecombine;
>> +
>> +	if (!mips_writecombine)
>> +		return pgprot_noncached(_prot);
>
> You can simply override c->writecombine to _CACHE_UNCACHED in
> cpu-probe.c without this kind of hijack.
>
>>  	/* cpu_data[0].writecombine is already shifted by _CACHE_SHIFT */
>>  	prot = (prot & ~_CACHE_MASK) | cpu_data[0].writecombine;
>> diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
>> index e2955f1..98777ca 100644
>> --- a/arch/mips/kernel/cpu-probe.c
>> +++ b/arch/mips/kernel/cpu-probe.c
>> @@ -459,6 +459,25 @@ static int __init ftlb_disable(char *s)
>>  __setup("noftlb", ftlb_disable);
>> +#ifdef CONFIG_MACH_LOONGSON64
>> +bool mips_writecombine; /* initialise to false by default */
>> +#else
>> +bool mips_writecombine = true;
>> +#endif
>> +EXPORT_SYMBOL(mips_writecombine);
> There is no need to export this symbol; see the comment above.
>> +
>> +static int __init writecombine_setup(char *str)
>> +{
>> +	if (strcmp(str, "on") == 0)
>> +		mips_writecombine = true;
>> +	else if (strcmp(str, "off") == 0)
>> +		mips_writecombine = false;
>> +
>> +	return 1;
>> +}
>> +
>> +__setup("writecombine=", writecombine_setup);
>
> Using early_param here seems more reasonable; it will be parsed earlier.

Hi Jiaxun,

Thanks for your suggestion, it looks better.
I will modify and test it, then send v2 together with a separate documentation patch that explains this cmdline argument.

Thanks,
Tiezhu

>
>> +
>>  /*
>>   * Check if the CPU has per tc perf counters
>>   */
> Thanks
>
> - Jiaxun
On Thu, Aug 06, 2020 at 04:32:13PM +0800, Tiezhu Yang wrote:
> On 08/06/2020 03:39 PM, Jiaxun Yang wrote:
> >
> >
> > On 2020/8/6 3:09 PM, Tiezhu Yang wrote:
> >> Loongson processors have a writecombine issue that may fail to
> >> write back the framebuffer used with ATI Radeon or AMD GPUs at times.
> >> After commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
> >> mapping for MIPS"), there are errors such as a blurred screen
> >> and lockups.
> >>
> >> With this patch, writecombine is disabled by default for Loongson64 to
> >> work well with ATI Radeon or AMD GPUs. Other platforms are not
> >> affected, because writecombine remains enabled by default there.
> >>
> >> Additionally, if necessary, writecombine=on can be set manually
> >> in the cmdline to enhance the performance of Loongson LS7A integrated
> >> graphics in the future.
> >>
> >> [ 60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec
> >> [ 60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0)
> >> [ 60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec
> >> [ 60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3)
> > Hi Tiezhu,
> >
> > Thanks for your patch.
> > Personally I haven't had any issues with writecombine on my test
> > systems, but there are some complaints from users about unstable
> > graphics cards. So in general a cmdline writecombine switch is
> > necessary.

No, if there is hardware which can't work with writecombining enabled, the driver should disable it on its own and not via some user switch. It might even be better to revert the patch that enabled writecombining blindly and add code to enable it for hardware where it works.

Thomas.
On 2020/8/6 6:17 PM, Thomas Bogendoerfer wrote:
> On Thu, Aug 06, 2020 at 04:32:13PM +0800, Tiezhu Yang wrote:
>> On 08/06/2020 03:39 PM, Jiaxun Yang wrote:
>>>
>>> On 2020/8/6 3:09 PM, Tiezhu Yang wrote:
>>>> Loongson processors have a writecombine issue that may fail to
>>>> write back the framebuffer used with ATI Radeon or AMD GPUs at times.
>>>> After commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>>>> mapping for MIPS"), there are errors such as a blurred screen
>>>> and lockups.
>>>>
>>>> With this patch, writecombine is disabled by default for Loongson64 to
>>>> work well with ATI Radeon or AMD GPUs. Other platforms are not
>>>> affected, because writecombine remains enabled by default there.
>>>>
>>>> Additionally, if necessary, writecombine=on can be set manually
>>>> in the cmdline to enhance the performance of Loongson LS7A integrated
>>>> graphics in the future.
>>>>
>>>> [ 60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec
>>>> [ 60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0)
>>>> [ 60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec
>>>> [ 60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3)
>>> Hi Tiezhu,
>>>
>>> Thanks for your patch.
>>> Personally I haven't had any issues with writecombine on my test
>>> systems, but there are some complaints from users about unstable
>>> graphics cards. So in general a cmdline writecombine switch is
>>> necessary.
> No, if there is hardware which can't work with writecombining enabled,
> the driver should disable it on its own and not via some user switch.
> It might even be better to revert the patch that enabled writecombining
> blindly and add code to enable it for hardware where it works.

Our current problem is that Loongson's writecombine implementation seems buggy.
This is a platform issue rather than a target hardware issue.
And we don't even know which hardware is known to be good. The same graphics card behaves differently in different users' hands.

However, turning off writecombine would cause a visible graphics performance regression.

I understand what Tiezhu has in mind. For entry-level users, we don't want to trouble them, so we have writecombine disabled by default. However, for advanced users trying to tweak their systems, we should leave a switch for them to get it back.

Thanks.

- Jiaxun

>
> Thomas.
>
On Thu, Aug 06, 2020 at 07:56:20PM +0800, Jiaxun Yang wrote:
> Our current problem is that Loongson's writecombine implementation seems buggy.
> This is a platform issue rather than a target hardware issue.

OK, so simply clear cpu_data[0].writecombine for the faulty parts.

> And we don't even know which hardware is known to be good. The same graphics
> card behaves differently in different users' hands.

Find out what is broken and add the needed workarounds, then.

> I understand what Tiezhu has in mind. For entry-level users, we don't want to
> trouble them, so we have writecombine disabled by default. However, for
> advanced users trying to tweak their systems, we should leave a switch for
> them to get it back.

IMHO if we do it that way, we end up with millions of knobs for tweaking broken hardware, and nobody knows what exactly is broken. Sorry, I won't go that way.
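As a rough illustration of the alternative proposed in this reply (clear the writecombine attribute only for parts known to be faulty, with no user-visible knob), a probe-time quirk table could look like the user-space sketch below. It rests on loud assumptions: the thread never identifies which parts are faulty, so the part identifiers, the table contents, and all `sketch_*` / `SKETCH_*` names and values are purely hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache-attribute values, for illustration only. */
#define SKETCH_CACHE_UNCACHED		2UL
#define SKETCH_CACHE_WRITECOMBINE	7UL

/* Hypothetical part identifiers; the thread does not name the faulty parts. */
enum sketch_part {
	SKETCH_PART_A,
	SKETCH_PART_B,
	SKETCH_PART_C,
};

/* Hypothetical quirk list: parts whose writecombine is assumed broken. */
static const enum sketch_part sketch_broken_wc_parts[] = {
	SKETCH_PART_A,
};

/* Returns true if the given part appears in the quirk list. */
static bool sketch_part_has_broken_wc(enum sketch_part part)
{
	size_t n = sizeof(sketch_broken_wc_parts) / sizeof(sketch_broken_wc_parts[0]);

	for (size_t i = 0; i < n; i++) {
		if (sketch_broken_wc_parts[i] == part)
			return true;
	}
	return false;
}

/* Chooses the writecombine attribute at probe time from the quirk list. */
static unsigned long sketch_probe_writecombine(enum sketch_part part)
{
	if (sketch_part_has_broken_wc(part))
		return SKETCH_CACHE_UNCACHED;
	return SKETCH_CACHE_WRITECOMBINE;
}
```

The trade-off discussed in the thread is visible here: this approach needs someone to establish which parts belong in the table, whereas a command-line switch shifts that burden to users.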
On 2020/8/7 12:52 AM, Thomas Bogendoerfer wrote:
> On Thu, Aug 06, 2020 at 07:56:20PM +0800, Jiaxun Yang wrote:
>> Our current problem is that Loongson's writecombine implementation seems buggy.
>> This is a platform issue rather than a target hardware issue.
> OK, so simply clear cpu_data[0].writecombine for the faulty parts.

@Tiezhu, I don't know exactly which parts are faulty. Could you please investigate this within Loongson?

I remember a Loongson staff member telling me the issue was solved in GS464E, but there are still users complaining about it.

In fact, I can't reproduce it on any of my test systems:

3B1500 + RS780E + R5 230
3A3000 + RS780E Laptop
3A4000 + LS7A + RX550

>> And we don't even know which hardware is known to be good. The same graphics
>> card behaves differently in different users' hands.
> Find out what is broken and add the needed workarounds, then.

Well, let's leave this task to the Loongson company. The user community doesn't have the ability to trace hardware behavior precisely.

>> I understand what Tiezhu has in mind. For entry-level users, we don't want to
>> trouble them, so we have writecombine disabled by default. However, for
>> advanced users trying to tweak their systems, we should leave a switch for
>> them to get it back.
> IMHO if we do it that way, we end up with millions of knobs for tweaking
> broken hardware, and nobody knows what exactly is broken. Sorry, I won't go
> that way.

Haha, that was my first impression of Linux as a primary school student. It looks just like an aircraft cockpit with thousands of knobs and switches, but to get airborne you only need to control the yoke and throttle.

- Jiaxun
diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index dd7a0f5..34869f7 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -473,6 +473,10 @@ static inline pgprot_t pgprot_noncached(pgprot_t _prot)
 static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
 {
 	unsigned long prot = pgprot_val(_prot);
+	extern bool mips_writecombine;
+
+	if (!mips_writecombine)
+		return pgprot_noncached(_prot);
 
 	/* cpu_data[0].writecombine is already shifted by _CACHE_SHIFT */
 	prot = (prot & ~_CACHE_MASK) | cpu_data[0].writecombine;
diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index e2955f1..98777ca 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -459,6 +459,25 @@ static int __init ftlb_disable(char *s)
 
 __setup("noftlb", ftlb_disable);
 
+#ifdef CONFIG_MACH_LOONGSON64
+bool mips_writecombine; /* initialise to false by default */
+#else
+bool mips_writecombine = true;
+#endif
+EXPORT_SYMBOL(mips_writecombine);
+
+static int __init writecombine_setup(char *str)
+{
+	if (strcmp(str, "on") == 0)
+		mips_writecombine = true;
+	else if (strcmp(str, "off") == 0)
+		mips_writecombine = false;
+
+	return 1;
+}
+
+__setup("writecombine=", writecombine_setup);
+
 /*
  * Check if the CPU has per tc perf counters
  */
Loongson processors have a writecombine issue that may fail to
write back the framebuffer used with ATI Radeon or AMD GPUs at times.
After commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
mapping for MIPS"), there are errors such as a blurred screen
and lockups.

With this patch, writecombine is disabled by default for Loongson64 to
work well with ATI Radeon or AMD GPUs. Other platforms are not
affected, because writecombine remains enabled by default there.

Additionally, if necessary, writecombine=on can be set manually
in the cmdline to enhance the performance of Loongson LS7A integrated
graphics in the future.

[ 60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec
[ 60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0)
[ 60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec
[ 60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3)

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
 arch/mips/include/asm/pgtable.h |  4 ++++
 arch/mips/kernel/cpu-probe.c    | 19 +++++++++++++++++++
 2 files changed, 23 insertions(+)
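For readers who want to trace the behaviour described in this commit message without a MIPS build, the following user-space sketch mirrors the control flow of the patch as posted. It is a model under assumptions, not the kernel code: the `demo_*` / `DEMO_*` names, the `DEMO_LOONGSON64` switch (standing in for CONFIG_MACH_LOONGSON64) and the numeric cache-attribute values are all hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical cache-attribute encodings, for illustration only. */
#define DEMO_CACHE_SHIFT	9
#define DEMO_CACHE_MASK		(7UL << DEMO_CACHE_SHIFT)
#define DEMO_CACHE_UNCACHED	(2UL << DEMO_CACHE_SHIFT)
#define DEMO_CACHE_WC		(7UL << DEMO_CACHE_SHIFT)

/* Hypothetical stand-in for CONFIG_MACH_LOONGSON64; 1 models Loongson64. */
#ifndef DEMO_LOONGSON64
#define DEMO_LOONGSON64 1
#endif

#if DEMO_LOONGSON64
static bool demo_writecombine;		/* false by default, as in the patch */
#else
static bool demo_writecombine = true;
#endif

/*
 * Mirrors writecombine_setup() from the patch: unknown values are
 * silently ignored and the handler always returns 1.
 */
static int demo_writecombine_setup(const char *str)
{
	if (strcmp(str, "on") == 0)
		demo_writecombine = true;
	else if (strcmp(str, "off") == 0)
		demo_writecombine = false;

	return 1;
}

/* Replaces the cache-attribute bits with the uncached encoding. */
static unsigned long demo_pgprot_noncached(unsigned long prot)
{
	return (prot & ~DEMO_CACHE_MASK) | DEMO_CACHE_UNCACHED;
}

/* Mirrors the patched helper: fall back to uncached when the flag is off. */
static unsigned long demo_pgprot_writecombine(unsigned long prot)
{
	if (!demo_writecombine)
		return demo_pgprot_noncached(prot);

	return (prot & ~DEMO_CACHE_MASK) | DEMO_CACHE_WC;
}
```

In this model, a Loongson64 build maps "writecombine" requests as uncached until `writecombine=on` is seen, and an unrecognised value such as `writecombine=bogus` leaves the default untouched.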