diff mbox series

MIPS: Introduce cmdline argument writecombine=

Message ID 1596697741-3561-1-git-send-email-yangtiezhu@loongson.cn (mailing list archive)
State Rejected
Series MIPS: Introduce cmdline argument writecombine=

Commit Message

Tiezhu Yang Aug. 6, 2020, 7:09 a.m. UTC
Loongson processors have a writecombine issue: writes to a framebuffer
used with an ATI Radeon or AMD GPU may at times fail to be written back.
Since commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
mapping for MIPS"), this has caused errors such as a blurred screen
and GPU lockups.

With this patch, writecombine is disabled by default for Loongson64 so
that it works well with ATI Radeon or AMD GPUs. Other platforms are not
affected, because writecombine stays enabled by default there.

Additionally, if necessary, writecombine=on can be set manually on the
cmdline to improve performance for Loongson LS7A integrated graphics
in the future.

[   60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec
[   60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0)
[   60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec
[   60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3)

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
 arch/mips/include/asm/pgtable.h |  4 ++++
 arch/mips/kernel/cpu-probe.c    | 19 +++++++++++++++++++
 2 files changed, 23 insertions(+)

Comments

Jiaxun Yang Aug. 6, 2020, 7:39 a.m. UTC | #1
On 2020/8/6 at 3:09 PM, Tiezhu Yang wrote:
> Loongson processors have a writecombine issue that maybe failed to
> write back framebuffer used with ATI Radeon or AMD GPU at times,
> after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
> mapping for MIPS"), there exists some errors such as blurred screen
> and lockup, and so on.
>
> With this patch, disable writecombine by default for Loongson64 to
> work well with ATI Radeon or AMD GPU, and it has no influence on the
> other platforms due to writecombine is enabled by default.
>
> Additionally, if it is necessary, writecombine=on can be set manually
> in the cmdline to enhance the performance for Loongson LS7A integrated
> graphics in the future.
>
> [   60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec
> [   60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0)
> [   60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec
> [   60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3)
Hi Tiezhu,

Thanks for your patch.
Personally, I haven't had any issues with writecombine on my test systems,
but there are some complaints from users about unstable graphics cards.
So in general, a cmdline writecombine switch is necessary.

>
> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> ---
>   arch/mips/include/asm/pgtable.h |  4 ++++
>   arch/mips/kernel/cpu-probe.c    | 19 +++++++++++++++++++
>   2 files changed, 23 insertions(+)
>
> diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
> index dd7a0f5..34869f7 100644
> --- a/arch/mips/include/asm/pgtable.h
> +++ b/arch/mips/include/asm/pgtable.h
> @@ -473,6 +473,10 @@ static inline pgprot_t pgprot_noncached(pgprot_t _prot)
>   static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
>   {
>   	unsigned long prot = pgprot_val(_prot);
> +	extern bool mips_writecombine;
> +
> +	if (!mips_writecombine)
> +		return pgprot_noncached(_prot);

You can simply override c->writecombine with _CACHE_UNCACHED in
cpu-probe.c, without this kind of hijack.
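
A minimal userspace sketch of this suggestion, not kernel code: if the
probe step stores the uncached attribute in the per-CPU writecombine
field, the existing mask-and-or logic in pgprot_writecombine() yields an
uncached mapping with no extra global flag. All demo_* and DEMO_* names
and the numeric values are assumptions made for illustration; the real
definitions live in the MIPS headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins only; not the real MIPS header values. */
#define DEMO_CACHE_SHIFT                 9
#define DEMO_CACHE_MASK                  (7UL << DEMO_CACHE_SHIFT)
#define DEMO_CACHE_UNCACHED              (2UL << DEMO_CACHE_SHIFT)
#define DEMO_CACHE_UNCACHED_ACCELERATED  (7UL << DEMO_CACHE_SHIFT)

/* Hypothetical model of the one field of cpu_data[] that matters here. */
struct demo_cpuinfo {
	unsigned long writecombine;	/* already shifted by DEMO_CACHE_SHIFT */
};

/* Probe step: choose the attribute once, at CPU probe time. */
static void demo_cpu_probe(struct demo_cpuinfo *c, bool wc_broken)
{
	assert(c != NULL);
	c->writecombine = wc_broken ? DEMO_CACHE_UNCACHED
				    : DEMO_CACHE_UNCACHED_ACCELERATED;
}

/* Same mask-and-or shape as pgprot_writecombine(), with no extra check. */
static unsigned long demo_pgprot_writecombine(const struct demo_cpuinfo *c,
					      unsigned long prot)
{
	assert(c != NULL);
	return (prot & ~DEMO_CACHE_MASK) | c->writecombine;
}
```

With this shape the header stays untouched and no symbol needs to be
exported, because the decision is carried by data that
pgprot_writecombine() already reads.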

>   
>   	/* cpu_data[0].writecombine is already shifted by _CACHE_SHIFT */
>   	prot = (prot & ~_CACHE_MASK) | cpu_data[0].writecombine;
> diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
> index e2955f1..98777ca 100644
> --- a/arch/mips/kernel/cpu-probe.c
> +++ b/arch/mips/kernel/cpu-probe.c
> @@ -459,6 +459,25 @@ static int __init ftlb_disable(char *s)
>   
>   __setup("noftlb", ftlb_disable);
>   
> +#ifdef CONFIG_MACH_LOONGSON64
> +bool mips_writecombine; /* initialise to false by default */
> +#else
> +bool mips_writecombine = true;
> +#endif
> +EXPORT_SYMBOL(mips_writecombine);
There is no need to export this symbol; see the comment above.
> +
> +static int __init writecombine_setup(char *str)
> +{
> +	if (strcmp(str, "on") == 0)
> +		mips_writecombine = true;
> +	else if (strcmp(str, "off") == 0)
> +		mips_writecombine = false;
> +
> +	return 1;
> +}
> +
> +__setup("writecombine=", writecombine_setup);

Using early_param here seems more reasonable; it will be parsed earlier.
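
A rough userspace model of the parsing step, assuming the handler were
converted: unlike the version in the patch, it reports unrecognised
values instead of silently accepting them. The function name
demo_parse_writecombine is hypothetical. In the kernel, such a handler
would presumably be registered as early_param("writecombine", ...) and
return 0 on success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Model of a "writecombine=" handler. Returns 0 when the value is
 * recognised and -1 otherwise; *enabled is left untouched on error.
 */
static int demo_parse_writecombine(const char *str, bool *enabled)
{
	assert(enabled != NULL);

	if (str == NULL)
		return -1;

	if (strcmp(str, "on") == 0)
		*enabled = true;
	else if (strcmp(str, "off") == 0)
		*enabled = false;
	else
		return -1;

	return 0;
}
```

Returning an error for anything other than "on" or "off" keeps the
platform default in place when the user mistypes the option.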

> +
>   /*
>    * Check if the CPU has per tc perf counters
>    */
Thanks

- Jiaxun
Tiezhu Yang Aug. 6, 2020, 8:32 a.m. UTC | #2
On 08/06/2020 03:39 PM, Jiaxun Yang wrote:
>
>
> On 2020/8/6 at 3:09 PM, Tiezhu Yang wrote:
>> Loongson processors have a writecombine issue that maybe failed to
>> write back framebuffer used with ATI Radeon or AMD GPU at times,
>> after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>> mapping for MIPS"), there exists some errors such as blurred screen
>> and lockup, and so on.
>>
>> With this patch, disable writecombine by default for Loongson64 to
>> work well with ATI Radeon or AMD GPU, and it has no influence on the
>> other platforms due to writecombine is enabled by default.
>>
>> Additionally, if it is necessary, writecombine=on can be set manually
>> in the cmdline to enhance the performance for Loongson LS7A integrated
>> graphics in the future.
>>
>> [   60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec
>> [   60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0)
>> [   60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec
>> [   60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3)
> Hi Tiezhu,
>
> Thanks for your patch.
> Personally I didn't have any issue with writecombine on my test 
> systems, but there
> are some complains about unstable graphic card from users. So 
> generally a cmdline
> writecombine switch is necessary.
>
>>
>> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
>> ---
>>   arch/mips/include/asm/pgtable.h |  4 ++++
>>   arch/mips/kernel/cpu-probe.c    | 19 +++++++++++++++++++
>>   2 files changed, 23 insertions(+)
>>
>> diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
>> index dd7a0f5..34869f7 100644
>> --- a/arch/mips/include/asm/pgtable.h
>> +++ b/arch/mips/include/asm/pgtable.h
>> @@ -473,6 +473,10 @@ static inline pgprot_t pgprot_noncached(pgprot_t _prot)
>>   static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
>>   {
>>       unsigned long prot = pgprot_val(_prot);
>> +    extern bool mips_writecombine;
>> +
>> +    if (!mips_writecombine)
>> +        return pgprot_noncached(_prot);
>
> You can simply override c->writecombine to _CACHE_UNCACHED in 
> cpu-probe.c with
> out this kind of hijack.
>
>>       /* cpu_data[0].writecombine is already shifted by _CACHE_SHIFT */
>>       prot = (prot & ~_CACHE_MASK) | cpu_data[0].writecombine;
>> diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
>> index e2955f1..98777ca 100644
>> --- a/arch/mips/kernel/cpu-probe.c
>> +++ b/arch/mips/kernel/cpu-probe.c
>> @@ -459,6 +459,25 @@ static int __init ftlb_disable(char *s)
>>   __setup("noftlb", ftlb_disable);
>>
>> +#ifdef CONFIG_MACH_LOONGSON64
>> +bool mips_writecombine; /* initialise to false by default */
>> +#else
>> +bool mips_writecombine = true;
>> +#endif
>> +EXPORT_SYMBOL(mips_writecombine);
> There is no need to export this symbol, see comment before.
>> +
>> +static int __init writecombine_setup(char *str)
>> +{
>> +    if (strcmp(str, "on") == 0)
>> +        mips_writecombine = true;
>> +    else if (strcmp(str, "off") == 0)
>> +        mips_writecombine = false;
>> +
>> +    return 1;
>> +}
>> +
>> +__setup("writecombine=", writecombine_setup);
>
> Use early_param here seems more reasonable, it will be probed earlier.

Hi Jiaxun,

Thanks for your suggestion, it looks better.

I will modify and test it, then send v2 together with a separate
documentation patch that explains this cmdline argument.

Thanks,
Tiezhu

>
>> +
>>   /*
>>    * Check if the CPU has per tc perf counters
>>    */
> Thanks
>
> - Jiaxun
Thomas Bogendoerfer Aug. 6, 2020, 10:17 a.m. UTC | #3
On Thu, Aug 06, 2020 at 04:32:13PM +0800, Tiezhu Yang wrote:
> On 08/06/2020 03:39 PM, Jiaxun Yang wrote:
> >
> >
> >On 2020/8/6 at 3:09 PM, Tiezhu Yang wrote:
> >>Loongson processors have a writecombine issue that maybe failed to
> >>write back framebuffer used with ATI Radeon or AMD GPU at times,
> >>after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
> >>mapping for MIPS"), there exists some errors such as blurred screen
> >>and lockup, and so on.
> >>
> >>With this patch, disable writecombine by default for Loongson64 to
> >>work well with ATI Radeon or AMD GPU, and it has no influence on the
> >>other platforms due to writecombine is enabled by default.
> >>
> >>Additionally, if it is necessary, writecombine=on can be set manually
> >>in the cmdline to enhance the performance for Loongson LS7A integrated
> >>graphics in the future.
> >>
> >>[   60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec
> >>[   60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0)
> >>[   60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec
> >>[   60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3)
> >Hi Tiezhu,
> >
> >Thanks for your patch.
> >Personally I didn't have any issue with writecombine on my test systems,
> >but there
> >are some complains about unstable graphic card from users. So generally a
> >cmdline
> >writecombine switch is necessary.

No, if there is hardware which can't work with writecombining enabled,
the driver should disable it on its own and not via some user switch.
It might even be better to revert the patch that blindly enabled
writecombining and add code to enable it for hardware where it works.

Thomas.
Jiaxun Yang Aug. 6, 2020, 11:56 a.m. UTC | #4
On 2020/8/6 at 6:17 PM, Thomas Bogendoerfer wrote:
> On Thu, Aug 06, 2020 at 04:32:13PM +0800, Tiezhu Yang wrote:
>> On 08/06/2020 03:39 PM, Jiaxun Yang wrote:
>>>
>>> On 2020/8/6 at 3:09 PM, Tiezhu Yang wrote:
>>>> Loongson processors have a writecombine issue that maybe failed to
>>>> write back framebuffer used with ATI Radeon or AMD GPU at times,
>>>> after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>>>> mapping for MIPS"), there exists some errors such as blurred screen
>>>> and lockup, and so on.
>>>>
>>>> With this patch, disable writecombine by default for Loongson64 to
>>>> work well with ATI Radeon or AMD GPU, and it has no influence on the
>>>> other platforms due to writecombine is enabled by default.
>>>>
>>>> Additionally, if it is necessary, writecombine=on can be set manually
>>>> in the cmdline to enhance the performance for Loongson LS7A integrated
>>>> graphics in the future.
>>>>
>>>> [   60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec
>>>> [   60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0)
>>>> [   60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec
>>>> [   60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3)
>>> Hi Tiezhu,
>>>
>>> Thanks for your patch.
>>> Personally I didn't have any issue with writecombine on my test systems,
>>> but there
>>> are some complains about unstable graphic card from users. So generally a
>>> cmdline
>>> writecombine switch is necessary.
> no, if there is hardware which can't work with writecombining enabled
> the driver should disable it by it's own and not by some user switch.
> It might even be better to revert the patch enabling writecombining
> blindly and add code to enable it for hardware where it works.

Our current problem is that Loongson's writecombine implementation seems
buggy. This is an issue with our platform rather than with the target
hardware. And we don't even know which hardware is known to be good. The
same graphics card behaves differently in different users' hands.

However, turning off writecombine would cause a visible regression in
graphics performance.

I understand what Tiezhu had in mind. For entry-level users, we don't want
to trouble them, so we have writecombine disabled by default. However, for
advanced users trying to tweak their systems, we should leave a switch so
they can get it back.

Thanks.

- Jiaxun

>
> Thomas.
>
Thomas Bogendoerfer Aug. 6, 2020, 4:52 p.m. UTC | #5
On Thu, Aug 06, 2020 at 07:56:20PM +0800, Jiaxun Yang wrote:
> Our current problem is Loongson's writecombine implementation seems buggy.
> This is our platform issue rather than target hardware issue.

ok, so simply clear cpu_data[0].writecombine for the faulty parts
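
A hedged userspace sketch of what such a per-part workaround could look
like: a small table of affected parts is consulted once at probe time,
and only those parts fall back to uncached mappings. The part
identifiers below are placeholders, since the thread does not establish
which parts are faulty, and every demo_* and DEMO_* name and value is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder part identifiers; these are NOT real processor IDs. */
enum demo_part { DEMO_PART_A, DEMO_PART_B, DEMO_PART_C };

/* Illustrative attribute values, not the real MIPS header constants. */
#define DEMO_WC_UNCACHED     0x2UL
#define DEMO_WC_ACCELERATED  0x7UL

struct demo_quirk_cpu {
	enum demo_part part;
	unsigned long writecombine;
};

/* Hypothetical list of parts with a broken writecombine path. */
static const enum demo_part demo_broken_wc_parts[] = {
	DEMO_PART_A,
	DEMO_PART_B,
};

static bool demo_part_has_broken_wc(enum demo_part part)
{
	size_t n = sizeof(demo_broken_wc_parts) / sizeof(demo_broken_wc_parts[0]);
	size_t i;

	for (i = 0; i < n; i++)
		if (demo_broken_wc_parts[i] == part)
			return true;
	return false;
}

/* Downgrade to uncached only on listed parts; leave the others alone. */
static void demo_apply_wc_quirk(struct demo_quirk_cpu *c)
{
	assert(c != NULL);
	if (demo_part_has_broken_wc(c->part))
		c->writecombine = DEMO_WC_UNCACHED;
}
```

The table makes the set of affected hardware explicit in the source,
which is the opposite of leaving the choice to a user-facing knob.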

> And we don't even know which hardware is known to be good. The same graphic
> card became a different story on different user's hand.

find out what is broken and add the needed workarounds then.

> I understood what Teizhu thought. For entry-level users, we don't want to
> trouble
> them, so we have writecombine disabled by default. However, for advanced
> user
> trying to tweak their system, we should leave a switch for them to get it
> back.

IMHO if we do it that way, we end up with millions of knobs for tweaking
broken hardware, and nobody knows what's exactly broken. Sorry I won't go
that way.
Jiaxun Yang Aug. 6, 2020, 6:26 p.m. UTC | #6
On 2020/8/7 at 12:52 AM, Thomas Bogendoerfer wrote:
> On Thu, Aug 06, 2020 at 07:56:20PM +0800, Jiaxun Yang wrote:
>> Our current problem is Loongson's writecombine implementation seems buggy.
>> This is our platform issue rather than target hardware issue.
> ok, so simply clear cpu_data[0].writecombine for the fauly parts
@Tiezhu,

I don't know exactly which parts are faulty; could you please investigate
this within Loongson?

I remember a Loongson staff member telling me the issue was solved in
GS464E, but there are still users complaining about it.

In fact, I can't reproduce it on any of my test systems:
3B1500 + RS780E + R5 230
3A3000 + RS780E Laptop
3A4000 + LS7A + RX550
>> And we don't even know which hardware is known to be good. The same graphic
>> card became a different story on different user's hand.
> find out what is broken and add the needed workarounds then.

Well, let's leave this task to Loongson, the company.
The user community doesn't have the ability to trace hardware behavior
precisely.

>> I understood what Teizhu thought. For entry-level users, we don't want to
>> trouble
>> them, so we have writecombine disabled by default. However, for advanced
>> user
>> trying to tweak their system, we should leave a switch for them to get it
>> back.
> IMHO if we do it that way, we end up with millions of knobs for tweaking
> broken hardware, and nobody knows what's exactly broken. Sorry I won't go
> that way.

Haha, that was my first impression of Linux as a primary school student.
It just looks like an aircraft cockpit with thousands of knobs and
switches, but to get airborne you just need to control the yoke and
throttle.

- Jiaxun
Patch

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index dd7a0f5..34869f7 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -473,6 +473,10 @@  static inline pgprot_t pgprot_noncached(pgprot_t _prot)
 static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
 {
 	unsigned long prot = pgprot_val(_prot);
+	extern bool mips_writecombine;
+
+	if (!mips_writecombine)
+		return pgprot_noncached(_prot);
 
 	/* cpu_data[0].writecombine is already shifted by _CACHE_SHIFT */
 	prot = (prot & ~_CACHE_MASK) | cpu_data[0].writecombine;
diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index e2955f1..98777ca 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -459,6 +459,25 @@  static int __init ftlb_disable(char *s)
 
 __setup("noftlb", ftlb_disable);
 
+#ifdef CONFIG_MACH_LOONGSON64
+bool mips_writecombine; /* initialise to false by default */
+#else
+bool mips_writecombine = true;
+#endif
+EXPORT_SYMBOL(mips_writecombine);
+
+static int __init writecombine_setup(char *str)
+{
+	if (strcmp(str, "on") == 0)
+		mips_writecombine = true;
+	else if (strcmp(str, "off") == 0)
+		mips_writecombine = false;
+
+	return 1;
+}
+
+__setup("writecombine=", writecombine_setup);
+
 /*
  * Check if the CPU has per tc perf counters
  */