Message ID | 0b7c9431efb12c2d957fcc53ec8f0743725d61b3.camel@mengyan1223.wang (mailing list archive) |
---|---|
State | Accepted |
Commit | 3f059a7e8c13c62addc1808d13a41d1f6dc24bd1 |
Headers | show |
Series | mips: remove reference to "newer Loongson-3" | expand |
在 2021/8/29 20:49, Xi Ruoyao 写道: > Newest Loongson-3 processors have moved to use LoongArch architecture. > Sadly, the LL/SC issue is still existing on both latest Loongson-3 > processors using MIPS64 (Loongson-3A4000) and LoongArch > (Loongson-3A5000). LLSC is fixed on Loongson-3A4000 as per CPUCFG report. > > As it's very unlikely there will be new Loongson-3 processors using > MIPS64, let's stop people from false hoping. > > Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang> > Cc: Huacai Chen <chenhuacai@kernel.org> > --- > > Huacai: how's the status of LL/SC issue on Loongson-2K? If > the issue exists on it as well, we can just force > CPU_LOONGSON3_WORKAROUNDS when CONFIG_CPU_LOONGSON64 and > CONFIG_SMP are both selected. Loongson-2K do need LLSC workaround, although the reason behind the workaround seems different... Thanks. - Jiaxun > > arch/mips/Kconfig | 9 ++------- > 1 file changed, 2 insertions(+), 7 deletions(-) > > diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig > index 6dfb27d531dd..ff5f344a371e 100644 > --- a/arch/mips/Kconfig > +++ b/arch/mips/Kconfig > @@ -1433,19 +1433,14 @@ config LOONGSON3_ENHANCEMENT > new Loongson-3 machines only, please say 'Y' here. > > config CPU_LOONGSON3_WORKAROUNDS > - bool "Old Loongson-3 LLSC Workarounds" > + bool "Loongson-3 LLSC Workarounds" > default y if SMP > depends on CPU_LOONGSON64 > help > Loongson-3 processors have the llsc issues which require workarounds. > Without workarounds the system may hang unexpectedly. > > - Newer Loongson-3 will fix these issues and no workarounds are needed. > - The workarounds have no significant side effect on them but may > - decrease the performance of the system so this option should be > - disabled unless the kernel is intended to be run on old systems. > - > - If unsure, please say Y. > + Say Y, unless you know what you are doing. > > config CPU_LOONGSON3_CPUCFG_EMULATION > bool "Emulate the CPUCFG instruction on older Loongson cores"
On Mon, 2021-08-30 at 10:32 +0800, Jiaxun Yang wrote: > > 在 2021/8/29 20:49, Xi Ruoyao 写道: > > Newest Loongson-3 processors have moved to use LoongArch > > architecture. > > Sadly, the LL/SC issue is still existing on both latest Loongson-3 > > processors using MIPS64 (Loongson-3A4000) and LoongArch > > (Loongson-3A5000). > LLSC is fixed on Loongson-3A4000 as per CPUCFG report. If I don't enable LL/SC fix, GCC libgomp tests fail on both 3A4000 and 3A5000 (using github.com/loongson/gcc for the latter) with "invalid access to 0x00000049" or "0x00000005". This is a race condition: it does not happen at all with OMP_NUM_THREADS=1, happens with about 10% possibility with OMP_NUM_THREADS=2, and about 90% possibility with OMP_NUM_THREAD=4 (on 3A5000, on 3A4000 the possibility is lower). My investigation suggests this means a GCC instrinic, __atomic_compare_and_exchange_n is not really atomic as it should be. If this is not a hardware issue in the GS464V/LA464 uarch, then it will be very low-possibility coincidence: two unrelated code-generation bugs for __atomic_compare_and_exchange_n (LA port has borrowed some code from MIPS port, but the instrinics are of course newly coded). And I've inspected libgomp & gcc code about __atomic_compare_and_exchange_n carefully, nothing wrong spotted except LoongArch GCC supports "-mfix- loongson3-llsc" which adds a "dbar 0" (like "sync" on MIPS) instruction after SC (only for instrinics). Enabling this fixes the libgomp failures. Likewisely, "-Wa,-mfix-loongson3-llsc" fixes it on 3A4000. libgomp code has been verified with thread sanitizer on other architectures (unfortunately libtsan is not available on MIPS or LoongArch yet), so it's very unlikely to be a coding error leading to the race. And LL/SC fix is still in Huacai's 3A5000 kernel. In a mail on linux- arch Huacai said it's "not so easy to be fixed". Or these are two different erratas and I misunderstand them as the same one?
在 2021/8/30 下午8:28, Xi Ruoyao 写道: > On Mon, 2021-08-30 at 10:32 +0800, Jiaxun Yang wrote: >> 在 2021/8/29 20:49, Xi Ruoyao 写道: >>> Newest Loongson-3 processors have moved to use LoongArch >>> architecture. >>> Sadly, the LL/SC issue is still existing on both latest Loongson-3 >>> processors using MIPS64 (Loongson-3A4000) and LoongArch >>> (Loongson-3A5000). >> LLSC is fixed on Loongson-3A4000 as per CPUCFG report. > If I don't enable LL/SC fix, GCC libgomp tests fail on both 3A4000 and > 3A5000 (using github.com/loongson/gcc for the latter) with "invalid > access to 0x00000049" or "0x00000005". This is a race condition: it > does not happen at all with OMP_NUM_THREADS=1, happens with about 10% > possibility with OMP_NUM_THREADS=2, and about 90% possibility with > OMP_NUM_THREAD=4 (on 3A5000, on 3A4000 the possibility is lower). Apologize for the false report, yes, you are right. I had checked with Loongson guys and they confirmed that the workaround still needs to be applied to latest 3A4000 processors, including 3A4000 for MIPS and 3A5000 for LoongArch. Though, the reason behind the workaround varies with the evaluation of their uArch, for GS464V based core, barrier is required as the uArch design allows regular load to be reordered after an atomic linked load, and that would break assumption of compiler atomic constraints. For GS464E, barrier is required to flush the Store Fill Buffer and land all the cachelines to L1 cache, otherwise a linked load to a cacheline located at SFB may cause deadlock. For original GS464, barrier is required to deal with some kind of pipeline hazard to ensure store condition won't be shorted to success. Patch LGTM. The config symbol looks ambiguous and I'd agree with your idea of renaming. Thanks, - Jiaxun > > My investigation suggests this means a GCC instrinic, > __atomic_compare_and_exchange_n is not really atomic as it should be. > > If this is not a hardware issue in the GS464V/LA464 uarch, then it will > be very low-possibility coincidence: two unrelated code-generation bugs > for __atomic_compare_and_exchange_n (LA port has borrowed some code from > MIPS port, but the instrinics are of course newly coded). And I've > inspected libgomp & gcc code about __atomic_compare_and_exchange_n > carefully, nothing wrong spotted except LoongArch GCC supports "-mfix- > loongson3-llsc" which adds a "dbar 0" (like "sync" on MIPS) instruction > after SC (only for instrinics). Enabling this fixes the libgomp > failures. Likewisely, "-Wa,-mfix-loongson3-llsc" fixes it on 3A4000. > > libgomp code has been verified with thread sanitizer on other > architectures (unfortunately libtsan is not available on MIPS or > LoongArch yet), so it's very unlikely to be a coding error leading to > the race. > > And LL/SC fix is still in Huacai's 3A5000 kernel. In a mail on linux- > arch Huacai said it's "not so easy to be fixed". > > Or these are two different erratas and I misunderstand them as the same > one? >
On Tue, 2021-08-31 at 00:14 +0800, Jiaxun Yang wrote: > > > 在 2021/8/30 下午8:28, Xi Ruoyao 写道: > > On Mon, 2021-08-30 at 10:32 +0800, Jiaxun Yang wrote: > > > 在 2021/8/29 20:49, Xi Ruoyao 写道: > > > > Newest Loongson-3 processors have moved to use LoongArch > > > > architecture. > > > > Sadly, the LL/SC issue is still existing on both latest > > > > Loongson-3 > > > > processors using MIPS64 (Loongson-3A4000) and LoongArch > > > > (Loongson-3A5000). > > > LLSC is fixed on Loongson-3A4000 as per CPUCFG report. > > If I don't enable LL/SC fix, GCC libgomp tests fail on both 3A4000 > > and > > 3A5000 (using github.com/loongson/gcc for the latter) with "invalid > > access to 0x00000049" or "0x00000005". This is a race condition: it > > does not happen at all with OMP_NUM_THREADS=1, happens with about > > 10% > > possibility with OMP_NUM_THREADS=2, and about 90% possibility with > > OMP_NUM_THREAD=4 (on 3A5000, on 3A4000 the possibility is lower). > Apologize for the false report, yes, you are right. I had checked with > Loongson guys > and they confirmed that the workaround still needs to be applied to > latest 3A4000 > processors, including 3A4000 for MIPS and 3A5000 for LoongArch. > > Though, the reason behind the workaround varies with the evaluation of > their uArch, > for GS464V based core, barrier is required as the uArch design allows > regular load > to be reordered after an atomic linked load, and that would break > assumption of compiler > atomic constraints. > For GS464E, barrier is required to flush the Store Fill Buffer and > land > all the cachelines > to L1 cache, otherwise a linked load to a cacheline located at SFB may > cause deadlock. > > For original GS464, barrier is required to deal with some kind of > pipeline hazard to > ensure store condition won't be shorted to success. This explains the different (mis)behavior of LL/SC on those uarchs. I remember the original report of LL/SC issue said it can cause a deadlock on earlier model of 3As, but I didn't observed any deadlock on 3A4000. (That's I why didn't tried the workaround immediately after spotting libgomp failure, but debugged the code from 00:00 AM to 04:00 :( ) Thanks for your detailed explanation! > Patch LGTM. The config symbol looks ambiguous and I'd agree with your > idea of renaming. > > Thanks, > > - Jiaxun > > Or these are two different erratas and I misunderstand them as the > > same one? So basically this is true :). They just happen to share one workaround.
On Sun, Aug 29, 2021 at 08:49:09PM +0800, Xi Ruoyao wrote: > Newest Loongson-3 processors have moved to use LoongArch architecture. > Sadly, the LL/SC issue is still existing on both latest Loongson-3 > processors using MIPS64 (Loongson-3A4000) and LoongArch > (Loongson-3A5000). > > As it's very unlikely there will be new Loongson-3 processors using > MIPS64, let's stop people from false hoping. > > Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang> > Cc: Huacai Chen <chenhuacai@kernel.org> > --- > > Huacai: how's the status of LL/SC issue on Loongson-2K? If > the issue exists on it as well, we can just force > CPU_LOONGSON3_WORKAROUNDS when CONFIG_CPU_LOONGSON64 and > CONFIG_SMP are both selected. > > arch/mips/Kconfig | 9 ++------- > 1 file changed, 2 insertions(+), 7 deletions(-) applied to mips-next. Thomas.
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig index 6dfb27d531dd..ff5f344a371e 100644 --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -1433,19 +1433,14 @@ config LOONGSON3_ENHANCEMENT new Loongson-3 machines only, please say 'Y' here. config CPU_LOONGSON3_WORKAROUNDS - bool "Old Loongson-3 LLSC Workarounds" + bool "Loongson-3 LLSC Workarounds" default y if SMP depends on CPU_LOONGSON64 help Loongson-3 processors have the llsc issues which require workarounds. Without workarounds the system may hang unexpectedly. - Newer Loongson-3 will fix these issues and no workarounds are needed. - The workarounds have no significant side effect on them but may - decrease the performance of the system so this option should be - disabled unless the kernel is intended to be run on old systems. - - If unsure, please say Y. + Say Y, unless you know what you are doing. config CPU_LOONGSON3_CPUCFG_EMULATION bool "Emulate the CPUCFG instruction on older Loongson cores"
Newest Loongson-3 processors have moved to use LoongArch architecture. Sadly, the LL/SC issue is still existing on both latest Loongson-3 processors using MIPS64 (Loongson-3A4000) and LoongArch (Loongson-3A5000). As it's very unlikely there will be new Loongson-3 processors using MIPS64, let's stop people from false hoping. Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang> Cc: Huacai Chen <chenhuacai@kernel.org> --- Huacai: how's the status of LL/SC issue on Loongson-2K? If the issue exists on it as well, we can just force CPU_LOONGSON3_WORKAROUNDS when CONFIG_CPU_LOONGSON64 and CONFIG_SMP are both selected. arch/mips/Kconfig | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)