diff mbox series

MIPS: Fix CP0 counter erratum detection for R4k CPUs

Message ID alpine.DEB.2.21.2204240214430.9383@angie.orcam.me.uk (mailing list archive)
State Accepted
Headers show
Series MIPS: Fix CP0 counter erratum detection for R4k CPUs | expand

Commit Message

Maciej W. Rozycki April 24, 2022, 11:46 a.m. UTC
Fix the discrepancy between the two places we check for the CP0 counter 
erratum in along with the incorrect comparison of the R4400 revision 
number against 0x30 which matches none and consistently consider all 
R4000 and R4400 processors affected, as documented in processor errata 
publications[1][2][3], following the mapping between CP0 PRId register 
values and processor models:

  PRId   |  Processor Model
---------+--------------------
00000422 | R4000 Revision 2.2
00000430 | R4000 Revision 3.0
00000440 | R4400 Revision 1.0
00000450 | R4400 Revision 2.0
00000460 | R4400 Revision 3.0

No other revision of either processor has ever been spotted.

Contrary to what has been stated in commit ce202cbb9e0b ("[MIPS] Assume 
R4000/R4400 newer than 3.0 don't have the mfc0 count bug") marking the 
CP0 counter as buggy does not preclude it from being used as either a 
clock event or a clock source device.  It just cannot be used as both at 
a time, because in that case clock event interrupts will be occasionally 
lost, and the use as a clock event device takes precedence.

Compare against 0x4ff in `can_use_mips_counter' so that a single machine 
instruction is produced.

References:

[1] "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0", MIPS
    Technologies Inc., May 10, 1994, Erratum 53, p.13

[2] "MIPS R4400PC/SC Errata, Processor Revision 1.0", MIPS Technologies
    Inc., February 9, 1994, Erratum 21, p.4

[3] "MIPS R4400PC/SC Errata, Processor Revision 2.0 & 3.0", MIPS 
    Technologies Inc., January 24, 1995, Erratum 14, p.3

Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Fixes: ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug")
Cc: stable@vger.kernel.org # v2.6.24+
---
Thomas,

 Please review the requirements for SNI platforms.  In the case of an 
erratic CP0 timer we give precedence to the use as a clock event rather 
than clock source device; see `time_init' in arch/mips/kernel/time.c. 
Therefore if SNI systems have no alternative timer interrupt source, then 
the CP0 timer is supposed to still do regardless of the erratum.

 Conversely a system can do without a high-precision clock source, in
which case jiffies will be used.  Of course such a system will suffer if 
used for precision timekeeping, but such is the price for broken hardware.  
Don't SNI systems have any alternative timer available, not even the 
venerable 8254?

 Long-term I think this code should be factored out and rewritten so that 
it lives in one place and can take advantage of compile-time constants, 
which will be the case for the majority of platforms.  For the time being 
fix the immediate breakage however.

 With the considerations above in mind, please apply.

  Maciej
---
 arch/mips/include/asm/timex.h |    8 ++++----
 arch/mips/kernel/time.c       |   11 +++--------
 2 files changed, 7 insertions(+), 12 deletions(-)

linux-mips-r4k-count-erratum-prid.diff

Comments

Thomas Bogendoerfer April 29, 2022, 10:01 a.m. UTC | #1
On Sun, Apr 24, 2022 at 12:46:23PM +0100, Maciej W. Rozycki wrote:
> Fix the discrepancy between the two places we check for the CP0 counter 
> erratum in along with the incorrect comparison of the R4400 revision 
> number against 0x30 which matches none and consistently consider all 
> R4000 and R4400 processors affected, as documented in processor errata 
> publications[1][2][3], following the mapping between CP0 PRId register 
> values and processor models:
> 
>   PRId   |  Processor Model
> ---------+--------------------
> 00000422 | R4000 Revision 2.2
> 00000430 | R4000 Revision 3.0
> 00000440 | R4400 Revision 1.0
> 00000450 | R4400 Revision 2.0
> 00000460 | R4400 Revision 3.0

interesting, where is this documented ? And it's quite funny that so far
everybody messed up revision printing for R4400 CPUs. 

>  Please review the requirements for SNI platforms.  In the case of an 
> erratic CP0 timer we give precedence to the use as a clock event rather 
> than clock source device; see `time_init' in arch/mips/kernel/time.c. 
> Therefore if SNI systems have no alternative timer interrupt source, then 
> the CP0 timer is supposed to still do regardless of the erratum.
>
>  Conversely a system can do without a high-precision clock source, in
> which case jiffies will be used.  Of course such a system will suffer if 
> used for precision timekeeping, but such is the price for broken hardware.  
> Don't SNI systems have any alternative timer available, not even the 
> venerable 8254?

all SNI systems have a i8254 in their EISA/PCI chipsets. But they aren't
that nice for clock events as their interupts are connected via an i8259
addresses via ISA PIO. 

>  With the considerations above in mind, please apply.

will do later.

Thomas.
Maciej W. Rozycki April 29, 2022, 10:55 a.m. UTC | #2
On Fri, 29 Apr 2022, Thomas Bogendoerfer wrote:

> > Fix the discrepancy between the two places we check for the CP0 counter 
> > erratum in along with the incorrect comparison of the R4400 revision 
> > number against 0x30 which matches none and consistently consider all 
> > R4000 and R4400 processors affected, as documented in processor errata 
> > publications[1][2][3], following the mapping between CP0 PRId register 
> > values and processor models:
> > 
> >   PRId   |  Processor Model
> > ---------+--------------------
> > 00000422 | R4000 Revision 2.2
> > 00000430 | R4000 Revision 3.0
> > 00000440 | R4400 Revision 1.0
> > 00000450 | R4400 Revision 2.0
> > 00000460 | R4400 Revision 3.0
> 
> interesting, where is this documented ? And it's quite funny that so far
> everybody messed up revision printing for R4400 CPUs. 

 That's just observation combined with past discussions with Ralf.

 Basically the PRId implementation number is 0x04 for both the R4000 and 
the R4400 (the only difference between the two CPUs is the addition of the 
write-back buffer in the latter one, making it weakly ordered).  And then 
the PRId revision number matches exactly the documented CPU revision for 
the R4000, while for the R4400 you need to subtract 3 from the PRId 
revision number to get the documented CPU revision (i.e. what would be 
R4000 Revision 4.0 actually became R4400 Revision 1.0).

 At this time no old MIPSer from the SGI days may be around to confirm or 
contradict this observation.  Documentation explicitly says[1]:

"The revision number can distinguish some chip revisions, however there is 
no guarantee that changes to the chip will necessarily be reflected in the 
PRId register, or that changes to the revision number necessarily reflect 
real chip changes.  For this reason, these values are not listed and 
software should not rely on the revision number in the PRId register to 
characterize the chip."

but surely the author didn't have errata workarounds in mind plus all CPU 
revisions have already been manufactured so the mapping has been fixed.

> >  Conversely a system can do without a high-precision clock source, in
> > which case jiffies will be used.  Of course such a system will suffer if 
> > used for precision timekeeping, but such is the price for broken hardware.  
> > Don't SNI systems have any alternative timer available, not even the 
> > venerable 8254?
> 
> all SNI systems have a i8254 in their EISA/PCI chipsets. But they aren't
> that nice for clock events as their interupts are connected via an i8259
> addresses via ISA PIO. 

 Interrupts are used for clock events, so you don't need one here.  For 
clock sources you just read the counter.

 That said reading from the 8254 is messy too and you may need a spinlock 
(you need to write the Counter Latch or Read-Back command to the control 
register and then issue consecutive two reads to the requested timer's 
data register[2]).  Which is I guess why support for it has been removed 
from x86 code.  For non-SMP it might be good enough.

> >  With the considerations above in mind, please apply.
> 
> will do later.

 Great, thanks!

References:

[1] Joe Heinrich: "MIPS R4000 Microprocessor User's Manual", Second
    Edition, MIPS Technologies, Inc., April 1, 1994, Chapter 4 "Memory 
    Management", p.90

[2] "8254 Programmable Interval Timer", Intel Corporation, Order Number: 
    231164-005, September 1993, Section "Read Operations", pp.7-9

  Maciej
Philippe Mathieu-Daudé April 29, 2022, 1:04 p.m. UTC | #3
On Mon, Apr 25, 2022 at 9:39 AM Maciej W. Rozycki <macro@orcam.me.uk> wrote:
>
> Fix the discrepancy between the two places we check for the CP0 counter
> erratum in along with the incorrect comparison of the R4400 revision
> number against 0x30 which matches none and consistently consider all
> R4000 and R4400 processors affected, as documented in processor errata
> publications[1][2][3], following the mapping between CP0 PRId register
> values and processor models:
>
>   PRId   |  Processor Model
> ---------+--------------------
> 00000422 | R4000 Revision 2.2
> 00000430 | R4000 Revision 3.0
> 00000440 | R4400 Revision 1.0
> 00000450 | R4400 Revision 2.0
> 00000460 | R4400 Revision 3.0
>
> No other revision of either processor has ever been spotted.
>
> Contrary to what has been stated in commit ce202cbb9e0b ("[MIPS] Assume
> R4000/R4400 newer than 3.0 don't have the mfc0 count bug") marking the
> CP0 counter as buggy does not preclude it from being used as either a
> clock event or a clock source device.  It just cannot be used as both at
> a time, because in that case clock event interrupts will be occasionally
> lost, and the use as a clock event device takes precedence.
>
> Compare against 0x4ff in `can_use_mips_counter' so that a single machine
> instruction is produced.
>
> References:
>
> [1] "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0", MIPS
>     Technologies Inc., May 10, 1994, Erratum 53, p.13
>
> [2] "MIPS R4400PC/SC Errata, Processor Revision 1.0", MIPS Technologies
>     Inc., February 9, 1994, Erratum 21, p.4
>
> [3] "MIPS R4400PC/SC Errata, Processor Revision 2.0 & 3.0", MIPS
>     Technologies Inc., January 24, 1995, Erratum 14, p.3
>
> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> Fixes: ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug")
> Cc: stable@vger.kernel.org # v2.6.24+
> ---
>  arch/mips/include/asm/timex.h |    8 ++++----
>  arch/mips/kernel/time.c       |   11 +++--------
>  2 files changed, 7 insertions(+), 12 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Thomas Bogendoerfer April 29, 2022, 1:53 p.m. UTC | #4
On Sun, Apr 24, 2022 at 12:46:23PM +0100, Maciej W. Rozycki wrote:
> Fix the discrepancy between the two places we check for the CP0 counter 
> erratum in along with the incorrect comparison of the R4400 revision 
> number against 0x30 which matches none and consistently consider all 
> R4000 and R4400 processors affected, as documented in processor errata 
> publications[1][2][3], following the mapping between CP0 PRId register 
> values and processor models:
> 
>   PRId   |  Processor Model
> ---------+--------------------
> 00000422 | R4000 Revision 2.2
> 00000430 | R4000 Revision 3.0
> 00000440 | R4400 Revision 1.0
> 00000450 | R4400 Revision 2.0
> 00000460 | R4400 Revision 3.0
> 
> No other revision of either processor has ever been spotted.
> 
> Contrary to what has been stated in commit ce202cbb9e0b ("[MIPS] Assume 
> R4000/R4400 newer than 3.0 don't have the mfc0 count bug") marking the 
> CP0 counter as buggy does not preclude it from being used as either a 
> clock event or a clock source device.  It just cannot be used as both at 
> a time, because in that case clock event interrupts will be occasionally 
> lost, and the use as a clock event device takes precedence.
> 
> Compare against 0x4ff in `can_use_mips_counter' so that a single machine 
> instruction is produced.
> 
> References:
> 
> [1] "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0", MIPS
>     Technologies Inc., May 10, 1994, Erratum 53, p.13
> 
> [2] "MIPS R4400PC/SC Errata, Processor Revision 1.0", MIPS Technologies
>     Inc., February 9, 1994, Erratum 21, p.4
> 
> [3] "MIPS R4400PC/SC Errata, Processor Revision 2.0 & 3.0", MIPS 
>     Technologies Inc., January 24, 1995, Erratum 14, p.3
> 
> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> Fixes: ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug")
> Cc: stable@vger.kernel.org # v2.6.24+
> ---
> Thomas,
> 
>  Please review the requirements for SNI platforms.  In the case of an 
> erratic CP0 timer we give precedence to the use as a clock event rather 
> than clock source device; see `time_init' in arch/mips/kernel/time.c. 
> Therefore if SNI systems have no alternative timer interrupt source, then 
> the CP0 timer is supposed to still do regardless of the erratum.
> 
>  Conversely a system can do without a high-precision clock source, in
> which case jiffies will be used.  Of course such a system will suffer if 
> used for precision timekeeping, but such is the price for broken hardware.  
> Don't SNI systems have any alternative timer available, not even the 
> venerable 8254?
> 
>  Long-term I think this code should be factored out and rewritten so that 
> it lives in one place and can take advantage of compile-time constants, 
> which will be the case for the majority of platforms.  For the time being 
> fix the immediate breakage however.
> 
>  With the considerations above in mind, please apply.
> 
>   Maciej
> ---
>  arch/mips/include/asm/timex.h |    8 ++++----
>  arch/mips/kernel/time.c       |   11 +++--------
>  2 files changed, 7 insertions(+), 12 deletions(-)

applied to mips-fixes.

Thomas.
David Laight April 29, 2022, 3:24 p.m. UTC | #5
>  That said reading from the 8254 is messy too and you may need a spinlock
> (you need to write the Counter Latch or Read-Back command to the control
> register and then issue consecutive two reads to the requested timer's
> data register[2]).  Which is I guess why support for it has been removed
> from x86 code.  For non-SMP it might be good enough.

It is important to 'latch' the counter before reading it.
I tried to optimise some code to avoid the extra accesses.
In principle it ought to have worked, but reading the
unlatched values gave garbage - not just messed up 16bit values.
It was probably returning a value that was being ripple-counted.

The sheer number of IO cycles you need to read the counters
just beggars belief.
So while they can be used to get an accurate timestamp it
really takes too long to be useful.
Even in a modern x86 chipset I think they are still ISA
speed cycles.
(Mind you, we've an fpga based PCIe boards where reads
are ISA speed....)

OTOH I'd have though that for a real 486 (one without a TSC)
that is the only way to get a high res timer count.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
diff mbox series

Patch

Index: linux-macro/arch/mips/include/asm/timex.h
===================================================================
--- linux-macro.orig/arch/mips/include/asm/timex.h
+++ linux-macro/arch/mips/include/asm/timex.h
@@ -40,9 +40,9 @@ 
 typedef unsigned int cycles_t;
 
 /*
- * On R4000/R4400 before version 5.0 an erratum exists such that if the
- * cycle counter is read in the exact moment that it is matching the
- * compare register, no interrupt will be generated.
+ * On R4000/R4400 an erratum exists such that if the cycle counter is
+ * read in the exact moment that it is matching the compare register,
+ * no interrupt will be generated.
  *
  * There is a suggested workaround and also the erratum can't strike if
  * the compare interrupt isn't being used as the clock source device.
@@ -63,7 +63,7 @@  static inline int can_use_mips_counter(u
 	if (!__builtin_constant_p(cpu_has_counter))
 		asm volatile("" : "=m" (cpu_data[0].options));
 	if (likely(cpu_has_counter &&
-		   prid >= (PRID_IMP_R4000 | PRID_REV_ENCODE_44(5, 0))))
+		   prid > (PRID_IMP_R4000 | PRID_REV_ENCODE_44(15, 15))))
 		return 1;
 	else
 		return 0;
Index: linux-macro/arch/mips/kernel/time.c
===================================================================
--- linux-macro.orig/arch/mips/kernel/time.c
+++ linux-macro/arch/mips/kernel/time.c
@@ -141,15 +141,10 @@  static __init int cpu_has_mfc0_count_bug
 	case CPU_R4400MC:
 		/*
 		 * The published errata for the R4400 up to 3.0 say the CPU
-		 * has the mfc0 from count bug.
-		 */
-		if ((current_cpu_data.processor_id & 0xff) <= 0x30)
-			return 1;
-
-		/*
-		 * we assume newer revisions are ok
+		 * has the mfc0 from count bug.  This seems the last version
+		 * produced.
 		 */
-		return 0;
+		return 1;
 	}
 
 	return 0;