DryIce , RTC not working on imx53.

Message ID: D24E577EA1E37B4788A3BFBC418F14A70760C04A@wetsrvex01.loepfe.com
From: "Vellemans, Noel" <Noel.Vellemans@visionBMS.com>
To: "linux-arm-kernel@lists.infradead.org" <linux-arm-kernel@lists.infradead.org>
Cc: "linux-rtc@vger.kernel.org" <linux-rtc@vger.kernel.org>
Subject: DryIce , RTC not working on imx53.
Date: Thu, 5 Oct 2017 14:18:31 +0000

Vellemans, Noel Oct. 5, 2017, 2:18 p.m. UTC

Hello,

DryIce SRTC is not working on i.MX53 with kernel 4.x. (On the same hardware running older kernel versions, the RTC works.)

During boot all seems to be fine, but once you try to read or write the hardware clock later on, it bails out with this error on the console:

hwclock

[   97.186577] imxdi_rtc 53fa4000.rtc: Write-wait timeout val = 0x5a2ff8d3 reg = 0x00000008

hwclock: select() to /dev/rtc0 to wait for clock tick timed out: No such file or directory



I've added some printk's to the driver:

# hwclock
[   73.362559] dryice_rtc_read_time ------------------------------------------------
[   73.395077] dryice_rtc_read_time ------------------------------------------------
[   73.414156] dryice_rtc_read_time ------------------------------------------------
[   73.421700] di_write_wait ------------------------------------------------
[   73.472624] di_int_enable ------------------------------------------------
[   73.514609] imxdi_rtc 53fa4000.srtc: Write-wait timeout val = 0x5a3000c8 reg = 0x00000008
[   73.523019] di_int_enable ------------------------------------------------

<< STALLS for 5 seconds here >>

hwclock[   78.584909] dryice_rtc_alarm_irq_enable ------------------------------------------------
: select() to /dev/rtc0 to wait f[   78.593456] di_int_disable ------------------------------------------------
or clock tick timed out: No such file or directory






strace log ================================


stat("/lib/ld-uClibc.so.0", {st_mode=S_IFREG|0777, st_size=25300, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x4000000, -1, 0) = 0xb6f04000
set_tls(0xb6f04490, 0xb6f04b38, 0xb6f07088, 0xb6f04490, 0xb6f06f74) = 0
mprotect(0xb6ed2000, 4096, PROT_READ)   = 0
mprotect(0xb6f06000, 4096, PROT_READ)   = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B115200 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B115200 opost isig icanon echo ...}) = 0
gettimeofday({1513095430, 708097}, NULL) = 0
getuid32()                              = 0
open("/dev/rtc", O_RDONLY|O_LARGEFILE)  = -1 ENOENT (No such file or directory)
open("/dev/rtc0", O_RDONLY|O_LARGEFILE) = 3
brk(0)                                  = 0x18000
brk(0x19000)                            = 0x19000
stat64("/etc/adjtime", 0xbeec26a8)      = -1 ENOENT (No such file or directory)
ioctl(3, PHN_SET_REGS or RTC_UIE_ON, 0) = 0
select(4, [3], NULL, NULL, {5, 0}


<< stalls for 5 seconds here: select() does not return until its 5-second timeout expires >>

)      = 0 (Timeout)[  141.766162] dryice_rtc_alarm_irq_enable ------------------------------------------------

write(2, "hwclock", 7hwclock)                  = 7
write(2, ": ", 2: )                       =[  141.782195] di_int_disable ------------------------------------------------
2
write(2, "select() to ", 12select() to )            = 12
write(2, "/dev/rtc0", 9/dev/rtc0)                = 9
write(2, " to wait for clock tick timed ou"..., 33 to wait for clock tick timed out) = 33
write(2, ": ", 2: )                       = 2
write(2, "No such file or directory", 25No such file or directory) = 25
write(2, "\n", 1
)                       = 1
ioctl(3, PHN_NOT_OH or RTC_UIE_OFF, 0)  = 0
close(3)                                = 0
exit_group(74)                          = ?








Quick analysis (could be wrong):
It seems that hwclock reads the current timestamp 3 times and, if it has not changed across those 3 reads, sets up an interruptible wait that should return as soon as the RTC IRQ fires. That interrupt seems to be missing!
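
The sequence visible in the strace log (open /dev/rtc0, RTC_UIE_ON, select() with a 5-second timeout, RTC_UIE_OFF) can be reproduced outside hwclock to narrow the problem down. A minimal Python sketch, assuming the standard Linux ioctl numbers from <linux/rtc.h> (RTC_UIE_ON = _IO('p', 0x03), RTC_UIE_OFF = _IO('p', 0x04)); the function names are illustrative and are not taken from hwclock:

```python
import fcntl
import os
import select

# ioctl request numbers from <linux/rtc.h>: _IO('p', 0x03) and _IO('p', 0x04)
RTC_UIE_ON = 0x7003
RTC_UIE_OFF = 0x7004


def wait_readable(fd, timeout_s=5.0):
    """Return True if fd becomes readable within timeout_s, else False."""
    readable, _, _ = select.select([fd], [], [], timeout_s)
    return bool(readable)


def wait_for_rtc_tick(path="/dev/rtc0", timeout_s=5.0):
    """Enable update interrupts, wait for one tick, then disable them again.

    Returns True if an update interrupt arrived, False on timeout
    (the failure mode reported in this thread).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, RTC_UIE_ON, 0)
        try:
            return wait_readable(fd, timeout_s)
        finally:
            fcntl.ioctl(fd, RTC_UIE_OFF, 0)
    finally:
        os.close(fd)


if __name__ == "__main__":
    try:
        print("tick received" if wait_for_rtc_tick() else "timed out waiting for tick")
    except OSError as exc:
        print(f"cannot use /dev/rtc0: {exc}")
```

On a working RTC the call should return within about a second; on the affected board it would be expected to hit the timeout, matching the select() timeout in the log.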

FYI: I’ve been using the following commit to enable the srtc:
commit 5b725054147deaf966b3919e10a86c6bfe946a18
Author: Patrick Bruenn <p.bruenn@beckhoff.com>
Date:   Wed Jul 26 14:05:32 2017 +0200

    ARM: dts: imx53: add srtc node
    
    The i.MX53 has an integrated secure real time clock. Add it to the dtsi.
    
    Signed-off-by: Patrick Bruenn <p.bruenn@beckhoff.com>

    Signed-off-by: Shawn Guo <shawnguo@kernel.org>

Patrick Brünn Oct. 12, 2017, 7:50 a.m. UTC | #1

>From: Vellemans, Noel [mailto:Noel.Vellemans@visionBMS.com]
>Sent: Thursday, 5 October 2017 16:19
>Hello,
>

Hi,
not sure if I can help on this, but as I did some testing myself I thought I should throw in my results as well.

>DryIce , SRTC not working on imx53. ( kernel 4.x)  ( same hardware running
>older kernel versions.. means , rtc is working)
>

Is it only the kernel you are changing? I am asking because I had the impression that hwclock behaves differently on Debian stretch (util-linux 2.29.2) and jessie (util-linux 2.25.2).
I say "impression" because on jessie I seemed to always get a response from hwclock, but on stretch never. When I tested more systematically, it looked like hwclock -r always fails right after boot, but if I wait a few minutes, all calls succeed.
>...

>QUICK analyses  ( could be wrong) ?
>It seems that hwclock is reading the current-timestamp 3 times and if not
>changed in those 3 read cycles… it sets up an read-interrupt-abort able time
>reader that should return as soon as the irq fires… but this seems to be
>missing !
>

I am seeing a lot of interrupts with kernel 4.14-rc4 (jessie and stretch), but they seem to be unhandled:
root@CX9020:~# uname -a
Linux CX9020 4.14.0-rc4+ #151 PREEMPT Wed Oct 11 10:40:34 CEST 2017 armv7l GNU/Linux
root@CX9020:~# hwclock -D -r
hwclock from util-linux 2.29.2
Using the /dev interface to the clock.
Last drift adjustment done at 1490885082 seconds after 1969
Last calibration done at 1490885082 seconds after 1969
Hardware clock is on UTC time
Assuming hardware clock is kept in UTC time.
Waiting for clock tick...
[   26.795437] irq 40: nobody cared (try booting with the "irqpoll" option)
[   26.802696] handlers:
[   26.805031] [<c06029e8>] dryice_irq
[   26.808584] Disabling IRQ #40
select() to /dev/rtc to wait for clock tick timed out...synchronization failed
root@CX9020:~# cat /proc/interrupts
           CPU0
 17:       4276      tzic   1 Edge      mmc0
 18:         52      tzic   2 Edge      mmc1
 22:          0      tzic   6 Edge      sdma
 30:        176      tzic  14 Edge      53f80200.usb
 40:     100000      tzic  24 Edge      53fa4000.srtc
 48:        268      tzic  32 Edge      53fc0000.serial
 55:       2288      tzic  39 Edge      i.MX Timer Tick
 74:          0      tzic  58 Edge      53f98000.wdog
 79:        278      tzic  63 Edge      63fc4000.i2c
 93:          0      tzic  77 Edge      arm-pmu
103:        662      tzic  87 Edge      63fec000.ethernet
145:          0  gpio-mxc   1 Edge      50004000.esdhc cd
148:          0  gpio-mxc   4 Edge      50008000.esdhc cd
368:        674       IPU  23 Edge      imx_drm
369:          0       IPU  28 Edge      imx_drm
Err:          0
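
For comparing runs, the per-IRQ counts can be pulled out of a /proc/interrupts dump with a few lines of Python. This is a minimal sketch that assumes a single-CPU layout like the one above; the helper name is made up for illustration. The count of exactly 100000 on IRQ 40 is what one would expect if the kernel's spurious-interrupt check (kernel/irq/spurious.c) disabled the line at its periodic evaluation point, but that reading is an assumption and is not confirmed in this thread.

```python
SAMPLE = """\
           CPU0
 30:        176      tzic  14 Edge      53f80200.usb
 40:     100000      tzic  24 Edge      53fa4000.srtc
 55:       2288      tzic  39 Edge      i.MX Timer Tick
Err:          0
"""


def parse_interrupts(text):
    """Map IRQ number -> (count, name) for a single-CPU /proc/interrupts dump."""
    rows = {}
    for line in text.splitlines():
        parts = line.split(None, 5)
        irq = parts[0].rstrip(":") if parts else ""
        if len(parts) < 6 or not irq.isdigit():
            continue  # header, "Err:" and other summary rows
        rows[int(irq)] = (int(parts[1]), parts[5].strip())
    return rows


rows = parse_interrupts(SAMPLE)
busiest = max(rows, key=lambda irq: rows[irq][0])
print(busiest, rows[busiest])
```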

I added some tracing to dryice_irq() and saw that most of the time (if not all the time) dsr == DSR_MCO /* monotonic clock overflow */, with dier varying between 0x110, 0x10 and even 0x0.
I don't know what the right thing to do is to recover from DSR_MCO. Adding "return IRQ_HANDLED" stops the "nobody cared" message, but hwclock still times out.

And for completeness:
root@CX9020:~# dmesg | grep srtc
[    0.299043] imxdi_rtc 53fa4000.srtc: Unlocked unit detected
[    0.299539] imxdi_rtc 53fa4000.srtc: security violation interrupt not available.
[    0.299757] rtc rtc0: 53fa4000.srtc: dev (253:0)
[    0.299778] imxdi_rtc 53fa4000.srtc: rtc core: registered 53fa4000.srtc as rtc0
[    0.436785] imxdi_rtc 53fa4000.srtc: setting system clock to 2017-10-12 06:38:08 UTC (1507790288)
[  445.486624] imxdi_rtc 53fa4000.srtc: Write-wait timeout val = 0x59df0f8e reg = 0x00000008
[ 3025.076612] imxdi_rtc 53fa4000.srtc: Write-wait timeout val = 0x59df19a2 reg = 0x00000008
[ 3082.316612] imxdi_rtc 53fa4000.srtc: Write-wait timeout val = 0x59df19db reg = 0x00000008

Novice question: Is hwclock still required these days? For me it looks like the kernel is synchronizing with rtc on its own. Maybe some kernel config is incompatible with hwclock?

Regards,
Patrick
Beckhoff Automation GmbH & Co. KG | Managing Director: Dipl. Phys. Hans Beckhoff
Registered office: Verl, Germany | Register court: Guetersloh HRA 7075

Russell King (Oracle) Oct. 12, 2017, 9:36 a.m. UTC | #2

On Thu, Oct 12, 2017 at 07:50:41AM +0000, Patrick Brünn wrote:
> Novice question: Is hwclock still required these days? For me it looks
> like the kernel is synchronizing with rtc on it's own. Maybe some kernel
> config is incompatible with hwclock?

It depends on your application.  If you want the kernel's idea of time
to be wrong up to a second or two, then you can rely on the kernel's
time setting.

Please realise that the kernel has always set the time from the RTC,
even on x86 where hwclock has been used.  hwclock, however, has some
advanced features and advantages that are beneficial if you're after
accuracy.

1) hwclock will try to read the RTC as close to a second-change as
   possible, so that the time read from the RTC is as close to the
   second boundary as possible.

2) hwclock can measure and correct the RTC time for its own drift if
   hwclock has been allowed to capture and process the offset.

What this means is that hwclock has the capability to precisely set
the kernel's time at boot, way more accurately than the kernel does.
The kernel's time setting is focused on speed, not accuracy.

So, if your userspace application is to monitor something using a
precise timestamp, and you are NTP synchronising (or other method) the
time on the system, then you need the kernel's idea of time to be set
much more precisely to avoid NTP making big corrections over the
following three to six hours.

This happens because NTP will slew the clock for a few seconds of
difference, which makes storing and reloading the PPM value useless,
and can also mean that in such a monitoring application, the results
are unreliable until NTP has re-stabilised.

Here's an example of an application where this may matter: average
speed camera system.  You have two cameras over a section of road,
each with their own processing, which are NTP synchronised.  Each
reads the numberplate of passing vehicles using ANPR technology,
and timestamps the passing of the vehicle using their local clock.
The distance is known, so it's an easy calculation to calculate the
vehicle speed.  If the vehicle speed is over the limit, the driver
is fined.

Consider what the implications are if one of the systems rebooted
and then had incorrect time (up to two seconds wrong) for up to six
hours after - a two second error is about a 3% error in recorded
speed.  Would you want to be sent a speeding fine from such a system?
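
The "about 3%" figure can be checked with a short calculation. This is only a sketch: the 0.75-mile section length is a hypothetical value chosen for illustration (the thread does not give the distance), and the function name is made up.

```python
def measured_speed_mph(distance_miles, true_transit_s, clock_error_s):
    """Speed computed from timestamps when the second clock lags by clock_error_s."""
    measured_transit_s = true_transit_s - clock_error_s
    return distance_miles / (measured_transit_s / 3600.0)


# Hypothetical section: 0.75 miles driven at exactly 40 mph (67.5 s transit).
DISTANCE_MILES = 0.75
TRUE_SPEED_MPH = 40.0
true_transit_s = DISTANCE_MILES / TRUE_SPEED_MPH * 3600.0

measured = measured_speed_mph(DISTANCE_MILES, true_transit_s, clock_error_s=2.0)
error_pct = (measured / TRUE_SPEED_MPH - 1.0) * 100.0
print(f"measured {measured:.2f} mph, error {error_pct:.1f}%")
```

With these assumed numbers a 2-second timestamp error inflates the computed speed by roughly 3%; a shorter section would make the same clock error proportionally worse.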

(We have the first non-motorway road in Surrey, UK to have average
speed cameras installed down its entire length because of "piston
heads" who think the speed limit is 60mph rather than the sign-
posted 40mph.)

Another, probably more relevant application is a stratum-1 NTP server
synchronised via PPS to a GPS.  I wonder how many people are aware
that if you reboot such a setup relying on the kernel's time setting
only, the time sent to clients will be wrong while NTP slews the local
clock.  I've seen these effects locally, where rebooting exactly such
a system results in slaved systems which have no other source of time
also making big corrections.

Alexandre Belloni Nov. 9, 2017, 3:03 a.m. UTC | #3

On 12/10/2017 at 07:50:41 +0000, Patrick Brünn wrote:
> I am seeing a lot of interrupts with kernel 4.14-rc4 (jessie and stretch), but they seem to be unhandled:
> root@CX9020:~# uname -a
> Linux CX9020 4.14.0-rc4+ #151 PREEMPT Wed Oct 11 10:40:34 CEST 2017 armv7l GNU/Linux
> root@CX9020:~# hwclock -D -r
> hwclock from util-linux 2.29.2
> Using the /dev interface to the clock.
> Last drift adjustment done at 1490885082 seconds after 1969
> Last calibration done at 1490885082 seconds after 1969
> Hardware clock is on UTC time
> Assuming hardware clock is kept in UTC time.
> Waiting for clock tick...
> [   26.795437] irq 40: nobody cared (try booting with the "irqpoll" option)
> [   26.802696] handlers:
> [   26.805031] [<c06029e8>] dryice_irq
> [   26.808584] Disabling IRQ #40
> select() to /dev/rtc to wait for clock tick timed out...synchronization failed
> root@CX9020:~# cat /proc/interrupts
>            CPU0
>  17:       4276      tzic   1 Edge      mmc0
>  18:         52      tzic   2 Edge      mmc1
>  22:          0      tzic   6 Edge      sdma
>  30:        176      tzic  14 Edge      53f80200.usb
>  40:     100000      tzic  24 Edge      53fa4000.srtc
>  48:        268      tzic  32 Edge      53fc0000.serial
>  55:       2288      tzic  39 Edge      i.MX Timer Tick
>  74:          0      tzic  58 Edge      53f98000.wdog
>  79:        278      tzic  63 Edge      63fc4000.i2c
>  93:          0      tzic  77 Edge      arm-pmu
> 103:        662      tzic  87 Edge      63fec000.ethernet
> 145:          0  gpio-mxc   1 Edge      50004000.esdhc cd
> 148:          0  gpio-mxc   4 Edge      50008000.esdhc cd
> 368:        674       IPU  23 Edge      imx_drm
> 369:          0       IPU  28 Edge      imx_drm
> Err:          0
> 
> I added some tracing to dryice_irq() and saw that most of the time (if not all the time) dsr == DSR_MCO /* monotonic clock overflow */ with dier vary between 0x110, 0x10 and even 0x0.
> I don't know what's the right thing to do, to recover from DSR_MCO. " return IRQ_HANDLED" will stop the nobody cared message but hwclock still times out.
> 

hwclock times out because the alarm is not working properly. Can you
check whether rtctest is working?

Patrick Brünn Nov. 9, 2017, 9:59 a.m. UTC | #4

>From: Alexandre Belloni [mailto:alexandre.belloni@free-electrons.com]
>Sent: Thursday, 9 November 2017 04:03
>
>hwclock times out because the alarm is not working properly. Can you
>check whether rtctest is working?
>
You mean "tools/testing/selftests/timers/rtctest", right?

I just ran it and it's stuck at "Counting 5 update (1/sec) interrupts from reading /dev/rtc0:"

The kernel log shows a new message each time I restart rtctest:
[ 1032.349562] imxdi_rtc 53fa4000.srtc: Write-wait timeout val = 0x5a042418 reg = 0x00000008


Regards, Patrick


Alexandre Belloni Nov. 13, 2017, 4:15 p.m. UTC | #5

+Cc Fabio

On 09/11/2017 at 09:59:02 +0000, Patrick Brünn wrote:
> 
> >From: Alexandre Belloni [mailto:alexandre.belloni@free-electrons.com]
> >Sent: Thursday, 9 November 2017 04:03
> >
> >hwclock times out because the alarm is not working properly. Can you
> >check whether rtctest is working?
> >
> You mean "tools/testing/selftests/timers/rtctest", right?
> 
> I just run it and it's stuck at "Counting 5 update (1/sec) interrupts from reading /dev/rtc0:"
> 

So alarms or interrupts definitely don't work.

> Kernel log shows a new message for each try I restart rtctest.
> [ 1032.349562] imxdi_rtc 53fa4000.srtc: Write-wait timeout val = 0x5a042418 reg = 0x00000008"
> 

That would explain it because failing to write DCAMR means that the
alarm is not set to the proper value.

But that doesn't explain all the spurious interrupts you are seeing.

Fabio, is there any recent change in the platform code that would make
reading/writing the rtc register fail?
