diff mbox series

[v2,05/15] ia64: convert to legacy_timer_tick

Message ID 20201030151758.1241164-6-arnd@kernel.org (mailing list archive)
State Changes Requested
Headers show
Series timers: clean up ARCH_GETTIMEOFFSET, LEGACY_TIMER_TICK | expand

Commit Message

Arnd Bergmann Oct. 30, 2020, 3:17 p.m. UTC
From: Arnd Bergmann <arnd@arndb.de>

ia64 is the only architecture that calls xtime_update() in a loop,
once for each jiffie that has passed since the last event.

Before commit 3171a0305d62 ("[PATCH] simplify update_times (avoid
jiffies/jiffies_64 aliasing problem)") in 2006, it could not actually do
this any differently, but now it seems simpler to just pass the number
of jiffies that passed in the meantime.

While this loses the ability process interrupts in the middle of
the timer tick by calling local_irq_enable(), doing so is fairly
peculiar anyway and it seems better to just do what everyone
else does here.

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/ia64/Kconfig       |  1 +
 arch/ia64/kernel/time.c | 36 +++++++++++++-----------------------
 2 files changed, 14 insertions(+), 23 deletions(-)

Comments

John Paul Adrian Glaubitz Dec. 18, 2020, 9:29 p.m. UTC | #1
Hi Arnd!

On 10/30/20 4:17 PM, Arnd Bergmann wrote:
> ia64 is the only architecture that calls xtime_update() in a loop,
> once for each jiffie that has passed since the last event.
> 
> Before commit 3171a0305d62 ("[PATCH] simplify update_times (avoid
> jiffies/jiffies_64 aliasing problem)") in 2006, it could not actually do
> this any differently, but now it seems simpler to just pass the number
> of jiffies that passed in the meantime.
> 
> While this loses the ability process interrupts in the middle of
> the timer tick by calling local_irq_enable(), doing so is fairly
> peculiar anyway and it seems better to just do what everyone
> else does here.

This broke the hpsa block device driver on my HP RX2600.

I'm seeing this backtrace now:

[  905.883273] usb 1-2: SerialNumber: A60020000001                                                                                                                         
[  905.918170]  sda: sda1 sda2 sda3                                                                                                                                        
[  905.920107] sd 0:1:0:0: [sda] Attached SCSI disk                                                                                                                        
[  905.944102] usb-storage 1-2:1.0: USB Mass Storage device detected                                                                                                       
[  905.944102] scsi host1: usb-storage 1-2:1.0                                                                                                                             
[  905.944102] usbcore: registered new interface driver usb-storage                                                                                                        
[  905.944117] usbcore: registered new interface driver uas                                                                                                                
[  905.948117] scsi 1:0:0:0: CD-ROM            TEAC     DV-W28EAW        H.0C PQ: 0 ANSI: 0                                                                                
[  905.952013] usb 2-1: new full-speed USB device number 2 using ohci-pci                                                                                                  
[  905.956102] sr 1:0:0:0: Power-on or device reset occurred                                                                                                               
[  905.968107] sr 1:0:0:0: [sr0] scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray                                                                                  
[  905.968107] cdrom: Uniform CD-ROM driver Revision: 3.20                                                                                                                 
[  906.028013] ioc0: LSISAS1068 B0: Capabilities={Initiator}                                                                                                               
[  906.028110] tg3 0000:01:02.0 eth0: Tigon3 [partno(BCM95700A6) rev 2100] (PCIX:66MHz:64-bit) MAC address 00:17:a4:99:2d:09                                               
[  906.026167] usb 2-1: New USB device found, idVendor=03f0, idProduct=1126, bcdDevice= 0.01                                                                               
[  906.028110] tg3 0000:01:02.0 eth0: attached PHY is 5704 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])                                                             
[  906.028110] tg3 0000:01:02.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]                                                                                   
[  906.030214] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=63                                                                                          
[  906.032240] tg3 0000:01:02.0 eth0: dma_rwctrl[769f0000] dma_mask[64-bit]                                                                                                
[  906.036102] GSI 30 (level, low) -> CPU 2 (0x0200) vector 67                                                                                                             
[  906.034186] usb 2-1: Product: Virtual Management Device                                                                                                                 
[  906.038235] usb 2-1: Manufacturer: HP                                                                                                                                   
[  906.040017] usb 2-1: SerialNumber: 0017a499dd1c                                                                                                                         
[  906.056108] hid: raw HID events driver (C) Jiri Kosina                                                                                                                  
[  906.080013] usbcore: registered new interface driver usbhid                                                                                                             
[  906.080013] usbhid: USB HID core driver                                                                                                                                 
[  906.080107] input: HP Virtual Management Device as /devices/pci0000:00/0000:00:02.0/usb2/2-1/2-1:1.0/0003:03F0:1126.0001/input/input2                                   
[  906.088110] hid-generic 0003:03F0:1126.0001: input,hidraw0: USB HID v1.11 Keyboard [HP Virtual Management Device] on usb-0000:00:02.0-1/input0                          
[  906.088110] input: HP Virtual Management Device as /devices/pci0000:00/0000:00:02.0/usb2/2-1/2-1:1.1/0003:03F0:1126.0002/input/input3                                   
[  906.092110] hid-generic 0003:03F0:1126.0002: input,hidraw1: USB HID v1.01 Mouse [HP Virtual Management Device] on usb-0000:00:02.0-1/input1                             
[  906.104104] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  906.108288] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  906.108288] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  906.152107] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  906.170926] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  906.170926] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  906.218925] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  906.222925] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  906.222925] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  906.266926] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  906.272016] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  906.272016] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  906.272107] scsi host2: ioc0: LSISAS1068 B0, FwRev=010f0000h, Ports=1, MaxQ=511, IRQ=62                                                                                 
[  906.324110] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  906.330925] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  906.330925] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  906.378925] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  906.382926] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  906.382926] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  906.386926] tg3 0000:01:02.1 eth1: Tigon3 [partno(BCM95700A6) rev 2100] (PCIX:66MHz:64-bit) MAC address 00:17:a4:99:2d:08                                               
[  906.386926] tg3 0000:01:02.1 eth1: attached PHY is 5704 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])                                                             
[  906.388016] tg3 0000:01:02.1 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]                                                                                   
[  906.388016] tg3 0000:01:02.1 eth1: dma_rwctrl[769f0000] dma_mask[64-bit]                                                                                                
[  906.396108] tg3 0000:01:02.1 enp1s2f1: renamed from eth1                                                                                                                
[  906.408107] tg3 0000:01:02.0 enp1s2f0: renamed from eth0                                                                                                                
Begin: Loading essential drivers ... done.                                                                                                                                 
Begin: Running /scripts/init-premount ... done.                                                                                                                            
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.                                                                                           
Begin: Running /scripts/local-premount ... done.                                                                                                                           
Begin: Will now check root file system ... fsck from util-linux 2.36                                                                                                       
[/sbin/fsck.ext4 (1) -- /dev/sda2] fsck.ext4 -a -C0 /dev/sda2                                                                                                              
/dev/sda2: clean, 317994/2068416 files, 6480495/8257536 blocks                                                                                                             
[  906.666923] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  906.670923] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  906.670923] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
done.                                                                                                                                                                      
[  906.722166] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:                                                                                                         
[  906.722166] rcu:     2-....: (3 ticks this GP) idle=fe6/1/0x4000000000000000 softirq=693/698 fqs=4                                                                      
[  906.722166]  (detected by 0, t=6115 jiffies, g=465, q=80)                                                                                                               
[  906.722166] Task dump for CPU 2:                                                                                                                                        
[  906.722166] task:modprobe        state:R  running task     stack:    0 pid:  162 ppid:   146 flags:0x00000002                                                           
[  906.722166]                                                                                                                                                             
[  906.722166] Call Trace:                                                                                                                                                 
[  906.722166]  [<0000000000000fd6>] 0xfd6                                                                                                                                 
[  906.722166]                                 sp=e000010046d27c70 bsp=e000010046d20e80                                                                                    
[  906.782168] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:                                                                                                         
[  906.782168] rcu:     2-....: (18 ticks this GP) idle=fe6/1/0x4000000000000000 softirq=693/698 fqs=18                                                                    
[  906.782168]  (detected by 0, t=22157 jiffies, g=465, q=80)                                                                                                              
[  906.782168] Task dump for CPU 2:                                                                                                                                        
[  906.782168] task:modprobe        state:R  running task     stack:    0 pid:  162 ppid:   146 flags:0x00000002                                                           
[  906.782168]                                                                                                                                                             
[  906.782168] Call Trace:                                                                                                                                                 
[  906.782168]  [<0000001d00000065>] 0x1d00000065                                                                                                                          
[  906.782168]                                 sp=e000010046d27c70 bsp=e000010046d20d48                                                                                    
[  906.844107] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)                                                                                     
done.                                                                                                                                                                      
Begin: Running /scripts/local-bottom ... done.                                                                                                                             
Begin: Running /scripts/init-bottom ... done.                                                                                                                              
[  906.978917] Not activating Mandatory Access Control as /sbin/tomoyo-init does not exist.                                                                                
[  907.726923] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  907.734923] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  907.734923] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  907.790923] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  907.802923] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  907.802923] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  907.860014] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  907.886923] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  907.886923] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  907.930923] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  907.942923] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  907.942923] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  907.976107] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  907.982923] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  907.982923] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  908.106923] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  908.110923] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  908.110923] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  908.256014] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  908.258922] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  908.258922] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  908.428014] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  908.434923] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  908.434923] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  908.710922] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  908.714922] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  908.714922] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  908.754922] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1                           
[  908.758922] hpsa 0000:05:00.0: device is ready.                                                                                                                         
[  908.758922] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1        
[  908.788106] systemd[1]: Inserted module 'autofs4'                                                                                                                       
[  908.972107] systemd[1]: systemd 246.6-5 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD -SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)                                                                                         
[  908.976013] systemd[1]: Detected architecture ia64.                                                                                                                     

Welcome to Debian GNU/Linux bullseye/sid!

[  909.020014] systemd[1]: Set hostname to <glendronach>.
[  909.178923] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1
[  909.182922] hpsa 0000:05:00.0: device is ready.
[  909.182922] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1
[  909.242922] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1
[  909.254922] hpsa 0000:05:00.0: device is ready.
[  909.254922] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1
[  909.322922] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1
[  909.334922] hpsa 0000:05:00.0: device is ready.
[  909.334922] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1
[  909.360108] INFO: task systemd-sysv-ge:200 blocked for more than 127 seconds.
[  909.360108]       Not tainted 5.10.0+ #130
[  909.360108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  909.360108] task:systemd-sysv-ge state:D stack:    0 pid:  200 ppid:   189 flags:0x00000000
[  909.364108] 
[  909.364108] Call Trace:
[  909.364423]  [<a00000010109b210>] __schedule+0x890/0x21e0
[  909.364423]                                 sp=e0000100487d7b70 bsp=e0000100487d1748
[  909.368423]  [<a00000010109cc00>] schedule+0xa0/0x240
[  909.368423]                                 sp=e0000100487d7b90 bsp=e0000100487d16e0
[  909.368558]  [<a00000010109ce70>] io_schedule+0x70/0xa0
[  909.368558]                                 sp=e0000100487d7b90 bsp=e0000100487d16c0
[  909.372290]  [<a00000010109e1c0>] bit_wait_io+0x20/0xe0
[  909.372290]                                 sp=e0000100487d7b90 bsp=e0000100487d1698
[  909.374168] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  909.376290]  [<a00000010109d860>] __wait_on_bit+0xc0/0x1c0
[  909.376290]                                 sp=e0000100487d7b90 bsp=e0000100487d1648
[  909.374168] rcu:     3-....: (2 ticks this GP) idle=19e/1/0x4000000000000002 softirq=1581/1581 fqs=2 
[  909.374168]  (detected by 0, t=5661 jiffies, g=1089, q=3)
[  909.376290]  [<a00000010109da80>] out_of_line_wait_on_bit+0x120/0x140
[  909.376290]                                 sp=e0000100487d7b90 bsp=e0000100487d1610
[  909.374168] Task dump for CPU 3:
[  909.374168] task:khungtaskd      state:R  running task

Adrian
Arnd Bergmann Dec. 18, 2020, 10:07 p.m. UTC | #2
On Fri, Dec 18, 2020 at 10:29 PM John Paul Adrian Glaubitz
<glaubitz@physik.fu-berlin.de> wrote:
>
> Hi Arnd!
>
> On 10/30/20 4:17 PM, Arnd Bergmann wrote:
> > ia64 is the only architecture that calls xtime_update() in a loop,
> > once for each jiffie that has passed since the last event.
> >
> > Before commit 3171a0305d62 ("[PATCH] simplify update_times (avoid
> > jiffies/jiffies_64 aliasing problem)") in 2006, it could not actually do
> > this any differently, but now it seems simpler to just pass the number
> > of jiffies that passed in the meantime.
> >
> > While this loses the ability process interrupts in the middle of
> > the timer tick by calling local_irq_enable(), doing so is fairly
> > peculiar anyway and it seems better to just do what everyone
> > else does here.
>
> This broke the hpsa block device driver on my HP RX2600.

Hi Adrian,

Sorry for causing this bug, and thank you for bisecting it
down to my patch.

Do you see any other strange behavior with that patch, or is
this the only symptom you are aware of?

>
> I'm seeing this backtrace now:
>
> [  905.883273] usb 1-2: SerialNumber: A60020000001
> [  905.918170]  sda: sda1 sda2 sda3
> [  905.920107] sd 0:1:0:0: [sda] Attached SCSI disk
> [  905.944102] usb-storage 1-2:1.0: USB Mass Storage device detected
> [  905.944102] scsi host1: usb-storage 1-2:1.0
> [  905.944102] usbcore: registered new interface driver usb-storage
> [  905.944117] usbcore: registered new interface driver uas

The timestamps show that time is moving forward, which is at least
something. Do you have a feeling for whether the timestamps are
counting in (roughly) the correct speed, or is it going much faster
or slower than it should?

To clarify: the [905.944117] numbers are seconds/microseconds
since boot, so message would be 906 seconds after the kernel
started.

> Begin: Loading essential drivers ... done.                                                                                                                        > Begin: Running /scripts/init-premount ... done.                                                                                                                 > Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.

Ok, so it gets into user space. Is this initramfs or the actual read-only root?

> [  906.666923] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1
> [  906.670923] hpsa 0000:05:00.0: device is ready.
> [  906.670923] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1
> done.
> [  906.722166] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [  906.722166] rcu:     2-....: (3 ticks this GP) idle=fe6/1/0x4000000000000000 softirq=693/698 fqs=4
> [  906.722166]  (detected by 0, t=6115 jiffies, g=465, q=80)
This appears to be an 'rcu stall' warning, from print_cpu_stall_info(),
indicating that timer ticks are missing.

> [  909.360108] INFO: task systemd-sysv-ge:200 blocked for more than 127 seconds.
> [  909.360108]       Not tainted 5.10.0+ #130
> [  909.360108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  909.360108] task:systemd-sysv-ge state:D stack:    0 pid:  200 ppid:   189 flags:0x00000000
> [  909.364108]
> [  909.364108] Call Trace:
> [  909.364423]  [<a00000010109b210>] __schedule+0x890/0x21e0
> [  909.364423]                                 sp=e0000100487d7b70 bsp=e0000100487d1748
> [  909.368423]  [<a00000010109cc00>] schedule+0xa0/0x240
> [  909.368423]                                 sp=e0000100487d7b90 bsp=e0000100487d16e0
> [  909.368558]  [<a00000010109ce70>] io_schedule+0x70/0xa0
> [  909.368558]                                 sp=e0000100487d7b90 bsp=e0000100487d16c0
> [  909.372290]  [<a00000010109e1c0>] bit_wait_io+0x20/0xe0
> [  909.372290]                                 sp=e0000100487d7b90 bsp=e0000100487d1698
> [  909.374168] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [  909.376290]  [<a00000010109d860>] __wait_on_bit+0xc0/0x1c0
> [  909.376290]                                 sp=e0000100487d7b90 bsp=e0000100487d1648
> [  909.374168] rcu:     3-....: (2 ticks this GP) idle=19e/1/0x4000000000000002 softirq=1581/1581 fqs=2
> [  909.374168]  (detected by 0, t=5661 jiffies, g=1089, q=3)
> [  909.376290]  [<a00000010109da80>] out_of_line_wait_on_bit+0x120/0x140
> [  909.376290]                                 sp=e0000100487d7b90 bsp=e0000100487d1610
> [  909.374168] Task dump for CPU 3:
> [  909.374168] task:khungtaskd      state:R  running task

and this seems to be another instance of the same. I would assume that this
is completely unrelated to the block driver and just happened to trigger during
the same time the driver was doing something.

Can you see in your full logs if the "Oops: timer tick before it's due" warning
triggered at any point?

I've attached a patch for a partial revert of my original change, this
should still work with the final cleanup on top, but restore the loop
plus the local_irq_enable()/local_irq_disable() that I dropped from
the original code. Does this make a difference?

        Arnd
John Paul Adrian Glaubitz Dec. 18, 2020, 10:13 p.m. UTC | #3
Hi Arnd!

On 12/18/20 11:07 PM, Arnd Bergmann wrote:
> Sorry for causing this bug, and thank you for bisecting it
> down to my patch.
> 
> Do you see any other strange behavior with that patch, or is
> this the only symptom you are aware of?

This seems to be the only issue I'm seeing so far. However, as I'm not
able to fully boot the system, I'm not able to be certain that there
might be other fallouts once the system is running.

>> I'm seeing this backtrace now:
>>
>> [  905.883273] usb 1-2: SerialNumber: A60020000001
>> [  905.918170]  sda: sda1 sda2 sda3
>> [  905.920107] sd 0:1:0:0: [sda] Attached SCSI disk
>> [  905.944102] usb-storage 1-2:1.0: USB Mass Storage device detected
>> [  905.944102] scsi host1: usb-storage 1-2:1.0
>> [  905.944102] usbcore: registered new interface driver usb-storage
>> [  905.944117] usbcore: registered new interface driver uas
> 
> The timestamps show that time is moving forward, which is at least
> something. Do you have a feeling for whether the timestamps are
> counting in (roughly) the correct speed, or is it going much faster
> or slower than it should?
> 
> To clarify: the [905.944117] numbers are seconds/microseconds
> since boot, so message would be 906 seconds after the kernel
> started.

No, that would be definitely off. I hadn't had the machine up and running
for 15 minutes. This issue showed right after boot.

>> Begin: Loading essential drivers ... done.                                                                                                                        > Begin: Running /scripts/init-premount ... done.                                                                                                                 > Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
> 
> Ok, so it gets into user space. Is this initramfs or the actual read-only root?

This is using an initramfs.

>> [  906.666923] hpsa 0000:05:00.0: scsi 0:1:0:0: resetting logical  Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1
>> [  906.670923] hpsa 0000:05:00.0: device is ready.
>> [  906.670923] hpsa 0000:05:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap- En- Exp=1
>> done.
>> [  906.722166] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
>> [  906.722166] rcu:     2-....: (3 ticks this GP) idle=fe6/1/0x4000000000000000 softirq=693/698 fqs=4
>> [  906.722166]  (detected by 0, t=6115 jiffies, g=465, q=80)
> This appears to be an 'rcu stall' warning, from print_cpu_stall_info(),
> indicating that timer ticks are missing.

OK.

>> [  909.360108] INFO: task systemd-sysv-ge:200 blocked for more than 127 seconds.
>> [  909.360108]       Not tainted 5.10.0+ #130
>> [  909.360108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [  909.360108] task:systemd-sysv-ge state:D stack:    0 pid:  200 ppid:   189 flags:0x00000000
>> [  909.364108]
>> [  909.364108] Call Trace:
>> [  909.364423]  [<a00000010109b210>] __schedule+0x890/0x21e0
>> [  909.364423]                                 sp=e0000100487d7b70 bsp=e0000100487d1748
>> [  909.368423]  [<a00000010109cc00>] schedule+0xa0/0x240
>> [  909.368423]                                 sp=e0000100487d7b90 bsp=e0000100487d16e0
>> [  909.368558]  [<a00000010109ce70>] io_schedule+0x70/0xa0
>> [  909.368558]                                 sp=e0000100487d7b90 bsp=e0000100487d16c0
>> [  909.372290]  [<a00000010109e1c0>] bit_wait_io+0x20/0xe0
>> [  909.372290]                                 sp=e0000100487d7b90 bsp=e0000100487d1698
>> [  909.374168] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
>> [  909.376290]  [<a00000010109d860>] __wait_on_bit+0xc0/0x1c0
>> [  909.376290]                                 sp=e0000100487d7b90 bsp=e0000100487d1648
>> [  909.374168] rcu:     3-....: (2 ticks this GP) idle=19e/1/0x4000000000000002 softirq=1581/1581 fqs=2
>> [  909.374168]  (detected by 0, t=5661 jiffies, g=1089, q=3)
>> [  909.376290]  [<a00000010109da80>] out_of_line_wait_on_bit+0x120/0x140
>> [  909.376290]                                 sp=e0000100487d7b90 bsp=e0000100487d1610
>> [  909.374168] Task dump for CPU 3:
>> [  909.374168] task:khungtaskd      state:R  running task
> 
> and this seems to be another instance of the same. I would assume that this
> is completely unrelated to the block driver and just happened to trigger during
> the same time the driver was doing something.
> 
> Can you see in your full logs if the "Oops: timer tick before it's due" warning
> triggered at any point?

It's difficult, to be honest. The problem is that the above message spams the whole
kernel buffer to the point that the buffer of the built-in serial console is filled
up. So I'm not sure if I've seen this message.

> I've attached a patch for a partial revert of my original change, this
> should still work with the final cleanup on top, but restore the loop
> plus the local_irq_enable()/local_irq_disable() that I dropped from
> the original code. Does this make a difference?

I'll give it a try and report back.

Adrian
John Paul Adrian Glaubitz Dec. 18, 2020, 10:32 p.m. UTC | #4
Hi Arnd!

On 12/18/20 11:13 PM, John Paul Adrian Glaubitz wrote:
>> I've attached a patch for a partial revert of my original change, this
>> should still work with the final cleanup on top, but restore the loop
>> plus the local_irq_enable()/local_irq_disable() that I dropped from
>> the original code. Does this make a difference?
> 
> I'll give it a try and report back.

Yes. That solves the timer issues. Now there is unfortunately still a
second, unrelated regression with the hpsa driver that was introduced
by one of the ia64 patches in the mm tree from Andrew which makes the
hpsa driver not load at all.

Haven't figured out yet what the problem is.

Adrian
diff mbox series

Patch

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 39b25a5a591b..db8c2a365b70 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -46,6 +46,7 @@  config IA64
 	select ARCH_THREAD_STACK_ALLOCATOR
 	select ARCH_CLOCKSOURCE_DATA
 	select GENERIC_TIME_VSYSCALL
+	select LEGACY_TIMER_TICK
 	select SWIOTLB
 	select SYSCTL_ARCH_UNALIGN_NO_WARN
 	select HAVE_MOD_ARCH_SPECIFIC
diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index 7abc5f37bfaf..9431edb08508 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -161,39 +161,29 @@  void vtime_account_idle(struct task_struct *tsk)
 static irqreturn_t
 timer_interrupt (int irq, void *dev_id)
 {
-	unsigned long new_itm;
+	unsigned long cur_itm, new_itm, ticks;
 
 	if (cpu_is_offline(smp_processor_id())) {
 		return IRQ_HANDLED;
 	}
 
 	new_itm = local_cpu_data->itm_next;
+	cur_itm = ia64_get_itc();
 
-	if (!time_after(ia64_get_itc(), new_itm))
+	if (!time_after(cur_itm, new_itm)) {
 		printk(KERN_ERR "Oops: timer tick before it's due (itc=%lx,itm=%lx)\n",
-		       ia64_get_itc(), new_itm);
-
-	profile_tick(CPU_PROFILING);
-
-	while (1) {
-		update_process_times(user_mode(get_irq_regs()));
-
-		new_itm += local_cpu_data->itm_delta;
-
-		if (smp_processor_id() == time_keeper_id)
-			xtime_update(1);
-
-		local_cpu_data->itm_next = new_itm;
+		       cur_itm, new_itm);
+		ticks = 1;
+	} else {
+		ticks = DIV_ROUND_UP(cur_itm - new_itm,
+				     local_cpu_data->itm_delta);
+		new_itm += ticks * local_cpu_data->itm_delta;
+	}
 
-		if (time_after(new_itm, ia64_get_itc()))
-			break;
+	if (smp_processor_id() != time_keeper_id)
+		ticks = 0;
 
-		/*
-		 * Allow IPIs to interrupt the timer loop.
-		 */
-		local_irq_enable();
-		local_irq_disable();
-	}
+	legacy_timer_tick(ticks);
 
 	do {
 		/*