mbox series

[v5,net-next,0/3] use bulk reads for ocelot statistics

Message ID 20220208044644.359951-1-colin.foster@in-advantage.com (mailing list archive)
Headers show
Series use bulk reads for ocelot statistics | expand

Message

Colin Foster Feb. 8, 2022, 4:46 a.m. UTC
Ocelot loops over memory regions to gather stats on different ports.
These regions are mostly continuous, and are ordered. This patch set
uses that information to break the stats reads into regions that can get
read in bulk.

The motiviation is for general cleanup, but also for SPI. Performing two
back-to-back reads on a SPI bus require toggling the CS line, holding,
re-toggling the CS line, sending 3 address bytes, sending N padding
bytes, then actually performing the read. Bulk reads could reduce almost
all of that overhead, but require that the reads are performed via
regmap_bulk_read.

v1 > v2: reword commit messages
v2 > v3: correctly mark this for net-next when sending
v3 > v4: calloc array instead of zalloc per review
v4 > v5:
    Apply CR suggestions for whitespace
    Fix calloc / zalloc mixup
    Properly destroy workqueues
    Add third commit to split long macros


Colin Foster (3):
  net: ocelot: align macros for consistency
  net: mscc: ocelot: add ability to perform bulk reads
  net: mscc: ocelot: use bulk reads for stats

 drivers/net/ethernet/mscc/ocelot.c    | 78 ++++++++++++++++++++++-----
 drivers/net/ethernet/mscc/ocelot_io.c | 13 +++++
 include/soc/mscc/ocelot.h             | 57 ++++++++++++++------
 3 files changed, 120 insertions(+), 28 deletions(-)

Comments

Vladimir Oltean Feb. 8, 2022, 1:30 p.m. UTC | #1
On Mon, Feb 07, 2022 at 08:46:41PM -0800, Colin Foster wrote:
> Ocelot loops over memory regions to gather stats on different ports.
> These regions are mostly continuous, and are ordered. This patch set
> uses that information to break the stats reads into regions that can get
> read in bulk.
>
> The motiviation is for general cleanup, but also for SPI. Performing two
> back-to-back reads on a SPI bus require toggling the CS line, holding,
> re-toggling the CS line, sending 3 address bytes, sending N padding
> bytes, then actually performing the read. Bulk reads could reduce almost
> all of that overhead, but require that the reads are performed via
> regmap_bulk_read.
>
> v1 > v2: reword commit messages
> v2 > v3: correctly mark this for net-next when sending
> v3 > v4: calloc array instead of zalloc per review
> v4 > v5:
>     Apply CR suggestions for whitespace
>     Fix calloc / zalloc mixup
>     Properly destroy workqueues
>     Add third commit to split long macros
>
>
> Colin Foster (3):
>   net: ocelot: align macros for consistency
>   net: mscc: ocelot: add ability to perform bulk reads
>   net: mscc: ocelot: use bulk reads for stats
>
>  drivers/net/ethernet/mscc/ocelot.c    | 78 ++++++++++++++++++++++-----
>  drivers/net/ethernet/mscc/ocelot_io.c | 13 +++++
>  include/soc/mscc/ocelot.h             | 57 ++++++++++++++------
>  3 files changed, 120 insertions(+), 28 deletions(-)
>
> --
> 2.25.1
>

Please do not merge these yet. I gave them a run on my board and the
kernel crashed on boot.

[    8.043170] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 0
[    8.050241] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 1
[    8.057142] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 2
[    8.064021] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 3
[    8.128668] ------------[ cut here ]------------
[    8.133315] WARNING: CPU: 1 PID: 44 at drivers/net/dsa/ocelot/felix_vsc9959.c:1007 vsc9959_wm_enc+0x2c/0x40
[    8.143107] Modules linked in:
[    8.146181] CPU: 1 PID: 44 Comm: kworker/u4:2 Not tainted 5.17.0-rc2-07010-ga9b9500ffaac-dirty #2104
[    8.155355] Hardware name: LS1028A RDB Board (DT)
[    8.160079] Workqueue: events_unbound deferred_probe_work_func
[    8.165945] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    8.172940] pc : vsc9959_wm_enc+0x2c/0x40
[    8.176967] lr : ocelot_setup_sharing_watermarks+0x94/0x1fc
[    8.182568] sp : ffff800008863810
[    8.185896] x29: ffff800008863810 x28: 0000000000000308 x27: 0000000000000001
[    8.193079] x26: 000000000300001a x25: 0000000000000008 x24: 0000000000000080
[    8.200261] x23: 0000000088888889 x22: 0000000000000000 x21: 0000000000000000
[    8.207441] x20: 00000000ffff2d17 x19: ffff2d17039d8010 x18: ffffd8e2afa23b28
[    8.214623] x17: 0000000000000015 x16: 0000000000000041 x15: 0000000000000000
[    8.221803] x14: ffffd8e2afa49228 x13: 0000000000025700 x12: 0000000000000009
[    8.228984] x11: ffff2d1703a96c18 x10: 0000000000000004 x9 : ffffd8e2ad6ce0f8
[    8.236165] x8 : ffff2d1700be4240 x7 : ffffd8e2af981000 x6 : 0000000000000001
[    8.243345] x5 : ffffd8e2ad1e1440 x4 : 0000000000000000 x3 : 0000000000000000
[    8.250525] x2 : 0000000000000000 x1 : ffffd8e2ad3d2810 x0 : 00000000000040c0
[    8.257706] Call trace:
[    8.260162]  vsc9959_wm_enc+0x2c/0x40
[    8.263841]  ocelot_devlink_sb_register+0x33c/0x380
[    8.268742]  felix_setup+0x438/0x750
[    8.272334]  dsa_register_switch+0x988/0x114c
[    8.276713]  felix_pci_probe+0x138/0x1fc
[    8.280654]  local_pci_probe+0x4c/0xc0
[    8.284423]  pci_device_probe+0x1b0/0x1f0
[    8.288451]  really_probe.part.0+0xa4/0x310
[    8.292654]  __driver_probe_device+0xa0/0x150
[    8.297030]  driver_probe_device+0xcc/0x164
[    8.301231]  __device_attach_driver+0xc4/0x130
[    8.305695]  bus_for_each_drv+0x84/0xe0
[    8.309547]  __device_attach+0xe4/0x190
[    8.313400]  device_initial_probe+0x20/0x30
[    8.317601]  bus_probe_device+0xac/0xb4
[    8.321454]  deferred_probe_work_func+0x98/0xd4
[    8.326004]  process_one_work+0x294/0x700
[    8.330037]  worker_thread+0x80/0x480
[    8.333717]  kthread+0x10c/0x120
[    8.336961]  ret_from_fork+0x10/0x20
[    8.340554] irq event stamp: 50432
[    8.343968] hardirqs last  enabled at (50431): [<ffffd8e2ade442b0>] _raw_spin_unlock_irqrestore+0x90/0xb0
[    8.353581] hardirqs last disabled at (50432): [<ffffd8e2ade36e44>] el1_dbg+0x24/0x90
[    8.361448] softirqs last  enabled at (50148): [<ffffd8e2ac6908f0>] __do_softirq+0x480/0x5f8
[    8.369923] softirqs last disabled at (50143): [<ffffd8e2ac728e3c>] __irq_exit_rcu+0x17c/0x1b0
[    8.378577] ---[ end trace 0000000000000000 ]---
[    8.383304] ------------[ cut here ]------------
[    8.387942] WARNING: CPU: 1 PID: 44 at drivers/net/dsa/ocelot/felix_vsc9959.c:1007 vsc9959_wm_enc+0x2c/0x40
[    8.397729] Modules linked in:
[    8.400800] CPU: 1 PID: 44 Comm: kworker/u4:2 Tainted: G        W         5.17.0-rc2-07010-ga9b9500ffaac-dirty #2104
[    8.411369] Hardware name: LS1028A RDB Board (DT)
[    8.416092] Workqueue: events_unbound deferred_probe_work_func
[    8.421955] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    8.428948] pc : vsc9959_wm_enc+0x2c/0x40
[    8.432975] lr : ocelot_setup_sharing_watermarks+0xc0/0x1fc
[    8.438573] sp : ffff800008863810
[    8.441900] x29: ffff800008863810 x28: 0000000000000308 x27: 0000000000000001
[    8.449081] x26: 000000000300001a x25: 0000000000000008 x24: 0000000000000080
[    8.456262] x23: 0000000088888889 x22: 0000000000000000 x21: 0000000000000000
[    8.463443] x20: 00000000ffff2d17 x19: ffff2d17039d8010 x18: ffffd8e2afa23b28
[    8.470623] x17: 0000000000000015 x16: 0000000000000041 x15: 0000000000000000
[    8.477804] x14: ffffd8e2afa49228 x13: 0000000000025700 x12: 0000000000000009
[    8.484984] x11: ffff2d1703a96c18 x10: 0000000000000004 x9 : ffffd8e2ad6ce124
[    8.492165] x8 : ffff2d1700be4240 x7 : ffffd8e2af981000 x6 : 0000000000000001
[    8.499345] x5 : ffffd8e2ad1e1440 x4 : 0000000000000000 x3 : 0000000000000001
[    8.506525] x2 : 0000000000000000 x1 : ffffd8e2ad3d2810 x0 : 00000000000040c0
[    8.513705] Call trace:
[    8.516161]  vsc9959_wm_enc+0x2c/0x40
[    8.519840]  ocelot_devlink_sb_register+0x33c/0x380
[    8.524740]  felix_setup+0x438/0x750
[    8.528331]  dsa_register_switch+0x988/0x114c
[    8.532708]  felix_pci_probe+0x138/0x1fc
[    8.536648]  local_pci_probe+0x4c/0xc0
[    8.540415]  pci_device_probe+0x1b0/0x1f0
[    8.544443]  really_probe.part.0+0xa4/0x310
[    8.548646]  __driver_probe_device+0xa0/0x150
[    8.553022]  driver_probe_device+0xcc/0x164
[    8.557225]  __device_attach_driver+0xc4/0x130
[    8.561688]  bus_for_each_drv+0x84/0xe0
[    8.565540]  __device_attach+0xe4/0x190
[    8.569393]  device_initial_probe+0x20/0x30
[    8.573594]  bus_probe_device+0xac/0xb4
[    8.577447]  deferred_probe_work_func+0x98/0xd4
[    8.581997]  process_one_work+0x294/0x700
[    8.586026]  worker_thread+0x80/0x480
[    8.589706]  kthread+0x10c/0x120
[    8.592949]  ret_from_fork+0x10/0x20
[    8.596541] irq event stamp: 50450
[    8.599955] hardirqs last  enabled at (50449): [<ffffd8e2ade442b0>] _raw_spin_unlock_irqrestore+0x90/0xb0                                                                                                
[    8.609566] hardirqs last disabled at (50450): [<ffffd8e2ade36e44>] el1_dbg+0x24/0x90
[    8.617431] softirqs last  enabled at (50446): [<ffffd8e2ac6908f0>] __do_softirq+0x480/0x5f8
[    8.625907] softirqs last disabled at (50435): [<ffffd8e2ac728e3c>] __irq_exit_rcu+0x17c/0x1b0
[    8.634559] ---[ end trace 0000000000000000 ]---
[    8.640586] device eth1 entered promiscuous mode
[    8.645499] Unable to handle kernel paging request at virtual address 00000010400020bc
[    8.653496] Mem abort info:
[    8.656340]   ESR = 0x96000044
[    8.659413]   EC = 0x25: DABT (current EL), IL = 32 bits
[    8.664784]   SET = 0, FnV = 0
[    8.667855]   EA = 0, S1PTW = 0
[    8.671044]   FSC = 0x04: level 0 translation fault
[    8.675979] Data abort info:
[    8.678875]   ISV = 0, ISS = 0x00000044
[    8.682762]   CM = 0, WnR = 1
[    8.685795] [00000010400020bc] user address but active_mm is swapper
[    8.692272] Internal error: Oops: 96000044 [#1] PREEMPT SMP
[    8.697865] Modules linked in:
[    8.700928] CPU: 1 PID: 44 Comm: kworker/u4:2 Tainted: G        W         5.17.0-rc2-07010-ga9b9500ffaac-dirty #2104
[    8.711490] Hardware name: LS1028A RDB Board (DT)
[    8.716206] Workqueue: events_unbound deferred_probe_work_func
[    8.722065] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    8.729051] pc : ocelot_phylink_mac_link_down+0x70/0x314
[    8.734381] lr : felix_phylink_mac_link_down+0x24/0x30
[    8.739536] sp : ffff800008863840
[    8.742856] x29: ffff800008863840 x28: 0000000000000000 x27: ffffd8e2af8ef180
[    8.750022] x26: 0000000000000001 x25: 0000001040002000 x24: 0000000000000000
[    8.757187] x23: 0000000000000180 x22: ffff2d1703a92000 x21: 0000000000000004
[    8.764352] x20: 0000000000000004 x19: ffff2d17039d8010 x18: ffffd8e2afa23b28
[    8.771516] x17: 0000000000040006 x16: 0000000500030008 x15: ffffffffffffffff
[    8.778680] x14: ffffffffffff0000 x13: ffffffffffffffff x12: 0000000000000010
[    8.785845] x11: 0101010101010101 x10: 0000000000000004 x9 : ffffd8e2ad3d0814
[    8.793009] x8 : fefefefefeff6a6d x7 : 0000ffffffffffff x6 : 0000000000000000
[    8.800173] x5 : 00000000ffffffff x4 : 0000000000000001 x3 : 000000000d000007
[    8.807338] x2 : 0000000000000010 x1 : 0000000000000000 x0 : 0000001040002000
[    8.814502] Call trace:
[    8.816949]  ocelot_phylink_mac_link_down+0x70/0x314
[    8.821929]  felix_phylink_mac_link_down+0x24/0x30
[    8.826734]  dsa_port_link_register_of+0xa8/0x240
[    8.831454]  dsa_port_setup+0xb4/0x180
[    8.835212]  dsa_register_switch+0xdb4/0x114c
[    8.839581]  felix_pci_probe+0x138/0x1fc
[    8.843515]  local_pci_probe+0x4c/0xc0
[    8.847275]  pci_device_probe+0x1b0/0x1f0
[    8.851296]  really_probe.part.0+0xa4/0x310
[    8.855490]  __driver_probe_device+0xa0/0x150
[    8.859860]  driver_probe_device+0xcc/0x164
[    8.864054]  __device_attach_driver+0xc4/0x130
[    8.868510]  bus_for_each_drv+0x84/0xe0
[    8.872355]  __device_attach+0xe4/0x190
[    8.876200]  device_initial_probe+0x20/0x30
[    8.880395]  bus_probe_device+0xac/0xb4
[    8.884240]  deferred_probe_work_func+0x98/0xd4
[    8.888783]  process_one_work+0x294/0x700
[    8.892808]  worker_thread+0x80/0x480
[    8.896480]  kthread+0x10c/0x120
[    8.899715]  ret_from_fork+0x10/0x20
[    8.903303] Code: 52800202 f874d839 52800001 aa1903e0 (b900bf25)
[    8.909417] ---[ end trace 0000000000000000 ]---

Investigating...
Colin Foster Feb. 8, 2022, 1:55 p.m. UTC | #2
On Tue, Feb 08, 2022 at 01:30:16PM +0000, Vladimir Oltean wrote:
> On Mon, Feb 07, 2022 at 08:46:41PM -0800, Colin Foster wrote:
> > Ocelot loops over memory regions to gather stats on different ports.
> > These regions are mostly continuous, and are ordered. This patch set
> > uses that information to break the stats reads into regions that can get
> > read in bulk.
> >
> > The motiviation is for general cleanup, but also for SPI. Performing two
> > back-to-back reads on a SPI bus require toggling the CS line, holding,
> > re-toggling the CS line, sending 3 address bytes, sending N padding
> > bytes, then actually performing the read. Bulk reads could reduce almost
> > all of that overhead, but require that the reads are performed via
> > regmap_bulk_read.
> >
> > v1 > v2: reword commit messages
> > v2 > v3: correctly mark this for net-next when sending
> > v3 > v4: calloc array instead of zalloc per review
> > v4 > v5:
> >     Apply CR suggestions for whitespace
> >     Fix calloc / zalloc mixup
> >     Properly destroy workqueues
> >     Add third commit to split long macros
> >
> >
> > Colin Foster (3):
> >   net: ocelot: align macros for consistency
> >   net: mscc: ocelot: add ability to perform bulk reads
> >   net: mscc: ocelot: use bulk reads for stats
> >
> >  drivers/net/ethernet/mscc/ocelot.c    | 78 ++++++++++++++++++++++-----
> >  drivers/net/ethernet/mscc/ocelot_io.c | 13 +++++
> >  include/soc/mscc/ocelot.h             | 57 ++++++++++++++------
> >  3 files changed, 120 insertions(+), 28 deletions(-)
> >
> > --
> > 2.25.1
> >
> 
> Please do not merge these yet. I gave them a run on my board and the
> kernel crashed on boot.
> 
> [    8.043170] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 0
> [    8.050241] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 1
> [    8.057142] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 2
> [    8.064021] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 3
> [    8.128668] ------------[ cut here ]------------
> [    8.133315] WARNING: CPU: 1 PID: 44 at drivers/net/dsa/ocelot/felix_vsc9959.c:1007 vsc9959_wm_enc+0x2c/0x40
> [    8.143107] Modules linked in:
> [    8.146181] CPU: 1 PID: 44 Comm: kworker/u4:2 Not tainted 5.17.0-rc2-07010-ga9b9500ffaac-dirty #2104
> [    8.155355] Hardware name: LS1028A RDB Board (DT)
> [    8.160079] Workqueue: events_unbound deferred_probe_work_func
> [    8.165945] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    8.172940] pc : vsc9959_wm_enc+0x2c/0x40
> [    8.176967] lr : ocelot_setup_sharing_watermarks+0x94/0x1fc
> [    8.182568] sp : ffff800008863810
> [    8.185896] x29: ffff800008863810 x28: 0000000000000308 x27: 0000000000000001
> [    8.193079] x26: 000000000300001a x25: 0000000000000008 x24: 0000000000000080
> [    8.200261] x23: 0000000088888889 x22: 0000000000000000 x21: 0000000000000000
> [    8.207441] x20: 00000000ffff2d17 x19: ffff2d17039d8010 x18: ffffd8e2afa23b28
> [    8.214623] x17: 0000000000000015 x16: 0000000000000041 x15: 0000000000000000
> [    8.221803] x14: ffffd8e2afa49228 x13: 0000000000025700 x12: 0000000000000009
> [    8.228984] x11: ffff2d1703a96c18 x10: 0000000000000004 x9 : ffffd8e2ad6ce0f8
> [    8.236165] x8 : ffff2d1700be4240 x7 : ffffd8e2af981000 x6 : 0000000000000001
> [    8.243345] x5 : ffffd8e2ad1e1440 x4 : 0000000000000000 x3 : 0000000000000000
> [    8.250525] x2 : 0000000000000000 x1 : ffffd8e2ad3d2810 x0 : 00000000000040c0
> [    8.257706] Call trace:
> [    8.260162]  vsc9959_wm_enc+0x2c/0x40
> [    8.263841]  ocelot_devlink_sb_register+0x33c/0x380
> [    8.268742]  felix_setup+0x438/0x750
> [    8.272334]  dsa_register_switch+0x988/0x114c
> [    8.276713]  felix_pci_probe+0x138/0x1fc
> [    8.280654]  local_pci_probe+0x4c/0xc0
> [    8.284423]  pci_device_probe+0x1b0/0x1f0
> [    8.288451]  really_probe.part.0+0xa4/0x310
> [    8.292654]  __driver_probe_device+0xa0/0x150
> [    8.297030]  driver_probe_device+0xcc/0x164
> [    8.301231]  __device_attach_driver+0xc4/0x130
> [    8.305695]  bus_for_each_drv+0x84/0xe0
> [    8.309547]  __device_attach+0xe4/0x190
> [    8.313400]  device_initial_probe+0x20/0x30
> [    8.317601]  bus_probe_device+0xac/0xb4
> [    8.321454]  deferred_probe_work_func+0x98/0xd4
> [    8.326004]  process_one_work+0x294/0x700
> [    8.330037]  worker_thread+0x80/0x480
> [    8.333717]  kthread+0x10c/0x120
> [    8.336961]  ret_from_fork+0x10/0x20
> [    8.340554] irq event stamp: 50432
> [    8.343968] hardirqs last  enabled at (50431): [<ffffd8e2ade442b0>] _raw_spin_unlock_irqrestore+0x90/0xb0
> [    8.353581] hardirqs last disabled at (50432): [<ffffd8e2ade36e44>] el1_dbg+0x24/0x90
> [    8.361448] softirqs last  enabled at (50148): [<ffffd8e2ac6908f0>] __do_softirq+0x480/0x5f8
> [    8.369923] softirqs last disabled at (50143): [<ffffd8e2ac728e3c>] __irq_exit_rcu+0x17c/0x1b0
> [    8.378577] ---[ end trace 0000000000000000 ]---
> [    8.383304] ------------[ cut here ]------------
> [    8.387942] WARNING: CPU: 1 PID: 44 at drivers/net/dsa/ocelot/felix_vsc9959.c:1007 vsc9959_wm_enc+0x2c/0x40
> [    8.397729] Modules linked in:
> [    8.400800] CPU: 1 PID: 44 Comm: kworker/u4:2 Tainted: G        W         5.17.0-rc2-07010-ga9b9500ffaac-dirty #2104
> [    8.411369] Hardware name: LS1028A RDB Board (DT)
> [    8.416092] Workqueue: events_unbound deferred_probe_work_func
> [    8.421955] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    8.428948] pc : vsc9959_wm_enc+0x2c/0x40
> [    8.432975] lr : ocelot_setup_sharing_watermarks+0xc0/0x1fc
> [    8.438573] sp : ffff800008863810
> [    8.441900] x29: ffff800008863810 x28: 0000000000000308 x27: 0000000000000001
> [    8.449081] x26: 000000000300001a x25: 0000000000000008 x24: 0000000000000080
> [    8.456262] x23: 0000000088888889 x22: 0000000000000000 x21: 0000000000000000
> [    8.463443] x20: 00000000ffff2d17 x19: ffff2d17039d8010 x18: ffffd8e2afa23b28
> [    8.470623] x17: 0000000000000015 x16: 0000000000000041 x15: 0000000000000000
> [    8.477804] x14: ffffd8e2afa49228 x13: 0000000000025700 x12: 0000000000000009
> [    8.484984] x11: ffff2d1703a96c18 x10: 0000000000000004 x9 : ffffd8e2ad6ce124
> [    8.492165] x8 : ffff2d1700be4240 x7 : ffffd8e2af981000 x6 : 0000000000000001
> [    8.499345] x5 : ffffd8e2ad1e1440 x4 : 0000000000000000 x3 : 0000000000000001
> [    8.506525] x2 : 0000000000000000 x1 : ffffd8e2ad3d2810 x0 : 00000000000040c0
> [    8.513705] Call trace:
> [    8.516161]  vsc9959_wm_enc+0x2c/0x40
> [    8.519840]  ocelot_devlink_sb_register+0x33c/0x380
> [    8.524740]  felix_setup+0x438/0x750
> [    8.528331]  dsa_register_switch+0x988/0x114c
> [    8.532708]  felix_pci_probe+0x138/0x1fc
> [    8.536648]  local_pci_probe+0x4c/0xc0
> [    8.540415]  pci_device_probe+0x1b0/0x1f0
> [    8.544443]  really_probe.part.0+0xa4/0x310
> [    8.548646]  __driver_probe_device+0xa0/0x150
> [    8.553022]  driver_probe_device+0xcc/0x164
> [    8.557225]  __device_attach_driver+0xc4/0x130
> [    8.561688]  bus_for_each_drv+0x84/0xe0
> [    8.565540]  __device_attach+0xe4/0x190
> [    8.569393]  device_initial_probe+0x20/0x30
> [    8.573594]  bus_probe_device+0xac/0xb4
> [    8.577447]  deferred_probe_work_func+0x98/0xd4
> [    8.581997]  process_one_work+0x294/0x700
> [    8.586026]  worker_thread+0x80/0x480
> [    8.589706]  kthread+0x10c/0x120
> [    8.592949]  ret_from_fork+0x10/0x20
> [    8.596541] irq event stamp: 50450
> [    8.599955] hardirqs last  enabled at (50449): [<ffffd8e2ade442b0>] _raw_spin_unlock_irqrestore+0x90/0xb0                                                                                                
> [    8.609566] hardirqs last disabled at (50450): [<ffffd8e2ade36e44>] el1_dbg+0x24/0x90
> [    8.617431] softirqs last  enabled at (50446): [<ffffd8e2ac6908f0>] __do_softirq+0x480/0x5f8
> [    8.625907] softirqs last disabled at (50435): [<ffffd8e2ac728e3c>] __irq_exit_rcu+0x17c/0x1b0
> [    8.634559] ---[ end trace 0000000000000000 ]---
> [    8.640586] device eth1 entered promiscuous mode
> [    8.645499] Unable to handle kernel paging request at virtual address 00000010400020bc
> [    8.653496] Mem abort info:
> [    8.656340]   ESR = 0x96000044
> [    8.659413]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    8.664784]   SET = 0, FnV = 0
> [    8.667855]   EA = 0, S1PTW = 0
> [    8.671044]   FSC = 0x04: level 0 translation fault
> [    8.675979] Data abort info:
> [    8.678875]   ISV = 0, ISS = 0x00000044
> [    8.682762]   CM = 0, WnR = 1
> [    8.685795] [00000010400020bc] user address but active_mm is swapper
> [    8.692272] Internal error: Oops: 96000044 [#1] PREEMPT SMP
> [    8.697865] Modules linked in:
> [    8.700928] CPU: 1 PID: 44 Comm: kworker/u4:2 Tainted: G        W         5.17.0-rc2-07010-ga9b9500ffaac-dirty #2104
> [    8.711490] Hardware name: LS1028A RDB Board (DT)
> [    8.716206] Workqueue: events_unbound deferred_probe_work_func
> [    8.722065] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    8.729051] pc : ocelot_phylink_mac_link_down+0x70/0x314
> [    8.734381] lr : felix_phylink_mac_link_down+0x24/0x30
> [    8.739536] sp : ffff800008863840
> [    8.742856] x29: ffff800008863840 x28: 0000000000000000 x27: ffffd8e2af8ef180
> [    8.750022] x26: 0000000000000001 x25: 0000001040002000 x24: 0000000000000000
> [    8.757187] x23: 0000000000000180 x22: ffff2d1703a92000 x21: 0000000000000004
> [    8.764352] x20: 0000000000000004 x19: ffff2d17039d8010 x18: ffffd8e2afa23b28
> [    8.771516] x17: 0000000000040006 x16: 0000000500030008 x15: ffffffffffffffff
> [    8.778680] x14: ffffffffffff0000 x13: ffffffffffffffff x12: 0000000000000010
> [    8.785845] x11: 0101010101010101 x10: 0000000000000004 x9 : ffffd8e2ad3d0814
> [    8.793009] x8 : fefefefefeff6a6d x7 : 0000ffffffffffff x6 : 0000000000000000
> [    8.800173] x5 : 00000000ffffffff x4 : 0000000000000001 x3 : 000000000d000007
> [    8.807338] x2 : 0000000000000010 x1 : 0000000000000000 x0 : 0000001040002000
> [    8.814502] Call trace:
> [    8.816949]  ocelot_phylink_mac_link_down+0x70/0x314
> [    8.821929]  felix_phylink_mac_link_down+0x24/0x30
> [    8.826734]  dsa_port_link_register_of+0xa8/0x240
> [    8.831454]  dsa_port_setup+0xb4/0x180
> [    8.835212]  dsa_register_switch+0xdb4/0x114c
> [    8.839581]  felix_pci_probe+0x138/0x1fc
> [    8.843515]  local_pci_probe+0x4c/0xc0
> [    8.847275]  pci_device_probe+0x1b0/0x1f0
> [    8.851296]  really_probe.part.0+0xa4/0x310
> [    8.855490]  __driver_probe_device+0xa0/0x150
> [    8.859860]  driver_probe_device+0xcc/0x164
> [    8.864054]  __device_attach_driver+0xc4/0x130
> [    8.868510]  bus_for_each_drv+0x84/0xe0
> [    8.872355]  __device_attach+0xe4/0x190
> [    8.876200]  device_initial_probe+0x20/0x30
> [    8.880395]  bus_probe_device+0xac/0xb4
> [    8.884240]  deferred_probe_work_func+0x98/0xd4
> [    8.888783]  process_one_work+0x294/0x700
> [    8.892808]  worker_thread+0x80/0x480
> [    8.896480]  kthread+0x10c/0x120
> [    8.899715]  ret_from_fork+0x10/0x20
> [    8.903303] Code: 52800202 f874d839 52800001 aa1903e0 (b900bf25)
> [    8.909417] ---[ end trace 0000000000000000 ]---
> 
> Investigating...

I just did a sanity check on my latest tree and it doesn't crash. I'll
keep an eye out here as well.

Gives me an opportunity to fix up your other suggestion.

Thanks for reviewing and testing Vladimir!
Vladimir Oltean Feb. 8, 2022, 2:53 p.m. UTC | #3
On Tue, Feb 08, 2022 at 05:55:32AM -0800, Colin Foster wrote:
> On Tue, Feb 08, 2022 at 01:30:16PM +0000, Vladimir Oltean wrote:
> > On Mon, Feb 07, 2022 at 08:46:41PM -0800, Colin Foster wrote:
> > > Ocelot loops over memory regions to gather stats on different ports.
> > > These regions are mostly continuous, and are ordered. This patch set
> > > uses that information to break the stats reads into regions that can get
> > > read in bulk.
> > >
> > > The motiviation is for general cleanup, but also for SPI. Performing two
> > > back-to-back reads on a SPI bus require toggling the CS line, holding,
> > > re-toggling the CS line, sending 3 address bytes, sending N padding
> > > bytes, then actually performing the read. Bulk reads could reduce almost
> > > all of that overhead, but require that the reads are performed via
> > > regmap_bulk_read.
> > >
> > > v1 > v2: reword commit messages
> > > v2 > v3: correctly mark this for net-next when sending
> > > v3 > v4: calloc array instead of zalloc per review
> > > v4 > v5:
> > >     Apply CR suggestions for whitespace
> > >     Fix calloc / zalloc mixup
> > >     Properly destroy workqueues
> > >     Add third commit to split long macros
> > >
> > >
> > > Colin Foster (3):
> > >   net: ocelot: align macros for consistency
> > >   net: mscc: ocelot: add ability to perform bulk reads
> > >   net: mscc: ocelot: use bulk reads for stats
> > >
> > >  drivers/net/ethernet/mscc/ocelot.c    | 78 ++++++++++++++++++++++-----
> > >  drivers/net/ethernet/mscc/ocelot_io.c | 13 +++++
> > >  include/soc/mscc/ocelot.h             | 57 ++++++++++++++------
> > >  3 files changed, 120 insertions(+), 28 deletions(-)
> > >
> > > --
> > > 2.25.1
> > >
> > 
> > Please do not merge these yet. I gave them a run on my board and the
> > kernel crashed on boot.
> > 
> > [    8.043170] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 0
> > [    8.050241] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 1
> > [    8.057142] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 2
> > [    8.064021] mscc_felix 0000:00:00.5: Found PCS at internal MDIO address 3
> > [    8.128668] ------------[ cut here ]------------
> > [    8.133315] WARNING: CPU: 1 PID: 44 at drivers/net/dsa/ocelot/felix_vsc9959.c:1007 vsc9959_wm_enc+0x2c/0x40
> > [    8.143107] Modules linked in:
> > [    8.146181] CPU: 1 PID: 44 Comm: kworker/u4:2 Not tainted 5.17.0-rc2-07010-ga9b9500ffaac-dirty #2104
> > [    8.155355] Hardware name: LS1028A RDB Board (DT)
> > [    8.160079] Workqueue: events_unbound deferred_probe_work_func
> > [    8.165945] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [    8.172940] pc : vsc9959_wm_enc+0x2c/0x40
> > [    8.176967] lr : ocelot_setup_sharing_watermarks+0x94/0x1fc
> > [    8.182568] sp : ffff800008863810
> > [    8.185896] x29: ffff800008863810 x28: 0000000000000308 x27: 0000000000000001
> > [    8.193079] x26: 000000000300001a x25: 0000000000000008 x24: 0000000000000080
> > [    8.200261] x23: 0000000088888889 x22: 0000000000000000 x21: 0000000000000000
> > [    8.207441] x20: 00000000ffff2d17 x19: ffff2d17039d8010 x18: ffffd8e2afa23b28
> > [    8.214623] x17: 0000000000000015 x16: 0000000000000041 x15: 0000000000000000
> > [    8.221803] x14: ffffd8e2afa49228 x13: 0000000000025700 x12: 0000000000000009
> > [    8.228984] x11: ffff2d1703a96c18 x10: 0000000000000004 x9 : ffffd8e2ad6ce0f8
> > [    8.236165] x8 : ffff2d1700be4240 x7 : ffffd8e2af981000 x6 : 0000000000000001
> > [    8.243345] x5 : ffffd8e2ad1e1440 x4 : 0000000000000000 x3 : 0000000000000000
> > [    8.250525] x2 : 0000000000000000 x1 : ffffd8e2ad3d2810 x0 : 00000000000040c0
> > [    8.257706] Call trace:
> > [    8.260162]  vsc9959_wm_enc+0x2c/0x40
> > [    8.263841]  ocelot_devlink_sb_register+0x33c/0x380
> > [    8.268742]  felix_setup+0x438/0x750
> > [    8.272334]  dsa_register_switch+0x988/0x114c
> > [    8.276713]  felix_pci_probe+0x138/0x1fc
> > [    8.280654]  local_pci_probe+0x4c/0xc0
> > [    8.284423]  pci_device_probe+0x1b0/0x1f0
> > [    8.288451]  really_probe.part.0+0xa4/0x310
> > [    8.292654]  __driver_probe_device+0xa0/0x150
> > [    8.297030]  driver_probe_device+0xcc/0x164
> > [    8.301231]  __device_attach_driver+0xc4/0x130
> > [    8.305695]  bus_for_each_drv+0x84/0xe0
> > [    8.309547]  __device_attach+0xe4/0x190
> > [    8.313400]  device_initial_probe+0x20/0x30
> > [    8.317601]  bus_probe_device+0xac/0xb4
> > [    8.321454]  deferred_probe_work_func+0x98/0xd4
> > [    8.326004]  process_one_work+0x294/0x700
> > [    8.330037]  worker_thread+0x80/0x480
> > [    8.333717]  kthread+0x10c/0x120
> > [    8.336961]  ret_from_fork+0x10/0x20
> > [    8.340554] irq event stamp: 50432
> > [    8.343968] hardirqs last  enabled at (50431): [<ffffd8e2ade442b0>] _raw_spin_unlock_irqrestore+0x90/0xb0
> > [    8.353581] hardirqs last disabled at (50432): [<ffffd8e2ade36e44>] el1_dbg+0x24/0x90
> > [    8.361448] softirqs last  enabled at (50148): [<ffffd8e2ac6908f0>] __do_softirq+0x480/0x5f8
> > [    8.369923] softirqs last disabled at (50143): [<ffffd8e2ac728e3c>] __irq_exit_rcu+0x17c/0x1b0
> > [    8.378577] ---[ end trace 0000000000000000 ]---
> > [    8.383304] ------------[ cut here ]------------
> > [    8.387942] WARNING: CPU: 1 PID: 44 at drivers/net/dsa/ocelot/felix_vsc9959.c:1007 vsc9959_wm_enc+0x2c/0x40
> > [    8.397729] Modules linked in:
> > [    8.400800] CPU: 1 PID: 44 Comm: kworker/u4:2 Tainted: G        W         5.17.0-rc2-07010-ga9b9500ffaac-dirty #2104
> > [    8.411369] Hardware name: LS1028A RDB Board (DT)
> > [    8.416092] Workqueue: events_unbound deferred_probe_work_func
> > [    8.421955] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [    8.428948] pc : vsc9959_wm_enc+0x2c/0x40
> > [    8.432975] lr : ocelot_setup_sharing_watermarks+0xc0/0x1fc
> > [    8.438573] sp : ffff800008863810
> > [    8.441900] x29: ffff800008863810 x28: 0000000000000308 x27: 0000000000000001
> > [    8.449081] x26: 000000000300001a x25: 0000000000000008 x24: 0000000000000080
> > [    8.456262] x23: 0000000088888889 x22: 0000000000000000 x21: 0000000000000000
> > [    8.463443] x20: 00000000ffff2d17 x19: ffff2d17039d8010 x18: ffffd8e2afa23b28
> > [    8.470623] x17: 0000000000000015 x16: 0000000000000041 x15: 0000000000000000
> > [    8.477804] x14: ffffd8e2afa49228 x13: 0000000000025700 x12: 0000000000000009
> > [    8.484984] x11: ffff2d1703a96c18 x10: 0000000000000004 x9 : ffffd8e2ad6ce124
> > [    8.492165] x8 : ffff2d1700be4240 x7 : ffffd8e2af981000 x6 : 0000000000000001
> > [    8.499345] x5 : ffffd8e2ad1e1440 x4 : 0000000000000000 x3 : 0000000000000001
> > [    8.506525] x2 : 0000000000000000 x1 : ffffd8e2ad3d2810 x0 : 00000000000040c0
> > [    8.513705] Call trace:
> > [    8.516161]  vsc9959_wm_enc+0x2c/0x40
> > [    8.519840]  ocelot_devlink_sb_register+0x33c/0x380
> > [    8.524740]  felix_setup+0x438/0x750
> > [    8.528331]  dsa_register_switch+0x988/0x114c
> > [    8.532708]  felix_pci_probe+0x138/0x1fc
> > [    8.536648]  local_pci_probe+0x4c/0xc0
> > [    8.540415]  pci_device_probe+0x1b0/0x1f0
> > [    8.544443]  really_probe.part.0+0xa4/0x310
> > [    8.548646]  __driver_probe_device+0xa0/0x150
> > [    8.553022]  driver_probe_device+0xcc/0x164
> > [    8.557225]  __device_attach_driver+0xc4/0x130
> > [    8.561688]  bus_for_each_drv+0x84/0xe0
> > [    8.565540]  __device_attach+0xe4/0x190
> > [    8.569393]  device_initial_probe+0x20/0x30
> > [    8.573594]  bus_probe_device+0xac/0xb4
> > [    8.577447]  deferred_probe_work_func+0x98/0xd4
> > [    8.581997]  process_one_work+0x294/0x700
> > [    8.586026]  worker_thread+0x80/0x480
> > [    8.589706]  kthread+0x10c/0x120
> > [    8.592949]  ret_from_fork+0x10/0x20
> > [    8.596541] irq event stamp: 50450
> > [    8.599955] hardirqs last  enabled at (50449): [<ffffd8e2ade442b0>] _raw_spin_unlock_irqrestore+0x90/0xb0                                                                                                
> > [    8.609566] hardirqs last disabled at (50450): [<ffffd8e2ade36e44>] el1_dbg+0x24/0x90
> > [    8.617431] softirqs last  enabled at (50446): [<ffffd8e2ac6908f0>] __do_softirq+0x480/0x5f8
> > [    8.625907] softirqs last disabled at (50435): [<ffffd8e2ac728e3c>] __irq_exit_rcu+0x17c/0x1b0
> > [    8.634559] ---[ end trace 0000000000000000 ]---
> > [    8.640586] device eth1 entered promiscuous mode
> > [    8.645499] Unable to handle kernel paging request at virtual address 00000010400020bc
> > [    8.653496] Mem abort info:
> > [    8.656340]   ESR = 0x96000044
> > [    8.659413]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [    8.664784]   SET = 0, FnV = 0
> > [    8.667855]   EA = 0, S1PTW = 0
> > [    8.671044]   FSC = 0x04: level 0 translation fault
> > [    8.675979] Data abort info:
> > [    8.678875]   ISV = 0, ISS = 0x00000044
> > [    8.682762]   CM = 0, WnR = 1
> > [    8.685795] [00000010400020bc] user address but active_mm is swapper
> > [    8.692272] Internal error: Oops: 96000044 [#1] PREEMPT SMP
> > [    8.697865] Modules linked in:
> > [    8.700928] CPU: 1 PID: 44 Comm: kworker/u4:2 Tainted: G        W         5.17.0-rc2-07010-ga9b9500ffaac-dirty #2104
> > [    8.711490] Hardware name: LS1028A RDB Board (DT)
> > [    8.716206] Workqueue: events_unbound deferred_probe_work_func
> > [    8.722065] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [    8.729051] pc : ocelot_phylink_mac_link_down+0x70/0x314
> > [    8.734381] lr : felix_phylink_mac_link_down+0x24/0x30
> > [    8.739536] sp : ffff800008863840
> > [    8.742856] x29: ffff800008863840 x28: 0000000000000000 x27: ffffd8e2af8ef180
> > [    8.750022] x26: 0000000000000001 x25: 0000001040002000 x24: 0000000000000000
> > [    8.757187] x23: 0000000000000180 x22: ffff2d1703a92000 x21: 0000000000000004
> > [    8.764352] x20: 0000000000000004 x19: ffff2d17039d8010 x18: ffffd8e2afa23b28
> > [    8.771516] x17: 0000000000040006 x16: 0000000500030008 x15: ffffffffffffffff
> > [    8.778680] x14: ffffffffffff0000 x13: ffffffffffffffff x12: 0000000000000010
> > [    8.785845] x11: 0101010101010101 x10: 0000000000000004 x9 : ffffd8e2ad3d0814
> > [    8.793009] x8 : fefefefefeff6a6d x7 : 0000ffffffffffff x6 : 0000000000000000
> > [    8.800173] x5 : 00000000ffffffff x4 : 0000000000000001 x3 : 000000000d000007
> > [    8.807338] x2 : 0000000000000010 x1 : 0000000000000000 x0 : 0000001040002000
> > [    8.814502] Call trace:
> > [    8.816949]  ocelot_phylink_mac_link_down+0x70/0x314
> > [    8.821929]  felix_phylink_mac_link_down+0x24/0x30
> > [    8.826734]  dsa_port_link_register_of+0xa8/0x240
> > [    8.831454]  dsa_port_setup+0xb4/0x180
> > [    8.835212]  dsa_register_switch+0xdb4/0x114c
> > [    8.839581]  felix_pci_probe+0x138/0x1fc
> > [    8.843515]  local_pci_probe+0x4c/0xc0
> > [    8.847275]  pci_device_probe+0x1b0/0x1f0
> > [    8.851296]  really_probe.part.0+0xa4/0x310
> > [    8.855490]  __driver_probe_device+0xa0/0x150
> > [    8.859860]  driver_probe_device+0xcc/0x164
> > [    8.864054]  __device_attach_driver+0xc4/0x130
> > [    8.868510]  bus_for_each_drv+0x84/0xe0
> > [    8.872355]  __device_attach+0xe4/0x190
> > [    8.876200]  device_initial_probe+0x20/0x30
> > [    8.880395]  bus_probe_device+0xac/0xb4
> > [    8.884240]  deferred_probe_work_func+0x98/0xd4
> > [    8.888783]  process_one_work+0x294/0x700
> > [    8.892808]  worker_thread+0x80/0x480
> > [    8.896480]  kthread+0x10c/0x120
> > [    8.899715]  ret_from_fork+0x10/0x20
> > [    8.903303] Code: 52800202 f874d839 52800001 aa1903e0 (b900bf25)
> > [    8.909417] ---[ end trace 0000000000000000 ]---
> > 
> > Investigating...
> 
> I just did a sanity check on my latest tree and it doesn't crash. I'll
> keep an eye out here as well.
> 
> Gives me an opportunity to fix up your other suggestion.
> 
> Thanks for reviewing and testing Vladimir!

I don't know why this glitch happened. I couldn't reproduce this a
second time. Booting patch by patch with KASAN enabled also showed
nothing. And the stack traces don't make much sense either. I'll just
ignore it for now.

But anyway, I found some actual bugs in the code while testing, I'll
reply on the individual patch.