mbox series

[net-next,0/7] devlink: params cleanups and devl_param_driverinit_value_get() fix

Message ID 20230209154308.2984602-1-jiri@resnulli.us (mailing list archive)
Headers show
Series devlink: params cleanups and devl_param_driverinit_value_get() fix | expand

Message

Jiri Pirko Feb. 9, 2023, 3:43 p.m. UTC
From: Jiri Pirko <jiri@nvidia.com>

The primary motivation of this patchset is the patch #6, which fixes an
issue introduced by 075935f0ae0f ("devlink: protect devlink param list
by instance lock") and reported by Kim Phillips <kim.phillips@amd.com>
(https://lore.kernel.org/netdev/719de4f0-76ac-e8b9-38a9-167ae239efc7@amd.com/)
and my colleagues doing mlx5 driver regression testing.

The basis idea is that devl_param_driverinit_value_get() could be
possible to the called without holding devlink intance lock in
most of the cases (all existing ones in the current codebase),
which would fix some mlx5 flows where the lock is not held.

To achieve that, make sure that the param value does not change between
reloads with patch #2.

Also, convert the param list to xarray which removes the worry about
list_head consistency when doing lockless lookup.

The rest of the patches are doing some small related cleanup of things
that poke me in the eye during the work.

Jiri Pirko (7):
  devlink: don't use strcpy() to copy param value
  devlink: make sure driver does not read updated driverinit param
    before reload
  devlink: fix the name of value arg of
    devl_param_driverinit_value_get()
  devlink: use xa_for_each_start() helper in
    devlink_nl_cmd_port_get_dump_one()
  devlink: convert param list to xarray
  devlink: allow to call devl_param_driverinit_value_get() without
    holding instance lock
  devlink: add forgotten devlink instance lock assertion to
    devl_param_driverinit_value_set()

 include/net/devlink.h       |   6 +-
 net/devlink/core.c          |   4 +-
 net/devlink/dev.c           |   2 +
 net/devlink/devl_internal.h |   5 +-
 net/devlink/leftover.c      | 139 ++++++++++++++++++++----------------
 5 files changed, 90 insertions(+), 66 deletions(-)

Comments

Kim Phillips Feb. 9, 2023, 9:05 p.m. UTC | #1
On 2/9/23 9:43 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@nvidia.com>
> 
> The primary motivation of this patchset is the patch #6, which fixes an
> issue introduced by 075935f0ae0f ("devlink: protect devlink param list
> by instance lock") and reported by Kim Phillips <kim.phillips@amd.com>
> (https://lore.kernel.org/netdev/719de4f0-76ac-e8b9-38a9-167ae239efc7@amd.com/)
> and my colleagues doing mlx5 driver regression testing.

I can't provide my Tested-by because this series doesn't apply
cleanly to today's (or the original day's) linux-next tag, and
today's net-next/{master,main} (5131a053f292) won't boot on any
of my systems, whether or not this series is applied, with:

[   19.478836] ------------[ cut here ]------------
[   19.483474] kernel BUG at mm/usercopy.c:102!
[   19.487761] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[   19.492988] CPU: 196 PID: 1903 Comm: systemd-udevd Not tainted 6.2.0-rc6+ #119
[   19.500204] Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RDY1009A 09/16/2020
[   19.508284] RIP: 0010:usercopy_abort+0x7f/0x81
[   19.512738] Code: 4c 0f 45 de 51 4c 89 d1 48 c7 c2 7b 66 b6 9c 57 48 c7 c6 a8 b4 b5 9c 48 c7 c7 b8 4b c0 9c 48 0f 45 f2 4c 89 da e8 1e 56 ff ff <0f> 0b 49 89 d8 4c 89 c9 44 89 ea 31 f6 48 c7 c7 c5 66 b6 9c e8 68
[   19.531484] RSP: 0018:ffffaeabdc8d3ad0 EFLAGS: 00010246
[   19.536708] RAX: 000000000000006b RBX: 0000000000000053 RCX: 0000000000000000
[   19.543833] RDX: 0000000000000000 RSI: 00000000ffdfffff RDI: 00000000ffffffff
[   19.550964] RBP: ffffaeabdc8d3ae8 R08: 0000000000000000 R09: ffffaeabdc8d3950
[   19.558088] R10: 0000000000000001 R11: 0000000000000001 R12: ffff93700bf72a80
[   19.565214] R13: 0000000000000001 R14: ffff93700bf72ad3 R15: 0000000000000040
[   19.572346] FS:  00007f2b4b275880(0000) GS:ffff938e80c00000(0000) knlGS:0000000000000000
[   19.580432] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   19.586177] CR2: 00007ffc48657b08 CR3: 000080108f998000 CR4: 0000000000350ee0
[   19.593300] Call Trace:
[   19.595746]  <TASK>
[   19.597853]  __check_heap_object+0x9f/0xe0
[   19.601950]  __check_object_size+0x1f7/0x220
[   19.606221]  simple_copy_to_iter+0x2f/0x60
[   19.610323]  __skb_datagram_iter+0x78/0x2e0
[   19.614507]  ? __pfx_simple_copy_to_iter+0x10/0x10
[   19.619300]  ? __skb_recv_datagram+0x8b/0xc0
[   19.623574]  skb_copy_datagram_iter+0x66/0xe0
[   19.627933]  netlink_recvmsg+0xd1/0x400
[   19.631772]  ? apparmor_socket_recvmsg+0x22/0x30
[   19.636391]  sock_recvmsg+0xaa/0xb0
[   19.639882]  ____sys_recvmsg+0x9b/0x200
[   19.643715]  ? import_iovec+0x1f/0x30
[   19.647379]  ? copy_msghdr_from_user+0x77/0xb0
[   19.651817]  ___sys_recvmsg+0x80/0xc0
[   19.655483]  ? __lock_acquire.isra.0+0x123/0x540
[   19.660102]  ? sched_clock+0xd/0x20
[   19.663595]  __sys_recvmsg+0x66/0xc0
[   19.667176]  __x64_sys_recvmsg+0x23/0x30
[   19.671099]  do_syscall_64+0x3f/0x90
[   19.674680]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[   19.679731] RIP: 0033:0x7f2b4b8a9487
[   19.683311] Code: 64 89 02 48 c7 c0 ff ff ff ff eb c1 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2f 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[   19.702057] RSP: 002b:00007ffc48657968 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
[   19.709623] RAX: ffffffffffffffda RBX: 000055cb525ae070 RCX: 00007f2b4b8a9487
[   19.716755] RDX: 0000000000000000 RSI: 00007ffc48657a10 RDI: 0000000000000005
[   19.723888] RBP: 00007ffc48659b40 R08: 00000000ffffffff R09: 0000000000000020
[   19.731010] R10: 000055cb525ae818 R11: 0000000000000246 R12: 0000000000000000
[   19.738134] R13: 00007ffc48657af0 R14: 000055cb522372a0 R15: 000055cb525ae1c0
[   19.745262]  </TASK>
[   19.747453] Modules linked in:
[   19.750524] ---[ end trace 0000000000000000 ]---
[   19.755148] RIP: 0010:usercopy_abort+0x7f/0x81
[   19.759598] Code: 4c 0f 45 de 51 4c 89 d1 48 c7 c2 7b 66 b6 9c 57 48 c7 c6 a8 b4 b5 9c 48 c7 c7 b8 4b c0 9c 48 0f 45 f2 4c 89 da e8 1e 56 ff ff <0f> 0b 49 89 d8 4c 89 c9 44 89 ea 31 f6 48 c7 c7 c5 66 b6 9c e8 68
[   19.778347] RSP: 0018:ffffaeabdc8d3ad0 EFLAGS: 00010246
[   19.783571] RAX: 000000000000006b RBX: 0000000000000053 RCX: 0000000000000000
[   19.790703] RDX: 0000000000000000 RSI: 00000000ffdfffff RDI: 00000000ffffffff
[   19.797838] RBP: ffffaeabdc8d3ae8 R08: 0000000000000000 R09: ffffaeabdc8d3950
[   19.804971] R10: 0000000000000001 R11: 0000000000000001 R12: ffff93700bf72a80
[   19.812102] R13: 0000000000000001 R14: ffff93700bf72ad3 R15: 0000000000000040
[   19.819236] FS:  00007f2b4b275880(0000) GS:ffff938e80c00000(0000) knlGS:0000000000000000
[   19.827330] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   19.833075] CR2: 00007ffc48657b08 CR3: 000080108f998000 CR4: 0000000000350ee0
[   19.840545] printk: systemd-udevd: 23 output lines suppressed due to ratelimiting
[   19.864222] ------------[ cut here ]------------
[   19.868843] WARNING: CPU: 68 PID: 0 at net/netlink/af_netlink.c:414 netlink_sock_destruct+0xa1/0xc0
[   19.877886] Modules linked in:
[   19.880947] CPU: 68 PID: 0 Comm: swapper/68 Tainted: G      D            6.2.0-rc6+ #119
[   19.889033] Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RDY1009A 09/16/2020
[   19.897119] RIP: 0010:netlink_sock_destruct+0xa1/0xc0
[   19.902172] Code: 29 41 8b 84 24 9c 02 00 00 85 c0 75 2b 49 83 bc 24 68 05 00 00 00 75 08 41 5c 5d e9 bd ac 28 00 0f 0b 41 5c 5d e9 b3 ac 28 00 <0f> 0b 41 8b 84 24 9c 02 00 00 85 c0 74 d5 0f 0b eb d1 66 66 2e 0f
[   19.920916] RSP: 0018:ffffaeab87e3ce38 EFLAGS: 00010202
[   19.926140] RAX: 0000000000000380 RBX: ffff937ed3c27d30 RCX: 0000000000000000
[   19.933274] RDX: 0000000000000102 RSI: ffffffff9bf21a23 RDI: 0000000000000000
[   19.940406] RBP: ffffaeab87e3ce40 R08: 0000000000000001 R09: 0000000000000000
[   19.947539] R10: 0000000000000000 R11: 0000000000000000 R12: ffff937ed3c27800
[   19.954672] R13: ffff937ed3c27f30 R14: ffff937ed3304b80 R15: ffffaeab87e3cf20
[   19.961804] FS:  0000000000000000(0000) GS:ffff938e70c00000(0000) knlGS:0000000000000000
[   19.969889] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   19.975637] CR2: 0000000000000000 CR3: 0000800ade212000 CR4: 0000000000350ee0
[   19.982771] Call Trace:
[   19.985221]  <IRQ>
[   19.987237]  __sk_destruct+0x33/0x230
[   19.990906]  ? deferred_put_nlk_sk+0x1d/0x100
[   19.995265]  sk_destruct+0x52/0x60
[   19.998672]  __sk_free+0x30/0xd0
[   20.001902]  sk_free+0x2e/0x50
[   20.004964]  deferred_put_nlk_sk+0x6b/0x100
[   20.009148]  rcu_core+0x4c2/0x7a0
[   20.012467]  ? rcu_core+0x47e/0x7a0
[   20.015961]  rcu_core_si+0x12/0x20
[   20.019366]  __do_softirq+0x11f/0x353
[   20.023034]  irq_exit_rcu+0xaf/0xe0
[   20.026533]  sysvec_apic_timer_interrupt+0xb4/0xd0
[   20.031326]  </IRQ>
[   20.033432]  <TASK>
[   20.035536]  asm_sysvec_apic_timer_interrupt+0x1f/0x30
[   20.040679] RIP: 0010:cpuidle_enter_state+0x126/0x4d0
[   20.045738] Code: 00 31 ff e8 bc f9 46 ff 80 7d d7 00 74 16 9c 58 0f 1f 40 00 f6 c4 02 0f 85 8e 03 00 00 31 ff e8 d0 19 4f ff fb 0f 1f 44 00 00 <45> 85 ff 0f 88 d9 01 00 00 49 63 c7 4c 2b 75 c8 48 8d 14 40 48 8d
[   20.064484] RSP: 0018:ffffaeab80797e48 EFLAGS: 00000246
[   20.069711] RAX: ffff938e70c00000 RBX: 0000000000000002 RCX: 000000000000001f
[   20.076844] RDX: 0000000000000000 RSI: ffffffff9cb59380 RDI: ffffffff9cb5eaaf
[   20.083973] RBP: ffffaeab80797e80 R08: 000000049ffc9237 R09: 0000000000000e04
[   20.091110] R10: ffff938e70df3964 R11: ffff938e70df3944 R12: ffff9370025afc00
[   20.098240] R13: ffffffff9d35b180 R14: 000000049ffc9237 R15: 0000000000000002
[   20.105377]  ? cpuidle_enter_state+0x104/0x4d0
[   20.109819]  cpuidle_enter+0x32/0x50
[   20.113397]  call_cpuidle+0x23/0x50
[   20.116892]  do_idle+0x1d4/0x250
[   20.120123]  cpu_startup_entry+0x24/0x30
[   20.124051]  start_secondary+0x114/0x130
[   20.127974]  secondary_startup_64_no_verify+0xd3/0xdb
[   20.133032]  </TASK>
[   20.135218] ---[ end trace 0000000000000000 ]---
[   21.988002] raid6: avx2x4   gen() 27873 MB/s
[   22.060002] raid6: avx2x2   gen() 28291 MB/s
[   22.132001] raid6: avx2x1   gen() 23642 MB/s
[   22.136275] raid6: using algorithm avx2x2 gen() 28291 MB/s
[   22.208001] raid6: .... xor() 17406 MB/s, rmw enabled
[   22.213051] raid6: using avx2x2 recovery algorithm
[   22.222249] xor: automatically using best checksumming function   avx
[   22.234862] async_tx: api initialized (async)
[   22.555209] Btrfs loaded, crc32c=crc32c-intel, zoned=yes, fsverity=yes

(it then drops to the initramfs prompt).

Is there a different tree the series can be rebased on, until net-next
gets fixed?

Thanks,

Kim
Jakub Kicinski Feb. 9, 2023, 9:31 p.m. UTC | #2
On Thu, 9 Feb 2023 15:05:46 -0600 Kim Phillips wrote:
> Is there a different tree the series can be rebased on, until net-next
> gets fixed?

merge in net-next, the fix should be there but was merged a couple of
hours ago so probably not yet in linux-next
Kim Phillips Feb. 9, 2023, 10:37 p.m. UTC | #3
On 2/9/23 3:31 PM, Jakub Kicinski wrote:
> On Thu, 9 Feb 2023 15:05:46 -0600 Kim Phillips wrote:
>> Is there a different tree the series can be rebased on, until net-next
>> gets fixed?
> 
> merge in net-next, the fix should be there but was merged a couple of
> hours ago so probably not yet in linux-next

I=Ok, I took next-20230209, git merged net-next/master, fixed a merge
conflict to use the latter net-next/master version:

<<<<<<< HEAD
	if (err == NOTIFY_BAD) {
		dl_trap->trap.action = action_orig;
		err = trap_event_ctx.err;
	}
out:
	return err;
=======
	if (err == NOTIFY_BAD)
		dl_trap->trap.action = action_orig;

	return trap_event_ctx.err;
 >>>>>>> net-next/master

...and unfortunately still get a splat on that same Rome system:

[   22.647832] mlx5_core 0000:21:00.0: firmware version: 14.22.1002
[   22.653879] mlx5_core 0000:21:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[   23.228950] mlx5_core 0000:21:00.0: E-Switch: Total vports 10, per vport: max uc(1024) max mc(16384)
[   23.245100] mlx5_core 0000:21:00.0: Port module event: module 0, Cable plugged
[   23.570053] mlx5_core 0000:21:00.0: Supported tc offload range - chains: 1, prios: 1
[   23.577812] mlx5_core 0000:21:00.0: mlx5e_tc_post_act_init:40:(pid 9): firmware level support is missing
[   23.594377] mlx5_core 0000:21:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0 basic)
[   23.605492] mlx5_core 0000:21:00.1: firmware version: 14.22.1002
[   23.611536] mlx5_core 0000:21:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[   24.199756] mlx5_core 0000:21:00.1: E-Switch: Total vports 10, per vport: max uc(1024) max mc(16384)
[   24.216876] mlx5_core 0000:21:00.1: Port module event: module 1, Cable unplugged
[   24.555670] mlx5_core 0000:21:00.1: Supported tc offload range - chains: 1, prios: 1
[   24.563428] mlx5_core 0000:21:00.1: mlx5e_tc_post_act_init:40:(pid 9): firmware level support is missing
[   24.580084] mlx5_core 0000:21:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0 basic)
[   24.593808] systemd-udevd[1974]: Using default interface naming scheme 'v245'.
[   24.602595] systemd-udevd[1974]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
[   24.613314] mlx5_core 0000:21:00.0 enp33s0f0np0: renamed from eth0
[   24.701259] ------------[ cut here ]------------
[   24.705888] WARNING: CPU: 228 PID: 2318 at net/devlink/leftover.c:9643 devl_param_driverinit_value_get+0xe5/0x1f0
[   24.716153] Modules linked in: mlx5_ib(+) ib_uverbs ib_core mlx5_core ast i2c_algo_bit drm_shmem_helper hid_generic drm_kms_helper syscopyarea sysfillrect sysimgblt usbhid pci_hyperv_intf crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd cryptd mlxfw hid psample drm ahci tls libahci i2c_piix4 wmi
[   24.745589] CPU: 228 PID: 2318 Comm: systemd-udevd Not tainted 6.2.0-rc7-next-20230209+ #4
[   24.753856] Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RDY1009A 09/16/2020
[   24.761943] RIP: 0010:devl_param_driverinit_value_get+0xe5/0x1f0
[   24.767955] Code: 00 5b b8 ea ff ff ff 41 5c 41 5d 5d e9 58 cd 08 00 48 8d bf 28 02 00 00 be ff ff ff ff e8 03 2a 07 00 85 c0 0f 85 43 ff ff ff <0f> 0b 49 8b 84 24 18 01 00 00 48 83 78 18 00 0f 85 41 ff ff ff 0f
[   24.786702] RSP: 0018:ffffc217dfff7a28 EFLAGS: 00010246
[   24.791925] RAX: 0000000000000000 RBX: 0000000000000009 RCX: 0000000000000000
[   24.799058] RDX: 0000000000000000 RSI: ffff9d7458b00228 RDI: ffff9d835f588d50
[   24.806194] RBP: ffffc217dfff7a40 R08: 0000000000000000 R09: ffff9d8316157c00
[   24.813325] R10: 0000000000000001 R11: 0000000000000000 R12: ffff9d7458b00000
[   24.820455] R13: ffffc217dfff7a50 R14: 0000000000000001 R15: 0000000000000002
[   24.827589] FS:  00007f03c4b0a880(0000) GS:ffff9d92c8c00000(0000) knlGS:0000000000000000
[   24.835677] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   24.841422] CR2: 00007ffd0c160f48 CR3: 000080109f420000 CR4: 0000000000350ee0
[   24.848557] Call Trace:
[   24.851003]  <TASK>
[   24.853117]  mlx5_is_roce_on+0x3a/0xb0 [mlx5_core]
[   24.858010]  ? __kmalloc+0x53/0x1b0
[   24.861512]  mlx5r_probe+0x149/0x170 [mlx5_ib]
[   24.865974]  ? __pfx_mlx5r_probe+0x10/0x10 [mlx5_ib]
[   24.870957]  auxiliary_bus_probe+0x45/0xa0
[   24.875059]  really_probe+0x17b/0x3e0
[   24.878731]  __driver_probe_device+0x7e/0x180
[   24.883090]  driver_probe_device+0x23/0x80
[   24.887191]  __driver_attach+0xcb/0x1a0
[   24.891027]  ? __pfx___driver_attach+0x10/0x10
[   24.895475]  bus_for_each_dev+0x89/0xd0
[   24.899311]  driver_attach+0x22/0x30
[   24.902894]  bus_add_driver+0x1b9/0x240
[   24.906735]  driver_register+0x66/0x130
[   24.910584]  __auxiliary_driver_register+0x73/0xe0
[   24.915385]  mlx5_ib_init+0xda/0x110 [mlx5_ib]
[   24.919846]  ? __pfx_init_module+0x10/0x10 [mlx5_ib]
[   24.924831]  do_one_initcall+0x7a/0x2b0
[   24.928677]  ? kmalloc_trace+0x2e/0xe0
[   24.932433]  do_init_module+0x6a/0x260
[   24.936191]  load_module+0x1e90/0x2050
[   24.939942]  ? ima_post_read_file+0xd6/0xf0
[   24.944138]  __do_sys_finit_module+0xc8/0x140
[   24.948497]  ? __do_sys_finit_module+0xc8/0x140
[   24.953036]  __x64_sys_finit_module+0x1e/0x30
[   24.957399]  do_syscall_64+0x3f/0x90
[   24.960987]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[   24.966047] RIP: 0033:0x7f03c513673d
[   24.969628] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 23 37 0d 00 f7 d8 64 89 01 48
[   24.988380] RSP: 002b:00007ffd0c1665f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[   24.995943] RAX: ffffffffffffffda RBX: 0000556e1aec4d30 RCX: 00007f03c513673d
[   25.003078] RDX: 0000000000000000 RSI: 00007f03c5016ded RDI: 000000000000000e
[   25.010210] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000556e1ae664e8
[   25.017343] R10: 000000000000000e R11: 0000000000000246 R12: 00007f03c5016ded
[   25.024477] R13: 0000000000000000 R14: 0000556e1aeee320 R15: 0000556e1aec4d30
[   25.031621]  </TASK>
[   25.033815] ---[ end trace 0000000000000000 ]---
[   25.072333] ------------[ cut here ]------------
[   25.076971] WARNING: CPU: 100 PID: 2318 at net/devlink/leftover.c:9643 devl_param_driverinit_value_get+0xe5/0x1f0
[   25.087406] Modules linked in: mlx5_ib(+) ib_uverbs ib_core mlx5_core ast i2c_algo_bit drm_shmem_helper hid_generic drm_kms_helper syscopyarea sysfillrect sysimgblt usbhid pci_hyperv_intf crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd cryptd mlxfw hid psample drm ahci tls libahci i2c_piix4 wmi
[   25.116844] CPU: 100 PID: 2318 Comm: systemd-udevd Tainted: G        W          6.2.0-rc7-next-20230209+ #4
[   25.126576] Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RDY1009A 09/16/2020
[   25.134665] RIP: 0010:devl_param_driverinit_value_get+0xe5/0x1f0
[   25.140676] Code: 00 5b b8 ea ff ff ff 41 5c 41 5d 5d e9 58 cd 08 00 48 8d bf 28 02 00 00 be ff ff ff ff e8 03 2a 07 00 85 c0 0f 85 43 ff ff ff <0f> 0b 49 8b 84 24 18 01 00 00 48 83 78 18 00 0f 85 41 ff ff ff 0f
[   25.159421] RSP: 0018:ffffc217dfff7a28 EFLAGS: 00010246
[   25.164646] RAX: 0000000000000000 RBX: 0000000000000009 RCX: 0000000000000000
[   25.171779] RDX: 0000000000000000 RSI: ffff9d745c680228 RDI: ffff9d835f588d50
[   25.178910] RBP: ffffc217dfff7a40 R08: 0000000000000000 R09: ffff9d835e860400
[   25.186045] R10: 0000000000000001 R11: 0000000000000000 R12: ffff9d745c680000
[   25.193178] R13: ffffc217dfff7a50 R14: 0000000000000001 R15: 0000000000000002
[   25.200310] FS:  00007f03c4b0a880(0000) GS:ffff9d92b8c00000(0000) knlGS:0000000000000000
[   25.208395] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   25.214141] CR2: 00007f03c520d52c CR3: 000080109f420000 CR4: 0000000000350ee0
[   25.221275] Call Trace:
[   25.223726]  <TASK>
[   25.225831]  mlx5_is_roce_on+0x3a/0xb0 [mlx5_core]
[   25.230678]  ? __kmalloc+0x53/0x1b0
[   25.234172]  mlx5r_probe+0x149/0x170 [mlx5_ib]
[   25.238641]  ? __pfx_mlx5r_probe+0x10/0x10 [mlx5_ib]
[   25.243624]  auxiliary_bus_probe+0x45/0xa0
[   25.247724]  really_probe+0x17b/0x3e0
[   25.251393]  __driver_probe_device+0x7e/0x180
[   25.255761]  driver_probe_device+0x23/0x80
[   25.259868]  __driver_attach+0xcb/0x1a0
[   25.263707]  ? __pfx___driver_attach+0x10/0x10
[   25.268159]  bus_for_each_dev+0x89/0xd0
[   25.272001]  driver_attach+0x22/0x30
[   25.275577]  bus_add_driver+0x1b9/0x240
[   25.279421]  driver_register+0x66/0x130
[   25.283264]  __auxiliary_driver_register+0x73/0xe0
[   25.288062]  mlx5_ib_init+0xda/0x110 [mlx5_ib]
[   25.292519]  ? __pfx_init_module+0x10/0x10 [mlx5_ib]
[   25.297496]  do_one_initcall+0x7a/0x2b0
[   25.301337]  ? kmalloc_trace+0x2e/0xe0
[   25.305088]  do_init_module+0x6a/0x260
[   25.308841]  load_module+0x1e90/0x2050
[   25.312595]  ? ima_post_read_file+0xd6/0xf0
[   25.316797]  __do_sys_finit_module+0xc8/0x140
[   25.321155]  ? __do_sys_finit_module+0xc8/0x140
[   25.325696]  __x64_sys_finit_module+0x1e/0x30
[   25.330057]  do_syscall_64+0x3f/0x90
[   25.333635]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[   25.338687] RIP: 0033:0x7f03c513673d
[   25.342266] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 23 37 0d 00 f7 d8 64 89 01 48
[   25.361015] RSP: 002b:00007ffd0c1665f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[   25.368579] RAX: ffffffffffffffda RBX: 0000556e1aec4d30 RCX: 00007f03c513673d
[   25.375713] RDX: 0000000000000000 RSI: 00007f03c5016ded RDI: 000000000000000e
[   25.382843] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000556e1ae664e8
[   25.389976] R10: 000000000000000e R11: 0000000000000246 R12: 00007f03c5016ded
[   25.397109] R13: 0000000000000000 R14: 0000556e1aeee320 R15: 0000556e1aec4d30
[   25.404249]  </TASK>
[   25.406437] ---[ end trace 0000000000000000 ]---

Did I do the merge wrong, or is the problem still there?

Thanks,

Kim
Jakub Kicinski Feb. 10, 2023, 12:16 a.m. UTC | #4
On Thu, 9 Feb 2023 16:37:13 -0600 Kim Phillips wrote:
> On 2/9/23 3:31 PM, Jakub Kicinski wrote:
> > On Thu, 9 Feb 2023 15:05:46 -0600 Kim Phillips wrote:  
> >> Is there a different tree the series can be rebased on, until net-next
> >> gets fixed?  
> > 
> > merge in net-next, the fix should be there but was merged a couple of
> > hours ago so probably not yet in linux-next  
> 
> I=Ok, I took next-20230209, git merged net-next/master, fixed a merge
> conflict to use the latter net-next/master version:

> ...and unfortunately still get a splat on that same Rome system:

Alright, so that's the splat I'm guessing the patch set was supposed 
to address. Can you confirm that you're testing 

 next-20230209 + net-next/master + this patches from the list

? The previous crash (which I called "fixed in net-next") was an
unrelated bug in the socket layer. You still have to apply the
patches from the list to fix the devlink warning.
Jakub Kicinski Feb. 10, 2023, 4:53 a.m. UTC | #5
On Thu,  9 Feb 2023 16:43:01 +0100 Jiri Pirko wrote:
> The primary motivation of this patchset is the patch #6, which fixes an
> issue introduced by 075935f0ae0f ("devlink: protect devlink param list
> by instance lock") and reported by Kim Phillips <kim.phillips@amd.com>
> (https://lore.kernel.org/netdev/719de4f0-76ac-e8b9-38a9-167ae239efc7@amd.com/)
> and my colleagues doing mlx5 driver regression testing.
> 
> The basis idea is that devl_param_driverinit_value_get() could be
> possible to the called without holding devlink intance lock in
> most of the cases (all existing ones in the current codebase),
> which would fix some mlx5 flows where the lock is not held.
> 
> To achieve that, make sure that the param value does not change between
> reloads with patch #2.
> 
> Also, convert the param list to xarray which removes the worry about
> list_head consistency when doing lockless lookup.
> 
> The rest of the patches are doing some small related cleanup of things
> that poke me in the eye during the work.

Acked-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko Feb. 10, 2023, 7:55 a.m. UTC | #6
Thu, Feb 09, 2023 at 11:37:13PM CET, kim.phillips@amd.com wrote:
>On 2/9/23 3:31 PM, Jakub Kicinski wrote:
>> On Thu, 9 Feb 2023 15:05:46 -0600 Kim Phillips wrote:
>> > Is there a different tree the series can be rebased on, until net-next
>> > gets fixed?
>> 
>> merge in net-next, the fix should be there but was merged a couple of
>> hours ago so probably not yet in linux-next
>
>I=Ok, I took next-20230209, git merged net-next/master, fixed a merge
>conflict to use the latter net-next/master version:
>
><<<<<<< HEAD
>	if (err == NOTIFY_BAD) {
>		dl_trap->trap.action = action_orig;
>		err = trap_event_ctx.err;
>	}
>out:
>	return err;
>=======
>	if (err == NOTIFY_BAD)
>		dl_trap->trap.action = action_orig;
>
>	return trap_event_ctx.err;
>>>>>>>> net-next/master
>
>...and unfortunately still get a splat on that same Rome system:
>
>[   22.647832] mlx5_core 0000:21:00.0: firmware version: 14.22.1002
>[   22.653879] mlx5_core 0000:21:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
>[   23.228950] mlx5_core 0000:21:00.0: E-Switch: Total vports 10, per vport: max uc(1024) max mc(16384)
>[   23.245100] mlx5_core 0000:21:00.0: Port module event: module 0, Cable plugged
>[   23.570053] mlx5_core 0000:21:00.0: Supported tc offload range - chains: 1, prios: 1
>[   23.577812] mlx5_core 0000:21:00.0: mlx5e_tc_post_act_init:40:(pid 9): firmware level support is missing
>[   23.594377] mlx5_core 0000:21:00.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0 basic)
>[   23.605492] mlx5_core 0000:21:00.1: firmware version: 14.22.1002
>[   23.611536] mlx5_core 0000:21:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
>[   24.199756] mlx5_core 0000:21:00.1: E-Switch: Total vports 10, per vport: max uc(1024) max mc(16384)
>[   24.216876] mlx5_core 0000:21:00.1: Port module event: module 1, Cable unplugged
>[   24.555670] mlx5_core 0000:21:00.1: Supported tc offload range - chains: 1, prios: 1
>[   24.563428] mlx5_core 0000:21:00.1: mlx5e_tc_post_act_init:40:(pid 9): firmware level support is missing
>[   24.580084] mlx5_core 0000:21:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0 basic)
>[   24.593808] systemd-udevd[1974]: Using default interface naming scheme 'v245'.
>[   24.602595] systemd-udevd[1974]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
>[   24.613314] mlx5_core 0000:21:00.0 enp33s0f0np0: renamed from eth0
>[   24.701259] ------------[ cut here ]------------
>[   24.705888] WARNING: CPU: 228 PID: 2318 at net/devlink/leftover.c:9643 devl_param_driverinit_value_get+0xe5/0x1f0

Odd as this patchset removes this warning. I think you forgot to apply.


>[   24.716153] Modules linked in: mlx5_ib(+) ib_uverbs ib_core mlx5_core ast i2c_algo_bit drm_shmem_helper hid_generic drm_kms_helper syscopyarea sysfillrect sysimgblt usbhid pci_hyperv_intf crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd cryptd mlxfw hid psample drm ahci tls libahci i2c_piix4 wmi
>[   24.745589] CPU: 228 PID: 2318 Comm: systemd-udevd Not tainted 6.2.0-rc7-next-20230209+ #4
>[   24.753856] Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RDY1009A 09/16/2020
>[   24.761943] RIP: 0010:devl_param_driverinit_value_get+0xe5/0x1f0
>[   24.767955] Code: 00 5b b8 ea ff ff ff 41 5c 41 5d 5d e9 58 cd 08 00 48 8d bf 28 02 00 00 be ff ff ff ff e8 03 2a 07 00 85 c0 0f 85 43 ff ff ff <0f> 0b 49 8b 84 24 18 01 00 00 48 83 78 18 00 0f 85 41 ff ff ff 0f
>[   24.786702] RSP: 0018:ffffc217dfff7a28 EFLAGS: 00010246
>[   24.791925] RAX: 0000000000000000 RBX: 0000000000000009 RCX: 0000000000000000
>[   24.799058] RDX: 0000000000000000 RSI: ffff9d7458b00228 RDI: ffff9d835f588d50
>[   24.806194] RBP: ffffc217dfff7a40 R08: 0000000000000000 R09: ffff9d8316157c00
>[   24.813325] R10: 0000000000000001 R11: 0000000000000000 R12: ffff9d7458b00000
>[   24.820455] R13: ffffc217dfff7a50 R14: 0000000000000001 R15: 0000000000000002
>[   24.827589] FS:  00007f03c4b0a880(0000) GS:ffff9d92c8c00000(0000) knlGS:0000000000000000
>[   24.835677] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>[   24.841422] CR2: 00007ffd0c160f48 CR3: 000080109f420000 CR4: 0000000000350ee0
>[   24.848557] Call Trace:
>[   24.851003]  <TASK>
>[   24.853117]  mlx5_is_roce_on+0x3a/0xb0 [mlx5_core]
>[   24.858010]  ? __kmalloc+0x53/0x1b0
>[   24.861512]  mlx5r_probe+0x149/0x170 [mlx5_ib]
>[   24.865974]  ? __pfx_mlx5r_probe+0x10/0x10 [mlx5_ib]
>[   24.870957]  auxiliary_bus_probe+0x45/0xa0
>[   24.875059]  really_probe+0x17b/0x3e0
>[   24.878731]  __driver_probe_device+0x7e/0x180
>[   24.883090]  driver_probe_device+0x23/0x80
>[   24.887191]  __driver_attach+0xcb/0x1a0
>[   24.891027]  ? __pfx___driver_attach+0x10/0x10
>[   24.895475]  bus_for_each_dev+0x89/0xd0
>[   24.899311]  driver_attach+0x22/0x30
>[   24.902894]  bus_add_driver+0x1b9/0x240
>[   24.906735]  driver_register+0x66/0x130
>[   24.910584]  __auxiliary_driver_register+0x73/0xe0
>[   24.915385]  mlx5_ib_init+0xda/0x110 [mlx5_ib]
>[   24.919846]  ? __pfx_init_module+0x10/0x10 [mlx5_ib]
>[   24.924831]  do_one_initcall+0x7a/0x2b0
>[   24.928677]  ? kmalloc_trace+0x2e/0xe0
>[   24.932433]  do_init_module+0x6a/0x260
>[   24.936191]  load_module+0x1e90/0x2050
>[   24.939942]  ? ima_post_read_file+0xd6/0xf0
>[   24.944138]  __do_sys_finit_module+0xc8/0x140
>[   24.948497]  ? __do_sys_finit_module+0xc8/0x140
>[   24.953036]  __x64_sys_finit_module+0x1e/0x30
>[   24.957399]  do_syscall_64+0x3f/0x90
>[   24.960987]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
>[   24.966047] RIP: 0033:0x7f03c513673d
>[   24.969628] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 23 37 0d 00 f7 d8 64 89 01 48
>[   24.988380] RSP: 002b:00007ffd0c1665f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>[   24.995943] RAX: ffffffffffffffda RBX: 0000556e1aec4d30 RCX: 00007f03c513673d
>[   25.003078] RDX: 0000000000000000 RSI: 00007f03c5016ded RDI: 000000000000000e
>[   25.010210] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000556e1ae664e8
>[   25.017343] R10: 000000000000000e R11: 0000000000000246 R12: 00007f03c5016ded
>[   25.024477] R13: 0000000000000000 R14: 0000556e1aeee320 R15: 0000556e1aec4d30
>[   25.031621]  </TASK>
>[   25.033815] ---[ end trace 0000000000000000 ]---
>[   25.072333] ------------[ cut here ]------------
>[   25.076971] WARNING: CPU: 100 PID: 2318 at net/devlink/leftover.c:9643 devl_param_driverinit_value_get+0xe5/0x1f0
>[   25.087406] Modules linked in: mlx5_ib(+) ib_uverbs ib_core mlx5_core ast i2c_algo_bit drm_shmem_helper hid_generic drm_kms_helper syscopyarea sysfillrect sysimgblt usbhid pci_hyperv_intf crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd cryptd mlxfw hid psample drm ahci tls libahci i2c_piix4 wmi
>[   25.116844] CPU: 100 PID: 2318 Comm: systemd-udevd Tainted: G        W          6.2.0-rc7-next-20230209+ #4
>[   25.126576] Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RDY1009A 09/16/2020
>[   25.134665] RIP: 0010:devl_param_driverinit_value_get+0xe5/0x1f0
>[   25.140676] Code: 00 5b b8 ea ff ff ff 41 5c 41 5d 5d e9 58 cd 08 00 48 8d bf 28 02 00 00 be ff ff ff ff e8 03 2a 07 00 85 c0 0f 85 43 ff ff ff <0f> 0b 49 8b 84 24 18 01 00 00 48 83 78 18 00 0f 85 41 ff ff ff 0f
>[   25.159421] RSP: 0018:ffffc217dfff7a28 EFLAGS: 00010246
>[   25.164646] RAX: 0000000000000000 RBX: 0000000000000009 RCX: 0000000000000000
>[   25.171779] RDX: 0000000000000000 RSI: ffff9d745c680228 RDI: ffff9d835f588d50
>[   25.178910] RBP: ffffc217dfff7a40 R08: 0000000000000000 R09: ffff9d835e860400
>[   25.186045] R10: 0000000000000001 R11: 0000000000000000 R12: ffff9d745c680000
>[   25.193178] R13: ffffc217dfff7a50 R14: 0000000000000001 R15: 0000000000000002
>[   25.200310] FS:  00007f03c4b0a880(0000) GS:ffff9d92b8c00000(0000) knlGS:0000000000000000
>[   25.208395] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>[   25.214141] CR2: 00007f03c520d52c CR3: 000080109f420000 CR4: 0000000000350ee0
>[   25.221275] Call Trace:
>[   25.223726]  <TASK>
>[   25.225831]  mlx5_is_roce_on+0x3a/0xb0 [mlx5_core]
>[   25.230678]  ? __kmalloc+0x53/0x1b0
>[   25.234172]  mlx5r_probe+0x149/0x170 [mlx5_ib]
>[   25.238641]  ? __pfx_mlx5r_probe+0x10/0x10 [mlx5_ib]
>[   25.243624]  auxiliary_bus_probe+0x45/0xa0
>[   25.247724]  really_probe+0x17b/0x3e0
>[   25.251393]  __driver_probe_device+0x7e/0x180
>[   25.255761]  driver_probe_device+0x23/0x80
>[   25.259868]  __driver_attach+0xcb/0x1a0
>[   25.263707]  ? __pfx___driver_attach+0x10/0x10
>[   25.268159]  bus_for_each_dev+0x89/0xd0
>[   25.272001]  driver_attach+0x22/0x30
>[   25.275577]  bus_add_driver+0x1b9/0x240
>[   25.279421]  driver_register+0x66/0x130
>[   25.283264]  __auxiliary_driver_register+0x73/0xe0
>[   25.288062]  mlx5_ib_init+0xda/0x110 [mlx5_ib]
>[   25.292519]  ? __pfx_init_module+0x10/0x10 [mlx5_ib]
>[   25.297496]  do_one_initcall+0x7a/0x2b0
>[   25.301337]  ? kmalloc_trace+0x2e/0xe0
>[   25.305088]  do_init_module+0x6a/0x260
>[   25.308841]  load_module+0x1e90/0x2050
>[   25.312595]  ? ima_post_read_file+0xd6/0xf0
>[   25.316797]  __do_sys_finit_module+0xc8/0x140
>[   25.321155]  ? __do_sys_finit_module+0xc8/0x140
>[   25.325696]  __x64_sys_finit_module+0x1e/0x30
>[   25.330057]  do_syscall_64+0x3f/0x90
>[   25.333635]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
>[   25.338687] RIP: 0033:0x7f03c513673d
>[   25.342266] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 23 37 0d 00 f7 d8 64 89 01 48
>[   25.361015] RSP: 002b:00007ffd0c1665f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>[   25.368579] RAX: ffffffffffffffda RBX: 0000556e1aec4d30 RCX: 00007f03c513673d
>[   25.375713] RDX: 0000000000000000 RSI: 00007f03c5016ded RDI: 000000000000000e
>[   25.382843] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000556e1ae664e8
>[   25.389976] R10: 000000000000000e R11: 0000000000000246 R12: 00007f03c5016ded
>[   25.397109] R13: 0000000000000000 R14: 0000556e1aeee320 R15: 0000556e1aec4d30
>[   25.404249]  </TASK>
>[   25.406437] ---[ end trace 0000000000000000 ]---
>
>Did I do the merge wrong, or is the problem still there?
>
>Thanks,
>
>Kim
Jiri Pirko Feb. 10, 2023, 7:56 a.m. UTC | #7
Fri, Feb 10, 2023 at 05:53:09AM CET, kuba@kernel.org wrote:
>On Thu,  9 Feb 2023 16:43:01 +0100 Jiri Pirko wrote:
>> The primary motivation of this patchset is the patch #6, which fixes an
>> issue introduced by 075935f0ae0f ("devlink: protect devlink param list
>> by instance lock") and reported by Kim Phillips <kim.phillips@amd.com>
>> (https://lore.kernel.org/netdev/719de4f0-76ac-e8b9-38a9-167ae239efc7@amd.com/)
>> and my colleagues doing mlx5 driver regression testing.
>> 
>> The basis idea is that devl_param_driverinit_value_get() could be
>> possible to the called without holding devlink intance lock in
>> most of the cases (all existing ones in the current codebase),
>> which would fix some mlx5 flows where the lock is not held.
>> 
>> To achieve that, make sure that the param value does not change between
>> reloads with patch #2.
>> 
>> Also, convert the param list to xarray which removes the worry about
>> list_head consistency when doing lockless lookup.
>> 
>> The rest of the patches are doing some small related cleanup of things
>> that poke me in the eye during the work.
>
>Acked-by: Jakub Kicinski <kuba@kernel.org>

I will be sending v2 soon with Simon's nits resolved and also one small
fix. Thanks!