diff mbox series

qdisc: fix NULL pointer dereference in perf_trace_qdisc_reset()

Message ID 20240621114551.2061-3-yskelg@gmail.com (mailing list archive)
State Handled Elsewhere
Headers show
Series qdisc: fix NULL pointer dereference in perf_trace_qdisc_reset() | expand

Commit Message

Yunseong Kim June 21, 2024, 11:45 a.m. UTC
From: Yunseong Kim <yskelg@gmail.com>

In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from

 qdisc->dev_queue->dev <NULL> ->name

This situation simulated from bunch of veths and Bluetooth dis/reconnection.

During qdisc initialization, qdisc was being set to noop_queue.
In veth_init_queue, the initial tx_num was reduced back to one,
causing the qdisc reset to be called with noop, which led to the kernel panic.

I think this will happen on the kernel version.
 Linux kernel version ≥ v6.7.10, ≥ v6.8 ≥ v6.9 and 6.10

This occurred from 51270d573a8d. I think this patch is absolutely 
necessary. Previously, It was showing not intended string value of name.

I've reproduced 3 time from my fedora 40 Debug Kernel with any other module 
and patched.

version: 6.10.0-0.rc2.20240608gitdc772f8237f9.29.fc41.aarch64+debug

[ 5287.164555] veth0_vlan: left promiscuous mode
[ 5287.164929] veth1_macvtap: left promiscuous mode
[ 5287.164950] veth0_macvtap: left promiscuous mode
[ 5287.164983] veth1_vlan: left promiscuous mode
[ 5287.165008] veth0_vlan: left promiscuous mode
[ 5287.165450] veth1_macvtap: left promiscuous mode
[ 5287.165472] veth0_macvtap: left promiscuous mode
[ 5287.165502] veth1_vlan: left promiscuous mode
…
[ 5297.598240] bridge0: port 2(bridge_slave_1) entered blocking state
[ 5297.598262] bridge0: port 2(bridge_slave_1) entered forwarding state
[ 5297.598296] bridge0: port 1(bridge_slave_0) entered blocking state
[ 5297.598313] bridge0: port 1(bridge_slave_0) entered forwarding state
[ 5297.616090] 8021q: adding VLAN 0 to HW filter on device bond0
[ 5297.620405] bridge0: port 1(bridge_slave_0) entered disabled state
[ 5297.620730] bridge0: port 2(bridge_slave_1) entered disabled state
[ 5297.627247] 8021q: adding VLAN 0 to HW filter on device team0
[ 5297.629636] bridge0: port 1(bridge_slave_0) entered blocking state
…
[ 5298.002798] bridge_slave_0: left promiscuous mode
[ 5298.002869] bridge0: port 1(bridge_slave_0) entered disabled state
[ 5298.309444] bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
[ 5298.315206] bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
[ 5298.320207] bond0 (unregistering): Released all slaves
[ 5298.354296] hsr_slave_0: left promiscuous mode
[ 5298.360750] hsr_slave_1: left promiscuous mode
[ 5298.374889] veth1_macvtap: left promiscuous mode
[ 5298.374931] veth0_macvtap: left promiscuous mode
[ 5298.374988] veth1_vlan: left promiscuous mode
[ 5298.375024] veth0_vlan: left promiscuous mode
[ 5299.109741] team0 (unregistering): Port device team_slave_1 removed
[ 5299.185870] team0 (unregistering): Port device team_slave_0 removed
…
[ 5300.155443] Bluetooth: hci3: unexpected cc 0x0c03 length: 249 > 1
[ 5300.155724] Bluetooth: hci3: unexpected cc 0x1003 length: 249 > 9
[ 5300.155988] Bluetooth: hci3: unexpected cc 0x1001 length: 249 > 9
….
[ 5301.075531] team0: Port device team_slave_1 added
[ 5301.085515] bridge0: port 1(bridge_slave_0) entered blocking state
[ 5301.085531] bridge0: port 1(bridge_slave_0) entered disabled state
[ 5301.085588] bridge_slave_0: entered allmulticast mode
[ 5301.085800] bridge_slave_0: entered promiscuous mode
[ 5301.095617] bridge0: port 1(bridge_slave_0) entered blocking state
[ 5301.095633] bridge0: port 1(bridge_slave_0) entered disabled state
…
[ 5301.149734] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
[ 5301.173234] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
[ 5301.180517] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
[ 5301.193481] hsr_slave_0: entered promiscuous mode
[ 5301.204425] hsr_slave_1: entered promiscuous mode
[ 5301.210172] debugfs: Directory 'hsr0' with parent 'hsr' already present!
[ 5301.210185] Cannot create hsr debugfs directory
[ 5301.224061] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
[ 5301.246901] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
[ 5301.255934] team0: Port device team_slave_0 added
[ 5301.256480] team0: Port device team_slave_1 added
[ 5301.256948] team0: Port device team_slave_0 added
…
[ 5301.435928] hsr_slave_0: entered promiscuous mode
[ 5301.446029] hsr_slave_1: entered promiscuous mode
[ 5301.455872] debugfs: Directory 'hsr0' with parent 'hsr' already present!
[ 5301.455884] Cannot create hsr debugfs directory
[ 5301.502664] hsr_slave_0: entered promiscuous mode
[ 5301.513675] hsr_slave_1: entered promiscuous mode
[ 5301.526155] debugfs: Directory 'hsr0' with parent 'hsr' already present!
[ 5301.526164] Cannot create hsr debugfs directory
[ 5301.563662] hsr_slave_0: entered promiscuous mode
[ 5301.576129] hsr_slave_1: entered promiscuous mode
[ 5301.580259] debugfs: Directory 'hsr0' with parent 'hsr' already present!
[ 5301.580270] Cannot create hsr debugfs directory
[ 5301.590269] 8021q: adding VLAN 0 to HW filter on device bond0

[ 5301.595872] KASAN: null-ptr-deref in range [0x0000000000000130-0x0000000000000137]
[ 5301.595877] Mem abort info:
[ 5301.595881]   ESR = 0x0000000096000006
[ 5301.595885]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 5301.595889]   SET = 0, FnV = 0
[ 5301.595893]   EA = 0, S1PTW = 0
[ 5301.595896]   FSC = 0x06: level 2 translation fault
[ 5301.595900] Data abort info:
[ 5301.595903]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[ 5301.595907]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 5301.595911]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 5301.595915] [dfff800000000026] address between user and kernel address ranges
[ 5301.595971] Internal error: Oops: 0000000096000006 [#1] SMP
…
[ 5301.596076] CPU: 2 PID: 102769 Comm:
syz-executor.3 Kdump: loaded Tainted:
 G        W         -------  ---  6.10.0-0.rc2.20240608gitdc772f8237f9.29.fc41.aarch64+debug #1
[ 5301.596080] Hardware name: VMware, Inc. VMware20,1/VBSA,
 BIOS VMW201.00V.21805430.BA64.2305221830 05/22/2023
[ 5301.596082] pstate: 01400005 (nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 5301.596085] pc : strnlen+0x40/0x88
[ 5301.596114] lr : trace_event_get_offsets_qdisc_reset+0x6c/0x2b0
[ 5301.596124] sp : ffff8000beef6b40
[ 5301.596126] x29: ffff8000beef6b40 x28: dfff800000000000 x27: 0000000000000001
[ 5301.596131] x26: 6de1800082c62bd0 x25: 1ffff000110aa9e0 x24: ffff800088554f00
[ 5301.596136] x23: ffff800088554ec0 x22: 0000000000000130 x21: 0000000000000140
[ 5301.596140] x20: dfff800000000000 x19: ffff8000beef6c60 x18: ffff7000115106d8
[ 5301.596143] x17: ffff800121bad000 x16: ffff800080020000 x15: 0000000000000006
[ 5301.596147] x14: 0000000000000002 x13: ffff0001f3ed8d14 x12: ffff700017ddeda5
[ 5301.596151] x11: 1ffff00017ddeda4 x10: ffff700017ddeda4 x9 : ffff800082cc5eec
[ 5301.596155] x8 : 0000000000000004 x7 : 00000000f1f1f1f1 x6 : 00000000f2f2f200
[ 5301.596158] x5 : 00000000f3f3f3f3 x4 : ffff700017dded80 x3 : 00000000f204f1f1
[ 5301.596162] x2 : 0000000000000026 x1 : 0000000000000000 x0 : 0000000000000130
[ 5301.596166] Call trace:
[ 5301.596175]  strnlen+0x40/0x88
[ 5301.596179]  trace_event_get_offsets_qdisc_reset+0x6c/0x2b0
[ 5301.596182]  perf_trace_qdisc_reset+0xb0/0x538
[ 5301.596184]  __traceiter_qdisc_reset+0x68/0xc0
[ 5301.596188]  qdisc_reset+0x43c/0x5e8
[ 5301.596190]  netif_set_real_num_tx_queues+0x288/0x770
[ 5301.596194]  veth_init_queues+0xfc/0x130 [veth]
[ 5301.596198]  veth_newlink+0x45c/0x850 [veth]
[ 5301.596202]  rtnl_newlink_create+0x2c8/0x798
[ 5301.596205]  __rtnl_newlink+0x92c/0xb60
[ 5301.596208]  rtnl_newlink+0xd8/0x130
[ 5301.596211]  rtnetlink_rcv_msg+0x2e0/0x890
[ 5301.596214]  netlink_rcv_skb+0x1c4/0x380
[ 5301.596225]  rtnetlink_rcv+0x20/0x38
[ 5301.596227]  netlink_unicast+0x3c8/0x640
[ 5301.596231]  netlink_sendmsg+0x658/0xa60
[ 5301.596234]  __sock_sendmsg+0xd0/0x180
[ 5301.596243]  __sys_sendto+0x1c0/0x280
[ 5301.596246]  __arm64_sys_sendto+0xc8/0x150
[ 5301.596249]  invoke_syscall+0xdc/0x268
[ 5301.596256]  el0_svc_common.constprop.0+0x16c/0x240
[ 5301.596259]  do_el0_svc+0x48/0x68
[ 5301.596261]  el0_svc+0x50/0x188
[ 5301.596265]  el0t_64_sync_handler+0x120/0x130
[ 5301.596268]  el0t_64_sync+0x194/0x198
[ 5301.596272] Code: eb15001f 54000120 d343fc02 12000801 (38f46842) 
[ 5301.596285] SMP: stopping secondary CPUs
[ 5301.597053] Starting crashdump kernel...
[ 5301.597057] Bye!

Yeoreum and I use two fuzzing tool simultaneously.

One process with syz-executor : https://github.com/google/syzkaller

 $ ./syz-execprog -executor=./syz-executor -repeat=1 -sandbox=setuid \
    -enable=none -collide=false log1

The other process with perf fuzzer:
 https://github.com/deater/perf_event_tests/tree/master/fuzzer

 $ perf_event_tests/fuzzer/perf_fuzzer

Yeoreum and I don't know if the patch we wrote will fix the underlying cause,
but we think that priority is to prevent kernel panic happening.
So, we're sending this patch.

I can attach a sys-execprog's executing program, kernel dump and dmesg
if someone need it, but I'm not sure how to safely attach large vmcore with vmlinux.

Signed-off-by: Yunseong Kim <yskelg@gmail.com>, Yeoreum Yun <yeoreum.yun@arm.com>
---
 include/trace/events/qdisc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Pedro Tammela June 21, 2024, 2:24 p.m. UTC | #1
On 21/06/2024 08:45, yskelg@gmail.com wrote:
> From: Yunseong Kim <yskelg@gmail.com>
> 
> In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from
> 
>   qdisc->dev_queue->dev <NULL> ->name
> 
> This situation simulated from bunch of veths and Bluetooth dis/reconnection.
> 
> During qdisc initialization, qdisc was being set to noop_queue.
> In veth_init_queue, the initial tx_num was reduced back to one,
> causing the qdisc reset to be called with noop, which led to the kernel panic.
> 
> I think this will happen on the kernel version.
>   Linux kernel version ≥ v6.7.10, ≥ v6.8 ≥ v6.9 and 6.10

You should tag your patch for the net tree

> 
> This occurred from 51270d573a8d. I think this patch is absolutely
> necessary. Previously, It was showing not intended string value of name.
Add a 'Fixes:' tag with this commit

> 
> I've reproduced 3 time from my fedora 40 Debug Kernel with any other module
> and patched.
> 
> version: 6.10.0-0.rc2.20240608gitdc772f8237f9.29.fc41.aarch64+debug
> 
> [ 5287.164555] veth0_vlan: left promiscuous mode
> [ 5287.164929] veth1_macvtap: left promiscuous mode
> [ 5287.164950] veth0_macvtap: left promiscuous mode
> [ 5287.164983] veth1_vlan: left promiscuous mode
> [ 5287.165008] veth0_vlan: left promiscuous mode
> [ 5287.165450] veth1_macvtap: left promiscuous mode
> [ 5287.165472] veth0_macvtap: left promiscuous mode
> [ 5287.165502] veth1_vlan: left promiscuous mode
> …
> [ 5297.598240] bridge0: port 2(bridge_slave_1) entered blocking state
> [ 5297.598262] bridge0: port 2(bridge_slave_1) entered forwarding state
> [ 5297.598296] bridge0: port 1(bridge_slave_0) entered blocking state
> [ 5297.598313] bridge0: port 1(bridge_slave_0) entered forwarding state
> [ 5297.616090] 8021q: adding VLAN 0 to HW filter on device bond0
> [ 5297.620405] bridge0: port 1(bridge_slave_0) entered disabled state
> [ 5297.620730] bridge0: port 2(bridge_slave_1) entered disabled state
> [ 5297.627247] 8021q: adding VLAN 0 to HW filter on device team0
> [ 5297.629636] bridge0: port 1(bridge_slave_0) entered blocking state
> …
> [ 5298.002798] bridge_slave_0: left promiscuous mode
> [ 5298.002869] bridge0: port 1(bridge_slave_0) entered disabled state
> [ 5298.309444] bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
> [ 5298.315206] bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
> [ 5298.320207] bond0 (unregistering): Released all slaves
> [ 5298.354296] hsr_slave_0: left promiscuous mode
> [ 5298.360750] hsr_slave_1: left promiscuous mode
> [ 5298.374889] veth1_macvtap: left promiscuous mode
> [ 5298.374931] veth0_macvtap: left promiscuous mode
> [ 5298.374988] veth1_vlan: left promiscuous mode
> [ 5298.375024] veth0_vlan: left promiscuous mode
> [ 5299.109741] team0 (unregistering): Port device team_slave_1 removed
> [ 5299.185870] team0 (unregistering): Port device team_slave_0 removed
> …
> [ 5300.155443] Bluetooth: hci3: unexpected cc 0x0c03 length: 249 > 1
> [ 5300.155724] Bluetooth: hci3: unexpected cc 0x1003 length: 249 > 9
> [ 5300.155988] Bluetooth: hci3: unexpected cc 0x1001 length: 249 > 9
> ….
> [ 5301.075531] team0: Port device team_slave_1 added
> [ 5301.085515] bridge0: port 1(bridge_slave_0) entered blocking state
> [ 5301.085531] bridge0: port 1(bridge_slave_0) entered disabled state
> [ 5301.085588] bridge_slave_0: entered allmulticast mode
> [ 5301.085800] bridge_slave_0: entered promiscuous mode
> [ 5301.095617] bridge0: port 1(bridge_slave_0) entered blocking state
> [ 5301.095633] bridge0: port 1(bridge_slave_0) entered disabled state
> …
> [ 5301.149734] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
> [ 5301.173234] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
> [ 5301.180517] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
> [ 5301.193481] hsr_slave_0: entered promiscuous mode
> [ 5301.204425] hsr_slave_1: entered promiscuous mode
> [ 5301.210172] debugfs: Directory 'hsr0' with parent 'hsr' already present!
> [ 5301.210185] Cannot create hsr debugfs directory
> [ 5301.224061] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
> [ 5301.246901] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
> [ 5301.255934] team0: Port device team_slave_0 added
> [ 5301.256480] team0: Port device team_slave_1 added
> [ 5301.256948] team0: Port device team_slave_0 added
> …
> [ 5301.435928] hsr_slave_0: entered promiscuous mode
> [ 5301.446029] hsr_slave_1: entered promiscuous mode
> [ 5301.455872] debugfs: Directory 'hsr0' with parent 'hsr' already present!
> [ 5301.455884] Cannot create hsr debugfs directory
> [ 5301.502664] hsr_slave_0: entered promiscuous mode
> [ 5301.513675] hsr_slave_1: entered promiscuous mode
> [ 5301.526155] debugfs: Directory 'hsr0' with parent 'hsr' already present!
> [ 5301.526164] Cannot create hsr debugfs directory
> [ 5301.563662] hsr_slave_0: entered promiscuous mode
> [ 5301.576129] hsr_slave_1: entered promiscuous mode
> [ 5301.580259] debugfs: Directory 'hsr0' with parent 'hsr' already present!
> [ 5301.580270] Cannot create hsr debugfs directory
> [ 5301.590269] 8021q: adding VLAN 0 to HW filter on device bond0
> 
> [ 5301.595872] KASAN: null-ptr-deref in range [0x0000000000000130-0x0000000000000137]
> [ 5301.595877] Mem abort info:
> [ 5301.595881]   ESR = 0x0000000096000006
> [ 5301.595885]   EC = 0x25: DABT (current EL), IL = 32 bits
> [ 5301.595889]   SET = 0, FnV = 0
> [ 5301.595893]   EA = 0, S1PTW = 0
> [ 5301.595896]   FSC = 0x06: level 2 translation fault
> [ 5301.595900] Data abort info:
> [ 5301.595903]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
> [ 5301.595907]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [ 5301.595911]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [ 5301.595915] [dfff800000000026] address between user and kernel address ranges
> [ 5301.595971] Internal error: Oops: 0000000096000006 [#1] SMP
> …
> [ 5301.596076] CPU: 2 PID: 102769 Comm:
> syz-executor.3 Kdump: loaded Tainted:
>   G        W         -------  ---  6.10.0-0.rc2.20240608gitdc772f8237f9.29.fc41.aarch64+debug #1
> [ 5301.596080] Hardware name: VMware, Inc. VMware20,1/VBSA,
>   BIOS VMW201.00V.21805430.BA64.2305221830 05/22/2023
> [ 5301.596082] pstate: 01400005 (nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> [ 5301.596085] pc : strnlen+0x40/0x88
> [ 5301.596114] lr : trace_event_get_offsets_qdisc_reset+0x6c/0x2b0
> [ 5301.596124] sp : ffff8000beef6b40
> [ 5301.596126] x29: ffff8000beef6b40 x28: dfff800000000000 x27: 0000000000000001
> [ 5301.596131] x26: 6de1800082c62bd0 x25: 1ffff000110aa9e0 x24: ffff800088554f00
> [ 5301.596136] x23: ffff800088554ec0 x22: 0000000000000130 x21: 0000000000000140
> [ 5301.596140] x20: dfff800000000000 x19: ffff8000beef6c60 x18: ffff7000115106d8
> [ 5301.596143] x17: ffff800121bad000 x16: ffff800080020000 x15: 0000000000000006
> [ 5301.596147] x14: 0000000000000002 x13: ffff0001f3ed8d14 x12: ffff700017ddeda5
> [ 5301.596151] x11: 1ffff00017ddeda4 x10: ffff700017ddeda4 x9 : ffff800082cc5eec
> [ 5301.596155] x8 : 0000000000000004 x7 : 00000000f1f1f1f1 x6 : 00000000f2f2f200
> [ 5301.596158] x5 : 00000000f3f3f3f3 x4 : ffff700017dded80 x3 : 00000000f204f1f1
> [ 5301.596162] x2 : 0000000000000026 x1 : 0000000000000000 x0 : 0000000000000130
> [ 5301.596166] Call trace:
> [ 5301.596175]  strnlen+0x40/0x88
> [ 5301.596179]  trace_event_get_offsets_qdisc_reset+0x6c/0x2b0
> [ 5301.596182]  perf_trace_qdisc_reset+0xb0/0x538
> [ 5301.596184]  __traceiter_qdisc_reset+0x68/0xc0
> [ 5301.596188]  qdisc_reset+0x43c/0x5e8
> [ 5301.596190]  netif_set_real_num_tx_queues+0x288/0x770
> [ 5301.596194]  veth_init_queues+0xfc/0x130 [veth]
> [ 5301.596198]  veth_newlink+0x45c/0x850 [veth]
> [ 5301.596202]  rtnl_newlink_create+0x2c8/0x798
> [ 5301.596205]  __rtnl_newlink+0x92c/0xb60
> [ 5301.596208]  rtnl_newlink+0xd8/0x130
> [ 5301.596211]  rtnetlink_rcv_msg+0x2e0/0x890
> [ 5301.596214]  netlink_rcv_skb+0x1c4/0x380
> [ 5301.596225]  rtnetlink_rcv+0x20/0x38
> [ 5301.596227]  netlink_unicast+0x3c8/0x640
> [ 5301.596231]  netlink_sendmsg+0x658/0xa60
> [ 5301.596234]  __sock_sendmsg+0xd0/0x180
> [ 5301.596243]  __sys_sendto+0x1c0/0x280
> [ 5301.596246]  __arm64_sys_sendto+0xc8/0x150
> [ 5301.596249]  invoke_syscall+0xdc/0x268
> [ 5301.596256]  el0_svc_common.constprop.0+0x16c/0x240
> [ 5301.596259]  do_el0_svc+0x48/0x68
> [ 5301.596261]  el0_svc+0x50/0x188
> [ 5301.596265]  el0t_64_sync_handler+0x120/0x130
> [ 5301.596268]  el0t_64_sync+0x194/0x198
> [ 5301.596272] Code: eb15001f 54000120 d343fc02 12000801 (38f46842)
> [ 5301.596285] SMP: stopping secondary CPUs
> [ 5301.597053] Starting crashdump kernel...
> [ 5301.597057] Bye!
> 
> Yeoreum and I use two fuzzing tool simultaneously.
> 
> One process with syz-executor : https://github.com/google/syzkaller
> 
>   $ ./syz-execprog -executor=./syz-executor -repeat=1 -sandbox=setuid \
>      -enable=none -collide=false log1
> 
> The other process with perf fuzzer:
>   https://github.com/deater/perf_event_tests/tree/master/fuzzer
> 
>   $ perf_event_tests/fuzzer/perf_fuzzer
> 
> Yeoreum and I don't know if the patch we wrote will fix the underlying cause,
> but we think that priority is to prevent kernel panic happening.
> So, we're sending this patch.
> 
> I can attach a sys-execprog's executing program, kernel dump and dmesg
> if someone need it, but I'm not sure how to safely attach large vmcore with vmlinux.

The syzkaller program + C reproducer is usually enough, please make it 
visible somewhere

> 
> Signed-off-by: Yunseong Kim <yskelg@gmail.com>, Yeoreum Yun <yeoreum.yun@arm.com>

Should be two SoB tags

> ---
>   include/trace/events/qdisc.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/trace/events/qdisc.h b/include/trace/events/qdisc.h
> index f1b5e816e7e5..170b51fbe47a 100644
> --- a/include/trace/events/qdisc.h
> +++ b/include/trace/events/qdisc.h
> @@ -81,7 +81,7 @@ TRACE_EVENT(qdisc_reset,
>   	TP_ARGS(q),
>   
>   	TP_STRUCT__entry(
> -		__string(	dev,		qdisc_dev(q)->name	)
> +		__string(dev, qdisc_dev(q) ? qdisc_dev(q)->name : "noop_queue")
>   		__string(	kind,		q->ops->id		)
>   		__field(	u32,		parent			)
>   		__field(	u32,		handle			)
Yunseong Kim June 21, 2024, 3:06 p.m. UTC | #2
Hi Pedro,

On 6/21/24 11:24 오후, Pedro Tammela wrote:
> On 21/06/2024 08:45, yskelg@gmail.com wrote:
>> From: Yunseong Kim <yskelg@gmail.com>
>>
>> In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from
>>
>>   qdisc->dev_queue->dev <NULL> ->name
>>
>> This situation simulated from bunch of veths and Bluetooth
>> dis/reconnection.
>>
>> During qdisc initialization, qdisc was being set to noop_queue.
>> In veth_init_queue, the initial tx_num was reduced back to one,
>> causing the qdisc reset to be called with noop, which led to the
>> kernel panic.
>>
>> I think this will happen on the kernel version.
>>   Linux kernel version ≥ v6.7.10, ≥ v6.8 ≥ v6.9 and 6.10
> 
> You should tag your patch for the net tree
Thank you for the code review, I will tag the next patch for the net tree.

>> This occurred from 51270d573a8d. I think this patch is absolutely
>> necessary. Previously, It was showing not intended string value of name.
> Add a 'Fixes:' tag with this commit

I will added 'Fixes: 51270d573a8d' Tag on patch v2 message.

>> I can attach a sys-execprog's executing program, kernel dump and dmesg
>> if someone need it, but I'm not sure how to safely attach large vmcore
>> with vmlinux.
> 
> The syzkaller program + C reproducer is usually enough, please make it
> visible somewhere

I got it, I have a converted C syz program. So, I've attached the GitHub
gist link and C source code in this mail.

 https://gist.github.com/yskelg/cc64562873ce249cdd0d5a358b77d740

>> Signed-off-by: Yunseong Kim <yskelg@gmail.com>, Yeoreum Yun
>> <yeoreum.yun@arm.com>
> 
> Should be two SoB tags

Oh, It's the first time we've sent together, I made a mistake.. Sorry.
Thank you Pedro for the advice!

>> ---
>>   include/trace/events/qdisc.h | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/trace/events/qdisc.h b/include/trace/events/qdisc.h
>> index f1b5e816e7e5..170b51fbe47a 100644
>> --- a/include/trace/events/qdisc.h
>> +++ b/include/trace/events/qdisc.h
>> @@ -81,7 +81,7 @@ TRACE_EVENT(qdisc_reset,
>>       TP_ARGS(q),
>>         TP_STRUCT__entry(
>> -        __string(    dev,        qdisc_dev(q)->name    )
>> +        __string(dev, qdisc_dev(q) ? qdisc_dev(q)->name : "noop_queue")
>>           __string(    kind,        q->ops->id        )
>>           __field(    u32,        parent            )
>>           __field(    u32,        handle            )
> 


Warm Regards,
Yunseong Kim
// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <endian.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_add_key
#define __NR_add_key 217
#endif
#ifndef __NR_bpf
#define __NR_bpf 280
#endif
#ifndef __NR_io_uring_register
#define __NR_io_uring_register 427
#endif
#ifndef __NR_io_uring_setup
#define __NR_io_uring_setup 425
#endif
#ifndef __NR_keyctl
#define __NR_keyctl 219
#endif
#ifndef __NR_mlockall
#define __NR_mlockall 230
#endif
#ifndef __NR_mmap
#define __NR_mmap 222
#endif
#ifndef __NR_mremap
#define __NR_mremap 216
#endif
#ifndef __NR_munmap
#define __NR_munmap 215
#endif
#ifndef __NR_openat
#define __NR_openat 56
#endif
#ifndef __NR_read
#define __NR_read 63
#endif
#ifndef __NR_shmctl
#define __NR_shmctl 195
#endif

#define BITMASK(bf_off, bf_len) (((1ull << (bf_len)) - 1) << (bf_off))
#define STORE_BY_BITMASK(type, htobe, addr, val, bf_off, bf_len)               \
  *(type*)(addr) =                                                             \
      htobe((htobe(*(type*)(addr)) & ~BITMASK((bf_off), (bf_len))) |           \
            (((type)(val) << (bf_off)) & BITMASK((bf_off), (bf_len))))

uint64_t r[7] = {0xffffffffffffffff,
                 0xffffffffffffffff,
                 0xffffffffffffffff,
                 0x0,
                 0xffffffffffffffff,
                 0xffffffffffffffff,
                 0x0};

int main(void)
{
  syscall(__NR_mmap, /*addr=*/0x1ffff000ul, /*len=*/0x1000ul, /*prot=*/0ul,
          /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/ 0x32ul, /*fd=*/-1,
          /*offset=*/0ul);
  syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0x1000000ul,
          /*prot=PROT_WRITE|PROT_READ|PROT_EXEC*/ 7ul,
          /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/ 0x32ul, /*fd=*/-1,
          /*offset=*/0ul);
  syscall(__NR_mmap, /*addr=*/0x21000000ul, /*len=*/0x1000ul, /*prot=*/0ul,
          /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/ 0x32ul, /*fd=*/-1,
          /*offset=*/0ul);
  const char* reason;
  (void)reason;
  intptr_t res = 0;
  if (write(1, "executing program\n", sizeof("executing program\n") - 1)) {
  }
  *(uint32_t*)0x20000004 = 0;
  *(uint32_t*)0x20000008 = 0;
  *(uint32_t*)0x2000000c = 0;
  *(uint32_t*)0x20000010 = 0;
  *(uint32_t*)0x20000018 = -1;
  memset((void*)0x2000001c, 0, 12);
  res =
      syscall(__NR_io_uring_setup, /*entries=*/0xe68, /*params=*/0x20000000ul);
  if (res != -1)
    r[0] = res;
  memset((void*)0x20000080, 111, 1);
  syscall(__NR_io_uring_register, /*fd=*/r[0], /*opcode=*/0xaul,
          /*arg=*/0x20000080ul, /*size=*/1ul);
  syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0xa000ul, /*prot=*/0ul,
          /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/ 0x32ul, /*fd=*/-1,
          /*offset=*/0ul);
  res = syscall(__NR_bpf, /*cmd=*/0ul, /*arg=*/0ul,
                /*size=*/0xfffffffffffffc91ul);
  if (res != -1)
    r[1] = res;
  *(uint32_t*)0x200000c0 = -1;
  *(uint32_t*)0x200000c4 = 0;
  res = syscall(__NR_bpf, /*cmd=*/0x21ul, /*arg=*/0x200000c0ul, /*size=*/8ul);
  if (res != -1)
    r[2] = res;
  *(uint32_t*)0x20000c80 = -1;
  *(uint32_t*)0x20000c84 = 0x20;
  *(uint64_t*)0x20000c88 = 0x20000280;
  *(uint64_t*)0x20000280 = 0x20000180;
  *(uint32_t*)0x20000288 = 0x95;
  *(uint64_t*)0x20000290 = 0x20000b80;
  res = syscall(__NR_bpf, /*cmd=*/0xful, /*arg=*/0x20000c80ul, /*size=*/0x10ul);
  if (res != -1)
    r[3] = *(uint32_t*)0x2000028c;
  memcpy((void*)0x20000040, "./file1\000", 8);
  res = syscall(__NR_openat, /*fd=*/0xffffff9c, /*file=*/0x20000040ul,
                /*flags=*/0ul, /*mode=*/0ul);
  if (res != -1)
    r[4] = res;
  syscall(__NR_read, /*fd=*/r[4], /*buf=*/0ul, /*count=*/0ul);
  *(uint32_t*)0x20000cc0 = 0x1b;
  *(uint32_t*)0x20000cc4 = 0;
  *(uint32_t*)0x20000cc8 = 0;
  *(uint32_t*)0x20000ccc = 0x9ff;
  *(uint32_t*)0x20000cd0 = 0;
  *(uint32_t*)0x20000cd4 = r[1];
  *(uint32_t*)0x20000cd8 = 0;
  memset((void*)0x20000cdc, 0, 16);
  *(uint32_t*)0x20000cec = 0;
  *(uint32_t*)0x20000cf0 = -1;
  *(uint32_t*)0x20000cf4 = 4;
  *(uint32_t*)0x20000cf8 = 4;
  *(uint32_t*)0x20000cfc = 4;
  *(uint64_t*)0x20000d00 = 0;
  res = syscall(__NR_bpf, /*cmd=*/0ul, /*arg=*/0x20000cc0ul, /*size=*/0x48ul);
  if (res != -1)
    r[5] = res;
  *(uint32_t*)0x20000e00 = 0;
  *(uint32_t*)0x20000e04 = 6;
  *(uint64_t*)0x20000e08 = 0x20000040;
  *(uint8_t*)0x20000040 = 0x18;
  STORE_BY_BITMASK(uint8_t, , 0x20000041, 7, 0, 4);
  STORE_BY_BITMASK(uint8_t, , 0x20000041, 4, 4, 4);
  *(uint16_t*)0x20000042 = 0;
  *(uint32_t*)0x20000044 = 4;
  *(uint8_t*)0x20000048 = 0;
  *(uint8_t*)0x20000049 = 0;
  *(uint16_t*)0x2000004a = 0;
  *(uint32_t*)0x2000004c = 0;
  STORE_BY_BITMASK(uint8_t, , 0x20000050, 4, 0, 3);
  STORE_BY_BITMASK(uint8_t, , 0x20000050, 1, 3, 1);
  STORE_BY_BITMASK(uint8_t, , 0x20000050, 0, 4, 4);
  STORE_BY_BITMASK(uint8_t, , 0x20000051, 0xa, 0, 4);
  STORE_BY_BITMASK(uint8_t, , 0x20000051, 0, 4, 4);
  *(uint16_t*)0x20000052 = 0xc;
  *(uint32_t*)0x20000054 = 0;
  *(uint8_t*)0x20000058 = 0x18;
  STORE_BY_BITMASK(uint8_t, , 0x20000059, 0, 0, 4);
  STORE_BY_BITMASK(uint8_t, , 0x20000059, 4, 4, 4);
  *(uint16_t*)0x2000005a = 0;
  *(uint32_t*)0x2000005c = 5;
  *(uint8_t*)0x20000060 = 0;
  *(uint8_t*)0x20000061 = 0;
  *(uint16_t*)0x20000062 = 0;
  *(uint32_t*)0x20000064 = 0;
  *(uint8_t*)0x20000068 = 0x85;
  STORE_BY_BITMASK(uint8_t, , 0x20000069, 0, 0, 4);
  STORE_BY_BITMASK(uint8_t, , 0x20000069, 1, 4, 4);
  *(uint16_t*)0x2000006a = 0;
  *(uint32_t*)0x2000006c = 0xfffffff9;
  *(uint64_t*)0x20000e10 = 0x20000080;
  memcpy((void*)0x20000080, "GPL\000", 4);
  *(uint32_t*)0x20000e18 = 3;
  *(uint32_t*)0x20000e1c = 0;
  *(uint64_t*)0x20000e20 = 0;
  *(uint32_t*)0x20000e28 = 0x41100;
  *(uint32_t*)0x20000e2c = 0x50;
  memset((void*)0x20000e30, 0, 16);
  *(uint32_t*)0x20000e40 = 0;
  *(uint32_t*)0x20000e44 = 0x21;
  *(uint32_t*)0x20000e48 = r[2];
  *(uint32_t*)0x20000e4c = 8;
  *(uint64_t*)0x20000e50 = 0x20000100;
  *(uint32_t*)0x20000100 = 3;
  *(uint32_t*)0x20000104 = 5;
  *(uint32_t*)0x20000e58 = 8;
  *(uint32_t*)0x20000e5c = 0x10;
  *(uint64_t*)0x20000e60 = 0x20000140;
  *(uint32_t*)0x20000140 = 3;
  *(uint32_t*)0x20000144 = 9;
  *(uint32_t*)0x20000148 = 0x62;
  *(uint32_t*)0x2000014c = 0x5038;
  *(uint32_t*)0x20000e68 = 0x10;
  *(uint32_t*)0x20000e6c = r[3];
  *(uint32_t*)0x20000e70 = r[4];
  *(uint32_t*)0x20000e74 = 5;
  *(uint64_t*)0x20000e78 = 0x20000d40;
  *(uint32_t*)0x20000d40 = r[1];
  *(uint32_t*)0x20000d44 = r[1];
  *(uint32_t*)0x20000d48 = r[5];
  *(uint64_t*)0x20000e80 = 0x20000d80;
  *(uint32_t*)0x20000d80 = 2;
  *(uint32_t*)0x20000d84 = 3;
  *(uint32_t*)0x20000d88 = 2;
  *(uint32_t*)0x20000d8c = 7;
  *(uint32_t*)0x20000d90 = 0;
  *(uint32_t*)0x20000d94 = 5;
  *(uint32_t*)0x20000d98 = 7;
  *(uint32_t*)0x20000d9c = 7;
  *(uint32_t*)0x20000da0 = 4;
  *(uint32_t*)0x20000da4 = 4;
  *(uint32_t*)0x20000da8 = 6;
  *(uint32_t*)0x20000dac = 0xa;
  *(uint32_t*)0x20000db0 = 2;
  *(uint32_t*)0x20000db4 = 5;
  *(uint32_t*)0x20000db8 = 7;
  *(uint32_t*)0x20000dbc = 7;
  *(uint32_t*)0x20000dc0 = 4;
  *(uint32_t*)0x20000dc4 = 1;
  *(uint32_t*)0x20000dc8 = 0xf;
  *(uint32_t*)0x20000dcc = 8;
  *(uint32_t*)0x20000e88 = 0x10;
  *(uint32_t*)0x20000e8c = 0x46f;
  syscall(__NR_bpf, /*cmd=*/5ul, /*arg=*/0x20000e00ul, /*size=*/0x90ul);
  memcpy((void*)0x20000000, "asymmetric\000", 11);
  memcpy((void*)0x20000240, "syz", 3);
  *(uint8_t*)0x20000243 = 0x21;
  *(uint8_t*)0x20000244 = 0;
  res = syscall(__NR_add_key, /*type=*/0x20000000ul, /*desc=*/0x20000240ul,
                /*payload=*/0ul, /*paylen=*/0ul, /*keyring=*/0xfffffff9);
  if (res != -1)
    r[6] = res;
  syscall(__NR_keyctl, /*code=*/0xbul, /*key=*/r[6], /*payload=*/0x20000300ul,
          /*len=*/0x44ul, 0);
  *(uint32_t*)0x200000c0 = 5;
  *(uint32_t*)0x200000c4 = 0xb;
  *(uint64_t*)0x200000c8 = 0x20000180;
  *(uint8_t*)0x20000180 = r[2];
  *(uint64_t*)0x200000d0 = 0x20000200;
  memcpy((void*)0x20000200, "GPL\000", 4);
  *(uint32_t*)0x200000d8 = 0x4000004;
  *(uint32_t*)0x200000dc = 0;
  *(uint64_t*)0x200000e0 = 0;
  *(uint32_t*)0x200000e8 = 0x40f00;
  *(uint32_t*)0x200000ec = 0;
  memset((void*)0x200000f0, 0, 16);
  *(uint32_t*)0x20000100 = 0;
  *(uint32_t*)0x20000104 = 0x17;
  *(uint32_t*)0x20000108 = 0;
  *(uint32_t*)0x2000010c = 0;
  *(uint64_t*)0x20000110 = 0;
  *(uint32_t*)0x20000118 = 0;
  *(uint32_t*)0x2000011c = 0;
  *(uint64_t*)0x20000120 = 0;
  *(uint32_t*)0x20000128 = 0;
  *(uint32_t*)0x2000012c = r[3];
  *(uint32_t*)0x20000130 = 0;
  *(uint32_t*)0x20000134 = 0;
  *(uint64_t*)0x20000138 = 0;
  *(uint64_t*)0x20000140 = 0;
  *(uint32_t*)0x20000148 = 0;
  *(uint32_t*)0x2000014c = 0;
  syscall(__NR_bpf, /*cmd=*/5ul, /*arg=*/0x200000c0ul, /*size=*/0x90ul);
  syscall(__NR_mlockall, /*flags=MCL_FUTURE|MCL_CURRENT*/ 3ul);
  *(uint32_t*)0x20000280 = 0x798e2636;
  *(uint32_t*)0x20000284 = 0;
  *(uint32_t*)0x20000288 = 0;
  *(uint32_t*)0x2000028c = 0;
  *(uint32_t*)0x20000290 = 0xee00;
  *(uint32_t*)0x20000294 = 0;
  *(uint16_t*)0x20000298 = 0;
  *(uint32_t*)0x2000029c = 0x80;
  *(uint64_t*)0x200002a0 = 0;
  *(uint64_t*)0x200002a8 = 0;
  *(uint64_t*)0x200002b0 = 0;
  *(uint32_t*)0x200002b8 = 0;
  *(uint32_t*)0x200002bc = 0;
  *(uint16_t*)0x200002c0 = 0;
  *(uint16_t*)0x200002c2 = 0;
  *(uint64_t*)0x200002c8 = 0;
  *(uint64_t*)0x200002d0 = 0;
  syscall(__NR_shmctl, /*shmid=*/0, /*cmd=*/1ul, /*buf=*/0x20000280ul);
  syscall(__NR_munmap, /*addr=*/0x20000000ul, /*len=*/0x400000ul);
  syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0xa000ul,
          /*prot=PROT_GROWSDOWN|PROT_SEM|PROT_READ|PROT_EXEC*/ 0x100000dul,
          /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/ 0x32ul, /*fd=*/-1,
          /*offset=*/0ul);
  syscall(__NR_mremap, /*addr=*/0x20000000ul, /*len=*/0xc00000ul,
          /*newlen=*/0x3000ul, /*flags=MREMAP_FIXED|MREMAP_MAYMOVE*/ 3ul,
          /*newaddr=*/0x20ffa000ul);
  return 0;
}
Steven Rostedt June 25, 2024, 10:41 p.m. UTC | #3
On Fri, 21 Jun 2024 20:45:49 +0900
yskelg@gmail.com wrote:

> diff --git a/include/trace/events/qdisc.h b/include/trace/events/qdisc.h
> index f1b5e816e7e5..170b51fbe47a 100644
> --- a/include/trace/events/qdisc.h
> +++ b/include/trace/events/qdisc.h
> @@ -81,7 +81,7 @@ TRACE_EVENT(qdisc_reset,
>  	TP_ARGS(q),
>  
>  	TP_STRUCT__entry(
> -		__string(	dev,		qdisc_dev(q)->name	)
> +		__string(dev, qdisc_dev(q) ? qdisc_dev(q)->name : "noop_queue")

From a tracing point of view:

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve

>  		__string(	kind,		q->ops->id		)
>  		__field(	u32,		parent			)
>  		__field(	u32,		handle			)
diff mbox series

Patch

diff --git a/include/trace/events/qdisc.h b/include/trace/events/qdisc.h
index f1b5e816e7e5..170b51fbe47a 100644
--- a/include/trace/events/qdisc.h
+++ b/include/trace/events/qdisc.h
@@ -81,7 +81,7 @@  TRACE_EVENT(qdisc_reset,
 	TP_ARGS(q),
 
 	TP_STRUCT__entry(
-		__string(	dev,		qdisc_dev(q)->name	)
+		__string(dev, qdisc_dev(q) ? qdisc_dev(q)->name : "noop_queue")
 		__string(	kind,		q->ops->id		)
 		__field(	u32,		parent			)
 		__field(	u32,		handle			)