
ath11k: enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855

Message ID 20211129101309.2931-1-quic_wgong@quicinc.com (mailing list archive)
State Accepted
Commit 9f6da09a5f6ab94bca58395af56b883b3a79663a
Delegated to: Kalle Valo

Commit Message

Wen Gong Nov. 29, 2021, 10:13 a.m. UTC
Currently mac80211 sends three scan requests for each scan on WCN6855,
one each for the 2.4 GHz, 5 GHz and 6 GHz bands. WCN6855 firmware
caches the RNR IE (Reduced Neighbor Report element) found in the
2.4 GHz/5 GHz beacons of an AP that is co-located with a 6 GHz AP,
and uses that cache for the 6 GHz scan if the 6 GHz scan is part of
the same scan request as the 2.4 GHz/5 GHz bands. This helps to find
more 6 GHz APs. It also reduces the scan time, because firmware uses
dual-band scan for 2.4 GHz/5 GHz, which means the 2.4 GHz and 5 GHz
scans run simultaneously.

Set the flag IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855, since
it supports 2.4 GHz/5 GHz/6 GHz and is a single pdev, which means
all of the 2.4 GHz/5 GHz/6 GHz bands exist in the same wiphy/ieee80211_hw.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1

Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
---
 drivers/net/wireless/ath/ath11k/mac.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Sven Eckelmann Dec. 3, 2021, 2:09 p.m. UTC | #1
On Monday, 29 November 2021 11:13:09 CET Wen Gong wrote:
> Currently mac80211 will send 3 scan request for each scan of WCN6855,
> they are 2.4 GHz/5 GHz/6 GHz band scan. Firmware of WCN6855 will
> cache the RNR IE(Reduced Neighbor Report element) which exist in the
> beacon of 2.4 GHz/5 GHz of the AP which is co-located with 6 GHz,
> and then use the cache to scan in 6 GHz band scan if the 6 GHz scan
> is in the same scan with the 2.4 GHz/5 GHz band, this will helpful to
> search more AP of 6 GHz. Also it will decrease the time cost of scan
> because firmware will use dual-band scan for the 2.4 GHz/5 GHz, it
> means the 2.4 GHz and 5 GHz scans are doing simultaneously.
> 
> Set the flag IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855 since
> it supports 2.4 GHz/5 GHz/6 GHz and it is single pdev which means
> all the 2.4 GHz/5 GHz/6 GHz exist in the same wiphy/ieee80211_hw.
> 
> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1

I've tested this on ath-next on commit a93789ae541c ("ath11k: Avoid NULL ptr 
access during mgmt tx cleanup") with a WCN6856 card (EmWicon/jjplus WMX7205) 
with firmware WLAN.HSP.1.1-02892.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1. ath-next 
was required for me because 32 MSI vectors are not available on the 
used system.

Without this patch, it works fine. With the patch, I just have to connect to an
AP via wpa_supplicant to crash the system. See the attached x86-64 .config, the
stacktrace and the decoded stacktrace.
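
The attached trace starts with a general protection fault for a non-canonical
address. As a small illustration of what that means, here is a userspace
sketch. It assumes 48-bit virtual addresses (4-level paging), which is an
assumption about the test system and is not stated in the trace. The helper
name is_canonical_48 is made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses (4-level paging). An address is
 * canonical when bits 63..47 are all copies of bit 47, i.e. the top
 * 17 bits are either all zero or all one.
 */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t upper = addr >> 47;	/* bits 63..47 */

	return upper == 0 || upper == 0x1ffff;
}
```

Under that assumption, is_canonical_48(0x00408210000b231a) is false for the RSI
value in the trace, while a kernel pointer such as RDI (0xffff9a9d01162900)
passes the check. A load from a non-canonical address raises a general
protection fault rather than a page fault.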

Kind regards,
	Sven
[   51.095079] general protection fault, probably for non-canonical address 0x408210000b231a: 0000 [#1] PREEMPT SMP NOPTI
[   51.105795] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #1
[   51.112157] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[   51.118339] RIP: 0010:skb_release_data (./include/linux/skbuff.h:1549 net/core/skbuff.c:669) 
[ 51.123061] Code: 4d 85 ed 74 4b 41 8b 85 bc 00 00 00 49 03 85 c0 00 00 00 0f b6 10 f6 c2 01 74 35 48 8b 70 28 48 85 f6 74 2c 40 f6 c6 01 75 21 <48> 8b 06 ba 01 00 00 00 4c 89 ef 0f ae e8 ff d0 41 8b 85 bc 00 00
All code
========
   0:	4d 85 ed             	test   %r13,%r13
   3:	74 4b                	je     0x50
   5:	41 8b 85 bc 00 00 00 	mov    0xbc(%r13),%eax
   c:	49 03 85 c0 00 00 00 	add    0xc0(%r13),%rax
  13:	0f b6 10             	movzbl (%rax),%edx
  16:	f6 c2 01             	test   $0x1,%dl
  19:	74 35                	je     0x50
  1b:	48 8b 70 28          	mov    0x28(%rax),%rsi
  1f:	48 85 f6             	test   %rsi,%rsi
  22:	74 2c                	je     0x50
  24:	40 f6 c6 01          	test   $0x1,%sil
  28:	75 21                	jne    0x4b
  2a:*	48 8b 06             	mov    (%rsi),%rax		<-- trapping instruction
  2d:	ba 01 00 00 00       	mov    $0x1,%edx
  32:	4c 89 ef             	mov    %r13,%rdi
  35:	0f ae e8             	lfence 
  38:	ff d0                	callq  *%rax
  3a:	41                   	rex.B
  3b:	8b                   	.byte 0x8b
  3c:	85                   	.byte 0x85
  3d:	bc                   	.byte 0xbc
	...

Code starting with the faulting instruction
===========================================
   0:	48 8b 06             	mov    (%rsi),%rax
   3:	ba 01 00 00 00       	mov    $0x1,%edx
   8:	4c 89 ef             	mov    %r13,%rdi
   b:	0f ae e8             	lfence 
   e:	ff d0                	callq  *%rax
  10:	41                   	rex.B
  11:	8b                   	.byte 0x8b
  12:	85                   	.byte 0x85
  13:	bc                   	.byte 0xbc
	...
[   51.141815] RSP: 0018:ffffbec4c0003e30 EFLAGS: 00010246
[   51.147049] RAX: ffff9a9d11a6c2c0 RBX: ffff9a9d08341a68 RCX: 0000000000000000
[   51.154189] RDX: 0000000000000003 RSI: 00408210000b231a RDI: ffff9a9d01162900
[   51.161323] RBP: ffff9a9d01162900 R08: 0000000000000212 R09: ffffffffb4ed24e8
[   51.168465] R10: 0000000000000000 R11: 00000000dca23000 R12: ffff9a9d11a6c2c0
[   51.175605] R13: ffff9a9d01162900 R14: ffff9a9d083435d8 R15: 0000000000000005
[   51.182740] FS:  0000000000000000(0000) GS:ffff9a9d1ac00000(0000) knlGS:0000000000000000
[   51.190832] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   51.196578] CR2: 000055b14ef3a778 CR3: 0000000108c6e000 CR4: 00000000000006f0
[   51.203713] Call Trace:
[   51.206170]  <IRQ>
[   51.208196] consume_skb (net/core/skbuff.c:757 net/core/skbuff.c:912 net/core/skbuff.c:906) 
[   51.211620] ath11k_ce_tx_process_cb+0x157/0x220 ath11k
[   51.217177] ath11k_ce_per_engine_service (drivers/net/wireless/ath/ath11k/ce.c:437 drivers/net/wireless/ath/ath11k/ce.c:675) ath11k
[   51.223130] ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 ./include/linux/atomic/atomic-instrumented.h:513 ./include/asm-generic/qspinlock.h:82 ./include/linux/spinlock.h:185 ./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) 
[   51.227680] ath11k_pci_ce_tasklet (drivers/net/wireless/ath/ath11k/pci.c:633) ath11k_pci
[   51.233095] tasklet_action_common.constprop.0 (./arch/x86/include/asm/bitops.h:75 ./include/asm-generic/bitops/instrumented-atomic.h:42 kernel/softirq.c:879 kernel/softirq.c:787) 
[   51.238425] __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:212 ./include/trace/events/irq.h:142 kernel/softirq.c:559) 
[   51.242023] __irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:636) 
[   51.245780] common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) 
[   51.249638]  </IRQ>
[   51.251743]  <TASK>
[   51.253850] asm_common_interrupt (./arch/x86/include/asm/idtentry.h:629) 
[   51.258044] RIP: 0010:cpuidle_enter_state (drivers/cpuidle/cpuidle.c:259) 
[ 51.263026] Code: 31 ff e8 d9 c6 9e ff 45 84 ff 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 78 02 00 00 31 ff e8 bd 97 a5 ff fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 11 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d
All code
========
   0:	31 ff                	xor    %edi,%edi
   2:	e8 d9 c6 9e ff       	callq  0xffffffffff9ec6e0
   7:	45 84 ff             	test   %r15b,%r15b
   a:	74 17                	je     0x23
   c:	9c                   	pushfq 
   d:	58                   	pop    %rax
   e:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  13:	f6 c4 02             	test   $0x2,%ah
  16:	0f 85 78 02 00 00    	jne    0x294
  1c:	31 ff                	xor    %edi,%edi
  1e:	e8 bd 97 a5 ff       	callq  0xffffffffffa597e0
  23:	fb                   	sti    
  24:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
  2a:*	45 85 f6             	test   %r14d,%r14d		<-- trapping instruction
  2d:	0f 88 11 01 00 00    	js     0x144
  33:	49 63 c6             	movslq %r14d,%rax
  36:	4c 2b 2c 24          	sub    (%rsp),%r13
  3a:	48 8d 14 40          	lea    (%rax,%rax,2),%rdx
  3e:	48                   	rex.W
  3f:	8d                   	.byte 0x8d

Code starting with the faulting instruction
===========================================
   0:	45 85 f6             	test   %r14d,%r14d
   3:	0f 88 11 01 00 00    	js     0x11a
   9:	49 63 c6             	movslq %r14d,%rax
   c:	4c 2b 2c 24          	sub    (%rsp),%r13
  10:	48 8d 14 40          	lea    (%rax,%rax,2),%rdx
  14:	48                   	rex.W
  15:	8d                   	.byte 0x8d
[   51.281781] RSP: 0018:ffffffffb4e03e60 EFLAGS: 00000246
[   51.287017] RAX: ffff9a9d1ac00000 RBX: 0000000000000002 RCX: 000000000000001f
[   51.294157] RDX: 0000000000000000 RSI: ffffffffb494bd50 RDI: ffffffffb4927def
[   51.301290] RBP: ffff9a9d0151b000 R08: 0000000be57e1147 R09: 0000000000000018
[   51.308424] R10: 0000000000000ed3 R11: 0000000000002406 R12: ffffffffb4fd05c0
[   51.315565] R13: 0000000be57e1147 R14: 0000000000000002 R15: 0000000000000000
[   51.322716] cpuidle_enter (drivers/cpuidle/cpuidle.c:353) 
[   51.326305] do_idle (kernel/sched/idle.c:158 kernel/sched/idle.c:239 kernel/sched/idle.c:306) 
[   51.329547] cpu_startup_entry (kernel/sched/idle.c:402 (discriminator 1)) 
[   51.333479] start_kernel (init/main.c:1137) 
[   51.337156] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:283) 
[   51.342228]  </TASK>
[   51.344424] Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 btusb btrtl btbcm btintel bluetooth libarc4 kvm_amd ccp cfg80211 jitterentropy_rng rng_core sha512_ssse3 evdev sha512_generic kvm snd_pcm snd_timer ctr leds_apu drbg snd ansi_cprng sg irqbypass ecdh_generic rfkill soundcore ecc pcspkr k10temp sp5100_tco watchdog button acpi_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci libata ehci_pci ohci_hcd ehci_hcd r8169 realtek mdio_devres usbcore scsi_mod i2c_piix4 usb_common scsi_common libphy
[   51.403181] ---[ end trace 5511b9c3dbb0841e ]---
[   51.407861] RIP: 0010:skb_release_data (./include/linux/skbuff.h:1549 net/core/skbuff.c:669) 
[ 51.412592] Code: 4d 85 ed 74 4b 41 8b 85 bc 00 00 00 49 03 85 c0 00 00 00 0f b6 10 f6 c2 01 74 35 48 8b 70 28 48 85 f6 74 2c 40 f6 c6 01 75 21 <48> 8b 06 ba 01 00 00 00 4c 89 ef 0f ae e8 ff d0 41 8b 85 bc 00 00
All code
========
   0:	4d 85 ed             	test   %r13,%r13
   3:	74 4b                	je     0x50
   5:	41 8b 85 bc 00 00 00 	mov    0xbc(%r13),%eax
   c:	49 03 85 c0 00 00 00 	add    0xc0(%r13),%rax
  13:	0f b6 10             	movzbl (%rax),%edx
  16:	f6 c2 01             	test   $0x1,%dl
  19:	74 35                	je     0x50
  1b:	48 8b 70 28          	mov    0x28(%rax),%rsi
  1f:	48 85 f6             	test   %rsi,%rsi
  22:	74 2c                	je     0x50
  24:	40 f6 c6 01          	test   $0x1,%sil
  28:	75 21                	jne    0x4b
  2a:*	48 8b 06             	mov    (%rsi),%rax		<-- trapping instruction
  2d:	ba 01 00 00 00       	mov    $0x1,%edx
  32:	4c 89 ef             	mov    %r13,%rdi
  35:	0f ae e8             	lfence 
  38:	ff d0                	callq  *%rax
  3a:	41                   	rex.B
  3b:	8b                   	.byte 0x8b
  3c:	85                   	.byte 0x85
  3d:	bc                   	.byte 0xbc
	...

Code starting with the faulting instruction
===========================================
   0:	48 8b 06             	mov    (%rsi),%rax
   3:	ba 01 00 00 00       	mov    $0x1,%edx
   8:	4c 89 ef             	mov    %r13,%rdi
   b:	0f ae e8             	lfence 
   e:	ff d0                	callq  *%rax
  10:	41                   	rex.B
  11:	8b                   	.byte 0x8b
  12:	85                   	.byte 0x85
  13:	bc                   	.byte 0xbc
	...
[   51.431366] RSP: 0018:ffffbec4c0003e30 EFLAGS: 00010246
[   51.436623] RAX: ffff9a9d11a6c2c0 RBX: ffff9a9d08341a68 RCX: 0000000000000000
[   51.443782] RDX: 0000000000000003 RSI: 00408210000b231a RDI: ffff9a9d01162900
[   51.450939] RBP: ffff9a9d01162900 R08: 0000000000000212 R09: ffffffffb4ed24e8
[   51.458099] R10: 0000000000000000 R11: 00000000dca23000 R12: ffff9a9d11a6c2c0
[   51.465256] R13: ffff9a9d01162900 R14: ffff9a9d083435d8 R15: 0000000000000005
[   51.472416] FS:  0000000000000000(0000) GS:ffff9a9d1ac00000(0000) knlGS:0000000000000000
[   51.480528] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   51.486299] CR2: 000055b14ef3a778 CR3: 0000000108c6e000 CR4: 00000000000006f0
[   51.493459] Kernel panic - not syncing: Fatal exception in interrupt
[   51.499831] Kernel Offset: 0x32800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   51.510610] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
[   51.095079] general protection fault, probably for non-canonical address 0x408210000b231a: 0000 [#1] PREEMPT SMP NOPTI
[   51.105795] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #1
[   51.112157] Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
[   51.118339] RIP: 0010:skb_release_data+0x81/0x170
[   51.123061] Code: 4d 85 ed 74 4b 41 8b 85 bc 00 00 00 49 03 85 c0 00 00 00 0f b6 10 f6 c2 01 74 35 48 8b 70 28 48 85 f6 74 2c 40 f6 c6 01 75 21 <48> 8b 06 ba 01 00 00 00 4c 89 ef 0f ae e8 ff d0 41 8b 85 bc 00 00
[   51.141815] RSP: 0018:ffffbec4c0003e30 EFLAGS: 00010246
[   51.147049] RAX: ffff9a9d11a6c2c0 RBX: ffff9a9d08341a68 RCX: 0000000000000000
[   51.154189] RDX: 0000000000000003 RSI: 00408210000b231a RDI: ffff9a9d01162900
[   51.161323] RBP: ffff9a9d01162900 R08: 0000000000000212 R09: ffffffffb4ed24e8
[   51.168465] R10: 0000000000000000 R11: 00000000dca23000 R12: ffff9a9d11a6c2c0
[   51.175605] R13: ffff9a9d01162900 R14: ffff9a9d083435d8 R15: 0000000000000005
[   51.182740] FS:  0000000000000000(0000) GS:ffff9a9d1ac00000(0000) knlGS:0000000000000000
[   51.190832] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   51.196578] CR2: 000055b14ef3a778 CR3: 0000000108c6e000 CR4: 00000000000006f0
[   51.203713] Call Trace:
[   51.206170]  <IRQ>
[   51.208196]  consume_skb+0x39/0xb0
[   51.211620]  ath11k_ce_tx_process_cb+0x157/0x220 [ath11k]
[   51.217177]  ath11k_ce_per_engine_service+0x3c0/0x3d0 [ath11k]
[   51.223130]  ? _raw_spin_lock_irqsave+0x26/0x50
[   51.227680]  ath11k_pci_ce_tasklet+0x1c/0x40 [ath11k_pci]
[   51.233095]  tasklet_action_common.constprop.0+0xaf/0xe0
[   51.238425]  __do_softirq+0xec/0x2e9
[   51.242023]  __irq_exit_rcu+0xbc/0x110
[   51.245780]  common_interrupt+0xb8/0xd0
[   51.249638]  </IRQ>
[   51.251743]  <TASK>
[   51.253850]  asm_common_interrupt+0x1e/0x40
[   51.258044] RIP: 0010:cpuidle_enter_state+0xda/0x370
[   51.263026] Code: 31 ff e8 d9 c6 9e ff 45 84 ff 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 78 02 00 00 31 ff e8 bd 97 a5 ff fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 11 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d
[   51.281781] RSP: 0018:ffffffffb4e03e60 EFLAGS: 00000246
[   51.287017] RAX: ffff9a9d1ac00000 RBX: 0000000000000002 RCX: 000000000000001f
[   51.294157] RDX: 0000000000000000 RSI: ffffffffb494bd50 RDI: ffffffffb4927def
[   51.301290] RBP: ffff9a9d0151b000 R08: 0000000be57e1147 R09: 0000000000000018
[   51.308424] R10: 0000000000000ed3 R11: 0000000000002406 R12: ffffffffb4fd05c0
[   51.315565] R13: 0000000be57e1147 R14: 0000000000000002 R15: 0000000000000000
[   51.322716]  cpuidle_enter+0x29/0x40
[   51.326305]  do_idle+0x200/0x2b0
[   51.329547]  cpu_startup_entry+0x19/0x20
[   51.333479]  start_kernel+0x6b7/0x6dc
[   51.337156]  secondary_startup_64_no_verify+0xb0/0xbb
[   51.342228]  </TASK>
[   51.344424] Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 btusb btrtl btbcm btintel bluetooth libarc4 kvm_amd ccp cfg80211 jitterentropy_rng rng_core sha512_ssse3 evdev sha512_generic kvm snd_pcm snd_timer ctr leds_apu drbg snd ansi_cprng sg irqbypass ecdh_generic rfkill soundcore ecc pcspkr k10temp sp5100_tco watchdog button acpi_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci libata ehci_pci ohci_hcd ehci_hcd r8169 realtek mdio_devres usbcore scsi_mod i2c_piix4 usb_common scsi_common libphy
[   51.403181] ---[ end trace 5511b9c3dbb0841e ]---
[   51.407861] RIP: 0010:skb_release_data+0x81/0x170
[   51.412592] Code: 4d 85 ed 74 4b 41 8b 85 bc 00 00 00 49 03 85 c0 00 00 00 0f b6 10 f6 c2 01 74 35 48 8b 70 28 48 85 f6 74 2c 40 f6 c6 01 75 21 <48> 8b 06 ba 01 00 00 00 4c 89 ef 0f ae e8 ff d0 41 8b 85 bc 00 00
[   51.431366] RSP: 0018:ffffbec4c0003e30 EFLAGS: 00010246
[   51.436623] RAX: ffff9a9d11a6c2c0 RBX: ffff9a9d08341a68 RCX: 0000000000000000
[   51.443782] RDX: 0000000000000003 RSI: 00408210000b231a RDI: ffff9a9d01162900
[   51.450939] RBP: ffff9a9d01162900 R08: 0000000000000212 R09: ffffffffb4ed24e8
[   51.458099] R10: 0000000000000000 R11: 00000000dca23000 R12: ffff9a9d11a6c2c0
[   51.465256] R13: ffff9a9d01162900 R14: ffff9a9d083435d8 R15: 0000000000000005
[   51.472416] FS:  0000000000000000(0000) GS:ffff9a9d1ac00000(0000) knlGS:0000000000000000
[   51.480528] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   51.486299] CR2: 000055b14ef3a778 CR3: 0000000108c6e000 CR4: 00000000000006f0
[   51.493459] Kernel panic - not syncing: Fatal exception in interrupt
[   51.499831] Kernel Offset: 0x32800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   51.510610] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
Wen Gong Dec. 6, 2021, 3:29 a.m. UTC | #2
On 12/3/2021 10:09 PM, Sven Eckelmann wrote:
> On Monday, 29 November 2021 11:13:09 CET Wen Gong wrote:
...
> I've tested this on ath-next on commit a93789ae541c ("ath11k: Avoid NULL ptr
> access during mgmt tx cleanup") with a WCN6856 card (EmWicon/jjplus WMX7205)
> with firmware WLAN.HSP.1.1-02892.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1. ath-next
> was required for me because 32 MSI vectors are not available on the
> used system.
>
> Without this patch, it works fine. With patch, I just have to connect to an AP
> via wpa_supplicant to crash the system. See the attached x86-64 .config, the
> stacktrace and the decoded stacktrace.

I tested this in my setup and did not see the crash.

I am afraid you also need this patch ("ath11k: change to use dynamic
memory for channel list of scan",
https://patchwork.kernel.org/project/linux-wireless/patch/20211129110939.15711-1-quic_wgong@quicinc.com
).

Could you apply this patch and try again?

> Kind regards,
> 	Sven
Sven Eckelmann Dec. 6, 2021, 6:56 a.m. UTC | #3
On Monday, 6 December 2021 04:29:39 CET Wen Gong wrote:
[...]
> I did test in my setup, not see the crash.
> 
> I am afraid you also need this patch("ath11k: change to use dynamic 
> memory for channel list of scan",
> 
> https://patchwork.kernel.org/project/linux-wireless/patch/20211129110939.15711-1-quic_wgong@quicinc.com 
> )
> 
> Could you apply this patch and try again?

Tried it and I see the same problem.

Kind regards,
	Sven
Wen Gong Dec. 6, 2021, 7:10 a.m. UTC | #4
On 12/6/2021 2:56 PM, Sven Eckelmann wrote:
> On Monday, 6 December 2021 04:29:39 CET Wen Gong wrote:
> [...]
>> I did test in my setup, not see the crash.
>>
>> I am afraid you also need this patch("ath11k: change to use dynamic
>> memory for channel list of scan",
>>
>> https://patchwork.kernel.org/project/linux-wireless/patch/20211129110939.15711-1-quic_wgong@quicinc.com
>> )
>>
>> Could you apply this patch and try again?
> Tried it and I see the same problem.
Could you tell me what your test steps are?
>
> Kind regards,
> 	Sven
Sven Eckelmann Dec. 6, 2021, 8:03 p.m. UTC | #5
On Monday, 6 December 2021 08:10:40 CET Wen Gong wrote:
> > On Monday, 6 December 2021 04:29:39 CET Wen Gong wrote:
> > [...]
> >> I did test in my setup, not see the crash.
> >>
> >> I am afraid you also need this patch("ath11k: change to use dynamic
> >> memory for channel list of scan",
> >>
> >> https://patchwork.kernel.org/project/linux-wireless/patch/20211129110939.15711-1-quic_wgong@quicinc.com
> >> )
> >>
> >> Could you apply this patch and try again?
> > Tried it and I see the same problem.
> Could you tell what is your test steps?

Start kernel with commit a93789ae541c ("ath11k: Avoid NULL ptr 
access during mgmt tx cleanup") + patches:

* ath11k: enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855
* ath11k: change to use dynamic memory for channel list of scan

You can find the config in the first mail. But I have now enabled KASAN inline 
to hopefully create some better error messages.

The firmware + board data (see mail "ath11k: incorrect board_id retrieval") 
was prepared like this:

   git clone https://github.com/kvalo/ath11k-firmware /root/ath11k-firmware
   mkdir -p /lib/firmware/ath11k/WCN6855/hw2.0/
   cp /root/ath11k-firmware/WCN6855/hw2.0/*.bin /lib/firmware/ath11k/WCN6855/hw2.0/
   cp /root/ath11k-firmware/WCN6855/hw2.0/1.1/WLAN.HSP.1.1-02892.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1/*.bin /lib/firmware/ath11k/WCN6855/hw2.0/

   git clone https://github.com/qca/qca-swiss-army-knife /root/qca-swiss-army-knife
   apt install python2
   python2 /root/qca-swiss-army-knife/tools/scripts/ath11k/ath11k-bdencoder  -e /lib/firmware/ath11k/WCN6855/hw2.0/board-2.bin
   rm /lib/firmware/ath11k/WCN6855/hw2.0/board-2.bin
   cp 'bus=pci,vendor=17cb,device=1103,subsystem-vendor=17cb,subsystem-device=3374,qmi-chip-id=2,qmi-board-id=266.bin' /lib/firmware/ath11k/WCN6855/hw2.0/board.bin

Then I just start up the device as usual and start wpa_supplicant (with
defconfig + CONFIG_MESH=y) from commit 14ab4a816c68 ("Reject 
ap_vendor_elements if its length is odd")

    cat << "EOF" > station_test.cfg
    network={
      ssid="MyTestAP"
      key_mgmt=WPA-PSK FT-PSK
      proto=RSN
      psk="testtest"
    }
    EOF
    ip link set up dev wlp6s0
    ~/hostap/wpa_supplicant/wpa_supplicant -D nl80211 -i wlp6s0 -c station_test.cfg

The actual SSID + PSK is valid and multiple access points (4) have this BSS on 
2.4GHz + 5GHz.

So you are basically always calling dev_kfree_skb_any in ath11k_ce_tx_process_cb
because wcn6855 hw2.0 has credit_flow set. But it seems like one of the
entries returned by ath11k_ce_completed_send_next is bogus and causes this
problem in ath11k_ce_tx_process_cb. And for some reason, this is
triggered here by this firmware feature.

    ./scripts/faddr2line --list vmlinux consume_skb+0x9f/0x1c0
    consume_skb+0x9f/0x1c0:
    
    __kfree_skb at net/core/skbuff.c:757
     752     */
     753 
     754    void __kfree_skb(struct sk_buff *skb)
     755    {
     756            skb_release_all(skb);
    >757<           kfree_skbmem(skb);
     758    }
     759    EXPORT_SYMBOL(__kfree_skb);
     760 
     761    /**
     762     *      kfree_skb - free an sk_buff
    
    (inlined by) consume_skb at net/core/skbuff.c:912
     907    {
     908            if (!skb_unref(skb))
     909                    return;
     910 
     911            trace_consume_skb(skb);
    >912<           __kfree_skb(skb);
     913    }
     914    EXPORT_SYMBOL(consume_skb);
     915    #endif
     916 
     917    /**
    
    (inlined by) consume_skb at net/core/skbuff.c:906
     901     *
     902     *      Drop a ref to the buffer and free it if the usage count has hit zero
     903     *      Functions identically to kfree_skb, but kfree_skb assumes that the frame
     904     *      is being dropped after a failure and notes that
     905     */
    >906<   void consume_skb(struct sk_buff *skb)
     907    {
     908            if (!skb_unref(skb))
     909                    return;
     910 
     911            trace_consume_skb(skb);


    ./scripts/faddr2line --list vmlinux skb_release_data+0x1b0/0x5c0
    skb_release_data+0x1b0/0x5c0:
    
    skb_zcopy_clear at include/linux/skbuff.h:1549
     1544   {
     1545           struct ubuf_info *uarg = skb_zcopy(skb);
     1546 
     1547           if (uarg) {
     1548                   if (!skb_zcopy_is_nouarg(skb))
    >1549<                          uarg->callback(skb, uarg, zerocopy_success);
     1550 
     1551                   skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
     1552           }
     1553   }
     1554 
    
    (inlined by) skb_release_data at net/core/skbuff.c:669
     664            if (skb->cloned &&
     665                atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
     666                                  &shinfo->dataref))
     667                    goto exit;
     668 
    >669<           skb_zcopy_clear(skb, true);
     670 
     671            for (i = 0; i < shinfo->nr_frags; i++)
     672                    __skb_frag_unref(&shinfo->frags[i], skb->pp_recycle);
     673 
     674            if (shinfo->frag_list)

But I didn't like the inlined code. So I've changed the compilation flags 
slightly:

    diff --git a/net/core/Makefile b/net/core/Makefile
    index 6bdcb2cafed8..5eda226c5f27 100644
    --- a/net/core/Makefile
    +++ b/net/core/Makefile
    @@ -37,3 +37,4 @@ obj-$(CONFIG_NET_SOCK_MSG) += skmsg.o
     obj-$(CONFIG_BPF_SYSCALL) += sock_map.o
     obj-$(CONFIG_BPF_SYSCALL) += bpf_sk_storage.o
     obj-$(CONFIG_OF)	+= of_net.o
    +ccflags-y += -fno-inline -O1 -fno-optimize-sibling-calls

Now the stacktrace is a lot more readable. And the returned
crash location makes a lot more sense:

    ./scripts/faddr2line --list vmlinux 'skb_zcopy_clear+0x34/0x8f'
    skb_zcopy_clear+0x34/0x8f:
    
    skb_zcopy_clear at include/linux/skbuff.h:1549
     1544   {
     1545           struct ubuf_info *uarg = skb_zcopy(skb);
     1546 
     1547           if (uarg) {
     1548                   if (!skb_zcopy_is_nouarg(skb))
    >1549<                          uarg->callback(skb, uarg, zerocopy_success);
     1550 
     1551                   skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
     1552           }
     1553   }
     1554

Or with the assembler:

     (gdb) disassemble /m *(skb_zcopy_clear+0x34/0x8f)
     Dump of assembler code for function skb_zcopy_clear:
     1544    {
        0x000000000000072a <+0>:     push   %r12
        0x000000000000072c <+2>:     push   %rbp
        0x000000000000072d <+3>:     push   %rbx
        0x000000000000072e <+4>:     mov    %rdi,%rbx
        0x0000000000000731 <+7>:     mov    %esi,%r12d
     
     1545            struct ubuf_info *uarg = skb_zcopy(skb);
        0x0000000000000734 <+10>:    call   0x5d3 <skb_zcopy>
     
     1546
     1547            if (uarg) {
        0x0000000000000739 <+15>:    test   %rax,%rax
        0x000000000000073c <+18>:    je     0x7a0 <skb_zcopy_clear+118>
        0x000000000000073e <+20>:    mov    %rax,%rbp
     
     1548                    if (!skb_zcopy_is_nouarg(skb))
        0x0000000000000741 <+23>:    mov    %rbx,%rdi
        0x0000000000000744 <+26>:    call   0x6f6 <skb_zcopy_is_nouarg>
        0x0000000000000749 <+31>:    test   %al,%al
        0x000000000000074b <+33>:    jne    0x777 <skb_zcopy_clear+77>
     
     1549                            uarg->callback(skb, uarg, zerocopy_success);
        0x000000000000074d <+35>:    mov    %rbp,%rdx
        0x0000000000000750 <+38>:    shr    $0x3,%rdx
        0x0000000000000754 <+42>:    movabs $0xdffffc0000000000,%rax
        0x000000000000075e <+52>:    cmpb   $0x0,(%rdx,%rax,1)
        0x0000000000000762 <+56>:    jne    0x7a5 <skb_zcopy_clear+123>
        0x0000000000000764 <+58>:    movzbl %r12b,%edx
        0x0000000000000768 <+62>:    mov    0x0(%rbp),%rax
        0x000000000000076c <+66>:    mov    %rbp,%rsi
        0x000000000000076f <+69>:    mov    %rbx,%rdi
        0x0000000000000772 <+72>:    call   0x777 <skb_zcopy_clear+77>
        0x00000000000007a5 <+123>:   mov    %rbp,%rdi
        0x00000000000007a8 <+126>:   call   0x7ad <skb_zcopy_clear+131>
        0x00000000000007ad <+131>:   jmp    0x764 <skb_zcopy_clear+58>
     
     1550
     1551                    skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
        0x0000000000000777 <+77>:    mov    %rbx,%rdi
        0x000000000000077a <+80>:    call   0x518 <skb_end_pointer>
        0x000000000000077f <+85>:    mov    %rax,%rbx
        0x0000000000000782 <+88>:    mov    %rax,%rdx
        0x0000000000000785 <+91>:    shr    $0x3,%rdx
        0x0000000000000789 <+95>:    movabs $0xdffffc0000000000,%rax
        0x0000000000000793 <+105>:   movzbl (%rdx,%rax,1),%eax
        0x0000000000000797 <+109>:   test   %al,%al
        0x0000000000000799 <+111>:   je     0x79d <skb_zcopy_clear+115>
        0x000000000000079b <+113>:   jle    0x7af <skb_zcopy_clear+133>
        0x000000000000079d <+115>:   andb   $0xf8,(%rbx)
        0x00000000000007af <+133>:   mov    %rbx,%rdi
        0x00000000000007b2 <+136>:   call   0x7b7 <skb_zcopy_clear+141>
        0x00000000000007b7 <+141>:   jmp    0x79d <skb_zcopy_clear+115>
     
     1552            }
     1553    }
        0x00000000000007a0 <+118>:   pop    %rbx
        0x00000000000007a1 <+119>:   pop    %rbp
        0x00000000000007a2 <+120>:   pop    %r12
        0x00000000000007a4 <+122>:   ret    
     
     End of assembler dump.
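
The shr $0x3 / movabs $0xdffffc0000000000 / cmpb sequence in the listing above
is the inline KASAN shadow check that runs before uarg is dereferenced. A
minimal userspace sketch of that arithmetic, using only the two constants
visible in the listing (the macro and function names here are stand-ins for
this example, not the kernel's own):

```c
#include <stdint.h>

/* Constants as they appear in the listing above (shr $0x3, movabs ...). */
#define SHADOW_SCALE_SHIFT 3
#define SHADOW_OFFSET 0xdffffc0000000000ULL

/* Address of the shadow byte that describes the 8-byte granule of addr. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* Inverse mapping: start of the 8-byte granule described by a shadow byte. */
static uint64_t shadow_to_mem(uint64_t shadow)
{
	return (shadow - SHADOW_OFFSET) << SHADOW_SCALE_SHIFT;
}
```

With the bogus pointer from the first trace, mem_to_shadow(0x00408210000b231a)
gives 0xe0080c4200016463, which is the fault address reported in the KASAN
trace at the end of this mail. Going back, shadow_to_mem(0xe0080c4200016463)
gives 0x00408210000b2318, the start of the range KASAN prints. So with KASAN
enabled the fault is taken on the shadow lookup, not on the load of
uarg->callback itself.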

To make it even easier to read, I disabled inline KASAN and reduced the
optimization level for this function:

    diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
    index 059b6266dcd7..819cc58ab051 100644
    --- a/include/linux/skbuff.h
    +++ b/include/linux/skbuff.h
    @@ -1540,6 +1540,8 @@ static inline void net_zcopy_put_abort(struct ubuf_info *uarg, bool have_uref)
     }
     
     /* Release a reference on a zerocopy structure */
    +#pragma GCC push_options
    +#pragma GCC optimize ("O0")
     static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)
     {
     	struct ubuf_info *uarg = skb_zcopy(skb);
    @@ -1551,6 +1553,7 @@ static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)
     		skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
     	}
     }
    +#pragma GCC pop_options
     
     static inline void skb_mark_not_on_list(struct sk_buff *skb)
     {

This creates this nice, unoptimized function which crashes at +63:

    $ gdb net/core/skbuff.o -q                                                    
    Reading symbols from net/core/skbuff.o...
    (gdb) disassemble /m *(skb_zcopy_clear+0x3f/0x70)
    Dump of assembler code for function skb_zcopy_clear:
    1546    {
       0x0000000000000000 <+0>:     push   %rbp
       0x0000000000000001 <+1>:     mov    %rsp,%rbp
       0x0000000000000004 <+4>:     sub    $0x18,%rsp
       0x0000000000000008 <+8>:     mov    %rdi,-0x10(%rbp)
       0x000000000000000c <+12>:    mov    %esi,%eax
       0x000000000000000e <+14>:    mov    %al,-0x14(%rbp)
    
    1547            struct ubuf_info *uarg = skb_zcopy(skb);
       0x0000000000000011 <+17>:    mov    -0x10(%rbp),%rax
       0x0000000000000015 <+21>:    mov    %rax,%rdi
       0x0000000000000018 <+24>:    call   0x29e <skb_zcopy>
       0x000000000000001d <+29>:    mov    %rax,-0x8(%rbp)
    
    1548
    1549            if (uarg) {
       0x0000000000000021 <+33>:    cmpq   $0x0,-0x8(%rbp)
       0x0000000000000026 <+38>:    je     0x6d <skb_zcopy_clear+109>
    
    1550                    if (!skb_zcopy_is_nouarg(skb))
       0x0000000000000028 <+40>:    mov    -0x10(%rbp),%rax
       0x000000000000002c <+44>:    mov    %rax,%rdi
       0x000000000000002f <+47>:    call   0x2df <skb_zcopy_is_nouarg>
       0x0000000000000034 <+52>:    xor    $0x1,%eax
       0x0000000000000037 <+55>:    test   %al,%al
       0x0000000000000039 <+57>:    je     0x59 <skb_zcopy_clear+89>
    
    1551                            uarg->callback(skb, uarg, zerocopy_success);
       0x000000000000003b <+59>:    mov    -0x8(%rbp),%rax
       0x000000000000003f <+63>:    mov    (%rax),%r8
       0x0000000000000042 <+66>:    movzbl -0x14(%rbp),%edx
       0x0000000000000046 <+70>:    mov    -0x8(%rbp),%rcx
       0x000000000000004a <+74>:    mov    -0x10(%rbp),%rax
       0x000000000000004e <+78>:    mov    %rcx,%rsi
       0x0000000000000051 <+81>:    mov    %rax,%rdi
       0x0000000000000054 <+84>:    call   0x59 <skb_zcopy_clear+89>
    
    1552
    1553                    skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
       0x0000000000000059 <+89>:    mov    -0x10(%rbp),%rax
       0x000000000000005d <+93>:    mov    %rax,%rdi
       0x0000000000000060 <+96>:    call   0x27f <skb_end_pointer>
       0x0000000000000065 <+101>:   movzbl (%rax),%edx
       0x0000000000000068 <+104>:   and    $0xfffffff8,%edx
       0x000000000000006b <+107>:   mov    %dl,(%rax)
    
    1554            }
    1555    }
       0x000000000000006d <+109>:   nop
       0x000000000000006e <+110>:   leave  
       0x000000000000006f <+111>:   ret    
    
    End of assembler dump.

The question now is: what is causing the unclean state of the skb, so that
it is not rejected by skb_zcopy_is_nouarg() before the uarg callback is
called?
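
For reference, the numbers in the attached logs can be cross-checked with a
few lines of C. The sketch below is not kernel code. It assumes 4-level
paging (48-bit virtual addresses) and takes the KASAN shadow offset and
shift from the instrumented disassembly (movabs $0xdffffc0000000000,%rax and
shr $0x3,%rdx). The helper names are made up for illustration. It shows that
the uarg value in RAX at +63 has its low bit clear, so a low-bit test such as
the "test $0x1,%sil" in the KASAN build (which appears to be the inlined
skb_zcopy_is_nouarg() check) would not reject it, and that both the value
itself and its shadow address are non-canonical, which matches the two
addresses reported in the general protection faults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4-level paging, i.e. 48-bit virtual addresses. */
#define VA_BITS 48

/* Local copies of the constants visible in the KASAN disassembly. */
#define SHADOW_OFFSET 0xdffffc0000000000ULL
#define SHADOW_SHIFT  3

/* An address is canonical when bits 63..47 are all zero or all one. */
static bool is_canonical(uint64_t addr)
{
	uint64_t top = addr >> (VA_BITS - 1);	/* the upper 17 bits */

	return top == 0 || top == 0x1ffffULL;
}

/* Shadow byte address that the instrumented code reads before the access. */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> SHADOW_SHIFT) + SHADOW_OFFSET;
}

/* Hypothetical helper: true if the low bit of the pointer value is set. */
static bool low_bit_tagged(uint64_t uarg)
{
	return uarg & 0x1;
}

/* Cross-check against the register values from the attached logs. */
static void check_logged_values(void)
{
	const uint64_t uarg = 0x00408210000b231aULL;	/* RAX at +63 */

	assert(!low_bit_tagged(uarg));		/* passes the low-bit test */
	assert(!is_canonical(uarg));		/* GPF without KASAN */
	assert(kasan_shadow_addr(uarg) == 0xe0080c4200016463ULL);
	assert(!is_canonical(kasan_shadow_addr(uarg)));	/* GPF with KASAN */
	assert(is_canonical(0xffff8aa303097b00ULL));	/* RBX, same dump */
}
```

Calling check_logged_values() from a small test program should pass all
assertions. This only confirms that the uarg value is garbage which happens
to pass the low-bit test. It does not say where the value comes from.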

Kind regards,
	Sven
general protection fault, probably for non-canonical address 0xe0080c4200016463: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: maybe wild-memory-access in range [0x00408210000b2318-0x00408210000b231f]
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #3
Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
RIP: 0010:skb_release_data (./include/linux/skbuff.h:1549 net/core/skbuff.c:669) 
Code: 00 00 48 8b 75 28 48 85 f6 0f 84 d2 00 00 00 40 f6 c6 01 0f 85 a3 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 d3 03 00 00 48 8b 06 ba 01 00 00 00 48 89 df 0f
All code
========
   0:	00 00                	add    %al,(%rax)
   2:	48 8b 75 28          	mov    0x28(%rbp),%rsi
   6:	48 85 f6             	test   %rsi,%rsi
   9:	0f 84 d2 00 00 00    	je     0xe1
   f:	40 f6 c6 01          	test   $0x1,%sil
  13:	0f 85 a3 00 00 00    	jne    0xbc
  19:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  20:	fc ff df 
  23:	48 89 f2             	mov    %rsi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
  2a:*	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)		<-- trapping instruction
  2e:	0f 85 d3 03 00 00    	jne    0x407
  34:	48 8b 06             	mov    (%rsi),%rax
  37:	ba 01 00 00 00       	mov    $0x1,%edx
  3c:	48 89 df             	mov    %rbx,%rdi
  3f:	0f                   	.byte 0xf

Code starting with the faulting instruction
===========================================
   0:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)
   4:	0f 85 d3 03 00 00    	jne    0x3dd
   a:	48 8b 06             	mov    (%rsi),%rax
   d:	ba 01 00 00 00       	mov    $0x1,%edx
  12:	48 89 df             	mov    %rbx,%rdi
  15:	0f                   	.byte 0xf
RSP: 0018:ffff8880c7c09c50 EFLAGS: 00010216
RAX: dffffc0000000000 RBX: ffff888004c6bdc0 RCX: 1ffff1100076945d
RDX: 0008104200016463 RSI: 00408210000b231a RDI: ffff888003b4a2e8
RBP: ffff888003b4a2c0 R08: 0000000000000000 R09: ffff888004c6be97
R10: ffffed100098d7d2 R11: 0000000000000001 R12: ffff888003b4a2c0
R13: ffff888004c6be7c R14: ffff88800c641e58 R15: ffff888004c6be80
FS:  0000000000000000(0000) GS:ffff8880c7c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055d3a95f6778 CR3: 0000000017c20000 CR4: 00000000000006f0
Call Trace:
<IRQ>
? _raw_write_lock_irq (kernel/locking/spinlock.c:177) 
consume_skb (net/core/skbuff.c:757 net/core/skbuff.c:912 net/core/skbuff.c:906) 
ath11k_ce_tx_process_cb (drivers/net/wireless/ath/ath11k/ce.c:515) ath11k
? __local_bh_enable_ip (./arch/x86/include/asm/preempt.h:103 kernel/softirq.c:390) 
? ath11k_ce_alloc_pipes (drivers/net/wireless/ath/ath11k/ce.c:500) ath11k
? ath11k_hal_srng_access_end (drivers/net/wireless/ath/ath11k/hal.c:849) ath11k
ath11k_ce_per_engine_service (drivers/net/wireless/ath/ath11k/ce.c:694) ath11k
? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 ./include/linux/atomic/atomic-instrumented.h:513 ./include/asm-generic/qspinlock.h:82 ./include/linux/spinlock.h:185 ./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) 
? __lock_text_start (kernel/locking/spinlock.c:161) 
? ath11k_ce_tx_process_cb (drivers/net/wireless/ath/ath11k/ce.c:689) ath11k
? __wake_up_bit (kernel/sched/wait_bit.c:192) 
? __irq_put_desc_unlock (kernel/irq/irqdesc.c:819) 
ath11k_pci_ce_tasklet (drivers/net/wireless/ath/ath11k/pci.c:637) ath11k_pci
? tasklet_clear_sched (kernel/softirq.c:752) 
tasklet_action_common.constprop.0 (kernel/softirq.c:783) 
__do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:212 ./include/trace/events/irq.h:142 kernel/softirq.c:559) 
__irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:636) 
common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) 
</IRQ>
<TASK>
asm_common_interrupt (./arch/x86/include/asm/idtentry.h:629) 
RIP: 0010:cpuidle_enter_state (drivers/cpuidle/cpuidle.c:259) 
Code: ff e8 8e 95 db fe 80 3c 24 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 8e 06 00 00 31 ff e8 a1 b9 ef fe fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 52 03 00 00 4d 63 e5 4b 8d 04 64 49 8d 04 84 48 8d
All code
========
   0:	ff                   	(bad)  
   1:	e8 8e 95 db fe       	callq  0xfffffffffedb9594
   6:	80 3c 24 00          	cmpb   $0x0,(%rsp)
   a:	74 17                	je     0x23
   c:	9c                   	pushfq 
   d:	58                   	pop    %rax
   e:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  13:	f6 c4 02             	test   $0x2,%ah
  16:	0f 85 8e 06 00 00    	jne    0x6aa
  1c:	31 ff                	xor    %edi,%edi
  1e:	e8 a1 b9 ef fe       	callq  0xfffffffffeefb9c4
  23:	fb                   	sti    
  24:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
  2a:*	45 85 ed             	test   %r13d,%r13d		<-- trapping instruction
  2d:	0f 88 52 03 00 00    	js     0x385
  33:	4d 63 e5             	movslq %r13d,%r12
  36:	4b 8d 04 64          	lea    (%r12,%r12,2),%rax
  3a:	49 8d 04 84          	lea    (%r12,%rax,4),%rax
  3e:	48                   	rex.W
  3f:	8d                   	.byte 0x8d

Code starting with the faulting instruction
===========================================
   0:	45 85 ed             	test   %r13d,%r13d
   3:	0f 88 52 03 00 00    	js     0x35b
   9:	4d 63 e5             	movslq %r13d,%r12
   c:	4b 8d 04 64          	lea    (%r12,%r12,2),%rax
  10:	49 8d 04 84          	lea    (%r12,%rax,4),%rax
  14:	48                   	rex.W
  15:	8d                   	.byte 0x8d
RSP: 0018:ffffffff89a07de0 EFLAGS: 00000246
RAX: dffffc0000000000 RBX: ffff888003b44000 RCX: 1ffffffff129775c
RDX: 1ffff11018f88331 RSI: ffffffff89031b00 RDI: ffff8880c7c41988
RBP: ffffffff89ee0d20 R08: 0000000000000002 R09: ffff8880c7c41c2b
R10: ffffed1018f88385 R11: 0000000000000001 R12: 0000000000000002
R13: 0000000000000002 R14: 00000024aa5bda97 R15: ffffffff89ee0e08
? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/preempt.h:103 ./include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194) 
? tick_nohz_idle_stop_tick (./include/linux/hrtimer.h:419 kernel/time/tick-sched.c:920 kernel/time/tick-sched.c:1062 kernel/time/tick-sched.c:1083) 
cpuidle_enter (drivers/cpuidle/cpuidle.c:353) 
do_idle (kernel/sched/idle.c:158 kernel/sched/idle.c:239 kernel/sched/idle.c:306) 
? arch_cpu_idle_exit+0x40/0x40 
cpu_startup_entry (kernel/sched/idle.c:402 (discriminator 1)) 
start_kernel (init/main.c:1137) 
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:283) 
</TASK>
Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 kvm_amd btusb btrtl ccp btbcm rng_core btintel libarc4 evdev leds_apu bluetooth kvm snd_pcm snd_timer jitterentropy_rng cfg80211 snd sha512_ssse3 sha512_generic sg soundcore irqbypass ctr pcspkr drbg ansi_cprng k10temp ecdh_generic rfkill ecc sp5100_tco watchdog acpi_cpufreq button drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci ohci_hcd ehci_pci ehci_hcd libata r8169 realtek mdio_devres usbcore scsi_mod i2c_piix4 usb_common scsi_common libphy
---[ end trace dc622588d92d6988 ]---
RIP: 0010:skb_release_data (./include/linux/skbuff.h:1549 net/core/skbuff.c:669) 
Code: 00 00 48 8b 75 28 48 85 f6 0f 84 d2 00 00 00 40 f6 c6 01 0f 85 a3 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 d3 03 00 00 48 8b 06 ba 01 00 00 00 48 89 df 0f
All code
========
   0:	00 00                	add    %al,(%rax)
   2:	48 8b 75 28          	mov    0x28(%rbp),%rsi
   6:	48 85 f6             	test   %rsi,%rsi
   9:	0f 84 d2 00 00 00    	je     0xe1
   f:	40 f6 c6 01          	test   $0x1,%sil
  13:	0f 85 a3 00 00 00    	jne    0xbc
  19:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  20:	fc ff df 
  23:	48 89 f2             	mov    %rsi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
  2a:*	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)		<-- trapping instruction
  2e:	0f 85 d3 03 00 00    	jne    0x407
  34:	48 8b 06             	mov    (%rsi),%rax
  37:	ba 01 00 00 00       	mov    $0x1,%edx
  3c:	48 89 df             	mov    %rbx,%rdi
  3f:	0f                   	.byte 0xf

Code starting with the faulting instruction
===========================================
   0:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)
   4:	0f 85 d3 03 00 00    	jne    0x3dd
   a:	48 8b 06             	mov    (%rsi),%rax
   d:	ba 01 00 00 00       	mov    $0x1,%edx
  12:	48 89 df             	mov    %rbx,%rdi
  15:	0f                   	.byte 0xf
RSP: 0018:ffff8880c7c09c50 EFLAGS: 00010216
RAX: dffffc0000000000 RBX: ffff888004c6bdc0 RCX: 1ffff1100076945d
RDX: 0008104200016463 RSI: 00408210000b231a RDI: ffff888003b4a2e8
RBP: ffff888003b4a2c0 R08: 0000000000000000 R09: ffff888004c6be97
R10: ffffed100098d7d2 R11: 0000000000000001 R12: ffff888003b4a2c0
R13: ffff888004c6be7c R14: ffff88800c641e58 R15: ffff888004c6be80
FS:  0000000000000000(0000) GS:ffff8880c7c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055d3a95f6778 CR3: 0000000017c20000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x5c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
general protection fault, probably for non-canonical address 0xe0080c4200016463: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: maybe wild-memory-access in range [0x00408210000b2318-0x00408210000b231f]
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #1
Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
RIP: 0010:skb_zcopy_clear (./include/linux/skbuff.h:1549) 
Code: e8 9a fe ff ff 48 85 c0 74 62 48 89 c5 48 89 df e8 ad ff ff ff 84 c0 75 2a 48 89 ea 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00 75 41 41 0f b6 d4 48 8b 45 00 48 89 ee 48 89 df 0f ae
All code
========
   0:	e8 9a fe ff ff       	callq  0xfffffffffffffe9f
   5:	48 85 c0             	test   %rax,%rax
   8:	74 62                	je     0x6c
   a:	48 89 c5             	mov    %rax,%rbp
   d:	48 89 df             	mov    %rbx,%rdi
  10:	e8 ad ff ff ff       	callq  0xffffffffffffffc2
  15:	84 c0                	test   %al,%al
  17:	75 2a                	jne    0x43
  19:	48 89 ea             	mov    %rbp,%rdx
  1c:	48 c1 ea 03          	shr    $0x3,%rdx
  20:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  27:	fc ff df 
  2a:*	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)		<-- trapping instruction
  2e:	75 41                	jne    0x71
  30:	41 0f b6 d4          	movzbl %r12b,%edx
  34:	48 8b 45 00          	mov    0x0(%rbp),%rax
  38:	48 89 ee             	mov    %rbp,%rsi
  3b:	48 89 df             	mov    %rbx,%rdi
  3e:	0f                   	.byte 0xf
  3f:	ae                   	scas   %es:(%rdi),%al

Code starting with the faulting instruction
===========================================
   0:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)
   4:	75 41                	jne    0x47
   6:	41 0f b6 d4          	movzbl %r12b,%edx
   a:	48 8b 45 00          	mov    0x0(%rbp),%rax
   e:	48 89 ee             	mov    %rbp,%rsi
  11:	48 89 df             	mov    %rbx,%rdi
  14:	0f                   	.byte 0xf
  15:	ae                   	scas   %es:(%rdi),%al
RSP: 0018:ffff8880c3a09c30 EFLAGS: 00010216
RAX: dffffc0000000000 RBX: ffff88800c233b40 RCX: ffffffff9fce961b
RDX: 0008104200016463 RSI: 0000000000000001 RDI: ffff888015edeae8
RBP: 00408210000b231a R08: 0000000000000000 R09: ffff88800c233c17
R10: ffffed1001846782 R11: 0000000000000001 R12: 0000000000000001
R13: ffff88800c233bbe R14: ffff88800b701e58 R15: dffffc0000000000
FS:  0000000000000000(0000) GS:ffff8880c3a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fdccf1c75c0 CR3: 00000000063a4000 CR4: 00000000000006f0
Call Trace:
<IRQ>
skb_release_data (net/core/skbuff.c:671) 
skb_release_all (net/core/skbuff.c:743) 
__kfree_skb (net/core/skbuff.c:757) 
consume_skb (net/core/skbuff.c:912) 
__dev_kfree_skb_any (net/core/dev.c:3038) 
ath11k_ce_tx_process_cb (drivers/net/wireless/ath/ath11k/ce.c:515) ath11k
? __local_bh_enable_ip (./arch/x86/include/asm/preempt.h:103 kernel/softirq.c:390) 
? ath11k_ce_alloc_pipes (drivers/net/wireless/ath/ath11k/ce.c:500) ath11k
? ath11k_hal_srng_access_end (drivers/net/wireless/ath/ath11k/hal.c:849) ath11k
ath11k_ce_per_engine_service (drivers/net/wireless/ath/ath11k/ce.c:694) ath11k
? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 ./include/linux/atomic/atomic-instrumented.h:513 ./include/asm-generic/qspinlock.h:82 ./include/linux/spinlock.h:185 ./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) 
? __lock_text_start (kernel/locking/spinlock.c:161) 
? ath11k_ce_tx_process_cb (drivers/net/wireless/ath/ath11k/ce.c:689) ath11k
? __wake_up_bit (kernel/sched/wait_bit.c:192) 
? __irq_put_desc_unlock (kernel/irq/irqdesc.c:819) 
ath11k_pci_ce_tasklet (drivers/net/wireless/ath/ath11k/pci.c:637) ath11k_pci
? tasklet_clear_sched (kernel/softirq.c:752) 
tasklet_action_common.constprop.0 (kernel/softirq.c:783) 
__do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:212 ./include/trace/events/irq.h:142 kernel/softirq.c:559) 
__irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:636) 
common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) 
</IRQ>
<TASK>
asm_common_interrupt (./arch/x86/include/asm/idtentry.h:629) 
RIP: 0010:cpuidle_enter_state (drivers/cpuidle/cpuidle.c:259) 
Code: ff e8 8e 95 db fe 80 3c 24 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 8e 06 00 00 31 ff e8 a1 b9 ef fe fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 52 03 00 00 4d 63 e5 4b 8d 04 64 49 8d 04 84 48 8d
All code
========
   0:	ff                   	(bad)  
   1:	e8 8e 95 db fe       	callq  0xfffffffffedb9594
   6:	80 3c 24 00          	cmpb   $0x0,(%rsp)
   a:	74 17                	je     0x23
   c:	9c                   	pushfq 
   d:	58                   	pop    %rax
   e:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  13:	f6 c4 02             	test   $0x2,%ah
  16:	0f 85 8e 06 00 00    	jne    0x6aa
  1c:	31 ff                	xor    %edi,%edi
  1e:	e8 a1 b9 ef fe       	callq  0xfffffffffeefb9c4
  23:	fb                   	sti    
  24:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
  2a:*	45 85 ed             	test   %r13d,%r13d		<-- trapping instruction
  2d:	0f 88 52 03 00 00    	js     0x385
  33:	4d 63 e5             	movslq %r13d,%r12
  36:	4b 8d 04 64          	lea    (%r12,%r12,2),%rax
  3a:	49 8d 04 84          	lea    (%r12,%rax,4),%rax
  3e:	48                   	rex.W
  3f:	8d                   	.byte 0x8d

Code starting with the faulting instruction
===========================================
   0:	45 85 ed             	test   %r13d,%r13d
   3:	0f 88 52 03 00 00    	js     0x35b
   9:	4d 63 e5             	movslq %r13d,%r12
   c:	4b 8d 04 64          	lea    (%r12,%r12,2),%rax
  10:	49 8d 04 84          	lea    (%r12,%rax,4),%rax
  14:	48                   	rex.W
  15:	8d                   	.byte 0x8d
RSP: 0018:ffffffffa1407de0 EFLAGS: 00000246
RAX: dffffc0000000000 RBX: ffff888003b20800 RCX: 1ffffffff41d935c
RDX: 1ffff11018748331 RSI: ffffffffa0a31b00 RDI: ffff8880c3a41988
RBP: ffffffffa18e0d20 R08: 0000000000000002 R09: ffff8880c3a41c2b
R10: ffffed1018748385 R11: 0000000000000001 R12: 0000000000000002
R13: 0000000000000002 R14: 0000001dfc72dae5 R15: ffffffffa18e0e08
? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/preempt.h:103 ./include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194) 
? tick_nohz_idle_stop_tick (./include/linux/hrtimer.h:419 kernel/time/tick-sched.c:920 kernel/time/tick-sched.c:1062 kernel/time/tick-sched.c:1083) 
cpuidle_enter (drivers/cpuidle/cpuidle.c:353) 
do_idle (kernel/sched/idle.c:158 kernel/sched/idle.c:239 kernel/sched/idle.c:306) 
? arch_cpu_idle_exit+0x40/0x40 
cpu_startup_entry (kernel/sched/idle.c:402 (discriminator 1)) 
start_kernel (init/main.c:1137) 
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:283) 
</TASK>
Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 kvm_amd btusb btrtl btbcm ccp btintel libarc4 rng_core evdev bluetooth cfg80211 kvm leds_apu jitterentropy_rng sha512_ssse3 sha512_generic snd_pcm ctr sg drbg snd_timer irqbypass ansi_cprng snd ecdh_generic rfkill soundcore ecc pcspkr k10temp sp5100_tco watchdog button acpi_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci libata ehci_pci ohci_hcd r8169 ehci_hcd realtek mdio_devres scsi_mod usbcore i2c_piix4 usb_common scsi_common libphy
---[ end trace bd73d57ff2669c03 ]---
RIP: 0010:skb_zcopy_clear (./include/linux/skbuff.h:1549) 
Code: e8 9a fe ff ff 48 85 c0 74 62 48 89 c5 48 89 df e8 ad ff ff ff 84 c0 75 2a 48 89 ea 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00 75 41 41 0f b6 d4 48 8b 45 00 48 89 ee 48 89 df 0f ae
All code
========
   0:	e8 9a fe ff ff       	callq  0xfffffffffffffe9f
   5:	48 85 c0             	test   %rax,%rax
   8:	74 62                	je     0x6c
   a:	48 89 c5             	mov    %rax,%rbp
   d:	48 89 df             	mov    %rbx,%rdi
  10:	e8 ad ff ff ff       	callq  0xffffffffffffffc2
  15:	84 c0                	test   %al,%al
  17:	75 2a                	jne    0x43
  19:	48 89 ea             	mov    %rbp,%rdx
  1c:	48 c1 ea 03          	shr    $0x3,%rdx
  20:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  27:	fc ff df 
  2a:*	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)		<-- trapping instruction
  2e:	75 41                	jne    0x71
  30:	41 0f b6 d4          	movzbl %r12b,%edx
  34:	48 8b 45 00          	mov    0x0(%rbp),%rax
  38:	48 89 ee             	mov    %rbp,%rsi
  3b:	48 89 df             	mov    %rbx,%rdi
  3e:	0f                   	.byte 0xf
  3f:	ae                   	scas   %es:(%rdi),%al

Code starting with the faulting instruction
===========================================
   0:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)
   4:	75 41                	jne    0x47
   6:	41 0f b6 d4          	movzbl %r12b,%edx
   a:	48 8b 45 00          	mov    0x0(%rbp),%rax
   e:	48 89 ee             	mov    %rbp,%rsi
  11:	48 89 df             	mov    %rbx,%rdi
  14:	0f                   	.byte 0xf
  15:	ae                   	scas   %es:(%rdi),%al
RSP: 0018:ffff8880c3a09c30 EFLAGS: 00010216
RAX: dffffc0000000000 RBX: ffff88800c233b40 RCX: ffffffff9fce961b
RDX: 0008104200016463 RSI: 0000000000000001 RDI: ffff888015edeae8
RBP: 00408210000b231a R08: 0000000000000000 R09: ffff88800c233c17
R10: ffffed1001846782 R11: 0000000000000001 R12: 0000000000000001
R13: ffff88800c233bbe R14: ffff88800b701e58 R15: dffffc0000000000
FS:  0000000000000000(0000) GS:ffff8880c3a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fdccf1c75c0 CR3: 00000000063a4000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x1d800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
general protection fault, probably for non-canonical address 0xe0080c4200016463: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: maybe wild-memory-access in range [0x00408210000b2318-0x00408210000b231f]
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #1
Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
RIP: 0010:skb_zcopy_clear+0x34/0x8f
Code: e8 9a fe ff ff 48 85 c0 74 62 48 89 c5 48 89 df e8 ad ff ff ff 84 c0 75 2a 48 89 ea 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00 75 41 41 0f b6 d4 48 8b 45 00 48 89 ee 48 89 df 0f ae
RSP: 0018:ffff8880c3a09c30 EFLAGS: 00010216
RAX: dffffc0000000000 RBX: ffff88800c233b40 RCX: ffffffff9fce961b
RDX: 0008104200016463 RSI: 0000000000000001 RDI: ffff888015edeae8
RBP: 00408210000b231a R08: 0000000000000000 R09: ffff88800c233c17
R10: ffffed1001846782 R11: 0000000000000001 R12: 0000000000000001
R13: ffff88800c233bbe R14: ffff88800b701e58 R15: dffffc0000000000
FS:  0000000000000000(0000) GS:ffff8880c3a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fdccf1c75c0 CR3: 00000000063a4000 CR4: 00000000000006f0
Call Trace:
 <IRQ>
 skb_release_data+0x91/0x1de
 skb_release_all+0x3e/0x47
 __kfree_skb+0xe/0x18
 consume_skb+0x24/0x26
 __dev_kfree_skb_any+0x2a/0x2b
 ath11k_ce_tx_process_cb+0x3ef/0x8d0 [ath11k]
 ? __local_bh_enable_ip+0x37/0x80
 ? ath11k_ce_alloc_pipes+0x5c0/0x5c0 [ath11k]
 ? ath11k_hal_srng_access_end+0x1d7/0x5d0 [ath11k]
 ath11k_ce_per_engine_service+0x96b/0xc60 [ath11k]
 ? _raw_spin_lock_irqsave+0x9a/0xf0
 ? __lock_text_start+0x8/0x8
 ? ath11k_ce_tx_process_cb+0x8d0/0x8d0 [ath11k]
 ? __wake_up_bit+0x100/0x100
 ? __irq_put_desc_unlock+0x18/0x90
 ath11k_pci_ce_tasklet+0x64/0x100 [ath11k_pci]
 ? tasklet_clear_sched+0x47/0xe0
 tasklet_action_common.constprop.0+0x240/0x2d0
 __do_softirq+0x1b0/0x5b9
 __irq_exit_rcu+0xc6/0x170
 common_interrupt+0xa9/0xc0
 </IRQ>
 <TASK>
 asm_common_interrupt+0x1e/0x40
RIP: 0010:cpuidle_enter_state+0x196/0xa60
Code: ff e8 8e 95 db fe 80 3c 24 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 8e 06 00 00 31 ff e8 a1 b9 ef fe fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 52 03 00 00 4d 63 e5 4b 8d 04 64 49 8d 04 84 48 8d
RSP: 0018:ffffffffa1407de0 EFLAGS: 00000246
RAX: dffffc0000000000 RBX: ffff888003b20800 RCX: 1ffffffff41d935c
RDX: 1ffff11018748331 RSI: ffffffffa0a31b00 RDI: ffff8880c3a41988
RBP: ffffffffa18e0d20 R08: 0000000000000002 R09: ffff8880c3a41c2b
R10: ffffed1018748385 R11: 0000000000000001 R12: 0000000000000002
R13: 0000000000000002 R14: 0000001dfc72dae5 R15: ffffffffa18e0e08
 ? _raw_spin_unlock_irqrestore+0x25/0x40
 ? tick_nohz_idle_stop_tick+0x599/0xa60
 cpuidle_enter+0x4a/0xa0
 do_idle+0x3d7/0x530
 ? arch_cpu_idle_exit+0x40/0x40
 cpu_startup_entry+0x19/0x20
 start_kernel+0x38d/0x3ab
 secondary_startup_64_no_verify+0xb0/0xbb
 </TASK>
Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 kvm_amd btusb btrtl btbcm ccp btintel libarc4 rng_core evdev bluetooth cfg80211 kvm leds_apu jitterentropy_rng sha512_ssse3 sha512_generic snd_pcm ctr sg drbg snd_timer irqbypass ansi_cprng snd ecdh_generic rfkill soundcore ecc pcspkr k10temp sp5100_tco watchdog button acpi_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci libata ehci_pci ohci_hcd r8169 ehci_hcd realtek mdio_devres scsi_mod usbcore i2c_piix4 usb_common scsi_common libphy
---[ end trace bd73d57ff2669c03 ]---
RIP: 0010:skb_zcopy_clear+0x34/0x8f
Code: e8 9a fe ff ff 48 85 c0 74 62 48 89 c5 48 89 df e8 ad ff ff ff 84 c0 75 2a 48 89 ea 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00 75 41 41 0f b6 d4 48 8b 45 00 48 89 ee 48 89 df 0f ae
RSP: 0018:ffff8880c3a09c30 EFLAGS: 00010216
RAX: dffffc0000000000 RBX: ffff88800c233b40 RCX: ffffffff9fce961b
RDX: 0008104200016463 RSI: 0000000000000001 RDI: ffff888015edeae8
RBP: 00408210000b231a R08: 0000000000000000 R09: ffff88800c233c17
R10: ffffed1001846782 R11: 0000000000000001 R12: 0000000000000001
R13: ffff88800c233bbe R14: ffff88800b701e58 R15: dffffc0000000000
FS:  0000000000000000(0000) GS:ffff8880c3a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fdccf1c75c0 CR3: 00000000063a4000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x1d800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
general protection fault, probably for non-canonical address 0x408210000b231a: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #1
Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
RIP: 0010:skb_zcopy_clear+0x3f/0x70
Code: 48 89 c7 e8 81 02 00 00 48 89 45 f8 48 83 7d f8 00 74 45 48 8b 45 f0 48 89 c7 e8 ab 02 00 00 83 f0 01 84 c0 74 1e 48 8b 45 f8 <4c> 8b 00 0f b6 55 ec 48 8b 4d f8 48 8b 45 f0 48 89 ce 48 89 c7 e8
RSP: 0018:ffffb58e80003de8 EFLAGS: 00010202
RAX: 00408210000b231a RBX: ffff8aa303097b00 RCX: 0000000000000000
RDX: 0000000000000102 RSI: 0000000000000001 RDI: ffff8aa303097b00
RBP: ffffb58e80003e00 R08: 0000000000000212 R09: ffffffff922d24e8
R10: 0000000000000000 R11: 00000000db69d000 R12: ffff8aa310c69ac0
R13: ffff8aa303097b00 R14: ffff8aa3062235d8 R15: 0000000000000005
FS:  0000000000000000(0000) GS:ffff8aa31ac00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055dac9d55408 CR3: 00000001090fa000 CR4: 00000000000006f0
Call Trace:
 <IRQ>
 skb_release_data+0x4b/0xa2
 skb_release_all+0x20/0x22
 __kfree_skb+0xe/0x18
 consume_skb+0x24/0x26
 __dev_kfree_skb_any+0x2a/0x2b
 ath11k_ce_tx_process_cb+0x157/0x220 [ath11k]
 ath11k_ce_per_engine_service+0x3c0/0x3d0 [ath11k]
 ? _raw_spin_lock_irqsave+0x26/0x50
 ath11k_pci_ce_tasklet+0x1c/0x40 [ath11k_pci]
 tasklet_action_common.constprop.0+0xaf/0xe0
 __do_softirq+0xec/0x2e9
 __irq_exit_rcu+0xbc/0x110
 common_interrupt+0xb8/0xd0
 </IRQ>
 <TASK>
 asm_common_interrupt+0x1e/0x40
RIP: 0010:cpuidle_enter_state+0xda/0x370
Code: 31 ff e8 d9 c6 9e ff 45 84 ff 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 78 02 00 00 31 ff e8 bd 97 a5 ff fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 11 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d
RSP: 0018:ffffffff92203e60 EFLAGS: 00000246
RAX: ffff8aa31ac00000 RBX: 0000000000000002 RCX: 000000000000001f
RDX: 0000000000000000 RSI: ffffffff91b70667 RDI: ffffffff91b55729
RBP: ffff8aa300906c00 R08: 0000000955084e02 R09: 0000000000000018
R10: 0000000000000001 R11: 0000000000001015 R12: ffffffff923d05c0
R13: 0000000955084e02 R14: 0000000000000002 R15: 0000000000000000
 cpuidle_enter+0x29/0x40
 do_idle+0x200/0x2b0
 cpu_startup_entry+0x19/0x20
 start_kernel+0x6b7/0x6dc
 secondary_startup_64_no_verify+0xb0/0xbb
 </TASK>
Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 btusb btrtl btbcm btintel bluetooth libarc4 kvm_amd cfg80211 ccp rng_core jitterentropy_rng kvm sha512_ssse3 sha512_generic evdev ctr snd_pcm drbg sg snd_timer ansi_cprng leds_apu irqbypass ecdh_generic snd rfkill ecc soundcore pcspkr k10temp sp5100_tco watchdog button acpi_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci ohci_hcd ehci_pci ehci_hcd libata r8169 realtek mdio_devres scsi_mod usbcore i2c_piix4 usb_common scsi_common libphy
---[ end trace 23d792ef4816c4de ]---
RIP: 0010:skb_zcopy_clear+0x3f/0x70
Code: 48 89 c7 e8 81 02 00 00 48 89 45 f8 48 83 7d f8 00 74 45 48 8b 45 f0 48 89 c7 e8 ab 02 00 00 83 f0 01 84 c0 74 1e 48 8b 45 f8 <4c> 8b 00 0f b6 55 ec 48 8b 4d f8 48 8b 45 f0 48 89 ce 48 89 c7 e8
RSP: 0018:ffffb58e80003de8 EFLAGS: 00010202
RAX: 00408210000b231a RBX: ffff8aa303097b00 RCX: 0000000000000000
RDX: 0000000000000102 RSI: 0000000000000001 RDI: ffff8aa303097b00
RBP: ffffb58e80003e00 R08: 0000000000000212 R09: ffffffff922d24e8
R10: 0000000000000000 R11: 00000000db69d000 R12: ffff8aa310c69ac0
R13: ffff8aa303097b00 R14: ffff8aa3062235d8 R15: 0000000000000005
FS:  0000000000000000(0000) GS:ffff8aa31ac00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055dac9d55408 CR3: 00000001090fa000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0xfa00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
general protection fault, probably for non-canonical address 0x408210000b231a: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc1+ #1
Hardware name: PC Engines APU/APU, BIOS 4.0 09/08/2014
RIP: 0010:skb_zcopy_clear (./include/linux/skbuff.h:1551) 
Code: 48 89 c7 e8 81 02 00 00 48 89 45 f8 48 83 7d f8 00 74 45 48 8b 45 f0 48 89 c7 e8 ab 02 00 00 83 f0 01 84 c0 74 1e 48 8b 45 f8 <4c> 8b 00 0f b6 55 ec 48 8b 4d f8 48 8b 45 f0 48 89 ce 48 89 c7 e8
All code
========
   0:	48 89 c7             	mov    %rax,%rdi
   3:	e8 81 02 00 00       	callq  0x289
   8:	48 89 45 f8          	mov    %rax,-0x8(%rbp)
   c:	48 83 7d f8 00       	cmpq   $0x0,-0x8(%rbp)
  11:	74 45                	je     0x58
  13:	48 8b 45 f0          	mov    -0x10(%rbp),%rax
  17:	48 89 c7             	mov    %rax,%rdi
  1a:	e8 ab 02 00 00       	callq  0x2ca
  1f:	83 f0 01             	xor    $0x1,%eax
  22:	84 c0                	test   %al,%al
  24:	74 1e                	je     0x44
  26:	48 8b 45 f8          	mov    -0x8(%rbp),%rax
  2a:*	4c 8b 00             	mov    (%rax),%r8		<-- trapping instruction
  2d:	0f b6 55 ec          	movzbl -0x14(%rbp),%edx
  31:	48 8b 4d f8          	mov    -0x8(%rbp),%rcx
  35:	48 8b 45 f0          	mov    -0x10(%rbp),%rax
  39:	48 89 ce             	mov    %rcx,%rsi
  3c:	48 89 c7             	mov    %rax,%rdi
  3f:	e8                   	.byte 0xe8

Code starting with the faulting instruction
===========================================
   0:	4c 8b 00             	mov    (%rax),%r8
   3:	0f b6 55 ec          	movzbl -0x14(%rbp),%edx
   7:	48 8b 4d f8          	mov    -0x8(%rbp),%rcx
   b:	48 8b 45 f0          	mov    -0x10(%rbp),%rax
   f:	48 89 ce             	mov    %rcx,%rsi
  12:	48 89 c7             	mov    %rax,%rdi
  15:	e8                   	.byte 0xe8
RSP: 0018:ffffb58e80003de8 EFLAGS: 00010202
RAX: 00408210000b231a RBX: ffff8aa303097b00 RCX: 0000000000000000
RDX: 0000000000000102 RSI: 0000000000000001 RDI: ffff8aa303097b00
RBP: ffffb58e80003e00 R08: 0000000000000212 R09: ffffffff922d24e8
R10: 0000000000000000 R11: 00000000db69d000 R12: ffff8aa310c69ac0
R13: ffff8aa303097b00 R14: ffff8aa3062235d8 R15: 0000000000000005
FS:  0000000000000000(0000) GS:ffff8aa31ac00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055dac9d55408 CR3: 00000001090fa000 CR4: 00000000000006f0
Call Trace:
<IRQ>
skb_release_data (net/core/skbuff.c:671) 
skb_release_all (net/core/skbuff.c:743) 
__kfree_skb (net/core/skbuff.c:757) 
consume_skb (net/core/skbuff.c:912) 
__dev_kfree_skb_any (net/core/dev.c:3038) 
ath11k_ce_tx_process_cb (drivers/net/wireless/ath/ath11k/ce.c:515) ath11k
ath11k_ce_per_engine_service (drivers/net/wireless/ath/ath11k/ce.c:694) ath11k
? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 ./include/linux/atomic/atomic-instrumented.h:513 ./include/asm-generic/qspinlock.h:82 ./include/linux/spinlock.h:185 ./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) 
ath11k_pci_ce_tasklet (drivers/net/wireless/ath/ath11k/pci.c:637) ath11k_pci
tasklet_action_common.constprop.0 (./arch/x86/include/asm/bitops.h:75 ./include/asm-generic/bitops/instrumented-atomic.h:42 kernel/softirq.c:879 kernel/softirq.c:787) 
__do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:212 ./include/trace/events/irq.h:142 kernel/softirq.c:559) 
__irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:636) 
common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) 
</IRQ>
<TASK>
asm_common_interrupt (./arch/x86/include/asm/idtentry.h:629) 
RIP: 0010:cpuidle_enter_state (drivers/cpuidle/cpuidle.c:259) 
Code: 31 ff e8 d9 c6 9e ff 45 84 ff 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 78 02 00 00 31 ff e8 bd 97 a5 ff fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 11 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d
All code
========
   0:	31 ff                	xor    %edi,%edi
   2:	e8 d9 c6 9e ff       	callq  0xffffffffff9ec6e0
   7:	45 84 ff             	test   %r15b,%r15b
   a:	74 17                	je     0x23
   c:	9c                   	pushfq 
   d:	58                   	pop    %rax
   e:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  13:	f6 c4 02             	test   $0x2,%ah
  16:	0f 85 78 02 00 00    	jne    0x294
  1c:	31 ff                	xor    %edi,%edi
  1e:	e8 bd 97 a5 ff       	callq  0xffffffffffa597e0
  23:	fb                   	sti    
  24:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
  2a:*	45 85 f6             	test   %r14d,%r14d		<-- trapping instruction
  2d:	0f 88 11 01 00 00    	js     0x144
  33:	49 63 c6             	movslq %r14d,%rax
  36:	4c 2b 2c 24          	sub    (%rsp),%r13
  3a:	48 8d 14 40          	lea    (%rax,%rax,2),%rdx
  3e:	48                   	rex.W
  3f:	8d                   	.byte 0x8d

Code starting with the faulting instruction
===========================================
   0:	45 85 f6             	test   %r14d,%r14d
   3:	0f 88 11 01 00 00    	js     0x11a
   9:	49 63 c6             	movslq %r14d,%rax
   c:	4c 2b 2c 24          	sub    (%rsp),%r13
  10:	48 8d 14 40          	lea    (%rax,%rax,2),%rdx
  14:	48                   	rex.W
  15:	8d                   	.byte 0x8d
RSP: 0018:ffffffff92203e60 EFLAGS: 00000246
RAX: ffff8aa31ac00000 RBX: 0000000000000002 RCX: 000000000000001f
RDX: 0000000000000000 RSI: ffffffff91b70667 RDI: ffffffff91b55729
RBP: ffff8aa300906c00 R08: 0000000955084e02 R09: 0000000000000018
R10: 0000000000000001 R11: 0000000000001015 R12: ffffffff923d05c0
R13: 0000000955084e02 R14: 0000000000000002 R15: 0000000000000000
cpuidle_enter (drivers/cpuidle/cpuidle.c:353) 
do_idle (kernel/sched/idle.c:158 kernel/sched/idle.c:239 kernel/sched/idle.c:306) 
cpu_startup_entry (kernel/sched/idle.c:402 (discriminator 1)) 
start_kernel (init/main.c:1137) 
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:283) 
</TASK>
Modules linked in: qrtr_mhi qrtr ath11k_pci mhi ath11k qmi_helpers mac80211 btusb btrtl btbcm btintel bluetooth libarc4 kvm_amd cfg80211 ccp rng_core jitterentropy_rng kvm sha512_ssse3 sha512_generic evdev ctr snd_pcm drbg sg snd_timer ansi_cprng leds_apu irqbypass ecdh_generic snd rfkill ecc soundcore pcspkr k10temp sp5100_tco watchdog button acpi_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common uas usb_storage ohci_pci ahci libahci ohci_hcd ehci_pci ehci_hcd libata r8169 realtek mdio_devres scsi_mod usbcore i2c_piix4 usb_common scsi_common libphy
---[ end trace 23d792ef4816c4de ]---
RIP: 0010:skb_zcopy_clear (./include/linux/skbuff.h:1551) 
Code: 48 89 c7 e8 81 02 00 00 48 89 45 f8 48 83 7d f8 00 74 45 48 8b 45 f0 48 89 c7 e8 ab 02 00 00 83 f0 01 84 c0 74 1e 48 8b 45 f8 <4c> 8b 00 0f b6 55 ec 48 8b 4d f8 48 8b 45 f0 48 89 ce 48 89 c7 e8
All code
========
   0:	48 89 c7             	mov    %rax,%rdi
   3:	e8 81 02 00 00       	callq  0x289
   8:	48 89 45 f8          	mov    %rax,-0x8(%rbp)
   c:	48 83 7d f8 00       	cmpq   $0x0,-0x8(%rbp)
  11:	74 45                	je     0x58
  13:	48 8b 45 f0          	mov    -0x10(%rbp),%rax
  17:	48 89 c7             	mov    %rax,%rdi
  1a:	e8 ab 02 00 00       	callq  0x2ca
  1f:	83 f0 01             	xor    $0x1,%eax
  22:	84 c0                	test   %al,%al
  24:	74 1e                	je     0x44
  26:	48 8b 45 f8          	mov    -0x8(%rbp),%rax
  2a:*	4c 8b 00             	mov    (%rax),%r8		<-- trapping instruction
  2d:	0f b6 55 ec          	movzbl -0x14(%rbp),%edx
  31:	48 8b 4d f8          	mov    -0x8(%rbp),%rcx
  35:	48 8b 45 f0          	mov    -0x10(%rbp),%rax
  39:	48 89 ce             	mov    %rcx,%rsi
  3c:	48 89 c7             	mov    %rax,%rdi
  3f:	e8                   	.byte 0xe8

Code starting with the faulting instruction
===========================================
   0:	4c 8b 00             	mov    (%rax),%r8
   3:	0f b6 55 ec          	movzbl -0x14(%rbp),%edx
   7:	48 8b 4d f8          	mov    -0x8(%rbp),%rcx
   b:	48 8b 45 f0          	mov    -0x10(%rbp),%rax
   f:	48 89 ce             	mov    %rcx,%rsi
  12:	48 89 c7             	mov    %rax,%rdi
  15:	e8                   	.byte 0xe8
RSP: 0018:ffffb58e80003de8 EFLAGS: 00010202
RAX: 00408210000b231a RBX: ffff8aa303097b00 RCX: 0000000000000000
RDX: 0000000000000102 RSI: 0000000000000001 RDI: ffff8aa303097b00
RBP: ffffb58e80003e00 R08: 0000000000000212 R09: ffffffff922d24e8
R10: 0000000000000000 R11: 00000000db69d000 R12: ffff8aa310c69ac0
R13: ffff8aa303097b00 R14: ffff8aa3062235d8 R15: 0000000000000005
FS:  0000000000000000(0000) GS:ffff8aa31ac00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055dac9d55408 CR3: 00000001090fa000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0xfa00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
Wen Gong Dec. 7, 2021, 4:35 a.m. UTC | #6
On 12/7/2021 4:03 AM, Sven Eckelmann wrote:
> On Monday, 6 December 2021 08:10:40 CET Wen Gong wrote:
>>> On Monday, 6 December 2021 04:29:39 CET Wen Gong wrote:
>>> [...]
>>>> I tested this in my setup and did not see the crash.
>>>>
>>>> I am afraid you also need this patch("ath11k: change to use dynamic
>>>> memory for channel list of scan",
>>>>
>>>> https://patchwork.kernel.org/project/linux-wireless/patch/20211129110939.15711-1-quic_wgong@quicinc.com
>>>> )
>>>>
>>>> Could you apply this patch and try again?
>>> Tried it and I see the same problem.
>> Could you tell me what your test steps are?
> Start kernel with commit a93789ae541c ("ath11k: Avoid NULL ptr
> access during mgmt tx cleanup") + patches:
>
> * ath11k: enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855
> * ath11k: change to use dynamic memory for channel list of scan
>
> You can find the config in the first mail. But I have now enabled KASAN inline
> to hopefully create some better error messages.
>
> The firmware + board data (see mail "ath11k: incorrect board_id retrieval")
> was prepared like this:
>
>     git clone https://github.com/kvalo/ath11k-firmware /root/ath11k-firmware
>     mkdir -p /lib/firmware/ath11k/WCN6855/hw2.0/
>     cp /root/ath11k-firmware/WCN6855/hw2.0/*.bin /lib/firmware/ath11k/WCN6855/hw2.0/
>     cp /root/ath11k-firmware/WCN6855/hw2.0/1.1/WLAN.HSP.1.1-02892.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1/*.bin /lib/firmware/ath11k/WCN6855/hw2.0/
>
>     git clone https://github.com/qca/qca-swiss-army-knife /root/qca-swiss-army-knife
>     apt install python2
>     python2 /root/qca-swiss-army-knife/tools/scripts/ath11k/ath11k-bdencoder  -e /lib/firmware/ath11k/WCN6855/hw2.0/board-2.bin
>     rm /lib/firmware/ath11k/WCN6855/hw2.0/board-2.bin
>     cp 'bus=pci,vendor=17cb,device=1103,subsystem-vendor=17cb,subsystem-device=3374,qmi-chip-id=2,qmi-board-id=266.bin' /lib/firmware/ath11k/WCN6855/hw2.0/board.bin
>
> Then I am just starting up the device as usual, and start wpa_supplicant (with
> defconfig + CONFIG_MESH=y) from commit 14ab4a816c68 ("Reject
> ap_vendor_elements if its length is odd")
>
>      cat << "EOF" > station_test.cfg
>      network={
>        ssid="MyTestAP"
>        key_mgmt=WPA-PSK FT-PSK
>        proto=RSN
>        psk="testtest"
>      }
>      EOF
>      ip link set up dev wlp6s0
>      ~/hostap/wpa_supplicant/wpa_supplicant -D nl80211 -i wlp6s0 -c station_test.cfg
>
> The actual SSID + PSK is valid and multiple access points (4) have this BSS on
> 2.4GHz + 5GHz.
>
> So you are basically always calling dev_kfree_skb_any in ath11k_ce_tx_process_cb
> because wcn6855 hw2.0 has credit_flow set. But it seems like one of the
> entries returned by ath11k_ce_completed_send_next is bogus and causes these
> problems in ath11k_ce_tx_process_cb. And for some reason, this is
> triggered here by this firmware feature.
>
>      ./scripts/faddr2line --list vmlinux consume_skb+0x9f/0x1c0
>      consume_skb+0x9f/0x1c0:
>      
>      __kfree_skb at net/core/skbuff.c:757
>       752     */
>       753
>       754    void __kfree_skb(struct sk_buff *skb)
>       755    {
>       756            skb_release_all(skb);
>      >757<           kfree_skbmem(skb);
>       758    }
>       759    EXPORT_SYMBOL(__kfree_skb);
>       760
>       761    /**
>       762     *      kfree_skb - free an sk_buff
>      
>      (inlined by) consume_skb at net/core/skbuff.c:912
>       907    {
>       908            if (!skb_unref(skb))
>       909                    return;
>       910
>       911            trace_consume_skb(skb);
>      >912<           __kfree_skb(skb);
>       913    }
>       914    EXPORT_SYMBOL(consume_skb);
>       915    #endif
>       916
>       917    /**
>      
>      (inlined by) consume_skb at net/core/skbuff.c:906
>       901     *
>       902     *      Drop a ref to the buffer and free it if the usage count has hit zero
>       903     *      Functions identically to kfree_skb, but kfree_skb assumes that the frame
>       904     *      is being dropped after a failure and notes that
>       905     */
>      >906<   void consume_skb(struct sk_buff *skb)
>       907    {
>       908            if (!skb_unref(skb))
>       909                    return;
>       910
>       911            trace_consume_skb(skb);
>
>
>      ./scripts/faddr2line --list vmlinux skb_release_data+0x1b0/0x5c0
>      skb_release_data+0x1b0/0x5c0:
>      
>      skb_zcopy_clear at include/linux/skbuff.h:1549
>       1544   {
>       1545           struct ubuf_info *uarg = skb_zcopy(skb);
>       1546
>       1547           if (uarg) {
>       1548                   if (!skb_zcopy_is_nouarg(skb))
>      >1549<                          uarg->callback(skb, uarg, zerocopy_success);
>       1550
>       1551                   skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
>       1552           }
>       1553   }
>       1554
>      
>      (inlined by) skb_release_data at net/core/skbuff.c:669
>       664            if (skb->cloned &&
>       665                atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
>       666                                  &shinfo->dataref))
>       667                    goto exit;
>       668
>      >669<           skb_zcopy_clear(skb, true);
>       670
>       671            for (i = 0; i < shinfo->nr_frags; i++)
>       672                    __skb_frag_unref(&shinfo->frags[i], skb->pp_recycle);
>       673
>       674            if (shinfo->frag_list)
>
> But I didn't like the inlined code. So I've changed the compilation flags
> slightly:
>
>      diff --git a/net/core/Makefile b/net/core/Makefile
>      index 6bdcb2cafed8..5eda226c5f27 100644
>      --- a/net/core/Makefile
>      +++ b/net/core/Makefile
>      @@ -37,3 +37,4 @@ obj-$(CONFIG_NET_SOCK_MSG) += skmsg.o
>       obj-$(CONFIG_BPF_SYSCALL) += sock_map.o
>       obj-$(CONFIG_BPF_SYSCALL) += bpf_sk_storage.o
>       obj-$(CONFIG_OF)	+= of_net.o
>      +ccflags-y += -fno-inline -O1 -fno-optimize-sibling-calls
>
> Now the stacktrace is a lot more readable. And the returned
> crash location makes a lot more sense:
>
>      ./scripts/faddr2line --list vmlinux 'skb_zcopy_clear+0x34/0x8f'
>      skb_zcopy_clear+0x34/0x8f:
>      
>      skb_zcopy_clear at include/linux/skbuff.h:1549
>       1544   {
>       1545           struct ubuf_info *uarg = skb_zcopy(skb);
>       1546
>       1547           if (uarg) {
>       1548                   if (!skb_zcopy_is_nouarg(skb))
>      >1549<                          uarg->callback(skb, uarg, zerocopy_success);
>       1550
>       1551                   skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
>       1552           }
>       1553   }
>       1554
>
> Or with the assembler:
>
>       (gdb) disassemble /m *(skb_zcopy_clear+0x34/0x8f)
>       Dump of assembler code for function skb_zcopy_clear:
>       1544    {
>          0x000000000000072a <+0>:     push   %r12
>          0x000000000000072c <+2>:     push   %rbp
>          0x000000000000072d <+3>:     push   %rbx
>          0x000000000000072e <+4>:     mov    %rdi,%rbx
>          0x0000000000000731 <+7>:     mov    %esi,%r12d
>       
>       1545            struct ubuf_info *uarg = skb_zcopy(skb);
>          0x0000000000000734 <+10>:    call   0x5d3 <skb_zcopy>
>       
>       1546
>       1547            if (uarg) {
>          0x0000000000000739 <+15>:    test   %rax,%rax
>          0x000000000000073c <+18>:    je     0x7a0 <skb_zcopy_clear+118>
>          0x000000000000073e <+20>:    mov    %rax,%rbp
>       
>       1548                    if (!skb_zcopy_is_nouarg(skb))
>          0x0000000000000741 <+23>:    mov    %rbx,%rdi
>          0x0000000000000744 <+26>:    call   0x6f6 <skb_zcopy_is_nouarg>
>          0x0000000000000749 <+31>:    test   %al,%al
>          0x000000000000074b <+33>:    jne    0x777 <skb_zcopy_clear+77>
>       
>       1549                            uarg->callback(skb, uarg, zerocopy_success);
>          0x000000000000074d <+35>:    mov    %rbp,%rdx
>          0x0000000000000750 <+38>:    shr    $0x3,%rdx
>          0x0000000000000754 <+42>:    movabs $0xdffffc0000000000,%rax
>          0x000000000000075e <+52>:    cmpb   $0x0,(%rdx,%rax,1)
>          0x0000000000000762 <+56>:    jne    0x7a5 <skb_zcopy_clear+123>
>          0x0000000000000764 <+58>:    movzbl %r12b,%edx
>          0x0000000000000768 <+62>:    mov    0x0(%rbp),%rax
>          0x000000000000076c <+66>:    mov    %rbp,%rsi
>          0x000000000000076f <+69>:    mov    %rbx,%rdi
>          0x0000000000000772 <+72>:    call   0x777 <skb_zcopy_clear+77>
>          0x00000000000007a5 <+123>:   mov    %rbp,%rdi
>          0x00000000000007a8 <+126>:   call   0x7ad <skb_zcopy_clear+131>
>          0x00000000000007ad <+131>:   jmp    0x764 <skb_zcopy_clear+58>
>       
>       1550
>       1551                    skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
>          0x0000000000000777 <+77>:    mov    %rbx,%rdi
>          0x000000000000077a <+80>:    call   0x518 <skb_end_pointer>
>          0x000000000000077f <+85>:    mov    %rax,%rbx
>          0x0000000000000782 <+88>:    mov    %rax,%rdx
>          0x0000000000000785 <+91>:    shr    $0x3,%rdx
>          0x0000000000000789 <+95>:    movabs $0xdffffc0000000000,%rax
>          0x0000000000000793 <+105>:   movzbl (%rdx,%rax,1),%eax
>          0x0000000000000797 <+109>:   test   %al,%al
>          0x0000000000000799 <+111>:   je     0x79d <skb_zcopy_clear+115>
>          0x000000000000079b <+113>:   jle    0x7af <skb_zcopy_clear+133>
>          0x000000000000079d <+115>:   andb   $0xf8,(%rbx)
>          0x00000000000007af <+133>:   mov    %rbx,%rdi
>          0x00000000000007b2 <+136>:   call   0x7b7 <skb_zcopy_clear+141>
>          0x00000000000007b7 <+141>:   jmp    0x79d <skb_zcopy_clear+115>
>       
>       1552            }
>       1553    }
>          0x00000000000007a0 <+118>:   pop    %rbx
>          0x00000000000007a1 <+119>:   pop    %rbp
>          0x00000000000007a2 <+120>:   pop    %r12
>          0x00000000000007a4 <+122>:   ret
>       
>       End of assembler dump.
>
> To make it even easier to read, just disable the inline KASAN and reduce the
> optimization level for this function:
>
>      diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>      index 059b6266dcd7..819cc58ab051 100644
>      --- a/include/linux/skbuff.h
>      +++ b/include/linux/skbuff.h
>      @@ -1540,6 +1540,8 @@ static inline void net_zcopy_put_abort(struct ubuf_info *uarg, bool have_uref)
>       }
>       
>       /* Release a reference on a zerocopy structure */
>      +#pragma GCC push_options
>      +#pragma GCC optimize ("O0")
>       static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)
>       {
>       	struct ubuf_info *uarg = skb_zcopy(skb);
>      @@ -1551,6 +1553,7 @@ static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)
>       		skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
>       	}
>       }
>      +#pragma GCC pop_options
>       
>       static inline void skb_mark_not_on_list(struct sk_buff *skb)
>       {
>
> This creates this nice, unoptimized function which crashes at +63:
>
>      $ gdb net/core/skbuff.o -q
>      Reading symbols from net/core/skbuff.o...
>      (gdb) disassemble /m *(skb_zcopy_clear+0x3f/0x70)
>      Dump of assembler code for function skb_zcopy_clear:
>      1546    {
>         0x0000000000000000 <+0>:     push   %rbp
>         0x0000000000000001 <+1>:     mov    %rsp,%rbp
>         0x0000000000000004 <+4>:     sub    $0x18,%rsp
>         0x0000000000000008 <+8>:     mov    %rdi,-0x10(%rbp)
>         0x000000000000000c <+12>:    mov    %esi,%eax
>         0x000000000000000e <+14>:    mov    %al,-0x14(%rbp)
>      
>      1547            struct ubuf_info *uarg = skb_zcopy(skb);
>         0x0000000000000011 <+17>:    mov    -0x10(%rbp),%rax
>         0x0000000000000015 <+21>:    mov    %rax,%rdi
>         0x0000000000000018 <+24>:    call   0x29e <skb_zcopy>
>         0x000000000000001d <+29>:    mov    %rax,-0x8(%rbp)
>      
>      1548
>      1549            if (uarg) {
>         0x0000000000000021 <+33>:    cmpq   $0x0,-0x8(%rbp)
>         0x0000000000000026 <+38>:    je     0x6d <skb_zcopy_clear+109>
>      
>      1550                    if (!skb_zcopy_is_nouarg(skb))
>         0x0000000000000028 <+40>:    mov    -0x10(%rbp),%rax
>         0x000000000000002c <+44>:    mov    %rax,%rdi
>         0x000000000000002f <+47>:    call   0x2df <skb_zcopy_is_nouarg>
>         0x0000000000000034 <+52>:    xor    $0x1,%eax
>         0x0000000000000037 <+55>:    test   %al,%al
>         0x0000000000000039 <+57>:    je     0x59 <skb_zcopy_clear+89>
>      
>      1551                            uarg->callback(skb, uarg, zerocopy_success);
>         0x000000000000003b <+59>:    mov    -0x8(%rbp),%rax
>         0x000000000000003f <+63>:    mov    (%rax),%r8
>         0x0000000000000042 <+66>:    movzbl -0x14(%rbp),%edx
>         0x0000000000000046 <+70>:    mov    -0x8(%rbp),%rcx
>         0x000000000000004a <+74>:    mov    -0x10(%rbp),%rax
>         0x000000000000004e <+78>:    mov    %rcx,%rsi
>         0x0000000000000051 <+81>:    mov    %rax,%rdi
>         0x0000000000000054 <+84>:    call   0x59 <skb_zcopy_clear+89>
>      
>      1552
>      1553                    skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
>         0x0000000000000059 <+89>:    mov    -0x10(%rbp),%rax
>         0x000000000000005d <+93>:    mov    %rax,%rdi
>         0x0000000000000060 <+96>:    call   0x27f <skb_end_pointer>
>         0x0000000000000065 <+101>:   movzbl (%rax),%edx
>         0x0000000000000068 <+104>:   and    $0xfffffff8,%edx
>         0x000000000000006b <+107>:   mov    %dl,(%rax)
>      
>      1554            }
>      1555    }
>         0x000000000000006d <+109>:   nop
>         0x000000000000006e <+110>:   leave
>         0x000000000000006f <+111>:   ret
>      
>      End of assembler dump.
>
> The question now: what is causing the unclean state of the skb, so that it
> is not rejected by skb_zcopy_is_nouarg before the uarg callback is
> tried?
>
> Kind regards,
> 	Sven

Thanks a lot for your analysis, Sven.

I still cannot reproduce it.

I think it is caused by a write past skb->tail during the scan, because the 
invalid address is the same for each crash 
(0x408210000b231a/0xe0080c4200016463). The fault is raised by this 
instruction:

"0x000000000000003f <+63>:    mov    (%rax),%r8"

which loads the value of uarg->callback into %r8.

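For reference, the two invalid addresses look like one corrupted pointer seen 
two ways. Assuming the x86-64 KASAN shadow mapping that is visible in the 
instrumented disassembly quoted above (shr $0x3 plus the 0xdffffc0000000000 
offset), 0xe0080c4200016463 is the shadow address of 0x408210000b231a. The 
helper names in this small sketch are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Constants as they appear in the KASAN-instrumented disassembly. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL

/* Shadow byte address that an inline KASAN check reads for addr. */
static inline uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* 48-bit canonical check: bits 63..47 must be all zero or all one. */
static inline int is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}
```

kasan_shadow_addr(0x00408210000b231a) gives 0xe0080c4200016463, and neither 
value is a canonical 48-bit address, which presumably explains why the same 
bogus pointer shows up under both values depending on whether inline KASAN 
is enabled.
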
Could you add the change below?

It prints some logs that should help us find the bug.

diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
index 26181f237e23..2147f74f5ebf 100644
--- a/drivers/net/wireless/ath/ath11k/mac.c
+++ b/drivers/net/wireless/ath/ath11k/mac.c
@@ -3421,12 +3421,15 @@ static int ath11k_mac_op_hw_scan(struct ieee80211_hw *hw,
                 memcpy(arg.extraie.ptr, req->ie, req->ie_len);
         }

+       ath11k_info(ar->ab, "n_ssids %d\n", req->n_ssids);
+
         if (req->n_ssids) {
                 arg.num_ssids = req->n_ssids;
                 for (i = 0; i < arg.num_ssids; i++) {
                         arg.ssid[i].length  = req->ssids[i].ssid_len;
                         memcpy(&arg.ssid[i].ssid, req->ssids[i].ssid,
                                req->ssids[i].ssid_len);
+                       ath11k_info(ar->ab, "ssid[%d] len %d\n", i, arg.ssid[i].length);
                 }
         } else {
                 arg.scan_flags |= WMI_SCAN_FLAG_PASSIVE;
diff --git a/drivers/net/wireless/ath/ath11k/wmi.c b/drivers/net/wireless/ath/ath11k/wmi.c
index 7d7f76d4bf1f..e42a64251799 100644
--- a/drivers/net/wireless/ath/ath11k/wmi.c
+++ b/drivers/net/wireless/ath/ath11k/wmi.c
@@ -2271,6 +2271,7 @@ int ath11k_wmi_send_scan_start_cmd(struct ath11k *ar,
                 }
         }

+       ath11k_info(ar->ab, "%s ptr %px skb data %px len %d over %d", __func__, ptr, skb->data, skb->len, ((unsigned char *)ptr)-skb->data-skb->len);
         ret = ath11k_wmi_cmd_send(wmi, skb,
                                   WMI_START_SCAN_CMDID);
         if (ret) {
Sven Eckelmann Dec. 7, 2021, 2:30 p.m. UTC | #7
On Tuesday, 7 December 2021 05:35:04 CET Wen Gong wrote:
> Thanks a lot for your analysis, Sven.
> 
> I still cannot reproduce it.
> 
> I think it is caused by a write past skb->tail during the scan, because the 
> invalid address

Yes, I meant to write about this, but it might have ended up in another 
draft of the mail. What I wanted to write was something like:

The information used by skb_zcopy_clear/skb_zcopy/skb_zcopy_is_nouarg 
comes from skb_shinfo, and skb_end_pointer is just a pointer to a region 
at the end of the skb buffer (skb->end). This region got corrupted by something.
Unfortunately, it is correctly allocated memory, so KASAN cannot help
us with it.
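
The layout can be sketched with a rough userspace model. It uses the sizes 
from the skb_dump output in this mail (headroom 76, len 616, tailroom 12). 
The toy_* names and the 64-byte shared-info size are made up for 
illustration and are not kernel APIs:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sizes taken from the skb_dump output; the shared-info size is arbitrary. */
#define TOY_HEADROOM    76
#define TOY_LEN         616
#define TOY_TAILROOM    12
#define TOY_SHINFO_SIZE 64

/* Toy stand-in for an skb: one allocation, shared info placed at 'end'. */
struct toy_skb {
	size_t data;	/* offset of the first payload byte */
	size_t tail;	/* offset one past the last payload byte */
	size_t end;	/* offset where the shared info starts */
	unsigned char buf[TOY_HEADROOM + TOY_LEN + TOY_TAILROOM + TOY_SHINFO_SIZE];
};

static inline void toy_skb_init(struct toy_skb *skb)
{
	memset(skb, 0, sizeof(*skb));
	skb->data = TOY_HEADROOM;
	skb->tail = skb->data + TOY_LEN;
	skb->end = skb->tail + TOY_TAILROOM;
}

static inline size_t toy_tailroom(const struct toy_skb *skb)
{
	return skb->end - skb->tail;
}

/*
 * Copy n bytes to the tail without checking the tailroom, as a memcpy with
 * a wrong length would. Returns how many shared-info bytes were clobbered.
 */
static inline size_t toy_unchecked_write(struct toy_skb *skb, const void *src,
					 size_t n)
{
	size_t room = toy_tailroom(skb);

	assert(skb->tail + n <= sizeof(skb->buf)); /* keep the model in bounds */
	memcpy(skb->buf + skb->tail, src, n);
	return n > room ? n - room : 0;
}

/* Returns 1 while every shared-info byte is still zero. */
static inline int toy_shinfo_is_clean(const struct toy_skb *skb)
{
	size_t i;

	for (i = skb->end; i < sizeof(skb->buf); i++)
		if (skb->buf[i])
			return 0;
	return 1;
}
```

In this model, a 12-byte write at the tail leaves the shared info untouched, 
while a 20-byte write clobbers its first 8 bytes. The whole array belongs to 
one valid allocation, which is the reason a tool that only tracks allocation 
boundaries would not flag the write.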



[...]
> --- a/drivers/net/wireless/ath/ath11k/wmi.c
> +++ b/drivers/net/wireless/ath/ath11k/wmi.c
> @@ -2271,6 +2271,7 @@ int ath11k_wmi_send_scan_start_cmd(struct ath11k *ar,
>                  }
>          }
> 
> +       ath11k_info(ar->ab, "%s ptr %px skb data %px len %d over %d", __func__, ptr, skb->data, skb->len, ((unsigned char *)ptr)-skb->data-skb->len);
>          ret = ath11k_wmi_cmd_send(wmi, skb,
>                                    WMI_START_SCAN_CMDID);
>          if (ret) {

Changed the last part to:

    ath11k_err(ar->ab, "%s ptr %px skb data %px len %d over %ld\n", __func__, ptr, skb->data, skb->len, ((unsigned char *)ptr) - skb->data - skb->len);


The output is:

    ath11k_pci 0000:01:00.0: n_ssids 1
    ath11k_pci 0000:01:00.0: ssid[0] len 0
    ath11k_pci 0000:01:00.0: ath11k_wmi_send_scan_start_cmd ptr ffff9217101e82b4 skb data ffff9217101e804c len 616 over 0
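
The "over 0" value can be checked by hand from the printed pointers: 
0x...82b4 - 0x...804c = 0x268 = 616, which equals len. A minimal sketch of 
the same arithmetic (bytes_over is a hypothetical helper name, not a driver 
function):

```c
#include <assert.h>
#include <stdint.h>

/* Bytes by which ptr lies beyond data + len (negative if it lies before). */
static inline int64_t bytes_over(uint64_t ptr, uint64_t data, uint32_t len)
{
	return (int64_t)(ptr - data) - (int64_t)len;
}
```

Note that this only tells where the write cursor ended up. It does not show 
how many bytes each individual memcpy before it has copied.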

But we are looking at the ath11k_ce_tx_process_cb function. So I would have 
expected that it is related to something which was sent out. So the first thing 
I did was to add some skb_dumps in the send path (ath11k_htc_send) and in the 
cleanup path (skb_zcopy_clear). Something like this (just the cleanup path 
because otherwise I would have to post a rather large diff):

    diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
    index 819cc58ab051..c15512e2f30c 100644
    --- a/include/linux/skbuff.h
    +++ b/include/linux/skbuff.h
    @@ -1547,8 +1547,10 @@ static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)
     	struct ubuf_info *uarg = skb_zcopy(skb);
     
     	if (uarg) {
    -		if (!skb_zcopy_is_nouarg(skb))
    +		if (!skb_zcopy_is_nouarg(skb)) {
    +			skb_dump(KERN_ERR, skb, true);
     			uarg->callback(skb, uarg, zerocopy_success);
    +		}
     
     		skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
     	}


But interestingly, it already crashes while parsing the fraglist in 
ath11k_htc_send. So I've added some more dumps to figure out where it breaks. 
And I've noticed that it breaks after the following section in 
ath11k_wmi_send_scan_start_cmd:

	if (params->extraie.len)
		memcpy(ptr, params->extraie.ptr,
		       params->extraie.len);
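
In the dump that follows, the last 8 bytes of the linear data (offset 0x260) 
and all 12 tailroom bytes change from zero to non-zero across this memcpy, so 
at least 20 bytes seem to have been written where only 616 - 608 = 8 were 
left. A minimal sketch of a guarded copy, only to illustrate the kind of 
check that would catch this (copy_if_fits is a hypothetical helper, not a 
proposed fix for the driver):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy n bytes to buf + off only if they fit into the
 * buf_len bytes that were reserved. Returns 0 on success, -1 otherwise.
 */
static inline int copy_if_fits(unsigned char *buf, size_t buf_len, size_t off,
			       const void *src, size_t n)
{
	if (off > buf_len || n > buf_len - off)
		return -1;
	memcpy(buf + off, src, n);
	return 0;
}
```

With a 616-byte command buffer and the cursor at offset 608, an 8-byte copy 
is accepted and a 20-byte copy is refused.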

Here is the full output:

    [   30.641297] ath11k_wmi_send_scan_start_cmd:2357
    [   30.645873] skb len=616 headroom=76 headlen=616 tailroom=12
    [   30.645873] mac=(-1,-1) net=(0,-1) trans=-1
    [   30.645873] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
    [   30.645873] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
    [   30.645873] hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
    [   30.673381] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   30.681073] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   30.688758] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   30.696465] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   30.704197] skb headroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00
    [   30.710852] skb linear:   00000000: 9c 00 4d 00 00 a0 00 00 01 00 00 00 00 00 00 00
    [   30.718538] skb linear:   00000010: 01 00 00 00 1f 00 00 00 32 00 00 00 96 00 00 00
    [   30.726271] skb linear:   00000020: 32 00 00 00 f4 01 00 00 00 00 00 00 00 00 00 00
    [   30.733954] skb linear:   00000030: 00 00 00 00 20 4e 00 00 05 00 00 00 10 00 00 00
    [   30.741636] skb linear:   00000040: 00 00 00 00 61 00 00 00 01 00 00 00 01 00 00 00
    [   30.749346] skb linear:   00000050: 08 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   30.757092] skb linear:   00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   30.764795] skb linear:   00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   30.772483] skb linear:   00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   30.780170] skb linear:   00000090: 00 00 00 00 28 00 00 00 1e 00 00 00 00 00 00 00
    [   30.787854] skb linear:   000000a0: 84 01 10 00 6c 09 00 00 71 09 00 00 76 09 00 00
    [   30.795541] skb linear:   000000b0: 7b 09 00 00 80 09 00 00 85 09 00 00 8a 09 00 00
    [   30.803236] skb linear:   000000c0: 8f 09 00 00 94 09 00 00 99 09 00 00 9e 09 00 00
    [   30.810933] skb linear:   000000d0: a3 09 00 00 a8 09 00 00 3c 14 00 00 50 14 00 00
    [   30.818620] skb linear:   000000e0: 64 14 00 00 78 14 00 00 8c 14 00 00 a0 14 00 00
    [   30.826322] skb linear:   000000f0: b4 14 00 00 c8 14 00 00 7c 15 00 00 90 15 00 00
    [   30.834018] skb linear:   00000100: a4 15 00 00 b8 15 00 00 cc 15 00 00 e0 15 00 00
    [   30.841712] skb linear:   00000110: f4 15 00 00 08 16 00 00 1c 16 00 00 30 16 00 00
    [   30.849402] skb linear:   00000120: 44 16 00 00 58 16 00 00 71 16 00 00 85 16 00 00
    [   30.857094] skb linear:   00000130: 99 16 00 00 ad 16 00 00 c1 16 00 00 43 17 00 00
    [   30.864776] skb linear:   00000140: 57 17 00 00 6b 17 00 00 7f 17 00 00 93 17 00 00
    [   30.872490] skb linear:   00000150: a7 17 00 00 bb 17 00 00 cf 17 00 00 e3 17 00 00
    [   30.880182] skb linear:   00000160: f7 17 00 00 0b 18 00 00 1f 18 00 00 33 18 00 00
    [   30.887882] skb linear:   00000170: 47 18 00 00 5b 18 00 00 6f 18 00 00 83 18 00 00
    [   30.895581] skb linear:   00000180: 97 18 00 00 ab 18 00 00 bf 18 00 00 d3 18 00 00
    [   30.903265] skb linear:   00000190: e7 18 00 00 fb 18 00 00 0f 19 00 00 23 19 00 00
    [   30.910974] skb linear:   000001a0: 37 19 00 00 4b 19 00 00 5f 19 00 00 73 19 00 00
    [   30.918675] skb linear:   000001b0: 87 19 00 00 9b 19 00 00 af 19 00 00 c3 19 00 00
    [   30.926418] skb linear:   000001c0: d7 19 00 00 eb 19 00 00 ff 19 00 00 13 1a 00 00
    [   30.934118] skb linear:   000001d0: 27 1a 00 00 3b 1a 00 00 4f 1a 00 00 63 1a 00 00
    [   30.941842] skb linear:   000001e0: 77 1a 00 00 8b 1a 00 00 9f 1a 00 00 b3 1a 00 00
    [   30.949537] skb linear:   000001f0: c7 1a 00 00 db 1a 00 00 ef 1a 00 00 03 1b 00 00
    [   30.957221] skb linear:   00000200: 17 1b 00 00 2b 1b 00 00 3f 1b 00 00 53 1b 00 00
    [   30.964912] skb linear:   00000210: 67 1b 00 00 7b 1b 00 00 8f 1b 00 00 a3 1b 00 00
    [   30.972614] skb linear:   00000220: b7 1b 00 00 cb 1b 00 00 24 00 13 00 00 00 00 00
    [   30.980315] skb linear:   00000230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   30.988010] skb linear:   00000240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   30.995696] skb linear:   00000250: 08 00 13 00 ff ff ff ff ff ff 00 00 08 00 11 00
    [   31.003394] skb linear:   00000260: 00 00 00 00 00 00 00 00
    [   31.009002] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00
    [   31.015646] ath11k_wmi_send_scan_start_cmd:2362
    [   31.020217] skb len=616 headroom=76 headlen=616 tailroom=12
    [   31.020217] mac=(-1,-1) net=(0,-1) trans=-1
    [   31.020217] shinfo(txflags=0 nr_frags=255 gso(size=0 type=265087 segs=0))
    [   31.020217] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
    [   31.020217] hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
    [   31.048289] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   31.056015] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   31.063714] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   31.071425] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   31.079141] skb headroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00
    [   31.085787] skb linear:   00000000: 9c 00 4d 00 00 a0 00 00 01 00 00 00 00 00 00 00
    [   31.093518] skb linear:   00000010: 01 00 00 00 1f 00 00 00 32 00 00 00 96 00 00 00
    [   31.101239] skb linear:   00000020: 32 00 00 00 f4 01 00 00 00 00 00 00 00 00 00 00
    [   31.108947] skb linear:   00000030: 00 00 00 00 20 4e 00 00 05 00 00 00 10 00 00 00
    [   31.116630] skb linear:   00000040: 00 00 00 00 61 00 00 00 01 00 00 00 01 00 00 00
    [   31.124326] skb linear:   00000050: 08 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   31.132007] skb linear:   00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   31.139708] skb linear:   00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   31.147420] skb linear:   00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   31.155118] skb linear:   00000090: 00 00 00 00 28 00 00 00 1e 00 00 00 00 00 00 00
    [   31.162798] skb linear:   000000a0: 84 01 10 00 6c 09 00 00 71 09 00 00 76 09 00 00
    [   31.170486] skb linear:   000000b0: 7b 09 00 00 80 09 00 00 85 09 00 00 8a 09 00 00
    [   31.178175] skb linear:   000000c0: 8f 09 00 00 94 09 00 00 99 09 00 00 9e 09 00 00
    [   31.185876] skb linear:   000000d0: a3 09 00 00 a8 09 00 00 3c 14 00 00 50 14 00 00
    [   31.193593] skb linear:   000000e0: 64 14 00 00 78 14 00 00 8c 14 00 00 a0 14 00 00
    [   31.201278] skb linear:   000000f0: b4 14 00 00 c8 14 00 00 7c 15 00 00 90 15 00 00
    [   31.208969] skb linear:   00000100: a4 15 00 00 b8 15 00 00 cc 15 00 00 e0 15 00 00
    [   31.216655] skb linear:   00000110: f4 15 00 00 08 16 00 00 1c 16 00 00 30 16 00 00
    [   31.224346] skb linear:   00000120: 44 16 00 00 58 16 00 00 71 16 00 00 85 16 00 00
    [   31.232030] skb linear:   00000130: 99 16 00 00 ad 16 00 00 c1 16 00 00 43 17 00 00
    [   31.239739] skb linear:   00000140: 57 17 00 00 6b 17 00 00 7f 17 00 00 93 17 00 00
    [   31.247428] skb linear:   00000150: a7 17 00 00 bb 17 00 00 cf 17 00 00 e3 17 00 00
    [   31.255141] skb linear:   00000160: f7 17 00 00 0b 18 00 00 1f 18 00 00 33 18 00 00
    [   31.262840] skb linear:   00000170: 47 18 00 00 5b 18 00 00 6f 18 00 00 83 18 00 00
    [   31.270591] skb linear:   00000180: 97 18 00 00 ab 18 00 00 bf 18 00 00 d3 18 00 00
    [   31.278282] skb linear:   00000190: e7 18 00 00 fb 18 00 00 0f 19 00 00 23 19 00 00
    [   31.285965] skb linear:   000001a0: 37 19 00 00 4b 19 00 00 5f 19 00 00 73 19 00 00
    [   31.293675] skb linear:   000001b0: 87 19 00 00 9b 19 00 00 af 19 00 00 c3 19 00 00
    [   31.301361] skb linear:   000001c0: d7 19 00 00 eb 19 00 00 ff 19 00 00 13 1a 00 00
    [   31.309056] skb linear:   000001d0: 27 1a 00 00 3b 1a 00 00 4f 1a 00 00 63 1a 00 00
    [   31.316753] skb linear:   000001e0: 77 1a 00 00 8b 1a 00 00 9f 1a 00 00 b3 1a 00 00
    [   31.324441] skb linear:   000001f0: c7 1a 00 00 db 1a 00 00 ef 1a 00 00 03 1b 00 00
    [   31.332138] skb linear:   00000200: 17 1b 00 00 2b 1b 00 00 3f 1b 00 00 53 1b 00 00
    [   31.339840] skb linear:   00000210: 67 1b 00 00 7b 1b 00 00 8f 1b 00 00 a3 1b 00 00
    [   31.347520] skb linear:   00000220: b7 1b 00 00 cb 1b 00 00 24 00 13 00 00 00 00 00
    [   31.355232] skb linear:   00000230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   31.362920] skb linear:   00000240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   31.370607] skb linear:   00000250: 08 00 13 00 ff ff ff ff ff ff 00 00 08 00 11 00
    [   31.378331] skb linear:   00000260: 01 08 02 04 0b 16 0c 12
    [   31.383972] skb tailroom: 00000000: 18 24 32 04 30 48 60 6c 2d 1a e3 19
    [   31.390651] skb fraglist:
    [   31.393348] BUG: unable to handle page fault for address: 00000100000000bc
    [   31.400317] #PF: supervisor read access in kernel mode
    [   31.405624] #PF: error_code(0x0000) - not-present page
    [   31.410832] PGD 0 P4D 0 
    [   31.413422] Oops: 0000 [#1] PREEMPT SMP NOPTI
    [   31.417881] CPU: 0 PID: 520 Comm: wpa_supplicant Not tainted 5.16.0-rc1+ #5
    [   31.424862] Hardware name: PC Engines apu2/apu2, BIOS v4.15.0.1 11/23/2021
    [   31.431750] RIP: 0010:skb_end_pointer+0x0/0xe
    [   31.436129] Code: dc ff ff ff c3 48 8b 07 48 83 e0 fe 48 83 c8 02 48 89 07 c3 8b 47 08 c3 89 77 08 c3 01 77 08 c3 29 77 08 c3 b8 00 00 00 00 c3 <8b> 87 bc 00 00 00 48 03 87 c0 00 00 00 c3 8b 87 bc 00 00 00 c3 e8
    [   31.454883] RSP: 0018:ffffb0edc0427490 EFLAGS: 00010282
    [   31.460116] RAX: ffff8ff9818b1ec0 RBX: 0000010000000000 RCX: 0000000000000000
    [   31.467267] RDX: 0000000000000001 RSI: 0000010000000000 RDI: 0000010000000000
    [   31.474408] RBP: ffff8ff9891a07c8 R08: 0000000000000000 R09: ffffb0edc0427370
    [   31.481549] R10: ffffb0edc0427368 R11: ffffffffa5cd22e8 R12: 0000000000000268
    [   31.488689] R13: 00000000ffffffff R14: 0000010000000000 R15: ffffffffc0f2198b
    [   31.495823] FS:  00007f2725a55c00(0000) GS:ffff8ff9aac00000(0000) knlGS:0000000000000000
    [   31.503936] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   31.509706] CR2: 00000100000000bc CR3: 00000001090bc000 CR4: 00000000000406f0
    [   31.516868] Call Trace:
    [   31.519325]  <TASK>
    [   31.521433]  skb_dump+0x24/0x53a
    [   31.524688]  ? _printk+0x58/0x6f
    [   31.527938]  skb_dump+0x532/0x53a
    [   31.531267]  ath11k_wmi_send_scan_start_cmd.cold+0x5f2/0x793 [ath11k]
    [   31.537785]  ath11k_mac_op_hw_scan+0x173/0x3f0 [ath11k]
    [   31.543086]  drv_hw_scan+0x43/0x130 [mac80211]
    [   31.547690]  __ieee80211_start_scan+0x152/0x6d0 [mac80211]
    [   31.553306]  ieee80211_request_scan+0x2c/0x50 [mac80211]
    [   31.558738]  rdev_scan+0x28/0xd0 [cfg80211]
    [   31.563117]  nl80211_trigger_scan+0x3fe/0x680 [cfg80211]
    [   31.568584]  genl_family_rcv_msg_doit+0xea/0x150
    [   31.573223]  genl_rcv_msg+0xde/0x1d0
    [   31.576816]  ? nl80211_send_scan_start+0x90/0x90 [cfg80211]
    [   31.582520]  ? genl_get_cmd+0xd0/0xd0
    [   31.586191]  netlink_rcv_skb+0x50/0xf0
    [   31.589958]  genl_rcv+0x24/0x40
    [   31.593109]  netlink_unicast+0x239/0x340
    [   31.597045]  netlink_sendmsg+0x245/0x480
    [   31.600981]  sock_sendmsg+0x5e/0x60
    [   31.604487]  ____sys_sendmsg+0x22e/0x270
    [   31.608440]  ? import_iovec+0x2d/0x30
    [   31.612123]  ? sendmsg_copy_msghdr+0x7c/0xa0
    [   31.616406]  ___sys_sendmsg+0x75/0xb0
    [   31.620081]  ? __mod_lruvec_page_state+0x7d/0xc0
    [   31.624714]  ? folio_add_lru+0x5c/0xa0
    [   31.628476]  ? _raw_spin_unlock+0x16/0x30
    [   31.632506]  ? __handle_mm_fault+0x1261/0x1520
    [   31.636965]  __sys_sendmsg+0x59/0xa0
    [   31.640552]  do_syscall_64+0x3b/0xc0
    [   31.644148]  entry_SYSCALL_64_after_hwframe+0x44/0xae
    [   31.649208] RIP: 0033:0x7f2725ef6f33
    [   31.652797] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b7 66 2e 0f 1f 84 00 00 00 00 00 90 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 89 54 24 1c 48
    [   31.671547] RSP: 002b:00007fff1b5f1668 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    [   31.679122] RAX: ffffffffffffffda RBX: 0000564919260760 RCX: 00007f2725ef6f33
    [   31.686264] RDX: 0000000000000000 RSI: 00007fff1b5f16a0 RDI: 0000000000000005
    [   31.693406] RBP: 000056491928f6c0 R08: 0000000000000004 R09: 00007f2725fb6c00
    [   31.700547] R10: 00007fff1b5f178c R11: 0000000000000246 R12: 0000564919260670
    [   31.707689] R13: 00007fff1b5f16a0 R14: 0000000000000000 R15: 00007fff1b5f178c
    [   31.714834]  </TASK>
    [   31.717031] Modules linked in: qrtr_mhi btusb btrtl btbcm btintel bluetooth jitterentropy_rng sha512_ssse3 sha512_generic drbg ansi_cprng amd64_edac ecdh_generic edac_mce_amd ecc kvm_amd kvm irqbypass qrtr crc32_pclmul ghash_clmulni_intel ath11k_pci mhi ath11k evdev pcengines_apuv2 qmi_helpers gpio_keys_polled gpio_amd_fch aesni_intel snd_pcm crypto_simd snd_timer sdhci_pci xhci_pci snd cqhci mac80211 soundcore ehci_pci sp5100_tco cryptd libarc4 xhci_hcd sdhci ehci_hcd pcspkr igb watchdog ptp cfg80211 mmc_core k10temp i2c_piix4 fam15h_power usbcore ccp pps_core sg dca rng_core i2c_algo_bit usb_common rfkill leds_gpio gpio_keys acpi_cpufreq button fuse drm configfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi crc_t10dif crct10dif_generic ahci libahci libata crct10dif_pclmul scsi_mod crct10dif_common crc32c_intel scsi_common
    [   31.793074] CR2: 00000100000000bc
    [   31.796498] ---[ end trace 07252723010a83e6 ]---
    [   31.801261] RIP: 0010:skb_end_pointer+0x0/0xe
    [   31.805824] Code: dc ff ff ff c3 48 8b 07 48 83 e0 fe 48 83 c8 02 48 89 07 c3 8b 47 08 c3 89 77 08 c3 01 77 08 c3 29 77 08 c3 b8 00 00 00 00 c3 <8b> 87 bc 00 00 00 48 03 87 c0 00 00 00 c3 8b 87 bc 00 00 00 c3 e8
    [   31.824842] RSP: 0018:ffffb0edc0427490 EFLAGS: 00010282
    [   31.830105] RAX: ffff8ff9818b1ec0 RBX: 0000010000000000 RCX: 0000000000000000
    [   31.837270] RDX: 0000000000000001 RSI: 0000010000000000 RDI: 0000010000000000
    [   31.844441] RBP: ffff8ff9891a07c8 R08: 0000000000000000 R09: ffffb0edc0427370
    [   31.851614] R10: ffffb0edc0427368 R11: ffffffffa5cd22e8 R12: 0000000000000268
    [   31.858781] R13: 00000000ffffffff R14: 0000010000000000 R15: ffffffffc0f2198b
    [   31.866020] FS:  00007f2725a55c00(0000) GS:ffff8ff9aac00000(0000) knlGS:0000000000000000
    [   31.874141] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   31.879920] CR2: 00000100000000bc CR3: 00000001090bc000 CR4: 00000000000406f0


So the length calculated for ath11k_wmi_alloc_skb is simply wrong. The reason
is that extraie_len_with_pad is only a u8, while params->extraie.len with
IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS is already 264 for me. So the padded
length ends up as 8 (264 truncated to 8 bits), but the data copied into the
buffer still occupies 264 bytes.
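
The truncation can be reproduced outside the kernel. The following is a minimal userspace sketch, not the driver code: the helper names (roundup4, padded_len_u8, padded_len_u16, overflow_bytes, extraie_example) are hypothetical, and it assumes the padded length is the IE length rounded up to a multiple of 4 before being stored in the narrow variable.

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the next multiple of 4 (assumed padding rule). */
uint32_t roundup4(uint32_t n)
{
	return (n + 3u) & ~3u;
}

/* Narrow variant: the padded length is kept in 8 bits, so it wraps at 256. */
uint32_t padded_len_u8(uint32_t extraie_len)
{
	uint8_t len_with_pad = (uint8_t)roundup4(extraie_len);

	return len_with_pad;
}

/* Wider variant: 16 bits are enough for the 264 bytes seen in the thread. */
uint32_t padded_len_u16(uint32_t extraie_len)
{
	uint16_t len_with_pad = (uint16_t)roundup4(extraie_len);

	return len_with_pad;
}

/* Bytes copied beyond the space that the narrow length reserved. */
uint32_t overflow_bytes(uint32_t extraie_len)
{
	uint32_t reserved = padded_len_u8(extraie_len);

	return extraie_len > reserved ? extraie_len - reserved : 0;
}

/* Worked example with the 264-byte IE length reported above. */
void extraie_example(void)
{
	assert(padded_len_u8(264) == 8);    /* space reserved in the buffer */
	assert(padded_len_u16(264) == 264); /* space actually needed */
	assert(overflow_bytes(264) == 256); /* far more than tailroom=12 */
}
```

With 256 bytes written past the reserved area and only 12 bytes of tailroom in the dump, the copy runs past the end of the buffer, which is consistent with the shinfo values changing between the two dumps.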

But there is a second problem: the width of WMI_TLV_LEN. params->extraie.len
can be up to 32 bits wide, while WMI_TLV_LEN only has 16 bits. So
params->extraie.len must also be limited in size, or we might run into a
different problem.
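
To illustrate the 16-bit limit, here is a small userspace sketch of a TLV header with the length in the low 16 bits and the tag in the high 16 bits. That layout is an assumption, chosen because it matches the header bytes `08 00 11 00` that directly precede the IE data in the dump; the macro and function names are hypothetical and are not the ath11k definitions.

```c
#include <assert.h>
#include <stdint.h>

#define TLV_LEN_MASK  0x0000ffffu /* assumed: low 16 bits hold the length */
#define TLV_TAG_MASK  0xffff0000u /* assumed: high 16 bits hold the tag */
#define TLV_TAG_SHIFT 16

/* Pack a header and reject values that do not fit their 16-bit fields. */
int tlv_pack_checked(uint32_t tag, uint32_t len, uint32_t *hdr)
{
	if (len > TLV_LEN_MASK || tag > 0xffffu)
		return -1;
	*hdr = (tag << TLV_TAG_SHIFT) | len;
	return 0;
}

/* Pack a header by masking only: an oversized length is silently cut. */
uint32_t tlv_pack_masked(uint32_t tag, uint32_t len)
{
	return ((tag << TLV_TAG_SHIFT) & TLV_TAG_MASK) | (len & TLV_LEN_MASK);
}

uint32_t tlv_len(uint32_t hdr)
{
	return hdr & TLV_LEN_MASK;
}

uint32_t tlv_tag(uint32_t hdr)
{
	return (hdr & TLV_TAG_MASK) >> TLV_TAG_SHIFT;
}

/* Assemble a 32-bit word from four little-endian bytes of the dump. */
uint32_t le32(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Decode the header seen in the dump and show the aliasing risk. */
void tlv_example(void)
{
	const uint8_t dump_hdr[4] = { 0x08, 0x00, 0x11, 0x00 };
	uint32_t hdr = le32(dump_hdr);
	uint32_t out = 0;

	assert(tlv_len(hdr) == 8);     /* truncated length, not 264 */
	assert(tlv_tag(hdr) == 0x11);
	assert(tlv_pack_checked(0x11, 264, &out) == 0);
	assert(tlv_len(out) == 264);
	assert(tlv_pack_checked(0x11, 0x10008, &out) == -1);
	assert(tlv_len(tlv_pack_masked(0x11, 0x10008)) == 8);
}
```

The last two assertions show why a size limit is needed: once the length exceeds 16 bits, a masked header would again advertise a much smaller payload than is actually copied.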

Kind regards,
	Sven
Wen Gong Dec. 8, 2021, 3:43 a.m. UTC | #8
Thanks, Sven, for your analysis and debugging.

I have seen your patch "ath11k: Fix buffer overflow when scanning with extraie".

On 12/7/2021 10:30 PM, Sven Eckelmann wrote:
> On Tuesday, 7 December 2021 05:35:04 CET Wen Gong wrote:
>> Thanks a lot for your analysis, Sven.
>>
>> I still cannot reproduce it.
>>
>> I think it is because the write over skb->tail in scan, because the
>> invalid address
> Yes, I thought that I wanted to write about it but it might have gone into
> another draft of the mail. So what I wanted to write was something like:
>
> The information used in skb_zcopy_clear/skb_zcopy/skb_zcopy_is_nouarg
> comes from skb_shinfo. And skb_end_pointer is just a pointer to a region
> at the end of the skb buffer (skb->end). This region got corrupted by
> something. Unfortunately, this is correctly allocated memory, and thus KASAN
> cannot help us with it.
>
>
>
> [...]
>> --- a/drivers/net/wireless/ath/ath11k/wmi.c
>> +++ b/drivers/net/wireless/ath/ath11k/wmi.c
>> @@ -2271,6 +2271,7 @@ int ath11k_wmi_send_scan_start_cmd(struct ath11k *ar,
>>                   }
>>           }
>>
>> +       ath11k_info(ar->ab, "%s ptr %px skb data %px len %d over %d",
>> __func__, ptr, skb->data, skb->len, ((unsigned char
>> *)ptr)-skb->data-skb->len);
>>           ret = ath11k_wmi_cmd_send(wmi, skb,
>>                                     WMI_START_SCAN_CMDID);
>>           if (ret) {
> Changed the last part to:
>
>      ath11k_err(ar->ab, "%s ptr %px skb data %px len %d over %ld\n", __func__, ptr, skb->data, skb->len, ((unsigned char *)ptr) - skb->data - skb->len);
>
>
> The output is:
>
>      ath11k_pci 0000:01:00.0: n_ssids 1
>      ath11k_pci 0000:01:00.0: ssid[0] len 0
>      ath11k_pci 0000:01:00.0: ath11k_wmi_send_scan_start_cmd ptr ffff9217101e82b4 skb data ffff9217101e804c len 616 over 0
>
> But we are looking at the ath11k_ce_tx_process_cb function. So I would have
> expected that it is related to something which was sent out. So the first thing
> I did was to add some skb_dumps in the send path (ath11k_htc_send) and in the
> cleanup path (skb_zcopy_clear). Something like this (just the cleanup path
> because otherwise I have to post a rather large diff):
>
>      diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>      index 819cc58ab051..c15512e2f30c 100644
>      --- a/include/linux/skbuff.h
>      +++ b/include/linux/skbuff.h
>      @@ -1547,8 +1547,10 @@ static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)
>       	struct ubuf_info *uarg = skb_zcopy(skb);
>       
>       	if (uarg) {
>      -		if (!skb_zcopy_is_nouarg(skb))
>      +		if (!skb_zcopy_is_nouarg(skb)) {
>      +			skb_dump(KERN_ERR, skb, true);
>       			uarg->callback(skb, uarg, zerocopy_success);
>      +		}
>       
>       		skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
>       	}
>
>
> But interestingly, it already crashes while parsing the fraglist in
> ath11k_htc_send. So I've added some more dumps to figure out where it breaks.
> And I've noticed that it breaks after the following section in
> ath11k_wmi_send_scan_start_cmd:
>
> 	if (params->extraie.len)
> 		memcpy(ptr, params->extraie.ptr,
> 		       params->extraie.len);
>
> Here is the full output:
>
>      [...]
>
>
> So the length calculated for ath11k_wmi_alloc_skb is simply wrong. The reason
> is that extraie_len_with_pad is only a u8, while params->extraie.len with
> IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS is already 264 for me. So the padded
> length ends up as 8 (264 truncated to 8 bits), but the data copied into the
> buffer still occupies 264 bytes.
>
> But there is a second problem: the width of WMI_TLV_LEN. params->extraie.len
> can be up to 32 bits wide, while WMI_TLV_LEN only has 16 bits. So
> params->extraie.len must also be limited in size, or we might run into a
> different problem.
>
> Kind regards,
> 	Sven
Kalle Valo Dec. 8, 2021, 8:16 a.m. UTC | #9
Wen Gong <quic_wgong@quicinc.com> wrote:

> Currently mac80211 will send 3 scan request for each scan of WCN6855,
> they are 2.4 GHz/5 GHz/6 GHz band scan. Firmware of WCN6855 will
> cache the RNR IE(Reduced Neighbor Report element) which exist in the
> beacon of 2.4 GHz/5 GHz of the AP which is co-located with 6 GHz,
> and then use the cache to scan in 6 GHz band scan if the 6 GHz scan
> is in the same scan with the 2.4 GHz/5 GHz band, this will helpful to
> search more AP of 6 GHz. Also it will decrease the time cost of scan
> because firmware will use dual-band scan for the 2.4 GHz/5 GHz, it
> means the 2.4 GHz and 5 GHz scans are doing simultaneously.
> 
> Set the flag IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855 since
> it supports 2.4 GHz/5 GHz/6 GHz and it is single pdev which means
> all the 2.4 GHz/5 GHz/6 GHz exist in the same wiphy/ieee80211_hw.
> 
> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
> 
> Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>

Sven, after your memory corruption fix is this good to take?
Wen Gong Dec. 8, 2021, 8:19 a.m. UTC | #10
On 12/8/2021 4:16 PM, Kalle Valo wrote:
> Wen Gong <quic_wgong@quicinc.com> wrote:
...
> Sven, after your memory corruption fix is this good to take?

After Sven's fix "ath11k: Fix buffer overflow when scanning with
extraie", the kernel crash no longer happens.

But it needs Sven's confirmation.
Sven Eckelmann Dec. 8, 2021, 9:12 a.m. UTC | #11
On Wednesday, 8 December 2021 09:19:28 CET Wen Gong wrote:
> On 12/8/2021 4:16 PM, Kalle Valo wrote:
> > Wen Gong <quic_wgong@quicinc.com> wrote:
> ...
> > Sven, after your memory corruption fix is this good to take?
> 
> After Sven's fix "ath11k: Fix buffer overflow when scanning with
> extraie", the kernel crash no longer happens.
>
> But it needs Sven's confirmation.

Correct, it no longer causes any problems once the other fix is applied
before this change.

Tested-by: Sven Eckelmann <sven@narfation.org>

Kind regards,
	Sven
Kalle Valo Dec. 8, 2021, 9:48 a.m. UTC | #12
Sven Eckelmann <sven@narfation.org> writes:

> On Wednesday, 8 December 2021 09:19:28 CET Wen Gong wrote:
>> On 12/8/2021 4:16 PM, Kalle Valo wrote:
>> > Wen Gong <quic_wgong@quicinc.com> wrote:
>> ...
>> > Sven, after your memory corruption fix is this good to take?
>> 
>> After Sven's fix "ath11k: Fix buffer overflow when scanning with
>> extraie", the kernel crash no longer happens.
>>
>> But it needs Sven's confirmation.
>
> Correct, it no longer causes any problems once the other fix is applied
> before this change.
>
> Tested-by: Sven Eckelmann <sven@narfation.org>

Very good, thanks. I included your Tested-by.
Kalle Valo Dec. 9, 2021, 7:59 a.m. UTC | #13
Wen Gong <quic_wgong@quicinc.com> wrote:

> Currently mac80211 will send 3 scan request for each scan of WCN6855,
> they are 2.4 GHz/5 GHz/6 GHz band scan. Firmware of WCN6855 will
> cache the RNR IE(Reduced Neighbor Report element) which exist in the
> beacon of 2.4 GHz/5 GHz of the AP which is co-located with 6 GHz,
> and then use the cache to scan in 6 GHz band scan if the 6 GHz scan
> is in the same scan with the 2.4 GHz/5 GHz band, this will helpful to
> search more AP of 6 GHz. Also it will decrease the time cost of scan
> because firmware will use dual-band scan for the 2.4 GHz/5 GHz, it
> means the 2.4 GHz and 5 GHz scans are doing simultaneously.
> 
> Set the flag IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855 since
> it supports 2.4 GHz/5 GHz/6 GHz and it is single pdev which means
> all the 2.4 GHz/5 GHz/6 GHz exist in the same wiphy/ieee80211_hw.
> 
> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
> 
> Tested-by: Sven Eckelmann <sven@narfation.org>
> Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>

Patch applied to ath-next branch of ath.git, thanks.

9f6da09a5f6a ath11k: enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS for WCN6855

Patch

diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
index 06d20658586a..8218ea52f285 100644
--- a/drivers/net/wireless/ath/ath11k/mac.c
+++ b/drivers/net/wireless/ath/ath11k/mac.c
@@ -7915,6 +7915,9 @@  static int __ath11k_mac_register(struct ath11k *ar)
 
 	ar->hw->wiphy->interface_modes = ab->hw_params.interface_modes;
 
+	if (ab->hw_params.single_pdev_only && ar->supports_6ghz)
+		ieee80211_hw_set(ar->hw, SINGLE_SCAN_ON_ALL_BANDS);
+
 	ieee80211_hw_set(ar->hw, SIGNAL_DBM);
 	ieee80211_hw_set(ar->hw, SUPPORTS_PS);
 	ieee80211_hw_set(ar->hw, SUPPORTS_DYNAMIC_PS);
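
As a rough model of what the flag changes for the driver, the sketch below counts the hardware scan requests issued for one user scan. It assumes, as the commit message describes, one request per band that has channels when the flag is not set, and a single request covering all bands when it is set. The function name hw_scan_requests and the band enum are hypothetical; the channel counts 13/25/59 are read off the 97-entry channel list in the dump earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical band indices for this model only. */
enum { BAND_2GHZ, BAND_5GHZ, BAND_6GHZ, NUM_BANDS };

/*
 * Count the hardware scan requests needed for one user scan.
 * chans[b] is the number of channels to scan in band b.
 */
unsigned int hw_scan_requests(const unsigned int chans[NUM_BANDS],
			      bool single_scan_on_all_bands)
{
	unsigned int bands_with_chans = 0;
	int b;

	for (b = 0; b < NUM_BANDS; b++)
		if (chans[b])
			bands_with_chans++;

	if (!bands_with_chans)
		return 0;

	/* With the flag, all channels go to the driver in one request. */
	return single_scan_on_all_bands ? 1 : bands_with_chans;
}

/* Channel counts taken from the scan command dump (13 + 25 + 59 = 97). */
void scan_example(void)
{
	const unsigned int chans[NUM_BANDS] = { 13, 25, 59 };

	assert(hw_scan_requests(chans, false) == 3);
	assert(hw_scan_requests(chans, true) == 1);
}
```

The single request is also why the driver now sees all 97 channels and the larger extra IE buffer in one WMI scan command.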