ath10k: fix system hang at qca99x0 probe on x86 platform

Message ID 20160614061728.570-1-rmanohar@qti.qualcomm.com
State Changes Requested
Delegated to: Kalle Valo

Commit Message

Rajkumar Manoharan June 14, 2016, 6:17 a.m. UTC
commit b057886524be ("ath10k: do not use coherent memory for allocated
device memory chunks") replaced coherent memory allocation for memory
chunks to fix low memory platforms. Unfortunately this is causing system
freeze on x86 platform while bringing up qca99x0 device. The system
hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix
this by limiting maximum memory chunk size to 256 KiB per request.

Cc: Felix Fietkau <nbd@nbd.name>
Fixes: b057886524be ("ath10k: do not use coherent memory for allocated device memory chunks")
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
---
 drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++
 drivers/net/wireless/ath/ath10k/wmi.h | 1 +
 2 files changed, 7 insertions(+)
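
For readers following the arithmetic, the clamp this patch adds to ath10k_wmi_alloc_chunk() can be sketched in user space as below. It is only an illustration under assumptions: clamp_chunk() and round_up4() are hypothetical stand-ins (the driver uses the kernel's round_up() and does the clamp inline), and the unit size in the note that follows is made up, because the commit message only gives total chunk sizes.

```c
#include <assert.h>
#include <stdint.h>

#define WMI_MAX_MEM_CHUNK_SIZE (256 * 1024) /* cap proposed by the patch */

/* Round x up to the next multiple of 4 (stand-in for round_up(x, 4)). */
static uint32_t round_up4(uint32_t x)
{
	return (x + 3u) & ~3u;
}

/*
 * Hypothetical helper: limit a request of *num_units units of unit_len
 * bytes so the pool never exceeds WMI_MAX_MEM_CHUNK_SIZE. Returns the
 * pool size in bytes and writes the possibly reduced unit count back.
 */
static uint32_t clamp_chunk(uint32_t unit_len, uint32_t *num_units)
{
	uint32_t unit;
	uint64_t pool_size;

	/* A single unit must fit in one chunk, otherwise nothing can be sent. */
	assert(unit_len > 0 && unit_len <= WMI_MAX_MEM_CHUNK_SIZE);

	unit = round_up4(unit_len);
	pool_size = (uint64_t)*num_units * unit;
	if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
		/* Keep only as many whole units as fit under the cap. */
		*num_units = WMI_MAX_MEM_CHUNK_SIZE / unit;
		pool_size = (uint64_t)*num_units * unit;
	}
	return (uint32_t)pool_size;
}
```

With an assumed unit size of 4 bytes, the 689816-byte request mentioned above (172454 units) would be cut to 65536 units, i.e. 262144 bytes. The remaining units would have to come from further chunks.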

Comments

Sebastian Gottschall June 29, 2016, 1:55 p.m. UTC | #1
This fix crashes the QCA9980 on QCA IPQ8064 CPU based systems,
so please rework it or leave it out.
Note:
maybe the 256 KiB limit is too low for that card.


[   10.102047] ath10k_pci 0000:01:00.0: unable to read from the device
[   10.102075] ath10k_pci 0000:01:00.0: could not execute otp for board id check: -110
[   10.107116] ath10k_pci 0000:01:00.0: failed to get board id from otp: -110
[   10.126517] ath10k_pci 0000:01:00.0: failed to fetch board data for bus=pci,vendor=168c,device=0040,subsystem-vendor=168c,subsystem-device=0002 from ath10k/QCA99X0/hw2.0/board-2.bin
[   10.126697] ath10k_pci 0000:01:00.0: board_file api 1 bmi_id N/A crc32 62700264
[   11.518268] ath10k_pci 0000:01:00.0: firmware crashed! (uuid 7173fc19-f807-4345-906a-9f3d17fb751b)
[   11.518307] ath10k_pci 0000:01:00.0: qca99x0 hw2.0 target 0x01000000 chip_id 0x003b01ff sub 168c:0002
[   11.526123] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 1 tracing 0 dfs 0 testmode 0
[   11.536976] ath10k_pci 0000:01:00.0: firmware ver 10.4.1.00030-1 api 5 features no-p2p crc32 d2901e01
[   11.543610] ath10k_pci 0000:01:00.0: board_file api 1 bmi_id N/A crc32 62700264
[   11.552770] ath10k_pci 0000:01:00.0: htt-ver 0.0 wmi-op 6 htt-op 4 cal file max-sta 512 raw 0 hwcrypto 1
[   11.561910] ath10k_pci 0000:01:00.0: firmware register dump:
[   11.569606] ath10k_pci 0000:01:00.0: [00]: 0x01000000 0x000015B3 0x000D89A5 0x00955B31
[   11.575252] ath10k_pci 0000:01:00.0: [04]: 0x000D89A5 0x00060730 0x0000000C 0x00000000
[   11.582978] ath10k_pci 0000:01:00.0: [08]: 0x00000010 0x0000000D 0x000E8A64 0xFFFFC000
[   11.590876] ath10k_pci 0000:01:00.0: [12]: 0x00000009 0x00000000 0x000D89BC 0x000D8A03
[   11.598846] ath10k_pci 0000:01:00.0: [16]: 0x00953438 0x000D89BE 0x00000000 0x00000000
[   11.606676] ath10k_pci 0000:01:00.0: [20]: 0x400D89A5 0x0040655C 0x00413A84 0x00000005
[   11.614574] ath10k_pci 0000:01:00.0: [24]: 0x809CDC6E 0x004065BC 0x00417450 0xC00D89A5
[   11.622473] ath10k_pci 0000:01:00.0: [28]: 0x8098011C 0x0040660C 0x004179D8 0x0000000D
[   11.630372] ath10k_pci 0000:01:00.0: [32]: 0x809CDE70 0x0040662C 0x004179D8 0x0000000D
[   11.638272] ath10k_pci 0000:01:00.0: [36]: 0x80981786 0x0040665C 0x004179D8 0x00000020
[   11.646171] ath10k_pci 0000:01:00.0: [40]: 0x809CE0F7 0x0040667C 0x00000000 0x0000A000
[   11.654070] ath10k_pci 0000:01:00.0: [44]: 0x809B307A 0x004066AC 0x00981768 0x0042028C
[   11.661970] ath10k_pci 0000:01:00.0: [48]: 0x809AF3DA 0x004066FC 0x00000002 0x0042028C
[   11.669869] ath10k_pci 0000:01:00.0: [52]: 0x809AEB02 0x0040672C 0x00406750 0x0041847C
[   11.677768] ath10k_pci 0000:01:00.0: [56]: 0x8094EAAA 0x0040674C 0x000E89AC 0x00000001

Sebastian


On 14.06.2016 at 08:17, Rajkumar Manoharan wrote:
> commit b057886524be ("ath10k: do not use coherent memory for allocated
> device memory chunks") replaced coherent memory allocation for memory
> chunks to fix low memory platforms. Unfortunately this is causing system
> freeze on x86 platform while bringing up qca99x0 device. The system
> hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix
> this by limiting maximum memory chunk size to 256 KiB per request.
>
> Cc: Felix Fietkau <nbd@nbd.name>
> Fixes: b057886524be ("ath10k: do not use coherent memory for allocated device memory chunks")
> Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
> ---
>   drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++
>   drivers/net/wireless/ath/ath10k/wmi.h | 1 +
>   2 files changed, 7 insertions(+)
>
> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
> index 6279ab4a760e..7c15f65fe5ed 100644
> --- a/drivers/net/wireless/ath/ath10k/wmi.c
> +++ b/drivers/net/wireless/ath/ath10k/wmi.c
> @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id,
>   		if (!pool_size)
>   			return -EINVAL;
>   
> +		if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
> +			num_units = WMI_MAX_MEM_CHUNK_SIZE /
> +					round_up(unit_len, 4);
> +			pool_size = num_units * round_up(unit_len, 4);
> +		}
> +
>   		vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
>   		if (!vaddr)
>   			num_units /= 2;
> diff --git a/drivers/net/wireless/ath/ath10k/wmi.h b/drivers/net/wireless/ath/ath10k/wmi.h
> index 90f594e89f94..dea1f235a54d 100644
> --- a/drivers/net/wireless/ath/ath10k/wmi.h
> +++ b/drivers/net/wireless/ath/ath10k/wmi.h
> @@ -6184,6 +6184,7 @@ struct wmi_roam_ev {
>   #define ATH10K_DEFAULT_ATIM 0
>   
>   #define WMI_MAX_MEM_REQS 16
> +#define WMI_MAX_MEM_CHUNK_SIZE (256 * 1024) /* 256 KB */
>   
>   struct wmi_scan_ev_arg {
>   	__le32 event_type; /* %WMI_SCAN_EVENT_ */
Sebastian Gottschall June 29, 2016, 2:10 p.m. UTC | #3
By the way, 512 works.


On 29.06.2016 at 16:04, Sebastian Gottschall wrote:
> this fix will crash QCA9980 on QCA IPQ8064 cpu based systems.
> so please rework it, or leave it out.
> note:
> maybe the limit of 256kb is too low for that card
>
>
> [   10.102047] ath10k_pci 0000:01:00.0: unable to read from the device
> [   10.102075] ath10k_pci 0000:01:00.0: could not execute otp for 
> board id check: -110
> [   10.107116] ath10k_pci 0000:01:00.0: failed to get board id from 
> otp: -110
> [   10.126517] ath10k_pci 0000:01:00.0: failed to fetch board data for 
> bus=pci,vendor=168c,device=0040,subsystem-vendor=168c,subsystem-device=0002 
> from ath10k/QCA99X0/hw2.0/board-2.bin
> [   10.126697] ath10k_pci 0000:01:00.0: board_file api 1 bmi_id N/A 
> crc32 62700264
> [   11.518268] ath10k_pci 0000:01:00.0: firmware crashed! (uuid 
> 7173fc19-f807-4345-906a-9f3d17fb751b)
> [   11.518307] ath10k_pci 0000:01:00.0: qca99x0 hw2.0 target 
> 0x01000000 chip_id 0x003b01ff sub 168c:0002
> [   11.526123] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 1 
> tracing 0 dfs 0 testmode 0
> [   11.536976] ath10k_pci 0000:01:00.0: firmware ver 10.4.1.00030-1 
> api 5 features no-p2p crc32 d2901e01
> [   11.543610] ath10k_pci 0000:01:00.0: board_file api 1 bmi_id N/A 
> crc32 62700264
> [   11.552770] ath10k_pci 0000:01:00.0: htt-ver 0.0 wmi-op 6 htt-op 4 
> cal file max-sta 512 raw 0 hwcrypto 1
> [   11.561910] ath10k_pci 0000:01:00.0: firmware register dump:
> [   11.569606] ath10k_pci 0000:01:00.0: [00]: 0x01000000 0x000015B3 
> 0x000D89A5 0x00955B31
> [   11.575252] ath10k_pci 0000:01:00.0: [04]: 0x000D89A5 0x00060730 
> 0x0000000C 0x00000000
> [   11.582978] ath10k_pci 0000:01:00.0: [08]: 0x00000010 0x0000000D 
> 0x000E8A64 0xFFFFC000
> [   11.590876] ath10k_pci 0000:01:00.0: [12]: 0x00000009 0x00000000 
> 0x000D89BC 0x000D8A03
> [   11.598846] ath10k_pci 0000:01:00.0: [16]: 0x00953438 0x000D89BE 
> 0x00000000 0x00000000
> [   11.606676] ath10k_pci 0000:01:00.0: [20]: 0x400D89A5 0x0040655C 
> 0x00413A84 0x00000005
> [   11.614574] ath10k_pci 0000:01:00.0: [24]: 0x809CDC6E 0x004065BC 
> 0x00417450 0xC00D89A5
> [   11.622473] ath10k_pci 0000:01:00.0: [28]: 0x8098011C 0x0040660C 
> 0x004179D8 0x0000000D
> [   11.630372] ath10k_pci 0000:01:00.0: [32]: 0x809CDE70 0x0040662C 
> 0x004179D8 0x0000000D
> [   11.638272] ath10k_pci 0000:01:00.0: [36]: 0x80981786 0x0040665C 
> 0x004179D8 0x00000020
> [   11.646171] ath10k_pci 0000:01:00.0: [40]: 0x809CE0F7 0x0040667C 
> 0x00000000 0x0000A000
> [   11.654070] ath10k_pci 0000:01:00.0: [44]: 0x809B307A 0x004066AC 
> 0x00981768 0x0042028C
> [   11.661970] ath10k_pci 0000:01:00.0: [48]: 0x809AF3DA 0x004066FC 
> 0x00000002 0x0042028C
> [   11.669869] ath10k_pci 0000:01:00.0: [52]: 0x809AEB02 0x0040672C 
> 0x00406750 0x0041847C
> [   11.677768] ath10k_pci 0000:01:00.0: [56]: 0x8094EAAA 0x0040674C 
> 0x000E89AC 0x00000001
>
> Sebastian
>
>
> On 14.06.2016 at 08:17, Rajkumar Manoharan wrote:
>> commit b057886524be ("ath10k: do not use coherent memory for allocated
>> device memory chunks") replaced coherent memory allocation for memory
>> chunks to fix low memory platforms. Unfortunately this is causing system
>> freeze on x86 platform while bringing up qca99x0 device. The system
>> hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix
>> this by limiting maximum memory chunk size to 256 KiB per request.
>>
>> Cc: Felix Fietkau <nbd@nbd.name>
>> Fixes: b057886524be ("ath10k: do not use coherent memory for 
>> allocated device memory chunks")
>> Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
>> ---
>>   drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++
>>   drivers/net/wireless/ath/ath10k/wmi.h | 1 +
>>   2 files changed, 7 insertions(+)
>>
>> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c 
>> b/drivers/net/wireless/ath/ath10k/wmi.c
>> index 6279ab4a760e..7c15f65fe5ed 100644
>> --- a/drivers/net/wireless/ath/ath10k/wmi.c
>> +++ b/drivers/net/wireless/ath/ath10k/wmi.c
>> @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct 
>> ath10k *ar, u32 req_id,
>>           if (!pool_size)
>>               return -EINVAL;
>>   +        if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
>> +            num_units = WMI_MAX_MEM_CHUNK_SIZE /
>> +                    round_up(unit_len, 4);
>> +            pool_size = num_units * round_up(unit_len, 4);
>> +        }
>> +
>>           vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
>>           if (!vaddr)
>>               num_units /= 2;
>> diff --git a/drivers/net/wireless/ath/ath10k/wmi.h 
>> b/drivers/net/wireless/ath/ath10k/wmi.h
>> index 90f594e89f94..dea1f235a54d 100644
>> --- a/drivers/net/wireless/ath/ath10k/wmi.h
>> +++ b/drivers/net/wireless/ath/ath10k/wmi.h
>> @@ -6184,6 +6184,7 @@ struct wmi_roam_ev {
>>   #define ATH10K_DEFAULT_ATIM 0
>>     #define WMI_MAX_MEM_REQS 16
>> +#define WMI_MAX_MEM_CHUNK_SIZE (256 * 1024) /* 256 KB */
>>     struct wmi_scan_ev_arg {
>>       __le32 event_type; /* %WMI_SCAN_EVENT_ */
>
>
Rajkumar Manoharan June 29, 2016, 4:35 p.m. UTC | #4
>> On 29.06.2016 at 16:04, Sebastian Gottschall wrote:
>> this fix will crash QCA9980 on QCA IPQ8064 cpu based systems.
>> so please rework it, or leave it out.
>> note:
>> maybe the limit of 256kb is too low for that card
>>
> by the way. 512 works
>
Thanks a lot, Sebastian. Let me confirm the same on x86, and I will update the change.

-Rajkumar
Sebastian Gottschall June 29, 2016, 4:58 p.m. UTC | #5
On 29.06.2016 at 18:35, Manoharan, Rajkumar wrote:
>>> On 29.06.2016 at 16:04, Sebastian Gottschall wrote:
>>> this fix will crash QCA9980 on QCA IPQ8064 cpu based systems.
>>> so please rework it, or leave it out.
>>> note:
>>> maybe the limit of 256kb is too low for that card
>>>
>> by the way. 512 works
>>
> Thanks a lot Sebastian. Let me confirm the same on x86 and will update the change.
Please check the 9984 as well. I don't have this card right now, but it
seems to have something to do with the firmware size, and the 9984 is
bigger than the 9980.
I'm still waiting for my sample cards.
>
> -Rajkumar
Michal Kazior June 30, 2016, 7:09 a.m. UTC | #6
On 29 June 2016 at 18:35, Manoharan, Rajkumar <rmanohar@qti.qualcomm.com> wrote:
>>> On 29.06.2016 at 16:04, Sebastian Gottschall wrote:
>>> this fix will crash QCA9980 on QCA IPQ8064 cpu based systems.
>>> so please rework it, or leave it out.
>>> note:
>>> maybe the limit of 256kb is too low for that card
>>>
>> by the way. 512 works

I think this suggests the problem isn't about memory chunk size limit
per se but some kind of bug in address/offset logic in fw or hw.

DMA coherent and single-map addresses use completely different ranges
in many cases. Perhaps some MSBs are not properly handled in fw or hw.
I recall there is a magic macro through which target device accesses
host memory so maybe that's a good place to look to better understand
the problem?

I recall Ben mentioned he worked around the problem by enabling
IOMMU/VT-d on his system. This could either prevent the device from
doing bad things or maybe changed DMA address ranges that are handed
out to the driver effectively or changed PCI controller behavior in
some way.


> Thanks a lot Sebastian. Let me confirm the same on x86 and will update the change.

Just changing the memory chunk size limit blindly is bad, and
Sebastian's crash has proven it. 512 may seem to work now, but it may
fail with other 10.4 firmware revisions or make x86 hang in other
cases.


Michał
Rajkumar Manoharan July 19, 2016, 3:25 p.m. UTC | #7
On June 30, 2016 12:39 PM, Michal Kazior <michal.kazior@tieto.com> wrote:
> On 29 June 2016 at 18:35, Manoharan, Rajkumar <rmanohar@qti.qualcomm.com> wrote:
>>>> On 29.06.2016 at 16:04, Sebastian Gottschall wrote:
>>>> this fix will crash QCA9980 on QCA IPQ8064 cpu based systems.
>>>> so please rework it, or leave it out.
>>>> note:
>>>> maybe the limit of 256kb is too low for that card
>>>>
>>> by the way. 512 works
>
> I think this suggests the problem isn't about memory chunk size limit
> per se but some kind of bug in address/offset logic in fw or hw.
> 
> DMA coherent and single-map addresses use completely different ranges
> in many cases. Perhaps some MSBs are not properly handled in fw or hw.
> I recall there is a magic macro through which target device accesses
> host memory so maybe that's a good place to look to better understand
> the problem?
> 
Michał,

Could you please shed some light on this issue? It seems this issue is popping up
more frequently and there are multiple threads for this issue.

"Anyone brought up 9984 NIC on x86-64?"
"AR9882 IOMMU faults"

> I recall Ben mentioned he worked around the problem by enabling
> IOMMU/VT-d on his system. This could either prevent the device from
> doing bad things or maybe changed DMA address ranges that are handed
> out to the driver effectively or changed PCI controller behavior in
> some way.
>
>> Thanks a lot Sebastian. Let me confirm the same on x86 and will update the change.
> 
> Just changing the memory chunk size limit blindly is bad and
> Sebastian's crash has proven it. 512 may seem to work now but it may
> fail with a other 10.4 firmware revisions or make x86 hang in other
> cases.
> 
Even with the current logic, if the memory chunk allocation fails for a bigger size, it tries
to allocate smaller chunks. So if smaller chunks cause unexpected behaviour, that applies to
the existing logic as well, no?
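
The fallback referred to here can be sketched roughly as follows. This is a user-space simulation, not the driver code: try_alloc() is a hypothetical allocator that refuses anything above max_ok bytes to mimic a large kzalloc() failing, and alloc_chunk_halving() is an assumed name for the retry loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator: fails for any request larger than max_ok bytes. */
static void *try_alloc(size_t size, size_t max_ok)
{
	return size <= max_ok ? calloc(1, size) : NULL;
}

/*
 * Retry with half the unit count until an allocation succeeds or no
 * units are left. On success *num_units and *pool_size describe the
 * chunk that was actually obtained; on failure both end up as 0.
 */
static void *alloc_chunk_halving(uint32_t unit_len, uint32_t *num_units,
				 size_t max_ok, size_t *pool_size)
{
	size_t unit = ((size_t)unit_len + 3) & ~(size_t)3; /* round up to 4 */
	void *vaddr = NULL;

	while (*num_units) {
		*pool_size = (size_t)*num_units * unit;
		vaddr = try_alloc(*pool_size, max_ok);
		if (vaddr)
			break;
		*num_units /= 2; /* allocation failed: ask for half as much */
	}
	if (!vaddr)
		*pool_size = 0;
	return vaddr;
}
```

The point of the sketch is that the driver may already hand the firmware chunks smaller than requested whenever a big allocation fails, so a cap changes when that happens rather than whether it can happen.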

-Rajkumar
Adrian Chadd July 19, 2016, 4:13 p.m. UTC | #8
On 19 July 2016 at 08:25, Manoharan, Rajkumar <rmanohar@qti.qualcomm.com> wrote:
> On June 30, 2016 12:39 PM, Michal Kazior <michal.kazior@tieto.com> wrote:
>> On 29 June 2016 at 18:35, Manoharan, Rajkumar <rmanohar@qti.qualcomm.com> wrote:
>>>>> On 29.06.2016 at 16:04, Sebastian Gottschall wrote:
>>>>> this fix will crash QCA9980 on QCA IPQ8064 cpu based systems.
>>>>> so please rework it, or leave it out.
>>>>> note:
>>>>> maybe the limit of 256kb is too low for that card
>>>>>
>>>> by the way. 512 works
>>
>> I think this suggests the problem isn't about memory chunk size limit
>> per se but some kind of bug in address/offset logic in fw or hw.
>>
>> DMA coherent and single-map addresses use completely different ranges
>> in many cases. Perhaps some MSBs are not properly handled in fw or hw.
>> I recall there is a magic macro through which target device accesses
>> host memory so maybe that's a good place to look to better understand
>> the problem?
>>
> Michał,
>
> Could you please shed some light on this issue? It seems this issue is popping up
> more frequently and there are multiple threads for this issue.
>
> "Anyone brought up 9984 NIC on x86-64?"
> "AR9882 IOMMU faults"
>
>> I recall Ben mentioned he worked around the problem by enabling
>> IOMMU/VT-d on his system. This could either prevent the device from
>> doing bad things or maybe changed DMA address ranges that are handed
>> out to the driver effectively or changed PCI controller behavior in
>> some way.
>>
>>> Thanks a lot Sebastian. Let me confirm the same on x86 and will update the change.
>>
>> Just changing the memory chunk size limit blindly is bad and
>> Sebastian's crash has proven it. 512 may seem to work now but it may
>> fail with a other 10.4 firmware revisions or make x86 hang in other
>> cases.
>>
> Even with current logic, If the memory chunk allocation fails for bigger size, then it tries
> to allocate smaller chunks. So If smaller chunks causes unexpected behaviour, it is even
> applicable to existing logic. no?

Try allocating WMI memory with GFP_DMA32. The way it currently works
in Linux is that calling dma map ends up allocating IOMMU slots
to map that 64 bit memory back into 32 bit space, /or/ it ends up
allocating bounce buffers.

The WMI memory alloc routine is being used for the swap space too,
which ends up with 700kbyte or more allocated twice - once for the
initial alloc, and another for the dma map call.

You should try GFP_DMA32 and see if that fixes it. You need contig, <
32 bit physical memory allocated, and bounce buffers are really
supposed to be ephemeral.
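
The "contiguous and below 32 bits" condition can be written down as a small check. This is a user-space sketch only: range_fits_32bit() and ADDR_MASK_32 are hypothetical names, and the addresses in the note below are made-up examples, not values observed on any system in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ADDR_MASK_32 0xFFFFFFFFull /* highest address a 32 bit DMA engine can reach */

/*
 * Hypothetical check: does the contiguous range [addr, addr + len - 1]
 * lie entirely within the low 4 GiB, so that a 32 bit only device could
 * reach it without an IOMMU mapping or a bounce buffer?
 */
static bool range_fits_32bit(uint64_t addr, uint64_t len)
{
	assert(len > 0);

	if (addr > ADDR_MASK_32)
		return false;
	/* Written this way to avoid overflow in addr + len. */
	return len - 1 <= ADDR_MASK_32 - addr;
}
```

A 689816-byte chunk at 0x10000000 passes this check, while any chunk starting at or above 0x100000000 does not and would need remapping or bouncing. Memory allocated with GFP_DMA32 is meant to satisfy the condition from the start.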



-adrian
Ben Greear July 19, 2016, 4:53 p.m. UTC | #9
On 07/19/2016 09:13 AM, Adrian Chadd wrote:
> On 19 July 2016 at 08:25, Manoharan, Rajkumar <rmanohar@qti.qualcomm.com> wrote:
>> On June 30, 2016 12:39 PM, Michal Kazior <michal.kazior@tieto.com> wrote:
>>> On 29 June 2016 at 18:35, Manoharan, Rajkumar <rmanohar@qti.qualcomm.com> wrote:
>>>>>> On 29.06.2016 at 16:04, Sebastian Gottschall wrote:
>>>>>> this fix will crash QCA9980 on QCA IPQ8064 cpu based systems.
>>>>>> so please rework it, or leave it out.
>>>>>> note:
>>>>>> maybe the limit of 256kb is too low for that card
>>>>>>
>>>>> by the way. 512 works
>>>
>>> I think this suggests the problem isn't about memory chunk size limit
>>> per se but some kind of bug in address/offset logic in fw or hw.
>>>
>>> DMA coherent and single-map addresses use completely different ranges
>>> in many cases. Perhaps some MSBs are not properly handled in fw or hw.
>>> I recall there is a magic macro through which target device accesses
>>> host memory so maybe that's a good place to look to better understand
>>> the problem?
>>>
>> Michał,
>>
>> Could you please shed some light on this issue? It seems this issue is popping up
>> more frequently and there are multiple threads for this issue.
>>
>> "Anyone brought up 9984 NIC on x86-64?"
>> "AR9882 IOMMU faults"
>>
>>> I recall Ben mentioned he worked around the problem by enabling
>>> IOMMU/VT-d on his system. This could either prevent the device from
>>> doing bad things or maybe changed DMA address ranges that are handed
>>> out to the driver effectively or changed PCI controller behavior in
>>> some way.
>>>
>>>> Thanks a lot Sebastian. Let me confirm the same on x86 and will update the change.
>>>
>>> Just changing the memory chunk size limit blindly is bad and
>>> Sebastian's crash has proven it. 512 may seem to work now but it may
>>> fail with a other 10.4 firmware revisions or make x86 hang in other
>>> cases.
>>>
>> Even with current logic, If the memory chunk allocation fails for bigger size, then it tries
>> to allocate smaller chunks. So If smaller chunks causes unexpected behaviour, it is even
>> applicable to existing logic. no?
>
> Try allocating WMI memory with GFP_DMA32. The way it currently is
> working in linux is that caling dma map ends up allocating iommu slots
> to map that 64 bit memory back into 32 bit space, /or/ it will end up
> allocating bounce buffers.
>
> The WMI memory alloc routine is being used for the swap space too,
> which ends up with 700kbyte or more allocated twice - once for the
> initial alloc, and another for the dma map call.
>
> You should try GFP_DMA32 and see if that fixes it. You need contig, <
> 32 bit physical memory allocated, and bounce buffers are really
> supposed to be ephemeral.

I briefly tested with the GFP_DMA32 and it worked on my 9984 test rig.

I also found firmware bugs in my 10.4.3 (3.3-25) based firmware when smaller
chunk sizes are used.  Possibly this is fixed in QCA firmware images, but likely
if you select more active peers than will fit in one 256k chunk your firmware
will crash.  I tested with 72 active peers.  I have fixed this bug in my 9984 firmware
as far as I can tell, but I have not run stress tests on it yet.

Thanks,
Ben

>
>
>
> -adrian
>
>
Adrian Chadd July 19, 2016, 5 p.m. UTC | #10
Some more test hardware arrived on Friday, so I'll whack a peregrine
v2 NIC in this afternoon and test out my changes.

I'll send a pull request with the DMA32 changes today or tomorrow.

(If someone would like to send me some beeliner/cascade hardware or
arrange something to be picked up in San Jose then I can test on that,
otherwise it'll have to wait; I don't have easy access to anything
else besides dakota and that is being used for other bring-up
activities atm.)



-adrian
Michal Kazior July 20, 2016, 5:36 a.m. UTC | #11
On 19 July 2016 at 17:25, Manoharan, Rajkumar <rmanohar@qti.qualcomm.com> wrote:
> On June 30, 2016 12:39 PM, Michal Kazior <michal.kazior@tieto.com> wrote:
>> On 29 June 2016 at 18:35, Manoharan, Rajkumar <rmanohar@qti.qualcomm.com> wrote:
>>>>> On 29.06.2016 at 16:04, Sebastian Gottschall wrote:
>>>>> this fix will crash QCA9980 on QCA IPQ8064 cpu based systems.
>>>>> so please rework it, or leave it out.
>>>>> note:
>>>>> maybe the limit of 256kb is too low for that card
>>>>>
>>>> by the way. 512 works
>>
>> I think this suggests the problem isn't about memory chunk size limit
>> per se but some kind of bug in address/offset logic in fw or hw.
>>
>> DMA coherent and single-map addresses use completely different ranges
>> in many cases. Perhaps some MSBs are not properly handled in fw or hw.
>> I recall there is a magic macro through which target device accesses
>> host memory so maybe that's a good place to look to better understand
>> the problem?
>>
> Michał,
>
> Could you please shed some light on this issue? It seems this issue is popping up
> more frequently and there are multiple threads for this issue.
>
> "Anyone brought up 9984 NIC on x86-64?"
> "AR9882 IOMMU faults"

I think IOMMU faults were solved by using DMA_BIDIRECTIONAL, no?

Anyway, FWIW there's this concept in firmware called dma_local_bits
and A_DMA_ADDR()/A_CPU_ADDR(). Not sure if it's relevant but may be
worth checking out in detail.


>> I recall Ben mentioned he worked around the problem by enabling
>> IOMMU/VT-d on his system. This could either prevent the device from
>> doing bad things or maybe changed DMA address ranges that are handed
>> out to the driver effectively or changed PCI controller behavior in
>> some way.
>>
>>> Thanks a lot Sebastian. Let me confirm the same on x86 and will update the change.
>>
>> Just changing the memory chunk size limit blindly is bad and
>> Sebastian's crash has proven it. 512 may seem to work now but it may
>> fail with a other 10.4 firmware revisions or make x86 hang in other
>> cases.
>>
> Even with current logic, If the memory chunk allocation fails for bigger size, then it tries
> to allocate smaller chunks. So If smaller chunks causes unexpected behaviour, it is even
> applicable to existing logic. no?

We still don't know *why* using non-coherent memory causes problems.
Changing chunk size limit seems to alter the behavior in some
unpredictable ways, yes, but it's really hard to tell if the "try
smaller chunk sizes" *itself* introduces any problems.


Michał
Michal Kazior July 20, 2016, 5:38 a.m. UTC | #12
On 19 July 2016 at 18:53, Ben Greear <greearb@candelatech.com> wrote:
> On 07/19/2016 09:13 AM, Adrian Chadd wrote:
[...]
>> Try allocating WMI memory with GFP_DMA32. The way it currently is
>> working in linux is that caling dma map ends up allocating iommu slots
>> to map that 64 bit memory back into 32 bit space, /or/ it will end up
>> allocating bounce buffers.
>>
>> The WMI memory alloc routine is being used for the swap space too,
>> which ends up with 700kbyte or more allocated twice - once for the
>> initial alloc, and another for the dma map call.
>>
>> You should try GFP_DMA32 and see if that fixes it. You need contig, <
>> 32 bit physical memory allocated, and bounce buffers are really
>> supposed to be ephemeral.
>
>
> I briefly tested with the GFP_DMA32 and it worked on my 9984 test rig.

Doesn't GFP_DMA32 alter the address space your pointers refer to in a
similar way dma-coherent allocations do (compared to regular
kmalloc+dma-map-single)? This would support my theory about
firmware/hardware getting confused if host mem chunks have some MSBs
set.


Michał
Adrian Chadd July 20, 2016, 5:44 a.m. UTC | #13
Hi,

dma coherent doesn't /have/ to mean "low 32 bits". It's just supposed
to mean "try really hard to use uncached memory on platforms that
support it."

The ath10k hardware (at least what I've played with thus far) is all
32 bit DMA hardware, not 64 bit, so it can't be handed 64 bit memory -
contiguous or otherwise.

So, if dma coherent on linux means 32 bit only physmem, great.

Now, it also turns out that various platforms that say they do
coherent memory these days do "mostly coherent", and you still need
some flush/sync ops..


-adrian
Michal Kazior July 20, 2016, 6:05 a.m. UTC | #14
On 20 July 2016 at 07:44, Adrian Chadd <adrian@freebsd.org> wrote:
> Hi,
>
> dma coherent doesn't /have/ to mean "low 32 bits". It's just supposed
> to mean "try really hard to use uncached memory on platforms that
> support it."

Good point. Maybe it does on x86, or at least some machines.

@Ben: Can you verify if that's the case for you? Can you see what
address ranges hostmem chunks get with and without the GFP_DMA32 (and
maybe compare it against a revert to compare to dma-coherent as well)?


> The ath10k hardware (at least what I've played with thus far) is all
> 32 bit DMA hardware, not 64 bit, so it can't be handed 64 bit memory -
> contiguous or otherwise.
>
> So, if dma coherent on linux means 32 bit only physmem, great.
>
> Now, it also turns out that various platforms that say they do
> coherent memory these days do "mostly coherent", and you still need
> some flush/sync ops..

Yeah, but since the device has its own CPU and RAM, it has to have a
way to distinguish local and host memory using these 32 bits, no?
(Think about firmware generating local 802.11 frames vs. pushing
frames coming from the host driver.)


Michał
Adrian Chadd July 20, 2016, 6:11 a.m. UTC | #15
Hi,

The ath10k on-chip DMA engine only knows how to address the low 32
bits of physical address space. It can't do DMA elsewhere without a
hardware IOMMU in the system. Even then, as far as the device is
concerned, it's being given < 32 bit physical memory addresses to DMA
around.

The local/host memory designation is different; IIRC when the CE is
doing transfers between local/host memory it's explicitly set up to do
so when you set up the ring/transfer. But those are all 32 bit
addresses anyway.


-adrian
Michal Kazior July 20, 2016, 6:27 a.m. UTC | #16
On 20 July 2016 at 08:11, Adrian Chadd <adrian@freebsd.org> wrote:
> Hi,
>
> The ath10k on-chip DMA engine only knows how to address the low 32
> bits of physical address space. It can't do DMA elsewhere without a
> hardware IOMMU in the system. Even then, as far as the device is
> concerned, it's being given < 32 bit physical memory addresses to DMA
> around.

Sure. ath10k even makes sure to use 32bit mask for dma (which is
redundant because it's already the default in linux pci subsystem
anyway).


> The local/host memory designation is different; IIRC when the CE is
> doing transfers between local/host memory it's explicitly setup to do
> so when you setup the ring/transfer. But those are all 32 bit
> addresses anyway.

Is CE involved in direct DMA accesses as well? Anyway my current
suspicion stems from this code hunk in ath10k:

static u32 ath10k_pci_targ_cpu_to_ce_addr(struct ath10k *ar, u32 addr)
{
        u32 val = 0;

        switch (ar->hw_rev) {
        case ATH10K_HW_QCA988X:
        case ATH10K_HW_QCA9887:
        case ATH10K_HW_QCA6174:
        case ATH10K_HW_QCA9377:
                val = (ath10k_pci_read32(ar, SOC_CORE_BASE_ADDRESS +
                                          CORE_CTRL_ADDRESS) &
                       0x7ff) << 21;
                break;
        case ATH10K_HW_QCA9888:
        case ATH10K_HW_QCA99X0:
        case ATH10K_HW_QCA9984:
        case ATH10K_HW_QCA4019:
                val = ath10k_pci_read32(ar, PCIE_BAR_REG_ADDRESS);
                break;
        }

        val |= 0x100000 | (addr & 0xfffff);
        return val;
}

This is used for the CE diagnostic window in ath10k. However, there seems
to be a counterpart (A_DMA_ADDR/A_CPU_ADDR) in firmware that does
something similar, and dma_local_bits seems to be constructed in a
similar manner (at least on QCA9880), i.e. ((x >> 21) & 0x7ff) << 21.

The QCA99X0 seems to do it slightly differently (in firmware code as
well). Perhaps there's some kind of bug or an unexpected overlap that
causes addressing problems?
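
As a rough illustration of the bit arithmetic discussed above, the following
userspace sketch reproduces the final composition step of
ath10k_pci_targ_cpu_to_ce_addr() and the firmware-side expression
((x >> 21) & 0x7ff) << 21. The helper names are hypothetical and the register
values are made up; only the masks and shifts come from the code quoted above.

```c
#include <stdint.h>

/* Hypothetical helper: the QCA988X-style base, i.e. the low 11 bits of the
 * register value moved up to bits 21..31. */
static inline uint32_t base_from_core_ctrl(uint32_t core_ctrl)
{
	return (core_ctrl & 0x7ffu) << 21;
}

/* Hypothetical helper: final step of the quoted function. 'base' stands in
 * for whatever was read from the hardware register. */
static inline uint32_t ce_addr_compose(uint32_t base, uint32_t addr)
{
	return base | 0x100000u | (addr & 0xfffffu);
}

/* The firmware-side expression quoted above: keeps bits 21..31 of x,
 * which is the same as masking with 0xffe00000. */
static inline uint32_t dma_local_bits(uint32_t x)
{
	return ((x >> 21) & 0x7ffu) << 21;
}
```

With a QCA988X-style base the three fields (bits 21..31, bit 20, bits 0..19)
cannot collide. In the QCA99X0 branch the base is the raw register value, so
whether it can overlap bit 20 or the low 20 bits depends on what that register
holds, which this sketch does not know.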


Michał
Sebastian Gottschall July 20, 2016, 8:49 a.m. UTC | #17
Hello

While hunting a link stability (packet transmission stop) issue I
discovered an issue that may be cosmetic, but may also be serious.
The AP is a QCA9880 3x3 card configured as WDS AP.
The station is a QCA9880 2x2 card configured as WDS STA.

The TX rate of the station matches the RX rate of the AP,
but the RX rate of the station seems to be wrong, which may be a cause
of the issue.
Could this be a firmware bug on QCA9880?

output of fw_stats

WDS AP:
              Peer MAC address 40:a5:ef:85:4d:6f
                      Peer RSSI 12
                   Peer TX rate 175500
                   Peer RX rate 175500
               Peer RX duration 0


WDS STA:
             Peer MAC address 40:a5:ef:51:49:db
                      Peer RSSI 13
                   Peer TX rate 175500
                   Peer RX rate 351000
               Peer RX duration 0


Sebastian
Sebastian Gottschall July 20, 2016, 8:51 a.m. UTC | #18
Hello

While hunting a link stability (packet transmission stop) issue I
discovered an issue that may be cosmetic, but may also be serious.
The AP is a QCA9880 3x3 card configured as WDS AP.
The station is a QCA9880 2x2 card configured as WDS STA.

The TX rate of the station matches the RX rate of the AP,
but the RX rate of the station seems to be wrong, which may be a cause
of the issue.
Could this be a firmware bug on QCA9880?

output of fw_stats

WDS AP:
              Peer MAC address 40:a5:ef:85:4d:6f
                      Peer RSSI 12
                   Peer TX rate 175500
                   Peer RX rate 175500
               Peer RX duration 0


WDS STA:
             Peer MAC address 40:a5:ef:51:49:db
                      Peer RSSI 13
                   Peer TX rate 175500
                   Peer RX rate 351000
               Peer RX duration 0


Sebastian
Michal Kazior July 20, 2016, 10:23 a.m. UTC | #19
On 20 July 2016 at 10:51, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
> Hello
>
> while hunting a link stability (packet transmission stop) issue i discovered
> a maybe cosmetic, but maybe als serious issue.
> AP is a QCA9880 3x3 card configured as WDS AP
> Station is a QCA9880 2x2 card configured as WDS STA
>
> the TX rate of the station matches to the rx rate of the AP.
> but the RX rate of the station is wrong as it seems which may be a cause of
> the issue.
> could this be a firmware bug on QCA9880?
>
> output of fw_stats
>
> WDS AP:
>              Peer MAC address 40:a5:ef:85:4d:6f
>                      Peer RSSI 12
>                   Peer TX rate 175500
>                   Peer RX rate 175500
>               Peer RX duration 0
>
>
> WDS STA:
>             Peer MAC address 40:a5:ef:51:49:db
>                      Peer RSSI 13
>                   Peer TX rate 175500
>                   Peer RX rate 351000
>               Peer RX duration 0

Hmm.. Interesting. FWIW these are "last tx/rx rate". This isn't
average nor anything fancy like that.

Can you compare that to `iw wlanX station dump`, please? It should
report last rx bitrate at least (tx bitrate is broken so don't rely on
that). The 175.5 and 351 seem to be both vht mcs index=4 but with
different nss values (1 or 2 spatial streams).
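
For reference, both numbers can be reproduced from standard 802.11ac
parameters: an 80 MHz VHT channel has 234 data subcarriers, and with the long
guard interval one OFDM symbol lasts 4 us. The sketch below is a minimal
userspace calculation under those assumptions; the function and table names
are made up for illustration.

```c
#include <stdint.h>

/* Data subcarriers in an 80 MHz VHT channel. */
#define VHT80_DATA_SUBCARRIERS 234u

/* Data bits per subcarrier per stream (modulation bits times coding rate),
 * scaled by 12 to stay in integers, for VHT MCS 0..9.
 * Example: MCS 4 is 16-QAM at rate 3/4, so 4 * 3/4 = 3 bits -> 36. */
static const uint32_t vht_bits_x12[10] = {
	6, 12, 18, 24, 36, 48, 54, 60, 72, 80
};

/* Nominal PHY rate in kbps for 80 MHz, long guard interval (4 us symbols).
 * Returns 0 for arguments outside MCS 0..9 / NSS 1..8. */
static inline uint32_t vht80_lgi_rate_kbps(unsigned mcs, unsigned nss)
{
	if (mcs > 9 || nss < 1 || nss > 8)
		return 0;
	/* bits per symbol / 4 us == bits per symbol * 250 kbps */
	return VHT80_DATA_SUBCARRIERS * vht_bits_x12[mcs] * nss * 250u / 12u;
}
```

MCS 4 gives 175500 kbps with one stream and 351000 kbps with two. Note that
MCS 8 with a single stream also works out to 351000 kbps, so the raw rate
alone does not pin down the stream count.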

I do wonder at what rate frames are actually transmitted OTA.

Is this reproducible? Can you try setting a fixed tx bitrate (`iw
wlanX set bitrates legacy-5 ht-mcs vht-mcs 1:4` to force vht mcs=4,
nss=1) to see if it makes any difference? Perhaps rate-control and tx
try-list/status are not parsed properly (for statistical purposes) in
firmware which ends up with invalid peer-tx-rate on WDS AP.


Michał
Sebastian Gottschall July 20, 2016, 10:50 a.m. UTC | #20
Am 20.07.2016 um 12:23 schrieb Michal Kazior:
> On 20 July 2016 at 10:51, Sebastian Gottschall 
> <s.gottschall@dd-wrt.com> wrote:
>> Hello
>>
>> while hunting a link stability (packet transmission stop) issue i 
>> discovered
>> a maybe cosmetic, but maybe als serious issue.
>> AP is a QCA9880 3x3 card configured as WDS AP
>> Station is a QCA9880 2x2 card configured as WDS STA
>>
>> the TX rate of the station matches to the rx rate of the AP.
>> but the RX rate of the station is wrong as it seems which may be a 
>> cause of
>> the issue.
>> could this be a firmware bug on QCA9880?
>>
>> output of fw_stats
>>
>> WDS AP:
>>               Peer MAC address 40:a5:ef:85:4d:6f
>>                       Peer RSSI 12
>>                    Peer TX rate 175500
>>                    Peer RX rate 175500
>>                Peer RX duration 0
>>
>>
>> WDS STA:
>>              Peer MAC address 40:a5:ef:51:49:db
>>                       Peer RSSI 13
>>                    Peer TX rate 175500
>>                    Peer RX rate 351000
>>                Peer RX duration 0
> Hmm.. Interesting. FWIW these are "last tx/rx rate". This isn't
> average nor anything fancy like that.
This is in sync, yes. It's also not that I just observed it for a second.
This test installation has been running here for a while,
and while looking for the cause I observed this behavior.

>
> Can you compare that to `iw wlanX station dump`, please? It should
> report last rx bitrate at least (tx bitrate is broken so don't rely on
> that). The 175.5 and 351 seem to be both vht mcs index=4 but with
> differet nss values (1 or 2 spatial streams).
iw station dump shows the same rate; peer RX rate and the iw dump are identical.

STA:
Peer RX rate 526500
rx bitrate:     526.6 MBit/s VHT-MCS 6 80MHz VHT-NSS 2

AP:
Peer TX rate 234000

Interesting here is that the TX rate on the STA side matches the RX rate
on the AP side, but not vice versa.

>
> I do wonder at what rate frames are actually transmitted OTA.
>
> Is this reproducible? Can you try setting a fixed tx bitrate (`iw
> wlanX set bitrates legacy-5 ht-mcs vht-mcs 1:4` to force vht mcs=4,
> nss=1) to see if it makes any difference? Perhaps rate-control and tx
> try-list/status are not parsed properly (for statistical purposes) in
> firmware which ends up with invalid peer-tx-rate on WDS AP.
Let's try. Can you correct the syntax? The following is not correct:
iw dev ath1 set bitrates legacy-5 ht-mcs vht-mcs 1:4

Usage:  iw [options] dev <devname> set bitrates [legacy-<2.4|5> <legacy rate in Mbps>*]
        [ht-mcs-<2.4|5> <MCS index>*] [vht-mcs-<2.4|5> <NSS:MCSx,MCSy... | NSS:MCSx-MCSy>*]
        [sgi-2.4|lgi-2.4] [sgi-5|lgi-5]

Sets up the specified rate masks.
Not passing any arguments would clear the existing mask (if any).


>
>
> Michał
>
Michal Kazior July 20, 2016, 11:03 a.m. UTC | #21
On 20 July 2016 at 12:50, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
> Am 20.07.2016 um 12:23 schrieb Michal Kazior:
>>
>> On 20 July 2016 at 10:51, Sebastian Gottschall <s.gottschall@dd-wrt.com>
>> wrote:
>>>
>>> Hello
>>>
>>> while hunting a link stability (packet transmission stop) issue i
>>> discovered
>>> a maybe cosmetic, but maybe als serious issue.
>>> AP is a QCA9880 3x3 card configured as WDS AP
>>> Station is a QCA9880 2x2 card configured as WDS STA
>>>
>>> the TX rate of the station matches to the rx rate of the AP.
>>> but the RX rate of the station is wrong as it seems which may be a cause
>>> of
>>> the issue.
>>> could this be a firmware bug on QCA9880?
>>>
>>> output of fw_stats
>>>
>>> WDS AP:
>>>               Peer MAC address 40:a5:ef:85:4d:6f
>>>                       Peer RSSI 12
>>>                    Peer TX rate 175500
>>>                    Peer RX rate 175500
>>>                Peer RX duration 0
>>>
>>>
>>> WDS STA:
>>>              Peer MAC address 40:a5:ef:51:49:db
>>>                       Peer RSSI 13
>>>                    Peer TX rate 175500
>>>                    Peer RX rate 351000
>>>                Peer RX duration 0
[...]
>> Is this reproducible? Can you try setting a fixed tx bitrate (`iw
>> wlanX set bitrates legacy-5 ht-mcs vht-mcs 1:4` to force vht mcs=4,
>> nss=1) to see if it makes any difference? Perhaps rate-control and tx
>> try-list/status are not parsed properly (for statistical purposes) in
>> firmware which ends up with invalid peer-tx-rate on WDS AP.
>
> lets try. can you correct the syntax? the following is not correct
> iw dev ath1 set bitrates legacy-5 ht-mcs vht-mcs 1:4
>
> Usage:  iw [options] dev <devname> set bitrates [legacy-<2.4|5> <legacy rate
> in Mbps>*] [ht-mcs-<2.4|5> <MCS index>*] [v
> ht-mcs-<2.4|5> <NSS:MCSx,MCSy... | NSS:MCSx-MCSy>*] [sgi-2.4|lgi-2.4]
> [sgi-5|lgi-5]
>
> Sets up the specified rate masks.
> Not passing any arguments would clear the existing mask (if any).

Ah, sorry, my bad.

  iw dev ath1 set bitrates legacy-5 ht-mcs-5 vht-mcs-5 1:4

(ht-mcs and vht-mcs were missing "-5" suffix)

As a sidenote: the logic here is that you need to state explicitly that
you want an empty set of legacy rates, an empty set of HT rates and only
a single VHT rate (all explicitly for the 5 GHz band). This makes
sure that FW rate control is ignored and a fixed rate is used for all
data transmissions (and this should also build tx try-lists without
any fancy retries/fallbacks to different rates).
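
To make the "empty legacy, empty HT, one VHT rate" idea concrete, here is a
small userspace sketch of such a mask. The struct and its sizes are
hypothetical, only loosely modelled on the kind of per-band bitmaps a rate
mask usually carries; it is not the kernel's or iw's actual data structure.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_HT_MCS_BYTES 10   /* hypothetical size of the HT MCS bitmap */
#define SKETCH_VHT_NSS_MAX   8   /* hypothetical number of per-NSS VHT bitmaps */

/* Hypothetical per-band rate mask: a set bit means "this rate is allowed". */
struct sketch_rate_mask {
	uint32_t legacy;                        /* one bit per legacy rate */
	uint8_t  ht_mcs[SKETCH_HT_MCS_BYTES];   /* one bit per HT MCS index */
	uint16_t vht_mcs[SKETCH_VHT_NSS_MAX];   /* [nss - 1]: one bit per VHT MCS */
};

/* Clear everything, then allow exactly one VHT rate. Returns -1 on bad input. */
static inline int sketch_mask_single_vht(struct sketch_rate_mask *m,
					 unsigned nss, unsigned mcs)
{
	if (nss < 1 || nss > SKETCH_VHT_NSS_MAX || mcs > 9)
		return -1;
	memset(m, 0, sizeof(*m));                  /* empty legacy and HT sets */
	m->vht_mcs[nss - 1] = (uint16_t)(1u << mcs);
	return 0;
}

/* Count how many rates the mask allows in total. */
static inline unsigned sketch_mask_rate_count(const struct sketch_rate_mask *m)
{
	unsigned n = 0;
	uint32_t v;
	int i;

	for (v = m->legacy; v; v &= v - 1)
		n++;
	for (i = 0; i < SKETCH_HT_MCS_BYTES; i++)
		for (v = m->ht_mcs[i]; v; v &= v - 1)
			n++;
	for (i = 0; i < SKETCH_VHT_NSS_MAX; i++)
		for (v = m->vht_mcs[i]; v; v &= v - 1)
			n++;
	return n;
}
```

Calling sketch_mask_single_vht(&m, 1, 4) corresponds to the "vht-mcs-5 1:4"
part of the command: exactly one allowed rate remains, so there is nothing
left for rate control to choose between.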


Michał
Sebastian Gottschall July 20, 2016, 11:10 a.m. UTC | #22
Am 20.07.2016 um 13:03 schrieb Michal Kazior:
> On 20 July 2016 at 12:50, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
>> Am 20.07.2016 um 12:23 schrieb Michal Kazior:
>>> On 20 July 2016 at 10:51, Sebastian Gottschall <s.gottschall@dd-wrt.com>
>>> wrote:
>>>> Hello
>>>>
>>>> while hunting a link stability (packet transmission stop) issue i
>>>> discovered
>>>> a maybe cosmetic, but maybe als serious issue.
>>>> AP is a QCA9880 3x3 card configured as WDS AP
>>>> Station is a QCA9880 2x2 card configured as WDS STA
>>>>
>>>> the TX rate of the station matches to the rx rate of the AP.
>>>> but the RX rate of the station is wrong as it seems which may be a cause
>>>> of
>>>> the issue.
>>>> could this be a firmware bug on QCA9880?
>>>>
>>>> output of fw_stats
>>>>
>>>> WDS AP:
>>>>                Peer MAC address 40:a5:ef:85:4d:6f
>>>>                        Peer RSSI 12
>>>>                     Peer TX rate 175500
>>>>                     Peer RX rate 175500
>>>>                 Peer RX duration 0
>>>>
>>>>
>>>> WDS STA:
>>>>               Peer MAC address 40:a5:ef:51:49:db
>>>>                        Peer RSSI 13
>>>>                     Peer TX rate 175500
>>>>                     Peer RX rate 351000
>>>>                 Peer RX duration 0
> [...]
>>> Is this reproducible? Can you try setting a fixed tx bitrate (`iw
>>> wlanX set bitrates legacy-5 ht-mcs vht-mcs 1:4` to force vht mcs=4,
>>> nss=1) to see if it makes any difference? Perhaps rate-control and tx
>>> try-list/status are not parsed properly (for statistical purposes) in
>>> firmware which ends up with invalid peer-tx-rate on WDS AP.
>> lets try. can you correct the syntax? the following is not correct
>> iw dev ath1 set bitrates legacy-5 ht-mcs vht-mcs 1:4
>>
>> Usage:  iw [options] dev <devname> set bitrates [legacy-<2.4|5> <legacy rate
>> in Mbps>*] [ht-mcs-<2.4|5> <MCS index>*] [v
>> ht-mcs-<2.4|5> <NSS:MCSx,MCSy... | NSS:MCSx-MCSy>*] [sgi-2.4|lgi-2.4]
>> [sgi-5|lgi-5]
>>
>> Sets up the specified rate masks.
>> Not passing any arguments would clear the existing mask (if any).
> Ah, sorry, my bad.
>
>    iw dev ath1 set bitrates legacy-5 ht-mcs-5 vht-mcs-5 1:4
>
> (ht-mcs and vht-mcs were missing "-5" suffix)
>
> As a sidenote: The logic here is that you need to explicitly tell that
> you want empty set of legacy rates and empty set of HT rates and only
> a single VHT rate to be set (all explicitly for 5ghz band). This makes
> sure that FW rate control is ignored and a fixed rate is used for all
> data transmissions (and this should also build tx try-lists without
> any fancy retries/fallbacks to different rates).
After applying this setting on the AP, I get the following results:

STA:
         rx bitrate:     175.5 MBit/s VHT-MCS 4 80MHz VHT-NSS 1
                   Peer TX rate 234000
                   Peer RX rate 175500


AP:
         rx bitrate:     175.5 MBit/s VHT-MCS 4 80MHz VHT-NSS 1
                   Peer TX rate 263300
                   Peer RX rate 175500

This looks even more curious.
>
> Michał
>
Sebastian Gottschall July 20, 2016, 11:12 a.m. UTC | #23
Am 20.07.2016 um 13:03 schrieb Michal Kazior:
> On 20 July 2016 at 12:50, Sebastian Gottschall 
> <s.gottschall@dd-wrt.com> wrote:
>> Am 20.07.2016 um 12:23 schrieb Michal Kazior:
>>> On 20 July 2016 at 10:51, Sebastian Gottschall 
>>> <s.gottschall@dd-wrt.com>
>>> wrote:
>>>> Hello
>>>>
>>>> while hunting a link stability (packet transmission stop) issue i
>>>> discovered
>>>> a maybe cosmetic, but maybe als serious issue.
>>>> AP is a QCA9880 3x3 card configured as WDS AP
>>>> Station is a QCA9880 2x2 card configured as WDS STA
>>>>
>>>> the TX rate of the station matches to the rx rate of the AP.
>>>> but the RX rate of the station is wrong as it seems which may be a 
>>>> cause
>>>> of
>>>> the issue.
>>>> could this be a firmware bug on QCA9880?
>>>>
>>>> output of fw_stats
>>>>
>>>> WDS AP:
>>>>                Peer MAC address 40:a5:ef:85:4d:6f
>>>>                        Peer RSSI 12
>>>>                     Peer TX rate 175500
>>>>                     Peer RX rate 175500
>>>>                 Peer RX duration 0
>>>>
>>>>
>>>> WDS STA:
>>>>               Peer MAC address 40:a5:ef:51:49:db
>>>>                        Peer RSSI 13
>>>>                     Peer TX rate 175500
>>>>                     Peer RX rate 351000
>>>>                 Peer RX duration 0
> [...]
>>> Is this reproducible? Can you try setting a fixed tx bitrate (`iw
>>> wlanX set bitrates legacy-5 ht-mcs vht-mcs 1:4` to force vht mcs=4,
>>> nss=1) to see if it makes any difference? Perhaps rate-control and tx
>>> try-list/status are not parsed properly (for statistical purposes) in
>>> firmware which ends up with invalid peer-tx-rate on WDS AP.
>> lets try. can you correct the syntax? the following is not correct
>> iw dev ath1 set bitrates legacy-5 ht-mcs vht-mcs 1:4
>>
>> Usage:  iw [options] dev <devname> set bitrates [legacy-<2.4|5> 
>> <legacy rate
>> in Mbps>*] [ht-mcs-<2.4|5> <MCS index>*] [v
>> ht-mcs-<2.4|5> <NSS:MCSx,MCSy... | NSS:MCSx-MCSy>*] [sgi-2.4|lgi-2.4]
>> [sgi-5|lgi-5]
>>
>> Sets up the specified rate masks.
>> Not passing any arguments would clear the existing mask (if any).
> Ah, sorry, my bad.
>
>    iw dev ath1 set bitrates legacy-5 ht-mcs-5 vht-mcs-5 1:4
>
> (ht-mcs and vht-mcs were missing "-5" suffix)
>
> As a sidenote: The logic here is that you need to explicitly tell that
> you want empty set of legacy rates and empty set of HT rates and only
> a single VHT rate to be set (all explicitly for 5ghz band). This makes
> sure that FW rate control is ignored and a fixed rate is used for all
> data transmissions (and this should also build tx try-lists without
> any fancy retries/fallbacks to different rates).
After applying this setting on the AP, I get the following results:

STA:
         rx bitrate:     175.5 MBit/s VHT-MCS 4 80MHz VHT-NSS 1
                   Peer TX rate 234000
                   Peer RX rate 175500


AP:
         rx bitrate:     175.5 MBit/s VHT-MCS 4 80MHz VHT-NSS 1
                   Peer TX rate 263300
                   Peer RX rate 175500

This looks even more curious.
>
> Michał
>
Michal Kazior July 20, 2016, 11:32 a.m. UTC | #24
On 20 July 2016 at 13:10, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
> Am 20.07.2016 um 13:03 schrieb Michal Kazior:
>>
>> On 20 July 2016 at 12:50, Sebastian Gottschall <s.gottschall@dd-wrt.com>
>> wrote:
>>>
>>> Am 20.07.2016 um 12:23 schrieb Michal Kazior:
>>>>
>>>> On 20 July 2016 at 10:51, Sebastian Gottschall <s.gottschall@dd-wrt.com>
>>>> wrote:
>>>>>
>>>>> Hello
>>>>>
>>>>> while hunting a link stability (packet transmission stop) issue i
>>>>> discovered
>>>>> a maybe cosmetic, but maybe als serious issue.
>>>>> AP is a QCA9880 3x3 card configured as WDS AP
>>>>> Station is a QCA9880 2x2 card configured as WDS STA
>>>>>
>>>>> the TX rate of the station matches to the rx rate of the AP.
>>>>> but the RX rate of the station is wrong as it seems which may be a
>>>>> cause
>>>>> of
>>>>> the issue.
>>>>> could this be a firmware bug on QCA9880?
>>>>>
>>>>> output of fw_stats
>>>>>
>>>>> WDS AP:
>>>>>                Peer MAC address 40:a5:ef:85:4d:6f
>>>>>                        Peer RSSI 12
>>>>>                     Peer TX rate 175500
>>>>>                     Peer RX rate 175500
>>>>>                 Peer RX duration 0
>>>>>
>>>>>
>>>>> WDS STA:
>>>>>               Peer MAC address 40:a5:ef:51:49:db
>>>>>                        Peer RSSI 13
>>>>>                     Peer TX rate 175500
>>>>>                     Peer RX rate 351000
>>>>>                 Peer RX duration 0
>>
>> [...]
>>>>
>>>> Is this reproducible? Can you try setting a fixed tx bitrate (`iw
>>>> wlanX set bitrates legacy-5 ht-mcs vht-mcs 1:4` to force vht mcs=4,
>>>> nss=1) to see if it makes any difference? Perhaps rate-control and tx
>>>> try-list/status are not parsed properly (for statistical purposes) in
>>>> firmware which ends up with invalid peer-tx-rate on WDS AP.
>>>
>>> lets try. can you correct the syntax? the following is not correct
>>> iw dev ath1 set bitrates legacy-5 ht-mcs vht-mcs 1:4
>>>
>>> Usage:  iw [options] dev <devname> set bitrates [legacy-<2.4|5> <legacy
>>> rate
>>> in Mbps>*] [ht-mcs-<2.4|5> <MCS index>*] [v
>>> ht-mcs-<2.4|5> <NSS:MCSx,MCSy... | NSS:MCSx-MCSy>*] [sgi-2.4|lgi-2.4]
>>> [sgi-5|lgi-5]
>>>
>>> Sets up the specified rate masks.
>>> Not passing any arguments would clear the existing mask (if any).
>>
>> Ah, sorry, my bad.
>>
>>    iw dev ath1 set bitrates legacy-5 ht-mcs-5 vht-mcs-5 1:4
>>
>> (ht-mcs and vht-mcs were missing "-5" suffix)
>>
>> As a sidenote: The logic here is that you need to explicitly tell that
>> you want empty set of legacy rates and empty set of HT rates and only
>> a single VHT rate to be set (all explicitly for 5ghz band). This makes
>> sure that FW rate control is ignored and a fixed rate is used for all
>> data transmissions (and this should also build tx try-lists without
>> any fancy retries/fallbacks to different rates).
>
> after setting this setting on ap. i get the following results:
>
> sta:
> rx bitrate:     175.5 MBit/s VHT-MCS 4 80MHz VHT-NSS 1
>                  Peer TX rate 234000
>                   Peer RX rate 175500
>
>
> AP:
>         rx bitrate:     175.5 MBit/s VHT-MCS 4 80MHz VHT-NSS 1
>                   Peer TX rate 263300
>                   Peer RX rate 175500
>
> this looks even more curious

Thanks for checking.

Looks like firmware reports garbage for peer-tx-rate. 234 Mbps and
263.3 Mbps were probably values that rate control tried before, and
since firmware reuses memory hunks for tx descriptors it probably ends
up reading stale data and reports it as peer-tx-rate.
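
One observation that fits the stale-data theory: both reported values sit on
(or within rounding of) nominal single-stream VHT rates for 80 MHz with the
long guard interval. The sketch below is a hypothetical lookup, not firmware
code; the table holds the standard nominal rates and the tolerance allows for
a value rounded to 100 kbps.

```c
#include <stdint.h>

/* Nominal VHT 80 MHz, long-GI, single-stream rates in kbps for MCS 0..9. */
static const uint32_t vht80_nss1_kbps[10] = {
	29250, 58500, 87750, 117000, 175500,
	234000, 263250, 292500, 351000, 390000,
};

/* Return the first MCS whose rate for 'nss' streams lies within 'tol' kbps
 * of 'reported', or -1 if none does. */
static inline int vht80_match_mcs(uint32_t reported, unsigned nss, uint32_t tol)
{
	int mcs;

	for (mcs = 0; mcs < 10; mcs++) {
		uint32_t nominal = vht80_nss1_kbps[mcs] * nss;
		uint32_t diff = reported > nominal ? reported - nominal
						   : nominal - reported;
		if (diff <= tol)
			return mcs;
	}
	return -1;
}
```

With a 50 kbps tolerance, 234000 matches MCS 5 and 263300 matches MCS 6 for
one stream, i.e. plausible neighbours of the forced MCS 4 rather than random
numbers.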

I took a quick look at the firmware code that collects peer-tx-rate. It
does seem to have some logic flaws, but I couldn't imagine how it could
report the values you're seeing.

On the plus side it's probably harmless.


Michał
Ben Greear July 20, 2016, 4:42 p.m. UTC | #25
On 07/20/2016 04:12 AM, Sebastian Gottschall wrote:
> Am 20.07.2016 um 13:03 schrieb Michal Kazior:

>>>>> WDS AP:
>>>>>                Peer MAC address 40:a5:ef:85:4d:6f
>>>>>                        Peer RSSI 12
>>>>>                     Peer TX rate 175500
>>>>>                     Peer RX rate 175500
>>>>>                 Peer RX duration 0
>>>>>
>>>>>
>>>>> WDS STA:
>>>>>               Peer MAC address 40:a5:ef:51:49:db
>>>>>                        Peer RSSI 13
>>>>>                     Peer TX rate 175500
>>>>>                     Peer RX rate 351000
>>>>>                 Peer RX duration 0
>> [...]

One thing, if your signal is really this hot (-12, -13), then you are likely over-saturating
the RF logic.  You normally get best results at around -40 signal level...

Thanks,
Ben
Ben Greear July 20, 2016, 4:48 p.m. UTC | #26
On 07/19/2016 11:05 PM, Michal Kazior wrote:
> On 20 July 2016 at 07:44, Adrian Chadd <adrian@freebsd.org> wrote:
>> Hi,
>>
>> dma coherent doesn't /have/ to mean "low 32 bits". It's just supposed
>> to mean "try really hard to use uncached memory on platforms that
>> support it."
>
> Good point. Maybe it does on x86, or at least some machines.
>
> @Ben: Can you verify if that's the case for you? Can you see what
> address ranges hostmem chunks get with and without the GFP_DMA32 (and
> maybe compare it against a revert to compare to dma-coherent as well)?

You just want a printk("%p", foo); for the chunks returned with and without
this flag?
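
A note on what to print: %p shows a kernel virtual address, while the
question here is presumably about the bus address the device is handed
(dma_addr_t, which printk formats with %pad when passed by reference). The
check of interest can be sketched in plain C as below; the function name is
made up, and the addresses in the usage note are invented examples.

```c
#include <stdbool.h>
#include <stdint.h>

/* Does the buffer [addr, addr + len) lie entirely below 2^bits?
 * 'bits' plays the role of the device's DMA mask width (32 for ath10k
 * according to this thread). */
static inline bool fits_dma_mask(uint64_t addr, uint64_t len, unsigned bits)
{
	uint64_t mask = bits >= 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
	uint64_t last;

	if (len == 0)
		return addr <= mask;
	last = addr + (len - 1);
	if (last < addr)          /* address arithmetic wrapped around */
		return false;
	return last <= mask;
}
```

For instance, a 689816-byte chunk at 0xfff00000 still fits under a 32-bit
mask, whereas an 8 KiB buffer starting at 0xfffff000 straddles the 4 GiB
boundary and does not.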

>
>
>> The ath10k hardware (at least what I've played with thus far) is all
>> 32 bit DMA hardware, not 64 bit, so it can't be handed 64 bit memory -
>> contiguous or otherwise.
>>
>> So, if dma coherent on linux means 32 bit only physmem, great.
>>
>> Now, it also turns out that various platforms that say they do
>> coherent memory these days do "mostly coherent", and you still need
>> some flush/sync ops..
>
> Yeah, but since the device has it's own CPU and RAM it has to have a
> way to distinguish local and host memory in some way using these 32
> bits, no? (think about firmware generating local 802.11 frames vs
> pushing frames coming from host driver)

Host memory cannot be accessed directly I think, at least not by normal
code.  Firmware uses some low level 'ce' type logic to handle that I think?

In 10.4 firmware, check out the code-swap code, for instance, or the
rate-ctrl swap logic in 10.1 or higher?

Thanks,
Ben
Sebastian Gottschall July 20, 2016, 4:50 p.m. UTC | #27
Am 20.07.2016 um 18:42 schrieb Ben Greear:
>
>
> On 07/20/2016 04:12 AM, Sebastian Gottschall wrote:
>> Am 20.07.2016 um 13:03 schrieb Michal Kazior:
>
>>>>>> WDS AP:
>>>>>>                Peer MAC address 40:a5:ef:85:4d:6f
>>>>>>                        Peer RSSI 12
>>>>>>                     Peer TX rate 175500
>>>>>>                     Peer RX rate 175500
>>>>>>                 Peer RX duration 0
>>>>>>
>>>>>>
>>>>>> WDS STA:
>>>>>>               Peer MAC address 40:a5:ef:51:49:db
>>>>>>                        Peer RSSI 13
>>>>>>                     Peer TX rate 175500
>>>>>>                     Peer RX rate 351000
>>>>>>                 Peer RX duration 0
>>> [...]
>
> One thing, if your signal is really this hot (-12, -13), then you are 
> likely over-saturating
> the RF logic.  You normally get best results at around -40 signal 
> level...
The peer RSSI is not in dBm; it's more likely the SNR. The two devices are
20 meters apart with a brick wall in between,
so it's not over-saturating. mac80211 shows about -81 RSSI and -95 noise
right now, with heavy fluctuations up to -70.
>
> Thanks,
> Ben
>
Ben Greear July 20, 2016, 4:52 p.m. UTC | #28
On 07/19/2016 10:36 PM, Michal Kazior wrote:
> On 19 July 2016 at 17:25, Manoharan, Rajkumar <rmanohar@qti.qualcomm.com> wrote:
>> On June 30, 2016 12:39 PM, Michal Kazior <michal.kazior@tieto.com> wrote:
>>> On 29 June 2016 at 18:35, Manoharan, Rajkumar <rmanohar@qti.qualcomm.com> wrote:
>>>>>> Am 29.06.2016 um 16:04 schrieb Sebastian Gottschall:
>>>>>> this fix will crash QCA9980 on QCA IPQ8064 cpu based systems.
>>>>>> so please rework it, or leave it out.
>>>>>> note:
>>>>>> maybe the limit of 256kb is too low for that card
>>>>>>
>>>>> by the way. 512 works
>>>
>>> I think this suggests the problem isn't about memory chunk size limit
>>> per se but some kind of bug in address/offset logic in fw or hw.
>>>
>>> DMA coherent and single-map addresses use completely different ranges
>>> in many cases. Perhaps some MSBs are not properly handled in fw or hw.
>>> I recall there is a magic macro through which target device accesses
>>> host memory so maybe that's a good place to look to better understand
>>> the problem?
>>>
>> Michał,
>>
>> Could you please shed some light on this issue? It seems this issue is popping up
>> more frequently and there are multiple threads for this issue.
>>
>> "Anyone brought up 9984 NIC on x86-64?"
>> "AR9882 IOMMU faults"
>
> I think IOMMU faults were solved by using DMA_BIDIRECTIONAL, no?

Yes, that resolves the faults, or at least the vast majority of them.

Remaining spurious faults are likely firmware bugs accessing bad memory, I guess.
You probably don't notice this at all on ARM and other systems w/out hardware IOMMU?

>> Even with current logic, If the memory chunk allocation fails for bigger size, then it tries
>> to allocate smaller chunks. So If smaller chunks causes unexpected behaviour, it is even
>> applicable to existing logic. no?
>
> We still don't know *why* using non-coherent memory causes problems.
> Changing chunk size limit seems to alter the behavior in some
> unpredictable ways, yes, but it's really hard to tell if the "try
> smaller chunk sizes" *itself* introduces any problems.

If it is crashing firmware somehow, and you can get a backtrace, then likely
it can be debugged.  In my case, changing the size caused firmware to crash due to
a lame logic bug in the firmware, for instance.  Possibly other crashes are as
mundane as that.

Thanks,
Ben
Adrian Chadd July 20, 2016, 5:02 p.m. UTC | #29
Hi,

The "right" way for the target CPU to interact with host CPU memory
(and vice versa, for mostly what it's worth) is to have the copy
engine copy (ie, "DMA") the pieces between them. This may be for
diagnostic purposes, but it's not supposed to be used like this for
doing wifi data exchange, right? :-P

Now, there /may/ be some alignment hilarity in various bits of code
and/or hardware. Eg, Merlin (AR9280) requires its descriptors to be
within a 4k block - the code to iterate through the descriptor
physical address space didn't do a "val = val + offset", it did
something in verilog like "val = (val & 0xffffc000) | (offset &
0x3fff)". This meant if you allocated a descriptor that started just
before the end of a 4k physmem aligned block, you'd end up with
exciting results. I don't know if there are any situations like this
in the ath10k hardware, but I'm sure there will be some gotchas
somewhere.
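
The difference between the two address computations can be shown with a few
lines of C. This is only an illustration of the expressions as written above,
with made-up function names and addresses; note that the masks as quoted
(0xffffc000 / 0x3fff) describe a 16 KiB window, and a 4 KiB window would use
0xfffff000 / 0xfff instead.

```c
#include <stdint.h>

/* What software usually expects: plain addition. */
static inline uint32_t desc_addr_linear(uint32_t val, uint32_t offset)
{
	return val + offset;
}

/* The masked form quoted above: the upper bits of 'val' select an aligned
 * window and 'offset' replaces the low bits instead of being added. */
static inline uint32_t desc_addr_windowed(uint32_t val, uint32_t offset)
{
	return (val & 0xffffc000u) | (offset & 0x3fffu);
}
```

For a window-aligned base such as 0x10004000 and offset 0x20 both give
0x10004020. For a base just below a window boundary, e.g. 0x10003ff0, the
linear form yields 0x10004010 while the windowed form yields 0x10000020, which
points back near the start of the same window.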

In any case, if ath10k is consuming too many bounce buffers, the calls
to allocate memory aren't working right and should be restricted to 32
bit addresses. Whether that's by using the DMA memory API (before it's
mapped) or passing in GFP_DMA32 is a fun debate.

(My test hardware arrived, so I'll test this all out today on
Peregrine-v2 and see if the driver works.)



-adrian
Ben Greear July 22, 2016, 10:43 p.m. UTC | #30
On 06/13/2016 11:17 PM, Rajkumar Manoharan wrote:
> commit b057886524be ("ath10k: do not use coherent memory for allocated
> device memory chunks") replaced coherent memory allocation for memory
> chunks to fix low memory platforms. Unfortunately this is causing system
> freeze on x86 platform while bringing up qca99x0 device. The system
> hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix
> this by limiting maximum memory chunk size to 256 KiB per request.
>
> Cc: Felix Fietkau <nbd@nbd.name>
> Fixes: b057886524be ("ath10k: do not use coherent memory for allocated device memory chunks")
> Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
> ---
>   drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++
>   drivers/net/wireless/ath/ath10k/wmi.h | 1 +
>   2 files changed, 7 insertions(+)
>
> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
> index 6279ab4a760e..7c15f65fe5ed 100644
> --- a/drivers/net/wireless/ath/ath10k/wmi.c
> +++ b/drivers/net/wireless/ath/ath10k/wmi.c
> @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id,
>   		if (!pool_size)
>   			return -EINVAL;
>
> +		if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
> +			num_units = WMI_MAX_MEM_CHUNK_SIZE /
> +					round_up(unit_len, 4);
> +			pool_size = num_units * round_up(unit_len, 4);
> +		}

I started testing my 9980 x86-64 system with VT-d enabled today.
With this patch in my tree, it crashes on bootup (with my firmware).
Works fine without this patch.

I don't see the exact place it is crashing in the firmware, though I could
probably narrow it down with some effort.  It is in the early startup code,
at least, which makes debugging more difficult.

This patch works fine with a slightly newer firmware compiled for 9984.  That
same firmware compiled for 9980 crashes, but I am not certain it is the same issue
as the older 9980.  It appears similar, at least.

Looks to me like there are lots of variances in how firmware and chip revisions
deal with this particular code, so we are going to have to test on lots of chips
and platforms to know if a 'fix' is really a fix or not.

Thanks,
Ben
Sebastian Gottschall July 23, 2016, 9:11 a.m. UTC | #31
From my point of view this patch is just shit. It truncates the maximum
allocated memory to a certain value,
so the firmware requests 800 KB of memory but gets only 256 KB. Out-of-bounds
memory access is then guaranteed.
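
The arithmetic of the clamp in the quoted hunk can be worked through in
isolation. The sketch below assumes WMI_MAX_MEM_CHUNK_SIZE is 256 KiB (taken
from the commit message, the wmi.h hunk is not shown here) and uses invented
request sizes; it says nothing about whether the caller asks for the remaining
units in further chunks, since that code is not part of this thread.

```c
#include <stdint.h>

/* Assumed value, based on "256 KiB per request" in the commit message. */
#define SKETCH_MAX_MEM_CHUNK_SIZE (256u * 1024u)

struct sketch_chunk {
	uint32_t num_units;   /* units covered by this chunk */
	uint32_t pool_size;   /* bytes allocated for this chunk */
};

static inline uint32_t sketch_round_up4(uint32_t v)
{
	return (v + 3u) & ~3u;
}

/* Mirrors the clamp in the quoted hunk: if the pool would exceed the limit,
 * shrink it to a whole number of 4-byte-aligned units. */
static inline struct sketch_chunk sketch_clamp_chunk(uint32_t num_units,
						     uint32_t unit_len)
{
	uint32_t unit = sketch_round_up4(unit_len);
	struct sketch_chunk c = { num_units, num_units * unit };

	if (c.pool_size > SKETCH_MAX_MEM_CHUNK_SIZE) {
		c.num_units = SKETCH_MAX_MEM_CHUNK_SIZE / unit;
		c.pool_size = c.num_units * unit;
	}
	return c;
}
```

With a hypothetical request of 700 units of 1001 bytes (702800 bytes after
rounding each unit to 1004), the clamp leaves 261 units and 262044 bytes in
this chunk, so 439 units are not covered by it.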


Rajkumar Manoharan July 23, 2016, 9:45 a.m. UTC | #32
> from my point of view this patch is just shit. it trunkates the maximum
> allocated memory to a certain value.
> so firmware requests 800 kb memory but just gets 256kb. so out of bound
> memory access is guaranteed at all.
> 
Even with the current logic, if the memory chunk allocation fails for a bigger size, it then tries
to allocate smaller chunks. So out-of-bounds access would be guaranteed anyway, no?

        while (!vaddr && num_units) {
                pool_size = num_units * round_up(unit_len, 4);
                if (!pool_size)
                        return -EINVAL;

                vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
                if (!vaddr)
                        num_units /= 2;
        }
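
To make the sizing concrete, here is a minimal userspace sketch (not driver code) of how the proposed cap and the existing halving fallback each shrink a request. The helper names `round_up4`, `clamp_units` and `halve_until_fits`, the `max_alloc` parameter standing in for the largest size that `kzalloc()` happens to satisfy, and any unit length fed to them are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as WMI_MAX_MEM_CHUNK_SIZE in the patch under discussion. */
#define MAX_CHUNK_SIZE (256u * 1024u)

/* Round len up to a multiple of 4, like round_up(unit_len, 4). */
static uint32_t round_up4(uint32_t len)
{
	return (len + 3u) & ~3u;
}

/*
 * Model of the proposed cap: if the pool would exceed MAX_CHUNK_SIZE,
 * reduce num_units so that the pool fits. Returns the new unit count.
 */
static uint32_t clamp_units(uint32_t num_units, uint32_t unit_len)
{
	uint32_t unit = round_up4(unit_len);

	assert(unit != 0);
	if ((uint64_t)num_units * unit > MAX_CHUNK_SIZE)
		num_units = MAX_CHUNK_SIZE / unit;
	return num_units;
}

/*
 * Model of the existing fallback: halve num_units until the pool is no
 * larger than max_alloc (a hypothetical stand-in for "kzalloc succeeded").
 * Returns the unit count finally allocated, or 0 if nothing fits.
 */
static uint32_t halve_until_fits(uint32_t num_units, uint32_t unit_len,
				 size_t max_alloc)
{
	uint32_t unit = round_up4(unit_len);

	if (unit == 0)
		return 0;
	while (num_units && (uint64_t)num_units * unit > max_alloc)
		num_units /= 2;
	return num_units;
}
```

With a hypothetical 4-byte unit, the 689816-byte request from the commit message is 172454 units. The cap cuts that to 65536 units, while halving under a 256 KiB allocation limit ends at 43113 units. Either way that one chunk holds fewer units than were requested; whether the remainder is supplied in further chunks is outside this sketch.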

Actually, the commit "ath10k: do not use coherent memory for allocated device memory chunks"
is causing a system hang on non-VT-d x86 platforms. It would be better to revert the commit until
the problem is properly root-caused.

-Rajkumar    

Ben Greear July 23, 2016, 9:55 p.m. UTC | #33
On 07/23/2016 02:11 AM, Sebastian Gottschall wrote:
> from my point of view this patch is just shit. it trunkates the maximum allocated memory to a certain value.
> so firmware requests 800 kb memory but just gets 256kb. so out of bound memory access is guaranteed at all.

At least some of the firmware logic attempts to deal with this segmentation.  9984 firmware had bugs in one area that I fixed,
but probably there are other bugs and that is why it fails for 9980 firmware.

Even if I fix my firmware, that doesn't help everyone else with stock firmware.

At best, maybe enable this type of logic for only exact firmware with proper feature
flag showing that it is known to work with it.

Thanks,
Ben
Sebastian Gottschall July 25, 2016, 2:37 p.m. UTC | #34
Hello

I finally received 9984 devices to follow up on my VHT160 work.
The big issue is that the 9984 works in AP mode with the current firmware,
but in STA mode it crashes within seconds, no matter which operation
mode is used.
Ben's CT firmware crashes in both AP and STA mode, so I prefer to run
the official variant. Can someone help here?

Sebastian
Ben Greear July 25, 2016, 2:43 p.m. UTC | #35
On 07/25/2016 07:37 AM, Sebastian Gottschall wrote:
> Hello
>
> finally i received 9984 devices to follow up with my work on vht160.
> but the big issue is. the 9984 works in ap mode with current firmware. but in sta mode it crashes after seconds, no matter which operation mode is used.
> ben's ct firmware crashes in ap and sta mode, so i prefer to run using the offical variant. can someone help here?

Does it crash if you connect to an AP with 9980 or 9880?  I am curious if it is only with
9984 AP that crashes.  I have only one 9984 radio right now, but STA mode was working for me against
9980 AP.

I am just getting started on this, so hopefully it will get more stable soon...

And for the QCA firmware, might be worth posting a stack dump from the firmware,
maybe someone there can decode it.  Unfortunately, I could not get a useful decode
from the one you sent me of my firmware...

Thanks,
Ben

>
> Sebastian
>
Sebastian Gottschall July 25, 2016, 3:09 p.m. UTC | #36
On 25.07.2016 at 16:43, Ben Greear wrote:
> On 07/25/2016 07:37 AM, Sebastian Gottschall wrote:
>> Hello
>>
>> finally i received 9984 devices to follow up with my work on vht160.
>> but the big issue is. the 9984 works in ap mode with current 
>> firmware. but in sta mode it crashes after seconds, no matter which 
>> operation mode is used.
>> ben's ct firmware crashes in ap and sta mode, so i prefer to run 
>> using the offical variant. can someone help here?
>
> Does it crash if you connect to an AP with 9980 or 9880?  I am curious 
> if it is only with
> 9984 AP that crashes.  I have only one 9984 radio right now, but STA 
> mode was working for me against
> 9980 AP.
>
nope
ct firmware 9984 wdsap -> wdssta 9980 -> ap crash, sta ok
qca firmware 9984 wdsap -> wdssta 9980 -> ap no crash, sta ok
ct firmware 9984 wdsap -> wdssta or sta 9984 -> sta crash , ap crash
qca firmware 9984 wdsap -> wdssta or sta 9984 -> sta crash, ap ok (i 
tested both sta and wdssta)

> I am just getting started on this, so hopefully it will get more 
> stable soon...
>
> And for the QCA firmware, might be worth posting a stack dump from the 
> firmware,
> maybe someone there can decode it.  Unfortunately, I could not get a 
> useful decode
> from the one you sent me of my firmware...
I just sent you yours. It seems more useful, since you can decode it
with your tools.
>
> Thanks,
> Ben
>
>>
>> Sebastian
>>
>
>
Adrian Chadd July 25, 2016, 4:03 p.m. UTC | #37
heh, adding "wdsap" and "wdssta" is a pretty big piece of information,
versus STA/AP.

just saying!



-a


Sebastian Gottschall July 25, 2016, 4:31 p.m. UTC | #38
On 25.07.2016 at 18:03, Adrian Chadd wrote:
> heh, adding "wdsap" and "wdssta" is a pretty big piece of information,
> versus STA/AP.
>
> just saying!
Doesn't matter. wdsap/wdssta is just 4addr support, and plain ap->sta crashes too.
Adrian Chadd July 25, 2016, 5:22 p.m. UTC | #39
Hi,

Yeah, but WDS/4-addr support likely does something different
internally in the firmware.
(I finally have source to look at, fwiw...)


-a
Sebastian Gottschall July 26, 2016, 1:11 a.m. UTC | #40
On 25.07.2016 at 19:22, Adrian Chadd wrote:
> Hi,
>
> Yeah, but WDS/4-addr support likely does something different
> internally to the firmware.
> (I finally hve source to look, fwiw...
Ignore it. It crashes in any STA operation mode, including STA without WDS.

>
>
> -a
>
Ben Greear July 26, 2016, 2:26 a.m. UTC | #41
On 07/25/2016 06:11 PM, Sebastian Gottschall wrote:
> Am 25.07.2016 um 19:22 schrieb Adrian Chadd:
>> Hi,
>>
>> Yeah, but WDS/4-addr support likely does something different
>> internally to the firmware.
>> (I finally hve source to look, fwiw...
> ignore it. it does crash in any sta operation mode. STA without WDS too

Just my firmware, or QCA firmware too?

Thanks,
Ben
Sebastian Gottschall July 26, 2016, 2:40 a.m. UTC | #42
On 26.07.2016 at 04:26, Ben Greear wrote:
>
>
> On 07/25/2016 06:11 PM, Sebastian Gottschall wrote:
>> Am 25.07.2016 um 19:22 schrieb Adrian Chadd:
>>> Hi,
>>>
>>> Yeah, but WDS/4-addr support likely does something different
>>> internally to the firmware.
>>> (I finally hve source to look, fwiw...
>> ignore it. it does crash in any sta operation mode. STA without WDS too
>
> Just my firmware, or QCA firmware too?
Yours crashes in AP and STA mode; QCA's crashes only in STA mode.
See my previous list, which shows all the test scenarios. I wrote this already.


----->
ct firmware 9984 wdsap -> wdssta 9980 -> ap crash, sta ok
qca firmware 9984 wdsap -> wdssta 9980 -> ap no crash, sta ok
ct firmware 9984 wdsap -> wdssta or sta 9984 -> sta crash , ap crash
qca firmware 9984 wdsap -> wdssta or sta 9984 -> sta crash, ap ok (i 
tested both sta and wdssta)
>
> Thanks,
> Ben
>
>
Michal Kazior July 26, 2016, 6:21 a.m. UTC | #43
On 26 July 2016 at 05:06, sudheer thota <sdhrtht@outlook.com> wrote:
> wireless interface(ath10k) monitor failing to capture data packets when
> WMM enabled on AP.
>
> AP Status reflects in following state: (when no QOS data Packets. MGMT and CTRL Packets are able to captured)
> 1) 0x03 (3) - Secondary channel is below Primary
> 2) 1 - any Channel width in secondary channel width
> 3) 1 - Use of RIFS permitted
> 4) 0x02 - Only HT STAs in the BSS, however there exists at least on 20Mhz STA
> 5) 0 - all Associated HT STAs are greenfield capable

You didn't really explain your topology. Are you using ath10k to sniff
only or to run AP *and* sniff at the same time?

FWIW It's kind of a problem to sniff your own transmission with QoS
because of NWifi vs 802.11 headers problem so regular packet
dissection won't recognize them properly.


Michał
Sebastian Gottschall July 26, 2016, 9:31 a.m. UTC | #44
I found the cause and a workaround: I disabled all beamforming VHT
caps, which prevents the firmware from crashing.
So it seems beamforming is responsible.
Could the firmware be fixed?

Sebastian



Michal Kazior July 26, 2016, 10:27 a.m. UTC | #45
On 26 July 2016 at 11:31, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
> i found the reason and the solution. i disabled all beamforming vht caps.
> this will prevent the firmware of beeing crashing.
> so it seems, beamforming is responsible for it
> could the firmware be fixed?

This sounds awfully familiar. qca6174 hw.2 firmware had this kind of
problem at one point - it would advertise txbf support even though it
lacked it. It caused firmware crashes whenever you tried to associate
to a txbf capable AP. However qca6174 firmware is very distinct from
qca9xxx.

But maybe it's just a silly out-of-bounds or deref bug in firmware
txbf code. Or maybe the txbf vdev/peer param ABI has changed and
ath10k sets something else thinking it is configuring txbf. Beats me.


Michał
sudheer thota July 26, 2016, 11:53 a.m. UTC | #46
ath10k is a standalone sniffer configured to monitor data transfer rates between the AP and the STA. It is not the AP.

Details of the configuration:
AP: Netgear WNDR3400V3: WMM enabled, channel 153, up to 300 Mbps.
STA1: Roku Streaming Stick: only 20 MHz capable, LGI (verified from data rates and MCS index)

Monitor: ath10k

The combination of RIFS (enabled), operating mode (0x02) and greenfield capable seems to cause the QoS data packets to be missed.

This was confirmed by connecting a different device (STA2, a Samsung tablet), which changes the state to RIFS (enabled), operating mode (0x03) and not greenfield capable.


Ben Greear Sept. 26, 2016, 4:53 p.m. UTC | #47
On 07/20/2016 10:02 AM, Adrian Chadd wrote:
> Hi,
>
> The "right" way for the target CPU to interact with host CPU memory
> (and vice versa, for mostly what it's worth) is to have the copy
> engine copy (ie, "DMA") the pieces between them. This may be for
> diagnostic purposes, but it's not supposed to be used like this for
> doing wifi data exchange, right? :-P
>
> Now, there /may/ be some alignment hilarity in various bits of code
> and/or hardware. Eg, Merlin (AR9280) requires its descriptors to be
> within a 4k block - the code to iterate through the descriptor
> physical address space didn't do a "val = val + offset", it did
> something in verilog like "val = (val & 0xffffc000) | (offset &
> 0x3fff)". This meant if you allocated a descriptor that started just
> before the end of a 4k physmem aligned block, you'd end up with
> exciting results. I don't know if there are any situations like this
> in the ath10k hardware, but I'm sure there will be some gotchas
> somewhere.
>
> In any case, if ath10k is consuming too much bounce buffers, the calls
> to allocate memory aren't working right and should be restricted to 32
> bit addresses. Whether that's by using the DMA memory API (before it's
> mapped) or passing in GFP_DMA32 is a fun debate.
>
> (My test hardware arrived, so I'll test this all out today on
> Peregrine-v2 and see if the driver works.)
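
The wrap-around described for Merlin in the quoted message can be illustrated with a small sketch. It only models the two address computations mentioned there; the function names are hypothetical, and the masks are copied from the message as written (note that `0x3fff` spans a 16 KiB window rather than 4 KiB).

```c
#include <assert.h> /* for callers that want to assert on the results */
#include <stdint.h>

/* What software usually expects: plain base + offset. */
static uint32_t addr_by_add(uint32_t base, uint32_t offset)
{
	return base + offset;
}

/*
 * What the quoted hardware logic does: keep the upper bits of the base
 * and replace the low 14 bits with the offset. The low bits of the base
 * are dropped, so the result can never leave the aligned window.
 */
static uint32_t addr_by_mask(uint32_t base, uint32_t offset)
{
	return (base & 0xffffc000u) | (offset & 0x3fffu);
}
```

For a window-aligned base such as `0x10004000`, both functions agree. For a base just before the end of a window, such as `0x10003ff0` with offset `0x20`, the addition yields `0x10004010` while the masked form yields `0x10000020`, which is the kind of mismatch the message warns about.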

I have been running this patch for a while:

     ath10k: Use GFP_DMA32 for firmware swap memory.

     This fixes OS crash when using QCA 9984 NIC on x86-64 system
     without vt-d enabled.

     Also tested on ea8500 with 9980, and x86-64 with 9980 and 9880.

     All tests were with CT firmware.

     Signed-off-by: Ben Greear <greearb@candelatech.com>

-------------------- drivers/net/wireless/ath/ath10k/wmi.c --------------------
index e20aa39..727b3aa 100644
@@ -4491,7 +4491,7 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id,
  		if (!pool_size)
  			return -EINVAL;

-		vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
+		vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN | GFP_DMA32);
  		if (!vaddr)
  			num_units /= 2;
  	}


It mostly seems to work, but sometimes I get a splat like the one below. It appears
that it is invalid to call kzalloc with GFP_DMA32 at all (judging by the BUG() that
was hit in new_slab).

Is there a more proper way to do this?



gfp: 4
------------[ cut here ]------------
kernel BUG at /home/greearb/git/linux-4.7.dev.y/mm/slub.c:1508!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in: coretemp hwmon ath9k intel_rapl ath10k_pci x86_pkg_temp_thermal ath9k_common ath10k_core intel_powerclamp ath9k_hw ath kvm iTCO_wdt mac80211 iTCO_vendor_support irqbypass snd_hda_codec_hdmi 6
CPU: 2 PID: 268 Comm: kworker/u8:5 Not tainted 4.7.2+ #16
Hardware name: To be filled by O.E.M. To be filled by O.E.M./ChiefRiver, BIOS 4.6.5 06/07/2013
Workqueue: ath10k_aux_wq ath10k_wmi_event_service_ready_work [ath10k_core]
task: ffff880036433a00 ti: ffff880036440000 task.ti: ffff880036440000
RIP: 0010:[<ffffffff8124592a>]  [<ffffffff8124592a>] new_slab+0x39a/0x410
RSP: 0018:ffff880036443b58  EFLAGS: 00010092
RAX: 0000000000000006 RBX: 00000000024082c4 RCX: 0000000000000000
RDX: 0000000000000006 RSI: ffff88021e30dd08 RDI: ffff88021e30dd08
RBP: ffff880036443b90 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000372 R12: ffff88021dc01200
R13: ffff88021dc00cc0 R14: ffff88021dc01200 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff88021e300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3e65c1c730 CR3: 0000000001e06000 CR4: 00000000001406e0
Stack:
  ffffffff8127a4fc ffff0a01ffffff10 00000000024082c4 ffff88021dc01200
  ffff88021dc00cc0 ffff88021dc01200 0000000000000001 ffff880036443c58
  ffffffff81247ac6 ffff88021e31b360 ffff880036433a00 ffff880036433a00
Call Trace:
  [<ffffffff8127a4fc>] ? __d_lookup+0x9c/0x160
  [<ffffffff81247ac6>] ___slab_alloc+0x396/0x4a0
  [<ffffffffa0f8e14d>] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core]
  [<ffffffff811f5279>] ? alloc_kmem_pages+0x9/0x10
  [<ffffffff8120f203>] ? kmalloc_order+0x13/0x40
  [<ffffffffa0f8e14d>] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core]
  [<ffffffff81247bf6>] __slab_alloc.isra.72+0x26/0x40
  [<ffffffff81248767>] __kmalloc+0x147/0x1b0
  [<ffffffffa0f8e14d>] ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core]
  [<ffffffff811370a1>] ? dequeue_entity+0x261/0xac0
  [<ffffffff8111c2d8>] process_one_work+0x148/0x420
  [<ffffffff8111c929>] worker_thread+0x49/0x480
  [<ffffffff8111c8e0>] ? rescuer_thread+0x330/0x330
  [<ffffffff81121984>] kthread+0xc4/0xe0
  [<ffffffff8184d75f>] ret_from_fork+0x1f/0x40
  [<ffffffff811218c0>] ? kthread_create_on_node+0x170/0x170
Code: e9 65 fd ff ff 49 8b 57 20 48 8d 42 ff 83 e2 01 49 0f 44 c7 f0 80 08 40 e9 6f fd ff ff 89 c6 48 c7 c7 01 36 c7 81 e8 e8 40 fa ff <0f> 0b ba 00 10 00 00 be 5a 00 00 00 48 89 c7 48 d3 e2 e8 bf 18
RIP  [<ffffffff8124592a>] new_slab+0x39a/0x410
  RSP <ffff880036443b58>
---[ end trace ea3b0043b2911d93 ]---


static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
{
         if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
                 pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
                 BUG();
         }

         return allocate_slab(s,
                 flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
}
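
The `gfp: 4` line in the splat is consistent with this check rejecting the DMA32 zone bit. Below is a minimal userspace model of the test. The `MODEL_*` constants are assumptions chosen to mirror the zone-modifier bit layout of kernels from that period (DMA32 as `0x04`), not values taken from this thread, and the real `GFP_SLAB_BUG_MASK` also covers bits outside the valid GFP range, which the model leaves out.

```c
#include <assert.h> /* for callers that want to assert on the results */

/* Assumed zone-modifier bits, modelled on the kernel's layout. */
#define MODEL_GFP_DMA      0x01u
#define MODEL_GFP_HIGHMEM  0x02u
#define MODEL_GFP_DMA32    0x04u

/* Bits the slab allocator refuses in this model. */
#define MODEL_SLAB_BUG_MASK (MODEL_GFP_DMA32 | MODEL_GFP_HIGHMEM)

/*
 * Returns the offending bits, or 0 if the flags would pass the check.
 * This mirrors the "flags & GFP_SLAB_BUG_MASK" test in new_slab().
 */
static unsigned int slab_rejected_bits(unsigned int flags)
{
	return flags & MODEL_SLAB_BUG_MASK;
}
```

Passing a flag set that contains the DMA32 bit yields 4, matching the printed value. That the BUG only fires sometimes would fit the check sitting on the path that creates a new slab page rather than on every allocation, though that is an inference from the trace and not something confirmed in this thread.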


Thanks,
Ben

Patch

diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
index 6279ab4a760e..7c15f65fe5ed 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.c
+++ b/drivers/net/wireless/ath/ath10k/wmi.c
@@ -4411,6 +4411,12 @@  static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id,
 		if (!pool_size)
 			return -EINVAL;
 
+		if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
+			num_units = WMI_MAX_MEM_CHUNK_SIZE /
+					round_up(unit_len, 4);
+			pool_size = num_units * round_up(unit_len, 4);
+		}
+
 		vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
 		if (!vaddr)
 			num_units /= 2;
diff --git a/drivers/net/wireless/ath/ath10k/wmi.h b/drivers/net/wireless/ath/ath10k/wmi.h
index 90f594e89f94..dea1f235a54d 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.h
+++ b/drivers/net/wireless/ath/ath10k/wmi.h
@@ -6184,6 +6184,7 @@  struct wmi_roam_ev {
 #define ATH10K_DEFAULT_ATIM 0
 
 #define WMI_MAX_MEM_REQS 16
+#define WMI_MAX_MEM_CHUNK_SIZE (256 * 1024) /* 256 KB */
 
 struct wmi_scan_ev_arg {
 	__le32 event_type; /* %WMI_SCAN_EVENT_ */