diff mbox

ath10k: fix system hang at qca99x0 probe on x86 platform

Message ID 20160614061728.570-1-rmanohar@qti.qualcomm.com (mailing list archive)
State Not Applicable
Delegated to: Kalle Valo
Headers show

Commit Message

Rajkumar Manoharan June 14, 2016, 6:17 a.m. UTC
commit b057886524be ("ath10k: do not use coherent memory for allocated
device memory chunks") replaced coherent memory allocation for memory
chunks to fix low memory platforms. Unfortunately this is causing system
freeze on x86 platform while bringing up qca99x0 device. The system
hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix
this by limiting maximum memory chunk size to 256 KiB per request.

Cc: Felix Fietkau <nbd@nbd.name>
Fixes: b057886524be ("ath10k: do not use coherent memory for allocated device memory chunks")
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
---
 drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++
 drivers/net/wireless/ath/ath10k/wmi.h | 1 +
 2 files changed, 7 insertions(+)

Comments

Ben Greear July 22, 2016, 10:43 p.m. UTC | #1
On 06/13/2016 11:17 PM, Rajkumar Manoharan wrote:
> commit b057886524be ("ath10k: do not use coherent memory for allocated
> device memory chunks") replaced coherent memory allocation for memory
> chunks to fix low memory platforms. Unfortunately this is causing system
> freeze on x86 platform while bringing up qca99x0 device. The system
> hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix
> this by limiting maximum memory chunk size to 256 KiB per request.
>
> Cc: Felix Fietkau <nbd@nbd.name>
> Fixes: b057886524be ("ath10k: do not use coherent memory for allocated device memory chunks")
> Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
> ---
>   drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++
>   drivers/net/wireless/ath/ath10k/wmi.h | 1 +
>   2 files changed, 7 insertions(+)
>
> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
> index 6279ab4a760e..7c15f65fe5ed 100644
> --- a/drivers/net/wireless/ath/ath10k/wmi.c
> +++ b/drivers/net/wireless/ath/ath10k/wmi.c
> @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id,
>   		if (!pool_size)
>   			return -EINVAL;
>
> +		if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
> +			num_units = WMI_MAX_MEM_CHUNK_SIZE /
> +					round_up(unit_len, 4);
> +			pool_size = num_units * round_up(unit_len, 4);
> +		}

I started testing my 9980 x86-64 system with VT/d enabled today.
With this patch in my tree, it crashes on bootup (with my firmware).
Works fine without this patch.

I don't see the exact place it is crashing in the firmware, though I could
probably narrow it down with some effort.  It is in the early startup code,
at least, which makes debugging more difficult.

This patch works fine with a slightly newer firmware compiled for 9984.  That
same firmware compiled for 9980 crashes, but I am not certain it is the same issue
as the older 9980.  It appears similar, at least.

Looks to me like there are lots of variances in how firmware and chip revisions
deal with this particular code, so we are going to have to test on lots of chips
and platforms to know if a 'fix' is really a fix or not.

Thanks,
Ben
Sebastian Gottschall July 23, 2016, 9:11 a.m. UTC | #2
from my point of view this patch is just shit. it trunkates the maximum 
allocated memory to a certain value.
so firmware requests 800 kb memory but just gets 256kb. so out of bound 
memory access is guaranteed at all.


Am 23.07.2016 um 00:43 schrieb Ben Greear:
> On 06/13/2016 11:17 PM, Rajkumar Manoharan wrote:
>> commit b057886524be ("ath10k: do not use coherent memory for allocated
>> device memory chunks") replaced coherent memory allocation for memory
>> chunks to fix low memory platforms. Unfortunately this is causing system
>> freeze on x86 platform while bringing up qca99x0 device. The system
>> hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix
>> this by limiting maximum memory chunk size to 256 KiB per request.
>>
>> Cc: Felix Fietkau <nbd@nbd.name>
>> Fixes: b057886524be ("ath10k: do not use coherent memory for 
>> allocated device memory chunks")
>> Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
>> ---
>>   drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++
>>   drivers/net/wireless/ath/ath10k/wmi.h | 1 +
>>   2 files changed, 7 insertions(+)
>>
>> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c 
>> b/drivers/net/wireless/ath/ath10k/wmi.c
>> index 6279ab4a760e..7c15f65fe5ed 100644
>> --- a/drivers/net/wireless/ath/ath10k/wmi.c
>> +++ b/drivers/net/wireless/ath/ath10k/wmi.c
>> @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct 
>> ath10k *ar, u32 req_id,
>>           if (!pool_size)
>>               return -EINVAL;
>>
>> +        if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
>> +            num_units = WMI_MAX_MEM_CHUNK_SIZE /
>> +                    round_up(unit_len, 4);
>> +            pool_size = num_units * round_up(unit_len, 4);
>> +        }
>
> I started testing my 9980 x86-64 system with VT/d enabled today.
> With this patch in my tree, it crashes on bootup (with my firmware).
> Works fine without this patch.
>
> I don't see the exact place it is crashing in the firmware, though I 
> could
> probably narrow it down with some effort.  It is in the early startup 
> code,
> at least, which makes debugging more difficult.
>
> This patch works fine with a slightly newer firmware compiled for 
> 9984.  That
> same firmware compiled for 9980 crashes, but I am not certain it is 
> the same issue
> as the older 9980.  It appears similar, at least.
>
> Looks to me like there are lots of variances in how firmware and chip 
> revisions
> deal with this particular code, so we are going to have to test on 
> lots of chips
> and platforms to know if a 'fix' is really a fix or not.
>
> Thanks,
> Ben
>
>
Rajkumar Manoharan July 23, 2016, 9:45 a.m. UTC | #3
> from my point of view this patch is just shit. it trunkates the maximum
> allocated memory to a certain value.
> so firmware requests 800 kb memory but just gets 256kb. so out of bound
> memory access is guaranteed at all.
> 
Even with current logic, If the memory chunk allocation fails for bigger size, then it tries
to allocate smaller chunks. So anyway it is guaranteed for oob access. no?

        while (!vaddr && num_units) {
                pool_size = num_units * round_up(unit_len, 4);
                if (!pool_size)
                        return -EINVAL;

                vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
                if (!vaddr)
                        num_units /= 2;
        }

Actually the commit "ath10k: do not use coherent memory for allocated device memory chunks"
is causing system hang on non VT/d x86 platform. Better to revert the commit until it is properly
root caused

-Rajkumar    

Am 23.07.2016 um 00:43 schrieb Ben Greear:
> On 06/13/2016 11:17 PM, Rajkumar Manoharan wrote:
>> commit b057886524be ("ath10k: do not use coherent memory for allocated
>> device memory chunks") replaced coherent memory allocation for memory
>> chunks to fix low memory platforms. Unfortunately this is causing system
>> freeze on x86 platform while bringing up qca99x0 device. The system
>> hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix
>> this by limiting maximum memory chunk size to 256 KiB per request.
>>
>> Cc: Felix Fietkau <nbd@nbd.name>
>> Fixes: b057886524be ("ath10k: do not use coherent memory for
>> allocated device memory chunks")
>> Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
>> ---
>>   drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++
>>   drivers/net/wireless/ath/ath10k/wmi.h | 1 +
>>   2 files changed, 7 insertions(+)
>>
>> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c
>> b/drivers/net/wireless/ath/ath10k/wmi.c
>> index 6279ab4a760e..7c15f65fe5ed 100644
>> --- a/drivers/net/wireless/ath/ath10k/wmi.c
>> +++ b/drivers/net/wireless/ath/ath10k/wmi.c
>> @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct
>> ath10k *ar, u32 req_id,
>>           if (!pool_size)
>>               return -EINVAL;
>>
>> +        if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
>> +            num_units = WMI_MAX_MEM_CHUNK_SIZE /
>> +                    round_up(unit_len, 4);
>> +            pool_size = num_units * round_up(unit_len, 4);
>> +        }
>
> I started testing my 9980 x86-64 system with VT/d enabled today.
> With this patch in my tree, it crashes on bootup (with my firmware).
> Works fine without this patch.
>
> I don't see the exact place it is crashing in the firmware, though I
> could
> probably narrow it down with some effort.  It is in the early startup
> code,
> at least, which makes debugging more difficult.
>
> This patch works fine with a slightly newer firmware compiled for
> 9984.  That
> same firmware compiled for 9980 crashes, but I am not certain it is
> the same issue
> as the older 9980.  It appears similar, at least.
>
> Looks to me like there are lots of variances in how firmware and chip
> revisions
> deal with this particular code, so we are going to have to test on
> lots of chips
> and platforms to know if a 'fix' is really a fix or not.
>
> Thanks,
> Ben
>
>


--
Mit freundlichen Grüssen / Regards

Sebastian Gottschall / CTO

NewMedia-NET GmbH - DD-WRT
Firmensitz:  Berliner Ring 101, 64625 Bensheim
Registergericht: Amtsgericht Darmstadt, HRB 25473
Geschäftsführer: Peter Steinhäuser, Christian Scheele
http://www.dd-wrt.com
email: s.gottschall@dd-wrt.com
Tel.: +496251-582650 / Fax: +496251-5826565

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Greear July 23, 2016, 9:55 p.m. UTC | #4
On 07/23/2016 02:11 AM, Sebastian Gottschall wrote:
> from my point of view this patch is just shit. it trunkates the maximum allocated memory to a certain value.
> so firmware requests 800 kb memory but just gets 256kb. so out of bound memory access is guaranteed at all.

At least some of the firmware logic attempts to deal with this segmentation.  9984 firmware had bugs in one area that I fixed,
but probably there are other bugs and that is why it fails for 9980 firmware.

Even if I fix my firmware, that doesn't help everyone else with stock firmware.

At best, maybe enable this type of logic for only exact firmware with proper feature
flag showing that it is known to work with it.

Thanks,
Ben
Ben Greear Sept. 26, 2016, 4:53 p.m. UTC | #5
On 07/20/2016 10:02 AM, Adrian Chadd wrote:
> Hi,
>
> The "right" way for the target CPU to interact with host CPU memory
> (and vice versa, for mostly what it's worth) is to have the copy
> engine copy (ie, "DMA") the pieces between them. This may be for
> diagnostic purposes, but it's not supposed to be used like this for
> doing wifi data exchange, right? :-P
>
> Now, there /may/ be some alignment hilarity in various bits of code
> and/or hardware. Eg, Merlin (AR9280) requires its descriptors to be
> within a 4k block - the code to iterate through the descriptor
> physical address space didn't do a "val = val + offset", it did
> something in verilog like "val = (val & 0xffffc000) | (offset &
> 0x3fff)". This meant if you allocated a descriptor that started just
> before the end of a 4k physmem aligned block, you'd end up with
> exciting results. I don't know if there are any situations like this
> in the ath10k hardware, but I'm sure there will be some gotchas
> somewhere.
>
> In any case, if ath10k is consuming too much bounce buffers, the calls
> to allocate memory aren't working right and should be restricted to 32
> bit addresses. Whether that's by using the DMA memory API (before it's
> mapped) or passing in GFP_DMA32 is a fun debate.
>
> (My test hardware arrived, so I'll test this all out today on
> Peregrine-v2 and see if the driver works.)

I have been running this patch for a while:

     ath10k:  Use GPF_DMA32 for firmware swap memory.

     This fixes OS crash when using QCA 9984 NIC on x86-64 system
     without vt-d enabled.

     Also tested on ea8500 with 9980, and x86-64 with 9980 and 9880.

     All tests were with CT firmware.

     Signed-off-by: Ben Greear <greearb@candelatech.com>

-------------------- drivers/net/wireless/ath/ath10k/wmi.c --------------------
index e20aa39..727b3aa 100644
@@ -4491,7 +4491,7 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id,
  		if (!pool_size)
  			return -EINVAL;

-		vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
+		vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN | GFP_DMA32);
  		if (!vaddr)
  			num_units /= 2;
  	}


It mostly seems to work, but then sometimes I get a splat like this below.  It appears
it is invalid to actually do kzalloc with GFP_DMA32 (based on that BUG_ON that
hit in the new_slab method)??

Any idea for a more proper way to do this?



gfp: 4
------------[ cut here ]------------
kernel BUG at /home/greearb/git/linux-4.7.dev.y/mm/slub.c:1508!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in: coretemp hwmon ath9k intel_rapl ath10k_pci x86_pkg_temp_thermal ath9k_common ath10k_core intel_powerclamp ath9k_hw ath kvm iTCO_wdt mac80211 
iTCO_vendor_support irqbypass snd_hda_codec_hdmi 6
CPU: 2 PID: 268 Comm: kworker/u8:5 Not tainted 4.7.2+ #16
Hardware name: To be filled by O.E.M. To be filled by O.E.M./ChiefRiver, BIOS 4.6.5 06/07/2013
Workqueue: ath10k_aux_wq ath10k_wmi_event_service_ready_work [ath10k_core]
task: ffff880036433a00 ti: ffff880036440000 task.ti: ffff880036440000
RIP: 0010:[<ffffffff8124592a>]  [<ffffffff8124592a>] new_slab+0x39a/0x410
RSP: 0018:ffff880036443b58  EFLAGS: 00010092
RAX: 0000000000000006 RBX: 00000000024082c4 RCX: 0000000000000000
RDX: 0000000000000006 RSI: ffff88021e30dd08 RDI: ffff88021e30dd08
RBP: ffff880036443b90 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000372 R12: ffff88021dc01200
R13: ffff88021dc00cc0 R14: ffff88021dc01200 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff88021e300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3e65c1c730 CR3: 0000000001e06000 CR4: 00000000001406e0
Stack:
  ffffffff8127a4fc ffff0a01ffffff10 00000000024082c4 ffff88021dc01200
  ffff88021dc00cc0 ffff88021dc01200 0000000000000001 ffff880036443c58
  ffffffff81247ac6 ffff88021e31b360 ffff880036433a00 ffff880036433a00
Call Trace:
  [<ffffffff8127a4fc>] ? __d_lookup+0x9c/0x160
  [<ffffffff81247ac6>] ___slab_alloc+0x396/0x4a0
  [<ffffffffa0f8e14d>] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core]
  [<ffffffff811f5279>] ? alloc_kmem_pages+0x9/0x10
  [<ffffffff8120f203>] ? kmalloc_order+0x13/0x40
  [<ffffffffa0f8e14d>] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core]
  [<ffffffff81247bf6>] __slab_alloc.isra.72+0x26/0x40
  [<ffffffff81248767>] __kmalloc+0x147/0x1b0
  [<ffffffffa0f8e14d>] ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core]
  [<ffffffff811370a1>] ? dequeue_entity+0x261/0xac0
  [<ffffffff8111c2d8>] process_one_work+0x148/0x420
  [<ffffffff8111c929>] worker_thread+0x49/0x480
  [<ffffffff8111c8e0>] ? rescuer_thread+0x330/0x330
  [<ffffffff81121984>] kthread+0xc4/0xe0
  [<ffffffff8184d75f>] ret_from_fork+0x1f/0x40
  [<ffffffff811218c0>] ? kthread_create_on_node+0x170/0x170
Code: e9 65 fd ff ff 49 8b 57 20 48 8d 42 ff 83 e2 01 49 0f 44 c7 f0 80 08 40 e9 6f fd ff ff 89 c6 48 c7 c7 01 36 c7 81 e8 e8 40 fa ff <0f> 0b ba 00 10 00 00 be 
5a 00 00 00 48 89 c7 48 d3 e2 e8 bf 18
RIP  [<ffffffff8124592a>] new_slab+0x39a/0x410
  RSP <ffff880036443b58>
---[ end trace ea3b0043b2911d93 ]---


static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
{
         if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
                 pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
                 BUG();
         }

         return allocate_slab(s,
                 flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
}


Thanks,
Ben
diff mbox

Patch

diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
index 6279ab4a760e..7c15f65fe5ed 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.c
+++ b/drivers/net/wireless/ath/ath10k/wmi.c
@@ -4411,6 +4411,12 @@  static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id,
 		if (!pool_size)
 			return -EINVAL;
 
+		if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
+			num_units = WMI_MAX_MEM_CHUNK_SIZE /
+					round_up(unit_len, 4);
+			pool_size = num_units * round_up(unit_len, 4);
+		}
+
 		vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
 		if (!vaddr)
 			num_units /= 2;
diff --git a/drivers/net/wireless/ath/ath10k/wmi.h b/drivers/net/wireless/ath/ath10k/wmi.h
index 90f594e89f94..dea1f235a54d 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.h
+++ b/drivers/net/wireless/ath/ath10k/wmi.h
@@ -6184,6 +6184,7 @@  struct wmi_roam_ev {
 #define ATH10K_DEFAULT_ATIM 0
 
 #define WMI_MAX_MEM_REQS 16
+#define WMI_MAX_MEM_CHUNK_SIZE (256 * 1024) /* 256 KB */
 
 struct wmi_scan_ev_arg {
 	__le32 event_type; /* %WMI_SCAN_EVENT_ */