Message ID | 20160614061728.570-1-rmanohar@qti.qualcomm.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Kalle Valo |
Headers | show |
On 06/13/2016 11:17 PM, Rajkumar Manoharan wrote: > commit b057886524be ("ath10k: do not use coherent memory for allocated > device memory chunks") replaced coherent memory allocation for memory > chunks to fix low memory platforms. Unfortunately this is causing system > freeze on x86 platform while bringing up qca99x0 device. The system > hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix > this by limiting maximum memory chunk size to 256 KiB per request. > > Cc: Felix Fietkau <nbd@nbd.name> > Fixes: b057886524be ("ath10k: do not use coherent memory for allocated device memory chunks") > Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com> > --- > drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++ > drivers/net/wireless/ath/ath10k/wmi.h | 1 + > 2 files changed, 7 insertions(+) > > diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c > index 6279ab4a760e..7c15f65fe5ed 100644 > --- a/drivers/net/wireless/ath/ath10k/wmi.c > +++ b/drivers/net/wireless/ath/ath10k/wmi.c > @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id, > if (!pool_size) > return -EINVAL; > > + if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) { > + num_units = WMI_MAX_MEM_CHUNK_SIZE / > + round_up(unit_len, 4); > + pool_size = num_units * round_up(unit_len, 4); > + } I started testing my 9980 x86-64 system with VT/d enabled today. With this patch in my tree, it crashes on bootup (with my firmware). Works fine without this patch. I don't see the exact place it is crashing in the firmware, though I could probably narrow it down with some effort. It is in the early startup code, at least, which makes debugging more difficult. This patch works fine with a slightly newer firmware compiled for 9984. That same firmware compiled for 9980 crashes, but I am not certain it is the same issue as the older 9980. It appears similar, at least. Looks to me like there are lots of variances in how firmware and chip revisions deal with this particular code, so we are going to have to test on lots of chips and platforms to know if a 'fix' is really a fix or not. Thanks, Ben
from my point of view this patch is just shit. it trunkates the maximum allocated memory to a certain value. so firmware requests 800 kb memory but just gets 256kb. so out of bound memory access is guaranteed at all. Am 23.07.2016 um 00:43 schrieb Ben Greear: > On 06/13/2016 11:17 PM, Rajkumar Manoharan wrote: >> commit b057886524be ("ath10k: do not use coherent memory for allocated >> device memory chunks") replaced coherent memory allocation for memory >> chunks to fix low memory platforms. Unfortunately this is causing system >> freeze on x86 platform while bringing up qca99x0 device. The system >> hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix >> this by limiting maximum memory chunk size to 256 KiB per request. >> >> Cc: Felix Fietkau <nbd@nbd.name> >> Fixes: b057886524be ("ath10k: do not use coherent memory for >> allocated device memory chunks") >> Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com> >> --- >> drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++ >> drivers/net/wireless/ath/ath10k/wmi.h | 1 + >> 2 files changed, 7 insertions(+) >> >> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c >> b/drivers/net/wireless/ath/ath10k/wmi.c >> index 6279ab4a760e..7c15f65fe5ed 100644 >> --- a/drivers/net/wireless/ath/ath10k/wmi.c >> +++ b/drivers/net/wireless/ath/ath10k/wmi.c >> @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct >> ath10k *ar, u32 req_id, >> if (!pool_size) >> return -EINVAL; >> >> + if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) { >> + num_units = WMI_MAX_MEM_CHUNK_SIZE / >> + round_up(unit_len, 4); >> + pool_size = num_units * round_up(unit_len, 4); >> + } > > I started testing my 9980 x86-64 system with VT/d enabled today. > With this patch in my tree, it crashes on bootup (with my firmware). > Works fine without this patch. > > I don't see the exact place it is crashing in the firmware, though I > could > probably narrow it down with some effort. It is in the early startup > code, > at least, which makes debugging more difficult. > > This patch works fine with a slightly newer firmware compiled for > 9984. That > same firmware compiled for 9980 crashes, but I am not certain it is > the same issue > as the older 9980. It appears similar, at least. > > Looks to me like there are lots of variances in how firmware and chip > revisions > deal with this particular code, so we are going to have to test on > lots of chips > and platforms to know if a 'fix' is really a fix or not. > > Thanks, > Ben > >
> from my point of view this patch is just shit. it trunkates the maximum > allocated memory to a certain value. > so firmware requests 800 kb memory but just gets 256kb. so out of bound > memory access is guaranteed at all. > Even with current logic, If the memory chunk allocation fails for bigger size, then it tries to allocate smaller chunks. So anyway it is guaranteed for oob access. no? while (!vaddr && num_units) { pool_size = num_units * round_up(unit_len, 4); if (!pool_size) return -EINVAL; vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN); if (!vaddr) num_units /= 2; } Actually the commit "ath10k: do not use coherent memory for allocated device memory chunks" is causing system hang on non VT/d x86 platform. Better to revert the commit until it is properly root caused -Rajkumar Am 23.07.2016 um 00:43 schrieb Ben Greear: > On 06/13/2016 11:17 PM, Rajkumar Manoharan wrote: >> commit b057886524be ("ath10k: do not use coherent memory for allocated >> device memory chunks") replaced coherent memory allocation for memory >> chunks to fix low memory platforms. Unfortunately this is causing system >> freeze on x86 platform while bringing up qca99x0 device. The system >> hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix >> this by limiting maximum memory chunk size to 256 KiB per request. >> >> Cc: Felix Fietkau <nbd@nbd.name> >> Fixes: b057886524be ("ath10k: do not use coherent memory for >> allocated device memory chunks") >> Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com> >> --- >> drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++ >> drivers/net/wireless/ath/ath10k/wmi.h | 1 + >> 2 files changed, 7 insertions(+) >> >> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c >> b/drivers/net/wireless/ath/ath10k/wmi.c >> index 6279ab4a760e..7c15f65fe5ed 100644 >> --- a/drivers/net/wireless/ath/ath10k/wmi.c >> +++ b/drivers/net/wireless/ath/ath10k/wmi.c >> @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct >> ath10k *ar, u32 req_id, >> if (!pool_size) >> return -EINVAL; >> >> + if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) { >> + num_units = WMI_MAX_MEM_CHUNK_SIZE / >> + round_up(unit_len, 4); >> + pool_size = num_units * round_up(unit_len, 4); >> + } > > I started testing my 9980 x86-64 system with VT/d enabled today. > With this patch in my tree, it crashes on bootup (with my firmware). > Works fine without this patch. > > I don't see the exact place it is crashing in the firmware, though I > could > probably narrow it down with some effort. It is in the early startup > code, > at least, which makes debugging more difficult. > > This patch works fine with a slightly newer firmware compiled for > 9984. That > same firmware compiled for 9980 crashes, but I am not certain it is > the same issue > as the older 9980. It appears similar, at least. > > Looks to me like there are lots of variances in how firmware and chip > revisions > deal with this particular code, so we are going to have to test on > lots of chips > and platforms to know if a 'fix' is really a fix or not. > > Thanks, > Ben > > -- Mit freundlichen Grüssen / Regards Sebastian Gottschall / CTO NewMedia-NET GmbH - DD-WRT Firmensitz: Berliner Ring 101, 64625 Bensheim Registergericht: Amtsgericht Darmstadt, HRB 25473 Geschäftsführer: Peter Steinhäuser, Christian Scheele http://www.dd-wrt.com email: s.gottschall@dd-wrt.com Tel.: +496251-582650 / Fax: +496251-5826565 -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/23/2016 02:11 AM, Sebastian Gottschall wrote: > from my point of view this patch is just shit. it trunkates the maximum allocated memory to a certain value. > so firmware requests 800 kb memory but just gets 256kb. so out of bound memory access is guaranteed at all. At least some of the firmware logic attempts to deal with this segmentation. 9984 firmware had bugs in one area that I fixed, but probably there are other bugs and that is why it fails for 9980 firmware. Even if I fix my firmware, that doesn't help everyone else with stock firmware. At best, maybe enable this type of logic for only exact firmware with proper feature flag showing that it is known to work with it. Thanks, Ben
On 07/20/2016 10:02 AM, Adrian Chadd wrote: > Hi, > > The "right" way for the target CPU to interact with host CPU memory > (and vice versa, for mostly what it's worth) is to have the copy > engine copy (ie, "DMA") the pieces between them. This may be for > diagnostic purposes, but it's not supposed to be used like this for > doing wifi data exchange, right? :-P > > Now, there /may/ be some alignment hilarity in various bits of code > and/or hardware. Eg, Merlin (AR9280) requires its descriptors to be > within a 4k block - the code to iterate through the descriptor > physical address space didn't do a "val = val + offset", it did > something in verilog like "val = (val & 0xffffc000) | (offset & > 0x3fff)". This meant if you allocated a descriptor that started just > before the end of a 4k physmem aligned block, you'd end up with > exciting results. I don't know if there are any situations like this > in the ath10k hardware, but I'm sure there will be some gotchas > somewhere. > > In any case, if ath10k is consuming too much bounce buffers, the calls > to allocate memory aren't working right and should be restricted to 32 > bit addresses. Whether that's by using the DMA memory API (before it's > mapped) or passing in GFP_DMA32 is a fun debate. > > (My test hardware arrived, so I'll test this all out today on > Peregrine-v2 and see if the driver works.) I have been running this patch for a while: ath10k: Use GPF_DMA32 for firmware swap memory. This fixes OS crash when using QCA 9984 NIC on x86-64 system without vt-d enabled. Also tested on ea8500 with 9980, and x86-64 with 9980 and 9880. All tests were with CT firmware. Signed-off-by: Ben Greear <greearb@candelatech.com> -------------------- drivers/net/wireless/ath/ath10k/wmi.c -------------------- index e20aa39..727b3aa 100644 @@ -4491,7 +4491,7 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id, if (!pool_size) return -EINVAL; - vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN); + vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN | GFP_DMA32); if (!vaddr) num_units /= 2; } It mostly seems to work, but then sometimes I get a splat like this below. It appears it is invalid to actually do kzalloc with GFP_DMA32 (based on that BUG_ON that hit in the new_slab method)?? Any idea for a more proper way to do this? gfp: 4 ------------[ cut here ]------------ kernel BUG at /home/greearb/git/linux-4.7.dev.y/mm/slub.c:1508! invalid opcode: 0000 [#1] PREEMPT SMP Modules linked in: coretemp hwmon ath9k intel_rapl ath10k_pci x86_pkg_temp_thermal ath9k_common ath10k_core intel_powerclamp ath9k_hw ath kvm iTCO_wdt mac80211 iTCO_vendor_support irqbypass snd_hda_codec_hdmi 6 CPU: 2 PID: 268 Comm: kworker/u8:5 Not tainted 4.7.2+ #16 Hardware name: To be filled by O.E.M. To be filled by O.E.M./ChiefRiver, BIOS 4.6.5 06/07/2013 Workqueue: ath10k_aux_wq ath10k_wmi_event_service_ready_work [ath10k_core] task: ffff880036433a00 ti: ffff880036440000 task.ti: ffff880036440000 RIP: 0010:[<ffffffff8124592a>] [<ffffffff8124592a>] new_slab+0x39a/0x410 RSP: 0018:ffff880036443b58 EFLAGS: 00010092 RAX: 0000000000000006 RBX: 00000000024082c4 RCX: 0000000000000000 RDX: 0000000000000006 RSI: ffff88021e30dd08 RDI: ffff88021e30dd08 RBP: ffff880036443b90 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000372 R12: ffff88021dc01200 R13: ffff88021dc00cc0 R14: ffff88021dc01200 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff88021e300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3e65c1c730 CR3: 0000000001e06000 CR4: 00000000001406e0 Stack: ffffffff8127a4fc ffff0a01ffffff10 00000000024082c4 ffff88021dc01200 ffff88021dc00cc0 ffff88021dc01200 0000000000000001 ffff880036443c58 ffffffff81247ac6 ffff88021e31b360 ffff880036433a00 ffff880036433a00 Call Trace: [<ffffffff8127a4fc>] ? __d_lookup+0x9c/0x160 [<ffffffff81247ac6>] ___slab_alloc+0x396/0x4a0 [<ffffffffa0f8e14d>] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core] [<ffffffff811f5279>] ? alloc_kmem_pages+0x9/0x10 [<ffffffff8120f203>] ? kmalloc_order+0x13/0x40 [<ffffffffa0f8e14d>] ? ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core] [<ffffffff81247bf6>] __slab_alloc.isra.72+0x26/0x40 [<ffffffff81248767>] __kmalloc+0x147/0x1b0 [<ffffffffa0f8e14d>] ath10k_wmi_event_service_ready_work+0x5ad/0x800 [ath10k_core] [<ffffffff811370a1>] ? dequeue_entity+0x261/0xac0 [<ffffffff8111c2d8>] process_one_work+0x148/0x420 [<ffffffff8111c929>] worker_thread+0x49/0x480 [<ffffffff8111c8e0>] ? rescuer_thread+0x330/0x330 [<ffffffff81121984>] kthread+0xc4/0xe0 [<ffffffff8184d75f>] ret_from_fork+0x1f/0x40 [<ffffffff811218c0>] ? kthread_create_on_node+0x170/0x170 Code: e9 65 fd ff ff 49 8b 57 20 48 8d 42 ff 83 e2 01 49 0f 44 c7 f0 80 08 40 e9 6f fd ff ff 89 c6 48 c7 c7 01 36 c7 81 e8 e8 40 fa ff <0f> 0b ba 00 10 00 00 be 5a 00 00 00 48 89 c7 48 d3 e2 e8 bf 18 RIP [<ffffffff8124592a>] new_slab+0x39a/0x410 RSP <ffff880036443b58> ---[ end trace ea3b0043b2911d93 ]--- static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node) { if (unlikely(flags & GFP_SLAB_BUG_MASK)) { pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK); BUG(); } return allocate_slab(s, flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node); } Thanks, Ben
diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c index 6279ab4a760e..7c15f65fe5ed 100644 --- a/drivers/net/wireless/ath/ath10k/wmi.c +++ b/drivers/net/wireless/ath/ath10k/wmi.c @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id, if (!pool_size) return -EINVAL; + if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) { + num_units = WMI_MAX_MEM_CHUNK_SIZE / + round_up(unit_len, 4); + pool_size = num_units * round_up(unit_len, 4); + } + vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN); if (!vaddr) num_units /= 2; diff --git a/drivers/net/wireless/ath/ath10k/wmi.h b/drivers/net/wireless/ath/ath10k/wmi.h index 90f594e89f94..dea1f235a54d 100644 --- a/drivers/net/wireless/ath/ath10k/wmi.h +++ b/drivers/net/wireless/ath/ath10k/wmi.h @@ -6184,6 +6184,7 @@ struct wmi_roam_ev { #define ATH10K_DEFAULT_ATIM 0 #define WMI_MAX_MEM_REQS 16 +#define WMI_MAX_MEM_CHUNK_SIZE (256 * 1024) /* 256 KB */ struct wmi_scan_ev_arg { __le32 event_type; /* %WMI_SCAN_EVENT_ */
commit b057886524be ("ath10k: do not use coherent memory for allocated device memory chunks") replaced coherent memory allocation for memory chunks to fix low memory platforms. Unfortunately this is causing system freeze on x86 platform while bringing up qca99x0 device. The system hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix this by limiting maximum memory chunk size to 256 KiB per request. Cc: Felix Fietkau <nbd@nbd.name> Fixes: b057886524be ("ath10k: do not use coherent memory for allocated device memory chunks") Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com> --- drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++ drivers/net/wireless/ath/ath10k/wmi.h | 1 + 2 files changed, 7 insertions(+)