Message ID | 1431434736-7077-1-git-send-email-michal.kazior@tieto.com (mailing list archive) |
---|---|
State | Changes Requested |
On 12 May 2015 at 14:45, Michal Kazior <michal.kazior@tieto.com> wrote:
> Patch df1404650ccb ("mac80211: remove support for
> IFF_PROMISC") removed promiscuous flag propagation
> to drivers.
>
> However the patch was designed against ath10k
> without 548462133d98 ("ath10k: fix interrupt
> storm").
>
> After merge the code drifted into being no longer
> correct and due to monitor vdev being
> overzealously started caused IBSS to crash on
> 999.999.0.636 for QCA988X (this firmware revision
> is known to have issues with monitor vdev).
>
> This patch keeps expectations of commit
> 548462133d98 (i.e. reduce irq storm by not
> enabling monitor vdev for AP) and doesn't break
> existing (known) setups that imply promiscuous
> mode on network interfaces.
>
> Contrary to what it looks like 548462133d98
> functionality is not reverted since the intention
> was a subset of what df1404650ccb did.
>
> Fixes: c17c997d5613 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next")
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>

Apparently this also fixes some weird issues with qca6174 hw2.1, notably:
 - ath10k causing other devices in a BSS to disconnect
 - random FW crashes

Both problems started to happen because c17c997d5613 enabled monitor
vdev by default on STA interfaces. It seems that qca6174 hw2.1
firmware has issues similar to those of qca988x 999.999.0.636
regarding monitor vdev operation.

Also, I've made a typo in the subject.

I'll post v2 with subject fixed and extended commit log later.


Michał
Adding John as this involved wireless-testing

Michal Kazior <michal.kazior@tieto.com> writes:

> On 12 May 2015 at 14:45, Michal Kazior <michal.kazior@tieto.com> wrote:
>> Patch df1404650ccb ("mac80211: remove support for
>> IFF_PROMISC") removed promiscuous flag propagation
>> to drivers.
>>
>> However the patch was designed against ath10k
>> without 548462133d98 ("ath10k: fix interrupt
>> storm").
>>
>> After merge the code drifted into being no longer
>> correct and due to monitor vdev being
>> overzealously started caused IBSS to crash on
>> 999.999.0.636 for QCA988X (this firmware revision
>> is known to have issues with monitor vdev).
>>
>> This patch keeps expectations of commit
>> 548462133d98 (i.e. reduce irq storm by not
>> enabling monitor vdev for AP) and doesn't break
>> existing (known) setups that imply promiscuous
>> mode on network interfaces.
>>
>> Contrary to what it looks like 548462133d98
>> functionality is not reverted since the intention
>> was a subset of what df1404650ccb did.
>>
>> Fixes: c17c997d5613 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next")
>> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
>
> Apparently this also fixes some weird issues with qca6174 hw2.1 notably:
>  - ath10k causing disconnecting of other devices in a BSS
>  - random Fw crashes
>
> Both problems started to happen because c17c997d5613 enabled monitor
> vdev by default on STA interfaces. It seems that qca6174 hw2.1
> firmware has issues similar to those of qca988x 999.999.0.636
> regarding monitor vdev opration.
>
> Also, I've made a typo in the subject.
>
> I'll post v2 with subject fixed and extended commit log later.

Keep in mind that c17c997d5613 is actually from wireless-testing.git
which means that it will never go to wireless-drivers-next.git nor to
net-next.git. So the merge conflict bug is purely in
wireless-testing.git and in master branch of ath.git (but not in
ath-next branch!).

I think John should apply your v2 patch once you send it. But if you
have something which should be fixed in ath-next remember to send that
in a separate patch so that I can apply that directly to ath-next.
Kalle Valo <kvalo@qca.qualcomm.com> writes:

> Adding John as this involved wireless-testing
>
> Michal Kazior <michal.kazior@tieto.com> writes:
>
>> On 12 May 2015 at 14:45, Michal Kazior <michal.kazior@tieto.com> wrote:
>>> Patch df1404650ccb ("mac80211: remove support for
>>> IFF_PROMISC") removed promiscuous flag propagation
>>> to drivers.
>>>
>>> However the patch was designed against ath10k
>>> without 548462133d98 ("ath10k: fix interrupt
>>> storm").
>>>
>>> After merge the code drifted into being no longer
>>> correct and due to monitor vdev being
>>> overzealously started caused IBSS to crash on
>>> 999.999.0.636 for QCA988X (this firmware revision
>>> is known to have issues with monitor vdev).
>>>
>>> This patch keeps expectations of commit
>>> 548462133d98 (i.e. reduce irq storm by not
>>> enabling monitor vdev for AP) and doesn't break
>>> existing (known) setups that imply promiscuous
>>> mode on network interfaces.
>>>
>>> Contrary to what it looks like 548462133d98
>>> functionality is not reverted since the intention
>>> was a subset of what df1404650ccb did.
>>>
>>> Fixes: c17c997d5613 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next")
>>> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
>>
>> Apparently this also fixes some weird issues with qca6174 hw2.1 notably:
>>  - ath10k causing disconnecting of other devices in a BSS
>>  - random Fw crashes
>>
>> Both problems started to happen because c17c997d5613 enabled monitor
>> vdev by default on STA interfaces. It seems that qca6174 hw2.1
>> firmware has issues similar to those of qca988x 999.999.0.636
>> regarding monitor vdev opration.
>>
>> Also, I've made a typo in the subject.
>>
>> I'll post v2 with subject fixed and extended commit log later.
>
> Keep in mind that c17c997d5613 is actually from wireless-testing.git
> which means that it will never go to wireless-drivers-next.git nor to
> net-next.git. So the merge conflict bug is purely in
> wireless-testing.git and in master branch of ath.git (but not in
> ath-next branch!).
>
> I think John should apply your v2 patch once you send it. But if you
> have something which should be fixed in ath-next remember to send that
> in a separate patch so that I can apply that directly to ath-next.

Actually now that Dave pulled my pull request the issue is fixed in
wireless-drivers-next already. So once John pulls from
wireless-drivers-next and makes sure that ath10k is 100% identical in
both trees the issue should be sorted out and no need for extra patches.
Hello,

would it be possible to add an ACK timing feature to the ath10k firmware (QCA9880 internal register 0x8014, mask 0x3FFF)?

Regards,
Sebastian
On 05/25/2015 10:10 AM, Sebastian Gottschall wrote:
> Hello
>
> could it be possible to add a ACK timing feature to the ath10k firmware (QCA9880 internal register 0x8014, mask 0x3FFF)

You just need ability to set this register to some value?

If so, probably something I could add to CT firmware, at least.

Thanks,
Ben
On 25.05.2015 at 19:13, Ben Greear wrote:
>
>
> On 05/25/2015 10:10 AM, Sebastian Gottschall wrote:
>> Hello
>>
>> could it be possible to add a ACK timing feature to the ath10k
>> firmware (QCA9880 internal register 0x8014, mask 0x3FFF)
>
> You just need ability to set this register to some value?
>
> If so, probably something I could add to CT firmware, at least.
>
Not alone. This register is rewritten on each reset (channel change etc.), so it needs to be handled correctly.
Yes, just writing and handling the ACK value would be enough. The math behind it is no problem.
Otherwise it's impossible to do long-range links with ath10k. (LSDK-based firmware from Compex does support this feature, unlike ath10k.)

For distance handling the following parameters must be adjustable (in ath9k we implemented the coverageclass attribute for it, which was based on my previous work on madwifi).
Since I just have an old ath10k firmware source which I never got working (working toolchain missing), I'll just write down the register definitions which must be adjustable here. The math etc. for calculating these values can be done later by me in ath10k.

OS_REG_WRITE(MAC_DCU_GBL_IFS_SLOT_ADDRESS, MAC_DCU_GBL_IFS_SLOT_DURATION_SET(your_slot_time_here * 88)); //(default value is 9)
OS_REG_WRITE(MAC_DCU_GBL_IFS_SIFS_ADDRESS, MAC_DCU_GBL_IFS_SIFS_DURATION_SET(your_sifs_time_here * 88)); //(default value is 14)
OS_REG_WRITE(MAC_DCU_GBL_IFS_EIFS_ADDRESS, MAC_DCU_GBL_IFS_SIFS_DURATION_SET(your_eifs_time_here * 88)); //(default value is 92)
OS_REG_WRITE(MAC_PCU_ACK_CTS_TIMEOUT_ADDRESS, MAC_PCU_ACK_CTS_TIMEOUT_ACK_TIMEOUT_SET(your_ack_time_here * 88)); // (default value is 30)
OS_REG_WRITE(MAC_PCU_ACK_CTS_TIMEOUT_ADDRESS, MAC_PCU_ACK_CTS_TIMEOUT_CTS_TIMEOUT_SET(your_cts_time_here * 88)); // (default value is 30)

These registers are prewritten using the ini array named qca9880_peregrine_bimodal_asic_mac.
It's possible to adjust them using debugfs reg_value and reg_addr, but as I said, on each channel change or internal reset the registers are overwritten with default values. So the best approach is to adjust them directly after the registers are written from the ini array.
> Thanks,
> Ben
>
>
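The arithmetic behind the ACK/CTS register write can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `* 88` factor is taken to convert microseconds into MAC clock ticks, the ACK field is the 0x3FFF mask at register 0x8014 mentioned earlier in the thread, and the CTS field is assumed to sit in bits 16-29 of the same register, as in ath9k's `AR_TIME_OUT` layout. The thread does not confirm the CTS position for QCA9880, and the helper names (`us_to_clocks`, `set_field`, `pack_ack_cts`) are hypothetical.

```python
# Sketch only: everything except the ACK mask and the "* 88" factor is assumed.
CLOCKS_PER_US = 88   # multiplier used in the OS_REG_WRITE calls in the email
ACK_MASK = 0x3FFF    # from the thread: register 0x8014, mask 0x3FFF
ACK_SHIFT = 0
CTS_MASK = 0x3FFF    # assumption, mirroring ath9k's AR_TIME_OUT_CTS field
CTS_SHIFT = 16       # assumption


def us_to_clocks(microseconds):
    """Convert a duration in microseconds to MAC clock ticks."""
    return microseconds * CLOCKS_PER_US


def set_field(reg, value, mask, shift):
    """Replace one bit field of reg and leave all other bits untouched."""
    if value < 0 or value > mask:
        raise ValueError("value %d does not fit in mask 0x%X" % (value, mask))
    return (reg & ~(mask << shift) & 0xFFFFFFFF) | (value << shift)


def pack_ack_cts(reg, ack_us, cts_us):
    """Return reg with both timeout fields updated (read-modify-write style)."""
    reg = set_field(reg, us_to_clocks(ack_us), ACK_MASK, ACK_SHIFT)
    reg = set_field(reg, us_to_clocks(cts_us), CTS_MASK, CTS_SHIFT)
    return reg


if __name__ == "__main__":
    # Defaults quoted in the email: 30 us for both ACK and CTS timeout.
    print(hex(pack_ack_cts(0, 30, 30)))
```

Both fields are combined into one value because ACK and CTS timeout share one register address in the calls above; if the `_SET` macros only build the field value, two back-to-back writes would leave only the second field set. With a 14-bit field and 88 ticks per microsecond, the largest timeout that fits is 186 µs.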
On 05/25/2015 10:48 AM, Sebastian Gottschall wrote:
> On 25.05.2015 at 19:13, Ben Greear wrote:
>>
>>
>> On 05/25/2015 10:10 AM, Sebastian Gottschall wrote:
>>> Hello
>>>
>>> could it be possible to add a ACK timing feature to the ath10k firmware (QCA9880 internal register 0x8014, mask 0x3FFF)
>>
>> You just need ability to set this register to some value?
>>
>> If so, probably something I could add to CT firmware, at least.
>>
> not alone. this register is rewritten on each reset (channel change etc.) so it needs to be correct handled.
> yes. just writing and handling the ack value would be enough. the math behind is no problem.
> otherwise its impossible todo long range links with ath10k. (LSDK based firmware from compex do support this feature unlike ath10k)

I'll see if I can add this to my firmware, probably will be a few days before I can get
time to work on it.  Will post to list when I have a FW build ready for testing.

Thanks,
Ben
On 25.05.2015 at 19:53, Ben Greear wrote:
>
>
> On 05/25/2015 10:48 AM, Sebastian Gottschall wrote:
>> On 25.05.2015 at 19:13, Ben Greear wrote:
>>>
>>>
>>> On 05/25/2015 10:10 AM, Sebastian Gottschall wrote:
>>>> Hello
>>>>
>>>> could it be possible to add a ACK timing feature to the ath10k
>>>> firmware (QCA9880 internal register 0x8014, mask 0x3FFF)
>>>
>>> You just need ability to set this register to some value?
>>>
>>> If so, probably something I could add to CT firmware, at least.
>>>
>> not alone. this register is rewritten on each reset (channel change
>> etc.) so it needs to be correct handled.
>> yes. just writing and handling the ack value would be enough. the
>> math behind is no problem.
>> otherwise its impossible todo long range links with ath10k. (LSDK
>> based firmware from compex do support this feature unlike ath10k)
>
> I'll see if I can add this to my firmware, probably will be a few days
> before I can get
> time to work on it.  Will post to list when I have a FW build ready
> for testing.
Do you plan to bring your codebase up to 10.2.4 with API 5 at some point?
Or is the code already up to date, just using the old API?
>
> Thanks,
> Ben
>
On 05/25/2015 12:21 PM, Sebastian Gottschall wrote:
> On 25.05.2015 at 19:53, Ben Greear wrote:
>>
>>
>> On 05/25/2015 10:48 AM, Sebastian Gottschall wrote:
>>> On 25.05.2015 at 19:13, Ben Greear wrote:
>>>>
>>>>
>>>> On 05/25/2015 10:10 AM, Sebastian Gottschall wrote:
>>>>> Hello
>>>>>
>>>>> could it be possible to add a ACK timing feature to the ath10k firmware (QCA9880 internal register 0x8014, mask 0x3FFF)
>>>>
>>>> You just need ability to set this register to some value?
>>>>
>>>> If so, probably something I could add to CT firmware, at least.
>>>>
>>> not alone. this register is rewritten on each reset (channel change etc.) so it needs to be correct handled.
>>> yes. just writing and handling the ack value would be enough. the math behind is no problem.
>>> otherwise its impossible todo long range links with ath10k. (LSDK based firmware from compex do support this feature unlike ath10k)
>>
>> I'll see if I can add this to my firmware, probably will be a few days before I can get
>> time to work on it.  Will post to list when I have a FW build ready for testing.
> do you plan to bring up your codebase to 10.2.4 with api 5 one time?
> or is the code already up to date, just using the old api?

I'm having a slow time getting updated source from QCA, but I plan to
move to a newer code base when I can get access.

For now, my firmware is based on 10.1.467, but it has quite a bit of improvements
and changes.  It does not support some of the newer chipsets that newer QCA
firmware supports.

Thanks,
Ben
On 25.05.2015 at 21:32, Ben Greear wrote:
>
>
> On 05/25/2015 12:21 PM, Sebastian Gottschall wrote:
>> On 25.05.2015 at 19:53, Ben Greear wrote:
>>>
>>>
>>> On 05/25/2015 10:48 AM, Sebastian Gottschall wrote:
>>>> On 25.05.2015 at 19:13, Ben Greear wrote:
>>>>>
>>>>>
>>>>> On 05/25/2015 10:10 AM, Sebastian Gottschall wrote:
>>>>>> Hello
>>>>>>
>>>>>> could it be possible to add a ACK timing feature to the ath10k
>>>>>> firmware (QCA9880 internal register 0x8014, mask 0x3FFF)
>>>>>
>>>>> You just need ability to set this register to some value?
>>>>>
>>>>> If so, probably something I could add to CT firmware, at least.
>>>>>
>>>> not alone. this register is rewritten on each reset (channel change
>>>> etc.) so it needs to be correct handled.
>>>> yes. just writing and handling the ack value would be enough. the
>>>> math behind is no problem.
>>>> otherwise its impossible todo long range links with ath10k. (LSDK
>>>> based firmware from compex do support this feature unlike ath10k)
>>>
>>> I'll see if I can add this to my firmware, probably will be a few
>>> days before I can get
>>> time to work on it.  Will post to list when I have a FW build ready
>>> for testing.
>> do you plan to bring up your codebase to 10.2.4 with api 5 one time?
>> or is the code already up to date, just using the old api?
>
> I'm having a slow time getting updated source from QCA, but I plan to
> move to a newer code base when I can get access.
>
> For now, my firmware is based on 10.1.467, but it has quite a bit of
> improvements
> and changes.  It does not support some of the newer chipsets that
> newer QCA
> firmware supports.
As far as I have seen, each new chipset has its own firmware. The standard firmware will only support AR9880 v2.
Since I'm only working on embedded devices which run only on AR9880 v2 based chipsets, this isn't a big issue.

Sebastian
>
> Thanks,
> Ben
>
>
Today, using the latest testing driver, I found out that the memory consumption is unbelievably high.
My router here has 64 MB RAM. This RAM is fully taken after some minutes by ath10k, but only if data flow is present.

Here are the results of "free" after some minutes:
root@DD-WRT:~# free
             total         used         free       shared      buffers
Mem:         61636        58752         2884            0         2600
-/+ buffers:              56152         5484
Swap:            0            0            0


Now I terminate hostapd, which controls the ath10k chipset:


root@DD-WRT:~# kill 902
root@DD-WRT:~# free
             total         used         free       shared      buffers
Mem:         61636        23212        38424            0         2416
-/+ buffers:              20796        40840
Swap:            0            0            0


You see the difference?


regards,
Sebastian Gottschall
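To put a number on the difference between the two `free` outputs above, the "Mem:" rows can be parsed and subtracted. The following is a minimal sketch, assuming the busybox-style column layout shown in the email and that the values are in KiB (the usual default for `free`); the function names `parse_free` and `used_delta_kib` are made up for this illustration.

```python
# The two snapshots below are copied from the email; nothing else is measured.
BEFORE = """\
             total         used         free       shared      buffers
Mem:         61636        58752         2884            0         2600
-/+ buffers:              56152         5484
Swap:            0            0            0
"""

AFTER = """\
             total         used         free       shared      buffers
Mem:         61636        23212        38424            0         2416
-/+ buffers:              20796        40840
Swap:            0            0            0
"""


def parse_free(text):
    """Return the 'Mem:' row of `free` output as a dict keyed by column name."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    for line in lines[1:]:
        parts = line.split()
        if parts and parts[0] == "Mem:":
            return dict(zip(header, (int(p) for p in parts[1:])))
    raise ValueError("no 'Mem:' row found")


def used_delta_kib(before, after):
    """How much the 'used' column dropped between two snapshots."""
    return parse_free(before)["used"] - parse_free(after)["used"]


if __name__ == "__main__":
    delta = used_delta_kib(BEFORE, AFTER)
    print("used dropped by %d KiB (%.1f MiB)" % (delta, delta / 1024.0))
```

On the figures from the email the "used" column drops by 35540 KiB, roughly 34.7 MiB, after hostapd is killed. This only restates what `free` reports; it does not by itself show which kernel allocations were released.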
Default firmware has a hard-coded minimum number of tx buffers (somewhere
more than 1k buffers I think).  Maybe driver is allocating all this
memory somehow?

If you do one-way traffic tests (udp), I wonder if you can tell if it is tx
or rx that consumes the memory?

CT firmware can be configured to use any multiple-of-8 amount of tx
buffers, though I have not tested below around 600.

Thanks,
Ben

On 05/25/2015 02:26 PM, Sebastian Gottschall wrote:
> today using the latest testing driver, i found out the memory consumption is unbelievable high.
> my router here has 64 mb ram. this ram is fully taken after some minutes by ath10k. but only if data flow present.
>
> here the results of "free" after some minutes
> root@DD-WRT:~# free
>              total         used         free       shared      buffers
> Mem:         61636        58752         2884            0         2600
> -/+ buffers:              56152         5484
> Swap:            0            0            0
>
>
> now i terminate hostapd which controls the ath10k chipset
>
>
> root@DD-WRT:~# kill 902
> root@DD-WRT:~# free
>              total         used         free       shared      buffers
> Mem:         61636        23212        38424            0         2416
> -/+ buffers:              20796        40840
> Swap:            0            0            0
>
>
> you see the difference?
>
>
> regards,
> Sebastian Gottschall
>
> _______________________________________________
> ath10k mailing list
> ath10k@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/ath10k
>
On 26.05.2015 at 00:39, Ben Greear wrote:
> Default firmware has a hard-coded minimum number of tx buffers (somewhere
> more than 1k buffers I think).  Maybe driver is allocating all this
> memory somehow?
>
> If you do one-way traffic tests (udp), I wonder if you can tell if it is tx
> or rx that consumes the memory?
It's tx. I have an Ethernet-over-IP tunnel running on that link and I broadcast IPTV that way (it's my way to convert multicast to unicast).
The tunnel itself is RFC Ethernet over IP, which is somewhat like UDP, so a connectionless protocol.

Sebastian
>
> CT firmware can be configured to use any multiple-of-8 amount of tx
> buffers, though I have not tested below around 600.
>
> Thanks,
> Ben
>
> On 05/25/2015 02:26 PM, Sebastian Gottschall wrote:
>> today using the latest testing driver, i found out the memory
>> consumption is unbelievable high.
>> my router here has 64 mb ram. this ram is fully taken after some
>> minutes by ath10k. but only if data flow present.
>>
>> here the results of "free" after some minutes
>> root@DD-WRT:~# free
>>              total         used         free       shared      buffers
>> Mem:         61636        58752         2884            0         2600
>> -/+ buffers:              56152         5484
>> Swap:            0            0            0
>>
>>
>> now i terminate hostapd which controls the ath10k chipset
>>
>>
>> root@DD-WRT:~# kill 902
>> root@DD-WRT:~# free
>>              total         used         free       shared      buffers
>> Mem:         61636        23212        38424            0         2416
>> -/+ buffers:              20796        40840
>> Swap:            0            0            0
>>
>>
>> you see the difference?
>>
>>
>> regards,
>> Sebastian Gottschall
>>
>
Can you test with ath9k to make sure it is actually ath10k related?

And/or try traffic in RX direction only to see if that still uses
lots of memory?

Does memory come back after you just stop traffic (w/out stopping hostapd)?

Thanks,
Ben


On 05/25/2015 04:00 PM, Sebastian Gottschall wrote:
> On 26.05.2015 at 00:39, Ben Greear wrote:
>> Default firmware has a hard-coded minimum number of tx buffers (somewhere
>> more than 1k buffers I think).  Maybe driver is allocating all this
>> memory somehow?
>>
>> If you do one-way traffic tests (udp), I wonder if you can tell if it is tx
>> or rx that consumes the memory?
> its tx. i have a ethernet over ip tunnel running on that link and i broadcast iptv in that way. (its my way to convert multicast to unicast)
> the tunnel itself is rfc ethernet over ip, which is somewhat like udp. so connectionless protocol
>
> Sebastian
>>
>> CT firmware can be configured to use any multiple-of-8 amount of tx
>> buffers, though I have not tested below around 600.
>>
>> Thanks,
>> Ben
>>
>> On 05/25/2015 02:26 PM, Sebastian Gottschall wrote:
>>> today using the latest testing driver, i found out the memory consumption is unbelievable high.
>>> my router here has 64 mb ram. this ram is fully taken after some minutes by ath10k. but only if data flow present.
>>>
>>> here the results of "free" after some minutes
>>> root@DD-WRT:~# free
>>>              total         used         free       shared      buffers
>>> Mem:         61636        58752         2884            0         2600
>>> -/+ buffers:              56152         5484
>>> Swap:            0            0            0
>>>
>>>
>>> now i terminate hostapd which controls the ath10k chipset
>>>
>>>
>>> root@DD-WRT:~# kill 902
>>> root@DD-WRT:~# free
>>>              total         used         free       shared      buffers
>>> Mem:         61636        23212        38424            0         2416
>>> -/+ buffers:              20796        40840
>>> Swap:            0            0            0
>>>
>>>
>>> you see the difference?
>>>
>>>
>>> regards,
>>> Sebastian Gottschall
>>>
>>
>
On 26.05.2015 at 01:42, Ben Greear wrote:
> Can you test with ath9k to make sure it is actually ath10k related?
Already tested. This device has 2 chipsets: one is ath9k based and the second is ath10k based. :-)
The memory waste is gone only if I kill the hostapd process which controls ath10k.
>
> And/or try traffic in RX direction only to see if that still uses
> lots of memory?
>
> Does memory come back after you just stop traffic (w/out stopping
> hostapd)?
Yes, slowly. It's fluctuating, so sometimes there is 30 MB free again and seconds later just 2 MB, so very heavy changes. On bigger routers with more than 64 MB (I have a second one here with 128 MB)
the total consumption stabilizes at 45 - 50 MB for the driver only, which is still too much for sure. So it may not be a leak, but ath10k or the firmware is wasting too much memory for embedded devices,
and AR9880 is used almost exclusively on embedded devices.
>
> Thanks,
> Ben
>
>
> On 05/25/2015 04:00 PM, Sebastian Gottschall wrote:
>> On 26.05.2015 at 00:39, Ben Greear wrote:
>>> Default firmware has a hard-coded minimum number of tx buffers
>>> (somewhere
>>> more than 1k buffers I think).  Maybe driver is allocating all this
>>> memory somehow?
>>>
>>> If you do one-way traffic tests (udp), I wonder if you can tell if
>>> it is tx
>>> or rx that consumes the memory?
>> its tx. i have a ethernet over ip tunnel running on that link and i
>> broadcast iptv in that way. (its my way to convert multicast to unicast)
>> the tunnel itself is rfc ethernet over ip, which is somewhat like
>> udp. so connectionless protocol
>>
>> Sebastian
>>>
>>> CT firmware can be configured to use any multiple-of-8 amount of tx
>>> buffers, though I have not tested below around 600.
>>>
>>> Thanks,
>>> Ben
>>>
>>> On 05/25/2015 02:26 PM, Sebastian Gottschall wrote:
>>>> today using the latest testing driver, i found out the memory
>>>> consumption is unbelievable high.
>>>> my router here has 64 mb ram. this ram is fully taken after some
>>>> minutes by ath10k. but only if data flow present.
>>>>
>>>> here the results of "free" after some minutes
>>>> root@DD-WRT:~# free
>>>>              total         used         free       shared      buffers
>>>> Mem:         61636        58752         2884            0         2600
>>>> -/+ buffers:              56152         5484
>>>> Swap:            0            0            0
>>>>
>>>>
>>>> now i terminate hostapd which controls the ath10k chipset
>>>>
>>>>
>>>> root@DD-WRT:~# kill 902
>>>> root@DD-WRT:~# free
>>>>              total         used         free       shared      buffers
>>>> Mem:         61636        23212        38424            0         2416
>>>> -/+ buffers:              20796        40840
>>>> Swap:            0            0            0
>>>>
>>>>
>>>> you see the difference?
>>>>
>>>>
>>>> regards,
>>>> Sebastian Gottschall
>>>>
>>>
>>
>
On 26 May 2015 at 02:07, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
> On 26.05.2015 at 01:42, Ben Greear wrote:
>>
>> Can you test with ath9k to make sure it is actually ath10k related?
>
> already tested. this device has 2 chipsets. one is ath9k based and the
> second is ath10k based. :-)
> only if i kill the hostapd process which controls ath10k. the memory waste
> is gone

Keep in mind that hostapd itself requires memory to function as well.
Each process (and thread) needs some internal kernel memory (stack, et
al).


>> And/or try traffic in RX direction only to see if that still uses
>> lots of memory?
>
>
>>
>> Does memory come back after you just stop traffic (w/out stopping
>> hostapd)?
>
> yes. slowly. its fluctuating. so sometimes there is 30 mb free again and
> seconds later just 2 mb. so very heavy changes. on bigger routers with more
> than 64 mb (i have a second here with 128 mb)
> the total consumption stabilizes at 45 - 50 mb for the driver only which is
> still too much for sure. so it may not a leak. but ath10k or the firmware
> is wasting too much memory for embedded devices
> and ar9880 is just used on embedded devices almost

Using `free` is a pretty poor way of assessing memory usage of a
kernel driver. It reports how much memory the OS has immediately
available to userspace (the kernel recycles some memory for performance
reasons, e.g. SLAB does it). There's a lot of metadata too, so what you
actually see is many other things that involve ath10k being used.

The driver itself should be consuming around 5MB of memory at idle
(interface up, no significant traffic). Most of this goes for the Rx
ring which has 1023*1920 bytes (+/- allocation and metadata waste).
Then there's a bunch of CE buffers as well which take up some memory
(used for driver-firmware communication), e.g. 2048*512 + 2048*128
(HTT and WMI, both target->host).

When Txing it may eat up an additional 1424 * (MSDU size +
sizeof(skbuff)). Note that Tx queues can be longer - the driver isn't
aware of qdiscs and those can store frames as well.

11ac supports frame aggregates going up to 1MB so these queues pretty
much need to be this long if you want to be able to get the highest
possible throughput.


Michał

>
>>
>> Thanks,
>> Ben
>>
>>
>> On 05/25/2015 04:00 PM, Sebastian Gottschall wrote:
>>>
>>> On 26.05.2015 at 00:39, Ben Greear wrote:
>>>>
>>>> Default firmware has a hard-coded minimum number of tx buffers
>>>> (somewhere
>>>> more than 1k buffers I think).  Maybe driver is allocating all this
>>>> memory somehow?
>>>>
>>>> If you do one-way traffic tests (udp), I wonder if you can tell if it is
>>>> tx
>>>> or rx that consumes the memory?
>>>
>>> its tx. i have a ethernet over ip tunnel running on that link and i
>>> broadcast iptv in that way. (its my way to convert multicast to unicast)
>>> the tunnel itself is rfc ethernet over ip, which is somewhat like udp.
>>> so connectionless protocol
>>>
>>> Sebastian
>>>>
>>>>
>>>> CT firmware can be configured to use any multiple-of-8 amount of tx
>>>> buffers, though I have not tested below around 600.
>>>>
>>>> Thanks,
>>>> Ben
>>>>
>>>> On 05/25/2015 02:26 PM, Sebastian Gottschall wrote:
>>>>>
>>>>> today using the latest testing driver, i found out the memory
>>>>> consumption is unbelievable high.
>>>>> my router here has 64 mb ram. this ram is fully taken after some
>>>>> minutes by ath10k. but only if data flow present.
>>>>>
>>>>> here the results of "free" after some minutes
>>>>> root@DD-WRT:~# free
>>>>>              total         used         free       shared      buffers
>>>>> Mem:         61636        58752         2884            0         2600
>>>>> -/+ buffers:              56152         5484
>>>>> Swap:            0            0            0
>>>>>
>>>>>
>>>>> now i terminate hostapd which controls the ath10k chipset
>>>>>
>>>>>
>>>>> root@DD-WRT:~# kill 902
>>>>> root@DD-WRT:~# free
>>>>>              total         used         free       shared      buffers
>>>>> Mem:         61636        23212        38424            0         2416
>>>>> -/+ buffers:              20796        40840
>>>>> Swap:            0            0            0
>>>>>
>>>>>
>>>>> you see the difference?
>>>>>
>>>>>
>>>>> regards,
>>>>> Sebastian Gottschall
>>>>>
>>>>
>>>
>>
>
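The buffer figures Michal quotes can be turned into a quick back-of-the-envelope calculation. This is a rough sketch, not driver code: the ring and buffer counts are the ones from his email, while the MSDU size (1500 bytes) and the per-frame skb overhead (256 bytes) are placeholder assumptions chosen only for illustration, and the function names are hypothetical.

```python
# Counts and sizes below come from the email unless marked as assumed.
RX_RING_ENTRIES = 1023
RX_BUF_BYTES = 1920
CE_POOLS = [(2048, 512), (2048, 128)]  # HTT and WMI, both target->host
TX_DESCRIPTORS = 1424


def rx_ring_bytes():
    """Memory held by the Rx ring, ignoring allocation and metadata waste."""
    return RX_RING_ENTRIES * RX_BUF_BYTES


def ce_bytes():
    """Memory held by the CE buffers listed in the email."""
    return sum(count * size for count, size in CE_POOLS)


def tx_bytes(msdu_size, skb_overhead):
    """Extra memory when all Tx descriptors are in use.

    msdu_size and skb_overhead are caller-supplied assumptions; the email
    only gives the formula 1424 * (MSDU size + sizeof(skbuff)).
    """
    return TX_DESCRIPTORS * (msdu_size + skb_overhead)


def mib(nbytes):
    return nbytes / (1024.0 * 1024.0)


if __name__ == "__main__":
    print("Rx ring: %.2f MiB" % mib(rx_ring_bytes()))
    print("CE bufs: %.2f MiB" % mib(ce_bytes()))
    # 1500-byte MSDUs and 256 bytes of skb overhead are assumed values.
    print("Tx full: %.2f MiB" % mib(tx_bytes(1500, 256)))
```

The two itemized pools come to about 1.9 MiB and 1.25 MiB. With the assumed frame size, a full set of Tx descriptors would add roughly 2.4 MiB on top, before counting any frames held in qdiscs.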
On Tue, May 26, 2015 at 07:42:35AM +0200, Michal Kazior wrote:
> On 26 May 2015 at 02:07, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
> > On 26.05.2015 at 01:42, Ben Greear wrote:
> >>
> >> Can you test with ath9k to make sure it is actually ath10k related?
> >
> > already tested. this device has 2 chipsets. one is ath9k based and the
> > second is ath10k based. :-)
> > only if i kill the hostapd process which controls ath10k. the memory waste
> > is gone
>
> Keep in mind that hostapd itself requires memory to function as well.
> Each process (and thread) need some internal kernel memory (stack, et
> al).
>
I have seen a similar issue on long-running (many hours) tests in mbssid mode with
multiple clients. Killing hostapd regains the memory.

[<c021dd44>] (unwind_backtrace) from [<c021ae0c>] (show_stack+0x10/0x14)
[<c021ae0c>] (show_stack) from [<c0336b9c>] (dump_stack+0x88/0xcc)
[<c0336b9c>] (dump_stack) from [<c0279804>] (dump_header.isra.11+0x64/0x178)
[<c0279804>] (dump_header.isra.11) from [<c0279b10>] (oom_kill_process+0x70/0x384)
[<c0279b10>] (oom_kill_process) from [<c027a2a0>] (out_of_memory+0x2d4/0x304)
[<c027a2a0>] (out_of_memory) from [<c027d180>] (__alloc_pages_nodemask+0x608/0x664)
[<c027d180>] (__alloc_pages_nodemask) from [<c0278780>] (filemap_fault+0x1f8/0x390)
[<c0278780>] (filemap_fault) from [<c028f45c>] (__do_fault+0xa4/0x42c)
[<c028f45c>] (__do_fault) from [<c0292494>] (handle_mm_fault+0x230/0x7b0)
[<c0292494>] (handle_mm_fault) from [<c021f70c>] (do_page_fault+0x114/0x26c)
[<c021f70c>] (do_page_fault) from [<c0208440>] (do_PrefetchAbort+0x34/0x98)

Need to check whether it is a regression or not.

-Rajkumar
On 26.05.2015 at 07:42, Michal Kazior wrote:
> On 26 May 2015 at 02:07, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
>> On 26.05.2015 at 01:42, Ben Greear wrote:
>>> Can you test with ath9k to make sure it is actually ath10k related?
>> already tested. this device has 2 chipsets. one is ath9k based and the
>> second is ath10k based. :-)
>> only if i kill the hostapd process which controls ath10k. the memory waste
>> is gone
> Keep in mind that hostapd itself requires memory to function as well.
> Each process (and thread) need some internal kernel memory (stack, et
> al).
>
I know. 1.8 MB is what I see in userspace. The hostapd instances controlling ath9k and ath10k use the same amount of memory; there is no difference between them.
Hostapd never takes 50 MB. Consider that this embedded device has just 64 MB RAM (dlink-dir859).
>
>> yes. slowly. its fluctuating. so sometimes there is 30 mb free again and
>> seconds later just 2 mb. so very heavy changes. on bigger routers with more
>> than 64 mb (i have a second here with 128 mb)
>> the total consumption stabilizes at 45 - 50 mb for the driver only which is
>> still too much for sure. so it may not a leak. but ath10k or the firmware
>> is wasting too much memory for embedded devices
>> and ar9880 is just used on embedded devices almost
> Using `free` is a pretty poor way of assessing memory usage of a
> kernel driver. It reports how much the OS has memory available to
> userspace immediately (kernel recycles some memory for performance
> reasons, e.g. SLAB does it). There's a lot of metadata too so what you
> actually see is many other things that involve ath10k being used.
I checked meminfo as well, but you don't see any other differences in it. It shows differences only in the same values as free. All other SLAB-related info etc. is not changing.
>
> The driver itself should be consuming around 5MB of memory at idle
> (interface up, no significant traffic). Most of this goes for the Rx
> ring which has 1023*1920 bytes (+/- allocation and metadata waste).
> Then there's a bunch of CE buffers as well which take up some memory
> (used for driver-firmware communication), e.g. 2048*512 + 2048*128
> (HTT and WMI, both target->host).
>
> When Txing it may eat up additional 1424 * (MSDU size +
> sizeof(skbuff)). Note that Tx queues can be longer - driver isn't
> aware of qdiscs and those can store frames as well.
>
> 11ac supports frame aggregates going up to 1MB so these queues pretty
> much need to be this long if you want to be able to get highest
> possible throughput.
Yes, but ath10k has its main usage on embedded devices, at least for AR9880 chipsets, since there is not even a Windows driver available for AR9880.
So now consider that ath10k is not able to run with good stability on such devices, whereas the QCA LSDK driver does not seem to have that big a resource problem.
So it doesn't make much sense to go on in this way. This resource problem must be solved. About 50 MB is really too much.

Sebastian
>
>
> Michał
>
>>> Thanks,
>>> Ben
>>>
>>>
>>> On 05/25/2015 04:00 PM, Sebastian Gottschall wrote:
>>>> On 26.05.2015 at 00:39, Ben Greear wrote:
>>>>> Default firmware has a hard-coded minimum number of tx buffers
>>>>> (somewhere
>>>>> more than 1k buffers I think).  Maybe driver is allocating all this
>>>>> memory somehow?
>>>>>
>>>>> If you do one-way traffic tests (udp), I wonder if you can tell if it is
>>>>> tx
>>>>> or rx that consumes the memory?
>>>> its tx. i have a ethernet over ip tunnel running on that link and i
>>>> broadcast iptv in that way. (its my way to convert multicast to unicast)
>>>> the tunnel itself is rfc ethernet over ip, which is somewhat like udp.
>>>> so connectionless protocol
>>>>
>>>> Sebastian
>>>>>
>>>>> CT firmware can be configured to use any multiple-of-8 amount of tx
>>>>> buffers, though I have not tested below around 600.
>>>>>
>>>>> Thanks,
>>>>> Ben
>>>>>
>>>>> On 05/25/2015 02:26 PM, Sebastian Gottschall wrote:
>>>>>> today using the latest testing driver, i found out the memory
>>>>>> consumption is unbelievable high.
>>>>>> my router here has 64 mb ram. this ram is fully taken after some
>>>>>> minutes by ath10k. but only if data flow present.
>>>>>>
>>>>>> here the results of "free" after some minutes
>>>>>> root@DD-WRT:~# free
>>>>>>              total         used         free       shared      buffers
>>>>>> Mem:         61636        58752         2884            0         2600
>>>>>> -/+ buffers:              56152         5484
>>>>>> Swap:            0            0            0
>>>>>>
>>>>>>
>>>>>> now i terminate hostapd which controls the ath10k chipset
>>>>>>
>>>>>>
>>>>>> root@DD-WRT:~# kill 902
>>>>>> root@DD-WRT:~# free
>>>>>>              total         used         free       shared      buffers
>>>>>> Mem:         61636        23212        38424            0         2416
>>>>>> -/+ buffers:              20796        40840
>>>>>> Swap:            0            0            0
>>>>>>
>>>>>>
>>>>>> you see the difference?
>>>>>>
>>>>>>
>>>>>> regards,
>>>>>> Sebastian Gottschall
>>>>>>
good point. ath10k is configured with one additional vap for me. but not
multi client. both vap's are running in ap mode.
let me send you my hostapd config here. passphrases have been masked

driver=nl80211
ctrl_interface=/var/run/hostapd
wmm_ac_bk_cwmin=4
wmm_ac_bk_cwmax=10
wmm_ac_bk_aifs=7
wmm_ac_bk_txop_limit=0
wmm_ac_bk_acm=0
wmm_ac_be_aifs=3
wmm_ac_be_cwmin=4
wmm_ac_be_cwmax=10
wmm_ac_be_acm=0
wmm_ac_vi_aifs=2
wmm_ac_vi_cwmin=3
wmm_ac_vi_cwmax=4
wmm_ac_vi_txop_limit=94
wmm_ac_vi_acm=0
wmm_ac_vo_aifs=2
wmm_ac_vo_cwmin=2
wmm_ac_vo_cwmax=3
wmm_ac_vo_txop_limit=47
wmm_ac_vo_acm=0
tx_queue_data3_aifs=7
tx_queue_data3_cwmin=15
tx_queue_data3_cwmax=1023
tx_queue_data3_burst=0
tx_queue_data2_aifs=3
tx_queue_data2_cwmin=15
tx_queue_data2_cwmax=63
tx_queue_data1_aifs=1
tx_queue_data1_cwmin=7
tx_queue_data1_cwmax=15
tx_queue_data1_burst=3.0
tx_queue_data0_aifs=1
tx_queue_data0_cwmin=3
tx_queue_data0_cwmax=7
tx_queue_data0_burst=1.5
country_code=DE
tx_queue_data2_burst=2.0
wmm_ac_be_txop_limit=64
ieee80211n=1
dynamic_ht40=0
ht_capab=[HT40+][LDPC][SHORT-GI-20][SHORT-GI-40][TX-STBC][RX-STBC1][DSSS_CCK-40]
vht_capab=[RXLDPC][SHORT-GI-80][TX-STBC-2BY1][RX-STBC1][RX-ANTENNA-PATTERN][TX-ANTENNA-PATTERN][MAX-MPDU-11454][MAX-A-MPDU-LEN-EXP7]
ieee80211ac=1
vht_oper_chwidth=1
vht_oper_centr_freq_seg0_idx=106
hw_mode=a
channel=100
frequency=5500
beacon_int=100
dtim_period=2

interface=ath1
disassoc_low_ack=1
wds_sta=1
wmm_enabled=1
bssid=E8:CC:18:FF:E0:A4
ignore_broadcast_ssid=0
max_num_sta=256
ssid=dd-wrt-NA-5
bridge=br0
logger_syslog=-1
logger_syslog_level=2
logger_stdout=-1
logger_stdout_level=2
dump_file=/tmp/hostapd.dump
eapol_version=1
eapol_key_index_workaround=0
wpa=2
wpa_passphrase=***********
wpa_key_mgmt=WPA-PSK
wpa_pairwise=CCMP
wpa_group_rekey=3600

bss=ath1.1
disassoc_low_ack=1
wmm_enabled=1
bssid=EA:CC:18:FF:E0:A4
ignore_broadcast_ssid=0
max_num_sta=256
ssid=dd-wrt-TV
bridge=br0
logger_syslog=-1
logger_syslog_level=2
logger_stdout=-1
logger_stdout_level=2
dump_file=/tmp/hostapd.dump
eapol_version=1
eapol_key_index_workaround=0
wpa=2
wpa_passphrase=************
wpa_key_mgmt=WPA-PSK
wpa_pairwise=CCMP
wpa_group_rekey=3600

On 26.05.2015 at 08:20, Rajkumar Manoharan wrote:
> On Tue, May 26, 2015 at 07:42:35AM +0200, Michal Kazior wrote:
>> On 26 May 2015 at 02:07, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
>>> On 26.05.2015 at 01:42, Ben Greear wrote:
>>>> Can you test with ath9k to make sure it is actually ath10k related?
>>> already tested. this device has 2 chipsets. one is ath9k based and the
>>> second is ath10k based. :-)
>>> only if i kill the hostapd process which controls ath10k. the memory waste
>>> is gone
>> Keep in mind that hostapd itself requires memory to function as well.
>> Each process (and thread) need some internal kernel memory (stack, et
>> al).
>>
> Have seen similar issue long hours run in mbssid mode with multi-client.
> Killing hostapd regains memory.
>
> [<c021dd44>] (unwind_backtrace) from [<c021ae0c>] (show_stack+0x10/0x14)
> [<c021ae0c>] (show_stack) from [<c0336b9c>] (dump_stack+0x88/0xcc)
> [<c0336b9c>] (dump_stack) from [<c0279804>] (dump_header.isra.11+0x64/0x178)
> [<c0279804>] (dump_header.isra.11) from [<c0279b10>] (oom_kill_process+0x70/0x384)
> [<c0279b10>] (oom_kill_process) from [<c027a2a0>] (out_of_memory+0x2d4/0x304)
> [<c027a2a0>] (out_of_memory) from [<c027d180>] (__alloc_pages_nodemask+0x608/0x664)
> [<c027d180>] (__alloc_pages_nodemask) from [<c0278780>] (filemap_fault+0x1f8/0x390)
> [<c0278780>] (filemap_fault) from [<c028f45c>] (__do_fault+0xa4/0x42c)
> [<c028f45c>] (__do_fault) from [<c0292494>] (handle_mm_fault+0x230/0x7b0)
> [<c0292494>] (handle_mm_fault) from [<c021f70c>] (do_page_fault+0x114/0x26c)
> [<c021f70c>] (do_page_fault) from [<c0208440>] (do_PrefetchAbort+0x34/0x98)
>
> Need to check whether it is a regression or not.
>
> -Rajkumar
>
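The "2 vap's" setup in the config above can be read off mechanically: in hostapd.conf, `interface=` names the first BSS and each `bss=` line starts an additional one. The following is a minimal Python sketch of that reading. The function name `list_bss_sections` and the shortened `EXCERPT` are illustrative assumptions, not part of hostapd or of the posted config.

```python
def list_bss_sections(config_text):
    """Return the interface names of all BSS sections in hostapd config text.

    `interface=` names the first BSS; every `bss=` line starts another one.
    Blank lines and `#` comments are skipped.
    """
    names = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key in ("interface", "bss"):
            names.append(value)
    return names


# Shortened excerpt of the config posted above (most keys omitted).
EXCERPT = """\
driver=nl80211
interface=ath1
ssid=dd-wrt-NA-5
max_num_sta=256
bss=ath1.1
ssid=dd-wrt-TV
max_num_sta=256
"""

if __name__ == "__main__":
    sections = list_bss_sections(EXCERPT)
    print(f"{len(sections)} BSS sections: {sections}")
```

Run against the full config, the same function should report `ath1` and `ath1.1`, matching the description of one primary interface plus one additional VAP.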
On 26 May 2015 at 09:23, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
> On 26.05.2015 at 07:42, Michal Kazior wrote:
[...]
>> The driver itself should be consuming around 5MB of memory at idle
>> (interface up, no significant traffic). Most of this goes for the Rx
>> ring which has 1023*1920 bytes (+/- allocation and metadata waste).
>> Then there's a bunch of CE buffers as well which take up some memory
>> (used for driver-firmware communication), e.g. 2048*512 + 2048*128
>> (HTT and WMI, both target->host).
>>
>> When Txing it may eat up additional 1424 * (MSDU size +
>> sizeof(skbuff)). Note that Tx queues can be longer - driver isn't
>> aware of qdiscs and those can store frames as well.
>>
>> 11ac supports frame aggregates going up to 1MB so these queues pretty
>> much need to be this long if you want to be able to get highest
>> possible throughput.
>
> yes. but ath10k has its main usage on embedded devices. at least for AR9880
> chipsets

I'm aware of that.

> since there is not even a windows driver available for AR9880.
> so now consider that ath10k is not able to run on devices with good
> stability, where the QCA LSDK Driver
> does not seem to have that big resource problem.

Did you measure LSDK the same way within same conditions? Same libc,
same kernel, etc?

Do you see OOMs? What stability issues are we talking about?

Did you try stressing the system by actually trying to consume memory
until it's run out to see how much memory is _really_ left for the
system to use?

> so it doesnt make much sense to go on here in this way.
> this resource problem must be solved. about 50 MB is really too much.

I don't see this much memory being used with ath10k in my x86_64
virtual machine even with `free`. I see ~10MB of less "free" memory
after starting hostapd and running traffic for some time vs no hostapd
and ath10k stopped.

I don't even see how ath10k could take 50MB directly. Perhaps there's
some lazy memory recycling going on in the system? Maybe more memory
is effectively consumed (compared to ath9k) due to alignment
requirements or memory paging (which become more apparent with
increased number of allocations)?


Michał
On 26.05.2015 at 10:26, Michal Kazior wrote:
> On 26 May 2015 at 09:23, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
>> On 26.05.2015 at 07:42, Michal Kazior wrote:
> [...]
>>> The driver itself should be consuming around 5MB of memory at idle
>>> (interface up, no significant traffic). Most of this goes for the Rx
>>> ring which has 1023*1920 bytes (+/- allocation and metadata waste).
>>> Then there's a bunch of CE buffers as well which take up some memory
>>> (used for driver-firmware communication), e.g. 2048*512 + 2048*128
>>> (HTT and WMI, both target->host).
>>>
>>> When Txing it may eat up additional 1424 * (MSDU size +
>>> sizeof(skbuff)). Note that Tx queues can be longer - driver isn't
>>> aware of qdiscs and those can store frames as well.
>>>
>>> 11ac supports frame aggregates going up to 1MB so these queues pretty
>>> much need to be this long if you want to be able to get highest
>>> possible throughput.
>> yes. but ath10k has its main usage on embedded devices. at least for AR9880
>> chipsets
> I'm aware of that.
>
>
>> since there is not even a windows driver available for AR9880.
>> so now consider that ath10k is not able to run on devices with good
>> stability, where the QCA LSDK Driver
>> does not seem to have that big resource problem.
> Did you measure LSDK the same way within same conditions? Same libc,
> same kernel, etc?
i measured userspace memory consumption. and everything that cannot be
seen there can be counted as taken by the kernel.
>
> Do you see OOMs? What stability issues are we talking about?
>
> Did you try stressing the system by actually trying to consume memory
> until it's run out to see how much memory is _really_ left for the
> system to use?
no. the original dlink-dir859 firmware based on qca lsdk, does not provide
oom's
but with ath10k i was able to crash my device, since it was running out of
memory.
and i dont need to stress the system. running with one single client and 8
mbit tx flow is enough to just have 2 mb ram free on a 64 mb system
>
>
>> so it doesnt make much sense to go on here in this way.
>> this resource problem must be solved. about 50 MB is really too much.
> I don't see this much memory being used with ath10k in my x86_64
> virtual machine even with `free`. I see ~10MB of less "free" memory
> after starting hostapd and running traffic for some time vs no hostapd
> and ath10k stopped.
you wont see the memory taken that easy and your x64 system has likely a lot
of ram, so you dont notice that 50 mb are just taken by ath10k.
if you kill the hostapd process of ath10k, you will see the difference
likely.
one point here raised up, is that qca is aware of high memory consumption
with vap's
my example has 2 vap's. i already provided a config file for hostapd on this
mailing list
>
> I don't even see how ath10k could take 50MB directly. Perhaps there's
> some lazy memory recycling going on in the system? Maybe more memory
> is effectively consumed (compared to ath9k) due to alignment
> requirements or memory paging (which become more apparent with
> increased number of allocations)?
ath9k takes about 2 - 3 mb ram, if i compare the consumption before and
after destroying a running ath9k hostapd instance.
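The buffer figures quoted in this exchange can be added up and set next to the drop in the "used" column between the two `free` printouts from the original report. The following is a minimal Python sketch that only redoes that arithmetic. The 1500-byte MSDU size and the 256-byte stand-in for `sizeof(skbuff)` are assumptions chosen for illustration; neither comes from the thread or from the driver.

```python
MIB = 1024 * 1024

# Figures as quoted in the thread (bytes).
RX_RING_BYTES = 1023 * 1920   # Rx ring
CE_HTT_BYTES = 2048 * 512     # CE buffers, HTT (target->host)
CE_WMI_BYTES = 2048 * 128     # CE buffers, WMI (target->host)
TX_MAX_FRAMES = 1424          # multiplier in the quoted Tx term

# Assumptions for illustration only; not taken from the thread.
ASSUMED_MSDU_BYTES = 1500
ASSUMED_SKB_BYTES = 256       # stand-in for sizeof(skbuff)


def idle_buffers_bytes():
    """Sum of the buffers itemized for the idle case."""
    return RX_RING_BYTES + CE_HTT_BYTES + CE_WMI_BYTES


def tx_bytes(msdu_bytes=ASSUMED_MSDU_BYTES, skb_bytes=ASSUMED_SKB_BYTES):
    """The quoted Tx term: 1424 * (MSDU size + sizeof(skbuff))."""
    return TX_MAX_FRAMES * (msdu_bytes + skb_bytes)


def used_drop_kb(used_before, used_after):
    """Drop in the `free` "used" column, in the units `free` printed."""
    return used_before - used_after


if __name__ == "__main__":
    print(f"itemized idle buffers: {idle_buffers_bytes() / MIB:.2f} MiB")
    print(f"Tx term (assumed sizes): {tx_bytes() / MIB:.2f} MiB")
    # "used" before and after killing hostapd, from the original report.
    print(f"drop in used: {used_drop_kb(58752, 23212) / 1024:.1f} MiB")
```

With these inputs the itemized buffers come to about 3.1 MiB and the Tx term to about 2.4 MiB, while the two printouts differ by about 34.7 MiB in the used column. The sketch does not explain that gap; it only puts the quoted numbers side by side.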
On 26 May 2015 at 10:37, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
> On 26.05.2015 at 10:26, Michal Kazior wrote:
>>
>> On 26 May 2015 at 09:23, Sebastian Gottschall <s.gottschall@dd-wrt.com>
>> wrote:
[...]
>> Do you see OOMs? What stability issues are we talking about?
>>
>> Did you try stressing the system by actually trying to consume memory
>> until it's run out to see how much memory is _really_ left for the
>> system to use?
>
> no. the original dlink-dir859 firmware based on qca lsdk, does not provide
> oom's
> but with ath10k i was able to crash my device, since it was running out of
> memory.

How did it crash, i.e. did you manage to get a call trace? If not, can
you connect UART to the system and get one, please?

> and i dont need to stress the system. running with one single client and 8
> mbit tx flow is enough to just have 2 mb ram free on a 64 mb system

1. Is the router acting as an endpoint in the traffic or a bridge?
2. So does it crash or is free memory just low during traffic? It's
   not clear to me.

>>> so it doesnt make much sense to go on here in this way.
>>> this resource problem must be solved. about 50 MB is really too much.
>>
>> I don't see this much memory being used with ath10k in my x86_64
>> virtual machine even with `free`. I see ~10MB of less "free" memory
>> after starting hostapd and running traffic for some time vs no hostapd
>> and ath10k stopped.
>
> you wont see the memory taken that easy and your x64 system has likely a lot
> of ram, so you dont notice that 50 mb are just taken by ath10k.

The amount of memory in a virtual machine doesn't matter. If anything
I should be seeing _more_ memory being consumed since kernel should be
more relaxed due to smaller memory pressure.

I have a very bare VM if you're implying I have a lot of background noise.

If you're still doubting here's a couple of printouts (I've run my VM
with 64MB of RAM; some of it is obviously reserved and unreachable):

user processes:
>     1 ?        S      0:01 /bin/sh /init
>  1189 ?        Ss     0:00 udevd --daemon
>  1471 ?        Ss     0:00 /usr/sbin/sshd
>  1530 ttyS0    Ss+    0:00 /bin/login -f
>  1533 ttyS0    S+     0:00  \_ -rc
>  1564 ttyS0    R+     0:00      \_ ps fax
(everything else is kernel threads)

after boot (ath10k module loaded and probed):
>              total       used       free     shared    buffers     cached
> Mem:         46928      33808      13120        152          0       5156
> -/+ buffers/cache:      28652      18276
> Swap:            0          0          0

hostapd+iperf:
>              total       used       free     shared    buffers     cached
> Mem:         46928      44672       2256        440          0       2952
> -/+ buffers/cache:      41720       5208
> Swap:            0          0          0

hostapd (no iperf):
>              total       used       free     shared    buffers     cached
> Mem:         46928      42436       4492        500          0       2784
> -/+ buffers/cache:      39652       7276
> Swap:            0          0          0

hostapd stopped:
>              total       used       free     shared    buffers     cached
> Mem:         46928      32220      14708        388          0       2604
> -/+ buffers/cache:      29616      17312
> Swap:            0          0          0

ath10k_pci and ath10k_core unloaded:
>              total       used       free     shared    buffers     cached
> Mem:         46928      28552      18376        144          0       4712
> -/+ buffers/cache:      23840      23088
> Swap:            0          0          0

While running iperf I was able to get 400mbps+ of UDP traffic with
another 2x2 11ac device without much trouble.

Do note: The VM is running a glibc based system and has kernel and
modules with full debugging hence the high base memory usage. Yet it
still manages to work just fine.

> if you kill the hostapd process of ath10k, you will see the difference
> likely.
> one point here raised up, is that qca is aware of high memory consumption
> with vap's
> my example has 2 vap's. i already provided a config file for hostapd on this
> mailing list

You must be aware you can't compare ath10k to LSDK apples to apples.
Their QSDK includes kernel customizations which makes it nearly
impossible to compare. They may have some fixes for the platform
itself that haven't been upstreamed for what it's worth.


Michał
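The five VM printouts above are easier to compare as differences against one baseline. The following is a minimal Python sketch, assuming the "used" value from the "-/+ buffers/cache" row is the figure of interest. The state labels and the helper name `deltas_kb` are illustrative choices, not anything from the thread.

```python
# "used" from the "-/+ buffers/cache" row of each printout above,
# in the units `free` printed.
USED_KB = {
    "after boot (ath10k loaded)": 28652,
    "hostapd + iperf": 41720,
    "hostapd, no iperf": 39652,
    "hostapd stopped": 29616,
    "ath10k modules unloaded": 23840,
}


def deltas_kb(used, baseline):
    """Return each state's "used" value minus that of the baseline state."""
    base = used[baseline]
    return {state: value - base for state, value in used.items()}


if __name__ == "__main__":
    for state, delta in deltas_kb(USED_KB, "hostapd stopped").items():
        print(f"{state:28s} {delta:+7d} ({delta / 1024:+.1f} MiB)")
```

Against the "hostapd stopped" baseline, the "hostapd + iperf" state is 12104 higher (about 11.8 MiB), in line with the roughly 10MB figure mentioned earlier in the thread. Against the "modules unloaded" baseline it is 17880 higher.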
On 26.05.2015 at 11:21, Michal Kazior wrote:
> On 26 May 2015 at 10:37, Sebastian Gottschall <s.gottschall@dd-wrt.com> wrote:
>> On 26.05.2015 at 10:26, Michal Kazior wrote:
>>> On 26 May 2015 at 09:23, Sebastian Gottschall <s.gottschall@dd-wrt.com>
>>> wrote:
> [...]
>>> Do you see OOMs? What stability issues are we talking about?
>>>
>>> Did you try stressing the system by actually trying to consume memory
>>> until it's run out to see how much memory is _really_ left for the
>>> system to use?
>> no. the original dlink-dir859 firmware based on qca lsdk, does not provide
>> oom's
>> but with ath10k i was able to crash my device, since it was running out of
>> memory.
> How did it crash, i.e. did you manage to get a call trace? If not, can
> you connect UART to the system and get one, please?
no real crash. it was an out of memory hang. so the userspace will not work
correctly anymore
>
>
>> and i dont need to stress the system. running with one single client and 8
>> mbit tx flow is enough to just have 2 mb ram free on a 64 mb system
> 1. Is the router acting as an endpoint in the traffic or a bridge?
> 2. So does it crash or is free memory just low during traffic? It's
> not clear to me.
dlink asked me to port this device to dd-wrt, so the router can be in any
situation. right now it is configured as a standard access point with 2
interfaces for ath10k. (see the hostapd config i provided earlier today, it
clearly shows how it is configured)
and the crash is pure out of memory. the traffic is a constant tx flow of
about 8 mbit
>
>>>> so it doesnt make much sense to go on here in this way.
>>>> this resource problem must be solved. about 50 MB is really too much.
>>> I don't see this much memory being used with ath10k in my x86_64
>>> virtual machine even with `free`. I see ~10MB of less "free" memory
>>> after starting hostapd and running traffic for some time vs no hostapd
>>> and ath10k stopped.
>> you wont see the memory taken that easy and your x64 system has likely a lot
>> of ram, so you dont notice that 50 mb are just taken by ath10k.
> The amount of memory in a virtual machine doesn't matter. If anything
> I should be seeing _more_ memory being consumed since kernel should be
> more relaxed due to smaller memory pressure.
if the userspace has no memory left, the kernel will invoke the oom handler
>
> I have a very bare VM if you're implying I have a lot of background noise.
>
> If you're still doubting here's a couple of printouts (I've run my VM
> with 64MB of RAM; some of it is obviously reserved and unreachable):
>
> user processes:
>>     1 ?        S      0:01 /bin/sh /init
>>  1189 ?        Ss     0:00 udevd --daemon
>>  1471 ?        Ss     0:00 /usr/sbin/sshd
>>  1530 ttyS0    Ss+    0:00 /bin/login -f
>>  1533 ttyS0    S+     0:00  \_ -rc
>>  1564 ttyS0    R+     0:00      \_ ps fax
yes. i have a lot of ram free. the system itself just takes 16 - 20 mb ram
out of 64 mb. if i now start the ath10k interface, the whole system memory
is almost gone. (if traffic is flowing)
> (everything else is kernel threads)
>
> after boot (ath10k module loaded and probed):
>>              total       used       free     shared    buffers     cached
>> Mem:         46928      33808      13120        152          0       5156
>> -/+ buffers/cache:      28652      18276
>> Swap:            0          0          0
use the config i provided and generate some traffic. then you will see that
the remaining memory runs down to zero
> hostapd+iperf:
>>              total       used       free     shared    buffers     cached
>> Mem:         46928      44672       2256        440          0       2952
>> -/+ buffers/cache:      41720       5208
>> Swap:            0          0          0
you see it already here.
> hostapd (no iperf):
>>              total       used       free     shared    buffers     cached
>> Mem:         46928      42436       4492        500          0       2784
>> -/+ buffers/cache:      39652       7276
>> Swap:            0          0          0
> hostapd stopped:
>>              total       used       free     shared    buffers     cached
>> Mem:         46928      32220      14708        388          0       2604
>> -/+ buffers/cache:      29616      17312
>> Swap:            0          0          0
> ath10k_pci and ath10k_core unloaded:
>>              total       used       free     shared    buffers     cached
>> Mem:         46928      28552      18376        144          0       4712
>> -/+ buffers/cache:      23840      23088
>> Swap:            0          0          0
> While running iperf I was able to get 400mbps+ of UDP traffic with
> another 2x2 11ac device without much trouble.
>
> Do note: The VM is running a glibc based system and has kernel and
> modules with full debugging hence the high base memory usage. Yet it
> still manages to work just fine.
mine is musl based. no debugging besides this
>
>
>> if you kill the hostapd process of ath10k, you will see the difference
>> likely.
>> one point here raised up, is that qca is aware of high memory consumption
>> with vap's
>> my example has 2 vap's. i already provided a config file for hostapd on this
>> mailing list
> You must be aware you can't compare ath10k to LSDK apples to apples.
> Their QSDK includes kernel customizations which makes it nearly
> impossible to compare. They may have some fixes for the platform
> itself that haven't been upstreamed for what it's worth.
i know. but what we want to reach is that ath10k can be used in routers
with 64 mb ram. right now that is not enough, and dd-wrt is highly optimized
for a small memory footprint. but it also has features like nas storage or
even freeradius which cannot be used on such devices if all memory is
already taken by a single driver
>
>
> Michał
>
Kalle Valo <kvalo@qca.qualcomm.com> writes:

>>>> Fixes: c17c997d5613 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next")
>>>> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
>>>
>>> Apparently this also fixes some weird issues with qca6174 hw2.1 notably:
>>>  - ath10k causing disconnecting of other devices in a BSS
>>>  - random Fw crashes
>>>
>>> Both problems started to happen because c17c997d5613 enabled monitor
>>> vdev by default on STA interfaces. It seems that qca6174 hw2.1
>>> firmware has issues similar to those of qca988x 999.999.0.636
>>> regarding monitor vdev operation.
>>>
>>> Also, I've made a typo in the subject.
>>>
>>> I'll post v2 with subject fixed and extended commit log later.
>>
>> Keep in mind that c17c997d5613 is actually from wireless-testing.git
>> which means that it will never go to wireless-drivers-next.git nor to
>> net-next.git. So the merge conflict bug is purely in
>> wireless-testing.git and in master branch of ath.git (but not in
>> ath-next branch!).
>>
>> I think John should apply your v2 patch once you send it. But if you
>> have something which should be fixed in ath-next remember to send that
>> in a separate patch so that I can apply that directly to ath-next.
>
> Actually now that Dave pulled my pull request the issue is fixed in
> wireless-drivers-next already. So once John pulls from
> wireless-drivers-next and makes sure that ath10k is 100% identical in
> both trees the issue should be sorted out and no need for extra patches.

John now fixed this in wireless-testing, thanks John. And I now updated
ath.git master branch so it should be ok as well. Please let me know if
there are still problems.
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 425dbe271495..594eb369ff7f 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -1031,22 +1031,6 @@ static int ath10k_monitor_stop(struct ath10k *ar)
 	return 0;
 }
 
-static bool ath10k_mac_should_disable_promisc(struct ath10k *ar)
-{
-	struct ath10k_vif *arvif;
-
-	if (!ar->num_started_vdevs)
-		return false;
-
-	list_for_each_entry(arvif, &ar->arvifs, list)
-		if (arvif->vdev_type != WMI_VDEV_TYPE_AP)
-			return false;
-
-	ath10k_dbg(ar, ATH10K_DBG_MAC,
-		   "mac disabling promiscuous mode because vdev is started\n");
-	return true;
-}
-
 static bool ath10k_mac_monitor_vdev_is_needed(struct ath10k *ar)
 {
 	int num_ctx;
@@ -1065,7 +1049,6 @@ static bool ath10k_mac_monitor_vdev_is_needed(struct ath10k *ar)
 		return false;
 
 	return ar->monitor ||
-	       !ath10k_mac_should_disable_promisc(ar) ||
 	       test_bit(ATH10K_CAC_RUNNING, &ar->dev_flags);
 }
 
@@ -1267,7 +1250,7 @@ static int ath10k_vdev_start_restart(struct ath10k_vif *arvif,
 {
 	struct ath10k *ar = arvif->ar;
 	struct wmi_vdev_start_request_arg arg = {};
-	int ret = 0, ret2;
+	int ret = 0;
 
 	lockdep_assert_held(&ar->conf_mutex);
 
@@ -1326,16 +1309,6 @@ static int ath10k_vdev_start_restart(struct ath10k_vif *arvif,
 	ar->num_started_vdevs++;
 	ath10k_recalc_radar_detection(ar);
 
-	ret = ath10k_monitor_recalc(ar);
-	if (ret) {
-		ath10k_warn(ar, "mac failed to recalc monitor for vdev %i restart %d: %d\n",
-			    arg.vdev_id, restart, ret);
-		ret2 = ath10k_vdev_stop(arvif);
-		if (ret2)
-			ath10k_warn(ar, "mac failed to stop vdev %i restart %d: %d\n",
-				    arg.vdev_id, restart, ret2);
-	}
-
 	return ret;
 }
 
Patch df1404650ccb ("mac80211: remove support for
IFF_PROMISC") removed promiscuous flag propagation
to drivers.

However the patch was designed against ath10k
without 548462133d98 ("ath10k: fix interrupt
storm").

After merge the code drifted into being no longer
correct and due to monitor vdev being
overzealously started caused IBSS to crash on
999.999.0.636 for QCA988X (this firmware revision
is known to have issues with monitor vdev).

This patch keeps expectations of commit
548462133d98 (i.e. reduce irq storm by not
enabling monitor vdev for AP) and doesn't break
existing (known) setups that imply promiscuous
mode on network interfaces.

Contrary to what it looks like 548462133d98
functionality is not reverted since the intention
was a subset of what df1404650ccb did.

Fixes: c17c997d5613 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next")
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/mac.c | 29 +----------------------------
 1 file changed, 1 insertion(+), 28 deletions(-)