Message ID | 20221121110359.4652-1-quic_mpubbise@quicinc.com (mailing list archive) |
---|---|
Headers | show |
Series | Enable low power mode when WLAN is not active | expand |
Manikanta Pubbisetty <quic_mpubbise@quicinc.com> writes: > Currently, WLAN chip is powered once during driver probe and is kept > ON (powered) always even when WLAN is not active; keeping the chip > powered ON all the time will consume extra power which is not > desirable for battery operated devices. Same is the case with non-WoW > suspend, chip will not be put into low power mode when the system is > suspended resulting in higher battery drain. > > Send QMI MODE OFF command to firmware during WiFi OFF to put device > into low power mode. > > Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00887-QCAMSLSWPLZ-1 > Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.16 > > Manikanta Pubbisetty (3): > ath11k: Fix double free issue during SRNG deinit > ath11k: Move hardware initialization logic to start() > ath11k: Enable low power mode when WLAN is not active Nowadays we use "wifi:" prefix, but I can add it.
Manikanta Pubbisetty <quic_mpubbise@quicinc.com> writes: > Currently, WLAN chip is powered once during driver probe and is kept > ON (powered) always even when WLAN is not active; keeping the chip > powered ON all the time will consume extra power which is not > desirable for battery operated devices. Same is the case with non-WoW > suspend, chip will not be put into low power mode when the system is > suspended resulting in higher battery drain. > > Send QMI MODE OFF command to firmware during WiFi OFF to put device > into low power mode. > > Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00887-QCAMSLSWPLZ-1 > Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.16 > > Manikanta Pubbisetty (3): > ath11k: Fix double free issue during SRNG deinit > ath11k: Move hardware initialization logic to start() > ath11k: Enable low power mode when WLAN is not active > --- > V3: > - Removed patch "ath11k: Fix failed to parse regulatory event print" as it is not needed anymore > - Fixed a potential deadlock scenario reported by lockdep around ab->core_lock with V2 changes > - Fixed other minor issues that were found during code review > - Spelling corrections in the commit messages I still see a crash, immediately after the first rmmod: Nov 22 11:05:47 nuc2 [ 139.378719] rmmod ath11k_pci Nov 22 11:05:48 nuc2 [ 139.892395] general protection fault, probably for non-canonical address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN Nov 22 11:05:48 nuc2 [ 139.892453] KASAN: null-ptr-deref in range [0x00000000000001f0-0x00000000000001f7] Really odd that you don't see it. Unfortunately not able to debug this further right now. This is with: wcn6855 hw2.0 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9
Kalle Valo <kvalo@kernel.org> writes: > Manikanta Pubbisetty <quic_mpubbise@quicinc.com> writes: > >> Currently, WLAN chip is powered once during driver probe and is kept >> ON (powered) always even when WLAN is not active; keeping the chip >> powered ON all the time will consume extra power which is not >> desirable for battery operated devices. Same is the case with non-WoW >> suspend, chip will not be put into low power mode when the system is >> suspended resulting in higher battery drain. >> >> Send QMI MODE OFF command to firmware during WiFi OFF to put device >> into low power mode. >> >> Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00887-QCAMSLSWPLZ-1 >> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.16 >> >> Manikanta Pubbisetty (3): >> ath11k: Fix double free issue during SRNG deinit >> ath11k: Move hardware initialization logic to start() >> ath11k: Enable low power mode when WLAN is not active >> --- >> V3: >> - Removed patch "ath11k: Fix failed to parse regulatory event print" as it is not needed anymore >> - Fixed a potential deadlock scenario reported by lockdep around ab->core_lock with V2 changes >> - Fixed other minor issues that were found during code review >> - Spelling corrections in the commit messages > > I still see a crash, immediately after the first rmmod: > > Nov 22 11:05:47 nuc2 [ 139.378719] rmmod ath11k_pci > Nov 22 11:05:48 nuc2 [ 139.892395] general protection fault, probably > for non-canonical address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP > DEBUG_PAGEALLOC KASAN > Nov 22 11:05:48 nuc2 [ 139.892453] KASAN: null-ptr-deref in range > [0x00000000000001f0-0x00000000000001f7] > > Really odd that you don't see it. Unfortunately not able to debug this > further right now. > > This is with: > > wcn6855 hw2.0 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9 A bit more information how I see the crash. So first I have all modules loaded: $ lsmod Module Size Used by ath11k_pci 57344 0 ath11k 2015232 1 ath11k_pci mac80211 3284992 1 ath11k libarc4 16384 1 mac80211 cfg80211 2494464 2 ath11k,mac80211 qmi_helpers 57344 1 ath11k qrtr_mhi 20480 0 mhi 217088 2 ath11k_pci,qrtr_mhi qrtr 98304 5 qrtr_mhi nvme 122880 3 nvme_core 299008 5 nvme $ Then I just remove ath11k_pci module and boom: $ sudo rmmod ath11k_pci [ 153.658409] general protection fault, probably for non-canonical address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN This happens every time, there doesn't seem to be any randomness on the behaviour.
On 11/23/2022 9:35 PM, Kalle Valo wrote: > Kalle Valo <kvalo@kernel.org> writes: > >> Manikanta Pubbisetty <quic_mpubbise@quicinc.com> writes: >> >>> Currently, WLAN chip is powered once during driver probe and is kept >>> ON (powered) always even when WLAN is not active; keeping the chip >>> powered ON all the time will consume extra power which is not >>> desirable for battery operated devices. Same is the case with non-WoW >>> suspend, chip will not be put into low power mode when the system is >>> suspended resulting in higher battery drain. >>> >>> Send QMI MODE OFF command to firmware during WiFi OFF to put device >>> into low power mode. >>> >>> Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00887-QCAMSLSWPLZ-1 >>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.16 >>> >>> Manikanta Pubbisetty (3): >>> ath11k: Fix double free issue during SRNG deinit >>> ath11k: Move hardware initialization logic to start() >>> ath11k: Enable low power mode when WLAN is not active >>> --- >>> V3: >>> - Removed patch "ath11k: Fix failed to parse regulatory event print" as it is not needed anymore >>> - Fixed a potential deadlock scenario reported by lockdep around ab->core_lock with V2 changes >>> - Fixed other minor issues that were found during code review >>> - Spelling corrections in the commit messages >> >> I still see a crash, immediately after the first rmmod: >> >> Nov 22 11:05:47 nuc2 [ 139.378719] rmmod ath11k_pci >> Nov 22 11:05:48 nuc2 [ 139.892395] general protection fault, probably >> for non-canonical address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP >> DEBUG_PAGEALLOC KASAN >> Nov 22 11:05:48 nuc2 [ 139.892453] KASAN: null-ptr-deref in range >> [0x00000000000001f0-0x00000000000001f7] >> >> Really odd that you don't see it. Unfortunately not able to debug this >> further right now. >> >> This is with: >> >> wcn6855 hw2.0 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9 > > A bit more information how I see the crash. So first I have all modules > loaded: > > $ lsmod > Module Size Used by > ath11k_pci 57344 0 > ath11k 2015232 1 ath11k_pci > mac80211 3284992 1 ath11k > libarc4 16384 1 mac80211 > cfg80211 2494464 2 ath11k,mac80211 > qmi_helpers 57344 1 ath11k > qrtr_mhi 20480 0 > mhi 217088 2 ath11k_pci,qrtr_mhi > qrtr 98304 5 qrtr_mhi > nvme 122880 3 > nvme_core 299008 5 nvme > $ > > Then I just remove ath11k_pci module and boom: > > $ sudo rmmod ath11k_pci > > [ 153.658409] general protection fault, probably for non-canonical address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN > > This happens every time, there doesn't seem to be any randomness on the > behaviour. > Thanks for the help Kalle, this is exactly what I was doing in my tests. Unfortunately, I'm not able to reproduce the problem. I have also tried with the exact firmware that you have pointed out. Let me see if I'm missing anything. Thanks, Manikanta
Manikanta Pubbisetty <quic_mpubbise@quicinc.com> writes: > On 11/23/2022 9:35 PM, Kalle Valo wrote: > >> Kalle Valo <kvalo@kernel.org> writes: >> >>> Manikanta Pubbisetty <quic_mpubbise@quicinc.com> writes: >>> >>>> Currently, WLAN chip is powered once during driver probe and is kept >>>> ON (powered) always even when WLAN is not active; keeping the chip >>>> powered ON all the time will consume extra power which is not >>>> desirable for battery operated devices. Same is the case with non-WoW >>>> suspend, chip will not be put into low power mode when the system is >>>> suspended resulting in higher battery drain. >>>> >>>> Send QMI MODE OFF command to firmware during WiFi OFF to put device >>>> into low power mode. >>>> >>>> Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00887-QCAMSLSWPLZ-1 >>>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.16 >>>> >>>> Manikanta Pubbisetty (3): >>>> ath11k: Fix double free issue during SRNG deinit >>>> ath11k: Move hardware initialization logic to start() >>>> ath11k: Enable low power mode when WLAN is not active >>>> --- >>>> V3: >>>> - Removed patch "ath11k: Fix failed to parse regulatory event print" as it is not needed anymore >>>> - Fixed a potential deadlock scenario reported by lockdep around ab->core_lock with V2 changes >>>> - Fixed other minor issues that were found during code review >>>> - Spelling corrections in the commit messages >>> >>> I still see a crash, immediately after the first rmmod: >>> >>> Nov 22 11:05:47 nuc2 [ 139.378719] rmmod ath11k_pci >>> Nov 22 11:05:48 nuc2 [ 139.892395] general protection fault, probably >>> for non-canonical address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP >>> DEBUG_PAGEALLOC KASAN >>> Nov 22 11:05:48 nuc2 [ 139.892453] KASAN: null-ptr-deref in range >>> [0x00000000000001f0-0x00000000000001f7] >>> >>> Really odd that you don't see it. Unfortunately not able to debug this >>> further right now. >>> >>> This is with: >>> >>> wcn6855 hw2.0 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9 >> >> A bit more information how I see the crash. So first I have all modules >> loaded: >> >> $ lsmod >> Module Size Used by >> ath11k_pci 57344 0 >> ath11k 2015232 1 ath11k_pci >> mac80211 3284992 1 ath11k >> libarc4 16384 1 mac80211 >> cfg80211 2494464 2 ath11k,mac80211 >> qmi_helpers 57344 1 ath11k >> qrtr_mhi 20480 0 >> mhi 217088 2 ath11k_pci,qrtr_mhi >> qrtr 98304 5 qrtr_mhi >> nvme 122880 3 >> nvme_core 299008 5 nvme >> $ >> >> Then I just remove ath11k_pci module and boom: >> >> $ sudo rmmod ath11k_pci >> >> [ 153.658409] general protection fault, probably for non-canonical >> address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC >> KASAN >> >> This happens every time, there doesn't seem to be any randomness on the >> behaviour. >> > > Thanks for the help Kalle, this is exactly what I was doing in my > tests. Unfortunately, I'm not able to reproduce the problem. I have > also tried with the exact firmware that you have pointed out. Let me > see if I'm missing anything. I tested this more and patch 3 seems to be the one causing the crash. I didn't see this when patch 1-2 were applied. The crash happens in ath11k_dp_process_rxdma_err() in this line: srng = &ab->hal.srng_list[err_ring->ring_id]; ab looks sane to me (0xffff88814c960000) but err_ring is set to 0x200. Does this help?
On 12/8/2022 1:24 PM, Kalle Valo wrote: > Manikanta Pubbisetty <quic_mpubbise@quicinc.com> writes: > >> On 11/23/2022 9:35 PM, Kalle Valo wrote: >> >>> Kalle Valo <kvalo@kernel.org> writes: >>> >>>> Manikanta Pubbisetty <quic_mpubbise@quicinc.com> writes: >>>> >>>>> Currently, WLAN chip is powered once during driver probe and is kept >>>>> ON (powered) always even when WLAN is not active; keeping the chip >>>>> powered ON all the time will consume extra power which is not >>>>> desirable for battery operated devices. Same is the case with non-WoW >>>>> suspend, chip will not be put into low power mode when the system is >>>>> suspended resulting in higher battery drain. >>>>> >>>>> Send QMI MODE OFF command to firmware during WiFi OFF to put device >>>>> into low power mode. >>>>> >>>>> Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-00887-QCAMSLSWPLZ-1 >>>>> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.16 >>>>> >>>>> Manikanta Pubbisetty (3): >>>>> ath11k: Fix double free issue during SRNG deinit >>>>> ath11k: Move hardware initialization logic to start() >>>>> ath11k: Enable low power mode when WLAN is not active >>>>> --- >>>>> V3: >>>>> - Removed patch "ath11k: Fix failed to parse regulatory event print" as it is not needed anymore >>>>> - Fixed a potential deadlock scenario reported by lockdep around ab->core_lock with V2 changes >>>>> - Fixed other minor issues that were found during code review >>>>> - Spelling corrections in the commit messages >>>> >>>> I still see a crash, immediately after the first rmmod: >>>> >>>> Nov 22 11:05:47 nuc2 [ 139.378719] rmmod ath11k_pci >>>> Nov 22 11:05:48 nuc2 [ 139.892395] general protection fault, probably >>>> for non-canonical address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP >>>> DEBUG_PAGEALLOC KASAN >>>> Nov 22 11:05:48 nuc2 [ 139.892453] KASAN: null-ptr-deref in range >>>> [0x00000000000001f0-0x00000000000001f7] >>>> >>>> Really odd that you don't see it. Unfortunately not able to debug this >>>> further right now. >>>> >>>> This is with: >>>> >>>> wcn6855 hw2.0 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.9 >>> >>> A bit more information how I see the crash. So first I have all modules >>> loaded: >>> >>> $ lsmod >>> Module Size Used by >>> ath11k_pci 57344 0 >>> ath11k 2015232 1 ath11k_pci >>> mac80211 3284992 1 ath11k >>> libarc4 16384 1 mac80211 >>> cfg80211 2494464 2 ath11k,mac80211 >>> qmi_helpers 57344 1 ath11k >>> qrtr_mhi 20480 0 >>> mhi 217088 2 ath11k_pci,qrtr_mhi >>> qrtr 98304 5 qrtr_mhi >>> nvme 122880 3 >>> nvme_core 299008 5 nvme >>> $ >>> >>> Then I just remove ath11k_pci module and boom: >>> >>> $ sudo rmmod ath11k_pci >>> >>> [ 153.658409] general protection fault, probably for non-canonical >>> address 0xdffffc000000003e: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC >>> KASAN >>> >>> This happens every time, there doesn't seem to be any randomness on the >>> behaviour. >>> >> >> Thanks for the help Kalle, this is exactly what I was doing in my >> tests. Unfortunately, I'm not able to reproduce the problem. I have >> also tried with the exact firmware that you have pointed out. Let me >> see if I'm missing anything. > > I tested this more and patch 3 seems to be the one causing the crash. I > didn't see this when patch 1-2 were applied. > > The crash happens in ath11k_dp_process_rxdma_err() in this line: > > srng = &ab->hal.srng_list[err_ring->ring_id]; > > ab looks sane to me (0xffff88814c960000) but err_ring is set to 0x200. > Does this help? > Thanks for your time and analysis on the bug. From the callstack() that you have shared earlier, it looks like ath11k_dp_process_rxdma_err() is called from dp_service_srngs which is a napi poll callback. To me it looks like the napi handler of the driver is getting called after the srng resources have been de-initialized during rmmod. I have closed examined the changes in patch 3 (I'm still doing it). so far, I could not find anything that could trigger this kind of a problem. Since NAPI is disabled upon rmmod, NAPI poll should not be called. It's quite not clear if I'm missing something. Thanks, Manikanta