diff mbox series

[v3,2/2] wifi: ath11k: reduce the timeout value back for hw scan from 10 seconds to 1 second

Message ID 20221011072408.23731-3-quic_wgong@quicinc.com (mailing list archive)
State Changes Requested
Delegated to: Kalle Valo
Series wifi: ath11k: reduce the timeout value for hw scan

Commit Message

Wen Gong Oct. 11, 2022, 7:24 a.m. UTC
For 11d scan, commit 9dcf6808b253 ("ath11k: add 11d scan offload support")
increased the timeout from one second to a maximum of 10 seconds when 11d
scan offload and 6 GHz are both enabled. That was reasonable at the time:
the first 11d scan request was sent to firmware before the first hw scan
request after wlan load, so firmware reported the hw scan started event
only after the 11d scan had finished, which takes about 6 seconds when
6 GHz is enabled. The timeout was therefore raised from one second to 10
seconds to avoid timing out while waiting for hw scan started. Later,
commit 1f682dc9fb37 ("ath11k: reduce the wait time of 11d scan and hw scan
while add interface") changed the order of the first 11d scan and the
first hw scan, so ath11k now receives the hw scan started event from
firmware immediately for the first hw scan. ath11k no longer needs a
timeout of up to 10 seconds, so set it back from 10 seconds to 1 second.

After the 1st hw scan finishes, firmware starts the 11d scan immediately,
and the 11d scan takes several seconds to finish. If ath11k sends the 2nd
hw scan to firmware before the 11d scan has finished, the 2nd hw scan only
starts after the 11d scan is done, so ath11k times out waiting for scan
started. Treat this timeout as a normal situation if the 11d scan is
running, and skip reporting a scan failure in that case.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3

Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
---
 drivers/net/wireless/ath/ath11k/mac.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

Comments

Kalle Valo Nov. 8, 2022, 10:20 a.m. UTC | #1
Wen Gong <quic_wgong@quicinc.com> writes:

[...]

> @@ -3682,7 +3677,12 @@ static int ath11k_mac_op_hw_scan(struct ieee80211_hw *hw,
>  
>  	ret = ath11k_start_scan(ar, &arg);
>  	if (ret) {
> -		ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
> +		if (ret == -EBUSY)
> +			ath11k_dbg(ar->ab, ATH11K_DBG_MAC,
> +				   "scan engine is busy 11d state %d\n", ar->state_11d);
> +		else
> +			ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
> +
>  		spin_lock_bh(&ar->data_lock);
>  		ar->scan.state = ATH11K_SCAN_IDLE;
>  		spin_unlock_bh(&ar->data_lock);

This feels like a hack to me, for example will these failed scans now
cause delays in connection establishment? IMHO it's crucial from user's
point of view that we don't delay that in any way.

I would rather fix the root cause, do we know what's causing this?
Wen Gong Nov. 18, 2022, 10:29 a.m. UTC | #2
On 11/8/2022 6:20 PM, Kalle Valo wrote:
> Wen Gong <quic_wgong@quicinc.com> writes:
>
...
> [...]
>
>> @@ -3682,7 +3677,12 @@ static int ath11k_mac_op_hw_scan(struct ieee80211_hw *hw,
>>   
>>   	ret = ath11k_start_scan(ar, &arg);
>>   	if (ret) {
>> -		ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
>> +		if (ret == -EBUSY)
>> +			ath11k_dbg(ar->ab, ATH11K_DBG_MAC,
>> +				   "scan engine is busy 11d state %d\n", ar->state_11d);
>> +		else
>> +			ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
>> +
>>   		spin_lock_bh(&ar->data_lock);
>>   		ar->scan.state = ATH11K_SCAN_IDLE;
>>   		spin_unlock_bh(&ar->data_lock);
> This feels like a hack to me, for example will these failed scans now
> cause delays in connection establishment? IMHO it's crucial from user's
> point of view that we don't delay that in any way.
It will not delay connection.
After wlan load, the 1st hw scan arrives at ath11k, and the 11d scan is
sent to firmware only after the 1st hw scan. This means the hw scan used
for connection runs before the 11d scan, so connection can start
immediately after the 1st hw scan finishes. There is no delay for
connection.
> I would rather fix the root cause, do we know what's causing this?
In firmware, hw scan and 11d scan both run in the same queue, so they
cannot run in parallel.

When 6 GHz is enabled, the 1st hw scan takes about 7s to finish, and
the 11d scan then takes the next 7s. After those 14s, each hw scan that
arrives at ath11k runs immediately. If the 2nd hw scan arrives before
the 11d scan has finished, for example 7.1 seconds after the 1st hw scan
started, the 11d scan is still running in firmware, and the 2nd hw scan
will not receive the scan started event until the 11d scan finishes.
Meanwhile, the 2nd hw scan is holding ar->conf_mutex in
ath11k_mac_op_hw_scan(). It is not good to hold a lock for several
seconds because ar->conf_mutex is widely used. So reduce the 10s to 1s
to avoid holding ar->conf_mutex for a long time.
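
The timing argument in this reply can be checked with a small user-space
model. This is only a sketch under stated assumptions, not driver or
firmware code: the struct and all three function names are hypothetical,
times are measured in milliseconds from the moment the 1st hw scan starts,
and firmware is assumed to run scans strictly one at a time.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the firmware scan queue described above. */
struct fw_timeline {
	unsigned int first_scan_ms; /* duration of the 1st hw scan */
	unsigned int scan_11d_ms;   /* duration of the 11d scan that follows */
};

/* Delay until a hw scan submitted at submit_ms sees "scan started". */
static unsigned int started_event_delay_ms(const struct fw_timeline *t,
					   unsigned int submit_ms)
{
	unsigned int busy_until = t->first_scan_ms + t->scan_11d_ms;

	/* Model precondition: the 1st hw scan has already finished. */
	assert(submit_ms >= t->first_scan_ms);

	return submit_ms >= busy_until ? 0 : busy_until - submit_ms;
}

/* How long the caller blocks (and so holds its lock) for a given timeout. */
static unsigned int conf_mutex_hold_ms(const struct fw_timeline *t,
				       unsigned int submit_ms,
				       unsigned int timeout_ms)
{
	unsigned int delay = started_event_delay_ms(t, submit_ms);

	return delay < timeout_ms ? delay : timeout_ms;
}

/* 0 if the scan starts in time, -EBUSY if the 11d scan is still running. */
static int model_start_scan(const struct fw_timeline *t,
			    unsigned int submit_ms, unsigned int timeout_ms)
{
	return started_event_delay_ms(t, submit_ms) < timeout_ms ? 0 : -EBUSY;
}
```

With first_scan_ms = 7000, scan_11d_ms = 7000 and a 2nd hw scan at
submit_ms = 7100, the started event is 6900 ms away. A 10000 ms timeout
lets the scan start but blocks the caller for 6900 ms; a 1000 ms timeout
blocks for only 1000 ms and the model returns -EBUSY.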
Wen Gong Nov. 23, 2022, 3:41 a.m. UTC | #3
On 11/18/2022 6:29 PM, Wen Gong wrote:
> [...]

Hi Kalle,

Should I change commit log with above explanation and send v4?
Wen Gong Dec. 16, 2022, 3:08 a.m. UTC | #4
Hi Kalle,

Should I change commit log with below explanation and send v4?

On 11/23/2022 11:41 AM, Wen Gong wrote:
> [...]
Wen Gong Jan. 13, 2023, 6:54 a.m. UTC | #5
Hi Kalle,

Should I change commit log with below explanation and send v4?

On 12/16/2022 11:08 AM, Wen Gong wrote:
> [...]
Kalle Valo Jan. 13, 2023, 12:14 p.m. UTC | #6
Wen Gong <quic_wgong@quicinc.com> writes:

> Should I change commit log with below explanation and send v4?

Please stop spamming the same question over and over, it's really
annoying. If I don't have time to look at something, spamming me won't
help, quite the opposite. It would be a lot better if you would help
with the other upstream related tasks we have, that way I might have
more time to look at your patches.

To answer your question I need to look at this patchset in detail and I
don't know when I'm able to do that. But at this moment I don't trust
this patchset is the right approach and I'm not willing to take it.
Wen Gong Jan. 31, 2023, 2:40 a.m. UTC | #7
On 1/13/2023 8:14 PM, Kalle Valo wrote:
> Wen Gong <quic_wgong@quicinc.com> writes:
>
>> Should I change commit log with below explanation and send v4?
> Please stop spamming the same question over and over, it's really
> annoying. If I don't have time to look at something, spamming me won't
> help, quite the opposite. It would be a lot better if you would help
> with the other upstream related tasks we have, that way I might have
> more time to look at your patches.
>
> To answer your question I need to look at this patchset in detail and I
> don't know when I'm able to do that. But at this moment I don't trust
> this patchset is the right approach and I'm not willing to take it.

Yes.

I will send v4 with only one patch, "[v3,1/2] wifi: ath11k: change to set
11d state instead of start 11d scan while disconnect".

Is that OK?

Patch

diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
index b0c3cf258d12..666775a1e2a9 100644
--- a/drivers/net/wireless/ath/ath11k/mac.c
+++ b/drivers/net/wireless/ath/ath11k/mac.c
@@ -3560,7 +3560,6 @@  static int ath11k_start_scan(struct ath11k *ar,
 			     struct scan_req_params *arg)
 {
 	int ret;
-	unsigned long timeout = 1 * HZ;
 
 	lockdep_assert_held(&ar->conf_mutex);
 
@@ -3571,19 +3570,15 @@  static int ath11k_start_scan(struct ath11k *ar,
 	if (ret)
 		return ret;
 
-	if (test_bit(WMI_TLV_SERVICE_11D_OFFLOAD, ar->ab->wmi_ab.svc_map)) {
-		timeout = 5 * HZ;
-
-		if (ar->supports_6ghz)
-			timeout += 5 * HZ;
-	}
-
-	ret = wait_for_completion_timeout(&ar->scan.started, timeout);
+	ret = wait_for_completion_timeout(&ar->scan.started, 1 * HZ);
 	if (ret == 0) {
 		ret = ath11k_scan_stop(ar);
 		if (ret)
 			ath11k_warn(ar->ab, "failed to stop scan: %d\n", ret);
 
+		if (ar->state_11d == ATH11K_11D_RUNNING)
+			return -EBUSY;
+
 		return -ETIMEDOUT;
 	}
 
@@ -3682,7 +3677,12 @@  static int ath11k_mac_op_hw_scan(struct ieee80211_hw *hw,
 
 	ret = ath11k_start_scan(ar, &arg);
 	if (ret) {
-		ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
+		if (ret == -EBUSY)
+			ath11k_dbg(ar->ab, ATH11K_DBG_MAC,
+				   "scan engine is busy 11d state %d\n", ar->state_11d);
+		else
+			ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
+
 		spin_lock_bh(&ar->data_lock);
 		ar->scan.state = ATH11K_SCAN_IDLE;
 		spin_unlock_bh(&ar->data_lock);
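
For readers who want to check the numbers in the commit message against
the diff, the removed timeout calculation and the new error mapping can be
restated as plain user-space C. This is a sketch only: the function names
and the enum below are hypothetical stand-ins, and the boolean parameters
replace the WMI_TLV_SERVICE_11D_OFFLOAD service bit and ar->supports_6ghz.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's 11d state. */
enum model_11d_state {
	MODEL_11D_IDLE,
	MODEL_11D_RUNNING,
};

/* Timeout in seconds as computed by the lines the patch removes. */
static unsigned int old_scan_start_timeout_secs(bool offload_11d,
						bool supports_6ghz)
{
	unsigned int timeout = 1;

	if (offload_11d) {
		timeout = 5;

		if (supports_6ghz)
			timeout += 5;
	}

	return timeout;
}

/* Error returned after a timeout, following the added hunk. */
static int scan_start_timeout_errno(enum model_11d_state state)
{
	return state == MODEL_11D_RUNNING ? -EBUSY : -ETIMEDOUT;
}

/* True if the caller would log a warning rather than a debug message. */
static bool scan_start_failure_warns(int ret)
{
	return ret != 0 && ret != -EBUSY;
}
```

Under these assumptions the old code waited 1, 5 or 10 seconds depending
on the two flags, while the patched code always waits 1 second and only
downgrades the log level when the timeout coincides with a running 11d
scan.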