[0/3] mac80211: Trigger disconnect for STA during recovery

Message ID 20201215172113.5038-1-youghand@codeaurora.org (mailing list archive)

Message

Youghandhar Chintala Dec. 15, 2020, 5:21 p.m. UTC
From: Rakesh Pillai <pillair@codeaurora.org>

Currently, in case of a target hardware restart, we just reconfigure
and re-enable the security keys and enable the network queues to
resume data traffic from where it was interrupted.

On many ath10k wifi chipsets, the sequence numbers for data packets
are assigned by firmware, and the MAC sequence number restarts from
zero after a target hardware restart. This leads to a mismatch
between the sequence number expected by the remote peer and the
sequence number of the frames sent by the target firmware.

This mismatch in sequence numbers makes the frames look out of order
to the remote peer, and all the frames sent by the device are dropped
until we reach the sequence number that was sent before we restarted
the target hardware.
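
To illustrate the effect described above, here is a minimal sketch of
a receiver that discards frames whose sequence number is behind the
one it expects. It assumes 12-bit 802.11 sequence numbers and a
deliberately simplified drop rule; the names sn_less() and
frames_dropped_after_restart() are made up for this example and are
not taken from the series.

```c
#include <assert.h>
#include <stdint.h>

#define SN_MODULO 4096u            /* 802.11 sequence numbers are 12 bits wide */
#define SN_MASK   (SN_MODULO - 1u)

/* True when sequence number a is behind b in modulo-4096 arithmetic. */
static int sn_less(uint16_t a, uint16_t b)
{
	return ((a - b) & SN_MASK) > (SN_MODULO >> 1);
}

/*
 * Hypothetical helper: count the frames a receiver would discard as
 * stale if the sender restarts numbering at 0 while the receiver still
 * expects `expected`. Simplified model: every frame behind the expected
 * number is dropped; counting stops at the first frame that is not.
 */
static unsigned int frames_dropped_after_restart(uint16_t expected)
{
	unsigned int dropped = 0;
	uint16_t sn = 0;

	assert(expected < SN_MODULO);

	while (sn_less(sn, expected)) {
		dropped++;
		sn = (sn + 1) & SN_MASK;
	}
	return dropped;
}
```

In this model, frames_dropped_after_restart(1000) evaluates to 1000:
every frame numbered 0 through 999 is treated as stale before traffic
flows again.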

In order to fix this, we trigger a disconnect in case of a hardware
restart. The station then makes a fresh connection, which avoids the
dropping of frames by the remote peer.
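
The opt-in behaviour can be sketched roughly as follows. This is only
an illustration of the idea, not the code in the patches: the types,
the disconnect_sta_on_restart field and action_after_restart() are
hypothetical stand-ins, and leaving non-STA interfaces on the existing
reconfigure path is an assumption based on the patch titles.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the identifiers used by the series. */
enum iface_type { IFACE_STA, IFACE_AP };

enum restart_action {
	ACTION_RECONFIG_AND_RESUME,  /* existing behaviour: restore keys and queues */
	ACTION_DISCONNECT            /* new behaviour: drop the link and reconnect */
};

struct hw_caps {
	bool disconnect_sta_on_restart;  /* opt-in flag, set by the driver */
};

/* Decide what to do with one interface once the hardware has restarted. */
static enum restart_action action_after_restart(const struct hw_caps *caps,
						enum iface_type type)
{
	assert(caps);

	/* Only station interfaces on opted-in hardware are disconnected. */
	if (caps->disconnect_sta_on_restart && type == IFACE_STA)
		return ACTION_DISCONNECT;

	return ACTION_RECONFIG_AND_RESUME;
}
```

Keeping the decision behind a per-hardware flag means drivers whose
restart really is seamless keep the current behaviour unchanged.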

The right fix would be to pull the entire data path into the host,
which is either not feasible or would need lots of complex/inefficient
datapath changes.

Rakesh Pillai (1):
  ath10k: Set wiphy flag to trigger sta disconnect on hardware restart

Youghandhar Chintala (2):
  cfg80211: Add wiphy flag to trigger STA disconnect after hardware
    restart
  mac80211: Add support to trigger sta disconnect on hardware restart

 drivers/net/wireless/ath/ath10k/core.c | 15 +++++++++++++++
 drivers/net/wireless/ath/ath10k/hw.h   |  3 +++
 drivers/net/wireless/ath/ath10k/mac.c  |  3 +++
 include/net/cfg80211.h                 |  4 ++++
 net/mac80211/ieee80211_i.h             |  3 +++
 net/mac80211/mlme.c                    |  9 +++++++++
 net/mac80211/util.c                    | 22 +++++++++++++++++++---
 7 files changed, 56 insertions(+), 3 deletions(-)

Comments

Ben Greear Dec. 15, 2020, 6:23 p.m. UTC | #1
On 12/15/20 9:21 AM, Youghandhar Chintala wrote:
> From: Rakesh Pillai <pillair@codeaurora.org>
> 
> Currently in case of target hardware restart, we just reconfig and
> re-enable the security keys and enable the network queues to start
> data traffic back from where it was interrupted.

Are there any known mac80211 radios/drivers that *can* support seamless restarts?

If not, then could we just always enable this feature in mac80211?

Thanks,
Ben
Rakesh Pillai Dec. 16, 2020, 11:35 a.m. UTC | #2
> From: Ben Greear <greearb@candelatech.com>
> 
> On 12/15/20 9:21 AM, Youghandhar Chintala wrote:
> > From: Rakesh Pillai <pillair@codeaurora.org>
> >
> > Currently in case of target hardware restart, we just reconfig and
> > re-enable the security keys and enable the network queues to start
> > data traffic back from where it was interrupted.
> 
> Are there any known mac80211 radios/drivers that *can* support seamless
> restarts?
> 
> If not, then could we just always enable this feature in mac80211?

I am not aware of any mac80211 target which can restart in a seamless manner.
Hence I chose to keep this optional, so that a driver can expose this flag
(if needed) based on the hardware capability.

Thanks,
Rakesh Pillai.
Brian Norris Dec. 17, 2020, 10:24 p.m. UTC | #3
On Tue, Dec 15, 2020 at 10:23:33AM -0800, Ben Greear wrote:
> On 12/15/20 9:21 AM, Youghandhar Chintala wrote:
> > From: Rakesh Pillai <pillair@codeaurora.org>
> > 
> > Currently in case of target hardware restart, we just reconfig and
> > re-enable the security keys and enable the network queues to start
> > data traffic back from where it was interrupted.
> 
> Are there any known mac80211 radios/drivers that *can* support seamless restarts?
> 
> If not, then could we just always enable this feature in mac80211?

I'm quite sure that iwlwifi intentionally supports a seamless restart.
From my experience with dealing with user reports, I don't recall any
issues where restart didn't function as expected, unless there was some
deeper underlying failure (e.g., hardware/power failure; driver bugs /
lockups).

I don't have very good stats for ath10k/QCA6174, but it survives
our testing OK and I again don't recall any user-reported complaints in
this area. I'd say this is a weaker example though, as I don't have as
clear of data. (By contrast, ath10k/WCN399x, which Rakesh, et al, are
patching here, does not pass our tests at all, and clearly fails to
recover from "seamless" restarts, as noted in patch 3.)

I'd also note that we don't operate in AP mode -- only STA -- and IIRC
Ben, you've complained about AP mode in the past.

Brian
Brian Norris Dec. 17, 2020, 10:30 p.m. UTC | #4
On Tue, Dec 15, 2020 at 10:51:13PM +0530, Youghandhar Chintala wrote:
> From: Rakesh Pillai <pillair@codeaurora.org>

I meant to mention in my other reply: the threading on this series is
broken (as in, it doesn't exist). It looks like you're using
git-send-email (good!), but somehow it doesn't have any In-Reply-To or
References (bad!). Did you send all your mail in one invocation, or did
you send them as separate git-send-email commands? Anyway, please
investigate what went wrong so you can get this right in the future.

For one, this affects Patchwork's ability to group patch series (not to
mention everybody who uses a decent mail reader, with proper threading).
See for example the lore archive, which is only threading replies to
this cover letter:

https://lore.kernel.org/linux-wireless/20201215172113.5038-1-youghand@codeaurora.org/

Regards,
Brian
Ben Greear Dec. 17, 2020, 10:57 p.m. UTC | #5
On 12/17/20 2:24 PM, Brian Norris wrote:
> On Tue, Dec 15, 2020 at 10:23:33AM -0800, Ben Greear wrote:
>> On 12/15/20 9:21 AM, Youghandhar Chintala wrote:
>>> From: Rakesh Pillai <pillair@codeaurora.org>
>>>
>>> Currently in case of target hardware restart, we just reconfig and
>>> re-enable the security keys and enable the network queues to start
>>> data traffic back from where it was interrupted.
>>
>> Are there any known mac80211 radios/drivers that *can* support seamless restarts?
>>
>> If not, then could we just always enable this feature in mac80211?
> 
> I'm quite sure that iwlwifi intentionally supports a seamless restart.
> From my experience with dealing with user reports, I don't recall any
> issues where restart didn't function as expected, unless there was some
> deeper underlying failure (e.g., hardware/power failure; driver bugs /
> lockups).
> 
> I don't have very good stats for ath10k/QCA6174, but it survives
> our testing OK and I again don't recall any user-reported complaints in
> this area. I'd say this is a weaker example though, as I don't have as
> clear of data. (By contrast, ath10k/WCN399x, which Rakesh, et al, are
> patching here, does not pass our tests at all, and clearly fails to
> recover from "seamless" restarts, as noted in patch 3.)
> 
> I'd also note that we don't operate in AP mode -- only STA -- and IIRC
> Ben, you've complained about AP mode in the past.

I complain about all sorts of things, but I'm usually running
station mode :)

Do you actually see iwlwifi stations stay associated through
firmware crashes?

Anyway, happy to hear some have seamless recovery, and in that case,
I have no objections to the patch.

Thanks,
Ben
Brian Norris Dec. 17, 2020, 11:18 p.m. UTC | #6
On Thu, Dec 17, 2020 at 2:57 PM Ben Greear <greearb@candelatech.com> wrote:
> On 12/17/20 2:24 PM, Brian Norris wrote:
> > I'd also note that we don't operate in AP mode -- only STA -- and IIRC
> > Ben, you've complained about AP mode in the past.
>
> I complain about all sorts of things, but I'm usually running
> station mode :)

Hehe, fair :) Maybe I'm mixed up.

But I do get the feeling that specifically within the ath10k family,
there are wildly different use cases (mobile, PC, AP) and chips (and
firmware) that tend to go along with them, and that those use cases
get a fairly different population of {developers, testers, reporters}.
So claiming "feature X works" pretty much always has to be couched in
which chips, firmware, and use case. And there's certainly some wisdom
in these sections:

https://wireless.wiki.kernel.org/en/users/drivers/ath10k/submittingpatches#hardware_families
https://wireless.wiki.kernel.org/en/users/drivers/ath10k/submittingpatches#tested-on_tag

> Do you actually see iwlwifi stations stay associated through
> firmware crashes?

Yes.

> Anyway, happy to hear some have seamless recovery, and in that case,
> I have no objections to the patch.

OK! I hope I'm not the only one with such results, because then I
still might question my sanity (and test coverage), but that's still
my understanding.

BTW, I haven't yet closely reviewed the patch series myself, but I ACK
the concept.

Brian