Message ID | 20191108194210.23618-1-greearb@candelatech.com (mailing list archive) |
---|---|
Headers | show |
Series | Ben's grab bag of mac80211 patches | expand |
Hi Ben, Well, for me to consider any of these, you really should come with commit logs and reasons that actually make sense outside of your environment :) I committed a note that replaces patch 1. I don't like 2-4, that just adds state and one should never hit it. If you're hitting these you should investigate the root cause, not fix the symptoms. 5 seems like it might be reasonable, but it's hard to read and understand, maybe revisit that? I tend to think the previous module options along the lines of patch 6 were a mistake, rather than add more ... 7 is totally not understandable, but might be legitimate? Unlikely, but hard to say. 8 I don't like at all. How about you do it in the driver somehow? 9 is like 2-4 really, I guess maybe this one I could get behind if it came with a commit log that actually explains why one is likely to hit this multiple times or something? 10 we did to fix other behaviour, so ... johannes
On 11/22/2019 03:49 AM, Johannes Berg wrote: > Hi Ben, > > Well, for me to consider any of these, you really should come with > commit logs and reasons that actually make sense outside of your > environment :) > > I committed a note that replaces patch 1. Well, 1 out of 10 ain't that bad! > I don't like 2-4, that just adds state and one should never hit it. If > you're hitting these you should investigate the root cause, not fix the > symptoms. > > 5 seems like it might be reasonable, but it's hard to read and > understand, maybe revisit that? This is the patch you previously said you liked but it would not apply upstream. Now it applies. That whole code mess is hard to understand, but I have been running a similar patch for a while and it has worked well. Instead of trying to understand the patch, try applying it and then read the resulting code. It is a lot simpler to understand that way I think. You can also sniff air with current code w/out this patch and watch the crappy retry behaviour where the retries are clumped in a few ms of time instead of being spread out. > > I tend to think the previous module options along the lines of patch 6 > were a mistake, rather than add more ... > > 7 is totally not understandable, but might be legitimate? Unlikely, but > hard to say. Ath10k will always use legacy rates for ctrl frames and such even if you otherwise restrict the rateset. So, it is legit to set a single rate even if that would leave no legacy rates configured as far as mac80211 is concerned. Your patch broke the ability to set a single rate in ath10k, and my change will allow it to work again. > > 8 I don't like at all. How about you do it in the driver somehow? I had low hopes for this one anyway. mac80211 has the software decrypt logic, not the driver, so it seemed reasonable to have the mac80211 do a callback. This patch is likely only useful for drivers that do block-ack in the firmware and support software decrypt, which may only be my modified ath10k-ct driver/firmware. > > 9 is like 2-4 really, I guess maybe this one I could get behind if it > came with a commit log that actually explains why one is likely to hit > this multiple times or something? Basically, it is almost never useful to use WARN_ON instead of WARN_ON_ONCE. If you ever do hit the bug, often the logs are full of WARN_ON spam and you cannot even find the real problem until you change it to WARN_ON_ONCE and reproduce the problem. I hit this problem due to some coding bug while poking at ath10k, and you get one splat per tx frame, so you can image the spam. > 10 we did to fix other behaviour, so ... This one is especially useful for roaming several virtual stations, but maybe that is only useful test case. I included it more as an RFC, but it has worked well enough for us in case you see some worth in it (and obviously it should be changed to not use // comments in case the functionality is actually deemed useful). Thanks, Ben > > johannes >
On Fri, 2019-11-22 at 08:37 -0800, Ben Greear wrote: > Well, 1 out of 10 ain't that bad! Heh. > > 5 seems like it might be reasonable, but it's hard to read and > > understand, maybe revisit that? > > This is the patch you previously said you liked but it would not apply upstream. > Now it applies. That whole code mess is hard to understand, but I have been running > a similar patch for a while and it has worked well. > > Instead of trying to understand the patch, try applying it and then read the resulting > code. It is a lot simpler to understand that way I think. OK :) > You can also sniff air with current code w/out this patch and watch the crappy > retry behaviour where the retries are clumped in a few ms of time instead of being > spread out. Right. I'll have to review this carefully. I think the Intel firmware will soon spread out the hardware retries a bit too (vs. here the software retries). > Ath10k will always use legacy rates for ctrl frames and such even if *control* frames have a very tightly controlled rate, e.g. ACK must have the same or next lower mandatory rate, or basic rate, or something. I don't have it in my head right now. Do you mean *management* frames? > you otherwise restrict the rateset. So, it is legit to set a single rate even if > that would leave no legacy rates configured as far as mac80211 is concerned. > > Your patch broke the ability to set a single rate in ath10k, and my change will > allow it to work again. Either way, it's kind of an ath10k bug in a sense. You don't actually want to "set this single rate", you want to "set this single rate for data frames" or so ... which is a different API. If you look at ath9k or any other driver with software rate control, it would not be able to pick a legacy rate for mgmt frames at all with your patch and the warning, and then I'm pretty sure you'll hit a WARN_ON() in rate.c. So no, I will certainly not apply this. Maybe you want to add a different API that affects only data frames (or a variant of this API or something), instead, but this is not it. We can't leave the software drivers without any rates to pick from for management frames. > > 8 I don't like at all. How about you do it in the driver somehow? > > I had low hopes for this one anyway. mac80211 has the software decrypt logic, not the > driver, so it seemed reasonable to have the mac80211 do a callback. This patch is likely > only useful for drivers that do block-ack in the firmware and support software decrypt, > which may only be my modified ath10k-ct driver/firmware. So far that's the only one, as far as I can tell, yeah ... > > 9 is like 2-4 really, I guess maybe this one I could get behind if it > > came with a commit log that actually explains why one is likely to hit > > this multiple times or something? > > Basically, it is almost never useful to use WARN_ON instead of WARN_ON_ONCE. If you ever > do hit the bug, often the logs are full of WARN_ON spam and you cannot even find the real problem > until you change it to WARN_ON_ONCE and reproduce the problem. > > I hit this problem due to some coding bug while poking at ath10k, and you get one splat > per tx frame, so you can image the spam. Yeah, but does it matter? It should really just happen during development, right? Having the _ONCE makes this cost more space and code, and I'm not sure I see that much point in that. > > 10 we did to fix other behaviour, so ... > > This one is especially useful for roaming several virtual stations, but maybe that is only useful test > case. I included it more as an RFC, but it has worked well enough for us in case you see some worth in > it (and obviously it should be changed to not use // comments in case the functionality is actually > deemed useful). I guess we can go back and forth on this ... but in most cases you don't know that the AP didn't go away, so IMHO the current behaviour is better. It's just in your special case you know it's more likely some local issues vs. the AP having disappeared. johannes
On 11/22/19 9:53 AM, Johannes Berg wrote: > On Fri, 2019-11-22 at 08:37 -0800, Ben Greear wrote: > >> Well, 1 out of 10 ain't that bad! > > Heh. > >>> 5 seems like it might be reasonable, but it's hard to read and >>> understand, maybe revisit that? >> >> This is the patch you previously said you liked but it would not apply upstream. >> Now it applies. That whole code mess is hard to understand, but I have been running >> a similar patch for a while and it has worked well. >> >> Instead of trying to understand the patch, try applying it and then read the resulting >> code. It is a lot simpler to understand that way I think. > > OK :) > >> You can also sniff air with current code w/out this patch and watch the crappy >> retry behaviour where the retries are clumped in a few ms of time instead of being >> spread out. > > Right. I'll have to review this carefully. I think the Intel firmware > will soon spread out the hardware retries a bit too (vs. here the > software retries). > >> Ath10k will always use legacy rates for ctrl frames and such even if > > *control* frames have a very tightly controlled rate, e.g. ACK must have > the same or next lower mandatory rate, or basic rate, or something. I > don't have it in my head right now. > > Do you mean *management* frames? Yes, I mean all sorts of frames that should use legacy rates. When you 'set the rate' in ath10k, you are really just setting the data-packet's rate, and upstream driver/firmware has other limitations (you can set single rate, or mcs-8 or mcs-9, but you cannot tell it to use mcs-5 and mcs-6 only, for instance). There are other ways to adjust management frame rates, but it is not through normal mac80211 rate-setting logic. >> you otherwise restrict the rateset. So, it is legit to set a single rate even if >> that would leave no legacy rates configured as far as mac80211 is concerned. >> >> Your patch broke the ability to set a single rate in ath10k, and my change will >> allow it to work again. > > Either way, it's kind of an ath10k bug in a sense. You don't actually > want to "set this single rate", you want to "set this single rate for > data frames" or so ... which is a different API. > > If you look at ath9k or any other driver with software rate control, it > would not be able to pick a legacy rate for mgmt frames at all with your > patch and the warning, and then I'm pretty sure you'll hit a WARN_ON() > in rate.c. > > So no, I will certainly not apply this. > > Maybe you want to add a different API that affects only data frames (or > a variant of this API or something), instead, but this is not it. We > can't leave the software drivers without any rates to pick from for > management frames. Maybe we can add a driver flag that indicates we can relax this check and have ath10k set that flag? Adding a whole top to bottom API to change how rates are set sounds like a pain, and would probably be specific to ath10k only in the end anyway? (Remember the discussion about setting advertised vs used rates? This would be even more complex and less useful than that I think). > >>> 8 I don't like at all. How about you do it in the driver somehow? >> >> I had low hopes for this one anyway. mac80211 has the software decrypt logic, not the >> driver, so it seemed reasonable to have the mac80211 do a callback. This patch is likely >> only useful for drivers that do block-ack in the firmware and support software decrypt, >> which may only be my modified ath10k-ct driver/firmware. > > So far that's the only one, as far as I can tell, yeah ... > >>> 9 is like 2-4 really, I guess maybe this one I could get behind if it >>> came with a commit log that actually explains why one is likely to hit >>> this multiple times or something? >> >> Basically, it is almost never useful to use WARN_ON instead of WARN_ON_ONCE. If you ever >> do hit the bug, often the logs are full of WARN_ON spam and you cannot even find the real problem >> until you change it to WARN_ON_ONCE and reproduce the problem. >> >> I hit this problem due to some coding bug while poking at ath10k, and you get one splat >> per tx frame, so you can image the spam. > > Yeah, but does it matter? It should really just happen during > development, right? Having the _ONCE makes this cost more space and > code, and I'm not sure I see that much point in that. Sure, for this one. The sdata-in-driver warnings happen often-ish in ath10k. No one can figure out how to fix it, or really cares to look I think. But, really, not worth arguing over. The main patch in this series that I would like to see upstream is the one that tweaks retry logic. Others are fairly easy to maintain outside the tree and are of more limited use to the general user. > >>> 10 we did to fix other behaviour, so ... >> >> This one is especially useful for roaming several virtual stations, but maybe that is only useful test >> case. I included it more as an RFC, but it has worked well enough for us in case you see some worth in >> it (and obviously it should be changed to not use // comments in case the functionality is actually >> deemed useful). > > I guess we can go back and forth on this ... but in most cases you don't > know that the AP didn't go away, so IMHO the current behaviour is > better. It's just in your special case you know it's more likely some > local issues vs. the AP having disappeared. What I saw is that supplicant would try to do a roam, and kernel would reject it because it had just deleted the object. Maybe it is better to have the kernel be more lenient and let supplicant take care of the policy decisions? Thanks, Ben > > johannes >
From: Ben Greear <greearb@candelatech.com> Here are some patches from my tree, freshly applied and compile tested on top of upstream 5.4.0-rc6. One incorporates a fix suggested back in 2013 :) There should not be much dependencies between these, so just skip any you don't want. Ben Greear (10): mac80211: Add comment about tx on monitor devs. mac80211: Change warn-on to warn-on-once mac80211: Don't spam kernel with sdata-in-driver failures. mac80211: Don't spam so loud about warned-sdata-in-driver. mac80211: Improve connection-loss probing. mac80211: Make max-auth-tries configurable as module option mac80211: Revert some of e8e4f5, fixes setting single rate in ath10k. mac80211: Support decrypting action frames for reinsertion into the driver. mac80211: Use warn-on-once for queue mis-configuration. mlme: Don't unlink bss on assoc timeout and similar. include/net/mac80211.h | 10 +++++ net/mac80211/cfg.c | 6 ++- net/mac80211/debug.h | 3 ++ net/mac80211/driver-ops.c | 9 ++++ net/mac80211/driver-ops.h | 36 ++++++++++++++-- net/mac80211/ieee80211_i.h | 3 +- net/mac80211/iface.c | 7 ++++ net/mac80211/mlme.c | 84 +++++++++++++++++--------------------- net/mac80211/rx.c | 2 +- net/mac80211/tx.c | 6 +++ net/mac80211/util.c | 5 ++- 11 files changed, 115 insertions(+), 56 deletions(-)