Message ID | 1364472669-5629-1-git-send-email-arend@broadcom.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Thu, 2013-03-28 at 13:11 +0100, Arend van Spriel wrote: > Some protocols need a more reliable connection to complete > successful in reasonable time. This patch adds a user-space > API to indicate the wireless driver that a critical protocol > is about to commence and when it is done, using nl80211 primitives > NL80211_CMD_CRIT_PROTOCOL_START and NL80211_CRIT_PROTOCOL_STOP. > > The driver can support this by implementing the cfg80211 callbacks > .crit_proto_start() and .crit_proto_stop(). Examples of protocols > that can benefit from this are DHCP, EAPOL, APIPA. Exactly how the > link can/should be made more reliable is up to the driver. Things > to consider are avoid scanning, no multi-channel operations, and > alter coexistence schemes. Looks better than the BT coex thing :-) > + * @NL80211_CMD_CRIT_PROTOCOL_START: Indicates user-space will start running > + * a critical protocol that needs more reliability in the connection to > + * complete. I think this needs a little bit more documentation, addressing specifically: * What is the protocol ID? You haven't defined that at all, I don't like that you're effectively introducing a private driver API there, and userspace tools can't really know what to put into it. What's the value in that anyway, is it to allow nesting multiple protocols? That seems a bit excessive for the kernel, if needed it could be managed by the supplicant? If it doesn't go away, then it should probably be an enum and be checked ... but then you might need to have drivers advertise which ones they support? I fail to see the point right now. * I think there should probably be some sort of timeout. Would that timeout be configurable by userspace, or should this be determined in the driver? It seems userspace has a better idea of what kind of timeout would be needed. Either way, does userspace need an indication that it ended? Also, if there's a timeout, is a per-device maximum advertisement needed? > + NL80211_ATTR_CRIT_PROT_ID, > + NL80211_ATTR_MAX_CRIT_PROT_DURATION, Oh, you do have a duration, but I don't see it used in the cfg80211 API? > +++ b/net/wireless/core.c > @@ -745,6 +745,8 @@ static void wdev_cleanup_work(struct work_struct *work) > wdev = container_of(work, struct wireless_dev, cleanup_work); > rdev = wiphy_to_dev(wdev->wiphy); > > + schedule_delayed_work(&wdev->crit_proto_work, 0); That's icky -- will cause all kinds of locking problems if this schedules out because it'll be hard to ensure it really finished. You should probably cancel and do it inline here? > +void wdev_cancel_crit_proto(struct work_struct *work) > +{ > + struct delayed_work *dwork; > + struct cfg80211_registered_device *rdev; > + struct wireless_dev *wdev; > + > + dwork = container_of(work, struct delayed_work, work); > + wdev = container_of(dwork, struct wireless_dev, crit_proto_work); > + rdev = wiphy_to_dev(wdev->wiphy); > + > + rdev_crit_proto_stop(rdev, wdev, wdev->crit_proto); > + wdev->crit_proto = 0; Why even store the protocol value in the wdev? Or was that intended to be used to verify that you're not canceling something that doesn't exist? > +#define NL80211_MIN_CRIT_PROT_DURATION 2500 /* msec */ That seems pretty long? Why have such a long *minimum* duration? At 2.5 seconds, it's way long, and then disabling most of the protections/powersave/whatever no longer makes sense for this period of time since really mostly what this does will be reducing the wifi latency. > @@ -1417,6 +1419,8 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *dev, > } > CMD(start_p2p_device, START_P2P_DEVICE); > CMD(set_mcast_rate, SET_MCAST_RATE); > + CMD(crit_proto_start, CRIT_PROTOCOL_START); > + CMD(crit_proto_stop, CRIT_PROTOCOL_STOP); Ah, a tricky problem -- unrelated to your patch. You probably saw the wiphy size limit problem. If we keep adding commands here, we'll eventually break older userspace completely, so I think we shouldn't. A new way will probably be required, either adding new commands only conditionally on split wiphy dump, or inventing a new way. I suppose making it conditional is acceptable for all new commands since new tools in userspace will be required anyway to make them work. > + cancel_delayed_work(&wdev->crit_proto_work); That should probably be _sync. Is it even valid to start one while an old one is present? What's the expected behaviour? Right now you override, but is that really desired? Actually ... I think you should just get rid of the work, or at least make it optional. If we were going to implement this, for example, we'd definitely push the timing all the way into the firmware, and thus wouldn't want to have this cancel work. Isn't that similar for you? > + /* skip delayed work if no timeout provided */ I suspect a timeout should be required? > + schedule_delayed_work(&wdev->crit_proto_work, > + msecs_to_jiffies(duration)); Ah, ok, I guess I misunderstood it. You do use it, but don't pass it to the driver since you let cfg80211 handle the timing... > + wdev->crit_proto = proto; > + > +done: > + return rdev_crit_proto_start(rdev, wdev, proto); If this fails, you get confusion, the work will run anyway. > + if (info->attrs[NL80211_ATTR_CRIT_PROT_ID]) > + proto = nla_get_u16(info->attrs[NL80211_ATTR_CRIT_PROT_ID]); > + > + return rdev_crit_proto_stop(rdev, wdev, proto); You stored it before, so why not use that here as well? Or would start(proto=1) stop(proto=2) be valid? johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/28/2013 09:17 AM, Johannes Berg wrote: > On Thu, 2013-03-28 at 13:11 +0100, Arend van Spriel wrote: >> Some protocols need a more reliable connection to complete >> successful in reasonable time. This patch adds a user-space >> API to indicate the wireless driver that a critical protocol >> is about to commence and when it is done, using nl80211 primitives >> NL80211_CMD_CRIT_PROTOCOL_START and NL80211_CRIT_PROTOCOL_STOP. >> >> The driver can support this by implementing the cfg80211 callbacks >> .crit_proto_start() and .crit_proto_stop(). Examples of protocols >> that can benefit from this are DHCP, EAPOL, APIPA. Exactly how the >> link can/should be made more reliable is up to the driver. Things >> to consider are avoid scanning, no multi-channel operations, and >> alter coexistence schemes. > > Looks better than the BT coex thing :-) > >> + * @NL80211_CMD_CRIT_PROTOCOL_START: Indicates user-space will start running >> + * a critical protocol that needs more reliability in the connection to >> + * complete. > > I think this needs a little bit more documentation, addressing > specifically: > > * What is the protocol ID? You haven't defined that at all, I don't Maybe just ignore the protocol entirely. Let applications set their skb->priority or ToS so that their packets are handled with priority as desired. I'm guessing the NIC should have the same behaviour regardless of the protocol (ie, stop scanning, no off-channel work, etc), so why would it care about protocol. And, I do like the timeout (with maximum value hard-coded in the kernel so that a crashed or buggy app can't disable scanning/off-channel for longer than a few seconds). Thanks, Ben
On 03/28/2013 05:17 PM, Johannes Berg wrote: > On Thu, 2013-03-28 at 13:11 +0100, Arend van Spriel wrote: >> Some protocols need a more reliable connection to complete >> successful in reasonable time. This patch adds a user-space >> API to indicate the wireless driver that a critical protocol >> is about to commence and when it is done, using nl80211 primitives >> NL80211_CMD_CRIT_PROTOCOL_START and NL80211_CRIT_PROTOCOL_STOP. >> >> The driver can support this by implementing the cfg80211 callbacks >> .crit_proto_start() and .crit_proto_stop(). Examples of protocols >> that can benefit from this are DHCP, EAPOL, APIPA. Exactly how the >> link can/should be made more reliable is up to the driver. Things >> to consider are avoid scanning, no multi-channel operations, and >> alter coexistence schemes. > > Looks better than the BT coex thing :-) > These are only words, but thanks ;-) >> + * @NL80211_CMD_CRIT_PROTOCOL_START: Indicates user-space will start running >> + * a critical protocol that needs more reliability in the connection to >> + * complete. > > I think this needs a little bit more documentation, addressing > specifically: > > * What is the protocol ID? You haven't defined that at all, I don't > like that > you're effectively introducing a private driver API there, and > userspace tools > can't really know what to put into it. What's the value in that > anyway, is it > to allow nesting multiple protocols? That seems a bit excessive for > the > kernel, if needed it could be managed by the supplicant? > If it doesn't go away, then it should probably be an enum and be > checked ... > but then you might need to have drivers advertise which ones they > support? I > fail to see the point right now. I was already hesitant to put the protocol (ETH_P_*) in here. I do not have a clear use-case for it so let us drop it. > * I think there should probably be some sort of timeout. Would that > timeout be > configurable by userspace, or should this be determined in the > driver? It > seems userspace has a better idea of what kind of timeout would be > needed. > Either way, does userspace need an indication that it ended? > Also, if there's a timeout, is a per-device maximum advertisement > needed? > >> + NL80211_ATTR_CRIT_PROT_ID, >> + NL80211_ATTR_MAX_CRIT_PROT_DURATION, > > Oh, you do have a duration, but I don't see it used in the cfg80211 API? > >> +++ b/net/wireless/core.c >> @@ -745,6 +745,8 @@ static void wdev_cleanup_work(struct work_struct *work) >> wdev = container_of(work, struct wireless_dev, cleanup_work); >> rdev = wiphy_to_dev(wdev->wiphy); >> >> + schedule_delayed_work(&wdev->crit_proto_work, 0); > > That's icky -- will cause all kinds of locking problems if this > schedules out because it'll be hard to ensure it really finished. You > should probably cancel and do it inline here? > >> +void wdev_cancel_crit_proto(struct work_struct *work) >> +{ >> + struct delayed_work *dwork; >> + struct cfg80211_registered_device *rdev; >> + struct wireless_dev *wdev; >> + >> + dwork = container_of(work, struct delayed_work, work); >> + wdev = container_of(dwork, struct wireless_dev, crit_proto_work); >> + rdev = wiphy_to_dev(wdev->wiphy); >> + >> + rdev_crit_proto_stop(rdev, wdev, wdev->crit_proto); >> + wdev->crit_proto = 0; > > Why even store the protocol value in the wdev? Or was that intended to > be used to verify that you're not canceling something that doesn't > exist? > >> +#define NL80211_MIN_CRIT_PROT_DURATION 2500 /* msec */ > > That seems pretty long? Why have such a long *minimum* duration? At 2.5 > seconds, it's way long, and then disabling most of the > protections/powersave/whatever no longer makes sense for this period of > time since really mostly what this does will be reducing the wifi > latency. > Ok, so what minimum do you (or someone else can chime in here) think a DHCP exchange takes as that was considered a likely protocol that can benefit from this API. >> @@ -1417,6 +1419,8 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *dev, >> } >> CMD(start_p2p_device, START_P2P_DEVICE); >> CMD(set_mcast_rate, SET_MCAST_RATE); >> + CMD(crit_proto_start, CRIT_PROTOCOL_START); >> + CMD(crit_proto_stop, CRIT_PROTOCOL_STOP); > > Ah, a tricky problem -- unrelated to your patch. You probably saw the > wiphy size limit problem. If we keep adding commands here, we'll > eventually break older userspace completely, so I think we shouldn't. A > new way will probably be required, either adding new commands only > conditionally on split wiphy dump, or inventing a new way. I suppose > making it conditional is acceptable for all new commands since new tools > in userspace will be required anyway to make them work. > Indeed noticed some emails on this. I simply added these lines without looking what this code fragment does. >> + cancel_delayed_work(&wdev->crit_proto_work); > > That should probably be _sync. Is it even valid to start one while an > old one is present? What's the expected behaviour? Right now you > override, but is that really desired? Good point. Maybe keep track that crit_proto is started and reject a subsequent call (-EBUSY). Ideally, the start and stop should be done by the same user-space process/application. Is that possible? > Actually ... I think you should just get rid of the work, or at least > make it optional. If we were going to implement this, for example, we'd > definitely push the timing all the way into the firmware, and thus > wouldn't want to have this cancel work. Isn't that similar for you? > >> + /* skip delayed work if no timeout provided */ > > I suspect a timeout should be required? > >> + schedule_delayed_work(&wdev->crit_proto_work, >> + msecs_to_jiffies(duration)); > > Ah, ok, I guess I misunderstood it. You do use it, but don't pass it to > the driver since you let cfg80211 handle the timing... > Indeed. I wanted to be sure that the duration provided by user-space is applicable independent from a driver implementation. Do you think it makes sense to have this dealt with by cfg80211? >> + wdev->crit_proto = proto; >> + >> +done: >> + return rdev_crit_proto_start(rdev, wdev, proto); > > If this fails, you get confusion, the work will run anyway. > Good point. >> + if (info->attrs[NL80211_ATTR_CRIT_PROT_ID]) >> + proto = nla_get_u16(info->attrs[NL80211_ATTR_CRIT_PROT_ID]); >> + >> + return rdev_crit_proto_stop(rdev, wdev, proto); > > You stored it before, so why not use that here as well? Or would > start(proto=1) > stop(proto=2) > > be valid? Let's make life a bit easier by getting rid of the proto as we do not currently see a use-case for the protocol. Thanks, Arend -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2013-03-28 at 22:16 +0100, Arend van Spriel wrote: > > That seems pretty long? Why have such a long *minimum* duration? At 2.5 > > seconds, it's way long, and then disabling most of the > > protections/powersave/whatever no longer makes sense for this period of > > time since really mostly what this does will be reducing the wifi > > latency. > Ok, so what minimum do you (or someone else can chime in here) think a > DHCP exchange takes as that was considered a likely protocol that can > benefit from this API. Well, you can do DHCP a second or so, I'd think? And EAPOL much quicker, of course. I don't really see any reasonable minimum time? We might want to enforce a max though, maybe. > > Ah, a tricky problem -- unrelated to your patch. You probably saw the > > wiphy size limit problem. If we keep adding commands here, we'll > > eventually break older userspace completely, so I think we shouldn't. A > > new way will probably be required, either adding new commands only > > conditionally on split wiphy dump, or inventing a new way. I suppose > > making it conditional is acceptable for all new commands since new tools > > in userspace will be required anyway to make them work. > > > > Indeed noticed some emails on this. I simply added these lines without > looking what this code fragment does. Probably best to just say if (split) { CMD(...) } > Good point. Maybe keep track that crit_proto is started and reject a > subsequent call (-EBUSY). Ideally, the start and stop should be done by > the same user-space process/application. Is that possible? Yes ... but you'd have to make sure you abort when the application dies I guess, with the socket closing notifier thing. > > Ah, ok, I guess I misunderstood it. You do use it, but don't pass it to > > the driver since you let cfg80211 handle the timing... > > Indeed. I wanted to be sure that the duration provided by user-space is > applicable independent from a driver implementation. Do you think it > makes sense to have this dealt with by cfg80211? Not sure ... Like I said, I think if we'd implement it we'd like to put it into the firmware, so at least it'd have to be optional. And at that point I don't really see that much value in doing it in cfg80211, it's pretty simple after all. johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2013-03-28 at 22:28 +0100, Johannes Berg wrote: > On Thu, 2013-03-28 at 22:16 +0100, Arend van Spriel wrote: > > > That seems pretty long? Why have such a long *minimum* duration? At 2.5 > > > seconds, it's way long, and then disabling most of the > > > protections/powersave/whatever no longer makes sense for this period of > > > time since really mostly what this does will be reducing the wifi > > > latency. > > > Ok, so what minimum do you (or someone else can chime in here) think a > > DHCP exchange takes as that was considered a likely protocol that can > > benefit from this API. > > Well, you can do DHCP a second or so, I'd think? And EAPOL much quicker, > of course. I don't really see any reasonable minimum time? We might want > to enforce a max though, maybe. Not quite. A lot is dependent on the server itself, and I've had users on university and corporate networks report it sometimes takes 30 to 60 seconds for the whole DHCP transaction to complete (DISCOVER, REQUEST, OFFER, ACK). Sometimes there's a NAK in there if the server doesn't like your lease, which means you need another round-trip. So in many cases, it's a couple round-trips and each of these packets may or may not get lost in noisy environments. "nicer" DHCP clients have larger-bounded random backoffs between request packets to ensure they don't hammer the DHCP server or network if there are a bunch of machines booting up at the same time. That can often make the transaction take longer than a few seconds if even one DISCOVER or REQUEST packet gets lost due to noise. So really at least 15 seconds is a more useful timeout for DHCP. Dan > > > Ah, a tricky problem -- unrelated to your patch. You probably saw the > > > wiphy size limit problem. If we keep adding commands here, we'll > > > eventually break older userspace completely, so I think we shouldn't. A > > > new way will probably be required, either adding new commands only > > > conditionally on split wiphy dump, or inventing a new way. I suppose > > > making it conditional is acceptable for all new commands since new tools > > > in userspace will be required anyway to make them work. > > > > > > > Indeed noticed some emails on this. I simply added these lines without > > looking what this code fragment does. > > Probably best to just say > if (split) { > CMD(...) > } > > > Good point. Maybe keep track that crit_proto is started and reject a > > subsequent call (-EBUSY). Ideally, the start and stop should be done by > > the same user-space process/application. Is that possible? > > Yes ... but you'd have to make sure you abort when the application dies > I guess, with the socket closing notifier thing. > > > > Ah, ok, I guess I misunderstood it. You do use it, but don't pass it to > > > the driver since you let cfg80211 handle the timing... > > > > Indeed. I wanted to be sure that the duration provided by user-space is > > applicable independent from a driver implementation. Do you think it > > makes sense to have this dealt with by cfg80211? > > Not sure ... Like I said, I think if we'd implement it we'd like to put > it into the firmware, so at least it'd have to be optional. And at that > point I don't really see that much value in doing it in cfg80211, it's > pretty simple after all. > > johannes > > -- > To unsubscribe from this list: send the line "unsubscribe linux-wireless" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2013-03-28 at 17:42 -0500, Dan Williams wrote: > > Well, you can do DHCP a second or so, I'd think? And EAPOL much quicker, > > of course. I don't really see any reasonable minimum time? We might want > > to enforce a max though, maybe. > > Not quite. A lot is dependent on the server itself, and I've had users > on university and corporate networks report it sometimes takes 30 to 60 > seconds for the whole DHCP transaction to complete (DISCOVER, REQUEST, > OFFER, ACK). Sometimes there's a NAK in there if the server doesn't > like your lease, which means you need another round-trip. So in many > cases, it's a couple round-trips and each of these packets may or may > not get lost in noisy environments. Oh, yes, of course. However, we're talking about optimising the good cases, not the bad ones. Think of it this way: if it goes fast, we shouldn't make it slow by putting things like powersave or similar in the way. If it's slow, then it'll still work, just slower. But when "slower" only means a few hundred milliseconds, it doesn't matter if everything takes forever (30-60 secs) johannes -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/28/2013 03:42 PM, Dan Williams wrote: > On Thu, 2013-03-28 at 22:28 +0100, Johannes Berg wrote: >> On Thu, 2013-03-28 at 22:16 +0100, Arend van Spriel wrote: >>>> That seems pretty long? Why have such a long *minimum* duration? At 2.5 >>>> seconds, it's way long, and then disabling most of the >>>> protections/powersave/whatever no longer makes sense for this period of >>>> time since really mostly what this does will be reducing the wifi >>>> latency. >> >>> Ok, so what minimum do you (or someone else can chime in here) think a >>> DHCP exchange takes as that was considered a likely protocol that can >>> benefit from this API. >> >> Well, you can do DHCP a second or so, I'd think? And EAPOL much quicker, >> of course. I don't really see any reasonable minimum time? We might want >> to enforce a max though, maybe. > > Not quite. A lot is dependent on the server itself, and I've had users > on university and corporate networks report it sometimes takes 30 to 60 > seconds for the whole DHCP transaction to complete (DISCOVER, REQUEST, > OFFER, ACK). Sometimes there's a NAK in there if the server doesn't > like your lease, which means you need another round-trip. So in many > cases, it's a couple round-trips and each of these packets may or may > not get lost in noisy environments. Anyone know if DHCP requests and responses go to the high-priority queue in the NIC by default? Seems like that might be a big help if not... Thanks, Ben
On Thu, 2013-03-28 at 15:51 -0700, Ben Greear wrote: > On 03/28/2013 03:42 PM, Dan Williams wrote: > > On Thu, 2013-03-28 at 22:28 +0100, Johannes Berg wrote: > >> On Thu, 2013-03-28 at 22:16 +0100, Arend van Spriel wrote: > >>>> That seems pretty long? Why have such a long *minimum* duration? At 2.5 > >>>> seconds, it's way long, and then disabling most of the > >>>> protections/powersave/whatever no longer makes sense for this period of > >>>> time since really mostly what this does will be reducing the wifi > >>>> latency. > >> > >>> Ok, so what minimum do you (or someone else can chime in here) think a > >>> DHCP exchange takes as that was considered a likely protocol that can > >>> benefit from this API. > >> > >> Well, you can do DHCP a second or so, I'd think? And EAPOL much quicker, > >> of course. I don't really see any reasonable minimum time? We might want > >> to enforce a max though, maybe. > > > > Not quite. A lot is dependent on the server itself, and I've had users > > on university and corporate networks report it sometimes takes 30 to 60 > > seconds for the whole DHCP transaction to complete (DISCOVER, REQUEST, > > OFFER, ACK). Sometimes there's a NAK in there if the server doesn't > > like your lease, which means you need another round-trip. So in many > > cases, it's a couple round-trips and each of these packets may or may > > not get lost in noisy environments. > > Anyone know if DHCP requests and responses go to the high-priority > queue in the NIC by default? Seems like that might be a big help if not... Depends on the DHCP client I suppose, but probably doubtful for dhclient and dhcpcd at least. That would be a good patch. Dan -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2013-03-28 at 23:44 +0100, Johannes Berg wrote: > On Thu, 2013-03-28 at 17:42 -0500, Dan Williams wrote: > > > > Well, you can do DHCP a second or so, I'd think? And EAPOL much quicker, > > > of course. I don't really see any reasonable minimum time? We might want > > > to enforce a max though, maybe. > > > > Not quite. A lot is dependent on the server itself, and I've had users > > on university and corporate networks report it sometimes takes 30 to 60 > > seconds for the whole DHCP transaction to complete (DISCOVER, REQUEST, > > OFFER, ACK). Sometimes there's a NAK in there if the server doesn't > > like your lease, which means you need another round-trip. So in many > > cases, it's a couple round-trips and each of these packets may or may > > not get lost in noisy environments. > > Oh, yes, of course. However, we're talking about optimising the good > cases, not the bad ones. Think of it this way: if it goes fast, we > shouldn't make it slow by putting things like powersave or similar in > the way. If it's slow, then it'll still work, just slower. But when > "slower" only means a few hundred milliseconds, it doesn't matter if > everything takes forever (30-60 secs) True, but at least 4 or 5 seconds is the minimum time I'd recommend here for DHCP. Dan -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/28/2013 04:01 PM, Dan Williams wrote: > On Thu, 2013-03-28 at 23:44 +0100, Johannes Berg wrote: >> On Thu, 2013-03-28 at 17:42 -0500, Dan Williams wrote: >> >>>> Well, you can do DHCP a second or so, I'd think? And EAPOL much quicker, >>>> of course. I don't really see any reasonable minimum time? We might want >>>> to enforce a max though, maybe. >>> >>> Not quite. A lot is dependent on the server itself, and I've had users >>> on university and corporate networks report it sometimes takes 30 to 60 >>> seconds for the whole DHCP transaction to complete (DISCOVER, REQUEST, >>> OFFER, ACK). Sometimes there's a NAK in there if the server doesn't >>> like your lease, which means you need another round-trip. So in many >>> cases, it's a couple round-trips and each of these packets may or may >>> not get lost in noisy environments. >> >> Oh, yes, of course. However, we're talking about optimising the good >> cases, not the bad ones. Think of it this way: if it goes fast, we >> shouldn't make it slow by putting things like powersave or similar in >> the way. If it's slow, then it'll still work, just slower. But when >> "slower" only means a few hundred milliseconds, it doesn't matter if >> everything takes forever (30-60 secs) > > True, but at least 4 or 5 seconds is the minimum time I'd recommend here > for DHCP. Couldn't dhcp just turn off the critical protection as soon as it is done? Then, you only need to worry about the max time allowed. Also, you would probably need to enforce in the kernel that only x out of y time in any given period can be locked, otherwise lots of different dhclient processes (perhaps erroneously spawned..or running on lots of different VIFs) could basically disable scanning or channel changes... Thanks, Ben
On 03/28/2013 11:44 PM, Johannes Berg wrote: > On Thu, 2013-03-28 at 17:42 -0500, Dan Williams wrote: > >>> Well, you can do DHCP a second or so, I'd think? And EAPOL much quicker, >>> of course. I don't really see any reasonable minimum time? We might want >>> to enforce a max though, maybe. >> >> Not quite. A lot is dependent on the server itself, and I've had users >> on university and corporate networks report it sometimes takes 30 to 60 >> seconds for the whole DHCP transaction to complete (DISCOVER, REQUEST, >> OFFER, ACK). Sometimes there's a NAK in there if the server doesn't >> like your lease, which means you need another round-trip. So in many >> cases, it's a couple round-trips and each of these packets may or may >> not get lost in noisy environments. > > Oh, yes, of course. However, we're talking about optimising the good > cases, not the bad ones. Think of it this way: if it goes fast, we > shouldn't make it slow by putting things like powersave or similar in > the way. If it's slow, then it'll still work, just slower. But when > "slower" only means a few hundred milliseconds, it doesn't matter if > everything takes forever (30-60 secs) In our android driver, which has a private ioctl for this stuff, it is used for DHCP and makes WLAN connection more reliable by altering BT coex parameters. It has its own timeout schedule. Timer T1 is 2.5 sec. for DCHP transaction to complete. If T1 expires before completion it increases WLAN priority more with timer T2 for 5 seconds. That is why I put 2.5 sec as a minimum duration in the patch. Unfortunately I do not have the data available to back these timeout values. I will ask around. Gr. AvS -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/29/2013 12:30 AM, Ben Greear wrote: > On 03/28/2013 04:01 PM, Dan Williams wrote: >> On Thu, 2013-03-28 at 23:44 +0100, Johannes Berg wrote: >>> On Thu, 2013-03-28 at 17:42 -0500, Dan Williams wrote: >>> >>>>> Well, you can do DHCP a second or so, I'd think? And EAPOL much >>>>> quicker, >>>>> of course. I don't really see any reasonable minimum time? We might >>>>> want >>>>> to enforce a max though, maybe. >>>> >>>> Not quite. A lot is dependent on the server itself, and I've had users >>>> on university and corporate networks report it sometimes takes 30 to 60 >>>> seconds for the whole DHCP transaction to complete (DISCOVER, REQUEST, >>>> OFFER, ACK). Sometimes there's a NAK in there if the server doesn't >>>> like your lease, which means you need another round-trip. So in many >>>> cases, it's a couple round-trips and each of these packets may or may >>>> not get lost in noisy environments. >>> >>> Oh, yes, of course. However, we're talking about optimising the good >>> cases, not the bad ones. Think of it this way: if it goes fast, we >>> shouldn't make it slow by putting things like powersave or similar in >>> the way. If it's slow, then it'll still work, just slower. But when >>> "slower" only means a few hundred milliseconds, it doesn't matter if >>> everything takes forever (30-60 secs) >> >> True, but at least 4 or 5 seconds is the minimum time I'd recommend here >> for DHCP. > > Couldn't dhcp just turn off the critical protection as soon as it is done? That is the idea. It just seemed sane to have some minimum specified, but I guess its value depends on the protocol that needs protection as this API is not limited to DHCP. I will remove the minimum. Also I think DHCP should not use the API to protect the whole transaction, but only when there is a message exchange being initiated. > Then, you only need to worry about the max time allowed. True, but I think that also depends on the protocol and possibly also on the solution in the driver to increase a more reliable connection. Some solution may have a negative effect on other functions (eg. bluetooth) which require another maximum timeout opposed to suppressing a scanning. With DHCP in mind I would say somewhere between 5-10 sec. is (more than) enough. > Also, you would probably need to enforce in the kernel that only > x out of y time in any given period can be locked, otherwise lots > of different dhclient processes (perhaps erroneously spawned..or > running on lots of different VIFs) could basically disable scanning > or channel changes... True. Will try to come up with some sane solution for this. Gr. AvS -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2013-03-28 at 16:30 -0700, Ben Greear wrote: > On 03/28/2013 04:01 PM, Dan Williams wrote: > > On Thu, 2013-03-28 at 23:44 +0100, Johannes Berg wrote: > >> On Thu, 2013-03-28 at 17:42 -0500, Dan Williams wrote: > >> > >>>> Well, you can do DHCP a second or so, I'd think? And EAPOL much quicker, > >>>> of course. I don't really see any reasonable minimum time? We might want > >>>> to enforce a max though, maybe. > >>> > >>> Not quite. A lot is dependent on the server itself, and I've had users > >>> on university and corporate networks report it sometimes takes 30 to 60 > >>> seconds for the whole DHCP transaction to complete (DISCOVER, REQUEST, > >>> OFFER, ACK). Sometimes there's a NAK in there if the server doesn't > >>> like your lease, which means you need another round-trip. So in many > >>> cases, it's a couple round-trips and each of these packets may or may > >>> not get lost in noisy environments. > >> > >> Oh, yes, of course. However, we're talking about optimising the good > >> cases, not the bad ones. Think of it this way: if it goes fast, we > >> shouldn't make it slow by putting things like powersave or similar in > >> the way. If it's slow, then it'll still work, just slower. But when > >> "slower" only means a few hundred milliseconds, it doesn't matter if > >> everything takes forever (30-60 secs) > > > > True, but at least 4 or 5 seconds is the minimum time I'd recommend here > > for DHCP. > > Couldn't dhcp just turn off the critical protection as soon as it is done? > > Then, you only need to worry about the max time allowed. Yes, that's really what I meant. 4 - 5 seconds is the "best worst-case scenario", clearly when a lease is acquired the critical protection would be turned off by the connection manager. But if something doesn't turn it off, and the 802.11 stack needs a timeout value, I would suggest 4 or 5 seconds for that. Dan > Also, you would probably need to enforce in the kernel that only > x out of y time in any given period can be locked, otherwise lots > of different dhclient processes (perhaps erroneously spawned..or > running on lots of different VIFs) could basically disable scanning > or channel changes... > > Thanks, > Ben > -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h index 57870b6..e864989 100644 --- a/include/net/cfg80211.h +++ b/include/net/cfg80211.h @@ -2002,6 +2002,9 @@ struct cfg80211_update_ft_ies_params { * @update_ft_ies: Provide updated Fast BSS Transition information to the * driver. If the SME is in the driver/firmware, this information can be * used in building Authentication and Reassociation Request frames. + * @crit_proto_start: Indicates a critical protocol needs more link reliability. + * @crit_proto_stop: Indicates critical protocol no longer needs increased link + * reliability. */ struct cfg80211_ops { int (*suspend)(struct wiphy *wiphy, struct cfg80211_wowlan *wow); @@ -2231,6 +2234,10 @@ struct cfg80211_ops { struct cfg80211_chan_def *chandef); int (*update_ft_ies)(struct wiphy *wiphy, struct net_device *dev, struct cfg80211_update_ft_ies_params *ftie); + int (*crit_proto_start)(struct wiphy *wiphy, + struct wireless_dev *wdev, u16 protocol); + int (*crit_proto_stop)(struct wiphy *wiphy, + struct wireless_dev *wdev, u16 protocol); }; /* @@ -2830,6 +2837,8 @@ struct cfg80211_cached_keys; * @p2p_started: true if this is a P2P Device that has been started * @cac_started: true if DFS channel availability check has been started * @cac_start_time: timestamp (jiffies) when the dfs state was entered. + * @crit_proto_work: delayed work guarding duration of critical protocol. + * @crit_proto: ethernet protocol for which delayed work is scheduled. */ struct wireless_dev { struct wiphy *wiphy; @@ -2884,6 +2893,9 @@ struct wireless_dev { bool cac_started; unsigned long cac_start_time; + struct delayed_work crit_proto_work; + u16 crit_proto; + #ifdef CONFIG_CFG80211_WEXT /* wext data */ struct { diff --git a/include/uapi/linux/nl80211.h b/include/uapi/linux/nl80211.h index 79da871..a9b17ff 100644 --- a/include/uapi/linux/nl80211.h +++ b/include/uapi/linux/nl80211.h @@ -639,6 +639,13 @@ * with the relevant Information Elements. This event is used to report * received FT IEs (MDIE, FTIE, RSN IE, TIE, RICIE). * + * @NL80211_CMD_CRIT_PROTOCOL_START: Indicates user-space will start running + * a critical protocol that needs more reliability in the connection to + * complete. + * + * @NL80211_CMD_CRIT_PROTOCOL_STOP: Indicates the connection reliability can + * return back to normal. + * * @NL80211_CMD_MAX: highest used command number * @__NL80211_CMD_AFTER_LAST: internal use */ @@ -798,6 +805,9 @@ enum nl80211_commands { NL80211_CMD_UPDATE_FT_IES, NL80211_CMD_FT_EVENT, + NL80211_CMD_CRIT_PROTOCOL_START, + NL80211_CMD_CRIT_PROTOCOL_STOP, + /* add new commands above here */ /* used to define NL80211_CMD_MAX below */ @@ -1709,6 +1719,9 @@ enum nl80211_attrs { NL80211_ATTR_MDID, NL80211_ATTR_IE_RIC, + NL80211_ATTR_CRIT_PROT_ID, + NL80211_ATTR_MAX_CRIT_PROT_DURATION, + /* add attributes here, update the policy in nl80211.c */ __NL80211_ATTR_AFTER_LAST, diff --git a/net/wireless/core.c b/net/wireless/core.c index 92e3fd4..e98db00 100644 --- a/net/wireless/core.c +++ b/net/wireless/core.c @@ -745,6 +745,8 @@ static void wdev_cleanup_work(struct work_struct *work) wdev = container_of(work, struct wireless_dev, cleanup_work); rdev = wiphy_to_dev(wdev->wiphy); + schedule_delayed_work(&wdev->crit_proto_work, 0); + cfg80211_lock_rdev(rdev); if (WARN_ON(rdev->scan_req && rdev->scan_req->wdev == wdev)) { @@ -771,6 +773,20 @@ static void wdev_cleanup_work(struct work_struct *work) dev_put(wdev->netdev); } +void wdev_cancel_crit_proto(struct work_struct *work) +{ + struct delayed_work *dwork; + struct cfg80211_registered_device *rdev; + struct wireless_dev *wdev; + + dwork = container_of(work, struct delayed_work, work); + wdev = container_of(dwork, struct wireless_dev, crit_proto_work); + rdev = wiphy_to_dev(wdev->wiphy); + + rdev_crit_proto_stop(rdev, wdev, wdev->crit_proto); + wdev->crit_proto = 0; +} + void cfg80211_unregister_wdev(struct wireless_dev *wdev) { struct cfg80211_registered_device *rdev = wiphy_to_dev(wdev->wiphy); @@ -886,6 +902,8 @@ static int cfg80211_netdev_notifier_call(struct notifier_block *nb, spin_lock_init(&wdev->event_lock); INIT_LIST_HEAD(&wdev->mgmt_registrations); spin_lock_init(&wdev->mgmt_registrations_lock); + INIT_DELAYED_WORK(&wdev->crit_proto_work, + wdev_cancel_crit_proto); mutex_lock(&rdev->devlist_mtx); wdev->identifier = ++rdev->wdev_id; diff --git a/net/wireless/core.h b/net/wireless/core.h index d5d06fd..af2eb94 100644 --- a/net/wireless/core.h +++ b/net/wireless/core.h @@ -502,6 +502,7 @@ void cfg80211_update_iface_num(struct cfg80211_registered_device *rdev, void cfg80211_leave(struct cfg80211_registered_device *rdev, struct wireless_dev *wdev); +void wdev_cancel_crit_proto(struct work_struct *work); #define CFG80211_MAX_NUM_DIFFERENT_CHANNELS 10 diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c index f924d45..dd0c4b7 100644 --- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -25,6 +25,8 @@ #include "reg.h" #include "rdev-ops.h" +#define NL80211_MIN_CRIT_PROT_DURATION 2500 /* msec */ + static int nl80211_crypto_settings(struct cfg80211_registered_device *rdev, struct genl_info *info, struct cfg80211_crypto_settings *settings, @@ -1417,6 +1419,8 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *dev, } CMD(start_p2p_device, START_P2P_DEVICE); CMD(set_mcast_rate, SET_MCAST_RATE); + CMD(crit_proto_start, CRIT_PROTOCOL_START); + CMD(crit_proto_stop, CRIT_PROTOCOL_STOP); #ifdef CONFIG_NL80211_TESTMODE CMD(testmode_cmd, TESTMODE); @@ -2467,6 +2471,8 @@ static int nl80211_new_interface(struct sk_buff *skb, struct genl_info *info) spin_lock_init(&wdev->event_lock); INIT_LIST_HEAD(&wdev->mgmt_registrations); spin_lock_init(&wdev->mgmt_registrations_lock); + INIT_DELAYED_WORK(&wdev->crit_proto_work, + wdev_cancel_crit_proto); mutex_lock(&rdev->devlist_mtx); wdev->identifier = ++rdev->wdev_id; @@ -8196,6 +8202,62 @@ static int nl80211_update_ft_ies(struct sk_buff *skb, struct genl_info *info) return rdev_update_ft_ies(rdev, dev, &ft_params); } +static int nl80211_crit_protocol_start(struct sk_buff *skb, + struct genl_info *info) +{ + struct cfg80211_registered_device *rdev = info->user_ptr[0]; + struct wireless_dev *wdev = info->user_ptr[1]; + u16 proto = 0; + u16 duration; + + if (!rdev->ops->crit_proto_start) + return -EOPNOTSUPP; + + if (WARN_ON(!rdev->ops->crit_proto_stop)) + return -EINVAL; + + cancel_delayed_work(&wdev->crit_proto_work); + wdev->crit_proto = 0; + + /* determine protocol if provided */ + if (info->attrs[NL80211_ATTR_CRIT_PROT_ID]) + proto = nla_get_u16(info->attrs[NL80211_ATTR_CRIT_PROT_ID]); + + /* skip delayed work if no timeout provided */ + if (!info->attrs[NL80211_ATTR_MAX_CRIT_PROT_DURATION]) + goto done; + + duration = + nla_get_u16(info->attrs[NL80211_ATTR_MAX_CRIT_PROT_DURATION]); + + duration = max_t(u16, duration, NL80211_MIN_CRIT_PROT_DURATION); + schedule_delayed_work(&wdev->crit_proto_work, + msecs_to_jiffies(duration)); + wdev->crit_proto = proto; + +done: + return rdev_crit_proto_start(rdev, wdev, proto); +} + +static int nl80211_crit_protocol_stop(struct sk_buff *skb, + struct genl_info *info) +{ + struct cfg80211_registered_device *rdev = info->user_ptr[0]; + struct wireless_dev *wdev = info->user_ptr[1]; + u16 proto = 0; + + if (!rdev->ops->crit_proto_stop) + return -EOPNOTSUPP; + + cancel_delayed_work(&wdev->crit_proto_work); + wdev->crit_proto = 0; + + if (info->attrs[NL80211_ATTR_CRIT_PROT_ID]) + proto = nla_get_u16(info->attrs[NL80211_ATTR_CRIT_PROT_ID]); + + return rdev_crit_proto_stop(rdev, wdev, proto); +} + #define NL80211_FLAG_NEED_WIPHY 0x01 #define NL80211_FLAG_NEED_NETDEV 0x02 #define NL80211_FLAG_NEED_RTNL 0x04 @@ -8885,6 +8947,22 @@ static struct genl_ops nl80211_ops[] = { .internal_flags = NL80211_FLAG_NEED_NETDEV_UP | NL80211_FLAG_NEED_RTNL, }, + { + .cmd = NL80211_CMD_CRIT_PROTOCOL_START, + .doit = nl80211_crit_protocol_start, + .policy = nl80211_policy, + .flags = GENL_ADMIN_PERM, + .internal_flags = NL80211_FLAG_NEED_WDEV_UP | + NL80211_FLAG_NEED_RTNL, + }, + { + .cmd = NL80211_CMD_CRIT_PROTOCOL_STOP, + .doit = nl80211_crit_protocol_stop, + .policy = nl80211_policy, + .flags = GENL_ADMIN_PERM, + .internal_flags = NL80211_FLAG_NEED_WDEV_UP | + NL80211_FLAG_NEED_RTNL, + } }; static struct genl_multicast_group nl80211_mlme_mcgrp = { diff --git a/net/wireless/rdev-ops.h b/net/wireless/rdev-ops.h index d77e1c1..906b92f 100644 --- a/net/wireless/rdev-ops.h +++ b/net/wireless/rdev-ops.h @@ -901,4 +901,26 @@ static inline int rdev_update_ft_ies(struct cfg80211_registered_device *rdev, return ret; } +static inline int rdev_crit_proto_start(struct cfg80211_registered_device *rdev, + struct wireless_dev *wdev, u16 protocol) +{ + int ret; + + trace_rdev_crit_proto_start(&rdev->wiphy, wdev, protocol); + ret = rdev->ops->crit_proto_start(&rdev->wiphy, wdev, protocol); + trace_rdev_return_int(&rdev->wiphy, ret); + return ret; +} + +static inline int rdev_crit_proto_stop(struct cfg80211_registered_device *rdev, + struct wireless_dev *wdev, u16 protocol) +{ + int ret; + + trace_rdev_crit_proto_stop(&rdev->wiphy, wdev, protocol); + ret = rdev->ops->crit_proto_stop(&rdev->wiphy, wdev, protocol); + trace_rdev_return_int(&rdev->wiphy, ret); + return ret; +} + #endif /* __CFG80211_RDEV_OPS */ diff --git a/net/wireless/trace.h b/net/wireless/trace.h index ccadef2..43fcc60 100644 --- a/net/wireless/trace.h +++ b/net/wireless/trace.h @@ -1805,6 +1805,42 @@ TRACE_EVENT(rdev_update_ft_ies, WIPHY_PR_ARG, NETDEV_PR_ARG, __entry->md) ); +TRACE_EVENT(rdev_crit_proto_start, + TP_PROTO(struct wiphy *wiphy, struct wireless_dev *wdev, + u16 protocol), + TP_ARGS(wiphy, wdev, protocol), + TP_STRUCT__entry( + WIPHY_ENTRY + WDEV_ENTRY + __field(u16, proto) + ), + TP_fast_assign( + WIPHY_ASSIGN; + WDEV_ASSIGN; + __entry->proto = protocol; + ), + TP_printk(WIPHY_PR_FMT ", " WDEV_PR_FMT ", proto=%x", + WIPHY_PR_ARG, WDEV_PR_ARG, __entry->proto) +); + +TRACE_EVENT(rdev_crit_proto_stop, + TP_PROTO(struct wiphy *wiphy, struct wireless_dev *wdev, + u16 protocol), + TP_ARGS(wiphy, wdev, protocol), + TP_STRUCT__entry( + WIPHY_ENTRY + WDEV_ENTRY + __field(u16, proto) + ), + TP_fast_assign( + WIPHY_ASSIGN; + WDEV_ASSIGN; + __entry->proto = protocol; + ), + TP_printk(WIPHY_PR_FMT ", " WDEV_PR_FMT ", proto=%x", + WIPHY_PR_ARG, WDEV_PR_ARG, __entry->proto) +); + /************************************************************* * cfg80211 exported functions traces * *************************************************************/