Message ID | 20231201104329.25898-5-johannes@sipsolutions.net (mailing list archive) |
---|---|
Headers | show |
Series | netlink carrier race workaround | expand |
On Fri, 1 Dec 2023 11:41:14 +0100 Johannes Berg wrote: > So I had put this aside for a while, but really got annoyed by all > the test failures now ... thinking about this again I basically now > arrived at a variant of solution #3 previously outlined, and I've > kind of convinced myself that userspace should always get an event > with a new carrier_up_count as it does today. Would it work if we exposed "linkwatch is pending" / "link is transitioning" bit to user space? Even crazier, would it help if we had rtnl_getlink() run linkwatch for the target link if linkwatch is pending?
On Fri, 2023-12-01 at 16:28 -0800, Jakub Kicinski wrote: > On Fri, 1 Dec 2023 11:41:14 +0100 Johannes Berg wrote: > > So I had put this aside for a while, but really got annoyed by all > > the test failures now ... thinking about this again I basically now > > arrived at a variant of solution #3 previously outlined, and I've > > kind of convinced myself that userspace should always get an event > > with a new carrier_up_count as it does today. > > Would it work if we exposed "linkwatch is pending" / "link is > transitioning" bit to user space? Not sure, not by much or more than what this did? It's basically the same, I think: I exposed the carrier_up_count at the kernel time, so if userspace hasn't seen an event with a value >= that it knows the link is transitioning. > Even crazier, would it help if we had rtnl_getlink() run > linkwatch for the target link if linkwatch is pending? Sure, if we were to just synchronize that at the right time (doesn't even need to be rtnl_getlink, could be a new operation) that'd solve the issue too, perhaps more easily. johannes
On Sat, 02 Dec 2023 11:06:36 +0100 Johannes Berg wrote: > > Would it work if we exposed "linkwatch is pending" / "link is > > transitioning" bit to user space? > > Not sure, not by much or more than what this did? It's basically the > same, I think: I exposed the carrier_up_count at the kernel time, so if > userspace hasn't seen an event with a value >= that it knows the link is > transitioning. The benefit being that it'd work for everyone, without having to add the carrier count in random events? > > Even crazier, would it help if we had rtnl_getlink() run > > linkwatch for the target link if linkwatch is pending? > > Sure, if we were to just synchronize that at the right time (doesn't > even need to be rtnl_getlink, could be a new operation) that'd solve the > issue too, perhaps more easily. I was wondering about the new op, too, but "synchronize things please" op feels a little hacky. rtnl_getlink returns link state, so it feels somewhat natural for it to do the sync, to make sure that what it returns is in fact correct information. No strong feelings, tho. rtnl_getlink does return a lot, so maybe a new rtnl_getcarrier op? Or we can make reading sysfs "carrier" do the sync?
On Sat, 2023-12-02 at 10:46 -0800, Jakub Kicinski wrote: > On Sat, 02 Dec 2023 11:06:36 +0100 Johannes Berg wrote: > > > Would it work if we exposed "linkwatch is pending" / "link is > > > transitioning" bit to user space? > > > > Not sure, not by much or more than what this did? It's basically the > > same, I think: I exposed the carrier_up_count at the kernel time, so if > > userspace hasn't seen an event with a value >= that it knows the link is > > transitioning. > > The benefit being that it'd work for everyone, without having to add > the carrier count in random events? Well, true. You'd still have to add random rtnl_getlink() calls to your userspace, and then wait for an event if it's transitioning? Actually a bit _more_ complicated since then we'd have to do rtnl_getlink() after receiving the assoc event, and then wait if still transitioning. Or I guess we could do it when sending a frame there in the tests, but it's another call into the kernel vs. getting the information we need in the event. But yeah honestly I don't mind that either, and maybe it helps address some other use cases like what Andrew had in mind in his reply to my original thread. > > > Even crazier, would it help if we had rtnl_getlink() run > > > linkwatch for the target link if linkwatch is pending? > > > > Sure, if we were to just synchronize that at the right time (doesn't > > even need to be rtnl_getlink, could be a new operation) that'd solve the > > issue too, perhaps more easily. > > I was wondering about the new op, too, but "synchronize things please" > op feels a little hacky. Agree ... but then again it's all a bit hacky. You can even read "carrier is on" when it's really not yet ready... > rtnl_getlink returns link state, so it feels > somewhat natural for it to do the sync, to make sure that what it > returns is in fact correct information. Yeah that's a good point that I just mentioned above though - today the kernel will happily return a state that it's not actually willing to honour yet, i.e. if you actively read the state, you'll see carrier on before the kernel is actually willing to transmit packets on the link. Fixing that _would_ be nice, but I'm somewhat worried that it will cause performance regressions to always sync there? OTOH, it would hopefully not actually have to wait most of the time since link_watch isn't always pending... > rtnl_getlink does return a lot, so maybe a new rtnl_getcarrier op? Does it matter? Just another attribute ... > Or we can make reading sysfs "carrier" do the sync? I think I wouldn't mind now, and perhaps if we want to sync in netlink we should also do this here so that it's consistent, but I'm not sure I'd want this to be the only way to do it, I might imagine that someone might want this in some kind of container that doesn't necessarily have (full) access there? Dunno. We _could_ also use an input attribute on the rtnl_getlink() call to have userspace explicitly opt in to doing the sync before returning information? johannes
On Sun, 03 Dec 2023 19:51:28 +0100 Johannes Berg wrote: > I think I wouldn't mind now, and perhaps if we want to sync in netlink > we should also do this here so that it's consistent, but I'm not sure > I'd want this to be the only way to do it, I might imagine that someone > might want this in some kind of container that doesn't necessarily have > (full) access there? Dunno. Also dunno :) We can add a "sync" version of netif_carrier_ok() and then call if from whatever places we need. > We _could_ also use an input attribute on the rtnl_getlink() call to > have userspace explicitly opt in to doing the sync before returning > information? Yeah, maybe.. IMHO a more "Rusty Russell API levels" thing to do would be to allow opting out, as those who set the magic flag "know what they are doing" and returning unsync'ed carrier may be surprising. Also a "don't sync flag" we can add later, once someone who actually cares appears, avoiding uAPI growth
On Mon, 2023-12-04 at 08:23 -0800, Jakub Kicinski wrote: > On Sun, 03 Dec 2023 19:51:28 +0100 Johannes Berg wrote: > > I think I wouldn't mind now, and perhaps if we want to sync in netlink > > we should also do this here so that it's consistent, but I'm not sure > > I'd want this to be the only way to do it, I might imagine that someone > > might want this in some kind of container that doesn't necessarily have > > (full) access there? Dunno. > > Also dunno :) We can add a "sync" version of netif_carrier_ok() > and then call if from whatever places we need. [note: netif_carrier_ok(), not netif_carrier_on(), almost confused that] Yeah I guess we can have a netif_carrier_ok_sync(), though it feels kind of dubious to hide such an important detail in the middle of a netlink message building: if (nla_put_string(skb, IFLA_IFNAME, dev->name) || ... #ifdef CONFIG_RPS nla_put_u32(skb, IFLA_NUM_RX_QUEUES, dev->num_rx_queues) || #endif put_master_ifindex(skb, dev) || nla_put_u8(skb, IFLA_CARRIER, netif_carrier_ok(dev)) || ... Also, if we ever _do_ want to make it optional, then it's problematic, do we do netif_carrier_ok_maybe_sync(dev, sync)? ;-) > > We _could_ also use an input attribute on the rtnl_getlink() call to > > have userspace explicitly opt in to doing the sync before returning > > information? > > Yeah, maybe.. IMHO a more "Rusty Russell API levels" thing to do would > be to allow opting out, as those who set the magic flag "know what they > are doing" and returning unsync'ed carrier may be surprising. > Also a "don't sync flag" we can add later, once someone who actually > cares appears, avoiding uAPI growth
On Mon, 04 Dec 2023 20:14:10 +0100 Johannes Berg wrote: > Heh. But do I want to get blamed for the (perhaps inevitable?) > performance regression? I guess I'll try ... I'd happily bet that nobody will notice. Feel free to add: Suggested-by: Jakub Kicinski <kuba@kernel.org> If that makes it better? > Actually I could even still combine this with the netif carrier up count > in the wireless events, so we only have to do the rtnl_getlink if we > haven't seen an event yet, and - in the likely common case - save the > extra roundtrip? Though I guess it's not a huge problem, it's once per > connection basically. No objections to merging your carrier count patches to wireless, if you prefer to keep them. But it'd be nice to also have a generic mechanism.