Message ID | 20230119180008.2156048-1-leitao@debian.org (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [RFC,v2] netpoll: Remove 4s sleep during carrier detection |

On Thu, 19 Jan 2023 10:00:08 -0800 Breno Leitao wrote:
> This patch proposes to remove the msleep(4s) during netpoll_setup() if
> the carrier appears instantly.
>
> Modern NICs do not seem to have this bouncing problem anymore, and this
> sleep slows down the machine boot unnecessarily

We should mention in the message that the wait is counter-productive on
servers which have BMC communicating over NC-SI via the same NIC as gets
used for netconsole. BMC will keep the PHY up, hence the carrier
appearing instantly.

We could add a smaller delay, but really having instant carrier and
then losing it seems like a driver bug, so let's try to rip the band
aid off and ask for forgiveness instead.

Few extra process rules:
 - don't repost another version within 24h,
 - keep a changelog under ---
 - add tree name to the tag - [PATCH net-next]

Also, I'd just go for PATCH, no need to RFC this. If someone wants to
object they can object to a PATCH.

On Thu, Jan 19, 2023 at 11:04:21AM -0800, Jakub Kicinski wrote:
> On Thu, 19 Jan 2023 10:00:08 -0800 Breno Leitao wrote:
> > This patch proposes to remove the msleep(4s) during netpoll_setup() if
> > the carrier appears instantly.
> >
> > Modern NICs do not seem to have this bouncing problem anymore, and this
> > sleep slows down the machine boot unnecessarily

I'm not sure 'bouncing' is the correct word here. That would imply up,
down, up, down and then stable up. I guess the real issue here was that
the MAC driver said the link was up while autoneg was still happening,
which takes around 1.5 seconds.

> We should mention in the message that the wait is counter-productive on
> servers which have BMC communicating over NC-SI via the same NIC as gets
> used for netconsole. BMC will keep the PHY up, hence the carrier
> appearing instantly.
>
> We could add a smaller delay, but really having instant carrier and
> then losing it seems like a driver bug, so let's try to rip the band
> aid off and ask for forgiveness instead.

It would be good to put some of this into the commit message. Explain
the case where you see it go wrong. The other scenarios I can think of are:

The bootloader configured the interface up, and used the interface,
e.g. to tftp boot. The PHY was left up when transitioning into
Linux. Hence there is no need to wait around 1.5 seconds for autoneg
to complete.

The link is fibre, SERDES getting sync could happen within HZ/10
jiffies (0.1 s), and so it appears to be instantaneous.

This workaround does seem very old, pre-git times, so I also doubt
there are many systems which are truly broken like this.

	Andrew
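
To make the scenarios in this thread concrete, here is a small userspace model of the heuristic under discussion. It is only a sketch: `MODEL_HZ` and `extra_delay_ticks()` are hypothetical names that do not exist in the kernel, and the tick rate of 1000 is an assumption. It mirrors the pre-patch check (carrier seen less than HZ/10 after the wait began is distrusted and costs a 4 second sleep) against the patched behaviour.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
#define MODEL_HZ 1000UL	/* assumed tick rate: 1000 ticks per second */

/*
 * Extra delay, in ticks, added after carrier was detected.
 *
 * carrier_up_after: ticks between starting the wait and seeing carrier.
 * patched:          true models the code with the 4s sleep removed.
 *
 * Pre-patch, carrier that shows up in under HZ/10 ticks (0.1 s) is
 * treated as untrustworthy and followed by a 4 second sleep.
 */
unsigned long extra_delay_ticks(unsigned long carrier_up_after, bool patched)
{
	unsigned long atleast = MODEL_HZ / 10;

	if (patched)
		return 0;

	return carrier_up_after < atleast ? 4 * MODEL_HZ : 0;
}
```

With these assumptions, a PHY kept up by the BMC or the bootloader (`carrier_up_after` of 0) costs 4 * `MODEL_HZ` ticks before the patch and nothing after it, while a link that needs about 1.5 seconds of autoneg never hit the sleep in either version.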

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 9be762e1d042..a089b704b986 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -682,7 +682,7 @@ int netpoll_setup(struct netpoll *np)
 	}
 
 	if (!netif_running(ndev)) {
-		unsigned long atmost, atleast;
+		unsigned long atmost;
 
 		np_info(np, "device %s not up yet, forcing it\n", np->dev_name);
 
@@ -694,7 +694,6 @@ int netpoll_setup(struct netpoll *np)
 		}
 
 		rtnl_unlock();
-		atleast = jiffies + HZ/10;
 		atmost = jiffies + carrier_timeout * HZ;
 		while (!netif_carrier_ok(ndev)) {
 			if (time_after(jiffies, atmost)) {
@@ -704,15 +703,6 @@ int netpoll_setup(struct netpoll *np)
 			msleep(1);
 		}
 
-		/* If carrier appears to come up instantly, we don't
-		 * trust it and pause so that we don't pump all our
-		 * queued console messages into the bitbucket.
-		 */
-
-		if (time_before(jiffies, atleast)) {
-			np_notice(np, "carrier detect appears untrustworthy, waiting 4 seconds\n");
-			msleep(4000);
-		}
 		rtnl_lock();
 	}
 

This patch proposes to remove the msleep(4s) during netpoll_setup() if
the carrier appears instantly.

Modern NICs do not seem to have this bouncing problem anymore, and this
sleep slows down the machine boot unnecessarily

Reported-by: Michael van der Westhuizen <rmikey@meta.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 net/core/netpoll.c | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)
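
For readers unfamiliar with the time_after()/time_before() checks that the patch touches, the following userspace sketch shows why they are used instead of a plain `>` or `<` on jiffies: the comparison stays correct when the tick counter wraps. The `model_` functions are illustrative stand-ins, not the kernel macros (those live in `<linux/jiffies.h>` and add type checking), and the cast assumes the usual two's-complement conversion from `unsigned long` to `long`.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Illustrative stand-ins for the kernel's time_after()/time_before().
 * The unsigned subtraction wraps modulo 2^N, so the signed result tells
 * which timestamp is later as long as the two are less than half the
 * counter range apart.
 */
static inline bool model_time_after(unsigned long a, unsigned long b)
{
	/* true if a is later than b, even across counter wraparound */
	return (long)(b - a) < 0;
}

static inline bool model_time_before(unsigned long a, unsigned long b)
{
	/* a is before b exactly when b is after a */
	return model_time_after(b, a);
}
```

For example, a timestamp of 5 taken shortly after the counter wrapped is still reported as later than `ULONG_MAX - 5`, where a naive `5 > ULONG_MAX - 5` would say the opposite.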