Message ID | 20230329064010.24657-1-wsa+renesas@sang-engineering.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v4] smsc911x: only update stats when interface is up | expand |
Hi Wolfram, On Wed Mar 29, 2023 at 8:40 AM CEST, Wolfram Sang wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe > > Otherwise the clocks are not enabled and reading registers will OOPS. > Copy the behaviour from Renesas SH_ETH and use a custom flag because > using netif_running() is racy. A generic solution still needs to be > implemented. Tested on a Renesas APE6-EK. > > Fixes: 1e30b8d755b8 ("net: smsc911x: Make Runtime PM handling more fine-grained") > Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> > --- > > Changes since v3: > * broken out of a patch series > * don't use netif_running() but a custom flag > > drivers/net/ethernet/smsc/smsc911x.c | 14 ++++++++++++-- > 1 file changed, 12 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c > index a690d139e177..af96986cbc88 100644 > --- a/drivers/net/ethernet/smsc/smsc911x.c > +++ b/drivers/net/ethernet/smsc/smsc911x.c > @@ -140,6 +140,8 @@ struct smsc911x_data { > > /* clock */ > struct clk *clk; > + > + bool is_open; > }; > > /* Easy access to information */ > @@ -1738,6 +1740,8 @@ static int smsc911x_open(struct net_device *dev) > smsc911x_reg_write(pdata, TX_CFG, TX_CFG_TX_ON_); > > netif_start_queue(dev); > + pdata->is_open = true; > + > return 0; > > irq_stop_out: > @@ -1778,6 +1782,8 @@ static int smsc911x_stop(struct net_device *dev) > dev->phydev = NULL; > } > netif_carrier_off(dev); > + pdata->is_open = false; > + > pm_runtime_put(dev->dev.parent); > > SMSC_TRACE(pdata, ifdown, "Interface stopped"); > @@ -1841,8 +1847,12 @@ smsc911x_hard_start_xmit(struct sk_buff *skb, struct net_device *dev) > static struct net_device_stats *smsc911x_get_stats(struct net_device *dev) > { > struct smsc911x_data *pdata = netdev_priv(dev); > - smsc911x_tx_update_txcounters(dev); > - dev->stats.rx_dropped += smsc911x_reg_read(pdata, RX_DROP); > + > + if (pdata->is_open) { Couldn't you just use netif_carrier_ok() here and drop the is_open variable? > + smsc911x_tx_update_txcounters(dev); > + dev->stats.rx_dropped += smsc911x_reg_read(pdata, RX_DROP); > + } > + > return &dev->stats; > } > > -- > 2.30.2 BR Steen
On Wed, 29 Mar 2023 08:40:10 +0200 Wolfram Sang wrote: > Otherwise the clocks are not enabled and reading registers will OOPS. > Copy the behaviour from Renesas SH_ETH and use a custom flag because > using netif_running() is racy. A generic solution still needs to be > implemented. Tested on a Renesas APE6-EK. Hm, so you opted to not add the flag in the core? To keep the backport small? I think we should just add it.. Clearly multiple drivers would benefit and it's not a huge change.
On Wed, Mar 29, 2023 at 12:39:58PM -0700, Jakub Kicinski wrote: > On Wed, 29 Mar 2023 08:40:10 +0200 Wolfram Sang wrote: > > Otherwise the clocks are not enabled and reading registers will OOPS. > > Copy the behaviour from Renesas SH_ETH and use a custom flag because > > using netif_running() is racy. A generic solution still needs to be > > implemented. Tested on a Renesas APE6-EK. > > Hm, so you opted to not add the flag in the core? > To keep the backport small? I think we should just add it.. > Clearly multiple drivers would benefit and it's not a huge change. I did it this way for two reasons. First, yes, this is a minimal patch for backporting. No dependency on core changes, very easy. Second, this is a solution I could develop quickly. I am interested in finding another solution, but I guess it needs more time, especially as it probably touches 15 drivers. I created an action item for it. I hope I'll be able to work on it somewhen. But for now, I just need the SMSC bug fixed and need to move to the next issue. If we later have the generic solution, converting this driver also won't make a lot of a difference.
On Wed, 29 Mar 2023 21:48:55 +0200 Wolfram Sang wrote: > > Hm, so you opted to not add the flag in the core? > > To keep the backport small? I think we should just add it.. > > Clearly multiple drivers would benefit and it's not a huge change. > > I did it this way for two reasons. First, yes, this is a minimal patch > for backporting. No dependency on core changes, very easy. Second, this > is a solution I could develop quickly. I am interested in finding > another solution, but I guess it needs more time, especially as it > probably touches 15 drivers. I created an action item for it. I hope > I'll be able to work on it somewhen. But for now, I just need the SMSC > bug fixed and need to move to the next issue. If we later have the > generic solution, converting this driver also won't make a lot of a > difference. Okay, core changes aside - does pm_runtime_put() imply an RCU sync? Otherwise your check in get_stats is racy...
> > + if (pdata->is_open) { > > Couldn't you just use netif_carrier_ok() here and drop the is_open > variable? From my research, I can't: 1) netif_carrier_ok() uses __LINK_STATE_NOCARRIER 2) __LINK_STATE_NOCARRIER gets cleared in netif_carrier_on() 3) netif_carrier_on() is this code: if (test_and_clear_bit(__LINK_STATE_NOCARRIER, &dev->state)) { if (dev->reg_state == NETREG_UNINITIALIZED) return; atomic_inc(&dev->carrier_up_count); linkwatch_fire_event(dev); if (netif_running(dev)) __netdev_watchdog_up(dev); } 4) Notice the last if. It checks netif_running(). So, it is possible to have the carrier on and the device not opened yet. 5) Sadly, no cigar. If I didn't miss something... But thanks for the suggestion! Happy hacking, Wolfram
> Okay, core changes aside - does pm_runtime_put() imply an RCU sync? > Otherwise your check in get_stats is racy... From some light research, I can't find a trace of RCU sync. Pity, I'll need to move furhter investigation to later. I'll report back if I have something, but it may take a while. Thanks for the help!
On Fri, 31 Mar 2023 08:33:24 +0200 Wolfram Sang wrote: > > Okay, core changes aside - does pm_runtime_put() imply an RCU sync? > > Otherwise your check in get_stats is racy... > > From some light research, I can't find a trace of RCU sync. Pity, I'll > need to move furhter investigation to later. I'll report back if I have > something, but it may take a while. Thanks for the help! If you don't want to spend too much time you can just call it yourself right? :) Put a synchronize_rcu() with a comment about concurrent stat reading after pdata->is_open = false; and that's it?
> > From some light research, I can't find a trace of RCU sync. Pity, I'll > > need to move furhter investigation to later. I'll report back if I have > > something, but it may take a while. Thanks for the help! > > If you don't want to spend too much time you can just call it yourself > right? :) Put a synchronize_rcu() with a comment about concurrent stat > reading after pdata->is_open = false; and that's it? Probably. Although I trust you, I would still add code which I a) don't fully understand myself yet and b) feels somehow wrong because something like synchronize_rcu() should likely not be in drivers but somewhere more generic. Given that the bug in this driver went unnoticed for years (same with the race in other drivers), I do prefer to come back later with the good feeling that I understood the problem and fixed it thoroughly.
diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c index a690d139e177..af96986cbc88 100644 --- a/drivers/net/ethernet/smsc/smsc911x.c +++ b/drivers/net/ethernet/smsc/smsc911x.c @@ -140,6 +140,8 @@ struct smsc911x_data { /* clock */ struct clk *clk; + + bool is_open; }; /* Easy access to information */ @@ -1738,6 +1740,8 @@ static int smsc911x_open(struct net_device *dev) smsc911x_reg_write(pdata, TX_CFG, TX_CFG_TX_ON_); netif_start_queue(dev); + pdata->is_open = true; + return 0; irq_stop_out: @@ -1778,6 +1782,8 @@ static int smsc911x_stop(struct net_device *dev) dev->phydev = NULL; } netif_carrier_off(dev); + pdata->is_open = false; + pm_runtime_put(dev->dev.parent); SMSC_TRACE(pdata, ifdown, "Interface stopped"); @@ -1841,8 +1847,12 @@ smsc911x_hard_start_xmit(struct sk_buff *skb, struct net_device *dev) static struct net_device_stats *smsc911x_get_stats(struct net_device *dev) { struct smsc911x_data *pdata = netdev_priv(dev); - smsc911x_tx_update_txcounters(dev); - dev->stats.rx_dropped += smsc911x_reg_read(pdata, RX_DROP); + + if (pdata->is_open) { + smsc911x_tx_update_txcounters(dev); + dev->stats.rx_dropped += smsc911x_reg_read(pdata, RX_DROP); + } + return &dev->stats; }
Otherwise the clocks are not enabled and reading registers will OOPS. Copy the behaviour from Renesas SH_ETH and use a custom flag because using netif_running() is racy. A generic solution still needs to be implemented. Tested on a Renesas APE6-EK. Fixes: 1e30b8d755b8 ("net: smsc911x: Make Runtime PM handling more fine-grained") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> --- Changes since v3: * broken out of a patch series * don't use netif_running() but a custom flag drivers/net/ethernet/smsc/smsc911x.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)