Message ID | 20201023202834.660091-1-mdf@kernel.org (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | [PATCH/RFC,net-next,v3] net: dec: tulip: de2104x: Add shutdown handler to stop NIC | expand |
On Fri, 23 Oct 2020 13:28:34 -0700 Moritz Fischer wrote: > diff --git a/drivers/net/ethernet/dec/tulip/de2104x.c b/drivers/net/ethernet/dec/tulip/de2104x.c > index d9f6c19940ef..ea7442cc8e75 100644 > --- a/drivers/net/ethernet/dec/tulip/de2104x.c > +++ b/drivers/net/ethernet/dec/tulip/de2104x.c > @@ -2175,11 +2175,19 @@ static int __maybe_unused de_resume(struct device *dev_d) > > static SIMPLE_DEV_PM_OPS(de_pm_ops, de_suspend, de_resume); > > +static void de_shutdown(struct pci_dev *pdev) > +{ > + struct net_device *dev = pci_get_drvdata(pdev); > + > + de_close(dev); Apparently I get all the best ideas when I'm about to apply something.. I don't think you can just call de_close() like that, because (a) it may expect rtnl_lock() to be held, and (b) it may not be open. Perhaps call unregister_netdev(dev) - that'll close the device. Or rtnl_lock(); dev_close(dev); rtnl_unlock(); > +} > + > static struct pci_driver de_driver = { > .name = DRV_NAME, > .id_table = de_pci_tbl, > .probe = de_init_one, > .remove = de_remove_one, > + .shutdown = de_shutdown, > .driver.pm = &de_pm_ops, > }; >
Hi Jakub, On Tue, Oct 27, 2020 at 04:16:06PM -0700, Jakub Kicinski wrote: > On Fri, 23 Oct 2020 13:28:34 -0700 Moritz Fischer wrote: > > diff --git a/drivers/net/ethernet/dec/tulip/de2104x.c b/drivers/net/ethernet/dec/tulip/de2104x.c > > index d9f6c19940ef..ea7442cc8e75 100644 > > --- a/drivers/net/ethernet/dec/tulip/de2104x.c > > +++ b/drivers/net/ethernet/dec/tulip/de2104x.c > > @@ -2175,11 +2175,19 @@ static int __maybe_unused de_resume(struct device *dev_d) > > > > static SIMPLE_DEV_PM_OPS(de_pm_ops, de_suspend, de_resume); > > > > +static void de_shutdown(struct pci_dev *pdev) > > +{ > > + struct net_device *dev = pci_get_drvdata(pdev); > > + > > + de_close(dev); > > Apparently I get all the best ideas when I'm about to apply something.. Better now than after =) > I don't think you can just call de_close() like that, because > (a) it may expect rtnl_lock() to be held, and (b) it may not be open. how about: rtnl_lock(); if (netif_running(dev)) dev_close(dev); rtnl_unlock(); > > Perhaps call unregister_netdev(dev) - that'll close the device. > Or rtnl_lock(); dev_close(dev); rtnl_unlock(); > > > +} > > + > > static struct pci_driver de_driver = { > > .name = DRV_NAME, > > .id_table = de_pci_tbl, > > .probe = de_init_one, > > .remove = de_remove_one, > > + .shutdown = de_shutdown, > > .driver.pm = &de_pm_ops, > > }; > > > Cheers, Moritz
On Tue, 27 Oct 2020 18:59:09 -0700 Moritz Fischer wrote: > Hi Jakub, > > On Tue, Oct 27, 2020 at 04:16:06PM -0700, Jakub Kicinski wrote: > > On Fri, 23 Oct 2020 13:28:34 -0700 Moritz Fischer wrote: > > > diff --git a/drivers/net/ethernet/dec/tulip/de2104x.c b/drivers/net/ethernet/dec/tulip/de2104x.c > > > index d9f6c19940ef..ea7442cc8e75 100644 > > > --- a/drivers/net/ethernet/dec/tulip/de2104x.c > > > +++ b/drivers/net/ethernet/dec/tulip/de2104x.c > > > @@ -2175,11 +2175,19 @@ static int __maybe_unused de_resume(struct device *dev_d) > > > > > > static SIMPLE_DEV_PM_OPS(de_pm_ops, de_suspend, de_resume); > > > > > > +static void de_shutdown(struct pci_dev *pdev) > > > +{ > > > + struct net_device *dev = pci_get_drvdata(pdev); > > > + > > > + de_close(dev); > > > > Apparently I get all the best ideas when I'm about to apply something.. > > Better now than after =) > > > I don't think you can just call de_close() like that, because > > (a) it may expect rtnl_lock() to be held, and (b) it may not be open. > > how about: > > rtnl_lock(); > if (netif_running(dev)) > dev_close(dev); > rtnl_unlock(); That's fine as well, although dev_close() checks if the device is UP AFAICT.
diff --git a/drivers/net/ethernet/dec/tulip/de2104x.c b/drivers/net/ethernet/dec/tulip/de2104x.c index d9f6c19940ef..ea7442cc8e75 100644 --- a/drivers/net/ethernet/dec/tulip/de2104x.c +++ b/drivers/net/ethernet/dec/tulip/de2104x.c @@ -2175,11 +2175,19 @@ static int __maybe_unused de_resume(struct device *dev_d) static SIMPLE_DEV_PM_OPS(de_pm_ops, de_suspend, de_resume); +static void de_shutdown(struct pci_dev *pdev) +{ + struct net_device *dev = pci_get_drvdata(pdev); + + de_close(dev); +} + static struct pci_driver de_driver = { .name = DRV_NAME, .id_table = de_pci_tbl, .probe = de_init_one, .remove = de_remove_one, + .shutdown = de_shutdown, .driver.pm = &de_pm_ops, };
The driver does not implement a shutdown handler which leads to issues when using kexec in certain scenarios. The NIC keeps on fetching descriptors which gets flagged by the IOMMU with errors like this: DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000 DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000 DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000 DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000 DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000 Signed-off-by: Moritz Fischer <mdf@kernel.org> --- I'd consider it a bug-fix but Jakub said it's a feature, so net-next it is. I don't have a strong preference either way. Changes from v2: - Changed to net-next - Removed extra whitespace Changes from v1: - Replace call to de_remove_one with de_shutdown() function as suggested by James. --- drivers/net/ethernet/dec/tulip/de2104x.c | 8 ++++++++ 1 file changed, 8 insertions(+)