diff mbox series

[PATCH/RFC,net] net: dec: tulip: de2104x: Add shutdown handler to stop NIC

Message ID 20201022220636.609956-1-mdf@kernel.org (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [PATCH/RFC,net] net: dec: tulip: de2104x: Add shutdown handler to stop NIC

Commit Message

Moritz Fischer Oct. 22, 2020, 10:06 p.m. UTC
The driver does not implement a shutdown handler, which leads to issues
when using kexec in certain scenarios. The NIC keeps fetching
descriptors, which the IOMMU flags with errors like this:

DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000

Signed-off-by: Moritz Fischer <mdf@kernel.org>
---

Hi all,

I'm not sure if this is the proper way to implement a shutdown handler.
I've looked at a bunch of examples and couldn't find a specific
solution. In my tests on hardware this works, though.

Open to suggestions.

Thanks,
Moritz

---
 drivers/net/ethernet/dec/tulip/de2104x.c | 1 +
 1 file changed, 1 insertion(+)

Comments

James Bottomley Oct. 22, 2020, 11:04 p.m. UTC | #1
On Thu, 2020-10-22 at 15:06 -0700, Moritz Fischer wrote:
> The driver does not implement a shutdown handler which leads to
> issues
> when using kexec in certain scenarios. The NIC keeps on fetching
> descriptors which gets flagged by the IOMMU with errors like this:
> 
> DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
> DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
> DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
> DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
> DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
> 
> Signed-off-by: Moritz Fischer <mdf@kernel.org>
> ---
> 
> Hi all,
> 
> I'm not sure if this is the proper way for a shutdown handler,
> I've tried to look at a bunch of examples and couldn't find a
> specific
> solution, in my tests on hardware this works, though.
> 
> Open to suggestions.
> 
> Thanks,
> Moritz
> 
> ---
>  drivers/net/ethernet/dec/tulip/de2104x.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/dec/tulip/de2104x.c
> b/drivers/net/ethernet/dec/tulip/de2104x.c
> index f1a2da15dd0a..372c62c7e60f 100644
> --- a/drivers/net/ethernet/dec/tulip/de2104x.c
> +++ b/drivers/net/ethernet/dec/tulip/de2104x.c
> @@ -2185,6 +2185,7 @@ static struct pci_driver de_driver = {
>  	.id_table	= de_pci_tbl,
>  	.probe		= de_init_one,
>  	.remove		= de_remove_one,
> +	.shutdown	= de_remove_one,

This doesn't look right: shutdown is supposed to turn off the device
without disturbing the tree or causing any knock-on effects (I think
that rule is mostly because you don't want anything in userspace
triggering since it's likely to be nearly dead).  Remove removes the
device from the tree and cleans up everything.  I think the function
you want that's closest to what shutdown needs is de_close().  That
basically just turns off the chip and frees the interrupt ... you'll
have to wrap it to call it from the pci_driver, though.

James
Moritz Fischer Oct. 23, 2020, 12:15 a.m. UTC | #2
On Thu, Oct 22, 2020 at 04:04:16PM -0700, James Bottomley wrote:
> On Thu, 2020-10-22 at 15:06 -0700, Moritz Fischer wrote:
> > The driver does not implement a shutdown handler which leads to
> > issues
> > when using kexec in certain scenarios. The NIC keeps on fetching
> > descriptors which gets flagged by the IOMMU with errors like this:
> > 
> > DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
> > DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
> > DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
> > DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
> > DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
> > 
> > Signed-off-by: Moritz Fischer <mdf@kernel.org>
> > ---
> > 
> > Hi all,
> > 
> > I'm not sure if this is the proper way for a shutdown handler,
> > I've tried to look at a bunch of examples and couldn't find a
> > specific
> > solution, in my tests on hardware this works, though.
> > 
> > Open to suggestions.
> > 
> > Thanks,
> > Moritz
> > 
> > ---
> >  drivers/net/ethernet/dec/tulip/de2104x.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/net/ethernet/dec/tulip/de2104x.c
> > b/drivers/net/ethernet/dec/tulip/de2104x.c
> > index f1a2da15dd0a..372c62c7e60f 100644
> > --- a/drivers/net/ethernet/dec/tulip/de2104x.c
> > +++ b/drivers/net/ethernet/dec/tulip/de2104x.c
> > @@ -2185,6 +2185,7 @@ static struct pci_driver de_driver = {
> >  	.id_table	= de_pci_tbl,
> >  	.probe		= de_init_one,
> >  	.remove		= de_remove_one,
> > +	.shutdown	= de_remove_one,
> 
> This doesn't look right: shutdown is supposed to turn off the device
> without disturbing the tree or causing any knock-on effects (I think
> that rule is mostly because you don't want anything in userspace
> triggering since it's likely to be nearly dead).  Remove removes the
> device from the tree and cleans up everything.  I think the function
> you want that's closest to what shutdown needs is de_close().  That
> basically just turns off the chip and frees the interrupt ... you'll
> have to wrap it to call it from the pci_driver, though.

Thanks for the suggestion, I like that better. I'll send a v2 after
testing. I think any path that ends up in de_stop_hw() will keep the NIC
from fetching further descriptors.

Cheers,
Moritz
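
The review above distinguishes .remove (tear the device down and take it
out of the tree) from .shutdown (just quiesce the hardware). A minimal
sketch of the suggested wrapper follows, written as a userspace model so
that it compiles outside the kernel tree. The structs and the helpers
rtnl_lock(), rtnl_unlock(), pci_get_drvdata() and dev_close() are
simplified stand-ins for the kernel functions of the same names, and the
"running"/"registered" flags are invented for illustration. de_shutdown()
is a hypothetical name, and routing through dev_close() under the RTNL
lock instead of calling de_close() directly is an assumption; the thread
does not settle which form a v2 should take.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel types; flags are illustrative only. */
struct net_device { bool running; bool registered; };
struct pci_dev { void *driver_data; };
struct pci_driver {
	void (*remove)(struct pci_dev *pdev);
	void (*shutdown)(struct pci_dev *pdev);
};

static bool rtnl_held;	/* models whether the RTNL lock is currently held */

static void rtnl_lock(void)   { rtnl_held = true; }
static void rtnl_unlock(void) { rtnl_held = false; }

static void *pci_get_drvdata(struct pci_dev *pdev)
{
	return pdev->driver_data;
}

/* Stand-in for dev_close(): in the kernel this reaches the driver's
 * ndo_stop callback (de_close() in this driver), which stops the chip. */
static void dev_close(struct net_device *dev)
{
	assert(rtnl_held);	/* the real dev_close() expects RTNL to be held */
	dev->running = false;
}

/* Models .remove: full teardown, the device leaves the tree. */
static void de_remove_one(struct pci_dev *pdev)
{
	struct net_device *dev = pci_get_drvdata(pdev);

	dev->running = false;
	dev->registered = false;
}

/* Hypothetical wrapper: stop DMA but leave the net_device registered. */
static void de_shutdown(struct pci_dev *pdev)
{
	struct net_device *dev = pci_get_drvdata(pdev);

	rtnl_lock();
	dev_close(dev);
	rtnl_unlock();
}

static struct pci_driver de_driver = {
	.remove		= de_remove_one,
	.shutdown	= de_shutdown,	/* instead of reusing de_remove_one */
};
```

In the driver itself the stand-in definitions would not be needed; only
de_shutdown() and the .shutdown = de_shutdown line would remain, and the
behaviour would still need testing on hardware with kexec.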

Patch

diff --git a/drivers/net/ethernet/dec/tulip/de2104x.c b/drivers/net/ethernet/dec/tulip/de2104x.c
index f1a2da15dd0a..372c62c7e60f 100644
--- a/drivers/net/ethernet/dec/tulip/de2104x.c
+++ b/drivers/net/ethernet/dec/tulip/de2104x.c
@@ -2185,6 +2185,7 @@  static struct pci_driver de_driver = {
 	.id_table	= de_pci_tbl,
 	.probe		= de_init_one,
 	.remove		= de_remove_one,
+	.shutdown	= de_remove_one,
 #ifdef CONFIG_PM
 	.suspend	= de_suspend,
 	.resume		= de_resume,