diff mbox series

[net] ibmvnic: device remove has higher precedence over reset

Message ID 20210121062005.53271-1-ljp@linux.ibm.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net] ibmvnic: device remove has higher precedence over reset | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers success CCed 11 of 11 maintainers
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 19 this patch: 19
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 20 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Lijun Pan Jan. 21, 2021, 6:20 a.m. UTC
Returning -EBUSY in ibmvnic_remove() does not actually hold the
removal procedure since driver core doesn't care for the return
value (see __device_release_driver() in drivers/base/dd.c
calling dev->bus->remove()) though vio_bus_remove
(in arch/powerpc/platforms/pseries/vio.c) records the
return value and passes it on. [1]

During the device removal precedure, we should not schedule
any new reset (ibmvnic_reset check for REMOVING and exit),
and should rely on the flush_work and flush_delayed_work
to complete the pending resets, specifically we need to
let __ibmvnic_reset() keep running while in REMOVING state since
flush_work and flush_delayed_work shall call __ibmvnic_reset finally.
So we skip the checking for REMOVING in __ibmvnic_reset.

[1] https://lore.kernel.org/linuxppc-dev/20210117101242.dpwayq6wdgfdzirl@pengutronix.de/T/#m48f5befd96bc9842ece2a3ad14f4c27747206a53
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during device reset")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
v1 versus RFC: 
  1/ articulate why remove the REMOVING checking in __ibmvnic_reset
  and why keep the current checking for REMOVING in ibmvnic_reset.
  2/ The locking issue mentioned by Uwe are being addressed separately 
     by	https://lists.openwall.net/netdev/2021/01/08/89
  3/ This patch does not have merge conflict with 2/

 drivers/net/ethernet/ibm/ibmvnic.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

Comments

Dany Madden Jan. 21, 2021, 6:24 p.m. UTC | #1
On 2021-01-20 22:20, Lijun Pan wrote:
> Returning -EBUSY in ibmvnic_remove() does not actually hold the
> removal procedure since driver core doesn't care for the return
> value (see __device_release_driver() in drivers/base/dd.c
> calling dev->bus->remove()) though vio_bus_remove
> (in arch/powerpc/platforms/pseries/vio.c) records the
> return value and passes it on. [1]
> 
> During the device removal precedure, we should not schedule
> any new reset (ibmvnic_reset check for REMOVING and exit),
> and should rely on the flush_work and flush_delayed_work
> to complete the pending resets, specifically we need to
> let __ibmvnic_reset() keep running while in REMOVING state since
> flush_work and flush_delayed_work shall call __ibmvnic_reset finally.
> So we skip the checking for REMOVING in __ibmvnic_reset.
> 
> [1]
> https://lore.kernel.org/linuxppc-dev/20210117101242.dpwayq6wdgfdzirl@pengutronix.de/T/#m48f5befd96bc9842ece2a3ad14f4c27747206a53
> Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during
> device reset")
> Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
> ---
> v1 versus RFC:
>   1/ articulate why remove the REMOVING checking in __ibmvnic_reset
>   and why keep the current checking for REMOVING in ibmvnic_reset.
>   2/ The locking issue mentioned by Uwe are being addressed separately
>      by	https://lists.openwall.net/netdev/2021/01/08/89
>   3/ This patch does not have merge conflict with 2/
> 
>  drivers/net/ethernet/ibm/ibmvnic.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
> b/drivers/net/ethernet/ibm/ibmvnic.c
> index aed985e08e8a..11f28fd03057 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2235,8 +2235,7 @@ static void __ibmvnic_reset(struct work_struct 
> *work)
>  	while (rwi) {
>  		spin_lock_irqsave(&adapter->state_lock, flags);
> 
> -		if (adapter->state == VNIC_REMOVING ||
> -		    adapter->state == VNIC_REMOVED) {
> +		if (adapter->state == VNIC_REMOVED) {

If we do get here, we would crash because ibmvnic_remove() happened. It 
frees the adapter struct already.

>  			spin_unlock_irqrestore(&adapter->state_lock, flags);
>  			kfree(rwi);
>  			rc = EBUSY;
> @@ -5372,11 +5371,6 @@ static int ibmvnic_remove(struct vio_dev *dev)
>  	unsigned long flags;
> 
>  	spin_lock_irqsave(&adapter->state_lock, flags);
> -	if (test_bit(0, &adapter->resetting)) {
> -		spin_unlock_irqrestore(&adapter->state_lock, flags);
> -		return -EBUSY;
> -	}
> -
>  	adapter->state = VNIC_REMOVING;
>  	spin_unlock_irqrestore(&adapter->state_lock, flags);
Lijun Pan Jan. 21, 2021, 6:46 p.m. UTC | #2
On Thu, Jan 21, 2021 at 12:42 PM Dany Madden <drt@linux.ibm.com> wrote:
>
> On 2021-01-20 22:20, Lijun Pan wrote:
> > Returning -EBUSY in ibmvnic_remove() does not actually hold the
> > removal procedure since driver core doesn't care for the return
> > value (see __device_release_driver() in drivers/base/dd.c
> > calling dev->bus->remove()) though vio_bus_remove
> > (in arch/powerpc/platforms/pseries/vio.c) records the
> > return value and passes it on. [1]
> >
> > During the device removal precedure, we should not schedule
> > any new reset (ibmvnic_reset check for REMOVING and exit),
> > and should rely on the flush_work and flush_delayed_work
> > to complete the pending resets, specifically we need to
> > let __ibmvnic_reset() keep running while in REMOVING state since
> > flush_work and flush_delayed_work shall call __ibmvnic_reset finally.
> > So we skip the checking for REMOVING in __ibmvnic_reset.
> >
> > [1]
> > https://lore.kernel.org/linuxppc-dev/20210117101242.dpwayq6wdgfdzirl@pengutronix.de/T/#m48f5befd96bc9842ece2a3ad14f4c27747206a53
> > Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during
> > device reset")
> > Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
> > ---
> > v1 versus RFC:
> >   1/ articulate why remove the REMOVING checking in __ibmvnic_reset
> >   and why keep the current checking for REMOVING in ibmvnic_reset.
> >   2/ The locking issue mentioned by Uwe are being addressed separately
> >      by       https://lists.openwall.net/netdev/2021/01/08/89
> >   3/ This patch does not have merge conflict with 2/
> >
> >  drivers/net/ethernet/ibm/ibmvnic.c | 8 +-------
> >  1 file changed, 1 insertion(+), 7 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
> > b/drivers/net/ethernet/ibm/ibmvnic.c
> > index aed985e08e8a..11f28fd03057 100644
> > --- a/drivers/net/ethernet/ibm/ibmvnic.c
> > +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> > @@ -2235,8 +2235,7 @@ static void __ibmvnic_reset(struct work_struct
> > *work)
> >       while (rwi) {
> >               spin_lock_irqsave(&adapter->state_lock, flags);
> >
> > -             if (adapter->state == VNIC_REMOVING ||
> > -                 adapter->state == VNIC_REMOVED) {
> > +             if (adapter->state == VNIC_REMOVED) {
>
> If we do get here, we would crash because ibmvnic_remove() happened. It
> frees the adapter struct already.

Not exactly. viodev is gone; netdev is done; ibmvnic_adapter is still there.

Lijun
>
> >                       spin_unlock_irqrestore(&adapter->state_lock, flags);
> >                       kfree(rwi);
> >                       rc = EBUSY;
> > @@ -5372,11 +5371,6 @@ static int ibmvnic_remove(struct vio_dev *dev)
> >       unsigned long flags;
> >
> >       spin_lock_irqsave(&adapter->state_lock, flags);
> > -     if (test_bit(0, &adapter->resetting)) {
> > -             spin_unlock_irqrestore(&adapter->state_lock, flags);
> > -             return -EBUSY;
> > -     }
> > -
> >       adapter->state = VNIC_REMOVING;
> >       spin_unlock_irqrestore(&adapter->state_lock, flags);
Lijun Pan Jan. 21, 2021, 7:48 p.m. UTC | #3
> > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
> > b/drivers/net/ethernet/ibm/ibmvnic.c
> > index aed985e08e8a..11f28fd03057 100644
> > --- a/drivers/net/ethernet/ibm/ibmvnic.c
> > +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> > @@ -2235,8 +2235,7 @@ static void __ibmvnic_reset(struct work_struct
> > *work)
> >       while (rwi) {
> >               spin_lock_irqsave(&adapter->state_lock, flags);
> >
> > -             if (adapter->state == VNIC_REMOVING ||
> > -                 adapter->state == VNIC_REMOVED) {
> > +             if (adapter->state == VNIC_REMOVED) {
>
> If we do get here, we would crash because ibmvnic_remove() happened. It
> frees the adapter struct already.

Not exactly. viodev is gone; netdev is gone; ibmvnic_adapter is still there.

Lijun
Sukadev Bhattiprolu Jan. 21, 2021, 10:38 p.m. UTC | #4
Lijun Pan [lijunp213@gmail.com] wrote:
> > > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
> > > b/drivers/net/ethernet/ibm/ibmvnic.c
> > > index aed985e08e8a..11f28fd03057 100644
> > > --- a/drivers/net/ethernet/ibm/ibmvnic.c
> > > +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> > > @@ -2235,8 +2235,7 @@ static void __ibmvnic_reset(struct work_struct
> > > *work)
> > >       while (rwi) {
> > >               spin_lock_irqsave(&adapter->state_lock, flags);
> > >
> > > -             if (adapter->state == VNIC_REMOVING ||
> > > -                 adapter->state == VNIC_REMOVED) {
> > > +             if (adapter->state == VNIC_REMOVED) {

If the adapter is in REMOVING state, there is no point going
through the reset process. We could just bail out here. We
should also drain any other resets in the queue (something
my other patch set was addressing).

Sukadev

> >
> > If we do get here, we would crash because ibmvnic_remove() happened. It
> > frees the adapter struct already.
> 
> Not exactly. viodev is gone; netdev is gone; ibmvnic_adapter is still there.
> 
> Lijun
diff mbox series

Patch

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index aed985e08e8a..11f28fd03057 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2235,8 +2235,7 @@  static void __ibmvnic_reset(struct work_struct *work)
 	while (rwi) {
 		spin_lock_irqsave(&adapter->state_lock, flags);
 
-		if (adapter->state == VNIC_REMOVING ||
-		    adapter->state == VNIC_REMOVED) {
+		if (adapter->state == VNIC_REMOVED) {
 			spin_unlock_irqrestore(&adapter->state_lock, flags);
 			kfree(rwi);
 			rc = EBUSY;
@@ -5372,11 +5371,6 @@  static int ibmvnic_remove(struct vio_dev *dev)
 	unsigned long flags;
 
 	spin_lock_irqsave(&adapter->state_lock, flags);
-	if (test_bit(0, &adapter->resetting)) {
-		spin_unlock_irqrestore(&adapter->state_lock, flags);
-		return -EBUSY;
-	}
-
 	adapter->state = VNIC_REMOVING;
 	spin_unlock_irqrestore(&adapter->state_lock, flags);