Message ID | 20210121062005.53271-1-ljp@linux.ibm.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] ibmvnic: device remove has higher precedence over reset | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | success | CCed 11 of 11 maintainers |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 19 this patch: 19 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 20 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
netdev/stable | success | Stable not CCed |
On 2021-01-20 22:20, Lijun Pan wrote: > Returning -EBUSY in ibmvnic_remove() does not actually hold the > removal procedure since driver core doesn't care for the return > value (see __device_release_driver() in drivers/base/dd.c > calling dev->bus->remove()) though vio_bus_remove > (in arch/powerpc/platforms/pseries/vio.c) records the > return value and passes it on. [1] > > During the device removal precedure, we should not schedule > any new reset (ibmvnic_reset check for REMOVING and exit), > and should rely on the flush_work and flush_delayed_work > to complete the pending resets, specifically we need to > let __ibmvnic_reset() keep running while in REMOVING state since > flush_work and flush_delayed_work shall call __ibmvnic_reset finally. > So we skip the checking for REMOVING in __ibmvnic_reset. > > [1] > https://lore.kernel.org/linuxppc-dev/20210117101242.dpwayq6wdgfdzirl@pengutronix.de/T/#m48f5befd96bc9842ece2a3ad14f4c27747206a53 > Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> > Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during > device reset") > Signed-off-by: Lijun Pan <ljp@linux.ibm.com> > --- > v1 versus RFC: > 1/ articulate why remove the REMOVING checking in __ibmvnic_reset > and why keep the current checking for REMOVING in ibmvnic_reset. > 2/ The locking issue mentioned by Uwe are being addressed separately > by https://lists.openwall.net/netdev/2021/01/08/89 > 3/ This patch does not have merge conflict with 2/ > > drivers/net/ethernet/ibm/ibmvnic.c | 8 +------- > 1 file changed, 1 insertion(+), 7 deletions(-) > > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c > b/drivers/net/ethernet/ibm/ibmvnic.c > index aed985e08e8a..11f28fd03057 100644 > --- a/drivers/net/ethernet/ibm/ibmvnic.c > +++ b/drivers/net/ethernet/ibm/ibmvnic.c > @@ -2235,8 +2235,7 @@ static void __ibmvnic_reset(struct work_struct > *work) > while (rwi) { > spin_lock_irqsave(&adapter->state_lock, flags); > > - if (adapter->state == VNIC_REMOVING || > - adapter->state == VNIC_REMOVED) { > + if (adapter->state == VNIC_REMOVED) { If we do get here, we would crash because ibmvnic_remove() happened. It frees the adapter struct already. > spin_unlock_irqrestore(&adapter->state_lock, flags); > kfree(rwi); > rc = EBUSY; > @@ -5372,11 +5371,6 @@ static int ibmvnic_remove(struct vio_dev *dev) > unsigned long flags; > > spin_lock_irqsave(&adapter->state_lock, flags); > - if (test_bit(0, &adapter->resetting)) { > - spin_unlock_irqrestore(&adapter->state_lock, flags); > - return -EBUSY; > - } > - > adapter->state = VNIC_REMOVING; > spin_unlock_irqrestore(&adapter->state_lock, flags);
On Thu, Jan 21, 2021 at 12:42 PM Dany Madden <drt@linux.ibm.com> wrote: > > On 2021-01-20 22:20, Lijun Pan wrote: > > Returning -EBUSY in ibmvnic_remove() does not actually hold the > > removal procedure since driver core doesn't care for the return > > value (see __device_release_driver() in drivers/base/dd.c > > calling dev->bus->remove()) though vio_bus_remove > > (in arch/powerpc/platforms/pseries/vio.c) records the > > return value and passes it on. [1] > > > > During the device removal precedure, we should not schedule > > any new reset (ibmvnic_reset check for REMOVING and exit), > > and should rely on the flush_work and flush_delayed_work > > to complete the pending resets, specifically we need to > > let __ibmvnic_reset() keep running while in REMOVING state since > > flush_work and flush_delayed_work shall call __ibmvnic_reset finally. > > So we skip the checking for REMOVING in __ibmvnic_reset. > > > > [1] > > https://lore.kernel.org/linuxppc-dev/20210117101242.dpwayq6wdgfdzirl@pengutronix.de/T/#m48f5befd96bc9842ece2a3ad14f4c27747206a53 > > Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> > > Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during > > device reset") > > Signed-off-by: Lijun Pan <ljp@linux.ibm.com> > > --- > > v1 versus RFC: > > 1/ articulate why remove the REMOVING checking in __ibmvnic_reset > > and why keep the current checking for REMOVING in ibmvnic_reset. > > 2/ The locking issue mentioned by Uwe are being addressed separately > > by https://lists.openwall.net/netdev/2021/01/08/89 > > 3/ This patch does not have merge conflict with 2/ > > > > drivers/net/ethernet/ibm/ibmvnic.c | 8 +------- > > 1 file changed, 1 insertion(+), 7 deletions(-) > > > > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c > > b/drivers/net/ethernet/ibm/ibmvnic.c > > index aed985e08e8a..11f28fd03057 100644 > > --- a/drivers/net/ethernet/ibm/ibmvnic.c > > +++ b/drivers/net/ethernet/ibm/ibmvnic.c > > @@ -2235,8 +2235,7 @@ static void __ibmvnic_reset(struct work_struct > > *work) > > while (rwi) { > > spin_lock_irqsave(&adapter->state_lock, flags); > > > > - if (adapter->state == VNIC_REMOVING || > > - adapter->state == VNIC_REMOVED) { > > + if (adapter->state == VNIC_REMOVED) { > > If we do get here, we would crash because ibmvnic_remove() happened. It > frees the adapter struct already. Not exactly. viodev is gone; netdev is done; ibmvnic_adapter is still there. Lijun > > > spin_unlock_irqrestore(&adapter->state_lock, flags); > > kfree(rwi); > > rc = EBUSY; > > @@ -5372,11 +5371,6 @@ static int ibmvnic_remove(struct vio_dev *dev) > > unsigned long flags; > > > > spin_lock_irqsave(&adapter->state_lock, flags); > > - if (test_bit(0, &adapter->resetting)) { > > - spin_unlock_irqrestore(&adapter->state_lock, flags); > > - return -EBUSY; > > - } > > - > > adapter->state = VNIC_REMOVING; > > spin_unlock_irqrestore(&adapter->state_lock, flags);
> > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c > > b/drivers/net/ethernet/ibm/ibmvnic.c > > index aed985e08e8a..11f28fd03057 100644 > > --- a/drivers/net/ethernet/ibm/ibmvnic.c > > +++ b/drivers/net/ethernet/ibm/ibmvnic.c > > @@ -2235,8 +2235,7 @@ static void __ibmvnic_reset(struct work_struct > > *work) > > while (rwi) { > > spin_lock_irqsave(&adapter->state_lock, flags); > > > > - if (adapter->state == VNIC_REMOVING || > > - adapter->state == VNIC_REMOVED) { > > + if (adapter->state == VNIC_REMOVED) { > > If we do get here, we would crash because ibmvnic_remove() happened. It > frees the adapter struct already. Not exactly. viodev is gone; netdev is gone; ibmvnic_adapter is still there. Lijun
Lijun Pan [lijunp213@gmail.com] wrote: > > > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c > > > b/drivers/net/ethernet/ibm/ibmvnic.c > > > index aed985e08e8a..11f28fd03057 100644 > > > --- a/drivers/net/ethernet/ibm/ibmvnic.c > > > +++ b/drivers/net/ethernet/ibm/ibmvnic.c > > > @@ -2235,8 +2235,7 @@ static void __ibmvnic_reset(struct work_struct > > > *work) > > > while (rwi) { > > > spin_lock_irqsave(&adapter->state_lock, flags); > > > > > > - if (adapter->state == VNIC_REMOVING || > > > - adapter->state == VNIC_REMOVED) { > > > + if (adapter->state == VNIC_REMOVED) { If the adapter is in REMOVING state, there is no point going through the reset process. We could just bail out here. We should also drain any other resets in the queue (something my other patch set was addressing). Sukadev > > > > If we do get here, we would crash because ibmvnic_remove() happened. It > > frees the adapter struct already. > > Not exactly. viodev is gone; netdev is gone; ibmvnic_adapter is still there. > > Lijun
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index aed985e08e8a..11f28fd03057 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -2235,8 +2235,7 @@ static void __ibmvnic_reset(struct work_struct *work) while (rwi) { spin_lock_irqsave(&adapter->state_lock, flags); - if (adapter->state == VNIC_REMOVING || - adapter->state == VNIC_REMOVED) { + if (adapter->state == VNIC_REMOVED) { spin_unlock_irqrestore(&adapter->state_lock, flags); kfree(rwi); rc = EBUSY; @@ -5372,11 +5371,6 @@ static int ibmvnic_remove(struct vio_dev *dev) unsigned long flags; spin_lock_irqsave(&adapter->state_lock, flags); - if (test_bit(0, &adapter->resetting)) { - spin_unlock_irqrestore(&adapter->state_lock, flags); - return -EBUSY; - } - adapter->state = VNIC_REMOVING; spin_unlock_irqrestore(&adapter->state_lock, flags);
Returning -EBUSY in ibmvnic_remove() does not actually hold the removal procedure since driver core doesn't care for the return value (see __device_release_driver() in drivers/base/dd.c calling dev->bus->remove()) though vio_bus_remove (in arch/powerpc/platforms/pseries/vio.c) records the return value and passes it on. [1] During the device removal precedure, we should not schedule any new reset (ibmvnic_reset check for REMOVING and exit), and should rely on the flush_work and flush_delayed_work to complete the pending resets, specifically we need to let __ibmvnic_reset() keep running while in REMOVING state since flush_work and flush_delayed_work shall call __ibmvnic_reset finally. So we skip the checking for REMOVING in __ibmvnic_reset. [1] https://lore.kernel.org/linuxppc-dev/20210117101242.dpwayq6wdgfdzirl@pengutronix.de/T/#m48f5befd96bc9842ece2a3ad14f4c27747206a53 Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during device reset") Signed-off-by: Lijun Pan <ljp@linux.ibm.com> --- v1 versus RFC: 1/ articulate why remove the REMOVING checking in __ibmvnic_reset and why keep the current checking for REMOVING in ibmvnic_reset. 2/ The locking issue mentioned by Uwe are being addressed separately by https://lists.openwall.net/netdev/2021/01/08/89 3/ This patch does not have merge conflict with 2/ drivers/net/ethernet/ibm/ibmvnic.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-)