Message ID | 20210112181441.206545-6-sukadev@linux.ibm.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | ibmvnic: Use more consistent locking | expand |
Context | Check | Description |
---|---|---|
netdev/apply | fail | Patch does not apply to net-next |
netdev/tree_selection | success | Clearly marked for net-next |
On Tue, 2021-01-12 at 10:14 -0800, Sukadev Bhattiprolu wrote: > The work queue is used to queue reset requests like CHANGE-PARAM or > FAILOVER resets for the worker thread. When the adapter is being > removed > the adapter state is set to VNIC_REMOVING and the work queue is > flushed > so no new work is added. However the check for adapter being removed > is > racy in that the adapter can go into REMOVING state just after we > check > and we might end up adding work just as it is being flushed (or > after). > > Use a new spin lock, ->remove_lock to ensure that after (while) the > work > queue is (being) flushed, no new work is queued. > > A follow-on patch will convert the existing ->state_lock from a spin > lock > to to a mutex to allow us to hold it for longer periods of time. > Since > work can be scheduled from a tasklet, we cannot block on the mutex, > so > use a new spin lock. > > Fixes: 6954a9e4192b ("ibmvnic: Flush existing work items before > device removal") > Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> > --- > drivers/net/ethernet/ibm/ibmvnic.c | 15 ++++++++++++++- > drivers/net/ethernet/ibm/ibmvnic.h | 2 ++ > 2 files changed, 16 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c > b/drivers/net/ethernet/ibm/ibmvnic.c > index ad551418ac63..7645df37e886 100644 > --- a/drivers/net/ethernet/ibm/ibmvnic.c > +++ b/drivers/net/ethernet/ibm/ibmvnic.c > @@ -2435,6 +2435,7 @@ static int ibmvnic_reset(struct ibmvnic_adapter > *adapter, > enum ibmvnic_reset_reason reason) > { > struct net_device *netdev = adapter->netdev; > + unsigned long flags; > int ret; > > if (adapter->state == VNIC_PROBING) { > @@ -2443,6 +2444,8 @@ static int ibmvnic_reset(struct ibmvnic_adapter > *adapter, > goto err; > } > > + spin_lock_irqsave(&adapter->remove_lock, flags); > + > /* > * If failover is pending don't schedule any other reset. > * Instead let the failover complete. If there is already a > @@ -2465,8 +2468,9 @@ static int ibmvnic_reset(struct ibmvnic_adapter > *adapter, > netdev_dbg(adapter->netdev, "Scheduling reset (reason %d)\n", > reason); > schedule_work(&adapter->ibmvnic_reset); > > - return 0; > + ret = 0; > err: > + spin_unlock_irqrestore(&adapter->remove_lock, flags); > return -ret; > } > > @@ -5387,6 +5391,7 @@ static int ibmvnic_probe(struct vio_dev *dev, > const struct vio_device_id *id) > memset(&adapter->pending_resets, 0, sizeof(adapter- > >pending_resets)); > spin_lock_init(&adapter->rwi_lock); > spin_lock_init(&adapter->state_lock); > + spin_lock_init(&adapter->remove_lock); > mutex_init(&adapter->fw_lock); > init_completion(&adapter->init_done); > init_completion(&adapter->fw_done); > @@ -5467,7 +5472,15 @@ static int ibmvnic_remove(struct vio_dev *dev) > return -EBUSY; > } > > + /* If ibmvnic_reset() is scheduling a reset, wait for it to > + * finish. Then prevent it from scheduling any more resets > + * and have the reset functions ignore any resets that have > + * already been scheduled. > + */ > + spin_lock_irqsave(&adapter->remove_lock, flags); > adapter->state = VNIC_REMOVING; > + spin_unlock_irqrestore(&adapter->remove_lock, flags); > + Why irqsave/restore variants ? are you expecting this spinlock to be held in interruptcontext ? > spin_unlock_irqrestore(&adapter->state_lock, flags); > > flush_work(&adapter->ibmvnic_reset); > diff --git a/drivers/net/ethernet/ibm/ibmvnic.h > b/drivers/net/ethernet/ibm/ibmvnic.h > index 1179a95a3f92..2779696ade09 100644 > --- a/drivers/net/ethernet/ibm/ibmvnic.h > +++ b/drivers/net/ethernet/ibm/ibmvnic.h > @@ -1081,6 +1081,8 @@ struct ibmvnic_adapter { > spinlock_t rwi_lock; > enum ibmvnic_reset_reason pending_resets[VNIC_RESET_MAX-1]; > short next_reset; > + /* serialize ibmvnic_reset() and ibmvnic_remove() */ > + spinlock_t remove_lock; > struct work_struct ibmvnic_reset; > struct delayed_work ibmvnic_delayed_reset; > unsigned long resetting;
Saeed Mahameed [saeed@kernel.org] wrote: > On Tue, 2021-01-12 at 10:14 -0800, Sukadev Bhattiprolu wrote: <snip> > > @@ -5467,7 +5472,15 @@ static int ibmvnic_remove(struct vio_dev *dev) > > return -EBUSY; > > } > > > > + /* If ibmvnic_reset() is scheduling a reset, wait for it to > > + * finish. Then prevent it from scheduling any more resets > > + * and have the reset functions ignore any resets that have > > + * already been scheduled. > > + */ > > + spin_lock_irqsave(&adapter->remove_lock, flags); > > adapter->state = VNIC_REMOVING; > > + spin_unlock_irqrestore(&adapter->remove_lock, flags); > > + > > Why irqsave/restore variants ? are you expecting this spinlock to be > held in interruptcontext ? > > > spin_unlock_irqrestore(&adapter->state_lock, flags); Good question. One of the callers of ibmvnic_reset() is the ->ndo_tx_timeout() method which gets called from the watchdog timer. Thanks for the review. Sukadev
On Tue, 12 Jan 2021 16:40:49 -0800 Sukadev Bhattiprolu wrote: > Saeed Mahameed [saeed@kernel.org] wrote: > > On Tue, 2021-01-12 at 10:14 -0800, Sukadev Bhattiprolu wrote: > > <snip> > > > @@ -5467,7 +5472,15 @@ static int ibmvnic_remove(struct vio_dev *dev) > > > return -EBUSY; > > > } > > > > > > + /* If ibmvnic_reset() is scheduling a reset, wait for it to > > > + * finish. Then prevent it from scheduling any more resets > > > + * and have the reset functions ignore any resets that have > > > + * already been scheduled. > > > + */ > > > + spin_lock_irqsave(&adapter->remove_lock, flags); > > > adapter->state = VNIC_REMOVING; > > > + spin_unlock_irqrestore(&adapter->remove_lock, flags); > > > + > > > > Why irqsave/restore variants ? are you expecting this spinlock to be > > held in interruptcontext ? > > > > > spin_unlock_irqrestore(&adapter->state_lock, flags); > > Good question. > > One of the callers of ibmvnic_reset() is the ->ndo_tx_timeout() > method which gets called from the watchdog timer. watchdog is a normal timer, so it's gonna run in softirq, you don't need to mask irqs.
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index ad551418ac63..7645df37e886 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -2435,6 +2435,7 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter, enum ibmvnic_reset_reason reason) { struct net_device *netdev = adapter->netdev; + unsigned long flags; int ret; if (adapter->state == VNIC_PROBING) { @@ -2443,6 +2444,8 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter, goto err; } + spin_lock_irqsave(&adapter->remove_lock, flags); + /* * If failover is pending don't schedule any other reset. * Instead let the failover complete. If there is already a @@ -2465,8 +2468,9 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter, netdev_dbg(adapter->netdev, "Scheduling reset (reason %d)\n", reason); schedule_work(&adapter->ibmvnic_reset); - return 0; + ret = 0; err: + spin_unlock_irqrestore(&adapter->remove_lock, flags); return -ret; } @@ -5387,6 +5391,7 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id) memset(&adapter->pending_resets, 0, sizeof(adapter->pending_resets)); spin_lock_init(&adapter->rwi_lock); spin_lock_init(&adapter->state_lock); + spin_lock_init(&adapter->remove_lock); mutex_init(&adapter->fw_lock); init_completion(&adapter->init_done); init_completion(&adapter->fw_done); @@ -5467,7 +5472,15 @@ static int ibmvnic_remove(struct vio_dev *dev) return -EBUSY; } + /* If ibmvnic_reset() is scheduling a reset, wait for it to + * finish. Then prevent it from scheduling any more resets + * and have the reset functions ignore any resets that have + * already been scheduled. + */ + spin_lock_irqsave(&adapter->remove_lock, flags); adapter->state = VNIC_REMOVING; + spin_unlock_irqrestore(&adapter->remove_lock, flags); + spin_unlock_irqrestore(&adapter->state_lock, flags); flush_work(&adapter->ibmvnic_reset); diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h index 1179a95a3f92..2779696ade09 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.h +++ b/drivers/net/ethernet/ibm/ibmvnic.h @@ -1081,6 +1081,8 @@ struct ibmvnic_adapter { spinlock_t rwi_lock; enum ibmvnic_reset_reason pending_resets[VNIC_RESET_MAX-1]; short next_reset; + /* serialize ibmvnic_reset() and ibmvnic_remove() */ + spinlock_t remove_lock; struct work_struct ibmvnic_reset; struct delayed_work ibmvnic_delayed_reset; unsigned long resetting;
The work queue is used to queue reset requests like CHANGE-PARAM or FAILOVER resets for the worker thread. When the adapter is being removed the adapter state is set to VNIC_REMOVING and the work queue is flushed so no new work is added. However the check for adapter being removed is racy in that the adapter can go into REMOVING state just after we check and we might end up adding work just as it is being flushed (or after). Use a new spin lock, ->remove_lock to ensure that after (while) the work queue is (being) flushed, no new work is queued. A follow-on patch will convert the existing ->state_lock from a spin lock to to a mutex to allow us to hold it for longer periods of time. Since work can be scheduled from a tasklet, we cannot block on the mutex, so use a new spin lock. Fixes: 6954a9e4192b ("ibmvnic: Flush existing work items before device removal") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> --- drivers/net/ethernet/ibm/ibmvnic.c | 15 ++++++++++++++- drivers/net/ethernet/ibm/ibmvnic.h | 2 ++ 2 files changed, 16 insertions(+), 1 deletion(-)