diff mbox series

iavf: fix locking of critical sections

Message ID 20210316100141.53551-1-sassmann@kpanic.de (mailing list archive)
State Awaiting Upstream
Delegated to: Netdev Maintainers
Headers show
Series iavf: fix locking of critical sections | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Guessed tree name to be net-next
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cc_maintainers warning 2 maintainers not CCed: davem@davemloft.net kuba@kernel.org
netdev/source_inline fail Was 0 now: 1
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit fail Errors and warnings before: 1 this patch: 2
netdev/kdoc fail Errors and warnings before: 1 this patch: 2
netdev/verify_fixes success Link
netdev/checkpatch warning WARNING: __func__ should be used instead of gcc specific __FUNCTION__ WARNING: line length of 109 exceeds 80 columns
netdev/build_allmodconfig_warn fail Errors and warnings before: 1 this patch: 2
netdev/header_inline success Link

Commit Message

Stefan Assmann March 16, 2021, 10:01 a.m. UTC
To avoid races between iavf_init_task(), iavf_reset_task(),
iavf_watchdog_task(), iavf_adminq_task() as well as the shutdown and
remove functions more locking is required.
The current protection by __IAVF_IN_CRITICAL_TASK is needed in
additional places.

- The reset task performs state transitions, therefore needs locking.
- The adminq task acts on replies from the PF in
  iavf_virtchnl_completion() which may alter the states.
- The init task is not only run during probe but also if a VF gets stuck
  to reinitialize it.
- The shutdown function performs a state transition.
- The remove function perorms a state transition and also free's
  resources.

iavf_lock_timeout() is introduced to avoid waiting infinitely
and cause a deadlock. Rather unlock and print a warning.

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
---
 drivers/net/ethernet/intel/iavf/iavf_main.c | 57 ++++++++++++++++++---
 1 file changed, 50 insertions(+), 7 deletions(-)

Comments

Jakub Kicinski March 16, 2021, 5:14 p.m. UTC | #1
On Tue, 16 Mar 2021 11:01:41 +0100 Stefan Assmann wrote:
> To avoid races between iavf_init_task(), iavf_reset_task(),
> iavf_watchdog_task(), iavf_adminq_task() as well as the shutdown and
> remove functions more locking is required.
> The current protection by __IAVF_IN_CRITICAL_TASK is needed in
> additional places.
> 
> - The reset task performs state transitions, therefore needs locking.
> - The adminq task acts on replies from the PF in
>   iavf_virtchnl_completion() which may alter the states.
> - The init task is not only run during probe but also if a VF gets stuck
>   to reinitialize it.
> - The shutdown function performs a state transition.
> - The remove function perorms a state transition and also free's
>   resources.
> 
> iavf_lock_timeout() is introduced to avoid waiting infinitely
> and cause a deadlock. Rather unlock and print a warning.
> 
> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>

I personally think that the overuse of flags in Intel drivers brings
nothing but trouble. At which point does it make sense to just add a
lock / semaphore here rather than open code all this with no clear
semantics? No code seems to just test the __IAVF_IN_CRITICAL_TASK flag,
all the uses look like poor man's locking at a quick grep. What am I
missing?
Stefan Assmann March 16, 2021, 5:27 p.m. UTC | #2
On 16.03.21 18:14, Jakub Kicinski wrote:
> On Tue, 16 Mar 2021 11:01:41 +0100 Stefan Assmann wrote:
>> To avoid races between iavf_init_task(), iavf_reset_task(),
>> iavf_watchdog_task(), iavf_adminq_task() as well as the shutdown and
>> remove functions more locking is required.
>> The current protection by __IAVF_IN_CRITICAL_TASK is needed in
>> additional places.
>>
>> - The reset task performs state transitions, therefore needs locking.
>> - The adminq task acts on replies from the PF in
>>   iavf_virtchnl_completion() which may alter the states.
>> - The init task is not only run during probe but also if a VF gets stuck
>>   to reinitialize it.
>> - The shutdown function performs a state transition.
>> - The remove function perorms a state transition and also free's
>>   resources.
>>
>> iavf_lock_timeout() is introduced to avoid waiting infinitely
>> and cause a deadlock. Rather unlock and print a warning.
>>
>> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
> 
> I personally think that the overuse of flags in Intel drivers brings
> nothing but trouble. At which point does it make sense to just add a
> lock / semaphore here rather than open code all this with no clear
> semantics? No code seems to just test the __IAVF_IN_CRITICAL_TASK flag,
> all the uses look like poor man's locking at a quick grep. What am I
> missing?
> 

Hi Jakub,

I agree with you that the locking could be done with other locking
mechanisms just as good. I didn't invent the current method so I'll let
Intel comment on that part, but I'd like to point out that what I'm
making use of is fixing what is currently in the driver.

Thanks!

  Stefan
Jakub Kicinski March 16, 2021, 8:29 p.m. UTC | #3
On Tue, 16 Mar 2021 18:27:10 +0100 Stefan Assmann wrote:
> On 16.03.21 18:14, Jakub Kicinski wrote:
> > On Tue, 16 Mar 2021 11:01:41 +0100 Stefan Assmann wrote:  
> >> To avoid races between iavf_init_task(), iavf_reset_task(),
> >> iavf_watchdog_task(), iavf_adminq_task() as well as the shutdown and
> >> remove functions more locking is required.
> >> The current protection by __IAVF_IN_CRITICAL_TASK is needed in
> >> additional places.
> >>
> >> - The reset task performs state transitions, therefore needs locking.
> >> - The adminq task acts on replies from the PF in
> >>   iavf_virtchnl_completion() which may alter the states.
> >> - The init task is not only run during probe but also if a VF gets stuck
> >>   to reinitialize it.
> >> - The shutdown function performs a state transition.
> >> - The remove function perorms a state transition and also free's
> >>   resources.
> >>
> >> iavf_lock_timeout() is introduced to avoid waiting infinitely
> >> and cause a deadlock. Rather unlock and print a warning.
> >>
> >> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>  
> > 
> > I personally think that the overuse of flags in Intel drivers brings
> > nothing but trouble. At which point does it make sense to just add a
> > lock / semaphore here rather than open code all this with no clear
> > semantics? No code seems to just test the __IAVF_IN_CRITICAL_TASK flag,
> > all the uses look like poor man's locking at a quick grep. What am I
> > missing?
> 
> I agree with you that the locking could be done with other locking
> mechanisms just as good. I didn't invent the current method so I'll let
> Intel comment on that part, but I'd like to point out that what I'm
> making use of is fixing what is currently in the driver.

Right, I should have made it clear that I don't blame you for the
current state of things. Would you mind sending a patch on top of 
this one to do a conversion to a semaphore? 

Intel folks any opinions?
Jesse Brandeburg March 16, 2021, 10:02 p.m. UTC | #4
Jakub Kicinski wrote:
> > > I personally think that the overuse of flags in Intel drivers brings
> > > nothing but trouble. At which point does it make sense to just add a
> > > lock / semaphore here rather than open code all this with no clear
> > > semantics? No code seems to just test the __IAVF_IN_CRITICAL_TASK flag,
> > > all the uses look like poor man's locking at a quick grep. What am I
> > > missing?
> > 
> > I agree with you that the locking could be done with other locking
> > mechanisms just as good. I didn't invent the current method so I'll let
> > Intel comment on that part, but I'd like to point out that what I'm
> > making use of is fixing what is currently in the driver.
> 
> Right, I should have made it clear that I don't blame you for the
> current state of things. Would you mind sending a patch on top of 
> this one to do a conversion to a semaphore? 
> 
> Intel folks any opinions?

I know Slawomir has been working closely with Stefan on figuring out
the right ways to fix this code.  Hopefully he can speak for himself,
but I know he's on Europe time.

As for conversion to mutexes I'm a big fan, and as long as we don't
have too many collisions with the RTNL lock I think it's a reasonable
improvement to do, and if Stefan doesn't want to work on it, we can
look into whether Slawomir or his team can.
Stefan Assmann March 17, 2021, 7:49 a.m. UTC | #5
On 16.03.21 23:02, Jesse Brandeburg wrote:
> Jakub Kicinski wrote:
>>>> I personally think that the overuse of flags in Intel drivers brings
>>>> nothing but trouble. At which point does it make sense to just add a
>>>> lock / semaphore here rather than open code all this with no clear
>>>> semantics? No code seems to just test the __IAVF_IN_CRITICAL_TASK flag,
>>>> all the uses look like poor man's locking at a quick grep. What am I
>>>> missing?
>>>
>>> I agree with you that the locking could be done with other locking
>>> mechanisms just as good. I didn't invent the current method so I'll let
>>> Intel comment on that part, but I'd like to point out that what I'm
>>> making use of is fixing what is currently in the driver.
>>
>> Right, I should have made it clear that I don't blame you for the
>> current state of things. Would you mind sending a patch on top of 
>> this one to do a conversion to a semaphore? 

Sure, I'm happy to help working on the conversion once the current issue
is resolved.

>> Intel folks any opinions?
> 
> I know Slawomir has been working closely with Stefan on figuring out
> the right ways to fix this code.  Hopefully he can speak for himself,
> but I know he's on Europe time.
> 
> As for conversion to mutexes I'm a big fan, and as long as we don't
> have too many collisions with the RTNL lock I think it's a reasonable
> improvement to do, and if Stefan doesn't want to work on it, we can
> look into whether Slawomir or his team can.

I'd appreciate to be involved.
Thanks!

  Stefan
Laba, SlawomirX March 17, 2021, 5:27 p.m. UTC | #6
We were discussing introducing mutexes in those critical spots for a long time now (in my team).
Stefan, if you find time, you are most welcome to offer your solution with mutexes.

Slawek


-----Original Message-----
From: Stefan Assmann <sassmann@kpanic.de> 
Sent: Wednesday, March 17, 2021 8:50 AM
To: Brandeburg, Jesse <jesse.brandeburg@intel.com>; Jakub Kicinski <kuba@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Yang, Lihong <lihong.yang@intel.com>; Laba, SlawomirX <slawomirx.laba@intel.com>; Nunley, Nicholas D <nicholas.d.nunley@intel.com>
Subject: Re: [PATCH] iavf: fix locking of critical sections

On 16.03.21 23:02, Jesse Brandeburg wrote:
> Jakub Kicinski wrote:
>>>> I personally think that the overuse of flags in Intel drivers 
>>>> brings nothing but trouble. At which point does it make sense to 
>>>> just add a lock / semaphore here rather than open code all this 
>>>> with no clear semantics? No code seems to just test the 
>>>> __IAVF_IN_CRITICAL_TASK flag, all the uses look like poor man's 
>>>> locking at a quick grep. What am I missing?
>>>
>>> I agree with you that the locking could be done with other locking 
>>> mechanisms just as good. I didn't invent the current method so I'll 
>>> let Intel comment on that part, but I'd like to point out that what 
>>> I'm making use of is fixing what is currently in the driver.
>>
>> Right, I should have made it clear that I don't blame you for the 
>> current state of things. Would you mind sending a patch on top of 
>> this one to do a conversion to a semaphore?

Sure, I'm happy to help working on the conversion once the current issue is resolved.

>> Intel folks any opinions?
> 
> I know Slawomir has been working closely with Stefan on figuring out 
> the right ways to fix this code.  Hopefully he can speak for himself, 
> but I know he's on Europe time.
> 
> As for conversion to mutexes I'm a big fan, and as long as we don't 
> have too many collisions with the RTNL lock I think it's a reasonable 
> improvement to do, and if Stefan doesn't want to work on it, we can 
> look into whether Slawomir or his team can.

I'd appreciate to be involved.
Thanks!

  Stefan
Stefan Assmann March 17, 2021, 6:14 p.m. UTC | #7
On 17.03.21 18:27, Laba, SlawomirX wrote:
> We were discussing introducing mutexes in those critical spots for a long time now (in my team).
> Stefan, if you find time, you are most welcome to offer your solution with mutexes.

Hi Slawek,

I'll work on that conversion once the current patch went through Intel
testing and is merged. Smaller patches, smaller steps are usually
better, so let's make one change at a time.

Thanks!

  Stefan

> Slawek
> 
> 
> -----Original Message-----
> From: Stefan Assmann <sassmann@kpanic.de> 
> Sent: Wednesday, March 17, 2021 8:50 AM
> To: Brandeburg, Jesse <jesse.brandeburg@intel.com>; Jakub Kicinski <kuba@kernel.org>
> Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Yang, Lihong <lihong.yang@intel.com>; Laba, SlawomirX <slawomirx.laba@intel.com>; Nunley, Nicholas D <nicholas.d.nunley@intel.com>
> Subject: Re: [PATCH] iavf: fix locking of critical sections
> 
> On 16.03.21 23:02, Jesse Brandeburg wrote:
>> Jakub Kicinski wrote:
>>>>> I personally think that the overuse of flags in Intel drivers 
>>>>> brings nothing but trouble. At which point does it make sense to 
>>>>> just add a lock / semaphore here rather than open code all this 
>>>>> with no clear semantics? No code seems to just test the 
>>>>> __IAVF_IN_CRITICAL_TASK flag, all the uses look like poor man's 
>>>>> locking at a quick grep. What am I missing?
>>>>
>>>> I agree with you that the locking could be done with other locking 
>>>> mechanisms just as good. I didn't invent the current method so I'll 
>>>> let Intel comment on that part, but I'd like to point out that what 
>>>> I'm making use of is fixing what is currently in the driver.
>>>
>>> Right, I should have made it clear that I don't blame you for the 
>>> current state of things. Would you mind sending a patch on top of 
>>> this one to do a conversion to a semaphore?
> 
> Sure, I'm happy to help working on the conversion once the current issue is resolved.
> 
>>> Intel folks any opinions?
>>
>> I know Slawomir has been working closely with Stefan on figuring out 
>> the right ways to fix this code.  Hopefully he can speak for himself, 
>> but I know he's on Europe time.
>>
>> As for conversion to mutexes I'm a big fan, and as long as we don't 
>> have too many collisions with the RTNL lock I think it's a reasonable 
>> improvement to do, and if Stefan doesn't want to work on it, we can 
>> look into whether Slawomir or his team can.
> 
> I'd appreciate to be involved.
> Thanks!
> 
>   Stefan
>
Jankowski, Konrad0 July 6, 2021, 8:07 a.m. UTC | #8
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Stefan Assmann
> Sent: wtorek, 16 marca 2021 11:02
> To: intel-wired-lan@lists.osuosl.org
> Cc: Laba, SlawomirX <slawomirx.laba@intel.com>; netdev@vger.kernel.org;
> sassmann@kpanic.de
> Subject: [Intel-wired-lan] [PATCH] iavf: fix locking of critical sections
> 
> To avoid races between iavf_init_task(), iavf_reset_task(),
> iavf_watchdog_task(), iavf_adminq_task() as well as the shutdown and
> remove functions more locking is required.
> The current protection by __IAVF_IN_CRITICAL_TASK is needed in additional
> places.
> 
> - The reset task performs state transitions, therefore needs locking.
> - The adminq task acts on replies from the PF in
>   iavf_virtchnl_completion() which may alter the states.
> - The init task is not only run during probe but also if a VF gets stuck
>   to reinitialize it.
> - The shutdown function performs a state transition.
> - The remove function perorms a state transition and also free's
>   resources.
> 
> iavf_lock_timeout() is introduced to avoid waiting infinitely and cause a
> deadlock. Rather unlock and print a warning.
> 
> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
> ---
>  drivers/net/ethernet/intel/iavf/iavf_main.c | 57 ++++++++++++++++++---
>  1 file changed, 50 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
> b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index dc5b3c06d1e0..538b7aa43fa5 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -131,6 +131,30 @@ enum iavf_status iavf_free_virt_mem_d(struct
> iavf_hw *hw,
>  	return 0;
>  }
> 
> +/**
> + * iavf_timeout - try to set bit but give up after timeout
> + * @adapter: board private structure
> + * @bit: bit to set
> + * @msecs: timeout in msecs
> + *
> + * Returns 0 on success, negative on failure  **/ static inline int
> +iavf_lock_timeout(struct iavf_adapter *adapter,
> +				    enum iavf_critical_section_t bit,
> +				    unsigned int msecs)
> +{
> +	unsigned int wait, delay = 10;
> +
> +	for (wait = 0; wait < msecs; wait += delay) {
> +		if (!test_and_set_bit(bit, &adapter->crit_section))
> +			return 0;
> +
> +		msleep(delay);
> +	}
> +
> +	return -1;
> +}
> +
>  /**
>   * iavf_schedule_reset - Set the flags and schedule a reset event
>   * @adapter: board private structure
> @@ -2069,6 +2093,10 @@ static void iavf_reset_task(struct work_struct
> *work)
>  	if (test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section))
>  		return;
> 
> +	if (iavf_lock_timeout(adapter, __IAVF_IN_CRITICAL_TASK, 200)) {
> +		schedule_work(&adapter->reset_task);
> +		return;
> +	}
>  	while (test_and_set_bit(__IAVF_IN_CLIENT_TASK,
>  				&adapter->crit_section))
>  		usleep_range(500, 1000);
> @@ -2275,6 +2303,8 @@ static void iavf_adminq_task(struct work_struct
> *work)
>  	if (!event.msg_buf)
>  		goto out;
> 
> +	if (iavf_lock_timeout(adapter, __IAVF_IN_CRITICAL_TASK, 200))
> +		goto freedom;
>  	do {
>  		ret = iavf_clean_arq_element(hw, &event, &pending);
>  		v_op = (enum
> virtchnl_ops)le32_to_cpu(event.desc.cookie_high);
> @@ -2288,6 +2318,7 @@ static void iavf_adminq_task(struct work_struct
> *work)
>  		if (pending != 0)
>  			memset(event.msg_buf, 0,
> IAVF_MAX_AQ_BUF_SIZE);
>  	} while (pending);
> +	clear_bit(__IAVF_IN_CRITICAL_TASK, &adapter->crit_section);
> 
>  	if ((adapter->flags &
>  	     (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED)) ||
> @@ -3590,6 +3621,10 @@ static void iavf_init_task(struct work_struct
> *work)
>  						    init_task.work);
>  	struct iavf_hw *hw = &adapter->hw;
> 
> +	if (iavf_lock_timeout(adapter, __IAVF_IN_CRITICAL_TASK, 5000)) {
> +		dev_warn(&adapter->pdev->dev, "failed to set
> __IAVF_IN_CRITICAL_TASK in %s\n", __FUNCTION__);
> +		return;
> +	}
>  	switch (adapter->state) {
>  	case __IAVF_STARTUP:
>  		if (iavf_startup(adapter) < 0)
> @@ -3602,14 +3637,14 @@ static void iavf_init_task(struct work_struct
> *work)
>  	case __IAVF_INIT_GET_RESOURCES:
>  		if (iavf_init_get_resources(adapter) < 0)
>  			goto init_failed;
> -		return;
> +		goto out;
>  	default:
>  		goto init_failed;
>  	}
> 
>  	queue_delayed_work(iavf_wq, &adapter->init_task,
>  			   msecs_to_jiffies(30));
> -	return;
> +	goto out;
>  init_failed:
>  	if (++adapter->aq_wait_count > IAVF_AQ_MAX_ERR) {
>  		dev_err(&adapter->pdev->dev,
> @@ -3618,9 +3653,11 @@ static void iavf_init_task(struct work_struct
> *work)
>  		iavf_shutdown_adminq(hw);
>  		adapter->state = __IAVF_STARTUP;
>  		queue_delayed_work(iavf_wq, &adapter->init_task, HZ * 5);
> -		return;
> +		goto out;
>  	}
>  	queue_delayed_work(iavf_wq, &adapter->init_task, HZ);
> +out:
> +	clear_bit(__IAVF_IN_CRITICAL_TASK, &adapter->crit_section);
>  }
> 
>  /**
> @@ -3637,9 +3674,12 @@ static void iavf_shutdown(struct pci_dev *pdev)
>  	if (netif_running(netdev))
>  		iavf_close(netdev);
> 
> +	if (iavf_lock_timeout(adapter, __IAVF_IN_CRITICAL_TASK, 5000))
> +		dev_warn(&adapter->pdev->dev, "failed to set
> __IAVF_IN_CRITICAL_TASK
> +in %s\n", __FUNCTION__);
>  	/* Prevent the watchdog from running. */
>  	adapter->state = __IAVF_REMOVE;
>  	adapter->aq_required = 0;
> +	clear_bit(__IAVF_IN_CRITICAL_TASK, &adapter->crit_section);
> 
>  #ifdef CONFIG_PM
>  	pci_save_state(pdev);
> @@ -3866,10 +3906,6 @@ static void iavf_remove(struct pci_dev *pdev)
>  				 err);
>  	}
> 
> -	/* Shut down all the garbage mashers on the detention level */
> -	adapter->state = __IAVF_REMOVE;
> -	adapter->aq_required = 0;
> -	adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED;
>  	iavf_request_reset(adapter);
>  	msleep(50);
>  	/* If the FW isn't responding, kick it once, but only once. */ @@ -
> 3877,6 +3913,13 @@ static void iavf_remove(struct pci_dev *pdev)
>  		iavf_request_reset(adapter);
>  		msleep(50);
>  	}
> +	if (iavf_lock_timeout(adapter, __IAVF_IN_CRITICAL_TASK, 5000))
> +		dev_warn(&adapter->pdev->dev, "failed to set
> __IAVF_IN_CRITICAL_TASK
> +in %s\n", __FUNCTION__);
> +
> +	/* Shut down all the garbage mashers on the detention level */
> +	adapter->state = __IAVF_REMOVE;
> +	adapter->aq_required = 0;
> +	adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED;
>  	iavf_free_all_tx_resources(adapter);
>  	iavf_free_all_rx_resources(adapter);
>  	iavf_misc_irq_disable(adapter);
> --
> 2.29.2

Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index dc5b3c06d1e0..538b7aa43fa5 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -131,6 +131,30 @@  enum iavf_status iavf_free_virt_mem_d(struct iavf_hw *hw,
 	return 0;
 }
 
+/**
+ * iavf_timeout - try to set bit but give up after timeout
+ * @adapter: board private structure
+ * @bit: bit to set
+ * @msecs: timeout in msecs
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static inline int iavf_lock_timeout(struct iavf_adapter *adapter,
+				    enum iavf_critical_section_t bit,
+				    unsigned int msecs)
+{
+	unsigned int wait, delay = 10;
+
+	for (wait = 0; wait < msecs; wait += delay) {
+		if (!test_and_set_bit(bit, &adapter->crit_section))
+			return 0;
+
+		msleep(delay);
+	}
+
+	return -1;
+}
+
 /**
  * iavf_schedule_reset - Set the flags and schedule a reset event
  * @adapter: board private structure
@@ -2069,6 +2093,10 @@  static void iavf_reset_task(struct work_struct *work)
 	if (test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section))
 		return;
 
+	if (iavf_lock_timeout(adapter, __IAVF_IN_CRITICAL_TASK, 200)) {
+		schedule_work(&adapter->reset_task);
+		return;
+	}
 	while (test_and_set_bit(__IAVF_IN_CLIENT_TASK,
 				&adapter->crit_section))
 		usleep_range(500, 1000);
@@ -2275,6 +2303,8 @@  static void iavf_adminq_task(struct work_struct *work)
 	if (!event.msg_buf)
 		goto out;
 
+	if (iavf_lock_timeout(adapter, __IAVF_IN_CRITICAL_TASK, 200))
+		goto freedom;
 	do {
 		ret = iavf_clean_arq_element(hw, &event, &pending);
 		v_op = (enum virtchnl_ops)le32_to_cpu(event.desc.cookie_high);
@@ -2288,6 +2318,7 @@  static void iavf_adminq_task(struct work_struct *work)
 		if (pending != 0)
 			memset(event.msg_buf, 0, IAVF_MAX_AQ_BUF_SIZE);
 	} while (pending);
+	clear_bit(__IAVF_IN_CRITICAL_TASK, &adapter->crit_section);
 
 	if ((adapter->flags &
 	     (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED)) ||
@@ -3590,6 +3621,10 @@  static void iavf_init_task(struct work_struct *work)
 						    init_task.work);
 	struct iavf_hw *hw = &adapter->hw;
 
+	if (iavf_lock_timeout(adapter, __IAVF_IN_CRITICAL_TASK, 5000)) {
+		dev_warn(&adapter->pdev->dev, "failed to set __IAVF_IN_CRITICAL_TASK in %s\n", __FUNCTION__);
+		return;
+	}
 	switch (adapter->state) {
 	case __IAVF_STARTUP:
 		if (iavf_startup(adapter) < 0)
@@ -3602,14 +3637,14 @@  static void iavf_init_task(struct work_struct *work)
 	case __IAVF_INIT_GET_RESOURCES:
 		if (iavf_init_get_resources(adapter) < 0)
 			goto init_failed;
-		return;
+		goto out;
 	default:
 		goto init_failed;
 	}
 
 	queue_delayed_work(iavf_wq, &adapter->init_task,
 			   msecs_to_jiffies(30));
-	return;
+	goto out;
 init_failed:
 	if (++adapter->aq_wait_count > IAVF_AQ_MAX_ERR) {
 		dev_err(&adapter->pdev->dev,
@@ -3618,9 +3653,11 @@  static void iavf_init_task(struct work_struct *work)
 		iavf_shutdown_adminq(hw);
 		adapter->state = __IAVF_STARTUP;
 		queue_delayed_work(iavf_wq, &adapter->init_task, HZ * 5);
-		return;
+		goto out;
 	}
 	queue_delayed_work(iavf_wq, &adapter->init_task, HZ);
+out:
+	clear_bit(__IAVF_IN_CRITICAL_TASK, &adapter->crit_section);
 }
 
 /**
@@ -3637,9 +3674,12 @@  static void iavf_shutdown(struct pci_dev *pdev)
 	if (netif_running(netdev))
 		iavf_close(netdev);
 
+	if (iavf_lock_timeout(adapter, __IAVF_IN_CRITICAL_TASK, 5000))
+		dev_warn(&adapter->pdev->dev, "failed to set __IAVF_IN_CRITICAL_TASK in %s\n", __FUNCTION__);
 	/* Prevent the watchdog from running. */
 	adapter->state = __IAVF_REMOVE;
 	adapter->aq_required = 0;
+	clear_bit(__IAVF_IN_CRITICAL_TASK, &adapter->crit_section);
 
 #ifdef CONFIG_PM
 	pci_save_state(pdev);
@@ -3866,10 +3906,6 @@  static void iavf_remove(struct pci_dev *pdev)
 				 err);
 	}
 
-	/* Shut down all the garbage mashers on the detention level */
-	adapter->state = __IAVF_REMOVE;
-	adapter->aq_required = 0;
-	adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED;
 	iavf_request_reset(adapter);
 	msleep(50);
 	/* If the FW isn't responding, kick it once, but only once. */
@@ -3877,6 +3913,13 @@  static void iavf_remove(struct pci_dev *pdev)
 		iavf_request_reset(adapter);
 		msleep(50);
 	}
+	if (iavf_lock_timeout(adapter, __IAVF_IN_CRITICAL_TASK, 5000))
+		dev_warn(&adapter->pdev->dev, "failed to set __IAVF_IN_CRITICAL_TASK in %s\n", __FUNCTION__);
+
+	/* Shut down all the garbage mashers on the detention level */
+	adapter->state = __IAVF_REMOVE;
+	adapter->aq_required = 0;
+	adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED;
 	iavf_free_all_tx_resources(adapter);
 	iavf_free_all_rx_resources(adapter);
 	iavf_misc_irq_disable(adapter);