[v3,1/1] powerpc/vnic: Extend "failover pending" window

Message ID 20201030170711.1562994-1-sukadev@linux.ibm.com (mailing list archive)
State Not Applicable
Delegated to: Netdev Maintainers

Commit Message

Sukadev Bhattiprolu Oct. 30, 2020, 5:07 p.m. UTC
Commit 5a18e1e0c193b introduced the 'failover_pending' state to track
the "failover pending window" - where we wait for the partner to become
ready (after a transport event) before actually attempting to failover.
i.e., the window lies between the following two events:

        a. we get a transport event due to a FAILOVER

        b. later, we get CRQ_INITIALIZED indicating the partner is
           ready, at which point we schedule a FAILOVER reset.

and ->failover_pending is true during this window.

If we attempt to open (or close) a device during this window, we pretend
that the operation succeeded and let the FAILOVER reset path complete the
operation.

This is fine, except when the transport event ("a" above) occurs during the
open, after open has already checked whether a failover is pending. If
that happens, we fail the open, which can cause the boot scripts to leave
the interface down, requiring the administrator to bring up the device
manually.
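
The race above is a check-then-act problem. The following is a minimal,
single-threaded sketch of it, not the driver code: the names model_adapter,
model_transport_event() and model_open_old() are hypothetical, the
event_mid_open parameter merely simulates a transport event arriving after
the up-front check, and -1 stands in for whatever error the real open
would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's adapter state. */
enum model_state { MODEL_CLOSED, MODEL_OPEN };

struct model_adapter {
	bool failover_pending;
	enum model_state state;
};

/* Event "a": a transport event opens the failover pending window. */
static void model_transport_event(struct model_adapter *a)
{
	assert(a != NULL);
	a->failover_pending = true;
}

/*
 * Pre-patch open: the pending check happens once, up front.
 * event_mid_open simulates a transport event that arrives after that
 * check, which makes the remaining open steps fail.
 */
static int model_open_old(struct model_adapter *a, bool event_mid_open)
{
	assert(a != NULL);

	if (a->failover_pending) {
		/* Pretend success; the FAILOVER reset completes the open. */
		a->state = MODEL_OPEN;
		return 0;
	}

	if (event_mid_open) {
		model_transport_event(a);
		return -1;	/* open fails and the interface stays down */
	}

	a->state = MODEL_OPEN;
	return 0;
}
```

Calling model_open_old() with event_mid_open set on a closed adapter
returns an error and leaves the state closed even though failover_pending
is now true, which is the case the boot scripts trip over.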

This fix "extends" the failover pending window until we are _actually_
ready to perform the failover reset (i.e., until after we get the RTNL
lock). Since open() holds the RTNL lock, the open either finishes or, if
it fails because of the failover pending window, we can again pretend
that the open is done and let the failover complete it.
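
As a rough illustration of the extended window, here is a single-threaded
sketch under stated assumptions. The names sketch_adapter, sketch_open()
and sketch_failover_reset() are hypothetical, the rtnl_held flag is only a
stand-in for the RTNL lock, open_rc stands in for the result of the login
and resource-init steps, and the up field is an invented marker for "the
reset finished bringing the device up".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_state { SKETCH_CLOSED, SKETCH_OPEN };

/* Hypothetical, simplified model of the adapter. */
struct sketch_adapter {
	bool failover_pending;
	bool rtnl_held;		/* stand-in for the RTNL lock */
	enum sketch_state state;
	bool up;		/* set once a reset completes the open */
};

/*
 * Open path, entered with the "RTNL lock" held. If the open steps failed
 * while a failover is pending, report success and leave the rest to the
 * failover reset.
 */
static int sketch_open(struct sketch_adapter *a, int open_rc)
{
	assert(a != NULL && a->rtnl_held);

	if (open_rc && a->failover_pending) {
		a->state = SKETCH_OPEN;
		return 0;
	}

	if (!open_rc)
		a->state = SKETCH_OPEN;
	return open_rc;
}

/*
 * Failover reset: the pending flag is cleared only after the lock is
 * taken, so an open in progress has already finished (or been marked
 * successful) by the time the flag goes away.
 */
static void sketch_failover_reset(struct sketch_adapter *a)
{
	assert(a != NULL && !a->rtnl_held);	/* the real reset would block */

	a->rtnl_held = true;
	a->failover_pending = false;
	a->up = (a->state == SKETCH_OPEN);	/* reset completes the open */
	a->rtnl_held = false;
}
```

In this model, an open that fails while failover_pending is set returns 0
and marks the state open; the later sketch_failover_reset() clears the
flag and sets up. An open that fails with no failover pending still
returns its error.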

We could try to block the open until the failover is completed, but a) that
could still time out the application and b) existing code "pretends" that
the failover occurred "just after" open succeeded, so it marks the open
successful and lets the failover complete the open. So, mark the open
successful even if the transport event occurs before we actually start the
open.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
Changelog [v3]:
	[Lijun Pan]: Add a few more notes to patch description.

Changelog [v2]:
	[Brian King] Ensure we clear failover_pending during hard reset
---
 drivers/net/ethernet/ibm/ibmvnic.c | 36 ++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)

Comments

drt Oct. 30, 2020, 6:56 p.m. UTC | #1
On 2020-10-30 10:07, Sukadev Bhattiprolu wrote:
> Commit 5a18e1e0c193b introduced the 'failover_pending' state to track
> the "failover pending window" - where we wait for the partner to become
> ready (after a transport event) before actually attempting to failover.
> i.e window is between following two events:
> 
>         a. we get a transport event due to a FAILOVER
> 
>         b. later, we get CRQ_INITIALIZED indicating the partner is
>            ready  at which point we schedule a FAILOVER reset.
> 
> and ->failover_pending is true during this window.
> 
> If during this window, we attempt to open (or close) a device, we pretend
> that the operation succeeded and let the FAILOVER reset path complete the
> operation.
> 
> This is fine, except if the transport event ("a" above) occurs during the
> open and after open has already checked whether a failover is pending. If
> that happens, we fail the open, which can cause the boot scripts to leave
> the interface down requiring administrator to manually bring up the device.
> 
> This fix "extends" the failover pending window till we are _actually_
> ready to perform the failover reset (i.e until after we get the RTNL
> lock). Since open() holds the RTNL lock, we can be sure that we either
> finish the open or if the open() fails due to the failover pending window,
> we can again pretend that open is done and let the failover complete it.
> 
> We could try and block the open until failover is completed but a) that
> could still timeout the application and b) Existing code "pretends" that
> failover occurred "just after" open succeeded, so marks the open successful
> and lets the failover complete the open. So, mark the open successful even
> if the transport event occurs before we actually start the open.
> 
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
> ---
> Changelog [v3]:
> 	[Lijun Pan]: Add a few more notes to patch description.
> 
> Changelog [v2]:
> 	[Brian King] Ensure we clear failover_pending during hard reset
> ---

Thanks, Suka!

Acked-by: Dany Madden <drt@linux.ibm.com>

>  drivers/net/ethernet/ibm/ibmvnic.c | 36 ++++++++++++++++++++++++++----
>  1 file changed, 32 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 1b702a43a5d0..2a0f6f6820db 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -1197,18 +1197,27 @@ static int ibmvnic_open(struct net_device *netdev)
>  	if (adapter->state != VNIC_CLOSED) {
>  		rc = ibmvnic_login(netdev);
>  		if (rc)
> -			return rc;
> +			goto out;
> 
>  		rc = init_resources(adapter);
>  		if (rc) {
>  			netdev_err(netdev, "failed to initialize resources\n");
>  			release_resources(adapter);
> -			return rc;
> +			goto out;
>  		}
>  	}
> 
>  	rc = __ibmvnic_open(netdev);
> 
> +out:
> +	/*
> +	 * If open fails due to a pending failover, set device state and
> +	 * return. Device operation will be handled by reset routine.
> +	 */
> +	if (rc && adapter->failover_pending) {
> +		adapter->state = VNIC_OPEN;
> +		rc = 0;
> +	}
>  	return rc;
>  }
> 
> @@ -1931,6 +1940,13 @@ static int do_reset(struct ibmvnic_adapter *adapter,
>  		   rwi->reset_reason);
> 
>  	rtnl_lock();
> +	/*
> +	 * Now that we have the rtnl lock, clear any pending failover.
> +	 * This will ensure ibmvnic_open() has either completed or will
> +	 * block until failover is complete.
> +	 */
> +	if (rwi->reset_reason == VNIC_RESET_FAILOVER)
> +		adapter->failover_pending = false;
> 
>  	netif_carrier_off(netdev);
>  	adapter->reset_reason = rwi->reset_reason;
> @@ -2211,6 +2227,13 @@ static void __ibmvnic_reset(struct work_struct *work)
>  			/* CHANGE_PARAM requestor holds rtnl_lock */
>  			rc = do_change_param_reset(adapter, rwi, reset_state);
>  		} else if (adapter->force_reset_recovery) {
> +			/*
> +			 * Since we are doing a hard reset now, clear the
> +			 * failover_pending flag so we don't ignore any
> +			 * future MOBILITY or other resets.
> +			 */
> +			adapter->failover_pending = false;
> +
>  			/* Transport event occurred during previous reset */
>  			if (adapter->wait_for_reset) {
>  				/* Previous was CHANGE_PARAM; caller locked */
> @@ -2275,9 +2298,15 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter,
>  	unsigned long flags;
>  	int ret;
> 
> +	/*
> +	 * If failover is pending don't schedule any other reset.
> +	 * Instead let the failover complete. If there is already
> +	 * a failover reset scheduled, we will detect and drop the
> +	 * duplicate reset when walking the ->rwi_list below.
> +	 */
>  	if (adapter->state == VNIC_REMOVING ||
>  	    adapter->state == VNIC_REMOVED ||
> -	    adapter->failover_pending) {
> +	    (adapter->failover_pending && reason != VNIC_RESET_FAILOVER)) {
>  		ret = EBUSY;
>  		netdev_dbg(netdev, "Adapter removing or pending failover, skipping reset\n");
>  		goto err;
> @@ -4653,7 +4682,6 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
>  		case IBMVNIC_CRQ_INIT:
>  			dev_info(dev, "Partner initialized\n");
>  			adapter->from_passive_init = true;
> -			adapter->failover_pending = false;
>  			if (!completion_done(&adapter->init_done)) {
>  				complete(&adapter->init_done);
>  				adapter->init_done_rc = -EIO;

Jakub Kicinski Nov. 3, 2020, 12:56 a.m. UTC | #2
On Fri, 30 Oct 2020 10:07:11 -0700 Sukadev Bhattiprolu wrote:
> Commit 5a18e1e0c193b introduced the 'failover_pending' state to track
> the "failover pending window" - where we wait for the partner to become
> ready (after a transport event) before actually attempting to failover.
> i.e window is between following two events:
> 
>         a. we get a transport event due to a FAILOVER
> 
>         b. later, we get CRQ_INITIALIZED indicating the partner is
>            ready  at which point we schedule a FAILOVER reset.
> 
> and ->failover_pending is true during this window.
> 
> If during this window, we attempt to open (or close) a device, we pretend
> that the operation succeeded and let the FAILOVER reset path complete the
> operation.

Applied, thanks!

Patch

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 1b702a43a5d0..2a0f6f6820db 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1197,18 +1197,27 @@  static int ibmvnic_open(struct net_device *netdev)
 	if (adapter->state != VNIC_CLOSED) {
 		rc = ibmvnic_login(netdev);
 		if (rc)
-			return rc;
+			goto out;
 
 		rc = init_resources(adapter);
 		if (rc) {
 			netdev_err(netdev, "failed to initialize resources\n");
 			release_resources(adapter);
-			return rc;
+			goto out;
 		}
 	}
 
 	rc = __ibmvnic_open(netdev);
 
+out:
+	/*
+	 * If open fails due to a pending failover, set device state and
+	 * return. Device operation will be handled by reset routine.
+	 */
+	if (rc && adapter->failover_pending) {
+		adapter->state = VNIC_OPEN;
+		rc = 0;
+	}
 	return rc;
 }
 
@@ -1931,6 +1940,13 @@  static int do_reset(struct ibmvnic_adapter *adapter,
 		   rwi->reset_reason);
 
 	rtnl_lock();
+	/*
+	 * Now that we have the rtnl lock, clear any pending failover.
+	 * This will ensure ibmvnic_open() has either completed or will
+	 * block until failover is complete.
+	 */
+	if (rwi->reset_reason == VNIC_RESET_FAILOVER)
+		adapter->failover_pending = false;
 
 	netif_carrier_off(netdev);
 	adapter->reset_reason = rwi->reset_reason;
@@ -2211,6 +2227,13 @@  static void __ibmvnic_reset(struct work_struct *work)
 			/* CHANGE_PARAM requestor holds rtnl_lock */
 			rc = do_change_param_reset(adapter, rwi, reset_state);
 		} else if (adapter->force_reset_recovery) {
+			/*
+			 * Since we are doing a hard reset now, clear the
+			 * failover_pending flag so we don't ignore any
+			 * future MOBILITY or other resets.
+			 */
+			adapter->failover_pending = false;
+
 			/* Transport event occurred during previous reset */
 			if (adapter->wait_for_reset) {
 				/* Previous was CHANGE_PARAM; caller locked */
@@ -2275,9 +2298,15 @@  static int ibmvnic_reset(struct ibmvnic_adapter *adapter,
 	unsigned long flags;
 	int ret;
 
+	/*
+	 * If failover is pending don't schedule any other reset.
+	 * Instead let the failover complete. If there is already
+	 * a failover reset scheduled, we will detect and drop the
+	 * duplicate reset when walking the ->rwi_list below.
+	 */
 	if (adapter->state == VNIC_REMOVING ||
 	    adapter->state == VNIC_REMOVED ||
-	    adapter->failover_pending) {
+	    (adapter->failover_pending && reason != VNIC_RESET_FAILOVER)) {
 		ret = EBUSY;
 		netdev_dbg(netdev, "Adapter removing or pending failover, skipping reset\n");
 		goto err;
@@ -4653,7 +4682,6 @@  static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
 		case IBMVNIC_CRQ_INIT:
 			dev_info(dev, "Partner initialized\n");
 			adapter->from_passive_init = true;
-			adapter->failover_pending = false;
 			if (!completion_done(&adapter->init_done)) {
 				complete(&adapter->init_done);
 				adapter->init_done_rc = -EIO;