[21/42] lnet: handles unregister/register events

Message ID 1674514855-15399-22-git-send-email-jsimmons@infradead.org (mailing list archive)
State New, archived
Series lustre: sync to OpenSFS tree as of Jan 22 2023

Commit Message

James Simmons Jan. 23, 2023, 11 p.m. UTC
From: Cyril Bordage <cbordage@whamcloud.com>

When the network is restarted, devices are unregistered and then
registered again. If a device registers with an index that differs
from the one it had before the restart, LNet ignores it. As a result,
the link for that device stays in the fatal state.

To fix that, catch unregister events and clear the saved index value;
when a register event arrives, save the new index.
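
The index handling described above can be sketched outside the kernel.
This is a simplified illustration, not the socklnd code: the types
saved_iface and dev_event, the enum reg_state and the function
track_index() are hypothetical stand-ins for ksi->ksni_index,
dev->reg_state and the per-interface checks in
ksocknal_handle_link_state_change().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's NETREG_* states. */
enum reg_state { REG_REGISTERED, REG_UNREGISTERING };

/* Hypothetical, simplified view of one saved interface entry. */
struct saved_iface {
	char name[16];
	int index;	/* -1 means "no index currently saved" */
};

/* Hypothetical, simplified view of a device notification. */
struct dev_event {
	const char *name;
	int ifindex;
	enum reg_state state;
};

/*
 * Returns true when the event should continue to normal link-state
 * handling, false when it was ignored or consumed by index tracking.
 */
static bool track_index(struct saved_iface *iface, const struct dev_event *ev)
{
	assert(iface != NULL && ev != NULL);

	/* Events for other devices are ignored. */
	if (strcmp(iface->name, ev->name) != 0)
		return false;

	if (iface->index == -1) {
		/* No saved index: adopt the new one once registered. */
		if (ev->state == REG_REGISTERED)
			iface->index = ev->ifindex;
		return false;
	}

	/* Same name but a different index: not our device. */
	if (iface->index != ev->ifindex)
		return false;

	if (ev->state == REG_UNREGISTERING) {
		/* The index may differ when the device comes back. */
		iface->index = -1;
		return false;
	}

	return true;
}
```

Under these assumptions, an unregister event for "eth0" resets the
saved index to -1, and the next register event for "eth0" stores the
new index, so later events carrying that index are handled again.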

WC-bug-id: https://jira.whamcloud.com/browse/LU-16378
Lustre-commit: 3c9282a67d73799a0 ("LU-16378 lnet: handles unregister/register events")
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49375
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/socklnd/socklnd.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)
Patch

diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index d8d1071d40f4..07e056845b24 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -2010,10 +2010,30 @@  ksocknal_handle_link_state_change(struct net_device *dev,
 		sa = (void *)&ksi->ksni_addr;
 		found_ip = false;
 
-		if (ksi->ksni_index != ifindex ||
-		    strcmp(ksi->ksni_name, dev->name))
+		if (strcmp(ksi->ksni_name, dev->name))
+			continue;
+
+		if (ksi->ksni_index == -1) {
+			if (dev->reg_state != NETREG_REGISTERED)
+				continue;
+			/* A registration just happened: save the new index for
+			 * the device
+			 */
+			ksi->ksni_index = ifindex;
+			goto out;
+		}
+
+		if (ksi->ksni_index != ifindex)
 			continue;
 
+		if (dev->reg_state == NETREG_UNREGISTERING) {
+			/* Device is unregistering: clear the saved index, since
+			 * it can change when the device comes back
+			 */
+			ksi->ksni_index = -1;
+			goto out;
+		}
+
 		ni = net->ksnn_ni;
 
 		in_dev = __in_dev_get_rtnl(dev);
@@ -2108,6 +2128,8 @@  static int ksocknal_device_event(struct notifier_block *unused,
 	case NETDEV_UP:
 	case NETDEV_DOWN:
 	case NETDEV_CHANGE:
+	case NETDEV_REGISTER:
+	case NETDEV_UNREGISTER:
 		ksocknal_handle_link_state_change(dev, operstate);
 		break;
 	}