diff mbox series

[18/39] lnet: Introduce lnet_recovery_limit parameter

Message ID 1611249422-556-19-git-send-email-jsimmons@infradead.org (mailing list archive)
State New
Headers show
Series lustre: update to latest OpenSFS version as of Jan 21 2021 | expand

Commit Message

James Simmons Jan. 21, 2021, 5:16 p.m. UTC
From: Chris Horn <chris.horn@hpe.com>

This parameter controls how long LNet will attempt to recover an
unhealthy interface.

Defaults to 0 to indicate indefinite recovery. This maintains the
current behavior.

HPE-bug-id: LUS-9109
WC-bug-id: https://jira.whamcloud.com/browse/LU-13569
Lustre-commit: a2e61838f8de89 ("LU-13569 lnet: Introduce lnet_recovery_limit parameter")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/39716
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-lnet.h | 1 +
 net/lnet/lnet/api-ni.c        | 5 +++++
 2 files changed, 6 insertions(+)
diff mbox series

Patch

diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h
index d349f06..927ca44 100644
--- a/include/linux/lnet/lib-lnet.h
+++ b/include/linux/lnet/lib-lnet.h
@@ -476,6 +476,7 @@  struct lnet_ni *
 extern unsigned int lnet_numa_range;
 extern unsigned int lnet_health_sensitivity;
 extern unsigned int lnet_recovery_interval;
+extern unsigned int lnet_recovery_limit;
 extern unsigned int lnet_peer_discovery_disabled;
 extern unsigned int lnet_drop_asym_route;
 extern unsigned int router_sensitivity_percentage;
diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 03473bf..322b25d 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -112,6 +112,11 @@  static int recovery_interval_set(const char *val,
 MODULE_PARM_DESC(lnet_recovery_interval,
 		 "Interval to recover unhealthy interfaces in seconds");
 
+unsigned int lnet_recovery_limit;
+module_param(lnet_recovery_limit, uint, 0644);
+MODULE_PARM_DESC(lnet_recovery_limit,
+		 "How long to attempt recovery of unhealthy peer interfaces in seconds. Set to 0 to allow indefinite recovery");
+
 static int lnet_interfaces_max = LNET_INTERFACES_MAX_DEFAULT;
 static int intf_max_set(const char *val, const struct kernel_param *kp);
 module_param_call(lnet_interfaces_max, intf_max_set, param_get_int,