[16/40] lustre: ptlrpc: clarify AT error message

Message ID 1681042400-15491-17-git-send-email-jsimmons@infradead.org (mailing list archive)
State New, archived
Series lustre: backport OpenSFS changes from March XX, 2023

Commit Message

James Simmons April 9, 2023, 12:12 p.m. UTC
From: Aurelien Degremont <degremoa@amazon.com>

Clarify the error message printed when the deadline for
AT (adaptive timeout) early replies has already passed.
The old message said the server was CPU-bound, which is
wrong most of the time: the usual cause is a communication
failure delaying RPC traffic. This could confuse people,
who would look at CPU resource consumption when network
traffic is more likely the cause.

Also replace cryptic keywords that make sense only to the
feature developer with terms that admins can understand.

WC-bug-id: https://jira.whamcloud.com/browse/LU-930
Lustre-commit: 9ce04000fba07706c ("LU-930 ptlrpc: clarify AT error message")
Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49548
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/ptlrpc/service.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)
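
For anyone triaging the new warning, the following is a minimal userspace sketch of how the single console line is assembled. It is not the kernel code: the struct at_warn_sample, the helper format_at_warning() and all sample values are hypothetical, snprintf() stands in for LCONSOLE_WARN(), and the trailing newline is dropped. It also assumes that "first" is a signed number of seconds that is zero or negative once the earliest deadline has passed, which is why the patch prints -first as the lateness.

```c
#include <stdio.h>

/* Hypothetical snapshot of the values the patched warning prints. */
struct at_warn_sample {
	const char *srv_name;	/* stands in for svcpt->scp_service->srv_name */
	int first;		/* assumed: seconds to earliest deadline, <= 0 when late */
	int counter;		/* reported as "missed early replies" */
	int nreqs_incoming;	/* reported as "reqs waiting" */
	int nreqs_active;	/* reported as "active" */
	int at_estimate;	/* stands in for at_get(&svcpt->scp_at_estimate) */
	long long delay_ms;	/* reported as "delay" in milliseconds */
};

/*
 * Build the message using the format string from the patch, minus the
 * trailing newline. Returns what snprintf() returns.
 */
static int format_at_warning(char *buf, size_t len,
			     const struct at_warn_sample *s)
{
	return snprintf(buf, len,
			"'%s' is processing requests too slowly, client may timeout. Late by %ds, missed %d early replies (reqs waiting=%d active=%d, at_estimate=%d, delay=%lldms)",
			s->srv_name, -s->first, s->counter,
			s->nreqs_incoming, s->nreqs_active,
			s->at_estimate, s->delay_ms);
}
```

With made-up values such as first = -3 and counter = 5 for a service named "ost_io", the line reads "Late by 3s, missed 5 early replies", so the lateness and the queue depths land in one message instead of the previous LCONSOLE_WARN/CWARN pair.
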
Patch

diff --git a/fs/lustre/ptlrpc/service.c b/fs/lustre/ptlrpc/service.c
index aaf7529..bf76272 100644
--- a/fs/lustre/ptlrpc/service.c
+++ b/fs/lustre/ptlrpc/service.c
@@ -1303,12 +1303,11 @@  static void ptlrpc_at_check_timed(struct ptlrpc_service_part *svcpt)
 		 * We're already past request deadlines before we even get a
 		 * chance to send early replies
 		 */
-		LCONSOLE_WARN("%s: This server is not able to keep up with request traffic (cpu-bound).\n",
-			      svcpt->scp_service->srv_name);
-		CWARN("earlyQ=%d reqQ=%d recA=%d, svcEst=%d, delay=%lldms\n",
-		      counter, svcpt->scp_nreqs_incoming,
-		      svcpt->scp_nreqs_active,
-		      at_get(&svcpt->scp_at_estimate), delay_ms);
+		LCONSOLE_WARN("'%s' is processing requests too slowly, client may timeout. Late by %ds, missed %d early replies (reqs waiting=%d active=%d, at_estimate=%d, delay=%lldms)\n",
+			      svcpt->scp_service->srv_name, -first, counter,
+			      svcpt->scp_nreqs_incoming,
+			      svcpt->scp_nreqs_active,
+			      at_get(&svcpt->scp_at_estimate), delay_ms);
 	}
 
 	/*