From patchwork Sat Feb 10 01:06:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thiago Becker X-Patchwork-Id: 10210127 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 1E8EB60247 for ; Sat, 10 Feb 2018 01:07:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 12211299FA for ; Sat, 10 Feb 2018 01:07:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 034F729A02; Sat, 10 Feb 2018 01:07:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_HI, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A075F299FA for ; Sat, 10 Feb 2018 01:07:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753251AbeBJBHI (ORCPT ); Fri, 9 Feb 2018 20:07:08 -0500 Received: from mail-qt0-f194.google.com ([209.85.216.194]:42693 "EHLO mail-qt0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753142AbeBJBHG (ORCPT ); Fri, 9 Feb 2018 20:07:06 -0500 Received: by mail-qt0-f194.google.com with SMTP id i8so12810751qtj.9; Fri, 09 Feb 2018 17:07:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=Y1OE+4j/FTkrwNqOzND0SnfPXecQBfpj+U1GFyUr1mA=; b=ssgctcPmsmUdieVlETW357YKa0vVSXcOArM9G3R5Ua8w+ayRd4EzHFe+8hn13pfSZj cmjh8bl1Cx1p0pBF63RtKUrNp18dF/nezAJOFYY1CTiTPhXLAMrnHTb3Ggs7HMR/ITni /nDqEpu2YMtqTX78IMDf7nDaxbvdnUyVaefyfm2RZTOtVk89nghDgc31fIXW5jIlgMDP j5w7EqLj3OhKWI2S/RZAuQV2c6ujywvs78xUT/8cvBSkWtvqmLkN37NKcP0GEFil3BGe nNsuqA1rU/heK/QCSydugjyQrpWC6Lc+2PIYGhMBChnPQQAK72WFE9/ApKkiSIP5+pDF XV9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=Y1OE+4j/FTkrwNqOzND0SnfPXecQBfpj+U1GFyUr1mA=; b=oSClYBM1KEaOFDFgHwpFgKRsLkSSNuCpXKnJQiaPHOeDfN8KDDmwNDeKzADjvKVK2H qHd2zVXw9vRfIkKpZFAEup1m5m+AXGEP5U8eG8SywJ6RNfO2E1BjKek7/a1Tq8utrccp kQhWrGkuOizTD4PmF42vZbqEmZDPOqKXYw6rRSZ/FF8vzsc+ZkZdRbM+PAPNeIjpyrvS zAKiELWfQwDsh1wlz6iRGAd8BOoyUwD8DTN+YspOSVqNauoMRG6SyZhZAmkOYf3PfxJM SOq2peTyd2ZP8a8QcvihVDvqgwXEmo3HPv8IH7LHM++ycrDnJawWAWwsPijmmn3rxwCl NAQA== X-Gm-Message-State: APf1xPA0GBVIokjWqbqa7Y7bQdN9yIhAobRSfR6gY0k9gROB40hchk8h /dYPHi93UEeh/z4wbA+b604rEMyp7g== X-Google-Smtp-Source: AH8x224Ya5+wK8iQRNjQ0e5l7LxoWT3Z+oDuy98jS73YABWwtP4t/n9ymjT469pCacLGfd1Pe2rCmA== X-Received: by 10.237.35.38 with SMTP id h35mr7585417qtc.300.1518224825858; Fri, 09 Feb 2018 17:07:05 -0800 (PST) Received: from rhat.com ([201.37.68.249]) by smtp.googlemail.com with ESMTPSA id e139sm3327576qka.27.2018.02.09.17.07.02 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 09 Feb 2018 17:07:05 -0800 (PST) From: Thiago Rafael Becker To: linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org Cc: trond.myklebust@primarydata.com, anna.schumaker@netapp.com, bfields@fieldses.org, jlayton@kernel.org, davem@davemloft.net, Thiago Rafael Becker Subject: [PATCH] sunrpc: Add task's xid to 'not responding' messages on call_timeout Date: Fri, 9 Feb 2018 23:06:28 -0200 Message-Id: <20180210010628.19792-1-thiago.becker@gmail.com> X-Mailer: git-send-email 2.14.3 Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When investigating reasons for nfs failures, packet dumps arei eventually used. Finding the rpc that generated the failure is done by comparing all sent rpc calls and all received rpc replies for those which are unanswered, which is prone to errors like - Slow server responses - Incomplete and uncaptured packets in the packet dump - The heuristics used to inspect packets failing to interpret one This patch adds the xid of rpc_tasks to the 'not responding' messages in call_timeout to make these analysis more precise. Signed-off-by: Thiago Rafael Becker --- net/sunrpc/clnt.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index e2a4184f3c5d..83c8aca951f4 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -2214,9 +2214,10 @@ call_timeout(struct rpc_task *task) } if (RPC_IS_SOFT(task)) { if (clnt->cl_chatty) { - printk(KERN_NOTICE "%s: server %s not responding, timed out\n", + printk(KERN_NOTICE "%s: server %s not responding, timed out (xid: %x)\n", clnt->cl_program->name, - task->tk_xprt->servername); + task->tk_xprt->servername, + be32_to_cpu(task->tk_rqstp->rq_xid)); } if (task->tk_flags & RPC_TASK_TIMEOUT) rpc_exit(task, -ETIMEDOUT); @@ -2228,9 +2229,10 @@ call_timeout(struct rpc_task *task) if (!(task->tk_flags & RPC_CALL_MAJORSEEN)) { task->tk_flags |= RPC_CALL_MAJORSEEN; if (clnt->cl_chatty) { - printk(KERN_NOTICE "%s: server %s not responding, still trying\n", + printk(KERN_NOTICE "%s: server %s not responding, still trying (xid: %x)\n", clnt->cl_program->name, - task->tk_xprt->servername); + task->tk_xprt->servername, + be32_to_cpu(task->tk_rqstp->rq_xid)); } } rpc_force_rebind(clnt);