From patchwork Wed Jul 19 04:05:01 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 9850457 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 4B30E60392 for ; Wed, 19 Jul 2017 04:05:20 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 39AED26E1A for ; Wed, 19 Jul 2017 04:05:20 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2E5B728433; Wed, 19 Jul 2017 04:05:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_TVD_MIME_EPI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A7B8D26E1A for ; Wed, 19 Jul 2017 04:05:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751179AbdGSEFS (ORCPT ); Wed, 19 Jul 2017 00:05:18 -0400 Received: from mx2.suse.de ([195.135.220.15]:42303 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1750822AbdGSEFR (ORCPT ); Wed, 19 Jul 2017 00:05:17 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de X-Amavis-Alert: BAD HEADER SECTION, Duplicate header field: "Cc" Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 04A94AE16; Wed, 19 Jul 2017 04:05:15 +0000 (UTC) From: NeilBrown To: Trond Myklebust , Anna Schumaker Date: Wed, 19 Jul 2017 14:05:01 +1000 Cc: "David S. Miller" , Eric Dumazet Cc: linux-nfs@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] net/sunrpc/xprt_sock: fix regression in connection error reporting. Message-ID: <87r2xdxeaa.fsf@notabene.neil.brown.name> MIME-Version: 1.0 Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Commit 3d4762639dd3 ("tcp: remove poll() flakes when receiving RST") in v4.12 changed the order in which ->sk_state_change() and ->sk_error_report() are called when a socket is shut down - sk_state_change() is now called first. This causes xs_tcp_state_change() -> xs_sock_mark_closed() -> xprt_disconnect_done() to wake all pending tasked with -EAGAIN. When the ->sk_error_report() callback arrives, it is too late to pass the error on, and it is lost. As easy way to demonstrate the problem caused is to try to start rpc.nfsd while rcpbind isn't running. nfsd will attempt a tcp connection to rpcbind. A ECONNREFUSED error is returned, but sunrpc code loses the error and keeps retrying. If it saw the ECONNREFUSED, it would abort. To fix this, handle the sk->sk_err in the TCP_CLOSE branch of xs_tcp_state_change(). Fixes: 3d4762639dd3 ("tcp: remove poll() flakes when receiving RST") Cc: stable@vger.kernel.org (v4.12) Signed-off-by: NeilBrown --- net/sunrpc/xprtsock.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index d5b54c020dec..4f154d388748 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -1624,6 +1624,8 @@ static void xs_tcp_state_change(struct sock *sk) if (test_and_clear_bit(XPRT_SOCK_CONNECTING, &transport->sock_state)) xprt_clear_connecting(xprt); + if (sk->sk_err) + xprt_wake_pending_tasks(xprt, -sk->sk_err); xs_sock_mark_closed(xprt); } out: