From patchwork Mon Jul 9 15:45:47 2012
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 1174021
From: Chuck Lever
Subject: [PATCH 14/14] NFS: Slow down state manager after an unhandled error
To: trond.myklebust@netapp.com
Cc: linux-nfs@vger.kernel.org
Date: Mon, 09 Jul 2012 11:45:47 -0400
Message-ID: <20120709154546.1604.93273.stgit@degas.1015granger.net>
In-Reply-To: <20120709153355.1604.14102.stgit@degas.1015granger.net>
References: <20120709153355.1604.14102.stgit@degas.1015granger.net>
User-Agent: StGIT/0.14.3

If the state manager thread cannot fully recover from some situation,
it wakes up its waiters, which kick off a new state manager thread.
Quite often the fresh invocation of the state manager fares no better
than the previous one.

The result is a livelock: the client dumps thousands of NFS requests
per second onto the network in a vain attempt to recover.  Not very
friendly.

To mitigate this, add a one-second delay in the state manager after an
unhandled error, so that the client sends just a few requests every
second in this case.
---

 fs/nfs/nfs4state.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 3a8563c..a5844e1 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2151,6 +2151,7 @@ static void nfs4_state_manager(struct nfs_client *clp)
 out_error:
 	pr_warn_ratelimited("NFS: state manager failed on NFSv4 server %s"
 			" with error %d\n", clp->cl_hostname, -status);
+	ssleep(1);
 	nfs4_end_drain_session(clp);
 	nfs4_clear_state_manager_bit(clp);
 }
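
The effect of the added delay can be illustrated with a small userspace
sketch. This is not kernel code and not part of the patch: the function
count_recovery_attempts() and its parameters are hypothetical, and time is
simulated with a counter rather than real sleeping. It only assumes that
every recovery pass fails and is immediately retried, which is the livelock
case the description talks about.

```c
/* Userspace illustration only; names below are hypothetical. */
#include <assert.h>

/*
 * Count how many failed recovery passes start within window_ms when each
 * pass costs attempt_cost_ms and is followed by delay_ms of sleeping
 * (delay_ms == 1000 plays the role of ssleep(1) in the patch).
 */
static unsigned long count_recovery_attempts(unsigned long window_ms,
					     unsigned long attempt_cost_ms,
					     unsigned long delay_ms)
{
	unsigned long now_ms = 0;
	unsigned long attempts = 0;

	/* A zero-cost pass with no delay would never advance the clock. */
	assert(attempt_cost_ms > 0);

	while (now_ms < window_ms) {
		now_ms += attempt_cost_ms;	/* one failed recovery pass */
		attempts++;
		now_ms += delay_ms;		/* back off before the retry */
	}
	return attempts;
}
```

Under these assumptions, a pass that takes 1 ms and retries immediately
starts 1000 times in a one-second window, while the same pass followed by a
1000 ms delay starts once. The real request rate depends on how many RPCs
each recovery pass issues, which this sketch does not model.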