From patchwork Fri Jul 8 12:47:08 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Seth Forshee
X-Patchwork-Id: 9220805
Date: Fri, 8 Jul 2016 07:47:08 -0500
From: Seth Forshee
To: Michal Hocko
Cc: Jeff Layton, Trond Myklebust, Anna Schumaker,
 linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
 linux-kernel@vger.kernel.org, Tycho Andersen
Subject: Re: Hang due to nfs letting tasks freeze with locked inodes
Message-ID: <20160708124708.GA16921@ubuntu-hedt>
References: <20160706174655.GD45215@ubuntu-hedt>
 <1467842838.2908.45.camel@redhat.com>
 <20160708122224.GA20200@dhcp22.suse.cz>
In-Reply-To: <20160708122224.GA20200@dhcp22.suse.cz>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Fri, Jul 08, 2016 at 02:22:24PM +0200, Michal Hocko wrote:
> On Wed 06-07-16 18:07:18, Jeff Layton wrote:
> > On Wed, 2016-07-06 at 12:46 -0500, Seth Forshee wrote:
> > > We're seeing a hang when freezing a container with an nfs bind mount while
> > > running iozone. Two iozone processes were hung with this stack trace.
> > >
> > >  [] schedule+0x35/0x80
> > >  [] schedule_preempt_disabled+0xe/0x10
> > >  [] __mutex_lock_slowpath+0xb9/0x130
> > >  [] mutex_lock+0x1f/0x30
> > >  [] do_unlinkat+0x12b/0x2d0
> > >  [] SyS_unlink+0x16/0x20
> > >  [] entry_SYSCALL_64_fastpath+0x16/0x71
> > >
> > > This seems to be due to another iozone thread frozen during unlink with
> > > this stack trace:
> > >
> > >  [] __refrigerator+0x7a/0x140
> > >  [] nfs4_handle_exception+0x118/0x130 [nfsv4]
> > >  [] nfs4_proc_remove+0x7d/0xf0 [nfsv4]
> > >  [] nfs_unlink+0x149/0x350 [nfs]
> > >  [] vfs_unlink+0xf1/0x1a0
> > >  [] do_unlinkat+0x279/0x2d0
> > >  [] SyS_unlink+0x16/0x20
> > >  [] entry_SYSCALL_64_fastpath+0x16/0x71
> > >
> > > Since nfs is allowing the thread to be frozen with the inode locked it's
> > > preventing other threads trying to lock the same inode from freezing. It
> > > seems like a bad idea for nfs to be doing this.
> > >
> >
> > Yeah, known problem. Not a simple one to fix though.
>
> Apart from alternative Dave was mentioning in other email, what is the
> point to use freezable wait from this path in the first place?
>
> nfs4_handle_exception does nfs4_wait_clnt_recover from the same path and
> that does wait_on_bit_action with TASK_KILLABLE so we are waiting in two
> different modes from the same path AFAICS. There do not seem to be other
> callers of nfs4_delay outside of nfs4_handle_exception. Sounds like
> something is not quite right here to me.
> If the nfs4_delay did regular
> wait then the freezing would fail as well but at least it would be clear
> who is the culprit rather than having an indirect dependency.

It turns out there are more paths than this one doing a freezable wait,
and they're all also killable. This leads me to a slightly different
question than yours: why can nfs give up waiting in the case of a signal
but not when the task is frozen?

I know the changes below aren't "correct," but I've been experimenting
with them anyway to see what would happen. So far things seem to be
fine, and the deadlock is gone. The diff should also give you an idea of
all the places I found using a freezable wait.

Seth

---
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index f714b98..62dbe59 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -77,8 +77,8 @@ nfs_fattr_to_ino_t(struct nfs_fattr *fattr)
  */
 int nfs_wait_bit_killable(struct wait_bit_key *key, int mode)
 {
-	freezable_schedule_unsafe();
-	if (signal_pending_state(mode, current))
+	schedule();
+	if (signal_pending_state(mode, current) || freezing(current))
 		return -ERESTARTSYS;
 	return 0;
 }
diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
index cb28cce..2315183 100644
--- a/fs/nfs/nfs3proc.c
+++ b/fs/nfs/nfs3proc.c
@@ -35,9 +35,9 @@ nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags)
 		res = rpc_call_sync(clnt, msg, flags);
 		if (res != -EJUKEBOX)
 			break;
-		freezable_schedule_timeout_killable_unsafe(NFS_JUKEBOX_RETRY_TIME);
+		schedule_timeout_killable(NFS_JUKEBOX_RETRY_TIME);
 		res = -ERESTARTSYS;
-	} while (!fatal_signal_pending(current));
+	} while (!fatal_signal_pending(current) && !freezing(current));
 	return res;
 }
 
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 98a4415..0dad2fb 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -334,9 +334,8 @@ static int nfs4_delay(struct rpc_clnt *clnt, long *timeout)
 
 	might_sleep();
 
-	freezable_schedule_timeout_killable_unsafe(
-			nfs4_update_delay(timeout));
-	if (fatal_signal_pending(current))
+	schedule_timeout_killable(nfs4_update_delay(timeout));
+	if (fatal_signal_pending(current) || freezing(current))
 		res = -ERESTARTSYS;
 	return res;
 }
@@ -5447,7 +5446,7 @@ int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4
 static unsigned long
 nfs4_set_lock_task_retry(unsigned long timeout)
 {
-	freezable_schedule_timeout_killable_unsafe(timeout);
+	schedule_timeout_killable(timeout);
 	timeout <<= 1;
 	if (timeout > NFS4_LOCK_MAXTIMEOUT)
 		return NFS4_LOCK_MAXTIMEOUT;
@@ -6148,7 +6147,7 @@ nfs4_proc_lock(struct file *filp, int cmd, struct file_lock *request)
 			break;
 		timeout = nfs4_set_lock_task_retry(timeout);
 		status = -ERESTARTSYS;
-		if (signalled())
+		if (signalled() || freezing(current))
 			break;
 	} while(status < 0);
 	return status;
diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index 73ad57a..0218dc2 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -252,8 +252,8 @@ EXPORT_SYMBOL_GPL(rpc_destroy_wait_queue);
 
 static int rpc_wait_bit_killable(struct wait_bit_key *key, int mode)
 {
-	freezable_schedule_unsafe();
-	if (signal_pending_state(mode, current))
+	schedule();
+	if (signal_pending_state(mode, current) || freezing(current))
 		return -ERESTARTSYS;
 	return 0;
 }
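
For anyone who wants to poke at the control flow without a kernel build,
here is a rough userspace model of the retry loop as the nfs3_rpc_wrapper
hunk reshapes it: sleep plainly, then give up with -ERESTARTSYS if either
a fatal signal or a freeze request is pending. This is only a sketch under
stated assumptions, not kernel code. struct task_model, struct
server_model, model_sleep(), model_rpc_call(), model_rpc_wrapper() and the
MODEL_* constants are all made up for illustration; the two flags merely
stand in for fatal_signal_pending(current) and freezing(current).

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative error values for this model only. */
#define MODEL_ERESTARTSYS 512
#define MODEL_EJUKEBOX    528

/* Hypothetical stand-in for the state the kernel checks on "current". */
struct task_model {
	bool fatal_signal;    /* models fatal_signal_pending(current) */
	bool freezing;        /* models freezing(current) */
	int  freeze_on_sleep; /* 1-based sleep during which a freeze request
	                       * arrives; 0 means never */
	int  sleeps;          /* number of sleeps taken so far */
};

/* Hypothetical server that answers "try again later" a fixed number of times. */
struct server_model {
	int busy_replies;
};

/* Models the plain (non-freezable) sleep; a freeze request may arrive here. */
static void model_sleep(struct task_model *task)
{
	task->sleeps++;
	if (task->freeze_on_sleep && task->sleeps == task->freeze_on_sleep)
		task->freezing = true;
}

/* Models one synchronous RPC call against the busy server. */
static int model_rpc_call(struct server_model *srv)
{
	if (srv->busy_replies > 0) {
		srv->busy_replies--;
		return -MODEL_EJUKEBOX;
	}
	return 0;
}

/*
 * Same shape as the patched loop: retry while the server is busy, but
 * stop retrying once a fatal signal or a freeze request is pending, so
 * the caller can unwind instead of sleeping with locks held.
 */
static int model_rpc_wrapper(struct task_model *task, struct server_model *srv)
{
	int res;

	assert(task && srv);
	do {
		res = model_rpc_call(srv);
		if (res != -MODEL_EJUKEBOX)
			break;
		model_sleep(task);
		res = -MODEL_ERESTARTSYS;
	} while (!task->fatal_signal && !task->freezing);
	return res;
}
```

With no freeze request the model keeps retrying until the server answers;
with a freeze request arriving during the first sleep it returns
-MODEL_ERESTARTSYS after a single attempt. Whether returning to the caller
like this is actually safe for every real call site is exactly the open
question in this thread; the model says nothing about that.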