NFSroot hangs with bad unlock balance in Linux next

Message ID 20160508141629.GF2694@ZenIV.linux.org.uk (mailing list archive)
State New, archived

Commit Message

Al Viro May 8, 2016, 2:16 p.m. UTC
On Thu, May 05, 2016 at 03:03:44PM -0700, Tony Lindgren wrote:
> Hi,
> 
> Looks like Linux next with NFSroot hangs for me at some point while booting
> into init. Then after a while it produces "BUG: bad unlock balance
> detected!".
> 
> This happens at least with omap5-uevm and igepv5. Not sure yet if it
> also happens on other boards; the ones I'm seeing it on both have a
> USB Ethernet controller. They usually hang once the system goes idle,
> some tens of seconds into booting.
> 
> I tried to bisect it down with no luck. I do have the following
> trace; does that provide any clues?

> kworker/0:2/112 is trying to release lock (&nfsi->rmdir_sem) at:
> [<c03a7f98>] nfs_async_unlink_release+0x20/0x68
> but there are no more locks to release!

Very strange.  We grab that rwsem at the entry into nfs_call_unlink()
and then either release it there and return, or call nfs_do_call_unlink(),
which arranges for an eventual call of nfs_async_unlink_release() (via
->rpc_release); nfs_async_unlink_release() releases the rwsem.  Nobody else
releases it (on the read side, that is).
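
For reference, that read-side lifecycle looks roughly like this (a
simplified sketch of the fs/nfs/unlink.c flow, with error handling and
unrelated details omitted; the predicate name is made up):

static int nfs_call_unlink(struct dentry *dentry, struct nfs_unlinkdata *data)
{
	struct inode *dir = d_inode(dentry->d_parent);

	down_read(&NFS_I(dir)->rmdir_sem);		/* taken here... */
	if (!can_sillyrename(dir)) {			/* hypothetical predicate */
		up_read(&NFS_I(dir)->rmdir_sem);	/* ...dropped on early return */
		return 0;
	}
	nfs_do_call_unlink(data);	/* queue async RPC; ->rpc_release drops it */
	return 1;
}

static void nfs_async_unlink_release(void *calldata)
{
	struct nfs_unlinkdata *data = calldata;
	struct dentry *dentry = data->dentry;

	/* ...or dropped here, from a workqueue - i.e. possibly in a
	 * different task than the one that did down_read() */
	up_read(&NFS_I(d_inode(dentry->d_parent))->rmdir_sem);
}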

The only kinda-sorta possibility I see here is that the inode we are
unlocking in that nfs_async_unlink_release() is not the one we'd locked
in nfs_call_unlink() that has led to it.  That really shouldn't happen,
though...  Just to verify whether that's what we are hitting, could you
try to reproduce that thing with the patch below on top of -next and see
if it triggers any of those WARN_ONs?


Comments

Al Viro May 9, 2016, 7:32 a.m. UTC | #1
On Sun, May 08, 2016 at 03:16:29PM +0100, Al Viro wrote:

> Very strange.  We grab that rwsem at the entry into nfs_call_unlink()
> and then either release it there and return, or call nfs_do_call_unlink(),
> which arranges for an eventual call of nfs_async_unlink_release() (via
> ->rpc_release); nfs_async_unlink_release() releases the rwsem.  Nobody else
> releases it (on the read side, that is).
> 
> The only kinda-sorta possibility I see here is that the inode we are
> unlocking in that nfs_async_unlink_release() is not the one we'd locked
> in nfs_call_unlink() that has led to it.  That really shouldn't happen,
> though...  Just to verify whether that's what we are hitting, could you
> try to reproduce that thing with the patch below on top of -next and see
> if it triggers any of those WARN_ONs?

D'oh...  Lockdep warnings are easy to trigger (and, AFAICS, bogus).
up_read/down_read in fs/nfs/unlink.c should be replaced with
up_read_non_owner/down_read_non_owner, lest lockdep get confused.
The hangs are a different matter - I've no idea what's triggering those.
I've seen something similar on that -next, but not on work.lookups.
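
Roughly (sketch only - the point being that rmdir_sem is taken in one
task but released from a workqueue, which the owner-tracking lockdep
annotations of plain down_read()/up_read() don't allow):

	/* in nfs_call_unlink() */
	down_read_non_owner(&NFS_I(dir)->rmdir_sem);
	...
	/* in nfs_async_unlink_release(), possibly from another task */
	up_read_non_owner(&NFS_I(d_inode(dentry->d_parent))->rmdir_sem);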

The joy of bisecting -next...  <a couple of hours later>
9317bb69824ec8d078b0b786b6971aedb0af3d4f is the first bad commit
commit 9317bb69824ec8d078b0b786b6971aedb0af3d4f
Author: Eric Dumazet <edumazet@google.com>
Date:   Mon Apr 25 10:39:32 2016 -0700

    net: SOCKWQ_ASYNC_NOSPACE optimizations

Reverting the changes to sk_set_bit/sk_clear_bit gets rid of the hangs.  A plain
revert gives a conflict, since there had been an additional change in
"net: SOCKWQ_ASYNC_WAITDATA optimizations"; removing both fixed the hangs.

Note that the hangs appear without any fs/nfs/unlink.c modifications in
place.  When the hang happens it affects NFS traffic only; an ssh session
still works fine until it steps on a filesystem operation on NFS (i.e. you
can use shell builtins, access procfs, etc.).
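
For context, the sk_set_bit() change in that commit is roughly of this
shape (paraphrased, not the exact code):

static inline void sk_set_bit(int nr, struct sock *sk)
{
	/* skip the atomic op when nobody has installed a fasync handler */
	if (nr == SOCKWQ_ASYNC_NOSPACE && !sock_flag(sk, SOCK_FASYNC))
		return;
	set_bit(nr, &sk->sk_wq_raw->flags);
}

A plausible failure mode for NFS: sunrpc's kernel sockets never have
fasync entries, so SOCK_FASYNC stays clear, the NOSPACE bit is silently
dropped, and the write-space callback that test-and-clears it never kicks
the transport - so RPC tasks waiting for write space hang.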
Eric Dumazet May 9, 2016, 2:14 p.m. UTC | #2
On Mon, May 9, 2016 at 12:32 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Sun, May 08, 2016 at 03:16:29PM +0100, Al Viro wrote:
>
>> Very strange.  We grab that rwsem at the entry into nfs_call_unlink()
>> and then either release it there and return, or call nfs_do_call_unlink(),
>> which arranges for an eventual call of nfs_async_unlink_release() (via
>> ->rpc_release); nfs_async_unlink_release() releases the rwsem.  Nobody else
>> releases it (on the read side, that is).
>>
>> The only kinda-sorta possibility I see here is that the inode we are
>> unlocking in that nfs_async_unlink_release() is not the one we'd locked
>> in nfs_call_unlink() that has led to it.  That really shouldn't happen,
>> though...  Just to verify whether that's what we are hitting, could you
>> try to reproduce that thing with the patch below on top of -next and see
>> if it triggers any of those WARN_ONs?
>
> D'oh...  Lockdep warnings are easy to trigger (and, AFAICS, bogus).
> up_read/down_read in fs/nfs/unlink.c should be replaced with
> up_read_non_owner/down_read_non_owner, lest lockdep get confused.
> The hangs are a different matter - I've no idea what's triggering those.
> I've seen something similar on that -next, but not on work.lookups.
>
> The joy of bisecting -next...  <a couple of hours later>
> 9317bb69824ec8d078b0b786b6971aedb0af3d4f is the first bad commit
> commit 9317bb69824ec8d078b0b786b6971aedb0af3d4f
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Mon Apr 25 10:39:32 2016 -0700
>
>     net: SOCKWQ_ASYNC_NOSPACE optimizations
>
> Reverting the changes to sk_set_bit/sk_clear_bit gets rid of the hangs.  A plain
> revert gives a conflict, since there had been an additional change in
> "net: SOCKWQ_ASYNC_WAITDATA optimizations"; removing both fixed the hangs.
>
> Note that the hangs appear without any fs/nfs/unlink.c modifications in
> place.  When the hang happens it affects NFS traffic only; an ssh session
> still works fine until it steps on a filesystem operation on NFS (i.e. you
> can use shell builtins, access procfs, etc.).

Yeah, the issue was reported last week (
http://www.spinics.net/lists/netdev/msg375777.html ),
and I could not convince myself to add a new sock flag, like
SOCK_FASYNC_STICKY.

(Just in case NFS would ever call sock_fasync() with an empty
fasync_list, in which case SOCK_FASYNC would be cleared again.)
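
For reference, sock_fasync() keeps SOCK_FASYNC in sync with whether the
fasync list is populated - roughly this (a sketch of the relevant
net/socket.c logic, not the exact code):

	fasync_helper(fd, filp, on, &wq->fasync_list);

	if (!wq->fasync_list)
		sock_reset_flag(sk, SOCK_FASYNC);
	else
		sock_set_flag(sk, SOCK_FASYNC);

So a flag set on an NFS transport socket would not be "sticky" across
such a call, which is where the SOCK_FASYNC_STICKY idea came from.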
Tony Lindgren May 9, 2016, 3:12 p.m. UTC | #3
* Eric Dumazet <edumazet@google.com> [160509 07:16]:
> On Mon, May 9, 2016 at 12:32 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > On Sun, May 08, 2016 at 03:16:29PM +0100, Al Viro wrote:
> >
> >> Very strange.  We grab that rwsem at the entry into nfs_call_unlink()
> >> and then either release it there and return, or call nfs_do_call_unlink(),
> >> which arranges for an eventual call of nfs_async_unlink_release() (via
> >> ->rpc_release); nfs_async_unlink_release() releases the rwsem.  Nobody else
> >> releases it (on the read side, that is).
> >>
> >> The only kinda-sorta possibility I see here is that the inode we are
> >> unlocking in that nfs_async_unlink_release() is not the one we'd locked
> >> in nfs_call_unlink() that has led to it.  That really shouldn't happen,
> >> though...  Just to verify whether that's what we are hitting, could you
> >> try to reproduce that thing with the patch below on top of -next and see
> >> if it triggers any of those WARN_ONs?

Thanks, no warnings with that patch, though.

> > D'oh...  Lockdep warnings are easy to trigger (and, AFAICS, bogus).
> > up_read/down_read in fs/nfs/unlink.c should be replaced with
> > up_read_non_owner/down_read_non_owner, lest lockdep get confused.
> > The hangs are a different matter - I've no idea what's triggering those.
> > I've seen something similar on that -next, but not on work.lookups.
> >
> > The joy of bisecting -next...  <a couple of hours later>
> > 9317bb69824ec8d078b0b786b6971aedb0af3d4f is the first bad commit
> > commit 9317bb69824ec8d078b0b786b6971aedb0af3d4f
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Mon Apr 25 10:39:32 2016 -0700
> >
> >     net: SOCKWQ_ASYNC_NOSPACE optimizations
> >
> > Reverting the changes to sk_set_bit/sk_clear_bit gets rid of the hangs.  A plain
> > revert gives a conflict, since there had been an additional change in
> > "net: SOCKWQ_ASYNC_WAITDATA optimizations"; removing both fixed the hangs.
> >
> > Note that the hangs appear without any fs/nfs/unlink.c modifications in
> > place.  When the hang happens it affects NFS traffic only; an ssh session
> > still works fine until it steps on a filesystem operation on NFS (i.e. you
> > can use shell builtins, access procfs, etc.).
> 
> Yeah, the issue was reported last week (
> http://www.spinics.net/lists/netdev/msg375777.html ),
> and I could not convince myself to add a new sock flag, like
> SOCK_FASYNC_STICKY.
> 
> (Just in case NFS would ever call sock_fasync() with an empty
> fasync_list, in which case SOCK_FASYNC would be cleared again.)

Yeah, applying the test patch from the URL above makes things work
for me again.

Regards,

Tony
Tony Lindgren May 9, 2016, 3:21 p.m. UTC | #4
* Tony Lindgren <tony@atomide.com> [160509 08:15]:
> * Eric Dumazet <edumazet@google.com> [160509 07:16]:
> > On Mon, May 9, 2016 at 12:32 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > > On Sun, May 08, 2016 at 03:16:29PM +0100, Al Viro wrote:
> > >
> > >> Very strange.  We grab that rwsem at the entry into nfs_call_unlink()
> > >> and then either release it there and return, or call nfs_do_call_unlink(),
> > >> which arranges for an eventual call of nfs_async_unlink_release() (via
> > >> ->rpc_release); nfs_async_unlink_release() releases the rwsem.  Nobody else
> > >> releases it (on the read side, that is).
> > >>
> > >> The only kinda-sorta possibility I see here is that the inode we are
> > >> unlocking in that nfs_async_unlink_release() is not the one we'd locked
> > >> in nfs_call_unlink() that has led to it.  That really shouldn't happen,
> > >> though...  Just to verify whether that's what we are hitting, could you
> > >> try to reproduce that thing with the patch below on top of -next and see
> > >> if it triggers any of those WARN_ONs?
> 
> Thanks, no warnings with that patch, though.
> 
> > > D'oh...  Lockdep warnings are easy to trigger (and, AFAICS, bogus).
> > > up_read/down_read in fs/nfs/unlink.c should be replaced with
> > > up_read_non_owner/down_read_non_owner, lest lockdep get confused.
> > > The hangs are a different matter - I've no idea what's triggering those.
> > > I've seen something similar on that -next, but not on work.lookups.
> > >
> > > The joy of bisecting -next...  <a couple of hours later>
> > > 9317bb69824ec8d078b0b786b6971aedb0af3d4f is the first bad commit
> > > commit 9317bb69824ec8d078b0b786b6971aedb0af3d4f
> > > Author: Eric Dumazet <edumazet@google.com>
> > > Date:   Mon Apr 25 10:39:32 2016 -0700
> > >
> > >     net: SOCKWQ_ASYNC_NOSPACE optimizations
> > >
> > > Reverting the changes to sk_set_bit/sk_clear_bit gets rid of the hangs.  A plain
> > > revert gives a conflict, since there had been an additional change in
> > > "net: SOCKWQ_ASYNC_WAITDATA optimizations"; removing both fixed the hangs.
> > >
> > > Note that the hangs appear without any fs/nfs/unlink.c modifications in
> > > place.  When the hang happens it affects NFS traffic only; an ssh session
> > > still works fine until it steps on a filesystem operation on NFS (i.e. you
> > > can use shell builtins, access procfs, etc.).
> > 
> > Yeah, the issue was reported last week (
> > http://www.spinics.net/lists/netdev/msg375777.html ),
> > and I could not convince myself to add a new sock flag, like
> > SOCK_FASYNC_STICKY.
> > 
> > (Just in case NFS would ever call sock_fasync() with an empty
> > fasync_list, in which case SOCK_FASYNC would be cleared again.)
> 
> Yeah, applying the test patch from the URL above makes things work
> for me again.

Looks like with both patches applied I still get this eventually:

=====================================
[ BUG: bad unlock balance detected! ]
4.6.0-rc7-next-20160509+ #1264 Not tainted
-------------------------------------
kworker/0:1/18 is trying to release lock (&nfsi->rmdir_sem) at:
[<c03a894c>] nfs_async_unlink_release+0x3c/0xc0
but there are no more locks to release!

               other info that might help us debug this:
2 locks held by kworker/0:1/18:
 #0:  ("nfsiod"){.+.+..}, at: [<c015473c>] process_one_work+0x120/0x6bc
 #1:  ((&task->u.tk_work)#2){+.+...}, at: [<c015473c>] process_one_work+0x120/0x6bc

               stack backtrace:
CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 4.6.0-rc7-next-20160509+ #1264
Hardware name: Generic OMAP5 (Flattened Device Tree)
Workqueue: nfsiod rpc_async_release
[<c0110318>] (unwind_backtrace) from [<c010c3bc>] (show_stack+0x10/0x14)
[<c010c3bc>] (show_stack) from [<c0481da8>] (dump_stack+0xb0/0xe4)
[<c0481da8>] (dump_stack) from [<c018d090>] (print_unlock_imbalance_bug+0xb0/0xe0)
[<c018d090>] (print_unlock_imbalance_bug) from [<c0190e9c>] (lock_release+0x2ec/0x4c0)
[<c0190e9c>] (lock_release) from [<c018a4b8>] (up_read+0x18/0x58)
[<c018a4b8>] (up_read) from [<c03a894c>] (nfs_async_unlink_release+0x3c/0xc0)
[<c03a894c>] (nfs_async_unlink_release) from [<c07860e0>] (rpc_free_task+0x24/0x44)
[<c07860e0>] (rpc_free_task) from [<c0154804>] (process_one_work+0x1e8/0x6bc)
[<c0154804>] (process_one_work) from [<c0154e1c>] (worker_thread+0x144/0x4e8)
[<c0154e1c>] (worker_thread) from [<c015acd4>] (kthread+0xdc/0xf8)
[<c015acd4>] (kthread) from [<c01078f0>] (ret_from_fork+0x14/0x24)

After the warning, NFSroot keeps working with Eric's patch.

Regards,

Tony

Patch

diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index d367b06..dbbb4c9 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -64,6 +64,10 @@  static void nfs_async_unlink_release(void *calldata)
 	struct dentry *dentry = data->dentry;
 	struct super_block *sb = dentry->d_sb;
 
+	if (WARN_ON(data->parent != dentry->d_parent) ||
+	    WARN_ON(data->parent_inode != dentry->d_parent->d_inode)) {
+		printk(KERN_ERR "WTF2[%pd4]", dentry);
+	}
 	up_read(&NFS_I(d_inode(dentry->d_parent))->rmdir_sem);
 	d_lookup_done(dentry);
 	nfs_free_unlinkdata(data);
@@ -114,7 +118,8 @@  static void nfs_do_call_unlink(struct nfs_unlinkdata *data)
 
 static int nfs_call_unlink(struct dentry *dentry, struct nfs_unlinkdata *data)
 {
-	struct inode *dir = d_inode(dentry->d_parent);
+	struct dentry *parent = dentry->d_parent;
+	struct inode *dir = d_inode(parent);
 	struct dentry *alias;
 
 	down_read(&NFS_I(dir)->rmdir_sem);
@@ -152,6 +157,12 @@  static int nfs_call_unlink(struct dentry *dentry, struct nfs_unlinkdata *data)
 		return ret;
 	}
 	data->dentry = alias;
+	data->parent = parent;
+	data->parent_inode = dir;
+	if (WARN_ON(parent != alias->d_parent) ||
+	    WARN_ON(dir != parent->d_inode)) {
+		printk(KERN_ERR "WTF1[%pd4]", alias);
+	}
 	nfs_do_call_unlink(data);
 	return 1;
 }
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index ee8491d..b01a7f1 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1471,6 +1471,8 @@  struct nfs_unlinkdata {
 	struct nfs_removeargs args;
 	struct nfs_removeres res;
 	struct dentry *dentry;
+	struct dentry *parent;
+	struct inode *parent_inode;
 	wait_queue_head_t wq;
 	struct rpc_cred	*cred;
 	struct nfs_fattr dir_attr;