diff mbox

[4/4] NFS/SUNRPC: Remove other deadlock-avoidance mechanisms in nfs_release_page()

Message ID 20140916053135.22257.46476.stgit@notabene.brown (mailing list archive)
State New, archived
Headers show

Commit Message

NeilBrown Sept. 16, 2014, 5:31 a.m. UTC
Now that nfs_release_page() doesn't block indefinitely, other deadlock
avoidance mechanisms aren't needed.
 - it doesn't hurt for kswapd to block occasionally.  If it doesn't
   want to block it would clear __GFP_WAIT.  The current_is_kswapd()
   was only added to avoid deadlocks and we have a new approach for
   that.
 - memory allocation in the SUNRPC layer can very rarely try to
   ->releasepage() a page it is trying to handle.  The deadlock
   is removed as nfs_release_page() doesn't block indefinitely.

So we don't need to set PF_FSTRANS for sunrpc network operations any
more.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfs/file.c                   |   16 +++++++---------
 net/sunrpc/sched.c              |    2 --
 net/sunrpc/xprtrdma/transport.c |    2 --
 net/sunrpc/xprtsock.c           |   10 ----------
 4 files changed, 7 insertions(+), 23 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Trond Myklebust Sept. 16, 2014, 10:04 p.m. UTC | #1
Hi Neil,

On Tue, Sep 16, 2014 at 1:31 AM, NeilBrown <neilb@suse.de> wrote:
> Now that nfs_release_page() doesn't block indefinitely, other deadlock
> avoidance mechanisms aren't needed.
>  - it doesn't hurt for kswapd to block occasionally.  If it doesn't
>    want to block it would clear __GFP_WAIT.  The current_is_kswapd()
>    was only added to avoid deadlocks and we have a new approach for
>    that.
>  - memory allocation in the SUNRPC layer can very rarely try to
>    ->releasepage() a page it is trying to handle.  The deadlock
>    is removed as nfs_release_page() doesn't block indefinitely.
>
> So we don't need to set PF_FSTRANS for sunrpc network operations any
> more.

Jeff Layton and I had a little discussion about this earlier today.
The issue that Jeff raised was that these 1 second waits, although
they will eventually complete, can nevertheless have a cumulative
large effect if, say, the reason why we're not making progress is that
we're being called as part of a socket reconnect attempt in
xs_tcp_setup_socket().

In that case, any attempts to call nfs_release_page() on pages that
need to use that socket, will result in a 1 second wait, and no
progress in satisfying the allocation attempt.

Our conclusion was that we still need the PF_FSTRANS in order to deal
with that case, where we need to actually circumvent the new wait in
order to guarantee progress on the task of allocating and connecting
the new socket.

Comments?

Cheers
  Trond
diff mbox

Patch

diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 8d74983417af..5949ca37cd18 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -469,18 +469,16 @@  static int nfs_release_page(struct page *page, gfp_t gfp)
 	dfprintk(PAGECACHE, "NFS: release_page(%p)\n", page);
 
 	/* Always try to initiate a 'commit' if relevant, but only
-	 * wait for it if __GFP_WAIT is set and the calling process is
-	 * allowed to block.  Even then, only wait 1 second.  Waiting
-	 * indefinitely can cause deadlocks when the NFS server is on
-	 * this machine, and there is no particular need to wait
-	 * extensively here.  A short wait has the benefit that
-	 * someone else can worry about the freezer.
+	 * wait for it if __GFP_WAIT is set.  Even then, only wait 1
+	 * second.  Waiting indefinitely can cause deadlocks when the
+	 * NFS server is on this machine, when a new TCP connection is
+	 * needed and in other rare cases.  There is no particular
+	 * need to wait extensively here.  A short wait has the
+	 * benefit that someone else can worry about the freezer.
 	 */
 	if (mapping) {
 		nfs_commit_inode(mapping->host, 0);
-		if ((gfp & __GFP_WAIT) &&
-		    !current_is_kswapd() &&
-		    !(current->flags & PF_FSTRANS))
+		if ((gfp & __GFP_WAIT))
 			wait_on_page_bit_killable_timeout(page, PG_private,
 							  HZ);
 	}
diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index 9358c79fd589..fe3441abdbe5 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -821,9 +821,7 @@  void rpc_execute(struct rpc_task *task)
 
 static void rpc_async_schedule(struct work_struct *work)
 {
-	current->flags |= PF_FSTRANS;
 	__rpc_execute(container_of(work, struct rpc_task, u.tk_work));
-	current->flags &= ~PF_FSTRANS;
 }
 
 /**
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 2faac4940563..6a4615dd0261 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -205,7 +205,6 @@  xprt_rdma_connect_worker(struct work_struct *work)
 	struct rpc_xprt *xprt = &r_xprt->xprt;
 	int rc = 0;
 
-	current->flags |= PF_FSTRANS;
 	xprt_clear_connected(xprt);
 
 	dprintk("RPC:       %s: %sconnect\n", __func__,
@@ -216,7 +215,6 @@  xprt_rdma_connect_worker(struct work_struct *work)
 
 	dprintk("RPC:       %s: exit\n", __func__);
 	xprt_clear_connecting(xprt);
-	current->flags &= ~PF_FSTRANS;
 }
 
 /*
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 43cd89eacfab..4707c0c8568b 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1927,8 +1927,6 @@  static int xs_local_setup_socket(struct sock_xprt *transport)
 	struct socket *sock;
 	int status = -EIO;
 
-	current->flags |= PF_FSTRANS;
-
 	clear_bit(XPRT_CONNECTION_ABORT, &xprt->state);
 	status = __sock_create(xprt->xprt_net, AF_LOCAL,
 					SOCK_STREAM, 0, &sock, 1);
@@ -1968,7 +1966,6 @@  static int xs_local_setup_socket(struct sock_xprt *transport)
 out:
 	xprt_clear_connecting(xprt);
 	xprt_wake_pending_tasks(xprt, status);
-	current->flags &= ~PF_FSTRANS;
 	return status;
 }
 
@@ -2071,8 +2068,6 @@  static void xs_udp_setup_socket(struct work_struct *work)
 	struct socket *sock = transport->sock;
 	int status = -EIO;
 
-	current->flags |= PF_FSTRANS;
-
 	/* Start by resetting any existing state */
 	xs_reset_transport(transport);
 	sock = xs_create_sock(xprt, transport,
@@ -2092,7 +2087,6 @@  static void xs_udp_setup_socket(struct work_struct *work)
 out:
 	xprt_clear_connecting(xprt);
 	xprt_wake_pending_tasks(xprt, status);
-	current->flags &= ~PF_FSTRANS;
 }
 
 /*
@@ -2229,8 +2223,6 @@  static void xs_tcp_setup_socket(struct work_struct *work)
 	struct rpc_xprt *xprt = &transport->xprt;
 	int status = -EIO;
 
-	current->flags |= PF_FSTRANS;
-
 	if (!sock) {
 		clear_bit(XPRT_CONNECTION_ABORT, &xprt->state);
 		sock = xs_create_sock(xprt, transport,
@@ -2276,7 +2268,6 @@  static void xs_tcp_setup_socket(struct work_struct *work)
 	case -EINPROGRESS:
 	case -EALREADY:
 		xprt_clear_connecting(xprt);
-		current->flags &= ~PF_FSTRANS;
 		return;
 	case -EINVAL:
 		/* Happens, for instance, if the user specified a link
@@ -2294,7 +2285,6 @@  out_eagain:
 out:
 	xprt_clear_connecting(xprt);
 	xprt_wake_pending_tasks(xprt, status);
-	current->flags &= ~PF_FSTRANS;
 }
 
 /**