[1/2] NFS: Fix I/O request leakages
diff mbox series

Message ID 20190213215357.19696-1-trond.myklebust@hammerspace.com
State New
Headers show
Series
  • [1/2] NFS: Fix I/O request leakages
Related show

Commit Message

Trond Myklebust Feb. 13, 2019, 9:53 p.m. UTC
When we fail to add the request to the I/O queue, we currently leave it
to the caller to free the failed request. However since some of the
requests that fail are actually created by nfs_pageio_add_request()
itself, and are not passed back the caller, this leads to a leakage
issue, which can again cause page locks to leak.

This commit addresses the leakage by freeing the created requests on
error, using desc->pg_completion_ops->error_cleanup()

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Fixes: a7d42ddb30997 ("nfs: add mirroring support to pgio layer")
Cc: stable@vger.kernel.org # v4.0+
---
 fs/nfs/pagelist.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

Comments

Trond Myklebust Feb. 13, 2019, 10:01 p.m. UTC | #1
Hi Anna,

On Wed, 2019-02-13 at 16:53 -0500, Trond Myklebust wrote:
> When we fail to add the request to the I/O queue, we currently leave
> it
> to the caller to free the failed request. However since some of the
> requests that fail are actually created by nfs_pageio_add_request()
> itself, and are not passed back the caller, this leads to a leakage
> issue, which can again cause page locks to leak.
> 
> This commit addresses the leakage by freeing the created requests on
> error, using desc->pg_completion_ops->error_cleanup()
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> Fixes: a7d42ddb30997 ("nfs: add mirroring support to pgio layer")
> Cc: stable@vger.kernel.org # v4.0+

I believe this patch should fix the page lock leak that I mentioned
yesterday. The issue is that we're creating nfs_page structures as part
of a page group, but then abandoning those structures to their fate. It
means that the calls to nfs_page_group_sync_on_bit(req, PG_UNLOCKPAGE),
nfs_page_group_sync_on_bit(req, PG_WB_END), and even
nfs_page_group_sync_on_bit(req, PG_TEARDOWN) can never succeed, and so
we end up leaking, page locks, page writeback locks, and nfs_page
requests...

Cheers
  Trond
Sasha Levin Feb. 18, 2019, 9:14 p.m. UTC | #2
Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: a7d42ddb3099 nfs: add mirroring support to pgio layer.

The bot has tested the following trees: v4.20.10, v4.19.23, v4.14.101, v4.9.158, v4.4.174.

v4.20.10: Build OK!
v4.19.23: Build OK!
v4.14.101: Build OK!
v4.9.158: Build OK!
v4.4.174: Failed to apply! Possible dependencies:
    c18b96a1b862 ("nfs: clean up rest of reqs when failing to add one")
    d600ad1f2bdb ("NFS41: pop some layoutget errors to application")


How should we proceed with this patch?

--
Thanks,
Sasha

Patch
diff mbox series

diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index e54d899c1848..40b90f187eeb 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -988,6 +988,17 @@  static void nfs_pageio_doio(struct nfs_pageio_descriptor *desc)
 	}
 }
 
+static void
+nfs_pageio_cleanup_request(struct nfs_pageio_descriptor *desc,
+		struct nfs_page *req)
+{
+	LIST_HEAD(head);
+
+	nfs_list_remove_request(req);
+	nfs_list_add_request(req, &head);
+	desc->pg_completion_ops->error_cleanup(&head);
+}
+
 /**
  * nfs_pageio_add_request - Attempt to coalesce a request into a page list.
  * @desc: destination io descriptor
@@ -1025,10 +1036,8 @@  static int __nfs_pageio_add_request(struct nfs_pageio_descriptor *desc,
 			nfs_page_group_unlock(req);
 			desc->pg_moreio = 1;
 			nfs_pageio_doio(desc);
-			if (desc->pg_error < 0)
-				return 0;
-			if (mirror->pg_recoalesce)
-				return 0;
+			if (desc->pg_error < 0 || mirror->pg_recoalesce)
+				goto out_cleanup_subreq;
 			/* retry add_request for this subreq */
 			nfs_page_group_lock(req);
 			continue;
@@ -1061,6 +1070,9 @@  static int __nfs_pageio_add_request(struct nfs_pageio_descriptor *desc,
 	desc->pg_error = PTR_ERR(subreq);
 	nfs_page_group_unlock(req);
 	return 0;
+out_cleanup_subreq:
+	nfs_pageio_cleanup_request(desc, subreq);
+	return 0;
 }
 
 static int nfs_do_recoalesce(struct nfs_pageio_descriptor *desc)
@@ -1168,11 +1180,14 @@  int nfs_pageio_add_request(struct nfs_pageio_descriptor *desc,
 		if (nfs_pgio_has_mirroring(desc))
 			desc->pg_mirror_idx = midx;
 		if (!nfs_pageio_add_request_mirror(desc, dupreq))
-			goto out_failed;
+			goto out_cleanup_subreq;
 	}
 
 	return 1;
 
+out_cleanup_subreq:
+	if (req != dupreq)
+		nfs_pageio_cleanup_request(desc, dupreq);
 out_failed:
 	nfs_pageio_error_cleanup(desc);
 	return 0;