
[08/11] xfs: centralize page allocation and freeing for buffers

Message ID 20210519190900.320044-9-hch@lst.de
State New, archived
Series [01/11] xfs: cleanup error handling in xfs_buf_get_map

Commit Message

Christoph Hellwig May 19, 2021, 7:08 p.m. UTC
Factor out two helpers that do everything needed for allocating and
freeing pages that back a buffer, and remove the duplication between
the different interfaces.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_buf.c | 110 ++++++++++++++++-------------------------------
 1 file changed, 37 insertions(+), 73 deletions(-)
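
For quick reference, the two helpers this patch ends up with, with the
signatures as they appear in the diff below:

	/* allocate the pages backing bp; sets _XBF_PAGES on success */
	static int xfs_buf_alloc_pages(struct xfs_buf *bp, gfp_t gfp_mask,
				       bool fail_fast);

	/* free all pages allocated to the buffer, including the page map */
	static void xfs_buf_free_pages(struct xfs_buf *bp);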

Comments

Dave Chinner May 19, 2021, 11:22 p.m. UTC | #1
On Wed, May 19, 2021 at 09:08:57PM +0200, Christoph Hellwig wrote:
> Factor out two helpers that do everything needed for allocating and
> freeing pages that back a buffer, and remove the duplication between
> the different interfaces.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

This seems really confused.

Up until this point in the patch set you are pulling code out
of xfs_buf_alloc_pages() into helpers. Now you are getting rid of
the helpers and putting the slightly modified code back into
xfs_buf_alloc_pages(). This doesn't make any sense at all.

The freeing helper now requires the buffer state to be
manipulated on allocation failure so that the free function doesn't
run off the end of the bp->b_pages array. That's a bit of a
landmine, and it doesn't really clean anything up much at all.
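
For context, this is the failure path in question, excerpted from the
patch below with explanatory comments added:

	page = alloc_page(gfp_mask);
	if (unlikely(!page)) {
		if (fail_fast) {
			/*
			 * The caller-side fixup: trim b_page_count to the
			 * i pages actually allocated, so that
			 * xfs_buf_free_pages() stops there instead of
			 * walking off the end of bp->b_pages.
			 */
			bp->b_page_count = i;
			xfs_buf_free_pages(bp);
			return -ENOMEM;
		}
		/* otherwise: wait and retry (elided) */
	}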

And on the allocation side there is new "fail fast" behaviour
because you've lifted the readahead out of xfs_buf_alloc_pages. You
also lifted the zeroing checks, which I note that you immediately
put back inside xfs_buf_alloc_pages() in the next patch.

The stuff up to this point in the series makes sense. From this
patch onwards it seems to me that you're just undoing the factoring
and cleanups from the first few patches...

I mean, like the factoring of xfs_buf_alloc_slab(), you could have
just factored out xfs_buf_alloc_pages(bp, page_count) from
xfs_buf_allocate_memory() and used that directly in
xfs_buf_get_uncached(), avoiding a bunch of this churn of factoring
out, slightly modifying the logic and recombining. And it would be
trivial to do on top of the bulk allocation patch which already
converts both of these functions to use bulk allocation....
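
A hypothetical sketch of that factoring, assembled from the existing
code in the patch below (the gfp_mask parameter and the error-unwind
convention are assumptions, not Dave's exact proposal):

	static int
	xfs_buf_alloc_pages(
		struct xfs_buf	*bp,
		unsigned int	page_count,
		gfp_t		gfp_mask)
	{
		unsigned int	i;

		ASSERT(bp->b_pages == NULL);

		bp->b_page_count = page_count;
		if (page_count > XB_PAGES) {
			bp->b_pages = kmem_zalloc(sizeof(struct page *) *
							page_count, KM_NOFS);
			if (!bp->b_pages)
				return -ENOMEM;
		} else {
			bp->b_pages = bp->b_page_array;
		}

		for (i = 0; i < page_count; i++) {
			bp->b_pages[i] = alloc_page(gfp_mask);
			if (!bp->b_pages[i]) {
				/* unwind the pages we already allocated */
				while (i-- > 0)
					__free_page(bp->b_pages[i]);
				if (bp->b_pages != bp->b_page_array)
					kmem_free(bp->b_pages);
				bp->b_pages = NULL;
				return -ENOMEM;
			}
		}
		bp->b_flags |= _XBF_PAGES;
		return 0;
	}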

Cheers,

Dave.
Christoph Hellwig May 20, 2021, 5:35 a.m. UTC | #2
On Thu, May 20, 2021 at 09:22:45AM +1000, Dave Chinner wrote:
> Up until this point in the patch set you are pulling code out
> of xfs_buf_alloc_pages() into helpers. Now you are getting rid of
> the helpers and putting the slightly modified code back into
> xfs_buf_alloc_pages(). This doesn't make any sense at all.

It makes a whole lot of sense, but it seems you don't like the
structure :)

As stated in the commit log, we now have one helper that sets up the
_XBF_PAGES backing with pages and the map, and one helper to tear it
down.  I think it makes a whole lot of sense this way.

> The freeing helper now requires the buffer state to be
> manipulated on allocation failure so that the free function doesn't
> run off the end of the bp->b_pages array. That's a bit of a
> landmine, and it doesn't really clean anything up much at all.

It is something we also do elsewhere in the kernel.  Another
alternative would be to do a NULL check on the page, or to just
pointlessly duplicate the freeing loop.
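
A sketch of that NULL-check alternative (hypothetical, not part of this
patch; it only works if every unused slot of bp->b_pages is
zero-initialized):

	static void
	xfs_buf_free_pages(
		struct xfs_buf	*bp)
	{
		unsigned int	i;

		for (i = 0; i < bp->b_page_count; i++) {
			/* skip slots a failed allocation never filled in */
			if (bp->b_pages[i])
				__free_page(bp->b_pages[i]);
		}

		if (bp->b_pages != bp->b_page_array) {
			kmem_free(bp->b_pages);
			bp->b_pages = NULL;
		}
	}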

> And on the allocation side there is new "fail fast" behaviour
> because you've lifted the readahead out of xfs_buf_alloc_pages. You
> also lifted the zeroing checks, which I note that you immediately
> put back inside xfs_buf_alloc_pages() in the next patch.

This is to clearly split code consolidation from behavior changes.
I could move both earlier, at the cost of adding a lot of new code
first that later gets removed.

> I mean, like the factoring of xfs_buf_alloc_slab(), you could have
> just factored out xfs_buf_alloc_pages(bp, page_count) from
> xfs_buf_allocate_memory() and used that directly in
> xfs_buf_get_uncached(), avoiding a bunch of this churn of factoring
> out, slightly modifying the logic and recombining. And it would be
> trivial to do on top of the bulk allocation patch which already
> converts both of these functions to use bulk allocation....

As mentioned in the cover letter: the bulk allocation review is what
triggered this, as it tripped me up following various loose ends.  And
as usual I'd rather have that kind of change at the end, where the
surrounding code makes sense, so the rebased version is now patch 11
of this series.
Dave Chinner May 25, 2021, 11:59 p.m. UTC | #3
On Thu, May 20, 2021 at 07:35:04AM +0200, Christoph Hellwig wrote:
> On Thu, May 20, 2021 at 09:22:45AM +1000, Dave Chinner wrote:
> > Up until this point in the patch set you are pulling code out
> > of xfs_buf_alloc_pages() into helpers. Now you are getting rid of
> > the helpers and putting the slightly modified code back into
> > xfs_buf_alloc_pages(). This doesn't make any sense at all.
> 
> It makes a whole lot of sense, but it seems you don't like the
> structure :)
> 
> As stated in the commit log, we now have one helper that sets up the
> _XBF_PAGES backing with pages and the map, and one helper to tear it
> down.  I think it makes a whole lot of sense this way.

I don't like the way the patchset is built. It creates temporary
infrastructure, then tears it down again to return the code to
almost exactly the same structure that it originally had. In doing
this, you change the semantics of functions and helpers multiple
times yet, eventually, we end up with the same semantics as we
started with.

It's much more obvious to factor out the end helpers first, with the
exact semantics that the current code has and will end up with, and
then just convert and clean up the code in and around those helpers.
It's much easier to follow and to verify correct if the function call
semantics and behaviour don't keep changing...

> > The freeing helper now requires the buffer state to be
> > manipulated on allocation failure so that the free function doesn't
> > run off the end of the bp->b_pages array. That's a bit of a
> > landmine, and it doesn't really clean anything up much at all.
> 
> It is something we also do elsewhere in the kernel.  Another
> alternative would be to do a NULL check on the page, or to just
> pointlessly duplicate the freeing loop.

A NULL check in the freeing code is much simpler to understand at a
glance. It's easy to miss that the error handling only works because
callers add a single extra line of code to make it work correctly.
This is a bad pattern because it's easy for new code to get it wrong
and for nobody to notice that it's wrong.

> > And on the allocation side there is new "fail fast" behaviour
> > because you've lifted the readahead out of xfs_buf_alloc_pages. You
> > also lifted the zeroing checks, which I note that you immediately
> > put back inside xfs_buf_alloc_pages() in the next patch.
> 
> This is to clearly split code consolidation from behavior changes.
> I could move both earlier, at the cost of adding a lot of new code
> first that later gets removed.

Ah, what new code? Factoring out the _alloc_pages() code at the same
time as the alloc_kmem() code is the only "new" code that is
necessary. Everything else is then consolidation, and this doesn't
require repeatedly changing behaviour and moving code out of and back
into helpers....

> > I mean, like the factoring of xfs_buf_alloc_slab(), you could have
> > just factored out xfs_buf_alloc_pages(bp, page_count) from
> > xfs_buf_allocate_memory() and used that directly in
> xfs_buf_get_uncached(), avoiding a bunch of this churn of factoring
> out, slightly modifying the logic and recombining. And it would be
> > trivial to do on top of the bulk allocation patch which already
> > converts both of these functions to use bulk allocation....
> 
> As mentioned in the cover letter: the bulk allocation review is what
> triggered this, as it tripped me up following various loose ends.  And
> as usual I'd rather have that kind of change at the end, where the
> surrounding code makes sense, so the rebased version is now patch 11
> of this series.

I've re-written my patch based on this cleanup series. It largely
does all the same things, and ends up largely in the same place, but
does things in an order that doesn't keep changing behaviour and
repeatedly moving the same code around. I'll post it once I've QA'd
it.

Cheers,

Dave.

Patch

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 76a107e3cb2a22..31aff8323605cd 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -273,35 +273,17 @@  _xfs_buf_alloc(
 }
 
 /*
- *	Allocate a page array capable of holding a specified number
- *	of pages, and point the page buf at it.
+ * Free all pages allocated to the buffer including the page map.
  */
-STATIC int
-_xfs_buf_get_pages(
-	struct xfs_buf		*bp)
+static void
+xfs_buf_free_pages(
+	struct xfs_buf	*bp)
 {
-	ASSERT(bp->b_pages == NULL);
-
-	bp->b_page_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE);
-	if (bp->b_page_count > XB_PAGES) {
-		bp->b_pages = kmem_zalloc(sizeof(struct page *) *
-						bp->b_page_count, KM_NOFS);
-		if (!bp->b_pages)
-			return -ENOMEM;
-	} else {
-		bp->b_pages = bp->b_page_array;
-	}
+	unsigned int	i;
 
-	return 0;
-}
+	for (i = 0; i < bp->b_page_count; i++)
+		__free_page(bp->b_pages[i]);
 
-/*
- *	Frees b_pages if it was allocated.
- */
-STATIC void
-_xfs_buf_free_pages(
-	struct xfs_buf	*bp)
-{
 	if (bp->b_pages != bp->b_page_array) {
 		kmem_free(bp->b_pages);
 		bp->b_pages = NULL;
@@ -324,22 +306,14 @@  xfs_buf_free(
 	ASSERT(list_empty(&bp->b_lru));
 
 	if (bp->b_flags & _XBF_PAGES) {
-		uint		i;
-
 		if (xfs_buf_is_vmapped(bp))
 			vm_unmap_ram(bp->b_addr, bp->b_page_count);
-
-		for (i = 0; i < bp->b_page_count; i++) {
-			struct page	*page = bp->b_pages[i];
-
-			__free_page(page);
-		}
+		xfs_buf_free_pages(bp);
 		if (current->reclaim_state)
 			current->reclaim_state->reclaimed_slab +=
 							bp->b_page_count;
 	} else if (bp->b_flags & _XBF_KMEM)
 		kmem_free(bp->b_addr);
-	_xfs_buf_free_pages(bp);
 	xfs_buf_free_maps(bp);
 	kmem_cache_free(xfs_buf_zone, bp);
 }
@@ -380,34 +354,33 @@  xfs_buf_alloc_slab(
 static int
 xfs_buf_alloc_pages(
 	struct xfs_buf		*bp,
-	uint			flags)
+	gfp_t			gfp_mask,
+	bool			fail_fast)
 {
-	gfp_t			gfp_mask = xb_to_gfp(flags);
-	unsigned short		i;
-	int			error;
-
-	/*
-	 * assure zeroed buffer for non-read cases.
-	 */
-	if (!(flags & XBF_READ))
-		gfp_mask |= __GFP_ZERO;
+	int			i;
 
-	error = _xfs_buf_get_pages(bp);
-	if (unlikely(error))
-		return error;
+	ASSERT(bp->b_pages == NULL);
 
-	bp->b_flags |= _XBF_PAGES;
+	bp->b_page_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE);
+	if (bp->b_page_count > XB_PAGES) {
+		bp->b_pages = kmem_zalloc(sizeof(struct page *) *
+						bp->b_page_count, KM_NOFS);
+		if (!bp->b_pages)
+			return -ENOMEM;
+	} else {
+		bp->b_pages = bp->b_page_array;
+	}
 
 	for (i = 0; i < bp->b_page_count; i++) {
 		struct page	*page;
 		uint		retries = 0;
 retry:
 		page = alloc_page(gfp_mask);
-		if (unlikely(page == NULL)) {
-			if (flags & XBF_READ_AHEAD) {
+		if (unlikely(!page)) {
+			if (fail_fast) {
 				bp->b_page_count = i;
-				error = -ENOMEM;
-				goto out_free_pages;
+				xfs_buf_free_pages(bp);
+				return -ENOMEM;
 			}
 
 			/*
@@ -429,13 +402,9 @@  xfs_buf_alloc_pages(
 
 		bp->b_pages[i] = page;
 	}
-	return 0;
 
-out_free_pages:
-	for (i = 0; i < bp->b_page_count; i++)
-		__free_page(bp->b_pages[i]);
-	bp->b_flags &= ~_XBF_PAGES;
-	return error;
+	bp->b_flags |= _XBF_PAGES;
+	return 0;
 }
 
 /*
@@ -706,7 +675,13 @@  xfs_buf_get_map(
 	 */
 	if (BBTOB(new_bp->b_length) >= PAGE_SIZE ||
 	    xfs_buf_alloc_slab(new_bp, flags) < 0) {
-		error = xfs_buf_alloc_pages(new_bp, flags);
+		gfp_t			gfp_mask = xb_to_gfp(flags);
+
+		/* assure a zeroed buffer for non-read cases */
+		if (!(flags & XBF_READ))
+			gfp_mask |= __GFP_ZERO;
+		error = xfs_buf_alloc_pages(new_bp, gfp_mask,
+					   flags & XBF_READ_AHEAD);
 		if (error)
 			goto out_free_buf;
 	}
@@ -936,7 +911,7 @@  xfs_buf_get_uncached(
 	int			flags,
 	struct xfs_buf		**bpp)
 {
-	int			error, i;
+	int			error;
 	struct xfs_buf		*bp;
 	DEFINE_SINGLE_BUF_MAP(map, XFS_BUF_DADDR_NULL, numblks);
 
@@ -947,19 +922,10 @@  xfs_buf_get_uncached(
 	if (error)
 		goto fail;
 
-	error = _xfs_buf_get_pages(bp);
+	error = xfs_buf_alloc_pages(bp, xb_to_gfp(flags), true);
 	if (error)
 		goto fail_free_buf;
 
-	for (i = 0; i < bp->b_page_count; i++) {
-		bp->b_pages[i] = alloc_page(xb_to_gfp(flags));
-		if (!bp->b_pages[i]) {
-			error = -ENOMEM;
-			goto fail_free_mem;
-		}
-	}
-	bp->b_flags |= _XBF_PAGES;
-
 	error = _xfs_buf_map_pages(bp, 0);
 	if (unlikely(error)) {
 		xfs_warn(target->bt_mount,
@@ -972,9 +938,7 @@  xfs_buf_get_uncached(
 	return 0;
 
  fail_free_mem:
-	while (--i >= 0)
-		__free_page(bp->b_pages[i]);
-	_xfs_buf_free_pages(bp);
+	xfs_buf_free_pages(bp);
  fail_free_buf:
 	xfs_buf_free_maps(bp);
 	kmem_cache_free(xfs_buf_zone, bp);