diff mbox series

[01/16] xfs: drop ->writepage completely

Message ID 20181107063127.3902-2-david@fromorbit.com (mailing list archive)
State New, archived
Headers show
Series xfs: Block size > PAGE_SIZE support | expand

Commit Message

Dave Chinner Nov. 7, 2018, 6:31 a.m. UTC
From: Dave Chinner <dchinner@redhat.com>

->writepage is only used in one place - single page writeback from
memory reclaim. We only allow such writeback from kswapd, not from
direct memory reclaim, and so it is rarely used. When it comes from
kswapd, it is effectively random dirty page shoot-down, which is
horrible for IO patterns. We will already have background writeback
trying to clean all the dirty pages in memory as efficiently as
possible, so having kswapd interrupt our well formed IO stream only
slows things down. So get rid of xfs_vm_writepage() completely.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_aops.c | 17 -----------------
 1 file changed, 17 deletions(-)

Comments

Christoph Hellwig Nov. 9, 2018, 3:12 p.m. UTC | #1
[adding linux-mm to the CC list]

On Wed, Nov 07, 2018 at 05:31:12PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> ->writepage is only used in one place - single page writeback from
> memory reclaim. We only allow such writeback from kswapd, not from
> direct memory reclaim, and so it is rarely used. When it comes from
> kswapd, it is effectively random dirty page shoot-down, which is
> horrible for IO patterns. We will already have background writeback
> trying to clean all the dirty pages in memory as efficiently as
> possible, so having kswapd interrupt our well formed IO stream only
> slows things down. So get rid of xfs_vm_writepage() completely.

Interesting.  IFF we can pull this off it would simplify a lot of
things, so I'm generally in favor of it.

->writepage callers in generic code are:

 (1) mm/vmscan.c:pageout() - this is the kswaped (or direct reclaim) you
     mention above.  It basically does nothing in this case which isn't
     great, but the whole point of this patch..
 (2) mm/migrate.c:writeout() - this is only called if no ->migratepage
     method is presend, but we have one in XFS, so we should be ok.

Plus a few pieces of code that are just library functions like
generic_writepages and mpage_writepages.
Dave Chinner Nov. 12, 2018, 9:08 p.m. UTC | #2
On Fri, Nov 09, 2018 at 07:12:39AM -0800, Christoph Hellwig wrote:
> [adding linux-mm to the CC list]
> 
> On Wed, Nov 07, 2018 at 05:31:12PM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > ->writepage is only used in one place - single page writeback from
> > memory reclaim. We only allow such writeback from kswapd, not from
> > direct memory reclaim, and so it is rarely used. When it comes from
> > kswapd, it is effectively random dirty page shoot-down, which is
> > horrible for IO patterns. We will already have background writeback
> > trying to clean all the dirty pages in memory as efficiently as
> > possible, so having kswapd interrupt our well formed IO stream only
> > slows things down. So get rid of xfs_vm_writepage() completely.
> 
> Interesting.  IFF we can pull this off it would simplify a lot of
> things, so I'm generally in favor of it.

Over the past few days of hammeringon this, the only thing I've
noticed is that page reclaim hangs up less, but it's also putting a
bit more pressure on the shrinkers. Filesystem intensive workloads
that drive the machine into reclaim via the page cache seem to hit
breakdown conditions slightly earlier and the impact is that the
shrinkers are run harder. Mostly I see this as the XFS buffer cache
having a much harder time keeping a working set active.

However, while the workloads hit the working set cache, writeback
performance does seem to be slightly higher. It is, however, being
offset by the deeper lows that come from the cache being turned
over.

So there's a bit of rebalancing to be done here as a followup, but
I've been unable to drive the system into unexepected OOM kills or
other bad behaviour as a result of removing ->writepage.

Cheers,

Dave.
Darrick J. Wong Feb. 2, 2021, 8:51 p.m. UTC | #3
On Tue, Nov 13, 2018 at 08:08:39AM +1100, Dave Chinner wrote:
> On Fri, Nov 09, 2018 at 07:12:39AM -0800, Christoph Hellwig wrote:
> > [adding linux-mm to the CC list]
> > 
> > On Wed, Nov 07, 2018 at 05:31:12PM +1100, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > > 
> > > ->writepage is only used in one place - single page writeback from
> > > memory reclaim. We only allow such writeback from kswapd, not from
> > > direct memory reclaim, and so it is rarely used. When it comes from
> > > kswapd, it is effectively random dirty page shoot-down, which is
> > > horrible for IO patterns. We will already have background writeback
> > > trying to clean all the dirty pages in memory as efficiently as
> > > possible, so having kswapd interrupt our well formed IO stream only
> > > slows things down. So get rid of xfs_vm_writepage() completely.
> > 
> > Interesting.  IFF we can pull this off it would simplify a lot of
> > things, so I'm generally in favor of it.
> 
> Over the past few days of hammeringon this, the only thing I've
> noticed is that page reclaim hangs up less, but it's also putting a
> bit more pressure on the shrinkers. Filesystem intensive workloads
> that drive the machine into reclaim via the page cache seem to hit
> breakdown conditions slightly earlier and the impact is that the
> shrinkers are run harder. Mostly I see this as the XFS buffer cache
> having a much harder time keeping a working set active.
> 
> However, while the workloads hit the working set cache, writeback
> performance does seem to be slightly higher. It is, however, being
> offset by the deeper lows that come from the cache being turned
> over.
> 
> So there's a bit of rebalancing to be done here as a followup, but
> I've been unable to drive the system into unexepected OOM kills or
> other bad behaviour as a result of removing ->writepage.

FWIW I've been running this patch in my development kernels as part of
exercising realtime reflink with rextsize > 1.  So far I haven't seen
any particularly adverse effects.

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
diff mbox series

Patch

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 338b9d9984e0..0e1342e359ca 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -911,22 +911,6 @@  xfs_do_writepage(
 	return 0;
 }
 
-STATIC int
-xfs_vm_writepage(
-	struct page		*page,
-	struct writeback_control *wbc)
-{
-	struct xfs_writepage_ctx wpc = {
-		.io_type = XFS_IO_HOLE,
-	};
-	int			ret;
-
-	ret = xfs_do_writepage(page, wbc, &wpc);
-	if (wpc.ioend)
-		ret = xfs_submit_ioend(wbc, wpc.ioend, ret);
-	return ret;
-}
-
 STATIC int
 xfs_vm_writepages(
 	struct address_space	*mapping,
@@ -1019,7 +1003,6 @@  xfs_iomap_swapfile_activate(
 const struct address_space_operations xfs_address_space_operations = {
 	.readpage		= xfs_vm_readpage,
 	.readpages		= xfs_vm_readpages,
-	.writepage		= xfs_vm_writepage,
 	.writepages		= xfs_vm_writepages,
 	.set_page_dirty		= iomap_set_page_dirty,
 	.releasepage		= xfs_vm_releasepage,