fs: drop_caches: skip dropping pagecache which is always dirty

Message ID 20220720022118.1495752-1-yang.yang29@zte.com.cn (mailing list archive)

Commit Message

CGEL July 20, 2022, 2:21 a.m. UTC
From: Yang Yang <yang.yang29@zte.com.cn>

The pagecache of some filesystems, such as ramfs and tmpfs, has the
PG_dirty bit set as soon as it is allocated, so it can never be
dropped. Skipping such filesystems makes drop_pagecache_sb() more
efficient.

Introduce a new fs flag to mark such filesystems; the new flag may
find other uses in the future.

Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
---
 fs/drop_caches.c   | 7 +++++++
 fs/ramfs/inode.c   | 2 +-
 include/linux/fs.h | 1 +
 mm/shmem.c         | 2 +-
 4 files changed, 10 insertions(+), 2 deletions(-)

Comments

Matthew Wilcox July 20, 2022, 3:02 a.m. UTC | #1
On Wed, Jul 20, 2022 at 02:21:19AM +0000, cgel.zte@gmail.com wrote:
> From: Yang Yang <yang.yang29@zte.com.cn>
> 
> The pagecache of some filesystems, such as ramfs and tmpfs, has the
> PG_dirty bit set as soon as it is allocated, so it can never be
> dropped. Skipping such filesystems makes drop_pagecache_sb() more
> efficient.

Why do we want to make drop_pagecache_sb() more efficient?
CGEL July 20, 2022, 6:02 a.m. UTC | #2
On Wed, Jul 20, 2022 at 04:02:40AM +0100, Matthew Wilcox wrote:
> On Wed, Jul 20, 2022 at 02:21:19AM +0000, cgel.zte@gmail.com wrote:
> > From: Yang Yang <yang.yang29@zte.com.cn>
> > 
> > The pagecache of some filesystems, such as ramfs and tmpfs, has the
> > PG_dirty bit set as soon as it is allocated, so it can never be
> > dropped. Skipping such filesystems makes drop_pagecache_sb() more
> > efficient.
> 
> Why do we want to make drop_pagecache_sb() more efficient?

Some users may use drop_caches for purposes other than testing or debugging.

For example, some systems create a lot of pagecache during boot while
reading bzImage, the ramdisk, docker images, etc. Most of this
pagecache is useless after boot. It can have long-term negative effects
on the workload once page reclaim is triggered, and it is especially
harmful under direct reclaim or when pages must be allocated in atomic
context. So users may choose to run drop_caches after boot.
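
For illustration, a minimal C sketch of the drop_caches usage described
above (a hedged example, not part of the patch). Writing "1" to
/proc/sys/vm/drop_caches frees clean pagecache, "2" frees reclaimable
slab objects such as dentries and inodes, and "3" frees both; root is
required:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int fd;

	/* Write back dirty pages first so that more of the pagecache
	 * is clean and therefore droppable.
	 */
	sync();

	fd = open("/proc/sys/vm/drop_caches", O_WRONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (write(fd, "1", 1) != 1)
		perror("write");
	close(fd);
	return 0;
}
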
Christoph Hellwig July 20, 2022, 6:04 a.m. UTC | #3
On Wed, Jul 20, 2022 at 06:02:32AM +0000, CGEL wrote:
> For example, some systems create a lot of pagecache during boot while
> reading bzImage, the ramdisk, docker images, etc. Most of this
> pagecache is useless after boot. It can have long-term negative effects
> on the workload once page reclaim is triggered, and it is especially
> harmful under direct reclaim or when pages must be allocated in atomic
> context. So users may choose to run drop_caches after boot.

It is purely a debug interface.  If you want to drop specific page
cache, that needs to be done through madvise.
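
For illustration, a hedged sketch of the per-file alternative suggested
above: for pagecache of a file that is not mapped, the usual userspace
interface is posix_fadvise() with POSIX_FADV_DONTNEED, while
madvise(MADV_DONTNEED) plays the analogous role for mapped ranges. The
path below is hypothetical:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	int fd, err;

	fd = open("/var/lib/boot/image", O_RDONLY); /* hypothetical path */
	if (fd < 0) {
		perror("open");
		return 1;
	}
	/* Flush any dirty pages, then ask the kernel to drop the cached
	 * pages for the whole file (len == 0 means "to end of file").
	 */
	fsync(fd);
	err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
	if (err)
		fprintf(stderr, "posix_fadvise: %s\n", strerror(err));
	close(fd);
	return 0;
}
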
CGEL July 20, 2022, 7:01 a.m. UTC | #4
On Tue, Jul 19, 2022 at 11:04:36PM -0700, Christoph Hellwig wrote:
> On Wed, Jul 20, 2022 at 06:02:32AM +0000, CGEL wrote:
> > For example, some systems create a lot of pagecache during boot while
> > reading bzImage, the ramdisk, docker images, etc. Most of this
> > pagecache is useless after boot. It can have long-term negative effects
> > on the workload once page reclaim is triggered, and it is especially
> > harmful under direct reclaim or when pages must be allocated in atomic
> > context. So users may choose to run drop_caches after boot.
> 
> It is purely a debug interface.  If you want to drop specific page
> cache, that needs to be done through madvise.

It's not easy for users to use madvise in a complex system; it has a
cost. Since drop_caches is not forbidden, users may simply want to use
it in the scenario the mail above described.
Matthew Wilcox July 20, 2022, 3:02 p.m. UTC | #5
On Wed, Jul 20, 2022 at 06:02:32AM +0000, CGEL wrote:
> On Wed, Jul 20, 2022 at 04:02:40AM +0100, Matthew Wilcox wrote:
> > On Wed, Jul 20, 2022 at 02:21:19AM +0000, cgel.zte@gmail.com wrote:
> > > From: Yang Yang <yang.yang29@zte.com.cn>
> > > 
> > > The pagecache of some filesystems, such as ramfs and tmpfs, has the
> > > PG_dirty bit set as soon as it is allocated, so it can never be
> > > dropped. Skipping such filesystems makes drop_pagecache_sb() more
> > > efficient.
> > 
> > Why do we want to make drop_pagecache_sb() more efficient?
> 
> Some users may use drop_caches for purposes other than testing or debugging.

This is a terrible reason.

> For example, some systems create a lot of pagecache during boot while
> reading bzImage, the ramdisk, docker images, etc. Most of this
> pagecache is useless after boot. It can have long-term negative effects
> on the workload once page reclaim is triggered, and it is especially
> harmful under direct reclaim or when pages must be allocated in atomic
> context. So users may choose to run drop_caches after boot.

If that's actually a problem, work on fixing that.
CGEL July 21, 2022, 1 a.m. UTC | #6
On Wed, Jul 20, 2022 at 04:02:04PM +0100, Matthew Wilcox wrote:
> On Wed, Jul 20, 2022 at 06:02:32AM +0000, CGEL wrote:
> > On Wed, Jul 20, 2022 at 04:02:40AM +0100, Matthew Wilcox wrote:
> > > On Wed, Jul 20, 2022 at 02:21:19AM +0000, cgel.zte@gmail.com wrote:
> > > > From: Yang Yang <yang.yang29@zte.com.cn>
> > > > 
> > > > The pagecache of some filesystems, such as ramfs and tmpfs, has the
> > > > PG_dirty bit set as soon as it is allocated, so it can never be
> > > > dropped. Skipping such filesystems makes drop_pagecache_sb() more
> > > > efficient.
> > > 
> > > Why do we want to make drop_pagecache_sb() more efficient?
> > 
> > Some users may use drop_caches for purposes other than testing or debugging.
> 
> This is a terrible reason.
>

Another case where drop_caches may be used: "Migration of virtual
machines will go faster if there are fewer pages to copy, so
administrators would like to be able to force a virtual machine to
reclaim as much memory as possible before the migration begins."

See https://lwn.net/Articles/894849/

> > For example, some systems create a lot of pagecache during boot while
> > reading bzImage, the ramdisk, docker images, etc. Most of this
> > pagecache is useless after boot. It can have long-term negative effects
> > on the workload once page reclaim is triggered, and it is especially
> > harmful under direct reclaim or when pages must be allocated in atomic
> > context. So users may choose to run drop_caches after boot.
> 
> If that's actually a problem, work on fixing that.

Patch

diff --git a/fs/drop_caches.c b/fs/drop_caches.c
index e619c31b6bd9..16956d5d3922 100644
--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -19,6 +19,13 @@  static void drop_pagecache_sb(struct super_block *sb, void *unused)
 {
 	struct inode *inode, *toput_inode = NULL;
 
+	/*
 +	 * Pagecache of this kind of fs has the PG_dirty bit set as soon
 +	 * as it is allocated, so it can never be dropped.
+	 */
+	if (sb->s_type->fs_flags & FS_ALWAYS_DIRTY)
+		return;
+
 	spin_lock(&sb->s_inode_list_lock);
 	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
 		spin_lock(&inode->i_lock);
diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c
index bc66d0173e33..5fb62d37618f 100644
--- a/fs/ramfs/inode.c
+++ b/fs/ramfs/inode.c
@@ -289,7 +289,7 @@  static struct file_system_type ramfs_fs_type = {
 	.init_fs_context = ramfs_init_fs_context,
 	.parameters	= ramfs_fs_parameters,
 	.kill_sb	= ramfs_kill_sb,
-	.fs_flags	= FS_USERNS_MOUNT,
+	.fs_flags	= FS_USERNS_MOUNT | FS_ALWAYS_DIRTY,
 };
 
 static int __init init_ramfs_fs(void)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e285bd9d6188..90cdd10d683e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2532,6 +2532,7 @@  struct file_system_type {
 #define FS_USERNS_MOUNT		8	/* Can be mounted by userns root */
 #define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
 #define FS_ALLOW_IDMAP         32      /* FS has been updated to handle vfs idmappings. */
+#define FS_ALWAYS_DIRTY		64	/* Pagecache is always dirty. */
 #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
 	int (*init_fs_context)(struct fs_context *);
 	const struct fs_parameter_spec *parameters;
diff --git a/mm/shmem.c b/mm/shmem.c
index 8baf26eda989..5d549f61735f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3974,7 +3974,7 @@  static struct file_system_type shmem_fs_type = {
 	.parameters	= shmem_fs_parameters,
 #endif
 	.kill_sb	= kill_litter_super,
-	.fs_flags	= FS_USERNS_MOUNT,
+	.fs_flags	= FS_USERNS_MOUNT | FS_ALWAYS_DIRTY,
 };
 
 void __init shmem_init(void)