btrfs: Implement lazytime
diff mbox series

Message ID 20200114085325045.JFBE.12086.ppp.dion.ne.jp@dmta0008.auone-net.jp
State New
Headers show
Series
  • btrfs: Implement lazytime
Related show

Commit Message

Kusanagi Kouichi Jan. 14, 2020, 8:53 a.m. UTC
I tested with xfstests and lazytime didn't cause any new failures.

Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
---
 fs/btrfs/delayed-inode.c |  6 ++++++
 fs/btrfs/file.c          | 11 +++++++++++
 fs/btrfs/inode.c         | 28 +++++++++++++++++++++++++---
 fs/btrfs/ioctl.c         | 14 ++++++++++++++
 fs/btrfs/super.c         |  1 +
 5 files changed, 57 insertions(+), 3 deletions(-)

Comments

Josef Bacik Jan. 14, 2020, 1:44 p.m. UTC | #1
On 1/14/20 12:53 AM, Kusanagi Kouichi wrote:
> I tested with xfstests and lazytime didn't cause any new failures.
> 
> Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
> ---

We don't use the I_DIRTY flags for tracking our inodes, and .write_inode was 
removed because we didn't need it and it deadlocks.  Thanks,

Josef
David Sterba Jan. 14, 2020, 9:21 p.m. UTC | #2
On Tue, Jan 14, 2020 at 05:53:24PM +0900, Kusanagi Kouichi wrote:
> I tested with xfstests and lazytime didn't cause any new failures.

The changelog should describe what the patch does (the 'why' part too,
but this is obvious from the subject in this case). That fstests pass
without new failures is nice but there should be a specific test for
that or instructions in the changelog how to test.
Kusanagi Kouichi Jan. 15, 2020, 1:41 p.m. UTC | #3
On 2020-01-14 05:44:33 -0800, Josef Bacik wrote:
> On 1/14/20 12:53 AM, Kusanagi Kouichi wrote:
> > I tested with xfstests and lazytime didn't cause any new failures.
> > 
> > Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
> > ---
> 
> We don't use the I_DIRTY flags for tracking our inodes, and .write_inode was
> removed because we didn't need it and it deadlocks.  Thanks,
> 
> Josef

Did you apply the patch and deadlock occur? According to commit 3c4276936f6f
("Btrfs: fix btrfs_write_inode vs delayed iput deadlock"), which removed
.write_inode, .write_inode calls btrfs_run_delayed_iputs and deadlock occurs.
But .write_inode in this patch doesn't seem to call btrfs_run_delayed_iputs.
Kusanagi Kouichi Jan. 15, 2020, 1:45 p.m. UTC | #4
On 2020-01-14 22:21:07 +0100, David Sterba wrote:
> On Tue, Jan 14, 2020 at 05:53:24PM +0900, Kusanagi Kouichi wrote:
> > I tested with xfstests and lazytime didn't cause any new failures.
> 
> The changelog should describe what the patch does (the 'why' part too,
> but this is obvious from the subject in this case). That fstests pass
> without new failures is nice but there should be a specific test for
> that or instructions in the changelog how to test.

To test lazytime, I set the following variables:
TEST_FS_MOUNT_OPTS="-o lazytime,space_cache=v2"
MOUNT_OPTIONS="-o lazytime,space_cache=v2"
David Sterba Jan. 15, 2020, 4:31 p.m. UTC | #5
On Wed, Jan 15, 2020 at 10:45:36PM +0900, Kusanagi Kouichi wrote:
> On 2020-01-14 22:21:07 +0100, David Sterba wrote:
> > On Tue, Jan 14, 2020 at 05:53:24PM +0900, Kusanagi Kouichi wrote:
> > > I tested with xfstests and lazytime didn't cause any new failures.
> > 
> > The changelog should describe what the patch does (the 'why' part too,
> > but this is obvious from the subject in this case). That fstests pass
> > without new failures is nice but there should be a specific test for
> > that or instructions in the changelog how to test.
> 
> To test lazytime, I set the following variables:
> TEST_FS_MOUNT_OPTS="-o lazytime,space_cache=v2"
> MOUNT_OPTIONS="-o lazytime,space_cache=v2"

How did you verify that the lazy time updates were applied properly?
Kusanagi Kouichi Jan. 16, 2020, 6:01 p.m. UTC | #6
On 2020-01-15 17:31:28 +0100, David Sterba wrote:
> On Wed, Jan 15, 2020 at 10:45:36PM +0900, Kusanagi Kouichi wrote:
> > On 2020-01-14 22:21:07 +0100, David Sterba wrote:
> > > On Tue, Jan 14, 2020 at 05:53:24PM +0900, Kusanagi Kouichi wrote:
> > > > I tested with xfstests and lazytime didn't cause any new failures.
> > > 
> > > The changelog should describe what the patch does (the 'why' part too,
> > > but this is obvious from the subject in this case). That fstests pass
> > > without new failures is nice but there should be a specific test for
> > > that or instructions in the changelog how to test.
> > 
> > To test lazytime, I set the following variables:
> > TEST_FS_MOUNT_OPTS="-o lazytime,space_cache=v2"
> > MOUNT_OPTIONS="-o lazytime,space_cache=v2"
> 
> How did you verify that the lazy time updates were applied properly?

I ran the attached test.
diff --git a/tests/generic/999 b/tests/generic/999
new file mode 100755
index 00000000..781b37c5
--- /dev/null
+++ b/tests/generic/999
@@ -0,0 +1,76 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2019 Kusanagi Kouichi.  All Rights Reserved.
+#
+# FS QA Test 999
+#
+# Test timestamp is persistent across umount.
+#
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+
+_scratch_mkfs > /dev/null 2>&1
+_scratch_mount
+
+check_persist()
+{
+    ls "$SCRATCH_MNT" > /dev/null
+    before="$(stat -c '%x %y %z' "$SCRATCH_MNT")"
+    $XFS_IO_PROG -c "$1" "$SCRATCH_MNT"
+    _scratch_cycle_mount strictatime
+    after="$(stat -c '%x %y %z' "$SCRATCH_MNT")"
+    if test "$before" != "$after"
+    then
+	echo "timestamp didn't persist across umount."
+	echo "ls $1"
+	echo "before $before"
+	echo "after  $after"
+	exit
+    fi
+}
+
+check_persist ''
+check_persist fsync
+check_persist syncfs
+check_persist sync
+
+"$FSSTRESS_PROG" -d "$SCRATCH_MNT" -v $(_scale_fsstress_args -n 1000 -p 2) > "$tmp".fsstress
+find "$SCRATCH_MNT" ! -type d -exec stat -c '%x %y %z %i %F %n' '{}' + > "$tmp".before
+_scratch_cycle_mount
+find "$SCRATCH_MNT" ! -type d -exec stat -c '%x %y %z %i %F %n' '{}' + > "$tmp".after
+if ! diff -u "$tmp".before "$tmp".after
+then
+    echo "timestamp didn't persist across umount after fsstress."
+    cat "$tmp".fsstress
+    exit
+fi
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/999.out b/tests/generic/999.out
new file mode 100644
index 00000000..7fbc6768
--- /dev/null
+++ b/tests/generic/999.out
@@ -0,0 +1 @@
+QA output created by 999
diff --git a/tests/generic/group b/tests/generic/group
index 6fe62505..7879eb70 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -595,3 +595,4 @@
 590 auto prealloc preallocrw
 591 auto quick rw pipe splice
 592 auto quick encrypt
+999 auto quick

Patch
diff mbox series

diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index d3e15e1d4a91..b30b00678503 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -1722,6 +1722,12 @@  static void fill_stack_inode_item(struct btrfs_trans_handle *trans,
 				  struct btrfs_inode_item *inode_item,
 				  struct inode *inode)
 {
+	if (inode->i_sb->s_flags & SB_LAZYTIME) {
+		spin_lock(&inode->i_lock);
+		inode->i_state &= ~(I_DIRTY_SYNC | I_DIRTY_TIME);
+		spin_unlock(&inode->i_lock);
+	}
+
 	btrfs_set_stack_inode_uid(inode_item, i_uid_read(inode));
 	btrfs_set_stack_inode_gid(inode_item, i_gid_read(inode));
 	btrfs_set_stack_inode_size(inode_item, BTRFS_I(inode)->disk_i_size);
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 8d47c76b7bd1..36aa75f4e532 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2069,6 +2069,17 @@  int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 
 	trace_btrfs_sync_file(file, datasync);
 
+	if (!datasync && inode->i_sb->s_flags & SB_LAZYTIME) {
+		while (1) {
+			ret = sync_inode_metadata(inode, 0);
+			if (ret != -EAGAIN)
+				break;
+			flush_work(&fs_info->async_reclaim_work);
+		}
+		if (ret)
+			return ret;
+	}
+
 	btrfs_init_log_ctx(&ctx, inode);
 
 	/*
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5509c41a4f43..a60aee76cc95 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3941,6 +3941,12 @@  static void fill_inode_item(struct btrfs_trans_handle *trans,
 
 	btrfs_init_map_token(&token, leaf);
 
+	if (inode->i_sb->s_flags & SB_LAZYTIME) {
+		spin_lock(&inode->i_lock);
+		inode->i_state &= ~(I_DIRTY_SYNC | I_DIRTY_TIME);
+		spin_unlock(&inode->i_lock);
+	}
+
 	btrfs_set_token_inode_uid(leaf, item, i_uid_read(inode), &token);
 	btrfs_set_token_inode_gid(leaf, item, i_gid_read(inode), &token);
 	btrfs_set_token_inode_size(leaf, item, BTRFS_I(inode)->disk_i_size,
@@ -6188,6 +6194,16 @@  static int btrfs_dirty_inode(struct inode *inode)
 	return ret;
 }
 
+int btrfs_write_inode(struct inode *inode, struct writeback_control *wbc)
+{
+	if (work_busy(&btrfs_sb(inode->i_sb)->async_reclaim_work) & WORK_BUSY_RUNNING) {
+		mark_inode_dirty_sync(inode);
+		return -EAGAIN;
+	}
+
+	return btrfs_dirty_inode(inode);
+}
+
 /*
  * This is a copy of file_update_time.  We need this so we can return error on
  * ENOSPC for updating the inode in the case of file write and mmap writes.
@@ -6201,15 +6217,21 @@  static int btrfs_update_time(struct inode *inode, struct timespec64 *now,
 	if (btrfs_root_readonly(root))
 		return -EROFS;
 
-	if (flags & S_VERSION)
-		dirty |= inode_maybe_inc_iversion(inode, dirty);
+	if (!(flags & S_VERSION && inode_maybe_inc_iversion(inode, dirty))) {
+		if (unlikely(!dirty))
+			return 0;
+		if (inode->i_sb->s_flags & SB_LAZYTIME &&
+		    !test_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags))
+			return generic_update_time(inode, now, flags);
+	}
+
 	if (flags & S_CTIME)
 		inode->i_ctime = *now;
 	if (flags & S_MTIME)
 		inode->i_mtime = *now;
 	if (flags & S_ATIME)
 		inode->i_atime = *now;
-	return dirty ? btrfs_dirty_inode(inode) : 0;
+	return btrfs_dirty_inode(inode);
 }
 
 /*
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 18e328ce4b54..af47b4b046de 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -799,6 +799,14 @@  static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 	if (ret)
 		goto dec_and_free;
 
+	/* For xfstests generic/003. Is it better to fix the test? */
+	if (dir->i_sb->s_flags & SB_LAZYTIME) {
+		down_read(&dir->i_sb->s_umount);
+		if (!sb_rdonly(dir->i_sb))
+			sync_inodes_sb(dir->i_sb);
+		up_read(&dir->i_sb->s_umount);
+	}
+
 	/*
 	 * All previous writes have started writeback in NOCOW mode, so now
 	 * we force future writes to fallback to COW mode during snapshot
@@ -5497,6 +5505,12 @@  long btrfs_ioctl(struct file *file, unsigned int
 		ret = btrfs_start_delalloc_roots(fs_info, -1);
 		if (ret)
 			return ret;
+		if (inode->i_sb->s_flags & SB_LAZYTIME) {
+			down_read(&inode->i_sb->s_umount);
+			if (!sb_rdonly(inode->i_sb))
+				sync_inodes_sb(inode->i_sb);
+			up_read(&inode->i_sb->s_umount);
+		}
 		ret = btrfs_sync_fs(inode->i_sb, 1);
 		/*
 		 * The transaction thread may want to do more work,
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f452a94abdc3..ccfe9aff1394 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2281,6 +2281,7 @@  static int btrfs_show_devname(struct seq_file *m, struct dentry *root)
 }
 
 static const struct super_operations btrfs_super_ops = {
+	.write_inode	= btrfs_write_inode,
 	.drop_inode	= btrfs_drop_inode,
 	.evict_inode	= btrfs_evict_inode,
 	.put_super	= btrfs_put_super,