diff mbox

[PATCH-v3,3/6] vfs: don't let the dirty time inodes get more than a day stale

Message ID 1416893674-419-4-git-send-email-tytso@mit.edu (mailing list archive)
State Not Applicable
Headers show

Commit Message

Theodore Ts'o Nov. 25, 2014, 5:34 a.m. UTC
Guarantee that the on-disk timestamps will be no more than 24 hours
stale.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 fs/fs-writeback.c  |  1 +
 fs/inode.c         | 28 +++++++++++++++++++++++-----
 include/linux/fs.h |  1 +
 3 files changed, 25 insertions(+), 5 deletions(-)

Comments

Rasmus Villemoes Nov. 25, 2014, 2:58 p.m. UTC | #1
On Tue, Nov 25 2014, Theodore Ts'o <tytso@mit.edu> wrote:

>  static int update_time(struct inode *inode, struct timespec *time, int flags)
>  {
> +	struct timespec uptime;
> +	unsigned short daycode;
>  	int ret;
>  
>  	if (inode->i_op->update_time) {
> @@ -1525,17 +1527,33 @@ static int update_time(struct inode *inode, struct timespec *time, int flags)
>  		if (flags & S_CTIME)
>  			inode->i_ctime = *time;
>  		if (flags & S_MTIME)
> -			inode->i_mtime = *time;
> +		inode->i_mtime = *time;
>  	}
> +	/*
> +	 * If i_ts_dirty_day is zero, then either we have not deferred
> +	 * timestamp updates, or the system has been up for less than
> +	 * a day (so days_since_boot is zero), so we defer timestamp
> +	 * updates in that case and set the I_DIRTY_TIME flag.  If a
> +	 * day or more has passed, then i_ts_dirty_day will be
> +	 * different from days_since_boot, and then we should update
> +	 * the on-disk inode and then we can clear i_ts_dirty_day.
> +	 */

I think days_since_boot was a lot clearer than daycode. In any case,
please make the comment and the code consistent.

>  	if ((inode->i_sb->s_flags & MS_LAZYTIME) &&
>  	    !(flags & S_VERSION)) {
>  		if (inode->i_state & I_DIRTY_TIME)
>  			return 0;
> -		spin_lock(&inode->i_lock);
> -		inode->i_state |= I_DIRTY_TIME;
> -		spin_unlock(&inode->i_lock);
> -		return 0;
> +		get_monotonic_boottime(&uptime);
> +		daycode = div_u64(uptime.tv_sec, (HZ * 86400));

You should probably divide by the number of seconds in a day, not the
number of jiffies in a day.

Isn't div_u64 mostly for when the divisor is not known at compile time?
Technically, "(u64)uptime.tv_sec / 86400" is of course a u64/u64 division,
but the compiler should see that the divisor is only 32 bits and hence
should be able to generate code which is at least as efficient as
whatever inline asm the arch provides for u64/u32 divisions.


Rasmus
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o Nov. 25, 2014, 3:55 p.m. UTC | #2
On Tue, Nov 25, 2014 at 03:58:01PM +0100, Rasmus Villemoes wrote:
> 
> I think days_since_boot was a lot clearer than daycode. In any case,
> please make the comment and the code consistent.

Yeah, I was going back and forth between days since the epoch and days
since boot, but found it was more efficient to calculate the days
since boot.  Sure, I'll go back to days_since_boot.

> You should probably divide by the number of seconds in a day, not the
> number of jiffies in a day.

Right, brain hiccup on my part, will fix.

> Isn't div_u64 mostly for when the divisor is not known at compile time?
> Technically, "(u64)uptime.tv_sec / 86400" is of course a u64/u64 division,
> but the compiler should see that the divisor is only 32 bits and hence
> should be able to generate code which is at least as efficient as
> whatever inline asm the arch provides for u64/u32 divisions.

The problem with doing u64/u64 divisions is that the compiler will
call out to a (non-existent) library function on some architectures.
For example, try compiling the following using: with "gcc -S -m32
foo.c"

int main(int argc, char **argv)
{
	unsigned long long t = time(0);

	printf("%llu\n", t / 86400);
}

You will find in the .S file the following:

	...
	pushl	$0
	pushl	$86400
	pushl	%edx
	pushl	%eax
	call	__udivdi3
	...

It will work finn compiling for x86_64, but if you do this and push
out a git branch, you will soon get a very nice e-mail from the ktest
bot explaining how you've broken the build on the i386 architecture
since the kernel doesn't provide __udivdi3.
	
Hence if you are doing any kind of divisions involving u64, you have
to use the functions in include/linux/math64.h, and div_u64 is, per
math64.h:

/**
 * div_u64 - unsigned 64bit divide with 32bit divisor
 *
 * This is the most common 64bit divide and should be used if possible,
 * as many 32bit archs can optimize this variant better than a full 64bit
 * divide.
 */

Cheers,

					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index ce7de22..eb04277 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1141,6 +1141,7 @@  void __mark_inode_dirty(struct inode *inode, int flags)
 	if (flags & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
 		trace_writeback_dirty_inode_start(inode, flags);
 
+		inode->i_ts_dirty_day = 0;
 		if (sb->s_op->dirty_inode)
 			sb->s_op->dirty_inode(inode, flags);
 
diff --git a/fs/inode.c b/fs/inode.c
index 2093a84..34a443f 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1511,6 +1511,8 @@  static int relatime_need_update(struct vfsmount *mnt, struct inode *inode,
  */
 static int update_time(struct inode *inode, struct timespec *time, int flags)
 {
+	struct timespec uptime;
+	unsigned short daycode;
 	int ret;
 
 	if (inode->i_op->update_time) {
@@ -1525,17 +1527,33 @@  static int update_time(struct inode *inode, struct timespec *time, int flags)
 		if (flags & S_CTIME)
 			inode->i_ctime = *time;
 		if (flags & S_MTIME)
-			inode->i_mtime = *time;
+		inode->i_mtime = *time;
 	}
+	/*
+	 * If i_ts_dirty_day is zero, then either we have not deferred
+	 * timestamp updates, or the system has been up for less than
+	 * a day (so days_since_boot is zero), so we defer timestamp
+	 * updates in that case and set the I_DIRTY_TIME flag.  If a
+	 * day or more has passed, then i_ts_dirty_day will be
+	 * different from days_since_boot, and then we should update
+	 * the on-disk inode and then we can clear i_ts_dirty_day.
+	 */
 	if ((inode->i_sb->s_flags & MS_LAZYTIME) &&
 	    !(flags & S_VERSION)) {
 		if (inode->i_state & I_DIRTY_TIME)
 			return 0;
-		spin_lock(&inode->i_lock);
-		inode->i_state |= I_DIRTY_TIME;
-		spin_unlock(&inode->i_lock);
-		return 0;
+		get_monotonic_boottime(&uptime);
+		daycode = div_u64(uptime.tv_sec, (HZ * 86400));
+		if (!inode->i_ts_dirty_day ||
+		    inode->i_ts_dirty_day == daycode) {
+			spin_lock(&inode->i_lock);
+			inode->i_state |= I_DIRTY_TIME;
+			spin_unlock(&inode->i_lock);
+			inode->i_ts_dirty_day = daycode;
+			return 0;
+		}
 	}
+	inode->i_ts_dirty_day = 0;
 	if (inode->i_op->write_time)
 		return inode->i_op->write_time(inode);
 	mark_inode_dirty_sync(inode);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 489b2f2..e3574cd 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -575,6 +575,7 @@  struct inode {
 	struct timespec		i_ctime;
 	spinlock_t		i_lock;	/* i_blocks, i_bytes, maybe i_size */
 	unsigned short          i_bytes;
+	unsigned short		i_ts_dirty_day;
 	unsigned int		i_blkbits;
 	blkcnt_t		i_blocks;