From patchwork Mon Oct 31 19:02:45 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 9406297
From: Naohiro Aota
To: linux-fsdevel@vger.kernel.org
Cc: viro@zeniv.linux.org.uk, linux-kernel@vger.kernel.org, tytso@mit.edu,
 asraaiteng@gmail.com, Naohiro Aota
Subject: [PATCH][RESEND] fs: always set I_DIRTY_TIME to fsync correctly on lazytime
Date: Tue, 1 Nov 2016 04:02:45 +0900
Message-Id: <20161031190245.13404-1-naota@elisp.net>
X-Mailer: git-send-email 2.10.1
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

While the lazytime documentation states that "The on-disk timestamps are
updated only when: ... - the application employs fsync(2), syncfs(2), or
sync(2)" [1], fsync() does not actually write out a pending timestamp
update.

[1] http://manpages.ubuntu.com/manpages/xenial/man8/mount.8.html

The following commands reproduce the problem:

$ mount -o noatime,lazytime ext4.img /mnt/tmp
$ cd /mnt/tmp

(create a 128M file for fio beforehand, so that no size update is observed)
$ dd if=/dev/zero of=wxyz.0.0 bs=1M count=128

(do write/fsync)
$ fio --name wxyz --direct=1 --buffered=0 --size=128m --bs=64k --rw=write \
      --ioengine=sync --numjobs=1 --fsync=5

Since fio invokes one fsync per 5 writes, we should see rapid journal
commits for the timestamp updates when tracing the jbd2:jbd2_end_commit
trace point. However, all we see are periodic (~5 sec) commits from the
bdi flush, as below.

$ trace jbd2:jbd2_end_commit
  jbd2/loop0-8-1617  [002] ....    96.637351: jbd2_end_commit: dev 7,0 transaction 5393 sync 0 head 5393
  jbd2/loop0-8-1617  [000] ....   101.679411: jbd2_end_commit: dev 7,0 transaction 5394 sync 0 head 5393
  jbd2/loop0-8-1617  [003] ....   106.743628: jbd2_end_commit: dev 7,0 transaction 5395 sync 0 head 5393
  jbd2/loop0-8-1617  [001] ....   111.801964: jbd2_end_commit: dev 7,0 transaction 5396 sync 0 head 5393
  ...

The problem is that __mark_inode_dirty() does not always set
I_DIRTY_TIME. It may seem pointless to mark an inode I_DIRTY_TIME when
the inode is already I_DIRTY_INODE. However, because of that decision,
we skip the journal write whenever two fsync()s are invoked between two
bdi flushes. As the following table shows, any fsync after the first one
does nothing (if there is no update other than the timestamp).

Event                | i_state      | journal
---------------------+--------------+------------------------
write                | I_DIRTY_TIME | no write (lazytime)
fsync                | I_DIRTY_SYNC | write timestamp update
write                | I_DIRTY_SYNC | no write (lazytime)
fsync                | I_DIRTY_SYNC | no write *BUG*
...
bdi flush            | 0            |
write                | I_DIRTY_TIME | no write (lazytime)
fsync                | I_DIRTY_SYNC | write timestamp update

We should set I_DIRTY_TIME on the second timestamp update so that
fsync() notices there has been a timestamp update since the last inode
writeout.

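For illustration only (this is not part of the patch), the table above
can be modeled in user space as a rough sketch. All names below
(M_DIRTY_SYNC, M_DIRTY_TIME, struct model_inode, the model_* helpers and
count_journal_writes) are hypothetical stand-ins, and the model assumes
that fsync writes the timestamp only when the lazytime flag is set and
leaves the inode in a "sync dirty" state afterwards, as the table
describes. It does not reproduce the real locking or the other dirty
flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for I_DIRTY_SYNC / I_DIRTY_TIME (values illustrative). */
#define M_DIRTY_SYNC (1u << 0)
#define M_DIRTY_TIME (1u << 1)

struct model_inode {
	unsigned int state;
};

/* bdi flush: the inode is written back and becomes clean. */
static void model_bdi_flush(struct model_inode *ino)
{
	ino->state = 0;
}

/* Timestamp-only update under lazytime (the dirtytime path). */
static void model_touch(struct model_inode *ino, bool patched)
{
	if (ino->state & M_DIRTY_SYNC) {
		/* Old behaviour: bail out and lose the update; new: remember it. */
		if (patched)
			ino->state |= M_DIRTY_TIME;
		return;
	}
	ino->state |= M_DIRTY_TIME;
}

/* fsync: returns true when a timestamp update reaches the journal. */
static bool model_fsync(struct model_inode *ino)
{
	if (!(ino->state & M_DIRTY_TIME))
		return false;
	ino->state &= ~M_DIRTY_TIME;
	ino->state |= M_DIRTY_SYNC;
	return true;
}

/* Count journal writes for `rounds` write+fsync pairs after one bdi flush. */
int count_journal_writes(bool patched, int rounds)
{
	struct model_inode ino;
	int writes = 0;

	assert(rounds >= 0);
	model_bdi_flush(&ino);
	for (int i = 0; i < rounds; i++) {
		model_touch(&ino, patched);
		if (model_fsync(&ino))
			writes++;
	}
	return writes;
}
```

In this model, count_journal_writes(false, 4) yields 1 (only the first
fsync writes), while count_journal_writes(true, 4) yields 4, which
matches the difference between the two traces.
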
After this patch, we see journal commits in rapid succession:

$ trace jbd2:jbd2_end_commit
  jbd2/loop0-8-1879  [002] ....   208.275057: jbd2_end_commit: dev 7,0 transaction 5364 sync 0 head 3343
  jbd2/loop0-8-1879  [000] ....   208.302539: jbd2_end_commit: dev 7,0 transaction 5365 sync 0 head 3343
  jbd2/loop0-8-1879  [000] ....   208.327238: jbd2_end_commit: dev 7,0 transaction 5366 sync 0 head 3343
  jbd2/loop0-8-1879  [003] ....   208.347618: jbd2_end_commit: dev 7,0 transaction 5367 sync 0 head 3343
  ...

Reported-by: Asraa Ali Mardan
Signed-off-by: Naohiro Aota
Reviewed-by: Jan Kara
---
 fs/fs-writeback.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 05713a5..ace628c 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2100,16 +2100,17 @@ void __mark_inode_dirty(struct inode *inode, int flags)
 	 */
 	smp_mb();
 
-	if (((inode->i_state & flags) == flags) ||
-	    (dirtytime && (inode->i_state & I_DIRTY_INODE)))
+	if ((inode->i_state & flags) == flags)
 		return;
 
 	if (unlikely(block_dump))
 		block_dump___mark_inode_dirty(inode);
 
 	spin_lock(&inode->i_lock);
-	if (dirtytime && (inode->i_state & I_DIRTY_INODE))
+	if (dirtytime && (inode->i_state & I_DIRTY_INODE)) {
+		inode->i_state |= I_DIRTY_TIME;
 		goto out_unlock_inode;
+	}
 	if ((inode->i_state & flags) != flags) {
 		const int was_dirty = inode->i_state & I_DIRTY;
 