From patchwork Thu Feb 25 11:28:27 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: OGAWA Hirofumi X-Patchwork-Id: 8423041 Return-Path: X-Original-To: patchwork-linux-fsdevel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id CB1EFC0553 for ; Thu, 25 Feb 2016 11:55:09 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id EE8FC2034E for ; Thu, 25 Feb 2016 11:55:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D0D8D200E5 for ; Thu, 25 Feb 2016 11:55:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759879AbcBYLzD (ORCPT ); Thu, 25 Feb 2016 06:55:03 -0500 Received: from mail.parknet.co.jp ([210.171.160.6]:53194 "EHLO mail.parknet.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752104AbcBYLzC (ORCPT ); Thu, 25 Feb 2016 06:55:02 -0500 X-Greylist: delayed 1592 seconds by postgrey-1.27 at vger.kernel.org; Thu, 25 Feb 2016 06:55:02 EST Received: from ibmpc.myhome.or.jp (unknown [210.171.168.39]) by mail.parknet.co.jp (Postfix) with ESMTP id C25D9170001; Thu, 25 Feb 2016 20:28:28 +0900 (JST) Received: from devron.myhome.or.jp (root@devron.myhome.or.jp [192.168.0.3]) by ibmpc.myhome.or.jp (8.15.2/8.15.2/Debian-3) with ESMTPS id u1PBSRwh008951 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 25 Feb 2016 20:28:28 +0900 Received: from devron.myhome.or.jp (hirofumi@localhost [127.0.0.1]) by devron.myhome.or.jp (8.15.2/8.15.2/Debian-3) with ESMTPS id u1PBSRfd013014 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 25 Feb 2016 20:28:27 +0900 Received: (from hirofumi@localhost) by devron.myhome.or.jp (8.15.2/8.15.2/Submit) id u1PBSRoA013013; Thu, 25 Feb 2016 20:28:27 +0900 From: OGAWA Hirofumi To: "Theodore Ts'o" Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH] ext4/jbd2: Fix jbd2_journal_destory() for umount path Date: Thu, 25 Feb 2016 20:28:27 +0900 Message-ID: <87fuwh58ro.fsf@mail.parknet.co.jp> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On umount path, jbd2_journal_destroy() writes latest transaction ID (->j_tail_sequence) to be used at next mount. The bug is that ->j_tail_sequence is not holding latest transaction ID in some cases. So, at next mount, there is chance to conflict with remaining (not overwritten yet) transactions. mount (id=10) write transaction (id=11) write transaction (id=12) umount (id=10) <= the bug doesn't write latest ID mount (id=10) write transaction (id=11) crash recovery transaction (id=11) transaction (id=12) <= valid transaction ID, but old commit must not replay Like above, this bug become the cause of recovery failure, or FS corruption. So why ->j_tail_sequence doesn't point latest ID? Because if checkpoint transactions was reclaimed by memory pressure (i.e. bdev_try_to_free_page()), then ->j_tail_sequence is not updated. (And another case is, __jbd2_journal_clean_checkpoint_list() is called with empty transaction.) So in above cases, ->j_tail_sequence is not pointing latest transaction ID at umount path. Plus, REQ_FLUSH for checkpoint is not done too. So, to fix this problem with minimum changes, this patch updates ->j_tail_sequence, and issue REQ_FLUSH. (With more complex changes, some optimizations would be possible to avoid unnecessary REQ_FLUSH for example though.) BTW, journal->j_tail_sequence = ++journal->j_transaction_sequence; Increment of ->j_transaction_sequence seems to be unnecessary, but ext3 does this. Signed-off-by: OGAWA Hirofumi --- fs/jbd2/journal.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff -puN fs/jbd2/journal.c~ext4-umount-fix fs/jbd2/journal.c --- linux/fs/jbd2/journal.c~ext4-umount-fix 2016-02-25 03:26:32.710407670 +0900 +++ linux-hirofumi/fs/jbd2/journal.c 2016-02-25 03:26:32.711407673 +0900 @@ -1412,7 +1412,7 @@ out: * Update a journal's dynamic superblock fields to show that journal is empty. * Write updated superblock to disk waiting for IO to complete. */ -static void jbd2_mark_journal_empty(journal_t *journal) +static void jbd2_mark_journal_empty(journal_t *journal, int write_op) { journal_superblock_t *sb = journal->j_superblock; @@ -1430,7 +1430,7 @@ static void jbd2_mark_journal_empty(jour sb->s_start = cpu_to_be32(0); read_unlock(&journal->j_state_lock); - jbd2_write_superblock(journal, WRITE_FUA); + jbd2_write_superblock(journal, write_op); /* Log is no longer empty */ write_lock(&journal->j_state_lock); @@ -1716,7 +1716,13 @@ int jbd2_journal_destroy(journal_t *jour if (journal->j_sb_buffer) { if (!is_journal_aborted(journal)) { mutex_lock(&journal->j_checkpoint_mutex); - jbd2_mark_journal_empty(journal); + + write_lock(&journal->j_state_lock); + journal->j_tail_sequence = + ++journal->j_transaction_sequence; + write_unlock(&journal->j_state_lock); + + jbd2_mark_journal_empty(journal, WRITE_FLUSH_FUA); mutex_unlock(&journal->j_checkpoint_mutex); } else err = -EIO; @@ -1975,7 +1981,7 @@ int jbd2_journal_flush(journal_t *journa * the magic code for a fully-recovered superblock. Any future * commits of data to the journal will restore the current * s_start value. */ - jbd2_mark_journal_empty(journal); + jbd2_mark_journal_empty(journal, WRITE_FUA); mutex_unlock(&journal->j_checkpoint_mutex); write_lock(&journal->j_state_lock); J_ASSERT(!journal->j_running_transaction); @@ -2021,7 +2027,7 @@ int jbd2_journal_wipe(journal_t *journal if (write) { /* Lock to make assertions happy... */ mutex_lock(&journal->j_checkpoint_mutex); - jbd2_mark_journal_empty(journal); + jbd2_mark_journal_empty(journal, WRITE_FUA); mutex_unlock(&journal->j_checkpoint_mutex); }