From patchwork Fri Jul 20 09:59:06 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 10536387 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 46A50602CA for ; Fri, 20 Jul 2018 09:59:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2F1D92937F for ; Fri, 20 Jul 2018 09:59:13 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 21527293FF; Fri, 20 Jul 2018 09:59:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9FB422937F for ; Fri, 20 Jul 2018 09:59:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730139AbeGTKqk (ORCPT ); Fri, 20 Jul 2018 06:46:40 -0400 Received: from mail.kernel.org ([198.145.29.99]:38006 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727243AbeGTKqk (ORCPT ); Fri, 20 Jul 2018 06:46:40 -0400 Received: from debian3.lan (bl8-197-74.dsl.telepac.pt [85.241.197.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 2A3CD20661 for ; Fri, 20 Jul 2018 09:59:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1532080750; bh=oUXpDKjEbNwttdFbh2afglwksmamtGSCFqhJAR0gYtY=; h=From:To:Subject:Date:In-Reply-To:References:From; b=efuJTQUPurCRzKCbR4ZPGeoptu2QZtM0J6M5qA8em5LVj2wzob52PnJE73pQ9pP7A rQsNq9Ms0IJ+wucHtRNYsDqwXeOLmLvDGUWhxQ0u2435tWsLXS8AUP+DdeaEzY1wg0 NG/ZXIXW4qjhoNdJ1MJ+gxN7QwQ7QBRH9qhX3CSw= From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2] Btrfs: fix mount failure after fsync due to hard link recreation Date: Fri, 20 Jul 2018 10:59:06 +0100 Message-Id: <20180720095906.3675-1-fdmanana@kernel.org> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20180719180154.20125-1-fdmanana@kernel.org> References: <20180719180154.20125-1-fdmanana@kernel.org> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Filipe Manana If we end up with logging an inode reference item which has the same name but different index from the one we have persisted, we end up failing when replaying the log with an errno value of -EEXIST. The error comes from btrfs_add_link(), which is called from add_inode_ref(), when we are replaying an inode reference item. Example scenario where this happens: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ touch /mnt/foo $ ln /mnt/foo /mnt/bar $ sync # Rename the first hard link (foo) to a new name and rename the second # hard link (bar) to the old name of the first hard link (foo). $ mv /mnt/foo /mnt/qwerty $ mv /mnt/bar /mnt/foo # Create a new file, in the same parent directory, with the old name of # the second hard link (bar) and fsync this new file. # We do this instead of calling fsync on foo/qwerty because if we did # that the fsync resulted in a full transaction commit, not triggering # the problem. $ touch /mnt/bar $ xfs_io -c "fsync" /mnt/bar $ mount /dev/sdb /mnt mount: mount /dev/sdb on /mnt failed: File exists So fix this by checking if a conflicting inode reference exists (same name, same parent but different index), removing it (and the associated dir index entries from the parent inode) if it exists, before attempting to add the new reference. A test case for fstests follows soon. Signed-off-by: Filipe Manana --- V2: added missing error check from btrfs_unlink_inode() fs/btrfs/tree-log.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 69 insertions(+) diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index f8220ec02036..5f3c33466c18 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -1291,6 +1291,49 @@ static int unlink_old_inode_refs(struct btrfs_trans_handle *trans, return ret; } +static int btrfs_inode_ref_exists(struct inode *inode, + struct inode *dir, + const u8 ref_type, + const char *name, + const int namelen) +{ + struct btrfs_key key; + struct btrfs_path *path; + const u64 parent_id = btrfs_ino(BTRFS_I(dir)); + int ret; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + key.objectid = btrfs_ino(BTRFS_I(inode)); + key.type = ref_type; + if (key.type == BTRFS_INODE_REF_KEY) + key.offset = parent_id; + else + key.offset = btrfs_extref_hash(parent_id, name, namelen); + + ret = btrfs_search_slot(NULL, BTRFS_I(inode)->root, &key, path, 0, 0); + if (ret < 0) + goto out; + if (ret > 0) { + ret = 0; + goto out; + } + if (key.type == BTRFS_INODE_EXTREF_KEY) + ret = btrfs_find_name_in_ext_backref(path->nodes[0], + path->slots[0], + parent_id, name, + namelen, NULL); + else + ret = btrfs_find_name_in_backref(path->nodes[0], path->slots[0], + name, namelen, NULL); + +out: + btrfs_free_path(path); + return ret; +} + /* * replay one inode back reference item found in the log tree. * eb, slot and key refer to the buffer and key found in the log tree. @@ -1400,6 +1443,32 @@ static noinline int add_inode_ref(struct btrfs_trans_handle *trans, } } + /* + * If a reference item already exists for this inode + * with the same parent and name, but different index, + * drop it and the corresponding directory index entries + * from the parent before adding the new reference item + * and dir index entries, otherwise we would fail with + * -EEXIST returned from btrfs_add_link() below. + */ + ret = btrfs_inode_ref_exists(inode, dir, key->type, + name, namelen); + if (ret > 0) { + ret = btrfs_unlink_inode(trans, root, + BTRFS_I(dir), + BTRFS_I(inode), + name, namelen); + /* + * If we dropped the link count to 0, bump it so + * that later the iput() on the inode will not + * free it. We will fixup the link count later. + */ + if (!ret && inode->i_nlink == 0) + inc_nlink(inode); + } + if (ret < 0) + goto out; + /* insert our name */ ret = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode),