From patchwork Fri Nov 30 16:52:13 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 10706725
From: Josef Bacik
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Cc: Josef Bacik
Subject: [PATCH 1/2] btrfs: catch cow on deleting snapshots
Date: Fri, 30 Nov 2018 11:52:13 -0500
Message-Id: <20181130165214.17883-2-josef@toxicpanda.com>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20181130165214.17883-1-josef@toxicpanda.com>
References: <20181130165214.17883-1-josef@toxicpanda.com>
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Josef Bacik

When debugging some weird extent reference bug I suspected that we were
changing a snapshot while we
were deleting it, which could explain my bug.  This was indeed what was
happening, and this patch helped me verify my theory.  It is never correct
to modify the snapshot once it's being deleted, so mark the root when we
are deleting it and make sure we complain about it when it happens.

Signed-off-by: Josef Bacik
Reviewed-by: Filipe Manana
---
 fs/btrfs/ctree.c       | 3 +++
 fs/btrfs/ctree.h       | 1 +
 fs/btrfs/extent-tree.c | 9 +++++++++
 3 files changed, 13 insertions(+)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 5912a97b07a6..5f82f86085e8 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
 	u64 search_start;
 	int ret;
 
+	if (test_bit(BTRFS_ROOT_DELETING, &root->state))
+		WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being dropped\n");
+
 	if (trans->transaction != fs_info->running_transaction)
 		WARN(1, KERN_CRIT "trans %llu running %llu\n",
 		       trans->transid,
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index facde70c15ed..5a3a94ccb65c 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1199,6 +1199,7 @@ enum {
 	BTRFS_ROOT_FORCE_COW,
 	BTRFS_ROOT_MULTI_LOG_TASKS,
 	BTRFS_ROOT_DIRTY,
+	BTRFS_ROOT_DELETING,
 };
 
 /*
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 581c2a0b2945..dcb699dd57f3 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -9333,6 +9333,15 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 	if (block_rsv)
 		trans->block_rsv = block_rsv;
 
+	/*
+	 * This will help us catch people modifying the fs tree while we're
+	 * dropping it.  It is unsafe to mess with the fs tree while it's being
+	 * dropped as we unlock the root node and parent nodes as we walk down
+	 * the tree, assuming nothing will change.  If something does change
+	 * then we'll have stale information and drop references to blocks we've
+	 * already dropped.
+	 */
+	set_bit(BTRFS_ROOT_DELETING, &root->state);
 	if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) {
 		level = btrfs_header_level(root->node);
 		path->nodes[level] = btrfs_lock_root_node(root);

From patchwork Fri Nov 30 16:52:14 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 10706727
From: Josef Bacik
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Cc: Josef Bacik, stable@vger.kernel.org
Subject: [PATCH 2/2] btrfs: run delayed items before dropping the snapshot
Date: Fri, 30 Nov 2018 11:52:14 -0500
Message-Id: <20181130165214.17883-3-josef@toxicpanda.com>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20181130165214.17883-1-josef@toxicpanda.com>
References: <20181130165214.17883-1-josef@toxicpanda.com>
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Josef Bacik

With my delayed refs patches in place we started seeing a large number of
aborts in __btrfs_free_extent:

  BTRFS error (device sdb1): unable to find ref byte nr 91947008 parent 0 root 35964 owner 1 offset 0
  Call Trace:
   ? btrfs_merge_delayed_refs+0xaf/0x340
   __btrfs_run_delayed_refs+0x6ea/0xfc0
   ? btrfs_set_path_blocking+0x31/0x60
   btrfs_run_delayed_refs+0xeb/0x180
   btrfs_commit_transaction+0x179/0x7f0
   ? btrfs_check_space_for_delayed_refs+0x30/0x50
   ? should_end_transaction.isra.19+0xe/0x40
   btrfs_drop_snapshot+0x41c/0x7c0
   btrfs_clean_one_deleted_snapshot+0xb5/0xd0
   cleaner_kthread+0xf6/0x120
   kthread+0xf8/0x130
   ? btree_invalidatepage+0x90/0x90
   ? kthread_bind+0x10/0x10
   ret_from_fork+0x35/0x40

This was because btrfs_drop_snapshot depends on the root not being modified
while it's dropping the snapshot.  It will unlock the root node (and really
every node) as it walks down the tree, only to re-lock it when it needs to
do something.  This is a problem because if we modify the tree we could cow
a block in our path, which frees our reference to that block.  Then once we
get back to that shared block we'll free our reference to it again, and get
ENOENT when trying to look up our extent reference to that block in
__btrfs_free_extent.

This is ultimately happening because we have delayed items left to be
processed for our deleted snapshot _after_ all of the inodes are closed for
the snapshot.  We only run the delayed inode item if we're deleting the
inode, and even then we do not run the delayed insertions or delayed
removals.  These can be run at any point after our final inode does its
last iput, which is what triggers the snapshot deletion.  We can end up
with the snapshot deletion happening and then have the delayed items run on
that file system, resulting in the above problem.
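
To make the double drop above concrete, here is a minimal userspace sketch,
not kernel code.  Every toy_* name is hypothetical and exists only for this
illustration: toy_free_ref() stands in for __btrfs_free_extent(), and
toy_drop_walk() stands in for the walker that unlocks a node and later comes
back to drop the reference it remembered.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy model: one shared tree block and our root's reference. */
struct toy_block {
	bool ref_from_root;	/* does our root still hold its reference? */
};

/* Stand-in for __btrfs_free_extent(): a missing reference yields -ENOENT. */
int toy_free_ref(struct toy_block *b)
{
	if (!b->ref_from_root)
		return -ENOENT;
	b->ref_from_root = false;
	return 0;
}

/* Stand-in for COWing a block in the path: the old block loses our ref. */
int toy_cow(struct toy_block *old)
{
	return toy_free_ref(old);
}

/*
 * Stand-in for the snapshot walker.  It looked at `b`, unlocked it, and now
 * returns to drop the reference.  `modified_meanwhile` models a delayed item
 * running in the unlocked window and COWing the same block.
 */
int toy_drop_walk(struct toy_block *b, bool modified_meanwhile)
{
	if (modified_meanwhile)
		(void)toy_cow(b);
	return toy_free_ref(b);
}

/* Runs both scenarios; aborts via assert() if the model misbehaves. */
void toy_demo(void)
{
	struct toy_block quiet = { .ref_from_root = true };
	struct toy_block raced = { .ref_from_root = true };

	assert(toy_drop_walk(&quiet, false) == 0);
	assert(toy_drop_walk(&raced, true) == -ENOENT);
}
```

Calling toy_demo() from a main() exercises both cases: the undisturbed walk
drops the reference once, while the raced walk tries to drop it a second
time and sees -ENOENT, which is the failure in the trace above.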

This problem has existed forever; however, my patches made it much easier to
hit, as I wake up the cleaner much more often to deal with delayed iputs.
That made us more likely to start the snapshot dropping work before the
transaction commits, which is when the delayed items would generally be
run.  Before, generally speaking, we would run the delayed items, commit
the transaction, and wake up the cleaner thread to start deleting snapshots,
which means we were less likely to hit this problem.  You could still hit
it if you had multiple snapshots to be deleted and ended up with lots of
delayed items, but it was definitely harder.

Fix for now by simply running all the delayed items before starting to drop
the snapshot.  We could make this smarter in the future by making the
delayed items per-root, and then simply dropping any delayed items for roots
that we are going to delete.  But for now a quick and easy solution is the
safest.

Cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik
Reviewed-by: Filipe Manana
---
 fs/btrfs/extent-tree.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index dcb699dd57f3..965702034b22 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -9330,6 +9330,8 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 		goto out_free;
 	}
 
+	btrfs_run_delayed_items(trans);
+
 	if (block_rsv)
 		trans->block_rsv = block_rsv;
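
As a rough illustration of how the two patches fit together, the following
userspace sketch models the DELETING flag check from patch 1 and the
flush-before-drop ordering from patch 2.  It is an assumption-laden toy, not
the kernel implementation: all toy_* names are hypothetical, the delayed
items are modelled per root for simplicity (the real
btrfs_run_delayed_items() is not per-root), and a counter replaces WARN().

```c
#include <assert.h>
#include <stdbool.h>

enum { TOY_ROOT_DELETING = 1u << 0 };	/* stands in for BTRFS_ROOT_DELETING */

struct toy_root {
	unsigned int state;	/* flag bits */
	int pending_items;	/* delayed items still queued for this root */
	int cow_while_deleting;	/* times the guard would have warned */
};

/* Models btrfs_cow_block() with the guard added by patch 1. */
void toy_cow_block(struct toy_root *root)
{
	if (root->state & TOY_ROOT_DELETING)
		root->cow_while_deleting++;	/* the kernel patch WARN()s here */
}

/* Models running delayed items: each one modifies (COWs) the tree. */
void toy_run_delayed_items(struct toy_root *root)
{
	while (root->pending_items > 0) {
		toy_cow_block(root);
		root->pending_items--;
	}
}

/*
 * Models btrfs_drop_snapshot().  `flush_first` selects the ordering from
 * patch 2.  Returns how many COWs hit the root after it was marked.
 */
int toy_drop_snapshot(struct toy_root *root, bool flush_first)
{
	if (flush_first)
		toy_run_delayed_items(root);

	root->state |= TOY_ROOT_DELETING;

	/* Anything still queued runs later, e.g. at transaction commit. */
	toy_run_delayed_items(root);
	return root->cow_while_deleting;
}

/* Runs both orderings; aborts via assert() if the model misbehaves. */
void toy_ordering_demo(void)
{
	struct toy_root unflushed = { .pending_items = 3 };
	struct toy_root flushed = { .pending_items = 3 };

	assert(toy_drop_snapshot(&unflushed, false) == 3);
	assert(toy_drop_snapshot(&flushed, true) == 0);
}
```

In this model, dropping without flushing lets all three queued items COW a
root that is already marked as being deleted, while flushing first leaves
nothing to run once the flag is set.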