[RFC] btrfs: locking: Add extra check in btrfs_init_new_buffer() to avoid deadlock

[BUG]
For certain crafted images whose csum root leaf is missing its backref,
triggering a write with data csum can cause a deadlock, along with the
following kernel WARN_ON():
------
WARNING: CPU: 1 PID: 41 at fs/btrfs/locking.c:230 btrfs_tree_lock+0x3e2/0x400
CPU: 1 PID: 41 Comm: kworker/u4:1 Not tainted 4.18.0-rc1+ #8
Workqueue: btrfs-endio-write btrfs_endio_write_helper
RIP: 0010:btrfs_tree_lock+0x3e2/0x400
Call Trace:
 btrfs_alloc_tree_block+0x39f/0x770
 __btrfs_cow_block+0x285/0x9e0
 btrfs_cow_block+0x191/0x2e0
 btrfs_search_slot+0x492/0x1160
 btrfs_lookup_csum+0xec/0x280
 btrfs_csum_file_blocks+0x2be/0xa60
 add_pending_csums+0xaf/0xf0
 btrfs_finish_ordered_io+0x74b/0xc90
 finish_ordered_fn+0x15/0x20
 normal_work_helper+0xf6/0x500
 btrfs_endio_write_helper+0x12/0x20
 process_one_work+0x302/0x770
 worker_thread+0x81/0x6d0
 kthread+0x180/0x1d0
 ret_from_fork+0x35/0x40
---[ end trace 2e85051acb5f6dc1 ]---
------

[CAUSE]
The crafted image is missing the backref for the csum tree root leaf.
When we try to allocate a new tree block, since there is no
EXTENT/METADATA_ITEM for the csum tree root, btrfs considers its bytenr a
free slot and uses it.

The extent tree of the image looks like:
Normal image                      |       This fuzzed image
----------------------------------+--------------------------------
BG 29360128                       | BG 29360128
 One empty slot                   |  One empty slot
29364224: backref to UUID tree    | 29364224: backref to UUID tree
 Two empty slots                  |  Two empty slots
29376512: backref to CSUM tree    |  One empty slot (bad type) <<<
29380608: backref to D_RELOC tree | 29380608: backref to D_RELOC tree
...                               | ...

Since bytenr 29376512 has no METADATA/EXTENT_ITEM, it looks like a valid
free slot when btrfs tries to allocate a tree block.

And in btrfs_finish_ordered_io(), when we need to insert csums, we try to
CoW the csum tree root.

By *COINCIDENCE*, the empty slots at bytenr BG_OFFSET, BG_OFFSET + 8K and
BG_OFFSET + 12K are already used by tree block CoW for other trees, so
the next empty slot is BG_OFFSET + 16K, which should hold the backref for
the CSUM tree.

But due to the bad type, btrfs can't recognize it and still considers it
an empty slot, and will try to use it for csum tree CoW.
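
The slot selection described above can be replayed in userspace. This is
only a minimal sketch, not btrfs code: the struct, the helper names and the
BAD_TYPE value are hypothetical, and the use of METADATA_ITEM for the
healthy backrefs is an assumption. The bytenrs and the 4K slot size are
taken from the table above; 168 and 169 are the on-disk key types of
EXTENT_ITEM and METADATA_ITEM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Key types as defined by the btrfs on-disk format. */
#define EXTENT_ITEM_KEY   168
#define METADATA_ITEM_KEY 169
/* Hypothetical stand-in for the corrupted key type in the fuzzed image. */
#define BAD_TYPE          0

#define BG_OFFSET 29360128ULL
#define SLOT_SIZE 4096ULL	/* tree block size implied by the table */

/* Hypothetical, simplified extent tree item: one key per tree block. */
struct model_item {
	uint64_t bytenr;
	int type;
};

/* A slot counts as used only if a key of a recognized type covers it. */
static bool slot_has_backref(const struct model_item *items, size_t n,
			     uint64_t bytenr)
{
	assert(items != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (items[i].bytenr != bytenr)
			continue;
		if (items[i].type == EXTENT_ITEM_KEY ||
		    items[i].type == METADATA_ITEM_KEY)
			return true;
	}
	return false;
}

/*
 * Return the first slot of the block group that has neither a recognized
 * backref nor an in-flight allocation, or 0 if the modelled range is full.
 */
static uint64_t first_free_slot(const struct model_item *items, size_t n,
				const uint64_t *in_flight, size_t n_in_flight,
				size_t nr_slots)
{
	for (size_t s = 0; s < nr_slots; s++) {
		uint64_t bytenr = BG_OFFSET + s * SLOT_SIZE;
		bool busy = slot_has_backref(items, n, bytenr);

		for (size_t i = 0; i < n_in_flight && !busy; i++)
			busy = (in_flight[i] == bytenr);
		if (!busy)
			return bytenr;
	}
	return 0;
}

/* Extent tree of the fuzzed image, following the table above. */
static const struct model_item fuzzed[] = {
	{ 29364224ULL, METADATA_ITEM_KEY },	/* UUID tree */
	{ 29376512ULL, BAD_TYPE },		/* CSUM tree, type corrupted */
	{ 29380608ULL, METADATA_ITEM_KEY },	/* D_RELOC tree */
};

/* Slots already handed out by CoW of other trees. */
static const uint64_t in_flight[] = {
	BG_OFFSET, BG_OFFSET + 8192, BG_OFFSET + 12288,
};
```

Calling first_free_slot(fuzzed, 3, in_flight, 3, 8) walks the slots in
order and stops at BG_OFFSET + 16K, i.e. 29376512, the block that still
holds the live csum tree root.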

Then in the following call chain, we try to lock the new tree block,
which turns out to be the old csum tree root, which is already
locked:

btrfs_search_slot() called on csum tree root, which is at 29376512
|- btrfs_cow_block()
   |- btrfs_set_lock_block()
   |  |- Now locks tree block 29376512 (old csum tree root)
   |- __btrfs_cow_block()
      |- btrfs_alloc_tree_block()
         |- btrfs_reserve_extent()
            | Now it returns tree block 29376512, which the extent tree
            | shows as an empty slot, but is already held by the csum tree
            |- btrfs_init_new_buffer()
               |- btrfs_tree_lock()
                  | Triggers WARN_ON(eb->lock_owner == current->pid)
                  |- wait_event()
                     Wait for the lock owner to release the lock, but
                     it's locked by ourselves, so it will deadlock

[FIX]
This patch checks lock_owner against current->pid in
btrfs_init_new_buffer(), so the above deadlock can be avoided.

Since such a problem can only happen with a crafted image, we still
trigger a kernel warning, but with a slightly more meaningful warning
message.
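
The shape of such a check can be sketched in userspace as follows. This is
not the diff of this patch: model_init_new_buffer() and struct model_eb are
hypothetical, the message text is made up for illustration, and returning
-EUCLEAN is an assumption, since the description above does not say which
error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value, only needed on non-Linux hosts */
#endif

/* Hypothetical stand-in for struct extent_buffer. */
struct model_eb {
	uint64_t start;
	int lock_owner;		/* 0 means nobody holds the lock */
};

/*
 * Model of the check: refuse a "new" buffer that the current task has
 * already locked, instead of locking it again and waiting forever.
 * Returns 0 on success or a negative errno (assumed: -EUCLEAN).
 */
static int model_init_new_buffer(struct model_eb *buf, int current_pid)
{
	assert(buf != NULL);
	if (buf->lock_owner == current_pid) {
		fprintf(stderr,
			"tree block %llu already locked by pid=%d, extent tree corruption suspected\n",
			(unsigned long long)buf->start, current_pid);
		return -EUCLEAN;
	}
	/* Stands in for btrfs_tree_lock() on a block nobody else holds. */
	buf->lock_owner = current_pid;
	return 0;
}
```

With the csum root from the trace (bytenr 29376512, already owned by pid
41), model_init_new_buffer() fails for pid 41, while a block with no owner
is initialized normally.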

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200405
Reported-by: Xu Wen <wen.xu@gatech.edu>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
changelog:
v2:
  Modify btrfs_tree_lock() to return int, so that a possible deadlock
  can be detected and reported to the caller.
  Titled as "btrfs: locking: Allow btrfs_tree_lock() to return error to avoid deadlock"
v3:
  Instead of modifying all btrfs_tree_lock() callers, only check for a
  possible deadlock in btrfs_init_new_buffer(), as the bug only happens
  for newly allocated tree blocks.
  With a better commit message describing the on-disk extent tree
  corruption, along with the call trace of how the deadlock happens.

Hi David,

This v3 should explain the bug in more detail and have minimal impact on
existing function callers.

Thanks,
Qu
---
 fs/btrfs/extent-tree.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Message ID	20180814055121.6077-1-wqu@suse.com
State	New, archived
From	Qu Wenruo <wqu@suse.com>
To	linux-btrfs@vger.kernel.org, dsterba@suse.cz
Date	Tue, 14 Aug 2018 13:51:21 +0800
