From patchwork Fri Dec 6 13:58:37 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Liu Bo
X-Patchwork-Id: 3297661
Date: Fri, 6 Dec 2013
 21:58:37 +0800
From: Liu Bo
To: Pedro Fonseca
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Null pointer oops when deleting item in btrfs_find_all_root()
Message-ID: <20131206135836.GD20595@localhost.localdomain>
Reply-To: bo.li.liu@oracle.com
References: <52A1CAA5.8090302@mpi-sws.org>
In-Reply-To: <52A1CAA5.8090302@mpi-sws.org>

On Fri, Dec 06, 2013 at 02:01:25PM +0100, Pedro Fonseca wrote:
> Hi,
>
> I've encountered another null pointer bug in btrfs_find_all_root().
>
> It may be related to a bug I previously reported to the mailing
> list ("Null pointer dereference bug in btrfs_find_all_root"). But
> this test ran on kernel version 3.12.2 and the oops was triggered
> when deleting an item from the list. The actual workload (i.e. FS
> operations) is similar though.

I'm not sure whether the following commit[1] was merged into 3.12.2.
Could you check?

-liubo

[1]:
commit 48ec47364b6d493f0a9cdc116977bf3f34e5c3ec
Author: Liu Bo
Date:   Wed Oct 30 13:25:24 2013 +0800

    Btrfs: fix a crash when running balance and defrag concurrently

    Running balance and defrag concurrently can end up with a crash:

    kernel BUG at fs/btrfs/relocation.c:4528!
    RIP: 0010:[] [] btrfs_reloc_cow_block+0x1eb/0x230 [btrfs]
    Call Trace:
     [] ? update_ref_for_cow+0x241/0x380 [btrfs]
     [] ? copy_extent_buffer+0xad/0x110 [btrfs]
     [] __btrfs_cow_block+0x3a1/0x520 [btrfs]
     [] btrfs_cow_block+0x116/0x1b0 [btrfs]
     [] btrfs_search_slot+0x43d/0x970 [btrfs]
     [] btrfs_lookup_file_extent+0x37/0x40 [btrfs]
     [] __btrfs_drop_extents+0x11e/0xae0 [btrfs]
     [] ? generic_bin_search.constprop.39+0x8d/0x1a0 [btrfs]
     [] ? kmem_cache_alloc+0x1da/0x200
     [] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
     [] btrfs_drop_extents+0x60/0x90 [btrfs]
     [] relink_extent_backref+0x2ed/0x780 [btrfs]
     [] ? btrfs_submit_bio_hook+0x1e0/0x1e0 [btrfs]
     [] ? iterate_inodes_from_logical+0x87/0xa0 [btrfs]
     [] btrfs_finish_ordered_io+0x229/0xac0 [btrfs]
     [] finish_ordered_fn+0x15/0x20 [btrfs]
     [] worker_loop+0x125/0x4e0 [btrfs]
     [] ? btrfs_queue_worker+0x300/0x300 [btrfs]
     [] kthread+0xc0/0xd0
     [] ? insert_kthread_work+0x40/0x40
     [] ret_from_fork+0x7c/0xb0
     [] ? insert_kthread_work+0x40/0x40
    ----------------------------------------------------------------------

    It turns out that the balance operation bumps the root's
    @last_snapshot, which enables the snapshot-aware defrag path. Backref
    walking then finds the data reloc tree as the refs' parent and hits
    the BUG_ON() during COW.

    As the data reloc tree's data exists only for relocation and is
    deleted right after relocation is done, there is no need to walk the
    refs that belong to the data reloc tree, so it is better to skip them.

    Signed-off-by: Liu Bo
    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

---
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 721936a..30d24cf 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -185,6 +185,9 @@ static int __add_prelim_ref(struct list_head *head, u64 root_id,
 {
 	struct __prelim_ref *ref;
 
+	if (root_id == BTRFS_DATA_RELOC_TREE_OBJECTID)
+		return 0;
+
 	ref = kmem_cache_alloc(btrfs_prelim_ref_cache, gfp_mask);
 	if (!ref)
 		return -ENOMEM;
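
For anyone reading the hunk above without the kernel tree at hand, its effect can be modelled in userspace. What follows is only a minimal sketch under stated assumptions, not the btrfs implementation: struct prelim_ref, struct ref_list, add_prelim_ref() and free_refs() are hypothetical stand-ins, malloc() replaces the kmem cache, and a plain singly linked list replaces list_head. Only the constant's value (-9ULL) is taken from the kernel's definition of BTRFS_DATA_RELOC_TREE_OBJECTID.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Same value as the kernel's BTRFS_DATA_RELOC_TREE_OBJECTID (-9ULL).
 * Everything else in this file is a hypothetical userspace stand-in. */
#define DATA_RELOC_TREE_OBJECTID ((uint64_t)-9)

/* Hypothetical, simplified preliminary backref. */
struct prelim_ref {
	uint64_t root_id;
	uint64_t bytenr;
	struct prelim_ref *next;
};

/* Hypothetical list of queued refs (stands in for the list_head). */
struct ref_list {
	struct prelim_ref *first;
	size_t count;
};

/* Returns 0 on success, including when the ref is skipped,
 * and -1 if the allocation fails. */
static int add_prelim_ref(struct ref_list *list, uint64_t root_id,
			  uint64_t bytenr)
{
	struct prelim_ref *ref;

	assert(list != NULL);

	/* Guard modelled on the patch: refs owned by the data reloc tree
	 * are reported as handled but never queued for later walking. */
	if (root_id == DATA_RELOC_TREE_OBJECTID)
		return 0;

	ref = malloc(sizeof(*ref));
	if (!ref)
		return -1;

	ref->root_id = root_id;
	ref->bytenr = bytenr;
	ref->next = list->first;
	list->first = ref;
	list->count++;
	return 0;
}

/* Release every queued ref and reset the list. */
static void free_refs(struct ref_list *list)
{
	while (list->first) {
		struct prelim_ref *ref = list->first;

		list->first = ref->next;
		free(ref);
	}
	list->count = 0;
}
```

In this model, a call with an ordinary root id queues one ref, while a call with DATA_RELOC_TREE_OBJECTID returns 0 and leaves the list untouched. Returning success rather than an error mirrors the patch: callers keep going, and the later backref walk simply never sees the relocation-only refs.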