From patchwork Thu May 26 17:26:28 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Mason
X-Patchwork-Id: 9137293
Date: Thu, 26 May 2016 13:26:28 -0400
From: Chris Mason
To: Zygo Blaxell
CC: , Filipe Manana
Subject: Re: mixed inline, non-inline extents leading to EIO when reading small files
Message-ID: <20160526172628.pd6ps5ieuxbxwno7@floor.thefacebook.com>
Mail-Followup-To: Chris Mason, Zygo Blaxell, linux-btrfs@vger.kernel.org, Filipe Manana
References: <20160526161952.GH15597@hungrycats.org>
In-Reply-To: <20160526161952.GH15597@hungrycats.org>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
X-Mailing-List: linux-btrfs@vger.kernel.org

On Thu, May 26, 2016 at 12:19:52PM -0400, Zygo Blaxell wrote:
> I frequently see these in /etc/lvm/backup/*.  Something that LVM does
> when it writes these files triggers the problem.  This problem occurs
> in kernels 3.18..4.4.11 (i.e. all the kernels I've tested).
>
> btrfs-debug-tree finds this:
>
>         item 26 key (2702988 INODE_ITEM 0) itemoff 12632 itemsize 160
>                 inode generation 49642 transid 49799 size 7856 nbytes 8192
>                 block group 0 mode 100644 links 1 uid 0 gid 0
>                 rdev 0 flags 0x0
>         item 27 key (2702988 INODE_REF 2799) itemoff 12617 itemsize 15
>                 inode ref index 4 namelen 5 name: volgr
>         item 28 key (2702988 EXTENT_DATA 0) itemoff 11247 itemsize 1370
>                 inline extent data size 1349 ram 4096 compress(zlib)
>         item 29 key (2702988 EXTENT_DATA 4096) itemoff 11194 itemsize 53
>                 extent data disk byte 1161560064 nr 4096
>                 extent data offset 0 nr 4096 ram 4096
>                 extent compression(none)
>
> When the problem occurs it usually affects all files in /etc/lvm/backup.
> I have seen it randomly in other parts of the filesystem but it's much
> rarer elsewhere.
>
> Attempts to read this file return EIO.  There are no errors reported in
> scrub or kmesg.
>
> Filesystem is mounted with options:
>
>         noatime,compress-force=zlib,flushoncommit,space_cache,skip_balance,commit=300
>
> Am I missing anything?

I've got this queued up to send out.  We hit this with holes instead of
extents, and without compression, but it's a similar problem:

Btrfs: deal with duplicates during extent_map insertion in btrfs_get_extent

When dealing with inline extents, btrfs_get_extent will incorrectly try
to insert a duplicate extent_map.  The dup hits -EEXIST from
add_extent_map, but then we try to merge with the existing one and end
up trying to insert a zero length extent_map.

This actually works most of the time, except when there are extent maps
past the end of the inline extent.  rocksdb will trigger this sometimes
because it preallocates an extent and then truncates down.
Josef made a script to trigger with xfs_io:

	#!/bin/bash
	xfs_io -f -c "pwrite 0 1000" inline
	xfs_io -c "falloc -k 4k 1M" inline
	xfs_io -c "pread 0 1000" -c "fadvise -d 0 1000" -c "pread 0 1000" inline
	xfs_io -c "fadvise -d 0 1000" inline
	cat inline

You'll get EIOs trying to read inline after this because add_extent_map
is returning EEXIST

Signed-off-by: Chris Mason
---
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 98a3ba2..4352589 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6914,7 +6914,18 @@ insert:
 		 * existing will always be non-NULL, since there must be
 		 * extent causing the -EEXIST.
 		 */
-		if (start >= extent_map_end(existing) ||
+		if (existing->start == em->start &&
+		    extent_map_end(existing) == extent_map_end(em) &&
+		    em->block_start == existing->block_start) {
+			/*
+			 * these two extents are the same, it happens
+			 * with inlines especially
+			 */
+			free_extent_map(em);
+			em = existing;
+			err = 0;
+
+		} else if (start >= extent_map_end(existing) ||
 		    start <= existing->start) {
 			/*
 			 * The existing extent map is the one nearest to