From patchwork Wed Nov 9 23:26:50 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 9420325
From: Omar Sandoval
To: linux-btrfs@vger.kernel.org
Cc: kernel-team@fb.com
Subject: [PATCH] Btrfs: deal with existing encompassing extent map in btrfs_get_extent()
Date: Wed, 9 Nov 2016 15:26:50 -0800
Message-Id: <262a1e171d091626edbd23c637cb138ba9d84ed8.1478733376.git.osandov@fb.com>
X-Mailer: git-send-email 2.10.2
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Omar Sandoval

My QEMU VM was seeing inexplicable I/O errors that I tracked down to
errors coming from the qcow2 virtual drive in the host system. The qcow2
file is a nocow file on my Btrfs drive, which QEMU opens with O_DIRECT.
Every once in a while, pread() or pwrite() would return EEXIST, which
makes no sense. This turned out to be a bug in btrfs_get_extent().

Commit 8dff9c853410 ("Btrfs: deal with duplciates during extent_map
insertion in btrfs_get_extent") fixed a case in btrfs_get_extent() where
two threads race on adding the same extent map to an inode's extent map
tree. However, if the added em is merged with an adjacent em in the
extent tree, then we'll end up with an existing extent that is not
identical to but instead encompasses the extent we tried to add. When we
call merge_extent_mapping() to find the nonoverlapping part of the new
em, the arithmetic overflows because no such part exists. We then end up
trying to add a bogus em to the em_tree, which results in an EEXIST that
can bubble all the way up to userspace. Fix it by extending the
identical extent map special case.

Signed-off-by: Omar Sandoval
Reviewed-by: Liu Bo
---
Applies to 4.9-rc4. Here [1] is a reproducer for this bug that doesn't
involve firing up a QEMU VM. Also, a big shoutout to BCC [2] and BPF for
making it possible to debug this on my laptop without compiling a custom
kernel and rebooting just to add printks [3].

1: https://gist.github.com/osandov/d08aabe5d4dec15517e9fde17012fd3b
2: https://github.com/iovisor/bcc
3: https://gist.github.com/osandov/eb1db868ce10c3af9e00b90f3a65bf9f

 fs/btrfs/inode.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2b790bd..e5cf589 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7049,11 +7049,11 @@ struct extent_map *btrfs_get_extent(struct inode *inode, struct page *page,
 		 * extent causing the -EEXIST.
 		 */
 		if (existing->start == em->start &&
-		    extent_map_end(existing) == extent_map_end(em) &&
+		    extent_map_end(existing) >= extent_map_end(em) &&
 		    em->block_start == existing->block_start) {
 			/*
-			 * these two extents are the same, it happens
-			 * with inlines especially
+			 * The existing extent map already encompasses the
+			 * entire extent map we tried to add.
 			 */
 			free_extent_map(em);
 			em = existing;
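
The overflow described in the commit message can be illustrated with a
small user-space model. This is only a sketch, not the kernel code:
struct em_model and the helpers em_model_end(), trimmed_len(),
same_extent_old() and covered_new() are hypothetical names, and
trimmed_len() is an assumed simplification of the trimming that
merge_extent_mapping() performs (start the new mapping where the
existing one ends, keep its original end). The two predicates mirror the
condition before and after the change in the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for struct extent_map. */
struct em_model {
	uint64_t start;       /* file offset where the mapping begins */
	uint64_t len;         /* length of the mapping in bytes */
	uint64_t block_start; /* disk address backing the mapping */
};

static inline uint64_t em_model_end(const struct em_model *em)
{
	return em->start + em->len;
}

/*
 * Assumed simplification of the trimming step: the new mapping is cut
 * so that it begins where the existing mapping ends. When the existing
 * mapping ends past the new one, the unsigned subtraction wraps.
 */
static inline uint64_t trimmed_len(const struct em_model *existing,
				   const struct em_model *em)
{
	uint64_t start = em_model_end(existing);

	if (start < em->start)
		start = em->start;
	return em_model_end(em) - start;
}

/* Condition before the patch: the two mappings must be identical. */
static inline bool same_extent_old(const struct em_model *existing,
				   const struct em_model *em)
{
	return existing->start == em->start &&
	       em_model_end(existing) == em_model_end(em) &&
	       em->block_start == existing->block_start;
}

/* Condition after the patch: the existing mapping may extend further. */
static inline bool covered_new(const struct em_model *existing,
			       const struct em_model *em)
{
	return existing->start == em->start &&
	       em_model_end(existing) >= em_model_end(em) &&
	       em->block_start == existing->block_start;
}
```

With an existing mapping of start 0, length 8192 and a new mapping of
start 0, length 4096 (same block_start), same_extent_old() is false, so
the old code would fall through to the trimming step, where
trimmed_len() wraps to 2^64 - 4096. covered_new() is true for the same
input, so the existing mapping is reused and no trimming is attempted.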