From patchwork Sat Apr 1 06:34:49 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 9657499
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
 by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 4091A60351
 for ; Sat, 1 Apr 2017 06:35:32 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 33F77286C8
 for ; Sat, 1 Apr 2017 06:35:32 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
 id 28CBD286D1; Sat, 1 Apr 2017 06:35:32 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
 pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED,
 RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9A5FC286C8
 for ; Sat, 1 Apr 2017 06:35:31 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1751225AbdDAGfb (ORCPT );
 Sat, 1 Apr 2017 02:35:31 -0400
Received: from bombadil.infradead.org ([65.50.211.133]:45707 "EHLO
 bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1750964AbdDAGfa (ORCPT );
 Sat, 1 Apr 2017 02:35:30 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
 d=infradead.org; s=bombadil.20170209;
 h=References:In-Reply-To:Message-Id:
 Date:Subject:Cc:To:From:Sender:Reply-To:MIME-Version:Content-Type:
 Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date:
 Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:
 List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive;
 bh=YaaiHK/Loux8Z/3SE3RqSYdbO4GPgb5DtPvoip1KJTA=; b=oHR9lWRJXmi/tEUfDLPvABup6
 udKtGxOfpLLlM8bbo3uF+fkHSMTdZ1IS+z3GzgCKV2MB3SlDkds3+mM9eSQvQblL4t3ECJXjrCcs+
 dFjXrKeUHDSOTqLyzddQYVB8pWh4NqGfv9Kw7D9EClCe1qThu7p/lSxO4FUlD/moHlfaS4HbOi11M
 d5pxELpxcbtpBAtFDglHCcHphmQF6KPK2ODdz5NLeJcrK8/6K6mHtWTIzouE/HgOYecakaCrH4dvv
 cX3hzQxByHszneSF4EERxIfyBwuAv+XWzBTW8J9Ynf92MnosG7FMRQUboBFK5dExVRQD+Sw8YUAEK
 BSoxnpOSQ==;
Received: from 77.117.150.49.wireless.dyn.drei.com ([77.117.150.49]
 helo=localhost) by bombadil.infradead.org with esmtpsa
 (Exim 4.87 #1 (Red Hat Linux)) id 1cuCdF-0000kY-5j;
 Sat, 01 Apr 2017 06:35:29 +0000
From: Christoph Hellwig
To: stable@vger.kernel.org
Cc: linux-xfs@vger.kernel.org, Brian Foster, "Darrick J . Wong"
Subject: [PATCH 03/26] xfs: fix eofblocks race with file extending async dio
 writes
Date: Sat, 1 Apr 2017 08:34:49 +0200
Message-Id: <20170401063512.25313-4-hch@lst.de>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20170401063512.25313-1-hch@lst.de>
References: <20170401063512.25313-1-hch@lst.de>
X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org.
 See http://www.infradead.org/rpr.html
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Brian Foster

commit e4229d6b0bc9280f29624faf170cf76a9f1ca60e upstream.

It's possible for post-eof blocks to end up being used for direct I/O
writes. dio write performs an upfront unwritten extent allocation, sends
the dio and then updates the inode size (if necessary) on write
completion. If a file release occurs while a file extending dio write is
in flight, it is possible to mistake the post-eof blocks for speculative
preallocation and incorrectly truncate them from the inode. This means
that the resulting dio write completion can discover a hole and allocate
new blocks rather than perform unwritten extent conversion.

This requires a strange mix of I/O and is thus not likely to reproduce
in real world workloads. It is intermittently reproduced by generic/299.
The error manifests as an assert failure due to transaction overrun
because the aforementioned write completion transaction has only
reserved enough blocks for btree operations:

  XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, \
   file: fs/xfs//xfs_trans.c, line: 309

The root cause is that xfs_free_eofblocks() uses i_size to truncate
post-eof blocks from the inode, but async, file extending direct writes
do not update i_size until write completion, long after inode locks are
dropped. Therefore, xfs_free_eofblocks() effectively truncates the inode
to the incorrect size.

Update xfs_free_eofblocks() to serialize against dio similar to how
extending writes are serialized against i_size updates before post-eof
block zeroing. Specifically, wait on dio while under the iolock. This
ensures that dio write completions have updated i_size before post-eof
blocks are processed.

Signed-off-by: Brian Foster
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
---
 fs/xfs/xfs_bmap_util.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 9319ee9759d4..eb890ed1ed5c 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -959,6 +959,9 @@ xfs_free_eofblocks(
 		if (error)
 			return error;
 
+		/* wait on dio to ensure i_size has settled */
+		inode_dio_wait(VFS_I(ip));
+
 		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0,
 				&tp);
 		if (error) {
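
For readers who want to see the ordering problem in isolation, here is a
rough single-threaded userspace model of the race described in the commit
message. It is only a sketch under stated assumptions, not kernel code:
every toy_* name is hypothetical, offsets are counted in whole blocks, and
"waiting" on dio is modelled by simply running the outstanding completion.
It replays submit, release, completion with and without the wait and
reports whether the completion found a hole.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy inode; all offsets are in blocks, not bytes. */
struct toy_inode {
	size_t i_size;          /* only updated at dio completion */
	size_t alloc_end;       /* end of allocated (unwritten) blocks */
	size_t dio_pending_end; /* end offset of the in-flight dio */
	int    dio_count;       /* number of dios in flight */
	bool   found_hole;      /* completion found its blocks missing */
};

/* File-extending async dio: allocate up front, leave i_size alone. */
static void toy_dio_submit(struct toy_inode *ip, size_t new_end)
{
	if (new_end > ip->alloc_end)
		ip->alloc_end = new_end;
	ip->dio_pending_end = new_end;
	ip->dio_count++;
}

/* Write completion: check the blocks are still there, then bump i_size. */
static void toy_dio_complete(struct toy_inode *ip)
{
	assert(ip->dio_count > 0);
	if (ip->alloc_end < ip->dio_pending_end)
		ip->found_hole = true;
	if (ip->dio_pending_end > ip->i_size)
		ip->i_size = ip->dio_pending_end;
	ip->dio_count--;
}

/* Stand-in for inode_dio_wait(): drain outstanding completions. */
static void toy_inode_dio_wait(struct toy_inode *ip)
{
	while (ip->dio_count > 0)
		toy_dio_complete(ip);
}

/* Trim everything past i_size, optionally waiting on dio first. */
static void toy_free_eofblocks(struct toy_inode *ip, bool wait_for_dio)
{
	if (wait_for_dio)
		toy_inode_dio_wait(ip);
	if (ip->alloc_end > ip->i_size)
		ip->alloc_end = ip->i_size;
}

/* Replay submit -> release -> completion; true if a hole was found. */
bool toy_replay(bool wait_for_dio)
{
	struct toy_inode ip = { .i_size = 4, .alloc_end = 4 };

	toy_dio_submit(&ip, 8);
	toy_free_eofblocks(&ip, wait_for_dio);
	if (ip.dio_count > 0)
		toy_dio_complete(&ip);
	return ip.found_hole;
}
```

In this model toy_replay(false) trims the blocks the dio was issued
against, so the completion finds a hole, while toy_replay(true) lets
i_size settle first and nothing is trimmed. The real fix relies on the
iolock to keep new dio from being submitted during the wait, which this
toy does not attempt to model.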