From patchwork Thu Jan 17 16:36:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10768569 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D915F746 for ; Thu, 17 Jan 2019 16:36:43 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CAB303034D for ; Thu, 17 Jan 2019 16:36:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C940130376; Thu, 17 Jan 2019 16:36:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 692A6304AB for ; Thu, 17 Jan 2019 16:36:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728864AbfAQQgn (ORCPT ); Thu, 17 Jan 2019 11:36:43 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:53116 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726854AbfAQQgm (ORCPT ); Thu, 17 Jan 2019 11:36:42 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=0WI8ktZxJ5kl34XDH77EQXr0AbKFQuuB4VcUqUExDrk=; b=sp0QToEv7MeNX7IoRQ6FXkJsP/ 52h4ZS3xdqetIv3wRDCxleInn9Bt6uN+G/4v2UlF4V7hY9b13J89qZyGAZWmoUc3/vvPnvBXoMa07 XwLUp+23wOk+xPG1XQ160RymefJ9vT8a90j8Bj68MY6/RHrIFf41sF3doEcZPixoFSjp1UWzOpb8q esvFa6OnTtVYd5XewUQoWTDi9nwp7Q/3t0BSo5dLypV6DPtDTmppmmwl3/9pq/iUXmswUv93cDMNk h/tVdFzcaC1DKQAFrVOEv6LYy8ZspaL1ItA/aoTrUv868hYxmzx6FrIbovObZFFb9/mcxb1n7f0Ex bT9K3oeg==; Received: from 089144213167.atnat0022.highway.a1.net ([89.144.213.167] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1gkAen-00045b-QH; Thu, 17 Jan 2019 16:36:42 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org Cc: "Darrick J . Wong" Subject: [PATCH 01/11] xfs: remove xfs_trim_extent_eof Date: Thu, 17 Jan 2019 17:36:26 +0100 Message-Id: <20190117163636.23171-2-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190117163636.23171-1-hch@lst.de> References: <20190117163636.23171-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Opencoding this function in the only caller makes it blindly obvious what is going on instead of having to look at two files for that. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_bmap.c | 11 ----------- fs/xfs/libxfs/xfs_bmap.h | 1 - fs/xfs/xfs_aops.c | 2 +- 3 files changed, 1 insertion(+), 13 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 332eefa2700b..4c73927819c2 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -3685,17 +3685,6 @@ xfs_trim_extent( } } -/* trim extent to within eof */ -void -xfs_trim_extent_eof( - struct xfs_bmbt_irec *irec, - struct xfs_inode *ip) - -{ - xfs_trim_extent(irec, 0, XFS_B_TO_FSB(ip->i_mount, - i_size_read(VFS_I(ip)))); -} - /* * Trim the returned map to the required bounds */ diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h index 09d3ea97cc15..b4ff710d7250 100644 --- a/fs/xfs/libxfs/xfs_bmap.h +++ b/fs/xfs/libxfs/xfs_bmap.h @@ -181,7 +181,6 @@ static inline bool xfs_bmap_is_real_extent(struct xfs_bmbt_irec *irec) void xfs_trim_extent(struct xfs_bmbt_irec *irec, xfs_fileoff_t bno, xfs_filblks_t len); -void xfs_trim_extent_eof(struct xfs_bmbt_irec *, struct xfs_inode *); int xfs_bmap_add_attrfork(struct xfs_inode *ip, int size, int rsvd); int xfs_bmap_set_attrforkoff(struct xfs_inode *ip, int size, int *version); void xfs_bmap_local_to_extents_empty(struct xfs_inode *ip, int whichfork); diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 338b9d9984e0..d7275075878e 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -329,7 +329,7 @@ xfs_map_blocks( * mechanism to protect us from arbitrary extent modifying contexts, not * just eofblocks. */ - xfs_trim_extent_eof(&wpc->imap, ip); + xfs_trim_extent(&wpc->imap, 0, XFS_B_TO_FSB(mp, i_size_read(inode))); /* * COW fork blocks can overlap data fork blocks even if the blocks From patchwork Thu Jan 17 16:36:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10768571 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EA98B746 for ; Thu, 17 Jan 2019 16:36:46 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D89473046C for ; Thu, 17 Jan 2019 16:36:46 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CCB162F35B; Thu, 17 Jan 2019 16:36:46 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BF48C30507 for ; Thu, 17 Jan 2019 16:36:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728675AbfAQQgp (ORCPT ); Thu, 17 Jan 2019 11:36:45 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:53124 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726854AbfAQQgp (ORCPT ); Thu, 17 Jan 2019 11:36:45 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=KoHTu3n21isY0qpl07A5aGfg+A3oQjQb+n76ru20hls=; b=YTSfvHU9QoZ0gZhbJg8tsogccJ nmKOSWcG+zcxpDRemav/vup7eKWu1z6dL8ZmnSa5CvO/GWg0l+BKFTw5CF+djh2BHeeNqeZBc/aqg etg4gqjv41gumoQU1YlIgVP2+d1xtrLNeTkx930VLGQXUKXn0aRXUoEQYwEN6MSjRd9Ld+xjk/biz 49doJJH6KSyBjqmMgHgQISXJxHkHMU1Z2jhEzzaFsd+wom1EmInfD7hxK0sdvBHl1MWV46o23d6ew QdR5EStkdFV2ndJkwVyzTYffcp0fsOoHaXu+7jfejl1heHWqPUlZn7r8mJlHC7FIHKWVuISYY04rS pPZQze0w==; Received: from 089144213167.atnat0022.highway.a1.net ([89.144.213.167] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1gkAeq-00045u-CB; Thu, 17 Jan 2019 16:36:44 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org Cc: "Darrick J . Wong" Subject: [PATCH 02/11] xfs: remove the io_type field from the writeback context and ioend Date: Thu, 17 Jan 2019 17:36:27 +0100 Message-Id: <20190117163636.23171-3-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190117163636.23171-1-hch@lst.de> References: <20190117163636.23171-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The io_type field contains what is basically a summary of information from the inode fork and the imap. But we can just as easily use that information directly, simplifying a few bits here and there and improving the trace points. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong --- fs/xfs/xfs_aops.c | 91 ++++++++++++++++++++------------------------ fs/xfs/xfs_aops.h | 24 +----------- fs/xfs/xfs_iomap.c | 8 ++-- fs/xfs/xfs_reflink.c | 2 +- fs/xfs/xfs_trace.h | 34 +++++++---------- 5 files changed, 62 insertions(+), 97 deletions(-) diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index d7275075878e..8fec6fd4c632 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -28,7 +28,7 @@ */ struct xfs_writepage_ctx { struct xfs_bmbt_irec imap; - unsigned int io_type; + int fork; unsigned int cow_seq; struct xfs_ioend *ioend; }; @@ -255,30 +255,20 @@ xfs_end_io( */ error = blk_status_to_errno(ioend->io_bio->bi_status); if (unlikely(error)) { - switch (ioend->io_type) { - case XFS_IO_COW: + if (ioend->io_fork == XFS_COW_FORK) xfs_reflink_cancel_cow_range(ip, offset, size, true); - break; - } - goto done; } /* - * Success: commit the COW or unwritten blocks if needed. + * Success: commit the COW or unwritten blocks if needed. */ - switch (ioend->io_type) { - case XFS_IO_COW: + if (ioend->io_fork == XFS_COW_FORK) error = xfs_reflink_end_cow(ip, offset, size); - break; - case XFS_IO_UNWRITTEN: - /* writeback should never update isize */ + else if (ioend->io_state == XFS_EXT_UNWRITTEN) error = xfs_iomap_write_unwritten(ip, offset, size, false); - break; - default: + else ASSERT(!xfs_ioend_is_append(ioend) || ioend->io_append_trans); - break; - } done: if (ioend->io_append_trans) @@ -293,7 +283,8 @@ xfs_end_bio( struct xfs_ioend *ioend = bio->bi_private; struct xfs_mount *mp = XFS_I(ioend->io_inode)->i_mount; - if (ioend->io_type == XFS_IO_UNWRITTEN || ioend->io_type == XFS_IO_COW) + if (ioend->io_fork == XFS_COW_FORK || + ioend->io_state == XFS_EXT_UNWRITTEN) queue_work(mp->m_unwritten_workqueue, &ioend->io_work); else if (ioend->io_append_trans) queue_work(mp->m_data_workqueue, &ioend->io_work); @@ -313,7 +304,6 @@ xfs_map_blocks( xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset), end_fsb; xfs_fileoff_t cow_fsb = NULLFILEOFF; struct xfs_bmbt_irec imap; - int whichfork = XFS_DATA_FORK; struct xfs_iext_cursor icur; bool imap_valid; int error = 0; @@ -350,7 +340,7 @@ xfs_map_blocks( offset_fsb < wpc->imap.br_startoff + wpc->imap.br_blockcount; if (imap_valid && (!xfs_inode_has_cow_data(ip) || - wpc->io_type == XFS_IO_COW || + wpc->fork == XFS_COW_FORK || wpc->cow_seq == READ_ONCE(ip->i_cowfp->if_seq))) return 0; @@ -382,6 +372,9 @@ xfs_map_blocks( if (cow_fsb != NULLFILEOFF && cow_fsb <= offset_fsb) { wpc->cow_seq = READ_ONCE(ip->i_cowfp->if_seq); xfs_iunlock(ip, XFS_ILOCK_SHARED); + + wpc->fork = XFS_COW_FORK; + /* * Truncate can race with writeback since writeback doesn't * take the iolock and truncate decreases the file size before @@ -394,11 +387,13 @@ xfs_map_blocks( * will kill the contents anyway. */ if (offset > i_size_read(inode)) { - wpc->io_type = XFS_IO_HOLE; + wpc->imap.br_blockcount = end_fsb - offset_fsb; + wpc->imap.br_startoff = offset_fsb; + wpc->imap.br_startblock = HOLESTARTBLOCK; + wpc->imap.br_state = XFS_EXT_NORM; return 0; } - whichfork = XFS_COW_FORK; - wpc->io_type = XFS_IO_COW; + goto allocate_blocks; } @@ -419,12 +414,14 @@ xfs_map_blocks( imap.br_startoff = end_fsb; /* fake a hole past EOF */ xfs_iunlock(ip, XFS_ILOCK_SHARED); + wpc->fork = XFS_DATA_FORK; + if (imap.br_startoff > offset_fsb) { /* landed in a hole or beyond EOF */ imap.br_blockcount = imap.br_startoff - offset_fsb; imap.br_startoff = offset_fsb; imap.br_startblock = HOLESTARTBLOCK; - wpc->io_type = XFS_IO_HOLE; + imap.br_state = XFS_EXT_NORM; } else { /* * Truncate to the next COW extent if there is one. This is the @@ -436,30 +433,23 @@ xfs_map_blocks( cow_fsb < imap.br_startoff + imap.br_blockcount) imap.br_blockcount = cow_fsb - imap.br_startoff; - if (isnullstartblock(imap.br_startblock)) { - /* got a delalloc extent */ - wpc->io_type = XFS_IO_DELALLOC; + /* got a delalloc extent? */ + if (isnullstartblock(imap.br_startblock)) goto allocate_blocks; - } - - if (imap.br_state == XFS_EXT_UNWRITTEN) - wpc->io_type = XFS_IO_UNWRITTEN; - else - wpc->io_type = XFS_IO_OVERWRITE; } wpc->imap = imap; - trace_xfs_map_blocks_found(ip, offset, count, wpc->io_type, &imap); + trace_xfs_map_blocks_found(ip, offset, count, wpc->fork, &imap); return 0; allocate_blocks: - error = xfs_iomap_write_allocate(ip, whichfork, offset, &imap, + error = xfs_iomap_write_allocate(ip, wpc->fork, offset, &imap, &wpc->cow_seq); if (error) return error; - ASSERT(whichfork == XFS_COW_FORK || cow_fsb == NULLFILEOFF || + ASSERT(wpc->fork == XFS_COW_FORK || cow_fsb == NULLFILEOFF || imap.br_startoff + imap.br_blockcount <= cow_fsb); wpc->imap = imap; - trace_xfs_map_blocks_alloc(ip, offset, count, wpc->io_type, &imap); + trace_xfs_map_blocks_alloc(ip, offset, count, wpc->fork, &imap); return 0; } @@ -484,7 +474,7 @@ xfs_submit_ioend( int status) { /* Convert CoW extents to regular */ - if (!status && ioend->io_type == XFS_IO_COW) { + if (!status && ioend->io_fork == XFS_COW_FORK) { /* * Yuk. This can do memory allocation, but is not a * transactional operation so everything is done in GFP_KERNEL @@ -502,7 +492,8 @@ xfs_submit_ioend( /* Reserve log space if we might write beyond the on-disk inode size. */ if (!status && - ioend->io_type != XFS_IO_UNWRITTEN && + (ioend->io_fork == XFS_COW_FORK || + ioend->io_state != XFS_EXT_UNWRITTEN) && xfs_ioend_is_append(ioend) && !ioend->io_append_trans) status = xfs_setfilesize_trans_alloc(ioend); @@ -531,7 +522,8 @@ xfs_submit_ioend( static struct xfs_ioend * xfs_alloc_ioend( struct inode *inode, - unsigned int type, + int fork, + xfs_exntst_t state, xfs_off_t offset, struct block_device *bdev, sector_t sector) @@ -545,7 +537,8 @@ xfs_alloc_ioend( ioend = container_of(bio, struct xfs_ioend, io_inline_bio); INIT_LIST_HEAD(&ioend->io_list); - ioend->io_type = type; + ioend->io_fork = fork; + ioend->io_state = state; ioend->io_inode = inode; ioend->io_size = 0; ioend->io_offset = offset; @@ -606,13 +599,15 @@ xfs_add_to_ioend( sector = xfs_fsb_to_db(ip, wpc->imap.br_startblock) + ((offset - XFS_FSB_TO_B(mp, wpc->imap.br_startoff)) >> 9); - if (!wpc->ioend || wpc->io_type != wpc->ioend->io_type || + if (!wpc->ioend || + wpc->fork != wpc->ioend->io_fork || + wpc->imap.br_state != wpc->ioend->io_state || sector != bio_end_sector(wpc->ioend->io_bio) || offset != wpc->ioend->io_offset + wpc->ioend->io_size) { if (wpc->ioend) list_add(&wpc->ioend->io_list, iolist); - wpc->ioend = xfs_alloc_ioend(inode, wpc->io_type, offset, - bdev, sector); + wpc->ioend = xfs_alloc_ioend(inode, wpc->fork, + wpc->imap.br_state, offset, bdev, sector); } if (!__bio_try_merge_page(wpc->ioend->io_bio, page, len, poff)) { @@ -721,7 +716,7 @@ xfs_writepage_map( error = xfs_map_blocks(wpc, inode, file_offset); if (error) break; - if (wpc->io_type == XFS_IO_HOLE) + if (wpc->imap.br_startblock == HOLESTARTBLOCK) continue; xfs_add_to_ioend(inode, file_offset, page, iop, wpc, wbc, &submit_list); @@ -916,9 +911,7 @@ xfs_vm_writepage( struct page *page, struct writeback_control *wbc) { - struct xfs_writepage_ctx wpc = { - .io_type = XFS_IO_HOLE, - }; + struct xfs_writepage_ctx wpc = { }; int ret; ret = xfs_do_writepage(page, wbc, &wpc); @@ -932,9 +925,7 @@ xfs_vm_writepages( struct address_space *mapping, struct writeback_control *wbc) { - struct xfs_writepage_ctx wpc = { - .io_type = XFS_IO_HOLE, - }; + struct xfs_writepage_ctx wpc = { }; int ret; xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED); diff --git a/fs/xfs/xfs_aops.h b/fs/xfs/xfs_aops.h index e5c23948a8ab..6c2615b83c5d 100644 --- a/fs/xfs/xfs_aops.h +++ b/fs/xfs/xfs_aops.h @@ -8,33 +8,13 @@ extern struct bio_set xfs_ioend_bioset; -/* - * Types of I/O for bmap clustering and I/O completion tracking. - * - * This enum is used in string mapping in xfs_trace.h; please keep the - * TRACE_DEFINE_ENUMs for it up to date. - */ -enum { - XFS_IO_HOLE, /* covers region without any block allocation */ - XFS_IO_DELALLOC, /* covers delalloc region */ - XFS_IO_UNWRITTEN, /* covers allocated but uninitialized data */ - XFS_IO_OVERWRITE, /* covers already allocated extent */ - XFS_IO_COW, /* covers copy-on-write extent */ -}; - -#define XFS_IO_TYPES \ - { XFS_IO_HOLE, "hole" }, \ - { XFS_IO_DELALLOC, "delalloc" }, \ - { XFS_IO_UNWRITTEN, "unwritten" }, \ - { XFS_IO_OVERWRITE, "overwrite" }, \ - { XFS_IO_COW, "CoW" } - /* * Structure for buffered I/O completions. */ struct xfs_ioend { struct list_head io_list; /* next ioend in chain */ - unsigned int io_type; /* delalloc / unwritten */ + int io_fork; /* inode fork written back */ + xfs_exntst_t io_state; /* extent state */ struct inode *io_inode; /* file being written to */ size_t io_size; /* size of the extent */ xfs_off_t io_offset; /* offset in the file */ diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 27c93b5f029d..32a7c169e096 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -575,7 +575,7 @@ xfs_file_iomap_begin_delay( goto out_unlock; } - trace_xfs_iomap_found(ip, offset, count, 0, &got); + trace_xfs_iomap_found(ip, offset, count, XFS_DATA_FORK, &got); goto done; } @@ -647,7 +647,7 @@ xfs_file_iomap_begin_delay( * them out if the write happens to fail. */ iomap->flags |= IOMAP_F_NEW; - trace_xfs_iomap_alloc(ip, offset, count, 0, &got); + trace_xfs_iomap_alloc(ip, offset, count, XFS_DATA_FORK, &got); done: if (isnullstartblock(got.br_startblock)) got.br_startblock = DELAYSTARTBLOCK; @@ -1139,7 +1139,7 @@ xfs_file_iomap_begin( return error; iomap->flags |= IOMAP_F_NEW; - trace_xfs_iomap_alloc(ip, offset, length, 0, &imap); + trace_xfs_iomap_alloc(ip, offset, length, XFS_DATA_FORK, &imap); out_finish: if (xfs_ipincount(ip) && (ip->i_itemp->ili_fsync_fields @@ -1155,7 +1155,7 @@ xfs_file_iomap_begin( out_found: ASSERT(nimaps); xfs_iunlock(ip, lockmode); - trace_xfs_iomap_found(ip, offset, length, 0, &imap); + trace_xfs_iomap_found(ip, offset, length, XFS_DATA_FORK, &imap); goto out_finish; out_unlock: diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index c5b4fa004ca4..2babc2cbe103 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1192,7 +1192,7 @@ xfs_reflink_remap_blocks( break; ASSERT(nimaps == 1); - trace_xfs_reflink_remap_imap(src, srcoff, len, XFS_IO_OVERWRITE, + trace_xfs_reflink_remap_imap(src, srcoff, len, XFS_DATA_FORK, &imap); /* Translate imap into the destination file. */ diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 6fcc893dfc91..f75c6d380543 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -1218,23 +1218,17 @@ DEFINE_EVENT(xfs_readpage_class, name, \ DEFINE_READPAGE_EVENT(xfs_vm_readpage); DEFINE_READPAGE_EVENT(xfs_vm_readpages); -TRACE_DEFINE_ENUM(XFS_IO_HOLE); -TRACE_DEFINE_ENUM(XFS_IO_DELALLOC); -TRACE_DEFINE_ENUM(XFS_IO_UNWRITTEN); -TRACE_DEFINE_ENUM(XFS_IO_OVERWRITE); -TRACE_DEFINE_ENUM(XFS_IO_COW); - DECLARE_EVENT_CLASS(xfs_imap_class, TP_PROTO(struct xfs_inode *ip, xfs_off_t offset, ssize_t count, - int type, struct xfs_bmbt_irec *irec), - TP_ARGS(ip, offset, count, type, irec), + int whichfork, struct xfs_bmbt_irec *irec), + TP_ARGS(ip, offset, count, whichfork, irec), TP_STRUCT__entry( __field(dev_t, dev) __field(xfs_ino_t, ino) __field(loff_t, size) __field(loff_t, offset) __field(size_t, count) - __field(int, type) + __field(int, whichfork) __field(xfs_fileoff_t, startoff) __field(xfs_fsblock_t, startblock) __field(xfs_filblks_t, blockcount) @@ -1245,33 +1239,33 @@ DECLARE_EVENT_CLASS(xfs_imap_class, __entry->size = ip->i_d.di_size; __entry->offset = offset; __entry->count = count; - __entry->type = type; + __entry->whichfork = whichfork; __entry->startoff = irec ? irec->br_startoff : 0; __entry->startblock = irec ? irec->br_startblock : 0; __entry->blockcount = irec ? irec->br_blockcount : 0; ), TP_printk("dev %d:%d ino 0x%llx size 0x%llx offset 0x%llx count %zd " - "type %s startoff 0x%llx startblock %lld blockcount 0x%llx", + "fork %s startoff 0x%llx startblock %lld blockcount 0x%llx", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->ino, __entry->size, __entry->offset, __entry->count, - __print_symbolic(__entry->type, XFS_IO_TYPES), + __entry->whichfork == XFS_COW_FORK ? "cow" : "data", __entry->startoff, (int64_t)__entry->startblock, __entry->blockcount) ) -#define DEFINE_IOMAP_EVENT(name) \ +#define DEFINE_IMAP_EVENT(name) \ DEFINE_EVENT(xfs_imap_class, name, \ TP_PROTO(struct xfs_inode *ip, xfs_off_t offset, ssize_t count, \ - int type, struct xfs_bmbt_irec *irec), \ - TP_ARGS(ip, offset, count, type, irec)) -DEFINE_IOMAP_EVENT(xfs_map_blocks_found); -DEFINE_IOMAP_EVENT(xfs_map_blocks_alloc); -DEFINE_IOMAP_EVENT(xfs_iomap_alloc); -DEFINE_IOMAP_EVENT(xfs_iomap_found); + int whichfork, struct xfs_bmbt_irec *irec), \ + TP_ARGS(ip, offset, count, whichfork, irec)) +DEFINE_IMAP_EVENT(xfs_map_blocks_found); +DEFINE_IMAP_EVENT(xfs_map_blocks_alloc); +DEFINE_IMAP_EVENT(xfs_iomap_alloc); +DEFINE_IMAP_EVENT(xfs_iomap_found); DECLARE_EVENT_CLASS(xfs_simple_io_class, TP_PROTO(struct xfs_inode *ip, xfs_off_t offset, ssize_t count), @@ -3078,7 +3072,7 @@ DEFINE_EVENT(xfs_inode_irec_class, name, \ DEFINE_INODE_EVENT(xfs_reflink_set_inode_flag); DEFINE_INODE_EVENT(xfs_reflink_unset_inode_flag); DEFINE_ITRUNC_EVENT(xfs_reflink_update_inode_size); -DEFINE_IOMAP_EVENT(xfs_reflink_remap_imap); +DEFINE_IMAP_EVENT(xfs_reflink_remap_imap); TRACE_EVENT(xfs_reflink_remap_blocks_loop, TP_PROTO(struct xfs_inode *src, xfs_fileoff_t soffset, xfs_filblks_t len, struct xfs_inode *dest, From patchwork Thu Jan 17 16:36:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10768573 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C0DA613BF for ; Thu, 17 Jan 2019 16:36:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B05BC2F35B for ; Thu, 17 Jan 2019 16:36:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AEB823045D; Thu, 17 Jan 2019 16:36:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 44BD23045E for ; Thu, 17 Jan 2019 16:36:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728728AbfAQQgr (ORCPT ); Thu, 17 Jan 2019 11:36:47 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:53128 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726854AbfAQQgr (ORCPT ); Thu, 17 Jan 2019 11:36:47 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=cTnbLLEBkNfc7U0Fv7GcOIfsGpgG7CbNERTYvjn0F2U=; b=WftYTcSE7RJS0FQK3nSXO6KfLA Dx7xmgeXOGcmBoXkKlVmuRuN1/98LYyqj9BE2sNyCoITRwd6SriXS1Ky/iOSqnmIp8kDllX4vgjZ5 w/1OVppI8UA1OMtW0vA8Pe1R0L74XDUp935LTT41Vl/9PrWg05dPuvE8aQP5KQtWJyVI1aE7C3e0i zYZyvseVhC2cEtuxhQrBA3P9/CWFOWamGyt8iDQ65jInjnZzWFnn3dbJgLBErB9ARkMCWAd8dMGgA upqlnWWhWpSEUDaotn5J83EjPMZYcEs+WkMr31zj+52dDUqTumUOOYnWFoAI+zvcYg6QtDELaNxla pNQxOj9Q==; Received: from 089144213167.atnat0022.highway.a1.net ([89.144.213.167] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1gkAet-00046C-1Q; Thu, 17 Jan 2019 16:36:47 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org Cc: "Darrick J . Wong" Subject: [PATCH 03/11] xfs: remove the s_maxbytes checks in xfs_map_blocks Date: Thu, 17 Jan 2019 17:36:28 +0100 Message-Id: <20190117163636.23171-4-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190117163636.23171-1-hch@lst.de> References: <20190117163636.23171-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We already ensure all data fits into s_maxbytes in the write / fault path. The only reason we have them here is that they were copy and pasted from xfs_bmapi_read when we stopped using that function. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong --- fs/xfs/xfs_aops.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 8fec6fd4c632..5b6fab283316 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -301,7 +301,8 @@ xfs_map_blocks( struct xfs_inode *ip = XFS_I(inode); struct xfs_mount *mp = ip->i_mount; ssize_t count = i_blocksize(inode); - xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset), end_fsb; + xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset); + xfs_fileoff_t end_fsb = XFS_B_TO_FSB(mp, offset + count); xfs_fileoff_t cow_fsb = NULLFILEOFF; struct xfs_bmbt_irec imap; struct xfs_iext_cursor icur; @@ -356,11 +357,6 @@ xfs_map_blocks( xfs_ilock(ip, XFS_ILOCK_SHARED); ASSERT(ip->i_d.di_format != XFS_DINODE_FMT_BTREE || (ip->i_df.if_flags & XFS_IFEXTENTS)); - ASSERT(offset <= mp->m_super->s_maxbytes); - - if (offset > mp->m_super->s_maxbytes - count) - count = mp->m_super->s_maxbytes - offset; - end_fsb = XFS_B_TO_FSB(mp, (xfs_ufsize_t)offset + count); /* * Check if this is offset is covered by a COW extents, and if yes use From patchwork Thu Jan 17 16:36:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10768575 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id ADC2913BF for ; Thu, 17 Jan 2019 16:36:52 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9E9D72FA0A for ; Thu, 17 Jan 2019 16:36:52 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9CEE23046C; Thu, 17 Jan 2019 16:36:52 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F0FEA2FBBB for ; Thu, 17 Jan 2019 16:36:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728856AbfAQQgu (ORCPT ); Thu, 17 Jan 2019 11:36:50 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:53140 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728792AbfAQQgu (ORCPT ); Thu, 17 Jan 2019 11:36:50 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:To:From:Sender: Reply-To:Cc:Content-Type:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=VFT3cyYap7AnpInSBbi5CLapgHN3iPbxY3Du7KiZZ4k=; b=SRa2gIQ6Vgsp633H3/k9ssAYR Are4G6jOuKbth+Yu1L58HCQJuG6jIId/Y1f1FjxrEya6R5WjWQMyOPY1spdJPjuiluIy47x6ljsCV Z4Et84ScE7gDBNIacPjBhGCIXbit2Gc4ClhkasfSKMA5NWmuOCNNhqq3Rrd6GcZGLVJTErXGVBKPp Lmw0KnKogIO3m4jEvOGAe6SbEnUwiw/piwEbiw67ALNTvKggv/zEf1p/zbroNoUhHtoqS4HxlPvSN Ttzvwy31nAztbY6LyI3lLK2/Zngp/3++pawI0kpgXuBQY8g8Z8fpHzHlnbe4Ndpjv0BMMB0MoOTI3 0XdFnG5ZA==; Received: from 089144213167.atnat0022.highway.a1.net ([89.144.213.167] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1gkAev-00046U-J3 for linux-xfs@vger.kernel.org; Thu, 17 Jan 2019 16:36:50 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org Subject: [PATCH 04/11] xfs: rework the truncate race handling in the writeback path Date: Thu, 17 Jan 2019 17:36:29 +0100 Message-Id: <20190117163636.23171-5-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190117163636.23171-1-hch@lst.de> References: <20190117163636.23171-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We currently try to handle the races with truncate and COW to data fork conversion rather ad-hoc in a few places in the writeback path: - xfs_map_blocks contains an i_size check for the COW fork only, and returns an imap containing a hole to make the writeback code skip the rest of the page - xfs_iomap_write_allocate does another i_size check under ilock, and does an extent tree lookup to find the last extent to skip everthing beyond that, returning -EAGAIN if either is invalid to make the writeback code exit early - xfs_bmapi_write can ignore holes for delalloc conversions, but only does so if called for the COW fork Replace this with a coherent scheme: - check i_size first in xfs_map_blocks, and skip any processing if we already are beyond i_size by presenting a hole until the end of the file to the caller - in xfs_iomap_write_allocate check i_size again, and return -EAGAIN if we are beyond it now that we've taken ilock. - Skip holes for all delalloc conversion in xfs_bmapi_write instead of doing a separate lookup before calling it - in xfs_map_blocks retry the case where xfs_iomap_write_allocate could not perform a conversion one single time if we were on a COW fork to handle the race where an extent moved from the COW to the data fork, and else return a hole to skip writeback as we must have races with writeback Overall this greatly simplifies the code, makes it more robust and also handles the COW to data fork race properly that we did not handle previosuly. Signed-off-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_bmap.c | 27 ++++----- fs/xfs/xfs_aops.c | 61 +++++++++++++------- fs/xfs/xfs_iomap.c | 121 ++++++++++++--------------------------- 3 files changed, 87 insertions(+), 122 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 4c73927819c2..48e61dd59105 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -4305,28 +4305,21 @@ xfs_bmapi_write( /* in hole or beyond EOF? */ if (eof || bma.got.br_startoff > bno) { /* - * CoW fork conversions should /never/ hit EOF or - * holes. There should always be something for us - * to work on. + * It is possible that the extents have changed since + * we did the read call as we dropped the ilock for a + * while. We have to be careful about truncates or hole + * punchs here - we are not allowed to allocate + * non-delalloc blocks here. + * + * The only protection against truncation is the pages + * for the range we are being asked to convert are + * locked and hence a truncate will block on them + * first. */ ASSERT(!((flags & XFS_BMAPI_CONVERT) && (flags & XFS_BMAPI_COWFORK))); if (flags & XFS_BMAPI_DELALLOC) { - /* - * For the COW fork we can reasonably get a - * request for converting an extent that races - * with other threads already having converted - * part of it, as there converting COW to - * regular blocks is not protected using the - * IOLOCK. - */ - ASSERT(flags & XFS_BMAPI_COWFORK); - if (!(flags & XFS_BMAPI_COWFORK)) { - error = -EIO; - goto error0; - } - if (eof || bno >= end) break; } else { diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 5b6fab283316..ef05164651dc 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -300,6 +300,7 @@ xfs_map_blocks( { struct xfs_inode *ip = XFS_I(inode); struct xfs_mount *mp = ip->i_mount; + loff_t isize = i_size_read(inode); ssize_t count = i_blocksize(inode); xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset); xfs_fileoff_t end_fsb = XFS_B_TO_FSB(mp, offset + count); @@ -308,6 +309,15 @@ xfs_map_blocks( struct xfs_iext_cursor icur; bool imap_valid; int error = 0; + int retries = 0; + + /* + * If the offset is beyong the inode size we know that we raced with + * truncate and are done now. Note that we'll recheck this again + * under the ilock later before doing delalloc conversions. + */ + if (offset > isize) + goto eof; /* * We have to make sure the cached mapping is within EOF to protect @@ -320,7 +330,7 @@ xfs_map_blocks( * mechanism to protect us from arbitrary extent modifying contexts, not * just eofblocks. */ - xfs_trim_extent(&wpc->imap, 0, XFS_B_TO_FSB(mp, i_size_read(inode))); + xfs_trim_extent(&wpc->imap, 0, XFS_B_TO_FSB(mp, isize)); /* * COW fork blocks can overlap data fork blocks even if the blocks @@ -354,6 +364,7 @@ xfs_map_blocks( * into real extents. If we return without a valid map, it means we * landed in a hole and we skip the block. */ +retry: xfs_ilock(ip, XFS_ILOCK_SHARED); ASSERT(ip->i_d.di_format != XFS_DINODE_FMT_BTREE || (ip->i_df.if_flags & XFS_IFEXTENTS)); @@ -370,26 +381,6 @@ xfs_map_blocks( xfs_iunlock(ip, XFS_ILOCK_SHARED); wpc->fork = XFS_COW_FORK; - - /* - * Truncate can race with writeback since writeback doesn't - * take the iolock and truncate decreases the file size before - * it starts truncating the pages between new_size and old_size. - * Therefore, we can end up in the situation where writeback - * gets a CoW fork mapping but the truncate makes the mapping - * invalid and we end up in here trying to get a new mapping. - * bail out here so that we simply never get a valid mapping - * and so we drop the write altogether. The page truncation - * will kill the contents anyway. - */ - if (offset > i_size_read(inode)) { - wpc->imap.br_blockcount = end_fsb - offset_fsb; - wpc->imap.br_startoff = offset_fsb; - wpc->imap.br_startblock = HOLESTARTBLOCK; - wpc->imap.br_state = XFS_EXT_NORM; - return 0; - } - goto allocate_blocks; } @@ -440,13 +431,39 @@ xfs_map_blocks( allocate_blocks: error = xfs_iomap_write_allocate(ip, wpc->fork, offset, &imap, &wpc->cow_seq); - if (error) + if (error) { + if (error == -EAGAIN) + goto truncate_race; return error; + } ASSERT(wpc->fork == XFS_COW_FORK || cow_fsb == NULLFILEOFF || imap.br_startoff + imap.br_blockcount <= cow_fsb); wpc->imap = imap; trace_xfs_map_blocks_alloc(ip, offset, count, wpc->fork, &imap); return 0; + +truncate_race: + /* + * If we failed to find the extent in the COW fork we might have raced + * with a COW to data fork conversion or truncate. Restart the lookup + * to catch the extent in the data fork for the former case, but prevent + * additional retries to avoid looping forever for the latter case. + */ + if (wpc->fork == XFS_COW_FORK && !retries++) { + imap_valid = false; + goto retry; + } +eof: + /* + * If we raced with truncate there might be no data left at this offset. + * In that case we need to return a hole so that the writeback code + * skips writeback for the rest of the file. + */ + wpc->imap.br_startoff = offset_fsb; + wpc->imap.br_blockcount = end_fsb - offset_fsb; + wpc->imap.br_startblock = HOLESTARTBLOCK; + wpc->imap.br_state = XFS_EXT_NORM; + return 0; } /* diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 32a7c169e096..6acfed2ae858 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -685,14 +685,13 @@ xfs_iomap_write_allocate( { xfs_mount_t *mp = ip->i_mount; struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, whichfork); - xfs_fileoff_t offset_fsb, last_block; + xfs_fileoff_t offset_fsb; xfs_fileoff_t end_fsb, map_start_fsb; xfs_filblks_t count_fsb; xfs_trans_t *tp; int nimaps; int error = 0; int flags = XFS_BMAPI_DELALLOC; - int nres; if (whichfork == XFS_COW_FORK) flags |= XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC; @@ -712,95 +711,51 @@ xfs_iomap_write_allocate( while (count_fsb != 0) { /* - * Set up a transaction with which to allocate the - * backing store for the file. Do allocations in a - * loop until we get some space in the range we are - * interested in. The other space that might be allocated - * is in the delayed allocation extent on which we sit - * but before our buffer starts. + * We have already reserved space for the extent and any + * indirect blocks when creating the delalloc extent, there + * is no need to reserve space in this transaction again. */ - nimaps = 0; - while (nimaps == 0) { - nres = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK); - /* - * We have already reserved space for the extent and any - * indirect blocks when creating the delalloc extent, - * there is no need to reserve space in this transaction - * again. - */ - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, 0, - 0, XFS_TRANS_RESERVE, &tp); - if (error) - return error; + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, 0, + 0, XFS_TRANS_RESERVE, &tp); + if (error) + return error; - xfs_ilock(ip, XFS_ILOCK_EXCL); - xfs_trans_ijoin(tp, ip, 0); + xfs_ilock(ip, XFS_ILOCK_EXCL); - /* - * it is possible that the extents have changed since - * we did the read call as we dropped the ilock for a - * while. We have to be careful about truncates or hole - * punchs here - we are not allowed to allocate - * non-delalloc blocks here. - * - * The only protection against truncation is the pages - * for the range we are being asked to convert are - * locked and hence a truncate will block on them - * first. - * - * As a result, if we go beyond the range we really - * need and hit an delalloc extent boundary followed by - * a hole while we have excess blocks in the map, we - * will fill the hole incorrectly and overrun the - * transaction reservation. - * - * Using a single map prevents this as we are forced to - * check each map we look for overlap with the desired - * range and abort as soon as we find it. Also, given - * that we only return a single map, having one beyond - * what we can return is probably a bit silly. - * - * We also need to check that we don't go beyond EOF; - * this is a truncate optimisation as a truncate sets - * the new file size before block on the pages we - * currently have locked under writeback. Because they - * are about to be tossed, we don't need to write them - * back.... - */ - nimaps = 1; - end_fsb = XFS_B_TO_FSB(mp, XFS_ISIZE(ip)); - error = xfs_bmap_last_offset(ip, &last_block, - XFS_DATA_FORK); - if (error) + /* + * We need to check that we don't go beyond EOF; this is a + * truncate optimisation as a truncate sets the new file size + * before block on the pages we currently have locked under + * writeback. Because they are about to be tossed, we don't + * need to write them back.... + */ + end_fsb = XFS_B_TO_FSB(mp, XFS_ISIZE(ip)); + if (map_start_fsb + count_fsb > end_fsb) { + count_fsb = end_fsb - map_start_fsb; + if (count_fsb == 0) { + error = -EAGAIN; goto trans_cancel; - - last_block = XFS_FILEOFF_MAX(last_block, end_fsb); - if ((map_start_fsb + count_fsb) > last_block) { - count_fsb = last_block - map_start_fsb; - if (count_fsb == 0) { - error = -EAGAIN; - goto trans_cancel; - } } + } - /* - * From this point onwards we overwrite the imap - * pointer that the caller gave to us. - */ - error = xfs_bmapi_write(tp, ip, map_start_fsb, - count_fsb, flags, nres, imap, - &nimaps); - if (error) - goto trans_cancel; + nimaps = 1; + xfs_trans_ijoin(tp, ip, 0); + error = xfs_bmapi_write(tp, ip, map_start_fsb, count_fsb, flags, + XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK), + imap, &nimaps); + if (error) + goto trans_cancel; - error = xfs_trans_commit(tp); - if (error) - goto error0; + error = xfs_trans_commit(tp); + if (error) + goto error0; - if (whichfork == XFS_COW_FORK) - *cow_seq = READ_ONCE(ifp->if_seq); - xfs_iunlock(ip, XFS_ILOCK_EXCL); - } + if (whichfork == XFS_COW_FORK) + *cow_seq = READ_ONCE(ifp->if_seq); + xfs_iunlock(ip, XFS_ILOCK_EXCL); + + if (nimaps == 0) + return -EAGAIN; /* * See if we were able to allocate an extent that From patchwork Thu Jan 17 16:36:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10768577 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 295AF746 for ; Thu, 17 Jan 2019 16:36:55 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1AD6E30479 for ; Thu, 17 Jan 2019 16:36:55 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 190E13058B; Thu, 17 Jan 2019 16:36:55 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8825230479 for ; Thu, 17 Jan 2019 16:36:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728893AbfAQQgx (ORCPT ); Thu, 17 Jan 2019 11:36:53 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:53146 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726854AbfAQQgx (ORCPT ); Thu, 17 Jan 2019 11:36:53 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=SsHuuydxWc1PNnpv6M0+5lMT00No+zhN7cby8RcRj/c=; b=nWpxkjnrp/jkjwIEVJPPnn4oas JLxBwbwmsJcNn9Yxd0P00jsVMiSH3NVbJN52bLmtRt3xc82xuN0SQnah7I/zekaEaPl3iGb3i2Z8+ kw2Aa1b7G+EVYOa/+M0JYUTTBilwdjQQHjDcEKRKbLhir7ypsOBRlY50TCyu+AMoqpitLodHqFBPX a0KlLLcKPrxNML2h9qLsUln//P3LO5+qGd7zsn9WYmAdsqAfXiIJKATZILioFUnndIrL6wsALJ67m OEOD3YshLvNVIJqhCCTD97URp23G5pfZMHJeBxtd70ScJRkf7gHd1KH9WX2JW4GhP+36elBilwxKj 0jifMKdw==; Received: from 089144213167.atnat0022.highway.a1.net ([89.144.213.167] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1gkAey-00046n-Bo; Thu, 17 Jan 2019 16:36:52 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org Cc: "Darrick J . Wong" Subject: [PATCH 05/11] xfs: make xfs_bmbt_to_iomap more useful Date: Thu, 17 Jan 2019 17:36:30 +0100 Message-Id: <20190117163636.23171-6-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190117163636.23171-1-hch@lst.de> References: <20190117163636.23171-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Move checking for invalid zero blocks and setting of various iomap flags into this helper. Also make it deal with "raw" delalloc extents to avoid clutter in the callers. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong --- fs/xfs/xfs_iomap.c | 84 +++++++++++++++++++++------------------------- fs/xfs/xfs_iomap.h | 4 +-- fs/xfs/xfs_pnfs.c | 2 +- 3 files changed, 41 insertions(+), 49 deletions(-) diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 6acfed2ae858..9f1fd224bb06 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -35,18 +35,40 @@ #define XFS_WRITEIO_ALIGN(mp,off) (((off) >> mp->m_writeio_log) \ << mp->m_writeio_log) -void +static int +xfs_alert_fsblock_zero( + xfs_inode_t *ip, + xfs_bmbt_irec_t *imap) +{ + xfs_alert_tag(ip->i_mount, XFS_PTAG_FSBLOCK_ZERO, + "Access to block zero in inode %llu " + "start_block: %llx start_off: %llx " + "blkcnt: %llx extent-state: %x", + (unsigned long long)ip->i_ino, + (unsigned long long)imap->br_startblock, + (unsigned long long)imap->br_startoff, + (unsigned long long)imap->br_blockcount, + imap->br_state); + return -EFSCORRUPTED; +} + +int xfs_bmbt_to_iomap( struct xfs_inode *ip, struct iomap *iomap, - struct xfs_bmbt_irec *imap) + struct xfs_bmbt_irec *imap, + bool shared) { struct xfs_mount *mp = ip->i_mount; + if (unlikely(!imap->br_startblock && !XFS_IS_REALTIME_INODE(ip))) + return xfs_alert_fsblock_zero(ip, imap); + if (imap->br_startblock == HOLESTARTBLOCK) { iomap->addr = IOMAP_NULL_ADDR; iomap->type = IOMAP_HOLE; - } else if (imap->br_startblock == DELAYSTARTBLOCK) { + } else if (imap->br_startblock == DELAYSTARTBLOCK || + isnullstartblock(imap->br_startblock)) { iomap->addr = IOMAP_NULL_ADDR; iomap->type = IOMAP_DELALLOC; } else { @@ -60,6 +82,13 @@ xfs_bmbt_to_iomap( iomap->length = XFS_FSB_TO_B(mp, imap->br_blockcount); iomap->bdev = xfs_find_bdev_for_inode(VFS_I(ip)); iomap->dax_dev = xfs_find_daxdev_for_inode(VFS_I(ip)); + + if (xfs_ipincount(ip) && + (ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP)) + iomap->flags |= IOMAP_F_DIRTY; + if (shared) + iomap->flags |= IOMAP_F_SHARED; + return 0; } static void @@ -138,23 +167,6 @@ xfs_iomap_eof_align_last_fsb( return 0; } -STATIC int -xfs_alert_fsblock_zero( - xfs_inode_t *ip, - xfs_bmbt_irec_t *imap) -{ - xfs_alert_tag(ip->i_mount, XFS_PTAG_FSBLOCK_ZERO, - "Access to block zero in inode %llu " - "start_block: %llx start_off: %llx " - "blkcnt: %llx extent-state: %x", - (unsigned long long)ip->i_ino, - (unsigned long long)imap->br_startblock, - (unsigned long long)imap->br_startoff, - (unsigned long long)imap->br_blockcount, - imap->br_state); - return -EFSCORRUPTED; -} - int xfs_iomap_write_direct( xfs_inode_t *ip, @@ -649,17 +661,7 @@ xfs_file_iomap_begin_delay( iomap->flags |= IOMAP_F_NEW; trace_xfs_iomap_alloc(ip, offset, count, XFS_DATA_FORK, &got); done: - if (isnullstartblock(got.br_startblock)) - got.br_startblock = DELAYSTARTBLOCK; - - if (!got.br_startblock) { - error = xfs_alert_fsblock_zero(ip, &got); - if (error) - goto out_unlock; - } - - xfs_bmbt_to_iomap(ip, iomap, &got); - + error = xfs_bmbt_to_iomap(ip, iomap, &got, false); out_unlock: xfs_iunlock(ip, XFS_ILOCK_EXCL); return error; @@ -1097,15 +1099,7 @@ xfs_file_iomap_begin( trace_xfs_iomap_alloc(ip, offset, length, XFS_DATA_FORK, &imap); out_finish: - if (xfs_ipincount(ip) && (ip->i_itemp->ili_fsync_fields - & ~XFS_ILOG_TIMESTAMP)) - iomap->flags |= IOMAP_F_DIRTY; - - xfs_bmbt_to_iomap(ip, iomap, &imap); - - if (shared) - iomap->flags |= IOMAP_F_SHARED; - return 0; + return xfs_bmbt_to_iomap(ip, iomap, &imap, shared); out_found: ASSERT(nimaps); @@ -1228,12 +1222,10 @@ xfs_xattr_iomap_begin( out_unlock: xfs_iunlock(ip, lockmode); - if (!error) { - ASSERT(nimaps); - xfs_bmbt_to_iomap(ip, iomap, &imap); - } - - return error; + if (error) + return error; + ASSERT(nimaps); + return xfs_bmbt_to_iomap(ip, iomap, &imap, false); } const struct iomap_ops xfs_xattr_iomap_ops = { diff --git a/fs/xfs/xfs_iomap.h b/fs/xfs/xfs_iomap.h index c6170548831b..ed27e41b687c 100644 --- a/fs/xfs/xfs_iomap.h +++ b/fs/xfs/xfs_iomap.h @@ -17,8 +17,8 @@ int xfs_iomap_write_allocate(struct xfs_inode *, int, xfs_off_t, struct xfs_bmbt_irec *, unsigned int *); int xfs_iomap_write_unwritten(struct xfs_inode *, xfs_off_t, xfs_off_t, bool); -void xfs_bmbt_to_iomap(struct xfs_inode *, struct iomap *, - struct xfs_bmbt_irec *); +int xfs_bmbt_to_iomap(struct xfs_inode *, struct iomap *, + struct xfs_bmbt_irec *, bool shared); xfs_extlen_t xfs_eof_alignment(struct xfs_inode *ip, xfs_extlen_t extsize); static inline xfs_filblks_t diff --git a/fs/xfs/xfs_pnfs.c b/fs/xfs/xfs_pnfs.c index f44c3599527d..bde2c9f56a46 100644 --- a/fs/xfs/xfs_pnfs.c +++ b/fs/xfs/xfs_pnfs.c @@ -185,7 +185,7 @@ xfs_fs_map_blocks( } xfs_iunlock(ip, XFS_IOLOCK_EXCL); - xfs_bmbt_to_iomap(ip, iomap, &imap); + error = xfs_bmbt_to_iomap(ip, iomap, &imap, false); *device_generation = mp->m_generation; return error; out_unlock: From patchwork Thu Jan 17 16:36:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10768579 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C433A13BF for ; Thu, 17 Jan 2019 16:36:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B123B3034D for ; Thu, 17 Jan 2019 16:36:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AF7603058B; Thu, 17 Jan 2019 16:36:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3DFB72FA0A for ; Thu, 17 Jan 2019 16:36:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728897AbfAQQgz (ORCPT ); Thu, 17 Jan 2019 11:36:55 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:53154 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726854AbfAQQgz (ORCPT ); Thu, 17 Jan 2019 11:36:55 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:To:From:Sender: Reply-To:Cc:Content-Type:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=ErUrNxMgULqAVeoiNRzIxUI3DEc2FVO5ATb5WaPWS8M=; b=bfluBUqowWrJbpY1yM3m5hDpn 0Ydw1W0XQCaJkb5S4TljbPHugm81lBg2sgo4f5zbckYqanTaDZl9fkMmn2cbzikd5m30H04Y4xl0K 221J85cPD3R0KK0iAY0fIttu17c1kRL7n7pe+kvnXvcxThrcz6xek67samrOulDUu+b3yoVqM4OJI vkUMH9tZB53MZYfVq9XGmTnBfHZYb+/9RjgD0qGQ4/H5vLIxZhNq6D31/u8qIP6DYTNVMZAy9ckcK XkpeKw790A5ZNA/xDsKAZTUn5C2IU6TDK+JTv7JSrGLoc8oVHNCCbRylYlWq2+u9M/M0m+S0fttF3 ZSmtr8kjw==; Received: from 089144213167.atnat0022.highway.a1.net ([89.144.213.167] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1gkAf0-000479-UC for linux-xfs@vger.kernel.org; Thu, 17 Jan 2019 16:36:55 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org Subject: [PATCH 06/11] xfs: don't use delalloc extents for COW on files with extsize hints Date: Thu, 17 Jan 2019 17:36:31 +0100 Message-Id: <20190117163636.23171-7-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190117163636.23171-1-hch@lst.de> References: <20190117163636.23171-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP While using delalloc for extsize hints is generally a good idea, the current code that does so only for COW doesn't help us much and creates a lot of special cases. Switch it to use real allocations like we do for direct I/O. Signed-off-by: Christoph Hellwig --- fs/xfs/xfs_iomap.c | 28 +++++++++++++++++----------- fs/xfs/xfs_reflink.c | 10 +++++++++- fs/xfs/xfs_reflink.h | 3 ++- 3 files changed, 28 insertions(+), 13 deletions(-) diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 9f1fd224bb06..d851abac16a9 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -1039,22 +1039,28 @@ xfs_file_iomap_begin( * been done up front, so we don't need to do them here. */ if (xfs_is_reflink_inode(ip)) { + struct xfs_bmbt_irec orig = imap; + /* if zeroing doesn't need COW allocation, then we are done. */ if ((flags & IOMAP_ZERO) && !needs_cow_for_zeroing(&imap, nimaps)) goto out_found; - if (flags & IOMAP_DIRECT) { - /* may drop and re-acquire the ilock */ - error = xfs_reflink_allocate_cow(ip, &imap, &shared, - &lockmode); - if (error) - goto out_unlock; - } else { - error = xfs_reflink_reserve_cow(ip, &imap); - if (error) - goto out_unlock; - } + error = xfs_reflink_allocate_cow(ip, &imap, &shared, &lockmode, + flags); + if (error) + goto out_unlock; + + /* + * For buffered writes we need to report the address of the + * previous block (if there was any) so that the higher level + * write code can perform read-modify-write operations. For + * direct I/O code, which must be block aligned we need to + * report the newly allocated address. + */ + if (!(flags & IOMAP_DIRECT) && + orig.br_startblock != HOLESTARTBLOCK) + imap = orig; end_fsb = imap.br_startoff + imap.br_blockcount; length = XFS_FSB_TO_B(mp, end_fsb) - offset; diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 2babc2cbe103..8a5353daf9ab 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -397,7 +397,8 @@ xfs_reflink_allocate_cow( struct xfs_inode *ip, struct xfs_bmbt_irec *imap, bool *shared, - uint *lockmode) + uint *lockmode, + unsigned iomap_flags) { struct xfs_mount *mp = ip->i_mount; xfs_fileoff_t offset_fsb = imap->br_startoff; @@ -471,6 +472,13 @@ xfs_reflink_allocate_cow( if (nimaps == 0) return -ENOSPC; convert: + /* + * COW fork extents are supposed to remain unwritten until we're ready + * to initiate a disk write. For direct I/O we are going to write the + * data and need the conversion, but for buffered writes we're done. + */ + if (!(iomap_flags & IOMAP_DIRECT)) + return 0; return xfs_reflink_convert_cow_extent(ip, imap, offset_fsb, count_fsb); out_unreserve: diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h index 6d73daef1f13..70d68a1a9b49 100644 --- a/fs/xfs/xfs_reflink.h +++ b/fs/xfs/xfs_reflink.h @@ -15,7 +15,8 @@ extern int xfs_reflink_trim_around_shared(struct xfs_inode *ip, extern int xfs_reflink_reserve_cow(struct xfs_inode *ip, struct xfs_bmbt_irec *imap); extern int xfs_reflink_allocate_cow(struct xfs_inode *ip, - struct xfs_bmbt_irec *imap, bool *shared, uint *lockmode); + struct xfs_bmbt_irec *imap, bool *shared, uint *lockmode, + unsigned iomap_flags); extern int xfs_reflink_convert_cow(struct xfs_inode *ip, xfs_off_t offset, xfs_off_t count); From patchwork Thu Jan 17 16:36:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10768581 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4695313BF for ; Thu, 17 Jan 2019 16:36:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3794C2F62E for ; Thu, 17 Jan 2019 16:36:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3612430479; Thu, 17 Jan 2019 16:36:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CECAD3046C for ; Thu, 17 Jan 2019 16:36:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728759AbfAQQg6 (ORCPT ); Thu, 17 Jan 2019 11:36:58 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:53164 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726854AbfAQQg6 (ORCPT ); Thu, 17 Jan 2019 11:36:58 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From :Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=/26+8IsJKd4uzJWMAWiADT1sCkd6BcAVhIh2aG/FAmk=; b=FPYMAuctsZ6MIxkdW+MjHToij3 LLXoJzt/mDlZUQVewmvGx29AKmdo92bdDUctC3WYeDoaWYqxdWz8jLGBwzJnMTzDOAePXqtFofUp9 hSeNHCNGwPubaaA7Y9GxQgTzmmKZaOdWBbfLMhMXo18fAeTqop3xprI7x+c5U/h1Mu+8gebLtcGXM 6lNvNjtjYI1q6GjMYaQtopNkU4SMoL5E+Od2U/U1C29zaDR8wszzphZOvxSJtZ9qn9rJIhQFZS3zV mjpcEZEpJeNFqxLPBjMSBOhgs7zM7Y7iiAF/eZQigYU/pdZedeYAYd7HuPWCyEmEbKeKfxJSVZvQO rDGdthjA==; Received: from 089144213167.atnat0022.highway.a1.net ([89.144.213.167] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1gkAf3-00047L-Io; Thu, 17 Jan 2019 16:36:57 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org Cc: "Darrick J . Wong" Subject: [PATCH 07/11] xfs: also truncate holes covered by COW blocks Date: Thu, 17 Jan 2019 17:36:32 +0100 Message-Id: <20190117163636.23171-8-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190117163636.23171-1-hch@lst.de> References: <20190117163636.23171-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This only matters if we want to write data through the COW fork that is not actually an overwrite of existing data. Reasons for that are speculative COW fork allocations using the cowextsize, or a mode where we always write through the COW fork. Currently both can't actually happen, but I plan to enable them. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong --- fs/xfs/xfs_aops.c | 31 ++++++++++++++++--------------- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index ef05164651dc..64dd16712dc3 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -403,28 +403,29 @@ xfs_map_blocks( wpc->fork = XFS_DATA_FORK; + /* landed in a hole or beyond EOF? */ if (imap.br_startoff > offset_fsb) { - /* landed in a hole or beyond EOF */ imap.br_blockcount = imap.br_startoff - offset_fsb; imap.br_startoff = offset_fsb; imap.br_startblock = HOLESTARTBLOCK; imap.br_state = XFS_EXT_NORM; - } else { - /* - * Truncate to the next COW extent if there is one. This is the - * only opportunity to do this because we can skip COW fork - * lookups for the subsequent blocks in the mapping; however, - * the requirement to treat the COW range separately remains. - */ - if (cow_fsb != NULLFILEOFF && - cow_fsb < imap.br_startoff + imap.br_blockcount) - imap.br_blockcount = cow_fsb - imap.br_startoff; - - /* got a delalloc extent? */ - if (isnullstartblock(imap.br_startblock)) - goto allocate_blocks; } + /* + * Truncate to the next COW extent if there is one. This is the only + * opportunity to do this because we can skip COW fork lookups for the + * subsequent blocks in the mapping; however, the requirement to treat + * the COW range separately remains. + */ + if (cow_fsb != NULLFILEOFF && + cow_fsb < imap.br_startoff + imap.br_blockcount) + imap.br_blockcount = cow_fsb - imap.br_startoff; + + /* got a delalloc extent? */ + if (imap.br_startblock != HOLESTARTBLOCK && + isnullstartblock(imap.br_startblock)) + goto allocate_blocks; + wpc->imap = imap; trace_xfs_map_blocks_found(ip, offset, count, wpc->fork, &imap); return 0; From patchwork Thu Jan 17 16:36:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10768583 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 936C1746 for ; Thu, 17 Jan 2019 16:37:03 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 82A7E30479 for ; Thu, 17 Jan 2019 16:37:03 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7FAA42B225; Thu, 17 Jan 2019 16:37:03 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DE948304A8 for ; Thu, 17 Jan 2019 16:37:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728906AbfAQQhB (ORCPT ); Thu, 17 Jan 2019 11:37:01 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:53172 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726854AbfAQQhB (ORCPT ); Thu, 17 Jan 2019 11:37:01 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:To:From:Sender: Reply-To:Cc:Content-Type:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=jq1nC2HNoC8x9jYWFcW/4auM8YwUzirJJuV2IDyzx04=; b=p726ac+gxH9MzmZB7u3w28xT0 pZPKMy/FNmZnkmgPqr9+9UaONlsjb/MmfF7X/kPuyev9SMDs3PZRaT0SlaJ1xNyyYd+mUPKo5Jdhp IiwgXtJ65wfjGBej6AQPerKEId1JHYn8z986TeyzdLidVpHhVUJItBHxzjPmzP03rPCFSH0CAuYVq GRF/JjfzEhkSSUgLFaOpTtZ3xwQQdH6UDSAirJpnmsrgzKmjPNAxSmUyfbiS4LwUtEA7UVTqbIKIT RJqyrf1lDOp6F2U8MbAiyaN5dCrRLuCILMSnkdct7lhnl8nWoDMGEQAQDVLrlc8qwq+NPhDBjaye+ W1ZR2TLrw==; Received: from 089144213167.atnat0022.highway.a1.net ([89.144.213.167] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1gkAf6-00047o-CL for linux-xfs@vger.kernel.org; Thu, 17 Jan 2019 16:37:00 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org Subject: [PATCH 08/11] xfs: merge COW handling into xfs_file_iomap_begin_delay Date: Thu, 17 Jan 2019 17:36:33 +0100 Message-Id: <20190117163636.23171-9-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190117163636.23171-1-hch@lst.de> References: <20190117163636.23171-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Besides simplifying the code a bit this allows to actually implement the behavior of using COW preallocation for non-COW data mentioned in the current comments. Note that this breaks the current version of xfs/420, but that is because the test is broken. A separate fix will be sent for it. Signed-off-by: Christoph Hellwig --- fs/xfs/xfs_iomap.c | 133 ++++++++++++++++++++++++++++++------------- fs/xfs/xfs_reflink.c | 67 ---------------------- fs/xfs/xfs_reflink.h | 2 - fs/xfs/xfs_trace.h | 3 - 4 files changed, 94 insertions(+), 111 deletions(-) diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index d851abac16a9..4dc3c766cc28 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -534,15 +534,16 @@ xfs_file_iomap_begin_delay( { struct xfs_inode *ip = XFS_I(inode); struct xfs_mount *mp = ip->i_mount; - struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK); xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset); xfs_fileoff_t maxbytes_fsb = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes); xfs_fileoff_t end_fsb; - int error = 0, eof = 0; - struct xfs_bmbt_irec got; - struct xfs_iext_cursor icur; + struct xfs_bmbt_irec imap, cmap; + struct xfs_iext_cursor icur, ccur; xfs_fsblock_t prealloc_blocks = 0; + bool eof = false, cow_eof = false, shared; + int whichfork = XFS_DATA_FORK; + int error = 0; ASSERT(!XFS_IS_REALTIME_INODE(ip)); ASSERT(!xfs_get_extsz_hint(ip)); @@ -560,7 +561,7 @@ xfs_file_iomap_begin_delay( XFS_STATS_INC(mp, xs_blk_mapw); - if (!(ifp->if_flags & XFS_IFEXTENTS)) { + if (!(ip->i_df.if_flags & XFS_IFEXTENTS)) { error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK); if (error) goto out_unlock; @@ -568,51 +569,92 @@ xfs_file_iomap_begin_delay( end_fsb = min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb); - eof = !xfs_iext_lookup_extent(ip, ifp, offset_fsb, &icur, &got); + /* + * Search the data fork fork first to look up our source mapping. We + * always need the data fork map, as we have to return it to the + * iomap code so that the higher level write code can read data in to + * perform read-modify-write cycles for unaligned writes. + */ + eof = !xfs_iext_lookup_extent(ip, &ip->i_df, offset_fsb, &icur, &imap); if (eof) - got.br_startoff = end_fsb; /* fake hole until the end */ + imap.br_startoff = end_fsb; /* fake hole until the end */ - if (got.br_startoff <= offset_fsb) { + /* We never need to allocate blocks for zeroing a hole. */ + if ((flags & IOMAP_ZERO) && imap.br_startoff > offset_fsb) { + xfs_hole_to_iomap(ip, iomap, offset_fsb, imap.br_startoff); + goto out_unlock; + } + + /* + * Search the COW fork extent list even if we did not find a data fork + * extent. This serves two purposes: first this implements the + * speculative preallocation using cowextsize, so that we also unshare + * block adjacent to shared blocks instead of just the shared blocks + * themselves. Second the lookup in the extent list is generally faster + * than going out to the shared extent tree. + */ + if (xfs_is_reflink_inode(ip)) { + cow_eof = !xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, + &ccur, &cmap); + if (!cow_eof && cmap.br_startoff <= offset_fsb) { + trace_xfs_reflink_cow_found(ip, &cmap); + whichfork = XFS_COW_FORK; + goto done; + } + } + + if (imap.br_startoff <= offset_fsb) { /* * For reflink files we may need a delalloc reservation when * overwriting shared extents. This includes zeroing of * existing extents that contain data. */ - if (xfs_is_reflink_inode(ip) && - ((flags & IOMAP_WRITE) || - got.br_state != XFS_EXT_UNWRITTEN)) { - xfs_trim_extent(&got, offset_fsb, end_fsb - offset_fsb); - error = xfs_reflink_reserve_cow(ip, &got); - if (error) - goto out_unlock; + if (!xfs_is_reflink_inode(ip) || + ((flags & IOMAP_ZERO) && imap.br_state != XFS_EXT_NORM)) { + trace_xfs_iomap_found(ip, offset, count, XFS_DATA_FORK, + &imap); + goto done; } - trace_xfs_iomap_found(ip, offset, count, XFS_DATA_FORK, &got); - goto done; - } + xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb); - if (flags & IOMAP_ZERO) { - xfs_hole_to_iomap(ip, iomap, offset_fsb, got.br_startoff); - goto out_unlock; + /* Trim the mapping to the nearest shared extent boundary. */ + error = xfs_reflink_trim_around_shared(ip, &imap, &shared); + if (error) + goto out_unlock; + + /* Not shared? Just report the (potentially capped) extent. */ + if (!shared) { + trace_xfs_iomap_found(ip, offset, count, XFS_DATA_FORK, + &imap); + goto done; + } + + /* + * Fork all the shared blocks from our write offset until the + * end of the extent. + */ + whichfork = XFS_COW_FORK; + end_fsb = imap.br_startoff + imap.br_blockcount; + } else { + /* + * We cap the maximum length we map here to MAX_WRITEBACK_PAGES + * pages to keep the chunks of work done where somewhat + * symmetric with the work writeback does. This is a completely + * arbitrary number pulled out of thin air. + * + * Note that the values needs to be less than 32-bits wide until + * the lower level functions are updated. + */ + count = min_t(loff_t, count, 1024 * PAGE_SIZE); + end_fsb = min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb); } error = xfs_qm_dqattach_locked(ip, false); if (error) goto out_unlock; - /* - * We cap the maximum length we map here to MAX_WRITEBACK_PAGES pages - * to keep the chunks of work done where somewhat symmetric with the - * work writeback does. This is a completely arbitrary number pulled - * out of thin air as a best guess for initial testing. - * - * Note that the values needs to be less than 32-bits wide until - * the lower level functions are updated. - */ - count = min_t(loff_t, count, 1024 * PAGE_SIZE); - end_fsb = min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb); - - if (eof) { + if (eof && whichfork == XFS_DATA_FORK) { prealloc_blocks = xfs_iomap_prealloc_size(ip, offset, count, &icur); if (prealloc_blocks) { @@ -635,9 +677,11 @@ xfs_file_iomap_begin_delay( } retry: - error = xfs_bmapi_reserve_delalloc(ip, XFS_DATA_FORK, offset_fsb, - end_fsb - offset_fsb, prealloc_blocks, &got, &icur, - eof); + error = xfs_bmapi_reserve_delalloc(ip, whichfork, offset_fsb, + end_fsb - offset_fsb, prealloc_blocks, + whichfork == XFS_DATA_FORK ? &imap : &cmap, + whichfork == XFS_DATA_FORK ? &icur : &ccur, + whichfork == XFS_DATA_FORK ? eof : cow_eof); switch (error) { case 0: break; @@ -659,9 +703,20 @@ xfs_file_iomap_begin_delay( * them out if the write happens to fail. */ iomap->flags |= IOMAP_F_NEW; - trace_xfs_iomap_alloc(ip, offset, count, XFS_DATA_FORK, &got); + trace_xfs_iomap_alloc(ip, offset, count, whichfork, + whichfork == XFS_DATA_FORK ? &imap : &cmap); done: - error = xfs_bmbt_to_iomap(ip, iomap, &got, false); + if (whichfork == XFS_COW_FORK) { + if (imap.br_startoff > offset_fsb) { + xfs_trim_extent(&cmap, offset_fsb, + imap.br_startoff - offset_fsb); + error = xfs_bmbt_to_iomap(ip, iomap, &cmap, false); + goto out_unlock; + } + /* ensure we only report blocks we have a reservation for */ + xfs_trim_extent(&imap, cmap.br_startoff, cmap.br_blockcount); + } + error = xfs_bmbt_to_iomap(ip, iomap, &imap, false); out_unlock: xfs_iunlock(ip, XFS_ILOCK_EXCL); return error; diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 8a5353daf9ab..9ef1f79cb3ae 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -234,73 +234,6 @@ xfs_reflink_trim_around_shared( } } -/* - * Trim the passed in imap to the next shared/unshared extent boundary, and - * if imap->br_startoff points to a shared extent reserve space for it in the - * COW fork. - * - * Note that imap will always contain the block numbers for the existing blocks - * in the data fork, as the upper layers need them for read-modify-write - * operations. - */ -int -xfs_reflink_reserve_cow( - struct xfs_inode *ip, - struct xfs_bmbt_irec *imap) -{ - struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK); - struct xfs_bmbt_irec got; - int error = 0; - bool eof = false; - struct xfs_iext_cursor icur; - bool shared; - - /* - * Search the COW fork extent list first. This serves two purposes: - * first this implement the speculative preallocation using cowextisze, - * so that we also unshared block adjacent to shared blocks instead - * of just the shared blocks themselves. Second the lookup in the - * extent list is generally faster than going out to the shared extent - * tree. - */ - - if (!xfs_iext_lookup_extent(ip, ifp, imap->br_startoff, &icur, &got)) - eof = true; - if (!eof && got.br_startoff <= imap->br_startoff) { - trace_xfs_reflink_cow_found(ip, imap); - xfs_trim_extent(imap, got.br_startoff, got.br_blockcount); - return 0; - } - - /* Trim the mapping to the nearest shared extent boundary. */ - error = xfs_reflink_trim_around_shared(ip, imap, &shared); - if (error) - return error; - - /* Not shared? Just report the (potentially capped) extent. */ - if (!shared) - return 0; - - /* - * Fork all the shared blocks from our write offset until the end of - * the extent. - */ - error = xfs_qm_dqattach_locked(ip, false); - if (error) - return error; - - error = xfs_bmapi_reserve_delalloc(ip, XFS_COW_FORK, imap->br_startoff, - imap->br_blockcount, 0, &got, &icur, eof); - if (error == -ENOSPC || error == -EDQUOT) - trace_xfs_reflink_cow_enospc(ip, imap); - if (error) - return error; - - xfs_trim_extent(imap, got.br_startoff, got.br_blockcount); - trace_xfs_reflink_cow_alloc(ip, &got); - return 0; -} - /* Convert part of an unwritten CoW extent to a real one. */ STATIC int xfs_reflink_convert_cow_extent( diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h index 70d68a1a9b49..4a9e3cd4768a 100644 --- a/fs/xfs/xfs_reflink.h +++ b/fs/xfs/xfs_reflink.h @@ -12,8 +12,6 @@ extern int xfs_reflink_find_shared(struct xfs_mount *mp, struct xfs_trans *tp, extern int xfs_reflink_trim_around_shared(struct xfs_inode *ip, struct xfs_bmbt_irec *irec, bool *shared); -extern int xfs_reflink_reserve_cow(struct xfs_inode *ip, - struct xfs_bmbt_irec *imap); extern int xfs_reflink_allocate_cow(struct xfs_inode *ip, struct xfs_bmbt_irec *imap, bool *shared, uint *lockmode, unsigned iomap_flags); diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index f75c6d380543..a8c32206e147 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3196,13 +3196,10 @@ DEFINE_INODE_ERROR_EVENT(xfs_reflink_unshare_error); /* copy on write */ DEFINE_INODE_IREC_EVENT(xfs_reflink_trim_around_shared); -DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_alloc); DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_found); DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_enospc); DEFINE_INODE_IREC_EVENT(xfs_reflink_convert_cow); -DEFINE_RW_EVENT(xfs_reflink_reserve_cow); - DEFINE_SIMPLE_IO_EVENT(xfs_reflink_bounce_dio_write); DEFINE_SIMPLE_IO_EVENT(xfs_reflink_cancel_cow_range); From patchwork Thu Jan 17 16:36:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10768585 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E5F31746 for ; Thu, 17 Jan 2019 16:37:04 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D7AED3034D for ; Thu, 17 Jan 2019 16:37:04 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D652F305D9; Thu, 17 Jan 2019 16:37:04 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 742183045E for ; Thu, 17 Jan 2019 16:37:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728907AbfAQQhE (ORCPT ); Thu, 17 Jan 2019 11:37:04 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:53178 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726854AbfAQQhD (ORCPT ); Thu, 17 Jan 2019 11:37:03 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:To:From:Sender: Reply-To:Cc:Content-Type:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=VtyqnXN4s+eT0N8nxIqJeqlaGLpJst1lta+/VbgiErU=; b=edY+AV4s2Uw6U3p39lwOZHZR0 Jc26RSS9ht55RjyfqdBZP/XmK/vWslyA50Ziqbl0G4jVCXyu9GKHeToUg4fH3EMPKq7MYa3uS9xXg 2g2ySEA+ttP5afZptVL8TsDAkz+3RfXW9JYxqvLJFGiW9lWRyRIquSuPpxTTqeEz42OvspuGByym9 eu3q8OsuS9ZuGp0itrAs1RdlkB0t/Vx3hr3GNR+7dy3IV/5uPkopc6fzBNR0An52BC7lPgB396rUW jYcfzw6OEyzPE740VcikfAI0v7zF4nHWEf+w1/Q5nCIr9eE4k421FjDuLp30PDG9/VTez5678tsr+ 0rdB74kXw==; Received: from 089144213167.atnat0022.highway.a1.net ([89.144.213.167] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1gkAf9-000488-1b for linux-xfs@vger.kernel.org; Thu, 17 Jan 2019 16:37:03 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org Subject: [PATCH 09/11] xfs: report IOMAP_F_SHARED from xfs_file_iomap_begin_delay Date: Thu, 17 Jan 2019 17:36:34 +0100 Message-Id: <20190117163636.23171-10-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190117163636.23171-1-hch@lst.de> References: <20190117163636.23171-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP No user of it in the iomap code at the moment, but we should not actively report wrong information if we can trivially get it right. Signed-off-by: Christoph Hellwig --- fs/xfs/xfs_iomap.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 4dc3c766cc28..95d7a21a8152 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -541,7 +541,7 @@ xfs_file_iomap_begin_delay( struct xfs_bmbt_irec imap, cmap; struct xfs_iext_cursor icur, ccur; xfs_fsblock_t prealloc_blocks = 0; - bool eof = false, cow_eof = false, shared; + bool eof = false, cow_eof = false, shared = false; int whichfork = XFS_DATA_FORK; int error = 0; @@ -710,13 +710,14 @@ xfs_file_iomap_begin_delay( if (imap.br_startoff > offset_fsb) { xfs_trim_extent(&cmap, offset_fsb, imap.br_startoff - offset_fsb); - error = xfs_bmbt_to_iomap(ip, iomap, &cmap, false); + error = xfs_bmbt_to_iomap(ip, iomap, &cmap, true); goto out_unlock; } /* ensure we only report blocks we have a reservation for */ xfs_trim_extent(&imap, cmap.br_startoff, cmap.br_blockcount); + shared = true; } - error = xfs_bmbt_to_iomap(ip, iomap, &imap, false); + error = xfs_bmbt_to_iomap(ip, iomap, &imap, shared); out_unlock: xfs_iunlock(ip, XFS_ILOCK_EXCL); return error; From patchwork Thu Jan 17 16:36:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10768587 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1792613BF for ; Thu, 17 Jan 2019 16:37:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 07CEF2AB9C for ; Thu, 17 Jan 2019 16:37:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 062FF3058B; Thu, 17 Jan 2019 16:37:08 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E4ED530479 for ; Thu, 17 Jan 2019 16:37:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728908AbfAQQhG (ORCPT ); Thu, 17 Jan 2019 11:37:06 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:53184 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726854AbfAQQhG (ORCPT ); Thu, 17 Jan 2019 11:37:06 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:To:From:Sender: Reply-To:Cc:Content-Type:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=8JdrHDMueKeHLUmjEZJcfgYDijKovtf7tHtZRHA1XLA=; b=MASksbPiC+eElb/Lfy3TsvPcO BEBijuO8TuLcvhHI248w/fz11WrZsQfWsmwARy2h/ECYMtYSLmGZIpT9KsfcNZqr5f7tf/1JyqGim yLpZh6oCqfYbM+cXDBew31V27gcPz2Napk3eTNMvR9V/Y2EaMPRODfDd+3iVvw/Iro/wZV5ar8kqN jB6uHMEZMJFldVuCoecBZk3xLKMC78T/0/p+4+TGYUlSooTwj3kS4aEML9d7PYlZdeJSBrwFgBkN+ DKrN+03ANO0mgo758FCdPKHavWHptAEawdx0Ew4tqfJLAo/TIYV7bMHhThX807mfdDLrBKsTaXR5v WqbJ2A/UA==; Received: from 089144213167.atnat0022.highway.a1.net ([89.144.213.167] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1gkAfB-00048P-NH for linux-xfs@vger.kernel.org; Thu, 17 Jan 2019 16:37:06 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org Subject: [PATCH 10/11] xfs: make COW fork unwritten extent conversions more robust Date: Thu, 17 Jan 2019 17:36:35 +0100 Message-Id: <20190117163636.23171-11-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190117163636.23171-1-hch@lst.de> References: <20190117163636.23171-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If we have racing buffered and direct I/O COW fork extents under writeback can have been moved to the data fork by the time we call xfs_reflink_convert_cow from xfs_submit_ioend. This would be mostly harmless as the block numbers don't change by this move, except for the fact that xfs_bmapi_write will crash or trigger asserts when not finding existing extents, even despite trying to paper over this with the XFS_BMAPI_CONVERT_ONLY flag. Instead of special casing non-transaction conversions in the already way too complicated xfs_bmapi_write just add a new helper for the much simpler non-transactional COW fork case, which simplify ignores not found extents. Signed-off-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_bmap.c | 12 ++------ fs/xfs/libxfs/xfs_bmap.h | 8 +++--- fs/xfs/xfs_reflink.c | 61 +++++++++++++++++++++++++--------------- 3 files changed, 45 insertions(+), 36 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 48e61dd59105..1847709f9e50 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -2029,7 +2029,7 @@ xfs_bmap_add_extent_delay_real( /* * Convert an unwritten allocation to a real allocation or vice versa. */ -STATIC int /* error */ +int /* error */ xfs_bmap_add_extent_unwritten_real( struct xfs_trans *tp, xfs_inode_t *ip, /* incore inode pointer */ @@ -4236,9 +4236,7 @@ xfs_bmapi_write( ASSERT(*nmap >= 1); ASSERT(*nmap <= XFS_BMAP_MAX_NMAP); - ASSERT(tp != NULL || - (flags & (XFS_BMAPI_CONVERT | XFS_BMAPI_COWFORK)) == - (XFS_BMAPI_CONVERT | XFS_BMAPI_COWFORK)); + ASSERT(tp != NULL); ASSERT(len > 0); ASSERT(XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_LOCAL); ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL)); @@ -4316,9 +4314,6 @@ xfs_bmapi_write( * locked and hence a truncate will block on them * first. */ - ASSERT(!((flags & XFS_BMAPI_CONVERT) && - (flags & XFS_BMAPI_COWFORK))); - if (flags & XFS_BMAPI_DELALLOC) { if (eof || bno >= end) break; @@ -4333,8 +4328,7 @@ xfs_bmapi_write( * First, deal with the hole before the allocated space * that we found, if any. */ - if ((need_alloc || wasdelay) && - !(flags & XFS_BMAPI_CONVERT_ONLY)) { + if (need_alloc || wasdelay) { bma.eof = eof; bma.conv = !!(flags & XFS_BMAPI_CONVERT); bma.wasdel = wasdelay; diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h index b4ff710d7250..6365099e3262 100644 --- a/fs/xfs/libxfs/xfs_bmap.h +++ b/fs/xfs/libxfs/xfs_bmap.h @@ -98,9 +98,6 @@ struct xfs_extent_free_item /* Only convert delalloc space, don't allocate entirely new extents */ #define XFS_BMAPI_DELALLOC 0x400 -/* Only convert unwritten extents, don't allocate new blocks */ -#define XFS_BMAPI_CONVERT_ONLY 0x800 - /* Skip online discard of freed extents */ #define XFS_BMAPI_NODISCARD 0x1000 @@ -118,7 +115,6 @@ struct xfs_extent_free_item { XFS_BMAPI_REMAP, "REMAP" }, \ { XFS_BMAPI_COWFORK, "COWFORK" }, \ { XFS_BMAPI_DELALLOC, "DELALLOC" }, \ - { XFS_BMAPI_CONVERT_ONLY, "CONVERT_ONLY" }, \ { XFS_BMAPI_NODISCARD, "NODISCARD" }, \ { XFS_BMAPI_NORMAP, "NORMAP" } @@ -227,6 +223,10 @@ int xfs_bmapi_reserve_delalloc(struct xfs_inode *ip, int whichfork, xfs_fileoff_t off, xfs_filblks_t len, xfs_filblks_t prealloc, struct xfs_bmbt_irec *got, struct xfs_iext_cursor *cur, int eof); +int xfs_bmap_add_extent_unwritten_real(struct xfs_trans *tp, + struct xfs_inode *ip, int whichfork, + struct xfs_iext_cursor *icur, struct xfs_btree_cur **curp, + struct xfs_bmbt_irec *new, int *logflagsp); static inline void xfs_bmap_add_free( diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 9ef1f79cb3ae..f84b37fa4f17 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -234,26 +234,42 @@ xfs_reflink_trim_around_shared( } } -/* Convert part of an unwritten CoW extent to a real one. */ -STATIC int -xfs_reflink_convert_cow_extent( - struct xfs_inode *ip, - struct xfs_bmbt_irec *imap, - xfs_fileoff_t offset_fsb, - xfs_filblks_t count_fsb) +static int +xfs_reflink_convert_cow_locked( + struct xfs_inode *ip, + xfs_fileoff_t offset_fsb, + xfs_filblks_t count_fsb) { - int nimaps = 1; + struct xfs_iext_cursor icur; + struct xfs_bmbt_irec got; + struct xfs_btree_cur *dummy_cur = NULL; + int dummy_logflags; + int error; - if (imap->br_state == XFS_EXT_NORM) + if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &got)) return 0; - xfs_trim_extent(imap, offset_fsb, count_fsb); - trace_xfs_reflink_convert_cow(ip, imap); - if (imap->br_blockcount == 0) - return 0; - return xfs_bmapi_write(NULL, ip, imap->br_startoff, imap->br_blockcount, - XFS_BMAPI_COWFORK | XFS_BMAPI_CONVERT, 0, imap, - &nimaps); + do { + if (got.br_startoff >= offset_fsb + count_fsb) + break; + if (got.br_state == XFS_EXT_NORM) + continue; + if (WARN_ON_ONCE(isnullstartblock(got.br_startblock))) + return -EIO; + + xfs_trim_extent(&got, offset_fsb, count_fsb); + if (!got.br_blockcount) + continue; + + got.br_state = XFS_EXT_NORM; + error = xfs_bmap_add_extent_unwritten_real(NULL, ip, + XFS_COW_FORK, &icur, &dummy_cur, &got, + &dummy_logflags); + if (error) + return error; + } while (xfs_iext_next_extent(ip->i_cowfp, &icur, &got)); + + return error; } /* Convert all of the unwritten CoW extents in a file's range to real ones. */ @@ -267,15 +283,12 @@ xfs_reflink_convert_cow( xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset); xfs_fileoff_t end_fsb = XFS_B_TO_FSB(mp, offset + count); xfs_filblks_t count_fsb = end_fsb - offset_fsb; - struct xfs_bmbt_irec imap; - int nimaps = 1, error = 0; + int error; ASSERT(count != 0); xfs_ilock(ip, XFS_ILOCK_EXCL); - error = xfs_bmapi_write(NULL, ip, offset_fsb, count_fsb, - XFS_BMAPI_COWFORK | XFS_BMAPI_CONVERT | - XFS_BMAPI_CONVERT_ONLY, 0, &imap, &nimaps); + error = xfs_reflink_convert_cow_locked(ip, offset_fsb, count_fsb); xfs_iunlock(ip, XFS_ILOCK_EXCL); return error; } @@ -405,14 +418,16 @@ xfs_reflink_allocate_cow( if (nimaps == 0) return -ENOSPC; convert: + xfs_trim_extent(imap, offset_fsb, count_fsb); /* * COW fork extents are supposed to remain unwritten until we're ready * to initiate a disk write. For direct I/O we are going to write the * data and need the conversion, but for buffered writes we're done. */ - if (!(iomap_flags & IOMAP_DIRECT)) + if (!(iomap_flags & IOMAP_DIRECT) || imap->br_state == XFS_EXT_NORM) return 0; - return xfs_reflink_convert_cow_extent(ip, imap, offset_fsb, count_fsb); + trace_xfs_reflink_convert_cow(ip, imap); + return xfs_reflink_convert_cow_locked(ip, offset_fsb, count_fsb); out_unreserve: xfs_trans_unreserve_quota_nblks(tp, ip, (long)resblks, 0, From patchwork Thu Jan 17 16:36:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10768589 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8842F746 for ; Thu, 17 Jan 2019 16:37:10 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7918E30376 for ; Thu, 17 Jan 2019 16:37:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 778C33058B; Thu, 17 Jan 2019 16:37:10 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.7 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A2A4630376 for ; Thu, 17 Jan 2019 16:37:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728914AbfAQQhJ (ORCPT ); Thu, 17 Jan 2019 11:37:09 -0500 Received: from bombadil.infradead.org ([198.137.202.133]:53190 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726854AbfAQQhJ (ORCPT ); Thu, 17 Jan 2019 11:37:09 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:To:From:Sender: Reply-To:Cc:Content-Type:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=YWo+2T4ijivA63GLpaUKSSTgngFZ6hb/I9shx0ZqqM4=; b=pdGWtOqRN0I56ooB/Bph6WzPs ueXYKzCU1CdvE4pefAEBhix9aFppQeKIqy/B1/9CvkJlvEra8IjvsmF0UepnKOFAlbC0meX9cIk/W qZW5Kgb8JZ94LZB8bQiL5ZgmjfqgY8tTJyZcRU6DcccqNe6NkxAuZyEcsCb6IU0bYyw/9tUB3DUnq 9M0c5zhWMnbeR5jVLCNWpUzkK/BDkUOxkVTO9SP7J/Xt3/D09B2OFb2rga/Men4BVpkuj/eHXyGi7 5BmGMVM7cVEDO0itZj6YxcSHJ18n+DnHzKaG9ZYh97QbT8rjgpAOErzzkX+PnLKML4LvVUfgn7P/R JwL7euzsg==; Received: from 089144213167.atnat0022.highway.a1.net ([89.144.213.167] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.90_1 #2 (Red Hat Linux)) id 1gkAfE-00048g-9u for linux-xfs@vger.kernel.org; Thu, 17 Jan 2019 16:37:08 +0000 From: Christoph Hellwig To: linux-xfs@vger.kernel.org Subject: [PATCH 11/11] xfs: introduce an always_cow mode Date: Thu, 17 Jan 2019 17:36:36 +0100 Message-Id: <20190117163636.23171-12-hch@lst.de> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190117163636.23171-1-hch@lst.de> References: <20190117163636.23171-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add a mode where XFS never overwrites existing blocks in place. This is to aid debugging our COW code, and also put infatructure in place for things like possible future support for zoned block devices, which can't support overwrites. This mode is enabled globally by doing a: echo 1 > /sys/fs/xfs/debug/always_cow Note that the parameter is global to allow running all tests in xfstests easily in this mode, which would not easily be possible with a per-fs sysfs file. Signed-off-by: Christoph Hellwig --- fs/xfs/xfs_aops.c | 2 +- fs/xfs/xfs_file.c | 11 ++++++++++- fs/xfs/xfs_iomap.c | 28 ++++++++++++++++++---------- fs/xfs/xfs_reflink.c | 28 ++++++++++++++++++++++++---- fs/xfs/xfs_reflink.h | 13 +++++++++++++ fs/xfs/xfs_super.c | 13 +++++++++---- fs/xfs/xfs_sysctl.h | 1 + fs/xfs/xfs_sysfs.c | 24 ++++++++++++++++++++++++ 8 files changed, 100 insertions(+), 20 deletions(-) diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 64dd16712dc3..19cba00d3269 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -986,7 +986,7 @@ xfs_vm_bmap( * Since we don't pass back blockdev info, we can't return bmap * information for rt files either. */ - if (xfs_is_reflink_inode(ip) || XFS_IS_REALTIME_INODE(ip)) + if (xfs_is_cow_inode(ip) || XFS_IS_REALTIME_INODE(ip)) return 0; return iomap_bmap(mapping, block, &xfs_iomap_ops); } diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index e47425071e65..8d2be043590a 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -507,7 +507,7 @@ xfs_file_dio_aio_write( * We can't properly handle unaligned direct I/O to reflink * files yet, as we can't unshare a partial block. */ - if (xfs_is_reflink_inode(ip)) { + if (xfs_is_cow_inode(ip)) { trace_xfs_reflink_bounce_dio_write(ip, iocb->ki_pos, count); return -EREMCHG; } @@ -806,6 +806,15 @@ xfs_file_fallocate( return -EOPNOTSUPP; xfs_ilock(ip, iolock); + /* + * If always_cow mode we can't use preallocation and thus should not + * allow creating them. + */ + if (xfs_is_always_cow_inode(ip) && (mode & ~FALLOC_FL_KEEP_SIZE) == 0) { + error = -EOPNOTSUPP; + goto out_unlock; + } + error = xfs_break_layouts(inode, &iolock, BREAK_UNMAP); if (error) goto out_unlock; diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 95d7a21a8152..99454be3d2b5 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -395,12 +395,13 @@ xfs_quota_calc_throttle( STATIC xfs_fsblock_t xfs_iomap_prealloc_size( struct xfs_inode *ip, + int whichfork, loff_t offset, loff_t count, struct xfs_iext_cursor *icur) { struct xfs_mount *mp = ip->i_mount; - struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK); + struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, whichfork); xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset); struct xfs_bmbt_irec prev; int shift = 0; @@ -593,7 +594,11 @@ xfs_file_iomap_begin_delay( * themselves. Second the lookup in the extent list is generally faster * than going out to the shared extent tree. */ - if (xfs_is_reflink_inode(ip)) { + if (xfs_is_cow_inode(ip)) { + if (!ip->i_cowfp) { + ASSERT(!xfs_is_reflink_inode(ip)); + xfs_ifork_init_cow(ip); + } cow_eof = !xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &ccur, &cmap); if (!cow_eof && cmap.br_startoff <= offset_fsb) { @@ -609,7 +614,7 @@ xfs_file_iomap_begin_delay( * overwriting shared extents. This includes zeroing of * existing extents that contain data. */ - if (!xfs_is_reflink_inode(ip) || + if (!xfs_is_cow_inode(ip) || ((flags & IOMAP_ZERO) && imap.br_state != XFS_EXT_NORM)) { trace_xfs_iomap_found(ip, offset, count, XFS_DATA_FORK, &imap); @@ -619,7 +624,7 @@ xfs_file_iomap_begin_delay( xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb); /* Trim the mapping to the nearest shared extent boundary. */ - error = xfs_reflink_trim_around_shared(ip, &imap, &shared); + error = xfs_inode_need_cow(ip, &imap, &shared); if (error) goto out_unlock; @@ -648,15 +653,18 @@ xfs_file_iomap_begin_delay( */ count = min_t(loff_t, count, 1024 * PAGE_SIZE); end_fsb = min(XFS_B_TO_FSB(mp, offset + count), maxbytes_fsb); + + if (xfs_is_always_cow_inode(ip)) + whichfork = XFS_COW_FORK; } error = xfs_qm_dqattach_locked(ip, false); if (error) goto out_unlock; - if (eof && whichfork == XFS_DATA_FORK) { - prealloc_blocks = xfs_iomap_prealloc_size(ip, offset, count, - &icur); + if (eof) { + prealloc_blocks = xfs_iomap_prealloc_size(ip, whichfork, offset, + count, &icur); if (prealloc_blocks) { xfs_extlen_t align; xfs_off_t end_offset; @@ -988,7 +996,7 @@ xfs_ilock_for_iomap( * COW writes may allocate delalloc space or convert unwritten COW * extents, so we need to make sure to take the lock exclusively here. */ - if (xfs_is_reflink_inode(ip) && is_write) { + if (xfs_is_cow_inode(ip) && is_write) { /* * FIXME: It could still overwrite on unshared extents and not * need allocation. @@ -1022,7 +1030,7 @@ xfs_ilock_for_iomap( * check, so if we got ILOCK_SHARED for a write and but we're now a * reflink inode we have to switch to ILOCK_EXCL and relock. */ - if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_reflink_inode(ip)) { + if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_cow_inode(ip)) { xfs_iunlock(ip, mode); mode = XFS_ILOCK_EXCL; goto relock; @@ -1094,7 +1102,7 @@ xfs_file_iomap_begin( * Break shared extents if necessary. Checks for non-blocking IO have * been done up front, so we don't need to do them here. */ - if (xfs_is_reflink_inode(ip)) { + if (xfs_is_cow_inode(ip)) { struct xfs_bmbt_irec orig = imap; /* if zeroing doesn't need COW allocation, then we are done. */ diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index f84b37fa4f17..e2d9179bd50d 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -192,7 +192,7 @@ xfs_reflink_trim_around_shared( int error = 0; /* Holes, unwritten, and delalloc extents cannot be shared */ - if (!xfs_is_reflink_inode(ip) || !xfs_bmap_is_real_extent(irec)) { + if (!xfs_is_cow_inode(ip) || !xfs_bmap_is_real_extent(irec)) { *shared = false; return 0; } @@ -234,6 +234,23 @@ xfs_reflink_trim_around_shared( } } +bool +xfs_inode_need_cow( + struct xfs_inode *ip, + struct xfs_bmbt_irec *imap, + bool *shared) +{ + /* We can't update any real extents in always COW mode. */ + if (xfs_is_always_cow_inode(ip) && + !isnullstartblock(imap->br_startblock)) { + *shared = true; + return 0; + } + + /* Trim the mapping to the nearest shared extent boundary. */ + return xfs_reflink_trim_around_shared(ip, imap, shared); +} + static int xfs_reflink_convert_cow_locked( struct xfs_inode *ip, @@ -321,7 +338,7 @@ xfs_find_trim_cow_extent( if (got.br_startoff > offset_fsb) { xfs_trim_extent(imap, imap->br_startoff, got.br_startoff - imap->br_startoff); - return xfs_reflink_trim_around_shared(ip, imap, shared); + return xfs_inode_need_cow(ip, imap, shared); } *shared = true; @@ -356,7 +373,10 @@ xfs_reflink_allocate_cow( xfs_extlen_t resblks = 0; ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL)); - ASSERT(xfs_is_reflink_inode(ip)); + if (!ip->i_cowfp) { + ASSERT(!xfs_is_reflink_inode(ip)); + xfs_ifork_init_cow(ip); + } error = xfs_find_trim_cow_extent(ip, imap, shared, &found); if (error || !*shared) @@ -542,7 +562,7 @@ xfs_reflink_cancel_cow_range( int error; trace_xfs_reflink_cancel_cow_range(ip, offset, count); - ASSERT(xfs_is_reflink_inode(ip)); + ASSERT(ip->i_cowfp); offset_fsb = XFS_B_TO_FSBT(ip->i_mount, offset); if (count == NULLFILEOFF) diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h index 4a9e3cd4768a..aad222aa0b23 100644 --- a/fs/xfs/xfs_reflink.h +++ b/fs/xfs/xfs_reflink.h @@ -6,11 +6,24 @@ #ifndef __XFS_REFLINK_H #define __XFS_REFLINK_H 1 +static inline bool xfs_is_always_cow_inode(struct xfs_inode *ip) +{ + return xfs_globals.always_cow && + xfs_sb_version_hasreflink(&ip->i_mount->m_sb); +} + +static inline bool xfs_is_cow_inode(struct xfs_inode *ip) +{ + return xfs_is_reflink_inode(ip) || xfs_is_always_cow_inode(ip); +} + extern int xfs_reflink_find_shared(struct xfs_mount *mp, struct xfs_trans *tp, xfs_agnumber_t agno, xfs_agblock_t agbno, xfs_extlen_t aglen, xfs_agblock_t *fbno, xfs_extlen_t *flen, bool find_maximal); extern int xfs_reflink_trim_around_shared(struct xfs_inode *ip, struct xfs_bmbt_irec *irec, bool *shared); +bool xfs_inode_need_cow(struct xfs_inode *ip, struct xfs_bmbt_irec *imap, + bool *shared); extern int xfs_reflink_allocate_cow(struct xfs_inode *ip, struct xfs_bmbt_irec *imap, bool *shared, uint *lockmode, diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index c9097cb0b955..0835eed77295 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1729,11 +1729,16 @@ xfs_fs_fill_super( } } - if (xfs_sb_version_hasreflink(&mp->m_sb) && mp->m_sb.sb_rblocks) { - xfs_alert(mp, + if (xfs_sb_version_hasreflink(&mp->m_sb)) { + if (mp->m_sb.sb_rblocks) { + xfs_alert(mp, "reflink not compatible with realtime device!"); - error = -EINVAL; - goto out_filestream_unmount; + error = -EINVAL; + goto out_filestream_unmount; + } + + if (xfs_globals.always_cow) + xfs_info(mp, "using DEBUG-only always_cow mode."); } if (xfs_sb_version_hasrmapbt(&mp->m_sb) && mp->m_sb.sb_rblocks) { diff --git a/fs/xfs/xfs_sysctl.h b/fs/xfs/xfs_sysctl.h index 168488130a19..ad7f9be13087 100644 --- a/fs/xfs/xfs_sysctl.h +++ b/fs/xfs/xfs_sysctl.h @@ -85,6 +85,7 @@ struct xfs_globals { int log_recovery_delay; /* log recovery delay (secs) */ int mount_delay; /* mount setup delay (secs) */ bool bug_on_assert; /* BUG() the kernel on assert failure */ + bool always_cow; /* use COW fork for all overwrites */ }; extern struct xfs_globals xfs_globals; diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c index cd6a994a7250..cabda13f3c64 100644 --- a/fs/xfs/xfs_sysfs.c +++ b/fs/xfs/xfs_sysfs.c @@ -183,10 +183,34 @@ mount_delay_show( } XFS_SYSFS_ATTR_RW(mount_delay); +static ssize_t +always_cow_store( + struct kobject *kobject, + const char *buf, + size_t count) +{ + ssize_t ret; + + ret = kstrtobool(buf, &xfs_globals.always_cow); + if (ret < 0) + return ret; + return count; +} + +static ssize_t +always_cow_show( + struct kobject *kobject, + char *buf) +{ + return snprintf(buf, PAGE_SIZE, "%d\n", xfs_globals.always_cow); +} +XFS_SYSFS_ATTR_RW(always_cow); + static struct attribute *xfs_dbg_attrs[] = { ATTR_LIST(bug_on_assert), ATTR_LIST(log_recovery_delay), ATTR_LIST(mount_delay), + ATTR_LIST(always_cow), NULL, };