From patchwork Sun Sep 21 18:55:30 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chandan Rajendra
X-Patchwork-Id: 4944531
From: Chandan Rajendra
To: clm@fb.com, jbacik@fb.com, bo.li.liu@oracle.com, dsterba@suse.cz
Cc: Chandan Rajendra, aneesh.kumar@linux.vnet.ibm.com,
 linux-btrfs@vger.kernel.org
Subject: [RFC PATCH V7 16/16] Btrfs: subpagesize-blocksize: Track blocks of
 ordered extent submitted for write I/O.
Date: Mon, 22 Sep 2014 00:25:30 +0530
Message-Id: <1411325730-21817-17-git-send-email-chandan@linux.vnet.ibm.com>
X-Mailer: git-send-email 2.1.0
In-Reply-To: <1411325730-21817-1-git-send-email-chandan@linux.vnet.ibm.com>
References: <1411325730-21817-1-git-send-email-chandan@linux.vnet.ibm.com>
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

In the subpagesize-blocksize scenario, the following command (with 4k as the
PAGE_SIZE and 2k as the block size) can cause incorrect accounting of the
blocks of an ordered extent that is written to disk:

$ xfs_io -f -c "pwrite 0 10240" \
	-c "sync_range 0 4096" \
	-c "sync_range 8192 2048" \
	-c "pwrite 10240 2048" \
	-c "sync_range 10240 2048" \
	/mnt/btrfs/file.bin

To fix this, explicitly track the blocks of an ordered extent that have
already been submitted for write I/O.

Signed-off-by: Chandan Rajendra
---
 fs/btrfs/extent_io.c    | 24 ++++++++++++++++++++++--
 fs/btrfs/ordered-data.c |  4 +++-
 fs/btrfs/ordered-data.h |  4 ++++
 3 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index ccd9e1c..2cf9e59 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3201,6 +3201,8 @@ static noinline_for_stack int __extent_writepage_io(struct inode *inode,
 	u64 extent_offset;
 	u64 extent_end;
 	u64 iosize;
+	u64 blk, nr_blks;
+	u64 blk_submitted;
 	sector_t sector;
 	struct extent_state *cached_state = NULL;
 	struct block_device *bdev;
@@ -3267,11 +3269,26 @@ static noinline_for_stack int __extent_writepage_io(struct inode *inode,
 		iosize = min(extent_end - cur, end - cur + 1);
 		iosize = ALIGN(iosize, blocksize);
 
+		blk = extent_offset >> inode->i_sb->s_blocksize_bits;
+		nr_blks = iosize >> inode->i_sb->s_blocksize_bits;
+
+		blk_submitted = find_next_bit(ordered->blocks_submitted,
+					ordered->len >> inode->i_sb->s_blocksize_bits,
+					blk);
+		if (blk_submitted < blk + nr_blks) {
+			if (blk_submitted == blk) {
+				cur += blocksize;
+				btrfs_put_ordered_extent(ordered);
+				continue;
+			}
+			iosize = (blk_submitted - blk)
+				<< inode->i_sb->s_blocksize_bits;
+			nr_blks = iosize >> inode->i_sb->s_blocksize_bits;
+		}
+
 		sector = (ordered->start + extent_offset) >> 9;
 		bdev = BTRFS_I(inode)->root->fs_info->fs_devices->latest_bdev;
 		compressed = test_bit(BTRFS_ORDERED_COMPRESSED, &ordered->flags);
-		btrfs_put_ordered_extent(ordered);
-		ordered = NULL;
 
 		/*
 		 * compressed and inline extents are written through other
@@ -3284,6 +3301,7 @@ static noinline_for_stack int __extent_writepage_io(struct inode *inode,
 			 */
 			nr++;
 			cur += iosize;
+			btrfs_put_ordered_extent(ordered);
 			continue;
 		}
 
@@ -3298,6 +3316,8 @@ static noinline_for_stack int __extent_writepage_io(struct inode *inode,
 		} else {
 			unsigned long max_nr = (i_size >> PAGE_CACHE_SHIFT) + 1;
 
+			bitmap_set(ordered->blocks_submitted, blk, nr_blks);
+			btrfs_put_ordered_extent(ordered);
 			set_range_writeback(tree, cur, cur + iosize - 1);
 			if (!PageWriteback(page)) {
 				btrfs_err(BTRFS_I(inode)->root->fs_info,
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 4d9832f..59b2544 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -199,13 +199,15 @@ static int __btrfs_add_ordered_extent(struct inode *inode, u64 file_offset,
 	nr_longs = BITS_TO_LONGS(len >> inode->i_sb->s_blocksize_bits);
 	if (nr_longs == 1) {
 		entry->blocks_done = &entry->blocks_bitmap;
+		entry->blocks_submitted = &entry->blocks_submitted_bitmap;
 	} else {
-		entry->blocks_done = kzalloc(nr_longs * sizeof(unsigned long),
+		entry->blocks_done = kzalloc(2 * nr_longs * sizeof(unsigned long),
 					GFP_NOFS);
 		if (!entry->blocks_done) {
 			kmem_cache_free(btrfs_ordered_extent_cache, entry);
 			return -ENOMEM;
 		}
+		entry->blocks_submitted = entry->blocks_done + nr_longs;
 	}
 
 	entry->file_offset = file_offset;
diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
index 7de3b1e..851914c 100644
--- a/fs/btrfs/ordered-data.h
+++ b/fs/btrfs/ordered-data.h
@@ -139,6 +139,10 @@ struct btrfs_ordered_extent {
 	/* bitmap to track the blocks that have been written to disk */
 	unsigned long *blocks_done;
 	unsigned long blocks_bitmap;
+
+	/* bitmap to track the blocks that have been submitted for write i/o */
+	unsigned long *blocks_submitted;
+	unsigned long blocks_submitted_bitmap;
 };
 
 /*