From patchwork Mon Jul 25 07:51:41 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiaoguang Wang
X-Patchwork-Id: 9245375
From: Wang Xiaoguang
To: linux-btrfs@vger.kernel.org
Cc: dsterba@suse.cz, jbacik@fb.com, holger@applied-asynchrony.com
Subject: [PATCH v2 4/4] btrfs: should block unused block groups deletion work when
 allocating data space
Date: Mon, 25 Jul 2016 15:51:41 +0800
Message-Id: <20160725075141.5712-5-wangxg.fnst@cn.fujitsu.com>
X-Mailer: git-send-email 2.9.0
In-Reply-To: <20160725075141.5712-1-wangxg.fnst@cn.fujitsu.com>
References: <20160725075141.5712-1-wangxg.fnst@cn.fujitsu.com>

cleaner_kthread() may run at any time and call btrfs_delete_unused_bgs()
to delete unused block groups. Because this work is asynchronous, it can
also cause a false ENOSPC error. See the race window below:

               CPU1                           |             CPU2
                                              |
|-> btrfs_alloc_data_chunk_ondemand()         |-> cleaner_kthread()
    |-> do_chunk_alloc()                      |   |
    |   assume it returns ENOSPC, which means |   |
    |   btrfs_space_info is full and has no   |   |
    |   free space to satisfy the data        |   |
    |   request.                              |   |
    |                                         |   |-> btrfs_delete_unused_bgs()
    |                                         |   |   it will decrease btrfs_space_info
    |                                         |   |   total_bytes and make
    |                                         |   |   btrfs_space_info not full.
    |                                         |   |

In this case we may return ENOSPC even though btrfs_space_info is not
full.

To fix this, btrfs_alloc_data_chunk_ondemand() must block
btrfs_delete_unused_bgs() whenever it may need to call do_chunk_alloc()
to allocate a new chunk. Introduce a new struct rw_semaphore,
bg_delete_sem, for this purpose.

Signed-off-by: Wang Xiaoguang
---
 fs/btrfs/ctree.h       |  1 +
 fs/btrfs/disk-io.c     |  1 +
 fs/btrfs/extent-tree.c | 40 ++++++++++++++++++++++++++++++++++------
 3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 7eb2913..bf0751d 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -800,6 +800,7 @@ struct btrfs_fs_info {
 	struct mutex cleaner_mutex;
 	struct mutex chunk_mutex;
 	struct mutex volume_mutex;
+	struct rw_semaphore bg_delete_sem;
 
 	/*
 	 * this is taken to make sure we don't set block groups ro after
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 60ce119..65a1465 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2676,6 +2676,7 @@ int open_ctree(struct super_block *sb,
 	mutex_init(&fs_info->ordered_operations_mutex);
 	mutex_init(&fs_info->tree_log_mutex);
 	mutex_init(&fs_info->chunk_mutex);
+	init_rwsem(&fs_info->bg_delete_sem);
 	mutex_init(&fs_info->transaction_kthread_mutex);
 	mutex_init(&fs_info->cleaner_mutex);
 	mutex_init(&fs_info->volume_mutex);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index df8d756..d1f8638 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4111,6 +4111,7 @@ int btrfs_alloc_data_chunk_ondemand(struct inode *inode, u64 bytes)
 	int ret = 0;
 	int need_commit = 2;
 	int have_pinned_space;
+	int have_bg_delete_sem = 0;
 
 	/* make sure bytes are sectorsize aligned */
 	bytes = ALIGN(bytes, root->sectorsize);
@@ -4121,8 +4122,11 @@ int btrfs_alloc_data_chunk_ondemand(struct inode *inode, u64 bytes)
 	}
 
 	data_sinfo = fs_info->data_sinfo;
-	if (!data_sinfo)
+	if (!data_sinfo) {
+		down_read(&root->fs_info->bg_delete_sem);
+		have_bg_delete_sem = 1;
 		goto alloc;
+	}
 
 again:
 	/* make sure we have enough space to handle the data first */
@@ -4134,10 +4138,21 @@ again:
 	if (used + bytes > data_sinfo->total_bytes) {
 		struct btrfs_trans_handle *trans;
 
+		spin_unlock(&data_sinfo->lock);
+		/*
+		 * We may need to allocate new chunk, so we should block
+		 * btrfs_delete_unused_bgs()
+		 */
+		if (have_bg_delete_sem == 0) {
+			down_read(&root->fs_info->bg_delete_sem);
+			have_bg_delete_sem = 1;
+		}
+
 		/*
 		 * if we don't have enough free bytes in this space then we need
 		 * to alloc a new chunk.
 		 */
+		spin_lock(&data_sinfo->lock);
 		if (!data_sinfo->full) {
 			u64 alloc_target;
 
@@ -4156,17 +4171,20 @@ alloc:
 			 * the fs.
 			 */
 			trans = btrfs_join_transaction(root);
-			if (IS_ERR(trans))
+			if (IS_ERR(trans)) {
+				up_read(&root->fs_info->bg_delete_sem);
 				return PTR_ERR(trans);
+			}
 
 			ret = do_chunk_alloc(trans, root->fs_info->extent_root,
 					     alloc_target,
 					     CHUNK_ALLOC_NO_FORCE);
 			btrfs_end_transaction(trans, root);
 			if (ret < 0) {
-				if (ret != -ENOSPC)
+				if (ret != -ENOSPC) {
+					up_read(&root->fs_info->bg_delete_sem);
 					return ret;
-				else {
+				} else {
 					have_pinned_space = 1;
 					goto commit_trans;
 				}
@@ -4200,15 +4218,19 @@ commit_trans:
 			}
 
 			trans = btrfs_join_transaction(root);
-			if (IS_ERR(trans))
+			if (IS_ERR(trans)) {
+				up_read(&root->fs_info->bg_delete_sem);
 				return PTR_ERR(trans);
+			}
 			if (have_pinned_space >= 0 ||
 			    test_bit(BTRFS_TRANS_HAVE_FREE_BGS,
 				     &trans->transaction->flags) ||
 			    need_commit > 0) {
 				ret = btrfs_commit_transaction(trans, root);
-				if (ret)
+				if (ret) {
+					up_read(&root->fs_info->bg_delete_sem);
 					return ret;
+				}
 				/*
 				 * The cleaner kthread might still be doing iput
 				 * operations. Wait for it to finish so that
@@ -4225,6 +4247,7 @@ commit_trans:
 		trace_btrfs_space_reservation(root->fs_info,
 					      "space_info:enospc",
 					      data_sinfo->flags, bytes, 1);
+		up_read(&root->fs_info->bg_delete_sem);
 		return -ENOSPC;
 	}
 	data_sinfo->bytes_may_use += bytes;
@@ -4232,6 +4255,9 @@ commit_trans:
 				      data_sinfo->flags, bytes, 1);
 	spin_unlock(&data_sinfo->lock);
 
+	if (have_bg_delete_sem == 1)
+		up_read(&root->fs_info->bg_delete_sem);
+
 	return ret;
 }
 
@@ -10594,6 +10620,7 @@ void btrfs_delete_unused_bgs(struct btrfs_fs_info *fs_info)
 		spin_unlock(&fs_info->unused_bgs_lock);
 
 		mutex_lock(&fs_info->delete_unused_bgs_mutex);
+		down_write(&root->fs_info->bg_delete_sem);
 
 		/* Don't want to race with allocators so take the groups_sem */
 		down_write(&space_info->groups_sem);
@@ -10721,6 +10748,7 @@ void btrfs_delete_unused_bgs(struct btrfs_fs_info *fs_info)
 end_trans:
 		btrfs_end_transaction(trans, root);
 next:
+		up_write(&root->fs_info->bg_delete_sem);
 		mutex_unlock(&fs_info->delete_unused_bgs_mutex);
 		btrfs_put_block_group(block_group);
 		spin_lock(&fs_info->unused_bgs_lock);