From patchwork Thu Oct 23 02:37:51 2014
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 5137811
From: Qu Wenruo
Subject: [PATCH] btrfs: Enhance btrfs chunk allocation algorithm to reduce ENOSPC caused by unbalanced data/metadata allocation.
Date: Thu, 23 Oct 2014 10:37:51 +0800
Message-ID: <1414031871-10859-1-git-send-email-quwenruo@cn.fujitsu.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

When btrfs allocates a chunk, it tries to allocate up to 1G for data and
256M for metadata, or 10% of all the writeable space, if there is enough
space for the stripe on the device.

However, when space is running out, this can lead to unbalanced chunk
allocation.
For example, if only 1G of unallocated space remains and a request to
allocate a DATA chunk arrives, all of that space is allocated as a data
chunk. A later metadata chunk allocation request then cannot be
satisfied, which causes ENOSPC.
This is one of the common complaints from end users: ENOSPC happens
although there is still available space.

This patch tries not to allocate a chunk that is larger than half of the
unallocated space, which keeps the last of the space more balanced at
the small cost of more fragmented chunks in the last 1G.

A simple example:
Preallocate 17.5G on an empty 20G btrfs fs:

[Before]
 # btrfs fi show /mnt/test
Label: none  uuid: da8741b1-5d47-4245-9e94-bfccea34e91e
	Total devices 1 FS bytes used 17.50GiB
	devid    1 size 20.00GiB used 20.00GiB path /dev/sdb
All space is allocated. No space is left for later metadata allocation.

[After]
 # btrfs fi show /mnt/test
Label: none  uuid: e6935aeb-a232-4140-84f9-80aab1f23d56
	Total devices 1 FS bytes used 17.50GiB
	devid    1 size 20.00GiB used 19.77GiB path /dev/sdb
About 230M is still available for later metadata allocation.
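
The sizing rule described above can be sketched as a standalone userspace
function. This is only an illustration of the policy as the commit message
and the code comment describe it, not the kernel code: the helper name
clamp_chunk_size() and the MiB macro are hypothetical, and the real logic
lives inline in __btrfs_alloc_chunk().

```c
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/*
 * Hypothetical helper (not a kernel function): given the total
 * unallocated space across devices and the usual per-type cap
 * (e.g. 1G for data), return the cap to use for the new chunk.
 */
static uint64_t clamp_chunk_size(uint64_t total_avail, uint64_t max_chunk)
{
	uint64_t half = total_avail / 2;

	/*
	 * Between 16M and 32M (exclusive), halving would leave a
	 * remainder smaller than the 16M minimum chunk size, so
	 * take everything instead.
	 */
	if (total_avail > 16 * MiB && total_avail < 32 * MiB)
		return total_avail;

	/* Otherwise never take more than half of what is left. */
	return half < max_chunk ? half : max_chunk;
}
```

With 1G unallocated and a 1G cap, this returns 512M, so a data chunk
request leaves room for a later metadata chunk. With 20M unallocated it
returns the full 20M, and with plenty of space (say 10G) the usual 1G cap
is unchanged.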

Signed-off-by: Qu Wenruo
---
 fs/btrfs/volumes.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index d47289c..fa8de79 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4240,6 +4240,7 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	int ret;
 	u64 max_stripe_size;
 	u64 max_chunk_size;
+	u64 total_avail_space = 0;
 	u64 stripe_size;
 	u64 num_bytes;
 	u64 raid_stripe_len = BTRFS_STRIPE_LEN;
@@ -4352,10 +4353,26 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		devices_info[ndevs].max_avail = max_avail;
 		devices_info[ndevs].total_avail = total_avail;
 		devices_info[ndevs].dev = device;
+		total_avail_space += total_avail;
 		++ndevs;
 	}
 
 	/*
+	 * Try not to occupy more than half of the unallocated space.
+	 * When space runs short, allocating all of it to data or to
+	 * metadata makes ENOSPC much easier to trigger.
+	 *
+	 * Since the minimum chunk size is 16M, halving would allocate 16M
+	 * out of e.g. 20M of available space and the remaining 4M would
+	 * never be used. In that case (16~32M), allocate it all directly.
+	 */
+	if (total_avail_space < 32 * 1024 * 1024 &&
+	    total_avail_space > 16 * 1024 * 1024)
+		max_chunk_size = total_avail_space;
+	else
+		max_chunk_size = min(total_avail_space / 2, max_chunk_size);
+
+	/*
 	 * now sort the devices by hole size / available space
 	 */
 	sort(devices_info, ndevs, sizeof(struct btrfs_device_info),