From patchwork Thu Jan 2 21:26:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Zhou X-Patchwork-Id: 11316075 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 691BE138C for ; Thu, 2 Jan 2020 21:26:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 38BD421D7D for ; Thu, 2 Jan 2020 21:26:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1578000414; bh=TQzFZ9G+ZGRQARDVmZl0ugpqhLEZvTnZnVN8EMXDPUk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:In-Reply-To: References:List-ID:From; b=khdtfglQ4atLZ5IMHu/++JHU+5XhnlrLAIAvv/yH+Xyz1EYpEJnheHWaKI+PHhN9X s3IAPXx5MvpXaHdZDQ0EYOkhEpcsWAZt/qZIp+2x/H+GriXlBlipY61mULC+ONJh0z LKKbZjGjdS17138tEE0eMVnik48MGcFFhvqkHR7g= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725883AbgABV0x (ORCPT ); Thu, 2 Jan 2020 16:26:53 -0500 Received: from mail-qt1-f196.google.com ([209.85.160.196]:35443 "EHLO mail-qt1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725783AbgABV0x (ORCPT ); Thu, 2 Jan 2020 16:26:53 -0500 Received: by mail-qt1-f196.google.com with SMTP id e12so35611975qto.2 for ; Thu, 02 Jan 2020 13:26:52 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:in-reply-to:references; bh=RaWEyrwdMVsq+Nf1/ZtjAs8mbWCFevwKXtmeu9yks4g=; b=HtrZ4jyK/qrVVPJjTacP6TjvdjikAlDND/oyv5MuvXdE+KjvYcZtPHCObn70OMYAJa d1HwYAFs4Cy3VrRF/CWJlavJdnTO0/X6lav9uILBNotiSoPigjQody0ORTCfuh5kD56q YrtGPct3W3tt7WmtNSd1zpXGZynC/eLCZxaPOKzYhZVJXR+re1UlH7XhJjoKrZLGxDkn 9Us0sTAOX8As0Mu5EraRz3YSNjC0hbtgje10FhlrjpFW2oLYmAD2Y4iyKZYRTcr8Xopp 
D7dmq4cfNYaIQeKK7BwDQnRaQgcwq64KLmBnHsM9jJzJTsY1qIKxprWA8uARPeV12aHJ IYtw== X-Gm-Message-State: APjAAAUWwq1x5AzPDtVD3Fy0SiX3wvFLAitZ3LzR0Ml8BwtyA/6ZWoW1 Dp3BpEsVLs2oRMf57Q85ZRk= X-Google-Smtp-Source: APXvYqw4gty4fpjip/T4EIMUIvmg1oSbiAwr2s8VHmmk4SC9D6l6g/g15YzXT8SKNmH8NsWlTXByrQ== X-Received: by 2002:ac8:4647:: with SMTP id f7mr61416874qto.361.1578000411898; Thu, 02 Jan 2020 13:26:51 -0800 (PST) Received: from dennisz-mbp.thefacebook.com ([163.114.130.128]) by smtp.gmail.com with ESMTPSA id f42sm17553933qta.0.2020.01.02.13.26.50 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Thu, 02 Jan 2020 13:26:51 -0800 (PST)
From: Dennis Zhou
To: David Sterba , Chris Mason , Josef Bacik , Omar Sandoval
Cc: kernel-team@fb.com, linux-btrfs@vger.kernel.org, Dennis Zhou
Subject: [PATCH 01/12] btrfs: calculate discard delay based on number of extents
Date: Thu, 2 Jan 2020 16:26:35 -0500
Message-Id:
X-Mailer: git-send-email 2.13.5
In-Reply-To:
References:
In-Reply-To:
References:
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

An earlier patch keeps track of discardable_extents. These are
undiscarded extents managed by the free space cache. Here, we will use
this to dynamically calculate the discard delay interval.

There are 3 rates to consider. The first is the target convergence rate,
the rate to discard all discardable_extents over the
BTRFS_DISCARD_TARGET_MSEC time frame. This is clamped by the lower
limit, the iops limit or BTRFS_DISCARD_MIN_DELAY_MSEC (1ms), and the
upper limit, BTRFS_DISCARD_MAX_DELAY_MSEC (1s). We reevaluate this delay
every transaction commit.
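To make the clamping concrete, here is a minimal userspace sketch of the
calculation described above. It is not the kernel code: calc_delay_msec() is a
hypothetical helper, the constants drop the BTRFS_ prefix, the extent count is
taken as an unsigned value, and the result stays in milliseconds (the patch
converts it to jiffies).

```c
#include <assert.h>

/* Values mirror the #defines in the patch; all delays are in milliseconds. */
#define MSEC_PER_SEC		1000UL
#define DISCARD_TARGET_MSEC	(6 * 60 * 60UL * MSEC_PER_SEC)
#define DISCARD_MIN_DELAY_MSEC	1UL
#define DISCARD_MAX_DELAY_MSEC	1000UL

/*
 * Hypothetical userspace model of the base delay calculation.
 * prev_delay is returned unchanged when there is nothing to discard,
 * matching the early return in the patch.
 */
static unsigned long calc_delay_msec(unsigned long discardable_extents,
				     unsigned int iops_limit,
				     unsigned long prev_delay)
{
	unsigned long lower_limit = DISCARD_MIN_DELAY_MSEC;
	unsigned long delay;

	if (!discardable_extents)
		return prev_delay;

	/* An iops limit of N means at least 1000 / N ms between discards. */
	if (iops_limit) {
		unsigned long iops_delay = MSEC_PER_SEC / iops_limit;

		if (iops_delay > lower_limit)
			lower_limit = iops_delay;
	}

	/* 1000 / N never exceeds 1000, so the clamp range is well formed. */
	assert(lower_limit <= DISCARD_MAX_DELAY_MSEC);

	/* Target rate: spread all extents evenly over the target window. */
	delay = DISCARD_TARGET_MSEC / discardable_extents;
	if (delay < lower_limit)
		delay = lower_limit;
	if (delay > DISCARD_MAX_DELAY_MSEC)
		delay = DISCARD_MAX_DELAY_MSEC;

	return delay;
}
```

With the default iops limit of 10, the lower limit is 100ms, so only extent
counts between 21,600 and 216,000 produce a delay strictly between the two
limits; anything outside that range is pinned to 1s or 100ms.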
Signed-off-by: Dennis Zhou
Reviewed-by: Josef Bacik
---
 fs/btrfs/ctree.h       |  2 ++
 fs/btrfs/discard.c     | 55 +++++++++++++++++++++++++++++++++++++++---
 fs/btrfs/discard.h     |  1 +
 fs/btrfs/extent-tree.c |  4 ++-
 fs/btrfs/sysfs.c       | 31 ++++++++++++++++++++++++
 5 files changed, 88 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 7c1c236d13ae..c73bbc7e4491 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -468,6 +468,8 @@ struct btrfs_discard_ctl {
 	struct list_head discard_list[BTRFS_NR_DISCARD_LISTS];
 	atomic_t discardable_extents;
 	atomic64_t discardable_bytes;
+	unsigned long delay;
+	unsigned iops_limit;
 };
 
 /* delayed seq elem */
diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
index 173770bf8a2d..abcc3b2189d1 100644
--- a/fs/btrfs/discard.c
+++ b/fs/btrfs/discard.c
@@ -15,6 +15,12 @@
 #define BTRFS_DISCARD_DELAY		(120ULL * NSEC_PER_SEC)
 #define BTRFS_DISCARD_UNUSED_DELAY	(10ULL * NSEC_PER_SEC)
 
+/* Target completion latency of discarding all discardable extents */
+#define BTRFS_DISCARD_TARGET_MSEC	(6 * 60 * 60UL * MSEC_PER_SEC)
+#define BTRFS_DISCARD_MIN_DELAY_MSEC	(1UL)
+#define BTRFS_DISCARD_MAX_DELAY_MSEC	(1000UL)
+#define BTRFS_DISCARD_MAX_IOPS		(10U)
+
 static struct list_head *get_discard_list(struct btrfs_discard_ctl *discard_ctl,
 					  struct btrfs_block_group *block_group)
 {
@@ -235,11 +241,18 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
 
 	block_group = find_next_block_group(discard_ctl, now);
 	if (block_group) {
-		u64 delay = 0;
+		unsigned long delay = discard_ctl->delay;
+
+		/*
+		 * This timeout is to hopefully prevent immediate discarding
+		 * in a recently allocated block group.
+		 */
+		if (now < block_group->discard_eligible_time) {
+			u64 bg_timeout = (block_group->discard_eligible_time -
+					  now);
 
-		if (now < block_group->discard_eligible_time)
-			delay = nsecs_to_jiffies(
-				block_group->discard_eligible_time - now);
+			delay = max(delay, nsecs_to_jiffies(bg_timeout));
+		}
 
 		mod_delayed_work(discard_ctl->discard_workers,
 				 &discard_ctl->work, delay);
@@ -342,6 +355,38 @@ bool btrfs_run_discard_work(struct btrfs_discard_ctl *discard_ctl)
 		test_bit(BTRFS_FS_DISCARD_RUNNING, &fs_info->flags));
 }
 
+/**
+ * btrfs_discard_calc_delay - recalculate the base delay
+ * @discard_ctl: discard control
+ *
+ * Recalculate the base delay which is based off the total number of
+ * discardable_extents. Clamp this between the lower_limit (iops_limit or 1ms)
+ * and the upper_limit (BTRFS_DISCARD_MAX_DELAY_MSEC).
+ */
+void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
+{
+	s32 discardable_extents =
+		atomic_read(&discard_ctl->discardable_extents);
+	unsigned iops_limit;
+	unsigned long delay, lower_limit = BTRFS_DISCARD_MIN_DELAY_MSEC;
+
+	if (!discardable_extents)
+		return;
+
+	spin_lock(&discard_ctl->lock);
+
+	iops_limit = READ_ONCE(discard_ctl->iops_limit);
+	if (iops_limit)
+		lower_limit = max_t(unsigned long, lower_limit,
+				    MSEC_PER_SEC / iops_limit);
+
+	delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
+	delay = clamp(delay, lower_limit, BTRFS_DISCARD_MAX_DELAY_MSEC);
+	discard_ctl->delay = msecs_to_jiffies(delay);
+
+	spin_unlock(&discard_ctl->lock);
+}
+
 /**
  * btrfs_discard_update_discardable - propagate discard counters
  * @block_group: block_group of interest
@@ -464,6 +509,8 @@ void btrfs_discard_init(struct btrfs_fs_info *fs_info)
 
 	atomic_set(&discard_ctl->discardable_extents, 0);
 	atomic64_set(&discard_ctl->discardable_bytes, 0);
+	discard_ctl->delay = BTRFS_DISCARD_MAX_DELAY_MSEC;
+	discard_ctl->iops_limit = BTRFS_DISCARD_MAX_IOPS;
 }
 
 void btrfs_discard_cleanup(struct btrfs_fs_info *fs_info)
diff --git a/fs/btrfs/discard.h b/fs/btrfs/discard.h
index 0f2f89b1b0b9..5250fe178e49 100644
--- a/fs/btrfs/discard.h
+++ b/fs/btrfs/discard.h
@@ -17,6 +17,7 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
 bool btrfs_run_discard_work(struct btrfs_discard_ctl *discard_ctl);
 
 /* Update operations */
+void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl);
 void btrfs_discard_update_discardable(struct btrfs_block_group *block_group,
 				      struct btrfs_free_space_ctl *ctl);
 
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 2c12366cfde5..0163fdd59f8f 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2935,8 +2935,10 @@ int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans)
 		cond_resched();
 	}
 
-	if (btrfs_test_opt(fs_info, DISCARD_ASYNC))
+	if (btrfs_test_opt(fs_info, DISCARD_ASYNC)) {
+		btrfs_discard_calc_delay(&fs_info->discard_ctl);
 		btrfs_discard_schedule_work(&fs_info->discard_ctl, true);
+	}
 
 	/*
 	 * Transaction is finished. We don't need the lock anymore. We
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index e9dbdbbbebeb..e175aaf7a1e6 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -344,6 +344,36 @@ static const struct attribute_group btrfs_static_feature_attr_group = {
  */
 #define discard_to_fs_info(_kobj)	to_fs_info((_kobj)->parent->parent)
 
+static ssize_t btrfs_discard_iops_limit_show(struct kobject *kobj,
+					     struct kobj_attribute *a,
+					     char *buf)
+{
+	struct btrfs_fs_info *fs_info = discard_to_fs_info(kobj);
+
+	return snprintf(buf, PAGE_SIZE, "%u\n",
+			READ_ONCE(fs_info->discard_ctl.iops_limit));
+}
+
+static ssize_t btrfs_discard_iops_limit_store(struct kobject *kobj,
+					      struct kobj_attribute *a,
+					      const char *buf, size_t len)
+{
+	struct btrfs_fs_info *fs_info = discard_to_fs_info(kobj);
+	struct btrfs_discard_ctl *discard_ctl = &fs_info->discard_ctl;
+	unsigned iops_limit;
+	int ret;
+
+	ret = kstrtouint(buf, 10, &iops_limit);
+	if (ret)
+		return -EINVAL;
+
+	WRITE_ONCE(discard_ctl->iops_limit, iops_limit);
+
+	return len;
+}
+BTRFS_ATTR_RW(discard, iops_limit, btrfs_discard_iops_limit_show,
+	      btrfs_discard_iops_limit_store);
+
 static ssize_t btrfs_discardable_extents_show(struct kobject *kobj,
 					      struct kobj_attribute *a,
 					      char *buf)
@@ -367,6 +397,7 @@ static ssize_t btrfs_discardable_bytes_show(struct kobject *kobj,
 BTRFS_ATTR(discard, discardable_bytes, btrfs_discardable_bytes_show);
 
 static const struct attribute *discard_debug_attrs[] = {
+	BTRFS_ATTR_PTR(discard, iops_limit),
 	BTRFS_ATTR_PTR(discard, discardable_extents),
 	BTRFS_ATTR_PTR(discard, discardable_bytes),
 	NULL,

From patchwork Thu Jan 2 21:26:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Zhou X-Patchwork-Id: 11316077 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 82964138C for ; Thu, 2 Jan 2020 21:26:55 +0000 (UTC) Received: from
vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 556FD21D7D for ; Thu, 2 Jan 2020 21:26:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1578000415; bh=GtHungNKzxcXo5NSr48/08s3M0zeWADScmQLJcjC5KQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:In-Reply-To: References:List-ID:From; b=tRRYaomiVSG4eAJZY5EC/AIGcz/XmMGSGh9wUAgTJBNEkAy9DWp3T5fbZ5NJ/XHK0 GPOX8Bwd49sP+SMxPGGp7BO938HTlduaslKV35S/96Mql0iJ0aixC+3vckJYotNPhU rRSQ/HTQ1YWz/NkmkHvcprYZFGx7fxhg744h3xXI= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725916AbgABV0z (ORCPT ); Thu, 2 Jan 2020 16:26:55 -0500 Received: from mail-qk1-f195.google.com ([209.85.222.195]:42060 "EHLO mail-qk1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725876AbgABV0y (ORCPT ); Thu, 2 Jan 2020 16:26:54 -0500 Received: by mail-qk1-f195.google.com with SMTP id z14so31115562qkg.9 for ; Thu, 02 Jan 2020 13:26:53 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:in-reply-to:references; bh=iyfaqzULUE1H7nul54Fg/Doy0YpQWDaHW1wBXlrhRhA=; b=hZUlPUoy/lZpupYVDCMmcFBNRVCL29S/AdYn9xrZeqoCcjoucJEYDaBOshx2doU11y 6SfQ+ZrTk9HzHG0U4gyT5fxM/S2L5/bjVVNvSbL3VuzQ+74Bxg8z72St6RHo2fG4VorZ nq3yDpVNa7YO31dHrs1VffhREbQV71zUJdPFl9KI3jQR5qx5K+Nup95noNCkrhjl3K50 QBliCTejiStGPcztkhVgzmzdUWEyFTT4h4xtm11bsoFPbGotDSPOT6ARU9Q/oPmUMsAz QisOTNu0ReG1t9Mg+nvbdgbqr7/f5fklslc7m1VmLTG73NmVXssmuGDT3kPTTHjlpFyD QGxg== X-Gm-Message-State: APjAAAWePpWrWeO5nuV/03gixrcpz5EHRRw4qBYoKuUeMitv1YbWiyq0 2Yg2qHMiy6Py1hHH0YYGheY= X-Google-Smtp-Source: APXvYqxaaLhjii8+9gOJMyW91NCLGfNOvaivd70JC87ePxQQepb748jGkNaCjgLIX8ACVDkAR+rb4w== X-Received: by 2002:a37:ac12:: with SMTP id e18mr69367381qkm.94.1578000413053; Thu, 02 Jan 2020 13:26:53 -0800 (PST) Received: from dennisz-mbp.thefacebook.com ([163.114.130.128]) by 
smtp.gmail.com with ESMTPSA id f42sm17553933qta.0.2020.01.02.13.26.52 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Thu, 02 Jan 2020 13:26:52 -0800 (PST)
From: Dennis Zhou
To: David Sterba , Chris Mason , Josef Bacik , Omar Sandoval
Cc: kernel-team@fb.com, linux-btrfs@vger.kernel.org, Dennis Zhou
Subject: [PATCH 02/12] btrfs: add bps discard rate limit for async discard
Date: Thu, 2 Jan 2020 16:26:36 -0500
Message-Id: <8929cde12ad0237ab8a4e2bbdff7b713886eff56.1577999991.git.dennis@kernel.org>
X-Mailer: git-send-email 2.13.5
In-Reply-To:
References:
In-Reply-To:
References:
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

Provide the ability to rate limit based on kbps in addition to iops as
additional guides for the target discard rate. The delay used ends up
being max(kbps_delay, iops_delay).

Signed-off-by: Dennis Zhou
---
 fs/btrfs/ctree.h   |  2 ++
 fs/btrfs/discard.c | 23 +++++++++++++++++++++--
 fs/btrfs/sysfs.c   | 31 +++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index c73bbc7e4491..2485cf94b628 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -466,10 +466,12 @@ struct btrfs_discard_ctl {
 	spinlock_t lock;
 	struct btrfs_block_group *block_group;
 	struct list_head discard_list[BTRFS_NR_DISCARD_LISTS];
+	u64 prev_discard;
 	atomic_t discardable_extents;
 	atomic64_t discardable_bytes;
 	unsigned long delay;
 	unsigned iops_limit;
+	u32 kbps_limit;
 };
 
 /* delayed seq elem */
diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
index abcc3b2189d1..eb148ca9a508 100644
--- a/fs/btrfs/discard.c
+++ b/fs/btrfs/discard.c
@@ -4,6 +4,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "ctree.h"
@@ -222,8 +223,8 @@ void btrfs_discard_queue_work(struct btrfs_discard_ctl *discard_ctl,
  * @override: override the current timer
  *
  * Discards are issued by a delayed workqueue item. @override is used to
- * update the current delay as the baseline delay interview is reevaluated
- * on transaction commit. This is also maxed with any other rate limit.
+ * update the current delay as the baseline delay interval is reevaluated on
+ * transaction commit. This is also maxed with any other rate limit.
  */
 void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
 				 bool override)
@@ -242,6 +243,20 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
 	block_group = find_next_block_group(discard_ctl, now);
 	if (block_group) {
 		unsigned long delay = discard_ctl->delay;
+		u32 kbps_limit = READ_ONCE(discard_ctl->kbps_limit);
+
+		/*
+		 * A single delayed workqueue item is responsible for
+		 * discarding, so we can manage the bytes rate limit by keeping
+		 * track of the previous discard.
+		 */
+		if (kbps_limit && discard_ctl->prev_discard) {
+			u64 bps_limit = ((u64)kbps_limit) * SZ_1K;
+			u64 bps_delay = div64_u64(discard_ctl->prev_discard *
+						  MSEC_PER_SEC, bps_limit);
+
+			delay = max(delay, msecs_to_jiffies(bps_delay));
+		}
 
 		/*
 		 * This timeout is to hopefully prevent immediate discarding
@@ -317,6 +332,8 @@ static void btrfs_discard_workfn(struct work_struct *work)
 				       btrfs_block_group_end(block_group),
 				       0, true);
 
+	discard_ctl->prev_discard = trimmed;
+
 	/* Determine next steps for a block_group */
 	if (block_group->discard_cursor >= btrfs_block_group_end(block_group)) {
 		if (discard_state == BTRFS_DISCARD_BITMAPS) {
@@ -507,10 +524,12 @@ void btrfs_discard_init(struct btrfs_fs_info *fs_info)
 	for (i = 0; i < BTRFS_NR_DISCARD_LISTS; i++)
 		INIT_LIST_HEAD(&discard_ctl->discard_list[i]);
 
+	discard_ctl->prev_discard = 0;
 	atomic_set(&discard_ctl->discardable_extents, 0);
 	atomic64_set(&discard_ctl->discardable_bytes, 0);
 	discard_ctl->delay = BTRFS_DISCARD_MAX_DELAY_MSEC;
 	discard_ctl->iops_limit = BTRFS_DISCARD_MAX_IOPS;
+	discard_ctl->kbps_limit = 0;
 }
 
 void btrfs_discard_cleanup(struct btrfs_fs_info *fs_info)
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index e175aaf7a1e6..39b59f368f06 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -374,6 +374,36 @@ static ssize_t btrfs_discard_iops_limit_store(struct kobject *kobj,
 BTRFS_ATTR_RW(discard, iops_limit, btrfs_discard_iops_limit_show,
 	      btrfs_discard_iops_limit_store);
 
+static ssize_t btrfs_discard_kbps_limit_show(struct kobject *kobj,
+					     struct kobj_attribute *a,
+					     char *buf)
+{
+	struct btrfs_fs_info *fs_info = discard_to_fs_info(kobj);
+
+	return snprintf(buf, PAGE_SIZE, "%u\n",
+			READ_ONCE(fs_info->discard_ctl.kbps_limit));
+}
+
+static ssize_t btrfs_discard_kbps_limit_store(struct kobject *kobj,
+					      struct kobj_attribute *a,
+					      const char *buf, size_t len)
+{
+	struct btrfs_fs_info *fs_info = discard_to_fs_info(kobj);
+	struct btrfs_discard_ctl *discard_ctl = &fs_info->discard_ctl;
+	u32 kbps_limit;
+	int ret;
+
+	ret = kstrtou32(buf, 10, &kbps_limit);
+	if (ret)
+		return -EINVAL;
+
+	WRITE_ONCE(discard_ctl->kbps_limit, kbps_limit);
+
+	return len;
+}
+BTRFS_ATTR_RW(discard, kbps_limit, btrfs_discard_kbps_limit_show,
+	      btrfs_discard_kbps_limit_store);
+
 static ssize_t btrfs_discardable_extents_show(struct kobject *kobj,
 					      struct kobj_attribute *a,
 					      char *buf)
@@ -398,6 +428,7 @@ BTRFS_ATTR(discard, discardable_bytes, btrfs_discardable_bytes_show);
 
 static const struct attribute *discard_debug_attrs[] = {
 	BTRFS_ATTR_PTR(discard, iops_limit),
+	BTRFS_ATTR_PTR(discard, kbps_limit),
 	BTRFS_ATTR_PTR(discard, discardable_extents),
 	BTRFS_ATTR_PTR(discard, discardable_bytes),
 	NULL,

From patchwork Thu Jan 2 21:26:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Zhou X-Patchwork-Id: 11316079 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3600317F0 for ; Thu, 2 Jan 2020 21:26:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by
mail.kernel.org (Postfix) with ESMTP id 1507821582 for ; Thu, 2 Jan 2020 21:26:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1578000416; bh=AQEhrkt1VHsJN3A96+mQ3QbjR/1+7tdLH8Mo2xHCDNg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:In-Reply-To: References:List-ID:From; b=T+aGGZloDGN4x/DIFYw5VLFBdMJrrDi16kG7/X7RjNVdsO03rEufbmNvE5EWWj0AS VqwcmcLsvOdzOqAclz72tpDJU3hBiCUZSl2f2d0HVzIyumps5dn2psKYTxnLVgexJf JehMl1oJ3zNYU7pUwQvs2I0JwAfYZWcpAIUoLW94= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725943AbgABV0z (ORCPT ); Thu, 2 Jan 2020 16:26:55 -0500 Received: from mail-qv1-f65.google.com ([209.85.219.65]:39423 "EHLO mail-qv1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725783AbgABV0z (ORCPT ); Thu, 2 Jan 2020 16:26:55 -0500 Received: by mail-qv1-f65.google.com with SMTP id y8so15515641qvk.6 for ; Thu, 02 Jan 2020 13:26:54 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:in-reply-to:references; bh=MtLI26uKkR+KnPVywuu3DHK8aDekgZjDX7rcaBxOf1Q=; b=p5K6U5kepYinYYIOSirPl4N3AGtyAVVgsxrVQdPNHKJkQC1Tg/wj5nfqfdnModkAcI ueeed4BMhu9wucUl8LhhWAjplWbF7GTklptR3U4Yj4wOXW/+Qv7h/ODka3pSDzbT62c3 hqFsNJY2OCDq7yExUgS4iOfnseFk9nRhUtSOsBc2Jh/iZfxjLUL5EM6BqMT9+cAWF3FG wm0TSrRGwF3FLtDwir53SP3EVtJyHn0HazXDNmraW+eoDrHolNieK2UkaxW3nvBDXmXT 0QchaOaJZgIWe+M+FJrtAPq+S4aoLEvquMv3St8NSDUxH6T9rjP/ZvDD3sGyeb56uecv hPJw== X-Gm-Message-State: APjAAAUaQgC3E+nCgK5qihA8CtyodhSnJWXXJa+5lFlUXslxkVw4ZT9s Tw2xHEEFy5LpAemriMa8Ui0= X-Google-Smtp-Source: APXvYqzSvN+kulNUxO2u1UyUnlQ4QlwxPLAIc+PWxJN2OYUJZzXQj965eLGh+y5Z/6By12d606XR4w== X-Received: by 2002:a0c:c24f:: with SMTP id w15mr65548308qvh.66.1578000414174; Thu, 02 Jan 2020 13:26:54 -0800 (PST) Received: from dennisz-mbp.thefacebook.com ([163.114.130.128]) by smtp.gmail.com with ESMTPSA id 
f42sm17553933qta.0.2020.01.02.13.26.53 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Thu, 02 Jan 2020 13:26:53 -0800 (PST)
From: Dennis Zhou
To: David Sterba , Chris Mason , Josef Bacik , Omar Sandoval
Cc: kernel-team@fb.com, linux-btrfs@vger.kernel.org, Dennis Zhou
Subject: [PATCH 03/12] btrfs: limit max discard size for async discard
Date: Thu, 2 Jan 2020 16:26:37 -0500
Message-Id:
X-Mailer: git-send-email 2.13.5
In-Reply-To:
References:
In-Reply-To:
References:
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

Throttle the maximum size of a discard so that we can provide an upper
bound for the rate of async discard. While the block layer is able to
split discards into appropriately sized discards, we want to account
more accurately for the rate at which we are consuming NCQ slots, as
well as limit the upper bound of work for a discard.

Signed-off-by: Dennis Zhou
Reviewed-by: Josef Bacik
---
 fs/btrfs/discard.h          |  5 +++++
 fs/btrfs/free-space-cache.c | 41 +++++++++++++++++++++++++++++--------
 2 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/discard.h b/fs/btrfs/discard.h
index 5250fe178e49..562c60fab77a 100644
--- a/fs/btrfs/discard.h
+++ b/fs/btrfs/discard.h
@@ -3,10 +3,15 @@
 #ifndef BTRFS_DISCARD_H
 #define BTRFS_DISCARD_H
 
+#include
+
 struct btrfs_fs_info;
 struct btrfs_discard_ctl;
 struct btrfs_block_group;
 
+/* Discard size limits */
+#define BTRFS_ASYNC_DISCARD_MAX_SIZE	(SZ_64M)
+
 /* Work operations */
 void btrfs_discard_cancel_work(struct btrfs_discard_ctl *discard_ctl,
 			       struct btrfs_block_group *block_group);
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 40fb918a82f4..34291c777998 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -3466,16 +3466,36 @@ static int trim_no_bitmap(struct btrfs_block_group *block_group,
 		extent_start = entry->offset;
 		extent_bytes = entry->bytes;
 		extent_trim_state = entry->trim_state;
-		start = max(start, extent_start);
-		bytes = min(extent_start + extent_bytes, end) - start;
-		if (bytes < minlen) {
-			spin_unlock(&ctl->tree_lock);
-			mutex_unlock(&ctl->cache_writeout_mutex);
-			goto next;
-		}
+		if (async) {
+			start = entry->offset;
+			bytes = entry->bytes;
+			if (bytes < minlen) {
+				spin_unlock(&ctl->tree_lock);
+				mutex_unlock(&ctl->cache_writeout_mutex);
+				goto next;
+			}
+			unlink_free_space(ctl, entry);
+			if (bytes > BTRFS_ASYNC_DISCARD_MAX_SIZE) {
+				bytes = extent_bytes =
+					BTRFS_ASYNC_DISCARD_MAX_SIZE;
+				entry->offset += BTRFS_ASYNC_DISCARD_MAX_SIZE;
+				entry->bytes -= BTRFS_ASYNC_DISCARD_MAX_SIZE;
+				link_free_space(ctl, entry);
+			} else {
+				kmem_cache_free(btrfs_free_space_cachep, entry);
+			}
+		} else {
+			start = max(start, extent_start);
+			bytes = min(extent_start + extent_bytes, end) - start;
+			if (bytes < minlen) {
+				spin_unlock(&ctl->tree_lock);
+				mutex_unlock(&ctl->cache_writeout_mutex);
+				goto next;
+			}
 
-		unlink_free_space(ctl, entry);
-		kmem_cache_free(btrfs_free_space_cachep, entry);
+			unlink_free_space(ctl, entry);
+			kmem_cache_free(btrfs_free_space_cachep, entry);
+		}
 
 		spin_unlock(&ctl->tree_lock);
 		trim_entry.start = extent_start;
@@ -3639,6 +3659,9 @@ static int trim_bitmaps(struct btrfs_block_group *block_group,
 			goto next;
 		}
 
+		if (async && bytes > BTRFS_ASYNC_DISCARD_MAX_SIZE)
+			bytes = BTRFS_ASYNC_DISCARD_MAX_SIZE;
+
 		bitmap_clear_bits(ctl, entry, start, bytes);
 		if (entry->bytes == 0)
 			free_bitmap(ctl, entry);

From patchwork Thu Jan 2 21:26:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Zhou X-Patchwork-Id: 11316081 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 250EB14B4 for ; Thu, 2 Jan 2020 21:26:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id
EDE9921D7D for ; Thu, 2 Jan 2020 21:26:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1578000418; bh=V0BpxmSdk91x1+FKW9uy+DkEgYgmnKda6E0+aBbWIEU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:In-Reply-To: References:List-ID:From; b=Lziap8CRjNfvfqH/0uLTqd0FliWsM22SsXGxsosUir9NZLqsCRtOsYqu2MVZ4PaL9 wR9EWoi9l1p2V39ZPZ6+mTaKRMfF+eieqv+rbLhBG0FiiuUZDb2WPGHxgWegpvcoNe Dn4HGEVzMOdy00iloMo+vroL3v3I3hnw61ovM/e8= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726019AbgABV05 (ORCPT ); Thu, 2 Jan 2020 16:26:57 -0500 Received: from mail-qk1-f193.google.com ([209.85.222.193]:43221 "EHLO mail-qk1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725783AbgABV04 (ORCPT ); Thu, 2 Jan 2020 16:26:56 -0500 Received: by mail-qk1-f193.google.com with SMTP id t129so32359194qke.10 for ; Thu, 02 Jan 2020 13:26:55 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:in-reply-to:references; bh=nYpxLbpm9KSucsYDPYdd9HUhINGTrnEqQK70rkWv3Tg=; b=S8Ve+eWITc1/qMz7/CDIkPfITm4Tzfp+71R+vlcF7OQnSEjgoodiHL9sCX5s4nf6fH AsGVmJOEmauNdX7INxBnygdhO46s7uy1FvG/tP5NgdFTQR+HOnnGoewRLRCCDKtZgoOa Eceo4kh7koccPRW0N1gab/LPvrAFkrifjWO20sDJSi6/6svSJ6u6nbpDw6Kwn+vzaRR4 ND+ad4bP763RdNbIJPQT+Zgr4GiHBL+RRjlxjuKFsf5d3h9i5wauiHn1cnoDqnmghk8j MNaQDmtJKjRTwSnwNAHeef7dixeLVUshkXKvLMxnQ1O751O7Zh9TApmi4CzTTyjaSgWm +UIA== X-Gm-Message-State: APjAAAVI1nPtrKNfXOPb2iATt9TQRRszabUclld9AMyl+riuhqgcO3YD V6Kw5t+whrjEHW2OaIP+VHg= X-Google-Smtp-Source: APXvYqyqnwbUx2jpueYGsuaM++bbFXNcXpmPwBcPO5BXhAJn06IzUCoF8TifLA03SiC8sOgfX0kjew== X-Received: by 2002:ae9:c317:: with SMTP id n23mr59838043qkg.356.1578000415262; Thu, 02 Jan 2020 13:26:55 -0800 (PST) Received: from dennisz-mbp.thefacebook.com ([163.114.130.128]) by smtp.gmail.com with ESMTPSA id f42sm17553933qta.0.2020.01.02.13.26.54 (version=TLS1_2 
cipher=ECDHE-RSA-AES128-SHA bits=128/128); Thu, 02 Jan 2020 13:26:54 -0800 (PST)
From: Dennis Zhou
To: David Sterba , Chris Mason , Josef Bacik , Omar Sandoval
Cc: kernel-team@fb.com, linux-btrfs@vger.kernel.org, Dennis Zhou
Subject: [PATCH 04/12] btrfs: make max async discard size tunable
Date: Thu, 2 Jan 2020 16:26:38 -0500
Message-Id: <786ac88afbfaa7993449811b282ea8790ba02338.1577999991.git.dennis@kernel.org>
X-Mailer: git-send-email 2.13.5
In-Reply-To:
References:
In-Reply-To:
References:
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

Expose max_discard_size as a tunable via sysfs.

Signed-off-by: Dennis Zhou
---
 fs/btrfs/ctree.h            |  1 +
 fs/btrfs/discard.c          |  1 +
 fs/btrfs/discard.h          |  2 +-
 fs/btrfs/free-space-cache.c | 19 ++++++++++++-------
 fs/btrfs/sysfs.c            | 31 +++++++++++++++++++++++++++++++
 5 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 2485cf94b628..9328a0398678 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -469,6 +469,7 @@ struct btrfs_discard_ctl {
 	u64 prev_discard;
 	atomic_t discardable_extents;
 	atomic64_t discardable_bytes;
+	u64 max_discard_size;
 	unsigned long delay;
 	unsigned iops_limit;
 	u32 kbps_limit;
diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
index eb148ca9a508..822b888d90e3 100644
--- a/fs/btrfs/discard.c
+++ b/fs/btrfs/discard.c
@@ -527,6 +527,7 @@ void btrfs_discard_init(struct btrfs_fs_info *fs_info)
 	discard_ctl->prev_discard = 0;
 	atomic_set(&discard_ctl->discardable_extents, 0);
 	atomic64_set(&discard_ctl->discardable_bytes, 0);
+	discard_ctl->max_discard_size = BTRFS_ASYNC_DISCARD_DFL_MAX_SIZE;
 	discard_ctl->delay = BTRFS_DISCARD_MAX_DELAY_MSEC;
 	discard_ctl->iops_limit = BTRFS_DISCARD_MAX_IOPS;
 	discard_ctl->kbps_limit = 0;
diff --git a/fs/btrfs/discard.h b/fs/btrfs/discard.h
index 562c60fab77a..72816e479416 100644
--- a/fs/btrfs/discard.h
+++ b/fs/btrfs/discard.h
@@ -10,7 +10,7 @@ struct btrfs_discard_ctl;
 struct btrfs_block_group;
 
 /* Discard size limits */
-#define BTRFS_ASYNC_DISCARD_MAX_SIZE		(SZ_64M)
+#define BTRFS_ASYNC_DISCARD_DFL_MAX_SIZE	(SZ_64M)
 
 /* Work operations */
 void btrfs_discard_cancel_work(struct btrfs_discard_ctl *discard_ctl,
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 34291c777998..fb1a53f9b39c 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -3428,6 +3428,8 @@ static int trim_no_bitmap(struct btrfs_block_group *block_group,
 			  u64 *total_trimmed, u64 start, u64 end, u64 minlen,
 			  bool async)
 {
+	struct btrfs_discard_ctl *discard_ctl =
+					&block_group->fs_info->discard_ctl;
 	struct btrfs_free_space_ctl *ctl = block_group->free_space_ctl;
 	struct btrfs_free_space *entry;
 	struct rb_node *node;
@@ -3436,6 +3438,7 @@ static int trim_no_bitmap(struct btrfs_block_group *block_group,
 	u64 extent_bytes;
 	enum btrfs_trim_state extent_trim_state;
 	u64 bytes;
+	u64 max_discard_size = READ_ONCE(discard_ctl->max_discard_size);
 
 	while (start < end) {
 		struct btrfs_trim_range trim_entry;
@@ -3475,11 +3478,10 @@ static int trim_no_bitmap(struct btrfs_block_group *block_group,
 				goto next;
 			}
 			unlink_free_space(ctl, entry);
-			if (bytes > BTRFS_ASYNC_DISCARD_MAX_SIZE) {
-				bytes = extent_bytes =
-					BTRFS_ASYNC_DISCARD_MAX_SIZE;
-				entry->offset += BTRFS_ASYNC_DISCARD_MAX_SIZE;
-				entry->bytes -= BTRFS_ASYNC_DISCARD_MAX_SIZE;
+			if (max_discard_size && bytes > max_discard_size) {
+				bytes = extent_bytes = max_discard_size;
+				entry->offset += max_discard_size;
+				entry->bytes -= max_discard_size;
 				link_free_space(ctl, entry);
 			} else {
 				kmem_cache_free(btrfs_free_space_cachep, entry);
@@ -3584,12 +3586,15 @@ static int trim_bitmaps(struct btrfs_block_group *block_group,
 			u64 *total_trimmed, u64 start, u64 end, u64 minlen,
 			bool async)
 {
+	struct btrfs_discard_ctl *discard_ctl =
+					&block_group->fs_info->discard_ctl;
 	struct btrfs_free_space_ctl *ctl = block_group->free_space_ctl;
 	struct btrfs_free_space *entry;
 	int ret = 0;
 	int ret2;
 	u64 bytes;
 	u64 offset = offset_to_bitmap(ctl, start);
+	u64 max_discard_size = READ_ONCE(discard_ctl->max_discard_size);
 
 	while (offset < end) {
 		bool next_bitmap = false;
@@ -3659,8 +3664,8 @@ static int trim_bitmaps(struct btrfs_block_group *block_group,
 			goto next;
 		}
 
-		if (async && bytes > BTRFS_ASYNC_DISCARD_MAX_SIZE)
-			bytes = BTRFS_ASYNC_DISCARD_MAX_SIZE;
+		if (async && max_discard_size && bytes > max_discard_size)
+			bytes = max_discard_size;
 
 		bitmap_clear_bits(ctl, entry, start, bytes);
 		if (entry->bytes == 0)
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 39b59f368f06..8b0fd8557438 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -404,6 +404,36 @@ static ssize_t btrfs_discard_kbps_limit_store(struct kobject *kobj,
 BTRFS_ATTR_RW(discard, kbps_limit, btrfs_discard_kbps_limit_show,
 	      btrfs_discard_kbps_limit_store);
 
+static ssize_t btrfs_discard_max_discard_size_show(struct kobject *kobj,
+						   struct kobj_attribute *a,
+						   char *buf)
+{
+	struct btrfs_fs_info *fs_info = discard_to_fs_info(kobj);
+
+	return snprintf(buf, PAGE_SIZE, "%llu\n",
+			READ_ONCE(fs_info->discard_ctl.max_discard_size));
+}
+
+static ssize_t btrfs_discard_max_discard_size_store(struct kobject *kobj,
+						    struct kobj_attribute *a,
+						    const char *buf, size_t len)
+{
+	struct btrfs_fs_info *fs_info = discard_to_fs_info(kobj);
+	struct btrfs_discard_ctl *discard_ctl = &fs_info->discard_ctl;
+	u64 max_discard_size;
+	int ret;
+
+	ret = kstrtou64(buf, 10, &max_discard_size);
+	if (ret)
+		return -EINVAL;
+
+	WRITE_ONCE(discard_ctl->max_discard_size, max_discard_size);
+
+	return len;
+}
+BTRFS_ATTR_RW(discard, max_discard_size, btrfs_discard_max_discard_size_show,
+	      btrfs_discard_max_discard_size_store);
+
 static ssize_t btrfs_discardable_extents_show(struct kobject *kobj,
 					      struct kobj_attribute *a,
 					      char *buf)
@@ -429,6 +459,7 @@ BTRFS_ATTR(discard, discardable_bytes, btrfs_discardable_bytes_show);
 static const struct attribute *discard_debug_attrs[] = {
 	BTRFS_ATTR_PTR(discard, iops_limit),
 	BTRFS_ATTR_PTR(discard, kbps_limit),
+	BTRFS_ATTR_PTR(discard, max_discard_size),
 	BTRFS_ATTR_PTR(discard, discardable_extents),
 	BTRFS_ATTR_PTR(discard, discardable_bytes),
 	NULL,

From patchwork Thu Jan 2 21:26:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Zhou X-Patchwork-Id: 11316083 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 96AE614B4 for ; Thu, 2 Jan 2020 21:26:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5DD4921582 for ; Thu, 2 Jan 2020 21:26:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1578000419; bh=ASaSfHUwtr3aJPeP3H0fFvdeaIOpUx7FCSgAOQ3ZMUU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:In-Reply-To: References:List-ID:From; b=PmxMm3ba61KuvxSszWcmVFmhrRC6km8w0fPY+nfqBwu91nMkTbi2R5zHB1HS+g9yo JcUr5tKaxcyqydKePgz+ps8jiV7aLXUv165/OjSSoydjCp/vURBPyCxXUCueaKXy0n +CU+mSJw01P08b0Y57EbiG8vqGdwjgEKjHZIlNbA= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726039AbgABV06 (ORCPT ); Thu, 2 Jan 2020 16:26:58 -0500 Received: from mail-qk1-f193.google.com ([209.85.222.193]:44355 "EHLO mail-qk1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726005AbgABV05 (ORCPT ); Thu, 2 Jan 2020 16:26:57 -0500 Received: by mail-qk1-f193.google.com with SMTP id w127so32357161qkb.11 for ; Thu, 02 Jan 2020 13:26:57 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:in-reply-to:references; bh=Aicga4teOFENIGWjY6/kKF0D+25t0QVvq+nQ5mrhdno=; b=dmS3Fw9RZ32aAlGWNb4Q9ub/FlM15mHr/daWWCQywpDTVLdyMgVOLJOB3scuT8FCzf R+XPbQGHloHmpu9BEYl38W4H2sRlqMx5MM9X+vu+92e8jWNDGK2yapLFfzIfvL3Sfa7H
From: Dennis Zhou
To: David Sterba, Chris Mason, Josef Bacik, Omar Sandoval
Cc: kernel-team@fb.com, linux-btrfs@vger.kernel.org, Dennis Zhou
Subject: [PATCH 05/12] btrfs: have multiple discard lists
Date: Thu, 2 Jan 2020 16:26:39 -0500
Message-Id: <387a15caf917750427eb9499924d2ef571c16c3c.1577999991.git.dennis@kernel.org>
X-Mailer: git-send-email 2.13.5
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

Non-block group destruction discarding currently has only a single list with no minimum discard length. This can lead to caravanning more meaningful discards behind a heavily fragmented block group. This adds support for multiple lists with minimum discard lengths to prevent the caravan effect. We promote a block group back up when a freed region meets or exceeds the filter of the list above it, up to BTRFS_ASYNC_DISCARD_MAX_FILTER. Currently we support only 2 filtered lists, with filters of 1MB and 32KB respectively, in addition to the list for unused block groups.
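To make the promotion rule concrete, here is a minimal userspace sketch in C. It assumes the three-entry filter table from this patch (index 0 for unused block groups, then 1MB and 32KB). The names promote_index(), promote_examples(), MAX_FILTER and MIN_FILTER are hypothetical stand-ins rather than kernel symbols, and list manipulation and locking are left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for the constants in this patch, not kernel headers. */
#define NR_DISCARD_LISTS 3
#define INDEX_START      1
#define MAX_FILTER       (1024ULL * 1024ULL)	/* 1MB */
#define MIN_FILTER       (32ULL * 1024ULL)	/* 32KB */

/* Index 0 is the unused-block-group list; filters shrink after it. */
static const uint64_t discard_minlen[NR_DISCARD_LISTS] = {
	0, MAX_FILTER, MIN_FILTER
};

/*
 * Hypothetical helper: given the list a block group currently sits on and
 * the size of a freshly freed (coalesced) region, return the list it should
 * move to. A block group is only promoted when the region satisfies the
 * filter of the list directly above its current one.
 */
static int promote_index(int cur_index, uint64_t bytes)
{
	int i;

	assert(cur_index >= 0 && cur_index < NR_DISCARD_LISTS);

	if (cur_index <= INDEX_START || bytes < discard_minlen[cur_index - 1])
		return cur_index;

	/* Pick the first (largest) filter the region satisfies. */
	for (i = INDEX_START; i < NR_DISCARD_LISTS; i++) {
		if (bytes >= discard_minlen[i])
			return i;
	}
	return cur_index;
}

/* Worked cases for the rule above. */
void promote_examples(void)
{
	/* A 2MB region on the 32KB list moves up to the 1MB list. */
	assert(promote_index(2, 2 * MAX_FILTER) == 1);
	/* A 64KB region is too small to leave the 32KB list. */
	assert(promote_index(2, 64 * 1024) == 2);
	/* A block group already on the 1MB list stays where it is. */
	assert(promote_index(1, 2 * MAX_FILTER) == 1);
}
```

In this sketch a block group only ever moves toward a larger filter; moving toward smaller filters after a completed pass is a separate step and is not modelled here.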
Signed-off-by: Dennis Zhou Reviewed-by: Josef Bacik --- fs/btrfs/ctree.h | 2 +- fs/btrfs/discard.c | 102 ++++++++++++++++++++++++++++++++---- fs/btrfs/discard.h | 6 +++ fs/btrfs/free-space-cache.c | 54 ++++++++++++++----- fs/btrfs/free-space-cache.h | 2 +- 5 files changed, 142 insertions(+), 24 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 9328a0398678..09371e8b55a7 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -456,7 +456,7 @@ struct btrfs_full_stripe_locks_tree { * afterwards represent monotonically decreasing discard filter sizes to * prioritize what should be discarded next. */ -#define BTRFS_NR_DISCARD_LISTS 2 +#define BTRFS_NR_DISCARD_LISTS 3 #define BTRFS_DISCARD_INDEX_UNUSED 0 #define BTRFS_DISCARD_INDEX_START 1 diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c index 822b888d90e3..de436c0051ce 100644 --- a/fs/btrfs/discard.c +++ b/fs/btrfs/discard.c @@ -22,6 +22,10 @@ #define BTRFS_DISCARD_MAX_DELAY_MSEC (1000UL) #define BTRFS_DISCARD_MAX_IOPS (10U) +/* Monotonically decreasing minimum length filters after index 0. */ +static int discard_minlen[BTRFS_NR_DISCARD_LISTS] = {0, + BTRFS_ASYNC_DISCARD_MAX_FILTER, BTRFS_ASYNC_DISCARD_MIN_FILTER}; + static struct list_head *get_discard_list(struct btrfs_discard_ctl *discard_ctl, struct btrfs_block_group *block_group) { @@ -139,16 +143,18 @@ static struct btrfs_block_group *find_next_block_group( * peek_discard_list - wrap find_next_block_group() * @discard_ctl: discard control * @discard_state: the discard_state of the block_group after state management + * @discard_index: the discard_index of the block_group after state management * * This wraps find_next_block_group() and sets the block_group to be in use. * discard_state's control flow is managed here. Variables related to - * discard_state are reset here as needed (eg. discard_cursor). @discard_state - * is remembered as it may change while we're discarding, but we want the - * discard to execute in the context determined here.
+ * discard_state are reset here as needed (e.g. discard_cursor). @discard_state + * and @discard_index are remembered as they may change while we're discarding, + * but we want the discard to execute in the context determined here. */ static struct btrfs_block_group *peek_discard_list( struct btrfs_discard_ctl *discard_ctl, - enum btrfs_discard_state *discard_state) + enum btrfs_discard_state *discard_state, + int *discard_index) { struct btrfs_block_group *block_group; const u64 now = ktime_get_ns(); @@ -169,6 +175,7 @@ static struct btrfs_block_group *peek_discard_list( } discard_ctl->block_group = block_group; *discard_state = block_group->discard_state; + *discard_index = block_group->discard_index; } else { block_group = NULL; } @@ -178,6 +185,64 @@ static struct btrfs_block_group *peek_discard_list( return block_group; } +/** + * btrfs_discard_check_filter - update a block group's filters + * @block_group: block group of interest + * @bytes: recently freed region size after coalescing + * + * Async discard maintains multiple lists with progressively smaller filters + * to prioritize discarding based on size. Should a free space region that + * matches a larger filter be returned to the free_space_cache, prioritize that + * discard by moving @block_group to the proper filter.
+ */ +void btrfs_discard_check_filter(struct btrfs_block_group *block_group, + u64 bytes) +{ + struct btrfs_discard_ctl *discard_ctl; + + if (!block_group || + !btrfs_test_opt(block_group->fs_info, DISCARD_ASYNC)) + return; + + discard_ctl = &block_group->fs_info->discard_ctl; + + if (block_group->discard_index > BTRFS_DISCARD_INDEX_START && + bytes >= discard_minlen[block_group->discard_index - 1]) { + int i; + + remove_from_discard_list(discard_ctl, block_group); + + for (i = BTRFS_DISCARD_INDEX_START; i < BTRFS_NR_DISCARD_LISTS; + i++) { + if (bytes >= discard_minlen[i]) { + block_group->discard_index = i; + add_to_discard_list(discard_ctl, block_group); + break; + } + } + } +} + +/** + * btrfs_update_discard_index - moves a block_group along the discard lists + * @discard_ctl: discard control + * @block_group: block_group of interest + * + * Increment @block_group's discard_index. If it falls off the list, let it be. + * Otherwise add it back to the appropriate list. + */ +static void btrfs_update_discard_index(struct btrfs_discard_ctl *discard_ctl, + struct btrfs_block_group *block_group) +{ + block_group->discard_index++; + if (block_group->discard_index == BTRFS_NR_DISCARD_LISTS) { + block_group->discard_index = 1; + return; + } + + add_to_discard_list(discard_ctl, block_group); +} + /** * btrfs_discard_cancel_work - remove a block_group from the discard lists * @discard_ctl: discard control @@ -296,6 +361,8 @@ static void btrfs_finish_discard_pass(struct btrfs_discard_ctl *discard_ctl, btrfs_mark_bg_unused(block_group); else add_to_discard_unused_list(discard_ctl, block_group); + } else { + btrfs_update_discard_index(discard_ctl, block_group); } } @@ -312,25 +379,42 @@ static void btrfs_discard_workfn(struct work_struct *work) struct btrfs_discard_ctl *discard_ctl; struct btrfs_block_group *block_group; enum btrfs_discard_state discard_state; + int discard_index = 0; u64 trimmed = 0; + u64 minlen = 0; discard_ctl = container_of(work, struct
btrfs_discard_ctl, work.work); - block_group = peek_discard_list(discard_ctl, &discard_state); + block_group = peek_discard_list(discard_ctl, &discard_state, + &discard_index); if (!block_group || !btrfs_run_discard_work(discard_ctl)) return; /* Perform discarding */ - if (discard_state == BTRFS_DISCARD_BITMAPS) + minlen = discard_minlen[discard_index]; + + if (discard_state == BTRFS_DISCARD_BITMAPS) { + u64 maxlen = 0; + + /* + * Use the previous level's minimum discard length as the max + * length filter. In the case something is added to make a + * region go beyond the max filter, the entire bitmap is set + * back to BTRFS_TRIM_STATE_UNTRIMMED. + */ + if (discard_index != BTRFS_DISCARD_INDEX_UNUSED) + maxlen = discard_minlen[discard_index - 1]; + btrfs_trim_block_group_bitmaps(block_group, &trimmed, block_group->discard_cursor, btrfs_block_group_end(block_group), - 0, true); - else + minlen, maxlen, true); + } else { btrfs_trim_block_group_extents(block_group, &trimmed, block_group->discard_cursor, btrfs_block_group_end(block_group), - 0, true); + minlen, true); + } discard_ctl->prev_discard = trimmed; diff --git a/fs/btrfs/discard.h b/fs/btrfs/discard.h index 72816e479416..f73276ae74f7 100644 --- a/fs/btrfs/discard.h +++ b/fs/btrfs/discard.h @@ -11,6 +11,12 @@ struct btrfs_block_group; /* Discard size limits */ #define BTRFS_ASYNC_DISCARD_DFL_MAX_SIZE (SZ_64M) +#define BTRFS_ASYNC_DISCARD_MAX_FILTER (SZ_1M) +#define BTRFS_ASYNC_DISCARD_MIN_FILTER (SZ_32K) + +/* List operations */ +void btrfs_discard_check_filter(struct btrfs_block_group *block_group, + u64 bytes); /* Work operations */ void btrfs_discard_cancel_work(struct btrfs_discard_ctl *discard_ctl, diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index fb1a53f9b39c..6f0d60e86b6f 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -2465,6 +2465,7 @@ int __btrfs_add_free_space(struct btrfs_fs_info *fs_info, struct btrfs_block_group *block_group = ctl->private;
struct btrfs_free_space *info; int ret = 0; + u64 filter_bytes = bytes; info = kmem_cache_zalloc(btrfs_free_space_cachep, GFP_NOFS); if (!info) @@ -2501,6 +2502,8 @@ int __btrfs_add_free_space(struct btrfs_fs_info *fs_info, */ steal_from_bitmap(ctl, info, true); + filter_bytes = max(filter_bytes, info->bytes); + ret = link_free_space(ctl, info); if (ret) kmem_cache_free(btrfs_free_space_cachep, info); @@ -2513,8 +2516,10 @@ int __btrfs_add_free_space(struct btrfs_fs_info *fs_info, ASSERT(ret != -EEXIST); } - if (trim_state != BTRFS_TRIM_STATE_TRIMMED) + if (trim_state != BTRFS_TRIM_STATE_TRIMMED) { + btrfs_discard_check_filter(block_group, filter_bytes); btrfs_discard_queue_work(&fs_info->discard_ctl, block_group); + } return ret; } @@ -3478,7 +3483,14 @@ static int trim_no_bitmap(struct btrfs_block_group *block_group, goto next; } unlink_free_space(ctl, entry); - if (max_discard_size && bytes > max_discard_size) { + /* + * Let bytes = BTRFS_MAX_DISCARD_SIZE + X. + * If X < BTRFS_ASYNC_DISCARD_MIN_FILTER, we won't trim + * X when we come back around. So trim it now. + */ + if (max_discard_size && + bytes >= (max_discard_size + + BTRFS_ASYNC_DISCARD_MIN_FILTER)) { bytes = extent_bytes = max_discard_size; entry->offset += max_discard_size; entry->bytes -= max_discard_size; @@ -3584,7 +3596,7 @@ static void end_trimming_bitmap(struct btrfs_free_space_ctl *ctl, */ static int trim_bitmaps(struct btrfs_block_group *block_group, u64 *total_trimmed, u64 start, u64 end, u64 minlen, - bool async) + u64 maxlen, bool async) { struct btrfs_discard_ctl *discard_ctl = &block_group->fs_info->discard_ctl; @@ -3612,7 +3624,15 @@ static int trim_bitmaps(struct btrfs_block_group *block_group, } entry = tree_search_offset(ctl, offset, 1, 0); - if (!entry || (async && start == offset && + /* + * Bitmaps are marked trimmed lossily now to prevent constant + * discarding of the same bitmap (the reason why we are bound + * by the filters). 
So, retrim the block group bitmaps when we + * are preparing to punt to the unused_bgs list. This uses + * @minlen to determine if we are in BTRFS_DISCARD_INDEX_UNUSED + * which is the only discard index which sets minlen to 0. + */ + if (!entry || (async && minlen && start == offset && btrfs_free_space_trimmed(entry))) { spin_unlock(&ctl->tree_lock); mutex_unlock(&ctl->cache_writeout_mutex); @@ -3633,10 +3653,10 @@ static int trim_bitmaps(struct btrfs_block_group *block_group, ret2 = search_bitmap(ctl, entry, &start, &bytes, false); if (ret2 || start >= end) { /* - * This keeps the invariant that all bytes are trimmed - * if BTRFS_TRIM_STATE_TRIMMED is set on a bitmap. + * We lossily consider a bitmap trimmed if we only skip + * over regions <= BTRFS_ASYNC_DISCARD_MIN_FILTER. */ - if (ret2 && !minlen) + if (ret2 && minlen <= BTRFS_ASYNC_DISCARD_MIN_FILTER) end_trimming_bitmap(ctl, entry); else entry->trim_state = BTRFS_TRIM_STATE_UNTRIMMED; @@ -3657,14 +3677,20 @@ static int trim_bitmaps(struct btrfs_block_group *block_group, } bytes = min(bytes, end - start); - if (bytes < minlen) { - entry->trim_state = BTRFS_TRIM_STATE_UNTRIMMED; + if (bytes < minlen || (async && maxlen && bytes > maxlen)) { spin_unlock(&ctl->tree_lock); mutex_unlock(&ctl->cache_writeout_mutex); goto next; } - if (async && max_discard_size && bytes > max_discard_size) + /* + * Let bytes = BTRFS_MAX_DISCARD_SIZE + X. + * If X < @minlen, we won't trim X when we come back around. + * So trim it now. We differ here from trimming extents as we + * don't keep individual state per bit. 
+ */ + if (async && max_discard_size && + bytes > (max_discard_size + minlen)) bytes = max_discard_size; bitmap_clear_bits(ctl, entry, start, bytes); @@ -3772,7 +3798,7 @@ int btrfs_trim_block_group(struct btrfs_block_group *block_group, if (ret) goto out; - ret = trim_bitmaps(block_group, trimmed, start, end, minlen, false); + ret = trim_bitmaps(block_group, trimmed, start, end, minlen, 0, false); div64_u64_rem(end, BITS_PER_BITMAP * ctl->unit, &rem); /* If we ended in the middle of a bitmap, reset the trimming flag */ if (rem) @@ -3806,7 +3832,7 @@ int btrfs_trim_block_group_extents(struct btrfs_block_group *block_group, int btrfs_trim_block_group_bitmaps(struct btrfs_block_group *block_group, u64 *trimmed, u64 start, u64 end, u64 minlen, - bool async) + u64 maxlen, bool async) { int ret; @@ -3820,7 +3846,9 @@ int btrfs_trim_block_group_bitmaps(struct btrfs_block_group *block_group, btrfs_get_block_group_trimming(block_group); spin_unlock(&block_group->lock); - ret = trim_bitmaps(block_group, trimmed, start, end, minlen, async); + ret = trim_bitmaps(block_group, trimmed, start, end, minlen, maxlen, + async); + btrfs_put_block_group_trimming(block_group); return ret; diff --git a/fs/btrfs/free-space-cache.h b/fs/btrfs/free-space-cache.h index 5018190a49a3..2e0a8077aa74 100644 --- a/fs/btrfs/free-space-cache.h +++ b/fs/btrfs/free-space-cache.h @@ -146,7 +146,7 @@ int btrfs_trim_block_group_extents(struct btrfs_block_group *block_group, bool async); int btrfs_trim_block_group_bitmaps(struct btrfs_block_group *block_group, u64 *trimmed, u64 start, u64 end, u64 minlen, - bool async); + u64 maxlen, bool async); /* Support functions for running our sanity tests */ #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS From patchwork Thu Jan 2 21:26:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Zhou X-Patchwork-Id: 11316085 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org 
From: Dennis Zhou
To: David Sterba, Chris Mason, Josef Bacik, Omar Sandoval
Cc: kernel-team@fb.com, linux-btrfs@vger.kernel.org, Dennis Zhou
Subject: [PATCH 06/12] btrfs: only keep track of data extents for async discard
Date: Thu, 2 Jan 2020 16:26:40 -0500
Message-Id: <81a50b61fa32f4b080702f196b31c8c4defd9840.1577999991.git.dennis@kernel.org>
X-Mailer: git-send-email 2.13.5
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

As mentioned earlier, discarding data can be done either explicitly, by issuing a discard, or implicitly, by reusing the LBA. Metadata block_groups see much more frequent reuse simply because they hold metadata. So instead of explicitly discarding metadata block_groups, leave them be and let implicit discarding through reuse take care of them. Mixed block_groups, which contain both metadata and data, are also left alone, as higher fragmentation is expected in them.
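The classification described above can be illustrated with a small userspace sketch in C. The flag macros BG_DATA and BG_METADATA are local stand-ins for BTRFS_BLOCK_GROUP_DATA and BTRFS_BLOCK_GROUP_METADATA, with bit positions assumed for illustration only; is_data_only() and data_only_examples() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the block group type flags; bit values are assumed. */
#define BG_DATA     (1ULL << 0)
#define BG_METADATA (1ULL << 2)

/*
 * A block group is tracked for async discard only when it carries the
 * data flag and does not also carry the metadata flag.
 */
static bool is_data_only(uint64_t flags)
{
	return (flags & BG_DATA) && !(flags & BG_METADATA);
}

/* Worked cases: only the pure data block group qualifies. */
void data_only_examples(void)
{
	assert(is_data_only(BG_DATA));			/* data: tracked */
	assert(!is_data_only(BG_METADATA));		/* metadata: skipped */
	assert(!is_data_only(BG_DATA | BG_METADATA));	/* mixed: skipped */
}
```

Treating mixed block groups the same as metadata keeps the check to a single predicate that can be reused wherever a block group is about to be queued or accounted.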
Signed-off-by: Dennis Zhou Reviewed-by: Josef Bacik --- fs/btrfs/block-group.h | 7 +++++++ fs/btrfs/discard.c | 16 ++++++++++++++-- 2 files changed, 21 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h index a8d2edcd8760..4a088e690432 100644 --- a/fs/btrfs/block-group.h +++ b/fs/btrfs/block-group.h @@ -182,6 +182,13 @@ static inline u64 btrfs_block_group_end(struct btrfs_block_group *block_group) return (block_group->start + block_group->length); } +static inline bool btrfs_is_block_group_data_only( + struct btrfs_block_group *block_group) +{ + return ((block_group->flags & BTRFS_BLOCK_GROUP_DATA) && + !(block_group->flags & BTRFS_BLOCK_GROUP_METADATA)); +} + #ifdef CONFIG_BTRFS_DEBUG static inline int btrfs_should_fragment_free_space( struct btrfs_block_group *block_group) diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c index de436c0051ce..7dbbf762ee8d 100644 --- a/fs/btrfs/discard.c +++ b/fs/btrfs/discard.c @@ -54,6 +54,13 @@ static void __add_to_discard_list(struct btrfs_discard_ctl *discard_ctl, static void add_to_discard_list(struct btrfs_discard_ctl *discard_ctl, struct btrfs_block_group *block_group) { + /* + * Async discard only operates on block_groups that are explicitly for + * data. Mixed block_groups are not supported. 
+ */ + if (!btrfs_is_block_group_data_only(block_group)) + return; + spin_lock(&discard_ctl->lock); __add_to_discard_list(discard_ctl, block_group); spin_unlock(&discard_ctl->lock); @@ -166,7 +173,10 @@ static struct btrfs_block_group *peek_discard_list( if (block_group && now > block_group->discard_eligible_time) { if (block_group->discard_index == BTRFS_DISCARD_INDEX_UNUSED && block_group->used != 0) { - __add_to_discard_list(discard_ctl, block_group); + if (btrfs_is_block_group_data_only(block_group)) + __add_to_discard_list(discard_ctl, block_group); + else + list_del_init(&block_group->discard_list); goto again; } if (block_group->discard_state == BTRFS_DISCARD_RESET_CURSOR) { @@ -504,7 +514,9 @@ void btrfs_discard_update_discardable(struct btrfs_block_group *block_group, s32 extents_delta; s64 bytes_delta; - if (!block_group || !btrfs_test_opt(block_group->fs_info, DISCARD_ASYNC)) + if (!block_group || + !btrfs_test_opt(block_group->fs_info, DISCARD_ASYNC) || + !btrfs_is_block_group_data_only(block_group)) return; discard_ctl = &block_group->fs_info->discard_ctl; From patchwork Thu Jan 2 21:26:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Zhou X-Patchwork-Id: 11316087 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E25A117EA for ; Thu, 2 Jan 2020 21:27:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B660720848 for ; Thu, 2 Jan 2020 21:27:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1578000421; bh=gX2o7/AUw9rTy1rbzo2CCot854bJZF0VYxxPPuChGJU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:In-Reply-To: References:List-ID:From; b=WbgQnOHAygCjQyAQtjcSm3g9fS2avOHq01UjrKyGBVkZQ7MKGoB1w/CgHtYGzkUr6 
rNwnJLzVWulKiFPfk66+KcJFPVxSm0Q+RfDc/I6KjD86wOO36ThTpw5iNIOwZFpRFx E884lfHpi51GhHLRssdfdN/FmukY9dDeNFy3U4/A= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726083AbgABV1B (ORCPT ); Thu, 2 Jan 2020 16:27:01 -0500 Received: from mail-qk1-f195.google.com ([209.85.222.195]:41868 "EHLO mail-qk1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726050AbgABV1A (ORCPT ); Thu, 2 Jan 2020 16:27:00 -0500 Received: by mail-qk1-f195.google.com with SMTP id x129so32369710qke.8 for ; Thu, 02 Jan 2020 13:26:59 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:in-reply-to:references; bh=q7zpdabC3bKhMvoBo6/2VMh+UvqnEssKkzjNOLMKk08=; b=qIBB2vTYksZmWoF/ky1pxqcEsO1bP9RRAHg+z+10weoUbQkCd79a3DRqZI6AEyXnMj BsMLZj5Gp2QeEzZkrVH7QpL2ywuQ4mbvA+37PgXUTNQX1B0DGZkrKQ1NULQrvdSXVT4n 2QDzyi60dimQMkB7igh6IgejZGcE1lUpkaHolFZSEKJ8db6hzd+gAqZbOnaB4kylSRkQ IoxUYqfNtKLCrgM6ea6OnIAmDoY+F4dNFPbzNmS9JHzXXnNOHocftU8Wyzw4NU3Ne0QD txTvgK+d7h2nOJk1KdvSrAcaV+srsje0hdlTCaTxhQryNVDcDyVIvTbBgKkillwD8GVX swXw== X-Gm-Message-State: APjAAAWnpAf3riaZeOYvIzqoedqRpRUiXPdq8nlGDVUL3/hc7CKuwRjt Nji5JxpJZrlABxYTcWwmvFY= X-Google-Smtp-Source: APXvYqz226gShX1EuAi/z8v7gubK8QlJ32iRxCwTJJny2gcp57qMf82JxJyAQjNEyl3Q6WMvupLTmg== X-Received: by 2002:a05:620a:10a7:: with SMTP id h7mr67183219qkk.423.1578000418892; Thu, 02 Jan 2020 13:26:58 -0800 (PST) Received: from dennisz-mbp.thefacebook.com ([163.114.130.128]) by smtp.gmail.com with ESMTPSA id f42sm17553933qta.0.2020.01.02.13.26.57 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Thu, 02 Jan 2020 13:26:58 -0800 (PST) From: Dennis Zhou To: David Sterba , Chris Mason , Josef Bacik , Omar Sandoval Cc: kernel-team@fb.com, linux-btrfs@vger.kernel.org, Dennis Zhou Subject: [PATCH 07/12] btrfs: keep track of discard reuse stats Date: Thu, 2 Jan 2020 16:26:41 -0500 Message-Id: 
<717a3e42ac3fb94b502d4dc258fbaafc60f1d6a2.1577999991.git.dennis@kernel.org>
X-Mailer: git-send-email 2.13.5
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

Keep track of how much we are discarding and how much untrimmed free space is reused before it has to be discarded with async discard. The discard_extent_bytes and discard_bitmap_bytes counters don't need any special protection because the work item provides single-threaded access. discard_bytes_saved is updated from the allocation paths, so it is an atomic64_t.

Signed-off-by: Dennis Zhou Reviewed-by: Josef Bacik --- fs/btrfs/ctree.h | 3 +++ fs/btrfs/discard.c | 5 +++++ fs/btrfs/free-space-cache.c | 14 ++++++++++++++ fs/btrfs/sysfs.c | 36 ++++++++++++++++++++++++++++++++++++ 4 files changed, 58 insertions(+) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 09371e8b55a7..fad310d46c78 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -473,6 +473,9 @@ struct btrfs_discard_ctl { unsigned long delay; unsigned iops_limit; u32 kbps_limit; + u64 discard_extent_bytes; + u64 discard_bitmap_bytes; + atomic64_t discard_bytes_saved; }; /* delayed seq elem */ diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c index 7dbbf762ee8d..bc6d4344397d 100644 --- a/fs/btrfs/discard.c +++ b/fs/btrfs/discard.c @@ -419,11 +419,13 @@ static void btrfs_discard_workfn(struct work_struct *work) block_group->discard_cursor, btrfs_block_group_end(block_group), minlen, maxlen, true); + discard_ctl->discard_bitmap_bytes += trimmed; } else { btrfs_trim_block_group_extents(block_group, &trimmed, block_group->discard_cursor, btrfs_block_group_end(block_group), minlen, true); + discard_ctl->discard_extent_bytes += trimmed; } discard_ctl->prev_discard = trimmed; @@ -627,6 +629,9 @@ void btrfs_discard_init(struct btrfs_fs_info *fs_info) discard_ctl->delay = BTRFS_DISCARD_MAX_DELAY_MSEC; discard_ctl->iops_limit = BTRFS_DISCARD_MAX_IOPS; discard_ctl->kbps_limit = 0; + discard_ctl->discard_extent_bytes = 0; + discard_ctl->discard_bitmap_bytes = 0; +
atomic64_set(&discard_ctl->discard_bytes_saved, 0); } void btrfs_discard_cleanup(struct btrfs_fs_info *fs_info) diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index 6f0d60e86b6f..8a4a3b9cd544 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -2842,6 +2842,8 @@ u64 btrfs_find_space_for_alloc(struct btrfs_block_group *block_group, u64 *max_extent_size) { struct btrfs_free_space_ctl *ctl = block_group->free_space_ctl; + struct btrfs_discard_ctl *discard_ctl = + &block_group->fs_info->discard_ctl; struct btrfs_free_space *entry = NULL; u64 bytes_search = bytes + empty_size; u64 ret = 0; @@ -2858,6 +2860,10 @@ u64 btrfs_find_space_for_alloc(struct btrfs_block_group *block_group, ret = offset; if (entry->bitmap) { bitmap_clear_bits(ctl, entry, offset, bytes); + + if (!btrfs_free_space_trimmed(entry)) + atomic64_add(bytes, &discard_ctl->discard_bytes_saved); + if (!entry->bytes) free_bitmap(ctl, entry); } else { @@ -2866,6 +2872,9 @@ u64 btrfs_find_space_for_alloc(struct btrfs_block_group *block_group, align_gap = entry->offset; align_gap_trim_state = entry->trim_state; + if (!btrfs_free_space_trimmed(entry)) + atomic64_add(bytes, &discard_ctl->discard_bytes_saved); + entry->offset = offset + bytes; WARN_ON(entry->bytes < bytes + align_gap_len); @@ -2969,6 +2978,8 @@ u64 btrfs_alloc_from_cluster(struct btrfs_block_group *block_group, u64 min_start, u64 *max_extent_size) { struct btrfs_free_space_ctl *ctl = block_group->free_space_ctl; + struct btrfs_discard_ctl *discard_ctl = + &block_group->fs_info->discard_ctl; struct btrfs_free_space *entry = NULL; struct rb_node *node; u64 ret = 0; @@ -3033,6 +3044,9 @@ u64 btrfs_alloc_from_cluster(struct btrfs_block_group *block_group, spin_lock(&ctl->tree_lock); + if (!btrfs_free_space_trimmed(entry)) + atomic64_add(bytes, &discard_ctl->discard_bytes_saved); + ctl->free_space -= bytes; if (!entry->bitmap && !btrfs_free_space_trimmed(entry)) ctl->discardable_bytes[BTRFS_STAT_CURR] 
-= bytes; diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index 8b0fd8557438..2e973af8353f 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -456,12 +456,48 @@ static ssize_t btrfs_discardable_bytes_show(struct kobject *kobj, } BTRFS_ATTR(discard, discardable_bytes, btrfs_discardable_bytes_show); +static ssize_t btrfs_discard_extent_bytes_show(struct kobject *kobj, + struct kobj_attribute *a, + char *buf) +{ + struct btrfs_fs_info *fs_info = discard_to_fs_info(kobj); + + return snprintf(buf, PAGE_SIZE, "%llu\n", + fs_info->discard_ctl.discard_extent_bytes); +} +BTRFS_ATTR(discard, discard_extent_bytes, btrfs_discard_extent_bytes_show); + +static ssize_t btrfs_discard_bitmap_bytes_show(struct kobject *kobj, + struct kobj_attribute *a, + char *buf) +{ + struct btrfs_fs_info *fs_info = discard_to_fs_info(kobj); + + return snprintf(buf, PAGE_SIZE, "%llu\n", + fs_info->discard_ctl.discard_bitmap_bytes); +} +BTRFS_ATTR(discard, discard_bitmap_bytes, btrfs_discard_bitmap_bytes_show); + +static ssize_t btrfs_discard_bytes_saved_show(struct kobject *kobj, + struct kobj_attribute *a, + char *buf) +{ + struct btrfs_fs_info *fs_info = discard_to_fs_info(kobj); + + return snprintf(buf, PAGE_SIZE, "%lld\n", + atomic64_read(&fs_info->discard_ctl.discard_bytes_saved)); +} +BTRFS_ATTR(discard, discard_bytes_saved, btrfs_discard_bytes_saved_show); + static const struct attribute *discard_debug_attrs[] = { BTRFS_ATTR_PTR(discard, iops_limit), BTRFS_ATTR_PTR(discard, kbps_limit), BTRFS_ATTR_PTR(discard, max_discard_size), BTRFS_ATTR_PTR(discard, discardable_extents), BTRFS_ATTR_PTR(discard, discardable_bytes), + BTRFS_ATTR_PTR(discard, discard_extent_bytes), + BTRFS_ATTR_PTR(discard, discard_bitmap_bytes), + BTRFS_ATTR_PTR(discard, discard_bytes_saved), NULL, }; From patchwork Thu Jan 2 21:26:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Zhou X-Patchwork-Id: 11316089 Return-Path: Received: from
APXvYqwVBNZVWb1eMLkHBZ9PJZR6z5D25fkmXrHpnAw32awq9P5D5cmw9d+qFCi8mFx029U3SGeU1g== X-Received: by 2002:a0c:e14f:: with SMTP id c15mr62932613qvl.169.1578000420055; Thu, 02 Jan 2020 13:27:00 -0800 (PST) Received: from dennisz-mbp.thefacebook.com ([163.114.130.128]) by smtp.gmail.com with ESMTPSA id f42sm17553933qta.0.2020.01.02.13.26.58 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Thu, 02 Jan 2020 13:26:59 -0800 (PST) From: Dennis Zhou To: David Sterba , Chris Mason , Josef Bacik , Omar Sandoval Cc: kernel-team@fb.com, linux-btrfs@vger.kernel.org, Dennis Zhou Subject: [PATCH 08/12] btrfs: add async discard header Date: Thu, 2 Jan 2020 16:26:42 -0500 Message-Id: <6b051445e34dadaa0ed1cd7a5efdd19eddbae887.1577999991.git.dennis@kernel.org> X-Mailer: git-send-email 2.13.5 In-Reply-To: References: In-Reply-To: References: Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Give a brief overview for how async discard is implemented. Signed-off-by: Dennis Zhou Reviewed-by: Josef Bacik --- fs/btrfs/discard.c | 38 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+) diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c index bc6d4344397d..d5a89e3755ed 100644 --- a/fs/btrfs/discard.c +++ b/fs/btrfs/discard.c @@ -1,4 +1,42 @@ // SPDX-License-Identifier: GPL-2.0 +/* + * This contains the logic to handle async discard. + * + * Async discard manages trimming of free space outside of transaction commit. + * Discarding is done by managing the block_groups on a LRU list based on free + * space recency. Two passes are used to first prioritize discarding extents + * and then allow for trimming in the bitmap the best opportunity to coalesce. + * The block_groups are maintained on multiple lists to allow for multiple + * passes with different discard filter requirements. 
A delayed work item is + * used to manage discarding with timeout determined by a max of the delay + * incurred by the iops rate limit, the byte rate limit, and the max delay of + * BTRFS_DISCARD_MAX_DELAY. + * + * Note, this only keeps track of block_groups that are explicitly for data. + * Mixed block_groups are not supported. + * + * The first list is special to manage discarding of fully free block groups. + * This is necessary because we issue a final trim for a full free block group + * after forgetting it. When a block group becomes unused, instead of directly + * being added to the unused_bgs list, we add it to this first list. Then + * from there, if it becomes fully discarded, we place it onto the unused_bgs + * list. + * + * The in-memory free space cache serves as the backing state for discard. + * Consequently this means there is no persistence. We opt to load all the + * block groups in as not discarded, so the mount case degenerates to the + * crashing case. + * + * As the free space cache uses bitmaps, there exists a tradeoff between + * ease/efficiency for find_free_extent() and the accuracy of discard state. + * Here we opt to let untrimmed regions merge with everything while only letting + * trimmed regions merge with other trimmed regions. This can cause + * overtrimming, but the coalescing benefit seems to be worth it. Additionally, + * bitmap state is tracked as a whole. If we're able to fully trim a bitmap, + * the trimmed flag is set on the bitmap. Otherwise, if an allocation comes in, + * this resets the state and we will retry trimming the whole bitmap. This is a + * tradeoff between discard state accuracy and the cost of accounting. 
+ */

 #include
 #include

From patchwork Thu Jan 2 21:26:43 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dennis Zhou
X-Patchwork-Id: 11316091
From: Dennis Zhou
To: David Sterba, Chris Mason, Josef Bacik, Omar Sandoval
Cc: kernel-team@fb.com, linux-btrfs@vger.kernel.org, Dennis Zhou
Subject: [PATCH 09/12] btrfs: increase the metadata allowance for the free_space_cache
Date: Thu, 2 Jan 2020 16:26:43 -0500
Message-Id: <49f2d76f892a24fb6c9cd384f4c7ecc04e63522c.1577999991.git.dennis@kernel.org>
X-Mailer: git-send-email 2.13.5
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

Currently, there is no way for the free space cache to recover from
being serviced purely by bitmaps because the extent threshold is set to
0 in recalculate_thresholds() when we surpass the metadata allowance.

This adds a recovery mechanism by keeping large extents out of the
bitmaps and increases the metadata upper bound to 64KB. The recovery
mechanism bypasses this upper bound, thus making it a soft upper bound.
But, with the bypass being 1MB or greater, it shouldn't add unbounded
overhead.
Signed-off-by: Dennis Zhou
Reviewed-by: Josef Bacik
---
 fs/btrfs/free-space-cache.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 8a4a3b9cd544..665f6eb6c828 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -24,7 +24,8 @@
 #include "discard.h"

 #define BITS_PER_BITMAP		(PAGE_SIZE * 8UL)
-#define MAX_CACHE_BYTES_PER_GIG	SZ_32K
+#define MAX_CACHE_BYTES_PER_GIG	SZ_64K
+#define FORCE_EXTENT_THRESHOLD	SZ_1M

 struct btrfs_trim_range {
 	u64 start;
@@ -1694,26 +1695,17 @@ static void recalculate_thresholds(struct btrfs_free_space_ctl *ctl)
 	ASSERT(ctl->total_bitmaps <= max_bitmaps);

 	/*
-	 * The goal is to keep the total amount of memory used per 1gb of space
-	 * at or below 32k, so we need to adjust how much memory we allow to be
-	 * used by extent based free space tracking
+	 * We are trying to keep the total amount of memory used per 1gb of
+	 * space to be MAX_CACHE_BYTES_PER_GIG. However, with a reclamation
+	 * mechanism of pulling extents >= FORCE_EXTENT_THRESHOLD out of
+	 * bitmaps, we may end up using more memory than this.
 	 */
 	if (size < SZ_1G)
 		max_bytes = MAX_CACHE_BYTES_PER_GIG;
 	else
 		max_bytes = MAX_CACHE_BYTES_PER_GIG * div_u64(size, SZ_1G);

-	/*
-	 * we want to account for 1 more bitmap than what we have so we can make
-	 * sure we don't go over our overall goal of MAX_CACHE_BYTES_PER_GIG as
-	 * we add more bitmaps.
-	 */
-	bitmap_bytes = (ctl->total_bitmaps + 1) * ctl->unit;
-
-	if (bitmap_bytes >= max_bytes) {
-		ctl->extents_thresh = 0;
-		return;
-	}
+	bitmap_bytes = ctl->total_bitmaps * ctl->unit;

 	/*
 	 * we want the extent entry threshold to always be at most 1/2 the max
@@ -2099,6 +2091,10 @@ static bool use_bitmap(struct btrfs_free_space_ctl *ctl,
 		forced = true;
 #endif

+	/* This is a way to reclaim large regions from the bitmaps. */
+	if (!forced && info->bytes >= FORCE_EXTENT_THRESHOLD)
+		return false;
+
 	/*
 	 * If we are below the extents threshold then we can add this as an
 	 * extent, and don't have to deal with the bitmap

From patchwork Thu Jan 2 21:26:44 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dennis Zhou
X-Patchwork-Id: 11316093
From: Dennis Zhou
To: David Sterba, Chris Mason, Josef Bacik, Omar Sandoval
Cc: kernel-team@fb.com, linux-btrfs@vger.kernel.org, Dennis Zhou
Subject: [PATCH 10/12] btrfs: make smaller extents more likely to go into bitmaps
Date: Thu, 2 Jan 2020 16:26:44 -0500
Message-Id: <075b6ea5fdbd3c113cf08963f430ec148fa53a78.1577999991.git.dennis@kernel.org>
X-Mailer: git-send-email 2.13.5
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

It's less than ideal for small extents to eat into our extent budget, so
force extents <= 32KB into the bitmaps save for the first handful.
Signed-off-by: Dennis Zhou
Reviewed-by: Josef Bacik
---
 fs/btrfs/free-space-cache.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 665f6eb6c828..6d74d96a1d13 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2107,8 +2107,8 @@ static bool use_bitmap(struct btrfs_free_space_ctl *ctl,
 		 * of cache left then go ahead an dadd them, no sense in adding
 		 * the overhead of a bitmap if we don't have to.
 		 */
-		if (info->bytes <= fs_info->sectorsize * 4) {
-			if (ctl->free_extents * 2 <= ctl->extents_thresh)
+		if (info->bytes <= fs_info->sectorsize * 8) {
+			if (ctl->free_extents * 3 <= ctl->extents_thresh)
 				return false;
 		} else {
 			return false;

From patchwork Thu Jan 2 21:26:45 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dennis Zhou
X-Patchwork-Id: 11316095
From: Dennis Zhou
To: David Sterba, Chris Mason, Josef Bacik, Omar Sandoval
Cc: kernel-team@fb.com, linux-btrfs@vger.kernel.org, Dennis Zhou
Subject: [PATCH 11/12] btrfs: ensure removal of discardable_* in free_bitmap()
Date: Thu, 2 Jan 2020 16:26:45 -0500
X-Mailer: git-send-email 2.13.5
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

Most callers of free_bitmap() only call it if bitmap_info->bytes is 0.
However, there are certain cases where we may free the free space cache
via __btrfs_remove_free_space_cache().
This exposes a path where free_bitmap() is called regardless. This may
result in a bad accounting situation for discardable_bytes and
discardable_extents. So, remove the stats and call
btrfs_discard_update_discardable().

Signed-off-by: Dennis Zhou
---
 fs/btrfs/free-space-cache.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 6d74d96a1d13..2b294c57060c 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -1959,6 +1959,18 @@ static void add_new_bitmap(struct btrfs_free_space_ctl *ctl,
 static void free_bitmap(struct btrfs_free_space_ctl *ctl,
 			struct btrfs_free_space *bitmap_info)
 {
+	/*
+	 * Normally when this is called, the bitmap is fully empty. However,
+	 * if we are blowing up the free space cache for one reason or another
+	 * via __btrfs_remove_free_space_cache(), then it may not be freed and
+	 * we may leave stats on the table.
+	 */
+	if (bitmap_info->bytes && !btrfs_free_space_trimmed(bitmap_info)) {
+		ctl->discardable_extents[BTRFS_STAT_CURR] -=
+			bitmap_info->bitmap_extents;
+		ctl->discardable_bytes[BTRFS_STAT_CURR] -= bitmap_info->bytes;
+
+	}
 	unlink_free_space(ctl, bitmap_info);
 	kmem_cache_free(btrfs_free_space_bitmap_cachep, bitmap_info->bitmap);
 	kmem_cache_free(btrfs_free_space_cachep, bitmap_info);
@@ -2776,6 +2788,8 @@ void __btrfs_remove_free_space_cache(struct btrfs_free_space_ctl *ctl)
 {
 	spin_lock(&ctl->tree_lock);
 	__btrfs_remove_free_space_cache_locked(ctl);
+	if (ctl->private)
+		btrfs_discard_update_discardable(ctl->private, ctl);
 	spin_unlock(&ctl->tree_lock);
 }

From patchwork Thu Jan 2 21:26:46 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dennis Zhou
X-Patchwork-Id: 11316097
From: Dennis Zhou
To: David Sterba, Chris Mason, Josef Bacik, Omar Sandoval
Cc: kernel-team@fb.com, linux-btrfs@vger.kernel.org, Dennis Zhou
Subject: [PATCH 12/12] btrfs: add correction to handle -1 edge case in async discard
Date: Thu, 2 Jan 2020 16:26:46 -0500
X-Mailer: git-send-email 2.13.5
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

From Dave's testing, it's possible to drive a file system to have -1
discardable_extents and a corresponding negative discardable_bytes. As
btrfs_discard_calc_delay() is the only user of discardable_extents, we
can correct here for any negative discardable_extents/discardable_bytes.

Reported-by: David Sterba
Signed-off-by: Dennis Zhou
---
 fs/btrfs/discard.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
index d5a89e3755ed..d2c7851e31de 100644
--- a/fs/btrfs/discard.c
+++ b/fs/btrfs/discard.c
@@ -518,14 +518,32 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
 {
 	s32 discardable_extents =
 		atomic_read(&discard_ctl->discardable_extents);
+	s64 discardable_bytes = atomic64_read(&discard_ctl->discardable_bytes);
 	unsigned iops_limit;
 	unsigned long delay, lower_limit = BTRFS_DISCARD_MIN_DELAY_MSEC;

-	if (!discardable_extents)
-		return;
-
 	spin_lock(&discard_ctl->lock);

+	/*
+	 * The following is to fix a potential -1 discrepancy that I'm not
+	 * sure how to reproduce. But given that this is the only place that
+	 * utilizes these numbers and this is only called from
+	 * btrfs_finish_extent_commit() which is synchronized, we can correct
+	 * here.
+	 */
+	if (discardable_extents < 0)
+		atomic_add(-discardable_extents,
+			   &discard_ctl->discardable_extents);
+
+	if (discardable_bytes < 0)
+		atomic64_add(-discardable_bytes,
+			     &discard_ctl->discardable_bytes);
+
+	if (discardable_extents <= 0) {
+		spin_unlock(&discard_ctl->lock);
+		return;
+	}
+
 	iops_limit = READ_ONCE(discard_ctl->iops_limit);
 	if (iops_limit)
 		lower_limit = max_t(unsigned long, lower_limit,