From patchwork Fri Aug 16 15:20:18 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Josef Bacik <josef@toxicpanda.com>
X-Patchwork-Id: 11097865
From: Josef Bacik <josef@toxicpanda.com>
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 4/5] btrfs: do not account global reserve in can_overcommit
Date: Fri, 16 Aug 2019 11:20:18 -0400
Message-Id: <20190816152019.1962-5-josef@toxicpanda.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190816152019.1962-1-josef@toxicpanda.com>
References: <20190816152019.1962-1-josef@toxicpanda.com>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-btrfs.vger.kernel.org>
X-Mailing-List: linux-btrfs@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

We ran into a problem in production where a box with plenty of space was
getting wedged doing ENOSPC flushing.  These boxes only had 20% of the
disk allocated, but their metadata space + global reserve was right at
the size of their metadata chunk.

In this case can_overcommit should be allowing allocations without a
problem, but there's logic in can_overcommit that doesn't allow us to
overcommit if there's not enough real space to satisfy the global
reserve.

This is for historical reasons.  Previously there were only certain
places where we could allocate chunks.  We could go to commit the
transaction, not have enough space for our pending delayed refs and
such, and be unable to allocate a new chunk.  This would result in an
abort because of ENOSPC.  This code was added to solve this problem.

However, since then we've gained the ability to always allocate a
chunk.  So we can easily overcommit in these cases without risking a
transaction abort because of ENOSPC.

Also, the global reserve used to really get used, because that's the
space we relied on for delayed refs.  With delayed refs being tracked
separately, we no longer have to worry about running out of delayed
refs space while committing, so we are much less likely to exhaust our
global reserve space during transaction commit.

Fix the can_overcommit code to simply check whether our current usage +
what we want is less than our current free space plus whatever slack
space we have on the disk.  This solves the problem we were seeing in
production and keeps us from flushing as aggressively as we approach
our actual metadata size usage.
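
As a rough user-space sketch of the difference (this is not the kernel
code: struct space_model, rsv_need and slack are made-up stand-ins, and
the final comparison is assumed from the description above), the old
early return versus the new single check looks roughly like this:

```c
#include <stdbool.h>
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)

/* Hypothetical stand-in for the metadata space_info counters. */
struct space_model {
	uint64_t total_bytes;	/* size of the allocated metadata chunks */
	uint64_t bytes_used;	/* usage, not counting reservations */
	uint64_t bytes_may_use;	/* outstanding reservations */
};

/*
 * Old behavior: bail out as soon as usage plus the global reserve need
 * fills the allocated chunks, no matter how much slack the disk has.
 * rsv_need and slack are passed in as plain numbers for illustration.
 */
static bool old_can_overcommit(const struct space_model *s,
			       uint64_t rsv_need, uint64_t bytes,
			       uint64_t slack)
{
	uint64_t used = s->bytes_used;

	if (used + rsv_need >= s->total_bytes)
		return false;

	used += s->bytes_may_use;
	return used + bytes < s->total_bytes + slack;
}

/* New behavior: one comparison, reservations counted up front. */
static bool new_can_overcommit(const struct space_model *s,
			       uint64_t bytes, uint64_t slack)
{
	uint64_t used = s->bytes_used + s->bytes_may_use;

	return used + bytes < s->total_bytes + slack;
}
```

With 1GiB of metadata chunks, 512MiB used, 16MiB reserved, a 512MiB
global reserve need and 10GiB of slack, the old check refuses a 1MiB
reservation while the new one allows it.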

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/space-info.c | 19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 9c5f81074cd5..3d3f301bae26 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -165,9 +165,7 @@ static int can_overcommit(struct btrfs_fs_info *fs_info,
 			  enum btrfs_reserve_flush_enum flush,
 			  bool system_chunk)
 {
-	struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv;
 	u64 profile;
-	u64 space_size;
 	u64 avail;
 	u64 used;
 	int factor;
@@ -181,22 +179,7 @@ static int can_overcommit(struct btrfs_fs_info *fs_info,
 	else
 		profile = btrfs_metadata_alloc_profile(fs_info);
 
-	used = btrfs_space_info_used(space_info, false);
-
-	/*
-	 * We only want to allow over committing if we have lots of actual space
-	 * free, but if we don't have enough space to handle the global reserve
-	 * space then we could end up having a real enospc problem when trying
-	 * to allocate a chunk or some other such important allocation.
-	 */
-	spin_lock(&global_rsv->lock);
-	space_size = calc_global_rsv_need_space(global_rsv);
-	spin_unlock(&global_rsv->lock);
-	if (used + space_size >= space_info->total_bytes)
-		return 0;
-
-	used += space_info->bytes_may_use;
-
+	used = btrfs_space_info_used(space_info, true);
 	avail = atomic64_read(&fs_info->free_chunk_space);
 
 	/*