From patchwork Fri Dec 1 21:00:09 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Boris Burkov
X-Patchwork-Id: 13476391
From: Boris Burkov
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 1/5] btrfs: free qgroup rsv on ioerr ordered_extent
Date: Fri, 1 Dec 2023 13:00:09 -0800
Message-ID: <301bc827ef330a961a95791e6c4d3dbe3e2a6108.1701464169.git.boris@bur.io>
X-Mailer: git-send-email 2.42.0
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

An ordered extent completing is a critical moment in qgroup rsv
handling, as the ownership of the reservation is handed off from the
ordered extent to the delayed ref. In the happy path we release (unlock)
but do not free (decrement counter) the reservation, and the delayed ref
drives the free. However, on an error, we don't create a delayed ref,
since there is no ref to add. Therefore, free on the error path.

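The hand-off described above can be sketched with a toy model. This is not kernel code: `struct demo_rsv` and every `demo_*` function are hypothetical names invented for illustration, and the model assumes a single counter plus an ownership flag in place of the real qgroup structures. It only shows why the error path has to free the reservation itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; none of these names exist in btrfs. */
struct demo_rsv {
	uint64_t reserved;	/* bytes still counted against the qgroup */
	bool held;		/* true while the ordered extent owns the rsv */
};

/* Delalloc time: count the bytes and mark them as owned. */
static void demo_reserve(struct demo_rsv *r, uint64_t bytes)
{
	r->reserved += bytes;
	r->held = true;
}

/*
 * Ordered extent completion. "Release" gives up ownership without
 * touching the counter. On an I/O error no delayed ref will be created,
 * so nothing else would ever decrement the counter; free it here.
 */
static void demo_finish_ordered(struct demo_rsv *r, uint64_t bytes, bool ioerr)
{
	r->held = false;
	if (ioerr)
		r->reserved -= bytes;
}

/* Happy path only: running the delayed ref drives the free. */
static void demo_run_delayed_ref(struct demo_rsv *r, uint64_t bytes)
{
	r->reserved -= bytes;
}
```

In this model, skipping the `ioerr` branch leaves `reserved` non-zero forever after a failed write, which corresponds to the leak the patch closes.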
Signed-off-by: Boris Burkov
Reviewed-by: Qu Wenruo
---
 fs/btrfs/ordered-data.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 574e8a55e24a..8d4ab5ecfa5d 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -599,7 +599,8 @@ void btrfs_remove_ordered_extent(struct btrfs_inode *btrfs_inode,
 			release = entry->disk_num_bytes;
 		else
 			release = entry->num_bytes;
-		btrfs_delalloc_release_metadata(btrfs_inode, release, false);
+		btrfs_delalloc_release_metadata(btrfs_inode, release,
+						test_bit(BTRFS_ORDERED_IOERR, &entry->flags));
 	}
 
 	percpu_counter_add_batch(&fs_info->ordered_bytes, -entry->num_bytes,

From patchwork Fri Dec 1 21:00:10 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Boris Burkov
X-Patchwork-Id: 13476392
From: Boris Burkov
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 2/5] btrfs: fix qgroup_free_reserved_data int overflow
Date: Fri, 1 Dec 2023 13:00:10 -0800
Message-ID: <98d6609df5dc669df4025c257c28077f44b21e04.1701464169.git.boris@bur.io>
X-Mailer: git-send-email 2.42.0
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

The reserved data counter and the input parameter are both u64, but we
inadvertently accumulate the freed total in an int. Overflowing that int
results in freeing the wrong amount of data and breaking rsv accounting.

Unfortunately, this overflow rot spreads from there, as the qgroup
release/free functions rely on returning an int to take advantage of
negative values for error codes.

Therefore, the full fix is to return the "released" or "freed" amount by
a u64* argument and to return 0 or negative error code via the return
value.

Most of the callsites simply ignore the return value, though some of
them handle the error and count the returned bytes. Change all of them
accordingly.

Signed-off-by: Boris Burkov
Reviewed-by: Qu Wenruo
---
 fs/btrfs/delalloc-space.c |  2 +-
 fs/btrfs/file.c           |  2 +-
 fs/btrfs/inode.c          | 16 ++++++++--------
 fs/btrfs/ordered-data.c   |  7 ++++---
 fs/btrfs/qgroup.c         | 25 +++++++++++++++----------
 fs/btrfs/qgroup.h         |  4 ++--
 6 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
index 51453d4928fa..2833e8ef4c09 100644
--- a/fs/btrfs/delalloc-space.c
+++ b/fs/btrfs/delalloc-space.c
@@ -199,7 +199,7 @@ void btrfs_free_reserved_data_space(struct btrfs_inode *inode,
 	start = round_down(start, fs_info->sectorsize);
 
 	btrfs_free_reserved_data_space_noquota(fs_info, len);
-	btrfs_qgroup_free_data(inode, reserved, start, len);
+	btrfs_qgroup_free_data(inode, reserved, start, len, NULL);
 }
 
 /*
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index e9c4b947a5aa..7a71720aaed2 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -3192,7 +3192,7 @@ static long btrfs_fallocate(struct file *file, int mode,
 			qgroup_reserved -= range->len;
 		} else if (qgroup_reserved > 0) {
 			btrfs_qgroup_free_data(BTRFS_I(inode), data_reserved,
-					       range->start, range->len);
+					       range->start, range->len, NULL);
 			qgroup_reserved -= range->len;
 		}
 		list_del(&range->list);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index f8647d8271b7..e79a047aa5d1 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -697,7 +697,7 @@ static noinline int cow_file_range_inline(struct btrfs_inode *inode, u64 size,
 		 * And at reserve time, it's always aligned to page size, so
 		 * just free one page here.
 		 */
-		btrfs_qgroup_free_data(inode, NULL, 0, PAGE_SIZE);
+		btrfs_qgroup_free_data(inode, NULL, 0, PAGE_SIZE, NULL);
 		btrfs_free_path(path);
 		btrfs_end_transaction(trans);
 		return ret;
@@ -5141,7 +5141,7 @@ static void evict_inode_truncate_pages(struct inode *inode)
 		 */
 		if (state_flags & EXTENT_DELALLOC)
 			btrfs_qgroup_free_data(BTRFS_I(inode), NULL, start,
-					       end - start + 1);
+					       end - start + 1, NULL);
 
 		clear_extent_bit(io_tree, start, end,
 				 EXTENT_CLEAR_ALL_BITS | EXTENT_DO_ACCOUNTING,
@@ -8076,7 +8076,7 @@ static void btrfs_invalidate_folio(struct folio *folio, size_t offset,
 		 * reserved data space.
 		 * Since the IO will never happen for this page.
 		 */
-		btrfs_qgroup_free_data(inode, NULL, cur, range_end + 1 - cur);
+		btrfs_qgroup_free_data(inode, NULL, cur, range_end + 1 - cur, NULL);
 		if (!inode_evicting) {
 			clear_extent_bit(tree, cur, range_end, EXTENT_LOCKED |
 				 EXTENT_DELALLOC | EXTENT_UPTODATE |
@@ -9513,7 +9513,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 	struct btrfs_path *path;
 	u64 start = ins->objectid;
 	u64 len = ins->offset;
-	int qgroup_released;
+	u64 qgroup_released = 0;
 	int ret;
 
 	memset(&stack_fi, 0, sizeof(stack_fi));
@@ -9526,9 +9526,9 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 	btrfs_set_stack_file_extent_compression(&stack_fi, BTRFS_COMPRESS_NONE);
 	/* Encryption and other encoding is reserved and all 0 */
 
-	qgroup_released = btrfs_qgroup_release_data(inode, file_offset, len);
-	if (qgroup_released < 0)
-		return ERR_PTR(qgroup_released);
+	ret = btrfs_qgroup_release_data(inode, file_offset, len, &qgroup_released);
+	if (ret < 0)
+		return ERR_PTR(ret);
 
 	if (trans) {
 		ret = insert_reserved_file_extent(trans, inode,
@@ -10423,7 +10423,7 @@ ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from,
 	btrfs_delalloc_release_metadata(inode, disk_num_bytes, ret < 0);
 out_qgroup_free_data:
 	if (ret < 0)
-		btrfs_qgroup_free_data(inode, data_reserved, start, num_bytes);
+		btrfs_qgroup_free_data(inode, data_reserved, start, num_bytes, NULL);
 out_free_data_space:
 	/*
 	 * If btrfs_reserve_extent() succeeded, then we already decremented
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 8d4ab5ecfa5d..c68fb78b7454 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -152,11 +152,12 @@ static struct btrfs_ordered_extent *alloc_ordered_extent(
 {
 	struct btrfs_ordered_extent *entry;
 	int ret;
+	u64 qgroup_rsv = 0;
 
 	if (flags &
 	    ((1 << BTRFS_ORDERED_NOCOW) | (1 << BTRFS_ORDERED_PREALLOC))) {
 		/* For nocow write, we can release the qgroup rsv right now */
-		ret = btrfs_qgroup_free_data(inode, NULL, file_offset, num_bytes);
+		ret = btrfs_qgroup_free_data(inode, NULL, file_offset, num_bytes, &qgroup_rsv);
 		if (ret < 0)
 			return ERR_PTR(ret);
 	} else {
@@ -164,7 +165,7 @@ static struct btrfs_ordered_extent *alloc_ordered_extent(
 		 * The ordered extent has reserved qgroup space, release now
 		 * and pass the reserved number for qgroup_record to free.
 		 */
-		ret = btrfs_qgroup_release_data(inode, file_offset, num_bytes);
+		ret = btrfs_qgroup_release_data(inode, file_offset, num_bytes, &qgroup_rsv);
 		if (ret < 0)
 			return ERR_PTR(ret);
 	}
@@ -182,7 +183,7 @@ static struct btrfs_ordered_extent *alloc_ordered_extent(
 	entry->inode = igrab(&inode->vfs_inode);
 	entry->compress_type = compress_type;
 	entry->truncated_len = (u64)-1;
-	entry->qgroup_rsv = ret;
+	entry->qgroup_rsv = qgroup_rsv;
 	entry->flags = flags;
 	refcount_set(&entry->refs, 1);
 	init_waitqueue_head(&entry->wait);
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index ce446d9d7f23..a953c16c7eb8 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -4057,13 +4057,14 @@ int btrfs_qgroup_reserve_data(struct btrfs_inode *inode,
 
 /* Free ranges specified by @reserved, normally in error path */
 static int qgroup_free_reserved_data(struct btrfs_inode *inode,
-			struct extent_changeset *reserved, u64 start, u64 len)
+				     struct extent_changeset *reserved,
+				     u64 start, u64 len, u64 *freed_ret)
 {
 	struct btrfs_root *root = inode->root;
 	struct ulist_node *unode;
 	struct ulist_iterator uiter;
 	struct extent_changeset changeset;
-	int freed = 0;
+	u64 freed = 0;
 	int ret;
 
 	extent_changeset_init(&changeset);
@@ -4104,7 +4105,9 @@ static int qgroup_free_reserved_data(struct btrfs_inode *inode,
 	}
 	btrfs_qgroup_free_refroot(root->fs_info, root->root_key.objectid, freed,
 				  BTRFS_QGROUP_RSV_DATA);
-	ret = freed;
+	if (freed_ret)
+		*freed_ret = freed;
+	ret = 0;
 out:
 	extent_changeset_release(&changeset);
 	return ret;
@@ -4112,7 +4115,7 @@ static int qgroup_free_reserved_data(struct btrfs_inode *inode,
 
 static int __btrfs_qgroup_release_data(struct btrfs_inode *inode,
 			struct extent_changeset *reserved, u64 start, u64 len,
-			int free)
+			u64 *released, int free)
 {
 	struct extent_changeset changeset;
 	int trace_op = QGROUP_RELEASE;
@@ -4128,7 +4131,7 @@ static int __btrfs_qgroup_release_data(struct btrfs_inode *inode,
 	/* In release case, we shouldn't have @reserved */
 	WARN_ON(!free && reserved);
 	if (free && reserved)
-		return qgroup_free_reserved_data(inode, reserved, start, len);
+		return qgroup_free_reserved_data(inode, reserved, start, len, released);
 	extent_changeset_init(&changeset);
 	ret = clear_record_extent_bits(&inode->io_tree, start, start + len -1,
 				       EXTENT_QGROUP_RESERVED, &changeset);
@@ -4143,7 +4146,8 @@ static int __btrfs_qgroup_release_data(struct btrfs_inode *inode,
 		btrfs_qgroup_free_refroot(inode->root->fs_info,
 				inode->root->root_key.objectid,
 				changeset.bytes_changed, BTRFS_QGROUP_RSV_DATA);
-	ret = changeset.bytes_changed;
+	if (released)
+		*released = changeset.bytes_changed;
 out:
 	extent_changeset_release(&changeset);
 	return ret;
@@ -4162,9 +4166,10 @@ static int __btrfs_qgroup_release_data(struct btrfs_inode *inode,
  * NOTE: This function may sleep for memory allocation.
  */
 int btrfs_qgroup_free_data(struct btrfs_inode *inode,
-			struct extent_changeset *reserved, u64 start, u64 len)
+			   struct extent_changeset *reserved,
+			   u64 start, u64 len, u64 *freed)
 {
-	return __btrfs_qgroup_release_data(inode, reserved, start, len, 1);
+	return __btrfs_qgroup_release_data(inode, reserved, start, len, freed, 1);
 }
 
 /*
@@ -4182,9 +4187,9 @@ int btrfs_qgroup_free_data(struct btrfs_inode *inode,
  *
  * NOTE: This function may sleep for memory allocation.
  */
-int btrfs_qgroup_release_data(struct btrfs_inode *inode, u64 start, u64 len)
+int btrfs_qgroup_release_data(struct btrfs_inode *inode, u64 start, u64 len, u64 *released)
 {
-	return __btrfs_qgroup_release_data(inode, NULL, start, len, 0);
+	return __btrfs_qgroup_release_data(inode, NULL, start, len, released, 0);
 }
 
 static void add_root_meta_rsv(struct btrfs_root *root, int num_bytes,
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 855a4f978761..15b485506104 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -358,10 +358,10 @@ int btrfs_verify_qgroup_counts(struct btrfs_fs_info *fs_info, u64 qgroupid,
 /* New io_tree based accurate qgroup reserve API */
 int btrfs_qgroup_reserve_data(struct btrfs_inode *inode,
 			struct extent_changeset **reserved, u64 start, u64 len);
-int btrfs_qgroup_release_data(struct btrfs_inode *inode, u64 start, u64 len);
+int btrfs_qgroup_release_data(struct btrfs_inode *inode, u64 start, u64 len, u64 *released);
 int btrfs_qgroup_free_data(struct btrfs_inode *inode,
 			   struct extent_changeset *reserved, u64 start,
-			   u64 len);
+			   u64 len, u64 *freed);
 int btrfs_qgroup_reserve_meta(struct btrfs_root *root, int num_bytes,
 			      enum btrfs_qgroup_rsv_type type, bool enforce);
 int __btrfs_qgroup_reserve_meta(struct btrfs_root *root, int num_bytes,

From patchwork Fri Dec 1 21:00:11 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Boris Burkov
X-Patchwork-Id: 13476393
From: Boris Burkov
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 3/5] btrfs: free qgroup pertrans rsv on trans abort
Date: Fri, 1 Dec 2023 13:00:11 -0800
Message-ID: <07934597eaee1e2204c204bfd34bc628708e3739.1701464169.git.boris@bur.io>
X-Mailer: git-send-email 2.42.0
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

If we abort a transaction, we never run the code that frees the pertrans
qgroup reservation. This results in warnings on unmount as that
reservation has been leaked. The leak isn't a huge issue since the fs is
read-only, but it's better to clean it up when we know we can/should. Do
it during the cleanup_transaction step of aborting.

Signed-off-by: Boris Burkov
Reviewed-by: Qu Wenruo
---
 fs/btrfs/disk-io.c | 28 ++++++++++++++++++++++++++++
 fs/btrfs/qgroup.c  |  5 +++--
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 9317606017e2..a1f440cd6d45 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4775,6 +4775,32 @@ void btrfs_cleanup_dirty_bgs(struct btrfs_transaction *cur_trans,
 	}
 }
 
+static void btrfs_free_all_qgroup_pertrans(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_root *gang[8];
+	int i;
+	int ret;
+
+	spin_lock(&fs_info->fs_roots_radix_lock);
+	while (1) {
+		ret = radix_tree_gang_lookup_tag(&fs_info->fs_roots_radix,
+						 (void **)gang, 0,
+						 ARRAY_SIZE(gang),
+						 0); // BTRFS_ROOT_TRANS_TAG
+		if (ret == 0)
+			break;
+		for (i = 0; i < ret; i++) {
+			struct btrfs_root *root = gang[i];
+
+			btrfs_qgroup_free_meta_all_pertrans(root);
+			radix_tree_tag_clear(&fs_info->fs_roots_radix,
+					(unsigned long)root->root_key.objectid,
+					0); // BTRFS_ROOT_TRANS_TAG
+		}
+	}
+	spin_unlock(&fs_info->fs_roots_radix_lock);
+}
+
 void btrfs_cleanup_one_transaction(struct btrfs_transaction *cur_trans,
 				   struct btrfs_fs_info *fs_info)
 {
@@ -4803,6 +4829,8 @@ void btrfs_cleanup_one_transaction(struct btrfs_transaction *cur_trans,
 				     EXTENT_DIRTY);
 	btrfs_destroy_pinned_extent(fs_info, &cur_trans->pinned_extents);
 
+	btrfs_free_all_qgroup_pertrans(fs_info);
+
 	cur_trans->state =TRANS_STATE_COMPLETED;
 	wake_up(&cur_trans->commit_wait);
 }
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index a953c16c7eb8..daec90342dad 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -4337,8 +4337,9 @@ static void qgroup_convert_meta(struct btrfs_fs_info *fs_info, u64 ref_root,
 
 		qgroup_rsv_release(fs_info, qgroup, num_bytes,
 				BTRFS_QGROUP_RSV_META_PREALLOC);
-		qgroup_rsv_add(fs_info, qgroup, num_bytes,
-				BTRFS_QGROUP_RSV_META_PERTRANS);
+		if (!sb_rdonly(fs_info->sb))
+			qgroup_rsv_add(fs_info, qgroup, num_bytes,
+				       BTRFS_QGROUP_RSV_META_PERTRANS);
 
 		list_for_each_entry(glist, &qgroup->groups, next_group)
 			qgroup_iterator_add(&qgroup_list, glist->group);

From patchwork Fri Dec 1 21:00:12 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Boris Burkov
X-Patchwork-Id: 13476394
From: Boris Burkov
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 4/5] btrfs: dont clear qgroup rsv bit in release_folio
Date: Fri, 1 Dec 2023 13:00:12 -0800
Message-ID: <9a8e2f9639330dd5e82db11a49f84fa17cb9988d.1701464169.git.boris@bur.io>
X-Mailer: git-send-email 2.42.0
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

The EXTENT_QGROUP_RESERVED bit is used to "lock" regions of the file for
duplicate reservations. That is, two writes to that range in one
transaction shouldn't create two reservations, as the reservation will
only be freed once when the write finally goes down. Therefore, it is
never OK to clear that bit without freeing the associated qgroup rsv. At
this point, we don't want to be freeing the rsv, so mask off the bit.

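The masking idiom the message relies on can be sketched in isolation. The bit values below are made up for illustration (the real `EXTENT_*` definitions live in the btrfs headers), and the `demo_*` helpers are hypothetical. The point is only that a clear mask built as `~(protected bits)` leaves every protected bit untouched when it is applied.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments; not the kernel's EXTENT_* values. */
enum {
	DEMO_EXTENT_DIRTY           = 1u << 0,
	DEMO_EXTENT_LOCKED          = 1u << 1,
	DEMO_EXTENT_DELALLOC        = 1u << 2,
	DEMO_EXTENT_QGROUP_RESERVED = 1u << 3,
};

/* Build the set of bits that may be cleared: everything not protected. */
static uint32_t demo_clear_mask(uint32_t protected_bits)
{
	return ~protected_bits;
}

/* Clear the bits named in clear_bits and return what stays set. */
static uint32_t demo_apply_clear(uint32_t state, uint32_t clear_bits)
{
	return state & ~clear_bits;
}
```

With only `DEMO_EXTENT_LOCKED` protected, a state carrying the reserved bit loses it. Adding `DEMO_EXTENT_QGROUP_RESERVED` to the protected set makes the bit survive the clear, which is the effect the one-line change to `clear_bits` aims for.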
Signed-off-by: Boris Burkov Reviewed-by: Qu Wenruo --- fs/btrfs/extent_io.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 0143bf63044c..87283087c669 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2310,7 +2310,8 @@ static int try_release_extent_state(struct extent_io_tree *tree, ret = 0; } else { u32 clear_bits = ~(EXTENT_LOCKED | EXTENT_NODATASUM | - EXTENT_DELALLOC_NEW | EXTENT_CTLBITS); + EXTENT_DELALLOC_NEW | EXTENT_CTLBITS + | EXTENT_QGROUP_RESERVED); /* * At this point we can safely clear everything except the From patchwork Fri Dec 1 21:00:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Burkov X-Patchwork-Id: 13476395 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bur.io header.i=@bur.io header.b="RIeDjFm1"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="12d1+N7Q" Received: from out1-smtp.messagingengine.com (out1-smtp.messagingengine.com [66.111.4.25]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 92FE412A for ; Fri, 1 Dec 2023 12:59:07 -0800 (PST) Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailout.nyi.internal (Postfix) with ESMTP id 013855C00CB; Fri, 1 Dec 2023 15:59:07 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute6.internal (MEProxy); Fri, 01 Dec 2023 15:59:07 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bur.io; h=cc :content-transfer-encoding:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm2; t=1701464346; x= 1701550746; bh=bY5hgXS7HK/qiu6LG36iWyufVoqx8JfjZQqIT6n6Y4g=; b=R IeDjFm1CY3gSuemb8IDvANsQrbIbEhdQYeF4V5FVdqiPUvWFjHwbZ7i8k8d9bu6w jWfibYhyB6sD4x1I0pO1FTp1CehVyxD4Y2qRUYU9/GTknFdiAONxwJRQBNs0Ihyl 
7y0bgqmjgjJeZJLywOxiHwReAPe3htO2GmYf4KA5nhd94Q9YeoM+M8tvnLDO6SXc vwm1t1sM6TeSb/kNaR77kn9nbIut5Gu0qcFT4j28+h/BW7o1E03naUok3JvxGIza v47VKPyEx6/wIh51lCQgOgsfOim1eSy3Yy5G2l0+hwN2TjbH59r4MHvzbPf3RnTq 7EQyX0zVMQQv2e2Tz944A== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:date:feedback-id:feedback-id:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; t=1701464346; x=1701550746; bh=b Y5hgXS7HK/qiu6LG36iWyufVoqx8JfjZQqIT6n6Y4g=; b=12d1+N7QfS54LeffN ozr+C9WQA6vuTToCpfYT4EXprvsed4QFvCcAWI1gFmlcDL52mpxNUBKn6QlCoIYg xz/7BI6Eses36/Ut/b8bpfWfpAeEpJwx1bFXzzNpYPKZ7Jz+6OxsFT6JGZZ8WIWN /n2jVJOE7NZCcn4WAHJURBvzuYV33blUvzz2RHJt8kIfIccwulQbN2Sf3tUm5J7L 72mjAn17SiuCMCzgpPUHgMLVDAY6RmYjNNmb1KCfI1UddXZLPNS+77lZxeo+sUyp xnlcfrDtq5gMw6hLuiQQVimQSUx0DHefPXARZnCZQhx+POS5b5S6c3nRXkrVrkjw PpZJQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrudeiledgudeggecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtke ertdertddtnecuhfhrohhmpeeuohhrihhsuceuuhhrkhhovhcuoegsohhrihhssegsuhhr rdhioheqnecuggftrfgrthhtvghrnhepieeuffeuvdeiueejhfehiefgkeevudejjeejff evvdehtddufeeihfekgeeuheelnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghm pehmrghilhhfrhhomhepsghorhhishessghurhdrihho X-ME-Proxy: Feedback-ID: i083147f8:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 1 Dec 2023 15:59:06 -0500 (EST) From: Boris Burkov To: linux-btrfs@vger.kernel.org, kernel-team@fb.com Subject: [PATCH 5/5] btrfs: ensure releasing squota rsv on head refs Date: Fri, 1 Dec 2023 13:00:13 -0800 Message-ID: <451f3bf7c437690326036d84bf43bf32c9887ebd.1701464169.git.boris@bur.io> X-Mailer: git-send-email 2.42.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: 
linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 A reservation goes through a 3 step lifetime: - generated during delalloc - released/counted by ordered_extent allocation - freed by running delayed ref That third step depends on must_insert_reserved on the head ref, so the head ref with that field set owns the reservation. Once you prepare to run the head ref, must_insert_reserved is unset, which means that running the ref must free the reservation, whether or not it succeeds, or else the reservation is leaked. That results in either a risk of spurious ENOSPC if the fs stays writeable or a warning on unmount if it is readonly. The existing squota code was aware of these invariants, but missed a few cases. Improve it by adding a helper function to use in the cleanup paths and call it from the existing early returns in running delayed refs. This also simplifies btrfs_record_squota_delta and struct btrfs_quota_delta. This fixes (or at least improves the reliability of) generic/475 with mkfs -O squota. On my machine, that test failed ~4/10 times without this patch and passed 100/100 times with it. Signed-off-by: Boris Burkov --- fs/btrfs/extent-tree.c | 47 +++++++++++++++++++++++++++++------------- fs/btrfs/qgroup.c | 16 +++++++++++--- fs/btrfs/qgroup.h | 4 ++-- 3 files changed, 48 insertions(+), 19 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 008c4c77a847..e41251c08190 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -1547,6 +1547,23 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans, return ret; } +static void free_head_ref_squota_rsv(struct btrfs_fs_info *fs_info, + struct btrfs_delayed_ref_head *href) +{ + u64 root = href->owning_root; + + /* + * Don't check must_insert_reserved, as this is called from contexts + * where it has already been unset. 
+	 */
+	if (btrfs_qgroup_mode(fs_info) != BTRFS_QGROUP_MODE_SIMPLE ||
+	    !href->is_data || !is_fstree(root))
+		return;
+
+	btrfs_qgroup_free_refroot(fs_info, root, href->reserved_bytes,
+				  BTRFS_QGROUP_RSV_DATA);
+}
+
 static int run_delayed_data_ref(struct btrfs_trans_handle *trans,
 				struct btrfs_delayed_ref_head *href,
 				struct btrfs_delayed_ref_node *node,
@@ -1569,7 +1586,6 @@ static int run_delayed_data_ref(struct btrfs_trans_handle *trans,
 		struct btrfs_squota_delta delta = {
 			.root = href->owning_root,
 			.num_bytes = node->num_bytes,
-			.rsv_bytes = href->reserved_bytes,
 			.is_data = true,
 			.is_inc = true,
 			.generation = trans->transid,
@@ -1586,11 +1602,9 @@ static int run_delayed_data_ref(struct btrfs_trans_handle *trans,
 						 flags, ref->objectid,
 						 ref->offset, &key,
 						 node->ref_mod, href->owning_root);
+		free_head_ref_squota_rsv(trans->fs_info, href);
 		if (!ret)
 			ret = btrfs_record_squota_delta(trans->fs_info, &delta);
-		else
-			btrfs_qgroup_free_refroot(trans->fs_info, delta.root,
-						  delta.rsv_bytes, BTRFS_QGROUP_RSV_DATA);
 	} else if (node->action == BTRFS_ADD_DELAYED_REF) {
 		ret = __btrfs_inc_extent_ref(trans, node, parent, ref->root,
 					     ref->objectid, ref->offset,
@@ -1742,7 +1756,6 @@ static int run_delayed_tree_ref(struct btrfs_trans_handle *trans,
 	struct btrfs_squota_delta delta = {
 		.root = href->owning_root,
 		.num_bytes = fs_info->nodesize,
-		.rsv_bytes = 0,
 		.is_data = false,
 		.is_inc = true,
 		.generation = trans->transid,
@@ -1774,8 +1787,10 @@ static int run_one_delayed_ref(struct btrfs_trans_handle *trans,
 	int ret = 0;
 
 	if (TRANS_ABORTED(trans)) {
-		if (insert_reserved)
+		if (insert_reserved) {
 			btrfs_pin_extent(trans, node->bytenr, node->num_bytes, 1);
+			free_head_ref_squota_rsv(trans->fs_info, href);
+		}
 		return 0;
 	}
 
@@ -1871,6 +1886,8 @@ u64 btrfs_cleanup_ref_head_accounting(struct btrfs_fs_info *fs_info,
 				  struct btrfs_delayed_ref_root *delayed_refs,
 				  struct btrfs_delayed_ref_head *head)
 {
+	u64 ret = 0;
+
 	/*
 	 * We had csum deletions accounted for in our delayed refs rsv, we need
 	 * to drop the csum leaves for this update from our delayed_refs_rsv.
@@ -1885,14 +1902,13 @@ u64 btrfs_cleanup_ref_head_accounting(struct btrfs_fs_info *fs_info,
 
 		btrfs_delayed_refs_rsv_release(fs_info, 0, nr_csums);
 
-		return btrfs_calc_delayed_ref_csum_bytes(fs_info, nr_csums);
+		ret = btrfs_calc_delayed_ref_csum_bytes(fs_info, nr_csums);
 	}
-	if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_SIMPLE &&
-	    head->must_insert_reserved && head->is_data)
-		btrfs_qgroup_free_refroot(fs_info, head->owning_root,
-					  head->reserved_bytes, BTRFS_QGROUP_RSV_DATA);
+	/* must_insert_reserved can be set only if we didn't run the head ref */
+	if (head->must_insert_reserved)
+		free_head_ref_squota_rsv(fs_info, head);
 
-	return 0;
+	return ret;
 }
 
 static int cleanup_ref_head(struct btrfs_trans_handle *trans,
@@ -2033,6 +2049,11 @@ static int btrfs_run_delayed_refs_for_head(struct btrfs_trans_handle *trans,
 		 * spin lock.
 		 */
 		must_insert_reserved = locked_ref->must_insert_reserved;
+		/*
+		 * Unsetting this on the head ref relinquishes ownership of
+		 * the rsv_bytes, so it is critical that every possible code
+		 * path from here forward frees all rsv including qgroup rsv.
+		 */
 		locked_ref->must_insert_reserved = false;
 
 		extent_op = locked_ref->extent_op;
@@ -3292,7 +3313,6 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 		struct btrfs_squota_delta delta = {
 			.root = delayed_ref_root,
 			.num_bytes = num_bytes,
-			.rsv_bytes = 0,
 			.is_data = is_data,
 			.is_inc = false,
 			.generation = btrfs_extent_generation(leaf, ei),
@@ -4935,7 +4955,6 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
 		.root = root_objectid,
 		.num_bytes = ins->offset,
 		.generation = trans->transid,
-		.rsv_bytes = 0,
 		.is_data = true,
 		.is_inc = true,
 	};
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index daec90342dad..9576d77f6f6a 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -4661,6 +4661,19 @@ void btrfs_qgroup_destroy_extent_records(struct btrfs_transaction *trans)
 	*root = RB_ROOT;
 }
 
+void btrfs_free_squota_rsv(struct btrfs_fs_info *fs_info,
+			   u64 root, u64 rsv_bytes)
+{
+	if (btrfs_qgroup_mode(fs_info) != BTRFS_QGROUP_MODE_SIMPLE)
+		return;
+
+	if (!is_fstree(root))
+		return;
+
+	btrfs_qgroup_free_refroot(fs_info, root, rsv_bytes,
+				  BTRFS_QGROUP_RSV_DATA);
+}
+
 int btrfs_record_squota_delta(struct btrfs_fs_info *fs_info,
 			      struct btrfs_squota_delta *delta)
 {
@@ -4705,8 +4718,5 @@ int btrfs_record_squota_delta(struct btrfs_fs_info *fs_info,
 
 out:
 	spin_unlock(&fs_info->qgroup_lock);
-	if (!ret && delta->rsv_bytes)
-		btrfs_qgroup_free_refroot(fs_info, root, delta->rsv_bytes,
-					  BTRFS_QGROUP_RSV_DATA);
 	return ret;
 }
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 15b485506104..3dbb4095d2f2 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -274,8 +274,6 @@ struct btrfs_squota_delta {
 	u64 root;
 	/* The number of bytes in the extent being counted. */
 	u64 num_bytes;
-	/* The number of bytes reserved for this extent. */
-	u64 rsv_bytes;
 	/* The generation the extent was created in. */
 	u64 generation;
 	/* Whether we are using or freeing the extent. */
@@ -422,6 +420,8 @@ int btrfs_qgroup_trace_subtree_after_cow(struct btrfs_trans_handle *trans,
 		struct btrfs_root *root, struct extent_buffer *eb);
 void btrfs_qgroup_destroy_extent_records(struct btrfs_transaction *trans);
 bool btrfs_check_quota_leak(struct btrfs_fs_info *fs_info);
+void btrfs_free_squota_rsv(struct btrfs_fs_info *fs_info,
+			   u64 root, u64 rsv_bytes);
 int btrfs_record_squota_delta(struct btrfs_fs_info *fs_info,
 			      struct btrfs_squota_delta *delta);
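
The ownership rule from the commit message (a head ref with
must_insert_reserved set owns the reservation, and clearing the flag
hands the duty to free it to whoever runs the ref) can be modelled in a
few lines of userspace C. This is only a rough sketch for readers, not
kernel code: every toy_* name is hypothetical, the qgroup is reduced to
a single counter, and locking is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a qgroup's outstanding data reservation. */
struct toy_qgroup {
	uint64_t rsv_data;
};

/* Hypothetical stand-in for a delayed ref head that may own a reservation. */
struct toy_head {
	bool must_insert_reserved;	/* set: the head still owns the rsv */
	uint64_t reserved_bytes;
};

/* Release the head's reservation unconditionally; the caller decides when. */
static void toy_free_head_rsv(struct toy_qgroup *qg, const struct toy_head *head)
{
	assert(qg->rsv_data >= head->reserved_bytes);
	qg->rsv_data -= head->reserved_bytes;
}

/* Cleanup path: only a head that was never run still owns its reservation. */
static void toy_cleanup_head(struct toy_qgroup *qg, struct toy_head *head)
{
	if (head->must_insert_reserved)
		toy_free_head_rsv(qg, head);
	head->must_insert_reserved = false;
}

/*
 * Run path: ownership moves from the head to the runner, so the reservation
 * is freed whether the insert succeeded or not. Returns insert_ret unchanged.
 */
static int toy_run_head(struct toy_qgroup *qg, struct toy_head *head,
			int insert_ret)
{
	bool owned = head->must_insert_reserved;

	head->must_insert_reserved = false;
	if (owned)
		toy_free_head_rsv(qg, head);
	return insert_ret;
}
```

In this model a head that was run with a failing insert and is cleaned
up afterwards is freed exactly once, and a head that was never run is
freed by the cleanup path. Those are the two cases the patch is
concerned with.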