From patchwork Thu Nov 8 05:49:12 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 10673505
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v2 0/6] btrfs: qgroup: Delay subtree scan to reduce overhead
Date: Thu, 8 Nov 2018 13:49:12 +0800
Message-Id: <20181108054919.18253-1-wqu@suse.com>
X-Mailer: git-send-email 2.19.1
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List:
 linux-btrfs@vger.kernel.org

This patchset can be fetched from github:
https://github.com/adam900710/linux/tree/qgroup_delayed_subtree_rebased

Which is based on v4.20-rc1.

This patchset addresses the heavy load of the subtree scan by delaying
it until we're going to modify the swapped tree block.

The overall workflow is:

1) Record the subtree root blocks that get swapped.

   During subtree swap:
   O = Old tree blocks
   N = New tree blocks
         reloc tree                         file tree X
            Root                               Root
           /    \                             /    \
         NA     OB                          OA      OB
       /  |     |  \                      /  |      |  \
     NC  ND     OE  OF                   OC  OD     OE  OF

   In this case, NA and OA are going to be swapped, so record (NA, OA)
   into file tree X.

2) After subtree swap.
         reloc tree                         file tree X
            Root                               Root
           /    \                             /    \
         OA     OB                          NA      OB
       /  |     |  \                      /  |      |  \
     OC  OD     OE  OF                   NC  ND     OE  OF

3a) CoW happens for OB
    If we are going to CoW tree block OB, we check OB's bytenr against
    tree X's swapped_blocks structure.
    It doesn't match any record, so nothing will happen.

3b) CoW happens for NA
    Check NA's bytenr against tree X's swapped_blocks, and get a hit.
    Then we do a subtree scan on both subtree OA and NA.
    This results in 6 tree blocks to be scanned (OA, OC, OD, NA, NC, ND).

    Then no matter what we do to file tree X, qgroup numbers will
    still be correct.
    Then NA's record gets removed from X's swapped_blocks.

4) Transaction commit
   Any records left in X's swapped_blocks get removed. Since there is no
   modification to the swapped subtrees, there is no need to trigger a
   heavy qgroup subtree rescan for them.

[[Benchmark]]
Hardware:
        VM 4G vRAM, 8 vCPUs,
        disk is using 'unsafe' cache mode,
        backing device is SAMSUNG 850 evo SSD.
        Host has 16G ram.

Mkfs parameter:
        --nodesize 4K (To bump up tree size)

Initial subvolume contents:
        4G data copied from /usr and /lib.
        (With enough regular small files)

Snapshots:
        16 snapshots of the original subvolume.
        Each snapshot has 3 random files modified.

balance parameter:
        -m

So the content should be pretty similar to a real world root fs layout.
And after file system population, there is no other activity, so it
should be the best case scenario.

                     | v4.20-rc1            | w/ patchset    | diff
-----------------------------------------------------------------------
relocated extents    | 22615                | 22457          | -0.7%
qgroup dirty extents | 163457               | 121606         | -25.6%
time (sys)           | 22.884s              | 18.842s        | -17.7%
time (real)          | 27.724s              | 22.884s        | -17.5%

changelog:
v2:
  Rebase to v4.20-rc1.
  Instead of committing a transaction after each reloc tree merge, delay
  it until merge_reloc_roots() finishes.
  This provides more natural behavior and reduces unnecessary
  transaction commits.

Qu Wenruo (6):
  btrfs: qgroup: Allow btrfs_qgroup_extent_record::old_roots unpopulated
    at insert time
  btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots()
  btrfs: qgroup: Refactor btrfs_qgroup_trace_subtree_swap()
  btrfs: qgroup: Introduce per-root swapped blocks infrastructure
  btrfs: qgroup: Use delayed subtree rescan for balance
  btrfs: qgroup: Cleanup old subtree swap code

 fs/btrfs/ctree.c       |   8 +
 fs/btrfs/ctree.h       |  14 ++
 fs/btrfs/disk-io.c     |   1 +
 fs/btrfs/qgroup.c      | 376 +++++++++++++++++++++++++++++++----------
 fs/btrfs/qgroup.h      | 107 +++++++++++-
 fs/btrfs/relocation.c  | 140 ++++++++++++---
 fs/btrfs/transaction.c |   1 +
 7 files changed, 527 insertions(+), 120 deletions(-)
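
For readers who want to see the delayed-scan workflow from this cover
letter in executable form, here is a toy model in Python. It is only a
sketch and not the kernel implementation: the class FileTree, its
methods, and the use of block names in place of bytenr values are all
hypothetical, chosen to mirror steps 1), 3a), 3b) and 4) of the
workflow.

```python
# Toy model of the delayed subtree scan workflow. NOT kernel code:
# every name below is hypothetical and block names stand in for bytenr.


class FileTree:
    def __init__(self, children):
        # children: maps a tree block to the blocks directly below it
        self.children = children
        # swapped_blocks: new subtree root -> old subtree root
        self.swapped_blocks = {}
        # blocks that were actually traced for qgroup accounting
        self.scanned = []

    def record_swap(self, new_root, old_root):
        """Step 1: record the swapped pair instead of scanning right away."""
        self.swapped_blocks[new_root] = old_root

    def _subtree(self, root):
        """Return root followed by every block below it (pre-order)."""
        blocks = [root]
        for child in self.children.get(root, []):
            blocks.extend(self._subtree(child))
        return blocks

    def cow(self, block):
        """Step 3: on CoW, scan both subtrees only if block was swapped.

        Returns the number of tree blocks scanned.
        """
        old_root = self.swapped_blocks.pop(block, None)
        if old_root is None:
            return 0  # 3a) no matching record, nothing happens
        # 3b) hit: trace the old and the new subtree, record is removed
        traced = self._subtree(old_root) + self._subtree(block)
        self.scanned.extend(traced)
        return len(traced)

    def commit_transaction(self):
        """Step 4: drop untouched records without scanning them."""
        dropped = len(self.swapped_blocks)
        self.swapped_blocks.clear()
        return dropped


# Layout from the diagrams above, after NA and OA have been swapped.
layout = {"OA": ["OC", "OD"], "NA": ["NC", "ND"], "OB": ["OE", "OF"]}
tree_x = FileTree(layout)
tree_x.record_swap("NA", "OA")
print(tree_x.cow("OB"))  # no record for OB, so nothing is scanned
print(tree_x.cow("NA"))  # hit: OA, OC, OD, NA, NC, ND are scanned
```

In this model a swapped subtree that is never modified before the
commit costs only a dictionary insert and a later clear, which is the
saving the patchset is after.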