From patchwork Mon Nov 25 03:20:46 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 3227911 Return-Path: X-Original-To: patchwork-linux-btrfs@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 91B1EC045B for ; Mon, 25 Nov 2013 03:21:22 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 9AC0220254 for ; Mon, 25 Nov 2013 03:21:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B2B5620253 for ; Mon, 25 Nov 2013 03:21:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753457Ab3KYDVS (ORCPT ); Sun, 24 Nov 2013 22:21:18 -0500 Received: from mail-wg0-f52.google.com ([74.125.82.52]:63190 "EHLO mail-wg0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753251Ab3KYDVQ (ORCPT ); Sun, 24 Nov 2013 22:21:16 -0500 Received: by mail-wg0-f52.google.com with SMTP id x13so3287965wgg.19 for ; Sun, 24 Nov 2013 19:21:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id; bh=12ISeL9TM26aMzXP5FbK5IHV493nlC08ac1/pO3q7FU=; b=JqKoTpqS7bWMQxySEFuMzJDX5D+QpYoPo0DyFxzgGk7jkh/JewmT5fo6BXy/WVGNbi okwFO8hxFhNEE2Ons2W8jz94wIUTuDswyKB8rUBObnGJ+I44zsRxeo5WoY+7J5OwHeCx jwUPofzWaMfrHU4OzRSgreIe+XSEhbYuwGaNVxNWxm+ajSuVZ870MtwiymmwrYHRCPye 7rgdPtwDKQaME704FgKM1zvwFGBe31kw6sFZpBZcaozLwv5XSjwE8U/AkorWuPuBgpHi c37tPZD308Ti0isWIf43svcJBtzdCt5PNw3mOHy5NzVENibA+1gqAn25X9bSWYCZhzSu mMvA== X-Received: by 10.180.198.170 with SMTP id jd10mr11650970wic.65.1385349675351; Sun, 24 Nov 2013 19:21:15 -0800 (PST) Received: from storm-desktop.lan (bl9-170-14.dsl.telepac.pt. [85.242.170.14]) by mx.google.com with ESMTPSA id bs15sm43837491wib.10.2013.11.24.19.21.14 for (version=TLSv1.1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sun, 24 Nov 2013 19:21:14 -0800 (PST) From: Filipe David Borba Manana To: linux-btrfs@vger.kernel.org Cc: Filipe David Borba Manana Subject: [PATCH] Btrfs: try harder to avoid btree node splits Date: Mon, 25 Nov 2013 03:20:46 +0000 Message-Id: <1385349646-6210-1-git-send-email-fdmanana@gmail.com> X-Mailer: git-send-email 1.7.9.5 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Spam-Status: No, score=-6.8 required=5.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, T_DKIM_INVALID, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When attempting to move items from our target leaf to its neighbor leaves (right and left), we only need to free data_size - free_space bytes from our leaf in order to add the new item (which has size of data_size bytes). Therefore attempt to move items to the right and left leaves if they have at least data_size - free_space bytes free, instead of data_size bytes free. After 5 runs of the following test, I got a smaller number of btree node splits overall: sysbench --test=fileio --file-num=512 --file-total-size=5G \ --file-test-mode=seqwr --num-threads=512 \ --file-block-size=8192 --max-requests=100000 --file-io-mode=sync Before this change: * 6171 splits (average of 5 test runs) * 61.508Mb/sec of throughput (average of 5 test runs) After this change: * 6036 splits (average of 5 test runs) * 63.533Mb/sec of throughput (average of 5 test runs) An ideal test would not just have multiple threads/processes writing to a file (insertion of file extent items) but also do other operations that result in insertion of items with varied sizes, like file/directory creations, creation of links, symlinks, xattrs, etc. Signed-off-by: Filipe David Borba Manana --- fs/btrfs/ctree.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 2262980..11f9a18 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -3929,14 +3929,17 @@ static noinline int push_for_double_split(struct btrfs_trans_handle *trans, int progress = 0; int slot; u32 nritems; + int space_needed = data_size; slot = path->slots[0]; + if (slot < btrfs_header_nritems(path->nodes[0])) + space_needed -= btrfs_leaf_free_space(root, path->nodes[0]); /* * try to push all the items after our slot into the * right leaf */ - ret = push_leaf_right(trans, root, path, 1, data_size, 0, slot); + ret = push_leaf_right(trans, root, path, 1, space_needed, 0, slot); if (ret < 0) return ret; @@ -3956,7 +3959,7 @@ static noinline int push_for_double_split(struct btrfs_trans_handle *trans, /* try to push all the items before our slot into the next leaf */ slot = path->slots[0]; - ret = push_leaf_left(trans, root, path, 1, data_size, 0, slot); + ret = push_leaf_left(trans, root, path, 1, space_needed, 0, slot); if (ret < 0) return ret; @@ -4000,13 +4003,18 @@ static noinline int split_leaf(struct btrfs_trans_handle *trans, /* first try to make some room by pushing left and right */ if (data_size && path->nodes[1]) { - wret = push_leaf_right(trans, root, path, data_size, - data_size, 0, 0); + int space_needed = data_size; + + if (slot < btrfs_header_nritems(l)) + space_needed -= btrfs_leaf_free_space(root, l); + + wret = push_leaf_right(trans, root, path, space_needed, + space_needed, 0, 0); if (wret < 0) return wret; if (wret) { - wret = push_leaf_left(trans, root, path, data_size, - data_size, 0, (u32)-1); + wret = push_leaf_left(trans, root, path, space_needed, + space_needed, 0, (u32)-1); if (wret < 0) return wret; }