From patchwork Thu Dec 29 19:34:45 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Biggers
X-Patchwork-Id: 9491365
From: Eric Biggers
To: linux-fsdevel@vger.kernel.org
Cc: Alexander Viro, Christoph Lameter, linux-kernel@vger.kernel.org, Eric Biggers
Subject: [PATCH] fs/buffer.c: make bh_lru_install() more efficient
Date: Thu, 29 Dec 2016 13:34:45 -0600
Message-Id: <20161229193445.1913-1-ebiggers3@gmail.com>
X-Mailer: git-send-email 2.11.0
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Eric Biggers

To install a buffer_head into the cpu's LRU queue, bh_lru_install() would
construct a new copy of the queue and then memcpy it over the real queue.
But it's easily possible to do the update in-place, which is faster and
simpler.
Some work can also be skipped if the buffer_head was already in the queue.

As a microbenchmark I timed how long it takes to run sb_getblk()
10,000,000 times alternating between BH_LRU_SIZE + 1 blocks.
Effectively, this benchmarks looking up buffer_heads that are in the page
cache but not in the LRU:

    Before this patch: 1.758s
    After this patch: 1.653s

This patch also removes about 350 bytes of compiled code (on x86_64),
partly due to removal of the memcpy() which was being inlined+unrolled.

Signed-off-by: Eric Biggers
---
 fs/buffer.c | 43 +++++++++++++++----------------------------
 1 file changed, 15 insertions(+), 28 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index d21771fcf7d3..282ca52517bf 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1273,44 +1273,31 @@ static inline void check_irqs_on(void)
 }
 
 /*
- * The LRU management algorithm is dopey-but-simple. Sorry.
+ * Install a buffer_head into this cpu's LRU. If not already in the LRU, it is
+ * inserted at the front, and the buffer_head at the back if any is evicted.
+ * Or, if already in the LRU it is moved to the front.
  */
 static void bh_lru_install(struct buffer_head *bh)
 {
-	struct buffer_head *evictee = NULL;
+	struct buffer_head *evictee = bh;
+	struct bh_lru *b;
+	int i;
 
 	check_irqs_on();
 	bh_lru_lock();
-	if (__this_cpu_read(bh_lrus.bhs[0]) != bh) {
-		struct buffer_head *bhs[BH_LRU_SIZE];
-		int in;
-		int out = 0;
-
-		get_bh(bh);
-		bhs[out++] = bh;
-		for (in = 0; in < BH_LRU_SIZE; in++) {
-			struct buffer_head *bh2 =
-				__this_cpu_read(bh_lrus.bhs[in]);
 
-			if (bh2 == bh) {
-				__brelse(bh2);
-			} else {
-				if (out >= BH_LRU_SIZE) {
-					BUG_ON(evictee != NULL);
-					evictee = bh2;
-				} else {
-					bhs[out++] = bh2;
-				}
-			}
+	b = this_cpu_ptr(&bh_lrus);
+	for (i = 0; i < BH_LRU_SIZE; i++) {
+		swap(evictee, b->bhs[i]);
+		if (evictee == bh) {
+			bh_lru_unlock();
+			return;
 		}
-		while (out < BH_LRU_SIZE)
-			bhs[out++] = NULL;
-		memcpy(this_cpu_ptr(&bh_lrus.bhs), bhs, sizeof(bhs));
 	}
-	bh_lru_unlock();
 
-	if (evictee)
-		__brelse(evictee);
+	get_bh(bh);
+	bh_lru_unlock();
+	brelse(evictee);
 }
 
 /*
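
For readers who want to try the in-place update outside the kernel, the swap
loop in the new bh_lru_install() can be sketched in plain user-space C. This
is only an illustration under stated assumptions: struct item, struct lru,
LRU_SIZE and lru_install() are hypothetical names that do not exist in
fs/buffer.c, a plain counter stands in for the buffer_head reference count,
and the per-cpu access and bh_lru_lock()/bh_lru_unlock() are left out.

```c
#include <stddef.h>

/* Hypothetical queue length for the sketch; not the kernel's BH_LRU_SIZE. */
#define LRU_SIZE 3

/* Hypothetical stand-in for struct buffer_head. */
struct item {
	int id;
	int refcount;	/* references held by the LRU in this sketch */
};

/* Hypothetical stand-in for the per-cpu struct bh_lru. */
struct lru {
	struct item *slots[LRU_SIZE];
};

/*
 * Install 'it' at the front of 'l', updating the array in place.
 *
 * 'carry' starts out as the new item and is swapped with each slot in turn,
 * so every entry shifts back by one position.  If the item that falls out of
 * a slot is 'it' itself, it was already in the LRU and has now been moved to
 * the front, so nothing is evicted and no new reference is taken.
 *
 * Returns the entry pushed off the back (NULL if the LRU was not full or if
 * 'it' was already present).  The caller drops the LRU's reference on it.
 */
static struct item *lru_install(struct lru *l, struct item *it)
{
	struct item *carry = it;
	size_t i;

	for (i = 0; i < LRU_SIZE; i++) {
		struct item *tmp = l->slots[i];

		l->slots[i] = carry;
		carry = tmp;
		if (carry == it)
			return NULL;	/* already present: moved to front */
	}

	it->refcount++;		/* the LRU now holds a reference to 'it' */
	return carry;		/* evicted entry, or NULL */
}
```

Installing items a, b, c and then d into an empty three-slot queue leaves it
as d, c, b and hands a back as the evictee. Installing b again gives b, d, c
with no eviction and no extra reference, which is the early-return path in the
patch.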