From patchwork Tue Jul 28 19:09:24 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Cassidy Burden
X-Patchwork-Id: 6887961
X-Patchwork-Delegate: agross@codeaurora.org
From: Cassidy Burden
To: yury.norov@gmail.com, akpm@linux-foundation.org
Cc: linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Cassidy Burden,
	Alexey Klimov, "David S. Miller", Daniel Borkmann,
	Hannes Frederic Sowa, Lai Jiangshan, Mark Salter,
	AKASHI Takahiro, Thomas Graf, Valentin Rothberg, Chris Wilson
Subject: [PATCH] lib: Make _find_next_bit helper function inline
Date: Tue, 28 Jul 2015 12:09:24 -0700
Message-Id: <1438110564-19932-1-git-send-email-cburden@codeaurora.org>
X-Mailer: git-send-email 1.9.1
Sender: linux-arm-msm-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-arm-msm@vger.kernel.org

I've tested Yury Norov's find_bit reimplementation with the
test_find_bit module (https://lkml.org/lkml/2015/3/8/141) and measured
about 35-40% performance degradation on an arm64 3.18 kernel run at a
fixed CPU frequency.

The performance degradation appears to be caused by the helper function
_find_next_bit. After inlining this function into find_next_bit and
find_next_zero_bit, I get slightly better performance than with the old
implementation:

	find_next_zero_bit          find_next_bit
	old   new   inline          old   new   inline
	26    36    24              24    33    23
	25    36    24              24    33    23
	26    36    24              24    33    23
	25    36    24              24    33    23
	25    36    24              24    33    23
	25    37    24              24    33    23
	25    37    24              24    33    23
	25    37    24              24    33    23
	25    36    24              24    33    23
	25    37    24              24    33    23

Signed-off-by: Cassidy Burden
Cc: Alexey Klimov
Cc: David S. Miller
Cc: Daniel Borkmann
Cc: Hannes Frederic Sowa
Cc: Lai Jiangshan
Cc: Mark Salter
Cc: AKASHI Takahiro
Cc: Thomas Graf
Cc: Valentin Rothberg
Cc: Chris Wilson
---
 lib/find_bit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/find_bit.c b/lib/find_bit.c
index 18072ea..d0e04f9 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -28,7 +28,7 @@
  * find_next_zero_bit.  The difference is the "invert" argument, which
  * is XORed with each fetched word before searching it for one bits.
  */
-static unsigned long _find_next_bit(const unsigned long *addr,
+static inline unsigned long _find_next_bit(const unsigned long *addr,
 		unsigned long nbits, unsigned long start, unsigned long invert)
 {
 	unsigned long tmp;
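
For readers without lib/find_bit.c at hand, the "invert" trick described in
the comment above can be sketched in userspace C roughly as follows. This is
an illustrative sketch only, not the kernel source: the sketch_* names and
SKETCH_BITS_PER_LONG are hypothetical, and __builtin_ctzl assumes GCC or
Clang. It shows one shared helper serving both searches, with the wrappers
passing 0UL to look for set bits and ~0UL to look for clear bits.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Shared helper (illustrative): each fetched word is XORed with
 * "invert", so the same loop finds set bits (invert == 0) or
 * clear bits (invert == ~0UL). Returns nbits if nothing is found.
 */
static inline unsigned long sketch_find_next(const unsigned long *addr,
		unsigned long nbits, unsigned long start, unsigned long invert)
{
	unsigned long tmp;

	if (!nbits || start >= nbits)
		return nbits;

	tmp = addr[start / SKETCH_BITS_PER_LONG] ^ invert;

	/* First word: ignore bits below the starting offset. */
	tmp &= ~0UL << (start % SKETCH_BITS_PER_LONG);
	start -= start % SKETCH_BITS_PER_LONG;

	while (!tmp) {
		start += SKETCH_BITS_PER_LONG;
		if (start >= nbits)
			return nbits;
		tmp = addr[start / SKETCH_BITS_PER_LONG] ^ invert;
	}

	/* Lowest set bit of tmp; clamp in case it lies past nbits. */
	start += (unsigned long)__builtin_ctzl(tmp);
	return start < nbits ? start : nbits;
}

/* Wrapper: search for the next set bit. */
unsigned long sketch_find_next_bit(const unsigned long *addr,
		unsigned long size, unsigned long offset)
{
	return sketch_find_next(addr, size, offset, 0UL);
}

/* Wrapper: search for the next clear bit. */
unsigned long sketch_find_next_zero_bit(const unsigned long *addr,
		unsigned long size, unsigned long offset)
{
	return sketch_find_next(addr, size, offset, ~0UL);
}
```

When the helper is inlined into each wrapper, "invert" becomes a compile-time
constant there, so the compiler has the opportunity to specialize the loop
for each case (for example, dropping the XOR with zero). That is presumably
part of why inlining recovers the lost performance, though the measurements
above are the only evidence offered here.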