From patchwork Mon Nov 2 05:26:47 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wonhyuk Yang
X-Patchwork-Id: 11872975
From: Wonhuyk Yang
To: Dennis Zhou
Cc: Tejun Heo, Christoph Lameter, linux-mm@kvack.org, Wonhyuk Yang
Subject: [PATCH] percpu: reduce the number of searches calculating best upa
Date: Mon, 2 Nov 2020 14:26:47 +0900
Message-Id: <20201102052647.8211-1-vvghjk1234@gmail.com>
X-Mailer: git-send-email 2.17.1
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Wonhyuk Yang

The best upa (units per allocation) is determined by iterating over
every value between 1 and max_upa. If alloc_size is a power of 2, the
number of iterations drops to a logarithmic level.

In general, the possible upas can be derived from the prime
factorization of alloc_size. When alloc_size is a power of 2, we can
avoid the cost of that factorization: the possible upas are simply
1, 2, 4, ..., max_upa.

Signed-off-by: Wonhyuk Yang
---
 mm/percpu.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index 66a93f096394..a24f3973744f 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -2689,18 +2689,17 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
 
 	/*
 	 * Determine min_unit_size, alloc_size and max_upa such that
-	 * alloc_size is multiple of atom_size and is the smallest
-	 * which can accommodate 4k aligned segments which are equal to
-	 * or larger than min_unit_size.
+	 * alloc_size is the maximum of min_unit_size and atom_size.
+	 * Also, alloc_size is a power of 2 because both min_unit_size
+	 * and atom_size are powers of 2.
 	 */
 	min_unit_size = max_t(size_t, size_sum, PCPU_MIN_UNIT_SIZE);
+	min_unit_size = roundup_pow_of_two(min_unit_size);
 
 	/* determine the maximum # of units that can fit in an allocation */
-	alloc_size = roundup(min_unit_size, atom_size);
-	upa = alloc_size / min_unit_size;
-	while (alloc_size % upa || (offset_in_page(alloc_size / upa)))
-		upa--;
-	max_upa = upa;
+	alloc_size = max_t(size_t, min_unit_size, atom_size);
+	max_upa = alloc_size / min_unit_size;
+
 
 	/* group cpus according to their proximity */
 	for_each_possible_cpu(cpu) {
@@ -2727,12 +2726,9 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
 	 * Related to atom_size, which could be much larger than the unit_size.
 	 */
 	last_allocs = INT_MAX;
-	for (upa = max_upa; upa; upa--) {
+	for (upa = max_upa; upa; upa >>= 1) {
 		int allocs = 0, wasted = 0;
 
-		if (alloc_size % upa || (offset_in_page(alloc_size / upa)))
-			continue;
-
 		for (group = 0; group < nr_groups; group++) {
 			int this_allocs = DIV_ROUND_UP(group_cnt[group], upa);
 			allocs += this_allocs;