From patchwork Wed May 20 18:26:39 2020
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 11561009
From: Daniel Jordan
To: Andrew Morton, Herbert Xu, Steffen Klassert
Cc: Alex Williamson, Alexander Duyck, Dan Williams, Dave Hansen, David Hildenbrand, Jason Gunthorpe, Jonathan Corbet, Josh Triplett, Kirill Tkhai, Michal Hocko, Pavel Machek, Pavel Tatashin, Peter Zijlstra, Randy Dunlap, Robert Elliott, Shile Zhang, Steven Sistare, Tejun Heo, Zi Yan, linux-crypto@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, Daniel Jordan
Subject: [PATCH v2 1/7] padata: remove exit routine
Date: Wed, 20 May 2020 14:26:39 -0400
Message-Id: <20200520182645.1658949-2-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20200520182645.1658949-1-daniel.m.jordan@oracle.com>
References: <20200520182645.1658949-1-daniel.m.jordan@oracle.com>
padata_driver_exit() is unnecessary because padata isn't built as a module and doesn't exit. padata's init routine will soon allocate memory, so getting rid of the exit function now avoids pointless code to free it. Signed-off-by: Daniel Jordan --- kernel/padata.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/kernel/padata.c b/kernel/padata.c index a6afa12fb75ee..835919c745266 100644 --- a/kernel/padata.c +++ b/kernel/padata.c @@ -1072,10 +1072,4 @@ static __init int padata_driver_init(void) } module_init(padata_driver_init); -static __exit void padata_driver_exit(void) -{ - cpuhp_remove_multi_state(CPUHP_PADATA_DEAD); - cpuhp_remove_multi_state(hp_online); -} -module_exit(padata_driver_exit); #endif
From patchwork Wed May 20 18:26:40 2020
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 11560989
From: Daniel Jordan
Subject: [PATCH v2 2/7] padata: initialize earlier
Date: Wed, 20 May 2020 14:26:40 -0400
Message-Id: <20200520182645.1658949-3-daniel.m.jordan@oracle.com>
padata will soon initialize the system's struct pages in parallel, so it needs to be ready by page_alloc_init_late(). The error return from padata_driver_init() triggers an initcall warning, so add a warning to padata_init() to avoid silent failure.
Signed-off-by: Daniel Jordan --- include/linux/padata.h | 6 ++++++ init/main.c | 2 ++ kernel/padata.c | 17 ++++++++--------- 3 files changed, 16 insertions(+), 9 deletions(-) diff --git a/include/linux/padata.h b/include/linux/padata.h index a0d8b41850b25..476ecfa41f363 100644 --- a/include/linux/padata.h +++ b/include/linux/padata.h @@ -164,6 +164,12 @@ struct padata_instance { #define PADATA_INVALID 4 }; +#ifdef CONFIG_PADATA +extern void __init padata_init(void); +#else +static inline void __init padata_init(void) {} +#endif + extern struct padata_instance *padata_alloc_possible(const char *name); extern void padata_free(struct padata_instance *pinst); extern struct padata_shell *padata_alloc_shell(struct padata_instance *pinst); diff --git a/init/main.c b/init/main.c index 03371976d3872..8ab521f7af5d2 100644 --- a/init/main.c +++ b/init/main.c @@ -94,6 +94,7 @@ #include #include #include +#include #include #include @@ -1482,6 +1483,7 @@ static noinline void __init kernel_init_freeable(void) smp_init(); sched_init_smp(); + padata_init(); page_alloc_init_late(); /* Initialize page ext after all struct pages are initialized. 
*/ page_ext_init(); diff --git a/kernel/padata.c b/kernel/padata.c index 835919c745266..6f709bc0fc413 100644 --- a/kernel/padata.c +++ b/kernel/padata.c @@ -31,7 +31,6 @@ #include #include #include -#include #define MAX_OBJ_NUM 1000 @@ -1050,26 +1049,26 @@ void padata_free_shell(struct padata_shell *ps) } EXPORT_SYMBOL(padata_free_shell); -#ifdef CONFIG_HOTPLUG_CPU - -static __init int padata_driver_init(void) +void __init padata_init(void) { +#ifdef CONFIG_HOTPLUG_CPU int ret; ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "padata:online", padata_cpu_online, NULL); if (ret < 0) - return ret; + goto err; hp_online = ret; ret = cpuhp_setup_state_multi(CPUHP_PADATA_DEAD, "padata:dead", NULL, padata_cpu_dead); if (ret < 0) { cpuhp_remove_multi_state(hp_online); - return ret; + goto err; } - return 0; -} -module_init(padata_driver_init); + return; +err: + pr_warn("padata: initialization failed\n"); #endif +}
From patchwork Wed May 20 18:26:41 2020
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 11560991
From: Daniel Jordan
Subject: [PATCH v2 3/7] padata: allocate work structures for parallel jobs from a pool
Date: Wed, 20 May 2020 14:26:41 -0400
Message-Id: <20200520182645.1658949-4-daniel.m.jordan@oracle.com>
padata allocates per-CPU, per-instance work structs for parallel jobs. A do_parallel call assigns a job to a sequence number and hashes the number to a CPU, where the job will eventually run using the corresponding work. This approach fit with how padata used to bind a job to each CPU round-robin, makes less sense after commit bfde23ce200e6 ("padata: unbind parallel jobs from specific CPUs") because a work isn't bound to a particular CPU anymore, and isn't needed at all for multithreaded jobs because they don't have sequence numbers. Replace the per-CPU works with a preallocated pool, which allows sharing them between existing padata users and the upcoming multithreaded user. The pool will also facilitate setting NUMA-aware concurrency limits with later users. The pool is sized according to the number of possible CPUs. With this limit, MAX_OBJ_NUM no longer makes sense, so remove it.
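For readers who want the pool idea in isolation, here is a minimal user-space sketch, assuming a fixed POOL_SIZE in place of the number of possible CPUs. The names work_item, pool_init, pool_alloc and pool_free are hypothetical and are not the kernel's identifiers. The patch itself uses struct padata_work on a list_head free list guarded by padata_works_lock, and that locking is left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for num_possible_cpus(); not a kernel value. */
#define POOL_SIZE 4

/* Illustrative analogue of a pooled work structure (assumed layout). */
struct work_item {
	struct work_item *next;	/* free-list linkage */
	void (*fn)(void *);	/* function to run for the job */
	void *data;		/* argument passed to fn */
};

struct work_item pool[POOL_SIZE];	/* preallocated once, never grown */
struct work_item *free_list;		/* head of the list of unused items */

/* Put every preallocated item on the free list. */
void pool_init(void)
{
	size_t i;

	free_list = NULL;
	for (i = 0; i < POOL_SIZE; i++) {
		pool[i].next = free_list;
		free_list = &pool[i];
	}
}

/*
 * Take one item from the free list, or return NULL when the pool is
 * exhausted. A real implementation would hold a lock around this.
 */
struct work_item *pool_alloc(void)
{
	struct work_item *w = free_list;

	if (w)
		free_list = w->next;
	return w;
}

/* Return an item to the free list; it must come from this pool. */
void pool_free(struct work_item *w)
{
	assert(w >= pool && w < pool + POOL_SIZE);
	w->next = free_list;
	free_list = w;
}
```

A caller would invoke pool_init() once, then treat a NULL return from pool_alloc() as the signal that no more jobs may be queued.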
If the global pool is exhausted, a parallel job is run in the current task instead to throttle a system trying to do too much in parallel. Signed-off-by: Daniel Jordan --- include/linux/padata.h | 8 +-- kernel/padata.c | 118 +++++++++++++++++++++++++++-------------- 2 files changed, 78 insertions(+), 48 deletions(-) diff --git a/include/linux/padata.h b/include/linux/padata.h index 476ecfa41f363..3bfa503503ac5 100644 --- a/include/linux/padata.h +++ b/include/linux/padata.h @@ -24,7 +24,6 @@ * @list: List entry, to attach to the padata lists. * @pd: Pointer to the internal control structure. * @cb_cpu: Callback cpu for serializatioon. - * @cpu: Cpu for parallelization. * @seq_nr: Sequence number of the parallelized data object. * @info: Used to pass information from the parallel to the serial function. * @parallel: Parallel execution function. @@ -34,7 +33,6 @@ struct padata_priv { struct list_head list; struct parallel_data *pd; int cb_cpu; - int cpu; unsigned int seq_nr; int info; void (*parallel)(struct padata_priv *padata); @@ -68,15 +66,11 @@ struct padata_serial_queue { /** * struct padata_parallel_queue - The percpu padata parallel queue * - * @parallel: List to wait for parallelization. * @reorder: List to wait for reordering after parallel processing. - * @work: work struct for parallelization. * @num_obj: Number of objects that are processed by this cpu. 
*/ struct padata_parallel_queue { - struct padata_list parallel; struct padata_list reorder; - struct work_struct work; atomic_t num_obj; }; @@ -111,7 +105,7 @@ struct parallel_data { struct padata_parallel_queue __percpu *pqueue; struct padata_serial_queue __percpu *squeue; atomic_t refcnt; - atomic_t seq_nr; + unsigned int seq_nr; unsigned int processed; int cpu; struct padata_cpumask cpumask; diff --git a/kernel/padata.c b/kernel/padata.c index 6f709bc0fc413..78ff9aa529204 100644 --- a/kernel/padata.c +++ b/kernel/padata.c @@ -32,7 +32,15 @@ #include #include -#define MAX_OBJ_NUM 1000 +struct padata_work { + struct work_struct pw_work; + struct list_head pw_list; /* padata_free_works linkage */ + void *pw_data; +}; + +static DEFINE_SPINLOCK(padata_works_lock); +static struct padata_work *padata_works; +static LIST_HEAD(padata_free_works); static void padata_free_pd(struct parallel_data *pd); @@ -58,30 +66,44 @@ static int padata_cpu_hash(struct parallel_data *pd, unsigned int seq_nr) return padata_index_to_cpu(pd, cpu_index); } -static void padata_parallel_worker(struct work_struct *parallel_work) +static struct padata_work *padata_work_alloc(void) { - struct padata_parallel_queue *pqueue; - LIST_HEAD(local_list); + struct padata_work *pw; - local_bh_disable(); - pqueue = container_of(parallel_work, - struct padata_parallel_queue, work); + lockdep_assert_held(&padata_works_lock); - spin_lock(&pqueue->parallel.lock); - list_replace_init(&pqueue->parallel.list, &local_list); - spin_unlock(&pqueue->parallel.lock); + if (list_empty(&padata_free_works)) + return NULL; /* No more work items allowed to be queued. 
*/ - while (!list_empty(&local_list)) { - struct padata_priv *padata; + pw = list_first_entry(&padata_free_works, struct padata_work, pw_list); + list_del(&pw->pw_list); + return pw; +} - padata = list_entry(local_list.next, - struct padata_priv, list); +static void padata_work_init(struct padata_work *pw, work_func_t work_fn, + void *data) +{ + INIT_WORK(&pw->pw_work, work_fn); + pw->pw_data = data; +} - list_del_init(&padata->list); +static void padata_work_free(struct padata_work *pw) +{ + lockdep_assert_held(&padata_works_lock); + list_add(&pw->pw_list, &padata_free_works); +} - padata->parallel(padata); - } +static void padata_parallel_worker(struct work_struct *parallel_work) +{ + struct padata_work *pw = container_of(parallel_work, struct padata_work, + pw_work); + struct padata_priv *padata = pw->pw_data; + local_bh_disable(); + padata->parallel(padata); + spin_lock(&padata_works_lock); + padata_work_free(pw); + spin_unlock(&padata_works_lock); local_bh_enable(); } @@ -105,9 +127,9 @@ int padata_do_parallel(struct padata_shell *ps, struct padata_priv *padata, int *cb_cpu) { struct padata_instance *pinst = ps->pinst; - int i, cpu, cpu_index, target_cpu, err; - struct padata_parallel_queue *queue; + int i, cpu, cpu_index, err; struct parallel_data *pd; + struct padata_work *pw; rcu_read_lock_bh(); @@ -135,25 +157,25 @@ int padata_do_parallel(struct padata_shell *ps, if ((pinst->flags & PADATA_RESET)) goto out; - if (atomic_read(&pd->refcnt) >= MAX_OBJ_NUM) - goto out; - - err = 0; atomic_inc(&pd->refcnt); padata->pd = pd; padata->cb_cpu = *cb_cpu; - padata->seq_nr = atomic_inc_return(&pd->seq_nr); - target_cpu = padata_cpu_hash(pd, padata->seq_nr); - padata->cpu = target_cpu; - queue = per_cpu_ptr(pd->pqueue, target_cpu); - - spin_lock(&queue->parallel.lock); - list_add_tail(&padata->list, &queue->parallel.list); - spin_unlock(&queue->parallel.lock); + rcu_read_unlock_bh(); - queue_work(pinst->parallel_wq, &queue->work); + spin_lock(&padata_works_lock); + 
padata->seq_nr = ++pd->seq_nr; + pw = padata_work_alloc(); + spin_unlock(&padata_works_lock); + if (pw) { + padata_work_init(pw, padata_parallel_worker, padata); + queue_work(pinst->parallel_wq, &pw->pw_work); + } else { + /* Maximum works limit exceeded, run in the current task. */ + padata->parallel(padata); + } + return 0; out: rcu_read_unlock_bh(); @@ -324,8 +346,9 @@ static void padata_serial_worker(struct work_struct *serial_work) void padata_do_serial(struct padata_priv *padata) { struct parallel_data *pd = padata->pd; + int hashed_cpu = padata_cpu_hash(pd, padata->seq_nr); struct padata_parallel_queue *pqueue = per_cpu_ptr(pd->pqueue, - padata->cpu); + hashed_cpu); struct padata_priv *cur; spin_lock(&pqueue->reorder.lock); @@ -416,8 +439,6 @@ static void padata_init_pqueues(struct parallel_data *pd) pqueue = per_cpu_ptr(pd->pqueue, cpu); __padata_list_init(&pqueue->reorder); - __padata_list_init(&pqueue->parallel); - INIT_WORK(&pqueue->work, padata_parallel_worker); atomic_set(&pqueue->num_obj, 0); } } @@ -451,7 +472,7 @@ static struct parallel_data *padata_alloc_pd(struct padata_shell *ps) padata_init_pqueues(pd); padata_init_squeues(pd); - atomic_set(&pd->seq_nr, -1); + pd->seq_nr = -1; atomic_set(&pd->refcnt, 1); spin_lock_init(&pd->lock); pd->cpu = cpumask_first(pd->cpumask.pcpu); @@ -1051,6 +1072,7 @@ EXPORT_SYMBOL(padata_free_shell); void __init padata_init(void) { + unsigned int i, possible_cpus; #ifdef CONFIG_HOTPLUG_CPU int ret; @@ -1062,13 +1084,27 @@ void __init padata_init(void) ret = cpuhp_setup_state_multi(CPUHP_PADATA_DEAD, "padata:dead", NULL, padata_cpu_dead); - if (ret < 0) { - cpuhp_remove_multi_state(hp_online); - goto err; - } + if (ret < 0) + goto remove_online_state; +#endif + + possible_cpus = num_possible_cpus(); + padata_works = kmalloc_array(possible_cpus, sizeof(struct padata_work), + GFP_KERNEL); + if (!padata_works) + goto remove_dead_state; + + for (i = 0; i < possible_cpus; ++i) + list_add(&padata_works[i].pw_list, 
&padata_free_works); return; + +remove_dead_state: +#ifdef CONFIG_HOTPLUG_CPU + cpuhp_remove_multi_state(CPUHP_PADATA_DEAD); +remove_online_state: + cpuhp_remove_multi_state(hp_online); err: - pr_warn("padata: initialization failed\n"); #endif + pr_warn("padata: initialization failed\n"); } From patchwork Wed May 20 18:26:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Jordan X-Patchwork-Id: 11560987 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 164B7913 for ; Wed, 20 May 2020 18:27:47 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CA475206C3 for ; Wed, 20 May 2020 18:27:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="BZq3cMuD" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CA475206C3 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 0838B80008; Wed, 20 May 2020 14:27:44 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id E0FF280009; Wed, 20 May 2020 14:27:43 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C37A9900004; Wed, 20 May 2020 14:27:43 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0025.hostedemail.com [216.40.44.25]) by kanga.kvack.org (Postfix) with ESMTP id 9D774900003 for ; Wed, 20 May 2020 14:27:43 -0400 (EDT) Received: from 
smtpin26.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 509D08248047 for ; Wed, 20 May 2020 18:27:43 +0000 (UTC) X-FDA: 76837930806.26.mom65_263c4935cf124 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,daniel.m.jordan@oracle.com,,RULES_HIT:30005:30034:30054:30064:30067:30074:30090,0,RBL:141.146.126.78:@oracle.com:.lbl8.mailshell.net-64.10.201.10 62.18.0.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:38,LUA_SUMMARY:none X-HE-Tag: mom65_263c4935cf124 X-Filterd-Recvd-Size: 14062 Received: from aserp2120.oracle.com (aserp2120.oracle.com [141.146.126.78]) by imf16.hostedemail.com (Postfix) with ESMTP for ; Wed, 20 May 2020 18:27:42 +0000 (UTC) Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 04KIHmds141417; Wed, 20 May 2020 18:27:05 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=corp-2020-01-29; bh=eDTtvqqpt3W8JqT78XVCS3W1n8xa+uH0ByQ+TS7+z7U=; b=BZq3cMuDxAcfco76tedwyIJbm4sMbg63cYiHOa2CyxQiFn7lvOn50X7yozDKtjQ6N94e 23VaZzXp6Bz4q5qR8u4yA0NKXi3vRFgVCYsW2/16PJF3poq4Ypgm6s31sdiSzaBxBkvh xWMK88N7yyYiXfG2r3p2hXHnSWFZeDFDNOphkl7QEU7nHCWYNHCM5z9xG4IYnxISmME5 L6bOyQvd95dllFaZD1z+DXeekTgypPgfGZlIGTeLRV3NbTMyHFaTlhmZ9bytXR2AKDM/ KciXkoPo1pLixktYnmD9Wfre99+SfHuwVfGt8jPUS184WR0/Ol6ZoUc4Li0BWubUtUfB lw== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by aserp2120.oracle.com with ESMTP id 31284m4q1d-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Wed, 20 May 2020 18:27:05 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 04KIDCcI076125; Wed, 20 May 2020 18:27:04 GMT Received: from aserv0122.oracle.com 
(aserv0122.oracle.com [141.146.126.236]) by userp3020.oracle.com with ESMTP id 315020rd8y-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 May 2020 18:27:04 +0000 Received: from abhmp0017.oracle.com (abhmp0017.oracle.com [141.146.116.23]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id 04KIR1ak004560; Wed, 20 May 2020 18:27:01 GMT Received: from localhost.localdomain (/98.229.125.203) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 May 2020 11:27:01 -0700 From: Daniel Jordan To: Andrew Morton , Herbert Xu , Steffen Klassert Cc: Alex Williamson , Alexander Duyck , Dan Williams , Dave Hansen , David Hildenbrand , Jason Gunthorpe , Jonathan Corbet , Josh Triplett , Kirill Tkhai , Michal Hocko , Pavel Machek , Pavel Tatashin , Peter Zijlstra , Randy Dunlap , Robert Elliott , Shile Zhang , Steven Sistare , Tejun Heo , Zi Yan , linux-crypto@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, Daniel Jordan Subject: [PATCH v2 4/7] padata: add basic support for multithreaded jobs Date: Wed, 20 May 2020 14:26:42 -0400 Message-Id: <20200520182645.1658949-5-daniel.m.jordan@oracle.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200520182645.1658949-1-daniel.m.jordan@oracle.com> References: <20200520182645.1658949-1-daniel.m.jordan@oracle.com> MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9627 signatures=668686 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 bulkscore=0 suspectscore=2 phishscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2005200148 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9627 signatures=668686 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 mlxscore=0 cotscore=-2147483648 impostorscore=0 malwarescore=0 
 mlxlogscore=999 lowpriorityscore=0 phishscore=0 spamscore=0 bulkscore=0 adultscore=0 priorityscore=1501 clxscore=1015 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2005200148
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

Sometimes the kernel doesn't take full advantage of system memory
bandwidth, leading to a single CPU spending excessive time in
initialization paths where the data scales with memory size.

Multithreading naturally addresses this problem.

Extend padata, a framework that handles many parallel yet singlethreaded
jobs, to also handle multithreaded jobs by adding support for splitting up
the work evenly, specifying a minimum amount of work that's appropriate
for one helper thread to do, load balancing between helpers, and
coordinating them.

This is inspired by work from Pavel Tatashin and Steve Sistare.

Signed-off-by: Daniel Jordan
---
 include/linux/padata.h |  29 ++++++++
 kernel/padata.c        | 152 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 178 insertions(+), 3 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index 3bfa503503ac5..b0affa466a841 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -4,6 +4,9 @@
  *
  * Copyright (C) 2008, 2009 secunet Security Networks AG
  * Copyright (C) 2008, 2009 Steffen Klassert
+ *
+ * Copyright (c) 2020 Oracle and/or its affiliates.
+ * Author: Daniel Jordan
  */
 
 #ifndef PADATA_H
@@ -130,6 +133,31 @@ struct padata_shell {
 	struct list_head		list;
 };
 
+/**
+ * struct padata_mt_job - represents one multithreaded job
+ *
+ * @thread_fn: Called for each chunk of work that a padata thread does.
+ * @fn_arg: The thread function argument.
+ * @start: The start of the job (units are job-specific).
+ * @size: size of this node's work (units are job-specific).
+ * @align: Ranges passed to the thread function fall on this boundary, with the
+ *         possible exceptions of the beginning and end of the job.
+ * @min_chunk: The minimum chunk size in job-specific units. This allows
+ *             the client to communicate the minimum amount of work that's
+ *             appropriate for one worker thread to do at once.
+ * @max_threads: Max threads to use for the job, actual number may be less
+ *               depending on task size and minimum chunk size.
+ */
+struct padata_mt_job {
+	void (*thread_fn)(unsigned long start, unsigned long end, void *arg);
+	void			*fn_arg;
+	unsigned long		start;
+	unsigned long		size;
+	unsigned long		align;
+	unsigned long		min_chunk;
+	int			max_threads;
+};
+
 /**
  * struct padata_instance - The overall control structure.
  *
@@ -171,6 +199,7 @@ extern void padata_free_shell(struct padata_shell *ps);
 extern int padata_do_parallel(struct padata_shell *ps,
 			      struct padata_priv *padata, int *cb_cpu);
 extern void padata_do_serial(struct padata_priv *padata);
+extern void __init padata_do_multithreaded(struct padata_mt_job *job);
 extern int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
 			      cpumask_var_t cpumask);
 extern int padata_start(struct padata_instance *pinst);
diff --git a/kernel/padata.c b/kernel/padata.c
index 78ff9aa529204..e78f57d9aef90 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -7,6 +7,9 @@
  * Copyright (C) 2008, 2009 secunet Security Networks AG
  * Copyright (C) 2008, 2009 Steffen Klassert
  *
+ * Copyright (c) 2020 Oracle and/or its affiliates.
+ * Author: Daniel Jordan
+ *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
  * version 2, as published by the Free Software Foundation.
@@ -21,6 +24,7 @@
  * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
+#include
 #include
 #include
 #include
@@ -32,6 +36,8 @@
 #include
 #include
 
+#define PADATA_WORK_ONSTACK	1	/* Work's memory is on stack */
+
 struct padata_work {
 	struct work_struct	pw_work;
 	struct list_head	pw_list;  /* padata_free_works linkage */
@@ -42,7 +48,17 @@ static DEFINE_SPINLOCK(padata_works_lock);
 static struct padata_work *padata_works;
 static LIST_HEAD(padata_free_works);
 
+struct padata_mt_job_state {
+	spinlock_t		lock;
+	struct completion	completion;
+	struct padata_mt_job	*job;
+	int			nworks;
+	int			nworks_fini;
+	unsigned long		chunk_size;
+};
+
 static void padata_free_pd(struct parallel_data *pd);
+static void __init padata_mt_helper(struct work_struct *work);
 
 static int padata_index_to_cpu(struct parallel_data *pd, int cpu_index)
 {
@@ -81,18 +97,56 @@ static struct padata_work *padata_work_alloc(void)
 }
 
 static void padata_work_init(struct padata_work *pw, work_func_t work_fn,
-			     void *data)
+			     void *data, int flags)
 {
-	INIT_WORK(&pw->pw_work, work_fn);
+	if (flags & PADATA_WORK_ONSTACK)
+		INIT_WORK_ONSTACK(&pw->pw_work, work_fn);
+	else
+		INIT_WORK(&pw->pw_work, work_fn);
 	pw->pw_data = data;
 }
 
+static int __init padata_work_alloc_mt(int nworks, void *data,
+				       struct list_head *head)
+{
+	int i;
+
+	spin_lock(&padata_works_lock);
+	/* Start at 1 because the current task participates in the job. */
+	for (i = 1; i < nworks; ++i) {
+		struct padata_work *pw = padata_work_alloc();
+
+		if (!pw)
+			break;
+		padata_work_init(pw, padata_mt_helper, data, 0);
+		list_add(&pw->pw_list, head);
+	}
+	spin_unlock(&padata_works_lock);
+
+	return i;
+}
+
 static void padata_work_free(struct padata_work *pw)
 {
 	lockdep_assert_held(&padata_works_lock);
 	list_add(&pw->pw_list, &padata_free_works);
 }
 
+static void __init padata_works_free(struct list_head *works)
+{
+	struct padata_work *cur, *next;
+
+	if (list_empty(works))
+		return;
+
+	spin_lock(&padata_works_lock);
+	list_for_each_entry_safe(cur, next, works, pw_list) {
+		list_del(&cur->pw_list);
+		padata_work_free(cur);
+	}
+	spin_unlock(&padata_works_lock);
+}
+
 static void padata_parallel_worker(struct work_struct *parallel_work)
 {
 	struct padata_work *pw = container_of(parallel_work, struct padata_work,
@@ -168,7 +222,7 @@ int padata_do_parallel(struct padata_shell *ps,
 	pw = padata_work_alloc();
 	spin_unlock(&padata_works_lock);
 	if (pw) {
-		padata_work_init(pw, padata_parallel_worker, padata);
+		padata_work_init(pw, padata_parallel_worker, padata, 0);
 		queue_work(pinst->parallel_wq, &pw->pw_work);
 	} else {
 		/* Maximum works limit exceeded, run in the current task. */
@@ -409,6 +463,98 @@ static int pd_setup_cpumasks(struct parallel_data *pd,
 	return err;
 }
 
+static void __init padata_mt_helper(struct work_struct *w)
+{
+	struct padata_work *pw = container_of(w, struct padata_work, pw_work);
+	struct padata_mt_job_state *ps = pw->pw_data;
+	struct padata_mt_job *job = ps->job;
+	bool done;
+
+	spin_lock(&ps->lock);
+
+	while (job->size > 0) {
+		unsigned long start, size, end;
+
+		start = job->start;
+		/* So end is chunk size aligned if enough work remains. */
+		size = roundup(start + 1, ps->chunk_size) - start;
+		size = min(size, job->size);
+		end = start + size;
+
+		job->start = end;
+		job->size -= size;
+
+		spin_unlock(&ps->lock);
+		job->thread_fn(start, end, job->fn_arg);
+		spin_lock(&ps->lock);
+	}
+
+	++ps->nworks_fini;
+	done = (ps->nworks_fini == ps->nworks);
+	spin_unlock(&ps->lock);
+
+	if (done)
+		complete(&ps->completion);
+}
+
+/**
+ * padata_do_multithreaded - run a multithreaded job
+ * @job: Description of the job.
+ *
+ * See the definition of struct padata_mt_job for more details.
+ */
+void __init padata_do_multithreaded(struct padata_mt_job *job)
+{
+	/* In case threads finish at different times. */
+	static const unsigned long load_balance_factor = 4;
+	struct padata_work my_work, *pw;
+	struct padata_mt_job_state ps;
+	LIST_HEAD(works);
+	int nworks;
+
+	if (job->size == 0)
+		return;
+
+	/* Ensure at least one thread when size < min_chunk. */
+	nworks = max(job->size / job->min_chunk, 1ul);
+	nworks = min(nworks, job->max_threads);
+
+	if (nworks == 1) {
+		/* Single thread, no coordination needed, cut to the chase. */
+		job->thread_fn(job->start, job->start + job->size, job->fn_arg);
+		return;
+	}
+
+	spin_lock_init(&ps.lock);
+	init_completion(&ps.completion);
+	ps.job = job;
+	ps.nworks = padata_work_alloc_mt(nworks, &ps, &works);
+	ps.nworks_fini = 0;
+
+	/*
+	 * Chunk size is the amount of work a helper does per call to the
+	 * thread function. Load balance large jobs between threads by
+	 * increasing the number of chunks, guarantee at least the minimum
+	 * chunk size from the caller, and honor the caller's alignment.
+	 */
+	ps.chunk_size = job->size / (ps.nworks * load_balance_factor);
+	ps.chunk_size = max(ps.chunk_size, job->min_chunk);
+	ps.chunk_size = roundup(ps.chunk_size, job->align);
+
+	list_for_each_entry(pw, &works, pw_list)
+		queue_work(system_unbound_wq, &pw->pw_work);
+
+	/* Use the current thread, which saves starting a workqueue worker. */
+	padata_work_init(&my_work, padata_mt_helper, &ps, PADATA_WORK_ONSTACK);
+	padata_mt_helper(&my_work.pw_work);
+
+	/* Wait for all the helpers to finish. */
+	wait_for_completion(&ps.completion);
+
+	destroy_work_on_stack(&my_work.pw_work);
+	padata_works_free(&works);
+}
+
 static void __padata_list_init(struct padata_list *pd_list)
 {
 	INIT_LIST_HEAD(&pd_list->list);

From patchwork Wed May 20 18:26:43 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 11560995
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1FEF414B7 for ; Wed, 20 May 2020 18:27:56 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C5D8D20671 for ; Wed, 20 May 2020 18:27:55 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="Y7nx+LCF"
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C5D8D20671
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix) id 66FFC8000C; Wed, 20 May 2020 14:27:47 -0400 (EDT)
Delivered-To: linux-mm-outgoing@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 40) id 623108000A; Wed, 20 May 2020 14:27:47 -0400 (EDT)
X-Original-To: int-list-linux-mm@kvack.org
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 49B698000C; Wed, 20 May 2020 14:27:47 -0400 (EDT)
X-Original-To: linux-mm@kvack.org
X-Delivered-To: linux-mm@kvack.org
Received: from forelay.hostedemail.com (smtprelay0075.hostedemail.com [216.40.44.75]) by kanga.kvack.org (Postfix) with ESMTP id
293818000A for ; Wed, 20 May 2020 14:27:47 -0400 (EDT) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id CBF70180AD81F for ; Wed, 20 May 2020 18:27:46 +0000 (UTC) X-FDA: 76837930932.25.drain71_26b2a5970ea28 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,daniel.m.jordan@oracle.com,,RULES_HIT:30005:30034:30045:30054:30055:30064,0,RBL:156.151.31.85:@oracle.com:.lbl8.mailshell.net-64.10.201.10 62.18.0.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:1,LUA_SUMMARY:none X-HE-Tag: drain71_26b2a5970ea28 X-Filterd-Recvd-Size: 15892 Received: from userp2120.oracle.com (userp2120.oracle.com [156.151.31.85]) by imf24.hostedemail.com (Postfix) with ESMTP for ; Wed, 20 May 2020 18:27:45 +0000 (UTC) Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 04KII2kp001702; Wed, 20 May 2020 18:27:08 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=corp-2020-01-29; bh=xmheG4rQ7EihGRxBUkcGZAWcew/R296No8qaesdB9YM=; b=Y7nx+LCF/t7b77t5QDWRK2QB6+3yhDvycV+pgjTPRlch8HJWE5XO1CmOt4L2/SulbEaZ D0tLOKFg0AGZk7q4lGycVtcz2CF5ZyCjed+FOaINHQqollul9T8H5Fkj/2G9aGUPMjw6 Hw690rP2WuV8HQdMVLoJCBrnrjr+tr01Y3kPxwthG0fGYKVJX0m9HQLPKuQu0SogNZpN zA/ZyUQq9TDEYsHssZP/0itZ2DgU5H5rnxaJgrBMagTQ9J3UDX8lI34U/eqO1bM6+U4L DZhzR4nria+seh055RNzj+xTivqBSrOkwlluWC1vSUSpc92/OjMK6zPz9SWbi+QoQCyS 1g== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by userp2120.oracle.com with ESMTP id 31501rb89w-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Wed, 20 May 2020 18:27:07 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 04KID9Dk187208; 
Wed, 20 May 2020 18:27:06 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by aserp3020.oracle.com with ESMTP id 312t38cm5p-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 May 2020 18:27:06 +0000 Received: from abhmp0017.oracle.com (abhmp0017.oracle.com [141.146.116.23]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id 04KIR4Ng016467; Wed, 20 May 2020 18:27:04 GMT Received: from localhost.localdomain (/98.229.125.203) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 May 2020 11:27:03 -0700 From: Daniel Jordan To: Andrew Morton , Herbert Xu , Steffen Klassert Cc: Alex Williamson , Alexander Duyck , Dan Williams , Dave Hansen , David Hildenbrand , Jason Gunthorpe , Jonathan Corbet , Josh Triplett , Kirill Tkhai , Michal Hocko , Pavel Machek , Pavel Tatashin , Peter Zijlstra , Randy Dunlap , Robert Elliott , Shile Zhang , Steven Sistare , Tejun Heo , Zi Yan , linux-crypto@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, Daniel Jordan Subject: [PATCH v2 5/7] mm: parallelize deferred_init_memmap() Date: Wed, 20 May 2020 14:26:43 -0400 Message-Id: <20200520182645.1658949-6-daniel.m.jordan@oracle.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200520182645.1658949-1-daniel.m.jordan@oracle.com> References: <20200520182645.1658949-1-daniel.m.jordan@oracle.com> MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9627 signatures=668686 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 spamscore=0 mlxlogscore=999 phishscore=0 mlxscore=0 malwarescore=0 suspectscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2005200148 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9627 signatures=668686 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 lowpriorityscore=0 spamscore=0 
 mlxlogscore=999 clxscore=1015 priorityscore=1501 cotscore=-2147483648 impostorscore=0 bulkscore=0 adultscore=0 malwarescore=0 phishscore=0 mlxscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2005200148
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

Deferred struct page init is a significant bottleneck in kernel boot.
Optimizing it maximizes availability for large-memory systems and allows
spinning up short-lived VMs as needed without having to leave them
running. It also benefits bare metal machines hosting VMs that are
sensitive to downtime. In projects such as VMM Fast Restart[1], where
guest state is preserved across kexec reboot, it helps prevent
application and network timeouts in the guests.

Multithread to take full advantage of system memory bandwidth.

The maximum number of threads is capped at the number of CPUs on the node
because speedups always improve with additional threads on every system
tested, and at this phase of boot, the system is otherwise idle and
waiting on page init to finish.

Helper threads operate on section-aligned ranges to both avoid false
sharing when setting the pageblock's migrate type and to avoid accessing
uninitialized buddy pages, though max order alignment is enough for the
latter.

The minimum chunk size is also a section. There was benefit to using
multiple threads even on relatively small memory (1G) systems, and this
is the smallest size that the alignment allows.

The time (milliseconds) is the slowest node to initialize since boot
blocks until all nodes finish. intel_pstate is loaded in active mode
without hwp and with turbo enabled, and intel_idle is active as well.

Intel(R) Xeon(R) Platinum 8167M CPU @ 2.00GHz (Skylake, bare metal)
  2 nodes * 26 cores * 2 threads = 104 CPUs
  384G/node = 768G memory

               kernel boot                 deferred init
               ------------------------    ------------------------
node% (thr)    speedup  time_ms (stdev)    speedup  time_ms (stdev)
      (  0)         --   4078.0 (  9.0)         --   1779.0 (  8.7)
   2% (  1)       1.4%   4021.3 (  2.9)       3.4%   1717.7 (  7.8)
  12% (  6)      35.1%   2644.7 ( 35.3)      80.8%    341.0 ( 35.5)
  25% ( 13)      38.7%   2498.0 ( 34.2)      89.1%    193.3 ( 32.3)
  37% ( 19)      39.1%   2482.0 ( 25.2)      90.1%    175.3 ( 31.7)
  50% ( 26)      38.8%   2495.0 (  8.7)      89.1%    193.7 (  3.5)
  75% ( 39)      39.2%   2478.0 ( 21.0)      90.3%    172.7 ( 26.7)
 100% ( 52)      40.0%   2448.0 (  2.0)      91.9%    143.3 (  1.5)

Intel(R) Xeon(R) CPU E5-2699C v4 @ 2.20GHz (Broadwell, bare metal)
  1 node * 16 cores * 2 threads = 32 CPUs
  192G/node = 192G memory

               kernel boot                 deferred init
               ------------------------    ------------------------
node% (thr)    speedup  time_ms (stdev)    speedup  time_ms (stdev)
      (  0)         --   1996.0 ( 18.0)         --   1104.3 (  6.7)
   3% (  1)       1.4%   1968.0 (  3.0)       2.7%   1074.7 (  9.0)
  12% (  4)      40.1%   1196.0 ( 22.7)      72.4%    305.3 ( 16.8)
  25% (  8)      47.4%   1049.3 ( 17.2)      84.2%    174.0 ( 10.6)
  37% ( 12)      48.3%   1032.0 ( 14.9)      86.8%    145.3 (  2.5)
  50% ( 16)      48.9%   1020.3 (  2.5)      88.0%    133.0 (  1.7)
  75% ( 24)      49.1%   1016.3 (  8.1)      88.4%    128.0 (  1.7)
 100% ( 32)      49.4%   1009.0 (  8.5)      88.6%    126.3 (  0.6)

Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz (Haswell, bare metal)
  2 nodes * 18 cores * 2 threads = 72 CPUs
  128G/node = 256G memory

               kernel boot                 deferred init
               ------------------------    ------------------------
node% (thr)    speedup  time_ms (stdev)    speedup  time_ms (stdev)
      (  0)         --   1682.7 (  6.7)         --    630.0 (  4.6)
   3% (  1)       0.4%   1676.0 (  2.0)       0.7%    625.3 (  3.2)
  12% (  4)      25.8%   1249.0 (  1.0)      68.2%    200.3 (  1.2)
  25% (  9)      30.0%   1178.0 (  5.2)      79.7%    128.0 (  3.5)
  37% ( 13)      30.6%   1167.7 (  3.1)      81.3%    117.7 (  1.2)
  50% ( 18)      30.6%   1167.3 (  2.3)      81.4%    117.0 (  1.0)
  75% ( 27)      31.0%   1161.3 (  4.6)      82.5%    110.0 (  6.9)
 100% ( 36)      32.1%   1142.0 (  3.6)      85.7%     90.0 (  1.0)

AMD EPYC 7551 32-Core Processor (Zen, kvm guest)
  1 node * 8 cores * 2 threads = 16 CPUs
  64G/node = 64G memory

               kernel boot                 deferred init
               ------------------------    ------------------------
node% (thr)    speedup  time_ms (stdev)    speedup  time_ms (stdev)
      (  0)         --   1003.7 ( 16.6)         --    243.3 (  8.1)
   6% (  1)       1.4%    990.0 (  4.6)       1.2%    240.3 (  1.5)
  12% (  2)      11.4%    889.3 ( 16.7)      44.5%    135.0 (  3.0)
  25% (  4)      16.8%    835.3 (  9.0)      65.8%     83.3 (  2.5)
  37% (  6)      18.6%    816.7 ( 17.6)      70.4%     72.0 (  1.0)
  50% (  8)      18.2%    821.0 (  5.0)      70.7%     71.3 (  1.2)
  75% ( 12)      19.0%    813.3 (  5.0)      71.8%     68.7 (  2.1)
 100% ( 16)      19.8%    805.3 ( 10.8)      76.4%     57.3 ( 15.9)

Server-oriented distros that enable deferred page init sometimes run in
small VMs, and they still benefit even though the fraction of boot time
saved is smaller:

AMD EPYC 7551 32-Core Processor (Zen, kvm guest)
  1 node * 2 cores * 2 threads = 4 CPUs
  16G/node = 16G memory

               kernel boot                 deferred init
               ------------------------    ------------------------
node% (thr)    speedup  time_ms (stdev)    speedup  time_ms (stdev)
      (  0)         --    722.3 (  9.5)         --     50.7 (  0.6)
  25% (  1)      -3.3%    746.3 (  4.7)      -2.0%     51.7 (  1.2)
  50% (  2)       0.2%    721.0 ( 11.3)      29.6%     35.7 (  4.9)
  75% (  3)      -0.3%    724.3 ( 11.2)      48.7%     26.0 (  0.0)
 100% (  4)       3.0%    700.3 ( 13.6)      55.9%     22.3 (  0.6)

Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz (Haswell, kvm guest)
  1 node * 2 cores * 2 threads = 4 CPUs
  14G/node = 14G memory

               kernel boot                 deferred init
               ------------------------    ------------------------
node% (thr)    speedup  time_ms (stdev)    speedup  time_ms (stdev)
      (  0)         --    673.0 (  6.9)         --     57.0 (  1.0)
  25% (  1)      -0.6%    677.3 ( 19.8)       1.8%     56.0 (  1.0)
  50% (  2)       3.4%    650.0 (  3.6)      36.8%     36.0 (  5.2)
  75% (  3)       4.2%    644.7 (  7.6)      56.1%     25.0 (  1.0)
 100% (  4)       5.3%    637.0 (  5.6)      63.2%     21.0 (  0.0)

On Josh's 96-CPU and 192G memory system:

Without this patch series:
[    0.487132] node 0 initialised, 23398907 pages in 292ms
[    0.499132] node 1 initialised, 24189223 pages in 304ms
...
[    0.629376] Run /sbin/init as init process

With this patch series:
[    0.227868] node 0 initialised, 23398907 pages in 28ms
[    0.230019] node 1 initialised, 24189223 pages in 28ms
...
[    0.361069] Run /sbin/init as init process

[1] https://static.sched.com/hosted_files/kvmforum2019/66/VMM-fast-restart_kvmforum2019.pdf

Signed-off-by: Daniel Jordan
---
 mm/Kconfig      |  6 ++---
 mm/page_alloc.c | 60 ++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 58 insertions(+), 8 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index c1acc34c1c358..04c1da3f9f44c 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -750,13 +750,13 @@ config DEFERRED_STRUCT_PAGE_INIT
 	depends on SPARSEMEM
 	depends on !NEED_PER_CPU_KM
 	depends on 64BIT
+	select PADATA
 	help
 	  Ordinarily all struct pages are initialised during early boot in a
 	  single thread. On very large machines this can take a considerable
 	  amount of time. If this option is set, large machines will bring up
-	  a subset of memmap at boot and then initialise the rest in parallel
-	  by starting one-off "pgdatinitX" kernel thread for each node X. This
-	  has a potential performance impact on processes running early in the
+	  a subset of memmap at boot and then initialise the rest in parallel.
+	  This has a potential performance impact on tasks running early in the
 	  lifetime of the system until these kthreads finish the
 	  initialisation.
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d0c0d9364aa6d..9cb780e8dec78 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -68,6 +68,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -1814,16 +1815,44 @@ deferred_init_maxorder(u64 *i, struct zone *zone, unsigned long *start_pfn,
 	return nr_pages;
 }
 
+struct definit_args {
+	struct zone *zone;
+	atomic_long_t nr_pages;
+};
+
+static void __init
+deferred_init_memmap_chunk(unsigned long start_pfn, unsigned long end_pfn,
+			   void *arg)
+{
+	unsigned long spfn, epfn, nr_pages = 0;
+	struct definit_args *args = arg;
+	struct zone *zone = args->zone;
+	u64 i;
+
+	deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn, start_pfn);
+
+	/*
+	 * Initialize and free pages in MAX_ORDER sized increments so that we
+	 * can avoid introducing any issues with the buddy allocator.
+	 */
+	while (spfn < end_pfn) {
+		nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
+		cond_resched();
+	}
+
+	atomic_long_add(nr_pages, &args->nr_pages);
+}
+
 /* Initialise remaining memory on a node */
 static int __init deferred_init_memmap(void *data)
 {
 	pg_data_t *pgdat = data;
 	const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id);
 	unsigned long spfn = 0, epfn = 0, nr_pages = 0;
-	unsigned long first_init_pfn, flags;
+	unsigned long first_init_pfn, flags, epfn_align;
 	unsigned long start = jiffies;
 	struct zone *zone;
-	int zid;
+	int zid, max_threads;
 	u64 i;
 
 	/* Bind memory initialisation thread to a local node if possible */
@@ -1863,11 +1892,32 @@ static int __init deferred_init_memmap(void *data)
 		goto zone_empty;
 
 	/*
-	 * Initialize and free pages in MAX_ORDER sized increments so
-	 * that we can avoid introducing any issues with the buddy
-	 * allocator.
+	 * More CPUs always led to greater speedups on tested systems, up to
+	 * all the nodes' CPUs. Use all since the system is otherwise idle now.
 	 */
+	max_threads = max(cpumask_weight(cpumask), 1u);
+
 	while (spfn < epfn) {
+		epfn_align = ALIGN_DOWN(epfn, PAGES_PER_SECTION);
+
+		if (IS_ALIGNED(spfn, PAGES_PER_SECTION) &&
+		    epfn_align - spfn >= PAGES_PER_SECTION) {
+			struct definit_args arg = { zone, ATOMIC_LONG_INIT(0) };
+			struct padata_mt_job job = {
+				.thread_fn = deferred_init_memmap_chunk,
+				.fn_arg = &arg,
+				.start = spfn,
+				.size = epfn_align - spfn,
+				.align = PAGES_PER_SECTION,
+				.min_chunk = PAGES_PER_SECTION,
+				.max_threads = max_threads,
+			};
+
+			padata_do_multithreaded(&job);
+			nr_pages += atomic_long_read(&arg.nr_pages);
+			spfn = epfn_align;
+		}
+
 		nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
 		cond_resched();
 	}

From patchwork Wed May 20 18:26:44 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 11560985
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1A4A714B7 for ; Wed, 20 May 2020 18:27:45 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CBC8620829 for ; Wed, 20 May 2020 18:27:44 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="x5LSs/4i"
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CBC8620829
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix) id D564A80007; Wed, 20 May 2020 14:27:43 -0400 (EDT)
Delivered-To: linux-mm-outgoing@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 40) id CD42480008; Wed, 20 May 2020 14:27:43 -0400 (EDT)
X-Original-To: int-list-linux-mm@kvack.org
X-Delivered-To:
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B4E1780007; Wed, 20 May 2020 14:27:43 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0128.hostedemail.com [216.40.44.128]) by kanga.kvack.org (Postfix) with ESMTP id 98181900002 for ; Wed, 20 May 2020 14:27:43 -0400 (EDT) Received: from smtpin25.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 46E13181AEF2A for ; Wed, 20 May 2020 18:27:43 +0000 (UTC) X-FDA: 76837930806.25.chain13_2640cb0d50b06 X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,daniel.m.jordan@oracle.com,,RULES_HIT:30054:30064:30090,0,RBL:156.151.31.85:@oracle.com:.lbl8.mailshell.net-62.18.0.100 64.10.201.10,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:ft,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: chain13_2640cb0d50b06 X-Filterd-Recvd-Size: 7240 Received: from userp2120.oracle.com (userp2120.oracle.com [156.151.31.85]) by imf11.hostedemail.com (Postfix) with ESMTP for ; Wed, 20 May 2020 18:27:42 +0000 (UTC) Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 04KIHRpm000781; Wed, 20 May 2020 18:27:10 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=corp-2020-01-29; bh=jCJIf2wmiDSS+BjtUDA+ZMjI+y7yIcmsNXeQVwW/Xag=; b=x5LSs/4ijZLCMnuICsPpuDDbTx4yAJ7NkaxrnkWI9TPHmcBX62mo6sL8dHRaXxzLhRi/ MUajNfN96/ikt9Whx1H3v5dWpP9hvXoUpY+eP7n2PEvQMOD4xceHvw/uAsLa6UP6/XHr ympP6PF3g7EVPonCaqo8RI5UrhNrzxVL9ckPBo1raknaIOAhZG7uYdEnvCsIKtYi6ccM 7vdryTK6ZS4qSN/X0iJlmiJhUHLyfqD6mKnX0ttiRGCpZgF0gRi2QvyuTtxlHsR73KeV 7oOHt5FfU7G+Kki8B5SZaMtx0dXjormuQj2mv2ffGHRcwNLGD6+p0iFFoafpoq8DPKK4 kg== Received: from aserp3030.oracle.com 
(aserp3030.oracle.com [141.146.126.71]) by userp2120.oracle.com with ESMTP id 31501rb8a6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Wed, 20 May 2020 18:27:09 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 04KICdqQ014999; Wed, 20 May 2020 18:27:09 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by aserp3030.oracle.com with ESMTP id 313gj3yth8-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 May 2020 18:27:09 +0000 Received: from abhmp0017.oracle.com (abhmp0017.oracle.com [141.146.116.23]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id 04KIR64p016486; Wed, 20 May 2020 18:27:06 GMT Received: from localhost.localdomain (/98.229.125.203) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 20 May 2020 11:27:06 -0700 From: Daniel Jordan To: Andrew Morton , Herbert Xu , Steffen Klassert Cc: Alex Williamson , Alexander Duyck , Dan Williams , Dave Hansen , David Hildenbrand , Jason Gunthorpe , Jonathan Corbet , Josh Triplett , Kirill Tkhai , Michal Hocko , Pavel Machek , Pavel Tatashin , Peter Zijlstra , Randy Dunlap , Robert Elliott , Shile Zhang , Steven Sistare , Tejun Heo , Zi Yan , linux-crypto@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, Daniel Jordan Subject: [PATCH v2 6/7] mm: make deferred init's max threads arch-specific Date: Wed, 20 May 2020 14:26:44 -0400 Message-Id: <20200520182645.1658949-7-daniel.m.jordan@oracle.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200520182645.1658949-1-daniel.m.jordan@oracle.com> References: <20200520182645.1658949-1-daniel.m.jordan@oracle.com> MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9627 signatures=668686 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 spamscore=0 malwarescore=0 
mlxscore=0 adultscore=0 bulkscore=0 suspectscore=2 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2005200148 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9627 signatures=668686 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 lowpriorityscore=0 spamscore=0 mlxlogscore=999 clxscore=1015 priorityscore=1501 cotscore=-2147483648 impostorscore=0 bulkscore=0 adultscore=0 malwarescore=0 phishscore=0 mlxscore=0 suspectscore=2 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2005200148 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Using padata during deferred init has only been tested on x86, so for now limit it to this architecture. If another arch wants this, it can find the max thread limit that's best for it and override deferred_page_init_max_threads(). Signed-off-by: Daniel Jordan --- arch/x86/mm/init_64.c | 12 ++++++++++++ include/linux/memblock.h | 3 +++ mm/page_alloc.c | 13 ++++++++----- 3 files changed, 23 insertions(+), 5 deletions(-) diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 8b5f73f5e207c..2d749ec12ea8a 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1260,6 +1260,18 @@ void __init mem_init(void) mem_init_print_info(NULL); } +#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT +int __init deferred_page_init_max_threads(const struct cpumask *node_cpumask) +{ + /* + * More CPUs always led to greater speedups on tested systems, up to + * all the nodes' CPUs. Use all since the system is otherwise idle + * now. 
+	 */
+	return max_t(int, cpumask_weight(node_cpumask), 1);
+}
+#endif
+
 int kernel_set_to_readonly;
 
 void mark_rodata_ro(void)
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 6bc37a731d27b..2b289df44194f 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -275,6 +275,9 @@ void __next_mem_pfn_range_in_zone(u64 *idx, struct zone *zone,
 #define for_each_free_mem_pfn_range_in_zone_from(i, zone, p_start, p_end) \
 	for (; i != U64_MAX; \
 	     __next_mem_pfn_range_in_zone(&i, zone, p_start, p_end))
+
+int __init deferred_page_init_max_threads(const struct cpumask *node_cpumask);
+
 #endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
 
 /**
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9cb780e8dec78..0d7d805f98b2d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1843,6 +1843,13 @@ deferred_init_memmap_chunk(unsigned long start_pfn, unsigned long end_pfn,
 	atomic_long_add(nr_pages, &args->nr_pages);
 }
 
+/* An arch may override for more concurrency. */
+__weak int __init
+deferred_page_init_max_threads(const struct cpumask *node_cpumask)
+{
+	return 1;
+}
+
 /* Initialise remaining memory on a node */
 static int __init deferred_init_memmap(void *data)
 {
@@ -1891,11 +1898,7 @@ static int __init deferred_init_memmap(void *data)
 						 first_init_pfn))
 		goto zone_empty;
 
-	/*
-	 * More CPUs always led to greater speedups on tested systems, up to
-	 * all the nodes' CPUs. Use all since the system is otherwise idle now.
-	 */
-	max_threads = max(cpumask_weight(cpumask), 1u);
+	max_threads = deferred_page_init_max_threads(cpumask);
 
 	while (spfn < epfn) {
 		epfn_align = ALIGN_DOWN(epfn, PAGES_PER_SECTION);

From patchwork Wed May 20 18:26:45 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 11560993
From: Daniel Jordan
To: Andrew Morton, Herbert Xu, Steffen Klassert
Cc: Alex Williamson, Alexander Duyck, Dan Williams, Dave Hansen,
 David Hildenbrand, Jason Gunthorpe, Jonathan Corbet, Josh Triplett,
 Kirill Tkhai, Michal Hocko, Pavel Machek, Pavel Tatashin, Peter Zijlstra,
 Randy Dunlap, Robert Elliott, Shile Zhang, Steven Sistare, Tejun Heo,
 Zi Yan, linux-crypto@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, Daniel Jordan
Subject: [PATCH v2 7/7] padata: document multithreaded jobs
Date: Wed, 20 May 2020 14:26:45 -0400
Message-Id: <20200520182645.1658949-8-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20200520182645.1658949-1-daniel.m.jordan@oracle.com>
References: <20200520182645.1658949-1-daniel.m.jordan@oracle.com>
MIME-Version: 1.0

Add Documentation for multithreaded jobs.

Signed-off-by: Daniel Jordan
---
 Documentation/core-api/padata.rst | 41 +++++++++++++++++++++++--------
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/Documentation/core-api/padata.rst b/Documentation/core-api/padata.rst
index 9a24c111781d9..b7e047af993e8 100644
--- a/Documentation/core-api/padata.rst
+++ b/Documentation/core-api/padata.rst
@@ -4,23 +4,26 @@
 The padata parallel execution mechanism
 =======================================
 
-:Date: December 2019
+:Date: April 2020
 
 Padata is a mechanism by which the kernel can farm jobs out to be done in
-parallel on multiple CPUs while retaining their ordering. It was developed for
-use with the IPsec code, which needs to be able to perform encryption and
-decryption on large numbers of packets without reordering those packets. The
-crypto developers made a point of writing padata in a sufficiently general
-fashion that it could be put to other uses as well.
+parallel on multiple CPUs while optionally retaining their ordering.
 
-Usage
-=====
+It was originally developed for IPsec, which needs to perform encryption and
+decryption on large numbers of packets without reordering those packets. This
+is currently the sole consumer of padata's serialized job support.
+
+Padata also supports multithreaded jobs, splitting up the job evenly while load
+balancing and coordinating between threads.
+
+Running Serialized Jobs
+=======================
 
 Initializing
 ------------
 
-The first step in using padata is to set up a padata_instance structure for
-overall control of how jobs are to be run::
+The first step in using padata to run parallel jobs is to set up a
+padata_instance structure for overall control of how jobs are to be run::
 
     #include <linux/padata.h>
 
@@ -162,6 +165,24 @@ functions that correspond to the allocation in reverse::
 It is the user's responsibility to ensure all outstanding jobs are complete
 before any of the above are called.
 
+Running Multithreaded Jobs
+==========================
+
+A multithreaded job has a main thread and zero or more helper threads, with the
+main thread participating in the job and then waiting until all helpers have
+finished. padata splits the job into units called chunks, where a chunk is a
+piece of the job that one thread completes in one call to the thread function.
+
+A user has to do three things to run a multithreaded job. First, describe the
+job by defining a padata_mt_job structure, which is explained in the Interface
+section. This includes a pointer to the thread function, which padata will
+call each time it assigns a job chunk to a thread. Then, define the thread
+function, which accepts three arguments, ``start``, ``end``, and ``arg``, where
+the first two delimit the range that the thread operates on and the last is a
+pointer to the job's shared state, if any. Prepare the shared state, which is
+typically a stack-allocated structure that wraps the required data. Last, call
+padata_do_multithreaded(), which will return once the job is finished.
+
 Interface
 =========
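
The calling convention described in the "Running Multithreaded Jobs" text
(describe the job, define a thread function taking ``start``, ``end`` and
``arg``, prepare stack-allocated shared state, then run the job) can be
sketched in plain userspace C. This is only an illustration under stated
assumptions: ``struct mt_job_sketch``, ``run_job_sketch()`` and the ``chunk``
field are hypothetical stand-ins invented for this note, not the kernel's
padata_mt_job or padata_do_multithreaded(), and the dispatcher here is serial
rather than threaded. The sketch also assumes the range is half-open, which the
patch text does not state.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the job description. The field names
 * are assumptions for this sketch and are NOT the kernel's padata_mt_job.
 */
struct mt_job_sketch {
	void (*thread_fn)(unsigned long start, unsigned long end, void *arg);
	void *fn_arg;		/* pointer to the job's shared state, if any */
	unsigned long start;	/* first unit of the job */
	unsigned long size;	/* number of units in the job */
	unsigned long chunk;	/* units handed to thread_fn per call (assumed) */
};

/*
 * Serial stand-in for the dispatcher: walks the job and calls thread_fn once
 * per chunk. The real mechanism hands chunks to several threads; this sketch
 * only shows the calling convention.
 */
static void run_job_sketch(const struct mt_job_sketch *job)
{
	unsigned long pos = job->start;
	unsigned long end = job->start + job->size;

	assert(job->thread_fn != NULL && job->chunk > 0);

	while (pos < end) {
		/* Clamp the last chunk so it never runs past the job's end. */
		unsigned long chunk_end =
			(end - pos > job->chunk) ? pos + job->chunk : end;

		job->thread_fn(pos, chunk_end, job->fn_arg);
		pos = chunk_end;
	}
}

/* Shared state wrapped in one structure that lives on the caller's stack. */
struct sum_state {
	const int *values;
	long total;
	unsigned int calls;
};

/*
 * Thread function: handles the half-open range [start, end) (an assumption
 * of this sketch). With real concurrent helpers, updates to shared state
 * would need atomics or locking; plain additions are only safe here because
 * the sketch is serial.
 */
static void sum_chunk(unsigned long start, unsigned long end, void *arg)
{
	struct sum_state *st = arg;
	unsigned long i;

	for (i = start; i < end; i++)
		st->total += st->values[i];
	st->calls++;
}

/* Describe the job, prepare the shared state, run it, read the result. */
static long sum_with_job_sketch(const int *values, unsigned long n,
				unsigned long chunk, unsigned int *calls)
{
	struct sum_state st = { .values = values, .total = 0, .calls = 0 };
	struct mt_job_sketch job = {
		.thread_fn = sum_chunk,
		.fn_arg = &st,
		.start = 0,
		.size = n,
		.chunk = chunk,
	};

	run_job_sketch(&job);
	if (calls)
		*calls = st.calls;
	return st.total;
}
```

With seven values and a chunk size of three, the stand-in dispatcher hands out
the ranges [0,3), [3,6) and [6,7), so the thread function runs three times. In
the kernel the counter update would have to be atomic, as
deferred_init_memmap_chunk() does with atomic_long_add() in patch 6/7.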