From patchwork Thu Apr 30 20:11:19 2020
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 11521445
From: Daniel Jordan
To: Andrew Morton, Herbert Xu, Steffen Klassert
Cc: Alex Williamson, Alexander Duyck, Dan Williams, Dave Hansen,
 David Hildenbrand, Jason Gunthorpe, Jonathan Corbet, Josh Triplett,
 Kirill Tkhai, Michal Hocko, Pavel Machek, Pavel Tatashin, Peter Zijlstra,
 Randy Dunlap, Shile Zhang, Tejun Heo, Zi Yan,
 linux-crypto@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Daniel Jordan
Subject: [PATCH 1/7] padata: remove exit routine
Date: Thu, 30 Apr 2020 16:11:19 -0400
Message-Id: <20200430201125.532129-2-daniel.m.jordan@oracle.com>
In-Reply-To: <20200430201125.532129-1-daniel.m.jordan@oracle.com>
References: <20200430201125.532129-1-daniel.m.jordan@oracle.com>

padata_driver_exit() is
unnecessary because padata isn't built as a module and doesn't exit.

padata's init routine will soon allocate memory, so getting rid of the
exit function now avoids pointless code to free it.

Signed-off-by: Daniel Jordan
---
 kernel/padata.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index 72777c10bb9cb..36a8e98741bb3 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -1071,10 +1071,4 @@ static __init int padata_driver_init(void)
 }
 module_init(padata_driver_init);
 
-static __exit void padata_driver_exit(void)
-{
-	cpuhp_remove_multi_state(CPUHP_PADATA_DEAD);
-	cpuhp_remove_multi_state(hp_online);
-}
-module_exit(padata_driver_exit);
 #endif

From patchwork Thu Apr 30 20:11:20 2020
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 11521417
From: Daniel Jordan
To: Andrew Morton, Herbert Xu, Steffen Klassert
Cc: Alex Williamson, Alexander Duyck, Dan Williams, Dave Hansen,
 David Hildenbrand, Jason Gunthorpe, Jonathan Corbet, Josh Triplett,
 Kirill Tkhai, Michal Hocko, Pavel Machek, Pavel Tatashin, Peter Zijlstra,
 Randy Dunlap, Shile Zhang, Tejun Heo, Zi Yan,
 linux-crypto@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Daniel Jordan
Subject: [PATCH 2/7] padata: initialize earlier
Date: Thu, 30 Apr 2020 16:11:20 -0400
Message-Id: <20200430201125.532129-3-daniel.m.jordan@oracle.com>
In-Reply-To: <20200430201125.532129-1-daniel.m.jordan@oracle.com>
References: <20200430201125.532129-1-daniel.m.jordan@oracle.com>

padata will soon initialize the system's struct pages in parallel, so it
needs to be ready by page_alloc_init_late().

The error return from padata_driver_init() triggers an initcall warning,
so add a warning to padata_init() to avoid silent failure.

Signed-off-by: Daniel Jordan
---
 include/linux/padata.h |  6 ++++++
 init/main.c            |  2 ++
 kernel/padata.c        | 17 ++++++++---------
 3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index a0d8b41850b25..476ecfa41f363 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -164,6 +164,12 @@ struct padata_instance {
 #define PADATA_INVALID 4
 };
 
+#ifdef CONFIG_PADATA
+extern void __init padata_init(void);
+#else
+static inline void __init padata_init(void) {}
+#endif
+
 extern struct padata_instance *padata_alloc_possible(const char *name);
 extern void padata_free(struct padata_instance *pinst);
 extern struct padata_shell *padata_alloc_shell(struct padata_instance *pinst);
diff --git a/init/main.c b/init/main.c
index ee4947af823f3..5451a80e43016 100644
--- a/init/main.c
+++ b/init/main.c
@@ -94,6 +94,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -1438,6 +1439,7 @@ static noinline void __init kernel_init_freeable(void)
 	smp_init();
 	sched_init_smp();
 
+	padata_init();
 	page_alloc_init_late();
 	/* Initialize page ext after all struct pages are initialized. */
 	page_ext_init();
diff --git a/kernel/padata.c b/kernel/padata.c
index 36a8e98741bb3..b05cd30f8905b 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -31,7 +31,6 @@
 #include
 #include
 #include
-#include
 
 #define MAX_OBJ_NUM 1000
 
@@ -1049,26 +1048,26 @@ void padata_free_shell(struct padata_shell *ps)
 }
 EXPORT_SYMBOL(padata_free_shell);
 
-#ifdef CONFIG_HOTPLUG_CPU
-
-static __init int padata_driver_init(void)
+void __init padata_init(void)
 {
+#ifdef CONFIG_HOTPLUG_CPU
 	int ret;
 
 	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "padata:online",
 				      padata_cpu_online, NULL);
 	if (ret < 0)
-		return ret;
+		goto err;
 	hp_online = ret;
 
 	ret = cpuhp_setup_state_multi(CPUHP_PADATA_DEAD, "padata:dead",
 				      NULL, padata_cpu_dead);
 	if (ret < 0) {
 		cpuhp_remove_multi_state(hp_online);
-		return ret;
+		goto err;
 	}
-	return 0;
-}
-module_init(padata_driver_init);
 
+	return;
+err:
+	pr_warn("padata: initialization failed\n");
 #endif
+}

From patchwork Thu Apr 30 20:11:21 2020
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 11521447
From: Daniel Jordan
To: Andrew Morton, Herbert Xu, Steffen Klassert
Cc: Alex Williamson, Alexander Duyck, Dan Williams, Dave Hansen,
 David Hildenbrand, Jason Gunthorpe, Jonathan Corbet, Josh Triplett,
 Kirill Tkhai, Michal Hocko, Pavel Machek, Pavel Tatashin, Peter Zijlstra,
 Randy Dunlap, Shile Zhang, Tejun Heo, Zi Yan,
 linux-crypto@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Daniel Jordan
Subject: [PATCH 3/7] padata: allocate work structures for parallel jobs from a pool
Date: Thu, 30 Apr 2020 16:11:21 -0400
Message-Id: <20200430201125.532129-4-daniel.m.jordan@oracle.com>
In-Reply-To: <20200430201125.532129-1-daniel.m.jordan@oracle.com>
References: <20200430201125.532129-1-daniel.m.jordan@oracle.com>

padata allocates per-CPU, per-instance work structs for parallel jobs. A
do_parallel call assigns a job to a sequence number and hashes the number
to a CPU, where the job will eventually run using the corresponding work.

This approach fit with how padata used to bind a job to each CPU
round-robin, makes less sense after commit bfde23ce200e6 ("padata: unbind
parallel jobs from specific CPUs") because a work isn't bound to a
particular CPU anymore, and isn't needed at all for multithreaded jobs
because they don't have sequence numbers.

Replace the per-CPU works with a preallocated pool, which allows sharing
them between existing padata users and the upcoming multithreaded user.
The pool will also facilitate setting NUMA-aware concurrency limits with
later users.

The pool is sized according to the number of possible CPUs. With this
limit, MAX_OBJ_NUM no longer makes sense, so remove it.

If the global pool is exhausted, a parallel job is run in the current task
instead to throttle a system trying to do too much in parallel.

Signed-off-by: Daniel Jordan
---
 include/linux/padata.h |   8 +--
 kernel/padata.c        | 118 +++++++++++++++++++++++++++--------------
 2 files changed, 78 insertions(+), 48 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index 476ecfa41f363..3bfa503503ac5 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -24,7 +24,6 @@
  * @list: List entry, to attach to the padata lists.
  * @pd: Pointer to the internal control structure.
  * @cb_cpu: Callback cpu for serializatioon.
- * @cpu: Cpu for parallelization.
  * @seq_nr: Sequence number of the parallelized data object.
  * @info: Used to pass information from the parallel to the serial function.
  * @parallel: Parallel execution function.
@@ -34,7 +33,6 @@ struct padata_priv {
 	struct list_head list;
 	struct parallel_data *pd;
 	int cb_cpu;
-	int cpu;
 	unsigned int seq_nr;
 	int info;
 	void (*parallel)(struct padata_priv *padata);
@@ -68,15 +66,11 @@ struct padata_serial_queue {
 /**
  * struct padata_parallel_queue - The percpu padata parallel queue
  *
- * @parallel: List to wait for parallelization.
  * @reorder: List to wait for reordering after parallel processing.
- * @work: work struct for parallelization.
  * @num_obj: Number of objects that are processed by this cpu.
  */
 struct padata_parallel_queue {
-	struct padata_list parallel;
 	struct padata_list reorder;
-	struct work_struct work;
 	atomic_t num_obj;
 };
 
@@ -111,7 +105,7 @@ struct parallel_data {
 	struct padata_parallel_queue __percpu *pqueue;
 	struct padata_serial_queue __percpu *squeue;
 	atomic_t refcnt;
-	atomic_t seq_nr;
+	unsigned int seq_nr;
 	unsigned int processed;
 	int cpu;
 	struct padata_cpumask cpumask;
diff --git a/kernel/padata.c b/kernel/padata.c
index b05cd30f8905b..edd3ff551e262 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -32,7 +32,15 @@
 #include
 #include
 
-#define MAX_OBJ_NUM 1000
+struct padata_work {
+	struct work_struct pw_work;
+	struct list_head pw_list; /* padata_free_works linkage */
+	void *pw_data;
+};
+
+static DEFINE_SPINLOCK(padata_works_lock);
+static struct padata_work *padata_works;
+static LIST_HEAD(padata_free_works);
 
 static void padata_free_pd(struct parallel_data *pd);
 
@@ -58,30 +66,44 @@ static int padata_cpu_hash(struct parallel_data *pd, unsigned int seq_nr)
 	return padata_index_to_cpu(pd, cpu_index);
 }
 
-static void padata_parallel_worker(struct work_struct *parallel_work)
+static struct padata_work *padata_work_alloc(void)
 {
-	struct padata_parallel_queue *pqueue;
-	LIST_HEAD(local_list);
+	struct padata_work *pw;
 
-	local_bh_disable();
-	pqueue = container_of(parallel_work,
-			      struct padata_parallel_queue, work);
+	lockdep_assert_held(&padata_works_lock);
 
-	spin_lock(&pqueue->parallel.lock);
-	list_replace_init(&pqueue->parallel.list, &local_list);
-	spin_unlock(&pqueue->parallel.lock);
+	if (list_empty(&padata_free_works))
+		return NULL; /* No more work items allowed to be queued. */
 
-	while (!list_empty(&local_list)) {
-		struct padata_priv *padata;
+	pw = list_first_entry(&padata_free_works, struct padata_work, pw_list);
+	list_del(&pw->pw_list);
+	return pw;
+}
 
-		padata = list_entry(local_list.next,
-				    struct padata_priv, list);
+static void padata_work_init(struct padata_work *pw, work_func_t work_fn,
+			     void *data)
+{
+	INIT_WORK(&pw->pw_work, work_fn);
+	pw->pw_data = data;
+}
 
-		list_del_init(&padata->list);
+static void padata_work_free(struct padata_work *pw)
+{
+	lockdep_assert_held(&padata_works_lock);
+	list_add(&pw->pw_list, &padata_free_works);
+}
 
-		padata->parallel(padata);
-	}
+static void padata_parallel_worker(struct work_struct *parallel_work)
+{
+	struct padata_work *pw = container_of(parallel_work, struct padata_work,
+					      pw_work);
+	struct padata_priv *padata = pw->pw_data;
 
+	local_bh_disable();
+	padata->parallel(padata);
+	spin_lock(&padata_works_lock);
+	padata_work_free(pw);
+	spin_unlock(&padata_works_lock);
 	local_bh_enable();
 }
 
@@ -105,9 +127,9 @@ int padata_do_parallel(struct padata_shell *ps,
 		       struct padata_priv *padata, int *cb_cpu)
 {
 	struct padata_instance *pinst = ps->pinst;
-	int i, cpu, cpu_index, target_cpu, err;
-	struct padata_parallel_queue *queue;
+	int i, cpu, cpu_index, err;
 	struct parallel_data *pd;
+	struct padata_work *pw;
 
 	rcu_read_lock_bh();
 
@@ -135,25 +157,25 @@ int padata_do_parallel(struct padata_shell *ps,
 	if ((pinst->flags & PADATA_RESET))
 		goto out;
 
-	if (atomic_read(&pd->refcnt) >= MAX_OBJ_NUM)
-		goto out;
-
-	err = 0;
 	atomic_inc(&pd->refcnt);
 	padata->pd = pd;
 	padata->cb_cpu = *cb_cpu;
 
-	padata->seq_nr = atomic_inc_return(&pd->seq_nr);
-	target_cpu = padata_cpu_hash(pd, padata->seq_nr);
-	padata->cpu = target_cpu;
-	queue = per_cpu_ptr(pd->pqueue, target_cpu);
-
-	spin_lock(&queue->parallel.lock);
-	list_add_tail(&padata->list, &queue->parallel.list);
-	spin_unlock(&queue->parallel.lock);
+	rcu_read_unlock_bh();
 
-	queue_work(pinst->parallel_wq, &queue->work);
+	spin_lock(&padata_works_lock);
+	padata->seq_nr = ++pd->seq_nr;
+	pw = padata_work_alloc();
+	spin_unlock(&padata_works_lock);
+	if (pw) {
+		padata_work_init(pw, padata_parallel_worker, padata);
+		queue_work(pinst->parallel_wq, &pw->pw_work);
+	} else {
+		/* Maximum works limit exceeded, run in the current task. */
+		padata->parallel(padata);
+	}
 
+	return 0;
 out:
 	rcu_read_unlock_bh();
 
@@ -324,8 +346,9 @@ static void padata_serial_worker(struct work_struct *serial_work)
 void padata_do_serial(struct padata_priv *padata)
 {
 	struct parallel_data *pd = padata->pd;
+	int hashed_cpu = padata_cpu_hash(pd, padata->seq_nr);
 	struct padata_parallel_queue *pqueue = per_cpu_ptr(pd->pqueue,
-							   padata->cpu);
+							   hashed_cpu);
 	struct padata_priv *cur;
 
 	spin_lock(&pqueue->reorder.lock);
@@ -416,8 +439,6 @@ static void padata_init_pqueues(struct parallel_data *pd)
 		pqueue = per_cpu_ptr(pd->pqueue, cpu);
 
 		__padata_list_init(&pqueue->reorder);
-		__padata_list_init(&pqueue->parallel);
-		INIT_WORK(&pqueue->work, padata_parallel_worker);
 		atomic_set(&pqueue->num_obj, 0);
 	}
 }
@@ -451,7 +472,7 @@ static struct parallel_data *padata_alloc_pd(struct padata_shell *ps)
 
 	padata_init_pqueues(pd);
 	padata_init_squeues(pd);
-	atomic_set(&pd->seq_nr, -1);
+	pd->seq_nr = -1;
 	atomic_set(&pd->refcnt, 1);
 	spin_lock_init(&pd->lock);
 	pd->cpu = cpumask_first(pd->cpumask.pcpu);
@@ -1050,6 +1071,7 @@ EXPORT_SYMBOL(padata_free_shell);
 
 void __init padata_init(void)
 {
+	unsigned int i, possible_cpus;
 #ifdef CONFIG_HOTPLUG_CPU
 	int ret;
 
@@ -1061,13 +1083,27 @@ void __init padata_init(void)
 
 	ret = cpuhp_setup_state_multi(CPUHP_PADATA_DEAD, "padata:dead",
 				      NULL, padata_cpu_dead);
-	if (ret < 0) {
-		cpuhp_remove_multi_state(hp_online);
-		goto err;
-	}
+	if (ret < 0)
+		goto remove_online_state;
+#endif
+
+	possible_cpus = num_possible_cpus();
+	padata_works = kmalloc_array(possible_cpus, sizeof(struct padata_work),
+				     GFP_KERNEL);
+	if (!padata_works)
+		goto remove_dead_state;
+
+	for (i = 0; i < possible_cpus; ++i)
+		list_add(&padata_works[i].pw_list, &padata_free_works);
 
 	return;
+
+remove_dead_state:
+#ifdef CONFIG_HOTPLUG_CPU
+	cpuhp_remove_multi_state(CPUHP_PADATA_DEAD);
+remove_online_state:
+	cpuhp_remove_multi_state(hp_online);
 err:
-	pr_warn("padata: initialization failed\n");
 #endif
+	pr_warn("padata: initialization failed\n");
 }

From patchwork Thu Apr 30 20:11:22 2020
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 11521449
From: Daniel Jordan
To: Andrew Morton, Herbert Xu, Steffen Klassert
Cc: Alex Williamson, Alexander Duyck, Dan Williams, Dave Hansen,
 David Hildenbrand, Jason Gunthorpe, Jonathan Corbet, Josh Triplett,
 Kirill Tkhai, Michal Hocko, Pavel Machek, Pavel Tatashin, Peter Zijlstra,
 Randy Dunlap, Shile Zhang, Tejun Heo, Zi Yan,
 linux-crypto@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Daniel Jordan
Subject: [PATCH 4/7] padata: add basic support for multithreaded jobs
Date: Thu, 30 Apr 2020 16:11:22 -0400
Message-Id: <20200430201125.532129-5-daniel.m.jordan@oracle.com>
In-Reply-To: <20200430201125.532129-1-daniel.m.jordan@oracle.com>
References: <20200430201125.532129-1-daniel.m.jordan@oracle.com>

Sometimes the kernel doesn't take full advantage of system memory
bandwidth, leading to a single CPU spending excessive time in
initialization paths where the data scales with memory size.

Multithreading naturally addresses this problem. Extend padata, a
framework that handles many parallel yet singlethreaded jobs, to also
handle multithreaded jobs by adding support for splitting up the work
evenly, specifying a minimum amount of work that's appropriate for one
helper thread to do, load balancing between helpers, and coordinating
them.

This is inspired by work from Pavel Tatashin and Steve Sistare.

Signed-off-by: Daniel Jordan
---
 include/linux/padata.h |  29 ++++++++
 kernel/padata.c        | 152 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 178 insertions(+), 3 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index 3bfa503503ac5..b0affa466a841 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -4,6 +4,9 @@
  *
  * Copyright (C) 2008, 2009 secunet Security Networks AG
  * Copyright (C) 2008, 2009 Steffen Klassert
+ *
+ * Copyright (c) 2020 Oracle and/or its affiliates.
+ * Author: Daniel Jordan
  */
 
 #ifndef PADATA_H
@@ -130,6 +133,31 @@ struct padata_shell {
 	struct list_head list;
 };
 
+/**
+ * struct padata_mt_job - represents one multithreaded job
+ *
+ * @thread_fn: Called for each chunk of work that a padata thread does.
+ * @fn_arg: The thread function argument.
+ * @start: The start of the job (units are job-specific).
+ * @size: size of this node's work (units are job-specific).
+ * @align: Ranges passed to the thread function fall on this boundary, with the
+ *         possible exceptions of the beginning and end of the job.
+ * @min_chunk: The minimum chunk size in job-specific units.
This allows + * the client to communicate the minimum amount of work that's + * appropriate for one worker thread to do at once. + * @max_threads: Max threads to use for the job, actual number may be less + * depending on task size and minimum chunk size. + */ +struct padata_mt_job { + void (*thread_fn)(unsigned long start, unsigned long end, void *arg); + void *fn_arg; + unsigned long start; + unsigned long size; + unsigned long align; + unsigned long min_chunk; + int max_threads; +}; + /** * struct padata_instance - The overall control structure. * @@ -171,6 +199,7 @@ extern void padata_free_shell(struct padata_shell *ps); extern int padata_do_parallel(struct padata_shell *ps, struct padata_priv *padata, int *cb_cpu); extern void padata_do_serial(struct padata_priv *padata); +extern void __init padata_do_multithreaded(struct padata_mt_job *job); extern int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type, cpumask_var_t cpumask); extern int padata_start(struct padata_instance *pinst); diff --git a/kernel/padata.c b/kernel/padata.c index edd3ff551e262..ccb617d37677a 100644 --- a/kernel/padata.c +++ b/kernel/padata.c @@ -7,6 +7,9 @@ * Copyright (C) 2008, 2009 secunet Security Networks AG * Copyright (C) 2008, 2009 Steffen Klassert * + * Copyright (c) 2020 Oracle and/or its affiliates. + * Author: Daniel Jordan + * * This program is free software; you can redistribute it and/or modify it * under the terms and conditions of the GNU General Public License, * version 2, as published by the Free Software Foundation. @@ -21,6 +24,7 @@ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. 
*/ +#include #include #include #include @@ -32,6 +36,8 @@ #include #include +#define PADATA_WORK_ONSTACK 1 /* Work's memory is on stack */ + struct padata_work { struct work_struct pw_work; struct list_head pw_list; /* padata_free_works linkage */ @@ -42,7 +48,17 @@ static DEFINE_SPINLOCK(padata_works_lock); static struct padata_work *padata_works; static LIST_HEAD(padata_free_works); +struct padata_mt_job_state { + spinlock_t lock; + struct completion completion; + struct padata_mt_job *job; + int nworks; + int nworks_fini; + unsigned long chunk_size; +}; + static void padata_free_pd(struct parallel_data *pd); +static void __init padata_mt_helper(struct work_struct *work); static int padata_index_to_cpu(struct parallel_data *pd, int cpu_index) { @@ -81,18 +97,56 @@ static struct padata_work *padata_work_alloc(void) } static void padata_work_init(struct padata_work *pw, work_func_t work_fn, - void *data) + void *data, int flags) { - INIT_WORK(&pw->pw_work, work_fn); + if (flags & PADATA_WORK_ONSTACK) + INIT_WORK_ONSTACK(&pw->pw_work, work_fn); + else + INIT_WORK(&pw->pw_work, work_fn); pw->pw_data = data; } +static int __init padata_work_alloc_mt(int nworks, void *data, + struct list_head *head) +{ + int i; + + spin_lock(&padata_works_lock); + /* Start at 1 because the current task participates in the job. 
*/ + for (i = 1; i < nworks; ++i) { + struct padata_work *pw = padata_work_alloc(); + + if (!pw) + break; + padata_work_init(pw, padata_mt_helper, data, 0); + list_add(&pw->pw_list, head); + } + spin_unlock(&padata_works_lock); + + return i; +} + static void padata_work_free(struct padata_work *pw) { lockdep_assert_held(&padata_works_lock); list_add(&pw->pw_list, &padata_free_works); } +static void __init padata_works_free(struct list_head *works) +{ + struct padata_work *cur, *next; + + if (list_empty(works)) + return; + + spin_lock(&padata_works_lock); + list_for_each_entry_safe(cur, next, works, pw_list) { + list_del(&cur->pw_list); + padata_work_free(cur); + } + spin_unlock(&padata_works_lock); +} + static void padata_parallel_worker(struct work_struct *parallel_work) { struct padata_work *pw = container_of(parallel_work, struct padata_work, @@ -168,7 +222,7 @@ int padata_do_parallel(struct padata_shell *ps, pw = padata_work_alloc(); spin_unlock(&padata_works_lock); if (pw) { - padata_work_init(pw, padata_parallel_worker, padata); + padata_work_init(pw, padata_parallel_worker, padata, 0); queue_work(pinst->parallel_wq, &pw->pw_work); } else { /* Maximum works limit exceeded, run in the current task. */ @@ -409,6 +463,98 @@ static int pd_setup_cpumasks(struct parallel_data *pd, return err; } +static void __init padata_mt_helper(struct work_struct *w) +{ + struct padata_work *pw = container_of(w, struct padata_work, pw_work); + struct padata_mt_job_state *ps = pw->pw_data; + struct padata_mt_job *job = ps->job; + bool done; + + spin_lock(&ps->lock); + + while (job->size > 0) { + unsigned long start, size, end; + + start = job->start; + /* So end is chunk size aligned if enough work remains. 
*/ + size = roundup(start + 1, ps->chunk_size) - start; + size = min(size, job->size); + end = start + size; + + job->start = end; + job->size -= size; + + spin_unlock(&ps->lock); + job->thread_fn(start, end, job->fn_arg); + spin_lock(&ps->lock); + } + + ++ps->nworks_fini; + done = (ps->nworks_fini == ps->nworks); + spin_unlock(&ps->lock); + + if (done) + complete(&ps->completion); +} + +/** + * padata_do_multithreaded - run a multithreaded job + * @job: Description of the job. + * + * See the definition of struct padata_mt_job for more details. + */ +void __init padata_do_multithreaded(struct padata_mt_job *job) +{ + /* In case threads finish at different times. */ + static const unsigned long load_balance_factor = 4; + struct padata_work my_work, *pw; + struct padata_mt_job_state ps; + LIST_HEAD(works); + int nworks; + + if (job->size == 0) + return; + + /* Ensure at least one thread when size < min_chunk. */ + nworks = max(job->size / job->min_chunk, 1ul); + nworks = min(nworks, job->max_threads); + + if (nworks == 1) { + /* Single thread, no coordination needed, cut to the chase. */ + job->thread_fn(job->start, job->start + job->size, job->fn_arg); + return; + } + + spin_lock_init(&ps.lock); + init_completion(&ps.completion); + ps.job = job; + ps.nworks = padata_work_alloc_mt(nworks, &ps, &works); + ps.nworks_fini = 0; + + /* + * Chunk size is the amount of work a helper does per call to the + * thread function. Load balance large jobs between threads by + * increasing the number of chunks, guarantee at least the minimum + * chunk size from the caller, and honor the caller's alignment. + */ + ps.chunk_size = job->size / (ps.nworks * load_balance_factor); + ps.chunk_size = max(ps.chunk_size, job->min_chunk); + ps.chunk_size = roundup(ps.chunk_size, job->align); + + list_for_each_entry(pw, &works, pw_list) + queue_work(system_unbound_wq, &pw->pw_work); + + /* Use the current thread, which saves starting a workqueue worker. 
*/ + padata_work_init(&my_work, padata_mt_helper, &ps, PADATA_WORK_ONSTACK); + padata_mt_helper(&my_work.pw_work); + + /* Wait for all the helpers to finish. */ + wait_for_completion(&ps.completion); + + destroy_work_on_stack(&my_work.pw_work); + padata_works_free(&works); +} + static void __padata_list_init(struct padata_list *pd_list) { INIT_LIST_HEAD(&pd_list->list); From patchwork Thu Apr 30 20:11:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Jordan X-Patchwork-Id: 11521451 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B477C81 for ; Thu, 30 Apr 2020 20:14:00 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 744E320774 for ; Thu, 30 Apr 2020 20:14:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="m+I2zk3N" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 744E320774 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id AB7F18E0007; Thu, 30 Apr 2020 16:13:54 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 9CB118E0008; Thu, 30 Apr 2020 16:13:54 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 732E28E0007; Thu, 30 Apr 2020 16:13:54 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0181.hostedemail.com [216.40.44.181]) by kanga.kvack.org (Postfix) with ESMTP 
From: Daniel Jordan
To: Andrew Morton, Herbert Xu, Steffen Klassert
Cc: Alex Williamson, Alexander Duyck, Dan Williams, Dave Hansen,
    David Hildenbrand, Jason Gunthorpe, Jonathan Corbet, Josh Triplett,
    Kirill Tkhai, Michal Hocko, Pavel Machek, Pavel Tatashin,
    Peter Zijlstra, Randy Dunlap, Shile Zhang, Tejun Heo, Zi Yan,
    linux-crypto@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Daniel Jordan
Subject: [PATCH 5/7] mm: move zone iterator outside of deferred_init_maxorder()
Date: Thu, 30 Apr 2020 16:11:23 -0400
Message-Id: <20200430201125.532129-6-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20200430201125.532129-1-daniel.m.jordan@oracle.com>
References: <20200430201125.532129-1-daniel.m.jordan@oracle.com>
MIME-Version: 1.0

padata will soon divide up pfn ranges between threads when parallelizing
deferred init, and deferred_init_maxorder() complicates that by using an
opaque index in addition to start and end pfns. Move the index outside
the function to make splitting the job easier, and simplify the code while
at it.

deferred_init_maxorder() now always iterates within a single pfn range
instead of potentially multiple ranges, and advances start_pfn to the end
of that range instead of the max-order block so partial pfn ranges in the
block aren't skipped in a later iteration. The section alignment check in
deferred_grow_zone() is removed as well since this alignment is no longer
guaranteed. It's not clear what value the alignment provided originally.

Signed-off-by: Daniel Jordan
---
 mm/page_alloc.c | 88 +++++++++++++++----------------------------------
 1 file changed, 27 insertions(+), 61 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 68669d3a5a665..990514d8f0d94 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1708,55 +1708,23 @@ deferred_init_mem_pfn_range_in_zone(u64 *i, struct zone *zone,
 }
 
 /*
- * Initialize and free pages. We do it in two loops: first we initialize
- * struct page, then free to buddy allocator, because while we are
- * freeing pages we can access pages that are ahead (computing buddy
- * page in __free_one_page()).
- *
- * In order to try and keep some memory in the cache we have the loop
- * broken along max page order boundaries. This way we will not cause
- * any issues with the buddy page computation.
+ * Initialize the struct pages and then free them to the buddy allocator at
+ * most a max order block at a time because while we are freeing pages we can
+ * access pages that are ahead (computing buddy page in __free_one_page()).
+ * It's also cache friendly.
  */
 static unsigned long __init
-deferred_init_maxorder(u64 *i, struct zone *zone, unsigned long *start_pfn,
-		       unsigned long *end_pfn)
+deferred_init_maxorder(struct zone *zone, unsigned long *start_pfn,
+		       unsigned long end_pfn)
 {
-	unsigned long mo_pfn = ALIGN(*start_pfn + 1, MAX_ORDER_NR_PAGES);
-	unsigned long spfn = *start_pfn, epfn = *end_pfn;
-	unsigned long nr_pages = 0;
-	u64 j = *i;
-
-	/* First we loop through and initialize the page values */
-	for_each_free_mem_pfn_range_in_zone_from(j, zone, start_pfn, end_pfn) {
-		unsigned long t;
-
-		if (mo_pfn <= *start_pfn)
-			break;
-
-		t = min(mo_pfn, *end_pfn);
-		nr_pages += deferred_init_pages(zone, *start_pfn, t);
-
-		if (mo_pfn < *end_pfn) {
-			*start_pfn = mo_pfn;
-			break;
-		}
-	}
-
-	/* Reset values and now loop through freeing pages as needed */
-	swap(j, *i);
-
-	for_each_free_mem_pfn_range_in_zone_from(j, zone, &spfn, &epfn) {
-		unsigned long t;
-
-		if (mo_pfn <= spfn)
-			break;
+	unsigned long nr_pages, pfn;
 
-		t = min(mo_pfn, epfn);
-		deferred_free_pages(spfn, t);
+	pfn = ALIGN(*start_pfn + 1, MAX_ORDER_NR_PAGES);
+	pfn = min(pfn, end_pfn);
 
-		if (mo_pfn <= epfn)
-			break;
-	}
+	nr_pages = deferred_init_pages(zone, *start_pfn, pfn);
+	deferred_free_pages(*start_pfn, pfn);
+	*start_pfn = pfn;
 
 	return nr_pages;
 }
@@ -1814,9 +1782,11 @@ static int __init deferred_init_memmap(void *data)
 	 * that we can avoid introducing any issues with the buddy
 	 * allocator.
 	 */
-	while (spfn < epfn) {
-		nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
-		cond_resched();
+	for_each_free_mem_pfn_range_in_zone_from(i, zone, &spfn, &epfn) {
+		while (spfn < epfn) {
+			nr_pages += deferred_init_maxorder(zone, &spfn, epfn);
+			cond_resched();
+		}
 	}
 zone_empty:
 	/* Sanity check that the next zone really is unpopulated */
@@ -1883,22 +1853,18 @@ deferred_grow_zone(struct zone *zone, unsigned int order)
 	 * that we can avoid introducing any issues with the buddy
 	 * allocator.
 	 */
-	while (spfn < epfn) {
-		/* update our first deferred PFN for this section */
-		first_deferred_pfn = spfn;
-
-		nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
-		touch_nmi_watchdog();
-
-		/* We should only stop along section boundaries */
-		if ((first_deferred_pfn ^ spfn) < PAGES_PER_SECTION)
-			continue;
-
-		/* If our quota has been met we can stop here */
-		if (nr_pages >= nr_pages_needed)
-			break;
+	for_each_free_mem_pfn_range_in_zone_from(i, zone, &spfn, &epfn) {
+		while (spfn < epfn) {
+			nr_pages += deferred_init_maxorder(zone, &spfn, epfn);
+			touch_nmi_watchdog();
+
+			/* If our quota has been met we can stop here */
+			if (nr_pages >= nr_pages_needed)
+				goto out;
+		}
 	}
 
+out:
 	pgdat->first_deferred_pfn = spfn;
 	pgdat_resize_unlock(pgdat, &flags);
 

From patchwork Thu Apr 30 20:11:24 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 11521411
From: Daniel Jordan
To: Andrew Morton, Herbert Xu, Steffen Klassert
Cc: Alex Williamson, Alexander Duyck, Dan Williams, Dave Hansen,
    David Hildenbrand, Jason Gunthorpe, Jonathan Corbet, Josh Triplett,
    Kirill Tkhai, Michal Hocko, Pavel Machek, Pavel Tatashin,
    Peter Zijlstra, Randy Dunlap, Shile Zhang, Tejun Heo, Zi Yan,
    linux-crypto@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Daniel Jordan
Subject: [PATCH 6/7] mm: parallelize deferred_init_memmap()
Date: Thu, 30 Apr 2020 16:11:24 -0400
Message-Id: <20200430201125.532129-7-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20200430201125.532129-1-daniel.m.jordan@oracle.com>
References: <20200430201125.532129-1-daniel.m.jordan@oracle.com>
MIME-Version: 1.0

Deferred struct page init uses one thread per node, which is a significant
bottleneck at boot for big machines--often the largest. Parallelize to
reduce system downtime.

The maximum number of threads is capped at the number of CPUs on the node
because speedups always improve with additional threads on every system
tested, and at this phase of boot, the system is otherwise idle and
waiting on page init to finish.

Helper threads operate on MAX_ORDER_NR_PAGES-aligned ranges to avoid
accessing uninitialized buddy pages, so set the job's alignment
accordingly.

The minimum chunk size is also MAX_ORDER_NR_PAGES because there was
benefit to using multiple threads even on relatively small memory (1G)
systems.
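
For illustration, the sizing rules above can be worked through in a small
user-space C sketch. It is not kernel code: round_up_to(), plan_nworks(),
and plan_chunk_size() are hypothetical helper names, and the arithmetic is
modeled on padata_do_multithreaded() from patch 4/7 of this series (a load
balance factor of 4, at least min_chunk per chunk, rounded up to the
alignment). It assumes nonzero min_chunk, align, and max_threads, and it
ignores the case where fewer padata works can be allocated than requested.

```c
#include <assert.h>

/* Mirrors load_balance_factor in patch 4/7: aim for ~4 chunks per thread. */
#define LOAD_BALANCE_FACTOR 4UL

/* Round x up to the next multiple of 'multiple' (must be nonzero). */
static unsigned long round_up_to(unsigned long x, unsigned long multiple)
{
	assert(multiple > 0);
	return ((x + multiple - 1) / multiple) * multiple;
}

/* Hypothetical helper: how many threads a job of 'size' units would use. */
static unsigned long plan_nworks(unsigned long size, unsigned long min_chunk,
				 unsigned long max_threads)
{
	unsigned long nworks;

	assert(min_chunk > 0 && max_threads > 0);
	nworks = size / min_chunk;	/* one thread per min_chunk of work */
	if (nworks < 1)
		nworks = 1;		/* at least one when size < min_chunk */
	if (nworks > max_threads)
		nworks = max_threads;
	return nworks;
}

/* Hypothetical helper: units of work handed out per thread_fn call. */
static unsigned long plan_chunk_size(unsigned long size, unsigned long nworks,
				     unsigned long min_chunk,
				     unsigned long align)
{
	unsigned long chunk = size / (nworks * LOAD_BALANCE_FACTOR);

	if (chunk < min_chunk)
		chunk = min_chunk;	/* never hand out less than min_chunk */
	return round_up_to(chunk, align);
}
```

Taking align and min_chunk as 1024 pfns (the value of MAX_ORDER_NR_PAGES
on a 4K-page configuration with MAX_ORDER of 11, which is an assumption
here), a 1G range of 262144 pfns with 4 CPUs gives plan_nworks() == 4 and
plan_chunk_size() == 16384, so the range splits into 16 chunks, roughly
four per helper.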

Intel(R) Xeon(R) Platinum 8167M CPU @ 2.00GHz (Skylake, bare metal)
  2 nodes * 26 cores * 2 threads = 104 CPUs
  384G/node = 768G memory

                   kernel boot                 deferred init
                   ------------------------    ------------------------
                   speedup  time_ms (stdev)    speedup  time_ms (stdev)
      base              --   4056.7 (  5.5)         --   1763.3 (  4.2)
      test           39.9%   2436.7 (  2.1)      91.8%    144.3 (  5.9)

Intel(R) Xeon(R) CPU E5-2699C v4 @ 2.20GHz (Broadwell, bare metal)
  1 node * 16 cores * 2 threads = 32 CPUs
  192G/node = 192G memory

                   kernel boot                 deferred init
                   ------------------------    ------------------------
                   speedup  time_ms (stdev)    speedup  time_ms (stdev)
      base              --   1957.3 ( 14.0)         --   1093.7 ( 12.9)
      test           49.1%    996.0 (  7.2)      88.4%    127.3 (  5.1)

Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz (Haswell, bare metal)
  2 nodes * 18 cores * 2 threads = 72 CPUs
  128G/node = 256G memory

                   kernel boot                 deferred init
                   ------------------------    ------------------------
                   speedup  time_ms (stdev)    speedup  time_ms (stdev)
      base              --   1666.0 (  3.5)         --    618.0 (  3.5)
      test           31.3%   1145.3 (  1.5)      85.6%     89.0 (  1.7)

AMD EPYC 7551 32-Core Processor (Zen, kvm guest)
  1 node * 8 cores * 2 threads = 16 CPUs
  64G/node = 64G memory

                   kernel boot                 deferred init
                   ------------------------    ------------------------
                   speedup  time_ms (stdev)    speedup  time_ms (stdev)
      base              --   1029.7 ( 42.3)         --    253.7 (  3.1)
      test           23.3%    789.3 ( 15.0)      76.3%     60.0 (  5.6)

Server-oriented distros that enable deferred page init sometimes run in
small VMs, and they still benefit even though the fraction of boot time
saved is smaller:

AMD EPYC 7551 32-Core Processor (Zen, kvm guest)
  1 node * 2 cores * 2 threads = 4 CPUs
  16G/node = 16G memory

                   kernel boot                 deferred init
                   ------------------------    ------------------------
                   speedup  time_ms (stdev)    speedup  time_ms (stdev)
      base              --    757.7 ( 17.1)         --     57.0 (  0.0)
      test            6.2%    710.3 ( 15.0)      63.2%     21.0 (  0.0)

Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz (Haswell, kvm guest)
  1 node * 2 cores * 2 threads = 4 CPUs
  14G/node = 14G memory

                   kernel boot                 deferred init
                   ------------------------    ------------------------
                   speedup  time_ms (stdev)    speedup  time_ms (stdev)
      base              --    656.3 (  7.1)         --     57.3 (  1.5)
      test            8.6%    599.7 (  5.9)      62.8%     21.3 (  1.2)

Signed-off-by: Daniel Jordan
---
 mm/Kconfig      |  6 +++---
 mm/page_alloc.c | 46 ++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index ab80933be65ff..e5007206c7601 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -622,13 +622,13 @@ config DEFERRED_STRUCT_PAGE_INIT
 	depends on SPARSEMEM
 	depends on !NEED_PER_CPU_KM
 	depends on 64BIT
+	select PADATA
 	help
 	  Ordinarily all struct pages are initialised during early boot in a
 	  single thread. On very large machines this can take a considerable
 	  amount of time. If this option is set, large machines will bring up
-	  a subset of memmap at boot and then initialise the rest in parallel
-	  by starting one-off "pgdatinitX" kernel thread for each node X. This
-	  has a potential performance impact on processes running early in the
+	  a subset of memmap at boot and then initialise the rest in parallel.
+	  This has a potential performance impact on tasks running early in the
 	  lifetime of the system until these kthreads finish the
 	  initialisation.
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 990514d8f0d94..96d6d0d920c27 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -68,6 +68,7 @@ #include #include #include +#include #include #include @@ -1729,6 +1730,25 @@ deferred_init_maxorder(struct zone *zone, unsigned long *start_pfn, return nr_pages; } +struct def_init_args { + struct zone *zone; + atomic_long_t nr_pages; +}; + +static void __init deferred_init_memmap_chunk(unsigned long spfn, + unsigned long epfn, void *arg) +{ + struct def_init_args *args = arg; + unsigned long nr_pages = 0; + + while (spfn < epfn) { + nr_pages += deferred_init_maxorder(args->zone, &spfn, epfn); + cond_resched(); + } + + atomic_long_add(nr_pages, &args->nr_pages); +} + /* Initialise remaining memory on a node */ static int __init deferred_init_memmap(void *data) { @@ -1738,7 +1758,7 @@ static int __init deferred_init_memmap(void *data) unsigned long first_init_pfn, flags; unsigned long start = jiffies; struct zone *zone; - int zid; + int zid, max_threads; u64 i; /* Bind memory initialisation thread to a local node if possible */ @@ -1778,15 +1798,25 @@ static int __init deferred_init_memmap(void *data) goto zone_empty; /* - * Initialize and free pages in MAX_ORDER sized increments so - * that we can avoid introducing any issues with the buddy - * allocator. + * More CPUs always led to greater speedups on tested systems, up to + * all the nodes' CPUs. Use all since the system is otherwise idle now. 
 	 */
+	max_threads = max(cpumask_weight(cpumask), 1u);
+
 	for_each_free_mem_pfn_range_in_zone_from(i, zone, &spfn, &epfn) {
-		while (spfn < epfn) {
-			nr_pages += deferred_init_maxorder(zone, &spfn, epfn);
-			cond_resched();
-		}
+		struct def_init_args args = { zone, ATOMIC_LONG_INIT(0) };
+		struct padata_mt_job job = {
+			.thread_fn = deferred_init_memmap_chunk,
+			.fn_arg = &args,
+			.start = spfn,
+			.size = epfn - spfn,
+			.align = MAX_ORDER_NR_PAGES,
+			.min_chunk = MAX_ORDER_NR_PAGES,
+			.max_threads = max_threads,
+		};
+
+		padata_do_multithreaded(&job);
+		nr_pages += atomic_long_read(&args.nr_pages);
 	}
 zone_empty:
 	/* Sanity check that the next zone really is unpopulated */

From patchwork Thu Apr 30 20:11:25 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 11521413
From: Daniel Jordan
To: Andrew Morton, Herbert Xu, Steffen Klassert
Cc: Alex Williamson, Alexander Duyck, Dan Williams, Dave Hansen,
 David Hildenbrand, Jason Gunthorpe, Jonathan Corbet, Josh Triplett,
 Kirill Tkhai, Michal Hocko, Pavel Machek, Pavel Tatashin, Peter Zijlstra,
 Randy Dunlap, Shile Zhang, Tejun Heo, Zi Yan, linux-crypto@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Daniel Jordan
Subject: [PATCH 7/7] padata: document multithreaded jobs
Date: Thu, 30 Apr 2020 16:11:25 -0400
Message-Id: <20200430201125.532129-8-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20200430201125.532129-1-daniel.m.jordan@oracle.com>
References: <20200430201125.532129-1-daniel.m.jordan@oracle.com>
MIME-Version: 1.0
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Add Documentation for multithreaded jobs.

Signed-off-by: Daniel Jordan
---
 Documentation/core-api/padata.rst | 41 +++++++++++++++++++++++--------
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/Documentation/core-api/padata.rst b/Documentation/core-api/padata.rst
index 9a24c111781d9..b7e047af993e8 100644
--- a/Documentation/core-api/padata.rst
+++ b/Documentation/core-api/padata.rst
@@ -4,23 +4,26 @@
 The padata parallel execution mechanism
 =======================================

-:Date: December 2019
+:Date: April 2020

 Padata is a mechanism by which the kernel can farm jobs out to be done in
-parallel on multiple CPUs while retaining their ordering. It was developed for
-use with the IPsec code, which needs to be able to perform encryption and
-decryption on large numbers of packets without reordering those packets. The
-crypto developers made a point of writing padata in a sufficiently general
-fashion that it could be put to other uses as well.
+parallel on multiple CPUs while optionally retaining their ordering.

-Usage
-=====
+It was originally developed for IPsec, which needs to perform encryption and
+decryption on large numbers of packets without reordering those packets. This
+is currently the sole consumer of padata's serialized job support.
+
+Padata also supports multithreaded jobs, splitting up the job evenly while load
+balancing and coordinating between threads.
+
+Running Serialized Jobs
+=======================

 Initializing
 ------------

-The first step in using padata is to set up a padata_instance structure for
-overall control of how jobs are to be run::
+The first step in using padata to run parallel jobs is to set up a
+padata_instance structure for overall control of how jobs are to be run::

     #include

@@ -162,6 +165,24 @@ functions that correspond to the allocation in reverse::
 It is the user's responsibility to ensure all outstanding jobs are complete
 before any of the above are called.

+Running Multithreaded Jobs
+==========================
+
+A multithreaded job has a main thread and zero or more helper threads, with the
+main thread participating in the job and then waiting until all helpers have
+finished. padata splits the job into units called chunks, where a chunk is a
+piece of the job that one thread completes in one call to the thread function.
+
+A user has to do three things to run a multithreaded job. First, describe the
+job by defining a padata_mt_job structure, which is explained in the Interface
+section. This includes a pointer to the thread function, which padata will
+call each time it assigns a job chunk to a thread. Then, define the thread
+function, which accepts three arguments, ``start``, ``end``, and ``arg``, where
+the first two delimit the range that the thread operates on and the last is a
+pointer to the job's shared state, if any. Prepare the shared state, which is
+typically a stack-allocated structure that wraps the required data. Last, call
+padata_do_multithreaded(), which will return once the job is finished.
+
 Interface
 =========