From patchwork Thu Jan 6 00:46:41 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704962
Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1])
 by aserp3020.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 2060VfVe076226;
 Thu, 6 Jan 2022 00:47:18 GMT
From: Daniel Jordan
To: Alexander Duyck, Alex Williamson, Andrew Morton, Ben Segall,
 Cornelia Huck, Dan Williams, Dave Hansen, Dietmar Eggemann, Herbert Xu,
 Ingo Molnar, Jason Gunthorpe, Johannes Weiner, Josh Triplett,
 Michal Hocko, Nico Pache, Pasha Tatashin, Peter Zijlstra,
 Steffen Klassert, Steve Sistare, Tejun Heo, Tim Chen, Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 01/16] padata: Remove __init from multithreading functions
Date: Wed, 5 Jan 2022 19:46:41 -0500
Message-Id: <20220106004656.126790-2-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 06 Jan 2022 00:47:16.2382 (UTC)
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

A non-__init caller will need them soon.

Signed-off-by: Daniel Jordan
---
 include/linux/padata.h |  2 +-
 kernel/padata.c        | 11 +++++------
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index a433f13fc4bf..0dc031d54742 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -188,7 +188,7 @@ extern void padata_free_shell(struct padata_shell *ps);
 extern int padata_do_parallel(struct padata_shell *ps,
 			      struct padata_priv *padata, int *cb_cpu);
 extern void padata_do_serial(struct padata_priv *padata);
-extern void __init padata_do_multithreaded(struct padata_mt_job *job);
+extern void padata_do_multithreaded(struct padata_mt_job *job);
 extern int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
 			      cpumask_var_t cpumask);
 #endif
diff --git a/kernel/padata.c b/kernel/padata.c
index d4d3ba6e1728..5d13920d2a12 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -58,7 +58,7 @@ struct padata_mt_job_state {
 };
 
 static void padata_free_pd(struct parallel_data *pd);
-static void __init padata_mt_helper(struct work_struct *work);
+static void padata_mt_helper(struct work_struct *work);
 
 static int padata_index_to_cpu(struct parallel_data *pd, int cpu_index)
 {
@@ -106,8 +106,7 @@ static void padata_work_init(struct padata_work *pw, work_func_t work_fn,
 	pw->pw_data = data;
 }
 
-static int __init padata_work_alloc_mt(int nworks, void *data,
-				       struct list_head *head)
+static int padata_work_alloc_mt(int nworks, void *data, struct list_head *head)
 {
 	int i;
 
@@ -132,7 +131,7 @@ static void padata_work_free(struct padata_work *pw)
 	list_add(&pw->pw_list, &padata_free_works);
 }
 
-static void __init padata_works_free(struct list_head *works)
+static void padata_works_free(struct list_head *works)
 {
 	struct padata_work *cur, *next;
 
@@ -438,7 +437,7 @@ static int padata_setup_cpumasks(struct padata_instance *pinst)
 	return err;
 }
 
-static void __init padata_mt_helper(struct work_struct *w)
+static void padata_mt_helper(struct work_struct *w)
 {
 	struct padata_work *pw = container_of(w, struct padata_work, pw_work);
 	struct padata_mt_job_state *ps = pw->pw_data;
@@ -478,7 +477,7 @@ static void __init padata_mt_helper(struct work_struct *w)
  *
  * See the definition of struct padata_mt_job for more details.
  */
-void __init padata_do_multithreaded(struct padata_mt_job *job)
+void padata_do_multithreaded(struct padata_mt_job *job)
 {
 	/* In case threads finish at different times. */
 	static const unsigned long load_balance_factor = 4;

From patchwork Thu Jan 6 00:46:42 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704963
Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1])
 by aserp3020.oracle.com
 (8.16.1.2/8.16.1.2) with SMTP id 2060VgcA076335;
 Thu, 6 Jan 2022 00:47:20 GMT
From: Daniel Jordan
To: Alexander Duyck, Alex Williamson, Andrew Morton, Ben Segall,
 Cornelia Huck, Dan Williams, Dave Hansen, Dietmar Eggemann, Herbert Xu,
 Ingo Molnar, Jason Gunthorpe, Johannes Weiner, Josh Triplett,
 Michal Hocko, Nico Pache, Pasha Tatashin, Peter Zijlstra,
 Steffen Klassert, Steve Sistare, Tejun Heo, Tim Chen, Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 02/16] padata: Return first error from a job
Date: Wed, 5 Jan 2022 19:46:42 -0500
Message-Id: <20220106004656.126790-3-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 06 Jan 2022 00:47:18.9473 (UTC)
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

The only current user of multithreaded jobs, deferred struct page init,
can't fail, but soon the world won't be perfect anymore. Return the
first error encountered during a job.

Threads can fail for different reasons, which may need special handling
in the future, but returning the first will do for the upcoming new user
because the kernel unwinds the same way no matter the error.

Signed-off-by: Daniel Jordan
---
 include/linux/padata.h |  5 +++--
 kernel/padata.c        | 22 ++++++++++++++++------
 mm/page_alloc.c        |  4 +++-
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index 0dc031d54742..1c8670a24ccf 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -126,6 +126,7 @@ struct padata_shell {
  * struct padata_mt_job - represents one multithreaded job
  *
  * @thread_fn: Called for each chunk of work that a padata thread does.
+ *	       Returns 0 or client-specific nonzero error code.
  * @fn_arg: The thread function argument.
  * @start: The start of the job (units are job-specific).
  * @size: size of this node's work (units are job-specific).
@@ -138,7 +139,7 @@ struct padata_shell {
  *		  depending on task size and minimum chunk size.
  */
 struct padata_mt_job {
-	void (*thread_fn)(unsigned long start, unsigned long end, void *arg);
+	int (*thread_fn)(unsigned long start, unsigned long end, void *arg);
 	void			*fn_arg;
 	unsigned long		start;
 	unsigned long		size;
@@ -188,7 +189,7 @@ extern void padata_free_shell(struct padata_shell *ps);
 extern int padata_do_parallel(struct padata_shell *ps,
 			      struct padata_priv *padata, int *cb_cpu);
 extern void padata_do_serial(struct padata_priv *padata);
-extern void padata_do_multithreaded(struct padata_mt_job *job);
+extern int padata_do_multithreaded(struct padata_mt_job *job);
 extern int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
 			      cpumask_var_t cpumask);
 #endif
diff --git a/kernel/padata.c b/kernel/padata.c
index 5d13920d2a12..1596ca22b316 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -54,6 +54,7 @@ struct padata_mt_job_state {
 	struct padata_mt_job	*job;
 	int			nworks;
 	int			nworks_fini;
+	int			error; /* first error from thread_fn */
 	unsigned long		chunk_size;
 };
 
@@ -446,8 +447,9 @@ static void padata_mt_helper(struct work_struct *w)
 
 	spin_lock(&ps->lock);
 
-	while (job->size > 0) {
+	while (job->size > 0 && ps->error == 0) {
 		unsigned long start, size, end;
+		int ret;
 
 		start = job->start;
 		/* So end is chunk size aligned if enough work remains. */
@@ -459,8 +461,12 @@ static void padata_mt_helper(struct work_struct *w)
 		job->size -= size;
 
 		spin_unlock(&ps->lock);
-		job->thread_fn(start, end, job->fn_arg);
+		ret = job->thread_fn(start, end, job->fn_arg);
 		spin_lock(&ps->lock);
+
+		/* Save first error code only. */
+		if (ps->error == 0)
+			ps->error = ret;
 	}
 	++ps->nworks_fini;
 
@@ -476,8 +482,10 @@ static void padata_mt_helper(struct work_struct *w)
  * @job: Description of the job.
  *
  * See the definition of struct padata_mt_job for more details.
+ *
+ * Return: 0 or a client-specific nonzero error code.
  */
-void padata_do_multithreaded(struct padata_mt_job *job)
+int padata_do_multithreaded(struct padata_mt_job *job)
 {
 	/* In case threads finish at different times. */
 	static const unsigned long load_balance_factor = 4;
@@ -487,7 +495,7 @@ void padata_do_multithreaded(struct padata_mt_job *job)
 	int nworks;
 
 	if (job->size == 0)
-		return;
+		return 0;
 
 	/* Ensure at least one thread when size < min_chunk. */
 	nworks = max(job->size / job->min_chunk, 1ul);
@@ -495,8 +503,8 @@ void padata_do_multithreaded(struct padata_mt_job *job)
 
 	if (nworks == 1) {
 		/* Single thread, no coordination needed, cut to the chase. */
-		job->thread_fn(job->start, job->start + job->size, job->fn_arg);
-		return;
+		return job->thread_fn(job->start, job->start + job->size,
+				      job->fn_arg);
 	}
 
 	spin_lock_init(&ps.lock);
@@ -504,6 +512,7 @@ void padata_do_multithreaded(struct padata_mt_job *job)
 	ps.job	       = job;
 	ps.nworks      = padata_work_alloc_mt(nworks, &ps, &works);
 	ps.nworks_fini = 0;
+	ps.error       = 0;
 
 	/*
 	 * Chunk size is the amount of work a helper does per call to the
@@ -527,6 +536,7 @@ void padata_do_multithreaded(struct padata_mt_job *job)
 
 	destroy_work_on_stack(&my_work.pw_work);
 	padata_works_free(&works);
+	return ps.error;
 }
 
 static void __padata_list_init(struct padata_list *pd_list)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index eeb3a9cb36bb..039786d840cf 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2018,7 +2018,7 @@ deferred_init_maxorder(u64 *i, struct zone *zone, unsigned long *start_pfn,
 	return nr_pages;
 }
 
-static void __init
+static int __init
 deferred_init_memmap_chunk(unsigned long start_pfn, unsigned long end_pfn,
 			   void *arg)
 {
@@ -2036,6 +2036,8 @@ deferred_init_memmap_chunk(unsigned long start_pfn, unsigned long end_pfn,
 		deferred_init_maxorder(&i, zone, &spfn, &epfn);
 		cond_resched();
 	}
+
+	return 0;
 }
 
 /* An arch may override for more concurrency. */

From patchwork Thu Jan 6 00:46:43 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704965
Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1])
 by aserp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 2060W8mm102657;
 Thu, 6
 Jan 2022 00:47:24 GMT
From: Daniel Jordan
To: Alexander Duyck, Alex Williamson, Andrew Morton, Ben Segall,
 Cornelia Huck, Dan Williams, Dave Hansen, Dietmar Eggemann, Herbert Xu,
 Ingo Molnar, Jason Gunthorpe, Johannes Weiner, Josh Triplett,
 Michal Hocko, Nico Pache, Pasha Tatashin, Peter Zijlstra,
 Steffen Klassert, Steve Sistare, Tejun Heo, Tim Chen, Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 03/16] padata: Add undo support
Date: Wed, 5 Jan 2022 19:46:43 -0500
Message-Id: <20220106004656.126790-4-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 06 Jan 2022
00:47:21.6892 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: TgV5ZF0ycnEdQ8MMR8lM2dBmWEKt6vqZzn1iDxgixu9I44MejpjSe1mY7oZ+/WrIisEc17vhEhcr2QxVrGFlmBt8GGeb/OUxPIoz0FuMm4o=
X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4422
X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10218 signatures=668683
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 suspectscore=0 adultscore=0 mlxlogscore=634 phishscore=0 malwarescore=0 mlxscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2112160000 definitions=main-2201060001
X-Proofpoint-ORIG-GUID: IckvHlTAip73IL8Ep3N-9WoqAYcw_KlT
X-Proofpoint-GUID: IckvHlTAip73IL8Ep3N-9WoqAYcw_KlT
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

Jobs can fail midway through their work. To recover, the finished
chunks of work need to be undone in a job-specific way.

Let padata_do_multithreaded callers specify an "undo" callback
responsible for undoing one chunk of a job. To avoid multiple levels of
error handling, do not allow the callback to fail. Undoing is
singlethreaded to keep it simple and because it's a slow path.

Signed-off-by: Daniel Jordan
---
 include/linux/padata.h |   6 +++
 kernel/padata.c        | 113 +++++++++++++++++++++++++++++++++++------
 2 files changed, 103 insertions(+), 16 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index 1c8670a24ccf..2a9fa459463d 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -135,6 +135,10 @@ struct padata_shell {
  * @min_chunk: The minimum chunk size in job-specific units. This allows
  *             the client to communicate the minimum amount of work that's
  *             appropriate for one worker thread to do at once.
+ * @undo_fn: A function that undoes one chunk of the task per call. If
+ *           error(s) occur during the job, this is called on all successfully
+ *           completed chunks. The chunk(s) in which failure occurs should be
+ *           handled in the thread function.
  * @max_threads: Max threads to use for the job, actual number may be less
  *               depending on task size and minimum chunk size.
  */
@@ -145,6 +149,8 @@ struct padata_mt_job {
 	unsigned long		size;
 	unsigned long		align;
 	unsigned long		min_chunk;
+
+	void (*undo_fn)(unsigned long start, unsigned long end, void *arg);
 	int			max_threads;
 };

diff --git a/kernel/padata.c b/kernel/padata.c
index 1596ca22b316..d0876f861464 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -29,6 +29,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -42,6 +43,10 @@ struct padata_work {
 	struct work_struct	pw_work;
 	struct list_head	pw_list; /* padata_free_works linkage */
 	void			*pw_data;
+	/* holds job units from padata_mt_job::start to pw_error_start */
+	unsigned long		pw_error_offset;
+	unsigned long		pw_error_start;
+	unsigned long		pw_error_end;
 };

 static DEFINE_SPINLOCK(padata_works_lock);
@@ -56,6 +61,9 @@ struct padata_mt_job_state {
 	int			nworks_fini;
 	int			error; /* first error from thread_fn */
 	unsigned long		chunk_size;
+	unsigned long		position;
+	unsigned long		remaining_size;
+	struct list_head	failed_works;
 };

 static void padata_free_pd(struct parallel_data *pd);
@@ -447,26 +455,38 @@ static void padata_mt_helper(struct work_struct *w)

 	spin_lock(&ps->lock);

-	while (job->size > 0 && ps->error == 0) {
-		unsigned long start, size, end;
+	while (ps->remaining_size > 0 && ps->error == 0) {
+		unsigned long position, position_offset, size, end;
 		int ret;

-		start = job->start;
+		position_offset = job->size - ps->remaining_size;
+		position = ps->position;
 		/* So end is chunk size aligned if enough work remains. */
-		size = roundup(start + 1, ps->chunk_size) - start;
-		size = min(size, job->size);
-		end = start + size;
+		size = roundup(position + 1, ps->chunk_size) - position;
+		size = min(size, ps->remaining_size);
+		end = position + size;

-		job->start = end;
-		job->size -= size;
+		ps->position = end;
+		ps->remaining_size -= size;

 		spin_unlock(&ps->lock);
-		ret = job->thread_fn(start, end, job->fn_arg);
+
+		ret = job->thread_fn(position, end, job->fn_arg);
+
 		spin_lock(&ps->lock);

-		/* Save first error code only. */
-		if (ps->error == 0)
-			ps->error = ret;
+		if (ret) {
+			/* Save first error code only. */
+			if (ps->error == 0)
+				ps->error = ret;
+			/* Save information about where the job failed. */
+			if (job->undo_fn) {
+				list_move(&pw->pw_list, &ps->failed_works);
+				pw->pw_error_start = position;
+				pw->pw_error_offset = position_offset;
+				pw->pw_error_end = end;
+			}
+		}
 	}

 	++ps->nworks_fini;
@@ -477,6 +497,60 @@ static void padata_mt_helper(struct work_struct *w)
 		complete(&ps->completion);
 }

+static int padata_error_cmp(void *unused, const struct list_head *a,
+			    const struct list_head *b)
+{
+	struct padata_work *work_a = list_entry(a, struct padata_work, pw_list);
+	struct padata_work *work_b = list_entry(b, struct padata_work, pw_list);
+
+	if (work_a->pw_error_offset < work_b->pw_error_offset)
+		return -1;
+	else if (work_a->pw_error_offset > work_b->pw_error_offset)
+		return 1;
+	return 0;
+}
+
+static void padata_undo(struct padata_mt_job_state *ps,
+			struct list_head *works_list,
+			struct padata_work *stack_work)
+{
+	struct list_head *failed_works = &ps->failed_works;
+	struct padata_mt_job *job = ps->job;
+	unsigned long undo_pos = job->start;
+
+	/* Sort so the failed ranges can be checked as we go. */
+	list_sort(NULL, failed_works, padata_error_cmp);
+
+	/* Undo completed work on this node, skipping failed ranges. */
+	while (undo_pos != ps->position) {
+		struct padata_work *failed_work;
+		unsigned long undo_end;
+
+		failed_work = list_first_entry_or_null(failed_works,
+						       struct padata_work,
+						       pw_list);
+		if (failed_work)
+			undo_end = failed_work->pw_error_start;
+		else
+			undo_end = ps->position;
+
+		if (undo_pos != undo_end)
+			job->undo_fn(undo_pos, undo_end, job->fn_arg);
+
+		if (failed_work) {
+			undo_pos = failed_work->pw_error_end;
+			/* main thread's stack_work stays off works_list */
+			if (failed_work == stack_work)
+				list_del(&failed_work->pw_list);
+			else
+				list_move(&failed_work->pw_list, works_list);
+		} else {
+			undo_pos = undo_end;
+		}
+	}
+	WARN_ON_ONCE(!list_empty(failed_works));
+}
+
 /**
  * padata_do_multithreaded - run a multithreaded job
  * @job: Description of the job.
@@ -509,10 +583,13 @@ int padata_do_multithreaded(struct padata_mt_job *job)

 	spin_lock_init(&ps.lock);
 	init_completion(&ps.completion);
-	ps.job	       = job;
-	ps.nworks      = padata_work_alloc_mt(nworks, &ps, &works);
-	ps.nworks_fini = 0;
-	ps.error       = 0;
+	INIT_LIST_HEAD(&ps.failed_works);
+	ps.job		  = job;
+	ps.nworks	  = padata_work_alloc_mt(nworks, &ps, &works);
+	ps.nworks_fini	  = 0;
+	ps.error	  = 0;
+	ps.position	  = job->start;
+	ps.remaining_size = job->size;

 	/*
 	 * Chunk size is the amount of work a helper does per call to the
@@ -529,11 +606,15 @@ int padata_do_multithreaded(struct padata_mt_job *job)

 	/* Use the current thread, which saves starting a workqueue worker. */
 	padata_work_init(&my_work, padata_mt_helper, &ps, PADATA_WORK_ONSTACK);
+	INIT_LIST_HEAD(&my_work.pw_list);
 	padata_mt_helper(&my_work.pw_work);

 	/* Wait for all the helpers to finish. */
 	wait_for_completion(&ps.completion);

+	if (ps.error && job->undo_fn)
+		padata_undo(&ps, &works, &my_work);
+
 	destroy_work_on_stack(&my_work.pw_work);
 	padata_works_free(&works);
 	return ps.error;

From patchwork Thu Jan 6 00:46:44 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704964
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12A3CC43217 for ; Thu, 6 Jan 2022 00:48:14 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344026AbiAFAsM (ORCPT ); Wed, 5 Jan 2022 19:48:12 -0500
Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]:20618 "EHLO mx0a-00069f02.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1343955AbiAFAr7 (ORCPT ); Wed, 5 Jan 2022 19:47:59 -0500
Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 205N4TM5023248; Thu, 6 Jan 2022 00:47:31 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=4h9fxZQjoge9zoEz5uRD/jyfg0rfEsOFpuVpN34IFis=;
 b=MU68LcvTccjVYNTtRXvU2vTAXEYW+GEWznU0SfO1UKDYorpzR26aQv9uh/9TmuaTKw2n
 MN8gmiACR3Xh/w/2OiqVpIk/RgXw+q5/ZLmjzflXtUaGYb+OGdbI00/UOiP8frpgOKUq
 zMdJUmYwAVIovFiu36Ws9+JBaqyc/vS8zz+36/20C2E6P/HDwXFYvn52PKC/LgH+i9Ua
 k7hic7GhT7h2I9DICf/Qn1je58Y06eyWBNJKkLoDmAlO5q1xWUfbT5hwgV4xo1qNnk5x
 9ZM/V6gEa3xd1iXU+44t6TtD7tCeNBNG+YGRp1V3smzdzin+ln7Gi3JfaTxDwJUBG+S7
 Ww==
Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by mx0b-00069f02.pphosted.com with ESMTP id 3ddmpeg421-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384
bits=256 verify=OK); Thu, 06 Jan 2022 00:47:29 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 2060VIsZ107309; Thu, 6 Jan 2022 00:47:27 GMT Received: from nam12-mw2-obe.outbound.protection.outlook.com (mail-mw2nam12lp2047.outbound.protection.outlook.com [104.47.66.47]) by userp3020.oracle.com with ESMTP id 3ddmq5uwk0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 06 Jan 2022 00:47:27 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Ras/B4sEn1e609S5NQVAYJwtO1VGIS4SJj4DYQQ9kINQmMjUGzt3/BMRz75keSCTWDmN672Dr6fRXnMtd58dB2DJ5eDIE/RCKnXWiiEEeoFoWH2Sl7ZAgVWAlvsxS7B6EruY/cPZ2YNJcZpcjJMSSkTmjweVvf+zXZZF4W6Kxeg1Mhqg7jRySj56tEfVDv3dTyBMz/fTWBcTu/YTPWUM2hkgsFPYMMonSHCaHYcVUfJM/dXxUR7iuQzOQsamXxr5Eu6aX4HIX/LFV+83WZAAVgzxsg/HeQKXkk3PQQvQS74L694Y12BRx86NIICejOPYoJs7FudvF0Y+hID2GhFELw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=4h9fxZQjoge9zoEz5uRD/jyfg0rfEsOFpuVpN34IFis=; b=Cq/GtN7QeJsE2GLGtb7ku2aNb2j8nCvGbYClXjyZdg3XBYeYnE227y7o6PpZMczoquQWygJiQIbuM+RExZ1CbU6aNpzYLdSOAI+scw8CWrLdpSHTMbzPy5oZuZDsmYGsTgx4kHe1r8YCdZfAWTgchyZJ12SCLxQq0sFH2S6XIOcyYeTfid5BVvtM0h5gaiNFV026iGhVkAZYeUJaL25iagFhrwzz7fO6SaPhMp4JXyKCUafG1VvKMYomU31gdifFtSVXfDnNrZ19vgPiy/dEvKwBSDunKSsKL3hbslWf3YcVgr2gXe/Qa/qmYKCEZgeOsWBDF/RyuJ1sIpTQd+zZSA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; 
bh=4h9fxZQjoge9zoEz5uRD/jyfg0rfEsOFpuVpN34IFis=; b=Zb77Ind/eqFI9u6gz9jzqxYNngiT2589S7qO/PyEelfyOtJOXxoilgDe/XSYscpPXRO2oyzr5FW/nk0MpQqQGoIR3L2DHeJG7lNdsWOXBCzFFtpR9prK3CMBIFKXJB1UjipsrqZTH+LWfxxKg6tqrlrY/NgN7sx4Hh50r+A/uyg=
Received: from PH7PR10MB5698.namprd10.prod.outlook.com (2603:10b6:510:126::18) by PH0PR10MB4422.namprd10.prod.outlook.com (2603:10b6:510:38::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4867.7; Thu, 6 Jan 2022 00:47:24 +0000
Received: from PH7PR10MB5698.namprd10.prod.outlook.com ([fe80::85a3:23bc:dc92:52d3]) by PH7PR10MB5698.namprd10.prod.outlook.com ([fe80::85a3:23bc:dc92:52d3%9]) with mapi id 15.20.4867.009; Thu, 6 Jan 2022 00:47:24 +0000
From: Daniel Jordan
To: Alexander Duyck , Alex Williamson , Andrew Morton , Ben Segall , Cornelia Huck , Dan Williams , Dave Hansen , Dietmar Eggemann , Herbert Xu , Ingo Molnar , Jason Gunthorpe , Johannes Weiner , Josh Triplett , Michal Hocko , Nico Pache , Pasha Tatashin , Peter Zijlstra , Steffen Klassert , Steve Sistare , Tejun Heo , Tim Chen , Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 04/16] padata: Detect deadlocks between main and helper threads
Date: Wed, 5 Jan 2022 19:46:44 -0500
Message-Id: <20220106004656.126790-5-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
X-ClientProxiedBy: MN2PR20CA0019.namprd20.prod.outlook.com (2603:10b6:208:e8::32) To PH7PR10MB5698.namprd10.prod.outlook.com (2603:10b6:510:126::18)
MIME-Version: 1.0
X-MS-PublicTrafficType: Email
X-MS-Office365-Filtering-Correlation-Id: b4597dcc-d4b5-48c4-63e3-08d9d0ae1ad4
X-MS-TrafficTypeDiagnostic: PH0PR10MB4422:EE_
X-Microsoft-Antispam-PRVS:
X-MS-Oob-TLC-OOBClassifiers: OLM:538;
X-MS-Exchange-SenderADCheck: 1
X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: UJuxC5rg3ADIN4UcifVaKMIALHSC/v5eRjy3LGiYUPPtvBZ6RhdQ57Y4xnMMRyQm50gMbv6J3m0s9D+GoOTvGGApevnsRb5nY0WDIHKX7DsrcppxmSGtSsRRgK+vfxTbb45Wj7wl4RbpV7BEyKhLdQDLKMWJUkrCyMekAVUen7LS+g5LWGcK1ZdoR9cBaVOMeO8JjEA/s0cpnCJUZioKQvmOwiuOYO3uLJ9CUQiT8gV/laHP+ZLXSRI5BjIx6S9c5lM5lDQ1U1PQ7e/ysm6yysM7SIgIoBqFDKDSq/BGKrI1+fD0ybcFdVYX62/mball15kkrcXRw0raPWpHuwok0kxb2544wGtk0/5RJ1CZS8TbMIZH9OiMpBL7QlpPij/ZF3VozW1ReUuWrF1FdmC19iT7DHuv+K04II0G3w2CPv2FP0/ezNIXbH1BaVcM78iTEabXsFuRXi9f80PmLrCqWbE1XpWUJQmdCWVEpItevdongS3IvkdJbKbkhz70/SsB/DgnEoqHqOl7H0Lwz05h/T1z03P0HT23NFIATFHqCfIhkHLbNJA50vc99nljY1Mtno17Uc51dl+uKK5hhgAfjvJ4OThiARHbqh/XKWFRh8RWg5wRMnZvXvhraVl3+OZQM/rVQ9b9QD/cOAnRPOqYjkjjn0vcH9yikjac3OVID+umLoNuvvNKUYQchRxKShB3z0txKfYp/fA3jVzVZJ6W0+NE7cPaPGulN/a6inZ1NXc= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:PH7PR10MB5698.namprd10.prod.outlook.com;PTR:;CAT:NONE;SFS:(366004)(4326008)(2906002)(316002)(38100700002)(5660300002)(38350700002)(6666004)(83380400001)(508600001)(66946007)(66476007)(66556008)(26005)(52116002)(186003)(6506007)(7416002)(8676002)(110136005)(8936002)(921005)(1076003)(6486002)(2616005)(36756003)(86362001)(103116003)(107886003)(6512007);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
96/3BNgUbgvDGAZIxyrKpgGq9V8RlsY6EHveVDUYmoNzBzPajee7GezxgQ408XOG1KbAH3a7mKIc8n5InGnctRWl7G7SDUoxmLaobOqk3imqAbMlmXhAniRwGKLznOkSBB1CYso07VWh9oY5wqBXXbbRR0BkomYVnPeWKNxnJ6wly2aFIprvAVM4rLS0ZPExAI3k3Al6RsB+g6vr80eeaG9IIcDbAvl2juVRYKruq/QSv0dBXGYAYU8B34KNy2CPdmNZz+HmFgOnObc+ZzneeVAuKKKDW0YsIDrPRM/+cNMqARdXcAWZEEUtMToKc08jcZORnAByJlRzIKcK29NKn7hWtCfWHe+w+cj4ewtIHluUmKN7SmPm2NjEqzMBGH5nkf99CZ4C1gTsFxxBK8hxZZT6hgDJJhIpQbx7CoArFgk4XeKeRPMQ3uAEaKkvhuArvNw7eSTBh5XzuWXyO1pcyWF0sgCU/HznwWEjb6kAvIdfY+3UIMYmgcoUUdt8FJCBSJjr5LWuLurhyM64GMOjeS7ob3M3BEgunxS0xDmRGB4elkJyNtkdWboyBMdwkhdunFoBAMfMuePxLjGL1t+yK7vu12v4txjSEo+T7e9Wg6/US8jweYYvNzWuHB/tUF8kQX/4Bewad+Rmsf1Pmd7V+OMjy78eurvNKpCoNS6o3EkwhpVU1OAXtQXlBqvd7xiLFuxkQpje8Mahx5/s0mdgdSp3X8mrQiOUyHTBZ7/BVgr1cN0vUtiGdXFpv6SZQqORu6WsmZIh3aNicD8dBniahyeacYtB5ZuJK8Dwh2MFcChlSE4Gl4gy6l+Lbk2tqRuxgQEm+8i7FcrsDTnWj4tw4wM9puLkgQvyjtTtWcRmohgmaS3OqSDXhCXeRdPRAUpmSgIIutfwDk5RfFSFNiQPNXjaEIdaCyi3WYZHORW7fIKomDR1c+34Spb2c8h3ykpDu5U3bfHB8PfMscPfO+u8wmh3z+PfdMy8+1YRLP4l29prDyXhJSgAlHVd4EIsKd4pvhzPqozH+nbaEW3aUTCBEasut5S95kldec27EsNeahbFHxlTA5kq+g4bnW8YddK1gc2SCBvSZOLSbHZs/4PnnHIDV59CcwWfU1o7asWsRcZ3xfmwscmR2pLbQdjtypsjM8oXx3CU6ZjkbQz7AiTs2rXiMaawJptP+rDu9+6F00yE7165veBYkMwQatUPsxcHq3gIdg9C4Y/cSdHBot1y99jAdhjiUaNI1p9MMRw84dVzhiNdz9KiNendxAtS0QCbc+khwMccGzJhSJRJ5btF33zY1kGT+EjfVokxdL3z426LdyxtT6Tp8BhNrfRoV1AhPRgPEbHR7hauFs3azWXSg+OVfXhgq9uGt8sFQ+6rfr+v8S3EkUQhic/SOnUzT6cyT9JPxFmv76E5xQvECsfKgo5VGxdDPNzZC/2+1tFPu4YPExWzQpCacURSPeYtYn4rA66LOqHCR8pmW3bI/JeNIH5TETNLDLM+KTUq8Ka5zt69X05g9kUvZqel7EpXA6UvmhGfpO2ni3XodL4S3jTmYz30gfU3a39sADgm2E/SBbowKQj7h7j3h9uYqSQPCqw/xyPTQkK7ahc4XiZKaz7pQ0p1DGDQaHsxynJWrxX9oQomuKjjFIYAvWfRh4I0OeHicYKd8+10kDFtU5gUqLrK50SFSq6z7YLdejDc1AkLiDc= X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: b4597dcc-d4b5-48c4-63e3-08d9d0ae1ad4 X-MS-Exchange-CrossTenant-AuthSource: PH7PR10MB5698.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 06 Jan 2022 
00:47:24.4133 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: GQ9S0WN4CzH9RifL8BPYGTmr1T7LZf2hQ1H77f4rOAtwZAlPK9dKHpfFJnx6pWOuHmGgXrT//3ttx8heP7cBLFtl6Es56av0ihsvj6DwGs0=
X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4422
X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10218 signatures=668683
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 malwarescore=0 mlxscore=0 spamscore=0 suspectscore=0 mlxlogscore=546 phishscore=0 bulkscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2112160000 definitions=main-2201060001
X-Proofpoint-ORIG-GUID: VPF4fmMAJ08kkXoGfE_JRmztebTYOvOY
X-Proofpoint-GUID: VPF4fmMAJ08kkXoGfE_JRmztebTYOvOY
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

A caller of padata_do_multithreaded() can unwittingly introduce
deadlocks if it already holds lock(s) that thread_fn() takes. Lockdep
can't detect such a dependency because it doesn't know that
padata_do_multithreaded() waits on the helper threads.

Use a lockdep_map to encode the dependency, following the pattern in
workqueue, CPU hotplug, and other parts of the kernel. See
commit 4e6045f13478 ("workqueue: debug flushing deadlocks with lockdep")
for an example of a similar situation.

Each padata_do_multithreaded() callsite gets its own lock_class_key to
avoid false positives involving locks from different calls that don't
depend on each other.

Signed-off-by: Daniel Jordan
Suggested-by: Peter Zijlstra
---
 include/linux/padata.h | 22 +++++++++++++++++++++-
 kernel/padata.c        | 15 +++++++++++++--
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index 2a9fa459463d..907d624a8ca4 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include

 #define PADATA_CPU_SERIAL   0x01
 #define PADATA_CPU_PARALLEL 0x02
@@ -188,6 +189,23 @@ extern void __init padata_init(void);
 static inline void __init padata_init(void) {}
 #endif

+#ifdef CONFIG_LOCKDEP
+
+#define padata_do_multithreaded(job)					\
+({									\
+	static struct lock_class_key __key;				\
+	const char *__map_name = "padata master waiting";		\
+									\
+	padata_do_multithreaded_job((job), &__key, __map_name);		\
+})
+
+#else
+
+#define padata_do_multithreaded(job)					\
+	padata_do_multithreaded_job((job), NULL, NULL)
+
+#endif
+
 extern struct padata_instance *padata_alloc(const char *name);
 extern void padata_free(struct padata_instance *pinst);
 extern struct padata_shell *padata_alloc_shell(struct padata_instance *pinst);
@@ -195,7 +213,9 @@ extern void padata_free_shell(struct padata_shell *ps);
 extern int padata_do_parallel(struct padata_shell *ps,
 			      struct padata_priv *padata, int *cb_cpu);
 extern void padata_do_serial(struct padata_priv *padata);
-extern int padata_do_multithreaded(struct padata_mt_job *job);
+extern int padata_do_multithreaded_job(struct padata_mt_job *job,
+				       struct lock_class_key *key,
+				       const char *map_name);
 extern int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
 			      cpumask_var_t cpumask);
 #endif
diff --git a/kernel/padata.c b/kernel/padata.c
index d0876f861464..b458deb17121 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -64,6 +64,9 @@ struct padata_mt_job_state {
 	unsigned long		position;
 	unsigned long		remaining_size;
 	struct list_head	failed_works;
+#ifdef CONFIG_LOCKDEP
+	struct lockdep_map	lockdep_map;
+#endif
 };

 static void padata_free_pd(struct parallel_data *pd);
@@ -470,9 +473,11 @@ static void padata_mt_helper(struct work_struct *w)
 		ps->remaining_size -= size;

 		spin_unlock(&ps->lock);
+		lock_map_acquire(&ps->lockdep_map);

 		ret = job->thread_fn(position, end, job->fn_arg);

+		lock_map_release(&ps->lockdep_map);
 		spin_lock(&ps->lock);

 		if (ret) {
@@ -552,14 +557,16 @@ static void padata_undo(struct padata_mt_job_state *ps,
 }

 /**
- * padata_do_multithreaded - run a multithreaded job
+ * padata_do_multithreaded_job - run a multithreaded job
  * @job: Description of the job.
  *
  * See the definition of struct padata_mt_job for more details.
  *
  * Return: 0 or a client-specific nonzero error code.
  */
-int padata_do_multithreaded(struct padata_mt_job *job)
+int padata_do_multithreaded_job(struct padata_mt_job *job,
+				struct lock_class_key *key,
+				const char *map_name)
 {
 	/* In case threads finish at different times. */
 	static const unsigned long load_balance_factor = 4;
@@ -583,6 +590,7 @@ int padata_do_multithreaded(struct padata_mt_job *job)

 	spin_lock_init(&ps.lock);
 	init_completion(&ps.completion);
+	lockdep_init_map(&ps.lockdep_map, map_name, key, 0);
 	INIT_LIST_HEAD(&ps.failed_works);
 	ps.job		  = job;
 	ps.nworks	  = padata_work_alloc_mt(nworks, &ps, &works);
@@ -601,6 +609,9 @@ int padata_do_multithreaded(struct padata_mt_job *job)
 	ps.chunk_size = max(ps.chunk_size, job->min_chunk);
 	ps.chunk_size = roundup(ps.chunk_size, job->align);

+	lock_map_acquire(&ps.lockdep_map);
+	lock_map_release(&ps.lockdep_map);
+
 	list_for_each_entry(pw, &works, pw_list)
 		queue_work(system_unbound_wq, &pw->pw_work);


From patchwork Thu Jan 6 00:46:45 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704966
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix)
with ESMTP id E8135C433EF for ; Thu, 6 Jan 2022 00:48:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344065AbiAFAsR (ORCPT ); Wed, 5 Jan 2022 19:48:17 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:31660 "EHLO mx0b-00069f02.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1343956AbiAFAsA (ORCPT ); Wed, 5 Jan 2022 19:48:00 -0500 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 205N4jdB031970; Thu, 6 Jan 2022 00:47:30 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=yvrNOGTULS/Zu6HsAvJl0g41BPCHDKMfE0kEjyGc2Jo=; b=GzYfRr6ZsgZszLiLuc+f10Okjwb02e4gQaIAuMm1BmPDQ8mf+yb86ZgtGEyfeAImnTxm 0Tu4Cbg+HcnO1lyQ0JkdkSWg5P270nFfVjYR2P5xHuEJElZ40Xd260cFN3GG3tjFP7Td OjHHbzYeuCidJsKWogIXtqi5o58uWes+qB1BjBenH6SGNKbbefQ3QWuc+eGgu+OzCsHn cL+OANaMjTx5+Cdv+D/Oz0RpsKveWSliuNpADDl2wKfjuVtIjsJ+139XhfXTXje4RgBJ pi78A4G314T392a2ctg48KXIYWroGBvUevdG7EOZ9mXX4PCJ4BCFMXuaTXMasgDOGZXf LA== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by mx0b-00069f02.pphosted.com with ESMTP id 3ddmpjr3w7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 06 Jan 2022 00:47:30 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 2060W859102533; Thu, 6 Jan 2022 00:47:29 GMT Received: from nam12-mw2-obe.outbound.protection.outlook.com (mail-mw2nam12lp2045.outbound.protection.outlook.com [104.47.66.45]) by aserp3030.oracle.com with ESMTP id 3ddmqgu54b-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 06 Jan 2022 00:47:29 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
b=V49QziCn42D7s6sqyNK/T7b06BT5LqHeR5MlfJ4wsip+nfmVdcdbpWKTztrnJ8MXChephsgVHm+xvBAjO+IOZjsJMarJBttPtj9qKBXIpmLsvblEks8md71ay+yEsFmhAn+HX1SqwN5Trh+4ra44PaCrTSGv9kJ/VEdyXEfvT6FPPJnxfbH0pYw7w93AMhpUCms3PEf71m+cnAWzXLvxqxeD7Pk9dZf14jDsNvq4LDuarZlnWtAUIq6pNS6M1d9PY0CLO936XBmFVFH4/pI7oK+ySBrdeLgPYqQOov5kmyYrmCvVnlfNTNAQAVRQGzQkpZciilxPX70f9osKKYWgHQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=yvrNOGTULS/Zu6HsAvJl0g41BPCHDKMfE0kEjyGc2Jo=; b=QLxsIe0no/yNU0uimveubk2Z0f6jNrMnZNUQVlCFQFVYjS7DPDWa4H0yEjeAZWS2ZNMpfchjE+UmEC0sYLvdHQ3FU3aFIMzUQC0BWWR3B7dmS+QqYTM905PLgqMz2S15Ef5BiwHpLyAeTjDX/5Bwbmx36iugT5cGAo3TgiZnkJ5gUP2ixYGeZk7y72Ln9U1j/c5J9UFHNBjgvjFgM5Wb22y+9T3PRwrjGIe8j4ckWE7C99u25Tv/W53sH+xbJdPRVewF+cAgtVr6yJncwmlziX0dVIyoqN5QPqg7af6ZxOhhz50CeO35Fi8jJ6z8ha6+J5QX2ryIXsDQBRvt/wIriQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=yvrNOGTULS/Zu6HsAvJl0g41BPCHDKMfE0kEjyGc2Jo=; b=AYw2z21OjVrgmUiu3aqYa1Oz1DDhegeCYmwbWh04eF2Iq4bL1+q2UbLkE9IR43buPLVgeOOswaYKultGddmnxZ0AUNUdnp+QiRDAKxAcaHHZOC2Vh3i5DnKuvRqcfg97XqZgOIAvYUbMDiNXjFqKrWZCqyOJUxlF1Z/pxfVD/VM= Received: from PH7PR10MB5698.namprd10.prod.outlook.com (2603:10b6:510:126::18) by PH0PR10MB4422.namprd10.prod.outlook.com (2603:10b6:510:38::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4867.7; Thu, 6 Jan 2022 00:47:27 +0000 Received: from PH7PR10MB5698.namprd10.prod.outlook.com ([fe80::85a3:23bc:dc92:52d3]) by 
PH7PR10MB5698.namprd10.prod.outlook.com ([fe80::85a3:23bc:dc92:52d3%9]) with mapi id 15.20.4867.009; Thu, 6 Jan 2022 00:47:27 +0000
From: Daniel Jordan
To: Alexander Duyck , Alex Williamson , Andrew Morton , Ben Segall , Cornelia Huck , Dan Williams , Dave Hansen , Dietmar Eggemann , Herbert Xu , Ingo Molnar , Jason Gunthorpe , Johannes Weiner , Josh Triplett , Michal Hocko , Nico Pache , Pasha Tatashin , Peter Zijlstra , Steffen Klassert , Steve Sistare , Tejun Heo , Tim Chen , Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 05/16] vfio/type1: Pass mm to vfio_pin_pages_remote()
Date: Wed, 5 Jan 2022 19:46:45 -0500
Message-Id: <20220106004656.126790-6-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
X-ClientProxiedBy: MN2PR20CA0019.namprd20.prod.outlook.com (2603:10b6:208:e8::32) To PH7PR10MB5698.namprd10.prod.outlook.com (2603:10b6:510:126::18)
MIME-Version: 1.0
X-MS-PublicTrafficType: Email
X-MS-Office365-Filtering-Correlation-Id: fda4b5fe-8a62-4e76-e0cd-08d9d0ae1c6e
X-MS-TrafficTypeDiagnostic: PH0PR10MB4422:EE_
X-Microsoft-Antispam-PRVS:
X-MS-Oob-TLC-OOBClassifiers: OLM:3968;
X-MS-Exchange-SenderADCheck: 1
X-MS-Exchange-AntiSpam-Relay: 0
X-Microsoft-Antispam: BCL:0;
X-Microsoft-Antispam-Message-Info:
7HTnmjOGojUtjvcSIi0sfJQBF5QWSs9b4xHDI99xgMlN+SwBV+8c11oX9YZhAseOj/ixEmPrGjSxrRzEux2l9GpPacrD93aycS7o5Cgn4gEC3MjBbwtimbeq5Qn4jX8W79an5mOey9c/pBCUsynxzylKk2WD63xGgUK6qqZFcKXOQtE6FmqIGNC8KlYUumLiy8kR99dSTA+CoPyAHPt5BXGNqoZZVRXC6Qjs6ze5GQG4m93P1x9ZUWVREcFUIgvJ2C31vTyksMtb31uIL1aqmkmQ1AAR6KgzaJwXtHmj1lk1avES9QHcxR0g9YWh6ycbhV4yNCdBJ8GG3V/fZQIePHRVjpVUSNHvEyHQ6m4IXArhhyIAtZb2x8ir6QgOcvPg4V1ou+YHsAzOltFqsUYjXcjQOK5NsziUm7KRRwqjUwNTIMzMRF3n0KLK2WQ2GAfYHHb4H8LQfacrg+UpT20qMD3jRdeE0+YCkL2iPH3tzYQnVxr8OQv9L9ExgLcFuYKAUBXBKwbpcLESx54kqL7DGbUcgu6kOplNYw5rCOtRTRgN4wFHPPluNRzLWVLSknIQfunh1eCrPs7hdazgx9oaAdPagykuU6pH2Dm6PyP1zASB2XBIYWxilv0hR+WPmn8pC7FQRdBRx8lX8Axa4rsGsfOhWKtNUj6KAmBjuoohjtCrUHXvSlx9UaZZxRDf9zDi9nSQ8k+DbyGFDXTzyXtHVW7eK4+pfWM+tBiVZtlolvQ= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:PH7PR10MB5698.namprd10.prod.outlook.com;PTR:;CAT:NONE;SFS:(366004)(4326008)(2906002)(316002)(38100700002)(5660300002)(38350700002)(6666004)(83380400001)(508600001)(66946007)(66476007)(66556008)(26005)(52116002)(186003)(6506007)(7416002)(8676002)(110136005)(8936002)(921005)(1076003)(6486002)(2616005)(36756003)(86362001)(103116003)(107886003)(6512007);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
3mhWo4lpVW0YWvpZezieRFeehbd/Unz4RS5sJdezirG/fuGRATmZFUOmAjyMRzIVdJZBehtkeqIJzJqAXuiC9dmgQvSuUy+6ahtH+ClnSKTK34+yujaS4qdJpEz67kZkj7TdSmVBEyTX787wZJUUF5NlJpo9KiykfosIJeHpkLE8jptGTtFe1pl4C5v9xfD8ePG+gqjVFCMOebP47aqT/aYC560B2N0ro2qt2PC3gUNDAXY6357liyHMBrl/Ad6BqSpZ84ule4USv33GrFtdstQv48ghs0SMX1KcaULhIa9xIm2RcIqWlAVYvYt9u+Igzj0mPLCA8NzTMLzfz0drdDiWwD1bANZI9hBGah5VIbjdfPS3mDoelNcFLkUJ2AwlnH43lN99e0At9BFFgApVEB0kRXK44gl3cF+ndk0fZKTYr72dqkxvaDCZqF83MH9om7MxVQQDW2cREtwqe6y/rqu5JU6PWQZrGVb1Af1xANU49NXmnotIbUV1qSdpfV4VDd70vRIyFMyKgdB9RtU8wfVnWiy+mHeIMgCRJthx72uJi8AeKnnfEPSVClnjacphQxyrGjrS2GFUJbC/87QOYAL6NjAYUpWmfmyN5yKVO+hIv060qVKSpFygdIvquFhhjZGSfxIJZX972T/EFtD7oy3uC03AKs4oYkZBpGXzbI7LkJVxeg5yKBc8j55kHUjS9wp8DuvuhTd6Ll18XjcWWOTIRMSw/YIYI3u3Q8n1RqDisk3RMD9cKqFn3E/xnnilX6bOLtineX58TF+7Poc/iUW76nTosQ+GYKrKlRokxExrpgT3RijZLC9xA7Zy8zC9Tug3SuW9bpuSAWvm26lT4rhtYrpwN4B9iRqc8z4+b4aWqkDrsMeY1GYX9sDe74MF63tsDMRHmT+9uhL1aJpyQfwy1yuSGL//HKlWZ7E2b+0TH/+SkezVfZYAXTtQYG7Cd5pqa0vZaQzANlQT6eWv1YCir3LI4gW1P0j5upX8EkWR+5UTq5T76EriE1E52Ks7Mzv2b93THmpMGwvPSbnxN+lquqOdjp1oIFQZ4xPIo499cASWMj1S4A6ENRrPETPmItVQdfIZTM2piehUFETekjY/O4CgZQ0Jm8EHiKZ9jaWkpM8DYb8MVkN/DjgFTzGUe6j0XUyVDoR3/0Ori+JGwhv1nqzRsFnOjDmNd9UasvVk4guruveXtIWsg54CMiVm4zx4lNb7p/vSHvgiueLQ3Hvi0gaHoELpfg8W1q/tIjDwRIhqt3m2Tz9fogrq5ThzIdVYsOEzop2V7s3/VD6MVsJTh2KJsRhFBhvXp82JCLe+vCp01r0YnjY9XlBuGlyj/R5mJ+CC6F0X7CLX+43QyVBCdosl8Y3dg0l3JXJZcAxPmeSbs6zqLfT3x33xpjpjFnhWIm7vFQNCmJOdKuqOdDrUPI0SERXCpvYISIk7U6ZnQA3tcoeAE+W9cPFdvpWiBgw6ZkhcWrVQwLmHdFuAmXx9rLxKjJZd552d9URzk/GpkQOHz49wjmCMNjDnKqMjquppSRjzo5xAounVDBRhnnJvObOB8+Ou38SL1SNPyQY/UbDiFtUZZ60a73vJ7jeC4qgbO2C6AeKgLOUjvtudAc8kIZhp/lrRYsGr8YXwBNGmxRgWMYM1qJAp6fNKk7ASS4IACKjQM/FHsNi9JrFa5PxGskmgk4puvCjv42RWTb4= X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: fda4b5fe-8a62-4e76-e0cd-08d9d0ae1c6e X-MS-Exchange-CrossTenant-AuthSource: PH7PR10MB5698.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 06 Jan 2022 
00:47:27.1313 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: yrORHis9uL2uEmjdDMmE5M5Su71WBIiBw0r9z867l8NpLrt95HAKDL0qHWKyRuCfzcX2MOOE0Dvqd5aBP9LJSSL2R8mmNXf8tbWIhT2HGkY= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4422 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10218 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 suspectscore=0 adultscore=0 mlxlogscore=999 phishscore=0 malwarescore=0 mlxscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2112160000 definitions=main-2201060001 X-Proofpoint-ORIG-GUID: QkNMjeKvCArgEM4gNiTaTiOvU_IyKWlI X-Proofpoint-GUID: QkNMjeKvCArgEM4gNiTaTiOvU_IyKWlI Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Refactoring for later, when padata helpers need to use the main thread's mm. 
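
To illustrate the motivation, here is a minimal user-space sketch (not kernel code). It assumes that padata helpers run as kernel worker threads, so that `current->mm` inside a helper is presumably not the mm of the task that issued the ioctl. The names `struct task`, `struct mm`, `current_task`, `pin_implicit()` and `pin_explicit()` are hypothetical stand-ins, and a NULL mm models a kernel thread.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's mm_struct and task_struct. */
struct mm { long locked_vm; };
struct task { struct mm *mm; };

/* Hypothetical stand-in for the kernel's per-thread `current` pointer. */
static struct task *current_task;

/* Old shape: the callee uses the mm of whichever thread happens to run it. */
static int pin_implicit(long npage)
{
	struct mm *mm = current_task->mm;

	if (!mm)
		return -1;	/* modeled kernel thread: no user mm to charge */
	mm->locked_vm += npage;
	return 0;
}

/* New shape: the caller decides which mm is charged. */
static int pin_explicit(long npage, struct mm *mm)
{
	if (!mm)
		return -1;
	mm->locked_vm += npage;
	return 0;
}
```

With the explicit parameter, the main thread can capture its own mm once and hand it to every helper, which is what the call sites in this patch do with `current->mm`.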
Signed-off-by: Daniel Jordan
---
 drivers/vfio/vfio_iommu_type1.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 0b4f7c174c7a..26bb2d9b698b 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -649,10 +649,10 @@ static int vfio_wait_all_valid(struct vfio_iommu *iommu)
  */
 static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 				  long npage, unsigned long *pfn_base,
-				  unsigned long limit, struct vfio_batch *batch)
+				  unsigned long limit, struct vfio_batch *batch,
+				  struct mm_struct *mm)
 {
 	unsigned long pfn;
-	struct mm_struct *mm = current->mm;
 	long ret, pinned = 0, lock_acct = 0;
 	bool rsvd;
 	dma_addr_t iova = vaddr - dma->vaddr + dma->iova;
@@ -1500,7 +1500,7 @@ static int vfio_pin_map_dma(struct vfio_iommu *iommu, struct vfio_dma *dma,
 		/* Pin a contiguous chunk of memory */
 		npage = vfio_pin_pages_remote(dma, vaddr + dma->size,
 					      size >> PAGE_SHIFT, &pfn, limit,
-					      &batch);
+					      &batch, current->mm);
 		if (npage <= 0) {
 			WARN_ON(!npage);
 			ret = (int)npage;
@@ -1763,7 +1763,8 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
 				npage = vfio_pin_pages_remote(dma, vaddr,
 							      n >> PAGE_SHIFT,
 							      &pfn, limit,
-							      &batch);
+							      &batch,
+							      current->mm);
 				if (npage <= 0) {
 					WARN_ON(!npage);
 					ret = (int)npage;

From patchwork Thu Jan 6 00:46:46 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704967
From: Daniel Jordan
To: Alexander Duyck, Alex Williamson, Andrew Morton, Ben Segall,
	Cornelia Huck, Dan Williams, Dave Hansen, Dietmar Eggemann,
	Herbert Xu, Ingo Molnar, Jason Gunthorpe, Johannes Weiner,
	Josh Triplett, Michal Hocko, Nico Pache, Pasha Tatashin,
	Peter Zijlstra, Steffen Klassert, Steve Sistare, Tejun Heo,
	Tim Chen, Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 06/16] vfio/type1: Refactor dma map removal
Date: Wed, 5 Jan 2022 19:46:46 -0500
Message-Id: <20220106004656.126790-7-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
X-Mailing-List: kvm@vger.kernel.org

Do these small refactors to prepare for multithreaded page pinning:

 * pass @iova and @end args to vfio_unmap_unpin()
 * set iommu_mapped outside of vfio_unmap_unpin()
 * split part of vfio_remove_dma() off into vfio_remove_dma_finish()

They all facilitate padata's undo callback, which needs to undo only the
parts of the job that each helper completed successfully.
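
The range-based undo described above can be sketched in user space as follows. This is only an illustration under stated assumptions: `mapped[]`, `undo_range()` and `do_chunk()` are hypothetical names, one array slot stands in for one pinned and mapped page, and failure is injected through the `fail_at` parameter.

```c
#include <stddef.h>

#define NSLOTS 16

/* Hypothetical stand-in for pinned-and-mapped state, one flag per page. */
static int mapped[NSLOTS];

/* Undo exactly [start, end); an empty range is a no-op. */
static void undo_range(size_t start, size_t end)
{
	for (size_t i = start; i < end; i++)
		mapped[i] = 0;
}

/*
 * Process [start, end), failing when page `fail_at` is reached (pass NSLOTS
 * for "never fail").  On failure, undo only what this call completed and
 * leave every other range alone.
 */
static int do_chunk(size_t start, size_t end, size_t fail_at)
{
	size_t done = 0;

	while (start + done < end) {
		if (start + done == fail_at) {
			undo_range(start, start + done);
			return -1;
		}
		mapped[start + done] = 1;
		done++;
	}
	return 0;
}
```

Taking explicit range bounds is what lets a failing helper clean up its own partial work, while the ranges completed by other helpers are undone separately later.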
Signed-off-by: Daniel Jordan
---
 drivers/vfio/vfio_iommu_type1.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 26bb2d9b698b..8440e7e2c36d 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1078,16 +1078,16 @@ static size_t unmap_unpin_slow(struct vfio_domain *domain,
 }
 
 static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
+			     dma_addr_t iova, dma_addr_t end,
 			     bool do_accounting)
 {
-	dma_addr_t iova = dma->iova, end = dma->iova + dma->size;
 	struct vfio_domain *domain, *d;
 	LIST_HEAD(unmapped_region_list);
 	struct iommu_iotlb_gather iotlb_gather;
 	int unmapped_region_cnt = 0;
 	long unlocked = 0;
 
-	if (!dma->size)
+	if (iova == end)
 		return 0;
 
 	if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
@@ -1104,7 +1104,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 				      struct vfio_domain, next);
 
 	list_for_each_entry_continue(d, &iommu->domain_list, next) {
-		iommu_unmap(d->domain, dma->iova, dma->size);
+		iommu_unmap(d->domain, iova, end - iova);
 		cond_resched();
 	}
 
@@ -1147,8 +1147,6 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 		}
 	}
 
-	dma->iommu_mapped = false;
-
 	if (unmapped_region_cnt) {
 		unlocked += vfio_sync_unpin(dma, domain, &unmapped_region_list,
 					    &iotlb_gather);
@@ -1161,10 +1159,11 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 	return unlocked;
 }
 
-static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
+static void vfio_remove_dma_finish(struct vfio_iommu *iommu,
+				   struct vfio_dma *dma)
 {
 	WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list));
-	vfio_unmap_unpin(iommu, dma, true);
+	dma->iommu_mapped = false;
 	vfio_unlink_dma(iommu, dma);
 	put_task_struct(dma->task);
 	vfio_dma_bitmap_free(dma);
@@ -1176,6 +1175,12 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
 	iommu->dma_avail++;
 }
 
+static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
+{
+	vfio_unmap_unpin(iommu, dma, dma->iova, dma->iova + dma->size, true);
+	vfio_remove_dma_finish(iommu, dma);
+}
+
 static void vfio_update_pgsize_bitmap(struct vfio_iommu *iommu)
 {
 	struct vfio_domain *domain;
@@ -2466,7 +2471,9 @@ static void vfio_iommu_unmap_unpin_reaccount(struct vfio_iommu *iommu)
 		long locked = 0, unlocked = 0;
 
 		dma = rb_entry(n, struct vfio_dma, node);
-		unlocked += vfio_unmap_unpin(iommu, dma, false);
+		unlocked += vfio_unmap_unpin(iommu, dma, dma->iova,
+					     dma->iova + dma->size, false);
+		dma->iommu_mapped = false;
 		p = rb_first(&dma->pfn_list);
 		for (; p; p = rb_next(p)) {
 			struct vfio_pfn *vpfn = rb_entry(p, struct vfio_pfn,

From patchwork Thu Jan 6 00:46:47 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704968
From: Daniel Jordan
To: Alexander Duyck, Alex Williamson, Andrew Morton, Ben Segall,
	Cornelia Huck, Dan Williams, Dave Hansen, Dietmar Eggemann,
	Herbert Xu, Ingo Molnar, Jason Gunthorpe, Johannes Weiner,
	Josh Triplett, Michal Hocko, Nico Pache, Pasha Tatashin,
	Peter Zijlstra, Steffen Klassert, Steve Sistare, Tejun Heo,
	Tim Chen, Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 07/16] vfio/type1: Parallelize vfio_pin_map_dma()
Date: Wed, 5 Jan 2022 19:46:47 -0500
Message-Id: <20220106004656.126790-8-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
X-Mailing-List: kvm@vger.kernel.org

The VFIO_IOMMU_MAP_DMA ioctl uses a single CPU to pin all pages in the
given range to facilitate DMA to/from the passed-through device.  The
pages may not have been faulted in and cleared, in which case the wall
time for this can be truly horrendous, but even if this was already done
(e.g. qemu prealloc), pinning pages for the largest guests still takes
significant time, even with recent optimizations to hugetlb gup[1] and
ioctl(VFIO_IOMMU_MAP_DMA) itself[2].

Parallelize with padata for faster guest initialization times.
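
As a rough user-space sketch of what splitting the range implies: the chunking policy below is hypothetical and is not padata's actual algorithm. The constants mirror VFIO_MIN_CHUNK and VFIO_MAX_THREADS from this patch, the 2 MiB PMD size is an assumption (the x86-64 value), and all function names are made up for illustration. Only `vaddr_to_iova()` follows the patch directly, since the chunk and undo callbacks compute `dma->iova + (start_vaddr - dma->vaddr)`.

```c
#include <stddef.h>

#define SKETCH_PMD_SIZE    (1ul << 21)	/* assumed 2 MiB, the x86-64 value */
#define SKETCH_MIN_CHUNK   (1ul << 27)	/* mirrors VFIO_MIN_CHUNK */
#define SKETCH_MAX_THREADS 16ul		/* mirrors VFIO_MAX_THREADS */

/* Round x up to a multiple of align (align must be a power of two). */
static unsigned long round_up_pow2(unsigned long x, unsigned long align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Hypothetical policy: one chunk per thread, never below the minimum. */
static unsigned long chunk_size(unsigned long size)
{
	unsigned long chunk = size / SKETCH_MAX_THREADS;

	if (chunk < SKETCH_MIN_CHUNK)
		chunk = SKETCH_MIN_CHUNK;
	return round_up_pow2(chunk, SKETCH_PMD_SIZE);
}

/* Number of chunks needed to cover size bytes under that policy. */
static unsigned long chunk_count(unsigned long size)
{
	unsigned long chunk = chunk_size(size);

	return (size + chunk - 1) / chunk;
}

/* Translate a user virtual address within the mapping to its IOVA. */
static unsigned long vaddr_to_iova(unsigned long dma_iova,
				   unsigned long dma_vaddr,
				   unsigned long vaddr)
{
	return dma_iova + (vaddr - dma_vaddr);
}
```

Under this made-up policy a 1 GiB mapping is covered by eight 128 MiB chunks, so a small guest cannot occupy all 16 threads, while a 3 GiB mapping yields sixteen 192 MiB chunks.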
Numbers come later on.

[1] https://lore.kernel.org/linux-mm/20210128182632.24562-1-joao.m.martins@oracle.com
[2] https://lore.kernel.org/lkml/20210219161305.36522-1-daniel.m.jordan@oracle.com/

Signed-off-by: Daniel Jordan
Suggested-by: Konrad Rzeszutek Wilk
---
 drivers/vfio/Kconfig            |  1 +
 drivers/vfio/vfio_iommu_type1.c | 95 +++++++++++++++++++++++++++------
 2 files changed, 80 insertions(+), 16 deletions(-)

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 67d0bf4efa16..39c7efb7b1b1 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -2,6 +2,7 @@
 config VFIO_IOMMU_TYPE1
 	tristate
 	depends on VFIO
+	select PADATA
 	default n
 
 config VFIO_IOMMU_SPAPR_TCE
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 8440e7e2c36d..faee849f1cce 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -40,6 +40,7 @@
 #include
 #include
 #include
+#include
 
 #define DRIVER_VERSION "0.2"
 #define DRIVER_AUTHOR "Alex Williamson "
@@ -1488,24 +1489,44 @@ static int vfio_iommu_map(struct vfio_iommu *iommu, dma_addr_t iova,
 	return ret;
 }
 
-static int vfio_pin_map_dma(struct vfio_iommu *iommu, struct vfio_dma *dma,
-			    size_t map_size)
+struct vfio_pin_args {
+	struct vfio_iommu *iommu;
+	struct vfio_dma *dma;
+	unsigned long limit;
+	struct mm_struct *mm;
+};
+
+static void vfio_pin_map_dma_undo(unsigned long start_vaddr,
+				  unsigned long end_vaddr, void *arg)
+{
+	struct vfio_pin_args *args = arg;
+	struct vfio_dma *dma = args->dma;
+	dma_addr_t iova = dma->iova + (start_vaddr - dma->vaddr);
+	dma_addr_t end = dma->iova + (end_vaddr - dma->vaddr);
+
+	vfio_unmap_unpin(args->iommu, args->dma, iova, end, true);
+}
+
+static int vfio_pin_map_dma_chunk(unsigned long start_vaddr,
+				  unsigned long end_vaddr, void *arg)
 {
-	dma_addr_t iova = dma->iova;
-	unsigned long vaddr = dma->vaddr;
+	struct vfio_pin_args *args = arg;
+	struct vfio_dma *dma = args->dma;
+	dma_addr_t iova = dma->iova + (start_vaddr - dma->vaddr);
+	unsigned long unmapped_size = end_vaddr - start_vaddr;
+	unsigned long pfn, mapped_size = 0;
 	struct vfio_batch batch;
-	size_t size = map_size;
 	long npage;
-	unsigned long pfn, limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
 	int ret = 0;
 
 	vfio_batch_init(&batch);
 
-	while (size) {
+	while (unmapped_size) {
 		/* Pin a contiguous chunk of memory */
-		npage = vfio_pin_pages_remote(dma, vaddr + dma->size,
-					      size >> PAGE_SHIFT, &pfn, limit,
-					      &batch, current->mm);
+		npage = vfio_pin_pages_remote(dma, start_vaddr + mapped_size,
+					      unmapped_size >> PAGE_SHIFT,
+					      &pfn, args->limit, &batch,
+					      args->mm);
 		if (npage <= 0) {
 			WARN_ON(!npage);
 			ret = (int)npage;
@@ -1513,24 +1534,66 @@ static int vfio_pin_map_dma(struct vfio_iommu *iommu, struct vfio_dma *dma,
 		}
 
 		/* Map it! */
-		ret = vfio_iommu_map(iommu, iova + dma->size, pfn, npage,
-				     dma->prot);
+		ret = vfio_iommu_map(args->iommu, iova + mapped_size, pfn,
+				     npage, dma->prot);
 		if (ret) {
-			vfio_unpin_pages_remote(dma, iova + dma->size, pfn,
+			vfio_unpin_pages_remote(dma, iova + mapped_size, pfn,
 						npage, true);
 			vfio_batch_unpin(&batch, dma);
 			break;
 		}
 
-		size -= npage << PAGE_SHIFT;
-		dma->size += npage << PAGE_SHIFT;
+		unmapped_size -= npage << PAGE_SHIFT;
+		mapped_size += npage << PAGE_SHIFT;
 	}
 
 	vfio_batch_fini(&batch);
+
+	/*
+	 * Undo the successfully completed part of this chunk now. padata will
+	 * undo previously completed chunks internally at the end of the job.
+	 */
+	if (ret) {
+		vfio_pin_map_dma_undo(start_vaddr, start_vaddr + mapped_size,
+				      args);
+		return ret;
+	}
+
+	return 0;
+}
+
+/* Small-memory guests benefited from this relatively small value in testing. */
+#define VFIO_MIN_CHUNK (1ul << 27)
+
+/* The sweet spot between performance and efficiency on the test machines. */
+#define VFIO_MAX_THREADS 16
+
+static int vfio_pin_map_dma(struct vfio_iommu *iommu, struct vfio_dma *dma,
+			    size_t map_size)
+{
+	unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+	int ret = 0;
+	struct vfio_pin_args args = { iommu, dma, limit, current->mm };
+	/* Stay on PMD boundary in case THP is being used. */
+	struct padata_mt_job job = {
+		.thread_fn = vfio_pin_map_dma_chunk,
+		.fn_arg = &args,
+		.start = dma->vaddr,
+		.size = map_size,
+		.align = PMD_SIZE,
+		.min_chunk = VFIO_MIN_CHUNK,
+		.undo_fn = vfio_pin_map_dma_undo,
+		.max_threads = VFIO_MAX_THREADS,
+	};
+
+	ret = padata_do_multithreaded(&job);
+
 	dma->iommu_mapped = true;
 
 	if (ret)
-		vfio_remove_dma(iommu, dma);
+		vfio_remove_dma_finish(iommu, dma);
+	else
+		dma->size += map_size;
 
 	return ret;
 }

From patchwork Thu Jan 6 00:46:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704969
From: Daniel Jordan
To: Alexander Duyck, Alex Williamson, Andrew Morton, Ben Segall,
	Cornelia Huck, Dan Williams, Dave Hansen, Dietmar Eggemann,
	Herbert Xu, Ingo Molnar, Jason Gunthorpe, Johannes Weiner,
	Josh Triplett, Michal Hocko, Nico Pache, Pasha Tatashin,
	Peter Zijlstra, Steffen Klassert, Steve Sistare, Tejun Heo,
	Tim Chen, Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 08/16] vfio/type1: Cache locked_vm to ease
mmap_lock contention Date: Wed, 5 Jan 2022 19:46:48 -0500 Message-Id: <20220106004656.126790-9-daniel.m.jordan@oracle.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com> References: <20220106004656.126790-1-daniel.m.jordan@oracle.com> X-ClientProxiedBy: MN2PR20CA0019.namprd20.prod.outlook.com (2603:10b6:208:e8::32) To PH7PR10MB5698.namprd10.prod.outlook.com (2603:10b6:510:126::18) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 4df35cf5-1c23-425a-16ef-08d9d0ae2146 X-MS-TrafficTypeDiagnostic: PH0PR10MB4422:EE_ X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:3383; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: TscjkaAKYe0kdSgHZPuf0DDR8Y+iYi7qRG+nOw27/ctCNcm1ADqkM81qGEmvZekWru+D/tLzPtF5+zYjZPHncAb3IteqbYZlI/yokIsV5gYgUvnKAsCoL2jXm5GtQvrgP0Ae4k+7c8hI679dHc/Tnh4MMff+XFxbIBdMaD0eiPPXfTL1E3WqMmG4KLOGDwAu0hUTRGbPxR9IcLWqzj9F8s2D5pkqC2fYUfi3O4rq+/6960+yz/+5zhJLvzCQ1Xh3lENeA5vVBXCRxnGc17xKjQTRlox/11zGJl5yWDdDAABe/qsyvG5p/HnjTzcM8Jcvu8YGkwhPj1JY3mxq9NRK4adq+9njABLvH/v8OhZFLD2QQ4zifsjTz9Yc9MzvhpDUvgqWMMu80QrgV9eB7H5dGOq+CQHUHbx+8JvAZFUFXyv3QxDddlXP8JpaJBdM+W1z8H31cSPXFasD39mLwbleNDXm3a0cft2tZdOYIOW9QrGYfDWs84yjNPSdYHLIWAShsDtM5zHaHZ+2YdaG1ZjwPzkaF/zPfnde7kSS1YyshD/0YVSiuW/J4l0AtOu66Fon2pPFfgRQsOu7VTiTyIJ0wBJ3jOqdokmQUvRRiMcZl3SUs7JJbCaUTdn9yRd/vFhawMnctTBFecsMsF/LyEkDzt2dbvi7K47MF+QEBHp/OBs7KfcO2gdt7XEOTDXMgj47MjaBoQiCdLpwHHIri912HFYNfpfHnw8RYh6QaLOnt0o= X-Forefront-Antispam-Report: 
CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:PH7PR10MB5698.namprd10.prod.outlook.com;PTR:;CAT:NONE;SFS:(366004)(4326008)(2906002)(316002)(38100700002)(5660300002)(38350700002)(6666004)(83380400001)(508600001)(66946007)(66476007)(66556008)(26005)(52116002)(186003)(6506007)(7416002)(30864003)(8676002)(110136005)(8936002)(921005)(1076003)(6486002)(2616005)(36756003)(86362001)(103116003)(107886003)(6512007)(559001)(579004);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 8F702XA9Vdx4KKKPBDkzM/nTdA2Hm+Tl6HHIWV8dacc07Vwab3jwCLrmAtrpK6AHDRiQ3bNnGpu0M0myCydCUtgQfj52doteIoSOtUzBBcXBUKvDkcCYHW82p9+aJGsNNI/s199w8SL8IodFHeP9TlLVpgTk1AMb6taAFeW/Y+UHobMurlXRLSMSYqFw5HREn/0pg0CjPSGbG775XWKoil/Yghq+94R6qUoREQxtbR63JxON8lgoUsGwpuZv7pk4qaUlu80l362i7SFtsJJJJGpyXIPVUJMHt3cv7X+wBneNO3ceLnVz+WQzNmUjVDDCa6eW9kife9Hlp7tsVshcKmMZmF87CvhD5LBxCFxzttCYI2LY0D0cS/qOqPzcidilSLyEacW00kyUMNbrIyowYocqP14p4ADZyhwHYiKCI2N/7iXPgkNpH9383nXXUZEzPFPaG0cRkWm9pZNxhtAK2psCAp99hX09/x7Ly5ODPTxniacQ7JARuHmgen1B/7mDShymMG98Rtn4U94YVFMe7YF9uJoCZqG8HxvL/RzwBYgqYdlcWF/apMqOFWTrIuBVZm3sJow1MCeeDNNgOMqZ/MJhDqNIenXcdjWuDC/+wqZyWVkKx+jBpWd1CTlQul+htkLyXc9Y49uMSxxHKOkYdE2JzabT0wPcO9MFURnno4GIlOdEgkzdDtvjFS5HvXtvxqBAaXUvVCk9UuLXEaxkxUixaT0hlBc1XlayW/2ZpJau9+4cwXorOpV725IF/TpuuA6pLC/bepLPFckNMOG5rKyoHOO34ogsSCAukrksYO9hlOk7A29hAv37Ny4AuYMZSrrCYW/FSmCc76KszHjMxK2CKlAvnGlR+9Z5NKzNmVsBLJVcwn9OTJQQmiPh4UBwd3dKUfWT7U72mFB1vIpCM9tfy5IAMdv0FATWpavZ2CGn9pu+TS/zFsLvbsBGPReZG5zQG/lvWbZWElXBx72Y4F50iqt3NMnUGf6BD+PPfTrICVnN5B9+j4c36o9Q3wwVJB+GEZhMnIyPNqpcKfRzumgfnD/qMlxl9J2ZDDaf/Z7VuCOUAAIM1/QkTBICa8I7y78YaXCe2/YO5NbMLjKnmxMDLvr02AV0/Z1s1chk5jFfrnzfrtmPJ2m2iSiaFX8QRTt3Wa0YVzdowCPwZWfPSk2rLFRkj46RpB+kKex0Fef5oLdzYvrZKA+2+eY4tpf70dULJRs6c6rwnHoEtQSLv2WgFUfwJxu7U+4tYLHnoed4nIOEqdvmimCQHaFPjBcoM0T0L+d9Bi8sFzlkT+vC25p0P31Ru/5UTVgUAXRBlWCIFTT8ay96zUOpOpAHQOo6/F3mKxdMg0heRlGSbdCUvEUATPtrpOfBxP3g+qLRgr8/qCWv64iHS/l9rKgBSLuvs0VdAVidyfSmer/wCJFrnz/8XR0rxUR8lciGlzvV3R+Th
K71FSnJwPOuP3e/wxBoHs+6bPavi7qSyoifxd5uRAD5O0be/Hg7xUuSXn0iPMf8mMZTGpQ6frWTDjTRfKApadBk0aiNgKTPKYIdW7R+kgdv885sNvfnB8vsrIIldwv58WmCjME4ZuKKGUxxuRZ/N+HFfs+s9XM4MWeuQtM/OYyj3lpbTZr0ykaBZUy71vHYH6ZpQ/Sy42JkARaGfYc8Vgu4Dzb6f+tQVNWKoRIv2/7dXHp3h7QorCoQUZWNK+E= X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: 4df35cf5-1c23-425a-16ef-08d9d0ae2146 X-MS-Exchange-CrossTenant-AuthSource: PH7PR10MB5698.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 06 Jan 2022 00:47:35.2874 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 4Yb7p1nBYzwhZliJRsQU352xpyg2JZAhoTS4ug+5/rGGwqOF8s3kkOvxeVKHQAwR/KS+DPCIBvv09LXMxwYbyAnOOFR4Xo4aaVrsKr7BRXE= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4422 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10218 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 bulkscore=0 mlxscore=0 malwarescore=0 suspectscore=0 mlxlogscore=999 spamscore=0 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2112160000 definitions=main-2201060001 X-Proofpoint-ORIG-GUID: nOxwIalNXqSvuYj1Dve3zqXah1nGdb0e X-Proofpoint-GUID: nOxwIalNXqSvuYj1Dve3zqXah1nGdb0e Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org padata threads hold mmap_lock as reader for the majority of their runtime in order to call pin_user_pages_remote(), but they also periodically take mmap_lock as writer for short periods to adjust mm->locked_vm, hurting parallelism. Alleviate the write-side contention with a per-thread cache of locked_vm which allows taking mmap_lock as writer far less frequently. Failure to refill the cache due to insufficient locked_vm will not cause the entire pinning operation to error out. 
This avoids spurious failure in case some pinned pages aren't accounted
to locked_vm.

Cache size is limited to provide some protection in the unlikely event
of a concurrent locked_vm accounting operation in the same address space
needlessly failing in case the cache takes more locked_vm than it needs.

Performance Testing
===================

The tests measure the time from qemu invocation to roughly the end of
qemu guest initialization, and cover all combinations of these
parameters:

 - guest memory type (hugetlb, THP)
 - guest memory size (16, 128, 360-or-980 G)
 - number of qemu prealloc threads (0, 16)

The goal is to find reasonable values for

 - number of padata threads (0, 8, 16, 24, 32)
 - locked_vm cache size in pages (0, 32768, 65536, 131072)

The winning compromises seem to be 16 threads and 65536 pages.  They
both balance between performance on the one hand and threading
efficiency or needless locked_vm accounting failures on the other.

Hardware info:

 - Intel Xeon Platinum 8167M (Skylake)
     2 nodes * 26 cores * 2 threads = 104 CPUs
     2.00GHz, performance scaling governor, turbo enabled
     384G/node = 768G memory

 - AMD EPYC 7J13 (Milan)
     2 nodes * 64 cores * 2 threads = 256 CPUs
     2.50GHz, performance scaling governor, turbo enabled
     ~1T/node = ~2T memory

The base kernel is 5.14.  I had to downgrade from 5.15 because of an
intel iommu bug that's since been fixed.  The qemu version is 6.2.0-rc4.

Key:

    qthr: number of qemu prealloc threads
    mem: guest memory size
    pin: wall time of the largest VFIO page pin (qemu does several)
    qemu init: wall time of qemu invocation to roughly the end of qemu init
    thr: number of padata threads

Summary Data
============

All tests in the summary section use 16 padata threads and 65536 pages
of locked_vm cache.

With these settings, there's some contention on pmd lock.  When
increasing the padata min chunk size from 128M to 1G to align threads on
PMD page table boundaries, the contention drops significantly but the
times get worse (don't you hate it when that happens?).  I'm planning to
look into this more.

qemu prealloc significantly reduces the pinning time, as expected, but
counterintuitively makes qemu on the base kernel take longer to
initialize a THP-backed guest than when qemu prealloc isn't used.
That's something to investigate, but not this series.

Intel
~~~~~

                      base                              test
             ......................  ..........................................
                          qemu                              qemu   qemu
        mem   pin         init           pin   pin          init   init
  qthr  (G)   (s) (std)    (s) (std) speedup   (s) (std) speedup    (s) (std)

hugetlb
     0   16   2.9 (0.0)    3.8 (0.0)   11.2x   0.3 (0.0)    5.2x    0.7 (0.0)
     0  128  26.6 (0.1)   28.0 (0.1)   12.0x   2.2 (0.0)    8.7x    3.2 (0.0)
     0  360  75.1 (0.5)   77.5 (0.5)   11.9x   6.3 (0.0)    9.2x    8.4 (0.0)
    16   16   0.1 (0.0)    0.7 (0.0)    2.5x   0.0 (0.0)    1.1x    0.7 (0.0)
    16  128   0.6 (0.0)    3.6 (0.0)    7.9x   0.1 (0.0)    1.2x    3.0 (0.0)
    16  360   1.8 (0.0)    9.4 (0.0)    8.5x   0.2 (0.0)    1.2x    7.8 (0.0)

THP
     0   16   3.3 (0.0)    4.2 (0.0)    7.3x   0.4 (0.0)    4.2x    1.0 (0.0)
     0  128  29.5 (0.2)   30.5 (0.2)   11.8x   2.5 (0.0)    9.6x    3.2 (0.0)
     0  360  83.8 (0.6)   85.1 (0.6)   11.9x   7.0 (0.0)   10.7x    8.0 (0.0)
    16   16   0.6 (0.0)    6.1 (0.0)    4.0x   0.1 (0.0)    1.1x    5.6 (0.1)
    16  128   5.1 (0.0)   44.5 (0.0)    9.6x   0.5 (0.0)    1.1x   40.3 (0.4)
    16  360  14.4 (0.0)  125.4 (0.3)    9.7x   1.5 (0.0)    1.1x  111.5 (0.8)

AMD
~~~

                      base                              test
             .......................  ..........................................
                          qemu                              qemu   qemu
        mem   pin         init           pin   pin          init   init
  qthr  (G)   (s) (std)    (s) (std) speedup   (s) (std) speedup    (s) (std)

hugetlb
     0   16   1.1 (0.0)    1.5 (0.0)    4.3x   0.2 (0.0)    2.6x    0.6 (0.0)
     0  128   9.6 (0.1)   10.2 (0.1)    4.3x   2.2 (0.0)    3.6x    2.8 (0.0)
     0  980  74.1 (0.7)   75.7 (0.7)    4.3x  17.1 (0.0)    3.9x   19.2 (0.0)
    16   16   0.0 (0.0)    0.6 (0.0)    3.2x   0.0 (0.0)    1.0x    0.6 (0.0)
    16  128   0.3 (0.0)    2.7 (0.0)    8.5x   0.0 (0.0)    1.1x    2.4 (0.0)
    16  980   2.0 (0.0)   18.2 (0.1)    8.1x   0.3 (0.0)    1.1x   16.4 (0.0)

THP
     0   16   1.2 (0.0)    1.7 (0.0)    4.0x   0.3 (0.0)    2.3x    0.7 (0.0)
     0  128  10.9 (0.1)   11.4 (0.1)    4.1x   2.7 (0.2)    3.7x    3.1 (0.2)
     0  980  85.3 (0.6)   86.1 (0.6)    4.7x  18.3 (0.0)    4.5x   19.0 (0.0)
    16   16   0.5 (0.3)    6.2 (0.4)    5.1x   0.1 (0.0)    1.1x    5.7 (0.1)
    16  128   3.4 (0.8)   45.5 (1.0)    8.5x   0.4 (0.1)    1.1x   42.1 (0.2)
    16  980  19.6 (0.9)  337.9 (0.7)    6.5x   3.0 (0.2)    1.1x  320.4 (0.7)

All Data
========

The first row in each table is the base kernel (0 threads).  The
remaining rows are all the test kernel and are sorted by fastest time.

Intel
~~~~~

hugetlb

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

     0    16       --    0       --    2.9 (0.0)       --    3.8 (0.0)
                65536   16    11.2x    0.3 (0.0)     5.2x    0.7 (0.0)
                65536   24    11.2x    0.3 (0.0)     5.2x    0.7 (0.0)
               131072   16    11.2x    0.3 (0.0)     5.1x    0.7 (0.0)
               131072   24    10.9x    0.3 (0.0)     5.1x    0.7 (0.0)
                32768   16    10.2x    0.3 (0.0)     5.1x    0.7 (0.0)
               131072   32    10.4x    0.3 (0.0)     5.1x    0.7 (0.0)
                65536   32    10.4x    0.3 (0.0)     5.1x    0.7 (0.0)
                32768   32    10.5x    0.3 (0.0)     5.1x    0.7 (0.0)
                32768   24    10.0x    0.3 (0.0)     5.1x    0.7 (0.0)
               131072    8     7.4x    0.4 (0.0)     4.2x    0.9 (0.0)
                65536    8     7.1x    0.4 (0.0)     4.1x    0.9 (0.0)
                32768    8     6.8x    0.4 (0.0)     4.1x    0.9 (0.0)
                    0    8     2.7x    1.1 (0.3)     2.3x    1.6 (0.3)
                    0   16     1.9x    1.6 (0.0)     1.7x    2.2 (0.0)
                    0   32     1.9x    1.6 (0.0)     1.7x    2.2 (0.0)
                    0   24     1.8x    1.6 (0.0)     1.7x    2.2 (0.0)
               131072    1     1.0x    2.9 (0.0)     1.0x    3.8 (0.0)
                    0    1     1.0x    2.9 (0.0)     1.0x    3.8 (0.0)
                65536    1     1.0x    2.9 (0.0)     1.0x    3.8 (0.0)
                32768    1     1.0x    3.0 (0.0)     1.0x    3.8 (0.0)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

     0   128       --    0       --   26.6 (0.1)       --   28.0 (0.1)
               131072   24    13.1x    2.0 (0.0)     9.2x    3.0 (0.0)
               131072   32    12.9x    2.1 (0.0)     9.2x    3.1 (0.0)
               131072   16    12.7x    2.1 (0.0)     9.1x    3.1 (0.0)
                65536   24    12.3x    2.2 (0.0)     8.9x    3.1 (0.0)
                65536   32    12.3x    2.2 (0.0)     8.9x    3.2 (0.0)
                65536   16    12.0x    2.2 (0.0)     8.7x    3.2 (0.0)
                32768   24    11.1x    2.4 (0.0)     8.3x    3.4 (0.0)
                32768   32    11.0x    2.4 (0.0)     8.2x    3.4 (0.0)
                32768   16    11.0x    2.4 (0.0)     8.2x    3.4 (0.0)
               131072    8     7.5x    3.6 (0.0)     6.1x    4.6 (0.0)
                65536    8     7.1x    3.7 (0.1)     5.9x    4.8 (0.0)
                32768    8     6.8x    3.9 (0.1)     5.7x    4.9 (0.1)
                    0    8     3.0x    8.9 (0.6)     2.8x   10.0 (0.7)
                    0   16     1.9x   13.8 (0.3)     1.9x   14.9 (0.3)
                    0   32     1.9x   14.1 (0.2)     1.8x   15.2 (0.3)
                    0   24     1.8x   14.4 (0.1)     1.8x   15.6 (0.1)
               131072    1     1.0x   26.4 (0.2)     1.0x   27.8 (0.2)
                65536    1     1.0x   26.5 (0.0)     1.0x   27.9 (0.0)
                    0    1     1.0x   26.6 (0.3)     1.0x   27.9 (0.3)
                32768    1     1.0x   26.6 (0.2)     1.0x   28.0 (0.2)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

     0   360       --    0       --   75.1 (0.5)       --   77.5 (0.5)
               131072   24    13.0x    5.8 (0.0)     9.9x    7.8 (0.0)
               131072   32    12.9x    5.8 (0.0)     9.8x    7.9 (0.0)
               131072   16    12.6x    6.0 (0.0)     9.6x    8.1 (0.0)
                65536   24    12.4x    6.0 (0.0)     9.6x    8.1 (0.0)
                65536   32    12.1x    6.2 (0.0)     9.4x    8.3 (0.0)
                65536   16    11.9x    6.3 (0.0)     9.2x    8.4 (0.0)
                32768   24    11.3x    6.6 (0.0)     8.9x    8.7 (0.0)
                32768   16    10.9x    6.9 (0.0)     8.7x    9.0 (0.0)
                32768   32    10.7x    7.0 (0.1)     8.6x    9.1 (0.1)
               131072    8     7.4x   10.1 (0.0)     6.3x   12.3 (0.0)
                65536    8     7.2x   10.5 (0.1)     6.2x   12.6 (0.1)
                32768    8     6.8x   11.1 (0.1)     5.9x   13.2 (0.1)
                    0    8     3.2x   23.6 (0.3)     3.0x   25.7 (0.3)
                    0   32     1.9x   39.2 (0.2)     1.9x   41.5 (0.2)
                    0   16     1.9x   39.8 (0.4)     1.8x   42.0 (0.4)
                    0   24     1.8x   40.9 (0.4)     1.8x   43.1 (0.4)
                32768    1     1.0x   74.9 (0.5)     1.0x   77.3 (0.5)
               131072    1     1.0x   75.3 (0.6)     1.0x   77.7 (0.6)
                    0    1     1.0x   75.6 (0.2)     1.0x   78.1 (0.2)
                65536    1     1.0x   75.9 (0.1)     1.0x   78.4 (0.1)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

    16    16       --    0       --    0.1 (0.0)       --    0.7 (0.0)
                65536   24     8.3x    0.0 (0.0)     1.1x    0.7 (0.0)
                65536   32     6.3x    0.0 (0.0)     1.1x    0.7 (0.0)
               131072   32     4.2x    0.0 (0.0)     1.1x    0.7 (0.0)
                65536    8     4.2x    0.0 (0.0)     1.1x    0.7 (0.0)
               131072   24     4.2x    0.0 (0.0)     1.1x    0.7 (0.0)
                32768   16     3.9x    0.0 (0.0)     1.1x    0.7 (0.0)
                32768   32     3.5x    0.0 (0.0)     1.1x    0.7 (0.0)
                32768   24     4.0x    0.0 (0.0)     1.1x    0.7 (0.0)
                32768    8     2.6x    0.0 (0.0)     1.1x    0.7 (0.0)
                    0   16     3.1x    0.0 (0.0)     1.1x    0.7 (0.0)
               131072   16     2.7x    0.0 (0.0)     1.1x    0.7 (0.0)
                65536   16     2.5x    0.0 (0.0)     1.1x    0.7 (0.0)
                    0   24     2.5x    0.0 (0.0)     1.1x    0.7 (0.0)
                    0    8     2.8x    0.0 (0.0)     1.1x    0.7 (0.0)
               131072    8     2.2x    0.0 (0.0)     1.1x    0.7 (0.0)
                    0   32     2.3x    0.0 (0.0)     1.1x    0.7 (0.0)
                32768    1     0.9x    0.1 (0.0)     1.0x    0.8 (0.0)
               131072    1     0.9x    0.1 (0.0)     1.0x    0.8 (0.0)
                65536    1     0.9x    0.1 (0.0)     1.0x    0.8 (0.0)
                    0    1     0.9x    0.1 (0.0)     1.0x    0.8 (0.0)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

    16   128       --    0       --    0.6 (0.0)       --    3.6 (0.0)
               131072   24    13.4x    0.0 (0.0)     1.2x    3.0 (0.0)
                65536   32    11.8x    0.1 (0.0)     1.2x    3.0 (0.0)
               131072   32    11.8x    0.1 (0.0)     1.2x    3.0 (0.0)
                32768   32    10.4x    0.1 (0.0)     1.2x    3.0 (0.0)
                32768   24     9.3x    0.1 (0.0)     1.2x    3.0 (0.0)
               131072   16     8.7x    0.1 (0.0)     1.2x    3.0 (0.0)
                65536   16     7.9x    0.1 (0.0)     1.2x    3.0 (0.0)
                32768   16     7.7x    0.1 (0.0)     1.2x    3.0 (0.0)
                65536   24     7.6x    0.1 (0.0)     1.2x    3.0 (0.0)
               131072    8     5.7x    0.1 (0.0)     1.2x    3.0 (0.0)
                65536    8     4.9x    0.1 (0.0)     1.2x    3.1 (0.0)
                32768    8     4.6x    0.1 (0.0)     1.2x    3.1 (0.0)
                    0    8     3.9x    0.2 (0.0)     1.1x    3.1 (0.0)
                    0   16     3.1x    0.2 (0.1)     1.1x    3.1 (0.1)
                    0   24     2.9x    0.2 (0.0)     1.1x    3.2 (0.0)
                    0   32     2.6x    0.2 (0.0)     1.1x    3.2 (0.0)
               131072    1     0.9x    0.7 (0.0)     1.0x    3.6 (0.0)
                65536    1     0.9x    0.7 (0.0)     1.0x    3.6 (0.0)
                32768    1     0.9x    0.7 (0.0)     1.0x    3.6 (0.0)
                    0    1     0.9x    0.7 (0.0)     1.0x    3.6 (0.0)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

    16   360       --    0       --    1.8 (0.0)       --    9.4 (0.0)
               131072   32    15.1x    0.1 (0.0)     1.2x    7.7 (0.0)
                65536   32    13.5x    0.1 (0.0)     1.2x    7.7 (0.0)
                65536   24    11.6x    0.2 (0.0)     1.2x    7.8 (0.0)
               131072   16    11.5x    0.2 (0.0)     1.2x    7.8 (0.0)
                32768   32    11.3x    0.2 (0.0)     1.2x    7.8 (0.0)
                32768   24    10.5x    0.2 (0.0)     1.2x    7.8 (0.0)
               131072   24    10.4x    0.2 (0.0)     1.2x    7.8 (0.0)
                32768   16     8.8x    0.2 (0.0)     1.2x    7.8 (0.0)
                65536   16     8.5x    0.2 (0.0)     1.2x    7.8 (0.0)
               131072    8     6.1x    0.3 (0.0)     1.2x    7.9 (0.1)
                65536    8     5.5x    0.3 (0.0)     1.2x    7.9 (0.0)
                32768    8     5.3x    0.3 (0.0)     1.2x    7.9 (0.0)
                    0    8     4.8x    0.4 (0.1)     1.2x    8.0 (0.1)
                    0   16     3.3x    0.5 (0.1)     1.2x    8.1 (0.1)
                    0   24     3.1x    0.6 (0.0)     1.1x    8.2 (0.0)
                    0   32     2.7x    0.7 (0.0)     1.1x    8.3 (0.0)
               131072    1     0.9x    1.9 (0.0)     1.0x    9.5 (0.0)
                32768    1     0.9x    1.9 (0.0)     1.0x    9.5 (0.0)
                65536    1     0.9x    1.9 (0.0)     1.0x    9.6 (0.0)
                    0    1     0.9x    1.9 (0.0)     1.0x    9.5 (0.0)

THP

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

     0    16       --    0       --    3.3 (0.0)       --    4.2 (0.0)
                32768   32     7.5x    0.4 (0.0)     4.3x    1.0 (0.0)
               131072   32     7.6x    0.4 (0.0)     4.3x    1.0 (0.0)
                65536   16     7.3x    0.4 (0.0)     4.2x    1.0 (0.0)
                65536   32     7.5x    0.4 (0.0)     4.3x    1.0 (0.0)
               131072   16     7.2x    0.5 (0.0)     4.2x    1.0 (0.0)
                65536   24     7.0x    0.5 (0.0)     4.1x    1.0 (0.0)
               131072   24     6.9x    0.5 (0.0)     4.1x    1.0 (0.0)
                32768   16     6.3x    0.5 (0.0)     3.9x    1.1 (0.0)
                32768   24     5.7x    0.6 (0.0)     3.8x    1.1 (0.0)
                32768    8     5.0x    0.7 (0.0)     3.5x    1.2 (0.0)
                65536    8     5.4x    0.6 (0.0)     3.4x    1.2 (0.1)
               131072    8     5.7x    0.6 (0.0)     3.5x    1.2 (0.1)
                    0   32     2.0x    1.6 (0.1)     1.8x    2.3 (0.1)
                    0   24     1.9x    1.7 (0.0)     1.7x    2.5 (0.1)
                    0   16     1.8x    1.8 (0.3)     1.6x    2.6 (0.3)
                    0    8     1.9x    1.7 (0.3)     1.7x    2.5 (0.3)
                    0    1     1.0x    3.3 (0.0)     1.0x    4.2 (0.0)
               131072    1     1.0x    3.3 (0.0)     1.0x    4.2 (0.0)
                65536    1     1.0x    3.3 (0.0)     1.0x    4.2 (0.0)
                32768    1     1.0x    3.3 (0.0)     1.0x    4.2 (0.0)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

     0   128       --    0       --   29.5 (0.2)       --   30.5 (0.2)
               131072   24    12.9x    2.3 (0.0)    10.3x    2.9 (0.0)
               131072   32    12.8x    2.3 (0.0)    10.2x    3.0 (0.0)
               131072   16    12.5x    2.4 (0.0)    10.0x    3.0 (0.0)
                65536   24    12.1x    2.4 (0.0)     9.8x    3.1 (0.0)
                65536   32    12.0x    2.4 (0.0)     9.8x    3.1 (0.0)
                65536   16    11.8x    2.5 (0.0)     9.6x    3.2 (0.0)
                32768   24    11.1x    2.7 (0.0)     9.1x    3.3 (0.0)
                32768   32    10.7x    2.7 (0.0)     8.9x    3.4 (0.0)
                32768   16    10.6x    2.8 (0.0)     8.8x    3.5 (0.0)
               131072    8     7.3x    4.0 (0.0)     6.4x    4.8 (0.0)
                65536    8     7.1x    4.2 (0.0)     6.2x    4.9 (0.0)
                32768    8     6.6x    4.4 (0.0)     5.8x    5.2 (0.0)
                    0    8     3.6x    8.1 (0.7)     3.4x    9.0 (0.7)
                    0   32     2.2x   13.6 (1.9)     2.1x   14.5 (1.9)
                    0   16     2.1x   14.0 (3.2)     2.1x   14.8 (3.2)
                    0   24     2.1x   14.1 (3.1)     2.0x   15.0 (3.1)
                    0    1     1.0x   29.6 (0.2)     1.0x   30.6 (0.2)
                32768    1     1.0x   29.6 (0.2)     1.0x   30.7 (0.2)
               131072    1     1.0x   29.7 (0.0)     1.0x   30.7 (0.0)
                65536    1     1.0x   29.8 (0.1)     1.0x   30.8 (0.1)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

     0   360       --    0       --   83.8 (0.6)       --   85.1 (0.6)
               131072   24    13.6x    6.2 (0.0)    12.0x    7.1 (0.0)
               131072   32    13.4x    6.2 (0.0)    11.9x    7.2 (0.0)
                65536   24    12.8x    6.6 (0.1)    11.3x    7.5 (0.1)
               131072   16    12.7x    6.6 (0.0)    11.3x    7.5 (0.0)
                65536   32    12.4x    6.8 (0.0)    11.0x    7.7 (0.0)
                65536   16    11.9x    7.0 (0.0)    10.7x    8.0 (0.0)
                32768   24    11.4x    7.4 (0.0)    10.3x    8.3 (0.0)
                32768   32    11.0x    7.6 (0.0)    10.0x    8.5 (0.0)
                32768   16    10.7x    7.8 (0.0)     9.7x    8.8 (0.0)
               131072    8     7.4x   11.4 (0.0)     6.8x   12.4 (0.0)
                65536    8     7.2x   11.7 (0.0)     6.7x   12.7 (0.0)
                32768    8     6.7x   12.6 (0.1)     6.2x   13.6 (0.1)
                    0    8     3.1x   27.2 (6.1)     3.0x   28.3 (6.1)
                    0   32     2.1x   39.9 (6.4)     2.1x   41.0 (6.4)
                    0   24     2.1x   40.6 (6.6)     2.0x   41.7 (6.6)
                    0   16     2.0x   42.6 (0.1)     1.9x   43.8 (0.1)
               131072    1     1.0x   83.9 (0.5)     1.0x   85.2 (0.5)
                65536    1     1.0x   84.2 (0.3)     1.0x   85.5 (0.3)
                32768    1     1.0x   84.6 (0.1)     1.0x   85.9 (0.1)
                    0    1     1.0x   84.9 (0.1)     1.0x   86.2 (0.1)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

    16    16       --    0       --    0.6 (0.0)       --    6.1 (0.0)
                65536   32     3.9x    0.1 (0.0)     1.1x    5.5 (0.0)
                32768   32     3.9x    0.1 (0.0)     1.1x    5.6 (0.0)
               131072   24     3.9x    0.1 (0.0)     1.1x    5.6 (0.0)
               131072   32     3.9x    0.1 (0.0)     1.1x    5.5 (0.0)
                65536   24     3.9x    0.1 (0.0)     1.1x    5.6 (0.0)
                32768   24     3.9x    0.1 (0.0)     1.1x    5.6 (0.1)
                65536   16     4.0x    0.1 (0.0)     1.1x    5.6 (0.1)
                32768   16     3.9x    0.1 (0.0)     1.1x    5.6 (0.0)
               131072   16     3.9x    0.1 (0.0)     1.1x    5.6 (0.1)
                65536    8     4.0x    0.1 (0.0)     1.1x    5.6 (0.0)
               131072    8     4.0x    0.1 (0.0)     1.1x    5.7 (0.1)
                32768    8     4.0x    0.1 (0.0)     1.1x    5.6 (0.0)
                    0   32     1.6x    0.4 (0.0)     1.0x    5.9 (0.1)
                    0   24     1.6x    0.4 (0.0)     1.0x    5.9 (0.0)
                    0   16     1.5x    0.4 (0.0)     1.0x    6.0 (0.0)
                    0    8     1.5x    0.4 (0.0)     1.0x    5.9 (0.1)
                65536    1     1.0x    0.6 (0.0)     1.0x    6.1 (0.1)
                32768    1     1.0x    0.6 (0.0)     1.0x    6.1 (0.0)
                    0    1     1.0x    0.6 (0.0)     1.0x    6.2 (0.0)
               131072    1     1.0x    0.6 (0.0)     1.0x    6.2 (0.0)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

    16   128       --    0       --    5.1 (0.0)       --   44.5 (0.0)
               131072   32    16.5x    0.3 (0.0)     1.1x   40.4 (0.3)
                65536   32    15.7x    0.3 (0.0)     1.1x   40.4 (0.6)
               131072   24    13.9x    0.4 (0.0)     1.1x   39.8 (0.3)
                32768   32    14.1x    0.4 (0.0)     1.1x   40.0 (0.5)
                65536   24    12.9x    0.4 (0.0)     1.1x   39.8 (0.5)
                32768   24    12.2x    0.4 (0.0)     1.1x   40.1 (0.1)
                65536   16     9.6x    0.5 (0.0)     1.1x   40.3 (0.4)
               131072   16     9.7x    0.5 (0.0)     1.1x   40.4 (0.5)
                32768   16     9.2x    0.5 (0.0)     1.1x   40.8 (0.5)
                65536    8     5.5x    0.9 (0.0)     1.1x   40.5 (0.5)
               131072    8     5.5x    0.9 (0.0)     1.1x   40.7 (0.6)
                32768    8     5.2x    1.0 (0.0)     1.1x   40.7 (0.3)
                    0   32     1.6x    3.1 (0.0)     1.0x   43.5 (0.8)
                    0   24     1.6x    3.2 (0.0)     1.0x   42.9 (0.5)
                    0   16     1.5x    3.3 (0.0)     1.0x   43.5 (0.4)
                    0    8     1.5x    3.5 (0.0)     1.0x   43.4 (0.5)
                65536    1     1.0x    5.0 (0.0)     1.0x   44.6 (0.1)
                32768    1     1.0x    5.0 (0.0)     1.0x   44.9 (0.2)
               131072    1     1.0x    5.0 (0.0)     1.0x   44.8 (0.2)
                    0    1     1.0x    5.0 (0.0)     1.0x   44.8 (0.3)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

    16   360       --    0       --   14.4 (0.0)       --  125.4 (0.3)
               131072   32    16.5x    0.9 (0.0)     1.1x  112.0 (0.7)
                65536   32    14.9x    1.0 (0.0)     1.1x  113.3 (1.3)
                32768   32    14.0x    1.0 (0.0)     1.1x  112.6 (1.0)
               131072   24    13.5x    1.1 (0.0)     1.1x  111.3 (0.9)
                65536   24    13.3x    1.1 (0.0)     1.1x  112.3 (0.8)
                32768   24    12.4x    1.2 (0.0)     1.1x  111.1 (0.8)
                65536   16     9.7x    1.5 (0.0)     1.1x  111.5 (0.8)
               131072   16     9.7x    1.5 (0.0)     1.1x  112.1 (1.2)
                32768   16     9.3x    1.5 (0.0)     1.1x  113.2 (0.4)
               131072    8     5.5x    2.6 (0.0)     1.1x  114.8 (1.3)
                32768    8     5.5x    2.6 (0.0)     1.1x  114.1 (1.0)
                65536    8     5.4x    2.6 (0.0)     1.1x  115.0 (3.3)
                    0   32     1.6x    8.8 (0.0)     1.0x  120.7 (0.7)
                    0   24     1.6x    8.9 (0.0)     1.1x  119.4 (0.1)
                    0   16     1.5x    9.5 (0.0)     1.0x  120.1 (0.7)
                    0    8     1.4x   10.1 (0.2)     1.0x  123.6 (1.9)
                32768    1     1.0x   14.3 (0.0)     1.0x  126.2 (0.9)
                65536    1     1.0x   14.3 (0.0)     1.0x  125.4 (0.6)
               131072    1     1.0x   14.3 (0.0)     1.0x  126.5 (1.0)
                    0    1     1.0x   14.3 (0.0)     1.0x  124.7 (1.2)

AMD
~~~

hugetlb

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

     0    16       --    0       --    1.1 (0.0)       --    1.5 (0.0)
               131072    8     4.3x    0.2 (0.0)     2.5x    0.6 (0.0)
                65536   16     4.3x    0.2 (0.0)     2.6x    0.6 (0.0)
                65536    8     4.0x    0.3 (0.0)     2.5x    0.6 (0.0)
                65536   24     3.8x    0.3 (0.0)     2.4x    0.6 (0.0)
                32768   32     3.6x    0.3 (0.0)     2.3x    0.6 (0.0)
               131072   32     3.6x    0.3 (0.0)     2.3x    0.6 (0.0)
                65536   32     3.5x    0.3 (0.0)     2.3x    0.6 (0.0)
                32768    8     3.4x    0.3 (0.0)     2.3x    0.7 (0.0)
               131072   24     3.0x    0.3 (0.0)     2.1x    0.7 (0.0)
               131072   16     2.8x    0.4 (0.0)     2.0x    0.8 (0.1)
                32768   16     2.6x    0.4 (0.0)     1.9x    0.8 (0.0)
                32768   24     2.6x    0.4 (0.0)     1.9x    0.8 (0.0)
                    0   32     1.3x    0.8 (0.0)     1.2x    1.2 (0.0)
                    0   24     1.3x    0.8 (0.0)     1.2x    1.3 (0.0)
                    0   16     1.2x    0.9 (0.0)     1.2x    1.3 (0.0)
                    0    8     1.1x    0.9 (0.0)     1.1x    1.4 (0.0)
                32768    1     1.0x    1.1 (0.0)     1.0x    1.5 (0.0)
               131072    1     1.0x    1.1 (0.0)     1.0x    1.5 (0.0)
                    0    1     1.0x    1.1 (0.0)     1.0x    1.5 (0.0)
                65536    1     1.0x    1.1 (0.0)     1.0x    1.5 (0.0)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

     0   128       --    0       --    9.6 (0.1)       --   10.2 (0.1)
               131072   32     4.5x    2.1 (0.0)     3.9x    2.6 (0.0)
               131072    8     4.4x    2.2 (0.0)     3.7x    2.8 (0.1)
                65536   16     4.3x    2.2 (0.0)     3.6x    2.8 (0.0)
               131072   16     4.2x    2.3 (0.1)     3.6x    2.9 (0.0)
                65536    8     4.1x    2.3 (0.0)     3.6x    2.8 (0.0)
               131072   24     4.1x    2.4 (0.1)     3.5x    3.0 (0.1)
                65536   24     3.8x    2.5 (0.0)     3.4x    3.0 (0.0)
                65536   32     3.8x    2.5 (0.0)     3.3x    3.1 (0.0)
                32768   32     3.6x    2.6 (0.0)     3.3x    3.1 (0.0)
                32768    8     3.3x    2.9 (0.1)     2.9x    3.5 (0.1)
                32768   16     3.2x    3.0 (0.3)     2.9x    3.5 (0.3)
                32768   24     2.5x    3.8 (0.0)     2.3x    4.4 (0.0)
                    0   16     1.2x    7.8 (0.1)     1.2x    8.4 (0.1)
                    0    8     1.2x    8.3 (0.1)     1.1x    8.9 (0.1)
                32768    1     1.0x    9.6 (0.1)     1.0x   10.3 (0.1)
                65536    1     1.0x    9.6 (0.0)     1.0x   10.3 (0.1)
               131072    1     1.0x    9.7 (0.0)     1.0x   10.3 (0.0)
                    0    1     1.0x    9.7 (0.0)     1.0x   10.4 (0.0)
                    0   24     0.9x   10.2 (0.6)     0.9x   10.8 (0.6)
                    0   32     0.9x   10.5 (0.5)     0.9x   11.2 (0.5)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

     0   980       --    0       --   74.1 (0.7)       --   75.7 (0.7)
               131072   16     4.7x   15.9 (0.1)     4.3x   17.4 (0.1)
               131072   24     4.6x   16.0 (0.0)     4.2x   18.1 (0.0)
               131072   32     4.6x   16.3 (0.0)     4.1x   18.4 (0.0)
               131072    8     4.4x   16.9 (0.1)     4.1x   18.5 (0.1)
                65536   16     4.3x   17.1 (0.0)     3.9x   19.2 (0.0)
                65536   24     4.3x   17.4 (0.0)     3.9x   19.5 (0.0)
                65536   32     4.2x   17.7 (0.0)     3.8x   19.9 (0.1)
                65536    8     4.1x   18.2 (0.0)     3.7x   20.4 (0.0)
                32768   24     3.7x   19.8 (0.1)     3.4x   22.0 (0.1)
                32768   16     3.7x   20.2 (0.2)     3.5x   21.8 (0.2)
                32768   32     3.6x   20.4 (0.1)     3.4x   22.5 (0.1)
                32768    8     3.4x   21.6 (0.5)     3.3x   23.1 (0.5)
                    0   16     1.2x   60.4 (0.6)     1.2x   62.0 (0.6)
                    0    8     1.1x   65.3 (1.0)     1.1x   67.6 (1.0)
                    0   24     1.0x   73.1 (2.7)     1.0x   75.4 (2.6)
               131072    1     1.0x   75.0 (0.7)     1.0x   77.3 (0.7)
                65536    1     1.0x   75.4 (0.7)     1.0x   77.7 (0.7)
                    0    1     1.0x   75.6 (0.7)     1.0x   77.8 (0.7)
                32768    1     1.0x   75.8 (0.0)     1.0x   78.0 (0.0)
                    0   32     0.8x   92.9 (1.2)     0.8x   95.3 (1.1)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

    16    16       --    0       --    0.0 (0.0)       --    0.6 (0.0)
               131072   24     5.6x    0.0 (0.0)     1.0x    0.6 (0.0)
                32768   16     4.6x    0.0 (0.0)     1.0x    0.6 (0.0)
                32768   32     4.8x    0.0 (0.0)     0.9x    0.6 (0.0)
               131072   16     4.6x    0.0 (0.0)     1.0x    0.6 (0.0)
               131072   32     4.3x    0.0 (0.0)     1.0x    0.6 (0.0)
               131072    8     4.5x    0.0 (0.0)     1.0x    0.6 (0.0)
                32768    8     4.4x    0.0 (0.0)     1.0x    0.6 (0.0)
                65536   24     3.7x    0.0 (0.0)     1.0x    0.6 (0.0)
                65536   16     3.2x    0.0 (0.0)     1.0x    0.6 (0.0)
                65536    8     2.8x    0.0 (0.0)     1.0x    0.6 (0.0)
                32768   24     3.0x    0.0 (0.0)     1.0x    0.6 (0.0)
                65536   32     2.6x    0.0 (0.0)     1.0x    0.6 (0.0)
                    0   32     2.1x    0.0 (0.0)     1.0x    0.6 (0.0)
                    0   16     2.3x    0.0 (0.0)     0.9x    0.6 (0.0)
                    0    8     2.2x    0.0 (0.0)     0.9x    0.6 (0.0)
                    0   24     1.2x    0.0 (0.0)     1.0x    0.6 (0.0)
               131072    1     1.0x    0.0 (0.0)     0.9x    0.6 (0.0)
                65536    1     1.0x    0.0 (0.0)     0.9x    0.7 (0.0)
                32768    1     0.8x    0.0 (0.0)     0.9x    0.6 (0.0)
                    0    1     0.9x    0.0 (0.0)     1.0x    0.6 (0.0)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

    16   128       --    0       --    0.3 (0.0)       --    2.7 (0.0)
               131072   24    10.4x    0.0 (0.0)     1.1x    2.4 (0.0)
                65536   16     8.5x    0.0 (0.0)     1.1x    2.4 (0.0)
                32768   24     7.7x    0.0 (0.0)     1.1x    2.4 (0.0)
                32768   32     7.6x    0.0 (0.0)     1.1x    2.4 (0.0)
                65536   24     6.1x    0.0 (0.0)     1.1x    2.4 (0.0)
               131072   16     5.8x    0.0 (0.0)     1.1x    2.4 (0.0)
               131072   32     5.6x    0.0 (0.0)     1.1x    2.4 (0.0)
                32768    8     5.2x    0.1 (0.0)     1.1x    2.4 (0.0)
                65536   32     4.8x    0.1 (0.0)     1.1x    2.5 (0.0)
                32768   16     4.9x    0.1 (0.0)     1.1x    2.4 (0.0)
               131072    8     4.4x    0.1 (0.0)     1.1x    2.4 (0.0)
                65536    8     4.2x    0.1 (0.0)     1.1x    2.4 (0.0)
                    0    8     2.9x    0.1 (0.0)     1.1x    2.4 (0.0)
                    0   16     2.9x    0.1 (0.0)     1.1x    2.5 (0.0)
                    0   24     2.8x    0.1 (0.0)     1.1x    2.4 (0.0)
                    0   32     1.2x    0.2 (0.0)     1.0x    2.6 (0.0)
                32768    1     1.0x    0.3 (0.0)     1.0x    2.7 (0.0)
               131072    1     1.0x    0.3 (0.0)     1.0x    2.7 (0.0)
                65536    1     1.0x    0.3 (0.0)     1.0x    2.7 (0.0)
                    0    1     0.9x    0.3 (0.0)     1.0x    2.7 (0.0)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

    16   980       --    0       --    2.0 (0.0)       --   18.2 (0.1)
               131072   32    11.2x    0.2 (0.0)     1.2x   15.7 (0.0)
               131072   16     9.4x    0.2 (0.0)     1.2x   15.7 (0.0)
                65536   24     9.2x    0.2 (0.0)     1.1x   16.3 (0.0)
                65536   16     8.1x    0.3 (0.0)     1.1x   16.4 (0.0)
                32768   16     7.1x    0.3 (0.0)     1.1x   15.8 (0.0)
               131072   24     7.1x    0.3 (0.0)     1.1x   15.8 (0.0)
                65536   32     6.2x    0.3 (0.0)     1.1x   16.4 (0.0)
                65536    8     5.7x    0.4 (0.0)     1.1x   16.5 (0.1)
                32768   32     5.6x    0.4 (0.0)     1.1x   16.5 (0.0)
                32768   24     5.6x    0.4 (0.0)     1.1x   15.9 (0.0)
               131072    8     5.0x    0.4 (0.0)     1.1x   16.0 (0.0)
                32768    8     3.0x    0.7 (0.0)     1.1x   16.3 (0.1)
                    0    8     2.8x    0.7 (0.0)     1.1x   16.2 (0.0)
                    0   16     2.7x    0.8 (0.1)     1.1x   16.9 (0.1)
                    0   24     1.6x    1.2 (0.4)     1.0x   17.4 (0.4)
                32768    1     1.0x    2.1 (0.0)     1.0x   18.1 (0.0)
                    0   32     1.0x    2.1 (0.0)     1.0x   17.7 (0.0)
                65536    1     1.0x    2.1 (0.0)     1.0x   18.2 (0.1)
               131072    1     1.0x    2.1 (0.0)     1.0x   18.3 (0.0)
                    0    1     0.9x    2.2 (0.0)     1.0x   17.7 (0.0)

THP

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

     0    16       --    0       --    1.2 (0.0)       --    1.7 (0.0)
               131072    8     4.3x    0.3 (0.0)     2.4x    0.7 (0.0)
               131072   32     3.1x    0.4 (0.0)     2.1x    0.8 (0.0)
                65536   16     4.0x    0.3 (0.0)     2.3x    0.7 (0.0)
                65536    8     3.9x    0.3 (0.0)     2.3x    0.7 (0.0)
                65536   24     3.3x    0.4 (0.0)     2.1x    0.8 (0.0)
                65536   32     3.3x    0.4 (0.0)     2.2x    0.8 (0.0)
                32768   16     2.6x    0.5 (0.0)     1.9x    0.9 (0.0)
               131072   24     3.3x    0.4 (0.0)     2.1x    0.8 (0.0)
                32768   32     3.3x    0.4 (0.0)     2.1x    0.8 (0.0)
               131072   16     3.1x    0.4 (0.0)     2.0x    0.8 (0.0)
                32768   24     2.5x    0.5 (0.0)     1.9x    0.9 (0.0)
                32768    8     3.2x    0.4 (0.0)     1.9x    0.9 (0.0)
                    0   24     1.3x    1.0 (0.0)     1.2x    1.4 (0.0)
                    0   32     1.2x    1.0 (0.0)     1.1x    1.5 (0.1)
                    0    8     1.2x    1.0 (0.0)     1.1x    1.5 (0.0)
                    0   16     1.2x    1.0 (0.0)     1.1x    1.5 (0.0)
               131072    1     1.0x    1.2 (0.0)     1.0x    1.7 (0.0)
                65536    1     1.0x    1.2 (0.0)     1.0x    1.7 (0.0)
                    0    1     1.0x    1.2 (0.0)     1.0x    1.7 (0.0)
                32768    1     1.0x    1.2 (0.0)     1.0x    1.7 (0.0)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

     0   128       --    0       --   10.9 (0.1)       --   11.4 (0.1)
               131072   16     5.0x    2.2 (0.0)     4.3x    2.7 (0.0)
               131072   32     4.8x    2.3 (0.0)     4.2x    2.7 (0.0)
               131072   24     4.6x    2.4 (0.0)     4.1x    2.8 (0.1)
               131072    8     4.7x    2.3 (0.0)     4.1x    2.8 (0.0)
                65536   24     4.4x    2.5 (0.0)     3.9x    2.9 (0.0)
                65536   32     4.1x    2.7 (0.1)     3.7x    3.1 (0.1)
                65536   16     4.1x    2.7 (0.2)     3.7x    3.1 (0.2)
                65536    8     4.0x    2.7 (0.1)     3.6x    3.2 (0.1)
                32768   24     3.8x    2.9 (0.0)     3.4x    3.4 (0.0)
                32768   32     3.6x    3.0 (0.1)     3.3x    3.5 (0.1)
                32768    8     3.5x    3.1 (0.0)     3.2x    3.6 (0.1)
                32768   16     3.3x    3.3 (0.2)     3.1x    3.7 (0.2)
                    0   16     1.3x    8.3 (0.4)     1.3x    8.8 (0.4)
                    0    8     1.2x    8.8 (0.4)     1.2x    9.3 (0.4)
                    0   24     1.1x   10.1 (1.2)     1.1x   10.7 (1.3)
                    0   32     1.1x   10.3 (1.3)     1.1x   10.8 (1.3)
               131072    1     1.0x   10.9 (0.0)     1.0x   11.5 (0.0)
                32768    1     1.0x   11.0 (0.1)     1.0x   11.6 (0.1)
                65536    1     1.0x   11.1 (0.0)     1.0x   11.6 (0.0)
                    0    1     1.0x   11.1 (0.2)     1.0x   11.6 (0.2)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

     0   980       --    0       --   85.3 (0.6)       --   86.1 (0.6)
               131072   16     5.2x   16.4 (0.0)     5.0x   17.1 (0.0)
               131072   24     5.1x   16.7 (0.1)     4.9x   17.4 (0.1)
               131072   32     5.0x   17.1 (0.0)     4.8x   17.8 (0.0)
               131072    8     4.7x   18.3 (0.1)     4.5x   19.0 (0.1)
                65536   16     4.7x   18.3 (0.0)     4.5x   19.0 (0.0)
                65536   24     4.6x   18.5 (0.0)     4.5x   19.2 (0.0)
                65536   32     4.5x   18.8 (0.0)     4.4x   19.6 (0.0)
                65536    8     4.3x   19.6 (0.0)     4.2x   20.4 (0.0)
                32768   16     3.9x   21.6 (0.0)     3.9x   22.4 (0.0)
                32768   24     3.9x   22.1 (0.3)     3.8x   22.8 (0.3)
                32768   32     3.8x   22.4 (0.1)     3.7x   23.1 (0.1)
                32768    8     3.8x   22.7 (0.0)     3.7x   23.5 (0.0)
                    0   16     1.3x   64.6 (2.7)     1.3x   65.4 (2.7)
                    0    8     1.2x   70.0 (2.7)     1.2x   70.8 (2.7)
                    0   32     1.0x   82.4 (5.7)     1.0x   83.2 (5.7)
                    0   24     1.0x   83.4 (6.9)     1.0x   84.1 (6.9)
               131072    1     1.0x   84.2 (0.3)     1.0x   85.0 (0.3)
                    0    1     1.0x   84.8 (1.3)     1.0x   85.6 (1.3)
                65536    1     1.0x   84.9 (0.4)     1.0x   85.7 (0.4)
                32768    1     1.0x   85.6 (1.3)     1.0x   86.4 (1.3)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

    16    16       --    0       --    0.5 (0.3)       --    6.2 (0.4)
               131072   32     4.9x    0.1 (0.0)     1.1x    5.6 (0.0)
                65536   16     5.1x    0.1 (0.0)     1.1x    5.7 (0.1)
                65536   32     5.0x    0.1 (0.0)     1.1x    5.6 (0.1)
                32768   16     5.0x    0.1 (0.0)     1.1x    5.7 (0.0)
                32768    8     5.8x    0.1 (0.0)     1.1x    5.6 (0.0)
                65536   24     5.7x    0.1 (0.0)     1.1x    5.7 (0.0)
                32768   32     3.9x    0.1 (0.0)     1.0x    5.9 (0.1)
               131072   16     3.7x    0.1 (0.1)     1.0x    6.0 (0.3)
                65536    8     4.0x    0.1 (0.1)     1.1x    5.9 (0.1)
               131072   24     3.6x    0.1 (0.1)     1.0x    5.9 (0.5)
               131072    8     2.5x    0.2 (0.1)     1.0x    6.0 (0.6)
                32768   24     1.7x    0.3 (0.1)     1.0x    6.5 (0.2)
               131072    1     1.8x    0.3 (0.0)     1.1x    5.9 (0.0)
                    0   32     1.6x    0.3 (0.0)     1.0x    6.2 (0.2)
                    0    8     1.0x    0.5 (0.0)     1.0x    6.2 (0.1)
                    0   24     0.9x    0.5 (0.3)     1.0x    6.3 (0.5)
                    0    1     0.9x    0.5 (0.4)     1.0x    6.2 (0.5)
                32768    1     0.8x    0.6 (0.3)     1.0x    6.5 (0.4)
                    0   16     0.7x    0.7 (0.7)     0.9x    6.6 (0.8)
                65536    1     0.6x    0.9 (0.5)     0.9x    6.7 (0.7)

             lockedvm                                qemu   qemu
         mem    cache           pin    pin           init   init
  qthr   (G)  (pages)  thr  speedup    (s) (std)  speedup    (s) (std)

    16   128       --    0       --    3.4 (0.8)       --   45.5 (1.0)
               131072   32    11.7x    0.3 (0.1)     1.1x   42.1 (0.2)
                65536   16     8.5x    0.4 (0.1)     1.1x   42.1 (0.2)
                32768   32     8.6x    0.4 (0.2)     1.1x   43.0 (0.2)
                65536   32     8.9x    0.4 (0.1)     1.0x   43.6 (0.3)
                32768   24     7.9x    0.4 (0.1)     1.1x   42.3 (0.3)
                32768   16     6.5x    0.5 (0.2)     1.1x   42.5 (0.5)
                65536   24     6.7x    0.5 (0.2)     1.1x   42.6 (0.5)
               131072   24     5.8x    0.6 (0.5)     1.1x   42.5 (0.6)
               131072   16     5.0x    0.7 (0.6)     1.1x   42.4 (0.8)
               131072    8     3.8x    0.9 (0.4)     1.1x   42.7 (0.5)
                65536    8     3.2x    1.1 (0.5)     1.1x   42.9 (0.6)
                32768    8     3.1x    1.1 (0.4)     1.1x   43.3 (1.0)
                    0   32     1.1x    3.0 (0.2)     1.0x   45.1 (0.2)
                    0   24     1.2x    2.9 (0.1)     1.0x   44.6 (0.2)
                    0    8     1.0x    3.5 (1.1)     1.0x   45.5 (1.2)
                32768    1     1.0x    3.6 (0.9)     1.0x   45.5 (0.7)
               131072    1     1.0x    3.5 (1.1)     1.0x   45.6 (1.4)
                    0    1     0.9x    3.6 (0.5)     1.0x   45.6 (0.4)
                    0
16 0.9x 3.6 (0.2) 1.0x 45.7 (0.2) 65536 1 0.9x 3.6 (1.0) 1.0x 45.8 (1.0) lockedvm qemu qemu mem cache pin pin init init qthr (G) (pages) thr speedup (s) (std) speedup (s) (std) 16 980 -- 0 -- 19.6 (0.9) -- 337.9 (0.7) 131072 32 9.7x 2.0 (0.4) 1.0x 323.0 (0.7) 131072 24 8.8x 2.2 (0.4) 1.0x 324.6 (0.8) 65536 32 8.4x 2.3 (0.2) 1.0x 323.1 (0.5) 32768 24 7.9x 2.5 (0.1) 1.1x 319.4 (1.0) 65536 24 8.1x 2.4 (0.1) 1.0x 322.3 (0.8) 32768 32 7.4x 2.6 (0.2) 1.1x 321.2 (0.8) 131072 16 6.9x 2.8 (0.3) 1.0x 331.0 (8.8) 65536 16 6.5x 3.0 (0.2) 1.1x 320.4 (0.7) 32768 16 5.9x 3.3 (0.5) 1.0x 328.3 (1.5) 65536 8 5.3x 3.7 (0.4) 1.1x 320.8 (1.0) 32768 8 4.8x 4.1 (0.2) 1.0x 328.9 (0.8) 131072 8 4.7x 4.1 (0.2) 1.1x 319.4 (0.9) 0 8 1.2x 16.9 (0.7) 1.0x 333.9 (3.1) 0 32 1.1x 18.0 (0.7) 1.0x 336.1 (0.8) 0 24 1.1x 18.0 (1.6) 1.0x 336.7 (1.7) 65536 1 1.0x 19.0 (0.5) 1.0x 341.0 (0.3) 131072 1 1.0x 19.7 (1.0) 1.0x 335.7 (1.0) 0 16 1.0x 19.8 (1.8) 1.0x 338.8 (1.8) 32768 1 0.9x 20.7 (1.5) 1.0x 337.6 (1.9) 0 1 0.9x 21.3 (1.4) 1.0x 339.5 (1.8) Signed-off-by: Daniel Jordan --- drivers/vfio/vfio_iommu_type1.c | 51 +++++++++++++++++++++++++++++---- 1 file changed, 45 insertions(+), 6 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index faee849f1cce..c2edc5a4c727 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -651,7 +651,7 @@ static int vfio_wait_all_valid(struct vfio_iommu *iommu) static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr, long npage, unsigned long *pfn_base, unsigned long limit, struct vfio_batch *batch, - struct mm_struct *mm) + struct mm_struct *mm, long *lock_cache) { unsigned long pfn; long ret, pinned = 0, lock_acct = 0; @@ -709,15 +709,25 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr, * the user. 
 			 */
 			if (!rsvd && !vfio_find_vpfn(dma, iova)) {
-				if (!dma->lock_cap &&
+				if (!dma->lock_cap && *lock_cache == 0 &&
 				    mm->locked_vm + lock_acct + 1 > limit) {
 					pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n",
 						__func__, limit << PAGE_SHIFT);
 					ret = -ENOMEM;
 					goto unpin_out;
 				}
-				lock_acct++;
-			}
+				/*
+				 * Draw from the cache if possible to avoid
+				 * taking the write-side mmap_lock in
+				 * vfio_lock_acct(), which will alleviate
+				 * contention with the read-side mmap_lock in
+				 * vaddr_get_pfn().
+				 */
+				if (*lock_cache > 0)
+					(*lock_cache)--;
+				else
+					lock_acct++;
+			}
 
 			pinned++;
 			npage--;
@@ -1507,6 +1517,13 @@ static void vfio_pin_map_dma_undo(unsigned long start_vaddr,
 	vfio_unmap_unpin(args->iommu, args->dma, iova, end, true);
 }
 
+/*
+ * Relieve mmap_lock contention when multithreading page pinning by caching
+ * locked_vm locally.  Bound the locked_vm that a thread will cache but not use
+ * with this constant, which compromises between performance and overaccounting.
+ */
+#define LOCKED_VM_CACHE_PAGES	65536
+
 static int vfio_pin_map_dma_chunk(unsigned long start_vaddr,
 				  unsigned long end_vaddr, void *arg)
 {
@@ -1515,6 +1532,7 @@ static int vfio_pin_map_dma_chunk(unsigned long start_vaddr,
 	dma_addr_t iova = dma->iova + (start_vaddr - dma->vaddr);
 	unsigned long unmapped_size = end_vaddr - start_vaddr;
 	unsigned long pfn, mapped_size = 0;
+	long cache_size, lock_cache = 0;
 	struct vfio_batch batch;
 	long npage;
 	int ret = 0;
@@ -1522,11 +1540,29 @@ static int vfio_pin_map_dma_chunk(unsigned long start_vaddr,
 	vfio_batch_init(&batch);
 
 	while (unmapped_size) {
+		if (lock_cache == 0) {
+			cache_size = min_t(long, unmapped_size >> PAGE_SHIFT,
+					   LOCKED_VM_CACHE_PAGES);
+			ret = vfio_lock_acct(dma, cache_size, false);
+			/*
+			 * More locked_vm is cached than might be used, so
+			 * don't fail on -ENOMEM, i.e. exceeding RLIMIT_MEMLOCK.
+			 */
+			if (ret) {
+				if (ret != -ENOMEM) {
+					vfio_batch_unpin(&batch, dma);
+					break;
+				}
+				cache_size = 0;
+			}
+			lock_cache = cache_size;
+		}
+
 		/* Pin a contiguous chunk of memory */
 		npage = vfio_pin_pages_remote(dma, start_vaddr + mapped_size,
 					      unmapped_size >> PAGE_SHIFT,
 					      &pfn, args->limit, &batch,
-					      args->mm);
+					      args->mm, &lock_cache);
 		if (npage <= 0) {
 			WARN_ON(!npage);
 			ret = (int)npage;
@@ -1548,6 +1584,7 @@ static int vfio_pin_map_dma_chunk(unsigned long start_vaddr,
 	}
 
 	vfio_batch_fini(&batch);
+	vfio_lock_acct(dma, -lock_cache, false);
 
 	/*
 	 * Undo the successfully completed part of this chunk now.  padata will
@@ -1771,6 +1808,7 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
 	struct rb_node *n;
 	unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
 	int ret;
+	long lock_cache = 0;
 
 	ret = vfio_wait_all_valid(iommu);
 	if (ret < 0)
@@ -1832,7 +1870,8 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
 								      n >> PAGE_SHIFT,
 								      &pfn, limit,
 								      &batch,
-								      current->mm);
+								      current->mm,
+								      &lock_cache);
 					if (npage <= 0) {
 						WARN_ON(!npage);
 						ret = (int)npage;

From patchwork Thu Jan 6 00:46:49 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704974
From: Daniel Jordan
To: Alexander Duyck, Alex Williamson, Andrew Morton, Ben Segall,
    Cornelia Huck, Dan Williams, Dave Hansen, Dietmar Eggemann, Herbert Xu,
    Ingo Molnar, Jason Gunthorpe, Johannes Weiner, Josh Triplett,
    Michal Hocko, Nico Pache, Pasha Tatashin, Peter Zijlstra,
    Steffen Klassert, Steve Sistare, Tejun Heo, Tim Chen, Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 09/16] padata: Use kthreads in do_multithreaded
Date: Wed, 5 Jan 2022 19:46:49 -0500
Message-Id: <20220106004656.126790-10-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
X-Mailing-List: kvm@vger.kernel.org

Unbound kworkers will soon not be ideal for multithreaded jobs because
helpers will inherit the resource controls of the main thread, but
changing these controls (e.g. CPU affinity) might conflict with those
that kworkers already have in place.  While the changes are only
temporary, it seems like a layering violation to mess with kworkers this
way, and undoing the settings might fail or add latency for future
works.

Use kthreads instead, which have none of these issues.
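
Editorial illustration, not part of the patch: the start-up pattern
described above (create one thread per helper, tolerate a failed create by
simply running with fewer helpers, and let the calling thread take part in
the job) can be sketched in userspace.  This is only an analogy under
stated assumptions: POSIX threads stand in for kthreads, and job_state,
helper(), run_job(), MAX_HELPERS and CHUNK are hypothetical names that do
not appear in kernel/padata.c.

```c
/*
 * Userspace analogy only: pthreads stand in for kthreads, and every
 * name below is hypothetical (nothing here is taken from padata.c).
 */
#include <assert.h>
#include <pthread.h>

#define MAX_HELPERS	16
#define CHUNK		64	/* units claimed per grab */

struct job_state {
	pthread_mutex_t lock;
	long position;	/* next unclaimed unit */
	long total;	/* job size in units */
	long done;	/* units processed so far */
};

/* Each thread repeatedly claims a chunk until the job is exhausted. */
static void *helper(void *arg)
{
	struct job_state *js = arg;

	for (;;) {
		long len;

		pthread_mutex_lock(&js->lock);
		len = js->total - js->position;
		if (len > CHUNK)
			len = CHUNK;
		js->position += len;
		js->done += len;	/* the "work" is just counting */
		pthread_mutex_unlock(&js->lock);

		if (len == 0)
			break;
	}
	return NULL;
}

/* Returns the number of units processed, which should equal total. */
long run_job(long total, int nhelpers)
{
	struct job_state js = { .position = 0, .total = total, .done = 0 };
	pthread_t tids[MAX_HELPERS];
	int i, started = 0;

	assert(total >= 0);
	assert(nhelpers >= 0 && nhelpers <= MAX_HELPERS);
	pthread_mutex_init(&js.lock, NULL);

	for (i = 0; i < nhelpers; i++) {
		/* A failed create only means fewer helpers for this job. */
		if (pthread_create(&tids[started], NULL, helper, &js) == 0)
			started++;
	}

	/* The calling thread participates, saving one thread start. */
	helper(&js);

	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	pthread_mutex_destroy(&js.lock);
	return js.done;
}
```

Calling run_job(1000, 4) starts up to four helpers plus the caller and
should return 1000 however many helpers were actually created.  Unlike the
patch, the sketch waits with pthread_join() rather than a completion.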
Signed-off-by: Daniel Jordan
---
 kernel/padata.c | 47 ++++++++++++++++++++++-------------------------
 1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index b458deb17121..00509c83e356 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -29,6 +29,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -37,8 +38,6 @@
 #include
 #include
 
-#define PADATA_WORK_ONSTACK	1	/* Work's memory is on stack */
-
 struct padata_work {
 	struct work_struct	pw_work;
 	struct list_head	pw_list;  /* padata_free_works linkage */
@@ -70,7 +69,6 @@ struct padata_mt_job_state {
 };
 
 static void padata_free_pd(struct parallel_data *pd);
-static void padata_mt_helper(struct work_struct *work);
 
 static int padata_index_to_cpu(struct parallel_data *pd, int cpu_index)
 {
@@ -108,17 +106,7 @@ static struct padata_work *padata_work_alloc(void)
 	return pw;
 }
 
-static void padata_work_init(struct padata_work *pw, work_func_t work_fn,
-			     void *data, int flags)
-{
-	if (flags & PADATA_WORK_ONSTACK)
-		INIT_WORK_ONSTACK(&pw->pw_work, work_fn);
-	else
-		INIT_WORK(&pw->pw_work, work_fn);
-	pw->pw_data = data;
-}
-
-static int padata_work_alloc_mt(int nworks, void *data, struct list_head *head)
+static int padata_work_alloc_mt(int nworks, struct list_head *head)
 {
 	int i;
 
@@ -129,7 +117,6 @@ static int padata_work_alloc_mt(int nworks, void *data, struct list_head *head)
 
 		if (!pw)
 			break;
-		padata_work_init(pw, padata_mt_helper, data, 0);
 		list_add(&pw->pw_list, head);
 	}
 	spin_unlock(&padata_works_lock);
@@ -234,7 +221,8 @@ int padata_do_parallel(struct padata_shell *ps,
 	rcu_read_unlock_bh();
 
 	if (pw) {
-		padata_work_init(pw, padata_parallel_worker, padata, 0);
+		INIT_WORK(&pw->pw_work, padata_parallel_worker);
+		pw->pw_data = padata;
 		queue_work(pinst->parallel_wq, &pw->pw_work);
 	} else {
 		/* Maximum works limit exceeded, run in the current task. */
@@ -449,9 +437,9 @@ static int padata_setup_cpumasks(struct padata_instance *pinst)
 	return err;
 }
 
-static void padata_mt_helper(struct work_struct *w)
+static int padata_mt_helper(void *__pw)
 {
-	struct padata_work *pw = container_of(w, struct padata_work, pw_work);
+	struct padata_work *pw = __pw;
 	struct padata_mt_job_state *ps = pw->pw_data;
 	struct padata_mt_job *job = ps->job;
 	bool done;
@@ -500,6 +488,8 @@ static void padata_mt_helper(struct work_struct *w)
 
 	if (done)
 		complete(&ps->completion);
+
+	return 0;
 }
 
 static int padata_error_cmp(void *unused, const struct list_head *a,
@@ -593,7 +583,7 @@ int padata_do_multithreaded_job(struct padata_mt_job *job,
 	lockdep_init_map(&ps.lockdep_map, map_name, key, 0);
 	INIT_LIST_HEAD(&ps.failed_works);
 	ps.job = job;
-	ps.nworks = padata_work_alloc_mt(nworks, &ps, &works);
+	ps.nworks = padata_work_alloc_mt(nworks, &works);
 	ps.nworks_fini = 0;
 	ps.error = 0;
 	ps.position = job->start;
@@ -612,13 +602,21 @@ int padata_do_multithreaded_job(struct padata_mt_job *job,
 	lock_map_acquire(&ps.lockdep_map);
 	lock_map_release(&ps.lockdep_map);
 
-	list_for_each_entry(pw, &works, pw_list)
-		queue_work(system_unbound_wq, &pw->pw_work);
+	list_for_each_entry(pw, &works, pw_list) {
+		struct task_struct *task;
+
+		pw->pw_data = &ps;
+		task = kthread_create(padata_mt_helper, pw, "padata");
+		if (IS_ERR(task))
+			--ps.nworks;
+		else
+			wake_up_process(task);
+	}
 
-	/* Use the current thread, which saves starting a workqueue worker. */
-	padata_work_init(&my_work, padata_mt_helper, &ps, PADATA_WORK_ONSTACK);
+	/* Use the current task, which saves starting a kthread. */
+	my_work.pw_data = &ps;
 	INIT_LIST_HEAD(&my_work.pw_list);
-	padata_mt_helper(&my_work.pw_work);
+	padata_mt_helper(&my_work);
 
 	/* Wait for all the helpers to finish. */
 	wait_for_completion(&ps.completion);
@@ -626,7 +624,6 @@ int padata_do_multithreaded_job(struct padata_mt_job *job,
 	if (ps.error && job->undo_fn)
 		padata_undo(&ps, &works, &my_work);
 
-	destroy_work_on_stack(&my_work.pw_work);
 	padata_works_free(&works);
 	return ps.error;
 }

From patchwork Thu Jan 6 00:46:50 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704970
From: Daniel Jordan
To: Alexander Duyck, Alex Williamson, Andrew Morton, Ben Segall,
    Cornelia Huck, Dan Williams, Dave Hansen, Dietmar Eggemann, Herbert Xu,
    Ingo Molnar, Jason Gunthorpe, Johannes Weiner, Josh Triplett,
    Michal Hocko, Nico Pache, Pasha Tatashin, Peter Zijlstra,
    Steffen Klassert, Steve Sistare, Tejun Heo, Tim Chen, Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 10/16] padata: Helpers should respect main thread's CPU affinity
Date: Wed, 5 Jan 2022 19:46:50 -0500
Message-Id: <20220106004656.126790-11-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
X-Mailing-List: kvm@vger.kernel.org

Helper threads should run only on the CPUs allowed by the main thread to
honor its CPU affinity.  Similarly, cap the number of helpers started to
the number of CPUs allowed by the main thread's cpumask to avoid
flooding that subset of CPUs.

Signed-off-by: Daniel Jordan
---
 kernel/padata.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index 00509c83e356..0f4002ed1518 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -571,6 +571,7 @@ int padata_do_multithreaded_job(struct padata_mt_job *job,
 	/* Ensure at least one thread when size < min_chunk. */
 	nworks = max(job->size / job->min_chunk, 1ul);
 	nworks = min(nworks, job->max_threads);
+	nworks = min(nworks, current->nr_cpus_allowed);
 
 	if (nworks == 1) {
 		/* Single thread, no coordination needed, cut to the chase. */
@@ -607,10 +608,12 @@ int padata_do_multithreaded_job(struct padata_mt_job *job,
 
 		pw->pw_data = &ps;
 		task = kthread_create(padata_mt_helper, pw, "padata");
-		if (IS_ERR(task))
+		if (IS_ERR(task)) {
 			--ps.nworks;
-		else
+		} else {
+			kthread_bind_mask(task, current->cpus_ptr);
 			wake_up_process(task);
+		}
 	}
 
 	/* Use the current task, which saves starting a kthread. */

From patchwork Thu Jan 6 00:46:51 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704975
From: Daniel Jordan
To: Alexander Duyck, Alex Williamson, Andrew Morton, Ben Segall,
 Cornelia Huck, Dan Williams, Dave Hansen, Dietmar Eggemann, Herbert Xu,
 Ingo Molnar, Jason Gunthorpe, Johannes Weiner, Josh Triplett,
 Michal Hocko, Nico Pache, Pasha Tatashin, Peter Zijlstra,
 Steffen Klassert, Steve Sistare, Tejun Heo, Tim Chen, Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 11/16] padata: Cap helpers started to online CPUs
Date: Wed, 5 Jan 2022 19:46:51 -0500
Message-Id: <20220106004656.126790-12-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

padata can start num_possible_cpus() helpers, but this is too many
considering that every job's main thread participates and there may be
fewer online than possible CPUs.

Limit overall concurrency, including main thread(s), to
num_online_cpus() with the padata_works_inuse counter to prevent
CPU-intensive threads flooding the system in case of concurrent jobs.

Signed-off-by: Daniel Jordan
---
 kernel/padata.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index 0f4002ed1518..e27988d3e9ed 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -50,6 +50,7 @@ struct padata_work {
 
 static DEFINE_SPINLOCK(padata_works_lock);
 static struct padata_work *padata_works;
+static unsigned int padata_works_inuse;
 static LIST_HEAD(padata_free_works);
 
 struct padata_mt_job_state {
@@ -98,11 +99,16 @@ static struct padata_work *padata_work_alloc(void)
 
 	lockdep_assert_held(&padata_works_lock);
 
-	if (list_empty(&padata_free_works))
-		return NULL;	/* No more work items allowed to be queued. */
+	/* Are more work items allowed to be queued? */
+	if (padata_works_inuse >= num_online_cpus())
+		return NULL;
+
+	if (WARN_ON_ONCE(list_empty(&padata_free_works)))
+		return NULL;
 
 	pw = list_first_entry(&padata_free_works, struct padata_work, pw_list);
 	list_del(&pw->pw_list);
+	++padata_works_inuse;
 	return pw;
 }
 
@@ -111,7 +117,11 @@ static int padata_work_alloc_mt(int nworks, struct list_head *head)
 	int i;
 
 	spin_lock(&padata_works_lock);
-	/* Start at 1 because the current task participates in the job. */
+	/*
+	 * Increment inuse and start iterating at 1 to account for the main
+	 * thread participating in the job with its stack-allocated work.
+	 */
+	++padata_works_inuse;
 	for (i = 1; i < nworks; ++i) {
 		struct padata_work *pw = padata_work_alloc();
 
@@ -128,20 +138,22 @@ static void padata_work_free(struct padata_work *pw)
 {
 	lockdep_assert_held(&padata_works_lock);
 	list_add(&pw->pw_list, &padata_free_works);
+	WARN_ON_ONCE(!padata_works_inuse);
+	--padata_works_inuse;
 }
 
 static void padata_works_free(struct list_head *works)
 {
 	struct padata_work *cur, *next;
 
-	if (list_empty(works))
-		return;
-
 	spin_lock(&padata_works_lock);
 	list_for_each_entry_safe(cur, next, works, pw_list) {
 		list_del(&cur->pw_list);
 		padata_work_free(cur);
 	}
+	/* To account for the main thread finishing its part of the job. */
+	WARN_ON_ONCE(!padata_works_inuse);
+	--padata_works_inuse;
 	spin_unlock(&padata_works_lock);
 }
 

From patchwork Thu Jan 6 00:46:52 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704972
From: Daniel Jordan
To: Alexander Duyck, Alex Williamson, Andrew Morton, Ben Segall,
 Cornelia Huck, Dan Williams, Dave Hansen, Dietmar Eggemann, Herbert Xu,
 Ingo Molnar, Jason Gunthorpe, Johannes Weiner, Josh Triplett,
 Michal Hocko, Nico Pache, Pasha Tatashin, Peter Zijlstra,
 Steffen Klassert, Steve Sistare, Tejun Heo, Tim Chen, Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 12/16] sched, padata: Bound max threads with max_cfs_bandwidth_cpus()
Date: Wed, 5 Jan 2022 19:46:52 -0500
Message-Id: <20220106004656.126790-13-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

Helpers are currently not bound by the main thread's CFS bandwidth
limits because they're kernel threads that run on the root runqueues, so
a multithreaded job could cause a task group to consume more quota than
it's configured for.

As a starting point for helpers honoring these limits, restrict a job to
only as many helpers as there are CPUs allowed by these limits. Helpers
are generally CPU-bound, so starting more helpers than this would likely
exceed the group's entire quota. Max CFS bandwidth CPUs are calculated
conservatively with integer division (quota / period).

This restriction ignores other tasks in the group that might also be
consuming quota, so it doesn't strictly prevent a group from exceeding
its limits. However, this may be the right tradeoff between simplicity
and absolutely correct resource control, given that VFIO page pinning
typically happens during guest initialization when there's not much
other CPU activity in the group. There's also a prototype for an
absolutely correct approach later in the series should that be
preferred.
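
For illustration only, and not part of the patch: the integer-division
rule described above can be sketched in userspace C. The names
max_bandwidth_cpus(), cap_threads() and QUOTA_UNLIMITED are hypothetical
stand-ins (the last for RUNTIME_INF), nr_cpus stands in for nr_cpu_ids,
and both the clamp to nr_cpus and the floor of one thread are
assumptions of this sketch rather than behavior taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's RUNTIME_INF (unlimited quota). */
#define QUOTA_UNLIMITED UINT64_MAX

/*
 * Whole number of CPUs that a quota/period pair allows, rounded down.
 * nr_cpus stands in for nr_cpu_ids. Clamping the quotient to nr_cpus is
 * an assumption of this sketch; it keeps the result within int range.
 */
static int max_bandwidth_cpus(uint64_t quota_us, uint64_t period_us,
			      int nr_cpus)
{
	/* A zero period would make the division below undefined. */
	assert(period_us > 0);

	if (quota_us == QUOTA_UNLIMITED)
		return nr_cpus;

	uint64_t cpus = quota_us / period_us;

	return cpus > (uint64_t)nr_cpus ? nr_cpus : (int)cpus;
}

/*
 * Cap a job's thread count at max_cpus. Flooring the result at 1 is this
 * sketch's own assumption, so that the main thread can always run the job
 * even when the quota is smaller than one period.
 */
static int cap_threads(int nworks, int max_cpus)
{
	int capped = nworks < max_cpus ? nworks : max_cpus;

	return capped < 1 ? 1 : capped;
}
```

With a 250000us quota and a 100000us period, max_bandwidth_cpus() gives
2, so a job asking for 16 threads would be capped at 2. A quota smaller
than one period divides down to 0, which is why the sketch floors the
thread count at one.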

Signed-off-by: Daniel Jordan
---
 include/linux/sched/cgroup.h | 21 +++++++++++++++++++++
 kernel/padata.c              | 15 +++++++++++++++
 kernel/sched/core.c          | 19 +++++++++++++++++++
 3 files changed, 55 insertions(+)
 create mode 100644 include/linux/sched/cgroup.h

diff --git a/include/linux/sched/cgroup.h b/include/linux/sched/cgroup.h
new file mode 100644
index 000000000000..f89d92e9e015
--- /dev/null
+++ b/include/linux/sched/cgroup.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_SCHED_CGROUP_H
+#define _LINUX_SCHED_CGROUP_H
+
+#include
+#include
+
+#ifdef CONFIG_CFS_BANDWIDTH
+
+int max_cfs_bandwidth_cpus(struct cgroup_subsys_state *css);
+
+#else /* CONFIG_CFS_BANDWIDTH */
+
+static inline int max_cfs_bandwidth_cpus(struct cgroup_subsys_state *css)
+{
+	return nr_cpu_ids;
+}
+
+#endif /* CONFIG_CFS_BANDWIDTH */
+
+#endif /* _LINUX_SCHED_CGROUP_H */
diff --git a/kernel/padata.c b/kernel/padata.c
index e27988d3e9ed..ef6589a6b665 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -24,6 +24,7 @@
  * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
+#include
 #include
 #include
 #include
@@ -34,6 +35,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -572,6 +574,7 @@ int padata_do_multithreaded_job(struct padata_mt_job *job,
 {
 	/* In case threads finish at different times. */
 	static const unsigned long load_balance_factor = 4;
+	struct cgroup_subsys_state *cpu_css;
 	struct padata_work my_work, *pw;
 	struct padata_mt_job_state ps;
 	LIST_HEAD(works);
@@ -585,6 +588,18 @@ int padata_do_multithreaded_job(struct padata_mt_job *job,
 	nworks = min(nworks, job->max_threads);
 	nworks = min(nworks, current->nr_cpus_allowed);
 
+#ifdef CONFIG_CGROUP_SCHED
+	/*
+	 * Cap threads at the max number of CPUs current's CFS bandwidth
+	 * settings allow. Keep it simple, don't try to keep this value up to
+	 * date. The ifdef guards cpu_cgrp_id.
+	 */
+	rcu_read_lock();
+	cpu_css = task_css(current, cpu_cgrp_id);
+	nworks = min(nworks, max_cfs_bandwidth_cpus(cpu_css));
+	rcu_read_unlock();
+#endif
+
 	if (nworks == 1) {
 		/* Single thread, no coordination needed, cut to the chase. */
 		return job->thread_fn(job->start, job->start + job->size,
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f3b27c6c5153..848c9fec8006 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10021,6 +10021,25 @@ static long tg_get_cfs_burst(struct task_group *tg)
 	return burst_us;
 }
 
+/* Returns the max whole number of CPUs that @css's bandwidth settings allow. */
+int max_cfs_bandwidth_cpus(struct cgroup_subsys_state *css)
+{
+	struct task_group *tg = css_tg(css);
+	u64 quota_us, period_us;
+
+	if (tg == &root_task_group)
+		return nr_cpu_ids;
+
+	quota_us = tg_get_cfs_quota(tg);
+
+	if (quota_us == RUNTIME_INF)
+		return nr_cpu_ids;
+
+	period_us = tg_get_cfs_period(tg);
+
+	return quota_us / period_us;
+}
+
 static s64 cpu_cfs_quota_read_s64(struct cgroup_subsys_state *css,
 				  struct cftype *cft)
 {

From patchwork Thu Jan 6 00:46:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704971
with SMTP id 205N5G8P009916; Thu, 6 Jan 2022 00:47:52 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=UFTfPBqMhfxwUs14W4I/lXe+Vd3zvFZlPmGM7cmJszc=; b=ODQ9lvkmW15AEmPpWyPoAY1xB8jOxd4To3AzW2b+MRtyQSBg4lvHEXB6PifP7YIoSn4W 1xqy6jzBUKoeTsnzFPVT4CYB6Hdd3gnFU3xHw9954htH8Y+07SH5JqFK13bjZqJluRPQ v53S6aUYcQWNwz7dwZ0EmsK8d3dSxVikVXvFMA9IOkdEvemthQZR/iILL28tpbxIO/9G 7WLdhSYZYS6cIrgjxWj44Hr/Gntgg1J4DISDcz71TgaDyRIZVxZuchi4ZKtSy4ePf8A+ l6cqGfxDZYkff540KiMpUAqOPT6HISiZxcG3YHLOymRZP80IJSH4KaE+ikFaIWj1j0+Z Bw== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by mx0b-00069f02.pphosted.com with ESMTP id 3ddmpp83uf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 06 Jan 2022 00:47:51 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 2060VnfM086893; Thu, 6 Jan 2022 00:47:50 GMT Received: from nam02-sn1-obe.outbound.protection.outlook.com (mail-sn1anam02lp2040.outbound.protection.outlook.com [104.47.57.40]) by userp3030.oracle.com with ESMTP id 3ddmqbvtbs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 06 Jan 2022 00:47:50 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=eBZzLil5gcKiw8g87ZC9jKksJggwVPmrMRvFusFQDRb9T95neTDy9pjiFO3Q2EqBG4avIKTtBRykloUqdjvVXCyJXV7LO/YvNGA8CTPL7xnWqI7dNBKgWsvpOaQaJFppZ+1A/5viqJ+x8kxvSfkYXKvNsG/2ZrLoxPHv+VJwxuyvOnwfXY/q+GIseg1ioW+PDGHBqa5IkEFk/gYFuaSexrKF7EmLEV+ChuaRQm4IuarNcA2aKapoCdnZL6IZ9o6AFQuRuiGA++pCw4XKNi5UXEO8tdX/caHwBtvYbmLEM5HsoAP2T7NAGzntWViCdXHz7msssNUclPDDP5x46pO5eQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=UFTfPBqMhfxwUs14W4I/lXe+Vd3zvFZlPmGM7cmJszc=; b=DxBFyTyBb+ghE7YbHNpbMc0DnBJPFDKemM3n48As9ndZTV0a3Vtz7F3zCD7f+q23Dor88xizKkKcjlPX7M+Mqj16u+v3Oyt41MjniodZY+1N7Iy64IFgB/eB1z5NUEHIeiSh/kxf8P63jICtJtbAmsBhjd221KSfyjXKVuTzq197024rtNx6koX8rIdDjYAjDSF46vtiuAQB/eylc1WT7ni3BXMvwK+RBij+fiLdstaXmw4E9aOHcjjGuyhhSA5lBgsVx/x/by5F0nhf1gr7/rZwMUeaegDzmTNA88jSZfRW5IrJUnjz7lLPfSlFk92PbI5cVskQ86uYM1CdiuRYOQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=UFTfPBqMhfxwUs14W4I/lXe+Vd3zvFZlPmGM7cmJszc=; b=HLkWB5XCrjqFEP585oKCGTsth7MLp1+q3VSZmPd5vJHNvRnnJ+hOd9g1592YBMTxmruPbB/BFtB1KPY0mvfT2BHY9nBdwifDuPoMoNkeVP13XZlYmlWoVSQjWR8SxjTdEM7uL7G4FIsxj+6YHaciqpzSA9In6cFDi9xvWa0DEJE= Received: from PH7PR10MB5698.namprd10.prod.outlook.com (2603:10b6:510:126::18) by PH0PR10MB4422.namprd10.prod.outlook.com (2603:10b6:510:38::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4867.7; Thu, 6 Jan 2022 00:47:48 +0000 Received: from PH7PR10MB5698.namprd10.prod.outlook.com ([fe80::85a3:23bc:dc92:52d3]) by PH7PR10MB5698.namprd10.prod.outlook.com ([fe80::85a3:23bc:dc92:52d3%9]) with mapi id 15.20.4867.009; Thu, 6 Jan 2022 00:47:48 +0000 From: Daniel Jordan To: Alexander Duyck , Alex Williamson , Andrew Morton , Ben Segall , Cornelia Huck , Dan Williams , Dave Hansen , Dietmar Eggemann , Herbert Xu , Ingo Molnar , Jason Gunthorpe , Johannes Weiner , Josh Triplett , Michal Hocko , Nico Pache , Pasha Tatashin , Peter Zijlstra , Steffen Klassert , Steve 
Sistare , Tejun Heo , Tim Chen , Vincent Guittot Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org, Daniel Jordan Subject: [RFC 13/16] padata: Run helper threads at MAX_NICE Date: Wed, 5 Jan 2022 19:46:53 -0500 Message-Id: <20220106004656.126790-14-daniel.m.jordan@oracle.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com> References: <20220106004656.126790-1-daniel.m.jordan@oracle.com> X-ClientProxiedBy: MN2PR20CA0019.namprd20.prod.outlook.com (2603:10b6:208:e8::32) To PH7PR10MB5698.namprd10.prod.outlook.com (2603:10b6:510:126::18) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: adbda2d3-5b34-4bcb-db85-08d9d0ae2948 X-MS-TrafficTypeDiagnostic: PH0PR10MB4422:EE_ X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:9508; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: t6NfPhMcNMzhsdUD4PdRCyIJ3XrFM7jnTQxK3JctqNCduenVq4Vplxa/omnBsVO3veEr3wq/vcS69v+4dLQUiro0XH/mhPnZibkaUUE0D/u28aV6pQBYSAwy7QQ65qtMagAEdEG5W6gFWOFIAu9nxlyZff8HUexsNhmjnZREb42YALkgjJLiPOldPsoJmJJWWhXy+9yFk/qFFCyKQejnpr5cUR1So/MvKL2dIXkDOY1KU9DiY6/+NBgRn8LFb0ACG9Y2onacbwsrJrQ7hahH3HnnV3rmjXdSx2Sz9nugB9HuiKCdgTCaLLQ7MyBEgR/6KMVmMOwG6oXMQdN/3AA/CjF3egfoPM+D/hMHTPueguBC84dvj5d9zt8iHyDXK5Gro57NZVlWwmFrVahFoaZMpr5ZfJMIzGsvnGuPvcQMBn9CMYCE76AAU33QiCDwwdf+1Ztu3COdLI39LQb/zoLtuOYd+0GBzvl+49K9UH1F8p8oU/4wE2Rppgf3buvwNjhn2qe1N648ZriYU/SkOIUuK9BLg+faUb2EFWerA3ZkUcQKr720bkmohMz/qHVI1+KtIoqhHmW32A+ldcoV0D4kFmdnWWewSCns+QY0X6smRKhfJWjEHNsguyEZUUQSmhXKrS3SU1q2EaYcEyXkRzD6eNtroxRlSO0Rfs4owqLu5fdW8lXXa12o7unvf+qh9zTQ7Haeb2+xUBRgwjwg87+2eVzmuTwRiNCEUZwwdYfjH7trxIfnHsyiB/NF454TAF1QbJhK7Khr/jYW90w2S2XGM+adxGvfkngJLC2zwxHv1gcYkII7QaPGMxS1V8Nqe2pJ X-Forefront-Antispam-Report: 
X-Mailing-List: kvm@vger.kernel.org

Optimistic parallelization can go wrong if too many helpers are started
on a busy system.  They can unfairly degrade the performance of other
tasks, so they should be sensitive to current CPU utilization[1].

Achieve this by running helpers at MAX_NICE so that their CPU time is
proportional to idle CPU time.  The main thread, however, runs at its
original priority so that it can make progress on a heavily loaded
system, as it would if padata were not in the picture.

Here are two test cases in which a padata and a non-padata workload
compete for the same CPUs to show that normal priority (i.e. nice=0)
padata helpers cause the non-padata workload to run more slowly, whereas
MAX_NICE padata helpers don't.

Notes:
  - Each case was run using 8 CPUs on a large two-socket server, with a
    cpumask allowing all test threads to run anywhere within the 8.
  - The non-padata workload used 7 threads and the padata workload used 8
    threads to evaluate how much padata helpers, rather than the main
    padata thread, disturbed the non-padata workload.
  - The non-padata workload was started after the padata workload and run
    for less time to maximize the chances that the non-padata workload
    would be disturbed.
  - Runtimes in seconds.

Case 1: Synthetic, worst-case CPU contention

    padata_test - a tight loop doing integer multiplication to max out on
                  CPU; used for testing only, does not appear in this series
    stress-ng   - cpu stressor ("-c 7 --cpu-method ackermann --cpu-ops 1200");

                 stress-ng
                     alone  (stdev)   max_nice  (stdev)   normal_prio  (stdev)
                  ------------------------------------------------------------
    padata_test                          96.87  ( 1.09)         90.81  ( 0.29)
    stress-ng        43.04  ( 0.00)      43.58  ( 0.01)         75.86  ( 0.39)

MAX_NICE helpers make a significant difference compared to normal
priority helpers, with stress-ng taking 76% longer to finish when
competing with normal priority padata threads than when run by itself,
but only 1% longer when run with MAX_NICE helpers.  The 1% comes from the
small amount of CPU time MAX_NICE threads are given despite their low
priority.

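As a rough illustration of where the two percentages come from, the
sketch below recomputes the slowdowns from the table and estimates the
CPU share of a MAX_NICE thread.  It assumes CFS's weight-proportional
sharing with the nice 0 and nice 19 weights from the kernel's
sched_prio_to_weight[] table, and a single CPU contended by one thread of
each kind, which is simpler than the test setup above.  slowdown_pct()
and max_nice_share_pct() are names made up for this example.

```c
#include <assert.h>

/*
 * CFS load weights for nice 0 and nice 19 (MAX_NICE), as listed in the
 * kernel's sched_prio_to_weight[] table.
 */
#define WEIGHT_NICE_0   1024
#define WEIGHT_MAX_NICE 15

/* Percent slowdown of a contended runtime relative to running alone. */
double slowdown_pct(double alone, double contended)
{
	assert(alone > 0.0);
	return (contended - alone) / alone * 100.0;
}

/*
 * Assumption for illustration: one MAX_NICE thread and one nice-0 thread
 * share a single CPU, and CPU time is split in proportion to weight.
 * Returns the MAX_NICE thread's share of that CPU in percent.
 */
double max_nice_share_pct(void)
{
	return 100.0 * WEIGHT_MAX_NICE / (WEIGHT_NICE_0 + WEIGHT_MAX_NICE);
}
```

With the table's numbers, slowdown_pct(43.04, 75.86) is about 76 and
slowdown_pct(43.04, 43.58) is about 1, while the weight-based share works
out to roughly 1.4% of a CPU.
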
Case 2: Real-world CPU contention

    padata_vfio - VFIO page pin a 175G kvm guest
    usemem      - faults in 25G of anonymous THP per thread, PAGE_SIZE
                  stride; used to mimic the page clearing that dominates
                  in padata_vfio so that usemem competes for the same
                  system resources

                    usemem
                     alone  (stdev)   max_nice  (stdev)   normal_prio  (stdev)
                  ------------------------------------------------------------
    padata_vfio                          14.74  ( 0.04)          9.93  ( 0.09)
    usemem           10.45  ( 0.04)      10.75  ( 0.04)         14.14  ( 0.07)

Here the effect is similar, just not as pronounced.  The usemem threads
take 35% longer to finish with normal priority padata threads than when
run alone, but only 3% longer when MAX_NICE is used.

[1] lkml.kernel.org/r/20171206143509.GG7515@dhcp22.suse.cz

Signed-off-by: Daniel Jordan
---
 kernel/padata.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/padata.c b/kernel/padata.c
index ef6589a6b665..83e86724b3e1 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -638,7 +638,10 @@ int padata_do_multithreaded_job(struct padata_mt_job *job,
 		if (IS_ERR(task)) {
 			--ps.nworks;
 		} else {
+			/* Helper threads shouldn't disturb other workloads. */
+			set_user_nice(task, MAX_NICE);
 			kthread_bind_mask(task, current->cpus_ptr);
+
 			wake_up_process(task);
 		}
 	}

From patchwork Thu Jan 6 00:46:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704976
From: Daniel Jordan
To: Alexander Duyck, Alex Williamson, Andrew Morton, Ben Segall,
    Cornelia Huck, Dan Williams, Dave Hansen, Dietmar Eggemann,
    Herbert Xu, Ingo Molnar, Jason Gunthorpe, Johannes Weiner,
    Josh Triplett, Michal Hocko, Nico Pache, Pasha Tatashin,
    Peter Zijlstra, Steffen Klassert, Steve Sistare, Tejun Heo,
    Tim Chen, Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 14/16] padata: Nice helper threads one by one to prevent starvation
Date: Wed, 5 Jan 2022 19:46:54 -0500
Message-Id: <20220106004656.126790-15-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
X-Mailing-List: kvm@vger.kernel.org

With padata helper threads running at MAX_NICE, it's possible for one or
more of them to begin chunks of the job and then have their CPU time
constrained by higher priority threads.  The main padata thread, running
at normal priority, may finish all available chunks of the job and then
wait on the MAX_NICE helpers to finish the last in-progress chunks, for
longer than it would have if no helpers were used.

Avoid this by having the main thread assign its priority to each
unfinished helper one at a time so that on a heavily loaded system,
exactly one thread in a given padata call is running at the main thread's
priority.  At least one thread to ensure forward progress, and at most
one thread to limit excessive multithreading.

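For illustration only, the handoff described above can be modeled in
plain userspace C.  This is a single-threaded sketch under stated
assumptions, not the kernel code: struct helper, run_to_completion() and
wait_for_helpers() are hypothetical names, and "waiting" is modeled by
running the boosted helper to completion.  It shows that at most one
unfinished helper holds the main thread's nice value at any time.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NICE 19

/* Hypothetical stand-in for one padata helper; not struct padata_work. */
struct helper {
	int nice;        /* current nice value of the helper */
	int finished;    /* nonzero once the helper has no chunk in progress */
	int chunks_left; /* in-progress work the helper still has to complete */
};

/* Model of a boosted helper getting CPU time and finishing its chunks. */
void run_to_completion(struct helper *h)
{
	h->chunks_left = 0;
	h->finished = 1;
}

/*
 * Model of the main thread's wait loop: visit helpers in order, lend each
 * unfinished one the main thread's nice value, and wait for it before
 * touching the next.  Returns the largest number of unfinished helpers
 * that held the main thread's nice value at the same time.
 */
int wait_for_helpers(struct helper *helpers, size_t n, int main_nice)
{
	int max_boosted = 0;

	assert(helpers != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct helper *h = &helpers[i];
		int boosted = 0;

		if (h->finished)
			continue;

		/* Plays the role of set_user_nice(task, task_nice(current)). */
		h->nice = main_nice;

		/* Count unfinished helpers currently at the main thread's nice. */
		for (size_t j = 0; j < n; j++)
			if (!helpers[j].finished && helpers[j].nice == main_nice)
				boosted++;
		if (boosted > max_boosted)
			max_boosted = boosted;

		/* Plays the role of wait_for_completion(). */
		run_to_completion(h);
	}

	return max_boosted;
}
```

Helpers that were already finished are skipped and keep their MAX_NICE
value, which mirrors the goal of boosting only threads that still hold
in-progress chunks.
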
Here are tests like the ones for MAX_NICE, run on the same two-socket
server, but with a couple of differences:

 - The non-padata workload uses 8 CPUs instead of 7 to compete with the
   main padata thread as well as the padata helpers, so that when the main
   thread finishes, its CPU is completely occupied by the non-padata
   workload, meaning MAX_NICE helpers can't run as often.
 - The non-padata workload starts before the padata workload, rather than
   after, to maximize the chance that it interferes with helpers.

Runtimes in seconds.

Case 1: Synthetic, worst-case CPU contention

    padata_test - a tight loop doing integer multiplication to max out
                  CPU; used for testing only, does not appear in this series
    stress-ng   - cpu stressor ("-c 8 --cpu-method ackermann --cpu-ops 1200");

                 8_padata_thrs        8_padata_thrs
                      w/o_nice  (stdev)   with_nice  (stdev)   1_padata_thr  (stdev)
                  ------------------------------------------------------------------
    padata_test          41.98  ( 0.22)       25.15  ( 2.98)          30.40  ( 0.61)
    stress-ng            44.79  ( 1.11)       46.37  ( 0.69)          53.29  ( 1.91)

Without nicing, padata_test finishes just after stress-ng does because
stress-ng needs to free up CPUs for the helpers to finish (padata_test
shows a shorter runtime than stress-ng because padata_test was started
later).  Nicing lets padata_test finish 40% sooner, and running the same
amount of work in padata_test with 1 thread instead of 8 takes longer
than "with_nice" because MAX_NICE threads still get some CPU time, and
the effect over 8 threads adds up.

stress-ng's total runtime gets a little longer going from no nicing to
nicing because each niced padata thread takes more CPU time than before
when the helpers were starved.

Competing against just one padata thread, stress-ng's reported walltime
goes up because that single thread interferes with fewer stress-ng
threads, but with more impact, causing a greater spread in the time it
takes for individual stress-ng threads to finish.  Averages of the
per-thread stress-ng times from "with_nice" to "1_padata_thr" come out
roughly the same, though, 43.81 and 43.89 respectively.  So the total
runtime of stress-ng across all threads is unaffected, but the time
stress-ng takes to finish running its threads completely actually
improves by spreading the padata_test work over more threads.

Case 2: Real-world CPU contention

    padata_vfio - VFIO page pin a 32G kvm guest
    usemem      - faults in 86G of anonymous THP per thread, PAGE_SIZE
                  stride; used to mimic the page clearing that dominates
                  in padata_vfio so that usemem competes for the same
                  system resources

                 8_padata_thrs        8_padata_thrs
                      w/o_nice  (stdev)   with_nice  (stdev)   1_padata_thr  (stdev)
                  ------------------------------------------------------------------
    padata_vfio          18.59  ( 0.19)       14.62  ( 2.03)          16.24  ( 0.90)
    usemem               47.54  ( 0.89)       48.18  ( 0.77)          49.70  ( 1.20)

These results are similar to case 1's, though the differences between
times are not quite as pronounced because padata_vfio ran shorter
compared to usemem.

Signed-off-by: Daniel Jordan
---
 kernel/padata.c | 106 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 73 insertions(+), 33 deletions(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index 83e86724b3e1..52f670a5d6d9 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -40,10 +40,17 @@
 #include
 #include
 
+enum padata_work_flags {
+	PADATA_WORK_FINISHED = 1,
+	PADATA_WORK_UNDO = 2,
+};
+
 struct padata_work {
 	struct work_struct pw_work;
 	struct list_head pw_list; /* padata_free_works linkage */
+	enum padata_work_flags pw_flags;
 	void *pw_data;
+	struct task_struct *pw_task;
 	/* holds job units from padata_mt_job::start to pw_error_start */
 	unsigned long pw_error_offset;
 	unsigned long pw_error_start;
@@ -58,9 +65,8 @@ static LIST_HEAD(padata_free_works);
 struct padata_mt_job_state {
 	spinlock_t lock;
 	struct completion completion;
+	struct task_struct *niced_task;
 	struct padata_mt_job *job;
-	int nworks;
-	int nworks_fini;
 	int error; /* first error from thread_fn */
 	unsigned long chunk_size;
 	unsigned long position;
@@ -451,12 +457,44 @@ static int padata_setup_cpumasks(struct padata_instance *pinst)
 	return err;
 }
 
+static void padata_wait_for_helpers(struct padata_mt_job_state *ps,
+				    struct list_head *unfinished_works,
+				    struct list_head *finished_works)
+{
+	struct padata_work *work;
+
+	if (list_empty(unfinished_works))
+		return;
+
+	spin_lock(&ps->lock);
+	while (!list_empty(unfinished_works)) {
+		work = list_first_entry(unfinished_works, struct padata_work,
+					pw_list);
+		if (!(work->pw_flags & PADATA_WORK_FINISHED)) {
+			set_user_nice(work->pw_task, task_nice(current));
+			ps->niced_task = work->pw_task;
+			spin_unlock(&ps->lock);
+
+			wait_for_completion(&ps->completion);
+
+			spin_lock(&ps->lock);
+			WARN_ON_ONCE(!(work->pw_flags & PADATA_WORK_FINISHED));
+		}
+		/*
+		 * Leave works used in padata_undo() on ps->failed_works.
+		 * padata_undo() will move them to finished_works.
+		 */
+		if (!(work->pw_flags & PADATA_WORK_UNDO))
+			list_move(&work->pw_list, finished_works);
+	}
+	spin_unlock(&ps->lock);
+}
+
 static int padata_mt_helper(void *__pw)
 {
 	struct padata_work *pw = __pw;
 	struct padata_mt_job_state *ps = pw->pw_data;
 	struct padata_mt_job *job = ps->job;
-	bool done;
 
 	spin_lock(&ps->lock);
 
@@ -488,6 +526,7 @@ static int padata_mt_helper(void *__pw)
 				ps->error = ret;
 			/* Save information about where the job failed. */
 			if (job->undo_fn) {
+				pw->pw_flags |= PADATA_WORK_UNDO;
 				list_move(&pw->pw_list, &ps->failed_works);
 				pw->pw_error_start = position;
 				pw->pw_error_offset = position_offset;
@@ -496,12 +535,10 @@ static int padata_mt_helper(void *__pw)
 		}
 	}
 
-	++ps->nworks_fini;
-	done = (ps->nworks_fini == ps->nworks);
-	spin_unlock(&ps->lock);
-
-	if (done)
+	pw->pw_flags |= PADATA_WORK_FINISHED;
+	if (ps->niced_task == current)
 		complete(&ps->completion);
+	spin_unlock(&ps->lock);
 
 	return 0;
 }
@@ -520,7 +557,7 @@ static int padata_error_cmp(void *unused, const struct list_head *a,
 }
 
 static void padata_undo(struct padata_mt_job_state *ps,
-			struct list_head *works_list,
+			struct list_head *finished_works,
 			struct padata_work *stack_work)
 {
 	struct list_head *failed_works = &ps->failed_works;
@@ -548,11 +585,12 @@ static void padata_undo(struct padata_mt_job_state *ps,
 		if (failed_work) {
 			undo_pos = failed_work->pw_error_end;
 
-			/* main thread's stack_work stays off works_list */
+			/* main thread's stack_work stays off finished_works */
 			if (failed_work == stack_work)
 				list_del(&failed_work->pw_list);
 			else
-				list_move(&failed_work->pw_list, works_list);
+				list_move(&failed_work->pw_list,
+					  finished_works);
 		} else {
 			undo_pos = undo_end;
 		}
@@ -577,16 +615,17 @@ int padata_do_multithreaded_job(struct padata_mt_job *job,
 	struct cgroup_subsys_state *cpu_css;
 	struct padata_work my_work, *pw;
 	struct padata_mt_job_state ps;
-	LIST_HEAD(works);
-	int nworks;
+	LIST_HEAD(unfinished_works);
+	LIST_HEAD(finished_works);
+	int nworks, req;
 
 	if (job->size == 0)
 		return 0;
 
 	/* Ensure at least one thread when size < min_chunk. */
-	nworks = max(job->size / job->min_chunk, 1ul);
-	nworks = min(nworks, job->max_threads);
-	nworks = min(nworks, current->nr_cpus_allowed);
+	req = max(job->size / job->min_chunk, 1ul);
+	req = min(req, job->max_threads);
+	req = min(req, current->nr_cpus_allowed);
 
 #ifdef CONFIG_CGROUP_SCHED
 	/*
@@ -596,23 +635,23 @@ int padata_do_multithreaded_job(struct padata_mt_job *job,
 	 */
 	rcu_read_lock();
 	cpu_css = task_css(current, cpu_cgrp_id);
-	nworks = min(nworks, max_cfs_bandwidth_cpus(cpu_css));
+	req = min(req, max_cfs_bandwidth_cpus(cpu_css));
 	rcu_read_unlock();
 #endif
 
-	if (nworks == 1) {
+	if (req == 1) {
 		/* Single thread, no coordination needed, cut to the chase. */
 		return job->thread_fn(job->start, job->start + job->size,
 				      job->fn_arg);
 	}
 
+	nworks = padata_work_alloc_mt(req, &unfinished_works);
+
 	spin_lock_init(&ps.lock);
 	init_completion(&ps.completion);
 	lockdep_init_map(&ps.lockdep_map, map_name, key, 0);
 	INIT_LIST_HEAD(&ps.failed_works);
 	ps.job = job;
-	ps.nworks = padata_work_alloc_mt(nworks, &works);
-	ps.nworks_fini = 0;
 	ps.error = 0;
 	ps.position = job->start;
 	ps.remaining_size = job->size;
@@ -623,41 +662,42 @@ int padata_do_multithreaded_job(struct padata_mt_job *job,
 	 * increasing the number of chunks, guarantee at least the minimum
 	 * chunk size from the caller, and honor the caller's alignment.
 	 */
-	ps.chunk_size = job->size / (ps.nworks * load_balance_factor);
+	ps.chunk_size = job->size / (nworks * load_balance_factor);
 	ps.chunk_size = max(ps.chunk_size, job->min_chunk);
 	ps.chunk_size = roundup(ps.chunk_size, job->align);
 
 	lock_map_acquire(&ps.lockdep_map);
 	lock_map_release(&ps.lockdep_map);
 
-	list_for_each_entry(pw, &works, pw_list) {
-		struct task_struct *task;
-
+	list_for_each_entry(pw, &unfinished_works, pw_list) {
 		pw->pw_data = &ps;
-		task = kthread_create(padata_mt_helper, pw, "padata");
-		if (IS_ERR(task)) {
-			--ps.nworks;
+		pw->pw_task = kthread_create(padata_mt_helper, pw, "padata");
+		if (IS_ERR(pw->pw_task)) {
+			pw->pw_flags = PADATA_WORK_FINISHED;
 		} else {
 			/* Helper threads shouldn't disturb other workloads. */
-			set_user_nice(task, MAX_NICE);
-			kthread_bind_mask(task, current->cpus_ptr);
+			set_user_nice(pw->pw_task, MAX_NICE);
+
+			pw->pw_flags = 0;
+			kthread_bind_mask(pw->pw_task, current->cpus_ptr);
 
-			wake_up_process(task);
+			wake_up_process(pw->pw_task);
 		}
 	}
 
 	/* Use the current task, which saves starting a kthread. */
 	my_work.pw_data = &ps;
+	my_work.pw_flags = 0;
 	INIT_LIST_HEAD(&my_work.pw_list);
 	padata_mt_helper(&my_work);
 
 	/* Wait for all the helpers to finish. */
-	wait_for_completion(&ps.completion);
+	padata_wait_for_helpers(&ps, &unfinished_works, &finished_works);
 
 	if (ps.error && job->undo_fn)
-		padata_undo(&ps, &works, &my_work);
+		padata_undo(&ps, &finished_works, &my_work);
 
-	padata_works_free(&works);
+	padata_works_free(&finished_works);
 
 	return ps.error;
 }

From patchwork Thu Jan 6 00:46:55 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 12704977
From: Daniel Jordan
To: Alexander Duyck, Alex Williamson, Andrew Morton, Ben Segall,
    Cornelia Huck, Dan Williams, Dave Hansen, Dietmar Eggemann,
    Herbert Xu, Ingo Molnar, Jason Gunthorpe, Johannes Weiner,
    Josh Triplett, Michal Hocko, Nico Pache, Pasha Tatashin,
    Peter Zijlstra, Steffen Klassert, Steve Sistare, Tejun Heo,
    Tim Chen, Vincent Guittot
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-crypto@vger.kernel.org, Daniel Jordan
Subject: [RFC 15/16] sched/fair: Account kthread runtime debt for CFS bandwidth
Date: Wed, 5 Jan 2022 19:46:55 -0500
Message-Id: <20220106004656.126790-16-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
References: <20220106004656.126790-1-daniel.m.jordan@oracle.com>
PH0PR10MB4422:EE_ X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:10000; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: MzC5nrkBwQZRdwQdEdavNPL6jn8WNhnppAmUDeRGvuLZG/kH2QMqFqK+LoLrFsB6QVcPGpox87LW55A9xzXAlJLs3sPy/+ALxJi3OG56dIA1M8LfCLbUdDlsBZLMp+V0hHZHsZo+5ASY3uPo9lSNVClL9qaGnBFPlsM979//xIUPYo7EGRtWkCciB/oliWDHUu+BT6h57dNUOt0HIXdO5h0otFDQVjopw9/MZIjnsRx+goI2+7QsepjVwCYWEGF8bfpaqP9iN6/IFQbz/S5lf6uoSazlivcoSANZeyeZv69/lGaPk6qyEXoQhHEgszMPXepp3bBJEAgBBP7GMo/nlIQhmt0A4X79qRnqE80TiouTUgd2Vk6JgWElOvjWXnInt8sXguMg7NibYzLe0YuNFAz/dpVYDGO7hc6M4HQUpPHVROwk0KicRraHtWURjfCNIhHcb1aTVPn5FzEnZlJH8mjl4pw/+VTSJdFukdmrFPx+IH6mXhMCDcJ2eegiqdNc3HM983P3XLSLVXLXL50deUzH1VxVH2n8HgFXdIqzrTjXJFCTILom6pWFAxZ5h2+hBjn6r4hIL0xSPs6Z8e5k8dhTfI+sHeEWztAGZsQdfpjR3m7f+O1jiuxXm9JWUlUMa9ypNM4tXOi4J7ITgxkb66SlCt0WiJvkcyVBL1yvFnO9H1bh0boZm/LiucOVb5Se8IqUsIqhDtgkvdQUq2QGf39FlqT8g+oUOmqCsrJXTYw= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:PH7PR10MB5698.namprd10.prod.outlook.com;PTR:;CAT:NONE;SFS:(366004)(4326008)(2906002)(316002)(38100700002)(5660300002)(38350700002)(83380400001)(508600001)(66946007)(66476007)(66556008)(26005)(52116002)(186003)(6506007)(7416002)(30864003)(8676002)(110136005)(8936002)(921005)(1076003)(6486002)(15650500001)(2616005)(36756003)(86362001)(103116003)(107886003)(6512007);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
7j1IyFGIsxnCqii5/J0rvKnXkxWZIEtmZU0OLBy26pnc6TeWER/zZKuQby0HEYYE3qRVmXRJxr8SCx0tq1uvtNgMAMYayWhIWt5RO2vT3EFdwFjm4LahUPQuhCLUYvrK/yDjQVpwyOGY80TWW9xbENP3dfzVKxq8rSnRWNkQDQLAM6RGOlL3av8332uYCCTCaijTM72QwIIe/gnL0nPZFBDAw7hU8z+/E56UDoJPKND7VTem+6vYW5VOE6Bd32LJXHWR0PEB0M1RjBHocU8gVlb+ejk5RfEgxZ7wnlvY8/QcWALcbfl0oWdotAXYE7SAMYTIzigAoeKgCMhXvFXBpRjsxCoF92nqFCDH2o1DuYl3xVOcxLx74CmaPHpi5a7ZTBefJtvJtwN8qQHoU2OJhG+AEzJHGfeE4FM7wfwEVbSVjuK90QQSq2rAIO2qKmu235f8q2ZlIoWVdVQmR5+mWd3aLw/h6QVqWxt0H676P8YLCeJU3plyLtc/WhRe/7m4Z3HeZL464Hd5/lyu5tJYzTxGWEk7XKvRkOAHthdXdtgmySi1q0ZFRG8lwulYqsOfXWqhhuKkBWOruk8ltN3NBeb394wsYMzrcOLeJmeGBBPsi9/k9a2Y2Z86K6mK+y29TyFeLdNgi7iQMkLgRDLuyiHkjXCoWuuBqsztoLs6hSJ3xD23jAqmnpNhi7/+tPzwmYpSqdzrItSZqjaroCfvdLY3wP4yznNlZXxwG/5guvPwIszTLT6esCCeMuzhX3kL1BKcqyUekb8H9Q2HYz3qpIrxllDq5E35j5rYD1Y8spN/Ep3D4lRI7jpSu2wFAW5WKzX8t4LBqPkEVtKKVl3BfQ4zKx7X1i1LtQTTs37wxYd2zu39usk7imHNPr9vA9vcLs040njOkTDblUaq/HOaRRFvyT9Ucg+9yaGNl47HtPW0wKlUh7UXHYpj81GoM3yO40kI9KYaBJgOTeorv8+fiOIXyswS2SaiZ1R0E7QJQYLPaMUjDMmrwBYoA08wpxuxnHCrKzOjsFhORXr+u88DUoGFob2N9esJ4cP6bp/pvnx37Vh4YKlVvIZypR2KiWM7fbyAvOAbcDVuvy1fMyzA09uQ5sfVMiH90Zmof01DwEGS98Vm1Ph0tIE4NWGGORmkiujVRlTw4Q1nbzoHplLRP6QVHKYNoHysnSzxYhWbiLX8iXOIjpmvhFxK2OXQygB/c+YzRfX9mDu23f6IfCHFCARd0aWz+CmK1nbwVIAAxQWztixEd6EfNco3zO2itYic8i2HJ2SDAlHE7LuTHjR/PA8O6aI21liXVlLvkDgousWam64A5Mlf7OXmv/i9Tz+62WbIBMAYHhiXeWTY8wZo7fvJmIcZtXwPoyw+3Ta05yFbKSl9UAQqMGGISFgJKI8vu87PkGWmkR4MYZwLXw9TCgX1VQJ+TT8i5T1LsVeZgL3ZnacA5qqwFrBPdUFg1/S1Y/D/AEGuJr5FjRdA9PvQFbSgHtR4sbf9/e9hQb7bOrzOaxCQF+w7IYZvqT8Bjiy/k2tQogtnp6T0eTouDxhWzbnrQWSqX5zjKk5H5+qNDdEDn63OKjbYcbYKDSvITKRfeP/wg47mCdN2udygh067LFMvs0pTsQc9HnGHKW9ZQsoF5iRePCoRGcwxyRkOCNgUeUFatW7p8JQcefxcVkbYGTbu88+PNklyCzhlrz3JxWg= X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: 6efbe408-1076-491a-2f88-08d9d0ae2c6e X-MS-Exchange-CrossTenant-AuthSource: PH7PR10MB5698.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 06 Jan 2022 
00:47:53.9673 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: So5uZ61Kry0GKk1jriKA9EJXUPOYu2N0rni6+Vmml7KP526vCQHy2FOm9EJfRD//FW1kMtU0q4+PrzXiXQdA2K0awONB1lGQaCNH4GqH0uU= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4422 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10218 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 adultscore=0 suspectscore=0 bulkscore=0 mlxlogscore=585 phishscore=0 mlxscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2112160000 definitions=main-2201060001 X-Proofpoint-GUID: w3mb_GHYJ-7p8scw9RnGLppuHmLuheNE X-Proofpoint-ORIG-GUID: w3mb_GHYJ-7p8scw9RnGLppuHmLuheNE Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org As before, helpers in multithreaded jobs don't honor the main thread's CFS bandwidth limits, which could lead to the group exceeding its quota. Fix it by having helpers remote charge their CPU time to the main thread's task group. A helper calls a pair of new interfaces cpu_cgroup_remote_begin() and cpu_cgroup_remote_charge() (see function header comments) to achieve this. This is just supposed to start a discussion, so it's pretty simple. Once a kthread has finished a remote charging period with cpu_cgroup_remote_charge(), its runtime is subtracted from the target task group's runtime (cfs_bandwidth::runtime) and any remainder is saved as debt (cfs_bandwidth::debt) to pay off in later periods. Remote charging tasks aren't throttled when the group reaches its quota, and a task group doesn't run at all until its debt is completely paid, but these shortcomings can be addressed if the approach ends up being taken. 
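The accounting described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: struct bw_model, model_charge() and model_refill() are hypothetical names invented for the illustration, locking and the walk up the task group hierarchy are left out, and the RUNTIME_INF (unlimited quota) case is ignored. The field names mirror the cfs_bandwidth fields used in this patch.

```c
#include <stdint.h>

/*
 * Hypothetical user-space model of the fields this patch touches in
 * struct cfs_bandwidth. Units are arbitrary (nanoseconds in the kernel).
 */
struct bw_model {
	uint64_t quota;   /* runtime granted at each period refill */
	uint64_t burst;   /* extra runtime allowed to accumulate */
	uint64_t runtime; /* runtime currently available to the group */
	uint64_t debt;    /* remotely charged time not yet paid off */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Models the charge step for one task group: pay as much as possible
 * from the available runtime and save the remainder as debt.
 */
static void model_charge(struct bw_model *b, uint64_t charged)
{
	uint64_t payment = min_u64(b->runtime, charged);

	b->runtime -= payment;
	b->debt += charged - payment;
}

/*
 * Models a period refill: outstanding debt is paid out of the new quota
 * first, and only what is left is added to the available runtime.
 */
static void model_refill(struct bw_model *b)
{
	uint64_t quota = b->quota;
	uint64_t payment = min_u64(quota, b->debt);

	b->debt -= payment;
	quota -= payment;

	b->runtime += quota;
	b->runtime = min_u64(b->runtime, b->quota + b->burst);
}
```

For example, with quota = 100, burst = 0 and runtime = 100, charging 250 units leaves runtime = 0 and debt = 150. The next refill pays 100 of the debt and grants no runtime, the one after pays the remaining 50 and grants 50, and only the third refill restores the full 100. This matches the shortcoming noted above: the group does not run until its debt is completely paid.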
Signed-off-by: Daniel Jordan --- include/linux/sched.h | 2 + include/linux/sched/cgroup.h | 16 ++++++ kernel/padata.c | 26 +++++++--- kernel/sched/core.c | 39 +++++++++++++++ kernel/sched/fair.c | 94 +++++++++++++++++++++++++++++++++++- kernel/sched/sched.h | 5 ++ 6 files changed, 174 insertions(+), 8 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index ec8d07d88641..cc04367d4458 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -487,6 +487,8 @@ struct sched_entity { struct cfs_rq *my_q; /* cached value of my_q->h_nr_running */ unsigned long runnable_weight; + /* sum_exec_runtime at the start of the remote charging period */ + u64 remote_runtime_begin; #endif #ifdef CONFIG_SMP diff --git a/include/linux/sched/cgroup.h b/include/linux/sched/cgroup.h index f89d92e9e015..cb3b7941149f 100644 --- a/include/linux/sched/cgroup.h +++ b/include/linux/sched/cgroup.h @@ -5,6 +5,22 @@ #include #include +#ifdef CONFIG_FAIR_GROUP_SCHED + +void cpu_cgroup_remote_begin(struct task_struct *p, + struct cgroup_subsys_state *css); +void cpu_cgroup_remote_charge(struct task_struct *p, + struct cgroup_subsys_state *css); + +#else /* CONFIG_FAIR_GROUP_SCHED */ + +static inline void cpu_cgroup_remote_begin(struct task_struct *p, + struct cgroup_subsys_state *css) {} +static inline void cpu_cgroup_remote_charge(struct task_struct *p, + struct cgroup_subsys_state *css) {} + +#endif /* CONFIG_FAIR_GROUP_SCHED */ + #ifdef CONFIG_CFS_BANDWIDTH int max_cfs_bandwidth_cpus(struct cgroup_subsys_state *css); diff --git a/kernel/padata.c b/kernel/padata.c index 52f670a5d6d9..d595f11c2fdd 100644 --- a/kernel/padata.c +++ b/kernel/padata.c @@ -43,6 +43,7 @@ enum padata_work_flags { PADATA_WORK_FINISHED = 1, PADATA_WORK_UNDO = 2, + PADATA_WORK_MAIN_THR = 4, }; struct padata_work { @@ -75,6 +76,7 @@ struct padata_mt_job_state { #ifdef CONFIG_LOCKDEP struct lockdep_map lockdep_map; #endif + struct cgroup_subsys_state *cpu_css; }; static void padata_free_pd(struct 
parallel_data *pd); @@ -495,6 +497,10 @@ static int padata_mt_helper(void *__pw) struct padata_work *pw = __pw; struct padata_mt_job_state *ps = pw->pw_data; struct padata_mt_job *job = ps->job; + bool is_main = pw->pw_flags & PADATA_WORK_MAIN_THR; + + if (!is_main) + cpu_cgroup_remote_begin(current, ps->cpu_css); spin_lock(&ps->lock); @@ -518,6 +524,10 @@ static int padata_mt_helper(void *__pw) ret = job->thread_fn(position, end, job->fn_arg); lock_map_release(&ps->lockdep_map); + + if (!is_main) + cpu_cgroup_remote_charge(current, ps->cpu_css); + spin_lock(&ps->lock); if (ret) { @@ -612,7 +622,6 @@ int padata_do_multithreaded_job(struct padata_mt_job *job, { /* In case threads finish at different times. */ static const unsigned long load_balance_factor = 4; - struct cgroup_subsys_state *cpu_css; struct padata_work my_work, *pw; struct padata_mt_job_state ps; LIST_HEAD(unfinished_works); @@ -628,18 +637,20 @@ int padata_do_multithreaded_job(struct padata_mt_job *job, req = min(req, current->nr_cpus_allowed); #ifdef CONFIG_CGROUP_SCHED + ps.cpu_css = task_get_css(current, cpu_cgrp_id); + /* * Cap threads at the max number of CPUs current's CFS bandwidth * settings allow. Keep it simple, don't try to keep this value up to * date. The ifdef guards cpu_cgrp_id. */ - rcu_read_lock(); - cpu_css = task_css(current, cpu_cgrp_id); - req = min(req, max_cfs_bandwidth_cpus(cpu_css)); - rcu_read_unlock(); + req = min(req, max_cfs_bandwidth_cpus(ps.cpu_css)); #endif if (req == 1) { +#ifdef CONFIG_CGROUP_SCHED + css_put(ps.cpu_css); +#endif /* Single thread, no coordination needed, cut to the chase. */ return job->thread_fn(job->start, job->start + job->size, job->fn_arg); @@ -687,12 +698,15 @@ int padata_do_multithreaded_job(struct padata_mt_job *job, /* Use the current task, which saves starting a kthread. 
*/ my_work.pw_data = &ps; - my_work.pw_flags = 0; + my_work.pw_flags = PADATA_WORK_MAIN_THR; INIT_LIST_HEAD(&my_work.pw_list); padata_mt_helper(&my_work); /* Wait for all the helpers to finish. */ padata_wait_for_helpers(&ps, &unfinished_works, &finished_works); +#ifdef CONFIG_CGROUP_SCHED + css_put(ps.cpu_css); +#endif if (ps.error && job->undo_fn) padata_undo(&ps, &finished_works, &my_work); diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 848c9fec8006..a5e24b6bd7e0 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -9913,6 +9913,7 @@ static int tg_set_cfs_bandwidth(struct task_group *tg, u64 period, u64 quota, cfs_b->period = ns_to_ktime(period); cfs_b->quota = quota; cfs_b->burst = burst; + cfs_b->debt = 0; __refill_cfs_bandwidth_runtime(cfs_b); @@ -10181,6 +10182,44 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v) return 0; } #endif /* CONFIG_CFS_BANDWIDTH */ + +/** + * cpu_cgroup_remote_begin - begin charging p's CPU usage to a remote css + * @p: the kernel thread whose CPU usage should be accounted + * @css: the css to which the CPU usage should be accounted + * + * Begin charging a kernel thread's CPU usage to a remote (non-root) task group + * to account CPU time that the kernel thread spends working on behalf of the + * group. Pair with at least one subsequent call to cpu_cgroup_remote_charge() + * to complete the charge. + * + * Supports CFS bandwidth and cgroup2 CPU accounting stats but not weight-based + * control for now. + */ +void cpu_cgroup_remote_begin(struct task_struct *p, + struct cgroup_subsys_state *css) +{ + if (p->sched_class == &fair_sched_class) + cpu_cgroup_remote_begin_fair(p, css_tg(css)); +} + +/** + * cpu_cgroup_remote_charge - account p's CPU usage to a remote css + * @p: the kernel thread whose CPU usage should be accounted + * @css: the css to which the CPU usage should be accounted + * + * Account a kernel thread's CPU usage to a remote (non-root) task group. 
Pair + * with a previous call to cpu_cgroup_remote_begin() with the same @p and @css. + * This may be invoked multiple times after the initial + * cpu_cgroup_remote_begin() to account additional CPU usage. + */ +void cpu_cgroup_remote_charge(struct task_struct *p, + struct cgroup_subsys_state *css) +{ + if (p->sched_class == &fair_sched_class) + cpu_cgroup_remote_charge_fair(p, css_tg(css)); +} + #endif /* CONFIG_FAIR_GROUP_SCHED */ #ifdef CONFIG_RT_GROUP_SCHED diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 44c452072a1b..3c2d7f245c68 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4655,10 +4655,19 @@ static inline u64 sched_cfs_bandwidth_slice(void) */ void __refill_cfs_bandwidth_runtime(struct cfs_bandwidth *cfs_b) { - if (unlikely(cfs_b->quota == RUNTIME_INF)) + u64 quota = cfs_b->quota; + u64 payment; + + if (unlikely(quota == RUNTIME_INF)) return; - cfs_b->runtime += cfs_b->quota; + if (cfs_b->debt) { + payment = min(quota, cfs_b->debt); + cfs_b->debt -= payment; + quota -= payment; + } + + cfs_b->runtime += quota; cfs_b->runtime = min(cfs_b->runtime, cfs_b->quota + cfs_b->burst); } @@ -5406,6 +5415,32 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq) rcu_read_unlock(); } +static void incur_cfs_debt(struct rq *rq, struct sched_entity *se, + struct task_group *tg, u64 debt) +{ + if (!cfs_bandwidth_used()) + return; + + while (tg != &root_task_group) { + struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)]; + + if (cfs_rq->runtime_enabled) { + struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth; + u64 payment; + + raw_spin_lock(&cfs_b->lock); + + payment = min(cfs_b->runtime, debt); + cfs_b->runtime -= payment; + cfs_b->debt += debt - payment; + + raw_spin_unlock(&cfs_b->lock); + } + + tg = tg->parent; + } +} + #else /* CONFIG_CFS_BANDWIDTH */ static inline bool cfs_bandwidth_used(void) @@ -5448,6 +5483,8 @@ static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg) static inline void 
destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b) {} static inline void update_runtime_enabled(struct rq *rq) {} static inline void unthrottle_offline_cfs_rqs(struct rq *rq) {} +static inline void incur_cfs_debt(struct rq *rq, struct sched_entity *se, + struct task_group *tg, u64 debt) {} #endif /* CONFIG_CFS_BANDWIDTH */ @@ -11452,6 +11489,59 @@ int sched_group_set_shares(struct task_group *tg, unsigned long shares) mutex_unlock(&shares_mutex); return 0; } + +#define INCUR_DEBT 1 + +static void cpu_cgroup_remote(struct task_struct *p, struct task_group *tg, + int flags) +{ + struct sched_entity *se = &p->se; + struct cfs_rq *cfs_rq; + struct rq_flags rf; + struct rq *rq; + + /* + * User tasks might change task groups between calls to this function, + * which isn't handled for now, so disallow them. + */ + if (!(p->flags & PF_KTHREAD)) + return; + + /* kthreads already run in the root, so no need for remote charging. */ + if (tg == &root_task_group) + return; + + rq = task_rq_lock(p, &rf); + update_rq_clock(rq); + + cfs_rq = cfs_rq_of(se); + update_curr(cfs_rq); + + if (flags & INCUR_DEBT) { + u64 debt = se->sum_exec_runtime - se->remote_runtime_begin; + + if (unlikely((s64)debt <= 0)) + goto out; + + incur_cfs_debt(rq, se, tg, debt); + } + +out: + se->remote_runtime_begin = se->sum_exec_runtime; + + task_rq_unlock(rq, p, &rf); +} + +void cpu_cgroup_remote_begin_fair(struct task_struct *p, struct task_group *tg) +{ + cpu_cgroup_remote(p, tg, 0); +} + +void cpu_cgroup_remote_charge_fair(struct task_struct *p, struct task_group *tg) +{ + cpu_cgroup_remote(p, tg, INCUR_DEBT); +} + #else /* CONFIG_FAIR_GROUP_SCHED */ void free_fair_sched_group(struct task_group *tg) { } diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index ddefb0419d7a..75dd6f89e295 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -367,6 +367,7 @@ struct cfs_bandwidth { u64 quota; u64 runtime; u64 burst; + u64 debt; s64 hierarchical_quota; u8 idle; @@ -472,6 +473,10 @@ extern 
void free_fair_sched_group(struct task_group *tg); extern int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent); extern void online_fair_sched_group(struct task_group *tg); extern void unregister_fair_sched_group(struct task_group *tg); +extern void cpu_cgroup_remote_begin_fair(struct task_struct *p, + struct task_group *tg); +extern void cpu_cgroup_remote_charge_fair(struct task_struct *p, + struct task_group *tg); extern void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq, struct sched_entity *se, int cpu, struct sched_entity *parent); From patchwork Thu Jan 6 00:46:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Jordan X-Patchwork-Id: 12704973 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12C48C433EF for ; Thu, 6 Jan 2022 00:50:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344402AbiAFAud (ORCPT ); Wed, 5 Jan 2022 19:50:33 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:62246 "EHLO mx0b-00069f02.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344097AbiAFAsY (ORCPT ); Wed, 5 Jan 2022 19:48:24 -0500 Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 205N50I9009819; Thu, 6 Jan 2022 00:48:00 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=g6wbgLabBfejMsOUJXr9N/Jk61qImKZOs8Xw5z6wzKI=; b=c6c2Kcucf6st2/6H+vLK1PLZag9VG7vyH/WwxL42HRS91DczFLdrPp5V/asg8oOVZPvE gAoKoh5Qq7mTfnrEEg2mWjxBpp/Kf4AjLrXoeE6LigjRlTaWaVB1KxWnUUAzw3jzeD4u 
/w8fNluzVLdvzYNubxDw6Xs68pfC37GdW/mSfO2O6VkKnGyxmdUecnqmB006lEK1lC/m Q05Hqz4N40F7xsWZZjvc87rPXXeURDiWeW+ol6qPMYRtv2tLGJgtShrpHBICv35HijWw JMr0ZAwgOpwqngE8ciH2Q8V8i14ByBy433POUy1taHscJZaO1te7UzzyPDAYn0sb4+5R yA== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by mx0b-00069f02.pphosted.com with ESMTP id 3ddmpp83ur-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 06 Jan 2022 00:48:00 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 2060W7HW102479; Thu, 6 Jan 2022 00:47:59 GMT Received: from nam12-mw2-obe.outbound.protection.outlook.com (mail-mw2nam12lp2047.outbound.protection.outlook.com [104.47.66.47]) by aserp3030.oracle.com with ESMTP id 3ddmqgu5gd-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 06 Jan 2022 00:47:59 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=EEK/Y7CSM5aMlrOuhQm4gS7yJDKHkdfaOx3xSeglYIVHo1Xrj6m07BydGiq1KMlnVG9NRFwzLVN4qTiN1aKx1h/fvt7nQeoqwMEz3yq8UUyX//p2U+PTGWK4h7NjVUJiFo/tyGCzzrKLsechxzNcughhYH9r6e4PEg2cMVmCw+BA2rDjxFs4omSgi0Jpb2p0TdqxYtLaZBCdBeA6Pvq55Nj1TbLSSaJMEGuwfdwpW676oJXiyz8tJ+4vp+ZIIXec0eDFqlHQUBHy75XphIExAqtTMxJY3EgYTI7GFP0pzWd5tmW++Yn6qSAGl6R/Uedid+Y8KKvILL6JamrqWzup0A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=g6wbgLabBfejMsOUJXr9N/Jk61qImKZOs8Xw5z6wzKI=; b=jzq5Nswckid0IEuc7iJpZ9ioEsi9DCPEzc/QeXyBI4O0Pj79qdV3m+is82ilzyTp92KjIRI/srWa0zTBwpp5gVcy7TM1M5ldl25/yLRfxS8vLlRSJBoy5cVg5d+pDMtE/ovCh3Z+dtGvwnqjxuOzowZzRsDkqz4me+SH2/1+jiAumJSFYQ6v1j+8nfXA8e7r85ebniKdCx42rfSJDSHHdacQiaG3P/AL3effd8mN5w3Xr0L9GaHhdbey1q3j4iX1bluqR3vAtT+817qx4Lildh7Z9NfiISD5tnMLoZZTbZ2aqUwEsf7H9YUs0w97cuB+PLA21pKznjjzq4c0O3t64Q== 
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=g6wbgLabBfejMsOUJXr9N/Jk61qImKZOs8Xw5z6wzKI=; b=V0sFM5zWLvzTX1xqRKDlOLiZ0km4Im8QMxidIJLTwJ7GdWltvw7MbpUhP3MOukECLj42rnxfYMV7GjM1yZjRCyKd6CHDzS+yGPj1AqTlraT7ppaqeqltD9YKgj2gdJ+jQsg5DhEiCpMUXAe3Je7YGHbf1AVlsRrZEbJEq1qQbHI= Received: from PH7PR10MB5698.namprd10.prod.outlook.com (2603:10b6:510:126::18) by PH0PR10MB4422.namprd10.prod.outlook.com (2603:10b6:510:38::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4867.7; Thu, 6 Jan 2022 00:47:56 +0000 Received: from PH7PR10MB5698.namprd10.prod.outlook.com ([fe80::85a3:23bc:dc92:52d3]) by PH7PR10MB5698.namprd10.prod.outlook.com ([fe80::85a3:23bc:dc92:52d3%9]) with mapi id 15.20.4867.009; Thu, 6 Jan 2022 00:47:56 +0000 From: Daniel Jordan To: Alexander Duyck , Alex Williamson , Andrew Morton , Ben Segall , Cornelia Huck , Dan Williams , Dave Hansen , Dietmar Eggemann , Herbert Xu , Ingo Molnar , Jason Gunthorpe , Johannes Weiner , Josh Triplett , Michal Hocko , Nico Pache , Pasha Tatashin , Peter Zijlstra , Steffen Klassert , Steve Sistare , Tejun Heo , Tim Chen , Vincent Guittot Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org, Daniel Jordan Subject: [RFC 16/16] sched/fair: Consider kthread debt in cputime Date: Wed, 5 Jan 2022 19:46:56 -0500 Message-Id: <20220106004656.126790-17-daniel.m.jordan@oracle.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220106004656.126790-1-daniel.m.jordan@oracle.com> References: <20220106004656.126790-1-daniel.m.jordan@oracle.com> X-ClientProxiedBy: MN2PR20CA0019.namprd20.prod.outlook.com 
(2603:10b6:208:e8::32) To PH7PR10MB5698.namprd10.prod.outlook.com (2603:10b6:510:126::18) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 961fb492-34d2-41e6-bf8c-08d9d0ae2e02 X-MS-TrafficTypeDiagnostic: PH0PR10MB4422:EE_ X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:2582; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 06uXJuKwhpEyxBHrIy02hCl+vCw57/H3/F7H754Dc7RGLtNyZGs2WkSqRU9UxKcUGct3V0F3rwJx9F+133hGtmNSzmxK/xTHpfBVNneIeE3R4zcsK5870K1kTcYYU3sBpVkj3vadsi5ui/hpglpQA9pbDDanPzZsZgyG5n3bfLpQaCi6rzTYNZYDenEZidY6yeuByNTtIAeyxmG+Cab4cn2UUxWm0wA26YdU3jKdJJs27Xgj15INdDWW5KmRctj0ml+LL2tko1Y0t9teKvrr640YczAyZEQsqNntDlnkSSL7NJau1RLvmpow0zmtim+yryhvSMUpaDAdQelwHhVvxZMGEfHsqVpA3iU7QUUNdSj5Axq0CysMbqYYAYFDZh4OYVkw5JEDM+HEx0piI5YO57f2aI7usFs+GDefl1Aph33NVgR1yR+qiAxquoQsLO5jHk/2iZIRP8KK+mUs2BfN/XjsXyZknfbcgDplzhzOj/8CU+eP+9OvxluChtnkFO4D4JIHE6ggD70YHneW0r0rJdRYY81jFEo5E0Y9AvZfUeN4hSoOuM//rE2nl2lP9eM8DWzcBnW/Ag2M0MMnhP99TkBMOE1Nz4AmPImzb8IiPWUStYuX/4jRFn8+8TGA8kiyRKXc08RqUjcX529uwlQKbIbx2YElEzmLZuO1TtuMs11JSVCUW/DwFJEcfdVvrHvlX+E+dft+Vq02lA3CxCwnwcAsyjnn95Qc4bGpnb3k2XI= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:PH7PR10MB5698.namprd10.prod.outlook.com;PTR:;CAT:NONE;SFS:(366004)(4326008)(2906002)(316002)(38100700002)(5660300002)(38350700002)(83380400001)(508600001)(66946007)(66476007)(66556008)(26005)(52116002)(186003)(6506007)(7416002)(8676002)(110136005)(8936002)(921005)(1076003)(4744005)(6486002)(2616005)(36756003)(86362001)(103116003)(107886003)(6512007);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
60hMrD68FoDPL+zTC1E6RbDd7u3Umob1ninjKahDE5F3C49hccI3uWpNfCukDFChFQxsXFouTuET2SprLfhF9Wt/hXiXu4NEcP5NhuDQyOgTqKnKvmwptHdE079+mcXwpuTrg1nWRilyQDhqBsFImXKLyep7NtwMloxeEQDdJww+Mdxovh649I6VEMQvS5kaxm7D7icyDFUYO/fI2PCHxoM62UX5Ph68xdt4luSdlsmQecDsgd9+RGiZmKx2e5IoeQJwOReAr6U/cCpzfVS4v82ma+3kmq+1UnPU/YOHawD9jSB/yAX0SiL1SN8gsZjqmPq4DSUq2V8kwRwmB4v4TC9hWqW4/yNwYLyU5M1KT2YhY9N2tn/ClF3wXzq1/q0YBKgQBCpPhy8Q5CRr54bsiVWFgZr5ZIH7MyWEA6IBfIKN/DNfEdFIJEdGMttJE6eCwRFAK2w+XgkjgqYFHnCkABri7K9FIP0BhIgjClN+cRaVozQeo4RQF0aoh4ObHN68gI58dgFNcA5Fav/Ito/3HEwvTO7xXLlGGrrnKjYZye5Cg0XJfOpn2um1cJgAF1wRgaGoORYBNiIxj/JTEQhSShowowjpXo7ykmup9TLF8OhaDFuhxcS6MK+A4a2iQKLV6FwNR+AmCow7ZucoDjEBFzJRzgtkGduaEUzY6jRVB/3hX48NPnGXrvgfbuqSY15g9hSNN9gnwYzpz8ZzaqkDsdITJnEUgth2sreB6HXcgs7yKmjzruP2E7hMJ934MguBrCndk+sC0ZK/PaSjrI1lr9gC4acEY/pp/9e4vr9Tbx0Gfet3PEj3xFg7jUcVB19BdoHzkJmoQ6tDfuX0Kwde8rnS9Um+/q1mXjb+tV30KiW8vjhpkoUzzg65kZ8Ubk+5AbcYn4O/m+TLXC4OkZlf6UDpA7LHgHSnfbEknPV24EfPT55+GvsfKCqtjSvB7tjei52loBAIUKax+AjTdLt4pEEew8Hd+8e3r1wsu/IsV5PKH8KSWkdhcTAprJzKWzxXjcTL0ifk3CPmicxjC6m4rK9AUgRy9wT11jaS/v4TQcEwxliIA5UpreDCZCHrIU+zTaBFc1BuxdYVbVfmhoWcI4aGaUwJ1yjHBZjyt6dMHckPe9y8p5FcZLBHPithgbHPMPNAlW/2pMn2ygmhLTi2cSecpXR+Th+u4t0h3ia9GsDv7UsNRUFh5XGg9WptRQT2ai6UFA8oKz+dgZedhRaAUSZVqzbamycaU+Zvu+7TkO6QWEZm7TPo8Mp/rRKYjMaUWm4+T+urQjUJyURHglJmiVI45HvcyodHIx6H/WM/nsyOoGoOFutv0mM3wE+f3ldKhaiTVcMqSgdCWjpaIBtrZPAGgq+dpoRFPZO/BxNj43xJ9bPxcFpepth7b3mz9A+RORaUuDD91ULlG0hRbZt3+0kRvFl/3hROLFtSCE3wcY0fqKFVhEb4Yq5DiR2UWsyiSLfnOTfPXgXxzTOmiiSgshZ2spBrQ5UdEbvNSpRHX0u4LA4KcerX9cpNr35a7JSaOwO0HvcmjL6EeIfVnW9Qrzem9/qT4qHmlyV5r9jjmkohMxE/E81kDJbcaXZ9PQucBr0uXGwm2TIPIlrHZJS/GBDUniwGnmjYawoD0mzhg40zL4Su1clFVTqftpqgA/RFYIc47tBNdK+jj6fF79dzs8+8s+FDNbTMUO5kPznHhJc= X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: 961fb492-34d2-41e6-bf8c-08d9d0ae2e02 X-MS-Exchange-CrossTenant-AuthSource: PH7PR10MB5698.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 06 Jan 2022 
00:47:56.6087 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: jJzqxi0yVUBcFUJt7XtjlaxV1xcyiOyjFZNJT8gq39Se21tkaApgGKK5dowjzaqdtGDAXUIcAxzzwXRuefAyXb1JKrus0Ur2mgCRtRhuSqQ= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4422 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10218 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 suspectscore=0 adultscore=0 mlxlogscore=993 phishscore=0 malwarescore=0 mlxscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2112160000 definitions=main-2201060001 X-Proofpoint-ORIG-GUID: K9Ee__0VwpsSlQEGThoUqRnivlDQb83K X-Proofpoint-GUID: K9Ee__0VwpsSlQEGThoUqRnivlDQb83K Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org kthreads that charge their CPU time to a remote task group should be accounted for in cgroup's cputime statistics. Signed-off-by: Daniel Jordan --- kernel/sched/fair.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 3c2d7f245c68..b3ebb34c475b 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -11524,6 +11524,11 @@ static void cpu_cgroup_remote(struct task_struct *p, struct task_group *tg, goto out; incur_cfs_debt(rq, se, tg, debt); + + /* cputime accounting is only supported in cgroup2. */ + __cgroup_account_cputime(tg->css.cgroup, debt); + __cgroup_account_cputime_field(tg->css.cgroup, CPUTIME_SYSTEM, + debt); } out: