From patchwork Wed Oct 20 17:02:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ankur Arora X-Patchwork-Id: 12572771 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 33196C433FE for ; Wed, 20 Oct 2021 17:03:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0E34060FD8 for ; Wed, 20 Oct 2021 17:03:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230233AbhJTRFo (ORCPT ); Wed, 20 Oct 2021 13:05:44 -0400 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:60764 "EHLO mx0b-00069f02.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229771AbhJTRFn (ORCPT ); Wed, 20 Oct 2021 13:05:43 -0400 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19KGAWS9000747; Wed, 20 Oct 2021 17:03:14 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=47AdVQnP4lNzmxB3zj+2p4Hs+N625Kmd3Mqcswt0ifw=; b=cBAIws8S/hN3oC48d9HONtr2IdBk3c51s6/4V2Y4FkYwqdoq1zySuDgd4H/Ydz4k3PYn ctw7+BVu8NFluZueAKp3TnZvsdtGT5G/JIpofrzNS4fG3VMN4dCYqwcVgMCXLmgRJCZl zsIZdXx/xhJcvIC4GPWu8fWfCXNn4GPteXFBGB5t9orOdoY/UDxHHgoBcCG6rHUXEUEI dbIj44tRR85/251rzzBhOENWoO+sPbR5QSC8ZQw85xt3jy2H//PojJ0glR6cdKaa4Bu6 CnOUG9JTQnoT+OeKcsJtluHRne1ewERbPykOHwnuJ0I2NT3Vy1L03gfxr00DcNkT9qo1 YQ== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by mx0b-00069f02.pphosted.com with ESMTP id 3btkw4sbwb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Oct 
2021 17:03:14 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 19KGtbM4104587; Wed, 20 Oct 2021 17:03:13 GMT Received: from nam12-bn8-obe.outbound.protection.outlook.com (mail-bn8nam12lp2172.outbound.protection.outlook.com [104.47.55.172]) by userp3020.oracle.com with ESMTP id 3br8gug2dq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Oct 2021 17:03:13 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=bB9NH3BCiEUznkCrkn6wbwVYwF7vLXawPWVfCp3pPmZBBRsL7rMUY/NE+dCTY0V4VyYq2VJXdloD8kCV+2M43YS4N+W7egT5TAcwlmP2ZRvz9vEi05Mp6KUhCawxMwBZG3B7kBwhxlqAvFnXITxyTCO/Pxvrrfxn0d/1n6sMQQGjLbP0rVOBOHru4eL8kcCHluTqFR55NFTIJJOhagnJEN3a8v6aYws5HmTOhEr14JH3xWz/3yUNn7El2aXeD9ZtVLWk9enRsFtAGxHDz2RyTjDKA0EX7YPGpcDU1cAbCF0u6ZOTBoeW4XKqf6o+qMyguBcWQm4uPGJrVfij/0l5lg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=47AdVQnP4lNzmxB3zj+2p4Hs+N625Kmd3Mqcswt0ifw=; b=g9/tS4UywnblCln/gbOchwaArni7IAxSjPKuFhc+WB/w/b2v4mQnsTj1Ktmi7AyOtEVNCuaH6tBgfp4z4lwGVPQLp+EpmY/AZpDYVUrS287FFNnozLGfOWxNdwK2D6YD3XZA0kHHgyynP7uGPjFMsGJBmiw8aIMbwA/1qaEyGATCl3znKfWnnr3uKaKAw8Y8IbY++AVpwwkeiVYXiw6fwcnjDY38HkFdbNkc+4ED6D+IsTQ5r+ROr0VBTlf//znu//XmLjBdd3hqsStrxqSzB151xDg8fUWpMtjLfmb8XN10WV1RTqxfxwRhiLT/7KmiOIP6oMN2bLdielPKlDEmzw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=47AdVQnP4lNzmxB3zj+2p4Hs+N625Kmd3Mqcswt0ifw=; 
b=udxgngcT/Ld50r5+wPkJaE6B5BGb3Iu1eCIvEBCoE68ZO/WKFjbJvUnHaBdgl7n4swImMYXp40gaBsdbaYi/R8jArDYdlTDO8yDMfgOrR8bvHiJHjoC5XE47qdYGEg3W6CZTC8nh9+1uX00a7BfWZDUjcL9bgj5DKdtC/98miGo= Authentication-Results: vger.kernel.org; dkim=none (message not signed) header.d=none;vger.kernel.org; dmarc=none action=none header.from=oracle.com; Received: from CO6PR10MB5409.namprd10.prod.outlook.com (10.242.164.110) by CO6PR10MB5571.namprd10.prod.outlook.com (20.181.96.211) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.16; Wed, 20 Oct 2021 17:03:10 +0000 Received: from CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d]) by CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d%4]) with mapi id 15.20.4628.016; Wed, 20 Oct 2021 17:03:10 +0000 From: Ankur Arora To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org, akpm@linux-foundation.org, mike.kravetz@oracle.com, jon.grimm@amd.com, kvm@vger.kernel.org, konrad.wilk@oracle.com, boris.ostrovsky@oracle.com, Ankur Arora Subject: [PATCH v2 01/14] x86/asm: add memset_movnti() Date: Wed, 20 Oct 2021 10:02:52 -0700 Message-Id: <20211020170305.376118-2-ankur.a.arora@oracle.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com> References: <20211020170305.376118-1-ankur.a.arora@oracle.com> X-ClientProxiedBy: CO2PR04CA0171.namprd04.prod.outlook.com (2603:10b6:104:4::25) To CO6PR10MB5409.namprd10.prod.outlook.com (2603:10b6:5:357::14) MIME-Version: 1.0 Received: from localhost (148.87.23.11) by CO2PR04CA0171.namprd04.prod.outlook.com (2603:10b6:104:4::25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4608.16 via Frontend Transport; Wed, 20 Oct 2021 17:03:10 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 798c735b-a590-4567-c9ba-08d993eb7ee4 X-MS-TrafficTypeDiagnostic: 
CO6PR10MB5571: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:7691; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: OOBQyf1Veahj5ejDVCtawzXtExz3dJnu7EACwS5DbBCzglEZfYhj8bUYF80itOOzfAroLQcZpWz7ITioJ9P5o/AVy017QwfbRLMh1x+o7hGpOeR1o9/Z1Bd5pE56AmWYgCoOINIcqbopZ316poRecd17M4k3wtHrAlxcmSOMjA/saZ3cch9XWCJYjN5Ij2ny//UqXD56WCUWTJgFwxImO+653lhU8cmkF+rpCCEqNFEl/I1mI+B77SYxQ/aJodVQVkJLGuHlo2ex6/Pt6RSqDS320m1B1YeUQ/YKq5mKWieBjb3YH1Tvg/xV0Lrzh7JIHiMch+nnj6HWf8tn+1B+k2xuKd0SF2F9914jcmxcle0+hbIqfVFFrrw9kWMa2OCTLrHNiQvgRSmNMocWG7zLoP6pXAeXqRCHfIOYcu7QziC+8akwpmmgLg733PZN++DBiin7k/3VoPhQHGbqkyCuWN2oX3OqRR0ORQCvgsZxXAnP6qnonWhIeWcIwi9kGR4s68MbqNqEHghYGGchiN2cnVHJ1CikCCD2/BQqLWHs4Sffgas+Oyskjjnxp7OSrPbjEQwMtIykjDgYwC64LDQlxrH0zki0qoYjQK/c7GYfXotGOo2IVnmq6GF+I6lMXUjeeX3h4WOQ6ViS39aMczgi+hHXKCsg9+fubFXvjkvE5Tl0EalpoNZxJEOC8JaCsIrMnyEBKACE+7kOvLNzBV9zd/yORFl/4JICYIrM1icGkbPYXyhsAZskMb8Hu9QC0Du1mZUQUG7+GX5Z9aGqJ95nkifF2y7aRi7SE/q51moEZcr7Efqvic8keM454KDg0ko95won9O5uwkJPzUWStqftNkyzMf/X/m/MkF1H+CuMBsNhdC8C8eY6mJvLf8Ahf68w9mK1j5hn7bBRCPvJndsZYLEfoBt66Gyph21R/bFtzcw1+Mh/SzvLklQjEoZsvszYRkF1j9WpKALVczkpR07UCQ8jsW3toFxteMTOd8s7rfF1l6XG8CcUgvqiw9ESZEXRS8qLiz+AnsEe8D24dhLii8bq58woFjv6yBZ77875nBs= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:5;SRV:;IPV:NLI;SFV:SPM;H:CO6PR10MB5409.namprd10.prod.outlook.com;PTR:;CAT:OSPM;SFS:(366004)(8676002)(2906002)(66476007)(6486002)(103116003)(66556008)(83380400001)(1076003)(956004)(107886003)(4326008)(8936002)(5660300002)(2616005)(36756003)(6496006)(52116002)(86362001)(316002)(6666004)(186003)(38100700002)(38350700002)(26005)(508600001)(66946007)(23200700001);DIR:OUT;SFP:1501; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
A8hM6yKeqS90CpPbavMSdvPVufq+XMmxNS6GuTgJbNYGDvS7bR6BwtT4ImPWXn3b5PrMbLc1snMauDraUoThxskQ/msRTc+JoqsQoF5VKvTKq+KmyS7DBbae11jTN7vvZ1AUBOVH1Yp8al+7hSHFSbOJ5AkIa4SVw07Ht79iMq64Uuxj0IwsgQVO60X9yJDlbzGjEtFn9Pu7SOCR4Niso2NVmuQBAdzKQ8bcqO3VMOUITe47IMuQ4iTPOIiwxC8mEswaX3ySbwyFg3/CNVP8LEpEpoMS/0M7IXJF/PmCbCay4umAV+Jz2joLV6fYE5PYPuZ0KGwTV388Ykl5VqygwGJWGX18a8UJZ42sSwS06+VQ0qLnHUvob10EgWSm4bvHET6Nv1osm48BEC5ddgdsescvC/5QwsUDb5IYNauadeXupD/c70U3JiulBAyEKGOqNB7NfY2pQhKYuCSrdJA6FGoVfaZZaGI1G5q4iUy+lLu5Fa0dYP5NQyVokcGj0eUDDlwsYFgxZeCXhYS8gWQARvgo5RRdg71WeCDjVvSpspw8fSVdEp8LJdJfHppt8N7cenuHmWw6ad71KHGQCAa4SG2tjQayeIQ3zgy3HN9AAdYqzu/u8T7Xqtag2CcGXjmzIh4Pnmo58APlVNxbS5iybZy3I71ENFf/6pdGpsuC8RKmjwuWjhJpRveV83lo/BD4SQf75hBuaQrzbPv04k2UF1RcMq4pJ39mIyRkTwbK1OH31XJzP3ALEnZovWfE+gNyUaPu6dt/OQlyB0ZkSIBQoJwcUK7UxHcQmWJT2sKau+U2U61tG9jE4m6oTpDmJG4StDZqOKujOwwwCGKNpvcDPauPNKfRVw2FpXZ6phBC3Ha1j6/FfqYv9H+5iOQ+abMavldr3BYaUVrP8816lnxFCCcPfBbAjCLT9TOaUmzqGOB+YqKu0JdgJixKVYlw+XUW3zaf54zQUJqH1GwBkjicCTOQk2fvUl5ge3n3YriYd/BFTl9KpqkwMSEwMJZFZgS4Syr4dySwqwtmQcV3SDqKXwIcMaFcfPjOEc2xLBmrkr/xW57x5S8jrmsBsBK/l9JbVn2fsiA3AOdP9rWVVviY1tu2yRTmQnIU9FKWw+sc8nP7/Tf3tSP68Gs52485u3uiBMc2KEtM9e1f/+A9j2cb+PoKPkkv0dV2EZa99gH4WhTqtJ6XZqHATvtoSdFlsVOMTk5aj9nMxwBFnSdMQi6jNrtRQKRx1aqZLLB9el6CYd+eQZiViBUvqEnlnVcONgcR/C6eUPMABRrDzsSkRl+dH7c1aawBjYw8jl1qbrzoovM/WBBdfaatFZotdFdrhWWlCEf+DfBMRTk6qpDKkTHPgGa0RN028a9n3xHfZIcQ9nnTUiHtW0ZtpK4l1bPhnsGN4Bv6NlIChbdqesKE5Hr6iCzV2GDlzFLp0A4dSNeaRCWXWXIfRoSt0VHj6tUYLTiM X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: 798c735b-a590-4567-c9ba-08d993eb7ee4 X-MS-Exchange-CrossTenant-AuthSource: CO6PR10MB5409.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Oct 2021 17:03:10.7476 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 
ankur.a.arora@oracle.com X-MS-Exchange-Transport-CrossTenantHeadersStamped: CO6PR10MB5571 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10143 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 mlxscore=0 adultscore=0 spamscore=0 phishscore=0 bulkscore=0 suspectscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110200095 X-Proofpoint-GUID: z-Xik7QtdNbJ15_qUVfD0JCRBDggSbFO X-Proofpoint-ORIG-GUID: z-Xik7QtdNbJ15_qUVfD0JCRBDggSbFO Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add an uncached (based on MOVNTI) implementation of memset(). memset_movnti() only needs to differ from memset_orig() in the opcode used in the inner loop, so move the memset_orig() logic into a macro, and use that to generate memset_movq() and memset_movnti(). Signed-off-by: Ankur Arora --- arch/x86/lib/memset_64.S | 68 ++++++++++++++++++++++------------------ 1 file changed, 38 insertions(+), 30 deletions(-) diff --git a/arch/x86/lib/memset_64.S b/arch/x86/lib/memset_64.S index 9827ae267f96..ef2a091563d9 100644 --- a/arch/x86/lib/memset_64.S +++ b/arch/x86/lib/memset_64.S @@ -25,7 +25,7 @@ SYM_FUNC_START(__memset) * * Otherwise, use original memset function. 
*/ - ALTERNATIVE_2 "jmp memset_orig", "", X86_FEATURE_REP_GOOD, \ + ALTERNATIVE_2 "jmp memset_movq", "", X86_FEATURE_REP_GOOD, \ "jmp memset_erms", X86_FEATURE_ERMS movq %rdi,%r9 @@ -66,7 +66,8 @@ SYM_FUNC_START_LOCAL(memset_erms) ret SYM_FUNC_END(memset_erms) -SYM_FUNC_START_LOCAL(memset_orig) +.macro MEMSET_MOV OP fence +SYM_FUNC_START_LOCAL(memset_\OP) movq %rdi,%r10 /* expand byte value */ @@ -77,64 +78,71 @@ SYM_FUNC_START_LOCAL(memset_orig) /* align dst */ movl %edi,%r9d andl $7,%r9d - jnz .Lbad_alignment -.Lafter_bad_alignment: + jnz .Lbad_alignment_\@ +.Lafter_bad_alignment_\@: movq %rdx,%rcx shrq $6,%rcx - jz .Lhandle_tail + jz .Lhandle_tail_\@ .p2align 4 -.Lloop_64: +.Lloop_64_\@: decq %rcx - movq %rax,(%rdi) - movq %rax,8(%rdi) - movq %rax,16(%rdi) - movq %rax,24(%rdi) - movq %rax,32(%rdi) - movq %rax,40(%rdi) - movq %rax,48(%rdi) - movq %rax,56(%rdi) + \OP %rax,(%rdi) + \OP %rax,8(%rdi) + \OP %rax,16(%rdi) + \OP %rax,24(%rdi) + \OP %rax,32(%rdi) + \OP %rax,40(%rdi) + \OP %rax,48(%rdi) + \OP %rax,56(%rdi) leaq 64(%rdi),%rdi - jnz .Lloop_64 + jnz .Lloop_64_\@ /* Handle tail in loops. The loops should be faster than hard to predict jump tables. 
*/ .p2align 4 -.Lhandle_tail: +.Lhandle_tail_\@: movl %edx,%ecx andl $63&(~7),%ecx - jz .Lhandle_7 + jz .Lhandle_7_\@ shrl $3,%ecx .p2align 4 -.Lloop_8: +.Lloop_8_\@: decl %ecx - movq %rax,(%rdi) + \OP %rax,(%rdi) leaq 8(%rdi),%rdi - jnz .Lloop_8 + jnz .Lloop_8_\@ -.Lhandle_7: +.Lhandle_7_\@: andl $7,%edx - jz .Lende + jz .Lende_\@ .p2align 4 -.Lloop_1: +.Lloop_1_\@: decl %edx movb %al,(%rdi) leaq 1(%rdi),%rdi - jnz .Lloop_1 + jnz .Lloop_1_\@ -.Lende: +.Lende_\@: + .if \fence + sfence + .endif movq %r10,%rax ret -.Lbad_alignment: +.Lbad_alignment_\@: cmpq $7,%rdx - jbe .Lhandle_7 + jbe .Lhandle_7_\@ movq %rax,(%rdi) /* unaligned store */ movq $8,%r8 subq %r9,%r8 addq %r8,%rdi subq %r8,%rdx - jmp .Lafter_bad_alignment -.Lfinal: -SYM_FUNC_END(memset_orig) + jmp .Lafter_bad_alignment_\@ +.Lfinal_\@: +SYM_FUNC_END(memset_\OP) +.endm + +MEMSET_MOV OP=movq fence=0 +MEMSET_MOV OP=movnti fence=1 From patchwork Wed Oct 20 17:02:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ankur Arora X-Patchwork-Id: 12572775 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D6FBC433EF for ; Wed, 20 Oct 2021 17:05:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 57D0E61374 for ; Wed, 20 Oct 2021 17:05:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230336AbhJTRHo (ORCPT ); Wed, 20 Oct 2021 13:07:44 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]:48746 "EHLO mx0a-00069f02.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229941AbhJTRHn (ORCPT ); Wed, 20 Oct 2021 13:07:43 -0400 Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) 
with SMTP id 19KG7wLS020970; Wed, 20 Oct 2021 17:05:22 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=HYcM6BZ37nyeF0LWMaoFZ1+N3j1AvzKq5xuMxO5qdDA=; b=laHnXSF3t/aVZYGfT+w1r/U4O4uZi0PumF6NRDoQ6UEqXCkRBR1r2jto3iiBIf3q0Ldg UqkJ6tRAk6aDL9yKb4LBgdLSfL9z/RVzLnjTa1jUXoADYzoQTYFr3UCb/oFea9dT+U4Y ztyJZxjt9qF3X6R8ttK8XnWTRDau5xupKvNSZaKAF4b61JYh3VHI/f+a9ljl+yvENVNa GzgQbKgA09lx0n3DgqPMhIIQ3A6VTDAxw8qYkIHJrRxN/r8/4ScwSzrkeHerfNDUO8gO 7iiyEGnLJm0fpOcsIS6oQse795texBXPoxxolvKd5/muIWQRN8ujH+96M5tcDvertU/G PQ== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by mx0b-00069f02.pphosted.com with ESMTP id 3btkx9sfju-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Oct 2021 17:05:22 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 19KGuqlJ024711; Wed, 20 Oct 2021 17:05:21 GMT Received: from nam10-bn7-obe.outbound.protection.outlook.com (mail-bn7nam10lp2106.outbound.protection.outlook.com [104.47.70.106]) by aserp3030.oracle.com with ESMTP id 3bqmsgs2u7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Oct 2021 17:05:21 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=kJccCumgzGV0RpFaMrnGBA1Rqu9h0nCDyb5ME9XCbHvUQC4E4LpLmKIWaKWeypBPLiKo4pjDRZsv987gKiuEvWhP3gAyRkr0N6T5IzvhQXkNKVRMWnsTdp2UhuCCI10JYRv4QO2j5i+9lkXL9U6K9BchKFSYU2nCCsdZXQOiYU205vdYpzZhenDMTx5u9iIYNK+wSX7EnuN8TvyRiXCGjP9o+iZomnJ21uHXGwBcIMbDzrbisofwIZg8VGOUhiT9Qhcp+gNYv1dnP4vUiHWZWI3Jo8sh/xGlQN90xu1U2qDGXt0yzLN6EZY1pUFoCzuCn82F2rhhLA//MhY8rerfYg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=HYcM6BZ37nyeF0LWMaoFZ1+N3j1AvzKq5xuMxO5qdDA=; b=eIIvvXZTTosmuSGIbDPP9lRFlCncuAXKh+Yfh9MFVdiRK0Gz7WtQgCiD7cBYBbb3X//Wh1uhr4w2+7rEuldTcppjbl+6FUig+Y3oPgqT7fWXQkkP3+k6cWiFpMqh+Sz0s23yRZUF7b8FcuV+aAIT0Z0H72k66+jvI+LnlilaB/0ZqsAqwlU/GvJWAsZbD7MZEX5yYGj0kNsvjTaUkJvY96oNnXo765AdxxRe/91GOey6sPtwPoYSZFkguPZqRgwxnBp1I4smv3oxwKG9uLWyRXzMYCJNrduQjMwRvlcwPh6sOFL5ymwsoS7JkZ8aknV676B+L3yHciTM6cxYcn6Dlw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=HYcM6BZ37nyeF0LWMaoFZ1+N3j1AvzKq5xuMxO5qdDA=; b=0Jx1zUOTPliC4TQ9Ucpm3FMojuIli06QeSR+GXLF0nJUZXt8BU8uhk7OXYNConnk1diI3UffE1XojKj1kVNPQqFqQO0wIbUQjy+GGc89OcBSm4TS/KTLniGEzHnACSvaYjk4vgkIJt+lUk1+MJo24JAWvohrD08r5Fp99B04ENQ= Authentication-Results: vger.kernel.org; dkim=none (message not signed) header.d=none;vger.kernel.org; dmarc=none action=none header.from=oracle.com; Received: from CO6PR10MB5409.namprd10.prod.outlook.com (2603:10b6:5:357::14) by CO1PR10MB4577.namprd10.prod.outlook.com (2603:10b6:303:97::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.16; Wed, 20 Oct 2021 17:05:18 +0000 Received: from CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d]) by CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d%4]) with mapi id 15.20.4628.016; Wed, 20 Oct 2021 17:05:18 +0000 From: Ankur Arora To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org, akpm@linux-foundation.org, 
mike.kravetz@oracle.com, jon.grimm@amd.com, kvm@vger.kernel.org, konrad.wilk@oracle.com, boris.ostrovsky@oracle.com, Ankur Arora Subject: [PATCH v2 02/14] perf bench: add memset_movnti() Date: Wed, 20 Oct 2021 10:02:53 -0700 Message-Id: <20211020170305.376118-3-ankur.a.arora@oracle.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com> References: <20211020170305.376118-1-ankur.a.arora@oracle.com> X-ClientProxiedBy: MW4PR03CA0017.namprd03.prod.outlook.com (2603:10b6:303:8f::22) To CO6PR10MB5409.namprd10.prod.outlook.com (2603:10b6:5:357::14) MIME-Version: 1.0 Received: from localhost (148.87.23.11) by MW4PR03CA0017.namprd03.prod.outlook.com (2603:10b6:303:8f::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.16 via Frontend Transport; Wed, 20 Oct 2021 17:05:18 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 4278fbc0-0f2f-40fc-194e-08d993ebcb21 X-MS-TrafficTypeDiagnostic: CO1PR10MB4577: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:326; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
ROqETJxvsNhjHA7Hz5ZQP8AQrZZHltjldhHj2Dz6uys99ooLCuLqL+54awVuNJ3/hWj/lTfT0yoTMj49EXxbv01msODf79pAlhzHpyefmZ2ojDdXKgLO+tigml8EarZ800iENQ4DUYAj8Cxu+7r4q2468pC0DyY/VZXWYGGVV6wY/FoCTRxlOL6S8OhCFXhVseiiYSXUzXGc0x87IPUd8LGc2FpYlnTu7ASAPIFA5IPSsf5Av8r6UIzG7lRVMEqMkz8VL3M4VY53IJSOy8E9+OvZqRrcRZVewTKPQ0ONd5K6GRaPJ7o6R1qtn/RsLjZkAK1dUkW2dm2w/IRfT5g6frxZ0/2QPbUKs2vuhr9jTrXRpp97aYQz7MfQ79ChVNiXLqRZxS+PrXVqS4+jkaUWlCvW2w0wzKcEOUG+08tSFfa+3xtTwWGzhDKalqEwKZraqO9NZdSGfXES71VIyYJz/KczPRhhW2Qb0q+Et8M7hcrh6ORkVKOCKdu7ldsH9wz4jZN0d/UGZLzQtrOoNRTfct4Ne84O18yPBG3tP4QMSTy05w9fIU9HsVRsjlOxqRjvs3w9IS7mEI5k4ReVFlyg9nUNDOznE4hKVBtaoHYMvLbbmm5QE8oRUD7N1b2odMGN9XY7OVe0DoijRinfVu8e7P7Do+ANZyiFuvWn9DJ2H+R2HJV7BlJdQvlgl3/OnXiXA+D7jEw/RH1LLQy9vQYAnCQjL5jIKJQ9lpr3RMONBBYkNQgQMz6ah9+tLcWohS8IqpaGxOQY+mj/s1KBTW2weiqX6YgvG3kbxP2turukSdj2RpIwC+fm3aFLD0ECHXIVy6CjgGChzjMSbbDAyNQ/czzTpzG/eNa32mkkCAXZaxVjs0R+0/YTJSq3jC7i66imk7T0dhqT/pm9jEMDgYVk5EBIHV1Qr3R5jGd1bEmB6cd0DjLAa+ypoHzsZXxwtxn5ral+oVrxxTYAP/Q9RDABqNv7JviDD+djYgqo+BDD6re3ZQVshy20lLS796J8y5WsV/QG4UDmBRN9Zh9aUGMblg4kGvv+8HKLbc58+LHS2NQ= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:5;SRV:;IPV:NLI;SFV:SPM;H:CO6PR10MB5409.namprd10.prod.outlook.com;PTR:;CAT:OSPM;SFS:(366004)(66946007)(66476007)(956004)(66556008)(8936002)(36756003)(6486002)(5660300002)(83380400001)(2616005)(4326008)(2906002)(107886003)(186003)(1076003)(26005)(6666004)(38350700002)(8676002)(38100700002)(508600001)(6496006)(316002)(52116002)(86362001)(103116003)(23200700001);DIR:OUT;SFP:1501; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
gSuujfPlEyfuZDGwuhFkw8AGyxDvl4Vw1MAAVsjWBcift3hadWqPVyTUzWoi6DPWbl/0BiefLy6i4+/7yWaSx0Zl3t1+XScPEFnMtAJcsz/4Z3W2rinNWrbBL6Va0Yma38S9QpWEzZBZjHAjxXvu04LLsBb0QLURMs2KJArw9JcRrsGcaryt/x7OTUkEu7EPh9jqmTa3cTsqp+WXJwMEFRuJRBMHifRRv4vhSYyQ/yIQ1t/CbGqF8NFrASDhlT/BAhqeLDBUj28azH+7WUt5k04yRfuMLrlG70y8PFlAZT2y6MwESoCidzdYjDflKUJGTLt3fVu5Zt4l2AkCxy8Q05aVyRPMzSwReoWASUAyCuaK3wE16uvHRep138/hwAIC3+GJbR6wAqHSd059/JdOrUibnkQXERsSN/2gxHPz7Z6Z2o1eTm7vmKjFsLodM0T0S+hO4LQ28jLYG+SnH6FhXdJCbzCeak6El3fhJJAxXGvDBE5N9PaooFRCw3sBYvEc7508eQ1GbNPh75CUtTp1+LmhYIDQ2y96kJ+7OzKemgAozPoNU/SgoVHmz6/DItWiNqIwy+v1zzmuxuTJNR1yQaACt3PIlPBsLd0ed1wvTQQKPBpJipA+4HA+X4FcyXivg6pXUaNyjquFw28Fj8XPhxN8QHPhDxjT0vNv9fL3yzvzPoga/GDh6crj9/0oX80c5QYHZvWddHyRL9Yfn6pdGChrOSubLgzjzeM7l/vUvNJhUpLuDP3qvnDw+vuNVotPjGwVjWnwBoEzj9Sl8ewNOGRDJTkTEyACkeTPYizbrN10LrO8JTJbQYN1Qq0K3jdlrppxa1mazbxZGrcM1ykNTFvkYp8bBG40qtiOjtjYJqeRFs9jPXVj3UiNTvKdylqKzoWrr1zRuv3NZfhWILtXmbn7sCMmPFZZR6NFY0BYcgxpx1gcmX3hUKBTXY4ydJ+4r2FxgQygkct21vHYa4s/a6PiVqJfa+UhR5OPJ65/mtQF1vQ/Vl2bTsljTf9Povkk4yrMLuMARBlrSfDrdS5penytkxT1jNloW8Gs3eULy/r+Mm1DtOOvRjKADDQACDiDUx60/qYMmF78QVLXT5W1l3Mw/HhHSHPSR2DMiwpFv7+GWSEddpqU4Pp5xNwZU2tmETE55kY6L8yaANtYCpGtTceUNPLOMIoi13XIJtF6RrWzmsjBIDkNPDBTG82dBvVqzGiraMNyKJNUIcWPQ2ZOQsK9Yo2rDzeI/ApUfmCKDeJ7wn7qqDkD0EnqcUbdCqsZCqHLEdaAs0niMCpcoqppQ3r+lUIii7AKqW0MG72rF473woDa/0OpL7sXmdxlcHSzGHNotaTlp2Y7f+K1nilvyDUdjPeqM9sHRmtNF0wl3kzN7z1pU+WGjKA18ZjT7cqsudfzS5BWUfZBj+aBYhsUy5sJsK6yHS5MGXDIZ1s13CQsST7Dc0Be8JFNqzABZG4m X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: 4278fbc0-0f2f-40fc-194e-08d993ebcb21 X-MS-Exchange-CrossTenant-AuthSource: CO6PR10MB5409.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Oct 2021 17:05:18.6646 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 
ankur.a.arora@oracle.com X-MS-Exchange-Transport-CrossTenantHeadersStamped: CO1PR10MB4577 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10143 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 adultscore=0 malwarescore=0 phishscore=0 mlxlogscore=999 bulkscore=0 suspectscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110200095 X-Proofpoint-ORIG-GUID: fNnA5ikQXi6GuGaUfXVZmUVUQJvZ63zE X-Proofpoint-GUID: fNnA5ikQXi6GuGaUfXVZmUVUQJvZ63zE Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Clone memset_movnti() from arch/x86/lib/memset_64.S. perf bench mem memset -f x86-64-movnt on Intel Icelake-X, AMD Milan: # Intel Icelake-X $ for i in 8 32 128 512; do perf bench mem memset -f x86-64-movnt -s ${i}MB -l 5 done # Output pruned. # Running 'mem/memset' benchmark: # function 'x86-64-movnt' (movnt-based memset() in arch/x86/lib/memset_64.S) # Copying 8MB bytes ... 12.896170 GB/sec # Copying 32MB bytes ... 15.879065 GB/sec # Copying 128MB bytes ... 20.813214 GB/sec # Copying 512MB bytes ... 24.190817 GB/sec # AMD Milan $ for i in 8 32 128 512; do perf bench mem memset -f x86-64-movnt -s ${i}MB -l 5 done # Output pruned. # Running 'mem/memset' benchmark: # function 'x86-64-movnt' (movnt-based memset() in arch/x86/lib/memset_64.S) # Copying 8MB bytes ... 22.372566 GB/sec # Copying 32MB bytes ... 22.507923 GB/sec # Copying 128MB bytes ... 22.492532 GB/sec # Copying 512MB bytes ... 
22.434603 GB/sec Signed-off-by: Ankur Arora --- tools/arch/x86/lib/memset_64.S | 68 +++++++++++--------- tools/perf/bench/mem-memset-x86-64-asm-def.h | 6 +- 2 files changed, 43 insertions(+), 31 deletions(-) diff --git a/tools/arch/x86/lib/memset_64.S b/tools/arch/x86/lib/memset_64.S index 9827ae267f96..ef2a091563d9 100644 --- a/tools/arch/x86/lib/memset_64.S +++ b/tools/arch/x86/lib/memset_64.S @@ -25,7 +25,7 @@ SYM_FUNC_START(__memset) * * Otherwise, use original memset function. */ - ALTERNATIVE_2 "jmp memset_orig", "", X86_FEATURE_REP_GOOD, \ + ALTERNATIVE_2 "jmp memset_movq", "", X86_FEATURE_REP_GOOD, \ "jmp memset_erms", X86_FEATURE_ERMS movq %rdi,%r9 @@ -66,7 +66,8 @@ SYM_FUNC_START_LOCAL(memset_erms) ret SYM_FUNC_END(memset_erms) -SYM_FUNC_START_LOCAL(memset_orig) +.macro MEMSET_MOV OP fence +SYM_FUNC_START_LOCAL(memset_\OP) movq %rdi,%r10 /* expand byte value */ @@ -77,64 +78,71 @@ SYM_FUNC_START_LOCAL(memset_orig) /* align dst */ movl %edi,%r9d andl $7,%r9d - jnz .Lbad_alignment -.Lafter_bad_alignment: + jnz .Lbad_alignment_\@ +.Lafter_bad_alignment_\@: movq %rdx,%rcx shrq $6,%rcx - jz .Lhandle_tail + jz .Lhandle_tail_\@ .p2align 4 -.Lloop_64: +.Lloop_64_\@: decq %rcx - movq %rax,(%rdi) - movq %rax,8(%rdi) - movq %rax,16(%rdi) - movq %rax,24(%rdi) - movq %rax,32(%rdi) - movq %rax,40(%rdi) - movq %rax,48(%rdi) - movq %rax,56(%rdi) + \OP %rax,(%rdi) + \OP %rax,8(%rdi) + \OP %rax,16(%rdi) + \OP %rax,24(%rdi) + \OP %rax,32(%rdi) + \OP %rax,40(%rdi) + \OP %rax,48(%rdi) + \OP %rax,56(%rdi) leaq 64(%rdi),%rdi - jnz .Lloop_64 + jnz .Lloop_64_\@ /* Handle tail in loops. The loops should be faster than hard to predict jump tables. 
*/ .p2align 4 -.Lhandle_tail: +.Lhandle_tail_\@: movl %edx,%ecx andl $63&(~7),%ecx - jz .Lhandle_7 + jz .Lhandle_7_\@ shrl $3,%ecx .p2align 4 -.Lloop_8: +.Lloop_8_\@: decl %ecx - movq %rax,(%rdi) + \OP %rax,(%rdi) leaq 8(%rdi),%rdi - jnz .Lloop_8 + jnz .Lloop_8_\@ -.Lhandle_7: +.Lhandle_7_\@: andl $7,%edx - jz .Lende + jz .Lende_\@ .p2align 4 -.Lloop_1: +.Lloop_1_\@: decl %edx movb %al,(%rdi) leaq 1(%rdi),%rdi - jnz .Lloop_1 + jnz .Lloop_1_\@ -.Lende: +.Lende_\@: + .if \fence + sfence + .endif movq %r10,%rax ret -.Lbad_alignment: +.Lbad_alignment_\@: cmpq $7,%rdx - jbe .Lhandle_7 + jbe .Lhandle_7_\@ movq %rax,(%rdi) /* unaligned store */ movq $8,%r8 subq %r9,%r8 addq %r8,%rdi subq %r8,%rdx - jmp .Lafter_bad_alignment -.Lfinal: -SYM_FUNC_END(memset_orig) + jmp .Lafter_bad_alignment_\@ +.Lfinal_\@: +SYM_FUNC_END(memset_\OP) +.endm + +MEMSET_MOV OP=movq fence=0 +MEMSET_MOV OP=movnti fence=1 diff --git a/tools/perf/bench/mem-memset-x86-64-asm-def.h b/tools/perf/bench/mem-memset-x86-64-asm-def.h index dac6d2b7c39b..53ead7f91313 100644 --- a/tools/perf/bench/mem-memset-x86-64-asm-def.h +++ b/tools/perf/bench/mem-memset-x86-64-asm-def.h @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0 */ -MEMSET_FN(memset_orig, +MEMSET_FN(memset_movq, "x86-64-unrolled", "unrolled memset() in arch/x86/lib/memset_64.S") @@ -11,3 +11,7 @@ MEMSET_FN(__memset, MEMSET_FN(memset_erms, "x86-64-stosb", "movsb-based memset() in arch/x86/lib/memset_64.S") + +MEMSET_FN(memset_movnti, + "x86-64-movnt", + "movnt-based memset() in arch/x86/lib/memset_64.S") From patchwork Wed Oct 20 17:02:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ankur Arora X-Patchwork-Id: 12572777 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E9239C433FE for ; Wed, 20 Oct 
2021 17:05:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C764F613A0 for ; Wed, 20 Oct 2021 17:05:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230389AbhJTRHr (ORCPT ); Wed, 20 Oct 2021 13:07:47 -0400 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:13384 "EHLO mx0b-00069f02.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230354AbhJTRHp (ORCPT ); Wed, 20 Oct 2021 13:07:45 -0400 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19KGAxTv000812; Wed, 20 Oct 2021 17:05:25 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=5e3m7RFk0g13enMURwPEocpGyUnHDjiZAEp5AA8VXa0=; b=YYQhUBNXSSjcaoKbV+ss2MrHtVkkYBWohcodIX4fAvqOaJfcTKemTJ7SztlanoauRGaj ZVkUN/SBNLkweDu+7ovtojBdFN9rkbHvt+2tgC7fZnmW25k87QY6lWfIe/bA7I1yawDH S12bIHixMYtrBELsfiRyRBVYsWBhmEpLLF4DunH7XfcgWkSrEp/GhzENGMBin0NL2E65 1c+KGyQZkjIoGdd+Xz3PopYCqgu5zHJK53VzU9emz5tgP70kVbA44qghJKzd4hYdSFNa kc1QKvgmJBYhGeZSHL8AQUwg6atGWz6HOTi2EmsaGRBctIYm1Z+mAxs3KmPGmKlrKoF5 Uw== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by mx0b-00069f02.pphosted.com with ESMTP id 3btkw4scbs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Oct 2021 17:05:25 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 19KGtYCU104288; Wed, 20 Oct 2021 17:05:24 GMT Received: from nam12-bn8-obe.outbound.protection.outlook.com (mail-bn8nam12lp2176.outbound.protection.outlook.com [104.47.55.176]) by userp3020.oracle.com with ESMTP id 3br8gug6x7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Oct 
2021 17:05:23 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=PLHjb8tfft54e+C/E6ZY0dEPZaNJYiRk2JtNygmHzt6Od1EmQnzpLzmydCGwLo8ULmRN4GQ/OOzmE5zT4wX8hWoRYMUqyWfv8QXV2MyYjg4Ptx7dzD78k07vkcePSE4kXcIq51SNOtfZmuFTNvgXxVZFZoFdhtFJycpd+7lzZ29eDcmTV6k6Exn0dRuCoPjUmset2fv64gdxrDRRxMOKT1bmDzV+NNJb74hF1sGvSBblwibeBtIqfx8cBwUM1EVVLtTOUM4F9ATQ5cmjrW3HCbVRAdsLWYq/TSnbbYnJnxWt15GRbDbRbTbm6fa4HIESVhGbW8+nabEATTPhEJei7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=5e3m7RFk0g13enMURwPEocpGyUnHDjiZAEp5AA8VXa0=; b=QOHB2kF88BZxHPw1k6laBKBXxKbBDuYFLEllU6QvimpIFK9RViNsrmTDeT2j3dMujOhefm6XUdgD4vcZftlUGf634hUb4lmh+UP03diW0nOVTXf9ROE2Vg/VU4grQ0wToMBf15hF2mwaKuY4u2XXpFh5yto57oiw884139XQqdWIXTPls2m2dJjpyHI7gRcORD7ipHZ6z/U8/2UhGlrz3n2rptzi395cb0ledTSgdCgxM7t97xrBSwNYoQahzL7XaHujQ5ZUCvExj/eJt9aKGj3aK6uyDzXMay/iQ1VSfVa5O0WVds7lGO9Fe5VsuQlbfnG+VjMSuzqt2m5v7vVbVg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=5e3m7RFk0g13enMURwPEocpGyUnHDjiZAEp5AA8VXa0=; b=gYOLZiBd4yqGcF01yppXfdgtAnccMFzLW9FK3RfJcMY3pUu6UHI3zceVXf8NnF7cRw3yRaM07VbbUX+WTy/C9wczXEXClCvAYZA3Qd7yX/Uq+PDEXqhO+mci8gkV7P+IF+YwV9E45x8xCkSCqI6SopI7Elqo+bRnQL5hLOdZQIg= Authentication-Results: vger.kernel.org; dkim=none (message not signed) header.d=none;vger.kernel.org; dmarc=none action=none header.from=oracle.com; Received: from CO6PR10MB5409.namprd10.prod.outlook.com (2603:10b6:5:357::14) by CO1PR10MB4484.namprd10.prod.outlook.com 
(2603:10b6:303:90::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.16; Wed, 20 Oct 2021 17:05:21 +0000 Received: from CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d]) by CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d%4]) with mapi id 15.20.4628.016; Wed, 20 Oct 2021 17:05:21 +0000 From: Ankur Arora To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org, akpm@linux-foundation.org, mike.kravetz@oracle.com, jon.grimm@amd.com, kvm@vger.kernel.org, konrad.wilk@oracle.com, boris.ostrovsky@oracle.com, Ankur Arora Subject: [PATCH v2 03/14] x86/asm: add uncached page clearing Date: Wed, 20 Oct 2021 10:02:54 -0700 Message-Id: <20211020170305.376118-4-ankur.a.arora@oracle.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com> References: <20211020170305.376118-1-ankur.a.arora@oracle.com> X-ClientProxiedBy: MW2PR2101CA0011.namprd21.prod.outlook.com (2603:10b6:302:1::24) To CO6PR10MB5409.namprd10.prod.outlook.com (2603:10b6:5:357::14) MIME-Version: 1.0 Received: from localhost (148.87.23.11) by MW2PR2101CA0011.namprd21.prod.outlook.com (2603:10b6:302:1::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4649.1 via Frontend Transport; Wed, 20 Oct 2021 17:05:20 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 819273ee-24d4-4308-d690-08d993ebcca7 X-MS-TrafficTypeDiagnostic: CO1PR10MB4484: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:3513; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
ankur.a.arora@oracle.com X-MS-Exchange-Transport-CrossTenantHeadersStamped: CO1PR10MB4484 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10143 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 mlxscore=0 adultscore=0 spamscore=0 phishscore=0 bulkscore=0 suspectscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110200095 X-Proofpoint-GUID: bQY9z3vYS46VaB6t7UnpnmOGe2cde42v X-Proofpoint-ORIG-GUID: bQY9z3vYS46VaB6t7UnpnmOGe2cde42v Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org

Add clear_page_movnt(), which uses MOVNTI as the underlying primitive.
MOVNTI bypasses the cache hierarchy, so this provides a
non-cache-polluting implementation of clear_page().

MOVNTI, from the Intel SDM, Volume 2B, 4-101:
 "The non-temporal hint is implemented by using a write combining (WC)
  memory type protocol when writing the data to memory. Using this
  protocol, the processor does not write the data into the cache
  hierarchy, nor does it fetch the corresponding cache line from memory
  into the cache hierarchy."

The AMD Arch Manual says something similar.

One use-case is zeroing large extents, where this helps by not
needlessly bringing in cache lines that would never get accessed.
Also, clear_page_movnt() based clearing is often faster once extent
sizes are O(LLC-size).

Because it uses the WC protocol, MOVNTI is weakly ordered with respect
to other instructions operating on the memory hierarchy. The caller
needs to handle this by executing an SFENCE when done.

The implementation is fairly straightforward. We unroll the inner loop
to keep it similar to memset_movnti(), so we can use that to gauge the
clear_page_movnt() performance via perf bench mem memset.
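To illustrate the calling convention described above (non-temporal stores for each page, with a single SFENCE issued by the caller at the end), here is a minimal user-space sketch in C. It is not part of this patch. The names clear_page_movnt_sketch(), clear_pages_movnt_sketch() and SKETCH_PAGE_SIZE are hypothetical, and the sketch assumes the compiler intrinsics _mm_stream_si64() and _mm_sfence() from <emmintrin.h>, falling back to ordinary stores on targets other than x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <emmintrin.h>	/* _mm_stream_si64() (MOVNTI), _mm_sfence() (SFENCE) */
#endif

/* Hypothetical constant, matching the 4096 used in the patch. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical user-space analogue of clear_page_movnt(): zero one page
 * with non-temporal 8-byte stores. Like the assembly version, it does
 * NOT fence; ordering is left to the caller.
 */
static void clear_page_movnt_sketch(void *page)
{
	long long *p = page;
	size_t i, k;

	/* 8-byte stores: the buffer must be at least 8-byte aligned. */
	assert(((uintptr_t)page & 7) == 0);

	/* Each outer iteration covers 64 bytes, like the 8x unrolled loop. */
	for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(*p); i += 8) {
		for (k = 0; k < 8; k++) {
#if defined(__x86_64__)
			_mm_stream_si64(&p[i + k], 0);
#else
			p[i + k] = 0;	/* assumption: plain store as fallback */
#endif
		}
	}
}

/*
 * Hypothetical caller: clear npages contiguous pages, then issue one
 * SFENCE so the weakly ordered stores are ordered before later stores.
 */
static void clear_pages_movnt_sketch(void *base, size_t npages)
{
	unsigned char *p = base;
	size_t n;

	for (n = 0; n < npages; n++)
		clear_page_movnt_sketch(p + n * SKETCH_PAGE_SIZE);
#if defined(__x86_64__)
	_mm_sfence();
#endif
}
```

Fencing once per extent rather than once per page keeps the fence cost off the inner path, which is presumably why the patch leaves the SFENCE to the caller.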
 # Intel Icelake-X
 # Performance comparison of 'perf bench mem memset -l 1' for x86-64-stosb
 # (X86_FEATURE_ERMS) and x86-64-movnt:

 System:      Oracle X9-2 (2 nodes * 32 cores * 2 threads)
 Processor:   Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Icelake-X)
 Memory:      512 GB evenly split between nodes
 LLC-size:    48MB for each node (32-cores * 2-threads)
 no_turbo: 1, Microcode: 0xd0001e0, scaling-governor: performance

              x86-64-stosb (5 runs)     x86-64-movnt (5 runs)      diff
              ----------------------    ---------------------      -------
     size       BW        (   stdev)      BW        (   stdev)

      2MB     14.37 GB/s  ( +- 1.55)    12.59 GB/s  ( +- 1.20)     -12.38%
     16MB     16.93 GB/s  ( +- 2.61)    15.91 GB/s  ( +- 2.74)      -6.02%
    128MB     12.12 GB/s  ( +- 1.06)    22.33 GB/s  ( +- 1.84)     +84.24%
   1024MB     12.12 GB/s  ( +- 0.02)    23.92 GB/s  ( +- 0.14)     +97.35%
   4096MB     12.08 GB/s  ( +- 0.02)    23.98 GB/s  ( +- 0.18)     +98.50%

Signed-off-by: Ankur Arora
---
 arch/x86/include/asm/page_64.h |  1 +
 arch/x86/lib/clear_page_64.S   | 26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 4bde0dc66100..cfb95069cf9e 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -43,6 +43,7 @@ extern unsigned long __phys_addr_symbol(unsigned long);
 void clear_page_orig(void *page);
 void clear_page_rep(void *page);
 void clear_page_erms(void *page);
+void clear_page_movnt(void *page);
 
 static inline void clear_page(void *page)
 {
diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
index c4c7dd115953..578f40db0716 100644
--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -50,3 +50,29 @@ SYM_FUNC_START(clear_page_erms)
 	ret
 SYM_FUNC_END(clear_page_erms)
 EXPORT_SYMBOL_GPL(clear_page_erms)
+
+/*
+ * Zero a page.
+ * %rdi - page
+ *
+ * Caller needs to issue a sfence at the end.
+ */
+SYM_FUNC_START(clear_page_movnt)
+	xorl	%eax,%eax
+	movl	$4096,%ecx
+
+	.p2align 4
+.Lstart:
+	movnti	%rax, 0x00(%rdi)
+	movnti	%rax, 0x08(%rdi)
+	movnti	%rax, 0x10(%rdi)
+	movnti	%rax, 0x18(%rdi)
+	movnti	%rax, 0x20(%rdi)
+	movnti	%rax, 0x28(%rdi)
+	movnti	%rax, 0x30(%rdi)
+	movnti	%rax, 0x38(%rdi)
+	addq	$0x40, %rdi
+	subl	$0x40, %ecx
+	ja	.Lstart
+	ret
+SYM_FUNC_END(clear_page_movnt)

From patchwork Wed Oct 20 17:02:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ankur Arora X-Patchwork-Id: 12572779 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D5D6C433F5 for ; Wed, 20 Oct 2021 17:05:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 66C12613B5 for ; Wed, 20 Oct 2021 17:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230431AbhJTRH7 (ORCPT ); Wed, 20 Oct 2021 13:07:59 -0400 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:27506 "EHLO mx0b-00069f02.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230429AbhJTRHv (ORCPT ); Wed, 20 Oct 2021 13:07:51 -0400 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19KG8qv0000775; Wed, 20 Oct 2021 17:05:31 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=j9Qxdl5cxtbjwrLaPscVf15GbrsPhI032wzU0oiqT50=; b=Iggy+Ch/eoQO+KC7oMwZ5279le9+ZyVtriW4d2ekJh2W8PYZytpM+xTFRtzhNrverrBf zGjzBPmwB2PiZo6Q36YJ79tC0Q0MxB/kV33oKzp7imDOtYW00xCUCIii5j3uCS4tT6ie
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=j9Qxdl5cxtbjwrLaPscVf15GbrsPhI032wzU0oiqT50=; b=xmZErslYOBWR2UA16G4xJZhLA6+D09C8vFFfnVmWWY2tqNdN9ssnzw2xTm9cjDpH+JsfaQHZtgpzkNpgafEvSR6Zi8y1EjvSvJG9E4Rpxn0k5fiEhOxH0duWnehbGYsqpPHaETTrLkBvaEOTKCfpDzQVsITXKBpIZOK8jaeL9xc= Authentication-Results: vger.kernel.org; dkim=none (message not signed) header.d=none;vger.kernel.org; dmarc=none action=none header.from=oracle.com; Received: from CO6PR10MB5409.namprd10.prod.outlook.com (2603:10b6:5:357::14) by CO1PR10MB4577.namprd10.prod.outlook.com (2603:10b6:303:97::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.16; Wed, 20 Oct 2021 17:05:24 +0000 Received: from CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d]) by CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d%4]) with mapi id 15.20.4628.016; Wed, 20 Oct 2021 17:05:24 +0000 From: Ankur Arora To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org, akpm@linux-foundation.org, mike.kravetz@oracle.com, jon.grimm@amd.com, kvm@vger.kernel.org, konrad.wilk@oracle.com, boris.ostrovsky@oracle.com, Ankur Arora Subject: [PATCH v2 04/14] x86/asm: add clzero based page clearing Date: Wed, 20 Oct 2021 10:02:55 -0700 Message-Id: <20211020170305.376118-5-ankur.a.arora@oracle.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com> References: <20211020170305.376118-1-ankur.a.arora@oracle.com> X-ClientProxiedBy: MWHPR1701CA0001.namprd17.prod.outlook.com (2603:10b6:301:14::11) To 
oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: 7914022a-422d-4a94-c2d0-08d993ebce5c X-MS-Exchange-CrossTenant-AuthSource: CO6PR10MB5409.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Oct 2021 17:05:24.0470 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: ankur.a.arora@oracle.com X-MS-Exchange-Transport-CrossTenantHeadersStamped: CO1PR10MB4577 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10143 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 mlxscore=0 adultscore=0 spamscore=0 phishscore=0 bulkscore=0 suspectscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110200095 X-Proofpoint-GUID: 9_jWjnAdBB-xF-a_J6oHPhY7SbdzWyuj X-Proofpoint-ORIG-GUID: 9_jWjnAdBB-xF-a_J6oHPhY7SbdzWyuj Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org

Add clear_page_clzero(), which uses CLZERO as the underlying primitive.
CLZERO uses an implied non-temporal memory type, so this provides a
clear_page() implementation that minimizes cache pollution. Available
if X86_FEATURE_CLZERO is set.

CLZERO, from the AMD architecture guide (Vol 3, Rev 3.30):
 "Clears the cache line specified by the logical address in rAX by
  writing a zero to every byte in the line. The instruction uses an
  implied non temporal memory type, similar to a streaming store, and
  uses the write combining protocol to minimize cache pollution.

  CLZERO is weakly-ordered with respect to other instructions that
  operate on memory. Software should use an SFENCE or stronger to
  enforce memory ordering of CLZERO with respect to other store
  instructions.

  The CLZERO instruction executes at any privilege level. CLZERO
  performs all the segmentation and paging checks that a store of
  the specified cache line would perform."

The use-case is similar to clear_page_movnt(), except that
clear_page_clzero() is expected to be more performant.

Cc: jon.grimm@amd.com
Signed-off-by: Ankur Arora
---
 arch/x86/include/asm/page_64.h |  1 +
 arch/x86/lib/clear_page_64.S   | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index cfb95069cf9e..3c53f8ef8818 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -44,6 +44,7 @@ void clear_page_orig(void *page);
 void clear_page_rep(void *page);
 void clear_page_erms(void *page);
 void clear_page_movnt(void *page);
+void clear_page_clzero(void *page);
 
 static inline void clear_page(void *page)
 {
diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
index 578f40db0716..1cb29a4454e1 100644
--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -76,3 +76,22 @@ SYM_FUNC_START(clear_page_movnt)
 	ja	.Lstart
 	ret
 SYM_FUNC_END(clear_page_movnt)
+
+/*
+ * Zero a page using clzero (on AMD.)
+ * %rdi - page
+ *
+ * Caller needs to issue a sfence at the end.
+ */
+SYM_FUNC_START(clear_page_clzero)
+	movl	$4096,%ecx
+	movq	%rdi,%rax
+
+	.p2align 4
+.Liter:
+	clzero
+	addq	$0x40, %rax
+	sub	$0x40, %ecx
+	ja	.Liter
+	ret
+SYM_FUNC_END(clear_page_clzero)

From patchwork Wed Oct 20 17:02:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ankur Arora X-Patchwork-Id: 12572781 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7AEDAC433F5 for ; Wed, 20 Oct 2021 17:05:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5A1D86139E for ; Wed, 20 Oct 2021 17:05:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230401AbhJTRIE (ORCPT ); Wed, 20 Oct 2021 13:08:04 -0400 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:56586 "EHLO mx0b-00069f02.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230335AbhJTRIB (ORCPT ); Wed, 20 Oct 2021 13:08:01 -0400 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19KGAxU9000812; Wed, 20 Oct 2021 17:05:41 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=pMXeCGjgHvfb3F5HVnTVAPhex1Y3C2nV445iWS4EZX0=; b=IhN+VJKfbmRw9gZ0vplPa0eR9qwNq9hIdHC4GL2hdqBB64k7Upg47tQtUan1bSX/m0dt Oug3IQ7Pwv5e1XTFNMLVNm945ZP6Ada5IHDDNCRi5uqeACg/MTa2x+OonVrzidb19RD1 YjHrGVhl+GhZ7OBRuVURxuBJWyLffUL2UjO7g9AgpNeks+26PKbZeSvBJUriuEne8oYL ZVtBw+mU5cE+LtGwuk2pygR5jRf20MMZcWqpBt6mv+/Tg7cjh5D+rW8Ota2RPE6L5Acv wUNTPiYNFcikFKBWSLL5+l8dhB6kFX6HpEdCZZsxzCES4xtCmdQdRMywnRjCNiQzXOHK VA== Received: from
d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=pMXeCGjgHvfb3F5HVnTVAPhex1Y3C2nV445iWS4EZX0=; b=PWBoGDi3JprvOikrq9qdd4ojQVLlrm/eowHW92oDlDwXnERPbiDhvncdvLT3ISRYzOrSAaVEM/C33lrP6fmNiBV+jGhnZLaKwqr44kBj66G/vFAspKQhBSxkjfvpWulX9lE7ECQeDaBednFbRW/HWl0aoanzpQ796kFTSpndtFs= Authentication-Results: vger.kernel.org; dkim=none (message not signed) header.d=none;vger.kernel.org; dmarc=none action=none header.from=oracle.com; Received: from CO6PR10MB5409.namprd10.prod.outlook.com (2603:10b6:5:357::14) by CO1PR10MB4577.namprd10.prod.outlook.com (2603:10b6:303:97::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.16; Wed, 20 Oct 2021 17:05:27 +0000 Received: from CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d]) by CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d%4]) with mapi id 15.20.4628.016; Wed, 20 Oct 2021 17:05:27 +0000 From: Ankur Arora To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org, akpm@linux-foundation.org, mike.kravetz@oracle.com, jon.grimm@amd.com, kvm@vger.kernel.org, konrad.wilk@oracle.com, boris.ostrovsky@oracle.com, Ankur Arora Subject: [PATCH v2 05/14] x86/cpuid: add X86_FEATURE_MOVNT_SLOW Date: Wed, 20 Oct 2021 10:02:56 -0700 Message-Id: <20211020170305.376118-6-ankur.a.arora@oracle.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com> References: <20211020170305.376118-1-ankur.a.arora@oracle.com> X-ClientProxiedBy: MW4PR04CA0040.namprd04.prod.outlook.com (2603:10b6:303:6a::15) To CO6PR10MB5409.namprd10.prod.outlook.com (2603:10b6:5:357::14) MIME-Version: 1.0 Received: from localhost (148.87.23.11) by MW4PR04CA0040.namprd04.prod.outlook.com (2603:10b6:303:6a::15) with Microsoft SMTP Server (version=TLS1_2, 
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

Enabled on microarchitectures where MOVNT is slower for
bulk page clearing than the standard cached clear_page()
idiom.

Also add check_movnt_quirks() where we would set this.

Signed-off-by: Ankur Arora
---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/kernel/cpu/amd.c          |  2 ++
 arch/x86/kernel/cpu/bugs.c         | 15 +++++++++++++++
 arch/x86/kernel/cpu/cpu.h          |  2 ++
 arch/x86/kernel/cpu/intel.c        |  1 +
 5 files changed, 21 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index d0ce5cfd3ac1..69191f175c2c 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -294,6 +294,7 @@
 #define X86_FEATURE_PER_THREAD_MBA	(11*32+ 7) /* "" Per-thread Memory Bandwidth Allocation */
 #define X86_FEATURE_SGX1		(11*32+ 8) /* "" Basic SGX */
 #define X86_FEATURE_SGX2		(11*32+ 9) /* "" SGX Enclave Dynamic Memory Management (EDMM) */
+#define X86_FEATURE_MOVNT_SLOW		(11*32+10) /* MOVNT is slow. (see check_movnt_quirks()) */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
 #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI instructions */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 2131af9f2fa2..5de83c6fe526 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -915,6 +915,8 @@ static void init_amd(struct cpuinfo_x86 *c)
 	if (c->x86 >= 0x10)
 		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
 
+	check_movnt_quirks(c);
+
 	/* get apicid instead of initial apic id from cpuid */
 	c->apicid = hard_smp_processor_id();
 
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index ecfca3bbcd96..4e1558d22a5f 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -84,6 +84,21 @@ EXPORT_SYMBOL_GPL(mds_idle_clear);
  */
 DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
 
+void check_movnt_quirks(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_X86_64
+	/*
+	 * Check if MOVNT is slower than the model specific clear_page()
+	 * idiom (movq/rep-stosb/rep-stosq etc) for bulk page clearing.
+	 * (Bulk is defined here as LLC-sized or larger.)
+	 *
+	 * Condition this check on CONFIG_X86_64 so we don't have
+	 * to worry about any CONFIG_X86_32 families that don't
+	 * support SSE2/MOVNT.
+	 */
+#endif /* CONFIG_X86_64*/
+}
+
 void __init check_bugs(void)
 {
 	identify_boot_cpu();
diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
index 95521302630d..72e3715d63ea 100644
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -83,4 +83,6 @@ extern void update_srbds_msr(void);
 
 extern u64 x86_read_arch_cap_msr(void);
 
+void check_movnt_quirks(struct cpuinfo_x86 *c);
+
 #endif /* ARCH_X86_CPU_H */
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 8321c43554a1..36a2f8e88b74 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -666,6 +666,7 @@ static void init_intel(struct cpuinfo_x86 *c)
 		c->x86_cache_alignment = c->x86_clflush_size * 2;
 	if (c->x86 == 6)
 		set_cpu_cap(c, X86_FEATURE_REP_GOOD);
+	check_movnt_quirks(c);
 #else
 	/*
 	 * Names for the Pentium II/Celeron processors

From patchwork Wed Oct 20 17:02:57 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ankur Arora
X-Patchwork-Id: 12572783
Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19KG9imZ020887; Wed, 20
 Oct 2021 17:05:41 GMT
From: Ankur Arora
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org, akpm@linux-foundation.org,
 mike.kravetz@oracle.com, jon.grimm@amd.com, kvm@vger.kernel.org,
 konrad.wilk@oracle.com, boris.ostrovsky@oracle.com, Ankur Arora
Subject: [PATCH v2 06/14] sparse: add address_space __incoherent
Date: Wed, 20 Oct 2021 10:02:57 -0700
Message-Id: <20211020170305.376118-7-ankur.a.arora@oracle.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com>
References: <20211020170305.376118-1-ankur.a.arora@oracle.com>
X-MS-Exchange-CrossTenant-UserPrincipalName:
 ankur.a.arora@oracle.com
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

Some CPU architectures provide store instructions that are weakly
ordered with respect to other instructions that operate on the memory
hierarchy.
Add sparse address_space __incoherent to denote pointers used to
operate over these regions.

Signed-off-by: Ankur Arora
---
 include/linux/compiler_types.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index b6ff83a714ca..f7f68d7bc494 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -11,6 +11,7 @@
 # define __iomem	__attribute__((noderef, address_space(__iomem)))
 # define __percpu	__attribute__((noderef, address_space(__percpu)))
 # define __rcu		__attribute__((noderef, address_space(__rcu)))
+# define __incoherent	__attribute__((noderef, address_space(__incoherent)))
 static inline void __chk_user_ptr(const volatile void __user *ptr) { }
 static inline void __chk_io_ptr(const volatile void __iomem *ptr) { }
 /* context/locking */
@@ -37,6 +38,7 @@ static inline void __chk_io_ptr(const volatile void __iomem *ptr) { }
 # define __iomem
 # define __percpu
 # define __rcu
+# define __incoherent
 # define __chk_user_ptr(x)	(void)0
 # define __chk_io_ptr(x)	(void)0
 /* context/locking */

From patchwork Wed Oct 20 17:02:58 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ankur Arora
X-Patchwork-Id: 12572785
From: Ankur Arora
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org, akpm@linux-foundation.org,
 mike.kravetz@oracle.com, jon.grimm@amd.com, kvm@vger.kernel.org,
 konrad.wilk@oracle.com, boris.ostrovsky@oracle.com, Ankur Arora
Subject: [PATCH v2 07/14] x86/clear_page: add clear_page_uncached()
Date: Wed, 20 Oct 2021 10:02:58 -0700
Message-Id: <20211020170305.376118-8-ankur.a.arora@oracle.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com>
References: <20211020170305.376118-1-ankur.a.arora@oracle.com>
X-MS-Exchange-CrossTenant-UserPrincipalName:
 ankur.a.arora@oracle.com
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

Expose the low-level uncached primitives (clear_page_movnt(),
clear_page_clzero()) as alternatives via clear_page_uncached().
Also fall back to clear_page() if X86_FEATURE_MOVNT_SLOW is set
and the CPU does not have X86_FEATURE_CLZERO.

Both the uncached primitives use stores which are weakly ordered
with respect to other instructions accessing the memory hierarchy.
To ensure that callers don't mix accesses to different types of
address_spaces, annotate clear_user_page_uncached() and
clear_page_uncached() as taking __incoherent pointers as arguments.

Also add clear_page_uncached_make_coherent(), which provides the
necessary store fence to flush out the uncached regions.

Signed-off-by: Ankur Arora
---

Notes:
    This patch adds the fallback definitions of clear_user_page_uncached()
    etc in include/linux/mm.h, which is likely not the right place for them.

    I'm guessing these should be moved to include/asm-generic/page.h
    (or maybe a new include/asm-generic/page_uncached.h) and, for
    architectures that do have arch/$arch/include/asm/page.h (which
    seems like all of them), also replicated there?

    Anyway, wanted to first check if that's the way to do it, before
    doing that.
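
    For readers unfamiliar with the intended calling convention, here is a
    rough user-space sketch of the pattern described in the commit message:
    clear a run of pages with weakly ordered stores, then issue a single
    store fence before anyone reads the memory. It is not the kernel
    implementation. Every sketch_* name and SKETCH_PAGE_SIZE are hypothetical,
    the SSE2 intrinsics merely stand in for clear_page_movnt(), and the
    memset() branch only loosely mirrors the clear_page() fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>	/* _mm_stream_si128(), _mm_sfence() */
#endif

#define SKETCH_PAGE_SIZE 4096	/* assumed page size, illustration only */

/* Hypothetical helper: npages page-aligned pages, contents indeterminate. */
static void *sketch_alloc_pages(size_t npages)
{
	return aligned_alloc(SKETCH_PAGE_SIZE, npages * SKETCH_PAGE_SIZE);
}

/* Hypothetical analogue of clear_page_uncached(): weakly ordered stores. */
static void sketch_clear_page_uncached(void *page)
{
	/* Non-temporal 128-bit stores need a 16-byte aligned destination. */
	assert(((uintptr_t)page % 16) == 0);
#if defined(__SSE2__)
	__m128i zero = _mm_setzero_si128();
	__m128i *p = page;

	for (size_t i = 0; i < SKETCH_PAGE_SIZE / sizeof(*p); i++)
		_mm_stream_si128(&p[i], zero);
#else
	/* Cached fallback when no non-temporal store is available. */
	memset(page, 0, SKETCH_PAGE_SIZE);
#endif
}

/* Hypothetical analogue of clear_page_uncached_make_coherent(). */
static void sketch_make_coherent(void)
{
#if defined(__SSE2__)
	_mm_sfence();	/* order the streaming stores before later accesses */
#endif
}

/* Clear npages contiguous pages and pay for the store fence only once. */
static void sketch_clear_pages_uncached(void *base, size_t npages)
{
	unsigned char *p = base;

	for (size_t i = 0; i < npages; i++)
		sketch_clear_page_uncached(p + i * SKETCH_PAGE_SIZE);
	sketch_make_coherent();
}
```

    A caller would allocate with sketch_alloc_pages(), call
    sketch_clear_pages_uncached() on the whole extent, and only then touch
    the memory through ordinary pointers. Batching the fence per extent
    rather than per page is the reason the series keeps the clearing and
    the fence as separate primitives.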

 arch/x86/include/asm/page.h    | 10 ++++++++++
 arch/x86/include/asm/page_32.h |  9 +++++++++
 arch/x86/include/asm/page_64.h | 32 ++++++++++++++++++++++++++++++++
 include/linux/mm.h             | 14 ++++++++++++++
 4 files changed, 65 insertions(+)

diff --git a/arch/x86/include/asm/page_32.h b/arch/x86/include/asm/page_32.h
index 94dbd51df58f..163be03ac422 100644
--- a/arch/x86/include/asm/page_32.h
+++ b/arch/x86/include/asm/page_32.h
@@ -39,6 +39,15 @@ static inline void clear_page(void *page)
 	memset(page, 0, PAGE_SIZE);
 }
 
+static inline void clear_page_uncached(__incoherent void *page)
+{
+	clear_page((__force void *) page);
+}
+
+static inline void clear_page_uncached_make_coherent(void)
+{
+}
+
 static inline void copy_page(void *to, void *from)
 {
 	memcpy(to, from, PAGE_SIZE);
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 3c53f8ef8818..d7946047c70f 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -56,6 +56,38 @@ static inline void clear_page(void *page)
 			   : "cc", "memory", "rax", "rcx");
 }
 
+/*
+ * clear_page_uncached: only allowed on __incoherent memory regions.
+ */
+static inline void clear_page_uncached(__incoherent void *page)
+{
+	alternative_call_2(clear_page_movnt,
+			   clear_page, X86_FEATURE_MOVNT_SLOW,
+			   clear_page_clzero, X86_FEATURE_CLZERO,
+			   "=D" (page),
+			   "0" (page)
+			   : "cc", "memory", "rax", "rcx");
+}
+
+/*
+ * clear_page_uncached_make_coherent: executes the necessary store
+ * fence after which __incoherent regions can be safely accessed.
+ */
+static inline void clear_page_uncached_make_coherent(void)
+{
+	/*
+	 * Keep the sfence for oldinstr and clzero separate to guard against
+	 * the possibility that a cpu-model both has X86_FEATURE_MOVNT_SLOW
+	 * and X86_FEATURE_CLZERO.
+	 *
+	 * The alternatives need to be in the same order as the ones
+	 * in clear_page_uncached().
+	 */
+	alternative_2("sfence",
+		      "", X86_FEATURE_MOVNT_SLOW,
+		      "sfence", X86_FEATURE_CLZERO);
+}
+
 void copy_page(void *to, void *from);
 
 #ifdef CONFIG_X86_5LEVEL
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 73a52aba448f..b88069d1116c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3192,6 +3192,20 @@ static inline bool vma_is_special_huge(const struct vm_area_struct *vma)
 
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_HUGETLBFS */
 
+#ifndef clear_user_page_uncached
+/*
+ * clear_user_page_uncached: fallback to the standard clear_user_page().
+ */
+static inline void clear_user_page_uncached(__incoherent void *page,
+					unsigned long vaddr, struct page *pg)
+{
+	clear_user_page((__force void *)page, vaddr, pg);
+}
+
+static inline void clear_page_uncached_make_coherent(void) { }
+#endif
+
+
 #ifdef CONFIG_DEBUG_PAGEALLOC
 extern unsigned int _debug_guardpage_minorder;
 DECLARE_STATIC_KEY_FALSE(_debug_guardpage_enabled);

From patchwork Wed Oct 20 17:02:59 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ankur Arora
X-Patchwork-Id: 12572793
Received: from pps.filterd (m0246631.ppops.net
 [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19KG5mRW028962; Wed, 20 Oct 2021 17:05:55 GMT
From: Ankur Arora
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org, akpm@linux-foundation.org,
 mike.kravetz@oracle.com, jon.grimm@amd.com, kvm@vger.kernel.org,
 konrad.wilk@oracle.com, boris.ostrovsky@oracle.com, Ankur Arora
Subject: [PATCH v2 08/14] mm/clear_page: add clear_page_uncached_threshold()
Date: Wed, 20 Oct 2021 10:02:59 -0700
Message-Id: <20211020170305.376118-9-ankur.a.arora@oracle.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com>
References: <20211020170305.376118-1-ankur.a.arora@oracle.com>
X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1
X-MS-Exchange-AntiSpam-MessageData-0:
HHWHHQuZi5qsfXfeVfzDZKwNmFaGfx0Uhu5SjHvlKeLBSbyc2KVye9T3ExLKmYREO7dOibfAAf8ndOiGHAjW3KkMQiVzuXEHB2ynrW3Qqmm617PxEbcqAOhPhnqHkEcOYOywsWuNOwCl0ZA9DQQpXpCrQL20M+JlGQxMKQNL/xkDoY44vHNexc0n/XYogr1o7fT4DzY0yjRn+mTQfxKXMibMjHiaesOb4zRn6dVuycW7FVlQ0SONv5/B8x+OHkoJowm9SmQ/rhtw+WGvo6Z9Y7OepbWzEn8wQWx7gDuG0qsk9a0AoJIA8zdj6kLcBh969u6SPpGShi7WsLcXg/8+Ej+Vp/giNFE15oG8LbEF5eeZkRnXGq2/iR9D8wUnKJ982je9yLj2r+nzw6O7vt+s+NwG+px+3BbtMg8sjFKABYDVgG3eYtVpwzYuxMYucY1teC43EEnZGs2ZT/7kWPhcU3VwsRLshr/ZlLHOXjvVAXulZ37uderDAcRZEpooxVstoLUV/+AFBmJQBRj0FkChu/tvyB1rVAIj8VSYAPocsqxfJTLlcpAe457c96fV/FLjmTAUZkWig/RCwWgyotcsIf59xeR34WbI1ozbstnvxnUzRMLxzu8osW/MZmhgI9kyjM+ErBE8AHnOnOK7O6QICzOAe/ZkWOkSJ4EcDX41cuPBiuRjIff7TKqQuzl+l8PC9WAZsrVSumNnZnjineQn+5hCigqWjAOULesywFxDOrfQ6tvHuR98pBMlM+NJ38sdUNOdlQGdhM6AankBeDGaevL1Sa3UsRTdaMv9+FfKK2bcBzR9qk2iaCf0Tb0ZXoXXBKDlh+dI8fVE9GRQ+BxFDAdzep3OfMjcsC+/GRaxSaopjv+LWS/up+ahxAW3ioJCUtzKYPINyNQyDwz0JNFt2md8aZSSsJnIH7ofllOVobo72Mo1nZnzWnJptc219hgKTTE+QVUK6MFVczyP1sACmMPiYCx3aMHNq288S45NMMI+65EBYT53CoCR9PwwXEXqcULlirZCi2KorUZq2/NeSd2Bk3U08v5ntx9uMlewGI8HXUI662RMNqJrfd6Q6J1px5KIzDGmQy06LQAt6+JvP2Dk+GxuNicT9vTyQIrXBAdX6wU7NoD5OqaLLxiUtbf/mLzxqJ/c3WL5YTiXJuygjUp4nhcwbkuc6VNQ1x5dWwH82hBYgV0RmHVSwEvwCMvmiFMv6vKBhUaTv7ODT0UVMTyIFk60gPjaBg5kuYT4jO+qfYBsiE5tdgvmy1CSKcI+8OyJ0xJA83w73W7Ub/YAAkpDlU5ELfzYu0eZ5l2OfVXn8GCzvKRrYzNqv3xUAdH2f72OAw9ebQBDxAyIaM8imtxQ3SybnZnNYnSlx+qjA2lMga4XLP44xnuAwHrsSZMOsDhYIKXFOedSdcVfKlSyLLGO7MPZ2vSrBv2FGMsSo2QQK47MJpJtNWbEVxh9+PbG X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: 0ebdb186-a769-4162-d97b-08d993ebd4f8 X-MS-Exchange-CrossTenant-AuthSource: CO6PR10MB5409.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Oct 2021 17:05:35.1143 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 
ankur.a.arora@oracle.com X-MS-Exchange-Transport-CrossTenantHeadersStamped: CO1PR10MB4577 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10143 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 suspectscore=0 malwarescore=0 bulkscore=0 phishscore=0 adultscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110200096 X-Proofpoint-GUID: O8CR2FgXzeBOKpuChDg26OmnN3YfwAlC X-Proofpoint-ORIG-GUID: O8CR2FgXzeBOKpuChDg26OmnN3YfwAlC Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Introduce clear_page_uncached_threshold, which provides the threshold above which clear_page_uncached() is used. The ideal threshold value depends on the CPU architecture and where the performance curves for cached and uncached stores intersect. Typically this would depend on microarchitectural details and the LLC size. Here, we choose 8MB (CLEAR_PAGE_UNCACHED_THRESHOLD), which seems like a reasonably sized LLC. Also define clear_page_prefer_uncached(), which provides the interface for callers to query this. Signed-off-by: Ankur Arora --- include/linux/mm.h | 18 ++++++++++++++++++ mm/memory.c | 30 ++++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index b88069d1116c..49a97f817eb2 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3190,6 +3190,24 @@ static inline bool vma_is_special_huge(const struct vm_area_struct *vma) (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))); } +/* + * Default size beyond which huge page clearing uses the uncached + * path. We size it for a reasonably sized LLC. + */ +#define CLEAR_PAGE_UNCACHED_THRESHOLD (8 << 20) + +/* + * Arch specific code can define arch_clear_page_uncached_threshold() + * to override CLEAR_PAGE_UNCACHED_THRESHOLD with a machine specific value. 
+ */ +extern unsigned long __init arch_clear_page_uncached_threshold(void); + +extern bool clear_page_prefer_uncached(unsigned long extent); +#else +static inline bool clear_page_prefer_uncached(unsigned long extent) +{ + return false; +} #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_HUGETLBFS */ #ifndef clear_user_page_uncached diff --git a/mm/memory.c b/mm/memory.c index adf9b9ef8277..9f6059520985 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -5266,6 +5266,36 @@ EXPORT_SYMBOL(__might_fault); #endif #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS) + +static unsigned long __read_mostly clear_page_uncached_threshold = + CLEAR_PAGE_UNCACHED_THRESHOLD / PAGE_SIZE; + +/* Arch code can override for a machine specific value. */ +unsigned long __weak __init arch_clear_page_uncached_threshold(void) +{ + return CLEAR_PAGE_UNCACHED_THRESHOLD; +} + +static int __init setup_clear_page_uncached_threshold(void) +{ + clear_page_uncached_threshold = + arch_clear_page_uncached_threshold() / PAGE_SIZE; + return 0; +} + +/* + * cacheinfo is setup via device_initcall and we want to get set after + * that. Use the default value until then. + */ +late_initcall(setup_clear_page_uncached_threshold); + +bool clear_page_prefer_uncached(unsigned long extent) +{ + unsigned long pages = extent / PAGE_SIZE; + + return pages >= clear_page_uncached_threshold; +} + /* * Process all subpages of the specified huge page with the specified * operation. 
The target subpage will be processed last to keep its From patchwork Wed Oct 20 17:03:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ankur Arora X-Patchwork-Id: 12572787 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A41F0C433EF for ; Wed, 20 Oct 2021 17:06:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 84070613A2 for ; Wed, 20 Oct 2021 17:06:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231234AbhJTRIP (ORCPT ); Wed, 20 Oct 2021 13:08:15 -0400 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:59552 "EHLO mx0b-00069f02.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230454AbhJTRID (ORCPT ); Wed, 20 Oct 2021 13:08:03 -0400 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19KGAxUD000812; Wed, 20 Oct 2021 17:05:43 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=iS/PD7yaCgxwaupfk8G1bmsZ4Kpoq3CKh+7q4k94Cxk=; b=CZQOCIsBA9TIkwQpBWllsIL0dvifV8S2SDcN77JuwLY3ACZJ8nn3Hz328lQkS3yQnC1K QRokzA9iZOPedzUP/7I0ustSsxXLSsdrMuxIpwF+4YFu0z/J/TqiaqfWH3pOomMcjROL XMYwfN6ag1Lk1oS32fFYqonbLZ2D4El1x8Pbhx3zElXEfzdwjZ/rxOwxGpGBmGW+/zHR dN2CgqPw7cx1BJc/vzo/oBYHcKgC3eiR5PTENAjx3xpKyw2fhZm64YeaLNOXEm8qNSqq AvQdJxUoI4Jcwqc4egKoby93ipaEnpteoiSjgp5uMHcXBYUxNwk4g+QlnB8hqoAxwOr5 3A== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by mx0b-00069f02.pphosted.com with ESMTP id 3btkw4scdy-1 (version=TLSv1.2 
cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Oct 2021 17:05:43 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 19KH5ZCW005957; Wed, 20 Oct 2021 17:05:41 GMT Received: from nam10-bn7-obe.outbound.protection.outlook.com (mail-bn7nam10lp2108.outbound.protection.outlook.com [104.47.70.108]) by userp3030.oracle.com with ESMTP id 3bqkv0cqwj-5 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Oct 2021 17:05:41 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=X4QDRlvOASwf7MQEbPRIQUX/iWHcndZqTGgrX2jyQHutIdClm4ZLLQXR1HNHRjxBsmb6YcXlYsxq2z18vENj/nnK9S5fb7a6iy/6dQ9VTaoehWf8ezIx84h/qRjLDeE0RLaFSDEiF4Zaw9LMbMp7Y08qTdXjR/6G7NB+1pssE+Ex2oNLnQ7bBQF+3tCC5ehFf7TOzLGgUrqqhmwnHGdImIcb11QdKhQJMC8KQkQaHhLecA9eFVvsRk4ADMVCdLoCGTCQBY/ifzzB9eArdfcj0nrritoNfk6E9c9pEbm66C7aR+DwdzurkS+iwNAvpRSIpcfNeGD/5P1XV4DIgerT0A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=iS/PD7yaCgxwaupfk8G1bmsZ4Kpoq3CKh+7q4k94Cxk=; b=h5fMvpZxWnUurSY+BAcAGY/uE2uXvEwd7ioy/yM0nn0ItNz5nKlu1mubMWVDvWNWReJkUP0XEiqHN7228Swkl2RCeZQIRZ40vguyHBW3qRwbgRsmJLCrNo1bFV1NCpMg9Whj+UAa8tx0ziFrt+sLjGzv0BXROkqAn3rEmatOeyRqDgWuhMLwwxndHnjtZyRVZcHmWmMrvFCZhRNdg2NwiDPwSTG4nfIj7DdsRhifHcLN/6l2g1ed6chscgzV/jb91htKNsbBG3HfYSf0F7hf6zbL8TUH+FR9dseclhYaSnMOXv5HjyUJU9QhTaCbRYOgfqehpJk85VmvYc2P8Fu0Fw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=iS/PD7yaCgxwaupfk8G1bmsZ4Kpoq3CKh+7q4k94Cxk=; b=t9C52Mdw3sWVOwNlt0ikWkX4q6kUriPa8eQRu9gDenT33jKaQyy5IA1hay86WoHM/AfAIaNdwwfz2u9DsH/vn74dVlK+rnHl+V/gjzJ09TiELgxqvw9QD7EabFrzDWOV/jrpovup7hh8TH3/RfqGrAUJDEgWJzzLXPQ/WDEDBAw= Authentication-Results: vger.kernel.org; dkim=none (message not signed) header.d=none;vger.kernel.org; dmarc=none action=none header.from=oracle.com; Received: from CO6PR10MB5409.namprd10.prod.outlook.com (2603:10b6:5:357::14) by CO1PR10MB4577.namprd10.prod.outlook.com (2603:10b6:303:97::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.16; Wed, 20 Oct 2021 17:05:37 +0000 Received: from CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d]) by CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d%4]) with mapi id 15.20.4628.016; Wed, 20 Oct 2021 17:05:37 +0000 From: Ankur Arora To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org, akpm@linux-foundation.org, mike.kravetz@oracle.com, jon.grimm@amd.com, kvm@vger.kernel.org, konrad.wilk@oracle.com, boris.ostrovsky@oracle.com, Ankur Arora Subject: [PATCH v2 09/14] x86/clear_page: add arch_clear_page_uncached_threshold() Date: Wed, 20 Oct 2021 10:03:00 -0700 Message-Id: <20211020170305.376118-10-ankur.a.arora@oracle.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com> References: <20211020170305.376118-1-ankur.a.arora@oracle.com> X-ClientProxiedBy: MW4PR04CA0131.namprd04.prod.outlook.com (2603:10b6:303:84::16) To CO6PR10MB5409.namprd10.prod.outlook.com (2603:10b6:5:357::14) MIME-Version: 1.0 Received: from localhost (148.87.23.11) by MW4PR04CA0131.namprd04.prod.outlook.com (2603:10b6:303:84::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.16 via Frontend 
Transport; Wed, 20 Oct 2021 17:05:37 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 6c8a5c28-435c-4f0d-30ec-08d993ebd65c X-MS-TrafficTypeDiagnostic: CO1PR10MB4577: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:6430; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: SJGhwPiR5I5nIV6KiZArsY5xY4PeJMST5+8a/eLcETm3EHxsGdOxR0evoK3aWyCff6IMWAGZKZN6E2nPi8oH5Bv20K5ZamBqOJpgF4n+zII6/nSFUxXt5sxDPZibGyoBZFldO9H9qp3l/nDlcDB7OueWzcJzio9wKOarKYJV0O7xaMAkH9iGET/7yWtfW3FBmUrAswDXAmvRnsQC2dhsUlxQWo4wpVPz+nROjeX5zxe3CyC2rCSwECNVmHe8lVJQRriOhs+eONGjo+wJl5n0J1zfmXmvgQ4+pjFZ/MHo6gxjyput/pF0RIr81HmypNQq103KruJomyVUVfjdv4FxvQrRa27UiBeYVubz9EKS7qm07ZR+n9TD5VWzWgvedX9DXMnAgu8HyrZvKlUFC0pmWyWmD3BXqw1JsTVPdrmPuCe7rNVsBehEpjTGHqqHg4rvOSepLdcT491XHJ3Z3COeEAs89Bk/KaKYgtcnu2tr5B/EH4a29jUDvMppYBWswXbZSmttaqYZkyCWYDJC/+c7p3ZmAdT1AtPNJ8RgI8Gftsid62bQU9JUInIHOSGZgKsCVfZ6RFQvuOZDwVJJuuB2vBTgk2U5QxOLVfrmitP4Ixp1l6eByu26hAxkVvP5/oSYczNQ9ODo6ObklQ/0HYuIQyt0YrhVdxS/fGmVBiNUio1V8iB8MI0lHZ77IJyR90bv5QUqZKdFMcdIZNilmFmFKIwFYniQue/EmJcbI3uC9qoOEz3HZn/V2AtBXKVQuQamdsd15u/nT9WynIgaKa8ocqHRYvWvNlikCBI9LSbxvFKb1EjqbyL3fZNbqm/cWNj+Bpp1dM8QX3DZys4A+FQSvIyqiz1EH/2RwDjW7qNkHDhaZjYcYhmt4LvbJkSdKtfC5Vp1KQLbYuySHIUaOflvVEbhxGgTqv7OiDMxmaJzOg+y/f5VvUakYUvjHxv2kpavpondwn7hq+zlxiW0WTDnB5vqSRKmAhQ/Iy9yRyWR2oBxsHJIEjhDF5VoNd8iJxHouNE30nmzt93bRzdGi8Chtw== X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:5;SRV:;IPV:NLI;SFV:SPM;H:CO6PR10MB5409.namprd10.prod.outlook.com;PTR:;CAT:OSPM;SFS:(366004)(66946007)(66476007)(956004)(66556008)(8936002)(36756003)(6486002)(5660300002)(2616005)(4326008)(2906002)(107886003)(186003)(1076003)(26005)(6666004)(38350700002)(8676002)(38100700002)(508600001)(6496006)(316002)(52116002)(86362001)(103116003)(23200700001);DIR:OUT;SFP:1501; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
s2abTlKSHl2p21yEZIMfcwh8VaqQqsDzvD8/XrWinEa1U+dG41W3l1+K3QDHrzQ7tjp+B/I+ITa3HSmRrlbnOiNzBUbOYdwGtQBAfQqfWtPJ7QW1nvCgwRAz53Y0Wst8NzKnMTkN+xzxMZAZFPNTPGnQn76qg6Qg+yfaJPZrR0oQdtQWD1LyjXmgLIeZf/K0xil9sE8/IA2jpBAVkxtiUtLBgxUwd4nOfeBOVHq9hHKnoUDYlZNmSNyBHkEzBD3Q3a1MTg7VC19GajveIRFAi68DdVfo1tmpMbO8fzOl9WQaSi+fknRD9BjSvt9IF08x2mFm7COOshsRlFZ4WAgErNGJRDXXPJiAwjBYl6BeJyC1CD1CX1N19nsLMzL6JBf5DQgcOYlVbsyeaBL1pIb5wsQR275DKZwE1M7ae+4Y9SMFGqWK+a7vDszlNhQI4zvO6y+WVcEC9MQzezaiwo7wSnIut97Sgx3qDZPQcxAH+Dw8LszP0HCMoz0+3w+S8KPGhFe/d8dxYQ1J7jfVRTKUU2BE+DgsLSIWRUfED96EBel3WDDgLYa5VbSCp5TrkDB96GovmjfWloCCBIV4hFiP+es2fLIgJ9ANEjNmqc7HDohQKDCKLk6Yl9eet+Yu2RNSGuLv9bK3oNrLSQDpCgDyBeMxVaFttkJlVGSdmkopCg0WLexIdhzwAkEoxUi6CU+0ZCrGd/F609+GfYfENiNM65ITj7cXCdrOyf6Ghai9tkaEfaCvYWHWC9CpY+6iliEc6WYQPNLTaJJN2b+alfaGhHakNrnTz7VHaNN/+dyOQXLMm3n0jUWoy32/TP8G/Aa6FV4cex4CLKPC27aewz9fWWsx3qj5SsowOsFlsVzEUwkY2szie/tsz3Ma46iSNy2FGmmAUdMCH1TMSB0MpbxktQ7w1LHe+0C0vLvqf59uZSJhxZVEcTm4VdFXhpdj/9HGSGl+4BdjtzfKR0Eo19UCzuDx4GtFT258X87su6YdDPxojffR/hrq3n3kJ/ntTyRm077GBqLPig51g1F+suFAVMgSpy1LlOIPwH4dk8JRmfetG+uIE3ieI6hX1NRDq74NB4HY55bNIR9Vz/r13hlbMr8J11Hq3Z8wMHnsNpUTl51U7/AgVNDda5erBj23ZyDmkyfeNjr+vosgBOP7KqvMKZD+8Lc4c2kio77dNZCHPj68uxUuvOaSySrFcPVUPz85UlelhbzhU9UyVArWE38jgGCkyRdRkHzapHKXdcKMzSauhuU3a4z7h9etLmoIiLMcntifx7Tz2B3adp/t9UGK/6Yf5WDlmkSNBSRqUMW9cASg7TqiFZqGOji2s9i0SBWSr8ZkX5roF6AKfy79AHW6eLlpOO2bvNvlTjaPcsPVDUSRUSjHVq+/2h6V0pFWFP+R1s6idqVE9I6ZDZYhFnbTApM8OYmDz0hrLD3YkuHtphx3qQKZSnncwD68XSNzDfkF X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: 6c8a5c28-435c-4f0d-30ec-08d993ebd65c X-MS-Exchange-CrossTenant-AuthSource: CO6PR10MB5409.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Oct 2021 17:05:37.4740 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 
ankur.a.arora@oracle.com X-MS-Exchange-Transport-CrossTenantHeadersStamped: CO1PR10MB4577 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10143 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 suspectscore=0 malwarescore=0 bulkscore=0 phishscore=0 adultscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110200096 X-Proofpoint-GUID: VYTbUcV5AxlHsVCwRmxL4nccxi4c5GFD X-Proofpoint-ORIG-GUID: VYTbUcV5AxlHsVCwRmxL4nccxi4c5GFD Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add arch_clear_page_uncached_threshold() for a machine specific value above which clear_page_uncached() would be used. The ideal threshold value depends on the CPU model and where the performance curves for cached and uncached stores intersect. A safe value is LLC-size, so we use that of the boot_cpu. Signed-off-by: Ankur Arora --- arch/x86/include/asm/cacheinfo.h | 1 + arch/x86/kernel/cpu/cacheinfo.c | 13 +++++++++++++ arch/x86/kernel/setup.c | 6 ++++++ 3 files changed, 20 insertions(+) diff --git a/arch/x86/include/asm/cacheinfo.h b/arch/x86/include/asm/cacheinfo.h index 86b2e0dcc4bf..5c6045699e94 100644 --- a/arch/x86/include/asm/cacheinfo.h +++ b/arch/x86/include/asm/cacheinfo.h @@ -4,5 +4,6 @@ void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu); void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu); +int cacheinfo_lookup_max_size(int cpu); #endif /* _ASM_X86_CACHEINFO_H */ diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c index b5e36bd0425b..6c34fc22d9ae 100644 --- a/arch/x86/kernel/cpu/cacheinfo.c +++ b/arch/x86/kernel/cpu/cacheinfo.c @@ -1033,3 +1033,16 @@ int populate_cache_leaves(unsigned int cpu) return 0; } + +int cacheinfo_lookup_max_size(int cpu) +{ + struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu); + struct cacheinfo *this_leaf = this_cpu_ci->info_list; + struct cacheinfo *max_leaf; 
+ + /* + * Assume that cache sizes always increase with level. + */ + max_leaf = this_leaf + this_cpu_ci->num_leaves - 1; + return max_leaf->size; +} diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 40ed44ead063..1b3e2c40f832 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -49,6 +49,7 @@ #include #include #include +#include #include /* @@ -1250,3 +1251,8 @@ static int __init register_kernel_offset_dumper(void) return 0; } __initcall(register_kernel_offset_dumper); + +unsigned long __init arch_clear_page_uncached_threshold(void) +{ + return cacheinfo_lookup_max_size(0); +} From patchwork Wed Oct 20 17:03:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ankur Arora X-Patchwork-Id: 12572789 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16CD0C433FE for ; Wed, 20 Oct 2021 17:06:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id ECC5A613A2 for ; Wed, 20 Oct 2021 17:06:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230510AbhJTRIT (ORCPT ); Wed, 20 Oct 2021 13:08:19 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]:36700 "EHLO mx0a-00069f02.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230433AbhJTRIF (ORCPT ); Wed, 20 Oct 2021 13:08:05 -0400 Received: from pps.filterd (m0246629.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19KGGQE4029728; Wed, 20 Oct 2021 17:05:44 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; 
bh=opO+uA39Vb8tJJkf17Zn2DODkGVpF0CQbrukoeLiPwQ=; b=Giff2CnZ7DxRTh6eDUNho7xF1U3US9iSh1ZChNk8q6d3t9XZwRPOayyT8KyIQL+EEuKZ +U3RalRNYd1DuBXfglVvv2AOGb05DqXD8FSO9AjAdAH58s9menChiUdNiwoamNfbuwN2 tiNVAtTTyYpBl+n11WaV3lpieeizLJPs9LbqyfW2tIy470P/zUi10f2STCqMMxsMIj8w REtatpeElZcffYMZK27BX6FPrv1L+zSHwHQHMnQWy+HUQ7AzfpVmQ+PUswUd7QNWOILH aZOQypVLc9yUwJ1R6xAXyrCaNw8mvLvCu2iIVtluZvj+f5XJvQ3fVxEwizQdSijZU4hv jA== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by mx0b-00069f02.pphosted.com with ESMTP id 3btkwj1b8b-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Oct 2021 17:05:44 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 19KH5ZCX005957; Wed, 20 Oct 2021 17:05:42 GMT Received: from nam10-bn7-obe.outbound.protection.outlook.com (mail-bn7nam10lp2108.outbound.protection.outlook.com [104.47.70.108]) by userp3030.oracle.com with ESMTP id 3bqkv0cqwj-6 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Oct 2021 17:05:42 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=FK4aSQlwmDVU7eQFui3sbmOkOol2hZKXA5IeYh1YBniWoy0z7b6ZxMyE9eFYERkaFDCqm0v5xxf5p5EmDSyi6+mkpaOG6+3suI98wGAxHFsCqmBV1HW3QSwjl6B76k8Z8NgfFeAb8rfymg96kUQqtYwL3swUOSEKCsZFf+vt+IV0piMqNa6KGJQGgADoh45byossRiiYZj+wMrQwhKZRdSkFrggvAqdZQfqBJQi9QE1vKXRQVaUJBi9Uyb8fPBx5/1nnc+LD076TyGSUFT6EYcxLpQBJDA3n/2Bzrwvh0Fn6Fv0CthxE5I+JYvcaHEpTsftXqB+sUJ1APMWiGvVRlg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=opO+uA39Vb8tJJkf17Zn2DODkGVpF0CQbrukoeLiPwQ=; 
b=RL3dnTc5BLJlz2mT/BalsQvhTgVlT7tjGA26Nuc9jHM3pcWnj1xOiwM5Ism5qCm6OUynBJFGIpjZamJBEo+3X6egpFLLd5jY3F49mmnZE5JlMppanLkPVFumoS92iiefPlugoe0NkLGuDj0RD4mtKBpiro0vW2GFsu/uAtBVIofzzOE+y/31a0m3f0o7tQYN1grDFfdqyEAfoHj+JddNC0koKP5ObbcaaOSHR+M8mz8VKwt0btBtZ4lt5T0FtEhISqcckG4Y/01lO2JHNOeOq/tfFoojPb5I9HlXhFZhsKCK5lUHGzjOTAY0kaGYgoSAJnT4tVpruAbc26/zZlOS/A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=opO+uA39Vb8tJJkf17Zn2DODkGVpF0CQbrukoeLiPwQ=; b=e9KagKjB2ri5MwgBaEfIZgBpk7jCcljJ7tw6OKef+X2xtONX/CL+I1LmMfrDVZRKU+ImMRUlOKFrSfHDWPadurDgTrWHe9n0BLI8LSvwoYZuYVnmDGxA98ssXfeti7yiwOe0VPrQIFhwteZRCfJ6JrC6i7lNyhiaB0TiQr8LgS4= Authentication-Results: vger.kernel.org; dkim=none (message not signed) header.d=none;vger.kernel.org; dmarc=none action=none header.from=oracle.com; Received: from CO6PR10MB5409.namprd10.prod.outlook.com (2603:10b6:5:357::14) by CO1PR10MB4577.namprd10.prod.outlook.com (2603:10b6:303:97::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.16; Wed, 20 Oct 2021 17:05:39 +0000 Received: from CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d]) by CO6PR10MB5409.namprd10.prod.outlook.com ([fe80::3197:6d1:6a9a:cc3d%4]) with mapi id 15.20.4628.016; Wed, 20 Oct 2021 17:05:39 +0000 From: Ankur Arora To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org, akpm@linux-foundation.org, mike.kravetz@oracle.com, jon.grimm@amd.com, kvm@vger.kernel.org, konrad.wilk@oracle.com, boris.ostrovsky@oracle.com, Ankur Arora Subject: [PATCH v2 10/14] clear_huge_page: use uncached path Date: Wed, 20 Oct 2021 10:03:01 -0700 
Message-Id: <20211020170305.376118-11-ankur.a.arora@oracle.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com> References: <20211020170305.376118-1-ankur.a.arora@oracle.com> X-ClientProxiedBy: MWHPR14CA0023.namprd14.prod.outlook.com (2603:10b6:300:ae::33) To CO6PR10MB5409.namprd10.prod.outlook.com (2603:10b6:5:357::14) MIME-Version: 1.0 Received: from localhost (148.87.23.11) by MWHPR14CA0023.namprd14.prod.outlook.com (2603:10b6:300:ae::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.16 via Frontend Transport; Wed, 20 Oct 2021 17:05:39 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: f40fdad1-b87c-4f9e-87e8-08d993ebd7b3 X-MS-TrafficTypeDiagnostic: CO1PR10MB4577: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:3968; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: h1kalXAwzUZ/O18hhGHolsl79au6QfnnYdMQwd4i2gLKzQ+KL00Wsvf6wQfjirigK2VdIiS+0Vi0IXq3DOYti9On4zWriuYhHazzB68w+nYZv3jgy6ppqImjJlu/KGQsVfJZ4h4BKyiuN4uJqEv17fampn4SPROC3mOtU/z365lIKgtXaCFWXmPVKsgWtNsClka6LTA9W8lI6tr2zRGSbbwUo2WBoUP7x7lllXQtNQJobNkkhTtvCbyZZ44qOQzKSMDpu9Ho5X3yLugcId449y59ebIhVoiDkOC1BhbfMzny++hF8Neo8BMkrwUb+3Oee4tl2YRWEhh1wXbOH2b2ejyG0GCN5WERwHvr+4sCpiLtomsePVmSzoennbKkxBC+7GkorUmN6tXiK61+KDQ9N++hSaRzGGqRMlUM/XoJkZH7JZlpQD+ho0X6PanawZJ1780R867hEXdVAywfUMY9lr7y2iFquiFlFk1BLmZK7SiMrQtE1bv6pet5+Q/Un2qz6KjptuduBNTa2ZSLbdlDfoo3bufSgkbeydx6M5s508yipG9wO7u1fhif+FhyTPheO9CjgMDXvL96fgGEgjykQsm8bzA8vBbaZuViAXaTBx1IACO+QJwRkDjJiLIHEecAJmoz3464Oit3z0NqBmOymVx7sBPMawoo2nG/YkZV1UqsSMnkhevF/2E3MLN88pcr/zNjlQW6DSuAyT2DVQS6IVoO5hzr7cRkOJhobQjGHCTE/KGO8UOpavRpHk8qKiPcachV8XwDj6Ql2uUH0vDh0uH8yJpWo0J76v39IzBu1GVS95TQnznXiOZgY9JDb9e2vMoIAUKqfw8iu69ov/oI0cvltuNuydoM9bF7xQ+PJUNm4aW4/KPZeTB1/xYg+ed2HpQNt5st4ksPt/5hvgY4w5nxjzFsRXIrigTdE4K3m+vYnJnfNKCkkgJzW+Gpw8/SoOTPTMMZvqfaTO/CeLKpOF9JuiBiQzLbHshmFc1
8CIu6KM48iTG7jn68VR9v2sDX5eJlRrR1Tz1COil4uq9NMimYLG3IeF4PgmFJfTqyKmE= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:5;SRV:;IPV:NLI;SFV:SPM;H:CO6PR10MB5409.namprd10.prod.outlook.com;PTR:;CAT:OSPM;SFS:(366004)(66946007)(66476007)(956004)(66556008)(8936002)(36756003)(6486002)(5660300002)(83380400001)(2616005)(4326008)(2906002)(107886003)(186003)(1076003)(26005)(6666004)(38350700002)(8676002)(38100700002)(508600001)(6496006)(316002)(52116002)(86362001)(103116003)(23200700001);DIR:OUT;SFP:1501; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: M+uCvEZ1QENtg1dwJ+HDk4Ac6P45BXzjIhPYlcmYtB3geIvf1Z5npnx1mG4O/GfxTqJkQfll9DZVNXrCjo1MDkoU05/uw9JMAi6JlFrxvY0DfJ/P1W+dqzBNAhtY1uk9EvnA6Kt3kHXl+I3GWtNv5f7f0DFrijnLaxJMatL+I1l8BQZyH4nxD165gxoN8t58gigxmVQIxQsr3xg2xjdmQXHW0YgfykbkSr4moE+VB03nV4U2rJ6jVnT8QXgfhtsBi8SGRHBGEcEn6coe0uyMOxvs6Q8EFW31r8yQ3ecqVhmSFVtCWHOjSO1O0LCj1GdDkzrH4b2gYjEV3isinstPDguu00f8hSxOIjL/0/2DGMdGMXniWXNkaAEHFreQz6UFnHrw55wEGTPoDCDwMFS4rz4Ix9N+mPayAtGHnWllHFsmzvla7i/eI431QRZqJsazngq/y7v+YkACIkfUiuOszHkVBMnXwfgZxA9hjbE0Qjeoe8y0t88RLHJuUQhvy93TmmU89DVHAnH1/cwRiqnlcFSyVwhlPtmolHX/q8cK2qVSzns6RB8uoZIJHyIbMQCZ1ZhtcM3XKCV75gooE/yAKQ8S8ihXkODE4GFl+nrlWcRd2YqObVBxMdp5PmYKNQe2zqYp0mY0eCMtvFzHq40280h3BocCOLZjxyCLPwYusVY9L5a+guDZ7YMKfgmgSuIwcozLvwUvBOYdqYMogaS9ATr8WTk5DktC8DLBjP9XvsACvZCxYdxOeu7lRIiEG4aqD5FE3J3hfSe0DnaEN1HQTctP2O4roXtAZbGhzMQPcSpO3f/RisYKVwODJA2s7qZLHI+r9mgxsPerIQ13fY1y/RAirNBtvrG9rouwNlySzjNjTUqH7Gfh6tJdQbWY+qF9O7GhUHvOrmyQDpAD39BXvJSH/5jrCgBxAYA0LGFElxptyxbWXVZkLp83MQCNaYYdWq4rVM4rJL5NULmm4TRS5wYIHaOgXP24TQme0S4J1XfwDyHZ2izNDG7/Ujaf5SHP1Jp3n1HilWnJAsOR4ep/ChWzZ5HPiwjFneg9HOQ1odGiHJcSew9jB9EKsfARj8yGQrxzPBv2Nf+EkwT08GAAgbsgZLpL8SviKHEYbVy4pVQHjRzDZ9xddJlBat9GrHLLCkJU2j0d2ozP0JKg6ftquSFvLe24cDZFFq8dgkuocpgd3yGAp7O1SenhUZDsSm1VwrarzJyuskLM4dd541BDMoCofp3/PAAms1Vgu6zkmIRqinL0BbCHf4ifut3eOqOWs+4oRoK9r4nJJ/t3lgCF//lPHI8ukADh3gbbwzyGpAmF10vkIHUyn6EtoLEHAS2RpzSnq9IYEtFoXI+bRicMDaKtn5vgl4iEu1fzgvNl00qljmSZPkGZj
PBeh+tt8rohwu5swF16qAOcYiEZbLVuW3paitzLH/DzHD6NXvVRTwg0uk9CWtk123P2HNLGfFke X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: f40fdad1-b87c-4f9e-87e8-08d993ebd7b3 X-MS-Exchange-CrossTenant-AuthSource: CO6PR10MB5409.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Oct 2021 17:05:39.7420 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: ankur.a.arora@oracle.com X-MS-Exchange-Transport-CrossTenantHeadersStamped: CO1PR10MB4577 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10143 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 suspectscore=0 malwarescore=0 bulkscore=0 phishscore=0 adultscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110200096 X-Proofpoint-ORIG-GUID: tPkQ_iojXRlk1gthQRSLg3K3zJkMF_vz X-Proofpoint-GUID: tPkQ_iojXRlk1gthQRSLg3K3zJkMF_vz Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Uncached stores are suitable for circumstances where the region written to is not expected to be read again soon, or the region written to is large enough that there's no expectation that we will find the data in the cache. Add a new helper clear_subpage_uncached(), which handles the uncached clearing path for huge and gigantic pages. This path is always taken for gigantic pages. For huge pages it is taken only if the extent (pages_per_huge_page * PAGE_SIZE) reaches the architectural threshold, or if the caller gives an explicit hint (say, for a bulk transfer). 
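As a rough userspace sketch of the selection logic described above (this is not kernel code: the sketch_* names are hypothetical, the 4K page size and the 1024-page gigantic cutoff are assumptions standing in for PAGE_SIZE and MAX_ORDER_NR_PAGES, and the 8MB threshold is simply the generic default from this series):

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative constants only; these are not the kernel's definitions. */
#define SKETCH_PAGE_SIZE       4096UL        /* assumed 4K base pages */
#define SKETCH_GIGANTIC_PAGES  1024UL        /* assumed stand-in for MAX_ORDER_NR_PAGES */
#define SKETCH_THRESHOLD_BYTES (8UL << 20)   /* 8MB default threshold */

enum sketch_clear_path { SKETCH_CACHED, SKETCH_UNCACHED };

/* Modeled on clear_page_prefer_uncached(): compare in units of pages. */
static bool sketch_prefer_uncached(unsigned long extent_bytes)
{
	unsigned long pages = extent_bytes / SKETCH_PAGE_SIZE;
	unsigned long threshold_pages = SKETCH_THRESHOLD_BYTES / SKETCH_PAGE_SIZE;

	return pages >= threshold_pages;
}

/* Modeled on the branch structure of clear_huge_page() in this patch. */
static enum sketch_clear_path
sketch_choose_path(unsigned long pages_per_huge_page, bool hint_uncached)
{
	/* Gigantic pages always take the uncached path. */
	if (pages_per_huge_page > SKETCH_GIGANTIC_PAGES)
		return SKETCH_UNCACHED;

	/* Huge pages: explicit hint, or extent at/above the threshold. */
	if (hint_uncached ||
	    sketch_prefer_uncached(pages_per_huge_page * SKETCH_PAGE_SIZE))
		return SKETCH_UNCACHED;

	return SKETCH_CACHED;
}
```

Under these assumptions a 2MB huge page (512 pages) stays on the cached path unless the caller passes the hint, while a 1GB page (262144 pages) always goes uncached.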
Signed-off-by: Ankur Arora
---
 include/linux/mm.h |  3 ++-
 mm/huge_memory.c   |  3 ++-
 mm/hugetlb.c       |  3 ++-
 mm/memory.c        | 46 ++++++++++++++++++++++++++++++++++++++++++----
 4 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 49a97f817eb2..3e8ddec2aba2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3164,7 +3164,8 @@ enum mf_action_page_type {
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
 extern void clear_huge_page(struct page *page,
 			    unsigned long addr_hint,
-			    unsigned int pages_per_huge_page);
+			    unsigned int pages_per_huge_page,
+			    bool hint_uncached);
 extern void copy_user_huge_page(struct page *dst, struct page *src,
 				unsigned long addr_hint,
 				struct vm_area_struct *vma,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 5e9ef0fc261e..ffd4b07285ba 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -600,6 +600,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 	pgtable_t pgtable;
 	unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
 	vm_fault_t ret = 0;
+	bool uncached = false;
 
 	VM_BUG_ON_PAGE(!PageCompound(page), page);
 
@@ -617,7 +618,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 		goto release;
 	}
 
-	clear_huge_page(page, vmf->address, HPAGE_PMD_NR);
+	clear_huge_page(page, vmf->address, HPAGE_PMD_NR, uncached);
 	/*
 	 * The memory barrier inside __SetPageUptodate makes sure that
 	 * clear_huge_page writes become visible before the set_pmd_at()
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 95dc7b83381f..a920b1133cdb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4874,6 +4874,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	spinlock_t *ptl;
 	unsigned long haddr = address & huge_page_mask(h);
 	bool new_page, new_pagecache_page = false;
+	bool uncached = false;
 
 	/*
 	 * Currently, we are forced to kill the process in the event the
@@ -4928,7 +4929,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			spin_unlock(ptl);
 			goto out;
 		}
-		clear_huge_page(page, address, pages_per_huge_page(h));
+		clear_huge_page(page, address, pages_per_huge_page(h), uncached);
 		__SetPageUptodate(page);
 		new_page = true;
 
diff --git a/mm/memory.c b/mm/memory.c
index 9f6059520985..ef365948f595 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5347,6 +5347,27 @@ static inline void process_huge_page(
 	}
 }
 
+/*
+ * clear_subpage_uncached: clear the page in an uncached fashion.
+ */
+static void clear_subpage_uncached(unsigned long addr, int idx, void *arg)
+{
+	__incoherent void *kaddr;
+	struct page *page = arg;
+
+	page = page + idx;
+
+	/*
+	 * Do the kmap explicitly here since clear_user_page_uncached()
+	 * only handles __incoherent addresses.
+	 *
+	 * Caller is responsible for making the region coherent again.
+	 */
+	kaddr = (__incoherent void *)kmap_atomic(page);
+	clear_user_page_uncached(kaddr, addr + idx * PAGE_SIZE, page);
+	kunmap_atomic((__force void *)kaddr);
+}
+
 static void clear_gigantic_page(struct page *page,
 				unsigned long addr,
 				unsigned int pages_per_huge_page)
@@ -5358,7 +5379,8 @@ static void clear_gigantic_page(struct page *page,
 	for (i = 0; i < pages_per_huge_page;
 	     i++, p = mem_map_next(p, page, i)) {
 		cond_resched();
-		clear_user_highpage(p, addr + i * PAGE_SIZE);
+
+		clear_subpage_uncached(addr + i * PAGE_SIZE, 0, p);
 	}
 }
 
@@ -5369,18 +5391,34 @@ static void clear_subpage(unsigned long addr, int idx, void *arg)
 	clear_user_highpage(page + idx, addr);
 }
 
-void clear_huge_page(struct page *page,
-		     unsigned long addr_hint, unsigned int pages_per_huge_page)
+void clear_huge_page(struct page *page, unsigned long addr_hint,
+			unsigned int pages_per_huge_page, bool uncached)
 {
 	unsigned long addr = addr_hint &
 		~(((unsigned long)pages_per_huge_page << PAGE_SHIFT) - 1);
 
 	if (unlikely(pages_per_huge_page > MAX_ORDER_NR_PAGES)) {
 		clear_gigantic_page(page, addr, pages_per_huge_page);
+
+		/* Gigantic page clearing always uses __incoherent. */
+		clear_page_uncached_make_coherent();
 		return;
 	}
 
-	process_huge_page(addr_hint, pages_per_huge_page, clear_subpage, page);
+	/*
+	 * The uncached path is typically slower for small extents so take it
+	 * only if the user provides an explicit hint or if the extent is large
+	 * enough that there are no cache expectations.
+	 */
+	if (uncached ||
+		clear_page_prefer_uncached(pages_per_huge_page * PAGE_SIZE)) {
+		process_huge_page(addr_hint, pages_per_huge_page,
+					clear_subpage_uncached, page);
+
+		clear_page_uncached_make_coherent();
+	} else {
+		process_huge_page(addr_hint, pages_per_huge_page, clear_subpage, page);
+	}
 }
 
 static void copy_user_gigantic_page(struct page *dst, struct page *src,

From patchwork Wed Oct 20 17:03:02 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ankur Arora
X-Patchwork-Id: 12572791
From: Ankur Arora
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org,
	akpm@linux-foundation.org, mike.kravetz@oracle.com, jon.grimm@amd.com,
	kvm@vger.kernel.org, konrad.wilk@oracle.com,
	boris.ostrovsky@oracle.com, Ankur Arora
Subject: [PATCH v2 11/14] gup: add FOLL_HINT_BULK, FAULT_FLAG_UNCACHED
Date: Wed, 20 Oct 2021 10:03:02 -0700
Message-Id: <20211020170305.376118-12-ankur.a.arora@oracle.com>
In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com>
References: <20211020170305.376118-1-ankur.a.arora@oracle.com>

Add FOLL_HINT_BULK, which users of get_user_pages() and
pin_user_pages() can use to signal that this call is one of many,
allowing get_user_pages() to optimize accordingly.

Also add FAULT_FLAG_UNCACHED, which translates to the same idea for
the fault path.

As the second flag suggests, the intent for both is to hint the use of
an uncached path if one is available.

Signed-off-by: Ankur Arora
---
 include/linux/mm.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3e8ddec2aba2..cf1c711fe5ba 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -446,6 +446,7 @@ extern pgprot_t protection_map[16];
  * @FAULT_FLAG_REMOTE: The fault is not for current task/mm.
  * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch.
  * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal signals.
+ * @FAULT_FLAG_UNCACHED: Fault handling to choose the uncached path.
  *
  * About @FAULT_FLAG_ALLOW_RETRY and @FAULT_FLAG_TRIED: we can specify
  * whether we would allow page faults to retry by specifying these two
@@ -477,6 +478,7 @@ enum fault_flag {
 	FAULT_FLAG_REMOTE = 1 << 7,
 	FAULT_FLAG_INSTRUCTION = 1 << 8,
 	FAULT_FLAG_INTERRUPTIBLE = 1 << 9,
+	FAULT_FLAG_UNCACHED = 1 << 10,
 };
 
 /*
@@ -2864,6 +2866,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 #define FOLL_SPLIT_PMD	0x20000	/* split huge pmd before returning */
 #define FOLL_PIN	0x40000	/* pages must be released via unpin_user_page */
 #define FOLL_FAST_ONLY	0x80000	/* gup_fast: prevent fall-back to slow gup */
+#define FOLL_HINT_BULK	0x100000 /* part of a larger extent being gup'd */
 
 /*
  * FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each

From patchwork Wed Oct 20 17:03:03 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ankur Arora
X-Patchwork-Id: 12572795
From: Ankur Arora
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org,
	akpm@linux-foundation.org, mike.kravetz@oracle.com, jon.grimm@amd.com,
	kvm@vger.kernel.org, konrad.wilk@oracle.com,
	boris.ostrovsky@oracle.com, Ankur Arora
Subject: [PATCH v2 12/14] gup: use uncached path when clearing large regions
Date: Wed, 20 Oct 2021 10:03:03 -0700
Message-Id: <20211020170305.376118-13-ankur.a.arora@oracle.com>
In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com>
References: <20211020170305.376118-1-ankur.a.arora@oracle.com>

When clearing a large region, or when the user explicitly specifies
via FOLL_HINT_BULK that a call to get_user_pages() is part of a larger
region, take the uncached path.

One notable limitation is that this is only done when the underlying
pages are huge or gigantic, even if a large region composed of PAGE_SIZE
pages is being cleared. This is because uncached stores are generally
weakly ordered and need some kind of store fence -- which would need
to be done at PTE write granularity to avoid data leakage. This would be
expensive enough that it would negate any performance advantage.

Performance
====

System:    Oracle E4-2C (2 nodes * 64 cores * 2 threads) (Milan)
Processor: AMD EPYC 7J13 64-Core
Memory:    2048 GB evenly split between nodes
LLC-size:  32MB for each CCX (8-core * 2-threads)
boost: 0, Microcode: 0xa001137, scaling-governor: performance

System:    Oracle X9-2 (2 nodes * 32 cores * 2 threads) (Icelake)
Processor: Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz
Memory:    512 GB evenly split between nodes
LLC-size:  48MB for each node (32-cores * 2-threads)
no_turbo: 1, Microcode: 0xd0001e0, scaling-governor: performance

Workload: qemu-VM-create
==

Create a large VM, backed by preallocated 2MB pages.
(This test needs a minor change in qemu so it mmap's with
MAP_POPULATE instead of demand faulting each page.)

 Milan, sz=1550 GB, runs=3      BW           stdev     diff
                                ----------   ------    --------
 baseline (clear_page_erms)      8.05 GBps     0.08
 CLZERO   (clear_page_clzero)   29.94 GBps     0.31    +271.92%

 (VM creation time decreases from 192.6s to 51.7s.)

 Icelake, sz=200 GB, runs=3     BW           stdev     diff
                                ----------   ------    ---------
 baseline (clear_page_erms)      8.25 GBps     0.05
 MOVNT    (clear_page_movnt)    21.55 GBps     0.31    +161.21%

 (VM creation time decreases from 25.2s to 9.3s.)

As the diff shows, for both these micro-architectures there's a
significant speedup with the CLZERO and MOVNT based interfaces.

Workload: Kernel build with background clear_huge_page()
==

Probe the cache-pollution aspect of this commit with a kernel
build (make -j 15 bzImage) alongside a background clear_huge_page()
load which does mmap(length=64GB, flags=MAP_POPULATE|MAP_HUGE_2MB)
in a loop.

The expectation -- assuming the kernel build performance is partly
cache limited -- is that the background load of clear_page_erms()
should show a greater slowdown than clear_page_movnt() or
clear_page_clzero().
The build itself does not use THP or similar, so any performance changes
are due to the background load.
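
The background load described above could be approximated in userspace
roughly as follows. This is a minimal sketch, not the test program used
for these measurements: the function names populate_once() and
background_load(), the explicit MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB
flags, and the fallback definition of MAP_HUGE_2MB are assumptions added
for illustration. It is Linux-only, and the huge page variant needs 2MB
pages to be reserved beforehand.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Fallback for older libc headers: log2(2MB) = 21, encoded at bit 26. */
#ifndef MAP_HUGE_2MB
#define MAP_HUGE_2MB (21 << 26)
#endif

/*
 * Map and pre-fault `len` bytes of anonymous memory, then unmap it.
 * MAP_POPULATE pre-faults the mapping, so the pages are allocated and
 * zeroed inside mmap() rather than on first touch.
 * Returns 0 on success, -1 if the mapping could not be created.
 */
int populate_once(size_t len, int extra_flags)
{
	int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE | extra_flags;
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);

	if (p == MAP_FAILED)
		return -1;
	return munmap(p, len);
}

/* Hypothetical driver: repeat the 2MB huge page populate `iters` times. */
int background_load(size_t len, int iters)
{
	for (int i = 0; i < iters; i++)
		if (populate_once(len, MAP_HUGETLB | MAP_HUGE_2MB) != 0)
			return -1;
	return 0;
}
```

A call such as background_load((size_t)64 << 30, n) would correspond to
the 64GB length quoted above; the iteration count n is left open here
because the message does not state how long the loop ran.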

 # Milan, compile.sh internally tasksets to a CCX
 # perf stat -r 5 -e task-clock -e cycles -e stalled-cycles-frontend \
	-e stalled-cycles-backend -e instructions -e branches \
	-e branch-misses -e L1-dcache-loads -e L1-dcache-load-misses \
	-e cache-references -e cache-misses -e all_data_cache_accesses \
	-e l1_data_cache_fills_all -e l1_data_cache_fills_from_memory \
	./compile.sh

 Milan                 kernel-build[1]        kernel-build[2]          kernel-build[3]
                       (bg: nothing)          (bg:clear_page_erms())   (bg:clear_page_clzero())
 -----------------     ---------------------  ----------------------   ------------------------

 run time              280.12s (+- 0.59%)     322.21s (+- 0.26%)       307.02s (+- 1.35%)
 IPC                     1.16                   1.05                     1.14
 backend-idle            3.78% (+- 0.06%)       4.62% (+- 0.11%)         3.87% (+- 0.10%)
 cache-misses           20.08% (+- 0.14%)      20.88% (+- 0.13%)        20.09% (+- 0.11%)
 (% of cache-refs)
 l1_data_cache_fills-   2.77M/sec (+- 0.20%)   3.11M/sec (+- 0.32%)     2.73M/sec (+- 0.12%)
 _from_memory

From the backend-idle stats in [1], the kernel build is only mildly
memory subsystem bound. However, there's a small but clear effect where
the background load of clear_page_clzero() does not leave much of an
imprint on the kernel-build in [3] -- both [1] and [3] have largely
similar IPC, memory and cache behaviour. OTOH, the clear_page_erms()
workload in [2] constrains the kernel-build more.

(Fuller perf stat output, at [1], [2], [3].)
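
The IPC and backend-idle rows in the table are ratios of raw counters
from the perf stat output. As an illustrative check, the sketch below
recomputes both for the Milan baseline. The helper and constant names
are hypothetical; the counter values are copied from output [1] in this
message.

```c
#include <assert.h>

/* Instructions retired per cycle. */
double ipc(double instructions, double cycles)
{
	assert(cycles > 0);
	return instructions / cycles;
}

/* Share of all cycles, in percent, flagged as backend stalls. */
double backend_idle_pct(double stalled_backend, double cycles)
{
	assert(cycles > 0);
	return 100.0 * stalled_backend / cycles;
}

/* Counter values from the Milan baseline run, output [1]. */
static const double milan_cycles       = 4642144895632.0;
static const double milan_instructions = 5392053273328.0;
static const double milan_stalled_be   = 175620521760.0;
```

Feeding the three constants into the helpers should land on the 1.16
IPC and 3.78% backend-idle figures listed for kernel-build[1], up to
rounding.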

 # Icelake, compile.sh internally tasksets to a socket
 # perf stat -r 5 -e task-clock -e cycles -e stalled-cycles-frontend \
	-e stalled-cycles-backend -e instructions -e branches \
	-e branch-misses -e L1-dcache-loads -e L1-dcache-load-misses \
	-e cache-references -e cache-misses -e LLC-loads \
	-e LLC-load-misses ./compile.sh

 Icelake               kernel-build[4]        kernel-build[5]          kernel-build[6]
                       (bg: nothing)          (bg:clear_page_erms())   (bg:clear_page_movnt())
 -----------------     -----------------      ----------------------   -----------------------

 run time              135.47s (+- 0.25%)     136.75s (+- 0.23%)       135.65s (+- 0.15%)
 IPC                     1.81                   1.80                     1.80
 cache-misses           21.68% (+- 0.42%)      22.88% (+- 0.87%)        21.19% (+- 0.51%)
 (% of cache-refs)
 LLC-load-misses        35.56% (+- 0.83%)      37.44% (+- 0.99%)        33.54% (+- 1.17%)

From the LLC-load-miss and the cache-miss numbers, clear_page_erms()
seems to cause some additional cache contention in the kernel-build in
[5], compared to [4] and [6]. However, from the IPC and the run time
numbers, it looks like the CPU pipeline compensates for the extra misses
quite well.
(Increasing the number of make jobs to 60 did not change the overall
picture appreciably.)

(Fuller perf stat output, at [4], [5], [6].)
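
The run time rows are easier to compare across the two machines as
relative changes against the baseline build. A small sketch of that
arithmetic follows; the helper name pct_change() is hypothetical, and
the inputs mentioned afterwards are the run times from the two tables
in this message.

```c
#include <assert.h>

/* Relative change of `value` against `baseline`, in percent. */
double pct_change(double baseline, double value)
{
	assert(baseline > 0);
	return 100.0 * (value - baseline) / baseline;
}
```

With the run times quoted above, pct_change(280.12, 322.21) is the
Milan slowdown under the clear_page_erms() load (about 15%), while
pct_change(135.47, 136.75) is the corresponding Icelake figure (under
1%).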
[1] Milan, kernel-build

Performance counter stats for './compile.sh' (5 runs):

   2,525,721.45 msec  task-clock                       # 9.016 CPUs utilized  ( +- 0.06% )
   4,642,144,895,632  cycles                           # 1.838 GHz  ( +- 0.01% )  (47.38%)
      54,430,239,074  stalled-cycles-frontend          # 1.17% frontend cycles idle  ( +- 0.16% )  (47.35%)
     175,620,521,760  stalled-cycles-backend           # 3.78% backend cycles idle  ( +- 0.06% )  (47.34%)
   5,392,053,273,328  instructions                     # 1.16 insn per cycle
                                                       # 0.03 stalled cycles per insn  ( +- 0.02% )  (47.34%)
   1,181,224,298,651  branches                         # 467.572 M/sec  ( +- 0.01% )  (47.33%)
      27,668,103,863  branch-misses                    # 2.34% of all branches  ( +- 0.04% )  (47.33%)
   2,141,384,087,286  L1-dcache-loads                  # 847.639 M/sec  ( +- 0.01% )  (47.32%)
      86,216,717,118  L1-dcache-load-misses            # 4.03% of all L1-dcache accesses  ( +- 0.08% )  (47.35%)
     264,844,001,975  cache-references                 # 104.835 M/sec  ( +- 0.03% )  (47.36%)
      53,225,109,745  cache-misses                     # 20.086 % of all cache refs  ( +- 0.14% )  (47.37%)
   2,610,041,169,859  all_data_cache_accesses          # 1.033 G/sec  ( +- 0.01% )  (47.37%)
      96,419,361,379  l1_data_cache_fills_all          # 38.166 M/sec  ( +- 0.06% )  (47.37%)
       7,005,118,698  l1_data_cache_fills_from_memory  # 2.773 M/sec  ( +- 0.20% )  (47.38%)

              280.12 +- 1.65 seconds time elapsed  ( +- 0.59% )

[2] Milan, kernel-build (bg: clear_page_erms() workload)

Performance counter stats for './compile.sh' (5 runs):

   2,852,168.93 msec  task-clock                       # 8.852 CPUs utilized  ( +- 0.14% )
   5,166,249,772,084  cycles                           # 1.821 GHz  ( +- 0.05% )  (47.27%)
      62,039,291,151  stalled-cycles-frontend          # 1.20% frontend cycles idle  ( +- 0.04% )  (47.29%)
     238,472,446,709  stalled-cycles-backend           # 4.62% backend cycles idle  ( +- 0.11% )  (47.30%)
   5,419,530,293,688  instructions                     # 1.05 insn per cycle
                                                       # 0.04 stalled cycles per insn  ( +- 0.01% )  (47.31%)
   1,186,958,893,481  branches                         # 418.404 M/sec  ( +- 0.01% )  (47.31%)
      28,106,023,654  branch-misses                    # 2.37% of all branches  ( +- 0.03% )  (47.29%)
   2,160,377,315,024  L1-dcache-loads                  # 761.534 M/sec  ( +- 0.03% )  (47.26%)
      89,101,836,173  L1-dcache-load-misses            # 4.13% of all L1-dcache accesses  ( +- 0.06% )  (47.25%)
     276,859,144,248  cache-references                 # 97.593 M/sec  ( +- 0.04% )  (47.22%)
      57,774,174,239  cache-misses                     # 20.889 % of all cache refs  ( +- 0.13% )  (47.24%)
   2,641,613,011,234  all_data_cache_accesses          # 931.170 M/sec  ( +- 0.01% )  (47.22%)
      99,595,968,133  l1_data_cache_fills_all          # 35.108 M/sec  ( +- 0.06% )  (47.24%)
       8,831,873,628  l1_data_cache_fills_from_memory  # 3.113 M/sec  ( +- 0.32% )  (47.23%)

             322.211 +- 0.837 seconds time elapsed  ( +- 0.26% )

[3] Milan, kernel-build + (bg: clear_page_clzero() workload)

Performance counter stats for './compile.sh' (5 runs):

   2,607,387.17 msec  task-clock                       # 8.493 CPUs utilized  ( +- 0.14% )
   4,749,807,054,468  cycles                           # 1.824 GHz  ( +- 0.09% )  (47.28%)
      56,579,908,946  stalled-cycles-frontend          # 1.19% frontend cycles idle  ( +- 0.19% )  (47.28%)
     183,367,955,020  stalled-cycles-backend           # 3.87% backend cycles idle  ( +- 0.10% )  (47.28%)
   5,395,577,260,957  instructions                     # 1.14 insn per cycle
                                                       # 0.03 stalled cycles per insn  ( +- 0.02% )  (47.29%)
   1,181,904,525,139  branches                         # 453.753 M/sec  ( +- 0.01% )  (47.30%)
      27,702,316,890  branch-misses                    # 2.34% of all branches  ( +- 0.02% )  (47.31%)
   2,137,616,885,978  L1-dcache-loads                  # 820.667 M/sec  ( +- 0.01% )  (47.32%)
      85,841,996,509  L1-dcache-load-misses            # 4.02% of all L1-dcache accesses  ( +- 0.03% )  (47.32%)
     262,784,890,310  cache-references                 # 100.888 M/sec  ( +- 0.04% )  (47.32%)
      52,812,245,646  cache-misses                     # 20.094 % of all cache refs  ( +- 0.11% )  (47.32%)
   2,605,653,350,299  all_data_cache_accesses          # 1.000 G/sec  ( +- 0.01% )  (47.32%)
      95,770,076,665  l1_data_cache_fills_all          # 36.768 M/sec  ( +- 0.03% )  (47.30%)
       7,134,690,513  l1_data_cache_fills_from_memory  # 2.739 M/sec  ( +- 0.12% )  (47.29%)

              307.02 +- 4.15 seconds time elapsed  ( +- 1.35% )

[4] Icelake, kernel-build

Performance counter stats for './compile.sh' (5 runs):

             421,633  cs                     # 358.780 /sec  ( +- 0.04% )
   1,173,522.36 msec  task-clock             # 8.662 CPUs utilized  ( +- 0.14% )
   2,991,427,421,282  cycles                 # 2.545 GHz  ( +- 0.15% )  (82.42%)
   5,410,090,251,681  instructions           # 1.81 insn per cycle  ( +- 0.02% )  (91.13%)
   1,189,406,048,438  branches               # 1.012 G/sec  ( +- 0.02% )  (91.05%)
      21,291,454,717  branch-misses          # 1.79% of all branches  ( +- 0.02% )  (91.06%)
   1,462,419,736,675  L1-dcache-loads        # 1.244 G/sec  ( +- 0.02% )  (91.06%)
      47,084,269,809  L1-dcache-load-misses  # 3.22% of all L1-dcache accesses  ( +- 0.01% )  (91.05%)
      23,527,140,332  cache-references       # 20.020 M/sec  ( +- 0.13% )  (91.04%)
       5,093,132,060  cache-misses           # 21.682 % of all cache refs  ( +- 0.42% )  (91.03%)
       4,220,672,439  LLC-loads              # 3.591 M/sec  ( +- 0.14% )  (91.04%)
       1,501,704,609  LLC-load-misses        # 35.56% of all LL-cache accesses  ( +- 0.83% )  (73.10%)

             135.478 +- 0.335 seconds time elapsed  ( +- 0.25% )

[5] Icelake, kernel-build + (bg: clear_page_erms() workload)

Performance counter stats for './compile.sh' (5 runs):

             410,611  cs                     # 347.771 /sec  ( +- 0.02% )
   1,184,382.84 msec  task-clock             # 8.661 CPUs utilized  ( +- 0.08% )
   3,018,535,155,772  cycles                 # 2.557 GHz  ( +- 0.08% )  (82.42%)
   5,408,788,104,113  instructions           # 1.80 insn per cycle  ( +- 0.00% )  (91.13%)
   1,189,173,209,515  branches               # 1.007 G/sec  ( +- 0.00% )  (91.05%)
      21,279,087,578  branch-misses          # 1.79% of all branches  ( +- 0.01% )  (91.06%)
   1,462,243,374,967  L1-dcache-loads        # 1.238 G/sec  ( +- 0.00% )  (91.05%)
      47,210,704,159  L1-dcache-load-misses  # 3.23% of all L1-dcache accesses  ( +- 0.02% )  (91.04%)
      23,378,470,958  cache-references       # 19.801 M/sec  ( +- 0.03% )  (91.05%)
       5,339,921,426  cache-misses           # 22.814 % of all cache refs  ( +- 0.87% )  (91.03%)
       4,241,388,134  LLC-loads              # 3.592 M/sec  ( +- 0.02% )  (91.05%)
       1,588,055,137  LLC-load-misses        # 37.44% of all LL-cache accesses  ( +- 0.99% )  (73.09%)

             136.750 +- 0.315 seconds time elapsed  ( +- 0.23% )

[6] Icelake, kernel-build + (bg: clear_page_movnt() workload)

Performance counter stats for './compile.sh' (5 runs):

             409,978  cs                     # 347.850 /sec  ( +- 0.06% )
   1,174,090.99 msec  task-clock             # 8.655 CPUs utilized  ( +- 0.10% )
   2,992,914,428,930  cycles                 # 2.539 GHz  ( +- 0.10% )  (82.40%)
   5,408,632,560,457  instructions           # 1.80 insn per cycle  ( +- 0.00% )  (91.12%)
   1,189,083,425,674  branches               # 1.009 G/sec  ( +- 0.00% )  (91.05%)
      21,273,992,588  branch-misses          # 1.79% of all branches  ( +- 0.02% )  (91.05%)
   1,462,081,591,012  L1-dcache-loads        # 1.241 G/sec  ( +- 0.00% )  (91.05%)
      47,071,136,770  L1-dcache-load-misses  # 3.22% of all L1-dcache accesses  ( +- 0.03% )  (91.04%)
      23,331,268,072  cache-references       # 19.796 M/sec  ( +- 0.05% )  (91.04%)
       4,953,198,057  cache-misses           # 21.190 % of all cache refs  ( +- 0.51% )  (91.04%)
       4,194,721,070  LLC-loads              # 3.559 M/sec  ( +- 0.10% )  (91.06%)
       1,412,414,538  LLC-load-misses        # 33.54% of all LL-cache accesses  ( +- 1.17% )  (73.09%)

             135.654 +- 0.203 seconds time elapsed  ( +- 0.15% )

Signed-off-by: Ankur Arora
---
 fs/hugetlbfs/inode.c |  7 ++++++-
 mm/gup.c             | 20 ++++++++++++++++++++
 mm/huge_memory.c     |  2 +-
 mm/hugetlb.c         |  9 ++++++++-
 4 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index cdfb1ae78a3f..44cee9d30035 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -636,6 +636,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 	loff_t hpage_size = huge_page_size(h);
 	unsigned long hpage_shift = huge_page_shift(h);
 	pgoff_t start, index, end;
+	bool hint_uncached;
 	int error;
 	u32 hash;
 
@@ -653,6 +654,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 	start = offset >> hpage_shift;
 	end = (offset + len + hpage_size - 1) >> hpage_shift;
 
+	/* Don't pollute the cache if we are fallocate'ing a large region. */
+	hint_uncached = clear_page_prefer_uncached((end - start) << hpage_shift);
+
 	inode_lock(inode);
 
 	/* We need to check rlimit even when FALLOC_FL_KEEP_SIZE */
@@ -731,7 +735,8 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 			error = PTR_ERR(page);
 			goto out;
 		}
-		clear_huge_page(page, addr, pages_per_huge_page(h));
+		clear_huge_page(page, addr, pages_per_huge_page(h),
+				hint_uncached);
 		__SetPageUptodate(page);
 		error = huge_add_to_page_cache(page, mapping, index);
 		if (unlikely(error)) {
diff --git a/mm/gup.c b/mm/gup.c
index 886d6148d3d0..930944e0c6eb 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -933,6 +933,13 @@ static int faultin_page(struct vm_area_struct *vma,
 		 */
 		fault_flags |= FAULT_FLAG_TRIED;
 	}
+	if (*flags & FOLL_HINT_BULK) {
+		/*
+		 * From the user hint, we might be faulting-in a large region
+		 * so minimize cache-pollution.
+		 */
+		fault_flags |= FAULT_FLAG_UNCACHED;
+	}
 
 	ret = handle_mm_fault(vma, address, fault_flags, NULL);
 	if (ret & VM_FAULT_ERROR) {
@@ -1100,6 +1107,19 @@ static long __get_user_pages(struct mm_struct *mm,
 	if (!(gup_flags & FOLL_FORCE))
 		gup_flags |= FOLL_NUMA;
 
+	/*
+	 * Uncached page clearing is generally faster when clearing regions
+	 * sized ~LLC/2 or thereabouts. So hint the uncached path based
+	 * on clear_page_prefer_uncached().
+	 *
+	 * Note, however, that this get_user_pages() call might end up
+	 * needing to clear an extent smaller than nr_pages when we have
+	 * taken the (potentially slower) uncached path based on the
+	 * expectation of a larger nr_pages value.
+	 */
+	if (clear_page_prefer_uncached(nr_pages * PAGE_SIZE))
+		gup_flags |= FOLL_HINT_BULK;
+
 	do {
 		struct page *page;
 		unsigned int foll_flags = gup_flags;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ffd4b07285ba..2d239967a8a1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -600,7 +600,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 	pgtable_t pgtable;
 	unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
 	vm_fault_t ret = 0;
-	bool uncached = false;
+	bool uncached = vmf->flags & FAULT_FLAG_UNCACHED;
 
 	VM_BUG_ON_PAGE(!PageCompound(page), page);
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a920b1133cdb..35b643df5854 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4874,7 +4874,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	spinlock_t *ptl;
 	unsigned long haddr = address & huge_page_mask(h);
 	bool new_page, new_pagecache_page = false;
-	bool uncached = false;
+	bool uncached = flags & FAULT_FLAG_UNCACHED;
 
 	/*
 	 * Currently, we are forced to kill the process in the event the
@@ -5503,6 +5503,13 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 				 */
 				fault_flags |= FAULT_FLAG_TRIED;
 			}
+			if (flags & FOLL_HINT_BULK) {
+				/*
+				 * From the user hint, we might be faulting-in a large
+				 * region so minimize cache-pollution.
+				 */
+				fault_flags |= FAULT_FLAG_UNCACHED;
+			}
 			ret = hugetlb_fault(mm, vma, vaddr, fault_flags);
 			if (ret & VM_FAULT_ERROR) {
 				err = vm_fault_to_errno(ret, flags);

From patchwork Wed Oct 20 18:52:07 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ankur Arora
X-Patchwork-Id: 12573107
From: Ankur Arora
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org,
 akpm@linux-foundation.org, mike.kravetz@oracle.com, jon.grimm@amd.com,
 kvm@vger.kernel.org, konrad.wilk@oracle.com, boris.ostrovsky@oracle.com,
 Ankur Arora, alex.williamson@redhat.com
Subject: [PATCH v2 13/14] vfio_iommu_type1: specify FOLL_HINT_BULK to pin_user_pages()
Date: Wed, 20 Oct 2021 11:52:07 -0700
Message-Id: <20211020185207.18509-1-ankur.a.arora@oracle.com>
In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com>
References: <20211020170305.376118-1-ankur.a.arora@oracle.com>
X-Mailing-List: kvm@vger.kernel.org

Specify FOLL_HINT_BULK to pin_user_pages() so that it knows this pin is
part of a larger region being pinned and can optimize based on that
expectation.
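How a bulk hint on the gup side ends up selecting the uncached clearing
path can be modelled in a few lines of userspace C. This is only a
sketch under stated assumptions: every SKETCH_/sketch_ name is
hypothetical, the bit values are made up, and the cutoff used by the
real clear_page_prefer_uncached() is defined elsewhere in the series,
so it is passed in as a parameter here.

```c
#include <assert.h>
#include <stdbool.h>

/* Made-up bit values, for illustration only. */
#define SKETCH_FOLL_HINT_BULK		(1u << 0)
#define SKETCH_FAULT_FLAG_UNCACHED	(1u << 1)
#define SKETCH_PAGE_SIZE		4096ul

/*
 * Stand-in for clear_page_prefer_uncached(). Assumption: a plain
 * "extent >= threshold" comparison; the real policy may differ.
 */
bool sketch_prefer_uncached(unsigned long extent, unsigned long threshold)
{
	return extent >= threshold;
}

/* Models the gup side: derive the hint from the size of the request. */
unsigned int sketch_gup_flags(unsigned int gup_flags, unsigned long nr_pages,
			      unsigned long threshold)
{
	if (sketch_prefer_uncached(nr_pages * SKETCH_PAGE_SIZE, threshold))
		gup_flags |= SKETCH_FOLL_HINT_BULK;
	return gup_flags;
}

/* Models the fault side: translate the gup hint into a fault flag. */
unsigned int sketch_fault_flags(unsigned int gup_flags)
{
	unsigned int fault_flags = 0;

	if (gup_flags & SKETCH_FOLL_HINT_BULK)
		fault_flags |= SKETCH_FAULT_FLAG_UNCACHED;
	return fault_flags;
}
```

A caller that already knows it is pinning a large region, as this patch
does, corresponds to passing SKETCH_FOLL_HINT_BULK directly to
sketch_fault_flags(), so the uncached path is requested even when a
single call covers only a few pages.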
Cc: alex.williamson@redhat.com
Signed-off-by: Ankur Arora
---
 drivers/vfio/vfio_iommu_type1.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 0e9217687f5c..0d45b0c6464d 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -557,6 +557,9 @@ static int vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
 	if (prot & IOMMU_WRITE)
 		flags |= FOLL_WRITE;
 
+	/* Tell gup that this iteration is part of a larger set of pins. */
+	flags |= FOLL_HINT_BULK;
+
 	mmap_read_lock(mm);
 	ret = pin_user_pages_remote(mm, vaddr, npages, flags | FOLL_LONGTERM,
 				    pages, NULL, NULL);

From patchwork Wed Oct 20 18:52:55 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ankur Arora
X-Patchwork-Id: 12573109
From: Ankur Arora
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Cc: mingo@kernel.org, bp@alien8.de, luto@kernel.org,
 akpm@linux-foundation.org, mike.kravetz@oracle.com, jon.grimm@amd.com,
 kvm@vger.kernel.org, konrad.wilk@oracle.com, boris.ostrovsky@oracle.com,
 Ankur Arora
Subject: [PATCH v2 14/14] x86/cpu/intel: set X86_FEATURE_MOVNT_SLOW for Skylake
Date: Wed, 20 Oct 2021 11:52:55 -0700
Message-Id: <20211020185255.19009-1-ankur.a.arora@oracle.com>
In-Reply-To: <20211020170305.376118-1-ankur.a.arora@oracle.com>
References: <20211020170305.376118-1-ankur.a.arora@oracle.com>
X-Mailing-List: kvm@vger.kernel.org

System:           Oracle X8-2
CPU:              2 nodes * 26 cores/node * 2 threads/core
                  Intel Xeon Platinum 8270CL (Skylakex, 6:85:7)
Memory:           3TB evenly split between nodes
Microcode:        0x5002f01
scaling_governor: performance
L3 size:          36MB
intel_pstate/no_turbo: 1

$ for i in 2 8 32 128 512; do
	perf bench mem memset -f x86-64-movnt -s ${i}MB
  done
# Running 'mem/memset' benchmark:
# function 'x86-64-movnt' (movnt-based memset() in arch/x86/lib/memset_64.S)
# Copying 2MB bytes ...
      6.361971 GB/sec
# Copying 8MB bytes ...
      6.300403 GB/sec
# Copying 32MB bytes ...
      6.288992 GB/sec
# Copying 128MB bytes ...
      6.328793 GB/sec
# Copying 512MB bytes ...
      6.324471 GB/sec

# Performance comparison of 'perf bench mem memset -l 1' for x86-64-stosb
# (X86_FEATURE_ERMS) and x86-64-movnt:

            x86-64-stosb (5 runs)     x86-64-movnt (5 runs)     speedup
            -----------------------   -----------------------   -------
   size            BW   (   pstdev)          BW   (   pstdev)

   16MB      20.38 GB/s ( +- 2.58%)     6.25 GB/s ( +- 0.41%)   -69.28%
  128MB       6.52 GB/s ( +- 0.14%)     6.31 GB/s ( +- 0.47%)    -3.22%
 1024MB       6.48 GB/s ( +- 0.31%)     6.24 GB/s ( +- 0.00%)    -3.70%
 4096MB       6.51 GB/s ( +- 0.01%)     6.27 GB/s ( +- 0.42%)    -3.68%

Comparing perf stats for size=4096MB:

$ perf stat -r 5 --all-user -e ... perf bench mem memset -l 1 -s 4096MB -f x86-64-stosb
# Running 'mem/memset' benchmark:
# function 'x86-64-stosb' (movsb-based memset() in arch/x86/lib/memset_64.S)
# Copying 4096MB bytes ...
      6.516972 GB/sec (+- 0.01%)

Performance counter stats for 'perf bench mem memset -l 1 -s 4096MB -f x86-64-stosb' (5 runs):

   3,357,373,317  cpu-cycles             # 1.133 GHz  ( +- 0.01% )  (29.38%)
     165,063,710  instructions           # 0.05 insn per cycle  ( +- 1.54% )  (35.29%)
         358,997  cache-references       # 0.121 M/sec  ( +- 0.89% )  (35.32%)
         205,420  cache-misses           # 57.221 % of all cache refs  ( +- 3.61% )  (35.36%)
       6,117,673  branch-instructions    # 2.065 M/sec  ( +- 1.48% )  (35.38%)
          58,309  branch-misses          # 0.95% of all branches  ( +- 1.30% )  (35.39%)
      31,329,466  bus-cycles             # 10.575 M/sec  ( +- 0.03% )  (23.56%)
      68,543,766  L1-dcache-load-misses  # 157.03% of all L1-dcache accesses  ( +- 0.02% )  (23.53%)
      43,648,909  L1-dcache-loads        # 14.734 M/sec  ( +- 0.50% )  (23.50%)
         137,498  LLC-loads              # 0.046 M/sec  ( +- 0.21% )  (23.49%)
          12,308  LLC-load-misses        # 8.95% of all LL-cache accesses  ( +- 2.52% )  (23.49%)
          26,335  LLC-stores             # 0.009 M/sec  ( +- 5.65% )  (11.75%)
          25,008  LLC-store-misses       # 0.008 M/sec  ( +- 3.42% )  (11.75%)

        2.962842 +- 0.000162 seconds time elapsed  ( +- 0.01% )

$ perf stat -r 5 --all-user -e ... perf bench mem memset -l 1 -s 4096MB -f x86-64-movnt
# Running 'mem/memset' benchmark:
# function 'x86-64-movnt' (movnt-based memset() in arch/x86/lib/memset_64.S)
# Copying 4096MB bytes ...
      6.283420 GB/sec (+- 0.01%)

Performance counter stats for 'perf bench mem memset -l 1 -s 4096MB -f x86-64-movnt' (5 runs):

   4,462,272,094  cpu-cycles             # 1.322 GHz  ( +- 0.30% )  (29.38%)
   1,633,675,881  instructions           # 0.37 insn per cycle  ( +- 0.21% )  (35.28%)
         283,627  cache-references       # 0.084 M/sec  ( +- 0.58% )  (35.31%)
          28,824  cache-misses           # 10.163 % of all cache refs  ( +- 20.67% )  (35.34%)
     139,719,697  branch-instructions    # 41.407 M/sec  ( +- 0.16% )  (35.35%)
          58,062  branch-misses          # 0.04% of all branches  ( +- 1.49% )  (35.36%)
      41,760,350  bus-cycles             # 12.376 M/sec  ( +- 0.05% )  (23.55%)
         303,300  L1-dcache-load-misses  # 0.69% of all L1-dcache accesses  ( +- 2.08% )  (23.53%)
      43,769,498  L1-dcache-loads        # 12.972 M/sec  ( +- 0.54% )  (23.52%)
          99,570  LLC-loads              # 0.030 M/sec  ( +- 1.06% )  (23.52%)
           1,966  LLC-load-misses        # 1.97% of all LL-cache accesses  ( +- 6.17% )  (23.52%)
             129  LLC-stores             # 0.038 K/sec  ( +- 27.85% )  (11.75%)
               7  LLC-store-misses       # 0.002 K/sec  ( +- 47.82% )  (11.75%)

         3.37465 +- 0.00474 seconds time elapsed  ( +- 0.14% )

It's unclear if using MOVNT is a net negative on Skylake. For bulk
stores (128MB and above in the table), MOVNT is about 3-4% slower than
REP;STOSB, but going by the L1-dcache-load-misses stats
(L1D.REPLACEMENT), it does elide the write-allocate and thus helps with
cache efficiency.

However, we err on the side of caution and mark Skylake with
X86_FEATURE_MOVNT_SLOW.

Signed-off-by: Ankur Arora
---
 arch/x86/kernel/cpu/bugs.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 4e1558d22a5f..222d6f095da1 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -96,6 +96,21 @@ void check_movnt_quirks(struct cpuinfo_x86 *c)
 	 * to worry about any CONFIG_X86_32 families that don't
 	 * support SSE2/MOVNT.
 	 */
+	if (c->x86_vendor == X86_VENDOR_INTEL) {
+		if (c->x86 == 6) {
+			switch (c->x86_model) {
+			case INTEL_FAM6_SKYLAKE_L:
+				fallthrough;
+			case INTEL_FAM6_SKYLAKE:
+				fallthrough;
+			case INTEL_FAM6_SKYLAKE_X:
+				set_cpu_cap(c, X86_FEATURE_MOVNT_SLOW);
+				break;
+			default:
+				break;
+			}
+		}
+	}
 #endif /* CONFIG_X86_64*/
 }
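The model check in the hunk above can be exercised outside the kernel
with a small stand-alone sketch. sketch_movnt_slow() and the SKETCH_
constants are hypothetical names used only for illustration; the model
numbers are the family 6 values for the three Skylake variants (0x4E,
0x5E, 0x55) as listed in arch/x86/include/asm/intel-family.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Family 6 model numbers for the Skylake variants named in the patch. */
#define SKETCH_SKYLAKE_L	0x4E
#define SKETCH_SKYLAKE		0x5E
#define SKETCH_SKYLAKE_X	0x55

/*
 * Userspace model of the quirk: returns true where the patch would set
 * X86_FEATURE_MOVNT_SLOW. The vendor is reduced to a boolean to keep
 * the sketch self-contained.
 */
bool sketch_movnt_slow(bool vendor_is_intel, unsigned int family,
		       unsigned int model)
{
	if (!vendor_is_intel || family != 6)
		return false;

	switch (model) {
	case SKETCH_SKYLAKE_L:
	case SKETCH_SKYLAKE:
	case SKETCH_SKYLAKE_X:
		return true;
	default:
		return false;
	}
}
```

The test machine reports 6:85:7, and model 85 is 0x55, so
sketch_movnt_slow(true, 6, 85) returns true. The quirk keys off family
and model only, so every stepping of model 0x55 is treated alike.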