From patchwork Tue Mar 1 10:14:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joey Gouly
X-Patchwork-Id: 12764498
From: Joey Gouly
Subject: [PATCH v2 1/3] arm64: lib: Import latest version of Arm Optimized Routines' strcmp
Date: Tue, 1 Mar 2022 10:14:33 +0000
Message-ID: <20220301101435.19327-2-joey.gouly@arm.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20220301101435.19327-1-joey.gouly@arm.com>
References: <20220301101435.19327-1-joey.gouly@arm.com>
X-BeenThere: linux-arm-kernel@lists.infradead.org

Import the latest version of the Arm Optimized Routines strcmp function based
on the upstream code of string/aarch64/strcmp.S at commit 189dfefe37d5 from:

  https://github.com/ARM-software/optimized-routines

This latest version includes MTE support.

Note that for simplicity Arm have chosen to contribute this code to Linux under
GPLv2 rather than the original MIT OR Apache-2.0 WITH LLVM-exception license.
Arm is the sole copyright holder for this code.

Signed-off-by: Joey Gouly
Cc: Robin Murphy
Cc: Mark Rutland
Cc: Catalin Marinas
Cc: Will Deacon
Acked-by: Mark Rutland
---
 arch/arm64/lib/strcmp.S | 238 +++++++++++++++++++++-------------------
 1 file changed, 126 insertions(+), 112 deletions(-)

diff --git a/arch/arm64/lib/strcmp.S b/arch/arm64/lib/strcmp.S
index 83bcad72ec97..758de77afd2f 100644
--- a/arch/arm64/lib/strcmp.S
+++ b/arch/arm64/lib/strcmp.S
@@ -1,9 +1,9 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 /*
- * Copyright (c) 2012-2021, Arm Limited.
+ * Copyright (c) 2012-2022, Arm Limited.
  *
  * Adapted from the original at:
- * https://github.com/ARM-software/optimized-routines/blob/afd6244a1f8d9229/string/aarch64/strcmp.S
+ * https://github.com/ARM-software/optimized-routines/blob/189dfefe37d54c5b/string/aarch64/strcmp.S
  */
 
 #include
@@ -11,161 +11,175 @@
 
 /* Assumptions:
  *
- * ARMv8-a, AArch64
+ * ARMv8-a, AArch64.
+ * MTE compatible.
  */
 
 #define L(label) .L ## label
 
 #define REP8_01 0x0101010101010101
 #define REP8_7f 0x7f7f7f7f7f7f7f7f
-#define REP8_80 0x8080808080808080
 
-/* Parameters and result. */
 #define src1		x0
 #define src2		x1
 #define result		x0
 
-/* Internal variables. */
 #define data1		x2
 #define data1w		w2
 #define data2		x3
 #define data2w		w3
 #define has_nul		x4
 #define diff		x5
+#define off1		x5
 #define syndrome	x6
-#define tmp1		x7
-#define tmp2		x8
-#define tmp3		x9
-#define zeroones	x10
-#define pos		x11
-
-	/* Start of performance-critical section -- one 64B cache line. */
-	.align 6
+#define tmp		x6
+#define data3		x7
+#define zeroones	x8
+#define shift		x9
+#define off2		x10
+
+/* On big-endian early bytes are at MSB and on little-endian LSB.
+   LS_FW means shifting towards early bytes. */
+#ifdef __AARCH64EB__
+# define LS_FW lsl
+#else
+# define LS_FW lsr
+#endif
+
+/* NUL detection works on the principle that (X - 1) & (~X) & 0x80
+   (=> (X - 1) & ~(X | 0x7f)) is non-zero iff a byte is zero, and
+   can be done in parallel across the entire word.
+   Since carry propagation makes 0x1 bytes before a NUL byte appear
+   NUL too in big-endian, byte-reverse the data before the NUL check. */
+
+
 SYM_FUNC_START_WEAK_PI(strcmp)
-	eor	tmp1, src1, src2
-	mov	zeroones, #REP8_01
-	tst	tmp1, #7
+	sub	off2, src2, src1
+	mov	zeroones, REP8_01
+	and	tmp, src1, 7
+	tst	off2, 7
 	b.ne	L(misaligned8)
-	ands	tmp1, src1, #7
-	b.ne	L(mutual_align)
-	/* NUL detection works on the principle that (X - 1) & (~X) & 0x80
-	   (=> (X - 1) & ~(X | 0x7f)) is non-zero iff a byte is zero, and
-	   can be done in parallel across the entire word. */
+	cbnz	tmp, L(mutual_align)
+
+	.p2align 4
+
 L(loop_aligned):
-	ldr	data1, [src1], #8
-	ldr	data2, [src2], #8
+	ldr	data2, [src1, off2]
+	ldr	data1, [src1], 8
 L(start_realigned):
-	sub	tmp1, data1, zeroones
-	orr	tmp2, data1, #REP8_7f
-	eor	diff, data1, data2	/* Non-zero if differences found. */
-	bic	has_nul, tmp1, tmp2	/* Non-zero if NUL terminator. */
+#ifdef __AARCH64EB__
+	rev	tmp, data1
+	sub	has_nul, tmp, zeroones
+	orr	tmp, tmp, REP8_7f
+#else
+	sub	has_nul, data1, zeroones
+	orr	tmp, data1, REP8_7f
+#endif
+	bics	has_nul, has_nul, tmp	/* Non-zero if NUL terminator. */
+	ccmp	data1, data2, 0, eq
+	b.eq	L(loop_aligned)
+#ifdef __AARCH64EB__
+	rev	has_nul, has_nul
+#endif
+	eor	diff, data1, data2
 	orr	syndrome, diff, has_nul
-	cbz	syndrome, L(loop_aligned)
-	/* End of performance-critical section -- one 64B cache line. */
-
 L(end):
-#ifndef	__AARCH64EB__
+#ifndef __AARCH64EB__
 	rev	syndrome, syndrome
 	rev	data1, data1
-	/* The MS-non-zero bit of the syndrome marks either the first bit
-	   that is different, or the top bit of the first zero byte.
-	   Shifting left now will bring the critical information into the
-	   top bits. */
-	clz	pos, syndrome
 	rev	data2, data2
-	lsl	data1, data1, pos
-	lsl	data2, data2, pos
-	/* But we need to zero-extend (char is unsigned) the value and then
-	   perform a signed 32-bit subtraction. */
-	lsr	data1, data1, #56
-	sub	result, data1, data2, lsr #56
-	ret
-#else
-	/* For big-endian we cannot use the trick with the syndrome value
-	   as carry-propagation can corrupt the upper bits if the trailing
-	   bytes in the string contain 0x01. */
-	/* However, if there is no NUL byte in the dword, we can generate
-	   the result directly. We can't just subtract the bytes as the
-	   MSB might be significant. */
-	cbnz	has_nul, 1f
-	cmp	data1, data2
-	cset	result, ne
-	cneg	result, result, lo
-	ret
-1:
-	/* Re-compute the NUL-byte detection, using a byte-reversed value. */
-	rev	tmp3, data1
-	sub	tmp1, tmp3, zeroones
-	orr	tmp2, tmp3, #REP8_7f
-	bic	has_nul, tmp1, tmp2
-	rev	has_nul, has_nul
-	orr	syndrome, diff, has_nul
-	clz	pos, syndrome
-	/* The MS-non-zero bit of the syndrome marks either the first bit
-	   that is different, or the top bit of the first zero byte.
+#endif
+	clz	shift, syndrome
+	/* The most-significant-non-zero bit of the syndrome marks either the
+	   first bit that is different, or the top bit of the first zero byte.
 	   Shifting left now will bring the critical information into the
 	   top bits. */
-	lsl	data1, data1, pos
-	lsl	data2, data2, pos
+	lsl	data1, data1, shift
+	lsl	data2, data2, shift
 	/* But we need to zero-extend (char is unsigned) the value and then
 	   perform a signed 32-bit subtraction. */
-	lsr	data1, data1, #56
-	sub	result, data1, data2, lsr #56
+	lsr	data1, data1, 56
+	sub	result, data1, data2, lsr 56
 	ret
-#endif
+
+	.p2align 4
 
 L(mutual_align):
 	/* Sources are mutually aligned, but are not currently at an
 	   alignment boundary. Round down the addresses and then mask off
-	   the bytes that preceed the start point. */
-	bic	src1, src1, #7
-	bic	src2, src2, #7
-	lsl	tmp1, tmp1, #3		/* Bytes beyond alignment -> bits. */
-	ldr	data1, [src1], #8
-	neg	tmp1, tmp1		/* Bits to alignment -64. */
-	ldr	data2, [src2], #8
-	mov	tmp2, #~0
-#ifdef __AARCH64EB__
-	/* Big-endian. Early bytes are at MSB. */
-	lsl	tmp2, tmp2, tmp1	/* Shift (tmp1 & 63). */
-#else
-	/* Little-endian. Early bytes are at LSB. */
-	lsr	tmp2, tmp2, tmp1	/* Shift (tmp1 & 63). */
-#endif
-	orr	data1, data1, tmp2
-	orr	data2, data2, tmp2
+	   the bytes that precede the start point. */
+	bic	src1, src1, 7
+	ldr	data2, [src1, off2]
+	ldr	data1, [src1], 8
+	neg	shift, src2, lsl 3	/* Bits to alignment -64. */
+	mov	tmp, -1
+	LS_FW	tmp, tmp, shift
+	orr	data1, data1, tmp
+	orr	data2, data2, tmp
 	b	L(start_realigned)
 
 L(misaligned8):
 	/* Align SRC1 to 8 bytes and then compare 8 bytes at a time, always
-	   checking to make sure that we don't access beyond page boundary in
-	   SRC2. */
-	tst	src1, #7
-	b.eq	L(loop_misaligned)
+	   checking to make sure that we don't access beyond the end of SRC2. */
+	cbz	tmp, L(src1_aligned)
 L(do_misaligned):
-	ldrb	data1w, [src1], #1
-	ldrb	data2w, [src2], #1
-	cmp	data1w, #1
-	ccmp	data1w, data2w, #0, cs	/* NZCV = 0b0000. */
+	ldrb	data1w, [src1], 1
+	ldrb	data2w, [src2], 1
+	cmp	data1w, 0
+	ccmp	data1w, data2w, 0, ne	/* NZCV = 0b0000. */
 	b.ne	L(done)
-	tst	src1, #7
+	tst	src1, 7
 	b.ne	L(do_misaligned)
 
-L(loop_misaligned):
-	/* Test if we are within the last dword of the end of a 4K page. If
-	   yes then jump back to the misaligned loop to copy a byte at a time. */
-	and	tmp1, src2, #0xff8
-	eor	tmp1, tmp1, #0xff8
-	cbz	tmp1, L(do_misaligned)
-	ldr	data1, [src1], #8
-	ldr	data2, [src2], #8
-
-	sub	tmp1, data1, zeroones
-	orr	tmp2, data1, #REP8_7f
-	eor	diff, data1, data2	/* Non-zero if differences found. */
-	bic	has_nul, tmp1, tmp2	/* Non-zero if NUL terminator. */
+L(src1_aligned):
+	neg	shift, src2, lsl 3
+	bic	src2, src2, 7
+	ldr	data3, [src2], 8
+#ifdef __AARCH64EB__
+	rev	data3, data3
+#endif
+	lsr	tmp, zeroones, shift
+	orr	data3, data3, tmp
+	sub	has_nul, data3, zeroones
+	orr	tmp, data3, REP8_7f
+	bics	has_nul, has_nul, tmp
+	b.ne	L(tail)
+
+	sub	off1, src2, src1
+
+	.p2align 4
+
+L(loop_unaligned):
+	ldr	data3, [src1, off1]
+	ldr	data2, [src1, off2]
+#ifdef __AARCH64EB__
+	rev	data3, data3
+#endif
+	sub	has_nul, data3, zeroones
+	orr	tmp, data3, REP8_7f
+	ldr	data1, [src1], 8
+	bics	has_nul, has_nul, tmp
+	ccmp	data1, data2, 0, eq
+	b.eq	L(loop_unaligned)
+
+	lsl	tmp, has_nul, shift
+#ifdef __AARCH64EB__
+	rev	tmp, tmp
+#endif
+	eor	diff, data1, data2
+	orr	syndrome, diff, tmp
+	cbnz	syndrome, L(end)
+L(tail):
+	ldr	data1, [src1]
+	neg	shift, shift
+	lsr	data2, data3, shift
+	lsr	has_nul, has_nul, shift
+#ifdef __AARCH64EB__
+	rev	data2, data2
+	rev	has_nul, has_nul
+#endif
+	eor	diff, data1, data2
 	orr	syndrome, diff, has_nul
-	cbz	syndrome, L(loop_misaligned)
 	b	L(end)
 
 L(done):

From patchwork Tue Mar 1 10:14:34 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joey Gouly
X-Patchwork-Id: 12764503
Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=eRjI9eEgdos0iSAc0AceJ9RcweFhv1rmWn77qE85CFw=; b=WLANU2b0Zticn2 AjYFBYFvJ9+6S+g3ysffceYJLhWEzq0GLu8R5OmxuSRiQ7oAOkirE7Dp2F+PpV+Fj9/RkgrTAsgZx sPmALLiWmqySkyEkA0HpnrNr167FjFr6zykIfCnRspVJK1FJV4OYvPgXEJlMct7WiXHmHPfIFhnU2 DaoqqfvYc0I1L2MWSTwuCL1ydeAghOaFCWe6/WFmHgd0lktc/cF+MvAIKcZeSg9rE3C3cLEb0hBFb xZbDA5WQ7lRnIOpgbcvcpU0A4f1XzwSGEbQoUMFLzSlBjwSjYg0e8Qa2wKPWEfrMBLnvBybttUn+E S1E9ikZTxJDGQJysJZUQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1nOzZt-00G4vs-0w; Tue, 01 Mar 2022 10:17:59 +0000 Received: from mail-db3eur04on0620.outbound.protection.outlook.com ([2a01:111:f400:fe0c::620] helo=EUR04-DB3-obe.outbound.protection.outlook.com) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1nOzWy-00G3p6-2O for linux-arm-kernel@lists.infradead.org; Tue, 01 Mar 2022 10:15:01 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=9GHbjovlqSCEwF0mMZsb8UKzTruRZ24oZXS87KBi8d8=; b=whg7GCRo+yZ/N7QZWPlmeecTc4H6Y4JmHktS3AOI35xLnvNZwiKZphiKZuUXXkMOzpnfeqf3qd8MMAWaq6NP+EQP9aeHBQ5BuVIxntc3RZOLq1+Log0qMOparY34eDcb4Dk94B+c48t7Mdg4IxvKUHZxX+4dkKh3v2an5VL4m9E= Received: from AS9PR06CA0275.eurprd06.prod.outlook.com (2603:10a6:20b:45a::24) by DB6PR0801MB1768.eurprd08.prod.outlook.com (2603:10a6:4:3b::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5017.26; Tue, 1 Mar 2022 10:14:50 +0000 Received: from AM5EUR03FT042.eop-EUR03.prod.protection.outlook.com (2603:10a6:20b:45a:cafe::c8) by AS9PR06CA0275.outlook.office365.com (2603:10a6:20b:45a::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5038.13 via Frontend Transport; Tue, 1 Mar 2022 
10:14:50 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AM5EUR03FT042.mail.protection.outlook.com (10.152.17.168) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5017.22 via Frontend Transport; Tue, 1 Mar 2022 10:14:50 +0000 Received: ("Tessian outbound 826a6d8e58c3:v113"); Tue, 01 Mar 2022 10:14:50 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: b7c46035dd536aa9 X-CR-MTA-TID: 64aa7808 Received: from d8981349b061.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 2BA729D0-73FD-4989-BDD7-12D7E6FE0F2D.1; Tue, 01 Mar 2022 10:14:43 +0000 Received: from EUR05-VI1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id d8981349b061.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Tue, 01 Mar 2022 10:14:43 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=ZXPqUmpjsie+C3Z9dVVp51NnNA34+30hQSq8A1uZ3uqfeg0xMbXo2FbTBKJQ9T1iC7DAPcluXiNK5NxuX8VhNK6Woy33Dt6EQJwm2MnMUM7jQqM3Fkvi39nU8FvK6GsyyVPSSBWQVTES2e8gV8/Dra5KR55U86tw8ccZ2aa3ONXcMycLxKR5CRNLrLFRHUA2AgyKQFcEZvKlaNvpi7ZW6z6j63+UGktUI+NcGlA/g8HgzvlP5QFGfxkPRVrKvpmildkcSbYIPEagexHXeJqPs7hVJcogUcrUX2uPlspBOpA8AVbIV1/ld6XbhVw6FVAb7UOe1QJ1WIXomAAjZgq3UA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; 
bh=9GHbjovlqSCEwF0mMZsb8UKzTruRZ24oZXS87KBi8d8=; b=Tyw/2cdd9ALartSUQh01bmnhkaQuAqs5ZFwVwJuYUX0gdB4sWd6zFFkDoZo3WvADtuzTpOsNfKMF+W9rfjQHuTyrtzC0A02jehzgedEDlf0v3aNlGqRu2xFI8nM/B/b7FuXVL717GzY0H44f6c6OgCPhMo2/o7NuIpplJql0yJzk266b17O+pPBgJIXa/J8F6MEeZ94NlgXM3Ix8McESg9x3gXUW2UCdAbFn1qgv5LgB0pVeOSD7Swxgc+Zhv9c6gn1rfKn8jKM6nCXxU7LEM9ZQHLlv1Sg9RCHXKsXaQ34v3ztF+x39HipRn2mz8OKrQhGBWNbUEImGBaJMa19cNQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) smtp.rcpttodomain=lists.infradead.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=9GHbjovlqSCEwF0mMZsb8UKzTruRZ24oZXS87KBi8d8=; b=whg7GCRo+yZ/N7QZWPlmeecTc4H6Y4JmHktS3AOI35xLnvNZwiKZphiKZuUXXkMOzpnfeqf3qd8MMAWaq6NP+EQP9aeHBQ5BuVIxntc3RZOLq1+Log0qMOparY34eDcb4Dk94B+c48t7Mdg4IxvKUHZxX+4dkKh3v2an5VL4m9E= Received: from DB8P191CA0019.EURP191.PROD.OUTLOOK.COM (2603:10a6:10:130::29) by DBBPR08MB5993.eurprd08.prod.outlook.com (2603:10a6:10:1f4::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5017.22; Tue, 1 Mar 2022 10:14:41 +0000 Received: from DB5EUR03FT042.eop-EUR03.prod.protection.outlook.com (2603:10a6:10:130:cafe::68) by DB8P191CA0019.outlook.office365.com (2603:10a6:10:130::29) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5038.14 via Frontend Transport; Tue, 1 Mar 2022 10:14:41 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) 
receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; Received: from nebula.arm.com (40.67.248.234) by DB5EUR03FT042.mail.protection.outlook.com (10.152.21.123) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.5017.22 via Frontend Transport; Tue, 1 Mar 2022 10:14:41 +0000 Received: from AZ-NEU-EX03.Arm.com (10.251.24.31) by AZ-NEU-EX03.Arm.com (10.251.24.31) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Tue, 1 Mar 2022 10:14:44 +0000 Received: from e124191.cambridge.arm.com (10.1.197.45) by mail.arm.com (10.251.24.31) with Microsoft SMTP Server id 15.1.2308.20 via Frontend Transport; Tue, 1 Mar 2022 10:14:43 +0000 From: Joey Gouly To: CC: , , , , , Subject: [PATCH v2 2/3] arm64: lib: Import latest version of Arm Optimized Routines' strncmp Date: Tue, 1 Mar 2022 10:14:34 +0000 Message-ID: <20220301101435.19327-3-joey.gouly@arm.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220301101435.19327-1-joey.gouly@arm.com> References: <20220301101435.19327-1-joey.gouly@arm.com> MIME-Version: 1.0 X-EOPAttributedMessage: 1 X-MS-Office365-Filtering-Correlation-Id: a145402a-c932-497f-0412-08d9fb6c526c X-MS-TrafficTypeDiagnostic: DBBPR08MB5993:EE_|AM5EUR03FT042:EE_|DB6PR0801MB1768:EE_ X-Microsoft-Antispam-PRVS: x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: 

Import the latest version of the Arm Optimized Routines strncmp function based
on the upstream code of string/aarch64/strncmp.S at commit 189dfefe37d5 from:

  https://github.com/ARM-software/optimized-routines

This latest version includes MTE support.

Note that for simplicity Arm have chosen to contribute this code to Linux under
GPLv2 rather than the original MIT OR Apache-2.0 WITH LLVM-exception license.
Arm is the sole copyright holder for this code.

Signed-off-by: Joey Gouly
Cc: Robin Murphy
Cc: Mark Rutland
Cc: Catalin Marinas
Cc: Will Deacon
Acked-by: Mark Rutland
---
 arch/arm64/lib/strncmp.S | 234 +++++++++++++++++++++++----------------
 1 file changed, 141 insertions(+), 93 deletions(-)

diff --git a/arch/arm64/lib/strncmp.S b/arch/arm64/lib/strncmp.S
index e42bcfcd37e6..a4884b97e9a8 100644
--- a/arch/arm64/lib/strncmp.S
+++ b/arch/arm64/lib/strncmp.S
@@ -1,9 +1,9 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 /*
- * Copyright (c) 2013-2021, Arm Limited.
+ * Copyright (c) 2013-2022, Arm Limited.
  *
  * Adapted from the original at:
- * https://github.com/ARM-software/optimized-routines/blob/e823e3abf5f89ecb/string/aarch64/strncmp.S
+ * https://github.com/ARM-software/optimized-routines/blob/189dfefe37d54c5b/string/aarch64/strncmp.S
  */
 
 #include
@@ -11,14 +11,14 @@
 
 /* Assumptions:
  *
- * ARMv8-a, AArch64
+ * ARMv8-a, AArch64.
+ * MTE compatible.
  */
 
 #define L(label) .L ## label
 
 #define REP8_01 0x0101010101010101
 #define REP8_7f 0x7f7f7f7f7f7f7f7f
-#define REP8_80 0x8080808080808080
 
 /* Parameters and result. */
 #define src1		x0
@@ -39,10 +39,24 @@
 #define tmp3		x10
 #define zeroones	x11
 #define pos		x12
-#define limit_wd	x13
-#define mask		x14
-#define endloop		x15
+#define mask		x13
+#define endloop		x14
 #define count		mask
+#define offset		pos
+#define neg_offset	x15
+
+/* Define endian dependent shift operations.
+   On big-endian early bytes are at MSB and on little-endian LSB.
+   LS_FW means shifting towards early bytes.
+   LS_BK means shifting towards later bytes.
+   */
+#ifdef __AARCH64EB__
+#define LS_FW lsl
+#define LS_BK lsr
+#else
+#define LS_FW lsr
+#define LS_BK lsl
+#endif
 
 SYM_FUNC_START_WEAK_PI(strncmp)
 	cbz	limit, L(ret0)
@@ -52,9 +66,6 @@ SYM_FUNC_START_WEAK_PI(strncmp)
 	and	count, src1, #7
 	b.ne	L(misaligned8)
 	cbnz	count, L(mutual_align)
-	/* Calculate the number of full and partial words -1. */
-	sub	limit_wd, limit, #1	/* limit != 0, so no underflow. */
-	lsr	limit_wd, limit_wd, #3	/* Convert to Dwords. */
 
 	/* NUL detection works on the principle that (X - 1) & (~X) & 0x80
 	   (=> (X - 1) & ~(X | 0x7f)) is non-zero iff a byte is zero, and
@@ -64,56 +75,52 @@ L(loop_aligned):
 	ldr	data1, [src1], #8
 	ldr	data2, [src2], #8
 L(start_realigned):
-	subs	limit_wd, limit_wd, #1
+	subs	limit, limit, #8
 	sub	tmp1, data1, zeroones
 	orr	tmp2, data1, #REP8_7f
 	eor	diff, data1, data2	/* Non-zero if differences found. */
-	csinv	endloop, diff, xzr, pl	/* Last Dword or differences. */
+	csinv	endloop, diff, xzr, hi	/* Last Dword or differences. */
 	bics	has_nul, tmp1, tmp2	/* Non-zero if NUL terminator. */
 	ccmp	endloop, #0, #0, eq
 	b.eq	L(loop_aligned)
 	/* End of main loop */
 
-	/* Not reached the limit, must have found the end or a diff. */
-	tbz	limit_wd, #63, L(not_limit)
-
-	/* Limit % 8 == 0 => all bytes significant. */
-	ands	limit, limit, #7
-	b.eq	L(not_limit)
-
-	lsl	limit, limit, #3	/* Bits -> bytes. */
-	mov	mask, #~0
-#ifdef __AARCH64EB__
-	lsr	mask, mask, limit
-#else
-	lsl	mask, mask, limit
-#endif
-	bic	data1, data1, mask
-	bic	data2, data2, mask
-
-	/* Make sure that the NUL byte is marked in the syndrome. */
-	orr	has_nul, has_nul, mask
-
-L(not_limit):
+L(full_check):
+#ifndef __AARCH64EB__
 	orr	syndrome, diff, has_nul
-
-#ifndef __AARCH64EB__
+	add	limit, limit, 8	/* Rewind limit to before last subs. */
+L(syndrome_check):
+	/* Limit was reached. Check if the NUL byte or the difference
+	   is before the limit. */
 	rev	syndrome, syndrome
 	rev	data1, data1
-	/* The MS-non-zero bit of the syndrome marks either the first bit
-	   that is different, or the top bit of the first zero byte.
-	   Shifting left now will bring the critical information into the
-	   top bits. */
 	clz	pos, syndrome
 	rev	data2, data2
 	lsl	data1, data1, pos
+	cmp	limit, pos, lsr #3
 	lsl	data2, data2, pos
 	/* But we need to zero-extend (char is unsigned) the value and then
 	   perform a signed 32-bit subtraction. */
 	lsr	data1, data1, #56
 	sub	result, data1, data2, lsr #56
+	csel	result, result, xzr, hi
 	ret
 #else
+	/* Not reached the limit, must have found the end or a diff. */
+	tbz	limit, #63, L(not_limit)
+	add	tmp1, limit, 8
+	cbz	limit, L(not_limit)
+
+	lsl	limit, tmp1, #3	/* Bits -> bytes. */
+	mov	mask, #~0
+	lsr	mask, mask, limit
+	bic	data1, data1, mask
+	bic	data2, data2, mask
+
+	/* Make sure that the NUL byte is marked in the syndrome. */
+	orr	has_nul, has_nul, mask
+
+L(not_limit):
 	/* For big-endian we cannot use the trick with the syndrome value
 	   as carry-propagation can corrupt the upper bits if the trailing
 	   bytes in the string contain 0x01. */
@@ -134,10 +141,11 @@ L(not_limit):
 	rev	has_nul, has_nul
 	orr	syndrome, diff, has_nul
 	clz	pos, syndrome
-	/* The MS-non-zero bit of the syndrome marks either the first bit
-	   that is different, or the top bit of the first zero byte.
+	/* The most-significant-non-zero bit of the syndrome marks either the
+	   first bit that is different, or the top bit of the first zero byte.
 	   Shifting left now will bring the critical information into the
 	   top bits. */
+L(end_quick):
 	lsl	data1, data1, pos
 	lsl	data2, data2, pos
 	/* But we need to zero-extend (char is unsigned) the value and then
@@ -159,22 +167,12 @@ L(mutual_align):
 	neg	tmp3, count, lsl #3	/* 64 - bits(bytes beyond align). */
 	ldr	data2, [src2], #8
 	mov	tmp2, #~0
-	sub	limit_wd, limit, #1	/* limit != 0, so no underflow. */
-#ifdef __AARCH64EB__
-	/* Big-endian. Early bytes are at MSB. */
-	lsl	tmp2, tmp2, tmp3	/* Shift (count & 63). */
-#else
-	/* Little-endian. Early bytes are at LSB. */
-	lsr	tmp2, tmp2, tmp3	/* Shift (count & 63). */
-#endif
-	and	tmp3, limit_wd, #7
-	lsr	limit_wd, limit_wd, #3
-	/* Adjust the limit. Only low 3 bits used, so overflow irrelevant. */
-	add	limit, limit, count
-	add	tmp3, tmp3, count
+	LS_FW	tmp2, tmp2, tmp3	/* Shift (count & 63). */
+	/* Adjust the limit and ensure it doesn't overflow. */
+	adds	limit, limit, count
+	csinv	limit, limit, xzr, lo
 	orr	data1, data1, tmp2
 	orr	data2, data2, tmp2
-	add	limit_wd, limit_wd, tmp3, lsr #3
 	b	L(start_realigned)
 
 	.p2align 4
@@ -197,13 +195,11 @@ L(done):
 	/* Align the SRC1 to a dword by doing a bytewise compare and then do
 	   the dword loop. */
 L(try_misaligned_words):
-	lsr	limit_wd, limit, #3
-	cbz	count, L(do_misaligned)
+	cbz	count, L(src1_aligned)
 
 	neg	count, count
 	and	count, count, #7
 	sub	limit, limit, count
-	lsr	limit_wd, limit, #3
 
 L(page_end_loop):
 	ldrb	data1w, [src1], #1
@@ -214,48 +210,100 @@ L(page_end_loop):
 	subs	count, count, #1
 	b.hi	L(page_end_loop)
 
-L(do_misaligned):
-	/* Prepare ourselves for the next page crossing. Unlike the aligned
-	   loop, we fetch 1 less dword because we risk crossing bounds on
-	   SRC2. */
-	mov	count, #8
-	subs	limit_wd, limit_wd, #1
-	b.lo	L(done_loop)
-L(loop_misaligned):
-	and	tmp2, src2, #0xff8
-	eor	tmp2, tmp2, #0xff8
-	cbz	tmp2, L(page_end_loop)
+	/* The following diagram explains the comparison of misaligned strings.
+	   The bytes are shown in natural order. For little-endian, it is
+	   reversed in the registers. The "x" bytes are before the string.
+	   The "|" separates data that is loaded at one time.
+	   src1     | a a a a a a a a | b b b c c c c c | . . .
+	   src2     | x x x x x a a a   a a a a a b b b | c c c c c . . .
+
+	   After shifting in each step, the data looks like this:
+	                STEP_A              STEP_B              STEP_C
+	   data1    a a a a a a a a     b b b c c c c c     b b b c c c c c
+	   data2    a a a a a a a a     b b b 0 0 0 0 0     0 0 0 c c c c c
 
+	   The bytes with "0" are eliminated from the syndrome via mask.
+
+	   Align SRC2 down to 16 bytes. This way we can read 16 bytes at a
+	   time from SRC2. The comparison happens in 3 steps. After each step
+	   the loop can exit, or read from SRC1 or SRC2. */
+L(src1_aligned):
+	/* Calculate offset from 8 byte alignment to string start in bits. No
+	   need to mask offset since shifts are ignoring upper bits. */
+	lsl	offset, src2, #3
+	bic	src2, src2, #0xf
+	mov	mask, -1
+	neg	neg_offset, offset
 	ldr	data1, [src1], #8
-	ldr	data2, [src2], #8
-	sub	tmp1, data1, zeroones
-	orr	tmp2, data1, #REP8_7f
-	eor	diff, data1, data2	/* Non-zero if differences found. */
-	bics	has_nul, tmp1, tmp2	/* Non-zero if NUL terminator. */
-	ccmp	diff, #0, #0, eq
-	b.ne	L(not_limit)
-	subs	limit_wd, limit_wd, #1
-	b.pl	L(loop_misaligned)
+	ldp	tmp1, tmp2, [src2], #16
+	LS_BK	mask, mask, neg_offset
+	and	neg_offset, neg_offset, #63	/* Need actual value for cmp later. */
+	/* Skip the first compare if data in tmp1 is irrelevant. */
+	tbnz	offset, 6, L(misaligned_mid_loop)
 
-L(done_loop):
-	/* We found a difference or a NULL before the limit was reached. */
-	and	limit, limit, #7
-	cbz	limit, L(not_limit)
-	/* Read the last word. */
-	sub	src1, src1, 8
-	sub	src2, src2, 8
-	ldr	data1, [src1, limit]
-	ldr	data2, [src2, limit]
-	sub	tmp1, data1, zeroones
-	orr	tmp2, data1, #REP8_7f
+L(loop_misaligned):
+	/* STEP_A: Compare full 8 bytes when there is enough data from SRC2.*/
+	LS_FW	data2, tmp1, offset
+	LS_BK	tmp1, tmp2, neg_offset
+	subs	limit, limit, #8
+	orr	data2, data2, tmp1	/* 8 bytes from SRC2 combined from two regs.*/
+	sub	has_nul, data1, zeroones
 	eor	diff, data1, data2	/* Non-zero if differences found. */
-	bics	has_nul, tmp1, tmp2	/* Non-zero if NUL terminator. */
-	ccmp	diff, #0, #0, eq
-	b.ne	L(not_limit)
+	orr	tmp3, data1, #REP8_7f
+	csinv	endloop, diff, xzr, hi	/* If limit, set to all ones. */
+	bic	has_nul, has_nul, tmp3	/* Non-zero if NUL byte found in SRC1. */
+	orr	tmp3, endloop, has_nul
+	cbnz	tmp3, L(full_check)
+
+	ldr	data1, [src1], #8
+L(misaligned_mid_loop):
+	/* STEP_B: Compare first part of data1 to second part of tmp2. */
+	LS_FW	data2, tmp2, offset
+#ifdef __AARCH64EB__
+	/* For big-endian we do a byte reverse to avoid carry-propagation
+	   problem described above. This way we can reuse the has_nul in the
+	   next step and also use syndrome value trick at the end. */
+	rev	tmp3, data1
+	#define data1_fixed tmp3
+#else
+	#define data1_fixed data1
+#endif
+	sub	has_nul, data1_fixed, zeroones
+	orr	tmp3, data1_fixed, #REP8_7f
+	eor	diff, data2, data1	/* Non-zero if differences found. */
+	bic	has_nul, has_nul, tmp3	/* Non-zero if NUL terminator. */
+#ifdef __AARCH64EB__
+	rev	has_nul, has_nul
+#endif
+	cmp	limit, neg_offset, lsr #3
+	orr	syndrome, diff, has_nul
+	bic	syndrome, syndrome, mask	/* Ignore later bytes. */
+	csinv	tmp3, syndrome, xzr, hi	/* If limit, set to all ones. */
+	cbnz	tmp3, L(syndrome_check)
+
+	/* STEP_C: Compare second part of data1 to first part of tmp1. */
+	ldp	tmp1, tmp2, [src2], #16
+	cmp	limit, #8
+	LS_BK	data2, tmp1, neg_offset
+	eor	diff, data2, data1	/* Non-zero if differences found. */
+	orr	syndrome, diff, has_nul
+	and	syndrome, syndrome, mask	/* Ignore earlier bytes. */
+	csinv	tmp3, syndrome, xzr, hi	/* If limit, set to all ones. */
+	cbnz	tmp3, L(syndrome_check)
+
+	ldr	data1, [src1], #8
+	sub	limit, limit, #8
+	b	L(loop_misaligned)
+
+#ifdef __AARCH64EB__
+L(syndrome_check):
+	clz	pos, syndrome
+	cmp	pos, limit, lsl #3
+	b.lo	L(end_quick)
+#endif
 
 L(ret0):
 	mov	result, #0
 	ret
-
 SYM_FUNC_END_PI(strncmp)
 EXPORT_SYMBOL_NOHWKASAN(strncmp)

From patchwork Tue Mar 1 10:14:35 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joey Gouly
X-Patchwork-Id: 12764497
From: Joey Gouly
Subject: [PATCH v2 3/3] Revert "arm64: Mitigate MTE issues with str{n}cmp()"
Date: Tue, 1 Mar 2022 10:14:35 +0000
Message-ID: <20220301101435.19327-4-joey.gouly@arm.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20220301101435.19327-1-joey.gouly@arm.com>
References: <20220301101435.19327-1-joey.gouly@arm.com>
MIME-Version: 1.0

This reverts commit 59a68d4138086c015ab8241c3267eec5550fbd44.

Now that the str{n}cmp functions have been updated to handle MTE
properly, the workaround to use the generic functions is no longer
needed.

Signed-off-by: Joey Gouly
Cc: Robin Murphy
Cc: Mark Rutland
Cc: Catalin Marinas
Cc: Will Deacon
Acked-by: Mark Rutland
---
 arch/arm64/include/asm/assembler.h | 5 -----
 arch/arm64/include/asm/string.h    | 2 --
 arch/arm64/lib/strcmp.S            | 2 +-
 arch/arm64/lib/strncmp.S           | 2 +-
 4 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index e8bd0af0141c..8df412178efb 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -535,11 +535,6 @@ alternative_endif
 #define EXPORT_SYMBOL_NOKASAN(name)	EXPORT_SYMBOL(name)
 #endif
 
-#ifdef CONFIG_KASAN_HW_TAGS
-#define EXPORT_SYMBOL_NOHWKASAN(name)
-#else
-#define EXPORT_SYMBOL_NOHWKASAN(name)	EXPORT_SYMBOL_NOKASAN(name)
-#endif
 	/*
 	 * Emit a 64-bit absolute little endian symbol reference in a way that
 	 * ensures that it will be resolved at build time, even when building a
diff --git a/arch/arm64/include/asm/string.h b/arch/arm64/include/asm/string.h
index 95f7686b728d..3a3264ff47b9 100644
--- a/arch/arm64/include/asm/string.h
+++ b/arch/arm64/include/asm/string.h
@@ -12,13 +12,11 @@ extern char *strrchr(const char *, int c);
 #define __HAVE_ARCH_STRCHR
 extern char *strchr(const char *, int c);
 
-#ifndef CONFIG_KASAN_HW_TAGS
 #define __HAVE_ARCH_STRCMP
 extern int strcmp(const char *, const char *);
 
 #define __HAVE_ARCH_STRNCMP
 extern int strncmp(const char *, const char *, __kernel_size_t);
-#endif
 
 #define __HAVE_ARCH_STRLEN
 extern __kernel_size_t strlen(const char *);
diff --git a/arch/arm64/lib/strcmp.S b/arch/arm64/lib/strcmp.S
index 758de77afd2f..e6815a3dd265 100644
--- a/arch/arm64/lib/strcmp.S
+++ b/arch/arm64/lib/strcmp.S
@@ -187,4 +187,4 @@ L(done):
 	ret
 
 SYM_FUNC_END_PI(strcmp)
-EXPORT_SYMBOL_NOHWKASAN(strcmp)
+EXPORT_SYMBOL_NOKASAN(strcmp)
diff --git a/arch/arm64/lib/strncmp.S b/arch/arm64/lib/strncmp.S
index a4884b97e9a8..bc195cb86693 100644
--- a/arch/arm64/lib/strncmp.S
+++ b/arch/arm64/lib/strncmp.S
@@ -306,4 +306,4 @@ L(ret0):
 	mov	result, #0
 	ret
 SYM_FUNC_END_PI(strncmp)
-EXPORT_SYMBOL_NOHWKASAN(strncmp)
+EXPORT_SYMBOL_NOKASAN(strncmp)