From patchwork Wed Feb 21 01:17:10 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jason Gunthorpe
X-Patchwork-Id: 13564828
From: Jason Gunthorpe
To: Alexander Gordeev, Andrew Morton, Christian Borntraeger,
 Borislav Petkov, Dave Hansen, "David S. Miller", Eric Dumazet,
 Gerald Schaefer, Vasily Gorbik, Heiko Carstens, "H. Peter Anvin",
 Justin Stitt, Jakub Kicinski, Leon Romanovsky,
 linux-rdma@vger.kernel.org, linux-s390@vger.kernel.org,
 llvm@lists.linux.dev, Ingo Molnar, Bill Wendling, Nathan Chancellor,
 Nick Desaulniers, netdev@vger.kernel.org, Paolo Abeni, Salil Mehta,
 Jijie Shao, Sven Schnelle, Thomas Gleixner, x86@kernel.org,
 Yisen Zhuang
Cc: Arnd Bergmann, Catalin Marinas, Leon Romanovsky,
 linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 Mark Rutland, Michael Guralnik, patches@lists.linux.dev,
 Niklas Schnelle, Will Deacon
Subject: [PATCH 6/6] IB/mlx5: Use __iowrite64_copy() for write combining stores
Date: Tue, 20 Feb 2024 21:17:10 -0400
Message-ID: <6-v1-38290193eace+5-mlx5_arm_wc_jgg@nvidia.com>
In-Reply-To: <0-v1-38290193eace+5-mlx5_arm_wc_jgg@nvidia.com>

mlx5 has a built in self-test at driver startup to evaluate if the
platform supports write combining to generate a 64 byte PCIe TLP or
not. This has proven necessary because a lot of common scenarios end up
with broken write combining (especially inside virtual machines) and
there is no other way to learn this information.

This self test has been consistently failing on new ARM64 CPU designs
(specifically with NVIDIA Grace's implementation of Neoverse V2). The C
loop around writeq() generates some pretty terrible ARM64 assembly, but
historically this has worked on a lot of existing ARM64 CPUs till now.

We see it succeed about 1 time in 10,000 on the worst affected systems.
The CPU architects speculate that the load instructions interspersed
with the stores make the WC buffers statistically flush too often and
thus the generation of large TLPs becomes infrequent. This makes the
boot up test unreliable in that it indicates no write-combining,
however userspace would be fine since it uses a ST4 instruction.

Further, S390 has similar issues where only the special
zpci_memcpy_toio() will actually generate large TLPs, and the open
coded loop does not trigger it at all.

Fix both ARM64 and S390 by switching to __iowrite64_copy() which now
provides architecture specific variants that have a high chance of
generating a large TLP with write combining. x86 continues to use a
similar writeq loop in the generic __iowrite64_copy().

Fixes: 11f552e21755 ("IB/mlx5: Test write combining support")
Tested-by: Niklas Schnelle
Signed-off-by: Jason Gunthorpe
---
 drivers/infiniband/hw/mlx5/mem.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mem.c b/drivers/infiniband/hw/mlx5/mem.c
index 96ffbbaf0a73d1..5a22be14d958f2 100644
--- a/drivers/infiniband/hw/mlx5/mem.c
+++ b/drivers/infiniband/hw/mlx5/mem.c
@@ -30,6 +30,7 @@
  * SOFTWARE.
  */
 
+#include
 #include
 #include "mlx5_ib.h"
 #include
@@ -108,7 +109,6 @@ static int post_send_nop(struct mlx5_ib_dev *dev, struct ib_qp *ibqp, u64 wr_id,
 	__be32 mmio_wqe[16] = {};
 	unsigned long flags;
 	unsigned int idx;
-	int i;
 
 	if (unlikely(dev->mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR))
 		return -EIO;
@@ -148,10 +148,8 @@ static int post_send_nop(struct mlx5_ib_dev *dev, struct ib_qp *ibqp, u64 wr_id,
 	 * we hit doorbell
 	 */
 	wmb();
-	for (i = 0; i < 8; i++)
-		mlx5_write64(&mmio_wqe[i * 2],
-			     bf->bfreg->map + bf->offset + i * 8);
-	io_stop_wc();
+	__iowrite64_copy(bf->bfreg->map + bf->offset, mmio_wqe,
+			 sizeof(mmio_wqe) / 8);
 
 	bf->offset ^= bf->buf_size;
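
For reference, the count passed to __iowrite64_copy() is in units of
64-bit words, so sizeof(mmio_wqe) / 8 covers the full 64 byte WQE in 8
stores. The following is a minimal userspace sketch of that arithmetic
and of a plain copy loop, not the kernel implementation:
model_iowrite64_copy(), model_wqe_count() and MODEL_WQE_DWORDS are
hypothetical names, the destination is ordinary memory rather than
MMIO, and nothing here models write combining or the architecture
specific variants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors `__be32 mmio_wqe[16]` from the patch: 16 * 4 = 64 bytes. */
#define MODEL_WQE_DWORDS 16

/*
 * Hypothetical userspace model: copy `count` 64-bit words from `from`
 * to `to`, one 64-bit store per word. The source is read with memcpy()
 * so a 32-bit array can be passed in without aliasing problems.
 */
static void model_iowrite64_copy(volatile uint64_t *to, const void *from,
				 size_t count)
{
	const unsigned char *src = from;

	assert(to != NULL && from != NULL);
	for (size_t i = 0; i < count; i++) {
		uint64_t word;

		memcpy(&word, src + i * sizeof(word), sizeof(word));
		to[i] = word;
	}
}

/* Number of 64-bit words in the WQE, i.e. sizeof(mmio_wqe) / 8. */
static size_t model_wqe_count(void)
{
	return (MODEL_WQE_DWORDS * sizeof(uint32_t)) / sizeof(uint64_t);
}
```

Calling model_iowrite64_copy(dst, wqe, model_wqe_count()) with a 16
entry uint32_t source and an 8 entry uint64_t destination copies all 64
bytes, which is the same span the removed 8 iteration loop covered.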