From patchwork Thu May 23 03:19:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ashish Mhetre X-Patchwork-Id: 13671238 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2E434C25B79 for ; Thu, 23 May 2024 03:20:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:Message-ID:Date:Subject:CC :To:From:Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:References: List-Owner; bh=VXCe+SvzdjUfSMM10Ev0pg6rX/l+2t6WTnFowHVmUyI=; b=TG36jJ25n9CXNE ak3f5yFvfW3Ep03eY7bOeV8KhcdeYU+tyQ49dOiJ/M6/zitKgww2lIHqrpbIJUWaOzAcS3crsfcWG JhMHIpqkl6q+GNYRl/rKnMkI4l9xQ5dlRAqb8G8+MIKeUUEETynwuUZ4r6CvBHTLOf5qORc+EMlvg lwyMPO78DBNC5QT9Xq2pqVMDNLqD6WVSh39ovs58BK+bur27pMIadt5ZBnwA5lWp8Eqh6AaDB1meq tPC24yB0nWh4xGV44DFQUxVbBuqQjPdmsyKA1BdVp0zwzMxq4LGt7dFkCKCWMSBStzSUmMR4j5RgY z8mf9XQoDnRLkfi62Oeg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1s9z00-00000004sqz-1Epc; Thu, 23 May 2024 03:20:12 +0000 Received: from mail-mw2nam12on20601.outbound.protection.outlook.com ([2a01:111:f403:200a::601] helo=NAM12-MW2-obe.outbound.protection.outlook.com) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1s9yzt-00000004snq-2GNa for linux-arm-kernel@lists.infradead.org; Thu, 23 May 2024 03:20:11 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=TfwfmMxxuUwtAIX5EAg3973uJ8bTxY4HyBxvYFEN9OLAHo6fcYUM2WlfO6n49TzuNRzrhzskSDWhh326RWz0gEoG2CMzvC00hru/IUfCvHnskcTSXgIsiIUjcSiC7T37VZH2NTqW0q6PENxEXHUNEqBBgCyrsS4nCM6la04KhxkKbUlfZ1iKCaUk0PiqxY4miNo59IEnMVCIxVmakt6q3WoqWYAwc5C/YkZpX0fPHuYookLvMj2IH5XISKTTDJHzjVg2mTdSnlHCUiI/EU8PUk/sBZtZnMOCPjmfS8xy/P0eALiX18mJFMp7YKmst7PnuY6kPzLqA10YKpOhR4qr/A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=9STHWwWg51TeZjiq97CvnN3cIPG+FqJJ8OijIJ8J6L4=; b=itJWiiXGGIN8NcLoj58M8P6pNBG8Oof5scjLpimNelDywsuTpsXb3P++CBCp82yjZlebbr+lxA6ZwrrY0REFj+INEwnOfsGFl2vrMe8L5Lg5HxhTLMz2fO9EeR2c5fUZvX96fdNFBYVW8wAL4fVVuJ35QacQhtl26ZLi/XTiVHALVspxZJNueIrouxThVj2BrCm7TJQe6IMeFvTOmYdsxZcmlD00Du+Gx6pC43TCbQlPdgJvyCseJGMMQXYdztWzhUfLFjXMFTcHGJ+12r+F+VsVKAvW/v+CowW5erNpyQwAezgSJEepL1bXaBt27FMUVI1uzYF4a5OPpHZjpjqi8A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.232) smtp.rcpttodomain=kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=9STHWwWg51TeZjiq97CvnN3cIPG+FqJJ8OijIJ8J6L4=; b=rUF/lxnS4ANPI6EO5Eob9MwQjPjHj+alsUyRVGno+o5Y6Xwrx1VTM95/S6qpCh2n92BPOek/57YcXNAVWDFx7vEgu4QRcQaSeDtLNISq2fpiaip70ORuAAo1+JsuYmgxXhE06q4SMvaA73SbAWDYcLCpeVA0d1CEH45F4BRDfmt+3WtO0ar4w5rTNz4ly4HGYvQx9lrIZ2g8NZuY+76gPF9cQ/9fc3rtlhvTEi21aEzh4r6sBsMQm6/uzQZju2hgCWxYCIZ8AxgmdSM4iyb4RlGF69k6AhzXW9eEFAc5HGhhsGKlxzQ0PPrJTMjcwUMrpsFyhQL7UV9GmU9mpfg0Tg== Received: from MW4PR04CA0352.namprd04.prod.outlook.com (2603:10b6:303:8a::27) by CH3PR12MB8726.namprd12.prod.outlook.com (2603:10b6:610:17b::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7587.36; Thu, 23 May 2024 03:19:55 +0000 Received: from MWH0EPF000989E9.namprd02.prod.outlook.com (2603:10b6:303:8a:cafe::be) by MW4PR04CA0352.outlook.office365.com (2603:10b6:303:8a::27) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7611.18 via Frontend Transport; Thu, 23 May 2024 03:19:55 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.232) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.232 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.232; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.232) by MWH0EPF000989E9.mail.protection.outlook.com (10.167.241.136) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7611.14 via Frontend Transport; Thu, 23 May 2024 03:19:55 +0000 Received: from drhqmail202.nvidia.com (10.126.190.181) by mail.nvidia.com (10.127.129.5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Wed, 22 May 2024 20:19:52 -0700 Received: from drhqmail203.nvidia.com (10.126.190.182) by drhqmail202.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Wed, 22 May 2024 20:19:52 -0700 Received: from build-amhetre-20240519T173508142.nvidia.com (10.127.8.10) by mail.nvidia.com (10.126.190.182) with Microsoft SMTP Server id 15.2.1544.4 via Frontend Transport; Wed, 22 May 2024 20:19:52 -0700 From: Ashish Mhetre To: , , , CC: , , , , , , Ashish Mhetre Subject: [RFC PATCH] iommu: Optimize IOMMU UnMap Date: Thu, 23 May 2024 03:19:35 +0000 Message-ID: <20240523031935.90856-1-amhetre@nvidia.com> X-Mailer: git-send-email 2.17.1 X-NVConfidentiality: public MIME-Version: 1.0 X-NV-OnPremToCloud: ExternallySecured X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: MWH0EPF000989E9:EE_|CH3PR12MB8726:EE_ X-MS-Office365-Filtering-Correlation-Id: de019a6c-e93f-47a4-c6f2-08dc7ad737f8 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230031|82310400017|1800799015|36860700004|376005; X-Microsoft-Antispam-Message-Info: JutOxr3a2/MVnHBBHnA3+xmyhjBjCWpnNWTxYh2iNFe8PaTUuikS/hBDmMe87ZueDKxcjWWgAzTHsAHvKdwR6pGFHTGeQH3diTJ9N72WNPlVdqGFVdS0dup5TROH4jrHcqdog2hzxP6zV7iUY+Z1FbAXUGIxgPJGw8MTVtxgeK7QQDRq251LF7Me6k7nShnfwnkhVhYZuZo4jPFnxCfcmUypzy9+3uiBe0yVk/eaKPnqeK2ocViWPkOCwvjTeQpvhKq4MKernNdoV5kqftILfKypdXRpaB+yVCtxf1AFfWAwBooGDpUMJOYW1vwb2i+H+iYQz42tAa+oD8DtYREWiLZ2YmRPykRhQSUm1Lt86xChe2e+Swy+B503uE2NkiyUxDWg7o3TUAkaHz2DEWytpsY0K/3SrScOpQNrJ+ItvLFhiV/qaZXrHJ1iVq2AsARJ4p3NCh9CoxaHbr6AEXO9sIkVMyw54vUkhXWYd3g+iHM8x8se7Mv2Jb8Vhe1n6/mzjeKOPVQYUPxfzOzCXuXOycJmo2/EKjByJmGlTGOaSzcGuGpKcdsewjeJy/IEiZbniuBeG8ldzuvoLKkXG8YqKls+ygm7Z9DmEvPPo2oKLmu593GbHeDnsZmxmyZ15shMNJgjqYdXJIPX6KOEyqGv06r3GcoAVgGpQQUmq+ZRPmpFvCwzTPQFqmWcaT4LpwDc2h473F7hww/uiJcmGM7rO9GZ9fsUt1uhkwbUYxUtTayah+N4yz6cKdw3CQ6ri/SVYvbxKx3wNE/Kwhusmu/lKozW2wOuNc4vslFrLvQE/XQqN51VSSNh6DRjieLEsnOFzUvxHHWwvMSQfNxtbFEwRM8UuoTFqd8FgjlrizAyQ0NHOg0qMTyQ/OlwVvBI/fJwfZQ6UH7WyTaePNWCzcdW9NRoE9H/eI01hATDkg9A9BUQ6ZNG+gD+kSAhmJPDX1h7triKy/q67YA/OuMg12Dw/C28ZZdX7YgQGiYletu187ZdTZcrAThz72NuGAqRTpoGhoqCBWT8Q4omN4jKrOJuQwB6jL+HirVPxsW8EMUYHAIxNohZcZdOpl/hCDIny19I3nIlMxMMA6hZ6zagEZ6RZkAUGR2N3zAdduGbBTc+S9Nu3oau19gmDu298UND6mvx7eQIHtKHSXNeF6WugitqWO1Xxd2kRQ2p6zoHVWcY3Hq3vzOgrk/IXziul73CbdBbuSNyPWm3Kfog0dNcdRuKtJtQ1odgdYwplGHQgOOzksAJJvI67E1SF746b5LtjOXGd5XNzHamTmqjP/3P5QIGBIlHisTFwcSzqQfpfOShir5Sj2Wq47Gsyy4C4GQyPqJHQZwsxOblD9blYb7eq7y1KsbNj4N/WOSQlr1g2zmjajw= X-Forefront-Antispam-Report: CIP:216.228.118.232;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge1.nvidia.com;CAT:NONE;SFS:(13230031)(82310400017)(1800799015)(36860700004)(376005);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 23 May 2024 03:19:55.4438 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: de019a6c-e93f-47a4-c6f2-08dc7ad737f8 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.232];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: MWH0EPF000989E9.namprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: CH3PR12MB8726 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240522_202005_810723_0A758BC7 X-CRM114-Status: GOOD ( 12.01 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The current __arm_lpae_unmap() function calls dma_sync() on individual PTEs after clearing them. By updating the __arm_lpae_unmap() to call dma_sync() once for all cleared PTEs, the overall performance can be improved 25% for large buffer sizes. Below is detailed analysis of average unmap latency(in us) with and without this optimization obtained by running dma_map_benchmark for different buffer sizes. Size Time W/O Time With % Improvement Optimization Optimization (us) (us) 4KB 3.0 3.1 -3.33 1MB 250.3 187.9 24.93 2MB 493.7 368.7 25.32 4MB 974.7 723.4 25.78 Signed-off-by: Ashish Mhetre --- drivers/iommu/io-pgtable-arm.c | 34 +++++++++++++++++++++++++--------- 1 file changed, 25 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 3d23b924cec1..94094b711cba 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -256,13 +256,15 @@ static void __arm_lpae_sync_pte(arm_lpae_iopte *ptep, int num_entries, sizeof(*ptep) * num_entries, DMA_TO_DEVICE); } -static void __arm_lpae_clear_pte(arm_lpae_iopte *ptep, struct io_pgtable_cfg *cfg) +static void __arm_lpae_clear_pte(arm_lpae_iopte *ptep, struct io_pgtable_cfg *cfg, int num_entries) { + int i; - *ptep = 0; + for (i = 0; i < num_entries; i++) + ptep[i] = 0; if (!cfg->coherent_walk) - __arm_lpae_sync_pte(ptep, 1, cfg); + __arm_lpae_sync_pte(ptep, num_entries, cfg); } static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data, @@ -633,13 +635,25 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data, if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) { max_entries = ARM_LPAE_PTES_PER_TABLE(data) - unmap_idx_start; num_entries = min_t(int, pgcount, max_entries); - - while (i < num_entries) { - pte = READ_ONCE(*ptep); + arm_lpae_iopte *pte_flush; + int j = 0; + + pte_flush = kvcalloc(num_entries, sizeof(*pte_flush), GFP_ATOMIC); + if (pte_flush) { + for (j = 0; j < num_entries; j++) { + pte_flush[j] = READ_ONCE(ptep[j]); + if (WARN_ON(!pte_flush[j])) + break; + } + __arm_lpae_clear_pte(ptep, &iop->cfg, j); + } + while (i < (pte_flush ? j : num_entries)) { + pte = pte_flush ? pte_flush[i] : READ_ONCE(*ptep); if (WARN_ON(!pte)) break; - __arm_lpae_clear_pte(ptep, &iop->cfg); + if (!pte_flush) + __arm_lpae_clear_pte(ptep, &iop->cfg, 1); if (!iopte_leaf(pte, lvl, iop->fmt)) { /* Also flush any partial walks */ @@ -649,10 +663,12 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data, } else if (!iommu_iotlb_gather_queued(gather)) { io_pgtable_tlb_add_page(iop, gather, iova + i * size, size); } - - ptep++; + if (!pte_flush) + ptep++; i++; } + if (pte_flush) + kvfree(pte_flush); return i * size; } else if (iopte_leaf(pte, lvl, iop->fmt)) {