From patchwork Tue Mar 9 06:22:01 2021
X-Patchwork-Submitter: Shenming Lu
X-Patchwork-Id: 12124209
From: Shenming Lu
To: Alex Williamson, Cornelia Huck, Will Deacon, Robin Murphy,
    Joerg Roedel, Jean-Philippe Brucker, Eric Auger
CC: Kevin Tian, Christoph Hellwig, Lu Baolu, Jonathan Cameron, Barry Song
Subject: [RFC PATCH v2 0/6] Add IOPF support for VFIO passthrough
Date: Tue, 9 Mar 2021 14:22:01 +0800
Message-ID: <20210309062207.505-1-lushenming@huawei.com>

Hi,

The static pinning
and mapping problem in VFIO and possible solutions have been discussed a
lot [1, 2]. One of the solutions is to add I/O page fault (IOPF) support
for VFIO devices. Unlike relatively complicated software approaches, such
as presenting a vIOMMU that provides the DMA buffer information (possibly
with para-virtualized optimizations), IOPF mainly depends on hardware
faulting capability, such as the PCIe PRI extension or the Arm SMMU stall
model. In addition, IOPF support in the IOMMU driver is being implemented
as part of SVA [3]. So in this series we add IOPF support for VFIO
passthrough based on the IOPF part of SVA.

We have measured its performance with UADK [4] (passing an accelerator
through to a VM) on a HiSilicon Kunpeng920 board:

Run hisi_sec_test...
 - with varying message lengths and numbers of sends ("times")
 - with/without stage 2 IOPF enabled

In the tables below, "relative speed" is the speed with IOPF divided by
the speed without IOPF.

when msg_len = 1MB and PREMAP_LEN (in patch 3) = 1:

           speed (KB/s)
 times     w/o IOPF     with IOPF (num of faults)     relative speed
 1         325596       119152 (518)                  36.6%
 100       7524985      5804659 (1058)                77.1%
 1000      8661817      8440209 (1071)                97.4%
 5000      8804512      8724368 (1216)                99.1%

If we use the same region to send messages, page faults occur almost only
on first access, so the more sends there are, the smaller the degradation.

when msg_len = 10MB and PREMAP_LEN = 512:

           speed (KB/s)
 times     w/o IOPF     with IOPF (num of faults)     relative speed
 1         1012758      682257 (13)                   67.4%
 100       8680688      8374154 (26)                  96.5%
 1000      8860861      8719918 (26)                  98.4%

We see that pre-mapping can help.

We also measured the performance of host SVA with the same params:

when msg_len = 1MB:

           speed (KB/s)
 times     w/o IOPF     with IOPF (num of faults)     relative speed
 1         951672       163866 (512)                  17.2%
 100       8691961      4529971 (1024)                52.1%
 1000      9158721      8376346 (1024)                91.5%
 5000      9184532      9008739 (1024)                98.1%

Besides, the average time spent in vfio_iommu_dev_fault_handler() (in
patch 3) is slightly less than in iopf_handle_group() (in SVA)
(1.6 us vs 2.0 us).

History:

v1 -> v2
 - Numerous improvements following the suggestions. Thanks a lot to all
   of you.
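
For readers checking the numbers: the percentage column in the tables above appears to be the with-IOPF speed divided by the without-IOPF speed, so higher values mean less slowdown. The short Python sketch below reproduces that column for the msg_len = 1MB, PREMAP_LEN = 1 table. The names `SAMPLES` and `relative_speed` are made up for this illustration and are not part of the series or of UADK.

```python
# Illustrative only: SAMPLES and relative_speed are hypothetical names,
# not taken from the patch series or from UADK.
# Throughput figures (KB/s) are copied from the msg_len = 1MB,
# PREMAP_LEN = 1 table: (number of sends, w/o IOPF, with IOPF).
SAMPLES = [
    (1, 325596, 119152),
    (100, 7524985, 5804659),
    (1000, 8661817, 8440209),
    (5000, 8804512, 8724368),
]


def relative_speed(without_iopf, with_iopf):
    """Return the with-IOPF speed as a percentage of the w/o-IOPF speed."""
    return 100.0 * with_iopf / without_iopf


if __name__ == "__main__":
    for times, base, iopf in SAMPLES:
        print(f"{times:>5} sends: {relative_speed(base, iopf):.1f}%")
```

The same function can be applied to the other two tables to compare VFIO stage 2 IOPF against host SVA at equal numbers of sends.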
Still TODO:
 - Add support for PRI.
 - Consider selective-faulting. (suggested by Kevin)
 ...

Links:
[1] Lesokhin I, et al. Page Fault Support for Network Controllers. In ASPLOS, 2016.
[2] Tian K, et al. coIOMMU: A Virtual IOMMU with Cooperative DMA Buffer Tracking
    for Efficient Memory Management in Direct I/O. In USENIX ATC, 2020.
[3] https://patchwork.kernel.org/project/linux-arm-kernel/cover/20210302092644.2553014-1-jean-philippe@linaro.org/
[4] https://github.com/Linaro/uadk

Any comments and suggestions are very welcome. :-)

Thanks,
Shenming


Shenming Lu (6):
  iommu: Evolve to support more scenarios of using IOPF
  vfio: Add an MMU notifier to avoid pinning
  vfio: Add a page fault handler
  vfio: VFIO_IOMMU_ENABLE_IOPF
  vfio: No need to statically pin and map if IOPF enabled
  vfio: Add nested IOPF support

 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |   3 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  11 +-
 drivers/iommu/io-pgfault.c                    |   4 -
 drivers/iommu/iommu.c                         |  56 ++-
 drivers/vfio/vfio.c                           | 118 +++++
 drivers/vfio/vfio_iommu_type1.c               | 446 +++++++++++++++++-
 include/linux/iommu.h                         |  21 +-
 include/linux/vfio.h                          |  14 +
 include/uapi/linux/iommu.h                    |   3 +
 include/uapi/linux/vfio.h                     |   6 +
 10 files changed, 661 insertions(+), 21 deletions(-)