From patchwork Thu Jan 28 15:17:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12053959 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CEA98C433E0 for ; Thu, 28 Jan 2021 15:19:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7A58164DE5 for ; Thu, 28 Jan 2021 15:19:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231879AbhA1PTY (ORCPT ); Thu, 28 Jan 2021 10:19:24 -0500 Received: from szxga06-in.huawei.com ([45.249.212.32]:11463 "EHLO szxga06-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231220AbhA1PSx (ORCPT ); Thu, 28 Jan 2021 10:18:53 -0500 Received: from DGGEMS413-HUB.china.huawei.com (unknown [172.30.72.59]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 4DRPG938VYzjDJD; Thu, 28 Jan 2021 23:17:05 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS413-HUB.china.huawei.com (10.3.19.213) with Microsoft SMTP Server id 14.3.498.0; Thu, 28 Jan 2021 23:17:54 +0800 From: Keqian Zhu To: , , , , , Will Deacon , "Alex Williamson" , Marc Zyngier , Catalin Marinas CC: Kirti Wankhede , Cornelia Huck , Mark Rutland , James Morse , "Robin Murphy" , Suzuki K Poulose , , , , Subject: [RFC PATCH 01/11] iommu/arm-smmu-v3: Add feature detection for HTTU Date: Thu, 28 Jan 2021 23:17:32 +0800 Message-ID: <20210128151742.18840-2-zhukeqian1@huawei.com> X-Mailer: 
git-send-email 2.8.4.windows.1 In-Reply-To: <20210128151742.18840-1-zhukeqian1@huawei.com> References: <20210128151742.18840-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: jiangkunkun An SMMU that supports HTTU (Hardware Translation Table Update) can update the access flag and the dirty state of a TTD (translation table descriptor) in hardware. This is essential for tracking pages dirtied by DMA. This patch adds feature detection only; no functional change. Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 ++++++++++++++++ drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 ++++++++ include/linux/io-pgtable.h | 1 + 3 files changed, 25 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 8ca7415d785d..0f0fe71cc10d 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1987,6 +1987,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain, .pgsize_bitmap = smmu->pgsize_bitmap, .ias = ias, .oas = oas, + .httu_hd = smmu->features & ARM_SMMU_FEAT_HTTU_HD, .coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENCY, .tlb = &arm_smmu_flush_ops, .iommu_dev = smmu->dev, @@ -3224,6 +3225,21 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) if (reg & IDR0_HYP) smmu->features |= ARM_SMMU_FEAT_HYP; + switch (FIELD_GET(IDR0_HTTU, reg)) { + case IDR0_HTTU_NONE: + break; + case IDR0_HTTU_HA: + smmu->features |= ARM_SMMU_FEAT_HTTU_HA; + break; + case IDR0_HTTU_HAD: + smmu->features |= ARM_SMMU_FEAT_HTTU_HA; + smmu->features |= ARM_SMMU_FEAT_HTTU_HD; + break; + default: + dev_err(smmu->dev, "unknown/unsupported HTTU!\n"); + return -ENXIO; + } + /* * The coherency feature as set by FW is used in preference to the ID * register, but warn on mismatch. 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 96c2e9565e00..e91bea44519e 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -33,6 +33,10 @@ #define IDR0_ASID16 (1 << 12) #define IDR0_ATS (1 << 10) #define IDR0_HYP (1 << 9) +#define IDR0_HTTU GENMASK(7, 6) +#define IDR0_HTTU_NONE 0 +#define IDR0_HTTU_HA 1 +#define IDR0_HTTU_HAD 2 #define IDR0_COHACC (1 << 4) #define IDR0_TTF GENMASK(3, 2) #define IDR0_TTF_AARCH64 2 @@ -286,6 +290,8 @@ #define CTXDESC_CD_0_TCR_TBI0 (1ULL << 38) #define CTXDESC_CD_0_AA64 (1UL << 41) +#define CTXDESC_CD_0_HD (1UL << 42) +#define CTXDESC_CD_0_HA (1UL << 43) #define CTXDESC_CD_0_S (1UL << 44) #define CTXDESC_CD_0_R (1UL << 45) #define CTXDESC_CD_0_A (1UL << 46) @@ -604,6 +610,8 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_RANGE_INV (1 << 15) #define ARM_SMMU_FEAT_BTM (1 << 16) #define ARM_SMMU_FEAT_SVA (1 << 17) +#define ARM_SMMU_FEAT_HTTU_HA (1 << 18) +#define ARM_SMMU_FEAT_HTTU_HD (1 << 19) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index ea727eb1a1a9..1a00ea8562c7 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -97,6 +97,7 @@ struct io_pgtable_cfg { unsigned long pgsize_bitmap; unsigned int ias; unsigned int oas; + bool httu_hd; bool coherent_walk; const struct iommu_flush_ops *tlb; struct device *iommu_dev; From patchwork Thu Jan 28 15:17:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12053963 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, 
MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF0E2C433DB for ; Thu, 28 Jan 2021 15:20:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8B47064DEB for ; Thu, 28 Jan 2021 15:20:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232233AbhA1PUg (ORCPT ); Thu, 28 Jan 2021 10:20:36 -0500 Received: from szxga06-in.huawei.com ([45.249.212.32]:11466 "EHLO szxga06-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231964AbhA1PTk (ORCPT ); Thu, 28 Jan 2021 10:19:40 -0500 Received: from DGGEMS413-HUB.china.huawei.com (unknown [172.30.72.59]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 4DRPG952hDzjDTX; Thu, 28 Jan 2021 23:17:05 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS413-HUB.china.huawei.com (10.3.19.213) with Microsoft SMTP Server id 14.3.498.0; Thu, 28 Jan 2021 23:17:55 +0800 From: Keqian Zhu To: , , , , , Will Deacon , "Alex Williamson" , Marc Zyngier , Catalin Marinas CC: Kirti Wankhede , Cornelia Huck , Mark Rutland , James Morse , "Robin Murphy" , Suzuki K Poulose , , , , Subject: [RFC PATCH 02/11] iommu/arm-smmu-v3: Enable HTTU for SMMU stage1 mapping Date: Thu, 28 Jan 2021 23:17:33 +0800 Message-ID: <20210128151742.18840-3-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210128151742.18840-1-zhukeqian1@huawei.com> References: <20210128151742.18840-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: jiangkunkun If HTTU is supported, we enable HA/HD bits in the SMMU CD (stage 1 mapping), and set DBM bit for writable TTD. 
The dirty state information is encoded using the access permission bits AP[2] (stage 1) or S2AP[1] (stage 2) in conjunction with the DBM (Dirty Bit Modifier) bit, where DBM means writable and AP[2]/ S2AP[1] means dirty. Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 5 +++++ drivers/iommu/io-pgtable-arm.c | 7 ++++++- 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 0f0fe71cc10d..8cc9d7536b08 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1036,6 +1036,11 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) | CTXDESC_CD_0_V; + if (smmu->features & ARM_SMMU_FEAT_HTTU_HA) + val |= CTXDESC_CD_0_HA; + if (smmu->features & ARM_SMMU_FEAT_HTTU_HD) + val |= CTXDESC_CD_0_HD; + /* STALL_MODEL==0b10 && CD.S==0 is ILLEGAL */ if (smmu->features & ARM_SMMU_FEAT_STALL_FORCE) val |= CTXDESC_CD_0_S; diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 87def58e79b5..e299a44808ae 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -72,6 +72,7 @@ #define ARM_LPAE_PTE_NSTABLE (((arm_lpae_iopte)1) << 63) #define ARM_LPAE_PTE_XN (((arm_lpae_iopte)3) << 53) +#define ARM_LPAE_PTE_DBM (((arm_lpae_iopte)1) << 51) #define ARM_LPAE_PTE_AF (((arm_lpae_iopte)1) << 10) #define ARM_LPAE_PTE_SH_NS (((arm_lpae_iopte)0) << 8) #define ARM_LPAE_PTE_SH_OS (((arm_lpae_iopte)2) << 8) @@ -81,7 +82,7 @@ #define ARM_LPAE_PTE_ATTR_LO_MASK (((arm_lpae_iopte)0x3ff) << 2) /* Ignore the contiguous bit for block splitting */ -#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)6) << 52) +#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)13) << 51) #define ARM_LPAE_PTE_ATTR_MASK (ARM_LPAE_PTE_ATTR_LO_MASK | \ ARM_LPAE_PTE_ATTR_HI_MASK) /* Software bit for solving 
coherency races */ @@ -379,6 +380,7 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova, static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, int prot) { + struct io_pgtable_cfg *cfg = &data->iop.cfg; arm_lpae_iopte pte; if (data->iop.fmt == ARM_64_LPAE_S1 || @@ -386,6 +388,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, pte = ARM_LPAE_PTE_nG; if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ)) pte |= ARM_LPAE_PTE_AP_RDONLY; + else if (cfg->httu_hd) + pte |= ARM_LPAE_PTE_DBM; + if (!(prot & IOMMU_PRIV)) pte |= ARM_LPAE_PTE_AP_UNPRIV; } else { From patchwork Thu Jan 28 15:17:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12053953 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6B510C433E0 for ; Thu, 28 Jan 2021 15:19:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1856064DE5 for ; Thu, 28 Jan 2021 15:19:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231690AbhA1PTF (ORCPT ); Thu, 28 Jan 2021 10:19:05 -0500 Received: from szxga06-in.huawei.com ([45.249.212.32]:11462 "EHLO szxga06-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231204AbhA1PSw (ORCPT ); Thu, 28 Jan 2021 10:18:52 -0500 Received: from DGGEMS413-HUB.china.huawei.com (unknown [172.30.72.59]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 
4DRPG93c68zjDPv; Thu, 28 Jan 2021 23:17:05 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS413-HUB.china.huawei.com (10.3.19.213) with Microsoft SMTP Server id 14.3.498.0; Thu, 28 Jan 2021 23:17:56 +0800 From: Keqian Zhu To: , , , , , Will Deacon , "Alex Williamson" , Marc Zyngier , Catalin Marinas CC: Kirti Wankhede , Cornelia Huck , Mark Rutland , James Morse , "Robin Murphy" , Suzuki K Poulose , , , , Subject: [RFC PATCH 03/11] iommu/arm-smmu-v3: Add feature detection for BBML Date: Thu, 28 Jan 2021 23:17:34 +0800 Message-ID: <20210128151742.18840-4-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210128151742.18840-1-zhukeqian1@huawei.com> References: <20210128151742.18840-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: jiangkunkun Altering a translation table descriptor (TTD) in certain ways, such as changing the block size, normally requires a break-before-make procedure. That can cause problems while the TTD is live, because the I/O streams might not tolerate translation faults. If the SMMU supports BBML level 1 or level 2, we can change the block size without using break-before-make. This patch adds feature detection for BBML only; no functional change. 
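As a rough illustration of the detection described above, the following standalone C sketch decodes a 2-bit BBML field from an IDR3 value. The bit positions (12:11) and the encodings 0/1/2 are taken from this patch; the helper name `decode_bbml` and the shift/mask macro names are hypothetical, and this user-space mock is not the driver code.

```c
#include <stdint.h>

/* Bit positions taken from this patch: IDR3.BBML occupies bits [12:11]. */
#define IDR3_BBML_SHIFT 11
#define IDR3_BBML_MASK  (0x3u << IDR3_BBML_SHIFT)

/*
 * Hypothetical helper (not part of the driver): returns the BBML level
 * 0, 1 or 2 for a known encoding, or -1 for the remaining encoding,
 * which the patch treats as unknown/unsupported.
 */
static int decode_bbml(uint32_t idr3)
{
	uint32_t field = (idr3 & IDR3_BBML_MASK) >> IDR3_BBML_SHIFT;

	return field <= 2 ? (int)field : -1;
}
```

A caller would map the returned level onto a feature flag and fail the probe on -1, which mirrors the switch statement in the diff below.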
Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 ++++++++++++++++++++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 6 ++++++ include/linux/io-pgtable.h | 1 + 3 files changed, 30 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 8cc9d7536b08..9208881a571c 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1947,7 +1947,7 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain, static int arm_smmu_domain_finalise(struct iommu_domain *domain, struct arm_smmu_master *master) { - int ret; + int ret, bbml; unsigned long ias, oas; enum io_pgtable_fmt fmt; struct io_pgtable_cfg pgtbl_cfg; @@ -1988,12 +1988,20 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain, return -EINVAL; } + if (smmu->features & ARM_SMMU_FEAT_BBML2) + bbml = 2; + else if (smmu->features & ARM_SMMU_FEAT_BBML1) + bbml = 1; + else + bbml = 0; + pgtbl_cfg = (struct io_pgtable_cfg) { .pgsize_bitmap = smmu->pgsize_bitmap, .ias = ias, .oas = oas, .httu_hd = smmu->features & ARM_SMMU_FEAT_HTTU_HD, .coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENCY, + .bbml = bbml, .tlb = &arm_smmu_flush_ops, .iommu_dev = smmu->dev, }; @@ -3328,6 +3336,20 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) /* IDR3 */ reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3); + switch (FIELD_GET(IDR3_BBML, reg)) { + case IDR3_BBML0: + break; + case IDR3_BBML1: + smmu->features |= ARM_SMMU_FEAT_BBML1; + break; + case IDR3_BBML2: + smmu->features |= ARM_SMMU_FEAT_BBML2; + break; + default: + dev_err(smmu->dev, "unknown/unsupported BBM behavior level\n"); + return -ENXIO; + } + if (FIELD_GET(IDR3_RIL, reg)) smmu->features |= ARM_SMMU_FEAT_RANGE_INV; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 
e91bea44519e..11e526ab7239 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -55,6 +55,10 @@ #define IDR1_SIDSIZE GENMASK(5, 0) #define ARM_SMMU_IDR3 0xc +#define IDR3_BBML GENMASK(12, 11) +#define IDR3_BBML0 0 +#define IDR3_BBML1 1 +#define IDR3_BBML2 2 #define IDR3_RIL (1 << 10) #define ARM_SMMU_IDR5 0x14 @@ -612,6 +616,8 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_SVA (1 << 17) #define ARM_SMMU_FEAT_HTTU_HA (1 << 18) #define ARM_SMMU_FEAT_HTTU_HD (1 << 19) +#define ARM_SMMU_FEAT_BBML1 (1 << 20) +#define ARM_SMMU_FEAT_BBML2 (1 << 21) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index 1a00ea8562c7..26583beeb5d9 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -99,6 +99,7 @@ struct io_pgtable_cfg { unsigned int oas; bool httu_hd; bool coherent_walk; + int bbml; const struct iommu_flush_ops *tlb; struct device *iommu_dev; From patchwork Thu Jan 28 15:17:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12053973 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.9 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BABDDC433E0 for ; Thu, 28 Jan 2021 15:22:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7DAEE64DEB for ; Thu, 28 Jan 2021 15:22:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via 
listexpand id S232476AbhA1PWd (ORCPT ); Thu, 28 Jan 2021 10:22:33 -0500 Received: from szxga06-in.huawei.com ([45.249.212.32]:11464 "EHLO szxga06-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229840AbhA1PSw (ORCPT ); Thu, 28 Jan 2021 10:18:52 -0500 Received: from DGGEMS413-HUB.china.huawei.com (unknown [172.30.72.59]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 4DRPG943JhzjDT7; Thu, 28 Jan 2021 23:17:05 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS413-HUB.china.huawei.com (10.3.19.213) with Microsoft SMTP Server id 14.3.498.0; Thu, 28 Jan 2021 23:17:57 +0800 From: Keqian Zhu To: , , , , , Will Deacon , "Alex Williamson" , Marc Zyngier , Catalin Marinas CC: Kirti Wankhede , Cornelia Huck , Mark Rutland , James Morse , "Robin Murphy" , Suzuki K Poulose , , , , Subject: [RFC PATCH 04/11] iommu/arm-smmu-v3: Split block descriptor to a span of page Date: Thu, 28 Jan 2021 23:17:35 +0800 Message-ID: <20210128151742.18840-5-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210128151742.18840-1-zhukeqian1@huawei.com> References: <20210128151742.18840-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: jiangkunkun A block descriptor is too coarse a granule for dirty log tracking. This adds a new interface named split_block to the iommu layer, and arm smmuv3 implements it by splitting a block descriptor into an equivalent span of page descriptors. No other interface is expected to be running while blocks are being split, so no race condition exists. We flush all IOTLBs once, after the split procedure completes, to ease the pressure on the iommu, since in general we will split a huge range of block mappings. 
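To make the phrase "equivalent span of page descriptors" concrete, here is a small standalone C sketch of the address arithmetic only. A block of `blk_size` bytes at physical address `blk_paddr` is replaced by `blk_size / page_size` entries, each covering one page at increasing IOVA and physical address. The names `split_span` and `struct span_entry` are hypothetical, and the sketch omits the attribute bits that real descriptors copy from the block and the level-by-level recursion used in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one page-level descriptor. */
struct span_entry {
	uint64_t iova;
	uint64_t paddr;
};

/*
 * Fill 'out' with the entries that cover the same range as one block.
 * Returns the number of entries written, or 0 if the sizes are not
 * compatible or 'out' is too small to hold the whole span.
 */
static size_t split_span(uint64_t iova, uint64_t blk_paddr,
			 uint64_t blk_size, uint64_t page_size,
			 struct span_entry *out, size_t out_len)
{
	size_t n, i;

	if (!page_size || blk_size < page_size || blk_size % page_size)
		return 0;

	n = (size_t)(blk_size / page_size);
	if (n > out_len)
		return 0;

	/* Entry i maps the i-th page of the original block. */
	for (i = 0; i < n; i++) {
		out[i].iova = iova + i * page_size;
		out[i].paddr = blk_paddr + i * page_size;
	}
	return n;
}
```

With a 2 MiB block and a 4 KiB granule this yields 512 entries, which is exactly one 4 KiB table of 8-byte descriptors.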
Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 20 ++++ drivers/iommu/io-pgtable-arm.c | 122 ++++++++++++++++++++ drivers/iommu/iommu.c | 40 +++++++ include/linux/io-pgtable.h | 2 + include/linux/iommu.h | 10 ++ 5 files changed, 194 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 9208881a571c..5469f4fca820 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2510,6 +2510,25 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain, return ret; } +static size_t arm_smmu_split_block(struct iommu_domain *domain, + unsigned long iova, size_t size) +{ + struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu; + struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops; + + if (!(smmu->features & (ARM_SMMU_FEAT_BBML1 | ARM_SMMU_FEAT_BBML2))) { + dev_err(smmu->dev, "don't support BBML1/2 and split block\n"); + return 0; + } + + if (!ops || !ops->split_block) { + pr_err("don't support split block\n"); + return 0; + } + + return ops->split_block(ops, iova, size); +} + static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args) { return iommu_fwspec_add_ids(dev, args->args, 1); @@ -2609,6 +2628,7 @@ static struct iommu_ops arm_smmu_ops = { .device_group = arm_smmu_device_group, .domain_get_attr = arm_smmu_domain_get_attr, .domain_set_attr = arm_smmu_domain_set_attr, + .split_block = arm_smmu_split_block, .of_xlate = arm_smmu_of_xlate, .get_resv_regions = arm_smmu_get_resv_regions, .put_resv_regions = generic_iommu_put_resv_regions, diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index e299a44808ae..f3b7f7115e38 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -79,6 +79,8 @@ #define ARM_LPAE_PTE_SH_IS (((arm_lpae_iopte)3) << 8) #define ARM_LPAE_PTE_NS (((arm_lpae_iopte)1) << 5) #define 
ARM_LPAE_PTE_VALID (((arm_lpae_iopte)1) << 0) +/* Block descriptor bits */ +#define ARM_LPAE_PTE_NT (((arm_lpae_iopte)1) << 16) #define ARM_LPAE_PTE_ATTR_LO_MASK (((arm_lpae_iopte)0x3ff) << 2) /* Ignore the contiguous bit for block splitting */ @@ -679,6 +681,125 @@ static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops, return iopte_to_paddr(pte, data) | iova; } +static size_t __arm_lpae_split_block(struct arm_lpae_io_pgtable *data, + unsigned long iova, size_t size, int lvl, + arm_lpae_iopte *ptep); + +static size_t arm_lpae_do_split_blk(struct arm_lpae_io_pgtable *data, + unsigned long iova, size_t size, + arm_lpae_iopte blk_pte, int lvl, + arm_lpae_iopte *ptep) +{ + struct io_pgtable_cfg *cfg = &data->iop.cfg; + arm_lpae_iopte pte, *tablep; + phys_addr_t blk_paddr; + size_t tablesz = ARM_LPAE_GRANULE(data); + size_t split_sz = ARM_LPAE_BLOCK_SIZE(lvl, data); + int i; + + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) + return 0; + + tablep = __arm_lpae_alloc_pages(tablesz, GFP_ATOMIC, cfg); + if (!tablep) + return 0; + + blk_paddr = iopte_to_paddr(blk_pte, data); + pte = iopte_prot(blk_pte); + for (i = 0; i < tablesz / sizeof(pte); i++, blk_paddr += split_sz) + __arm_lpae_init_pte(data, blk_paddr, pte, lvl, &tablep[i]); + + if (cfg->bbml == 1) { + /* Race does not exist */ + blk_pte |= ARM_LPAE_PTE_NT; + __arm_lpae_set_pte(ptep, blk_pte, cfg); + io_pgtable_tlb_flush_walk(&data->iop, iova, size, size); + } + /* Race does not exist */ + pte = arm_lpae_install_table(tablep, ptep, blk_pte, cfg); + + /* Have splited it into page? 
*/ + if (lvl == (ARM_LPAE_MAX_LEVELS - 1)) + return size; + + /* Go back to lvl - 1 */ + ptep -= ARM_LPAE_LVL_IDX(iova, lvl - 1, data); + return __arm_lpae_split_block(data, iova, size, lvl - 1, ptep); +} + +static size_t __arm_lpae_split_block(struct arm_lpae_io_pgtable *data, + unsigned long iova, size_t size, int lvl, + arm_lpae_iopte *ptep) +{ + arm_lpae_iopte pte; + struct io_pgtable *iop = &data->iop; + size_t base, next_size, total_size; + + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) + return 0; + + ptep += ARM_LPAE_LVL_IDX(iova, lvl, data); + pte = READ_ONCE(*ptep); + if (WARN_ON(!pte)) + return 0; + + if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) { + if (iopte_leaf(pte, lvl, iop->fmt)) { + if (lvl == (ARM_LPAE_MAX_LEVELS - 1) || + (pte & ARM_LPAE_PTE_AP_RDONLY)) + return size; + + /* We find a writable block, split it. */ + return arm_lpae_do_split_blk(data, iova, size, pte, + lvl + 1, ptep); + } else { + /* If it is the last table level, then nothing to do */ + if (lvl == (ARM_LPAE_MAX_LEVELS - 2)) + return size; + + total_size = 0; + next_size = ARM_LPAE_BLOCK_SIZE(lvl + 1, data); + ptep = iopte_deref(pte, data); + for (base = 0; base < size; base += next_size) + total_size += __arm_lpae_split_block(data, + iova + base, next_size, lvl + 1, + ptep); + return total_size; + } + } else if (iopte_leaf(pte, lvl, iop->fmt)) { + WARN(1, "Can't split behind a block.\n"); + return 0; + } + + /* Keep on walkin */ + ptep = iopte_deref(pte, data); + return __arm_lpae_split_block(data, iova, size, lvl + 1, ptep); +} + +static size_t arm_lpae_split_block(struct io_pgtable_ops *ops, + unsigned long iova, size_t size) +{ + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); + arm_lpae_iopte *ptep = data->pgd; + struct io_pgtable_cfg *cfg = &data->iop.cfg; + int lvl = data->start_level; + long iaext = (s64)iova >> cfg->ias; + + if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size)) + return 0; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) + iaext = ~iaext; 
+ if (WARN_ON(iaext)) + return 0; + + /* If it is smallest granule, then nothing to do */ + if (size == ARM_LPAE_BLOCK_SIZE(ARM_LPAE_MAX_LEVELS - 1, data)) + return size; + + return __arm_lpae_split_block(data, iova, size, lvl, ptep); +} + static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg) { unsigned long granule, page_sizes; @@ -757,6 +878,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg) .map = arm_lpae_map, .unmap = arm_lpae_unmap, .iova_to_phys = arm_lpae_iova_to_phys, + .split_block = arm_lpae_split_block, }; return data; diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index ffeebda8d6de..7dc0850448c3 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2707,6 +2707,46 @@ int iommu_domain_set_attr(struct iommu_domain *domain, } EXPORT_SYMBOL_GPL(iommu_domain_set_attr); +size_t iommu_split_block(struct iommu_domain *domain, unsigned long iova, + size_t size) +{ + const struct iommu_ops *ops = domain->ops; + unsigned int min_pagesz; + size_t pgsize, splited_size; + size_t splited = 0; + + min_pagesz = 1 << __ffs(domain->pgsize_bitmap); + + if (!IS_ALIGNED(iova | size, min_pagesz)) { + pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n", + iova, size, min_pagesz); + return 0; + } + + if (!ops || !ops->split_block) { + pr_err("don't support split block\n"); + return 0; + } + + while (size) { + pgsize = iommu_pgsize(domain, iova, size); + + splited_size = ops->split_block(domain, iova, pgsize); + + pr_debug("splited: iova 0x%lx size 0x%zx\n", iova, splited_size); + iova += splited_size; + size -= splited_size; + splited += splited_size; + + if (splited_size != pgsize) + break; + } + iommu_flush_iotlb_all(domain); + + return splited; +} +EXPORT_SYMBOL_GPL(iommu_split_block); + void iommu_get_resv_regions(struct device *dev, struct list_head *list) { const struct iommu_ops *ops = dev->bus->iommu_ops; diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index 26583beeb5d9..b87c6f4ecaa2 100644 --- 
a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -162,6 +162,8 @@ struct io_pgtable_ops { size_t size, struct iommu_iotlb_gather *gather); phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops, unsigned long iova); + size_t (*split_block)(struct io_pgtable_ops *ops, unsigned long iova, + size_t size); }; /** diff --git a/include/linux/iommu.h b/include/linux/iommu.h index b3f0e2018c62..abeb811098a5 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -258,6 +258,8 @@ struct iommu_ops { enum iommu_attr attr, void *data); int (*domain_set_attr)(struct iommu_domain *domain, enum iommu_attr attr, void *data); + size_t (*split_block)(struct iommu_domain *domain, unsigned long iova, + size_t size); /* Request/Free a list of reserved regions for a device */ void (*get_resv_regions)(struct device *dev, struct list_head *list); @@ -509,6 +511,8 @@ extern int iommu_domain_get_attr(struct iommu_domain *domain, enum iommu_attr, void *data); extern int iommu_domain_set_attr(struct iommu_domain *domain, enum iommu_attr, void *data); +extern size_t iommu_split_block(struct iommu_domain *domain, unsigned long iova, + size_t size); /* Window handling function prototypes */ extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr, @@ -903,6 +907,12 @@ static inline int iommu_domain_set_attr(struct iommu_domain *domain, return -EINVAL; } +static inline size_t iommu_split_block(struct iommu_domain *domain, + unsigned long iova, size_t size) +{ + return 0; +} + static inline int iommu_device_register(struct iommu_device *iommu) { return -ENODEV; From patchwork Thu Jan 28 15:17:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12053977 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, 
HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0560BC433DB for ; Thu, 28 Jan 2021 15:25:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B801A64DF3 for ; Thu, 28 Jan 2021 15:25:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231882AbhA1PZY (ORCPT ); Thu, 28 Jan 2021 10:25:24 -0500 Received: from szxga06-in.huawei.com ([45.249.212.32]:11461 "EHLO szxga06-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229785AbhA1PSv (ORCPT ); Thu, 28 Jan 2021 10:18:51 -0500 Received: from DGGEMS413-HUB.china.huawei.com (unknown [172.30.72.59]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 4DRPG92dTFzjCyD; Thu, 28 Jan 2021 23:17:05 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS413-HUB.china.huawei.com (10.3.19.213) with Microsoft SMTP Server id 14.3.498.0; Thu, 28 Jan 2021 23:17:58 +0800 From: Keqian Zhu To: , , , , , Will Deacon , "Alex Williamson" , Marc Zyngier , Catalin Marinas CC: Kirti Wankhede , Cornelia Huck , Mark Rutland , James Morse , "Robin Murphy" , Suzuki K Poulose , , , , Subject: [RFC PATCH 05/11] iommu/arm-smmu-v3: Merge a span of page to block descriptor Date: Thu, 28 Jan 2021 23:17:36 +0800 Message-ID: <20210128151742.18840-6-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210128151742.18840-1-zhukeqian1@huawei.com> References: <20210128151742.18840-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: jiangkunkun When dirty log tracking stops, we need to recover all block descriptors 
that were split when dirty log tracking started. This adds a new interface named merge_page to the iommu layer, and arm smmuv3 implements it by reinstalling block mappings and unmapping the span of page mappings. It is the caller's duty to find contiguous physical memory. No other interface is expected to be running while pages are being merged, so no race condition exists. We flush all IOTLBs once, after the merge procedure completes, to ease the pressure on the iommu, since in general we will merge a huge range of page mappings. Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 20 ++++++ drivers/iommu/io-pgtable-arm.c | 78 +++++++++++++++++++++ drivers/iommu/iommu.c | 75 ++++++++++++++++++++ include/linux/io-pgtable.h | 2 + include/linux/iommu.h | 10 +++ 5 files changed, 185 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 5469f4fca820..2434519e4bb6 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2529,6 +2529,25 @@ static size_t arm_smmu_split_block(struct iommu_domain *domain, return ops->split_block(ops, iova, size); } +static size_t arm_smmu_merge_page(struct iommu_domain *domain, unsigned long iova, + phys_addr_t paddr, size_t size, int prot) +{ + struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops; + struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu; + + if (!(smmu->features & (ARM_SMMU_FEAT_BBML1 | ARM_SMMU_FEAT_BBML2))) { + dev_err(smmu->dev, "don't support BBML1/2 and merge page\n"); + return 0; + } + + if (!ops || !ops->merge_page) { + pr_err("don't support merge page\n"); + return 0; + } + + return ops->merge_page(ops, iova, paddr, size, prot); +} + static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args) { return iommu_fwspec_add_ids(dev, args->args, 1); @@ -2629,6 +2648,7 @@ static struct iommu_ops arm_smmu_ops = { .domain_get_attr = 
arm_smmu_domain_get_attr, .domain_set_attr = arm_smmu_domain_set_attr, .split_block = arm_smmu_split_block, + .merge_page = arm_smmu_merge_page, .of_xlate = arm_smmu_of_xlate, .get_resv_regions = arm_smmu_get_resv_regions, .put_resv_regions = generic_iommu_put_resv_regions, diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index f3b7f7115e38..17390f258eb1 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -800,6 +800,83 @@ static size_t arm_lpae_split_block(struct io_pgtable_ops *ops, return __arm_lpae_split_block(data, iova, size, lvl, ptep); } +static size_t __arm_lpae_merge_page(struct arm_lpae_io_pgtable *data, + unsigned long iova, phys_addr_t paddr, + size_t size, int lvl, arm_lpae_iopte *ptep, + arm_lpae_iopte prot) +{ + arm_lpae_iopte pte, *tablep; + struct io_pgtable *iop = &data->iop; + struct io_pgtable_cfg *cfg = &data->iop.cfg; + + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) + return 0; + + ptep += ARM_LPAE_LVL_IDX(iova, lvl, data); + pte = READ_ONCE(*ptep); + if (WARN_ON(!pte)) + return 0; + + if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) { + if (iopte_leaf(pte, lvl, iop->fmt)) + return size; + + /* Race does not exist */ + if (cfg->bbml == 1) { + prot |= ARM_LPAE_PTE_NT; + __arm_lpae_init_pte(data, paddr, prot, lvl, ptep); + io_pgtable_tlb_flush_walk(iop, iova, size, + ARM_LPAE_GRANULE(data)); + + prot &= ~(ARM_LPAE_PTE_NT); + __arm_lpae_init_pte(data, paddr, prot, lvl, ptep); + } else { + __arm_lpae_init_pte(data, paddr, prot, lvl, ptep); + } + + tablep = iopte_deref(pte, data); + __arm_lpae_free_pgtable(data, lvl + 1, tablep); + return size; + } else if (iopte_leaf(pte, lvl, iop->fmt)) { + /* The size is too small, already merged */ + return size; + } + + /* Keep on walkin */ + ptep = iopte_deref(pte, data); + return __arm_lpae_merge_page(data, iova, paddr, size, lvl + 1, ptep, prot); +} + +static size_t arm_lpae_merge_page(struct io_pgtable_ops *ops, unsigned long iova, + phys_addr_t paddr, 
size_t size, int iommu_prot) +{ + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); + struct io_pgtable_cfg *cfg = &data->iop.cfg; + arm_lpae_iopte *ptep = data->pgd; + int lvl = data->start_level; + arm_lpae_iopte prot; + long iaext = (s64)iova >> cfg->ias; + + /* If no access, then nothing to do */ + if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE))) + return size; + + if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size)) + return 0; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) + iaext = ~iaext; + if (WARN_ON(iaext || paddr >> cfg->oas)) + return 0; + + /* If it is smallest granule, then nothing to do */ + if (size == ARM_LPAE_BLOCK_SIZE(ARM_LPAE_MAX_LEVELS - 1, data)) + return size; + + prot = arm_lpae_prot_to_pte(data, iommu_prot); + return __arm_lpae_merge_page(data, iova, paddr, size, lvl, ptep, prot); +} + static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg) { unsigned long granule, page_sizes; @@ -879,6 +956,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg) .unmap = arm_lpae_unmap, .iova_to_phys = arm_lpae_iova_to_phys, .split_block = arm_lpae_split_block, + .merge_page = arm_lpae_merge_page, }; return data; diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 7dc0850448c3..f1261da11ea8 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2747,6 +2747,81 @@ size_t iommu_split_block(struct iommu_domain *domain, unsigned long iova, } EXPORT_SYMBOL_GPL(iommu_split_block); +static size_t __iommu_merge_page(struct iommu_domain *domain, unsigned long iova, + phys_addr_t paddr, size_t size, int prot) +{ + const struct iommu_ops *ops = domain->ops; + unsigned int min_pagesz; + size_t pgsize, merged_size; + size_t merged = 0; + + min_pagesz = 1 << __ffs(domain->pgsize_bitmap); + + if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) { + pr_err("unaligned: iova 0x%lx pa %pa size 0x%zx min_pagesz 0x%x\n", + iova, &paddr, size, min_pagesz); + return 0; + } + + if (!ops || !ops->merge_page) { + pr_err("don't 
support merge page\n"); + return 0; + } + + while (size) { + pgsize = iommu_pgsize(domain, iova | paddr, size); + + merged_size = ops->merge_page(domain, iova, paddr, pgsize, prot); + + pr_debug("merged: iova 0x%lx pa %pa size 0x%zx\n", iova, &paddr, + merged_size); + iova += merged_size; + paddr += merged_size; + size -= merged_size; + merged += merged_size; + + if (merged_size != pgsize) + break; + } + + return merged; +} + +size_t iommu_merge_page(struct iommu_domain *domain, unsigned long iova, + size_t size, int prot) +{ + phys_addr_t phys; + dma_addr_t p, i; + size_t cont_size, merged_size; + size_t merged = 0; + + while (size) { + phys = iommu_iova_to_phys(domain, iova); + cont_size = PAGE_SIZE; + p = phys + cont_size; + i = iova + cont_size; + + while (cont_size < size && p == iommu_iova_to_phys(domain, i)) { + p += PAGE_SIZE; + i += PAGE_SIZE; + cont_size += PAGE_SIZE; + } + + merged_size = __iommu_merge_page(domain, iova, phys, cont_size, + prot); + iova += merged_size; + size -= merged_size; + merged += merged_size; + + if (merged_size != cont_size) + break; + } + iommu_flush_iotlb_all(domain); + + return merged; +} +EXPORT_SYMBOL_GPL(iommu_merge_page); + void iommu_get_resv_regions(struct device *dev, struct list_head *list) { const struct iommu_ops *ops = dev->bus->iommu_ops; diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index b87c6f4ecaa2..754b62a1bbaf 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -164,6 +164,8 @@ struct io_pgtable_ops { unsigned long iova); size_t (*split_block)(struct io_pgtable_ops *ops, unsigned long iova, size_t size); + size_t (*merge_page)(struct io_pgtable_ops *ops, unsigned long iova, + phys_addr_t phys, size_t size, int prot); }; /** diff --git a/include/linux/iommu.h b/include/linux/iommu.h index abeb811098a5..ac2b0b1bce0f 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -260,6 +260,8 @@ struct iommu_ops { enum iommu_attr attr, void *data); size_t 
(*split_block)(struct iommu_domain *domain, unsigned long iova, size_t size); + size_t (*merge_page)(struct iommu_domain *domain, unsigned long iova, + phys_addr_t phys, size_t size, int prot); /* Request/Free a list of reserved regions for a device */ void (*get_resv_regions)(struct device *dev, struct list_head *list); @@ -513,6 +515,8 @@ extern int iommu_domain_set_attr(struct iommu_domain *domain, enum iommu_attr, void *data); extern size_t iommu_split_block(struct iommu_domain *domain, unsigned long iova, size_t size); +extern size_t iommu_merge_page(struct iommu_domain *domain, unsigned long iova, + size_t size, int prot); /* Window handling function prototypes */ extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr, @@ -913,6 +917,12 @@ static inline size_t iommu_split_block(struct iommu_domain *domain, return 0; } +static inline size_t iommu_merge_page(struct iommu_domain *domain, + unsigned long iova, size_t size, int prot) +{ + return -EINVAL; +} + static inline int iommu_device_register(struct iommu_device *iommu) { return -ENODEV; From patchwork Thu Jan 28 15:17:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12053957 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7890AC433E9 for ; Thu, 28 Jan 2021 15:19:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 36D1364DE5 for ; Thu, 28 Jan 2021 15:19:25 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231805AbhA1PTM (ORCPT ); Thu, 28 Jan 2021 10:19:12 -0500 Received: from szxga06-in.huawei.com ([45.249.212.32]:11465 "EHLO szxga06-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231233AbhA1PSw (ORCPT ); Thu, 28 Jan 2021 10:18:52 -0500 Received: from DGGEMS413-HUB.china.huawei.com (unknown [172.30.72.59]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 4DRPG94XdjzjDTR; Thu, 28 Jan 2021 23:17:05 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS413-HUB.china.huawei.com (10.3.19.213) with Microsoft SMTP Server id 14.3.498.0; Thu, 28 Jan 2021 23:17:58 +0800 From: Keqian Zhu To: , , , , , Will Deacon , "Alex Williamson" , Marc Zyngier , Catalin Marinas CC: Kirti Wankhede , Cornelia Huck , Mark Rutland , James Morse , "Robin Murphy" , Suzuki K Poulose , , , , Subject: [RFC PATCH 06/11] iommu/arm-smmu-v3: Scan leaf TTD to sync hardware dirty log Date: Thu, 28 Jan 2021 23:17:37 +0800 Message-ID: <20210128151742.18840-7-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210128151742.18840-1-zhukeqian1@huawei.com> References: <20210128151742.18840-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: jiangkunkun During dirty log tracking, the user will try to retrieve the dirty log from the iommu if it supports hardware dirty logging. This adds a new interface named sync_dirty_log to the iommu layer, and arm smmuv3 implements it by scanning leaf TTDs and treating a TTD as dirty if it is writable (as we only enable HTTU for stage 1, this means checking that AP[2] is not set). 
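For illustration only (not part of the patch): a minimal userspace sketch of the leaf case described above, assuming a stage-1 descriptor in which AP[2] is bit 7. All demo_* functions and DEMO_* macros are hypothetical stand-ins for the kernel helpers (bitmap_set, test_bit, ARM_LPAE_PTE_AP_RDONLY), not kernel API. It shows how a writable leaf covering `size` bytes at `iova` marks `size >> bitmap_pgshift` bits starting at `(iova - base_iova) >> bitmap_pgshift`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: AP[2] of a stage-1 descriptor, assumed to be bit 7. */
#define DEMO_PTE_AP_RDONLY (UINT64_C(1) << 7)
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Set nbits consecutive bits starting at bit index start. */
static void demo_bitmap_set(unsigned long *bitmap, unsigned long start,
			    unsigned long nbits)
{
	unsigned long i;

	for (i = start; i < start + nbits; i++)
		bitmap[i / DEMO_BITS_PER_LONG] |= 1UL << (i % DEMO_BITS_PER_LONG);
}

/* Return 1 if the given bit index is set in the bitmap, 0 otherwise. */
static int demo_test_bit(const unsigned long *bitmap, unsigned long bit)
{
	return (bitmap[bit / DEMO_BITS_PER_LONG] >> (bit % DEMO_BITS_PER_LONG)) & 1UL;
}

/*
 * Leaf case of the scan: a leaf that is still read-only (AP[2] set) is
 * treated as clean; a writable leaf marks every bitmap page it covers.
 */
static void demo_sync_leaf(uint64_t pte, unsigned long iova, size_t size,
			   unsigned long *bitmap, unsigned long base_iova,
			   unsigned long bitmap_pgshift)
{
	assert(iova >= base_iova);

	if (pte & DEMO_PTE_AP_RDONLY)
		return;

	demo_bitmap_set(bitmap, (iova - base_iova) >> bitmap_pgshift,
			size >> bitmap_pgshift);
}
```

With a 4K bitmap granule (bitmap_pgshift of 12), an 8K writable leaf at base_iova + 0x3000 would set bits 3 and 4 of the bitmap under these assumptions.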
Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 27 +++++++ drivers/iommu/io-pgtable-arm.c | 90 +++++++++++++++++++++ drivers/iommu/iommu.c | 41 ++++++++++ include/linux/io-pgtable.h | 4 + include/linux/iommu.h | 17 ++++ 5 files changed, 179 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 2434519e4bb6..43d0536b429a 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2548,6 +2548,32 @@ static size_t arm_smmu_merge_page(struct iommu_domain *domain, unsigned long iov return ops->merge_page(ops, iova, paddr, size, prot); } +static int arm_smmu_sync_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops; + struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu; + + if (!(smmu->features & ARM_SMMU_FEAT_HTTU_HD)) { + dev_err(smmu->dev, "don't support HTTU_HD and sync dirty log\n"); + return -EPERM; + } + + if (!ops || !ops->sync_dirty_log) { + pr_err("don't support sync dirty log\n"); + return -ENODEV; + } + + /* To ensure all inflight transactions are completed */ + arm_smmu_flush_iotlb_all(domain); + + return ops->sync_dirty_log(ops, iova, size, bitmap, + base_iova, bitmap_pgshift); +} + static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args) { return iommu_fwspec_add_ids(dev, args->args, 1); @@ -2649,6 +2675,7 @@ static struct iommu_ops arm_smmu_ops = { .domain_set_attr = arm_smmu_domain_set_attr, .split_block = arm_smmu_split_block, .merge_page = arm_smmu_merge_page, + .sync_dirty_log = arm_smmu_sync_dirty_log, .of_xlate = arm_smmu_of_xlate, .get_resv_regions = arm_smmu_get_resv_regions, .put_resv_regions = generic_iommu_put_resv_regions, diff --git a/drivers/iommu/io-pgtable-arm.c 
b/drivers/iommu/io-pgtable-arm.c index 17390f258eb1..6cfe1ef3fedd 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -877,6 +877,95 @@ static size_t arm_lpae_merge_page(struct io_pgtable_ops *ops, unsigned long iova return __arm_lpae_merge_page(data, iova, paddr, size, lvl, ptep, prot); } +static int __arm_lpae_sync_dirty_log(struct arm_lpae_io_pgtable *data, + unsigned long iova, size_t size, + int lvl, arm_lpae_iopte *ptep, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + arm_lpae_iopte pte; + struct io_pgtable *iop = &data->iop; + size_t base, next_size; + unsigned long offset; + int nbits, ret; + + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) + return -EINVAL; + + ptep += ARM_LPAE_LVL_IDX(iova, lvl, data); + pte = READ_ONCE(*ptep); + if (WARN_ON(!pte)) + return -EINVAL; + + if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) { + if (iopte_leaf(pte, lvl, iop->fmt)) { + if (pte & ARM_LPAE_PTE_AP_RDONLY) + return 0; + + /* It is writable, set the bitmap */ + nbits = size >> bitmap_pgshift; + offset = (iova - base_iova) >> bitmap_pgshift; + bitmap_set(bitmap, offset, nbits); + return 0; + } else { + /* To traverse next level */ + next_size = ARM_LPAE_BLOCK_SIZE(lvl + 1, data); + ptep = iopte_deref(pte, data); + for (base = 0; base < size; base += next_size) { + ret = __arm_lpae_sync_dirty_log(data, + iova + base, next_size, lvl + 1, + ptep, bitmap, base_iova, bitmap_pgshift); + if (ret) + return ret; + } + return 0; + } + } else if (iopte_leaf(pte, lvl, iop->fmt)) { + if (pte & ARM_LPAE_PTE_AP_RDONLY) + return 0; + + /* Though the size is too small, also set bitmap */ + nbits = size >> bitmap_pgshift; + offset = (iova - base_iova) >> bitmap_pgshift; + bitmap_set(bitmap, offset, nbits); + return 0; + } + + /* Keep on walkin */ + ptep = iopte_deref(pte, data); + return __arm_lpae_sync_dirty_log(data, iova, size, lvl + 1, ptep, + bitmap, base_iova, bitmap_pgshift); +} + +static int 
arm_lpae_sync_dirty_log(struct io_pgtable_ops *ops, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); + arm_lpae_iopte *ptep = data->pgd; + int lvl = data->start_level; + struct io_pgtable_cfg *cfg = &data->iop.cfg; + long iaext = (s64)iova >> cfg->ias; + + if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size)) + return -EINVAL; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) + iaext = ~iaext; + if (WARN_ON(iaext)) + return -EINVAL; + + if (data->iop.fmt != ARM_64_LPAE_S1 && + data->iop.fmt != ARM_32_LPAE_S1) + return -EINVAL; + + return __arm_lpae_sync_dirty_log(data, iova, size, lvl, ptep, + bitmap, base_iova, bitmap_pgshift); +} + static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg) { unsigned long granule, page_sizes; @@ -957,6 +1046,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg) .iova_to_phys = arm_lpae_iova_to_phys, .split_block = arm_lpae_split_block, .merge_page = arm_lpae_merge_page, + .sync_dirty_log = arm_lpae_sync_dirty_log, }; return data; diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index f1261da11ea8..69f268069282 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2822,6 +2822,47 @@ size_t iommu_merge_page(struct iommu_domain *domain, unsigned long iova, } EXPORT_SYMBOL_GPL(iommu_merge_page); +int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova, + size_t size, unsigned long *bitmap, + unsigned long base_iova, unsigned long bitmap_pgshift) +{ + const struct iommu_ops *ops = domain->ops; + unsigned int min_pagesz; + size_t pgsize; + int ret = 0; + + min_pagesz = 1 << __ffs(domain->pgsize_bitmap); + + if (!IS_ALIGNED(iova | size, min_pagesz)) { + pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n", + iova, size, min_pagesz); + return -EINVAL; + } + + if (!ops || !ops->sync_dirty_log) { + pr_err("don't support sync dirty log\n"); + 
return -ENODEV; + } + + while (size) { + pgsize = iommu_pgsize(domain, iova, size); + + ret = ops->sync_dirty_log(domain, iova, pgsize, + bitmap, base_iova, bitmap_pgshift); + if (ret) + break; + + pr_debug("dirty_log_sync: iova 0x%lx pagesz 0x%zx\n", iova, + pgsize); + + iova += pgsize; + size -= pgsize; + } + + return ret; +} +EXPORT_SYMBOL_GPL(iommu_sync_dirty_log); + void iommu_get_resv_regions(struct device *dev, struct list_head *list) { const struct iommu_ops *ops = dev->bus->iommu_ops; diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index 754b62a1bbaf..f44551e4a454 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -166,6 +166,10 @@ struct io_pgtable_ops { size_t size); size_t (*merge_page)(struct io_pgtable_ops *ops, unsigned long iova, phys_addr_t phys, size_t size, int prot); + int (*sync_dirty_log)(struct io_pgtable_ops *ops, + unsigned long iova, size_t size, + unsigned long *bitmap, unsigned long base_iova, + unsigned long bitmap_pgshift); }; /** diff --git a/include/linux/iommu.h b/include/linux/iommu.h index ac2b0b1bce0f..8069c8375e63 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -262,6 +262,10 @@ struct iommu_ops { size_t size); size_t (*merge_page)(struct iommu_domain *domain, unsigned long iova, phys_addr_t phys, size_t size, int prot); + int (*sync_dirty_log)(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, unsigned long base_iova, + unsigned long bitmap_pgshift); /* Request/Free a list of reserved regions for a device */ void (*get_resv_regions)(struct device *dev, struct list_head *list); @@ -517,6 +521,10 @@ extern size_t iommu_split_block(struct iommu_domain *domain, unsigned long iova, size_t size); extern size_t iommu_merge_page(struct iommu_domain *domain, unsigned long iova, size_t size, int prot); +extern int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova, + size_t size, unsigned long *bitmap, + unsigned long 
base_iova, + unsigned long bitmap_pgshift); /* Window handling function prototypes */ extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr, @@ -923,6 +931,15 @@ static inline size_t iommu_merge_page(struct iommu_domain *domain, return -EINVAL; } +static inline int iommu_sync_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long pgshift) +{ + return -EINVAL; +} + static inline int iommu_device_register(struct iommu_device *iommu) { return -ENODEV; From patchwork Thu Jan 28 15:17:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12053969 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C9716C433DB for ; Thu, 28 Jan 2021 15:22:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8FF9D64DE5 for ; Thu, 28 Jan 2021 15:22:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232304AbhA1PWP (ORCPT ); Thu, 28 Jan 2021 10:22:15 -0500 Received: from szxga07-in.huawei.com ([45.249.212.35]:12343 "EHLO szxga07-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231328AbhA1PS4 (ORCPT ); Thu, 28 Jan 2021 10:18:56 -0500 Received: from DGGEMS413-HUB.china.huawei.com (unknown [172.30.72.60]) by szxga07-in.huawei.com (SkyGuard) with ESMTP id 4DRPG04vvbz7btD; Thu, 28 Jan 2021 23:16:56 +0800 (CST) Received: from 
DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS413-HUB.china.huawei.com (10.3.19.213) with Microsoft SMTP Server id 14.3.498.0; Thu, 28 Jan 2021 23:17:59 +0800 From: Keqian Zhu To: , , , , , Will Deacon , "Alex Williamson" , Marc Zyngier , Catalin Marinas CC: Kirti Wankhede , Cornelia Huck , Mark Rutland , James Morse , "Robin Murphy" , Suzuki K Poulose , , , , Subject: [RFC PATCH 07/11] iommu/arm-smmu-v3: Clear dirty log according to bitmap Date: Thu, 28 Jan 2021 23:17:38 +0800 Message-ID: <20210128151742.18840-8-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210128151742.18840-1-zhukeqian1@huawei.com> References: <20210128151742.18840-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: jiangkunkun After the dirty log is retrieved, the user should clear it to re-enable dirty log tracking for the dirtied pages. This adds a new interface named clear_dirty_log, and arm smmuv3 implements it by clearing the dirty state (as we only enable HTTU for stage 1, this means setting the AP[2] bit) of the TTDs specified by the user-provided bitmap. 
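For illustration only (not part of the patch): a minimal userspace sketch of the leaf case of clearing, assuming a stage-1 descriptor in which AP[2] is bit 7. The demo_* functions and DEMO_* macros are hypothetical stand-ins, not kernel API. It mirrors the rule in the patch that a writable leaf is made read-only again only when every bitmap bit it covers is set; otherwise the descriptor is left unchanged.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: AP[2] of a stage-1 descriptor, assumed to be bit 7. */
#define DEMO_PTE_AP_RDONLY (UINT64_C(1) << 7)
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return true if the given bit index is set in the bitmap. */
static bool demo_test_bit(const unsigned long *bitmap, unsigned long bit)
{
	return (bitmap[bit / DEMO_BITS_PER_LONG] >> (bit % DEMO_BITS_PER_LONG)) & 1UL;
}

/*
 * Leaf case of the clear: returns the descriptor value to write back.
 * The leaf becomes read-only (clean) only if the caller's bitmap covers
 * the whole leaf; a partially covered leaf stays writable (dirty).
 */
static uint64_t demo_clear_leaf(uint64_t pte, unsigned long iova, size_t size,
				const unsigned long *bitmap,
				unsigned long base_iova,
				unsigned long bitmap_pgshift)
{
	unsigned long offset, nbits, i;

	assert(iova >= base_iova);

	/* AP[2] already set: the leaf is clean, nothing to do. */
	if (pte & DEMO_PTE_AP_RDONLY)
		return pte;

	offset = (iova - base_iova) >> bitmap_pgshift;
	nbits = size >> bitmap_pgshift;
	for (i = offset; i < offset + nbits; i++) {
		if (!demo_test_bit(bitmap, i))
			return pte;
	}

	return pte | DEMO_PTE_AP_RDONLY;
}
```

The all-bits-set requirement matters for block mappings: if only part of a block is named in the bitmap, making the whole block read-only would drop dirty state for pages the caller has not yet consumed.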
Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 ++++++ drivers/iommu/io-pgtable-arm.c | 95 +++++++++++++++++++++ drivers/iommu/iommu.c | 71 +++++++++++++++ include/linux/io-pgtable.h | 4 + include/linux/iommu.h | 17 ++++ 5 files changed, 211 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 43d0536b429a..0c24503d29d3 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2574,6 +2574,29 @@ static int arm_smmu_sync_dirty_log(struct iommu_domain *domain, base_iova, bitmap_pgshift); } +static int arm_smmu_clear_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops; + struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu; + + if (!(smmu->features & ARM_SMMU_FEAT_HTTU_HD)) { + dev_err(smmu->dev, "don't support HTTU_HD and clear dirty log\n"); + return -EPERM; + } + + if (!ops || !ops->clear_dirty_log) { + pr_err("don't support clear dirty log\n"); + return -ENODEV; + } + + return ops->clear_dirty_log(ops, iova, size, bitmap, base_iova, + bitmap_pgshift); +} + static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args) { return iommu_fwspec_add_ids(dev, args->args, 1); @@ -2676,6 +2699,7 @@ static struct iommu_ops arm_smmu_ops = { .split_block = arm_smmu_split_block, .merge_page = arm_smmu_merge_page, .sync_dirty_log = arm_smmu_sync_dirty_log, + .clear_dirty_log = arm_smmu_clear_dirty_log, .of_xlate = arm_smmu_of_xlate, .get_resv_regions = arm_smmu_get_resv_regions, .put_resv_regions = generic_iommu_put_resv_regions, diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 6cfe1ef3fedd..2256e37bcb3a 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ 
b/drivers/iommu/io-pgtable-arm.c @@ -966,6 +966,100 @@ static int arm_lpae_sync_dirty_log(struct io_pgtable_ops *ops, bitmap, base_iova, bitmap_pgshift); } +static int __arm_lpae_clear_dirty_log(struct arm_lpae_io_pgtable *data, + unsigned long iova, size_t size, + int lvl, arm_lpae_iopte *ptep, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + arm_lpae_iopte pte; + struct io_pgtable *iop = &data->iop; + unsigned long offset; + size_t base, next_size; + int nbits, ret, i; + + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) + return -EINVAL; + + ptep += ARM_LPAE_LVL_IDX(iova, lvl, data); + pte = READ_ONCE(*ptep); + if (WARN_ON(!pte)) + return -EINVAL; + + if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) { + if (iopte_leaf(pte, lvl, iop->fmt)) { + if (pte & ARM_LPAE_PTE_AP_RDONLY) + return 0; + + /* Ensure all corresponding bits are set */ + nbits = size >> bitmap_pgshift; + offset = (iova - base_iova) >> bitmap_pgshift; + for (i = offset; i < offset + nbits; i++) { + if (!test_bit(i, bitmap)) + return 0; + } + + /* Race does not exist */ + pte |= ARM_LPAE_PTE_AP_RDONLY; + __arm_lpae_set_pte(ptep, pte, &iop->cfg); + return 0; + } else { + /* To traverse next level */ + next_size = ARM_LPAE_BLOCK_SIZE(lvl + 1, data); + ptep = iopte_deref(pte, data); + for (base = 0; base < size; base += next_size) { + ret = __arm_lpae_clear_dirty_log(data, + iova + base, next_size, lvl + 1, + ptep, bitmap, base_iova, + bitmap_pgshift); + if (ret) + return ret; + } + return 0; + } + } else if (iopte_leaf(pte, lvl, iop->fmt)) { + /* Though the size is too small, it is already clean */ + if (pte & ARM_LPAE_PTE_AP_RDONLY) + return 0; + + return -EINVAL; + } + + /* Keep on walkin */ + ptep = iopte_deref(pte, data); + return __arm_lpae_clear_dirty_log(data, iova, size, lvl + 1, ptep, + bitmap, base_iova, bitmap_pgshift); +} + +static int arm_lpae_clear_dirty_log(struct io_pgtable_ops *ops, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long 
base_iova, + unsigned long bitmap_pgshift) +{ + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); + arm_lpae_iopte *ptep = data->pgd; + int lvl = data->start_level; + struct io_pgtable_cfg *cfg = &data->iop.cfg; + long iaext = (s64)iova >> cfg->ias; + + if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size)) + return -EINVAL; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) + iaext = ~iaext; + if (WARN_ON(iaext)) + return -EINVAL; + + if (data->iop.fmt != ARM_64_LPAE_S1 && + data->iop.fmt != ARM_32_LPAE_S1) + return -EINVAL; + + return __arm_lpae_clear_dirty_log(data, iova, size, lvl, ptep, + bitmap, base_iova, bitmap_pgshift); +} + static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg) { unsigned long granule, page_sizes; @@ -1047,6 +1141,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg) .split_block = arm_lpae_split_block, .merge_page = arm_lpae_merge_page, .sync_dirty_log = arm_lpae_sync_dirty_log, + .clear_dirty_log = arm_lpae_clear_dirty_log, }; return data; diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 69f268069282..e2731a7afab2 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2863,6 +2863,77 @@ int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova, } EXPORT_SYMBOL_GPL(iommu_sync_dirty_log); +static int __iommu_clear_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + const struct iommu_ops *ops = domain->ops; + size_t pgsize; + int ret = 0; + + if (!ops || !ops->clear_dirty_log) { + pr_err("don't support clear dirty log\n"); + return -ENODEV; + } + + while (size) { + pgsize = iommu_pgsize(domain, iova, size); + ret = ops->clear_dirty_log(domain, iova, pgsize, bitmap, + base_iova, bitmap_pgshift); + + if (ret) + break; + + pr_debug("dirty_log_clear: iova 0x%lx pagesz 0x%zx\n", iova, + pgsize); + + iova += pgsize; + size -= pgsize; + } + 
iommu_flush_iotlb_all(domain); + + return ret; +} + +int iommu_clear_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + unsigned long riova, rsize; + unsigned int min_pagesz; + bool flush = false; + int rs, re, start, end, ret = 0; + + min_pagesz = 1 << __ffs(domain->pgsize_bitmap); + + if (!IS_ALIGNED(iova | size, min_pagesz)) { + pr_err("unaligned: iova 0x%lx min_pagesz 0x%x\n", + iova, min_pagesz); + return -EINVAL; + } + + start = (iova - base_iova) >> bitmap_pgshift; + end = start + (size >> bitmap_pgshift); + bitmap_for_each_set_region(bitmap, rs, re, start, end) { + flush = true; + riova = iova + (rs << bitmap_pgshift); + rsize = (re - rs) << bitmap_pgshift; + ret = __iommu_clear_dirty_log(domain, riova, rsize, bitmap, + base_iova, bitmap_pgshift); + if (ret) + break; + } + + if (flush) + iommu_flush_iotlb_all(domain); + + return ret; +} +EXPORT_SYMBOL_GPL(iommu_clear_dirty_log); + void iommu_get_resv_regions(struct device *dev, struct list_head *list) { const struct iommu_ops *ops = dev->bus->iommu_ops; diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index f44551e4a454..e7134ee224c9 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -170,6 +170,10 @@ struct io_pgtable_ops { unsigned long iova, size_t size, unsigned long *bitmap, unsigned long base_iova, unsigned long bitmap_pgshift); + int (*clear_dirty_log)(struct io_pgtable_ops *ops, + unsigned long iova, size_t size, + unsigned long *bitmap, unsigned long base_iova, + unsigned long bitmap_pgshift); }; /** diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 8069c8375e63..1cb6cd0cfc7b 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -266,6 +266,10 @@ struct iommu_ops { unsigned long iova, size_t size, unsigned long *bitmap, unsigned long base_iova, unsigned long bitmap_pgshift); + int (*clear_dirty_log)(struct iommu_domain 
*domain, + unsigned long iova, size_t size, + unsigned long *bitmap, unsigned long base_iova, + unsigned long bitmap_pgshift); /* Request/Free a list of reserved regions for a device */ void (*get_resv_regions)(struct device *dev, struct list_head *list); @@ -525,6 +529,10 @@ extern int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova, size_t size, unsigned long *bitmap, unsigned long base_iova, unsigned long bitmap_pgshift); +extern int iommu_clear_dirty_log(struct iommu_domain *domain, unsigned long iova, + size_t dma_size, unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift); /* Window handling function prototypes */ extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr, @@ -940,6 +948,15 @@ static inline int iommu_sync_dirty_log(struct iommu_domain *domain, return -EINVAL; } +static inline int iommu_clear_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long pgshift) +{ + return -EINVAL; +} + static inline int iommu_device_register(struct iommu_device *iommu) { return -ENODEV; From patchwork Thu Jan 28 15:17:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12053961 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 57A5AC433DB for ; Thu, 28 Jan 2021 15:20:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 
1B95164DE5 for ; Thu, 28 Jan 2021 15:20:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231932AbhA1PTb (ORCPT ); Thu, 28 Jan 2021 10:19:31 -0500 Received: from szxga07-in.huawei.com ([45.249.212.35]:12345 "EHLO szxga07-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231344AbhA1PS4 (ORCPT ); Thu, 28 Jan 2021 10:18:56 -0500 Received: from DGGEMS413-HUB.china.huawei.com (unknown [172.30.72.60]) by szxga07-in.huawei.com (SkyGuard) with ESMTP id 4DRPG05tyHz7cH8; Thu, 28 Jan 2021 23:16:56 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS413-HUB.china.huawei.com (10.3.19.213) with Microsoft SMTP Server id 14.3.498.0; Thu, 28 Jan 2021 23:18:00 +0800 From: Keqian Zhu To: , , , , , Will Deacon , "Alex Williamson" , Marc Zyngier , Catalin Marinas CC: Kirti Wankhede , Cornelia Huck , Mark Rutland , James Morse , "Robin Murphy" , Suzuki K Poulose , , , , Subject: [RFC PATCH 08/11] iommu/arm-smmu-v3: Add HWDBM device feature reporting Date: Thu, 28 Jan 2021 23:17:39 +0800 Message-ID: <20210128151742.18840-9-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210128151742.18840-1-zhukeqian1@huawei.com> References: <20210128151742.18840-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: jiangkunkun We have implemented the interfaces required to support iommu dirty log tracking. The last step is to report this feature to upper-layer users, so that they can build higher-level policy on it. This adds a new dev feature named IOMMU_DEV_FEAT_HWDBM to the iommu layer. For arm smmuv3, it is equivalent to ARM_SMMU_FEAT_HTTU_HD. 
Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 ++ include/linux/iommu.h | 1 + 2 files changed, 3 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 0c24503d29d3..cbde0489cf31 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2629,6 +2629,8 @@ static bool arm_smmu_dev_has_feature(struct device *dev, switch (feat) { case IOMMU_DEV_FEAT_SVA: return arm_smmu_master_sva_supported(master); + case IOMMU_DEV_FEAT_HWDBM: + return !!(master->smmu->features & ARM_SMMU_FEAT_HTTU_HD); default: return false; } diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 1cb6cd0cfc7b..77e561ed57fd 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -160,6 +160,7 @@ struct iommu_resv_region { enum iommu_dev_features { IOMMU_DEV_FEAT_AUX, /* Aux-domain feature */ IOMMU_DEV_FEAT_SVA, /* Shared Virtual Addresses */ + IOMMU_DEV_FEAT_HWDBM, /* Hardware Dirty Bit Management */ }; #define IOMMU_PASID_INVALID (-1U) From patchwork Thu Jan 28 15:17:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12053975 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E9FC9C433E6 for ; Thu, 28 Jan 2021 15:22:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9F8C564DE5 for ; Thu, 28 Jan 2021 
15:22:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232428AbhA1PWZ (ORCPT ); Thu, 28 Jan 2021 10:22:25 -0500 Received: from szxga07-in.huawei.com ([45.249.212.35]:12342 "EHLO szxga07-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231388AbhA1PSz (ORCPT ); Thu, 28 Jan 2021 10:18:55 -0500 Received: from DGGEMS413-HUB.china.huawei.com (unknown [172.30.72.60]) by szxga07-in.huawei.com (SkyGuard) with ESMTP id 4DRPG04PzBz7bL1; Thu, 28 Jan 2021 23:16:56 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS413-HUB.china.huawei.com (10.3.19.213) with Microsoft SMTP Server id 14.3.498.0; Thu, 28 Jan 2021 23:18:01 +0800 From: Keqian Zhu To: , , , , , Will Deacon , "Alex Williamson" , Marc Zyngier , Catalin Marinas CC: Kirti Wankhede , Cornelia Huck , Mark Rutland , James Morse , "Robin Murphy" , Suzuki K Poulose , , , , Subject: [RFC PATCH 09/11] vfio/iommu_type1: Add HWDBM status maintenance Date: Thu, 28 Jan 2021 23:17:40 +0800 Message-ID: <20210128151742.18840-10-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210128151742.18840-1-zhukeqian1@huawei.com> References: <20210128151742.18840-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: jiangkunkun We are going to optimize dirty log tracking based on the IOMMU HWDBM feature, but the dirty log from the IOMMU is useful only when all IOMMU-backed groups are connected to an IOMMU with the HWDBM feature. This patch maintains a counter of the attached groups that lack this feature. 
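For illustration only (not part of this patch): a minimal userspace C model of the counter bookkeeping described above, assuming the IOMMU dirty log is usable exactly when no attached group lacks HWDBM. The names hwdbm_tracker, tracker_attach, tracker_detach and tracker_iommu_log_usable are hypothetical and exist only in this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the HWDBM bookkeeping kept in struct vfio_iommu. */
struct hwdbm_tracker {
	uint64_t num_non_hwdbm_groups;	/* attached groups that lack HWDBM */
};

/* On attach: count the group if it lacks HWDBM; return the per-group flag. */
static bool tracker_attach(struct hwdbm_tracker *t, bool group_has_hwdbm)
{
	if (!group_has_hwdbm)
		t->num_non_hwdbm_groups++;
	return group_has_hwdbm;
}

/* On detach: undo the accounting done at attach time for this group. */
static void tracker_detach(struct hwdbm_tracker *t, bool group_had_hwdbm)
{
	if (!group_had_hwdbm)
		t->num_non_hwdbm_groups--;
}

/* The IOMMU dirty log is trusted only when every attached group has HWDBM. */
static bool tracker_iommu_log_usable(const struct hwdbm_tracker *t)
{
	return t->num_non_hwdbm_groups == 0;
}
```

Counting the groups that lack the feature, rather than those that have it, keeps the later check a single comparison against zero.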
Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- drivers/vfio/vfio_iommu_type1.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 0b4dedaa9128..3b8522ebf955 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -74,6 +74,7 @@ struct vfio_iommu { bool nesting; bool dirty_page_tracking; bool pinned_page_dirty_scope; + uint64_t num_non_hwdbm_groups; }; struct vfio_domain { @@ -102,6 +103,7 @@ struct vfio_group { struct list_head next; bool mdev_group; /* An mdev group */ bool pinned_page_dirty_scope; + bool iommu_hwdbm; /* Valid for non-mdev group */ }; struct vfio_iova { @@ -976,6 +978,27 @@ static void vfio_update_pgsize_bitmap(struct vfio_iommu *iommu) } } +static int vfio_dev_has_feature(struct device *dev, void *data) +{ + enum iommu_dev_features *feat = data; + + if (!iommu_dev_has_feature(dev, *feat)) + return -ENODEV; + + return 0; +} + +static bool vfio_group_supports_hwdbm(struct vfio_group *group) +{ + enum iommu_dev_features feat = IOMMU_DEV_FEAT_HWDBM; + + if (iommu_group_for_each_dev(group->iommu_group, &feat, + vfio_dev_has_feature)) + return false; + + return true; +} + static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu, struct vfio_dma *dma, dma_addr_t base_iova, size_t pgsize) @@ -2189,6 +2212,12 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, * capable via the page pinning interface. 
*/ iommu->pinned_page_dirty_scope = false; + + /* Update the hwdbm status of group and iommu */ + group->iommu_hwdbm = vfio_group_supports_hwdbm(group); + if (!group->iommu_hwdbm) + iommu->num_non_hwdbm_groups++; + mutex_unlock(&iommu->lock); vfio_iommu_resv_free(&group_resv_regions); @@ -2342,6 +2371,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data, struct vfio_domain *domain; struct vfio_group *group; bool update_dirty_scope = false; + bool update_iommu_hwdbm = false; LIST_HEAD(iova_copy); mutex_lock(&iommu->lock); @@ -2380,6 +2410,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data, vfio_iommu_detach_group(domain, group); update_dirty_scope = !group->pinned_page_dirty_scope; + update_iommu_hwdbm = !group->iommu_hwdbm; list_del(&group->next); kfree(group); /* @@ -2417,6 +2448,8 @@ static void vfio_iommu_type1_detach_group(void *iommu_data, */ if (update_dirty_scope) update_pinned_page_dirty_scope(iommu); + if (update_iommu_hwdbm) + iommu->num_non_hwdbm_groups--; mutex_unlock(&iommu->lock); } From patchwork Thu Jan 28 15:17:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12053955 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.9 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D7C3C43381 for ; Thu, 28 Jan 2021 15:19:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5B6F564DF2 for ; Thu, 28 Jan 2021 15:19:25 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231831AbhA1PTT (ORCPT ); Thu, 28 Jan 2021 10:19:19 -0500 Received: from szxga07-in.huawei.com ([45.249.212.35]:12346 "EHLO szxga07-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231349AbhA1PSz (ORCPT ); Thu, 28 Jan 2021 10:18:55 -0500 Received: from DGGEMS413-HUB.china.huawei.com (unknown [172.30.72.60]) by szxga07-in.huawei.com (SkyGuard) with ESMTP id 4DRPG06S7Dz7btQ; Thu, 28 Jan 2021 23:16:56 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS413-HUB.china.huawei.com (10.3.19.213) with Microsoft SMTP Server id 14.3.498.0; Thu, 28 Jan 2021 23:18:02 +0800 From: Keqian Zhu To: , , , , , Will Deacon , "Alex Williamson" , Marc Zyngier , Catalin Marinas CC: Kirti Wankhede , Cornelia Huck , Mark Rutland , James Morse , "Robin Murphy" , Suzuki K Poulose , , , , Subject: [RFC PATCH 10/11] vfio/iommu_type1: Optimize dirty bitmap population based on iommu HWDBM Date: Thu, 28 Jan 2021 23:17:41 +0800 Message-ID: <20210128151742.18840-11-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210128151742.18840-1-zhukeqian1@huawei.com> References: <20210128151742.18840-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: jiangkunkun In the past, if the vfio_iommu was not of pinned_page_dirty_scope and the vfio_dma was iommu_mapped, we populated the full dirty bitmap for this vfio_dma. Now we can try to get the dirty log from the IOMMU before falling back to that coarse decision. 
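For illustration only (not part of this patch): a small C sketch of the three-way decision this change introduces when the user bitmap is populated, assuming the inputs are the container's pinned_page_dirty_scope flag, the vfio_dma's iommu_mapped flag and the count of groups without HWDBM. The enum bitmap_source and the function choose_bitmap_source are hypothetical names used only here.

```c
#include <stdbool.h>

/* Hypothetical labels for where the reported dirty bits come from. */
enum bitmap_source {
	BITMAP_AS_IS,		/* keep the bits already recorded via page pinning */
	BITMAP_FROM_IOMMU,	/* sync the hardware dirty log from the IOMMU */
	BITMAP_ALL_DIRTY,	/* fallback: report the whole range as dirty */
};

static enum bitmap_source choose_bitmap_source(bool pinned_page_dirty_scope,
					       bool iommu_mapped,
					       unsigned long num_non_hwdbm_groups)
{
	/* Pinning already gives precise tracking, or nothing is IOMMU-mapped. */
	if (pinned_page_dirty_scope || !iommu_mapped)
		return BITMAP_AS_IS;

	/* Every attached group sits behind an IOMMU with HWDBM. */
	if (num_non_hwdbm_groups == 0)
		return BITMAP_FROM_IOMMU;

	/* At least one group cannot report dirty pages. */
	return BITMAP_ALL_DIRTY;
}
```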
Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- drivers/vfio/vfio_iommu_type1.c | 97 ++++++++++++++++++++++++++++++++- 1 file changed, 94 insertions(+), 3 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 3b8522ebf955..1cd10f3e7ed4 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -999,6 +999,25 @@ static bool vfio_group_supports_hwdbm(struct vfio_group *group) return true; } +static int vfio_iommu_dirty_log_clear(struct vfio_iommu *iommu, + dma_addr_t start_iova, size_t size, + unsigned long *bitmap_buffer, + dma_addr_t base_iova, size_t pgsize) +{ + struct vfio_domain *d; + unsigned long pgshift = __ffs(pgsize); + int ret; + + list_for_each_entry(d, &iommu->domain_list, next) { + ret = iommu_clear_dirty_log(d->domain, start_iova, size, + bitmap_buffer, base_iova, pgshift); + if (ret) + return ret; + } + + return 0; +} + static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu, struct vfio_dma *dma, dma_addr_t base_iova, size_t pgsize) @@ -1010,13 +1029,28 @@ static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu, unsigned long shift = bit_offset % BITS_PER_LONG; unsigned long leftover; + if (iommu->pinned_page_dirty_scope || !dma->iommu_mapped) + goto bitmap_done; + + /* try to get dirty log from IOMMU */ + if (!iommu->num_non_hwdbm_groups) { + struct vfio_domain *d; + + list_for_each_entry(d, &iommu->domain_list, next) { + if (iommu_sync_dirty_log(d->domain, dma->iova, dma->size, + dma->bitmap, dma->iova, pgshift)) + return -EFAULT; + } + goto bitmap_done; + } + /* * mark all pages dirty if any IOMMU capable device is not able * to report dirty pages and all pages are pinned and mapped. 
*/ - if (!iommu->pinned_page_dirty_scope && dma->iommu_mapped) - bitmap_set(dma->bitmap, 0, nbits); + bitmap_set(dma->bitmap, 0, nbits); +bitmap_done: if (shift) { bitmap_shift_left(dma->bitmap, dma->bitmap, shift, nbits + shift); @@ -1078,6 +1112,18 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu, */ bitmap_clear(dma->bitmap, 0, dma->size >> pgshift); vfio_dma_populate_bitmap(dma, pgsize); + + /* Clear iommu dirty log to re-enable dirty log tracking */ + if (!iommu->pinned_page_dirty_scope && + dma->iommu_mapped && !iommu->num_non_hwdbm_groups) { + ret = vfio_iommu_dirty_log_clear(iommu, dma->iova, + dma->size, dma->bitmap, dma->iova, + pgsize); + if (ret) { + pr_warn("dma dirty log clear failed!\n"); + return ret; + } + } } return 0; } @@ -2780,6 +2826,48 @@ static int vfio_iommu_type1_unmap_dma(struct vfio_iommu *iommu, -EFAULT : 0; } +static void vfio_dma_dirty_log_start(struct vfio_iommu *iommu, + struct vfio_dma *dma) +{ + struct vfio_domain *d; + + list_for_each_entry(d, &iommu->domain_list, next) { + /* Go through all domain anyway even if we fail */ + iommu_split_block(d->domain, dma->iova, dma->size); + } +} + +static void vfio_dma_dirty_log_stop(struct vfio_iommu *iommu, + struct vfio_dma *dma) +{ + struct vfio_domain *d; + + list_for_each_entry(d, &iommu->domain_list, next) { + /* Go through all domain anyway even if we fail */ + iommu_merge_page(d->domain, dma->iova, dma->size, + d->prot | dma->prot); + } +} + +static void vfio_iommu_dirty_log_switch(struct vfio_iommu *iommu, bool start) +{ + struct rb_node *n; + + /* Split and merge even if all iommu don't support HWDBM now */ + for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) { + struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node); + + if (!dma->iommu_mapped) + continue; + + /* Go through all dma range anyway even if we fail */ + if (start) + vfio_dma_dirty_log_start(iommu, dma); + else + vfio_dma_dirty_log_stop(iommu, dma); + } +} + static int 
vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu, unsigned long arg) { @@ -2812,8 +2900,10 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu, pgsize = 1 << __ffs(iommu->pgsize_bitmap); if (!iommu->dirty_page_tracking) { ret = vfio_dma_bitmap_alloc_all(iommu, pgsize); - if (!ret) + if (!ret) { iommu->dirty_page_tracking = true; + vfio_iommu_dirty_log_switch(iommu, true); + } } mutex_unlock(&iommu->lock); return ret; @@ -2822,6 +2912,7 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu, if (iommu->dirty_page_tracking) { iommu->dirty_page_tracking = false; vfio_dma_bitmap_free_all(iommu); + vfio_iommu_dirty_log_switch(iommu, false); } mutex_unlock(&iommu->lock); return 0; From patchwork Thu Jan 28 15:17:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12053971 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 34B43C433DB for ; Thu, 28 Jan 2021 15:22:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DA1C664DE5 for ; Thu, 28 Jan 2021 15:22:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231696AbhA1PWT (ORCPT ); Thu, 28 Jan 2021 10:22:19 -0500 Received: from szxga07-in.huawei.com ([45.249.212.35]:12344 "EHLO szxga07-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231331AbhA1PS4 (ORCPT ); Thu, 28 Jan 2021 10:18:56 -0500 Received: from 
DGGEMS413-HUB.china.huawei.com (unknown [172.30.72.60]) by szxga07-in.huawei.com (SkyGuard) with ESMTP id 4DRPG05PCNz7c0H; Thu, 28 Jan 2021 23:16:56 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS413-HUB.china.huawei.com (10.3.19.213) with Microsoft SMTP Server id 14.3.498.0; Thu, 28 Jan 2021 23:18:03 +0800 From: Keqian Zhu To: , , , , , Will Deacon , "Alex Williamson" , Marc Zyngier , Catalin Marinas CC: Kirti Wankhede , Cornelia Huck , Mark Rutland , James Morse , "Robin Murphy" , Suzuki K Poulose , , , , Subject: [RFC PATCH 11/11] vfio/iommu_type1: Add support for manual dirty log clear Date: Thu, 28 Jan 2021 23:17:42 +0800 Message-ID: <20210128151742.18840-12-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210128151742.18840-1-zhukeqian1@huawei.com> References: <20210128151742.18840-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: jiangkunkun In the past, we cleared the dirty log immediately after syncing it to userspace. This may cause redundant dirty handling if userspace handles the dirty log iteratively: after vfio clears the dirty log, new dirty log entries start to be generated. These new entries will be reported to userspace even if they were generated before userspace handled the same dirty page. That is to say, we should minimize the time gap between dirty log clearing and dirty log handling. We can give userspace an interface to clear the dirty log. 
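For illustration only (not part of this patch): a toy single-page C model of the redundancy described above. It assumes one device write lands between the sync and the moment userspace actually handles (copies) the page, and counts how often the page is reported dirty over two sync rounds. All names (page_model, device_write, sync_log, clear_log, reports_with_late_write) are hypothetical, and the model ignores locking, TLB invalidation and everything else the real code must handle.

```c
#include <stdbool.h>

/* Hypothetical one-page model: a single hardware dirty bit. */
struct page_model {
	bool hw_dirty;
};

static void device_write(struct page_model *p)
{
	p->hw_dirty = true;
}

/* Report the dirty state; optionally clear it right away (the old behaviour). */
static bool sync_log(struct page_model *p, bool auto_clear)
{
	bool dirty = p->hw_dirty;

	if (auto_clear)
		p->hw_dirty = false;
	return dirty;
}

static void clear_log(struct page_model *p)
{
	p->hw_dirty = false;
}

/* Number of dirty reports over two rounds when a write lands before handling. */
static int reports_with_late_write(bool manual_clear)
{
	struct page_model p = { .hw_dirty = false };
	int reports = 0;

	device_write(&p);			/* first write */
	if (sync_log(&p, !manual_clear))	/* round 1 */
		reports++;

	device_write(&p);			/* late write, before handling */
	if (manual_clear)
		clear_log(&p);			/* clear just before handling */
	/* userspace copies the page here and so captures both writes */

	if (sync_log(&p, !manual_clear))	/* round 2 */
		reports++;
	return reports;
}
```

In this model the automatic clear yields two reports for data that a single copy already captured, while clearing just before handling yields one.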
Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- drivers/vfio/vfio_iommu_type1.c | 103 ++++++++++++++++++++++++++++++-- include/uapi/linux/vfio.h | 28 ++++++++- 2 files changed, 126 insertions(+), 5 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 1cd10f3e7ed4..a32dc684b86e 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -73,6 +73,7 @@ struct vfio_iommu { bool v2; bool nesting; bool dirty_page_tracking; + bool dirty_log_manual_clear; bool pinned_page_dirty_scope; uint64_t num_non_hwdbm_groups; }; @@ -1018,6 +1019,78 @@ static int vfio_iommu_dirty_log_clear(struct vfio_iommu *iommu, return 0; } +static int vfio_iova_dirty_log_clear(u64 __user *bitmap, + struct vfio_iommu *iommu, + dma_addr_t iova, size_t size, + size_t pgsize) +{ + struct vfio_dma *dma; + struct rb_node *n; + dma_addr_t start_iova, end_iova, riova; + unsigned long pgshift = __ffs(pgsize); + unsigned long bitmap_size; + unsigned long *bitmap_buffer = NULL; + bool clear_valid; + int rs, re, start, end, dma_offset; + int ret = 0; + + bitmap_size = DIRTY_BITMAP_BYTES(size >> pgshift); + bitmap_buffer = kvmalloc(bitmap_size, GFP_KERNEL); + if (!bitmap_buffer) { + ret = -ENOMEM; + goto out; + } + + if (copy_from_user(bitmap_buffer, bitmap, bitmap_size)) { + ret = -EFAULT; + goto out; + } + + for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) { + dma = rb_entry(n, struct vfio_dma, node); + if (!dma->iommu_mapped) + continue; + if ((dma->iova + dma->size - 1) < iova) + continue; + if (dma->iova > iova + size - 1) + break; + + start_iova = max(iova, dma->iova); + end_iova = min(iova + size, dma->iova + dma->size); + + /* Similar logic as the tail of vfio_iova_dirty_bitmap */ + + clear_valid = false; + start = (start_iova - iova) >> pgshift; + end = (end_iova - iova) >> pgshift; + bitmap_for_each_set_region(bitmap_buffer, rs, re, start, end) { + clear_valid = true; + riova = iova + (rs << pgshift); + 
dma_offset = (riova - dma->iova) >> pgshift; + bitmap_clear(dma->bitmap, dma_offset, re - rs); + } + + if (clear_valid) + vfio_dma_populate_bitmap(dma, pgsize); + + if (clear_valid && !iommu->pinned_page_dirty_scope && + dma->iommu_mapped && !iommu->num_non_hwdbm_groups) { + ret = vfio_iommu_dirty_log_clear(iommu, start_iova, + end_iova - start_iova, bitmap_buffer, + iova, pgsize); + if (ret) { + pr_warn("dma dirty log clear failed!\n"); + goto out; + } + } + + } + +out: + kvfree(bitmap_buffer); + return ret; +} + static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu, struct vfio_dma *dma, dma_addr_t base_iova, size_t pgsize) @@ -1067,6 +1140,10 @@ static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu, DIRTY_BITMAP_BYTES(nbits + shift))) return -EFAULT; + if (shift && iommu->dirty_log_manual_clear) + bitmap_shift_right(dma->bitmap, dma->bitmap, shift, + nbits + shift); + return 0; } @@ -1105,6 +1182,9 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu, if (ret) return ret; + if (iommu->dirty_log_manual_clear) + continue; + /* * Re-populate bitmap to include all pinned pages which are * considered as dirty but exclude pages which are unpinned and @@ -2601,6 +2681,11 @@ static int vfio_iommu_type1_check_extension(struct vfio_iommu *iommu, if (!iommu) return 0; return vfio_domains_have_iommu_cache(iommu); + case VFIO_DIRTY_LOG_MANUAL_CLEAR: + if (!iommu) + return 0; + iommu->dirty_log_manual_clear = true; + return 1; default: return 0; } @@ -2874,7 +2959,8 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu, struct vfio_iommu_type1_dirty_bitmap dirty; uint32_t mask = VFIO_IOMMU_DIRTY_PAGES_FLAG_START | VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP | - VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP; + VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP | + VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP; unsigned long minsz; int ret = 0; @@ -2916,7 +3002,8 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu 
*iommu, } mutex_unlock(&iommu->lock); return 0; - } else if (dirty.flags & VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP) { + } else if (dirty.flags & (VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP | + VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP)) { struct vfio_iommu_type1_dirty_bitmap_get range; unsigned long pgshift; size_t data_size = dirty.argsz - minsz; @@ -2959,13 +3046,21 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu, goto out_unlock; } - if (iommu->dirty_page_tracking) + if (!iommu->dirty_page_tracking) { + ret = -EINVAL; + goto out_unlock; + } + + if (dirty.flags & VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP) ret = vfio_iova_dirty_bitmap(range.bitmap.data, iommu, range.iova, range.size, range.bitmap.pgsize); else - ret = -EINVAL; + ret = vfio_iova_dirty_log_clear(range.bitmap.data, + iommu, range.iova, + range.size, + range.bitmap.pgsize); out_unlock: mutex_unlock(&iommu->lock); diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index d1812777139f..77a64ff38f64 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -46,6 +46,14 @@ */ #define VFIO_NOIOMMU_IOMMU 8 +/* + * The vfio_iommu driver may support manual clearing of the dirty log by the + * user, which means the dirty log is not cleared automatically after it is + * copied to userspace; it is the user's duty to clear it. Note: when the user + * queries this extension and the vfio_iommu driver supports it, it is enabled. + */ +#define VFIO_DIRTY_LOG_MANUAL_CLEAR 9 + /* * The IOCTL interface is designed for extensibility by embedding the * structure length (argsz) and flags into structures passed between @@ -1161,7 +1169,24 @@ struct vfio_iommu_type1_dma_unmap { * actual bitmap. If dirty pages logging is not enabled, an error will be * returned. * - * Only one of the flags _START, _STOP and _GET may be specified at a time. 
+ * Calling the IOCTL with the VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP flag set + * instructs the IOMMU driver to clear the dirty status of pages in a bitmap + * for IOMMU container for a given IOVA range. The user must specify the IOVA + * range, the bitmap and the pgsize through the structure + * vfio_iommu_type1_dirty_bitmap_get in the data[] portion. This interface + * supports clearing a bitmap of the smallest supported pgsize only and can be + * modified in future to clear a bitmap of any specified supported pgsize. The + * user must provide a memory area for the bitmap memory and specify its size + * in bitmap.size. One bit is used to represent one page consecutively starting + * from iova offset. The user should provide page size in bitmap.pgsize field. + * A bit set in the bitmap indicates that the dirty status of the page at that + * offset from iova is cleared, and dirty tracking is re-enabled for that page. The + * caller must set argsz to a value including the size of structure + * vfio_iommu_type1_dirty_bitmap_get, but excluding the size of the actual bitmap. If + * dirty pages logging is not enabled, an error will be returned. + * + * Only one of the flags _START, _STOP, _GET and _CLEAR may be specified at a + * time. * */ struct vfio_iommu_type1_dirty_bitmap { @@ -1170,6 +1195,7 @@ struct vfio_iommu_type1_dirty_bitmap { #define VFIO_IOMMU_DIRTY_PAGES_FLAG_START (1 << 0) #define VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP (1 << 1) #define VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP (1 << 2) +#define VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP (1 << 3) __u8 data[]; };
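For illustration only (not part of this series): a userspace C sketch of the bitmap layout the comment above describes, with one bit per page counted from the start of the IOVA range. The helpers dirty_bitmap_bytes and bitmap_mark_page are hypothetical names. Rounding the buffer up to whole 64-bit words is an assumption about how the kernel side sizes its copy, and the bit numbering assumes a little-endian 64-bit host.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes needed for a bitmap covering `size` bytes of IOVA space at `pgsize`
 * granularity, rounded up to whole 64-bit words (assumed sizing rule).
 * `pgsize` must be a non-zero power of two.
 */
static size_t dirty_bitmap_bytes(uint64_t size, uint64_t pgsize)
{
	uint64_t npages = size / pgsize;
	uint64_t nwords = (npages + 63) / 64;

	return (size_t)(nwords * sizeof(uint64_t));
}

/* Set the bit for the page containing `addr`; bit 0 is the page at `iova`. */
static void bitmap_mark_page(uint64_t *bitmap, uint64_t iova,
			     uint64_t pgsize, uint64_t addr)
{
	uint64_t bit = (addr - iova) / pgsize;

	bitmap[bit / 64] |= (uint64_t)1 << (bit % 64);
}
```

A caller would set the bits for the pages it is about to handle, then pass the buffer as the bitmap data with the matching iova, size and pgsize in the clear request.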