From patchwork Wed Oct 12 13:22:24 2016
X-Patchwork-Submitter: Eric Auger
X-Patchwork-Id: 9373059
From: Eric Auger <eric.auger@redhat.com>
To: eric.auger@redhat.com, eric.auger.pro@gmail.com,
 christoffer.dall@linaro.org, marc.zyngier@arm.com, robin.murphy@arm.com,
 alex.williamson@redhat.com, will.deacon@arm.com, joro@8bytes.org,
 tglx@linutronix.de, jason@lakedaemon.net,
 linux-arm-kernel@lists.infradead.org
Cc: drjones@redhat.com, kvm@vger.kernel.org, Jean-Philippe.Brucker@arm.com,
 Manish.Jaggi@caviumnetworks.com, p.fedin@samsung.com,
 linux-kernel@vger.kernel.org, Bharat.Bhushan@freescale.com,
 iommu@lists.linux-foundation.org, pranav.sawargaonkar@gmail.com,
 yehuday@marvell.com
Subject: [PATCH v14 16/16] vfio/type1: Introduce MSI_RESV capability
Date: Wed, 12 Oct 2016 13:22:24 +0000
Message-Id: <1476278544-3397-17-git-send-email-eric.auger@redhat.com>
In-Reply-To: <1476278544-3397-1-git-send-email-eric.auger@redhat.com>
References: <1476278544-3397-1-git-send-email-eric.auger@redhat.com>

This patch allows userspace to retrieve the MSI reserved region
requirements, if any. The implementation is based on capability chains,
now also added to VFIO_IOMMU_GET_INFO.

The returned info comprises the size and the alignment requirements.

In case the userspace must provide the IOVA aperture, we currently
report a size/alignment based on all the doorbells registered by the
host kernel. This may exceed the actual needs.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
v13 -> v14:
- new capability struct
- change the padding in vfio_iommu_type1_info

v11 -> v12:
- msi_doorbell_pages was renamed msi_doorbell_calc_pages

v9 -> v10:
- move cap_offset after iova_pgsizes
- replace __u64 alignment by __u32 order
- introduce __u32 flags in vfio_iommu_type1_info_cap_msi_geometry and
  fix alignment
- call the msi-doorbell API to compute the size/alignment

v8 -> v9:
- use the iommu_msi_supported flag instead of programmable
- replace the IOMMU_INFO_REQUIRE_MSI_MAP flag by a more sophisticated
  capability chain reporting the MSI geometry

v7 -> v8:
- use iommu_domain_msi_geometry

v6 -> v7:
- remove the computation of the number of IOVA pages to be provisioned.
  This number depends on the domain/group/device topology, which can
  change dynamically. Instead, rely on an arbitrary max depending on
  the system.

v4 -> v5:
- move the msi_info and ret declarations within the conditional code

v3 -> v4:
- replace the former vfio_domains_require_msi_mapping by a more complex
  computation of the MSI mapping requirements, especially the number of
  pages to be provided by userspace
- reword the patch title

RFC v1 -> v1:
- derived from [RFC PATCH 3/6] vfio: Extend iommu-info to return MSIs
  automap state
- renamed allow_msi_reconfig into require_msi_mapping
- fixed VFIO_IOMMU_GET_INFO
---
 drivers/vfio/vfio_iommu_type1.c | 67 ++++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/vfio.h       | 20 +++++++++++-
 2 files changed, 85 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index c18ba9d..6775da3 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1147,6 +1147,46 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
 	return ret;
 }
 
+static int msi_resv_caps(struct vfio_iommu *iommu, struct vfio_info_cap *caps)
+{
+	struct iommu_domain_msi_resv msi_resv = {.size = 0, .alignment = 0};
+	struct vfio_iommu_type1_info_cap_msi_resv *cap;
+	struct vfio_info_cap_header *header;
+	struct iommu_domain_msi_resv iter;
+	struct vfio_domain *d;
+
+	mutex_lock(&iommu->lock);
+
+	list_for_each_entry(d, &iommu->domain_list, next) {
+		if (iommu_domain_get_attr(d->domain,
+					  DOMAIN_ATTR_MSI_RESV, &iter))
+			continue;
+		if (iter.size > msi_resv.size) {
+			msi_resv.size = iter.size;
+			msi_resv.alignment = iter.alignment;
+		}
+	}
+
+	mutex_unlock(&iommu->lock);
+
+	if (!msi_resv.size)
+		return 0;
+
+	header = vfio_info_cap_add(caps, sizeof(*cap),
+				   VFIO_IOMMU_TYPE1_INFO_CAP_MSI_RESV, 1);
+
+	if (IS_ERR(header))
+		return PTR_ERR(header);
+
+	cap = container_of(header, struct vfio_iommu_type1_info_cap_msi_resv,
+			   header);
+
+	cap->alignment = msi_resv.alignment;
+	cap->size = msi_resv.size;
+
+	return 0;
+}
+
 static long vfio_iommu_type1_ioctl(void *iommu_data,
 				   unsigned int cmd, unsigned long arg)
 {
@@ -1168,8 +1208,10 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 		}
 	} else if (cmd == VFIO_IOMMU_GET_INFO) {
 		struct vfio_iommu_type1_info info;
+		struct vfio_info_cap caps = { .buf = NULL, .size = 0 };
+		int ret;
 
-		minsz = offsetofend(struct vfio_iommu_type1_info, iova_pgsizes);
+		minsz = offsetofend(struct vfio_iommu_type1_info, cap_offset);
 
 		if (copy_from_user(&info, (void __user *)arg, minsz))
 			return -EFAULT;
@@ -1181,6 +1223,29 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 
 		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
 
+		ret = msi_resv_caps(iommu, &caps);
+		if (ret)
+			return ret;
+
+		if (caps.size) {
+			info.flags |= VFIO_IOMMU_INFO_CAPS;
+			if (info.argsz < sizeof(info) + caps.size) {
+				info.argsz = sizeof(info) + caps.size;
+				info.cap_offset = 0;
+			} else {
+				vfio_info_cap_shift(&caps, sizeof(info));
+				if (copy_to_user((void __user *)arg +
+						 sizeof(info), caps.buf,
+						 caps.size)) {
+					kfree(caps.buf);
+					return -EFAULT;
+				}
+				info.cap_offset = sizeof(info);
+			}
+
+			kfree(caps.buf);
+		}
+
 		return copy_to_user((void __user *)arg, &info, minsz) ?
 			-EFAULT : 0;
 
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 4a9dbc2..e34a9a6 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -488,7 +488,23 @@ struct vfio_iommu_type1_info {
 	__u32	argsz;
 	__u32	flags;
 #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
-	__u64	iova_pgsizes;	/* Bitmap of supported page sizes */
+#define VFIO_IOMMU_INFO_CAPS	(1 << 1)	/* Info supports caps */
+	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
+	__u32	cap_offset;	/* Offset within info struct of first cap */
+	__u32	__resv;
+};
+
+/*
+ * The MSI_RESV capability reports the MSI reserved IOVA requirements:
+ * if this capability is present, userspace must provide an IOVA
+ * window characterized by @size and @alignment, using VFIO_IOMMU_MAP_DMA
+ * with the RESERVED_MSI_IOVA flag.
+ */
+#define VFIO_IOMMU_TYPE1_INFO_CAP_MSI_RESV	1
+struct vfio_iommu_type1_info_cap_msi_resv {
+	struct vfio_info_cap_header header;
+	__u64 size;	/* requested IOVA aperture size in bytes */
+	__u64 alignment; /* requested byte alignment of the window */
 };
 
 #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
@@ -503,6 +519,8 @@ struct vfio_iommu_type1_info {
  * IOVA region that will be used on some platforms to map the host MSI frames.
  * In that specific case, vaddr is ignored. Once registered, an MSI reserved
  * IOVA region stays until the container is closed.
+ * The requirement for provisioning such a reserved IOVA range can be detected
+ * by checking the VFIO_IOMMU_TYPE1_INFO_CAP_MSI_RESV capability.
  */
 struct vfio_iommu_type1_dma_map {
 	__u32	argsz;
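
For illustration only, not part of the patch: a minimal sketch of how userspace
might consume the new capability, assuming a container file descriptor already
set to the TYPE1 IOMMU and a <linux/vfio.h> carrying the definitions added by
this series. The helper name vfio_get_msi_resv is made up for the example.

#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Returns 0 and fills size/alignment when the MSI_RESV capability is
 * reported, -1 otherwise (no reserved MSI IOVA window needs to be provided).
 */
static int vfio_get_msi_resv(int container, __u64 *size, __u64 *alignment)
{
	struct vfio_iommu_type1_info *info;
	struct vfio_info_cap_header *hdr;
	__u32 argsz = sizeof(*info);
	int ret = -1;

	info = calloc(1, argsz);
	if (!info)
		return -1;
	info->argsz = argsz;

	/* First call: the kernel bumps argsz when a capability chain exists */
	if (ioctl(container, VFIO_IOMMU_GET_INFO, info) ||
	    !(info->flags & VFIO_IOMMU_INFO_CAPS) ||
	    info->argsz <= sizeof(*info))
		goto out;

	/* Second call with a buffer large enough to hold the whole chain */
	argsz = info->argsz;
	free(info);
	info = calloc(1, argsz);
	if (!info)
		return -1;
	info->argsz = argsz;
	if (ioctl(container, VFIO_IOMMU_GET_INFO, info) || !info->cap_offset)
		goto out;

	/* Walk the capability chain until the MSI_RESV entry is found */
	hdr = (struct vfio_info_cap_header *)((char *)info + info->cap_offset);
	while (1) {
		if (hdr->id == VFIO_IOMMU_TYPE1_INFO_CAP_MSI_RESV) {
			struct vfio_iommu_type1_info_cap_msi_resv *cap =
				(struct vfio_iommu_type1_info_cap_msi_resv *)hdr;

			*size = cap->size;
			*alignment = cap->alignment;
			ret = 0;
			break;
		}
		if (!hdr->next)
			break;
		hdr = (struct vfio_info_cap_header *)((char *)info + hdr->next);
	}
out:
	free(info);
	return ret;
}

With the reported size and alignment, userspace would then carve out a matching
IOVA window and register it through VFIO_IOMMU_MAP_DMA with the reserved-MSI
flag mentioned in the uapi comment above (the flag itself is defined in an
earlier patch of this series).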