From patchwork Thu May 24 18:20:32 2018
From: Alex Williamson
Date: Thu, 24 May 2018 12:20:32 -0600
To: Shameer Kolothum
Cc: Joerg Roedel
Subject: Re: [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
Message-ID: <20180524122032.724676b9@w520.home>
In-Reply-To: <20180418114045.7968-1-shameerali.kolothum.thodi@huawei.com>
References: <20180418114045.7968-1-shameerali.kolothum.thodi@huawei.com>

[Cc +Joerg: AMD-Vi observation towards the end]

On Wed, 18 Apr 2018 12:40:38 +0100
Shameer Kolothum wrote:

> This series introduces an iova list associated with a vfio
> iommu. The list is kept updated taking care of iommu apertures,
> and reserved regions. Also this series adds checks for any conflict
> with existing dma mappings whenever a new device group is attached to
> the domain.
>
> User-space can retrieve valid iova ranges using VFIO_IOMMU_GET_INFO
> ioctl capability chains. Any dma map request outside the valid iova
> range will be rejected.

Hi Shameer,

I ran into two minor issues in testing this series, both related to
mdev usage of type1.
First, in patch 5/7, when we try to validate a dma map request:

> +static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
> +				       dma_addr_t start, dma_addr_t end)
> +{
> +	struct list_head *iova = &iommu->iova_list;
> +	struct vfio_iova *node;
> +
> +	list_for_each_entry(node, iova, list) {
> +		if ((start >= node->start) && (end <= node->end))
> +			return true;
> +	}
> +
> +	return false;
> +}

A container with only an mdev device will have an empty list because
it has no backing iommu to set ranges or reserved regions, so any dma
map will fail.  I think this is resolved as follows:

--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1100,7 +1100,7 @@ static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
 			return true;
 	}
 
-	return false;
+	return list_empty(&iommu->iova_list);
 }

ie. return false only if there was anything to test against.

The second issue is similar; patch 6/7 adds to VFIO_IOMMU_GET_INFO:

+	ret = vfio_iommu_iova_build_caps(iommu, &caps);
+	if (ret)
+		return ret;

And build_caps has:

+	list_for_each_entry(iova, &iommu->iova_list, list)
+		iovas++;
+
+	if (!iovas) {
+		ret = -EINVAL;

Therefore if the iova list is empty, as it is for mdevs, the user can
no longer even call VFIO_IOMMU_GET_INFO on the container, which is a
regression.  Again, I think the fix is simple:

@@ -2090,7 +2090,7 @@ static int vfio_iommu_iova_build_caps(struct vfio_iommu *iommu,
 		iovas++;
 
 	if (!iovas) {
-		ret = -EINVAL;
+		ret = 0;
 		goto out_unlock;
 	}

ie. build_caps needs to handle the lack of an iova_list as a non-error.

Also, I wrote a small unit test to validate the iova list for my
systems[1].  With the above changes, my Intel test system gives the
expected results:

# ./vfio-type1-iova-list /sys/bus/mdev/devices/c08db5ed-05d3-4b39-b150-438a18bc698f /sys/bus/pci/devices/0000:00:1b.0
---- Adding device: c08db5ed-05d3-4b39-b150-438a18bc698f ----
Initial info struct size: 0x18
No caps
---- Adding device: 0000:00:1b.0 ----
Initial info struct size: 0x18
Requested info struct size: 0x48
New info struct size: 0x48
argsz: 0x48, flags: 0x3, cap_offset: 0x18
00: 4800 0000 0300 0000 00f0 ffff ffff ffff
10: 1800 0000 0000 0000 0100 0100 0000 0000
20: 0200 0000 0000 0000 0000 0000 0000 0000
30: ffff dffe 0000 0000 0000 f0fe 0000 0000
40: ffff ffff ffff ff01
[cap id: 1, version: 1, next: 0x0]
Found type1 iova range version: 1
	00: 0000000000000000 - 00000000fedfffff
	01: 00000000fef00000 - 01ffffffffffffff

Adding an mdev device to the container results in no iova list; adding
the physical device updates to the expected set with the MSI range
excluded.

I was a little surprised by an AMD system:

# ./vfio-type1-iova-list /sys/bus/pci/devices/0000:01:00.0
---- Adding device: 0000:01:00.0 ----
Initial info struct size: 0x18
Requested info struct size: 0x58
New info struct size: 0x58
argsz: 0x58, flags: 0x3, cap_offset: 0x18
00: 5800 0000 0300 0000 00f0 ffff 7fff ffff
10: 1800 0000 0000 0000 0100 0100 0000 0000
20: 0300 0000 0000 0000 0000 0000 0000 0000
30: ffff dffe 0000 0000 0000 f0fe 0000 0000
40: ffff ffff fc00 0000 0000 0000 0001 0000
50: ffff ffff ffff ffff
[cap id: 1, version: 1, next: 0x0]
Found type1 iova range version: 1
	00: 0000000000000000 - 00000000fedfffff
	01: 00000000fef00000 - 000000fcffffffff
	02: 0000010000000000 - ffffffffffffffff

The additional excluded range is the HyperTransport area which, while
for 99.9+% of use cases isn't going to be a problem, imposes a rather
surprising limit on the size of a VM we can use on AMD: just under
1TB.  I fully expect we'll see a bug report about that and we should
be thinking about how to address it.

Otherwise, if the changes above look good to you, I'll incorporate
them into their respective patches and push to my next branch.
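For reference, that last exclusion comes from the AMD-Vi driver
reporting the HyperTransport window as a reserved region.  The sketch
below paraphrases what I believe drivers/iommu/amd_iommu.c does; it's
quoted loosely from memory, not a verbatim excerpt:

#include <linux/iommu.h>
#include <linux/list.h>

#define HT_RANGE_START		(0xfd00000000ULL)
#define HT_RANGE_END		(0xffffffffffffULL)

/* Sketch of the relevant part of amd_iommu_get_resv_regions() */
static void amd_iommu_get_resv_regions(struct device *dev,
				       struct list_head *head)
{
	struct iommu_resv_region *region;

	/*
	 * The HyperTransport range is always reported as reserved,
	 * which is what carves the 0xfd00000000 - 0xffffffffffff hole
	 * out of the iova list above.
	 */
	region = iommu_alloc_resv_region(HT_RANGE_START,
					 HT_RANGE_END - HT_RANGE_START + 1,
					 0, IOMMU_RESV_RESERVED);
	if (region)
		list_add_tail(&region->list, head);
}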
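And for anyone wanting to reproduce without grabbing the full test[1],
the core of it amounts to walking the GET_INFO capability chain.
Roughly like this; an untested sketch assuming the uAPI names proposed
in this series (VFIO_IOMMU_INFO_CAPS, cap_offset,
VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE), with error handling and the
container setup omitted:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

static void print_iova_ranges(int container)
{
	struct vfio_iommu_type1_info *info;
	struct vfio_info_cap_header *hdr;
	__u32 argsz = sizeof(*info);

	info = calloc(1, argsz);
	info->argsz = argsz;

	/* First call reports the size needed for the capability chain */
	if (ioctl(container, VFIO_IOMMU_GET_INFO, info))
		goto out;

	if (info->argsz > argsz) {
		argsz = info->argsz;
		info = realloc(info, argsz);
		memset(info, 0, argsz);
		info->argsz = argsz;
		if (ioctl(container, VFIO_IOMMU_GET_INFO, info))
			goto out;
	}

	/* An mdev-only container reports no caps at all */
	if (!(info->flags & VFIO_IOMMU_INFO_CAPS) || !info->cap_offset) {
		printf("No caps\n");
		goto out;
	}

	/* Chain offsets are relative to the start of the info struct */
	hdr = (void *)info + info->cap_offset;
	for (;;) {
		if (hdr->id == VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE) {
			struct vfio_iommu_type1_info_cap_iova_range *cap =
								(void *)hdr;
			__u32 i;

			for (i = 0; i < cap->nr_iovas; i++)
				printf("%02u: %016llx - %016llx\n", i,
				       (unsigned long long)
				       cap->iova_ranges[i].start,
				       (unsigned long long)
				       cap->iova_ranges[i].end);
		}
		if (!hdr->next)
			break;
		hdr = (void *)info + hdr->next;
	}
out:
	free(info);
}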
Thanks,
Alex

[1] https://gist.github.com/awilliam/25e39252ab28101a2d89ace1ff765fdd