From patchwork Wed Apr 25 04:51:28 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 10361521
From: Peter Xu
To: qemu-devel@nongnu.org
Date: Wed, 25 Apr 2018 12:51:28 +0800
Message-Id: <20180425045129.17449-10-peterx@redhat.com>
In-Reply-To: <20180425045129.17449-1-peterx@redhat.com>
References: <20180425045129.17449-1-peterx@redhat.com>
Subject: [Qemu-devel] [PATCH 09/10] intel-iommu: don't unmap all for shadow page table
Cc: Jason Wang, Alex Williamson, Jintack Lim, peterx@redhat.com, "Michael S . Tsirkin"

IOMMU replay has been carried out in many cases, e.g., context cache
invalidations and domain flushes.
We used this mechanism to sync the shadow page table by first
(1) unmapping the whole address space, then (2) walking the page table
to remap what's in the table.

This is very dangerous.

The problem is that there is a very small window (in my measurement, it
can be about 3ms) between steps (1) and (2) above during which the
device sees no (or an incomplete) device page table. However, the
device never knows that. This can cause DMA errors on devices, which
assume the page table is always there.

So the point is that MAP typed notifiers (vfio-pci, for example) need
the mapped page entries to always be there. We can never unmap any
existing page entries like what we did in (1) above.

The only solution is to remove step (1). We couldn't do that before
since we didn't know which device pages were mapped and which were not,
so we unmapped them all. Now with the new IOVA tree QEMU knows what is
mapped and what is not. We don't need step (1) any more. Remove it.

Note that after removing that global unmap flushing, we'll need to
notify unmaps during page walks.

This should fix the DMA error problem that Jintack Lim reported with
nested device assignment. The problem does not always happen, e.g., I
cannot reproduce the error. However, the collected logs show that this
is a possible cause of Jintack's problem.

Reported-by: Jintack Lim
Signed-off-by: Peter Xu
---
 hw/i386/intel_iommu.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 8f396a5d13..dedaebc46b 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2952,10 +2952,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
 
     /*
      * The replay can be triggered by either a invalidation or a newly
-     * created entry. No matter what, we release existing mappings
-     * (it means flushing caches for UNMAP-only registers).
+     * created entry.
      */
-    vtd_address_space_unmap(vtd_as, n);
 
     if (vtd_dev_to_context_entry(s, bus_n, vtd_as->devfn, &ce) == 0) {
         trace_vtd_replay_ce_valid(bus_n, PCI_SLOT(vtd_as->devfn),
@@ -2964,8 +2962,10 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
                                   ce.hi, ce.lo);
         if (vtd_as_notify_mappings(vtd_as)) {
             /* This is required only for MAP typed notifiers */
-            vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, false,
+            vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, true,
                           vtd_as);
+        } else {
+            vtd_address_space_unmap(vtd_as, n);
         }
     } else {
         trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn),
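
For illustration only, the difference between the old and new replay
behaviour can be sketched with a toy model. This is not QEMU code: the
Shadow struct, NPAGES, and the notify_*/replay_* helpers below are
hypothetical names, and a flat bool array stands in for both the guest
page table and the IOVA tree. It only shows why unmap-all-then-remap
briefly removes entries that are still valid, while a walk that
compares against the known mappings does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NPAGES 8                /* toy IOVA space: 8 pages (assumption) */

/* Hypothetical stand-in for the shadow page table seen by the device. */
typedef struct {
    bool mapped[NPAGES];        /* what the device can currently DMA to */
    int maps;                   /* MAP notifications sent */
    int unmaps;                 /* UNMAP notifications sent */
    int spurious_unmaps;        /* unmaps of pages the guest still maps */
} Shadow;

static void shadow_init(Shadow *s, const bool *initial)
{
    memset(s, 0, sizeof(*s));
    memcpy(s->mapped, initial, sizeof(s->mapped));
}

static void notify_map(Shadow *s, int page)
{
    assert(page >= 0 && page < NPAGES);
    s->mapped[page] = true;
    s->maps++;
}

static void notify_unmap(Shadow *s, const bool *guest, int page)
{
    assert(page >= 0 && page < NPAGES);
    s->mapped[page] = false;
    s->unmaps++;
    if (guest[page]) {
        /* The guest table still has this page: the device loses it
         * for a while even though nothing changed for it. */
        s->spurious_unmaps++;
    }
}

/* Old scheme: (1) unmap everything, (2) remap what the guest table has. */
static void replay_unmap_all(Shadow *s, const bool *guest)
{
    for (int i = 0; i < NPAGES; i++) {
        if (s->mapped[i]) {
            notify_unmap(s, guest, i);
        }
    }
    for (int i = 0; i < NPAGES; i++) {
        if (guest[i]) {
            notify_map(s, i);
        }
    }
}

/* New scheme: walk once and notify only where the state differs. */
static void replay_incremental(Shadow *s, const bool *guest)
{
    for (int i = 0; i < NPAGES; i++) {
        if (guest[i] && !s->mapped[i]) {
            notify_map(s, i);
        } else if (!guest[i] && s->mapped[i]) {
            notify_unmap(s, guest, i);
        }
    }
}
```

Starting from pages 0-2 mapped and a guest table that now maps pages 0,
1 and 3, both functions end in the same state. In this model the old
scheme unmaps pages 0 and 1 along the way, and the incremental walk
never touches them.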