From patchwork Mon Apr 1 20:59:29 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Roth
X-Patchwork-Id: 10880631
From: Michael Roth
To: qemu-devel@nongnu.org
Date: Mon, 1 Apr 2019 15:59:29 -0500
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190401210011.16009-1-mdroth@linux.vnet.ibm.com>
References: <20190401210011.16009-1-mdroth@linux.vnet.ibm.com>
Message-Id: <20190401210011.16009-56-mdroth@linux.vnet.ibm.com>
Subject: [Qemu-devel] [PATCH 55/97] intel_iommu: better handling of dmar state switch
Cc: qemu-stable@nongnu.org, Peter Xu, "Michael S . Tsirkin"

From: Peter Xu

QEMU is not handling the global DMAR switch well, especially when
switching from "on" to "off".

Let's first take the example of system reset.

Assume that a guest has the IOMMU enabled. When it reboots, we drop
all the existing DMAR mappings to handle the system reset, but we
still keep the existing memory layout, which has the IOMMU memory
region enabled. So after the reboot and before the kernel reloads,
there is no mapping at all for the host device. That's problematic,
since any software (for example, SeaBIOS) that runs earlier than the
kernel after the reboot will assume the IOMMU is disabled, so any DMA
from that software will fail.
For example, a guest that boots on an assigned NVMe device might fail
to find the boot device after a system reboot/reset, and we'll be able
to observe SeaBIOS errors if we capture the debugging log:

  WARNING - Timeout at nvme_wait:144!

Meanwhile, we should see DMAR errors on the host for that NVMe device.
It's the DMA fault that caused the NVMe driver timeout.

The correct fix is to do proper switching of device DMA address spaces
on system reset, which will set up the correct memory regions and
notify the backends of the devices. This might not matter much for
non-assigned devices, since QEMU VT-d emulation will assume a default
passthrough mapping if DMAR is not enabled in the GCMD register
(please refer to vtd_iommu_translate). However, it is required for
assigned devices, since it rebuilds the correct GPA to HPA mapping
that is needed for any DMA operation during guest bootstrap.

Besides system reset, some other places might change the global DMAR
status, and we'd better do the same thing there: for example, when we
change the state of the GCMD register or the DMAR root pointer. Do the
same refresh in all these places. For these two places we also need to
explicitly invalidate the context entry cache and the IOTLB cache.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1625173
CC: QEMU Stable
Reported-by: Cong Li
Signed-off-by: Peter Xu
---
v2:
- do the same for GCMD write, or root pointer update [Alex]
- test is carried out by me this time, by observing the
  vtd_switch_address_space tracepoint after system reboot
v3:
- rewrite commit message as suggested by Alex

Signed-off-by: Peter Xu
Reviewed-by: Eric Auger
Reviewed-by: Jason Wang
Reviewed-by: Michael S. Tsirkin
Signed-off-by: Michael S. Tsirkin
(cherry picked from commit 2cc9ddccebcaa48b3debfc279a83761fcbb7616c)
Signed-off-by: Michael Roth
---
 hw/i386/intel_iommu.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index f66e93ed2c..4dfa9d5e2b 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -37,6 +37,8 @@
 #include "kvm_i386.h"
 #include "trace.h"
 
+static void vtd_address_space_refresh_all(IntelIOMMUState *s);
+
 static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
                             uint64_t wmask, uint64_t w1cmask)
 {
@@ -1426,7 +1428,7 @@ static void vtd_context_global_invalidate(IntelIOMMUState *s)
         vtd_reset_context_cache_locked(s);
     }
     vtd_iommu_unlock(s);
-    vtd_switch_address_space_all(s);
+    vtd_address_space_refresh_all(s);
     /*
      * From VT-d spec 6.5.2.1, a global context entry invalidation
      * should be followed by a IOTLB global invalidation, so we should
@@ -1711,6 +1713,8 @@ static void vtd_handle_gcmd_srtp(IntelIOMMUState *s)
     vtd_root_table_setup(s);
     /* Ok - report back to driver */
     vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_RTPS);
+    vtd_reset_caches(s);
+    vtd_address_space_refresh_all(s);
 }
 
 /* Set Interrupt Remap Table Pointer */
@@ -1743,7 +1747,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
         vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
     }
 
-    vtd_switch_address_space_all(s);
+    vtd_reset_caches(s);
+    vtd_address_space_refresh_all(s);
 }
 
 /* Handle Interrupt Remap Enable/Disable */
@@ -3022,6 +3027,12 @@ static void vtd_address_space_unmap_all(IntelIOMMUState *s)
     }
 }
 
+static void vtd_address_space_refresh_all(IntelIOMMUState *s)
+{
+    vtd_address_space_unmap_all(s);
+    vtd_switch_address_space_all(s);
+}
+
 static int vtd_replay_hook(IOMMUTLBEntry *entry, void *private)
 {
     memory_region_notify_one((IOMMUNotifier *)private, entry);
@@ -3194,11 +3205,7 @@ static void vtd_reset(DeviceState *dev)
     IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
 
     vtd_init(s);
-
-    /*
-     * When device reset, throw away all mappings and external caches
-     */
-    vtd_address_space_unmap_all(s);
+    vtd_address_space_refresh_all(s);
 }
 
 static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
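
Note for reviewers: the ordering this patch enforces (drop stale mappings first, then re-select each device's DMA address space from the current global DMAR state) can be modelled outside QEMU. The sketch below is only an illustration under stated assumptions. ToyIommu, ToyDevice and every toy_* function are hypothetical names invented here and are not QEMU APIs. The only things taken from the patch are the unmap-then-switch order and the rule that a device falls back to passthrough when DMAR is off.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NDEV 2

/* Hypothetical per-device state; a stand-in, not QEMU's VTDAddressSpace. */
typedef struct {
    bool iommu_region_enabled; /* true: DMA goes through translation */
    int  cached_mappings;      /* stand-in for mappings held by a backend */
} ToyDevice;

/* Hypothetical IOMMU state; dmar_enabled models the global DMAR switch. */
typedef struct {
    bool      dmar_enabled;
    ToyDevice dev[TOY_NDEV];
} ToyIommu;

/* Models the "unmap all" step: throw away every stale mapping. */
void toy_unmap_all(ToyIommu *s)
{
    for (size_t i = 0; i < TOY_NDEV; i++) {
        s->dev[i].cached_mappings = 0;
    }
}

/* Models the "switch all" step: pick translated vs. passthrough per device. */
void toy_switch_all(ToyIommu *s)
{
    for (size_t i = 0; i < TOY_NDEV; i++) {
        s->dev[i].iommu_region_enabled = s->dmar_enabled;
    }
}

/* Models the new helper: always unmap first, then switch. */
void toy_refresh_all(ToyIommu *s)
{
    toy_unmap_all(s);
    toy_switch_all(s);
}

/* Models reset: DMAR goes back to "off", then every device is refreshed. */
void toy_reset(ToyIommu *s)
{
    s->dmar_enabled = false;
    toy_refresh_all(s);
}
```

In this toy model, calling toy_reset() on a state that had DMAR enabled leaves every device with no cached mappings and with its translated region disabled. That is the state early firmware such as SeaBIOS expects. Calling only toy_unmap_all(), as the pre-patch reset path effectively did, would leave iommu_region_enabled set with nothing mapped behind it.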