From patchwork Mon Jul 30 23:14:21 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 10549695 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 42A9D15E2 for ; Mon, 30 Jul 2018 23:14:34 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1E88629FDA for ; Mon, 30 Jul 2018 23:14:34 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1019E29FFB; Mon, 30 Jul 2018 23:14:34 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9DABA29FDA for ; Mon, 30 Jul 2018 23:14:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732117AbeGaAvs (ORCPT ); Mon, 30 Jul 2018 20:51:48 -0400 Received: from mx1.redhat.com ([209.132.183.28]:42914 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732027AbeGaAvs (ORCPT ); Mon, 30 Jul 2018 20:51:48 -0400 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.24]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4C6073082A23; Mon, 30 Jul 2018 23:14:32 +0000 (UTC) Received: from gimli.home (ovpn-116-35.phx2.redhat.com [10.3.116.35]) by smtp.corp.redhat.com (Postfix) with ESMTP id 051DA308BDA6; Mon, 30 Jul 2018 23:14:21 +0000 (UTC) Subject: [PATCH v2 3/4] vfio: Inhibit ballooning based on group attachment to a container From: Alex Williamson To: qemu-devel@nongnu.org Cc: kvm@vger.kernel.org, peterx@redhat.com, cohuck@redhat.com, mst@redhat.com, david@redhat.com Date: Mon, 30 Jul 2018 17:14:21 -0600 Message-ID: <153299246170.14411.6197545037372422542.stgit@gimli.home> In-Reply-To: <153299204130.14411.11438396195753743913.stgit@gimli.home> References: <153299204130.14411.11438396195753743913.stgit@gimli.home> User-Agent: StGit/0.18-136-gffd7-dirty MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.24 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.45]); Mon, 30 Jul 2018 23:14:32 +0000 (UTC) Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We use a VFIOContainer to associate an AddressSpace to one or more VFIOGroups. The VFIOContainer represents the DMA context for that AdressSpace for those VFIOGroups and is synchronized to changes in that AddressSpace via a MemoryListener. For IOMMU backed devices, maintaining the DMA context for a VFIOGroup generally involves pinning a host virtual address in order to create a stable host physical address and then mapping a translation from the associated guest physical address to that host physical address into the IOMMU. While the above maintains the VFIOContainer synchronized to the QEMU memory API of the VM, memory ballooning occurs outside of that API. Inflating the memory balloon (ie. cooperatively capturing pages from the guest for use by the host) simply uses MADV_DONTNEED to "zap" pages from QEMU's host virtual address space. The page pinning and IOMMU mapping above remains in place, negating the host's ability to reuse the page, but the host virtual to host physical mapping of the page is invalidated outside of QEMU's memory API. When the balloon is later deflated, attempting to cooperatively return pages to the guest, the page is simply freed by the guest balloon driver, allowing it to be used in the guest and incurring a page fault when that occurs. The page fault maps a new host physical page backing the existing host virtual address, meanwhile the VFIOContainer still maintains the translation to the original host physical address. At this point the guest vCPU and any assigned devices will map different host physical addresses to the same guest physical address. Badness. The IOMMU typically does not have page level granularity with which it can track this mapping without also incurring inefficiencies in using page size mappings throughout. MMU notifiers in the host kernel also provide indicators for invalidating the mapping on balloon inflation, not for updating the mapping when the balloon is deflated. For these reasons we assume a default behavior that the mapping of each VFIOGroup into the VFIOContainer is incompatible with memory ballooning and increment the balloon inhibitor to match the attached VFIOGroups. Signed-off-by: Alex Williamson Reviewed-by: Peter Xu --- hw/vfio/common.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index fb396cf00ac4..4881b691a659 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -32,6 +32,7 @@ #include "hw/hw.h" #include "qemu/error-report.h" #include "qemu/range.h" +#include "sysemu/balloon.h" #include "sysemu/kvm.h" #include "trace.h" #include "qapi/error.h" @@ -1049,6 +1050,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, group->container = container; QLIST_INSERT_HEAD(&container->group_list, group, container_next); vfio_kvm_device_add_group(group); + qemu_balloon_inhibit(true); return 0; } } @@ -1198,6 +1200,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, } vfio_kvm_device_add_group(group); + qemu_balloon_inhibit(true); QLIST_INIT(&container->group_list); QLIST_INSERT_HEAD(&space->containers, container, next); @@ -1222,6 +1225,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, listener_release_exit: QLIST_REMOVE(group, container_next); QLIST_REMOVE(container, next); + qemu_balloon_inhibit(false); vfio_kvm_device_del_group(group); vfio_listener_release(container); @@ -1352,6 +1356,7 @@ void vfio_put_group(VFIOGroup *group) return; } + qemu_balloon_inhibit(false); vfio_kvm_device_del_group(group); vfio_disconnect_container(group); QLIST_REMOVE(group, next);