From patchwork Fri Feb 24 15:54:23 2023
X-Patchwork-Submitter: Eugenio Perez Martin
X-Patchwork-Id: 13151461
From: Eugenio Pérez <eperezma@redhat.com>
To: qemu-devel@nongnu.org
Cc: Stefano Garzarella, Shannon Nelson, Jason Wang, Gautam Dawar, Laurent Vivier, alvaro.karsz@solid-run.com, longpeng2@huawei.com, virtualization@lists.linux-foundation.org, Stefan Hajnoczi, Cindy Lu, "Michael S. Tsirkin", si-wei.liu@oracle.com, Liuxiangdong, Parav Pandit, Eli Cohen, Zhu Lingshan, Harpreet Singh Anand, "Gonglei (Arei)", Lei Yang
Subject: [PATCH v4 00/15] Dynamically switch to vhost shadow virtqueues at vdpa net migration
Date: Fri, 24 Feb 2023 16:54:23 +0100
Message-Id: <20230224155438.112797-1-eperezma@redhat.com>

It's possible to migrate vdpa net devices if
they are shadowed from the start. But shadowing the dataplane permanently effectively breaks host passthrough, so it's not efficient in vDPA scenarios.

This series enables dynamically switching to shadow mode only at migration time. This allows full data virtqueue passthrough whenever qemu is not migrating.

In this series only net devices with no CVQ are migratable. CVQ adds additional state that would make the series bigger, and it still had some controversy in the previous RFC, so let's split it off.

Successfully tested with vdpa_sim_net with patch [1] applied, and with the qemu emulated device with vp_vdpa, with some restrictions:
* No CVQ. No feature that didn't work with SVQ previously (packed, ...).
* VIRTIO_RING_F_STATE patches implementing [2].
* Expose _F_SUSPEND, but ignore it and suspend on ring state fetch, like DPDK.

Previous versions were tested by many vendors. Not carrying Tested-by tags because of the code changes, so re-testing would be appreciated.

Comments are welcome.

v4:
- Recover used_idx from the guest's vring if the device cannot suspend.
- Fix starting the device in the middle of a migration. Removed some duplication in setting / clearing the enable_shadow_vqs and shadow_data members of vhost_vdpa.
- Fix (again) "Check for SUSPEND in vhost_dev.backend_cap, as .backend_features is empty at the check moment". It was reverted by mistake in v3.
- Fix memory leak of the iova tree.
- Properly rewind SVQ, as in-flight descriptors were still being accounted in the vq base.
- Expand documentation.

v3:
- Start the datapath in SVQ if the device is started while migrating.
- Properly register migration blockers if the device presents unsupported features.
- Fix a race condition caused by not stopping the SVQ until device cleanup.
- Explain the purpose of the iova tree in the first patch message.
- s/dynamycally/dynamically/ in cover letter.
- at lore.kernel.org/qemu-devel/20230215173850.298832-14-eperezma@redhat.com

v2:
- Check for SUSPEND in vhost_dev.backend_cap, as .backend_features is empty at the check moment.
- at https://lore.kernel.org/all/20230208094253.702672-12-eperezma@redhat.com/T/

v1:
- Omit all code working with CVQ and block migration if the device supports CVQ.
- Remove spurious kick.
- Move all possible migration checks to vhost-vdpa instead of the net backend. Move them from start code to init code.
- Suspend on vhost_vdpa_dev_start(false) instead of in the vhost-vdpa net backend.
- Properly split suspend after getting the base and the addition of status_reset patches.
- Add possible TODOs at points where this series can improve in the future.
- Check the state of migration using migration_in_setup and migration_has_failed instead of checking all the possible migration statuses in a switch.
- Add a TODO with possible low-hanging fruit using RESUME ops.
- Always offer _F_LOG from virtio/vhost-vdpa and let the migration blockers do their thing, instead of adding a variable.
- RFC v2 at https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02574.html

RFC v2:
- Use a migration listener instead of a memory listener to know when the migration starts.
- Add stuff not picked with the ASID patches, like enabling rings after driver_ok.
- Add rewinding on the migration src, not in the dst.
- RFC v1 at https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01664.html

[1] https://lore.kernel.org/lkml/20230203142501.300125-1-eperezma@redhat.com/T/
[2] https://lists.oasis-open.org/archives/virtio-comment/202103/msg00036.html

Eugenio Pérez (15):
  vdpa net: move iova tree creation from init to start
  vdpa: Remember last call fd set
  vdpa: stop svq at vhost_vdpa_dev_start(false)
  vdpa: Negotiate _F_SUSPEND feature
  vdpa: move vhost reset after get vring base
  vdpa: add vhost_vdpa->suspended parameter
  vdpa: add vhost_vdpa_suspend
  vdpa: rewind at get_base, not set_base
  vdpa: add vdpa net migration state notifier
  vdpa: disable RAM block discard only for the first device
  vdpa net: block migration if the device has CVQ
  vdpa: block migration if device has unsupported features
  vdpa: block migration if SVQ does not admit a feature
  vdpa net: allow VHOST_F_LOG_ALL
  vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices

 include/hw/virtio/vhost-backend.h  |   4 +
 include/hw/virtio/vhost-vdpa.h     |   3 +
 hw/virtio/vhost-shadow-virtqueue.c |   8 +-
 hw/virtio/vhost-vdpa.c             | 128 +++++++++++++------
 hw/virtio/vhost.c                  |   3 +
 net/vhost-vdpa.c                   | 198 ++++++++++++++++++++++++-----
 hw/virtio/trace-events             |   1 +
 7 files changed, 273 insertions(+), 72 deletions(-)

Tested-by: Alvaro Karsz
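
As a rough illustration of the dynamic switch this cover letter describes, here is a minimal, self-contained C sketch. The names (MigPhase, vdpa_net_should_shadow) are hypothetical, not QEMU's actual API; in the series itself a migration state notifier consults helpers such as migration_in_setup and migration_has_failed. The idea is only that the dataplane is shadowed while a migration is being set up or running, and passthrough is restored if the migration fails:

```c
#include <stdbool.h>

/* Hypothetical migration phases, standing in for QEMU's migration status. */
typedef enum {
    MIG_NONE,       /* no migration in progress: full passthrough */
    MIG_SETUP,      /* migration starting: switch dataplane to SVQ */
    MIG_ACTIVE,     /* migrating: keep SVQ so guest memory writes are logged */
    MIG_COMPLETED,  /* done: source device no longer needs shadowing */
    MIG_FAILED      /* aborted: return to passthrough */
} MigPhase;

/* Hypothetical helper: should the vdpa net dataplane be shadowed now? */
bool vdpa_net_should_shadow(MigPhase phase)
{
    return phase == MIG_SETUP || phase == MIG_ACTIVE;
}
```

The sketch only captures the phase-to-mode decision; in the real series the switch additionally has to suspend the device (or reset it and recover used_idx from the guest's vring when _F_SUSPEND is absent), fetch the vring base, and restart the queues through the shadow virtqueues.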