From patchwork Wed Jan 29 14:42:57 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953834 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 51143C02190 for ; Wed, 29 Jan 2025 14:49:27 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IK-0000Wc-Hw; Wed, 29 Jan 2025 09:43:56 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9ID-0000Uk-Sa for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:50 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IB-0001K2-19 for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:49 -0500 Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXjiX006526; Wed, 29 Jan 2025 14:43:43 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=9YDFJVbeWYlSk5i0b+QkftmgZEgNJGrHX5K7RcfeXDY=; b= my2eCscpz6T49vTZM54ZWhGill6X7S65JWLMjYIsQ/3p/wWWL+sKmuC+v8oU3hcq oCvdtLhdZNYWsXEC3lNc6wxHZHnQDjwf2Schsc5ENUFW/uDRcj9hwaf1A4dGhafm MDEQaa6kBmuTC5eG/Seq5HBUdoMOmezTuLKfm2DGk5lrtuny1laoFsuU/15s3XVb 8F40fpf7RTndJJNAfJBAHq49PR5/n9aaZ+4vrci2ceCYw9LZp0Lx0BfWGZkLkBhM OlBXniqwgmFZoTezrtZiB3yFQp2NHoQ2viuP5nPECs9prBT4IqP+qgJNe8WW6Du9 lJL1fWLreIwMqFE26gUaBw== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fmf808sc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:43 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDus2m034933; Wed, 29 Jan 2025 14:43:42 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4nu-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:42 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8G003307; Wed, 29 Jan 2025 14:43:42 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-2; Wed, 29 Jan 2025 14:43:41 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 01/26] migration: cpr helpers Date: Wed, 29 Jan 2025 06:42:57 -0800 Message-Id: <1738161802-172631-2-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: bqgA4NCt0pXucOXF6VTdhpp2XU5laM2K X-Proofpoint-ORIG-GUID: bqgA4NCt0pXucOXF6VTdhpp2XU5laM2K Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Add the cpr_needed_for_reuse and cpr_resave_fd helpers, for use when adding cpr support for vfio and iommufd. Signed-off-by: Steve Sistare --- include/migration/cpr.h | 3 +++ migration/cpr.c | 21 +++++++++++++++++++++ 2 files changed, 24 insertions(+) diff --git a/include/migration/cpr.h b/include/migration/cpr.h index 3a6deb7..faeb0cc 100644 --- a/include/migration/cpr.h +++ b/include/migration/cpr.h @@ -18,6 +18,7 @@ void cpr_save_fd(const char *name, int id, int fd); void cpr_delete_fd(const char *name, int id); int cpr_find_fd(const char *name, int id); +void cpr_resave_fd(const char *name, int id, int fd); MigMode cpr_get_incoming_mode(void); void cpr_set_incoming_mode(MigMode mode); @@ -27,6 +28,8 @@ int cpr_state_load(MigrationChannel *channel, Error **errp); void cpr_state_close(void); struct QIOChannel *cpr_state_ioc(void); +bool cpr_needed_for_reuse(void *opaque); + QEMUFile *cpr_transfer_output(MigrationChannel *channel, Error **errp); QEMUFile *cpr_transfer_input(MigrationChannel *channel, Error **errp); diff --git a/migration/cpr.c b/migration/cpr.c index 584b0b9..e3f27e9 100644 --- a/migration/cpr.c +++ b/migration/cpr.c @@ -95,6 +95,21 @@ int cpr_find_fd(const char *name, int id) trace_cpr_find_fd(name, id, fd); return fd; } + +void cpr_resave_fd(const char *name, int id, int fd) +{ + CprFd *elem = find_fd(&cpr_state.fds, name, id); + int old_fd = elem ? elem->fd : -1; + + if (old_fd < 0) { + cpr_save_fd(name, id, fd); + } else if (old_fd != fd) { + error_setg(&error_fatal, + "internal error: cpr fd '%s' id %d value %d " + "already saved with a different value %d", + name, id, fd, old_fd); + } +} /*************************************************************************/ #define CPR_STATE "CprState" @@ -222,3 +237,9 @@ void cpr_state_close(void) cpr_state_file = NULL; } } + +bool cpr_needed_for_reuse(void *opaque) +{ + MigMode mode = migrate_mode(); + return mode == MIG_MODE_CPR_TRANSFER; +} From patchwork Wed Jan 29 14:42:58 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953819 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E132CC0218D for ; Wed, 29 Jan 2025 14:47:32 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IO-0000YU-An; Wed, 29 Jan 2025 09:44:00 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IF-0000V1-1U for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:51 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IB-0001KE-Iw for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:50 -0500 Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXubm030033; Wed, 29 Jan 2025 14:43:44 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=E9od75KsDI1UDodFCf9AtXjSsO7FaXEqr0f9p3pxvFc=; b= Lm6lZL8oDE6JcJv86qbnOl39eoLEpb6xkxL9jBsnjUo8hTl4E2e0S0bGwbv13KzI 5pRmTmnaGnf4kxG2oqRXrzAzqEAsSPod4zIX7rb8tXvzqWqOWJDbuuSSRv484cx+ up4xgtGSFqOYxfHvGslAHPg+zPt5QM8VaQedhmsXVqvL57uOpUNz1cVy+PA8gePG hdBUTu44h99HPQxNCwudVX3IZcsVWOmXKRaxiKdJmmt72vSaFBdbiCG1r9teKPu2 VEMCDVww8Jh7vo95yspwiu2v97B1uRtAvHTmsd3rZbyqKMgniBM1cwMfdkHgCd3Y 6N4Q9U6m4m1ROJ5au3gmIg== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fn2ug601-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:44 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDQChk037049; Wed, 29 Jan 2025 14:43:43 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4p7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:43 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8I003307; Wed, 29 Jan 2025 14:43:42 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-3; Wed, 29 Jan 2025 14:43:42 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 02/26] migration: lower handler priority Date: Wed, 29 Jan 2025 06:42:58 -0800 Message-Id: <1738161802-172631-3-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-ORIG-GUID: 6bkExv9r8OzgTquyfJWPNWLICwQtfFMe X-Proofpoint-GUID: 6bkExv9r8OzgTquyfJWPNWLICwQtfFMe Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Define a vmstate priority that is lower than the default, so its handlers run after all default priority handlers. Since 0 is no longer the default priority, translate an uninitialized priority of 0 to MIG_PRI_DEFAULT. CPR for vfio will use this to install handlers for containers that run after handlers for the devices that they contain. Signed-off-by: Steve Sistare --- include/migration/vmstate.h | 3 ++- migration/savevm.c | 4 ++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h index a1dfab4..3055a46 100644 --- a/include/migration/vmstate.h +++ b/include/migration/vmstate.h @@ -155,7 +155,8 @@ enum VMStateFlags { }; typedef enum { - MIG_PRI_DEFAULT = 0, + MIG_PRI_LOW = 1, /* Must happen after default */ + MIG_PRI_DEFAULT, MIG_PRI_IOMMU, /* Must happen before PCI devices */ MIG_PRI_PCI_BUS, /* Must happen before IOMMU */ MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */ diff --git a/migration/savevm.c b/migration/savevm.c index 264bc06..5dd2dc4 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -232,7 +232,7 @@ typedef struct SaveState { static SaveState savevm_state = { .handlers = QTAILQ_HEAD_INITIALIZER(savevm_state.handlers), - .handler_pri_head = { [MIG_PRI_DEFAULT ... MIG_PRI_MAX] = NULL }, + .handler_pri_head = { [0 ... MIG_PRI_MAX] = NULL }, .global_section_id = 0, }; @@ -704,7 +704,7 @@ static int calculate_compat_instance_id(const char *idstr) static inline MigrationPriority save_state_priority(SaveStateEntry *se) { - if (se->vmsd) { + if (se->vmsd && se->vmsd->priority) { return se->vmsd->priority; } return MIG_PRI_DEFAULT; From patchwork Wed Jan 29 14:42:59 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953820 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 952C6C0218D for ; Wed, 29 Jan 2025 14:47:43 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IO-0000YV-Bd; Wed, 29 Jan 2025 09:44:00 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IF-0000V2-1a for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:51 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IB-0001KJ-Hb for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:50 -0500 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXsLf011637; Wed, 29 Jan 2025 14:43:45 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=+Iwsp7trPWLEgogrVD3j3VhwXPXWDwHPvQKMLMMUmX8=; b= mGCdLZVJ6uZ87eL+iBn6Vx3jezIM6QQKaeMhbaKQBLwFLxviWUIhm8RYCHG4Yj8G Bz2wYa5zdlmsFZhOYKzgAF6zSKmyRgXTr2d3gGmn/I2306KTLyZRMr3RFZyEIWDs pFpDchxMn9LbgCu04rgZ3oYw/qQ7a1RPZFohoKkG1bpH9qeX6XXpUysUJBCkuhQ9 xxC0rvyastZ0FDToAdgzp2GmCwljqny4na2KcLXvNJ8Vj30GUSe0aSHNRW0lRGfc 9p2pFtQBU9VqEcRWBnBjS2qSpO1toJcuxK5CQsUo9I5SJN/tmd8tWlRZepFMiXoP UjcvFOI5O+0ZyE6s88zyHg== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fnk9g3hy-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:44 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TE74Kc037042; Wed, 29 Jan 2025 14:43:44 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4pd-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:44 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8K003307; Wed, 29 Jan 2025 14:43:43 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-4; Wed, 29 Jan 2025 14:43:43 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 03/26] vfio: vfio_find_ram_discard_listener Date: Wed, 29 Jan 2025 06:42:59 -0800 Message-Id: <1738161802-172631-4-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-ORIG-GUID: jDfzW7I2hK_wCx6JeGuOOO0KaAjLbADM X-Proofpoint-GUID: jDfzW7I2hK_wCx6JeGuOOO0KaAjLbADM Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Define vfio_find_ram_discard_listener as a subroutine so additional calls to it may be added in a subsequent patch. Signed-off-by: Steve Sistare --- hw/vfio/common.c | 35 ++++++++++++++++++++++------------- 1 file changed, 22 insertions(+), 13 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index f7499a9..7370332 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -555,6 +555,26 @@ static bool vfio_get_section_iova_range(VFIOContainerBase *bcontainer, return true; } +static VFIORamDiscardListener *vfio_find_ram_discard_listener( + VFIOContainerBase *bcontainer, MemoryRegionSection *section) +{ + VFIORamDiscardListener *vrdl = NULL; + + QLIST_FOREACH(vrdl, &bcontainer->vrdl_list, next) { + if (vrdl->mr == section->mr && + vrdl->offset_within_address_space == + section->offset_within_address_space) { + break; + } + } + + if (!vrdl) { + hw_error("vfio: Trying to sync missing RAM discard listener"); + /* does not return */ + } + return vrdl; +} + static void vfio_listener_region_add(MemoryListener *listener, MemoryRegionSection *section) { @@ -1266,19 +1286,8 @@ vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainerBase *bcontainer, MemoryRegionSection *section) { RamDiscardManager *rdm = memory_region_get_ram_discard_manager(section->mr); - VFIORamDiscardListener *vrdl = NULL; - - QLIST_FOREACH(vrdl, &bcontainer->vrdl_list, next) { - if (vrdl->mr == section->mr && - vrdl->offset_within_address_space == - section->offset_within_address_space) { - break; - } - } - - if (!vrdl) { - hw_error("vfio: Trying to sync missing RAM discard listener"); - } + VFIORamDiscardListener *vrdl = + vfio_find_ram_discard_listener(bcontainer, section); /* * We only want/can synchronize the bitmap for actually mapped parts - From patchwork Wed Jan 29 14:43:00 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953808 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 48086C0218D for ; Wed, 29 Jan 2025 14:44:48 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IJ-0000WS-3i; Wed, 29 Jan 2025 09:43:55 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IF-0000V3-2p for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:51 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IC-0001KS-NG for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:50 -0500 Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXjLi006509; Wed, 29 Jan 2025 14:43:46 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=equkl4PFueOKpfkrVaJeUTC9NSmqllHeha8OhdSWRB8=; b= O6cRx/XRVDT7jjjfxccrh3Nfgf0MrYUXSt7yTHPXbTNR22S0LQhd2EVAz7AJDjKD fR6jfBJT7PnSqYYBEZ7NL2Um9O/xTNY6+uNh/r55C7uMAyESidn4rmOHbWo8eRtm 0PkEF3VJnK/I7eaABkALcfvznFyWhNri5uf0+LekmOK60YvhKaDDCzLtFt2JSDf+ KyiJk3flzgrX93mRJ+c0kAbMt4QXPDeT+0p+Pei2pM0Ylv/4Cr4WtxiA7Ep6g+Yr 0nawg+zO789MMSgja7P/Dli133S/N9PixsDgudM3YBayiKdXFMk/RAxMfUSmgYSE D1TzDaD/LUU5DI7mjExImA== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fmf808sw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:45 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDC19K034407; Wed, 29 Jan 2025 14:43:44 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4px-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:44 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8M003307; Wed, 29 Jan 2025 14:43:44 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-5; Wed, 29 Jan 2025 14:43:44 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 04/26] vfio/container: register container for cpr Date: Wed, 29 Jan 2025 06:43:00 -0800 Message-Id: <1738161802-172631-5-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: LnareuZpnGzbKkOFSMJ-TJUCtOS78XvV X-Proofpoint-ORIG-GUID: LnareuZpnGzbKkOFSMJ-TJUCtOS78XvV Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Register a legacy container for cpr-transfer. Add a blocker if the kernel does not support VFIO_UPDATE_VADDR or VFIO_UNMAP_ALL. This is mostly boiler plate. The fields to to saved and restored are added in subsequent patches. Signed-off-by: Steve Sistare --- hw/vfio/container.c | 6 ++-- hw/vfio/cpr-legacy.c | 68 +++++++++++++++++++++++++++++++++++++++++++ hw/vfio/meson.build | 3 +- include/hw/vfio/vfio-common.h | 3 ++ 4 files changed, 76 insertions(+), 4 deletions(-) create mode 100644 hw/vfio/cpr-legacy.c diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 4ebb526..a90ce6c 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -618,7 +618,7 @@ static bool vfio_connect_container(VFIOGroup *group, AddressSpace *as, } bcontainer = &container->bcontainer; - if (!vfio_cpr_register_container(bcontainer, errp)) { + if (!vfio_legacy_cpr_register_container(container, errp)) { goto free_container_exit; } @@ -666,7 +666,7 @@ enable_discards_exit: vfio_ram_block_discard_disable(container, false); unregister_container_exit: - vfio_cpr_unregister_container(bcontainer); + vfio_legacy_cpr_unregister_container(container); free_container_exit: object_unref(container); @@ -710,7 +710,7 @@ static void vfio_disconnect_container(VFIOGroup *group) VFIOAddressSpace *space = bcontainer->space; trace_vfio_disconnect_container(container->fd); - vfio_cpr_unregister_container(bcontainer); + vfio_legacy_cpr_unregister_container(container); close(container->fd); object_unref(container); diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c new file mode 100644 index 0000000..d3bbc05 --- /dev/null +++ b/hw/vfio/cpr-legacy.c @@ -0,0 +1,68 @@ +/* + * Copyright (c) 2021-2025 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include +#include "qemu/osdep.h" +#include "hw/vfio/vfio-common.h" +#include "migration/blocker.h" +#include "migration/cpr.h" +#include "migration/migration.h" +#include "migration/vmstate.h" +#include "qapi/error.h" + +static bool vfio_cpr_supported(VFIOContainer *container, Error **errp) +{ + if (!ioctl(container->fd, VFIO_CHECK_EXTENSION, VFIO_UPDATE_VADDR)) { + error_setg(errp, "VFIO container does not support VFIO_UPDATE_VADDR"); + return false; + + } else if (!ioctl(container->fd, VFIO_CHECK_EXTENSION, VFIO_UNMAP_ALL)) { + error_setg(errp, "VFIO container does not support VFIO_UNMAP_ALL"); + return false; + + } else { + return true; + } +} + +static const VMStateDescription vfio_container_vmstate = { + .name = "vfio-container", + .version_id = 0, + .minimum_version_id = 0, + .needed = cpr_needed_for_reuse, + .fields = (VMStateField[]) { + VMSTATE_END_OF_LIST() + } +}; + +bool vfio_legacy_cpr_register_container(VFIOContainer *container, Error **errp) +{ + VFIOContainerBase *bcontainer = &container->bcontainer; + Error **cpr_blocker = &container->cpr_blocker; + + if (!vfio_cpr_register_container(bcontainer, errp)) { + return false; + } + + if (!vfio_cpr_supported(container, cpr_blocker)) { + return migrate_add_blocker_modes(cpr_blocker, errp, + MIG_MODE_CPR_TRANSFER, -1) == 0; + } + + vmstate_register(NULL, -1, &vfio_container_vmstate, container); + + return true; +} + +void vfio_legacy_cpr_unregister_container(VFIOContainer *container) +{ + VFIOContainerBase *bcontainer = &container->bcontainer; + + vfio_cpr_unregister_container(bcontainer); + migrate_del_blocker(&container->cpr_blocker); + vmstate_unregister(NULL, &vfio_container_vmstate, container); +} diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build index bba776f..5487815 100644 --- a/hw/vfio/meson.build +++ b/hw/vfio/meson.build @@ -5,13 +5,14 @@ vfio_ss.add(files( 'container-base.c', 'container.c', 'migration.c', - 'cpr.c', )) vfio_ss.add(when: 'CONFIG_PSERIES', if_true: files('spapr.c')) vfio_ss.add(when: 'CONFIG_IOMMUFD', if_true: files( 'iommufd.c', )) vfio_ss.add(when: 'CONFIG_VFIO_PCI', if_true: files( + 'cpr.c', + 'cpr-legacy.c', 'display.c', 'pci-quirks.c', 'pci.c', diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 0c60be5..53e554f 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -84,6 +84,7 @@ typedef struct VFIOContainer { VFIOContainerBase bcontainer; int fd; /* /dev/vfio/vfio, empowered by the attached groups */ unsigned iommu_type; + Error *cpr_blocker; QLIST_HEAD(, VFIOGroup) group_list; } VFIOContainer; @@ -258,6 +259,8 @@ int vfio_kvm_device_del_fd(int fd, Error **errp); bool vfio_cpr_register_container(VFIOContainerBase *bcontainer, Error **errp); void vfio_cpr_unregister_container(VFIOContainerBase *bcontainer); +bool vfio_legacy_cpr_register_container(VFIOContainer *container, Error **errp); +void vfio_legacy_cpr_unregister_container(VFIOContainer *container); extern const MemoryRegionOps vfio_region_ops; typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList; From patchwork Wed Jan 29 14:43:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953821 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 911CFC02190 for ; Wed, 29 Jan 2025 14:47:46 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IR-0000aK-3Q; Wed, 29 Jan 2025 09:44:03 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IG-0000VZ-C2 for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:52 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9ID-0001KX-3B for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:51 -0500 Received: from pps.filterd (m0333520.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXq2w006366; Wed, 29 Jan 2025 14:43:46 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=/wgzN+mc7d/TiXv74t2VfLwHkMPhU6AEjb1uYCnlGds=; b= izQsHjMi0/133rwB7pjRMchDZPsroPwEZRGAZwoUI0DyiQvgzhh97pH/ZyyAXJSL QbzRFLr8RR8Sf0hwksMA5cnZVHqv4eeNR5eKT6SeonPfcUwrl+cGFufBQEBIX/Qt U8bXs6E5ktu46Mw2f8xUYCoaQKVdZSSyQoei7j5Phca1MCM2a7BjusHxHDNGMqd0 INyE93FwdKdieW/rOhHXEh3n2M6YhJaLw89V11JdzsF952csQN5woXI46dGMASxz x3fSrPNWwuqU0zWECBWAPjgkf4GjKoYPeOZVjmrKw6WHouoqU7eiqtwJMuD3bV2R L87Ils2L6Go7rObo34v8Zg== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fmut8722-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:46 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TERLCB036983; Wed, 29 Jan 2025 14:43:45 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4qd-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:45 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8O003307; Wed, 29 Jan 2025 14:43:44 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-6; Wed, 29 Jan 2025 14:43:44 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 05/26] vfio/container: preserve descriptors Date: Wed, 29 Jan 2025 06:43:01 -0800 Message-Id: <1738161802-172631-6-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: P4t5w7olCeqdGegbXuBXHlal8MipYZli X-Proofpoint-ORIG-GUID: P4t5w7olCeqdGegbXuBXHlal8MipYZli Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org At vfio creation time, save the value of vfio container, group, and device descriptors in CPR state. On qemu restart, vfio_realize() finds and uses the saved descriptors, and remembers the reused status for subsequent patches. The reused status is cleared when vmstate load finishes. During reuse, device and iommu state is already configured, so operations in vfio_realize that would modify the configuration, such as vfio ioctl's, are skipped. The result is that vfio_realize constructs qemu data structures that reflect the current state of the device. Signed-off-by: Steve Sistare --- hw/vfio/container.c | 105 ++++++++++++++++++++++++++++++++++-------- hw/vfio/cpr-legacy.c | 17 +++++++ include/hw/vfio/vfio-common.h | 2 + 3 files changed, 105 insertions(+), 19 deletions(-) diff --git a/hw/vfio/container.c b/hw/vfio/container.c index a90ce6c..81d0ccc 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -31,6 +31,7 @@ #include "system/reset.h" #include "trace.h" #include "qapi/error.h" +#include "migration/cpr.h" #include "pci.h" VFIOGroupList vfio_group_list = @@ -415,12 +416,28 @@ static bool vfio_set_iommu(int container_fd, int group_fd, } static VFIOContainer *vfio_create_container(int fd, VFIOGroup *group, - Error **errp) + bool reused, Error **errp) { int iommu_type; const char *vioc_name; VFIOContainer *container; + /* + * If container is reused, just set its type and skip the ioctls, as the + * container and group are already configured in the kernel. + * VFIO_TYPE1v2_IOMMU is the only type that supports reuse/cpr. + */ + if (reused) { + if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU)) { + iommu_type = VFIO_TYPE1v2_IOMMU; + goto skip_iommu; + } else { + error_setg(errp, "container was reused but VFIO_TYPE1v2_IOMMU " + "is not supported"); + return NULL; + } + } + iommu_type = vfio_get_iommu_type(fd, errp); if (iommu_type < 0) { return NULL; @@ -430,10 +447,12 @@ static VFIOContainer *vfio_create_container(int fd, VFIOGroup *group, return NULL; } +skip_iommu: vioc_name = vfio_get_iommu_class_name(iommu_type); container = VFIO_IOMMU_LEGACY(object_new(vioc_name)); container->fd = fd; + container->reused = reused; container->iommu_type = iommu_type; return container; } @@ -543,10 +562,13 @@ static bool vfio_connect_container(VFIOGroup *group, AddressSpace *as, VFIOContainer *container; VFIOContainerBase *bcontainer; int ret, fd; + bool reused; VFIOAddressSpace *space; VFIOIOMMUClass *vioc; space = vfio_get_address_space(as); + fd = cpr_find_fd("vfio_container_for_group", group->groupid); + reused = (fd > 0); /* * VFIO is currently incompatible with discarding of RAM insofar as the @@ -579,28 +601,52 @@ static bool vfio_connect_container(VFIOGroup *group, AddressSpace *as, * details once we know which type of IOMMU we are using. */ + /* + * If the container is reused, then the group is already attached in the + * kernel. If a container with matching fd is found, then update the + * userland group list and return. If not, then after the loop, create + * the container struct and group list. + */ + QLIST_FOREACH(bcontainer, &space->containers, next) { container = container_of(bcontainer, VFIOContainer, bcontainer); - if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) { - ret = vfio_ram_block_discard_disable(container, true); - if (ret) { - error_setg_errno(errp, -ret, - "Cannot set discarding of RAM broken"); - if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, - &container->fd)) { - error_report("vfio: error disconnecting group %d from" - " container", group->groupid); - } - return false; + + if (reused) { + if (container->fd != fd) { + continue; } - group->container = container; - QLIST_INSERT_HEAD(&container->group_list, group, container_next); + } else if (ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) { + continue; + } + + /* Container is a match for the group */ + ret = vfio_ram_block_discard_disable(container, true); + if (ret) { + error_setg_errno(errp, -ret, + "Cannot set discarding of RAM broken"); + if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, + &container->fd)) { + error_report("vfio: error disconnecting group %d from" + " container", group->groupid); + + } + goto delete_fd_exit; + } + group->container = container; + QLIST_INSERT_HEAD(&container->group_list, group, container_next); + if (!reused) { vfio_kvm_device_add_group(group); - return true; + cpr_save_fd("vfio_container_for_group", group->groupid, + container->fd); } + return true; + } + + /* No matching container found, create one */ + if (!reused) { + fd = qemu_open("/dev/vfio/vfio", O_RDWR, errp); } - fd = qemu_open("/dev/vfio/vfio", O_RDWR, errp); if (fd < 0) { goto put_space_exit; } @@ -612,11 +658,12 @@ static bool vfio_connect_container(VFIOGroup *group, AddressSpace *as, goto close_fd_exit; } - container = vfio_create_container(fd, group, errp); + container = vfio_create_container(fd, group, reused, errp); if (!container) { goto close_fd_exit; } bcontainer = &container->bcontainer; + container->reused = reused; if (!vfio_legacy_cpr_register_container(container, errp)) { goto free_container_exit; @@ -652,6 +699,7 @@ static bool vfio_connect_container(VFIOGroup *group, AddressSpace *as, } bcontainer->initialized = true; + cpr_resave_fd("vfio_container_for_group", group->groupid, fd); return true; listener_release_exit: @@ -677,6 +725,8 @@ close_fd_exit: put_space_exit: vfio_put_address_space(space); +delete_fd_exit: + cpr_delete_fd("vfio_container_for_group", group->groupid); return false; } @@ -688,6 +738,7 @@ static void vfio_disconnect_container(VFIOGroup *group) QLIST_REMOVE(group, container_next); group->container = NULL; + cpr_delete_fd("vfio_container_for_group", group->groupid); /* * Explicitly release the listener first before unset container, @@ -741,7 +792,12 @@ static VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp) group = g_malloc0(sizeof(*group)); snprintf(path, sizeof(path), "/dev/vfio/%d", groupid); - group->fd = qemu_open(path, O_RDWR, errp); + + group->fd = cpr_find_fd("vfio_group", groupid); + if (group->fd < 0) { + group->fd = qemu_open(path, O_RDWR, errp); + } + if (group->fd < 0) { goto free_group_exit; } @@ -769,6 +825,7 @@ static VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp) } QLIST_INSERT_HEAD(&vfio_group_list, group, next); + cpr_resave_fd("vfio_group", groupid, group->fd); return group; @@ -794,6 +851,7 @@ static void vfio_put_group(VFIOGroup *group) vfio_disconnect_container(group); QLIST_REMOVE(group, next); trace_vfio_put_group(group->fd); + cpr_delete_fd("vfio_group", group->groupid); close(group->fd); g_free(group); } @@ -803,8 +861,14 @@ static bool vfio_get_device(VFIOGroup *group, const char *name, { g_autofree struct vfio_device_info *info = NULL; int fd; + bool reused; + + fd = cpr_find_fd(name, 0); + reused = (fd >= 0); + if (!reused) { + fd = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name); + } - fd = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name); if (fd < 0) { error_setg_errno(errp, errno, "error getting device from group %d", group->groupid); @@ -849,6 +913,8 @@ static bool vfio_get_device(VFIOGroup *group, const char *name, vbasedev->num_irqs = info->num_irqs; vbasedev->num_regions = info->num_regions; vbasedev->flags = info->flags; + vbasedev->reused = reused; + cpr_resave_fd(name, 0, fd); trace_vfio_get_device(name, info->flags, info->num_regions, info->num_irqs); @@ -865,6 +931,7 @@ static void vfio_put_base_device(VFIODevice *vbasedev) QLIST_REMOVE(vbasedev, next); vbasedev->group = NULL; trace_vfio_put_base_device(vbasedev->fd); + cpr_delete_fd(vbasedev->name, 0); close(vbasedev->fd); } diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c index d3bbc05..ce6f14e 100644 --- a/hw/vfio/cpr-legacy.c +++ b/hw/vfio/cpr-legacy.c @@ -29,10 +29,27 @@ static bool vfio_cpr_supported(VFIOContainer *container, Error **errp) } } +static int vfio_container_post_load(void *opaque, int version_id) +{ + VFIOContainer *container = opaque; + VFIOGroup *group; + VFIODevice *vbasedev; + + container->reused = false; + + QLIST_FOREACH(group, &container->group_list, container_next) { + QLIST_FOREACH(vbasedev, &group->device_list, next) { + vbasedev->reused = false; + } + } + return 0; +} + static const VMStateDescription vfio_container_vmstate = { .name = "vfio-container", .version_id = 0, .minimum_version_id = 0, + .post_load = vfio_container_post_load, .needed = cpr_needed_for_reuse, .fields = (VMStateField[]) { VMSTATE_END_OF_LIST() diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 53e554f..a435a90 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -85,6 +85,7 @@ typedef struct VFIOContainer { int fd; /* /dev/vfio/vfio, empowered by the attached groups */ unsigned iommu_type; Error *cpr_blocker; + bool reused; QLIST_HEAD(, VFIOGroup) group_list; } VFIOContainer; @@ -135,6 +136,7 @@ typedef struct VFIODevice { bool ram_block_discard_allowed; OnOffAuto enable_migration; bool migration_events; + bool reused; VFIODeviceOps *ops; unsigned int num_irqs; unsigned int num_regions; From patchwork Wed Jan 29 14:43:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953805 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5E813C0218D for ; Wed, 29 Jan 2025 14:44:40 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IL-0000Xy-MI; Wed, 29 Jan 2025 09:43:57 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IG-0000Va-CH for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:52 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9ID-0001Kg-MH for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:51 -0500 Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXofu029996; Wed, 29 Jan 2025 14:43:47 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=dxO7RmoHpuyW7MIQrWXdvs9viXY+6yordmUe816rFj8=; b= MyDsoodEdcVMtx1o+1YSZsQtywkEeMuL2nrgjq1XOvVkA+heCISXDKWJDEX8Fcud q5g/vDhLqyaJ34uy5gNtrDA3Xvxzu2mav8bqrmAAWQ5zvXRURtFE95c0301c+DtU fqwU8iYYAyYyVHkdKe3zFNAyAVd6B/x+F92g/uipIdy75sleBDYA5QkC24FT3e+V sEMgzIbCBHgnKscL+z3Y+aNUAGOSzeqNNH0vdUR8tKbzTnugiy4qyx9Vr2wUWe6O qd69nvmXS7pLckSui1SQwOPeHzmhk5njxl2gA35gOZKjoyrQfOLfVJ/4wOyg4oJB EpS3zrpMYljjCXAWUMquBg== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fn2ug602-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:47 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEGPIL034279; Wed, 29 Jan 2025 14:43:46 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4qq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:46 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8Q003307; Wed, 29 Jan 2025 14:43:45 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-7; Wed, 29 Jan 2025 14:43:45 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 06/26] vfio/container: preserve DMA mappings Date: Wed, 29 Jan 2025 06:43:02 -0800 Message-Id: <1738161802-172631-7-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-ORIG-GUID: d8-N8NYa9Y5YbUuwJIA_q-jZQleOcuLw X-Proofpoint-GUID: d8-N8NYa9Y5YbUuwJIA_q-jZQleOcuLw Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Preserve DMA mappings during cpr-transfer. In the container pre_save handler, suspend the use of virtual addresses in DMA mappings with VFIO_DMA_UNMAP_FLAG_VADDR, because guest RAM will be remapped at a different VA after exec. DMA to already-mapped pages continues. Because the vaddr is temporarily invalid, mediated devices cannot be supported, so add a blocker for them. This restriction will not apply to iommufd containers when CPR is added for them in a future patch. In new QEMU, do not register the memory listener at device creation time. Register it later, in the container post_load handler, after all vmstate that may affect regions and mapping boundaries has been loaded. The post_load registration will cause the listener to invoke its callback on each flat section, and the calls will match the mappings remembered by the kernel. Modify vfio_dma_map (which is called by the listener) to pass the new VA to the kernel using VFIO_DMA_MAP_FLAG_VADDR. Signed-off-by: Steve Sistare --- hw/vfio/container.c | 44 +++++++++++++++++++++++++++++++++++++++---- hw/vfio/cpr-legacy.c | 32 +++++++++++++++++++++++++++++++ include/hw/vfio/vfio-common.h | 3 +++ 3 files changed, 75 insertions(+), 4 deletions(-) diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 81d0ccc..2b5125e 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -32,6 +32,7 @@ #include "trace.h" #include "qapi/error.h" #include "migration/cpr.h" +#include "migration/blocker.h" #include "pci.h" VFIOGroupList vfio_group_list = @@ -132,6 +133,8 @@ static int vfio_legacy_dma_unmap(const VFIOContainerBase *bcontainer, int ret; Error *local_err = NULL; + assert(!container->reused); + if (iotlb && vfio_devices_all_dirty_tracking_started(bcontainer)) { if (!vfio_devices_all_device_dirty_tracking(bcontainer) && bcontainer->dirty_pages_supported) { @@ -183,12 +186,24 @@ static int vfio_legacy_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova, bcontainer); struct vfio_iommu_type1_dma_map map = { .argsz = sizeof(map), - .flags = VFIO_DMA_MAP_FLAG_READ, .vaddr = (__u64)(uintptr_t)vaddr, .iova = iova, .size = size, }; + /* + * Set the new vaddr for any mappings registered during cpr load. + * Reused is cleared thereafter. + */ + if (container->reused) { + map.flags = VFIO_DMA_MAP_FLAG_VADDR; + if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map)) { + goto fail; + } + return 0; + } + + map.flags = VFIO_DMA_MAP_FLAG_READ; if (!readonly) { map.flags |= VFIO_DMA_MAP_FLAG_WRITE; } @@ -205,7 +220,11 @@ static int vfio_legacy_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova, return 0; } - error_report("VFIO_MAP_DMA failed: %s", strerror(errno)); +fail: + error_report("vfio_dma_map %s (iova %lu, size %ld, va %p): %s", + (container->reused ? "VADDR" : ""), iova, size, vaddr, + strerror(errno)); + return -errno; } @@ -689,8 +708,17 @@ static bool vfio_connect_container(VFIOGroup *group, AddressSpace *as, group->container = container; QLIST_INSERT_HEAD(&container->group_list, group, container_next); - bcontainer->listener = vfio_memory_listener; - memory_listener_register(&bcontainer->listener, bcontainer->space->as); + /* + * If reused, register the listener later, after all state that may + * affect regions and mapping boundaries has been cpr load'ed. Later, + * the listener will invoke its callback on each flat section and call + * vfio_dma_map to supply the new vaddr, and the calls will match the + * mappings remembered by the kernel. + */ + if (!reused) { + bcontainer->listener = vfio_memory_listener; + memory_listener_register(&bcontainer->listener, bcontainer->space->as); + } if (bcontainer->error) { error_propagate_prepend(errp, bcontainer->error, @@ -1002,6 +1030,13 @@ static bool vfio_legacy_attach_device(const char *name, VFIODevice *vbasedev, return false; } + if (vbasedev->mdev) { + error_setg(&vbasedev->cpr_mdev_blocker, + "CPR does not support vfio mdev %s", vbasedev->name); + migrate_add_blocker_modes(&vbasedev->cpr_mdev_blocker, &error_fatal, + MIG_MODE_CPR_TRANSFER, -1); + } + bcontainer = &group->container->bcontainer; vbasedev->bcontainer = bcontainer; QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next); @@ -1018,6 +1053,7 @@ static void vfio_legacy_detach_device(VFIODevice *vbasedev) QLIST_REMOVE(vbasedev, container_next); vbasedev->bcontainer = NULL; trace_vfio_detach_device(vbasedev->name, group->groupid); + migrate_del_blocker(&vbasedev->cpr_mdev_blocker); vfio_put_base_device(vbasedev); vfio_put_group(group); } diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c index ce6f14e..f3a31d1 100644 --- a/hw/vfio/cpr-legacy.c +++ b/hw/vfio/cpr-legacy.c @@ -14,6 +14,21 @@ #include "migration/vmstate.h" #include "qapi/error.h" +static bool vfio_dma_unmap_vaddr_all(VFIOContainer *container, Error **errp) +{ + struct vfio_iommu_type1_dma_unmap unmap = { + .argsz = sizeof(unmap), + .flags = VFIO_DMA_UNMAP_FLAG_VADDR | VFIO_DMA_UNMAP_FLAG_ALL, + .iova = 0, + .size = 0, + }; + if (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) { + error_setg_errno(errp, errno, "vfio_dma_unmap_vaddr_all"); + return false; + } + return true; +} + static bool vfio_cpr_supported(VFIOContainer *container, Error **errp) { if (!ioctl(container->fd, VFIO_CHECK_EXTENSION, VFIO_UPDATE_VADDR)) { @@ -29,12 +44,27 @@ static bool vfio_cpr_supported(VFIOContainer *container, Error **errp) } } +static int vfio_container_pre_save(void *opaque) +{ + VFIOContainer *container = opaque; + Error *err = NULL; + + if (!vfio_dma_unmap_vaddr_all(container, &err)) { + error_report_err(err); + return -1; + } + return 0; +} + static int vfio_container_post_load(void *opaque, int version_id) { VFIOContainer *container = opaque; + VFIOContainerBase *bcontainer = &container->bcontainer; VFIOGroup *group; VFIODevice *vbasedev; + bcontainer->listener = vfio_memory_listener; + memory_listener_register(&bcontainer->listener, bcontainer->space->as); container->reused = false; QLIST_FOREACH(group, &container->group_list, container_next) { @@ -49,6 +79,8 @@ static const VMStateDescription vfio_container_vmstate = { .name = "vfio-container", .version_id = 0, .minimum_version_id = 0, + .priority = MIG_PRI_LOW, /* Must happen after devices and groups */ + .pre_save = vfio_container_pre_save, .post_load = vfio_container_post_load, .needed = cpr_needed_for_reuse, .fields = (VMStateField[]) { diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index a435a90..1e974e0 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -143,6 +143,7 @@ typedef struct VFIODevice { unsigned int flags; VFIOMigration *migration; Error *migration_blocker; + Error *cpr_mdev_blocker; OnOffAuto pre_copy_dirty_page_tracking; OnOffAuto device_dirty_page_tracking; bool dirty_pages_supported; @@ -310,6 +311,8 @@ int vfio_devices_query_dirty_bitmap(const VFIOContainerBase *bcontainer, int vfio_get_dirty_bitmap(const VFIOContainerBase *bcontainer, uint64_t iova, uint64_t size, ram_addr_t ram_addr, Error **errp); +void vfio_listener_register(VFIOContainerBase *bcontainer); + /* Returns 0 on success, or a negative errno. */ bool vfio_device_get_name(VFIODevice *vbasedev, Error **errp); void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp); From patchwork Wed Jan 29 14:43:03 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953812 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1A67FC0218D for ; Wed, 29 Jan 2025 14:45:38 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IP-0000Zo-LG; Wed, 29 Jan 2025 09:44:01 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IG-0000W7-Qw for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:53 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IE-0001Ko-QD for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:52 -0500 Received: from pps.filterd (m0333521.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEbCQj007548; Wed, 29 Jan 2025 14:43:48 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=QKAEHNgfgJr719FTxd7rx1e4E3u189Wkt3eRkRWbTdI=; b= d9sndbHhIYGJHZGNJb9XIAojMU9m5rsM1gd9d/6mVfhZzk0NMRv3fMy6d+py4b7V iThSajSqQ+sXQsQf86fF8b3yWvUEiSzoA/T1hXOWrAYdmjAQFuO2y2shARIMwQGQ 44ZsvbZ3J2azXLcPUD4pfcWLqATwpBKuFoopCNmLyQRf6oIaPlBgi74CoZomqkjg iZD0pV++AtdoJi3AH2gZPdFyOhJH+t2uECmnJRsPK5YiJkZLDC6hJm8pN96wqhp/ 5nkfDHzZVgmWihzR2Gq5yjRxzDFmCpcfLCNJWj8WWGv+DRfjug1y/WjFDb1SrBa9 rKw6b1TMGtEPGzNSS05dbQ== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fp8e00qf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:47 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TE3Xqx036333; Wed, 29 Jan 2025 14:43:46 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4qw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:46 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8S003307; Wed, 29 Jan 2025 14:43:46 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-8; Wed, 29 Jan 2025 14:43:46 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 07/26] vfio/container: recover from unmap-all-vaddr failure Date: Wed, 29 Jan 2025 06:43:03 -0800 Message-Id: <1738161802-172631-8-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-ORIG-GUID: G5Wpx8pkceuc2bpQs5pqBBgg5ip0MY1s X-Proofpoint-GUID: G5Wpx8pkceuc2bpQs5pqBBgg5ip0MY1s Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org If there are multiple containers and unmap-all fails for some container, we need to remap vaddr for the other containers for which unmap-all succeeded. Recover by walking all address ranges of all containers to restore the vaddr for each. Do so by invoking the vfio listener callback, and passing a new "remap" flag that tells it to restore a mapping without re-allocating new userland data structures. Signed-off-by: Steve Sistare --- hw/vfio/common.c | 47 ++++++++++++++++++++++++++++++++++++++++++- hw/vfio/cpr-legacy.c | 44 ++++++++++++++++++++++++++++++++++++++++ include/hw/vfio/vfio-common.h | 6 +++++- 3 files changed, 95 insertions(+), 2 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 7370332..c8ee71a 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -580,6 +580,13 @@ static void vfio_listener_region_add(MemoryListener *listener, { VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase, listener); + vfio_container_region_add(bcontainer, section, false); +} + +void vfio_container_region_add(VFIOContainerBase *bcontainer, + MemoryRegionSection *section, + bool remap) +{ hwaddr iova, end; Int128 llend, llsize; void *vaddr; @@ -614,6 +621,30 @@ static void vfio_listener_region_add(MemoryListener *listener, int iommu_idx; trace_vfio_listener_region_add_iommu(section->mr->name, iova, end); + + /* + * If remap, then VFIO_DMA_UNMAP_FLAG_VADDR has been called, and we + * want to remap the vaddr. vfio_container_region_add was already + * called in the past, so the giommu already exists. Find it and + * replay it, which calls vfio_dma_map further down the stack. + */ + + if (remap) { + hwaddr as_offset = section->offset_within_address_space; + hwaddr iommu_offset = as_offset - section->offset_within_region; + + QLIST_FOREACH(giommu, &bcontainer->giommu_list, giommu_next) { + if (giommu->iommu_mr == iommu_mr && + giommu->iommu_offset == iommu_offset) { + memory_region_iommu_replay(giommu->iommu_mr, &giommu->n); + return; + } + } + error_report("Container cannot find iommu region %s offset %lx", + memory_region_name(section->mr), iommu_offset); + goto fail; + } + /* * FIXME: For VFIO iommu types which have KVM acceleration to * avoid bouncing all map/unmaps through qemu this way, this @@ -656,7 +687,21 @@ static void vfio_listener_region_add(MemoryListener *listener, * about changes. */ if (memory_region_has_ram_discard_manager(section->mr)) { - vfio_register_ram_discard_listener(bcontainer, section); + /* + * If remap, then VFIO_DMA_UNMAP_FLAG_VADDR has been called, and we + * want to remap the vaddr. vfio_container_region_add was already + * called in the past, so the ram discard listener already exists. + * Call its populate function directly, which calls vfio_dma_map. + */ + if (remap) { + VFIORamDiscardListener *vrdl = + vfio_find_ram_discard_listener(bcontainer, section); + if (vrdl->listener.notify_populate(&vrdl->listener, section)) { + error_report("listener.notify_populate failed"); + } + } else { + vfio_register_ram_discard_listener(bcontainer, section); + } return; } diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c index f3a31d1..3139de1 100644 --- a/hw/vfio/cpr-legacy.c +++ b/hw/vfio/cpr-legacy.c @@ -26,9 +26,18 @@ static bool vfio_dma_unmap_vaddr_all(VFIOContainer *container, Error **errp) error_setg_errno(errp, errno, "vfio_dma_unmap_vaddr_all"); return false; } + container->vaddr_unmapped = true; return true; } +static void vfio_region_remap(MemoryListener *listener, + MemoryRegionSection *section) +{ + VFIOContainer *container = container_of(listener, VFIOContainer, + remap_listener); + vfio_container_region_add(&container->bcontainer, section, true); +} + static bool vfio_cpr_supported(VFIOContainer *container, Error **errp) { if (!ioctl(container->fd, VFIO_CHECK_EXTENSION, VFIO_UPDATE_VADDR)) { @@ -88,6 +97,37 @@ static const VMStateDescription vfio_container_vmstate = { } }; +static int vfio_cpr_fail_notifier(NotifierWithReturn *notifier, + MigrationEvent *e, Error **errp) +{ + VFIOContainer *container = + container_of(notifier, VFIOContainer, cpr_transfer_notifier); + VFIOContainerBase *bcontainer = &container->bcontainer; + + if (e->type != MIG_EVENT_PRECOPY_FAILED) { + return 0; + } + + if (container->vaddr_unmapped) { + /* + * Force a call to vfio_region_remap for each mapped section by + * temporarily registering a listener, which calls vfio_dma_map + * further down the stack. Set reused so vfio_dma_map restores vaddr. + */ + container->reused = true; + container->remap_listener = (MemoryListener) { + .name = "vfio recover", + .region_add = vfio_region_remap + }; + memory_listener_register(&container->remap_listener, + bcontainer->space->as); + memory_listener_unregister(&container->remap_listener); + container->reused = false; + container->vaddr_unmapped = false; + } + return 0; +} + bool vfio_legacy_cpr_register_container(VFIOContainer *container, Error **errp) { VFIOContainerBase *bcontainer = &container->bcontainer; @@ -104,6 +144,9 @@ bool vfio_legacy_cpr_register_container(VFIOContainer *container, Error **errp) vmstate_register(NULL, -1, &vfio_container_vmstate, container); + migration_add_notifier_mode(&container->cpr_transfer_notifier, + vfio_cpr_fail_notifier, + MIG_MODE_CPR_TRANSFER); return true; } @@ -114,4 +157,5 @@ void vfio_legacy_cpr_unregister_container(VFIOContainer *container) vfio_cpr_unregister_container(bcontainer); migrate_del_blocker(&container->cpr_blocker); vmstate_unregister(NULL, &vfio_container_vmstate, container); + migration_remove_notifier(&container->cpr_transfer_notifier); } diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 1e974e0..8a4a658 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -86,6 +86,9 @@ typedef struct VFIOContainer { unsigned iommu_type; Error *cpr_blocker; bool reused; + bool vaddr_unmapped; + NotifierWithReturn cpr_transfer_notifier; + MemoryListener remap_listener; QLIST_HEAD(, VFIOGroup) group_list; } VFIOContainer; @@ -311,7 +314,8 @@ int vfio_devices_query_dirty_bitmap(const VFIOContainerBase *bcontainer, int vfio_get_dirty_bitmap(const VFIOContainerBase *bcontainer, uint64_t iova, uint64_t size, ram_addr_t ram_addr, Error **errp); -void vfio_listener_register(VFIOContainerBase *bcontainer); +void vfio_container_region_add(VFIOContainerBase *bcontainer, + MemoryRegionSection *section, bool remap); /* Returns 0 on success, or a negative errno. */ bool vfio_device_get_name(VFIODevice *vbasedev, Error **errp); From patchwork Wed Jan 29 14:43:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953810 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 12546C0218D for ; Wed, 29 Jan 2025 14:44:58 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IR-0000aL-Fw; Wed, 29 Jan 2025 09:44:03 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IG-0000Vb-Bv for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:52 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IE-0001Ks-Ql for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:52 -0500 Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXjb7006527; Wed, 29 Jan 2025 14:43:48 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=GGZcMSpCnGW6+Z/reWg2u9/0/Cu3Onsywiw/DQXPyOc=; b= fRtqVjSNugU98Fzi5cafu2c8cfRtAOnZsxDmxxYDJS9DazfwzLPkgi5Ota0rBiJA 0ZnVCVKidAE7t9d40KeFmqlonLxS6tmaq7QNt64GTlQJWUCkqTuFS7EyXBW/LaV6 ja6chnjjpSCGpq57U/lDEp9Ba+4WR1221fM4ngUXGVjPMdiZLalyElqRyb8pqYV9 fab7MubBKCObBOQhAQRC/JiAf24pzGCX/jMsejPvR1vZK613V5gfIu2TC0wOHmN4 sAk4xISqsPRJiPXTtcRIR+w6RN0Lk4dY5H7B8fDluf+ddsE3psehnKs2rli/eALj ZOe8KSRNqDBiJvGrPfUyaA== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fmf808tc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:48 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDo8GM034292; Wed, 29 Jan 2025 14:43:47 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4r2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:47 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8U003307; Wed, 29 Jan 2025 14:43:46 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-9; Wed, 29 Jan 2025 14:43:46 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 08/26] pci: skip reset during cpr Date: Wed, 29 Jan 2025 06:43:04 -0800 Message-Id: <1738161802-172631-9-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: yxD6vvUkSePz4OKhCOmbtBqIhLyz7hvz X-Proofpoint-ORIG-GUID: yxD6vvUkSePz4OKhCOmbtBqIhLyz7hvz Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Do not reset a vfio-pci device during CPR. Signed-off-by: Steve Sistare --- hw/pci/pci.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 2afa423..16b4f71 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -32,6 +32,7 @@ #include "hw/pci/pci_host.h" #include "hw/qdev-properties.h" #include "hw/qdev-properties-system.h" +#include "migration/misc.h" #include "migration/qemu-file-types.h" #include "migration/vmstate.h" #include "net/net.h" @@ -459,6 +460,18 @@ static void pci_reset_regions(PCIDevice *dev) static void pci_do_device_reset(PCIDevice *dev) { + /* + * A PCI device that is resuming for cpr is already configured, so do + * not reset it here when we are called from qemu_system_reset prior to + * cpr load, else interrupts may be lost for vfio-pci devices. It is + * safe to skip this reset for all PCI devices, because cpr load will set + * all fields that would have been set here. + */ + MigMode mode = migrate_mode(); + if (mode == MIG_MODE_CPR_TRANSFER) { + return; + } + pci_device_deassert_intx(dev); assert(dev->irq_state == 0); From patchwork Wed Jan 29 14:43:05 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953813 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 82DC8C02190 for ; Wed, 29 Jan 2025 14:45:47 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IS-0000aS-M7; Wed, 29 Jan 2025 09:44:04 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IH-0000WA-Kp for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:53 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IG-0001LQ-3y for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:53 -0500 Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXwBq030043; Wed, 29 Jan 2025 14:43:49 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=VpyJXhAsaPJ8USvuLDCkdI/slPrwul3k+jzms2oo3CU=; b= nncIkrkDKFXW6sEWpzhfz8ld1s6g/SU+70FW2IRT9zJcPZL1UYqR5+DwmHgRag+Z dclp0yWS5db0Q/uE6qxsvpghwa2cZng9oNMZbxsqdYz1NKOPNvpNFL9GlSifwd2V Rr81pCVub6Kt+arGmc8cYbS7i8GrC7Ur6eKZ+iBD6nmMmeIYtuT/VE0PpQ5UlIiN Qp4CpVI+h7n30Iqd4hwGWE5EqmYRBzsInsOOHMsxH6RpkcrytTo+hVV3oCCvLeVg i3d6fK8Im4qi4MKmON4b8hnIjs4qwiRKHjUDBgda0gB3XijWHXVzPJRWX/mz3qov 4vdoK2JARK3pKzdbXk0+dg== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fn2ug603-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:49 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDeCTF034303; Wed, 29 Jan 2025 14:43:48 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4rd-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:48 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8W003307; Wed, 29 Jan 2025 14:43:47 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-10; Wed, 29 Jan 2025 14:43:47 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 09/26] pci: export msix_is_pending Date: Wed, 29 Jan 2025 06:43:05 -0800 Message-Id: <1738161802-172631-10-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-ORIG-GUID: wuIzf7IvEn9EmFmDwokAzyf7gAtI7LYr X-Proofpoint-GUID: wuIzf7IvEn9EmFmDwokAzyf7gAtI7LYr Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Export msix_is_pending for use by cpr. No functional change. Signed-off-by: Steve Sistare Acked-by: Michael S. Tsirkin --- hw/pci/msix.c | 2 +- include/hw/pci/msix.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/pci/msix.c b/hw/pci/msix.c index 57ec708..c7b40cd 100644 --- a/hw/pci/msix.c +++ b/hw/pci/msix.c @@ -71,7 +71,7 @@ static uint8_t *msix_pending_byte(PCIDevice *dev, int vector) return dev->msix_pba + vector / 8; } -static int msix_is_pending(PCIDevice *dev, int vector) +int msix_is_pending(PCIDevice *dev, unsigned int vector) { return *msix_pending_byte(dev, vector) & msix_pending_mask(vector); } diff --git a/include/hw/pci/msix.h b/include/hw/pci/msix.h index 0e6f257..11ef945 100644 --- a/include/hw/pci/msix.h +++ b/include/hw/pci/msix.h @@ -32,6 +32,7 @@ int msix_present(PCIDevice *dev); bool msix_is_masked(PCIDevice *dev, unsigned vector); void msix_set_pending(PCIDevice *dev, unsigned vector); void msix_clr_pending(PCIDevice *dev, int vector); +int msix_is_pending(PCIDevice *dev, unsigned vector); void msix_vector_use(PCIDevice *dev, unsigned vector); void msix_vector_unuse(PCIDevice *dev, unsigned vector); From patchwork Wed Jan 29 14:43:06 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953822 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 04634C0218D for ; Wed, 29 Jan 2025 14:48:03 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9Ia-0000do-1J; Wed, 29 Jan 2025 09:44:12 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IJ-0000WX-3d for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:55 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IH-0001Lu-4N for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:54 -0500 Received: from pps.filterd (m0246632.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXqKU031317; Wed, 29 Jan 2025 14:43:50 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=hfgBIn06xBV79GrMK4sbZADail3CnAY2YmgYidc8dUU=; b= DPKgTw4YPgbUhQ3Tv3nVxnuf+1QxDTn9HICWwhy4TLECxL7vlQM/znExOwNEzkxz hgtlSAxrBkDpTCI/3MQaIKX/seZiw/IvToksr0NvYaWTrIXYlYAr2rQIOzGGqvNu AcupGt2rqK3h0a9a1Nhvt7Z1LAe8Joq6LF54rZFnwhLvBl/GkoTTA0Dd2zG0oukx 7+xYH8CSD7BN6WCgOlK6eRgYrrgT86KHSW455zEqvovtC0EYXWmiCRxlomTDqI2a zStbF1BPxyHmHj+3hyo1u/jrASjwHiqOyAqkIEUQVN6DhVmbZzCiTjx8FZe+TxMa tTR3XWFxhu3knko8vpcD1A== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fp1a81dk-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:49 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TERipX034375; Wed, 29 Jan 2025 14:43:48 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4rm-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:48 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8Y003307; Wed, 29 Jan 2025 14:43:48 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-11; Wed, 29 Jan 2025 14:43:48 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 10/26] vfio-pci: refactor for cpr Date: Wed, 29 Jan 2025 06:43:06 -0800 Message-Id: <1738161802-172631-11-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: Fxw_JXuJG3UXzz3ZB59n76ye4f5tNwqs X-Proofpoint-ORIG-GUID: Fxw_JXuJG3UXzz3ZB59n76ye4f5tNwqs Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Refactor vector use into a helper vfio_vector_init. Add vfio_notifier_init and vfio_notifier_cleanup for named notifiers, and pass additional arguments to vfio_remove_kvm_msi_virq. All for use by CPR in a subsequent patch. No functional change. Signed-off-by: Steve Sistare --- hw/vfio/pci.c | 106 +++++++++++++++++++++++++++++++++++++--------------------- 1 file changed, 68 insertions(+), 38 deletions(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index ab17a98..24ebd69 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -54,6 +54,32 @@ static void vfio_disable_interrupts(VFIOPCIDevice *vdev); static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled); static void vfio_msi_disable_common(VFIOPCIDevice *vdev); +/* Create new or reuse existing eventfd */ +static int vfio_notifier_init(VFIOPCIDevice *vdev, EventNotifier *e, + const char *name, int nr) +{ + int fd = -1; /* placeholder until a subsequent patch */ + int ret = 0; + + if (fd >= 0) { + event_notifier_init_fd(e, fd); + } else { + ret = event_notifier_init(e, 0); + if (ret) { + Error *err = NULL; + error_setg_errno(&err, -ret, "vfio_notifier_init %s failed", name); + error_report_err(err); + } + } + return ret; +} + +static void vfio_notifier_cleanup(VFIOPCIDevice *vdev, EventNotifier *e, + const char *name, int nr) +{ + event_notifier_cleanup(e); +} + /* * Disabling BAR mmaping can be slow, but toggling it around INTx can * also be a huge overhead. We try to get the best of both worlds by @@ -134,8 +160,8 @@ static bool vfio_intx_enable_kvm(VFIOPCIDevice *vdev, Error **errp) pci_irq_deassert(&vdev->pdev); /* Get an eventfd for resample/unmask */ - if (event_notifier_init(&vdev->intx.unmask, 0)) { - error_setg(errp, "event_notifier_init failed eoi"); + if (vfio_notifier_init(vdev, &vdev->intx.unmask, "intx-unmask", 0)) { + error_setg(errp, "vfio_notifier_init intx-unmask failed"); goto fail; } @@ -167,7 +193,7 @@ fail_vfio: kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, &vdev->intx.interrupt, vdev->intx.route.irq); fail_irqfd: - event_notifier_cleanup(&vdev->intx.unmask); + vfio_notifier_cleanup(vdev, &vdev->intx.unmask, "intx-unmask", 0); fail: qemu_set_fd_handler(irq_fd, vfio_intx_interrupt, NULL, vdev); vfio_unmask_single_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX); @@ -199,7 +225,7 @@ static void vfio_intx_disable_kvm(VFIOPCIDevice *vdev) } /* We only need to close the eventfd for VFIO to cleanup the kernel side */ - event_notifier_cleanup(&vdev->intx.unmask); + vfio_notifier_cleanup(vdev, &vdev->intx.unmask, "intx-unmask", 0); /* QEMU starts listening for interrupt events. */ qemu_set_fd_handler(event_notifier_get_fd(&vdev->intx.interrupt), @@ -266,7 +292,6 @@ static bool vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp) uint8_t pin = vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1); Error *err = NULL; int32_t fd; - int ret; if (!pin) { @@ -289,9 +314,7 @@ static bool vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp) } #endif - ret = event_notifier_init(&vdev->intx.interrupt, 0); - if (ret) { - error_setg_errno(errp, -ret, "event_notifier_init failed"); + if (vfio_notifier_init(vdev, &vdev->intx.interrupt, "intx-interrupt", 0)) { return false; } fd = event_notifier_get_fd(&vdev->intx.interrupt); @@ -300,7 +323,7 @@ static bool vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp) if (!vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX, 0, VFIO_IRQ_SET_ACTION_TRIGGER, fd, errp)) { qemu_set_fd_handler(fd, NULL, NULL, vdev); - event_notifier_cleanup(&vdev->intx.interrupt); + vfio_notifier_cleanup(vdev, &vdev->intx.interrupt, "intx-interrupt", 0); return false; } @@ -327,7 +350,7 @@ static void vfio_intx_disable(VFIOPCIDevice *vdev) fd = event_notifier_get_fd(&vdev->intx.interrupt); qemu_set_fd_handler(fd, NULL, NULL, vdev); - event_notifier_cleanup(&vdev->intx.interrupt); + vfio_notifier_cleanup(vdev, &vdev->intx.interrupt, "intx-interrupt", 0); vdev->interrupt = VFIO_INT_NONE; @@ -471,13 +494,15 @@ static void vfio_add_kvm_msi_virq(VFIOPCIDevice *vdev, VFIOMSIVector *vector, vector_n, &vdev->pdev); } -static void vfio_connect_kvm_msi_virq(VFIOMSIVector *vector) +static void vfio_connect_kvm_msi_virq(VFIOMSIVector *vector, int nr) { + const char *name = "kvm_interrupt"; + if (vector->virq < 0) { return; } - if (event_notifier_init(&vector->kvm_interrupt, 0)) { + if (vfio_notifier_init(vector->vdev, &vector->kvm_interrupt, name, nr)) { goto fail_notifier; } @@ -489,19 +514,20 @@ static void vfio_connect_kvm_msi_virq(VFIOMSIVector *vector) return; fail_kvm: - event_notifier_cleanup(&vector->kvm_interrupt); + vfio_notifier_cleanup(vector->vdev, &vector->kvm_interrupt, name, nr); fail_notifier: kvm_irqchip_release_virq(kvm_state, vector->virq); vector->virq = -1; } -static void vfio_remove_kvm_msi_virq(VFIOMSIVector *vector) +static void vfio_remove_kvm_msi_virq(VFIOPCIDevice *vdev, VFIOMSIVector *vector, + int nr) { kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, &vector->kvm_interrupt, vector->virq); kvm_irqchip_release_virq(kvm_state, vector->virq); vector->virq = -1; - event_notifier_cleanup(&vector->kvm_interrupt); + vfio_notifier_cleanup(vdev, &vector->kvm_interrupt, "kvm_interrupt", nr); } static void vfio_update_kvm_msi_virq(VFIOMSIVector *vector, MSIMessage msg, @@ -511,6 +537,20 @@ static void vfio_update_kvm_msi_virq(VFIOMSIVector *vector, MSIMessage msg, kvm_irqchip_commit_routes(kvm_state); } +static void vfio_vector_init(VFIOPCIDevice *vdev, int nr) +{ + VFIOMSIVector *vector = &vdev->msi_vectors[nr]; + PCIDevice *pdev = &vdev->pdev; + + vector->vdev = vdev; + vector->virq = -1; + vfio_notifier_init(vdev, &vector->interrupt, "interrupt", nr); + vector->use = true; + if (vdev->interrupt == VFIO_INT_MSIX) { + msix_vector_use(pdev, nr); + } +} + static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr, MSIMessage *msg, IOHandler *handler) { @@ -524,13 +564,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr, vector = &vdev->msi_vectors[nr]; if (!vector->use) { - vector->vdev = vdev; - vector->virq = -1; - if (event_notifier_init(&vector->interrupt, 0)) { - error_report("vfio: Error: event_notifier_init failed"); - } - vector->use = true; - msix_vector_use(pdev, nr); + vfio_vector_init(vdev, nr); } qemu_set_fd_handler(event_notifier_get_fd(&vector->interrupt), @@ -542,7 +576,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr, */ if (vector->virq >= 0) { if (!msg) { - vfio_remove_kvm_msi_virq(vector); + vfio_remove_kvm_msi_virq(vdev, vector, nr); } else { vfio_update_kvm_msi_virq(vector, *msg, pdev); } @@ -554,7 +588,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr, vfio_route_change = kvm_irqchip_begin_route_changes(kvm_state); vfio_add_kvm_msi_virq(vdev, vector, nr, true); kvm_irqchip_commit_route_changes(&vfio_route_change); - vfio_connect_kvm_msi_virq(vector); + vfio_connect_kvm_msi_virq(vector, nr); } } } @@ -661,7 +695,7 @@ static void vfio_commit_kvm_msi_virq_batch(VFIOPCIDevice *vdev) kvm_irqchip_commit_route_changes(&vfio_route_change); for (i = 0; i < vdev->nr_vectors; i++) { - vfio_connect_kvm_msi_virq(&vdev->msi_vectors[i]); + vfio_connect_kvm_msi_virq(&vdev->msi_vectors[i], i); } } @@ -741,9 +775,7 @@ retry: vector->virq = -1; vector->use = true; - if (event_notifier_init(&vector->interrupt, 0)) { - error_report("vfio: Error: event_notifier_init failed"); - } + vfio_notifier_init(vdev, &vector->interrupt, "interrupt", i); qemu_set_fd_handler(event_notifier_get_fd(&vector->interrupt), vfio_msi_interrupt, NULL, vector); @@ -797,11 +829,11 @@ static void vfio_msi_disable_common(VFIOPCIDevice *vdev) VFIOMSIVector *vector = &vdev->msi_vectors[i]; if (vdev->msi_vectors[i].use) { if (vector->virq >= 0) { - vfio_remove_kvm_msi_virq(vector); + vfio_remove_kvm_msi_virq(vdev, vector, i); } qemu_set_fd_handler(event_notifier_get_fd(&vector->interrupt), NULL, NULL, NULL); - event_notifier_cleanup(&vector->interrupt); + vfio_notifier_cleanup(vdev, &vector->interrupt, "interrupt", i); } } @@ -2854,8 +2886,7 @@ static void vfio_register_err_notifier(VFIOPCIDevice *vdev) return; } - if (event_notifier_init(&vdev->err_notifier, 0)) { - error_report("vfio: Unable to init event notifier for error detection"); + if (vfio_notifier_init(vdev, &vdev->err_notifier, "err_notifier", 0)) { vdev->pci_aer = false; return; } @@ -2867,7 +2898,7 @@ static void vfio_register_err_notifier(VFIOPCIDevice *vdev) VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) { error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name); qemu_set_fd_handler(fd, NULL, NULL, vdev); - event_notifier_cleanup(&vdev->err_notifier); + vfio_notifier_cleanup(vdev, &vdev->err_notifier, "err_notifier", 0); vdev->pci_aer = false; } } @@ -2886,7 +2917,7 @@ static void vfio_unregister_err_notifier(VFIOPCIDevice *vdev) } qemu_set_fd_handler(event_notifier_get_fd(&vdev->err_notifier), NULL, NULL, vdev); - event_notifier_cleanup(&vdev->err_notifier); + vfio_notifier_cleanup(vdev, &vdev->err_notifier, "err_notifier", 0); } static void vfio_req_notifier_handler(void *opaque) @@ -2920,8 +2951,7 @@ static void vfio_register_req_notifier(VFIOPCIDevice *vdev) return; } - if (event_notifier_init(&vdev->req_notifier, 0)) { - error_report("vfio: Unable to init event notifier for device request"); + if (vfio_notifier_init(vdev, &vdev->req_notifier, "req_notifier", 0)) { return; } @@ -2932,7 +2962,7 @@ static void vfio_register_req_notifier(VFIOPCIDevice *vdev) VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) { error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name); qemu_set_fd_handler(fd, NULL, NULL, vdev); - event_notifier_cleanup(&vdev->req_notifier); + vfio_notifier_cleanup(vdev, &vdev->req_notifier, "req_notifier", 0); } else { vdev->req_enabled = true; } @@ -2952,7 +2982,7 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev) } qemu_set_fd_handler(event_notifier_get_fd(&vdev->req_notifier), NULL, NULL, vdev); - event_notifier_cleanup(&vdev->req_notifier); + vfio_notifier_cleanup(vdev, &vdev->req_notifier, "req_notifier", 0); vdev->req_enabled = false; } From patchwork Wed Jan 29 14:43:07 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953811 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 582A7C0218D for ; Wed, 29 Jan 2025 14:45:29 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IP-0000Zv-Of; Wed, 29 Jan 2025 09:44:01 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9II-0000WR-Gi for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:54 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IG-0001LY-S3 for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:54 -0500 Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXjbB006527; Wed, 29 Jan 2025 14:43:50 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=5OwRZGXs9Pjy6nhmLK+0C0cGlBDuKH7ErbGjJTKEhBU=; b= UAnrHHewHJZgJW3cnJkRw0rBmSAao16cuFaNnwbfYQYjp/GXGIgdBBfR16nwvXBj c4HfDUXi5rNfV4UW0tVpiDb9PI880O9rF09B2TW3qyYQMVJe2i1PNndk9R4urEq0 zc+PEo16RR8arRiqbat9fqw/niKcqdSXkN28ozmC4oB8LLqYZbEXy8XHKu0evD+/ ofFVoiHW5cE4XjNkIES1n69pAYJPFkwh8FGqtzMaNTwWv2/sIQQxHvaiu/vcQf71 /D9p3Js6NiOFRArg8iyHTiV5+8uvC7H7o0FC1hIw6PM7kDvFiRqzsLQnsFMpzqnR hW++oI+WxAIExCumcIXjhw== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fmf808tn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:50 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEOiEf034283; Wed, 29 Jan 2025 14:43:49 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4rt-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:49 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8a003307; Wed, 29 Jan 2025 14:43:48 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-12; Wed, 29 Jan 2025 14:43:48 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 11/26] vfio-pci: skip reset during cpr Date: Wed, 29 Jan 2025 06:43:07 -0800 Message-Id: <1738161802-172631-12-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: Pt0vdpCAHnC0Q041vyr5hufbFa8QNU03 X-Proofpoint-ORIG-GUID: Pt0vdpCAHnC0Q041vyr5hufbFa8QNU03 Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Do not reset a vfio-pci device during CPR, and do not complain if the kernel's PCI config space changes for non-emulated bits between the vmstate save and load, which can happen due to ongoing interrupt activity. Signed-off-by: Steve Sistare --- hw/vfio/pci.c | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 24ebd69..fa77c36 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -29,6 +29,8 @@ #include "hw/pci/pci_bridge.h" #include "hw/qdev-properties.h" #include "hw/qdev-properties-system.h" +#include "migration/misc.h" +#include "migration/cpr.h" #include "migration/vmstate.h" #include "qapi/qmp/qdict.h" #include "qemu/error-report.h" @@ -3324,6 +3326,11 @@ static void vfio_pci_reset(DeviceState *dev) { VFIOPCIDevice *vdev = VFIO_PCI(dev); + /* Do not reset the device during qemu_system_reset prior to cpr load */ + if (vdev->vbasedev.reused) { + return; + } + trace_vfio_pci_reset(vdev->vbasedev.name); vfio_pci_pre_reset(vdev); @@ -3447,6 +3454,35 @@ static void vfio_pci_set_fd(Object *obj, const char *str, Error **errp) } #endif +/* + * The kernel may change non-emulated config bits. Exclude them from the + * changed-bits check in get_pci_config_device. + */ +static int vfio_pci_pre_load(void *opaque) +{ + VFIOPCIDevice *vdev = opaque; + PCIDevice *pdev = &vdev->pdev; + int size = MIN(pci_config_size(pdev), vdev->config_size); + int i; + + for (i = 0; i < size; i++) { + pdev->cmask[i] &= vdev->emulated_config_bits[i]; + } + + return 0; +} + +static const VMStateDescription vfio_pci_vmstate = { + .name = "vfio-pci", + .version_id = 0, + .minimum_version_id = 0, + .pre_load = vfio_pci_pre_load, + .needed = cpr_needed_for_reuse, + .fields = (VMStateField[]) { + VMSTATE_END_OF_LIST() + } +}; + static void vfio_pci_dev_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); @@ -3457,6 +3493,7 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data) #ifdef CONFIG_IOMMUFD object_class_property_add_str(klass, "fd", NULL, vfio_pci_set_fd); #endif + dc->vmsd = &vfio_pci_vmstate; dc->desc = "VFIO-based PCI device assignment"; set_bit(DEVICE_CATEGORY_MISC, dc->categories); pdc->realize = vfio_realize; From patchwork Wed Jan 29 14:43:08 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953817 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B4B42C02193 for ; Wed, 29 Jan 2025 14:46:22 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IY-0000cw-Ex; Wed, 29 Jan 2025 09:44:10 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IJ-0000Wu-Sg for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:56 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9II-0001MB-2l for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:55 -0500 Received: from pps.filterd (m0333520.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXmB5006314; Wed, 29 Jan 2025 14:43:51 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=G969B8vw/rEUq8B/7c3p5kqWoqGQgdPYAr1/JCIkVP0=; b= OHeg4B+vSOl0fB5T1VhaMVfuT0a8Nly5anUxDvVc26sDacFb5Qb//Aig2jUCFcyV seakwLMHs+5s7+yrdM6SRqiHMCtJuYTaA+57J/Ca1Y2c/tOhDyguap30VtTtDQL2 RbNNT0fOgvZqSciJbrJM27ZCKgZcgnBsboX3cIjz7IJLuR7eMhYKWnkfYgXSEpJW noqyHJh9fMjQz7iaGzblVWO8vn2tQ8i8DDTi0YQzkYLmBfsPLQgNMKMGJBCXX92f 3n7AuaVlvY5j0W+CE8Kd8IGRS4hz4xPK7wEYxIB9MSeYDai7oXJU7k4TKwDwaqzc WXy7tA+2ncvaeGRwSXGAGg== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fmut8725-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:51 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDMnSR036674; Wed, 29 Jan 2025 14:43:50 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4s1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:50 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8c003307; Wed, 29 Jan 2025 14:43:49 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-13; Wed, 29 Jan 2025 14:43:49 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 12/26] vfio-pci: preserve MSI Date: Wed, 29 Jan 2025 06:43:08 -0800 Message-Id: <1738161802-172631-13-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: u4woNnQGbaZjff_vkdF9bwBI8-s_ZVY0 X-Proofpoint-ORIG-GUID: u4woNnQGbaZjff_vkdF9bwBI8-s_ZVY0 Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Save the MSI message area as part of vfio-pci vmstate, and preserve the interrupt and notifier eventfd's. migrate_incoming loads the MSI data, then the vfio-pci post_load handler finds the eventfds in CPR state, rebuilds vector data structures, and attaches the interrupts to the new KVM instance. Signed-off-by: Steve Sistare --- hw/vfio/pci.c | 117 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 116 insertions(+), 1 deletion(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index fa77c36..df6e298 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -56,11 +56,37 @@ static void vfio_disable_interrupts(VFIOPCIDevice *vdev); static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled); static void vfio_msi_disable_common(VFIOPCIDevice *vdev); +#define EVENT_FD_NAME(vdev, name) \ + g_strdup_printf("%s_%s", (vdev)->vbasedev.name, (name)) + +static void save_event_fd(VFIOPCIDevice *vdev, const char *name, int nr, + EventNotifier *ev) +{ + int fd = event_notifier_get_fd(ev); + + if (fd >= 0) { + g_autofree char *fdname = EVENT_FD_NAME(vdev, name); + cpr_resave_fd(fdname, nr, fd); + } +} + +static int load_event_fd(VFIOPCIDevice *vdev, const char *name, int nr) +{ + g_autofree char *fdname = EVENT_FD_NAME(vdev, name); + return cpr_find_fd(fdname, nr); +} + +static void delete_event_fd(VFIOPCIDevice *vdev, const char *name, int nr) +{ + g_autofree char *fdname = EVENT_FD_NAME(vdev, name); + cpr_delete_fd(fdname, nr); +} + /* Create new or reuse existing eventfd */ static int vfio_notifier_init(VFIOPCIDevice *vdev, EventNotifier *e, const char *name, int nr) { - int fd = -1; /* placeholder until a subsequent patch */ + int fd = load_event_fd(vdev, name, nr); int ret = 0; if (fd >= 0) { @@ -71,6 +97,8 @@ static int vfio_notifier_init(VFIOPCIDevice *vdev, EventNotifier *e, Error *err = NULL; error_setg_errno(&err, -ret, "vfio_notifier_init %s failed", name); error_report_err(err); + } else { + save_event_fd(vdev, name, nr, e); } } return ret; @@ -79,6 +107,7 @@ static int vfio_notifier_init(VFIOPCIDevice *vdev, EventNotifier *e, static void vfio_notifier_cleanup(VFIOPCIDevice *vdev, EventNotifier *e, const char *name, int nr) { + delete_event_fd(vdev, name, nr); event_notifier_cleanup(e); } @@ -561,6 +590,15 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr, int ret; bool resizing = !!(vdev->nr_vectors < nr + 1); + /* + * Ignore the callback from msix_set_vector_notifiers during resume. + * The necessary subset of these actions is called from vfio_claim_vectors + * during post load. + */ + if (vdev->vbasedev.reused) { + return 0; + } + trace_vfio_msix_vector_do_use(vdev->vbasedev.name, nr); vector = &vdev->msi_vectors[nr]; @@ -2896,6 +2934,11 @@ static void vfio_register_err_notifier(VFIOPCIDevice *vdev) fd = event_notifier_get_fd(&vdev->err_notifier); qemu_set_fd_handler(fd, vfio_err_notifier_handler, NULL, vdev); + /* Do not alter irq_signaling during vfio_realize for cpr */ + if (vdev->vbasedev.reused) { + return; + } + if (!vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_ERR_IRQ_INDEX, 0, VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) { error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name); @@ -2960,6 +3003,12 @@ static void vfio_register_req_notifier(VFIOPCIDevice *vdev) fd = event_notifier_get_fd(&vdev->req_notifier); qemu_set_fd_handler(fd, vfio_req_notifier_handler, NULL, vdev); + /* Do not alter irq_signaling during vfio_realize for cpr */ + if (vdev->vbasedev.reused) { + vdev->req_enabled = true; + return; + } + if (!vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_REQ_IRQ_INDEX, 0, VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) { error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name); @@ -3454,6 +3503,46 @@ static void vfio_pci_set_fd(Object *obj, const char *str, Error **errp) } #endif +static void vfio_claim_vectors(VFIOPCIDevice *vdev, int nr_vectors, bool msix) +{ + int i, fd; + bool pending = false; + PCIDevice *pdev = &vdev->pdev; + + vdev->nr_vectors = nr_vectors; + vdev->msi_vectors = g_new0(VFIOMSIVector, nr_vectors); + vdev->interrupt = msix ? VFIO_INT_MSIX : VFIO_INT_MSI; + + vfio_prepare_kvm_msi_virq_batch(vdev); + + for (i = 0; i < nr_vectors; i++) { + VFIOMSIVector *vector = &vdev->msi_vectors[i]; + + fd = load_event_fd(vdev, "interrupt", i); + if (fd >= 0) { + vfio_vector_init(vdev, i); + qemu_set_fd_handler(fd, vfio_msi_interrupt, NULL, vector); + } + + if (load_event_fd(vdev, "kvm_interrupt", i) >= 0) { + vfio_add_kvm_msi_virq(vdev, vector, i, msix); + } else { + vdev->msi_vectors[i].virq = -1; + } + + if (msix && msix_is_pending(pdev, i) && msix_is_masked(pdev, i)) { + set_bit(i, vdev->msix->pending); + pending = true; + } + } + + vfio_commit_kvm_msi_virq_batch(vdev); + + if (msix) { + memory_region_set_enabled(&pdev->msix_pba_mmio, pending); + } +} + /* * The kernel may change non-emulated config bits. Exclude them from the * changed-bits check in get_pci_config_device. @@ -3472,13 +3561,39 @@ static int vfio_pci_pre_load(void *opaque) return 0; } +static int vfio_pci_post_load(void *opaque, int version_id) +{ + VFIOPCIDevice *vdev = opaque; + PCIDevice *pdev = &vdev->pdev; + int nr_vectors; + + if (msix_enabled(pdev)) { + msix_set_vector_notifiers(pdev, vfio_msix_vector_use, + vfio_msix_vector_release, NULL); + nr_vectors = vdev->msix->entries; + vfio_claim_vectors(vdev, nr_vectors, true); + + } else if (msi_enabled(pdev)) { + nr_vectors = msi_nr_vectors_allocated(pdev); + vfio_claim_vectors(vdev, nr_vectors, false); + + } else if (vfio_pci_read_config(pdev, PCI_INTERRUPT_PIN, 1)) { + g_assert_not_reached(); /* completed in a subsequent patch */ + } + + return 0; +} + static const VMStateDescription vfio_pci_vmstate = { .name = "vfio-pci", .version_id = 0, .minimum_version_id = 0, .pre_load = vfio_pci_pre_load, + .post_load = vfio_pci_post_load, .needed = cpr_needed_for_reuse, .fields = (VMStateField[]) { + VMSTATE_PCI_DEVICE(pdev, VFIOPCIDevice), + VMSTATE_MSIX_TEST(pdev, VFIOPCIDevice, vfio_msix_present), VMSTATE_END_OF_LIST() } }; From patchwork Wed Jan 29 14:43:09 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953814 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3C351C02193 for ; Wed, 29 Jan 2025 14:45:51 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IS-0000aN-IO; Wed, 29 Jan 2025 09:44:04 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IK-0000Wv-6o for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:56 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9II-0001MN-HM for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:55 -0500 Received: from pps.filterd (m0246632.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXoCG031305; Wed, 29 Jan 2025 14:43:52 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=WM8G6+pbBabzCKZ4Q+Vkz9QeVCb4AnTAsum4M1yICDs=; b= NDgEPCOpv1b2t0KVqFn5RK2CaxIRqHh5YyYTqzKwQJ+jfWsiqiqobgbf09MuGOXI PS0JR7kj9VTFslrsfoyjTScTclL1utfp3eejNCt2Ok3jLQRSNhZxyVmc+HV2Rjtf ZTU4MiDLut+xMtLOIgIaZrJ4hlUU7Rh3FpepnruSDkbn8vkkimz23LL62EI1e7rE pDmTQRaNyT+FZc25Qyjgpgg95+UUsgDPivY0gsP/LOERZ3pkniPw350WHURukufu /reGSMrfs3F4ke0UyFj0vM/d3UdALXWpL+uMz36YPpeDb1aBZyTME297BJ7PsKxr wZtZc1/Zt7I4bZo262BbuA== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fp1a81dn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:51 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDC19N034407; Wed, 29 Jan 2025 14:43:51 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4sa-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:50 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8e003307; Wed, 29 Jan 2025 14:43:50 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-14; Wed, 29 Jan 2025 14:43:50 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 13/26] vfio-pci: preserve INTx Date: Wed, 29 Jan 2025 06:43:09 -0800 Message-Id: <1738161802-172631-14-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: TbTNA5Vha4iMyFltbhJrG-inT_y81WNS X-Proofpoint-ORIG-GUID: TbTNA5Vha4iMyFltbhJrG-inT_y81WNS Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Preserve vfio INTx state across cpr-transfer. Preserve VFIOINTx fields as follows: pin : Recover this from the vfio config in kernel space interrupt : Preserve its eventfd descriptor across exec. unmask : Ditto route.irq : This could perhaps be recovered in vfio_pci_post_load by calling pci_device_route_intx_to_irq(pin), whose implementation reads config space for a bridge device such as ich9. However, there is no guarantee that the bridge vmstate is read before vfio vmstate. Rather than fiddling with MigrationPriority for vmstate handlers, explicitly save route.irq in vfio vmstate. pending : save in vfio vmstate. mmap_timeout, mmap_timer : Re-initialize bool kvm_accel : Re-initialize In vfio_realize, defer calling vfio_intx_enable until the vmstate is available, in vfio_pci_post_load. Modify vfio_intx_enable and vfio_intx_kvm_enable to skip vfio initialization, but still perform kvm initialization. Signed-off-by: Steve Sistare --- hw/vfio/pci.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 47 insertions(+), 4 deletions(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index df6e298..c50dbef 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -184,12 +184,17 @@ static bool vfio_intx_enable_kvm(VFIOPCIDevice *vdev, Error **errp) return true; } + if (vdev->vbasedev.reused) { + goto skip_state; + } + /* Get to a known interrupt state */ qemu_set_fd_handler(irq_fd, NULL, NULL, vdev); vfio_mask_single_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX); vdev->intx.pending = false; pci_irq_deassert(&vdev->pdev); +skip_state: /* Get an eventfd for resample/unmask */ if (vfio_notifier_init(vdev, &vdev->intx.unmask, "intx-unmask", 0)) { error_setg(errp, "vfio_notifier_init intx-unmask failed"); @@ -204,6 +209,10 @@ static bool vfio_intx_enable_kvm(VFIOPCIDevice *vdev, Error **errp) goto fail_irqfd; } + if (vdev->vbasedev.reused) { + goto skip_irq; + } + if (!vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX, 0, VFIO_IRQ_SET_ACTION_UNMASK, event_notifier_get_fd(&vdev->intx.unmask), @@ -214,6 +223,7 @@ static bool vfio_intx_enable_kvm(VFIOPCIDevice *vdev, Error **errp) /* Let'em rip */ vfio_unmask_single_irqindex(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX); +skip_irq: vdev->intx.kvm_accel = true; trace_vfio_intx_enable_kvm(vdev->vbasedev.name); @@ -329,7 +339,13 @@ static bool vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp) return true; } - vfio_disable_interrupts(vdev); + /* + * Do not alter interrupt state during vfio_realize and cpr load. The + * reused flag is cleared thereafter. + */ + if (!vdev->vbasedev.reused) { + vfio_disable_interrupts(vdev); + } vdev->intx.pin = pin - 1; /* Pin A (1) -> irq[0] */ pci_config_set_interrupt_pin(vdev->pdev.config, pin); @@ -351,7 +367,8 @@ static bool vfio_intx_enable(VFIOPCIDevice *vdev, Error **errp) fd = event_notifier_get_fd(&vdev->intx.interrupt); qemu_set_fd_handler(fd, vfio_intx_interrupt, NULL, vdev); - if (!vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX, 0, + if (!vdev->vbasedev.reused && + !vfio_set_irq_signaling(&vdev->vbasedev, VFIO_PCI_INTX_IRQ_INDEX, 0, VFIO_IRQ_SET_ACTION_TRIGGER, fd, errp)) { qemu_set_fd_handler(fd, NULL, NULL, vdev); vfio_notifier_cleanup(vdev, &vdev->intx.interrupt, "intx-interrupt", 0); @@ -3256,7 +3273,8 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) vfio_intx_routing_notifier); vdev->irqchip_change_notifier.notify = vfio_irqchip_change; kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier); - if (!vfio_intx_enable(vdev, errp)) { + /* Wait until cpr load reads intx routing data to enable */ + if (!vdev->vbasedev.reused && !vfio_intx_enable(vdev, errp)) { goto out_deregister; } } @@ -3578,12 +3596,36 @@ static int vfio_pci_post_load(void *opaque, int version_id) vfio_claim_vectors(vdev, nr_vectors, false); } else if (vfio_pci_read_config(pdev, PCI_INTERRUPT_PIN, 1)) { - g_assert_not_reached(); /* completed in a subsequent patch */ + Error *err = NULL; + if (!vfio_intx_enable(vdev, &err)) { + error_report_err(err); + return -1; + } } return 0; } +static const VMStateDescription vfio_intx_vmstate = { + .name = "vfio-intx", + .version_id = 0, + .minimum_version_id = 0, + .fields = (VMStateField[]) { + VMSTATE_BOOL(pending, VFIOINTx), + VMSTATE_UINT32(route.mode, VFIOINTx), + VMSTATE_INT32(route.irq, VFIOINTx), + VMSTATE_END_OF_LIST() + } +}; + +#define VMSTATE_VFIO_INTX(_field, _state) { \ + .name = (stringify(_field)), \ + .size = sizeof(VFIOINTx), \ + .vmsd = &vfio_intx_vmstate, \ + .flags = VMS_STRUCT, \ + .offset = vmstate_offset_value(_state, _field, VFIOINTx), \ +} + static const VMStateDescription vfio_pci_vmstate = { .name = "vfio-pci", .version_id = 0, @@ -3594,6 +3636,7 @@ static const VMStateDescription vfio_pci_vmstate = { .fields = (VMStateField[]) { VMSTATE_PCI_DEVICE(pdev, VFIOPCIDevice), VMSTATE_MSIX_TEST(pdev, VFIOPCIDevice, vfio_msix_present), + VMSTATE_VFIO_INTX(intx, VFIOPCIDevice), VMSTATE_END_OF_LIST() } }; From patchwork Wed Jan 29 14:43:10 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953837 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7B89EC02190 for ; Wed, 29 Jan 2025 14:49:44 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IW-0000bO-Qu; Wed, 29 Jan 2025 09:44:08 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IL-0000Xz-Jt for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:57 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9II-0001ML-OY for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:57 -0500 Received: from pps.filterd (m0333521.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEbCoe007551; Wed, 29 Jan 2025 14:43:52 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=h/Ab34Pt89kriniEFopqa39nyKpO4mIKhLodI6pY3y0=; b= TlYz2izvQJburp0PAjyVoEtmQu/zbNDgT9eK4ZZHof5rM7KrKI7miYcFDBskSzsu xbTkU4P43LBoZlmq0lz3bIPtLp3hL4CnNw9VX0vZut/jo9XlgskaIIo4hcvVY6k4 Q54kKfzP4jRq700w0HwNSoVD0qhVF4iZxx/8EvIfk1Sw84N0D792jI3Ox+KgYdp7 8Fb4J4MoR0oXDS4++lZ3cPblbKZoVIigmSTQWX3EVNAUPKn4qM5b5avmLgUWQvpS 2Q4PFtBqPDBj/hBsiXGvIhogCku8Wq1kUMRibUV6ZqFyhF13u7/2sKDaa7V1S4GA T7Yq4ytICnMqngnMwfA9pA== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fp8e00qs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:52 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDMnSS036674; Wed, 29 Jan 2025 14:43:51 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4sk-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:51 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8g003307; Wed, 29 Jan 2025 14:43:51 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-15; Wed, 29 Jan 2025 14:43:50 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 14/26] migration: close kvm after cpr Date: Wed, 29 Jan 2025 06:43:10 -0800 Message-Id: <1738161802-172631-15-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-ORIG-GUID: ALVKtT6OAXD3SMh1poUrNAjNlhjiMBRJ X-Proofpoint-GUID: ALVKtT6OAXD3SMh1poUrNAjNlhjiMBRJ Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org cpr-transfer breaks vfio network connectivity to and from the guest, and the host system log shows: irq bypass consumer (token 00000000a03c32e5) registration fails: -16 which is EBUSY. This occurs because KVM descriptors are still open in the old QEMU process. Close them. Signed-off-by: Steve Sistare --- accel/kvm/kvm-all.c | 20 ++++++++++++++++++++ hw/vfio/common.c | 8 ++++++++ include/hw/vfio/vfio-common.h | 1 + include/migration/cpr.h | 2 ++ include/system/kvm.h | 1 + migration/cpr-transfer.c | 18 ++++++++++++++++++ migration/cpr.c | 8 ++++++++ migration/migration.c | 1 + 8 files changed, 59 insertions(+) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index c65b790..f4e341f 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -595,6 +595,26 @@ err: return ret; } +void kvm_close(void) +{ + CPUState *cpu; + + CPU_FOREACH(cpu) { + cpu_remove_sync(cpu); + close(cpu->kvm_fd); + cpu->kvm_fd = -1; + close(cpu->kvm_vcpu_stats_fd); + cpu->kvm_vcpu_stats_fd = -1; + } + + if (kvm_state && kvm_state->fd != -1) { + close(kvm_state->vmfd); + kvm_state->vmfd = -1; + close(kvm_state->fd); + kvm_state->fd = -1; + } +} + /* * dirty pages logging control */ diff --git a/hw/vfio/common.c b/hw/vfio/common.c index c8ee71a..db0498e 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1511,6 +1511,14 @@ int vfio_kvm_device_del_fd(int fd, Error **errp) return 0; } +void vfio_kvm_device_close(void) +{ + if (vfio_kvm_device_fd != -1) { + close(vfio_kvm_device_fd); + vfio_kvm_device_fd = -1; + } +} + VFIOAddressSpace *vfio_get_address_space(AddressSpace *as) { VFIOAddressSpace *space; diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 8a4a658..5a89aca 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -262,6 +262,7 @@ void vfio_detach_device(VFIODevice *vbasedev); int vfio_kvm_device_add_fd(int fd, Error **errp); int vfio_kvm_device_del_fd(int fd, Error **errp); +void vfio_kvm_device_close(void); bool vfio_cpr_register_container(VFIOContainerBase *bcontainer, Error **errp); void vfio_cpr_unregister_container(VFIOContainerBase *bcontainer); diff --git a/include/migration/cpr.h b/include/migration/cpr.h index faeb0cc..e8f4aba 100644 --- a/include/migration/cpr.h +++ b/include/migration/cpr.h @@ -29,7 +29,9 @@ void cpr_state_close(void); struct QIOChannel *cpr_state_ioc(void); bool cpr_needed_for_reuse(void *opaque); +void cpr_kvm_close(void); +void cpr_transfer_init(void); QEMUFile *cpr_transfer_output(MigrationChannel *channel, Error **errp); QEMUFile *cpr_transfer_input(MigrationChannel *channel, Error **errp); diff --git a/include/system/kvm.h b/include/system/kvm.h index ab17c09..ad5c55e 100644 --- a/include/system/kvm.h +++ b/include/system/kvm.h @@ -194,6 +194,7 @@ bool kvm_has_sync_mmu(void); int kvm_has_vcpu_events(void); int kvm_max_nested_state_length(void); int kvm_has_gsi_routing(void); +void kvm_close(void); /** * kvm_arm_supports_user_irq diff --git a/migration/cpr-transfer.c b/migration/cpr-transfer.c index e1f1403..396558f 100644 --- a/migration/cpr-transfer.c +++ b/migration/cpr-transfer.c @@ -17,6 +17,24 @@ #include "migration/vmstate.h" #include "trace.h" +static int cpr_transfer_notifier(NotifierWithReturn *notifier, + MigrationEvent *e, + Error **errp) +{ + if (e->type == MIG_EVENT_PRECOPY_DONE) { + cpr_kvm_close(); + } + return 0; +} + +void cpr_transfer_init(void) +{ + static NotifierWithReturn notifier; + + migration_add_notifier_mode(¬ifier, cpr_transfer_notifier, + MIG_MODE_CPR_TRANSFER); +} + QEMUFile *cpr_transfer_output(MigrationChannel *channel, Error **errp) { MigrationAddress *addr = channel->addr; diff --git a/migration/cpr.c b/migration/cpr.c index e3f27e9..86eb484 100644 --- a/migration/cpr.c +++ b/migration/cpr.c @@ -7,12 +7,14 @@ #include "qemu/osdep.h" #include "qapi/error.h" +#include "hw/vfio/vfio-common.h" #include "migration/cpr.h" #include "migration/misc.h" #include "migration/options.h" #include "migration/qemu-file.h" #include "migration/savevm.h" #include "migration/vmstate.h" +#include "system/kvm.h" #include "system/runstate.h" #include "trace.h" @@ -243,3 +245,9 @@ bool cpr_needed_for_reuse(void *opaque) MigMode mode = migrate_mode(); return mode == MIG_MODE_CPR_TRANSFER; } + +void cpr_kvm_close(void) +{ + kvm_close(); + vfio_kvm_device_close(); +} diff --git a/migration/migration.c b/migration/migration.c index 88b0991..7a69782 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -288,6 +288,7 @@ void migration_object_init(void) ram_mig_init(); dirty_bitmap_mig_init(); + cpr_transfer_init(); /* Initialize cpu throttle timers */ cpu_throttle_init(); From patchwork Wed Jan 29 14:43:11 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953840 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A5C76C0218D for ; Wed, 29 Jan 2025 14:50:18 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IV-0000b6-Pw; Wed, 29 Jan 2025 09:44:07 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IL-0000Xb-6w for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:57 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IJ-0001MW-8S for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:56 -0500 Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXk6Z006554; Wed, 29 Jan 2025 14:43:53 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=kVRsnOFnUGyZziE9c96bobXaSdFKGsTrtK1S+WmB5FI=; b= CSlihZapvkUSTyhAMndXYVJzsLg17Y3DdMVfFi8yXIZMFT0QgCv1DECBBsKUYNBV x+u+ms+761z7hdP0ibGqcmUmvOPBakPwsKbPFc82xausztPoKuk8Bzs1nLPjozMg gqxMbcb5bOpk8x92kQZRby0QpyC4GNev4vL3FL+hbnyUe/VAKfjtWOGIlH4DOHOm Pf52c1LZR9IMsYEL71u+nlbwK7nf72iTSh7Dx6VS4R2XmZjMjB97/MA3WZN+ht4C ki4IDqsKqvUiXed4UNrqKJdbZETy7sRphVE31oANrIrkw7X+92kWzFRzLmxcVtAI ahezH+K1hGRP6mo6t3L4Ag== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fmf808tw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:52 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDddmx034299; Wed, 29 Jan 2025 14:43:52 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4su-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:52 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8i003307; Wed, 29 Jan 2025 14:43:51 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-16; Wed, 29 Jan 2025 14:43:51 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 15/26] migration: cpr_get_fd_param helper Date: Wed, 29 Jan 2025 06:43:11 -0800 Message-Id: <1738161802-172631-16-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: Qs2fM19scNB5_QyAiagJxDHkCt9tPIiY X-Proofpoint-ORIG-GUID: Qs2fM19scNB5_QyAiagJxDHkCt9tPIiY Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Add the helper function cpr_get_fd_param, to use when preserving a file descriptor that is opened externally and passed to QEMU. cpr_get_fd_param returns a descriptor number either from a QEMU command-line parameter, from a getfd command, or from CPR state. When a descriptor is passed to new QEMU via SCM_RIGHTS, its number changes. Hence, during CPR, the command-line parameter is ignored in new QEMU, and over-ridden by the value found in CPR state. Similarly, if the descriptor was originally specified by a getfd command in old QEMU, the fd number is not known outside of QEMU, and it changes when sent to new QEMU via SCM_RIGHTS. Hence the user cannot send getfd to new QEMU, but when the user sends a hotplug command that references the fd, cpr_get_fd_param finds its value in CPR state. Signed-off-by: Steve Sistare --- include/migration/cpr.h | 2 ++ migration/cpr.c | 41 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 43 insertions(+) diff --git a/include/migration/cpr.h b/include/migration/cpr.h index e8f4aba..28414f7 100644 --- a/include/migration/cpr.h +++ b/include/migration/cpr.h @@ -30,6 +30,8 @@ struct QIOChannel *cpr_state_ioc(void); bool cpr_needed_for_reuse(void *opaque); void cpr_kvm_close(void); +int cpr_get_fd_param(const char *name, const char *fdname, int index, + bool *reused, Error **errp); void cpr_transfer_init(void); QEMUFile *cpr_transfer_output(MigrationChannel *channel, Error **errp); diff --git a/migration/cpr.c b/migration/cpr.c index 86eb484..a1084e9 100644 --- a/migration/cpr.c +++ b/migration/cpr.c @@ -14,6 +14,7 @@ #include "migration/qemu-file.h" #include "migration/savevm.h" #include "migration/vmstate.h" +#include "monitor/monitor.h" #include "system/kvm.h" #include "system/runstate.h" #include "trace.h" @@ -251,3 +252,43 @@ void cpr_kvm_close(void) kvm_close(); vfio_kvm_device_close(); } + +/* + * cpr_get_fd_param: find a descriptor and return its value. + * + * @name: CPR name for the descriptor + * @fdname: An integer-valued string, or a name passed to a getfd command + * @index: CPR index of the descriptor + * @reused: returns true if the fd is found in CPR state, else false. + * @errp: returned error message + * + * If CPR is not being performed, then use @fdname to find the fd. + * If CPR is being performed, then ignore @fdname, and look for @name + * and @index in CPR state. + * + * On success returns the fd value, else returns -1. + */ +int cpr_get_fd_param(const char *name, const char *fdname, int index, + bool *reused, Error **errp) +{ + ERRP_GUARD(); + MigMode mode = cpr_get_incoming_mode(); + int fd; + + if (mode == MIG_MODE_CPR_TRANSFER) { + fd = cpr_find_fd(name, index); + if (fd < 0) { + error_setg(errp, "cannot find saved value for fd %s", fdname); + } + *reused = true; + } else { + fd = monitor_fd_param(monitor_cur(), fdname, errp); + if (fd >= 0) { + cpr_save_fd(name, index, fd); + } else { + error_prepend(errp, "Could not parse object fd %s:", fdname); + } + *reused = false; + } + return fd; +} From patchwork Wed Jan 29 14:43:12 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953816 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A7C6AC02190 for ; Wed, 29 Jan 2025 14:46:20 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IY-0000cy-U4; Wed, 29 Jan 2025 09:44:10 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IM-0000YA-1B for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:58 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IK-0001Mm-6k for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:57 -0500 Received: from pps.filterd (m0333521.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEbGOm007583; Wed, 29 Jan 2025 14:43:54 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=iGS9hGiUCxJ1S8fahCvADavxsFpMeEd2LDlXC7j9ypU=; b= AQayQCQFSL/e7ztbD3pSvuzpm5YfA2MAi/hDiVbTJ3asAdDGWupsUfIVzFW0Mdub KaIyq8BPker45hdOk/fjJNBzdJYm4bkU2+N7rymh9SU4gaWciWlKU8zme3YoFwKE RBgIyI7h3mS1jc/1zmlXZ46OqljGmGMdVMxaQxDpMdiWdT10z7MYUY3DNunELdPu vJejeiTjzw8UAWYfdS3Jx+7cBJGGhM5f8yzB0ZHpGwbRjBh/TXghC/g7VThFZkAF BpRBPtFfdL4XONv/bgJErR2tqaGqa68PiKI6UP9ws5gNIIcf1MFMdLoQlsOZ9Jox mCOyz4iBuTxfC6jQdhiBQw== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fp8e00qw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:53 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDC19P034407; Wed, 29 Jan 2025 14:43:52 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4t1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:52 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8k003307; Wed, 29 Jan 2025 14:43:52 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-17; Wed, 29 Jan 2025 14:43:52 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 16/26] vfio: return mr from vfio_get_xlat_addr Date: Wed, 29 Jan 2025 06:43:12 -0800 Message-Id: <1738161802-172631-17-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-ORIG-GUID: e8lrp4xMTCErkUJgqTuIXoTBjxBCnIXY X-Proofpoint-GUID: e8lrp4xMTCErkUJgqTuIXoTBjxBCnIXY Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Return the memory region that the translated address is found in, for use in a subsequent patch. No functional change. Signed-off-by: Steve Sistare --- hw/vfio/common.c | 9 ++++++--- hw/virtio/vhost-vdpa.c | 2 +- include/exec/memory.h | 5 ++++- system/memory.c | 8 +++++++- 4 files changed, 18 insertions(+), 6 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index db0498e..4bbc29f 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -248,12 +248,13 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section) /* Called with rcu_read_lock held. */ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, ram_addr_t *ram_addr, bool *read_only, + MemoryRegion **mr_p, Error **errp) { bool ret, mr_has_discard_manager; ret = memory_get_xlat_addr(iotlb, vaddr, ram_addr, read_only, - &mr_has_discard_manager, errp); + &mr_has_discard_manager, mr_p, errp); if (ret && mr_has_discard_manager) { /* * Malicious VMs might trigger discarding of IOMMU-mapped memory. The @@ -300,7 +301,8 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) { bool read_only; - if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, &local_err)) { + if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL, + &local_err)) { error_report_err(local_err); goto out; } @@ -1279,7 +1281,8 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) } rcu_read_lock(); - if (!vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL, &local_err)) { + if (!vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL, NULL, + &local_err)) { error_report_err(local_err); goto out_unlock; } diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c index 3cdaa12..a1866bb 100644 --- a/hw/virtio/vhost-vdpa.c +++ b/hw/virtio/vhost-vdpa.c @@ -228,7 +228,7 @@ static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) { bool read_only; - if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL, + if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL, NULL, &local_err)) { error_report_err(local_err); return; diff --git a/include/exec/memory.h b/include/exec/memory.h index ea5d33a..a2f1229 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -747,13 +747,16 @@ void ram_discard_manager_unregister_listener(RamDiscardManager *rdm, * @read_only: indicates if writes are allowed * @mr_has_discard_manager: indicates memory is controlled by a * RamDiscardManager + * @mr_p: return the MemoryRegion containing the @iotlb translated addr * @errp: pointer to Error*, to store an error if it happens. * * Return: true on success, else false setting @errp with error. */ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, ram_addr_t *ram_addr, bool *read_only, - bool *mr_has_discard_manager, Error **errp); + bool *mr_has_discard_manager, + MemoryRegion **mr_p, + Error **errp); typedef struct CoalescedMemoryRange CoalescedMemoryRange; typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd; diff --git a/system/memory.c b/system/memory.c index 4c82979..4ec2b8f 100644 --- a/system/memory.c +++ b/system/memory.c @@ -2185,7 +2185,9 @@ void ram_discard_manager_unregister_listener(RamDiscardManager *rdm, /* Called with rcu_read_lock held. */ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, ram_addr_t *ram_addr, bool *read_only, - bool *mr_has_discard_manager, Error **errp) + bool *mr_has_discard_manager, + MemoryRegion **mr_p, + Error **errp) { MemoryRegion *mr; hwaddr xlat; @@ -2250,6 +2252,10 @@ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, *read_only = !writable || mr->readonly; } + if (mr_p) { + *mr_p = mr; + } + return true; } From patchwork Wed Jan 29 14:43:13 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953838 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F1E6EC0218D for ; Wed, 29 Jan 2025 14:50:12 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IZ-0000dL-FW; Wed, 29 Jan 2025 09:44:11 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IM-0000YC-KT for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:58 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IK-0001Mr-KH for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:58 -0500 Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXjid006526; Wed, 29 Jan 2025 14:43:54 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=2F0vL0mCsdaa1B2qTMb82wr16kyIbmwtu/m4e16rGiA=; b= mGImDkqKPx7whpyHnlucVn6ZWA+TFc+RFaM7B+k/Ir5LXt8AF5lSvMHZeZjGZiOF H7srWDeRmUY9vmxIRPIeTLA3tHEDPiswLZ9IsFW/I4ppmEXzxE869sZt9ReP4xUx L0FFFPAJYMMzRYbiglIcJ3UZGD4jYoJqE0BdvmHB6Ug+XnraxWvuXQFwQiIC95hC PngmC0Yx7To6fAODgLUm9tVo3lCO6kW+EeTRPjfCcun4DzxjAnm+mLMADauGqvx5 NXx59cJcy7EP1mGeNwJXz05ZpVu40QPwhZNOmdhtL+nBOiHKhXSQBUyzhPQ6SUKp Uw8Sz7kSdh7cxUcuwx/pzw== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fmf808u3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:54 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDC19Q034407; Wed, 29 Jan 2025 14:43:53 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4t6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:53 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8m003307; Wed, 29 Jan 2025 14:43:53 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-18; Wed, 29 Jan 2025 14:43:52 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 17/26] vfio: pass ramblock to vfio_container_dma_map Date: Wed, 29 Jan 2025 06:43:13 -0800 Message-Id: <1738161802-172631-18-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: NQPb_N-NkgaQ3yHqH_uUTIF-53Z2L78R X-Proofpoint-ORIG-GUID: NQPb_N-NkgaQ3yHqH_uUTIF-53Z2L78R Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Pass ramblock to vfio_container_dma_map for use in a subsequent patch. No functional change. Signed-off-by: Steve Sistare --- hw/vfio/common.c | 11 +++++++---- hw/vfio/container-base.c | 3 ++- include/hw/vfio/vfio-container-base.h | 3 ++- 3 files changed, 11 insertions(+), 6 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 4bbc29f..aceb0cf 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -282,6 +282,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n); VFIOContainerBase *bcontainer = giommu->bcontainer; hwaddr iova = iotlb->iova + giommu->iommu_offset; + MemoryRegion *mr; void *vaddr; int ret; Error *local_err = NULL; @@ -301,7 +302,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) { bool read_only; - if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL, + if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, &mr, &local_err)) { error_report_err(local_err); goto out; @@ -315,7 +316,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) */ ret = vfio_container_dma_map(bcontainer, iova, iotlb->addr_mask + 1, vaddr, - read_only); + read_only, mr->ram_block); if (ret) { error_report("vfio_container_dma_map(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx", %p) = %d (%s)", @@ -380,7 +381,8 @@ static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl, vaddr = memory_region_get_ram_ptr(section->mr) + start; ret = vfio_container_dma_map(bcontainer, iova, next - start, - vaddr, section->readonly); + vaddr, section->readonly, + section->mr->ram_block); if (ret) { /* Rollback */ vfio_ram_discard_notify_discard(rdl, section); @@ -729,7 +731,8 @@ void vfio_container_region_add(VFIOContainerBase *bcontainer, } ret = vfio_container_dma_map(bcontainer, iova, int128_get64(llsize), - vaddr, section->readonly); + vaddr, section->readonly, + section->mr->ram_block); if (ret) { error_setg(&err, "vfio_container_dma_map(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx", %p) = %d (%s)", diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index 749a3fd..302cd4c 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -17,7 +17,8 @@ int vfio_container_dma_map(VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, - void *vaddr, bool readonly) + void *vaddr, bool readonly, + RAMBlock *rb) { VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer); diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 4cff994..d82e256 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -73,7 +73,8 @@ typedef struct VFIORamDiscardListener { int vfio_container_dma_map(VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, - void *vaddr, bool readonly); + void *vaddr, bool readonly, + RAMBlock *rb); int vfio_container_dma_unmap(VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb); From patchwork Wed Jan 29 14:43:14 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953836 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D4760C0218D for ; Wed, 29 Jan 2025 14:49:41 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IX-0000cB-LT; Wed, 29 Jan 2025 09:44:09 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IO-0000YY-FU for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:00 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IL-0001NO-U0 for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:43:59 -0500 Received: from pps.filterd (m0246632.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEY0l6031380; Wed, 29 Jan 2025 14:43:55 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=+AKEkt32BwKo0VjU6pwERzhRxGPKMq7ug0mCN6QZXAM=; b= WaHLu6M3PxHNDHkCoNOxZO5iYS6ijQJFj1FslSkIAlkTZXKtrwml4UFMDT7Iaofd VJPVKZ1Suoi5AHK6vh5pPickhQedOOT8j/PGMXYOS0CoSGgLQScJAa4eeA5LgYRN ejKV//B3BD2CeHp/t84c88TEGy4B3CX4vSuPEubDVQf28uh2R3yTYx5/gubU5FBD CExOCJEjyUVtHBhuvVAN2rTnyzxTDImnYXtBZLTcwRV+sv3sfbzmPnD3FkmjxtiD pNYN5rg7wmH+H8doNg8fNqBU2GkSkZYkhr9rI7uXeH3v2uyrx6zkPjOAdogbHn4W agJC8KM7Is2gWinGlIHLxQ== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fp1a81dr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:55 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDus2s034933; Wed, 29 Jan 2025 14:43:54 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4td-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:54 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8o003307; Wed, 29 Jan 2025 14:43:53 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-19; Wed, 29 Jan 2025 14:43:53 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 18/26] vfio/iommufd: define iommufd_cdev_make_hwpt Date: Wed, 29 Jan 2025 06:43:14 -0800 Message-Id: <1738161802-172631-19-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: bMBn0Y_bFMuvM-SJCFlyTA-EvFAhaIA8 X-Proofpoint-ORIG-GUID: bMBn0Y_bFMuvM-SJCFlyTA-EvFAhaIA8 Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Refactor and define iommufd_cdev_make_hwpt, to be called by CPR in a a later patch. No functional change. Signed-off-by: Steve Sistare --- hw/vfio/iommufd.c | 69 +++++++++++++++++++++++++++++++++---------------------- 1 file changed, 41 insertions(+), 28 deletions(-) diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c index 3490a8f..42ba63f 100644 --- a/hw/vfio/iommufd.c +++ b/hw/vfio/iommufd.c @@ -275,6 +275,41 @@ static bool iommufd_cdev_detach_ioas_hwpt(VFIODevice *vbasedev, Error **errp) return true; } +static void iommufd_cdev_set_hwpt(VFIODevice *vbasedev, VFIOIOASHwpt *hwpt) +{ + vbasedev->hwpt = hwpt; + vbasedev->iommu_dirty_tracking = iommufd_hwpt_dirty_tracking(hwpt); + QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, hwpt_next); +} + +static VFIOIOASHwpt *iommufd_cdev_make_hwpt(VFIODevice *vbasedev, + VFIOIOMMUFDContainer *container, + uint32_t hwpt_id) +{ + VFIOIOASHwpt *hwpt = g_malloc0(sizeof(*hwpt)); + uint32_t flags = 0; + + /* + * This is quite early and VFIO Migration state isn't yet fully + * initialized, thus rely only on IOMMU hardware capabilities as to + * whether IOMMU dirty tracking is going to be requested. Later + * vfio_migration_realize() may decide to use VF dirty tracking + * instead. + */ + if (vbasedev->hiod->caps.hw_caps & IOMMU_HW_CAP_DIRTY_TRACKING) { + flags = IOMMU_HWPT_ALLOC_DIRTY_TRACKING; + } + + hwpt->hwpt_id = hwpt_id; + hwpt->hwpt_flags = flags; + QLIST_INIT(&hwpt->device_list); + + QLIST_INSERT_HEAD(&container->hwpt_list, hwpt, next); + container->bcontainer.dirty_pages_supported |= + vbasedev->iommu_dirty_tracking; + return hwpt; +} + static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev, VFIOIOMMUFDContainer *container, Error **errp) @@ -304,24 +339,11 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev, return false; } else { - vbasedev->hwpt = hwpt; - QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, hwpt_next); - vbasedev->iommu_dirty_tracking = iommufd_hwpt_dirty_tracking(hwpt); + iommufd_cdev_set_hwpt(vbasedev, hwpt); return true; } } - /* - * This is quite early and VFIO Migration state isn't yet fully - * initialized, thus rely only on IOMMU hardware capabilities as to - * whether IOMMU dirty tracking is going to be requested. Later - * vfio_migration_realize() may decide to use VF dirty tracking - * instead. - */ - if (vbasedev->hiod->caps.hw_caps & IOMMU_HW_CAP_DIRTY_TRACKING) { - flags = IOMMU_HWPT_ALLOC_DIRTY_TRACKING; - } - if (!iommufd_backend_alloc_hwpt(iommufd, vbasedev->devid, container->ioas_id, flags, IOMMU_HWPT_DATA_NONE, 0, NULL, @@ -329,24 +351,15 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev, return false; } - hwpt = g_malloc0(sizeof(*hwpt)); - hwpt->hwpt_id = hwpt_id; - hwpt->hwpt_flags = flags; - QLIST_INIT(&hwpt->device_list); - - ret = iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt->hwpt_id, errp); + ret = iommufd_cdev_attach_ioas_hwpt(vbasedev, hwpt_id, errp); if (ret) { - iommufd_backend_free_id(container->be, hwpt->hwpt_id); - g_free(hwpt); + iommufd_backend_free_id(container->be, hwpt_id); return false; } - vbasedev->hwpt = hwpt; - vbasedev->iommu_dirty_tracking = iommufd_hwpt_dirty_tracking(hwpt); - QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, hwpt_next); - QLIST_INSERT_HEAD(&container->hwpt_list, hwpt, next); - container->bcontainer.dirty_pages_supported |= - vbasedev->iommu_dirty_tracking; + hwpt = iommufd_cdev_make_hwpt(vbasedev, container, hwpt_id); + iommufd_cdev_set_hwpt(vbasedev, hwpt); + if (container->bcontainer.dirty_pages_supported && !vbasedev->iommu_dirty_tracking) { warn_report("IOMMU instance for device %s doesn't support dirty tracking", From patchwork Wed Jan 29 14:43:15 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953818 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 95386C0218D for ; Wed, 29 Jan 2025 14:46:57 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IY-0000cv-Di; Wed, 29 Jan 2025 09:44:10 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IP-0000ZJ-CC for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:01 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IM-0001NV-GP for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:01 -0500 Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXjoE006534; Wed, 29 Jan 2025 14:43:55 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=bOvNzUBOzzP3JeyjYtLGjFNUTpKnuSslDX1WD5QbhjU=; b= BEKHaSTepz9F0cPVNDFPyHqMoLu2TZwhYQ5j9/X0qrzchffNG1Jf9WwTpRAV5jnz MUWFW4i6at3e++OiNX5ZjQDFagAzumxW15p4hmNlc2BRajXKWQ7Y5++zIquuBulV HNZ4bLa29WRONTN/DvWD3tQO5YUFY9qIanbgwm5DID27rDRWLvehVpW6So+Go+61 FooSNXa4RMJWc+lUyBOmv6lRDx12y9E6KGlC6eDztJhDCN8OvLyT8No7+Z+V7WHj dlKer/wH6xdTTKFV/nktykcKgrfZ9wQ8hU9hoR7eTQ7W7gSv1VIlMZRM18HMbdNF w/ctKWzZh2o+5mA+YYq+dQ== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fmf808u8-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:55 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDo8GN034292; Wed, 29 Jan 2025 14:43:55 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4tm-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:54 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8q003307; Wed, 29 Jan 2025 14:43:54 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-20; Wed, 29 Jan 2025 14:43:54 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 19/26] vfio/iommufd: use IOMMU_IOAS_MAP_FILE Date: Wed, 29 Jan 2025 06:43:15 -0800 Message-Id: <1738161802-172631-20-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: 3aPhxYnu15EnbVN3Y2BkcBFCZRCra4k3 X-Proofpoint-ORIG-GUID: 3aPhxYnu15EnbVN3Y2BkcBFCZRCra4k3 Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Use IOMMU_IOAS_MAP_FILE when the mapped region is backed by a file. Such a mapping can be preserved without modification during CPR, because it depends on the file's address space, which does not change, rather than on the process's address space, which does change. Signed-off-by: Steve Sistare --- backends/iommufd.c | 36 +++++++++++++++++++++++++++++++++++ backends/trace-events | 1 + hw/vfio/container-base.c | 9 +++++++++ hw/vfio/iommufd.c | 13 +++++++++++++ include/exec/cpu-common.h | 1 + include/hw/vfio/vfio-container-base.h | 3 +++ include/system/iommufd.h | 3 +++ system/physmem.c | 5 +++++ 8 files changed, 71 insertions(+) diff --git a/backends/iommufd.c b/backends/iommufd.c index 7b4fc8e..6d29221 100644 --- a/backends/iommufd.c +++ b/backends/iommufd.c @@ -174,6 +174,42 @@ int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova, return ret; } +int iommufd_backend_map_file_dma(IOMMUFDBackend *be, uint32_t ioas_id, + hwaddr iova, ram_addr_t size, + int mfd, unsigned long start, bool readonly) +{ + int ret, fd = be->fd; + struct iommu_ioas_map_file map = { + .size = sizeof(map), + .flags = IOMMU_IOAS_MAP_READABLE | + IOMMU_IOAS_MAP_FIXED_IOVA, + .ioas_id = ioas_id, + .fd = mfd, + .start = start, + .iova = iova, + .length = size, + }; + + if (!readonly) { + map.flags |= IOMMU_IOAS_MAP_WRITEABLE; + } + + ret = ioctl(fd, IOMMU_IOAS_MAP_FILE, &map); + trace_iommufd_backend_map_file_dma(fd, ioas_id, iova, size, mfd, start, + readonly, ret); + if (ret) { + ret = -errno; + + /* TODO: Not support mapping hardware PCI BAR region for now. */ + if (errno == EFAULT) { + warn_report("IOMMU_IOAS_MAP_FILE failed: %m, PCI BAR?"); + } else { + error_report("IOMMU_IOAS_MAP_FILE failed: %m"); + } + } + return ret; +} + int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova, ram_addr_t size) { diff --git a/backends/trace-events b/backends/trace-events index 40811a3..f478e18 100644 --- a/backends/trace-events +++ b/backends/trace-events @@ -11,6 +11,7 @@ iommufd_backend_connect(int fd, bool owned, uint32_t users) "fd=%d owned=%d user iommufd_backend_disconnect(int fd, uint32_t users) "fd=%d users=%d" iommu_backend_set_fd(int fd) "pre-opened /dev/iommu fd=%d" iommufd_backend_map_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, void *vaddr, bool readonly, int ret) " iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" addr=%p readonly=%d (%d)" +iommufd_backend_map_file_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, int fd, unsigned long start, bool readonly, int ret) " iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" fd=%d start=%ld readonly=%d (%d)" iommufd_backend_unmap_dma_non_exist(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, int ret) " Unmap nonexistent mapping: iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" (%d)" iommufd_backend_unmap_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, int ret) " iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" (%d)" iommufd_backend_alloc_ioas(int iommufd, uint32_t ioas) " iommufd=%d ioas=%d" diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index 302cd4c..fbaf04a 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -21,7 +21,16 @@ int vfio_container_dma_map(VFIOContainerBase *bcontainer, RAMBlock *rb) { VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer); + int mfd = rb ? qemu_ram_get_fd(rb) : -1; + if (mfd >= 0 && vioc->dma_map_file) { + unsigned long start = vaddr - qemu_ram_get_host_addr(rb); + unsigned long offset = qemu_ram_get_fd_offset(rb); + + vioc->dma_map_file(bcontainer, iova, size, mfd, start + offset, + readonly); + return 0; + } g_assert(vioc->dma_map); return vioc->dma_map(bcontainer, iova, size, vaddr, readonly); } diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c index 42ba63f..a3e7edb 100644 --- a/hw/vfio/iommufd.c +++ b/hw/vfio/iommufd.c @@ -38,6 +38,18 @@ static int iommufd_cdev_map(const VFIOContainerBase *bcontainer, hwaddr iova, iova, size, vaddr, readonly); } +static int iommufd_cdev_map_file(const VFIOContainerBase *bcontainer, + hwaddr iova, ram_addr_t size, + int fd, unsigned long start, bool readonly) +{ + const VFIOIOMMUFDContainer *container = + container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer); + + return iommufd_backend_map_file_dma(container->be, + container->ioas_id, + iova, size, fd, start, readonly); +} + static int iommufd_cdev_unmap(const VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb) @@ -806,6 +818,7 @@ static void vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data) vioc->hiod_typename = TYPE_HOST_IOMMU_DEVICE_IOMMUFD_VFIO; vioc->dma_map = iommufd_cdev_map; + vioc->dma_map_file = iommufd_cdev_map_file; vioc->dma_unmap = iommufd_cdev_unmap; vioc->attach_device = iommufd_cdev_attach; vioc->detach_device = iommufd_cdev_detach; diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h index b1d76d6..0cab252 100644 --- a/include/exec/cpu-common.h +++ b/include/exec/cpu-common.h @@ -95,6 +95,7 @@ void qemu_ram_unset_idstr(RAMBlock *block); const char *qemu_ram_get_idstr(RAMBlock *rb); void *qemu_ram_get_host_addr(RAMBlock *rb); ram_addr_t qemu_ram_get_offset(RAMBlock *rb); +ram_addr_t qemu_ram_get_fd_offset(RAMBlock *rb); ram_addr_t qemu_ram_get_used_length(RAMBlock *rb); ram_addr_t qemu_ram_get_max_length(RAMBlock *rb); bool qemu_ram_is_shared(RAMBlock *rb); diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index d82e256..4daa5f8 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -115,6 +115,9 @@ struct VFIOIOMMUClass { int (*dma_map)(const VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, void *vaddr, bool readonly); + int (*dma_map_file)(const VFIOContainerBase *bcontainer, + hwaddr iova, ram_addr_t size, + int fd, unsigned long start, bool readonly); int (*dma_unmap)(const VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb); diff --git a/include/system/iommufd.h b/include/system/iommufd.h index cbab75b..ac700b8 100644 --- a/include/system/iommufd.h +++ b/include/system/iommufd.h @@ -43,6 +43,9 @@ void iommufd_backend_disconnect(IOMMUFDBackend *be); bool iommufd_backend_alloc_ioas(IOMMUFDBackend *be, uint32_t *ioas_id, Error **errp); void iommufd_backend_free_id(IOMMUFDBackend *be, uint32_t id); +int iommufd_backend_map_file_dma(IOMMUFDBackend *be, uint32_t ioas_id, + hwaddr iova, ram_addr_t size, int fd, + unsigned long start, bool readonly); int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova, ram_addr_t size, void *vaddr, bool readonly); int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id, diff --git a/system/physmem.c b/system/physmem.c index 0bcfc6c..c41a80b 100644 --- a/system/physmem.c +++ b/system/physmem.c @@ -1569,6 +1569,11 @@ ram_addr_t qemu_ram_get_offset(RAMBlock *rb) return rb->offset; } +ram_addr_t qemu_ram_get_fd_offset(RAMBlock *rb) +{ + return rb->fd_offset; +} + ram_addr_t qemu_ram_get_used_length(RAMBlock *rb) { return rb->used_length; From patchwork Wed Jan 29 14:43:16 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953839 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8477CC02194 for ; Wed, 29 Jan 2025 14:50:13 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IW-0000b7-6I; Wed, 29 Jan 2025 09:44:08 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IP-0000Yq-0U for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:01 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IN-0001Nq-7x for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:00 -0500 Received: from pps.filterd (m0246632.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXqKW031317; Wed, 29 Jan 2025 14:43:57 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=Ad8syoWrWMfKHeyySxOpwIJop8rxHg+7gBnAzCPkPHk=; b= iCKINZ8Fowu5Gt4TprSwi52QMlsbS7kfnO2UWoVD47FaHpbSRiIimHXfnv193sUb 6p/vtWCUP/E7VgZkNpdfFjK5RJGfBpn9zPKuxBTYdz/f2+oUBC14NLVKWVirXQnV IFw+A22N29LDzxhWAVhg+miIGkHH9BI+sGojs87TmqGnsA1dyAp4ced7oGuEzvRi kqZJPscl2wxCtaill5nuwOhyaPFwE2n3HlEMyeQd+vl3mVpHIPFCKX0n26Wri2sf FUTsPWb65kCe2gyWzHMVD4vhDizBacCBM4vOfG5qMURoRsNVPAocuYEKUrKNIB/8 bE6FrohMYUf7QUtm7NIgvg== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fp1a81dt-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:56 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDo8GO034292; Wed, 29 Jan 2025 14:43:55 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4tu-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:55 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8s003307; Wed, 29 Jan 2025 14:43:55 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-21; Wed, 29 Jan 2025 14:43:55 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 20/26] vfio/iommufd: export iommufd_cdev_get_info_iova_range Date: Wed, 29 Jan 2025 06:43:16 -0800 Message-Id: <1738161802-172631-21-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: L0ou9PZXWkd5dWmJjM9vekljPhbS8rkv X-Proofpoint-ORIG-GUID: L0ou9PZXWkd5dWmJjM9vekljPhbS8rkv Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Export iommufd_cdev_get_info_iova_range for use by CPR. No functional change. Signed-off-by: Steve Sistare --- hw/vfio/iommufd.c | 4 ++-- include/hw/vfio/vfio-common.h | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c index a3e7edb..2f888e5 100644 --- a/hw/vfio/iommufd.c +++ b/hw/vfio/iommufd.c @@ -442,8 +442,8 @@ static int iommufd_cdev_ram_block_discard_disable(bool state) return ram_block_uncoordinated_discard_disable(state); } -static bool iommufd_cdev_get_info_iova_range(VFIOIOMMUFDContainer *container, - uint32_t ioas_id, Error **errp) +bool iommufd_cdev_get_info_iova_range(VFIOIOMMUFDContainer *container, + uint32_t ioas_id, Error **errp) { VFIOContainerBase *bcontainer = &container->bcontainer; g_autofree struct iommu_ioas_iova_ranges *info = NULL; diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 5a89aca..ca10abc 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -268,6 +268,8 @@ bool vfio_cpr_register_container(VFIOContainerBase *bcontainer, Error **errp); void vfio_cpr_unregister_container(VFIOContainerBase *bcontainer); bool vfio_legacy_cpr_register_container(VFIOContainer *container, Error **errp); void vfio_legacy_cpr_unregister_container(VFIOContainer *container); +bool iommufd_cdev_get_info_iova_range(VFIOIOMMUFDContainer *container, + uint32_t ioas_id, Error **errp); extern const MemoryRegionOps vfio_region_ops; typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList; From patchwork Wed Jan 29 14:43:17 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953835 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 543DEC3DA4A for ; Wed, 29 Jan 2025 14:49:27 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9IV-0000as-17; Wed, 29 Jan 2025 09:44:07 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IP-0000a6-T9 for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:01 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IO-0001Ns-7v for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:01 -0500 Received: from pps.filterd (m0333521.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEbCoh007551; Wed, 29 Jan 2025 14:43:57 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=fTaHqwDgUxlCgdHBDi56iAfZ6J91H9VUHu07rDGKWFw=; b= oTG8ioG4dg/MMNyzZtfFfJEh5zI80t/4oIm+STz9i7aI7QrMvLVd6vUZnmcMP9Qp KGZDfA4O+8BPdlwo505wxP+mz6bZb/9twRgWoTYIrZgqax+e4vm2pd08izRzJMUa hQsuiFB42gVJosXxlmIsD7rtxwYlf6KndC5n/xVYEff01swhHKrVFsvAM9sjozxz uSeiHbyOdAaWQRBokrGjN6/dxXqeZ4FNnFuxia6jqOyiY3LVLpEOw6oo12hZY8J2 H/9bcxMP9fj1ro7jQfxF/GNqzC784jsAzOkuQIsEXALPo6DW7A1AhpU3CaOmV805 3ChKtjTs87sLBD4vbJJd6w== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fp8e00ra-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:57 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDQ7aX034289; Wed, 29 Jan 2025 14:43:56 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4u5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:56 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8u003307; Wed, 29 Jan 2025 14:43:55 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-22; Wed, 29 Jan 2025 14:43:55 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 21/26] iommufd: change process ioctl Date: Wed, 29 Jan 2025 06:43:17 -0800 Message-Id: <1738161802-172631-22-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-ORIG-GUID: JtooxTb81mH3Qp9bpvuaieYI2gDB6FSw X-Proofpoint-GUID: JtooxTb81mH3Qp9bpvuaieYI2gDB6FSw Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Define the change process ioctl Signed-off-by: Steve Sistare --- backends/iommufd.c | 20 ++++++++++++++++++++ backends/trace-events | 1 + include/system/iommufd.h | 2 ++ 3 files changed, 23 insertions(+) diff --git a/backends/iommufd.c b/backends/iommufd.c index 6d29221..be5f6a3 100644 --- a/backends/iommufd.c +++ b/backends/iommufd.c @@ -73,6 +73,26 @@ static void iommufd_backend_class_init(ObjectClass *oc, void *data) object_class_property_add_str(oc, "fd", NULL, iommufd_backend_set_fd); } +bool iommufd_change_process_capable(IOMMUFDBackend *be) +{ + struct iommu_ioas_change_process args = {.size = sizeof(args)}; + + return !ioctl(be->fd, IOMMU_IOAS_CHANGE_PROCESS, &args); +} + +int iommufd_change_process(IOMMUFDBackend *be) +{ + struct iommu_ioas_change_process args = {.size = sizeof(args)}; + int ret = ioctl(be->fd, IOMMU_IOAS_CHANGE_PROCESS, &args); + + if (ret) { + ret = -errno; + error_report("IOMMU_IOAS_CHANGE_PROCESS fd %d failed: %m", be->fd); + } + trace_iommufd_change_process(be->fd, ret); + return ret; +} + bool iommufd_backend_connect(IOMMUFDBackend *be, Error **errp) { int fd; diff --git a/backends/trace-events b/backends/trace-events index f478e18..9b33dc3 100644 --- a/backends/trace-events +++ b/backends/trace-events @@ -7,6 +7,7 @@ dbus_vmstate_loading(const char *id) "id: %s" dbus_vmstate_saving(const char *id) "id: %s" # iommufd.c +iommufd_change_process(int fd, int ret) "fd=%d (%d)" iommufd_backend_connect(int fd, bool owned, uint32_t users) "fd=%d owned=%d users=%d" iommufd_backend_disconnect(int fd, uint32_t users) "fd=%d users=%d" iommu_backend_set_fd(int fd) "pre-opened /dev/iommu fd=%d" diff --git a/include/system/iommufd.h b/include/system/iommufd.h index ac700b8..4e9c037 100644 --- a/include/system/iommufd.h +++ b/include/system/iommufd.h @@ -64,6 +64,8 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id, uint64_t iova, ram_addr_t size, uint64_t page_size, uint64_t *data, Error **errp); +bool iommufd_change_process_capable(IOMMUFDBackend *be); +int iommufd_change_process(IOMMUFDBackend *be); #define TYPE_HOST_IOMMU_DEVICE_IOMMUFD TYPE_HOST_IOMMU_DEVICE "-iommufd" #endif From patchwork Wed Jan 29 14:43:18 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953833 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 51197C02193 for ; Wed, 29 Jan 2025 14:49:27 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9Id-0000ft-Gq; Wed, 29 Jan 2025 09:44:15 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9Ib-0000f1-GH for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:13 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IZ-0001Oa-Pj for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:13 -0500 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXrKl011631; Wed, 29 Jan 2025 14:43:58 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=PngtgFNIRuTVuIU4bNtaaptebJWnixYeRD7cj95XLpM=; b= alQ9dzMPZQOuwhrOPePnLxSGUQVx1kSRN5mrheha007d9L+JZvGUe0bKG+0iADv9 mNUubP4l5tiRrzpDIUjn8vdRwwluFystQTaA+XGGk4gdvceP17DDd/fgPt0D0JN+ HdNN7Nipuqvx1O9IRn/1bdSA3rWuW1NFQ8BcUgDgsztevJdzY2/p/F4Nebf7jhrG borB2bsobX9ATMnKupmOngln16vf4lRk+iEYjCtUJs/ws+nDS+XtixPRC22kgIML AT6UF27AgHxlG6ayBiEFH7xCulIURJNizr/vIErbEruGbnHLoYHrVB2Ev/KG01mG DyzB7DSr+GPDaiUDt5e8PA== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fnk9g3jf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:58 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDOoKI034428; Wed, 29 Jan 2025 14:43:57 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4um-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:57 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf8w003307; Wed, 29 Jan 2025 14:43:56 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-23; Wed, 29 Jan 2025 14:43:56 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 22/26] vfio/iommufd: invariant device name Date: Wed, 29 Jan 2025 06:43:18 -0800 Message-Id: <1738161802-172631-23-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-ORIG-GUID: FUwrt4tULkEpuUXPbFJLxcZLc6lAhtKT X-Proofpoint-GUID: FUwrt4tULkEpuUXPbFJLxcZLc6lAhtKT Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org cpr-transfer will use the device name as a key to find the value of the device descriptor in new QEMU. However, if the descriptor number is specified by a command-line fd parameter, then vfio_device_get_name creates a name that includes the fd number. This causes a chicken-and-egg problem: new QEMU must know the fd number to construct a name to find the fd number. To fix, create an invariant name based on the id command-line parameter. If id is not defined, add a CPR blocker. Signed-off-by: Steve Sistare --- hw/vfio/helpers.c | 18 +++++++++++++++--- hw/vfio/iommufd.c | 2 ++ include/hw/vfio/vfio-common.h | 1 + 3 files changed, 18 insertions(+), 3 deletions(-) diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c index 913796f..bd94b86 100644 --- a/hw/vfio/helpers.c +++ b/hw/vfio/helpers.c @@ -25,6 +25,8 @@ #include "hw/vfio/vfio-common.h" #include "hw/hw.h" #include "trace.h" +#include "migration/blocker.h" +#include "migration/cpr.h" #include "qapi/error.h" #include "qemu/error-report.h" #include "qemu/units.h" @@ -636,6 +638,7 @@ bool vfio_device_get_name(VFIODevice *vbasedev, Error **errp) { ERRP_GUARD(); struct stat st; + bool ret = true; if (vbasedev->fd < 0) { if (stat(vbasedev->sysfsdev, &st) < 0) { @@ -653,15 +656,24 @@ bool vfio_device_get_name(VFIODevice *vbasedev, Error **errp) return false; } /* - * Give a name with fd so any function printing out vbasedev->name + * Give a name so any function printing out vbasedev->name * will not break. */ if (!vbasedev->name) { - vbasedev->name = g_strdup_printf("VFIO_FD%d", vbasedev->fd); + if (vbasedev->dev->id) { + vbasedev->name = g_strdup(vbasedev->dev->id); + } else { + vbasedev->name = g_strdup_printf("VFIO_FD%d", vbasedev->fd); + error_setg(&vbasedev->cpr_id_blocker, + "vfio device with fd=%d needs an id property", + vbasedev->fd); + ret = migrate_add_blocker_modes(&vbasedev->cpr_id_blocker, errp, + MIG_MODE_CPR_TRANSFER, -1) == 0; + } } } - return true; + return ret; } void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp) diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c index 2f888e5..8308715 100644 --- a/hw/vfio/iommufd.c +++ b/hw/vfio/iommufd.c @@ -24,6 +24,7 @@ #include "system/reset.h" #include "qemu/cutils.h" #include "qemu/chardev_open.h" +#include "migration/blocker.h" #include "pci.h" #include "exec/ram_addr.h" @@ -657,6 +658,7 @@ static void iommufd_cdev_detach(VFIODevice *vbasedev) iommufd_cdev_container_destroy(container); vfio_put_address_space(space); + migrate_del_blocker(&vbasedev->cpr_id_blocker); iommufd_cdev_unbind_and_disconnect(vbasedev); close(vbasedev->fd); } diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index ca10abc..37e7c26 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -147,6 +147,7 @@ typedef struct VFIODevice { VFIOMigration *migration; Error *migration_blocker; Error *cpr_mdev_blocker; + Error *cpr_id_blocker; OnOffAuto pre_copy_dirty_page_tracking; OnOffAuto device_dirty_page_tracking; bool dirty_pages_supported; From patchwork Wed Jan 29 14:43:19 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953825 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 177A8C0218D for ; Wed, 29 Jan 2025 14:48:46 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9Ic-0000fD-SF; Wed, 29 Jan 2025 09:44:14 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9Ib-0000ek-8P for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:13 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IZ-0001OG-Df for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:13 -0500 Received: from pps.filterd (m0333521.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEbAZr007524; Wed, 29 Jan 2025 14:43:58 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=WKa651mN2NNSEZD+2QEoBB8PQPBvkMwFFWRK+gIEq6o=; b= KJVxYKKqYGgCaImdDaAdyca/s2M7t6mTR2ew9ecHKUUv8NtETRKw8G89RaxyPQAs F+dKkya8EpzRxIyYtcDuenz1VO1WtEJDmNifMtEpyWkeMQTHiRYMDkp1v3i27oPh r8MIO51XKhlD9Mw/2zc4lr6lGI/B+s5J+JlsQQxuhnp7WjzY3nL7ZXL1XCAC20zC Vej8MvRs0oNTe0A5oxi/C/xkRjaPeoAWF0BnME/aLvMRJYjZpMbcr3giNHWm6UuR kQu69ta6v+tK3Ro0zSOEysu5SV7QDm1D4CDvk/4XIiKY7dpbQk0SlKWDbxR9ZvtI aKW2Hb/rGjHHyUYYn8hXuQ== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fp8e00rh-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:58 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDo8GP034292; Wed, 29 Jan 2025 14:43:57 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4uw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:57 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf90003307; Wed, 29 Jan 2025 14:43:57 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-24; Wed, 29 Jan 2025 14:43:57 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 23/26] vfio/iommufd: register container for cpr Date: Wed, 29 Jan 2025 06:43:19 -0800 Message-Id: <1738161802-172631-24-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-ORIG-GUID: IeCykdG45YSBN_lQYD9j--_93Qc4u0jA X-Proofpoint-GUID: IeCykdG45YSBN_lQYD9j--_93Qc4u0jA Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Register a vfio iommufd container for CPR. Add a blocker if the kernel does not support IOMMU_IOAS_CHANGE_PROCESS. This is mostly boiler plate. The fields to to saved and restored are added in subsequent patches. Signed-off-by: Steve Sistare --- hw/vfio/cpr-iommufd.c | 96 +++++++++++++++++++++++++++++++++++++++++++ hw/vfio/iommufd.c | 12 +++--- hw/vfio/meson.build | 1 + include/hw/vfio/vfio-common.h | 6 +++ 4 files changed, 110 insertions(+), 5 deletions(-) create mode 100644 hw/vfio/cpr-iommufd.c diff --git a/hw/vfio/cpr-iommufd.c b/hw/vfio/cpr-iommufd.c new file mode 100644 index 0000000..4eb358a --- /dev/null +++ b/hw/vfio/cpr-iommufd.c @@ -0,0 +1,96 @@ +/* + * Copyright (c) 2024-2025 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "hw/vfio/vfio-common.h" +#include "migration/blocker.h" +#include "migration/cpr.h" +#include "migration/migration.h" +#include "migration/vmstate.h" +#include "system/iommufd.h" + +static bool vfio_cpr_supported(VFIOIOMMUFDContainer *container, Error **errp) +{ + if (!iommufd_change_process_capable(container->be)) { + error_setg(errp, + "VFIO container does not support IOMMU_IOAS_CHANGE_PROCESS"); + return false; + } + return true; +} + +static const VMStateDescription vfio_container_vmstate = { + .name = "vfio-iommufd-container", + .version_id = 0, + .minimum_version_id = 0, + .needed = cpr_needed_for_reuse, + .fields = (VMStateField[]) { + VMSTATE_END_OF_LIST() + } +}; + +static const VMStateDescription iommufd_cpr_vmstate = { + .name = "iommufd", + .version_id = 0, + .minimum_version_id = 0, + .needed = cpr_needed_for_reuse, + .fields = (VMStateField[]) { + VMSTATE_END_OF_LIST() + } +}; + +bool vfio_iommufd_cpr_register_container(VFIOIOMMUFDContainer *container, + Error **errp) +{ + VFIOContainerBase *bcontainer = &container->bcontainer; + Error **cpr_blocker = &container->cpr_blocker; + + if (!vfio_cpr_register_container(bcontainer, errp)) { + return false; + } + + if (!vfio_cpr_supported(container, cpr_blocker)) { + return migrate_add_blocker_modes(cpr_blocker, errp, + MIG_MODE_CPR_TRANSFER, -1) == 0; + } + + vmstate_register(NULL, -1, &vfio_container_vmstate, container); + vmstate_register(NULL, -1, &iommufd_cpr_vmstate, container->be); + + return true; +} + +void vfio_iommufd_cpr_unregister_container(VFIOIOMMUFDContainer *container) +{ + VFIOContainerBase *bcontainer = &container->bcontainer; + + vmstate_unregister(NULL, &iommufd_cpr_vmstate, container->be); + vmstate_unregister(NULL, &vfio_container_vmstate, container); + migrate_del_blocker(&container->cpr_blocker); + vfio_cpr_unregister_container(bcontainer); +} + +static const VMStateDescription vfio_device_vmstate = { + .name = "vfio-iommufd-device", + .version_id = 0, + .minimum_version_id = 0, + .needed = cpr_needed_for_reuse, + .fields = (VMStateField[]) { + VMSTATE_END_OF_LIST() + } +}; + +void vfio_iommufd_cpr_register_device(VFIODevice *vbasedev) +{ + vmstate_register(NULL, -1, &vfio_device_vmstate, vbasedev); +} + +void vfio_iommufd_cpr_unregister_device(VFIODevice *vbasedev) +{ + vmstate_unregister(NULL, &vfio_device_vmstate, vbasedev); +} diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c index 8308715..ae78e00 100644 --- a/hw/vfio/iommufd.c +++ b/hw/vfio/iommufd.c @@ -592,6 +592,10 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev, bcontainer->initialized = true; + if (!vfio_iommufd_cpr_register_container(container, errp)) { + goto err_listener_register; + } + found_container: ret = ioctl(devfd, VFIO_DEVICE_GET_INFO, &dev_info); if (ret) { @@ -599,10 +603,6 @@ found_container: goto err_listener_register; } - if (!vfio_cpr_register_container(bcontainer, errp)) { - goto err_listener_register; - } - /* * TODO: examine RAM_BLOCK_DISCARD stuff, should we do group level * for discarding incompatibility check as well? @@ -619,6 +619,7 @@ found_container: vbasedev->bcontainer = bcontainer; QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next); QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next); + vfio_iommufd_cpr_register_device(vbasedev); trace_iommufd_cdev_device_info(vbasedev->name, devfd, vbasedev->num_irqs, vbasedev->num_regions, vbasedev->flags); @@ -653,12 +654,13 @@ static void iommufd_cdev_detach(VFIODevice *vbasedev) iommufd_cdev_ram_block_discard_disable(false); } - vfio_cpr_unregister_container(bcontainer); + vfio_iommufd_cpr_unregister_container(container); iommufd_cdev_detach_container(vbasedev, container); iommufd_cdev_container_destroy(container); vfio_put_address_space(space); migrate_del_blocker(&vbasedev->cpr_id_blocker); + vfio_iommufd_cpr_unregister_device(vbasedev); iommufd_cdev_unbind_and_disconnect(vbasedev); close(vbasedev->fd); } diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build index 5487815..998adb5 100644 --- a/hw/vfio/meson.build +++ b/hw/vfio/meson.build @@ -13,6 +13,7 @@ vfio_ss.add(when: 'CONFIG_IOMMUFD', if_true: files( vfio_ss.add(when: 'CONFIG_VFIO_PCI', if_true: files( 'cpr.c', 'cpr-legacy.c', + 'cpr-iommufd.c', 'display.c', 'pci-quirks.c', 'pci.c', diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 37e7c26..add44d4 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -113,6 +113,7 @@ typedef struct VFIOIOASHwpt { typedef struct VFIOIOMMUFDContainer { VFIOContainerBase bcontainer; IOMMUFDBackend *be; + Error *cpr_blocker; uint32_t ioas_id; QLIST_HEAD(, VFIOIOASHwpt) hwpt_list; } VFIOIOMMUFDContainer; @@ -271,6 +272,11 @@ bool vfio_legacy_cpr_register_container(VFIOContainer *container, Error **errp); void vfio_legacy_cpr_unregister_container(VFIOContainer *container); bool iommufd_cdev_get_info_iova_range(VFIOIOMMUFDContainer *container, uint32_t ioas_id, Error **errp); +bool vfio_iommufd_cpr_register_container(VFIOIOMMUFDContainer *container, + Error **errp); +void vfio_iommufd_cpr_unregister_container(VFIOIOMMUFDContainer *container); +void vfio_iommufd_cpr_register_device(VFIODevice *vbasedev); +void vfio_iommufd_cpr_unregister_device(VFIODevice *vbasedev); extern const MemoryRegionOps vfio_region_ops; typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList; From patchwork Wed Jan 29 14:43:20 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953807 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C0087C02190 for ; Wed, 29 Jan 2025 14:44:47 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9Ii-0000g4-1b; Wed, 29 Jan 2025 09:44:20 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9Ib-0000fC-VB for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:14 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9IZ-0001OS-P7 for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:13 -0500 Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXk6c006554; Wed, 29 Jan 2025 14:43:59 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=nE7wuTh9USZkkNmT+UVS0+zVkHM32tvpNteq2BHkwMY=; b= hyJPp37DqhJCD8l3JJvdM6WQPq5ji37h54YL+5m/B6kVVVYXJimMUKSjk2ZhFkST l+8HLNjRSZbCd/kV07q+SySoUJ2g7vyZB9vknW2fTeb23DwHkTMJKfkOIW4Z45R5 FCYoe6Y6pEfM4ykuydzSFlqTcCcc01AfYo4pUPmqpqyBI0Sq0tceT+IZQnnx6LCX 4cZZ3NGkihdI8X3yPPFmam/l0/VxRYMBiRLaxw8igz54fi8oBB7abodMxZDazPjj znqsjlRC+xtfR9h4A+5uUgHS+PdJOLhq1/myII7zHXod8iZnmaPd47iN29wYgpZ9 V7/etZN61iPltN2e7QPGVw== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fmf808um-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:59 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDGTtK037021; Wed, 29 Jan 2025 14:43:58 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4v7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:58 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf92003307; Wed, 29 Jan 2025 14:43:57 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-25; Wed, 29 Jan 2025 14:43:57 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 24/26] vfio/iommufd: preserve descriptors Date: Wed, 29 Jan 2025 06:43:20 -0800 Message-Id: <1738161802-172631-25-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: N_ECke_chYsCxpeYv6lHl2diCuXCyyg0 X-Proofpoint-ORIG-GUID: N_ECke_chYsCxpeYv6lHl2diCuXCyyg0 Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Save the iommu and vfio device fd in CPR state when it is created. After CPR, the fd number is found in CPR state and reused. Remember the reused status for subsequent patches. The reused status is cleared when vmstate load finishes. Signed-off-by: Steve Sistare --- backends/iommufd.c | 24 ++++++++++++++++++++---- hw/vfio/cpr-iommufd.c | 15 +++++++++++++++ hw/vfio/helpers.c | 10 ++-------- hw/vfio/iommufd.c | 10 +++++++++- include/system/iommufd.h | 1 + 5 files changed, 47 insertions(+), 13 deletions(-) diff --git a/backends/iommufd.c b/backends/iommufd.c index be5f6a3..e9452e4 100644 --- a/backends/iommufd.c +++ b/backends/iommufd.c @@ -16,12 +16,18 @@ #include "qemu/module.h" #include "qom/object_interfaces.h" #include "qemu/error-report.h" +#include "migration/cpr.h" #include "monitor/monitor.h" #include "trace.h" #include "hw/vfio/vfio-common.h" #include #include +static const char *iommufd_fd_name(IOMMUFDBackend *be) +{ + return object_get_canonical_path_component(OBJECT(be)); +} + static void iommufd_backend_init(Object *obj) { IOMMUFDBackend *be = IOMMUFD_BACKEND(obj); @@ -46,10 +52,10 @@ static void iommufd_backend_set_fd(Object *obj, const char *str, Error **errp) ERRP_GUARD(); IOMMUFDBackend *be = IOMMUFD_BACKEND(obj); int fd = -1; + const char *name = iommufd_fd_name(be); - fd = monitor_fd_param(monitor_cur(), str, errp); + fd = cpr_get_fd_param(name, str, 0, &be->reused, errp); if (fd == -1) { - error_prepend(errp, "Could not parse remote object fd %s:", str); return; } be->fd = fd; @@ -98,10 +104,16 @@ bool iommufd_backend_connect(IOMMUFDBackend *be, Error **errp) int fd; if (be->owned && !be->users) { - fd = qemu_open("/dev/iommu", O_RDWR, errp); + const char *name = iommufd_fd_name(be); + fd = cpr_find_fd(name, 0); + be->reused = (fd >= 0); + if (!be->reused) { + fd = qemu_open("/dev/iommu", O_RDWR, errp); + } if (fd < 0) { return false; } + cpr_resave_fd(name, 0, fd); be->fd = fd; } be->users++; @@ -112,6 +124,9 @@ bool iommufd_backend_connect(IOMMUFDBackend *be, Error **errp) void iommufd_backend_disconnect(IOMMUFDBackend *be) { + int fd = be->fd; + const char *name = iommufd_fd_name(be); + if (!be->users) { goto out; } @@ -121,7 +136,8 @@ void iommufd_backend_disconnect(IOMMUFDBackend *be) be->fd = -1; } out: - trace_iommufd_backend_disconnect(be->fd, be->users); + cpr_delete_fd(name, 0); + trace_iommufd_backend_disconnect(fd, be->users); } bool iommufd_backend_alloc_ioas(IOMMUFDBackend *be, uint32_t *ioas_id, diff --git a/hw/vfio/cpr-iommufd.c b/hw/vfio/cpr-iommufd.c index 4eb358a..053ff8c 100644 --- a/hw/vfio/cpr-iommufd.c +++ b/hw/vfio/cpr-iommufd.c @@ -24,10 +24,25 @@ static bool vfio_cpr_supported(VFIOIOMMUFDContainer *container, Error **errp) return true; } +static int vfio_container_post_load(void *opaque, int version_id) +{ + VFIOIOMMUFDContainer *container = opaque; + VFIOContainerBase *bcontainer = &container->bcontainer; + VFIODevice *vbasedev; + + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { + vbasedev->reused = false; + } + container->be->reused = false; + + return 0; +} + static const VMStateDescription vfio_container_vmstate = { .name = "vfio-iommufd-container", .version_id = 0, .minimum_version_id = 0, + .post_load = vfio_container_post_load, .needed = cpr_needed_for_reuse, .fields = (VMStateField[]) { VMSTATE_END_OF_LIST() diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c index bd94b86..7f69707 100644 --- a/hw/vfio/helpers.c +++ b/hw/vfio/helpers.c @@ -678,14 +678,8 @@ bool vfio_device_get_name(VFIODevice *vbasedev, Error **errp) void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp) { - ERRP_GUARD(); - int fd = monitor_fd_param(monitor_cur(), str, errp); - - if (fd < 0) { - error_prepend(errp, "Could not parse remote object fd %s:", str); - return; - } - vbasedev->fd = fd; + vbasedev->fd = cpr_get_fd_param(vbasedev->dev->id, str, 0, + &vbasedev->reused, errp); } void vfio_device_init(VFIODevice *vbasedev, int type, VFIODeviceOps *ops, diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c index ae78e00..abd17b6 100644 --- a/hw/vfio/iommufd.c +++ b/hw/vfio/iommufd.c @@ -25,6 +25,7 @@ #include "qemu/cutils.h" #include "qemu/chardev_open.h" #include "migration/blocker.h" +#include "migration/cpr.h" #include "pci.h" #include "exec/ram_addr.h" @@ -499,13 +500,18 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev, VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD)); if (vbasedev->fd < 0) { - devfd = iommufd_cdev_getfd(vbasedev->sysfsdev, errp); + devfd = cpr_find_fd(vbasedev->name, 0); + vbasedev->reused = (devfd >= 0); + if (!vbasedev->reused) { + devfd = iommufd_cdev_getfd(vbasedev->sysfsdev, errp); + } if (devfd < 0) { return false; } vbasedev->fd = devfd; } else { devfd = vbasedev->fd; + /* reused was set in iommufd_backend_set_fd */ } if (!iommufd_cdev_connect_and_bind(vbasedev, errp)) { @@ -620,6 +626,7 @@ found_container: QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next); QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next); vfio_iommufd_cpr_register_device(vbasedev); + cpr_resave_fd(vbasedev->name, 0, devfd); trace_iommufd_cdev_device_info(vbasedev->name, devfd, vbasedev->num_irqs, vbasedev->num_regions, vbasedev->flags); @@ -661,6 +668,7 @@ static void iommufd_cdev_detach(VFIODevice *vbasedev) migrate_del_blocker(&vbasedev->cpr_id_blocker); vfio_iommufd_cpr_unregister_device(vbasedev); + cpr_delete_fd(vbasedev->name, 0); iommufd_cdev_unbind_and_disconnect(vbasedev); close(vbasedev->fd); } diff --git a/include/system/iommufd.h b/include/system/iommufd.h index 4e9c037..6618248 100644 --- a/include/system/iommufd.h +++ b/include/system/iommufd.h @@ -32,6 +32,7 @@ struct IOMMUFDBackend { /*< protected >*/ int fd; /* /dev/iommu file descriptor */ bool owned; /* is the /dev/iommu opened internally */ + bool reused; /* fd is reused after CPR */ uint32_t users; /*< public >*/ From patchwork Wed Jan 29 14:43:21 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953815 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DC1B3C0218D for ; Wed, 29 Jan 2025 14:46:06 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9Ii-0000g5-1v; Wed, 29 Jan 2025 09:44:20 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9Id-0000fb-0I for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:15 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9Ia-0001Os-UB for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:14 -0500 Received: from pps.filterd (m0246632.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXqKX031317; Wed, 29 Jan 2025 14:44:00 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=pho1q/EKZZS+oZE4Tpv1oDI4pvH5imz3U2LHYeXtWnM=; b= EeE+5RMMIEsbcKioSNIiJvXX5RcYF2F4NQs1/lW5Ds/Fzcfha4JQWIHsxcYEYR8I mm7e5/h7e+NHiA70P2ghv85PafonIvM8JJVxoaeqHmuwJCFbivz8uBg188FJrhlG +VVLfA20lpY6g6kSFS5AvGvuvgyy2ImEXZG7gpr7SfcrstO7tFRo3yiswlbgYw6p ViTzfKdoUY45rA/HEvEw7gmQX82pIDr2IFP3eQtYocK95gjioTBTRJdXVbopk4cc JOXxOM5BHuU/OTtqPiZxkUUweGz0DvjqUtWdFPonbVjDgPmGzGUERDVPYyI9K5qT nruHX+csg5Q53GqoWon2SA== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fp1a81e4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:44:00 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TDddn2034299; Wed, 29 Jan 2025 14:43:59 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4vh-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:59 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf94003307; Wed, 29 Jan 2025 14:43:58 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-26; Wed, 29 Jan 2025 14:43:58 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 25/26] vfio/iommufd: reconstruct device Date: Wed, 29 Jan 2025 06:43:21 -0800 Message-Id: <1738161802-172631-26-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-GUID: hQL5yp3j1NfW_7Jzo--HeGbMibork-wp X-Proofpoint-ORIG-GUID: hQL5yp3j1NfW_7Jzo--HeGbMibork-wp Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Reconstruct userland device state after CPR. During vfio_realize, skip all ioctls that configure the device, as it was already configured in old QEMU. Save the ioas_id in vmstate, and skip its allocation in vfio_realize. Because we skip ioctl's, it is not needed at realize time. However, we do need the range info, so defer the call to iommufd_cdev_get_info_iova_range to a post_load handler, at which time the ioas_id is known. Save the devid in vmstate. It is used in one place during realize, to fetch hw_caps device info, at vfio_device_hiod_realize -> hiod_iommufd_vfio_realize -> iommufd_backend_get_device_info. The hw_caps is not needed until post load time (see the next paragraph), so defer the call to vfio_device_hiod_realize to post load, at which time the devid is known. Save the hwpt_id in vmstate. In realize, skip its allocation from iommufd_cdev_attach -> iommufd_cdev_attach_container -> iommufd_cdev_autodomains_get. Rebuild userland structures to hold hwpt_id by calling iommufd_cdev_rebuild_hwpt at post load time. This depends on hw_caps as described above. Lastly, change the owning process of the iommufd device in post load. Signed-off-by: Steve Sistare --- hw/vfio/cpr-iommufd.c | 49 +++++++++++++++++++++++++++++++++++++++++++ hw/vfio/iommufd.c | 46 +++++++++++++++++++++++++++++++++++++--- hw/vfio/trace-events | 1 + include/hw/vfio/vfio-common.h | 3 +++ 4 files changed, 96 insertions(+), 3 deletions(-) diff --git a/hw/vfio/cpr-iommufd.c b/hw/vfio/cpr-iommufd.c index 053ff8c..711e5cf 100644 --- a/hw/vfio/cpr-iommufd.c +++ b/hw/vfio/cpr-iommufd.c @@ -14,6 +14,9 @@ #include "migration/vmstate.h" #include "system/iommufd.h" +#define IOMMUFD_CONTAINER(base) \ + container_of(base, VFIOIOMMUFDContainer, bcontainer) + static bool vfio_cpr_supported(VFIOIOMMUFDContainer *container, Error **errp) { if (!iommufd_change_process_capable(container->be)) { @@ -29,6 +32,13 @@ static int vfio_container_post_load(void *opaque, int version_id) VFIOIOMMUFDContainer *container = opaque; VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; + Error *err = NULL; + uint32_t ioas_id = container->ioas_id; + + if (!iommufd_cdev_get_info_iova_range(container, ioas_id, &err)) { + error_report_err(err); + return -1; + } QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { vbasedev->reused = false; @@ -38,21 +48,41 @@ static int vfio_container_post_load(void *opaque, int version_id) return 0; } +static int vfio_container_pre_save(void *opaque) +{ + VFIOIOMMUFDContainer *container = opaque; + + /* + * The process has not changed yet, but proactively call the ioctl, + * and it will fail if any DMA mappings are not supported. + */ + return iommufd_change_process(container->be); +} + static const VMStateDescription vfio_container_vmstate = { .name = "vfio-iommufd-container", .version_id = 0, .minimum_version_id = 0, + .pre_save = vfio_container_pre_save, .post_load = vfio_container_post_load, .needed = cpr_needed_for_reuse, .fields = (VMStateField[]) { + VMSTATE_UINT32(ioas_id, VFIOIOMMUFDContainer), VMSTATE_END_OF_LIST() } }; +static int iommufd_cpr_post_load(void *opaque, int version_id) +{ + IOMMUFDBackend *be = opaque; + return iommufd_change_process(be); +} + static const VMStateDescription iommufd_cpr_vmstate = { .name = "iommufd", .version_id = 0, .minimum_version_id = 0, + .post_load = iommufd_cpr_post_load, .needed = cpr_needed_for_reuse, .fields = (VMStateField[]) { VMSTATE_END_OF_LIST() @@ -90,12 +120,31 @@ void vfio_iommufd_cpr_unregister_container(VFIOIOMMUFDContainer *container) vfio_cpr_unregister_container(bcontainer); } +static int vfio_device_post_load(void *opaque, int version_id) +{ + VFIODevice *vbasedev = opaque; + VFIOIOMMUFDContainer *container = IOMMUFD_CONTAINER(vbasedev->bcontainer); + Error *err = NULL; + + if (!vfio_device_hiod_realize(vbasedev, &err)) { + error_report_err(err); + return false; + } + if (!vbasedev->mdev) { + iommufd_cdev_rebuild_hwpt(vbasedev, container); + } + return true; +} + static const VMStateDescription vfio_device_vmstate = { .name = "vfio-iommufd-device", .version_id = 0, .minimum_version_id = 0, + .post_load = vfio_device_post_load, .needed = cpr_needed_for_reuse, .fields = (VMStateField[]) { + VMSTATE_INT32(devid, VFIODevice), + VMSTATE_UINT32(hwpt_id, VFIODevice), VMSTATE_END_OF_LIST() } }; diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c index abd17b6..a007b6c 100644 --- a/hw/vfio/iommufd.c +++ b/hw/vfio/iommufd.c @@ -99,6 +99,11 @@ static bool iommufd_cdev_connect_and_bind(VFIODevice *vbasedev, Error **errp) goto err_kvm_device_add; } + if (vbasedev->reused) { + /* Already bound, and devid was set in iommufd_cdev_attach */ + goto skip_bind; + } + /* Bind device to iommufd */ bind.iommufd = iommufd->fd; if (ioctl(vbasedev->fd, VFIO_DEVICE_BIND_IOMMUFD, &bind)) { @@ -110,6 +115,8 @@ static bool iommufd_cdev_connect_and_bind(VFIODevice *vbasedev, Error **errp) vbasedev->devid = bind.out_devid; trace_iommufd_cdev_connect_and_bind(bind.iommufd, vbasedev->name, vbasedev->fd, vbasedev->devid); + +skip_bind: return true; err_bind: iommufd_cdev_kvm_device_del(vbasedev); @@ -292,6 +299,7 @@ static bool iommufd_cdev_detach_ioas_hwpt(VFIODevice *vbasedev, Error **errp) static void iommufd_cdev_set_hwpt(VFIODevice *vbasedev, VFIOIOASHwpt *hwpt) { vbasedev->hwpt = hwpt; + vbasedev->hwpt_id = hwpt->hwpt_id; vbasedev->iommu_dirty_tracking = iommufd_hwpt_dirty_tracking(hwpt); QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, hwpt_next); } @@ -324,6 +332,24 @@ static VFIOIOASHwpt *iommufd_cdev_make_hwpt(VFIODevice *vbasedev, return hwpt; } +void iommufd_cdev_rebuild_hwpt(VFIODevice *vbasedev, + VFIOIOMMUFDContainer *container) +{ + VFIOIOASHwpt *hwpt; + int hwpt_id = vbasedev->hwpt_id; + + trace_iommufd_cdev_rebuild_hwpt(container->be->fd, hwpt_id); + + QLIST_FOREACH(hwpt, &container->hwpt_list, next) { + if (hwpt->hwpt_id == hwpt_id) { + iommufd_cdev_set_hwpt(vbasedev, hwpt); + return; + } + } + hwpt = iommufd_cdev_make_hwpt(vbasedev, container, hwpt_id); + iommufd_cdev_set_hwpt(vbasedev, hwpt); +} + static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev, VFIOIOMMUFDContainer *container, Error **errp) @@ -527,7 +553,8 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev, * FD to be connected and having a devid to be able to successfully call * iommufd_backend_get_device_info(). */ - if (!vfio_device_hiod_realize(vbasedev, errp)) { + if (!vbasedev->reused && + !vfio_device_hiod_realize(vbasedev, errp)) { goto err_alloc_ioas; } @@ -538,7 +565,8 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev, vbasedev->iommufd != container->be) { continue; } - if (!iommufd_cdev_attach_container(vbasedev, container, &err)) { + if (!vbasedev->reused && + !iommufd_cdev_attach_container(vbasedev, container, &err)) { const char *msg = error_get_pretty(err); trace_iommufd_cdev_fail_attach_existing_container(msg); @@ -555,6 +583,11 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev, } } + if (vbasedev->reused) { + ioas_id = -1; /* ioas_id will be received from vmstate */ + goto skip_ioas_alloc; + } + /* Need to allocate a new dedicated container */ if (!iommufd_backend_alloc_ioas(vbasedev->iommufd, &ioas_id, errp)) { goto err_alloc_ioas; @@ -562,6 +595,7 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev, trace_iommufd_cdev_alloc_ioas(vbasedev->iommufd->fd, ioas_id); +skip_ioas_alloc: container = VFIO_IOMMU_IOMMUFD(object_new(TYPE_VFIO_IOMMU_IOMMUFD)); container->be = vbasedev->iommufd; container->ioas_id = ioas_id; @@ -570,7 +604,8 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev, bcontainer = &container->bcontainer; vfio_address_space_insert(space, bcontainer); - if (!iommufd_cdev_attach_container(vbasedev, container, errp)) { + if (!vbasedev->reused && + !iommufd_cdev_attach_container(vbasedev, container, errp)) { goto err_attach_container; } @@ -579,6 +614,10 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev, goto err_discard_disable; } + if (vbasedev->reused) { + goto skip_info; + } + if (!iommufd_cdev_get_info_iova_range(container, ioas_id, &err)) { error_append_hint(&err, "Fallback to default 64bit IOVA range and 4K page size\n"); @@ -587,6 +626,7 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev, bcontainer->pgsizes = qemu_real_host_page_size(); } +skip_info: bcontainer->listener = vfio_memory_listener; memory_listener_register(&bcontainer->listener, bcontainer->space->as); diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index cab1cf1..25ff04c 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -176,6 +176,7 @@ iommufd_cdev_connect_and_bind(int iommufd, const char *name, int devfd, int devi iommufd_cdev_getfd(const char *dev, int devfd) " %s (fd=%d)" iommufd_cdev_attach_ioas_hwpt(int iommufd, const char *name, int devfd, int id) " [iommufd=%d] Successfully attached device %s (%d) to id=%d" iommufd_cdev_detach_ioas_hwpt(int iommufd, const char *name) " [iommufd=%d] Successfully detached %s" +iommufd_cdev_rebuild_hwpt(int iommufd, int hwpt_id) " [iommufd=%d] hwpt %d" iommufd_cdev_fail_attach_existing_container(const char *msg) " %s" iommufd_cdev_alloc_ioas(int iommufd, int ioas_id) " [iommufd=%d] new IOMMUFD container with ioasid=%d" iommufd_cdev_device_info(char *name, int devfd, int num_irqs, int num_regions, int flags) " %s (%d) num_irqs=%d num_regions=%d flags=%d" diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index add44d4..a359ea9 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -157,6 +157,7 @@ typedef struct VFIODevice { HostIOMMUDevice *hiod; int devid; IOMMUFDBackend *iommufd; + uint32_t hwpt_id; VFIOIOASHwpt *hwpt; QLIST_ENTRY(VFIODevice) hwpt_next; } VFIODevice; @@ -270,6 +271,8 @@ bool vfio_cpr_register_container(VFIOContainerBase *bcontainer, Error **errp); void vfio_cpr_unregister_container(VFIOContainerBase *bcontainer); bool vfio_legacy_cpr_register_container(VFIOContainer *container, Error **errp); void vfio_legacy_cpr_unregister_container(VFIOContainer *container); +void iommufd_cdev_rebuild_hwpt(VFIODevice *vbasedev, + VFIOIOMMUFDContainer *container); bool iommufd_cdev_get_info_iova_range(VFIOIOMMUFDContainer *container, uint32_t ioas_id, Error **errp); bool vfio_iommufd_cpr_register_container(VFIOIOMMUFDContainer *container, From patchwork Wed Jan 29 14:43:22 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13953823 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 011F4C02193 for ; Wed, 29 Jan 2025 14:48:04 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1td9Il-0000gc-Rg; Wed, 29 Jan 2025 09:44:23 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9Id-0000fc-32 for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:15 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1td9Ib-0001Oy-G4 for qemu-devel@nongnu.org; Wed, 29 Jan 2025 09:44:14 -0500 Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEXwBu030043; Wed, 29 Jan 2025 14:44:01 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2023-11-20; bh=fFGPfNEOWDKfuNCS4QbEJKiyfGjv8sj2hdhxlp4ZQLU=; b= MKbg9OOTj/OEJ6rAx6fhpMpxfVaUOVa29rMalpueTcLF/FKROVVfqhgYABa3wPvs S7Eiijn+qYQjXhjSwrEdOdRCfhGIVcakz9fdco9UxGERCLXMU7u1js5pbF0bTA3G TW6T10N5EeK3M+p/mgRhn04M+GNOibaN8lUPuzvjIChZ5VNZgylocFzF7zbdROtn Tm/+HvQhV8Fvp3izF/vw500knMP5zhkRCOdUiglTm3obs292IED0RBG0BqlQ020S KTfupUZh3PgP16+XRfbO6qrqM1Isjfz0VZ93jT3yLr5urDsBYBYxtfN8CW0y56zi ohh7c08wrVhFImx4Cy5wMg== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 44fn2ug60t-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:44:00 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 50TEQoH8034253; Wed, 29 Jan 2025 14:43:59 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 44cpd9s4vt-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 29 Jan 2025 14:43:59 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 50TEhf96003307; Wed, 29 Jan 2025 14:43:59 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 44cpd9s49q-27; Wed, 29 Jan 2025 14:43:59 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Alex Williamson , Cedric Le Goater , Yi Liu , Eric Auger , Zhenzhong Duan , "Michael S. Tsirkin" , Marcel Apfelbaum , Peter Xu , Fabiano Rosas , Steve Sistare Subject: [PATCH V1 26/26] iommufd: preserve DMA mappings Date: Wed, 29 Jan 2025 06:43:22 -0800 Message-Id: <1738161802-172631-27-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> References: <1738161802-172631-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-29_02,2025-01-29_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 spamscore=0 mlxscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2411120000 definitions=main-2501290119 X-Proofpoint-ORIG-GUID: fhiwTlGeI-kyrM7ve5A7D3nASY62RJV6 X-Proofpoint-GUID: fhiwTlGeI-kyrM7ve5A7D3nASY62RJV6 Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -32 X-Spam_score: -3.3 X-Spam_bar: --- X-Spam_report: (-3.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.498, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org During cpr-transfer load in new QEMU, the vfio_memory_listener causes spurious calls to map and unmap DMA regions, as devices are created and the address space is built. This memory was already already mapped by the device in old QEMU, so suppress the map and unmap callbacks during CPR -- eg, if the reused flag is set. Clear the reused flag in the post_load handler. Signed-off-by: Steve Sistare --- backends/iommufd.c | 8 ++++++++ hw/vfio/cpr-iommufd.c | 1 + 2 files changed, 9 insertions(+) diff --git a/backends/iommufd.c b/backends/iommufd.c index e9452e4..cc61432 100644 --- a/backends/iommufd.c +++ b/backends/iommufd.c @@ -226,6 +226,10 @@ int iommufd_backend_map_file_dma(IOMMUFDBackend *be, uint32_t ioas_id, .length = size, }; + if (be->reused) { + return 0; + } + if (!readonly) { map.flags |= IOMMU_IOAS_MAP_WRITEABLE; } @@ -257,6 +261,10 @@ int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id, .length = size, }; + if (be->reused) { + return 0; + } + ret = ioctl(fd, IOMMU_IOAS_UNMAP, &unmap); /* * IOMMUFD takes mapping as some kind of object, unmapping diff --git a/hw/vfio/cpr-iommufd.c b/hw/vfio/cpr-iommufd.c index 711e5cf..5b93b24 100644 --- a/hw/vfio/cpr-iommufd.c +++ b/hw/vfio/cpr-iommufd.c @@ -63,6 +63,7 @@ static const VMStateDescription vfio_container_vmstate = { .name = "vfio-iommufd-container", .version_id = 0, .minimum_version_id = 0, + .priority = MIG_PRI_LOW, /* Must happen after devices and groups */ .pre_save = vfio_container_pre_save, .post_load = vfio_container_post_load, .needed = cpr_needed_for_reuse,