From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v3 01/19] vfio-user: introduce vfio-user protocol specification
Date: Mon, 8 Nov 2021 16:46:29 -0800
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12609263

From: Thanos Makatos

This patch introduces the vfio-user protocol specification (formerly
known as VFIO-over-socket), which is designed to allow devices to be
emulated outside QEMU, in a separate process. vfio-user reuses the
existing VFIO defines, structs and concepts.

This patch is sourced from:
https://patchwork.kernel.org/project/qemu-devel/patch/20210614104608.212276-1-thanos.makatos@nutanix.com/

It has been earlier discussed as an RFC in:
"RFC: use VFIO over a UNIX domain socket to implement device offloading"

Signed-off-by: John G Johnson
Signed-off-by: Thanos Makatos
Signed-off-by: John Levon
---
 docs/devel/index.rst     |    1 +
 docs/devel/vfio-user.rst | 1810 ++++++++++++++++++++++++++++++++++++++++++++++
 MAINTAINERS              |    6 +
 3 files changed, 1817 insertions(+)
 create mode 100644 docs/devel/vfio-user.rst

diff --git a/docs/devel/index.rst b/docs/devel/index.rst
index 5522db7..304ca1c 100644
--- a/docs/devel/index.rst
+++ b/docs/devel/index.rst
@@ -44,3 +44,4 @@ modifying QEMU's source code.
    vfio-migration
    qapi-code-gen
    writing-qmp-commands
+   vfio-user
diff --git a/docs/devel/vfio-user.rst b/docs/devel/vfio-user.rst
new file mode 100644
index 0000000..97a7506
--- /dev/null
+++ b/docs/devel/vfio-user.rst
@@ -0,0 +1,1810 @@
+.. include::
+********************************
+vfio-user Protocol Specification
+********************************
+
+--------------
+Version_ 0.9.1
+--------------
+
+.. contents:: Table of Contents
+
+Introduction
+============
+vfio-user is a protocol that allows a device to be emulated in a separate
+process outside of a Virtual Machine Monitor (VMM). vfio-user devices consist
+of a generic VFIO device type, living inside the VMM, which we call the client,
+and the core device implementation, living outside the VMM, which we call the
+server.
+
+The vfio-user specification is partly based on the
+`Linux VFIO ioctl interface `_.
+
+VFIO is a mature and stable API, backed by an extensively used framework. The
+existing VFIO client implementation in QEMU (``qemu/hw/vfio/``) can be largely
+re-used, though there is nothing in this specification that requires that
+particular implementation. None of the VFIO kernel modules are required for
+supporting the protocol, on either the client or server side. Some source
+definitions in VFIO are re-used for vfio-user.
+
+The main idea is to allow a virtual device to function in a separate process in
+the same host over a UNIX domain socket. A UNIX domain socket (``AF_UNIX``) is
+chosen because file descriptors can be trivially sent over it, which in turn
+allows:
+
+* Sharing of client memory for DMA with the server.
+* Sharing of server memory with the client for fast MMIO.
+* Efficient sharing of eventfds for triggering interrupts.
+
+Other socket types could be used which allow the server to run in a separate
+guest in the same host (``AF_VSOCK``) or remotely (``AF_INET``). Theoretically
+the underlying transport does not necessarily have to be a socket; however, we
+do not examine such alternatives. In this protocol version we focus on using a
+UNIX domain socket and introduce basic support for the other two types of
+sockets without considering performance implications.
+
+While passing of file descriptors is desirable for performance reasons, support
+is not necessary for either the client or the server in order to implement the
+protocol. There is always an in-band, message-passing fallback mechanism.
+
+Overview
+========
+
+VFIO is a framework that allows a physical device to be securely passed through
+to a user space process; the device-specific kernel driver does not drive the
+device at all. Typically, the user space process is a VMM and the device is
+passed through to it in order to achieve high performance. VFIO provides an API
+and the required functionality in the kernel. QEMU has adopted VFIO to allow a
+guest to directly access physical devices, instead of emulating them in
+software.
+
+vfio-user reuses the core VFIO concepts defined in its API, but implements them
+as messages to be sent over a socket. It does not change the kernel-based VFIO
+in any way; in fact, none of the VFIO kernel modules need to be loaded to use
+vfio-user. It is also possible for the client to concurrently use the current
+kernel-based VFIO for one device, and vfio-user for another device.
+
+VFIO Device Model
+-----------------
+
+A device under VFIO presents a standard interface to the user process. Many of
+the VFIO operations in the existing interface use the ``ioctl()`` system call,
+and references to the existing interface are called the ``ioctl()``
+implementation in this document.
+
+The following sections describe the set of messages that implement the vfio-user
+interface over a socket. In many cases, the messages are analogous to data
+structures used in the ``ioctl()`` implementation. Messages derived from the
+``ioctl()`` will have a name derived from the ``ioctl()`` command name. E.g., the
+``VFIO_DEVICE_GET_INFO`` ``ioctl()`` command becomes a
+``VFIO_USER_DEVICE_GET_INFO`` message. The purpose of this reuse is to share as
+much code as feasible with the ``ioctl()`` implementation.
+
+Connection Initiation
+^^^^^^^^^^^^^^^^^^^^^
+
+After the client connects to the server, the initial client message is
+``VFIO_USER_VERSION`` to propose a protocol version and set of capabilities to
+apply to the session. The server replies with a compatible version and set of
+capabilities it supports, or closes the connection if it cannot support the
+advertised version.
+
+Device Information
+^^^^^^^^^^^^^^^^^^
+
+The client uses a ``VFIO_USER_DEVICE_GET_INFO`` message to query the server for
+information about the device. This information includes:
+
+* the device type and whether it supports reset (``VFIO_DEVICE_FLAGS_``),
+* the number of device regions, and
+* the number of interrupt types the device supports.
+
+Region Information
+^^^^^^^^^^^^^^^^^^
+
+The client uses ``VFIO_USER_DEVICE_GET_REGION_INFO`` messages to query the
+server for information about the device's regions. This information describes:
+
+* Read and write permissions, whether it can be memory mapped, and whether it
+  supports additional capabilities (``VFIO_REGION_INFO_CAP_``).
+* Region index, size, and offset.
+
+When a device region can be mapped by the client, the server provides a file
+descriptor which the client can ``mmap()``. The server is responsible for
+polling for client updates to memory mapped regions.
+
+Region Capabilities
+"""""""""""""""""""
+
+Some regions have additional capabilities that cannot be described adequately
+by the region info data structure. These capabilities are returned in the
+region info reply in a list similar to PCI capabilities in a PCI device's
+configuration space.
+
+Sparse Regions
+""""""""""""""
+A region can be memory-mappable in whole or in part. When only a subset of a
+region can be mapped by the client, a ``VFIO_REGION_INFO_CAP_SPARSE_MMAP``
+capability is included in the region info reply. This capability describes
+which portions can be mapped by the client.
+
+.. Note::
+   For example, in a virtual NVMe controller, sparse regions can be used so
+   that accesses to the NVMe registers (found in the beginning of BAR0) are
+   trapped (an infrequent event), while allowing direct access to the doorbells
+   (an extremely frequent event as every I/O submission requires a write to
+   BAR0), found in the next page after the NVMe registers in BAR0.
+
+Device-Specific Regions
+"""""""""""""""""""""""
+
+A device can define regions additional to the standard ones (e.g. PCI indexes
+0-8). This is achieved by including a ``VFIO_REGION_INFO_CAP_TYPE`` capability
+in the region info reply of a device-specific region. Such regions are reflected
+in ``struct vfio_user_device_info.num_regions``. Thus, for PCI devices this
+value can be equal to, or higher than, ``VFIO_PCI_NUM_REGIONS``.
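
A minimal C sketch of how a client might act on the sparse mmap areas
described under Sparse Regions: it checks whether an access falls entirely
inside one mappable area (and can go through the ``mmap()``) or has to be
trapped and forwarded. The names ``struct sparse_area`` and
``access_is_mappable()`` are hypothetical and not part of the specification;
the sketch only assumes that each mappable area is an (offset, size) pair
within the region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one mappable area inside a region. */
struct sparse_area {
    uint64_t offset;   /* start of the area, relative to the region */
    uint64_t size;     /* length of the area in bytes */
};

/*
 * Return 1 if [off, off + len) lies entirely inside one mappable area,
 * 0 otherwise. The comparisons are ordered so that no addition can overflow.
 */
static int access_is_mappable(const struct sparse_area *areas, size_t n,
                              uint64_t off, uint64_t len)
{
    assert(areas != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct sparse_area *a = &areas[i];

        if (off >= a->offset && len <= a->size &&
            off - a->offset <= a->size - len) {
            return 1;
        }
    }
    return 0;
}
```

With the NVMe layout from the note above (registers in the first page,
doorbells in the next), a single area starting at 0x1000 would make doorbell
writes mappable while register accesses below 0x1000 are still trapped.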
+
+Region I/O via file descriptors
+-------------------------------
+
+For unmapped regions, region I/O from the client is done via
+``VFIO_USER_REGION_READ/WRITE``. As an optimization, ioeventfds or ioregionfds
+may be configured for sub-regions of some regions. A client may request
+information on these sub-regions via ``VFIO_USER_DEVICE_GET_REGION_IO_FDS``; by
+configuring the returned file descriptors as ioeventfds or ioregionfds, the
+server can be directly notified of I/O (for example, by KVM) without taking a
+trip through the client.
+
+Interrupts
+^^^^^^^^^^
+
+The client uses ``VFIO_USER_DEVICE_GET_IRQ_INFO`` messages to query the server
+for the device's interrupt types. The interrupt types are specific to the bus
+the device is attached to, and the client is expected to know the capabilities
+of each interrupt type. The server can signal an interrupt by directly injecting
+interrupts into the guest via an event file descriptor. The client configures
+how the server signals an interrupt with ``VFIO_USER_DEVICE_SET_IRQS`` messages.
+
+Device Read and Write
+^^^^^^^^^^^^^^^^^^^^^
+
+When the guest executes load or store operations to an unmapped device region,
+the client forwards these operations to the server with
+``VFIO_USER_REGION_READ`` or ``VFIO_USER_REGION_WRITE`` messages. The server
+will reply with data from the device on read operations or an acknowledgement on
+write operations. See `Read and Write Operations`_.
+
+Client memory access
+--------------------
+
+The client uses ``VFIO_USER_DMA_MAP`` and ``VFIO_USER_DMA_UNMAP`` messages to
+inform the server of the valid DMA ranges that the server can access on behalf
+of a device (typically, VM guest memory). DMA memory may be accessed by the
+server via ``VFIO_USER_DMA_READ`` and ``VFIO_USER_DMA_WRITE`` messages over the
+socket. In this case, the "DMA" part of the naming is a misnomer.
+
+Actual direct memory access of client memory from the server is possible if the
+client provides file descriptors the server can ``mmap()``. Note that ``mmap()``
+privileges cannot be revoked by the client, therefore file descriptors should
+only be exported in environments where the client trusts the server not to
+corrupt guest memory.
+
+See `Read and Write Operations`_.
+
+Client/server interactions
+==========================
+
+Socket
+------
+
+A server can serve:
+
+1) one or more clients, and/or
+2) one or more virtual devices, belonging to one or more clients.
+
+The current protocol specification requires a dedicated socket per
+client/server connection. It is a server-side implementation detail whether a
+single server handles multiple virtual devices from the same or multiple
+clients. The location of the socket is implementation-specific. Multiplexing
+clients, devices, and servers over the same socket is not supported in this
+version of the protocol.
+
+Authentication
+--------------
+
+For ``AF_UNIX``, we rely on OS mandatory access controls on the socket files,
+therefore it is up to the management layer to set up the socket as required.
+Socket types that span guests or hosts will require a proper authentication
+mechanism. Defining that mechanism is deferred to a future version of the
+protocol.
+
+Command Concurrency
+-------------------
+
+A client may pipeline multiple commands without waiting for previous command
+replies. The server will process commands in the order they are received. A
+consequence of this is that if a client issues a command with the *No_reply*
+bit, then subsequently issues a command without *No_reply*, the older command
+will have been processed before the reply to the younger command is sent by the
+server. The client must be aware of the device's capability to process
+concurrent commands if pipelining is used. For example, pipelining allows
+multiple client threads to concurrently access device regions; the client must
+ensure these accesses obey device semantics.
+
+An example is a frame buffer device, where the device may allow concurrent
+access to different areas of video memory, but may have indeterminate behavior
+if concurrent accesses are performed to command or status registers.
+
+Note that unrelated messages sent from the server to the client can appear in
+between a client to server request/reply and vice versa.
+
+Implementers should be prepared for certain commands to exhibit potentially
+unbounded latencies. For example, ``VFIO_USER_DEVICE_RESET`` may take an
+arbitrarily long time to complete; clients should take care not to block
+unnecessarily.
+
+Socket Disconnection Behavior
+-----------------------------
+The server and the client can disconnect from each other, either intentionally
+or unexpectedly. Both the client and the server need to know how to handle such
+events.
+
+Server Disconnection
+^^^^^^^^^^^^^^^^^^^^
+A server disconnecting from the client may indicate that:
+
+1) A virtual device has been restarted, either intentionally (e.g. because of a
+   device update) or unintentionally (e.g. because of a crash).
+2) A virtual device has been shut down with no intention to be restarted.
+
+It is impossible for the client to know whether or not a failure is
+intermittent or innocuous and should be retried, therefore the client should
+reset the VFIO device when it detects the socket has been disconnected.
+Error recovery will be driven by the guest's device error handling
+behavior.
+
+Client Disconnection
+^^^^^^^^^^^^^^^^^^^^
+The client disconnecting from the server primarily means that the client
+has exited. Currently, this means that the guest is shut down, so the device is
+no longer needed and the server can automatically exit. However, there
+can be cases where a client disconnection should not result in a server exit:
+
+1) A single server serving multiple clients.
+2) A multi-process QEMU upgrading itself step by step, which is not yet
+   implemented.
+
+Therefore in order for the protocol to be forward compatible, the server should
+respond to a client disconnection as follows:
+
+ - all client memory regions are unmapped and cleaned up (including closing any
+   passed file descriptors)
+ - all IRQ file descriptors passed from the old client are closed
+ - the device state should otherwise be retained
+
+The expectation is that when a client reconnects, it will re-establish IRQ and
+client memory mappings.
+
+If anything happens to the client (such as QEMU really exiting), the control
+stack will know about it and can clean up resources accordingly.
+
+Security Considerations
+-----------------------
+
+Speaking generally, vfio-user clients should not trust servers, and vice versa.
+Standard tools and mechanisms should be used on both sides to validate input and
+protect against denial-of-service scenarios, buffer overflows, etc.
+
+Request Retry and Response Timeout
+----------------------------------
+A failed command is a command that has been successfully sent and has been
+responded to with an error code. Failure to send the command in the first place
+(e.g. because the socket is disconnected) is a different type of error examined
+earlier in the disconnect section.
+
+.. Note::
+   QEMU's VFIO retries certain operations if they fail. While this makes sense
+   for real HW, we don't know for sure whether it makes sense for virtual
+   devices.
+
+Defining a retry and timeout scheme is deferred to a future version of the
+protocol.
+
+Message sizes
+-------------
+
+Some requests have an ``argsz`` field. In a request, it defines the maximum
+expected reply payload size, which should be at least the size of the fixed
+reply payload headers defined here. The *request* payload size is defined by the
+usual ``msg_size`` field in the header, not the ``argsz`` field.
+
+In a reply, the server sets the ``argsz`` field to the size needed for a full
+payload. This may be less than the requested maximum size. It may also be
+larger than the requested maximum size: in that case, the full payload is not
+included in the reply, but the ``argsz`` field in the reply indicates the needed
+size, allowing a client to allocate a larger buffer for holding the reply before
+trying again.
+
+In addition, during negotiation (see `Version`_), the client and server may
+each specify a ``max_data_xfer_size`` value; this defines the maximum data that
+may be read or written via one of the ``VFIO_USER_DMA/REGION_READ/WRITE``
+messages; see `Read and Write Operations`_.
+
+Protocol Specification
+======================
+
+To distinguish from the base VFIO symbols, all vfio-user symbols are prefixed
+with ``vfio_user`` or ``VFIO_USER``. In this revision, all data is in the
+endianness of the host system, although this may be relaxed in future
+revisions in cases where the client and server run on different hosts
+with different endianness.
+
+Unless otherwise specified, all sizes should be presumed to be in bytes.
+
+.. _Commands:
+
+Commands
+--------
+The following table lists the vfio-user message command IDs, and whether the
+message command is sent from the client or the server.
+
+====================================== ========= =================
+Name                                   Command   Request Direction
+====================================== ========= =================
+``VFIO_USER_VERSION``                  1         client -> server
+``VFIO_USER_DMA_MAP``                  2         client -> server
+``VFIO_USER_DMA_UNMAP``                3         client -> server
+``VFIO_USER_DEVICE_GET_INFO``          4         client -> server
+``VFIO_USER_DEVICE_GET_REGION_INFO``   5         client -> server
+``VFIO_USER_DEVICE_GET_REGION_IO_FDS`` 6         client -> server
+``VFIO_USER_DEVICE_GET_IRQ_INFO``      7         client -> server
+``VFIO_USER_DEVICE_SET_IRQS``          8         client -> server
+``VFIO_USER_REGION_READ``              9         client -> server
+``VFIO_USER_REGION_WRITE``             10        client -> server
+``VFIO_USER_DMA_READ``                 11        server -> client
+``VFIO_USER_DMA_WRITE``                12        server -> client
+``VFIO_USER_DEVICE_RESET``             13        client -> server
+``VFIO_USER_DIRTY_PAGES``              14        client -> server
+====================================== ========= =================
+
+Header
+------
+
+All messages, both command messages and reply messages, are preceded by a
+16-byte header that contains basic information about the message. The header is
+followed by message-specific data described in the sections below.
+
++----------------+--------+-------------+
+| Name           | Offset | Size        |
++================+========+=============+
+| Message ID     | 0      | 2           |
++----------------+--------+-------------+
+| Command        | 2      | 2           |
++----------------+--------+-------------+
+| Message size   | 4      | 4           |
++----------------+--------+-------------+
+| Flags          | 8      | 4           |
++----------------+--------+-------------+
+|                | +-----+------------+ |
+|                | | Bit | Definition | |
+|                | +=====+============+ |
+|                | | 0-3 | Type       | |
+|                | +-----+------------+ |
+|                | | 4   | No_reply   | |
+|                | +-----+------------+ |
+|                | | 5   | Error      | |
+|                | +-----+------------+ |
++----------------+--------+-------------+
+| Error          | 12     | 4           |
++----------------+--------+-------------+
+|                | 16     | variable    |
++----------------+--------+-------------+
+
+* *Message ID* identifies the message, and is echoed in the command's reply
+  message. Message IDs belong entirely to the sender, can be re-used (even
+  concurrently) and the receiver must not make any assumptions about their
+  uniqueness.
+* *Command* specifies the command to be executed, listed in Commands_. It is
+  also set in the reply header.
+* *Message size* contains the size of the entire message, including the header.
+* *Flags* contains attributes of the message:
+
+  * The *Type* bits indicate the message type.
+
+    * *Command* (value 0x0) indicates a command message.
+    * *Reply* (value 0x1) indicates a reply message acknowledging a previous
+      command with the same message ID.
+  * *No_reply* in a command message indicates that no reply is needed for this
+    command. This is commonly used when multiple commands are sent, and only
+    the last needs acknowledgement.
+  * *Error* in a reply message indicates the command being acknowledged had
+    an error. In this case, the *Error* field will be valid.
+
+* *Error* in a reply message is an optional UNIX errno value. It may be zero
+  even if the Error bit is set in Flags. It is reserved in a command message.
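
The header layout above can be sketched in C as follows. This is only an
illustration under stated assumptions: the struct name, field names, macro
names and ``make_reply_header()`` are hypothetical, while the field order,
sizes and flag bits are taken from the table. The helper builds the reply
header a server would send back for a given command. As a simplification it
sets the *Error* bit only for a non-zero errno, although the text allows the
*Error* field to be zero with the bit set.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Flag bits from the header table: bits 0-3 type, bit 4 No_reply, bit 5 Error. */
#define VFIO_USER_TYPE_MASK   0xFu
#define VFIO_USER_TYPE_CMD    0x0u
#define VFIO_USER_TYPE_REPLY  0x1u
#define VFIO_USER_NO_REPLY    (1u << 4)
#define VFIO_USER_ERROR       (1u << 5)

/* Hypothetical struct name; offsets 0, 2, 4, 8, 12 give 16 bytes in total. */
struct vfio_user_header {
    uint16_t msg_id;     /* echoed in the reply */
    uint16_t command;    /* echoed in the reply */
    uint32_t msg_size;   /* whole message, header included */
    uint32_t flags;
    uint32_t error;      /* errno value, meaningful in replies only */
};

/*
 * Fill *reply for the command *cmd. Returns 1 if a reply should be sent,
 * 0 if the command carried No_reply. On error only the header is sent.
 */
static int make_reply_header(const struct vfio_user_header *cmd,
                             uint32_t payload_len, uint32_t err,
                             struct vfio_user_header *reply)
{
    assert(cmd != NULL && reply != NULL);

    if (cmd->flags & VFIO_USER_NO_REPLY) {
        return 0;
    }
    memset(reply, 0, sizeof(*reply));
    reply->msg_id = cmd->msg_id;
    reply->command = cmd->command;
    reply->flags = VFIO_USER_TYPE_REPLY;
    if (err != 0) {
        reply->flags |= VFIO_USER_ERROR;
        reply->error = err;
        payload_len = 0;   /* error replies carry no payload */
    }
    reply->msg_size = (uint32_t)sizeof(*reply) + payload_len;
    return 1;
}
```

Since this revision uses host endianness, the sketch copies fields without
byte swapping; an implementation spanning hosts would need to revisit that.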
+
+Each command message in Commands_ must be replied to with a reply message,
+unless the message sets the *No_reply* bit. The reply consists of the header
+with the *Reply* bit set, plus any additional data.
+
+If an error occurs, the reply message must only include the reply header.
+
+As the header is standard in both requests and replies, it is not included in
+the command-specific specifications below; each message definition should be
+appended to the standard header, and the offsets are given from the end of the
+standard header.
+
+``VFIO_USER_VERSION``
+---------------------
+
+.. _Version:
+
+This is the initial message sent by the client after the socket connection is
+established; the same format is used for the server's reply.
+
+Upon establishing a connection, the client must send a ``VFIO_USER_VERSION``
+message proposing a protocol version and a set of capabilities. The server
+compares these with the versions and capabilities it supports and sends a
+``VFIO_USER_VERSION`` reply according to the following rules.
+
+* The major version in the reply must be the same as proposed. If the server
+  does not support the proposed major version, it closes the connection.
+* The minor version in the reply must be equal to or less than the minor
+  version proposed.
+* The capability list must be a subset of those proposed. If the server
+  requires a capability the client did not include, it closes the connection.
+
+The protocol major version will only change when incompatible protocol changes
+are made, such as changing the message format. The minor version may change
+when compatible changes are made, such as adding new messages or capabilities.
+Both the client and server must support all minor versions less than the
+maximum minor version they support. E.g., an implementation that supports
+version 1.3 must also support 1.0 through 1.2.
+ +When making a change to this specification, the protocol version number must +be included in the form "added in version X.Y" + +Request +^^^^^^^ + +============== ====== ==== +Name Offset Size +============== ====== ==== +version major 0 2 +version minor 2 2 +version data 4 variable (including terminating NUL). Optional. +============== ====== ==== + +The version data is an optional UTF-8 encoded JSON byte array with the following +format: + ++--------------+--------+-----------------------------------+ +| Name | Type | Description | ++==============+========+===================================+ +| capabilities | object | Contains common capabilities that | +| | | the sender supports. Optional. | ++--------------+--------+-----------------------------------+ + +Capabilities: + ++--------------------+--------+------------------------------------------------+ +| Name | Type | Description | ++====================+========+================================================+ +| max_msg_fds | number | Maximum number of file descriptors that can be | +| | | received by the sender in one message. | +| | | Optional. If not specified then the receiver | +| | | must assume a value of ``1``. | ++--------------------+--------+------------------------------------------------+ +| max_data_xfer_size | number | Maximum ``count`` for data transfer messages; | +| | | see `Read and Write Operations`_. Optional, | +| | | with a default value of 1048576 bytes. | ++--------------------+--------+------------------------------------------------+ +| migration | object | Migration capability parameters. If missing | +| | | then migration is not supported by the sender. 
| ++--------------------+--------+------------------------------------------------+ + +The migration capability contains the following name/value pairs: + ++--------+--------+-----------------------------------------------+ +| Name | Type | Description | ++========+========+===============================================+ +| pgsize | number | Page size of dirty pages bitmap. The smallest | +| | | between the client and the server is used. | ++--------+--------+-----------------------------------------------+ + +Reply +^^^^^ + +The same message format is used in the server's reply with the semantics +described above. + +``VFIO_USER_DMA_MAP`` +--------------------- + +This command message is sent by the client to the server to inform it of the +memory regions the server can access. It must be sent before the server can +perform any DMA to the client. It is normally sent directly after the version +handshake is completed, but may also occur when memory is added to the client, +or if the client uses a vIOMMU. + +Request +^^^^^^^ + +The request payload for this message is a structure of the following format: + ++-------------+--------+-------------+ +| Name | Offset | Size | ++=============+========+=============+ +| argsz | 0 | 4 | ++-------------+--------+-------------+ +| flags | 4 | 4 | ++-------------+--------+-------------+ +| | +-----+------------+ | +| | | Bit | Definition | | +| | +=====+============+ | +| | | 0 | readable | | +| | +-----+------------+ | +| | | 1 | writeable | | +| | +-----+------------+ | ++-------------+--------+-------------+ +| offset | 8 | 8 | ++-------------+--------+-------------+ +| address | 16 | 8 | ++-------------+--------+-------------+ +| size | 24 | 8 | ++-------------+--------+-------------+ + +* *argsz* is the size of the above structure. Note there is no reply payload, + so this field differs from other message types. 
+* *flags* contains the following region attributes: + + * *readable* indicates that the region can be read from. + + * *writeable* indicates that the region can be written to. + +* *offset* is the file offset of the region with respect to the associated file + descriptor, or zero if the region is not mappable. +* *address* is the base DMA address of the region. +* *size* is the size of the region. + +This structure is 32 bytes in size, so the message size is 16 + 32 = 48 bytes. + +If the DMA region being added can be directly mapped by the server, a file +descriptor must be sent as part of the message meta-data. The region can be +mapped via the mmap() system call. On ``AF_UNIX`` sockets, the file descriptor +must be passed as ``SCM_RIGHTS`` type ancillary data. Otherwise, if the DMA +region cannot be directly mapped by the server, a file descriptor must not be sent +as part of the message meta-data and the DMA region can be accessed by the +server using ``VFIO_USER_DMA_READ`` and ``VFIO_USER_DMA_WRITE`` messages, +explained in `Read and Write Operations`_. A command to map over an existing +region must be failed by the server with ``EEXIST`` set in the error field in the +reply. + +Reply +^^^^^ + +There is no payload in the reply message. + +``VFIO_USER_DMA_UNMAP`` +----------------------- + +This command message is sent by the client to the server to inform it that a +DMA region, previously made available via a ``VFIO_USER_DMA_MAP`` command +message, is no longer available for DMA. It typically occurs when memory is +removed from the client or if the client uses a vIOMMU. 
The DMA region is +described by the following structure: + +Request +^^^^^^^ + +The request payload for this message is a structure of the following format: + ++--------------+--------+------------------------+ +| Name | Offset | Size | ++==============+========+========================+ +| argsz | 0 | 4 | ++--------------+--------+------------------------+ +| flags | 4 | 4 | ++--------------+--------+------------------------+ +| | +-----+-----------------------+ | +| | | Bit | Definition | | +| | +=====+=======================+ | +| | | 0 | get dirty page bitmap | | +| | +-----+-----------------------+ | ++--------------+--------+------------------------+ +| address | 8 | 8 | ++--------------+--------+------------------------+ +| size | 16 | 8 | ++--------------+--------+------------------------+ + +* *argsz* is the maximum size of the reply payload. +* *flags* contains the following DMA region attributes: + + * *get dirty page bitmap* indicates that a dirty page bitmap must be + populated before unmapping the DMA region. The client must provide a + `VFIO Bitmap`_ structure, explained below, immediately following this + entry. + +* *address* is the base DMA address of the DMA region. +* *size* is the size of the DMA region. + +The address and size of the DMA region being unmapped must match exactly a +previous mapping. The size of request message depends on whether or not the +*get dirty page bitmap* bit is set in Flags: + +* If not set, the size of the total request message is: 16 + 24. + +* If set, the size of the total request message is: 16 + 24 + 16. + +.. _VFIO Bitmap: + +VFIO Bitmap Format +"""""""""""""""""" + ++--------+--------+------+ +| Name | Offset | Size | ++========+========+======+ +| pgsize | 0 | 8 | ++--------+--------+------+ +| size | 8 | 8 | ++--------+--------+------+ + +* *pgsize* is the page size for the bitmap, in bytes. +* *size* is the size for the bitmap, in bytes, excluding the VFIO bitmap header. 
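The two request sizes quoted above (16 + 24 and 16 + 24 + 16) follow directly from the field tables. The sketch below builds the request payload both ways. The function name is hypothetical, little-endian packing is an assumption, and the caller supplies ``argsz`` because it describes the maximum reply payload size.

```python
import struct

HEADER_SIZE = 16
# Assumption: little-endian fields with no padding.
UNMAP = struct.Struct("<IIQQ")   # argsz, flags, address, size (24 bytes)
BITMAP = struct.Struct("<QQ")    # pgsize, size (16 bytes)
FLAG_GET_DIRTY_BITMAP = 1 << 0   # flags bit 0


def unmap_request_payload(address, size, argsz, bitmap=None):
    """Build a VFIO_USER_DMA_UNMAP request payload.

    `bitmap` is an optional (pgsize, bitmap_size_in_bytes) pair; when it is
    given, the dirty-bitmap flag is set and the VFIO Bitmap structure is
    appended immediately after the unmap entry.
    """
    flags = FLAG_GET_DIRTY_BITMAP if bitmap is not None else 0
    payload = UNMAP.pack(argsz, flags, address, size)
    if bitmap is not None:
        pgsize, bitmap_bytes = bitmap
        payload += BITMAP.pack(pgsize, bitmap_bytes)
    return payload
```

Adding ``HEADER_SIZE`` to the payload length gives the value for the header's Message size field: 40 bytes without the bitmap, 56 bytes with it.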
+ +Reply +^^^^^ + +Upon receiving a ``VFIO_USER_DMA_UNMAP`` command, if the file descriptor is +mapped then the server must release all references to that DMA region before +replying, which potentially includes in-flight DMA transactions. + +The server responds with the original DMA entry in the request. If the +*get dirty page bitmap* bit is set in flags in the request, then +the server also includes the `VFIO Bitmap`_ structure sent in the request, +followed by the corresponding dirty page bitmap, where each bit represents +one page of size *pgsize* in `VFIO Bitmap`_ . + +The total size of the total reply message is: +16 + 24 + (16 + *size* in `VFIO Bitmap`_ if *get dirty page bitmap* is set). + +``VFIO_USER_DEVICE_GET_INFO`` +----------------------------- + +This command message is sent by the client to the server to query for basic +information about the device. + +Request +^^^^^^^ + ++-------------+--------+--------------------------+ +| Name | Offset | Size | ++=============+========+==========================+ +| argsz | 0 | 4 | ++-------------+--------+--------------------------+ +| flags | 4 | 4 | ++-------------+--------+--------------------------+ +| | +-----+-------------------------+ | +| | | Bit | Definition | | +| | +=====+=========================+ | +| | | 0 | VFIO_DEVICE_FLAGS_RESET | | +| | +-----+-------------------------+ | +| | | 1 | VFIO_DEVICE_FLAGS_PCI | | +| | +-----+-------------------------+ | ++-------------+--------+--------------------------+ +| num_regions | 8 | 4 | ++-------------+--------+--------------------------+ +| num_irqs | 12 | 4 | ++-------------+--------+--------------------------+ + +* *argsz* is the maximum size of the reply payload +* all other fields must be zero. 
+ +Reply +^^^^^ + ++-------------+--------+--------------------------+ +| Name | Offset | Size | ++=============+========+==========================+ +| argsz | 0 | 4 | ++-------------+--------+--------------------------+ +| flags | 4 | 4 | ++-------------+--------+--------------------------+ +| | +-----+-------------------------+ | +| | | Bit | Definition | | +| | +=====+=========================+ | +| | | 0 | VFIO_DEVICE_FLAGS_RESET | | +| | +-----+-------------------------+ | +| | | 1 | VFIO_DEVICE_FLAGS_PCI | | +| | +-----+-------------------------+ | ++-------------+--------+--------------------------+ +| num_regions | 8 | 4 | ++-------------+--------+--------------------------+ +| num_irqs | 12 | 4 | ++-------------+--------+--------------------------+ + +* *argsz* is the size required for the full reply payload (16 bytes today). +* *flags* contains the following device attributes. + + * ``VFIO_DEVICE_FLAGS_RESET`` indicates that the device supports the + ``VFIO_USER_DEVICE_RESET`` message. + * ``VFIO_DEVICE_FLAGS_PCI`` indicates that the device is a PCI device. + +* *num_regions* is the number of memory regions that the device exposes. +* *num_irqs* is the number of distinct interrupt types that the device supports. + +This version of the protocol only supports PCI devices. Additional devices may +be supported in future versions. + +``VFIO_USER_DEVICE_GET_REGION_INFO`` +------------------------------------ + +This command message is sent by the client to the server to query for +information about device regions. The VFIO region info structure is defined in +``<linux/vfio.h>`` (``struct vfio_region_info``). 
+ +Request +^^^^^^^ + ++------------+--------+------------------------------+ +| Name | Offset | Size | ++============+========+==============================+ +| argsz | 0 | 4 | ++------------+--------+------------------------------+ +| flags | 4 | 4 | ++------------+--------+------------------------------+ +| index | 8 | 4 | ++------------+--------+------------------------------+ +| cap_offset | 12 | 4 | ++------------+--------+------------------------------+ +| size | 16 | 8 | ++------------+--------+------------------------------+ +| offset | 24 | 8 | ++------------+--------+------------------------------+ + +* *argsz* the maximum size of the reply payload +* *index* is the index of memory region being queried, it is the only field + that is required to be set in the command message. +* all other fields must be zero. + +Reply +^^^^^ + ++------------+--------+------------------------------+ +| Name | Offset | Size | ++============+========+==============================+ +| argsz | 0 | 4 | ++------------+--------+------------------------------+ +| flags | 4 | 4 | ++------------+--------+------------------------------+ +| | +-----+-----------------------------+ | +| | | Bit | Definition | | +| | +=====+=============================+ | +| | | 0 | VFIO_REGION_INFO_FLAG_READ | | +| | +-----+-----------------------------+ | +| | | 1 | VFIO_REGION_INFO_FLAG_WRITE | | +| | +-----+-----------------------------+ | +| | | 2 | VFIO_REGION_INFO_FLAG_MMAP | | +| | +-----+-----------------------------+ | +| | | 3 | VFIO_REGION_INFO_FLAG_CAPS | | +| | +-----+-----------------------------+ | ++------------+--------+------------------------------+ ++------------+--------+------------------------------+ +| index | 8 | 4 | ++------------+--------+------------------------------+ +| cap_offset | 12 | 4 | ++------------+--------+------------------------------+ +| size | 16 | 8 | ++------------+--------+------------------------------+ +| offset | 24 | 8 | 
++------------+--------+------------------------------+ + +* *argsz* is the size required for the full reply payload (region info structure + plus the size of any region capabilities) +* *flags* are attributes of the region: + + * ``VFIO_REGION_INFO_FLAG_READ`` allows client read access to the region. + * ``VFIO_REGION_INFO_FLAG_WRITE`` allows client write access to the region. + * ``VFIO_REGION_INFO_FLAG_MMAP`` specifies the client can mmap() the region. + When this flag is set, the reply will include a file descriptor in its + meta-data. On ``AF_UNIX`` sockets, the file descriptors will be passed as + ``SCM_RIGHTS`` type ancillary data. + * ``VFIO_REGION_INFO_FLAG_CAPS`` indicates additional capabilities found in the + reply. + +* *index* is the index of the memory region being queried; it is the only field + that is required to be set in the command message. +* *cap_offset* describes where additional region capabilities can be found. + cap_offset is relative to the beginning of the VFIO region info structure. + The data structure it points to is a VFIO cap header defined in + ``<linux/vfio.h>``. +* *size* is the size of the region. +* *offset* is the offset that should be given to the mmap() system call for + regions with the MMAP attribute. It is also used as the base offset when + mapping a VFIO sparse mmap area, described below. + +VFIO region capabilities +"""""""""""""""""""""""" + +The VFIO region information can also include a capabilities list. This list is +similar to a PCI capability list - each entry has a common header that +identifies a capability and where the next capability in the list can be found. +The VFIO capability header format is defined in ``<linux/vfio.h>`` (``struct +vfio_info_cap_header``). 
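Walking the capability list amounts to following *next* offsets from *cap_offset*, each measured from the start of the region info structure. The sketch below uses the 2/2/4-byte header layout given in the "VFIO cap header format" table. Treating a *next* value of zero as the end of the list and little-endian packing are both assumptions here, and the function name is hypothetical.

```python
import struct

# Assumption: little-endian fields with no padding.
CAP_HEADER = struct.Struct("<HHI")   # id, version, next


def walk_caps(region_info, cap_offset):
    """Yield (id, version, offset) for each capability in the chain.

    `region_info` is the full reply payload, starting at the region info
    structure, so every offset indexes directly into it. A `next` of zero
    is assumed to end the list.
    """
    seen = set()
    offset = cap_offset
    while offset:
        # Reject loops and headers that run past the end of the payload.
        if offset in seen or offset + CAP_HEADER.size > len(region_info):
            raise ValueError("malformed capability chain")
        seen.add(offset)
        cap_id, version, nxt = CAP_HEADER.unpack_from(region_info, offset)
        yield cap_id, version, offset
        offset = nxt
```

The loop and bounds checks matter because the offsets come from the peer and should not be trusted blindly.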
+ +VFIO cap header format +"""""""""""""""""""""" + ++---------+--------+------+ +| Name | Offset | Size | ++=========+========+======+ +| id | 0 | 2 | ++---------+--------+------+ +| version | 2 | 2 | ++---------+--------+------+ +| next | 4 | 4 | ++---------+--------+------+ + +* *id* is the capability identity. +* *version* is a capability-specific version number. +* *next* specifies the offset of the next capability in the capability list. It + is relative to the beginning of the VFIO region info structure. + +VFIO sparse mmap cap header +""""""""""""""""""""""""""" + ++------------------+----------------------------------+ +| Name | Value | ++==================+==================================+ +| id | VFIO_REGION_INFO_CAP_SPARSE_MMAP | ++------------------+----------------------------------+ +| version | 0x1 | ++------------------+----------------------------------+ +| next | | ++------------------+----------------------------------+ +| sparse mmap info | VFIO region info sparse mmap | ++------------------+----------------------------------+ + +This capability is defined when only a subrange of the region supports +direct access by the client via mmap(). The VFIO sparse mmap area is defined in +``<linux/vfio.h>`` (``struct vfio_region_sparse_mmap_area`` and ``struct +vfio_region_info_cap_sparse_mmap``). + +VFIO region info cap sparse mmap +"""""""""""""""""""""""""""""""" + ++----------+--------+------+ +| Name | Offset | Size | ++==========+========+======+ +| nr_areas | 0 | 4 | ++----------+--------+------+ +| reserved | 4 | 4 | ++----------+--------+------+ +| offset | 8 | 8 | ++----------+--------+------+ +| size | 16 | 8 | ++----------+--------+------+ +| ... | | | ++----------+--------+------+ + +* *nr_areas* is the number of sparse mmap areas in the region. +* *offset* and *size* describe a single area that can be mapped by the client. + There will be *nr_areas* pairs of offset and size. 
The offset will be added to + the base offset given in the ``VFIO_USER_DEVICE_GET_REGION_INFO`` reply to form the + offset argument of the subsequent mmap() call. + +The VFIO sparse mmap area is defined in ``<linux/vfio.h>`` (``struct +vfio_region_info_cap_sparse_mmap``). + +VFIO region type cap header +""""""""""""""""""""""""""" + ++------------------+---------------------------+ +| Name | Value | ++==================+===========================+ +| id | VFIO_REGION_INFO_CAP_TYPE | ++------------------+---------------------------+ +| version | 0x1 | ++------------------+---------------------------+ +| next | | ++------------------+---------------------------+ +| region info type | VFIO region info type | ++------------------+---------------------------+ + +This capability is defined when a region is specific to the device. + +VFIO region info type cap +""""""""""""""""""""""""" + +The VFIO region info type is defined in ``<linux/vfio.h>`` +(``struct vfio_region_info_cap_type``). + ++---------+--------+------+ +| Name | Offset | Size | ++=========+========+======+ +| type | 0 | 4 | ++---------+--------+------+ +| subtype | 4 | 4 | ++---------+--------+------+ + +The only device-specific region type and subtype supported by vfio-user are +``VFIO_REGION_TYPE_MIGRATION`` (3) and ``VFIO_REGION_SUBTYPE_MIGRATION`` (1). + +``VFIO_USER_DEVICE_GET_REGION_IO_FDS`` +-------------------------------------- + +Clients can access regions via ``VFIO_USER_REGION_READ/WRITE`` or, if available, by +``mmap()`` of a file descriptor provided by the server. + +``VFIO_USER_DEVICE_GET_REGION_IO_FDS`` provides an alternative access mechanism via +file descriptors. This is an optional feature intended for performance +improvements where an underlying sub-system (such as KVM) supports communication +across such file descriptors to the vfio-user server, without needing to +round-trip through the client. + +The server returns an array of sub-regions for the requested region. 
Each +sub-region describes a span (offset and size) of a region, along with the +requested file descriptor notification mechanism to use. Each sub-region in the +response message may choose to use a different method, as defined below. The +two mechanisms supported in this specification are ioeventfds and ioregionfds. + +The server in addition returns a file descriptor in the ancillary data; clients +are expected to configure each sub-region's file descriptor with the requested +notification method. For example, a client could configure KVM with the +requested ioeventfd via a ``KVM_IOEVENTFD`` ``ioctl()``. + +Request +^^^^^^^ + ++-------------+--------+------+ +| Name | Offset | Size | ++=============+========+======+ +| argsz | 0 | 4 | ++-------------+--------+------+ +| flags | 4 | 4 | ++-------------+--------+------+ +| index | 8 | 4 | ++-------------+--------+------+ +| count | 12 | 4 | ++-------------+--------+------+ + +* *argsz* the maximum size of the reply payload +* *index* is the index of memory region being queried +* all other fields must be zero + +The client must set ``flags`` to zero and specify the region being queried in +the ``index``. + +Reply +^^^^^ + ++-------------+--------+------+ +| Name | Offset | Size | ++=============+========+======+ +| argsz | 0 | 4 | ++-------------+--------+------+ +| flags | 4 | 4 | ++-------------+--------+------+ +| index | 8 | 4 | ++-------------+--------+------+ +| count | 12 | 4 | ++-------------+--------+------+ +| sub-regions | 16 | ... | ++-------------+--------+------+ + +* *argsz* is the size of the region IO FD info structure plus the + total size of the sub-region array. Thus, each array entry "i" is at offset + i * ((argsz - 32) / count). Note that currently this is 40 bytes for both IO + FD types, but this is not to be relied on. As elsewhere, this indicates the + full reply payload size needed. 
+* *flags* must be zero +* *index* is the index of memory region being queried +* *count* is the number of sub-regions in the array +* *sub-regions* is the array of Sub-Region IO FD info structures + +The reply message will additionally include at least one file descriptor in the +ancillary data. Note that more than one sub-region may share the same file +descriptor. + +Note that it is the client's responsibility to verify the requested values (for +example, that the requested offset does not exceed the region's bounds). + +Each sub-region given in the response has one of two possible structures, +depending whether *type* is ``VFIO_USER_IO_FD_TYPE_IOEVENTFD`` or +``VFIO_USER_IO_FD_TYPE_IOREGIONFD``: + +Sub-Region IO FD info format (ioeventfd) +"""""""""""""""""""""""""""""""""""""""" + ++-----------+--------+------+ +| Name | Offset | Size | ++===========+========+======+ +| offset | 0 | 8 | ++-----------+--------+------+ +| size | 8 | 8 | ++-----------+--------+------+ +| fd_index | 16 | 4 | ++-----------+--------+------+ +| type | 20 | 4 | ++-----------+--------+------+ +| flags | 24 | 4 | ++-----------+--------+------+ +| padding | 28 | 4 | ++-----------+--------+------+ +| datamatch | 32 | 8 | ++-----------+--------+------+ + +* *offset* is the offset of the start of the sub-region within the region + requested ("physical address offset" for the region) +* *size* is the length of the sub-region. This may be zero if the access size is + not relevant, which may allow for optimizations +* *fd_index* is the index in the ancillary data of the FD to use for ioeventfd + notification; it may be shared. +* *type* is ``VFIO_USER_IO_FD_TYPE_IOEVENTFD`` +* *flags* is any of: + + * ``KVM_IOEVENTFD_FLAG_DATAMATCH`` + * ``KVM_IOEVENTFD_FLAG_PIO`` + * ``KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY`` (FIXME: makes sense?) 
+ +* *datamatch* is the datamatch value if needed. + +See https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt, *4.59 +KVM_IOEVENTFD* for further context on the ioeventfd-specific fields. + +Sub-Region IO FD info format (ioregionfd) +""""""""""""""""""""""""""""""""""""""""" + ++-----------+--------+------+ +| Name | Offset | Size | ++===========+========+======+ +| offset | 0 | 8 | ++-----------+--------+------+ +| size | 8 | 8 | ++-----------+--------+------+ +| fd_index | 16 | 4 | ++-----------+--------+------+ +| type | 20 | 4 | ++-----------+--------+------+ +| flags | 24 | 4 | ++-----------+--------+------+ +| padding | 28 | 4 | ++-----------+--------+------+ +| user_data | 32 | 8 | ++-----------+--------+------+ + +* *offset* is the offset of the start of the sub-region within the region + requested ("physical address offset" for the region) +* *size* is the length of the sub-region. This may be zero if the access size is + not relevant, which may allow for optimizations; ``KVM_IOREGION_POSTED_WRITES`` + must be set in *flags* in this case +* *fd_index* is the index in the ancillary data of the FD to use for ioregionfd + messages; it may be shared +* *type* is ``VFIO_USER_IO_FD_TYPE_IOREGIONFD`` +* *flags* is any of: + + * ``KVM_IOREGION_PIO`` + * ``KVM_IOREGION_POSTED_WRITES`` + +* *user_data* is an opaque value passed back to the server via a message on the + file descriptor + +For further information on the ioregionfd-specific fields, see: +https://lore.kernel.org/kvm/cover.1613828726.git.eafanasova@gmail.com/ + +(FIXME: update with final API docs.) + +``VFIO_USER_DEVICE_GET_IRQ_INFO`` +--------------------------------- + +This command message is sent by the client to the server to query for +information about device interrupt types. The VFIO IRQ info structure is +defined in ``<linux/vfio.h>`` (``struct vfio_irq_info``). 
+ +Request +^^^^^^^ + ++-------+--------+---------------------------+ +| Name | Offset | Size | ++=======+========+===========================+ +| argsz | 0 | 4 | ++-------+--------+---------------------------+ +| flags | 4 | 4 | ++-------+--------+---------------------------+ +| | +-----+--------------------------+ | +| | | Bit | Definition | | +| | +=====+==========================+ | +| | | 0 | VFIO_IRQ_INFO_EVENTFD | | +| | +-----+--------------------------+ | +| | | 1 | VFIO_IRQ_INFO_MASKABLE | | +| | +-----+--------------------------+ | +| | | 2 | VFIO_IRQ_INFO_AUTOMASKED | | +| | +-----+--------------------------+ | +| | | 3 | VFIO_IRQ_INFO_NORESIZE | | +| | +-----+--------------------------+ | ++-------+--------+---------------------------+ +| index | 8 | 4 | ++-------+--------+---------------------------+ +| count | 12 | 4 | ++-------+--------+---------------------------+ + +* *argsz* is the maximum size of the reply payload (16 bytes today) +* index is the index of IRQ type being queried (e.g. 
``VFIO_PCI_MSIX_IRQ_INDEX``) +* all other fields must be zero + +Reply +^^^^^ + ++-------+--------+---------------------------+ +| Name | Offset | Size | ++=======+========+===========================+ +| argsz | 0 | 4 | ++-------+--------+---------------------------+ +| flags | 4 | 4 | ++-------+--------+---------------------------+ +| | +-----+--------------------------+ | +| | | Bit | Definition | | +| | +=====+==========================+ | +| | | 0 | VFIO_IRQ_INFO_EVENTFD | | +| | +-----+--------------------------+ | +| | | 1 | VFIO_IRQ_INFO_MASKABLE | | +| | +-----+--------------------------+ | +| | | 2 | VFIO_IRQ_INFO_AUTOMASKED | | +| | +-----+--------------------------+ | +| | | 3 | VFIO_IRQ_INFO_NORESIZE | | +| | +-----+--------------------------+ | ++-------+--------+---------------------------+ +| index | 8 | 4 | ++-------+--------+---------------------------+ +| count | 12 | 4 | ++-------+--------+---------------------------+ + +* *argsz* is the size required for the full reply payload (16 bytes today) +* *flags* defines IRQ attributes: + + * ``VFIO_IRQ_INFO_EVENTFD`` indicates the IRQ type can support server eventfd + signalling. + * ``VFIO_IRQ_INFO_MASKABLE`` indicates that the IRQ type supports the ``MASK`` + and ``UNMASK`` actions in a ``VFIO_USER_DEVICE_SET_IRQS`` message. + * ``VFIO_IRQ_INFO_AUTOMASKED`` indicates the IRQ type masks itself after being + triggered, and the client must send an ``UNMASK`` action to receive new + interrupts. + * ``VFIO_IRQ_INFO_NORESIZE`` indicates ``VFIO_USER_DEVICE_SET_IRQS`` operations set up + interrupts as a set, and new sub-indexes cannot be enabled without disabling + the entire type. + +* *index* is the index of IRQ type being queried +* *count* describes the number of interrupts of the queried type. + +``VFIO_USER_DEVICE_SET_IRQS`` +----------------------------- + +This command message is sent by the client to the server to set actions for +device interrupt types. 
The VFIO IRQ set structure is defined in +``<linux/vfio.h>`` (``struct vfio_irq_set``). + +Request +^^^^^^^ + ++-------+--------+------------------------------+ +| Name | Offset | Size | ++=======+========+==============================+ +| argsz | 0 | 4 | ++-------+--------+------------------------------+ +| flags | 4 | 4 | ++-------+--------+------------------------------+ +| | +-----+-----------------------------+ | +| | | Bit | Definition | | +| | +=====+=============================+ | +| | | 0 | VFIO_IRQ_SET_DATA_NONE | | +| | +-----+-----------------------------+ | +| | | 1 | VFIO_IRQ_SET_DATA_BOOL | | +| | +-----+-----------------------------+ | +| | | 2 | VFIO_IRQ_SET_DATA_EVENTFD | | +| | +-----+-----------------------------+ | +| | | 3 | VFIO_IRQ_SET_ACTION_MASK | | +| | +-----+-----------------------------+ | +| | | 4 | VFIO_IRQ_SET_ACTION_UNMASK | | +| | +-----+-----------------------------+ | +| | | 5 | VFIO_IRQ_SET_ACTION_TRIGGER | | +| | +-----+-----------------------------+ | ++-------+--------+------------------------------+ +| index | 8 | 4 | ++-------+--------+------------------------------+ +| start | 12 | 4 | ++-------+--------+------------------------------+ +| count | 16 | 4 | ++-------+--------+------------------------------+ +| data | 20 | variable | ++-------+--------+------------------------------+ + +* *argsz* is the size of the VFIO IRQ set request payload, including any *data* + field. Note there is no reply payload, so this field differs from other + message types. +* *flags* defines the action performed on the interrupt range. The ``DATA`` + flags describe the data field sent in the message; the ``ACTION`` flags + describe the action to be performed. Within each of the two sets, the flags + are mutually exclusive. + + * ``VFIO_IRQ_SET_DATA_NONE`` indicates there is no data field in the command. + The action is performed unconditionally. + * ``VFIO_IRQ_SET_DATA_BOOL`` indicates the data field is an array of boolean + bytes. 
The action is performed if the corresponding boolean is true. + * ``VFIO_IRQ_SET_DATA_EVENTFD`` indicates an array of event file descriptors + was sent in the message meta-data. These descriptors will be signalled when + the action defined by the action flags occurs. In ``AF_UNIX`` sockets, the + descriptors are sent as ``SCM_RIGHTS`` type ancillary data. + If no file descriptors are provided, this de-assigns the specified + previously configured interrupts. + * ``VFIO_IRQ_SET_ACTION_MASK`` indicates a masking event. It can be used with + ``VFIO_IRQ_SET_DATA_BOOL`` or ``VFIO_IRQ_SET_DATA_NONE`` to mask an interrupt, + or with ``VFIO_IRQ_SET_DATA_EVENTFD`` to generate an event when the guest masks + the interrupt. + * ``VFIO_IRQ_SET_ACTION_UNMASK`` indicates an unmasking event. It can be used + with ``VFIO_IRQ_SET_DATA_BOOL`` or ``VFIO_IRQ_SET_DATA_NONE`` to unmask an + interrupt, or with ``VFIO_IRQ_SET_DATA_EVENTFD`` to generate an event when the + guest unmasks the interrupt. + * ``VFIO_IRQ_SET_ACTION_TRIGGER`` indicates a triggering event. It can be used + with ``VFIO_IRQ_SET_DATA_BOOL`` or ``VFIO_IRQ_SET_DATA_NONE`` to trigger an + interrupt, or with ``VFIO_IRQ_SET_DATA_EVENTFD`` to generate an event when the + server triggers the interrupt. + +* *index* is the index of IRQ type being setup. +* *start* is the start of the sub-index being set. +* *count* describes the number of sub-indexes being set. As a special case, a + count (and start) of 0, with data flags of ``VFIO_IRQ_SET_DATA_NONE`` disables + all interrupts of the index. +* *data* is an optional field included when the + ``VFIO_IRQ_SET_DATA_BOOL`` flag is present. It contains an array of booleans + that specify whether the action is to be performed on the corresponding + index. It's used when the action is only performed on a subset of the range + specified. + +Not all interrupt types support every combination of data and action flags. 
+The client must know the capabilities of the device and IRQ index before it +sends a ``VFIO_USER_DEVICE_SET_IRQS`` message. + +In typical operation, a specific IRQ may operate as follows: + +#. The client sends a ``VFIO_USER_DEVICE_SET_IRQS`` message with + ``flags=(VFIO_IRQ_SET_DATA_EVENTFD|VFIO_IRQ_SET_ACTION_TRIGGER)`` along + with an eventfd. This associates the IRQ with a particular eventfd on the + server side. + +#. The client may send a ``VFIO_USER_DEVICE_SET_IRQS`` message with + ``flags=(VFIO_IRQ_SET_DATA_EVENTFD|VFIO_IRQ_SET_ACTION_MASK/UNMASK)`` along + with another eventfd. This associates the given eventfd with the + mask/unmask state on the server side. + +#. The server may trigger the IRQ by writing 1 to the eventfd. + +#. The server may mask/unmask an IRQ which will write 1 to the corresponding + mask/unmask eventfd, if there is one. + +#. A client may trigger a device IRQ itself, by sending a + ``VFIO_USER_DEVICE_SET_IRQS`` message with + ``flags=(VFIO_IRQ_SET_DATA_NONE/BOOL|VFIO_IRQ_SET_ACTION_TRIGGER)``. + +#. A client may mask or unmask the IRQ, by sending a + ``VFIO_USER_DEVICE_SET_IRQS`` message with + ``flags=(VFIO_IRQ_SET_DATA_NONE/BOOL|VFIO_IRQ_SET_ACTION_MASK/UNMASK)``. + +Reply +^^^^^ + +There is no payload in the reply. + +.. _Read and Write Operations: + +Note that all of these operations must be supported by the client and/or server, +even if the corresponding memory or device region has been shared as mappable. + +The ``count`` field must not exceed the value of ``max_data_xfer_size`` of the +peer, for both reads and writes. + +``VFIO_USER_REGION_READ`` +------------------------- + +If a device region is not mappable, it's not directly accessible by the client +via ``mmap()`` of the underlying file descriptor. In this case, a client can +read from a device region with this message. 
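One way to respect the ``max_data_xfer_size`` limit is to break a long read into several requests. The sketch below builds the request payloads using the offset/region/count layout from the Request table that follows. The function name and little-endian packing are assumptions. Whether a given access may safely be split is device-dependent, so treat this as an illustration for bulk transfers only.

```python
import struct

# Assumption: little-endian fields with no padding.
REGION_ACCESS = struct.Struct("<QII")   # offset, region, count (16 bytes)
DEFAULT_MAX_XFER = 1048576              # default max_data_xfer_size, in bytes


def region_read_requests(region, offset, length, max_xfer=DEFAULT_MAX_XFER):
    """Yield request payloads that together cover [offset, offset + length)."""
    done = 0
    while done < length:
        # Never ask for more than the peer's advertised transfer limit.
        count = min(max_xfer, length - done)
        yield REGION_ACCESS.pack(offset + done, region, count)
        done += count
```

Each yielded payload would be appended to a standard header carrying the ``VFIO_USER_REGION_READ`` command before being sent.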
+
+Request
+^^^^^^^
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+
+* *offset* is the offset into the region being accessed.
+* *region* is the index of the region being accessed.
+* *count* is the size of the data to be transferred.
+
+Reply
+^^^^^
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+| data   | 16     | variable |
++--------+--------+----------+
+
+* *offset* is the offset into the region accessed.
+* *region* is the index of the region accessed.
+* *count* is the size of the data transferred.
+* *data* is the data that was read from the device region.
+
+``VFIO_USER_REGION_WRITE``
+--------------------------
+
+If a device region is not mappable, it's not directly accessible by the client
+via ``mmap()`` of the underlying file descriptor. In this case, a client can
+write to a device region with this message.
+
+Request
+^^^^^^^
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+| data   | 16     | variable |
++--------+--------+----------+
+
+* *offset* is the offset into the region being accessed.
+* *region* is the index of the region being accessed.
+* *count* is the size of the data to be transferred.
+* *data* is the data to write.
+
+Reply
+^^^^^
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+
+* *offset* is the offset into the region accessed.
+* *region* is the index of the region accessed.
+* *count* is the size of the data transferred.
+
+``VFIO_USER_DMA_READ``
+-----------------------
+
+If the client has not shared mappable memory, the server can use this message to
+read from guest memory.
+
+Request
+^^^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| address | 0      | 8        |
++---------+--------+----------+
+| count   | 8      | 8        |
++---------+--------+----------+
+
+* *address* is the client DMA memory address being accessed. This address must have
+  been previously exported to the server with a ``VFIO_USER_DMA_MAP`` message.
+* *count* is the size of the data to be transferred.
+
+Reply
+^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| address | 0      | 8        |
++---------+--------+----------+
+| count   | 8      | 8        |
++---------+--------+----------+
+| data    | 16     | variable |
++---------+--------+----------+
+
+* *address* is the client DMA memory address being accessed.
+* *count* is the size of the data transferred.
+* *data* is the data read.
+
+``VFIO_USER_DMA_WRITE``
+-----------------------
+
+If the client has not shared mappable memory, the server can use this message to
+write to guest memory.
+
+Request
+^^^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| address | 0      | 8        |
++---------+--------+----------+
+| count   | 8      | 8        |
++---------+--------+----------+
+| data    | 16     | variable |
++---------+--------+----------+
+
+* *address* is the client DMA memory address being accessed. This address must have
+  been previously exported to the server with a ``VFIO_USER_DMA_MAP`` message.
+* *count* is the size of the data to be transferred.
+* *data* is the data to write.
+
+Reply
+^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| address | 0      | 8        |
++---------+--------+----------+
+| count   | 8      | 4        |
++---------+--------+----------+
+
+* *address* is the client DMA memory address being accessed.
+* *count* is the size of the data transferred.
+
+``VFIO_USER_DEVICE_RESET``
+--------------------------
+
+This command message is sent from the client to the server to reset the device.
+Neither the request nor the reply has a payload.
+
+``VFIO_USER_DIRTY_PAGES``
+-------------------------
+
+This command is analogous to ``VFIO_IOMMU_DIRTY_PAGES``. It is sent by the client
+to the server in order to control logging of dirty pages, usually during a live
+migration.
+
+Dirty page tracking is optional for server implementations; clients should not
+rely on it.
+
+Request
+^^^^^^^
+
++-------+--------+-----------------------------------------+
+| Name  | Offset | Size                                    |
++=======+========+=========================================+
+| argsz | 0      | 4                                       |
++-------+--------+-----------------------------------------+
+| flags | 4      | 4                                       |
++-------+--------+-----------------------------------------+
+|       | +-----+----------------------------------------+ |
+|       | | Bit | Definition                             | |
+|       | +=====+========================================+ |
+|       | | 0   | VFIO_IOMMU_DIRTY_PAGES_FLAG_START      | |
+|       | +-----+----------------------------------------+ |
+|       | | 1   | VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP       | |
+|       | +-----+----------------------------------------+ |
+|       | | 2   | VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP | |
+|       | +-----+----------------------------------------+ |
++-------+--------+-----------------------------------------+
+
+* *argsz* is the size of the VFIO dirty bitmap info structure for
+  ``START/STOP``; and for ``GET_BITMAP``, the maximum size of the reply payload
+
+* *flags* defines the action to be performed by the server:
+
+  * ``VFIO_IOMMU_DIRTY_PAGES_FLAG_START`` instructs the server to start logging
+    pages it dirties. Logging continues until explicitly disabled by
+    ``VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP``.
+
+  * ``VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP`` instructs the server to stop logging
+    dirty pages.
+
+  * ``VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP`` requests the server to return
+    the dirty bitmap for a specific IOVA range. The IOVA range is specified by
+    a "VFIO Bitmap Range" structure, which must immediately follow this
+    "VFIO Dirty Pages" structure. See `VFIO Bitmap Range Format`_.
+    This operation is only valid if logging of dirty pages has been previously
+    started.
+
+  These flags are mutually exclusive with each other.
+
+This part of the request is analogous to VFIO's ``struct
+vfio_iommu_type1_dirty_bitmap``.
+
+.. _VFIO Bitmap Range Format:
+
+VFIO Bitmap Range Format
+""""""""""""""""""""""""
+
++--------+--------+------+
+| Name   | Offset | Size |
++========+========+======+
+| iova   | 0      | 8    |
++--------+--------+------+
+| size   | 8      | 8    |
++--------+--------+------+
+| bitmap | 16     | 24   |
++--------+--------+------+
+
+* *iova* is the IOVA offset
+
+* *size* is the size of the IOVA region
+
+* *bitmap* is the VFIO Bitmap explained in `VFIO Bitmap`_.
+
+This part of the request is analogous to VFIO's ``struct
+vfio_iommu_type1_dirty_bitmap_get``.
+
+Reply
+^^^^^
+
+For ``VFIO_IOMMU_DIRTY_PAGES_FLAG_START`` or
+``VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP``, there is no reply payload.
+
+For ``VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP``, the reply payload is as follows:
+
++--------------+--------+-----------------------------------------+
+| Name         | Offset | Size                                    |
++==============+========+=========================================+
+| argsz        | 0      | 4                                       |
++--------------+--------+-----------------------------------------+
+| flags        | 4      | 4                                       |
++--------------+--------+-----------------------------------------+
+|              | +-----+----------------------------------------+ |
+|              | | Bit | Definition                             | |
+|              | +=====+========================================+ |
+|              | | 2   | VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP | |
+|              | +-----+----------------------------------------+ |
++--------------+--------+-----------------------------------------+
+| bitmap range | 8      | 40                                      |
++--------------+--------+-----------------------------------------+
+| bitmap       | 48     | variable                                |
++--------------+--------+-----------------------------------------+
+
+* *argsz* is the size required for the full reply payload (dirty pages structure
+  + bitmap range structure + actual bitmap)
+* *flags* is ``VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP``
+* *bitmap range* is the same bitmap range struct provided in the request, as
+  defined in `VFIO Bitmap Range Format`_.
+* *bitmap* is the actual dirty pages bitmap corresponding to the range request
+
+VFIO Device Migration Info
+--------------------------
+
+A device may contain a migration region (of type
+``VFIO_REGION_TYPE_MIGRATION``). The beginning of the region must contain
+``struct vfio_device_migration_info``, defined in ````. This
+subregion is accessed like any other part of a standard vfio-user region
+using ``VFIO_USER_REGION_READ``/``VFIO_USER_REGION_WRITE``.
+
++---------------+--------+-----------------------------+
+| Name          | Offset | Size                        |
++===============+========+=============================+
+| device_state  | 0      | 4                           |
++---------------+--------+-----------------------------+
+|               | +-----+----------------------------+ |
+|               | | Bit | Definition                 | |
+|               | +=====+============================+ |
+|               | | 0   | VFIO_DEVICE_STATE_RUNNING  | |
+|               | +-----+----------------------------+ |
+|               | | 1   | VFIO_DEVICE_STATE_SAVING   | |
+|               | +-----+----------------------------+ |
+|               | | 2   | VFIO_DEVICE_STATE_RESUMING | |
+|               | +-----+----------------------------+ |
++---------------+--------+-----------------------------+
+| reserved      | 4      | 4                           |
++---------------+--------+-----------------------------+
+| pending_bytes | 8      | 8                           |
++---------------+--------+-----------------------------+
+| data_offset   | 16     | 8                           |
++---------------+--------+-----------------------------+
+| data_size     | 24     | 8                           |
++---------------+--------+-----------------------------+
+
+* *device_state* defines the state of the device:
+
+  The client initiates device state transition by writing the intended state.
+  The server must respond only after it has successfully transitioned to the new
+  state. If an error occurs then the server must respond to the
+  ``VFIO_USER_REGION_WRITE`` operation with the Error field set accordingly and
+  must remain at the previous state, or in case of internal error it must
+  transition to the error state, defined as
+  ``VFIO_DEVICE_STATE_RESUMING | VFIO_DEVICE_STATE_SAVING``. The client must
+  re-read the device state in order to determine it afresh.
+
+  The following device states are defined:
+
+  +-----------+---------+----------+-----------------------------------+
+  | _RESUMING | _SAVING | _RUNNING | Description                       |
+  +===========+=========+==========+===================================+
+  | 0         | 0       | 0        | Device is stopped.                |
+  +-----------+---------+----------+-----------------------------------+
+  | 0         | 0       | 1        | Device is running, default state. |
+  +-----------+---------+----------+-----------------------------------+
+  | 0         | 1       | 0        | Stop-and-copy state               |
+  +-----------+---------+----------+-----------------------------------+
+  | 0         | 1       | 1        | Pre-copy state                    |
+  +-----------+---------+----------+-----------------------------------+
+  | 1         | 0       | 0        | Resuming                          |
+  +-----------+---------+----------+-----------------------------------+
+  | 1         | 0       | 1        | Invalid state                     |
+  +-----------+---------+----------+-----------------------------------+
+  | 1         | 1       | 0        | Error state                       |
+  +-----------+---------+----------+-----------------------------------+
+  | 1         | 1       | 1        | Invalid state                     |
+  +-----------+---------+----------+-----------------------------------+
+
+  Valid state transitions are shown in the following table:
+
+  +-------------------------+---------+---------+---------------+----------+----------+
+  | |darr| From / To |rarr| | Stopped | Running | Stop-and-copy | Pre-copy | Resuming |
+  +=========================+=========+=========+===============+==========+==========+
+  | Stopped                 | \-      | 1       | 0             | 0        | 0        |
+  +-------------------------+---------+---------+---------------+----------+----------+
+  | Running                 | 1       | \-      | 1             | 1        | 1        |
+  +-------------------------+---------+---------+---------------+----------+----------+
+  | Stop-and-copy           | 1       | 1       | \-            | 0        | 0        |
+  +-------------------------+---------+---------+---------------+----------+----------+
+  | Pre-copy                | 0       | 0       | 1             | \-       | 0        |
+  +-------------------------+---------+---------+---------------+----------+----------+
+  | Resuming                | 0       | 1       | 0             | 0        | \-       |
+  +-------------------------+---------+---------+---------------+----------+----------+
+
+  A device is migrated to the destination as follows:
+
+  * The source client transitions the device state from the running state to
+    the pre-copy state. This transition is optional for the client but must be
+    supported by the server. The source server starts sending device state data
+    to the source client through the migration region while the device is
+    running.
+
+  * The source client transitions the device state from the running state or the
+    pre-copy state to the stop-and-copy state. The source server stops the
+    device, saves device state and sends it to the source client through the
+    migration region.
+
+  The source client is responsible for sending the migration data to the
+  destination client.
+
+  A device is resumed on the destination as follows:
+
+  * The destination client transitions the device state from the running state
+    to the resuming state. The destination server uses the device state data
+    received through the migration region to resume the device.
+
+  * The destination client provides saved device state to the destination
+    server and then transitions the device back to the running state.
+
+* *reserved* This field is reserved and any access to it must be ignored by the
+  server.
+
+* *pending_bytes* Remaining bytes to be migrated by the server. This field is
+  read only.
+
+* *data_offset* Offset in the migration region where the client must:
+
+  * read from, during the pre-copy or stop-and-copy state, or
+
+  * write to, during the resuming state.
+
+  This field is read only.
+
+* *data_size* Contains the size, in bytes, of the amount of data copied to:
+
+  * the source migration region by the source server during the pre-copy or
+    stop-and-copy state, or
+
+  * the destination migration region by the destination client during the
+    resuming state.
+
+Device-specific data must be stored at any position after
+``struct vfio_device_migration_info``. Note that the migration region can be
+memory mappable, even partially. In practice, only the migration data portion
+can be memory mapped.
+
+The client processes device state data during the pre-copy and the
+stop-and-copy state in the following iterative manner:
+
+  1. The client reads ``pending_bytes`` to mark a new iteration. Repeated reads
+     of this field are idempotent. If there is no migration data
+     to be consumed then the next step depends on the current device state:
+
+     * pre-copy: the client must try again.
+
+     * stop-and-copy: this procedure can end and the device can now start
+       resuming on the destination.
+
+  2. The client reads ``data_offset``; at this point the server must make
+     available a portion of migration data at this offset to be read by the
+     client, which must happen *before* completing the read operation. The
+     amount of data to be read must be stored in the ``data_size`` field, which
+     the client reads next.
+
+  3. The client reads ``data_size`` to determine the amount of migration data
+     available.
+
+  4. The client reads and processes the migration data.
+
+  5. Go to step 1.
+
+Note that the client can transition the device from the pre-copy state to the
+stop-and-copy state at any time; ``pending_bytes`` does not need to become zero.
+
+The client initializes the device state on the destination by setting the
+device state to the resuming state and writing the migration data to the
+destination migration region at ``data_offset`` offset. The client can write the
+source migration data in an iterative manner and the server must consume this
+data before completing each write operation, updating the ``data_offset`` field.
+The server must apply the source migration data on the device resume state. The
+client must write data in the same order and transaction size as it was read.
+
+If an error occurs then the server must fail the read or write operation. It is
+an implementation detail of the client how to handle errors.
+
+Appendices
+==========
+
+Unused VFIO ``ioctl()`` commands
+--------------------------------
+
+The following VFIO commands do not have an equivalent vfio-user command:
+
+* ``VFIO_GET_API_VERSION``
+* ``VFIO_CHECK_EXTENSION``
+* ``VFIO_SET_IOMMU``
+* ``VFIO_GROUP_GET_STATUS``
+* ``VFIO_GROUP_SET_CONTAINER``
+* ``VFIO_GROUP_UNSET_CONTAINER``
+* ``VFIO_GROUP_GET_DEVICE_FD``
+* ``VFIO_IOMMU_GET_INFO``
+
+However, once support for live migration for VFIO devices is finalized some
+of the above commands may have to be handled by the client in their
+corresponding vfio-user form. This will be addressed in a future protocol
+version.
+
+VFIO groups and containers
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The current VFIO implementation includes group and container idioms that
+describe how a device relates to the host IOMMU. In the vfio-user
+implementation, the IOMMU is implemented in software by the client, and is not
+visible to the server. The simplest idea would be that the client puts each
+device into its own group and container.
+
+Backend Program Conventions
+---------------------------
+
+vfio-user backend program conventions are based on the vhost-user ones.
+
+* The backend program must not daemonize itself.
+* No assumptions must be made as to what access the backend program has on the
+  system.
+* File descriptors 0, 1 and 2 must exist, must have regular
+  stdin/stdout/stderr semantics, and can be redirected.
+* The backend program must honor the SIGTERM signal.
+* The backend program must accept the following command line options:
+
+  * ``--socket-path=PATH``: path to UNIX domain socket,
+  * ``--fd=FDNUM``: file descriptor for UNIX domain socket, incompatible with
+    ``--socket-path``
+* The backend program must be accompanied by a JSON file stored under
+  ``/usr/share/vfio-user``.
+
+TODO add schema similar to docs/interop/vhost-user.json.
diff --git a/MAINTAINERS b/MAINTAINERS
index 694973e..d838b9e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1879,6 +1879,12 @@ F: hw/vfio/ap.c
 F: docs/system/s390x/vfio-ap.rst
 L: qemu-s390x@nongnu.org
 
+vfio-user
+M: John G Johnson
+M: Thanos Makatos
+S: Supported
+F: docs/devel/vfio-user.rst
+
 vhost
 M: Michael S. Tsirkin
 S: Supported

From patchwork Tue Nov 9 00:46:30 2021
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12609237
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v3 02/19] vfio-user: add VFIO base abstract class
Date: Mon, 8 Nov 2021 16:46:30 -0800

Add an abstract base class both the kernel driver and user socket
implementations can use to share code.


Signed-off-by: John G Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
---
 hw/vfio/pci.h |  16 +++++++--
 hw/vfio/pci.c | 112 +++++++++++++++++++++++++++++++++++-----------------------
 2 files changed, 81 insertions(+), 47 deletions(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 6477751..bbc78aa 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -114,8 +114,13 @@ typedef struct VFIOMSIXInfo {
     unsigned long *pending;
 } VFIOMSIXInfo;

-#define TYPE_VFIO_PCI "vfio-pci"
-OBJECT_DECLARE_SIMPLE_TYPE(VFIOPCIDevice, VFIO_PCI)
+/*
+ * TYPE_VFIO_PCI_BASE is an abstract type used to share code
+ * between VFIO implementations that use a kernel driver
+ * with those that use user sockets.
+ */
+#define TYPE_VFIO_PCI_BASE "vfio-pci-base"
+OBJECT_DECLARE_SIMPLE_TYPE(VFIOPCIDevice, VFIO_PCI_BASE)

 struct VFIOPCIDevice {
     PCIDevice pdev;
@@ -175,6 +180,13 @@ struct VFIOPCIDevice {
     Notifier irqchip_change_notifier;
 };

+#define TYPE_VFIO_PCI "vfio-pci"
+OBJECT_DECLARE_SIMPLE_TYPE(VFIOKernPCIDevice, VFIO_PCI)
+
+struct VFIOKernPCIDevice {
+    VFIOPCIDevice device;
+};
+
 /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */
 static inline bool vfio_pci_is(VFIOPCIDevice *vdev, uint32_t vendor, uint32_t device)
 {
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index e1ea1d8..122edf8 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -231,7 +231,7 @@ static void vfio_intx_update(VFIOPCIDevice *vdev, PCIINTxRoute *route)

 static void vfio_intx_routing_notifier(PCIDevice *pdev)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     PCIINTxRoute route;

     if (vdev->interrupt != VFIO_INT_INTx) {
@@ -457,7 +457,7 @@ static void vfio_update_kvm_msi_virq(VFIOMSIVector *vector, MSIMessage msg,
 static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
                                    MSIMessage *msg, IOHandler *handler)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     VFIOMSIVector *vector;
     int ret;

@@ -542,7 +542,7 @@ static int vfio_msix_vector_use(PCIDevice *pdev,

 static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     VFIOMSIVector *vector = &vdev->msi_vectors[nr];

     trace_vfio_msix_vector_release(vdev->vbasedev.name, nr);
@@ -1063,7 +1063,7 @@ static const MemoryRegionOps vfio_vga_ops = {
  */
 static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     VFIORegion *region = &vdev->bars[bar].region;
     MemoryRegion *mmap_mr, *region_mr, *base_mr;
     PCIIORegion *r;
@@ -1109,7 +1109,7 @@ static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar)
  */
 uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     uint32_t emu_bits = 0, emu_val = 0, phys_val = 0, val;

     memcpy(&emu_bits, vdev->emulated_config_bits + addr, len);
@@ -1142,7 +1142,7 @@ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
 void vfio_pci_write_config(PCIDevice *pdev,
                            uint32_t addr, uint32_t val, int len)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     uint32_t val_le = cpu_to_le32(val);

     trace_vfio_pci_write_config(vdev->vbasedev.name, addr, val, len);
@@ -2782,7 +2782,7 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)

 static void vfio_realize(PCIDevice *pdev, Error **errp)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     VFIODevice *vbasedev_iter;
     VFIOGroup *group;
     char *tmp, *subsys, group_path[PATH_MAX], *group_name;
@@ -3105,7 +3105,7 @@ error:

 static void vfio_instance_finalize(Object *obj)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(obj);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
     VFIOGroup *group = vdev->vbasedev.group;

     vfio_display_finalize(vdev);
@@ -3125,7 +3125,7 @@ static void vfio_instance_finalize(Object *obj)

 static void vfio_exitfn(PCIDevice *pdev)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);

     vfio_unregister_req_notifier(vdev);
     vfio_unregister_err_notifier(vdev);
@@ -3144,7 +3144,7 @@ static void vfio_exitfn(PCIDevice *pdev)

 static void vfio_pci_reset(DeviceState *dev)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(dev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(dev);

     trace_vfio_pci_reset(vdev->vbasedev.name);

@@ -3184,7 +3184,7 @@ post_reset:
 static void vfio_instance_init(Object *obj)
 {
     PCIDevice *pci_dev = PCI_DEVICE(obj);
-    VFIOPCIDevice *vdev = VFIO_PCI(obj);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);

     device_add_bootindex_property(obj, &vdev->bootindex,
                                   "bootindex", NULL,
@@ -3201,38 +3201,75 @@ static void vfio_instance_init(Object *obj)
     pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS;
 }

-static Property vfio_pci_dev_properties[] = {
-    DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host),
-    DEFINE_PROP_STRING("sysfsdev", VFIOPCIDevice, vbasedev.sysfsdev),
+static Property vfio_pci_base_dev_properties[] = {
     DEFINE_PROP_ON_OFF_AUTO("x-pre-copy-dirty-page-tracking", VFIOPCIDevice,
                             vbasedev.pre_copy_dirty_page_tracking,
                             ON_OFF_AUTO_ON),
+    DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", VFIOPCIDevice,
+                       intx.mmap_timeout, 1100),
+    DEFINE_PROP_BOOL("x-enable-migration", VFIOPCIDevice,
+                     vbasedev.enable_migration, false),
+    DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
+    DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false),
+    DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false),
+    DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),
+    DEFINE_PROP_BOOL("x-no-kvm-ioeventfd", VFIOPCIDevice, no_kvm_ioeventfd,
+                     false),
+    DEFINE_PROP_BOOL("x-no-vfio-ioeventfd", VFIOPCIDevice, no_vfio_ioeventfd,
+                     false),
+    DEFINE_PROP_OFF_AUTO_PCIBAR("x-msix-relocation", VFIOPCIDevice, msix_relo,
+                                OFF_AUTOPCIBAR_OFF),
+    /*
+     * TODO - support passed fds... is this necessary?
+     * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),
+     * DEFINE_PROP_STRING("vfiogroupfd, VFIOPCIDevice, vfiogroupfd_name),
+     */
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vfio_pci_base_dev_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass);
+
+    device_class_set_props(dc, vfio_pci_base_dev_properties);
+    dc->desc = "VFIO PCI base device";
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+    pdc->exit = vfio_exitfn;
+    pdc->config_read = vfio_pci_read_config;
+    pdc->config_write = vfio_pci_write_config;
+}
+
+static const TypeInfo vfio_pci_base_dev_info = {
+    .name = TYPE_VFIO_PCI_BASE,
+    .parent = TYPE_PCI_DEVICE,
+    .instance_size = 0,
+    .abstract = true,
+    .class_init = vfio_pci_base_dev_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { INTERFACE_PCIE_DEVICE },
+        { INTERFACE_CONVENTIONAL_PCI_DEVICE },
+        { }
+    },
+};
+
+static Property vfio_pci_dev_properties[] = {
+    DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host),
+    DEFINE_PROP_STRING("sysfsdev", VFIOPCIDevice, vbasedev.sysfsdev),
     DEFINE_PROP_ON_OFF_AUTO("display", VFIOPCIDevice,
                             display, ON_OFF_AUTO_OFF),
     DEFINE_PROP_UINT32("xres", VFIOPCIDevice, display_xres, 0),
     DEFINE_PROP_UINT32("yres", VFIOPCIDevice, display_yres, 0),
-    DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", VFIOPCIDevice,
-                       intx.mmap_timeout, 1100),
     DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features,
                     VFIO_FEATURE_ENABLE_VGA_BIT, false),
     DEFINE_PROP_BIT("x-req", VFIOPCIDevice, features,
                     VFIO_FEATURE_ENABLE_REQ_BIT, true),
     DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features,
                     VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false),
-    DEFINE_PROP_BOOL("x-enable-migration", VFIOPCIDevice,
-                     vbasedev.enable_migration, false),
-    DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
     DEFINE_PROP_BOOL("x-balloon-allowed", VFIOPCIDevice,
                      vbasedev.ram_block_discard_allowed, false),
-    DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false),
-    DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false),
-    DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),
     DEFINE_PROP_BOOL("x-no-geforce-quirks", VFIOPCIDevice,
                      no_geforce_quirks, false),
-    DEFINE_PROP_BOOL("x-no-kvm-ioeventfd", VFIOPCIDevice, no_kvm_ioeventfd,
-                     false),
-    DEFINE_PROP_BOOL("x-no-vfio-ioeventfd", VFIOPCIDevice, no_vfio_ioeventfd,
-                     false),
     DEFINE_PROP_UINT32("x-pci-vendor-id", VFIOPCIDevice, vendor_id, PCI_ANY_ID),
     DEFINE_PROP_UINT32("x-pci-device-id", VFIOPCIDevice, device_id, PCI_ANY_ID),
     DEFINE_PROP_UINT32("x-pci-sub-vendor-id", VFIOPCIDevice,
@@ -3243,13 +3280,6 @@ static Property vfio_pci_dev_properties[] = {
     DEFINE_PROP_UNSIGNED_NODEFAULT("x-nv-gpudirect-clique", VFIOPCIDevice,
                                    nv_gpudirect_clique,
                                    qdev_prop_nv_gpudirect_clique, uint8_t),
-    DEFINE_PROP_OFF_AUTO_PCIBAR("x-msix-relocation", VFIOPCIDevice, msix_relo,
-                                OFF_AUTOPCIBAR_OFF),
-    /*
-     * TODO - support passed fds... is this necessary?
-     * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),
-     * DEFINE_PROP_STRING("vfiogroupfd, VFIOPCIDevice, vfiogroupfd_name),
-     */
     DEFINE_PROP_END_OF_LIST(),
 };

@@ -3261,25 +3291,16 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
     dc->reset = vfio_pci_reset;
     device_class_set_props(dc, vfio_pci_dev_properties);
     dc->desc = "VFIO-based PCI device assignment";
-    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
     pdc->realize = vfio_realize;
-    pdc->exit = vfio_exitfn;
-    pdc->config_read = vfio_pci_read_config;
-    pdc->config_write = vfio_pci_write_config;
 }

 static const TypeInfo vfio_pci_dev_info = {
     .name = TYPE_VFIO_PCI,
-    .parent = TYPE_PCI_DEVICE,
-    .instance_size = sizeof(VFIOPCIDevice),
+    .parent = TYPE_VFIO_PCI_BASE,
+    .instance_size = sizeof(VFIOKernPCIDevice),
     .class_init = vfio_pci_dev_class_init,
     .instance_init = vfio_instance_init,
     .instance_finalize = vfio_instance_finalize,
-    .interfaces = (InterfaceInfo[]) {
-        { INTERFACE_PCIE_DEVICE },
-        { INTERFACE_CONVENTIONAL_PCI_DEVICE },
-        { }
-    },
 };

 static Property vfio_pci_dev_nohotplug_properties[] = {
@@ -3298,12 +3319,13 @@ static void vfio_pci_nohotplug_dev_class_init(ObjectClass *klass, void *data)
 static const TypeInfo vfio_pci_nohotplug_dev_info = {
     .name = TYPE_VFIO_PCI_NOHOTPLUG,
     .parent = TYPE_VFIO_PCI,
-    .instance_size = sizeof(VFIOPCIDevice),
+    .instance_size = sizeof(VFIOKernPCIDevice),
     .class_init = vfio_pci_nohotplug_dev_class_init,
 };

 static void register_vfio_pci_dev_type(void)
 {
+    type_register_static(&vfio_pci_base_dev_info);
     type_register_static(&vfio_pci_dev_info);
     type_register_static(&vfio_pci_nohotplug_dev_info);
 }

From patchwork Tue Nov 9 00:46:31 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12609261
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v3 03/19] Add container IO ops vector
Date: Mon, 8 Nov 2021 16:46:31 -0800
Message-Id: <9ce71d6fe15f23e22bf22c392e72aca5aabadce3.1636057885.git.john.g.johnson@oracle.com>
X-Mailer: git-send-email 1.8.3.1

Used for communication with VFIO driver
(prep work for vfio-user, which will communicate over a socket)

Signed-off-by: John G Johnson
---
 include/hw/vfio/vfio-common.h |  34 +++++++++++
 hw/vfio/common.c              | 131 ++++++++++++++++++++++++++++--------------
 2 files changed, 123 insertions(+), 42 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 8af11b0..9b3c5e5 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -75,6 +75,7 @@ typedef struct VFIOAddressSpace {
 } VFIOAddressSpace;

 struct VFIOGroup;
+typedef struct VFIOContIO VFIOContIO;

 typedef struct VFIOContainer {
     VFIOAddressSpace *space;
@@ -83,6 +84,7 @@ typedef struct VFIOContainer {
     MemoryListener prereg_listener;
     unsigned iommu_type;
     Error *error;
+    VFIOContIO *io_ops;
     bool initialized;
     bool dirty_pages_supported;
     uint64_t dirty_pgsizes;
@@ -154,6 +156,38 @@ struct VFIODeviceOps {
     int (*vfio_load_config)(VFIODevice *vdev, QEMUFile *f);
 };

+#ifdef CONFIG_LINUX
+
+/*
+ * The next 2 ops vectors are how Devices and Containers
+ * communicate with the server. The default option is
+ * through ioctl() to the kernel VFIO driver, but vfio-user
+ * can use a socket to a remote process.
+ */
+
+struct VFIOContIO {
+    int (*dma_map)(VFIOContainer *container,
+                   struct vfio_iommu_type1_dma_map *map,
+                   int fd, bool will_commit);
+    int (*dma_unmap)(VFIOContainer *container,
+                     struct vfio_iommu_type1_dma_unmap *unmap,
+                     struct vfio_bitmap *bitmap, bool will_commit);
+    int (*dirty_bitmap)(VFIOContainer *container,
+                        struct vfio_iommu_type1_dirty_bitmap *bitmap,
+                        struct vfio_iommu_type1_dirty_bitmap_get *range);
+};
+
+#define CONT_DMA_MAP(cont, map, fd, will_commit) \
+    ((cont)->io_ops->dma_map((cont), (map), (fd), (will_commit)))
+#define CONT_DMA_UNMAP(cont, unmap, bitmap, will_commit) \
+    ((cont)->io_ops->dma_unmap((cont), (unmap), (bitmap), (will_commit)))
+#define CONT_DIRTY_BITMAP(cont, bitmap, range) \
+    ((cont)->io_ops->dirty_bitmap((cont), (bitmap), (range)))
+
+extern VFIOContIO vfio_cont_io_ioctl;
+
+#endif /* CONFIG_LINUX */
+
 typedef struct VFIOGroup {
     int fd;
     int groupid;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 8728d4d..50748be 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -431,12 +431,12 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
         goto unmap_exit;
     }

-    ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, unmap);
+    ret = CONT_DMA_UNMAP(container, unmap, bitmap, false);
     if (!ret) {
         cpu_physical_memory_set_dirty_lebitmap((unsigned long *)bitmap->data,
                 iotlb->translated_addr, pages);
     } else {
-        error_report("VFIO_UNMAP_DMA with DIRTY_BITMAP : %m");
+        error_report("VFIO_UNMAP_DMA with DIRTY_BITMAP : %s", strerror(-ret));
     }

     g_free(bitmap->data);
@@ -464,30 +464,7 @@ static int vfio_dma_unmap(VFIOContainer *container,
         return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
     }

-    while (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
-        /*
-         * The type1 backend has an off-by-one bug in the kernel (71a7d3d78e3c
-         * v4.15) where an overflow in its wrap-around check prevents us from
-         * unmapping the last page of the address space. Test for the error
-         * condition and re-try the unmap excluding the last page. The
-         * expectation is that we've never mapped the last page anyway and this
-         * unmap request comes via vIOMMU support which also makes it unlikely
-         * that this page is used. This bug was introduced well after type1 v2
-         * support was introduced, so we shouldn't need to test for v1. A fix
-         * is queued for kernel v5.0 so this workaround can be removed once
-         * affected kernels are sufficiently deprecated.
-         */
-        if (errno == EINVAL && unmap.size && !(unmap.iova + unmap.size) &&
-            container->iommu_type == VFIO_TYPE1v2_IOMMU) {
-            trace_vfio_dma_unmap_overflow_workaround();
-            unmap.size -= 1ULL << ctz64(container->pgsizes);
-            continue;
-        }
-        error_report("VFIO_UNMAP_DMA failed: %s", strerror(errno));
-        return -errno;
-    }
-
-    return 0;
+    return CONT_DMA_UNMAP(container, &unmap, NULL, false);
 }

 static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
@@ -500,24 +477,18 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
         .iova = iova,
         .size = size,
     };
+    int ret;

     if (!readonly) {
         map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
     }

-    /*
-     * Try the mapping, if it fails with EBUSY, unmap the region and try
-     * again. This shouldn't be necessary, but we sometimes see it in
-     * the VGA ROM space.
-     */
-    if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
-        (errno == EBUSY && vfio_dma_unmap(container, iova, size, NULL) == 0 &&
-         ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) {
-        return 0;
-    }
+    ret = CONT_DMA_MAP(container, &map, -1, false);

-    error_report("VFIO_MAP_DMA failed: %s", strerror(errno));
-    return -errno;
+    if (ret < 0) {
+        error_report("VFIO_MAP_DMA failed: %s", strerror(-ret));
+    }
+    return ret;
 }

 static void vfio_host_win_add(VFIOContainer *container,
@@ -1221,10 +1192,10 @@ static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
         dirty.flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP;
     }

-    ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, &dirty);
+    ret = CONT_DIRTY_BITMAP(container, &dirty, NULL);
     if (ret) {
         error_report("Failed to set dirty tracking flag 0x%x errno: %d",
-                     dirty.flags, errno);
+                     dirty.flags, -ret);
     }
 }

@@ -1274,11 +1245,11 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
         goto err_out;
     }

-    ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
+    ret = CONT_DIRTY_BITMAP(container, dbitmap, range);
     if (ret) {
         error_report("Failed to get dirty bitmap for iova: 0x%"PRIx64
                 " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
-                (uint64_t)range->size, errno);
+                (uint64_t)range->size, -ret);
         goto err_out;
     }

@@ -2048,6 +2019,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
     container->error = NULL;
     container->dirty_pages_supported = false;
     container->dma_max_mappings = 0;
+    container->io_ops = &vfio_cont_io_ioctl;
     QLIST_INIT(&container->giommu_list);
     QLIST_INIT(&container->hostwin_list);
     QLIST_INIT(&container->vrdl_list);
@@ -2577,3 +2549,78 @@ int vfio_eeh_as_op(AddressSpace *as, uint32_t op)
     }
     return vfio_eeh_container_op(container, op);
 }
+
+/*
+ * Traditional ioctl() based io_ops
+ */
+
+static int vfio_io_dma_map(VFIOContainer *container,
+                           struct vfio_iommu_type1_dma_map *map,
+                           int fd, bool will_commit)
+{
+
+    /*
+     * Try the mapping, if it fails with EBUSY, unmap the region and try
+     * again. This shouldn't be necessary, but we sometimes see it in
+     * the VGA ROM space.
+     */
+    if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, map) == 0 ||
+        (errno == EBUSY &&
+         vfio_dma_unmap(container, map->iova, map->size, NULL) == 0 &&
+         ioctl(container->fd, VFIO_IOMMU_MAP_DMA, map) == 0)) {
+        return 0;
+    }
+    return -errno;
+}
+
+static int vfio_io_dma_unmap(VFIOContainer *container,
+                             struct vfio_iommu_type1_dma_unmap *unmap,
+                             struct vfio_bitmap *bitmap,
+                             bool will_commit)
+{
+
+    while (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, unmap)) {
+        /*
+         * The type1 backend has an off-by-one bug in the kernel (71a7d3d78e3c
+         * v4.15) where an overflow in its wrap-around check prevents us from
+         * unmapping the last page of the address space. Test for the error
+         * condition and re-try the unmap excluding the last page. The
+         * expectation is that we've never mapped the last page anyway and this
+         * unmap request comes via vIOMMU support which also makes it unlikely
+         * that this page is used. This bug was introduced well after type1 v2
+         * support was introduced, so we shouldn't need to test for v1. A fix
+         * is queued for kernel v5.0 so this workaround can be removed once
+         * affected kernels are sufficiently deprecated.
+         */
+        if (errno == EINVAL && unmap->size && !(unmap->iova + unmap->size) &&
+            container->iommu_type == VFIO_TYPE1v2_IOMMU) {
+            trace_vfio_dma_unmap_overflow_workaround();
+            unmap->size -= 1ULL << ctz64(container->pgsizes);
+            continue;
+        }
+        error_report("VFIO_UNMAP_DMA failed: %s", strerror(errno));
+        return -errno;
+    }
+
+    return 0;
+}
+
+static int vfio_io_dirty_bitmap(VFIOContainer *container,
+                        struct vfio_iommu_type1_dirty_bitmap *bitmap,
+                        struct vfio_iommu_type1_dirty_bitmap_get *range)
+{
+    int ret;
+
+    ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, bitmap);
+    if (ret < 0) {
+        return -errno;
+    }
+
+    return ret;
+}
+
+VFIOContIO vfio_cont_io_ioctl = {
+    .dma_map = vfio_io_dma_map,
+    .dma_unmap = vfio_io_dma_unmap,
+    .dirty_bitmap = vfio_io_dirty_bitmap,
+};

From patchwork Tue Nov 9 00:46:32 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12609239
15.20.4669.11; Tue, 9 Nov 2021 00:39:11 +0000 Received: from SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093]) by SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093%7]) with mapi id 15.20.4669.016; Tue, 9 Nov 2021 00:39:11 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v3 04/19] Add device IO ops vector Date: Mon, 8 Nov 2021 16:46:32 -0800 Message-Id: X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) To SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) MIME-Version: 1.0 Received: from bruckner.us.oracle.com (73.71.20.66) by SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11 via Frontend Transport; Tue, 9 Nov 2021 00:39:11 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: c47c8059-0cf4-41b4-6f04-08d9a31958fb X-MS-TrafficTypeDiagnostic: BY5PR10MB4068: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:119; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: jXsSsQ1NzRO7DdAyi5V4bLqjtDitDuvynyva+5hflBDZJxuxO85st6Xtaj4yVfeid9CtRc+8+MXht4ER+qsw2fQ0dYueZVjNgr7lELP/YPsXHVmb1RV9zvze0OWM+bAympcRLOFcZpI3b9FMqMQlWVHxbr2mZotSxbix/sD60Y2nDyBpDZ7lLqjmNQmP6YUdT23wNWuCJuVSGfaUl50OujZL0CPTj4szS+qJYAtZRsmXFTivLLgs4lY4bh0+llK7l8LA4dejbT0nDpxQc8wOi1G7de49fYwxlR3aS5obAG5zQBAoOwFHvrW0/wr+6Prfe1sUxGEUjJjqc0Ex+kMeYaDnDVRTTdvKBGal37rjFfZcBhXh4cRFA/eWAnbzuVxiX/5i6rTYoA0yQ2F9UKaUaVDe/KaRQpoCspQXrhvKKwJuuzpaKWrkOnsLcuXBIGWPFtbah36x+JPO2N9Bn2X/A0RjZ55AtGh+SB/JeT+IcqprDnv2f9EF+3POT6iV650PY2CPipbMgd+uBoHjri6lNg2urWbKQAz6NA2xoQWcj0uKb0qS/k8swjcRYNvdIdNGwpUFnX/b8K9jJcuAEpmor2PS16w4kiJOleNypsVkTZYdqqerFcExvpP3DrB0lPBcZ50ATQgZ2p/Tv+Ok5F50+8BQOr0MiMDw6bJQdxeYg1NZPwfvhcdxWsXMUKVvyJZukqvITpj7nKevBU38C36NAw== 
X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:SJ0PR10MB4686.namprd10.prod.outlook.com; PTR:; CAT:NONE; SFS:(366004)(66476007)(316002)(186003)(66556008)(508600001)(8936002)(8676002)(66946007)(6486002)(26005)(2906002)(6916009)(5660300002)(36756003)(7696005)(52116002)(86362001)(83380400001)(956004)(38100700002)(6666004)(2616005)(30864003)(38350700002); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: Ur9MMUVDh05J3Jn2AOdcZwiOSB0Ue2iw/RzCRJ6hIVWN7z5APLcZVoc4uITzrqWIxLF6rlfWvHurOgRiq+Ws45nK6wEt+PUnFLcBfb1t3rnuyK4ah7yaCm66e8BcylGPtwDmuOXJIwZNvNoSpJboDaVV3QMFh4uS4jtdW94KvMTHoj2xyV9iL79V12WTK2FFvAdUQdoarD0FE0CYV1r6K7jTEPrffpJyNYcLQyMbCNTK4CN/iofwRWer5mJJMRAqWrYxsv0eqP7ilB+NhH+a+bLABbNtQGBds6gOfqO2ChBTxzSs29+kZHBd9tufCI1n2Nyb7dFlKb6d2iwLmwJE+9wd+bX23yVzCJ+xq3uFSa9CDBK52pm3ESPXVD3KQIhx27eR3eIhaP+u2w5qK2TeryfSnL0io2yUpuEDrTBZg2LBee+RGDZGJ2OLiT8mW8p1OSaWU4WKcF0p/e4KsUyMdn8VQtEE4IKN8/Ql+pMmUtm4keqUmLeHdJu8kkkYL6EnBMNorGOcore4aWRcDdstaosTFiYohUA/QusyYJaipBAdzJ2kjhDDC4V9V6NtbNBlfVgy7JNFPB5YvtAqv9JbfskcUsN+FjB7EFlUShDCxvBZK6U0aMUOipwxZeDW4sAvrc2D7KT9M7WUgWKR5uP5++0ZoIYOkW9DciTksIXCpE1/IbA0LmmrPBt+xBtEnKXsJqZGrQYA3qEWjEb8wikxBJ+NW0bHPDY4+Vw4+L2p0orS75TO3qJfYbOIEXWsLhpIIKUwTMOdwOBdWxcAC7+23WnpJMVpxf/c18cmsieXZuByd6OZJPI+6lIEGThUN4t2HnyHvghO7rzokpfbgZFbC+P/x3RGW5FNywfaLJlQ34T/vPR4NXOcN+sU21IFfqgn8pd7S8+PwaJmPm35qNBUBWM19mdCLNRKX2DZYT7X/ZtwfeUkFgq3G8nIryiBMQ2SIy3eBrW7h+qKTG/g/yyD5aEUN7o6aMpuWmT7o/A0FRgIEcfTPlA2SaUzyK66dLvg7Tq2Zsr4DfTXWD33qDW9x+PJT5DgpoO9J+luUt1ZkVs2eMIr2QAVpB0dcXHxnhYcVWqF6rfhkOdL3B0wT8L/DTubDraqAz0/aBoOhRzcb2Gy7VTQ3+Nsba1J1ny3IZAvSkUtkgkwD70GYe65fVZ4MxX7TV3v4LdoaTxiYJoGCGAecS1gBQj9qNZe7zIdyxn7r/F0cPibSrj1BjMEIh1hFUpjur0FfGyTibSgNqdttL+hp71bnrd91wUYRpqE13rLRB27BhycvrJCyTGrrkBTUPWxptcKNvY33RN0Cg4WSwXHXXoEXVMznAhgV2/g/g1ZF9Ae5nIi2W8tTaMwJiPkohut7nxzQD79exvRyzrepe6bUlvn9k88YRq4pAPfqp++N/g3HJmlHOasST9DGkWKjVoEuZLfCVoxJd1Tf5iVuHLpkCNYrVI3D3dla4yefEgvJuLRy56zEKrOrLnN
rXWT5TwHYcKEoFf2Dt81x+MqvqfhBFK365QzWQ0oUjHtYdKWDlw+0oE4N8YzQAYdw8Snj91qDDJSXd04qi8Gj+cpw6Tqp1gnOPu2OnpNj5YjJLLy7Tr8r0DkZSvO9AIMhPtM7efype57gE+VphBsHNOFIL7/m7UAUYx5kri88EOgD2eAFNBEQkSG7tZXkvBwj8pycA== X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: c47c8059-0cf4-41b4-6f04-08d9a31958fb X-MS-Exchange-CrossTenant-AuthSource: SJ0PR10MB4686.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 09 Nov 2021 00:39:11.3253 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: X0/4LToYQ1SSpDBX/geaSkSuXSZ4SQq/4Pki+DxJs6idwf+S4rguuCQh6tCB8dlpBDx6KFXVYxZN6hQJlBOqZ8NRbLIS5Nwb+4m7EFDlHI8= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR10MB4068 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10162 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 adultscore=0 mlxlogscore=999 malwarescore=0 phishscore=0 mlxscore=0 suspectscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111090001 X-Proofpoint-GUID: M-iRq80j1WYQ-o0Bib8ba4VNMKdtPkld X-Proofpoint-ORIG-GUID: M-iRq80j1WYQ-o0Bib8ba4VNMKdtPkld Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MSGID_FROM_MTA_HEADER=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
Used for communication with VFIO driver
(prep work for vfio-user, which will communicate over a socket)

Signed-off-by: John G Johnson
---
 include/hw/vfio/vfio-common.h |  28 ++++++++
 hw/vfio/common.c              | 159 ++++++++++++++++++++++++++++++++++++++----
 hw/vfio/pci.c                 | 146 ++++++++++++++++++++++++--------------
 3 files changed, 265 insertions(+), 68 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 9b3c5e5..43fa948 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -124,6 +124,7 @@ typedef struct VFIOHostDMAWindow {
 } VFIOHostDMAWindow;
 
 typedef struct VFIODeviceOps VFIODeviceOps;
+typedef struct VFIODevIO VFIODevIO;
 
 typedef struct VFIODevice {
     QLIST_ENTRY(VFIODevice) next;
@@ -139,12 +140,14 @@ typedef struct VFIODevice {
     bool ram_block_discard_allowed;
     bool enable_migration;
     VFIODeviceOps *ops;
+    VFIODevIO *io_ops;
     unsigned int num_irqs;
     unsigned int num_regions;
     unsigned int flags;
     VFIOMigration *migration;
     Error *migration_blocker;
     OnOffAuto pre_copy_dirty_page_tracking;
+    struct vfio_region_info **regions;
 } VFIODevice;
 
 struct VFIODeviceOps {
@@ -164,6 +167,30 @@ struct VFIODeviceOps {
  * through ioctl() to the kernel VFIO driver, but vfio-user
  * can use a socket to a remote process.
  */
+struct VFIODevIO {
+    int (*get_info)(VFIODevice *vdev, struct vfio_device_info *info);
+    int (*get_region_info)(VFIODevice *vdev,
+                           struct vfio_region_info *info, int *fd);
+    int (*get_irq_info)(VFIODevice *vdev, struct vfio_irq_info *irq);
+    int (*set_irqs)(VFIODevice *vdev, struct vfio_irq_set *irqs);
+    int (*region_read)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size,
+                       void *data);
+    int (*region_write)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size,
+                        void *data, bool post);
+};
+
+#define VDEV_GET_INFO(vdev, info) \
+    ((vdev)->io_ops->get_info((vdev), (info)))
+#define VDEV_GET_REGION_INFO(vdev, info, fd) \
+    ((vdev)->io_ops->get_region_info((vdev), (info), (fd)))
+#define VDEV_GET_IRQ_INFO(vdev, irq) \
+    ((vdev)->io_ops->get_irq_info((vdev), (irq)))
+#define VDEV_SET_IRQS(vdev, irqs) \
+    ((vdev)->io_ops->set_irqs((vdev), (irqs)))
+#define VDEV_REGION_READ(vdev, nr, off, size, data) \
+    ((vdev)->io_ops->region_read((vdev), (nr), (off), (size), (data)))
+#define VDEV_REGION_WRITE(vdev, nr, off, size, data, post) \
+    ((vdev)->io_ops->region_write((vdev), (nr), (off), (size), (data), (post)))
 
 struct VFIOContIO {
     int (*dma_map)(VFIOContainer *container,
@@ -184,6 +211,7 @@ struct VFIOContIO {
 #define CONT_DIRTY_BITMAP(cont, bitmap, range) \
     ((cont)->io_ops->dirty_bitmap((cont), (bitmap), (range)))
 
+extern VFIODevIO vfio_dev_io_ioctl;
 extern VFIOContIO vfio_cont_io_ioctl;
 
 #endif /* CONFIG_LINUX */
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 50748be..41fdd78 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -70,7 +70,7 @@ void vfio_disable_irqindex(VFIODevice *vbasedev, int index)
         .count = 0,
     };
 
-    ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+    VDEV_SET_IRQS(vbasedev, &irq_set);
 }
 
 void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index)
@@ -83,7 +83,7 @@ void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index)
         .count = 1,
     };
 
-    ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+    VDEV_SET_IRQS(vbasedev, &irq_set);
 }
 
 void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index)
@@ -96,7 +96,7 @@ void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index)
         .count = 1,
     };
 
-    ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+    VDEV_SET_IRQS(vbasedev, &irq_set);
 }
 
 static inline const char *action_to_str(int action)
@@ -177,9 +177,7 @@ int vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex,
     pfd = (int32_t *)&irq_set->data;
     *pfd = fd;
 
-    if (ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set)) {
-        ret = -errno;
-    }
+    ret = VDEV_SET_IRQS(vbasedev, irq_set);
     g_free(irq_set);
 
     if (!ret) {
@@ -214,6 +212,7 @@ void vfio_region_write(void *opaque, hwaddr addr,
         uint32_t dword;
         uint64_t qword;
     } buf;
+    int ret;
 
     switch (size) {
     case 1:
@@ -233,13 +232,15 @@ void vfio_region_write(void *opaque, hwaddr addr,
         break;
     }
 
-    if (pwrite(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
+    ret = VDEV_REGION_WRITE(vbasedev, region->nr, addr, size, &buf, false);
+    if (ret != size) {
+        const char *err = ret < 0 ? strerror(-ret) : "short write";
+
         error_report("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64
-                     ",%d) failed: %m",
+                     ",%d) failed: %s",
                      __func__, vbasedev->name, region->nr,
-                     addr, data, size);
+                     addr, data, size, err);
     }
-
     trace_vfio_region_write(vbasedev->name, region->nr, addr, data, size);
 
     /*
@@ -265,13 +266,18 @@ uint64_t vfio_region_read(void *opaque,
         uint64_t qword;
     } buf;
     uint64_t data = 0;
+    int ret;
 
-    if (pread(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
-        error_report("%s(%s:region%d+0x%"HWADDR_PRIx", %d) failed: %m",
+    ret = VDEV_REGION_READ(vbasedev, region->nr, addr, size, &buf);
+    if (ret != size) {
+        const char *err = ret < 0 ? strerror(-ret) : "short read";
+
+        error_report("%s(%s:region%d+0x%"HWADDR_PRIx", %d) failed: %s",
                      __func__, vbasedev->name, region->nr,
-                     addr, size);
+                     addr, size, err);
         return (uint64_t)-1;
     }
+
     switch (size) {
     case 1:
         data = buf.byte;
@@ -2369,6 +2375,16 @@ int vfio_get_device(VFIOGroup *group, const char *name,
 
 void vfio_put_base_device(VFIODevice *vbasedev)
 {
+    if (vbasedev->regions != NULL) {
+        int i;
+
+        for (i = 0; i < vbasedev->num_regions; i++) {
+            g_free(vbasedev->regions[i]);
+        }
+        g_free(vbasedev->regions);
+        vbasedev->regions = NULL;
+    }
+
     if (!vbasedev->group) {
         return;
     }
@@ -2382,6 +2398,21 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
                          struct vfio_region_info **info)
 {
     size_t argsz = sizeof(struct vfio_region_info);
+    int fd = -1;
+    int ret;
+
+    /* create region cache */
+    if (vbasedev->regions == NULL) {
+        vbasedev->regions = g_new0(struct vfio_region_info *,
+                                   vbasedev->num_regions);
+    }
+    /* check cache */
+    if (vbasedev->regions[index] != NULL) {
+        *info = g_malloc0(vbasedev->regions[index]->argsz);
+        memcpy(*info, vbasedev->regions[index],
+               vbasedev->regions[index]->argsz);
+        return 0;
+    }
 
     *info = g_malloc0(argsz);
 
@@ -2389,19 +2420,28 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
 retry:
     (*info)->argsz = argsz;
 
-    if (ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, *info)) {
+    ret = VDEV_GET_REGION_INFO(vbasedev, *info, &fd);
+    if (ret != 0) {
         g_free(*info);
         *info = NULL;
-        return -errno;
+        return ret;
     }
 
     if ((*info)->argsz > argsz) {
         argsz = (*info)->argsz;
         *info = g_realloc(*info, argsz);
 
+        if (fd != -1) {
+            close(fd);
+            fd = -1;
+        }
         goto retry;
     }
 
+    /* fill cache */
+    vbasedev->regions[index] = g_malloc0(argsz);
+    memcpy(vbasedev->regions[index], *info, argsz);
+
     return 0;
 }
 
@@ -2554,6 +2594,95 @@ int vfio_eeh_as_op(AddressSpace *as, uint32_t op)
  * Traditional ioctl() based io_ops
  */
 
+static int vfio_io_get_info(VFIODevice *vbasedev, struct vfio_device_info *info)
+{
+    int ret;
+
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_INFO, info);
+    if (ret < 0) {
+        ret = -errno;
+    }
+
+    return ret;
+}
+
+static int vfio_io_get_region_info(VFIODevice *vbasedev,
+                                   struct vfio_region_info *info,
+                                   int *fd)
+{
+    int ret;
+
+    *fd = -1;
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, info);
+    if (ret < 0) {
+        ret = -errno;
+    }
+
+    return ret;
+}
+
+static int vfio_io_get_irq_info(VFIODevice *vbasedev,
+                                struct vfio_irq_info *info)
+{
+    int ret;
+
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, info);
+    if (ret < 0) {
+        ret = -errno;
+    }
+
+    return ret;
+}
+
+static int vfio_io_set_irqs(VFIODevice *vbasedev, struct vfio_irq_set *irqs)
+{
+    int ret;
+
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs);
+    if (ret < 0) {
+        ret = -errno;
+    }
+
+    return ret;
+}
+
+static int vfio_io_region_read(VFIODevice *vbasedev, uint8_t index, off_t off,
+                               uint32_t size, void *data)
+{
+    struct vfio_region_info *info = vbasedev->regions[index];
+    int ret;
+
+    ret = pread(vbasedev->fd, data, size, info->offset + off);
+    if (ret < 0) {
+        ret = -errno;
+    }
+
+    return ret;
+}
+
+static int vfio_io_region_write(VFIODevice *vbasedev, uint8_t index, off_t off,
+                                uint32_t size, void *data, bool post)
+{
+    struct vfio_region_info *info = vbasedev->regions[index];
+    int ret;
+
+    ret = pwrite(vbasedev->fd, data, size, info->offset + off);
+    if (ret < 0) {
+        ret = -errno;
+    }
+
+    return ret;
+}
+
+VFIODevIO vfio_dev_io_ioctl = {
+    .get_info = vfio_io_get_info,
+    .get_region_info = vfio_io_get_region_info,
+    .get_irq_info = vfio_io_get_irq_info,
+    .set_irqs = vfio_io_set_irqs,
+    .region_read = vfio_io_region_read,
+    .region_write = vfio_io_region_write,
+};
+
 static int vfio_io_dma_map(VFIOContainer *container,
                            struct vfio_iommu_type1_dma_map *map,
                            int fd, bool will_commit)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 122edf8..28f21f8 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -402,7 +402,7 @@ static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix)
         fds[i] = fd;
     }
 
-    ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
+    ret = VDEV_SET_IRQS(&vdev->vbasedev, irq_set);
 
     g_free(irq_set);
 
@@ -772,14 +772,16 @@ static void vfio_update_msi(VFIOPCIDevice *vdev)
 
 static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
 {
+    VFIODevice *vbasedev = &vdev->vbasedev;
     struct vfio_region_info *reg_info;
     uint64_t size;
     off_t off = 0;
     ssize_t bytes;
+    int ret;
 
-    if (vfio_get_region_info(&vdev->vbasedev,
-                             VFIO_PCI_ROM_REGION_INDEX, &reg_info)) {
-        error_report("vfio: Error getting ROM info: %m");
+    ret = vfio_get_region_info(vbasedev, VFIO_PCI_ROM_REGION_INDEX, &reg_info);
+    if (ret < 0) {
+        error_report("vfio: Error getting ROM info: %s", strerror(-ret));
         return;
     }
 
@@ -806,18 +808,19 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
     memset(vdev->rom, 0xff, size);
 
     while (size) {
-        bytes = pread(vdev->vbasedev.fd, vdev->rom + off,
-                      size, vdev->rom_offset + off);
+        bytes = VDEV_REGION_READ(vbasedev, VFIO_PCI_ROM_REGION_INDEX, off,
+                                 size, vdev->rom + off);
         if (bytes == 0) {
             break;
         } else if (bytes > 0) {
             off += bytes;
             size -= bytes;
         } else {
-            if (errno == EINTR || errno == EAGAIN) {
+            if (bytes == -EINTR || bytes == -EAGAIN) {
                 continue;
             }
-            error_report("vfio: Error reading device ROM: %m");
+            error_report("vfio: Error reading device ROM: %s",
+                         strerror(-bytes));
             break;
         }
     }
@@ -905,11 +908,10 @@ static const MemoryRegionOps vfio_rom_ops = {
 
 static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
 {
+    VFIODevice *vbasedev = &vdev->vbasedev;
     uint32_t orig, size = cpu_to_le32((uint32_t)PCI_ROM_ADDRESS_MASK);
-    off_t offset = vdev->config_offset + PCI_ROM_ADDRESS;
     DeviceState *dev = DEVICE(vdev);
     char *name;
-    int fd = vdev->vbasedev.fd;
 
     if (vdev->pdev.romfile || !vdev->pdev.rom_bar) {
         /* Since pci handles romfile, just print a message and return */
@@ -926,13 +928,23 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
      * Use the same size ROM BAR as the physical device. The contents
      * will get filled in later when the guest tries to read it.
      */
-    if (pread(fd, &orig, 4, offset) != 4 ||
-        pwrite(fd, &size, 4, offset) != 4 ||
-        pread(fd, &size, 4, offset) != 4 ||
-        pwrite(fd, &orig, 4, offset) != 4) {
-        error_report("%s(%s) failed: %m", __func__, vdev->vbasedev.name);
+#define rom_read(var) VDEV_REGION_READ(vbasedev, \
+                                       VFIO_PCI_CONFIG_REGION_INDEX, \
+                                       PCI_ROM_ADDRESS, 4, (var))
+#define rom_write(var) VDEV_REGION_WRITE(vbasedev, \
+                                         VFIO_PCI_CONFIG_REGION_INDEX, \
+                                         PCI_ROM_ADDRESS, 4, (var), false)
+
+    if (rom_read(&orig) != 4 ||
+        rom_write(&size) != 4 ||
+        rom_read(&size) != 4 ||
+        rom_write(&orig) != 4) {
+
+        error_report("%s(%s) ROM access failed", __func__, vbasedev->name);
         return;
     }
+#undef rom_read
+#undef rom_write
 
     size = ~(le32_to_cpu(size) & PCI_ROM_ADDRESS_MASK) + 1;
 
@@ -1110,6 +1122,7 @@ static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar)
 uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
 {
     VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
     uint32_t emu_bits = 0, emu_val = 0, phys_val = 0, val;
 
     memcpy(&emu_bits, vdev->emulated_config_bits + addr, len);
@@ -1122,12 +1135,14 @@ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
     if (~emu_bits & (0xffffffffU >> (32 - len * 8))) {
         ssize_t ret;
 
-        ret = pread(vdev->vbasedev.fd, &phys_val, len,
-                    vdev->config_offset + addr);
+        ret = VDEV_REGION_READ(vbasedev, VFIO_PCI_CONFIG_REGION_INDEX, addr,
+                               len, &phys_val);
         if (ret != len) {
-            error_report("%s(%s, 0x%x, 0x%x) failed: %m",
-                         __func__, vdev->vbasedev.name, addr, len);
-            return -errno;
+            const char *err = ret < 0 ? strerror(-ret) : "short read";
+
+            error_report("%s(%s, 0x%x, 0x%x) failed: %s",
+                         __func__, vbasedev->name, addr, len, err);
+            return -1;
         }
         phys_val = le32_to_cpu(phys_val);
     }
@@ -1143,15 +1158,20 @@ void vfio_pci_write_config(PCIDevice *pdev,
                            uint32_t addr, uint32_t val, int len)
 {
     VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
     uint32_t val_le = cpu_to_le32(val);
+    int ret;
 
     trace_vfio_pci_write_config(vdev->vbasedev.name, addr, val, len);
 
     /* Write everything to VFIO, let it filter out what we can't write */
-    if (pwrite(vdev->vbasedev.fd, &val_le, len, vdev->config_offset + addr)
-                != len) {
-        error_report("%s(%s, 0x%x, 0x%x, 0x%x) failed: %m",
-                     __func__, vdev->vbasedev.name, addr, val, len);
+    ret = VDEV_REGION_WRITE(vbasedev, VFIO_PCI_CONFIG_REGION_INDEX, addr,
+                            len, &val_le, false);
+    if (ret != len) {
+        const char *err = ret < 0 ? strerror(-ret) : "short write";
+
+        error_report("%s(%s, 0x%x, 0x%x, 0x%x) failed: %s",
+                     __func__, vbasedev->name, addr, val, len, err);
     }
 
     /* MSI/MSI-X Enabling/Disabling */
@@ -1239,10 +1259,13 @@ static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos, Error **errp)
     int ret, entries;
     Error *err = NULL;
 
-    if (pread(vdev->vbasedev.fd, &ctrl, sizeof(ctrl),
-              vdev->config_offset + pos + PCI_CAP_FLAGS) != sizeof(ctrl)) {
-        error_setg_errno(errp, errno, "failed reading MSI PCI_CAP_FLAGS");
-        return -errno;
+    ret = VDEV_REGION_READ(&vdev->vbasedev, VFIO_PCI_CONFIG_REGION_INDEX,
+                           pos + PCI_CAP_FLAGS, sizeof(ctrl), &ctrl);
+    if (ret != sizeof(ctrl)) {
+        const char *err = ret < 0 ? strerror(-ret) : "short read";
+
+        error_setg(errp, "failed reading MSI PCI_CAP_FLAGS %s", err);
+        return ret;
     }
     ctrl = le16_to_cpu(ctrl);
 
@@ -1444,33 +1467,40 @@ static void vfio_pci_relocate_msix(VFIOPCIDevice *vdev, Error **errp)
  */
 static void vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp)
 {
+    VFIODevice *vbasedev = &vdev->vbasedev;
     uint8_t pos;
     uint16_t ctrl;
     uint32_t table, pba;
-    int fd = vdev->vbasedev.fd;
     VFIOMSIXInfo *msix;
+    int ret;
 
     pos = pci_find_capability(&vdev->pdev, PCI_CAP_ID_MSIX);
     if (!pos) {
         return;
     }
 
-    if (pread(fd, &ctrl, sizeof(ctrl),
-              vdev->config_offset + pos + PCI_MSIX_FLAGS) != sizeof(ctrl)) {
-        error_setg_errno(errp, errno, "failed to read PCI MSIX FLAGS");
-        return;
+    ret = VDEV_REGION_READ(vbasedev, VFIO_PCI_CONFIG_REGION_INDEX,
+                           pos + PCI_MSIX_FLAGS, sizeof(ctrl), &ctrl);
+    if (ret != sizeof(ctrl)) {
+        const char *err = ret < 0 ? strerror(-ret) : "short read";
+
+        error_setg(errp, "failed to read PCI MSIX FLAGS %s", err);
     }
 
-    if (pread(fd, &table, sizeof(table),
-              vdev->config_offset + pos + PCI_MSIX_TABLE) != sizeof(table)) {
-        error_setg_errno(errp, errno, "failed to read PCI MSIX TABLE");
-        return;
+    ret = VDEV_REGION_READ(vbasedev, VFIO_PCI_CONFIG_REGION_INDEX,
+                           pos + PCI_MSIX_TABLE, sizeof(table), &table);
+    if (ret != sizeof(table)) {
+        const char *err = ret < 0 ? strerror(-ret) : "short read";
+
+        error_setg(errp, "failed to read PCI MSIX TABLE %s", err);
     }
 
-    if (pread(fd, &pba, sizeof(pba),
-              vdev->config_offset + pos + PCI_MSIX_PBA) != sizeof(pba)) {
-        error_setg_errno(errp, errno, "failed to read PCI MSIX PBA");
-        return;
+    ret = VDEV_REGION_READ(vbasedev, VFIO_PCI_CONFIG_REGION_INDEX,
+                           pos + PCI_MSIX_PBA, sizeof(pba), &pba);
+    if (ret != sizeof(pba)) {
+        const char *err = ret < 0 ? strerror(-ret) : "short read";
+
+        error_setg(errp, "failed to read PCI MSIX PBA %s", err);
     }
 
     ctrl = le16_to_cpu(ctrl);
@@ -1608,7 +1638,6 @@ static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled)
 static void vfio_bar_prepare(VFIOPCIDevice *vdev, int nr)
 {
     VFIOBAR *bar = &vdev->bars[nr];
-
     uint32_t pci_bar;
     int ret;
 
@@ -1618,10 +1647,13 @@ static void vfio_bar_prepare(VFIOPCIDevice *vdev, int nr)
     }
 
     /* Determine what type of BAR this is for registration */
-    ret = pread(vdev->vbasedev.fd, &pci_bar, sizeof(pci_bar),
-                vdev->config_offset + PCI_BASE_ADDRESS_0 + (4 * nr));
+    ret = VDEV_REGION_READ(&vdev->vbasedev, VFIO_PCI_CONFIG_REGION_INDEX,
+                           PCI_BASE_ADDRESS_0 + (4 * nr),
+                           sizeof(pci_bar), &pci_bar);
     if (ret != sizeof(pci_bar)) {
-        error_report("vfio: Failed to read BAR %d (%m)", nr);
+        const char *err = ret < 0 ? strerror(-ret) : "short read";
+
+        error_report("vfio: Failed to read BAR %d (%s)", nr, err);
         return;
     }
 
@@ -2169,8 +2201,9 @@ static void vfio_pci_pre_reset(VFIOPCIDevice *vdev)
 
 static void vfio_pci_post_reset(VFIOPCIDevice *vdev)
 {
+    VFIODevice *vbasedev = &vdev->vbasedev;
     Error *err = NULL;
-    int nr;
+    int ret, nr;
 
     vfio_intx_enable(vdev, &err);
     if (err) {
@@ -2178,13 +2211,17 @@ static void vfio_pci_post_reset(VFIOPCIDevice *vdev)
     }
 
     for (nr = 0; nr < PCI_NUM_REGIONS - 1; ++nr) {
-        off_t addr = vdev->config_offset + PCI_BASE_ADDRESS_0 + (4 * nr);
+        off_t addr = PCI_BASE_ADDRESS_0 + (4 * nr);
         uint32_t val = 0;
         uint32_t len = sizeof(val);
 
-        if (pwrite(vdev->vbasedev.fd, &val, len, addr) != len) {
-            error_report("%s(%s) reset bar %d failed: %m", __func__,
-                         vdev->vbasedev.name, nr);
+        ret = VDEV_REGION_WRITE(vbasedev, VFIO_PCI_CONFIG_REGION_INDEX, addr,
+                                len, &val, false);
+        if (ret != len) {
+            const char *err = ret < 0 ? strerror(-ret) : "short write";
+
+            error_report("%s(%s) reset bar %d failed: %s", __func__,
+                         vbasedev->name, nr, err);
         }
     }
 
@@ -2619,7 +2656,7 @@ static void vfio_populate_device(VFIOPCIDevice *vdev, Error **errp)
 
     irq_info.index = VFIO_PCI_ERR_IRQ_INDEX;
 
-    ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info);
+    ret = VDEV_GET_IRQ_INFO(vbasedev, &irq_info);
     if (ret) {
         /* This can fail for an old kernel or legacy PCI dev */
         trace_vfio_populate_device_get_irq_info_failure(strerror(errno));
@@ -2738,8 +2775,10 @@ static void vfio_register_req_notifier(VFIOPCIDevice *vdev)
         return;
     }
 
-    if (ioctl(vdev->vbasedev.fd,
-              VFIO_DEVICE_GET_IRQ_INFO, &irq_info) < 0 || irq_info.count < 1) {
+    if (VDEV_GET_IRQ_INFO(&vdev->vbasedev, &irq_info) < 0) {
+        return;
+    }
+    if (irq_info.count < 1) {
         return;
     }
 
@@ -2817,6 +2856,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     vdev->vbasedev.ops = &vfio_pci_ops;
     vdev->vbasedev.type = VFIO_DEVICE_TYPE_PCI;
     vdev->vbasedev.dev = DEVICE(vdev);
+    vdev->vbasedev.io_ops = &vfio_dev_io_ioctl;
 
     tmp = g_strdup_printf("%s/iommu_group", vdev->vbasedev.sysfsdev);
     len = readlink(tmp, group_path, sizeof(group_path));

From patchwork Tue Nov 9 00:46:33 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12609245
mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:35506 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mkFMo-0003cG-GZ for qemu-devel@archiver.kernel.org; Mon, 08 Nov 2021 19:52:02 -0500 Received: from eggs.gnu.org ([209.51.188.92]:51714) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAf-0005gh-TR for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:29 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]:38306) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAX-00046U-48 for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:29 -0500 Received: from pps.filterd (m0246629.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A90AULi025572 for ; Tue, 9 Nov 2021 00:39:18 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; bh=oiTaCiXWI+82kZlxU01Semjd1b8jMpCvjIjXOp7FoGc=; b=QwyQiDbBXM6lRKPfc8fPq2mu2LpLNZbOpBJe/T06xRrOgDmMhBSPRTRfSVo5rv4y4Ope G0SPDHuaxd4UUOC0qPPaK4Wt+6eF/PvDfPiTXydnT+FvH9bEODDvJkqpVe6nGXmIT9Vx A8UhUH1BERZTJkCPVWnvlSPM5lzEO+vuya55ovzcT3jakeBFohWDxARedOA7cjouJ+5b w0Zjn5pTNUT2QRP6ikyNRP7VnnysxQnr9rDb6NQR1bsFIWyyQcyyemVYm5S+CN5SPnju xSGPjdiZFje8tad+E3b/9epWiWb6dFc89h42Tzkw+Gur7uYgLNU2QcYw26l76JboEX0L 1A== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by mx0b-00069f02.pphosted.com with ESMTP id 3c6t7077j7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 00:39:17 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 1A90Zxm1132637 for ; Tue, 9 Nov 2021 00:39:16 
GMT Received: from nam10-bn7-obe.outbound.protection.outlook.com (mail-bn7nam10lp2100.outbound.protection.outlook.com [104.47.70.100]) by aserp3030.oracle.com with ESMTP id 3c5frd6sqb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 00:39:16 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=gSb9G8t+yUhPfoc9v6ZoM0K5sXZ7L2gEZBvTr7+2hvGwexKhVa+2KDt1O83H/3xSYrSKoEwU+oUlraGYt7VEzVssTm6r1/oLSmST9SVcEgAZsgZP9QVNh30iblZBDnJoJ8b8nCX62Pd+96UUTQjHkzawkbhaFgWk76tjzuwi4L6jMuIShmQ5FlBkqbGer0iUobURBim5AZor7zkyw8SxcJpwQ9eGnSN/GlTVXAVzPIpFScFTgdMTWYAVuCaSnEc76aoeDSw2QB44GyaTSP2bP7Lx8vLk8zEgod6Rt/7MmdatPhf2Z0eceFgFzLTF4T9k7ksuXQ7qXA/X5B9eROZaLA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=oiTaCiXWI+82kZlxU01Semjd1b8jMpCvjIjXOp7FoGc=; b=Rw5gwcA+yN5uwhqpdnn2AywIk1K29gFM72z6n1lFAqT0KLu6E/1BDg1jeqRYTUbumRSd4wFBPc1nOXo59RirD5SkaCg5xbS6qA0ycqY2Y3fmCe55M5Y3NtyitBMhOsj+DzpBOUWgGR0I6BI3P7GmOmaiqwWQp4jiOXAmuIS0WwZgRVA5qiCfgo1W67NaEqVfDDkUpbFwvPSFzhd+LI7/H9jmiqN8ApwHAhYAtRr/+vZTr8UqNWdzBLjISFo4/poOYBv1orQEoVlQBh80lB0Jx0LSjnAu5gDImtNpwVzbj1JXOUvG8YLhwD9rZ8go24luDq2CRMhmJ+nQnSspTp6t3g== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=oiTaCiXWI+82kZlxU01Semjd1b8jMpCvjIjXOp7FoGc=; b=HQN/fd+V2zaeJRQCwytpix+uO8YdN+2uKs4R92iaDv2ekwZXHnxHx8DhgyWpcQbc+iFJcY6haLH1zemUEvOevivOjkKLRsgccdRh0/fnMVo+y0obY55GKBQa+YI97yKikyxveP5i94lyF1zg2aMhV+b1Xuetk4XQZ1Cuo4chygA= 
Authentication-Results: nongnu.org; dkim=none (message not signed) header.d=none;nongnu.org; dmarc=none action=none header.from=oracle.com; Received: from SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) by BY5PR10MB4068.namprd10.prod.outlook.com (2603:10b6:a03:1b2::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11; Tue, 9 Nov 2021 00:39:11 +0000 Received: from SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093]) by SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093%7]) with mapi id 15.20.4669.016; Tue, 9 Nov 2021 00:39:11 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v3 05/19] Add validation ops vector Date: Mon, 8 Nov 2021 16:46:33 -0800 Message-Id: <327df73b51de7a11657aea61295d735fdd0427fb.1636057885.git.john.g.johnson@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) To SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) MIME-Version: 1.0 Received: from bruckner.us.oracle.com (73.71.20.66) by SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11 via Frontend Transport; Tue, 9 Nov 2021 00:39:11 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 63c77cc0-dbb4-4e45-3cbb-08d9a3195923 X-MS-TrafficTypeDiagnostic: BY5PR10MB4068: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:8882; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: zW2DYBjPONCrmXwyF4t5H87JCT8To8hf+ZUz1FltJlRF+CnfsIxZw/u5fyF2I+mpDrSJWftmxOfsJB5W+Xs8sg+kmtIjc2Ijf85igMJ9cfQ= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR10MB4068 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10162 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 mlxscore=0 suspectscore=0 bulkscore=0 spamscore=0 phishscore=0 malwarescore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111090001 X-Proofpoint-GUID: Hepw9OuyqnTmFq0cwipdenzR7VuCZLoh X-Proofpoint-ORIG-GUID: Hepw9OuyqnTmFq0cwipdenzR7VuCZLoh Received-SPF: pass client-ip=205.220.165.32; envelope-from=john.g.johnson@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MSGID_FROM_MTA_HEADER=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Add an ops vector of validation routines for cases where return values aren't fully trusted. This is prep work for vfio-user, where values returned by the remote process aren't trusted. Signed-off-by: John G Johnson --- include/hw/vfio/vfio-common.h | 21 ++++++++++++++ hw/vfio/pci.c | 67 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 88 insertions(+) diff --git a/include/hw/vfio/vfio-common.h 
b/include/hw/vfio/vfio-common.h index 43fa948..c0dbbfb 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -125,6 +125,7 @@ typedef struct VFIOHostDMAWindow { typedef struct VFIODeviceOps VFIODeviceOps; typedef struct VFIODevIO VFIODevIO; +typedef struct VFIOValidOps VFIOValidOps; typedef struct VFIODevice { QLIST_ENTRY(VFIODevice) next; @@ -141,6 +142,7 @@ typedef struct VFIODevice { bool enable_migration; VFIODeviceOps *ops; VFIODevIO *io_ops; + VFIOValidOps *valid_ops; unsigned int num_irqs; unsigned int num_regions; unsigned int flags; @@ -214,6 +216,25 @@ struct VFIOContIO { extern VFIODevIO vfio_dev_io_ioctl; extern VFIOContIO vfio_cont_io_ioctl; +/* + * This ops vector allows for bus-specific verification + * routines in cases where the server may not be fully + * trusted. + */ +struct VFIOValidOps { + int (*validate_get_info)(VFIODevice *vdev, struct vfio_device_info *info); + int (*validate_get_region_info)(VFIODevice *vdev, + struct vfio_region_info *info, int *fd); + int (*validate_get_irq_info)(VFIODevice *vdev, struct vfio_irq_info *info); +}; + +#define VDEV_VALID_INFO(vdev, info) \ + ((vdev)->valid_ops->validate_get_info((vdev), (info))) +#define VDEV_VALID_REGION_INFO(vdev, info, fd) \ + ((vdev)->valid_ops->validate_get_region_info((vdev), (info), (fd))) +#define VDEV_VALID_IRQ_INFO(vdev, irq) \ + ((vdev)->valid_ops->validate_get_irq_info((vdev), (irq))) + #endif /* CONFIG_LINUX */ typedef struct VFIOGroup { diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 28f21f8..6e2ce35 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3371,3 +3371,70 @@ static void register_vfio_pci_dev_type(void) } type_init(register_vfio_pci_dev_type) + + +/* + * PCI validation ops - used when return values need + * validation before use + */ + +static int vfio_pci_valid_info(VFIODevice *vbasedev, + struct vfio_device_info *info) +{ + /* must be PCI */ + if ((info->flags & VFIO_DEVICE_FLAGS_PCI) == 0) { + return -EINVAL; + } + /* only other valid 
flag is reset */ + if (info->flags & ~(VFIO_DEVICE_FLAGS_PCI | VFIO_DEVICE_FLAGS_RESET)) { + return -EINVAL; + } + /* account for extra migration region */ + if (info->num_regions > VFIO_PCI_NUM_REGIONS + 1) { + return -EINVAL; + } + if (info->num_irqs > VFIO_PCI_NUM_IRQS) { + return -EINVAL; + } + return 0; +} + +static int vfio_pci_valid_region_info(VFIODevice *vbasedev, + struct vfio_region_info *info, + int *fd) +{ + if (info->flags & ~(VFIO_REGION_INFO_FLAG_READ | + VFIO_REGION_INFO_FLAG_WRITE | + VFIO_REGION_INFO_FLAG_MMAP | + VFIO_REGION_INFO_FLAG_CAPS)) { + return -EINVAL; + } + if (info->index >= vbasedev->num_regions) { + return -EINVAL; + } + /* cap_offset in valid area */ + if ((info->flags & VFIO_REGION_INFO_FLAG_CAPS) && + (info->cap_offset < sizeof(*info) || info->cap_offset > info->argsz)) { + return -EINVAL; + } + return 0; +} + +static int vfio_pci_valid_irq_info(VFIODevice *vbasedev, + struct vfio_irq_info *info) +{ + if (info->flags & ~(VFIO_IRQ_INFO_EVENTFD | VFIO_IRQ_INFO_MASKABLE | + VFIO_IRQ_INFO_AUTOMASKED | VFIO_IRQ_INFO_NORESIZE)) { + return -EINVAL; + } + if (info->index >= vbasedev->num_irqs) { + return -EINVAL; + } + return 0; +} + +struct VFIOValidOps vfio_pci_valid_ops = { + .validate_get_info = vfio_pci_valid_info, + .validate_get_region_info = vfio_pci_valid_region_info, + .validate_get_irq_info = vfio_pci_valid_irq_info, +}; From patchwork Tue Nov 9 00:46:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12609251 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C202FC433EF for ; Tue, 9 Nov 2021 00:54:26 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client 
certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 22FA3611BD for ; Tue, 9 Nov 2021 00:54:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 22FA3611BD Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:43758 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mkFP7-0000lC-85 for qemu-devel@archiver.kernel.org; Mon, 08 Nov 2021 19:54:25 -0500 Received: from eggs.gnu.org ([209.51.188.92]:51604) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAc-0005es-HU for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:27 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:40300) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAX-00046W-0Q for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:23 -0500 Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A901USR001194 for ; Tue, 9 Nov 2021 00:39:18 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; bh=YBaA2VeIu4CgbO7yOsCjXGViD8KNexdoMr5cVFzIqy4=; b=WMoEPA0m5y/zSOAmkOXMs1OgFeUYFXRSa1HHdgWM5npNRIeI/q9Dg+u448/rKHo5W0MC oKJMem3spwee+tZKtOq+QC+b05qK21BzLgXtkvgxety+WbPGSslCGnELZOoyPXA1PMjh QV/kxhQGyEPupW4i0Hk9pWe9dvWQNuS28CEajvCkFx3BKN09TzX2dlpEwAWEcPjk8acQ xwAOiXPmzSr4E/5de+V7vZ/zd8v71CLPQmhimXZQQPZ/OvvOVcfoDxboYUSJr5iTujD5 XNUzapxuWLcK7RTnXpaJaS411MVuAkXqAQ1VULe7LbNanmIFj/6V/O5lXa17CkPjzago gA== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by mx0b-00069f02.pphosted.com with ESMTP id 3c6uh4fnp0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 
verify=OK) for ; Tue, 09 Nov 2021 00:39:17 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 1A90ZLN2129193 for ; Tue, 9 Nov 2021 00:39:16 GMT Received: from nam10-bn7-obe.outbound.protection.outlook.com (mail-bn7nam10lp2104.outbound.protection.outlook.com [104.47.70.104]) by userp3030.oracle.com with ESMTP id 3c5etuvb6n-5 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 00:39:16 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=BRw4pClL1DyNb4FBttJ5dxhSNrYRzjIO1gHP3ljmXRxmretks3oiFzUzaeC1okp+xr48ahYwITG1PritlwoJT77y4pBgQ4X1uCMtBVHsn3vkRdTOHwNhfylhbisqRrM6r9ho42fXWNOH7UOTNP5EHjUfJCEYcGoTvhtgw23YRH7xE1f5DAY2HQCyg9WlFq2Jk5FmehRrmhYsl0K5nYiVZWb2DrQsY5LHZQx8vmTWQ38SpFpIB9vCxZuMkawsOXcnBna3YjsilQfHedWEM3wff+wUorU/nWlDImBBlLyncOZbF9Fud0ervrig+c5pI4+2rwI7PhCT5jv0nyXCCh8PLg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=YBaA2VeIu4CgbO7yOsCjXGViD8KNexdoMr5cVFzIqy4=; b=oObzeBKA6GW947yL3DXSymQSm9TvJUtOVdWWR6DiIzKajIR/XZCZJ/PxED/lvuFLuBMd/XX2ZrtVsOy+PNiK/b+9LZBjAPn4cB+dCs98S8hiD9rD2gO6whRhy6KL/98UuR2hZm1A2cvEwd/g16NrJF0f+l4VcgfQ3UBApXY1FanWuMi5nEDAy0VkAZB7fjTaDjPa8+Jj2T+jhzMLEFlDWypVEVyx8xPIqoTkYfZTlOxEgiUIiShAVoCdP+Z4CT/FtRdxiW+zZpOvlFoH3tNpDdsu7Xh+CCqv3vTWs0tY9pWW9zc7hQ+anUTid4BZ4GDizoIQy99ihXuxOH3tEWShZw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; 
bh=YBaA2VeIu4CgbO7yOsCjXGViD8KNexdoMr5cVFzIqy4=; b=vGwMNbIvkjOOdrji0/jWDvgL/bZffE0+OW2zIulUoD9/cvTDpv1QHGHLf/NWmBLCw8hr5AHOVU+rWM+Xgbra3rngzF3b3AZiTsfCG69nsRteZwVgfT+S41ovLGgCpgvM7GAvAEpYH8wTUwWhXEiWVetqn0FVL8x/dFb+qxhDpwc= Authentication-Results: nongnu.org; dkim=none (message not signed) header.d=none;nongnu.org; dmarc=none action=none header.from=oracle.com; Received: from SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) by BY5PR10MB4068.namprd10.prod.outlook.com (2603:10b6:a03:1b2::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11; Tue, 9 Nov 2021 00:39:12 +0000 Received: from SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093]) by SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093%7]) with mapi id 15.20.4669.016; Tue, 9 Nov 2021 00:39:12 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v3 06/19] vfio-user: Define type vfio_user_pci_dev_info Date: Mon, 8 Nov 2021 16:46:34 -0800 Message-Id: <28d95a317e70c418dc054a59db307d9c49411ca6.1636057885.git.john.g.johnson@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) To SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) MIME-Version: 1.0 Received: from bruckner.us.oracle.com (73.71.20.66) by SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11 via Frontend Transport; Tue, 9 Nov 2021 00:39:11 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 6f96f138-533e-4bbe-37c0-08d9a3195949 X-MS-TrafficTypeDiagnostic: BY5PR10MB4068: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:1850; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: prxnfs/1NzQ9rPiRDcNgcCXC0Wnc+FTLsqCpocQqhbvFnw6WuAqGTAeH+oCZu5yIbMgxNoyux1sro/drS9R3E09wEWg+tWP0KXIjutruxwY= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR10MB4068 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10162 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 adultscore=0 mlxlogscore=999 malwarescore=0 phishscore=0 mlxscore=0 suspectscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111090001 X-Proofpoint-ORIG-GUID: mTMV4-pe7WtYv2wmsmOMAAwfbFZffKKf X-Proofpoint-GUID: mTMV4-pe7WtYv2wmsmOMAAwfbFZffKKf Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MSGID_FROM_MTA_HEADER=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Add a new class for vfio-user, with its class and instance constructors and destructors and its PCI ops. 
Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- hw/vfio/pci.h | 9 ++++++ hw/vfio/pci.c | 97 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ hw/vfio/Kconfig | 10 ++++++ 3 files changed, 116 insertions(+) diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h index bbc78aa..08ac647 100644 --- a/hw/vfio/pci.h +++ b/hw/vfio/pci.h @@ -187,6 +187,15 @@ struct VFIOKernPCIDevice { VFIOPCIDevice device; }; +#define TYPE_VFIO_USER_PCI "vfio-user-pci" +OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI) + +struct VFIOUserPCIDevice { + VFIOPCIDevice device; + char *sock_name; + bool secure_dma; /* disable shared mem for DMA */ +}; + /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */ static inline bool vfio_pci_is(VFIOPCIDevice *vdev, uint32_t vendor, uint32_t device) { diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 6e2ce35..fa3e028 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -19,6 +19,7 @@ */ #include "qemu/osdep.h" +#include CONFIG_DEVICES #include #include @@ -3438,3 +3439,99 @@ struct VFIOValidOps vfio_pci_valid_ops = { .validate_get_region_info = vfio_pci_valid_region_info, .validate_get_irq_info = vfio_pci_valid_irq_info, }; + + +#ifdef CONFIG_VFIO_USER_PCI + +/* + * vfio-user routines. 
+ */ + +/* + * Emulated devices don't use host hot reset + */ +static int vfio_user_pci_no_reset(VFIODevice *vbasedev) +{ + error_printf("vfio-user - no hot reset\n"); + return 0; +} + +static void vfio_user_pci_not_needed(VFIODevice *vbasedev) +{ + vbasedev->needs_reset = false; +} + +static VFIODeviceOps vfio_user_pci_ops = { + .vfio_compute_needs_reset = vfio_user_pci_not_needed, + .vfio_hot_reset_multi = vfio_user_pci_no_reset, + .vfio_eoi = vfio_intx_eoi, + .vfio_get_object = vfio_pci_get_object, + .vfio_save_config = vfio_pci_save_config, + .vfio_load_config = vfio_pci_load_config, +}; + +static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) +{ + ERRP_GUARD(); + VFIOUserPCIDevice *udev = VFIO_USER_PCI(pdev); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); + VFIODevice *vbasedev = &vdev->vbasedev; + + /* + * TODO: make option parser understand SocketAddress + * and use that instead of having scalar options + * for each socket type. + */ + if (!udev->sock_name) { + error_setg(errp, "No socket specified"); + error_append_hint(errp, "Use -device vfio-user-pci,socket=\n"); + return; + } + + vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name); + vbasedev->dev = DEVICE(vdev); + vbasedev->fd = -1; + vbasedev->type = VFIO_DEVICE_TYPE_PCI; + vbasedev->no_mmap = false; + vbasedev->ops = &vfio_user_pci_ops; + vbasedev->valid_ops = &vfio_pci_valid_ops; + +} + +static void vfio_user_instance_finalize(Object *obj) +{ +} + +static Property vfio_user_pci_dev_properties[] = { + DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name), + DEFINE_PROP_BOOL("secure-dma", VFIOUserPCIDevice, secure_dma, false), + DEFINE_PROP_END_OF_LIST(), +}; + +static void vfio_user_pci_dev_class_init(ObjectClass *klass, void *data) +{ + DeviceClass *dc = DEVICE_CLASS(klass); + PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass); + + device_class_set_props(dc, vfio_user_pci_dev_properties); + dc->desc = "VFIO over socket PCI device assignment"; + pdc->realize = 
vfio_user_pci_realize; +} + +static const TypeInfo vfio_user_pci_dev_info = { + .name = TYPE_VFIO_USER_PCI, + .parent = TYPE_VFIO_PCI_BASE, + .instance_size = sizeof(VFIOUserPCIDevice), + .class_init = vfio_user_pci_dev_class_init, + .instance_init = vfio_instance_init, + .instance_finalize = vfio_user_instance_finalize, +}; + +static void register_vfio_user_dev_type(void) +{ + type_register_static(&vfio_user_pci_dev_info); +} + +type_init(register_vfio_user_dev_type) + +#endif /* CONFIG_VFIO_USER_PCI */ diff --git a/hw/vfio/Kconfig b/hw/vfio/Kconfig index 7cdba05..301894e 100644 --- a/hw/vfio/Kconfig +++ b/hw/vfio/Kconfig @@ -2,6 +2,10 @@ config VFIO bool depends on LINUX +config VFIO_USER + bool + depends on VFIO + config VFIO_PCI bool default y @@ -9,6 +13,12 @@ config VFIO_PCI select EDID depends on LINUX && PCI +config VFIO_USER_PCI + bool + default y + select VFIO_USER + depends on VFIO_PCI + config VFIO_CCW bool default y From patchwork Tue Nov 9 00:46:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12609243 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9CCBBC433F5 for ; Tue, 9 Nov 2021 00:52:02 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 12AF061177 for ; Tue, 9 Nov 2021 00:52:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 12AF061177 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:35334 helo=lists1p.gnu.org) by lists.gnu.org with esmtp 
(Exim 4.90_1) (envelope-from ) id 1mkFMn-0003W7-5E for qemu-devel@archiver.kernel.org; Mon, 08 Nov 2021 19:52:01 -0500 Received: from eggs.gnu.org ([209.51.188.92]:51608) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAc-0005et-IO for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:27 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]:39382) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAW-00046d-W6 for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:25 -0500 Received: from pps.filterd (m0246629.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A90AULj025572 for ; Tue, 9 Nov 2021 00:39:19 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : content-transfer-encoding : mime-version; s=corp-2021-07-09; bh=K12TEb/vvk4gViAjps6lfvJZSgQ/yKCB2OvT6YdnazU=; b=KOkgNxx7Bh/wHlsdIqnluZpSkxYzjh8Na9oH73lQVczj2bXCWt5ip0dAEOHGG2L8pzM4 684TSTXL4titjaC6JM6aEx/kuzZ4fmXH7UeXQcrvAX/lXpNHYs4hRh2jIFfnFirWzJio eoIbECRN5pSTepJJrdyUf0LV/aphf+eM291F/A93JtMzMQbAGQ1wsmqXu4pW8h02z8V4 sjUAG7MZ5bh7FogVCbdDIayCgGfZziqiO2GKFw17hf0m2GZMnOma0/qxJVOrEF7fSv+t 56aiIuKPkcIjEpZC69zFVGpCsNW3fIeMrRriWlEgsJPTYw4mNJDySxBr5eHAg6xQwRP9 uQ== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by mx0b-00069f02.pphosted.com with ESMTP id 3c6t7077j8-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 00:39:18 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 1A90ZLN3129193 for ; Tue, 9 Nov 2021 00:39:17 GMT Received: from nam10-bn7-obe.outbound.protection.outlook.com (mail-bn7nam10lp2104.outbound.protection.outlook.com [104.47.70.104]) by userp3030.oracle.com with ESMTP id 3c5etuvb6n-6 (version=TLSv1.2 
cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 00:39:17 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=lesQHX5Kv6Iqmk2Oz8X2gSC32gjHMQdwtKc4KGc2GImYlt/tkLWmhwDSEq415ZPnPKk9yzQvYmocHAaIVAR5ADvFb8LdfBGRuvOH3p70Tprf7Df2kmSn8vSxewJFuL6tDgW+ztkoPn54JHX07nTT4NZwCrE9Tjlz9KnA+3RKyi0EBfCPqspjfRA7YXnhYgjxRHGwxOqpKPfkhCYyeUFJR2s9vZXXFbXciGca+mKpGEQxdlfK40RvvCGXpwzmGQOqRagCWoog7wQyyGdhSSej1wmp92ofBhnYz2SaiBf4AKo9pSLdJzGLXyChQ+AjNkMFMbjaJ5pyH0l1x/SC30QgdA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=K12TEb/vvk4gViAjps6lfvJZSgQ/yKCB2OvT6YdnazU=; b=dgnUtqiFsiqfVlptw78C4uYXCIYERsNfWhv34wc92YxYV0gdGoUy4rU2i2UB6OKFtu3NXSo09llze15Q+b/pzYXp+RGPwmV6HEFqNKEhPjlSiaXnR6HFAnnObQO6WKBFv1ScmQ4zguAFfbuOMmX4p3yRTXvMqnHuc3HB1yg/WQvaKyptCvrFBIfDRGwuTSVKDcsrP8A6lAE5DwSdyZeKqbec0RbdtiKE6IqqBXyt99EGx9kwT142VRyxhxaO61IdRujlx108v4kb78AK7luHDhXGWU3hbuLcR+FjIBG3m9B4uHbn2zqriOgeMQlMYQhfukYCk6D8yAFPAHCpJ4cOqQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=K12TEb/vvk4gViAjps6lfvJZSgQ/yKCB2OvT6YdnazU=; b=GYvhDN3brPLA5UX4u/62Kwr+9UQKOMrX0MmW+FWw3BZGnFtGLNCkCqmAtU9VOCvnFiU6dHd3lE91B2vtEdXENG8h+8hTnnwzTYDyKG7BxQ2u4YxmogafuhWiw0mxXbMdiijPZwcDixw2CAY9gZ1efTW2rieSDV19dWdVtbZ6CXY= Authentication-Results: nongnu.org; dkim=none (message not signed) header.d=none;nongnu.org; dmarc=none action=none header.from=oracle.com; Received: from SJ0PR10MB4686.namprd10.prod.outlook.com 
(2603:10b6:a03:2d7::23) by BY5PR10MB4068.namprd10.prod.outlook.com (2603:10b6:a03:1b2::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11; Tue, 9 Nov 2021 00:39:14 +0000 Received: from SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093]) by SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093%7]) with mapi id 15.20.4669.016; Tue, 9 Nov 2021 00:39:12 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v3 07/19] vfio-user: connect vfio proxy to remote server Date: Mon, 8 Nov 2021 16:46:35 -0800 Message-Id: <69d83c41ca7fe9b010f73dc15fe6a7783fce5620.1636057885.git.john.g.johnson@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) To SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) MIME-Version: 1.0 Received: from bruckner.us.oracle.com (73.71.20.66) by SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11 via Frontend Transport; Tue, 9 Nov 2021 00:39:11 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 8ac0818b-d298-4178-18b1-08d9a319596b X-MS-TrafficTypeDiagnostic: BY5PR10MB4068: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:114; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10162 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 adultscore=0 mlxlogscore=999 malwarescore=0 phishscore=0 mlxscore=0 suspectscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111090001 X-Proofpoint-GUID: gof2oRfWR-sQiv78-1JqdardyCfb0fSW X-Proofpoint-ORIG-GUID: gof2oRfWR-sQiv78-1JqdardyCfb0fSW Received-SPF: pass client-ip=205.220.165.32; envelope-from=john.g.johnson@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MSGID_FROM_MTA_HEADER=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: John G Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman --- hw/vfio/user.h | 78 +++++++++++++++++++ include/hw/vfio/vfio-common.h | 2 + hw/vfio/pci.c | 20 +++++ hw/vfio/user.c | 170 ++++++++++++++++++++++++++++++++++++++++++ MAINTAINERS | 4 + hw/vfio/meson.build | 1 + 6 files changed, 275 insertions(+) create mode 100644 hw/vfio/user.h create mode 100644 hw/vfio/user.c diff --git a/hw/vfio/user.h b/hw/vfio/user.h new file mode 100644 index 0000000..301ef6a --- /dev/null +++ b/hw/vfio/user.h @@ -0,0 +1,78 @@ +#ifndef VFIO_USER_H +#define VFIO_USER_H + +/* + * vfio protocol over a UNIX socket. + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2. 
See + * the COPYING file in the top-level directory. + * + */ + +typedef struct { + int send_fds; + int recv_fds; + int *fds; +} VFIOUserFDs; + +enum msg_type { + VFIO_MSG_NONE, + VFIO_MSG_ASYNC, + VFIO_MSG_WAIT, + VFIO_MSG_NOWAIT, + VFIO_MSG_REQ, +}; + +typedef struct VFIOUserMsg { + QTAILQ_ENTRY(VFIOUserMsg) next; + VFIOUserFDs *fds; + uint32_t rsize; + uint32_t id; + QemuCond cv; + bool complete; + enum msg_type type; +} VFIOUserMsg; + + +enum proxy_state { + VFIO_PROXY_CONNECTED = 1, + VFIO_PROXY_ERROR = 2, + VFIO_PROXY_CLOSING = 3, + VFIO_PROXY_CLOSED = 4, +}; + +typedef QTAILQ_HEAD(VFIOUserMsgQ, VFIOUserMsg) VFIOUserMsgQ; + +typedef struct VFIOProxy { + QLIST_ENTRY(VFIOProxy) next; + char *sockname; + struct QIOChannel *ioc; + void (*request)(void *opaque, VFIOUserMsg *msg); + void *req_arg; + int flags; + QemuCond close_cv; + AioContext *ctx; + QEMUBH *req_bh; + + /* + * above only changed when BQL is held + * below are protected by per-proxy lock + */ + QemuMutex lock; + VFIOUserMsgQ free; + VFIOUserMsgQ pending; + VFIOUserMsgQ incoming; + VFIOUserMsgQ outgoing; + VFIOUserMsg *last_nowait; + enum proxy_state state; +} VFIOProxy; + +/* VFIOProxy flags */ +#define VFIO_PROXY_CLIENT 0x1 + +VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp); +void vfio_user_disconnect(VFIOProxy *proxy); + +#endif /* VFIO_USER_H */ diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index c0dbbfb..224dbf8 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -76,6 +76,7 @@ typedef struct VFIOAddressSpace { struct VFIOGroup; typedef struct VFIOContIO VFIOContIO; +typedef struct VFIOProxy VFIOProxy; typedef struct VFIOContainer { VFIOAddressSpace *space; @@ -150,6 +151,7 @@ typedef struct VFIODevice { Error *migration_blocker; OnOffAuto pre_copy_dirty_page_tracking; struct vfio_region_info **regions; + VFIOProxy *proxy; } VFIODevice; struct VFIODeviceOps { diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 
fa3e028..ebfabb1 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -43,6 +43,7 @@ #include "qapi/error.h" #include "migration/blocker.h" #include "migration/qemu-file.h" +#include "hw/vfio/user.h" #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug" @@ -3476,6 +3477,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) VFIOUserPCIDevice *udev = VFIO_USER_PCI(pdev); VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); VFIODevice *vbasedev = &vdev->vbasedev; + SocketAddress addr; + VFIOProxy *proxy; + Error *err = NULL; /* * TODO: make option parser understand SocketAddress @@ -3488,6 +3492,16 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) return; } + memset(&addr, 0, sizeof(addr)); + addr.type = SOCKET_ADDRESS_TYPE_UNIX; + addr.u.q_unix.path = udev->sock_name; + proxy = vfio_user_connect_dev(&addr, &err); + if (!proxy) { + error_propagate_prepend(errp, err, "Remote proxy not found: "); + return; + } + vbasedev->proxy = proxy; + vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name); vbasedev->dev = DEVICE(vdev); vbasedev->fd = -1; @@ -3500,6 +3514,12 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) static void vfio_user_instance_finalize(Object *obj) { + VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj); + VFIODevice *vbasedev = &vdev->vbasedev; + + vfio_put_device(vdev); + + vfio_user_disconnect(vbasedev->proxy); } static Property vfio_user_pci_dev_properties[] = { diff --git a/hw/vfio/user.c b/hw/vfio/user.c new file mode 100644 index 0000000..92d4e03 --- /dev/null +++ b/hw/vfio/user.c @@ -0,0 +1,170 @@ +/* + * vfio protocol over a UNIX socket. + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory.
+ * + */ + +#include "qemu/osdep.h" +#include +#include + +#include "qemu/error-report.h" +#include "qapi/error.h" +#include "qemu/main-loop.h" +#include "hw/hw.h" +#include "hw/vfio/vfio-common.h" +#include "hw/vfio/vfio.h" +#include "qemu/sockets.h" +#include "io/channel.h" +#include "io/channel-socket.h" +#include "io/channel-util.h" +#include "sysemu/iothread.h" +#include "user.h" + +static IOThread *vfio_user_iothread; +static void vfio_user_shutdown(VFIOProxy *proxy); + + +/* + * Functions called by main, CPU, or iothread threads + */ + +static void vfio_user_shutdown(VFIOProxy *proxy) +{ + qio_channel_shutdown(proxy->ioc, QIO_CHANNEL_SHUTDOWN_READ, NULL); + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, NULL, NULL, NULL); +} + + +/* + * Functions only called by iothread + */ + +static void vfio_user_cb(void *opaque) +{ + VFIOProxy *proxy = opaque; + + QEMU_LOCK_GUARD(&proxy->lock); + + proxy->state = VFIO_PROXY_CLOSED; + qemu_cond_signal(&proxy->close_cv); +} + + +/* + * Functions called by main or CPU threads + */ + +static QLIST_HEAD(, VFIOProxy) vfio_user_sockets = + QLIST_HEAD_INITIALIZER(vfio_user_sockets); + +VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp) +{ + VFIOProxy *proxy; + QIOChannelSocket *sioc; + QIOChannel *ioc; + char *sockname; + + if (addr->type != SOCKET_ADDRESS_TYPE_UNIX) { + error_setg(errp, "vfio_user_connect - bad address family"); + return NULL; + } + sockname = addr->u.q_unix.path; + + sioc = qio_channel_socket_new(); + ioc = QIO_CHANNEL(sioc); + if (qio_channel_socket_connect_sync(sioc, addr, errp)) { + object_unref(OBJECT(ioc)); + return NULL; + } + qio_channel_set_blocking(ioc, false, NULL); + + proxy = g_malloc0(sizeof(VFIOProxy)); + proxy->sockname = g_strdup_printf("unix:%s", sockname); + proxy->ioc = ioc; + proxy->flags = VFIO_PROXY_CLIENT; + proxy->state = VFIO_PROXY_CONNECTED; + + qemu_mutex_init(&proxy->lock); + qemu_cond_init(&proxy->close_cv); + + if (vfio_user_iothread == NULL) { + 
vfio_user_iothread = iothread_create("VFIO user", errp); + } + + proxy->ctx = iothread_get_aio_context(vfio_user_iothread); + + QTAILQ_INIT(&proxy->outgoing); + QTAILQ_INIT(&proxy->incoming); + QTAILQ_INIT(&proxy->free); + QTAILQ_INIT(&proxy->pending); + QLIST_INSERT_HEAD(&vfio_user_sockets, proxy, next); + + return proxy; +} + +void vfio_user_disconnect(VFIOProxy *proxy) +{ + VFIOUserMsg *r1, *r2; + + qemu_mutex_lock(&proxy->lock); + + /* our side is quitting */ + if (proxy->state == VFIO_PROXY_CONNECTED) { + vfio_user_shutdown(proxy); + if (!QTAILQ_EMPTY(&proxy->pending)) { + error_printf("vfio_user_disconnect: outstanding requests\n"); + } + } + object_unref(OBJECT(proxy->ioc)); + proxy->ioc = NULL; + + proxy->state = VFIO_PROXY_CLOSING; + QTAILQ_FOREACH_SAFE(r1, &proxy->outgoing, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->outgoing, r1, next); + g_free(r1); + } + QTAILQ_FOREACH_SAFE(r1, &proxy->incoming, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->incoming, r1, next); + g_free(r1); + } + QTAILQ_FOREACH_SAFE(r1, &proxy->pending, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->pending, r1, next); + g_free(r1); + } + QTAILQ_FOREACH_SAFE(r1, &proxy->free, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->free, r1, next); + g_free(r1); + } + + /* + * Make sure the iothread isn't blocking anywhere + * with a ref to this proxy by waiting for a BH + * handler to run after the proxy fd handlers were + * deleted above.
+ */ + aio_bh_schedule_oneshot(proxy->ctx, vfio_user_cb, proxy); + qemu_cond_wait(&proxy->close_cv, &proxy->lock); + + /* we now hold the only ref to proxy */ + qemu_mutex_unlock(&proxy->lock); + qemu_cond_destroy(&proxy->close_cv); + qemu_mutex_destroy(&proxy->lock); + + QLIST_REMOVE(proxy, next); + if (QLIST_EMPTY(&vfio_user_sockets)) { + iothread_destroy(vfio_user_iothread); + vfio_user_iothread = NULL; + } + + g_free(proxy->sockname); + g_free(proxy); +} diff --git a/MAINTAINERS b/MAINTAINERS index d838b9e..f429bab 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1882,8 +1882,12 @@ L: qemu-s390x@nongnu.org vfio-user M: John G Johnson M: Thanos Makatos +M: Elena Ufimtseva +M: Jagannathan Raman S: Supported F: docs/devel/vfio-user.rst +F: hw/vfio/user.c +F: hw/vfio/user.h vhost M: Michael S. Tsirkin diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build index da9af29..2f86f72 100644 --- a/hw/vfio/meson.build +++ b/hw/vfio/meson.build @@ -9,6 +9,7 @@ vfio_ss.add(when: 'CONFIG_VFIO_PCI', if_true: files( 'pci-quirks.c', 'pci.c', )) +vfio_ss.add(when: 'CONFIG_VFIO_USER', if_true: files('user.c')) vfio_ss.add(when: 'CONFIG_VFIO_CCW', if_true: files('ccw.c')) vfio_ss.add(when: 'CONFIG_VFIO_PLATFORM', if_true: files('platform.c')) vfio_ss.add(when: 'CONFIG_VFIO_XGMAC', if_true: files('calxeda-xgmac.c')) From patchwork Tue Nov 9 00:46:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12609257 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5ACDEC433F5 for ; Tue, 9 Nov 2021 00:56:26 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with 
ESMTPS id CEF4261101 for ; Tue, 9 Nov 2021 00:56:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org CEF4261101 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:50274 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mkFR3-0005NE-14 for qemu-devel@archiver.kernel.org; Mon, 08 Nov 2021 19:56:25 -0500 Received: from eggs.gnu.org ([209.51.188.92]:51680) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAe-0005fw-Tt for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:29 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]:40398) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAX-00046u-0T for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:27 -0500 Received: from pps.filterd (m0246629.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A90AULk025572 for ; Tue, 9 Nov 2021 00:39:19 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : content-transfer-encoding : mime-version; s=corp-2021-07-09; bh=xTBOhpu1nBNdVqwUkCL0pXF+ZxexWuXAQHuTdxKwJmg=; b=W1Yi5gwG3+Vth7vTU2fSV+P1FBv7jDUjxUlos/vK0X6Qc25i2HI73Ey1vB25Jm31dhd4 xJE1ko9jttQX7Blc5MXK6rgJlZb6MWNI0paWJzfdLOn76KMwfEBKFprAjZIEmwB30r6u G2OARFwvIkfMP2JuC8I+E4oeGEwjrV6k1+BBx37VaozafauW+ggb85zW16Hy3nJQt3ni xpo6vEn39ZSvWkFGWgkzXHduG4MYZzYuU21A5TJ91st2OxMfdJUTkgZ+KOXPV1CX/2Fg mTy36H6HcfDnlyirN73eF6DGiaZItiy0ktxhaYeboTtzxJbNzZu50vKhoq5a+k9sk9OW Ag== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by mx0b-00069f02.pphosted.com with ESMTP id 3c6t7077j9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 
00:39:19 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 1A90ZLN4129193 for ; Tue, 9 Nov 2021 00:39:18 GMT Received: from nam10-bn7-obe.outbound.protection.outlook.com (mail-bn7nam10lp2104.outbound.protection.outlook.com [104.47.70.104]) by userp3030.oracle.com with ESMTP id 3c5etuvb6n-7 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 00:39:17 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=dHXNWbtOS43ox0HQgnVqwga7Ks8tQUBzqnvYO8xmcmCw38XBsne3NaF6zmnKX8DEdvPYX1YcIN8RKiPM+foXh4IXNSHIGrYUY+SF3f47jZGBUtdjABZkXp5Ax5VLrFun7wTuj0UwjuoUPEPxyWcZm/uaIy350AgOZbY+jBe8jeA2F8FXgTtK4g4yLdvpM50RoSAS5hlFGCMMor0JXMvlVUbAPhJicHXyf33yJHAJwqrAQpqpUy6FKihGM51TByTYMibYQFapzlfPzRTWGK3g73Nq6mvlph9LEh0tdmInQcOQwCX2B/iqXL55QHzjbOpApYw6JNmIi7uPg22AlnA/sw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=xTBOhpu1nBNdVqwUkCL0pXF+ZxexWuXAQHuTdxKwJmg=; b=OegplyZYWpLnfAGuxaga/VOgnMHmXDnBmsTLEvqpAhQFmp8R//JphPCFNaB/MUkpZPgSw0j2q8OYfZsjamDLJJsH/7H/f0zKIzYdxWWFyo3R31WOtwc0QyltDRRBW+jSLxgTp0jF8dSi2dC6pFmALulTW75FaNlRIXI8nU83xjGKDIIg/2j5+R8CdFeNszN5ZHvotUVN2IBtcZu3+6BeZ32bk7VIm6UTDTRAMm5SnqzS/v9EszAbFOtP1MN3vrhza+qsWm7Oy6zYO+IUF/9T4kCDRRfiyuSvc1G8gTSEFEW7fKCNSgDVYXEnPoT/KPvuib5DsndWrM2Z3kbiJUSGug== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=xTBOhpu1nBNdVqwUkCL0pXF+ZxexWuXAQHuTdxKwJmg=; 
b=JiWumBEN0AK6DzxjOU6xFjCkz4M46I/NRdUPSKb7zvpnYhq69ruXkVBRo3vstbtL0svkvZWb524OeIZpcMr/A/OIqs6qvOv1pcytCssQf3S2MjJ4BTy6EIZZ1xg1m5bmLVdHLLH9vrVZoUB5gTqWmrAdG9tq4nqlPjq+wr3G/Zg= Authentication-Results: nongnu.org; dkim=none (message not signed) header.d=none;nongnu.org; dmarc=none action=none header.from=oracle.com; Received: from SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) by BY5PR10MB4068.namprd10.prod.outlook.com (2603:10b6:a03:1b2::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11; Tue, 9 Nov 2021 00:39:14 +0000 Received: from SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093]) by SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093%7]) with mapi id 15.20.4669.016; Tue, 9 Nov 2021 00:39:14 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v3 08/19] vfio-user: define socket receive functions Date: Mon, 8 Nov 2021 16:46:36 -0800 Message-Id: X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) To SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) MIME-Version: 1.0 Received: from bruckner.us.oracle.com (73.71.20.66) by SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11 via Frontend Transport; Tue, 9 Nov 2021 00:39:12 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: a99357b6-842c-4ce3-4ad5-08d9a3195992 X-MS-TrafficTypeDiagnostic: BY5PR10MB4068: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:597; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10162 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 adultscore=0 mlxlogscore=999 malwarescore=0 phishscore=0 mlxscore=0 suspectscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111090001 X-Proofpoint-GUID: hLGMa-SOfAoFQlDfX_B1KdohoI5Q2pLK X-Proofpoint-ORIG-GUID: hLGMa-SOfAoFQlDfX_B1KdohoI5Q2pLK Received-SPF: pass client-ip=205.220.165.32; envelope-from=john.g.johnson@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MSGID_FROM_MTA_HEADER=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Add infrastructure needed to receive incoming messages Signed-off-by: John G Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman --- hw/vfio/pci.h | 2 +- hw/vfio/user-protocol.h | 62 +++++++++ hw/vfio/user.h | 9 +- hw/vfio/pci.c | 12 +- hw/vfio/user.c | 326 ++++++++++++++++++++++++++++++++++++++++++++++++ MAINTAINERS | 1 + 6 files changed, 409 insertions(+), 3 deletions(-) create mode 100644 hw/vfio/user-protocol.h diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h index 08ac647..ec9f345 100644 --- a/hw/vfio/pci.h +++ b/hw/vfio/pci.h @@ -193,7 +193,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI) struct VFIOUserPCIDevice { VFIOPCIDevice device; char *sock_name; - bool secure_dma; /* disable shared mem for DMA */ + bool send_queued; /* all sends 
are queued */ }; /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */ diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h new file mode 100644 index 0000000..27062cb --- /dev/null +++ b/hw/vfio/user-protocol.h @@ -0,0 +1,62 @@ +#ifndef VFIO_USER_PROTOCOL_H +#define VFIO_USER_PROTOCOL_H + +/* + * vfio protocol over a UNIX socket. + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + * Each message has a standard header that describes the command + * being sent, which is almost always a VFIO ioctl(). + * + * The header may be followed by command-specific data, such as the + * region and offset info for read and write commands. + */ + +typedef struct { + uint16_t id; + uint16_t command; + uint32_t size; + uint32_t flags; + uint32_t error_reply; +} VFIOUserHdr; + +/* VFIOUserHdr commands */ +enum vfio_user_command { + VFIO_USER_VERSION = 1, + VFIO_USER_DMA_MAP = 2, + VFIO_USER_DMA_UNMAP = 3, + VFIO_USER_DEVICE_GET_INFO = 4, + VFIO_USER_DEVICE_GET_REGION_INFO = 5, + VFIO_USER_DEVICE_GET_REGION_IO_FDS = 6, + VFIO_USER_DEVICE_GET_IRQ_INFO = 7, + VFIO_USER_DEVICE_SET_IRQS = 8, + VFIO_USER_REGION_READ = 9, + VFIO_USER_REGION_WRITE = 10, + VFIO_USER_DMA_READ = 11, + VFIO_USER_DMA_WRITE = 12, + VFIO_USER_DEVICE_RESET = 13, + VFIO_USER_DIRTY_PAGES = 14, + VFIO_USER_MAX, +}; + +/* VFIOUserHdr flags */ +#define VFIO_USER_REQUEST 0x0 +#define VFIO_USER_REPLY 0x1 +#define VFIO_USER_TYPE 0xF + +#define VFIO_USER_NO_REPLY 0x10 +#define VFIO_USER_ERROR 0x20 + + +#define VFIO_USER_DEF_MAX_FDS 8 +#define VFIO_USER_MAX_MAX_FDS 16 + +#define VFIO_USER_DEF_MAX_XFER (1024 * 1024) +#define VFIO_USER_MAX_MAX_XFER (64 * 1024 * 1024) + + +#endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio/user.h b/hw/vfio/user.h index 301ef6a..bd3717f 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -11,6 +11,8 @@ * */ +#include 
"user-protocol.h" + typedef struct { int send_fds; int recv_fds; @@ -27,6 +29,7 @@ enum msg_type { typedef struct VFIOUserMsg { QTAILQ_ENTRY(VFIOUserMsg) next; + VFIOUserHdr *hdr; VFIOUserFDs *fds; uint32_t rsize; uint32_t id; @@ -70,9 +73,13 @@ typedef struct VFIOProxy { } VFIOProxy; /* VFIOProxy flags */ -#define VFIO_PROXY_CLIENT 0x1 +#define VFIO_PROXY_CLIENT 0x1 +#define VFIO_PROXY_FORCE_QUEUED 0x4 VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp); void vfio_user_disconnect(VFIOProxy *proxy); +void vfio_user_set_handler(VFIODevice *vbasedev, + void (*handler)(void *opaque, VFIOUserMsg *msg), + void *reqarg); #endif /* VFIO_USER_H */ diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index ebfabb1..db45179 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3448,6 +3448,11 @@ struct VFIOValidOps vfio_pci_valid_ops = { * vfio-user routines. */ +static void vfio_user_pci_process_req(void *opaque, VFIOUserMsg *msg) +{ + +} + /* * Emulated devices don't use host hot reset */ @@ -3501,6 +3506,11 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) return; } vbasedev->proxy = proxy; + vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev); + + if (udev->send_queued) { + proxy->flags |= VFIO_PROXY_FORCE_QUEUED; + } vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name); vbasedev->dev = DEVICE(vdev); @@ -3524,7 +3534,7 @@ static void vfio_user_instance_finalize(Object *obj) static Property vfio_user_pci_dev_properties[] = { DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name), - DEFINE_PROP_BOOL("secure-dma", VFIOUserPCIDevice, secure_dma, false), + DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, false), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/vfio/user.c b/hw/vfio/user.c index 92d4e03..f662ae0 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -25,9 +25,27 @@ #include "sysemu/iothread.h" #include "user.h" +static uint64_t max_xfer_size = VFIO_USER_DEF_MAX_XFER; static IOThread 
*vfio_user_iothread; + static void vfio_user_shutdown(VFIOProxy *proxy); +static VFIOUserMsg *vfio_user_getmsg(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds); +static VFIOUserFDs *vfio_user_getfds(int numfds); +static void vfio_user_recycle(VFIOProxy *proxy, VFIOUserMsg *msg); + +static void vfio_user_recv(void *opaque); +static int vfio_user_recv_one(VFIOProxy *proxy); +static void vfio_user_cb(void *opaque); +static void vfio_user_request(void *opaque); + + +static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err) +{ + hdr->flags |= VFIO_USER_ERROR; + hdr->error_reply = err; +} /* * Functions called by main, CPU, or iothread threads @@ -39,11 +57,259 @@ static void vfio_user_shutdown(VFIOProxy *proxy) qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, NULL, NULL, NULL); } +static VFIOUserMsg *vfio_user_getmsg(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds) +{ + VFIOUserMsg *msg; + + msg = QTAILQ_FIRST(&proxy->free); + if (msg != NULL) { + QTAILQ_REMOVE(&proxy->free, msg, next); + } else { + msg = g_malloc0(sizeof(*msg)); + qemu_cond_init(&msg->cv); + } + + msg->hdr = hdr; + msg->fds = fds; + return msg; +} + +/* + * Recycle a message list entry to the free list. 
+ */ +static void vfio_user_recycle(VFIOProxy *proxy, VFIOUserMsg *msg) +{ + if (msg->type == VFIO_MSG_NONE) { + error_printf("vfio_user_recycle - freeing free msg\n"); + return; + } + + /* free msg buffer if no one is waiting to consume the reply */ + if (msg->type == VFIO_MSG_NOWAIT || msg->type == VFIO_MSG_ASYNC) { + g_free(msg->hdr); + } + + msg->type = VFIO_MSG_NONE; + msg->hdr = NULL; + msg->fds = NULL; + msg->complete = false; + QTAILQ_INSERT_HEAD(&proxy->free, msg, next); +} + +static VFIOUserFDs *vfio_user_getfds(int numfds) +{ + VFIOUserFDs *fds = g_malloc0(sizeof(*fds) + (numfds * sizeof(int))); + + fds->fds = (int *)((char *)fds + sizeof(*fds)); + + return fds; +} /* * Functions only called by iothread */ +static void vfio_user_recv(void *opaque) +{ + VFIOProxy *proxy = opaque; + + + QEMU_LOCK_GUARD(&proxy->lock); + + if (proxy->state == VFIO_PROXY_CONNECTED) { + while (vfio_user_recv_one(proxy) == 0) { + ; + } + } +} + +/* + * Receive and process one incoming message. + * + * For replies, find matching outgoing request and wake any waiters. + * For requests, queue in incoming list and run request BH. 
+ */ +static int vfio_user_recv_one(VFIOProxy *proxy) +{ + VFIOUserMsg *msg = NULL; + g_autofree int *fdp = NULL; + VFIOUserFDs *reqfds; + VFIOUserHdr hdr; + struct iovec iov = { + .iov_base = &hdr, + .iov_len = sizeof(hdr), + }; + bool isreply = false; + int i, ret; + size_t msgleft, numfds = 0; + char *data = NULL; + char *buf = NULL; + Error *local_err = NULL; + + /* + * Read header + */ + ret = qio_channel_readv_full(proxy->ioc, &iov, 1, &fdp, &numfds, + &local_err); + if (ret == QIO_CHANNEL_ERR_BLOCK) { + return ret; + } + if (ret <= 0) { + /* read error or other side closed connection */ + if (ret == 0) { + error_setg(&local_err, "vfio_user_recv server closed socket"); + } else { + error_prepend(&local_err, "vfio_user_recv: "); + } + goto fatal; + } + if (ret < sizeof(hdr)) { + error_setg(&local_err, "vfio_user_recv short read of header"); + goto fatal; + } + + /* + * Validate header + */ + if (hdr.size < sizeof(VFIOUserHdr)) { + error_setg(&local_err, "vfio_user_recv bad header size"); + goto fatal; + } + switch (hdr.flags & VFIO_USER_TYPE) { + case VFIO_USER_REQUEST: + isreply = false; + break; + case VFIO_USER_REPLY: + isreply = true; + break; + default: + error_setg(&local_err, "vfio_user_recv unknown message type"); + goto fatal; + } + + /* + * For replies, find the matching pending request. + * For requests, reap incoming FDs.
+ */ + if (isreply) { + QTAILQ_FOREACH(msg, &proxy->pending, next) { + if (hdr.id == msg->id) { + break; + } + } + if (msg == NULL) { + error_setg(&local_err, "vfio_user_recv unexpected reply"); + goto err; + } + QTAILQ_REMOVE(&proxy->pending, msg, next); + + /* + * Process any received FDs + */ + if (numfds != 0) { + if (msg->fds == NULL || msg->fds->recv_fds < numfds) { + error_setg(&local_err, "vfio_user_recv unexpected FDs"); + goto err; + } + msg->fds->recv_fds = numfds; + memcpy(msg->fds->fds, fdp, numfds * sizeof(int)); + } + } else { + if (numfds != 0) { + reqfds = vfio_user_getfds(numfds); + memcpy(reqfds->fds, fdp, numfds * sizeof(int)); + } else { + reqfds = NULL; + } + } + + /* + * Put the whole message into a single buffer. + */ + if (isreply) { + if (hdr.size > msg->rsize) { + error_setg(&local_err, + "vfio_user_recv reply larger than recv buffer"); + goto err; + } + *msg->hdr = hdr; + data = (char *)msg->hdr + sizeof(hdr); + } else { + if (hdr.size > max_xfer_size) { + error_setg(&local_err, "vfio_user_recv request larger than max"); + goto err; + } + buf = g_malloc0(hdr.size); + memcpy(buf, &hdr, sizeof(hdr)); + data = buf + sizeof(hdr); + msg = vfio_user_getmsg(proxy, (VFIOUserHdr *)buf, reqfds); + msg->type = VFIO_MSG_REQ; + } + + msgleft = hdr.size - sizeof(hdr); + if (msgleft != 0) { + ret = qio_channel_read(proxy->ioc, data, msgleft, &local_err); + /* error or would block */ + if (ret < 0) { + goto fatal; + } + if (ret != msgleft) { + error_setg(&local_err, "vfio_user_recv short read of msg body"); + goto fatal; + } + } + + /* + * Replies signal a waiter, if none just check for errors + * and free the message buffer. + * + * Requests get queued for the BH. 
+ */ + if (isreply) { + msg->complete = true; + if (msg->type == VFIO_MSG_WAIT) { + qemu_cond_signal(&msg->cv); + } else { + if (hdr.flags & VFIO_USER_ERROR) { + error_printf("vfio_user_recv error reply on async request "); + error_printf("command %x error %s\n", hdr.command, + strerror(hdr.error_reply)); + } + /* youngest nowait msg has been ack'd */ + if (proxy->last_nowait == msg) { + proxy->last_nowait = NULL; + } + vfio_user_recycle(proxy, msg); + } + } else { + QTAILQ_INSERT_TAIL(&proxy->incoming, msg, next); + qemu_bh_schedule(proxy->req_bh); + } + return 0; + + /* + * fatal means the other side closed or we don't trust the stream + * err means this message is corrupt + */ +fatal: + vfio_user_shutdown(proxy); + proxy->state = VFIO_PROXY_ERROR; + +err: + for (i = 0; i < numfds; i++) { + close(fdp[i]); + } + if (isreply && msg != NULL) { + /* force an error to keep sending thread from hanging */ + vfio_user_set_error(msg->hdr, EINVAL); + msg->complete = true; + qemu_cond_signal(&msg->cv); + } + error_report_err(local_err); + return -1; +} + static void vfio_user_cb(void *opaque) { VFIOProxy *proxy = opaque; @@ -59,6 +325,51 @@ static void vfio_user_cb(void *opaque) * Functions called by main or CPU threads */ +/* + * Process incoming requests. + * + * The bus-specific callback has the form: + * request(opaque, msg) + * where 'opaque' was specified in vfio_user_set_handler + * and 'msg' is the inbound message. + * + * The callback is responsible for disposing of the message buffer, + * usually by re-using it when calling vfio_send_reply or vfio_send_error, + * both of which free their message buffer when the reply is sent. + * + * If the callback uses a new buffer, it needs to free the old one. 
+ */ +static void vfio_user_request(void *opaque) +{ + VFIOProxy *proxy = opaque; + VFIOUserMsgQ new, free; + VFIOUserMsg *msg; + + /* reap all incoming */ + WITH_QEMU_LOCK_GUARD(&proxy->lock) { + new = proxy->incoming; + QTAILQ_INIT(&proxy->incoming); + } + QTAILQ_INIT(&free); + + /* process list */ + while (!QTAILQ_EMPTY(&new)) { + msg = QTAILQ_FIRST(&new); + QTAILQ_REMOVE(&new, msg, next); + proxy->request(proxy->req_arg, msg); + QTAILQ_INSERT_HEAD(&free, msg, next); + } + + /* free list */ + WITH_QEMU_LOCK_GUARD(&proxy->lock) { + while (!QTAILQ_EMPTY(&free)) { + msg = QTAILQ_FIRST(&free); + QTAILQ_REMOVE(&free, msg, next); + vfio_user_recycle(proxy, msg); + } + } +} + static QLIST_HEAD(, VFIOProxy) vfio_user_sockets = QLIST_HEAD_INITIALIZER(vfio_user_sockets); @@ -97,6 +408,7 @@ VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp) } proxy->ctx = iothread_get_aio_context(vfio_user_iothread); + proxy->req_bh = qemu_bh_new(vfio_user_request, proxy); QTAILQ_INIT(&proxy->outgoing); QTAILQ_INIT(&proxy->incoming); @@ -107,6 +419,18 @@ VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp) return proxy; } +void vfio_user_set_handler(VFIODevice *vbasedev, + void (*handler)(void *opaque, VFIOUserMsg *msg), + void *req_arg) +{ + VFIOProxy *proxy = vbasedev->proxy; + + proxy->request = handler; + proxy->req_arg = req_arg; + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, + vfio_user_recv, NULL, proxy); +} + void vfio_user_disconnect(VFIOProxy *proxy) { VFIOUserMsg *r1, *r2; @@ -122,6 +446,8 @@ void vfio_user_disconnect(VFIOProxy *proxy) } object_unref(OBJECT(proxy->ioc)); proxy->ioc = NULL; + qemu_bh_delete(proxy->req_bh); + proxy->req_bh = NULL; proxy->state = VFIO_PROXY_CLOSING; QTAILQ_FOREACH_SAFE(r1, &proxy->outgoing, next, r2) { diff --git a/MAINTAINERS b/MAINTAINERS index f429bab..52d37dd 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1888,6 +1888,7 @@ S: Supported F: docs/devel/vfio-user.rst F: hw/vfio/user.c F: hw/vfio/user.h +F: 
hw/vfio/user-protocol.h vhost M: Michael S. Tsirkin
From patchwork Tue Nov 9 00:46:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12609231 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v3 09/19] vfio-user: define socket send functions Date: Mon, 8 Nov 2021 16:46:37 -0800 Message-Id: <5fb1f506cb2d66258da979e4d1a9ac717442675f.1636057885.git.john.g.johnson@oracle.com> X-Mailer: git-send-email 1.8.3.1
Also negotiate version with remote server Signed-off-by: Jagannathan Raman Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson --- hw/vfio/user-protocol.h | 23 +++ hw/vfio/user.h | 1 + hw/vfio/pci.c | 11 ++ hw/vfio/user.c | 411 ++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 446 insertions(+) diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index 27062cb..14b762d 100644 --- 
a/hw/vfio/user-protocol.h +++ b/hw/vfio/user-protocol.h @@ -52,6 +52,29 @@ enum vfio_user_command { #define VFIO_USER_ERROR 0x20 +/* + * VFIO_USER_VERSION + */ +typedef struct { + VFIOUserHdr hdr; + uint16_t major; + uint16_t minor; + char capabilities[]; +} VFIOUserVersion; + +#define VFIO_USER_MAJOR_VER 0 +#define VFIO_USER_MINOR_VER 0 + +#define VFIO_USER_CAP "capabilities" + +/* "capabilities" members */ +#define VFIO_USER_CAP_MAX_FDS "max_msg_fds" +#define VFIO_USER_CAP_MAX_XFER "max_data_xfer_size" +#define VFIO_USER_CAP_MIGR "migration" + +/* "migration" member */ +#define VFIO_USER_CAP_PGSIZE "pgsize" + #define VFIO_USER_DEF_MAX_FDS 8 #define VFIO_USER_MAX_MAX_FDS 16 diff --git a/hw/vfio/user.h b/hw/vfio/user.h index bd3717f..7ef3c95 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -81,5 +81,6 @@ void vfio_user_disconnect(VFIOProxy *proxy); void vfio_user_set_handler(VFIODevice *vbasedev, void (*handler)(void *opaque, VFIOUserMsg *msg), void *reqarg); +int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp); #endif /* VFIO_USER_H */ diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index db45179..1bd0f6c 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3512,6 +3512,12 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) proxy->flags |= VFIO_PROXY_FORCE_QUEUED; } + vfio_user_validate_version(vbasedev, &err); + if (err != NULL) { + error_propagate(errp, err); + goto error; + } + vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name); vbasedev->dev = DEVICE(vdev); vbasedev->fd = -1; @@ -3520,6 +3526,11 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) vbasedev->ops = &vfio_user_pci_ops; vbasedev->valid_ops = &vfio_pci_valid_ops; + return; + +error: + vfio_user_disconnect(proxy); + error_prepend(errp, VFIO_MSG_PREFIX, vdev->vbasedev.name); } static void vfio_user_instance_finalize(Object *obj) diff --git a/hw/vfio/user.c b/hw/vfio/user.c index f662ae0..edf1816 100644 --- a/hw/vfio/user.c +++ 
b/hw/vfio/user.c @@ -23,12 +23,20 @@ #include "io/channel-socket.h" #include "io/channel-util.h" #include "sysemu/iothread.h" +#include "qapi/qmp/qdict.h" +#include "qapi/qmp/qjson.h" +#include "qapi/qmp/qnull.h" +#include "qapi/qmp/qstring.h" +#include "qapi/qmp/qnum.h" #include "user.h" static uint64_t max_xfer_size = VFIO_USER_DEF_MAX_XFER; +static uint64_t max_send_fds = VFIO_USER_DEF_MAX_FDS; +static int wait_time = 1000; /* wait 1 sec for replies */ static IOThread *vfio_user_iothread; static void vfio_user_shutdown(VFIOProxy *proxy); +static int vfio_user_send_qio(VFIOProxy *proxy, VFIOUserMsg *msg); static VFIOUserMsg *vfio_user_getmsg(VFIOProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds); static VFIOUserFDs *vfio_user_getfds(int numfds); @@ -36,9 +44,16 @@ static void vfio_user_recycle(VFIOProxy *proxy, VFIOUserMsg *msg); static void vfio_user_recv(void *opaque); static int vfio_user_recv_one(VFIOProxy *proxy); +static void vfio_user_send(void *opaque); +static int vfio_user_send_one(VFIOProxy *proxy, VFIOUserMsg *msg); static void vfio_user_cb(void *opaque); static void vfio_user_request(void *opaque); +static int vfio_user_send_queued(VFIOProxy *proxy, VFIOUserMsg *msg); +static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize, bool nobql); +static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, + uint32_t size, uint32_t flags); static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err) @@ -57,6 +72,32 @@ static void vfio_user_shutdown(VFIOProxy *proxy) qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, NULL, NULL, NULL); } +static int vfio_user_send_qio(VFIOProxy *proxy, VFIOUserMsg *msg) +{ + VFIOUserFDs *fds = msg->fds; + struct iovec iov = { + .iov_base = msg->hdr, + .iov_len = msg->hdr->size, + }; + size_t numfds = 0; + int ret, *fdp = NULL; + Error *local_err = NULL; + + if (fds != NULL && fds->send_fds != 0) { + numfds = fds->send_fds; + fdp = fds->fds; + } + + ret = 
qio_channel_writev_full(proxy->ioc, &iov, 1, fdp, numfds, &local_err); + + if (ret == -1) { + vfio_user_set_error(msg->hdr, EIO); + vfio_user_shutdown(proxy); + error_report_err(local_err); + } + return ret; +} + static VFIOUserMsg *vfio_user_getmsg(VFIOProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds) { @@ -310,6 +351,53 @@ err: return -1; } +/* + * Send messages from outgoing queue when the socket buffer has space. + * If we deplete 'outgoing', remove ourselves from the poll list. + */ +static void vfio_user_send(void *opaque) +{ + VFIOProxy *proxy = opaque; + VFIOUserMsg *msg; + + QEMU_LOCK_GUARD(&proxy->lock); + + if (proxy->state == VFIO_PROXY_CONNECTED) { + while (!QTAILQ_EMPTY(&proxy->outgoing)) { + msg = QTAILQ_FIRST(&proxy->outgoing); + if (vfio_user_send_one(proxy, msg) < 0) { + return; + } + } + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, + vfio_user_recv, NULL, proxy); + } +} + +/* + * Send a single message. + * + * Sent async messages are freed, others are moved to pending queue. + */ +static int vfio_user_send_one(VFIOProxy *proxy, VFIOUserMsg *msg) +{ + int ret; + + ret = vfio_user_send_qio(proxy, msg); + if (ret < 0) { + return ret; + } + + QTAILQ_REMOVE(&proxy->outgoing, msg, next); + if (msg->type == VFIO_MSG_ASYNC) { + vfio_user_recycle(proxy, msg); + } else { + QTAILQ_INSERT_TAIL(&proxy->pending, msg, next); + } + + return 0; +} + static void vfio_user_cb(void *opaque) { VFIOProxy *proxy = opaque; @@ -370,6 +458,129 @@ static void vfio_user_request(void *opaque) } } +/* + * Messages are queued onto the proxy's outgoing list. + * + * It handles 3 types of messages: + * + * async messages - replies and posted writes + * + * There will be no reply from the server, so message + * buffers are freed after they're sent. + * + * nowait messages - map/unmap during address space transactions + * + * These are also sent async, but a reply is expected so that + * vfio_wait_reqs() can wait for the youngest nowait request. 
+ * They transition from the outgoing list to the pending list + * when sent, and are freed when the reply is received. + * + * wait messages - all other requests + * + * The reply to these messages is waited for by their caller. + * They also transition from outgoing to pending when sent, but + * the message buffer is returned to the caller with the reply + * contents. The caller is responsible for freeing these messages. + * + * As an optimization, if the outgoing list and the socket send + * buffer are empty, the message is sent inline instead of being + * added to the outgoing list. The rest of the transitions are + * unchanged. + * + * returns 0 if the message was sent or queued + * returns -1 on send error + */ +static int vfio_user_send_queued(VFIOProxy *proxy, VFIOUserMsg *msg) +{ + int ret; + + /* + * Unsent outgoing msgs - add to tail + */ + if (!QTAILQ_EMPTY(&proxy->outgoing)) { + QTAILQ_INSERT_TAIL(&proxy->outgoing, msg, next); + return 0; + } + + /* + * Try inline - if blocked, queue it and kick send poller + */ + if (proxy->flags & VFIO_PROXY_FORCE_QUEUED) { + ret = QIO_CHANNEL_ERR_BLOCK; + } else { + ret = vfio_user_send_qio(proxy, msg); + } + if (ret == QIO_CHANNEL_ERR_BLOCK) { + QTAILQ_INSERT_HEAD(&proxy->outgoing, msg, next); + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, + vfio_user_recv, vfio_user_send, + proxy); + return 0; + } + if (ret == -1) { + return ret; + } + + /* + * Sent - free async, add others to pending + */ + if (msg->type == VFIO_MSG_ASYNC) { + vfio_user_recycle(proxy, msg); + } else { + QTAILQ_INSERT_TAIL(&proxy->pending, msg, next); + } + + return 0; +} + +static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize, bool nobql) +{ + VFIOUserMsg *msg; + bool iolock = false; + int ret; + + if (hdr->flags & VFIO_USER_NO_REPLY) { + error_printf("vfio_user_send_wait on async message\n"); + return; + } + + /* + * We may block later, so use a per-proxy lock and drop + * BQL while we sleep 
unless 'nobql' says not to. + */ + qemu_mutex_lock(&proxy->lock); + if (!nobql) { + iolock = qemu_mutex_iothread_locked(); + if (iolock) { + qemu_mutex_unlock_iothread(); + } + } + + msg = vfio_user_getmsg(proxy, hdr, fds); + msg->id = hdr->id; + msg->rsize = rsize ? rsize : hdr->size; + msg->type = VFIO_MSG_WAIT; + + ret = vfio_user_send_queued(proxy, msg); + + if (ret == 0) { + while (!msg->complete) { + if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) { + vfio_user_set_error(hdr, ETIMEDOUT); + break; + } + } + } + vfio_user_recycle(proxy, msg); + + /* lock order is BQL->proxy - don't hold proxy when getting BQL */ + qemu_mutex_unlock(&proxy->lock); + if (iolock) { + qemu_mutex_lock_iothread(); + } +} + static QLIST_HEAD(, VFIOProxy) vfio_user_sockets = QLIST_HEAD_INITIALIZER(vfio_user_sockets); @@ -494,3 +705,203 @@ void vfio_user_disconnect(VFIOProxy *proxy) g_free(proxy->sockname); g_free(proxy); } + +static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, + uint32_t size, uint32_t flags) +{ + static uint16_t next_id; + + hdr->id = qatomic_fetch_inc(&next_id); + hdr->command = cmd; + hdr->size = size; + hdr->flags = (flags & ~VFIO_USER_TYPE) | VFIO_USER_REQUEST; + hdr->error_reply = 0; +} + +struct cap_entry { + const char *name; + int (*check)(QObject *qobj, Error **errp); +}; + +static int caps_parse(QDict *qdict, struct cap_entry caps[], Error **errp) +{ + QObject *qobj; + struct cap_entry *p; + + for (p = caps; p->name != NULL; p++) { + qobj = qdict_get(qdict, p->name); + if (qobj != NULL) { + if (p->check(qobj, errp)) { + return -1; + } + qdict_del(qdict, p->name); + } + } + + /* warning, for now */ + if (qdict_size(qdict) != 0) { + error_printf("spurious capabilities\n"); + } + return 0; +} + +static int check_pgsize(QObject *qobj, Error **errp) +{ + QNum *qn = qobject_to(QNum, qobj); + uint64_t pgsize; + + if (qn == NULL || !qnum_get_try_uint(qn, &pgsize)) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_PGSIZE); + return -1; + } 
+ return pgsize == 4096 ? 0 : -1; +} + +static struct cap_entry caps_migr[] = { + { VFIO_USER_CAP_PGSIZE, check_pgsize }, + { NULL } +}; + +static int check_max_fds(QObject *qobj, Error **errp) +{ + QNum *qn = qobject_to(QNum, qobj); + + if (qn == NULL || !qnum_get_try_uint(qn, &max_send_fds) || + max_send_fds > VFIO_USER_MAX_MAX_FDS) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MAX_FDS); + return -1; + } + return 0; +} + +static int check_max_xfer(QObject *qobj, Error **errp) +{ + QNum *qn = qobject_to(QNum, qobj); + + if (qn == NULL || !qnum_get_try_uint(qn, &max_xfer_size) || + max_xfer_size > VFIO_USER_MAX_MAX_XFER) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MAX_XFER); + return -1; + } + return 0; +} + +static int check_migr(QObject *qobj, Error **errp) +{ + QDict *qdict = qobject_to(QDict, qobj); + + if (qdict == NULL || caps_parse(qdict, caps_migr, errp)) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MIGR); + return -1; + } + return 0; +} + +static struct cap_entry caps_cap[] = { + { VFIO_USER_CAP_MAX_FDS, check_max_fds }, + { VFIO_USER_CAP_MAX_XFER, check_max_xfer }, + { VFIO_USER_CAP_MIGR, check_migr }, + { NULL } +}; + +static int check_cap(QObject *qobj, Error **errp) +{ + QDict *qdict = qobject_to(QDict, qobj); + + if (qdict == NULL || caps_parse(qdict, caps_cap, errp)) { + error_setg(errp, "malformed %s", VFIO_USER_CAP); + return -1; + } + return 0; +} + +static struct cap_entry ver_0_0[] = { + { VFIO_USER_CAP, check_cap }, + { NULL } +}; + +static int caps_check(int minor, const char *caps, Error **errp) +{ + QObject *qobj; + QDict *qdict; + int ret; + + qobj = qobject_from_json(caps, NULL); + if (qobj == NULL) { + error_setg(errp, "malformed capabilities %s", caps); + return -1; + } + qdict = qobject_to(QDict, qobj); + if (qdict == NULL) { + error_setg(errp, "capabilities %s not an object", caps); + qobject_unref(qobj); + return -1; + } + ret = caps_parse(qdict, ver_0_0, errp); + + qobject_unref(qobj); + return ret; +} + +static 
GString *caps_json(void) +{ + QDict *dict = qdict_new(); + QDict *capdict = qdict_new(); + QDict *migdict = qdict_new(); + GString *str; + + qdict_put_int(migdict, VFIO_USER_CAP_PGSIZE, 4096); + qdict_put_obj(capdict, VFIO_USER_CAP_MIGR, QOBJECT(migdict)); + + qdict_put_int(capdict, VFIO_USER_CAP_MAX_FDS, VFIO_USER_MAX_MAX_FDS); + qdict_put_int(capdict, VFIO_USER_CAP_MAX_XFER, VFIO_USER_DEF_MAX_XFER); + + qdict_put_obj(dict, VFIO_USER_CAP, QOBJECT(capdict)); + + str = qobject_to_json(QOBJECT(dict)); + qobject_unref(dict); + return str; +} + +int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp) +{ + g_autofree VFIOUserVersion *msgp; + GString *caps; + char *reply; + int size, caplen; + + caps = caps_json(); + caplen = caps->len + 1; + size = sizeof(*msgp) + caplen; + msgp = g_malloc0(size); + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_VERSION, size, 0); + msgp->major = VFIO_USER_MAJOR_VER; + msgp->minor = VFIO_USER_MINOR_VER; + memcpy(&msgp->capabilities, caps->str, caplen); + g_string_free(caps, true); + + vfio_user_send_wait(vbasedev->proxy, &msgp->hdr, NULL, 0, false); + if (msgp->hdr.flags & VFIO_USER_ERROR) { + error_setg_errno(errp, msgp->hdr.error_reply, "version reply"); + return -1; + } + + if (msgp->major != VFIO_USER_MAJOR_VER || + msgp->minor > VFIO_USER_MINOR_VER) { + error_setg(errp, "incompatible server version"); + return -1; + } + + reply = msgp->capabilities; + if (reply[msgp->hdr.size - sizeof(*msgp) - 1] != '\0') { + error_setg(errp, "corrupt version reply"); + return -1; + } + + if (caps_check(msgp->minor, reply, errp) != 0) { + return -1; + } + + return 0; +} From patchwork Tue Nov 9 00:46:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12609235 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) 
by smtp.lore.kernel.org (Postfix) with ESMTP id 96471C433EF for ; Tue, 9 Nov 2021 00:50:07 +0000 (UTC) From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v3 10/19] vfio-user: get device info Date: Mon, 8 Nov 2021 16:46:38 -0800 Message-Id: <30f1093de48b3eb368d6425fa0d3a2505ca70af6.1636057885.git.john.g.johnson@oracle.com> X-Mailer: git-send-email 1.8.3.1 MIME-Version: 1.0
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: XizJd0pgCv8z6dhc9LKlsEK9imDy+C/GxSElqYsW0eLlSbOa90q0Ir4vIh+xkRSFlRSpADhOFMm1ucKqYCwereKGWveivFUajWeh26XhCjM= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR10MB4068 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10162 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 adultscore=0 mlxlogscore=999 malwarescore=0 phishscore=0 mlxscore=0 suspectscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111090001 X-Proofpoint-GUID: _td4PvHCKzYWO5GkZv5ENwIfRCXik2yz X-Proofpoint-ORIG-GUID: _td4PvHCKzYWO5GkZv5ENwIfRCXik2yz Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MSGID_FROM_MTA_HEADER=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel"

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
---
 hw/vfio/user-protocol.h | 13 +++++++++++++
 hw/vfio/user.h          |  2 ++
 hw/vfio/pci.c           | 19 +++++++++++++++++++
 hw/vfio/user.c          | 40 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 74 insertions(+)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index 14b762d..13e44eb 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -82,4 +82,17 @@ typedef struct {
 
 #define VFIO_USER_MAX_MAX_XFER (64 * 1024 * 1024)
 
+/*
+ * VFIO_USER_DEVICE_GET_INFO
+ * imported from struct_device_info
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint32_t argsz;
+    uint32_t flags;
+    uint32_t num_regions;
+    uint32_t num_irqs;
+    uint32_t cap_offset;
+} VFIOUserDeviceInfo;
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index 7ef3c95..19edd84 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -83,4 +83,6 @@ void vfio_user_set_handler(VFIODevice *vbasedev,
                            void *reqarg);
 int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp);
 
+extern VFIODevIO vfio_dev_io_sock;
+
 #endif /* VFIO_USER_H */
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 1bd0f6c..40eb9e6 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3484,6 +3484,8 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
     VFIODevice *vbasedev = &vdev->vbasedev;
     SocketAddress addr;
     VFIOProxy *proxy;
+    struct vfio_device_info info;
+    int ret;
     Error *err = NULL;
 
     /*
@@ -3524,8 +3526,25 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
     vbasedev->type = VFIO_DEVICE_TYPE_PCI;
     vbasedev->no_mmap = false;
     vbasedev->ops = &vfio_user_pci_ops;
+    vbasedev->io_ops = &vfio_dev_io_sock;
     vbasedev->valid_ops = &vfio_pci_valid_ops;
 
+    ret = VDEV_GET_INFO(vbasedev, &info);
+    if (ret) {
+        error_setg_errno(errp, -ret, "get info failure");
+        goto error;
+    }
+    vbasedev->num_irqs = info.num_irqs;
+    vbasedev->num_regions = info.num_regions;
+    vbasedev->flags = info.flags;
+    vbasedev->reset_works = !!(info.flags & VFIO_DEVICE_FLAGS_RESET);
+
+    vfio_populate_device(vdev, &err);
+    if (err) {
+        error_propagate(errp, err);
+        goto error;
+    }
+
     return;
 
 error:
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index edf1816..ed2a4d7 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -905,3 +905,43 @@ int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp)
 
     return 0;
 }
+
+static int vfio_user_get_info(VFIOProxy *proxy, struct vfio_device_info *info)
+{
+    VFIOUserDeviceInfo msg;
+
+    memset(&msg, 0, sizeof(msg));
+    vfio_user_request_msg(&msg.hdr, VFIO_USER_DEVICE_GET_INFO, sizeof(msg), 0);
+    msg.argsz = sizeof(struct vfio_device_info);
+
+    vfio_user_send_wait(proxy, &msg.hdr, NULL, 0, false);
+    if (msg.hdr.flags & VFIO_USER_ERROR) {
+        return -msg.hdr.error_reply;
+    }
+
+    memcpy(info, &msg.argsz, sizeof(*info));
+    return 0;
+}
+
+
+/*
+ * Socket-based io_ops
+ */
+
+static int vfio_user_io_get_info(VFIODevice *vbasedev,
+                                 struct vfio_device_info *info)
+{
+    int ret;
+
+    ret = vfio_user_get_info(vbasedev->proxy, info);
+    if (ret) {
+        return ret;
+    }
+
+    return VDEV_VALID_INFO(vbasedev, info);
+}
+
+VFIODevIO vfio_dev_io_sock = {
+    .get_info = vfio_user_io_get_info,
+};
+

From patchwork Tue Nov 9 00:46:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12609229 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1B52BC433FE for ; Tue, 9 Nov 2021 00:44:32 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6ED2961165 for ; Tue, 9 Nov 2021 00:44:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6ED2961165 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:46644 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mkFFW-0000FJ-8l for qemu-devel@archiver.kernel.org; Mon, 08 Nov 2021 19:44:30 -0500 Received: from
eggs.gnu.org ([209.51.188.92]:51598) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAc-0005ep-FX for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:27 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:41838) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAX-00046j-0R for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:25 -0500 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A905pwd019033 for ; Tue, 9 Nov 2021 00:39:19 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; bh=GbhJJgQq1l9h/rT9wuSU0d6LWxb5pCoQLwUuoPVShAg=; b=GDkV4dUo85nEU/7e1L0Cmll3BXv5WDs4yM3oywi3bcQai2q9Kh0wcE2hMVmFVYTt3z7k 6FaTCl0wOlLtFJbylc00PRESy4S49aNmC7DIKVUG08gymrOslyLdZuftUOkXx8ouiURs EcEUezQQLytpujHaIIlCJb7TtbUKkLKMYogW5GuvyacW3JcAFov6SoVpqMOr/gr0tZjL JUaq+tgrwtlNfwC85AkWS3I53ykMH5vbdrCuL8k+WhMl4tm8zt9Pjo4Q61Zn1J56TZCt w0wBJDBpusKjFNu1FrJgSlBTaYarGArqJEpUPeiuVSSiEZYGPb95eJdAggbOGEBG5sDE wg== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by mx0b-00069f02.pphosted.com with ESMTP id 3c6sbk7c8b-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 00:39:19 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 1A90Zxm5132637 for ; Tue, 9 Nov 2021 00:39:18 GMT Received: from nam10-bn7-obe.outbound.protection.outlook.com (mail-bn7nam10lp2100.outbound.protection.outlook.com [104.47.70.100]) by aserp3030.oracle.com with ESMTP id 3c5frd6sqb-4 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 00:39:18 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; 
cv=none; b=ltqGC4Wqdz8FQNicQ+ojPLbR8ym/U3L+Gr2HzpLk7r69qKgYiK6a8gN+Cj6EF4nWStM4IBHsGg7cCnrGpZdTtjyHVvOORmvl631Y8lhYv1ot7aYPVtkejxA2oV/OHDf3tNS4HTqKbHSjWt24jOoQ3/ouhFioyNXpN5qool/Vli7xWauR0Mmrory62Pw7VDEQ7CKzEdH7tmSs4PIelZf0GMHyA1JRpYHvbeHK21CQSKRop0clG0iwyGGeVXrPulWazUdfHVHiy/wQDKXPZfNKHQeEaheOBCVnJY7oIn0BI5FR/mJ1F/Es4n/tyNeN1+Hsv+saIG4KtxFZeoNamwT43w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=GbhJJgQq1l9h/rT9wuSU0d6LWxb5pCoQLwUuoPVShAg=; b=VvqHqQ5KlEhvKSdX7BJbIIVwzgD0xl0L3cKRRLi8UBbc5xW2Cr/kHo9X+ZHboQa9Nc/BWfaZ/HL2UT/X2pxoUwRPFZk1nX2eC+jIc35DUVpqdo+i0WB7ZWz+As7XDzOcmZVbKZsCMfRRapWP1iFMolIHPdCZyLc8Vo+EYis1EIqfvILwO8XKhBcE/nvVaywT4qgw13WCWaKFKafqmEi89O8LpomsrxU6Ewq6FiPMcUKLPwPM77KQaw8/y2U5Xs3f9vDnHbdEeYGv7kqJloiiyEZ0d8SMg9fcN+bMuW1Q6fTFbYgXeiJRms+GRKRPEFpQ0xjbP99T60EZbw26/ycCiQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=GbhJJgQq1l9h/rT9wuSU0d6LWxb5pCoQLwUuoPVShAg=; b=fOFkKwgqgFKZ4zbLNSwwmqQeKb0e6+w+GTCS2xWDaMx0GMUtXDBA8vDThCpiCoGv1CuV3XcKQph8ywYSOqCasRTVnXB9ehOEc/Fw3LrOk4r1Y1toHqzYTMFogg4VDSJvFxyTB6Fo3r9QTZkfzED0tbEOorpJDpuxuJELNogThIQ= Authentication-Results: nongnu.org; dkim=none (message not signed) header.d=none;nongnu.org; dmarc=none action=none header.from=oracle.com; Received: from SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) by BY5PR10MB4068.namprd10.prod.outlook.com (2603:10b6:a03:1b2::33) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11; Tue, 9 Nov 2021 00:39:15 +0000 Received: from SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093]) by SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093%7]) with mapi id 15.20.4669.016; Tue, 9 Nov 2021 00:39:15 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v3 11/19] vfio-user: get region info Date: Mon, 8 Nov 2021 16:46:39 -0800 Message-Id: <32606ff56a2dd3f0d9ceaa91feb6a3c3fa6b98d5.1636057885.git.john.g.johnson@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) To SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) MIME-Version: 1.0 Received: from bruckner.us.oracle.com (73.71.20.66) by SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11 via Frontend Transport; Tue, 9 Nov 2021 00:39:12 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: f6bb2a5e-7298-4768-4548-08d9a31959fb X-MS-TrafficTypeDiagnostic: BY5PR10MB4068: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:324; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: y+mIciJl5w6/9o09iswA/4i5IUdG0SPsc7YUP0bUq5bjd2ASlqEIe35jtk84F5VSxcXxE5cj6sncMwzmLMG4d+9F7LOKYvpl/m4YUcTUeq0= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR10MB4068 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10162 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 mlxscore=0 suspectscore=0 bulkscore=0 spamscore=0 phishscore=0 malwarescore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111090001 X-Proofpoint-GUID: -HclBOsxOXB8gQAWLVW0vKPAp5RSqraR X-Proofpoint-ORIG-GUID: -HclBOsxOXB8gQAWLVW0vKPAp5RSqraR Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MSGID_FROM_MTA_HEADER=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel"

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
---
 hw/vfio/user-protocol.h       | 14 ++++++++++++
 include/hw/vfio/vfio-common.h |  4 +++-
 hw/vfio/common.c              | 30 +++++++++++++++++++++++++-
 hw/vfio/user.c                | 50 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 96 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index 13e44eb..104bf4f 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -95,4 +95,18 @@ typedef struct {
     uint32_t cap_offset;
 } VFIOUserDeviceInfo;
 
+/*
+ * VFIO_USER_DEVICE_GET_REGION_INFO
+ * imported from struct_vfio_region_info
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint32_t argsz;
+    uint32_t flags;
+    uint32_t index;
+    uint32_t cap_offset;
+    uint64_t size;
+    uint64_t offset;
+} VFIOUserRegionInfo;
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 224dbf8..e2d7ee1 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -56,6 +56,7 @@ typedef struct VFIORegion {
     uint32_t nr_mmaps;
     VFIOMmap *mmaps;
     uint8_t nr; /* cache the region number for debug */
+    int remfd; /* fd if exported from remote process */
 } VFIORegion;
 
 typedef struct VFIOMigration {
@@ -150,8 +151,9 @@ typedef struct VFIODevice {
     VFIOMigration *migration;
     Error *migration_blocker;
     OnOffAuto pre_copy_dirty_page_tracking;
-    struct vfio_region_info **regions;
     VFIOProxy *proxy;
+    struct vfio_region_info **regions;
+    int *regfds;
 } VFIODevice;
 
 struct VFIODeviceOps {
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 41fdd78..47ec28f 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -40,6 +40,7 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "migration/migration.h"
+#include "hw/vfio/user.h"
 
 VFIOGroupList vfio_group_list =
     QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -1491,6 +1492,16 @@ bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info,
     return true;
 }
 
+static int vfio_get_region_info_remfd(VFIODevice *vbasedev, int index)
+{
+    struct vfio_region_info *info;
+
+    if (vbasedev->regions == NULL || vbasedev->regions[index] == NULL) {
+        vfio_get_region_info(vbasedev, index, &info);
+    }
+    return vbasedev->regfds != NULL ? vbasedev->regfds[index] : -1;
+}
+
 static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
                                           struct vfio_region_info *info)
 {
@@ -1544,6 +1555,7 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
     region->size = info->size;
     region->fd_offset = info->offset;
     region->nr = index;
+    region->remfd = vfio_get_region_info_remfd(vbasedev, index);
 
     if (region->size) {
         region->mem = g_new0(MemoryRegion, 1);
@@ -1587,6 +1599,7 @@ int vfio_region_mmap(VFIORegion *region)
 {
     int i, prot = 0;
     char *name;
+    int fd;
 
     if (!region->mem) {
         return 0;
@@ -1595,9 +1608,11 @@ int vfio_region_mmap(VFIORegion *region)
     prot |= region->flags & VFIO_REGION_INFO_FLAG_READ ? PROT_READ : 0;
     prot |= region->flags & VFIO_REGION_INFO_FLAG_WRITE ? PROT_WRITE : 0;
 
+    fd = region->remfd != -1 ? region->remfd : region->vbasedev->fd;
+
     for (i = 0; i < region->nr_mmaps; i++) {
         region->mmaps[i].mmap = mmap(NULL, region->mmaps[i].size, prot,
-                                     MAP_SHARED, region->vbasedev->fd,
+                                     MAP_SHARED, fd,
                                      region->fd_offset +
                                      region->mmaps[i].offset);
         if (region->mmaps[i].mmap == MAP_FAILED) {
@@ -2379,10 +2394,17 @@ void vfio_put_base_device(VFIODevice *vbasedev)
         int i;
 
         for (i = 0; i < vbasedev->num_regions; i++) {
+            if (vbasedev->regfds != NULL && vbasedev->regfds[i] != -1) {
+                close(vbasedev->regfds[i]);
+            }
             g_free(vbasedev->regions[i]);
         }
         g_free(vbasedev->regions);
         vbasedev->regions = NULL;
+        if (vbasedev->regfds != NULL) {
+            g_free(vbasedev->regfds);
+            vbasedev->regfds = NULL;
+        }
     }
 
     if (!vbasedev->group) {
@@ -2405,6 +2427,9 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
     if (vbasedev->regions == NULL) {
         vbasedev->regions = g_new0(struct vfio_region_info *,
                                    vbasedev->num_regions);
+        if (vbasedev->proxy != NULL) {
+            vbasedev->regfds = g_new0(int, vbasedev->num_regions);
+        }
     }
     /* check cache */
     if (vbasedev->regions[index] != NULL) {
@@ -2441,6 +2466,9 @@ retry:
     /* fill cache */
     vbasedev->regions[index] = g_malloc0(argsz);
     memcpy(vbasedev->regions[index], *info, argsz);
+    if (vbasedev->regfds != NULL) {
+        vbasedev->regfds[index] = fd;
+    }
 
     return 0;
 }
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index ed2a4d7..b40c4ed 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -923,6 +923,40 @@ static int vfio_user_get_info(VFIOProxy *proxy, struct vfio_device_info *info)
     return 0;
 }
 
+static int vfio_user_get_region_info(VFIOProxy *proxy,
+                                     struct vfio_region_info *info,
+                                     VFIOUserFDs *fds)
+{
+    g_autofree VFIOUserRegionInfo *msgp = NULL;
+    uint32_t size;
+
+    /* data returned can be larger than vfio_region_info */
+    if (info->argsz < sizeof(*info)) {
+        error_printf("vfio_user_get_region_info argsz too small\n");
+        return -EINVAL;
+    }
+    if (fds != NULL && fds->send_fds != 0) {
+        error_printf("vfio_user_get_region_info can't send FDs\n");
+        return -EINVAL;
+    }
+
+    size = info->argsz + sizeof(VFIOUserHdr);
+    msgp = g_malloc0(size);
+
+    vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_GET_REGION_INFO,
+                          sizeof(*msgp), 0);
+    msgp->argsz = info->argsz;
+    msgp->index = info->index;
+
+    vfio_user_send_wait(proxy, &msgp->hdr, fds, size, false);
+    if (msgp->hdr.flags & VFIO_USER_ERROR) {
+        return -msgp->hdr.error_reply;
+    }
+
+    memcpy(info, &msgp->argsz, info->argsz);
+    return 0;
+}
+
 
 /*
  * Socket-based io_ops
@@ -941,7 +975,23 @@ static int vfio_user_io_get_info(VFIODevice *vbasedev,
     return VDEV_VALID_INFO(vbasedev, info);
 }
 
+static int vfio_user_io_get_region_info(VFIODevice *vbasedev,
+                                        struct vfio_region_info *info,
+                                        int *fd)
+{
+    int ret;
+    VFIOUserFDs fds = { 0, 1, fd};
+
+    ret = vfio_user_get_region_info(vbasedev->proxy, info, &fds);
+    if (ret) {
+        return ret;
+    }
+
+    return VDEV_VALID_REGION_INFO(vbasedev, info, fd);
+}
+
 VFIODevIO vfio_dev_io_sock = {
     .get_info = vfio_user_io_get_info,
+    .get_region_info = vfio_user_io_get_region_info,
 };
 

From patchwork Tue Nov 9 00:46:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12609255 Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1FF84C433EF for ; Tue, 9 Nov 2021 00:55:04 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A569A61101 for ; Tue, 9 Nov 2021 00:55:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A569A61101 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:45996 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mkFPi-0002I2-QA for qemu-devel@archiver.kernel.org; Mon, 08 Nov 2021 19:55:02 -0500 Received: from eggs.gnu.org ([209.51.188.92]:51796) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAt-0005tG-Jz for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:43 -0500 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]:41572) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAX-00047D-Sn for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:43 -0500 Received: from pps.filterd (m0246617.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A902oTR005900 for ; Tue, 9 Nov 2021 00:39:20 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; bh=yM0bumMFeULVKTRz1r6VSG3f46GjI1cjQNu7OobJYr8=; b=m1bYeeEYV45uhHXVeuiDKT1zjZp6V7515s+LerEq181VdS9zq4FJO43tx9jQPFxzAanc 
OL8Nz9j4gEzedj1in/uCftKyGRKLrbVfzfQ3rqAviwE1hZFbW+B51hCoCVJNcg1ba7G0 gTxDYMrgYagcszqLW70/lPcGNKPM2zKKfVPDfEpb6f4ajOPxSta5YpvM+wPKMNirzJey 6sCq1g4LvihBPfz0DOUKGNc7ISZfuRf1NJZ2T6KIAKplYVYOrdG9E7mB+fO4c12Vif6z cETGM3t57LF9cVrfhaX1FnW9QAuJgn30oOJS6sJUhKNHoglCiUiRSVel3bsi250EgyMp Rg== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by mx0b-00069f02.pphosted.com with ESMTP id 3c6vkr01uk-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 00:39:20 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 1A90ZLN6129193 for ; Tue, 9 Nov 2021 00:39:19 GMT Received: from nam10-bn7-obe.outbound.protection.outlook.com (mail-bn7nam10lp2104.outbound.protection.outlook.com [104.47.70.104]) by userp3030.oracle.com with ESMTP id 3c5etuvb6n-9 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 00:39:18 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Vms3fnXYPtgJZe1ZZqzKSDPEHc+NAsLuAL/ZQ8P3PSRsW1kbAw2Ab4XIZ1+SY+eaT18FltffNRREceV0lyjiArscQzIIpe4Iz3ZfLc3PFji3LGV4Ur6bpMd4ty7lJVAoiVaMkeYLN4KI6Rm9sDNbiaXrb7E8+8mFLl2JQxBPOqkcPBnndkyv9lYC6wkp7X5Gpxkddwhc5MvHDpUraLHmXw/DxisWvyLSz5/UoJb6hAsNKNne5ZN/SuZmu8kqksK73CnuIjcJXMwJ2cFmL6PIV5nIEe2+EhBEpfP/J8mlIgVFdk0q9PRotikJHw9inv++i5iMxqXJnAiqM4UZF8Mceg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=yM0bumMFeULVKTRz1r6VSG3f46GjI1cjQNu7OobJYr8=; 
b=haA3mhGiqdFexYQ1K7YAW7gULUKA+PYGgN72w36MFiJXEmVsd8ZfpoMQAk3EReNgXaBFtWi9oKw52nrEJo0nWQOmupW8mdmLKYZ1KFvWvG7cbGWtSP1Hzngg90iWClIfa5xMNK8OpIPiG8xvEVqTUSLa0CQb+cw55KulkgeKOMLYfLl6fo4caqma+yuJ/Dy0nSDmig1kXJyrZNHGUUhDzRDK022pGbffNi3eRrYZiFKJH0nBRULxL8wTA3FM7qTeLowY/FGfBa5xRTdS+X5mhRtFdwZGmcMZv5WxwiRGhYX19gfLjhOIzNqkioAk1fTLasQicJ+eYa1u+sKiXJVc1w== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=yM0bumMFeULVKTRz1r6VSG3f46GjI1cjQNu7OobJYr8=; b=k7gnjGPQ68exjUghnTT18bqVQN3eTdku9uHIYBgoWKfRsY/Ftl1IlcvQAbUYIpWjnSh89NPzi52KRXIw/AbEjMTKlQAUBTiGn43PRpbE5umBNFEQs2xMwlTqoUL+ELYAYd43sTREZQobt3CfucDdAQW93Q76A4rINAP5PedCU4I= Authentication-Results: nongnu.org; dkim=none (message not signed) header.d=none;nongnu.org; dmarc=none action=none header.from=oracle.com; Received: from SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) by BY5PR10MB4068.namprd10.prod.outlook.com (2603:10b6:a03:1b2::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11; Tue, 9 Nov 2021 00:39:15 +0000 Received: from SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093]) by SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093%7]) with mapi id 15.20.4669.016; Tue, 9 Nov 2021 00:39:15 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v3 12/19] vfio-user: region read/write Date: Mon, 8 Nov 2021 16:46:40 -0800 Message-Id: X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) To SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) MIME-Version: 1.0 Received: from 
bruckner.us.oracle.com (73.71.20.66) by SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11 via Frontend Transport; Tue, 9 Nov 2021 00:39:13 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 91c310c0-db8f-499a-4045-08d9a3195a21 X-MS-TrafficTypeDiagnostic: BY5PR10MB4068: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:6430; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: SoISnHz8N069VngRcvvt69e9WNaPLyTbI3abJLiBMwEIoXb04hR5W+ypxrYbuz6PTJyaLf+rCJL+UY9/A9HpjWo1uwJXL//5VlYlYHaY/q7CvrfyCK94XMrBKEWZLHzsQLeYWtJ/vaaXIJIcgIwnSOGlaDI/1Z2EqjP+mM0bYtzgaPx4ozSIUm43DVT64U+us0neSTb/3KhSwEKgpbIcHqF+WLNc59HfwuCUj2Z+tAW5rV1UfqiX7jqvEpkfWM00irsw+jhXGxaoYf/miVdtVHGYjTKcSGxK26OMGRxHPUkINco5+9QKOT4xm0Zk4dNwo22p9ZHNXREu4CB14BwV0UIY8do4HrabnWrZQFtUUfrBVFSvYRlVYIDslPeiBbdsXbj4D/WklMBPKwclXGKvgkNDMCMHtmnPw7+IQaGzH6tj9qyXTTKYBVAgIXHlKE7qWqYwFy/xdOU9J7q4LbPFlCpS4zJoHxqHY5GNGXOnVKezdp/nPzDvKMp9lc7KZvscpxoFkkWPXUZix4KD/A0l95F9tXxIx8hyRv6yU63HPcpBJfsFq5VZwKUTNZJUuzu7sgxln6vMNo6uvzbCbVMvDAZJrMliYho3hx6jyAHAjkzuE2II/AefDYXoOYUiqTJco6+Mp+7ERp7ZF1xC7HEFk1xXJSzDidFV8/Ytd6vL9cO6YZpmMBUlrdNr4/axxTKiY5VTYC9gjUF/9BZbGFfAxw== X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:SJ0PR10MB4686.namprd10.prod.outlook.com; PTR:; CAT:NONE; SFS:(366004)(66476007)(316002)(186003)(66556008)(508600001)(8936002)(8676002)(66946007)(6486002)(26005)(2906002)(6916009)(5660300002)(36756003)(7696005)(52116002)(86362001)(83380400001)(956004)(38100700002)(6666004)(2616005)(38350700002); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: /SsFwiNTvGv5wcSObPQZe845V33nOIXzi4G3P94MnrboMpbScwRXO97l0hBFiNCbwcOAZK4MaCA0UspYDlqfWyCzux2BXp7ufuaIRVGD/90= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR10MB4068 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10162 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 adultscore=0 mlxlogscore=999 malwarescore=0 phishscore=0 mlxscore=0 suspectscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111090001 X-Proofpoint-GUID: aXRitgf1VG61bHgH_tSrozx-mXxO9_3J X-Proofpoint-ORIG-GUID: aXRitgf1VG61bHgH_tSrozx-mXxO9_3J Received-SPF: pass client-ip=205.220.165.32; envelope-from=john.g.johnson@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MSGID_FROM_MTA_HEADER=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- hw/vfio/pci.h | 1 + hw/vfio/user-protocol.h | 12 +++++ hw/vfio/user.h | 1 + include/hw/vfio/vfio-common.h | 1 + hw/vfio/common.c | 7 ++- hw/vfio/pci.c | 7 +++ hw/vfio/user.c | 101 ++++++++++++++++++++++++++++++++++++++++++ 7 files changed, 129 insertions(+), 1 deletion(-) diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h index ec9f345..643ff75 100644 --- a/hw/vfio/pci.h +++ 
b/hw/vfio/pci.h @@ -194,6 +194,7 @@ struct VFIOUserPCIDevice { VFIOPCIDevice device; char *sock_name; bool send_queued; /* all sends are queued */ + bool no_post; /* all regions write are sync */ }; /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */ diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index 104bf4f..56904cf 100644 --- a/hw/vfio/user-protocol.h +++ b/hw/vfio/user-protocol.h @@ -109,4 +109,16 @@ typedef struct { uint64_t offset; } VFIOUserRegionInfo; +/* + * VFIO_USER_REGION_READ + * VFIO_USER_REGION_WRITE + */ +typedef struct { + VFIOUserHdr hdr; + uint64_t offset; + uint32_t region; + uint32_t count; + char data[]; +} VFIOUserRegionRW; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio/user.h b/hw/vfio/user.h index 19edd84..f2098f2 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -75,6 +75,7 @@ typedef struct VFIOProxy { /* VFIOProxy flags */ #define VFIO_PROXY_CLIENT 0x1 #define VFIO_PROXY_FORCE_QUEUED 0x4 +#define VFIO_PROXY_NO_POST 0x8 VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp); void vfio_user_disconnect(VFIOProxy *proxy); diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index e2d7ee1..b498964 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -56,6 +56,7 @@ typedef struct VFIORegion { uint32_t nr_mmaps; VFIOMmap *mmaps; uint8_t nr; /* cache the region number for debug */ + bool post_wr; /* writes can be posted */ int remfd; /* fd if exported from remote process */ } VFIORegion; diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 47ec28f..e19f321 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -213,6 +213,7 @@ void vfio_region_write(void *opaque, hwaddr addr, uint32_t dword; uint64_t qword; } buf; + bool post = region->post_wr; int ret; switch (size) { @@ -233,7 +234,11 @@ void vfio_region_write(void *opaque, hwaddr addr, break; } - ret = VDEV_REGION_WRITE(vbasedev, region->nr, addr, size, &buf, false); + 
/* read-after-write hazard if guest can directly access region */ + if (region->nr_mmaps) { + post = false; + } + ret = VDEV_REGION_WRITE(vbasedev, region->nr, addr, size, &buf, post); if (ret != size) { const char *err = ret < 0 ? strerror(-ret) : "short write"; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 40eb9e6..d5f9987 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -1665,6 +1665,9 @@ static void vfio_bar_prepare(VFIOPCIDevice *vdev, int nr) bar->type = pci_bar & (bar->ioport ? ~PCI_BASE_ADDRESS_IO_MASK : ~PCI_BASE_ADDRESS_MEM_MASK); bar->size = bar->region.size; + + /* IO regions are sync, memory can be async */ + bar->region.post_wr = (bar->ioport == 0); } static void vfio_bars_prepare(VFIOPCIDevice *vdev) @@ -3513,6 +3516,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) if (udev->send_queued) { proxy->flags |= VFIO_PROXY_FORCE_QUEUED; } + if (udev->no_post) { + proxy->flags |= VFIO_PROXY_NO_POST; + } vfio_user_validate_version(vbasedev, &err); if (err != NULL) { @@ -3565,6 +3571,7 @@ static void vfio_user_instance_finalize(Object *obj) static Property vfio_user_pci_dev_properties[] = { DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name), DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, false), + DEFINE_PROP_BOOL("x-no-posted-writes", VFIOUserPCIDevice, no_post, false), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/vfio/user.c b/hw/vfio/user.c index b40c4ed..781cbfd 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -50,6 +50,8 @@ static void vfio_user_cb(void *opaque); static void vfio_user_request(void *opaque); static int vfio_user_send_queued(VFIOProxy *proxy, VFIOUserMsg *msg); +static void vfio_user_send_async(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds); static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize, bool nobql); static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, @@ -533,6 +535,33 @@ static int vfio_user_send_queued(VFIOProxy 
*proxy, VFIOUserMsg *msg) return 0; } +/* + * async send - msg can be queued, but will be freed when sent + */ +static void vfio_user_send_async(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds) +{ + VFIOUserMsg *msg; + int ret; + + if (!(hdr->flags & (VFIO_USER_NO_REPLY|VFIO_USER_REPLY))) { + error_printf("vfio_user_send_async on sync message\n"); + return; + } + + QEMU_LOCK_GUARD(&proxy->lock); + + msg = vfio_user_getmsg(proxy, hdr, fds); + msg->id = hdr->id; + msg->rsize = 0; + msg->type = VFIO_MSG_ASYNC; + + ret = vfio_user_send_queued(proxy, msg); + if (ret < 0) { + vfio_user_recycle(proxy, msg); + } +} + static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize, bool nobql) { @@ -957,6 +986,62 @@ static int vfio_user_get_region_info(VFIOProxy *proxy, return 0; } +static int vfio_user_region_read(VFIOProxy *proxy, uint8_t index, off_t offset, + uint32_t count, void *data) +{ + g_autofree VFIOUserRegionRW *msgp = NULL; + int size = sizeof(*msgp) + count; + + msgp = g_malloc0(size); + vfio_user_request_msg(&msgp->hdr, VFIO_USER_REGION_READ, sizeof(*msgp), 0); + msgp->offset = offset; + msgp->region = index; + msgp->count = count; + + vfio_user_send_wait(proxy, &msgp->hdr, NULL, size, false); + if (msgp->hdr.flags & VFIO_USER_ERROR) { + return -msgp->hdr.error_reply; + } else if (msgp->count > count) { + return -E2BIG; + } else { + memcpy(data, &msgp->data, msgp->count); + } + + return msgp->count; +} + +static int vfio_user_region_write(VFIOProxy *proxy, uint8_t index, off_t offset, + uint32_t count, void *data, bool post) +{ + VFIOUserRegionRW *msgp = NULL; + int flags = post ? 
VFIO_USER_NO_REPLY : 0; + int size = sizeof(*msgp) + count; + int ret; + + msgp = g_malloc0(size); + vfio_user_request_msg(&msgp->hdr, VFIO_USER_REGION_WRITE, size, flags); + msgp->offset = offset; + msgp->region = index; + msgp->count = count; + memcpy(&msgp->data, data, count); + + /* async send will free msg after it's sent */ + if (post && !(proxy->flags & VFIO_PROXY_NO_POST)) { + vfio_user_send_async(proxy, &msgp->hdr, NULL); + return count; + } + + vfio_user_send_wait(proxy, &msgp->hdr, NULL, 0, false); + if (msgp->hdr.flags & VFIO_USER_ERROR) { + ret = -msgp->hdr.error_reply; + } else { + ret = count; + } + + g_free(msgp); + return ret; +} + /* * Socket-based io_ops @@ -990,8 +1075,24 @@ static int vfio_user_io_get_region_info(VFIODevice *vbasedev, return VDEV_VALID_REGION_INFO(vbasedev, info, fd); } +static int vfio_user_io_region_read(VFIODevice *vbasedev, uint8_t index, + off_t off, uint32_t size, void *data) +{ + return vfio_user_region_read(vbasedev->proxy, index, off, size, data); +} + +static int vfio_user_io_region_write(VFIODevice *vbasedev, uint8_t index, + off_t off, unsigned size, void *data, + bool post) +{ + return vfio_user_region_write(vbasedev->proxy, index, off, size, data, + post); +} + VFIODevIO vfio_dev_io_sock = { .get_info = vfio_user_io_get_info, .get_region_info = vfio_user_io_get_region_info, + .region_read = vfio_user_io_region_read, + .region_write = vfio_user_io_region_write, }; From patchwork Tue Nov 9 00:46:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12609227 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92BD6C433EF for ; Tue, 9 Nov 2021 00:44:31 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with 
cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 013F361167 for ; Tue, 9 Nov 2021 00:44:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 013F361167 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:46596 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mkFFW-0000Dn-15 for qemu-devel@archiver.kernel.org; Mon, 08 Nov 2021 19:44:30 -0500 Received: from eggs.gnu.org ([209.51.188.92]:51600) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAc-0005eq-Fz for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:27 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:42472) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAX-00046y-0O for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:23 -0500 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A905Fj0019127 for ; Tue, 9 Nov 2021 00:39:20 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; bh=/Aa9qzXVUHvq/NOMR4G6DWxTTdoh5zu1dO1ABUGeHDw=; b=YyAAJZpgImAKgXwzdf7D/DoynyHsugKlv4acIs/QStHktCURFz+llQJoyAxnU4JQBmHO LGBL+EfHK97oA8KZE+rKKSl03o4bPeZBVIKBb/Fj9dtk2kF7G9FXjEoeTqrNpnov9zZL eBIvI2BEYnbNYDdkpHBU5KwsnurwvFEECOrnJew2cWNwg91dExO3WzOr3WhtERi5l5Eu bNkQGHKs55h0uZdTKsAyjBxJJf/ijXGDRu9TverRySGdfs+w9EM4kbgWCHcQcccfLAcH 8JJsgUtI+rWtHHJPpwp2sldwFR6jJM7Vvt2TJscNj0b4GVAKKeuVwcvKyu4v4hQXBKRI TQ== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by mx0b-00069f02.pphosted.com with ESMTP id 3c6sbk7c8d-1 
(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 00:39:19 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 1A90Zxm6132637 for ; Tue, 9 Nov 2021 00:39:19 GMT Received: from nam10-bn7-obe.outbound.protection.outlook.com (mail-bn7nam10lp2100.outbound.protection.outlook.com [104.47.70.100]) by aserp3030.oracle.com with ESMTP id 3c5frd6sqb-5 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Nov 2021 00:39:18 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Hw0zt7jAMl9KyjyZzrBKgCWcBVthi2FzT2hCG6pF/wI0uwemu6mcZJwQtMDQF1mJ7IR7LOhsteozUc1TCjwsw7c3+5Jj7E2BcNYovReK/la6T4m5SnF4Em7QyhbG/72V8n4pOF8rnWdbM7STkaxKpzJoHV8yhSGsi0nvJPS7YC+jEr3hsFDXgGfoTGvwCGEP4GysYWQzIWOjh+9BcxJvEA2g+Ndyoz9lV6jRJX8EJeX87d0nMh+FMfhJqyF6ZMjYEsT8e1jxRcegMXgcP++fAoIReE0Q7a/EMRSf+bZA7RJjP60LuBz3/8NP+mSOaJjRDPZnl5oYEs1RaNbZxH/8tA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=/Aa9qzXVUHvq/NOMR4G6DWxTTdoh5zu1dO1ABUGeHDw=; b=gmkBQkNSdA+9JlwPAAj1QmtifXyM/lfx3bXHmjsRU6uLGA6stXSs2CipOQa0qTQdmKx8SP+ojLzWgBdvErzuLtr7T3F+ZjYdH+6Xrhylg9eEC5t1/aLkcZIX+IS2X+KeIzXeTfOvg8M0GEamBvPvLCWNpFvfR+gWhlO9UOC+txR20gzJyMye85Y8o7iwErcI0hI1Ce+DBY/DrfVbttDwCZBvbnbiSKI8ZyL2QwB/EBBLzKmMgKdqCZmK+mYqp6xdMmXE781Z7n9+2MzSj+jTX5MzICTw9Qd9mmKmh35p3/84Mqafu59sa1pHSmG8+q4cAX5H3JKt78h+ggyKCL0wPA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=/Aa9qzXVUHvq/NOMR4G6DWxTTdoh5zu1dO1ABUGeHDw=; b=DsnI1Wm/+QA1ooPBJ2hFWU3Ev+FwL1fxUGW9maW9PgdrsQAmej3gqmaHYCJTs1gwhTguRsOV5bUKoiHsuaFjMj9l5xza2dOIL15XSLb91Z2QvDldlrEliWQeDHtAW2jWQtkGkoiualuOSdNQmmX+WyFQ7ZqRlcMtVs4MaA8q6uQ= Authentication-Results: nongnu.org; dkim=none (message not signed) header.d=none;nongnu.org; dmarc=none action=none header.from=oracle.com; Received: from SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) by BY5PR10MB4068.namprd10.prod.outlook.com (2603:10b6:a03:1b2::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11; Tue, 9 Nov 2021 00:39:15 +0000 Received: from SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093]) by SJ0PR10MB4686.namprd10.prod.outlook.com ([fe80::1551:92ba:9eb8:a093%7]) with mapi id 15.20.4669.016; Tue, 9 Nov 2021 00:39:15 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v3 13/19] vfio-user: pci_user_realize PCI setup Date: Mon, 8 Nov 2021 16:46:41 -0800 Message-Id: X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) To SJ0PR10MB4686.namprd10.prod.outlook.com (2603:10b6:a03:2d7::23) MIME-Version: 1.0 Received: from bruckner.us.oracle.com (73.71.20.66) by SJ0PR03CA0194.namprd03.prod.outlook.com (2603:10b6:a03:2ef::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.11 via Frontend Transport; Tue, 9 Nov 2021 00:39:13 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 7ba3db70-5917-4244-1acd-08d9a3195a45 X-MS-TrafficTypeDiagnostic: BY5PR10MB4068: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:6108; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 2Tr2lsSSrdhHXJhRf3Ul4w1Hg8hEpJXQQq/L5p+AUH7r5yhNFDh8AXssbmnsDzqSq8sid/GzhWIHwTdK7MwNF3XYzJqmUltO/KX+p+xP+oQ= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR10MB4068 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10162 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 mlxscore=0 suspectscore=0 bulkscore=0 spamscore=0 phishscore=0 malwarescore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111090001 X-Proofpoint-GUID: ogx7WL_bMmWg-7UN-gRUz45t5yVUlr2x X-Proofpoint-ORIG-GUID: ogx7WL_bMmWg-7UN-gRUz45t5yVUlr2x Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MSGID_FROM_MTA_HEADER=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" PCI BARs read from remote device PCI config reads/writes sent to remote server Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- hw/vfio/pci.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 88 insertions(+), 1 deletion(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index d5f9987..f8729b2 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ 
-3551,8 +3551,93 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) goto error; } + /* Get a copy of config space */ + ret = VDEV_REGION_READ(vbasedev, VFIO_PCI_CONFIG_REGION_INDEX, 0, + MIN(pci_config_size(pdev), vdev->config_size), + pdev->config); + if (ret < (int)MIN(pci_config_size(&vdev->pdev), vdev->config_size)) { + error_setg_errno(errp, -ret, "failed to read device config space"); + goto error; + } + + /* vfio emulates a lot for us, but some bits need extra love */ + vdev->emulated_config_bits = g_malloc0(vdev->config_size); + + /* QEMU can choose to expose the ROM or not */ + memset(vdev->emulated_config_bits + PCI_ROM_ADDRESS, 0xff, 4); + /* QEMU can also add or extend BARs */ + memset(vdev->emulated_config_bits + PCI_BASE_ADDRESS_0, 0xff, 6 * 4); + vdev->vendor_id = pci_get_word(pdev->config + PCI_VENDOR_ID); + vdev->device_id = pci_get_word(pdev->config + PCI_DEVICE_ID); + + /* QEMU can change multi-function devices to single function, or reverse */ + vdev->emulated_config_bits[PCI_HEADER_TYPE] = + PCI_HEADER_TYPE_MULTI_FUNCTION; + + /* Restore or clear multifunction, this is always controlled by QEMU */ + if (vdev->pdev.cap_present & QEMU_PCI_CAP_MULTIFUNCTION) { + vdev->pdev.config[PCI_HEADER_TYPE] |= PCI_HEADER_TYPE_MULTI_FUNCTION; + } else { + vdev->pdev.config[PCI_HEADER_TYPE] &= ~PCI_HEADER_TYPE_MULTI_FUNCTION; + } + + /* + * Clear host resource mapping info. If we choose not to register a + * BAR, such as might be the case with the option ROM, we can get + * confusing, unwritable, residual addresses from the host here. 
+ */ + memset(&vdev->pdev.config[PCI_BASE_ADDRESS_0], 0, 24); + memset(&vdev->pdev.config[PCI_ROM_ADDRESS], 0, 4); + + vfio_pci_size_rom(vdev); + + vfio_bars_prepare(vdev); + + vfio_msix_early_setup(vdev, &err); + if (err) { + error_propagate(errp, err); + goto error; + } + + vfio_bars_register(vdev); + + ret = vfio_add_capabilities(vdev, errp); + if (ret) { + goto out_teardown; + } + + /* QEMU emulates all of MSI & MSIX */ + if (pdev->cap_present & QEMU_PCI_CAP_MSIX) { + memset(vdev->emulated_config_bits + pdev->msix_cap, 0xff, + MSIX_CAP_LENGTH); + } + + if (pdev->cap_present & QEMU_PCI_CAP_MSI) { + memset(vdev->emulated_config_bits + pdev->msi_cap, 0xff, + vdev->msi_cap_size); + } + + if (vdev->pdev.config[PCI_INTERRUPT_PIN] != 0) { + vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, + vfio_intx_mmap_enable, vdev); + pci_device_set_intx_routing_notifier(&vdev->pdev, + vfio_intx_routing_notifier); + vdev->irqchip_change_notifier.notify = vfio_irqchip_change; + kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier); + ret = vfio_intx_enable(vdev, errp); + if (ret) { + goto out_deregister; + } + } + return; +out_deregister: + pci_device_set_intx_routing_notifier(&vdev->pdev, NULL); + kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier); +out_teardown: + vfio_teardown_msi(vdev); + vfio_bars_exit(vdev); error: vfio_user_disconnect(proxy); error_prepend(errp, VFIO_MSG_PREFIX, vdev->vbasedev.name); @@ -3565,7 +3650,9 @@ static void vfio_user_instance_finalize(Object *obj) vfio_put_device(vdev); - vfio_user_disconnect(vbasedev->proxy); + if (vbasedev->proxy != NULL) { + vfio_user_disconnect(vbasedev->proxy); + } } static Property vfio_user_pci_dev_properties[] = { From patchwork Tue Nov 9 00:46:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12609247 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C6F1C433EF for ; Tue, 9 Nov 2021 00:52:39 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id ED19860041 for ; Tue, 9 Nov 2021 00:52:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org ED19860041 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:38014 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mkFNO-0005IT-2m for qemu-devel@archiver.kernel.org; Mon, 08 Nov 2021 19:52:38 -0500 Received: from eggs.gnu.org ([209.51.188.92]:51752) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAi-0005ju-5O for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:32 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:44324) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAY-00047J-7E for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:31 -0500 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A905pwf019033 for ; Tue, 9 Nov 2021 00:39:21 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; bh=Dp8yTb3RzSzTBZUXECLNZdNuCarfOD2WRepONbKnpI4=; b=PmGVL+gayj637+DSd+g/hkh8srPQz303qkRQsdgQXRuxEPfEvJdpBym+MUEnx07eP2GS w7Zl3tR+/gONTU7TFLhjFl7ziM/UV1aqlNqYmpp6Xv+B8ynpdrJ68J27SnOsezn8EXqw 
bogCAQvk1ulEWX1JxOEPvzJTceSXwzYnF4AIzW4AF0q0eNQvoePQDbnhd9ctHakrcvpz 6DX8i5Bul1zHqae2LCfp5JqDrox3JdPpH0gmyZZdeE7gfdyukw9N/86t+aHPwfnrGZpz P72bmyz7uplQTeOomWY3Wzs5j3WXWFIHmXhDc58B4JhTiCyXNIl/fGvoVIzH2b9po+nl cg==
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v3 14/19] vfio-user: get and set IRQs
Date: Mon, 8 Nov 2021 16:46:42 -0800
X-Mailer: git-send-email 1.8.3.1
MIME-Version: 1.0
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: /voQMopkfyWqvVA455cBw3GGTOJ1222E8Ku6N6A6D5NyX78zeSmWIEVDkrQwRnpABqh9R9UeD5kOU9H679eZI4u0lRDjjjLrNFvyRhc2uVg=
X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR10MB4068
X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10162 signatures=668683
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 bulkscore=0 adultscore=0 mlxlogscore=999 malwarescore=0 phishscore=0 mlxscore=0 suspectscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111090001
X-Proofpoint-GUID: sTg_NZcY8FdOgXuzSU04fZNvlMjX2ynK
X-Proofpoint-ORIG-GUID: sTg_NZcY8FdOgXuzSU04fZNvlMjX2ynK
Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com
X-Spam_score_int: -27
X-Spam_score: -2.8
X-Spam_bar: --
X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MSGID_FROM_MTA_HEADER=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
---
 hw/vfio/user-protocol.h |  25 ++++++++++
 hw/vfio/pci.c           |   9 +++-
 hw/vfio/user.c          | 128 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 160 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index 56904cf..5614efa 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -110,6 +110,31 @@ typedef struct {
 } VFIOUserRegionInfo;

 /*
+ * VFIO_USER_DEVICE_GET_IRQ_INFO
+ * imported from struct vfio_irq_info
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint32_t argsz;
+    uint32_t flags;
+    uint32_t index;
+    uint32_t count;
+} VFIOUserIRQInfo;
+
+/*
+ * VFIO_USER_DEVICE_SET_IRQS
+ * imported from struct vfio_irq_set
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint32_t argsz;
+    uint32_t flags;
+    uint32_t index;
+    uint32_t start;
+    uint32_t count;
+} VFIOUserIRQSet;
+
+/*
  * VFIO_USER_REGION_READ
  * VFIO_USER_REGION_WRITE
  */
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index f8729b2..6f2d2fd 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -506,7 +506,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
         vdev->nr_vectors = nr + 1;
         ret = vfio_enable_vectors(vdev, true);
         if (ret) {
-            error_report("vfio: failed to enable vectors, %d", ret);
+            error_report("vfio: failed to enable vectors, %s", strerror(-ret));
         }
     } else {
         Error *err = NULL;
@@ -651,7 +651,8 @@ retry:
     ret = vfio_enable_vectors(vdev, false);
     if (ret) {
         if (ret < 0) {
-            error_report("vfio: Error: Failed to setup MSI fds: %m");
+            error_report("vfio: Error: Failed to setup MSI fds: %s",
+                         strerror(-ret));
         } else if (ret != vdev->nr_vectors) {
             error_report("vfio: Error: Failed to enable %d "
                          "MSI vectors, retry with %d", vdev->nr_vectors, ret);
@@ -2662,6 +2663,7 @@ static void vfio_populate_device(VFIOPCIDevice *vdev, Error **errp)
     irq_info.index = VFIO_PCI_ERR_IRQ_INDEX;

     ret = VDEV_GET_IRQ_INFO(vbasedev, &irq_info);
+
     if (ret) {
         /* This can fail for an old kernel or legacy PCI dev */
         trace_vfio_populate_device_get_irq_info_failure(strerror(errno));
@@ -3630,6 +3632,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
         }
     }

+    vfio_register_err_notifier(vdev);
+    vfio_register_req_notifier(vdev);
+
     return;

 out_deregister:
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 781cbfd..1e220b9 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -986,6 +986,113 @@ static int vfio_user_get_region_info(VFIOProxy *proxy,
     return 0;
 }

+static int vfio_user_get_irq_info(VFIOProxy *proxy,
+                                  struct vfio_irq_info *info)
+{
+    VFIOUserIRQInfo msg;
+
+    memset(&msg, 0, sizeof(msg));
+    vfio_user_request_msg(&msg.hdr, VFIO_USER_DEVICE_GET_IRQ_INFO,
+                          sizeof(msg), 0);
+    msg.argsz = info->argsz;
+    msg.index = info->index;
+
+    vfio_user_send_wait(proxy, &msg.hdr, NULL, 0, false);
+    if (msg.hdr.flags & VFIO_USER_ERROR) {
+        return -msg.hdr.error_reply;
+    }
+
+    memcpy(info, &msg.argsz, sizeof(*info));
+    return 0;
+}
+
+static int irq_howmany(int *fdp, int cur, int max)
+{
+    int n = 0;
+
+    if (fdp[cur] != -1) {
+        do {
+            n++;
+        } while (n < max && fdp[cur + n] != -1 && n < max_send_fds);
+    } else {
+        do {
+            n++;
+        } while (n < max && fdp[cur + n] == -1 && n < max_send_fds);
+    }
+
+    return n;
+}
+
+static int vfio_user_set_irqs(VFIOProxy *proxy, struct vfio_irq_set *irq)
+{
+    g_autofree VFIOUserIRQSet *msgp = NULL;
+    uint32_t size, nfds, send_fds, sent_fds;
+
+    if (irq->argsz < sizeof(*irq)) {
+        error_printf("vfio_user_set_irqs argsz too small\n");
+        return -EINVAL;
+    }
+
+    /*
+     * Handle simple case
+     */
+    if ((irq->flags & VFIO_IRQ_SET_DATA_EVENTFD) == 0) {
+        size = sizeof(VFIOUserHdr) + irq->argsz;
+        msgp = g_malloc0(size);
+
+        vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_SET_IRQS, size, 0);
+        msgp->argsz = irq->argsz;
+        msgp->flags = irq->flags;
+        msgp->index = irq->index;
+        msgp->start = irq->start;
+        msgp->count = irq->count;
+
+        vfio_user_send_wait(proxy, &msgp->hdr, NULL, 0, false);
+        if (msgp->hdr.flags & VFIO_USER_ERROR) {
+            return -msgp->hdr.error_reply;
+        }
+
+        return 0;
+    }
+
+    /*
+     * Calculate the number of FDs to send
+     * and adjust argsz
+     */
+    nfds = (irq->argsz - sizeof(*irq)) / sizeof(int);
+    irq->argsz = sizeof(*irq);
+    msgp = g_malloc0(sizeof(*msgp));
+    /*
+     * Send in chunks if over max_send_fds
+     */
+    for (sent_fds = 0; nfds > sent_fds; sent_fds += send_fds) {
+        VFIOUserFDs *arg_fds, loop_fds;
+
+        /* must send all valid FDs or all invalid FDs in single msg */
+        send_fds = irq_howmany((int *)irq->data, sent_fds, nfds - sent_fds);
+
+        vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_SET_IRQS,
+                              sizeof(*msgp), 0);
+        msgp->argsz = irq->argsz;
+        msgp->flags = irq->flags;
+        msgp->index = irq->index;
+        msgp->start = irq->start + sent_fds;
+        msgp->count = send_fds;
+
+        loop_fds.send_fds = send_fds;
+        loop_fds.recv_fds = 0;
+        loop_fds.fds = (int *)irq->data + sent_fds;
+        arg_fds = loop_fds.fds[0] != -1 ? &loop_fds : NULL;
+
+        vfio_user_send_wait(proxy, &msgp->hdr, arg_fds, 0, false);
+        if (msgp->hdr.flags & VFIO_USER_ERROR) {
+            return -msgp->hdr.error_reply;
+        }
+    }
+
+    return 0;
+}
+
 static int vfio_user_region_read(VFIOProxy *proxy, uint8_t index, off_t offset,
                                  uint32_t count, void *data)
 {
@@ -1075,6 +1182,25 @@ static int vfio_user_io_get_region_info(VFIODevice *vbasedev,
     return VDEV_VALID_REGION_INFO(vbasedev, info, fd);
 }

+static int vfio_user_io_get_irq_info(VFIODevice *vbasedev,
+                                     struct vfio_irq_info *irq)
+{
+    int ret;
+
+    ret = vfio_user_get_irq_info(vbasedev->proxy, irq);
+    if (ret) {
+        return ret;
+    }
+
+    return VDEV_VALID_IRQ_INFO(vbasedev, irq);
+}
+
+static int vfio_user_io_set_irqs(VFIODevice *vbasedev,
+                                 struct vfio_irq_set *irqs)
+{
+    return vfio_user_set_irqs(vbasedev->proxy, irqs);
+}
+
 static int vfio_user_io_region_read(VFIODevice *vbasedev, uint8_t index,
                                     off_t off, uint32_t size, void *data)
 {
@@ -1092,6 +1218,8 @@ static int vfio_user_io_region_write(VFIODevice *vbasedev, uint8_t index,
 VFIODevIO vfio_dev_io_sock = {
     .get_info = vfio_user_io_get_info,
     .get_region_info = vfio_user_io_get_region_info,
+    .get_irq_info = vfio_user_io_get_irq_info,
+    .set_irqs = vfio_user_io_set_irqs,
     .region_read = vfio_user_io_region_read,
     .region_write = vfio_user_io_region_write,
 };

From patchwork Tue Nov 9 00:46:43 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John
Johnson X-Patchwork-Id: 12609269
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v3 15/19] vfio-user: proxy container connect/disconnect
Date: Mon, 8 Nov 2021 16:46:43 -0800
X-Mailer: git-send-email 1.8.3.1
MIME-Version: 1.0
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: lju/F2DUd3fqJhmMNeeRAulu40dSX7XQKAYGUMDoB0kjmTyZHHT/qLnoW2jRQQsLhuWtURfza/rENntCYnsJdGJfGhR30d8xfsgxIPfZWb0=
X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR10MB4068
X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10162 signatures=668683
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 mlxscore=0 suspectscore=0 bulkscore=0 spamscore=0 phishscore=0 malwarescore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111090001
X-Proofpoint-ORIG-GUID: WcuqcbJsr6dg_UJcCnG1rzOFFbAt12iU
X-Proofpoint-GUID: WcuqcbJsr6dg_UJcCnG1rzOFFbAt12iU
Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MSGID_FROM_MTA_HEADER=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
---
 hw/vfio/user.h                |  1 +
 include/hw/vfio/vfio-common.h |  3 ++
 hw/vfio/common.c              | 98 +++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.c                 | 24 +++++++++++
 hw/vfio/user.c                |  3 ++
 5 files changed, 129 insertions(+)

diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index f2098f2..8d03e7c 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -85,5 +85,6 @@ void vfio_user_set_handler(VFIODevice *vbasedev,
 int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp);

 extern VFIODevIO vfio_dev_io_sock;
+extern VFIOContIO vfio_cont_io_sock;

 #endif /* VFIO_USER_H */
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index b498964..c0e7632 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -94,6 +94,7 @@ typedef struct VFIOContainer {
     uint64_t max_dirty_bitmap_size;
     unsigned long pgsizes;
     unsigned int dma_max_mappings;
+    VFIOProxy *proxy;
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
     QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
@@ -300,6 +301,8 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp);
 void vfio_put_group(VFIOGroup *group);
 int vfio_get_device(VFIOGroup *group, const char *name,
                     VFIODevice *vbasedev, Error **errp);
+void vfio_connect_proxy(VFIOProxy *proxy, VFIOGroup *group, AddressSpace *as);
+void vfio_disconnect_proxy(VFIOGroup *group);

 extern const MemoryRegionOps vfio_region_ops;
 typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index e19f321..fdd2702 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -19,6 +19,7 @@
  */

 #include "qemu/osdep.h"
+#include CONFIG_DEVICES
 #include
 #ifdef CONFIG_KVM
 #include
@@ -2204,6 +2205,62 @@ put_space_exit:
     return ret;
 }

+
+#ifdef CONFIG_VFIO_USER
+
+void vfio_connect_proxy(VFIOProxy *proxy, VFIOGroup *group, AddressSpace *as)
+{
+    VFIOAddressSpace *space;
+    VFIOContainer *container;
+
+    if (QLIST_EMPTY(&vfio_group_list)) {
+        qemu_register_reset(vfio_reset_handler, NULL);
+    }
+
+    QLIST_INSERT_HEAD(&vfio_group_list, group, next);
+
+    /*
+     * try to mirror vfio_connect_container()
+     * as much as possible
+     */
+
+    space = vfio_get_address_space(as);
+
+    container = g_malloc0(sizeof(*container));
+    container->space = space;
+    container->fd = -1;
+    container->io_ops = &vfio_cont_io_sock;
+    QLIST_INIT(&container->giommu_list);
+    QLIST_INIT(&container->hostwin_list);
+    container->proxy = proxy;
+
+    /*
+     * The proxy uses a SW IOMMU in lieu of the HW one
+     * used in the ioctl() version. Use TYPE1 with the
+     * target's page size for maximum compatibility
+     */
+    container->iommu_type = VFIO_TYPE1_IOMMU;
+    vfio_host_win_add(container, 0, (hwaddr)-1, TARGET_PAGE_SIZE);
+    container->pgsizes = TARGET_PAGE_SIZE;
+
+    container->dirty_pages_supported = true;
+    container->max_dirty_bitmap_size = VFIO_USER_DEF_MAX_XFER;
+    container->dirty_pgsizes = TARGET_PAGE_SIZE;
+
+    QLIST_INIT(&container->group_list);
+    QLIST_INSERT_HEAD(&space->containers, container, next);
+
+    group->container = container;
+    QLIST_INSERT_HEAD(&container->group_list, group, container_next);
+
+    container->listener = vfio_memory_listener;
+    memory_listener_register(&container->listener, container->space->as);
+    container->initialized = true;
+}
+
+#endif /* CONFIG_VFIO_USER */
+
+
 static void vfio_disconnect_container(VFIOGroup *group)
 {
     VFIOContainer *container = group->container;
@@ -2246,6 +2303,47 @@ static void vfio_disconnect_container(VFIOGroup *group)
     }
 }

+
+#ifdef CONFIG_VFIO_USER
+
+void vfio_disconnect_proxy(VFIOGroup *group)
+{
+    VFIOContainer *container = group->container;
+    VFIOAddressSpace *space = container->space;
+    VFIOGuestIOMMU *giommu, *tmp;
+
+    /*
+     * try to mirror vfio_disconnect_container()
+     * as much as possible, knowing each device
+     * is in one group and one container
+     */
+
+    QLIST_REMOVE(group, container_next);
+    group->container = NULL;
+
+    /*
+     * Explicitly release the listener first before unset container,
+     * since unset may destroy the backend container if it's the last
+     * group.
+     */
+    memory_listener_unregister(&container->listener);
+
+    QLIST_REMOVE(container, next);
+
+    QLIST_FOREACH_SAFE(giommu, &container->giommu_list, giommu_next, tmp) {
+        memory_region_unregister_iommu_notifier(
+                MEMORY_REGION(giommu->iommu), &giommu->n);
+        QLIST_REMOVE(giommu, giommu_next);
+        g_free(giommu);
+    }
+
+    g_free(container);
+    vfio_put_address_space(space);
+}
+
+#endif /* CONFIG_VFIO_USER */
+
+
 VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp)
 {
     VFIOGroup *group;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 6f2d2fd..d657b01 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3489,6 +3489,7 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
     VFIODevice *vbasedev = &vdev->vbasedev;
     SocketAddress addr;
     VFIOProxy *proxy;
+    VFIOGroup *group = NULL;
     struct vfio_device_info info;
     int ret;
     Error *err = NULL;
@@ -3537,6 +3538,19 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
     vbasedev->io_ops = &vfio_dev_io_sock;
     vbasedev->valid_ops = &vfio_pci_valid_ops;

+    /*
+     * each device gets its own group and container
+     * make them unrelated to any host IOMMU groupings
+     */
+    group = g_malloc0(sizeof(*group));
+    group->fd = -1;
+    group->groupid = -1;
+    QLIST_INIT(&group->device_list);
+    QLIST_INSERT_HEAD(&group->device_list, vbasedev, next);
+    vbasedev->group = group;
+
+    vfio_connect_proxy(proxy, group, pci_device_iommu_address_space(pdev));
+
     ret = VDEV_GET_INFO(vbasedev, &info);
     if (ret) {
         error_setg_errno(errp, -ret, "get info failure");
@@ -3644,6 +3658,9 @@ out_teardown:
     vfio_teardown_msi(vdev);
     vfio_bars_exit(vdev);
 error:
+    if (group != NULL) {
+        vfio_disconnect_proxy(group);
+    }
     vfio_user_disconnect(proxy);
     error_prepend(errp, VFIO_MSG_PREFIX, vdev->vbasedev.name);
 }
@@ -3652,6 +3669,13 @@ static void vfio_user_instance_finalize(Object *obj)
 {
     VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
     VFIODevice *vbasedev = &vdev->vbasedev;
+    VFIOGroup *group = vbasedev->group;
+
+    if (group != NULL) {
+        vfio_disconnect_proxy(group);
+        g_free(group);
+        vbasedev->group = NULL;
+    }

     vfio_put_device(vdev);

diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 1e220b9..70fe7a6 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -1224,3 +1224,6 @@ VFIODevIO vfio_dev_io_sock = {
     .region_write = vfio_user_io_region_write,
 };

+
+VFIOContIO vfio_cont_io_sock = {
+};

From patchwork Tue Nov 9 00:46:44 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12609241
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91B0BC433EF for ; Tue, 9 Nov 2021 00:51:13 +0000 (UTC)
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 07B2461175 for ; Tue, 9 Nov 2021 00:51:13 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 07B2461175
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org
Received: from localhost ([::1]:59770 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mkFM0-0000rJ-0p for qemu-devel@archiver.kernel.org; Mon, 08 Nov 2021 19:51:12 -0500
Received: from eggs.gnu.org ([209.51.188.92]:51810) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAy-0005xg-Ai for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:51 -0500
Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]:42580) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAY-00047M-Jk for qemu-devel@nongnu.org; Mon, 08 Nov
2021 19:39:46 -0500
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v3 16/19] vfio-user: dma map/unmap operations
Date: Mon, 8 Nov 2021 16:46:44 -0800
Message-Id: <9317e19ef1b2b73864be268b6715fcf53a0704a4.1636057885.git.john.g.johnson@oracle.com>
X-Mailer: git-send-email 1.8.3.1
MIME-Version: 1.0
X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:SJ0PR10MB4686.namprd10.prod.outlook.com; PTR:; CAT:NONE; 
SFS:(366004)(66476007)(316002)(186003)(66556008)(508600001)(8936002)(8676002)(66946007)(6486002)(26005)(2906002)(6916009)(5660300002)(36756003)(7696005)(52116002)(86362001)(83380400001)(956004)(38100700002)(6666004)(2616005)(30864003)(38350700002); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: N/Qy8DvIpDpMoe840yd7KukB6rn+Xs8iV/TP4z9bqySODMrbVNPfFM228sIqhHHu3vOVqUL8Czk5UcILQCpOGnDC3CB0Y2z6bjg8ShpHywsYz+6H2JMM2ugvas3L69nhy9lv0L4BL3ZnpuX48nQIdvkNfCmmgsoUZbmPe5MYScPvTspAsjrgP0EAubsLVJqOvVdOuogYYSTC1NfNGnFGG0oKRnE4TZrOPUqMBf+5t/FZ4LuAoj6kyWWvrCCVvX/mfc3vrZt93iJfZwcKnBjyjB+C4BOVTa+HPLzAaM+BXWQ/dLt89KDCO36ZKhcW/qeGa53+akG7p5qnc5JpVEE5FEWfqsMcur5xqxhvxRNcn1IYKMZsWecTF+xJnZRbfOhRYTAe5y7TqZncJoUJ3fHMSINzv1bBI+zfz8Hehkfhfp9htBkb/oXGBSRKMPXI0xn/JV9Yb583C++X9OPnjFOvZcgCiCEBauORFFokCnaVcnNe71jjas/GTpl3vtE+8XA/oLrOZJrfZl23h5aG0SrlEjPfhNjSU+i4RvG5Sv19YEVz0etMahUvhUYfl48Tt812Vqq1BiyawhahHhUJgSmMlryMgZguWdcN3C+BTNTr8/StJsovHZ7Qo8cZJNeGRXS5u9225oRsG4Rzo7VuxLKsXwwu+Of3kxNzsbPB0ErUBuyia+JswUPNZSAU+zbuGNzsldaL6z4i+pLYMPUmuX8X4wCe+8lXaN/Wd81CC9gkFy79CvXO20rmOExFYEH7SWBQa6lNjkFAlYFZrtaszJCKP4YKBT+ZbpHDnCIGbqgzXJDVJpgp0H+L1xOgXc/MCJdlZQlCioXOWfWMcwNAAhYq/ww/JXwMGQxG4J3J96ct00GXOyabzrzXDNLpvWajNeNwzNYiyCZjuzBvd1yeFX+zTOQRfeig13HqVym9tApi5CbkJNUk4KGDmzJi9UOuuiyNPl5rYlt+BCrrMwlvs6YFyhwFL9PDSSeYhJ85U4kvqP1CYINS9iGZuP4Jt3M1Y65r9WXk0STeVXVjBjnsUPEwMDBD8FuVoVa/O+XS7tUQmu1mNd41rUxCTEfk39l1G1MiFUtTt0YRi8GxQFV9Ul3DkailxqQopquTqdL/IIbRlvcmWUh8YiQ2QyNs446DPpJ+CHOOsULP4kVTDFA0qlQ6ZeohL1ry4lVkl0FtdR2fZU2T9JcvhXWXJrUYN+rKOBbXZtcjPb1K+iK51zxWLefWDwWG3K0QsV29FDiLO6O4wY1xkpyK9bWPC5fveBMG+CBwUQadQuTb2CAAdQjEvT9va7GP+v0hHm/rxvELxEubN6kE3aYZtvcGC24wOsxmpGCtIOBBgX9qUT4GJ9H7yUQ9BeZxlSYhHbTwbaJT8YLMjg8K9GBiV6+8OgB/WA9/5JYPh65rJsn/cHRdLBGiuQ7XGghiE0cFKjfW/4ofSygaG/QHDD8wsnjuIu9MjNJGJ8aILAIqQhUKKWAlyNvkkU9qlFE4y5GPg02d45RFbZJvK4wOuY8dn+xtndYubmuFxj64Tx7VoGauN7M2Q/LcOE+c+VaE+rYGm2hbu4BcPDfomZMhWOO2vxpPwI+o48HnC/wMNpz0emaDa3VcPSn32mLY1PDuvyqR6kvro1DE5vbyrFVdM
B2gUsFJiAZGQ6YVbGmqkaj+0PYvfQJ9hiDitEoNnQ== X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: 91910b1f-1ba7-479f-ad5b-08d9a3195aa9 X-MS-Exchange-CrossTenant-AuthSource: SJ0PR10MB4686.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 09 Nov 2021 00:39:14.1121 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: Jlg60HYRh49K+KYHNALGLmlH+rEF9CwQ2kKTM0yeRv7c051DWlde6g0svIdJKz8jk6yeLMMbYlD2CjKicSgR96S6EvoZae2TmS7vSmYDYUw= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR10MB4068 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10162 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 mlxscore=0 suspectscore=0 bulkscore=0 spamscore=0 phishscore=0 malwarescore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111090001 X-Proofpoint-GUID: GslTr3ClqdxN6G-jUHEoquw1P7VqauLo X-Proofpoint-ORIG-GUID: GslTr3ClqdxN6G-jUHEoquw1P7VqauLo Received-SPF: pass client-ip=205.220.165.32; envelope-from=john.g.johnson@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MSGID_FROM_MTA_HEADER=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Jagannathan Raman Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson --- hw/vfio/pci.h | 1 + 
hw/vfio/user-protocol.h | 32 +++++++ hw/vfio/user.h | 1 + include/hw/vfio/vfio-common.h | 4 + hw/vfio/common.c | 76 +++++++++++++--- hw/vfio/pci.c | 4 + hw/vfio/user.c | 206 ++++++++++++++++++++++++++++++++++++++++++ 7 files changed, 309 insertions(+), 15 deletions(-) diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h index 643ff75..156fee2 100644 --- a/hw/vfio/pci.h +++ b/hw/vfio/pci.h @@ -193,6 +193,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI) struct VFIOUserPCIDevice { VFIOPCIDevice device; char *sock_name; + bool secure_dma; /* disable shared mem for DMA */ bool send_queued; /* all sends are queued */ bool no_post; /* all regions write are sync */ }; diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index 5614efa..ca53fce 100644 --- a/hw/vfio/user-protocol.h +++ b/hw/vfio/user-protocol.h @@ -83,6 +83,31 @@ typedef struct { /* + * VFIO_USER_DMA_MAP + * imported from struct vfio_iommu_type1_dma_map + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint64_t offset; /* FD offset */ + uint64_t iova; + uint64_t size; +} VFIOUserDMAMap; + +/* + * VFIO_USER_DMA_UNMAP + * imported from struct vfio_iommu_type1_dma_unmap + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint64_t iova; + uint64_t size; +} VFIOUserDMAUnmap; + +/* * VFIO_USER_DEVICE_GET_INFO * imported from struct_device_info */ @@ -146,4 +171,11 @@ typedef struct { char data[]; } VFIOUserRegionRW; +/*imported from struct vfio_bitmap */ +typedef struct { + uint64_t pgsize; + uint64_t size; + char data[]; +} VFIOUserBitmap; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio/user.h b/hw/vfio/user.h index 8d03e7c..997f748 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -74,6 +74,7 @@ typedef struct VFIOProxy { /* VFIOProxy flags */ #define VFIO_PROXY_CLIENT 0x1 +#define VFIO_PROXY_SECURE 0x2 #define VFIO_PROXY_FORCE_QUEUED 0x4 #define VFIO_PROXY_NO_POST 0x8 diff --git a/include/hw/vfio/vfio-common.h 
b/include/hw/vfio/vfio-common.h index c0e7632..dcfae2c 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -90,6 +90,8 @@ typedef struct VFIOContainer { VFIOContIO *io_ops; bool initialized; bool dirty_pages_supported; + bool will_commit; + bool need_map_fd; uint64_t dirty_pgsizes; uint64_t max_dirty_bitmap_size; unsigned long pgsizes; @@ -210,6 +212,7 @@ struct VFIOContIO { int (*dirty_bitmap)(VFIOContainer *container, struct vfio_iommu_type1_dirty_bitmap *bitmap, struct vfio_iommu_type1_dirty_bitmap_get *range); + void (*wait_commit)(VFIOContainer *container); }; #define CONT_DMA_MAP(cont, map, fd, will_commit) \ @@ -218,6 +221,7 @@ struct VFIOContIO { ((cont)->io_ops->dma_unmap((cont), (unmap), (bitmap), (will_commit))) #define CONT_DIRTY_BITMAP(cont, bitmap, range) \ ((cont)->io_ops->dirty_bitmap((cont), (bitmap), (range))) +#define CONT_WAIT_COMMIT(cont) ((cont)->io_ops->wait_commit(cont)) extern VFIODevIO vfio_dev_io_ioctl; extern VFIOContIO vfio_cont_io_ioctl; diff --git a/hw/vfio/common.c b/hw/vfio/common.c index fdd2702..0840c8f 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -411,6 +411,7 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container, struct vfio_iommu_type1_dma_unmap *unmap; struct vfio_bitmap *bitmap; uint64_t pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size; + bool will_commit = container->will_commit; int ret; unmap = g_malloc0(sizeof(*unmap) + sizeof(*bitmap)); @@ -444,7 +445,7 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container, goto unmap_exit; } - ret = CONT_DMA_UNMAP(container, unmap, bitmap, false); + ret = CONT_DMA_UNMAP(container, unmap, bitmap, will_commit); if (!ret) { cpu_physical_memory_set_dirty_lebitmap((unsigned long *)bitmap->data, iotlb->translated_addr, pages); @@ -471,16 +472,17 @@ static int vfio_dma_unmap(VFIOContainer *container, .iova = iova, .size = size, }; + bool will_commit = container->will_commit; if (iotlb && container->dirty_pages_supported && 
vfio_devices_all_running_and_saving(container)) { return vfio_dma_unmap_bitmap(container, iova, size, iotlb); } - return CONT_DMA_UNMAP(container, &unmap, NULL, false); + return CONT_DMA_UNMAP(container, &unmap, NULL, will_commit); } -static int vfio_dma_map(VFIOContainer *container, hwaddr iova, +static int vfio_dma_map(VFIOContainer *container, MemoryRegion *mr, hwaddr iova, ram_addr_t size, void *vaddr, bool readonly) { struct vfio_iommu_type1_dma_map map = { @@ -490,13 +492,23 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova, .iova = iova, .size = size, }; - int ret; + int fd, ret; + bool will_commit = container->will_commit; if (!readonly) { map.flags |= VFIO_DMA_MAP_FLAG_WRITE; } - ret = CONT_DMA_MAP(container, &map, -1, false); + if (container->need_map_fd) { + fd = memory_region_get_fd(mr); + if (fd != -1) { + map.vaddr = qemu_ram_block_host_offset(mr->ram_block, vaddr); + } + } else { + fd = -1; + } + + ret = CONT_DMA_MAP(container, &map, fd, will_commit); if (ret < 0) { error_report("VFIO_MAP_DMA failed: %s", strerror(-ret)); @@ -557,7 +569,8 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section) /* Called with rcu_read_lock held. 
*/ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, - ram_addr_t *ram_addr, bool *read_only) + ram_addr_t *ram_addr, bool *read_only, + MemoryRegion **mrp) { MemoryRegion *mr; hwaddr xlat; @@ -638,6 +651,10 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, *read_only = !writable || mr->readonly; } + if (mrp != NULL) { + *mrp = mr; + } + return true; } @@ -645,6 +662,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) { VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n); VFIOContainer *container = giommu->container; + MemoryRegion *mr; hwaddr iova = iotlb->iova + giommu->iommu_offset; void *vaddr; int ret; @@ -663,7 +681,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) { bool read_only; - if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only)) { + if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, &mr)) { goto out; } /* @@ -673,14 +691,14 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) * of vaddr will always be there, even if the memory object is * destroyed and its backing memory munmap-ed. 
*/ - ret = vfio_dma_map(container, iova, + ret = vfio_dma_map(container, mr, iova, iotlb->addr_mask + 1, vaddr, read_only); if (ret) { error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", " - "0x%"HWADDR_PRIx", %p) = %d (%m)", + "0x%"HWADDR_PRIx", %p)", container, iova, - iotlb->addr_mask + 1, vaddr, ret); + iotlb->addr_mask + 1, vaddr); } } else { ret = vfio_dma_unmap(container, iova, iotlb->addr_mask + 1, iotlb); @@ -735,7 +753,7 @@ static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl, section->offset_within_address_space; vaddr = memory_region_get_ram_ptr(section->mr) + start; - ret = vfio_dma_map(vrdl->container, iova, next - start, + ret = vfio_dma_map(vrdl->container, section->mr, iova, next - start, vaddr, section->readonly); if (ret) { /* Rollback */ @@ -843,6 +861,23 @@ static void vfio_unregister_ram_discard_listener(VFIOContainer *container, g_free(vrdl); } +static void vfio_listener_begin(MemoryListener *listener) +{ + VFIOContainer *container = container_of(listener, VFIOContainer, listener); + + /* cannot drop BQL during the transaction, send maps/demaps async */ + container->will_commit = true; +} + +static void vfio_listener_commit(MemoryListener *listener) +{ + VFIOContainer *container = container_of(listener, VFIOContainer, listener); + + /* wait for any async requests sent during the transaction */ + CONT_WAIT_COMMIT(container); + container->will_commit = false; +} + static void vfio_listener_region_add(MemoryListener *listener, MemoryRegionSection *section) { @@ -1035,12 +1070,12 @@ static void vfio_listener_region_add(MemoryListener *listener, } } - ret = vfio_dma_map(container, iova, int128_get64(llsize), + ret = vfio_dma_map(container, section->mr, iova, int128_get64(llsize), vaddr, section->readonly); if (ret) { error_setg(&err, "vfio_dma_map(%p, 0x%"HWADDR_PRIx", " - "0x%"HWADDR_PRIx", %p) = %d (%m)", - container, iova, int128_get64(llsize), vaddr, ret); + "0x%"HWADDR_PRIx", %p)", + container, iova, int128_get64(llsize), vaddr); 
if (memory_region_is_ram_device(section->mr)) { /* Allow unexpected mappings not to be fatal for RAM devices */ error_report_err(err); @@ -1301,7 +1336,7 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) } rcu_read_lock(); - if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) { + if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL, NULL)) { int ret; ret = vfio_get_dirty_bitmap(container, iova, iotlb->addr_mask + 1, @@ -1418,6 +1453,8 @@ static void vfio_listener_log_sync(MemoryListener *listener, } static const MemoryListener vfio_memory_listener = { + .begin = vfio_listener_begin, + .commit = vfio_listener_commit, .region_add = vfio_listener_region_add, .region_del = vfio_listener_region_del, .log_global_start = vfio_listener_log_global_start, @@ -1561,6 +1598,7 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region, region->size = info->size; region->fd_offset = info->offset; region->nr = index; + region->post_wr = false; region->remfd = vfio_get_region_info_remfd(vbasedev, index); if (region->size) { @@ -2047,6 +2085,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container->dirty_pages_supported = false; container->dma_max_mappings = 0; container->io_ops = &vfio_cont_io_ioctl; + container->need_map_fd = false; QLIST_INIT(&container->giommu_list); QLIST_INIT(&container->hostwin_list); QLIST_INIT(&container->vrdl_list); @@ -2230,6 +2269,7 @@ void vfio_connect_proxy(VFIOProxy *proxy, VFIOGroup *group, AddressSpace *as) container->space = space; container->fd = -1; container->io_ops = &vfio_cont_io_sock; + container->need_map_fd = (proxy->flags & VFIO_PROXY_SECURE) == 0; QLIST_INIT(&container->giommu_list); QLIST_INIT(&container->hostwin_list); container->proxy = proxy; @@ -2879,8 +2919,14 @@ static int vfio_io_dirty_bitmap(VFIOContainer *container, return ret; } +static void vfio_io_wait_commit(VFIOContainer *container) +{ + /* ioctl()s are synchronous */ +} + 
VFIOContIO vfio_cont_io_ioctl = { .dma_map = vfio_io_dma_map, .dma_unmap = vfio_io_dma_unmap, .dirty_bitmap = vfio_io_dirty_bitmap, + .wait_commit = vfio_io_wait_commit, }; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index d657b01..ca821da 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3516,6 +3516,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) vbasedev->proxy = proxy; vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev); + if (udev->secure_dma) { + proxy->flags |= VFIO_PROXY_SECURE; + } if (udev->send_queued) { proxy->flags |= VFIO_PROXY_FORCE_QUEUED; } @@ -3686,6 +3689,7 @@ static void vfio_user_instance_finalize(Object *obj) static Property vfio_user_pci_dev_properties[] = { DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name), + DEFINE_PROP_BOOL("secure-dma", VFIOUserPCIDevice, secure_dma, false), DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, false), DEFINE_PROP_BOOL("x-no-posted-writes", VFIOUserPCIDevice, no_post, false), DEFINE_PROP_END_OF_LIST(), diff --git a/hw/vfio/user.c b/hw/vfio/user.c index 70fe7a6..cee08b6 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -52,8 +52,11 @@ static void vfio_user_request(void *opaque); static int vfio_user_send_queued(VFIOProxy *proxy, VFIOUserMsg *msg); static void vfio_user_send_async(VFIOProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds); +static void vfio_user_send_nowait(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize); static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize, bool nobql); +static void vfio_user_wait_reqs(VFIOProxy *proxy); static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, uint32_t size, uint32_t flags); @@ -562,6 +565,36 @@ static void vfio_user_send_async(VFIOProxy *proxy, VFIOUserHdr *hdr, } } +/* + * nowait send - vfio_user_wait_reqs() can wait for it later + */ +static void vfio_user_send_nowait(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize) +{
+ VFIOUserMsg *msg; + int ret; + + if (hdr->flags & VFIO_USER_NO_REPLY) { + error_printf("vfio_user_send_nowait on async message\n"); + return; + } + + QEMU_LOCK_GUARD(&proxy->lock); + + msg = vfio_user_getmsg(proxy, hdr, fds); + msg->id = hdr->id; + msg->rsize = rsize ? rsize : hdr->size; + msg->type = VFIO_MSG_NOWAIT; + + ret = vfio_user_send_queued(proxy, msg); + if (ret < 0) { + vfio_user_recycle(proxy, msg); + return; + } + + proxy->last_nowait = msg; +} + static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize, bool nobql) { @@ -610,6 +643,56 @@ static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr, } } +static void vfio_user_wait_reqs(VFIOProxy *proxy) +{ + VFIOUserMsg *msg; + bool iolock = false; + + /* + * Any DMA map/unmap requests sent in the middle + * of a memory region transaction were sent nowait. + * Wait for them here. + */ + qemu_mutex_lock(&proxy->lock); + if (proxy->last_nowait != NULL) { + iolock = qemu_mutex_iothread_locked(); + if (iolock) { + qemu_mutex_unlock_iothread(); + } + + /* + * Change type to WAIT to wait for reply + */ + msg = proxy->last_nowait; + msg->type = VFIO_MSG_WAIT; + while (!msg->complete) { + if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) { + error_printf("vfio_user_wait_reqs - timed out\n"); + break; + } + } + + if (msg->hdr->flags & VFIO_USER_ERROR) { + error_printf("vfio_user_wait_reqs - error reply on async request "); + error_printf("command %x error %s\n", msg->hdr->command, + strerror(msg->hdr->error_reply)); + } + + proxy->last_nowait = NULL; + /* + * Change type back to NOWAIT to free + */ + msg->type = VFIO_MSG_NOWAIT; + vfio_user_recycle(proxy, msg); + } + + /* lock order is BQL->proxy - don't hold proxy when getting BQL */ + qemu_mutex_unlock(&proxy->lock); + if (iolock) { + qemu_mutex_lock_iothread(); + } +} + static QLIST_HEAD(, VFIOProxy) vfio_user_sockets = QLIST_HEAD_INITIALIZER(vfio_user_sockets); @@ -935,6 +1018,102 @@ int
vfio_user_validate_version(VFIODevice *vbasedev, Error **errp) return 0; } +static int vfio_user_dma_map(VFIOProxy *proxy, + struct vfio_iommu_type1_dma_map *map, + int fd, bool will_commit) +{ + VFIOUserFDs *fds = NULL; + VFIOUserDMAMap *msgp = g_malloc0(sizeof(*msgp)); + int ret; + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_DMA_MAP, sizeof(*msgp), 0); + msgp->argsz = map->argsz; + msgp->flags = map->flags; + msgp->offset = map->vaddr; + msgp->iova = map->iova; + msgp->size = map->size; + + /* + * The will_commit case sends without blocking or dropping BQL. + * The requests are later waited for in vfio_user_wait_reqs(). + */ + if (will_commit) { + /* can't use auto variable since we don't block */ + if (fd != -1) { + fds = vfio_user_getfds(1); + fds->send_fds = 1; + fds->fds[0] = fd; + } + vfio_user_send_nowait(proxy, &msgp->hdr, fds, 0); + ret = 0; + } else { + VFIOUserFDs local_fds = { 1, 0, &fd }; + + fds = fd != -1 ? &local_fds : NULL; + vfio_user_send_wait(proxy, &msgp->hdr, fds, 0, will_commit); + ret = (msgp->hdr.flags & VFIO_USER_ERROR) ? -msgp->hdr.error_reply : 0; + g_free(msgp); + } + + return ret; +} + +static int vfio_user_dma_unmap(VFIOProxy *proxy, + struct vfio_iommu_type1_dma_unmap *unmap, + struct vfio_bitmap *bitmap, bool will_commit) +{ + struct { + VFIOUserDMAUnmap msg; + VFIOUserBitmap bitmap; + } *msgp = NULL; + int msize, rsize; + bool blocking = !will_commit; + + if (bitmap == NULL && + (unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP)) { + error_printf("vfio_user_dma_unmap mismatched flags and bitmap\n"); + return -EINVAL; + } + + /* + * If a dirty bitmap is returned, allocate extra space for it + * and block for reply even in the will_commit case. + * Otherwise, can send the unmap request without waiting.
+ */ + if (bitmap != NULL) { + blocking = true; + msize = sizeof(*msgp); + rsize = msize + bitmap->size; + msgp = g_malloc0(rsize); + msgp->bitmap.pgsize = bitmap->pgsize; + msgp->bitmap.size = bitmap->size; + } else { + msize = rsize = sizeof(VFIOUserDMAUnmap); + msgp = g_malloc0(rsize); + } + + vfio_user_request_msg(&msgp->msg.hdr, VFIO_USER_DMA_UNMAP, msize, 0); + msgp->msg.argsz = unmap->argsz; + msgp->msg.flags = unmap->flags; + msgp->msg.iova = unmap->iova; + msgp->msg.size = unmap->size; + + if (blocking) { + int ret = 0; + + vfio_user_send_wait(proxy, &msgp->msg.hdr, NULL, rsize, will_commit); + if (msgp->msg.hdr.flags & VFIO_USER_ERROR) { + ret = -msgp->msg.hdr.error_reply; + } else if (bitmap != NULL) { + memcpy(bitmap->data, &msgp->bitmap.data, bitmap->size); + } + g_free(msgp); + return ret; + } + vfio_user_send_nowait(proxy, &msgp->msg.hdr, NULL, rsize); + return 0; +} + static int vfio_user_get_info(VFIOProxy *proxy, struct vfio_device_info *info) { VFIOUserDeviceInfo msg; @@ -1225,5 +1404,32 @@ VFIODevIO vfio_dev_io_sock = { }; +static int vfio_user_io_dma_map(VFIOContainer *container, + struct vfio_iommu_type1_dma_map *map, + int fd, bool will_commit) +{ + if (fd != -1) { + return vfio_user_dma_map(container->proxy, map, fd, will_commit); + } else { + map->vaddr = 0; + return vfio_user_dma_map(container->proxy, map, -1, will_commit); + } +} + +static int vfio_user_io_dma_unmap(VFIOContainer *container, + struct vfio_iommu_type1_dma_unmap *unmap, + struct vfio_bitmap *bitmap, bool will_commit) +{ + return vfio_user_dma_unmap(container->proxy, unmap, bitmap, will_commit); +} + +static void vfio_user_io_wait_commit(VFIOContainer *container) +{ + vfio_user_wait_reqs(container->proxy); +} + VFIOContIO vfio_cont_io_sock = { + .dma_map = vfio_user_io_dma_map, + .dma_unmap = vfio_user_io_dma_unmap, + .wait_commit = vfio_user_io_wait_commit, }; From patchwork Tue Nov 9 00:46:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12609253 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v3 17/19] vfio-user: dma read/write operations Date: Mon, 8 Nov 2021 16:46:45 -0800 Message-Id: <26773d11c721cea1df74df061ae2c101ce3a975c.1636057885.git.john.g.johnson@oracle.com> X-Mailer: git-send-email 1.8.3.1

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
---
 hw/vfio/user-protocol.h |  11 +++++
 hw/vfio/user.h          |   4 ++
 hw/vfio/pci.c           | 105 ++++++++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/user.c          |  78 ++++++++++++++++++++++++++++++-----
 4 files changed, 188 insertions(+), 10 deletions(-)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index ca53fce..c5d9473 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -171,6 +171,17 @@ typedef struct {
     char data[];
 } VFIOUserRegionRW;
 
+/*
+ * VFIO_USER_DMA_READ
+ * VFIO_USER_DMA_WRITE
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint64_t offset;
+    uint32_t count;
+    char data[];
+} VFIOUserDMARW;
+
 /*imported from struct vfio_bitmap */
 typedef struct {
     uint64_t pgsize;
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index 997f748..e6c1091 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -80,9 +80,13 @@ typedef struct VFIOProxy {
 
 VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp);
 void vfio_user_disconnect(VFIOProxy *proxy);
+uint64_t vfio_user_max_xfer(void);
 void vfio_user_set_handler(VFIODevice *vbasedev,
                            void (*handler)(void *opaque, VFIOUserMsg *msg),
                            void *reqarg);
+void vfio_user_send_reply(VFIOProxy *proxy, VFIOUserHdr *hdr, int size);
+void vfio_user_send_error(VFIOProxy *proxy, VFIOUserHdr *hdr, int error);
+void vfio_user_putfds(VFIOUserMsg *msg);
 int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp);
 
 extern VFIODevIO vfio_dev_io_sock;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index ca821da..877e3e3 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3453,11 +3453,116 @@ struct VFIOValidOps vfio_pci_valid_ops = {
  * vfio-user routines.
  */
 
+static void vfio_user_dma_read(VFIOPCIDevice *vdev, VFIOUserDMARW *msg)
+{
+    PCIDevice *pdev = &vdev->pdev;
+    VFIOProxy *proxy = vdev->vbasedev.proxy;
+    VFIOUserDMARW *res;
+    MemTxResult r;
+    size_t size;
+
+    if (msg->hdr.size < sizeof(*msg)) {
+        vfio_user_send_error(proxy, &msg->hdr, EINVAL);
+        return;
+    }
+    if (msg->count > vfio_user_max_xfer()) {
+        vfio_user_send_error(proxy, &msg->hdr, E2BIG);
+        return;
+    }
+
+    /* switch to our own message buffer */
+    size = msg->count + sizeof(VFIOUserDMARW);
+    res = g_malloc0(size);
+    memcpy(res, msg, sizeof(*res));
+    g_free(msg);
+
+    r = pci_dma_read(pdev, res->offset, &res->data, res->count);
+
+    switch (r) {
+    case MEMTX_OK:
+        if (res->hdr.flags & VFIO_USER_NO_REPLY) {
+            g_free(res);
+            return;
+        }
+        vfio_user_send_reply(proxy, &res->hdr, size);
+        break;
+    case MEMTX_ERROR:
+        vfio_user_send_error(proxy, &res->hdr, EFAULT);
+        break;
+    case MEMTX_DECODE_ERROR:
+        vfio_user_send_error(proxy, &res->hdr, ENODEV);
+        break;
+    }
+}
+
+static void vfio_user_dma_write(VFIOPCIDevice *vdev, VFIOUserDMARW *msg)
+{
+    PCIDevice *pdev = &vdev->pdev;
+    VFIOProxy *proxy = vdev->vbasedev.proxy;
+    MemTxResult r;
+
+    if (msg->hdr.size < sizeof(*msg)) {
+        vfio_user_send_error(proxy, &msg->hdr, EINVAL);
+        return;
+    }
+    /* make sure transfer count isn't larger than the message data */
+    if (msg->count > msg->hdr.size - sizeof(*msg)) {
+        vfio_user_send_error(proxy, &msg->hdr, E2BIG);
+        return;
+    }
+
+    r = pci_dma_write(pdev, msg->offset, &msg->data, msg->count);
+
+    switch (r) {
+    case MEMTX_OK:
+        if ((msg->hdr.flags & VFIO_USER_NO_REPLY) == 0) {
+            vfio_user_send_reply(proxy, &msg->hdr, sizeof(msg->hdr));
+        } else {
+            g_free(msg);
+        }
+        break;
+    case MEMTX_ERROR:
+        vfio_user_send_error(proxy, &msg->hdr, EFAULT);
+        break;
+    case MEMTX_DECODE_ERROR:
+        vfio_user_send_error(proxy, &msg->hdr, ENODEV);
+        break;
+    }
+
+    return;
+}
+
+/*
+ * Incoming request message callback.
+ *
+ * Runs off main loop, so BQL held.
+ */
 static void vfio_user_pci_process_req(void *opaque, VFIOUserMsg *msg)
 {
+    VFIOPCIDevice *vdev = opaque;
+    VFIOUserHdr *hdr = msg->hdr;
+
+    /* no incoming PCI requests pass FDs */
+    if (msg->fds != NULL) {
+        vfio_user_send_error(vdev->vbasedev.proxy, hdr, EINVAL);
+        vfio_user_putfds(msg);
+        return;
+    }
 
+    switch (hdr->command) {
+    case VFIO_USER_DMA_READ:
+        vfio_user_dma_read(vdev, (VFIOUserDMARW *)hdr);
+        break;
+    case VFIO_USER_DMA_WRITE:
+        vfio_user_dma_write(vdev, (VFIOUserDMARW *)hdr);
+        break;
+    default:
+        error_printf("vfio_user_process_req unknown cmd %d\n", hdr->command);
+        vfio_user_send_error(vdev->vbasedev.proxy, hdr, ENOSYS);
+    }
 }
+
 
 /*
  * Emulated devices don't use host hot reset
  */
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index cee08b6..2f3eac8 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -71,6 +71,11 @@ static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err)
  * Functions called by main, CPU, or iothread threads
  */
 
+uint64_t vfio_user_max_xfer(void)
+{
+    return max_xfer_size;
+}
+
 static void vfio_user_shutdown(VFIOProxy *proxy)
 {
     qio_channel_shutdown(proxy->ioc, QIO_CHANNEL_SHUTDOWN_READ, NULL);
@@ -282,7 +287,7 @@ static int vfio_user_recv_one(VFIOProxy *proxy)
         *msg->hdr = hdr;
         data = (char *)msg->hdr + sizeof(hdr);
     } else {
-        if (hdr.size > max_xfer_size) {
+        if (hdr.size > max_xfer_size + sizeof(VFIOUserDMARW)) {
             error_setg(&local_err, "vfio_user_recv request larger than max");
             goto err;
         }
@@ -436,18 +441,20 @@ static void vfio_user_request(void *opaque)
 {
     VFIOProxy *proxy = opaque;
     VFIOUserMsgQ new, free;
-    VFIOUserMsg *msg;
+    VFIOUserMsg *msg, *m1;
 
     /* reap all incoming */
+    QTAILQ_INIT(&new);
     WITH_QEMU_LOCK_GUARD(&proxy->lock) {
-        new = proxy->incoming;
-        QTAILQ_INIT(&proxy->incoming);
+        QTAILQ_FOREACH_SAFE(msg, &proxy->incoming, next, m1) {
+            QTAILQ_REMOVE(&proxy->incoming, msg, next);
+            QTAILQ_INSERT_TAIL(&new, msg, next);
+        }
     }
-    QTAILQ_INIT(&free);
 
     /* process list */
-    while (!QTAILQ_EMPTY(&new)) {
-        msg = QTAILQ_FIRST(&new);
+    QTAILQ_INIT(&free);
+    QTAILQ_FOREACH_SAFE(msg, &new, next, m1) {
         QTAILQ_REMOVE(&new, msg, next);
         proxy->request(proxy->req_arg, msg);
         QTAILQ_INSERT_HEAD(&free, msg, next);
@@ -455,9 +462,7 @@ static void vfio_user_request(void *opaque)
 
     /* free list */
     WITH_QEMU_LOCK_GUARD(&proxy->lock) {
-        while (!QTAILQ_EMPTY(&free)) {
-            msg = QTAILQ_FIRST(&free);
-            QTAILQ_REMOVE(&free, msg, next);
+        QTAILQ_FOREACH_SAFE(msg, &free, next, m1) {
             vfio_user_recycle(proxy, msg);
         }
     }
@@ -693,6 +698,59 @@ static void vfio_user_wait_reqs(VFIOProxy *proxy)
     }
 }
 
+/*
+ * Reply to an incoming request.
+ */
+void vfio_user_send_reply(VFIOProxy *proxy, VFIOUserHdr *hdr, int size)
+{
+
+    if (size < sizeof(VFIOUserHdr)) {
+        error_printf("vfio_user_send_reply - size too small\n");
+        g_free(hdr);
+        return;
+    }
+
+    /*
+     * convert header to associated reply
+     */
+    hdr->flags = VFIO_USER_REPLY;
+    hdr->size = size;
+
+    vfio_user_send_async(proxy, hdr, NULL);
+}
+
+/*
+ * Send an error reply to an incoming request.
+ */
+void vfio_user_send_error(VFIOProxy *proxy, VFIOUserHdr *hdr, int error)
+{
+
+    /*
+     * convert header to associated reply
+     */
+    hdr->flags = VFIO_USER_REPLY;
+    hdr->flags |= VFIO_USER_ERROR;
+    hdr->error_reply = error;
+    hdr->size = sizeof(*hdr);
+
+    vfio_user_send_async(proxy, hdr, NULL);
+}
+
+/*
+ * Close FDs erroneously received in an incoming request.
+ */
+void vfio_user_putfds(VFIOUserMsg *msg)
+{
+    VFIOUserFDs *fds = msg->fds;
+    int i;
+
+    for (i = 0; i < fds->recv_fds; i++) {
+        close(fds->fds[i]);
+    }
+    g_free(fds);
+    msg->fds = NULL;
+}
+
 static QLIST_HEAD(, VFIOProxy) vfio_user_sockets =
     QLIST_HEAD_INITIALIZER(vfio_user_sockets);
 

From patchwork Tue Nov 9 00:46:46 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12609259
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6AB6C433FE for ; Tue, 9 Nov 2021 00:56:28 +0000 (UTC)
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 78CED611BD for ; Tue, 9 Nov 2021 00:56:28 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 78CED611BD
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org
Received: from localhost ([::1]:50524 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mkFR5-0005XC-M1 for qemu-devel@archiver.kernel.org; Mon, 08 Nov 2021 19:56:27 -0500
Received: from eggs.gnu.org ([209.51.188.92]:51780) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAk-0005mg-HI for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:34 -0500
Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]:42878) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAc-00047P-6h for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:33 -0500
Received: from pps.filterd (m0246617.ppops.net
[127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A902oTS005900 for ; Tue, 9 Nov 2021 00:39:21 GMT
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v3 18/19] vfio-user: pci reset
Date: Mon, 8 Nov 2021 16:46:46 -0800
Message-Id: <8ae17fad576ba004ca6623830d5de1e3f72e5478.1636057885.git.john.g.johnson@oracle.com>
X-Mailer: git-send-email 1.8.3.1
MIME-Version: 1.0

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
---
 hw/vfio/user.h |  1 +
 hw/vfio/pci.c  | 15 +++++++++++++++
 hw/vfio/user.c | 12 ++++++++++++
 3 files changed, 28 insertions(+)

diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index e6c1091..7504681 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -88,6 +88,7 @@ void vfio_user_send_reply(VFIOProxy *proxy, VFIOUserHdr *hdr, int size);
 void vfio_user_send_error(VFIOProxy *proxy, VFIOUserHdr *hdr, int error);
 void vfio_user_putfds(VFIOUserMsg *msg);
 int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp);
+void vfio_user_reset(VFIOProxy *proxy);
 
 extern VFIODevIO vfio_dev_io_sock;
 extern VFIOContIO vfio_cont_io_sock;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 877e3e3..ccb2563 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3792,6 +3792,20 @@ static void vfio_user_instance_finalize(Object *obj)
     }
 }
 
+static void vfio_user_pci_reset(DeviceState *dev)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(dev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
+
+    vfio_pci_pre_reset(vdev);
+
+    if (vbasedev->reset_works) {
+        vfio_user_reset(vbasedev->proxy);
+    }
+
+    vfio_pci_post_reset(vdev);
+}
+
 static Property vfio_user_pci_dev_properties[] = {
     DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name),
     DEFINE_PROP_BOOL("secure-dma", VFIOUserPCIDevice, secure_dma, false),
@@ -3805,6 +3819,7 @@ static void vfio_user_pci_dev_class_init(ObjectClass *klass, void *data)
     DeviceClass *dc = DEVICE_CLASS(klass);
     PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass);
 
+    dc->reset = vfio_user_pci_reset;
     device_class_set_props(dc, vfio_user_pci_dev_properties);
     dc->desc = "VFIO over socket PCI device assignment";
     pdc->realize = vfio_user_pci_realize;
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 2f3eac8..76d0706 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -1386,6 +1386,18 @@ static int vfio_user_region_write(VFIOProxy *proxy, uint8_t index, off_t offset,
 
     return ret;
 }
 
+void vfio_user_reset(VFIOProxy *proxy)
+{
+    VFIOUserHdr msg;
+
+    vfio_user_request_msg(&msg, VFIO_USER_DEVICE_RESET, sizeof(msg), 0);
+
+    vfio_user_send_wait(proxy, &msg, NULL, 0, false);
+    if (msg.flags & VFIO_USER_ERROR) {
+        error_printf("reset reply error %d\n", msg.error_reply);
+    }
+}
+
 /*
  * Socket-based io_ops

From patchwork Tue Nov 9 00:46:47 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12609265
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5F16C433F5 for ; Tue, 9 Nov 2021 01:01:28 +0000 (UTC)
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4A22561989 for ; Tue, 9 Nov 2021 01:01:28 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4A22561989
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org
Received: from localhost ([::1]:35446 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mkFVv-0006EA-8r for qemu-devel@archiver.kernel.org; Mon, 08 Nov 2021 20:01:27 -0500
Received: from eggs.gnu.org ([209.51.188.92]:51788) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAl-0005pP-Tb for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:41 -0500
Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]:43736) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mkFAc-00047d-71 for qemu-devel@nongnu.org; Mon, 08 Nov 2021 19:39:35 -0500
Received: from pps.filterd (m0246629.ppops.net [127.0.0.1]) by
mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A90AULl025572 for ; Tue, 9 Nov 2021 00:39:22 GMT
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v3 19/19] vfio-user: migration support
Date: Mon, 8 Nov 2021 16:46:47 -0800
Message-Id:
X-Mailer: git-send-email 1.8.3.1
MIME-Version: 1.0
X-MS-Exchange-CrossTenant-AuthAs:

bug fix: only set qemu file error if there is a file

Signed-off-by: John G Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
---
 hw/vfio/user-protocol.h | 18 +++++++++++++++++
 hw/vfio/migration.c     | 34 +++++++++++++++----------------
 hw/vfio/pci.c           |  7 +++++++
 hw/vfio/user.c          | 54 +++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 96 insertions(+), 17 deletions(-)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index c5d9473..bad067a 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -182,6 +182,10 @@ typedef struct {
     char data[];
 } VFIOUserDMARW;
 
+/*
+ * VFIO_USER_DIRTY_PAGES
+ */
+
 /*imported from struct vfio_bitmap */
 typedef struct {
     uint64_t pgsize;
@@ -189,4 +193,18 @@ typedef struct {
     char data[];
 } VFIOUserBitmap;
 
+/* imported from struct vfio_iommu_type1_dirty_bitmap_get */
+typedef struct {
+    uint64_t iova;
+    uint64_t size;
+    VFIOUserBitmap bitmap;
+} VFIOUserBitmapRange;
+
+/* imported from struct vfio_iommu_type1_dirty_bitmap */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint32_t argsz;
+    uint32_t flags;
+} VFIOUserDirtyPages;
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 82f654a..3d379cb 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -27,6 +27,7 @@
 #include "pci.h"
 #include "trace.h"
 #include "hw/hw.h"
+#include "user.h"
 
 /*
  * Flags to be used as unique delimiters for VFIO devices in the migration
@@ -49,11 +50,13 @@ static int64_t bytes_transferred;
 static inline int vfio_mig_access(VFIODevice *vbasedev, void *val, int count,
                                   off_t off, bool iswrite)
 {
+    VFIORegion *region = &vbasedev->migration->region;
     int ret;
 
-    ret = iswrite ? pwrite(vbasedev->fd, val, count, off) :
-                    pread(vbasedev->fd, val, count, off);
-    if (ret < count) {
+    ret = iswrite ?
+        VDEV_REGION_WRITE(vbasedev, region->nr, off, count, val, false) :
+        VDEV_REGION_READ(vbasedev, region->nr, off, count, val);
+    if (ret < count) {
         error_report("vfio_mig_%s %d byte %s: failed at offset 0x%"
                      HWADDR_PRIx", err: %s", iswrite ? "write" : "read", count,
                      vbasedev->name, off, strerror(errno));
@@ -111,9 +114,7 @@ static int vfio_migration_set_state(VFIODevice *vbasedev, uint32_t mask,
                                     uint32_t value)
 {
     VFIOMigration *migration = vbasedev->migration;
-    VFIORegion *region = &migration->region;
-    off_t dev_state_off = region->fd_offset +
-                          VFIO_MIG_STRUCT_OFFSET(device_state);
+    off_t dev_state_off = VFIO_MIG_STRUCT_OFFSET(device_state);
     uint32_t device_state;
     int ret;
 
@@ -201,13 +202,13 @@ static int vfio_save_buffer(QEMUFile *f, VFIODevice *vbasedev, uint64_t *size)
     int ret;
 
     ret = vfio_mig_read(vbasedev, &data_offset, sizeof(data_offset),
-                      region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_offset));
+                        VFIO_MIG_STRUCT_OFFSET(data_offset));
     if (ret < 0) {
         return ret;
     }
 
     ret = vfio_mig_read(vbasedev, &data_size, sizeof(data_size),
-                        region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_size));
+                        VFIO_MIG_STRUCT_OFFSET(data_size));
     if (ret < 0) {
         return ret;
     }
@@ -233,8 +234,7 @@ static int vfio_save_buffer(QEMUFile *f, VFIODevice *vbasedev, uint64_t *size)
         }
         buf_allocated = true;
 
-        ret = vfio_mig_read(vbasedev, buf, sec_size,
-                            region->fd_offset + data_offset);
+        ret = vfio_mig_read(vbasedev, buf, sec_size, data_offset);
         if (ret < 0) {
             g_free(buf);
             return ret;
@@ -269,7 +269,7 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev,
 
     do {
         ret = vfio_mig_read(vbasedev, &data_offset, sizeof(data_offset),
-                      region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_offset));
+                            VFIO_MIG_STRUCT_OFFSET(data_offset));
         if (ret < 0) {
             return ret;
         }
@@ -309,8 +309,7 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev,
         qemu_get_buffer(f, buf, sec_size);
 
         if (buf_alloc) {
-            ret = vfio_mig_write(vbasedev, buf, sec_size,
-                    region->fd_offset + data_offset);
+            ret = vfio_mig_write(vbasedev, buf, sec_size, data_offset);
             g_free(buf);
 
             if (ret < 0) {
@@ -322,7 +321,7 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev,
     }
 
     ret = vfio_mig_write(vbasedev, &report_size, sizeof(report_size),
-                        region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_size));
+                         VFIO_MIG_STRUCT_OFFSET(data_size));
     if (ret < 0) {
         return ret;
     }
@@ -334,12 +333,11 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev,
 static int vfio_update_pending(VFIODevice *vbasedev)
 {
     VFIOMigration *migration = vbasedev->migration;
-    VFIORegion *region = &migration->region;
     uint64_t pending_bytes = 0;
     int ret;
 
     ret = vfio_mig_read(vbasedev, &pending_bytes, sizeof(pending_bytes),
-                    region->fd_offset + VFIO_MIG_STRUCT_OFFSET(pending_bytes));
+                        VFIO_MIG_STRUCT_OFFSET(pending_bytes));
     if (ret < 0) {
         migration->pending_bytes = 0;
         return ret;
@@ -744,7 +742,9 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
          */
         error_report("%s: Failed to set device state 0x%x", vbasedev->name,
                      (migration->device_state & mask) | value);
-        qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
+        if (migrate_get_current()->to_dst_file) {
+            qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
+        }
     }
     vbasedev->migration->vm_running = running;
     trace_vfio_vmstate_change(vbasedev->name, running, RunState_str(state),
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index ccb2563..2e3846e 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3754,6 +3754,13 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
         }
     }
 
+    if (!pdev->failover_pair_id) {
+        ret = vfio_migration_probe(&vdev->vbasedev, errp);
+        if (ret) {
+            error_report("%s: Migration disabled", vdev->vbasedev.name);
+        }
+    }
+
     vfio_register_err_notifier(vdev);
     vfio_register_req_notifier(vdev);
 
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 76d0706..460d4e5 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -1398,6 +1398,52 @@ void vfio_user_reset(VFIOProxy *proxy)
     }
 }
 
+static int vfio_user_dirty_bitmap(VFIOProxy *proxy,
+                                  struct vfio_iommu_type1_dirty_bitmap *cmd,
+                                  struct vfio_iommu_type1_dirty_bitmap_get
+                                  *dbitmap)
+{
+    g_autofree struct {
+        VFIOUserDirtyPages msg;
+        VFIOUserBitmapRange range;
+    } *msgp = NULL;
+    int msize, rsize;
+
+    /*
+     * If just the command is sent, the returned bitmap isn't needed.
+     * The bitmap structs are different from the ioctl() versions,
+     * ioctl() returns the bitmap in a local VA
+     */
+    if (dbitmap != NULL) {
+        msize = sizeof(*msgp);
+        rsize = msize + dbitmap->bitmap.size;
+        msgp = g_malloc0(rsize);
+        msgp->range.iova = dbitmap->iova;
+        msgp->range.size = dbitmap->size;
+        msgp->range.bitmap.pgsize = dbitmap->bitmap.pgsize;
+        msgp->range.bitmap.size = dbitmap->bitmap.size;
+    } else {
+        msize = rsize = sizeof(VFIOUserDirtyPages);
+        msgp = g_malloc0(rsize);
+    }
+
+    vfio_user_request_msg(&msgp->msg.hdr, VFIO_USER_DIRTY_PAGES, msize, 0);
+    msgp->msg.argsz = msize - sizeof(msgp->msg.hdr);
+    msgp->msg.flags = cmd->flags;
+
+    vfio_user_send_wait(proxy, &msgp->msg.hdr, NULL, rsize, false);
+    if (msgp->msg.hdr.flags & VFIO_USER_ERROR) {
+        return -msgp->msg.hdr.error_reply;
+    }
+
+    if (dbitmap != NULL) {
+        memcpy(dbitmap->bitmap.data, &msgp->range.bitmap.data,
+               dbitmap->bitmap.size);
+    }
+
+    return 0;
+}
+
 
 /*
  * Socket-based io_ops
@@ -1493,6 +1539,13 @@ static int vfio_user_io_dma_unmap(VFIOContainer *container,
     return vfio_user_dma_unmap(container->proxy, unmap, bitmap, will_commit);
 }
 
+static int vfio_user_io_dirty_bitmap(VFIOContainer *container,
+                        struct vfio_iommu_type1_dirty_bitmap *bitmap,
+                        struct vfio_iommu_type1_dirty_bitmap_get *range)
+{
+    return vfio_user_dirty_bitmap(container->proxy, bitmap, range);
+}
+
 static void vfio_user_io_wait_commit(VFIOContainer *container)
 {
     vfio_user_wait_reqs(container->proxy);
@@ -1501,5 +1554,6 @@ static void vfio_user_io_wait_commit(VFIOContainer *container)
 VFIOContIO vfio_cont_io_sock = {
     .dma_map = vfio_user_io_dma_map,
     .dma_unmap = vfio_user_io_dma_unmap,
+    .dirty_bitmap = vfio_user_io_dirty_bitmap,
     .wait_commit = vfio_user_io_wait_commit,
 };