From patchwork Wed Jan 12 00:43:37 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12710860
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 01/21] vfio-user: introduce vfio-user protocol specification
Date: Tue, 11 Jan 2022 16:43:37 -0800

From: Thanos Makatos

This patch introduces the vfio-user protocol specification (formerly
known as VFIO-over-socket), which is designed to allow devices to be
emulated outside QEMU, in a separate process. vfio-user reuses the
existing VFIO defines, structs and concepts.
It has been earlier discussed as an RFC in:
"RFC: use VFIO over a UNIX domain socket to implement device offloading"

Signed-off-by: John G Johnson
Signed-off-by: Thanos Makatos
Signed-off-by: John Levon
---
 docs/devel/index.rst     |    1 +
 docs/devel/vfio-user.rst | 1810 ++++++++++++++++++++++++++++++++++++++++++++++
 MAINTAINERS              |    6 +
 3 files changed, 1817 insertions(+)
 create mode 100644 docs/devel/vfio-user.rst

diff --git a/docs/devel/index.rst b/docs/devel/index.rst
index afd9375..23d2c30 100644
--- a/docs/devel/index.rst
+++ b/docs/devel/index.rst
@@ -48,3 +48,4 @@ modifying QEMU's source code.
    trivial-patches
    submitting-a-patch
    submitting-a-pull-request
+   vfio-user
diff --git a/docs/devel/vfio-user.rst b/docs/devel/vfio-user.rst
new file mode 100644
index 0000000..97a7506
--- /dev/null
+++ b/docs/devel/vfio-user.rst
@@ -0,0 +1,1810 @@
+.. include::
+********************************
+vfio-user Protocol Specification
+********************************
+
+--------------
+Version_ 0.9.1
+--------------
+
+.. contents:: Table of Contents
+
+Introduction
+============
+vfio-user is a protocol that allows a device to be emulated in a separate
+process outside of a Virtual Machine Monitor (VMM). vfio-user devices consist
+of a generic VFIO device type, living inside the VMM, which we call the client,
+and the core device implementation, living outside the VMM, which we call the
+server.
+
+The vfio-user specification is partly based on the
+`Linux VFIO ioctl interface `_.
+
+VFIO is a mature and stable API, backed by an extensively used framework. The
+existing VFIO client implementation in QEMU (``qemu/hw/vfio/``) can be largely
+re-used, though there is nothing in this specification that requires that
+particular implementation. None of the VFIO kernel modules are required for
+supporting the protocol, on either the client or server side. Some source
+definitions in VFIO are re-used for vfio-user.
+
+The main idea is to allow a virtual device to function in a separate process in
+the same host over a UNIX domain socket. A UNIX domain socket (``AF_UNIX``) is
+chosen because file descriptors can be trivially sent over it, which in turn
+allows:
+
+* Sharing of client memory for DMA with the server.
+* Sharing of server memory with the client for fast MMIO.
+* Efficient sharing of eventfds for triggering interrupts.
+
+Other socket types could be used which allow the server to run in a separate
+guest in the same host (``AF_VSOCK``) or remotely (``AF_INET``). Theoretically
+the underlying transport does not necessarily have to be a socket; however, we
+do not examine such alternatives. In this protocol version we focus on using a
+UNIX domain socket and introduce basic support for the other two types of
+sockets without considering performance implications.
+
+While passing of file descriptors is desirable for performance reasons, support
+is not necessary for either the client or the server in order to implement the
+protocol. There is always an in-band, message-passing fallback mechanism.
+
+Overview
+========
+
+VFIO is a framework that allows a physical device to be securely passed through
+to a user space process; the device-specific kernel driver does not drive the
+device at all. Typically, the user space process is a VMM and the device is
+passed through to it in order to achieve high performance. VFIO provides an API
+and the required functionality in the kernel. QEMU has adopted VFIO to allow a
+guest to directly access physical devices, instead of emulating them in
+software.
+
+vfio-user reuses the core VFIO concepts defined in its API, but implements them
+as messages to be sent over a socket. It does not change the kernel-based VFIO
+in any way; in fact, none of the VFIO kernel modules need to be loaded to use
+vfio-user.
+It is also possible for the client to concurrently use the current
+kernel-based VFIO for one device, and vfio-user for another device.
+
+VFIO Device Model
+-----------------
+
+A device under VFIO presents a standard interface to the user process. Many of
+the VFIO operations in the existing interface use the ``ioctl()`` system call, and
+references to the existing interface are called the ``ioctl()`` implementation in
+this document.
+
+The following sections describe the set of messages that implement the vfio-user
+interface over a socket. In many cases, the messages are analogous to data
+structures used in the ``ioctl()`` implementation. Messages derived from an
+``ioctl()`` have a name derived from the ``ioctl()`` command name. E.g., the
+``VFIO_DEVICE_GET_INFO`` ``ioctl()`` command becomes a
+``VFIO_USER_DEVICE_GET_INFO`` message. The purpose of this reuse is to share as
+much code as feasible with the ``ioctl()`` implementation.
+
+Connection Initiation
+^^^^^^^^^^^^^^^^^^^^^
+
+After the client connects to the server, the initial client message is
+``VFIO_USER_VERSION`` to propose a protocol version and set of capabilities to
+apply to the session. The server replies with a compatible version and set of
+capabilities it supports, or closes the connection if it cannot support the
+advertised version.
+
+Device Information
+^^^^^^^^^^^^^^^^^^
+
+The client uses a ``VFIO_USER_DEVICE_GET_INFO`` message to query the server for
+information about the device. This information includes:
+
+* The device type and whether it supports reset (``VFIO_DEVICE_FLAGS_``),
+* the number of device regions, and
+* the number of interrupt types the device supports.
+
+Region Information
+^^^^^^^^^^^^^^^^^^
+
+The client uses ``VFIO_USER_DEVICE_GET_REGION_INFO`` messages to query the
+server for information about the device's regions.
+This information describes:
+
+* Read and write permissions, whether it can be memory mapped, and whether it
+  supports additional capabilities (``VFIO_REGION_INFO_CAP_``).
+* Region index, size, and offset.
+
+When a device region can be mapped by the client, the server provides a file
+descriptor which the client can ``mmap()``. The server is responsible for
+polling for client updates to memory mapped regions.
+
+Region Capabilities
+"""""""""""""""""""
+
+Some regions have additional capabilities that cannot be described adequately
+by the region info data structure. These capabilities are returned in the
+region info reply in a list similar to PCI capabilities in a PCI device's
+configuration space.
+
+Sparse Regions
+""""""""""""""
+A region can be memory-mappable in whole or in part. When only a subset of a
+region can be mapped by the client, a ``VFIO_REGION_INFO_CAP_SPARSE_MMAP``
+capability is included in the region info reply. This capability describes
+which portions can be mapped by the client.
+
+.. Note::
+   For example, in a virtual NVMe controller, sparse regions can be used so
+   that accesses to the NVMe registers (found in the beginning of BAR0) are
+   trapped (an infrequent event), while allowing direct access to the doorbells
+   (an extremely frequent event as every I/O submission requires a write to
+   BAR0), found in the next page after the NVMe registers in BAR0.
+
+Device-Specific Regions
+"""""""""""""""""""""""
+
+A device can define regions additional to the standard ones (e.g. PCI indexes
+0-8). This is achieved by including a ``VFIO_REGION_INFO_CAP_TYPE`` capability
+in the region info reply of a device-specific region. Such regions are reflected
+in ``struct vfio_user_device_info.num_regions``. Thus, for PCI devices this
+value can be equal to, or higher than, ``VFIO_PCI_NUM_REGIONS``.
+
+Region I/O via file descriptors
+-------------------------------
+
+For unmapped regions, region I/O from the client is done via
+``VFIO_USER_REGION_READ/WRITE``. As an optimization, ioeventfds or ioregionfds
+may be configured for sub-regions of some regions. A client may request
+information on these sub-regions via ``VFIO_USER_DEVICE_GET_REGION_IO_FDS``; by
+configuring the returned file descriptors as ioeventfds or ioregionfds, the
+server can be directly notified of I/O (for example, by KVM) without taking a
+trip through the client.
+
+Interrupts
+^^^^^^^^^^
+
+The client uses ``VFIO_USER_DEVICE_GET_IRQ_INFO`` messages to query the server
+for the device's interrupt types. The interrupt types are specific to the bus
+the device is attached to, and the client is expected to know the capabilities
+of each interrupt type. The server can signal an interrupt by directly injecting
+interrupts into the guest via an event file descriptor. The client configures
+how the server signals an interrupt with ``VFIO_USER_DEVICE_SET_IRQS`` messages.
+
+Device Read and Write
+^^^^^^^^^^^^^^^^^^^^^
+
+When the guest executes load or store operations to an unmapped device region,
+the client forwards these operations to the server with
+``VFIO_USER_REGION_READ`` or ``VFIO_USER_REGION_WRITE`` messages. The server
+will reply with data from the device on read operations or an acknowledgement on
+write operations. See `Read and Write Operations`_.
+
+Client memory access
+--------------------
+
+The client uses ``VFIO_USER_DMA_MAP`` and ``VFIO_USER_DMA_UNMAP`` messages to
+inform the server of the valid DMA ranges that the server can access on behalf
+of a device (typically, VM guest memory). DMA memory may be accessed by the
+server via ``VFIO_USER_DMA_READ`` and ``VFIO_USER_DMA_WRITE`` messages over the
+socket. In this case, the "DMA" part of the naming is a misnomer.
+
+Actual direct memory access of client memory from the server is possible if the
+client provides file descriptors the server can ``mmap()``. Note that ``mmap()``
+privileges cannot be revoked by the client, therefore file descriptors should
+only be exported in environments where the client trusts the server not to
+corrupt guest memory.
+
+See `Read and Write Operations`_.
+
+Client/server interactions
+==========================
+
+Socket
+------
+
+A server can serve:
+
+1) one or more clients, and/or
+2) one or more virtual devices, belonging to one or more clients.
+
+The current protocol specification requires a dedicated socket per
+client/server connection. It is a server-side implementation detail whether a
+single server handles multiple virtual devices from the same or multiple
+clients. The location of the socket is implementation-specific. Multiplexing
+clients, devices, and servers over the same socket is not supported in this
+version of the protocol.
+
+Authentication
+--------------
+
+For ``AF_UNIX``, we rely on OS mandatory access controls on the socket files,
+therefore it is up to the management layer to set up the socket as required.
+Socket types that span guests or hosts will require a proper authentication
+mechanism. Defining that mechanism is deferred to a future version of the
+protocol.
+
+Command Concurrency
+-------------------
+
+A client may pipeline multiple commands without waiting for previous command
+replies. The server will process commands in the order they are received. A
+consequence of this is if a client issues a command with the *No_reply* bit,
+then subsequently issues a command without *No_reply*, the older command will
+have been processed before the reply to the younger command is sent by the
+server. The client must be aware of the device's capability to process
+concurrent commands if pipelining is used.
+For example, pipelining allows
+multiple client threads to concurrently access device regions; the client must
+ensure these accesses obey device semantics.
+
+An example is a frame buffer device, where the device may allow concurrent
+access to different areas of video memory, but may have indeterminate behavior
+if concurrent accesses are performed to command or status registers.
+
+Note that unrelated messages sent from the server to the client can appear in
+between a client to server request/reply and vice versa.
+
+Implementers should be prepared for certain commands to exhibit potentially
+unbounded latencies. For example, ``VFIO_USER_DEVICE_RESET`` may take an
+arbitrarily long time to complete; clients should take care not to block
+unnecessarily.
+
+Socket Disconnection Behavior
+-----------------------------
+The server and the client can disconnect from each other, either intentionally
+or unexpectedly. Both the client and the server need to know how to handle such
+events.
+
+Server Disconnection
+^^^^^^^^^^^^^^^^^^^^
+A server disconnecting from the client may indicate that:
+
+1) A virtual device has been restarted, either intentionally (e.g. because of a
+   device update) or unintentionally (e.g. because of a crash).
+2) A virtual device has been shut down with no intention to be restarted.
+
+It is impossible for the client to know whether or not a failure is
+intermittent or innocuous and should be retried, therefore the client should
+reset the VFIO device when it detects the socket has been disconnected.
+Error recovery will be driven by the guest's device error handling
+behavior.
+
+Client Disconnection
+^^^^^^^^^^^^^^^^^^^^
+The client disconnecting from the server primarily means that the client
+has exited. Currently, this means that the guest is shut down, so the device is
+no longer needed and the server can automatically exit.
+However, there
+can be cases where a client disconnection should not result in a server exit:
+
+1) A single server serving multiple clients.
+2) A multi-process QEMU upgrading itself step by step, which is not yet
+   implemented.
+
+Therefore, in order for the protocol to be forward compatible, the server should
+respond to a client disconnection as follows:
+
+ - all client memory regions are unmapped and cleaned up (including closing any
+   passed file descriptors)
+ - all IRQ file descriptors passed from the old client are closed
+ - the device state should otherwise be retained
+
+The expectation is that when a client reconnects, it will re-establish IRQ and
+client memory mappings.
+
+If anything happens to the client (such as QEMU actually exiting), the control
+stack will know about it and can clean up resources accordingly.
+
+Security Considerations
+-----------------------
+
+Speaking generally, vfio-user clients should not trust servers, and vice versa.
+Standard tools and mechanisms should be used on both sides to validate input and
+protect against denial of service scenarios, buffer overflow, etc.
+
+Request Retry and Response Timeout
+----------------------------------
+A failed command is a command that has been successfully sent and has been
+responded to with an error code. Failure to send the command in the first place
+(e.g. because the socket is disconnected) is a different type of error examined
+earlier in the disconnect section.
+
+.. Note::
+   QEMU's VFIO retries certain operations if they fail. While this makes sense
+   for real HW, we don't know for sure whether it makes sense for virtual
+   devices.
+
+Defining a retry and timeout scheme is deferred to a future version of the
+protocol.
+
+Message sizes
+-------------
+
+Some requests have an ``argsz`` field. In a request, it defines the maximum
+expected reply payload size, which should be at least the size of the fixed
+reply payload headers defined here.
+The *request* payload size is defined by the
+usual ``msg_size`` field in the header, not the ``argsz`` field.
+
+In a reply, the server sets the ``argsz`` field to the size needed for the full
+payload. This may be less than the requested maximum size. It may also be
+larger than the requested maximum size: in that case, the full payload is not
+included in the reply, but the ``argsz`` field in the reply indicates the needed
+size, allowing a client to allocate a larger buffer for holding the reply before
+trying again.
+
+In addition, during negotiation (see `Version`_), the client and server may
+each specify a ``max_data_xfer_size`` value; this defines the maximum data that
+may be read or written via one of the ``VFIO_USER_DMA/REGION_READ/WRITE``
+messages; see `Read and Write Operations`_.
+
+Protocol Specification
+======================
+
+To distinguish from the base VFIO symbols, all vfio-user symbols are prefixed
+with ``vfio_user`` or ``VFIO_USER``. In this revision, all data is in the
+endianness of the host system, although this may be relaxed in future
+revisions in cases where the client and server run on different hosts
+with different endianness.
+
+Unless otherwise specified, all sizes should be presumed to be in bytes.
+
+.. _Commands:
+
+Commands
+--------
+The following table lists the vfio-user message command IDs, and whether the
+message command is sent from the client or the server.
+
+====================================== ========= =================
+Name                                   Command   Request Direction
+====================================== ========= =================
+``VFIO_USER_VERSION``                  1         client -> server
+``VFIO_USER_DMA_MAP``                  2         client -> server
+``VFIO_USER_DMA_UNMAP``                3         client -> server
+``VFIO_USER_DEVICE_GET_INFO``          4         client -> server
+``VFIO_USER_DEVICE_GET_REGION_INFO``   5         client -> server
+``VFIO_USER_DEVICE_GET_REGION_IO_FDS`` 6         client -> server
+``VFIO_USER_DEVICE_GET_IRQ_INFO``      7         client -> server
+``VFIO_USER_DEVICE_SET_IRQS``          8         client -> server
+``VFIO_USER_REGION_READ``              9         client -> server
+``VFIO_USER_REGION_WRITE``             10        client -> server
+``VFIO_USER_DMA_READ``                 11        server -> client
+``VFIO_USER_DMA_WRITE``                12        server -> client
+``VFIO_USER_DEVICE_RESET``             13        client -> server
+``VFIO_USER_DIRTY_PAGES``              14        client -> server
+====================================== ========= =================
+
+Header
+------
+
+All messages, both command messages and reply messages, are preceded by a
+16-byte header that contains basic information about the message. The header is
+followed by message-specific data described in the sections below.
+
++----------------+--------+-------------+
+| Name           | Offset | Size        |
++================+========+=============+
+| Message ID     | 0      | 2           |
++----------------+--------+-------------+
+| Command        | 2      | 2           |
++----------------+--------+-------------+
+| Message size   | 4      | 4           |
++----------------+--------+-------------+
+| Flags          | 8      | 4           |
++----------------+--------+-------------+
+|                | +-----+------------+ |
+|                | | Bit | Definition | |
+|                | +=====+============+ |
+|                | | 0-3 | Type       | |
+|                | +-----+------------+ |
+|                | | 4   | No_reply   | |
+|                | +-----+------------+ |
+|                | | 5   | Error      | |
+|                | +-----+------------+ |
++----------------+--------+-------------+
+| Error          | 12     | 4           |
++----------------+--------+-------------+
+|                | 16     | variable    |
++----------------+--------+-------------+
+
+* *Message ID* identifies the message, and is echoed in the command's reply
+  message. Message IDs belong entirely to the sender, can be re-used (even
+  concurrently) and the receiver must not make any assumptions about their
+  uniqueness.
+* *Command* specifies the command to be executed, listed in Commands_. It is
+  also set in the reply header.
+* *Message size* contains the size of the entire message, including the header.
+* *Flags* contains attributes of the message:
+
+  * The *Type* bits indicate the message type.
+
+    * *Command* (value 0x0) indicates a command message.
+    * *Reply* (value 0x1) indicates a reply message acknowledging a previous
+      command with the same message ID.
+  * *No_reply* in a command message indicates that no reply is needed for this
+    command. This is commonly used when multiple commands are sent, and only
+    the last needs acknowledgement.
+  * *Error* in a reply message indicates the command being acknowledged had
+    an error. In this case, the *Error* field will be valid.
+
+* *Error* in a reply message is an optional UNIX errno value. It may be zero
+  even if the Error bit is set in Flags. It is reserved in a command message.
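
The header layout above can be illustrated with a minimal Python sketch
that packs and unpacks the 16-byte header using the standard ``struct``
module. This is only an illustration under stated assumptions: it assumes
a little-endian host (the specification uses host endianness), and the
names ``pack_header``, ``unpack_header``, ``TYPE_MASK``, ``FLAG_NO_REPLY``
and ``FLAG_ERROR`` are hypothetical, not taken from QEMU or any other
implementation.

```python
import struct

# Assumption: little-endian host. Fields: Message ID (2 bytes),
# Command (2), Message size (4), Flags (4), Error (4).
HEADER = struct.Struct("<HHIII")

TYPE_COMMAND = 0x0       # Type value for a command message
TYPE_REPLY = 0x1         # Type value for a reply message
TYPE_MASK = 0xF          # Flags bits 0-3 hold the Type
FLAG_NO_REPLY = 1 << 4   # Flags bit 4
FLAG_ERROR = 1 << 5      # Flags bit 5

VFIO_USER_VERSION = 1    # command ID from the Commands table


def pack_header(msg_id, command, payload_len, flags=TYPE_COMMAND, error=0):
    """Build a header; Message size covers the header plus the payload."""
    msg_size = HEADER.size + payload_len
    return HEADER.pack(msg_id, command, msg_size, flags, error)


def unpack_header(data):
    """Split the first 16 bytes of a message into named header fields."""
    msg_id, command, msg_size, flags, error = HEADER.unpack_from(data)
    return {
        "msg_id": msg_id,
        "command": command,
        "msg_size": msg_size,
        "type": flags & TYPE_MASK,
        "no_reply": bool(flags & FLAG_NO_REPLY),
        "error_flag": bool(flags & FLAG_ERROR),
        "error": error,
    }
```

A receiver would typically read ``HEADER.size`` bytes first, decode them
with ``unpack_header``, and then read the remaining ``msg_size - 16`` bytes
of message-specific data.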
+
+Each command message in Commands_ must be replied to with a reply message,
+unless the message sets the *No_reply* bit. The reply consists of the header
+with the *Reply* bit set, plus any additional data.
+
+If an error occurs, the reply message must only include the reply header.
+
+As the header is standard in both requests and replies, it is not included in
+the command-specific specifications below; each message definition should be
+appended to the standard header, and the offsets are given from the end of the
+standard header.
+
+``VFIO_USER_VERSION``
+---------------------
+
+.. _Version:
+
+This is the initial message sent by the client after the socket connection is
+established; the same format is used for the server's reply.
+
+Upon establishing a connection, the client must send a ``VFIO_USER_VERSION``
+message proposing a protocol version and a set of capabilities. The server
+compares these with the versions and capabilities it supports and sends a
+``VFIO_USER_VERSION`` reply according to the following rules.
+
+* The major version in the reply must be the same as proposed. If the server
+  does not support the proposed major version, it closes the connection.
+* The minor version in the reply must be equal to or less than the minor
+  version proposed.
+* The capability list must be a subset of those proposed. If the server
+  requires a capability the client did not include, it closes the connection.
+
+The protocol major version will only change when incompatible protocol changes
+are made, such as changing the message format. The minor version may change
+when compatible changes are made, such as adding new messages or capabilities.
+Both the client and server must support all minor versions less than the
+maximum minor version they support. E.g., an implementation that supports
+version 1.3 must also support 1.0 through 1.2.
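
The negotiation rules above can be sketched in Python from the server's
point of view. This is a simplified illustration, not the behavior of any
existing implementation: it assumes capabilities can be compared by name
only (the real capabilities carry JSON values), and the function name
``negotiate`` and its parameters are hypothetical.

```python
def negotiate(server_major, server_minor, server_caps, required_caps,
              client_major, client_minor, client_caps):
    """Return (major, minor, caps) for the reply, or None to close.

    Capabilities are modelled as sets of names, which is an assumption
    made for this sketch only.
    """
    # Rule 1: the major version in the reply must equal the proposed one.
    if client_major != server_major:
        return None
    # Rule 3 (closing case): the server needs a capability that the
    # client did not propose.
    if not set(required_caps) <= set(client_caps):
        return None
    # Rule 2: the reply's minor version must not exceed the proposed one.
    minor = min(server_minor, client_minor)
    # Rule 3: the reply's capability list is a subset of those proposed.
    caps = sorted(set(server_caps) & set(client_caps))
    return client_major, minor, caps
```

Returning ``None`` stands in for closing the connection; a real server
would close the socket at that point instead of sending a reply.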
+ +When making a change to this specification, the protocol version number must +be included in the form "added in version X.Y" + +Request +^^^^^^^ + +============== ====== ==== +Name Offset Size +============== ====== ==== +version major 0 2 +version minor 2 2 +version data 4 variable (including terminating NUL). Optional. +============== ====== ==== + +The version data is an optional UTF-8 encoded JSON byte array with the following +format: + ++--------------+--------+-----------------------------------+ +| Name | Type | Description | ++==============+========+===================================+ +| capabilities | object | Contains common capabilities that | +| | | the sender supports. Optional. | ++--------------+--------+-----------------------------------+ + +Capabilities: + ++--------------------+--------+------------------------------------------------+ +| Name | Type | Description | ++====================+========+================================================+ +| max_msg_fds | number | Maximum number of file descriptors that can be | +| | | received by the sender in one message. | +| | | Optional. If not specified then the receiver | +| | | must assume a value of ``1``. | ++--------------------+--------+------------------------------------------------+ +| max_data_xfer_size | number | Maximum ``count`` for data transfer messages; | +| | | see `Read and Write Operations`_. Optional, | +| | | with a default value of 1048576 bytes. | ++--------------------+--------+------------------------------------------------+ +| migration | object | Migration capability parameters. If missing | +| | | then migration is not supported by the sender. 
| ++--------------------+--------+------------------------------------------------+ + +The migration capability contains the following name/value pairs: + ++--------+--------+-----------------------------------------------+ +| Name | Type | Description | ++========+========+===============================================+ +| pgsize | number | Page size of dirty pages bitmap. The smallest | +| | | between the client and the server is used. | ++--------+--------+-----------------------------------------------+ + +Reply +^^^^^ + +The same message format is used in the server's reply with the semantics +described above. + +``VFIO_USER_DMA_MAP`` +--------------------- + +This command message is sent by the client to the server to inform it of the +memory regions the server can access. It must be sent before the server can +perform any DMA to the client. It is normally sent directly after the version +handshake is completed, but may also occur when memory is added to the client, +or if the client uses a vIOMMU. + +Request +^^^^^^^ + +The request payload for this message is a structure of the following format: + ++-------------+--------+-------------+ +| Name | Offset | Size | ++=============+========+=============+ +| argsz | 0 | 4 | ++-------------+--------+-------------+ +| flags | 4 | 4 | ++-------------+--------+-------------+ +| | +-----+------------+ | +| | | Bit | Definition | | +| | +=====+============+ | +| | | 0 | readable | | +| | +-----+------------+ | +| | | 1 | writeable | | +| | +-----+------------+ | ++-------------+--------+-------------+ +| offset | 8 | 8 | ++-------------+--------+-------------+ +| address | 16 | 8 | ++-------------+--------+-------------+ +| size | 24 | 8 | ++-------------+--------+-------------+ + +* *argsz* is the size of the above structure. Note there is no reply payload, + so this field differs from other message types. 
+* *flags* contains the following region attributes: + + * *readable* indicates that the region can be read from. + + * *writeable* indicates that the region can be written to. + +* *offset* is the file offset of the region with respect to the associated file + descriptor, or zero if the region is not mappable. +* *address* is the base DMA address of the region. +* *size* is the size of the region. + +This structure is 32 bytes in size, so the message size is 16 + 32 = 48 bytes. + +If the DMA region being added can be directly mapped by the server, a file +descriptor must be sent as part of the message meta-data. The region can be +mapped via the mmap() system call. On ``AF_UNIX`` sockets, the file descriptor +must be passed as ``SCM_RIGHTS`` type ancillary data. Otherwise, if the DMA +region cannot be directly mapped by the server, a file descriptor must not be sent +as part of the message meta-data, and the DMA region can be accessed by the +server using ``VFIO_USER_DMA_READ`` and ``VFIO_USER_DMA_WRITE`` messages, +explained in `Read and Write Operations`_. A command to map over an existing +region must be failed by the server with ``EEXIST`` set in the *Error* field of the +reply. + +Reply +^^^^^ + +There is no payload in the reply message. + +``VFIO_USER_DMA_UNMAP`` +----------------------- + +This command message is sent by the client to the server to inform it that a +DMA region, previously made available via a ``VFIO_USER_DMA_MAP`` command +message, is no longer available for DMA. It typically occurs when memory is +removed from the client or if the client uses a vIOMMU.
The DMA region is +described by the following structure: + +Request +^^^^^^^ + +The request payload for this message is a structure of the following format: + ++--------------+--------+------------------------+ +| Name | Offset | Size | ++==============+========+========================+ +| argsz | 0 | 4 | ++--------------+--------+------------------------+ +| flags | 4 | 4 | ++--------------+--------+------------------------+ +| | +-----+-----------------------+ | +| | | Bit | Definition | | +| | +=====+=======================+ | +| | | 0 | get dirty page bitmap | | +| | +-----+-----------------------+ | ++--------------+--------+------------------------+ +| address | 8 | 8 | ++--------------+--------+------------------------+ +| size | 16 | 8 | ++--------------+--------+------------------------+ + +* *argsz* is the maximum size of the reply payload. +* *flags* contains the following DMA region attributes: + + * *get dirty page bitmap* indicates that a dirty page bitmap must be + populated before unmapping the DMA region. The client must provide a + `VFIO Bitmap`_ structure, explained below, immediately following this + entry. + +* *address* is the base DMA address of the DMA region. +* *size* is the size of the DMA region. + +The address and size of the DMA region being unmapped must match exactly a +previous mapping. The size of request message depends on whether or not the +*get dirty page bitmap* bit is set in Flags: + +* If not set, the size of the total request message is: 16 + 24. + +* If set, the size of the total request message is: 16 + 24 + 16. + +.. _VFIO Bitmap: + +VFIO Bitmap Format +"""""""""""""""""" + ++--------+--------+------+ +| Name | Offset | Size | ++========+========+======+ +| pgsize | 0 | 8 | ++--------+--------+------+ +| size | 8 | 8 | ++--------+--------+------+ + +* *pgsize* is the page size for the bitmap, in bytes. +* *size* is the size for the bitmap, in bytes, excluding the VFIO bitmap header. 
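The two request sizes given above (16 + 24 and 16 + 24 + 16) follow directly from the field tables. A minimal sketch, assuming little-endian packing and a hypothetical ``build_dma_unmap`` helper; the caller supplies the bitmap size, since how it is derived from the region size is not stated here.

```python
import struct

HEADER_SIZE = 16                  # standard message header
UNMAP = struct.Struct("<IIQQ")    # argsz, flags, address, size (24 bytes)
BITMAP = struct.Struct("<QQ")     # pgsize, size (16 bytes)
FLAG_GET_DIRTY_BITMAP = 1 << 0


def build_dma_unmap(address, size, bitmap=None):
    """Build a VFIO_USER_DMA_UNMAP request payload.

    `bitmap` is None, or a (pgsize, bitmap_size) tuple when the dirty page
    bitmap is requested. argsz is the maximum reply payload size, so it
    also covers the bitmap data the server will append.
    """
    if bitmap is None:
        return UNMAP.pack(UNMAP.size, 0, address, size)
    pgsize, bitmap_size = bitmap
    argsz = UNMAP.size + BITMAP.size + bitmap_size
    return (UNMAP.pack(argsz, FLAG_GET_DIRTY_BITMAP, address, size)
            + BITMAP.pack(pgsize, bitmap_size))


def request_message_size(payload):
    """Total message size: header plus request payload."""
    return HEADER_SIZE + len(payload)
```

The request itself never carries bitmap data; only ``argsz`` grows to leave room for it in the reply.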
+ +Reply +^^^^^ + +Upon receiving a ``VFIO_USER_DMA_UNMAP`` command, if the file descriptor is +mapped then the server must release all references to that DMA region before +replying, which potentially includes in-flight DMA transactions. + +The server responds with the original DMA entry in the request. If the +*get dirty page bitmap* bit is set in flags in the request, then +the server also includes the `VFIO Bitmap`_ structure sent in the request, +followed by the corresponding dirty page bitmap, where each bit represents +one page of size *pgsize* in `VFIO Bitmap`_ . + +The total size of the total reply message is: +16 + 24 + (16 + *size* in `VFIO Bitmap`_ if *get dirty page bitmap* is set). + +``VFIO_USER_DEVICE_GET_INFO`` +----------------------------- + +This command message is sent by the client to the server to query for basic +information about the device. + +Request +^^^^^^^ + ++-------------+--------+--------------------------+ +| Name | Offset | Size | ++=============+========+==========================+ +| argsz | 0 | 4 | ++-------------+--------+--------------------------+ +| flags | 4 | 4 | ++-------------+--------+--------------------------+ +| | +-----+-------------------------+ | +| | | Bit | Definition | | +| | +=====+=========================+ | +| | | 0 | VFIO_DEVICE_FLAGS_RESET | | +| | +-----+-------------------------+ | +| | | 1 | VFIO_DEVICE_FLAGS_PCI | | +| | +-----+-------------------------+ | ++-------------+--------+--------------------------+ +| num_regions | 8 | 4 | ++-------------+--------+--------------------------+ +| num_irqs | 12 | 4 | ++-------------+--------+--------------------------+ + +* *argsz* is the maximum size of the reply payload +* all other fields must be zero. 
+ +Reply +^^^^^ + ++-------------+--------+--------------------------+ +| Name | Offset | Size | ++=============+========+==========================+ +| argsz | 0 | 4 | ++-------------+--------+--------------------------+ +| flags | 4 | 4 | ++-------------+--------+--------------------------+ +| | +-----+-------------------------+ | +| | | Bit | Definition | | +| | +=====+=========================+ | +| | | 0 | VFIO_DEVICE_FLAGS_RESET | | +| | +-----+-------------------------+ | +| | | 1 | VFIO_DEVICE_FLAGS_PCI | | +| | +-----+-------------------------+ | ++-------------+--------+--------------------------+ +| num_regions | 8 | 4 | ++-------------+--------+--------------------------+ +| num_irqs | 12 | 4 | ++-------------+--------+--------------------------+ + +* *argsz* is the size required for the full reply payload (16 bytes today) +* *flags* contains the following device attributes. + + * ``VFIO_DEVICE_FLAGS_RESET`` indicates that the device supports the + ``VFIO_USER_DEVICE_RESET`` message. + * ``VFIO_DEVICE_FLAGS_PCI`` indicates that the device is a PCI device. + +* *num_regions* is the number of memory regions that the device exposes. +* *num_irqs* is the number of distinct interrupt types that the device supports. + +This version of the protocol only supports PCI devices. Additional devices may +be supported in future versions. + +``VFIO_USER_DEVICE_GET_REGION_INFO`` +------------------------------------ + +This command message is sent by the client to the server to query for +information about device regions. The VFIO region info structure is defined in +``<linux/vfio.h>`` (``struct vfio_region_info``).
+ +Request +^^^^^^^ + ++------------+--------+------------------------------+ +| Name | Offset | Size | ++============+========+==============================+ +| argsz | 0 | 4 | ++------------+--------+------------------------------+ +| flags | 4 | 4 | ++------------+--------+------------------------------+ +| index | 8 | 4 | ++------------+--------+------------------------------+ +| cap_offset | 12 | 4 | ++------------+--------+------------------------------+ +| size | 16 | 8 | ++------------+--------+------------------------------+ +| offset | 24 | 8 | ++------------+--------+------------------------------+ + +* *argsz* is the maximum size of the reply payload. +* *index* is the index of the memory region being queried; it is the only field + that is required to be set in the command message. +* all other fields must be zero. + +Reply +^^^^^ + ++------------+--------+------------------------------+ +| Name | Offset | Size | ++============+========+==============================+ +| argsz | 0 | 4 | ++------------+--------+------------------------------+ +| flags | 4 | 4 | ++------------+--------+------------------------------+ +| | +-----+-----------------------------+ | +| | | Bit | Definition | | +| | +=====+=============================+ | +| | | 0 | VFIO_REGION_INFO_FLAG_READ | | +| | +-----+-----------------------------+ | +| | | 1 | VFIO_REGION_INFO_FLAG_WRITE | | +| | +-----+-----------------------------+ | +| | | 2 | VFIO_REGION_INFO_FLAG_MMAP | | +| | +-----+-----------------------------+ | +| | | 3 | VFIO_REGION_INFO_FLAG_CAPS | | +| | +-----+-----------------------------+ | ++------------+--------+------------------------------+ +| index | 8 | 4 | ++------------+--------+------------------------------+ +| cap_offset | 12 | 4 | ++------------+--------+------------------------------+ +| size | 16 | 8 | ++------------+--------+------------------------------+ +| offset | 24 | 8 |
++------------+--------+------------------------------+ + +* *argsz* is the size required for the full reply payload (region info structure + plus the size of any region capabilities). +* *flags* are attributes of the region: + + * ``VFIO_REGION_INFO_FLAG_READ`` allows client read access to the region. + * ``VFIO_REGION_INFO_FLAG_WRITE`` allows client write access to the region. + * ``VFIO_REGION_INFO_FLAG_MMAP`` specifies the client can mmap() the region. + When this flag is set, the reply will include a file descriptor in its + meta-data. On ``AF_UNIX`` sockets, the file descriptors will be passed as + ``SCM_RIGHTS`` type ancillary data. + * ``VFIO_REGION_INFO_FLAG_CAPS`` indicates that additional capabilities are found in the + reply. + +* *index* is the index of the memory region being queried; it is the only field + that is required to be set in the command message. +* *cap_offset* describes where additional region capabilities can be found. + cap_offset is relative to the beginning of the VFIO region info structure. + The data structure it points to is a VFIO cap header defined in + ``<linux/vfio.h>``. +* *size* is the size of the region. +* *offset* is the offset that should be given to the mmap() system call for + regions with the MMAP attribute. It is also used as the base offset when + mapping a VFIO sparse mmap area, described below. + +VFIO region capabilities +"""""""""""""""""""""""" + +The VFIO region information can also include a capabilities list. This list is +similar to a PCI capability list: each entry has a common header that +identifies a capability and where the next capability in the list can be found. +The VFIO capability header format is defined in ``<linux/vfio.h>`` (``struct +vfio_info_cap_header``).
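Walking the capability list can be sketched as follows. The field layout comes from ``struct vfio_region_info`` and ``struct vfio_info_cap_header`` (``id`` and ``version`` as 16-bit values, ``next`` as a 32-bit offset); the little-endian packing, the treatment of a ``next`` value of zero as the end of the list, and the ``parse_region_info`` name are assumptions of this sketch.

```python
import struct

REGION_INFO = struct.Struct("<IIIIQQ")  # argsz, flags, index, cap_offset, size, offset
CAP_HEADER = struct.Struct("<HHI")      # id, version, next

FLAG_READ = 1 << 0
FLAG_WRITE = 1 << 1
FLAG_MMAP = 1 << 2
FLAG_CAPS = 1 << 3


def parse_region_info(payload):
    """Decode a region info reply payload and list its capability headers.

    Capability offsets are relative to the start of the region info
    structure, which is the start of `payload` here.
    """
    argsz, flags, index, cap_offset, size, offset = REGION_INFO.unpack_from(payload)
    caps = []
    seen = set()
    pos = cap_offset if flags & FLAG_CAPS else 0
    while pos:  # assumption: next == 0 terminates the chain
        if pos in seen or pos + CAP_HEADER.size > len(payload):
            raise ValueError("malformed capability chain")
        seen.add(pos)
        cap_id, version, nxt = CAP_HEADER.unpack_from(payload, pos)
        caps.append((cap_id, version, pos))
        pos = nxt
    return {
        "argsz": argsz,
        "index": index,
        "size": size,
        "offset": offset,
        "readable": bool(flags & FLAG_READ),
        "writable": bool(flags & FLAG_WRITE),
        "mmap": bool(flags & FLAG_MMAP),
        "caps": caps,
    }
```

The loop guard matters because the offsets come from the peer: a chain that points back at itself would otherwise never terminate.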
+ +VFIO cap header format +"""""""""""""""""""""" + ++---------+--------+------+ +| Name | Offset | Size | ++=========+========+======+ +| id | 0 | 2 | ++---------+--------+------+ +| version | 2 | 2 | ++---------+--------+------+ +| next | 4 | 4 | ++---------+--------+------+ + +* *id* is the capability identity. +* *version* is a capability-specific version number. +* *next* specifies the offset of the next capability in the capability list. It + is relative to the beginning of the VFIO region info structure. + +VFIO sparse mmap cap header +""""""""""""""""""""""""""" + ++------------------+----------------------------------+ +| Name | Value | ++==================+==================================+ +| id | VFIO_REGION_INFO_CAP_SPARSE_MMAP | ++------------------+----------------------------------+ +| version | 0x1 | ++------------------+----------------------------------+ +| next | | ++------------------+----------------------------------+ +| sparse mmap info | VFIO region info sparse mmap | ++------------------+----------------------------------+ + +This capability is defined when only a subrange of the region supports +direct access by the client via mmap(). The VFIO sparse mmap area is defined in +``<linux/vfio.h>`` (``struct vfio_region_sparse_mmap_area`` and ``struct +vfio_region_info_cap_sparse_mmap``). + +VFIO region info cap sparse mmap +"""""""""""""""""""""""""""""""" + ++----------+--------+------+ +| Name | Offset | Size | ++==========+========+======+ +| nr_areas | 0 | 4 | ++----------+--------+------+ +| reserved | 4 | 4 | ++----------+--------+------+ +| offset | 8 | 8 | ++----------+--------+------+ +| size | 16 | 8 | ++----------+--------+------+ +| ... | | | ++----------+--------+------+ + +* *nr_areas* is the number of sparse mmap areas in the region. +* *offset* and *size* describe a single area that can be mapped by the client. + There will be *nr_areas* pairs of offset and size.
The offset will be added to + the base offset given in the ``VFIO_USER_DEVICE_GET_REGION_INFO`` reply to form the + offset argument of the subsequent mmap() call. + +The VFIO sparse mmap area is defined in ``<linux/vfio.h>`` (``struct +vfio_region_info_cap_sparse_mmap``). + +VFIO region type cap header +""""""""""""""""""""""""""" + ++------------------+---------------------------+ +| Name | Value | ++==================+===========================+ +| id | VFIO_REGION_INFO_CAP_TYPE | ++------------------+---------------------------+ +| version | 0x1 | ++------------------+---------------------------+ +| next | | ++------------------+---------------------------+ +| region info type | VFIO region info type | ++------------------+---------------------------+ + +This capability is defined when a region is specific to the device. + +VFIO region info type cap +""""""""""""""""""""""""" + +The VFIO region info type is defined in ``<linux/vfio.h>`` +(``struct vfio_region_info_cap_type``). + ++---------+--------+------+ +| Name | Offset | Size | ++=========+========+======+ +| type | 0 | 4 | ++---------+--------+------+ +| subtype | 4 | 4 | ++---------+--------+------+ + +The only device-specific region type and subtype supported by vfio-user are +``VFIO_REGION_TYPE_MIGRATION`` (3) and ``VFIO_REGION_SUBTYPE_MIGRATION`` (1). + +``VFIO_USER_DEVICE_GET_REGION_IO_FDS`` +-------------------------------------- + +Clients can access regions via ``VFIO_USER_REGION_READ/WRITE`` or, if provided, by +``mmap()`` of a file descriptor provided by the server. + +``VFIO_USER_DEVICE_GET_REGION_IO_FDS`` provides an alternative access mechanism via +file descriptors. This is an optional feature intended for performance +improvements where an underlying sub-system (such as KVM) supports communication +across such file descriptors to the vfio-user server, without needing to +round-trip through the client. + +The server returns an array of sub-regions for the requested region.
Each +sub-region describes a span (offset and size) of a region, along with the +requested file descriptor notification mechanism to use. Each sub-region in the +response message may choose to use a different method, as defined below. The +two mechanisms supported in this specification are ioeventfds and ioregionfds. + +The server in addition returns a file descriptor in the ancillary data; clients +are expected to configure each sub-region's file descriptor with the requested +notification method. For example, a client could configure KVM with the +requested ioeventfd via a ``KVM_IOEVENTFD`` ``ioctl()``. + +Request +^^^^^^^ + ++-------------+--------+------+ +| Name | Offset | Size | ++=============+========+======+ +| argsz | 0 | 4 | ++-------------+--------+------+ +| flags | 4 | 4 | ++-------------+--------+------+ +| index | 8 | 4 | ++-------------+--------+------+ +| count | 12 | 4 | ++-------------+--------+------+ + +* *argsz* the maximum size of the reply payload +* *index* is the index of memory region being queried +* all other fields must be zero + +The client must set ``flags`` to zero and specify the region being queried in +the ``index``. + +Reply +^^^^^ + ++-------------+--------+------+ +| Name | Offset | Size | ++=============+========+======+ +| argsz | 0 | 4 | ++-------------+--------+------+ +| flags | 4 | 4 | ++-------------+--------+------+ +| index | 8 | 4 | ++-------------+--------+------+ +| count | 12 | 4 | ++-------------+--------+------+ +| sub-regions | 16 | ... | ++-------------+--------+------+ + +* *argsz* is the size of the region IO FD info structure plus the + total size of the sub-region array. Thus, each array entry "i" is at offset + i * ((argsz - 32) / count). Note that currently this is 40 bytes for both IO + FD types, but this is not to be relied on. As elsewhere, this indicates the + full reply payload size needed. 
+* *flags* must be zero +* *index* is the index of memory region being queried +* *count* is the number of sub-regions in the array +* *sub-regions* is the array of Sub-Region IO FD info structures + +The reply message will additionally include at least one file descriptor in the +ancillary data. Note that more than one sub-region may share the same file +descriptor. + +Note that it is the client's responsibility to verify the requested values (for +example, that the requested offset does not exceed the region's bounds). + +Each sub-region given in the response has one of two possible structures, +depending whether *type* is ``VFIO_USER_IO_FD_TYPE_IOEVENTFD`` or +``VFIO_USER_IO_FD_TYPE_IOREGIONFD``: + +Sub-Region IO FD info format (ioeventfd) +"""""""""""""""""""""""""""""""""""""""" + ++-----------+--------+------+ +| Name | Offset | Size | ++===========+========+======+ +| offset | 0 | 8 | ++-----------+--------+------+ +| size | 8 | 8 | ++-----------+--------+------+ +| fd_index | 16 | 4 | ++-----------+--------+------+ +| type | 20 | 4 | ++-----------+--------+------+ +| flags | 24 | 4 | ++-----------+--------+------+ +| padding | 28 | 4 | ++-----------+--------+------+ +| datamatch | 32 | 8 | ++-----------+--------+------+ + +* *offset* is the offset of the start of the sub-region within the region + requested ("physical address offset" for the region) +* *size* is the length of the sub-region. This may be zero if the access size is + not relevant, which may allow for optimizations +* *fd_index* is the index in the ancillary data of the FD to use for ioeventfd + notification; it may be shared. +* *type* is ``VFIO_USER_IO_FD_TYPE_IOEVENTFD`` +* *flags* is any of: + + * ``KVM_IOEVENTFD_FLAG_DATAMATCH`` + * ``KVM_IOEVENTFD_FLAG_PIO`` + * ``KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY`` (FIXME: makes sense?) 
+ +* *datamatch* is the datamatch value if needed + +See https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt, *4.59 +KVM_IOEVENTFD* for further context on the ioeventfd-specific fields. + +Sub-Region IO FD info format (ioregionfd) +""""""""""""""""""""""""""""""""""""""""" + ++-----------+--------+------+ +| Name | Offset | Size | ++===========+========+======+ +| offset | 0 | 8 | ++-----------+--------+------+ +| size | 8 | 8 | ++-----------+--------+------+ +| fd_index | 16 | 4 | ++-----------+--------+------+ +| type | 20 | 4 | ++-----------+--------+------+ +| flags | 24 | 4 | ++-----------+--------+------+ +| padding | 28 | 4 | ++-----------+--------+------+ +| user_data | 32 | 8 | ++-----------+--------+------+ + +* *offset* is the offset of the start of the sub-region within the region + requested ("physical address offset" for the region) +* *size* is the length of the sub-region. This may be zero if the access size is + not relevant, which may allow for optimizations; ``KVM_IOREGION_POSTED_WRITES`` + must be set in *flags* in this case +* *fd_index* is the index in the ancillary data of the FD to use for ioregionfd + messages; it may be shared +* *type* is ``VFIO_USER_IO_FD_TYPE_IOREGIONFD`` +* *flags* is any of: + + * ``KVM_IOREGION_PIO`` + * ``KVM_IOREGION_POSTED_WRITES`` + +* *user_data* is an opaque value passed back to the server via a message on the + file descriptor + +For further information on the ioregionfd-specific fields, see: +https://lore.kernel.org/kvm/cover.1613828726.git.eafanasova@gmail.com/ + +(FIXME: update with final API docs.) + +``VFIO_USER_DEVICE_GET_IRQ_INFO`` +--------------------------------- + +This command message is sent by the client to the server to query for +information about device interrupt types. The VFIO IRQ info structure is +defined in ``<linux/vfio.h>`` (``struct vfio_irq_info``).
+ +Request +^^^^^^^ + ++-------+--------+---------------------------+ +| Name | Offset | Size | ++=======+========+===========================+ +| argsz | 0 | 4 | ++-------+--------+---------------------------+ +| flags | 4 | 4 | ++-------+--------+---------------------------+ +| | +-----+--------------------------+ | +| | | Bit | Definition | | +| | +=====+==========================+ | +| | | 0 | VFIO_IRQ_INFO_EVENTFD | | +| | +-----+--------------------------+ | +| | | 1 | VFIO_IRQ_INFO_MASKABLE | | +| | +-----+--------------------------+ | +| | | 2 | VFIO_IRQ_INFO_AUTOMASKED | | +| | +-----+--------------------------+ | +| | | 3 | VFIO_IRQ_INFO_NORESIZE | | +| | +-----+--------------------------+ | ++-------+--------+---------------------------+ +| index | 8 | 4 | ++-------+--------+---------------------------+ +| count | 12 | 4 | ++-------+--------+---------------------------+ + +* *argsz* is the maximum size of the reply payload (16 bytes today) +* index is the index of IRQ type being queried (e.g. 
``VFIO_PCI_MSIX_IRQ_INDEX``) +* all other fields must be zero + +Reply +^^^^^ + ++-------+--------+---------------------------+ +| Name | Offset | Size | ++=======+========+===========================+ +| argsz | 0 | 4 | ++-------+--------+---------------------------+ +| flags | 4 | 4 | ++-------+--------+---------------------------+ +| | +-----+--------------------------+ | +| | | Bit | Definition | | +| | +=====+==========================+ | +| | | 0 | VFIO_IRQ_INFO_EVENTFD | | +| | +-----+--------------------------+ | +| | | 1 | VFIO_IRQ_INFO_MASKABLE | | +| | +-----+--------------------------+ | +| | | 2 | VFIO_IRQ_INFO_AUTOMASKED | | +| | +-----+--------------------------+ | +| | | 3 | VFIO_IRQ_INFO_NORESIZE | | +| | +-----+--------------------------+ | ++-------+--------+---------------------------+ +| index | 8 | 4 | ++-------+--------+---------------------------+ +| count | 12 | 4 | ++-------+--------+---------------------------+ + +* *argsz* is the size required for the full reply payload (16 bytes today) +* *flags* defines IRQ attributes: + + * ``VFIO_IRQ_INFO_EVENTFD`` indicates the IRQ type can support server eventfd + signalling. + * ``VFIO_IRQ_INFO_MASKABLE`` indicates that the IRQ type supports the ``MASK`` + and ``UNMASK`` actions in a ``VFIO_USER_DEVICE_SET_IRQS`` message. + * ``VFIO_IRQ_INFO_AUTOMASKED`` indicates the IRQ type masks itself after being + triggered, and the client must send an ``UNMASK`` action to receive new + interrupts. + * ``VFIO_IRQ_INFO_NORESIZE`` indicates ``VFIO_USER_DEVICE_SET_IRQS`` operations set up + interrupts as a set, and new sub-indexes cannot be enabled without disabling + the entire type. + +* *index* is the index of IRQ type being queried +* *count* describes the number of interrupts of the queried type. + +``VFIO_USER_DEVICE_SET_IRQS`` +----------------------------- + +This command message is sent by the client to the server to set actions for +device interrupt types.
The VFIO IRQ set structure is defined in +``<linux/vfio.h>`` (``struct vfio_irq_set``). + +Request +^^^^^^^ + ++-------+--------+------------------------------+ +| Name | Offset | Size | ++=======+========+==============================+ +| argsz | 0 | 4 | ++-------+--------+------------------------------+ +| flags | 4 | 4 | ++-------+--------+------------------------------+ +| | +-----+-----------------------------+ | +| | | Bit | Definition | | +| | +=====+=============================+ | +| | | 0 | VFIO_IRQ_SET_DATA_NONE | | +| | +-----+-----------------------------+ | +| | | 1 | VFIO_IRQ_SET_DATA_BOOL | | +| | +-----+-----------------------------+ | +| | | 2 | VFIO_IRQ_SET_DATA_EVENTFD | | +| | +-----+-----------------------------+ | +| | | 3 | VFIO_IRQ_SET_ACTION_MASK | | +| | +-----+-----------------------------+ | +| | | 4 | VFIO_IRQ_SET_ACTION_UNMASK | | +| | +-----+-----------------------------+ | +| | | 5 | VFIO_IRQ_SET_ACTION_TRIGGER | | +| | +-----+-----------------------------+ | ++-------+--------+------------------------------+ +| index | 8 | 4 | ++-------+--------+------------------------------+ +| start | 12 | 4 | ++-------+--------+------------------------------+ +| count | 16 | 4 | ++-------+--------+------------------------------+ +| data | 20 | variable | ++-------+--------+------------------------------+ + +* *argsz* is the size of the VFIO IRQ set request payload, including any *data* + field. Note there is no reply payload, so this field differs from other + message types. +* *flags* defines the action performed on the interrupt range. The ``DATA`` + flags describe the data field sent in the message; the ``ACTION`` flags + describe the action to be performed. The flags are mutually exclusive for + both sets. + + * ``VFIO_IRQ_SET_DATA_NONE`` indicates there is no data field in the command. + The action is performed unconditionally. + * ``VFIO_IRQ_SET_DATA_BOOL`` indicates the data field is an array of boolean + bytes.
The action is performed if the corresponding boolean is true. + * ``VFIO_IRQ_SET_DATA_EVENTFD`` indicates an array of event file descriptors + was sent in the message meta-data. These descriptors will be signalled when + the action defined by the action flags occurs. In ``AF_UNIX`` sockets, the + descriptors are sent as ``SCM_RIGHTS`` type ancillary data. + If no file descriptors are provided, this de-assigns the specified + previously configured interrupts. + * ``VFIO_IRQ_SET_ACTION_MASK`` indicates a masking event. It can be used with + ``VFIO_IRQ_SET_DATA_BOOL`` or ``VFIO_IRQ_SET_DATA_NONE`` to mask an interrupt, + or with ``VFIO_IRQ_SET_DATA_EVENTFD`` to generate an event when the guest masks + the interrupt. + * ``VFIO_IRQ_SET_ACTION_UNMASK`` indicates an unmasking event. It can be used + with ``VFIO_IRQ_SET_DATA_BOOL`` or ``VFIO_IRQ_SET_DATA_NONE`` to unmask an + interrupt, or with ``VFIO_IRQ_SET_DATA_EVENTFD`` to generate an event when the + guest unmasks the interrupt. + * ``VFIO_IRQ_SET_ACTION_TRIGGER`` indicates a triggering event. It can be used + with ``VFIO_IRQ_SET_DATA_BOOL`` or ``VFIO_IRQ_SET_DATA_NONE`` to trigger an + interrupt, or with ``VFIO_IRQ_SET_DATA_EVENTFD`` to generate an event when the + server triggers the interrupt. + +* *index* is the index of IRQ type being setup. +* *start* is the start of the sub-index being set. +* *count* describes the number of sub-indexes being set. As a special case, a + count (and start) of 0, with data flags of ``VFIO_IRQ_SET_DATA_NONE`` disables + all interrupts of the index. +* *data* is an optional field included when the + ``VFIO_IRQ_SET_DATA_BOOL`` flag is present. It contains an array of booleans + that specify whether the action is to be performed on the corresponding + index. It's used when the action is only performed on a subset of the range + specified. + +Not all interrupt types support every combination of data and action flags. 
+The client must know the capabilities of the device and IRQ index before it +sends a ``VFIO_USER_DEVICE_SET_IRQS`` message. + +In typical operation, a specific IRQ may operate as follows: + +1. The client sends a ``VFIO_USER_DEVICE_SET_IRQS`` message with + ``flags=(VFIO_IRQ_SET_DATA_EVENTFD|VFIO_IRQ_SET_ACTION_TRIGGER)`` along + with an eventfd. This associates the IRQ with a particular eventfd on the + server side. + +#. The client may send a ``VFIO_USER_DEVICE_SET_IRQS`` message with + ``flags=(VFIO_IRQ_SET_DATA_EVENTFD|VFIO_IRQ_SET_ACTION_MASK/UNMASK)`` along + with another eventfd. This associates the given eventfd with the + mask/unmask state on the server side. + +#. The server may trigger the IRQ by writing 1 to the eventfd. + +#. The server may mask/unmask an IRQ which will write 1 to the corresponding + mask/unmask eventfd, if there is one. + +#. A client may trigger a device IRQ itself, by sending a + ``VFIO_USER_DEVICE_SET_IRQS`` message with + ``flags=(VFIO_IRQ_SET_DATA_NONE/BOOL|VFIO_IRQ_SET_ACTION_TRIGGER)``. + +#. A client may mask or unmask the IRQ, by sending a + ``VFIO_USER_DEVICE_SET_IRQS`` message with + ``flags=(VFIO_IRQ_SET_DATA_NONE/BOOL|VFIO_IRQ_SET_ACTION_MASK/UNMASK)``. + +Reply +^^^^^ + +There is no payload in the reply. + +.. _Read and Write Operations: + +Note that all of these operations must be supported by the client and/or server, +even if the corresponding memory or device region has been shared as mappable. + +The ``count`` field must not exceed the value of ``max_data_xfer_size`` of the +peer, for both reads and writes. + +``VFIO_USER_REGION_READ`` +------------------------- + +If a device region is not mappable, it's not directly accessible by the client +via ``mmap()`` of the underlying file descriptor. In this case, a client can +read from a device region with this message.
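Because ``count`` is capped by the peer's ``max_data_xfer_size``, a large read has to be split into several requests. The sketch below assumes little-endian packing and uses the request layout given in the table that follows (``offset`` 8 bytes, ``region`` 4 bytes, ``count`` 4 bytes); ``region_read_requests`` is a hypothetical helper, and the default limit is the 1048576 bytes stated for ``max_data_xfer_size``.

```python
import struct

REGION_ACCESS = struct.Struct("<QII")  # offset, region, count (16 bytes)
DEFAULT_MAX_XFER = 1048576             # default max_data_xfer_size in bytes


def region_read_requests(region, offset, length,
                         max_data_xfer_size=DEFAULT_MAX_XFER):
    """Yield VFIO_USER_REGION_READ request payloads covering `length` bytes.

    Each payload keeps `count` at or below the peer's max_data_xfer_size.
    """
    done = 0
    while done < length:
        count = min(max_data_xfer_size, length - done)
        yield REGION_ACCESS.pack(offset + done, region, count)
        done += count
```

The same splitting applies to writes, with the data for each chunk appended after the 16-byte structure.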
+
+Request
+^^^^^^^
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+
+* *offset* is the offset into the region being accessed.
+* *region* is the index of the region being accessed.
+* *count* is the size of the data to be transferred.
+
+Reply
+^^^^^
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+| data   | 16     | variable |
++--------+--------+----------+
+
+* *offset* is the offset into the region accessed.
+* *region* is the index of the region accessed.
+* *count* is the size of the data transferred.
+* *data* is the data that was read from the device region.
+
+``VFIO_USER_REGION_WRITE``
+--------------------------
+
+If a device region is not mappable, it's not directly accessible by the client
+via ``mmap()`` of the underlying file descriptor. In this case, a client can
+write to a device region with this message.
+
+Request
+^^^^^^^
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+| data   | 16     | variable |
++--------+--------+----------+
+
+* *offset* is the offset into the region being accessed.
+* *region* is the index of the region being accessed.
+* *count* is the size of the data to be transferred.
+* *data* is the data to write.
+
+Reply
+^^^^^
+
++--------+--------+----------+
+| Name   | Offset | Size     |
++========+========+==========+
+| offset | 0      | 8        |
++--------+--------+----------+
+| region | 8      | 4        |
++--------+--------+----------+
+| count  | 12     | 4        |
++--------+--------+----------+
+
+* *offset* is the offset into the region accessed.
+* *region* is the index of the region accessed.
+* *count* is the size of the data transferred.
+
+``VFIO_USER_DMA_READ``
+----------------------
+
+If the client has not shared mappable memory, the server can use this message to
+read from guest memory.
+
+Request
+^^^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| address | 0      | 8        |
++---------+--------+----------+
+| count   | 8      | 8        |
++---------+--------+----------+
+
+* *address* is the client DMA memory address being accessed. This address must
+  have been previously exported to the server with a ``VFIO_USER_DMA_MAP``
+  message.
+* *count* is the size of the data to be transferred.
+
+Reply
+^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| address | 0      | 8        |
++---------+--------+----------+
+| count   | 8      | 8        |
++---------+--------+----------+
+| data    | 16     | variable |
++---------+--------+----------+
+
+* *address* is the client DMA memory address being accessed.
+* *count* is the size of the data transferred.
+* *data* is the data read.
+
+``VFIO_USER_DMA_WRITE``
+-----------------------
+
+If the client has not shared mappable memory, the server can use this message to
+write to guest memory.
+
+Request
+^^^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| address | 0      | 8        |
++---------+--------+----------+
+| count   | 8      | 8        |
++---------+--------+----------+
+| data    | 16     | variable |
++---------+--------+----------+
+
+* *address* is the client DMA memory address being accessed.
+  This address must have been previously exported to the server with a
+  ``VFIO_USER_DMA_MAP`` message.
+* *count* is the size of the data to be transferred.
+* *data* is the data to write.
+
+Reply
+^^^^^
+
++---------+--------+----------+
+| Name    | Offset | Size     |
++=========+========+==========+
+| address | 0      | 8        |
++---------+--------+----------+
+| count   | 8      | 4        |
++---------+--------+----------+
+
+* *address* is the client DMA memory address being accessed.
+* *count* is the size of the data transferred.
+
+``VFIO_USER_DEVICE_RESET``
+--------------------------
+
+This command message is sent from the client to the server to reset the device.
+Neither the request nor the reply has a payload.
+
+``VFIO_USER_DIRTY_PAGES``
+-------------------------
+
+This command is analogous to ``VFIO_IOMMU_DIRTY_PAGES``. It is sent by the client
+to the server in order to control logging of dirty pages, usually during a live
+migration.
+
+Dirty page tracking is optional for server implementations; clients should not
+rely on it.
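The request described next carries a ``flags`` word in which the three action bits are mutually exclusive. A minimal sketch of a server-side check, assuming the bit positions given in the request table (bits 0, 1 and 2). The macro and function names are illustrative, and rejecting unknown bits is an assumption of this sketch rather than something the text mandates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as listed in the request table; local names are illustrative. */
#define DIRTY_PAGES_FLAG_START      (1u << 0)
#define DIRTY_PAGES_FLAG_STOP       (1u << 1)
#define DIRTY_PAGES_FLAG_GET_BITMAP (1u << 2)

#define DIRTY_PAGES_FLAG_MASK                                 \
    (DIRTY_PAGES_FLAG_START | DIRTY_PAGES_FLAG_STOP |         \
     DIRTY_PAGES_FLAG_GET_BITMAP)

/*
 * Returns true when exactly one of the three action bits is set.
 * Unknown bits are rejected (an assumption made for this sketch).
 */
static bool dirty_pages_flags_valid(uint32_t flags)
{
    if (flags & ~DIRTY_PAGES_FLAG_MASK) {
        return false;
    }
    /* Non-zero and a power of two means exactly one bit is set. */
    return flags != 0 && (flags & (flags - 1)) == 0;
}
```

A server would typically run such a check before acting on the request and fail the command when it returns false.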
+
+Request
+^^^^^^^
+
++-------+--------+-----------------------------------------+
+| Name  | Offset | Size                                    |
++=======+========+=========================================+
+| argsz | 0      | 4                                       |
++-------+--------+-----------------------------------------+
+| flags | 4      | 4                                       |
++-------+--------+-----------------------------------------+
+|       | +-----+----------------------------------------+ |
+|       | | Bit | Definition                             | |
+|       | +=====+========================================+ |
+|       | | 0   | VFIO_IOMMU_DIRTY_PAGES_FLAG_START      | |
+|       | +-----+----------------------------------------+ |
+|       | | 1   | VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP       | |
+|       | +-----+----------------------------------------+ |
+|       | | 2   | VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP | |
+|       | +-----+----------------------------------------+ |
++-------+--------+-----------------------------------------+
+
+* *argsz* is the size of the VFIO dirty bitmap info structure for
+  ``START/STOP``; and for ``GET_BITMAP``, the maximum size of the reply payload.
+
+* *flags* defines the action to be performed by the server:
+
+  * ``VFIO_IOMMU_DIRTY_PAGES_FLAG_START`` instructs the server to start logging
+    pages it dirties. Logging continues until explicitly disabled by
+    ``VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP``.
+
+  * ``VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP`` instructs the server to stop logging
+    dirty pages.
+
+  * ``VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP`` requests the server to return
+    the dirty bitmap for a specific IOVA range. The IOVA range is specified by
+    a "VFIO Bitmap Range" structure, which must immediately follow this
+    "VFIO Dirty Pages" structure. See `VFIO Bitmap Range Format`_.
+    This operation is only valid if logging of dirty pages has been previously
+    started.
+
+  These flags are mutually exclusive with each other.
+
+This part of the request is analogous to VFIO's
+``struct vfio_iommu_type1_dirty_bitmap``.
+
+.. _VFIO Bitmap Range Format:
+
+VFIO Bitmap Range Format
+""""""""""""""""""""""""
+
++--------+--------+------+
+| Name   | Offset | Size |
++========+========+======+
+| iova   | 0      | 8    |
++--------+--------+------+
+| size   | 8      | 8    |
++--------+--------+------+
+| bitmap | 16     | 24   |
++--------+--------+------+
+
+* *iova* is the IOVA offset
+
+* *size* is the size of the IOVA region
+
+* *bitmap* is the VFIO Bitmap explained in `VFIO Bitmap`_.
+
+This part of the request is analogous to VFIO's
+``struct vfio_iommu_type1_dirty_bitmap_get``.
+
+Reply
+^^^^^
+
+For ``VFIO_IOMMU_DIRTY_PAGES_FLAG_START`` or
+``VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP``, there is no reply payload.
+
+For ``VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP``, the reply payload is as follows:
+
++--------------+--------+-----------------------------------------+
+| Name         | Offset | Size                                    |
++==============+========+=========================================+
+| argsz        | 0      | 4                                       |
++--------------+--------+-----------------------------------------+
+| flags        | 4      | 4                                       |
++--------------+--------+-----------------------------------------+
+|              | +-----+----------------------------------------+ |
+|              | | Bit | Definition                             | |
+|              | +=====+========================================+ |
+|              | | 2   | VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP | |
+|              | +-----+----------------------------------------+ |
++--------------+--------+-----------------------------------------+
+| bitmap range | 8      | 40                                      |
++--------------+--------+-----------------------------------------+
+| bitmap       | 48     | variable                                |
++--------------+--------+-----------------------------------------+
+
+* *argsz* is the size required for the full reply payload (dirty pages structure
+  + bitmap range structure + actual bitmap)
+* *flags* is ``VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP``
+* *bitmap range* is the same bitmap range struct provided in the request, as
+  defined in `VFIO Bitmap Range Format`_.
+* *bitmap* is the actual dirty pages bitmap corresponding to the range request
+
+VFIO Device Migration Info
+--------------------------
+
+A device may contain a migration region (of type
+``VFIO_REGION_TYPE_MIGRATION``). The beginning of the region must contain
+``struct vfio_device_migration_info``, defined in ``<linux/vfio.h>``. This
+subregion is accessed like any other part of a standard vfio-user region
+using ``VFIO_USER_REGION_READ``/``VFIO_USER_REGION_WRITE``.
+
++---------------+--------+-----------------------------+
+| Name          | Offset | Size                        |
++===============+========+=============================+
+| device_state  | 0      | 4                           |
++---------------+--------+-----------------------------+
+|               | +-----+----------------------------+ |
+|               | | Bit | Definition                 | |
+|               | +=====+============================+ |
+|               | | 0   | VFIO_DEVICE_STATE_RUNNING  | |
+|               | +-----+----------------------------+ |
+|               | | 1   | VFIO_DEVICE_STATE_SAVING   | |
+|               | +-----+----------------------------+ |
+|               | | 2   | VFIO_DEVICE_STATE_RESUMING | |
+|               | +-----+----------------------------+ |
++---------------+--------+-----------------------------+
+| reserved      | 4      | 4                           |
++---------------+--------+-----------------------------+
+| pending_bytes | 8      | 8                           |
++---------------+--------+-----------------------------+
+| data_offset   | 16     | 8                           |
++---------------+--------+-----------------------------+
+| data_size     | 24     | 8                           |
++---------------+--------+-----------------------------+
+
+* *device_state* defines the state of the device:
+
+  The client initiates device state transition by writing the intended state.
+  The server must respond only after it has successfully transitioned to the new
+  state. If an error occurs then the server must respond to the
+  ``VFIO_USER_REGION_WRITE`` operation with the Error field set accordingly and
+  must remain at the previous state, or in case of internal error it must
+  transition to the error state, defined as
+  ``VFIO_DEVICE_STATE_RESUMING | VFIO_DEVICE_STATE_SAVING``.
+  The client must re-read the device state in order to determine it afresh.
+
+  The following device states are defined:
+
+  +-----------+---------+----------+-----------------------------------+
+  | _RESUMING | _SAVING | _RUNNING | Description                       |
+  +===========+=========+==========+===================================+
+  | 0         | 0       | 0        | Device is stopped.                |
+  +-----------+---------+----------+-----------------------------------+
+  | 0         | 0       | 1        | Device is running, default state. |
+  +-----------+---------+----------+-----------------------------------+
+  | 0         | 1       | 0        | Stop-and-copy state               |
+  +-----------+---------+----------+-----------------------------------+
+  | 0         | 1       | 1        | Pre-copy state                    |
+  +-----------+---------+----------+-----------------------------------+
+  | 1         | 0       | 0        | Resuming                          |
+  +-----------+---------+----------+-----------------------------------+
+  | 1         | 0       | 1        | Invalid state                     |
+  +-----------+---------+----------+-----------------------------------+
+  | 1         | 1       | 0        | Error state                       |
+  +-----------+---------+----------+-----------------------------------+
+  | 1         | 1       | 1        | Invalid state                     |
+  +-----------+---------+----------+-----------------------------------+
+
+  Valid state transitions are shown in the following table:
+
+  +-------------------------+---------+---------+---------------+----------+----------+
+  | |darr| From / To |rarr| | Stopped | Running | Stop-and-copy | Pre-copy | Resuming |
+  +=========================+=========+=========+===============+==========+==========+
+  | Stopped                 | \-      | 1       | 0             | 0        | 0        |
+  +-------------------------+---------+---------+---------------+----------+----------+
+  | Running                 | 1       | \-      | 1             | 1        | 1        |
+  +-------------------------+---------+---------+---------------+----------+----------+
+  | Stop-and-copy           | 1       | 1       | \-            | 0        | 0        |
+  +-------------------------+---------+---------+---------------+----------+----------+
+  | Pre-copy                | 0       | 0       | 1             | \-       | 0        |
+  +-------------------------+---------+---------+---------------+----------+----------+
+  | Resuming                | 0       | 1       | 0             | 0        | \-       |
+  +-------------------------+---------+---------+---------------+----------+----------+
+
+  A device is migrated to the destination as follows:
+
+  * The source client transitions the device state from the running state to
+    the pre-copy state. This transition is optional for the client but must be
+    supported by the server. The source server starts sending device state data
+    to the source client through the migration region while the device is
+    running.
+
+  * The source client transitions the device state from the running state or the
+    pre-copy state to the stop-and-copy state. The source server stops the
+    device, saves device state and sends it to the source client through the
+    migration region.
+
+  The source client is responsible for sending the migration data to the
+  destination client.
+
+  A device is resumed on the destination as follows:
+
+  * The destination client transitions the device state from the running state
+    to the resuming state. The destination server uses the device state data
+    received through the migration region to resume the device.
+
+  * The destination client provides saved device state to the destination
+    server and then transitions the device back to the running state.
+
+* *reserved* This field is reserved and any access to it must be ignored by the
+  server.
+
+* *pending_bytes* Remaining bytes to be migrated by the server. This field is
+  read only.
+
+* *data_offset* Offset in the migration region where the client must:
+
+  * read from, during the pre-copy or stop-and-copy state, or
+
+  * write to, during the resuming state.
+
+  This field is read only.
+
+* *data_size* Contains the size, in bytes, of the amount of data copied to:
+
+  * the source migration region by the source server during the pre-copy or
+    stop-and-copy state, or
+
+  * the destination migration region by the destination client during the
+    resuming state.
+
+Device-specific data must be stored at any position after
+``struct vfio_device_migration_info``. Note that the migration region can be
+memory mappable, even partially. In practice, only the migration data portion
+can be memory mapped.
+
+The client processes device state data during the pre-copy and the
+stop-and-copy state in the following iterative manner:
+
+  1. The client reads ``pending_bytes`` to mark a new iteration. Repeated reads
+     of this field are idempotent. If there is no migration data to be
+     consumed then the next step depends on the current device state:
+
+     * pre-copy: the client must try again.
+
+     * stop-and-copy: this procedure can end and the device can now start
+       resuming on the destination.
+
+  2. The client reads ``data_offset``; at this point the server must make
+     available a portion of migration data at this offset to be read by the
+     client, which must happen *before* completing the read operation. The
+     amount of data to be read must be stored in the ``data_size`` field, which
+     the client reads next.
+
+  3. The client reads ``data_size`` to determine the amount of migration data
+     available.
+
+  4. The client reads and processes the migration data.
+
+  5. Go to step 1.
+
+Note that the client can transition the device from the pre-copy state to the
+stop-and-copy state at any time; ``pending_bytes`` does not need to become zero.
+
+The client initializes the device state on the destination by setting the
+device state to the resuming state and writing the migration data to the
+destination migration region at ``data_offset`` offset.
+The client can write the source migration data in an iterative manner and the
+server must consume this data before completing each write operation, updating
+the ``data_offset`` field. The server must apply the source migration data on
+the device resume state. The client must write data in the same order and with
+the same transaction size as it was read.
+
+If an error occurs then the server must fail the read or write operation. It is
+an implementation detail of the client how to handle errors.
+
+Appendices
+==========
+
+Unused VFIO ``ioctl()`` commands
+--------------------------------
+
+The following VFIO commands do not have an equivalent vfio-user command:
+
+* ``VFIO_GET_API_VERSION``
+* ``VFIO_CHECK_EXTENSION``
+* ``VFIO_SET_IOMMU``
+* ``VFIO_GROUP_GET_STATUS``
+* ``VFIO_GROUP_SET_CONTAINER``
+* ``VFIO_GROUP_UNSET_CONTAINER``
+* ``VFIO_GROUP_GET_DEVICE_FD``
+* ``VFIO_IOMMU_GET_INFO``
+
+However, once support for live migration for VFIO devices is finalized some
+of the above commands may have to be handled by the client in their
+corresponding vfio-user form. This will be addressed in a future protocol
+version.
+
+VFIO groups and containers
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The current VFIO implementation includes group and container idioms that
+describe how a device relates to the host IOMMU. In the vfio-user
+implementation, the IOMMU is implemented in software by the client, and is not
+visible to the server. The simplest idea would be that the client put each
+device into its own group and container.
+
+Backend Program Conventions
+---------------------------
+
+vfio-user backend program conventions are based on the vhost-user ones.
+
+* The backend program must not daemonize itself.
+* No assumptions must be made as to what access the backend program has on the
+  system.
+* File descriptors 0, 1 and 2 must exist, must have regular
+  stdin/stdout/stderr semantics, and can be redirected.
+* The backend program must honor the SIGTERM signal.
+* The backend program must accept the following command line options:
+
+  * ``--socket-path=PATH``: path to UNIX domain socket,
+  * ``--fd=FDNUM``: file descriptor for UNIX domain socket, incompatible with
+    ``--socket-path``
+* The backend program must be accompanied by a JSON file stored under
+  ``/usr/share/vfio-user``.
+
+TODO add schema similar to docs/interop/vhost-user.json.
diff --git a/MAINTAINERS b/MAINTAINERS
index 7543eb4..1258e11 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1900,6 +1900,12 @@ F: hw/vfio/ap.c
 F: docs/system/s390x/vfio-ap.rst
 L: qemu-s390x@nongnu.org
 
+vfio-user
+M: John G Johnson
+M: Thanos Makatos
+S: Supported
+F: docs/devel/vfio-user.rst
+
 vhost
 M: Michael S. Tsirkin
 S: Supported

From patchwork Wed Jan 12 00:43:38 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12710849
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 02/21] vfio-user: add VFIO base abstract class
Date: Tue, 11 Jan 2022 16:43:38 -0800
Message-Id: <825cf2a915fb0423d9e5930369339c5dc062c73d.1641584316.git.john.g.johnson@oracle.com>

Add an abstract base class both the kernel driver and user socket
implementations can use to share code.

Signed-off-by: John G Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
---
 hw/vfio/pci.h |  16 +++++++--
 hw/vfio/pci.c | 106 +++++++++++++++++++++++++++++++++++-----------------------
 2 files changed, 78 insertions(+), 44 deletions(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 6477751..bbc78aa 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -114,8 +114,13 @@ typedef struct VFIOMSIXInfo {
     unsigned long *pending;
 } VFIOMSIXInfo;
 
-#define TYPE_VFIO_PCI "vfio-pci"
-OBJECT_DECLARE_SIMPLE_TYPE(VFIOPCIDevice, VFIO_PCI)
+/*
+ * TYPE_VFIO_PCI_BASE is an abstract type used to share code
+ * between VFIO implementations that use a kernel driver
+ * with those that use user sockets.
+ */
+#define TYPE_VFIO_PCI_BASE "vfio-pci-base"
+OBJECT_DECLARE_SIMPLE_TYPE(VFIOPCIDevice, VFIO_PCI_BASE)
 
 struct VFIOPCIDevice {
     PCIDevice pdev;
@@ -175,6 +180,13 @@ struct VFIOPCIDevice {
     Notifier irqchip_change_notifier;
 };
 
+#define TYPE_VFIO_PCI "vfio-pci"
+OBJECT_DECLARE_SIMPLE_TYPE(VFIOKernPCIDevice, VFIO_PCI)
+
+struct VFIOKernPCIDevice {
+    VFIOPCIDevice device;
+};
+
 /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */
 static inline bool vfio_pci_is(VFIOPCIDevice *vdev, uint32_t vendor, uint32_t device)
 {
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 7b45353..d00a162 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -231,7 +231,7 @@ static void vfio_intx_update(VFIOPCIDevice *vdev, PCIINTxRoute *route)
 
 static void vfio_intx_routing_notifier(PCIDevice *pdev)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     PCIINTxRoute route;
 
     if (vdev->interrupt != VFIO_INT_INTx) {
@@ -457,7 +457,7 @@ static void vfio_update_kvm_msi_virq(VFIOMSIVector *vector, MSIMessage msg,
 static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
                                    MSIMessage *msg, IOHandler *handler)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     VFIOMSIVector *vector;
     int ret;
 
@@ -542,7 +542,7 @@ static int vfio_msix_vector_use(PCIDevice *pdev,
 
 static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     VFIOMSIVector *vector = &vdev->msi_vectors[nr];
 
     trace_vfio_msix_vector_release(vdev->vbasedev.name, nr);
@@ -1063,7 +1063,7 @@ static const MemoryRegionOps vfio_vga_ops = {
  */
 static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     VFIORegion *region = &vdev->bars[bar].region;
     MemoryRegion *mmap_mr, *region_mr, *base_mr;
     PCIIORegion *r;
@@ -1109,7 +1109,7 @@ static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar)
  */
 uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     uint32_t emu_bits = 0, emu_val = 0, phys_val = 0, val;
 
     memcpy(&emu_bits, vdev->emulated_config_bits + addr, len);
@@ -1142,7 +1142,7 @@ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
 void vfio_pci_write_config(PCIDevice *pdev,
                            uint32_t addr, uint32_t val, int len)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     uint32_t val_le = cpu_to_le32(val);
 
     trace_vfio_pci_write_config(vdev->vbasedev.name, addr, val, len);
@@ -2799,7 +2799,7 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
 
 static void vfio_realize(PCIDevice *pdev, Error **errp)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     VFIODevice *vbasedev_iter;
     VFIOGroup *group;
     char *tmp, *subsys, group_path[PATH_MAX], *group_name;
@@ -3122,7 +3122,7 @@ error:
 
 static void vfio_instance_finalize(Object *obj)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(obj);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
     VFIOGroup *group = vdev->vbasedev.group;
 
     vfio_display_finalize(vdev);
@@ -3142,7 +3142,7 @@ static void vfio_instance_finalize(Object *obj)
 
 static void vfio_exitfn(PCIDevice *pdev)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
 
     vfio_unregister_req_notifier(vdev);
     vfio_unregister_err_notifier(vdev);
@@ -3161,7 +3161,7 @@ static void vfio_exitfn(PCIDevice *pdev)
 
 static void vfio_pci_reset(DeviceState *dev)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(dev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(dev);
 
     trace_vfio_pci_reset(vdev->vbasedev.name);
 
@@ -3201,7 +3201,7 @@ post_reset:
 static void vfio_instance_init(Object *obj)
 {
     PCIDevice *pci_dev = PCI_DEVICE(obj);
-    VFIOPCIDevice *vdev = VFIO_PCI(obj);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
 
     device_add_bootindex_property(obj, &vdev->bootindex,
                                   "bootindex", NULL,
@@ -3218,24 +3218,12 @@ static void vfio_instance_init(Object *obj)
     pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS;
 }
 
-static Property vfio_pci_dev_properties[] = {
-    DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host),
-    DEFINE_PROP_STRING("sysfsdev", VFIOPCIDevice, vbasedev.sysfsdev),
+static Property vfio_pci_base_dev_properties[] = {
     DEFINE_PROP_ON_OFF_AUTO("x-pre-copy-dirty-page-tracking", VFIOPCIDevice,
                             vbasedev.pre_copy_dirty_page_tracking,
                             ON_OFF_AUTO_ON),
-    DEFINE_PROP_ON_OFF_AUTO("display", VFIOPCIDevice,
-                            display, ON_OFF_AUTO_OFF),
-    DEFINE_PROP_UINT32("xres", VFIOPCIDevice, display_xres, 0),
-    DEFINE_PROP_UINT32("yres", VFIOPCIDevice, display_yres, 0),
     DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", VFIOPCIDevice,
                        intx.mmap_timeout, 1100),
-    DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features,
-                    VFIO_FEATURE_ENABLE_VGA_BIT, false),
-    DEFINE_PROP_BIT("x-req", VFIOPCIDevice, features,
-                    VFIO_FEATURE_ENABLE_REQ_BIT, true),
-    DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features,
-                    VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false),
     DEFINE_PROP_BOOL("x-enable-migration", VFIOPCIDevice,
                      vbasedev.enable_migration, false),
     DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
@@ -3244,8 +3232,6 @@ static Property vfio_pci_dev_properties[] = {
     DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false),
     DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false),
     DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),
-    DEFINE_PROP_BOOL("x-no-geforce-quirks", VFIOPCIDevice,
-                     no_geforce_quirks, false),
     DEFINE_PROP_BOOL("x-no-kvm-ioeventfd", VFIOPCIDevice,
                      no_kvm_ioeventfd, false),
     DEFINE_PROP_BOOL("x-no-vfio-ioeventfd", VFIOPCIDevice, no_vfio_ioeventfd,
@@ -3256,10 +3242,6 @@ static Property vfio_pci_dev_properties[] = {
                        sub_vendor_id, PCI_ANY_ID),
     DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice,
                        sub_device_id, PCI_ANY_ID),
-    DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0),
-    DEFINE_PROP_UNSIGNED_NODEFAULT("x-nv-gpudirect-clique", VFIOPCIDevice,
-                                   nv_gpudirect_clique,
-                                   qdev_prop_nv_gpudirect_clique, uint8_t),
     DEFINE_PROP_OFF_AUTO_PCIBAR("x-msix-relocation", VFIOPCIDevice, msix_relo,
                                 OFF_AUTOPCIBAR_OFF),
     /*
@@ -3270,28 +3252,25 @@ static Property vfio_pci_dev_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
-static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
+static void vfio_pci_base_dev_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass);
 
-    dc->reset = vfio_pci_reset;
-    device_class_set_props(dc, vfio_pci_dev_properties);
-    dc->desc = "VFIO-based PCI device assignment";
+    device_class_set_props(dc, vfio_pci_base_dev_properties);
+    dc->desc = "VFIO PCI base device";
     set_bit(DEVICE_CATEGORY_MISC, dc->categories);
-    pdc->realize = vfio_realize;
     pdc->exit = vfio_exitfn;
     pdc->config_read = vfio_pci_read_config;
     pdc->config_write = vfio_pci_write_config;
 }
 
-static const TypeInfo vfio_pci_dev_info = {
-    .name = TYPE_VFIO_PCI,
+static const TypeInfo vfio_pci_base_dev_info = {
+    .name = TYPE_VFIO_PCI_BASE,
     .parent = TYPE_PCI_DEVICE,
-    .instance_size = sizeof(VFIOPCIDevice),
-    .class_init = vfio_pci_dev_class_init,
-    .instance_init =
vfio_instance_init, - .instance_finalize = vfio_instance_finalize, + .instance_size = 0, + .abstract = true, + .class_init = vfio_pci_base_dev_class_init, .interfaces = (InterfaceInfo[]) { { INTERFACE_PCIE_DEVICE }, { INTERFACE_CONVENTIONAL_PCI_DEVICE }, @@ -3299,6 +3278,48 @@ static const TypeInfo vfio_pci_dev_info = { }, }; +static Property vfio_pci_dev_properties[] = { + DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host), + DEFINE_PROP_STRING("sysfsdev", VFIOPCIDevice, vbasedev.sysfsdev), + DEFINE_PROP_ON_OFF_AUTO("display", VFIOPCIDevice, + display, ON_OFF_AUTO_OFF), + DEFINE_PROP_UINT32("xres", VFIOPCIDevice, display_xres, 0), + DEFINE_PROP_UINT32("yres", VFIOPCIDevice, display_yres, 0), + DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features, + VFIO_FEATURE_ENABLE_VGA_BIT, false), + DEFINE_PROP_BIT("x-req", VFIOPCIDevice, features, + VFIO_FEATURE_ENABLE_REQ_BIT, true), + DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features, + VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false), + DEFINE_PROP_BOOL("x-no-geforce-quirks", VFIOPCIDevice, + no_geforce_quirks, false), + DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0), + DEFINE_PROP_UNSIGNED_NODEFAULT("x-nv-gpudirect-clique", VFIOPCIDevice, + nv_gpudirect_clique, + qdev_prop_nv_gpudirect_clique, uint8_t), + DEFINE_PROP_END_OF_LIST(), +}; + +static void vfio_pci_dev_class_init(ObjectClass *klass, void *data) +{ + DeviceClass *dc = DEVICE_CLASS(klass); + PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass); + + dc->reset = vfio_pci_reset; + device_class_set_props(dc, vfio_pci_dev_properties); + dc->desc = "VFIO-based PCI device assignment"; + pdc->realize = vfio_realize; +} + +static const TypeInfo vfio_pci_dev_info = { + .name = TYPE_VFIO_PCI, + .parent = TYPE_VFIO_PCI_BASE, + .instance_size = sizeof(VFIOKernPCIDevice), + .class_init = vfio_pci_dev_class_init, + .instance_init = vfio_instance_init, + .instance_finalize = vfio_instance_finalize, +}; + static Property vfio_pci_dev_nohotplug_properties[] = { 
     DEFINE_PROP_BOOL("ramfb", VFIOPCIDevice, enable_ramfb, false),
     DEFINE_PROP_END_OF_LIST(),
@@ -3315,12 +3336,13 @@ static void vfio_pci_nohotplug_dev_class_init(ObjectClass *klass, void *data)
 static const TypeInfo vfio_pci_nohotplug_dev_info = {
     .name = TYPE_VFIO_PCI_NOHOTPLUG,
     .parent = TYPE_VFIO_PCI,
-    .instance_size = sizeof(VFIOPCIDevice),
+    .instance_size = sizeof(VFIOKernPCIDevice),
     .class_init = vfio_pci_nohotplug_dev_class_init,
 };
 
 static void register_vfio_pci_dev_type(void)
 {
+    type_register_static(&vfio_pci_base_dev_info);
     type_register_static(&vfio_pci_dev_info);
     type_register_static(&vfio_pci_nohotplug_dev_info);
 }

From patchwork Wed Jan 12 00:43:39 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12710864
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 03/21] vfio-user: add container IO ops vector
Date: Tue, 11 Jan 2022 16:43:39 -0800
Message-Id: <58534cabf2ca1891f7b8a4c09d8ec7e08bcdd695.1641584316.git.john.g.johnson@oracle.com>

Used for communication with VFIO driver
(prep work for vfio-user, which will communicate over a socket)

Signed-off-by: John G Johnson
---
 include/hw/vfio/vfio-common.h |  33 +++++++++++
 hw/vfio/common.c              | 126 ++++++++++++++++++++++++++++--------------
 2 files changed, 117 insertions(+), 42 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 8af11b0..2761a62 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -75,6 +75,7 @@ typedef struct VFIOAddressSpace {
 } VFIOAddressSpace;
 
 struct VFIOGroup;
+typedef struct VFIOContIO VFIOContIO;
 
 typedef struct VFIOContainer {
     VFIOAddressSpace *space;
@@ -83,6 +84,7 @@ typedef struct VFIOContainer {
     MemoryListener prereg_listener;
     unsigned iommu_type;
     Error *error;
+    VFIOContIO *io_ops;
     bool initialized;
     bool dirty_pages_supported;
     uint64_t dirty_pgsizes;
@@ -154,6 +156,37 @@ struct VFIODeviceOps {
     int (*vfio_load_config)(VFIODevice *vdev, QEMUFile *f);
 };
 
+#ifdef CONFIG_LINUX
+
+/*
+ * The next 2 ops vectors are how Devices and Containers
+ * communicate with the server. The default option is
+ * through ioctl() to the kernel VFIO driver, but vfio-user
+ * can use a socket to a remote process.
+ */
+
+struct VFIOContIO {
+    int (*dma_map)(VFIOContainer *container,
+                   struct vfio_iommu_type1_dma_map *map);
+    int (*dma_unmap)(VFIOContainer *container,
+                     struct vfio_iommu_type1_dma_unmap *unmap,
+                     struct vfio_bitmap *bitmap);
+    int (*dirty_bitmap)(VFIOContainer *container,
+                        struct vfio_iommu_type1_dirty_bitmap *bitmap,
+                        struct vfio_iommu_type1_dirty_bitmap_get *range);
+};
+
+#define CONT_DMA_MAP(cont, map) \
+    ((cont)->io_ops->dma_map((cont), (map)))
+#define CONT_DMA_UNMAP(cont, unmap, bitmap) \
+    ((cont)->io_ops->dma_unmap((cont), (unmap), (bitmap)))
+#define CONT_DIRTY_BITMAP(cont, bitmap, range) \
+    ((cont)->io_ops->dirty_bitmap((cont), (bitmap), (range)))
+
+extern VFIOContIO vfio_cont_io_ioctl;
+
+#endif /* CONFIG_LINUX */
+
 typedef struct VFIOGroup {
     int fd;
     int groupid;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 080046e..dbf23c0 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -431,12 +431,12 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
         goto unmap_exit;
     }
 
-    ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, unmap);
+    ret = CONT_DMA_UNMAP(container, unmap, bitmap);
     if (!ret) {
         cpu_physical_memory_set_dirty_lebitmap((unsigned long *)bitmap->data,
                                                iotlb->translated_addr, pages);
     } else {
-        error_report("VFIO_UNMAP_DMA with DIRTY_BITMAP : %m");
+        error_report("VFIO_UNMAP_DMA with DIRTY_BITMAP : %s", strerror(-ret));
     }
 
     g_free(bitmap->data);
@@ -464,30 +464,7 @@ static int vfio_dma_unmap(VFIOContainer *container,
         return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
     }
 
-    while (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
-        /*
-         * The type1 backend has an off-by-one bug in the kernel (71a7d3d78e3c
-         * v4.15) where an overflow in its wrap-around check prevents us from
-         * unmapping the last page of the address space. Test for the error
-         * condition and re-try the unmap excluding the last page. The
-         * expectation is that we've never mapped the last page anyway and this
-         * unmap request comes via vIOMMU support which also makes it unlikely
-         * that this page is used. This bug was introduced well after type1 v2
-         * support was introduced, so we shouldn't need to test for v1. A fix
-         * is queued for kernel v5.0 so this workaround can be removed once
-         * affected kernels are sufficiently deprecated.
-         */
-        if (errno == EINVAL && unmap.size && !(unmap.iova + unmap.size) &&
-            container->iommu_type == VFIO_TYPE1v2_IOMMU) {
-            trace_vfio_dma_unmap_overflow_workaround();
-            unmap.size -= 1ULL << ctz64(container->pgsizes);
-            continue;
-        }
-        error_report("VFIO_UNMAP_DMA failed: %s", strerror(errno));
-        return -errno;
-    }
-
-    return 0;
+    return CONT_DMA_UNMAP(container, &unmap, NULL);
 }
 
 static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
@@ -500,24 +477,18 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
         .iova = iova,
         .size = size,
     };
+    int ret;
 
     if (!readonly) {
         map.flags |= VFIO_DMA_MAP_FLAG_WRITE;
     }
 
-    /*
-     * Try the mapping, if it fails with EBUSY, unmap the region and try
-     * again. This shouldn't be necessary, but we sometimes see it in
-     * the VGA ROM space.
-     */
-    if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
-        (errno == EBUSY && vfio_dma_unmap(container, iova, size, NULL) == 0 &&
-         ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) {
-        return 0;
-    }
+    ret = CONT_DMA_MAP(container, &map);
 
-    error_report("VFIO_MAP_DMA failed: %s", strerror(errno));
-    return -errno;
+    if (ret < 0) {
+        error_report("VFIO_MAP_DMA failed: %s", strerror(-ret));
+    }
+    return ret;
 }
 
 static void vfio_host_win_add(VFIOContainer *container,
@@ -1230,10 +1201,10 @@ static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
         dirty.flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP;
     }
 
-    ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, &dirty);
+    ret = CONT_DIRTY_BITMAP(container, &dirty, NULL);
     if (ret) {
         error_report("Failed to set dirty tracking flag 0x%x errno: %d",
-                     dirty.flags, errno);
+                     dirty.flags, -ret);
     }
 }
 
@@ -1283,11 +1254,11 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
         goto err_out;
     }
 
-    ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
+    ret = CONT_DIRTY_BITMAP(container, dbitmap, range);
     if (ret) {
         error_report("Failed to get dirty bitmap for iova: 0x%"PRIx64
                 " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
-                (uint64_t)range->size, errno);
+                (uint64_t)range->size, -ret);
         goto err_out;
     }
 
@@ -2058,6 +2029,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
     container->error = NULL;
     container->dirty_pages_supported = false;
     container->dma_max_mappings = 0;
+    container->io_ops = &vfio_cont_io_ioctl;
     QLIST_INIT(&container->giommu_list);
     QLIST_INIT(&container->hostwin_list);
     QLIST_INIT(&container->vrdl_list);
@@ -2594,3 +2566,73 @@ int vfio_eeh_as_op(AddressSpace *as, uint32_t op)
     }
     return vfio_eeh_container_op(container, op);
 }
+
+/*
+ * Traditional ioctl() based io_ops
+ */
+
+static int vfio_io_dma_map(VFIOContainer *container,
+                           struct vfio_iommu_type1_dma_map *map)
+{
+
+    /*
+     * Try the mapping, if it fails with EBUSY, unmap the region and try
+     * again. This shouldn't be necessary, but we sometimes see it in
+     * the VGA ROM space.
+     */
+    if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, map) == 0 ||
+        (errno == EBUSY &&
+         vfio_dma_unmap(container, map->iova, map->size, NULL) == 0 &&
+         ioctl(container->fd, VFIO_IOMMU_MAP_DMA, map) == 0)) {
+        return 0;
+    }
+    return -errno;
+}
+
+static int vfio_io_dma_unmap(VFIOContainer *container,
+                             struct vfio_iommu_type1_dma_unmap *unmap,
+                             struct vfio_bitmap *bitmap)
+{
+
+    while (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, unmap)) {
+        /*
+         * The type1 backend has an off-by-one bug in the kernel (71a7d3d78e3c
+         * v4.15) where an overflow in its wrap-around check prevents us from
+         * unmapping the last page of the address space. Test for the error
+         * condition and re-try the unmap excluding the last page. The
+         * expectation is that we've never mapped the last page anyway and this
+         * unmap request comes via vIOMMU support which also makes it unlikely
+         * that this page is used. This bug was introduced well after type1 v2
+         * support was introduced, so we shouldn't need to test for v1. A fix
+         * is queued for kernel v5.0 so this workaround can be removed once
+         * affected kernels are sufficiently deprecated.
+         */
+        if (errno == EINVAL && unmap->size && !(unmap->iova + unmap->size) &&
+            container->iommu_type == VFIO_TYPE1v2_IOMMU) {
+            trace_vfio_dma_unmap_overflow_workaround();
+            unmap->size -= 1ULL << ctz64(container->pgsizes);
+            continue;
+        }
+        error_report("VFIO_UNMAP_DMA failed: %s", strerror(errno));
+        return -errno;
+    }
+
+    return 0;
+}
+
+static int vfio_io_dirty_bitmap(VFIOContainer *container,
+                                struct vfio_iommu_type1_dirty_bitmap *bitmap,
+                                struct vfio_iommu_type1_dirty_bitmap_get *range)
+{
+    int ret;
+
+    ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, bitmap);
+
+    return ret < 0 ? -errno : ret;
+}
+
+VFIOContIO vfio_cont_io_ioctl = {
+    .dma_map = vfio_io_dma_map,
+    .dma_unmap = vfio_io_dma_unmap,
+    .dirty_bitmap = vfio_io_dirty_bitmap,
+};

From patchwork Wed Jan 12 00:43:40 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12710856
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 04/21] vfio-user: add region cache
Date: Tue, 11 Jan 2022 16:43:40 -0800
Message-Id: <719c102ca37546208637f479054da1ebf00957d5.1641584316.git.john.g.johnson@oracle.com>
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: ICl74sWA1Yw4PFZkty8tIFazdrTjhusy0bVMpGLcJKZcaqHFL3R76psBSkSoAOZw05nqr0M9gHem/JWjakhVEBBzoC4nz2vdEMikfWHsJDc= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH7PR10MB5813 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10224 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxscore=0 suspectscore=0 spamscore=0 mlxlogscore=999 adultscore=0 phishscore=0 malwarescore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2201120000 X-Proofpoint-GUID: 5pyOGxKwyPgi-OeWA56_jf9yV3-WnDqc X-Proofpoint-ORIG-GUID: 5pyOGxKwyPgi-OeWA56_jf9yV3-WnDqc Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" cache VFIO_DEVICE_GET_REGION_INFO results to reduce memory alloc/free cycles and as prep work for vfio-user Signed-off-by: John G Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman --- include/hw/vfio/vfio-common.h | 2 ++ hw/vfio/ccw.c | 5 ----- hw/vfio/common.c | 41 +++++++++++++++++++++++++++++++++++------ hw/vfio/pci-quirks.c | 19 ++++++------------- hw/vfio/pci.c | 6 ------ 5 files changed, 43 insertions(+), 30 deletions(-) 
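The caching scheme described in the commit message can be sketched in isolation as follows. This is a minimal, self-contained illustration and not the patch's code: every name in it (demo_dev, demo_info, demo_query, demo_get_info, demo_put) is hypothetical, the backend query is simulated, and the index bounds check is an addition of the sketch. It shows the two properties the patch relies on: an entry is queried at most once, and the cache, not the caller, owns the returned pointer.

```c
/* Illustrative sketch only: a lazily filled per-index info cache.
 * All identifiers are hypothetical stand-ins, not QEMU or kernel names. */
#include <assert.h>
#include <stdlib.h>

struct demo_info {
    unsigned index;
    unsigned long size;
};

struct demo_dev {
    unsigned num_regions;
    struct demo_info **regions;   /* NULL until the first lookup */
    unsigned queries;             /* counts simulated backend calls */
};

/* Stand-in for the expensive query (an ioctl or a socket round trip). */
static int demo_query(struct demo_dev *dev, struct demo_info *info)
{
    dev->queries++;
    info->size = 4096UL * (info->index + 1);
    return 0;
}

/* Return the cached entry for 'index', querying the backend on a miss.
 * The cache keeps ownership: callers must not free *out. */
static int demo_get_info(struct demo_dev *dev, unsigned index,
                         struct demo_info **out)
{
    struct demo_info *info;
    int ret;

    assert(out != NULL);
    if (index >= dev->num_regions) {
        return -1;
    }
    /* create the cache array on first use */
    if (dev->regions == NULL) {
        dev->regions = calloc(dev->num_regions, sizeof(*dev->regions));
        if (dev->regions == NULL) {
            return -1;
        }
    }
    /* cache hit: hand back the stored pointer */
    if (dev->regions[index] != NULL) {
        *out = dev->regions[index];
        return 0;
    }
    /* cache miss: query once, then remember the result */
    info = calloc(1, sizeof(*info));
    if (info == NULL) {
        return -1;
    }
    info->index = index;
    ret = demo_query(dev, info);
    if (ret != 0) {
        free(info);
        *out = NULL;
        return ret;
    }
    dev->regions[index] = info;
    *out = info;
    return 0;
}

/* Teardown is the only place where cached entries are freed. */
static void demo_put(struct demo_dev *dev)
{
    unsigned i;

    if (dev->regions == NULL) {
        return;
    }
    for (i = 0; i < dev->num_regions; i++) {
        free(dev->regions[i]);
    }
    free(dev->regions);
    dev->regions = NULL;
}
```

Because ownership moves into the cache, a caller that still frees the returned pointer would leave a dangling entry behind. That is why the diff below drops the g_free() calls at the call sites and frees the entries in vfio_put_base_device() instead.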
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 2761a62..1a032f4 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -145,6 +145,7 @@ typedef struct VFIODevice { VFIOMigration *migration; Error *migration_blocker; OnOffAuto pre_copy_dirty_page_tracking; + struct vfio_region_info **regions; } VFIODevice; struct VFIODeviceOps { @@ -258,6 +259,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index, struct vfio_region_info **info); int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type, uint32_t subtype, struct vfio_region_info **info); +void vfio_get_all_regions(VFIODevice *vbasedev); bool vfio_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_type); struct vfio_info_cap_header * vfio_get_region_info_cap(struct vfio_region_info *info, uint16_t id); diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index 0354737..06b588c 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -517,7 +517,6 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp) vcdev->io_region_offset = info->offset; vcdev->io_region = g_malloc0(info->size); - g_free(info); /* check for the optional async command region */ ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW, @@ -530,7 +529,6 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp) } vcdev->async_cmd_region_offset = info->offset; vcdev->async_cmd_region = g_malloc0(info->size); - g_free(info); } ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW, @@ -543,7 +541,6 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp) } vcdev->schib_region_offset = info->offset; vcdev->schib_region = g_malloc(info->size); - g_free(info); } ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW, @@ -557,7 +554,6 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp) } vcdev->crw_region_offset = info->offset; vcdev->crw_region = g_malloc(info->size); - g_free(info); } return; @@ -567,7 +563,6 @@ 
out_err: g_free(vcdev->schib_region); g_free(vcdev->async_cmd_region); g_free(vcdev->io_region); - g_free(info); return; } diff --git a/hw/vfio/common.c b/hw/vfio/common.c index dbf23c0..30d2c6e 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1568,8 +1568,6 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region, } } - g_free(info); - trace_vfio_region_setup(vbasedev->name, index, name, region->flags, region->fd_offset, region->size); return 0; @@ -2325,6 +2323,16 @@ void vfio_put_group(VFIOGroup *group) } } +void vfio_get_all_regions(VFIODevice *vbasedev) +{ + struct vfio_region_info *info; + int i; + + for (i = 0; i < vbasedev->num_regions; i++) { + vfio_get_region_info(vbasedev, i, &info); + } +} + int vfio_get_device(VFIOGroup *group, const char *name, VFIODevice *vbasedev, Error **errp) { @@ -2380,12 +2388,23 @@ int vfio_get_device(VFIOGroup *group, const char *name, trace_vfio_get_device(name, dev_info.flags, dev_info.num_regions, dev_info.num_irqs); + vfio_get_all_regions(vbasedev); vbasedev->reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET); return 0; } void vfio_put_base_device(VFIODevice *vbasedev) { + if (vbasedev->regions != NULL) { + int i; + + for (i = 0; i < vbasedev->num_regions; i++) { + g_free(vbasedev->regions[i]); + } + g_free(vbasedev->regions); + vbasedev->regions = NULL; + } + if (!vbasedev->group) { return; } @@ -2400,6 +2419,17 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index, { size_t argsz = sizeof(struct vfio_region_info); + /* create region cache */ + if (vbasedev->regions == NULL) { + vbasedev->regions = g_new0(struct vfio_region_info *, + vbasedev->num_regions); + } + /* check cache */ + if (vbasedev->regions[index] != NULL) { + *info = vbasedev->regions[index]; + return 0; + } + *info = g_malloc0(argsz); (*info)->index = index; @@ -2419,6 +2449,9 @@ retry: goto retry; } + /* fill cache */ + vbasedev->regions[index] = *info; + return 0; } @@ -2437,7 +2470,6 @@ int 
vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type, hdr = vfio_get_region_info_cap(*info, VFIO_REGION_INFO_CAP_TYPE); if (!hdr) { - g_free(*info); continue; } @@ -2449,8 +2481,6 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type, if (cap_type->type == type && cap_type->subtype == subtype) { return 0; } - - g_free(*info); } *info = NULL; @@ -2466,7 +2496,6 @@ bool vfio_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_type) if (vfio_get_region_info_cap(info, cap_type)) { ret = true; } - g_free(info); } return ret; diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c index 0cf69a8..223bd02 100644 --- a/hw/vfio/pci-quirks.c +++ b/hw/vfio/pci-quirks.c @@ -1601,16 +1601,14 @@ int vfio_pci_nvidia_v100_ram_init(VFIOPCIDevice *vdev, Error **errp) hdr = vfio_get_region_info_cap(nv2reg, VFIO_REGION_INFO_CAP_NVLINK2_SSATGT); if (!hdr) { - ret = -ENODEV; - goto free_exit; + return -ENODEV; } cap = (void *) hdr; p = mmap(NULL, nv2reg->size, PROT_READ | PROT_WRITE, MAP_SHARED, vdev->vbasedev.fd, nv2reg->offset); if (p == MAP_FAILED) { - ret = -errno; - goto free_exit; + return -errno; } quirk = vfio_quirk_alloc(1); @@ -1623,7 +1621,7 @@ int vfio_pci_nvidia_v100_ram_init(VFIOPCIDevice *vdev, Error **errp) (void *) (uintptr_t) cap->tgt); trace_vfio_pci_nvidia_gpu_setup_quirk(vdev->vbasedev.name, cap->tgt, nv2reg->size); -free_exit: + g_free(nv2reg); return ret; @@ -1651,16 +1649,14 @@ int vfio_pci_nvlink2_init(VFIOPCIDevice *vdev, Error **errp) hdr = vfio_get_region_info_cap(atsdreg, VFIO_REGION_INFO_CAP_NVLINK2_SSATGT); if (!hdr) { - ret = -ENODEV; - goto free_exit; + return -ENODEV; } captgt = (void *) hdr; hdr = vfio_get_region_info_cap(atsdreg, VFIO_REGION_INFO_CAP_NVLINK2_LNKSPD); if (!hdr) { - ret = -ENODEV; - goto free_exit; + return -ENODEV; } capspeed = (void *) hdr; @@ -1669,8 +1665,7 @@ int vfio_pci_nvlink2_init(VFIOPCIDevice *vdev, Error **errp) p = mmap(NULL, atsdreg->size, PROT_READ | PROT_WRITE, MAP_SHARED, 
vdev->vbasedev.fd, atsdreg->offset); if (p == MAP_FAILED) { - ret = -errno; - goto free_exit; + return -errno; } quirk = vfio_quirk_alloc(1); @@ -1690,8 +1685,6 @@ int vfio_pci_nvlink2_init(VFIOPCIDevice *vdev, Error **errp) (void *) (uintptr_t) capspeed->link_speed); trace_vfio_pci_nvlink2_setup_quirk_lnkspd(vdev->vbasedev.name, capspeed->link_speed); -free_exit: - g_free(atsdreg); return ret; } diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index d00a162..cff6cb7 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -790,8 +790,6 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev) vdev->rom_size = size = reg_info->size; vdev->rom_offset = reg_info->offset; - g_free(reg_info); - if (!vdev->rom_size) { vdev->rom_read_failed = true; error_report("vfio-pci: Cannot read device rom at " @@ -2518,7 +2516,6 @@ int vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp) error_setg(errp, "unexpected VGA info, flags 0x%lx, size 0x%lx", (unsigned long)reg_info->flags, (unsigned long)reg_info->size); - g_free(reg_info); return -EINVAL; } @@ -2623,8 +2620,6 @@ static void vfio_populate_device(VFIOPCIDevice *vdev, Error **errp) } vdev->config_offset = reg_info->offset; - g_free(reg_info); - if (vdev->features & VFIO_FEATURE_ENABLE_VGA) { ret = vfio_populate_vga(vdev, errp); if (ret) { @@ -3032,7 +3027,6 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) } ret = vfio_pci_igd_opregion_init(vdev, opregion, errp); - g_free(opregion); if (ret) { goto out_teardown; } From patchwork Wed Jan 12 00:43:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12710854 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 
9F762C433EF for ; Wed, 12 Jan 2022 00:55:47 +0000 (UTC)
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 05/21] vfio-user: add device IO ops vector
Date: Tue, 11 Jan 2022 16:43:41 -0800
MIME-Version: 1.0

Used for communication with
VFIO driver (prep work for vfio-user, which will communicate over a socket) Signed-off-by: John G Johnson --- include/hw/vfio/vfio-common.h | 27 ++++++++ hw/vfio/common.c | 107 +++++++++++++++++++++++++++----- hw/vfio/pci.c | 140 ++++++++++++++++++++++++++---------------- 3 files changed, 206 insertions(+), 68 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 1a032f4..826cd98 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -124,6 +124,7 @@ typedef struct VFIOHostDMAWindow { } VFIOHostDMAWindow; typedef struct VFIODeviceOps VFIODeviceOps; +typedef struct VFIODevIO VFIODevIO; typedef struct VFIODevice { QLIST_ENTRY(VFIODevice) next; @@ -139,6 +140,7 @@ typedef struct VFIODevice { bool ram_block_discard_allowed; bool enable_migration; VFIODeviceOps *ops; + VFIODevIO *io_ops; unsigned int num_irqs; unsigned int num_regions; unsigned int flags; @@ -165,6 +167,30 @@ struct VFIODeviceOps { * through ioctl() to the kernel VFIO driver, but vfio-user * can use a socket to a remote process. 
*/ +struct VFIODevIO { + int (*get_info)(VFIODevice *vdev, struct vfio_device_info *info); + int (*get_region_info)(VFIODevice *vdev, + struct vfio_region_info *info); + int (*get_irq_info)(VFIODevice *vdev, struct vfio_irq_info *irq); + int (*set_irqs)(VFIODevice *vdev, struct vfio_irq_set *irqs); + int (*region_read)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size, + void *data); + int (*region_write)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size, + void *data); +}; + +#define VDEV_GET_INFO(vdev, info) \ + ((vdev)->io_ops->get_info((vdev), (info))) +#define VDEV_GET_REGION_INFO(vdev, info) \ + ((vdev)->io_ops->get_region_info((vdev), (info))) +#define VDEV_GET_IRQ_INFO(vdev, irq) \ + ((vdev)->io_ops->get_irq_info((vdev), (irq))) +#define VDEV_SET_IRQS(vdev, irqs) \ + ((vdev)->io_ops->set_irqs((vdev), (irqs))) +#define VDEV_REGION_READ(vdev, nr, off, size, data) \ + ((vdev)->io_ops->region_read((vdev), (nr), (off), (size), (data))) +#define VDEV_REGION_WRITE(vdev, nr, off, size, data) \ + ((vdev)->io_ops->region_write((vdev), (nr), (off), (size), (data))) struct VFIOContIO { int (*dma_map)(VFIOContainer *container, @@ -184,6 +210,7 @@ struct VFIOContIO { #define CONT_DIRTY_BITMAP(cont, bitmap, range) \ ((cont)->io_ops->dirty_bitmap((cont), (bitmap), (range))) +extern VFIODevIO vfio_dev_io_ioctl; extern VFIOContIO vfio_cont_io_ioctl; #endif /* CONFIG_LINUX */ diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 30d2c6e..cce38d8 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -70,7 +70,7 @@ void vfio_disable_irqindex(VFIODevice *vbasedev, int index) .count = 0, }; - ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set); + VDEV_SET_IRQS(vbasedev, &irq_set); } void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index) @@ -83,7 +83,7 @@ void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index) .count = 1, }; - ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set); + VDEV_SET_IRQS(vbasedev, &irq_set); } void 
vfio_mask_single_irqindex(VFIODevice *vbasedev, int index) @@ -96,7 +96,7 @@ void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index) .count = 1, }; - ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set); + VDEV_SET_IRQS(vbasedev, &irq_set); } static inline const char *action_to_str(int action) @@ -177,9 +177,7 @@ int vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex, pfd = (int32_t *)&irq_set->data; *pfd = fd; - if (ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set)) { - ret = -errno; - } + ret = VDEV_SET_IRQS(vbasedev, irq_set); g_free(irq_set); if (!ret) { @@ -214,6 +212,7 @@ void vfio_region_write(void *opaque, hwaddr addr, uint32_t dword; uint64_t qword; } buf; + int ret; switch (size) { case 1: @@ -233,13 +232,15 @@ void vfio_region_write(void *opaque, hwaddr addr, break; } - if (pwrite(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) { + ret = VDEV_REGION_WRITE(vbasedev, region->nr, addr, size, &buf); + if (ret != size) { + const char *err = ret < 0 ? strerror(-ret) : "short write"; + error_report("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64 - ",%d) failed: %m", + ",%d) failed: %s", __func__, vbasedev->name, region->nr, - addr, data, size); + addr, data, size, err); } - trace_vfio_region_write(vbasedev->name, region->nr, addr, data, size); /* @@ -265,13 +266,18 @@ uint64_t vfio_region_read(void *opaque, uint64_t qword; } buf; uint64_t data = 0; + int ret; + + ret = VDEV_REGION_READ(vbasedev, region->nr, addr, size, &buf); + if (ret != size) { + const char *err = ret < 0 ? 
strerror(-ret) : "short read"; - if (pread(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) { - error_report("%s(%s:region%d+0x%"HWADDR_PRIx", %d) failed: %m", + error_report("%s(%s:region%d+0x%"HWADDR_PRIx", %d) failed: %s", __func__, vbasedev->name, region->nr, - addr, size); + addr, size, err); return (uint64_t)-1; } + switch (size) { case 1: data = buf.byte; @@ -2418,6 +2424,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index, struct vfio_region_info **info) { size_t argsz = sizeof(struct vfio_region_info); + int ret; /* create region cache */ if (vbasedev->regions == NULL) { @@ -2436,10 +2443,11 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index, retry: (*info)->argsz = argsz; - if (ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, *info)) { + ret = VDEV_GET_REGION_INFO(vbasedev, *info); + if (ret != 0) { g_free(*info); *info = NULL; - return -errno; + return ret; } if ((*info)->argsz > argsz) { @@ -2600,6 +2608,75 @@ int vfio_eeh_as_op(AddressSpace *as, uint32_t op) * Traditional ioctl() based io_ops */ +static int vfio_io_get_info(VFIODevice *vbasedev, struct vfio_device_info *info) +{ + int ret; + + ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_INFO, info); + + return ret < 0 ? -errno : ret; +} + +static int vfio_io_get_region_info(VFIODevice *vbasedev, + struct vfio_region_info *info) +{ + int ret; + + ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, info); + + return ret < 0 ? -errno : ret; +} + +static int vfio_io_get_irq_info(VFIODevice *vbasedev, + struct vfio_irq_info *info) +{ + int ret; + + ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, info); + + return ret < 0 ? -errno : ret; +} + +static int vfio_io_set_irqs(VFIODevice *vbasedev, struct vfio_irq_set *irqs) +{ + int ret; + + ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs); + + return ret < 0 ? 
-errno : ret; +} + +static int vfio_io_region_read(VFIODevice *vbasedev, uint8_t index, off_t off, + uint32_t size, void *data) +{ + struct vfio_region_info *info = vbasedev->regions[index]; + int ret; + + ret = pread(vbasedev->fd, data, size, info->offset + off); + + return ret < 0 ? -errno : ret; +} + +static int vfio_io_region_write(VFIODevice *vbasedev, uint8_t index, off_t off, + uint32_t size, void *data) +{ + struct vfio_region_info *info = vbasedev->regions[index]; + int ret; + + ret = pwrite(vbasedev->fd, data, size, info->offset + off); + + return ret < 0 ? -errno : ret; +} + +VFIODevIO vfio_dev_io_ioctl = { + .get_info = vfio_io_get_info, + .get_region_info = vfio_io_get_region_info, + .get_irq_info = vfio_io_get_irq_info, + .set_irqs = vfio_io_set_irqs, + .region_read = vfio_io_region_read, + .region_write = vfio_io_region_write, +}; + static int vfio_io_dma_map(VFIOContainer *container, struct vfio_iommu_type1_dma_map *map) { diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index cff6cb7..63a42ae 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -43,6 +43,14 @@ #include "migration/blocker.h" #include "migration/qemu-file.h" +/* convenience macros for PCI config space */ +#define VDEV_CONFIG_READ(vbasedev, off, size, data) \ + VDEV_REGION_READ((vbasedev), VFIO_PCI_CONFIG_REGION_INDEX, (off), \ + (size), (data)) +#define VDEV_CONFIG_WRITE(vbasedev, off, size, data) \ + VDEV_REGION_WRITE((vbasedev), VFIO_PCI_CONFIG_REGION_INDEX, (off), \ + (size), (data)) + #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug" static void vfio_disable_interrupts(VFIOPCIDevice *vdev); @@ -402,7 +410,7 @@ static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix) fds[i] = fd; } - ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set); + ret = VDEV_SET_IRQS(&vdev->vbasedev, irq_set); g_free(irq_set); @@ -772,14 +780,16 @@ static void vfio_update_msi(VFIOPCIDevice *vdev) static void vfio_pci_load_rom(VFIOPCIDevice *vdev) { + VFIODevice *vbasedev = &vdev->vbasedev; struct 
vfio_region_info *reg_info; uint64_t size; off_t off = 0; ssize_t bytes; + int ret; - if (vfio_get_region_info(&vdev->vbasedev, - VFIO_PCI_ROM_REGION_INDEX, &reg_info)) { - error_report("vfio: Error getting ROM info: %m"); + ret = vfio_get_region_info(vbasedev, VFIO_PCI_ROM_REGION_INDEX, &reg_info); + if (ret < 0) { + error_report("vfio: Error getting ROM info: %s", strerror(-ret)); return; } @@ -804,18 +814,19 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev) memset(vdev->rom, 0xff, size); while (size) { - bytes = pread(vdev->vbasedev.fd, vdev->rom + off, - size, vdev->rom_offset + off); + bytes = VDEV_REGION_READ(vbasedev, VFIO_PCI_ROM_REGION_INDEX, off, + size, vdev->rom + off); if (bytes == 0) { break; } else if (bytes > 0) { off += bytes; size -= bytes; } else { - if (errno == EINTR || errno == EAGAIN) { + if (bytes == -EINTR || bytes == -EAGAIN) { continue; } - error_report("vfio: Error reading device ROM: %m"); + error_report("vfio: Error reading device ROM: %s", + strerror(-bytes)); break; } } @@ -903,11 +914,10 @@ static const MemoryRegionOps vfio_rom_ops = { static void vfio_pci_size_rom(VFIOPCIDevice *vdev) { + VFIODevice *vbasedev = &vdev->vbasedev; uint32_t orig, size = cpu_to_le32((uint32_t)PCI_ROM_ADDRESS_MASK); - off_t offset = vdev->config_offset + PCI_ROM_ADDRESS; DeviceState *dev = DEVICE(vdev); char *name; - int fd = vdev->vbasedev.fd; if (vdev->pdev.romfile || !vdev->pdev.rom_bar) { /* Since pci handles romfile, just print a message and return */ @@ -924,11 +934,12 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev) * Use the same size ROM BAR as the physical device. The contents * will get filled in later when the guest tries to read it.
*/ - if (pread(fd, &orig, 4, offset) != 4 || - pwrite(fd, &size, 4, offset) != 4 || - pread(fd, &size, 4, offset) != 4 || - pwrite(fd, &orig, 4, offset) != 4) { - error_report("%s(%s) failed: %m", __func__, vdev->vbasedev.name); + if (VDEV_CONFIG_READ(vbasedev, PCI_ROM_ADDRESS, 4, &orig) != 4 || + VDEV_CONFIG_WRITE(vbasedev, PCI_ROM_ADDRESS, 4, &size) != 4 || + VDEV_CONFIG_READ(vbasedev, PCI_ROM_ADDRESS, 4, &size) != 4 || + VDEV_CONFIG_WRITE(vbasedev, PCI_ROM_ADDRESS, 4, &orig) != 4) { + + error_report("%s(%s) ROM access failed", __func__, vbasedev->name); return; } @@ -1108,6 +1119,7 @@ static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar) uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len) { VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); + VFIODevice *vbasedev = &vdev->vbasedev; uint32_t emu_bits = 0, emu_val = 0, phys_val = 0, val; memcpy(&emu_bits, vdev->emulated_config_bits + addr, len); @@ -1120,12 +1132,13 @@ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len) if (~emu_bits & (0xffffffffU >> (32 - len * 8))) { ssize_t ret; - ret = pread(vdev->vbasedev.fd, &phys_val, len, - vdev->config_offset + addr); + ret = VDEV_CONFIG_READ(vbasedev, addr, len, &phys_val); if (ret != len) { - error_report("%s(%s, 0x%x, 0x%x) failed: %m", - __func__, vdev->vbasedev.name, addr, len); - return -errno; + const char *err = ret < 0 ? 
strerror(-ret) : "short read"; + + error_report("%s(%s, 0x%x, 0x%x) failed: %s", + __func__, vbasedev->name, addr, len, err); + return -1; } phys_val = le32_to_cpu(phys_val); } @@ -1141,15 +1154,19 @@ void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr, uint32_t val, int len) { VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); + VFIODevice *vbasedev = &vdev->vbasedev; uint32_t val_le = cpu_to_le32(val); + int ret; trace_vfio_pci_write_config(vdev->vbasedev.name, addr, val, len); /* Write everything to VFIO, let it filter out what we can't write */ - if (pwrite(vdev->vbasedev.fd, &val_le, len, vdev->config_offset + addr) - != len) { - error_report("%s(%s, 0x%x, 0x%x, 0x%x) failed: %m", - __func__, vdev->vbasedev.name, addr, val, len); + ret = VDEV_CONFIG_WRITE(vbasedev, addr, len, &val_le); + if (ret != len) { + const char *err = ret < 0 ? strerror(-ret) : "short write"; + + error_report("%s(%s, 0x%x, 0x%x, 0x%x) failed: %s", + __func__, vbasedev->name, addr, val, len, err); } /* MSI/MSI-X Enabling/Disabling */ @@ -1237,10 +1254,13 @@ static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos, Error **errp) int ret, entries; Error *err = NULL; - if (pread(vdev->vbasedev.fd, &ctrl, sizeof(ctrl), - vdev->config_offset + pos + PCI_CAP_FLAGS) != sizeof(ctrl)) { - error_setg_errno(errp, errno, "failed reading MSI PCI_CAP_FLAGS"); - return -errno; + ret = VDEV_CONFIG_READ(&vdev->vbasedev, pos + PCI_CAP_FLAGS, + sizeof(ctrl), &ctrl); + if (ret != sizeof(ctrl)) { + const char *err = ret < 0 ? 
strerror(-ret) : "short read"; + + error_setg(errp, "failed reading MSI PCI_CAP_FLAGS %s", err); + return ret; } ctrl = le16_to_cpu(ctrl); @@ -1442,33 +1462,39 @@ static void vfio_pci_relocate_msix(VFIOPCIDevice *vdev, Error **errp) */ static void vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp) { + VFIODevice *vbasedev = &vdev->vbasedev; uint8_t pos; uint16_t ctrl; uint32_t table, pba; - int fd = vdev->vbasedev.fd; VFIOMSIXInfo *msix; + int ret; pos = pci_find_capability(&vdev->pdev, PCI_CAP_ID_MSIX); if (!pos) { return; } - if (pread(fd, &ctrl, sizeof(ctrl), - vdev->config_offset + pos + PCI_MSIX_FLAGS) != sizeof(ctrl)) { - error_setg_errno(errp, errno, "failed to read PCI MSIX FLAGS"); - return; + ret = VDEV_CONFIG_READ(vbasedev, pos + PCI_MSIX_FLAGS, + sizeof(ctrl), &ctrl); + if (ret != sizeof(ctrl)) { + const char *err = ret < 0 ? strerror(-ret) : "short read"; + + error_setg(errp, "failed to read PCI MSIX FLAGS %s", err); } - if (pread(fd, &table, sizeof(table), - vdev->config_offset + pos + PCI_MSIX_TABLE) != sizeof(table)) { - error_setg_errno(errp, errno, "failed to read PCI MSIX TABLE"); - return; + ret = VDEV_CONFIG_READ(vbasedev, pos + PCI_MSIX_TABLE, + sizeof(table), &table); + if (ret != sizeof(table)) { + const char *err = ret < 0 ? strerror(-ret) : "short read"; + + error_setg(errp, "failed to read PCI MSIX TABLE %s", err); } - if (pread(fd, &pba, sizeof(pba), - vdev->config_offset + pos + PCI_MSIX_PBA) != sizeof(pba)) { - error_setg_errno(errp, errno, "failed to read PCI MSIX PBA"); - return; + ret = VDEV_CONFIG_READ(vbasedev, pos + PCI_MSIX_PBA, sizeof(pba), &pba); + if (ret != sizeof(pba)) { + const char *err = ret < 0 ? 
strerror(-ret) : "short read"; + + error_setg(errp, "failed to read PCI MSIX PBA %s", err); } ctrl = le16_to_cpu(ctrl); @@ -1606,7 +1632,6 @@ static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled) static void vfio_bar_prepare(VFIOPCIDevice *vdev, int nr) { VFIOBAR *bar = &vdev->bars[nr]; - uint32_t pci_bar; int ret; @@ -1616,10 +1641,12 @@ static void vfio_bar_prepare(VFIOPCIDevice *vdev, int nr) } /* Determine what type of BAR this is for registration */ - ret = pread(vdev->vbasedev.fd, &pci_bar, sizeof(pci_bar), - vdev->config_offset + PCI_BASE_ADDRESS_0 + (4 * nr)); + ret = VDEV_CONFIG_READ(&vdev->vbasedev, PCI_BASE_ADDRESS_0 + (4 * nr), + sizeof(pci_bar), &pci_bar); if (ret != sizeof(pci_bar)) { - error_report("vfio: Failed to read BAR %d (%m)", nr); + const char *err = ret < 0 ? strerror(-ret) : "short read"; + + error_report("vfio: Failed to read BAR %d (%s)", nr, err); return; } @@ -2167,8 +2194,9 @@ static void vfio_pci_pre_reset(VFIOPCIDevice *vdev) static void vfio_pci_post_reset(VFIOPCIDevice *vdev) { + VFIODevice *vbasedev = &vdev->vbasedev; Error *err = NULL; - int nr; + int ret, nr; vfio_intx_enable(vdev, &err); if (err) { @@ -2176,13 +2204,16 @@ static void vfio_pci_post_reset(VFIOPCIDevice *vdev) } for (nr = 0; nr < PCI_NUM_REGIONS - 1; ++nr) { - off_t addr = vdev->config_offset + PCI_BASE_ADDRESS_0 + (4 * nr); + off_t addr = PCI_BASE_ADDRESS_0 + (4 * nr); uint32_t val = 0; uint32_t len = sizeof(val); - if (pwrite(vdev->vbasedev.fd, &val, len, addr) != len) { - error_report("%s(%s) reset bar %d failed: %m", __func__, - vdev->vbasedev.name, nr); + ret = VDEV_CONFIG_WRITE(vbasedev, addr, len, &val); + if (ret != len) { + const char *err = ret < 0 ? 
strerror(-ret) : "short write"; + + error_report("%s(%s) reset bar %d failed: %s", __func__, + vbasedev->name, nr, err); } } @@ -2631,7 +2662,7 @@ static void vfio_populate_device(VFIOPCIDevice *vdev, Error **errp) irq_info.index = VFIO_PCI_ERR_IRQ_INDEX; - ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info); + ret = VDEV_GET_IRQ_INFO(vbasedev, &irq_info); if (ret) { /* This can fail for an old kernel or legacy PCI dev */ trace_vfio_populate_device_get_irq_info_failure(strerror(errno)); @@ -2750,8 +2781,10 @@ static void vfio_register_req_notifier(VFIOPCIDevice *vdev) return; } - if (ioctl(vdev->vbasedev.fd, - VFIO_DEVICE_GET_IRQ_INFO, &irq_info) < 0 || irq_info.count < 1) { + if (VDEV_GET_IRQ_INFO(&vdev->vbasedev, &irq_info) < 0) { + return; + } + if (irq_info.count < 1) { return; } @@ -2829,6 +2862,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) vdev->vbasedev.ops = &vfio_pci_ops; vdev->vbasedev.type = VFIO_DEVICE_TYPE_PCI; vdev->vbasedev.dev = DEVICE(vdev); + vdev->vbasedev.io_ops = &vfio_dev_io_ioctl; tmp = g_strdup_printf("%s/iommu_group", vdev->vbasedev.sysfsdev); len = readlink(tmp, group_path, sizeof(group_path));

From patchwork Wed Jan 12 00:43:42 2022
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12710852
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 06/21] vfio-user: Define type vfio_user_pci_dev_info
Date: Tue, 11 Jan 2022 16:43:42 -0800

New class for vfio-user with its class and instance constructors and destructors, and its pci ops.
Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- hw/vfio/pci.h | 8 +++++ hw/vfio/common.c | 5 ++++ hw/vfio/pci.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ hw/vfio/Kconfig | 10 +++++++ 4 files changed, 113 insertions(+) diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h index bbc78aa..59e636c 100644 --- a/hw/vfio/pci.h +++ b/hw/vfio/pci.h @@ -187,6 +187,14 @@ struct VFIOKernPCIDevice { VFIOPCIDevice device; }; +#define TYPE_VFIO_USER_PCI "vfio-user-pci" +OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI) + +struct VFIOUserPCIDevice { + VFIOPCIDevice device; + char *sock_name; +}; + /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */ static inline bool vfio_pci_is(VFIOPCIDevice *vdev, uint32_t vendor, uint32_t device) { diff --git a/hw/vfio/common.c b/hw/vfio/common.c index cce38d8..f07023c 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1742,6 +1742,11 @@ void vfio_reset_handler(void *opaque) QLIST_FOREACH(group, &vfio_group_list, next) { QLIST_FOREACH(vbasedev, &group->device_list, next) { if (vbasedev->dev->realized && vbasedev->needs_reset) { + if (vbasedev->ops->vfio_hot_reset_multi == NULL) { + error_printf("%s: No hot reset handler specified\n", + vbasedev->name); + continue; + } vbasedev->ops->vfio_hot_reset_multi(vbasedev); } } diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 63a42ae..6abe474 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -19,6 +19,7 @@ */ #include "qemu/osdep.h" +#include CONFIG_DEVICES #include #include @@ -3376,3 +3377,92 @@ static void register_vfio_pci_dev_type(void) } type_init(register_vfio_pci_dev_type) + + +#ifdef CONFIG_VFIO_USER_PCI + +/* + * vfio-user routines. 
+ */ + +/* + * Emulated devices don't use host hot reset + */ +static void vfio_user_compute_needs_reset(VFIODevice *vbasedev) +{ + vbasedev->needs_reset = false; +} + +static VFIODeviceOps vfio_user_pci_ops = { + .vfio_compute_needs_reset = vfio_user_compute_needs_reset, + .vfio_eoi = vfio_intx_eoi, + .vfio_get_object = vfio_pci_get_object, + .vfio_save_config = vfio_pci_save_config, + .vfio_load_config = vfio_pci_load_config, +}; + +static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) +{ + ERRP_GUARD(); + VFIOUserPCIDevice *udev = VFIO_USER_PCI(pdev); + VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); + VFIODevice *vbasedev = &vdev->vbasedev; + + /* + * TODO: make option parser understand SocketAddress + * and use that instead of having scalar options + * for each socket type. + */ + if (!udev->sock_name) { + error_setg(errp, "No socket specified"); + error_append_hint(errp, "Use -device vfio-user-pci,socket=\n"); + return; + } + + vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name); + vbasedev->dev = DEVICE(vdev); + vbasedev->fd = -1; + vbasedev->type = VFIO_DEVICE_TYPE_PCI; + vbasedev->ops = &vfio_user_pci_ops; + +} + +static void vfio_user_instance_finalize(Object *obj) +{ + VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj); + + vfio_put_device(vdev); +} + +static Property vfio_user_pci_dev_properties[] = { + DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name), + DEFINE_PROP_END_OF_LIST(), +}; + +static void vfio_user_pci_dev_class_init(ObjectClass *klass, void *data) +{ + DeviceClass *dc = DEVICE_CLASS(klass); + PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass); + + device_class_set_props(dc, vfio_user_pci_dev_properties); + dc->desc = "VFIO over socket PCI device assignment"; + pdc->realize = vfio_user_pci_realize; +} + +static const TypeInfo vfio_user_pci_dev_info = { + .name = TYPE_VFIO_USER_PCI, + .parent = TYPE_VFIO_PCI_BASE, + .instance_size = sizeof(VFIOUserPCIDevice), + .class_init = vfio_user_pci_dev_class_init, + .instance_init = 
vfio_instance_init, + .instance_finalize = vfio_user_instance_finalize, +}; + +static void register_vfio_user_dev_type(void) +{ + type_register_static(&vfio_user_pci_dev_info); +} + +type_init(register_vfio_user_dev_type) + +#endif /* VFIO_USER_PCI */ diff --git a/hw/vfio/Kconfig b/hw/vfio/Kconfig index 7cdba05..301894e 100644 --- a/hw/vfio/Kconfig +++ b/hw/vfio/Kconfig @@ -2,6 +2,10 @@ config VFIO bool depends on LINUX +config VFIO_USER + bool + depends on VFIO + config VFIO_PCI bool default y @@ -9,6 +13,12 @@ config VFIO_PCI select EDID depends on LINUX && PCI +config VFIO_USER_PCI + bool + default y + select VFIO_USER + depends on VFIO_PCI + config VFIO_CCW bool default y

From patchwork Wed Jan 12 00:43:43 2022
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12710861
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 07/21] vfio-user: connect vfio proxy to remote server
Date: Tue, 11 Jan 2022 16:43:43 -0800
Message-Id: <44372be821c863e3c9a19bc1382e57c443eaf29c.1641584317.git.john.g.johnson@oracle.com>

add user.c & user.h files for vfio-user code
add proxy struct to handle comms with remote server

Signed-off-by: John G Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman --- hw/vfio/user.h | 78 +++++++++++++++++++ include/hw/vfio/vfio-common.h | 2 + hw/vfio/pci.c | 19
+++++ hw/vfio/user.c | 170 ++++++++++++++++++++++++++++++++++++++++++ MAINTAINERS | 4 + hw/vfio/meson.build | 1 + 6 files changed, 274 insertions(+) create mode 100644 hw/vfio/user.h create mode 100644 hw/vfio/user.c diff --git a/hw/vfio/user.h b/hw/vfio/user.h new file mode 100644 index 0000000..da92862 --- /dev/null +++ b/hw/vfio/user.h @@ -0,0 +1,78 @@ +#ifndef VFIO_USER_H +#define VFIO_USER_H + +/* + * vfio protocol over a UNIX socket. + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + */ + +typedef struct { + int send_fds; + int recv_fds; + int *fds; +} VFIOUserFDs; + +enum msg_type { + VFIO_MSG_NONE, + VFIO_MSG_ASYNC, + VFIO_MSG_WAIT, + VFIO_MSG_NOWAIT, + VFIO_MSG_REQ, +}; + +typedef struct VFIOUserMsg { + QTAILQ_ENTRY(VFIOUserMsg) next; + VFIOUserFDs *fds; + uint32_t rsize; + uint32_t id; + QemuCond cv; + bool complete; + enum msg_type type; +} VFIOUserMsg; + + +enum proxy_state { + VFIO_PROXY_CONNECTED = 1, + VFIO_PROXY_ERROR = 2, + VFIO_PROXY_CLOSING = 3, + VFIO_PROXY_CLOSED = 4, +}; + +typedef QTAILQ_HEAD(VFIOUserMsgQ, VFIOUserMsg) VFIOUserMsgQ; + +typedef struct VFIOProxy { + QLIST_ENTRY(VFIOProxy) next; + char *sockname; + struct QIOChannel *ioc; + void (*request)(void *opaque, VFIOUserMsg *msg); + void *req_arg; + int flags; + QemuCond close_cv; + AioContext *ctx; + QEMUBH *req_bh; + + /* + * above only changed when BQL is held + * below are protected by per-proxy lock + */ + QemuMutex lock; + VFIOUserMsgQ free; + VFIOUserMsgQ pending; + VFIOUserMsgQ incoming; + VFIOUserMsgQ outgoing; + VFIOUserMsg *last_nowait; + enum proxy_state state; +} VFIOProxy; + +/* VFIOProxy flags */ +#define VFIO_PROXY_CLIENT 0x1 + +VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp); +void vfio_user_disconnect(VFIOProxy *proxy); + +#endif /* VFIO_USER_H */ diff --git a/include/hw/vfio/vfio-common.h 
b/include/hw/vfio/vfio-common.h index 826cd98..3eb0b19 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -76,6 +76,7 @@ typedef struct VFIOAddressSpace { struct VFIOGroup; typedef struct VFIOContIO VFIOContIO; +typedef struct VFIOProxy VFIOProxy; typedef struct VFIOContainer { VFIOAddressSpace *space; @@ -147,6 +148,7 @@ typedef struct VFIODevice { VFIOMigration *migration; Error *migration_blocker; OnOffAuto pre_copy_dirty_page_tracking; + VFIOProxy *proxy; struct vfio_region_info **regions; } VFIODevice; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 6abe474..9fd7c07 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -43,6 +43,7 @@ #include "qapi/error.h" #include "migration/blocker.h" #include "migration/qemu-file.h" +#include "hw/vfio/user.h" /* convenience macros for PCI config space */ #define VDEV_CONFIG_READ(vbasedev, off, size, data) \ @@ -3407,6 +3408,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) VFIOUserPCIDevice *udev = VFIO_USER_PCI(pdev); VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); VFIODevice *vbasedev = &vdev->vbasedev; + SocketAddress addr; + VFIOProxy *proxy; + Error *err = NULL; /* * TODO: make option parser understand SocketAddress @@ -3419,6 +3423,16 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) return; } + memset(&addr, 0, sizeof(addr)); + addr.type = SOCKET_ADDRESS_TYPE_UNIX; + addr.u.q_unix.path = udev->sock_name; + proxy = vfio_user_connect_dev(&addr, &err); + if (!proxy) { + error_propagate_prepend(errp, err, "Remote proxy not found: "); + return; + } + vbasedev->proxy = proxy; + vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name); vbasedev->dev = DEVICE(vdev); vbasedev->fd = -1; @@ -3430,8 +3444,13 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) static void vfio_user_instance_finalize(Object *obj) { VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj); + VFIODevice *vbasedev = &vdev->vbasedev; vfio_put_device(vdev); + + if (vbasedev->proxy != NULL) { +
vfio_user_disconnect(vbasedev->proxy); + } } static Property vfio_user_pci_dev_properties[] = { diff --git a/hw/vfio/user.c b/hw/vfio/user.c new file mode 100644 index 0000000..c843f90 --- /dev/null +++ b/hw/vfio/user.c @@ -0,0 +1,170 @@ +/* + * vfio protocol over a UNIX socket. + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include +#include + +#include "qemu/error-report.h" +#include "qapi/error.h" +#include "qemu/main-loop.h" +#include "hw/hw.h" +#include "hw/vfio/vfio-common.h" +#include "hw/vfio/vfio.h" +#include "qemu/sockets.h" +#include "io/channel.h" +#include "io/channel-socket.h" +#include "io/channel-util.h" +#include "sysemu/iothread.h" +#include "user.h" + +static IOThread *vfio_user_iothread; + +static void vfio_user_shutdown(VFIOProxy *proxy); + + +/* + * Functions called by main, CPU, or iothread threads + */ + +static void vfio_user_shutdown(VFIOProxy *proxy) +{ + qio_channel_shutdown(proxy->ioc, QIO_CHANNEL_SHUTDOWN_READ, NULL); + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, NULL, NULL, NULL); +} + +/* + * Functions only called by iothread + */ + +static void vfio_user_cb(void *opaque) +{ + VFIOProxy *proxy = opaque; + + QEMU_LOCK_GUARD(&proxy->lock); + + proxy->state = VFIO_PROXY_CLOSED; + qemu_cond_signal(&proxy->close_cv); +} + + +/* + * Functions called by main or CPU threads + */ + +static QLIST_HEAD(, VFIOProxy) vfio_user_sockets = + QLIST_HEAD_INITIALIZER(vfio_user_sockets); + +VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp) +{ + VFIOProxy *proxy; + QIOChannelSocket *sioc; + QIOChannel *ioc; + char *sockname; + + if (addr->type != SOCKET_ADDRESS_TYPE_UNIX) { + error_setg(errp, "vfio_user_connect - bad address family"); + return NULL; + } + sockname = addr->u.q_unix.path; + + sioc = qio_channel_socket_new(); + ioc = 
QIO_CHANNEL(sioc); + if (qio_channel_socket_connect_sync(sioc, addr, errp)) { + object_unref(OBJECT(ioc)); + return NULL; + } + qio_channel_set_blocking(ioc, false, NULL); + + proxy = g_malloc0(sizeof(VFIOProxy)); + proxy->sockname = g_strdup_printf("unix:%s", sockname); + proxy->ioc = ioc; + proxy->flags = VFIO_PROXY_CLIENT; + proxy->state = VFIO_PROXY_CONNECTED; + + qemu_mutex_init(&proxy->lock); + qemu_cond_init(&proxy->close_cv); + + if (vfio_user_iothread == NULL) { + vfio_user_iothread = iothread_create("VFIO user", errp); + } + + proxy->ctx = iothread_get_aio_context(vfio_user_iothread); + + QTAILQ_INIT(&proxy->outgoing); + QTAILQ_INIT(&proxy->incoming); + QTAILQ_INIT(&proxy->free); + QTAILQ_INIT(&proxy->pending); + QLIST_INSERT_HEAD(&vfio_user_sockets, proxy, next); + + return proxy; +} + +void vfio_user_disconnect(VFIOProxy *proxy) +{ + VFIOUserMsg *r1, *r2; + + qemu_mutex_lock(&proxy->lock); + + /* our side is quitting */ + if (proxy->state == VFIO_PROXY_CONNECTED) { + vfio_user_shutdown(proxy); + if (!QTAILQ_EMPTY(&proxy->pending)) { + error_printf("vfio_user_disconnect: outstanding requests\n"); + } + } + object_unref(OBJECT(proxy->ioc)); + proxy->ioc = NULL; + + proxy->state = VFIO_PROXY_CLOSING; + QTAILQ_FOREACH_SAFE(r1, &proxy->outgoing, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->outgoing, r1, next); + g_free(r1); + } + QTAILQ_FOREACH_SAFE(r1, &proxy->incoming, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->incoming, r1, next); + g_free(r1); + } + QTAILQ_FOREACH_SAFE(r1, &proxy->pending, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->pending, r1, next); + g_free(r1); + } + QTAILQ_FOREACH_SAFE(r1, &proxy->free, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->free, r1, next); + g_free(r1); + } + + /* + * Make sure the iothread isn't blocking anywhere + * with a ref to this proxy by waiting for a BH + * handler to run after the proxy fd handlers were + * deleted above.
+ */ + aio_bh_schedule_oneshot(proxy->ctx, vfio_user_cb, proxy); + qemu_cond_wait(&proxy->close_cv, &proxy->lock); + + /* we now hold the only ref to proxy */ + qemu_mutex_unlock(&proxy->lock); + qemu_cond_destroy(&proxy->close_cv); + qemu_mutex_destroy(&proxy->lock); + + QLIST_REMOVE(proxy, next); + if (QLIST_EMPTY(&vfio_user_sockets)) { + iothread_destroy(vfio_user_iothread); + vfio_user_iothread = NULL; + } + + g_free(proxy->sockname); + g_free(proxy); +} diff --git a/MAINTAINERS b/MAINTAINERS index 1258e11..cfaccbf 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1903,8 +1903,12 @@ L: qemu-s390x@nongnu.org vfio-user M: John G Johnson M: Thanos Makatos +M: Elena Ufimtseva +M: Jagannathan Raman S: Supported F: docs/devel/vfio-user.rst +F: hw/vfio/user.c +F: hw/vfio/user.h vhost M: Michael S. Tsirkin diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build index da9af29..2f86f72 100644 --- a/hw/vfio/meson.build +++ b/hw/vfio/meson.build @@ -9,6 +9,7 @@ vfio_ss.add(when: 'CONFIG_VFIO_PCI', if_true: files( 'pci-quirks.c', 'pci.c', )) +vfio_ss.add(when: 'CONFIG_VFIO_USER', if_true: files('user.c')) vfio_ss.add(when: 'CONFIG_VFIO_CCW', if_true: files('ccw.c')) vfio_ss.add(when: 'CONFIG_VFIO_PLATFORM', if_true: files('platform.c')) vfio_ss.add(when: 'CONFIG_VFIO_XGMAC', if_true: files('calxeda-xgmac.c')) From patchwork Wed Jan 12 00:43:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12710866
From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v4 08/21] vfio-user: define socket receive functions Date: Tue, 11 Jan 2022 16:43:44 -0800 X-Mailer: git-send-email 1.8.3.1 MIME-Version: 1.0
Add infrastructure needed to receive incoming messages Signed-off-by: John G Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman --- hw/vfio/user-protocol.h | 54 ++++++++ hw/vfio/user.h | 6 + hw/vfio/pci.c | 6 + hw/vfio/user.c | 327 ++++++++++++++++++++++++++++++++++++++++++++++++ MAINTAINERS | 1 + 5 files changed, 394 insertions(+) create mode 100644 hw/vfio/user-protocol.h diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h new file mode 100644 index 0000000..d23877c --- /dev/null +++ b/hw/vfio/user-protocol.h @@ -0,0 +1,54 @@ +#ifndef VFIO_USER_PROTOCOL_H +#define VFIO_USER_PROTOCOL_H + +/* + * vfio protocol over a UNIX socket. + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + * Each message has a standard header that describes the command + * being sent, which is almost always a VFIO ioctl(). + * + * The header may be followed by command-specific data, such as the + * region and offset info for read and write commands.
+ */ + +typedef struct { + uint16_t id; + uint16_t command; + uint32_t size; + uint32_t flags; + uint32_t error_reply; +} VFIOUserHdr; + +/* VFIOUserHdr commands */ +enum vfio_user_command { + VFIO_USER_VERSION = 1, + VFIO_USER_DMA_MAP = 2, + VFIO_USER_DMA_UNMAP = 3, + VFIO_USER_DEVICE_GET_INFO = 4, + VFIO_USER_DEVICE_GET_REGION_INFO = 5, + VFIO_USER_DEVICE_GET_REGION_IO_FDS = 6, + VFIO_USER_DEVICE_GET_IRQ_INFO = 7, + VFIO_USER_DEVICE_SET_IRQS = 8, + VFIO_USER_REGION_READ = 9, + VFIO_USER_REGION_WRITE = 10, + VFIO_USER_DMA_READ = 11, + VFIO_USER_DMA_WRITE = 12, + VFIO_USER_DEVICE_RESET = 13, + VFIO_USER_DIRTY_PAGES = 14, + VFIO_USER_MAX, +}; + +/* VFIOUserHdr flags */ +#define VFIO_USER_REQUEST 0x0 +#define VFIO_USER_REPLY 0x1 +#define VFIO_USER_TYPE 0xF + +#define VFIO_USER_NO_REPLY 0x10 +#define VFIO_USER_ERROR 0x20 + +#endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio/user.h b/hw/vfio/user.h index da92862..72eefa7 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -11,6 +11,8 @@ * */ +#include "user-protocol.h" + typedef struct { int send_fds; int recv_fds; @@ -27,6 +29,7 @@ enum msg_type { typedef struct VFIOUserMsg { QTAILQ_ENTRY(VFIOUserMsg) next; + VFIOUserHdr *hdr; VFIOUserFDs *fds; uint32_t rsize; uint32_t id; @@ -74,5 +77,8 @@ typedef struct VFIOProxy { VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp); void vfio_user_disconnect(VFIOProxy *proxy); +void vfio_user_set_handler(VFIODevice *vbasedev, + void (*handler)(void *opaque, VFIOUserMsg *msg), + void *reqarg); #endif /* VFIO_USER_H */ diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 9fd7c07..0de915d 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3386,6 +3386,11 @@ type_init(register_vfio_pci_dev_type) * vfio-user routines. 
*/ +static void vfio_user_pci_process_req(void *opaque, VFIOUserMsg *msg) +{ + +} + /* * Emulated devices don't use host hot reset */ @@ -3432,6 +3437,7 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) return; } vbasedev->proxy = proxy; + vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev); vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name); vbasedev->dev = DEVICE(vdev); diff --git a/hw/vfio/user.c b/hw/vfio/user.c index c843f90..e1dfd5d 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -25,10 +25,26 @@ #include "sysemu/iothread.h" #include "user.h" +static uint64_t max_xfer_size; static IOThread *vfio_user_iothread; static void vfio_user_shutdown(VFIOProxy *proxy); +static VFIOUserMsg *vfio_user_getmsg(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds); +static VFIOUserFDs *vfio_user_getfds(int numfds); +static void vfio_user_recycle(VFIOProxy *proxy, VFIOUserMsg *msg); +static void vfio_user_recv(void *opaque); +static int vfio_user_recv_one(VFIOProxy *proxy); +static void vfio_user_cb(void *opaque); + +static void vfio_user_request(void *opaque); + +static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err) +{ + hdr->flags |= VFIO_USER_ERROR; + hdr->error_reply = err; +} /* * Functions called by main, CPU, or iothread threads @@ -40,10 +56,261 @@ static void vfio_user_shutdown(VFIOProxy *proxy) qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, NULL, NULL, NULL); } +static VFIOUserMsg *vfio_user_getmsg(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds) +{ + VFIOUserMsg *msg; + + msg = QTAILQ_FIRST(&proxy->free); + if (msg != NULL) { + QTAILQ_REMOVE(&proxy->free, msg, next); + } else { + msg = g_malloc0(sizeof(*msg)); + qemu_cond_init(&msg->cv); + } + + msg->hdr = hdr; + msg->fds = fds; + return msg; +} + +/* + * Recycle a message list entry to the free list. 
+ */ +static void vfio_user_recycle(VFIOProxy *proxy, VFIOUserMsg *msg) +{ + if (msg->type == VFIO_MSG_NONE) { + error_printf("vfio_user_recycle - freeing free msg\n"); + return; + } + + /* free msg buffer if no one is waiting to consume the reply */ + if (msg->type == VFIO_MSG_NOWAIT || msg->type == VFIO_MSG_ASYNC) { + g_free(msg->hdr); + if (msg->fds != NULL) { + g_free(msg->fds); + } + } + + msg->type = VFIO_MSG_NONE; + msg->hdr = NULL; + msg->fds = NULL; + msg->complete = false; + QTAILQ_INSERT_HEAD(&proxy->free, msg, next); +} + +static VFIOUserFDs *vfio_user_getfds(int numfds) +{ + VFIOUserFDs *fds = g_malloc0(sizeof(*fds) + (numfds * sizeof(int))); + + fds->fds = (int *)((char *)fds + sizeof(*fds)); + + return fds; +} + /* * Functions only called by iothread */ +static void vfio_user_recv(void *opaque) +{ + VFIOProxy *proxy = opaque; + + QEMU_LOCK_GUARD(&proxy->lock); + + if (proxy->state == VFIO_PROXY_CONNECTED) { + while (vfio_user_recv_one(proxy) == 0) { + ; + } + } +} + +/* + * Receive and process one incoming message. + * + * For replies, find matching outgoing request and wake any waiters. + * For requests, queue in incoming list and run request BH. 
+ */ +static int vfio_user_recv_one(VFIOProxy *proxy) +{ + VFIOUserMsg *msg = NULL; + g_autofree int *fdp = NULL; + VFIOUserFDs *reqfds; + VFIOUserHdr hdr; + struct iovec iov = { + .iov_base = &hdr, + .iov_len = sizeof(hdr), + }; + bool isreply = false; + int i, ret; + size_t msgleft, numfds = 0; + char *data = NULL; + char *buf = NULL; + Error *local_err = NULL; + + /* + * Read header + */ + ret = qio_channel_readv_full(proxy->ioc, &iov, 1, &fdp, &numfds, + &local_err); + if (ret == QIO_CHANNEL_ERR_BLOCK) { + return ret; + } + if (ret <= 0) { + /* read error or other side closed connection */ + if (ret == 0) { + error_setg(&local_err, "vfio_user_recv server closed socket"); + } else { + error_prepend(&local_err, "vfio_user_recv: "); + } + goto fatal; + } + if (ret < sizeof(hdr)) { + error_setg(&local_err, "vfio_user_recv short read of header"); + goto fatal; + } + + /* + * Validate header + */ + if (hdr.size < sizeof(VFIOUserHdr)) { + error_setg(&local_err, "vfio_user_recv bad header size"); + goto fatal; + } + switch (hdr.flags & VFIO_USER_TYPE) { + case VFIO_USER_REQUEST: + isreply = false; + break; + case VFIO_USER_REPLY: + isreply = true; + break; + default: + error_setg(&local_err, "vfio_user_recv unknown message type"); + goto fatal; + } + + /* + * For replies, find the matching pending request. + * For requests, reap incoming FDs.
+ */ + if (isreply) { + QTAILQ_FOREACH(msg, &proxy->pending, next) { + if (hdr.id == msg->id) { + break; + } + } + if (msg == NULL) { + error_setg(&local_err, "vfio_user_recv unexpected reply"); + goto err; + } + QTAILQ_REMOVE(&proxy->pending, msg, next); + + /* + * Process any received FDs + */ + if (numfds != 0) { + if (msg->fds == NULL || msg->fds->recv_fds < numfds) { + error_setg(&local_err, "vfio_user_recv unexpected FDs"); + goto err; + } + msg->fds->recv_fds = numfds; + memcpy(msg->fds->fds, fdp, numfds * sizeof(int)); + } + } else { + if (numfds != 0) { + reqfds = vfio_user_getfds(numfds); + memcpy(reqfds->fds, fdp, numfds * sizeof(int)); + } else { + reqfds = NULL; + } + } + + /* + * Put the whole message into a single buffer. + */ + if (isreply) { + if (hdr.size > msg->rsize) { + error_setg(&local_err, + "vfio_user_recv reply larger than recv buffer"); + goto err; + } + *msg->hdr = hdr; + data = (char *)msg->hdr + sizeof(hdr); + } else { + if (hdr.size > max_xfer_size) { + error_setg(&local_err, "vfio_user_recv request larger than max"); + goto err; + } + buf = g_malloc0(hdr.size); + memcpy(buf, &hdr, sizeof(hdr)); + data = buf + sizeof(hdr); + msg = vfio_user_getmsg(proxy, (VFIOUserHdr *)buf, reqfds); + msg->type = VFIO_MSG_REQ; + } + + msgleft = hdr.size - sizeof(hdr); + while (msgleft > 0) { + ret = qio_channel_read(proxy->ioc, data, msgleft, &local_err); + + /* error or would block */ + if (ret < 0) { + goto fatal; + } + + msgleft -= ret; + data += ret; + } + + /* + * Replies signal a waiter, if none just check for errors + * and free the message buffer. + * + * Requests get queued for the BH. 
+ */ + if (isreply) { + msg->complete = true; + if (msg->type == VFIO_MSG_WAIT) { + qemu_cond_signal(&msg->cv); + } else { + if (hdr.flags & VFIO_USER_ERROR) { + error_printf("vfio_user_recv error reply on async request "); + error_printf("command %x error %s\n", hdr.command, + strerror(hdr.error_reply)); + } + /* youngest nowait msg has been ack'd */ + if (proxy->last_nowait == msg) { + proxy->last_nowait = NULL; + } + vfio_user_recycle(proxy, msg); + } + } else { + QTAILQ_INSERT_TAIL(&proxy->incoming, msg, next); + qemu_bh_schedule(proxy->req_bh); + } + return 0; + + /* + * fatal means the other side closed or we don't trust the stream + * err means this message is corrupt + */ +fatal: + vfio_user_shutdown(proxy); + proxy->state = VFIO_PROXY_ERROR; + +err: + for (i = 0; i < numfds; i++) { + close(fdp[i]); + } + if (isreply && msg != NULL) { + /* force an error to keep sending thread from hanging */ + vfio_user_set_error(msg->hdr, EINVAL); + msg->complete = true; + qemu_cond_signal(&msg->cv); + } + error_report_err(local_err); + return -1; +} + static void vfio_user_cb(void *opaque) { VFIOProxy *proxy = opaque; @@ -59,6 +326,51 @@ static void vfio_user_cb(void *opaque) * Functions called by main or CPU threads */ +/* + * Process incoming requests. + * + * The bus-specific callback has the form: + * request(opaque, msg) + * where 'opaque' was specified in vfio_user_set_handler + * and 'msg' is the inbound message. + * + * The callback is responsible for disposing of the message buffer, + * usually by re-using it when calling vfio_send_reply or vfio_send_error, + * both of which free their message buffer when the reply is sent. + * + * If the callback uses a new buffer, it needs to free the old one.
+ */ +static void vfio_user_request(void *opaque) +{ + VFIOProxy *proxy = opaque; + VFIOUserMsgQ new, free; + VFIOUserMsg *msg, *m1; + + /* reap all incoming */ + QTAILQ_INIT(&new); + WITH_QEMU_LOCK_GUARD(&proxy->lock) { + QTAILQ_FOREACH_SAFE(msg, &proxy->incoming, next, m1) { + QTAILQ_REMOVE(&proxy->incoming, msg, next); + QTAILQ_INSERT_TAIL(&new, msg, next); + } + } + + /* process list */ + QTAILQ_INIT(&free); + QTAILQ_FOREACH_SAFE(msg, &new, next, m1) { + QTAILQ_REMOVE(&new, msg, next); + proxy->request(proxy->req_arg, msg); + QTAILQ_INSERT_HEAD(&free, msg, next); + } + + /* free list */ + WITH_QEMU_LOCK_GUARD(&proxy->lock) { + QTAILQ_FOREACH_SAFE(msg, &free, next, m1) { + vfio_user_recycle(proxy, msg); + } + } +} + static QLIST_HEAD(, VFIOProxy) vfio_user_sockets = QLIST_HEAD_INITIALIZER(vfio_user_sockets); @@ -97,6 +409,7 @@ VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp) } proxy->ctx = iothread_get_aio_context(vfio_user_iothread); + proxy->req_bh = qemu_bh_new(vfio_user_request, proxy); QTAILQ_INIT(&proxy->outgoing); QTAILQ_INIT(&proxy->incoming); @@ -107,6 +420,18 @@ VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp) return proxy; } +void vfio_user_set_handler(VFIODevice *vbasedev, + void (*handler)(void *opaque, VFIOUserMsg *msg), + void *req_arg) +{ + VFIOProxy *proxy = vbasedev->proxy; + + proxy->request = handler; + proxy->req_arg = req_arg; + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, + vfio_user_recv, NULL, proxy); +} + void vfio_user_disconnect(VFIOProxy *proxy) { VFIOUserMsg *r1, *r2; @@ -122,6 +447,8 @@ void vfio_user_disconnect(VFIOProxy *proxy) } object_unref(OBJECT(proxy->ioc)); proxy->ioc = NULL; + qemu_bh_delete(proxy->req_bh); + proxy->req_bh = NULL; proxy->state = VFIO_PROXY_CLOSING; QTAILQ_FOREACH_SAFE(r1, &proxy->outgoing, next, r2) { diff --git a/MAINTAINERS b/MAINTAINERS index cfaccbf..bc0ba88 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1909,6 +1909,7 @@ S: Supported F:
docs/devel/vfio-user.rst F: hw/vfio/user.c F: hw/vfio/user.h +F: hw/vfio/user-protocol.h vhost M: Michael S. Tsirkin From patchwork Wed Jan 12 00:43:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12710850 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 81B3BC433F5 for ; Wed, 12 Jan 2022 00:46:18 +0000 (UTC) Received: from localhost ([::1]:54882 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1n7RmL-0003uH-9Y for qemu-devel@archiver.kernel.org; Tue, 11 Jan 2022 19:46:17 -0500 Received: from eggs.gnu.org ([209.51.188.92]:36644) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n7Rdc-0000k7-CF for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:16 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:10964) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n7RdY-0005gY-18 for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:15 -0500 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 20BMrG6T019928 for ; Wed, 12 Jan 2022 00:37:11 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; bh=3nKSlB2JeJ7H75oqE/Kni7GIa4F6HzpKbKwCPS2xVRI=; b=0wkcbeHIDXOOvbJYo3lq630QfC4FmMtku3Y2i47DTpYJUajzOLKJqGPn9ibNbRd2JcWe rv5iotmC/CfqweZzp08Vyq/nmSPg/+8fUadvPwu4lUFYAiKaBQ7OI3Nizw7Qava4gDju 
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 09/21] vfio-user: define socket send functions
Date: Tue, 11 Jan 2022 16:43:45 -0800
Message-Id: <62f4ed7290dc1ac50187fb7287ba4d109ea96b9d.1641584317.git.john.g.johnson@oracle.com>

Also negotiate protocol version with remote server

Signed-off-by: Jagannathan Raman
Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
---
 hw/vfio/pci.h           |   1 +
 hw/vfio/user-protocol.h |  41 +++++
 hw/vfio/user.h          |   2 +
 hw/vfio/pci.c           |  16 ++
 hw/vfio/user.c          | 414 +++++++++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 473 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 59e636c..ec9f345 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -193,6 +193,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI)
 struct VFIOUserPCIDevice {
     VFIOPCIDevice device;
     char *sock_name;
+    bool send_queued;   /* all sends are queued */
 };
 
 /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */
diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index d23877c..a0889f6 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -51,4 +51,45 @@ enum vfio_user_command {
 #define VFIO_USER_NO_REPLY      0x10
 #define VFIO_USER_ERROR         0x20
 
+
+/*
+ * VFIO_USER_VERSION
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint16_t major;
+    uint16_t minor;
+    char capabilities[];
+} VFIOUserVersion;
+
+#define VFIO_USER_MAJOR_VER     0
+#define VFIO_USER_MINOR_VER     0
+
+#define VFIO_USER_CAP           "capabilities"
+
+/* "capabilities" members */
+#define VFIO_USER_CAP_MAX_FDS   "max_msg_fds"
+#define VFIO_USER_CAP_MAX_XFER  "max_data_xfer_size"
+#define VFIO_USER_CAP_MIGR      "migration"
+
+/* "migration" member */
+#define VFIO_USER_CAP_PGSIZE    "pgsize"
+
+/*
+ * Max FDs mainly comes into play when a device supports multiple interrupts
+ * where each one uses an eventfd to inject it into the guest.
+ * It is clamped by the number of FDs the qio channel supports in a
+ * single message.
+ */
+#define VFIO_USER_DEF_MAX_FDS   8
+#define VFIO_USER_MAX_MAX_FDS   16
+
+/*
+ * Max transfer limits the amount of data in region and DMA messages.
+ * Region R/W will be very small (limited by how much a single instruction
+ * can process) so just use a reasonable limit here.
+ */
+#define VFIO_USER_DEF_MAX_XFER  (1024 * 1024)
+#define VFIO_USER_MAX_MAX_XFER  (64 * 1024 * 1024)
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index 72eefa7..7ef3c95 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -74,11 +74,13 @@ typedef struct VFIOProxy {
 
 /* VFIOProxy flags */
 #define VFIO_PROXY_CLIENT        0x1
+#define VFIO_PROXY_FORCE_QUEUED  0x4
 
 VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp);
 void vfio_user_disconnect(VFIOProxy *proxy);
 void vfio_user_set_handler(VFIODevice *vbasedev,
                            void (*handler)(void *opaque, VFIOUserMsg *msg),
                            void *reqarg);
+int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp);
 
 #endif /* VFIO_USER_H */
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 0de915d..3080bd4 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3439,12 +3439,27 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
     vbasedev->proxy = proxy;
     vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev);
 
+    if (udev->send_queued) {
+        proxy->flags |= VFIO_PROXY_FORCE_QUEUED;
+    }
+
+    vfio_user_validate_version(vbasedev, &err);
+    if (err != NULL) {
+        error_propagate(errp, err);
+        goto error;
+    }
+
     vbasedev->name = g_strdup_printf("VFIO user <%s>", udev->sock_name);
     vbasedev->dev = DEVICE(vdev);
     vbasedev->fd = -1;
     vbasedev->type = VFIO_DEVICE_TYPE_PCI;
     vbasedev->ops = &vfio_user_pci_ops;
 
+    return;
+
+error:
+    vfio_user_disconnect(proxy);
+    error_prepend(errp, VFIO_MSG_PREFIX, vdev->vbasedev.name);
 }
 
 static void vfio_user_instance_finalize(Object *obj)
@@ -3461,6 +3476,7 @@ static void vfio_user_instance_finalize(Object *obj)
 
 static Property vfio_user_pci_dev_properties[] = {
     DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name),
+    DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index e1dfd5d..fd1e0a8 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -23,12 +23,20 @@
 #include "io/channel-socket.h"
 #include "io/channel-util.h"
 #include "sysemu/iothread.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qjson.h"
+#include "qapi/qmp/qnull.h"
+#include "qapi/qmp/qstring.h"
+#include "qapi/qmp/qnum.h"
 #include "user.h"
 
-static uint64_t max_xfer_size;
+static uint64_t max_xfer_size = VFIO_USER_DEF_MAX_XFER;
+static uint64_t max_send_fds = VFIO_USER_DEF_MAX_FDS;
+static int wait_time = 1000;   /* wait 1 sec for replies */
 static IOThread *vfio_user_iothread;
 
 static void vfio_user_shutdown(VFIOProxy *proxy);
+static int vfio_user_send_qio(VFIOProxy *proxy, VFIOUserMsg *msg);
 static VFIOUserMsg *vfio_user_getmsg(VFIOProxy *proxy, VFIOUserHdr *hdr,
                                      VFIOUserFDs *fds);
 static VFIOUserFDs *vfio_user_getfds(int numfds);
@@ -36,9 +44,16 @@ static void vfio_user_recycle(VFIOProxy *proxy, VFIOUserMsg *msg);
 
 static void vfio_user_recv(void *opaque);
 static int vfio_user_recv_one(VFIOProxy *proxy);
+static void vfio_user_send(void *opaque);
+static int vfio_user_send_one(VFIOProxy *proxy, VFIOUserMsg *msg);
 static void vfio_user_cb(void *opaque);
 
 static void vfio_user_request(void *opaque);
+static int vfio_user_send_queued(VFIOProxy *proxy, VFIOUserMsg *msg);
+static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr,
+                                VFIOUserFDs *fds, int rsize, bool nobql);
+static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd,
+                                  uint32_t size, uint32_t flags);
 
 static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err)
 {
@@ -56,6 +71,32 @@ static void vfio_user_shutdown(VFIOProxy *proxy)
     qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, NULL, NULL, NULL);
 }
 
+static int vfio_user_send_qio(VFIOProxy *proxy, VFIOUserMsg *msg)
+{
+    VFIOUserFDs *fds = msg->fds;
+    struct iovec iov = {
+        .iov_base = msg->hdr,
+        .iov_len = msg->hdr->size,
+    };
+    size_t numfds = 0;
+    int ret, *fdp = NULL;
+    Error *local_err = NULL;
+
+    if (fds != NULL && fds->send_fds != 0) {
+        numfds = fds->send_fds;
+        fdp = fds->fds;
+    }
+
+    ret = qio_channel_writev_full(proxy->ioc, &iov, 1, fdp, numfds, &local_err);
+
+    if (ret == -1) {
+        vfio_user_set_error(msg->hdr, EIO);
+        vfio_user_shutdown(proxy);
+        error_report_err(local_err);
+    }
+    return ret;
+}
+
 static VFIOUserMsg *vfio_user_getmsg(VFIOProxy *proxy, VFIOUserHdr *hdr,
                                      VFIOUserFDs *fds)
 {
@@ -311,6 +352,53 @@ err:
     return -1;
 }
 
+/*
+ * Send messages from outgoing queue when the socket buffer has space.
+ * If we deplete 'outgoing', remove ourselves from the poll list.
+ */
+static void vfio_user_send(void *opaque)
+{
+    VFIOProxy *proxy = opaque;
+    VFIOUserMsg *msg;
+
+    QEMU_LOCK_GUARD(&proxy->lock);
+
+    if (proxy->state == VFIO_PROXY_CONNECTED) {
+        while (!QTAILQ_EMPTY(&proxy->outgoing)) {
+            msg = QTAILQ_FIRST(&proxy->outgoing);
+            if (vfio_user_send_one(proxy, msg) < 0) {
+                return;
+            }
+        }
+        qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx,
+                                       vfio_user_recv, NULL, proxy);
+    }
+}
+
+/*
+ * Send a single message.
+ *
+ * Sent async messages are freed, others are moved to pending queue.
+ */
+static int vfio_user_send_one(VFIOProxy *proxy, VFIOUserMsg *msg)
+{
+    int ret;
+
+    ret = vfio_user_send_qio(proxy, msg);
+    if (ret < 0) {
+        return ret;
+    }
+
+    QTAILQ_REMOVE(&proxy->outgoing, msg, next);
+    if (msg->type == VFIO_MSG_ASYNC) {
+        vfio_user_recycle(proxy, msg);
+    } else {
+        QTAILQ_INSERT_TAIL(&proxy->pending, msg, next);
+    }
+
+    return 0;
+}
+
 static void vfio_user_cb(void *opaque)
 {
     VFIOProxy *proxy = opaque;
@@ -371,6 +459,130 @@ static void vfio_user_request(void *opaque)
     }
 }
 
+/*
+ * Messages are queued onto the proxy's outgoing list.
+ *
+ * It handles 3 types of messages:
+ *
+ * async messages - replies and posted writes
+ *
+ * There will be no reply from the server, so message
+ * buffers are freed after they're sent.
+ *
+ * nowait messages - map/unmap during address space transactions
+ *
+ * These are also sent async, but a reply is expected so that
+ * vfio_wait_reqs() can wait for the youngest nowait request.
+ * They transition from the outgoing list to the pending list
+ * when sent, and are freed when the reply is received.
+ *
+ * wait messages - all other requests
+ *
+ * The reply to these messages is waited for by their caller.
+ * They also transition from outgoing to pending when sent, but
+ * the message buffer is returned to the caller with the reply
+ * contents. The caller is responsible for freeing these messages.
+ *
+ * As an optimization, if the outgoing list and the socket send
+ * buffer are empty, the message is sent inline instead of being
+ * added to the outgoing list. The rest of the transitions are
+ * unchanged.
+ *
+ * returns 0 if the message was sent or queued
+ * returns -1 on send error
+ */
+static int vfio_user_send_queued(VFIOProxy *proxy, VFIOUserMsg *msg)
+{
+    int ret;
+
+    /*
+     * Unsent outgoing msgs - add to tail
+     */
+    if (!QTAILQ_EMPTY(&proxy->outgoing)) {
+        QTAILQ_INSERT_TAIL(&proxy->outgoing, msg, next);
+        return 0;
+    }
+
+    /*
+     * Try inline - if blocked, queue it and kick send poller
+     */
+    if (proxy->flags & VFIO_PROXY_FORCE_QUEUED) {
+        ret = QIO_CHANNEL_ERR_BLOCK;
+    } else {
+        ret = vfio_user_send_qio(proxy, msg);
+    }
+    if (ret == QIO_CHANNEL_ERR_BLOCK) {
+        QTAILQ_INSERT_HEAD(&proxy->outgoing, msg, next);
+        qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx,
+                                       vfio_user_recv, vfio_user_send,
+                                       proxy);
+        return 0;
+    }
+    if (ret == -1) {
+        return ret;
+    }
+
+    /*
+     * Sent - free async, add others to pending
+     */
+    if (msg->type == VFIO_MSG_ASYNC) {
+        vfio_user_recycle(proxy, msg);
+    } else {
+        QTAILQ_INSERT_TAIL(&proxy->pending, msg, next);
+    }
+
+    return 0;
+}
+
+static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr,
+                                VFIOUserFDs *fds, int rsize, bool nobql)
+{
+    VFIOUserMsg *msg;
+    bool iolock = false;
+    int ret;
+
+    if (hdr->flags & VFIO_USER_NO_REPLY) {
+        error_printf("vfio_user_send_wait on async message\n");
+        return;
+    }
+
+    /*
+     * We may block later, so use a per-proxy lock and drop
+     * BQL while we sleep unless 'nobql' says not to.
+     */
+    qemu_mutex_lock(&proxy->lock);
+    if (!nobql) {
+        iolock = qemu_mutex_iothread_locked();
+        if (iolock) {
+            qemu_mutex_unlock_iothread();
+        }
+    }
+
+    msg = vfio_user_getmsg(proxy, hdr, fds);
+    msg->id = hdr->id;
+    msg->rsize = rsize ? rsize : hdr->size;
+    msg->type = VFIO_MSG_WAIT;
+
+    ret = vfio_user_send_queued(proxy, msg);
+
+    if (ret == 0) {
+        while (!msg->complete) {
+            if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) {
+                QTAILQ_REMOVE(&proxy->pending, msg, next);
+                vfio_user_set_error(hdr, ETIMEDOUT);
+                break;
+            }
+        }
+    }
+    vfio_user_recycle(proxy, msg);
+
+    /* lock order is BQL->proxy - don't hold proxy when getting BQL */
+    qemu_mutex_unlock(&proxy->lock);
+    if (iolock) {
+        qemu_mutex_lock_iothread();
+    }
+}
+
 static QLIST_HEAD(, VFIOProxy) vfio_user_sockets =
     QLIST_HEAD_INITIALIZER(vfio_user_sockets);
 
@@ -495,3 +707,203 @@ void vfio_user_disconnect(VFIOProxy *proxy)
     g_free(proxy->sockname);
     g_free(proxy);
 }
+
+static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd,
+                                  uint32_t size, uint32_t flags)
+{
+    static uint16_t next_id;
+
+    hdr->id = qatomic_fetch_inc(&next_id);
+    hdr->command = cmd;
+    hdr->size = size;
+    hdr->flags = (flags & ~VFIO_USER_TYPE) | VFIO_USER_REQUEST;
+    hdr->error_reply = 0;
+}
+
+struct cap_entry {
+    const char *name;
+    int (*check)(QObject *qobj, Error **errp);
+};
+
+static int caps_parse(QDict *qdict, struct cap_entry caps[], Error **errp)
+{
+    QObject *qobj;
+    struct cap_entry *p;
+
+    for (p = caps; p->name != NULL; p++) {
+        qobj = qdict_get(qdict, p->name);
+        if (qobj != NULL) {
+            if (p->check(qobj, errp)) {
+                return -1;
+            }
+            qdict_del(qdict, p->name);
+        }
+    }
+
+    /* warning, for now */
+    if (qdict_size(qdict) != 0) {
+        error_printf("spurious capabilities\n");
+    }
+    return 0;
+}
+
+static int check_pgsize(QObject *qobj, Error **errp)
+{
+    QNum *qn = qobject_to(QNum, qobj);
+    uint64_t pgsize;
+
+    if (qn == NULL || !qnum_get_try_uint(qn, &pgsize)) {
+        error_setg(errp, "malformed %s", VFIO_USER_CAP_PGSIZE);
+        return -1;
+    }
+    return pgsize == 4096 ? 0 : -1;
+}
+
+static struct cap_entry caps_migr[] = {
+    { VFIO_USER_CAP_PGSIZE, check_pgsize },
+    { NULL }
+};
+
+static int check_max_fds(QObject *qobj, Error **errp)
+{
+    QNum *qn = qobject_to(QNum, qobj);
+
+    if (qn == NULL || !qnum_get_try_uint(qn, &max_send_fds) ||
+        max_send_fds > VFIO_USER_MAX_MAX_FDS) {
+        error_setg(errp, "malformed %s", VFIO_USER_CAP_MAX_FDS);
+        return -1;
+    }
+    return 0;
+}
+
+static int check_max_xfer(QObject *qobj, Error **errp)
+{
+    QNum *qn = qobject_to(QNum, qobj);
+
+    if (qn == NULL || !qnum_get_try_uint(qn, &max_xfer_size) ||
+        max_xfer_size > VFIO_USER_MAX_MAX_XFER) {
+        error_setg(errp, "malformed %s", VFIO_USER_CAP_MAX_XFER);
+        return -1;
+    }
+    return 0;
+}
+
+static int check_migr(QObject *qobj, Error **errp)
+{
+    QDict *qdict = qobject_to(QDict, qobj);
+
+    if (qdict == NULL) {
+        error_setg(errp, "malformed %s", VFIO_USER_CAP_MIGR);
+        return -1;
+    }
+    return caps_parse(qdict, caps_migr, errp);
+}
+
+static struct cap_entry caps_cap[] = {
+    { VFIO_USER_CAP_MAX_FDS, check_max_fds },
+    { VFIO_USER_CAP_MAX_XFER, check_max_xfer },
+    { VFIO_USER_CAP_MIGR, check_migr },
+    { NULL }
+};
+
+static int check_cap(QObject *qobj, Error **errp)
+{
+    QDict *qdict = qobject_to(QDict, qobj);
+
+    if (qdict == NULL) {
+        error_setg(errp, "malformed %s", VFIO_USER_CAP);
+        return -1;
+    }
+    return caps_parse(qdict, caps_cap, errp);
+}
+
+static struct cap_entry ver_0_0[] = {
+    { VFIO_USER_CAP, check_cap },
+    { NULL }
+};
+
+static int caps_check(int minor, const char *caps, Error **errp)
+{
+    QObject *qobj;
+    QDict *qdict;
+    int ret;
+
+    qobj = qobject_from_json(caps, NULL);
+    if (qobj == NULL) {
+        error_setg(errp, "malformed capabilities %s", caps);
+        return -1;
+    }
+    qdict = qobject_to(QDict, qobj);
+    if (qdict == NULL) {
+        error_setg(errp, "capabilities %s not an object", caps);
+        qobject_unref(qobj);
+        return -1;
+    }
+    ret = caps_parse(qdict, ver_0_0, errp);
+
+    qobject_unref(qobj);
+    return ret;
+}
+
+static GString *caps_json(void)
+{
+    QDict *dict = qdict_new();
+    QDict *capdict = qdict_new();
+    QDict *migdict = qdict_new();
+    GString *str;
+
+    qdict_put_int(migdict, VFIO_USER_CAP_PGSIZE, 4096);
+    qdict_put_obj(capdict, VFIO_USER_CAP_MIGR, QOBJECT(migdict));
+
+    qdict_put_int(capdict, VFIO_USER_CAP_MAX_FDS, VFIO_USER_MAX_MAX_FDS);
+    qdict_put_int(capdict, VFIO_USER_CAP_MAX_XFER, VFIO_USER_DEF_MAX_XFER);
+
+    qdict_put_obj(dict, VFIO_USER_CAP, QOBJECT(capdict));
+
+    str = qobject_to_json(QOBJECT(dict));
+    qobject_unref(dict);
+    return str;
+}
+
+int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp)
+{
+    g_autofree VFIOUserVersion *msgp;
+    GString *caps;
+    char *reply;
+    int size, caplen;
+
+    caps = caps_json();
+    caplen = caps->len + 1;
+    size = sizeof(*msgp) + caplen;
+    msgp = g_malloc0(size);
+
+    vfio_user_request_msg(&msgp->hdr, VFIO_USER_VERSION, size, 0);
+    msgp->major = VFIO_USER_MAJOR_VER;
+    msgp->minor = VFIO_USER_MINOR_VER;
+    memcpy(&msgp->capabilities, caps->str, caplen);
+    g_string_free(caps, true);
+
+    vfio_user_send_wait(vbasedev->proxy, &msgp->hdr, NULL, 0, false);
+    if (msgp->hdr.flags & VFIO_USER_ERROR) {
+        error_setg_errno(errp, msgp->hdr.error_reply, "version reply");
+        return -1;
+    }
+
+    if (msgp->major != VFIO_USER_MAJOR_VER ||
+        msgp->minor > VFIO_USER_MINOR_VER) {
+        error_setg(errp, "incompatible server version");
+        return -1;
+    }
+
+    reply = msgp->capabilities;
+    if (reply[msgp->hdr.size - sizeof(*msgp) - 1] != '\0') {
+        error_setg(errp, "corrupt version reply");
+        return -1;
+    }
+
+    if (caps_check(msgp->minor, reply, errp) != 0) {
+        return -1;
+    }
+
+    return 0;
+}

From patchwork Wed Jan 12 00:43:46 2022
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12710869
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 10/21] vfio-user: get device info
Date: Tue, 11 Jan 2022 16:43:46 -0800
Message-Id: <3943c5a80c59fd474b3978c78e1c5c5f7b4e1a08.1641584317.git.john.g.johnson@oracle.com>
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: CfkvmBCKysnU9ENTZ/XiAoZ7yFxJ30J/SOs6aoW++zsOCC65NMjxjQN9qdHLxbKd0suw//DPoOt9qDgdkxSSWRu8x3u/b6wvXK0i1uJyeMw= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4742 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10224 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 bulkscore=0 spamscore=0 phishscore=0 adultscore=0 suspectscore=0 mlxscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2201120000 X-Proofpoint-GUID: NZEWtCoNNsagF7esLi_7w1tjyRNkPStQ X-Proofpoint-ORIG-GUID: NZEWtCoNNsagF7esLi_7w1tjyRNkPStQ Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- hw/vfio/user-protocol.h | 14 ++++++++++++++ hw/vfio/user.h | 2 ++ hw/vfio/pci.c | 26 ++++++++++++++++++++++++++ hw/vfio/user.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 86 insertions(+) diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index a0889f6..4ad8f45 100644 --- a/hw/vfio/user-protocol.h +++ 
b/hw/vfio/user-protocol.h @@ -92,4 +92,18 @@ typedef struct { #define VFIO_USER_DEF_MAX_XFER (1024 * 1024) #define VFIO_USER_MAX_MAX_XFER (64 * 1024 * 1024) + +/* + * VFIO_USER_DEVICE_GET_INFO + * imported from struct_device_info + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint32_t num_regions; + uint32_t num_irqs; + uint32_t cap_offset; +} VFIOUserDeviceInfo; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio/user.h b/hw/vfio/user.h index 7ef3c95..19edd84 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -83,4 +83,6 @@ void vfio_user_set_handler(VFIODevice *vbasedev, void *reqarg); int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp); +extern VFIODevIO vfio_dev_io_sock; + #endif /* VFIO_USER_H */ diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 3080bd4..6f85853 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3415,6 +3415,8 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) VFIODevice *vbasedev = &vdev->vbasedev; SocketAddress addr; VFIOProxy *proxy; + struct vfio_device_info info; + int ret; Error *err = NULL; /* @@ -3454,6 +3456,30 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) vbasedev->fd = -1; vbasedev->type = VFIO_DEVICE_TYPE_PCI; vbasedev->ops = &vfio_user_pci_ops; + vbasedev->io_ops = &vfio_dev_io_sock; + + ret = VDEV_GET_INFO(vbasedev, &info); + if (ret) { + error_setg_errno(errp, -ret, "get info failure"); + goto error; + } + /* must be PCI */ + if ((info.flags & VFIO_DEVICE_FLAGS_PCI) == 0) { + error_setg(errp, "remote device not PCI"); + goto error; + } + + vbasedev->num_irqs = info.num_irqs; + vbasedev->num_regions = info.num_regions; + vbasedev->flags = info.flags; + vbasedev->reset_works = !!(info.flags & VFIO_DEVICE_FLAGS_RESET); + + vfio_get_all_regions(vbasedev); + vfio_populate_device(vdev, &err); + if (err) { + error_propagate(errp, err); + goto error; + } return; diff --git a/hw/vfio/user.c b/hw/vfio/user.c index fd1e0a8..671c4f1 100644 --- 
a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -907,3 +907,47 @@ int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp) return 0; } + +static int vfio_user_get_info(VFIOProxy *proxy, struct vfio_device_info *info) +{ + VFIOUserDeviceInfo msg; + + memset(&msg, 0, sizeof(msg)); + vfio_user_request_msg(&msg.hdr, VFIO_USER_DEVICE_GET_INFO, sizeof(msg), 0); + msg.argsz = sizeof(struct vfio_device_info); + + vfio_user_send_wait(proxy, &msg.hdr, NULL, 0, false); + if (msg.hdr.flags & VFIO_USER_ERROR) { + return -msg.hdr.error_reply; + } + + memcpy(info, &msg.argsz, sizeof(*info)); + return 0; +} + + +/* + * Socket-based io_ops + */ + +static int vfio_user_io_get_info(VFIODevice *vbasedev, + struct vfio_device_info *info) +{ + int ret; + + ret = vfio_user_get_info(vbasedev->proxy, info); + if (ret) { + return ret; + } + + /* clamp these to defend against a malicious server */ + info->num_regions = MAX(info->num_regions, 100); + info->num_irqs = MAX(info->num_irqs, 100); + + return 0; +} + +VFIODevIO vfio_dev_io_sock = { + .get_info = vfio_user_io_get_info, +}; + From patchwork Wed Jan 12 00:43:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12710855 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6520AC433FE for ; Wed, 12 Jan 2022 00:55:50 +0000 (UTC) Received: from localhost ([::1]:35738 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1n7RvZ-00021e-6F for qemu-devel@archiver.kernel.org; Tue, 11 Jan 2022 19:55:49 -0500 Received: from eggs.gnu.org ([209.51.188.92]:36758) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 
4.90_1) (envelope-from ) id 1n7Rdn-0000xo-2U for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:27 -0500
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 11/21] vfio-user: get region info
Date: Tue, 11 Jan 2022 16:43:47 -0800
X-Mailer: git-send-email 1.8.3.1

Add per-region FD to support mmap() of remote device regions

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
---
 hw/vfio/user-protocol.h       | 14 ++++++++++
 include/hw/vfio/vfio-common.h |  8 +++---
 hw/vfio/common.c              | 32 ++++++++++++++++++++---
 hw/vfio/user.c                | 59 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 107 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index 4ad8f45..caa523a 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -106,4 +106,18 @@ typedef struct {
     uint32_t cap_offset;
 } VFIOUserDeviceInfo;
 
+/*
+ * VFIO_USER_DEVICE_GET_REGION_INFO
+ * imported from struct vfio_region_info
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint32_t argsz;
+    uint32_t flags;
+    uint32_t index;
+    uint32_t cap_offset;
+    uint64_t size;
+    uint64_t offset;
+} VFIOUserRegionInfo;
+
 #endif /* VFIO_USER_PROTOCOL_H */
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 3eb0b19..2552557 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -56,6 +56,7 @@ typedef struct VFIORegion {
     uint32_t nr_mmaps;
     VFIOMmap *mmaps;
     uint8_t nr; /* cache the region number for debug */
+    int fd; /* fd to mmap() region */
 } VFIORegion;
 
 typedef struct VFIOMigration {
@@ -150,6 +151,7 @@ typedef struct VFIODevice {
     OnOffAuto pre_copy_dirty_page_tracking;
     VFIOProxy *proxy;
     struct vfio_region_info **regions;
+    int *regfds;
 } VFIODevice;
 
 struct VFIODeviceOps {
@@ -172,7 +174,7 @@ struct VFIODeviceOps {
 struct VFIODevIO {
     int (*get_info)(VFIODevice *vdev, struct vfio_device_info *info);
     int (*get_region_info)(VFIODevice *vdev,
-                           struct vfio_region_info *info);
+                           struct vfio_region_info *info, int *fd);
     int (*get_irq_info)(VFIODevice *vdev, struct vfio_irq_info *irq);
     int (*set_irqs)(VFIODevice *vdev, struct vfio_irq_set *irqs);
     int (*region_read)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size,
@@ -183,8 +185,8 @@ struct VFIODevIO {
 
 #define VDEV_GET_INFO(vdev, info) \
     ((vdev)->io_ops->get_info((vdev), (info)))
-#define VDEV_GET_REGION_INFO(vdev, info) \
-    ((vdev)->io_ops->get_region_info((vdev), (info)))
+#define VDEV_GET_REGION_INFO(vdev, info, fd) \
+    ((vdev)->io_ops->get_region_info((vdev), (info), (fd)))
 #define VDEV_GET_IRQ_INFO(vdev, irq) \
     ((vdev)->io_ops->get_irq_info((vdev), (irq)))
 #define VDEV_SET_IRQS(vdev, irqs) \
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index f07023c..a50bf4b 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -40,6 +40,7 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "migration/migration.h"
+#include "hw/vfio/user.h"
 
 VFIOGroupList vfio_group_list =
     QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -1554,6 +1555,11 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
     region->size = info->size;
     region->fd_offset = info->offset;
     region->nr = index;
+    if (vbasedev->regfds != NULL) {
+        region->fd = vbasedev->regfds[index];
+    } else {
+        region->fd = vbasedev->fd;
+    }
 
     if (region->size) {
         region->mem = g_new0(MemoryRegion, 1);
@@ -1605,7 +1611,7 @@ int vfio_region_mmap(VFIORegion *region)
 
     for (i = 0; i < region->nr_mmaps; i++) {
         region->mmaps[i].mmap = mmap(NULL, region->mmaps[i].size, prot,
-                                     MAP_SHARED, region->vbasedev->fd,
+                                     MAP_SHARED, region->fd,
                                      region->fd_offset +
                                      region->mmaps[i].offset);
         if (region->mmaps[i].mmap == MAP_FAILED) {
@@ -2410,10 +2416,17 @@ void vfio_put_base_device(VFIODevice *vbasedev)
         int i;
 
         for (i = 0; i < vbasedev->num_regions; i++) {
+            if (vbasedev->regfds != NULL && vbasedev->regfds[i] != -1) {
+                close(vbasedev->regfds[i]);
+            }
             g_free(vbasedev->regions[i]);
         }
         g_free(vbasedev->regions);
         vbasedev->regions = NULL;
+        if (vbasedev->regfds != NULL) {
+            g_free(vbasedev->regfds);
+            vbasedev->regfds = NULL;
+        }
     }
 
     if (!vbasedev->group) {
@@ -2429,12 +2442,16 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
                          struct vfio_region_info **info)
 {
     size_t argsz = sizeof(struct vfio_region_info);
+    int fd = -1;
     int ret;
 
     /* create region cache */
     if (vbasedev->regions == NULL) {
         vbasedev->regions = g_new0(struct vfio_region_info *,
                                    vbasedev->num_regions);
+        if (vbasedev->proxy != NULL) {
+            vbasedev->regfds = g_new0(int, vbasedev->num_regions);
+        }
     }
     /* check cache */
     if (vbasedev->regions[index] != NULL) {
@@ -2448,7 +2465,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
 retry:
     (*info)->argsz = argsz;
 
-    ret = VDEV_GET_REGION_INFO(vbasedev, *info);
+    ret = VDEV_GET_REGION_INFO(vbasedev, *info, &fd);
     if (ret != 0) {
         g_free(*info);
         *info = NULL;
@@ -2458,12 +2475,19 @@ retry:
     if ((*info)->argsz > argsz) {
         argsz = (*info)->argsz;
         *info = g_realloc(*info, argsz);
+        if (fd != -1) {
+            close(fd);
+            fd = -1;
+        }
 
         goto retry;
     }
 
     /* fill cache */
     vbasedev->regions[index] = *info;
+    if (vbasedev->regfds != NULL) {
+        vbasedev->regfds[index] = fd;
+    }
 
     return 0;
 }
@@ -2623,10 +2647,12 @@ static int vfio_io_get_info(VFIODevice *vbasedev, struct vfio_device_info *info)
 }
 
 static int vfio_io_get_region_info(VFIODevice *vbasedev,
-                                   struct vfio_region_info *info)
+                                   struct vfio_region_info *info,
+                                   int *fd)
 {
     int ret;
 
+    *fd = -1;
     ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, info);
 
     return ret < 0 ? -errno : ret;
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 671c4f1..1b0c9aa 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -925,6 +925,40 @@ static int vfio_user_get_info(VFIOProxy *proxy, struct vfio_device_info *info)
     return 0;
 }
 
+static int vfio_user_get_region_info(VFIOProxy *proxy,
+                                     struct vfio_region_info *info,
+                                     VFIOUserFDs *fds)
+{
+    g_autofree VFIOUserRegionInfo *msgp = NULL;
+    uint32_t size;
+
+    /* data returned can be larger than vfio_region_info */
+    if (info->argsz < sizeof(*info)) {
+        error_printf("vfio_user_get_region_info argsz too small\n");
+        return -EINVAL;
+    }
+    if (fds != NULL && fds->send_fds != 0) {
+        error_printf("vfio_user_get_region_info can't send FDs\n");
+        return -EINVAL;
+    }
+
+    size = info->argsz + sizeof(VFIOUserHdr);
+    msgp = g_malloc0(size);
+
+    vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_GET_REGION_INFO,
+                          sizeof(*msgp), 0);
+    msgp->argsz = info->argsz;
+    msgp->index = info->index;
+
+    vfio_user_send_wait(proxy, &msgp->hdr, fds, size, false);
+    if (msgp->hdr.flags & VFIO_USER_ERROR) {
+        return -msgp->hdr.error_reply;
+    }
+
+    memcpy(info, &msgp->argsz, info->argsz);
+    return 0;
+}
+
 
 /*
  * Socket-based io_ops
@@ -947,7 +981,32 @@ static int vfio_user_io_get_info(VFIODevice *vbasedev,
     return 0;
 }
 
+static int vfio_user_io_get_region_info(VFIODevice *vbasedev,
+                                        struct vfio_region_info *info,
+                                        int *fd)
+{
+    int ret;
+    VFIOUserFDs fds = { 0, 1, fd};
+
+    ret = vfio_user_get_region_info(vbasedev->proxy, info, &fds);
+    if (ret) {
+        return ret;
+    }
+
+    if (info->index > vbasedev->num_regions) {
+        return -EINVAL;
+    }
+    /* cap_offset in valid area */
+    if ((info->flags & VFIO_REGION_INFO_FLAG_CAPS) &&
+        (info->cap_offset < sizeof(*info) || info->cap_offset > info->argsz)) {
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
 VFIODevIO vfio_dev_io_sock = {
     .get_info = vfio_user_io_get_info,
+    .get_region_info = vfio_user_io_get_region_info,
 };
 

From patchwork Wed Jan 12 00:43:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12710851
Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:11638) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id
1n7Rdi-0005gd-Ex for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:24 -0500
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 12/21] vfio-user: region read/write
Date: Tue, 11 Jan 2022 16:43:48 -0800
Message-Id: <0fbe8c0935af73fc12eff1f6c919387a9ecad5fd.1641584317.git.john.g.johnson@oracle.com>
X-Mailer: git-send-email 1.8.3.1
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: EK9ygherjuXKSGfofpUL4MCNUsWFD9p9KM5/fY8/ikB14j5K9fMbVX/GRCJoyIEj+uTWSkd1vP9tskq7zT03j9Xzijh01DD+kOHNouGcXTQ= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4742 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10224 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 bulkscore=0 spamscore=0 phishscore=0 adultscore=0 suspectscore=0 mlxscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2201120000 X-Proofpoint-GUID: u7NvVKeSHMeqcRN_ojigNw3P-CLsOxsq X-Proofpoint-ORIG-GUID: u7NvVKeSHMeqcRN_ojigNw3P-CLsOxsq Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Add support for posted writes on remote devices Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- hw/vfio/pci.h | 1 + hw/vfio/user-protocol.h | 12 +++++ hw/vfio/user.h | 1 + include/hw/vfio/vfio-common.h | 7 +-- hw/vfio/common.c | 10 +++- hw/vfio/pci.c | 9 +++- hw/vfio/user.c | 109 ++++++++++++++++++++++++++++++++++++++++++ 7 files changed, 143 insertions(+), 6 deletions(-) diff --git a/hw/vfio/pci.h 
b/hw/vfio/pci.h index ec9f345..643ff75 100644 --- a/hw/vfio/pci.h +++ b/hw/vfio/pci.h @@ -194,6 +194,7 @@ struct VFIOUserPCIDevice { VFIOPCIDevice device; char *sock_name; bool send_queued; /* all sends are queued */ + bool no_post; /* all regions write are sync */ }; /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot match hw */ diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index caa523a..b1ea55f 100644 --- a/hw/vfio/user-protocol.h +++ b/hw/vfio/user-protocol.h @@ -120,4 +120,16 @@ typedef struct { uint64_t offset; } VFIOUserRegionInfo; +/* + * VFIO_USER_REGION_READ + * VFIO_USER_REGION_WRITE + */ +typedef struct { + VFIOUserHdr hdr; + uint64_t offset; + uint32_t region; + uint32_t count; + char data[]; +} VFIOUserRegionRW; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio/user.h b/hw/vfio/user.h index 19edd84..f2098f2 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -75,6 +75,7 @@ typedef struct VFIOProxy { /* VFIOProxy flags */ #define VFIO_PROXY_CLIENT 0x1 #define VFIO_PROXY_FORCE_QUEUED 0x4 +#define VFIO_PROXY_NO_POST 0x8 VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp); void vfio_user_disconnect(VFIOProxy *proxy); diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 2552557..4118b8a 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -57,6 +57,7 @@ typedef struct VFIORegion { VFIOMmap *mmaps; uint8_t nr; /* cache the region number for debug */ int fd; /* fd to mmap() region */ + bool post_wr; /* writes can be posted */ } VFIORegion; typedef struct VFIOMigration { @@ -180,7 +181,7 @@ struct VFIODevIO { int (*region_read)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size, void *data); int (*region_write)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size, - void *data); + void *data, bool post); }; #define VDEV_GET_INFO(vdev, info) \ @@ -193,8 +194,8 @@ struct VFIODevIO { ((vdev)->io_ops->set_irqs((vdev), (irqs))) #define 
VDEV_REGION_READ(vdev, nr, off, size, data) \ ((vdev)->io_ops->region_read((vdev), (nr), (off), (size), (data))) -#define VDEV_REGION_WRITE(vdev, nr, off, size, data) \ - ((vdev)->io_ops->region_write((vdev), (nr), (off), (size), (data))) +#define VDEV_REGION_WRITE(vdev, nr, off, size, data, post) \ + ((vdev)->io_ops->region_write((vdev), (nr), (off), (size), (data), (post))) struct VFIOContIO { int (*dma_map)(VFIOContainer *container, diff --git a/hw/vfio/common.c b/hw/vfio/common.c index a50bf4b..83cc5ec 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -213,6 +213,7 @@ void vfio_region_write(void *opaque, hwaddr addr, uint32_t dword; uint64_t qword; } buf; + bool post = region->post_wr; int ret; switch (size) { @@ -233,7 +234,11 @@ void vfio_region_write(void *opaque, hwaddr addr, break; } - ret = VDEV_REGION_WRITE(vbasedev, region->nr, addr, size, &buf); + /* read-after-write hazard if guest can directly access region */ + if (region->nr_mmaps) { + post = false; + } + ret = VDEV_REGION_WRITE(vbasedev, region->nr, addr, size, &buf, post); if (ret != size) { const char *err = ret < 0 ? 
strerror(-ret) : "short write"; @@ -1555,6 +1560,7 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region, region->size = info->size; region->fd_offset = info->offset; region->nr = index; + region->post_wr = false; if (vbasedev->regfds != NULL) { region->fd = vbasedev->regfds[index]; } else { @@ -2689,7 +2695,7 @@ static int vfio_io_region_read(VFIODevice *vbasedev, uint8_t index, off_t off, } static int vfio_io_region_write(VFIODevice *vbasedev, uint8_t index, off_t off, - uint32_t size, void *data) + uint32_t size, void *data, bool post) { struct vfio_region_info *info = vbasedev->regions[index]; int ret; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 6f85853..a4fd5e2 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -51,7 +51,7 @@ (size), (data)) #define VDEV_CONFIG_WRITE(vbasedev, off, size, data) \ VDEV_REGION_WRITE((vbasedev), VFIO_PCI_CONFIG_REGION_INDEX, (off), \ - (size), (data)) + (size), (data), false) #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug" @@ -1658,6 +1658,9 @@ static void vfio_bar_prepare(VFIOPCIDevice *vdev, int nr) bar->type = pci_bar & (bar->ioport ? 
~PCI_BASE_ADDRESS_IO_MASK : ~PCI_BASE_ADDRESS_MEM_MASK); bar->size = bar->region.size; + + /* IO regions are sync, memory can be async */ + bar->region.post_wr = (bar->ioport == 0); } static void vfio_bars_prepare(VFIOPCIDevice *vdev) @@ -3444,6 +3447,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) if (udev->send_queued) { proxy->flags |= VFIO_PROXY_FORCE_QUEUED; } + if (udev->no_post) { + proxy->flags |= VFIO_PROXY_NO_POST; + } vfio_user_validate_version(vbasedev, &err); if (err != NULL) { @@ -3503,6 +3509,7 @@ static void vfio_user_instance_finalize(Object *obj) static Property vfio_user_pci_dev_properties[] = { DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name), DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, false), + DEFINE_PROP_BOOL("x-no-posted-writes", VFIOUserPCIDevice, no_post, false), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/vfio/user.c b/hw/vfio/user.c index 1b0c9aa..09132a0 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -50,6 +50,8 @@ static void vfio_user_cb(void *opaque); static void vfio_user_request(void *opaque); static int vfio_user_send_queued(VFIOProxy *proxy, VFIOUserMsg *msg); +static void vfio_user_send_async(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds); static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize, bool nobql); static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, @@ -534,6 +536,33 @@ static int vfio_user_send_queued(VFIOProxy *proxy, VFIOUserMsg *msg) return 0; } +/* + * async send - msg can be queued, but will be freed when sent + */ +static void vfio_user_send_async(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds) +{ + VFIOUserMsg *msg; + int ret; + + if (!(hdr->flags & (VFIO_USER_NO_REPLY | VFIO_USER_REPLY))) { + error_printf("vfio_user_send_async on sync message\n"); + return; + } + + QEMU_LOCK_GUARD(&proxy->lock); + + msg = vfio_user_getmsg(proxy, hdr, fds); + msg->id = hdr->id; + msg->rsize = 0; + 
msg->type = VFIO_MSG_ASYNC; + + ret = vfio_user_send_queued(proxy, msg); + if (ret < 0) { + vfio_user_recycle(proxy, msg); + } +} + static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize, bool nobql) { @@ -959,6 +988,70 @@ static int vfio_user_get_region_info(VFIOProxy *proxy, return 0; } +static int vfio_user_region_read(VFIOProxy *proxy, uint8_t index, off_t offset, + uint32_t count, void *data) +{ + g_autofree VFIOUserRegionRW *msgp = NULL; + int size = sizeof(*msgp) + count; + + if (count > max_xfer_size) { + return -EINVAL; + } + + msgp = g_malloc0(size); + vfio_user_request_msg(&msgp->hdr, VFIO_USER_REGION_READ, sizeof(*msgp), 0); + msgp->offset = offset; + msgp->region = index; + msgp->count = count; + + vfio_user_send_wait(proxy, &msgp->hdr, NULL, size, false); + if (msgp->hdr.flags & VFIO_USER_ERROR) { + return -msgp->hdr.error_reply; + } else if (msgp->count > count) { + return -E2BIG; + } else { + memcpy(data, &msgp->data, msgp->count); + } + + return msgp->count; +} + +static int vfio_user_region_write(VFIOProxy *proxy, uint8_t index, off_t offset, + uint32_t count, void *data, bool post) +{ + VFIOUserRegionRW *msgp = NULL; + int flags = post ? 
VFIO_USER_NO_REPLY : 0; + int size = sizeof(*msgp) + count; + int ret; + + if (count > max_xfer_size) { + return -EINVAL; + } + + msgp = g_malloc0(size); + vfio_user_request_msg(&msgp->hdr, VFIO_USER_REGION_WRITE, size, flags); + msgp->offset = offset; + msgp->region = index; + msgp->count = count; + memcpy(&msgp->data, data, count); + + /* async send will free msg after it's sent */ + if (post && !(proxy->flags & VFIO_PROXY_NO_POST)) { + vfio_user_send_async(proxy, &msgp->hdr, NULL); + return count; + } + + vfio_user_send_wait(proxy, &msgp->hdr, NULL, 0, false); + if (msgp->hdr.flags & VFIO_USER_ERROR) { + ret = -msgp->hdr.error_reply; + } else { + ret = count; + } + + g_free(msgp); + return ret; +} + /* * Socket-based io_ops @@ -1005,8 +1098,24 @@ static int vfio_user_io_get_region_info(VFIODevice *vbasedev, return 0; } +static int vfio_user_io_region_read(VFIODevice *vbasedev, uint8_t index, + off_t off, uint32_t size, void *data) +{ + return vfio_user_region_read(vbasedev->proxy, index, off, size, data); +} + +static int vfio_user_io_region_write(VFIODevice *vbasedev, uint8_t index, + off_t off, unsigned size, void *data, + bool post) +{ + return vfio_user_region_write(vbasedev->proxy, index, off, size, data, + post); +} + VFIODevIO vfio_dev_io_sock = { .get_info = vfio_user_io_get_info, .get_region_info = vfio_user_io_get_region_info, + .region_read = vfio_user_io_region_read, + .region_write = vfio_user_io_region_write, };
From patchwork Wed Jan 12 00:43:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12710859
From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v4 13/21] vfio-user: pci_user_realize PCI setup Date: Tue, 11 Jan 2022 16:43:49 -0800 Message-Id: <6d8ae21cbade8f4bb7eaca4da29e57f0cb1a03f3.1641584317.git.john.g.johnson@oracle.com> X-Mailer: git-send-email 1.8.3.1 MIME-Version: 1.0
PCI BARs read from remote device PCI config reads/writes sent to remote server Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman
--- hw/vfio/pci.c | 275 ++++++++++++++++++++++++++++++++++++---------------------- 1 file changed, 172 insertions(+), 103 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index a4fd5e2..5c519ee 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -2830,6 +2830,132 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev) vdev->req_enabled = false; } +static void vfio_pci_config_setup(VFIOPCIDevice *vdev, Error **errp) +{ + PCIDevice *pdev = &vdev->pdev; + Error *err = NULL; + + /* vfio emulates a lot for us, but some bits need extra love */ + vdev->emulated_config_bits = g_malloc0(vdev->config_size); + + /* QEMU can choose to expose the ROM or not */ + memset(vdev->emulated_config_bits + PCI_ROM_ADDRESS, 0xff, 4); + /* QEMU can also add or extend BARs */ + memset(vdev->emulated_config_bits + PCI_BASE_ADDRESS_0, 0xff, 6 * 4); + + /* + * The PCI spec reserves vendor ID 0xffff as an invalid value. The + * device ID is managed by the vendor and need only be a 16-bit value. + * Allow any 16-bit value for subsystem so they can be hidden or changed.
+ */ + if (vdev->vendor_id != PCI_ANY_ID) { + if (vdev->vendor_id >= 0xffff) { + error_setg(errp, "invalid PCI vendor ID provided"); + return; + } + vfio_add_emulated_word(vdev, PCI_VENDOR_ID, vdev->vendor_id, ~0); + trace_vfio_pci_emulated_vendor_id(vdev->vbasedev.name, vdev->vendor_id); + } else { + vdev->vendor_id = pci_get_word(pdev->config + PCI_VENDOR_ID); + } + + if (vdev->device_id != PCI_ANY_ID) { + if (vdev->device_id > 0xffff) { + error_setg(errp, "invalid PCI device ID provided"); + return; + } + vfio_add_emulated_word(vdev, PCI_DEVICE_ID, vdev->device_id, ~0); + trace_vfio_pci_emulated_device_id(vdev->vbasedev.name, vdev->device_id); + } else { + vdev->device_id = pci_get_word(pdev->config + PCI_DEVICE_ID); + } + + if (vdev->sub_vendor_id != PCI_ANY_ID) { + if (vdev->sub_vendor_id > 0xffff) { + error_setg(errp, "invalid PCI subsystem vendor ID provided"); + return; + } + vfio_add_emulated_word(vdev, PCI_SUBSYSTEM_VENDOR_ID, + vdev->sub_vendor_id, ~0); + trace_vfio_pci_emulated_sub_vendor_id(vdev->vbasedev.name, + vdev->sub_vendor_id); + } + + if (vdev->sub_device_id != PCI_ANY_ID) { + if (vdev->sub_device_id > 0xffff) { + error_setg(errp, "invalid PCI subsystem device ID provided"); + return; + } + vfio_add_emulated_word(vdev, PCI_SUBSYSTEM_ID, vdev->sub_device_id, ~0); + trace_vfio_pci_emulated_sub_device_id(vdev->vbasedev.name, + vdev->sub_device_id); + } + + /* QEMU can change multi-function devices to single function, or reverse */ + vdev->emulated_config_bits[PCI_HEADER_TYPE] = + PCI_HEADER_TYPE_MULTI_FUNCTION; + + /* Restore or clear multifunction, this is always controlled by QEMU */ + if (vdev->pdev.cap_present & QEMU_PCI_CAP_MULTIFUNCTION) { + vdev->pdev.config[PCI_HEADER_TYPE] |= PCI_HEADER_TYPE_MULTI_FUNCTION; + } else { + vdev->pdev.config[PCI_HEADER_TYPE] &= ~PCI_HEADER_TYPE_MULTI_FUNCTION; + } + + /* + * Clear host resource mapping info. 
If we choose not to register a + * BAR, such as might be the case with the option ROM, we can get + * confusing, unwritable, residual addresses from the host here. + */ + memset(&vdev->pdev.config[PCI_BASE_ADDRESS_0], 0, 24); + memset(&vdev->pdev.config[PCI_ROM_ADDRESS], 0, 4); + + vfio_pci_size_rom(vdev); + + vfio_bars_prepare(vdev); + + vfio_msix_early_setup(vdev, &err); + if (err) { + error_propagate(errp, err); + return; + } + + vfio_bars_register(vdev); +} + +static int vfio_interrupt_setup(VFIOPCIDevice *vdev, Error **errp) +{ + PCIDevice *pdev = &vdev->pdev; + int ret; + + /* QEMU emulates all of MSI & MSIX */ + if (pdev->cap_present & QEMU_PCI_CAP_MSIX) { + memset(vdev->emulated_config_bits + pdev->msix_cap, 0xff, + MSIX_CAP_LENGTH); + } + + if (pdev->cap_present & QEMU_PCI_CAP_MSI) { + memset(vdev->emulated_config_bits + pdev->msi_cap, 0xff, + vdev->msi_cap_size); + } + + if (vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1)) { + vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, + vfio_intx_mmap_enable, vdev); + pci_device_set_intx_routing_notifier(&vdev->pdev, + vfio_intx_routing_notifier); + vdev->irqchip_change_notifier.notify = vfio_irqchip_change; + kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier); + ret = vfio_intx_enable(vdev, errp); + if (ret) { + pci_device_set_intx_routing_notifier(&vdev->pdev, NULL); + kvm_irqchip_remove_change_notifier(&vdev->irqchip_change_notifier); + return ret; + } + } + return 0; +} + static void vfio_realize(PCIDevice *pdev, Error **errp) { VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev); @@ -2945,92 +3071,16 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) goto error; } - /* vfio emulates a lot for us, but some bits need extra love */ - vdev->emulated_config_bits = g_malloc0(vdev->config_size); - - /* QEMU can choose to expose the ROM or not */ - memset(vdev->emulated_config_bits + PCI_ROM_ADDRESS, 0xff, 4); - /* QEMU can also add or extend BARs */ - memset(vdev->emulated_config_bits + 
PCI_BASE_ADDRESS_0, 0xff, 6 * 4);
-
-    /*
-     * The PCI spec reserves vendor ID 0xffff as an invalid value. The
-     * device ID is managed by the vendor and need only be a 16-bit value.
-     * Allow any 16-bit value for subsystem so they can be hidden or changed.
-     */
-    if (vdev->vendor_id != PCI_ANY_ID) {
-        if (vdev->vendor_id >= 0xffff) {
-            error_setg(errp, "invalid PCI vendor ID provided");
-            goto error;
-        }
-        vfio_add_emulated_word(vdev, PCI_VENDOR_ID, vdev->vendor_id, ~0);
-        trace_vfio_pci_emulated_vendor_id(vdev->vbasedev.name, vdev->vendor_id);
-    } else {
-        vdev->vendor_id = pci_get_word(pdev->config + PCI_VENDOR_ID);
-    }
-
-    if (vdev->device_id != PCI_ANY_ID) {
-        if (vdev->device_id > 0xffff) {
-            error_setg(errp, "invalid PCI device ID provided");
-            goto error;
-        }
-        vfio_add_emulated_word(vdev, PCI_DEVICE_ID, vdev->device_id, ~0);
-        trace_vfio_pci_emulated_device_id(vdev->vbasedev.name, vdev->device_id);
-    } else {
-        vdev->device_id = pci_get_word(pdev->config + PCI_DEVICE_ID);
-    }
-
-    if (vdev->sub_vendor_id != PCI_ANY_ID) {
-        if (vdev->sub_vendor_id > 0xffff) {
-            error_setg(errp, "invalid PCI subsystem vendor ID provided");
-            goto error;
-        }
-        vfio_add_emulated_word(vdev, PCI_SUBSYSTEM_VENDOR_ID,
-                               vdev->sub_vendor_id, ~0);
-        trace_vfio_pci_emulated_sub_vendor_id(vdev->vbasedev.name,
-                                              vdev->sub_vendor_id);
-    }
-
-    if (vdev->sub_device_id != PCI_ANY_ID) {
-        if (vdev->sub_device_id > 0xffff) {
-            error_setg(errp, "invalid PCI subsystem device ID provided");
-            goto error;
-        }
-        vfio_add_emulated_word(vdev, PCI_SUBSYSTEM_ID, vdev->sub_device_id, ~0);
-        trace_vfio_pci_emulated_sub_device_id(vdev->vbasedev.name,
-                                              vdev->sub_device_id);
-    }
-
-    /* QEMU can change multi-function devices to single function, or reverse */
-    vdev->emulated_config_bits[PCI_HEADER_TYPE] =
-                                              PCI_HEADER_TYPE_MULTI_FUNCTION;
-
-    /* Restore or clear multifunction, this is always controlled by QEMU */
-    if (vdev->pdev.cap_present & QEMU_PCI_CAP_MULTIFUNCTION) {
-        vdev->pdev.config[PCI_HEADER_TYPE] |= PCI_HEADER_TYPE_MULTI_FUNCTION;
-    } else {
-        vdev->pdev.config[PCI_HEADER_TYPE] &= ~PCI_HEADER_TYPE_MULTI_FUNCTION;
-    }
-
-    /*
-     * Clear host resource mapping info. If we choose not to register a
-     * BAR, such as might be the case with the option ROM, we can get
-     * confusing, unwritable, residual addresses from the host here.
-     */
-    memset(&vdev->pdev.config[PCI_BASE_ADDRESS_0], 0, 24);
-    memset(&vdev->pdev.config[PCI_ROM_ADDRESS], 0, 4);
-
-    vfio_pci_size_rom(vdev);
-
-    vfio_bars_prepare(vdev);
-
-    vfio_msix_early_setup(vdev, &err);
+    vfio_pci_config_setup(vdev, &err);
     if (err) {
-        error_propagate(errp, err);
         goto error;
     }

-    vfio_bars_register(vdev);
+    /*
+     * vfio_pci_config_setup will have registered the device's BARs
+     * and setup any MSIX BARs, so errors after it succeeds must
+     * use out_teardown
+     */

     ret = vfio_add_capabilities(vdev, errp);
     if (ret) {
@@ -3071,29 +3121,15 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
         }
     }

-    /* QEMU emulates all of MSI & MSIX */
-    if (pdev->cap_present & QEMU_PCI_CAP_MSIX) {
-        memset(vdev->emulated_config_bits + pdev->msix_cap, 0xff,
-               MSIX_CAP_LENGTH);
-    }
-
-    if (pdev->cap_present & QEMU_PCI_CAP_MSI) {
-        memset(vdev->emulated_config_bits + pdev->msi_cap, 0xff,
-               vdev->msi_cap_size);
+    ret = vfio_interrupt_setup(vdev, errp);
+    if (ret) {
+        goto out_teardown;
     }

-    if (vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1)) {
-        vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
-                                             vfio_intx_mmap_enable, vdev);
-        pci_device_set_intx_routing_notifier(&vdev->pdev,
-                                             vfio_intx_routing_notifier);
-        vdev->irqchip_change_notifier.notify = vfio_irqchip_change;
-        kvm_irqchip_add_change_notifier(&vdev->irqchip_change_notifier);
-        ret = vfio_intx_enable(vdev, errp);
-        if (ret) {
-            goto out_deregister;
-        }
-    }
+    /*
+     * vfio_interrupt_setup will have setup INTx's KVM routing
+     * so errors after it succeeds must use out_deregister
+     */

     if (vdev->display != ON_OFF_AUTO_OFF) {
         ret = vfio_display_probe(vdev, errp);
@@ -3487,8 +3523,41 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
         goto error;
     }

+    /* Get a copy of config space */
+    ret = VDEV_REGION_READ(vbasedev, VFIO_PCI_CONFIG_REGION_INDEX, 0,
+                           MIN(pci_config_size(pdev), vdev->config_size),
+                           pdev->config);
+    if (ret < (int)MIN(pci_config_size(&vdev->pdev), vdev->config_size)) {
+        error_setg_errno(errp, -ret, "failed to read device config space");
+        goto error;
+    }
+
+    vfio_pci_config_setup(vdev, &err);
+    if (err) {
+        goto error;
+    }
+
+    /*
+     * vfio_pci_config_setup will have registered the device's BARs
+     * and setup any MSIX BARs, so errors after it succeeds must
+     * use out_teardown
+     */
+
+    ret = vfio_add_capabilities(vdev, errp);
+    if (ret) {
+        goto out_teardown;
+    }
+
+    ret = vfio_interrupt_setup(vdev, errp);
+    if (ret) {
+        goto out_teardown;
+    }
+
     return;

+out_teardown:
+    vfio_teardown_msi(vdev);
+    vfio_bars_exit(vdev);
 error:
     vfio_user_disconnect(proxy);
     error_prepend(errp, VFIO_MSG_PREFIX, vdev->vbasedev.name);

From patchwork Wed Jan 12 00:43:50 2022
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12710870
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 14/21] vfio-user: get and set IRQs
Date: Tue, 11 Jan 2022 16:43:50 -0800
Message-Id: <439a724e5f58d950acb320ecf2ad8bc16b5b0acd.1641584317.git.john.g.johnson@oracle.com>
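
Note on the FD chunking done by vfio_user_set_irqs() in this patch: each
VFIO_USER_DEVICE_SET_IRQS message carries a run of entries that are either
all valid descriptors or all -1, and a run is also cut at the per-message
descriptor limit. What follows is a minimal standalone sketch of that
run-splitting rule, not the patch's code. The names fd_run_length,
count_messages and SKETCH_MAX_SEND_FDS are hypothetical, and the limit of 16
is an assumed value picked only for illustration (the patch reads
max_send_fds, which is defined elsewhere in the series).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-message descriptor cap; stands in for max_send_fds. */
#define SKETCH_MAX_SEND_FDS 16

/*
 * Length of the run starting at fds[0]: consecutive entries that are all
 * valid (!= -1) or all invalid (== -1), capped by the entries remaining
 * and by the per-message limit. Always returns at least 1.
 */
static size_t fd_run_length(const int *fds, size_t remaining)
{
    assert(remaining > 0);

    int valid = (fds[0] != -1);
    size_t n = 1;

    while (n < remaining && n < SKETCH_MAX_SEND_FDS &&
           (fds[n] != -1) == valid) {
        n++;
    }
    return n;
}

/* Number of messages needed to cover nfds entries, one run per message. */
static size_t count_messages(const int *fds, size_t nfds)
{
    size_t sent = 0;
    size_t msgs = 0;

    while (sent < nfds) {
        sent += fd_run_length(fds + sent, nfds - sent);
        msgs++;
    }
    return msgs;
}
```

For the array {3, 4, -1, -1, 5} the sketch splits the entries into three
runs (two valid, two invalid, one valid), so three messages would be sent.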
Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
---
 hw/vfio/user-protocol.h |  25 +++++++++
 hw/vfio/pci.c           |   9 +++-
 hw/vfio/user.c          | 131 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 163 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h
index b1ea55f..4852882 100644
--- a/hw/vfio/user-protocol.h
+++ b/hw/vfio/user-protocol.h
@@ -121,6 +121,31 @@ typedef struct {
 } VFIOUserRegionInfo;

 /*
+ * VFIO_USER_DEVICE_GET_IRQ_INFO
+ * imported from struct vfio_irq_info
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint32_t argsz;
+    uint32_t flags;
+    uint32_t index;
+    uint32_t count;
+} VFIOUserIRQInfo;
+
+/*
+ * VFIO_USER_DEVICE_SET_IRQS
+ * imported from struct vfio_irq_set
+ */
+typedef struct {
+    VFIOUserHdr hdr;
+    uint32_t argsz;
+    uint32_t flags;
+    uint32_t index;
+    uint32_t start;
+    uint32_t count;
+} VFIOUserIRQSet;
+
+/*
  * VFIO_USER_REGION_READ
  * VFIO_USER_REGION_WRITE
  */
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 5c519ee..e918f8d 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -514,7 +514,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
         vdev->nr_vectors = nr + 1;
         ret = vfio_enable_vectors(vdev, true);
         if (ret) {
-            error_report("vfio: failed to enable vectors, %d", ret);
+            error_report("vfio: failed to enable vectors, %s", strerror(-ret));
         }
     } else {
         Error *err = NULL;
@@ -659,7 +659,8 @@ retry:
     ret = vfio_enable_vectors(vdev, false);
     if (ret) {
         if (ret < 0) {
-            error_report("vfio: Error: Failed to setup MSI fds: %m");
+            error_report("vfio: Error: Failed to setup MSI fds: %s",
+                         strerror(-ret));
         } else if (ret != vdev->nr_vectors) {
             error_report("vfio: Error: Failed to enable %d "
                          "MSI vectors, retry with %d", vdev->nr_vectors, ret);
@@ -2668,6 +2669,7 @@ static void vfio_populate_device(VFIOPCIDevice *vdev, Error **errp)
     irq_info.index = VFIO_PCI_ERR_IRQ_INDEX;

     ret = VDEV_GET_IRQ_INFO(vbasedev, &irq_info);
+
     if (ret) {
         /* This can fail for an old kernel or legacy PCI dev */
         trace_vfio_populate_device_get_irq_info_failure(strerror(errno));
@@ -3553,6 +3555,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
         goto out_teardown;
     }

+    vfio_register_err_notifier(vdev);
+    vfio_register_req_notifier(vdev);
+
     return;

 out_teardown:
diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 09132a0..99425ef 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -988,6 +988,113 @@ static int vfio_user_get_region_info(VFIOProxy *proxy,
     return 0;
 }

+static int vfio_user_get_irq_info(VFIOProxy *proxy,
+                                  struct vfio_irq_info *info)
+{
+    VFIOUserIRQInfo msg;
+
+    memset(&msg, 0, sizeof(msg));
+    vfio_user_request_msg(&msg.hdr, VFIO_USER_DEVICE_GET_IRQ_INFO,
+                          sizeof(msg), 0);
+    msg.argsz = info->argsz;
+    msg.index = info->index;
+
+    vfio_user_send_wait(proxy, &msg.hdr, NULL, 0, false);
+    if (msg.hdr.flags & VFIO_USER_ERROR) {
+        return -msg.hdr.error_reply;
+    }
+
+    memcpy(info, &msg.argsz, sizeof(*info));
+    return 0;
+}
+
+static int irq_howmany(int *fdp, int cur, int max)
+{
+    int n = 0;
+
+    if (fdp[cur] != -1) {
+        do {
+            n++;
+        } while (n < max && fdp[cur + n] != -1 && n < max_send_fds);
+    } else {
+        do {
+            n++;
+        } while (n < max && fdp[cur + n] == -1 && n < max_send_fds);
+    }
+
+    return n;
+}
+
+static int vfio_user_set_irqs(VFIOProxy *proxy, struct vfio_irq_set *irq)
+{
+    g_autofree VFIOUserIRQSet *msgp = NULL;
+    uint32_t size, nfds, send_fds, sent_fds;
+
+    if (irq->argsz < sizeof(*irq)) {
+        error_printf("vfio_user_set_irqs argsz too small\n");
+        return -EINVAL;
+    }
+
+    /*
+     * Handle simple case
+     */
+    if ((irq->flags & VFIO_IRQ_SET_DATA_EVENTFD) == 0) {
+        size = sizeof(VFIOUserHdr) + irq->argsz;
+        msgp = g_malloc0(size);
+
+        vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_SET_IRQS, size, 0);
+        msgp->argsz = irq->argsz;
+        msgp->flags = irq->flags;
+        msgp->index = irq->index;
+        msgp->start = irq->start;
+        msgp->count = irq->count;
+
+        vfio_user_send_wait(proxy, &msgp->hdr, NULL, 0, false);
+        if (msgp->hdr.flags & VFIO_USER_ERROR) {
+            return -msgp->hdr.error_reply;
+        }
+
+        return 0;
+    }
+
+    /*
+     * Calculate the number of FDs to send
+     * and adjust argsz
+     */
+    nfds = (irq->argsz - sizeof(*irq)) / sizeof(int);
+    irq->argsz = sizeof(*irq);
+    msgp = g_malloc0(sizeof(*msgp));
+    /*
+     * Send in chunks if over max_send_fds
+     */
+    for (sent_fds = 0; nfds > sent_fds; sent_fds += send_fds) {
+        VFIOUserFDs *arg_fds, loop_fds;
+
+        /* must send all valid FDs or all invalid FDs in single msg */
+        send_fds = irq_howmany((int *)irq->data, sent_fds, nfds - sent_fds);
+
+        vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_SET_IRQS,
+                              sizeof(*msgp), 0);
+        msgp->argsz = irq->argsz;
+        msgp->flags = irq->flags;
+        msgp->index = irq->index;
+        msgp->start = irq->start + sent_fds;
+        msgp->count = send_fds;
+
+        loop_fds.send_fds = send_fds;
+        loop_fds.recv_fds = 0;
+        loop_fds.fds = (int *)irq->data + sent_fds;
+        arg_fds = loop_fds.fds[0] != -1 ? &loop_fds : NULL;
+
+        vfio_user_send_wait(proxy, &msgp->hdr, arg_fds, 0, false);
+        if (msgp->hdr.flags & VFIO_USER_ERROR) {
+            return -msgp->hdr.error_reply;
+        }
+    }
+
+    return 0;
+}
+
 static int vfio_user_region_read(VFIOProxy *proxy, uint8_t index,
                                  off_t offset, uint32_t count, void *data)
 {
@@ -1098,6 +1205,28 @@ static int vfio_user_io_get_region_info(VFIODevice *vbasedev,
     return 0;
 }

+static int vfio_user_io_get_irq_info(VFIODevice *vbasedev,
+                                     struct vfio_irq_info *irq)
+{
+    int ret;
+
+    ret = vfio_user_get_irq_info(vbasedev->proxy, irq);
+    if (ret) {
+        return ret;
+    }
+
+    if (irq->index > vbasedev->num_irqs) {
+        return -EINVAL;
+    }
+    return 0;
+}
+
+static int vfio_user_io_set_irqs(VFIODevice *vbasedev,
+                                 struct vfio_irq_set *irqs)
+{
+    return vfio_user_set_irqs(vbasedev->proxy, irqs);
+}
+
 static int vfio_user_io_region_read(VFIODevice *vbasedev, uint8_t index,
                                     off_t off, uint32_t size, void *data)
 {
@@ -1115,6 +1244,8 @@ static int vfio_user_io_region_write(VFIODevice *vbasedev, uint8_t index,
 VFIODevIO vfio_dev_io_sock = {
     .get_info = vfio_user_io_get_info,
     .get_region_info = vfio_user_io_get_region_info,
+    .get_irq_info = vfio_user_io_get_irq_info,
+    .set_irqs = vfio_user_io_set_irqs,
     .region_read = vfio_user_io_region_read,
     .region_write = vfio_user_io_region_write,
 };

From patchwork Wed Jan 12 00:43:51 2022
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12710867
From: John Johnson
To: qemu-devel@nongnu.org
Subject: [RFC v4 15/21] vfio-user: proxy container connect/disconnect
Date: Tue, 11 Jan 2022 16:43:51 -0800
Message-Id: <2601edd1529d80dc1224f7916513c725b092f7b7.1641584317.git.john.g.johnson@oracle.com>
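
Note on the teardown order in vfio_disconnect_proxy() in this patch: the
group is detached from its container, the per-container lists are drained
with QLIST_FOREACH_SAFE, and only then is the container itself freed. The
sketch below illustrates why the "safe" form of iteration is needed when
freeing nodes, using a hand-rolled singly linked list rather than QEMU's
QLIST macros. All names in it (sketch_win, sketch_container, sketch_connect,
sketch_drain, sketch_disconnect) are hypothetical stand-ins for the
host-window list handling; it is an illustration under those assumptions,
not the patch's implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry on a container's host-window list. */
struct sketch_win {
    struct sketch_win *next;
};

/* Hypothetical stand-in for a container that owns its window list. */
struct sketch_container {
    struct sketch_win *wins;
    int group_attached;
};

/* Free every window, saving the successor before freeing the current node. */
static size_t sketch_drain(struct sketch_container *c)
{
    size_t freed = 0;
    struct sketch_win *w = c->wins;

    while (w != NULL) {
        struct sketch_win *next = w->next; /* read before free(w) */
        free(w);
        w = next;
        freed++;
    }
    c->wins = NULL;
    return freed;
}

/* Allocate a container with nwins windows; returns NULL on allocation failure. */
static struct sketch_container *sketch_connect(size_t nwins)
{
    struct sketch_container *c = calloc(1, sizeof(*c));

    if (c == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < nwins; i++) {
        struct sketch_win *w = calloc(1, sizeof(*w));

        if (w == NULL) {
            sketch_drain(c);
            free(c);
            return NULL;
        }
        w->next = c->wins;
        c->wins = w;
    }
    c->group_attached = 1;
    return c;
}

/* Detach the group, drain the list, then free the container itself. */
static size_t sketch_disconnect(struct sketch_container *c)
{
    size_t freed;

    assert(c != NULL);
    c->group_attached = 0;
    freed = sketch_drain(c);
    free(c);
    return freed;
}
```

Reading w->next after free(w) would be a use-after-free, which is the
situation the saved successor pointer avoids.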
Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
---
 hw/vfio/user.h                |   1 +
 include/hw/vfio/vfio-common.h |   3 ++
 hw/vfio/common.c              | 105 ++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.c                 |  25 ++++++++++
 hw/vfio/user.c                |   3 ++
 5 files changed, 137 insertions(+)

diff --git a/hw/vfio/user.h b/hw/vfio/user.h
index f2098f2..8d03e7c 100644
--- a/hw/vfio/user.h
+++ b/hw/vfio/user.h
@@ -85,5 +85,6 @@ void vfio_user_set_handler(VFIODevice *vbasedev,
 int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp);

 extern VFIODevIO vfio_dev_io_sock;
+extern VFIOContIO vfio_cont_io_sock;

 #endif /* VFIO_USER_H */
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 4118b8a..59a8299 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -94,6 +94,7 @@ typedef struct VFIOContainer {
     uint64_t max_dirty_bitmap_size;
     unsigned long pgsizes;
     unsigned int dma_max_mappings;
+    VFIOProxy *proxy;
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
     QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
@@ -278,6 +279,8 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp);
 void vfio_put_group(VFIOGroup *group);
 int vfio_get_device(VFIOGroup *group, const char *name,
                     VFIODevice *vbasedev, Error **errp);
+void vfio_connect_proxy(VFIOProxy *proxy, VFIOGroup *group, AddressSpace *as);
+void vfio_disconnect_proxy(VFIOGroup *group);

 extern const MemoryRegionOps vfio_region_ops;
 typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 83cc5ec..9a67934 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -19,6 +19,7 @@
  */

 #include "qemu/osdep.h"
+#include CONFIG_DEVICES
 #include
 #ifdef CONFIG_KVM
 #include
@@ -2209,6 +2210,62 @@ put_space_exit:
     return ret;
 }

+
+#ifdef CONFIG_VFIO_USER
+
+void vfio_connect_proxy(VFIOProxy *proxy, VFIOGroup *group, AddressSpace *as)
+{
+    VFIOAddressSpace *space;
+    VFIOContainer *container;
+
+    if (QLIST_EMPTY(&vfio_group_list)) {
+        qemu_register_reset(vfio_reset_handler, NULL);
+    }
+
+    QLIST_INSERT_HEAD(&vfio_group_list, group, next);
+
+    /*
+     * try to mirror vfio_connect_container()
+     * as much as possible
+     */
+
+    space = vfio_get_address_space(as);
+
+    container = g_malloc0(sizeof(*container));
+    container->space = space;
+    container->fd = -1;
+    container->io_ops = &vfio_cont_io_sock;
+    QLIST_INIT(&container->giommu_list);
+    QLIST_INIT(&container->hostwin_list);
+    container->proxy = proxy;
+
+    /*
+     * The proxy uses a SW IOMMU in lieu of the HW one
+     * used in the ioctl() version. Use TYPE1 with the
+     * target's page size for maximum compatibility
+     */
+    container->iommu_type = VFIO_TYPE1_IOMMU;
+    vfio_host_win_add(container, 0, (hwaddr)-1, TARGET_PAGE_SIZE);
+    container->pgsizes = TARGET_PAGE_SIZE;
+
+    container->dirty_pages_supported = true;
+    container->max_dirty_bitmap_size = VFIO_USER_DEF_MAX_XFER;
+    container->dirty_pgsizes = TARGET_PAGE_SIZE;
+
+    QLIST_INIT(&container->group_list);
+    QLIST_INSERT_HEAD(&space->containers, container, next);
+
+    group->container = container;
+    QLIST_INSERT_HEAD(&container->group_list, group, container_next);
+
+    container->listener = vfio_memory_listener;
+    memory_listener_register(&container->listener, container->space->as);
+    container->initialized = true;
+}
+
+#endif /* CONFIG_VFIO_USER */
+
+
 static void vfio_disconnect_container(VFIOGroup *group)
 {
     VFIOContainer *container = group->container;
@@ -2258,6 +2315,54 @@ static void vfio_disconnect_container(VFIOGroup *group)
     }
 }

+
+#ifdef CONFIG_VFIO_USER
+
+void vfio_disconnect_proxy(VFIOGroup *group)
+{
+    VFIOContainer *container = group->container;
+    VFIOAddressSpace *space = container->space;
+    VFIOGuestIOMMU *giommu, *tmp;
+    VFIOHostDMAWindow *hostwin, *next;
+
+    /*
+     * try to mirror vfio_disconnect_container()
+     * as much as possible, knowing each device
+     * is in one group and one container
+     */
+
+    QLIST_REMOVE(group, container_next);
+    group->container = NULL;
+
+    /*
+     * Explicitly release the listener first before unset container,
+     * since unset may destroy the backend container if it's the last
+     * group.
+     */
+    memory_listener_unregister(&container->listener);
+
+    QLIST_REMOVE(container, next);
+
+    QLIST_FOREACH_SAFE(giommu, &container->giommu_list, giommu_next, tmp) {
+        memory_region_unregister_iommu_notifier(
+                MEMORY_REGION(giommu->iommu), &giommu->n);
+        QLIST_REMOVE(giommu, giommu_next);
+        g_free(giommu);
+    }
+
+    QLIST_FOREACH_SAFE(hostwin, &container->hostwin_list, hostwin_next,
+                       next) {
+        QLIST_REMOVE(hostwin, hostwin_next);
+        g_free(hostwin);
+    }
+
+    g_free(container);
+    vfio_put_address_space(space);
+}
+
+#endif /* CONFIG_VFIO_USER */
+
+
 VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp)
 {
     VFIOGroup *group;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index e918f8d..1fc79ef 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3456,6 +3456,7 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
     VFIODevice *vbasedev = &vdev->vbasedev;
     SocketAddress addr;
     VFIOProxy *proxy;
+    VFIOGroup *group = NULL;
     struct vfio_device_info info;
     int ret;
     Error *err = NULL;
@@ -3502,6 +3503,19 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp)
     vbasedev->ops = &vfio_user_pci_ops;
     vbasedev->io_ops = &vfio_dev_io_sock;

+    /*
+     * each device gets its own group and container
+     * make them unrelated to any host IOMMU groupings
+     */
+    group = g_malloc0(sizeof(*group));
+    group->fd = -1;
+    group->groupid = -1;
+    QLIST_INIT(&group->device_list);
+    QLIST_INSERT_HEAD(&group->device_list, vbasedev, next);
+    vbasedev->group = group;
+
+    vfio_connect_proxy(proxy, group, pci_device_iommu_address_space(pdev));
+
     ret = VDEV_GET_INFO(vbasedev, &info);
     if (ret) {
         error_setg_errno(errp, -ret, "get info failure");
@@ -3564,6 +3578,10 @@ out_teardown:
     vfio_teardown_msi(vdev);
     vfio_bars_exit(vdev);
 error:
+    if (group != NULL) {
+        vfio_disconnect_proxy(group);
+        g_free(group);
+    }
     vfio_user_disconnect(proxy);
     error_prepend(errp, VFIO_MSG_PREFIX, vdev->vbasedev.name);
 }
@@ -3572,6 +3590,13 @@ static void vfio_user_instance_finalize(Object *obj)
 {
     VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
     VFIODevice *vbasedev = &vdev->vbasedev;
+    VFIOGroup *group = vbasedev->group;
+
+    if (group != NULL) {
+        vfio_disconnect_proxy(group);
+        g_free(group);
+        vbasedev->group = NULL;
+    }

     vfio_put_device(vdev);

diff --git a/hw/vfio/user.c b/hw/vfio/user.c
index 99425ef..9e71cdb 100644
--- a/hw/vfio/user.c
+++ b/hw/vfio/user.c
@@ -1250,3 +1250,6 @@ VFIODevIO vfio_dev_io_sock = {
     .region_write = vfio_user_io_region_write,
 };

+
+VFIOContIO vfio_cont_io_sock = {
+};

From patchwork Wed Jan 12 00:43:52 2022
X-Patchwork-Submitter: John Johnson
X-Patchwork-Id: 12710872
d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; bh=cezNkDBG0fwfvE37BmpPJUDcwMtN5fTyd6COu6OBPYE=; b=hyPZkcjLTYayGdhb3wuhHH0tl/rTpzxaOoGVr+4iBS3tdaaSdD0tQoUnvMrjmXmQHPlv zjWQCUc2zx/piHRXAKpFiRSRU/frL+ttah7KM7b0OEdzyOM7yaDza3L3qLiaGssl9nU5 6m83VRtqKKet42GgiG1IvQNr/9EwY9g24frMNsbvCRWpyZFCgRhI3T869Xg1YzzupkNZ oJmJORBG8PHYloX2Oe7o8LgviB5fsIbmGoaWSFDTO2g79xBV7FtCAs2uM27pWOSa5RiX AvjEJH9VTO6EvT/4Ev+rsYm3h6g/ypjrVRCSPQmNUKTKQRmEvfcpfUxyRHoYQeMwT6SB Tw== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by mx0b-00069f02.pphosted.com with ESMTP id 3dgkhx4sjn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 12 Jan 2022 00:37:19 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 20C0KAjB069368 for ; Wed, 12 Jan 2022 00:37:18 GMT Received: from nam10-mw2-obe.outbound.protection.outlook.com (mail-mw2nam10lp2105.outbound.protection.outlook.com [104.47.55.105]) by userp3030.oracle.com with ESMTP id 3deyqy1gqq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 12 Jan 2022 00:37:18 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Cwbiq9DI5o11GBd1sMgOkohIVXEumVNo5srqC2kPcareXTTOdw39j6Ct/P9udo9sJkg8YxWQIDfRt0W3Oh9dxLQcEFWiPMqcJ3yPVWg7olHxmiJFDbcWHd9batVCiTvhXhKUBWOeNCqUW+KPrONq/wf5BzPW5Bim1IhzohOqt7MvADOLK4EGhlahgXSV2ujQEazqPPDdfatG/u/T1x6FoWSS+yB8JC1efdf2b9r+ko6NAA+3O7bvFJr0BFxFf19jVzPZn2PU7KOxe84URWVWY5W8J4h1B6DsFObbY2DckJW/kkJV8j39iMikr/xac1uuK1kNYRMwiQr/h6zC7MKCrg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=cezNkDBG0fwfvE37BmpPJUDcwMtN5fTyd6COu6OBPYE=; 
b=Xru7oeUdjt5aOUksvwL+/0qIZFVo6Fft3vxckN4D8HxsslJNUHAYPRUYzmfWeQrIyQ6zIINnhiJz9silp+qdSn0WJL6vpzkRqrhLyFp5SIdFwvgLk/xKLzYz+2iaBlbakaDuP0K5Tf5gMzEA1mCAR/NpjMoBFd68/zr9aTiFTTcp0tOrIHRivdNiZFnfPpAxGakc4vfym7vb36cwJtEfO1uUM6y648lSwGqnngL4JCvtjMA/hz2XC2segM9AgHrUG6s0ApZXnggRdVCauB+S5v4UZdVBsXlgFP4+yVVViPcMbrHlKA2rqlwaTvus0e5ZMRPBm2yx94CDWErCHiRhHg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=cezNkDBG0fwfvE37BmpPJUDcwMtN5fTyd6COu6OBPYE=; b=IBWzTjEDTEDEc16l3HchmfyBQYNGevCl/+pdqZA+Xjq4kUV9q87yJp9mmqf0JswVDWPaQNeRPJn9n/rUJZEibSqdo1E+mdTduuycN10U/x8r571oJ+Ybht2ezycpP+h3gwCsTIHf5D02Se+5ozLo7NV64Wa+Xn/q+CzTUj8PsTk= Received: from PH0PR10MB4679.namprd10.prod.outlook.com (2603:10b6:510:3c::15) by PH0PR10MB4742.namprd10.prod.outlook.com (2603:10b6:510:3f::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4867.9; Wed, 12 Jan 2022 00:37:08 +0000 Received: from PH0PR10MB4679.namprd10.prod.outlook.com ([fe80::5536:dbc6:5161:ac1b]) by PH0PR10MB4679.namprd10.prod.outlook.com ([fe80::5536:dbc6:5161:ac1b%3]) with mapi id 15.20.4867.012; Wed, 12 Jan 2022 00:37:08 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v4 16/21] vfio-user: dma map/unmap operations Date: Tue, 11 Jan 2022 16:43:52 -0800 Message-Id: X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: BY5PR03CA0006.namprd03.prod.outlook.com (2603:10b6:a03:1e0::16) To PH0PR10MB4679.namprd10.prod.outlook.com (2603:10b6:510:3c::15) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 12229bf3-53a9-47f2-bdc4-08d9d563a9f8 X-MS-TrafficTypeDiagnostic: PH0PR10MB4742:EE_ 
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: dObyXzwQ2zqYx+CEuBBeZ6TNVEpCJMD0gpxKZo+SbOtTj0I/RGKk3H6FQ5xmcNA6iwcc7tr0yK7lgDI8MMF9otSGJR9itsf4wHxk8/9Bd8s= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4742 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10224 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxscore=0 suspectscore=0 spamscore=0 mlxlogscore=999 adultscore=0 phishscore=0 malwarescore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2201120000 X-Proofpoint-GUID: VFSrHOj5aHzaoJf9HoBVbcLDMeWUgPko X-Proofpoint-ORIG-GUID: VFSrHOj5aHzaoJf9HoBVbcLDMeWUgPko Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Add ability to do async operations during memory transactions Signed-off-by: Jagannathan Raman Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson --- hw/vfio/user-protocol.h | 32 +++++++ include/hw/vfio/vfio-common.h | 9 +- hw/vfio/common.c | 63 +++++++++--- hw/vfio/user.c | 217 ++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 305 insertions(+), 16 deletions(-) diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index 
4852882..ad63f21 100644 --- a/hw/vfio/user-protocol.h +++ b/hw/vfio/user-protocol.h @@ -94,6 +94,31 @@ typedef struct { /* + * VFIO_USER_DMA_MAP + * imported from struct vfio_iommu_type1_dma_map + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint64_t offset; /* FD offset */ + uint64_t iova; + uint64_t size; +} VFIOUserDMAMap; + +/* + * VFIO_USER_DMA_UNMAP + * imported from struct vfio_iommu_type1_dma_unmap + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint64_t iova; + uint64_t size; +} VFIOUserDMAUnmap; + +/* * VFIO_USER_DEVICE_GET_INFO * imported from struct_device_info */ @@ -157,4 +182,11 @@ typedef struct { char data[]; } VFIOUserRegionRW; +/* imported from struct vfio_bitmap */ +typedef struct { + uint64_t pgsize; + uint64_t size; + char data[]; +} VFIOUserBitmap; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 59a8299..a84e10a 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -90,6 +90,7 @@ typedef struct VFIOContainer { VFIOContIO *io_ops; bool initialized; bool dirty_pages_supported; + bool async_ops; uint64_t dirty_pgsizes; uint64_t max_dirty_bitmap_size; unsigned long pgsizes; @@ -199,7 +200,7 @@ struct VFIODevIO { ((vdev)->io_ops->region_write((vdev), (nr), (off), (size), (data), (post))) struct VFIOContIO { - int (*dma_map)(VFIOContainer *container, + int (*dma_map)(VFIOContainer *container, MemoryRegion *mr, struct vfio_iommu_type1_dma_map *map); int (*dma_unmap)(VFIOContainer *container, struct vfio_iommu_type1_dma_unmap *unmap, @@ -207,14 +208,16 @@ struct VFIOContIO { int (*dirty_bitmap)(VFIOContainer *container, struct vfio_iommu_type1_dirty_bitmap *bitmap, struct vfio_iommu_type1_dirty_bitmap_get *range); + void (*wait_commit)(VFIOContainer *container); }; -#define CONT_DMA_MAP(cont, map) \ - ((cont)->io_ops->dma_map((cont), (map))) +#define CONT_DMA_MAP(cont, mr, map) \ +
((cont)->io_ops->dma_map((cont), (mr), (map))) #define CONT_DMA_UNMAP(cont, unmap, bitmap) \ ((cont)->io_ops->dma_unmap((cont), (unmap), (bitmap))) #define CONT_DIRTY_BITMAP(cont, bitmap, range) \ ((cont)->io_ops->dirty_bitmap((cont), (bitmap), (range))) +#define CONT_WAIT_COMMIT(cont) ((cont)->io_ops->wait_commit(cont)) extern VFIODevIO vfio_dev_io_ioctl; extern VFIOContIO vfio_cont_io_ioctl; diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 9a67934..ca51baa 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -480,7 +480,7 @@ static int vfio_dma_unmap(VFIOContainer *container, return CONT_DMA_UNMAP(container, &unmap, NULL); } -static int vfio_dma_map(VFIOContainer *container, hwaddr iova, +static int vfio_dma_map(VFIOContainer *container, MemoryRegion *mr, hwaddr iova, ram_addr_t size, void *vaddr, bool readonly) { struct vfio_iommu_type1_dma_map map = { @@ -496,7 +496,7 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova, map.flags |= VFIO_DMA_MAP_FLAG_WRITE; } - ret = CONT_DMA_MAP(container, &map); + ret = CONT_DMA_MAP(container, mr, &map); if (ret < 0) { error_report("VFIO_MAP_DMA failed: %s", strerror(-ret)); @@ -559,7 +559,8 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section) /* Called with rcu_read_lock held. 
*/ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, - ram_addr_t *ram_addr, bool *read_only) + ram_addr_t *ram_addr, bool *read_only, + MemoryRegion **mrp) { MemoryRegion *mr; hwaddr xlat; @@ -640,6 +641,10 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, *read_only = !writable || mr->readonly; } + if (mrp != NULL) { + *mrp = mr; + } + return true; } @@ -647,6 +652,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) { VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n); VFIOContainer *container = giommu->container; + MemoryRegion *mr; hwaddr iova = iotlb->iova + giommu->iommu_offset; void *vaddr; int ret; @@ -665,7 +671,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) { bool read_only; - if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only)) { + if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, &mr)) { goto out; } /* @@ -675,14 +681,14 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) * of vaddr will always be there, even if the memory object is * destroyed and its backing memory munmap-ed. 
*/ - ret = vfio_dma_map(container, iova, + ret = vfio_dma_map(container, mr, iova, iotlb->addr_mask + 1, vaddr, read_only); if (ret) { error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", " - "0x%"HWADDR_PRIx", %p) = %d (%m)", + "0x%"HWADDR_PRIx", %p)", container, iova, - iotlb->addr_mask + 1, vaddr, ret); + iotlb->addr_mask + 1, vaddr); } } else { ret = vfio_dma_unmap(container, iova, iotlb->addr_mask + 1, iotlb); @@ -737,7 +743,7 @@ static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl, section->offset_within_address_space; vaddr = memory_region_get_ram_ptr(section->mr) + start; - ret = vfio_dma_map(vrdl->container, iova, next - start, + ret = vfio_dma_map(vrdl->container, section->mr, iova, next - start, vaddr, section->readonly); if (ret) { /* Rollback */ @@ -845,6 +851,29 @@ static void vfio_unregister_ram_discard_listener(VFIOContainer *container, g_free(vrdl); } +static void vfio_listener_begin(MemoryListener *listener) +{ + VFIOContainer *container = container_of(listener, VFIOContainer, listener); + + /* + * When DMA space is the physical address space, + * the region add/del listeners will fire during + * memory update transactions. These depend on BQL + * being held, so do any resulting map/unmap ops async + * while keeping BQL.
+ */ + container->async_ops = true; +} + +static void vfio_listener_commit(MemoryListener *listener) +{ + VFIOContainer *container = container_of(listener, VFIOContainer, listener); + + /* wait here for any async requests sent during the transaction */ + CONT_WAIT_COMMIT(container); + container->async_ops = false; +} + static void vfio_listener_region_add(MemoryListener *listener, MemoryRegionSection *section) { @@ -1044,12 +1073,12 @@ static void vfio_listener_region_add(MemoryListener *listener, } } - ret = vfio_dma_map(container, iova, int128_get64(llsize), + ret = vfio_dma_map(container, section->mr, iova, int128_get64(llsize), vaddr, section->readonly); if (ret) { error_setg(&err, "vfio_dma_map(%p, 0x%"HWADDR_PRIx", " - "0x%"HWADDR_PRIx", %p) = %d (%m)", - container, iova, int128_get64(llsize), vaddr, ret); + "0x%"HWADDR_PRIx", %p)", + container, iova, int128_get64(llsize), vaddr); if (memory_region_is_ram_device(section->mr)) { /* Allow unexpected mappings not to be fatal for RAM devices */ error_report_err(err); @@ -1310,7 +1339,7 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) } rcu_read_lock(); - if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) { + if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL, NULL)) { int ret; ret = vfio_get_dirty_bitmap(container, iova, iotlb->addr_mask + 1, @@ -1428,6 +1457,8 @@ static void vfio_listener_log_sync(MemoryListener *listener, static const MemoryListener vfio_memory_listener = { .name = "vfio", + .begin = vfio_listener_begin, + .commit = vfio_listener_commit, .region_add = vfio_listener_region_add, .region_del = vfio_listener_region_del, .log_global_start = vfio_listener_log_global_start, @@ -2819,7 +2850,7 @@ VFIODevIO vfio_dev_io_ioctl = { .region_write = vfio_io_region_write, }; -static int vfio_io_dma_map(VFIOContainer *container, +static int vfio_io_dma_map(VFIOContainer *container, MemoryRegion *mr, struct vfio_iommu_type1_dma_map *map) { @@ -2879,8 +2910,14 
@@ static int vfio_io_dirty_bitmap(VFIOContainer *container, return ret < 0 ? -errno : ret; } +static void vfio_io_wait_commit(VFIOContainer *container) +{ + /* ioctl()s are synchronous */ +} + VFIOContIO vfio_cont_io_ioctl = { .dma_map = vfio_io_dma_map, .dma_unmap = vfio_io_dma_unmap, .dirty_bitmap = vfio_io_dirty_bitmap, + .wait_commit = vfio_io_wait_commit, }; diff --git a/hw/vfio/user.c b/hw/vfio/user.c index 9e71cdb..5c27a5e 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -52,8 +52,11 @@ static void vfio_user_request(void *opaque); static int vfio_user_send_queued(VFIOProxy *proxy, VFIOUserMsg *msg); static void vfio_user_send_async(VFIOProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds); +static void vfio_user_send_nowait(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize); static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize, bool nobql); +static void vfio_user_wait_reqs(VFIOProxy *proxy); static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, uint32_t size, uint32_t flags); @@ -563,6 +566,36 @@ static void vfio_user_send_async(VFIOProxy *proxy, VFIOUserHdr *hdr, } } +/* + * nowait send - vfio_user_wait_reqs() can wait for it later + */ +static void vfio_user_send_nowait(VFIOProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize) +{ + VFIOUserMsg *msg; + int ret; + + if (hdr->flags & VFIO_USER_NO_REPLY) { + error_printf("vfio_user_send_nowait on async message\n"); + return; + } + + QEMU_LOCK_GUARD(&proxy->lock); + + msg = vfio_user_getmsg(proxy, hdr, fds); + msg->id = hdr->id; + msg->rsize = rsize ?
rsize : hdr->size; + msg->type = VFIO_MSG_NOWAIT; + + ret = vfio_user_send_queued(proxy, msg); + if (ret < 0) { + vfio_user_recycle(proxy, msg); + return; + } + + proxy->last_nowait = msg; +} + static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize, bool nobql) { @@ -612,6 +645,57 @@ static void vfio_user_send_wait(VFIOProxy *proxy, VFIOUserHdr *hdr, } } +static void vfio_user_wait_reqs(VFIOProxy *proxy) +{ + VFIOUserMsg *msg; + bool iolock = false; + + /* + * Any DMA map/unmap requests sent in the middle + * of a memory region transaction were sent nowait. + * Wait for them here. + */ + qemu_mutex_lock(&proxy->lock); + if (proxy->last_nowait != NULL) { + iolock = qemu_mutex_iothread_locked(); + if (iolock) { + qemu_mutex_unlock_iothread(); + } + + /* + * Change type to WAIT to wait for reply + */ + msg = proxy->last_nowait; + msg->type = VFIO_MSG_WAIT; + while (!msg->complete) { + if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) { + QTAILQ_REMOVE(&proxy->pending, msg, next); + error_printf("vfio_user_wait_reqs - timed out\n"); + break; + } + } + + if (msg->hdr->flags & VFIO_USER_ERROR) { + error_printf("vfio_user_wait_reqs - error reply on async request "); + error_printf("command %x error %s\n", msg->hdr->command, + strerror(msg->hdr->error_reply)); + } + + proxy->last_nowait = NULL; + /* + * Change type back to NOWAIT to free + */ + msg->type = VFIO_MSG_NOWAIT; + vfio_user_recycle(proxy, msg); + } + + /* lock order is BQL->proxy - don't hold proxy when getting BQL */ + qemu_mutex_unlock(&proxy->lock); + if (iolock) { + qemu_mutex_lock_iothread(); + } +} + static QLIST_HEAD(, VFIOProxy) vfio_user_sockets = QLIST_HEAD_INITIALIZER(vfio_user_sockets); @@ -937,6 +1021,103 @@ int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp) return 0; } +static int vfio_user_dma_map(VFIOProxy *proxy, + struct vfio_iommu_type1_dma_map *map, + int fd, bool will_commit) +{ + VFIOUserFDs *fds = NULL; + VFIOUserDMAMap *msgp =
g_malloc0(sizeof(*msgp)); + int ret; + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_DMA_MAP, sizeof(*msgp), 0); + msgp->argsz = map->argsz; + msgp->flags = map->flags; + msgp->offset = map->vaddr; + msgp->iova = map->iova; + msgp->size = map->size; + + /* + * The will_commit case sends without blocking or dropping BQL. + * Such requests are later waited for in vfio_user_wait_reqs(). + */ + if (will_commit) { + /* can't use auto variable since we don't block */ + if (fd != -1) { + fds = vfio_user_getfds(1); + fds->send_fds = 1; + fds->fds[0] = fd; + } + vfio_user_send_nowait(proxy, &msgp->hdr, fds, 0); + ret = 0; + } else { + VFIOUserFDs local_fds = { 1, 0, &fd }; + + fds = fd != -1 ? &local_fds : NULL; + vfio_user_send_wait(proxy, &msgp->hdr, fds, 0, will_commit); + ret = (msgp->hdr.flags & VFIO_USER_ERROR) ? -msgp->hdr.error_reply : 0; + g_free(msgp); + } + + return ret; +} + +static int vfio_user_dma_unmap(VFIOProxy *proxy, + struct vfio_iommu_type1_dma_unmap *unmap, + struct vfio_bitmap *bitmap, bool will_commit) +{ + struct { + VFIOUserDMAUnmap msg; + VFIOUserBitmap bitmap; + } *msgp = NULL; + int msize, rsize; + bool blocking = !will_commit; + + if (bitmap == NULL && + (unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP)) { + error_printf("vfio_user_dma_unmap mismatched flags and bitmap\n"); + return -EINVAL; + } + + /* + * If a dirty bitmap is returned, allocate extra space for it + * and block for reply even in the will_commit case. + * Otherwise, can send the unmap request without waiting.
+ */ + if (bitmap != NULL) { + blocking = true; + msize = sizeof(*msgp); + rsize = msize + bitmap->size; + msgp = g_malloc0(rsize); + msgp->bitmap.pgsize = bitmap->pgsize; + msgp->bitmap.size = bitmap->size; + } else { + msize = rsize = sizeof(VFIOUserDMAUnmap); + msgp = g_malloc0(rsize); + } + + vfio_user_request_msg(&msgp->msg.hdr, VFIO_USER_DMA_UNMAP, msize, 0); + msgp->msg.argsz = rsize - sizeof(VFIOUserHdr); + msgp->msg.argsz = unmap->argsz; + msgp->msg.flags = unmap->flags; + msgp->msg.iova = unmap->iova; + msgp->msg.size = unmap->size; + + if (blocking) { + vfio_user_send_wait(proxy, &msgp->msg.hdr, NULL, rsize, will_commit); + if (msgp->msg.hdr.flags & VFIO_USER_ERROR) { + return -msgp->msg.hdr.error_reply; + } + if (bitmap != NULL) { + memcpy(bitmap->data, &msgp->bitmap.data, bitmap->size); + } + g_free(msgp); + } else { + vfio_user_send_nowait(proxy, &msgp->msg.hdr, NULL, rsize); + } + + return 0; +} + static int vfio_user_get_info(VFIOProxy *proxy, struct vfio_device_info *info) { VFIOUserDeviceInfo msg; @@ -1251,5 +1432,41 @@ VFIODevIO vfio_dev_io_sock = { }; +static int vfio_user_io_dma_map(VFIOContainer *container, MemoryRegion *mr, + struct vfio_iommu_type1_dma_map *map) +{ + int fd = memory_region_get_fd(mr); + + /* + * map->vaddr enters as a QEMU process address + * make it either a file offset for mapped areas or 0 + */ + if (fd != -1) { + void *addr = (void *)(uintptr_t)map->vaddr; + + map->vaddr = qemu_ram_block_host_offset(mr->ram_block, addr); + } else { + map->vaddr = 0; + } + + return vfio_user_dma_map(container->proxy, map, fd, container->async_ops); +} + +static int vfio_user_io_dma_unmap(VFIOContainer *container, + struct vfio_iommu_type1_dma_unmap *unmap, + struct vfio_bitmap *bitmap) +{ + return vfio_user_dma_unmap(container->proxy, unmap, bitmap, + container->async_ops); +} + +static void vfio_user_io_wait_commit(VFIOContainer *container) +{ + vfio_user_wait_reqs(container->proxy); +} + VFIOContIO vfio_cont_io_sock = { + .dma_map = 
vfio_user_io_dma_map, + .dma_unmap = vfio_user_io_dma_unmap, + .wait_commit = vfio_user_io_wait_commit, }; From patchwork Wed Jan 12 00:43:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12710865 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 99D76C433F5 for ; Wed, 12 Jan 2022 01:06:51 +0000 (UTC) Received: from localhost ([::1]:58852 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1n7S6E-0002EV-Dy for qemu-devel@archiver.kernel.org; Tue, 11 Jan 2022 20:06:50 -0500 Received: from eggs.gnu.org ([209.51.188.92]:36866) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n7Re2-0001Hl-Pp for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:42 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:15464) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n7Rdl-0005hS-6z for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:37 -0500 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 20BMlEK5019919 for ; Wed, 12 Jan 2022 00:37:14 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; bh=vSVLMh5EzNGUJ40EJ2Ice/4tnhgiMlgL0VD6CJI/898=; b=BNTfit3escZsuvkA2n1u3wHMbs0By3/2LhwKyuwgeCJSI+szuDcJy4xYUn9/IzsI9Eg7 Zhggxz3eEp2ourLZuTEtc5CG/aWTYY/XYwDK65RyqqPBeGGcJUkFWl1T8g8AGnwxySsi ESnAp/CMBzgcUFodp9WAmWKNF5EFxjLXD+NE3OcM0TmYaMhloQyTpA+upwzQgaI487sE 
nkb9MihNz4a5YzhSuBdji/tf1Z6lGYR7uLCkA3Hz7TI9xpRy92PxB3FQU/JyLdfTcZ1Q AGdwBMpD+OLWWjhev4OtwriNq0tzQotKtr2XxrKbADVifh7kMS8ZoWqMPL8Ela00i38h 8A== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by mx0b-00069f02.pphosted.com with ESMTP id 3dgkhx4sj4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 12 Jan 2022 00:37:13 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 20C0KTBM196414 for ; Wed, 12 Jan 2022 00:37:12 GMT Received: from nam12-dm6-obe.outbound.protection.outlook.com (mail-dm6nam12lp2170.outbound.protection.outlook.com [104.47.59.170]) by aserp3030.oracle.com with ESMTP id 3df0nervy9-12 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 12 Jan 2022 00:37:12 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=YDTCFGl8H+muZR8aCwndRSbuZg0gH+vNY6NcYLgMM7GbbNhnOd3WA/GjHhiEAx1G6gx7LVBQGiMAyrdOnbPO2qa0X/x2Bv75ZrYjWdFi8x9tHzhx6RKCg9oHuBayWBRgaxHdEr1IDMj1zDF6VG4NyvPQd6kNHUKsHl9FqMyoX27nVx14EwHZG8b33dfLkwA4lAh4Ca3rR3YM6JZ2JK1bYcVyEZrReFfqijrPArw8M5mZeQnlWckcEwWRt29FhWGy/ECVsj1wrcAC2SjhCWY+TUx39Xf3QNjB1k0SxVGp9JwsFcu23uHBloU5SkWLxmN1f01eFGggaX3JJ6dnFVNH3g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=vSVLMh5EzNGUJ40EJ2Ice/4tnhgiMlgL0VD6CJI/898=; b=I2EYYFTORTo0Hxj6K2FhjRT2gVV91UdMp8rnFz0L3WHDIX/bdKXjSqqlCxN6TBSzALHNwrApKW4Nh+oU9e+QLzvnlHSNjhqcJFkZrynvcsQo5sCtH15n6ckcq2yK6/TuYmAJA6JQ9ehc1v94F/wqhQuG6Qst4tLtM+uz4Tj+O4hCuPUfbC9bf3eGvDcClYszXLaLHHaAIVFAdFZyygexMRjnLleBa1NleyQxRfAmZRBE3XxxTha1Af2xQSTpu2ole05MZn/Legsk+dolpfzvCFO8qpuSqjNCLFMn8cQeDOSuO+3lXo2FambMPUjBRadHhlWRMpDbOPgwG9kfrRpZiw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; 
spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=vSVLMh5EzNGUJ40EJ2Ice/4tnhgiMlgL0VD6CJI/898=; b=hZyvq5Z68Lt4F0f5pFDl7sL+oMYPr7dRO48Al9jpJUHkKFi+LVaNthYzvypKgZJSSvKU/Dm2BJ7dFEp9h1bYmyqAunuhwSdiPnXX7GqTdmP21fs9mhrSroq1FtCHGLJrG+aiSojIrrgQYNR2FDmS3bJFF62uihjK6T1hcE9yxpM= Received: from PH0PR10MB4679.namprd10.prod.outlook.com (2603:10b6:510:3c::15) by PH0PR10MB4742.namprd10.prod.outlook.com (2603:10b6:510:3f::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4867.9; Wed, 12 Jan 2022 00:37:08 +0000 Received: from PH0PR10MB4679.namprd10.prod.outlook.com ([fe80::5536:dbc6:5161:ac1b]) by PH0PR10MB4679.namprd10.prod.outlook.com ([fe80::5536:dbc6:5161:ac1b%3]) with mapi id 15.20.4867.012; Wed, 12 Jan 2022 00:37:08 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v4 17/21] vfio-user: secure DMA support Date: Tue, 11 Jan 2022 16:43:53 -0800 Message-Id: <7c1122b4e50214e7d214e57616171a25a3aa2865.1641584317.git.john.g.johnson@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: BY5PR03CA0006.namprd03.prod.outlook.com (2603:10b6:a03:1e0::16) To PH0PR10MB4679.namprd10.prod.outlook.com (2603:10b6:510:3c::15) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: a59be387-ce9d-4fca-35f2-08d9d563aa26 X-MS-TrafficTypeDiagnostic: PH0PR10MB4742:EE_ X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:983; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
4lvlCrWPZzVXWab2rR0zfKi30gJthal4DtclogvYYCHf7+AIMGoUuDPe6LAIbxMMlvMCmGa5XZx6b/n2VsP9SdIwVFBVhNWoNqbMaQGvzPAwSQJqQTqoIN2PMv+/Tg8YYfbRwz51jfxpDvSppaTzZeWa91+UmQVvfFzVSum2r5QKB2HFwWmHmhTt4NOWl2IJ+qgpcC69lVIVJrHv4BYZHfV3DL3NUMsMq4gM5TXv0Js7O8CAyCMNAQq2TKR2+DHl0ZdE47knA1bhZUbbyuclQglwXCBn0AIoXAWzOrlaDgXo0QyfTwV2Jn8CjJmIPMxgEdbaIOirTlHj1YwZpkMSqEwOx9Lz/jfZ+nauUD9ouhpKpnhxqbYs1aMqE5kj89BLRtOCYgj256VeW3fxq7ZQ2ahVCOroQaNShW9WmYmtFB/GoB4C7j/s4y9FvzRHMG0P+Az80VjPiUuB4ei3gKYI0vhIFlbKeT0qWSrHrsm1/lZYLwHGUMPBHXebMmesTixHXyy4NE+U7bzdMJpP+e28qeQUpwQj9iOgJwJ5H5UGJxLA6wQoMyLeylzm/d8x+k/BXbwbq3isB8SUKnJsFqqzaZ+SQDoJ4iNe4MhFN4HXNU9zd8+vYcRV24xH6+e8mM8OXqzquA1X07zVyBnrwoX/CbLGH0pK9GVUcWNJJCy9vkDMBa325H3QrMKX5cn7Lz9wnwIMF7LokAQmDPj3LsK79Q== X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:PH0PR10MB4679.namprd10.prod.outlook.com; PTR:; CAT:NONE; SFS:(366004)(66946007)(66556008)(36756003)(508600001)(6486002)(6916009)(66476007)(52116002)(186003)(8676002)(2616005)(316002)(8936002)(6666004)(83380400001)(6512007)(5660300002)(38350700002)(86362001)(6506007)(2906002)(38100700002)(26005); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
Bcofh6sJgn3LiXCVk5GnNGDWtWyOT7t5BWqA/GMwM6Pf/3Oba1s+OxBViZDZSZ5bSeZM7xIPVvnGSgljMvNltDDd6oCErF6bRuQQzoK/OFIJAi2ePMsbBieYytSe7dJPd8AD+hyomKp9LD1YdKN7TH4POjh3egookTsuDqD4biBYS6hshfQRMjmCbWIQb3MLTZEJur1ijY4SZUg5/SnrOK4Wh+Q9wozvvP4+JfYbgM4oehWr8d7Hx65ghI9GcyO7OQoDrBIUBzYRipyEfSDn2DhTc9TsiDTlSGM89YFIWSrhpa+c6LgUcqmpnxbwq1o9eKiDHs/GmrbpcxyxbRq9Uj75XZmJ8ux7zwFiM9gNfVduiHPqoOAPQ57KTnGR6pqGZ2K2kl9pjujap6CcGf8eo81C931+eDILHqxRpKfECYGyiT1s7rRHs7vz16C+96enxBIT3xXD4uZSotZgbllKGIHWs9/86MVQ8FWkMAWt65WrDHBlDu9SNJbEE5SqSd+zz2f7OWpOk/o0kSFsYROOp25LwMzYpFA2jo+xdWt6fo7irdQNjepx5djXwpgr6xuCIsU+G3uvJ9Nb39C7xv2sRcbgdOQZZRxmNUWwWWrYOGVSsunnIOQF3CRFLGAglIKGJMf02jsKaZTiRC4YRl37u1KLemtPqQXyKHbNUo81k16F5xg3TaKAVFbFg65pAzrUSJg2Y7FO5BmpITqGXUOEXue6pTK76YWSCwm8MSX700HioncItY7yH3uXG1mMylEwTyqa3p12tW+uugGh5TDz0tpH/uVu2sjsu8ygYZkr9jT8ttZ4letEUU8vyh9ViO2s+oKtzo8P3motRl40nF7OyF3zepFmJHw1oqgVSXFVs70e1A3RyVqiyqJ0e1qT42fsyHfvjxN8Ih/bQX9iwKIIbztNfMsK7v9D1/3hFWS/FvQHe9zZnqAmEbFB1dDItIy/NWKiiY80BCQjjisiRSdWp+xL839yfnjjuQgdOoRNM9IH8T7o5gMBnOCoC4w3DHaeSN1dgd0FFTQUstO57Vle5RUlDt/7mmgHaBY6U3S7kj5U0gW4At8MTizRF6aPhsKXFmjmixW35Vmoc2u/06X4Of2wb8s2F0mjxBK7KhHBffjcdsgEOF7BECBMlmy0leEZf73yC7mmXEFVN5VSy6FtcnKutdsOIEnmAw5QNNRJTPuhPXtlTnJVH2FixqQtUm16fOVUjGwOT0UV4lEmj9KmGMRIDJFLQl4bVwUf6aV17mY5ddWoU6eYUBjYCIfkQJDEUMXb0jvokIYUDST/k0r5uI9WOt14GiPumVgoG0AH75PhhdJHQVBEkeF1wXtHN6Y/N06sBoz9c4FvL8FerCaO+nfpVFTg4Xowwt5xHSlqmX6zrKKD2zz/lFAUq4Jstbof/UQZ3D76cLjLvE1prT60HFp/fMqNfDsNzReWmm3gXnfF5zWxTT3yQwYkib+6hOiEKh7rrOvzrEc9oJfb2x2E5ZlVaLH4tVbhcaGTezIvjNpleh1HEPSc+ZDDjM/pas5oeXIZSeHSkcvo55sjXGYz1X1WVvuR3E4xKGytQ4i+pnaTuF45dg98ZziFP6/ptPpbaSynKjRdFUYZ7aC7cd841luJCixB3jsOdfVJcEMBISoe0+XTJRPaf2eN/N26ijU85F6SJbUhjXsGAKxhvKvCEA== X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: a59be387-ce9d-4fca-35f2-08d9d563aa26 X-MS-Exchange-CrossTenant-AuthSource: PH0PR10MB4679.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 12 Jan 2022 00:37:08.3682 (UTC) 
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: XOKF0fk4S9pNwDrRkcd0H/RwtlcI8cUKgW3XDj8grd+YqfW3WSJDNoNIzfqdy/heO5vYd6qHpQfqxchMosj8/Q/ajcslhRRnxc6uZzw5HlA= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4742 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10224 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 bulkscore=0 spamscore=0 phishscore=0 adultscore=0 suspectscore=0 mlxscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2201120000 X-Proofpoint-GUID: FeSGM3V9yrU92jV0IPP0yngUNADjx1PD X-Proofpoint-ORIG-GUID: FeSGM3V9yrU92jV0IPP0yngUNADjx1PD Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Secure DMA forces the remote process to use DMA r/w messages instead of directly mapping guest memory. 
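For illustration only (not part of the patch): a minimal sketch of the decision the hw/vfio/user.c hunk makes when mapping a DMA region. With the secure flag set, a region that has a backing fd is treated like an unmapped one, so map->vaddr becomes 0 rather than a file offset. The helper names below are hypothetical; only VFIO_PROXY_SECURE and its value come from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VFIO_PROXY_SECURE 0x2   /* flag value added to hw/vfio/user.h */

/*
 * Hypothetical helper: is a region backed by 'fd' offered to the
 * server as shared memory?  Mirrors the condition in the patch.
 */
static bool sketch_region_is_shared(int fd, uint32_t proxy_flags)
{
    return fd != -1 && (proxy_flags & VFIO_PROXY_SECURE) == 0;
}

/*
 * Hypothetical helper: value that ends up in map->vaddr.  A shared
 * region carries its file offset, anything else carries 0.
 */
static uint64_t sketch_map_vaddr(int fd, uint32_t proxy_flags,
                                 uint64_t file_offset)
{
    return sketch_region_is_shared(fd, proxy_flags) ? file_offset : 0;
}

/* Usage: the three cases the condition distinguishes. */
static void sketch_map_selftest(void)
{
    assert(sketch_map_vaddr(5, 0, 0x1000) == 0x1000);            /* shared */
    assert(sketch_map_vaddr(5, VFIO_PROXY_SECURE, 0x1000) == 0); /* secure */
    assert(sketch_map_vaddr(-1, 0, 0x1000) == 0);                /* no fd  */
}
```

The sketch covers only the vaddr computation visible in the hunk; how the rest of the map request is built is outside what this patch shows.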
Signed-off-by: John G Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman --- hw/vfio/pci.h | 1 + hw/vfio/user.h | 1 + hw/vfio/pci.c | 4 ++++ hw/vfio/user.c | 2 +- 4 files changed, 7 insertions(+), 1 deletion(-) diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h index 643ff75..156fee2 100644 --- a/hw/vfio/pci.h +++ b/hw/vfio/pci.h @@ -193,6 +193,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI) struct VFIOUserPCIDevice { VFIOPCIDevice device; char *sock_name; + bool secure_dma; /* disable shared mem for DMA */ bool send_queued; /* all sends are queued */ bool no_post; /* all regions write are sync */ }; diff --git a/hw/vfio/user.h b/hw/vfio/user.h index 8d03e7c..997f748 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -74,6 +74,7 @@ typedef struct VFIOProxy { /* VFIOProxy flags */ #define VFIO_PROXY_CLIENT 0x1 +#define VFIO_PROXY_SECURE 0x2 #define VFIO_PROXY_FORCE_QUEUED 0x4 #define VFIO_PROXY_NO_POST 0x8 diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 1fc79ef..b86acd1 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3483,6 +3483,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) vbasedev->proxy = proxy; vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev); + if (udev->secure_dma) { + proxy->flags |= VFIO_PROXY_SECURE; + } if (udev->send_queued) { proxy->flags |= VFIO_PROXY_FORCE_QUEUED; } @@ -3607,6 +3610,7 @@ static void vfio_user_instance_finalize(Object *obj) static Property vfio_user_pci_dev_properties[] = { DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name), + DEFINE_PROP_BOOL("secure-dma", VFIOUserPCIDevice, secure_dma, false), DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, false), DEFINE_PROP_BOOL("x-no-posted-writes", VFIOUserPCIDevice, no_post, false), DEFINE_PROP_END_OF_LIST(), diff --git a/hw/vfio/user.c b/hw/vfio/user.c index 5c27a5e..fb0165d 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -1441,7 +1441,7 @@ static int vfio_user_io_dma_map(VFIOContainer 
*container, MemoryRegion *mr, * map->vaddr enters as a QEMU process address * make it either a file offset for mapped areas or 0 */ - if (fd != -1) { + if (fd != -1 && (container->proxy->flags & VFIO_PROXY_SECURE) == 0) { void *addr = (void *)(uintptr_t)map->vaddr; map->vaddr = qemu_ram_block_host_offset(mr->ram_block, addr); From patchwork Wed Jan 12 00:43:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12710858 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 459F6C433EF for ; Wed, 12 Jan 2022 00:58:50 +0000 (UTC) Received: from localhost ([::1]:44266 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1n7RyT-0007xk-0c for qemu-devel@archiver.kernel.org; Tue, 11 Jan 2022 19:58:49 -0500 Received: from eggs.gnu.org ([209.51.188.92]:36794) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n7Rdp-00011g-6k for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:29 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:15474) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n7Rdl-0005hT-77 for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:28 -0500 Received: from pps.filterd (m0246632.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 20BMR5s6025174 for ; Wed, 12 Jan 2022 00:37:14 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; 
bh=ps7jn4StXnYAopaVhaQ95yDAJqOv8j9X1YO8HqF3lEI=; b=M8sOUkbNRIcgF2asP87xWaTwtTJEGNfXtBKCvqYYtl/dDjokaTpPgnZKrWl/l7t5aGB/ rTRlYjHf7fawN7l0hwoHbmBs3Gp8PXmPMK0TwnLApFOX1Lv4rFtryhJQdXcH8X+vB9ub E0weuMRP9Irl7iVPl+1WkJNKD/Irusa13ixEnoF9X/6yul9udUlMBzu04GOXJknNsKT4 rBtSuaok94LavLGpLd+/2CVGzVEdx9/iV186VQmzI1J9fEINwxTGHbp6BPH9JoP94VPz JmL0SKZbVuj7j5UmNrtY6ycbeNRYMLMsl256kjxFZlQNVFFBU5QfmXEDjjtOPnrA050D FA== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by mx0b-00069f02.pphosted.com with ESMTP id 3dgmk9crq9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 12 Jan 2022 00:37:14 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 20C0KTBO196414 for ; Wed, 12 Jan 2022 00:37:12 GMT Received: from nam12-dm6-obe.outbound.protection.outlook.com (mail-dm6nam12lp2170.outbound.protection.outlook.com [104.47.59.170]) by aserp3030.oracle.com with ESMTP id 3df0nervy9-13 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 12 Jan 2022 00:37:12 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=MsDzGQzW2zBEFZ2QPjk9S/nOlwapxRwWxW2935CfVQlK+8M0mE/AKuSCq5yXvW2mcgHkoteZc/2dmQmtRzhMBk3UznGp7sZqwIuURsCEyXW0M1aFfRfLNZcNQDiY4x+0x1sFgj2BqX9aJ3h2uLlri9NcpI1Z+VmRmWjtzumSnBCoQftGcD6E1PCTMf4TA0ibq0ie8Ix6lVhwhjIPfUAsYhnK3IDUJszG9zDKixRfStUsvbebBPr7oongxSXZnots58qG8fI4EM+/nFiIJQ7/tNELGneXPyM06UuDw8LXkN/fToJYEDT6AQbRuK0KWfQesc7oPs7oEbQ8mi74X0NAcA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ps7jn4StXnYAopaVhaQ95yDAJqOv8j9X1YO8HqF3lEI=; 
b=jWmw1uyFrXGiASWI8PpBDR2yXT0QvCX23/gg0+wEt2ZVWZMGyQmwS8x+IApsM8QeBBrN8Bxh8224/5rjPggv8iW96ukb8TcEd9EvLfTfQiF496ky3C2M2wpP9OvSWk5WOLPCvO5/qehpRFcV4fn3+KT3RPfRLkw9YpzwchvqixRGxkn2zLKxZsou08NwBZOD+zGJFBcP8LFhegPmywYqqQ9k+qrYIEr05NNVZwCTVY5o8RalYA9KJMpnAzI8gx2uK+UhAZsZaFTpTJ8H6o8+QP5DLpYIVrdDPOGNIynC8UAPZLqPsrPBrhve7RWFoOM2PIFjBYCX+645PHIuLjFcsw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ps7jn4StXnYAopaVhaQ95yDAJqOv8j9X1YO8HqF3lEI=; b=jlbXI6XpuA0f0sEtELTqmAf9KcrfQ2/FiIsM+Zh6cwtD3na9XVv19veHPiPnXSaUEOY7lr7j/7M0mya5QwkwdKGhr7xYlEDwn5nKsouTwS7kIhKqzM2qoeEa+/lmtBp0EBq5mh3Wf5+5EBmlOtA07YdNP5GocWcLkGyNYg0Ky2A= Received: from PH0PR10MB4679.namprd10.prod.outlook.com (2603:10b6:510:3c::15) by PH0PR10MB4742.namprd10.prod.outlook.com (2603:10b6:510:3f::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4867.9; Wed, 12 Jan 2022 00:37:08 +0000 Received: from PH0PR10MB4679.namprd10.prod.outlook.com ([fe80::5536:dbc6:5161:ac1b]) by PH0PR10MB4679.namprd10.prod.outlook.com ([fe80::5536:dbc6:5161:ac1b%3]) with mapi id 15.20.4867.012; Wed, 12 Jan 2022 00:37:08 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v4 18/21] vfio-user: dma read/write operations Date: Tue, 11 Jan 2022 16:43:54 -0800 Message-Id: X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: BY5PR03CA0006.namprd03.prod.outlook.com (2603:10b6:a03:1e0::16) To PH0PR10MB4679.namprd10.prod.outlook.com (2603:10b6:510:3c::15) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: b1789291-b3b7-42d2-805d-08d9d563aa53 X-MS-TrafficTypeDiagnostic: PH0PR10MB4742:EE_ 
X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:3826; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 2MW6AxRpCLOK07Ly8ewrJTD3re8jQBhlqquB5RD7aGDNzlB6ZzeT5SLG+WOYCV/NXXl7mz0LJwAsDIXwYVvuUS0sLRQo5mSCImqHmoZNl700D6n88UOJjRAo92Vw1uHLEhK1XiXGl/pVz9zKMer0ER5FVnOjeGBt5fxULW6j5kbDL6aqXPsiJXFKxD+OYI9vXr4+asbf4nHEN2FKmmSTq7eSllqddCu3FfMNtgIZ1cju9eupkfOZNp1ZVShlT7hetnftJ4UGC/QxzqGnN1TddsykE2EJnXGF0LFVmXpqKrCk66lsa12TqQDmggmUhNeeb1jtiPxxAomSSJBkJ337MU+3hlXyX5LnMUJC2Yol0fj+aC+NrlYgx27FMEWNHAPPSii26rW1t6HepzGOX3y4ysRM2YKLyuuCPq0opVWGEBkfcSdbYMuCnXdjByLZH57pmx8Xh4thqZQXYgxvLbhIDC0yBguYjz9XQfy4qgUEonU0c5InOmnAwrPWvteYJ3vLz5ZcK2nt+qY6/Adu57EFjHKPB13gVM5UT9s7OHZFfm2ZJpoHPl5vO+ezYA4afG8v0ZSKHBH8h27JmxlQB5UZ44SvISfarmb45u1z5Wl7+zLWttM9iNL65Udfn67fXW5NdYW+3YBC/atraP+JGruSIkBYkQjksf/pWrR4nMHiwcJiz8qeMEDFZyIy3L+5HwWd X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:PH0PR10MB4679.namprd10.prod.outlook.com; PTR:; CAT:NONE; SFS:(366004)(66946007)(66556008)(36756003)(508600001)(6486002)(6916009)(66476007)(52116002)(186003)(8676002)(2616005)(316002)(8936002)(6666004)(83380400001)(6512007)(5660300002)(38350700002)(86362001)(6506007)(2906002)(38100700002)(26005); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
IkXPudi5X6rZVoSuRKsKhuZNnoh7dplF9nyi43uaO7ZhEdHPL32d6njIpRtXSy3gUgx0Cl+qq7k9niMK8usdn/k+0gL43GE7JS+AiNU/YDU0oQCEgl1JHLcBMnWgklhYrczo/j7pfgNk/PS7fMLhBGIyipcdFmFVsxKY4bWiTglVZcXhAoNfKH63jbj1ZjHb4uboBL17rWM4MemN7lSvkBKmfMF4NCaPFPwyUMoLjsQcMYKAvqeo9Q6pckanrBWwCu2GTBIe8Ah817Z6jFFSGfBor/17ngCMmYTUXbjliRb0m3gIE3OyQrUoXVMkS2rAhB+xkPBYaEQ9+dEtJbNs5gioFz54awc4R3yPtmXbuTqdjMOdDlbKuWFR08od6YWEFWOHgdsHxvpy61YWiSlbT+E87I6TtDy4pSKsv7rL61omPj5koJBD3VLV3rrws1nMIdW9IX+MIb5ouzmkJADhoXmQmSjFZqumZFLOPuBf8ceHjuVV6zEsjiBEdsBSH2zCF3THjUS2w3KbLvjquIGY2uc7mx6pMoeb6Bw6GPrvqkjGRLAPOyM2DbqW6+V4DGQV5Q0Tp6Mf5O0K5d/RhcopAcTQwkl9RHDVMRd1XeYsR4MUte/Ma0+Edt9oUIFs/kAkmUvdAFOiGzAwKTD6Ni9zJzDWv18ovTFqORSaZwrtcpztDw+A9QGEB3M/SQLlxeFFLVrtptZvxcDSgyFa1saQ0wxkuBTPmIawVdNJBRJ6tYHjU/UY4w6Hr5cZ6GFrNz1dySOG5mbBjrhJZvGgbuiovMHyw//6oal2wWBdTNpZQmGRQT92hfVQVakrvoPHHH3fkyLAGFUy1rFhTfCgbOHDILp5ZlN/xt7Swg5D1g7p7m9iWuZ32nRW8ysjIkhko97qGg5mc18FHqV27nDEmhM6EHUolZHI23ZXxbhDSGppGoasVcEiddxUKXbNnRDNUCKAPqSTUFh7TISBHvZpqP1fNVbjSV6ZmYH7NSdDXOOz+hKbtdX7p/gptdZ15lWFm99hWi0DF/zSfuYic/SBk5KSkXlmuaeIZFk0sJzjfime9SKlpY102UrWNa37bYkwe/aFX/FX3ag5SYuNIUVHBoGMoJFsvox0yfyTXs5bamgvfT7AjKtVRDE/HHh4UfLdDG71+RtAwxarun0sLf7w5ob+b89tmTgk/0Px0YOkAAruNK6eW95bec760GQG2m1NIPocOmfSgvzIAStLzfGbBtuftB/w5Miz9zPvdVHGneCKAdlX0m9wynbIdWidka8WS60ws3C2XmgpMM7zwvbx2NmD+XWQhYKJmGdXWPKYtkpMJLhisrcsCnq4iC9OccZQEmZoMuz7FpbwoiRP4rQCzrw+9J+WlGE+i/PI13kqDMwTI8FDMCTmb2AjZT0cQXnC+nxlYIsbR1u6SbMT92rERMFAO2FXQaVoKg86zoeJZJ0tPosyNhVrSCovc6aTkNLYO0j9DWm3gwRiCmBSmkAI0krGxUiFjw5mQF6hWs7St1cxo2hidtJV/tDwxYhqOjhrbDql3+w98uy40NYe/uDGeB8kAhDa0eQbZutpiAbs1xCfnwrO94RxsOuNCnsSvh5adY4GOFzkBXESROuEua9UHjxx5Nx8KTu24wMS4XU8Y67yKkH/V5Z6BBdhvEW9+OgB957waQvwkftYq7g4lrsqJX51kQ== X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: b1789291-b3b7-42d2-805d-08d9d563aa53 X-MS-Exchange-CrossTenant-AuthSource: PH0PR10MB4679.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 12 Jan 2022 00:37:08.6807 (UTC) 
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: JfbR50IdElC7OSunlj9GL3ftAtK4YFQ5Y4mueck7VaGY7qLpESTEaoGEZjXbYNVw0VvpGyB9Obyv3Cf7DbjQWCeT82o8bmi2hfRQr2CSrus= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4742 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10224 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 bulkscore=0 spamscore=0 phishscore=0 adultscore=0 suspectscore=0 mlxscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2201120000 X-Proofpoint-GUID: rCt2lCwpY71ymMS1F00A6w6O63OfTwPS X-Proofpoint-ORIG-GUID: rCt2lCwpY71ymMS1F00A6w6O63OfTwPS Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Messages from server to client that perform device DMA. 
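For illustration only (not part of the patch): a sketch of the length checks the DMA read and write handlers apply before touching guest memory. The struct definitions are simplified stand-ins for VFIOUserHdr and VFIOUserDMARW, and the sketch_* names are hypothetical. The errno values match the ones the handlers pass to vfio_user_send_error().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified stand-in for VFIOUserHdr (layout assumed for the sketch). */
typedef struct {
    uint16_t id;
    uint16_t command;
    uint32_t size;          /* total message size, header included */
    uint32_t flags;
    uint32_t error_reply;
} SketchHdr;

/* Simplified stand-in for VFIOUserDMARW. */
typedef struct {
    SketchHdr hdr;
    uint64_t offset;        /* DMA address */
    uint32_t count;         /* bytes to transfer */
    char data[];            /* payload, present in write requests */
} SketchDMARW;

/* Write: the payload must fit inside the message that was received. */
static int sketch_check_dma_write(uint32_t hdr_size, uint32_t count)
{
    if (hdr_size < sizeof(SketchDMARW)) {
        return EINVAL;      /* message too short to hold the request */
    }
    if (count > hdr_size - sizeof(SketchDMARW)) {
        return E2BIG;       /* count exceeds the data actually sent */
    }
    return 0;
}

/* Read: the reply must not exceed the negotiated transfer limit. */
static int sketch_check_dma_read(uint32_t hdr_size, uint32_t count,
                                 uint64_t max_xfer)
{
    if (hdr_size < sizeof(SketchDMARW)) {
        return EINVAL;
    }
    if (count > max_xfer) {
        return E2BIG;
    }
    return 0;
}

/* Usage: one accepted and one rejected request of each kind. */
static void sketch_dma_selftest(void)
{
    uint32_t base = (uint32_t)sizeof(SketchDMARW);

    assert(sketch_check_dma_write(base + 8, 8) == 0);
    assert(sketch_check_dma_write(base + 8, 9) == E2BIG);
    assert(sketch_check_dma_read(base, 512, 1024) == 0);
    assert(sketch_check_dma_read(base, 4096, 1024) == E2BIG);
}
```

Checking the header size first is what makes the later subtraction safe: hdr_size - sizeof(SketchDMARW) cannot wrap once the first test has passed.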
Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- hw/vfio/user-protocol.h | 11 +++++ hw/vfio/user.h | 4 ++ hw/vfio/pci.c | 105 ++++++++++++++++++++++++++++++++++++++++++++++++ hw/vfio/user.c | 60 ++++++++++++++++++++++++++- 4 files changed, 179 insertions(+), 1 deletion(-) diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index ad63f21..8932311 100644 --- a/hw/vfio/user-protocol.h +++ b/hw/vfio/user-protocol.h @@ -182,6 +182,17 @@ typedef struct { char data[]; } VFIOUserRegionRW; +/* + * VFIO_USER_DMA_READ + * VFIO_USER_DMA_WRITE + */ +typedef struct { + VFIOUserHdr hdr; + uint64_t offset; + uint32_t count; + char data[]; +} VFIOUserDMARW; + /*imported from struct vfio_bitmap */ typedef struct { uint64_t pgsize; diff --git a/hw/vfio/user.h b/hw/vfio/user.h index 997f748..e6c1091 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -80,9 +80,13 @@ typedef struct VFIOProxy { VFIOProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp); void vfio_user_disconnect(VFIOProxy *proxy); +uint64_t vfio_user_max_xfer(void); void vfio_user_set_handler(VFIODevice *vbasedev, void (*handler)(void *opaque, VFIOUserMsg *msg), void *reqarg); +void vfio_user_send_reply(VFIOProxy *proxy, VFIOUserHdr *hdr, int size); +void vfio_user_send_error(VFIOProxy *proxy, VFIOUserHdr *hdr, int error); +void vfio_user_putfds(VFIOUserMsg *msg); int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp); extern VFIODevIO vfio_dev_io_sock; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index b86acd1..7479dc4 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3427,11 +3427,116 @@ type_init(register_vfio_pci_dev_type) * vfio-user routines. 
*/ +static void vfio_user_dma_read(VFIOPCIDevice *vdev, VFIOUserDMARW *msg) +{ + PCIDevice *pdev = &vdev->pdev; + VFIOProxy *proxy = vdev->vbasedev.proxy; + VFIOUserDMARW *res; + MemTxResult r; + size_t size; + + if (msg->hdr.size < sizeof(*msg)) { + vfio_user_send_error(proxy, &msg->hdr, EINVAL); + return; + } + if (msg->count > vfio_user_max_xfer()) { + vfio_user_send_error(proxy, &msg->hdr, E2BIG); + return; + } + + /* switch to our own message buffer */ + size = msg->count + sizeof(VFIOUserDMARW); + res = g_malloc0(size); + memcpy(res, msg, sizeof(*res)); + g_free(msg); + + r = pci_dma_read(pdev, res->offset, &res->data, res->count); + + switch (r) { + case MEMTX_OK: + if (res->hdr.flags & VFIO_USER_NO_REPLY) { + g_free(res); + return; + } + vfio_user_send_reply(proxy, &res->hdr, size); + break; + case MEMTX_ERROR: + vfio_user_send_error(proxy, &res->hdr, EFAULT); + break; + case MEMTX_DECODE_ERROR: + vfio_user_send_error(proxy, &res->hdr, ENODEV); + break; + } +} + +static void vfio_user_dma_write(VFIOPCIDevice *vdev, VFIOUserDMARW *msg) +{ + PCIDevice *pdev = &vdev->pdev; + VFIOProxy *proxy = vdev->vbasedev.proxy; + MemTxResult r; + + if (msg->hdr.size < sizeof(*msg)) { + vfio_user_send_error(proxy, &msg->hdr, EINVAL); + return; + } + /* make sure transfer count isn't larger than the message data */ + if (msg->count > msg->hdr.size - sizeof(*msg)) { + vfio_user_send_error(proxy, &msg->hdr, E2BIG); + return; + } + + r = pci_dma_write(pdev, msg->offset, &msg->data, msg->count); + + switch (r) { + case MEMTX_OK: + if ((msg->hdr.flags & VFIO_USER_NO_REPLY) == 0) { + vfio_user_send_reply(proxy, &msg->hdr, sizeof(msg->hdr)); + } else { + g_free(msg); + } + break; + case MEMTX_ERROR: + vfio_user_send_error(proxy, &msg->hdr, EFAULT); + break; + case MEMTX_DECODE_ERROR: + vfio_user_send_error(proxy, &msg->hdr, ENODEV); + break; + } + + return; +} + +/* + * Incoming request message callback. + * + * Runs off main loop, so BQL held. 
+ */ static void vfio_user_pci_process_req(void *opaque, VFIOUserMsg *msg) { + VFIOPCIDevice *vdev = opaque; + VFIOUserHdr *hdr = msg->hdr; + + /* no incoming PCI requests pass FDs */ + if (msg->fds != NULL) { + vfio_user_send_error(vdev->vbasedev.proxy, hdr, EINVAL); + vfio_user_putfds(msg); + return; + } + switch (hdr->command) { + case VFIO_USER_DMA_READ: + vfio_user_dma_read(vdev, (VFIOUserDMARW *)hdr); + break; + case VFIO_USER_DMA_WRITE: + vfio_user_dma_write(vdev, (VFIOUserDMARW *)hdr); + break; + default: + error_printf("vfio_user_process_req unknown cmd %d\n", hdr->command); + vfio_user_send_error(vdev->vbasedev.proxy, hdr, ENOSYS); + } } + /* * Emulated devices don't use host hot reset */ diff --git a/hw/vfio/user.c b/hw/vfio/user.c index fb0165d..e377b0f 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -70,6 +70,11 @@ static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err) * Functions called by main, CPU, or iothread threads */ +uint64_t vfio_user_max_xfer(void) +{ + return max_xfer_size; +} + static void vfio_user_shutdown(VFIOProxy *proxy) { qio_channel_shutdown(proxy->ioc, QIO_CHANNEL_SHUTDOWN_READ, NULL); @@ -283,7 +288,7 @@ static int vfio_user_recv_one(VFIOProxy *proxy) *msg->hdr = hdr; data = (char *)msg->hdr + sizeof(hdr); } else { - if (hdr.size > max_xfer_size) { + if (hdr.size > max_xfer_size + sizeof(VFIOUserDMARW)) { error_setg(&local_err, "vfio_user_recv request larger than max"); goto err; } @@ -696,6 +701,59 @@ static void vfio_user_wait_reqs(VFIOProxy *proxy) } } +/* + * Reply to an incoming request. + */ +void vfio_user_send_reply(VFIOProxy *proxy, VFIOUserHdr *hdr, int size) +{ + + if (size < sizeof(VFIOUserHdr)) { + error_printf("vfio_user_send_reply - size too small\n"); + g_free(hdr); + return; + } + + /* + * convert header to associated reply + */ + hdr->flags = VFIO_USER_REPLY; + hdr->size = size; + + vfio_user_send_async(proxy, hdr, NULL); +} + +/* + * Send an error reply to an incoming request. 
+ */ +void vfio_user_send_error(VFIOProxy *proxy, VFIOUserHdr *hdr, int error) +{ + + /* + * convert header to associated reply + */ + hdr->flags = VFIO_USER_REPLY; + hdr->flags |= VFIO_USER_ERROR; + hdr->error_reply = error; + hdr->size = sizeof(*hdr); + + vfio_user_send_async(proxy, hdr, NULL); +} + +/* + * Close FDs erroneously received in an incoming request. + */ +void vfio_user_putfds(VFIOUserMsg *msg) +{ + VFIOUserFDs *fds = msg->fds; + int i; + + for (i = 0; i < fds->recv_fds; i++) { + close(fds->fds[i]); + } + g_free(fds); + msg->fds = NULL; +} + static QLIST_HEAD(, VFIOProxy) vfio_user_sockets = QLIST_HEAD_INITIALIZER(vfio_user_sockets); From patchwork Wed Jan 12 00:43:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12710862 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2AEDEC433F5 for ; Wed, 12 Jan 2022 01:03:12 +0000 (UTC) Received: from localhost ([::1]:52990 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1n7S2h-0006DY-5S for qemu-devel@archiver.kernel.org; Tue, 11 Jan 2022 20:03:11 -0500 Received: from eggs.gnu.org ([209.51.188.92]:36864) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n7Re2-0001HU-LM for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:42 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:25524) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n7Rdt-0005ig-1s for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:35 -0500 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by 
mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 20BMrG6k019928 for ; Wed, 12 Jan 2022 00:37:22 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; bh=RlSrKToJ34Fs94Ln5Fpnva0xUkyZ6NjfPsrgWhRgs7g=; b=Va5CASmDtC3gV1IkHPhseZvdhtlSiN4Lju42dORf35Ki+LalFGR1ip0ckRXKQIcwCRBG a6uEKlNAgj+UkqFwEBoL705GYHT2XKS3lyA13RL6TPEImXIupau+phkJfE0adRM3YpOV Qnj3Xhm8yF4uWT5om+Twez2GNZdnWK4W08th8NEVRkF1Ga8MgheW2fL6odKJjN2XTGeN mKF22GGzm68nj/2c9tA6ehQ2ilLfTvPmQp5xAaQEr5myfytVYDSpXNEYO30IXqiXZ9JY ppDfb4NB6XNQ434GlRNI7686EJ7S2EMCJhBJYXHOVffOqj6vKcA+/iqe98olJUQUwqFs Qw== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by mx0b-00069f02.pphosted.com with ESMTP id 3dgkhx4sju-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 12 Jan 2022 00:37:21 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 20C0KAjF069368 for ; Wed, 12 Jan 2022 00:37:20 GMT Received: from nam10-mw2-obe.outbound.protection.outlook.com (mail-mw2nam10lp2105.outbound.protection.outlook.com [104.47.55.105]) by userp3030.oracle.com with ESMTP id 3deyqy1gqq-4 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 12 Jan 2022 00:37:20 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=kR2rHvOOl9h+zrI+k7WNfIdnmaO1kK7FPpefBFRE5BVC74YL5pBPgRb9zSCd7VkJMKt06eFgNwei9lKqhnSULuNHiEOcnMD81v2ae+C4sA6CfCeVLdJa22COsI+icXfcZV9yb1TTYUuUyWfXYSG15IY1Ig8yORH68a4yRtqlCIJqh0dtDmSxG5crsjxMN70aFm9Fsx6By5m6WnDBL4FlCFdjXqz9oiPkczRz06hTAxgYh4IICxuKhyt5aUMxjz8s4bDjmHw9H4iTtRC7LMFaIzjBx8mr5+WowRBku5r8MMOGxuImENa0P7fsPCmYCUNBToq5MPLfowwZHgCp+CZmFw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=RlSrKToJ34Fs94Ln5Fpnva0xUkyZ6NjfPsrgWhRgs7g=; b=fvvuwWpzeLz/enRE9zRJrhSZZk/F0vJxfNB5xlW86CjyL8QSDM+b5v1WhEpjGrNrePtFWbsg238YmZqlgFynMSPFbAtFAqvUQvluOQ6s2b+Lfr1SKE2dOy6V+msmViurk62vjw8eoIEH0ncfNuy92Yh3MNxNwPny0O2iMH/JI8CkLD1fzkDXbrfCrd13DpHVkJaqQfGTadh7MGXpsEfDvT9BrW7EWuzr/Cf5BUrvIVmRW2IL2IYaq5WjqoSdqU6R2qVzTcFhW5HZhb/XNH7KB4oVPKGZ+c7EjmdRhgwxgrkhONHDFLXOA0j8Z26jIeqK1GKhxcXwKzxkeR3qEK6Ttg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=RlSrKToJ34Fs94Ln5Fpnva0xUkyZ6NjfPsrgWhRgs7g=; b=zmNQ00+Zn1UfuqLqNlXgqjg+lbaiM/47M48GJovKojQUAtFcurbpu12a7J5JDwtZuKtvXjm+6dHhmZ6T51Ekgu+9cJVjjDGAIXNLYb1Tth+wdkcHN/ZqQgRrHv67RjvCKhzTvJrEsCbcjMamEHpvf4HmbfO/he/8Y5HS3rSdr7Q= Received: from PH0PR10MB4679.namprd10.prod.outlook.com (2603:10b6:510:3c::15) by PH0PR10MB4742.namprd10.prod.outlook.com (2603:10b6:510:3f::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4867.9; Wed, 12 Jan 2022 00:37:09 +0000 Received: from PH0PR10MB4679.namprd10.prod.outlook.com ([fe80::5536:dbc6:5161:ac1b]) by PH0PR10MB4679.namprd10.prod.outlook.com ([fe80::5536:dbc6:5161:ac1b%3]) with mapi id 15.20.4867.012; Wed, 12 Jan 2022 00:37:09 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v4 19/21] vfio-user: pci reset Date: Tue, 11 Jan 2022 16:43:55 -0800 Message-Id: X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: BY5PR03CA0006.namprd03.prod.outlook.com (2603:10b6:a03:1e0::16) To 
PH0PR10MB4679.namprd10.prod.outlook.com (2603:10b6:510:3c::15) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 2168e1f3-6fe0-4991-2d04-08d9d563aa83 X-MS-TrafficTypeDiagnostic: PH0PR10MB4742:EE_ X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:1850; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: LJJm/0/JL6zOHIefUW3R7cn6GcYfZrGkUAHKbeauTT7LAdTHnIQh0jJR4856XZkl7IFQsBl0QWFiSYsEQWE7Dzyqc11AjiP07ydFrOHw1R23s9EWwwKN5Yh3cOWxxOaI1xd8oqg6v0YLt6VTd//fl1EbRSDD/xIGOrOj5jVwqp8DSJCCBPUHZ95q0Y4DAWTzKC76FZz41ED04XrVPqqUra+BmWUQtncaiFGtxBHxnYTI5LEyWlJc2d1Y9CWVrhg5Bz0dd3b0H4kilZYwnnVsXhxpdQFndqwtzIihFD9NWJytPPpTR285Oq6PnbBi7CONK6N5GE74Sy2jtZsBMoAJ+Vudsp67SNYET13bn1fJ08MIw153jQopHOibA9r3fahZg1EgY6BLo5NZn/Z8XWgSx0QDuRVcfQKPnX3RZocV3ELZ7JA5NvK6OknqhQhM+tU1MNmRJtfgTKmksN5eQoiDjjdOLs9kJYSfZ5n1sMXibmcMIW+dG+gtVK1AhFUTM/5FkEBCjQ/5p7Tk/1KklSyTGiie1ONyNlgMVFMxqKmW2FOD2AG7lYUuDEYMLw9xWlY176+jg9DXLoUmD3ilLPKSwkjlixcNXTGhOQLEtROyKWrRUmZKpUnFieYS+0H8gfoOMn4pUWnQVd2Ew+9VocUN3dkEN+Nzq9Dq13gfQvDXUgvCI0n07pmQrLgeassAkPl4BRCmy2ebEtLZsCQj7JEFyw== X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:PH0PR10MB4679.namprd10.prod.outlook.com; PTR:; CAT:NONE; SFS:(366004)(66946007)(66556008)(36756003)(508600001)(6486002)(6916009)(66476007)(52116002)(186003)(8676002)(2616005)(316002)(8936002)(6666004)(83380400001)(6512007)(5660300002)(38350700002)(86362001)(6506007)(2906002)(38100700002)(26005); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 3wG1P4JEeQo26q7fjRZtdc0aW5ucsp30Ruwe928NMiVN8zP7zjz61pr+Sci/tkBSlGOaiux/gZk38KNixSz2sojgYOj4J4eGm7YQyp3mN7M= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4742 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10224 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxscore=0 suspectscore=0 spamscore=0 mlxlogscore=999 adultscore=0 phishscore=0 malwarescore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2201120000 X-Proofpoint-GUID: 0Yzw2T10qHItDMPisvWjwujBdV2618BK X-Proofpoint-ORIG-GUID: 0Yzw2T10qHItDMPisvWjwujBdV2618BK Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Message to tell the server to reset the device. 
Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- hw/vfio/user.h | 1 + hw/vfio/pci.c | 15 +++++++++++++++ hw/vfio/user.c | 12 ++++++++++++ 3 files changed, 28 insertions(+) diff --git a/hw/vfio/user.h b/hw/vfio/user.h index e6c1091..7504681 100644 --- a/hw/vfio/user.h +++ b/hw/vfio/user.h @@ -88,6 +88,7 @@ void vfio_user_send_reply(VFIOProxy *proxy, VFIOUserHdr *hdr, int size); void vfio_user_send_error(VFIOProxy *proxy, VFIOUserHdr *hdr, int error); void vfio_user_putfds(VFIOUserMsg *msg); int vfio_user_validate_version(VFIODevice *vbasedev, Error **errp); +void vfio_user_reset(VFIOProxy *proxy); extern VFIODevIO vfio_dev_io_sock; extern VFIOContIO vfio_cont_io_sock; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 7479dc4..d47b98e 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3713,6 +3713,20 @@ static void vfio_user_instance_finalize(Object *obj) } } +static void vfio_user_pci_reset(DeviceState *dev) +{ + VFIOPCIDevice *vdev = VFIO_PCI_BASE(dev); + VFIODevice *vbasedev = &vdev->vbasedev; + + vfio_pci_pre_reset(vdev); + + if (vbasedev->reset_works) { + vfio_user_reset(vbasedev->proxy); + } + + vfio_pci_post_reset(vdev); +} + static Property vfio_user_pci_dev_properties[] = { DEFINE_PROP_STRING("socket", VFIOUserPCIDevice, sock_name), DEFINE_PROP_BOOL("secure-dma", VFIOUserPCIDevice, secure_dma, false), @@ -3726,6 +3740,7 @@ static void vfio_user_pci_dev_class_init(ObjectClass *klass, void *data) DeviceClass *dc = DEVICE_CLASS(klass); PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass); + dc->reset = vfio_user_pci_reset; device_class_set_props(dc, vfio_user_pci_dev_properties); dc->desc = "VFIO over socket PCI device assignment"; pdc->realize = vfio_user_pci_realize; diff --git a/hw/vfio/user.c b/hw/vfio/user.c index e377b0f..33d8f06 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -1398,6 +1398,18 @@ static int vfio_user_region_write(VFIOProxy *proxy, uint8_t index, off_t offset, return ret; } +void 
vfio_user_reset(VFIOProxy *proxy) +{ + VFIOUserHdr msg; + + vfio_user_request_msg(&msg, VFIO_USER_DEVICE_RESET, sizeof(msg), 0); + + vfio_user_send_wait(proxy, &msg, NULL, 0, false); + if (msg.flags & VFIO_USER_ERROR) { + error_printf("reset reply error %d\n", msg.error_reply); + } +} + /* * Socket-based io_ops From patchwork Wed Jan 12 00:43:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12710871 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 90F51C433EF for ; Wed, 12 Jan 2022 01:27:35 +0000 (UTC) Received: from localhost ([::1]:51282 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1n7SQI-0001Dx-Gl for qemu-devel@archiver.kernel.org; Tue, 11 Jan 2022 20:27:34 -0500 Received: from eggs.gnu.org ([209.51.188.92]:36852) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n7Rdu-0001DB-Or for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:34 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:24394) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n7Rds-0005iW-2M for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:34 -0500 Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 20BMTaJr005856 for ; Wed, 12 Jan 2022 00:37:21 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; bh=u1pvYhGl0jgZOCVfiBvl/TtqhtLgfVL0CU92XSG5gV8=; 
b=x2vlxq0azyk7d3j9FYVDUHC0esDl5tiATiR8tuHtYrQhBKD/Rkyq/lueCFJhuiFUJk52 If19OdJiH5V9+n2dBHH7z+MdFJXQvRtcGBDUhcRHRmc0zPj5n/Ik/ACuv+ORjtlMEGix OGrMf1srGJ9mbhECqu5sHLBaaL93iZzBKNf9nKvPXbUhmFv242ludKWdcs+lRD5h5Iv1 obTAjFzOOBvFzdWc+JIQpx687uwLPLGrumfRMN7P/BdiU5dJqJI6cTO5kwWxCnuhrwLi ZOEXuQxorcBGkPvl2AyOVz6AlxYN1/RIS24L+sDNlRg3rXALnyDxUwvo8zFehT85+Gnv pQ== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by mx0b-00069f02.pphosted.com with ESMTP id 3dgjtgd1w2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 12 Jan 2022 00:37:20 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 20C0KAjC069368 for ; Wed, 12 Jan 2022 00:37:19 GMT Received: from nam10-mw2-obe.outbound.protection.outlook.com (mail-mw2nam10lp2105.outbound.protection.outlook.com [104.47.55.105]) by userp3030.oracle.com with ESMTP id 3deyqy1gqq-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 12 Jan 2022 00:37:19 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Lk0sA1w7u/px0EdOpiBzPN5xhayzA4KwoRShIm/kz8yPZKN/adIoy9kN0vn96kBuZalirHryYypVxm6qvw/pAsWnB+Gg2f0gq6A9r1pn4O82A+1rBsJ+rGx5qvqA0vtkZnT6B0DA4fJDwCqIJTyR/VAdpRC43gIZqPT93dybzKP69J9Fq0GuNM/89+if2spNGtbB89mRzNPRjKXOxlbH9No5OUxSM0kcQHi6rSTJ69/5yDkhgtG3RSdHw2w+y8h1rdOtYZNZbjZh1pE6fSmoWAOnyUBuMrgbZ4fnaDuMvC90lN/Lhyu7POTJQNKUGtJSt4UQnZ4J0NuppLO/XzKjJQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=u1pvYhGl0jgZOCVfiBvl/TtqhtLgfVL0CU92XSG5gV8=; 
b=OF5h3Tp2LiwHQ2YbaGdpfQL5iIvnlB5GrTW/iSelDYI0iE5mlulR5ZkiB63uGEwVCBJ9K0iYv3jVzjQIUL7A73HIUsUKzPdoO0goYkE5btTIotzcjJvQsqp7NJDS05iSkzgr1qbU2t7qx98a7r3MwL6raa81Vag3h9uAMaiZV8JHruOaEHGW+MSZByJDxmP0nt6RX4zzYHkIJCIZtHkpuswC/MfKXmaaseWWokw2lqGDZAXT5XE0n1+JEju42D2+2YBljEdUK9IocWeN/QcwI0FZDXMkAllZsQuHFqS3uzdLIur/STcl8DHP5enWrUp6wJj6wDsVhhsufojQUIpARg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=u1pvYhGl0jgZOCVfiBvl/TtqhtLgfVL0CU92XSG5gV8=; b=QoCmOgCxFvw88IaNLNm8lxkKL0sH/D6hO9tPYaFZi7W1/MZM/Exg2VnhnEir1eHz+s8J8Jog9rCYQUhsYyO4E+g6iuahi6cmjVc9Es5DMypvzr/+muH8nlYfnMjjMJQo/B71ASSU2Qneo4H3HrEildWo17KLh6mbdQdGC8NRAYg= Received: from PH0PR10MB4679.namprd10.prod.outlook.com (2603:10b6:510:3c::15) by PH0PR10MB4742.namprd10.prod.outlook.com (2603:10b6:510:3f::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4867.9; Wed, 12 Jan 2022 00:37:09 +0000 Received: from PH0PR10MB4679.namprd10.prod.outlook.com ([fe80::5536:dbc6:5161:ac1b]) by PH0PR10MB4679.namprd10.prod.outlook.com ([fe80::5536:dbc6:5161:ac1b%3]) with mapi id 15.20.4867.012; Wed, 12 Jan 2022 00:37:09 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v4 20/21] vfio-user: migration support Date: Tue, 11 Jan 2022 16:43:56 -0800 Message-Id: <27b636c6c861e0a05278e2d1cbf07d3adec2d505.1641584317.git.john.g.johnson@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: BY5PR03CA0006.namprd03.prod.outlook.com (2603:10b6:a03:1e0::16) To PH0PR10MB4679.namprd10.prod.outlook.com (2603:10b6:510:3c::15) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: QM/EC0Yn+f7A64rpaCS4tBiKqO0nJuinnRRAqa2Zx5cti26gpcDk1GIaRp1P3TAjtrivgO0IW6oiARIFN+T8lA26TIUrhQJCJh3SuVE3Yeg= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4742 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10224 signatures=668683 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxscore=0 suspectscore=0 spamscore=0 mlxlogscore=999 adultscore=0 phishscore=0 malwarescore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2201120000 X-Proofpoint-GUID: RqVx63hvy0CwGwmzZA560mYX7WXUAOab X-Proofpoint-ORIG-GUID: RqVx63hvy0CwGwmzZA560mYX7WXUAOab Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: John G Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman --- hw/vfio/user-protocol.h | 18 +++++++++++++++++ hw/vfio/migration.c | 30 +++++++++++++-------------- hw/vfio/pci.c | 7 +++++++ hw/vfio/user.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 93 insertions(+), 16 deletions(-) diff --git a/hw/vfio/user-protocol.h b/hw/vfio/user-protocol.h index 8932311..abe7002 100644 --- 
a/hw/vfio/user-protocol.h +++ b/hw/vfio/user-protocol.h @@ -193,6 +193,10 @@ typedef struct { char data[]; } VFIOUserDMARW; +/* + * VFIO_USER_DIRTY_PAGES + */ + /*imported from struct vfio_bitmap */ typedef struct { uint64_t pgsize; @@ -200,4 +204,18 @@ typedef struct { char data[]; } VFIOUserBitmap; +/* imported from struct vfio_iommu_type1_dirty_bitmap_get */ +typedef struct { + uint64_t iova; + uint64_t size; + VFIOUserBitmap bitmap; +} VFIOUserBitmapRange; + +/* imported from struct vfio_iommu_type1_dirty_bitmap */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; +} VFIOUserDirtyPages; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index ff6b45d..df63f5c 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -27,6 +27,7 @@ #include "pci.h" #include "trace.h" #include "hw/hw.h" +#include "user.h" /* * Flags to be used as unique delimiters for VFIO devices in the migration @@ -49,11 +50,13 @@ static int64_t bytes_transferred; static inline int vfio_mig_access(VFIODevice *vbasedev, void *val, int count, off_t off, bool iswrite) { + VFIORegion *region = &vbasedev->migration->region; int ret; - ret = iswrite ? pwrite(vbasedev->fd, val, count, off) : - pread(vbasedev->fd, val, count, off); - if (ret < count) { + ret = iswrite ? + VDEV_REGION_WRITE(vbasedev, region->nr, off, count, val, false) : + VDEV_REGION_READ(vbasedev, region->nr, off, count, val); + if (ret < count) { error_report("vfio_mig_%s %d byte %s: failed at offset 0x%" HWADDR_PRIx", err: %s", iswrite ? 
"write" : "read", count, vbasedev->name, off, strerror(errno)); @@ -111,9 +114,7 @@ static int vfio_migration_set_state(VFIODevice *vbasedev, uint32_t mask, uint32_t value) { VFIOMigration *migration = vbasedev->migration; - VFIORegion *region = &migration->region; - off_t dev_state_off = region->fd_offset + - VFIO_MIG_STRUCT_OFFSET(device_state); + off_t dev_state_off = VFIO_MIG_STRUCT_OFFSET(device_state); uint32_t device_state; int ret; @@ -201,13 +202,13 @@ static int vfio_save_buffer(QEMUFile *f, VFIODevice *vbasedev, uint64_t *size) int ret; ret = vfio_mig_read(vbasedev, &data_offset, sizeof(data_offset), - region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_offset)); + VFIO_MIG_STRUCT_OFFSET(data_offset)); if (ret < 0) { return ret; } ret = vfio_mig_read(vbasedev, &data_size, sizeof(data_size), - region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_size)); + VFIO_MIG_STRUCT_OFFSET(data_size)); if (ret < 0) { return ret; } @@ -233,8 +234,7 @@ static int vfio_save_buffer(QEMUFile *f, VFIODevice *vbasedev, uint64_t *size) } buf_allocated = true; - ret = vfio_mig_read(vbasedev, buf, sec_size, - region->fd_offset + data_offset); + ret = vfio_mig_read(vbasedev, buf, sec_size, data_offset); if (ret < 0) { g_free(buf); return ret; @@ -269,7 +269,7 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev, do { ret = vfio_mig_read(vbasedev, &data_offset, sizeof(data_offset), - region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_offset)); + VFIO_MIG_STRUCT_OFFSET(data_offset)); if (ret < 0) { return ret; } @@ -309,8 +309,7 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev, qemu_get_buffer(f, buf, sec_size); if (buf_alloc) { - ret = vfio_mig_write(vbasedev, buf, sec_size, - region->fd_offset + data_offset); + ret = vfio_mig_write(vbasedev, buf, sec_size, data_offset); g_free(buf); if (ret < 0) { @@ -322,7 +321,7 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev, } ret = vfio_mig_write(vbasedev, &report_size, sizeof(report_size), - 
region->fd_offset + VFIO_MIG_STRUCT_OFFSET(data_size)); + VFIO_MIG_STRUCT_OFFSET(data_size)); if (ret < 0) { return ret; } @@ -334,12 +333,11 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev, static int vfio_update_pending(VFIODevice *vbasedev) { VFIOMigration *migration = vbasedev->migration; - VFIORegion *region = &migration->region; uint64_t pending_bytes = 0; int ret; ret = vfio_mig_read(vbasedev, &pending_bytes, sizeof(pending_bytes), - region->fd_offset + VFIO_MIG_STRUCT_OFFSET(pending_bytes)); + VFIO_MIG_STRUCT_OFFSET(pending_bytes)); if (ret < 0) { migration->pending_bytes = 0; return ret; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index d47b98e..598e9ed 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3677,6 +3677,13 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) goto out_teardown; } + if (!pdev->failover_pair_id) { + ret = vfio_migration_probe(&vdev->vbasedev, errp); + if (ret) { + error_report("%s: Migration disabled", vdev->vbasedev.name); + } + } + vfio_register_err_notifier(vdev); vfio_register_req_notifier(vdev); diff --git a/hw/vfio/user.c b/hw/vfio/user.c index 33d8f06..2eac62a 100644 --- a/hw/vfio/user.c +++ b/hw/vfio/user.c @@ -1410,6 +1410,52 @@ void vfio_user_reset(VFIOProxy *proxy) } } +static int vfio_user_dirty_bitmap(VFIOProxy *proxy, + struct vfio_iommu_type1_dirty_bitmap *cmd, + struct vfio_iommu_type1_dirty_bitmap_get + *dbitmap) +{ + g_autofree struct { + VFIOUserDirtyPages msg; + VFIOUserBitmapRange range; + } *msgp = NULL; + int msize, rsize; + + /* + * If just the command is sent, the returned bitmap isn't needed. 
+ * The bitmap structs are different from the ioctl() versions, + * ioctl() returns the bitmap in a local VA + */ + if (dbitmap != NULL) { + msize = sizeof(*msgp); + rsize = msize + dbitmap->bitmap.size; + msgp = g_malloc0(rsize); + msgp->range.iova = dbitmap->iova; + msgp->range.size = dbitmap->size; + msgp->range.bitmap.pgsize = dbitmap->bitmap.pgsize; + msgp->range.bitmap.size = dbitmap->bitmap.size; + } else { + msize = rsize = sizeof(VFIOUserDirtyPages); + msgp = g_malloc0(rsize); + } + + vfio_user_request_msg(&msgp->msg.hdr, VFIO_USER_DIRTY_PAGES, msize, 0); + msgp->msg.argsz = rsize - sizeof(VFIOUserHdr); + msgp->msg.flags = cmd->flags; + + vfio_user_send_wait(proxy, &msgp->msg.hdr, NULL, rsize, false); + if (msgp->msg.hdr.flags & VFIO_USER_ERROR) { + return -msgp->msg.hdr.error_reply; + } + + if (dbitmap != NULL) { + memcpy(dbitmap->bitmap.data, &msgp->range.bitmap.data, + dbitmap->bitmap.size); + } + + return 0; +} + /* * Socket-based io_ops @@ -1530,6 +1576,13 @@ static int vfio_user_io_dma_unmap(VFIOContainer *container, container->async_ops); } +static int vfio_user_io_dirty_bitmap(VFIOContainer *container, + struct vfio_iommu_type1_dirty_bitmap *bitmap, + struct vfio_iommu_type1_dirty_bitmap_get *range) +{ + return vfio_user_dirty_bitmap(container->proxy, bitmap, range); +} + static void vfio_user_io_wait_commit(VFIOContainer *container) { vfio_user_wait_reqs(container->proxy); @@ -1538,5 +1591,6 @@ static void vfio_user_io_wait_commit(VFIOContainer *container) VFIOContIO vfio_cont_io_sock = { .dma_map = vfio_user_io_dma_map, .dma_unmap = vfio_user_io_dma_unmap, + .dirty_bitmap = vfio_user_io_dirty_bitmap, .wait_commit = vfio_user_io_wait_commit, }; From patchwork Wed Jan 12 00:43:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Johnson X-Patchwork-Id: 12710863 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4F597C433F5 for ; Wed, 12 Jan 2022 01:04:57 +0000 (UTC) Received: from localhost ([::1]:55772 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1n7S4O-0008Fh-4v for qemu-devel@archiver.kernel.org; Tue, 11 Jan 2022 20:04:56 -0500 Received: from eggs.gnu.org ([209.51.188.92]:36870) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n7Re2-0001Hn-Pl for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:42 -0500 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]:24844) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1n7Rds-0005iZ-KW for qemu-devel@nongnu.org; Tue, 11 Jan 2022 19:37:35 -0500 Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 20BMrG6j019928 for ; Wed, 12 Jan 2022 00:37:21 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : subject : date : message-id : in-reply-to : references : content-type : mime-version; s=corp-2021-07-09; bh=9RW1PZaTezRbBm/wG+ja+GigDIAUAZDrU9I5paPvO04=; b=AjVVvajLOVyqOM9DQsqGM9E+DwCSWmD59C9l5DnEF2DNcUJccTzJp4s5n/6788cXrpeb nFMBW4BPxymM0jQGAfjM+grFJaplmlP7bmnRt8KMXsTdXrxzHu2ZO5AmNePyw7Cw4AsK 3Ud2hDHxme1sqFaSrw+jGmmnu0ZI3sG591iG0kLvvovC/cAS8cf2C/qSa22+a7bLWPt+ jro9zsJ+XLjdZ3BxgCxgHFfjrkYCB5Y+KdaH00XF5xdb8GoZJXIWILn2wowpQYcAmo4d XR/hDU42RlA4zbU1eqDtcNuiaErl/rXX9Ekz+ae2QsU0cugt75c36Lg/NfI6T4Ne/3qa 5g== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by mx0b-00069f02.pphosted.com with ESMTP id 3dgkhx4sjs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 12 Jan 2022 
00:37:21 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 20C0KAjD069368 for ; Wed, 12 Jan 2022 00:37:20 GMT Received: from nam10-mw2-obe.outbound.protection.outlook.com (mail-mw2nam10lp2105.outbound.protection.outlook.com [104.47.55.105]) by userp3030.oracle.com with ESMTP id 3deyqy1gqq-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 12 Jan 2022 00:37:20 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=dickoNd0nCSQBEh6qRZqft2vIw7DxmT6aY+u2cigs6FFt9CjeObpvCgXN1pK65+hmzKaZwcOCRFKWqOnIcO+nYvI7eIK1VFIlNJscV1rZJ2uRVGcMCkZ0CtNqg5K+VxQrnDEeiGdR77RHi64JoogL9eDnagi5tVHj/SJAmVhme/IbfLPSint8CgM9EVtxP6mn0rinK+2hMCFYpFkLWrb68/PD+B9crjRBrsGxSOzaLDtIRb4BXbDrTLZO3j0FDC2xmn0s45DUqVnT0oShWYSLBqFbz4kLoYgZRT3lo9fOIVH3HcDcj0r8gJuzTFAH/jbDyCw2amlEoeTVhtY3JxJEQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=9RW1PZaTezRbBm/wG+ja+GigDIAUAZDrU9I5paPvO04=; b=QMh4CQS2Jla/wGSRBAvW3pb1xfkCIIbnShUBFysRl/0xaoRzKanQUFKEXWPM+JZ6MUlRWMxwiPeGjt+ed6gkEbrrc6Fp8FENWKCStzlzqG31/LHC+iw3wfIja+lwivUhwGKm3VGSnoNkuo1bXuYLRWAI0lH1EVPrlDTQPCVNJ1U/0eut4TR1ugbyPgNPEwAsp44MoAWe/rdkLDAdHg3OA/6Hdh2TbkBDAtBVorkpeqzKwQmVZdSHxYyL4jPvGFE8C67mV1HpAhBH5d3UpNm3sXzwQZKKmXcxRwj+VUmyWYUlnAUtUe+m8iJm4vVN+MQRVBz0y/qpK4H45RZOfHdORw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=9RW1PZaTezRbBm/wG+ja+GigDIAUAZDrU9I5paPvO04=; 
b=oJAXOsMavEqRqGN0zGoeYXV3usGWAxzPikoikmgIS6WJb0M5sV03Hwbf+zU/1SgppBxK0ehoLoZGli/EZImIbQkN02I1eHD/LKqxJLLzMLTG9Ap65rJyyauSCQS6TqAoL3gb/GXdqLQ+gn+KrJ6bWaUeNAvz7teekJcYiSFkXQo= Received: from PH0PR10MB4679.namprd10.prod.outlook.com (2603:10b6:510:3c::15) by PH0PR10MB4742.namprd10.prod.outlook.com (2603:10b6:510:3f::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4867.9; Wed, 12 Jan 2022 00:37:09 +0000 Received: from PH0PR10MB4679.namprd10.prod.outlook.com ([fe80::5536:dbc6:5161:ac1b]) by PH0PR10MB4679.namprd10.prod.outlook.com ([fe80::5536:dbc6:5161:ac1b%3]) with mapi id 15.20.4867.012; Wed, 12 Jan 2022 00:37:09 +0000 From: John Johnson To: qemu-devel@nongnu.org Subject: [RFC v4 21/21] Only set qemu file error if saving state so the file exists Date: Tue, 11 Jan 2022 16:43:57 -0800 Message-Id: <658c75935969a51090278519800ce729ca54ce81.1641584317.git.john.g.johnson@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: References: X-ClientProxiedBy: BY5PR03CA0006.namprd03.prod.outlook.com (2603:10b6:a03:1e0::16) To PH0PR10MB4679.namprd10.prod.outlook.com (2603:10b6:510:3c::15) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: f0c73d52-7eb1-42ac-241b-08d9d563aae0 X-MS-TrafficTypeDiagnostic: PH0PR10MB4742:EE_ X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:510; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: V3BrDEdUCkgq2tYpKkt+XBQM0pk4HW2Vs4gsnDWJOVfuJnOIgyyV6y6ML/6W72Q8tkVxV8U7ftnHsTF8ZryDK1IJiXZNkmwMqZ275JUxFjk=
X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH0PR10MB4742
X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10224 signatures=668683
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxscore=0 suspectscore=0 spamscore=0 mlxlogscore=999 adultscore=0 phishscore=0 malwarescore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2201120000
X-Proofpoint-GUID: AelyXOU9bOqELWdZrszv1CHzLVjA7Mgv
X-Proofpoint-ORIG-GUID: AelyXOU9bOqELWdZrszv1CHzLVjA7Mgv
Received-SPF: pass client-ip=205.220.177.32; envelope-from=john.g.johnson@oracle.com; helo=mx0b-00069f02.pphosted.com
X-Spam_score_int: -27
X-Spam_score: -2.8
X-Spam_bar: --
X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"

Signed-off-by: John G Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
---
 hw/vfio/migration.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index df63f5c..e72241d 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -742,7 +742,9 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
          */
         error_report("%s: Failed to set device state 0x%x", vbasedev->name,
                      (migration->device_state & mask) | value);
-        qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
+        if (value != 0) {
+            qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
+        }
     }
     vbasedev->migration->vm_running = running;
     trace_vfio_vmstate_change(vbasedev->name, running, RunState_str(state),