diff mbox series

[RFC,v3] vhost: introduce mdev based hardware vhost backend

Message ID 20190828053712.26106-1-tiwei.bie@intel.com (mailing list archive)
State New, archived
Headers show
Series [RFC,v3] vhost: introduce mdev based hardware vhost backend | expand

Commit Message

Tiwei Bie Aug. 28, 2019, 5:37 a.m. UTC
Details about this can be found here:

https://lwn.net/Articles/750770/

What's new in this version
==========================

There are three choices based on the discussion [1] in RFC v2:

> #1. We expose a VFIO device, so we can reuse the VFIO container/group
>     based DMA API and potentially reuse a lot of VFIO code in QEMU.
>
>     But in this case, we have two choices for the VFIO device interface
>     (i.e. the interface on top of VFIO device fd):
>
>     A) we may invent a new vhost protocol (as demonstrated by the code
>        in this RFC) on VFIO device fd to make it work in VFIO's way,
>        i.e. regions and irqs.
>
>     B) Or as you proposed, instead of inventing a new vhost protocol,
>        we can reuse most existing vhost ioctls on the VFIO device fd
>        directly. There should be no conflicts between the VFIO ioctls
>        (type is 0x3B) and VHOST ioctls (type is 0xAF) currently.
>
> #2. Instead of exposing a VFIO device, we may expose a VHOST device.
>     And we will introduce a new mdev driver vhost-mdev to do this.
>     It would be natural to reuse the existing kernel vhost interface
>     (ioctls) on it as much as possible. But we will need to invent
>     some APIs for DMA programming (reusing VHOST_SET_MEM_TABLE is a
>     choice, but it's too heavy and doesn't support vIOMMU by itself).

This version is more like a quick PoC to try Jason's proposal on
reusing vhost ioctls. And the second way (#1/B) in above three
choices was chosen in this version to demonstrate the idea quickly.

Now the userspace API looks like this:

- VFIO's container/group based IOMMU API is used to do the
  DMA programming.

- Vhost's existing ioctls are used to setup the device.

And the device will report device_api as "vfio-vhost".

Note that, there are dirty hacks in this version. If we decide to
go this way, some refactoring in vhost.c/vhost.h may be needed.

PS. The direct mapping of the notify registers isn't implemented
    in this version.

[1] https://lkml.org/lkml/2019/7/9/101

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
 drivers/vhost/Kconfig      |   9 +
 drivers/vhost/Makefile     |   3 +
 drivers/vhost/mdev.c       | 382 +++++++++++++++++++++++++++++++++++++
 include/linux/vhost_mdev.h |  58 ++++++
 include/uapi/linux/vfio.h  |   2 +
 include/uapi/linux/vhost.h |   8 +
 6 files changed, 462 insertions(+)
 create mode 100644 drivers/vhost/mdev.c
 create mode 100644 include/linux/vhost_mdev.h

Comments

Jason Wang Sept. 2, 2019, 4:15 a.m. UTC | #1
On 2019/8/28 下午1:37, Tiwei Bie wrote:
> Details about this can be found here:
>
> https://lwn.net/Articles/750770/
>
> What's new in this version
> ==========================
>
> There are three choices based on the discussion [1] in RFC v2:
>
>> #1. We expose a VFIO device, so we can reuse the VFIO container/group
>>      based DMA API and potentially reuse a lot of VFIO code in QEMU.
>>
>>      But in this case, we have two choices for the VFIO device interface
>>      (i.e. the interface on top of VFIO device fd):
>>
>>      A) we may invent a new vhost protocol (as demonstrated by the code
>>         in this RFC) on VFIO device fd to make it work in VFIO's way,
>>         i.e. regions and irqs.
>>
>>      B) Or as you proposed, instead of inventing a new vhost protocol,
>>         we can reuse most existing vhost ioctls on the VFIO device fd
>>         directly. There should be no conflicts between the VFIO ioctls
>>         (type is 0x3B) and VHOST ioctls (type is 0xAF) currently.
>>
>> #2. Instead of exposing a VFIO device, we may expose a VHOST device.
>>      And we will introduce a new mdev driver vhost-mdev to do this.
>>      It would be natural to reuse the existing kernel vhost interface
>>      (ioctls) on it as much as possible. But we will need to invent
>>      some APIs for DMA programming (reusing VHOST_SET_MEM_TABLE is a
>>      choice, but it's too heavy and doesn't support vIOMMU by itself).
> This version is more like a quick PoC to try Jason's proposal on
> reusing vhost ioctls. And the second way (#1/B) in above three
> choices was chosen in this version to demonstrate the idea quickly.
>
> Now the userspace API looks like this:
>
> - VFIO's container/group based IOMMU API is used to do the
>    DMA programming.
>
> - Vhost's existing ioctls are used to setup the device.
>
> And the device will report device_api as "vfio-vhost".
>
> Note that, there are dirty hacks in this version. If we decide to
> go this way, some refactoring in vhost.c/vhost.h may be needed.
>
> PS. The direct mapping of the notify registers isn't implemented
>      in this version.
>
> [1] https://lkml.org/lkml/2019/7/9/101


Thanks for the patch, see comments inline.


>
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> ---
>   drivers/vhost/Kconfig      |   9 +
>   drivers/vhost/Makefile     |   3 +
>   drivers/vhost/mdev.c       | 382 +++++++++++++++++++++++++++++++++++++
>   include/linux/vhost_mdev.h |  58 ++++++
>   include/uapi/linux/vfio.h  |   2 +
>   include/uapi/linux/vhost.h |   8 +
>   6 files changed, 462 insertions(+)
>   create mode 100644 drivers/vhost/mdev.c
>   create mode 100644 include/linux/vhost_mdev.h
>
> diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
> index 3d03ccbd1adc..2ba54fcf43b7 100644
> --- a/drivers/vhost/Kconfig
> +++ b/drivers/vhost/Kconfig
> @@ -34,6 +34,15 @@ config VHOST_VSOCK
>   	To compile this driver as a module, choose M here: the module will be called
>   	vhost_vsock.
>   
> +config VHOST_MDEV
> +	tristate "Hardware vhost accelerator abstraction"
> +	depends on EVENTFD && VFIO && VFIO_MDEV
> +	select VHOST
> +	default n
> +	---help---
> +	Say Y here to enable the vhost_mdev module
> +	for use with hardware vhost accelerators
> +
>   config VHOST
>   	tristate
>   	---help---
> diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
> index 6c6df24f770c..ad9c0f8c6d8c 100644
> --- a/drivers/vhost/Makefile
> +++ b/drivers/vhost/Makefile
> @@ -10,4 +10,7 @@ vhost_vsock-y := vsock.o
>   
>   obj-$(CONFIG_VHOST_RING) += vringh.o
>   
> +obj-$(CONFIG_VHOST_MDEV) += vhost_mdev.o
> +vhost_mdev-y := mdev.o
> +
>   obj-$(CONFIG_VHOST)	+= vhost.o
> diff --git a/drivers/vhost/mdev.c b/drivers/vhost/mdev.c
> new file mode 100644
> index 000000000000..6bef1d9ae2e6
> --- /dev/null
> +++ b/drivers/vhost/mdev.c
> @@ -0,0 +1,382 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2018-2019 Intel Corporation.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +#include <linux/vfio.h>
> +#include <linux/vhost.h>
> +#include <linux/mdev.h>
> +#include <linux/vhost_mdev.h>
> +
> +#include "vhost.h"
> +
> +struct vhost_mdev {
> +	struct vhost_dev dev;
> +	bool opened;
> +	int nvqs;
> +	u64 state;
> +	u64 acked_features;
> +	u64 features;
> +	const struct vhost_mdev_device_ops *ops;
> +	struct mdev_device *mdev;
> +	void *private;
> +	struct vhost_virtqueue vqs[];
> +};
> +
> +static void handle_vq_kick(struct vhost_work *work)
> +{
> +	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
> +						  poll.work);
> +	struct vhost_mdev *vdpa = container_of(vq->dev, struct vhost_mdev, dev);
> +
> +	vdpa->ops->notify(vdpa, vq - vdpa->vqs);
> +}
> +
> +static int vhost_set_state(struct vhost_mdev *vdpa, u64 __user *statep)
> +{
> +	u64 state;
> +
> +	if (copy_from_user(&state, statep, sizeof(state)))
> +		return -EFAULT;
> +
> +	if (state >= VHOST_MDEV_S_MAX)
> +		return -EINVAL;
> +
> +	if (vdpa->state == state)
> +		return 0;
> +
> +	mutex_lock(&vdpa->dev.mutex);
> +
> +	vdpa->state = state;
> +
> +	switch (vdpa->state) {
> +	case VHOST_MDEV_S_RUNNING:
> +		vdpa->ops->start(vdpa);
> +		break;
> +	case VHOST_MDEV_S_STOPPED:
> +		vdpa->ops->stop(vdpa);
> +		break;
> +	}
> +
> +	mutex_unlock(&vdpa->dev.mutex);
> +
> +	return 0;
> +}
> +
> +static int vhost_set_features(struct vhost_mdev *vdpa, u64 __user *featurep)
> +{
> +	u64 features;
> +
> +	if (copy_from_user(&features, featurep, sizeof(features)))
> +		return -EFAULT;
> +
> +	if (features & ~vdpa->features)
> +		return -EINVAL;
> +
> +	vdpa->acked_features = features;
> +	vdpa->ops->features_changed(vdpa);
> +	return 0;
> +}
> +
> +static int vhost_get_features(struct vhost_mdev *vdpa, u64 __user *featurep)
> +{
> +	if (copy_to_user(featurep, &vdpa->features, sizeof(vdpa->features)))
> +		return -EFAULT;
> +	return 0;
> +}
> +
> +static int vhost_get_vring_base(struct vhost_mdev *vdpa, void __user *argp)
> +{
> +	struct vhost_virtqueue *vq;
> +	u32 idx;
> +	int r;
> +
> +	r = get_user(idx, (u32 __user *)argp);
> +	if (r < 0)
> +		return r;
> +
> +	vq = &vdpa->vqs[idx];
> +	vq->last_avail_idx = vdpa->ops->get_vring_base(vdpa, idx);
> +
> +	return vhost_vring_ioctl(&vdpa->dev, VHOST_GET_VRING_BASE, argp);
> +}
> +
> +/*
> + * Helpers for backend to register mdev.
> + */
> +
> +struct vhost_mdev *vhost_mdev_alloc(struct mdev_device *mdev, void *private,
> +				    int nvqs)
> +{
> +	struct vhost_mdev *vdpa;
> +	struct vhost_dev *dev;
> +	struct vhost_virtqueue **vqs;
> +	size_t size;
> +	int i;
> +
> +	size = sizeof(struct vhost_mdev) + nvqs * sizeof(struct vhost_virtqueue);
> +
> +	vdpa = kzalloc(size, GFP_KERNEL);
> +	if (!vdpa)
> +		return NULL;
> +
> +	vdpa->nvqs = nvqs;
> +
> +	vqs = kmalloc_array(nvqs, sizeof(*vqs), GFP_KERNEL);
> +	if (!vqs) {
> +		kfree(vdpa);
> +		return NULL;
> +	}
> +
> +	dev = &vdpa->dev;
> +	for (i = 0; i < nvqs; i++) {
> +		vqs[i] = &vdpa->vqs[i];
> +		vqs[i]->handle_kick = handle_vq_kick;
> +	}
> +	vhost_dev_init(dev, vqs, nvqs, 0, 0, 0);
> +
> +	vdpa->private = private;
> +	vdpa->mdev = mdev;
> +
> +	mdev_set_drvdata(mdev, vdpa);
> +
> +	return vdpa;
> +}
> +EXPORT_SYMBOL(vhost_mdev_alloc);
> +
> +void vhost_mdev_free(struct vhost_mdev *vdpa)
> +{
> +	struct mdev_device *mdev;
> +
> +	mdev = vdpa->mdev;
> +	mdev_set_drvdata(mdev, NULL);
> +
> +	vhost_dev_stop(&vdpa->dev);
> +	vhost_dev_cleanup(&vdpa->dev);
> +	kfree(vdpa->dev.vqs);
> +	kfree(vdpa);
> +}
> +EXPORT_SYMBOL(vhost_mdev_free);
> +
> +ssize_t vhost_mdev_read(struct mdev_device *mdev, char __user *buf,
> +		  size_t count, loff_t *ppos)
> +{
> +	return -EINVAL;
> +}
> +EXPORT_SYMBOL(vhost_mdev_read);
> +
> +
> +ssize_t vhost_mdev_write(struct mdev_device *mdev, const char __user *buf,
> +		   size_t count, loff_t *ppos)
> +{
> +	return -EINVAL;
> +}
> +EXPORT_SYMBOL(vhost_mdev_write);
> +
> +int vhost_mdev_mmap(struct mdev_device *mdev, struct vm_area_struct *vma)
> +{
> +	// TODO
> +	return -EINVAL;
> +}
> +EXPORT_SYMBOL(vhost_mdev_mmap);
> +
> +long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
> +		      unsigned long arg)
> +{
> +	void __user *argp = (void __user *)arg;
> +	struct vhost_mdev *vdpa;
> +	unsigned long minsz;
> +	int ret = 0;
> +
> +	if (!mdev)
> +		return -EINVAL;
> +
> +	vdpa = mdev_get_drvdata(mdev);
> +	if (!vdpa)
> +		return -ENODEV;
> +
> +	switch (cmd) {
> +	case VFIO_DEVICE_GET_INFO:
> +	{
> +		struct vfio_device_info info;
> +
> +		minsz = offsetofend(struct vfio_device_info, num_irqs);
> +
> +		if (copy_from_user(&info, (void __user *)arg, minsz)) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		if (info.argsz < minsz) {
> +			ret = -EINVAL;
> +			break;
> +		}
> +
> +		info.flags = VFIO_DEVICE_FLAGS_VHOST;
> +		info.num_regions = 0;
> +		info.num_irqs = 0;
> +
> +		if (copy_to_user((void __user *)arg, &info, minsz)) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		break;
> +	}
> +	case VFIO_DEVICE_GET_REGION_INFO:
> +	case VFIO_DEVICE_GET_IRQ_INFO:
> +	case VFIO_DEVICE_SET_IRQS:
> +	case VFIO_DEVICE_RESET:
> +		ret = -EINVAL;
> +		break;
> +
> +	case VHOST_MDEV_SET_STATE:
> +		ret = vhost_set_state(vdpa, argp);
> +		break;


So this is used to start or stop the device. This means if userspace 
wants to drive a network device, the API is not 100% compatible. Any 
blocker for this? E.g. for SET_BACKEND, we can pass an fd and then 
identify the type of backend.

Another question is, how can user know the type of a device?


> +	case VHOST_GET_FEATURES:
> +		ret = vhost_get_features(vdpa, argp);
> +		break;
> +	case VHOST_SET_FEATURES:
> +		ret = vhost_set_features(vdpa, argp);
> +		break;
> +	case VHOST_GET_VRING_BASE:
> +		ret = vhost_get_vring_base(vdpa, argp);
> +		break;
> +	default:
> +		ret = vhost_dev_ioctl(&vdpa->dev, cmd, argp);
> +		if (ret == -ENOIOCTLCMD)
> +			ret = vhost_vring_ioctl(&vdpa->dev, cmd, argp);
> +	}
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(vhost_mdev_ioctl);
> +
> +int vhost_mdev_open(struct mdev_device *mdev)
> +{
> +	struct vhost_mdev *vdpa;
> +	int ret = 0;
> +
> +	vdpa = mdev_get_drvdata(mdev);
> +	if (!vdpa)
> +		return -ENODEV;
> +
> +	mutex_lock(&vdpa->dev.mutex);
> +
> +	if (vdpa->opened)
> +		ret = -EBUSY;
> +	else
> +		vdpa->opened = true;
> +
> +	mutex_unlock(&vdpa->dev.mutex);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(vhost_mdev_open);
> +
> +void vhost_mdev_close(struct mdev_device *mdev)
> +{
> +	struct vhost_mdev *vdpa;
> +
> +	vdpa = mdev_get_drvdata(mdev);
> +
> +	mutex_lock(&vdpa->dev.mutex);
> +
> +	vhost_dev_stop(&vdpa->dev);
> +	vhost_dev_cleanup(&vdpa->dev);
> +
> +	vdpa->opened = false;
> +	mutex_unlock(&vdpa->dev.mutex);
> +}
> +EXPORT_SYMBOL(vhost_mdev_close);
> +
> +/*
> + * Helpers for backend to set/get information.
> + */
> +
> +int vhost_mdev_set_device_ops(struct vhost_mdev *vdpa,
> +			      const struct vhost_mdev_device_ops *ops)
> +{
> +	vdpa->ops = ops;
> +	return 0;
> +}
> +EXPORT_SYMBOL(vhost_mdev_set_device_ops);
> +
> +int vhost_mdev_set_features(struct vhost_mdev *vdpa, u64 features)
> +{
> +	vdpa->features = features;
> +	return 0;
> +}
> +EXPORT_SYMBOL(vhost_mdev_set_features);
> +
> +struct eventfd_ctx *
> +vhost_mdev_get_call_ctx(struct vhost_mdev *vdpa, int queue_id)
> +{
> +	return vdpa->vqs[queue_id].call_ctx;
> +}
> +EXPORT_SYMBOL(vhost_mdev_get_call_ctx);
> +
> +int vhost_mdev_get_acked_features(struct vhost_mdev *vdpa, u64 *features)
> +{
> +	*features = vdpa->acked_features;
> +	return 0;
> +}
> +EXPORT_SYMBOL(vhost_mdev_get_acked_features);
> +
> +int vhost_mdev_get_vring_num(struct vhost_mdev *vdpa, int queue_id, u16 *num)
> +{
> +	*num = vdpa->vqs[queue_id].num;
> +	return 0;
> +}
> +EXPORT_SYMBOL(vhost_mdev_get_vring_num);
> +
> +int vhost_mdev_get_vring_base(struct vhost_mdev *vdpa, int queue_id, u16 *base)
> +{
> +	*base = vdpa->vqs[queue_id].last_avail_idx;
> +	return 0;
> +}
> +EXPORT_SYMBOL(vhost_mdev_get_vring_base);
> +
> +int vhost_mdev_get_vring_addr(struct vhost_mdev *vdpa, int queue_id,
> +			      struct vhost_vring_addr *addr)
> +{
> +	struct vhost_virtqueue *vq = &vdpa->vqs[queue_id];
> +
> +	/*
> +	 * XXX: we need userspace to pass guest physical address or
> +	 *      IOVA directly.
> +	 */
> +	addr->flags = vq->log_used ? (0x1 << VHOST_VRING_F_LOG) : 0;
> +	addr->desc_user_addr = (__u64)vq->desc;
> +	addr->avail_user_addr = (__u64)vq->avail;
> +	addr->used_user_addr = (__u64)vq->used;
> +	addr->log_guest_addr = (__u64)vq->log_addr;
> +	return 0;
> +}
> +EXPORT_SYMBOL(vhost_mdev_get_vring_addr);
> +
> +int vhost_mdev_get_log_base(struct vhost_mdev *vdpa, int queue_id,
> +			    void **log_base, u64 *log_size)
> +{
> +	// TODO
> +	return 0;
> +}
> +EXPORT_SYMBOL(vhost_mdev_get_log_base);
> +
> +struct mdev_device *vhost_mdev_get_mdev(struct vhost_mdev *vdpa)
> +{
> +	return vdpa->mdev;
> +}
> +EXPORT_SYMBOL(vhost_mdev_get_mdev);
> +
> +void *vhost_mdev_get_private(struct vhost_mdev *vdpa)
> +{
> +	return vdpa->private;
> +}
> +EXPORT_SYMBOL(vhost_mdev_get_private);
> +
> +MODULE_VERSION("0.0.0");
> +MODULE_LICENSE("GPL v2");
> +MODULE_DESCRIPTION("Hardware vhost accelerator abstraction");
> diff --git a/include/linux/vhost_mdev.h b/include/linux/vhost_mdev.h
> new file mode 100644
> index 000000000000..070787ce6b36
> --- /dev/null
> +++ b/include/linux/vhost_mdev.h
> @@ -0,0 +1,58 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2018-2019 Intel Corporation.
> + */
> +
> +#ifndef _VHOST_MDEV_H
> +#define _VHOST_MDEV_H
> +
> +struct mdev_device;
> +struct vhost_mdev;
> +
> +typedef int (*vhost_mdev_start_device_t)(struct vhost_mdev *vdpa);
> +typedef int (*vhost_mdev_stop_device_t)(struct vhost_mdev *vdpa);
> +typedef int (*vhost_mdev_set_features_t)(struct vhost_mdev *vdpa);
> +typedef void (*vhost_mdev_notify_device_t)(struct vhost_mdev *vdpa, int queue_id);
> +typedef u64 (*vhost_mdev_get_notify_addr_t)(struct vhost_mdev *vdpa, int queue_id);
> +typedef u16 (*vhost_mdev_get_vring_base_t)(struct vhost_mdev *vdpa, int queue_id);
> +typedef void (*vhost_mdev_features_changed_t)(struct vhost_mdev *vdpa);
> +
> +struct vhost_mdev_device_ops {
> +	vhost_mdev_start_device_t	start;
> +	vhost_mdev_stop_device_t	stop;
> +	vhost_mdev_notify_device_t	notify;
> +	vhost_mdev_get_notify_addr_t	get_notify_addr;
> +	vhost_mdev_get_vring_base_t	get_vring_base;
> +	vhost_mdev_features_changed_t	features_changed;
> +};


Consider we want to implement a network device: who is going to 
implement the device configuration space? I believe it's not good to 
invent another set of APIs for doing this. So I believe we want something 
like read_config/write_config here.

Then I came up with an idea:

1) introduce a new mdev bus transport, and a new mdev driver virtio_mdev
2) vDPA (either software or hardware) can register as a virtio mdev 
device
3) then we can use the kernel virtio driver to drive the vDPA device and 
utilize the kernel networking/storage stack
4) a userspace driver like vhost-mdev could be built on top of the 
mdev transport

Having a full new transport for virtio, the advantages are obvious:

1) A generic solution for both kernel and userspace drivers that supports 
configuration space access
2) For kernel drivers, the existing kernel networking/storage stack could be 
reused, and so could fast path implementations (e.g. XDP, io_uring, etc.).
3) For userspace drivers, the function of the virtio transport is a superset 
of vhost, so any API could be built on top easily (e.g. vhost ioctls).

What's your thought?

Thanks


> +
> +struct vhost_mdev *vhost_mdev_alloc(struct mdev_device *mdev,
> +		void *private, int nvqs);
> +void vhost_mdev_free(struct vhost_mdev *vdpa);
> +
> +ssize_t vhost_mdev_read(struct mdev_device *mdev, char __user *buf,
> +		size_t count, loff_t *ppos);
> +ssize_t vhost_mdev_write(struct mdev_device *mdev, const char __user *buf,
> +		size_t count, loff_t *ppos);
> +long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
> +		unsigned long arg);
> +int vhost_mdev_mmap(struct mdev_device *mdev, struct vm_area_struct *vma);
> +int vhost_mdev_open(struct mdev_device *mdev);
> +void vhost_mdev_close(struct mdev_device *mdev);
> +
> +int vhost_mdev_set_device_ops(struct vhost_mdev *vdpa,
> +		const struct vhost_mdev_device_ops *ops);
> +int vhost_mdev_set_features(struct vhost_mdev *vdpa, u64 features);
> +struct eventfd_ctx *vhost_mdev_get_call_ctx(struct vhost_mdev *vdpa,
> +		int queue_id);
> +int vhost_mdev_get_acked_features(struct vhost_mdev *vdpa, u64 *features);
> +int vhost_mdev_get_vring_num(struct vhost_mdev *vdpa, int queue_id, u16 *num);
> +int vhost_mdev_get_vring_base(struct vhost_mdev *vdpa, int queue_id, u16 *base);
> +int vhost_mdev_get_vring_addr(struct vhost_mdev *vdpa, int queue_id,
> +		struct vhost_vring_addr *addr);
> +int vhost_mdev_get_log_base(struct vhost_mdev *vdpa, int queue_id,
> +		void **log_base, u64 *log_size);
> +struct mdev_device *vhost_mdev_get_mdev(struct vhost_mdev *vdpa);
> +void *vhost_mdev_get_private(struct vhost_mdev *vdpa);
> +
> +#endif /* _VHOST_MDEV_H */
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 8f10748dac79..0300d6831cc5 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -201,6 +201,7 @@ struct vfio_device_info {
>   #define VFIO_DEVICE_FLAGS_AMBA  (1 << 3)	/* vfio-amba device */
>   #define VFIO_DEVICE_FLAGS_CCW	(1 << 4)	/* vfio-ccw device */
>   #define VFIO_DEVICE_FLAGS_AP	(1 << 5)	/* vfio-ap device */
> +#define VFIO_DEVICE_FLAGS_VHOST	(1 << 6)	/* vfio-vhost device */
>   	__u32	num_regions;	/* Max region index + 1 */
>   	__u32	num_irqs;	/* Max IRQ index + 1 */
>   };
> @@ -217,6 +218,7 @@ struct vfio_device_info {
>   #define VFIO_DEVICE_API_AMBA_STRING		"vfio-amba"
>   #define VFIO_DEVICE_API_CCW_STRING		"vfio-ccw"
>   #define VFIO_DEVICE_API_AP_STRING		"vfio-ap"
> +#define VFIO_DEVICE_API_VHOST_STRING		"vfio-vhost"
>   
>   /**
>    * VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 8,
> diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> index 40d028eed645..5afbc2f08fa3 100644
> --- a/include/uapi/linux/vhost.h
> +++ b/include/uapi/linux/vhost.h
> @@ -116,4 +116,12 @@
>   #define VHOST_VSOCK_SET_GUEST_CID	_IOW(VHOST_VIRTIO, 0x60, __u64)
>   #define VHOST_VSOCK_SET_RUNNING		_IOW(VHOST_VIRTIO, 0x61, int)
>   
> +/* VHOST_MDEV specific defines */
> +
> +#define VHOST_MDEV_SET_STATE	_IOW(VHOST_VIRTIO, 0x70, __u64)
> +
> +#define VHOST_MDEV_S_STOPPED	0
> +#define VHOST_MDEV_S_RUNNING	1
> +#define VHOST_MDEV_S_MAX	2
> +
>   #endif
Tiwei Bie Sept. 3, 2019, 1:56 a.m. UTC | #2
On Mon, Sep 02, 2019 at 12:15:05PM +0800, Jason Wang wrote:
> On 2019/8/28 下午1:37, Tiwei Bie wrote:
> > Details about this can be found here:
> > 
> > https://lwn.net/Articles/750770/
> > 
> > What's new in this version
> > ==========================
> > 
> > There are three choices based on the discussion [1] in RFC v2:
> > 
> > > #1. We expose a VFIO device, so we can reuse the VFIO container/group
> > >      based DMA API and potentially reuse a lot of VFIO code in QEMU.
> > > 
> > >      But in this case, we have two choices for the VFIO device interface
> > >      (i.e. the interface on top of VFIO device fd):
> > > 
> > >      A) we may invent a new vhost protocol (as demonstrated by the code
> > >         in this RFC) on VFIO device fd to make it work in VFIO's way,
> > >         i.e. regions and irqs.
> > > 
> > >      B) Or as you proposed, instead of inventing a new vhost protocol,
> > >         we can reuse most existing vhost ioctls on the VFIO device fd
> > >         directly. There should be no conflicts between the VFIO ioctls
> > >         (type is 0x3B) and VHOST ioctls (type is 0xAF) currently.
> > > 
> > > #2. Instead of exposing a VFIO device, we may expose a VHOST device.
> > >      And we will introduce a new mdev driver vhost-mdev to do this.
> > >      It would be natural to reuse the existing kernel vhost interface
> > >      (ioctls) on it as much as possible. But we will need to invent
> > >      some APIs for DMA programming (reusing VHOST_SET_MEM_TABLE is a
> > >      choice, but it's too heavy and doesn't support vIOMMU by itself).
> > This version is more like a quick PoC to try Jason's proposal on
> > reusing vhost ioctls. And the second way (#1/B) in above three
> > choices was chosen in this version to demonstrate the idea quickly.
> > 
> > Now the userspace API looks like this:
> > 
> > - VFIO's container/group based IOMMU API is used to do the
> >    DMA programming.
> > 
> > - Vhost's existing ioctls are used to setup the device.
> > 
> > And the device will report device_api as "vfio-vhost".
> > 
> > Note that, there are dirty hacks in this version. If we decide to
> > go this way, some refactoring in vhost.c/vhost.h may be needed.
> > 
> > PS. The direct mapping of the notify registers isn't implemented
> >      in this version.
> > 
> > [1] https://lkml.org/lkml/2019/7/9/101
> 
> 
> Thanks for the patch, see comments inline.
> 
> 
> > 
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > ---
> >   drivers/vhost/Kconfig      |   9 +
> >   drivers/vhost/Makefile     |   3 +
> >   drivers/vhost/mdev.c       | 382 +++++++++++++++++++++++++++++++++++++
> >   include/linux/vhost_mdev.h |  58 ++++++
> >   include/uapi/linux/vfio.h  |   2 +
> >   include/uapi/linux/vhost.h |   8 +
> >   6 files changed, 462 insertions(+)
> >   create mode 100644 drivers/vhost/mdev.c
> >   create mode 100644 include/linux/vhost_mdev.h
[...]
> > +
> > +		break;
> > +	}
> > +	case VFIO_DEVICE_GET_REGION_INFO:
> > +	case VFIO_DEVICE_GET_IRQ_INFO:
> > +	case VFIO_DEVICE_SET_IRQS:
> > +	case VFIO_DEVICE_RESET:
> > +		ret = -EINVAL;
> > +		break;
> > +
> > +	case VHOST_MDEV_SET_STATE:
> > +		ret = vhost_set_state(vdpa, argp);
> > +		break;
> 
> 
> So this is used to start or stop the device. This means if userspace wants
> to drive a network device, the API is not 100% compatible. Any blocker for
> this? E.g. for SET_BACKEND, we can pass an fd and then identify the type of
> backend.

This is a legacy from the previous RFC code. I didn't try to
get rid of it while getting this POC to work. I can try to make
the vhost ioctls fully compatible with the existing userspace
if possible.

> 
> Another question is, how can user know the type of a device?

Maybe we can introduce an attribute in $UUID/ to tell the type.

> 
> 
> > +	case VHOST_GET_FEATURES:
> > +		ret = vhost_get_features(vdpa, argp);
> > +		break;
> > +	case VHOST_SET_FEATURES:
> > +		ret = vhost_set_features(vdpa, argp);
> > +		break;
> > +	case VHOST_GET_VRING_BASE:
> > +		ret = vhost_get_vring_base(vdpa, argp);
> > +		break;
> > +	default:
> > +		ret = vhost_dev_ioctl(&vdpa->dev, cmd, argp);
> > +		if (ret == -ENOIOCTLCMD)
> > +			ret = vhost_vring_ioctl(&vdpa->dev, cmd, argp);
> > +	}
> > +
> > +	return ret;
> > +}
[...]
> > +struct mdev_device;
> > +struct vhost_mdev;
> > +
> > +typedef int (*vhost_mdev_start_device_t)(struct vhost_mdev *vdpa);
> > +typedef int (*vhost_mdev_stop_device_t)(struct vhost_mdev *vdpa);
> > +typedef int (*vhost_mdev_set_features_t)(struct vhost_mdev *vdpa);
> > +typedef void (*vhost_mdev_notify_device_t)(struct vhost_mdev *vdpa, int queue_id);
> > +typedef u64 (*vhost_mdev_get_notify_addr_t)(struct vhost_mdev *vdpa, int queue_id);
> > +typedef u16 (*vhost_mdev_get_vring_base_t)(struct vhost_mdev *vdpa, int queue_id);
> > +typedef void (*vhost_mdev_features_changed_t)(struct vhost_mdev *vdpa);
> > +
> > +struct vhost_mdev_device_ops {
> > +	vhost_mdev_start_device_t	start;
> > +	vhost_mdev_stop_device_t	stop;
> > +	vhost_mdev_notify_device_t	notify;
> > +	vhost_mdev_get_notify_addr_t	get_notify_addr;
> > +	vhost_mdev_get_vring_base_t	get_vring_base;
> > +	vhost_mdev_features_changed_t	features_changed;
> > +};
> 
> 
> Consider we want to implement a network device: who is going to implement
> the device configuration space? I believe it's not good to invent another
> set of APIs for doing this. So I believe we want something like
> read_config/write_config here.
> 
> Then I came up with an idea:
> 
> 1) introduce a new mdev bus transport, and a new mdev driver virtio_mdev
> 2) vDPA (either software or hardware) can register as a virtio mdev device
> 3) then we can use the kernel virtio driver to drive the vDPA device and
> utilize the kernel networking/storage stack
> 4) a userspace driver like vhost-mdev could be built on top of the mdev
> transport
> 
> Having a full new transport for virtio, the advantages are obvious:
> 
> 1) A generic solution for both kernel and userspace drivers that supports
> configuration space access
> 2) For kernel drivers, the existing kernel networking/storage stack could be
> reused, and so could fast path implementations (e.g. XDP, io_uring, etc.).
> 3) For userspace drivers, the function of the virtio transport is a superset
> of vhost, so any API could be built on top easily (e.g. vhost ioctls).
> 
> What's your thought?

This sounds interesting to me! ;)

But I'm not quite sure whether it's the best choice to abstract
vhost accelerators as virtio devices in vDPA. A virtio device is
the frontend device. There are some backend features missing in
virtio currently. E.g. there is no way to tell the virtio device
to do dirty page logging. Besides, e.g. the control vq in the network
case seems not quite a good interface for a backend device. In
this case, the userspace virtio-mdev driver in QEMU will do the
DMA mapping to allow the guest driver to use GPA/IOVA to
access the Rx/Tx queues of the virtio-mdev device directly, but
I'm wondering whether this userspace virtio-mdev driver in QEMU will
use a similar IOVA to access the software based control vq of the same
virtio-mdev device at the same time?

Thanks,
Tiwei

> 
> Thanks
> 
> 
> > +
> > +struct vhost_mdev *vhost_mdev_alloc(struct mdev_device *mdev,
> > +		void *private, int nvqs);
> > +void vhost_mdev_free(struct vhost_mdev *vdpa);
> > +
> > +ssize_t vhost_mdev_read(struct mdev_device *mdev, char __user *buf,
> > +		size_t count, loff_t *ppos);
> > +ssize_t vhost_mdev_write(struct mdev_device *mdev, const char __user *buf,
> > +		size_t count, loff_t *ppos);
> > +long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
> > +		unsigned long arg);
> > +int vhost_mdev_mmap(struct mdev_device *mdev, struct vm_area_struct *vma);
> > +int vhost_mdev_open(struct mdev_device *mdev);
> > +void vhost_mdev_close(struct mdev_device *mdev);
> > +
> > +int vhost_mdev_set_device_ops(struct vhost_mdev *vdpa,
> > +		const struct vhost_mdev_device_ops *ops);
> > +int vhost_mdev_set_features(struct vhost_mdev *vdpa, u64 features);
> > +struct eventfd_ctx *vhost_mdev_get_call_ctx(struct vhost_mdev *vdpa,
> > +		int queue_id);
> > +int vhost_mdev_get_acked_features(struct vhost_mdev *vdpa, u64 *features);
> > +int vhost_mdev_get_vring_num(struct vhost_mdev *vdpa, int queue_id, u16 *num);
> > +int vhost_mdev_get_vring_base(struct vhost_mdev *vdpa, int queue_id, u16 *base);
> > +int vhost_mdev_get_vring_addr(struct vhost_mdev *vdpa, int queue_id,
> > +		struct vhost_vring_addr *addr);
> > +int vhost_mdev_get_log_base(struct vhost_mdev *vdpa, int queue_id,
> > +		void **log_base, u64 *log_size);
> > +struct mdev_device *vhost_mdev_get_mdev(struct vhost_mdev *vdpa);
> > +void *vhost_mdev_get_private(struct vhost_mdev *vdpa);
> > +
> > +#endif /* _VHOST_MDEV_H */
> > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > index 8f10748dac79..0300d6831cc5 100644
> > --- a/include/uapi/linux/vfio.h
> > +++ b/include/uapi/linux/vfio.h
> > @@ -201,6 +201,7 @@ struct vfio_device_info {
> >   #define VFIO_DEVICE_FLAGS_AMBA  (1 << 3)	/* vfio-amba device */
> >   #define VFIO_DEVICE_FLAGS_CCW	(1 << 4)	/* vfio-ccw device */
> >   #define VFIO_DEVICE_FLAGS_AP	(1 << 5)	/* vfio-ap device */
> > +#define VFIO_DEVICE_FLAGS_VHOST	(1 << 6)	/* vfio-vhost device */
> >   	__u32	num_regions;	/* Max region index + 1 */
> >   	__u32	num_irqs;	/* Max IRQ index + 1 */
> >   };
> > @@ -217,6 +218,7 @@ struct vfio_device_info {
> >   #define VFIO_DEVICE_API_AMBA_STRING		"vfio-amba"
> >   #define VFIO_DEVICE_API_CCW_STRING		"vfio-ccw"
> >   #define VFIO_DEVICE_API_AP_STRING		"vfio-ap"
> > +#define VFIO_DEVICE_API_VHOST_STRING		"vfio-vhost"
> >   /**
> >    * VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 8,
> > diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> > index 40d028eed645..5afbc2f08fa3 100644
> > --- a/include/uapi/linux/vhost.h
> > +++ b/include/uapi/linux/vhost.h
> > @@ -116,4 +116,12 @@
> >   #define VHOST_VSOCK_SET_GUEST_CID	_IOW(VHOST_VIRTIO, 0x60, __u64)
> >   #define VHOST_VSOCK_SET_RUNNING		_IOW(VHOST_VIRTIO, 0x61, int)
> > +/* VHOST_MDEV specific defines */
> > +
> > +#define VHOST_MDEV_SET_STATE	_IOW(VHOST_VIRTIO, 0x70, __u64)
> > +
> > +#define VHOST_MDEV_S_STOPPED	0
> > +#define VHOST_MDEV_S_RUNNING	1
> > +#define VHOST_MDEV_S_MAX	2
> > +
> >   #endif
Jason Wang Sept. 3, 2019, 2:51 a.m. UTC | #3
On 2019/9/3 上午9:56, Tiwei Bie wrote:
> On Mon, Sep 02, 2019 at 12:15:05PM +0800, Jason Wang wrote:
>> On 2019/8/28 下午1:37, Tiwei Bie wrote:
>>> Details about this can be found here:
>>>
>>> https://lwn.net/Articles/750770/
>>>
>>> What's new in this version
>>> ==========================
>>>
>>> There are three choices based on the discussion [1] in RFC v2:
>>>
>>>> #1. We expose a VFIO device, so we can reuse the VFIO container/group
>>>>       based DMA API and potentially reuse a lot of VFIO code in QEMU.
>>>>
>>>>       But in this case, we have two choices for the VFIO device interface
>>>>       (i.e. the interface on top of VFIO device fd):
>>>>
>>>>       A) we may invent a new vhost protocol (as demonstrated by the code
>>>>          in this RFC) on VFIO device fd to make it work in VFIO's way,
>>>>          i.e. regions and irqs.
>>>>
>>>>       B) Or as you proposed, instead of inventing a new vhost protocol,
>>>>          we can reuse most existing vhost ioctls on the VFIO device fd
>>>>          directly. There should be no conflicts between the VFIO ioctls
>>>>          (type is 0x3B) and VHOST ioctls (type is 0xAF) currently.
>>>>
>>>> #2. Instead of exposing a VFIO device, we may expose a VHOST device.
>>>>       And we will introduce a new mdev driver vhost-mdev to do this.
>>>>       It would be natural to reuse the existing kernel vhost interface
>>>>       (ioctls) on it as much as possible. But we will need to invent
>>>>       some APIs for DMA programming (reusing VHOST_SET_MEM_TABLE is a
>>>>       choice, but it's too heavy and doesn't support vIOMMU by itself).
>>> This version is more like a quick PoC to try Jason's proposal on
>>> reusing vhost ioctls. And the second way (#1/B) in above three
>>> choices was chosen in this version to demonstrate the idea quickly.
>>>
>>> Now the userspace API looks like this:
>>>
>>> - VFIO's container/group based IOMMU API is used to do the
>>>     DMA programming.
>>>
>>> - Vhost's existing ioctls are used to setup the device.
>>>
>>> And the device will report device_api as "vfio-vhost".
>>>
>>> Note that, there are dirty hacks in this version. If we decide to
>>> go this way, some refactoring in vhost.c/vhost.h may be needed.
>>>
>>> PS. The direct mapping of the notify registers isn't implemented
>>>       in this version.
>>>
>>> [1] https://lkml.org/lkml/2019/7/9/101
>>
>> Thanks for the patch, see comments inline.
>>
>>
>>> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
>>> ---
>>>    drivers/vhost/Kconfig      |   9 +
>>>    drivers/vhost/Makefile     |   3 +
>>>    drivers/vhost/mdev.c       | 382 +++++++++++++++++++++++++++++++++++++
>>>    include/linux/vhost_mdev.h |  58 ++++++
>>>    include/uapi/linux/vfio.h  |   2 +
>>>    include/uapi/linux/vhost.h |   8 +
>>>    6 files changed, 462 insertions(+)
>>>    create mode 100644 drivers/vhost/mdev.c
>>>    create mode 100644 include/linux/vhost_mdev.h
> [...]
>>> +
>>> +		break;
>>> +	}
>>> +	case VFIO_DEVICE_GET_REGION_INFO:
>>> +	case VFIO_DEVICE_GET_IRQ_INFO:
>>> +	case VFIO_DEVICE_SET_IRQS:
>>> +	case VFIO_DEVICE_RESET:
>>> +		ret = -EINVAL;
>>> +		break;
>>> +
>>> +	case VHOST_MDEV_SET_STATE:
>>> +		ret = vhost_set_state(vdpa, argp);
>>> +		break;
>>
>> So this is used to start or stop the device. This means if userspace want to
>> drive a network device, the API is not 100% compatible. Any blocker for
>> this? E.g for SET_BACKEND, we can pass a fd and then identify the type of
>> backend.
> This is a legacy from the previous RFC code. I didn't try to
> get rid of it while getting this POC to work. I can try to make
> the vhost ioctls fully compatible with the existing userspace
> if possible.


That would be fine.


>
>> Another question is, how can user know the type of a device?
> Maybe we can introduce an attribute in $UUID/ to tell the type.


Yes, something like this, or using mdev types and identifying the type through 
something similar to get_socket(), etc.


>
>>
>>> +	case VHOST_GET_FEATURES:
>>> +		ret = vhost_get_features(vdpa, argp);
>>> +		break;
>>> +	case VHOST_SET_FEATURES:
>>> +		ret = vhost_set_features(vdpa, argp);
>>> +		break;
>>> +	case VHOST_GET_VRING_BASE:
>>> +		ret = vhost_get_vring_base(vdpa, argp);
>>> +		break;
>>> +	default:
>>> +		ret = vhost_dev_ioctl(&vdpa->dev, cmd, argp);
>>> +		if (ret == -ENOIOCTLCMD)
>>> +			ret = vhost_vring_ioctl(&vdpa->dev, cmd, argp);
>>> +	}
>>> +
>>> +	return ret;
>>> +}
> [...]
>>> +struct mdev_device;
>>> +struct vhost_mdev;
>>> +
>>> +typedef int (*vhost_mdev_start_device_t)(struct vhost_mdev *vdpa);
>>> +typedef int (*vhost_mdev_stop_device_t)(struct vhost_mdev *vdpa);
>>> +typedef int (*vhost_mdev_set_features_t)(struct vhost_mdev *vdpa);
>>> +typedef void (*vhost_mdev_notify_device_t)(struct vhost_mdev *vdpa, int queue_id);
>>> +typedef u64 (*vhost_mdev_get_notify_addr_t)(struct vhost_mdev *vdpa, int queue_id);
>>> +typedef u16 (*vhost_mdev_get_vring_base_t)(struct vhost_mdev *vdpa, int queue_id);
>>> +typedef void (*vhost_mdev_features_changed_t)(struct vhost_mdev *vdpa);
>>> +
>>> +struct vhost_mdev_device_ops {
>>> +	vhost_mdev_start_device_t	start;
>>> +	vhost_mdev_stop_device_t	stop;
>>> +	vhost_mdev_notify_device_t	notify;
>>> +	vhost_mdev_get_notify_addr_t	get_notify_addr;
>>> +	vhost_mdev_get_vring_base_t	get_vring_base;
>>> +	vhost_mdev_features_changed_t	features_changed;
>>> +};
>>
>> Consider we want to implement a network device, who is going to implement
>> the device configuration space? I believe it's not good to invent another
>> set of API for doing this. So I believe we want something like
>> read_config/write_config here.
>>
>> Then I came up an idea:
>>
>> 1) introduce a new mdev bus transport, and a new mdev driver virtio_mdev
>> 2) vDPA (either software or hardware) can register as a device of virtio
>> mdev device
>> 3) then we can use kernel virtio driver to drive vDPA device and utilize
>> kernel networking/storage stack
>> 4) for userspace driver like vhost-mdev, it could be built of top of mdev
>> transport
>>
>> Having a full new transport for virtio, the advantages are obvious:
>>
>> 1) A generic solution for both kernel and userspace driver and support
>> configuration space access
>> 2) For kernel driver, exist kernel networking/storage stack could be reused,
>> and so did fast path implementation (e.g XDP, io_uring etc).
>> 2) For userspace driver, the function of virtio transport is a superset of
>> vhost, any API could be built on top easily (e.g vhost ioctl).
>>
>> What's your thought?
> This sounds interesting to me! ;)
>
> But I'm not quite sure whether it's the best choice to abstract
> vhost accelerators as virtio device in vDPA. Virtio device is
> the frontend device. There are some backend features missing in
> virtio currently. E.g. there is no way to tell the virtio device
> to do dirty page logging.


This could be extended via the new mdev transport, e.g. by setting a memory 
region for logging dirty pages.


> Besides, e.g. the control vq in network
> case seems not a quite good interface for a backend device.


Yes, it was not implemented in current vhost-net. Having a new transport 
will solve this issue.


>   In
> this case, the userspace virtio-mdev driver in QEMU will do the
> DMA mapping to allow guest driver to be able to use GPA/IOVA to
> access the Rx/Tx queues of the virtio-mdev device directly, but
> I'm wondering will this userspace virtio-mdev driver in QEMU use
> similar IOVA to access the software based control vq of the same
> virtio-mdev device at the same time?


Let me clarify.

- The first thing is to introduce a new mdev based transport; this could be 
something similar to virtio MMIO, but with the commands set through the mdev bus.
- The next step is to implement an mdev based transport for the kernel driver; 
this allows the kernel virtio driver to work with mdev devices that implement 
the mdev transport. Then the kernel networking or storage stack could be reused. 
It works as follows: register a new mdev driver and a new mdev virtio transport; 
then what it does is basically translate virtio_config_ops into mdev bus 
commands.
- The third step is to implement the virtio mdev device (mdev transport). 
Then the kernel virtio driver can drive those devices without any modification.
- The last part is to implement another mdev driver (e.g. vhost-mdev); 
this is for the userspace driver. It does the translation between the userspace 
mdev API (ioctl, etc.) and mdev bus commands.


For the question of IOVA, in mdev transport, we will have a command like:

#define VIRTIO_MDEV_QUEUE_DESC_LOW    0x080
#define VIRTIO_MDEV_QUEUE_DESC_HIGH    0x084

There's no need for the device to know what kind of address it is 
(kernel or userspace, GPA or IOVA). It's the responsibility of the parent 
(mdev device) to do the proper mapping. And the mdev device implementation can 
choose to emulate the control vq, offload it fully to hardware, or even 
claim it is not supported.


Thanks

>
> Thanks,
> Tiwei
>
>> Thanks
>>
>>
>>> +
>>> +struct vhost_mdev *vhost_mdev_alloc(struct mdev_device *mdev,
>>> +		void *private, int nvqs);
>>> +void vhost_mdev_free(struct vhost_mdev *vdpa);
>>> +
>>> +ssize_t vhost_mdev_read(struct mdev_device *mdev, char __user *buf,
>>> +		size_t count, loff_t *ppos);
>>> +ssize_t vhost_mdev_write(struct mdev_device *mdev, const char __user *buf,
>>> +		size_t count, loff_t *ppos);
>>> +long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
>>> +		unsigned long arg);
>>> +int vhost_mdev_mmap(struct mdev_device *mdev, struct vm_area_struct *vma);
>>> +int vhost_mdev_open(struct mdev_device *mdev);
>>> +void vhost_mdev_close(struct mdev_device *mdev);
>>> +
>>> +int vhost_mdev_set_device_ops(struct vhost_mdev *vdpa,
>>> +		const struct vhost_mdev_device_ops *ops);
>>> +int vhost_mdev_set_features(struct vhost_mdev *vdpa, u64 features);
>>> +struct eventfd_ctx *vhost_mdev_get_call_ctx(struct vhost_mdev *vdpa,
>>> +		int queue_id);
>>> +int vhost_mdev_get_acked_features(struct vhost_mdev *vdpa, u64 *features);
>>> +int vhost_mdev_get_vring_num(struct vhost_mdev *vdpa, int queue_id, u16 *num);
>>> +int vhost_mdev_get_vring_base(struct vhost_mdev *vdpa, int queue_id, u16 *base);
>>> +int vhost_mdev_get_vring_addr(struct vhost_mdev *vdpa, int queue_id,
>>> +		struct vhost_vring_addr *addr);
>>> +int vhost_mdev_get_log_base(struct vhost_mdev *vdpa, int queue_id,
>>> +		void **log_base, u64 *log_size);
>>> +struct mdev_device *vhost_mdev_get_mdev(struct vhost_mdev *vdpa);
>>> +void *vhost_mdev_get_private(struct vhost_mdev *vdpa);
>>> +
>>> +#endif /* _VHOST_MDEV_H */
>>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>>> index 8f10748dac79..0300d6831cc5 100644
>>> --- a/include/uapi/linux/vfio.h
>>> +++ b/include/uapi/linux/vfio.h
>>> @@ -201,6 +201,7 @@ struct vfio_device_info {
>>>    #define VFIO_DEVICE_FLAGS_AMBA  (1 << 3)	/* vfio-amba device */
>>>    #define VFIO_DEVICE_FLAGS_CCW	(1 << 4)	/* vfio-ccw device */
>>>    #define VFIO_DEVICE_FLAGS_AP	(1 << 5)	/* vfio-ap device */
>>> +#define VFIO_DEVICE_FLAGS_VHOST	(1 << 6)	/* vfio-vhost device */
>>>    	__u32	num_regions;	/* Max region index + 1 */
>>>    	__u32	num_irqs;	/* Max IRQ index + 1 */
>>>    };
>>> @@ -217,6 +218,7 @@ struct vfio_device_info {
>>>    #define VFIO_DEVICE_API_AMBA_STRING		"vfio-amba"
>>>    #define VFIO_DEVICE_API_CCW_STRING		"vfio-ccw"
>>>    #define VFIO_DEVICE_API_AP_STRING		"vfio-ap"
>>> +#define VFIO_DEVICE_API_VHOST_STRING		"vfio-vhost"
>>>    /**
>>>     * VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 8,
>>> diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
>>> index 40d028eed645..5afbc2f08fa3 100644
>>> --- a/include/uapi/linux/vhost.h
>>> +++ b/include/uapi/linux/vhost.h
>>> @@ -116,4 +116,12 @@
>>>    #define VHOST_VSOCK_SET_GUEST_CID	_IOW(VHOST_VIRTIO, 0x60, __u64)
>>>    #define VHOST_VSOCK_SET_RUNNING		_IOW(VHOST_VIRTIO, 0x61, int)
>>> +/* VHOST_MDEV specific defines */
>>> +
>>> +#define VHOST_MDEV_SET_STATE	_IOW(VHOST_VIRTIO, 0x70, __u64)
>>> +
>>> +#define VHOST_MDEV_S_STOPPED	0
>>> +#define VHOST_MDEV_S_RUNNING	1
>>> +#define VHOST_MDEV_S_MAX	2
>>> +
>>>    #endif
Michael S. Tsirkin Sept. 3, 2019, 11:26 a.m. UTC | #4
On Wed, Aug 28, 2019 at 01:37:12PM +0800, Tiwei Bie wrote:
> Details about this can be found here:
> 
> https://lwn.net/Articles/750770/
> 
> What's new in this version
> ==========================
> 
> There are three choices based on the discussion [1] in RFC v2:
> 
> > #1. We expose a VFIO device, so we can reuse the VFIO container/group
> >     based DMA API and potentially reuse a lot of VFIO code in QEMU.
> >
> >     But in this case, we have two choices for the VFIO device interface
> >     (i.e. the interface on top of VFIO device fd):
> >
> >     A) we may invent a new vhost protocol (as demonstrated by the code
> >        in this RFC) on VFIO device fd to make it work in VFIO's way,
> >        i.e. regions and irqs.
> >
> >     B) Or as you proposed, instead of inventing a new vhost protocol,
> >        we can reuse most existing vhost ioctls on the VFIO device fd
> >        directly. There should be no conflicts between the VFIO ioctls
> >        (type is 0x3B) and VHOST ioctls (type is 0xAF) currently.
> >
> > #2. Instead of exposing a VFIO device, we may expose a VHOST device.
> >     And we will introduce a new mdev driver vhost-mdev to do this.
> >     It would be natural to reuse the existing kernel vhost interface
> >     (ioctls) on it as much as possible. But we will need to invent
> >     some APIs for DMA programming (reusing VHOST_SET_MEM_TABLE is a
> >     choice, but it's too heavy and doesn't support vIOMMU by itself).
> 
> This version is more like a quick PoC to try Jason's proposal on
> reusing vhost ioctls. And the second way (#1/B) in above three
> choices was chosen in this version to demonstrate the idea quickly.
> 
> Now the userspace API looks like this:
> 
> - VFIO's container/group based IOMMU API is used to do the
>   DMA programming.
> 
> - Vhost's existing ioctls are used to setup the device.
> 
> And the device will report device_api as "vfio-vhost".
> 
> Note that, there are dirty hacks in this version. If we decide to
> go this way, some refactoring in vhost.c/vhost.h may be needed.
> 
> PS. The direct mapping of the notify registers isn't implemented
>     in this version.
> 
> [1] https://lkml.org/lkml/2019/7/9/101
> 
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>

....

> +long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
> +		      unsigned long arg)
> +{
> +	void __user *argp = (void __user *)arg;
> +	struct vhost_mdev *vdpa;
> +	unsigned long minsz;
> +	int ret = 0;
> +
> +	if (!mdev)
> +		return -EINVAL;
> +
> +	vdpa = mdev_get_drvdata(mdev);
> +	if (!vdpa)
> +		return -ENODEV;
> +
> +	switch (cmd) {
> +	case VFIO_DEVICE_GET_INFO:
> +	{
> +		struct vfio_device_info info;
> +
> +		minsz = offsetofend(struct vfio_device_info, num_irqs);
> +
> +		if (copy_from_user(&info, (void __user *)arg, minsz)) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		if (info.argsz < minsz) {
> +			ret = -EINVAL;
> +			break;
> +		}
> +
> +		info.flags = VFIO_DEVICE_FLAGS_VHOST;
> +		info.num_regions = 0;
> +		info.num_irqs = 0;
> +
> +		if (copy_to_user((void __user *)arg, &info, minsz)) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		break;
> +	}
> +	case VFIO_DEVICE_GET_REGION_INFO:
> +	case VFIO_DEVICE_GET_IRQ_INFO:
> +	case VFIO_DEVICE_SET_IRQS:
> +	case VFIO_DEVICE_RESET:
> +		ret = -EINVAL;
> +		break;
> +
> +	case VHOST_MDEV_SET_STATE:
> +		ret = vhost_set_state(vdpa, argp);
> +		break;
> +	case VHOST_GET_FEATURES:
> +		ret = vhost_get_features(vdpa, argp);
> +		break;
> +	case VHOST_SET_FEATURES:
> +		ret = vhost_set_features(vdpa, argp);
> +		break;
> +	case VHOST_GET_VRING_BASE:
> +		ret = vhost_get_vring_base(vdpa, argp);
> +		break;
> +	default:
> +		ret = vhost_dev_ioctl(&vdpa->dev, cmd, argp);
> +		if (ret == -ENOIOCTLCMD)
> +			ret = vhost_vring_ioctl(&vdpa->dev, cmd, argp);
> +	}
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(vhost_mdev_ioctl);


I don't have a problem with this approach. A small question:
would it make sense to have two fds: send vhost ioctls
on one and vfio ioctls on another?
We can then pass vfio fd to the vhost fd with a
SET_BACKEND ioctl.

What do you think?
Tiwei Bie Sept. 4, 2019, 2:48 a.m. UTC | #5
On Tue, Sep 03, 2019 at 07:26:03AM -0400, Michael S. Tsirkin wrote:
> On Wed, Aug 28, 2019 at 01:37:12PM +0800, Tiwei Bie wrote:
> > Details about this can be found here:
> > 
> > https://lwn.net/Articles/750770/
> > 
> > What's new in this version
> > ==========================
> > 
> > There are three choices based on the discussion [1] in RFC v2:
> > 
> > > #1. We expose a VFIO device, so we can reuse the VFIO container/group
> > >     based DMA API and potentially reuse a lot of VFIO code in QEMU.
> > >
> > >     But in this case, we have two choices for the VFIO device interface
> > >     (i.e. the interface on top of VFIO device fd):
> > >
> > >     A) we may invent a new vhost protocol (as demonstrated by the code
> > >        in this RFC) on VFIO device fd to make it work in VFIO's way,
> > >        i.e. regions and irqs.
> > >
> > >     B) Or as you proposed, instead of inventing a new vhost protocol,
> > >        we can reuse most existing vhost ioctls on the VFIO device fd
> > >        directly. There should be no conflicts between the VFIO ioctls
> > >        (type is 0x3B) and VHOST ioctls (type is 0xAF) currently.
> > >
> > > #2. Instead of exposing a VFIO device, we may expose a VHOST device.
> > >     And we will introduce a new mdev driver vhost-mdev to do this.
> > >     It would be natural to reuse the existing kernel vhost interface
> > >     (ioctls) on it as much as possible. But we will need to invent
> > >     some APIs for DMA programming (reusing VHOST_SET_MEM_TABLE is a
> > >     choice, but it's too heavy and doesn't support vIOMMU by itself).
> > 
> > This version is more like a quick PoC to try Jason's proposal on
> > reusing vhost ioctls. And the second way (#1/B) in above three
> > choices was chosen in this version to demonstrate the idea quickly.
> > 
> > Now the userspace API looks like this:
> > 
> > - VFIO's container/group based IOMMU API is used to do the
> >   DMA programming.
> > 
> > - Vhost's existing ioctls are used to setup the device.
> > 
> > And the device will report device_api as "vfio-vhost".
> > 
> > Note that, there are dirty hacks in this version. If we decide to
> > go this way, some refactoring in vhost.c/vhost.h may be needed.
> > 
> > PS. The direct mapping of the notify registers isn't implemented
> >     in this version.
> > 
> > [1] https://lkml.org/lkml/2019/7/9/101
> > 
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> 
> ....
> 
> > +long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
> > +		      unsigned long arg)
> > +{
> > +	void __user *argp = (void __user *)arg;
> > +	struct vhost_mdev *vdpa;
> > +	unsigned long minsz;
> > +	int ret = 0;
> > +
> > +	if (!mdev)
> > +		return -EINVAL;
> > +
> > +	vdpa = mdev_get_drvdata(mdev);
> > +	if (!vdpa)
> > +		return -ENODEV;
> > +
> > +	switch (cmd) {
> > +	case VFIO_DEVICE_GET_INFO:
> > +	{
> > +		struct vfio_device_info info;
> > +
> > +		minsz = offsetofend(struct vfio_device_info, num_irqs);
> > +
> > +		if (copy_from_user(&info, (void __user *)arg, minsz)) {
> > +			ret = -EFAULT;
> > +			break;
> > +		}
> > +
> > +		if (info.argsz < minsz) {
> > +			ret = -EINVAL;
> > +			break;
> > +		}
> > +
> > +		info.flags = VFIO_DEVICE_FLAGS_VHOST;
> > +		info.num_regions = 0;
> > +		info.num_irqs = 0;
> > +
> > +		if (copy_to_user((void __user *)arg, &info, minsz)) {
> > +			ret = -EFAULT;
> > +			break;
> > +		}
> > +
> > +		break;
> > +	}
> > +	case VFIO_DEVICE_GET_REGION_INFO:
> > +	case VFIO_DEVICE_GET_IRQ_INFO:
> > +	case VFIO_DEVICE_SET_IRQS:
> > +	case VFIO_DEVICE_RESET:
> > +		ret = -EINVAL;
> > +		break;
> > +
> > +	case VHOST_MDEV_SET_STATE:
> > +		ret = vhost_set_state(vdpa, argp);
> > +		break;
> > +	case VHOST_GET_FEATURES:
> > +		ret = vhost_get_features(vdpa, argp);
> > +		break;
> > +	case VHOST_SET_FEATURES:
> > +		ret = vhost_set_features(vdpa, argp);
> > +		break;
> > +	case VHOST_GET_VRING_BASE:
> > +		ret = vhost_get_vring_base(vdpa, argp);
> > +		break;
> > +	default:
> > +		ret = vhost_dev_ioctl(&vdpa->dev, cmd, argp);
> > +		if (ret == -ENOIOCTLCMD)
> > +			ret = vhost_vring_ioctl(&vdpa->dev, cmd, argp);
> > +	}
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(vhost_mdev_ioctl);
> 
> 
> I don't have a problem with this approach. A small question:
> would it make sense to have two fds: send vhost ioctls
> on one and vfio ioctls on another?
> We can then pass vfio fd to the vhost fd with a
> SET_BACKEND ioctl.
> 
> What do you think?

I like this idea! I will give it a try.
So we can introduce /dev/vhost-mdev to have the vhost fd, and let
userspace pass vfio fd to the vhost fd with a SET_BACKEND ioctl.

Thanks a lot!
Tiwei

> 
> -- 
> MST
Jason Wang Sept. 4, 2019, 4:32 a.m. UTC | #6
On 2019/9/4 上午10:48, Tiwei Bie wrote:
> On Tue, Sep 03, 2019 at 07:26:03AM -0400, Michael S. Tsirkin wrote:
>> On Wed, Aug 28, 2019 at 01:37:12PM +0800, Tiwei Bie wrote:
>>> Details about this can be found here:
>>>
>>> https://lwn.net/Articles/750770/
>>>
>>> What's new in this version
>>> ==========================
>>>
>>> There are three choices based on the discussion [1] in RFC v2:
>>>
>>>> #1. We expose a VFIO device, so we can reuse the VFIO container/group
>>>>      based DMA API and potentially reuse a lot of VFIO code in QEMU.
>>>>
>>>>      But in this case, we have two choices for the VFIO device interface
>>>>      (i.e. the interface on top of VFIO device fd):
>>>>
>>>>      A) we may invent a new vhost protocol (as demonstrated by the code
>>>>         in this RFC) on VFIO device fd to make it work in VFIO's way,
>>>>         i.e. regions and irqs.
>>>>
>>>>      B) Or as you proposed, instead of inventing a new vhost protocol,
>>>>         we can reuse most existing vhost ioctls on the VFIO device fd
>>>>         directly. There should be no conflicts between the VFIO ioctls
>>>>         (type is 0x3B) and VHOST ioctls (type is 0xAF) currently.
>>>>
>>>> #2. Instead of exposing a VFIO device, we may expose a VHOST device.
>>>>      And we will introduce a new mdev driver vhost-mdev to do this.
>>>>      It would be natural to reuse the existing kernel vhost interface
>>>>      (ioctls) on it as much as possible. But we will need to invent
>>>>      some APIs for DMA programming (reusing VHOST_SET_MEM_TABLE is a
>>>>      choice, but it's too heavy and doesn't support vIOMMU by itself).
>>> This version is more like a quick PoC to try Jason's proposal on
>>> reusing vhost ioctls. And the second way (#1/B) in above three
>>> choices was chosen in this version to demonstrate the idea quickly.
>>>
>>> Now the userspace API looks like this:
>>>
>>> - VFIO's container/group based IOMMU API is used to do the
>>>    DMA programming.
>>>
>>> - Vhost's existing ioctls are used to setup the device.
>>>
>>> And the device will report device_api as "vfio-vhost".
>>>
>>> Note that, there are dirty hacks in this version. If we decide to
>>> go this way, some refactoring in vhost.c/vhost.h may be needed.
>>>
>>> PS. The direct mapping of the notify registers isn't implemented
>>>      in this version.
>>>
>>> [1] https://lkml.org/lkml/2019/7/9/101
>>>
>>> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
>> ....
>>
>>> +long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
>>> +		      unsigned long arg)
>>> +{
>>> +	void __user *argp = (void __user *)arg;
>>> +	struct vhost_mdev *vdpa;
>>> +	unsigned long minsz;
>>> +	int ret = 0;
>>> +
>>> +	if (!mdev)
>>> +		return -EINVAL;
>>> +
>>> +	vdpa = mdev_get_drvdata(mdev);
>>> +	if (!vdpa)
>>> +		return -ENODEV;
>>> +
>>> +	switch (cmd) {
>>> +	case VFIO_DEVICE_GET_INFO:
>>> +	{
>>> +		struct vfio_device_info info;
>>> +
>>> +		minsz = offsetofend(struct vfio_device_info, num_irqs);
>>> +
>>> +		if (copy_from_user(&info, (void __user *)arg, minsz)) {
>>> +			ret = -EFAULT;
>>> +			break;
>>> +		}
>>> +
>>> +		if (info.argsz < minsz) {
>>> +			ret = -EINVAL;
>>> +			break;
>>> +		}
>>> +
>>> +		info.flags = VFIO_DEVICE_FLAGS_VHOST;
>>> +		info.num_regions = 0;
>>> +		info.num_irqs = 0;
>>> +
>>> +		if (copy_to_user((void __user *)arg, &info, minsz)) {
>>> +			ret = -EFAULT;
>>> +			break;
>>> +		}
>>> +
>>> +		break;
>>> +	}
>>> +	case VFIO_DEVICE_GET_REGION_INFO:
>>> +	case VFIO_DEVICE_GET_IRQ_INFO:
>>> +	case VFIO_DEVICE_SET_IRQS:
>>> +	case VFIO_DEVICE_RESET:
>>> +		ret = -EINVAL;
>>> +		break;
>>> +
>>> +	case VHOST_MDEV_SET_STATE:
>>> +		ret = vhost_set_state(vdpa, argp);
>>> +		break;
>>> +	case VHOST_GET_FEATURES:
>>> +		ret = vhost_get_features(vdpa, argp);
>>> +		break;
>>> +	case VHOST_SET_FEATURES:
>>> +		ret = vhost_set_features(vdpa, argp);
>>> +		break;
>>> +	case VHOST_GET_VRING_BASE:
>>> +		ret = vhost_get_vring_base(vdpa, argp);
>>> +		break;
>>> +	default:
>>> +		ret = vhost_dev_ioctl(&vdpa->dev, cmd, argp);
>>> +		if (ret == -ENOIOCTLCMD)
>>> +			ret = vhost_vring_ioctl(&vdpa->dev, cmd, argp);
>>> +	}
>>> +
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL(vhost_mdev_ioctl);
>>
>> I don't have a problem with this approach. A small question:
>> would it make sense to have two fds: send vhost ioctls
>> on one and vfio ioctls on another?
>> We can then pass vfio fd to the vhost fd with a
>> SET_BACKEND ioctl.
>>
>> What do you think?
> I like this idea! I will give it a try.
> So we can introduce /dev/vhost-mdev to have the vhost fd,


You still need to think about how to connect it to the current sysfs based 
mdev management interface, whether you want to invent another API, or whether 
to just use /dev/vhost-net but pass the vfio fd through an ioctl to the file.

Thanks


>   and let
> userspace pass vfio fd to the vhost fd with a SET_BACKEND ioctl.
>
> Thanks a lot!
> Tiwei
>
>> -- 
>> MST
diff mbox series

Patch

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 3d03ccbd1adc..2ba54fcf43b7 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -34,6 +34,15 @@  config VHOST_VSOCK
 	To compile this driver as a module, choose M here: the module will be called
 	vhost_vsock.
 
+config VHOST_MDEV
+	tristate "Hardware vhost accelerator abstraction"
+	depends on EVENTFD && VFIO && VFIO_MDEV
+	select VHOST
+	default n
+	---help---
+	Say Y here to enable the vhost_mdev module
+	for use with hardware vhost accelerators
+
 config VHOST
 	tristate
 	---help---
diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
index 6c6df24f770c..ad9c0f8c6d8c 100644
--- a/drivers/vhost/Makefile
+++ b/drivers/vhost/Makefile
@@ -10,4 +10,7 @@  vhost_vsock-y := vsock.o
 
 obj-$(CONFIG_VHOST_RING) += vringh.o
 
+obj-$(CONFIG_VHOST_MDEV) += vhost_mdev.o
+vhost_mdev-y := mdev.o
+
 obj-$(CONFIG_VHOST)	+= vhost.o
diff --git a/drivers/vhost/mdev.c b/drivers/vhost/mdev.c
new file mode 100644
index 000000000000..6bef1d9ae2e6
--- /dev/null
+++ b/drivers/vhost/mdev.c
@@ -0,0 +1,382 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018-2019 Intel Corporation.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/vfio.h>
+#include <linux/vhost.h>
+#include <linux/mdev.h>
+#include <linux/vhost_mdev.h>
+
+#include "vhost.h"
+
+struct vhost_mdev {
+	struct vhost_dev dev;
+	bool opened;
+	int nvqs;
+	u64 state;
+	u64 acked_features;
+	u64 features;
+	const struct vhost_mdev_device_ops *ops;
+	struct mdev_device *mdev;
+	void *private;
+	struct vhost_virtqueue vqs[];
+};
+
+static void handle_vq_kick(struct vhost_work *work)
+{
+	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+						  poll.work);
+	struct vhost_mdev *vdpa = container_of(vq->dev, struct vhost_mdev, dev);
+
+	vdpa->ops->notify(vdpa, vq - vdpa->vqs);
+}
+
+static int vhost_set_state(struct vhost_mdev *vdpa, u64 __user *statep)
+{
+	u64 state;
+
+	if (copy_from_user(&state, statep, sizeof(state)))
+		return -EFAULT;
+
+	if (state >= VHOST_MDEV_S_MAX)
+		return -EINVAL;
+
+	if (vdpa->state == state)
+		return 0;
+
+	mutex_lock(&vdpa->dev.mutex);
+
+	vdpa->state = state;
+
+	switch (vdpa->state) {
+	case VHOST_MDEV_S_RUNNING:
+		vdpa->ops->start(vdpa);
+		break;
+	case VHOST_MDEV_S_STOPPED:
+		vdpa->ops->stop(vdpa);
+		break;
+	}
+
+	mutex_unlock(&vdpa->dev.mutex);
+
+	return 0;
+}
+
+static int vhost_set_features(struct vhost_mdev *vdpa, u64 __user *featurep)
+{
+	u64 features;
+
+	if (copy_from_user(&features, featurep, sizeof(features)))
+		return -EFAULT;
+
+	if (features & ~vdpa->features)
+		return -EINVAL;
+
+	vdpa->acked_features = features;
+	vdpa->ops->features_changed(vdpa);
+	return 0;
+}
+
+static int vhost_get_features(struct vhost_mdev *vdpa, u64 __user *featurep)
+{
+	if (copy_to_user(featurep, &vdpa->features, sizeof(vdpa->features)))
+		return -EFAULT;
+	return 0;
+}
+
+static int vhost_get_vring_base(struct vhost_mdev *vdpa, void __user *argp)
+{
+	struct vhost_virtqueue *vq;
+	u32 idx;
+	int r;
+
+	r = get_user(idx, (u32 __user *)argp);
+	if (r < 0)
+		return r;
+
+	vq = &vdpa->vqs[idx];
+	vq->last_avail_idx = vdpa->ops->get_vring_base(vdpa, idx);
+
+	return vhost_vring_ioctl(&vdpa->dev, VHOST_GET_VRING_BASE, argp);
+}
+
+/*
+ * Helpers for backend to register mdev.
+ */
+
+struct vhost_mdev *vhost_mdev_alloc(struct mdev_device *mdev, void *private,
+				    int nvqs)
+{
+	struct vhost_mdev *vdpa;
+	struct vhost_dev *dev;
+	struct vhost_virtqueue **vqs;
+	size_t size;
+	int i;
+
+	size = sizeof(struct vhost_mdev) + nvqs * sizeof(struct vhost_virtqueue);
+
+	vdpa = kzalloc(size, GFP_KERNEL);
+	if (!vdpa)
+		return NULL;
+
+	vdpa->nvqs = nvqs;
+
+	vqs = kmalloc_array(nvqs, sizeof(*vqs), GFP_KERNEL);
+	if (!vqs) {
+		kfree(vdpa);
+		return NULL;
+	}
+
+	dev = &vdpa->dev;
+	for (i = 0; i < nvqs; i++) {
+		vqs[i] = &vdpa->vqs[i];
+		vqs[i]->handle_kick = handle_vq_kick;
+	}
+	vhost_dev_init(dev, vqs, nvqs, 0, 0, 0);
+
+	vdpa->private = private;
+	vdpa->mdev = mdev;
+
+	mdev_set_drvdata(mdev, vdpa);
+
+	return vdpa;
+}
+EXPORT_SYMBOL(vhost_mdev_alloc);
+
+void vhost_mdev_free(struct vhost_mdev *vdpa)
+{
+	struct mdev_device *mdev;
+
+	mdev = vdpa->mdev;
+	mdev_set_drvdata(mdev, NULL);
+
+	vhost_dev_stop(&vdpa->dev);
+	vhost_dev_cleanup(&vdpa->dev);
+	kfree(vdpa->dev.vqs);
+	kfree(vdpa);
+}
+EXPORT_SYMBOL(vhost_mdev_free);
+
+ssize_t vhost_mdev_read(struct mdev_device *mdev, char __user *buf,
+		  size_t count, loff_t *ppos)
+{
+	return -EINVAL;
+}
+EXPORT_SYMBOL(vhost_mdev_read);
+
+
+ssize_t vhost_mdev_write(struct mdev_device *mdev, const char __user *buf,
+		   size_t count, loff_t *ppos)
+{
+	return -EINVAL;
+}
+EXPORT_SYMBOL(vhost_mdev_write);
+
+int vhost_mdev_mmap(struct mdev_device *mdev, struct vm_area_struct *vma)
+{
+	// TODO
+	return -EINVAL;
+}
+EXPORT_SYMBOL(vhost_mdev_mmap);
+
+long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
+		      unsigned long arg)
+{
+	void __user *argp = (void __user *)arg;
+	struct vhost_mdev *vdpa;
+	unsigned long minsz;
+	int ret = 0;
+
+	if (!mdev)
+		return -EINVAL;
+
+	vdpa = mdev_get_drvdata(mdev);
+	if (!vdpa)
+		return -ENODEV;
+
+	switch (cmd) {
+	case VFIO_DEVICE_GET_INFO:
+	{
+		struct vfio_device_info info;
+
+		minsz = offsetofend(struct vfio_device_info, num_irqs);
+
+		if (copy_from_user(&info, (void __user *)arg, minsz)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		if (info.argsz < minsz) {
+			ret = -EINVAL;
+			break;
+		}
+
+		info.flags = VFIO_DEVICE_FLAGS_VHOST;
+		info.num_regions = 0;
+		info.num_irqs = 0;
+
+		if (copy_to_user((void __user *)arg, &info, minsz)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		break;
+	}
+	case VFIO_DEVICE_GET_REGION_INFO:
+	case VFIO_DEVICE_GET_IRQ_INFO:
+	case VFIO_DEVICE_SET_IRQS:
+	case VFIO_DEVICE_RESET:
+		ret = -EINVAL;
+		break;
+
+	case VHOST_MDEV_SET_STATE:
+		ret = vhost_set_state(vdpa, argp);
+		break;
+	case VHOST_GET_FEATURES:
+		ret = vhost_get_features(vdpa, argp);
+		break;
+	case VHOST_SET_FEATURES:
+		ret = vhost_set_features(vdpa, argp);
+		break;
+	case VHOST_GET_VRING_BASE:
+		ret = vhost_get_vring_base(vdpa, argp);
+		break;
+	default:
+		ret = vhost_dev_ioctl(&vdpa->dev, cmd, argp);
+		if (ret == -ENOIOCTLCMD)
+			ret = vhost_vring_ioctl(&vdpa->dev, cmd, argp);
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(vhost_mdev_ioctl);
+
+int vhost_mdev_open(struct mdev_device *mdev)
+{
+	struct vhost_mdev *vdpa;
+	int ret = 0;
+
+	vdpa = mdev_get_drvdata(mdev);
+	if (!vdpa)
+		return -ENODEV;
+
+	mutex_lock(&vdpa->dev.mutex);
+
+	if (vdpa->opened)
+		ret = -EBUSY;
+	else
+		vdpa->opened = true;
+
+	mutex_unlock(&vdpa->dev.mutex);
+
+	return ret;
+}
+EXPORT_SYMBOL(vhost_mdev_open);
+
+void vhost_mdev_close(struct mdev_device *mdev)
+{
+	struct vhost_mdev *vdpa;
+
+	vdpa = mdev_get_drvdata(mdev);
+
+	mutex_lock(&vdpa->dev.mutex);
+
+	vhost_dev_stop(&vdpa->dev);
+	vhost_dev_cleanup(&vdpa->dev);
+
+	vdpa->opened = false;
+	mutex_unlock(&vdpa->dev.mutex);
+}
+EXPORT_SYMBOL(vhost_mdev_close);
+
+/*
+ * Helpers for backend to set/get information.
+ */
+
+int vhost_mdev_set_device_ops(struct vhost_mdev *vdpa,
+			      const struct vhost_mdev_device_ops *ops)
+{
+	vdpa->ops = ops;
+	return 0;
+}
+EXPORT_SYMBOL(vhost_mdev_set_device_ops);
+
+int vhost_mdev_set_features(struct vhost_mdev *vdpa, u64 features)
+{
+	vdpa->features = features;
+	return 0;
+}
+EXPORT_SYMBOL(vhost_mdev_set_features);
+
+struct eventfd_ctx *
+vhost_mdev_get_call_ctx(struct vhost_mdev *vdpa, int queue_id)
+{
+	return vdpa->vqs[queue_id].call_ctx;
+}
+EXPORT_SYMBOL(vhost_mdev_get_call_ctx);
+
+int vhost_mdev_get_acked_features(struct vhost_mdev *vdpa, u64 *features)
+{
+	*features = vdpa->acked_features;
+	return 0;
+}
+EXPORT_SYMBOL(vhost_mdev_get_acked_features);
+
+int vhost_mdev_get_vring_num(struct vhost_mdev *vdpa, int queue_id, u16 *num)
+{
+	*num = vdpa->vqs[queue_id].num;
+	return 0;
+}
+EXPORT_SYMBOL(vhost_mdev_get_vring_num);
+
+int vhost_mdev_get_vring_base(struct vhost_mdev *vdpa, int queue_id, u16 *base)
+{
+	*base = vdpa->vqs[queue_id].last_avail_idx;
+	return 0;
+}
+EXPORT_SYMBOL(vhost_mdev_get_vring_base);
+
+int vhost_mdev_get_vring_addr(struct vhost_mdev *vdpa, int queue_id,
+			      struct vhost_vring_addr *addr)
+{
+	struct vhost_virtqueue *vq = &vdpa->vqs[queue_id];
+
+	/*
+	 * XXX: we need userspace to pass guest physical address or
+	 *      IOVA directly.
+	 */
+	addr->flags = vq->log_used ? (0x1 << VHOST_VRING_F_LOG) : 0;
+	addr->desc_user_addr = (__u64)vq->desc;
+	addr->avail_user_addr = (__u64)vq->avail;
+	addr->used_user_addr = (__u64)vq->used;
+	addr->log_guest_addr = (__u64)vq->log_addr;
+	return 0;
+}
+EXPORT_SYMBOL(vhost_mdev_get_vring_addr);
+
+int vhost_mdev_get_log_base(struct vhost_mdev *vdpa, int queue_id,
+			    void **log_base, u64 *log_size)
+{
+	// TODO
+	return 0;
+}
+EXPORT_SYMBOL(vhost_mdev_get_log_base);
+
+struct mdev_device *vhost_mdev_get_mdev(struct vhost_mdev *vdpa)
+{
+	return vdpa->mdev;
+}
+EXPORT_SYMBOL(vhost_mdev_get_mdev);
+
+void *vhost_mdev_get_private(struct vhost_mdev *vdpa)
+{
+	return vdpa->private;
+}
+EXPORT_SYMBOL(vhost_mdev_get_private);
+
+MODULE_VERSION("0.0.0");
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Hardware vhost accelerator abstraction");
diff --git a/include/linux/vhost_mdev.h b/include/linux/vhost_mdev.h
new file mode 100644
index 000000000000..070787ce6b36
--- /dev/null
+++ b/include/linux/vhost_mdev.h
@@ -0,0 +1,66 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2018-2019 Intel Corporation.
+ */
+
+#ifndef _VHOST_MDEV_H
+#define _VHOST_MDEV_H
+
+struct mdev_device;
+struct vhost_mdev;
+
+/* Parent-driver callbacks invoked by the vhost-mdev core. */
+typedef int (*vhost_mdev_start_device_t)(struct vhost_mdev *vdpa);
+typedef int (*vhost_mdev_stop_device_t)(struct vhost_mdev *vdpa);
+typedef int (*vhost_mdev_set_features_t)(struct vhost_mdev *vdpa);
+typedef void (*vhost_mdev_notify_device_t)(struct vhost_mdev *vdpa, int queue_id);
+typedef u64 (*vhost_mdev_get_notify_addr_t)(struct vhost_mdev *vdpa, int queue_id);
+typedef u16 (*vhost_mdev_get_vring_base_t)(struct vhost_mdev *vdpa, int queue_id);
+typedef void (*vhost_mdev_features_changed_t)(struct vhost_mdev *vdpa);
+
+/*
+ * NOTE(review): vhost_mdev_set_features_t is declared above but has no
+ * member in this ops table -- confirm whether the typedef is still needed.
+ */
+struct vhost_mdev_device_ops {
+	vhost_mdev_start_device_t	start;
+	vhost_mdev_stop_device_t	stop;
+	vhost_mdev_notify_device_t	notify;
+	vhost_mdev_get_notify_addr_t	get_notify_addr;
+	vhost_mdev_get_vring_base_t	get_vring_base;
+	vhost_mdev_features_changed_t	features_changed;
+};
+
+/* Lifecycle: allocate/free a vhost_mdev instance for an mdev device. */
+struct vhost_mdev *vhost_mdev_alloc(struct mdev_device *mdev,
+		void *private, int nvqs);
+void vhost_mdev_free(struct vhost_mdev *vdpa);
+
+/* File-operation style entry points, called from the mdev parent ops. */
+ssize_t vhost_mdev_read(struct mdev_device *mdev, char __user *buf,
+		size_t count, loff_t *ppos);
+ssize_t vhost_mdev_write(struct mdev_device *mdev, const char __user *buf,
+		size_t count, loff_t *ppos);
+long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
+		unsigned long arg);
+int vhost_mdev_mmap(struct mdev_device *mdev, struct vm_area_struct *vma);
+int vhost_mdev_open(struct mdev_device *mdev);
+void vhost_mdev_close(struct mdev_device *mdev);
+
+/* Accessors used by parent drivers to query/set the negotiated state. */
+int vhost_mdev_set_device_ops(struct vhost_mdev *vdpa,
+		const struct vhost_mdev_device_ops *ops);
+int vhost_mdev_set_features(struct vhost_mdev *vdpa, u64 features);
+struct eventfd_ctx *vhost_mdev_get_call_ctx(struct vhost_mdev *vdpa,
+		int queue_id);
+int vhost_mdev_get_acked_features(struct vhost_mdev *vdpa, u64 *features);
+int vhost_mdev_get_vring_num(struct vhost_mdev *vdpa, int queue_id, u16 *num);
+int vhost_mdev_get_vring_base(struct vhost_mdev *vdpa, int queue_id, u16 *base);
+int vhost_mdev_get_vring_addr(struct vhost_mdev *vdpa, int queue_id,
+		struct vhost_vring_addr *addr);
+int vhost_mdev_get_log_base(struct vhost_mdev *vdpa, int queue_id,
+		void **log_base, u64 *log_size);
+struct mdev_device *vhost_mdev_get_mdev(struct vhost_mdev *vdpa);
+void *vhost_mdev_get_private(struct vhost_mdev *vdpa);
+
+#endif /* _VHOST_MDEV_H */
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 8f10748dac79..0300d6831cc5 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -201,6 +201,7 @@  struct vfio_device_info {
 #define VFIO_DEVICE_FLAGS_AMBA  (1 << 3)	/* vfio-amba device */
 #define VFIO_DEVICE_FLAGS_CCW	(1 << 4)	/* vfio-ccw device */
 #define VFIO_DEVICE_FLAGS_AP	(1 << 5)	/* vfio-ap device */
+#define VFIO_DEVICE_FLAGS_VHOST	(1 << 6)	/* vfio-vhost device */
 	__u32	num_regions;	/* Max region index + 1 */
 	__u32	num_irqs;	/* Max IRQ index + 1 */
 };
@@ -217,6 +218,7 @@  struct vfio_device_info {
 #define VFIO_DEVICE_API_AMBA_STRING		"vfio-amba"
 #define VFIO_DEVICE_API_CCW_STRING		"vfio-ccw"
 #define VFIO_DEVICE_API_AP_STRING		"vfio-ap"
+#define VFIO_DEVICE_API_VHOST_STRING		"vfio-vhost"
 
 /**
  * VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 8,
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index 40d028eed645..5afbc2f08fa3 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -116,4 +116,12 @@ 
 #define VHOST_VSOCK_SET_GUEST_CID	_IOW(VHOST_VIRTIO, 0x60, __u64)
 #define VHOST_VSOCK_SET_RUNNING		_IOW(VHOST_VIRTIO, 0x61, int)
 
+/* VHOST_MDEV specific defines */
+
+#define VHOST_MDEV_SET_STATE	_IOW(VHOST_VIRTIO, 0x70, __u64)	/* takes a VHOST_MDEV_S_* value */
+
+#define VHOST_MDEV_S_STOPPED	0	/* device is stopped */
+#define VHOST_MDEV_S_RUNNING	1	/* device is running */
+#define VHOST_MDEV_S_MAX	2	/* number of states; not itself a valid state */
+
 #endif