
[V2] docs: vhost-user: Add Xen specific memory mapping support

Message ID 7c3c120bcf2cf023e873800fd3f55239dd302e38.1678100850.git.viresh.kumar@linaro.org (mailing list archive)
State New, archived

Commit Message

Viresh Kumar March 6, 2023, 11:10 a.m. UTC
The current model of memory mapping at the back-end works fine where a
standard call to mmap() (for the respective file descriptor) is enough
before the back-end can start accessing the guest memory.

There are other, more complex cases though, where the back-end needs
more information and a simple mmap() isn't enough. For example, Xen, a
type-1 hypervisor, currently supports memory mapping via two different
methods: foreign mapping (via /dev/privcmd) and grant mapping (via
/dev/gntdev). In both of these cases, the back-end needs to call mmap()
and ioctl(), and needs to pass extra information via the ioctl(), like
the Xen domain-id of the guest whose memory it is trying to map.

Add a new protocol feature, 'VHOST_USER_PROTOCOL_F_XEN_MMAP', which lets
the back-end know about the additional memory mapping requirements.
When this feature is negotiated, the front-end can send the
'VHOST_USER_SET_XEN_MMAP' message type to provide the additional
information to the back-end.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
V1->V2:
- Make the custom mmap feature Xen specific, instead of being generic.
- Clearly define which memory regions are impacted by this change.
- Allow VHOST_USER_SET_XEN_MMAP to be called multiple times.
- Additional Bit(2) property in flags.

 docs/interop/vhost-user.rst | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)
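
For reference, the "Xen mmap description" payload added by this patch is
just two 64-bit fields. A minimal, non-normative C sketch of it follows;
the struct and macro names are illustrative and not taken from any
existing header:

  #include <stdint.h>

  /* flag bits, as documented in the patch */
  #define VHOST_USER_XEN_MMAP_FLAG_FOREIGN (1ULL << 0) /* map via /dev/privcmd */
  #define VHOST_USER_XEN_MMAP_FLAG_GRANT   (1ULL << 1) /* map via /dev/gntdev  */
  #define VHOST_USER_XEN_MMAP_FLAG_DIRECT  (1ULL << 2) /* back-end may map extra memory itself */

  struct vhost_user_xen_mmap {
      uint64_t flags;  /* 64-bit bit field described above */
      uint64_t domid;  /* Xen domain id of the guest whose memory is mapped */
  };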

Comments

Stefan Hajnoczi March 6, 2023, 3:34 p.m. UTC | #1
On Mon, Mar 06, 2023 at 04:40:24PM +0530, Viresh Kumar wrote:
> The current model of memory mapping at the back-end works fine where a
> standard call to mmap() (for the respective file descriptor) is enough
> before the front-end can start accessing the guest memory.
> 
> There are other complex cases though where the back-end needs more
> information and simple mmap() isn't enough. For example Xen, a type-1
> hypervisor, currently supports memory mapping via two different methods,
> foreign-mapping (via /dev/privcmd) and grant-dev (via /dev/gntdev). In
> both these cases, the back-end needs to call mmap() and ioctl(), and
> need to pass extra information via the ioctl(), like the Xen domain-id
> of the guest whose memory we are trying to map.
> 
> Add a new protocol feature, 'VHOST_USER_PROTOCOL_F_XEN_MMAP', which lets
> the back-end know about the additional memory mapping requirements.
> When this feature is negotiated, the front-end can send the
> 'VHOST_USER_SET_XEN_MMAP' message type to provide the additional
> information to the back-end.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
> V1->V2:
> - Make the custom mmap feature Xen specific, instead of being generic.
> - Clearly define which memory regions are impacted by this change.
> - Allow VHOST_USER_SET_XEN_MMAP to be called multiple times.
> - Additional Bit(2) property in flags.
> 
>  docs/interop/vhost-user.rst | 36 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 36 insertions(+)
> 
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 3f18ab424eb0..8be5f5eae941 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -258,6 +258,24 @@ Inflight description
>  
>  :queue size: a 16-bit size of virtqueues
>  
> +Xen mmap description
> +^^^^^^^^^^^^^^^^^^^^
> +
> ++-------+-------+
> +| flags | domid |
> ++-------+-------+
> +
> +:flags: 64-bit bit field
> +
> +- Bit 0 is set for Xen foreign memory memory mapping.
> +- Bit 1 is set for Xen grant memory memory mapping.
> +- Bit 2 is set if the back-end can directly map additional memory (like
> +  descriptor buffers or indirect descriptors, which aren't part of already
> +  shared memory regions) without the need of front-end sending an additional
> +  memory region first.

I don't understand what Bit 2 does. Can you rephrase this? It's unclear
to me how additional memory can be mapped without a memory region
(especially the fd) being sent.

> +
> +:domid: a 64-bit Xen hypervisor specific domain id.
> +
>  C structure
>  -----------
>  
> @@ -867,6 +885,7 @@ Protocol features
>    #define VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS 14
>    #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
>    #define VHOST_USER_PROTOCOL_F_STATUS               16
> +  #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
>  
>  Front-end message types
>  -----------------------
> @@ -1422,6 +1441,23 @@ Front-end message types
>    query the back-end for its device status as defined in the Virtio
>    specification.
>  
> +``VHOST_USER_SET_XEN_MMAP``
> +  :id: 41
> +  :equivalent ioctl: N/A
> +  :request payload: Xen mmap description
> +  :reply payload: N/A
> +
> +  When the ``VHOST_USER_PROTOCOL_F_XEN_MMAP`` protocol feature has been
> +  successfully negotiated, this message is submitted by the front-end to set the
> +  Xen hypervisor specific memory mapping configurations at the back-end.  These
> +  configurations should be used to mmap memory regions, virtqueues, descriptors
> +  and descriptor buffers. The front-end must send this message before any
> +  memory-regions are sent to the back-end via ``VHOST_USER_SET_MEM_TABLE`` or
> +  ``VHOST_USER_ADD_MEM_REG`` message types. The front-end can send this message
> +  multiple times, if different mmap configurations are required for different
> +  memory regions, where the most recent ``VHOST_USER_SET_XEN_MMAP`` must be used
> +  by the back-end to map any newly shared memory regions.

This message modifies the behavior of subsequent
VHOST_USER_SET_MEM_TABLE and VHOST_USER_ADD_MEM_REG messages. The memory
region structs can be extended and then VHOST_USER_SET_XEN_MMAP isn't
needed.

In other words:

  When VHOST_USER_PROTOCOL_F_XEN_MMAP is negotiated, each "Memory
  regions description" and "Single memory region description" has the
  following additional fields appended:

  +----------------+-------+
  | xen_mmap_flags | domid |
  +----------------+-------+

  :xen_mmap_flags: 64-bit bit field
  :domid: a 64-bit Xen hypervisor specific domain id.

Stefan
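
As a non-normative illustration, the extended layout suggested above
could be written as the following C struct; the first four fields are
the standard 64-bit region fields from the existing spec, and all names
here are assumptions:

  #include <stdint.h>

  struct vhost_user_memory_region_xen {
      /* existing "memory region description" fields */
      uint64_t guest_phys_addr;  /* guest physical address of the region */
      uint64_t memory_size;      /* size of the region in bytes */
      uint64_t userspace_addr;   /* front-end (VMM) virtual address */
      uint64_t mmap_offset;      /* offset into the shared file descriptor */
      /* fields appended when VHOST_USER_PROTOCOL_F_XEN_MMAP is negotiated */
      uint64_t xen_mmap_flags;   /* same bit field as the Xen mmap description */
      uint64_t domid;            /* Xen domain id of the guest */
  };
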
Viresh Kumar March 7, 2023, 5:43 a.m. UTC | #2
On 06-03-23, 10:34, Stefan Hajnoczi wrote:
> On Mon, Mar 06, 2023 at 04:40:24PM +0530, Viresh Kumar wrote:
> > +Xen mmap description
> > +^^^^^^^^^^^^^^^^^^^^
> > +
> > ++-------+-------+
> > +| flags | domid |
> > ++-------+-------+
> > +
> > +:flags: 64-bit bit field
> > +
> > +- Bit 0 is set for Xen foreign memory memory mapping.
> > +- Bit 1 is set for Xen grant memory memory mapping.
> > +- Bit 2 is set if the back-end can directly map additional memory (like
> > +  descriptor buffers or indirect descriptors, which aren't part of already
> > +  shared memory regions) without the need of front-end sending an additional
> > +  memory region first.
> 
> I don't understand what Bit 2 does. Can you rephrase this? It's unclear
> to me how additional memory can be mapped without a memory region
> (especially the fd) is sent?

I (somehow) assumed we would be able to use the same file descriptor
that was shared for the virtqueues' memory regions, and yes, I can now
see why that wouldn't work, or would create problems.

And I now need a suggestion on how to make this work.

With Xen grants, the front-end receives grant addresses from the guest
kernel; they aren't physical addresses, but behave somewhat like IOMMU
addresses.

Initially, the back-end only gets access to the memory regions of the
virtqueues. When the back-end receives a request, it reads the descriptor
and finds the buffer address, which isn't part of the already shared
regions. The same happens for descriptor addresses when the indirect
descriptor feature is negotiated.

At this point I was thinking that the back-end could simply call
mmap()/ioctl() to map the memory, using the file descriptor that was
used for the virtqueues.

How else can we make this work? We also need to unmap/remove the
memory region as soon as the buffer is processed, as the grant address
won't be relevant for any subsequent request.

Should I use VHOST_USER_IOTLB_MSG for this? I did look at it and I
wasn't convinced that it was an exact fit. For example, it says that a
memory address reported with a miss/access failure should be part of an
already sent memory region, which isn't the case here.

> This message modifies the behavior of subsequent
> VHOST_USER_SET_MEM_TABLE and VHOST_USER_ADD_MEM_REG messages. The memory
> region structs can be extended and then VHOST_USER_SET_XEN_MMAP isn't
> needed.
> 
> In other words:
> 
>   When VHOST_USER_PROTOCOL_F_XEN_MMAP is negotiated, each "Memory
>   regions description" and "Single memory region description" has the
>   following additional fields appended:
> 
>   +----------------+-------+
>   | xen_mmap_flags | domid |
>   +----------------+-------+

This looks fine.
Stefan Hajnoczi March 7, 2023, 4:22 p.m. UTC | #3
On Tue, Mar 07, 2023 at 11:13:36AM +0530, Viresh Kumar wrote:
> On 06-03-23, 10:34, Stefan Hajnoczi wrote:
> > On Mon, Mar 06, 2023 at 04:40:24PM +0530, Viresh Kumar wrote:
> > > +Xen mmap description
> > > +^^^^^^^^^^^^^^^^^^^^
> > > +
> > > ++-------+-------+
> > > +| flags | domid |
> > > ++-------+-------+
> > > +
> > > +:flags: 64-bit bit field
> > > +
> > > +- Bit 0 is set for Xen foreign memory memory mapping.
> > > +- Bit 1 is set for Xen grant memory memory mapping.
> > > +- Bit 2 is set if the back-end can directly map additional memory (like
> > > +  descriptor buffers or indirect descriptors, which aren't part of already
> > > +  shared memory regions) without the need of front-end sending an additional
> > > +  memory region first.
> > 
> > I don't understand what Bit 2 does. Can you rephrase this? It's unclear
> > to me how additional memory can be mapped without a memory region
> > (especially the fd) is sent?
> 
> I (somehow) assumed we will be able to use the same file descriptor
> that was shared for the virtqueues memory regions and yes I can see
> now why it wouldn't work or create problems.
> 
> And I need suggestion now on how to make this work.
> 
> With Xen grants, the front end receives grant address from the from
> guest kernel, they aren't physical addresses, kind of IOMMU stuff.
> 
> The back-end gets access for memory regions of the virtqueues alone
> initially.  When the back-end gets a request, it reads the descriptor
> and finds the buffer address, which isn't part of already shared
> regions. The same happens for descriptor addresses in case indirect
> descriptor feature is negotiated.
> 
> At this point I was thinking maybe the back-end can simply call the
> mmap/ioctl to map the memory, using the file descriptor used for the
> virtqueues.
> 
> How else can we make this work ? We also need to unmap/remove the
> memory region, as soon as the buffer is processed as the grant address
> won't be relevant for any subsequent request.
> 
> Should I use VHOST_USER_IOTLB_MSG for this ? I did look at it and I
> wasn't convinced if it was an exact fit. For example it says that a
> memory address reported with miss/access fail should be part of an
> already sent memory region, which isn't the case here.

VHOST_USER_IOTLB_MSG probably isn't necessary because address
translation is not required. It will also reduce performance by adding
extra communication.

Instead, you could change the 1 memory region : 1 mmap relationship that
existing non-Xen vhost-user back-end implementations have. In Xen
vhost-user back-ends, the memory region details (including the file
descriptor and Xen domain id) would be stashed away in the back-end when
the front-end adds memory regions. No mmap would be performed upon
VHOST_USER_ADD_MEM_REG or VHOST_USER_SET_MEM_TABLE.

Whenever the back-end needs to do DMA, it looks up the memory region and
performs the mmap + Xen-specific calls:
- A long-lived mmap of the vring is set up when
  VHOST_USER_SET_VRING_ENABLE is received.
- Short-lived mmaps of the indirect descriptors and of the memory pointed
  to by the descriptors are set up by the virtqueue processing code.

Does this sound workable to you?

Stefan
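
Below is a rough, non-authoritative C sketch of the flow Stefan
describes: stash the region details when they are added and map on
demand. xen_region_mmap()/xen_region_munmap() are hypothetical
placeholders for the real mmap()+ioctl() sequences on /dev/privcmd or
/dev/gntdev:

  #include <stddef.h>
  #include <stdint.h>

  struct xen_region {
      int      fd;          /* fd received with VHOST_USER_ADD_MEM_REG */
      uint64_t guest_addr;  /* guest (or grant) base address */
      uint64_t size;
      uint64_t mmap_offset;
      uint64_t flags;       /* foreign vs. grant mapping */
      uint64_t domid;       /* Xen domain id */
  };

  /* Placeholder: a real implementation would mmap() r->fd and issue the
   * appropriate /dev/privcmd or /dev/gntdev ioctl() using r->flags and
   * r->domid.  Returning NULL keeps this sketch self-contained. */
  static void *xen_region_mmap(const struct xen_region *r, uint64_t addr,
                               size_t len)
  {
      (void)r; (void)addr; (void)len;
      return NULL;
  }

  static void xen_region_munmap(void *ptr, size_t len)
  {
      (void)ptr; (void)len;
  }

  /* Long-lived mapping of the vring, set up when
   * VHOST_USER_SET_VRING_ENABLE is received. */
  static void *map_vring(const struct xen_region *r, uint64_t desc_addr,
                         size_t vring_size)
  {
      return xen_region_mmap(r, desc_addr, vring_size);
  }

  /* Short-lived mapping around each request: map the buffer (or indirect
   * descriptor table), process it, then unmap so a stale grant address
   * is never reused for a later request. */
  static void process_buffer(const struct xen_region *r, uint64_t buf_addr,
                             size_t len, void (*handle)(void *, size_t))
  {
      void *buf = xen_region_mmap(r, buf_addr, len);
      if (buf != NULL) {
          handle(buf, len);
          xen_region_munmap(buf, len);
      }
  }

The split mirrors Stefan's two bullet points: one persistent vring
mapping, plus per-request mappings for buffers and indirect descriptor
tables.
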
Viresh Kumar March 9, 2023, 8:51 a.m. UTC | #4
On 07-03-23, 11:22, Stefan Hajnoczi wrote:
> VHOST_USER_IOTLB_MSG probably isn't necessary because address
> translation is not required. It will also reduce performance by adding
> extra communication.
> 
> Instead, you could change the 1 memory region : 1 mmap relationship that
> existing non-Xen vhost-user back-end implementations have. In Xen
> vhost-user back-ends, the memory region details (including the file
> descriptor and Xen domain id) would be stashed away in back-end when the
> front-end adds memory regions. No mmap would be performed upon
> VHOST_USER_ADD_MEM_REG or VHOST_USER_SET_MEM_TABLE.
> 
> Whenever the back-end needs to do DMA, it looks up the memory region and
> performs the mmap + Xen-specific calls:
> - A long-lived mmap of the vring is set up when
>   VHOST_USER_SET_VRING_ENABLE is received.
> - Short-lived mmaps of the indirect descriptors and memory pointed to by
>   the descriptors is set up by the virtqueue processing code.
> 
> Does this sound workable to you?

Sounds good. I have sent a proposal (v3) based on that now.
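
For completeness, the back-end should only honour the Xen-specific
mapping parameters once the protocol feature has been negotiated; a tiny
C sketch of that gate, using the constant from the patch and an assumed
negotiated feature mask:

  #include <stdbool.h>
  #include <stdint.h>

  #define VHOST_USER_PROTOCOL_F_XEN_MMAP 17

  /* Assumed helper: checks the bit in the negotiated protocol feature mask. */
  static bool xen_mmap_negotiated(uint64_t negotiated_protocol_features)
  {
      return negotiated_protocol_features &
             (1ULL << VHOST_USER_PROTOCOL_F_XEN_MMAP);
  }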

Patch

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 3f18ab424eb0..8be5f5eae941 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -258,6 +258,24 @@  Inflight description
 
 :queue size: a 16-bit size of virtqueues
 
+Xen mmap description
+^^^^^^^^^^^^^^^^^^^^
+
++-------+-------+
+| flags | domid |
++-------+-------+
+
+:flags: 64-bit bit field
+
+- Bit 0 is set for Xen foreign memory memory mapping.
+- Bit 1 is set for Xen grant memory memory mapping.
+- Bit 2 is set if the back-end can directly map additional memory (like
+  descriptor buffers or indirect descriptors, which aren't part of already
+  shared memory regions) without the need of front-end sending an additional
+  memory region first.
+
+:domid: a 64-bit Xen hypervisor specific domain id.
+
 C structure
 -----------
 
@@ -867,6 +885,7 @@  Protocol features
   #define VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS 14
   #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
   #define VHOST_USER_PROTOCOL_F_STATUS               16
+  #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
 
 Front-end message types
 -----------------------
@@ -1422,6 +1441,23 @@  Front-end message types
   query the back-end for its device status as defined in the Virtio
   specification.
 
+``VHOST_USER_SET_XEN_MMAP``
+  :id: 41
+  :equivalent ioctl: N/A
+  :request payload: Xen mmap description
+  :reply payload: N/A
+
+  When the ``VHOST_USER_PROTOCOL_F_XEN_MMAP`` protocol feature has been
+  successfully negotiated, this message is submitted by the front-end to set the
+  Xen hypervisor specific memory mapping configurations at the back-end.  These
+  configurations should be used to mmap memory regions, virtqueues, descriptors
+  and descriptor buffers. The front-end must send this message before any
+  memory-regions are sent to the back-end via ``VHOST_USER_SET_MEM_TABLE`` or
+  ``VHOST_USER_ADD_MEM_REG`` message types. The front-end can send this message
+  multiple times, if different mmap configurations are required for different
+  memory regions, where the most recent ``VHOST_USER_SET_XEN_MMAP`` must be used
+  by the back-end to map any newly shared memory regions.
+
 
 Back-end message types
 ----------------------