diff mbox series

[9/9] hostmem-file: support POSIX shm_open()

Message ID 20240228114759.44758-10-sgarzare@redhat.com (mailing list archive)
State New, archived
Headers show
Series vhost-user: support any POSIX system (tested on macOS and FreeBSD) | expand

Commit Message

Stefano Garzarella Feb. 28, 2024, 11:47 a.m. UTC
Add a new `shm` bool option for `-object memory-backend-file`.

When this option is set to true, the POSIX shm_open(3) is used instead
of open(2).

So a file will not be created in the filesystem, but a "POSIX shared
memory object" will be instantiated. In Linux this turns into a file
in /dev/shm, but in other OSes this may not happen (for example in
macOS or FreeBSD nothing is shown in any filesystem).

This new feature is useful when we need to share guest memory with
another process (e.g. vhost-user backend), but we don't have
memfd_create() or any special filesystems (e.g. /dev/shm) available
as in macOS.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
I am not sure this is the best way to support shm_open() in QEMU.

Other solutions I had in mind were:

- create a new memory-backend-shm

- extend memory-backend-memfd to use shm_open() on systems where memfd is
not available (problem: shm_open wants a name to assign to the object, but
we can do a workaround by using a random name and do the unlink right away)

Any preference/suggestion?

Thanks,
Stefano
---
 qapi/qom.json           |  4 +++
 backends/hostmem-file.c | 57 ++++++++++++++++++++++++++++++++++++++++-
 backends/meson.build    |  2 +-
 qemu-options.hx         | 10 +++++++-
 4 files changed, 70 insertions(+), 3 deletions(-)

Comments

David Hildenbrand Feb. 28, 2024, 12:01 p.m. UTC | #1
On 28.02.24 12:47, Stefano Garzarella wrote:
> Add a new `shm` bool option for `-object memory-backend-file`.
> 
> When this option is set to true, the POSIX shm_open(3) is used instead
> of open(2).
> 
> So a file will not be created in the filesystem, but a "POSIX shared
> memory object" will be instantiated. In Linux this turns into a file
> in /dev/shm, but in other OSes this may not happen (for example in
> macOS or FreeBSD nothing is shown in any filesystem).
> 
> This new feature is useful when we need to share guest memory with
> another process (e.g. vhost-user backend), but we don't have
> memfd_create() or any special filesystems (e.g. /dev/shm) available
> as in macOS.
> 
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
> I am not sure this is the best way to support shm_open() in QEMU.
> 
> Other solutions I had in mind were:
> 
> - create a new memory-backend-shm
> 
> - extend memory-backend-memfd to use shm_open() on systems where memfd is
> not available (problem: shm_open wants a name to assign to the object, but
> we can do a workaround by using a random name and do the unlink right away)
> 
> Any preference/suggestion?

Both sound like reasonable options, and IMHO better than hostmem-file 
with things that are not necessarily "real" files.

Regarding memory-backend-memfd, we similarly have to pass a name to 
memfd_create(), although for different purpose: "The  name  supplied in 
name is used as a filename and will be displayed as the target of the 
corresponding symbolic link in the directory /proc/self/fd/".

So we simply pass TYPE_MEMORY_BACKEND_MEMFD.

Likely, memory-backend-shm that directly maps to shm_open() and only 
provides properties reasonable for shm_open() is the cleanest approach. 
So that would currently be my preference :)
Daniel P. Berrangé Feb. 28, 2024, 12:08 p.m. UTC | #2
On Wed, Feb 28, 2024 at 12:47:59PM +0100, Stefano Garzarella wrote:
> Add a new `shm` bool option for `-object memory-backend-file`.
> 
> When this option is set to true, the POSIX shm_open(3) is used instead
> of open(2).
> 
> So a file will not be created in the filesystem, but a "POSIX shared
> memory object" will be instantiated. In Linux this turns into a file
> in /dev/shm, but in other OSes this may not happen (for example in
> macOS or FreeBSD nothing is shown in any filesystem).
> 
> This new feature is useful when we need to share guest memory with
> another process (e.g. vhost-user backend), but we don't have
> memfd_create() or any special filesystems (e.g. /dev/shm) available
> as in macOS.
> 
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
> I am not sure this is the best way to support shm_open() in QEMU.
> 
> Other solutions I had in mind were:
> 
> - create a new memory-backend-shm
> 
> - extend memory-backend-memfd to use shm_open() on systems where memfd is
> not available (problem: shm_open wants a name to assign to the object, but
> we can do a workaround by using a random name and do the unlink right away)

IMHO, create a new memory-backend-shm, don't overload memory-backend-memfd,
as this lets users choose between shm & memfd, even on Linux.

With regards,
Daniel
Markus Armbruster Feb. 28, 2024, 12:32 p.m. UTC | #3
Stefano Garzarella <sgarzare@redhat.com> writes:

> Add a new `shm` bool option for `-object memory-backend-file`.
>
> When this option is set to true, the POSIX shm_open(3) is used instead
> of open(2).
>
> So a file will not be created in the filesystem, but a "POSIX shared
> memory object" will be instantiated. In Linux this turns into a file
> in /dev/shm, but in other OSes this may not happen (for example in
> macOS or FreeBSD nothing is shown in any filesystem).
>
> This new feature is useful when we need to share guest memory with
> another process (e.g. vhost-user backend), but we don't have
> memfd_create() or any special filesystems (e.g. /dev/shm) available
> as in macOS.
>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
> I am not sure this is the best way to support shm_open() in QEMU.
>
> Other solutions I had in mind were:
>
> - create a new memory-backend-shm

How would that look like?  Would it involve duplicating code?

> - extend memory-backend-memfd to use shm_open() on systems where memfd is
> not available (problem: shm_open wants a name to assign to the object, but
> we can do a workaround by using a random name and do the unlink right away)

Hmm.  Too much magic?  I don't know...

> Any preference/suggestion?

[...]

> diff --git a/qapi/qom.json b/qapi/qom.json
> index 2a6e49365a..bfb01b909f 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -682,6 +682,9 @@
   # @mem-path: the path to either a shared memory or huge page
   #     filesystem mount

Does this need adjustment?

[...]

>  #       writable RAM instead of ROM, and want to set this property to 'off'.
>  #       (default: auto, since 8.2)
>  #
> +# @shm: if true, shm_open(3) is used to create/open POSIX shared memory
> +#       object; if false, an open(2) is used. (default: false) (since 9.0)
> +#

Please format like this for consistency:

# @shm: if true, shm_open(3) is used to create/open POSIX shared memory
#     object; if false, an open(2) is used (default: false) (since 9.0)

>  # Since: 2.1
>  ##
>  { 'struct': 'MemoryBackendFileProperties',
> @@ -692,6 +695,7 @@
>              'mem-path': 'str',
>              '*pmem': { 'type': 'bool', 'if': 'CONFIG_LIBPMEM' },
>              '*readonly': 'bool',
> +            '*shm': 'bool',
>              '*rom': 'OnOffAuto' } }
>  
>  ##

[...]
Stefano Garzarella Feb. 29, 2024, 8:46 a.m. UTC | #4
On Wed, Feb 28, 2024 at 01:01:55PM +0100, David Hildenbrand wrote:
>On 28.02.24 12:47, Stefano Garzarella wrote:
>>Add a new `shm` bool option for `-object memory-backend-file`.
>>
>>When this option is set to true, the POSIX shm_open(3) is used instead
>>of open(2).
>>
>>So a file will not be created in the filesystem, but a "POSIX shared
>>memory object" will be instantiated. In Linux this turns into a file
>>in /dev/shm, but in other OSes this may not happen (for example in
>>macOS or FreeBSD nothing is shown in any filesystem).
>>
>>This new feature is useful when we need to share guest memory with
>>another process (e.g. vhost-user backend), but we don't have
>>memfd_create() or any special filesystems (e.g. /dev/shm) available
>>as in macOS.
>>
>>Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>>---
>>I am not sure this is the best way to support shm_open() in QEMU.
>>
>>Other solutions I had in mind were:
>>
>>- create a new memory-backend-shm
>>
>>- extend memory-backend-memfd to use shm_open() on systems where memfd is
>>not available (problem: shm_open wants a name to assign to the object, but
>>we can do a workaround by using a random name and do the unlink right away)
>>
>>Any preference/suggestion?
>
>Both sound like reasonable options, and IMHO better than hostmem-file 
>with things that are not necessarily "real" files.

Yeah, I see.

>
>Regarding memory-backend-memfd, we similarly have to pass a name to 
>memfd_create(), although for different purpose: "The  name  supplied 
>in name is used as a filename and will be displayed as the target of 
>the corresponding symbolic link in the directory /proc/self/fd/".
>
>So we simply pass TYPE_MEMORY_BACKEND_MEMFD.

Okay, so I guess it must be unique only in the process, while for 
shm_open() it is global.

>
>Likely, memory-backend-shm that directly maps to shm_open() and only 
>provides properties reasonable for shm_open() is the cleanest 
>approach. So that would currently be my preference :)

Thank you for your thoughts, I think I will go toward this direction 
(memory-backend-shm).

It was also my first choice, but in order to have a working RFC right 
away, I modified memory-backend-file.

Stefano
Stefano Garzarella Feb. 29, 2024, 8:49 a.m. UTC | #5
On Wed, Feb 28, 2024 at 12:08:37PM +0000, Daniel P. Berrangé wrote:
>On Wed, Feb 28, 2024 at 12:47:59PM +0100, Stefano Garzarella wrote:
>> Add a new `shm` bool option for `-object memory-backend-file`.
>>
>> When this option is set to true, the POSIX shm_open(3) is used instead
>> of open(2).
>>
>> So a file will not be created in the filesystem, but a "POSIX shared
>> memory object" will be instantiated. In Linux this turns into a file
>> in /dev/shm, but in other OSes this may not happen (for example in
>> macOS or FreeBSD nothing is shown in any filesystem).
>>
>> This new feature is useful when we need to share guest memory with
>> another process (e.g. vhost-user backend), but we don't have
>> memfd_create() or any special filesystems (e.g. /dev/shm) available
>> as in macOS.
>>
>> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>> ---
>> I am not sure this is the best way to support shm_open() in QEMU.
>>
>> Other solutions I had in mind were:
>>
>> - create a new memory-backend-shm
>>
>> - extend memory-backend-memfd to use shm_open() on systems where memfd is
>> not available (problem: shm_open wants a name to assign to the object, but
>> we can do a workaround by using a random name and do the unlink right away)
>
>IMHO, create a new memory-backend-shm, don't overload memory-backend-memfd,
>as this lets users choose between shm & memfd, even on Linux.

Yeah, good point!
I think there's enough of a consensus on adding memory-backend-shm, so 
I'm going to go toward that direction in v2.

Thanks,
Stefano
Stefano Garzarella Feb. 29, 2024, 8:57 a.m. UTC | #6
On Wed, Feb 28, 2024 at 01:32:17PM +0100, Markus Armbruster wrote:
>Stefano Garzarella <sgarzare@redhat.com> writes:
>
>> Add a new `shm` bool option for `-object memory-backend-file`.
>>
>> When this option is set to true, the POSIX shm_open(3) is used instead
>> of open(2).
>>
>> So a file will not be created in the filesystem, but a "POSIX shared
>> memory object" will be instantiated. In Linux this turns into a file
>> in /dev/shm, but in other OSes this may not happen (for example in
>> macOS or FreeBSD nothing is shown in any filesystem).
>>
>> This new feature is useful when we need to share guest memory with
>> another process (e.g. vhost-user backend), but we don't have
>> memfd_create() or any special filesystems (e.g. /dev/shm) available
>> as in macOS.
>>
>> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>> ---
>> I am not sure this is the best way to support shm_open() in QEMU.
>>
>> Other solutions I had in mind were:
>>
>> - create a new memory-backend-shm
>
>How would that look like?  Would it involve duplicating code?

I was looking at it just now, and apart from some boilerplate code to 
create the object, the rest in the end is pretty specific and a lot of 
things in memory-backend-file wouldn't be supported by 
memory-backend-shm anyway, so I'll give it a try for v2 by adding it.

>
>> - extend memory-backend-memfd to use shm_open() on systems where memfd is
>> not available (problem: shm_open wants a name to assign to the object, but
>> we can do a workaround by using a random name and do the unlink right away)
>
>Hmm.  Too much magic?  I don't know...

Yeah, I agree.

>
>> Any preference/suggestion?
>
>[...]
>
>> diff --git a/qapi/qom.json b/qapi/qom.json
>> index 2a6e49365a..bfb01b909f 100644
>> --- a/qapi/qom.json
>> +++ b/qapi/qom.json
>> @@ -682,6 +682,9 @@
>   # @mem-path: the path to either a shared memory or huge page
>   #     filesystem mount
>
>Does this need adjustment?

Good point. For now I think I will drop this patch and add 
memory-backend-shm in v2, but if I go back I will fix it!

>
>[...]
>
>>  #       writable RAM instead of ROM, and want to set this property to 'off'.
>>  #       (default: auto, since 8.2)
>>  #
>> +# @shm: if true, shm_open(3) is used to create/open POSIX shared memory
>> +#       object; if false, an open(2) is used. (default: false) (since 9.0)
>> +#
>
>Please format like this for consistency:

Sure.

>
># @shm: if true, shm_open(3) is used to create/open POSIX shared memory
>#     object; if false, an open(2) is used (default: false) (since 9.0)

I just noticed that I followed the property just above (@rom). Should we 
fix that one?

Thanks,
Stefano
Markus Armbruster Feb. 29, 2024, 10:28 a.m. UTC | #7
Stefano Garzarella <sgarzare@redhat.com> writes:

> On Wed, Feb 28, 2024 at 01:32:17PM +0100, Markus Armbruster wrote:
>>Stefano Garzarella <sgarzare@redhat.com> writes:

[...]

>>> +# @shm: if true, shm_open(3) is used to create/open POSIX shared memory
>>> +#       object; if false, an open(2) is used. (default: false) (since 9.0)
>>> +#
>>
>>Please format like this for consistency:
>
> Sure.
>
>>
>># @shm: if true, shm_open(3) is used to create/open POSIX shared memory
>>#     object; if false, an open(2) is used (default: false) (since 9.0)
>
> I just noticed that I followed the property just above (@rom). Should we fix that one?

Yes, please.

See commit a937b6aa739 (qapi: Reformat doc comments to conform to
current conventions).
Stefano Garzarella Feb. 29, 2024, 11:01 a.m. UTC | #8
On Thu, Feb 29, 2024 at 11:28:37AM +0100, Markus Armbruster wrote:
>Stefano Garzarella <sgarzare@redhat.com> writes:
>
>> On Wed, Feb 28, 2024 at 01:32:17PM +0100, Markus Armbruster wrote:
>>>Stefano Garzarella <sgarzare@redhat.com> writes:
>
>[...]
>
>>>> +# @shm: if true, shm_open(3) is used to create/open POSIX shared memory
>>>> +#       object; if false, an open(2) is used. (default: false) (since 9.0)
>>>> +#
>>>
>>>Please format like this for consistency:
>>
>> Sure.
>>
>>>
>>># @shm: if true, shm_open(3) is used to create/open POSIX shared memory
>>>#     object; if false, an open(2) is used (default: false) (since 9.0)
>>
>> I just noticed that I followed the property just above (@rom). Should we fix that one?
>
>Yes, please.

Done: 
https://patchew.org/QEMU/20240229105826.16354-1-sgarzare@redhat.com/

Thanks,
Stefano

>
>See commit a937b6aa739 (qapi: Reformat doc comments to conform to
>current conventions).
>
diff mbox series

Patch

diff --git a/qapi/qom.json b/qapi/qom.json
index 2a6e49365a..bfb01b909f 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -682,6 +682,9 @@ 
 #       writable RAM instead of ROM, and want to set this property to 'off'.
 #       (default: auto, since 8.2)
 #
+# @shm: if true, shm_open(3) is used to create/open POSIX shared memory
+#       object; if false, an open(2) is used. (default: false) (since 9.0)
+#
 # Since: 2.1
 ##
 { 'struct': 'MemoryBackendFileProperties',
@@ -692,6 +695,7 @@ 
             'mem-path': 'str',
             '*pmem': { 'type': 'bool', 'if': 'CONFIG_LIBPMEM' },
             '*readonly': 'bool',
+            '*shm': 'bool',
             '*rom': 'OnOffAuto' } }
 
 ##
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index ac3e433cbd..9d60375c1f 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -34,6 +34,7 @@  struct HostMemoryBackendFile {
     bool is_pmem;
     bool readonly;
     OnOffAuto rom;
+    bool shm;
 };
 
 static bool
@@ -86,7 +87,37 @@  file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
     ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
+    /* TODO: check if this should be enabled if shm is enabled */
     ram_flags |= RAM_NAMED_FILE;
+
+    if (fb->shm) {
+        mode_t mode = S_IRUSR | S_IWUSR;
+        int fd, oflag = 0;
+
+        oflag |= fb->readonly ? O_RDONLY : O_RDWR;
+        oflag |= O_CREAT;
+
+        fd = shm_open(fb->mem_path, oflag, mode);
+        if (fd < 0) {
+            error_setg_errno(errp, errno,
+                             "failed to create POSIX shared memory");
+            return false;
+        }
+
+        if (ftruncate(fd, backend->size) == -1) {
+            error_setg_errno(errp, errno,
+                             "failed to resize POSIX shared memory to %" PRIu64,
+                             backend->size);
+            shm_unlink(fb->mem_path);
+            return false;
+        }
+
+        return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend),
+                                              name, backend->size, ram_flags,
+                                              fd, fb->offset, errp);
+
+    }
+
     return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
                                             backend->size, fb->align, ram_flags,
                                             fb->mem_path, fb->offset, errp);
@@ -254,17 +285,36 @@  static void file_memory_backend_set_rom(Object *obj, Visitor *v,
     visit_type_OnOffAuto(v, name, &fb->rom, errp);
 }
 
+static bool file_memory_backend_get_shm(Object *obj, Error **errp)
+{
+    return MEMORY_BACKEND_FILE(obj)->shm;
+}
+
+static void file_memory_backend_set_shm(Object *obj, bool value,
+                                             Error **errp)
+{
+    MEMORY_BACKEND_FILE(obj)->shm = value;
+}
+
 static void file_backend_unparent(Object *obj)
 {
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
     HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
 
-    if (host_memory_backend_mr_inited(backend) && fb->discard_data) {
+    if (!host_memory_backend_mr_inited(backend)) {
+        return;
+    }
+
+    if (fb->discard_data) {
         void *ptr = memory_region_get_ram_ptr(&backend->mr);
         uint64_t sz = memory_region_size(&backend->mr);
 
         qemu_madvise(ptr, sz, QEMU_MADV_REMOVE);
     }
+
+    if (fb->shm) {
+        shm_unlink(fb->mem_path);
+    }
 }
 
 static void
@@ -300,6 +350,11 @@  file_backend_class_init(ObjectClass *oc, void *data)
         file_memory_backend_get_rom, file_memory_backend_set_rom, NULL, NULL);
     object_class_property_set_description(oc, "rom",
         "Whether to create Read Only Memory (ROM)");
+    object_class_property_add_bool(oc, "shm",
+        file_memory_backend_get_shm,
+        file_memory_backend_set_shm);
+    object_class_property_set_description(oc, "shm",
+        "Use shm_open(3) to create/open POSIX shared memory objects");
 }
 
 static void file_backend_instance_finalize(Object *o)
diff --git a/backends/meson.build b/backends/meson.build
index 8b2b111497..64520c0a7e 100644
--- a/backends/meson.build
+++ b/backends/meson.build
@@ -12,7 +12,7 @@  system_ss.add([files(
 
 if host_os != 'windows'
   system_ss.add(files('rng-random.c'))
-  system_ss.add(files('hostmem-file.c'))
+  system_ss.add([files('hostmem-file.c'), rt])
 endif
 if host_os == 'linux'
   system_ss.add(files('hostmem-memfd.c'))
diff --git a/qemu-options.hx b/qemu-options.hx
index 9be1e5817c..da96ee506d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -5079,7 +5079,7 @@  SRST
     they are specified. Note that the 'id' property must be set. These
     objects are placed in the '/objects' path.
 
-    ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off,rom=on|off|auto``
+    ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off,rom=on|off|auto,shm=on|off``
         Creates a memory file backend object, which can be used to back
         the guest RAM with huge pages.
 
@@ -5183,6 +5183,14 @@  SRST
         (``share=off``). For this use case, we need writable RAM instead
         of ROM, and want to also set ``rom=off``.
 
+        The ``shm`` option specifies whether to create/open a POSIX shared
+        memory object identified by ``mem-path``.
+        If set to ``on``, use shm_open(3); if set to ``off`` (default),
+        use open(2); For portable use, a shared memory object should be
+        identified by a name of the form ``/somename``; consisting of an
+        initial slash, followed by one or more characters, none of which
+        are slashes.
+
     ``-object memory-backend-ram,id=id,merge=on|off,dump=on|off,share=on|off,prealloc=on|off,size=size,host-nodes=host-nodes,policy=default|preferred|bind|interleave``
         Creates a memory backend object, which can be used to back the
         guest RAM. Memory backend objects offer more control than the