diff mbox series

[v5] hostmem-file: add offset option

Message ID 20230403221421.60877-1-graf@amazon.com (mailing list archive)
State New, archived
Headers show
Series [v5] hostmem-file: add offset option | expand

Commit Message

Alexander Graf April 3, 2023, 10:14 p.m. UTC
Add an option for hostmem-file to start the memory object at an offset
into the target file. This is useful if multiple memory objects reside
inside the same target file, such as a device node.

In particular, it's useful to map guest memory directly into /dev/mem
for experimentation.

To make this work consistently, also fix up all places in QEMU that
expect fd offsets to be 0.

Signed-off-by: Alexander Graf <graf@amazon.com>

---

v1 -> v2:

  - add qom documentation
  - propagate offset into truncate, size and alignment checks

v2 -> v3:

  - failed attempt at fixing typo

v3 -> v4:

  - fix typo

v4 -> v5:

  - improve qom doc comment
  - account for fd_offset in more places
---
 backends/hostmem-file.c | 40 +++++++++++++++++++++++++++++++++++++++-
 hw/virtio/vhost-user.c  |  1 +
 include/exec/memory.h   |  2 ++
 include/exec/ram_addr.h |  3 ++-
 include/exec/ramblock.h |  1 +
 qapi/qom.json           |  5 +++++
 qemu-options.hx         |  6 +++++-
 softmmu/memory.c        |  3 ++-
 softmmu/physmem.c       | 17 ++++++++++++-----
 9 files changed, 69 insertions(+), 9 deletions(-)

Comments

Markus Armbruster April 4, 2023, 6:48 a.m. UTC | #1
Alexander Graf <graf@amazon.com> writes:

> Add an option for hostmem-file to start the memory object at an offset
> into the target file. This is useful if multiple memory objects reside
> inside the same target file, such as a device node.
>
> In particular, it's useful to map guest memory directly into /dev/mem
> for experimentation.
>
> To make this work consistently, also fix up all places in QEMU that
> expect fd offsets to be 0.
>
> Signed-off-by: Alexander Graf <graf@amazon.com>

[...]

> diff --git a/qapi/qom.json b/qapi/qom.json
> index a877b879b9..f740f74be3 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -635,6 +635,10 @@
>  #         specify the required alignment via this option.
>  #         0 selects a default alignment (currently the page size). (default: 0)
>  #
> +# @offset: the offset into the target file that the region starts at. You can
> +#          use this option to back multiple regions with a single file. Must be
> +#          a multiple of the page size. (default: 0) (since 8.1)
> +#
>  # @discard-data: if true, the file contents can be destroyed when QEMU exits,
>  #                to avoid unnecessarily flushing data to the backing file. Note
>  #                that ``discard-data`` is only an optimization, and QEMU might
> @@ -655,6 +659,7 @@
>  { 'struct': 'MemoryBackendFileProperties',
>    'base': 'MemoryBackendProperties',
>    'data': { '*align': 'size',
> +            '*offset': 'size',
>              '*discard-data': 'bool',
>              'mem-path': 'str',
>              '*pmem': { 'type': 'bool', 'if': 'CONFIG_LIBPMEM' },

Acked-by: Markus Armbruster <armbru@redhat.com>

[...]
Peter Xu April 4, 2023, 2:36 p.m. UTC | #2
On Mon, Apr 03, 2023 at 10:14:21PM +0000, Alexander Graf wrote:
> Add an option for hostmem-file to start the memory object at an offset
> into the target file. This is useful if multiple memory objects reside
> inside the same target file, such as a device node.
> 
> In particular, it's useful to map guest memory directly into /dev/mem
> for experimentation.
> 
> To make this work consistently, also fix up all places in QEMU that
> expect fd offsets to be 0.
> 
> Signed-off-by: Alexander Graf <graf@amazon.com>

Acked-by: Peter Xu <peterx@redhat.com>

I also agree it'll be nicer to split the fix into separate patch, though.
The only affected part IIUC is multi-process QEMU since 6.0.0.  Copying the
maintainers too so they'll be aware.

Corresponds to the tag:

Fixes: ed5d001916 ("multi-process: setup memory manager for remote device")
David Hildenbrand April 5, 2023, 1:58 p.m. UTC | #3
On 04.04.23 16:36, Peter Xu wrote:
> On Mon, Apr 03, 2023 at 10:14:21PM +0000, Alexander Graf wrote:
>> Add an option for hostmem-file to start the memory object at an offset
>> into the target file. This is useful if multiple memory objects reside
>> inside the same target file, such as a device node.
>>
>> In particular, it's useful to map guest memory directly into /dev/mem
>> for experimentation.
>>
>> To make this work consistently, also fix up all places in QEMU that
>> expect fd offsets to be 0.
>>
>> Signed-off-by: Alexander Graf <graf@amazon.com>
> 
> Acked-by: Peter Xu <peterx@redhat.com>
> 
> I also agree it'll be nicer to split the fix into separate patch, though.
> The only affected part IIUC is multi-process QEMU since 6.0.0.  Copying the
> maintainers too so they'll be aware.
> 
> Corresponds to the tag:
> 
> Fixes: ed5d001916 ("multi-process: setup memory manager for remote device")
> 

If there are no options on splitting out the fix, I'll route this via my 
tree.
Igor Mammedov April 11, 2023, 11:46 a.m. UTC | #4
On Wed, 5 Apr 2023 15:58:31 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 04.04.23 16:36, Peter Xu wrote:
> > On Mon, Apr 03, 2023 at 10:14:21PM +0000, Alexander Graf wrote:  
> >> Add an option for hostmem-file to start the memory object at an offset
> >> into the target file. This is useful if multiple memory objects reside
> >> inside the same target file, such as a device node.
> >>
> >> In particular, it's useful to map guest memory directly into /dev/mem
> >> for experimentation.
> >>
> >> To make this work consistently, also fix up all places in QEMU that
> >> expect fd offsets to be 0.
> >>
> >> Signed-off-by: Alexander Graf <graf@amazon.com>  
> > 
> > Acked-by: Peter Xu <peterx@redhat.com>
> > 
> > I also agree it'll be nicer to split the fix into separate patch, though.
> > The only affected part IIUC is multi-process QEMU since 6.0.0.  Copying the
> > maintainers too so they'll be aware.
> > 
> > Corresponds to the tag:
> > 
> > Fixes: ed5d001916 ("multi-process: setup memory manager for remote device")
> >   
> 
> If there are no options on splitting out the fix, I'll route this via my 
> tree.

Having fixes as separate prep patch is much more preferable.

Another question is if we should also check that provided
offset honors 'align' option?
David Hildenbrand April 20, 2023, 5:22 p.m. UTC | #5
On 11.04.23 13:46, Igor Mammedov wrote:
> On Wed, 5 Apr 2023 15:58:31 +0200
> David Hildenbrand <david@redhat.com> wrote:
> 
>> On 04.04.23 16:36, Peter Xu wrote:
>>> On Mon, Apr 03, 2023 at 10:14:21PM +0000, Alexander Graf wrote:
>>>> Add an option for hostmem-file to start the memory object at an offset
>>>> into the target file. This is useful if multiple memory objects reside
>>>> inside the same target file, such as a device node.
>>>>
>>>> In particular, it's useful to map guest memory directly into /dev/mem
>>>> for experimentation.
>>>>
>>>> To make this work consistently, also fix up all places in QEMU that
>>>> expect fd offsets to be 0.
>>>>
>>>> Signed-off-by: Alexander Graf <graf@amazon.com>
>>>
>>> Acked-by: Peter Xu <peterx@redhat.com>
>>>
>>> I also agree it'll be nicer to split the fix into separate patch, though.
>>> The only affected part IIUC is multi-process QEMU since 6.0.0.  Copying the
>>> maintainers too so they'll be aware.
>>>
>>> Corresponds to the tag:
>>>
>>> Fixes: ed5d001916 ("multi-process: setup memory manager for remote device")
>>>    
>>
>> If there are no options on splitting out the fix, I'll route this via my
>> tree.
> 
> Having fixes as separate prep patch is much more preferable.
> 

Right. Question is if it's really worth it. (it's quite a corner cases I 
guess ...)

> Another question is if we should also check that provided
> offset honors 'align' option?

Good point, but I wonder if it's a real requirement (IOW, what would go 
wrong)? The important part seems to be that the offset is aligned to the 
underlying page size.

I'll be on vacation until 2. Mai, planning on picking it up by then to 
send it upstream.
David Hildenbrand May 3, 2023, 1:40 p.m. UTC | #6
On 04.04.23 00:14, Alexander Graf wrote:
> Add an option for hostmem-file to start the memory object at an offset
> into the target file. This is useful if multiple memory objects reside
> inside the same target file, such as a device node.
> 
> In particular, it's useful to map guest memory directly into /dev/mem
> for experimentation.
> 
> To make this work consistently, also fix up all places in QEMU that
> expect fd offsets to be 0.
> 
> Signed-off-by: Alexander Graf <graf@amazon.com>
> 
> ---
> 
> v1 -> v2:
> 
>    - add qom documentation
>    - propagate offset into truncate, size and alignment checks
> 
> v2 -> v3:
> 
>    - failed attempt at fixing typo
> 
> v3 -> v4:
> 
>    - fix typo
> 
> v4 -> v5:
> 
>    - improve qom doc comment
>    - account for fd_offset in more places
> ---

I queued this as is to

https://github.com/davidhildenbrand/qemu.git mem-next


I might send this upstream soonish (nothing else is really pending).

Thanks!
diff mbox series

Patch

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 25141283c4..38ea65bec5 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -27,6 +27,7 @@  struct HostMemoryBackendFile {
 
     char *mem_path;
     uint64_t align;
+    uint64_t offset;
     bool discard_data;
     bool is_pmem;
     bool readonly;
@@ -58,7 +59,8 @@  file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
     memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
                                      backend->size, fb->align, ram_flags,
-                                     fb->mem_path, fb->readonly, errp);
+                                     fb->mem_path, fb->offset, fb->readonly,
+                                     errp);
     g_free(name);
 #endif
 }
@@ -125,6 +127,36 @@  static void file_memory_backend_set_align(Object *o, Visitor *v,
     fb->align = val;
 }
 
+static void file_memory_backend_get_offset(Object *o, Visitor *v,
+                                          const char *name, void *opaque,
+                                          Error **errp)
+{
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
+    uint64_t val = fb->offset;
+
+    visit_type_size(v, name, &val, errp);
+}
+
+static void file_memory_backend_set_offset(Object *o, Visitor *v,
+                                          const char *name, void *opaque,
+                                          Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(o);
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
+    uint64_t val;
+
+    if (host_memory_backend_mr_inited(backend)) {
+        error_setg(errp, "cannot change property '%s' of %s", name,
+                   object_get_typename(o));
+        return;
+    }
+
+    if (!visit_type_size(v, name, &val, errp)) {
+        return;
+    }
+    fb->offset = val;
+}
+
 #ifdef CONFIG_LIBPMEM
 static bool file_memory_backend_get_pmem(Object *o, Error **errp)
 {
@@ -197,6 +229,12 @@  file_backend_class_init(ObjectClass *oc, void *data)
         file_memory_backend_get_align,
         file_memory_backend_set_align,
         NULL, NULL);
+    object_class_property_add(oc, "offset", "int",
+        file_memory_backend_get_offset,
+        file_memory_backend_set_offset,
+        NULL, NULL);
+    object_class_property_set_description(oc, "offset",
+        "Offset into the target file (ex: 1G)");
 #ifdef CONFIG_LIBPMEM
     object_class_property_add_bool(oc, "pmem",
         file_memory_backend_get_pmem, file_memory_backend_set_pmem);
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index e5285df4ba..39dc803b03 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -483,6 +483,7 @@  static MemoryRegion *vhost_user_get_mr_data(uint64_t addr, ram_addr_t *offset,
     assert((uintptr_t)addr == addr);
     mr = memory_region_from_host((void *)(uintptr_t)addr, offset);
     *fd = memory_region_get_fd(mr);
+    *offset += mr->ram_block->fd_offset;
 
     return mr;
 }
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 15ade918ba..3b7295fbe2 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1318,6 +1318,7 @@  void memory_region_init_resizeable_ram(MemoryRegion *mr,
  * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  *             RAM_NORESERVE,
  * @path: the path in which to allocate the RAM.
+ * @offset: offset within the file referenced by path
  * @readonly: true to open @path for reading, false for read/write.
  * @errp: pointer to Error*, to store an error if it happens.
  *
@@ -1331,6 +1332,7 @@  void memory_region_init_ram_from_file(MemoryRegion *mr,
                                       uint64_t align,
                                       uint32_t ram_flags,
                                       const char *path,
+                                      ram_addr_t offset,
                                       bool readonly,
                                       Error **errp);
 
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index f4fb6a2111..90a8269290 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -110,6 +110,7 @@  long qemu_maxrampagesize(void);
  *  @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  *              RAM_NORESERVE.
  *  @mem_path or @fd: specify the backing file or device
+ *  @offset: Offset into target file
  *  @readonly: true to open @path for reading, false for read/write.
  *  @errp: pointer to Error*, to store an error if it happens
  *
@@ -119,7 +120,7 @@  long qemu_maxrampagesize(void);
  */
 RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
                                    uint32_t ram_flags, const char *mem_path,
-                                   bool readonly, Error **errp);
+                                   off_t offset, bool readonly, Error **errp);
 RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
                                  uint32_t ram_flags, int fd, off_t offset,
                                  bool readonly, Error **errp);
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index adc03df59c..69c6a53902 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -40,6 +40,7 @@  struct RAMBlock {
     QLIST_ENTRY(RAMBlock) next;
     QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
     int fd;
+    uint64_t fd_offset;
     size_t page_size;
     /* dirty bitmap used during migration */
     unsigned long *bmap;
diff --git a/qapi/qom.json b/qapi/qom.json
index a877b879b9..f740f74be3 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -635,6 +635,10 @@ 
 #         specify the required alignment via this option.
 #         0 selects a default alignment (currently the page size). (default: 0)
 #
+# @offset: the offset into the target file that the region starts at. You can
+#          use this option to back multiple regions with a single file. Must be
+#          a multiple of the page size. (default: 0) (since 8.1)
+#
 # @discard-data: if true, the file contents can be destroyed when QEMU exits,
 #                to avoid unnecessarily flushing data to the backing file. Note
 #                that ``discard-data`` is only an optimization, and QEMU might
@@ -655,6 +659,7 @@ 
 { 'struct': 'MemoryBackendFileProperties',
   'base': 'MemoryBackendProperties',
   'data': { '*align': 'size',
+            '*offset': 'size',
             '*discard-data': 'bool',
             'mem-path': 'str',
             '*pmem': { 'type': 'bool', 'if': 'CONFIG_LIBPMEM' },
diff --git a/qemu-options.hx b/qemu-options.hx
index 59bdf67a2c..fb4eb81af4 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4859,7 +4859,7 @@  SRST
     they are specified. Note that the 'id' property must be set. These
     objects are placed in the '/objects' path.
 
-    ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,readonly=on|off``
+    ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off``
         Creates a memory file backend object, which can be used to back
         the guest RAM with huge pages.
 
@@ -4929,6 +4929,10 @@  SRST
         such cases, users can specify the required alignment via this
         option.
 
+        The ``offset`` option specifies the offset into the target file
+        that the region starts at. You can use this parameter to back
+        multiple regions with a single file.
+
         The ``pmem`` option specifies whether the backing file specified
         by ``mem-path`` is in host persistent memory that can be
         accessed using the SNIA NVM programming model (e.g. Intel
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 5305aca7ca..9f620085a0 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1601,6 +1601,7 @@  void memory_region_init_ram_from_file(MemoryRegion *mr,
                                       uint64_t align,
                                       uint32_t ram_flags,
                                       const char *path,
+                                      ram_addr_t offset,
                                       bool readonly,
                                       Error **errp)
 {
@@ -1612,7 +1613,7 @@  void memory_region_init_ram_from_file(MemoryRegion *mr,
     mr->destructor = memory_region_destructor_ram;
     mr->align = align;
     mr->ram_block = qemu_ram_alloc_from_file(size, mr, ram_flags, path,
-                                             readonly, &err);
+                                             offset, readonly, &err);
     if (err) {
         mr->size = int128_zero();
         object_unparent(OBJECT(mr));
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 0e0182d9f2..32460d7a3a 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1369,6 +1369,11 @@  static void *file_ram_alloc(RAMBlock *block,
         error_setg(errp, "alignment 0x%" PRIx64
                    " must be a power of two", block->mr->align);
         return NULL;
+    } else if (offset % block->page_size) {
+        error_setg(errp, "offset 0x%" PRIx64
+                   " must be multiples of page size 0x%zx",
+                   offset, block->page_size);
+        return NULL;
     }
     block->mr->align = MAX(block->page_size, block->mr->align);
 #if defined(__s390x__)
@@ -1400,7 +1405,7 @@  static void *file_ram_alloc(RAMBlock *block,
      * those labels. Therefore, extending the non-empty backend file
      * is disabled as well.
      */
-    if (truncate && ftruncate(fd, memory)) {
+    if (truncate && ftruncate(fd, offset + memory)) {
         perror("ftruncate");
     }
 
@@ -1416,6 +1421,7 @@  static void *file_ram_alloc(RAMBlock *block,
     }
 
     block->fd = fd;
+    block->fd_offset = offset;
     return area;
 }
 #endif
@@ -1889,7 +1895,7 @@  RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
 
     size = HOST_PAGE_ALIGN(size);
     file_size = get_file_size(fd);
-    if (file_size > 0 && file_size < size) {
+    if (file_size > offset && file_size < (offset + size)) {
         error_setg(errp, "backing store size 0x%" PRIx64
                    " does not match 'size' option 0x" RAM_ADDR_FMT,
                    file_size, size);
@@ -1929,7 +1935,7 @@  RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
 
 RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
                                    uint32_t ram_flags, const char *mem_path,
-                                   bool readonly, Error **errp)
+                                   off_t offset, bool readonly, Error **errp)
 {
     int fd;
     bool created;
@@ -1941,7 +1947,8 @@  RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
         return NULL;
     }
 
-    block = qemu_ram_alloc_from_fd(size, mr, ram_flags, fd, 0, readonly, errp);
+    block = qemu_ram_alloc_from_fd(size, mr, ram_flags, fd, offset, readonly,
+                                   errp);
     if (!block) {
         if (created) {
             unlink(mem_path);
@@ -2075,7 +2082,7 @@  void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
                 flags |= block->flags & RAM_NORESERVE ? MAP_NORESERVE : 0;
                 if (block->fd >= 0) {
                     area = mmap(vaddr, length, PROT_READ | PROT_WRITE,
-                                flags, block->fd, offset);
+                                flags, block->fd, offset + block->fd_offset);
                 } else {
                     flags |= MAP_ANONYMOUS;
                     area = mmap(vaddr, length, PROT_READ | PROT_WRITE,