
hostmem: Add clear option to file backend

Message ID 20230302110925.4680-1-fam.zheng@bytedance.com (mailing list archive)
State New, archived
Series hostmem: Add clear option to file backend

Commit Message

Fam Zheng March 2, 2023, 11:09 a.m. UTC
This adds a memset to clear the backing memory. This is useful in the
case of PMEM DAX to drop dirty data, if the backing memory is handed
over from a previous application or firmware which didn't clean up
before exiting.

Signed-off-by: Fam Zheng <fam.zheng@bytedance.com>
---
 backends/hostmem-file.c | 20 ++++++++++++++++++++
 include/exec/memory.h   |  3 +++
 qapi/qom.json           |  3 +++
 qemu-options.hx         |  4 +++-
 softmmu/physmem.c       |  6 +++++-
 5 files changed, 34 insertions(+), 2 deletions(-)

Comments

David Hildenbrand March 2, 2023, 11:31 a.m. UTC | #1
On 02.03.23 12:09, Fam Zheng wrote:
> This adds a memset to clear the backing memory. This is useful in the
> case of PMEM DAX to drop dirty data, if the backing memory is handed
> over from a previous application or firmware which didn't clean up
> before exiting.
> 

Why can't the VM manager do that instead? If you have a file that's 
certainly easily possible.
Fam Zheng March 2, 2023, 11:37 a.m. UTC | #2
> On 2 Mar 2023, at 11:31, David Hildenbrand <david@redhat.com> wrote:
> 
> On 02.03.23 12:09, Fam Zheng wrote:
>> This adds a memset to clear the backing memory. This is useful in the
>> case of PMEM DAX to drop dirty data, if the backing memory is handed
>> over from a previous application or firmware which didn't clean up
>> before exiting.
> 
> Why can't the VM manager do that instead? If you have a file that's certainly easily possible.


Hi David,

Technically yes, but I have a simple VM manager here which wants to avoid replicating the same mmap code, such as handling the flags depending on share=on|off,hugepages=on|off. All in all this approach requires the least additional code to achieve it.
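For illustration, the kind of code I would rather not replicate is roughly the following (a minimal sketch, not QEMU's actual implementation; the helper name is made up):

#include <stdbool.h>
#include <string.h>
#include <sys/mman.h>

/* Sketch of what a VM manager has to redo outside QEMU: open the backing
 * file (a hugetlbfs path when hugepages=on), pick mmap flags from
 * share=on|off, map the region and clear it. */
static void *map_and_clear(int fd, size_t size, bool share)
{
    int flags = share ? MAP_SHARED : MAP_PRIVATE;
    void *area = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, fd, 0);

    if (area != MAP_FAILED) {
        memset(area, 0, size);   /* the clearing this patch does inside QEMU */
    }
    return area;
}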

Thanks,
Fam

> 
> -- 
> Thanks,
> 
> David / dhildenb
>
David Hildenbrand March 2, 2023, 11:39 a.m. UTC | #3
On 02.03.23 12:37, Feiran Zheng wrote:
> 
> 
>> On 2 Mar 2023, at 11:31, David Hildenbrand <david@redhat.com> wrote:
>>
>> On 02.03.23 12:09, Fam Zheng wrote:
>>> This adds a memset to clear the backing memory. This is useful in the
>>> case of PMEM DAX to drop dirty data, if the backing memory is handed
>>> over from a previous application or firmware which didn't clean up
>>> before exiting.
>>
>> Why can't the VM manager do that instead? If you have a file that's certainly easily possible.
> 
> 
> Hi David,
> 
> Technically yes, but I have a simple VM manager here which wants to avoid replicating the same mmap code, such as handling the flags depending on share=on|off,hugepages=on|off. All in all this approach requires the least additional code to achieve it.

so ... we're supposed to maintain that code in QEMU instead to make your 
life easier ? :)

Sorry, for this particular use case I don't see the big benefit of 
moving that code into QEMU.
Daniel P. Berrangé March 2, 2023, 11:44 a.m. UTC | #4
On Thu, Mar 02, 2023 at 12:31:46PM +0100, David Hildenbrand wrote:
> On 02.03.23 12:09, Fam Zheng wrote:
> > This adds a memset to clear the backing memory. This is useful in the
> > case of PMEM DAX to drop dirty data, if the backing memory is handed
> > over from a previous application or firmware which didn't clean up
> > before exiting.
> > 
> 
> Why can't the VM manager do that instead? If you have a file that's
> certainly easily possible.

This feels conceptually similar to the case where you expose a host
block device to the guest. If that block device was previously given
to a different guest it might still have data in it. Someone needs
to take responsibility for scrubbing that data. Since that may take
a non-trivial amount of time, it is typical to do that scrubbing in
the background after the old VM is gone rather than put it into the
startup path for a new VM which would delay boot.

PMEM is blurring the boundary between memory and disk, but the tradeoff
is not so different. We know that in general merely faulting in guest
memory is quite time consuming and delays VM startup significantly as
RAM size increases. Doing the full memset can only be slower still.

For prealloc we've created complex code to fault in memory across many
threads and even that's too slow, so we're considering doing it in the
background as the VM starts up.

IIUC, this patch just puts the memset in the critical serialized path.
This will inevitably lead to a demand for improving performance by
parallelizing across threads, but we know that's too slow already,
and we can't play the background async game with memset as that's
actually changing guest-visible contents.
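For illustration only, the kind of parallel clear that would be asked for next looks roughly like this (a sketch, not existing QEMU code; names made up):

#include <pthread.h>
#include <string.h>

struct clear_job { char *base; size_t len; };

static void *clear_worker(void *opaque)
{
    struct clear_job *job = opaque;
    memset(job->base, 0, job->len);
    return NULL;
}

static void clear_parallel(char *base, size_t size, int nthreads)
{
    pthread_t tid[nthreads];
    struct clear_job job[nthreads];
    size_t chunk = size / nthreads;

    for (int i = 0; i < nthreads; i++) {
        job[i].base = base + (size_t)i * chunk;
        job[i].len = (i == nthreads - 1) ? size - (size_t)i * chunk : chunk;
        pthread_create(&tid[i], NULL, clear_worker, &job[i]);
    }
    for (int i = 0; i < nthreads; i++) {
        pthread_join(tid[i], NULL);   /* startup still waits for the slowest chunk */
    }
}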

IOW, for large PMEM sizes, it does look compelling to do the clearing
of old data in the background outside the context of QEMU VM startup to
avoid delayed startup.

I can still understand the appeal of a simple flag to set on QEMU from
a usability POV, but not sure it's a good idea to encourage this usage
by mgmt apps.

With regards,
Daniel
Fam Zheng March 2, 2023, 11:48 a.m. UTC | #5
> On 2 Mar 2023, at 11:39, David Hildenbrand <david@redhat.com> wrote:
> 
> On 02.03.23 12:37, Feiran Zheng wrote:
>>> On 2 Mar 2023, at 11:31, David Hildenbrand <david@redhat.com> wrote:
>>> 
>>> On 02.03.23 12:09, Fam Zheng wrote:
>>>> This adds a memset to clear the backing memory. This is useful in the
>>>> case of PMEM DAX to drop dirty data, if the backing memory is handed
>>>> over from a previous application or firmware which didn't clean up
>>>> before exiting.
>>> 
>>> Why can't the VM manager do that instead? If you have a file that's certainly easily possible.
>> Hi David,
>> Technically yes, but I have a simple VM manager here which wants to avoid replicating the same mmap code, such as handling the flags depending on share=on|off,hugepages=on|off. All in all this approach requires the least additional code to achieve it.
> 
> so ... we're supposed to maintain that code in QEMU instead to make your life easier ? :)
> 
> Sorry, for this particular use case I don't see the big benefit of moving that code into QEMU.
> 

I am posting because this not only makes my life easier; it supposedly also makes other developers' lives easier, because the file here can be a char file and there is no easy way to clear it (/dev/dax1.0) from the command line if you want to invoke a QEMU command directly.

Maybe I’m missing a convenient command to clear a DAX char file?

Surely it doesn’t make a big difference either way and I am not worried about maintaining the code outside QEMU, but I just think this flag is a nice thing in QEMU anyway.



Fam

> -- 
> Thanks,
> 
> David / dhildenb
>
David Hildenbrand March 2, 2023, 11:54 a.m. UTC | #6
On 02.03.23 12:48, Feiran Zheng wrote:
> 
> 
>> On 2 Mar 2023, at 11:39, David Hildenbrand <david@redhat.com> wrote:
>>
>> On 02.03.23 12:37, Feiran Zheng wrote:
>>>> On 2 Mar 2023, at 11:31, David Hildenbrand <david@redhat.com> wrote:
>>>>
>>>> On 02.03.23 12:09, Fam Zheng wrote:
>>>>> This adds a memset to clear the backing memory. This is useful in the
>>>>> case of PMEM DAX to drop dirty data, if the backing memory is handed
>>>>> over from a previous application or firmware which didn't clean up
>>>>> before exiting.
>>>>
>>>> Why can't the VM manager do that instead? If you have a file that's certainly easily possible.
>>> Hi David,
>>> Technically yes, but I have a simple VM manager here which wants to avoid replicating the same mmap code, such as handling the flags depending on share=on|off,hugepages=on|off. All in all this approach requires the least additional code to achieve it.
>>
>> so ... we're supposed to maintain that code in QEMU instead to make your life easier ? :)
>>
>> Sorry, for this particular use case I don't see the big benefit of moving that code into QEMU.
>>
> 
> I am posting because this not only makes my life easier; it supposedly also makes other developers' lives easier, because the file here can be a char file and there is no easy way to clear it (/dev/dax1.0) from the command line if you want to invoke a QEMU command directly.
> 
> Maybe I’m missing a convenient command to clear a DAX char file?

Can't you simply use dd and read from /dev/zero?
Fam Zheng March 2, 2023, 11:57 a.m. UTC | #7
> On 2 Mar 2023, at 11:44, Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> On Thu, Mar 02, 2023 at 12:31:46PM +0100, David Hildenbrand wrote:
>> On 02.03.23 12:09, Fam Zheng wrote:
>>> This adds a memset to clear the backing memory. This is useful in the
>>> case of PMEM DAX to drop dirty data, if the backing memory is handed
>>> over from a previous application or firmware which didn't clean up
>>> before exiting.
>>> 
>> 
>> Why can't the VM manager do that instead? If you have a file that's
>> certainly easily possible.
> 
> This feels conceptually similar to the case where you expose a host
> block device to the guest. If that block device was previously given
> to a different guest it might still have data in it. Someone needs
> to take responsibility for scrubbing that data. Since that may take
> a non-trivial amount of time, it is typical to do that scrubbing in
> the background after the old VM is gone rather than put it into the
> startup path for a new VM which would delay boot.
> 
> PMEM is blurring the boundary between memory and disk, but the tradeoff
> is not so different. We know that in general merely faulting in guest
> memory is quite time consuming and delays VM startup significantly as
> RAM size increases. Doing the full memset can only be slower still.
> 
> For prealloc we've created complex code to fault in memory across many
> threads and even that's too slow, so we're considering doing it in the
> background as the VM starts up.
> 
> IIUC, this patch just puts the memset in the critical serialized path.
> This will inevitably lead to a demand for improving performance by
> parallelizing across threads, but we know that's too slow already,
> and we can't play the background async game with memset as that's
> actually changing guest-visible contents.
> 
> IOW, for large PMEM sizes, it does look compelling to do the clearing
> of old data in the background outside the context of QEMU VM startup to
> avoid delayed startup.
> 
> I can still understand the appeal of a simple flag to set on QEMU from
> a usability POV, but not sure it's a good idea to encourage this usage
> by mgmt apps.

I can totally see the reasoning about the latency here, but I’m a little dubious if multi-threading for memset can actually help reduce the start-up time; the total cost is going to be bound by memory bandwidth between the CPU and memory (even more so if it’s PMEM) which is limited.
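As a rough illustration (numbers assumed, not measured): if the effective write bandwidth to the pmem device is ~2 GiB/s, clearing a 512 GiB namespace takes on the order of four minutes whether one thread or eight threads issue the stores, once that bandwidth is saturated.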

Fam
Fam Zheng March 2, 2023, 1:31 p.m. UTC | #8
> On 2 Mar 2023, at 11:54, David Hildenbrand <david@redhat.com> wrote:
> 
> On 02.03.23 12:48, Feiran Zheng wrote:
>>> On 2 Mar 2023, at 11:39, David Hildenbrand <david@redhat.com> wrote:
>>> 
>>> On 02.03.23 12:37, Feiran Zheng wrote:
>>>>> On 2 Mar 2023, at 11:31, David Hildenbrand <david@redhat.com> wrote:
>>>>> 
>>>>> On 02.03.23 12:09, Fam Zheng wrote:
>>>>>> This adds a memset to clear the backing memory. This is useful in the
>>>>>> case of PMEM DAX to drop dirty data, if the backing memory is handed
>>>>>> over from a previous application or firmware which didn't clean up
>>>>>> before exiting.
>>>>> 
>>>>> Why can't the VM manager do that instead? If you have a file that's certainly easily possible.
>>>> Hi David,
>>>> Technically yes, but I have a simple VM manager here which wants to avoid replicating the same mmap code, such as handling the flags depending on share=on|off,hugepages=on|off. All in all this approach requires the least additional code to achieve it.
>>> 
>>> so ... we're supposed to maintain that code in QEMU instead to make your life easier ? :)
>>> 
>>> Sorry, for this particular use case I don't see the big benefit of moving that code into QEMU.
>>> 
>> I am posting because this not only makes my life easier; it supposedly also makes other developers' lives easier, because the file here can be a char file and there is no easy way to clear it (/dev/dax1.0) from the command line if you want to invoke a QEMU command directly.
>> Maybe I’m missing a convenient command to clear a DAX char file?
> 
> Can't you simply use dd and read from /dev/zero?
> 

I don’t think it works for dax because the fs driver only implements mmap, not read/write:

# strace -e write dd if=/dev/zero of=/dev/dax1.0 bs=1G count=1
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1073741824) = -1 EINVAL (Invalid argument)
write(2, "dd: error writing '/dev/dax1.0':"..., 50dd: error writing '/dev/dax1.0': Invalid argument

Fam
David Hildenbrand March 2, 2023, 1:56 p.m. UTC | #9
On 02.03.23 12:57, Feiran Zheng wrote:
> 
> 
>> On 2 Mar 2023, at 11:44, Daniel P. Berrangé <berrange@redhat.com> wrote:
>>
>> On Thu, Mar 02, 2023 at 12:31:46PM +0100, David Hildenbrand wrote:
>>> On 02.03.23 12:09, Fam Zheng wrote:
>>>> This adds a memset to clear the backing memory. This is useful in the
>>>> case of PMEM DAX to drop dirty data, if the backing memory is handed
>>>> over from a previous application or firmware which didn't clean up
>>>> before exiting.
>>>>
>>>
>>> Why can't the VM manager do that instead? If you have a file that's
>>> certainly easily possible.
>>
>> This feels conceptually similar to the case where you expose a host
>> block device to the guest. If that block device was previously given
>> to a different guest it might still have data in it. Someone needs
>> to take responsibility for scrubbing that data. Since that may take
>> a non-trivial amount of time, it is typical to do that scrubbing in
>> the background after the old VM is gone rather than put it into the
>> startup path for a new VM which would delay boot.
>>
>> PMEM is blurring the boundary between memory and disk, but the tradeoff
>> is not so different. We know that in general merely faulting in guest
>> memory is quite time consuming and delays VM startup significantly as
>> RAM size increases. Doing the full memset can only be slower still.
>>
>> For prealloc we've created complex code to fault in memory across many
>> threads and even that's too slow, so we're considering doing it in the
>> background as the VM starts up.
>>
>> IIUC, this patch just puts the memset in the critical serialized path.
>> This will inevitably lead to a demand for improving performance by
>> parallelizing across threads, but we know that's too slow already,
>> and we can't play the background async game with memset as that's
>> actually changing guest-visible contents.
>>
>> IOW, for large PMEM sizes, it does look compelling to do the clearing
>> of old data in the background outside the context of QEMU VM startup to
>> avoid delayed startup.
>>
>> I can still understand the appeal of a simple flag to set on QEMU from
>> a usability POV, but not sure it's a good idea to encourage this usage
>> by mgmt apps.
> 
> I can totally see the reasoning about the latency here, but I’m a little dubious if multi-threading for memset can actually help reduce the start-up time; the total cost is going to be bound by memory bandwidth between the CPU and memory (even more so if it’s PMEM) which is limited.

Right, daxio is the magic bit:

daxio.x86_64 : Perform I/O on Device DAX devices or zero a Device DAX device

# daxio -z -o /dev/dax0.0
daxio: copied 8587837440 bytes to device "/dev/dax0.0"
David Hildenbrand March 2, 2023, 2:26 p.m. UTC | #10
On 02.03.23 14:56, David Hildenbrand wrote:
> On 02.03.23 12:57, Feiran Zheng wrote:
>>
>>
>>> On 2 Mar 2023, at 11:44, Daniel P. Berrangé <berrange@redhat.com> wrote:
>>>
>>> On Thu, Mar 02, 2023 at 12:31:46PM +0100, David Hildenbrand wrote:
>>>> On 02.03.23 12:09, Fam Zheng wrote:
>>>>> This adds a memset to clear the backing memory. This is useful in the
>>>>> case of PMEM DAX to drop dirty data, if the backing memory is handed
>>>>> over from a previous application or firmware which didn't clean up
>>>>> before exiting.
>>>>>
>>>>
>>>> Why can't the VM manager do that instead? If you have a file that's
>>>> certainly easily possible.
>>>
>>> This feels conceptually similar to the case where you expose a host
>>> block device to the guest. If that block device was previously given
>>> to a different guest it might still have data in it. Someone needs
>>> to take responsibility for scrubbing that data. Since that may take
>>> a non-trivial amount of time, it is typical to do that scrubbing in
>>> the background after the old VM is gone rather than put it into the
>>> startup path for a new VM which would delay boot.
>>>
>>> PMEM is blurring the boundary between memory and disk, but the tradeoff
>>> is not so different. We know that in general merely faulting in guest
>>> memory is quite time consuming and delays VM startup significantly as
>>> RAM size increases. Doing the full memset can only be slower still.
>>>
>>> For prealloc we've created complex code to fault in memory across many
>>> threads and even that's too slow, so we're considering doing it in the
>>> background as the VM starts up.
>>>
>>> IIUC, this patch just puts the memset in the critical serialized path.
>>> This will inevitably lead to a demand for improving performance by
>>> parallelizing across threads, but we know that's too slow already,
>>> and we can't play the background async game with memset as that's
>>> actually changing guest-visible contents.
>>>
>>> IOW, for large PMEM sizes, it does look compelling to do the clearing
>>> of old data in the background outside the context of QEMU VM startup to
>>> avoid delayed startup.
>>>
>>> I can still understand the appeal of a simple flag to set on QEMU from
>>> a usability POV, but not sure it's a good idea to encourage this usage
>>> by mgmt apps.
>>
>> I can totally see the reasoning about the latency here, but I’m a little dubious if multi-threading for memset can actually help reduce the start-up time; the total cost is going to be bound by memory bandwidth between the CPU and memory (even more so if it’s PMEM) which is limited.
> 
> Right, daxio is the magic bit:
> 
> daxio.x86_64 : Perform I/O on Device DAX devices or zero a Device DAX device
> 
> # daxio -z -o /dev/dax0.0
> daxio: copied 8587837440 bytes to device "/dev/dax0.0"
> 

^ accidentally replied that to the wrong thread.

Regarding this thread: memory preallocation (page zeroing) benefits 
heavily from concurrency in QEMU. I assume it would behave similarly on
pmem; I didn't try, though.
Stefan Hajnoczi March 2, 2023, 4:14 p.m. UTC | #11
On Thu, Mar 02, 2023 at 02:56:43PM +0100, David Hildenbrand wrote:
> On 02.03.23 12:57, Feiran Zheng wrote:
> > 
> > 
> > > On 2 Mar 2023, at 11:44, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > 
> > > On Thu, Mar 02, 2023 at 12:31:46PM +0100, David Hildenbrand wrote:
> > > > On 02.03.23 12:09, Fam Zheng wrote:
> > > > > This adds a memset to clear the backing memory. This is useful in the
> > > > > case of PMEM DAX to drop dirty data, if the backing memory is handed
> > > > > over from a previous application or firmware which didn't clean up
> > > > > before exiting.
> > > > > 
> > > > 
> > > > Why can't the VM manager do that instead? If you have a file that's
> > > > certainly easily possible.
> > > 
> > > This feels conceptually similar to the case where you expose a host
> > > block device to the guest. If that block device was previously given
> > > to a different guest it might still have data in it. Someone needs
> > > to take responsibility for scrubbing that data. Since that may take
> > > a non-trivial amount of time, it is typical to do that scrubbing in
> > > the background after the old VM is gone rather than put it into the
> > > startup path for a new VM which would delay boot.
> > > 
> > > PMEM is blurring the boundary between memory and disk, but the tradeoff
> > > is not so different. We know that in general merely faulting in guest
> > > memory is quite time consuming and delays VM startup significantly as
> > > RAM size increases. Doing the full memset can only be slower still.
> > > 
> > > For prealloc we've created complex code to fault in memory across many
> > > threads and even that's too slow, so we're considering doing it in the
> > > background as the VM starts up.
> > > 
> > > IIUC, this patch just puts the memset in the critical serialized path.
> > > This will inevitably lead to a demand for improving performance by
> > > parallelizing across threads, but we know that's too slow already,
> > > and we can't play the background async game with memset as that's
> > > actually changing guest-visible contents.
> > > 
> > > IOW, for large PMEM sizes, it does look compelling to do the clearing
> > > of old data in the background outside the context of QEMU VM startup to
> > > avoid delayed startup.
> > > 
> > > I can still understand the appeal of a simple flag to set on QEMU from
> > > a usability POV, but not sure it's a good idea to encourage this usage
> > > by mgmt apps.
> > 
> > I can totally see the reasoning about the latency here, but I’m a little dubious if multi-threading for memset can actually help reduce the start-up time; the total cost is going to be bound by memory bandwidth between the CPU and memory (even more so if it’s PMEM) which is limited.
> 
> Right, daxio is the magic bit:
> 
> daxio.x86_64 : Perform I/O on Device DAX devices or zero a Device DAX device
> 
> # daxio -z -o /dev/dax0.0
> daxio: copied 8587837440 bytes to device "/dev/dax0.0"

I think Dan's concerns are valid, but I noticed daxio also just calls
pmem_memset_persist(), so it's doing pretty much the same
single-threaded thing as the patch:
https://github.com/pmem/pmdk/blob/master/src/tools/daxio/daxio.c#L506
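For reference, the core of it boils down to roughly this (a sketch assuming libpmem, not daxio's actual code):

#include <libpmem.h>
#include <stdio.h>

int main(void)
{
    size_t mapped_len;
    int is_pmem;
    /* len=0 and flags=0 map the whole existing device-DAX node */
    void *addr = pmem_map_file("/dev/dax0.0", 0, 0, 0, &mapped_len, &is_pmem);

    if (addr == NULL) {
        perror("pmem_map_file");
        return 1;
    }
    pmem_memset_persist(addr, 0, mapped_len);   /* memset plus flush to persistence */
    pmem_unmap(addr, mapped_len);
    return 0;
}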

Stefan

Patch

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 25141283c4..f7468d24ce 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -30,6 +30,7 @@  struct HostMemoryBackendFile {
     bool discard_data;
     bool is_pmem;
     bool readonly;
+    bool clear;
 };
 
 static void
@@ -56,6 +57,7 @@  file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     ram_flags = backend->share ? RAM_SHARED : 0;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
     ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
+    ram_flags |= fb->clear ? RAM_CLEAR : 0;
     memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
                                      backend->size, fb->align, ram_flags,
                                      fb->mem_path, fb->readonly, errp);
@@ -168,6 +170,21 @@  static void file_memory_backend_set_readonly(Object *obj, bool value,
     fb->readonly = value;
 }
 
+static bool file_memory_backend_get_clear(Object *obj, Error **errp)
+{
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
+
+    return fb->clear;
+}
+
+static void file_memory_backend_set_clear(Object *obj, bool value,
+                                             Error **errp)
+{
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
+
+    fb->clear = value;
+}
+
 static void file_backend_unparent(Object *obj)
 {
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
@@ -204,6 +221,9 @@  file_backend_class_init(ObjectClass *oc, void *data)
     object_class_property_add_bool(oc, "readonly",
         file_memory_backend_get_readonly,
         file_memory_backend_set_readonly);
+    object_class_property_add_bool(oc, "clear",
+        file_memory_backend_get_clear,
+        file_memory_backend_set_clear);
 }
 
 static void file_backend_instance_finalize(Object *o)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2e602a2fad..3345db5241 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -232,6 +232,9 @@  typedef struct IOMMUTLBEvent {
 /* RAM that isn't accessible through normal means. */
 #define RAM_PROTECTED (1 << 8)
 
+/* Clear these pages when mapping */
+#define RAM_CLEAR (1 << 9)
+
 static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
                                        IOMMUNotifierFlag flags,
                                        hwaddr start, hwaddr end,
diff --git a/qapi/qom.json b/qapi/qom.json
index 30e76653ad..2c4aa5b0d5 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -629,6 +629,8 @@ 
 #         specify the required alignment via this option.
 #         0 selects a default alignment (currently the page size). (default: 0)
 #
+# @clear: if true, the memory is memset to 0 during init. (default: false)
+#
 # @discard-data: if true, the file contents can be destroyed when QEMU exits,
 #                to avoid unnecessarily flushing data to the backing file. Note
 #                that ``discard-data`` is only an optimization, and QEMU might
@@ -649,6 +651,7 @@ 
 { 'struct': 'MemoryBackendFileProperties',
   'base': 'MemoryBackendProperties',
   'data': { '*align': 'size',
+            '*clear': 'bool',
             '*discard-data': 'bool',
             'mem-path': 'str',
             '*pmem': { 'type': 'bool', 'if': 'CONFIG_LIBPMEM' },
diff --git a/qemu-options.hx b/qemu-options.hx
index beeb4475ba..6c8345c62e 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4859,7 +4859,7 @@  SRST
     they are specified. Note that the 'id' property must be set. These
     objects are placed in the '/objects' path.
 
-    ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,readonly=on|off``
+    ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,clear=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,readonly=on|off``
         Creates a memory file backend object, which can be used to back
         the guest RAM with huge pages.
 
@@ -4886,6 +4886,8 @@  SRST
         Documentation/vm/numa\_memory\_policy.txt on the Linux kernel
         source tree for additional details.
 
+        Setting clear=on will make QEMU memset the backend to 0.
+
         Setting the ``discard-data`` boolean option to on indicates that
         file contents can be destroyed when QEMU exits, to avoid
         unnecessarily flushing data to the backing file. Note that
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index df54b917a9..573c686fd1 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1409,6 +1409,10 @@  static void *file_ram_alloc(RAMBlock *block,
         return NULL;
     }
 
+    if (block->flags & RAM_CLEAR) {
+        memset(area, 0, memory);
+    }
+
     block->fd = fd;
     return area;
 }
@@ -1868,7 +1872,7 @@  RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
 
     /* Just support these ram flags by now. */
     assert((ram_flags & ~(RAM_SHARED | RAM_PMEM | RAM_NORESERVE |
-                          RAM_PROTECTED)) == 0);
+                          RAM_PROTECTED | RAM_CLEAR)) == 0);
 
     if (xen_enabled()) {
         error_setg(errp, "-mem-path not supported with Xen");