[2/2] kvm: clear dirty bitmaps from all overlapping memslots

Message ID 1568974882-7419-3-git-send-email-pbonzini@redhat.com (mailing list archive)
State New, archived
Series kvm: clear dirty bitmaps from all overlapping memslots

Commit Message

Paolo Bonzini Sept. 20, 2019, 10:21 a.m. UTC
Since the KVM dirty page reporting works on guest physical addresses,
we need to clear all of the aliases when a page is migrated, or there
is a risk of losing writes to the aliases that were not cleared.

Note that this is only an issue for manual clearing of the bitmap;
if the bitmap is cleared at the same time as it's retrieved, all
the aliases get cleared correctly.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: ff4aa11419242c835b03d274f08f797c129ed7ba
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 accel/kvm/kvm-all.c | 36 ++++++++++++++++++++++--------------
 1 file changed, 22 insertions(+), 14 deletions(-)
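
For context, the "manual clearing" mentioned in the commit message is the
KVM_CLEAR_DIRTY_LOG ioctl, which operates on one memslot at a time (struct
kvm_clear_dirty_log carries a slot id), so userspace has to issue a separate
call for every memslot it wants re-protected.  The helper below is only a
minimal sketch of such a call, not QEMU code: the function name, vm_fd and
the slot/bitmap setup are hypothetical, and it assumes the manual dirty log
protect capability has already been enabled on the VM.

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Clear (re-protect) a range of pages in one memslot's dirty bitmap via
 * KVM_CLEAR_DIRTY_LOG.  Per the KVM API documentation, first_page must be
 * 64-aligned and num_pages a multiple of 64 unless the range ends at the
 * end of the memslot -- which is what the "64 host pages" alignment in
 * kvm_log_clear_one_slot() in the patch below is about.
 */
static int clear_slot_dirty(int vm_fd, uint32_t slot,
                            uint64_t first_page, uint32_t num_pages,
                            void *bitmap)
{
    struct kvm_clear_dirty_log log;

    memset(&log, 0, sizeof(log));
    log.slot = slot;              /* memslot id (QEMU also folds the
                                   * address-space id into the high bits) */
    log.first_page = first_page;  /* page offset within the memslot */
    log.num_pages = num_pages;
    log.dirty_bitmap = bitmap;    /* set bits are cleared/re-protected */

    return ioctl(vm_fd, KVM_CLEAR_DIRTY_LOG, &log);
}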

Comments

Peter Xu Sept. 20, 2019, 12:18 p.m. UTC | #1
On Fri, Sep 20, 2019 at 12:21:22PM +0200, Paolo Bonzini wrote:
> Since the KVM dirty page reporting works on guest physical addresses,
> we need to clear all of the aliases when a page is migrated, or there
> is a risk of losing writes to the aliases that were not cleared.

The patch content looks perfect to me, though I just want to make sure
I understand the issue behind it, and the commit message...

IMHO we've got two issues to cover for log_clear():

  (1) memory region aliasing, hence multiple GPAs can point to the same
      HVA/HPA so we need to clear the memslot dirty bits on all the
      mapped GPAs, and,

  (2) a large log_clear() request which can cover more than one valid
      kvm memslot.  Note that in this case the memslots can really
      have different HVAs, so IMHO it is a separate issue from (1)

The commit message says it's solving problem (1).  However, from what I
understand, we are actually doing well on issue (1), because in
memory_region_clear_dirty_bitmap() we iterate over all the flat views,
so we should have caught all the aliasing memory regions if there
are any.  However, this patch should perfectly fix problem (2).  Am I
right?
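
As an illustration of issue (2): a standalone sketch (made-up slot layout
and numbers, not QEMU code) of one log_clear() request that spans two
memslots backed by unrelated HVAs.  The clipping arithmetic mirrors the
offset/count computation the patch introduces.

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

typedef struct {
    uint64_t start_addr;    /* guest-physical start of the slot */
    uint64_t memory_size;   /* slot size in bytes */
} Slot;

int main(void)
{
    /* Two adjacent slots, each backed by a different HVA. */
    Slot slots[] = {
        { .start_addr = 0,          .memory_size = 2ULL << 20 },  /* [0M, 2M) */
        { .start_addr = 2ULL << 20, .memory_size = 4ULL << 20 },  /* [2M, 6M) */
    };
    /* One clear request covering [1M, 5M): it crosses both slots. */
    uint64_t start = 1ULL << 20, size = 4ULL << 20;

    for (int i = 0; i < 2; i++) {
        Slot *mem = &slots[i];
        uint64_t offset, count;

        if (!mem->memory_size ||
            mem->start_addr > start + size - 1 ||
            start > mem->start_addr + mem->memory_size - 1) {
            continue;   /* slot does not overlap the request */
        }
        if (start >= mem->start_addr) {
            offset = start - mem->start_addr;
            count = MIN(mem->memory_size - offset, size);
        } else {
            offset = 0;
            count = MIN(mem->memory_size, size - (mem->start_addr - start));
        }
        /* Prints: slot 0 -> offset 1M, count 1M; slot 1 -> offset 0, count 3M */
        printf("slot %d: offset 0x%" PRIx64 ", count 0x%" PRIx64 "\n",
               i, offset, count);
    }
    return 0;
}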

Thanks,
Paolo Bonzini Sept. 20, 2019, 2:03 p.m. UTC | #2
On 20/09/19 14:18, Peter Xu wrote:
>   (1) memory region aliasing, hence multiple GPAs can point to the same
>       HVA/HPA so we need to clear the memslot dirty bits on all the
>       mapped GPAs, and,
> 
>   (2) a large log_clear() request which can cover more than one valid
>       kvm memslot.  Note that in this case the memslots can really
>       have different HVAs, so IMHO it is a separate issue from (1)
> 
> The commit message says it's solving problem (1).  However, from what I
> understand, we are actually doing well on issue (1), because in
> memory_region_clear_dirty_bitmap() we iterate over all the flat views,
> so we should have caught all the aliasing memory regions if there
> are any.

There could be two addresses pointing to the same HVA *in the same
flatview*.  See for example 0xe0000..0xfffff and 0xffffe000..0xffffffff
when a PC guest is started.  In this particular case
0xffffe000..0xffffffff is ROM, so it's not an issue, but in other cases
it may be.
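
To make the same-flatview aliasing concrete: layouts like this come from
memory_region_init_alias(), which makes the same RAM block visible at two
guest-physical addresses within one flat view, so the same host pages can be
dirtied through either address.  The fragment below is only a sketch of that
pattern (made-up names, sizes and addresses, not the actual PC firmware
code); it assumes a QEMU build tree and the usual get_system_memory() root.

#include "qemu/osdep.h"
#include "exec/memory.h"
#include "exec/address-spaces.h"

/* Map the same 128 KiB RAM region at two guest-physical addresses. */
static void map_rom_with_alias(MemoryRegion *rom)   /* 0x20000 bytes, initialized */
{
    MemoryRegion *system = get_system_memory();
    MemoryRegion *alias = g_new(MemoryRegion, 1);

    /* Primary mapping just below 4 GiB. */
    memory_region_add_subregion(system, 0xfffe0000, rom);

    /* A second view of the same bytes in the legacy 0xe0000..0xfffff window. */
    memory_region_init_alias(alias, NULL, "rom-low-alias", rom, 0, 0x20000);
    memory_region_add_subregion(system, 0xe0000, alias);
}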

> However, this patch should perfectly fix problem (2).  Am I right?

I hadn't thought of problem (2).  I guess without Igor's work for s390
it does not exist?  But yes, it fixes it just the same.

Paolo

Patch

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index e9e6086..315a915 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -588,8 +588,8 @@  static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, uint64_t start, uint64_t size)
      * satisfy the KVM interface requirement.  Firstly, do the start
      * page alignment on 64 host pages
      */
-    bmap_start = (start - mem->start_addr) & KVM_CLEAR_LOG_MASK;
-    start_delta = start - mem->start_addr - bmap_start;
+    bmap_start = start & KVM_CLEAR_LOG_MASK;
+    start_delta = start - bmap_start;
     bmap_start /= psize;
 
     /*
@@ -693,8 +693,8 @@  static int kvm_physical_log_clear(KVMMemoryListener *kml,
                                   MemoryRegionSection *section)
 {
     KVMState *s = kvm_state;
-    uint64_t start, size;
-    KVMSlot *mem = NULL;
+    uint64_t start, size, offset, count;
+    KVMSlot *mem;
     int ret, i;
 
     if (!s->manual_dirty_log_protect) {
@@ -712,22 +712,30 @@  static int kvm_physical_log_clear(KVMMemoryListener *kml,
 
     kvm_slots_lock(kml);
 
-    /* Find any possible slot that covers the section */
     for (i = 0; i < s->nr_slots; i++) {
         mem = &kml->slots[i];
-        if (mem->start_addr <= start &&
-            start + size <= mem->start_addr + mem->memory_size) {
+        /* Discard slots that are empty or do not overlap the section */
+        if (!mem->memory_size ||
+            mem->start_addr > start + size - 1 ||
+            start > mem->start_addr + mem->memory_size - 1) {
+            continue;
+        }
+
+        if (start >= mem->start_addr) {
+            /* The slot starts before section or is aligned to it.  */
+            offset = start - mem->start_addr;
+            count = MIN(mem->memory_size - offset, size);
+        } else {
+            /* The slot starts after section.  */
+            offset = 0;
+            count = MIN(mem->memory_size, size - (mem->start_addr - start));
+        }
+        ret = kvm_log_clear_one_slot(mem, kml->as_id, offset, count);
+        if (ret < 0) {
             break;
         }
     }
 
-    /*
-     * We should always find one memslot until this point, otherwise
-     * there could be something wrong from the upper layer
-     */
-    assert(mem && i != s->nr_slots);
-    ret = kvm_log_clear_one_slot(mem, kml->as_id, start, size);
-
     kvm_slots_unlock(kml);
 
     return ret;
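
For reference, the first hunk only changes kvm_log_clear_one_slot() to take
a slot-relative start now that the caller does the clipping; the
64-host-page alignment it performs is unchanged.  The standalone sketch
below reproduces just that alignment step with made-up values (the page size
and mask definitions are assumptions modelled on the "64 host pages" comment,
not copied from kvm-all.c).

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>
#include <assert.h>

#define PSIZE            4096ULL                   /* assumed host page size */
#define CLEAR_LOG_ALIGN  (PSIZE * 64)              /* 64 host pages = 256 KiB */
#define CLEAR_LOG_MASK   (~(CLEAR_LOG_ALIGN - 1))  /* align down to 64 pages */

int main(void)
{
    /* Slot-relative byte offset of the range to clear, as now passed in by
     * kvm_physical_log_clear(); deliberately not 64-page aligned. */
    uint64_t start = 0x53000;

    uint64_t bmap_start  = start & CLEAR_LOG_MASK;  /* 0x40000 */
    uint64_t start_delta = start - bmap_start;      /* 0x13000 */

    /* start is aligned down to a 64-page boundary; start_delta records how
     * far into that aligned range the requested range really begins. */
    assert(bmap_start % CLEAR_LOG_ALIGN == 0);
    printf("bmap_start=0x%" PRIx64 " (page %" PRIu64 "), start_delta=0x%" PRIx64 "\n",
           bmap_start, bmap_start / PSIZE, start_delta);
    return 0;
}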