KVM: Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG

Message ID 20231205181645.482037-1-dmatlack@google.com
State New, archived

Commit Message

David Matlack Dec. 5, 2023, 6:16 p.m. UTC
Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG to avoid
blocking other threads (e.g. vCPUs taking page faults) for too long.

Specifically, change kvm_clear_dirty_log_protect() to acquire/release
mmu_lock only when calling kvm_arch_mmu_enable_log_dirty_pt_masked(),
rather than around the entire for loop. This ensures that KVM only
holds mmu_lock for the time it takes the architecture-specific code to
process up to 64 pages, rather than across the entire log->num_pages
range, whose size is controlled by userspace. This also avoids holding
mmu_lock
when processing parts of the dirty_bitmap that are zero (i.e. when there
is nothing to clear).
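
A sketch of the loop body with this patch applied, abridged from the
diff at the bottom of this mail (the full comment block and
declarations are elided):

	unsigned long mask = *dirty_bitmap_buffer++;
	atomic_long_t *p = (atomic_long_t *)&dirty_bitmap[i];

	if (!mask)
		continue;

	/* Atomically clear the bits in the live dirty_bitmap. */
	mask &= atomic_long_fetch_andnot(mask, p);

	if (mask) {
		flush = true;
		/* mmu_lock now covers at most 64 pages per iteration. */
		KVM_MMU_LOCK(kvm);
		kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot,
							offset, mask);
		KVM_MMU_UNLOCK(kvm);
	}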

Moving the acquire/release points for mmu_lock should be safe since
dirty_bitmap_buffer is already protected by slots_lock, and dirty_bitmap
is already accessed with atomic_long_fetch_andnot(). And at least on x86
holding mmu_lock doesn't even serialize access to the memslot dirty
bitmap, as vCPUs can call mark_page_dirty_in_slot() without holding
mmu_lock.
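
For example, the vCPU-side write is an atomic bitop done with no
mmu_lock held (abridged sketch of the bitmap case in
mark_page_dirty_in_slot(), ignoring the dirty ring path):

	unsigned long rel_gfn = gfn - memslot->base_gfn;

	/*
	 * set_bit_le() is atomic, so it serializes correctly against
	 * the atomic_long_fetch_andnot() in CLEAR_DIRTY_LOG.
	 */
	if (memslot->dirty_bitmap)
		set_bit_le(rel_gfn, memslot->dirty_bitmap);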

This change eliminates dips in guest performance during live migration
in a 160 vCPU VM when userspace is issuing CLEAR ioctls (tested with
1GiB and 8GiB CLEARs). Userspace could issue finer-grained CLEARs, which
would also reduce contention on mmu_lock, but doing so would increase
the rate of remote TLB flushing. And there's really no reason to punt this
problem to userspace since KVM can just drop and reacquire mmu_lock more
frequently.
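
For context, a minimal sketch of an 8GiB CLEAR from userspace, assuming
4KiB pages (slot_id, bitmap, and vm_fd are placeholders for the
caller's memslot ID, bitmap buffer, and VM file descriptor):

	#include <linux/kvm.h>
	#include <sys/ioctl.h>

	struct kvm_clear_dirty_log clear = {
		.slot = slot_id,
		.first_page = 0,			/* must be a multiple of 64 */
		.num_pages = (8ULL << 30) / 4096,	/* 8GiB of 4KiB pages */
		.dirty_bitmap = bitmap,			/* 1 bit per page to clear */
	};

	ioctl(vm_fd, KVM_CLEAR_DIRTY_LOG, &clear);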

Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Tianrui Zhao <zhaotianrui@loongson.cn>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Anup Patel <anup@brainfault.org>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Sean Christopherson <seanjc@google.com>

Signed-off-by: David Matlack <dmatlack@google.com>
---
NOTE: This patch was originally sent as part of another series [1].

[1] https://lore.kernel.org/kvm/170137684236.660121.11958959609300046312.b4-ty@google.com/

 virt/kvm/kvm_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


base-commit: 45b890f7689eb0aba454fc5831d2d79763781677

Comments

David Matlack Jan. 11, 2024, 4:55 p.m. UTC | #1
On Tue, Dec 5, 2023 at 10:16 AM David Matlack <dmatlack@google.com> wrote:
>
> Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG to avoid
> blocking other threads (e.g. vCPUs taking page faults) for too long.

Ping.

KVM architecture maintainers: Do you have any concerns about the
correctness of this patch? I'm confident dropping the lock is correct
on x86 and it should be on other architectures as well, but
confirmation would be helpful.

Thanks.
David Matlack April 2, 2024, 4:42 p.m. UTC | #2
On Thu, Jan 11, 2024 at 8:55 AM David Matlack <dmatlack@google.com> wrote:
>
> On Tue, Dec 5, 2023 at 10:16 AM David Matlack <dmatlack@google.com> wrote:
> >
> > Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG to avoid
> > blocking other threads (e.g. vCPUs taking page faults) for too long.
>
> Ping.
>
> KVM architecture maintainers: Do you have any concerns about the
> correctness of this patch? I'm confident dropping the lock is correct
> on x86 and it should be on other architectures as well, but
> confirmation would be helpful.
>
> Thanks.

Ping again. This patch has been sitting since December 5th. Is there
anything I can do to help get it merged?
Marc Zyngier April 2, 2024, 4:59 p.m. UTC | #3
On 2024-04-02 17:42, David Matlack wrote:
> On Thu, Jan 11, 2024 at 8:55 AM David Matlack <dmatlack@google.com> 
> wrote:
>> 
>> On Tue, Dec 5, 2023 at 10:16 AM David Matlack <dmatlack@google.com> 
>> wrote:
>> >
>> > Aggressively drop and reacquire mmu_lock during CLEAR_DIRTY_LOG to avoid
>> > blocking other threads (e.g. vCPUs taking page faults) for too long.
>> 
>> Ping.
>> 
>> KVM architecture maintainers: Do you have any concerns about the
>> correctness of this patch? I'm confident dropping the lock is correct
>> on x86 and it should be on other architectures as well, but
>> confirmation would be helpful.
>> 
>> Thanks.
> 
> Ping again. This patch has been sitting since December 5th. Is there
> anything I can do to help get it merged?

Please send a rebased version as a V2. With the number of MMU patches
flying around without much coordination, it is hard to keep track.

Thanks,

         M.

Patch

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 486800a7024b..afa61a2309d2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2297,7 +2297,6 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
 	if (copy_from_user(dirty_bitmap_buffer, log->dirty_bitmap, n))
 		return -EFAULT;
 
-	KVM_MMU_LOCK(kvm);
 	for (offset = log->first_page, i = offset / BITS_PER_LONG,
 		 n = DIV_ROUND_UP(log->num_pages, BITS_PER_LONG); n--;
 	     i++, offset += BITS_PER_LONG) {
@@ -2316,11 +2315,12 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm,
 		*/
 		if (mask) {
 			flush = true;
+			KVM_MMU_LOCK(kvm);
 			kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot,
 								offset, mask);
+			KVM_MMU_UNLOCK(kvm);
 		}
 	}
-	KVM_MMU_UNLOCK(kvm);
 
 	if (flush)
 		kvm_flush_remote_tlbs_memslot(kvm, memslot);