[9/9] KVM: arm64: Run clear-dirty-log under MMU read lock

Message ID	20230421165305.804301-10-vipinsh@google.com (mailing list archive)
State	Handled Elsewhere
Headers
Return-Path: <linux-mips-owner@vger.kernel.org>
Date: Fri, 21 Apr 2023 09:53:05 -0700
In-Reply-To: <20230421165305.804301-1-vipinsh@google.com>
Mime-Version: 1.0
References: <20230421165305.804301-1-vipinsh@google.com>
Message-ID: <20230421165305.804301-10-vipinsh@google.com>
Subject: [PATCH 9/9] KVM: arm64: Run clear-dirty-log under MMU read lock
From: Vipin Sharma <vipinsh@google.com>
To: maz@kernel.org, oliver.upton@linux.dev, james.morse@arm.com,
        suzuki.poulose@arm.com, yuzenghui@huawei.com,
        catalin.marinas@arm.com, will@kernel.org, chenhuacai@kernel.org,
        aleksandar.qemu.devel@gmail.com, tsbogend@alpha.franken.de,
        anup@brainfault.org, atishp@atishpatra.org,
        paul.walmsley@sifive.com, palmer@dabbelt.com,
        aou@eecs.berkeley.edu, seanjc@google.com, pbonzini@redhat.com,
        dmatlack@google.com, ricarkol@google.com
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
        linux-mips@vger.kernel.org, kvm-riscv@lists.infradead.org,
        linux-riscv@lists.infradead.org, linux-kselftest@vger.kernel.org,
        kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
        Vipin Sharma <vipinsh@google.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk

Series	KVM: arm64: Use MMU read lock for clearing dirty logs
[0/9] KVM: arm64: Use MMU read lock for clearing dirty logs
[1/9] KVM: selftests: Allow dirty_log_perf_test to clear dirty memory in chunks
[2/9] KVM: selftests: Add optional delay between consecutive Clear-Dirty-Log calls
[3/9] KVM: selftests: Pass count of read and write accesses from guest to host
[4/9] KVM: selftests: Print read and write accesses of pages by vCPUs in dirty_log_perf_test
[5/9] KVM: selftests: Allow independent execution of vCPUs in dirty_log_perf_test
[6/9] KVM: arm64: Correct the kvm_pgtable_stage2_flush() documentation
[7/9] KVM: mmu: Move mmu lock/unlock to arch code for clear dirty log
[8/9] KMV: arm64: Allow stage2_apply_range_sched() to pass page table walker flags
[9/9] KVM: arm64: Run clear-dirty-log under MMU read lock

Commit Message

Vipin Sharma April 21, 2023, 4:53 p.m. UTC

Take MMU read lock for write protecting PTEs and use shared page table
walker for clearing dirty logs.

Clearing dirty logs is currently performed under the MMU write lock. This
means that vCPU write-protection faults, which take the MMU read lock, are
blocked for the duration of the operation. This degrades guest performance,
which is especially noticeable on VMs with a large number of vCPUs.

Taking the MMU read lock instead allows vCPUs to keep executing in parallel
and reduces the impact on vCPU performance.

Tested the improvement on an ARM Ampere Altra host (64 CPUs, 256 GB
memory, single NUMA node) via dirty_log_perf_test with the following
configuration: 48 vCPUs, 96 GB memory, 8 GB clear chunk size, and a
1 second wait between Clear-Dirty-Log calls.

Test command:
./dirty_log_perf_test -s anonymous_hugetlb_2mb -b 2G -v 48 -l 1 -k 8G -j -m 2

Before:
Total pages touched: 50331648 (Reads: 0, Writes: 50331648)

After:
Total pages touched: 125304832 (Reads: 0, Writes: 125304832)

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/arm64/kvm/mmu.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

Comments

Marc Zyngier April 21, 2023, 5:10 p.m. UTC | #1

On Fri, 21 Apr 2023 17:53:05 +0100,
Vipin Sharma <vipinsh@google.com> wrote:
> 
> Take MMU read lock for write protecting PTEs and use shared page table
> walker for clearing dirty logs.
> 
> Clearing dirty logs is currently performed under the MMU write lock. This
> means that vCPU write-protection faults, which take the MMU read lock, are
> blocked for the duration of the operation. This degrades guest performance,
> which is especially noticeable on VMs with a large number of vCPUs.
> 
> Taking the MMU read lock instead allows vCPUs to keep executing in parallel
> and reduces the impact on vCPU performance.

Sure. Taking no lock whatsoever would be even better.

What I don't see is the detailed explanation that gives me the warm
feeling that this is safe and correct. Such an explanation is the
minimum condition for me to even read the patch.

Thanks,

	M.

Vipin Sharma May 6, 2023, 12:55 a.m. UTC | #2

On Fri, Apr 21, 2023 at 10:11 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 21 Apr 2023 17:53:05 +0100,
> Vipin Sharma <vipinsh@google.com> wrote:
> >
> > Take MMU read lock for write protecting PTEs and use shared page table
> > walker for clearing dirty logs.
> >
> > Clearing dirty logs is currently performed under the MMU write lock. This
> > means that vCPU write-protection faults, which take the MMU read lock, are
> > blocked for the duration of the operation. This degrades guest performance,
> > which is especially noticeable on VMs with a large number of vCPUs.
> >
> > Taking the MMU read lock instead allows vCPUs to keep executing in parallel
> > and reduces the impact on vCPU performance.
>
> Sure. Taking no lock whatsoever would be even better.
>
> What I don't see is the detailed explanation that gives me the warm
> feeling that this is safe and correct. Such an explanation is the
> minimum condition for me to even read the patch.
>

Thanks for freaking me out. Your hunch behind the missing warm feeling
was right: the stage2_attr_walker() and stage2_update_leaf_attrs() combo
does not retry if the cmpxchg fails during write protection. The
write-protection callers don't check the return status of the API and
simply ignore the cmpxchg failure. This means a vCPU (an MMU read lock
user) can cause the cmpxchg to fail for a write-protection operation
(running under the read lock, as this patch makes it do), and the clear
ioctl will happily return as if everything is good.
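
To make the failure mode above concrete, here is a minimal userspace sketch
using C11 atomics. It is not the KVM page-table code: the PTE bit layout and
the helpers wp_once(), wp_retry() and demo_lost_write_protect() are
hypothetical, and the concurrent vCPU update is simulated in the same thread
so the result is deterministic. It only shows why ignoring a failed
compare-and-exchange leaves the entry writable, and how a retry loop would
behave differently. It makes no claim about how the series ends up fixing this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PTE bits for illustration; not the arm64 stage-2 layout. */
#define PTE_WRITE  (UINT64_C(1) << 0)
#define PTE_ACCESS (UINT64_C(1) << 1)

/*
 * One-shot write protection: clear PTE_WRITE based on an earlier snapshot.
 * Returns false, leaving the PTE untouched, if the PTE changed meanwhile.
 */
bool wp_once(_Atomic uint64_t *ptep, uint64_t snapshot)
{
	return atomic_compare_exchange_strong(ptep, &snapshot,
					      snapshot & ~PTE_WRITE);
}

/* Retrying variant: reload the PTE and try again until the update lands. */
void wp_retry(_Atomic uint64_t *ptep)
{
	uint64_t old = atomic_load(ptep);

	/* On failure, compare-exchange stores the current value in 'old'. */
	while (!atomic_compare_exchange_weak(ptep, &old, old & ~PTE_WRITE))
		;
}

void demo_lost_write_protect(void)
{
	_Atomic uint64_t pte = PTE_WRITE;
	uint64_t snapshot = atomic_load(&pte);

	/* Simulated concurrent vCPU fault: the PTE changes after the snapshot. */
	atomic_fetch_or(&pte, PTE_ACCESS);

	/* The one-shot attempt fails; ignoring that leaves the page writable. */
	assert(!wp_once(&pte, snapshot));
	assert(atomic_load(&pte) & PTE_WRITE);

	/* Retrying makes the protection stick and keeps the concurrent update. */
	wp_retry(&pte);
	assert(atomic_load(&pte) == PTE_ACCESS);
}
```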

I will update the series and also work on validating the correctness
to instill more confidence.

Thanks

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index e0189cdda43d..3f2117d93998 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -67,8 +67,12 @@  static int stage2_apply_range(struct kvm_s2_mmu *mmu, phys_addr_t addr,
 		if (ret)
 			break;
 
-		if (resched && next != end)
-			cond_resched_rwlock_write(&kvm->mmu_lock);
+		if (resched && next != end) {
+			if (flags & KVM_PGTABLE_WALK_SHARED)
+				cond_resched_rwlock_read(&kvm->mmu_lock);
+			else
+				cond_resched_rwlock_write(&kvm->mmu_lock);
+		}
 	} while (addr = next, addr != end);
 
 	return ret;
@@ -994,7 +998,7 @@  static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
 	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
 	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
 
-	stage2_wp_range(&kvm->arch.mmu, start, end, 0);
+	stage2_wp_range(&kvm->arch.mmu, start, end, KVM_PGTABLE_WALK_SHARED);
 }
 
 /*
@@ -1008,9 +1012,9 @@  void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 		struct kvm_memory_slot *slot,
 		gfn_t gfn_offset, unsigned long mask)
 {
-	write_lock(&kvm->mmu_lock);
+	read_lock(&kvm->mmu_lock);
 	kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
-	write_unlock(&kvm->mmu_lock);
+	read_unlock(&kvm->mmu_lock);
 }
 
 static void kvm_send_hwpoison_signal(unsigned long address, short lsb)
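
For readers unfamiliar with the start/end computation in
kvm_mmu_write_protect_pt_masked() shown in the hunk context above, the
following userspace sketch mirrors that arithmetic. It is an illustration
only: PAGE_SHIFT is assumed to be 12 (4 KiB pages), first_set_bit() and
last_set_bit() are hypothetical stand-ins for the kernel's __ffs() and
__fls() built on GCC/Clang builtins, and struct ipa_range and
wp_range_for_mask() are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Stand-ins for __ffs()/__fls(); the mask must be non-zero. */
static unsigned int first_set_bit(unsigned long mask)
{
	return (unsigned int)__builtin_ctzl(mask);
}

static unsigned int last_set_bit(unsigned long mask)
{
	return (unsigned int)(8 * sizeof(mask) - 1) -
	       (unsigned int)__builtin_clzl(mask);
}

struct ipa_range {
	uint64_t start; /* inclusive */
	uint64_t end;   /* exclusive */
};

/* Address range spanning the lowest to the highest set bit in the mask. */
struct ipa_range wp_range_for_mask(uint64_t base_gfn, unsigned long mask)
{
	struct ipa_range r = {
		.start = (base_gfn + first_set_bit(mask)) << PAGE_SHIFT,
		.end   = (base_gfn + last_set_bit(mask) + 1) << PAGE_SHIFT,
	};
	return r;
}

void demo_mask_range(void)
{
	/* Bits 1, 2 and 4 set: pages base+1, base+2 and base+4 are dirty. */
	struct ipa_range r = wp_range_for_mask(0x100, 0x16UL);

	assert(r.start == UINT64_C(0x101000));
	assert(r.end == UINT64_C(0x105000));
}
```

Note that the range is contiguous from the first to the last set bit, so in
this example the page at base+3 is covered even though its bit is clear.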
