Message ID | 20230206165851.3106338-9-ricarkol@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | Implement Eager Page Splitting for ARM. |
Hi Ricardo,

On 2/7/23 3:58 AM, Ricardo Koller wrote:
> Add a capability for userspace to specify the eager split chunk size.
> The chunk size specifies how many pages to break at a time, using a
> single allocation. Bigger the chunk size, more pages need to be
> allocated ahead of time.
>
> Suggested-by: Oliver Upton <oliver.upton@linux.dev>
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> ---
>  Documentation/virt/kvm/api.rst    | 26 ++++++++++++++++++++++++++
>  arch/arm64/include/asm/kvm_host.h |  2 ++
>  arch/arm64/kvm/arm.c              | 22 ++++++++++++++++++++++
>  arch/arm64/kvm/mmu.c              |  3 +++
>  include/uapi/linux/kvm.h          |  1 +
>  5 files changed, 54 insertions(+)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 9807b05a1b57..a9332e331cce 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -8284,6 +8284,32 @@ structure.
>  When getting the Modified Change Topology Report value, the attr->addr
>  must point to a byte where the value will be stored or retrieved from.
>
> +8.40 KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE
> +---------------------------------------
> +
> +:Capability: KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE
> +:Architectures: arm64
> +:Type: vm
> +:Parameters: arg[0] is the new chunk size.
> +:Returns: 0 on success, -EINVAL if any memslot has been created.
> +
> +This capability sets the chunk size used in Eager Page Splitting.
> +
> +Eager Page Splitting improves the performance of dirty-logging (used
> +in live migrations) when guest memory is backed by huge-pages. This
> +optimization is enabled by default on arm64. It avoids splitting
> +huge-pages (into PAGE_SIZE pages) on fault, by doing it eagerly when
> +enabling dirty logging (with the KVM_MEM_LOG_DIRTY_PAGES flag for a
> +memory region), or when using KVM_CLEAR_DIRTY_LOG.
> +
> +The chunk size specifies how many pages to break at a time, using a
> +single allocation for each chunk. Bigger the chunk size, more pages
> +need to be allocated ahead of time. A good heuristic is to pick the
> +size of the huge-pages as the chunk size.
> +
> +If the chunk size (arg[0]) is zero, then no eager page splitting is
> +performed. The default value PMD size (e.g., 2M when PAGE_SIZE is 4K).
> +
>  9. Known KVM API problems
>  =========================
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 35a159d131b5..a69a815719cf 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -153,6 +153,8 @@ struct kvm_s2_mmu {
>  	/* The last vcpu id that ran on each physical CPU */
>  	int __percpu *last_vcpu_ran;
>
> +#define KVM_ARM_EAGER_SPLIT_CHUNK_SIZE_DEFAULT	PMD_SIZE
> +
>  	struct kvm_arch *arch;
>  };
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 9c5573bc4614..c80617ced599 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -101,6 +101,22 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  		r = 0;
>  		set_bit(KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED, &kvm->arch.flags);
>  		break;
> +	case KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE:
> +		mutex_lock(&kvm->lock);
> +		mutex_lock(&kvm->slots_lock);
> +		/*
> +		 * To keep things simple, allow changing the chunk
> +		 * size only if there are no memslots created.
> +		 */
> +		if (!kvm_are_all_memslots_empty(kvm)) {
> +			r = -EINVAL;
> +		} else {
> +			r = 0;
> +			kvm->arch.mmu.split_page_chunk_size = cap->args[0];
> +		}
> +		mutex_unlock(&kvm->slots_lock);
> +		mutex_unlock(&kvm->lock);
> +		break;
>  	default:
>  		r = -EINVAL;
>  		break;
> @@ -298,6 +314,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_ARM_PTRAUTH_GENERIC:
>  		r = system_has_full_ptr_auth();
>  		break;
> +	case KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE:
> +		if (kvm)
> +			r = kvm->arch.mmu.split_page_chunk_size;
> +		else
> +			r = KVM_ARM_EAGER_SPLIT_CHUNK_SIZE_DEFAULT;
> +		break;
>  	default:
>  		r = 0;
>  	}
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 812633a75e74..e2ada6588017 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -755,6 +755,9 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t
>  	for_each_possible_cpu(cpu)
>  		*per_cpu_ptr(mmu->last_vcpu_ran, cpu) = -1;
>
> +	mmu->split_page_cache.gfp_zero = __GFP_ZERO;
> +	mmu->split_page_chunk_size = KVM_ARM_EAGER_SPLIT_CHUNK_SIZE_DEFAULT;
> +
>  	mmu->pgt = pgt;
>  	mmu->pgd_phys = __pa(pgt->pgd);
>  	return 0;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 55155e262646..02e05f7918e2 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1175,6 +1175,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
>  #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
>  #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
> +#define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 226
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>

mmu->split_page_cache and mmu->split_page_chunk_size are defined in
PATCH[09/12]. I think you need to move the definitions to PATCH[08/12]
instead. Otherwise, git-bisect is broken.

Thanks,
Gavin
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 9807b05a1b57..a9332e331cce 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8284,6 +8284,32 @@ structure.
 When getting the Modified Change Topology Report value, the attr->addr
 must point to a byte where the value will be stored or retrieved from.
 
+8.40 KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE
+---------------------------------------
+
+:Capability: KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE
+:Architectures: arm64
+:Type: vm
+:Parameters: arg[0] is the new chunk size.
+:Returns: 0 on success, -EINVAL if any memslot has been created.
+
+This capability sets the chunk size used in Eager Page Splitting.
+
+Eager Page Splitting improves the performance of dirty-logging (used
+in live migrations) when guest memory is backed by huge-pages. This
+optimization is enabled by default on arm64. It avoids splitting
+huge-pages (into PAGE_SIZE pages) on fault, by doing it eagerly when
+enabling dirty logging (with the KVM_MEM_LOG_DIRTY_PAGES flag for a
+memory region), or when using KVM_CLEAR_DIRTY_LOG.
+
+The chunk size specifies how many pages to break at a time, using a
+single allocation for each chunk. Bigger the chunk size, more pages
+need to be allocated ahead of time. A good heuristic is to pick the
+size of the huge-pages as the chunk size.
+
+If the chunk size (arg[0]) is zero, then no eager page splitting is
+performed. The default value PMD size (e.g., 2M when PAGE_SIZE is 4K).
+
 9. Known KVM API problems
 =========================
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 35a159d131b5..a69a815719cf 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -153,6 +153,8 @@ struct kvm_s2_mmu {
 	/* The last vcpu id that ran on each physical CPU */
 	int __percpu *last_vcpu_ran;
 
+#define KVM_ARM_EAGER_SPLIT_CHUNK_SIZE_DEFAULT	PMD_SIZE
+
 	struct kvm_arch *arch;
 };
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 9c5573bc4614..c80617ced599 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -101,6 +101,22 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		r = 0;
 		set_bit(KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED, &kvm->arch.flags);
 		break;
+	case KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE:
+		mutex_lock(&kvm->lock);
+		mutex_lock(&kvm->slots_lock);
+		/*
+		 * To keep things simple, allow changing the chunk
+		 * size only if there are no memslots created.
+		 */
+		if (!kvm_are_all_memslots_empty(kvm)) {
+			r = -EINVAL;
+		} else {
+			r = 0;
+			kvm->arch.mmu.split_page_chunk_size = cap->args[0];
+		}
+		mutex_unlock(&kvm->slots_lock);
+		mutex_unlock(&kvm->lock);
+		break;
 	default:
 		r = -EINVAL;
 		break;
@@ -298,6 +314,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ARM_PTRAUTH_GENERIC:
 		r = system_has_full_ptr_auth();
 		break;
+	case KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE:
+		if (kvm)
+			r = kvm->arch.mmu.split_page_chunk_size;
+		else
+			r = KVM_ARM_EAGER_SPLIT_CHUNK_SIZE_DEFAULT;
+		break;
 	default:
 		r = 0;
 	}
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 812633a75e74..e2ada6588017 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -755,6 +755,9 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t
 	for_each_possible_cpu(cpu)
 		*per_cpu_ptr(mmu->last_vcpu_ran, cpu) = -1;
 
+	mmu->split_page_cache.gfp_zero = __GFP_ZERO;
+	mmu->split_page_chunk_size = KVM_ARM_EAGER_SPLIT_CHUNK_SIZE_DEFAULT;
+
 	mmu->pgt = pgt;
 	mmu->pgd_phys = __pa(pgt->pgd);
 	return 0;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 55155e262646..02e05f7918e2 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1175,6 +1175,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
 #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
 #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
+#define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 226
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
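
For readers who want to see how userspace might drive the capability described in the api.rst hunk, here is a minimal sketch. It assumes a VM file descriptor already obtained via KVM_CREATE_VM and no memslots created yet. The helper name `set_eager_split_chunk_size` is hypothetical, and the fallback number 226 is simply the value proposed in this posting, so it may not match what a given kernel actually uses.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Assumption: installed headers may predate this patch, so fall back to
 * the capability number proposed in this posting. This is only meaningful
 * on a kernel carrying this exact patch.
 */
#ifndef KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE
#define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 226
#endif

/*
 * Hypothetical helper: ask KVM to use chunk_size for eager page splitting
 * on the VM behind vm_fd. Per the proposed documentation, 0 disables eager
 * splitting and the call fails with EINVAL once any memslot exists.
 * Returns 0 on success or a negative errno value on failure.
 */
static int set_eager_split_chunk_size(int vm_fd, uint64_t chunk_size)
{
	struct kvm_enable_cap cap;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE;
	cap.args[0] = chunk_size;

	if (ioctl(vm_fd, KVM_ENABLE_CAP, &cap) < 0)
		return -errno;
	return 0;
}
```

Because the patch only accepts the new value while all memslots are empty, a VMM would presumably call this right after creating the VM and before any KVM_SET_USER_MEMORY_REGION.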
Add a capability for userspace to specify the eager split chunk size.
The chunk size specifies how many pages to break at a time, using a
single allocation. Bigger the chunk size, more pages need to be
allocated ahead of time.

Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ricardo Koller <ricarkol@google.com>
---
 Documentation/virt/kvm/api.rst    | 26 ++++++++++++++++++++++++++
 arch/arm64/include/asm/kvm_host.h |  2 ++
 arch/arm64/kvm/arm.c              | 22 ++++++++++++++++++++++
 arch/arm64/kvm/mmu.c              |  3 +++
 include/uapi/linux/kvm.h          |  1 +
 5 files changed, 54 insertions(+)
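
To make the "bigger chunk, more pages allocated ahead of time" trade-off concrete, the following back-of-the-envelope sketch may help. It is not taken from the series. It assumes 4K base pages, guest memory backed only by 2M (PMD-level) blocks, and that splitting each block needs exactly one new page-table page. The `SKETCH_*` constants and both helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed geometry (not from the patch): 4K pages, 2M PMD-level blocks. */
#define SKETCH_PAGE_SIZE (UINT64_C(1) << 12)
#define SKETCH_PMD_SIZE  (UINT64_C(1) << 21)

/*
 * Hypothetical helper: number of page-table pages needed up front to
 * split one chunk, assuming one table page per 2M block in the chunk.
 */
static uint64_t sketch_tables_per_chunk(uint64_t chunk_size)
{
	return (chunk_size + SKETCH_PMD_SIZE - 1) / SKETCH_PMD_SIZE;
}

/* Hypothetical helper: bytes of page-table memory preallocated per chunk. */
static uint64_t sketch_prealloc_bytes(uint64_t chunk_size)
{
	return sketch_tables_per_chunk(chunk_size) * SKETCH_PAGE_SIZE;
}
```

Under these assumptions the default chunk size of 2M needs a single 4K table page per chunk, while a 1G chunk needs 512 of them (2M of page-table memory) held at once.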