Message ID | 20220225033548.1912117-3-kaleshsingh@google.com (mailing list archive)
---|---
State | New, archived
Series | KVM: arm64: Hypervisor stack enhancements
On Fri, 25 Feb 2022 03:34:47 +0000,
Kalesh Singh <kaleshsingh@google.com> wrote:
>
> pkvm_hyp_alloc_private_va_range() can be used to reserve private VA ranges
> in the pKVM nVHE hypervisor. Also update __pkvm_create_private_mapping()
> to allow specifying an alignment for the private VA mapping.
>
> These will be used to implement stack guard pages for the pKVM nVHE
> hypervisor (in a subsequent patch in the series).
>
> Credits to Quentin Perret <qperret@google.com> for the idea of moving
> private VA allocation out of __pkvm_create_private_mapping()
>
> Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
> ---
>
> Changes in v4:
>   - Handle null ptr in pkvm_alloc_private_va_range() and replace
>     IS_ERR_OR_NULL checks in callers with IS_ERR checks, per Fuad
>   - Fix kernel-doc comments format, per Fuad
>   - Format __pkvm_create_private_mapping() prototype args (< 80 col), per Fuad
>
> Changes in v3:
>   - Handle null ptr in IS_ERR_OR_NULL checks, per Mark
>
> Changes in v2:
>   - Allow specifying an alignment for the private VA allocations, per Marc

I probably badly expressed my earlier concern.

Yes, an alignment is necessary. But how often do we want an alignment
that isn't naturally aligned to the size of the allocation (i.e. the
power of 2 >= the size of the allocation)? This is what the rest of
the kernel does (get_order() and co), and I think we should follow
this.

This applies to both this patch and the previous one.

Thanks,

	M.
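Marc's notion of "natural" alignment (the smallest power of two greater than or equal to the allocation size, which is what get_order() and friends imply for page allocations) can be sketched in plain C. This is a hypothetical user-space model for illustration, not kernel code; natural_align() is a made-up helper name:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/*
 * Hypothetical helper: the "natural" alignment of an allocation is the
 * smallest power of two >= its size, and at least one page.
 */
static size_t natural_align(size_t size)
{
	size_t align = PAGE_SIZE;

	while (align < size)
		align <<= 1;
	return align;
}
```

A one- or two-page allocation keeps its size as alignment, but a three-page allocation rounds its alignment up to four pages (16KiB with 4KiB pages), which is the source of the VA holes Kalesh asks about in the reply.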
On Tue, Mar 1, 2022 at 11:46 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 25 Feb 2022 03:34:47 +0000,
> Kalesh Singh <kaleshsingh@google.com> wrote:
> >
> > pkvm_hyp_alloc_private_va_range() can be used to reserve private VA ranges
> > in the pKVM nVHE hypervisor. Also update __pkvm_create_private_mapping()
> > to allow specifying an alignment for the private VA mapping.
> >
> > [...]
>
> I probably badly expressed my earlier concern.
>
> Yes, an alignment is necessary. But how often do we want an alignment
> that isn't naturally aligned to the size of the allocation (i.e. the
> power of 2 >= the size of the allocation)? This is what the rest of
> the kernel does (get_order() and co), and I think we should follow
> this.

Hi Marc,

Thanks for clarifying. I think making the alignment implicitly based
on the size here will create unnecessary holes where PAGE_SIZE
alignment would be ok and potentially overflow the private VA space
earlier. Is it not a concern?

- Kalesh

>
> This applies to both this patch and the previous one.
>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
Hi Kalesh,

On Wed, 02 Mar 2022 17:24:53 +0000,
Kalesh Singh <kaleshsingh@google.com> wrote:
>
> On Tue, Mar 1, 2022 at 11:46 PM Marc Zyngier <maz@kernel.org> wrote:
> >
> > [...]
> >
> > Yes, an alignment is necessary. But how often do we want an alignment
> > that isn't naturally aligned to the size of the allocation (i.e. the
> > power of 2 >= the size of the allocation)? This is what the rest of
> > the kernel does (get_order() and co), and I think we should follow
> > this.
>
> Hi Marc,
>
> Thanks for clarifying. I think making the alignment implicitly based
> on the size here will create unnecessary holes where PAGE_SIZE
> alignment would be ok and potentially overflow the private VA space
> earlier. Is it not a concern?

I don't think we should worry too much about this. Even when building
the kernel with a very small VA space (commonly 39 bits), we still
have a quarter of that reserved for private EL2 mappings. That's
pretty big.

We will use a bit more of the memory that is set aside for EL2 page
tables, but this shouldn't be a problem either.

Thanks,

	M.
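The figure above can be checked with a quick back-of-the-envelope calculation (a hypothetical helper in user-space C, for illustration only):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: with va_bits of VA space, a quarter of it
 * reserved for private EL2 mappings comes to 2^(va_bits - 2) bytes.
 */
static uint64_t el2_private_va_bytes(unsigned int va_bits)
{
	return (1ULL << va_bits) / 4;
}
```

For the common 39-bit configuration that is 2^37 bytes, i.e. 128 GiB of private VA, which is large enough that alignment holes are unlikely to exhaust it.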
On Thu, Mar 3, 2022 at 9:29 AM Marc Zyngier <maz@kernel.org> wrote:
>
> Hi Kalesh,
>
> [...]
>
> I don't think we should worry too much about this. Even when building
> the kernel with a very small VA space (commonly 39 bits), we still
> have a quarter of that reserved for private EL2 mappings. That's
> pretty big.
>
> We will use a bit more of the memory that is set aside for EL2 page
> tables, but this shouldn't be a problem either.

Hi Marc,

Thanks for the explanations. I'll update as suggested in the next version.

- Kalesh

>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
diff --git a/arch/arm64/kvm/hyp/include/nvhe/mm.h b/arch/arm64/kvm/hyp/include/nvhe/mm.h
index 2d08510c6cc1..76d6fcf1d4f4 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mm.h
@@ -20,7 +20,8 @@ int pkvm_cpu_set_vector(enum arm64_hyp_spectre_vector slot);
 int pkvm_create_mappings(void *from, void *to, enum kvm_pgtable_prot prot);
 int pkvm_create_mappings_locked(void *from, void *to, enum kvm_pgtable_prot prot);
 unsigned long __pkvm_create_private_mapping(phys_addr_t phys, size_t size,
-					    enum kvm_pgtable_prot prot);
+					    size_t align, enum kvm_pgtable_prot prot);
+unsigned long pkvm_alloc_private_va_range(size_t size, size_t align);
 
 static inline void hyp_vmemmap_range(phys_addr_t phys, unsigned long size,
 				     unsigned long *start, unsigned long *end)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 5e2197db0d32..96b2312a0f1d 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -158,9 +158,10 @@ static void handle___pkvm_create_private_mapping(struct kvm_cpu_context *host_ct
 {
 	DECLARE_REG(phys_addr_t, phys, host_ctxt, 1);
 	DECLARE_REG(size_t, size, host_ctxt, 2);
-	DECLARE_REG(enum kvm_pgtable_prot, prot, host_ctxt, 3);
+	DECLARE_REG(size_t, align, host_ctxt, 3);
+	DECLARE_REG(enum kvm_pgtable_prot, prot, host_ctxt, 4);
 
-	cpu_reg(host_ctxt, 1) = __pkvm_create_private_mapping(phys, size, prot);
+	cpu_reg(host_ctxt, 1) = __pkvm_create_private_mapping(phys, size, align, prot);
 }
 
 static void handle___pkvm_prot_finalize(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c
index 526a7d6fa86f..e6355180aa49 100644
--- a/arch/arm64/kvm/hyp/nvhe/mm.c
+++ b/arch/arm64/kvm/hyp/nvhe/mm.c
@@ -37,38 +37,53 @@ static int __pkvm_create_mappings(unsigned long start, unsigned long size,
 	return err;
 }
 
-unsigned long __pkvm_create_private_mapping(phys_addr_t phys, size_t size,
-					    enum kvm_pgtable_prot prot)
+/**
+ * pkvm_alloc_private_va_range - Allocates a private VA range.
+ * @size:	The size of the VA range to reserve.
+ * @align:	The required alignment for the allocation.
+ *
+ * The private VA range is allocated above __io_map_base.
+ */
+unsigned long pkvm_alloc_private_va_range(size_t size, size_t align)
 {
-	unsigned long addr;
-	int err;
+	unsigned long base, addr;
 
 	hyp_spin_lock(&pkvm_pgd_lock);
 
-	size = PAGE_ALIGN(size + offset_in_page(phys));
-	addr = __io_map_base;
-	__io_map_base += size;
+	addr = ALIGN(__io_map_base, align);
+
+	/* The allocated size is always a multiple of PAGE_SIZE */
+	base = addr + PAGE_ALIGN(size);
 
 	/* Are we overflowing on the vmemmap ? */
-	if (__io_map_base > __hyp_vmemmap) {
-		__io_map_base -= size;
+	if (!addr || base > __hyp_vmemmap)
 		addr = (unsigned long)ERR_PTR(-ENOMEM);
-		goto out;
-	}
-
-	err = kvm_pgtable_hyp_map(&pkvm_pgtable, addr, size, phys, prot);
-	if (err) {
-		addr = (unsigned long)ERR_PTR(err);
-		goto out;
-	}
+	else
+		__io_map_base = base;
 
-	addr = addr + offset_in_page(phys);
-out:
 	hyp_spin_unlock(&pkvm_pgd_lock);
 
 	return addr;
 }
 
+unsigned long __pkvm_create_private_mapping(phys_addr_t phys, size_t size,
+					    size_t align, enum kvm_pgtable_prot prot)
+{
+	unsigned long addr;
+	int err;
+
+	size += offset_in_page(phys);
+	addr = pkvm_alloc_private_va_range(size, align);
+	if (IS_ERR((void *)addr))
+		return addr;
+
+	err = __pkvm_create_mappings(addr, size, phys, prot);
+	if (err)
+		return (unsigned long)ERR_PTR(err);
+
+	return addr + offset_in_page(phys);
+}
+
 int pkvm_create_mappings_locked(void *from, void *to, enum kvm_pgtable_prot prot)
 {
 	unsigned long start = (unsigned long)from;
@@ -152,10 +167,9 @@ int hyp_map_vectors(void)
 		return 0;
 
 	phys = __hyp_pa(__bp_harden_hyp_vecs);
-	bp_base = (void *)__pkvm_create_private_mapping(phys,
-							__BP_HARDEN_HYP_VECS_SZ,
-							PAGE_HYP_EXEC);
-	if (IS_ERR_OR_NULL(bp_base))
+	bp_base = (void *)__pkvm_create_private_mapping(phys, __BP_HARDEN_HYP_VECS_SZ,
+							PAGE_SIZE, PAGE_HYP_EXEC);
+	if (IS_ERR(bp_base))
 		return PTR_ERR(bp_base);
 
 	__hyp_bp_vect_base = bp_base;
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index a4868a6fa1c3..433c49766671 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -506,7 +506,7 @@ int __create_hyp_private_mapping(phys_addr_t phys_addr, size_t size,
 
 	if (!kvm_host_owns_hyp_mappings()) {
 		addr = kvm_call_hyp_nvhe(__pkvm_create_private_mapping,
-					 phys_addr, size, prot);
+					 phys_addr, size, align, prot);
 		if (IS_ERR((void *)addr))
 			return PTR_ERR((void *)addr);
 		*haddr = addr;
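The allocation path of the new pkvm_alloc_private_va_range() in the diff above can be modeled in user space roughly as follows. This is a sketch with made-up base addresses and no locking, not the hypervisor code itself:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE     4096UL
#define ALIGN(x, a)   (((x) + ((a) - 1)) & ~((a) - 1))
#define PAGE_ALIGN(x) ALIGN(x, PAGE_SIZE)

/*
 * Toy model of the allocator: io_map_base grows upward toward
 * hyp_vmemmap (both addresses are invented for the example).
 */
static unsigned long io_map_base = 0x100000UL;
static const unsigned long hyp_vmemmap = 0x200000UL;

static unsigned long alloc_private_va_range(size_t size, size_t align)
{
	unsigned long addr = ALIGN(io_map_base, align);
	unsigned long base = addr + PAGE_ALIGN(size);

	/* ALIGN() wrapping to 0, or running into the vmemmap, is ENOMEM */
	if (!addr || base > hyp_vmemmap)
		return 0; /* stand-in for ERR_PTR(-ENOMEM) */

	io_map_base = base;
	return addr;
}
```

Note how a failed allocation leaves io_map_base untouched (the real code only commits `__io_map_base = base` on success), and how an `align` larger than PAGE_SIZE skips over the unaligned gap, which is the VA hole discussed earlier in the thread.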
pkvm_hyp_alloc_private_va_range() can be used to reserve private VA ranges
in the pKVM nVHE hypervisor. Also update __pkvm_create_private_mapping()
to allow specifying an alignment for the private VA mapping.

These will be used to implement stack guard pages for the pKVM nVHE
hypervisor (in a subsequent patch in the series).

Credits to Quentin Perret <qperret@google.com> for the idea of moving
private VA allocation out of __pkvm_create_private_mapping()

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
---
Changes in v4:
  - Handle null ptr in pkvm_alloc_private_va_range() and replace
    IS_ERR_OR_NULL checks in callers with IS_ERR checks, per Fuad
  - Fix kernel-doc comments format, per Fuad
  - Format __pkvm_create_private_mapping() prototype args (< 80 col), per Fuad

Changes in v3:
  - Handle null ptr in IS_ERR_OR_NULL checks, per Mark

Changes in v2:
  - Allow specifying an alignment for the private VA allocations, per Marc

 arch/arm64/kvm/hyp/include/nvhe/mm.h |  3 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c   |  5 ++-
 arch/arm64/kvm/hyp/nvhe/mm.c         | 60 +++++++++++++++++-----------
 arch/arm64/kvm/mmu.c                 |  2 +-
 4 files changed, 43 insertions(+), 27 deletions(-)