
[RFC,15/73] mm/vmalloc: Add a helper to reserve a contiguous and aligned kernel virtual area

Message ID 20240226143630.33643-16-jiangshanlai@gmail.com (mailing list archive)
State New

Commit Message

Lai Jiangshan Feb. 26, 2024, 2:35 p.m. UTC
From: Hou Wenlong <houwenlong.hwl@antgroup.com>

PVM needs to reserve a contiguous and aligned kernel virtual area for
the guest kernel. Therefore, add a helper to achieve this. This is a
temporary approach; a better method is needed in the future.

Suggested-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
 include/linux/vmalloc.h |  2 ++
 mm/vmalloc.c            | 10 ++++++++++
 2 files changed, 12 insertions(+)

Comments

Christoph Hellwig Feb. 27, 2024, 2:56 p.m. UTC | #1
On Mon, Feb 26, 2024 at 10:35:32PM +0800, Lai Jiangshan wrote:
> From: Hou Wenlong <houwenlong.hwl@antgroup.com>
> 
> PVM needs to reserve a contiguous and aligned kernel virtual area for

Who is "PVM", and why does it need aligned virtual memory space?

> +extern struct vm_struct *get_vm_area_align(unsigned long size, unsigned long align,

No need for the extern here.

Lai Jiangshan Feb. 27, 2024, 5:07 p.m. UTC | #2
Hello

On Tue, Feb 27, 2024 at 10:56 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Mon, Feb 26, 2024 at 10:35:32PM +0800, Lai Jiangshan wrote:
> > From: Hou Wenlong <houwenlong.hwl@antgroup.com>
> >
> > PVM needs to reserve a contiguous and aligned kernel virtual area for
>
> Who is "PVM", and why does it need aligned virtual memory space?

PVM stands for Pagetable-based Virtual Machine. It is a new pure
software-implemented virtualization solution. The details are in the
cover letter:
https://lore.kernel.org/lkml/20240226143630.33643-1-jiangshanlai@gmail.com/

I'm sorry for not CC'ing you on the cover letter (I haven't made or found a
proper script to generate all CC recipients for the cover letter) and for not
elaborating on the reason in the changelog.

One of the core designs in PVM is "exclusive address space separation":
in the higher half of the address space (where the most significant
bits of the addresses are 1s), the address ranges that a PVM guest is
allowed to use are disjoint from those used by the host kernel. So the PVM
hypervisor has to use get_vm_area_align() to reserve a huge range (normally
16T), aligned to 512G (PGDIR_SIZE), for all the guests, to accommodate the
whole guest kernel space. The reserved range cannot be used by the
host.
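
The size and alignment arithmetic described above can be sketched in plain
userspace C. This is only an illustration, not code from the series: the
pvm_* names are hypothetical, PGDIR_SHIFT is assumed to be 39 (x86-64 with
4-level paging), and the first-fit helper ignores existing allocations and
the guard page that the real vmalloc allocator accounts for.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: x86-64 4-level paging, where one PGD entry maps 512G. */
#define PVM_PGDIR_SHIFT 39
#define PVM_PGDIR_SIZE  (UINT64_C(1) << PVM_PGDIR_SHIFT)
/* The "normally 16T" range mentioned above. */
#define PVM_RANGE_SIZE  (UINT64_C(16) << 40)

/* Round addr up to the next multiple of align (align must be a power of 2). */
static uint64_t pvm_align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/* Number of whole PGD entries covered by a PGDIR-aligned range of this size. */
static unsigned int pvm_pgd_entries(uint64_t size)
{
	return (unsigned int)(size >> PVM_PGDIR_SHIFT);
}

/*
 * Hypothetical first-fit check on an empty window [vstart, vend):
 * returns the lowest aligned start address at which size bytes fit,
 * or 0 if the window is too small (or the round-up wrapped around).
 */
static uint64_t pvm_first_fit(uint64_t vstart, uint64_t vend,
			      uint64_t size, uint64_t align)
{
	uint64_t start = pvm_align_up(vstart, align);

	if (start < vstart || start >= vend || vend - start < size)
		return 0;
	return start;
}
```

With these assumptions a 16T reservation spans 32 PGD entries, which is why
the alignment has to be PGDIR_SIZE: the guest range then occupies whole
top-level page table entries that the host never touches.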

The rationale of this core design is also in the cover letter.

Thanks
Lai

>
> > +extern struct vm_struct *get_vm_area_align(unsigned long size, unsigned long align,
>
> No need for the extern here.
>

Patch

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index c720be70c8dd..1821494b51d6 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -204,6 +204,8 @@  static inline size_t get_vm_area_size(const struct vm_struct *area)
 }
 
 extern struct vm_struct *get_vm_area(unsigned long size, unsigned long flags);
+extern struct vm_struct *get_vm_area_align(unsigned long size, unsigned long align,
+					   unsigned long flags);
 extern struct vm_struct *get_vm_area_caller(unsigned long size,
 					unsigned long flags, const void *caller);
 extern struct vm_struct *__get_vm_area_caller(unsigned long size,
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d12a17fc0c17..6e4b95f24bd8 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2642,6 +2642,16 @@  struct vm_struct *get_vm_area(unsigned long size, unsigned long flags)
 				  __builtin_return_address(0));
 }
 
+struct vm_struct *get_vm_area_align(unsigned long size, unsigned long align,
+				    unsigned long flags)
+{
+	return __get_vm_area_node(size, align, PAGE_SHIFT, flags,
+				  VMALLOC_START, VMALLOC_END,
+				  NUMA_NO_NODE, GFP_KERNEL,
+				  __builtin_return_address(0));
+}
+EXPORT_SYMBOL_GPL(get_vm_area_align);
+
 struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags,
 				const void *caller)
 {