Message ID | 20221028151526.319681-2-willy@infradead.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Mapping an entire folio | expand |
Matthew Wilcox (Oracle) <willy@infradead.org> wrote: > Some filesystems benefit from being able to map the entire folio. > On 32-bit platforms with HIGHMEM, we fall back to using vmap, which > will be slow. If it proves to be a performance problem, we can look at > optimising it in a number of ways. Here's a thought: If a highmem arch has a huge PTEs available, can you create a huge PTE to cover the folio? > @@ -426,5 +465,4 @@ static inline void folio_zero_range(struct folio *folio, > { > zero_user_segments(&folio->page, start, start + length, 0, 0); > } > - > #endif /* _LINUX_HIGHMEM_H */ Did you want to remove that blank line? David
Matthew Wilcox (Oracle) <willy@infradead.org> wrote: > +static inline __must_check void *folio_map_local(struct folio *folio) > +{ > + might_alloc(GFP_KERNEL); > + > + if (!IS_ENABLED(CONFIG_HIGHMEM)) > + return folio_address(folio); Can you make the might_alloc() contingent on CONFIG_HIGHMEM=y? David
On Mon, Oct 31, 2022 at 03:11:38PM +0000, David Howells wrote: > Matthew Wilcox (Oracle) <willy@infradead.org> wrote: > > > +static inline __must_check void *folio_map_local(struct folio *folio) > > +{ > > + might_alloc(GFP_KERNEL); > > + > > + if (!IS_ENABLED(CONFIG_HIGHMEM)) > > + return folio_address(folio); > > Can you make the might_alloc() contingent on CONFIG_HIGHMEM=y? ... so that 64-bit builds can inadvertently introduce bugs? Either it's safe to do GFP_KERNEL allocations from a given context, or it's not. That's not going to depend on CONFIG_HIGHMEM.
On Fri, Oct 28, 2022 at 04:15:26PM +0100, Matthew Wilcox (Oracle) wrote: > Some filesystems benefit from being able to map the entire folio. > On 32-bit platforms with HIGHMEM, we fall back to using vmap, which > will be slow. If it proves to be a performance problem, we can look at > optimising it in a number of ways. > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > --- > include/linux/highmem.h | 40 ++++++++++++++++++++++++++++++++- > include/linux/vmalloc.h | 6 +++-- > mm/vmalloc.c | 50 +++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 93 insertions(+), 3 deletions(-) > > diff --git a/include/linux/highmem.h b/include/linux/highmem.h > index e9912da5441b..e8159243d88d 100644 > --- a/include/linux/highmem.h > +++ b/include/linux/highmem.h > @@ -10,6 +10,7 @@ > #include <linux/mm.h> > #include <linux/uaccess.h> > #include <linux/hardirq.h> > +#include <linux/vmalloc.h> > > #include "highmem-internal.h" > > @@ -132,6 +133,44 @@ static inline void *kmap_local_page(struct page *page); > */ > static inline void *kmap_local_folio(struct folio *folio, size_t offset); > > +/** > + * folio_map_local - Map an entire folio. > + * @folio: The folio to map. > + * > + * Unlike kmap_local_folio(), map an entire folio. This should be undone > + * with folio_unmap_local(). The address returned should be treated as > + * stack-based, and local to this CPU, like kmap_local_folio(). > + * > + * Context: May allocate memory using GFP_KERNEL if it takes the vmap path. > + * Return: A kernel virtual address which can be used to access the folio, > + * or NULL if the mapping fails. > + */ > +static inline __must_check void *folio_map_local(struct folio *folio) > +{ > + might_alloc(GFP_KERNEL); > + > + if (!IS_ENABLED(CONFIG_HIGHMEM)) > + return folio_address(folio); > + if (folio_test_large(folio)) > + return vm_map_folio(folio); > + return kmap_local_page(&folio->page); > +} > + > +/** > + * folio_unmap_local - Unmap an entire folio. > + * @addr: Address returned from folio_map_local() > + * > + * Undo the result of a previous call to folio_map_local(). > + */ > +static inline void folio_unmap_local(const void *addr) > +{ > + if (!IS_ENABLED(CONFIG_HIGHMEM)) > + return; > + if (is_vmalloc_addr(addr)) > + vunmap(addr); I think it should be vm_unmap_ram(); (and pass number of pages to folio_unmap_local()) as the vmap area might be allocated using vb_alloc(). > + kunmap_local(addr); > +} missing else statement? > + > /** > * kmap_atomic - Atomically map a page for temporary usage - Deprecated! > * @page: Pointer to the page to be mapped > @@ -426,5 +465,4 @@ static inline void folio_zero_range(struct folio *folio, > { > zero_user_segments(&folio->page, start, start + length, 0, 0); > } > - > #endif /* _LINUX_HIGHMEM_H */ > diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h > index 096d48aa3437..4bb34c939c01 100644 > --- a/include/linux/vmalloc.h > +++ b/include/linux/vmalloc.h > @@ -13,6 +13,7 @@ > #include <asm/vmalloc.h> > > struct vm_area_struct; /* vma defining user mapping in mm_types.h */ > +struct folio; /* also mm_types.h */ > struct notifier_block; /* in notifier.h */ > > /* bits in flags of vmalloc's vm_struct below */ > @@ -163,8 +164,9 @@ extern void *vcalloc(size_t n, size_t size) __alloc_size(1, 2); > extern void vfree(const void *addr); > extern void vfree_atomic(const void *addr); > > -extern void *vmap(struct page **pages, unsigned int count, > - unsigned long flags, pgprot_t prot); > +void *vmap(struct page **pages, unsigned int count, unsigned long flags, > + pgprot_t prot); > +void *vm_map_folio(struct folio *folio); > void *vmap_pfn(unsigned long *pfns, unsigned int count, pgprot_t prot); > extern void vunmap(const void *addr); > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index ccaa461998f3..265b860c9550 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2283,6 +2283,56 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node) > } > EXPORT_SYMBOL(vm_map_ram); > > +#ifdef CONFIG_HIGHMEM > +/** > + * vm_map_folio() - Map an entire folio into virtually contiguous space. > + * @folio: The folio to map. > + * > + * Maps all pages in @folio into contiguous kernel virtual space. This > + * function is only available in HIGHMEM builds; for !HIGHMEM, use > + * folio_address(). The pages are mapped with PAGE_KERNEL permissions. > + * > + * Return: The address of the area or %NULL on failure > + */ > +void *vm_map_folio(struct folio *folio) > +{ > + size_t size = folio_size(folio); > + unsigned long addr; > + void *mem; > + > + might_sleep(); > + > + if (likely(folio_nr_pages(folio) <= VMAP_MAX_ALLOC)) { > + mem = vb_alloc(size, GFP_KERNEL); > + if (IS_ERR(mem)) > + return NULL; > + addr = (unsigned long)mem; > + } else { > + struct vmap_area *va; > + va = alloc_vmap_area(size, PAGE_SIZE, VMALLOC_START, > + VMALLOC_END, NUMA_NO_NODE, GFP_KERNEL); > + if (IS_ERR(va)) > + return NULL; > + > + addr = va->va_start; > + mem = (void *)addr; > + } > + > + if (vmap_range_noflush(addr, addr + size, > + folio_pfn(folio) << PAGE_SHIFT, > + PAGE_KERNEL, folio_shift(folio))) { > + vm_unmap_ram(mem, folio_nr_pages(folio)); > + return NULL; > + } > + flush_cache_vmap(addr, addr + size); > + > + mem = kasan_unpoison_vmalloc(mem, size, KASAN_VMALLOC_PROT_NORMAL); > + > + return mem; > +} > +EXPORT_SYMBOL(vm_map_folio); > +#endif it's a bit of copy & paste but yeah, it seems unavoidable at this point. > static struct vm_struct *vmlist __initdata; > > static inline unsigned int vm_area_page_order(struct vm_struct *vm) > -- > 2.35.1
On Tue, Nov 01, 2022 at 07:12:21PM +0900, Hyeonggon Yoo wrote: > On Fri, Oct 28, 2022 at 04:15:26PM +0100, Matthew Wilcox (Oracle) wrote: > > Some filesystems benefit from being able to map the entire folio. > > On 32-bit platforms with HIGHMEM, we fall back to using vmap, which > > will be slow. If it proves to be a performance problem, we can look at > > optimising it in a number of ways. > > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > > --- > > include/linux/highmem.h | 40 ++++++++++++++++++++++++++++++++- > > include/linux/vmalloc.h | 6 +++-- > > mm/vmalloc.c | 50 +++++++++++++++++++++++++++++++++++++++++ > > 3 files changed, 93 insertions(+), 3 deletions(-) > > > > diff --git a/include/linux/highmem.h b/include/linux/highmem.h > > index e9912da5441b..e8159243d88d 100644 > > --- a/include/linux/highmem.h > > +++ b/include/linux/highmem.h > > @@ -10,6 +10,7 @@ > > #include <linux/mm.h> > > #include <linux/uaccess.h> > > #include <linux/hardirq.h> > > +#include <linux/vmalloc.h> > > > > #include "highmem-internal.h" > > > > @@ -132,6 +133,44 @@ static inline void *kmap_local_page(struct page *page); > > */ > > static inline void *kmap_local_folio(struct folio *folio, size_t offset); > > > > +/** > > + * folio_map_local - Map an entire folio. > > + * @folio: The folio to map. > > + * > > + * Unlike kmap_local_folio(), map an entire folio. This should be undone > > + * with folio_unmap_local(). The address returned should be treated as > > + * stack-based, and local to this CPU, like kmap_local_folio(). > > + * > > + * Context: May allocate memory using GFP_KERNEL if it takes the vmap path. > > + * Return: A kernel virtual address which can be used to access the folio, > > + * or NULL if the mapping fails. > > + */ > > +static inline __must_check void *folio_map_local(struct folio *folio) > > +{ > > + might_alloc(GFP_KERNEL); > > + > > + if (!IS_ENABLED(CONFIG_HIGHMEM)) > > + return folio_address(folio); > > + if (folio_test_large(folio)) > > + return vm_map_folio(folio); > > + return kmap_local_page(&folio->page); > > +} > > + > > +/** > > + * folio_unmap_local - Unmap an entire folio. > > + * @addr: Address returned from folio_map_local() > > + * > > + * Undo the result of a previous call to folio_map_local(). > > + */ > > +static inline void folio_unmap_local(const void *addr) > > +{ > > + if (!IS_ENABLED(CONFIG_HIGHMEM)) > > + return; > > + if (is_vmalloc_addr(addr)) > > + vunmap(addr); > > I think it should be vm_unmap_ram(); (and pass number of pages to > folio_unmap_local()) as the vmap area might be allocated using > vb_alloc(). > > > + kunmap_local(addr); > > +} > > missing else statement? > > > + > > /** > > * kmap_atomic - Atomically map a page for temporary usage - Deprecated! > > * @page: Pointer to the page to be mapped > > @@ -426,5 +465,4 @@ static inline void folio_zero_range(struct folio *folio, > > { > > zero_user_segments(&folio->page, start, start + length, 0, 0); > > } > > - > > #endif /* _LINUX_HIGHMEM_H */ > > diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h > > index 096d48aa3437..4bb34c939c01 100644 > > --- a/include/linux/vmalloc.h > > +++ b/include/linux/vmalloc.h > > @@ -13,6 +13,7 @@ > > #include <asm/vmalloc.h> > > > > struct vm_area_struct; /* vma defining user mapping in mm_types.h */ > > +struct folio; /* also mm_types.h */ > > struct notifier_block; /* in notifier.h */ > > > > /* bits in flags of vmalloc's vm_struct below */ > > @@ -163,8 +164,9 @@ extern void *vcalloc(size_t n, size_t size) __alloc_size(1, 2); > > extern void vfree(const void *addr); > > extern void vfree_atomic(const void *addr); > > > > -extern void *vmap(struct page **pages, unsigned int count, > > - unsigned long flags, pgprot_t prot); > > +void *vmap(struct page **pages, unsigned int count, unsigned long flags, > > + pgprot_t prot); > > +void *vm_map_folio(struct folio *folio); > > void *vmap_pfn(unsigned long *pfns, unsigned int count, pgprot_t prot); > > extern void vunmap(const void *addr); > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > index ccaa461998f3..265b860c9550 100644 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -2283,6 +2283,56 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node) > > } > > EXPORT_SYMBOL(vm_map_ram); > > > > +#ifdef CONFIG_HIGHMEM > > +/** > > + * vm_map_folio() - Map an entire folio into virtually contiguous space. > > + * @folio: The folio to map. > > + * > > + * Maps all pages in @folio into contiguous kernel virtual space. This > > + * function is only available in HIGHMEM builds; for !HIGHMEM, use > > + * folio_address(). The pages are mapped with PAGE_KERNEL permissions. > > + * > > + * Return: The address of the area or %NULL on failure > > + */ > > +void *vm_map_folio(struct folio *folio) > > +{ > > + size_t size = folio_size(folio); > > + unsigned long addr; > > + void *mem; > > + > > + might_sleep(); > > + > > + if (likely(folio_nr_pages(folio) <= VMAP_MAX_ALLOC)) { > > + mem = vb_alloc(size, GFP_KERNEL); > > + if (IS_ERR(mem)) > > + return NULL; > > + addr = (unsigned long)mem; > > + } else { > > + struct vmap_area *va; > > + va = alloc_vmap_area(size, PAGE_SIZE, VMALLOC_START, > > + VMALLOC_END, NUMA_NO_NODE, GFP_KERNEL); > > + if (IS_ERR(va)) > > + return NULL; > > + > > + addr = va->va_start; > > + mem = (void *)addr; > > + } > > + > > + if (vmap_range_noflush(addr, addr + size, > > + folio_pfn(folio) << PAGE_SHIFT, > > + PAGE_KERNEL, folio_shift(folio))) { > > + vm_unmap_ram(mem, folio_nr_pages(folio)); > > + return NULL; > > + } > > + flush_cache_vmap(addr, addr + size); > > + > > + mem = kasan_unpoison_vmalloc(mem, size, KASAN_VMALLOC_PROT_NORMAL); > > + > > + return mem; > > +} > > +EXPORT_SYMBOL(vm_map_folio); > > +#endif > > it's a bit of copy & paste but yeah, it seems unavoidable at this point. > Agree. It is worth to rework the void *vm_map_folio(struct folio *folio) function. All the rest looks reasonable. -- Uladzislau Rezki
diff --git a/include/linux/highmem.h b/include/linux/highmem.h index e9912da5441b..e8159243d88d 100644 --- a/include/linux/highmem.h +++ b/include/linux/highmem.h @@ -10,6 +10,7 @@ #include <linux/mm.h> #include <linux/uaccess.h> #include <linux/hardirq.h> +#include <linux/vmalloc.h> #include "highmem-internal.h" @@ -132,6 +133,44 @@ static inline void *kmap_local_page(struct page *page); */ static inline void *kmap_local_folio(struct folio *folio, size_t offset); +/** + * folio_map_local - Map an entire folio. + * @folio: The folio to map. + * + * Unlike kmap_local_folio(), map an entire folio. This should be undone + * with folio_unmap_local(). The address returned should be treated as + * stack-based, and local to this CPU, like kmap_local_folio(). + * + * Context: May allocate memory using GFP_KERNEL if it takes the vmap path. + * Return: A kernel virtual address which can be used to access the folio, + * or NULL if the mapping fails. + */ +static inline __must_check void *folio_map_local(struct folio *folio) +{ + might_alloc(GFP_KERNEL); + + if (!IS_ENABLED(CONFIG_HIGHMEM)) + return folio_address(folio); + if (folio_test_large(folio)) + return vm_map_folio(folio); + return kmap_local_page(&folio->page); +} + +/** + * folio_unmap_local - Unmap an entire folio. + * @addr: Address returned from folio_map_local() + * + * Undo the result of a previous call to folio_map_local(). + */ +static inline void folio_unmap_local(const void *addr) +{ + if (!IS_ENABLED(CONFIG_HIGHMEM)) + return; + if (is_vmalloc_addr(addr)) + vunmap(addr); + kunmap_local(addr); +} + /** * kmap_atomic - Atomically map a page for temporary usage - Deprecated! * @page: Pointer to the page to be mapped @@ -426,5 +465,4 @@ static inline void folio_zero_range(struct folio *folio, { zero_user_segments(&folio->page, start, start + length, 0, 0); } - #endif /* _LINUX_HIGHMEM_H */ diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index 096d48aa3437..4bb34c939c01 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -13,6 +13,7 @@ #include <asm/vmalloc.h> struct vm_area_struct; /* vma defining user mapping in mm_types.h */ +struct folio; /* also mm_types.h */ struct notifier_block; /* in notifier.h */ /* bits in flags of vmalloc's vm_struct below */ @@ -163,8 +164,9 @@ extern void *vcalloc(size_t n, size_t size) __alloc_size(1, 2); extern void vfree(const void *addr); extern void vfree_atomic(const void *addr); -extern void *vmap(struct page **pages, unsigned int count, - unsigned long flags, pgprot_t prot); +void *vmap(struct page **pages, unsigned int count, unsigned long flags, + pgprot_t prot); +void *vm_map_folio(struct folio *folio); void *vmap_pfn(unsigned long *pfns, unsigned int count, pgprot_t prot); extern void vunmap(const void *addr); diff --git a/mm/vmalloc.c b/mm/vmalloc.c index ccaa461998f3..265b860c9550 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -2283,6 +2283,56 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node) } EXPORT_SYMBOL(vm_map_ram); +#ifdef CONFIG_HIGHMEM +/** + * vm_map_folio() - Map an entire folio into virtually contiguous space. + * @folio: The folio to map. + * + * Maps all pages in @folio into contiguous kernel virtual space. This + * function is only available in HIGHMEM builds; for !HIGHMEM, use + * folio_address(). The pages are mapped with PAGE_KERNEL permissions. + * + * Return: The address of the area or %NULL on failure + */ +void *vm_map_folio(struct folio *folio) +{ + size_t size = folio_size(folio); + unsigned long addr; + void *mem; + + might_sleep(); + + if (likely(folio_nr_pages(folio) <= VMAP_MAX_ALLOC)) { + mem = vb_alloc(size, GFP_KERNEL); + if (IS_ERR(mem)) + return NULL; + addr = (unsigned long)mem; + } else { + struct vmap_area *va; + va = alloc_vmap_area(size, PAGE_SIZE, VMALLOC_START, + VMALLOC_END, NUMA_NO_NODE, GFP_KERNEL); + if (IS_ERR(va)) + return NULL; + + addr = va->va_start; + mem = (void *)addr; + } + + if (vmap_range_noflush(addr, addr + size, + folio_pfn(folio) << PAGE_SHIFT, + PAGE_KERNEL, folio_shift(folio))) { + vm_unmap_ram(mem, folio_nr_pages(folio)); + return NULL; + } + flush_cache_vmap(addr, addr + size); + + mem = kasan_unpoison_vmalloc(mem, size, KASAN_VMALLOC_PROT_NORMAL); + + return mem; +} +EXPORT_SYMBOL(vm_map_folio); +#endif + static struct vm_struct *vmlist __initdata; static inline unsigned int vm_area_page_order(struct vm_struct *vm)
Some filesystems benefit from being able to map the entire folio. On 32-bit platforms with HIGHMEM, we fall back to using vmap, which will be slow. If it proves to be a performance problem, we can look at optimising it in a number of ways. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> --- include/linux/highmem.h | 40 ++++++++++++++++++++++++++++++++- include/linux/vmalloc.h | 6 +++-- mm/vmalloc.c | 50 +++++++++++++++++++++++++++++++++++++++++ 3 files changed, 93 insertions(+), 3 deletions(-)