diff mbox series

[v2,1/2] mm/vmalloc: export __vmalloc_node_range

Message ID 20210608180618.477766-2-imbrenda@linux.ibm.com (mailing list archive)
State New
Series mm: export __vmalloc_node_range and use it

Commit Message

Claudio Imbrenda June 8, 2021, 6:06 p.m. UTC
The recent patches to add support for hugepage vmalloc mappings added a
flag for __vmalloc_node_range that allows requesting small pages.
This flag is not accessible when calling vmalloc; the only option is to
call __vmalloc_node_range directly, which is not exported.

This means that a module can't vmalloc memory with small pages.

Case in point: KVM on s390x needs to vmalloc a large area, and it needs
to be mapped with small pages, because of a hardware limitation.

This patch exports __vmalloc_node_range so it can be used in modules
too.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: David Rientjes <rientjes@google.com>
---
 mm/vmalloc.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Christoph Hellwig June 9, 2021, 3:59 p.m. UTC | #1
On Tue, Jun 08, 2021 at 08:06:17PM +0200, Claudio Imbrenda wrote:
> The recent patches to add support for hugepage vmalloc mappings added a
> flag for __vmalloc_node_range to allow to request small pages.
> This flag is not accessible when calling vmalloc, the only option is to
> call directly __vmalloc_node_range, which is not exported.
> 
> This means that a module can't vmalloc memory with small pages.
> 
> Case in point: KVM on s390x needs to vmalloc a large area, and it needs
> to be mapped with small pages, because of a hardware limitation.
> 
> This patch exports __vmalloc_node_range so it can be used in modules
> too.

No.  I spent a lot of effort to make sure such a low-level API is
not exported.
Claudio Imbrenda June 9, 2021, 4:28 p.m. UTC | #2
On Wed, 9 Jun 2021 16:59:17 +0100
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Jun 08, 2021 at 08:06:17PM +0200, Claudio Imbrenda wrote:
> > The recent patches to add support for hugepage vmalloc mappings
> > added a flag for __vmalloc_node_range to allow to request small
> > pages. This flag is not accessible when calling vmalloc, the only
> > option is to call directly __vmalloc_node_range, which is not
> > exported.
> > 
> > This means that a module can't vmalloc memory with small pages.
> > 
> > Case in point: KVM on s390x needs to vmalloc a large area, and it
> > needs to be mapped with small pages, because of a hardware
> > limitation.
> > 
> > This patch exports __vmalloc_node_range so it can be used in modules
> > too.  
> 
> No.  I spent a lot of effort to make sure such a low-level API is
> not exported.

ok, but then how can we vmalloc memory with small pages from KVM?
Uladzislau Rezki June 9, 2021, 4:49 p.m. UTC | #3
On Wed, Jun 09, 2021 at 06:28:09PM +0200, Claudio Imbrenda wrote:
> On Wed, 9 Jun 2021 16:59:17 +0100
> Christoph Hellwig <hch@infradead.org> wrote:
> 
> > On Tue, Jun 08, 2021 at 08:06:17PM +0200, Claudio Imbrenda wrote:
> > > The recent patches to add support for hugepage vmalloc mappings
> > > added a flag for __vmalloc_node_range to allow to request small
> > > pages. This flag is not accessible when calling vmalloc, the only
> > > option is to call directly __vmalloc_node_range, which is not
> > > exported.
> > > 
> > > This means that a module can't vmalloc memory with small pages.
> > > 
> > > Case in point: KVM on s390x needs to vmalloc a large area, and it
> > > needs to be mapped with small pages, because of a hardware
> > > limitation.
> > > 
> > > This patch exports __vmalloc_node_range so it can be used in modules
> > > too.  
> > 
> > No.  I spent a lot of effort to make sure such a low-level API is
> > not exported.
> 
> ok, but then how can we vmalloc memory with small pages from KVM?
Does s390x support CONFIG_HAVE_ARCH_HUGE_VMALLOC, which is arch
specific?

If not, then small pages are used. Or am I missing something?

I agree with Christoph that exporting low-level internals
is not a good idea.

--
Vlad Rezki
Christian Borntraeger June 9, 2021, 5:47 p.m. UTC | #4
On 09.06.21 18:28, Claudio Imbrenda wrote:
> On Wed, 9 Jun 2021 16:59:17 +0100
> Christoph Hellwig <hch@infradead.org> wrote:
> 
>> On Tue, Jun 08, 2021 at 08:06:17PM +0200, Claudio Imbrenda wrote:
>>> The recent patches to add support for hugepage vmalloc mappings
>>> added a flag for __vmalloc_node_range to allow to request small
>>> pages. This flag is not accessible when calling vmalloc, the only
>>> option is to call directly __vmalloc_node_range, which is not
>>> exported.
>>>
>>> This means that a module can't vmalloc memory with small pages.
>>>
>>> Case in point: KVM on s390x needs to vmalloc a large area, and it
>>> needs to be mapped with small pages, because of a hardware
>>> limitation.
>>>
>>> This patch exports __vmalloc_node_range so it can be used in modules
>>> too.
>>
>> No.  I spent a lot of effort to make sure such a low-level API is
>> not exported.
> 
> ok, but then how can we vmalloc memory with small pages from KVM?

An alternative would be to provide a vmalloc_no_huge function in generic
code (similar to vmalloc_32), or, if preferred, in s390 base architecture code.
Something like

void *vmalloc_no_huge(unsigned long size)
{
	return __vmalloc_node_flags(size, NUMA_NO_NODE, VM_NO_HUGE_VMAP |
				    GFP_KERNEL | __GFP_ZERO);
}
EXPORT_SYMBOL(vmalloc_no_huge);

or a similar vzalloc variant.
Christian Borntraeger June 9, 2021, 5:50 p.m. UTC | #5
On 09.06.21 18:49, Uladzislau Rezki wrote:
> On Wed, Jun 09, 2021 at 06:28:09PM +0200, Claudio Imbrenda wrote:
>> On Wed, 9 Jun 2021 16:59:17 +0100
>> Christoph Hellwig <hch@infradead.org> wrote:
>>
>>> On Tue, Jun 08, 2021 at 08:06:17PM +0200, Claudio Imbrenda wrote:
>>>> The recent patches to add support for hugepage vmalloc mappings
>>>> added a flag for __vmalloc_node_range to allow to request small
>>>> pages. This flag is not accessible when calling vmalloc, the only
>>>> option is to call directly __vmalloc_node_range, which is not
>>>> exported.
>>>>
>>>> This means that a module can't vmalloc memory with small pages.
>>>>
>>>> Case in point: KVM on s390x needs to vmalloc a large area, and it
>>>> needs to be mapped with small pages, because of a hardware
>>>> limitation.
>>>>
>>>> This patch exports __vmalloc_node_range so it can be used in modules
>>>> too.
>>>
>>> No.  I spent a lot of effort to make sure such a low-level API is
>>> not exported.
>>
>> ok, but then how can we vmalloc memory with small pages from KVM?
> Does s390x support CONFIG_HAVE_ARCH_HUGE_VMALLOC, which is arch
> specific?

Not yet, but we surely want that for almost everything on s390.
Only this particular firmware interface does not handle large pages
for donated memory.

> 
> If not, then small pages are used. Or am I missing something?
> 
> I agree with Christoph that exporting low-level internals
> is not a good idea.
Christoph Hellwig June 10, 2021, 5:24 a.m. UTC | #6
On Wed, Jun 09, 2021 at 07:47:43PM +0200, Christian Borntraeger wrote:
> An alternative would be to provide a vmalloc_no_huge function in generic
> code  (similar to  vmalloc_32) (or if preferred in s390 base architecture code)
> Something like
> 
> void *vmalloc_no_huge(unsigned long size)
> {
>         return __vmalloc_node_flags(size, NUMA_NO_NODE,VM_NO_HUGE_VMAP |
>                                 GFP_KERNEL | __GFP_ZERO);
> }
> EXPORT_SYMBOL(vmalloc_no_huge);
> 
> or a similar vzalloc variant.

Exactly.  Given that this seems to be a weird peculiarity of legacy s390
interfaces I'd only export it for s390 for now, although for
documentation purposes I'd probably still keep it in vmalloc.c.
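
For illustration, the wrapper approach discussed in this thread can be
modeled in plain userspace C. This is only a sketch under stated
assumptions, not the code that was eventually merged:
mock_vmalloc_node_range is a hypothetical stand-in for the unexported
__vmalloc_node_range, the flag values are made up, and vzalloc_no_huge
is a hypothetical name for the "similar vzalloc variant" mentioned
above. Unlike the snippet quoted above, the sketch passes
VM_NO_HUGE_VMAP as a separate vm_flags argument rather than ORing it
into the GFP mask, since in the kernel those are distinct flag spaces.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for kernel definitions; values are illustrative. */
typedef unsigned int gfp_t;
#define GFP_KERNEL      0x1u
#define __GFP_ZERO      0x2u
#define VM_NO_HUGE_VMAP 0x400UL
#define NUMA_NO_NODE    (-1)

/* Record what the mock last received, so the wrappers can be checked. */
static unsigned long last_vm_flags;
static gfp_t last_gfp_mask;

/*
 * Mock of the low-level allocator (assumption: simplified signature).
 * It is static, mirroring the fact that the real function is not
 * exported to modules.
 */
static void *mock_vmalloc_node_range(unsigned long size, gfp_t gfp_mask,
                                     unsigned long vm_flags, int node)
{
    (void)node; /* NUMA placement is not modeled here */
    last_vm_flags = vm_flags;
    last_gfp_mask = gfp_mask;
    return (gfp_mask & __GFP_ZERO) ? calloc(1, size) : malloc(size);
}

/* Narrow wrapper: callers choose only the size, never the flags. */
void *vmalloc_no_huge(unsigned long size)
{
    return mock_vmalloc_node_range(size, GFP_KERNEL,
                                   VM_NO_HUGE_VMAP, NUMA_NO_NODE);
}

/* Hypothetical zeroing variant of the same wrapper. */
void *vzalloc_no_huge(unsigned long size)
{
    return mock_vmalloc_node_range(size, GFP_KERNEL | __GFP_ZERO,
                                   VM_NO_HUGE_VMAP, NUMA_NO_NODE);
}
```

In this model, a call such as vmalloc_no_huge(4096) forwards GFP_KERNEL
as the allocation mask and VM_NO_HUGE_VMAP as the mapping flag, while
the low-level function itself stays out of reach of callers.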

Patch

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a13ac524f6ff..bd6fa160b31b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2937,6 +2937,7 @@  void *__vmalloc_node_range(unsigned long size, unsigned long align,
 
 	return NULL;
 }
+EXPORT_SYMBOL_GPL(__vmalloc_node_range);
 
 /**
  * __vmalloc_node - allocate virtually contiguous memory