diff mbox series

arm64: mm: reduce swiotlb size when dynamic swiotlb enabled

Message ID 20240508132300.691518-1-wangkefeng.wang@huawei.com (mailing list archive)
State New, archived
Headers show
Series arm64: mm: reduce swiotlb size when dynamic swiotlb enabled | expand

Commit Message

Kefeng Wang May 8, 2024, 1:23 p.m. UTC
After commit a1e50a82256e ("arm64: Increase the swiotlb buffer size
64MB"), the swiotlb buffer size increased to 64M in case of 32-bit only
devices require many bounce buffering via swiotlb, but with the
CONFIG_SWIOTLB_DYNAMIC enabled, we could reduce swiotlb size from 64M to
4M(MAX_ORDER_NR_PAGES << PAGE_SHIFT) again since swiotlb buffer size could
be allocated danamicly, and this should save 60M for most platform which
don't require too much swiotlb buffer.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/arm64/mm/init.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Comments

Will Deacon May 10, 2024, 11:42 a.m. UTC | #1
[+Isaac]

On Wed, May 08, 2024 at 09:23:00PM +0800, Kefeng Wang wrote:
> After commit a1e50a82256e ("arm64: Increase the swiotlb buffer size
> 64MB"), the swiotlb buffer size increased to 64M in case of 32-bit only
> devices require many bounce buffering via swiotlb, but with the
> CONFIG_SWIOTLB_DYNAMIC enabled, we could reduce swiotlb size from 64M to
> 4M(MAX_ORDER_NR_PAGES << PAGE_SHIFT) again since swiotlb buffer size could
> be allocated danamicly, and this should save 60M for most platform which
> don't require too much swiotlb buffer.
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  arch/arm64/mm/init.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 9b5ab6818f7f..425222c13d97 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -370,18 +370,23 @@ void __init bootmem_init(void)
>  void __init mem_init(void)
>  {
>  	bool swiotlb = max_pfn > PFN_DOWN(arm64_dma_phys_limit);
> +	unsigned long size = 0;
>  
>  	if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) && !swiotlb) {
>  		/*
>  		 * If no bouncing needed for ZONE_DMA, reduce the swiotlb
>  		 * buffer for kmalloc() bouncing to 1MB per 1GB of RAM.
>  		 */
> -		unsigned long size =
> -			DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
> -		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
> +		size = DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
>  		swiotlb = true;
>  	}
>  
> +	if (IS_ENABLED(CONFIG_SWIOTLB_DYNAMIC) && !size)
> +		size = MAX_ORDER_NR_PAGES << PAGE_SHIFT;
> +
> +	if (size)
> +		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));

If 64MiB is deemed too much, can't you just provide an alternative size
on the kernel cmdline (swiotlb=) instead of adding more heuristics here?

Otherwise, if you really want to change the default, then I think
swiotlb_size_or_default() is the place to do it, not in the arch code.

Will
Kefeng Wang May 11, 2024, 7:45 a.m. UTC | #2
On 2024/5/10 19:42, Will Deacon wrote:
> [+Isaac]
> 
> On Wed, May 08, 2024 at 09:23:00PM +0800, Kefeng Wang wrote:
>> After commit a1e50a82256e ("arm64: Increase the swiotlb buffer size
>> 64MB"), the swiotlb buffer size increased to 64M in case of 32-bit only
>> devices require many bounce buffering via swiotlb, but with the
>> CONFIG_SWIOTLB_DYNAMIC enabled, we could reduce swiotlb size from 64M to
>> 4M(MAX_ORDER_NR_PAGES << PAGE_SHIFT) again since swiotlb buffer size could
>> be allocated danamicly, and this should save 60M for most platform which
>> don't require too much swiotlb buffer.
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>>   arch/arm64/mm/init.c | 11 ++++++++---
>>   1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 9b5ab6818f7f..425222c13d97 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -370,18 +370,23 @@ void __init bootmem_init(void)
>>   void __init mem_init(void)
>>   {
>>   	bool swiotlb = max_pfn > PFN_DOWN(arm64_dma_phys_limit);
>> +	unsigned long size = 0;
>>   
>>   	if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) && !swiotlb) {
>>   		/*
>>   		 * If no bouncing needed for ZONE_DMA, reduce the swiotlb
>>   		 * buffer for kmalloc() bouncing to 1MB per 1GB of RAM.
>>   		 */
>> -		unsigned long size =
>> -			DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
>> -		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
>> +		size = DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
>>   		swiotlb = true;
>>   	}
>>   
>> +	if (IS_ENABLED(CONFIG_SWIOTLB_DYNAMIC) && !size)
>> +		size = MAX_ORDER_NR_PAGES << PAGE_SHIFT;
>> +
>> +	if (size)
>> +		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
> 
> If 64MiB is deemed too much, can't you just provide an alternative size
> on the kernel cmdline (swiotlb=) instead of adding more heuristics here?

This is not very suitable for distribution.
> 
> Otherwise, if you really want to change the default, then I think
> swiotlb_size_or_default() is the place to do it, not in the arch code.

I don't know the size used by swiotlb in other archs, and the switlb
can't grow if rmap not null even CONFIG_SWIOTLB_DYNAMIC enabled, eg,
pci_xen_swiotlb_init() on x86.

On arm64, the default size is 4MiB from 3.15 to 4.0(until the a1e50a822,
I reported a 32-bit only sata device which uses lots of swiotlb, and no
other reports same issue), and now I think most of platform are 64bit
device, so with SWIOTLB_DYNAMIC, change it back to 4MiB is feasible on
arm64, and the size is adjusted according to
CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC, so add a new adjust only arm64 changed.

Thanks.
Anshuman Khandual May 13, 2024, 4:35 a.m. UTC | #3
On 5/11/24 13:15, Kefeng Wang wrote:
> 
> 
> On 2024/5/10 19:42, Will Deacon wrote:
>> [+Isaac]
>>
>> On Wed, May 08, 2024 at 09:23:00PM +0800, Kefeng Wang wrote:
>>> After commit a1e50a82256e ("arm64: Increase the swiotlb buffer size
>>> 64MB"), the swiotlb buffer size increased to 64M in case of 32-bit only
>>> devices require many bounce buffering via swiotlb, but with the
>>> CONFIG_SWIOTLB_DYNAMIC enabled, we could reduce swiotlb size from 64M to
>>> 4M(MAX_ORDER_NR_PAGES << PAGE_SHIFT) again since swiotlb buffer size could
>>> be allocated danamicly, and this should save 60M for most platform which
>>> don't require too much swiotlb buffer.
>>>
>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> ---
>>>   arch/arm64/mm/init.c | 11 ++++++++---
>>>   1 file changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>> index 9b5ab6818f7f..425222c13d97 100644
>>> --- a/arch/arm64/mm/init.c
>>> +++ b/arch/arm64/mm/init.c
>>> @@ -370,18 +370,23 @@ void __init bootmem_init(void)
>>>   void __init mem_init(void)
>>>   {
>>>       bool swiotlb = max_pfn > PFN_DOWN(arm64_dma_phys_limit);
>>> +    unsigned long size = 0;
>>>         if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) && !swiotlb) {
>>>           /*
>>>            * If no bouncing needed for ZONE_DMA, reduce the swiotlb
>>>            * buffer for kmalloc() bouncing to 1MB per 1GB of RAM.
>>>            */
>>> -        unsigned long size =
>>> -            DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
>>> -        swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
>>> +        size = DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
>>>           swiotlb = true;
>>>       }
>>>   +    if (IS_ENABLED(CONFIG_SWIOTLB_DYNAMIC) && !size)
>>> +        size = MAX_ORDER_NR_PAGES << PAGE_SHIFT;
>>> +
>>> +    if (size)
>>> +        swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
>>
>> If 64MiB is deemed too much, can't you just provide an alternative size
>> on the kernel cmdline (swiotlb=) instead of adding more heuristics here?
> 
> This is not very suitable for distribution.

But wondering why 64MB is even considered too much ? Do you really have
systems where this saving of ~60MB i.e current (60MB) - proposed (4MB)
makes any material impact ? Also would not this proposed default size
i.e (MAX_ORDER_NR_PAGES << PAGE_SHIFT) be different for all 4K|16K|64K
base page sizes ?

arch/arm64/Kconfig

#     | SECTION_SIZE_BITS |  PAGE_SHIFT  |  max MAX_PAGE_ORDER  |  default MAX_PAGE_ORDER |
# ----+-------------------+--------------+----------------------+-------------------------+
# 4K  |       27          |      12      |       15             |         10              |
# 16K |       27          |      14      |       13             |         11              |
# 64K |       29          |      16      |       13             |         13              |
config ARCH_FORCE_MAX_ORDER
        int
        default "13" if ARM64_64K_PAGES
        default "11" if ARM64_16K_PAGES
        default "10"
        help

>>
>> Otherwise, if you really want to change the default, then I think
>> swiotlb_size_or_default() is the place to do it, not in the arch code.
> 
> I don't know the size used by swiotlb in other archs, and the switlb
> can't grow if rmap not null even CONFIG_SWIOTLB_DYNAMIC enabled, eg,
> pci_xen_swiotlb_init() on x86.

Which makes reducing swiotlb_size_or_default() problematic ?

> 
> On arm64, the default size is 4MiB from 3.15 to 4.0(until the a1e50a822,
> I reported a 32-bit only sata device which uses lots of swiotlb, and no
> other reports same issue), and now I think most of platform are 64bit
> device, so with SWIOTLB_DYNAMIC, change it back to 4MiB is feasible on
> arm64, and the size is adjusted according to
> CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC, so add a new adjust only arm64 changed.

TBH, proposed change here looks good. Only point being does SWIOTLB size
selection needs to add additional complexities given 64MB does not waste
too much memory in the first place.
Kefeng Wang May 16, 2024, 7:26 a.m. UTC | #4
On 2024/5/13 12:35, Anshuman Khandual wrote:
> On 5/11/24 13:15, Kefeng Wang wrote:
>>
>>
>> On 2024/5/10 19:42, Will Deacon wrote:
>>> [+Isaac]
>>>
>>> On Wed, May 08, 2024 at 09:23:00PM +0800, Kefeng Wang wrote:
>>>> After commit a1e50a82256e ("arm64: Increase the swiotlb buffer size
>>>> 64MB"), the swiotlb buffer size increased to 64M in case of 32-bit only
>>>> devices require many bounce buffering via swiotlb, but with the
>>>> CONFIG_SWIOTLB_DYNAMIC enabled, we could reduce swiotlb size from 64M to
>>>> 4M(MAX_ORDER_NR_PAGES << PAGE_SHIFT) again since swiotlb buffer size could
>>>> be allocated danamicly, and this should save 60M for most platform which
>>>> don't require too much swiotlb buffer.
>>>>
>>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>> ---

...
>>>
>>> If 64MiB is deemed too much, can't you just provide an alternative size
>>> on the kernel cmdline (swiotlb=) instead of adding more heuristics here?
>>
>> This is not very suitable for distribution.
> 
> But wondering why 64MB is even considered too much ? Do you really have
> systems where this saving of ~60MB i.e current (60MB) - proposed (4MB)
> makes any material impact ? Also would not this proposed default size


1) arm64 used 4MiB before, and no issue reported except me
2) check /sys/kernel/debug/swiotlb/io_tlb_used_hiwater on our new
Kunpeng, the swiotlb used very little.

No obvious material impact, and no evaluate about it, maybe reduce some
oom, but it could save memory and reduce reserve memory.

> i.e (MAX_ORDER_NR_PAGES << PAGE_SHIFT) be different for all 4K|16K|64K
> base page sizes ?

Yes, MAX_ORDER_NR_PAGES << PAGE_SHIFT is different, but we choose
min(swiotlb_size_or_default(), MAX_ORDER_NR_PAGES << PAGE_SHIFT),
for large PAGE SIZE, the reserve size is still large, maybe reduce
it for ARM64_64K_PAGES/ARM64_16K_PAGES if necessary.

> 
> arch/arm64/Kconfig
> 
> #     | SECTION_SIZE_BITS |  PAGE_SHIFT  |  max MAX_PAGE_ORDER  |  default MAX_PAGE_ORDER |
> # ----+-------------------+--------------+----------------------+-------------------------+
> # 4K  |       27          |      12      |       15             |         10              |
> # 16K |       27          |      14      |       13             |         11              |
> # 64K |       29          |      16      |       13             |         13              |
> config ARCH_FORCE_MAX_ORDER
>          int
>          default "13" if ARM64_64K_PAGES
>          default "11" if ARM64_16K_PAGES
>          default "10"
>          help
> 
>>>
>>> Otherwise, if you really want to change the default, then I think
>>> swiotlb_size_or_default() is the place to do it, not in the arch code.
>>
>> I don't know the size used by swiotlb in other archs, and the switlb
>> can't grow if rmap not null even CONFIG_SWIOTLB_DYNAMIC enabled, eg,
>> pci_xen_swiotlb_init() on x86.
> 
> Which makes reducing swiotlb_size_or_default() problematic ?

Not sure about it, also I don't know how much the swiotlb buffer are
used by other Arches, so only change arm64.

> 
>>
>> On arm64, the default size is 4MiB from 3.15 to 4.0(until the a1e50a822,
>> I reported a 32-bit only sata device which uses lots of swiotlb, and no
>> other reports same issue), and now I think most of platform are 64bit
>> device, so with SWIOTLB_DYNAMIC, change it back to 4MiB is feasible on
>> arm64, and the size is adjusted according to
>> CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC, so add a new adjust only arm64 changed.
> 
> TBH, proposed change here looks good. Only point being does SWIOTLB size
> selection needs to add additional complexities given 64MB does not waste
> too much memory in the first place.

Yes, very little complexities.
diff mbox series

Patch

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 9b5ab6818f7f..425222c13d97 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -370,18 +370,23 @@  void __init bootmem_init(void)
 void __init mem_init(void)
 {
 	bool swiotlb = max_pfn > PFN_DOWN(arm64_dma_phys_limit);
+	unsigned long size = 0;
 
 	if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) && !swiotlb) {
 		/*
 		 * If no bouncing needed for ZONE_DMA, reduce the swiotlb
 		 * buffer for kmalloc() bouncing to 1MB per 1GB of RAM.
 		 */
-		unsigned long size =
-			DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
-		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
+		size = DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
 		swiotlb = true;
 	}
 
+	if (IS_ENABLED(CONFIG_SWIOTLB_DYNAMIC) && !size)
+		size = MAX_ORDER_NR_PAGES << PAGE_SHIFT;
+
+	if (size)
+		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
+
 	swiotlb_init(swiotlb, SWIOTLB_VERBOSE);
 
 	/* this will put all unused low memory onto the freelists */