| Message ID | 20240508132300.691518-1-wangkefeng.wang@huawei.com |
|---|---|
| State | New, archived |
| Series | arm64: mm: reduce swiotlb size when dynamic swiotlb enabled |
[+Isaac]

On Wed, May 08, 2024 at 09:23:00PM +0800, Kefeng Wang wrote:
> After commit a1e50a82256e ("arm64: Increase the swiotlb buffer size
> 64MB"), the swiotlb buffer size was increased to 64M for the case where
> 32-bit-only devices require a lot of bounce buffering via swiotlb. But
> with CONFIG_SWIOTLB_DYNAMIC enabled, we can reduce the swiotlb size
> from 64M back to 4M (MAX_ORDER_NR_PAGES << PAGE_SHIFT), since the
> swiotlb buffer can now be grown dynamically. This should save 60M on
> most platforms, which don't require much swiotlb buffer.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  arch/arm64/mm/init.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 9b5ab6818f7f..425222c13d97 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -370,18 +370,23 @@ void __init bootmem_init(void)
>  void __init mem_init(void)
>  {
>  	bool swiotlb = max_pfn > PFN_DOWN(arm64_dma_phys_limit);
> +	unsigned long size = 0;
>
>  	if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) && !swiotlb) {
>  		/*
>  		 * If no bouncing needed for ZONE_DMA, reduce the swiotlb
>  		 * buffer for kmalloc() bouncing to 1MB per 1GB of RAM.
>  		 */
> -		unsigned long size =
> -			DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
> -		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
> +		size = DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
>  		swiotlb = true;
>  	}
>
> +	if (IS_ENABLED(CONFIG_SWIOTLB_DYNAMIC) && !size)
> +		size = MAX_ORDER_NR_PAGES << PAGE_SHIFT;
> +
> +	if (size)
> +		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));

If 64MiB is deemed too much, can't you just provide an alternative size
on the kernel cmdline (swiotlb=) instead of adding more heuristics here?

Otherwise, if you really want to change the default, then I think
swiotlb_size_or_default() is the place to do it, not in the arch code.

Will
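For reference, a minimal sketch of the second alternative mentioned above, i.e. changing the default in the core swiotlb code rather than in arch/arm64. This is purely illustrative and not part of the posted patch; it assumes the default_nslabs and IO_TLB_SHIFT definitions from kernel/dma/swiotlb.c and leaves open how an explicit command-line override should be honoured:

/*
 * Illustrative sketch only, not the posted patch: start the default
 * swiotlb pool small whenever it can be grown on demand. Assumes
 * default_nslabs and IO_TLB_SHIFT from kernel/dma/swiotlb.c; a real
 * change would also have to skip the cap when the user passed an
 * explicit swiotlb=<nslabs> on the command line.
 */
unsigned long swiotlb_size_or_default(void)
{
	unsigned long size = default_nslabs << IO_TLB_SHIFT;

	if (IS_ENABLED(CONFIG_SWIOTLB_DYNAMIC))
		size = min(size, (unsigned long)MAX_ORDER_NR_PAGES << PAGE_SHIFT);

	return size;
}

The first alternative, swiotlb= on the command line, sets the slab count directly (slabs are 2KiB, so e.g. swiotlb=2048 corresponds to roughly 4MiB), which is what the reply below argues is awkward for distribution kernels.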
On 2024/5/10 19:42, Will Deacon wrote:
> [+Isaac]
>
> On Wed, May 08, 2024 at 09:23:00PM +0800, Kefeng Wang wrote:
>> After commit a1e50a82256e ("arm64: Increase the swiotlb buffer size
>> 64MB"), the swiotlb buffer size was increased to 64M for the case where
>> 32-bit-only devices require a lot of bounce buffering via swiotlb. But
>> with CONFIG_SWIOTLB_DYNAMIC enabled, we can reduce the swiotlb size
>> from 64M back to 4M (MAX_ORDER_NR_PAGES << PAGE_SHIFT), since the
>> swiotlb buffer can now be grown dynamically. This should save 60M on
>> most platforms, which don't require much swiotlb buffer.
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>>  arch/arm64/mm/init.c | 11 ++++++++---
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 9b5ab6818f7f..425222c13d97 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -370,18 +370,23 @@ void __init bootmem_init(void)
>>  void __init mem_init(void)
>>  {
>>  	bool swiotlb = max_pfn > PFN_DOWN(arm64_dma_phys_limit);
>> +	unsigned long size = 0;
>>
>>  	if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) && !swiotlb) {
>>  		/*
>>  		 * If no bouncing needed for ZONE_DMA, reduce the swiotlb
>>  		 * buffer for kmalloc() bouncing to 1MB per 1GB of RAM.
>>  		 */
>> -		unsigned long size =
>> -			DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
>> -		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
>> +		size = DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
>>  		swiotlb = true;
>>  	}
>>
>> +	if (IS_ENABLED(CONFIG_SWIOTLB_DYNAMIC) && !size)
>> +		size = MAX_ORDER_NR_PAGES << PAGE_SHIFT;
>> +
>> +	if (size)
>> +		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
>
> If 64MiB is deemed too much, can't you just provide an alternative size
> on the kernel cmdline (swiotlb=) instead of adding more heuristics here?

This is not very suitable for distributions.

>
> Otherwise, if you really want to change the default, then I think
> swiotlb_size_or_default() is the place to do it, not in the arch code.

I don't know the size used by swiotlb on other architectures, and the
swiotlb can't grow if a remap callback is set even with
CONFIG_SWIOTLB_DYNAMIC enabled, e.g. pci_xen_swiotlb_init() on x86.

On arm64, the default size was 4MiB from v3.15 to v4.0 (until commit
a1e50a822; I reported a 32-bit-only SATA device which used lots of
swiotlb, and nobody else reported the same issue). Nowadays I think most
platforms have 64-bit devices, so with SWIOTLB_DYNAMIC, changing the
default back to 4MiB is feasible on arm64. The size is also adjusted
according to CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC, so this patch adds a
new adjustment that only changes arm64.

Thanks.
On 5/11/24 13:15, Kefeng Wang wrote:
>
>
> On 2024/5/10 19:42, Will Deacon wrote:
>> [+Isaac]
>>
>> On Wed, May 08, 2024 at 09:23:00PM +0800, Kefeng Wang wrote:
>>> After commit a1e50a82256e ("arm64: Increase the swiotlb buffer size
>>> 64MB"), the swiotlb buffer size was increased to 64M for the case where
>>> 32-bit-only devices require a lot of bounce buffering via swiotlb. But
>>> with CONFIG_SWIOTLB_DYNAMIC enabled, we can reduce the swiotlb size
>>> from 64M back to 4M (MAX_ORDER_NR_PAGES << PAGE_SHIFT), since the
>>> swiotlb buffer can now be grown dynamically. This should save 60M on
>>> most platforms, which don't require much swiotlb buffer.
>>>
>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> ---
>>>  arch/arm64/mm/init.c | 11 ++++++++---
>>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>> index 9b5ab6818f7f..425222c13d97 100644
>>> --- a/arch/arm64/mm/init.c
>>> +++ b/arch/arm64/mm/init.c
>>> @@ -370,18 +370,23 @@ void __init bootmem_init(void)
>>>  void __init mem_init(void)
>>>  {
>>>  	bool swiotlb = max_pfn > PFN_DOWN(arm64_dma_phys_limit);
>>> +	unsigned long size = 0;
>>>  	if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) && !swiotlb) {
>>>  		/*
>>>  		 * If no bouncing needed for ZONE_DMA, reduce the swiotlb
>>>  		 * buffer for kmalloc() bouncing to 1MB per 1GB of RAM.
>>>  		 */
>>> -		unsigned long size =
>>> -			DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
>>> -		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
>>> +		size = DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
>>>  		swiotlb = true;
>>>  	}
>>> +	if (IS_ENABLED(CONFIG_SWIOTLB_DYNAMIC) && !size)
>>> +		size = MAX_ORDER_NR_PAGES << PAGE_SHIFT;
>>> +
>>> +	if (size)
>>> +		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
>>
>> If 64MiB is deemed too much, can't you just provide an alternative size
>> on the kernel cmdline (swiotlb=) instead of adding more heuristics here?
>
> This is not very suitable for distributions.

But I am wondering why 64MB is even considered too much? Do you really
have systems where this saving of ~60MB, i.e. current (64MB) - proposed
(4MB), makes any material impact? Also, would not this proposed default
size, i.e. (MAX_ORDER_NR_PAGES << PAGE_SHIFT), be different for the
4K|16K|64K base page sizes?

arch/arm64/Kconfig

#     | SECTION_SIZE_BITS | PAGE_SHIFT | max MAX_PAGE_ORDER | default MAX_PAGE_ORDER |
# ----+-------------------+------------+--------------------+------------------------+
# 4K  |        27         |     12     |         15         |           10           |
# 16K |        27         |     14     |         13         |           11           |
# 64K |        29         |     16     |         13         |           13           |
config ARCH_FORCE_MAX_ORDER
	int
	default "13" if ARM64_64K_PAGES
	default "11" if ARM64_16K_PAGES
	default "10"
	help

>>
>> Otherwise, if you really want to change the default, then I think
>> swiotlb_size_or_default() is the place to do it, not in the arch code.
>
> I don't know the size used by swiotlb on other architectures, and the
> swiotlb can't grow if a remap callback is set even with
> CONFIG_SWIOTLB_DYNAMIC enabled, e.g. pci_xen_swiotlb_init() on x86.

Which makes reducing swiotlb_size_or_default() problematic?

>
> On arm64, the default size was 4MiB from v3.15 to v4.0 (until commit
> a1e50a822; I reported a 32-bit-only SATA device which used lots of
> swiotlb, and nobody else reported the same issue). Nowadays I think most
> platforms have 64-bit devices, so with SWIOTLB_DYNAMIC, changing the
> default back to 4MiB is feasible on arm64. The size is also adjusted
> according to CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC, so this patch adds a
> new adjustment that only changes arm64.

TBH, the proposed change here looks good. The only point is whether
SWIOTLB size selection needs the additional complexity, given that 64MB
does not waste too much memory in the first place.
On 2024/5/13 12:35, Anshuman Khandual wrote:
> On 5/11/24 13:15, Kefeng Wang wrote:
>>
>>
>> On 2024/5/10 19:42, Will Deacon wrote:
>>> [+Isaac]
>>>
>>> On Wed, May 08, 2024 at 09:23:00PM +0800, Kefeng Wang wrote:
>>>> After commit a1e50a82256e ("arm64: Increase the swiotlb buffer size
>>>> 64MB"), the swiotlb buffer size was increased to 64M for the case where
>>>> 32-bit-only devices require a lot of bounce buffering via swiotlb. But
>>>> with CONFIG_SWIOTLB_DYNAMIC enabled, we can reduce the swiotlb size
>>>> from 64M back to 4M (MAX_ORDER_NR_PAGES << PAGE_SHIFT), since the
>>>> swiotlb buffer can now be grown dynamically. This should save 60M on
>>>> most platforms, which don't require much swiotlb buffer.
>>>>
>>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>> ---
...
>>>
>>> If 64MiB is deemed too much, can't you just provide an alternative size
>>> on the kernel cmdline (swiotlb=) instead of adding more heuristics here?
>>
>> This is not very suitable for distributions.
>
> But I am wondering why 64MB is even considered too much? Do you really
> have systems where this saving of ~60MB, i.e. current (64MB) - proposed
> (4MB), makes any material impact? Also, would not this proposed default
> size

1) arm64 used 4MiB before, and no issue was reported except mine.

2) Checking /sys/kernel/debug/swiotlb/io_tlb_used_hiwater on our new
Kunpeng platform, very little of the swiotlb is actually used.

There is no obvious material impact and I have not evaluated it in
detail; maybe it avoids some OOMs, but it does save memory and reduces
the reserved memory.

> i.e. (MAX_ORDER_NR_PAGES << PAGE_SHIFT), be different for the 4K|16K|64K
> base page sizes?

Yes, MAX_ORDER_NR_PAGES << PAGE_SHIFT is different, but we choose
min(swiotlb_size_or_default(), MAX_ORDER_NR_PAGES << PAGE_SHIFT), so for
the larger page sizes the reserved size is still large; maybe reduce it
for ARM64_64K_PAGES/ARM64_16K_PAGES if necessary.

>
> arch/arm64/Kconfig
>
> #     | SECTION_SIZE_BITS | PAGE_SHIFT | max MAX_PAGE_ORDER | default MAX_PAGE_ORDER |
> # ----+-------------------+------------+--------------------+------------------------+
> # 4K  |        27         |     12     |         15         |           10           |
> # 16K |        27         |     14     |         13         |           11           |
> # 64K |        29         |     16     |         13         |           13           |
> config ARCH_FORCE_MAX_ORDER
> 	int
> 	default "13" if ARM64_64K_PAGES
> 	default "11" if ARM64_16K_PAGES
> 	default "10"
> 	help
>
>>>
>>> Otherwise, if you really want to change the default, then I think
>>> swiotlb_size_or_default() is the place to do it, not in the arch code.
>>
>> I don't know the size used by swiotlb on other architectures, and the
>> swiotlb can't grow if a remap callback is set even with
>> CONFIG_SWIOTLB_DYNAMIC enabled, e.g. pci_xen_swiotlb_init() on x86.
>
> Which makes reducing swiotlb_size_or_default() problematic?

Not sure about that; also I don't know how much swiotlb buffer the other
architectures use, which is why only arm64 is changed.

>
>>
>> On arm64, the default size was 4MiB from v3.15 to v4.0 (until commit
>> a1e50a822; I reported a 32-bit-only SATA device which used lots of
>> swiotlb, and nobody else reported the same issue). Nowadays I think most
>> platforms have 64-bit devices, so with SWIOTLB_DYNAMIC, changing the
>> default back to 4MiB is feasible on arm64. The size is also adjusted
>> according to CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC, so this patch adds a
>> new adjustment that only changes arm64.
>
> TBH, the proposed change here looks good. The only point is whether
> SWIOTLB size selection needs the additional complexity, given that 64MB
> does not waste too much memory in the first place.

Yes, it adds very little complexity.
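For reference, here is how the proposed default works out per base page size, using the ARCH_FORCE_MAX_ORDER defaults quoted above and taking MAX_ORDER_NR_PAGES as 1 << MAX_PAGE_ORDER (an editorial calculation, not from the thread):

 4K  pages: (1 << 10) * 4KiB  =   4MiB -> min(64MiB,   4MiB) =  4MiB
 16K pages: (1 << 11) * 16KiB =  32MiB -> min(64MiB,  32MiB) = 32MiB
 64K pages: (1 << 13) * 64KiB = 512MiB -> min(64MiB, 512MiB) = 64MiB

So the full saving only materialises on 4K-page kernels; 16K kernels would reserve 32MiB, and 64K kernels keep the existing 64MiB because of the min() with swiotlb_size_or_default().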
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 9b5ab6818f7f..425222c13d97 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -370,18 +370,23 @@ void __init bootmem_init(void)
 void __init mem_init(void)
 {
 	bool swiotlb = max_pfn > PFN_DOWN(arm64_dma_phys_limit);
+	unsigned long size = 0;
 
 	if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) && !swiotlb) {
 		/*
 		 * If no bouncing needed for ZONE_DMA, reduce the swiotlb
 		 * buffer for kmalloc() bouncing to 1MB per 1GB of RAM.
 		 */
-		unsigned long size =
-			DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
-		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
+		size = DIV_ROUND_UP(memblock_phys_mem_size(), 1024);
 		swiotlb = true;
 	}
 
+	if (IS_ENABLED(CONFIG_SWIOTLB_DYNAMIC) && !size)
+		size = MAX_ORDER_NR_PAGES << PAGE_SHIFT;
+
+	if (size)
+		swiotlb_adjust_size(min(swiotlb_size_or_default(), size));
+
 	swiotlb_init(swiotlb, SWIOTLB_VERBOSE);
 
 	/* this will put all unused low memory onto the freelists */
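To make the new sizing logic concrete, the following walk-through shows how the patched mem_init() resolves in each case. It is an editorial illustration based only on the hunk above; the 4K page size and 16GiB of RAM are assumptions for the example:

/*
 * 1. ZONE_DMA bouncing needed (swiotlb == true on entry):
 *      - without CONFIG_SWIOTLB_DYNAMIC: size stays 0, no adjustment,
 *        the existing 64MiB default is kept;
 *      - with CONFIG_SWIOTLB_DYNAMIC: size = 4MiB, so
 *        swiotlb_adjust_size(min(64MiB, 4MiB)) reserves 4MiB up front
 *        and relies on dynamic growth for anything beyond that.
 *
 * 2. No ZONE_DMA bouncing, CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC=y:
 *      - size = RAM / 1024 (1MiB per 1GiB), e.g. 16MiB for 16GiB of RAM;
 *        the CONFIG_SWIOTLB_DYNAMIC branch is skipped because size is
 *        already non-zero, so min(64MiB, 16MiB) = 16MiB, as before the
 *        patch.
 *
 * 3. Neither applies:
 *      - with CONFIG_SWIOTLB_DYNAMIC the default is still trimmed to
 *        4MiB, but swiotlb == false means swiotlb_init() does not
 *        reserve a bounce buffer at boot anyway;
 *      - without it, size stays 0 and nothing changes.
 */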
After commit a1e50a82256e ("arm64: Increase the swiotlb buffer size
64MB"), the swiotlb buffer size was increased to 64M for the case where
32-bit-only devices require a lot of bounce buffering via swiotlb. But
with CONFIG_SWIOTLB_DYNAMIC enabled, we can reduce the swiotlb size from
64M back to 4M (MAX_ORDER_NR_PAGES << PAGE_SHIFT), since the swiotlb
buffer can now be grown dynamically. This should save 60M on most
platforms, which don't require much swiotlb buffer.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/arm64/mm/init.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)