Message ID | 20181210062146.24951-1-aghiti@upmem.com (mailing list archive)
---|---
Series | Hugetlbfs support for riscv
Hi everyone,

Can I do something more regarding this series? Anything that would ease the
review process, do not hesitate to ask.

Thanks,

Alex

On 12/10/18 6:21 AM, Alexandre Ghiti wrote:
> This series introduces hugetlbfs support for both riscv 32/64. Riscv32
> is architecturally limited to huge pages of size 4MB, whereas riscv64 has
> 2MB/1G huge page support. Transparent huge page support is not
> implemented here; I will submit another series later.
>
> As stated in "The RISC-V Instruction Set Manual, Volume II: Privileged
> Architecture", riscv page table entries are marked as leaf entries
> as soon as at least one of the R/W/X bits is set:
>
> - pmd_huge/pud_huge check if any of those bits is set,
> - pte_mkhuge simply returns the same pte value and does not set any of
>   the R/W/X bits.
>
> CMA or (MEMORY_ISOLATION && COMPACTION) must be enabled so that boot
> reserved gigantic pages can be freed: indeed, one can reduce the number
> of huge pages by calling free_gigantic_pages, which in turn calls
> free_contig_range, which is defined only when those configs are enabled.
> However, I don't see any strong dependency between free_contig_range
> and those configs; maybe we could allow hugetlbfs users to free boot
> reserved hugepages without those configs activated. I will propose
> something if Mike Kravetz agrees.
>
> For the validation below, I activated CMA so that tests like counters do
> not fail when freeing pages.
> This series was validated using the libhugetlbfs testsuite ported to riscv64
> without linker script support
> (https://github.com/AlexGhiti/libhugetlbfs.git, branch dev/alex/riscv).
>
> - libhugetlbfs testsuite on riscv64/2M:
>   - brk_near_huge triggers an assert in malloc.c; it does not on x86.
>
> - libhugetlbfs testsuite on riscv64/1G:
>   - brk_near_huge triggers an assert in malloc.c; it does not on x86.
>   - mmap-gettest, mmap-cow: the testsuite passes the number of default free
>     pages as parameters and then fails for 1G, which is not the default.
>     Otherwise it succeeds when given the right number of pages.
>   - map_high_truncate_2 fails on x86 too: 0x60000000 is not 1G aligned
>     and fails at line 694 of fs/hugetlbfs/inode.c.
>   - heapshrink on 1G fails on x86 too, not investigated.
>   - counters.sh on 1G fails on x86 too: alloc_surplus_huge_page returns
>     NULL in the case of gigantic pages.
>   - icache-hygiene succeeds after patch #3 of this series, which lowers
>     the base address of mmap.
>   - fallocate_stress.sh on 1G never ends, on x86 too, not investigated.
>
> - libhugetlbfs testsuite on riscv32/4M: kernel build passes, lacks
>   libhugetlbfs support for 32bits.
>
> * Output for riscv64 2M and 1G libhugetlbfs testsuite:
>
> zero_filesize_segment (2M: 64):
> zero_filesize_segment (1024M: 64):
> test_root (2M: 64): PASS
> test_root (1024M: 64): PASS
> meminfo_nohuge (2M: 64): PASS
> meminfo_nohuge (1024M: 64): PASS
> gethugepagesize (2M: 64): PASS
> gethugepagesize (1024M: 64): PASS
> gethugepagesizes (2M: 64): PASS
> gethugepagesizes (1024M: 64): PASS
> HUGETLB_VERBOSE=1 empty_mounts (2M: 64): PASS
> HUGETLB_VERBOSE=1 empty_mounts (1024M: 64): PASS
> HUGETLB_VERBOSE=1 large_mounts (2M: 64): PASS
> HUGETLB_VERBOSE=1 large_mounts (1024M: 64): PASS
> find_path (2M: 64): PASS
> find_path (1024M: 64): PASS
> unlinked_fd (2M: 64): PASS
> unlinked_fd (1024M: 64): PASS
> readback (2M: 64): PASS
> readback (1024M: 64): PASS
> truncate (2M: 64): PASS
> truncate (1024M: 64): PASS
> shared (2M: 64): PASS
> shared (1024M: 64): PASS
> mprotect (2M: 64): PASS
> mprotect (1024M: 64): PASS
> mlock (2M: 64): PASS
> mlock (1024M: 64): PASS
> misalign (2M: 64): PASS
> misalign (1024M: 64): PASS
> fallocate_basic.sh (2M: 64): PASS
> fallocate_basic.sh (1024M: 64): PASS
> fallocate_align.sh (2M: 64): PASS
> fallocate_align.sh (1024M: 64): PASS
> ptrace-write-hugepage (2M: 64): PASS
> ptrace-write-hugepage (1024M: 64): PASS
> icache-hygiene (2M: 64): PASS
> icache-hygiene (1024M: 64): PASS
> slbpacaflush (2M: 64): PASS (inconclusive)
> slbpacaflush (1024M: 64): PASS (inconclusive)
> straddle_4GB_static (2M: 64): PASS
> straddle_4GB_static (1024M: 64): PASS
> huge_at_4GB_normal_below_static (2M: 64): PASS
> huge_at_4GB_normal_below_static (1024M: 64): PASS
> huge_below_4GB_normal_above_static (2M: 64): PASS
> huge_below_4GB_normal_above_static (1024M: 64): PASS
> map_high_truncate_2 (2M: 64): PASS
> map_high_truncate_2 (1024M: 64): FAIL ftruncate(): Invalid argument
> misaligned_offset (2M: 64): PASS (inconclusive)
> misaligned_offset (1024M: 64): PASS (inconclusive)
> truncate_above_4GB (2M: 64): PASS
> truncate_above_4GB (1024M: 64): PASS
> brk_near_huge (2M: 64): brk_near_huge: malloc.c:2385: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.
> brk_near_huge (1024M: 64): brk_near_huge: malloc.c:2385: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.
> task-size-overrun (2M: 64): PASS
> task-size-overrun (1024M: 64): PASS
> stack_grow_into_huge (2M: 64): PASS
> stack_grow_into_huge (1024M: 64): PASS
> corrupt-by-cow-opt (2M: 64): PASS
> corrupt-by-cow-opt (1024M: 64): PASS
> noresv-preserve-resv-page (2M: 64): PASS
> noresv-preserve-resv-page (1024M: 64): PASS
> noresv-regarded-as-resv (2M: 64): PASS
> noresv-regarded-as-resv (1024M: 64): PASS
> readahead_reserve.sh (2M: 64): PASS
> readahead_reserve.sh (1024M: 64): PASS
> madvise_reserve.sh (2M: 64): PASS
> madvise_reserve.sh (1024M: 64): PASS
> fadvise_reserve.sh (2M: 64): PASS
> fadvise_reserve.sh (1024M: 64): PASS
> mremap-expand-slice-collision.sh (2M: 64): PASS
> mremap-expand-slice-collision.sh (1024M: 64): PASS
> mremap-fixed-normal-near-huge.sh (2M: 64): PASS
> mremap-fixed-normal-near-huge.sh (1024M: 64): PASS
> mremap-fixed-huge-near-normal.sh (2M: 64): PASS
> mremap-fixed-huge-near-normal.sh (1024M: 64): PASS
> set shmmax limit to 67108864
> shm-perms (2M: 64): PASS
> private (2M: 64): PASS
> private (1024M: 64): PASS
> fork-cow (2M: 64): PASS
> fork-cow (1024M: 64): PASS
> direct (2M: 64): Bad configuration: Failed to open direct-IO file: Invalid argument
> direct (1024M: 64): Bad configuration: Failed to open direct-IO file: File exists
> malloc (2M: 64): PASS
> malloc (1024M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc (2M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc (1024M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:none HUGETLB_MORECORE=yes malloc (2M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:none HUGETLB_MORECORE=yes malloc (1024M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:malloc HUGETLB_MORECORE=yes malloc (2M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:malloc HUGETLB_MORECORE=yes malloc (1024M: 64): PASS
> malloc_manysmall (2M: 64): PASS
> malloc_manysmall (1024M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc_manysmall (2M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc_manysmall (1024M: 64): PASS
> heapshrink (2M: 64): PASS
> heapshrink (1024M: 64): PASS
> LD_PRELOAD=libheapshrink.so heapshrink (2M: 64): PASS
> LD_PRELOAD=libheapshrink.so heapshrink (1024M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes heapshrink (2M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes heapshrink (1024M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so libheapshrink.so HUGETLB_MORECORE=yes heapshrink (2M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so libheapshrink.so HUGETLB_MORECORE=yes heapshrink (1024M: 64): PASS
> LD_PRELOAD=libheapshrink.so HUGETLB_MORECORE_SHRINK=yes HUGETLB_MORECORE=yes heapshrink (2M: 64): PASS (inconclusive)
> LD_PRELOAD=libheapshrink.so HUGETLB_MORECORE_SHRINK=yes HUGETLB_MORECORE=yes heapshrink (1024M: 64): PASS (inconclusive)
> LD_PRELOAD=libhugetlbfs.so libheapshrink.so HUGETLB_MORECORE_SHRINK=yes HUGETLB_MORECORE=yes heapshrink (2M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so libheapshrink.so HUGETLB_MORECORE_SHRINK=yes HUGETLB_MORECORE=yes heapshrink (1024M: 64): FAIL Heap did not shrink
> HUGETLB_VERBOSE=1 HUGETLB_MORECORE=yes heap-overflow (2M: 64): PASS
> HUGETLB_VERBOSE=1 HUGETLB_MORECORE=yes heap-overflow (1024M: 64): PASS
> HUGETLB_VERBOSE=0 linkhuge_nofd (2M: 64):
> HUGETLB_VERBOSE=0 linkhuge_nofd (1024M: 64):
> LD_PRELOAD=libhugetlbfs.so HUGETLB_VERBOSE=0 linkhuge_nofd (2M: 64):
> LD_PRELOAD=libhugetlbfs.so HUGETLB_VERBOSE=0 linkhuge_nofd (1024M: 64):
> linkhuge (2M: 64):
> linkhuge (1024M: 64):
> LD_PRELOAD=libhugetlbfs.so linkhuge (2M: 64):
> LD_PRELOAD=libhugetlbfs.so linkhuge (1024M: 64):
> linkhuge_rw (2M: 64):
> linkhuge_rw (1024M: 64):
> HUGETLB_ELFMAP=R linkhuge_rw (2M: 64):
> HUGETLB_ELFMAP=R linkhuge_rw (1024M: 64):
> HUGETLB_ELFMAP=W linkhuge_rw (2M: 64):
> HUGETLB_ELFMAP=W linkhuge_rw (1024M: 64):
> HUGETLB_ELFMAP=RW linkhuge_rw (2M: 64):
> HUGETLB_ELFMAP=RW linkhuge_rw (1024M: 64):
> HUGETLB_ELFMAP=no linkhuge_rw (2M: 64):
> HUGETLB_ELFMAP=no linkhuge_rw (1024M: 64):
> HUGETLB_ELFMAP=R HUGETLB_MINIMAL_COPY=no linkhuge_rw (2M: 64):
> HUGETLB_ELFMAP=R HUGETLB_MINIMAL_COPY=no linkhuge_rw (1024M: 64):
> HUGETLB_ELFMAP=W HUGETLB_MINIMAL_COPY=no linkhuge_rw (2M: 64):
> HUGETLB_ELFMAP=W HUGETLB_MINIMAL_COPY=no linkhuge_rw (1024M: 64):
> HUGETLB_ELFMAP=RW HUGETLB_MINIMAL_COPY=no linkhuge_rw (2M: 64):
> HUGETLB_ELFMAP=RW HUGETLB_MINIMAL_COPY=no linkhuge_rw (1024M: 64):
> HUGETLB_SHARE=0 HUGETLB_ELFMAP=R linkhuge_rw (2M: 64):
> HUGETLB_SHARE=0 HUGETLB_ELFMAP=R linkhuge_rw (1024M: 64):
> HUGETLB_SHARE=1 HUGETLB_ELFMAP=R linkhuge_rw (2M: 64):
> HUGETLB_SHARE=1 HUGETLB_ELFMAP=R linkhuge_rw (1024M: 64):
> HUGETLB_SHARE=0 HUGETLB_ELFMAP=W linkhuge_rw (2M: 64):
> HUGETLB_SHARE=0 HUGETLB_ELFMAP=W linkhuge_rw (1024M: 64):
> HUGETLB_SHARE=1 HUGETLB_ELFMAP=W linkhuge_rw (2M: 64):
> HUGETLB_SHARE=1 HUGETLB_ELFMAP=W linkhuge_rw (1024M: 64):
> HUGETLB_SHARE=0 HUGETLB_ELFMAP=RW linkhuge_rw (2M: 64):
> HUGETLB_SHARE=0 HUGETLB_ELFMAP=RW linkhuge_rw (1024M: 64):
> HUGETLB_SHARE=1 HUGETLB_ELFMAP=RW linkhuge_rw (2M: 64):
> HUGETLB_SHARE=1 HUGETLB_ELFMAP=RW linkhuge_rw (1024M: 64):
> chunk-overcommit (2M: 64): PASS
> chunk-overcommit (1024M: 64): PASS
> alloc-instantiate-race shared (2M: 64): PASS
> alloc-instantiate-race shared (1024M: 64): PASS
> alloc-instantiate-race private (2M: 64): PASS
> alloc-instantiate-race private (1024M: 64): PASS
> truncate_reserve_wraparound (2M: 64): PASS
> truncate_reserve_wraparound (1024M: 64): PASS
> truncate_sigbus_versus_oom (2M: 64): PASS
> truncate_sigbus_versus_oom (1024M: 64): PASS
> get_huge_pages (2M: 64): PASS
> get_huge_pages (1024M: 64): PASS
> shmoverride_linked (2M: 64): PASS
> HUGETLB_SHM=yes shmoverride_linked (2M: 64): PASS
> shmoverride_linked_static (2M: 64):
> HUGETLB_SHM=yes shmoverride_linked_static (2M: 64):
> LD_PRELOAD=libhugetlbfs.so shmoverride_unlinked (2M: 64): PASS
> LD_PRELOAD=libhugetlbfs.so HUGETLB_SHM=yes shmoverride_unlinked (2M: 64): PASS
> quota.sh (2M: 64): PASS
> quota.sh (1024M: 64): PASS
> counters.sh (2M: 64): PASS
> counters.sh (1024M: 64): FAIL mmap failed: Invalid argument
> mmap-gettest 10 35 (2M: 64): PASS
> mmap-gettest 10 35 (1024M: 64): FAIL Failed to mmap the hugetlb file: Cannot allocate memory
> mmap-cow 34 35 (2M: 64): PASS
> mmap-cow 34 35 (1024M: 64): FAIL Thread 15 (pid=514) failed
> set shmmax limit to 73400320
> shm-fork 10 17 (2M: 64): PASS
> set shmmax limit to 73400320
> shm-fork 10 35 (2M: 64): PASS
> set shmmax limit to 73400320
> shm-getraw 35 /dev/full (2M: 64): PASS
> fallocate_stress.sh (2M: 64): libgcc_s.so.1 must be installed for pthread_cancel to work
> fallocate_stress.sh (1024M: 64):
> ********** TEST SUMMARY
> *                                2M              1024M
> *                           32-bit 64-bit   32-bit 64-bit
> *     Total testcases:          0     93        0     83
> *             Skipped:          0      0        0      0
> *                PASS:          0     69        0     56
> *                FAIL:          0      0        0      5
> *    Killed by signal:          0      1        0      2
> *   Bad configuration:          0      1        0      1
> *       Expected FAIL:          0      0        0      0
> *     Unexpected PASS:          0      0        0      0
> *    Test not present:          0     21        0     19
> * Strange test result:          0      1        0      0
> **********
>
> Alexandre Ghiti (3):
>   riscv: Introduce huge page support for 32/64bit kernel
>   riscv: Fix wrong comment about task size for riscv64
>   riscv: Adjust mmap base address at a third of task size
>
>  arch/riscv/Kconfig                 |  7 +++++++
>  arch/riscv/include/asm/hugetlb.h   | 22 ++++++++++++++++++++++
>  arch/riscv/include/asm/page.h      | 10 ++++++++++
>  arch/riscv/include/asm/pgtable.h   |  8 ++++++--
>  arch/riscv/include/asm/processor.h |  2 +-
>  arch/riscv/mm/Makefile             |  2 ++
>  arch/riscv/mm/hugetlbpage.c        | 36 ++++++++++++++++++++++++++++++++++++
>  7 files changed, 84 insertions(+), 3 deletions(-)
>  create mode 100644 arch/riscv/include/asm/hugetlb.h
>  create mode 100644 arch/riscv/mm/hugetlbpage.c
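The leaf-entry rule described in the cover letter is what makes the riscv huge page helpers nearly trivial. As an illustration only, pmd_huge and pud_huge reduce to a permission-bit test along the following lines (a sketch assuming the _PAGE_READ/_PAGE_WRITE/_PAGE_EXEC definitions from arch/riscv/include/asm/pgtable-bits.h; see patch #1 of the series for the actual implementation):

#include <linux/hugetlb.h>

int pmd_huge(pmd_t pmd)
{
	/* A present PMD with any of R/W/X set is a leaf, i.e. a huge mapping. */
	return pmd_present(pmd) &&
	       (pmd_val(pmd) & (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC));
}

int pud_huge(pud_t pud)
{
	/* Same test one level up: a leaf PUD maps a 1G page (riscv64 only). */
	return pud_present(pud) &&
	       (pud_val(pud) & (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC));
}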
On Mon, 7 Jan 2019, Alex Ghiti wrote:
> Can I do something more regarding this series? Anything that would ease the
> review process, do not hesitate to ask.

Does your series need to be rebased against v5.0-rc1 to apply cleanly?
If so, please rebase and repost.

- Paul
On 01/07/2019 10:52 PM, Paul Walmsley wrote:
> On Mon, 7 Jan 2019, Alex Ghiti wrote:
>
>> Can I do something more regarding this series? Anything that would ease the
>> review process, do not hesitate to ask.
>
> Does your series need to be rebased against v5.0-rc1 to apply cleanly?
> If so, please rebase and repost.
>
> - Paul

The series applies nicely on top of v5.0-rc1 without modifications.

Thanks,

Alex
On Mon, 07 Jan 2019 09:57:16 PST (-0800), alex@ghiti.fr wrote:
> Hi everyone,
>
> Can I do something more regarding this series? Anything that would ease
> the review process, do not hesitate to ask.

I don't actually see the original patch set in my inbox, which is why it
didn't get looked at. Can you give me a pointer so I can dig them up and
look? Sorry!

> Thanks,
>
> Alex
>
> On 12/10/18 6:21 AM, Alexandre Ghiti wrote:
>> This series introduces hugetlbfs support for both riscv 32/64. [...]
On 12/9/18 10:21 PM, Alexandre Ghiti wrote:
> This series introduces hugetlbfs support for both riscv 32/64. Riscv32
> is architecturally limited to huge pages of size 4MB whereas riscv64 has
> 2MB/1G huge pages support. Transparent huge page support is not
> implemented here, I will submit another series later.

Thanks for doing this.

Since the patches only touch riscv-specific code, I did not look too closely
and do not feel qualified to offer an opinion as to whether they are correct
for the architecture. With that said, the patches do look reasonable, with a
few comments below.

> CMA or (MEMORY_ISOLATION && COMPACTION) must be enabled so that boot
> reserved gigantic pages can be freed: indeed, one can reduce the number
> of huge pages by calling free_gigantic_pages which in turn calls
> free_contig_range, defined only with those configs defined.
> However I don't see any strong dependency between free_contig_range
> and those configs, maybe we could allow hugetlbfs users to free boot
> reserved hugepages without those configs activated, I will propose
> something if Mike Kravetz agrees.

Yes, that should be modified. I think it would be a simple matter of moving
free_contig_range out of that ifdef block. If you would like, I can submit
a patch for that.

Somewhat related, I do not think your patches enable dynamic allocation of
gigantic pages. Not sure if that was an oversight or a conscious decision.
I am not sure this is a highly used feature, but if you can do it on riscv
then why not?

Another 'missing feature' in your patches is PMD sharing. This feature is
only of value for BIG shared hugetlbfs mappings. DB folks like it. As
mentioned above, I do not know much about riscv so I do not know if this
might be of use to potential users.

> - libhugetlbfs testsuite on riscv64/1G:
>   - brk_near_huge triggers an assert in malloc.c, does not on x86.
>   - mmap-gettest, mmap-cow: testsuite passes the number of default free
>     pages as parameters and then fails for 1G which is not the default.
>     Otherwise succeeds when given the right number of pages.
>   - map_high_truncate_2 fails on x86 too: 0x60000000 is not 1G aligned
>     and fails at line 694 of fs/hugetlbfs/inode.c.
>   - heapshrink on 1G fails on x86 too, not investigated.
>   - counters.sh on 1G fails on x86 too: alloc_surplus_huge_page returns
>     NULL in case of gigantic pages.
>   - icache-hygiene succeeds after patch #3 of this series which lowers
>     the base address of mmap.
>   - fallocate_stress.sh on 1G never ends, on x86 too, not investigated.

In general, the libhugetlbfs tests seem to have issues with anything besides
the default huge page size. You encountered that and noted that tests break
for 1G pages on x86 as well. Cleaning all that up has been on my todo list
for years, but there always seems to be something of higher priority. :(
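For context on why moving it is a simple matter: free_contig_range itself is a short loop with no CMA or compaction dependency. Roughly, paraphrased from mm/page_alloc.c of that era (a sketch, not a verbatim excerpt):

void free_contig_range(unsigned long pfn, unsigned nr_pages)
{
	unsigned int count = 0;

	/* Drop the reference alloc_contig_range() took on each page. */
	for (; nr_pages--; pfn++) {
		struct page *page = pfn_to_page(pfn);

		count += page_count(page) != 1;
		__free_page(page);
	}
	WARN(count != 0, "%d pages are still in use!\n", count);
}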
On Wed, 09 Jan 2019 11:23:22 PST (-0800), mike.kravetz@oracle.com wrote:
> On 12/9/18 10:21 PM, Alexandre Ghiti wrote:
>> This series introduces hugetlbfs support for both riscv 32/64. Riscv32
>> is architecturally limited to huge pages of size 4MB whereas riscv64 has
>> 2MB/1G huge pages support. Transparent huge page support is not
>> implemented here, I will submit another series later.
>
> Thanks for doing this.
>
> Since patches only touch riscv specific code, I did not look too closely
> and do not feel qualified to offer an opinion as to whether they are correct
> for the architecture.

Sorry about that. These appear to have missed my inbox somehow. I think
they did manage to get caught by patchwork:

https://patchwork.kernel.org/cover/10720635/

I'll take a look...

> [...]
On 1/9/19 10:15 PM, Palmer Dabbelt wrote:
> On Wed, 09 Jan 2019 11:23:22 PST (-0800), mike.kravetz@oracle.com wrote:
>> [...]
>
> Sorry about that. These appear to have missed my inbox somehow. I think
> they did manage to get caught by patchwork
>
> https://patchwork.kernel.org/cover/10720635/
>
> I'll take a look...

No problem Palmer :)

Thanks,

Alex
On 1/9/19 7:23 PM, Mike Kravetz wrote:
> On 12/9/18 10:21 PM, Alexandre Ghiti wrote:
>> This series introduces hugetlbfs support for both riscv 32/64. [...]
>
> Thanks for doing this.
>
> Since patches only touch riscv specific code, I did not look too closely
> and do not feel qualified to offer an opinion as to whether they are correct
> for the architecture.
>
> With that said, the patches do look reasonable with a few comments below.
>
>> CMA or (MEMORY_ISOLATION && COMPACTION) must be enabled so that boot
>> reserved gigantic pages can be freed: indeed, one can reduce the number
>> of huge pages by calling free_gigantic_pages which in turn calls
>> free_contig_range, defined only with those configs defined.
>> However I don't see any strong dependency between free_contig_range
>> and those configs, maybe we could allow hugetlbfs users to free boot
>> reserved hugepages without those configs activated, I will propose
>> something if Mike Kravetz agrees.
>
> Yes, that should be modified. I think it would be a simple matter of moving
> free_contig_range out of that ifdef block. If you would like, I can submit
> a patch for that.

I think there is more to do: free_gigantic_page is only defined with
ARCH_HAS_GIGANTIC_PAGE, which in turn is only defined with
CMA || (MEMORY_ISOLATION && COMPACTION). Moreover, if
ARCH_HAS_GIGANTIC_PAGE is not defined, gigantic_page_supported
will return false, and then functions like update_and_free_page will
never reach the call to free_gigantic_page, etc. I will send you a patch
when I have fixed all of that :)

> Somewhat related, I do not think your patches enable dynamic allocation of
> gigantic pages. Not sure if that was an oversight or conscious decision.
> I am not sure this is a highly used feature, but if you can do it on riscv
> then why not?

I'm not sure I understand: do you mean picking an already allocated
gigantic page, or really allocating it with alloc_gigantic_page? Or
something else?

> Another 'missing feature' in your patches is PMD sharing. This feature is
> only of value for BIG shared hugetlbfs mappings. DB folks like it. As
> mentioned above, I do not know much about riscv so I do not know if this
> might be of use to potential users.

Thanks for that, I was unaware of that feature, I will take a look.

>> - libhugetlbfs testsuite on riscv64/1G:
>> [...]
>
> In general, libhugetlbfs tests seem to have issues with anything besides
> default huge page size. You encountered that and noted that tests break
> for 1G pages on x86 as well. Cleaning all that up has been on my todo list
> for years, but there always seems to be something of higher priority. :(

No problem, running the testsuite on x86 was quite simple anyway :)

Thanks Mike for your feedback,

Alex
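To make the dependency chain Alex describes concrete: around this time, the guards in mm/hugetlb.c look roughly like the sketch below (paraphrased, not a verbatim excerpt from any particular release). gigantic_page_supported() is hard-wired to false unless ARCH_HAS_GIGANTIC_PAGE and the CMA/compaction options line up, so callers such as update_and_free_page bail out before ever reaching free_gigantic_page:

#if defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE) && \
	((defined(CONFIG_MEMORY_ISOLATION) && defined(CONFIG_COMPACTION)) || \
	 defined(CONFIG_CMA))
/* Freeing relies on free_contig_range, hence the config coupling. */
static void free_gigantic_page(struct page *page, unsigned int order)
{
	free_contig_range(page_to_pfn(page), 1 << order);
}
static inline bool gigantic_page_supported(void) { return true; }
#else
static inline bool gigantic_page_supported(void) { return false; }
static inline void free_gigantic_page(struct page *page, unsigned int order) { }
#endif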
On 1/10/19 12:09 AM, Alex Ghiti wrote:
> On 1/9/19 7:23 PM, Mike Kravetz wrote:
>> On 12/9/18 10:21 PM, Alexandre Ghiti wrote:
>>> However I don't see any strong dependency between free_contig_range
>>> and those configs, maybe we could allow hugetlbfs users to free boot
>>> reserved hugepages without those configs activated, I will propose
>>> something if Mike Kravetz agrees.
>>
>> Yes, that should be modified. I think it would be a simple matter of moving
>> free_contig_range out of that ifdef block. If you would like, I can submit
>> a patch for that.
>
> I think there is more to do: free_gigantic_page is only defined with
> ARCH_HAS_GIGANTIC_PAGE, which in turn is only defined with
> CMA || (MEMORY_ISOLATION && COMPACTION). Moreover, if
> ARCH_HAS_GIGANTIC_PAGE is not defined, gigantic_page_supported
> will return false, and then functions like update_and_free_page will
> never reach the call to free_gigantic_page, etc. I will send you a patch
> when I have fixed all of that :)

Yes, I spoke too soon :)

>> Somewhat related, I do not think your patches enable dynamic allocation of
>> gigantic pages. Not sure if that was an oversight or conscious decision.
>> I am not sure this is a highly used feature, but if you can do it on riscv
>> then why not?
>
> I'm not sure I understand: do you mean picking an already allocated
> gigantic page, or really allocating it with alloc_gigantic_page? Or
> something else?

The term 'dynamic allocation' may not be accurate. Sorry!

Yes, I was talking about really allocating with alloc_gigantic_page.
It is possible to do this as long as the hstate for the gigantic page
size already exists. If nothing is specified on the kernel boot command
line, the arch independent code will only set up the hstate for the default
huge page size. x86 has this at the end of hugetlbpage.c to create the
gigantic page hstate as long as all config options are specified:

#if (defined(CONFIG_MEMORY_ISOLATION) && defined(CONFIG_COMPACTION)) || \
	defined(CONFIG_CMA)
static __init int gigantic_pages_init(void)
{
	/* With compaction or CMA we can allocate gigantic pages at runtime */
	if (boot_cpu_has(X86_FEATURE_GBPAGES) && !size_to_hstate(1UL << PUD_SHIFT))
		hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
	return 0;
}
arch_initcall(gigantic_pages_init);
#endif

I believe some other architectures do something similar to automatically
set up hstates for other huge page sizes at boot time. Totally optional,
but you might want to do something like this for riscv.
On 1/10/19 6:28 PM, Mike Kravetz wrote:
> On 1/10/19 12:09 AM, Alex Ghiti wrote:
>> I'm not sure I understand: do you mean picking an already allocated
>> gigantic page, or really allocating it with alloc_gigantic_page? Or
>> something else?
>
> The term 'dynamic allocation' may not be accurate. Sorry!
>
> Yes, I was talking about really allocating with alloc_gigantic_page.
> It is possible to do this as long as the hstate for the gigantic page
> size already exists. If nothing is specified on the kernel boot command
> line, the arch independent code will only set up the hstate for the default
> huge page size. x86 has this at the end of hugetlbpage.c to create the
> gigantic page hstate as long as all config options are specified.
>
> [...]
>
> I believe some other architectures do something similar to automatically
> set up hstates for other huge page sizes at boot time. Totally optional,
> but you might want to do something like this for riscv.

Ok, I understand now. I agree that we should allow non-default huge page
allocation even when it is not explicitly specified as a kernel command line
parameter. I'll add an __init function to init the hstate for gigantic pages.

Thanks Mike,

Alex
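A riscv counterpart to the x86 initcall quoted above could look roughly like this sketch (the config guard is carried over from the x86 snippet, and using 1UL << PUD_SHIFT as the gigantic page size assumes riscv64; none of this is code from the posted series):

#if (defined(CONFIG_MEMORY_ISOLATION) && defined(CONFIG_COMPACTION)) || \
	defined(CONFIG_CMA)
static __init int gigantic_pages_init(void)
{
	/* With compaction or CMA we can allocate gigantic pages at runtime */
	if (IS_ENABLED(CONFIG_64BIT) && !size_to_hstate(1UL << PUD_SHIFT))
		hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
	return 0;
}
arch_initcall(gigantic_pages_init);
#endif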
On Wed, Jan 09, 2019 at 11:23:22AM -0800, Mike Kravetz wrote:
>> However I don't see any strong dependency between free_contig_range
>> and those configs, maybe we could allow hugetlbfs users to free boot
>> reserved hugepages without those configs activated, I will propose
>> something if Mike Kravetz agrees.
>
> Yes, that should be modified. I think it would be a simple matter of moving
> free_contig_range out of that ifdef block. If you would like, I can submit
> a patch for that.

We should probably include that patch in this series.
On 1/15/19 4:04 PM, Christoph Hellwig wrote:
> On Wed, Jan 09, 2019 at 11:23:22AM -0800, Mike Kravetz wrote:
>> Yes, that should be modified. I think it would be a simple matter of moving
>> free_contig_range out of that ifdef block. If you would like, I can submit
>> a patch for that.
>
> We should probably include that patch in this series.

Ok, the patch for this is ready, I will include it in the series with your
comments.

Thanks again for your reviews,

Alex
On 1/15/19 8:04 AM, Christoph Hellwig wrote:
> On Wed, Jan 09, 2019 at 11:23:22AM -0800, Mike Kravetz wrote:
>> Yes, that should be modified. I think it would be a simple matter of moving
>> free_contig_range out of that ifdef block. If you would like, I can submit
>> a patch for that.
>
> We should probably include that patch in this series.

I would actually prefer that this be a separate patch. Why?
- The functionality is arch independent and not specific to riscv.
- This functionality was intended in commit 944d9fec8d7a ("hugetlb:
  add support for gigantic page allocation at runtime"). However,
  there was no need to tie the functionality to CMA or (MEMORY_ISOLATION
  && COMPACTION).

I don't care enough to insist the patch be separate. Just seems like
it should be separate to me.
On Tue, Jan 15, 2019 at 11:25:07AM -0800, Mike Kravetz wrote:
> I would actually prefer that this be a separate patch. Why?
> - The functionality is arch independent and not specific to riscv.
> - This functionality was intended in commit 944d9fec8d7a ("hugetlb:
>   add support for gigantic page allocation at runtime"). However,
>   there was no need to tie the functionality to CMA or (MEMORY_ISOLATION
>   && COMPACTION).
>
> I don't care enough to insist the patch be separate. Just seems like
> it should be separate to me.

Oh, I agree it should be a separate patch. But it would be nice to
merge it together with the RISC-V hugetlb support, so that it can
depend on the patch instead of the current somewhat odd Kconfig
dependencies.
On 01/15/2019 09:52 PM, Christoph Hellwig wrote:
> On Tue, Jan 15, 2019 at 11:25:07AM -0800, Mike Kravetz wrote:
>> I would actually prefer that this be a separate patch. Why?
>> - The functionality is arch independent and not specific to riscv.
>> - This functionality was intended in commit 944d9fec8d7a ("hugetlb:
>>   add support for gigantic page allocation at runtime"). However,
>>   there was no need to tie the functionality to CMA or (MEMORY_ISOLATION
>>   && COMPACTION).
>>
>> I don't care enough to insist the patch be separate. Just seems like
>> it should be separate to me.
>
> Oh, I agree it should be a separate patch. But it would be nice to
> merge it together with the RISC-V hugetlb support, so that it can
> depend on the patch instead of the current somewhat odd Kconfig
> dependencies.

Ok, I will submit the patch that allows freeing gigantic pages whatever
the configuration is as a separate patch, and rebase the riscv hugetlbfs
support on top of it.

Thanks for your feedback,

Alex