Message ID | 20240306104147.193052-1-peterx@redhat.com (mailing list archive) |
---|---|
Series | mm/treewide: Remove pXd_huge() API |
On 06/03/2024 at 11:41, peterx@redhat.com wrote:
> From: Peter Xu <peterx@redhat.com>
>
> [based on akpm/mm-unstable latest commit a7f399ae964e]
>
> In previous work [1], we removed the pXd_large() API, which is
> arch-specific.  This patchset further removes the hugetlb pXd_huge() API.
>
> Hugetlb was never special in how it creates huge mappings compared with
> other huge mappings.  Having a standalone API just to detect such pgtable
> entries is more or less redundant, especially after the pXd_leaf() API set
> was introduced with/without CONFIG_HUGETLB_PAGE.
>
> Looking at this problem also exposed a few issues showing that we don't
> have a clear definition of the *_huge() API variants.  This patchset
> starts by cleaning up these issues, then replaces all *_huge() users with
> *_leaf(), then drops all *_huge() code.
>
> On x86/sparc, pXd_huge() reports "true" for swap entries, while on all
> other archs it reports "false".  This is addressed in patches 1-5, in
> which I suspect patch 1 can be seen as a bug fix, but I'll leave that to
> hmm experts to decide.
>
> Besides, there are three archs (arm, arm64, powerpc) that have slightly
> different definitions of the *_huge() vs. *_leaf() variants.  I tackled
> them separately so that it'll be easier for arch experts to chime in when
> necessary.  This part is done in patches 6-9.
>
> The final patches 10-13 do the rest of the removal.  Since *_leaf() will
> be the ultimate API in the future, and we seem to have quite some
> confusion about how the *_huge() APIs should be defined, the last patch
> provides a rich comment for the *_leaf() API set to define them properly
> and avoid future misuse.  Hopefully that will also help new archs start
> supporting huge mappings and avoid traps (such as swap entries or
> PROT_NONE entry checks).
>
> The whole series is only lightly tested on x86, while as usual I don't
> have the capability to test all archs that it touches.
>
> Marking this series RFC as of now.
>
> [1] https://lore.kernel.org/r/20240305043750.93762-1-peterx@redhat.com
>

Hi Peter, and nice job you are doing in cleaning up things around _huge
stuff.

One thing that might be worth looking at also at some point is the mess
around pmd_clear_huge() and pud_clear_huge().

I tried to clean things up with commit c742199a014d ("mm/pgtable: add
stubs for {pmd/pub}_{set/clear}_huge") but it was reverted because of
arm64 by commit d8a719059b9d ("Revert "mm/pgtable: add stubs for
{pmd/pub}_{set/clear}_huge"")

So now powerpc/8xx has to implement pmd_clear_huge() and
pud_clear_huge() although the 8xx page hierarchy only has 2 levels.

Christophe

Hi, Christophe,

On Mon, Mar 11, 2024 at 09:58:47AM +0000, Christophe Leroy wrote:
> Hi Peter, and nice job you are doing in cleaning up things around _huge
> stuff.

Thanks.  I appreciate your help along the way on Power.

>
> One thing that might be worth looking at also at some point is the mess
> around pmd_clear_huge() and pud_clear_huge().
>
> I tried to clean things up with commit c742199a014d ("mm/pgtable: add
> stubs for {pmd/pub}_{set/clear}_huge") but it was reverted because of
> arm64 by commit d8a719059b9d ("Revert "mm/pgtable: add stubs for
> {pmd/pub}_{set/clear}_huge"")
>
> So now powerpc/8xx has to implement pmd_clear_huge() and
> pud_clear_huge() although the 8xx page hierarchy only has 2 levels.

Those have so far been off my radar: my focus right now is still on the
hugetlbfs-relevant side of things, and kernel mappings are not directly
involved in hugetlbfs, even though they are still huge mappings.

It's a pity to learn that it broke arm64 and got reverted, as it looks
like a good cleanup if at all possible.  I tend to agree with you that for
3-level page tables we should define pgd_huge*() instead of pud_huge*().
So it looks like the only way to provide such a clean treewide API is to
properly define those APIs for aarch64, and to define different pud
helpers for 3 and 4 levels.  But I confess I don't think I've fully
digested all the bits.

Thanks,

From: Peter Xu <peterx@redhat.com>

[based on akpm/mm-unstable latest commit a7f399ae964e]

In previous work [1], we removed the pXd_large() API, which is
arch-specific.  This patchset further removes the hugetlb pXd_huge() API.

Hugetlb was never special in how it creates huge mappings compared with
other huge mappings.  Having a standalone API just to detect such pgtable
entries is more or less redundant, especially after the pXd_leaf() API set
was introduced with/without CONFIG_HUGETLB_PAGE.

Looking at this problem also exposed a few issues showing that we don't
have a clear definition of the *_huge() API variants.  This patchset
starts by cleaning up these issues, then replaces all *_huge() users with
*_leaf(), then drops all *_huge() code.

On x86/sparc, pXd_huge() reports "true" for swap entries, while on all
other archs it reports "false".  This is addressed in patches 1-5, in
which I suspect patch 1 can be seen as a bug fix, but I'll leave that to
hmm experts to decide.

Besides, there are three archs (arm, arm64, powerpc) that have slightly
different definitions of the *_huge() vs. *_leaf() variants.  I tackled
them separately so that it'll be easier for arch experts to chime in when
necessary.  This part is done in patches 6-9.

The final patches 10-13 do the rest of the removal.  Since *_leaf() will
be the ultimate API in the future, and we seem to have quite some
confusion about how the *_huge() APIs should be defined, the last patch
provides a rich comment for the *_leaf() API set to define them properly
and avoid future misuse.  Hopefully that will also help new archs start
supporting huge mappings and avoid traps (such as swap entries or
PROT_NONE entry checks).

The whole series is only lightly tested on x86, while as usual I don't
have the capability to test all archs that it touches.

Marking this series RFC as of now.

[1] https://lore.kernel.org/r/20240305043750.93762-1-peterx@redhat.com

Peter Xu (13):
  mm/hmm: Process pud swap entry without pud_huge()
  mm/gup: Cache p4d in follow_p4d_mask()
  mm/gup: Check p4d presence before going on
  mm/x86: Change pXd_huge() behavior to exclude swap entries
  mm/sparc: Change pXd_huge() behavior to exclude swap entries
  mm/arm: Use macros to define pmd/pud helpers
  mm/arm: Redefine pmd_huge() with pmd_leaf()
  mm/arm64: Merge pXd_huge() and pXd_leaf() definitions
  mm/powerpc: Redefine pXd_huge() with pXd_leaf()
  mm/gup: Merge pXd huge mapping checks
  mm/treewide: Replace pXd_huge() with pXd_leaf()
  mm/treewide: Remove pXd_huge()
  mm: Document pXd_leaf() API

 arch/arm/include/asm/pgtable-2level.h         |  4 +--
 arch/arm/include/asm/pgtable-3level-hwdef.h   |  1 +
 arch/arm/include/asm/pgtable-3level.h         |  6 ++--
 arch/arm/mm/Makefile                          |  1 -
 arch/arm/mm/hugetlbpage.c                     | 34 -------------------
 arch/arm64/include/asm/pgtable.h              |  6 +++-
 arch/arm64/mm/hugetlbpage.c                   | 18 ++--------
 arch/loongarch/mm/hugetlbpage.c               | 12 +------
 arch/mips/include/asm/pgtable-32.h            |  2 +-
 arch/mips/include/asm/pgtable-64.h            |  2 +-
 arch/mips/mm/hugetlbpage.c                    | 10 ------
 arch/mips/mm/tlb-r4k.c                        |  2 +-
 arch/parisc/mm/hugetlbpage.c                  | 11 ------
 .../include/asm/book3s/64/pgtable-4k.h        | 20 -----------
 .../include/asm/book3s/64/pgtable-64k.h       | 25 --------------
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  3 --
 arch/powerpc/include/asm/nohash/pgtable.h     | 10 ------
 arch/powerpc/mm/pgtable_64.c                  |  6 ++--
 arch/riscv/mm/hugetlbpage.c                   | 10 ------
 arch/s390/mm/hugetlbpage.c                    | 10 ------
 arch/sh/mm/hugetlbpage.c                      | 10 ------
 arch/sparc/mm/hugetlbpage.c                   | 12 -------
 arch/x86/mm/hugetlbpage.c                     | 26 --------------
 arch/x86/mm/pgtable.c                         |  4 +--
 include/linux/hugetlb.h                       | 24 -------------
 include/linux/pgtable.h                       | 24 ++++++++++---
 mm/gup.c                                      | 24 ++++++-------
 mm/hmm.c                                      |  9 ++---
 mm/memory.c                                   |  2 +-
 29 files changed, 56 insertions(+), 272 deletions(-)
 delete mode 100644 arch/arm/mm/hugetlbpage.c
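
To illustrate the swap-entry discrepancy the cover letter describes, here
is a minimal user-space sketch in C.  It is a toy model, not kernel code:
toy_pmd_t, TOY_PRESENT, TOY_PSE and the toy_* helpers are hypothetical
names, and the bit layout is an assumption chosen for illustration.  It
assumes a "leaf" check means "present and marked huge", and contrasts that
with an old-style check that treats any non-empty entry that is not a
present table pointer as huge.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: hypothetical type and bit positions, not kernel definitions. */
typedef uint64_t toy_pmd_t;

#define TOY_PRESENT (1ULL << 0)   /* entry currently maps something */
#define TOY_PSE     (1ULL << 7)   /* entry is a huge (leaf) mapping */

/* An all-zero entry is empty. */
bool toy_pmd_none(toy_pmd_t pmd)
{
    return pmd == 0;
}

/*
 * Old-style "huge" check: true for any non-empty entry that is not a
 * present pointer to a lower-level table.  A non-present, swap-like
 * entry therefore also counts as huge.
 */
bool toy_pmd_huge_old(toy_pmd_t pmd)
{
    return !toy_pmd_none(pmd) &&
           (pmd & (TOY_PRESENT | TOY_PSE)) != TOY_PRESENT;
}

/* Leaf-style check (assumed semantics): present AND marked huge. */
bool toy_pmd_leaf(toy_pmd_t pmd)
{
    return (pmd & TOY_PRESENT) && (pmd & TOY_PSE);
}
```

With this model, a swap-like entry such as (toy_pmd_t)0x1234 << 12 (non-zero,
with neither TOY_PRESENT nor TOY_PSE set) is "huge" under toy_pmd_huge_old()
but not a leaf under toy_pmd_leaf(), while a present huge mapping
(TOY_PRESENT | TOY_PSE) satisfies both checks.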