Message ID | 10eae3c6815e3aba5f624af92321948e4684c95a.1716815901.git.christophe.leroy@csgroup.eu (mailing list archive) |
---|---|
State | New |
Series | Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64) |
On Mon, May 27, 2024 at 03:30:10PM +0200, Christophe Leroy wrote:
> Use U0-U3 bits to encode hugepage size, more exactly page shift.
>
> As we start using hugepages at shift 21 (2Mbytes), subtract 20
> so that it fits into 4 bits. That may change in the future if
> we want to use smaller hugepages.

What other shifts can we have here on e500? PUD_SHIFT?
Could you please spell them out here?
Or even better,

>
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
>  arch/powerpc/include/asm/nohash/hugetlb-e500.h | 6 ++++++
>  arch/powerpc/include/asm/nohash/pte-e500.h     | 3 +++
>  2 files changed, 9 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/nohash/hugetlb-e500.h b/arch/powerpc/include/asm/nohash/hugetlb-e500.h
> index 8f04ad20e040..d8e51a3f8557 100644
> --- a/arch/powerpc/include/asm/nohash/hugetlb-e500.h
> +++ b/arch/powerpc/include/asm/nohash/hugetlb-e500.h
> @@ -42,4 +42,10 @@ static inline int check_and_get_huge_psize(int shift)
>  	return shift_to_mmu_psize(shift);
>  }
>
> +static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
> +{
> +	return __pte(pte_val(entry) | (_PAGE_U3 * (shift - 20)));
> +}
> +#define arch_make_huge_pte arch_make_huge_pte
> +
>  #endif /* _ASM_POWERPC_NOHASH_HUGETLB_E500_H */
> diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h b/arch/powerpc/include/asm/nohash/pte-e500.h
> index 975facc7e38e..091e4bff1fba 100644
> --- a/arch/powerpc/include/asm/nohash/pte-e500.h
> +++ b/arch/powerpc/include/asm/nohash/pte-e500.h
> @@ -46,6 +46,9 @@
>  #define _PAGE_NO_CACHE	0x400000 /* I: cache inhibit */
>  #define _PAGE_WRITETHRU	0x800000 /* W: cache write-through */
>
> +#define _PAGE_HSIZE_MSK (_PAGE_U0 | _PAGE_U1 | _PAGE_U2 | _PAGE_U3)
> +#define _PAGE_HSIZE_SHIFT	14

Add a comment above explaining which P*_SHIFT we need to cover with these
4 bits.
On 29/05/2024 at 10:05, Oscar Salvador wrote:
> On Mon, May 27, 2024 at 03:30:10PM +0200, Christophe Leroy wrote:
>> Use U0-U3 bits to encode hugepage size, more exactly page shift.
>>
>> As we start using hugepages at shift 21 (2Mbytes), subtract 20
>> so that it fits into 4 bits. That may change in the future if
>> we want to use smaller hugepages.
>
> What other shifts can we have here on e500? PUD_SHIFT?

Doesn't really matter if it's PUD or PMD at this point. On a 32-bit
kernel it will be all PMD while on a 64-bit kernel it is both PMD and PUD.

At the time being (as implemented with hugepd), Linux supports 4M, 16M,
64M, 256M and 1G (shifts 22, 24, 26, 28, 30).

The hardware supports the following page sizes, and encodes them on 4
bits although it is not directly a shift. Maybe it would be better to
use that encoding after all:

0001	4 Kbytes (Shift 12)
0010	16 Kbytes (Shift 14)
0011	64 Kbytes (Shift 16)
0100	256 Kbytes (Shift 18)
0101	1 Mbyte (Shift 20)
0110	4 Mbytes (Shift 22)
0111	16 Mbytes (Shift 24)
1000	64 Mbytes (Shift 26)
1001	256 Mbytes (Shift 28)
1010	1 Gbyte (e500v2 only) (Shift 30)
1011	4 Gbytes (e500v2 only) (Shift 32)

> Could you please spell them out here?
> Or even better,
>
>>
>> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>> ---
>>  arch/powerpc/include/asm/nohash/hugetlb-e500.h | 6 ++++++
>>  arch/powerpc/include/asm/nohash/pte-e500.h     | 3 +++
>>  2 files changed, 9 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/nohash/hugetlb-e500.h b/arch/powerpc/include/asm/nohash/hugetlb-e500.h
>> index 8f04ad20e040..d8e51a3f8557 100644
>> --- a/arch/powerpc/include/asm/nohash/hugetlb-e500.h
>> +++ b/arch/powerpc/include/asm/nohash/hugetlb-e500.h
>> @@ -42,4 +42,10 @@ static inline int check_and_get_huge_psize(int shift)
>>  	return shift_to_mmu_psize(shift);
>>  }
>>
>> +static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
>> +{
>> +	return __pte(pte_val(entry) | (_PAGE_U3 * (shift - 20)));
>> +}
>> +#define arch_make_huge_pte arch_make_huge_pte
>> +
>>  #endif /* _ASM_POWERPC_NOHASH_HUGETLB_E500_H */
>> diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h b/arch/powerpc/include/asm/nohash/pte-e500.h
>> index 975facc7e38e..091e4bff1fba 100644
>> --- a/arch/powerpc/include/asm/nohash/pte-e500.h
>> +++ b/arch/powerpc/include/asm/nohash/pte-e500.h
>> @@ -46,6 +46,9 @@
>>  #define _PAGE_NO_CACHE	0x400000 /* I: cache inhibit */
>>  #define _PAGE_WRITETHRU	0x800000 /* W: cache write-through */
>>
>> +#define _PAGE_HSIZE_MSK (_PAGE_U0 | _PAGE_U1 | _PAGE_U2 | _PAGE_U3)
>> +#define _PAGE_HSIZE_SHIFT	14
>
> Add a comment above explaining which P*_SHIFT we need to cover with these
> 4 bits.
>
> --
> Oscar Salvador
> SUSE Labs
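The hardware size codes listed in the table above follow a regular pattern: each step multiplies the page size by 4, so the page shift is 2 * code + 10. A minimal sketch of that conversion in C follows. The helper and macro names are hypothetical and do not come from the patch or from the kernel; only the code/shift pairs are taken from the table.

```c
#include <assert.h>

/*
 * Hypothetical helpers, not taken from the kernel: convert between a page
 * shift and the 4-bit hardware size code listed in the table above.
 * The table follows size = 4^code Kbytes, i.e. shift = 2 * code + 10.
 */
#define SKETCH_TSIZE_MIN	1	/* 0001: 4 Kbytes, shift 12 */
#define SKETCH_TSIZE_MAX	11	/* 1011: 4 Gbytes, shift 32 (e500v2 only) */

static inline unsigned int shift_to_tsize(unsigned int shift)
{
	/* Only even shifts from 12 to 32 have a code in the table. */
	assert(shift >= 12 && shift <= 32 && (shift & 1) == 0);
	return (shift - 10) / 2;
}

static inline unsigned int tsize_to_shift(unsigned int tsize)
{
	assert(tsize >= SKETCH_TSIZE_MIN && tsize <= SKETCH_TSIZE_MAX);
	return 2 * tsize + 10;
}
```

With this mapping, every size in the table fits in 4 bits without any offset, which may be why reusing the hardware encoding is suggested as an alternative to storing shift - 20.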
On Wed, May 29, 2024 at 09:49:48AM +0000, Christophe Leroy wrote:
> Doesn't really matter if it's PUD or PMD at this point. On a 32-bit
> kernel it will be all PMD while on a 64-bit kernel it is both PMD and PUD.
>
> At the time being (as implemented with hugepd), Linux supports 4M, 16M,
> 64M, 256M and 1G (shifts 22, 24, 26, 28, 30).
>
> The hardware supports the following page sizes, and encodes them on 4
> bits although it is not directly a shift. Maybe it would be better to
> use that encoding after all:

I think so.

>
> 0001	4 Kbytes (Shift 12)
> 0010	16 Kbytes (Shift 14)
> 0011	64 Kbytes (Shift 16)
> 0100	256 Kbytes (Shift 18)
> 0101	1 Mbyte (Shift 20)
> 0110	4 Mbytes (Shift 22)
> 0111	16 Mbytes (Shift 24)
> 1000	64 Mbytes (Shift 26)
> 1001	256 Mbytes (Shift 28)
> 1010	1 Gbyte (e500v2 only) (Shift 30)
> 1011	4 Gbytes (e500v2 only) (Shift 32)

You say hugepages start at 2MB (shift 21), but you say that the smallest
hugepage Linux supports is 4MB (shift 22)?
On 29/05/2024 at 12:09, Oscar Salvador wrote:
> On Wed, May 29, 2024 at 09:49:48AM +0000, Christophe Leroy wrote:
>> Doesn't really matter if it's PUD or PMD at this point. On a 32-bit
>> kernel it will be all PMD while on a 64-bit kernel it is both PMD and PUD.
>>
>> At the time being (as implemented with hugepd), Linux supports 4M, 16M,
>> 64M, 256M and 1G (shifts 22, 24, 26, 28, 30).
>>
>> The hardware supports the following page sizes, and encodes them on 4
>> bits although it is not directly a shift. Maybe it would be better to
>> use that encoding after all:
>
> I think so.
>
>>
>> 0001	4 Kbytes (Shift 12)
>> 0010	16 Kbytes (Shift 14)
>> 0011	64 Kbytes (Shift 16)
>> 0100	256 Kbytes (Shift 18)
>> 0101	1 Mbyte (Shift 20)
>> 0110	4 Mbytes (Shift 22)
>> 0111	16 Mbytes (Shift 24)
>> 1000	64 Mbytes (Shift 26)
>> 1001	256 Mbytes (Shift 28)
>> 1010	1 Gbyte (e500v2 only) (Shift 30)
>> 1011	4 Gbytes (e500v2 only) (Shift 32)
>
> You say hugepages start at 2MB (shift 21), but you say that the smallest
> hugepage Linux supports is 4MB (shift 22)?
>

No, I say PMD_SIZE is 2MB on e500 with 64-bit PTEs, and at the time being
the Linux powerpc implementation for e500 supports sizes 4M, 16M, 64M, 256M
and 1G.

But for instance on 8xx we have 16k and 512M hugepages.

Here on the e500 we could, in a follow-up patch, add support for smaller
page sizes, for instance 16k, 64k, 256k and 1M. Of course all of those
would then be cont-PTE and not cont-PMD.
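To illustrate the cont-PTE versus cont-PMD distinction made above, here is a rough sketch in C of how many consecutive entries one hugepage would span. It assumes 4 Kbyte base pages (shift 12) and the 2 Mbyte PMD_SIZE (shift 21) mentioned in the message; the macro and function names are hypothetical, and the PUD level that a 64-bit kernel may use for the largest sizes is not modeled.

```c
#include <assert.h>

/*
 * Assumptions for this sketch: 4 Kbyte base pages and the 2 Mbyte PMD_SIZE
 * stated in the message above. Names are illustrative, not kernel symbols.
 */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PMD_SHIFT	21

/* Consecutive PTEs covering one hugepage smaller than PMD_SIZE (cont-PTE). */
static inline unsigned long cont_pte_entries(unsigned int shift)
{
	assert(shift >= SKETCH_PAGE_SHIFT && shift < SKETCH_PMD_SHIFT);
	return 1UL << (shift - SKETCH_PAGE_SHIFT);
}

/* Consecutive PMD entries covering one hugepage of at least PMD_SIZE (cont-PMD). */
static inline unsigned long cont_pmd_entries(unsigned int shift)
{
	assert(shift >= SKETCH_PMD_SHIFT);
	return 1UL << (shift - SKETCH_PMD_SHIFT);
}
```

Under these assumptions a 4M hugepage spans 2 PMD entries, while a hypothetical 16k hugepage would span 4 PTEs inside a single page table.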
On Wed, May 29, 2024 at 10:14:15AM +0000, Christophe Leroy wrote:
>
>
> On 29/05/2024 at 12:09, Oscar Salvador wrote:
> > On Wed, May 29, 2024 at 09:49:48AM +0000, Christophe Leroy wrote:
> >> Doesn't really matter if it's PUD or PMD at this point. On a 32-bit
> >> kernel it will be all PMD while on a 64-bit kernel it is both PMD and PUD.
> >>
> >> At the time being (as implemented with hugepd), Linux supports 4M, 16M,
> >> 64M, 256M and 1G (shifts 22, 24, 26, 28, 30).
> >>
> >> The hardware supports the following page sizes, and encodes them on 4
> >> bits although it is not directly a shift. Maybe it would be better to
> >> use that encoding after all:
> >
> > I think so.
> >
> >>
> >> 0001	4 Kbytes (Shift 12)
> >> 0010	16 Kbytes (Shift 14)
> >> 0011	64 Kbytes (Shift 16)
> >> 0100	256 Kbytes (Shift 18)
> >> 0101	1 Mbyte (Shift 20)
> >> 0110	4 Mbytes (Shift 22)
> >> 0111	16 Mbytes (Shift 24)
> >> 1000	64 Mbytes (Shift 26)
> >> 1001	256 Mbytes (Shift 28)
> >> 1010	1 Gbyte (e500v2 only) (Shift 30)
> >> 1011	4 Gbytes (e500v2 only) (Shift 32)
> >
> > You say hugepages start at 2MB (shift 21), but you say that the smallest
> > hugepage Linux supports is 4MB (shift 22)?
> >
>
> No, I say PMD_SIZE is 2MB on e500 with 64-bit PTEs, and at the time being
> the Linux powerpc implementation for e500 supports sizes 4M, 16M, 64M, 256M
> and 1G.

Got it. I got confused.
diff --git a/arch/powerpc/include/asm/nohash/hugetlb-e500.h b/arch/powerpc/include/asm/nohash/hugetlb-e500.h
index 8f04ad20e040..d8e51a3f8557 100644
--- a/arch/powerpc/include/asm/nohash/hugetlb-e500.h
+++ b/arch/powerpc/include/asm/nohash/hugetlb-e500.h
@@ -42,4 +42,10 @@ static inline int check_and_get_huge_psize(int shift)
 	return shift_to_mmu_psize(shift);
 }
 
+static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
+{
+	return __pte(pte_val(entry) | (_PAGE_U3 * (shift - 20)));
+}
+#define arch_make_huge_pte arch_make_huge_pte
+
 #endif /* _ASM_POWERPC_NOHASH_HUGETLB_E500_H */
diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h b/arch/powerpc/include/asm/nohash/pte-e500.h
index 975facc7e38e..091e4bff1fba 100644
--- a/arch/powerpc/include/asm/nohash/pte-e500.h
+++ b/arch/powerpc/include/asm/nohash/pte-e500.h
@@ -46,6 +46,9 @@
 #define _PAGE_NO_CACHE	0x400000 /* I: cache inhibit */
 #define _PAGE_WRITETHRU	0x800000 /* W: cache write-through */
 
+#define _PAGE_HSIZE_MSK (_PAGE_U0 | _PAGE_U1 | _PAGE_U2 | _PAGE_U3)
+#define _PAGE_HSIZE_SHIFT	14
+
 /* "Higher level" linux bit combinations */
 #define _PAGE_EXEC	(_PAGE_BAP_SX | _PAGE_BAP_UX) /* .. and was cache cleaned */
 #define _PAGE_READ	(_PAGE_BAP_SR | _PAGE_BAP_UR) /* User read permission */
Use U0-U3 bits to encode hugepage size, more exactly page shift.

As we start using hugepages at shift 21 (2Mbytes), subtract 20
so that it fits into 4 bits. That may change in the future if
we want to use smaller hugepages.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/nohash/hugetlb-e500.h | 6 ++++++
 arch/powerpc/include/asm/nohash/pte-e500.h     | 3 +++
 2 files changed, 9 insertions(+)
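As an illustration of the encoding described in this commit message, the following C sketch stores shift - 20 in the U0-U3 field and reads it back. It works on a plain unsigned long instead of pte_t so that it can be compiled outside the kernel. The U0-U3 bit values are assumptions inferred from _PAGE_HSIZE_SHIFT being 14 and _PAGE_U3 being used as the multiplier; SKETCH_HSIZE_BIAS, pte_set_hsize() and pte_get_hsize_shift() are hypothetical names, and unlike arch_make_huge_pte() in the patch, the setter clears the field before writing it.

```c
#include <assert.h>

/*
 * Assumed bit positions: the patch only shows that the field starts at
 * bit 14 (_PAGE_HSIZE_SHIFT) and that _PAGE_U3 is its lowest bit.
 */
#define _PAGE_U3		0x004000UL
#define _PAGE_U2		0x008000UL
#define _PAGE_U1		0x010000UL
#define _PAGE_U0		0x020000UL

#define _PAGE_HSIZE_MSK		(_PAGE_U0 | _PAGE_U1 | _PAGE_U2 | _PAGE_U3)
#define _PAGE_HSIZE_SHIFT	14
#define SKETCH_HSIZE_BIAS	20	/* hypothetical name for the "subtract 20" offset */

/* Store (shift - 20) in U0-U3, mirroring arch_make_huge_pte() from the patch. */
static inline unsigned long pte_set_hsize(unsigned long pte, unsigned int shift)
{
	/* 4 bits hold 1..15 here, so shifts 21..35 are representable. */
	assert(shift > SKETCH_HSIZE_BIAS && shift - SKETCH_HSIZE_BIAS <= 15);
	return (pte & ~_PAGE_HSIZE_MSK) | (_PAGE_U3 * (shift - SKETCH_HSIZE_BIAS));
}

/* Hypothetical reader: recover the page shift, or 0 if the field is clear. */
static inline unsigned int pte_get_hsize_shift(unsigned long pte)
{
	unsigned long field = (pte & _PAGE_HSIZE_MSK) >> _PAGE_HSIZE_SHIFT;

	return field ? (unsigned int)field + SKETCH_HSIZE_BIAS : 0;
}
```

The currently supported shifts (22 to 30) map to field values 2 to 10, so they all fit. Supporting hugepages with a shift of 20 or less would require a different bias or a different encoding, which is the limitation the commit message points at.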